[School-of-data] Help Defining Dataset Definition and Quality Parameters

Tarek Amr tarekamr at gmail.com
Mon Apr 29 12:15:17 UTC 2013

As far as I understand, there are two approaches to this.

(A) You can do it manually, sort of. Let's say you set some rules to
measure collaboration: the number of edits for each file, the number of
people who edited it, maybe the quantity of discussions, process log files,
you name it.
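
A minimal sketch of what such rules could look like, assuming you can get an
edit log as (file, editor) pairs. The log format, the file and editor names,
and the thresholds are all hypothetical and would need to be chosen for the
real data:

```python
from collections import defaultdict


def collaboration_metrics(edit_log):
    """Count edits and distinct editors per file from (file, editor) pairs."""
    edits = defaultdict(int)
    editors = defaultdict(set)
    for file_name, editor in edit_log:
        edits[file_name] += 1
        editors[file_name].add(editor)
    return {f: {"edits": edits[f], "editors": len(editors[f])} for f in edits}


def is_collaborative(metrics, min_edits=3, min_editors=2):
    """Hand-written rule; both thresholds are arbitrary assumptions."""
    return metrics["edits"] >= min_edits and metrics["editors"] >= min_editors


# Hypothetical edit log, invented for illustration only.
edit_log = [
    ("budget.csv", "alice"),
    ("budget.csv", "bob"),
    ("budget.csv", "alice"),
    ("schools.csv", "carol"),
]

metrics = collaboration_metrics(edit_log)
for file_name, m in metrics.items():
    print(file_name, m, is_collaborative(m))
```

The point of keeping the rule in its own function is that the thresholds can
be argued about and changed without touching the counting.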

(B) Train a computer to do this for you. In that case, you need some files
or records that you know represent collaboration, and some that don't.
Then you train a classifier on the attributes of those records or resources
and use it to tell whether the data as a whole represents collaboration or
not.
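
To make the idea concrete, here is a rough sketch using a nearest-centroid
classifier written with the standard library only. This is just one simple
choice of classifier, picked for brevity; the two features (number of edits,
number of editors) and every number in the example are made up:

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train(examples):
    """examples: list of (features, label). Returns {label: centroid}."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical labelled records: [number of edits, number of editors].
labelled = [
    ([12, 5], True),   # known to represent collaboration
    ([9, 4], True),
    ([1, 1], False),   # known not to
    ([2, 1], False),
]
model = train(labelled)

# Hypothetical unlabelled records from the rest of the data.
unlabelled = [[10, 3], [1, 1], [8, 4]]
predictions = [predict(model, r) for r in unlabelled]
share = sum(predictions) / len(predictions)
print(predictions, share)
```

The share of records predicted as collaborative is one possible way to
summarise the data as a whole; with so few labelled examples it would only be
a rough indication.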

Regarding the quality of the data, I guess you should check the following:

- Is it easy to transform the data into an open format a computer can read
and process?
- Is it easy to extract some features from the data (for example: number of
edits on each file, their dates, who edited them, etc.)?
- Is any data missing?
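
The last check can be roughed out in a few lines, assuming the dataset is
already available as CSV text. The column names, the sample rows, and the list
of markers treated as "missing" are assumptions for illustration:

```python
import csv
import io


def missing_value_report(csv_text, empty_markers=("", "NA", "N/A")):
    """Return (row_count, {column: number of missing values})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            value = row.get(name)
            # A short row yields None; otherwise compare the trimmed text.
            if value is None or value.strip() in empty_markers:
                counts[name] += 1
    return total, counts


# Hypothetical sample, invented for illustration only.
sample = (
    "file,edits,last_editor\n"
    "budget.csv,3,alice\n"
    "schools.csv,,\n"
    "roads.csv,NA,bob\n"
)

total, counts = missing_value_report(sample)
print(total, counts)
```

A file that cannot even be parsed this way already says something about the
first check in the list.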

Anyway, those are my (not even) $0.002, so I will wait for more experienced
people to add their input here.

On Mon, Apr 29, 2013 at 12:49 PM, Matan Rotman <matan.rotman at gmail.com> wrote:

> Hello all,
> My name is Matan Rotman, and I'm a student at Hebrew University majoring
> in Political Science. As part of my studies, I'm writing a paper that tries
> to understand whether the Israeli open data program is efficient
> (collaboration-wise), and if not, why not (that is, why administrative
> departments won't cooperate with the program). The first thing I
> need to do for that, though, is to understand whether the datasets on
> the website are of good quality. As I'm not a technical guy, I could use
> some help understanding what would be considered the definition of a
> dataset (hopefully, as specific as possible), and more importantly, I could
> really use some help defining quality parameters so I can
> measure the quality of the different files and sets uploaded.
> The website is at http://data.gov.il (all in Hebrew though), and I'd love
> any help on the subject.
> P.S.
> I hope this is the right place to ask. I'm also going to ask on the Open
> Government mailing list, so I apologize if you get my message twice.
> Best Regards,
> Matan

Best Regards
Tarek Amr
