[data-protocols] RFC: JSON Table Schema
Tom Morris
tfmorris at gmail.com
Wed Nov 28 20:27:20 GMT 2012
Another data point is Google's Dataset Publishing Language (DSPL):
https://developers.google.com/public-data/docs/developer_guide
It's XML-based (ick!), but in addition to the schema it includes
dataset-level metadata, which can be useful for provenance.
Tom
On Wed, Nov 28, 2012 at 2:42 PM, Xavier Badosa <xbadosa at gmail.com> wrote:
> Hi Rufus,
>
> Your simple schema for tabular data is interesting: it's similar to, but
> more powerful than, the schema used by the US Census Bureau API:
>
> http://www.census.gov/developers/
>
> It's worth noting that what is often considered "tabular data" (in your
> sense: some fields that are shared by a set of individuals) could be
> better represented in a cube model. Take for example the Census API:
>
> [
> ["P0010001","NAME","state"],
> ["710231","Alaska","02"],
> ["4779736","Alabama","01"],
> ["2915918","Arkansas","05"],
> ["6392017","Arizona","04"],
> ["37253956","California","06"],
> ...
> ]
>
> Rows in this example have an ID and this ID represents the possible values
> of a "variable" or "dimension" ("state" in the example). Instead of saying
> that this is some tabular data of individuals (that happen to be states) with
> field "population" ("P0010001"), it seems more accurate to see it as a
> table ("table" in the statistical sense, not in the DB sense) or cube of
> population by state. This is a very frequent situation in statistics.
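
To make the two readings concrete, here is a minimal Python sketch using only
the rows quoted above. The function names (as_records, as_cube) and the
dictionary layout of the cube are hypothetical, chosen for illustration; this
is not the JSON-stat format, nor anything the Census API itself returns.

```python
import json

# Sample rows copied from the Census API response quoted above
# (the rows elided by "..." are left out).
RESPONSE = """[
["P0010001","NAME","state"],
["710231","Alaska","02"],
["4779736","Alabama","01"],
["2915918","Arkansas","05"],
["6392017","Arizona","04"],
["37253956","California","06"]
]"""


def as_records(rows):
    """Tabular reading: first row is the header, each later row is one record."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def as_cube(rows, dimension="state", label="NAME", measure="P0010001"):
    """Cube reading: one dimension with labelled categories, one value per category.

    The output layout is an assumption made for this sketch only.
    """
    records = as_records(rows)
    return {
        "dimension": dimension,
        "categories": {r[dimension]: r[label] for r in records},
        "values": {r[dimension]: int(r[measure]) for r in records},
    }


rows = json.loads(RESPONSE)
records = as_records(rows)
cube = as_cube(rows)

print(records[0])            # one "individual" with its fields
print(cube["values"]["06"])  # population for the category with ID "06"
```

The records view treats each state as an individual with a population field;
the cube view treats "state" as a dimension and the population counts as
values indexed by its categories.
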
>
> To solve this special case (tabular data that is actually cubical,
> multidimensional) I have proposed JSON-stat:
>
> http://json-stat.org/doc/
>
> Besides, the statistical community uses the SDMX standard for expressing
> statistics and is currently working on a JSON façade (SDMX-JSON). I'm a
> member of the SDMX-JSON group. JSON-stat is used in that group as a
> starting point.
>
> Probably we could benefit from some of your ideas.
>
>
> On Mon, Nov 26, 2012 at 11:34 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>
>> Hi All,
>>
>> I've been working on a simple schema for tabular data. The schema is
>> designed to be expressible in JSON.
>>
>> http://www.dataprotocols.org/en/latest/json-table-schema.html
>>
>> This is still incomplete (e.g. need to have format specified in more
>> detail) but I'd be very interested in any feedback or thoughts (e.g.
>> is this re-inventing the wheel - if so what is better?).
>>
>> Regards,
>>
>> Rufus
>>
>> ## Background
>>
>> In many ways this is just an extraction, with some refactoring, of
>> what was in the Simple Data Format spec:
>>
>> <http://www.dataprotocols.org/en/latest/simple-data-format.html>
>>
>> Splitting out into its own mini-RFC is good because smaller pieces are
>> more useful and it makes it re-usable (e.g. can be used from the data
>> packages spec).
>>
>> Real world use: something very like this is used in ReclineJS:
>> <http://reclinejs.com/docs/models.html#field> and also in the CKAN API
>> <http://docs.ckan.org/en/ckan-1.8/datastore-api.html>
>>
>> _______________________________________________
>> data-protocols mailing list
>> data-protocols at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/data-protocols
>> Unsubscribe: http://lists.okfn.org/mailman/options/data-protocols
>>