[data-protocols] Thoughts on JSON Table Schema

Rufus Pollock rufus.pollock at okfn.org
Tue Jul 7 08:21:38 UTC 2015


Hi Alex,

This is a great thread and we would like to answer in detail. However, I
have one small process point: although it is not obvious, we are deprecating
this mailing list and moving to the forum:

https://discuss.okfn.org/c/open-knowledge-labs/data-packages

Would you mind quickly reposting your item there? We can then run the
thread there.

Rufus

On 6 July 2015 at 18:35, Alex Dean <alex at snowplowanalytics.com> wrote:

> Hi,
>
> First can I say I am a long-time follower and huge fan of the
> dataprotocols.org project.
>
> At Snowplow we are thinking of using JSON Table Schema in our Iglu schema
> repository system:
>
> https://github.com/snowplow/iglu
>
> First a quick question - I couldn't find a JSON Schema for the JSON Table
> Schema. Has anybody written this yet?
>
> More broadly: I'm not convinced that the current unitary JSON Table Schema
> is a viable approach.
>
> Different relational databases have different capabilities - for example,
> a valid table definition for Redshift must have SORTKEY and DISTKEY, and
> indexes are not supported. This is distinct from Postgres DDL, which in
> turn is distinct from BigQuery DDL, Vertica DDL etc.
>
> For me, the value of a JSON Table Schema would be in making table DDL
> declarative and composable. To be useful though, it must be possible to
> generate valid idiomatic (i.e. database-specific) DDL from a given instance
> of a JSON Table Schema.
>
> Based on this, I'm leaning towards a JSON Table Schema which has
> database-specific flavors. I think the two options here are:
>
>    1. Create a separate definition document (in JSON Schema) for each
>    database that we want to support, or
>    2. Create a unitary JSON Table Schema which uses enums of e.g.
>    database-specific field-descriptor types to support differences
>
> The downside of the first option is that there is no guaranteed
> predictability of schema shape between different database types. The second
> option is a little more fiddly but probably more useful long-term.
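
To make the second option concrete, here is a minimal sketch of a unitary
schema that carries database-specific settings and is rendered to DDL per
database. The "fields", "name" and "type" keys follow JSON Table Schema; the
"dialects" key, the TYPE_MAP table and the to_ddl function are hypothetical
names invented for illustration and are not part of any spec.

```python
# Illustrative assumption: per-database type names for a few schema types.
TYPE_MAP = {
    "postgres": {"string": "TEXT", "integer": "BIGINT", "datetime": "TIMESTAMP"},
    "redshift": {"string": "VARCHAR(255)", "integer": "BIGINT", "datetime": "TIMESTAMP"},
}


def to_ddl(table, schema, dialect):
    """Render a CREATE TABLE statement for one database dialect."""
    types = TYPE_MAP[dialect]
    columns = ", ".join(
        f'{field["name"]} {types[field["type"]]}' for field in schema["fields"]
    )
    ddl = f"CREATE TABLE {table} ({columns})"

    # Hypothetical "dialects" block holding database-specific options.
    options = schema.get("dialects", {}).get(dialect, {})
    if dialect == "redshift":
        if "distkey" in options:
            ddl += f' DISTKEY({options["distkey"]})'
        if "sortkey" in options:
            ddl += f' SORTKEY({", ".join(options["sortkey"])})'
    return ddl + ";"


schema = {
    "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "collector_tstamp", "type": "datetime"},
    ],
    "dialects": {
        "redshift": {"distkey": "event_id", "sortkey": ["collector_tstamp"]},
    },
}

print(to_ddl("events", schema, "postgres"))
print(to_ddl("events", schema, "redshift"))
```

The shared "fields" list keeps the schema shape predictable across databases,
while anything a given database cannot express (such as Redshift keys on
Postgres) is simply ignored by the other renderers.
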
>
> Does anybody have any thoughts on the above?
>
> Thanks,
>
> Alex
>
> --
> Co-founder
> Snowplow Analytics <http://snowplowanalytics.com/>
> The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
> +44 (0)203 589 6116
> @alexcrdean <https://twitter.com/alexcrdean>


-- 

Rufus Pollock
Founder and President | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
Open Knowledge <http://okfn.org/> - see how data can change the world
http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/>