[data-protocols] Thoughts on JSON Table Schema

Alex Dean alex at snowplowanalytics.com
Mon Jul 6 17:35:09 UTC 2015


Hi,

First, can I say that I am a long-time follower and huge fan of the
dataprotocols.org project.

At Snowplow we are thinking of using JSON Table Schema in our Iglu schema
repository system:

https://github.com/snowplow/iglu

A quick question to start: I couldn't find a JSON Schema for JSON Table
Schema itself. Has anybody written one yet?

More broadly: I'm not convinced that the current unitary JSON Table Schema
is a viable approach.

Different relational databases have different capabilities - for example,
an idiomatic table definition for Redshift should declare a SORTKEY and a
DISTKEY, and indexes are not supported. This is distinct from Postgres DDL,
which in turn is distinct from BigQuery DDL, Vertica DDL etc.

For me, the value of a JSON Table Schema would be in making table DDL
declarative and composable. To be useful though, it must be possible to
generate valid idiomatic (i.e. database-specific) DDL from a given instance
of a JSON Table Schema.
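
As a rough illustration of the requirement above, here is a minimal Python
sketch that renders one schema instance as either Postgres or Redshift DDL.
The "fields", "name" and "type" keys follow JSON Table Schema; everything
else is an assumption made for the example: the TYPE_MAP column mappings,
the to_ddl helper, and the top-level "redshift" extension key carrying
"distkey" and "sortkey" are all hypothetical and not part of any spec.

```python
# Hypothetical mapping from JSON Table Schema field types to column types.
TYPE_MAP = {
    "postgres": {"string": "TEXT", "integer": "BIGINT", "boolean": "BOOLEAN"},
    "redshift": {"string": "VARCHAR(255)", "integer": "BIGINT", "boolean": "BOOLEAN"},
}

# Example schema instance. The "redshift" key is an invented extension.
SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string"},
    ],
    "redshift": {"distkey": "id", "sortkey": ["id"]},
}


def to_ddl(table, schema, dialect):
    """Render a CREATE TABLE statement for the given dialect."""
    types = TYPE_MAP[dialect]
    columns = ",\n".join(
        f"  {field['name']} {types[field['type']]}" for field in schema["fields"]
    )
    ddl = f"CREATE TABLE {table} (\n{columns}\n)"
    if dialect == "redshift":
        # Only Redshift understands these table attributes.
        options = schema.get("redshift", {})
        if "distkey" in options:
            ddl += f"\nDISTKEY ({options['distkey']})"
        if "sortkey" in options:
            ddl += f"\nSORTKEY ({', '.join(options['sortkey'])})"
    return ddl + ";"


if __name__ == "__main__":
    print(to_ddl("events", SCHEMA, "postgres"))
    print(to_ddl("events", SCHEMA, "redshift"))
```

The same instance yields plain DDL for Postgres and DDL with DISTKEY and
SORTKEY clauses for Redshift, which is the kind of database-specific output
a single generic descriptor cannot express without some extension.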

Based on this, I'm leaning towards a JSON Table Schema which has
database-specific flavors. I think the two options here are:

   1. Create a separate definition document (in JSON Schema) for each
   database that we want to support, or
   2. Create a unitary JSON Table Schema which uses enums of e.g.
   database-specific field-descriptor types to support differences

The downside of the first option is that nothing guarantees a predictable
schema shape across different database types. The second option is a
little more fiddly but probably more useful long-term.
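
To make the second option more concrete, here is a minimal Python sketch
of a unitary schema shape whose field types are checked against a
per-database enum. The "database" key, the ALLOWED_TYPES sets and the
validate helper are hypothetical names invented for this example, and the
type lists are deliberately incomplete.

```python
# Hypothetical per-database enums of permitted field-descriptor types.
ALLOWED_TYPES = {
    "postgres": {"text", "bigint", "boolean", "timestamp", "jsonb"},
    "redshift": {"varchar", "bigint", "boolean", "timestamp"},
}


def validate(schema):
    """Return a list of error messages; an empty list means the schema is valid."""
    database = schema.get("database")
    if database not in ALLOWED_TYPES:
        return [f"unknown database: {database!r}"]
    errors = []
    for field in schema.get("fields", []):
        if field.get("type") not in ALLOWED_TYPES[database]:
            errors.append(
                f"{field.get('name')}: type {field.get('type')!r} "
                f"is not valid for {database}"
            )
    return errors


if __name__ == "__main__":
    schema = {
        "database": "redshift",
        "fields": [
            {"name": "id", "type": "bigint"},
            {"name": "payload", "type": "jsonb"},
        ],
    }
    for error in validate(schema):
        print(error)
```

The document shape stays the same for every database; only the set of
permitted type values changes, which is what keeps the shape predictable.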

Does anybody have any thoughts on the above?

Thanks,

Alex

-- 
Co-founder
Snowplow Analytics <http://snowplowanalytics.com/>
The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
+44 (0)203 589 6116
@alexcrdean <https://twitter.com/alexcrdean>