[data-protocols] Thoughts on JSON Table Schema

Alex Dean alex at snowplowanalytics.com
Mon Jul 6 18:15:43 UTC 2015


Thanks Paul! Any thoughts on my other, more rambling point?

Cheers,

Alex

On Mon, Jul 6, 2015 at 6:56 PM, Paul Walsh <paulywalsh at gmail.com> wrote:

> Hi,
>
> For the quick question: a JSON Schema of JSON Table Schema is here:
> https://github.com/dataprotocols/schemas/blob/master/json-table-schema.json
>
> Best,
>
> Paul
>
> On 6 Jul 2015, at 20:35, Alex Dean <alex at snowplowanalytics.com> wrote:
>
> Hi,
>
> First, can I say that I am a long-time follower and huge fan of the
> dataprotocols.org project.
>
> At Snowplow we are thinking of using JSON Table Schema in our Iglu schema
> repository system:
>
> https://github.com/snowplow/iglu
>
> First, a quick question: I couldn't find a JSON Schema for JSON Table
> Schema itself. Has anybody written one yet?
>
> More broadly: I'm not convinced that the current unitary JSON Table Schema
> is a viable approach.
>
> Different relational databases have different capabilities. For example,
> an idiomatic table definition for Redshift declares a SORTKEY and a
> DISTKEY (both are optional, but they matter for performance), and indexes
> are not supported. This is distinct from Postgres DDL, which in turn is
> distinct from BigQuery DDL, Vertica DDL, etc.
>
> For me, the value of a JSON Table Schema would be in making table DDL
> declarative and composable. To be useful, though, it must be possible to
> generate valid, idiomatic (i.e. database-specific) DDL from a given
> instance of a JSON Table Schema.
>
> Based on this, I'm leaning towards a JSON Table Schema which has
> database-specific flavors. I think the two options here are:
>
>    1. Create a separate definition document (in JSON Schema) for each
>    database that we want to support, or
>    2. Create a unitary JSON Table Schema which uses enums of e.g.
>    database-specific field-descriptor types to support differences
>
> The downside of the first option is that nothing guarantees that the
> schemas for different databases will share a predictable shape. The second
> option is a little more fiddly but probably more useful long-term.
>
> Does anybody have any thoughts on the above?
>
> Thanks,
>
> Alex
>
> --
> Co-founder
> Snowplow Analytics <http://snowplowanalytics.com/>
> The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
> +44 (0)203 589 6116
> @alexcrdean <https://twitter.com/alexcrdean>
>
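
To make option 2 from the quoted message a bit more concrete, here is a
minimal sketch in Python of how a single schema document with
database-specific extensions might be turned into DDL. Everything beyond the
"fields" / "name" / "type" layout is an assumption for illustration only: the
"dialects" key, the Dialect enum, the TYPE_MAP column types and the
render_ddl function are hypothetical names, not part of JSON Table Schema or
of Iglu.

```python
from enum import Enum


class Dialect(Enum):
    """Hypothetical enum of supported target databases."""
    POSTGRES = "postgres"
    REDSHIFT = "redshift"


# Assumed mapping from JSON Table Schema field types to column types.
# The concrete column types chosen here are illustrative, not normative.
TYPE_MAP = {
    Dialect.POSTGRES: {
        "string": "TEXT",
        "integer": "BIGINT",
        "number": "DOUBLE PRECISION",
        "boolean": "BOOLEAN",
        "datetime": "TIMESTAMP",
    },
    Dialect.REDSHIFT: {
        "string": "VARCHAR(4096)",
        "integer": "BIGINT",
        "number": "DOUBLE PRECISION",
        "boolean": "BOOLEAN",
        "datetime": "TIMESTAMP",
    },
}

# One shared schema; the "dialects" block is a hypothetical extension that
# carries the database-specific settings.
EXAMPLE_SCHEMA = {
    "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "collector_tstamp", "type": "datetime"},
    ],
    "dialects": {
        "redshift": {"distkey": "event_id", "sortkey": ["collector_tstamp"]},
    },
}


def render_ddl(table_name, schema, dialect):
    """Render a CREATE TABLE statement for the given dialect."""
    columns = [
        f'  "{field["name"]}" {TYPE_MAP[dialect][field["type"]]}'
        for field in schema["fields"]
    ]
    ddl = f'CREATE TABLE "{table_name}" (\n' + ",\n".join(columns) + "\n)"

    # Only Redshift understands DISTKEY / SORTKEY table attributes.
    options = schema.get("dialects", {}).get(dialect.value, {})
    if dialect is Dialect.REDSHIFT:
        if "distkey" in options:
            ddl += f'\nDISTKEY ("{options["distkey"]}")'
        if "sortkey" in options:
            keys = ", ".join(f'"{key}"' for key in options["sortkey"])
            ddl += f"\nSORTKEY ({keys})"
    return ddl + ";"


if __name__ == "__main__":
    for dialect in Dialect:
        print(render_ddl("events", EXAMPLE_SCHEMA, dialect))
```

The shape of the schema stays the same for every database, and only the
per-dialect block and the type mapping vary, which is the predictability
that option 1 would give up.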


-- 
Co-founder
Snowplow Analytics <http://snowplowanalytics.com/>
The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
+44 (0)203 589 6116
+44 7881 622 925
@alexcrdean <https://twitter.com/alexcrdean>