[data-protocols] Thoughts on JSON Table Schema

Alex Dean alex at snowplowanalytics.com
Mon Jul 6 18:47:30 UTC 2015


Hi Paul,

Ah - thanks for that, super-helpful. I have created a ticket here:

https://github.com/dataprotocols/dataprotocols/issues/205

I think I am probably a bit blinkered coming at this from a DDL-generation
viewpoint - what are the main use cases for a generic JSON Table Schema
which cannot generate db-specific DDL?

Cheers,

Alex

On Mon, Jul 6, 2015 at 7:28 PM, Paul Walsh <paulywalsh at gmail.com> wrote:

> Hi Alex,
>
> Well, the way I see it, JSON Table Schema is trying to get to some rough
> consensus on the minimal viable information we need to describe tabular
> data in a way that is generally applicable to a wide range of input/output
> formats.
>
> It can’t be specific, because then the specificity never ends - today
> Redshift, tomorrow…
>
> However, I can see the utility of an extra document that provides
> additional information as you describe. I think the best way to start
> tackling this is to pitch the idea with some examples of how it may look.
> We could do it here on the list, but the issue tracker may be better, as
> most discussion on ideas like this occurs there:
> https://github.com/dataprotocols/dataprotocols/issues
>
> Best,
>
> Paul
>
>
>
>
> On 6 Jul 2015, at 21:15, Alex Dean <alex at snowplowanalytics.com> wrote:
>
> Thanks Paul! Any thoughts on my other, more rambling point?
>
> Cheers,
>
> Alex
>
> On Mon, Jul 6, 2015 at 6:56 PM, Paul Walsh <paulywalsh at gmail.com> wrote:
>
>> Hi,
>>
>> For the quick question: a JSON Schema of JSON Table Schema is here:
>> https://github.com/dataprotocols/schemas/blob/master/json-table-schema.json
>>
>> Best,
>>
>> Paul
>>
>> On 6 Jul 2015, at 20:35, Alex Dean <alex at snowplowanalytics.com> wrote:
>>
>> Hi,
>>
>> First can I say I am a long-time follower and huge fan of the
>> dataprotocols.org project.
>>
>> At Snowplow we are thinking of using JSON Table Schema in our Iglu schema
>> repository system:
>>
>> https://github.com/snowplow/iglu
>>
>> First a quick question - I couldn't find a JSON Schema for the JSON Table
>> Schema. Has anybody written this yet?
>>
>> More broadly: I'm not convinced that the current unitary JSON Table
>> Schema is a viable approach.
>>
>> Different relational databases have different capabilities - for example,
>> a valid table definition for Redshift must have SORTKEY and DISTKEY, and
>> indexes are not supported. This is distinct from Postgres DDL, which in
>> turn is distinct from BigQuery DDL, Vertica DDL etc.
>>
>> For me, the value of a JSON Table Schema would be in making table DDL
>> declarative and composable. To be useful though, it must be possible to
>> generate valid idiomatic (i.e. database-specific) DDL from a given instance
>> of a JSON Table Schema.
>>
>> Based on this, I'm leaning towards a JSON Table Schema which has
>> database-specific flavors. I think the two options here are:
>>
>>    1. Create a separate definition document (in JSON Schema) for each
>>    database that we want to support, or
>>    2. Create a unitary JSON Table Schema which uses enums of e.g.
>>    database-specific field-descriptor types to support differences
>>
>> The downside of the first option is that there is no guaranteed
>> predictability of schema shape between different database types. The second
>> option is a little more fiddly but probably more useful long-term.
>>
>> Does anybody have any thoughts on the above?
>>
>> Thanks,
>>
>> Alex
>>
>> --
>> Co-founder
>> Snowplow Analytics <http://snowplowanalytics.com/>
>> The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
>> +44 (0)203 589 6116
>> @alexcrdean <https://twitter.com/alexcrdean>
>>  _______________________________________________
>> data-protocols mailing list
>> data-protocols at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/data-protocols
>> Unsubscribe: https://lists.okfn.org/mailman/options/data-protocols
>>
>>
>>
>
>
> --
> Co-founder
> Snowplow Analytics <http://snowplowanalytics.com/>
> The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
> +44 (0)203 589 6116
> +44 7881 622 925
> @alexcrdean <https://twitter.com/alexcrdean>
>
>
>


-- 
Co-founder
Snowplow Analytics <http://snowplowanalytics.com/>
The Roma Building, 32-38 Scrutton Street, London EC2A 4RQ, United Kingdom
+44 (0)203 589 6116
+44 7881 622 925
@alexcrdean <https://twitter.com/alexcrdean>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.okfn.org/pipermail/data-protocols/attachments/20150706/92c5dc7e/attachment.html>


More information about the data-protocols mailing list