[data-protocols] Simple Data Format: straw man

Rufus Pollock rufus.pollock at okfn.org
Wed May 16 20:05:40 BST 2012


On 16 May 2012 15:29, Friedrich Lindenberg
<friedrich.lindenberg at okfn.org> wrote:
> Hi all,
>
> this is becoming quite complex, but I don't think we have our use
> cases straight at all yet. So there are three different problems we
> could try and solve:
>
> 1) Provenance metadata (as per @Francis). Different discussion.
> 2) Column-level metadata, typed CSV (SDF, as far as I understand)
> 3) A logical model of a dataset which can then be represented in CSV

SDF could be extended somewhat to support (3) -- see response to Nick.
I'd held back because I'm not sure about the complexity that full DSPL
brings.
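
To make item (2) above concrete, here is a minimal sketch of column-level
metadata applied to a plain CSV. It is not the SDF proposal itself: the
"id" and "type" keys, the three type names and the read_typed helper are
all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical column metadata: one entry per CSV column.
# The "id"/"type" keys and the type names are illustrative assumptions.
FIELDS = [
    {"id": "country", "type": "string"},
    {"id": "year", "type": "integer"},
    {"id": "amount", "type": "number"},
]

# Map each declared type name to a Python caster.
CASTERS = {"string": str, "integer": int, "number": float}


def read_typed(text, fields):
    """Parse CSV text and cast each cell according to the column metadata."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for raw in reader:
        rows.append({f["id"]: CASTERS[f["type"]](raw[f["id"]]) for f in fields})
    return rows


SAMPLE = "country,year,amount\nGB,2011,12.5\nDE,2012,7\n"
rows = read_typed(SAMPLE, FIELDS)
```

The data stays ordinary CSV; only the small metadata list says how to
interpret each column.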

> Out of these, I think the only thing that really adds the necessary
> value while being largely unresolved is #3. Unfortunately, it requires
> some degree of #1 and #2. What I mean by logical model is: something
> that

What currently resolves (2) for you?

> - describes which entities are represented (please don't kill me
> over the noun, I want to cover both dimensions and facts),
> - which attributes they have,
> - what roles those attributes play (measures, primary keys) and
> - which links combine the different entities.
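
The four points in the list above can be sketched as a small data
structure. This is only one possible reading of the "logical model" idea,
not anything DSPL or SDF defines: the class names, the role names ("key",
"measure", "attribute") and the example entities are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    type: str                  # e.g. "string", "integer", "number"
    role: str = "attribute"    # assumed roles: "key", "measure", "attribute"


@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)

    def keys(self):
        """Names of the attributes acting as primary keys."""
        return [a.name for a in self.attributes if a.role == "key"]

    def measures(self):
        """Names of the attributes acting as measures."""
        return [a.name for a in self.attributes if a.role == "measure"]


@dataclass
class Link:
    source: str  # "entity.attribute" on the referring side
    target: str  # "entity.attribute" being referred to


# A dimension-like entity and a fact-like entity, joined by one link.
country = Entity("country", [
    Attribute("code", "string", role="key"),
    Attribute("name", "string"),
])
spending = Entity("spending", [
    Attribute("country_code", "string"),
    Attribute("year", "integer"),
    Attribute("amount", "number", role="measure"),
])
links = [Link("spending.country_code", "country.code")]
```

Each entity could then be stored as its own CSV file, with the model
above shipped alongside as metadata.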

[...]

> In all of this, I believe this has to be opinionated and pragmatic. We
> have to make some decisions without developing a theory of knowledge.
> That means you need to answer the type question, not provide 15
> answers like the current SDF proposal.

I think it's 2 (without the JSON-LD support it would be 1). I'm happy
to drop JSON-LD support but I do think it is a nice option.

> If we can in any way agree not to use linked data, I could avert my
> suicidal tendencies associated with interacting with that community.
> This goes as far as not re-using existing RDF ontologies, because that
> brings them to your home and they will come at night and demand you
> triplify your own family.

You're a clear -1 on JSON-LD support :-)

> As for the storage format, I think CSV doesn't open all the doors you
> may want on your caravan, but it *works*. Once you become agnostic (or
> even just slightly polyamorous, i.e. JSON), you have to go into a "set
> of format factories" thing in your implementation again, which just
> makes it a non-solution because you cannot rely upon support. I have
> dBase files and I'm not afraid to use them!

OK, we have a clear -1 on any JSON support for the data transport. And
I definitely agree with having fewer ways (preferably one) to do
things.

> In general, I think we should work up the requirements a bit more, but
> I think for the kinds of cases that we're mostly looking at, DSPL/JSON
> may be a very good starting point that we can use and develop. This
> would keep it lean, not requiring people to buy into a whole data
> package idea or triplification theory.

I'm not sure there's a "whole data package idea" -- it's pretty basic.

Using Data Package metadata was just the equivalent of the basic
metadata that DSPL ships, but it meant we could reuse some of the Data
Package idea (all SDF datasets would be data packages, but not all data
packages would be SDF).

Rufus
