[datacatalogs] Request for Comments on Draft Data Catalog Standard (Schema and Protocol)

Rufus Pollock rufus.pollock at okfn.org
Tue Jun 12 15:22:26 UTC 2012


On 12 June 2012 14:34, Ed Summers <ehs at pobox.com> wrote:
> Thanks for the update Rufus. It's nice to see a simple JSON
> representation of DCAT come out of this work. A few questions:
>
> 1) Will a dataset have a URL? I think it would be useful for it to
> have one, and for JSON and RDF flavored representations of the dataset
> to include it. I think it will also be useful to include it in the
> changes.json representation instead of dataset_id. A side effect of

Agreed. Richard also raised this when I spoke with him in person last
year. This would also help with the protocol.

> assigning URLs to data catalogs is that publishers can also have a
> RESTful mechanism for updating and deleting datasets if they need it.
> But maybe this is opening a can of worms?

One question, of course, is whether the URL is the API URL or the URL of
the human-readable version (sure, with content negotiation this isn't
so relevant, but will everyone support that ...)
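
To make the content-negotiation option concrete, here is a minimal sketch
of a client asking one dataset URL for either form via the Accept header.
The URL and the helper name are made up for illustration; the draft does
not define a URL scheme, and whether catalogs would honour the header is
exactly the open question.

```python
from urllib.request import Request

# Hypothetical dataset URL; the draft does not define a URL scheme yet.
DATASET_URL = "http://example.org/dataset/example-dataset"


def build_request(url, media_type):
    """Build (but do not send) a GET request asking for one representation."""
    return Request(url, headers={"Accept": media_type})


# Same URL, two representations: one for API clients, one for people.
api_request = build_request(DATASET_URL, "application/json")
human_request = build_request(DATASET_URL, "text/html")
```

If a catalog ignores the Accept header, the two requests are
indistinguishable, which is why separate API and human URLs may still be
needed.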

> 2) Are the full representations of the dataset made available in the
> changes.json, or will clients need to fetch the dataset to get the
> full information?

They will need to fetch. changes.json is lightweight.
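
Roughly, a client would work in two steps: read the lightweight entries
from changes.json, then fetch each dataset separately. A minimal sketch
follows; the `dataset_url` key and both function names are assumptions for
illustration, not part of the draft.

```python
import json
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_changed_datasets(changes, fetch_json=http_get_json):
    """Return the full record for every dataset named in a changes list.

    `changes` is the decoded changes.json list; each entry is assumed to
    carry a `dataset_url` key (hypothetical name).
    """
    seen = set()
    datasets = []
    for entry in changes:
        url = entry["dataset_url"]
        if url in seen:  # a dataset may appear more than once in the list
            continue
        seen.add(url)
        datasets.append(fetch_json(url))
    return datasets
```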

> 3) Have you considered requiring the changes.json to be ordered by
> dataset modification time, and adding an assertion about the next page
> of results to the JSON similar to what Atom, RSS and HTML provide? I

The idea is that it would by default be ordered by modification time
(though that info should be added).

> think this would mean the requirement for the "since" and "page"
> parameters would go away. A client wishing to receive updates since a
> particular time would simply keep reading pages until it found the
> datetime in question. It would also make the API more in-line with the
> REST principle of Hypertext as the Engine of Application State [2].

Nice point.

> To do this I think your changes.json would need to have a bit more
> structure similar to what Google did when JSON-ifying Atom with its
> JSON-C [3]. Rather than returning a flat list of items, you can push
> them down a level, so that you can include metadata about the list:
>
> {
>  "items": [ ... ],
>  "previous": "http://example.org/changes.json?page=1",
>  "next": "http://example.org/changes.json?page=3"
> }

I like that.
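
A rough sketch of how a client could use that structure to pick up changes
since a given time without "since" or "page" parameters. It assumes pages
are ordered newest first and that each item carries a `modified` ISO 8601
timestamp; that field name, the function name and the injected
`fetch_json` callable are all hypothetical.

```python
from datetime import datetime


def changes_since(first_page_url, since, fetch_json):
    """Collect change entries newer than `since` by following "next" links.

    Assumes newest-first ordering and a `modified` ISO 8601 timestamp on
    each item (hypothetical field name). `fetch_json` maps a URL to the
    decoded JSON document at that URL.
    """
    collected = []
    url = first_page_url
    while url:
        page = fetch_json(url)
        for item in page["items"]:
            if datetime.fromisoformat(item["modified"]) <= since:
                return collected  # reached changes the client already has
            collected.append(item)
        url = page.get("next")  # no "next" link on the last page
    return collected
```

The client never builds a URL itself; it only follows the links the
server hands out.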

> Clients that want to drill backwards can simply follow the link and
> you will not need to have an "API" outside of what REST already
> provides.
>
> 4) Are both JSON and RDF representations of the dataset required?

No, only the JSON representation is *required* at the moment, with the
option to provide other formats (N3, RDF/XML, etc.)

Rufus

> //Ed
>
> [1] http://www.iana.org/assignments/link-relations/link-relations.xml
> [2] http://en.wikipedia.org/wiki/HATEOAS
> [3] https://developers.google.com/youtube/2.0/developers_guide_jsonc



-- 
Co-Founder, Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/



