[ckan-discuss] API For Package Name
John Bywater
john.bywater at appropriatesoftware.net
Sun Nov 28 23:12:33 GMT 2010
Hi Will,
William Waites wrote:
> * [2010-11-28 14:00:20 +0000] John Bywater <john.bywater at appropriatesoftware.net> wrote:
> ]
> ] I also found a message from Jeni Tennison, which started a long thread:
> ] http://www.mail-archive.com/public-lod@w3.org/msg04086.html
> ]
> ] [...]
I'd like to find out whether or not Jeni got what she was looking for.
> ]
> ] So, looking at all this from a distance, there appear to be two poles:
> ] RDF for RDF-enabled developers and RDF-enabled generic tools, and (for
> ] the sake of simplicity, let's assume) domain driven design for "normal"
> ] developers who are writing "normal" applications.
>
> Yes, the RDF/JSON is basically used to avoid needing to parse RDF.
> Apart from that it is just another serialisation, the data model stays
> the same.
>
You mean parse RDF/XML? Does RDF/JSON not have exactly the same
complexity? How is it adequate otherwise? Questions for myself perhaps...
> That's why I made http://bnb.bibliographica.org/isbn for the Wikimedia DE
> folks, it's lossy but easy to understand.
>
Very nice!
> ] 1. A domain model is a model of behaviour-state. An RDF model is a model
> ] of state (RDF models behaviour as state). That is, a Domain model
> ] constitutes its own changes of state, whereas an RDF model expects
> ] something external to constitute changes of state.
>
> There has been some work recently that could introduce some notion of
> state transitions -- it's tied to provenance:
>
> http://iswc2010.semanticweb.org/accepted-papers/241
>
> so to express the state change from t0 -> t2 you write down the
> SPARQL SELECT/INSERT/DELETE statements that describe the transition
> (more or less). You do the writing down in RDF. So the transition
> behaviour itself becomes state in the "code is data" sense.
>
> I could imagine (not saying this would be a good idea) carrying around
> snippets of python code for operating on the data in the data itself
> to express more or less the same thing. But this is maybe a bit "out
> there".
>
Very interesting!
> ] 2. If CKAN presents JSON with absolute URLs where there are currently
> ] invariant UUIDs (in Version 2, or variable names in Version 1), which
> ] existing tools would be able to undertake traversals? For example, would
> ] Googlebot (or anything else in operation today) treat URLs in JSON as
> ] links, and follow them?
>
> I'm not sure but I don't think Googlebot will follow them. However for
> the existing spiders there are always the html pages that contain
> parallel linkages that they can crawl.
>
Would any other tools be able to take advantage of it? It seems to be
similar to CSV. CSV also doesn't "have a concept of" URLs. I've always
liked the idea of using URLs as identifiers in the API, but I'm just
trying to identify exactly what we gain by introducing URLs into the data
formats.
As a lower limit, perhaps we can say that what is good for the CSV goose
must be good for the JSON gander? Beyond that, I'm wondering who
benefits. Apart from getting five stars for the API. :-)
Let's try to keep this discussion open?
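To make the question concrete, here is a minimal sketch of what a generic
client could do with URLs in JSON that it cannot do with bare identifiers.
The two package representations, the ckan.example.org host and the UUID are
all hypothetical, not actual CKAN output; the point is only that absolute
URLs can be discovered without knowing the API's URL scheme.

```python
from urllib.parse import urlparse


def find_links(value):
    """Collect every string in a decoded JSON value that is an absolute http(s) URL."""
    if isinstance(value, str):
        parts = urlparse(value)
        is_link = parts.scheme in ("http", "https") and bool(parts.netloc)
        return [value] if is_link else []
    if isinstance(value, dict):
        return [url for item in value.values() for url in find_links(item)]
    if isinstance(value, list):
        return [url for item in value for url in find_links(item)]
    return []  # numbers, booleans and null carry no links


# Hypothetical representation using an opaque identifier for the group.
by_id = {
    "name": "bas-clamp",
    "groups": ["0f1e2d3c-4b5a-6978-8695-a4b3c2d1e0f9"],
}

# Hypothetical representation using an absolute URL for the same group.
by_url = {
    "name": "bas-clamp",
    "groups": ["http://ckan.example.org/api/rest/group/example-group"],
}

links_by_id = find_links(by_id)
links_by_url = find_links(by_url)
```

With identifiers the client finds nothing to follow unless it already knows
how to build a group URL; with URLs it finds one link. Whether any tool in
operation today actually does this with JSON is the open question.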
> ] 3. Can we make use of the HTTP 'Accept' header? We could continue to
> ] support DDD with application/json and introduce support for RDF with
> ] application/rdf+xml. We could discriminate with content negotiation.
>
> Yes this is a good idea. There's already some less automatic linkage
> in the HTML header and a "human click on this if you want RDF" link in
> the page. Autoneg would also be trivial to implement -- "requested
> some RDF? 303 -> semantic.ckan.net".
>
Sounds good! At the same time, it occurs to me that many of the other
interfaces I've seen over the last few days present the different
content types at the same locations. I can't remember seeing one that
redirects to a different domain. Is that a common or expected thing to do?
I don't know (think you might have told me once...) what exactly the
cohesive mechanism is behind the RDF service. Whatever it is, perhaps we
could at least consider calling the service from CKAN's API controllers
and have the RDF content returned directly? We could also support the
extensions .json .rdf .n3 so content type can be specified in the locator.
It's just a thought, we wouldn't *necessarily* need to swamp up the CKAN
codebase. :-)
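As a rough sketch of the selection logic only (not CKAN's controller code),
the choice between an extension in the locator and the 'Accept' header might
look like the following. The function name, the example path, the default of
application/json, and the use of text/n3 for the .n3 extension are all
assumptions made for illustration.

```python
import posixpath

# Extension to media type. The text/n3 entry is an assumption.
EXTENSIONS = {
    ".json": "application/json",
    ".rdf": "application/rdf+xml",
    ".n3": "text/n3",
}
SUPPORTED = set(EXTENSIONS.values())
DEFAULT = "application/json"  # assumed default: keep serving the domain model


def negotiate(path, accept=""):
    """Return (resource_path, media_type) for a request.

    An explicit extension wins. Otherwise the supported type with the
    highest q-value in the Accept header is used. Wildcards such as */*
    are not interpreted and simply fall through to the default.
    """
    root, ext = posixpath.splitext(path)
    if ext in EXTENSIONS:
        return root, EXTENSIONS[ext]

    candidates = []
    for position, item in enumerate(accept.split(",")):
        fields = item.strip().split(";")
        media = fields[0].strip().lower()
        quality = 1.0
        for param in fields[1:]:
            name, _, raw = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(raw)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        if media in SUPPORTED and quality > 0:
            # Sort by descending quality, then by order of appearance.
            candidates.append((-quality, position, media))

    if candidates:
        return path, min(candidates)[2]
    return path, DEFAULT


by_extension = negotiate("/api/rest/package/bas-clamp.n3")
by_header = negotiate(
    "/api/rest/package/bas-clamp",
    "application/rdf+xml;q=0.9, application/json;q=0.5",
)
fallback = negotiate("/api/rest/package/bas-clamp", "text/html")
```

Whatever this returns, the controller could then either serve the content
directly or answer with a 303 to the semantic host; the sketch takes no
position on that.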
> ] 4. Where does DCat fit in? Could DCat be the data format used to
> ] represent a package entity for clients that prefer response content type
> ] of application/rdf+xml? Would DCat be able to represent a package
> ] register? Or should we use other RDF elements for that? What about the
> ] other objects of the model, such as groups?
>
> DCat is just a vocabulary used in RDF. It represents some elements of
> a package registry, but other vocabularies are used as well. For an
> example, see http://semantic.ckan.net/package/bas-clamp.n3 and
> http://semantic.ckan.net/catalogue.n3 (note, turtle or N3 will be much
> easier for you as a human to read. Pay little attention to RDF/XML
> unless you are a computer).
>
Wow, that's really neat. I haven't looked at this before.
> ] 5. Can CKAN support RDF-enabled developers and RDF-enabled generic tools
> ] as a viewpoint on its domain model? The good thing is that the list of
> ] use cases for the semantic web is very short, and there ought to be (in
> ] the language of domain driven design) a cohesive mechanism for it. "The
> ] collection of Semantic Web technologies (RDF, OWL, SKOS, SPARQL, etc.)
> ] provides an environment where application can query that data, draw
> ] inferences using vocabularies, etc." That is, apart from fixing the
> ] content type, we can at least imagine that CKAN could support queries.
> ] http://www.w3.org/standards/semanticweb/data
>
> It already does: http://semantic.ckan.net/sparql and above links...
>
That's really great. I hadn't visited those pages before. You'll have to
explain to me how it works. I'm guessing (remembering?) there is a triple
store and you update it from CKAN's API?
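For anyone else who hasn't tried the endpoint, here is a minimal sketch of
how a query URL could be put together. It assumes the endpoint accepts the
query in a "query" parameter on a GET request, as the SPARQL protocol
describes; the helper name and the example query are illustrative, and the
sketch only builds the URL without sending anything.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://semantic.ckan.net/sparql"  # endpoint mentioned in this thread


def sparql_url(query, endpoint=ENDPOINT):
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Illustrative query: fetch any ten triples from the store.
example_query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
url = sparql_url(example_query)

# Decoding the URL again gives back the original query text.
round_trip = parse_qs(urlparse(url).query)["query"][0]
```

Fetching that URL with an 'Accept' header of application/sparql-results+json
or application/sparql-results+xml should, if the endpoint supports it, return
the result bindings in that format.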
> There's always room for improvement of course, most immediately
> separating out extension descriptions (e.g. so that Richard can
> generate voiD separately and have it pulled into the store) and doing
> something similar for the other instances, but I think CKAN already
> has the proverbial 5 stars.
>
That's great. What do you mean by "separating out extension descriptions"?
Is the separate "semantic" hostname particularly desirable? If so, would
it be desirable to have a "semantic." companion for each CKAN site? If
RDF was returned by the API, it could be returned by resources such as:
http://catalogue.data.gov.uk/api
What's '{"version": "1"}' in RDF? :-)
Whatever the service architecture, given there are so many possibilities
in between that appear to offer little but agony, we're in great shape.
I think we could very usefully document these different interfaces (the
Web Interface, the Semantic Interface, and the Domain Model Interface)
as a coherent multi-channel provision. I know the Web UI package details
page already presents links to package resources in different
formats. But we could make something more of the range of different
service capabilities. Now that we've identified the rather different
worlds each addresses, perhaps we could document the different
engineering purposes?
Or am I just catching up with what everybody already knows? :-)
Great discussion, thanks a lot for it. Let's also continue the
discussions about CKAN package extras, but maybe on ckan-dev?
Best wishes,
John.