[ckan-discuss] API For Package Name
John Bywater
john.bywater at appropriatesoftware.net
Sun Nov 28 14:00:20 GMT 2010
Hi Richard,
Richard Cyganiak wrote:
> On 24 Nov 2010, at 15:28, John Bywater wrote:
>> The API is not very hypermedia-driven at the moment. […] Perhaps we
>> could go right back to the start, and look at the package register? At
>> the moment it returns a list of package IDs (or package names in API
>> Version 1). I'm detecting that you'd slightly prefer a list of
>> absolute URLs. :-)
>
> Yes!
>
Thought so. ;-)
>> Please forgive me, but recently I have been unable to decide whether
>> or not the IDs can be treated as relative URLs, with the locator of
>> package register (somehow) as the base URL? What do you think about
>> that? Is there a definitive answer?
>
> Good question. Some representation formats, such as HTML (and, in fact,
> RDF!), are designed explicitly for hypermedia, and the format specifies
> which parts of the message are URLs and what base URL they are resolved
> against. That makes the use of relative URLs straightforward, which is
> good for keeping message size down. JSON, for all its advantages,
> unfortunately doesn't know a URL from a string.
>
Thanks for saying that.
How do we define hypermedia? From what I can see, a hypermedia system
doesn't need to involve URLs, but rather there needs to be: a series of
resources where any resource can reference any other resource; an
independent means of accessing representations of referenced resources;
and a capability to infer references to resources from within a
representation of a resource.
If so, CKAN+ckanclient appears already to be a hypermedia system.
Also, isn't RESTfulness more generic than either the Web or RDF?
That is, a RESTful API doesn't need to conform to the Web. A Web UI is
RESTful for Web clients. The CKAN API is RESTful for ckanclients. The
Web, RDF, and CKAN share common principles (the principles of REST).
They all use HTTP. But where the Web uses HTML, RDF uses RDF/XML (or
RDF/JSON, or something like that), and CKAN uses its own JSON schema.
>> Wikipedia says, "If it is likely that the client will want to access
>> related resources, these should be identified in the representation
>> returned, for example by providing their URIs in sufficient context,
>> such as hypertext links." There are identifiers. Are we missing the
>> "sufficient context", or is that provided by the published resource
>> locator templates? I really don't know. I've seen some discussions
>> about it being okay given that the locator templates are published.
>> But I wasn't totally convinced. :-)
>>
>> So, if we prefix each with the package registry locator, then the
>> message size goes up, but probably to no more than double. So that's
>> okay? And given the deliberate redundancy, it may be more susceptible
>> to compression than the average message.
>
> I think you summed up the trade-off very well. If URLs are included in
> the messages, then less context has to be hardcoded into clients -- I
> wouldn't need to tell the client that the URL for retrieving a package
> representation is obtained by concatenating the returned package ID to
> "http://ckan.net/api/rest/package/". That would simplify the clients,
> and make them more resilient against future change on the server (such
> as a move to a different domain!).
>
> On the downside, message size increases.
>
> Personally, I have a high tolerance for redundancy in messages. A
> side-effect of prolonged exposure to RDF ;-)
>
Thanks for the compliments. I'm not necessarily averse to that. The word
I should perhaps have selected is not redundancy but rather repetition?
Repetition is not redundant if it's necessary. :-)
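
The size trade-off discussed above (roughly double the raw size, but easier to compress) can be checked with a rough sketch like the one below. The register contents are made up: 1000 deterministic UUIDs stand in for Version 2 package IDs, and the locator is the one mentioned in this thread. Real registers and real server compression settings would give different numbers.

```python
import json
import uuid
import zlib

# Assumption: the package locator quoted in this thread.
PACKAGE_BASE = "http://ckan.net/api/rest/package/"

# Hypothetical register: 1000 UUID-style IDs (uuid5 keeps the sketch deterministic).
package_ids = [str(uuid.uuid5(uuid.NAMESPACE_URL, str(i))) for i in range(1000)]
package_urls = [PACKAGE_BASE + package_id for package_id in package_ids]


def sizes(values):
    """Return (raw, compressed) byte counts for the JSON-encoded list."""
    raw = json.dumps(values).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


ids_raw, ids_compressed = sizes(package_ids)
urls_raw, urls_compressed = sizes(package_urls)

print("ids:  raw=%d compressed=%d" % (ids_raw, ids_compressed))
print("urls: raw=%d compressed=%d" % (urls_raw, urls_compressed))
```

With these assumptions the repeated prefix costs a lot uncompressed but comparatively little once compressed, because every entry repeats the same leading string.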
> A good compromise might be to use URLs relative to the API base URL
> "http://ckan.net/api/rest/" when referring to resources within messages.
> So we'd have "package/this" and "package/that" and "group/foo" and
> "tag/bar" etc.
>
Yes, it might be. Or is that a half-way house that would please nobody?
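
To make the relative-URL option concrete, here is a minimal sketch of how a client might resolve such references. It assumes the client knows the API base URL out of band, which JSON itself does not provide; the function name `resolve` is hypothetical and not part of ckanclient.

```python
from urllib.parse import urljoin

# Assumption: the API base URL from this thread. The trailing slash matters.
API_BASE = "http://ckan.net/api/rest/"


def resolve(reference, base=API_BASE):
    """Resolve a reference found in a message against the API base URL."""
    return urljoin(base, reference)


references = ["package/this", "package/that", "group/foo", "tag/bar"]
resolved = [resolve(reference) for reference in references]

for url in resolved:
    print(url)
```

The same function passes absolute URLs through unchanged, so a client written this way would keep working if the server later switched to absolute URLs. Note that without the trailing slash on the base, standard resolution replaces the last path segment ("rest") rather than appending to it.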
>> Is that the sort of thing you'd like to see? We could make a list.
>>
>> The API is versioned, so we could develop all this into Version 3.
>
> +1!
>
Great. Let's see what we can do. There's a lot to consider.
I've been reading more about RDF. With its abstract data model, the data
formats (e.g. "RDF/XML"), and the libraries and tools, RDF appears to be
a hypermedia system par excellence. (The only fracture seems to be the
multitude of offerings which appear as attempts perhaps to win fame by
making RDF a bit more approachable. In doing so perhaps they only
compound the difficulties it presents? I'm not sure.)
I was also reading about the relation between linked data and RDF.
Whereas linked data seems originally to have been conceived in terms of
RDF (or at least to expect usage of RDF), the 5-stars of linked open
data clearly do not mandate RDF (you can use CSV and still have five stars).
I was also looking at how JSON appears within RDF and linked data. There
is RDF/JSON and JSON-LD. There is also (for example) JRON and probably a
dozen other offerings arrayed across a spectrum of possibility. None
appear to have achieved pre-eminence or even ubiquity. Therefore none of
them appear to be especially desirable (at least at the moment for a
"normal" developer).
I also found a message from Jeni Tennison, which started a long thread:
http://www.mail-archive.com/public-lod@w3.org/msg04086.html
<quote>
As part of the linked data work the UK government is doing, we're
looking at how to use the linked data that we have as the basis of APIs
that are readily usable by developers who really don't want to learn
about RDF or SPARQL.
One thing that we want to do is provide JSON representations of both RDF
graphs and SPARQL results. I wanted to run some ideas past this group as
to how we might do that.
To put this in context, what I think we should aim for is a pure
publishing format that is optimised for approachability for normal
developers, *not* an interchange format. RDF/JSON [1] and the SPARQL
results JSON format [2] aren't entirely satisfactory as far as I'm
concerned because of the way the objects of statements are represented
as JSON objects rather than as simple values. I still think we should
produce them (to wean people on to, and for those using more generic
tools), but I'd like to think about producing something that is a bit
more immediately approachable too.
</quote>
So, looking at all this from a distance, there appear to be two poles:
RDF for RDF-enabled developers and RDF-enabled generic tools, and (for
the sake of simplicity, let's assume) domain driven design for "normal"
developers who are writing "normal" applications.
I should admit that my head nearly exploded trying to match up RDF with
domain driven design. Metaphorically speaking, I had to switch
everything off for a while and allow the heat to dissipate. I was left
with the impression that there's a gross impedance mismatch between
domain driven design and RDF. That is to say:
1. A domain model is a model of behaviour-state. An RDF model is a model
of state (RDF models behaviour as state). That is, a domain model
constitutes its own changes of state, whereas an RDF model expects
something external to constitute changes of state.
2. The scope of RDF is the World; vocabularies are fashioned to
represent factual aspects of the World (even if such a fact pertains to
an actual fiction). The scope of a domain model is a circumscribed
domain; the rest of the World is always already out of scope. Domain
driven design fashions worlds from domains. The little worlds of domain
driven design are contrived and immanent fictions, but they are always
based on facts.
3. Domain driven design (from the standpoint of RDF) is therefore highly
parochial ("you should use URLs instead of local names so everybody
knows what you mean"), whereas RDF (from the standpoint of domain driven
design) is highly static ("you should collocate behaviour with state so
there is a rhizome of coherent objects that does useful work").
4. RDF can incorporate domain driven design as a way of fashioning a new
vocabulary, but without its behaviour not even instantiation can happen.
5. Domain driven design can incorporate RDF as a ready-made data model
for an infinite domain, but behaviour would need to be reconstituted
from a state representation of behaviour (software is also data).
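
The contrast in point 1 above might be illustrated with a toy sketch. Everything here is hypothetical: the `Package` class is not CKAN's model, and the URIs are invented for illustration. The only point is where the rule about tags lives.

```python
class Package:
    """Domain-model style: state and the behaviour that changes it live together."""

    def __init__(self, name):
        self.name = name
        self.tags = set()

    def add_tag(self, tag):
        # The object constitutes its own change of state and applies its own rule.
        cleaned = tag.strip().lower()
        if not cleaned:
            raise ValueError("tag must not be empty")
        self.tags.add(cleaned)


# RDF style: state only, as (subject, predicate, object) statements.
PACKAGE_URI = "http://example.org/package/example"  # hypothetical identifier
TAG_PREDICATE = "http://example.org/vocab#tag"      # hypothetical vocabulary term
triples = set()

# Something external changes the state; nothing in the data model applies the rule.
triples.add((PACKAGE_URI, TAG_PREDICATE, " Transport "))

package = Package("example")
package.add_tag(" Transport ")

print(package.tags)
print(triples)
```

In the first case the normalisation travels with the object; in the second it would have to be supplied by whatever program writes the statements.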
So I wondered:
1. Can we at once make CKAN conform with the 5-stars of linked open data
by using URLs for identifiers and enhance the experience for normal
developers? That is, would a better domain model be obtained by using
URLs instead of the locally typed and scoped identifiers that are
contextualized by (and therefore receive meaning from) the domain model?
CKAN would then have five stars (it currently has three stars?).
2. If CKAN presents JSON with absolute URLs where there are currently
invariant UUIDs (in Version 2, or variable names in Version 1), which
existing tools would be able to undertake traversals? For example, would
Googlebot (or anything else in operation today) treat URLs in JSON as
links, and follow them?
3. Can we make use of the HTTP 'Accept' header? We could continue to
support DDD with application/json and introduce support for RDF with
application/rdf+xml. We could discriminate with content negotiation.
4. Where does DCat fit in? Could DCat be the data format used to
represent a package entity for clients that prefer response content type
of application/rdf+xml? Would DCat be able to represent a package
register? Or should we use other RDF elements for that? What about the
other objects of the model, such as groups?
5. Can CKAN support RDF-enabled developers and RDF-enabled generic tools
as a viewpoint on its domain model? The good thing is that the list of
use cases for the semantic web is very short, and there ought to be (in
the language of domain driven design) a cohesive mechanism for it. "The
collection of Semantic Web technologies (RDF, OWL, SKOS, SPARQL, etc.)
provides an environment where application can query that data, draw
inferences using vocabularies, etc." That is, apart from fixing the
content type, we can at least imagine that CKAN could support queries.
http://www.w3.org/standards/semanticweb/data
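
As a rough sketch of question 3 above, the server-side choice between the two representations could look something like this. It is a deliberate simplification: it orders the Accept entries by q-value only, ignores the specificity rules of full HTTP content negotiation, and the function names are hypothetical rather than anything in CKAN.

```python
# Assumption: the two content types discussed in question 3.
SUPPORTED = ["application/json", "application/rdf+xml"]


def parse_accept(header):
    """Return [(media_type, q)] sorted by descending q (ties keep header order)."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_content_type(accept_header, supported=SUPPORTED, default="application/json"):
    """Pick a supported content type, or None if nothing acceptable is offered."""
    if not accept_header:
        return default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "application/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
    return None  # a server would typically answer 406 Not Acceptable


print(choose_content_type("application/rdf+xml"))
print(choose_content_type("text/html, application/rdf+xml;q=0.9, application/json;q=0.5"))
```

Existing clients that send no Accept header, or ask for application/json, would keep getting the current JSON, so RDF support could be added without breaking them.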
Best wishes,
John.
PS Can we do it? Yes we CKAN! :-)
> Best,
> Richard
>
>
>
>>
>> Best wishes,
>>
>> John.
>>
>>
>>> Best,
>>> Richard
>>
>