[ckan-discuss] CKAN package ID debate

David Read david.read at okfn.org
Tue Mar 23 18:11:50 GMT 2010


We're debating changes to the CKAN API to refer primarily to Package
IDs instead of Package Names. Read on below and please do contribute
your thoughts.

David

Users can interact with CKAN packages through the REST API by
referring to packages by name. Names don't change much, but we do want
to support mutability of this field, and so we're looking at using IDs
to refer to packages in the API instead, since these definitely don't
change, even when we start syncing packages across multiple CKAN
instances.

Examples of current use of package names in the API:
* Asking the API for a list of packages returns ['aiddata-china',
'naptan', 'water-voles-uk']
* Reading a package: api/rest/package/aiddata-china returns
"{'name':'aiddata-china', 'title':'Aid data for China', ...}"
* Searching returns a list of matching packages: ['aiddata-china',
'naptan', 'water-voles-uk']

Although the 'title' field is best for human reading, you may want to
change a package's name for a few reasons. The name may appear in a
URL somewhere, and it may need to reflect the content. e.g.
'water-voles-2006-09' may be better as 'water-voles' when it becomes
clear that the dataset will be updated in future years. Also we may
want to change 'osm' to 'open-street-map' to disambiguate when another
package with those initials comes along, or the project may change its
name to 'OpenMap' because of a legal dispute with OSM Inc and be keen
to change all references.

But there are advantages to using names in the API:
* more human readable
* aligns CKAN (and datapkg) with apt-get and CPAN, although I get the
impression those essentially don't allow module names to change

I therefore think we want to switch to using IDs. Dealing in names as
well is a 'nice to have', perhaps kept for backwards compatibility.

It's relatively simple to allow users to specify an ID instead of a
name in requests (with the API accepting either). The question is
whether we return names, IDs or both.
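
To illustrate accepting either form in a request, here is a minimal
sketch. The in-memory list, the find_package helper and the pairing of
IDs with names are all assumptions made for the example, not CKAN's
actual model or code.

```python
# Hypothetical in-memory store standing in for the real package table.
# The pairing of these IDs with these names is assumed for illustration.
PACKAGES = [
    {"id": "0d9ea8d59be5", "name": "aiddata-china"},
    {"id": "44758e5a0f9c", "name": "naptan"},
]


def find_package(ref, packages=PACKAGES):
    """Return the package whose ID or name equals ref, or None."""
    # Try IDs first, since they never change.
    for pkg in packages:
        if pkg["id"] == ref:
            return pkg
    # Fall back to the (mutable) name.
    for pkg in packages:
        if pkg["name"] == ref:
            return pkg
    return None
```

Checking IDs before names means that a package which happened to be
named like another package's ID could not shadow it.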

So here are my suggested options:

Option A - Use new URLs that include an API version number. Users
accessing this new version of the API get back package IDs. e.g.
/api/rest/2.0/package returns ['0d9ea8d59be5', '44758e5a0f9c', ...]
We could implement API versioning as suggested in the first answer
here: http://stackoverflow.com/questions/389169/best-practices-for-api-versioning
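
A rough sketch of how Option A might behave, assuming the version sits
between /api/rest/ and the resource name as in the example URL above.
The route pattern and the list_packages function are hypothetical, and
unversioned URLs are assumed to keep returning names.

```python
import re

# Hypothetical URL layout: /api/rest/<version>/package, version optional.
_ROUTE = re.compile(r"^/api/rest/(?:(?P<version>\d+(?:\.\d+)?)/)?package/?$")


def list_packages(path, packages):
    """Return package IDs for API version 2.0, names otherwise."""
    match = _ROUTE.match(path)
    if match is None:
        raise ValueError("unrecognised path: %s" % path)
    # Unversioned or older URLs keep the current behaviour (names).
    field = "id" if match.group("version") == "2.0" else "name"
    return [pkg[field] for pkg in packages]
```

Existing clients that call /api/rest/package would see no change under
this scheme; only clients that opt in to the 2.0 URL get IDs.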

Option B - The user specifies an option for the return format if they
want IDs instead of package names. This could be a URL parameter or an
HTTP header option, although this is not particularly RESTful.
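
As a sketch of the URL parameter variant of Option B: the parameter
name 'return' and its value 'id' are invented for this example, and
names are assumed to stay the default.

```python
from urllib.parse import parse_qs, urlsplit


def list_packages(url, packages):
    """List packages, honouring a hypothetical ?return=id parameter."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; take the first one.
    wanted = query.get("return", ["name"])[0]
    field = "id" if wanted == "id" else "name"
    return [pkg[field] for pkg in packages]
```

Any unrecognised value falls back to names here, so existing callers
are unaffected.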

Option C - Break backwards compatibility and just return IDs. We are
still sort of in beta and may not have many API users.

Do let us know if you think I'm on the right track or not.


