[okfn-dev] Thinking more about datapkg
Matthew Brett
matthew.brett at gmail.com
Wed Jan 12 18:28:46 UTC 2011
Hi,
Rufus and I sat down for a while over the new year to think about
datapkg design.
This email is not really a summary of that discussion, but thoughts
that came to me after the discussion. I think we are hoping for
feedback.
One thing we discussed was the idea of the set of metadata about the
package as a 'catalog entry'.
I was playing with the idea of the catalog entry.
Maybe a data package can be any collection of bytes, for which there
are only two necessary criteria: we know how to get the bytes, and we
know how to get the name.
Start with an example.
I've got some files in an archive named
mydata-0.3.tar.gz
I know how to get the bytes (because it's a tar.gz file). The 'name'
is 'mydata-0.3'. In this case, the catalog entry can be compiled by
guessing:
name = mydata-0.3
format = tar.gz
Implied are:
revision =
version =
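The guessing step above could be sketched roughly as follows. This is only an illustration, not datapkg code: the function name guess_catalog_entry and the KNOWN_FORMATS list are made up for the example, and it guesses only the name and format fields, leaving revision and version alone.

```python
import os

# Hypothetical list of archive suffixes this sketch recognises.
KNOWN_FORMATS = ("tar.gz", "tar.bz2", "tgz", "zip", "tar")


def guess_catalog_entry(filename):
    """Guess a minimal catalog entry (name and format) from an archive filename."""
    base = os.path.basename(filename)
    for fmt in KNOWN_FORMATS:
        suffix = "." + fmt
        if base.endswith(suffix):
            # Everything before the archive suffix is taken as the name.
            return {"name": base[: -len(suffix)], "format": fmt}
    raise ValueError("unrecognised archive format: %r" % base)


print(guess_catalog_entry("mydata-0.3.tar.gz"))
# {'name': 'mydata-0.3', 'format': 'tar.gz'}
```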
To publish 'mydata-0.3.tar.gz', I can make this trivial catalog entry,
or ask datapkg to make it, and then just add where to get the data:
name = mydata-0.3
format = tar.gz
url = http://www.mydomain.org/files/mydata-0.3.tar.gz
Now I just have to put this catalog entry somewhere (ckan, etc).
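As a rough sketch of how small the publishing step is: take the guessed fields, add a url, and write the result out as 'key = value' lines. The helper name render_catalog_entry is hypothetical, and the flat 'key = value' layout is just the one used in this email, not a settled format.

```python
def render_catalog_entry(entry):
    """Serialise an entry as 'key = value' lines, in insertion order."""
    return "\n".join("%s = %s" % (key, value) for key, value in entry.items())


# Fields guessed from the filename, as in the example above.
guessed = {"name": "mydata-0.3", "format": "tar.gz"}

# Publishing only adds the location of the bytes.
published = dict(guessed, url="http://www.mydomain.org/files/mydata-0.3.tar.gz")

print(render_catalog_entry(published))
```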
To install the data, I can obviously ask datapkg to do it:
datapkg install mydata
sort of thing.
Or I can do this:
wget http://www.mydomain.org/files/mydata-0.3.tar.gz
tar zxvf mydata-0.3.tar.gz
cat >> .datapkg/installed.catalogue << EOF
[mydata-0.3]
format = local
path = /path/where/unpacked
EOF
kind of thing.
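The manual registration step could equally be done from Python. The sketch below assumes the installed catalogue is an INI-style file with one section per package, as in the shell example; the function name register_local_install is invented for illustration. Unlike 'cat >>', it rewrites the whole file and replaces an existing section of the same name rather than appending a duplicate.

```python
import configparser
import os


def register_local_install(catalogue_path, name, unpacked_path):
    """Record an unpacked package in an INI-style installed catalogue."""
    # interpolation=None so a '%' in a path is stored literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(catalogue_path)  # a missing file is silently skipped
    parser[name] = {"format": "local", "path": unpacked_path}
    # Create the containing directory (e.g. .datapkg) if needed.
    os.makedirs(os.path.dirname(catalogue_path) or ".", exist_ok=True)
    with open(catalogue_path, "w") as handle:
        parser.write(handle)
```

A call such as register_local_install(".datapkg/installed.catalogue", "mydata-0.3", "/path/where/unpacked") would then play the role of the heredoc above.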
That means there need be nothing specific about an archive that makes
it a data package. Of course, I can also make the catalog entry part
of the archive, perhaps using (as now) a standard name such as
catalog.json.
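If the catalog entry does travel inside the archive, reading it back might look something like this. It is a sketch under assumptions: the file name catalog.json is only the 'or something' suggested above, the entry is assumed to be a JSON object, and read_embedded_catalog is a hypothetical helper that takes the first matching file at any depth in the archive.

```python
import json
import tarfile

CATALOG_NAME = "catalog.json"  # assumed standard name, not a fixed convention


def read_embedded_catalog(archive_path):
    """Return the parsed catalog entry stored in a tar archive, or None."""
    # "r:*" lets tarfile detect the compression (gz, bz2, none) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            if member.isfile() and member.name.split("/")[-1] == CATALOG_NAME:
                with archive.extractfile(member) as handle:
                    return json.load(handle)
    # No embedded entry: the archive is still usable with an external one.
    return None
```

Returning None rather than raising keeps the point made above: an archive without an embedded entry is still a perfectly good data package, as long as a catalog entry exists somewhere.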
Anyway, sorry, these thoughts are still not entirely formed, but I
wanted to put them down before they faded.
See you,
Matthew