[okfn-dev] [Nipy-devel] data package source
satra at mit.edu
Wed Dec 8 19:17:35 UTC 2010
hi matthew and others,
Regression testing might require large real data-sets and we should be
careful about the size of the data-set that this package will provide.
one option is to consider the use of xnat as an alternative for storing the
data (on central.xnat.org) and the data-pkg will simply use pyxnat to
query/retrieve relevant data to the local machine. (i'm cc:ing yannick).
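
A minimal sketch of the fetch-on-demand idea, assuming the data package only needs a local cache directory and some callable that downloads one named file. The names `ensure_local`, `fetch` and `cache_dir` are hypothetical; in practice `fetch` might wrap a pyxnat query, but no pyxnat calls are shown here.

```python
import os


def ensure_local(name, fetch, cache_dir):
    """Return a local path for `name`, downloading it only if it is absent.

    `fetch(name, destination)` is a hypothetical callable that writes the
    remote file to `destination`; it stands in for whatever client is used.
    """
    target = os.path.join(cache_dir, name)
    if not os.path.exists(target):
        os.makedirs(cache_dir, exist_ok=True)
        partial = target + ".part"
        fetch(name, partial)
        # Rename only after the download finished, so an interrupted
        # transfer never looks like a complete cached file.
        os.replace(partial, target)
    return target
```

With this shape, the package itself stays small and a large data-set is transferred at most once per machine.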
please also consider the connectome file-format as a way of describing
On Tue, Dec 7, 2010 at 10:26 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi Rufus and all,
> We (nipy folks, and neurodebian folks, and maybe others) have been
> thinking a little bit about what we wanted from a data package
> First - an apology. I have tried to explore datapkg, but rather
> What we've done, in the main, is to try and think out what we mean by
> stuff, and
> what we want, and we're slowly then coming back to what y'all have done.
> In our first implementation of data packages, before we knew about datapkg,
> we did something extremely simple (but nevertheless not very good). If you're
> interested, the implementation is in nibabel. After we'd
> used that for a while, it became obvious that it was too clumsy and
> difficult to understand, for the simple case where you want to unpack files
> somewhere and point the code at the files.
> Now we're thinking what we really want. The result of various discussions
> ended up in the attached document ``data_pkg_discuss.rst``. As the name
> suggests, it's trying to clarify various ideas we had about what is what.
> Now onto something real, usecases...
> We have - for example - a smallish package for reading image data -
> nibabel. We
> want to be able to use optional data packages from within nibabel. In
> particular, we wanted packages of test data of images in various formats,
> that are too large to include in the code repository. Here are some things we
> wanted:
> * No dependency for nibabel on the data packaging code. That is, we wanted
> to be able to *use* installed data packages without having to install - say -
> ``datapkg``. This is obviously not essential, but desirable. We're less
> concerned about having to depend on - say - ``datapkg`` for installing the
> data, or modifying the data packages. Having said that, it would surely
> help adoption of a standard packaging system if it was easy to implement a
> packaging protocol outside of the canonical implementation in - say -
> * Support for data package versions. We expect to have several versions of
> nibabel out in the wild, and maybe several versions of nibabel on a single
> machine. The versions of nibabel may well need different versions of the
> data packages to run their tests. Even if there is just one version on the
> computer, it might be an older version that wants a version of the data
> package that is older than the current version. Thus we want to be able to
> ask for different versions of a data package, and to be able to have several
> versions of a package installed at any one time.
> * Support for user and system installs of data. As for python package
> installs, we expect some of our packages to be installed system-wide and
> available to all users, and others to be installed just for a single user.
> We want to be able to install data with the same distinction, so that
> system-wide packages see system-wide data. It should be possible for an
> individual piece of code to find an individual data package, whether it is
> installed system-wide, or only for the user.
> * Not of urgent importance for us, but it would be good to be able to
> sign the packages with a trusted key, as for Debian packages.
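
The first three requirements above (no dependency on the packaging tool, several versions side by side, user and system installs) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the design in the attached documents: it assumes each install is a directory named `<name>-<version>` under a user or system data directory, and `find_data_package` and `parse_version` are hypothetical names.

```python
import os
import re


def parse_version(text):
    """Turn '0.10.2' into (0, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_data_package(name, search_paths, min_version=None, max_version=None):
    """Return (version, path) of the newest matching install, or None.

    Assumes a hypothetical layout in which every install is a directory
    called '<name>-<version>' directly under one of `search_paths`.
    Earlier paths win ties, so list the user directory before the system one.
    """
    pattern = re.compile(re.escape(name) + r"-(\d+(?:\.\d+)*)$")
    low = parse_version(min_version) if min_version is not None else None
    high = parse_version(max_version) if max_version is not None else None
    best = None
    for root in search_paths:
        if not os.path.isdir(root):
            continue  # a missing user or system directory is not an error
        for entry in sorted(os.listdir(root)):
            match = pattern.match(entry)
            path = os.path.join(root, entry)
            if not match or not os.path.isdir(path):
                continue
            version = parse_version(match.group(1))
            if low is not None and version < low:
                continue
            if high is not None and version > high:
                continue
            if best is None or version > best[0]:
                best = (version, path)
    return best
```

An older nibabel could then ask for, say, `max_version="0.9"` and still find its data even when a newer data package is installed next to it.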
> For these various reasons we tried to spec out what we thought we would
> need in the attached ``data_pkg_uses.rst``. I've also attached a script
> referenced in that page, ``register_me.py`` - as ``register_me.txt``.
> Given my relative ignorance of ``datapkg``, I'll try to say the differences
> I see from the current ``datapkg``:
> * I can't see support for data package versioning in ``datapkg`` - but I
> may have missed it.
> * As far as I can see, there isn't a separation of system and user
> installs, in that there seems to be a (by default) sqlite 'repository'
> (right term?) that knows about the packages a user has installed, but I
> could not find an obvious canonical way to pool system and user
> installation information. Is that right?
> * Because the default repository is sqlite, anyone trying to read the
> installations that ``datapkg`` did will need sqlite or something similar.
> They'll likely have this if they are using a default python installation, but
> not necessarily if they are using another language or a custom python
> Are these right? Do our usecases make sense to y'all?
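
To make the last two points concrete, here is a rough sketch of one alternative: installation records kept in plain-text INI files, one system-wide and one per user, pooled at read time. This is not how ``datapkg`` stores its index; the file layout, the `name-version` section names and the function `load_registry` are all assumptions made for illustration.

```python
import configparser


def load_registry(index_files):
    """Pool package records from several INI index files into one dict.

    Files are read in the order given, and a later file overrides an
    earlier one for the same section, so pass the system index first and
    the user index last. Files that do not exist are skipped silently.
    """
    # interpolation=None keeps '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(index_files)
    registry = {}
    for section in parser.sections():
        # Each section is assumed to be named '<name>-<version>'.
        registry[section] = dict(parser[section])
    return registry
```

A format like this can be parsed from other languages without a database library, which is the concern raised about the sqlite default.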
> We'd love to work together on stuff if that makes sense to you too...
> See you,
> Matthew (for various of us).
>  http://nipy.org
>  http://neuro.debian.net
>  http://nipy.org/nibabel/devel/data_pkg_design.html
> Nipy-devel mailing list
> Nipy-devel at neuroimaging.scipy.org