[okfn-discuss] datapkg - haltering steps

Matthew Brett matthew.brett at gmail.com
Sun Oct 10 10:52:33 UTC 2010


First - please forgive me - it's late here and I'm not functioning at
my open-source best.

We (nipy.org) are just going over how to deal with data packaging.
Our first and very simple draft was here:


as you can see this is very crude, especially compared to datapkg.

I'm afraid I haven't yet looked at datapkg in detail, for which please
forgive me, but I had a few preliminary questions:

1) I think our main use case is being able to do something like this in our code:

my_package_path = None
try:
    import some_data_pkg_manager as excelsior
except ImportError:
    hint = 'You need "some_data_pkg_manager", see http://a.helpful.url'
else:
    version, pth = excelsior.have_local_pkg('my_package', version=0.3)
    if version >= 0.3: # well, you get what I mean
        my_package_path = pth
    else:
        hint = excelsior.installation_hint('my_package', version=0.3)
if my_package_path is None:
    print(hint)
else:
    pass # Do something with the data

I hope you see what I mean.  The main point is, we want to be able to
query the local installations, whether system-wide or in the user
space, to find where the data is, rather than automatically trying to
pull the data down.   This is partly because I work in Cuba, where
bandwidth is terrible, and partly because it seems like it would work
better with standalone installations.

I'm sure you've covered that - I just couldn't see it at a first glance.
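
To make the local-query idea concrete, here is a rough sketch of what
something like have_local_pkg might do using only the standard library.
Everything in it is an assumption for illustration: the search
directories, the one-directory-per-package layout, and the version.txt
file are made up, and none of this is datapkg's actual API.  Versions
are passed as strings rather than floats so that '0.10' sorts after
'0.9'.

```python
from pathlib import Path


def default_data_dirs():
    """Hypothetical search path: user space first, then system-wide."""
    return [Path.home() / ".my_data", Path("/usr/share/my_data")]


def parse_version(text):
    """Turn '0.3.1' into (0, 3, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def have_local_pkg(name, min_version, search_dirs=None):
    """Look for an installed copy of `name` without using the network.

    Assumed layout (not a real convention): <dir>/<name>/version.txt
    holding a dotted version string.  Returns (version_tuple, path) for
    the first copy that is at least `min_version`, else (None, None).
    """
    if search_dirs is None:
        search_dirs = default_data_dirs()
    wanted = parse_version(min_version)
    for base in search_dirs:
        pkg_dir = Path(base) / name
        version_file = pkg_dir / "version.txt"
        if not version_file.is_file():
            continue  # not installed in this location
        found = parse_version(version_file.read_text())
        if found >= wanted:
            return found, pkg_dir
    return None, None
```

With this, have_local_pkg('my_package', '0.3') either returns the
version and directory of a usable local copy or (None, None), in which
case the calling code can print an installation hint instead of
downloading anything.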

2) The second thing was - on my (yes, I'm sorry) Mac, an attempt to do
'python setup.py develop' in the repository leads to a nasty set of
error messages from setuptools, where it appears to be cycling over
the Paste installation.  It was complicated enough that it wasn't
clear to me which installation target was causing the problems -
certainly it seemed to occur with 'urlgrabber' - but I thought I'd let
you know.

3) Related to same - one problem that we were trying to avoid with our
crude setup was needing the data package installed in order to query
the data.  That is, we were hoping to have minimal run-time
dependencies.  datapkg has rather heavy dependencies - do you think
there's any chance of a lightweight local query version, when not all
the dependencies are met?    We (as a group) have had some bad
experiences with setuptools in the past.

4) Just lastly - I wonder if y'all have had time to look at bento:


It is our favorite hacker's attempt at a way out of distutils /
setuptools hell - and he's very responsive to questions and
suggestions.   You can incorporate an entire distribution of bento as
one file in your project to do the build for you.   This was 300K at
last count (David C, the author, made a bento file for nipy).  This is
moderately annoying for code, but nothing for a data package I would
have thought.   The idea (half formed) that I had would be that each
data package would be responsible for its own installation - in our
case through a setup.py file - but maybe instead through the bento
build process.

Sorry if this is off point or poorly thought out - I should point out
that it's currently near 4 in the morning.

See y'all,

