[ckan-dev] Hierarchical folder structure for a dataset

Ted Strauss ted.strauss at gmail.com
Thu Aug 10 20:42:24 UTC 2017


Thanks for sharing this example Damian, and very interesting to see the
variety of approaches taken to this problem.
I also have the feature described by Prashant on my wishlist for the CKAN
instance I'm working on at McGill University.
I have a question for Damian: what method is used to encode parent-child
relations among packages?
Is it the relations extension or something custom?
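
To make the question concrete, here is a rough sketch of how I imagine it
could work with CKAN's built-in package relationships, using the
package_relationship_create action with type "child_of". This is only my
guess, not a description of the Bristol setup: the slug() naming scheme,
the example folder names, and the helper functions are all hypothetical.

```python
import json
import posixpath
import urllib.request


def slug(path):
    # Hypothetical naming scheme: turn a folder path into a package name.
    return path.strip("/").replace("/", "-").lower()


def folder_relationships(folder_paths, name_for=slug):
    """Build one 'child_of' payload for each folder whose parent is also listed."""
    paths = {p.strip("/") for p in folder_paths}
    payloads = []
    for path in sorted(paths):
        parent = posixpath.dirname(path)
        if parent in paths:
            payloads.append({
                "subject": name_for(path),   # the child package
                "object": name_for(parent),  # the parent package
                "type": "child_of",
            })
    return payloads


def post_relationship(ckan_url, api_key, payload):
    """POST one payload to the action API (both packages must already exist)."""
    request = urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/package_relationship_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling folder_relationships(["run1", "run1/raw"]) would give a single
payload linking run1-raw to run1, which could then be passed to
post_relationship() along with the site URL and an API key.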

Cheers
Ted Strauss

On Thu, Aug 10, 2017 at 2:24 PM, Damian Steer <D.Steer at bristol.ac.uk> wrote:

>
> > On 9 Aug 2017, at 06:29, Prashant Gupta <p.gupta at auckland.ac.nz> wrote:
> >
> > Hi,
> >
> > I am using CKAN to serve as an instrument (e.g. mass spec) data service,
> > where we may ingest instrument data directly into CKAN – so we have a
> > copy of raw data to be stored and shared, and later to be published and
> > archived. The problem I am facing is the way CKAN stores its datasets
> > and resources. For instrument data, it is vital to retain the folder
> > structure and to keep the resources (data, metadata and config files) in
> > the correct folders. Otherwise the analysis software would have issues
> > analysing it.
> >
> > Is there a way for CKAN to store datasets and resources so that, when
> > they are downloaded, the folder structure is retained and the resources
> > are in their correct folders?
>
> Hi Prashant,
>
> Very familiar story :-)
>
> At the University of Bristol we had the same issue. We ended up using
> package relationships (parent / child)  to represent folder structures. You
> can see a fairly extreme example at [1].
>
> (We only use CKAN as a catalogue - the data is held externally - and we
> have a tool that generates the packages using the CKAN web API.)
>
> Making it work required a fair amount of customisation:
>
> * Tag top level packages as ‘level=top’ [2] so browsing works over the top
> levels rather than showing all subfolders.
> * Generate and cache the tree you see in [1]. It can be expensive to
> generate.
>
> Archiving is an option (and we do zip as well - see the ‘Complete
> download’ link), however it does obfuscate the dataset. With individual
> packages, for example, you can search for ‘bedes’ [3] and find images of
> postcards. The unzipped layout also lets the user grab just the bits they
> need.
>
> On the other hand we do recommend zipping (or probably 7zip in future) in
> cases where the individual files and directories don’t really make sense
> except as a whole. For example [4] contains a large number of images that
> represent slices through a sample. Individually they are very dull.
>
> Hope this helps,
>
> Damian Steer
>
> [1] <https://data.bris.ac.uk/data/dataset/upjtf9os1dzr154phmgvrupib>
> [2] <https://data.bris.ac.uk/data/dataset?level=top>
> [3] <https://data.bris.ac.uk/data/dataset?q=bedes>
> [4] <https://data.bris.ac.uk/data/dataset/37q0cntawxcq1rkktq3e9mr1p>
>
> ...
>
> --
> Damian Steer
> Senior Technical Researcher
> Research IT
> +44 (0) 117 39 41724
>
>
> _______________________________________________
> ckan-dev mailing list
> ckan-dev at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/ckan-dev
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>