[ckan-dev] Hierarchical folder structure for a dataset

Harald von Waldow harald.vonwaldow at eawag.ch
Fri Aug 11 08:04:15 UTC 2017


Hi Damian

That looks pretty interesting. I would also be interested to see how
you designed the user interface to let people upload hierarchically
structured datasets. Is the code for that publicly available?

Cheers,
Harald

On 2017-08-10 20:24, Damian Steer wrote:
> 
>> On 9 Aug 2017, at 06:29, Prashant Gupta <p.gupta at auckland.ac.nz> wrote:
>>
>> Hi,
>>
>> I am using CKAN as an instrument (e.g. mass spec) data service, where we may ingest instrument data directly into CKAN – so that we have a copy of the raw data to be stored and shared, and later published and archived. The problem I am facing is the way CKAN stores its datasets and resources. For instrument data, it is vital to retain the folder structure and to keep the resources (data, metadata and config files) in the correct folders. Otherwise the analysis software has trouble analysing the data.
>>
>> Is there a way for CKAN to store datasets and resources so that, when they are downloaded, the folder structure is retained and the resources are in their correct folders?
> 
> Hi Prashant,
> 
> Very familiar story :-)
> 
> At the University of Bristol we had the same issue. We ended up using package relationships (parent / child) to represent folder structures. You can see a fairly extreme example at [1].
> 
> (We only use CKAN as a catalogue - the data is held externally - and we have a tool that generates the packages using the CKAN web API.)
> 
> Making it work required a fair amount of customisation:
> 
> * Tag top level packages as ‘level=top’ [2] so browsing works over the top levels rather than showing all subfolders.
> * Generate and cache the tree you see in [1]. It can be expensive to generate.
> 
> Archiving is an option (and we do zip as well - see the ‘Complete download’ link); however, it does obfuscate the dataset. Keeping the files separate means that, for example, you can search for ‘bedes’ [3] and find images of postcards. It also lets the user grab just the bits they need.
> 
> On the other hand we do recommend zipping (or probably 7zip in future) in cases where the individual files and directories don’t really make sense except as a whole. For example [4] contains a large number of images that represent slices through a sample. Individually they are very dull.
> 
> Hope this helps,
> 
> Damian Steer
> 
> [1] <https://data.bris.ac.uk/data/dataset/upjtf9os1dzr154phmgvrupib>
> [2] <https://data.bris.ac.uk/data/dataset?level=top>
> [3] <https://data.bris.ac.uk/data/dataset?q=bedes>
> [4] <https://data.bris.ac.uk/data/dataset/37q0cntawxcq1rkktq3e9mr1p>
> 
> ...
> 
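
For anyone wanting to try the parent / child approach described above, here
is a minimal sketch of how a folder tree might be turned into payloads for
the CKAN action API (package_create and package_relationship_create). It is
not the Bristol tool. The slug() naming scheme, the plan_packages() helper
and the use of a 'level' extra to mark top-level packages are all
assumptions made for illustration, and no request is actually sent.

```python
from pathlib import PurePosixPath


def slug(path):
    """Turn a folder path into a package name (hypothetical naming scheme)."""
    return "-".join(part.lower() for part in path.parts)


def plan_packages(folders):
    """Return (packages, relationships) payloads for a list of folder paths.

    Every folder, including implied ancestor folders, becomes one package.
    Each non-root folder gets a 'child_of' relationship to its parent.
    """
    paths = set()
    for folder in folders:
        path = PurePosixPath(folder)
        paths.add(path)
        # Add ancestors, skipping the empty '.' path at the top.
        paths.update(p for p in path.parents if p.parts)

    packages, relationships = [], []
    for path in sorted(paths):
        package = {"name": slug(path), "title": path.name}
        if len(path.parts) == 1:
            # Assumption: top-level folders are marked with an extra so that
            # browsing can be restricted to them.
            package["extras"] = [{"key": "level", "value": "top"}]
        else:
            # Payload shape for the package_relationship_create action.
            relationships.append({
                "subject": slug(path),
                "object": slug(path.parent),
                "type": "child_of",
            })
        packages.append(package)
    return packages, relationships


if __name__ == "__main__":
    pkgs, rels = plan_packages(["run1/data/raw", "run1/config"])
    for rel in rels:
        print(rel["subject"], rel["type"], rel["object"])
```

Each dictionary in the first list would be posted to package_create and each
in the second to package_relationship_create. Real package names would need
stricter sanitising than slug() does here.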

-- 
Harald von Waldow
Eawag
ICT Services
Ueberlandstrasse 133
8600 Duebendorf
http://www.eawag.ch
