[ckan-discuss] thedatahub.org - should we be harvesting other catalogues/directories?

Tim McNamara tim.mcnamara at okfn.org
Tue Oct 4 12:04:14 BST 2011


What do people think about harvesting other catalogues? Is that the
mission of thedatahub.org, to become an index of everything?

We could probably get the number of available datasets into the
hundreds of thousands if we wanted to. For example, here are 6011
datasets on New Zealand's vegetation that I've just extracted:
https://scraperwiki.com/scrapers/nation_vegetation_survey_metadata/

A fairly good strategy might be to extract:

 - title
 - description
 - listing URL for original dataset
 - URLs of any resources
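
As a rough sketch of what such a minimal record might look like, here
is one way to hold the four fields above and turn them into a
package-style dict. The class and function names are made up for
illustration, and the dict keys ("name", "title", "notes", "url",
"resources") are my assumption about how the fields would map onto a
CKAN package, not something checked against a running instance.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HarvestedRecord:
    """The four fields proposed above (hypothetical container)."""
    title: str
    description: str
    listing_url: str  # page for the dataset in the original catalogue
    resource_urls: List[str] = field(default_factory=list)


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:100]


def to_package_dict(record: HarvestedRecord) -> Dict[str, object]:
    """Map a record onto a package-style dict (key names are assumed)."""
    return {
        "name": slugify(record.title),
        "title": record.title,
        "notes": record.description,
        "url": record.listing_url,
        "resources": [{"url": u} for u in record.resource_urls],
    }


# Placeholder values only, not a real dataset.
example = HarvestedRecord(
    title="Example Vegetation Survey (Plot 12)",
    description="Placeholder description for illustration.",
    listing_url="http://example.org/datasets/12",
    resource_urls=["http://example.org/datasets/12/data.csv"],
)
package = to_package_dict(example)
```

Anything beyond these fields would stay with the original catalogue,
and the listing URL points people back to it.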

If we limit ourselves to extracting only a small amount of specialist
metadata, then we won't be seen as attempting to duplicate the role of
other repositories.