[ckan-discuss] thedatahub.org - should we be harvesting other catalogues/directories?

David Read david.read at okfn.org
Tue Oct 4 13:16:16 BST 2011


On 4 October 2011 12:04, Tim McNamara <tim.mcnamara at okfn.org> wrote:

> What do people think about harvesting other catalogues? Is that the
> mission of thedatahub.org, to become an index of everything?
>

I believe the strategy agreed in Edinburgh is to list all the catalogues on
datacatalogs.org and provide a federated search index to all of them. I
guess the end result of what you suggest is roughly the same thing - you
search and it returns a snippet of the metadata record. The snippet is just
enough so you know you've found the right record, but not enough to make the
original catalogue redundant.

What do others, particularly those with CKANs around the world, think about
being brought together in this way?

David


> We could probably get the available datasets into the hundreds of
> thousands if we wanted to... For example, here are 6011 datasets
> regarding New Zealand's vegetation I've just extracted:
> https://scraperwiki.com/scrapers/nation_vegetation_survey_metadata/
>
> A fairly good strategy might be to extract:
>
>  - title
>  - description
>  - listing URL for original dataset
>  - URLs of any resources
>
> If we keep to extracting only a fairly small degree of specialist
> metadata, then we won't be seen as attempting to duplicate the role of
> other repositories.
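
As a rough illustration of the extraction strategy quoted above, here is a
minimal Python sketch that reduces a full source record to just the four
listed fields (title, description, listing URL, resource URLs) and drops
everything else. The names HarvestedSnippet and to_snippet, and the input
keys "title", "description", "url" and "resources", are assumptions made
for the example; they are not taken from CKAN or from any particular
catalogue's schema, and a real harvester would need a mapping per source.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class HarvestedSnippet:
    """Minimal record kept by the index; all other metadata stays at the source."""
    title: str
    description: str
    listing_url: str  # page for the dataset on the original catalogue
    resource_urls: List[str] = field(default_factory=list)


def to_snippet(record: Dict[str, Any]) -> HarvestedSnippet:
    """Reduce a full source record to the four fields in the list above.

    The input keys used here are hypothetical; each catalogue would need
    its own mapping onto these four fields.
    """
    resources = record.get("resources") or []
    return HarvestedSnippet(
        title=str(record.get("title", "")).strip(),
        description=str(record.get("description", "")).strip(),
        listing_url=str(record.get("url", "")),
        # Keep only resources that actually carry a URL.
        resource_urls=[r["url"] for r in resources if r.get("url")],
    )


# Example input with one piece of specialist metadata that is not carried over.
full_record = {
    "title": " Example vegetation plot survey ",
    "description": "Plot-level vegetation measurements.",
    "url": "http://example.org/datasets/42",
    "resources": [
        {"url": "http://example.org/datasets/42/data.csv"},
        {"format": "PDF"},  # no URL, so it is skipped
    ],
    "projection": "NZTM",  # specialist metadata, dropped by to_snippet
}
snippet = to_snippet(full_record)
```

Because the snippet type has no slot for specialist fields, anyone who
needs them has to follow listing_url back to the original catalogue,
which is the point of keeping the extracted metadata small.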