[ckan-discuss] Harvesting Dublin Core documents

John Bywater john.bywater at appropriatesoftware.net
Wed Nov 24 15:45:35 GMT 2010


Dear all,

CKAN has recently been developed to harvest GEMINI documents from CSW 
servers (and other places). That was simply a requirement for the UK 
Location Information Infrastructure.

Although we had (and still have!) XSLT code to transform GEMINI 
documents to Dublin Core, when I was implementing the GEMINI harvesting 
I was given a large set of XPaths that let me pick CKAN Package 
attribute values directly out of a GEMINI document. The XPaths were the 
basis of an extended harvesting discussion with the client, so I decided 
to keep things simple and not use the XSLT.

As a result, CKAN still isn't able to harvest Dublin Core documents. I'd 
like to fix this. Hopefully, the only missing piece is the set of XPaths 
for picking CKAN Package attribute values out of a Dublin Core document.

Could we try to identify these XPaths? Those for GEMINI are here:
http://ckan.org/browser/ckan/model/harvesting.py#L588
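
To make the question concrete, here is a minimal sketch of the kind of 
mapping I have in mind. It is not the code in harvesting.py: the XPaths, 
the choice of Package attributes (title, notes, author, tags) and the 
function name are all assumptions for discussion, and it assumes simple 
Dublin Core elements in the http://purl.org/dc/elements/1.1/ namespace.

```python
import xml.etree.ElementTree as ET

# Namespace for the simple Dublin Core element set.
DC_NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Hypothetical mapping from CKAN Package attribute names to Dublin Core
# XPaths. These are guesses put up for discussion, not an agreed mapping.
DC_XPATHS = {
    "title": ".//dc:title",
    "notes": ".//dc:description",
    "author": ".//dc:creator",
    "tags": ".//dc:subject",
}

# Attributes assumed to take a list of values rather than a single value.
MULTI_VALUED = {"tags"}


def extract_package_values(xml_text):
    """Pick Package attribute values out of a Dublin Core document."""
    root = ET.fromstring(xml_text)
    values = {}
    for attr, xpath in DC_XPATHS.items():
        # Collect the non-empty text of every matching element.
        found = [
            el.text.strip()
            for el in root.findall(xpath, DC_NS)
            if el.text and el.text.strip()
        ]
        if attr in MULTI_VALUED:
            values[attr] = found
        else:
            # Single-valued: take the first match, or None if absent.
            values[attr] = found[0] if found else None
    return values
```

Calling extract_package_values() on the text of a harvested document 
would return a dict keyed by Package attribute name, so agreeing on the 
mapping should mostly be a matter of editing the DC_XPATHS table.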

Best wishes,

John.



