[ckan-dev] Import complex, nested metadata schema with attributes ...

Ryan Clark ryan.clark at azgs.az.gov
Mon Feb 18 16:40:09 UTC 2013


I'm in a similar situation: a complex XML metadata schema (ISO19139) that needs to be stored in CKAN "packages". I'm starting from the other direction, though. That is, I'm working on exporting ISO19139 from CKAN packages.

I found that the "extras" concept wasn't a good fit for the kind of relational data that I wanted to attach to the CKAN package. Instead, I built a CKAN plugin that creates additional tables in the database (through a paster command) and links those tables to CKAN's package and resource tables through foreign key relationships. Then I can grab the bundle of related database objects and pick them apart to export an XML doc.
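
To illustrate the layout described above, here is a minimal sketch. It is not the plugin's actual code: it uses an in-memory SQLite database as a stand-in for CKAN's database, and the "package" table, the "responsible_party" table, their columns, and the output element names are all hypothetical names chosen for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_db():
    """Create a stand-in 'package' table plus one extra related table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Stand-in for CKAN's package table (hypothetical, heavily simplified).
    conn.execute("CREATE TABLE package (id TEXT PRIMARY KEY, title TEXT)")
    # Additional table tied to the package through a foreign key.
    conn.execute(
        "CREATE TABLE responsible_party ("
        " id INTEGER PRIMARY KEY,"
        " package_id TEXT NOT NULL REFERENCES package(id),"
        " name TEXT NOT NULL,"
        " role TEXT NOT NULL)"
    )
    return conn


def export_xml(conn, package_id):
    """Collect the package and its related rows, then build an XML doc."""
    row = conn.execute(
        "SELECT title FROM package WHERE id = ?", (package_id,)
    ).fetchone()
    if row is None:
        raise KeyError(package_id)
    root = ET.Element("metadata", {"id": package_id})
    ET.SubElement(root, "title").text = row[0]
    parties = conn.execute(
        "SELECT name, role FROM responsible_party"
        " WHERE package_id = ? ORDER BY id",
        (package_id,),
    )
    for name, role in parties:
        party = ET.SubElement(root, "responsibleParty", {"role": role})
        party.text = name
    return ET.tostring(root, encoding="unicode")


conn = build_db()
conn.execute("INSERT INTO package VALUES (?, ?)", ("pkg-1", "Example dataset"))
conn.execute(
    "INSERT INTO responsible_party (package_id, name, role) VALUES (?, ?, ?)",
    ("pkg-1", "Jane Doe", "author"),
)
xml_doc = export_xml(conn, "pkg-1")
print(xml_doc)
```

A real ISO19139 export would need the proper gmd/gco namespaces and a much richer set of related tables; the point here is only the foreign key link and the "gather related rows, then serialize" step.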

In the near future, I'll be tackling it in the opposite direction: get the XML, convert it to CKAN package plus additional related tables.
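
As a rough sketch of what the first step of that direction might look like (an assumption for illustration, not existing code), Python's standard xml.etree.ElementTree can pull a single value out of an ISO19139 record by namespace. The sample record is a trimmed version of the fragment quoted below, and the function name and the shape of the returned dict are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the quoted ISO19139 fragment.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Trimmed sample record; a real document carries many more elements.
SAMPLE = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>de.dkrz.mpim.iso20209</gco:CharacterString>
  </gmd:fileIdentifier>
</gmd:MD_Metadata>"""


def iso_to_fields(xml_text):
    """Extract a few named fields from an ISO19139 record (hypothetical shape)."""
    root = ET.fromstring(xml_text)
    # findtext returns None when the path does not match anything.
    file_id = root.findtext(
        "gmd:fileIdentifier/gco:CharacterString", namespaces=NS
    )
    return {"file_identifier": file_id.strip() if file_id else None}


fields = iso_to_fields(SAMPLE)
print(fields)
```

Picking out named paths this way gives short, searchable keys instead of the namespace-qualified nested keys a generic xml2json conversion produces.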

I looked at the ckanext-spatial extension for inspiration about how to develop the additional tables and the paster command. If this sounds like a good solution to you, I would take a look there.
____________________

Ryan Clark
ryan.clark at azgs.az.gov
(520) 302-4871






On Feb 18, 2013, at 9:03 AM, Heinrich Widmann <widmann at dkrz.de> wrote:

> Hi there,
> 
> in our projects we started using CKAN to harvest metadata.
> We first collect metadata from several data providers, mostly in XML format, on disk,
> then convert it to JSON key:value pairs and import it into CKAN
> using the dataset API (i.e. we send HTTP PUT requests ...).
> 
> This works fine for the existing keys "Author", "Maintainer" (and "State") in "Additional Information" (settings).
> We add "new" keys via "extras" : { "newkey" : "value", ... } - I wouldn't be surprised if this is not the appropriate way to add new keys?
> 
> In principle we have complex, nested XML schemas with attributes and dependencies, e.g. something like:
> <metadata xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>    <MD_Metadata xmlns="http://www.isotc211.org/2005/gmd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" .... xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:oai="http://www.openarchives.org/OAI/2.0/" xmlns:iso="http://www.isotc211.org/2005/gmd" xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://www.isotc211.org/2005/gmd/metadataEntity.xsd" id="de.dkrz.mpim.iso20209">
>   <fileIdentifier>
> <gco:CharacterString>de.dkrz.mpim.iso20209</gco:CharacterString>
>   </fileIdentifier>
>    ......................
> 
> If you convert this with a simple xml2json converter to a flat key:value schema, you get something like:
> 
> {"{http://www.openarchives.org/OAI/2.0/}metadata": {"{http://www.isotc211.org/
> 2005/gmd}MD_Metadata": {"{http://www.isotc211.org/2005/gmd}dataQualityInfo": {
> "{http://www.isotc211.org/2005/gmd}DQ_DataQuality": .............
> 
> Which can be imported in CKAN - but not in a structured, "resolved" and searchable key : value form.
> 
> Thanks,
> Heinrich
> 
> 
> -- 
> -----------------------------\\---------------------------------------
> Heinrich Widmann              \\ Deutsches Klimarechenzentrum GmbH
> Phone: +49 40 41173 282        \\   Abteilung Datenmanagement
> FAX:   +49 40 41173 476         \\    Bundesstr. 45a
> Email: widmann at dkrz.de           \\   D-20146 Hamburg
> http://www.dkrz.de                \\  Germany
> -----------------------------------\\---------------------------------
> 
> 
