[ckan-dev] Storing/searching/displaying XML resources

Rufus Pollock rufus.pollock at okfn.org
Mon May 7 23:01:29 UTC 2012


On 7 May 2012 17:17, Haq, Salman <Salman.Haq at neustar.biz> wrote:
> I have a special use case where I want to store XML resources. For each such
> resource, I want to display a custom view that allows the user to add/edit
> the data in the resource. This is very similar to what ckanext-datastorer
> does except in my use case, the resource has a specialized view.
>
> Would the community recommend that I enhance ckanext-datastorer or use it as
> a template for a new custom extension? I am leaning towards the latter.

I think the latter may be easier, though in the medium term we may want
a way to plug specialist importers into ckanext-datastorer, chosen
according to the incoming type of data (and perhaps some other info).

> Also, how does ckanext-datastorer store the parsed data? It doesn't appear
> to have any special models for storing tabular data in the main Postgres
> db. Does it rely primarily on ElasticSearch as the backing store? Does this

Yes, it uses the CKAN DataStore backed by ElasticSearch rather than Postgres.

> mean that I will have to convert my XML documents into JSON documents and
> then store them via the data API?

That would be the natural approach.
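
A minimal sketch of the XML-to-JSON step, in case it helps. The XML
layout, the element and attribute names, and the xml_to_documents
helper are all hypothetical (the real schema is not given in this
thread), and the call that sends the documents to the data API is left
out on purpose:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML layout describing a database; not the poster's real schema.
SAMPLE_XML = """
<database name="sales">
  <table name="orders" rows="1200">
    <column name="id" type="integer" key="primary"/>
    <column name="placed_at" type="date"/>
  </table>
  <table name="customers" rows="300">
    <column name="id" type="integer" key="primary"/>
    <column name="email" type="string"/>
  </table>
</database>
"""


def xml_to_documents(xml_text):
    """Flatten each <table> element into one JSON-serialisable dict."""
    root = ET.fromstring(xml_text.strip())
    documents = []
    for table in root.findall("table"):
        documents.append({
            "database": root.get("name"),
            "table": table.get("name"),
            "row_count": int(table.get("rows", 0)),
            # Each column's XML attributes become a plain dict.
            "columns": [dict(col.attrib) for col in table.findall("column")],
        })
    return documents


docs = xml_to_documents(SAMPLE_XML)
# One JSON document per table, ready to be sent to whatever store is used.
print(json.dumps(docs[0], indent=2))
```

One document per table is only one possible granularity; one document
per column or per database would work the same way.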

> Also, from the docs and source code, I still can't figure out what
> ckanext-archiver does and how it relates to ckanext-datastorer. They both
> seem to share some common code.

Archiver archives resources: i.e. it looks for resources with remote
urls and stores a copy of that data into the FileStore (i.e. it
*archives* it). The DataStorer instead processes the data and puts it
in the DataStore.

> To elaborate more on my use case, the XML document actually represents
> metadata about a database (eg: tables, columns, keys, row counts, etc). One
> way to think of the extension is as a 'metadatastorer'. The resources could
> be in XML format, or in the future, additional formats may be supported for
> different types of stores (eg: NoSQL dbs, etc)

Understood. We've also been thinking quite a bit about how to specify
metadata for datasets. In the simplest case we use the mapping
metadata in ElasticSearch to store info about fields (type, format,
etc.). We're also thinking about using JSON-LD contexts more heavily
for this purpose (see [1]).
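
As a rough illustration of the "mapping metadata" idea, here is a
sketch that builds an ElasticSearch-style mapping body from a list of
column descriptions. The TYPE_MAP table, the build_mapping helper and
the sample columns are assumptions made for this example, not part of
CKAN or ckanext-datastorer. The "string" type matches ElasticSearch
releases of this era; later releases use "text" and "keyword" instead:

```python
import json

# Assumed translation from source column types to ElasticSearch field types.
TYPE_MAP = {
    "integer": "long",
    "float": "double",
    "string": "string",
    "date": "date",
    "boolean": "boolean",
}


def build_mapping(doc_type, columns):
    """Return a mapping body recording each field's type (and format, if any)."""
    properties = {}
    for col in columns:
        # Unknown source types fall back to "string" in this sketch.
        field = {"type": TYPE_MAP.get(col["type"], "string")}
        if "format" in col:
            field["format"] = col["format"]
        properties[col["name"]] = field
    return {doc_type: {"properties": properties}}


# Hypothetical column descriptions for one table.
columns = [
    {"name": "id", "type": "integer"},
    {"name": "placed_at", "type": "date", "format": "yyyy-MM-dd"},
    {"name": "notes", "type": "string"},
]

mapping = build_mapping("orders", columns)
print(json.dumps(mapping, indent=2))
```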

Rufus

[1]: http://lists.okfn.org/pipermail/ckan-discuss/2012-May/002186.html

>
> Thanks,
> Salman
>



-- 
Co-Founder, Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/



