[ckan-dev] Storing/searching/displaying XML resources

Haq, Salman Salman.Haq at neustar.biz
Tue May 8 03:26:11 UTC 2012


Also, one more question:

Why use Solr to power site search and ElasticSearch as a document store
when either could fulfill both purposes?

Thanks,
Salman


On 5/7/12 11:10 PM, "Haq, Salman" <Salman.Haq at neustar.biz> wrote:

>
>
>On 5/7/12 7:01 PM, "Rufus Pollock" <rufus.pollock at okfn.org> wrote:
>
>>On 7 May 2012 17:17, Haq, Salman <Salman.Haq at neustar.biz> wrote:
>>> I have a special use case where I want to store XML resources. For each
>>> such resource, I want to display a custom view that allows the user to
>>> add/edit the data in the resource. This is very similar to what
>>> ckanext-datastorer does except in my use case, the resource has a
>>> specialized view.
>>>
>>> Would the community recommend that I enhance ckanext-datastorer or use
>>> it as a template for a new custom extension? I am leaning towards the
>>> latter.
>>
>>I think the latter may be easier -- though medium-term we may want to
>>find a way where one can plug in specialist importers to the
>>ckanext-datastorer depending on the incoming type of data (and perhaps
>>some other info).
>
>
>Yes, I think a way for plugins (I use that term loosely) to register with
>ckanext-datastorer for specific file types would be a good way to go.
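
A minimal sketch of the registration idea discussed above, assuming a simple
in-process registry keyed by resource format. The names `register_importer`,
`get_importer`, and `import_xml` are hypothetical and are not part of
ckanext-datastorer; this only illustrates how importers for specific file
types might be looked up.

```python
# Hypothetical registry: maps a resource format (e.g. "xml") to an importer
# callable. None of these names come from ckanext-datastorer.
_IMPORTERS = {}


def register_importer(*formats):
    """Decorator that registers a function as the importer for the given formats."""
    def decorator(func):
        for fmt in formats:
            _IMPORTERS[fmt.lower()] = func
        return func
    return decorator


def get_importer(fmt):
    """Return the importer registered for fmt, or None if there is none."""
    return _IMPORTERS.get(fmt.lower())


@register_importer("xml")
def import_xml(raw):
    # Placeholder body: a real importer would parse the payload into records.
    return [{"format": "xml", "size": len(raw)}]


# Lookup is case-insensitive; unknown formats fall through to an empty result.
importer = get_importer("XML")
records = importer("<db/>") if importer else []
```

Keeping the lookup separate from the importers would let a specialist XML
importer live in its own extension while the generic tabular path stays
unchanged.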
>
>
>>
>>> Also, how does ckanext-datastorer store the parsed data? It doesn't
>>> appear to have any special models for storing tabular data in the main
>>> Postgres db. Does it rely primarily on ElasticSearch as the backing
>>> store? Does this
>>
>>Yes, it uses the CKAN DataStore backed by ElasticSearch rather than
>>Postgres.
>
>Just out of curiosity, are there any CKAN extensions that have their own
>data models?
>
>>
>>> mean that I will have to convert my XML documents into JSON documents
>>> and then store them via the data API?
>>
>>That would be the natural approach if it were possible.
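
A rough sketch of the XML-to-JSON conversion step, using only the Python
standard library. The XML layout (`database`, `table`, `column` elements and
their attributes) is invented for illustration, and the call that would send
the resulting documents to the data API is deliberately left out, since its
details are not given in this thread.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names are invented for illustration.
SAMPLE = """
<database name="sales">
  <table name="orders" rows="1200">
    <column name="id" type="integer"/>
    <column name="placed_on" type="date"/>
  </table>
</database>
"""


def tables_to_documents(xml_text):
    """Turn each <table> element into one JSON-serialisable dict."""
    root = ET.fromstring(xml_text.strip())
    docs = []
    for table in root.findall("table"):
        docs.append({
            "database": root.get("name"),
            "table": table.get("name"),
            "rows": int(table.get("rows", "0")),
            # Each column becomes a plain dict of its XML attributes.
            "columns": [dict(col.attrib) for col in table.findall("column")],
        })
    return docs


documents = tables_to_documents(SAMPLE)
payload = json.dumps(documents, indent=2)  # JSON text ready to be stored
```

One document per table keeps each record self-describing, which suits a
document store; a different split (for example one document per column) would
work equally well depending on how the data is queried.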
>>
>>> Also, from the docs and source code, I still can't figure out what
>>> ckanext-archiver does and how it relates to ckanext-datastorer. They
>>> both seem to share some common code.
>>
>>Archiver archives resources: i.e. it looks for resources with remote
>>urls and stores a copy of that data into the FileStore (i.e. it
>>*archives* it). The DataStorer instead processes the data and puts it
>>in the DataStore.
>
>Makes sense now.
>
>>
>>> To elaborate more on my use case, the XML document actually represents
>>> metadata about a database (e.g. tables, columns, keys, row counts, etc.).
>>> One way to think of the extension is as a 'metadatastorer'. The
>>> resources could be in XML format, or in the future, additional formats
>>> may be supported for different types of stores (e.g. NoSQL dbs).
>>
>>Understood. I note we've also been thinking quite a bit about how to
>>specify metadata for datasets. In the simplest case we use the mapping
>>metadata in ElasticSearch to store info about fields (type, format
>>etc). We're also thinking about using JSON-LD contexts more heavily
>>for this purpose (see [1])
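
As an illustration of storing field information (type, format) in mapping
metadata, here is a small sketch that builds an ElasticSearch-style mapping
body as a plain dict. The column list, the `orders` type name, and the helper
`build_mapping` are assumptions made for this example; which field types are
accepted depends on the ElasticSearch version in use.

```python
import json

# Hypothetical column descriptions; in practice these would come from the
# resource's own metadata.
COLUMNS = [
    {"name": "id", "type": "integer"},
    {"name": "placed_on", "type": "date", "format": "yyyy-MM-dd"},
    {"name": "customer", "type": "string"},
]


def build_mapping(doc_type, columns):
    """Build a mapping body that records the type (and format) of each field."""
    properties = {}
    for col in columns:
        field = {"type": col["type"]}
        if "format" in col:
            field["format"] = col["format"]
        properties[col["name"]] = field
    return {doc_type: {"properties": properties}}


mapping = build_mapping("orders", COLUMNS)
print(json.dumps(mapping, indent=2, sort_keys=True))
```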
>
>That would be good. I guess a 'resource' will become a tuple of 'metadata'
>and 'data'.
>
>What are your thoughts about 'Single Point Of Truth' [2]?
>
>It seems a resource could have multiple representations: as a file, as a
>JSON object in ES, as a graph in some triple store, etc. Borrowing from
>DVCS, these related but separate representations resemble branches. Do
>people have thoughts about how this would be handled in the API and the UI?
>
>
>Salman
>
>[2]: http://teddziuba.com/2011/06/most-important-concept-systems-design.html
>
>
>>
>>Rufus
>>
>>[1]: http://lists.okfn.org/pipermail/ckan-discuss/2012-May/002186.html
>>
>>>
>>> Thanks,
>>> Salman
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>
>>
>>
>>
>>-- 
>>Co-Founder, Open Knowledge Foundation
>>Promoting Open Knowledge in a Digital Age
>>http://www.okfn.org/ - http://blog.okfn.org/
>>
>
>




