[ckan-dev] Search functions in CKAN; custom schema.

Rufus Pollock rufus.pollock at okfn.org
Wed May 30 08:54:44 UTC 2012


On 29 May 2012 12:48, David Raznick <kindly at gmail.com> wrote:
> On Tue, May 29, 2012 at 11:25 AM, Friedrich Lindenberg
> <friedrich.lindenberg at okfn.org> wrote:
>> Hi all,
>>
>> I just wanted to query a few points regarding CKAN that I may not be
>> up to date on, since we're discussing using CKAN with a partner
>> organization and I want to be clear on the features.
>>
>> * Is global data search planned; when is it likely to land? Has anyone
>> played with indexing PDF/Word/... docs, e.g. via Tika?
>
> No, not planned, but interesting, and should probably be stored in the
> datastore.  This is because you can easily do full text search across

I should correct this in the sense that I've been thinking about this
for a while and even thought of putting it directly into the original
datastore implementation (this is really easy to do with ES).

The main issue is performance -- but that could perhaps be fixed by
judicious timeouts? Also, one could have a simple switch for this so
that people who want it in their install can just enable it.
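
To make the timeout-plus-switch idea concrete, here is a minimal sketch of
what the query side might look like. It only builds an Elasticsearch
search request body (a `query_string` query with the request-level
`timeout` field) and does not talk to a server. The config key
`ckan.datastore.global_search` and all function names are hypothetical,
invented for illustration; nothing like this exists in CKAN as far as
this thread says.

```python
# Sketch only: the config key and helper names below are hypothetical.
GLOBAL_SEARCH_KEY = "ckan.datastore.global_search"  # hypothetical switch


def global_search_enabled(config):
    """Return True only when the install has explicitly opted in."""
    value = str(config.get(GLOBAL_SEARCH_KEY, "false"))
    return value.strip().lower() in ("true", "1", "yes", "on")


def build_search_body(text, timeout_ms=500, size=10):
    """Build an Elasticsearch search request body for a free-text query.

    The "timeout" field asks ES to bound the time spent on the search,
    which is the "judicious timeout" idea applied per request.
    """
    return {
        "query": {"query_string": {"query": text}},
        "timeout": "%dms" % timeout_ms,
        "size": size,
    }


def global_search_request(config, text):
    """Return the request body, or None when the switch is off."""
    if not global_search_enabled(config):
        return None
    return build_search_body(text)
```

With the switch left at its default the helper returns None, so installs
that do not want the extra load never issue the query at all.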

> all the tabular data already.  The datastorer could be pretty
> trivially extended to do this as long as the document parsers do not
> require too much work.  It would be very useful to have a core language
> metadata field, possibly per resource, if we were going to index
> textual documents.

We'd just use ES + Tika directly here, though obviously we need
DataStorer to handle the relevant encoding etc.

>> * Can I have per-group or per-dataset schemata with custom vocabs and
>> have these enforced as validation when saving metadata, as well as
>> used to generate a custom form? e.g. I send people to
>> datahubio/datasets/new?schema=mymeta - this will ask for a couple of
>> extras and enforce they are part of an enumeration, then save that
>> association and use the form each time the dataset is edited.
>
> Yes. It is not particularly well tested and has a reasonably large
> barrier to entry, but it is definitely there.  It is actually not done
> as a param but as a top-level entity, i.e.
> datahubio/my_dataset/new
> A custom extension can define what it can be called, its schema and its form.
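
As a rough illustration of "enforce they are part of an enumeration",
the sketch below shows the general shape of a per-type schema in plain
Python: each extra maps to a list of validators, and validation collects
one error message per failing field. This is not CKAN's validation API;
in CKAN the extension would, I believe, hook in through the IDatasetForm
plugin interface. The schema name, field names, allowed values and
helper functions here are all made up for the example.

```python
# Sketch only: plain Python, not CKAN's API. All names are hypothetical.
class Invalid(Exception):
    """Raised by a validator when a value is not acceptable."""


def not_empty(value):
    """Reject missing or blank values."""
    if value is None or str(value).strip() == "":
        raise Invalid("Missing value")
    return value


def one_of(allowed):
    """Build a validator that accepts only members of an enumeration."""
    def validator(value):
        if value not in allowed:
            raise Invalid("Must be one of: %s" % ", ".join(sorted(allowed)))
        return value
    return validator


# Hypothetical schema for a custom dataset type called "mymeta".
MYMETA_SCHEMA = {
    "theme": [not_empty, one_of({"health", "transport", "finance"})],
    "update_frequency": [one_of({"daily", "monthly", "yearly"})],
}


def validate(extras, schema):
    """Run every field's validators; return a dict of field -> error."""
    errors = {}
    for field, validators in schema.items():
        value = extras.get(field)
        try:
            for check in validators:
                value = check(value)
        except Invalid as exc:
            errors[field] = str(exc)
    return errors
```

An empty dict from validate() means the extras can be saved; the same
schema could also drive which inputs the custom form renders.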

Where are the docs for this BTW? IIRC there was some good work done
here earlier in the year.

Rufus

>>
>> Thanks for any advice,
>>
>>  - Friedrich
>>
>> _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/ckan-dev



-- 
Co-Founder, Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/



