[ckan-dev] Search functions in CKAN; custom schema.

David Raznick kindly at gmail.com
Wed May 30 10:52:46 UTC 2012


On Wed, May 30, 2012 at 9:54 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> On 29 May 2012 12:48, David Raznick <kindly at gmail.com> wrote:
>> On Tue, May 29, 2012 at 11:25 AM, Friedrich Lindenberg
>> <friedrich.lindenberg at okfn.org> wrote:
>>> Hi all,
>>>
>>> I just wanted to query a few points regarding CKAN that I may not be
>>> up to date on, since we're discussing using CKAN with a partner
>>> organization and I want to be clear on the features.
>>>
>>> * Is global data search planned; when is it likely to land? Has anyone
>>> played with indexing PDF/Word/... docs, e.g. via Tika?
>>
>> No, not planned, but interesting, and should probably be stored in the
>> datastore.  This is because you can easily do full text search across
>
> I should correct this in the sense that I've been thinking about this
> for a while and even thought of putting it directly into the original
> datastore implementation (this is really easy to do with ES).
>
> The main issue is performance -- but could be fixed by judicious
> timeouts? Also one could have a simple switch for this so that people
> who want this in their install can just enable.

I do not think performance will be an issue unless people are
adding hundreds of datasets.  We should keep the parsed documents in a
different index than the tabular data, though (you can still search
across indexes).
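
A minimal sketch of what a search across two such indexes might look
like. Elasticsearch accepts a comma-separated list of index names in
the search path. The index names "documents" and "tabular" and the
helper build_search_request are hypothetical, and the request is only
built here, not sent.

```python
import json


def build_search_request(indexes, text):
    """Build the path and JSON body for a search spanning several indexes."""
    # Elasticsearch lets one request target several indexes by joining
    # their names with commas in the URL path.
    path = "/" + ",".join(indexes) + "/_search"
    # A query_string query runs a full-text search over the indexed fields.
    body = {"query": {"query_string": {"query": text}}}
    return path, json.dumps(body, sort_keys=True)


# Hypothetical index names: one for parsed documents, one for tabular data.
path, body = build_search_request(["documents", "tabular"], "budget 2012")
print(path)
print(body)
```

Keeping the two kinds of content in separate indexes means each can have
its own mapping, while one request can still cover both.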

>
>> all the tabular data already.  The datastorer could be pretty
>> trivially extended to do this as long as the document parsers do not
>> require too much work.  It would be very useful to have a core language
>> metadata field, possibly per resource, if we were going to index
>> textual documents.
>
> We'd just use ES + Tika directly here though obviously we need
> DataStorer to do relevant encoding etc.
>
>>> * Can I have per-group or per-dataset schemata with custom vocabs and
>>> have these enforced as validation when saving metadata, as well as
>>> used to generate a custom form? e.g. I send people to
>>> datahubio/datasets/new?schema=mymeta - this will ask for a couple of
>>> extras and enforce they are part of an enumeration, then save that
>>> association and use the form each time the dataset is edited.
>>
>> Yes, it is not particularly well tested and has a reasonably large
>> barrier to entry, but it is definitely there.  It is actually not done
>> as a param but as a top-level entity, i.e.
>> datahubio/my_dataset/new
>> A custom extension can define what it is called, its schema and its form.
>
> Where are the docs for this BTW? IIRC there was some good work done
> here earlier in the year.

http://readthedocs.org/docs/ckan/en/latest/forms.html
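
As a rough illustration of the enumeration check Friedrich asks about
(extras restricted to a fixed vocabulary), here is a plain-Python
sketch. It does not use CKAN's validator or plugin API. The schema
layout, the field names and validate_extras are all hypothetical.

```python
def validate_extras(extras, schema):
    """Return a dict of field -> error message for extras that break the schema."""
    errors = {}
    for field, allowed in schema.items():
        value = extras.get(field)
        if value is None:
            errors[field] = "missing value"
        elif value not in allowed:
            errors[field] = "must be one of: " + ", ".join(sorted(allowed))
    return errors


# Hypothetical schema: each extra must take a value from a fixed vocabulary.
MYMETA_SCHEMA = {
    "language": {"de", "en", "fr"},
    "update_frequency": {"daily", "monthly", "yearly"},
}

errors = validate_extras(
    {"language": "es", "update_frequency": "daily"}, MYMETA_SCHEMA
)
print(errors)
```

An empty result would mean the dataset can be saved. A non-empty one
carries per-field messages that a form could show next to each input.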
>
> Rufus
>
>>>
>>> Thanks for any advice,
>>>
>>>  - Friedrich
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>
>
>
>
> --
> Co-Founder, Open Knowledge Foundation
> Promoting Open Knowledge in a Digital Age
> http://www.okfn.org/ - http://blog.okfn.org/
>



