[ckan-dev] Search functions in CKAN; custom schema.

David Raznick kindly at gmail.com
Wed May 30 13:26:48 UTC 2012


On Wed, May 30, 2012 at 2:16 PM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> On 30 May 2012 11:52, David Raznick <kindly at gmail.com> wrote:
>> On Wed, May 30, 2012 at 9:54 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>>> On 29 May 2012 12:48, David Raznick <kindly at gmail.com> wrote:
>>>> On Tue, May 29, 2012 at 11:25 AM, Friedrich Lindenberg
>>>> <friedrich.lindenberg at okfn.org> wrote:
>>>>> Hi all,
>>>>>
>>>>> I just wanted to query a few points regarding CKAN that I may not be
>>>>> up to date on, since we're discussing using CKAN with a partner
>>>>> organization and I want to be clear on the features.
>>>>>
>>>>> * Is global data search planned; when is it likely to land? Has anyone
>>>>> played with indexing PDF/Word/... docs, e.g. via Tika?
>>>>
>>>> No, not planned but interesting, and should probably be stored in the
>>>> datastore.  This is because you can easily do full text search across
>>>
>>> I should correct this in the sense that I've been thinking about this
>>> for a while and even thought of putting it directly into the original
>>> datastore implementation (this is really easy to do with ES).
>>>
>>> The main issue is performance -- but could be fixed by judicious
>>> timeouts? Also one could have a simple switch for this so that people
>>> who want this in their install can just enable.
>>
>> I do not think performance will be an issue unless people are
>> adding 100s of datasets.  We should have the parsed documents on a
>> different index than the tabular data though. (You can still search
>> across indexes.)
>
> This point was about *querying* across multiple datasets/resources not
> about loading.

My point about having it on a separate index covered that.  There
should be *far* less data in all the documents than in, say, the whole
of UK spending data (which I would hope to have in the datastore one
day!)
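
A minimal sketch of what a query across two indexes with a timeout
could look like, assuming Elasticsearch's comma-separated multi-index
search URL and the "timeout" field in the search body. The index names
"documents" and "tabular", the base URL, and the helper name are
hypothetical and only for illustration; nothing here is existing CKAN
code.

```python
import json


def build_search_request(base_url, indexes, text, timeout="2s"):
    """Build the URL and JSON body for a full-text search over several indexes.

    Assumption: Elasticsearch accepts several index names joined by commas
    in the path, e.g. /documents,tabular/_search.
    """
    # Join the (hypothetical) index names into one multi-index path.
    path = ",".join(indexes)
    url = "%s/%s/_search" % (base_url.rstrip("/"), path)
    body = {
        # Free-text query applied to every index named in the URL.
        "query": {"query_string": {"query": text}},
        # Ask the server to stop collecting hits after this long.
        "timeout": timeout,
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_search_request(
        "http://localhost:9200", ["documents", "tabular"], "spending"
    )
    print(url)
    print(body)
```

The request only builds the URL and body; sending it (for example with
an HTTP POST) is left out so that the sketch does not depend on a
running server.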


>
> [...]
>
>>> Where are the docs for this BTW? IIRC there was some good work done
>>> here earlier in the year.
>>
>> http://readthedocs.org/docs/ckan/en/latest/forms.html
>
> thanks. (BTW why not use http://docs.ckan.org/en/latest/forms.html ?)
>
> One suggestion, having read these a bit, is that links from specific
> sections to the relevant portions of the ckanext-example would be really
> useful, though they could break quite often (perhaps link just to the
> relevant files). But I understand the "so many things, so little time"
> aspect of things.
>
> rufus



