[ckan-dev] full text search and sql

Seb Bacon seb.bacon at gmail.com
Mon Feb 7 09:21:26 UTC 2011


On 6 February 2011 23:54, William Waites <ww at styx.org> wrote:
> So working a bit with James on the harvesting, he said to
> me, "even when the harvesting works I don't see any packages".
> Very strange. They show up in the version history... They
> show up in the API... They do not show up in the web interface...
>
> Now why is that?
>
> There were two problems... The home page and package list
> were changed to use the full-text search a few days ago.
> This is only really tested with Solr and enables nice
> facet-based stuff.
>
> Problem #1: the FTS index wasn't getting built properly.
> When a package is added by the harvester command line
> tool, the index tries to build (INSERT) before the package
> is created (or committed). Maybe it's happening in another
> thread, maybe it's really happening out of order. Not
> entirely sure since the trigger for the indexing is a bit
> magic (I followed it through a few times before but have
> forgotten how it works each time since). So the immediate
> kludge for the meetings and dog and pony show tomorrow is
> to just rebuild the search index after the harvesting run.
> Might this have something to do with a change in the
> semantics of magic between SQLAlchemy 0.4 and 0.6?
>
> Problem #2: the SQL FTS index doesn't seem to properly
> handle a query of the form '*:*'. It returns no results.
> But, this is now what the home page and the package list
> controllers do. As an interim measure I made the home
> controller count packages in the traditional way using
> count() so at least it doesn't lie frighteningly and tell
> you that there are 0 packages available. And for the rest,
> some canned queries, 'Photography', 'Plan' that will pull
> out some known datasets since specific queries do still
> work with the SQL FTS index. I don't know yet (hopefully
> will find out tomorrow) if this might partially be an
> artefact of my development environment running PostgreSQL
> 9.0 instead of the 8.x series, or if this will happen everywhere.
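
On Problem #1, the "index builds before the package is committed" theory
is easy to picture outside CKAN. What follows is only a sketch and says
nothing about how CKAN's indexing trigger really works: it uses SQLite
from the Python standard library rather than PostgreSQL, and the table
name and the "writer"/"indexer" connections are made up for illustration.
It shows a second connection seeing no row until the first one commits,
which is the situation an indexer on another connection or thread would
be in.

```python
import os
import sqlite3
import tempfile


def visible_before_and_after_commit():
    """Return (rows seen before commit, rows seen after commit).

    Both counts are taken by a second connection, standing in for an
    indexer that does not share the writer's transaction.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # Hypothetical "harvester" connection that creates packages.
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE package (name TEXT)")
        writer.commit()

        # Hypothetical "indexer" connection reading the same database.
        indexer = sqlite3.connect(path)

        # The insert opens a transaction on the writer; nothing is committed yet.
        writer.execute("INSERT INTO package (name) VALUES ('harvested-dataset')")
        before = indexer.execute("SELECT count(*) FROM package").fetchall()[0][0]

        writer.commit()
        after = indexer.execute("SELECT count(*) FROM package").fetchall()[0][0]

        writer.close()
        indexer.close()
        return before, after
```

If that is what is going on, rebuilding the index after the harvesting
run works simply because everything has been committed by then.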

Problem #3 -- there's therefore at least one missing test :)
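
Something along these lines is the sort of test I mean for Problem #2.
It is a sketch only: search() here is a toy stand-in written for this
mail, not CKAN's search API, and the package names are invented. The
point is the assertion that a '*:*' query returns as many results as
there are packages, checked against whichever backend is configured.

```python
MATCH_ALL = "*:*"


def search(packages, query):
    """Toy stand-in for a search backend (not CKAN's real API).

    `packages` maps a package name to its searchable text. Returns the
    sorted names of the packages that match the query.
    """
    if query.strip() == MATCH_ALL:
        # The match-all query must return everything, not nothing.
        return sorted(packages)
    terms = query.lower().split()
    return sorted(
        name
        for name, text in packages.items()
        if all(term in text.lower() for term in terms)
    )


def check_match_all_agrees_with_count(packages):
    """Regression check: '*:*' must return every package, not zero."""
    results = search(packages, MATCH_ALL)
    assert len(results) == len(packages), (len(results), len(packages))
    return results
```

A real version would create a couple of packages, run the same check
against the SQL backend as well as Solr, and compare with count().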

Sorry, no opinions about the rest, as I have not looked at it yet...

Seb