[ckan-dev] Deleted packages in search results

Seb Bacon seb.bacon at gmail.com
Thu Jan 13 09:09:07 UTC 2011


Hi,

On 12 January 2011 21:48, Friedrich Lindenberg <friedrich at pudo.org> wrote:
> Hi all,
>
> I'm currently trying to debug an issue that has been brought up by the
> HRI folks: in their CKAN instance, they've deleted a large number of
> packages and they're using solr indexing. The problem with this is
> that both deleted and active packages are indexed, since we want
> admins to still search for them (do we?). Filtering for deleted
> packages is then done on the result set, while result counts remain
> wrong.
>
> My initial approach to fixing this was to do the filtering within solr
> by passing a list of all packages for which the querying user is an
> admin to solr in a query such as this:
>
>  +(state:active OR name:my_pkg1 OR name:my_pkg2)
>
> Of course, this doesn't scale, especially for sysadmins, who are
> admins of all packages. The solr query parser quits at about 1k package
> names. I'm now a bit unsure, since the only solution I can spot is to
> include the list of admins in the index, thus replicating a part of
> the authz layer in solr.
>
> Is there a better/smarter/easier way to circumvent this?

I'm not familiar with Solr yet, but I've had to solve the same problem
in other systems, and I came to the same conclusion as you: index,
against each record, who has permission to view it. In my application
that meant (a) the roles able to view it, plus (b) exceptions in the
form of usernames.

Seb



