[ckan-dev] Semantic Search - Extension
Sven R. Kunze
sven.kunze at s2007.tu-chemnitz.de
Fri Nov 30 10:44:48 UTC 2012
Hi guys,
I am currently thinking about using SOLR to index the
following data:
- list of URIs indicating the vocabularies, predicates, classes and
entities (for topic search)
- latitude, longitude, radius (for geo search)
- min time, max time (for time search)
However, I would face the following issue:
- how can I ensure that e.g. subclasses of a class will also be found
although it is not mentioned in that RDF dataset explicitly
=> materialization could do that, but that is the business of a triplestore
- so I could let a triplestore create all additional triples and get them
indexed
- however forward-chaining is not the best way as it causes severe issues
when updating (rebuilding the complete closure, indexes etc.)
=> therefore, I'd like to handle the filtering myself via SPARQL queries
(where backward chaining can be done)
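To make the SPARQL filtering idea concrete, here is a minimal sketch, not a finished implementation. It assumes each dataset is stored in its own named graph and that the triplestore supports SPARQL 1.1 property paths; the path `rdfs:subClassOf*` stands in for backward chaining by resolving subclasses at query time. The function name and the example URI are hypothetical.

```python
def subclass_filter_query(class_uri):
    """Build a SPARQL query for datasets containing instances of
    class_uri or of any of its (transitive) subclasses."""
    # Reject characters that are not allowed inside an IRI reference,
    # so the value cannot break out of the <...> brackets.
    if any(ch in class_uri for ch in '<>"{}|\\^` \n\t'):
        raise ValueError("not a valid IRI: %r" % class_uri)
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?dataset WHERE {\n"
        # Zero-or-more steps: matches the class itself and all subclasses.
        "  ?cls rdfs:subClassOf* <%s> .\n"
        # Assumption: one named graph per dataset.
        "  GRAPH ?dataset { ?entity rdf:type ?cls . }\n"
        "}" % class_uri
    )


print(subclass_filter_query("http://example.org/Place"))
```

Nothing is materialized here, so updating a vocabulary would not require rebuilding a closure; the cost moves to query time instead.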
Implications:
The issue with the post-filtering (as the extension works now) is that the
facets aren't updated correctly.
So, pre-filtering would be more adequate.
Is there a way to pass a list of relevant items to SOLR?
The idea is that a triplestore could (or could not, as defined by the admin)
filter out the datasets that match the specified search criteria (topic,
geo, time) and SOLR could run its regular search based on that
pre-filtered list of datasets.
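One possible way to hand such a pre-filtered list to SOLR is a filter query (`fq`) on the unique key field, sketched below. Since `fq` restricts the result set before facets are counted, the facets should then reflect the filtered set. The field name `id`, the helper names and the facet field are assumptions for illustration; long lists may also run into SOLR's `maxBooleanClauses` limit.

```python
def _quote(value):
    """Quote a value as a SOLR phrase, escaping backslashes and quotes."""
    return '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')


def ids_to_filter_query(dataset_ids, field="id"):
    """Turn the ids returned by the triplestore into an fq value."""
    ids = list(dataset_ids)
    if not ids:
        # Nothing matched the semantic criteria: the caller should
        # return an empty result instead of querying SOLR at all.
        raise ValueError("empty id list; skip the SOLR query")
    return "%s:(%s)" % (field, " OR ".join(_quote(i) for i in ids))


def build_search_params(user_query, dataset_ids, facet_fields):
    """Assemble request parameters for a regular search that is
    restricted to the pre-filtered datasets."""
    return {
        "q": user_query,
        "fq": ids_to_filter_query(dataset_ids),
        "facet": "true",
        "facet.field": list(facet_fields),
    }


print(build_search_params("water", ["ds-1", "ds-2"], ["tags"]))
```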
One more thing: if the described approach worked, would it also be
possible to pre-sort the list for SOLR (not by an index value but by
pre-sorting the given list)?
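As far as I know SOLR has no direct "sort by position in a given list", but the order could perhaps be approximated with per-id boosts, as in this rough sketch. The helper name is hypothetical and the field name `id` is an assumption. Used as a boost query (`bq`) with the dismax or edismax parser, the boosts are added to the normal relevance score, so they influence the order rather than guarantee it.

```python
def ranked_ids_to_boost_query(ranked_ids, field="id"):
    """Give earlier ids a higher boost so that they tend to rank first.

    The first id gets boost n, the last one boost 1.
    An empty list yields an empty string (no boosting).
    """
    ids = list(ranked_ids)
    n = len(ids)
    clauses = []
    for pos, value in enumerate(ids):
        # Escape backslashes and quotes before quoting the value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append('%s:"%s"^%d' % (field, escaped, n - pos))
    return " OR ".join(clauses)


print(ranked_ids_to_boost_query(["x", "y", "z"]))
```

An alternative would be to skip SOLR sorting entirely and reorder the returned page on the client side according to the triplestore's list, at the cost of breaking pagination across pages.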
Cheers,
Sven