[ECODP-dev] SOLR questions

John Glover john.glover at okfn.org
Mon Oct 14 15:24:28 UTC 2013


Hi Bert,

I have added a section to our "Additions for operations manual - Release
01.00" document called "CKAN-Solr Mapping" that lists the CKAN fields in
the EU ODP project and the corresponding Solr field name.

Regards,
John


On 9 October 2013 15:10, PASTOR CAMARASA José Juan (OP) <
Jose.PASTOR-CAMARASA at publications.europa.eu> wrote:

> Hi John and Bert,
>
> Thanks for all this information.
>
> Bert, could you add all this information to the Operational Manual?
>
> What I don't understand is that
> https://github.com/okfn/ckanext-ecportal/blob/master/ckanext/ecportal/solr/schema.xml
> is a generic CKAN schema: if I search it for the "keyword" or
> "geographical_coverage" attribute, I find nothing. Is there no
> ODP-specific definition in Solr?
>
> Très cordialement / Best regards
>
> José Pastor
>
> *From:* John Glover [mailto:john.glover at okfn.org]
> *Sent:* Wednesday, October 09, 2013 11:27 AM
> *To:* Project list for EC ODP CKAN project
> *Cc:* ZAJAC Agnieszka (OP); PASTOR CAMARASA José Juan (OP); SABETE Vafa
> (OP)
> *Subject:* Re: *[ECODP-dev] SOLR questions*
>
> Hi Bert,
>
> Replies inline below.
>
> > In which document can we find information about how we have implemented
> > SOLR in ODP? I can't find anything in the operations manual.
>
> I'm not really sure what information is required here. Our Solr schema is
> in the ckanext-ecportal extension [1]; it contains the list of all fields
> that are currently indexed (and how we have configured Solr to index them).
> There is also some information about the multilingual fields and the query
> parser in the Operations Manual (p. 34). It seems like you would be the
> best person to comment on the actual deployment aspects.
>
> > Where can we find the configuration used in ODP: wildcards, Boolean
> > operators, fuzzy search, range search, search by fields, …
>
> We don't have any special Solr query parsing in CKAN; we basically pass
> your query straight through to Solr, so this information is best obtained
> from the Solr documentation [2][3]. More information about our search API
> parameters is given in the docs [4].
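
As a rough illustration of the search API mentioned above, the following
Python sketch builds a package_search request URL. The host name
portal.example.org and the /api/3/action/ path prefix are assumptions made
for illustration only; the CKAN docs [4] give the exact endpoint for a given
version. Only the q and rows parameters are used, and no request is sent.

```python
from urllib.parse import urlencode

# Hypothetical portal address (assumption); replace with the real CKAN host.
BASE_URL = "http://portal.example.org"


def package_search_url(query, rows=10, base_url=BASE_URL):
    """Build a URL for CKAN's package_search action.

    The query text is left unchanged apart from URL encoding, mirroring
    the point that CKAN passes the query straight through to Solr.
    """
    params = urlencode({"q": query, "rows": rows})
    # Assumed path layout for the action API.
    return f"{base_url}/api/3/action/package_search?{params}"


if __name__ == "__main__":
    print(package_search_url("poverty"))
    print(package_search_url("title:pov*", rows=5))
```

Because the query is forwarded as-is, any syntax accepted by the Solr query
parser in use can be placed in q.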
>
> > In which fields can we search?
>
> These are listed in our Solr schema [1].
>
> > Where can we find the list of stop words? Are they only in English?
>
> We are not really using any stop words at the moment (the default
> 'protwords.txt' is used for English, but this is practically empty,
> containing just two test examples).
>
> > How do we search for special characters (+ - && || ! ( ) { } [ ] ^ " ~ * ?
> > : \)?
>
> Special characters will generally be stripped by our current Solr
> analyzers at both index and query time, so currently you cannot search for
> these characters.
>
> [1]:
> https://github.com/okfn/ckanext-ecportal/blob/master/ckanext/ecportal/solr/schema.xml
>
> [2]: http://wiki.apache.org/solr/SolrQuerySyntax
>
> [3]: http://wiki.apache.org/solr/DisMaxQParserPlugin
>
> [4]:
> http://docs.ckan.org/en/ckan-1.8.2/apiv3.html#ckan.logic.action.get.package_search
>
> Regards,
>
> John
>
> On 8 October 2013 13:34, Bert Van Nuffelen <bert.van.nuffelen at tenforce.com>
> wrote:
>
> Hi Darwin and John,
>
> Here are some Solr questions from José. Can you answer them?
>
> · In which document can we find information about how we have
>   implemented SOLR in ODP? I can't find anything in the operations manual.
>
> · Where can we find the configuration used in ODP: wildcards,
>   Boolean operators, fuzzy search, range search, search by fields, …
>
> · In which fields can we search?
>
> · Where can we find the list of stop words? Are they only in
>   English?
>
> · How do we search for special characters (+ - && || ! ( ) { } [
>   ] ^ " ~ * ? : \)?
>
> kind regards,
>
> Bert
>
> *[JP] *----
>
> *From:* John Glover [mailto:john.glover at okfn.org]
> *Sent:* Wednesday, October 09, 2013 11:57 AM
> *To:* Project list for EC ODP CKAN project
> *Cc:* PASTOR CAMARASA José Juan (OP); ZAJAC Agnieszka (OP); HOHN Norbert
> (OP); SABETE Vafa (OP)
> *Subject:* Re: *[ECODP-dev] CKAN wild character for search*
>
> Hi Bert,
>
> Yes, the poverty vs poverties example will be covered by the stemming in
> Solr.
>
> In general, wildcard searches are not supported as we use the dismax query
> parser [1].
>
> However, if you search a specific field by entering something like
> "title: pov*" (without the quotes), wildcards do work. This is because,
> if we detect a ":" character in the query, we fall back to the default
> query parser, which does support wildcards. As you presumed, though, it
> does not support wildcards at the start of terms.
>
> [1]: http://wiki.apache.org/solr/DisMaxQParserPlugin
>
> Regards,
>
> John
>
> On 3 October 2013 16:38, Bert Van Nuffelen <bert.van.nuffelen at tenforce.com>
> wrote:
>
> Hi John,
>
> It seems I forgot to include you in this conversation.
>
> Can you have a look at it?
>
> kind regards,
>
> Bert
>
> 2013/10/3 Bert Van Nuffelen <bert.van.nuffelen at tenforce.com>
>
> Hi José,
>
> I assume this should be captured by the stemming in the Solr component.
>
> So poverty should return poverties, unless you search for "poverty" (the
> exact string).
>
> @John, can you confirm this?
>
> kind regards,
>
> Bert
>
> 2013/10/3 PASTOR CAMARASA José Juan (OP) <
> Jose.PASTOR-CAMARASA at publications.europa.eu>
>
> Hi Bert,
>
> What is the wildcard character for searching in CKAN?
>
> For example, if I want to find "poverty" or "poverties", which search
> wildcard should we use? I've tried the classic wildcards (pover%, pover?,
> pover*) and I get no results.
>
> And I presume we don't have a leading (left-hand) wildcard?
>
> Très cordialement / Best regards
>
> José Pastor