[ckan-dev] Encoding problem in extra_* fields

Rafa rafsalrob at gmail.com
Mon May 14 09:41:22 UTC 2018


Hi everyone!

My CKAN 2.7.1 instance is deployed with the ckanext-scheming and
ckanext-fluent extensions. We need to show CKAN fields in both Spanish and
English, including the tags field. We configured a new field following the
instructions at https://github.com/ckan/ckanext-fluent and everything works
fine, but when we search for datasets using a keyword with an accent mark,
CKAN doesn't return any results because Solr is replacing the accented
character with a Unicode escape sequence like \u00ed (for example,
meteorología becomes meteorolog\u00eda).
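
For illustration, the same escaped form appears whenever Python serializes
text to JSON with default settings. This is only a sketch of one possible
origin of the \u00ed sequence; the dictionary below is a hypothetical value
shaped like a multilingual field, and nothing here confirms what CKAN or
Solr actually does during indexing.

```python
import json

# Hypothetical multilingual value, shaped like a fluent-style field (assumption).
value = {"es": "meteorología", "en": "meteorology"}

# Default json.dumps uses ensure_ascii=True, which escapes every non-ASCII
# character as a \uXXXX sequence.
escaped = json.dumps(value)

# With ensure_ascii=False the accented character is kept as-is.
readable = json.dumps(value, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode back to the same value with json.loads, so the escape is
only a problem if the serialized text itself is what gets indexed.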

I have tried adding a <charFilter> to the <fieldType> named "text" in the
schema.xml file, like this:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <!-- added charFilter -->
            <charFilter class="solr.MappingCharFilterFactory"
                        mapping="mapping-ISOLatin1Accent.txt"/>
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    splitOn$
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory"
                    language="English" protected="protwords.txt"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <!-- added charFilter -->
            <charFilter class="solr.MappingCharFilterFactory"
                        mapping="mapping-ISOLatin1Accent.txt"/>
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory"
                    synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="0" catenateNumbers="0" catenateAll="0"
                    splitOn$
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory"
                    language="English" protected="protwords.txt"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
    </fieldType>

but it doesn't work.
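
One possible reason, sketched below under an assumption: if the indexed
text holds the literal six characters \u00ed rather than the character í,
an accent-folding step has nothing to fold. The fold_accents helper is a
hypothetical stand-in written with Python's unicodedata module; it is not
Solr's MappingCharFilterFactory or ASCIIFoldingFilterFactory.

```python
import unicodedata


def fold_accents(text):
    """Rough stand-in for accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# What the user types in the search box, after folding.
query = fold_accents("meteorología")

# Indexed text that still contains the real accented character.
indexed_real = fold_accents("meteorología")

# Indexed text that contains a literal backslash followed by "u00ed".
indexed_escaped = fold_accents("meteorolog\\u00eda")

print(query == indexed_real)     # folded forms are identical
print(query == indexed_escaped)  # the escaped form is left untouched
```

If that assumption holds, the fix would need to happen before the text
reaches the analyzer, not in the analyzer chain itself.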

Could you help me, please?

Thanks. Rafa.



-- 
--------------------------------------------------
Rafael Salas Robledo