[annotator-dev] default analyzer for tags field

Andrew Magliozzi andrew at finalsclub.org
Thu Aug 1 00:06:11 UTC 2013


Hey Gergely,

Roughly what might you suggest we do to improve upon our current schema?

Warmly,
Andrew




On Jul 31, 2013, at 6:49 PM, Randall Leeds <tilgovi at hypothes.is> wrote:

> My guess would be that there was no intention in particular here.
> 
> 
> On Tue, Jul 30, 2013 at 4:33 PM, Gergely, Ujvari <ujvari at hypothes.is> wrote:
>> Hello!
>> I have a theoretical question about how the tag index should work.
>> The tags field is defined like this in annotation.py:
>> 
>> 'tags': {'type': 'string', 'index_name': 'tag'}
>> But no analyzer was set up for the search, so ES uses its own analyzer, which by default drops common stopwords from searches, for example:
>> 
>> "a", "an", "and", "are", "as", "at", "be", "but", "by",
>>   "for", "if", "in", "into", "is", "it",
>>   "no", "not", "of", "on", "or", "such",
>>   "that", "the", "their", "then", "there", "these",
>>   "they", "this", "to", "was", "will", "with"
>> This means that searching for these stopwords does not return any results.
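
To make the effect described above concrete, here is a rough sketch in plain Python. It does not call Elasticsearch at all: `analyze` is a hypothetical stand-in that lowercases, splits on non-alphanumerics, and drops the stopwords listed above, which is only an approximation of what the default analyzer does. `TAGS_MAPPING_EXACT` is likewise an assumption, not something from annotation.py; it shows one possible variant of the mapping that uses the `'index': 'not_analyzed'` setting for string fields so each tag would be indexed as a single exact term.

```python
import re

# Stopwords as quoted in the message above.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by",
    "for", "if", "in", "into", "is", "it",
    "no", "not", "of", "on", "or", "such",
    "that", "the", "their", "then", "there", "these",
    "they", "this", "to", "was", "will", "with",
}


def analyze(text):
    """Hypothetical stand-in for a stopword-removing analyzer.

    Lowercases the input, splits it into alphanumeric tokens, and
    discards any token that is in STOPWORDS.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


# Assumed alternative mapping (not taken from annotation.py): with
# 'not_analyzed' the tag value is indexed as-is, with no stopword removal.
TAGS_MAPPING_EXACT = {
    "type": "string",
    "index_name": "tag",
    "index": "not_analyzed",
}

if __name__ == "__main__":
    print(analyze("the"))             # a stopword-only tag yields no terms
    print(analyze("The Art of War"))  # only the non-stopword terms survive
```

A tag consisting only of a stopword produces no terms at all under this kind of analysis, which is why a search for it finds nothing. Whether exact-match indexing is the right trade-off for tags (it would also make matching case-sensitive) is a separate design question.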
>> 
>> My question: is this an intentional decision to discourage trivial tags? If so, wouldn't it make sense to prevent such tags from being created, since they cannot be found by search?
>> 
>> Thanks
>> Gergely
>>  
>> 
>> 
>> _______________________________________________
>> annotator-dev mailing list
>> annotator-dev at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/annotator-dev
>> Unsubscribe: http://lists.okfn.org/mailman/options/annotator-dev
> 