[ckan-dev] tags, valid characters

Jean Pommier jean.pommier at pi-geosolutions.fr
Thu Oct 17 08:58:34 UTC 2019


Hi list,

[Using CKAN 2.8.2]

There is something I don't get about tags.

I'm working on a French CKAN instance, so we tend to use accented 
characters. So far so good: accented characters seem to be accepted.

Recently, I had trouble with some invalid tags that were inserted 
during harvesting (I still have to investigate how this happened). 
This led me to look at the tag validation regexp:
https://github.com/ckan/ckan/blob/master/ckan/logic/validators.py#L430

And now I don't understand anymore:

  * How come the '[\w \-.]*$' regexp accepts accents?
  * The French error message shown when validation fails states that only
    lowercase characters, numbers and the symbols -_. are allowed
    (https://github.com/ckan/ckan/blob/master/ckan/i18n/fr/LC_MESSAGES/ckan.po#L1852)
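
To reproduce what I am seeing outside CKAN, here is a minimal sketch. It
assumes Python 3, where \w in a str pattern is Unicode-aware by default;
the re.ASCII variant is only there for comparison, and the helper name
and sample tags are mine, not CKAN's.

```python
import re

# Same character class as the pattern quoted above.
TAG_PATTERN = r'[\w \-.]*$'

# Python 3 default for str patterns: \w matches Unicode word characters.
unicode_re = re.compile(TAG_PATTERN)
# For comparison only: \w restricted to [a-zA-Z0-9_].
ascii_re = re.compile(TAG_PATTERN, re.ASCII)


def accepts(regex, tag):
    """Hypothetical helper: True if the whole tag matches the pattern."""
    return regex.match(tag) is not None


# Illustrative tags: accented, uppercase, plain, and one with a '/'.
for tag in ["données", "Environnement", "open-data_2019", "bad/tag"]:
    print(tag, accepts(unicode_re, tag), accepts(ascii_re, tag))
```

With Unicode-aware matching, the accented and uppercase tags both pass;
only the ASCII-only variant rejects the accented one. Neither variant
rejects uppercase letters.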

I have the feeling some information needs to be updated, at least the 
French error message, since uppercase letters and even accented 
characters seem to pass.

What is the current policy on tags? Are all letters (including accented 
ones), numbers and the symbols -_. allowed?

Thanks in advance for any clarification,

Best,

Jean


-- 

*Jean Pommier -- pi-Geosolutions*

Engineer, independent consultant

Tel.: (+33) 6 09 23 21 36
E-mail : jp at pi-geosolutions.fr
Web : www.pi-geosolutions.fr
