[the-datatank] Source encoding

Jan Vansteenlandt vansteenlandt.jan at gmail.com
Tue Mar 25 09:57:30 UTC 2014


Hi list!

I want to raise a small configuration issue we ran into recently concerning
the encoding of data when it is extracted from its source. Currently the
data is expected to be UTF-8, but people often don't know what encoding
their data is in, and sometimes there is no way of converting it to UTF-8.

We therefore propose letting you specify the encoding of the source, after
which we convert the data to UTF-8 internally (at a small performance
cost). If no encoding is specified, we expect the data to be UTF-8 or
ISO 8859-1.
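
To make the proposal concrete, here is a rough sketch of the conversion step.
It is written in Python purely for illustration and is not The DataTank's
actual code; the function name `to_utf8` and its `encoding` parameter are
hypothetical. It also assumes the default case is handled by trying UTF-8
first and falling back to ISO 8859-1, which is one possible reading of the
default described above.

```python
from typing import Optional


def to_utf8(raw: bytes, encoding: Optional[str] = None) -> bytes:
    """Return `raw` re-encoded as UTF-8 (hypothetical helper)."""
    if encoding is not None:
        # The user told us the source encoding: decode with it, then
        # re-encode. This extra pass is where the performance cost comes from.
        return raw.decode(encoding).encode("utf-8")

    try:
        # Default case, assumption: try UTF-8 first.
        raw.decode("utf-8")
        return raw  # already valid UTF-8, no conversion needed
    except UnicodeDecodeError:
        # Fall back to ISO 8859-1. Every byte value is valid in that
        # encoding, so this decode cannot fail, but it can silently
        # misread data that was really in some other encoding.
        return raw.decode("iso-8859-1").encode("utf-8")
```

The fallback is the reason an explicit setting is still useful: a source in,
say, Windows-1250 would pass through the default path without an error but
come out with the wrong characters.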

Looking forward to your feedback on this matter.

-- 
Best regards,

Jan Vansteenlandt