[okfn-labs] Find country names in blobs of unknown text

Rufus Pollock rufus.pollock at okfn.org
Fri Jun 13 15:57:06 UTC 2014


On 13 June 2014 16:54, Thomas Levine <_ at thomaslevine.com> wrote:

> I'm looking for a function or regular expression that finds country names
> in blobs of text. This can just be something that does a bunch of exact
> string matches so that it doesn't matter whether the source blob (company
> names in my case) is spelled "Aecom New Zealand Limited",
> "Aecom (New Zealand)", "Aecom, New Zealand", or "New Zealand". Has someone
> released something like this?
>

I don't have an answer here, but I imagine no one has written this as a
library (though I may be wrong!).


> If I don't see an answer soon, I'm going to write a regular expression
> that matches with a bunch of country names from some country name dataset.
>

If you want standard ISO country names:

http://data.okfn.org/data/core/country-list

If you want ISO names plus some other codes and French translations:

http://data.okfn.org/data/core/country-codes
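
For what it's worth, the regular-expression approach could be sketched
roughly as below. This is only an illustration under stated assumptions:
the short COUNTRY_NAMES list is a hypothetical stand-in for names loaded
from one of the datasets above, and the function names are made up for the
example. It does exact, case-insensitive matching on whole words, so it
will not handle misspellings or unusual spacing.

```python
import re

# Hypothetical sample list; in practice, load the names from a
# country-name dataset instead of hard-coding them.
COUNTRY_NAMES = ["New Zealand", "Niger", "Nigeria", "France", "United Kingdom"]


def build_country_pattern(names):
    """Compile one regex that matches any of the given names as whole words."""
    # Try longer names first so the alternation prefers the longest match.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # \b stops "Niger" from matching inside "Nigerian".
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def find_countries(text, names=COUNTRY_NAMES):
    """Return the canonical spelling of each country name found in text."""
    pattern = build_country_pattern(names)
    canonical = {name.lower(): name for name in names}
    return [canonical[m.group(0).lower()] for m in pattern.finditer(text)]


if __name__ == "__main__":
    for blob in ["Aecom New Zealand Limited", "Aecom (New Zealand)",
                 "Aecom, New Zealand", "New Zealand"]:
        print(blob, "->", find_countries(blob))
```

Sorting the names longest-first is a precaution for names that are
prefixes of other names; the word-boundary anchors do most of the work
here.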

Rufus