[okfn-labs] Find country names in blobs of unknown text

Michael Bauer michael.bauer at okfn.org
Fri Jun 13 16:03:29 UTC 2014


Thomas,

Could you use Opennames (hrm. Nomenklatura) for something like this?

For example, add the ISO country list as the canonical names and then map alternate spellings onto them?
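
Something like the following is roughly what I have in mind, as a minimal sketch only. It does not use Nomenklatura's API at all; the alias table (COUNTRY_ALIASES) and the function names (build_pattern, find_countries) are made up for illustration, and a real table would be filled from an ISO 3166 country dataset plus whatever alternate spellings get curated.

```python
import re

# Hypothetical alias table: ISO 3166-1 alpha-2 code -> known spellings.
# A real one would be loaded from a country dataset, not typed by hand.
COUNTRY_ALIASES = {
    "NZ": ["New Zealand", "Aotearoa"],
    "GB": ["United Kingdom", "Great Britain", "UK"],
    "US": ["United States of America", "United States", "USA"],
    "NE": ["Niger"],
    "NG": ["Nigeria"],
}


def build_pattern(aliases):
    """Compile one case-insensitive regex that matches any known spelling."""
    names = [name for spellings in aliases.values() for name in spellings]
    # Try longer names first so "United States of America" wins over
    # "United States" when both could match at the same position.
    names.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookarounds stop "Niger" from matching inside "Nigeria".
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


# Lower-cased spelling -> ISO code, used to normalise whatever the regex hits.
_LOOKUP = {
    name.lower(): code
    for code, spellings in COUNTRY_ALIASES.items()
    for name in spellings
}
_PATTERN = build_pattern(COUNTRY_ALIASES)


def find_countries(text):
    """Return ISO codes of countries named in text, in order, without repeats."""
    found = []
    for match in _PATTERN.finditer(text):
        code = _LOOKUP[match.group(0).lower()]
        if code not in found:
            found.append(code)
    return found


for blob in ["Aecom New Zealand Limited", "Aecom (New Zealand)",
             "Aecom, New Zealand", "New Zealand"]:
    print(blob, "->", find_countries(blob))
```

This only does exact matches on the listed spellings, so typos and odd whitespace will slip through; that is the part where a curated alias list would earn its keep.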

Michael

On Fri, Jun 13, 2014 at 11:54:28AM -0400, Thomas Levine wrote:
> I'm looking for a function or regular expression that finds country names in blobs of text.
> This can just be something that does a bunch of exact string matches so that it doesn't matter
> whether the source blob (company names in my case) is spelled "Aecom New Zealand Limited",
> "Aecom (New Zealand)", "Aecom, New Zealand", or "New Zealand". Has someone released something
> like this?
> 
> If I don't see an answer soon, I'm going to write a regular expression that matches with a
> bunch of country names from some country name dataset.

-- 
Data Diva | skype: mihi_tr | @mihi_tr
Open Knowledge | School of Data
http://okfn.org | http://schoolofdata.org 
GPG/PGP key: http://tentacleriot.eu/mihi.asc


