[okfn-labs] Find country names in blobs of unknown text
Thomas Levine
_ at thomaslevine.com
Sun Jun 15 13:03:38 UTC 2014
I suppose I could keep working on it, but I'm kind of done with it for the present project.
http://small.dada.pink/match-companies/reconcile.py
Does anyone else want something better? If not, I'm inclined to wait until someone asks for better.
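
For anyone who wants a starting point, here is a minimal sketch of the exact-string-match idea from the original request quoted below. It is not the contents of reconcile.py, just an illustration. The names COUNTRY_NAMES, build_country_pattern and find_countries are made up for this example, and the short hard-coded list stands in for a real country name dataset such as the ISO list.

```python
import re

# Hypothetical sample list; a real run would load names (and alternate
# spellings) from a country name dataset instead.
COUNTRY_NAMES = ["New Zealand", "Australia", "United Kingdom", "Niger", "Nigeria"]


def build_country_pattern(names):
    """Compile one case-insensitive regex that matches any of the names."""
    # Try longer names first so the longest candidate is preferred.
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # Word boundaries stop "Niger" from matching inside "Nigeria".
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def find_countries(text, names=COUNTRY_NAMES):
    """Return the canonical spelling of every country name found in text."""
    canonical = {name.lower(): name for name in names}
    pattern = build_country_pattern(names)
    return [canonical[m.group(0).lower()] for m in pattern.finditer(text)]


if __name__ == "__main__":
    for blob in ["Aecom New Zealand Limited", "Aecom (New Zealand)",
                 "Aecom, New Zealand", "New Zealand"]:
        print(blob, "->", find_countries(blob))
```

This only handles exact spellings, so variants like "NZ" would need to be added to the list as aliases or reconciled some other way.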
On June 13, 2014 1:04:30 PM EDT, Friedrich Lindenberg <friedrich at pudo.org> wrote:
>On that note: http://opennames.org/datasets/iso-countries
>
>- Friedrich
>
>On 13 Jun 2014, at 19:03, Michael Bauer <michael.bauer at okfn.org> wrote:
>
>> Thomas,
>>
>> Could you use Opennames (hrm. Nomenklatura) for something like this?
>>
>> e.g. add in the ISO country list and then work on alternate spellings?
>>
>> Michael
>>
>> On Fri, Jun 13, 2014 at 11:54:28AM -0400, Thomas Levine wrote:
>>> I'm looking for a function or regular expression that finds country names in blobs of text.
>>> This can just be something that does a bunch of exact string matches so that it doesn't matter
>>> whether the source blob (company names in my case) is spelled "Aecom New Zealand Limited",
>>> "Aecom (New Zealand)", "Aecom, New Zealand", or "New Zealand". Has someone released something
>>> like this?
>>>
>>> If I don't see an answer soon, I'm going to write a regular expression that matches with a
>>> bunch of country names from some country name dataset.
>>> _______________________________________________
>>> okfn-labs mailing list
>>> okfn-labs at lists.okfn.org
>>> https://lists.okfn.org/mailman/listinfo/okfn-labs
>>> Unsubscribe: https://lists.okfn.org/mailman/options/okfn-labs
>>
>> --
>> Data Diva | skype: mihi_tr | @mihi_tr
>> Open Knowledge | School of Data
>> http://okfn.org | http://schoolofdata.org
>> GPG/PGP key: http://tentacleriot.eu/mihi.asc