[okfn-labs] Find country names in blobs of unknown text

Pieter Colpaert pieter.colpaert at okfn.org
Fri Jun 13 15:56:58 UTC 2014


Hi Thomas,

You might have a look at DBpedia Spotlight:
http://spotlight.dbpedia.org

This looks like a "named entity recognition" (NER) problem.
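
If a full NER service turns out to be more than you need, the exact-match idea from your message could be sketched roughly as below. This is only an illustration: the short COUNTRY_NAMES list is a made-up sample standing in for whatever country name dataset you pick, and build_country_pattern and find_countries are hypothetical helper names, not part of any released library.

```python
import re

# Illustrative sample only (an assumption); a real run would load the full
# list from a country name dataset of your choice.
COUNTRY_NAMES = ["New Zealand", "Belgium", "Guinea", "Guinea-Bissau",
                 "Niger", "Nigeria"]


def build_country_pattern(names):
    """Compile one case-insensitive regex matching any of the given names."""
    # Longest names first, so "Guinea-Bissau" is tried before "Guinea".
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # The lookarounds stop a match from starting or ending inside a longer
    # word, so "Nigerian" does not count as "Niger" or "Nigeria".
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE)


COUNTRY_RE = build_country_pattern(COUNTRY_NAMES)


def find_countries(text):
    """Return every country name found in text, in order of appearance."""
    return [match.group(0) for match in COUNTRY_RE.finditer(text)]


if __name__ == "__main__":
    for blob in ["Aecom New Zealand Limited", "Aecom (New Zealand)",
                 "Aecom, New Zealand", "New Zealand"]:
        print(blob, "->", find_countries(blob))
```

Exact matching like this will miss alternate spellings and abbreviations unless they are added to the list, which is where an NER tool may still earn its keep.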

Kind regards,

Pieter

On 2014-06-13 17:54, Thomas Levine wrote:
> I'm looking for a function or regular expression that finds country names in blobs of text.
> This can just be something that does a bunch of exact string matches so that it doesn't matter
> whether the source blob (company names in my case) is spelled "Aecom New Zealand Limited",
> "Aecom (New Zealand)", "Aecom, New Zealand", or "New Zealand". Has someone released something
> like this?
>
> If I don't see an answer soon, I'm going to write a regular expression that matches with a
> bunch of country names from some country name dataset.


-- 

+32 486 74 71 22

Open Knowledge Foundation Belgium
http://okfn.be

Open Transport Working Group OKFN
http://transport.okfn.org



