[okfn-labs] Named entities (was: Find country names in blobs of unknown text)

Friedrich Lindenberg friedrich at pudo.org
Wed Jun 18 17:18:59 UTC 2014


Hey dagoneye, everybody,

That’s really interesting, I like the stuff you’re building. I’m wondering if anyone on this list has experience with setting up targeted NER services that run against a pre-defined, restricted set of entities?

My first guess would be to use NLTK or StanfordNER, but I’m sure there must be much better approaches? 
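
For a small, fixed entity list, a plain dictionary lookup might already go a long way before reaching for a full NER toolkit. Below is a minimal sketch in Python using only the standard library. The alias table and the function names (GAZETTEER, build_matcher, find_entities) are made up for illustration and do not come from NLTK, StanfordNER or dataTXT; matching is case-sensitive on purpose, so that the pronoun "us" is not read as "US".

```python
import re

# Hypothetical alias table: canonical entity -> known surface forms.
# The entries are illustrative assumptions, not a real data set.
GAZETTEER = {
    "United States": ["United States", "USA", "U.S.A.", "US"],
    "New Zealand": ["New Zealand", "NZ"],
}


def build_matcher(gazetteer):
    """Compile one regex covering every alias, plus an alias -> entity map."""
    alias_to_entity = {}
    for entity, aliases in gazetteer.items():
        for alias in aliases:
            alias_to_entity[alias] = entity
    # Try longer aliases first so "USA" is preferred over "US".
    ordered = sorted(alias_to_entity, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(a) for a in ordered) + r")(?!\w)"
    )
    return pattern, alias_to_entity


def find_entities(text, pattern, alias_to_entity):
    """Return (matched text, canonical entity) pairs in order of appearance."""
    return [(m.group(0), alias_to_entity[m.group(0)]) for m in pattern.finditer(text)]


pattern, alias_to_entity = build_matcher(GAZETTEER)
hits = find_entities(
    "Aecom (New Zealand) opened offices in the U.S.A. and the US.",
    pattern,
    alias_to_entity,
)
```

This only finds spellings that are listed in the table, so it is a baseline to compare other approaches against rather than a replacement for them.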

Cheers, 

- Friedrich 

On 18 Jun 2014, at 20:06, dagoneye <matt at blog.dagoneye.it> wrote:

> Hi guys,
> speaking of named entity extraction, why not also try dataTXT-NEX [1]?
> It's a named entity extraction/linking REST API, and there is a nice library
> for python users:
> https://github.com/SpazioDati/python-dandelion-eu#python-dandelion-eu
> 
> Some demos:
> https://dandelion.eu/products/datatxt/nex/demo/?text=Aecom%20(New%20Zealand)&exec=true#results
> or something like this:
> https://dandelion.eu/products/datatxt/nex/demo/?text=Aecom%20New%20Zealand%20Limited&exec=true#results 
> 
> There are different ways to write USA (like US, or U.S.A. etc.). If dataTXT
> doesn't match all of them with the same entity, it's possible to improve and
> customize the match against every entity using the "custom spot" feature [2].
> It works on organization names too, but only if these organizations have a
> Wikipedia page (only for now; from September it'll be possible to extend
> the graph).
> 
> There are free plans for non-profit, research and educational uses (if you
> need more than 1000 calls a day for NEX).
> 
> Some background context on semanticweb.com (dataTXT isn't based on NLP):
> http://semanticweb.com/dandlions-new-bloom-family-semantic-text-analysis-apis_b41172
> 
> Matteo 
> --
> @dagoneye
> 
> full disclosure: I work for SpazioDati, the company behind dataTXT
> 
> [1] - https://dandelion.eu/products/datatxt/
> [2] - https://dandelion.eu/docs/api/datatxt/custom-spots/v1/
> 
