[okfn-labs] Find country names in blobs of unknown text

dagoneye matt at blog.dagoneye.it
Wed Jun 18 17:06:17 UTC 2014


Hi guys,
speaking of named entity extraction, why not also try dataTXT-NEX[1]?
It's a named entity extraction/linking REST API, and there is a nice library
for Python users:
https://github.com/SpazioDati/python-dandelion-eu#python-dandelion-eu
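
If you'd rather call the REST API directly, here is a minimal sketch of how
a request URL could be built with only the standard library. The endpoint
and the parameter names (`text`, `$app_id`, `$app_key`) are assumptions on
my part here, so check them against the API docs before relying on them.
The helper name `build_nex_url` is just for illustration.

```python
from urllib.parse import urlencode

# Assumption: endpoint and parameter names may differ from the live API;
# verify them in the official dataTXT-NEX reference.
NEX_ENDPOINT = "https://api.dandelion.eu/datatxt/nex/v1"


def build_nex_url(text, app_id, app_key, endpoint=NEX_ENDPOINT):
    """Return a GET URL asking the service to annotate `text`.

    The text is URL-encoded, so spaces and parentheses are safe to pass.
    """
    params = {"text": text, "$app_id": app_id, "$app_key": app_key}
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder credentials; replace with your own before sending a request.
    print(build_nex_url("Aecom (New Zealand)", "YOUR_APP_ID", "YOUR_APP_KEY"))
```

The resulting URL can then be fetched with any HTTP client; the response
should be JSON, but its exact shape is something to look up in the docs.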

Some demos:
https://dandelion.eu/products/datatxt/nex/demo/?text=Aecom%20(New%20Zealand)&exec=true#results
or something like this:
https://dandelion.eu/products/datatxt/nex/demo/?text=Aecom%20New%20Zealand%20Limited&exec=true#results 

USA can be written in different ways (US, U.S.A., etc.). If dataTXT doesn't
match all of them to the same entity, you can improve and customize the
match for any entity using the "custom spots" feature.[2]
It works on organization names too, but for now only if the organization has
a Wikipedia page. (From September it will be possible to extend the graph.)
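
To make the idea concrete: the point is to map several surface forms onto
one entity. The sketch below is NOT the custom spots API, just a local
illustration of that mapping in plain Python. The alias table and the
function names are hypothetical.

```python
import re

# Hypothetical alias table: normalized surface form -> canonical entity label.
ALIASES = {
    "us": "United States",
    "usa": "United States",
    "united states": "United States",
    "united states of america": "United States",
}


def normalize(name):
    """Lowercase, drop dots and collapse whitespace, e.g. 'U.S.A.' -> 'usa'."""
    cleaned = name.lower().replace(".", "")
    return re.sub(r"\s+", " ", cleaned).strip()


def canonical_entity(name):
    """Return the canonical label for a known alias, or None if unknown."""
    return ALIASES.get(normalize(name))


if __name__ == "__main__":
    for variant in ["US", "U.S.A.", "United  States", "Narnia"]:
        print(variant, "->", canonical_entity(variant))
```

A lookup table like this only covers the variants you list by hand, which
is why an entity-linking service is attractive for messy input.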

If you need more than 1000 calls a day for NEX, there are free plans for
non-profit, research and educational uses.

Some background on semanticweb.com (note that dataTXT isn't based on NLP):
http://semanticweb.com/dandlions-new-bloom-family-semantic-text-analysis-apis_b41172

 Matteo 
 --
 @dagoneye

Full disclosure: I work for SpazioDati, the company behind dataTXT.

[1] - https://dandelion.eu/products/datatxt/
[2] - https://dandelion.eu/docs/api/datatxt/custom-spots/v1/





