[open-linguistics] ANN: NLP Interchange Format (NIF) 1.0 Spec, Demo and Reference Implementation

John McCrae jmccrae at cit-ec.uni-bielefeld.de
Thu Dec 1 22:28:14 UTC 2011


Hi,

I was wondering if it would be possible in any way to use NIF embedded in an
XML document, perhaps by means of RDFa? From my (admittedly small)
knowledge of RDFa and reading the NIF documentation it seems like something
like this would be a good guess:

<div rel="string:subString">
  <span property="string:anchorOf" rel="sso:posTag" href="&penn;NNP">Mary</span>
  <span property="string:anchorOf" rel="sso:posTag" href="&penn;VBZ">is</span>
  <span property="string:anchorOf" rel="sso:posTag" href="&penn;JJ">happy</span>.
</div>

The RDF interpretation of this is then

_:n1 string:anchorOf "Mary"
_:n1 sso:posTag penn:NNP
_:n2 string:anchorOf "is"
_:n2 sso:posTag penn:VBZ
_:n3 string:anchorOf "happy"
_:n3 sso:posTag penn:JJ

Of course, there are some obvious problems:

   - We don't really know where in the document the blank node lies,
   without referring to the original document
   - I think it may be impossible to represent multiple properties on the
   same node
   - It is way too verbose.


Something like this would be much better in my opinion:

<text>
  <span sso:posTag="&penn;NNP">Mary</span>
  <span sso:posTag="&penn;VBZ">is</span>
  <span sso:posTag="&penn;JJ">happy</span>.
</text>

A converter would then generate the appropriate NIF triples from this markup.
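
To make the idea concrete, here is a rough Python sketch of such a converter. It walks the inline markup, tracks character offsets in the plain text, and emits one offset-based subject per annotated span, so the position problem of the blank-node version goes away. Several things in it are assumptions rather than anything taken from the NIF spec: the function name inline_to_triples is hypothetical, the sso and penn namespace URIs are guesses (only the string namespace appears in the announcement), the base URI is a placeholder, and only flat, non-nested spans are handled.

```python
import xml.etree.ElementTree as ET

STR_NS = "http://nlp2rdf.lod2.eu/schema/string/"   # from the announcement
SSO_NS = "http://nlp2rdf.lod2.eu/schema/sso/"      # assumption
PENN_NS = "http://purl.org/olia/penn.owl#"         # assumption

# The &penn; entity and the sso: prefix must be declared for the XML to parse.
DOC = f"""<!DOCTYPE text [<!ENTITY penn "{PENN_NS}">]>
<text xmlns:sso="{SSO_NS}"><span sso:posTag="&penn;NNP">Mary</span> <span sso:posTag="&penn;VBZ">is</span> <span sso:posTag="&penn;JJ">happy</span>.</text>"""


def inline_to_triples(xml_source, base_uri):
    """Turn flat <span sso:posTag="..."> markup into (s, p, o) string triples."""
    root = ET.fromstring(xml_source)
    pos_attr = f"{{{SSO_NS}}}posTag"  # ElementTree spells namespaced names as {uri}local
    triples = []
    offset = len(root.text or "")     # text before the first span
    for child in root:
        token = child.text or ""
        begin, end = offset, offset + len(token)
        tag = child.get(pos_attr)
        if tag is not None:
            # Subject follows the offset_<begin>_<end>_<text> pattern of the announcement.
            subject = f"<{base_uri}#offset_{begin}_{end}_{token}>"
            triples.append((subject, f"<{STR_NS}anchorOf>", f'"{token}"'))
            triples.append((subject, f"<{SSO_NS}posTag>", f"<{tag}>"))
        offset = end + len(child.tail or "")  # skip text between spans
    return triples


for s, p, o in inline_to_triples(DOC, "http://example.org/doc"):
    print(s, p, o, ".")
```

A real converter would also need to escape the anchored text in the URI and in the literal, and decide how nested spans map to substrings.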

Have you considered anything along these lines yet?

Regards,
John McCrae

On Mon, Nov 28, 2011 at 8:34 AM, Sebastian Hellmann <
hellmann at informatik.uni-leipzig.de> wrote:

> The Natural Language Processing Interchange Format (NIF) is an
> RDF/OWL-based format that aims to achieve interoperability between Natural
> Language Processing (NLP) tools, language resources and annotations. The
> core of NIF consists of a vocabulary, which can represent Strings as RDF
> resources. A special URI Design is used to pinpoint annotations to a part
> of a document. These URIs can then be used to attach arbitrary annotations
> to the respective character sequence. Employing these URIs, annotations can
> be published on the Web as Linked Data and interchanged between different
> NLP tools and applications.
>
> In order to simplify the combination of tools, improve their
> interoperability and facilitate the use of Linked Data, we developed the
> NLP Interchange Format (NIF). NIF addresses the interoperability problem on
> three layers: the structural, conceptual and access layer. NIF is based on
> a Linked Data enabled URI scheme for identifying elements in (hyper-) texts
> (structural layer) and a comprehensive ontology for describing common NLP
> terms and concepts (conceptual layer). NIF-aware applications will produce
> output (and possibly also consume input) adhering to the NIF ontology as
> REST services (access layer). Unlike more centralized solutions such as
> UIMA and GATE, NIF enables the creation of heterogeneous, distributed and
> loosely coupled NLP applications, which use the Web as an integration
> platform. Another benefit is that a NIF wrapper has to be created only
> once for a particular tool, but enables the tool to interoperate with a
> potentially large number of other tools without additional adaptations.
> Ultimately, we envision the emergence of an ecosystem of NLP tools and
> services that use NIF for exchanging and integrating rich annotations.
>
> We designed NIF to be very light-weight and to reduce the number of
> triples to achieve better scalability. The following triples in N3 syntax
> express that the string “W3C” on
> http://www.w3.org/DesignIssues/LinkedData.html (index 22849 to 22852) is
> linked to the DBpedia resource of “World_Wide_Web_Consortium”:
>
> @prefix ld: <http://www.w3.org/DesignIssues/LinkedData.html#> .
> @prefix str: <http://nlp2rdf.lod2.eu/schema/string/> .
> @prefix dbo: <http://dbpedia.org/ontology/> .
> @prefix dbpedia: <http://dbpedia.org/resource/> .
> @prefix scms: <http://ns.aksw.org/scms/> .
> @prefix nerd: <http://nerd.eurecom.fr/ontology#> .
> ld:offset_22849_22852_W3C str:anchorOf "W3C" .
> ld:offset_22849_22852_W3C scms:means dbpedia:World_Wide_Web_Consortium .
> ld:offset_22849_22852_W3C a dbo:Organisation , nerd:Organization .
>
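
As an illustration of the offset-based URI pattern used in the triples above, the following Python sketch builds such a URI from a document URI, a begin and end index, and the anchored text, and reads the parts back out again. It is only a sketch under assumptions: the function names make_offset_uri and parse_offset_uri are hypothetical, percent-encoding the anchored text is a guess at how non-ASCII or reserved characters would be handled, and none of this is taken from the Java reference implementation.

```python
import re
from urllib.parse import quote, unquote

# Fragment layout as it appears in the announcement: offset_<begin>_<end>_<text>
_FRAGMENT = re.compile(r"offset_(\d+)_(\d+)_(.*)")


def make_offset_uri(doc_uri, begin, end, text):
    """Build an offset-based URI for the substring text at [begin, end)."""
    if end - begin != len(text):
        raise ValueError("offsets do not match the length of the anchored text")
    # Percent-encoding is an assumption, so that any string yields a valid fragment.
    return f"{doc_uri}#offset_{begin}_{end}_{quote(text, safe='')}"


def parse_offset_uri(uri):
    """Recover (begin, end, text) from a URI produced by make_offset_uri."""
    _, _, fragment = uri.partition("#")
    match = _FRAGMENT.fullmatch(fragment)
    if match is None:
        raise ValueError(f"not an offset-based URI: {uri}")
    return int(match.group(1)), int(match.group(2)), unquote(match.group(3))


uri = make_offset_uri(
    "http://www.w3.org/DesignIssues/LinkedData.html", 22849, 22852, "W3C"
)
print(uri)
print(parse_offset_uri(uri))
```

Because the indices are part of the URI, a consumer holding the original document can check that the slice from begin to end equals the str:anchorOf value before trusting an annotation.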
> NIF already incorporates the Ontologies of Linguistic Annotation (OLiA,
> http://nachhalt.sfb632.uni-potsdam.de/owl/)
> and the Named Entity Recognition and Disambiguation (NERD,
> http://nerd.eurecom.fr/ontology/)
> ontology. Please get in contact if you know of further NLP ontologies
> that we can reuse and integrate in NIF.
>
> This release consists of the following items:
> 1. The specification of NIF 1.0 ( http://nlp2rdf.org/nif-1-0 ). This
> document will guide the further implementation of NIF-enabled services. An
> average wrapper requires around 200-500 lines of code. The spec integrates
> several domain ontologies (OLiA, NERD) and will be extended in the future
> to cover more domains.
> 2. A community portal ( http://nlp2rdf.org )
> -- Mailing list (nlp2rdf at lists.informatik.uni-leipzig.de) -
> http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp2rdf
> -- Read how to get involved ( http://nlp2rdf.org/get-involved )
> 3. A reference implementation of NIF 1.0 in Java
> -- Release 1.2 ( http://code.google.com/p/nlp2rdf/downloads/detail?name=nlp2rdf-1.2.tar.gz )
> -- Source code ( http://code.google.com/p/nlp2rdf/ )
> 4. Wrapper implementations for Stanford CoreNLP, SnowballStemmer, OpenNLP,
> MontyLingua, DBpedia Spotlight, UIMA, GATE (for ANNIE and also generic
> output), Mallet (alpha)
> -- Demo GUI (with links to implementations): http://nlp2rdf.lod2.eu/demo.php
> -- List of implementations: http://nlp2rdf.org/implementations
> 5. Tutorials and Tutorial Challenges ( http://nlp2rdf.org/tutorials-challenge )
> -- Tutorial: How to call a NIF web service with your favorite SemWeb
> library - http://nlp2rdf.org/tutorials/tutorial-how-to-call-a-nif-webservice-with-your-favorite-semweb-library
> -- Tutorial Challenge: Semantic Search -
> http://nlp2rdf.org/tutorial-challenges/tutorial-challenge-semantic-search/
> -- Tutorial Challenge: Multilingual Part-Of-Speech Tagger -
> http://nlp2rdf.org/tutorial-challenges/tutorial-challenge-multilingual-part-of-speech-tagger
> -- Tutorial Challenge: Semantic Yellow Pages -
> http://nlp2rdf.org/tutorial-challenges/tutorial-challenge-semantic-yellow-pages
> 6. Slides - http://www.slideshare.net/kurzum/nif-version-10
> 7. A technical report including some evaluation:
> http://svn.aksw.org/papers/2012/WWW_NIF/public.pdf
>
> We would like to thank our colleagues from the AKSW (http://aksw.org)
> research group and the LOD2 (http://lod2.eu) project for their helpful
> comments and inspiring discussions during the development of NIF.
> Especially, we would like to thank Christian Chiarcos (
> http://www.sfb632.uni-potsdam.de/~chiarcos/)
> for his support while using OLiA, the members of the Working Group on Open
> Data in Linguistics (http://linguistics.okfn.org/) and the students that
> participated in the NIF field study: Markus Ackermann, Martin Brümmer,
> Didier Cherix, Marcus Nitzschke, Robert Schulze.
>
> Regards,
> Sebastian Hellmann, Jens Lehmann and Sören Auer
>
> _______________________________________________
> open-linguistics mailing list
> open-linguistics at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-linguistics
>