[annotator-dev] document plugin for annotator

Mon May 13 20:44:43 UTC 2013

Hi all,

As part of some work I've been doing for Hypothesis [1], I've created
a Document plugin that extracts HTML metadata from the page being
annotated (currently Dublin Core, Open Graph Protocol, Google Scholar
metadata, and link relations). For example, here's what the annotation
JSON for a PeerJ article looks like; note the new 'document' key:

    https://gist.github.com/edsu/5570838

The reason for doing this was to provide part of a solution for
annotating documents independently of their location on the web (URL,
DOI, etc.) and their serialization format (PDF, HTML, etc.). In his
recent presentation at iAnnotate [2], Nick identified 4 Hilbert
Problems of annotation, the 2nd of which was the need to annotate
documents, not formats. As he mentions in the talk, a big part of
Hypothesis's contribution in this area has been around something
called Fuzzy Anchoring [3]. However, in order to know when fuzzy
anchors for a resource at one URL can be applied to a resource at
another URL, we found that we needed to know more about the resources
being annotated.

So, for example, when persisting an annotation for this PeerJ article

    https://peerj.com/articles/53/

it is possible to look at the Google Scholar metadata and see that the
annotation should also be relevant for a user viewing the PDF:

    https://peerj.com/articles/53.pdf
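
To make that concrete, here is a rough sketch of the idea in Python.
It is not the plugin's code (the plugin runs in the browser), and the
class name, function name, and sample markup are all made up for
illustration. The one thing it relies on is that Google Scholar style
metadata is expressed as <meta> tags whose names start with
"citation_", such as citation_pdf_url:

```python
from html.parser import HTMLParser


class ScholarMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags.

    Hypothetical helper for illustration; not part of Annotator.
    """

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.startswith("citation_") and content:
            # A page may repeat a tag (e.g. several authors), so keep a list.
            self.metadata.setdefault(name, []).append(content)


def equivalent_urls(page_url, html):
    """Return the page URL plus any PDF URLs advertised in its metadata."""
    parser = ScholarMetaParser()
    parser.feed(html)
    urls = [page_url]
    for pdf_url in parser.metadata.get("citation_pdf_url", []):
        if pdf_url not in urls:
            urls.append(pdf_url)
    return urls


# Invented sample markup, not copied from the real PeerJ page.
SAMPLE_HTML = """
<html><head>
<meta name="citation_title" content="An example article">
<meta name="citation_pdf_url" content="https://peerj.com/articles/53.pdf">
</head><body></body></html>
"""

print(equivalent_urls("https://peerj.com/articles/53/", SAMPLE_HTML))
```

An annotation stored with both URLs could then be looked up from
either the HTML or the PDF view.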

Similarly, an annotation of this paper in the Astrophysics Data System:

    http://adsabs.harvard.edu/abs/2006Natur.444..461V

would also be relevant for the view of the article in Nature at:

    http://www.nature.com/nature/journal/v444/n7118/full/nature05240.html

since they share the same DOI in their metadata.
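
The DOI comparison could be sketched along these lines. Again, this
is only an illustration in Python under some assumptions: the record
shape (a dict with a "doi" list), the function names, and the DOI
values are invented here and are not what the plugin or either site
actually emits. The point is that DOIs show up with different
prefixes and letter case, so they need normalizing before comparison:

```python
import re

# Prefixes commonly seen in front of a DOI in page metadata.
DOI_PREFIXES = re.compile(
    r"^(?:doi:|info:doi/|https?://(?:dx\.)?doi\.org/)", re.IGNORECASE
)


def normalize_doi(value):
    """Strip a leading prefix and lowercase (DOIs are case-insensitive)."""
    return DOI_PREFIXES.sub("", value.strip()).lower()


def dois(document):
    """Return the set of normalized DOIs found in a metadata record."""
    return {normalize_doi(value) for value in document.get("doi", [])}


def same_document(a, b):
    """Treat two records as the same document if they share any DOI."""
    return bool(dois(a) & dois(b))


# Hypothetical records; the DOI below is a placeholder, not the real one.
ads_view = {
    "url": "http://adsabs.harvard.edu/abs/2006Natur.444..461V",
    "doi": ["doi:10.1234/EXAMPLE.5678"],
}
nature_view = {
    "url": "http://www.nature.com/nature/journal/v444/n7118/full/nature05240.html",
    "doi": ["http://dx.doi.org/10.1234/example.5678"],
}

print(same_document(ads_view, nature_view))
```

Matching on any shared identifier keeps the check simple; a stricter
policy might be wanted if publishers reuse identifiers loosely.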

I'm writing here to see if you all will consider a pull request for
the Document plugin [4]. It has unit tests, and I think it can be
considered independently of the other work being done at Hypothesis,
which you will probably be seeing pull requests for shortly as well.

I welcome your feedback and ideas. Thanks!

//Ed

[1] http://hypothes.is
[2] http://www.youtube.com/watch?v=NdVzlLBdiiM
[3] http://hypothes.is/blog/fuzzy-anchoring
[4] https://github.com/okfn/annotator/pull/208