[annotator-dev] JSON-LD as data interchange format [WAS Re: Branching v2.0.x]

Fri Nov 1 00:49:13 UTC 2013

We use a triplestore to store annotations and have our own store plugin to
map from the internal Annotator representation to OA JSON-LD on save and
load:
https://github.com/uq-eresearch/annotator/blob/master/src/plugin/lorestore.coffee

However, storing the annotations in a triplestore is definitely overkill,
and other than the validation service (which could be split off into a
separate service anyway), most of our SPARQL queries are pretty simple, so
in hindsight there's not a lot of benefit to this approach. I expect the
same queries would be faster using something like MongoDB (at one point I
thought it would be interesting to fork the MIT-Annotation-Data-Store,
update it to use OA JSON-LD, and then run some benchmarks to compare the
two stores, but I still need to finish that).

Mapping the internal JSON representation to OA JSON-LD was pretty
straightforward, but I did have to hardcode some mappings based on
assumptions of which plugins we would be using - e.g. to map onto the right
kind of OA selectors because we deal with both text and image annotations,
and to map the internal field created by our motivations plugin to OA
motivation. NB: we're not using the tag plugin, but if we were I would have
represented the tags as multiple bodies, as Rob has suggested.
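
A minimal sketch of this kind of mapping, in plain JavaScript, might look
like the following. It is not the lorestore plugin's code. The function
name toOA, the helper textBody, and the internal "motivation" field are
hypothetical, and the context URL and OA property names follow one reading
of the 2013 OA community draft, so they should be checked against the spec.
Only text annotations are covered; image selectors are left out.

```javascript
// Hypothetical sketch: map an Annotator-style annotation object to OA JSON-LD.
// All names here (toOA, textBody, OA_CONTEXT, annotation.motivation) are
// assumptions for illustration, not the real plugin API.
const OA_CONTEXT = "http://www.w3.org/ns/oa-context-20130208.json";

// Build an embedded textual body; extraTypes distinguishes comments from tags.
function textBody(chars, extraTypes) {
  return { "@type": ["cnt:ContentAsText"].concat(extraTypes), chars: chars };
}

function toOA(annotation) {
  const bodies = [];
  if (annotation.text) {
    bodies.push(textBody(annotation.text, ["dctypes:Text"]));
  }
  // Each tag becomes an additional body on the same annotation.
  (annotation.tags || []).forEach(function (tag) {
    bodies.push(textBody(tag, ["oa:Tag"]));
  });

  // The target is the page, narrowed by a selector when a quote is present.
  const target = { "@type": "oa:SpecificResource", hasSource: annotation.uri };
  if (annotation.quote) {
    target.hasSelector = {
      "@type": "oa:TextQuoteSelector",
      exact: annotation.quote
    };
  }

  const oa = {
    "@context": OA_CONTEXT,
    "@type": "oa:Annotation",
    // Fall back to a default motivation when no plugin supplied one.
    motivatedBy: annotation.motivation || "oa:commenting",
    hasTarget: target
  };
  if (annotation.id) oa["@id"] = annotation.id;
  if (bodies.length === 1) oa.hasBody = bodies[0];
  else if (bodies.length > 1) oa.hasBody = bodies;
  return oa;
}

const example = toOA({
  uri: "http://example.org/page.html",
  text: "Needs a citation",
  quote: "the annotated passage",
  tags: ["todo"]
});
console.log(JSON.stringify(example, null, 2));
```

The hardcoded choices (which selector type to emit, which field carries the
motivation) are exactly the plugin-dependent assumptions mentioned above.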

So we do need to think about how targets and selectors are represented in
the current internal model, because this will become more complicated as it
becomes easier to write plugins for different media types for Annotator
2.0, and particularly if people start writing plugins that annotate
multiple targets (and yes I'm aware I was meant to have written up a
discussion of these very issues and posted it to the list months ago -
sorry for being so slack!)

Anna

On Fri, Nov 1, 2013 at 1:18 AM, Robert Sanderson <azaroth42 at gmail.com> wrote:

>
> Thanks Nick, Ed :)
>
> Great discussion, comments inline...
>
> On Thu, Oct 31, 2013 at 3:40 AM, Nick Stenning <nick at whiteink.com> wrote:
>
>> Ed Summers wrote:
>> > Is it worthwhile considering JSON-LD and OpenAnnotation for v2.0? I
>> > know that OpenAnnotation is mentioned quite a bit w/r/t Hypothes.is
>> > and it seems like JSON-LD would possibly fit in fairly well with the
>> >  annotator.
>>
>> Randall Leeds wrote:
>> > First we would need to resolve the remaining differences between the
>> >  annotator model and the open annotation model. Then JSON-LD can be
>> > added in various places at any time.
>>
>> So, here are a few thoughts:
>>
>> Most people, when they first come across Annotator, have not thought
>> about the deeper semantic modelling of annotation that OA is trying to
>> achieve. They have a specific problem domain and a specific set of goals
>>
>> I think we should be wary of exposing the full complexity of OA to users
>> (or integrators, or plugin developers) unless we are clear on exactly
>> what benefits that brings.
>>
>
> Completely agreed that users do not need to know about OA, nor should
> they.  If there's a badge along the lines of "W3C Open Annotation
> Compliant", then maybe they feel more comfortable knowing that their
> annotations are going to be easy to transfer to other systems (if they even
> think about that or care) but there's no reason why a user would care about
> the details of the model, just that it's being used.
>
> Folk who take the code and deploy it also probably don't care, beyond the
> same ability to swap out products and still be able to maintain their users'
> annotations.  I guess this is what you mean by "integrators", Nick?
>
> External developers (eg not contributing to the core) /may/ care about the
> model, if they have some desired feature that is available in Open
> Annotation but not implemented in Annotator.  If there's a clear way
> forwards (eg just follow the model) then the integration becomes a lot
> easier.
>
> And core developers, I think you guys have a bonus side effect: if the
> model was used internally, you save a lot of documentation writing time as
> you can just point to the spec :)
>
> That said...
>
>
>
>> The mechanism used by Anna Gerber is the one I'd like us to consider
>> pursuing, where Annotator continues to have its own internal model of
>> the annotation, without any conceptual or technological "linked data"
>> overhead. To speak concretely, that means that a plugin still deals with
>> plain ol' JavaScript objects. No contexts, no "@"-symbols.
>>
>
> ... this is perfectly fine too.  Open Annotation is intended as an
> interoperability mechanism between systems that may or may not use it
> internally.  If it's exposed correctly at the edges, the internals aren't
> very important.  You could store in MySQL tables, arbitrary JSON, RDF (OA
> or not) ... no big deal so long as there's a representation that conforms
> to OA that can be retrieved, potentially alongside other representations.
>
> On the other hand, the closer the internal model is to the OA model, the
> easier that transformation is and the easier it is to add in additional
> functionality.  The JavaScript objects can silently add in their @type and
> @id, and the @context at the very end during serialization.  Perhaps it's worth
> building a full mapping between the two models to see how close they are
> already? From earlier such explorations, they seemed very similar.
>
>
>> It's then up to a specific storage engine to render that object into an
>> interchange format of its own choosing. Conveniently, JavaScript
>> objects are more-or-less already an interchange format, so the default
>> store can simply pass the object (or rather a JSON-stringified version
>> thereof) back to the reference annotator-store.
>>
>
> Yes.  If you can turn on OA serialization for OA based stores, and other
> serializations for other stores, this gives a great deal of flexibility
> while allowing both the integrator and/or user to take the future of their
> annotations into their own hands.  It also gives you insurance against
> future changes to the OA model, for example if (hopefully when!) there's a
> full W3C working group for Open Annotation, there may be concessions to be
> made for big industry support.
>
>
>> I am very keen, however, to see a parallel effort (which should be made
>> vastly simpler by the changes now in progress for Annotator 2.0.x) to
>> develop a store plugin which transforms the Annotator internal
>> representation into JSON-LD and uses that as an interchange format.
>> Perhaps someone will even integrate this with an existing triple store?
>>
>
> Anna? :)
>
> In my experience, using a triple store for annotations is overkill. It's
> great that it's RDF and inherits semantics and all the promise of linked
> data, but at the end of the day if there's just a JSON/object store it will
> fulfill 99% of the use cases we know about.
>
>
>> The point here is that it is possible to do useful work as a plugin
>> author without necessarily having to engage with the full complexity of
>> OA data modelling. I don't believe we are providing any particular
>> benefit to plugin authors or integrators by pushing the OA
>> representation through the whole of Annotator.
>>
>
> As above, but perhaps the internal model documentation could provide
> references to the appropriate parts of the OA model for people coming from
> that direction, rather than without any [@]context?
>
> If annotator and annotator-store can receive and transmit OA to each other
> and to other clients that implement the same (very good, as discussed at
> iAnnotate) API, then I think the deal is done.  Whether or not the code
> uses the model internally should be a pragmatic decision, but one that
> should (IMO) be taken seriously.  The advantages of providing a (partial)
> reference implementation for a model that's the 4th largest of 155
> community groups, and adopted by the IDPF (who are responsible for EPUB)
> seem pretty high in terms of gaining new adopters, developers and support.
>
>
> =====
>> And on the specific question of how we model tags... well, this is a
>> nice example of where OA forces us to think more clearly about what we
>> are representing.
>>
>> My opinion is that tags as currently implemented are intended to be
>> alternative bodies of a single annotation.
>
>
>> Rob Sanderson wrote:
>> > In particular, the Open Annotation model says that all bodies of an
>> > annotation are about the target of the annotation.  So if a single
>> > annotation has two tags and one comment, then all three of those
>> > things are about the target(s) of the annotation, not the annotation
>> > itself.
>>
>> Agreed.
>>
>> As such, we can keep the "tags" key as just an array of strings within
>> Annotator, with no loss of generality. If an OA-compatible store plugin
>> then wishes to convert these to additional bodies on the resulting
>> annotation, that's possible.
>>
>
> Yup!  At the edges it would be trivial to go from an array of "string" to
> an array of {"chars": "string"} objects to fit the model.
>
> Thanks!
>
> Rob