[okfn-discuss] Re: Marginalia

Fri Jan 26 19:12:15 UTC 2007

Geoffrey Glass wrote:
[snip]
>> First off, great to hear from you -- I'd been planning to write to you 
>> for the last couple of weeks about web annotation in general and 
>> marginalia specifically. I don't know how much of the discussion on 
>> the list you saw, but we've been looking at marginalia in order to 
>> integrate it into the web interface for open shakespeare:
>>
>> http://demo.openshakespeare.org/

> Wow.  I checked the HTML source for a couple of the plays.  All of the 
> text is in one long pre element.  I'm afraid this means you're going to 
> run into performance problems with Marginalia unless you mark up the 
> text as paragraphs or divs or something (don't pick span - Marginalia 
> doesn't consider inline elements to be useful references).  A simplistic 
> but effective approach I expect would be to divide the document into 
> individual lines, each wrapped with <div class="line"> or <p 
> class="line"> or similar.  Then instead of examining every character 

Yes, I'd thought that there might be a problem with one long pre and had 
already prepared line-numbered versions -- just set format to lineno, as in:

http://demo.openshakespeare.org/view?name=hamlet_gut&format=lineno
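
For illustration, here is a minimal sketch of the line-wrapping step 
suggested above, assuming the play text is available as a plain string. 
The function name and the id scheme are hypothetical; this is not the 
code behind the lineno format.

```python
from html import escape


def wrap_lines(text: str, css_class: str = "line") -> str:
    """Wrap each line of plain text in its own block-level div."""
    wrapped = []
    for number, line in enumerate(text.splitlines(), start=1):
        # escape() stops characters such as < and & being read as markup.
        # The id="l<N>" scheme is an assumption made for this sketch.
        wrapped.append(
            f'<div class="{css_class}" id="l{number}">{escape(line)}</div>'
        )
    return "\n".join(wrapped)
```

Each line then becomes a block element that an annotation can refer to, 
rather than a character offset into one long pre.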

> between the start of the document and a highlighted passage, Marginalia 
> just fires off an XPath expression to grab the line and counts from 
> there.  (IE rears its ugly head here -  I don't think it supports XPath, 
> so Marginalia has to count lines instead.  See PathToNode in 
> marginalia.js).

Is this a change from your previous version? I thought the old version 
used your own custom range object with offsets in words from the start 
of an entry (with an entry designated by special markup inside the 
source document).

> Mind you, if you change the document structure later your annotation 
> references will no longer make sense.  (The dream feature for Marginalia 
> would be the ability to patch broken document pointers by searching for 
> the quoted phrase.  It should be doable, but it's not a priority for me 
> and I haven't investigated how to do the search etc.  An even more 
> advanced version would resolve multiple matches by picking the one 
> closest to the specified document location.)

This starts to sound similar to the approach proposed by Julian Todd 
(http://www.publicwhip.org/). He suggested that you just have a textbox 
in which you copy and paste the piece of text you wish to annotate and 
then the annotation engine figures out which piece of text this is. Of 
course this runs into problems when annotating short ranges.
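
A rough sketch of the repair idea described above: search for the quoted 
phrase and, when there are several matches, pick the one closest to the 
originally recorded location. It assumes the document is a plain string 
and the stored location is a character offset; the function name is 
hypothetical and this is not how Marginalia works today.

```python
from typing import Optional


def reanchor(text: str, quote: str, expected_offset: int) -> Optional[int]:
    """Return the start of the occurrence of quote nearest expected_offset.

    Returns None when the quote is empty or no longer occurs in the text.
    """
    if not quote:
        return None
    best = None
    start = text.find(quote)
    while start != -1:
        # Keep the match whose position is closest to the stored offset.
        if best is None or abs(start - expected_offset) < abs(best - expected_offset):
            best = start
        # Advance by one character so overlapping matches are also seen.
        start = text.find(quote, start + 1)
    return best
```

The short-range problem shows up directly: a one-word quote can match in 
many places, so the stored offset ends up doing most of the work.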

>> Our first step was to port the backend code (i.e. the web annotation 
>> store REST interface) to Python, which we've now done; see:
>>
>> <http://project.knowledgeforge.net/shakespeare/svn/annotater/trunk/>
>
> I just got an email from someone talking about a Plone port of 
> marginalia.  Do you mind if I point him to you?

That would be fine.

>> Coming from this was a general discussion of annotation which for ease 
>> of reading I've just posted at:
>>
>> http://blog.okfn.org/2007/01/24/thinking-about-annotation/
>>
>> Inspired by your own use of Atom, a particular focus was whether one 
>> could agree on a simple set of core attributes defining an annotation, 
>> as this would then allow one to plug and play with regard to the front 
>> and back-end (i.e. one could have different annotation user interfaces 
>> each using the same store and conversely different stores for the same 
>> front-end).
>
> Absolutely!  :)  My use of Atom is rather clumsy.  I really don't think 
> it's appropriate for me to overload the semantics of title, summary, 
> etc. as I do;  it might make more sense to embed the information as a 
> microformat in an entry's content.  But that might make parsing much 
> harder (especially when the content section is escaped with &s 
> everywhere).  For now I'm ignoring the issue.

On the contrary I thought your overloading approach was an excellent 
idea. Among other things it makes it possible (I think) for the 
annotation stream to get pulled in the same way as the blog post stream.
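
To make the "core attributes" idea concrete, here is a minimal sketch of 
an annotation record and a store that any front-end could talk to. Every 
field and method name here is an assumption made for illustration; none 
is taken from Marginalia, its Atom format, or the annotater code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Annotation:
    """A hypothetical minimal set of core attributes."""
    url: str      # the document being annotated
    range: str    # where in the document (format left to the front-end)
    quote: str    # the highlighted text
    note: str     # the annotator's comment
    author: str = ""


class MemoryStore:
    """In-memory stand-in for a store sitting behind a REST interface."""

    def __init__(self) -> None:
        self._items: Dict[int, Annotation] = {}
        self._next_id = 1

    def create(self, annotation: Annotation) -> int:
        """Save an annotation and return its newly assigned id."""
        annotation_id = self._next_id
        self._items[annotation_id] = annotation
        self._next_id += 1
        return annotation_id

    def for_url(self, url: str) -> List[Annotation]:
        """Return all annotations on one document, in creation order."""
        return [a for a in self._items.values() if a.url == url]

    def delete(self, annotation_id: int) -> bool:
        """Remove an annotation; report whether it existed."""
        return self._items.pop(annotation_id, None) is not None
```

Any back-end offering the same three operations could be swapped in 
without the user interface noticing, which is the plug-and-play point.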

>> I completely understand regarding IE -- I'd been doing some playing 
>> around with getting cross-browser support for range operations when I 
>> discovered your work. Of course it would be nice to have IE support, 
>> it's definitely one of the things that set marginalia apart.
>
> This is a common refrain.  It's probably my second priority on this 
> project right now (after making the people who are paying me happy);  
> given I'm doing other things this means I may not get to it for a while 
> (but who knows).

Sure and I can quite understand the desire to not have to deal with IE :)

>> I'll take a look. By the way, have you thought of putting this stuff in 
>> a publicly accessible source-code repository?
>
> I haven't done such a thing before, but should probably look into it 
> when I have more time.
> 
> By the way, your Foundation looks to me like a most noble enterprise.  
> Good luck to you (even if I don't like Shakespeare :)

Thanks (the reason for doing Shakespeare is that it is the one thing 
everyone knows, but the whole point of open shakespeare is that most of 
the toolset will be directly portable to any other set of texts).

Regards,

Rufus