[humanities-dev] Couple of TEXTUS design questions

Wed Mar 14 11:48:58 UTC 2012

I've been hacking away at the reader interface for TEXTUS. This is the
component which presents a text, along with typography and annotations,
and allows users to read it (it also allows for the creation of
annotations, but that's easy so I'm not going to talk about that
here). The currently online project which also does this is
openshakespeare; this works but has a few issues and I'd like to get
some thoughts on whether we actually care about them.

Firstly, openshakespeare loads the entire text into a single web page
(e.g. http://openshakespeare.org/work/hamlet). With modern computers
and browsers this is probably fine: it's a relatively small amount of
data and loads rapidly (or at least it does at home - trying to access
it on my phone over a flaky GPRS connection is not so hot). For TEXTUS
though I was originally envisaging an interface which presented pages,
where a page was 'however much can be put on the screen at the moment'
rather than an underlying page in a transcribed text. There are pros
and cons to both approaches:

Entire text visible
  + Free text search works trivially with browser CTRL+F or similar
  + Easy to copy / paste large sections into other documents
  - Expensive with large numbers of annotations, unless annotations
are loaded based on viewport

Single 'page' visible
  + Better UI for actually reading text, creating bookmarks etc
  + Maps simply onto a sensible UI for tablets
  + Easy to only retrieve and render applicable annotations
  - Search has to be implemented through back-end service (not
entirely a bad thing, can be more flexible but not as quick)

It would be possible to do either, so it's really down to which
presentation style people think is preferable.
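
To illustrate the 'only retrieve and render applicable annotations'
point, here is a rough sketch of selecting annotations for the visible
range. It assumes annotations carry character offsets into the text,
which isn't something settled above; the Annotation class, its field
names and the offsets are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Hypothetical model: half-open character range [start, end) into the text.
    start: int
    end: int
    label: str


def visible_annotations(annotations, view_start, view_end):
    """Return annotations whose range intersects [view_start, view_end)."""
    return [a for a in annotations if a.start < view_end and a.end > view_start]


annotations = [
    Annotation(0, 50, "note on opening line"),
    Annotation(120, 180, "link to another text"),
    Annotation(900, 950, "scanned manuscript image"),
]

# Suppose the current 'page' shows characters 100..400 of the text.
page = visible_annotations(annotations, 100, 400)
print([a.label for a in page])
```

The same test would work for either presentation style: a paginated
view calls it once per page, a scrolling view would call it whenever
the viewport moves.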

Secondly, there's the issue of how we present annotations. In this
context annotations include attributed free text comments, links to
other texts or sections of texts, external links out to images of
scanned manuscripts and potentially many others. Because of this it is
guaranteed that annotations will overlap with one another, something
which the OKFN annotator allows but handles poorly (try creating a few
overlapping annotations on openshakespeare.org and you'll see the
problem). We can avoid some of these issues by making annotations
visible on the 'breadcrumb trail' bar in the UI (a component which
locates the currently visible sub-section of text, e.g. 'Hamlet > Act
II > Scene I'). We will have filtering for annotation types (e.g.
'show me all scanned image links', or 'only show me annotations from
<set of users>') but we're still hopefully going to end up with much
denser annotation than a naive approach will handle. I propose a few
different mechanisms, which would work in concert, to mitigate this:

1) Annotations pertaining to large sections of text, i.e. much more
than currently displayed in the viewport of the browser, will be
indicated by a control on the breadcrumb trail. For example, the
metadata for the entire text (author, title, edition etc) would appear
as an indicator under the root part of the breadcrumb trail. This
allows for annotations which apply to the whole document, or all of a
chapter, without polluting the actual text display too much.

2) Some annotation types would not have their location within the text
shown by default, simply being presented if the text to which they
pertain is visible. This would work well for links back to scans of
the original manuscript, for example - you would see icons for the
scanned images corresponding to whatever text was visible. I don't
think we'd need to specifically say 'scan 1 is from this word to this
other word' as it would be obvious from the scans themselves (we might
even have a specific view to show all scans for the currently visible
text in sequence).

These two ideas both serve to reduce the number of annotations we
actually need to display in the text itself, but we'll still have
quite a few...
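
To make ideas 1 and 2 a bit more concrete, here is a minimal sketch of
deciding where an annotation gets shown. It is only one possible
reading: I'm assuming character offsets for both annotations and
breadcrumb sections, and approximating 'much more than the viewport' by
'covers a whole section on the trail'. The type names, the
ICON_ONLY_KINDS set and the offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    kind: str   # e.g. "comment", "metadata", "scan" (hypothetical type names)
    start: int  # half-open character range [start, end) into the text
    end: int


# Assumption: these kinds get an icon but no marker in the text (idea 2).
ICON_ONLY_KINDS = {"scan"}


def placement(annotation, trail):
    """Decide where an annotation is shown.

    trail: breadcrumb sections as (label, start, end), root first.
    Returns ("breadcrumb", label), ("icon", None) or ("inline", None).
    """
    # Idea 1: attach to the outermost section the annotation fully covers.
    for label, sec_start, sec_end in trail:
        if annotation.start <= sec_start and annotation.end >= sec_end:
            return ("breadcrumb", label)
    # Idea 2: some kinds are listed as icons without showing their extent.
    if annotation.kind in ICON_ONLY_KINDS:
        return ("icon", None)
    return ("inline", None)


# Made-up offsets for the trail 'Hamlet > Act II > Scene I'.
trail = [
    ("Hamlet", 0, 180_000),
    ("Act II", 40_000, 75_000),
    ("Scene I", 40_000, 46_000),
]

print(placement(Annotation("metadata", 0, 180_000), trail))
print(placement(Annotation("comment", 40_000, 75_000), trail))
print(placement(Annotation("scan", 41_000, 43_000), trail))
print(placement(Annotation("comment", 41_000, 41_200), trail))
```

Only annotations that come back as "inline" would need a marker in the
text itself, which is the set the next point is concerned with.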

3) Overlapping annotations are trickier - the annotator's approach
(inserting spans with particular styles) doesn't work for this. One
option would be to render the markers indicating annotations in a more
sophisticated way, for example using an HTML5 canvas behind the text
itself and drawing lines or boxes around annotated text and out to a
marginalia style display. This also has the advantage that multiple
annotations can have their text visible at the same time rather than
using pop-ups. A paginated approach to text display makes this more
practical, but it's possible with the scrollable full text view as
well.
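
As a sketch of how overlapping markers could be kept apart when drawn
behind the text or out in the margin, one option is to give each
annotation a 'lane' such that overlapping annotations never share one.
This is just a greedy interval assignment, not anything the annotator
does today; the function name and the sample ranges are invented, and
it again assumes annotations are character ranges.

```python
def assign_lanes(ranges):
    """Greedy lane assignment: overlapping ranges never share a lane.

    ranges: list of (start, end) half-open character ranges.
    Returns a list of lane indices, parallel to the input.
    """
    lanes = [None] * len(ranges)
    lane_ends = []  # lane_ends[i] = end offset of the last range put in lane i
    # Visit ranges in order of start offset (ties broken by end offset).
    for idx in sorted(range(len(ranges)), key=lambda i: ranges[i]):
        start, end = ranges[idx]
        for lane, last_end in enumerate(lane_ends):
            if last_end <= start:  # previous range in this lane has finished
                lane_ends[lane] = end
                lanes[idx] = lane
                break
        else:
            # Every existing lane is still occupied, so open a new one.
            lane_ends.append(end)
            lanes[idx] = len(lane_ends) - 1
    return lanes


ranges = [(0, 10), (5, 15), (12, 20), (8, 9)]
print(assign_lanes(ranges))  # [0, 1, 0, 2]
```

Each lane could then map to a vertical offset for an underline, or to a
column in the marginalia display, so that three annotations over the
same words are drawn as three distinct marks rather than one muddled
highlight.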

Thoughts?

Tom