[okfn-discuss] A question of history

Rufus Pollock rufus.pollock at okfn.org
Tue May 23 16:38:00 UTC 2006


Matthew Brett wrote:
> Hello to the open-knowledge masters,
> 
> I was hoping for some advice about an open-knowledge problem that I
> have often discussed with my father - and am now hoping to find some
> way forward.
> 
> My father, Martin Brett, happens to be working on an
> open-knowledge type project.  He's an historian, and works on
> something like church history around 1100.  The main work in this area
> is to produce editions of manuscripts from that time.  Manuscripts
> almost invariably have many versions from different times, as the
> original has usually been lost, and the document survives in many
> copies, each of which has been edited, and often merged with other
> documents, by the person transcribing the document.  The work of an
> edition, as far as I understand it, is to take one particular version,
> preferably the version closest to the original, in the opinion of the
> historian, and annotate all important differences between this
> version, and other versions, so that the edition provides a basis for
> other historians to compare versions.

Fascinating. This problem is very similar to the ones that have arisen 
on the Open Shakespeare project, which we have just started (see previous 
email to this list and [1]). There we have multiple editions of a given 
Shakespeare text. What we would really like is some way to hold all 
original manuscript versions of the document, and then be able to compare 
them as well as create new versions based on the originals.

> Usually editions are published by the historians as a book.  The
> problem is that an individual historian may well not have access to
> all the important manuscripts themselves, and editions take an
> enormous amount of work.  This in turn means that the historian often
> has a rather good but incomplete version of the edition on their hard
> disk for many years before enough edits have been done to make it
> acceptable for publication as a book.

And sitting on their hard disk means that knowledge is completely 'dead' ...

> For many years Martin has been trying to work on ways to make such
> provisional editions available online, so that other historians can
> benefit from the work already done, and even contribute to it
> collaboratively.

This is touching on every base in the open knowledge arena.

> Clearly this brings up many open-knowledge issues; licences,
> version control, packaging (Martin Brett version 0.34 as of 12/4/06
> for example), and allowing academic credit to be given to
> contributors.

Yup. Licensing is fairly easy. The original manuscripts are PD (public 
domain) and almost any amended edition is copyright the person who made 
the amendments. So the obvious thing to do would be to put all this stuff 
out under an attribution license (or attribution-sharealike). One could 
also consider some deposit scheme where a text is only released after a 
set period, which would address the fears of those academics who are 
concerned about keeping a head start.

> Another related issue is how to encode the information so that
> it would be easy to change your view of which manuscript was primary,
> and automatically readjust the annotations relative to this other
> manuscript, instead of your first guess as to which was primary.

Yes. To simplify, I think we have the following requirements:

1. Ability to store a raw manuscript (and versions thereof)
2. Ability to compare this manuscript against other manuscripts
3. Ability to annotate a manuscript

Both 2 and 3 require:

4. A marked-up version of the manuscript that allows fine-grained referencing
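
To make requirement 2 (and the earlier point about changing which 
manuscript is primary) a little more concrete, here is a rough sketch in 
Python. It is only an illustration under assumptions: the sigla A, B, C 
and their readings are invented, the names `variants` and `apparatus` 
are hypothetical, and word position is used as a crude stand-in for 
proper fine-grained references. It uses the standard library's difflib 
to list differences relative to whichever version is chosen as the base.

```python
from difflib import SequenceMatcher

# Hypothetical witnesses: the sigla and readings are invented for illustration.
witnesses = {
    "A": "in the year of our lord the king gave the land to the church",
    "B": "in the year of our lord the king granted the land to the abbey",
    "C": "in the year of grace the king gave the land to the church",
}

def variants(base, other):
    """List (word position in base, base reading, other reading) per difference."""
    a = witnesses[base].split()
    b = witnesses[other].split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For pure insertions the base reading is the empty string.
            found.append((i1, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return found

def apparatus(base):
    """Differences of every other witness relative to the chosen base."""
    return {s: variants(base, s) for s in witnesses if s != base}

print(apparatus("A"))  # annotations relative to A
print(apparatus("B"))  # the same comparison, re-based on B
```

Because the differences are computed rather than stored, switching the 
primary manuscript is just a matter of calling `apparatus` with another 
siglum. Real manuscripts would of course need something much more careful 
than splitting on whitespace.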

I feel that, rather than biting off the whole problem at once, it would be 
best to start with 1 (just encoding this kind of text may be complicated) 
and just do plain text. Next, move on to 2, which would also involve some 
kind of markup, and then finally move to 3 (though note that if we were 
happy with something quite crude we could just start 3 as a wiki with 
human-usable references back to the text -- e.g. a wiki page per page of 
the manuscript).
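
As a sketch of what crude, human-usable references might look like for 
plain text, the Python below gives every line a label such as 'p2.l1'. 
The label format, the function name `index_lines`, and the assumption 
that pages are separated by a form feed character are all mine, chosen 
only for illustration.

```python
def index_lines(text, page_break="\f"):
    """Map labels like 'p2.l1' to lines, assuming page_break separates pages."""
    refs = {}
    for page_no, page in enumerate(text.split(page_break), start=1):
        # Skip blank lines so that labels only count lines with text on them.
        lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
        for line_no, line in enumerate(lines, start=1):
            refs[f"p{page_no}.l{line_no}"] = line
    return refs

# Invented sample text: two lines on page one, one line on page two.
sample = "first line of page one\nsecond line\n\fonly line of page two\n"
refs = index_lines(sample)
print(refs["p1.l2"])
print(refs["p2.l1"])
```

An annotation on a wiki page could then cite 'p1.l2' and a reader (or a 
script) could look the line up again.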

My other suggestion is that we try doing this first for a simpler case, 
e.g. Shakespeare! Such a situation could act as a (rather complex) 
'hello world' example where we could address many of these issues in an 
easier environment.

> Could I ask for any thoughts as to how Martin might go about finding a
> useful method to think about all this and implement it

My suggestion for a way to start:
   1. get a plain text (or as close as possible) version of a manuscript
   2. store it in some kind of versioning system (e.g. Subversion)

There will be plenty wrong with this but it will at least give an idea 
of what the next steps would be. Another thing to do would be to try and 
write a brief summary of the purpose of what one is trying to do along 
with some simple 'use cases' (e.g. 'I want to be able to compare two 
texts line by line') and then give them ratings as to how valuable they 
are and how difficult they would be to do -- this will give some 
information as to what one should try and do first.
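
The rating exercise could be as simple as the Python sketch below. The 
use cases and the scores (1 = low, 5 = high) are invented placeholders, 
and ranking by value divided by difficulty is just one plausible rule, 
not the only sensible one.

```python
# Hypothetical use cases: (description, value, difficulty), scores invented.
use_cases = [
    ("Compare two texts line by line", 5, 2),
    ("Annotate a passage", 4, 4),
    ("Switch which manuscript is primary", 4, 5),
    ("Store plain-text versions", 3, 1),
]

def prioritise(cases):
    """Order use cases by value-to-difficulty ratio, highest first."""
    return sorted(cases, key=lambda c: c[1] / c[2], reverse=True)

for name, value, difficulty in prioritise(use_cases):
    print(f"{value / difficulty:.1f}  {name}")
```

With these made-up numbers, storing plain-text versions comes out first 
and switching the primary manuscript last.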

What do you think? Does this sound a good way to approach the problem?

Regards,

Rufus

[1] http://www.okfn.org/wiki/ShakespeareKnowledgePackage



