[open-humanities] OpenLiterature v1.0 - let's do a reboot!

Rufus Pollock rufus.pollock at okfn.org
Fri Jun 5 11:48:08 UTC 2015


On 4 June 2015 at 11:11, Seth Woodworth <seth at sethish.com> wrote:

> The primary output of GITenberg is intended to be epub (but not only).
> We've (probably) solidified on using Asciidoc as it supports most of the
> markup types we need, is 12 years old already, and has more than one
> converter implementation (asciidoc and asciidoctor).
>

The point is that all markup formats like that, whether asciidoc or TEI or
HTML or ..., share one issue: you embed your markup in the text, which I
would suggest is a *bad* idea (tm) here. We went the asciidoc-style route
(in fact closer to markdown) for the original Open Shakespeare, and we ended
up at Textus precisely because of the benefits of separating markup from
plain text. To be clear, Textus isn't a new markup format as such: it is
more the concept of separating plain text and markup. Most of the markup in
Textus is just HTML plus some TEI for the "semantic" stuff.
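
To make the separation concrete, here is a minimal Python sketch of the
idea: the text stays a plain string and the markup sits beside it as a list
of character ranges. The field names (`start`, `end`, `tag`) and the
`render_html` helper are assumptions made up for illustration; this is not
the actual Textus schema.

```python
import html

# Plain text is stored untouched; no markup characters are embedded in it.
text = "To be, or not to be, that is the question."

# Markup lives separately as character ranges over the plain text.
# Field names are illustrative assumptions, not the real Textus format.
markup = [
    {"start": 0, "end": 19, "tag": "em"},
    {"start": 33, "end": 41, "tag": "strong"},
]


def render_html(text, markup):
    """Weave non-overlapping ranges back into the text to produce HTML."""
    out = []
    cursor = 0
    for span in sorted(markup, key=lambda s: s["start"]):
        # Emit the unmarked text before this span, then the wrapped span.
        out.append(html.escape(text[cursor:span["start"]]))
        inner = html.escape(text[span["start"]:span["end"]])
        out.append("<{0}>{1}</{0}>".format(span["tag"], inner))
        cursor = span["end"]
    out.append(html.escape(text[cursor:]))
    return "".join(out)


rendered = render_html(text, markup)
print(rendered)
```

Because the plain text is never touched, the same string can be fed to
search or NLP tooling as-is, while different markup layers are rendered on
demand.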


>
> The asciidoc abstract syntax tree is fairly parse-able.  I could see
> creating textus as an output format for GITenberg books.  Say I add three
> paragraphs to the end of a chapter: does textus make it easier for me to
> re-align annotations with the new document offsets?
>

When would you add 3 paragraphs to the end of a chapter of an existing
book? In general, no system is that great at this kind of realignment. But
yes, the algorithm for realigning annotations in Textus would be pretty
straightforward in that case.
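
As a rough illustration of why that case is easy, here is a sketch in
Python that assumes annotations are stored as `start`/`end` character
offsets over the plain text. It uses the standard library's `difflib` to
find unchanged blocks and shifts each annotation that falls wholly inside
one. The `realign` function and the annotation fields are hypothetical, and
this is not the algorithm Textus itself uses.

```python
from difflib import SequenceMatcher


def realign(old_text, new_text, annotations):
    """Map offset-based annotations from old_text onto new_text.

    An annotation lying entirely inside an unchanged block is shifted by
    that block's displacement. Anything touching edited text comes back
    as None so that a human can review it.
    """
    matcher = SequenceMatcher(None, old_text, new_text, autojunk=False)
    equal_blocks = [op for op in matcher.get_opcodes() if op[0] == "equal"]
    realigned = []
    for ann in annotations:
        moved = None
        for _tag, i1, i2, j1, _j2 in equal_blocks:
            if i1 <= ann["start"] and ann["end"] <= i2:
                shift = j1 - i1
                moved = dict(ann, start=ann["start"] + shift,
                             end=ann["end"] + shift)
                break
        realigned.append(moved)
    return realigned


old_text = "To be, or not to be.\n"
notes = [{"start": 0, "end": 5, "note": "opening words"}]

# Appending text leaves every earlier offset untouched.
appended = old_text + "That is the question.\n"
after_append = realign(old_text, appended, notes)

# Inserting text before the annotation shifts it by the inserted length.
prefixed = "Hamlet: " + old_text
after_prefix = realign(old_text, prefixed, notes)
```

Text appended at the end of a chapter never moves an existing offset, which
is why that particular case is the simple one; edits inside an annotated
span are the hard part for any system.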


> I like datapackages a great deal.  But I'm not very familiar with the
> ecosystem of CKAN.  I've looked into adding GITenberg as a package to the
>

Let me *repeat* ;-0 - Data Packages have *nothing* to do with CKAN. As
explained earlier in the thread, I'd use Data Packages here plus git or S3
for storage. I am not suggesting using CKAN :-)

Data Package is a *really* simple standard for the metadata wrapper around
your data. It sounds like exactly what you are creating here.


> python library NLTK.  Is CKAN a good place to host text as data?  Eric
> Hellman of GITenberg (CC'd) is working on a yaml metadata specification
> that we can map the data to OPDS feeds and MARC records for libraries.  Our
> argument for yaml was it would be, in theory, easier for librarians to edit
> by hand.  Would the dpm tool offer us something we are missing?
>

Ultimately I don't think there is much to choose between YAML and JSON for
hand editing (both will feel a little odd). I'd therefore really suggest
taking a look at extending datapackage.json for your needs. It would seem a
natural fit and you can add any fields you need.
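
For a feel of what that could look like, here is a small Python sketch that
builds a descriptor and serialises it as the contents of a datapackage.json
file. The `name` and `resources` keys (with a `path` per resource) come from
the Data Package spec; every value below, and the whole `gitenberg` block,
are placeholders invented for illustration rather than a proposed schema.

```python
import json

# A minimal Data Package descriptor for one book, built as a plain dict.
descriptor = {
    "name": "example-book",
    "title": "An Example Book",
    "resources": [
        # Each resource points at a file stored alongside the descriptor.
        {"path": "book.txt"},
    ],
    # Hypothetical project-specific extension; not part of the spec.
    "gitenberg": {
        "markup": "asciidoc",
        "source_commit": "<git commit hash goes here>",
    },
}

# This string is what would be saved as datapackage.json in the repo.
datapackage_json = json.dumps(descriptor, indent=2, sort_keys=True)
print(datapackage_json)
```

The same structure could just as well be written in YAML and converted, so
choosing the descriptor does not force the hand-editing format.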

What you get with Data Package is a) a spec that's been worked on for a
while and b) existing tooling (though some of it may be less relevant, as
your payload is just "blobs" of text ;-0 ...)

Rufus


>
> --S
>
> On Tue, Jun 2, 2015 at 6:51 PM, todd.d.robbins at gmail.com <
> todd.d.robbins at gmail.com> wrote:
>
>> Seth,
>>
>> How's GITenberg getting on? Also, per Rufus' suggestion have you looked
>> at storing texts as Textus structure
>> <http://okfnlabs.org/textus/doc/textus-format.html> and perhaps as Data
>> Packages <http://dataprotocols.org/data-packages/>?
>>
>> Cheers!
>>
>> –Tod
>>
>>
>>
>> On Fri, Apr 17, 2015 at 1:20 AM, Rufus Pollock <rufus.pollock at okfn.org>
>> wrote:
>>
>>> Hi Seth,
>>>
>>> Wanted to check you got the last response here :-) It would be great to
>>> continue the thread as I think there is great potential here at multiple
>>> possible levels of collaboration (from very basic to more extensive).
>>>
>>> Rufus
>>>
>>> On 25 February 2015 at 12:12, Rufus Pollock <rufus.pollock at okfn.org>
>>> wrote:
>>>
>>>> On 23 February 2015 at 14:55, Seth Woodworth <seth at sethish.com> wrote:
>>>>
>>>>> I have a project where I have forked Project Gutenberg to Github (
>>>>> https://gitenberg.github.io/).  On Github we are starting to
>>>>> collectively improve the copyediting and formatting of PG books.  We are
>>>>> likely going to be using asciidoc as our base markup format document.
>>>>>
>>>>> I would love it if OpenLiterature editors/contributors could make
>>>>> copy-edits or formatting edits in GITenberg and have the change end up on
>>>>> OpenLiterature.  I think there are tangible benefits to building on top of
>>>>> GITenberg: having a source git commit to point to, to denote your text
>>>>> document version, would be useful on its own.
>>>>>
>>>>
>>>> I think there could be real synergies here. e.g. if GITenberg started
>>>> storing texts as Textus structure (and perhaps as Data Packages) that could
>>>> be the raw source for OpenLiterature. This not only builds on our original
>>>> approach for OpenShakespeare where we had texts in git, but also on the
>>>> recent work with "data" Data Packages both for Core Datasets project
>>>> <http://okfnlabs.org/blog/2015/01/03/data-curators-wanted-for-core-datasets.html> and
>>>> the storage and micro-services approach for OpenSpending
>>>> <http://labs.openspending.org/osep/01-approach-and-architecture-of-openspending.html> (in
>>>> OpenSpending we now store data as Budget Data Packages on s3).
>>>>
>>>> So basically GITenberg could become the core raw text repo that powers
>>>> OpenLiterature.
>>>>
>>>>> Is this of interest to the OpenLiterature folks?
>>>>>
>>>>
>>>> Very much so IMO as per above.
>>>>
>>>> Rufus
>>>>
>>>>
>>>>>
>>>>> On Mon, Feb 23, 2015 at 4:34 AM, James Harriman-Smith <
>>>>> james.harriman-smith at cantab.net> wrote:
>>>>>
>>>>>> Good points Iain. And it'd be great if you could give a little time
>>>>>> to this once you've made it through those deadlines.
>>>>>>
>>>>>> To carry on the discussion, with a bit of reprise for those joining
>>>>>> us here. My suggested MVP was:
>>>>>>
>>>>>> Agreed by Iain:
>>>>>> 1. Upload of texts in a simple format, ideally one used on Gutenberg
>>>>>> 2. Allow those texts to be annotated by users publicly
>>>>>>
>>>>>> Questioned:
>>>>>> 3. Make annotations and texts searchable
>>>>>>
>>>>>> I think #3 is very important: it would allow someone to use the
>>>>>> platform for research far more effectively. I, for instance, often find
>>>>>> myself looking for ideas in my notes, jotted down in response to a passage,
>>>>>> and no longer remembering the phrase that triggered my idea.
>>>>>>
>>>>>> That said, I don't think #3 is essential. Open Literature can be
>>>>>> demonstrated without it, and will still be useful. What do others think
>>>>>> here?
>>>>>>
>>>>>> J
>>>>>>
>>>>>> On 23 February 2015 at 09:51, Iain Emsley <iainemsley at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> @rufus , +1 for the reminder and enthusiasm :)
>>>>>>>
>>>>>>> @james: a discussion of the minimal viable product would be useful
>>>>>>> to provide a goal. Is #3, the search a first to do, or a rapid second?
>>>>>>> Whilst I agree that it is necessary, do we need it initially? #2 is the
>>>>>>> part where my earlier effort stalled in terms of integrating the existing
>>>>>>> JS and Wordpress.
>>>>>>>
>>>>>>> As with others, I'm tied up at the moment but _should_ have more
>>>>>>> time at the end of next month when some deadlines have passed by.
>>>>>>>
>>>>>>> Iain
>>>>>>>
>>>>>>> On Sun, Feb 22, 2015 at 7:32 AM, James Harriman-Smith <
>>>>>>> james.harriman-smith at okfn.org> wrote:
>>>>>>>
>>>>>>>> Dear list, Rufus, John,
>>>>>>>>
>>>>>>>> @John: it's great to see some maintenance on the Open Humanities
>>>>>>>> collection of websites happening, and a wiki booted for our activities.
>>>>>>>>
>>>>>>>> @Rufus, all: it's true that Open Literature has gone dormant of
>>>>>>>> late, and definitely needs a reboot. I'm afraid that, like John though, I
>>>>>>>> don't have time to give at the moment, as I'm writing up my thesis and
>>>>>>>> trying to secure some kind of paid academic employment for next year.
>>>>>>>>
>>>>>>>> That said, I think we could at least email about the minimum viable
>>>>>>>> product for Open Literature, to know what we have to do when we all have a
>>>>>>>> little more time.
>>>>>>>>
>>>>>>>> I'll start that in my next mail.
>>>>>>>>
>>>>>>>> J
>>>>>>>>
>>>>>>>> On 16 February 2015 at 09:19, Rufus Pollock <rufus.pollock at okfn.org
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> On 15 February 2015 at 15:10, John Levin <john at anterotesis.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Regret I have very limited time - and am out of the UK - at the
>>>>>>>>>> moment. In any case it seems that the work to be done is mainly technical,
>>>>>>>>>> coding, at this point. Once Textus is up and running, then the non-techie
>>>>>>>>>> humanists can get stuck in, putting up texts etc.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I would say there is a need right now for a lot of *non*-technical
>>>>>>>>> engagement - from being site editor, to blogging, tweeting, writing new
>>>>>>>>> essays, doing user testing, coordinating, organizing events. So I would
>>>>>>>>> definitely welcome non-technical folks here :-)
>>>>>>>>>
>>>>>>>>> A couple of side-matters:
>>>>>>>>>> 1: Is this the OL twitter account? https://twitter.com/OpenLiterature
>>>>>>>>>> Is anyone looking after it?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> No - but we could (I also wonder if we should move OpenShakespeare
>>>>>>>>> over to here since it already has quite a few followers)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 2: I have cleared spam and pending spam from the OL & Open Hums
>>>>>>>>>> wordpress sites.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Amazing!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 3: Fixed the links in this post:
>>>>>>>>>> http://openliterature.net/2011/09/05/shakespeare-and-the-internet/
>>>>>>>>>> which were directing to Open Shakespeare & 404ing.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Fantastic - and thanks!
>>>>>>>>>
>>>>>>>>> Rufus
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> John
>>>>>>>>>>
>>>>>>>>>> On 15/02/2015 14:39, Rufus Pollock wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I wanted to restart the conversation on getting OpenLiterature v1.0
>>>>>>>>>>> launched. In terms of key steps:
>>>>>>>>>>>
>>>>>>>>>>> 1. Identify the minimal viable product for OpenLiterature.net
>>>>>>>>>>>
>>>>>>>>>>> This also relates to the minimal viable "Textus" platform to power this
>>>>>>>>>>> (for background and slide deck see http://okfnlabs.org/textus/). At
>>>>>>>>>>> present the key things would be finishing the viewer JS lib (we are 80%
>>>>>>>>>>> there) and integrating this into the wordpress site.
>>>>>>>>>>>
>>>>>>>>>>> Requirement: a discussion on this list
>>>>>>>>>>>
>>>>>>>>>>> We already have an issue list
>>>>>>>>>>> <https://github.com/okfn/openliterature.net/issues> that could be useful
>>>>>>>>>>>
>>>>>>>>>>> 2. Estimate work and skills needed
>>>>>>>>>>>
>>>>>>>>>>> My guess is we are talking about 3-6 person weeks here to get to MVP -
>>>>>>>>>>> though we would need to properly estimate.
>>>>>>>>>>>
>>>>>>>>>>> 3. Recruit team
>>>>>>>>>>>
>>>>>>>>>>> Anticipate roles like:
>>>>>>>>>>>
>>>>>>>>>>> - Product Owner
>>>>>>>>>>> - Cat herder (scrum master)
>>>>>>>>>>> - Site Editor
>>>>>>>>>>> - Designer
>>>>>>>>>>> - Frontend JS (viewer)
>>>>>>>>>>> - Wordpress Plugin writer (PHP)
>>>>>>>>>>>
>>>>>>>>>>> Probably work asynchronously around a series of sprints (e.g. a few
>>>>>>>>>>> weekend or Saturday sprints)
>>>>>>>>>>>
>>>>>>>>>>> Who's interested? In particular, who would be interested in coordinating
>>>>>>>>>>> the initial phase of getting us all moving again?
>>>>>>>>>>>
>>>>>>>>>>> Rufus
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> John Levin
>>>>>>>>>> http://www.anterotesis.com
>>>>>>>>>> http://twitter.com/anterotesis
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> open-humanities mailing list
>>>>>>>>>> open-humanities at lists.okfn.org
>>>>>>>>>> https://lists.okfn.org/mailman/listinfo/open-humanities
>>>>>>>>>> Unsubscribe: https://lists.okfn.org/mailman/options/open-humanities
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Rufus Pollock
>>>>>>>>> Founder and President | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
>>>>>>>>> Open Knowledge <http://okfn.org/> - see how data can change the world
>>>>>>>>> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/>
>>>>>>>>>
>>>>>>>>> The Open Knowledge Foundation is a not-for-profit organisation.
>>>>>>>>> It is incorporated in England & Wales as a company limited by guarantee,
>>>>>>>>> with company number 05133759.  VAT Registration № GB 984404989. Registered
>>>>>>>>> office address: Open Knowledge Foundation, St John’s Innovation Centre,
>>>>>>>>> Cowley Road, Cambridge, CB4 0WS, UK.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> James Harriman-Smith
>>>>>>>> Open Literature Working Group Coordinator
>>>>>>>> Open Knowledge Foundation
>>>>>>>> http://okfn.org/members/jameshs
>>>>>>>> Skype: james.harriman.smith
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> James Harriman-Smith
>>>>>> Ph.D. Candidate, English Faculty
>>>>>> Peterhouse
>>>>>> University of Cambridge
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Tod Robbins
>> Digital Asset Manager, MLIS
>> todrobbins.com | @todrobbins <http://www.twitter.com/#!/todrobbins>
>>
>
>
>
>

