[okfn-labs] PyBossa for cultural heritage transcription/description?

Daniel Lombraña González teleyinex at gmail.com
Fri Nov 23 13:06:15 UTC 2012


Hi again,

Let me introduce you to Francisco Brasileiro and Lucas Ferreira, our
contacts on a project that is transcribing old books for the Brazilian
government in collaboration with the Internet Archive.

Francisco and Lucas, Sam has recently been in touch with someone who is
working with the Internet Archive, and they asked us about the possibility
of creating a PyBossa project for transcribing data from documents. If I'm
not mistaken, your Brazil project has an agreement with the Internet Archive
to scan the books (maybe that is already done), and you have almost finished
building the full application that will let you extract the data from those
PDFs.

As all of you share a similar interest, I think you should meet and talk to
each other, as a nice collaboration could come out of this :-)

Let me know if you need more info about PyBossa, ok?

Cheers,

Daniel




On Fri, Nov 23, 2012 at 12:14 PM, Sam Leon <sam.leon at okfn.org> wrote:

> Hi Daniel,
>
> Amazing!
>
> Could you please introduce me to the colleagues you mention who are working
> with the Internet Archive? I'd love to hear more about this use case so
> that we can publicise it. The people we engage via openglam.org would be
> *very, very* keen to hear about this.
>
> Cheers,
> Sam
>
>
> On Fri, Nov 23, 2012 at 7:25 AM, Daniel Lombraña González <
> teleyinex at gmail.com> wrote:
>
>> Hi there,
>>
>> For image classification and transcription, PyBossa only needs the
>> following:
>>
>> 1.- A list of image files that can be accessed via HTTP, for example on
>> Flickr or in a personal HTTP folder
>> 2.- Modify the Flickr Person Finder app to fit their needs (what do they
>> want to transcribe? are they looking for specific elements in the pictures?)
>> 3.- Create the tasks in crowdcrafting.org
>> 4.- Start collecting the data :-)
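
Step 1 and step 3 above can be sketched roughly as follows. This is only a minimal illustration, assuming the images are already listed as plain HTTP URLs; the function name `build_image_tasks` and the field names `url` and `question` are hypothetical and are not a schema required by PyBossa.

```python
from urllib.parse import urlparse


def build_image_tasks(image_urls, question):
    """Turn a list of HTTP image URLs into task 'info' dictionaries.

    The field names ("url", "question") are illustrative assumptions,
    not something PyBossa mandates.
    """
    tasks = []
    seen = set()
    for url in image_urls:
        url = url.strip()
        parsed = urlparse(url)
        # Keep only entries that can be fetched over HTTP(S).
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        # Drop duplicates so the same image is not turned into two tasks.
        if url in seen:
            continue
        seen.add(url)
        tasks.append({"url": url, "question": question})
    return tasks
```

Each returned dictionary would then become the payload of one task, with the question adapted to whatever the institution wants to transcribe or classify.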
>>
>> Our colleagues in Brazil have a more or less complete workflow for
>> transcribing large scanned books. In fact, they are collaborating with
>> people from the Internet Archive. I think we could contact them again :-)
>>
>> Cheers,
>>
>> Daniel
>>
>>
>>
>> On Thu, Nov 22, 2012 at 11:41 AM, Sam Leon <sam.leon at okfn.org> wrote:
>>
>>> Hi Daniel,
>>>
>>> I am sitting with someone here who is a musicology archivist from the Archives
>>> de la Ville de Bruxelles who is digitising content for the Internet Archive.
>>>
>>> How far away are we from doing something with this kind of image?
>>>
>>> http://archive.org/details/leprinceigoropra00pvms
>>>
>>> Best,
>>> Sam
>>>
>>> On Thu, Nov 22, 2012 at 8:33 AM, Daniel Lombraña González <
>>> teleyinex at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> This is great :D You can actually create an app using the Flickr Person
>>>> Finder as a template and modify only one link to get the images from
>>>> Flickr. The goal could be to use the Flickr Commons
>>>> <http://www.flickr.com/commons> pools to classify the images, etc. In
>>>> fact, lots of museums and institutions are pushing photos to Flickr
>>>> Commons <http://www.flickr.com/commons/institutions/>, so we only need to
>>>> contact one of those participating institutions and see if they want the
>>>> app :-)
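
As a rough illustration of the "one link" idea, the sketch below builds a Flickr public feed URL and pulls the image URLs out of an already decoded feed. It assumes the feed is requested as JSON from Flickr's public photo feed and that each item carries its image under `media["m"]`; the function names are hypothetical, and the actual template may fetch its images differently.

```python
from urllib.parse import urlencode

# Flickr's public photo feed endpoint (assumed to be the link to adapt).
FEED_ENDPOINT = "https://api.flickr.com/services/feeds/photos_public.gne"


def feed_url(tags):
    """Build the feed URL for the given comma-separated tags."""
    params = {"tags": tags, "format": "json", "nojsoncallback": 1}
    return FEED_ENDPOINT + "?" + urlencode(params)


def photos_from_feed(feed):
    """Extract title, page link and image URL from a decoded feed dict."""
    photos = []
    for item in feed.get("items", []):
        image_url = item.get("media", {}).get("m")
        if not image_url:
            # Skip items without a usable image link.
            continue
        photos.append({
            "title": item.get("title", ""),
            "link": item.get("link", ""),
            "url": image_url,
        })
    return photos
```

Changing the `tags` argument (or the endpoint itself) is the kind of single-link modification meant here; everything else in the app could stay as it is.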
>>>>
>>>> Cheers,
>>>>
>>>> Daniel
>>>>
>>>>
>>>> On Wed, Nov 21, 2012 at 9:15 PM, Etienne Posthumus <
>>>> eposthumus at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 21 November 2012 17:55, Jonathan Gray <jonathan.gray at okfn.org> wrote:
>>>>>
>>>>>> It is a lovely project, and I'm wondering how far we are from being
>>>>>> able to - e.g. - have a PyBossa image classification/description project
>>>>>> with a cultural heritage institution or open content project (like the
>>>>>> Internet Archive or Wikimedia Foundation).
>>>>>>
>>>>>
>>>>> All the pieces are there, as Rufus says it could be used right now. As
>>>>> a matter of fact it is being done, albeit for a simpler application.
>>>>>
>>>>> We are busy making a 'tagging game' for the Amsterdam Museum to allow
>>>>> middle school pupils to tag items as part of their school visits to the
>>>>> museum.
>>>>> The source for the images is the Adlib museum management system, which
>>>>> has an API. The first prototype version runs on Django, because the
>>>>> museum was in a bit of a hurry to get something out the door and had
>>>>> specific requirements with regard to logins and writing the data back to
>>>>> the Adlib database. In phase 2 of the project we are adding a link to
>>>>> PyBossa so that one can generate a PyBossa app from the Django
>>>>> application, without needing to do any Python coding.
>>>>>
>>>>> The images are previewed and selected from the Adlib search API,
>>>>> questions are managed by the museum staff in the Django Admin backend, and
>>>>> the PyBossa items are generated as a combination of these two and created
>>>>> using the PyBossa API plus the user secret key.
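
A minimal sketch of that last step, combining one image with one question and preparing the API call, could look like the following. It assumes the PyBossa REST API accepts a JSON `POST` on `/api/task` with the user's key passed as an `api_key` query parameter and the app referenced by `app_id`; the function name and the `info` field names are hypothetical. The request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_task_request(server, api_key, app_id, image_url, question):
    """Prepare (but do not send) a request that would create one task.

    Assumptions: the endpoint path, the api_key parameter and the
    app_id field follow the PyBossa REST API; "url" and "question"
    inside "info" are illustrative names only.
    """
    endpoint = server.rstrip("/") + "/api/task?" + urlencode({"api_key": api_key})
    payload = {
        "app_id": app_id,
        "info": {"url": image_url, "question": question},
    }
    return Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the task. Keeping construction separate from sending makes it easy to inspect the payload before anything is written to the server.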
>>>>>
>>>>> As soon as this part is functional it should appear in the
>>>>> Crowdcrafting site as an app.
>>>>>
>>>>> EP
>>>>>
>>>>> _______________________________________________
>>>>> okfn-labs mailing list
>>>>> okfn-labs at lists.okfn.org
>>>>> http://lists.okfn.org/mailman/listinfo/okfn-labs
>>>>> Unsubscribe: http://lists.okfn.org/mailman/options/okfn-labs
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ··························································································································································
>>>> http://github.com/teleyinex
>>>> http://www.flickr.com/photos/teleyinex
>>>>
>>>> ··························································································································································
>>>> Please do NOT use proprietary file formats such as DOC and XLS for
>>>> exchanging documents; use PDF, HTML, RTF, TXT, CSV or any other format
>>>> that does not force the use of a program from a specific vendor to
>>>> handle the information it contains.
>>>>
>>>> ··························································································································································
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Sam Leon
>>> Community Coordinator
>>> Open Knowledge Foundation
>>> http://okfn.org/
>>> Skype: samedleon
>>>
>>>
>>
>>
>>
>
>
>
>
>

