[open-science-dev] PyBOSSA: re-implementing the open source PHP BOSSA framework in Python

Lucas Ferreira Mation lucasmation at gmail.com
Mon Dec 5 04:56:53 UTC 2011


Dear all,

just a report on Data Transcriber. I'm sorry for taking this long
(especially to Jenny, sorry for not answering your email sooner), but I wanted
to receive at least some documentation before giving this report:

1) The code is available here:
http://svn.lsd.ufcg.edu.br/repos/sc/table_transcriber (user name "anonymous",
leave the password blank). As better documentation is produced, it will be
migrated here: http://redmine.lsd.ufcg.edu.br/projects/datadigitizer .
2) Development was slow, first because of difficulties installing Bossa,
then because of difficulties with the googleDocs spreadsheet API, and then
because of the developers' end-of-semester exams. The guys at UFCG intend to
pick up the pace now that all that is over.
3) About googleDocs, the main issue is that, although the spreadsheets were
public for anyone to see and edit, if a user accessed a transcription job
while logged in, that spreadsheet would get added to his googleDocs documents
list. Thus he would be able to change the spreadsheet on his own, without
receiving actual jobs from Bossa. The developers at UFCG looked for ways to
shut off this behaviour in googleDocs, but were not able to. I also asked
about it in a google forum
(here<http://code.google.com/intl/pt-BR/apis/spreadsheets/forum.html?place=topic%2Fgoogle-spreadsheets-api%2FZlVU7WmUNeE%2Fdiscussion>)
and received no answer. If anyone has a hint, let us know.
4) Given 3, we switched to javascript code that emulates a table/grid
on the screen and retrieves the information typed in each cell. The
information is saved (that is already working) and later shown to the next
volunteer transcriber (that part is not operational yet).
5) Given the difficulties we faced installing Bossa, programming in PHP, etc., I
think that the group will be interested in porting this to pyBossa. Let me
talk to them and I'll let you guys know.
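
A minimal sketch of the save-and-review flow in point 4, written in Python
with a possible pyBossa port in mind. The class name, the field names, the
"row,col" cell keys and the example URL are all assumptions made for
illustration; this is not the actual Data Transcriber code.

```python
import json


class TranscriptionJob:
    """One table image to be transcribed (hypothetical structure)."""

    def __init__(self, job_id, image_url):
        self.job_id = job_id
        self.image_url = image_url
        self.passes = []  # one entry per volunteer, in submission order

    def save_pass(self, volunteer, cells):
        """Store a copy of the cells typed by one volunteer."""
        self.passes.append({"volunteer": volunteer, "cells": dict(cells)})

    def latest_cells(self):
        """Return a copy of the most recent grid for the next volunteer."""
        return dict(self.passes[-1]["cells"]) if self.passes else {}

    def to_json(self):
        """Serialize the job so a frontend could fetch it as JSON."""
        return json.dumps(
            {"job_id": self.job_id,
             "image_url": self.image_url,
             "passes": self.passes},
            sort_keys=True,
        )


job = TranscriptionJob(1, "http://example.org/table-page-1.png")
job.save_pass("volunteer_a", {"0,0": "Year", "0,1": "Population",
                              "1,0": "1950", "1,1": "51944"})

# The second volunteer starts from the first pass and corrects one cell.
draft = job.latest_cells()
draft["1,1"] = "51,944"
job.save_pass("volunteer_b", draft)
```

Keeping every pass, instead of overwriting the grid, would let us compare
what different volunteers typed for the same cell later on.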

regards
Lucas


2011/12/1 Daniel Lombraña González <teleyinex at gmail.com>

> Hi,
>
> On Wed, Nov 30, 2011 at 13:56, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>
>> On Friday, 25 November 2011, Daniel Lombraña González wrote:
>> >
>> > Dear all,
>> >
>> > During a hackfest in Cape Town (Africa at Home), and thanks to the
>> collaboration of Rufus Pollock from Okfn, we have started to port and
>> improve BOSSA (a volunteer thinking framework) to Python. You can find the
>> code here: https://github.com/citizen-cyberscience-centre/pybossa
>>
>> Right now we have an operational core. Immediate next step is to
>> create a demo app. Suggestion is to do a basic image classification,
>> writing the task frontend in javascript + html (in a way that others
>> can copy and paste) and to scrape photos from flickr commons. See this
>> issue: <https://github.com/citizen-cyberscience-centre/pybossa/issues/10>
>>
>> @Daniel: I believe you have already started on this (in collaboration
>> with gentleman from EpiCollect). How far have we come?
>>
>
> Well, not too far, sorry. I'm busy with the other projects this week, but
> I will work on this for sure, trust me ;)
>
>
>>
>> We'd also like to sign up some projects to try out the new platform so
>> if you've got a project please let us know and we can work to get this
>> running on PyBossa.
>>
>> (@Lucas: I was thinking of porting the DataDigitizer work -- would you
>> and colleagues be interested in working on this / porting what you've
>> been working on?)
>>
>> > BOSSA is a framework that allows the creation of projects where
>> humans are the CPUs. Imagine a project where an algorithm has to answer the
>> following question: does this photograph contain a dog? While this is
>> usually very complicated for algorithms, for humans it is a fairly
>> simple task. BOSSA helps in the creation of this kind of project,
>> providing tools for managing users, jobs, etc.
>>
>> The summary we have in the PyBossa README at the moment is:
>>
>> <quote>
>> PyBossa is an open source platform for crowd-sourcing online
>> (volunteer) assistance to perform tasks that require human cognition,
>> knowledge or intelligence (e.g. image classification, transcription,
>> information location etc).
>>
>> PyBossa was inspired by the BOSSA_ crowdsourcing engine but is written
>> in python (hence the name!). It can be used for any distributed tasks
>> application but was initially developed to help scientists and other
>> researchers crowd-source human problem-solving skills!
>> .. _BOSSA: http://bossa.berkeley.edu/
>> </quote>
>>
>> I think we should do a blog post on this asap to let the wider community
>> know what is going on.
>>
>> >
>> > During the Africa at Home hackfest, we were in contact with the main
>> developer of BOSSA and agreed on porting the code to Python.
>>
>> Can we invite him to this list? :-)
>>
>> >
>> > This new implementation will allow running a central service,
>> decoupling the task interface from BOSSA. This will allow the creation of
>> different projects based on any programming language as the core system
>> will provide a nice API to request and submit jobs using JSON. BOSSA will
>> take care of distributing the tasks among its users.
>>
>> We're directly porting the BOSSA sql structure and domain model.
>> However, we're making this available via a RESTful JSON web API, which
>> should be easier to connect to than the current BOSSA setup, where your
>> frontend has to run on the BOSSA platform and use PHP function
>> calls.
>>
>> Top priority here is to document the API so frontend devs can write
>> against it.
>>
>
> Yes, I will need that too. I was reading the code in order to start coding the
> app (to feed the DB). I hope that while I'm developing it I will create the
> documentation, or something usable, at the same time.
>
> Cheers,
>
> Daniel
>
>>
>> Rufus
>>
>
>
>
> --
>
> ··························································································································································
> http://github.com/teleyinex
> http://www.flickr.com/photos/teleyinex
>
> ··························································································································································
> Please do NOT use proprietary file formats for exchanging
> documents, such as DOC and XLS; use PDF, HTML, RTF, TXT, CSV
> or any other format that does not require a program from a
> specific vendor to handle the information it contains.
>
> ··························································································································································
>