[open-science-dev] Fwd: Brazil OCR - by David P. Anderson

Daniel Lombraña González teleyinex at gmail.com
Wed Jul 20 07:15:11 UTC 2011


Dear Lucas and Nigini,

Lucas is right: as far as I know, there is no BOINC OCR solution yet, so the
only progress is what we did at OKCon2011 in Berlin.

BOSSA already has options for creating jobs, managing users, etc., so adapting
what we did to BOSSA should not be too difficult. The idea is to use BOSSA's
MySQL database to store the URIs of the PDF documents as well as the URIs of
the pages (probably on a CDN), so we can run jobs using this system (we only
have to port the Google Docs infrastructure to MySQL, or use a similar GData
library for PHP).
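
As a rough sketch of the storage idea (this is not existing BOSSA code), the
snippet below keeps one row per PDF and one row per page URI. The table names,
column names, and helper functions are all hypothetical, and Python's built-in
sqlite3 module stands in for MySQL only so the example is self-contained; a
real version would live in BOSSA's MySQL database and be written in PHP.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative, not BOSSA's.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    pdf_uri TEXT NOT NULL UNIQUE
);
CREATE TABLE pages (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    page_number INTEGER NOT NULL,
    page_uri TEXT NOT NULL,
    UNIQUE (document_id, page_number)
);
"""


def open_db(path=":memory:"):
    """Open a database (in memory by default) and create the two tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, pdf_uri, page_uris):
    """Store a PDF URI and the URIs of its pages, numbered from 1."""
    cur = conn.execute("INSERT INTO documents (pdf_uri) VALUES (?)", (pdf_uri,))
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO pages (document_id, page_number, page_uri) VALUES (?, ?, ?)",
        [(doc_id, n, uri) for n, uri in enumerate(page_uris, start=1)],
    )
    conn.commit()
    return doc_id


def pages_for(conn, doc_id):
    """Return (page_number, page_uri) pairs for one document, in page order."""
    rows = conn.execute(
        "SELECT page_number, page_uri FROM pages "
        "WHERE document_id = ? ORDER BY page_number",
        (doc_id,),
    )
    return rows.fetchall()
```

Each page row could then be handed out as one transcription job, with the page
URI pointing at the CDN copy of that page.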

Regards,

Daniel

2011/7/19 Lucas Ferreira Mation <lucasmation at gmail.com>

>
> We did not get very far during that day. The only thing that was done was
> code to split the PDF into its original pages.
> As far as I know, all the code produced on that day was already used in
> the Berlin Hackfest.
> So it is all integrated into the current code on GitHub.
> The code that may be the most interesting to integrate with our current
> demo is the "ellipse finding" one, because it manages job assignment.
>
> Best wishes,
> Lucas
>
>
> 2011/7/19 Nigini Abilio <nigini at lsd.ufcg.edu.br>
>
>> I do agree with you about David, Lucas.
>> But as I just saw in your email to Daniel, I may have been misunderstood. The
>> application I would like to get my hands on is the "Brazil OCR" one, not the
>> "ellipse finding" one.
>>
>> Thanks.
>>
>>
>> 2011/7/19 Lucas Ferreira Mation <lucasmation at gmail.com>
>>
>>> Hi Daniel,
>>>
>>> can you find the "ellipse finding" demo that David sent us?
>>>
>>> If Daniel does not know, we can ask David directly ("David Anderson" <
>>> davea at ssl.berkeley.edu>).
>>> But he is a busy guy, so let's try to find this ourselves first. Let's keep
>>> his advice for more important stuff in the project, both in terms of
>>> strategy and for when we get really stuck in the coding.
>>>
>>> Best wishes,
>>> Lucas
>>>
>>> On Tue, Jul 19, 2011 at 2:54 PM, Nigini Abilio <nigini at lsd.ufcg.edu.br> wrote:
>>>
>>>> Hi Lucas.
>>>>
>>>> Thank you for your attention. Well, I just checked the users portal at
>>>> the specified server, and some of the configuration inside it, but no
>>>> "Brazil OCR" app is installed. Do you have any other information?
>>>>
>>>> Besides that, as the software was developed at a hackfest, I was
>>>> expecting the code to be available on an open-source platform like
>>>> GitHub (as with the last one we participated in together). It would be
>>>> really helpful if that were the case.
>>>>
>>>> Best regards.
>>>>
>>>>
>>>> On Tue, Jul 19, 2011 at 1:59 PM, Lucas Ferreira Mation <
>>>> lucasmation at gmail.com> wrote:
>>>>
>>>>> Nigini, as far as I know David installed this on your server (the UFCG
>>>>> server). I sent you the emails about this, pointing to that server,
>>>>> the password, etc.
>>>>>
>>>>> regards
>>>>> Lucas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 19, 2011 at 1:50 PM, Nigini Abilio <nigini at lsd.ufcg.edu.br
>>>>> > wrote:
>>>>>
>>>>>> Hi people.
>>>>>>
>>>>>> I'm studying and trying to contribute to the Data Digitizer project.
>>>>>> Specifically, right now my group is looking into how to use BOSSA as
>>>>>> the project infrastructure.
>>>>>>
>>>>>> Following some pointers, I found this application, developed at "some
>>>>>> hackfest" by David Anderson, for the Brazil OCR problem:
>>>>>> https://isaac.ssl.berkeley.edu/test/bossa_apps.php
>>>>>>
>>>>>> My question is: where is the source code?
>>>>>>
>>>>>> Thanks in advance.
>>>>>> __________________________
>>>>>> Nigini Abilio Oliveira
>>>>>> www.nigini.com.br
>>>>>>
>>>>>> _______________________________________________
>>>>>> open-science-dev mailing list
>>>>>> open-science-dev at lists.okfn.org
>>>>>> http://lists.okfn.org/mailman/listinfo/open-science-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


-- 
··························································································································································
http://github.com/teleyinex
http://www.flickr.com/photos/teleyinex
··························································································································································
Please do NOT use proprietary file formats such as DOC and XLS for
exchanging documents; use HTML, RTF, TXT, CSV, or any other format
that does not force the use of a specific vendor's program to
process the information it contains.
··························································································································································