[open-science] [okfn-labs] New PDF Table transcription for CrowdCrafting/PyBossa

Daniel Lombraña González teleyinex at gmail.com
Fri Sep 20 10:53:05 UTC 2013


Hi Félix,

Thanks a lot for sharing your project with the e-mail lists! I forgot about
it, hehe, but I've tweeted it :-D

I would also like to say thank you for re-using my PDF transcription table
template, and for championing this project using CrowdCrafting.org. As you
said, I also think CrowdCrafting.org has great potential for social
hacking.

Your project reminds me of another example, similar to yours:
http://crowdcrafting.org/app/heradsdomar/

In this case, a citizen like you decided to analyze the conviction rates
of judges in Iceland. I wrote a blog post about it once all the data was
obtained and published:
http://okfnlabs.org/blog/2013/07/31/crodcrafting-data-journalism.html

Looking forward to writing another blog post about your project with
hopefully similarly amazing results!

All the best,

Daniel


On Fri, Sep 20, 2013 at 11:54 AM, José Félix Ontañón <felixonta at gmail.com> wrote:

> Hi Daniel!
>
> Please, let me introduce the first crowdcrafting-app reusing your
> pdftabletranscribe: Sevilla Presus 13
>
> http://crowdcrafting.org/app/sevilla-presus13
>
> We're encouraging people to transcribe the data locked in PDF files of the
> revenue and expenditure budget for the city of Sevilla. This is an
> initiative from a local newspaper in Sevilla:
> http://www.sevillaactualidad.com/servicios-117/corporativo/21723-las-cuentas-claras
> (Spanish)
>
> The promise is to load the data into OpenSpending, so everyone can benefit
> from re-using the visualizations in their own digital publications.
> I really think crowdcrafting, besides science, has huge potential
> for social hacking.
> Cheers!
>
>
>
> 2013/9/13 Daniel Lombraña González <teleyinex at gmail.com>
>
>> Hi there!
>>
>> Today I'm really happy to announce a new application/template for PyBossa
>> that can be used in CrowdCrafting.org for transcribing tables locked in PDF
>> files :-D
>>
>> The application is very similar to the PDF transcription one <http://crowdcrafting.org/app/pdftranscribe/>,
>> as it is a new version of it, but it shows how you can integrate a tabular
>> data library to format the transcriptions easily.
>>
>> The application basically loads a PDF file (which can be hosted in your
>> public Dropbox folder!) and asks you how many columns the table on the
>> page has, if any. Then, if the answer is 5, a new table with 5 columns is
>> created automatically, adding a new row every time you complete one!
>> Simple and clean!
>>
>> Each row is stored as a list in a JSON object, making it really easy to
>> parse and export to other formats.
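
A minimal sketch of that parse-and-export step in Python, under stated assumptions: the answer is taken to be a JSON object with a "rows" key holding one list per table row. The key names, the sample values, and the helper rows_to_csv are hypothetical and for illustration only; they are not taken from the template itself.

```python
import csv
import io
import json

# Hypothetical answer payload: the "columns"/"rows" keys and the sample
# values are assumptions for illustration, not the template's real format.
SAMPLE_ANSWER = (
    '{"columns": 3, "rows": ['
    '["Chapter", "Description", "Amount"], '
    '["1", "Staff costs", "1200.50"]]}'
)


def rows_to_csv(answer_json):
    """Turn a JSON answer whose "rows" value is a list of lists into CSV text."""
    answer = json.loads(answer_json)
    buffer = io.StringIO()
    # Use "\n" line endings so the output is the same on every platform.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(answer["rows"])
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(SAMPLE_ANSWER))
```

Because every row is already a plain list, the csv module can write it directly, and cells that contain commas are quoted automatically.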
>>
>> Here is a short YouTube video showing the app:
>> http://www.youtube.com/watch?v=yfnJHALzlZc
>>
>> The application: http://crowdcrafting.org/app/pdftabletranscribe/
>>
>> And the official Tweet:
>> https://twitter.com/teleyinex/status/378474287532744704
>>
>> NOTE: this app works really well when each page contains only one
>> table and there are no merged cells. For other cases the template should
>> be adapted; this is just the minimum version to work with. The handsontable
>> library <http://crowdcrafting.org/app/pdftranscribe/> is really awesome,
>> so you can adapt it to your needs without problems.
>>
>> All the best,
>>
>> Daniel
>>
>> --
>> http://daniellombrana.es
>> http://citizencyberscience.net
>> http://www.shuttleworthfoundation.org/fellows/daniel-lombrana/
>>
>> ··························································································································································
>> Please do NOT use proprietary file formats to share files
>> like DOC or XLS, instead use PDF, HTML, RTF, TXT, CSV or
>> any other format that does not impose on the user the employment
>> of any specific software to work with the information inside the files.
>>
>> ··························································································································································
>>
>
>
> --
> http://about.me/fontanon
>



-- 
http://daniellombrana.es
http://citizencyberscience.net
http://www.shuttleworthfoundation.org/fellows/daniel-lombrana/