[okfn-labs] New PDF Table transcription for CrowdCrafting/PyBossa
José Félix Ontañón
felixonta at gmail.com
Fri Sep 20 09:54:54 UTC 2013
Please let me introduce the first CrowdCrafting app that reuses your
pdftabletranscribe template: Sevilla Presus 13
We're encouraging people to transcribe the data locked in the PDF files of
the revenue and expenditure budget for the city of Sevilla. This is an
initiative from a local newspaper in Sevilla:
The promise is to load the data into OpenSpending, so everyone can benefit
from reusing the visualizations in their own digital publications.
I really think crowdcrafting, besides for science, has a huge potential for
2013/9/13 Daniel Lombraña González <teleyinex at gmail.com>
> Hi there!
> Today I'm really happy to announce a new application/template for PyBossa
> that can be used in CrowdCrafting.org for transcribing tables locked in PDF
> files :-D
> The application is very similar to the PDF transcription one<http://crowdcrafting.org/app/pdftranscribe/>,
> as it is a new version of it, but showing how you can integrate a tabular
> data library to format the transcriptions easily.
> The application basically loads a PDF file (that can be hosted in your
> public Dropbox folder!) and asks you how many columns the table has in the
> page, if any. Then, if the answer is 5, a new 5-column table will be created
> automatically, adding a new row every time you complete one! Simple and clean!
> Each row is stored as a list in a JSON object, making it really easy to
> parse and export to other formats.
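As a rough illustration of that export step, here is a minimal sketch in Python. It assumes the answer for one page is a JSON object whose "rows" key holds a list of row lists; the key name and the sample values are made up for the example and are not taken from the app, so check the real task-run JSON before relying on it.

```python
import csv
import io
import json

# Hypothetical answer payload for one transcribed page. The key name "rows"
# and the values are assumptions for illustration only.
answer_json = (
    '{"rows": [["Chapter", "Description", "Amount"],'
    ' ["1", "Personnel", "1200.50"],'
    ' ["2", "Goods and services", "830.00"]]}'
)


def rows_to_csv(payload: str, key: str = "rows") -> str:
    """Convert a JSON object holding a list of row lists into CSV text."""
    rows = json.loads(payload)[key]
    buffer = io.StringIO()
    # Use "\n" so the output is the same on every platform.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(answer_json)
print(csv_text)
```

Because every row is already a plain list, the csv module can write it directly; the same list of lists could just as well be handed to a spreadsheet library.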
> Here you have a short YouTube video showing the app:
> The application: http://crowdcrafting.org/app/pdftabletranscribe/
> And the official Tweet:
> NOTE: this app works really well when each page has only one table and
> there are no merged cells. For other cases the template should be
> adapted; this is just the minimum version to work with. The handsontable
> library <http://crowdcrafting.org/app/pdftranscribe/> is really awesome,
> so you can adapt it to your needs without problems.
> All the best,
> Please do NOT use proprietary file formats to share files
> like DOC or XLS, instead use PDF, HTML, RTF, TXT, CSV or
> any other format that does not impose on the user the employment
> of any specific software to work with the information inside the files.