[OKFN - Austria] [science-at] Hacks/Hackers Vienna: Freeing data from documents @ Monday, Raum D

vasily.bunakov at stfc.ac.uk
Wed Aug 7 10:46:36 UTC 2013


I joined your mailing list after one of the Austrian conferences and find some of the themes you discuss quite interesting (although my German is close to zero).

In particular, the extraction of data from PDF documents is an important topic, as many national and local governments keep publishing their data in this format. We do have a "data curation" theme in one of our EU projects (www.engage-project.eu), but we mostly concentrate on refining tabular data (Excel, CSV) and linking it to suitable reference material on the Web (DBpedia, Geonames, ...). So what you are doing with PDFs seems complementary to what we are trying to do as part of the ENGAGE project.

If you need to scale up your data extraction effort, or simply make it better known internationally, I can recommend the beta version of our platform, www.engagedata.eu. It allows publishing (or just referencing) the "original" data and publishing "derived" data, too: e.g. the result of data extraction from a PDF can be published as a dataset explicitly related to the original. There are some social networking capabilities as well, so you can discuss what you are doing with other users, request new data that someone knows about, or form a new community.

We keep working on the platform, and your feedback (there is a Web form for it) is very welcome. Even in its present beta incarnation, we hope the platform can serve as a multi-national infrastructure for data publishing and re-publishing, including for extracted or refined data. We'll be happy if some of your hackathon outputs (as well as inputs: the original data or references to them) are published via www.engagedata.eu.

With kind regards,
Vasily Bunakov
STFC Scientific Computing

-----Original Message-----
From: science-at-bounces at lists.okfn.org [mailto:science-at-bounces at lists.okfn.org] On Behalf Of Stefan Kasberger
Sent: 06 August 2013 19:31
To: science-at at lists.okfn.org; okfn-at at lists.okfn.org
Subject: [science-at] Hacks/Hackers Vienna: Freeing data from documents @ Monday, Raum D


On Monday there is another Hacks/Hackers meetup at Raum D in the Museumsquartier.
At this meeting on data journalism, Mihi Bauer will show how to extract data/information from documents (PDF). This is something that also proves useful in science time and again.


Regards, Stefan

science-at mailing list
science-at at lists.okfn.org