[open-science] Text mining, PDF to text conversion, and permissions on abstracts

Jenny Molloy jcmcoppice12 at gmail.com
Fri Mar 9 09:47:51 UTC 2012


Do suggest something like this to be looked at during the Open Science
hackday!
http://science.okfn.org/2012/03/07/open-science-hackday-31-mar-2012-london/

Clearly not something to be solved quickly, but it may get a good group of
people discussing it and lead to a PDF-focused group. These kinds of tools
are useful across the range of OKF activities, so I've copied in the
okfn-discuss list.

We have a table transcription tool from a previous hackday, which it was
hoped could be turned into an automated tool:
http://blog.okfn.org/2011/11/17/introducing-the-data-digitizer/

Jenny

On Thu, Mar 8, 2012 at 6:22 PM, Jessy Kate Schingler <jessy at jessykate.com> wrote:

> (oops, meant to reply-all)
>
> awesome, i didn't know about PDFBox.
>
> imagine how great it would be to combine good content extraction with the
> annotator tool
>
> On Thu, Mar 8, 2012 at 12:15 PM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>
>>
>>
>> 2012/3/8 Maximilian Haeussler <maximilianh at gmail.com>
>>
>>> Two years ago, I had the impression that PDFBox was the most mature
>>> software package in this area.
>>>
>> I have used PDFBox extensively but not for about 18 months. It's good
>> and I also use it for graphics.
>>
>> It would be wonderful to get a science-based OKF group for PDFing.
>>
>> P.
>>
>>
>>>
>>>
>>
>>
>> --
>> Peter Murray-Rust
>> Reader in Molecular Informatics
>> Unilever Centre, Dep. Of Chemistry
>> University of Cambridge
>> CB2 1EW, UK
>> +44-1223-763069
>>
>> _______________________________________________
>> open-science mailing list
>> open-science at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/open-science
>>
>>
>
>
> --
> Jessy
> http://jessykate.com
>