[open-bibliography] Google Refine contributor wanting to help this community

Tom Morris tfmorris at gmail.com
Wed Oct 19 21:56:04 UTC 2011


On Mon, Oct 17, 2011 at 12:22 AM, Thad Guidry <thadguidry at gmail.com> wrote:

> Tom, would you be kind enough to review their
> github https://github.com/okfn/bibserver/tree/master/bibserver/parsers and
> let everyone know the effort there ?

It's less than 300 lines of Python implementing a hand-written
parser (rather than using a parser/lexer generator), so it wouldn't be
a lot of work to write the moral equivalent in Java. But since parsers
tend to be fiddly, it'd probably be less trouble to start with a
proven Java parser as a base.  It also contains some things, like name
piece guessing and TeX->Unicode conversion, which we might want to
defer to the post-import phase (although that probably deserves
discussion).
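
To make the "defer to post-import" idea concrete, here is a minimal
Python sketch of a TeX->Unicode pass that runs on field values after
parsing. It is an illustration only: the function name tex_to_unicode
and the small accent table are hypothetical and are not taken from
bibserver, bibtex2rdf, or Refine. It covers just five accent commands
on plain ASCII letters.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny table (an assumption for this sketch):
# TeX accent command character -> Unicode combining mark.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    '"': "\u0308",  # diaeresis
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

# Three common spellings of an accented letter, tried in this order.
ACCENT = re.compile(
    r"\{\\(['`\"^~])([A-Za-z])\}"   # {\"o}
    r"|\\(['`\"^~])\{([A-Za-z])\}"  # \"{o}
    r"|\\(['`\"^~])([A-Za-z])"      # \"o
)


def tex_to_unicode(value):
    """Replace the TeX accent sequences above with precomposed characters."""
    def repl(match):
        # Exactly one alternative matched; pick its two groups.
        accent = match.group(1) or match.group(3) or match.group(5)
        letter = match.group(2) or match.group(4) or match.group(6)
        # NFC folds letter + combining mark into a single code point
        # where one exists.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])
    return ACCENT.sub(repl, value)
```

Anything outside the table (for example \H{o} or the dotless \i) is
left untouched, so the raw TeX survives import and can be cleaned up
later by a fuller converter.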

For Java starting points, the bibtex2rdf project looks the most
promising right now, with the MIT Simile/Babel project's converter as
a backup.

>(  Btw Tom, I might put a bounty on
> this again, not for sure however, since I do not directly need it, just
> helping scientist friends ;) and it looks like OKFN would like to see it
> happen as well ! )

Cash is good. :-)  If someone wanted to pay for this, I'd certainly be
happy to knock something together and iterate on it with them or their
designated users.  I just finished porting my Open Document Format
(ODF) Spreadsheet importer and exporter to Refine's new importer
architecture, so it's all still fresh in my mind.

Tom



