[open-bibliography] Google Refine contributor wanting to help this community

Thad Guidry thadguidry at gmail.com
Sat Oct 15 06:15:32 UTC 2011


While reading through a few earlier posts from Jim Pitman, I stumbled upon
his mention of SIMILE and CITELINE.

Well, it just so happens that two of their authors are now part of Google
and contribute to Google Refine.

I brought up with the team the idea of building a BibTeX / BibJSON
importer/exporter, since we already support JSON, XML, RDF, and even MARC.
In fact, the idea came out of a brainstorm with one of the few scientists we
have on Freebase.com, a community expert like myself, who wanted a better
tool to analyze his citations, bibs, etc., and to convert his data. ;)

I think converting and manipulating BibTeX records into JSON, specifically
BibJSON, can be done within Google Refine's architecture with just a bit of
wiring up and a bit of help from this community.  In the new release it is
already easy to preview, convert, and manipulate XML / JSON data and to
output XML / JSON however your heart desires.

What I am missing is a BibTeX parser or, better yet, a generic
BibTeX-to-JSON converter in Java source.  Hence this email to your
community, both offering and asking for assistance.

One bonus of this endeavor is that authors, publishers, or anyone else could
upload their bib metadata to Freebase.com if they want; that is entirely up
to them.

Another bonus: I also saw mention on this list of everyone's general
interest in Schema.org.  As it turns out, lol, a lot of the schema
development was borrowed from Freebase.com, since Google acquired them last
year.  In fact, some of my Freebase schema work is in there.  (And yes, that
endeavor is and will be much more than just heuristics and natural language
techniques; there's a ton of research being done by Google Research NYC at
the moment, so stay tuned.)

I would encourage folks to download Google Refine from
http://code.google.com/p/google-refine/downloads/list and try it on any
existing JSON file, a flat CSV file, or any of the other formats we
currently support, to see how it generally works and how it allows export,
including a custom export option or export via a template.  You'll soon
discover that it is an extremely useful tool and could be THE tool that this
community could leverage for its BibJSON / BibTeX / structured text
processing and cleaning needs.  (Btw, we use jQuery as well in Google
Refine, which runs on your local desktop in your browser.)

We're all ears - Google Refine Team,

Disclaimer: I do not work for Google.  I have worked for a public library
for over 7 years.  I love cheese. :)

-- 
-Thad
http://www.freebase.com/view/en/thad_guidry