[open-bibliography] Metadata aggregators, discovery tools and libraries

Peter Murray-Rust pm286 at cam.ac.uk
Mon Jan 24 18:42:54 UTC 2011


On Mon, Jan 24, 2011 at 6:16 PM, Jim Pitman <pitman at stat.berkeley.edu> wrote:

> Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>
> > Just got back from the US, but have been talking with Sam Adams about our
> > tools, and we are fairly optimistic that we can technically scrape a lot.
> > We may need per-publisher crowdsourcing of templates, etc.
>
> BibSonomy has a suite of per-publisher scraping tools which are openly
> available, and also usable as a web service:
>
> http://www.bibsonomy.org/scraperinfo
> http://scraper.bibsonomy.org/
>
> I suggest integration of these tools with an OKFN supported effort.
>

This looks very exciting. We'll delve into it. I'm not sure what the data
licence, if any, is.

>
> > > We have more than that - we have ca 150,000 articles.
>
> Please can you expose this dataset with CC0 or whatever as a test dataset
> for the community to exercise various tools?
>

The crystaleye site http://wwmm.ch.cam.ac.uk/crystaleye does this for *data*,
and it's possible to download the data as a series of linked Atom feeds
(?Sam?). We didn't emphasize metadata when we started. There are two sorts
of metadata: the standard journal metadata, and the in-data author list,
which is often better.
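
As a rough illustration of what walking a series of linked Atom feeds could
look like, here is a minimal Python sketch. It assumes each feed points to
the next one with a <link rel="next" href="..."/> element and that each entry
carries <link> elements pointing at the data files; neither has been checked
against the actual crystaleye feeds, and the names parse_feed, walk_feeds and
fetch are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom syndication format (RFC 4287).
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_text):
    """Return (entry_links, next_url) for a single Atom feed document."""
    root = ET.fromstring(xml_text)

    # Collect every href found on a <link> inside an <entry>.
    entry_links = []
    for entry in root.findall(ATOM + "entry"):
        for link in entry.findall(ATOM + "link"):
            href = link.get("href")
            if href:
                entry_links.append(href)

    # Assumption: the feed chains to the next one via rel="next".
    next_url = None
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == "next":
            next_url = link.get("href")
            break

    return entry_links, next_url


def walk_feeds(start_url, fetch, max_pages=1000):
    """Follow the chain of feeds from start_url and gather all entry links.

    fetch is any callable that maps a URL to the feed's XML text, so the
    traversal can be tried out without touching the network.
    """
    seen = set()
    links = []
    url = start_url
    # Stop on a missing next link, a cycle, or the page limit.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        entry_links, url = parse_feed(fetch(url))
        links.extend(entry_links)
    return links
```

For real use, fetch could be something like
lambda u: urllib.request.urlopen(u).read(), since ET.fromstring accepts
bytes as well as text.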

>
> > I think we will have to rescrape the biblio but that's tractable.
> > I am becoming very excited about the ideas of community-scraping and I
> > think it will scale. We won't get everything initially but we will get
> > the stuff that is cared about.
>
> I strongly agree with that.
>
> --Jim P.
> ----------------------------------------------
> Jim Pitman
> Director, Bibliographic Knowledge Network Project
> http://www.bibkn.org/
>
> Professor of Statistics and Mathematics
> University of California
> 367 Evans Hall # 3860
> Berkeley, CA 94720-3860
>
> ph: 510-642-9970  fax: 510-642-7892
> e-mail: pitman at stat.berkeley.edu
> URL: http://www.stat.berkeley.edu/users/pitman
>
> _______________________________________________
> open-bibliography mailing list
> open-bibliography at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-bibliography
>



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069