[openbiblio-dev] [Open-access] Trying to index the malaria literature for BOAI-Openness - what has to be done paper-by-paper?
pm286 at cam.ac.uk
Wed Mar 28 07:29:02 UTC 2012
On Wed, Mar 28, 2012 at 5:31 AM, Nils Dagsson Moskopp <
nils at dieweltistgarnichtso.net> wrote:
> Daniel Mietchen <daniel.mietchen at googlemail.com> wrote on Tue, 13 Mar
> 2012 16:46:22 +0100:
> > Google does not give a simple list of results, and it currently yields
> > over 80k hits for malaria on PMC:
> > However, the Crawler currently being coded as part of the Open Access
> > Media Importer (cf.
> > ) does almost what you are looking for, and so it should not be too
> > difficult to modify it accordingly.
> I just implemented something that might be of value for this purpose, a
> command that outputs PMC metadata as CSV. Instructions for an sh-like
> shell follow:
This looks very useful. We are thinking along the same lines on
open-biblio, so I have copied that list.
> git clone https://github.com/erlehmann/open-access-media-importer.git
> cd open-access-media-importer
> ./oa-get metadata pubmed
> ./oa-cache list-articles pubmed | grep Malaria | grep creativecommons
> Besides git and Python 2.6, you will need python-progressbar. For
> operating systems without sane package management, you can find it
> here: <http://pypi.python.org/pypi/progressbar>.
> Be aware that this downloads several GB of data from PubMed Central FTP
> and may take some time. If you find any errors, let me know.
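The grep filter quoted above is case-sensitive, so it would miss titles that spell "malaria" in lower case. A minimal sketch of a case-insensitive variant in Python follows. It assumes only that `oa-cache list-articles` writes CSV to standard output and makes no assumption about column names or order. The function name `filter_rows` and the default keywords are illustrative and not part of the importer.

```python
import csv
import sys


def filter_rows(lines, keywords=("malaria", "creativecommons")):
    """Yield CSV rows in which every keyword occurs in at least one field.

    Matching is case-insensitive. No particular column layout is assumed;
    this is a hypothetical helper, not part of open-access-media-importer.
    """
    for row in csv.reader(lines):
        fields = [field.lower() for field in row]
        if all(any(keyword in field for field in fields) for keyword in keywords):
            yield row


if __name__ == "__main__":
    # Read CSV from standard input and write the matching rows back as CSV.
    writer = csv.writer(sys.stdout)
    for row in filter_rows(sys.stdin):
        writer.writerow(row)
```

Saved under a name such as filter_rows.py (hypothetical), it could take the place of the two grep calls: ./oa-cache list-articles pubmed | python3 filter_rows.py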
Is the CSV output "metadata" only, or does it include abstracts? I ask
because systematic downloading of abstracts is, I think, forbidden by the
publishing community. We should stick to the material outlined in the Principles of
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK