[openbiblio-dev] [Open-access] "the publishing community" Re: Trying to index the malaria literature for BOAI-Openness - what has to be done paper-by-paper?

Mike Taylor mike at indexdata.com
Wed Mar 28 09:13:17 UTC 2012


On 28 March 2012 10:04,  <koltzenburg at w4w.net> wrote:
> Hi pmr,
>
>> systematic
>> downloading of abstracts is, I think, forbidden by the publishing
>> community
>
> Well, actually publishers are birds of many feathers.
> I would caution against othering "them" into one and the same (closed) pot :-)
> Better not to lump them together as birds of a single feather.

The problem is, in the absence of explicit machine-readable statements
from each publisher about reuse rights of each journal or even each
article, the only safe course is to assume the worst, and not use what
you don't KNOW you can use.

For publishers that want to be The Good Guys, and want their articles
to be part of mining initiatives, the onus is on them to find good,
clear ways to communicate what rights are available.  This is probably
a good place to start:
        http://creativecommons.org/ns
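
As a rough illustration of why machine-readable statements matter to a
miner, here is a minimal sketch in Python. It assumes the publisher marks
the license on the article page with a rel="license" link, which is the
pattern ccREL recommends; the class and function names are made up for
this example, and a real crawler would also want to check RDFa or
embedded metadata. Anything without an explicit statement is treated as
not reusable.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LicenseLinkParser(HTMLParser):
    """Collect href values of <a>/<link> elements carrying rel="license".

    Hypothetical helper for illustration; not part of any existing tool.
    """

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "license nofollow"
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "license" in rels and href:
            self.licenses.append(href)


def declared_licenses(html):
    """Return all license URLs explicitly declared in an HTML page."""
    parser = LicenseLinkParser()
    parser.feed(html)
    return parser.licenses


def declares_cc_license(html):
    """True only if the page explicitly points at a creativecommons.org URL.

    Assume the worst: no explicit statement means no known reuse rights.
    Which uses a particular CC license permits is left to the caller.
    """
    for url in declared_licenses(html):
        host = urlparse(url).netloc.lower()
        if host == "creativecommons.org" or host.endswith(".creativecommons.org"):
            return True
    return False
```

A page containing <a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
would pass this check; a page that merely mentions the license in prose,
or links to it without rel="license", would not.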

-- Mike.


>
> thinks
> Claudia
>
> On Wed, 28 Mar 2012 08:29:02 +0100, Peter Murray-Rust wrote
>> On Wed, Mar 28, 2012 at 5:31 AM, Nils Dagsson Moskopp <
>> nils at dieweltistgarnichtso.net> wrote:
>>
>> > Daniel Mietchen <daniel.mietchen at googlemail.com> schrieb am Tue, 13 Mar
>> > 2012 16:46:22 +0100:
>> >
>> > > Google does not give a simple list of results, and it currently yields
>> > > over 80k hits for malaria on PMC:
>> > >
>> > https://www.google.com/search?q=malaria+site%3Awww.ncbi.nlm.nih.gov%2Fpmc.
>> > >
>> > > However, the Crawler currently being coded as part of the Open Access
>> > > Media Importer (cf.
>> > >
>> > http://wir.okfn.org/2012/03/10/open-access-media-importer-apology-frontend-usage/
>> > > ) does almost what you are looking for, and so it should not be too
>> > > difficult to modify it accordingly.
>> >
>> > I just implemented something that might be of value for this purpose, a
>> > command that outputs PMC metadata as CSV. Instructions for an sh-like
>> > shell follow:
>> >
>>
>> This looks very useful. We are thinking along the same lines on
>> open-biblio so I have copied them.
>>
>> > git clone https://github.com/erlehmann/open-access-media-importer.git
>> > cd open-access-media-importer
>> > ./oa-get metadata pubmed
>> > ./oa-cache list-articles pubmed | grep Malaria | grep creativecommons
>> >
>> > Besides git and Python 2.6, you will need python-progressbar. For
>> > operating systems without sane package management, you can find it
>> > here: <http://pypi.python.org/pypi/progressbar>.
>> >
>> > Be aware that this downloads several GB of data from PubMed Central FTP
>> > and may take some time. If you find any errors, let me know.
>> >
>> Is this "metadata" or does it include abstracts? Because systematic
>> downloading of abstracts is, I think, forbidden by the publishing
>> community. We should stick to the material outlined in the Principles of
>> Open Bibliography.
>>
>> P.
>>
>> --
>> Peter Murray-Rust
>> Reader in Molecular Informatics
>> Unilever Centre, Dep. Of Chemistry
>> University of Cambridge
>> CB2 1EW, UK
>> +44-1223-763069
>
>
>



