[Open-access] Trying to index the malaria literature for BOAI-Openness - what has to be done paper-by-paper?

Tom Olijhoek tom.olijhoek at gmail.com
Tue Mar 13 16:44:50 UTC 2012


Hi All,

Do I understand correctly that you searched for just *one keyword,
"malaria"*?
PMR says that MW gave >70K references.
I thought that Mark extracted >70K references using the *set of
keywords* provided by us (MW)?
This does not affect the observation that Google search seems to do
a good job.
We recently discussed (Mark, Tom, Bart, Serge [admin MW]) that the
MW ref database of 2010-2011 will be compared to the references extracted
from PMC for those years to see how closely the two sets match.
That will be the basis for an extraction over a longer time period using the
chosen keywords.
If you can use Google to sort out the CC-BY and CC-0 papers, that would be fantastic!
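
The comparison of the MW and PMC reference sets mentioned above could be sketched roughly as follows. This is only an illustration: it assumes both sets can be reduced to comparable identifiers (PubMed IDs are used here as a guess), and the function names and sample IDs are made up.

```python
def normalize(ref_id: str) -> str:
    """Normalize an identifier so trivially different spellings compare equal."""
    return ref_id.strip().lower()


def compare_reference_sets(mw_ids, pmc_ids):
    """Return overlap statistics for two collections of reference identifiers.

    Hypothetical helper: assumes each reference carries one identifier
    (e.g. a PubMed ID) that is shared by both sources.
    """
    mw = {normalize(i) for i in mw_ids}
    pmc = {normalize(i) for i in pmc_ids}
    shared = mw & pmc
    union = mw | pmc
    return {
        "shared": sorted(shared),
        "only_mw": sorted(mw - pmc),
        "only_pmc": sorted(pmc - mw),
        # Jaccard index: 1.0 means identical sets, 0.0 means no overlap.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    mw_refs = ["PMID:1001", "PMID:1002", "pmid:1003 "]
    pmc_refs = ["PMID:1002", "PMID:1003", "PMID:1004"]
    print(compare_reference_sets(mw_refs, pmc_refs))
```

The "only_mw" and "only_pmc" lists would show which references each source misses, which seems more useful for tuning the keywords than a single overlap number.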

TOM


On Tue, Mar 13, 2012 at 4:46 PM, Daniel Mietchen <
daniel.mietchen at googlemail.com> wrote:

> Google does not give a simple list of results, and it currently yields
> over 80k hits for malaria on PMC:
> https://www.google.com/search?q=malaria+site%3Awww.ncbi.nlm.nih.gov%2Fpmc.
>
> However, the crawler currently being coded as part of the Open Access
> Media Importer (cf.
> http://wir.okfn.org/2012/03/10/open-access-media-importer-apology-frontend-usage/
> ) does almost what you are looking for, and so it should not be too
> difficult to modify it accordingly.
>
> I have copied Nils and Raphael - my partners on this - into this mail.
>
> Cheers,
>
> Daniel
>
>
> On Tue, Mar 13, 2012 at 4:14 PM, Peter Murray-Rust <pm286 at cam.ac.uk>
> wrote:
> >
> >
> > On Tue, Mar 13, 2012 at 3:09 PM, Daniel Mietchen
> > <daniel.mietchen at googlemail.com> wrote:
> >>
> >> Just seen in
> >> http://blogs.ch.cam.ac.uk/pmr/2012/03/13/sparc2012-a-manifesto-in-absentia-for-open-data/
> >> :
> >> "Our recent @ccess group is trying to index the malaria literature for
> >> BOAI-Openness and it has to be done paper-by-paper"
> >>
> >> I am not sure what PMR had in mind here that "has to be done
> >> paper-by-paper", but while PMC / PMUK indeed do a bad job of
> >> identifying papers by their licenses,
> >
> >
> > That's exactly what I meant!
> >>
> >> Google does it fairly well:
> >> 216 PMC articles under CC0 are being brought up in search for "malaria"
> >> https://www.google.com/search?q=malaria+site%3Awww.ncbi.nlm.nih.gov%2Fpmc+%22Creative+Commons+Public+Domain%22
> >> as well as over 9000 under CC BY:
> >> https://www.google.com/search?q=malaria+site%3Awww.ncbi.nlm.nih.gov%2Fpmc+%22http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%22+-%22by-sa%22++-%22by-nc%22++-%22by-nd%22
> >> (for further stats, see
> >> http://wir.okfn.org/2011/07/14/a-wiki-approach-to-open-access-and-open-science/#comment-44
> >> ).
> >>
> >> Of course, the relevance of these papers for Malaria World cannot be
> >> established that way.
> >>
> > Malaria World gave us 70K references and the question is how we establish
> > the Openness of them. We can start with Google and maybe intersect the sets.
> > Does Google give a simple list of the 9000 papers?
> >
> >>
> >> _______________________________________________
> >> open-access mailing list
> >> open-access at lists.okfn.org
> >> http://lists.okfn.org/mailman/listinfo/open-access
> >
> >
> >
> >
> > --
> > Peter Murray-Rust
> > Reader in Molecular Informatics
> > Unilever Centre, Dep. Of Chemistry
> > University of Cambridge
> > CB2 1EW, UK
> > +44-1223-763069
>