[Open-access] [open-science] OKF at Open Repositories 2014

Mike Taylor mike at indexdata.com
Thu Dec 5 16:16:01 UTC 2013

Wikipedia works because there's one of it.

eBay works for the same reason.

More pertinently, that's why arXiv works, too.

The whole system of IRs necessarily and *by design* leads to
balkanisation. How could it not? That's what the institutions actively
*want* -- come and see *our* awesome repo! What researchers need is
for there to be one repo in the world. (Plus any number of mirrors, of

-- Mike.

On 5 December 2013 16:13, Emanuil Tolev <emanuil at cottagelabs.com> wrote:
> On 5 December 2013 14:50, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>> On Thu, Dec 5, 2013 at 1:32 PM, Emanuil Tolev <emanuil at cottagelabs.com>
>> wrote:
>>> On 5 December 2013 12:16, Jenny Molloy <jenny.molloy at okfn.org> wrote:
>>>> People seem to be pointing towards:
>>>> 1) Automatic/crowdsourced deposition of OA work into repositories via
>>>> OAButton system or other means (maybe we could persuade a few live
>>>> repositories to implement the deposit button Mark discussed).
>>> Hmm... if the OAButton showed you your own research (e.g. just by name),
>>> showed you how many people hit a paywall trying to get your article(s) and
>>> prompted you to deposit, that seems like a good way of convincing more
>>> people to do it. Would it not be insanely complex in terms of finding WHERE
>>> to deposit to though? (I.e. has to be a local institutional repo because UK
>>> is not Norway, correct me if I'm wrong.)
>> ?? Why does it have to be a local institutional repo? This religious
>> insistence on Universities at the centre has held us back. I used to work for
>> Glaxo - where do I put my papers? Where do Cottagelabbers put them? The
>> fewer outlets (> 1 but small) the better. Options include:
> Sorry Peter, I didn't mean "that's the only option", it's just the first
> thing everybody tries to do. I definitely go to CORE, and in general to
> aggregating information from individual repositories, when a piece of
> software needs to be able to find papers. I quite agree that a centralised
> approach, e.g. Norway's, is superior in many ways. That's quite the paradigm
> shift though. We need to identify ways of making people believe this is
> possible and see the benefits + address concerns they have.
>> * arXiv
>> * EuropePMC
>> * CKAN
>> * Wikipedia
>> and perhaps national libraries.
> Right, fair enough.
> Dumb question: why can't arXiv handle content from all disciplines?
> (Manpower? Technical debt? All of it probably boils down to willingness and
> money, but if we can identify problems with existing
> centralised-but-not-comprehensive-enough systems, we can do something about
> it.)
>> >>Also I'm not sure how much metadata they hold about the articles people
>> >> report using the OAButton. An integration between the OAButton data and
>> >> CrossRef has more chance here, and is a cool idea. There's not enough
>> >> integration between OAButton and repos yet, though we do use CORE to look
>> >> stuff up automatically for our users.
>> Why should we base the future on a system that clearly isn't (yet)
>> working? Given the choice of going to Wikipedia and
>> theUniversityOfNowhereIHaveHeardOf where would people want to go?
> But Wikipedia covers everything, or at least doesn't exclude based on
> discipline. Is there a system that can take all of UK's research metadata?
> CKAN and elasticsearch and friends don't count until they're deployed with a
> wrapper that allows depositing in multiple ways and covered server bills.
> You mention Wikipedia, and it does fulfil those criteria. But how can we
> deposit research metadata on wikipedia (or more broadly, how to use it to
> describe research)? You could run a wiki which stores all the full text of
> all the OA articles or the PDF file where that's unavailable. But this is
> still a piece of software, not a system. If we're not talking about wiki
> software, but Wikipedia.org, I'm not sure how research metadata (or the text
> of all the OA research) would fit into it.
>>> 2) Indexing content for search and discovery.
>> >>I think Jorum, the open educational resource system, is built on DSpace,
>> >> and that's doing fine with its catalogue: http://find.jorum.ac.uk/
>> Is it? Can I search for (say) "Blackbird" or "Turdus Merula"? I don't get
>> any results. We need to build our own.
>> >>But I guess what the audience will really want to hear about is
>> >> basically an open alternative to Google Scholar.
>> Yes.
>> >>One thing which can attach information to very large amounts of DOI-s is
>> >> http://oag.cottagelabs.com/ , a mass license checker - goes directly to
>> >> publisher websites and scrapes them (or uses API-s where available). So this
>> >> is not precisely discovery just yet, but it does fill the legal gap (the
>> >> license of the items we index for discovery and search).
>> Excellent.
>> >>Many have tried to build an open index of scholarship (I've tried with
>> >> others too, still trying) but I haven't heard of a big one just yet. While
>> >> this is probably the "ultimate" topic of such a workshop, I don't think
>> >> we're at the stage where you can base a workshop on that idea. Maybe the
>> >> other 2 examples above + other existing projects come together to a greater
>> >> whole worthy of a workshop though.
>> There is no technical reason why not, and I think the need is becoming
>> clearer. Figshare showed that one graduate student can change the world,
>> OAButton showed that undergraduates will. Wikipedia has built a better
>> knowledge engine than almost any university. We have all the indexing tools
>> built (we can add Pubcrawler to the Cottagelab tools) and the problem can be
>> scaled with committed humans.
> Hm, my notion of a workshop was more towards the "people come to hear how to
> do this or about work in progress on this", but it's actually a lot more of
> a discussion / contribution thing. I guess you can run them both ways, more
> towards the lecture side of the spectrum and more towards the hacking side.
> So yes, "Building the Green cake and decorating it" or a subset of that is a
> viable topic then.
> Greetings,
> Emanuil
>> --
>> Peter Murray-Rust
>> Reader in Molecular Informatics
>> Unilever Centre, Dep. Of Chemistry
>> University of Cambridge
>> CB2 1EW, UK
>> +44-1223-763069
