[okfn-discuss] Proposal for OpenThesis Project

Peter Murray-Rust pm286 at cam.ac.uk
Sun Jul 11 15:50:08 UTC 2010


Greetings all,

I believe there is a real need for a communal effort in making academic
theses more Open and would like the OKF to set up a project along these
lines. I believe that the OKF is the most appropriate organization to
address this issue (and perhaps the only one capable of doing it!). I am
posting this to okfn-discuss (after recent meta-discussions) but am also
happy for it to be re-copied and reformulated on
propose-project at okfn.org (and indeed this could act as the first trial of
that). As the process is new I'm writing this slightly more discursively and
occasionally in the first person, but obviously if this goes anywhere it
would be communally owned.

Motivation
========
My motivation is that over several years (e.g. attending meetings of ETD
(Electronic Theses and Dissertations) and OR (OpenRepositories), and more
recently an ETHOS meeting (JISC)) I have found that the university and HE
sector does not fully address the issue of making theses Open. I should make
it clear that
they have all done a huge amount of good things - such as promoting
born-digital theses and promoting repositories, and the OpenThesis project
is intended to be entirely complementary.

The problem arises from the fact that theses are by their nature protected
by copyright. (There is an important additional point that much of a modern
thesis may be more suitably regarded as "data" or "code" or "metadata" but I
believe that OpenThesis will subsume these concerns by addressing the larger
problem of copyright). Theses are often handled by University Libraries (who
also often manage the repositories) and they naturally and responsibly
address the problem of copyright. Too frequently, however, the actual rights
are poorly represented, especially at the machine-understandability level.
There is often a single copyright notice on a repository which takes a
(perhaps forgivable) approach that everything is forbidden unless permitted
explicitly. Licences are often unnecessarily restrictive (e.g. ND-NC). There
are excellent cases where libraries and authors are pro-active in
encouraging licences to be embedded in theses but the normal case is that
there are no explicit machine-readable rights on a per-work basis.

Theses also have a commercial value and there are organizations which
provide cataloguing and dissemination of theses or extract and republish
material. These derivative works are usually protected and there is a
tendency for their rights to be applied to the original work by implication.

I am one of many who would like machine-discoverable and machine-readable
(i.e. semantic) theses. The original theses are often born in semantic form
(HTML, DOCX/OOXML or TeX) but then flattened into PDF. Many theses are only
available on a per-thesis basis, controlled by a portal/gatekeeper, which
destroys any possibility of Linked Open Data.

I am sure that most of the current lack of Linked Open Theses (LOT) is due
to ignorance of the value of Openness. I think that if we can explain
carefully and compellingly the value of LOT then many authors and many
institutions will welcome it.

This is a global challenge. Institutions are regulated by local degree
regulations (and these must of course be honoured). Countries can only
act for themselves (e.g. JISC, SURF(NL) and similar bodies elsewhere). The
OKF can do something that they cannot easily do:
* show the global vision
* create exemplars
* find and extol early adopters (and they already exist)
* support and coordinate the actual authors (many of whom want their theses
to be open).
* provide accurate and compelling information

Proposal
=======
(a) set up an open-Thesis mailing list and project/pirate page

(b) evolve a similar approach to the Panton Principles, applied to
theses. It would be something like:
1 author: make a clear statement of your wishes (do NOT rely on formal
licences to convey this)
2 author: identify which parts of your work do not involve third party
rights (e.g. graphic images or transcluded text). Label these clearly and in
machine-readable form; institution: support the author in this process
3 author (with institutional help): select an appropriate licence or set of
licences. (theses may contain text, source code, data and these all require
different licences).
4 institution: display the thesis and metadata and licences in
machine-readable form. Make it trivial for machines to ascertain (a) that
this is a thesis and (b) what rights the machine-reader has to re-use the
material.
Promote discovery of theses (e.g. through tables of contents).
5 institution: label theses as Open (e.g. with an OKF OpenThesis button)

(c) create exemplars for demonstration and advocacy
(d) engage with early-adopter repositories
(e) engage with regulators/funders/advocacy bodies: SPARC, JISC, Wellcome,
SURF, NSDL, OR, ETD, etc.
(f) design and populate an OpenThesis Bibliography (Table Of Contents) by a
mixture of crawling repositories and crowdsourcing. I would not expect this
to violate any rights
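
As a minimal sketch of what the per-work, machine-readable rights in steps
2 to 4 of (b) might look like, the Python below reads one thesis record and
answers the two questions a machine would ask: is this a thesis, and which
parts may be re-used under which licence. The record layout, the field names
(type, parts, third_party, licence) and the licences chosen are illustrative
assumptions only, not an existing OKF or repository schema.

```python
import json

# Hypothetical per-thesis rights record. All field names and values are
# assumptions made for illustration, not an agreed format.
RECORD = json.loads("""
{
  "type": "thesis",
  "title": "Example thesis",
  "author_statement": "I want this work to be freely re-usable.",
  "parts": [
    {"id": "text", "kind": "text",        "third_party": false, "licence": "CC-BY"},
    {"id": "code", "kind": "source code", "third_party": false, "licence": "MIT"},
    {"id": "data", "kind": "data",        "third_party": false, "licence": "PDDL"},
    {"id": "fig3", "kind": "image",       "third_party": true,  "licence": null}
  ]
}
""")


def is_thesis(record):
    """Question (a): does the record declare itself to be a thesis?"""
    return record.get("type") == "thesis"


def reusable_parts(record):
    """Question (b): map part id -> licence for the author's own parts.

    Parts involving third-party rights, or lacking an explicit licence,
    are left out rather than guessed at.
    """
    return {
        part["id"]: part["licence"]
        for part in record.get("parts", [])
        if not part.get("third_party") and part.get("licence")
    }


if __name__ == "__main__":
    print(is_thesis(RECORD))
    print(reusable_parts(RECORD))
```

The third-party image is excluded from the result, which reflects the
labelling asked for in step 2: anything not clearly the author's own is
treated as not re-usable until stated otherwise.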

Support and funding
================

The OKF is now a fundable body so I would expect that engagement with
generic funders (JISC, SURF, ARDS, NSDL, etc.) would be appropriate. I would
also hope that research funders (e.g. Wellcome, RCUK) would be sympathetic.

Technical Requirements
===================
(a) mailing list
(b) project pages
(c) probably some exemplars in CKAN or a special resource

I would see the technology being developed on openbiblio-dev and
#jiscopenbib as almost exactly what we need. It will create an OpenThesis
TOC and will also allow us to annotate individual theses for Openness.
I'd suggest this was organised by Country => Institution (=>Department). An
attraction of this is we get a formal list of institutions as a result.
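
A minimal sketch of the Country => Institution => Department organisation,
assuming each bibliography entry is a simple dictionary. The entry fields,
the sample titles and "Example University" are hypothetical placeholders; the
real openbiblio data model may look quite different.

```python
from collections import defaultdict

# Hypothetical bibliography entries; field names are assumptions.
ENTRIES = [
    {"country": "UK", "institution": "University of Cambridge",
     "department": "Chemistry", "title": "Thesis A", "open": True},
    {"country": "UK", "institution": "University of Cambridge",
     "department": "Physics", "title": "Thesis B", "open": False},
    {"country": "NL", "institution": "Example University",
     "department": "Chemistry", "title": "Thesis C", "open": True},
]


def build_toc(entries):
    """Group entries as country -> institution -> department -> [entries]."""
    toc = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for entry in entries:
        toc[entry["country"]][entry["institution"]][entry["department"]].append(entry)
    return toc


def institutions(toc):
    """The list of institutions that falls out of the grouping."""
    return sorted(
        (country, institution)
        for country, by_institution in toc.items()
        for institution in by_institution
    )


if __name__ == "__main__":
    toc = build_toc(ENTRIES)
    for country, institution in institutions(toc):
        print(country, institution, sorted(toc[country][institution]))
```

The "open" flag on each entry stands in for the per-thesis Openness
annotation; it is carried along in the grouping but not otherwise
interpreted here.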

Risks
====

I think the IP risks are small but should be considered. With goodwill from
the community they are negligible.

It's ambitious but it can easily be scaled per country or even per
institution. This would distribute most of the human involvement.

Many OKF people are probably actively involved in theses (doing research,
writing up, just submitted, etc.) so there is a large pool of talent!

P.



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069