[open-bibliography] Openbiblio Principles was: Virtual meeting today

Sun Dec 19 22:41:59 UTC 2010

On Sun, Dec 19, 2010 at 9:21 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:

> Quoting Peter Murray-Rust <pm286 at cam.ac.uk>:
>
>>> > 1. Do not use "free", which is ambiguous. Use "without cost" or "without
>>> > restriction" as appropriate.
>>
>> The Open Source terms are "gratis" and "libre" which work well there. They
>> have finally been carried over to Open Access where they also work well.
>> (If this had been done 10 years ago I believe that freedom of data would be
>> much advanced)
>>
>
> Peter, not sure what you are suggesting here: that we use 'gratis' and
> 'libre'

Yes - at least explore them

> or 'Open Access'?
>
>
No!!

>
>
>
>> It may be helpful to visualize my motivation as from a scientist who until
>> recently had no interaction with mainstream library practice. The motivation
>> springs from the fact that secondary publishers use metadata to control our
>> actions and also charge us money for it. We live in occupied territory.
>>
>> In science the actual location or condition of an artifact is irrelevant,
>> unlike monographs and incunabula. The article is platonic and there is a
>> platonic abstraction of the route to find and identify it - that's what I
>> mean by scientific bibliography. "Current contents" (does it still exist?)
>> is bibliography which should be ours by right.
>>
>>
>
> I think I'm beginning to understand. To me "locate" isn't limited to
> physical items -- essentially Google "locates" things for you with its
> search.

Yes!

> The OpenURL concept (used heavily for academic articles) is that of
> locating the most appropriate copy -- it recognizes that there is often more
> than one "instance" of a scientific article; some instances may be physical,
> some may be electronic; some are in the library, some are covered by the
> library's subscription. It is often referred to as a "locator service"
> (which I would feel comfortable using for something like Google as well).
>
>
>
Yes - most scientists don't actually care what instance they receive. In
some subjects the "final" version matters. However the difference between
the authors' final version (e.g. Word) and the publishers' PDF is
often irrelevant - scientists don't really care about pagination within a
journal (they only need the pages to locate and identify the article). In
some cases they prefer the authors' version as it hasn't been trashed into
unprocessable PDF.

> Does your term "address" have the sense of providing one or more actual
> links? In other words, would both of these be an address?:
>
>
>
Yes.

> Journal of X, vol 3 n 6, pp 746
>
>
This also acts as an Identifier

> http://journalofx/3/6/746
>
>
let's call that http://www.publisher.org/*journalofx/3/6/746*

There are some subtleties. Both require an address resolver. The first is
relatively useless for machines UNLESS there is something like Google. A
human may be able to solve this by going to a publisher web site and
browsing through (though some publishers' websites are so awful it's
unbelievable). On many of them I can't find how to browse the archives.

The second also requires an address resolver but it's so common you don't
think of it, yet the whole Net is built on it. It's the DNS, which looks up
registered domain names and translates them into IP addresses. This allows
the browser or user agent to navigate to the publisher site. Then the
publisher server decodes *journalofx/3/6/746* into the actual place in the
filestore (normally there is a close mapping). If there is a query, then
it's a database search.
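
As a rough illustration of the two stages described above, here is a minimal
Python sketch that splits the example address into the part the DNS resolves
(the host) and the part the publisher's server decodes (the path). The
function name split_address and the reading of the path segments as
journal/volume/issue/page are assumptions made for illustration only; real
publisher URLs need not follow this layout.

```python
from urllib.parse import urlsplit


def split_address(url):
    """Split a web address into the host (resolved via DNS) and the
    path segments (decoded by the publisher's own server)."""
    parts = urlsplit(url)
    # Drop empty strings produced by leading or trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    return {
        "host": parts.hostname,      # looked up in the DNS
        "path_segments": segments,   # interpreted by the publisher
        "query": parts.query,        # non-empty means a database search
    }


# Example address taken from the message; the layout is hypothetical.
addr = split_address("http://www.publisher.org/journalofx/3/6/746")
print(addr["host"])
print(addr["path_segments"])
```

Nothing here touches the network: the sketch only shows which part of the
string each resolver is responsible for.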

The universal DOI is also an address resolver. It turns unique strings into
web addresses. This is done by doi.org and publishers pay them a small fee
for each DOI minted. Again there is usually a split between the top-level
resolver and the publisher.
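
A minimal Python sketch of that string-to-address step, assuming only the
widely used convention that a DOI is resolved by appending it to
https://doi.org/. The function name doi_to_url is hypothetical, the check on
the "10." prefix is a simplification rather than full DOI syntax validation,
and 10.1000/182 is used purely as a sample identifier.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Turn a DOI string into the web address of the central resolver.

    The part before the first slash is the prefix (assigned at the top
    level); the part after it is the suffix (chosen by the publisher).
    """
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError("not a DOI of the form 10.<prefix>/<suffix>: %r" % doi)
    # Percent-encode unusual characters but keep the slashes readable.
    return "https://doi.org/" + quote(doi, safe="/")


sample_url = doi_to_url("10.1000/182")
print(sample_url)
```

The resolver then redirects to whatever address the publisher has registered
for that string, which is the second half of the split mentioned above.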

>
>
>
>>
>> I think I mean something different by "location" - I mean the ability to
>> find any copy of the platonic article whereas for a physical book this is
>> different for each exemplar. That is valuable for a particular library and
>> I'm happy for it to be considered but it's not part of my struggle. (BTW I
>> am supportive of all aspects of bibliography - I'm just keener on one
>> part!)
>>
>>
>
> Right. This is also what library functions aim to do.
>
>
>
>> I think the enumeration is valuable but need not be in the main body - it
>> could be an addendum. The purpose was to point out that these were things
>> that belonged by right to the commons. The great mistake of Open Access was
>> not spelling out in detail what they wanted and it was confused with
>> meaningless terms such as "light green". So we must not allow the publishers
>> to grab bits of our territory. The principles must - somewhere - address
>> that.
>>
>> This is not quibbling. Failure to free bibliography costs academia hundreds
>> of millions of dollars a year.
>>
>>
>
> It wasn't clear to me the purpose of that list. Yes, if you feel that
> particular elements are public domain, it would be good to provide a list.
> For readability, it may be best included in an appendix or addendum. I think
> that lists in the middle of text tend to drive readers away. :-)
>
>
>
Accepted

>
>>
>>> Regarding the location function we must first make clear what we mean
>>> by that (see above). I think Peter had something other in mind when he
>>> proposed this.
>>>
>>> > 5. I'm not sure why the heading in this first section is "Bibliographic
>>> > data already in the public domain", unless the intention is "Why we
>>> > maintain that bibliographic data is not covered by copyright." The
>>> > wording does not say that to me.
>>>
>>> The wording isn't clear but I couldn't think of anything better. It
>>> should indicate that the  section covers those parts of bibliographic
>>> descriptions which are not copyrightable and thus from the point of
>>> their creation in the public domain. Any ideas for a better wording?
>>> Then please change it.
>>>
>>
>> Yes, please do Karen! Adrian and I have wrestled with this and it reflects
>> our backgrounds. A third party would be valuable here. You need to try to
>> get into both our brains!
>>
>>
>
> I think it was the "already" that threw me on this. If you want it to be
> declarative, you could say:
>
> Bibliographic Data is Public Domain Data
>
>
>
That is our fundamental assertion.

> If the following paragraphs are an argument for open bib data, you could
> say:
>
> Why Bibliographic Data is Public Domain Information (or Data)
>
>
The problem is that the document could be a clarifying document (in which
case fine) or it could be a declaration of independence. Unfortunately I
think it's the latter. So it's

We hold these truths to be self-evident, that all men are created equal,
that they are endowed by their Creator with certain unalienable Rights,
that among these are Life, Liberty, and the Pursuit of Happiness.

Here "We" means the founding fathers, not the whole of humanity. They
asserted it as a political fact, not a universally held truth. I think it's
the same with bibliography.

>
>>
>>
>>> > 6. the section on URIs seems to be aimed at online resources, not
>>> > bibliographic data in the web environment. We have bib data in the web
>>> > environment that does not have URIs or URLs.
>>>
>>> ...on the other side we can identify print resources with URIs. I
>>> think the text in this part is ok because of being vague enough ("can
>>> be achieved" and now "is possible").
>>>
>>
>> I am a practising TBList and believe that all Web data *should* carry
>> URIs==URLs
>>
>>
>
>
> Well, maybe it should, but the vast majority does not, and we wouldn't want
> that data to be excluded from the discussion.

It isn't. It's

> Think about all of the citations in articles and digitized books.

I have - and that's what Open Bibliography and Open Citations are trying to
liberate

> I wouldn't over-emphasize URIs. That's not really about the data but about
> the technical format, and it seems to me that we want bib data, not just bib
> data in certain formats, to be open.

No - URIs are IDENTIFIERS. Every bibliographic description should have a
unique identifier

>
>
>
>
>> If a third party wishes to take Open (libre) bibliography and tag it and
>> possess the result, fine. What we have to avoid is the use of tagging to
>> restrict us from our common right.
>>
>>
>
> So maybe that is a good point to make: that although someone may add to
> bibliographic information, and those additions may be worthy of some kind of
> IP ownership, it should not be the case that those additions change the
> copyright status of the bib data itself.

Exactly

> There is an easy analogy in this for books: publishers often take a public
> domain title, add an introduction or some other text, and then republish the
> PD book. They can copyright the introduction or other added text, but the
> original PD work remains in the public domain. Publishers slap copyrights on
> these books, but if you look at the small print the (c) only applies to the
> new part.
>
>
Yes, but this often effectively removes the original from the PD because of
the lack of metadata. Uncertainty always comes down on the side of the
powerful. Copyright is only resolved by lawyers.

> We should say that adding to bibliographic description does NOT change the
> public domain status of the factual part of the description; only the added
> parts *may* be covered by copyright. By saying this, a whole database (like
> EBSCO) therefore contains some copyrighted material, but they have not
> gained copyright over the names of authors or titles of articles, only over
> the part that they contributed that is creative in nature (well, the
> creative part is from US law, so we probably can't use that term).
>

Yes this is our inalienable right

>
>
>
>>> Hmm, I don't see this problem. We could contribute to making "public
>>> domain" a common term. I think it is already used quite often.
>>
>> The Pantonistas spent 2 years on this. It's not easy, but Jordan Hatcher
>> and John Wilbanks converged on Public Domain being the only workable
>> solution for science data. This is because it *has* to conform with laws and
>> while these vary between jurisdictions the Public Domain was a universal
>> solution.
>>
>> PD doesn't apply to - say - scholarly content because *automatically* that
>> carries copyright - by law. In contrast we argued that scientific data did
>> NOT, de facto, carry copyright. However to make this abundantly clear we
>> need to go through a conscious (but now simple) process of dedicating the
>> content to the PD - the PDDL. My guess is that this works for bibliography
>> as well - it is not, de facto, copyright.
>>
>>
>
> I'll go along with this as long as we are very clear about what we mean by
> public domain and why certain aspects of bib data are not covered by
> copyright. Unfortunately, we haven't had a court decision to back us up, and
> so we are kind of whistling in the dark. I think it would make sense to say
> that we are declaring this to be in the public domain (which is a kind of
> challenge) rather than stating that "factually" it *is* in the public
> domain, which is not (yet) true.
>
>
>
>>> > 11. #3 of the recommendations should refer to ANY restrictions on the
>>> > data, including attribution. Once we start mashing up data, anything but
>>> > pure open use becomes impossible. So perhaps this point should be about
>>> > ANY restrictions, of which non-commercial is one, attribution is another,
>>> > and even share-alike is another. (Note, if W3C provenance work becomes a
>>> > reality, we could say that people should pass on provenance data that is
>>> > received. This wouldn't so much be for the rights issue, in my mind, but
>>> > so that people can make selections based on whose data they trust most.
>>> > One issue with provenance is that it could give people a way to attempt
>>> > to control their data, and we will probably need to address that if it
>>> > becomes a reality.)
>>>
>>> I don't agree with this. The recommendations build on each other and
>>> it's #4 which says what you propose for #3. So you basically are
>>> proposing deleting #3 and - if we do so - we could also delete #1 and
>>> #2 because anybody who complies with #4 also complies with the three
>>> principles before that.
>>>
>>>
>>
>
> I went back and read the earlier statements and it is true that they would
> all have to be removed. So I think it would be worth discussing this. After
> all, if we've spent the entire document saying that factual bib data is in
> the public domain, then there shouldn't be any need for talk about licenses.
> I'm now quite confused about the overall goals of this document.
>
>
>
>
>> FWIW I am stuck in Barcelona because of the snow
>>
>>
>
>
> I can think of worse places, and hope you are at least comfortable, warm,
> with a drink in hand. :-)
>
> kc
>
> --
> Karen Coyle
>   kcoyle at kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>
>
> _______________________________________________
> open-bibliography mailing list
> open-bibliography at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-bibliography
>
>

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069