[open-science] Share Alike? Or not?

Carl Boettiger cboettig at gmail.com
Thu Jun 14 17:25:39 UTC 2012

I believe I've encountered problems with attribution stacking. Citations
that appear only in the supplement of journals are frequently not counted
in the major commercial citation metrics indices, and many journals limit
the number of citations.

For instance, I programmatically pull data from fishbase.org, which claims
that its content is cc-by-nc, and asks that I attribute authors of the
original paper in which the data was reported.  Since the queries need to
look at all the data to report the subset of numbers that match my query,
shouldn't I probably cite all 47,300 references and not just the hundreds
of sources that provide the final data?

Perhaps more importantly, I find the connection between an attribution
license and an academic citation confusing.  Isn't citation a cultural
norm, not a legal requirement of copyright?  An attribution license
requires that I acknowledge the author in some unspecified way (i.e. it
isn't clear that I need to acknowledge them in a way that boosts their
citation count in ISI), if I share or remix and share their work.  When I
cite an academic paper, I am not sharing or remixing work.  I feel
uncomfortable with the use of an attribution license to "enforce" citation
practices.  If attribution doesn't equate to academic citation, then
perhaps the problem of "stacking" is more negligible.

On the other hand, if in using data from the repository I should attribute
both the repository and the original source, then the attribution stacking
problem introduced by copyright is a greater burden than the attribution
problem introduced by citation practices.  Is this any different than
citing a review or meta-analysis that determines "92 out of 120 studies
showed x" without citing the original articles?  When the goal of citation
is only reproducible research, there is no need to stack those citations
since it is easy to follow the chain.  But if the copyright license on the
data enforces citing all elements in the chain, then it is the terms of
copyright, not citation norms, which pose the greater burden of stacking.

Confused myself,


On Thu, Jun 14, 2012 at 7:51 AM, Diane Cabell <dc at icommons.org> wrote:

> Blush.  Yes, PDDL.
> Not a miner myself, so hard for me to know whether it is a frequent or
> serious problem.  A good question that deserves more than mere assertion.
> There are possible solutions but we don't have any consensus on whether
> they would be acceptable to authors.  As you suggest, one could throw all
> the citations into an attribution database -- whether or not data from any
> particular source was actually used -- and link to it from the resulting
> article/re-use.  But if the article is distributed/archived in hard copy
> version, authors may feel that this pushes them too far away from readers
> to get the recognition they would get if their name actually appeared in
> footnotes of the hard copy.   What if the link rots?  What if the final
> re-use only uses about 4% of all those cite-in-case-of-doubt sources?
>  Would that not be confusing to those who attempt to reproduce results? I
> don't know.  Not technically gifted enough.
> And the precise presentation of the citation could be more burdensome if,
> for example, the author requires any special format (as is possible with CC
> BY) or if all of the required citation elements are not easily obtainable.
> dc
> On Jun 14, 2012, at 2:44 PM, Rufus Pollock wrote:
> > On 14 June 2012 14:31, Diane Cabell <dc at icommons.org> wrote:
> > [...]
> >> Attribution stacking is a serious problem for those who are trying to mine
> >> large collections of data.  It can be difficult to track small bits that are
> >> pulled out from different sources.  The list of authors to attribute might
> >> be longer than the article itself.  Consider using CC Zero or ODbL for your
> >> data.
> >
> > Has anyone actually ever found attribution stacking to be a problem? I
> > hear it said quite frequently that this *could* be a problem but it
> > seems to me there are fairly easy ways to deal with "attribution
> > stacking" so don't think this is really an issue (for more detail on
> > why see these comments from a few years ago:
> > <http://blog.okfn.org/2009/02/09/comments-on-the-science-commons-protocol-for-implementing-open-access-data/>).
> >
> > Also, as a minor aside, I assume you meant the PDDL (Public Domain
> > Dedication and License) rather than the ODbL in this context (ODbL =
> > Attribution / Share-Alike for data).
> >
> > Rufus

Carl Boettiger
UC Davis