[open-bibliography] More needed to define "open"

Tom Morris tfmorris at gmail.com
Tue Apr 23 16:20:46 UTC 2013

It's a good reminder that there are multiple things which influence
how "open" something is, but I'm not following the first part of this:

On Tue, Apr 23, 2013 at 11:05 AM, Karen Coyle <kcoyle at kcoyle.net> wrote:
> One can
> announce a data license of CC0 but if the data itself is behind a paywall,
> or otherwise isn't available for use, then the license isn't doing anyone
> any good.

Is this in response to a real-world example?  I'm having a hard time
reconciling CC0 with a paywall ToS/license.  Unless of course, someone
took CC0 data and put it behind a paywall -- which is, of course,
perfectly legal in the same way that selling GPL'd software is.

> It seems to me that there are at least 2 ways that data becomes truly open:
> 1. There is a data dump that is openly available to anyone who wishes to
> download it. This solution has problems relating to updating, however, and
> puts the burden of making the data usable on the recipient. It's a solution,
> but perhaps not the best.
> 2. The data owner provides an open interface that allows searching and
> linking. Linking needs to be bi-directional -- that the data can link out,
> but also that others can "link in."

I think both bulk data dumps and APIs are useful for different types
of access, so ideally both should be offered.  Stable RDF URLs that
people can link to are also useful, but somewhat orthogonal (and their
usefulness depends on whether you've got the RDF religion or not).
Stable URLs allow linking in, but linking out is a separate problem
which requires investment.

Your point about updates hints at a whole host of issues that people
don't often think about.  Can I get updates in real-time or only
batched every N months? Do I get the updates as updates or do I have
to figure them out myself by diff'ing the two dumps?  Do I get the
provenance of who made each update (perhaps important for me to figure
out if I think they're reliable or not)?  Is the updating process a
one way street or is the database "open" enough that it accepts my
updates too and merges them back into the master (or is there some
type of federation with multiple "master"s)?
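
To make the "figure them out myself" case concrete, here is a minimal
sketch in Python of diff'ing two dumps.  It assumes each dump has
already been parsed into a dict keyed by a stable record ID; the
function name and the sample records are made up for illustration and
don't correspond to any particular provider's format.

```python
def diff_dumps(old, new):
    """Compare two dumps given as {record_id: record} dicts.

    Returns (added, removed, changed), where changed maps each
    record ID to an (old_record, new_record) pair.
    """
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Hypothetical sample data standing in for two successive dumps.
old_dump = {"b1": {"title": "Moby Dick"}, "b2": {"title": "Emma"}}
new_dump = {"b1": {"title": "Moby-Dick"}, "b3": {"title": "Ulysses"}}

added, removed, changed = diff_dumps(old_dump, new_dump)
```

Note that this only tells you what changed between the two dumps, not
who changed it or when, so the provenance question still has to be
answered by the provider.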

I think you need to distinguish between open data and a free service
though.  While it would be great for data consumers to say that
unlimited free API calls are an attribute of open data, I don't think
it's reasonable to assume that service providers will donate unlimited
funding for this.

What I like to see:

1. Bulk data dumps on a regular, frequent basis.  I think DBpedia does
quarterly, OpenLibrary monthly, & Freebase weekly.  I like to see at
least monthly.
2. Live update notifications or a feed - this could be RSS or some type of API
3. API with a reasonable free quota that allows searching and querying
for casual users
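
As a rough illustration of what a "reasonable free quota" in item 3
could look like on the provider side, here is a minimal sketch in
Python of a per-key daily request counter.  The class name, the
per-day window, and the injectable clock are all assumptions of mine
for the example, not a description of how any of the services above
actually meter their APIs.

```python
import time


class DailyQuota:
    """Allow each API key up to `limit` requests per UTC day (illustrative only)."""

    def __init__(self, limit, clock=time.time):
        self.limit = limit
        self.clock = clock   # injectable so the behaviour can be tested
        self.usage = {}      # api_key -> (day_number, requests_so_far)

    def allow(self, api_key):
        day = int(self.clock() // 86400)  # whole days since the epoch
        last_day, count = self.usage.get(api_key, (day, 0))
        if last_day != day:
            count = 0        # a new day has started, so reset the counter
        if count >= self.limit:
            return False     # free quota exhausted for today
        self.usage[api_key] = (day, count + 1)
        return True
```

A casual user stays under the limit and never notices it; a heavy user
gets refused and can fall back to the bulk dump instead.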
