[open-data-handbook] Wording relating to cost of data?

Rufus Pollock rufus.pollock at okfn.org
Thu Mar 21 12:25:25 UTC 2013


On 19 March 2013 07:15, Peter Krantz <peter at peterkrantz.se> wrote:
>
> Hi!
>
> In the ODH there are some places where cost of access to data is
> discussed. From time to time I meet people who are confused regarding
> the possibility of charging for data.
>
> For example this page has ambiguous statements:
> http://opendatahandbook.org/en/what-is-open-data/index.html
>
> "Open data is data that can be freely used, reused and redistributed
> by anyone - subject only, at most, to the requirement to attribute and
> sharealike."
>
> and in the first bullet point below:
>
> "Availability and Access: the data must be available as a whole and at
> no more than a reasonable reproduction cost, preferably by downloading
> over the internet."

Good point. The first quote is the summary of the Open Definition and
the second quote seems to be a slight misquote of point 1 of the formal
Open Definition, which says:

The work shall be available as a whole and at no more than a
reasonable reproduction cost, preferably downloading via the Internet
without charge ...

Seems like we should correct that second quote.

> So in the first paragraph it is free but in the bullet point it seems
> like it is OK to charge for data. The second statement has been used

Strictly it is *ok* to charge for data. What you must do is make the
data available in bulk at no more than the cost of reproduction (which
in general will be free or nearly free).

> as an argument by people from a government agency that has a business
> model where a single row of data costs about 0.6 EUR (getting the entire
> database would cost around 400 000 EUR). As "open data" is gaining in
> popularity they would like to be part of that and thus consider the
> 0.6 EUR a "reasonable reproduction cost".

But that can't be the cost of reproduction. The cost of reproduction
is essentially 0 for a single row, and even bulk access to a whole DB
running to GBs costs only cents today (so little that it's basically
not worth charging ...)

> I think the Open data handbook has to be clarified to reduce
> ambiguity. Expensive data is not open data and maybe open data
> definitions need to be at the "end of the scale" stressing that data
> need to be free to be truly open. Experience from discussions about
> software patents (RAND terms etc) shows that "Reasonable" can mean
> very different things to different people.

I think http://opendefinition.org/okd/ is pretty clear and we should
inline that pretty much directly into the handbook.

> As a second alternative, ambiguity can be reduced by providing an
> example of what "reasonable reproduction costs" can be, maybe by
> explaining the cost of the medium (e.g. a DVD) and that it only
> applies to a dataset as a whole.

Agreed as per above. Delighted if you want to submit a pull request
:-) https://github.com/okfn/opendatahandbook

Rufus