[Open-data-census] Penalty for no bulk data not appropriate for realtime or big data

Jury Konga jkonga at sympatico.ca
Tue May 19 10:05:49 UTC 2015


I agree with both of you. Stephen, I think you’re spot on with your issue, and Rufus, I agree this would be a valuable discussion in the census forum.

 

Rufus, I have a more fundamental issue re: current point system versus a % openness score (which can accommodate “not applicable” scenarios) – would you agree that this is a separate subject for the forum discussion?

 

Cheers, Jury

 

Local Open Data Census Lead

Open Knowledge Canada

 

From: open-data-census [mailto:open-data-census-bounces at lists.okfn.org] On Behalf Of Rufus Pollock
Sent: May-19-15 4:50 AM
To: Stephen Gates
Cc: Open Data Local Census Admins
Subject: Re: [Open-data-census] Penalty for no bulk data not appropriate for realtime or big data

 

First, this is a great discussion to have - and I have some thoughts I'd like to share.

 

One procedural point: would you mind posting this on the discuss forum at:

 

https://discuss.okfn.org/c/open-data-index

 

We're gradually migrating more substantive discussions and threads there, and this is one that should definitely be there.

 

Rufus

 

On 19 May 2015 at 06:11, Stephen Gates <stephen.gates at me.com> wrote:

Hello,

 

I think the question for bulk data in the census needs to change. It is not always possible to publish open data in bulk. As pointed out in the Open Data Handbook <http://opendatahandbook.org/glossary/en/terms/bulk/>, publishing bulk data is not practical for real-time or big data.

 

Can I suggest that the current question is reworded from:

 

Is the data available in bulk? - Data is available in bulk if the whole dataset can be downloaded easily. It is not available in bulk, if access to the data is through a web page that provides access to only part of the database.

 

to something like:

 

Is the data available in bulk or via a real-time feed? - Data is available in bulk if the whole dataset can be downloaded easily. It is not available in bulk, if access to the data is through a web page that provides access to only part of the database. A real-time feed provides access to a subset of a database that changes frequently and is too large to download in bulk.

 

As an example, in my view, a real-time public transport feed in GTFS-RT <https://developers.google.com/transit/gtfs-realtime/> format should not be penalised 10 points for not being available in bulk.

 

 

What do you think? Should the question be changed? If so, what’s the process to change it (assuming most censuses reference the “master” question sheet)?

 

thanks

 

Stephen Gates
Australia’s Regional Open Data Census <http://australia.census.okfn.org> 

 


_______________________________________________
open-data-census mailing list
open-data-census at lists.okfn.org
https://lists.okfn.org/mailman/listinfo/open-data-census





 

-- 

Rufus Pollock

Founder and President | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>

Open Knowledge <http://okfn.org/> - see how data can change the world

http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/>
