[Open-data-census] Penalty for no bulk data not appropriate for realtime or big data
Jury Konga
jkonga at sympatico.ca
Tue May 19 10:05:49 UTC 2015
I agree with both of you. Stephen, I think you’re spot on with your issue, and Rufus, I agree this would be a valuable discussion in the census forum.
Rufus, I have a more fundamental issue re: current point system versus a % openness score (which can accommodate “not applicable” scenarios) – would you agree that this is a separate subject for the forum discussion?
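To make the difference concrete, here is a rough sketch of the two scoring approaches. It is only an illustration: the question names and weights are hypothetical, apart from the 10 points for bulk availability mentioned in Stephen's message, and the function names are made up rather than taken from the census code.

```python
from typing import Dict, Optional

# Hypothetical weights. Only the 10 points for "bulk" comes from this thread;
# the other question names and values are assumptions for illustration.
WEIGHTS: Dict[str, int] = {
    "openly_licensed": 30,
    "machine_readable": 15,
    "bulk": 10,
    "up_to_date": 10,
}

# An answer is True (yes), False (no) or None (not applicable).
Answers = Dict[str, Optional[bool]]


def point_score(answers: Answers) -> int:
    """Point system: add up the weights of every question answered yes."""
    return sum(w for q, w in WEIGHTS.items() if answers[q] is True)


def percent_openness(answers: Answers) -> float:
    """Percentage of the applicable points that the dataset earned."""
    applicable = sum(w for q, w in WEIGHTS.items() if answers[q] is not None)
    if applicable == 0:
        return 0.0  # nothing applies, so there is nothing to score
    return 100.0 * point_score(answers) / applicable


# A real-time feed where the bulk question is marked "not applicable".
realtime_feed: Answers = {
    "openly_licensed": True,
    "machine_readable": True,
    "bulk": None,
    "up_to_date": True,
}

print(point_score(realtime_feed))       # points earned under the point system
print(percent_openness(realtime_feed))  # share of applicable points earned
```

Under the point system the feed can never reach the maximum, because the bulk points are unreachable; under the percentage approach the bulk question simply drops out of the denominator.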
Cheers Jury
Local Open Data Census Lead
Open Knowledge Canada
From: open-data-census [mailto:open-data-census-bounces at lists.okfn.org] On Behalf Of Rufus Pollock
Sent: May-19-15 4:50 AM
To: Stephen Gates
Cc: Open Data Local Census Admins
Subject: Re: [Open-data-census] Penalty for no bulk data not appropriate for realtime or big data
First, this is a great discussion to have - and I have some thoughts I'd like to share.
One procedural point: would you mind posting this on the discuss forum at:
https://discuss.okfn.org/c/open-data-index
We're gradually migrating more substantive discussions and threads there, and this is one that should definitely be there.
Rufus
On 19 May 2015 at 06:11, Stephen Gates <stephen.gates at me.com> wrote:
Hello,
I think the bulk data question in the census needs to change. It is not always possible to publish open data in bulk. As pointed out in the Open Data Handbook (http://opendatahandbook.org/glossary/en/terms/bulk/), publishing bulk data is not practical for realtime or big data.
Can I suggest that the current question is reworded from:
Is the data available in bulk? - Data is available in bulk if the whole dataset can be downloaded easily. It is not available in bulk, if access to the data is through a web page that provides access to only part of the database.
to something like:
Is the data available in bulk or via a real-time feed? - Data is available in bulk if the whole dataset can be downloaded easily. It is not available in bulk, if access to the data is through a web page that provides access to only part of the database. A real-time feed provides access to a subset of a database that changes frequently and is too large to download in bulk.
As an example, in my view, a real-time public transport feed in GTFS-RT <https://developers.google.com/transit/gtfs-realtime/> format should not be penalised 10 points for not being available in bulk.
What do you think? Should the question be changed? If so, what’s the process to change it (assuming most censuses reference the “master” question sheet)?
thanks
Stephen Gates
Australia’s Regional Open Data Census <http://australia.census.okfn.org>
--
Rufus Pollock
Founder and President | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
Open Knowledge <http://okfn.org/> - see how data can change the world
@okfn <http://twitter.com/OKFN> | Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/>