[Open-data-census] open-data-census Digest, Vol 18, Issue 3

Pierre Chrzanowski pierre.chrzanowski at gmail.com
Wed Nov 5 14:32:38 UTC 2014


Thanks Rufus,

One of the major problems I see with this methodology (lack of chaining) is
that it actually allows us to assess different instances of publication for
a dataset.

For instance, what if UK Company Data had also been available online, but
not in bulk? What would the answers have been?

This leads to very different interpretations and contributions, as
exemplified by Simon.
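
To make the two readings concrete, here is a minimal sketch, not the actual
survey logic. The question keys and the `gate` parameter are hypothetical
names chosen for illustration. `independent` keeps every answer as submitted;
`chained` forces every question after the gate to 'no' when the gate question
is 'no'.

```python
# Hypothetical question keys, in survey order (an assumption, not the real list).
QUESTIONS = ["exists", "publicly_available", "online", "bulk"]


def independent(answers):
    """Independent reading: keep every answer exactly as submitted."""
    return {q: bool(answers.get(q, False)) for q in QUESTIONS}


def chained(answers, gate="exists"):
    """Chained reading: if the gate question is 'no', every later one is 'no'."""
    result = independent(answers)
    if not result[gate]:
        for q in QUESTIONS[QUESTIONS.index(gate) + 1:]:
            result[q] = False
    return result


# Illustrative answers for a file that is sold on demand (made-up values).
sold_on_demand = {
    "exists": True,
    "publicly_available": False,
    "online": True,
    "bulk": True,
}

print(sum(independent(sold_on_demand).values()))                         # 3
print(sum(chained(sold_on_demand, gate="exists").values()))              # 3
print(sum(chained(sold_on_demand, gate="publicly_available").values()))  # 1
```

The same submission gets three 'yes' answers or one depending only on which
question is treated as the gate, which is the kind of gap Simon describes.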

I think we should clarify the methodology to help us choose which dataset to
assess.

For instance, always prefer to assess publicly available data rather than
non-publicly available data, and then online rather than non-online, etc.
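
A rough sketch of such a preference rule, assuming each candidate publication
of a dataset is described by two flags. The `Publication` class, its field
names, and the example entries are all hypothetical. Only the two preferences
named above are encoded; whatever comes after "etc." is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """One way a dataset is published (hypothetical structure)."""
    label: str
    publicly_available: bool
    online: bool


def choose_instance(publications):
    """Pick the instance to assess: public first, then online.

    Tuples of booleans compare element by element, and True sorts above
    False, so max() applies the two preferences in order. On a tie the
    first candidate in the list wins. An empty list raises ValueError.
    """
    return max(publications, key=lambda p: (p.publicly_available, p.online))


# Made-up candidates, loosely modelled on the bulk-on-CD example.
candidates = [
    Publication("bulk copy on CD", publicly_available=True, online=False),
    Publication("web lookup", publicly_available=True, online=True),
    Publication("internal database", publicly_available=False, online=True),
]

print(choose_instance(candidates).label)
```

With a rule like this, every contributor would end up assessing the same
instance for a given dataset instead of picking one ad hoc.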

Best
Pierre

On Mon Nov 03 2014 at 5:30:51 PM Rufus Pollock <rufus.pollock at okfn.org>
wrote:

> On 3 November 2014 16:11, Pierre Chrzanowski <pierre.chrzanowski at gmail.com
> > wrote:
>
>> Sorry to keep going on, but I actually thought there were some evident
>> chains, such as: bulk or format are null if data is not publicly available
>> online. Otherwise it means that one has to have access to the unavailable
>> data to confirm the evidence.
>>
>
> These are excellent points Pierre, and we thought quite a bit about the
> implication chains last year (and have tried to build some into the survey
> logic).
>
> On the bulk the logic was this: in the UK you used to be able to get the
> Companies Register in bulk on CDs but not online. (So this is an example of
> bulk being true but online being false).
>
> Similarly, for format it is again the case that stuff could be available
> in a specific format but not publicly online.
>
>
>> For instance, I am being told that government spending data in France
>> exists in a reusable format and in bulk. But I cannot access the data, so
>> why should I believe this? Should I go to the Ministry?
>>
>
> I would say that is definitely a stretch: if data is not available to
> anyone then it would be impossible to know whether it is in bulk, so I would
> mark this as no or unsure in this case. Similarly for reusable. However, if
> e.g. the Ministry made the data available to researchers on CD-ROMs, you
> would be able to answer this even if it is not publicly available.
>
> Rufus
>
>
>> Then, there are actually some questions that consider public availability
>> implicitly in their definition such as for bulk [1]. Two questions are
>> chained then.
>>
>> I hope that we will be able to sort that out before we publish anything.
>> Otherwise, I know there are some people ready to fire :)
>>
>> Best
>>
>> [1] Data is available in bulk if the whole dataset can be downloaded or
>> accessed easily. Conversely it is considered non-bulk if the citizens are
>> limited to just getting parts of the dataset (for example, if restricted to
>> querying a web form and retrieving a few results at a time from a very
>> large database).
>>
>>
>>
>> On Mon Nov 03 2014 at 3:25:59 PM Mor Rubinstein <morchickit at gmail.com>
>> wrote:
>>
>>> Hi guys,
>>>
>>> Again, thanks for writing.
>>>
>>> The only chain that we mentioned in the tutorial is the following:
>>> If the data is not available, then the system will mark the rest of the
>>> questions as 'no'.
>>>
>>> There is no other chain in the system, and we expected each parameter
>>> to be taken into consideration independently. This is done, among other
>>> reasons, in order to allow different stakeholders in the open government
>>> sphere to understand what they need to focus on in order to improve their
>>> openness.
>>>
>>> I will update the reviewers guide, the site and the tutorial today in
>>> order to ensure consistency, and for the documentation for the next Index.
>>>
>>> Thank you guys for bringing it up, you are making the index better. :-)
>>>
>>> All the best,
>>> Mor
>>>
>>> On Mon, Nov 3, 2014 at 2:04 PM, Pierre Chrzanowski <
>>> pierre.chrzanowski at gmail.com> wrote:
>>>
>>>> Thanks Graeme,
>>>>
>>>> I think that Simon was referring to the transnational level criteria
>>>> for government spending data.
>>>>
>>>> @Christian, @Mor it would be good to clarify chained / dependent
>>>> questions. It is true there is no proper guideline on that.
>>>>
>>>> All the best
>>>> Pierre
>>>>
>>>>
>>>> On Mon Nov 03 2014 at 2:20:18 PM Graeme Jones <jonesiom at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Pierre
>>>>>
>>>>> 2/ and 4/  I had a specific email exchange with Christian / Mor to
>>>>> clarify chained or independent (independent) to ensure consistency ;O)
>>>>>
>>>>> 3b/  I think experienced people in the #opendata community typically
>>>>> side with the lowest common denominator, you are benchmarking to improve so
>>>>> hopefully not already perfect or nothing left to do!
>>>>>
>>>>> 3b/  similarly the issue is often willing volunteers and/or unpaid
>>>>> hours.  I might have been able to persuade someone else to independently
>>>>> contribute/review Isle of Man submissions but difficult to justify
>>>>> unquantified unpaid hours to do the same for other jurisdictions -- last
>>>>> time I did submissions for about 16 countries and this time I allocated any
>>>>> spare unpaid hours to briefly review Jersey (ran out of time on Guernsey)
>>>>> but added some data on other jurisdictions such as UAE, US Virgin Islands,
>>>>> etc.
>>>>>
>>>>> people that know what/how to look are thin on the ground in big
>>>>> countries never mind little countries, hence the importance of mentors,
>>>>> office hours initiatives etc
>>>>>
>>>>> 3b/  the push towards a localised UK OGL and financereports.gov.im
>>>>> were large steps in an offshore country and required *lots* of unpaid hours
>>>>> on lobbying, slidedecks, favours such as indirect legal opinion from HM
>>>>> Attorney General, frontline staff training on data cleansing, etc.
>>>>> sorry, perhaps I have missed something, but the financereports.gov.im
>>>>> microsite shows govt spending in a timescale at least as good as most of
>>>>> the best countries and better than most other countries and under a
>>>>> localised UK OGL -- the OGL in conjunction with independent criteria is
>>>>> largely why the Isle of Man is higher in the charts
>>>>>
>>>>> in fact the end result of a ranking last year was the Isle of Man
>>>>> Government requested membership of the Open Government Partnership, surely
>>>>> exactly what anyone in the open government movement should aspire to
>>>>> achieve?
>>>>>
>>>>> also scheduled discussions already include a shift to real time
>>>>> reporting of the national accounts with data visualisation as a
>>>>> minister/voter/taxpayer frontend
>>>>>
>>>>> Best regards,
>>>>> Graeme Jones
>>>>>
>>>>> On 3 November 2014 12:00, <open-data-census-request at lists.okfn.org>
>>>>> wrote:
>>>>>
>>>>>> Message: 1
>>>>>> Date: Mon, 03 Nov 2014 11:20:20 +0000
>>>>>> From: Pierre Chrzanowski <pierre.chrzanowski at gmail.com>
>>>>>> To: open-data-census <open-data-census at lists.okfn.org>
>>>>>> Cc: "okfn-fr-members at lists.okfn.org" <okfn-fr-members at lists.okfn.org
>>>>>> >,
>>>>>>         "Simon Chignard - data.gouv.fr" <simon at data.gouv.fr>
>>>>>> Subject: [Open-data-census] Serious inconsistencies in the application
>>>>>>         of      the methodology
>>>>>>
>>>>>> Hi list, I am forwarding a message from Simon Chignard, who is
>>>>>> concerned about the lack of quality and consistency in the current
>>>>>> submissions.
>>>>>>
>>>>>> I think his feedback should be carefully taken into account in the
>>>>>> review process.
>>>>>>
>>>>>> Best
>>>>>> Pierre
>>>>>>
>>>>>> PS: the text below is a Google translation of an email written in
>>>>>> French to the OKF France members list
>>>>>>
>>>>>> ---
>>>>>> Hello all,
>>>>>>
>>>>>> This weekend I spotted what seem to me to be serious inconsistencies
>>>>>> in the application of the methodology of the Open Data Index 2014. I
>>>>>> am alerting you because this raises the question of the reliability
>>>>>> of the tool.
>>>>>>
>>>>>> 1 / An example: the assessment of open Zipcodes / Postcodes.
>>>>>>
>>>>>> Consider the postal code file for Spain, Sweden, Canada and France.
>>>>>>
>>>>>> In these four countries, the situation is the same: a more or less
>>>>>> public operator (Correos, Postnummer, Canada Post and La Poste) sells
>>>>>> the postal code file on demand.
>>>>>>
>>>>>> Yet, these are the scores on the same file:
>>>>>>
>>>>>> Zipcode / Canada: 55%
>>>>>> http://global.census.okfn.org/entry/ca/postcodes
>>>>>>
>>>>>> Zipcode / Spain: 45%
>>>>>> http://global.census.okfn.org/entry/es/postcodes
>>>>>>
>>>>>> Zipcode / France: 10%
>>>>>> http://global.census.okfn.org/entry/fr/postcodes
>>>>>>
>>>>>> Zipcode / Sweden: 55%
>>>>>> http://global.census.okfn.org/entry/se/postcodes
>>>>>>
>>>>>>
>>>>>> 2 / What is at issue
>>>>>>
>>>>>> The question here is whether the criteria are chained or independent.
>>>>>>
>>>>>> In France we (collectively) considered the criteria to be chained.
>>>>>> This means that if the data is not available, then we mark all other
>>>>>> criteria red. However, in all the other countries I looked at, each
>>>>>> criterion was taken separately. They consider that data which is
>>>>>> legally sold and closed may still be available online, be up to date,
>>>>>> be downloadable in bulk, etc.
>>>>>>
>>>>>> I took the example of Zipcodes, but the same problem exists for other
>>>>>> evaluations, for example here:
>>>>>> http://global.census.okfn.org/entry/si/companies
>>>>>>
>>>>>> 3 / An assessment that differs between countries
>>>>>>
>>>>>> When we look at the evaluations in detail, we also see that the
>>>>>> criteria are applied more or less strictly.
>>>>>>
>>>>>> An example: Zipcode / Slovenia: 55%
>>>>>> http://global.census.okfn.org/entry/si/postcodes - the commentary
>>>>>> states:
>>>>>> Data is available from Post of Slovenia, but is hidden in HTML format,
>>>>>> not available in bulk and additional skills are needed to extract it.
>>>>>> Geodetska uprava (Slovenian equivalent of UK Ordnance Survey) resells
>>>>>> bulk data with additional GIS information.
>>>>>>
>>>>>> So the data has to be scraped, and yet it deserves a score of 55%?
>>>>>>
>>>>>> One for the road: Finland / Spending: 90%
>>>>>> http://global.census.okfn.org/entry/fi/spending - Certain assets data
>>>>>> are available on Finnish data portal Avoindata.fi. More information
>>>>>> from Netra will be opened in the future.
>>>>>>
>>>>>> There is clearly a problem in the application of the methodology as
>>>>>> described, which is meant to evaluate current availability and not
>>>>>> availability "in the future."
>>>>>>
>>>>>> 3 / A reviewer who is also the editor for a country
>>>>>>
>>>>>> I looked in detail at the ratings for the Isle of Man, which gets
>>>>>> such a good score for the Government Spending file (100%).
>>>>>> Here is the evaluation and comment:
>>>>>> http://global.census.okfn.org/entry/im/spending
>>>>>>
>>>>>>
>>>>>> The proposed link is this one: http://financereports.gov.im - it in
>>>>>> no way corresponds to the criteria of the methodology.
>>>>>>
>>>>>> The problem seems even more serious for this country: contrary to
>>>>>> the response Mor gave Pierre, it is one and the same person who
>>>>>> proposed the evaluation and validated it.
>>>>>>
>>>>>> 4 / Why is that a problem?
>>>>>>
>>>>>> There are therefore clearly major inconsistencies in how the criteria
>>>>>> are applied for each country. If the goal is to produce a ranking of
>>>>>> countries (and not to assess each one individually), that is a
>>>>>> problem. And even a serious problem, given that 10 places in the
>>>>>> ranking can hinge on about 10%!
>>>>>>
>>>>>> The only solution, it seems to me, is for the OKF to ensure that the
>>>>>> assessment is consistent for all countries. Otherwise the credibility
>>>>>> of the ranking is called into question.
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> PS: the issue had also already been raised in 2012 for the
>>>>>> classification of W3C
>>>>>> https://lists.okfn.org/pipermail/euopendata/2013-February/001153.html
>>>>>> - so I do not feel that the problem is only being discovered now.
>>>>>>
>>>>>> _______________________________________________
>>>>>> open-data-census mailing list
>>>>>> open-data-census at lists.okfn.org
>>>>>> https://lists.okfn.org/mailman/listinfo/open-data-census
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
>
> --
>
> Rufus Pollock
> Founder and President | skype: rufuspollock | @rufuspollock
> <https://twitter.com/rufuspollock>
> Open Knowledge <http://okfn.org/> - see how data can change the world
> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on
> Facebook <https://www.facebook.com/OKFNetwork> | Blog
> <http://blog.okfn.org/>
>
> The Open Knowledge Foundation is a not-for-profit organisation.  It is
> incorporated in England & Wales as a company limited by guarantee, with
> company number 05133759.  VAT Registration № GB 984404989. Registered
> office address: Open Knowledge Foundation, St John’s Innovation Centre,
> Cowley Road, Cambridge, CB4 0WS, UK.
>