[Open-data-census] open-data-census Digest, Vol 18, Issue 3

Rufus Pollock rufus.pollock at okfn.org
Wed Nov 5 14:36:13 UTC 2014


On 5 November 2014 14:32, Pierre Chrzanowski <pierre.chrzanowski at gmail.com>
wrote:

> Thanks Rufus,
>
> One of the major problems I see with this methodology (lack of chaining) is
> that it actually allows us to assess different instances of publication for
> a dataset.
>
> For instance, if UK Company Data was also available online but not in
> bulk, what would the answers have been?
>

Obviously, the online one by default, but I feel this is pretty much an edge
case.


> This leads to very different interpretations and contributions as
> exemplified by Simon.
>

What are the exact examples where this has caused a problem (my apologies
for missing this if already said)?


> I think we should clarify methodology to help us choose which dataset to
> assess.
>
> For instance, always prefer to assess publicly available data rather than
> non-publicly available data, and then online rather than non-online, etc.
>

That definitely seems sensible. I thought it would be implied but, as you
say, spelling it out could definitely be useful.
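
A rough sketch of that preference order, just to make it concrete. The names
`Publication` and `pick_instance_to_assess` are made up for illustration and
are not part of the census tooling, and the two candidates are only loosely
based on the company data case discussed in this thread:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """One way a dataset is published (hypothetical model, not census code)."""
    label: str
    public: bool   # anyone can get the data
    online: bool   # reachable on the web


def pick_instance_to_assess(candidates):
    """Prefer public over non-public, then online over non-online."""
    # Tuples of booleans compare left to right, and True sorts above False,
    # so `public` dominates and `online` only breaks ties.
    return max(candidates, key=lambda p: (p.public, p.online))


# Illustrative instances only.
candidates = [
    Publication("bulk on CD", public=True, online=False),
    Publication("online lookup, not bulk", public=True, online=True),
]
chosen = pick_instance_to_assess(candidates)
print(chosen.label)
```

Further criteria (the "etc." above) could be appended to the key tuple in
order of priority.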

Rufus



> Best
> Pierre
>
> On Mon Nov 03 2014 at 5:30:51 PM Rufus Pollock <rufus.pollock at okfn.org>
> wrote:
>
>> On 3 November 2014 16:11, Pierre Chrzanowski <
>> pierre.chrzanowski at gmail.com> wrote:
>>
>>> Sorry to keep going on but I actually thought there were some evident
>>> chains, such as: bulk or format are null if data is not publicly available
>>> online. Otherwise it means that one has to be able to access the
>>> unavailable data to confirm the evidence.
>>>
>>
>> These are excellent points Pierre and we thought quite a bit about the
>> implication chains last year (and have tried to build some into the survey
>> logic).
>>
>> On the bulk the logic was this: in the UK you used to be able to get the
>> Companies Register in bulk on CDs but not online. (So this is an example of
>> bulk being true but online being false).
>>
>> Similarly, for format it is again the case that stuff could be available
>> in a specific format but not publicly online.
>>
>>
>>> For instance, I am being told that government spending data in France
>>> exists in a reusable format and in bulk. But I cannot access the data, so
>>> why should I believe this? Should I go to the Ministry?
>>>
>>
>> I would say that is definitely a stretch: if data is not available to
>> anyone then it would be impossible to know whether it is in bulk, so I
>> would mark this as no or unsure in this case. Similarly, on reusable.
>> However, if e.g. the
>> Ministry made the data available to researchers on CD-ROMs you would be
>> able to answer this even if not publicly available.
>>
>> Rufus
>>
>>
>>> Then, there are actually some questions that consider public
>>> availability implicitly in their definition such as for bulk [1]. Two
>>> questions are chained then.
>>>
>>> I hope that we will be able to sort that out before we publish anything.
>>> Otherwise, I know there are some people ready to fire :)
>>>
>>> Best
>>>
>>> [1] Data is available in bulk if the whole dataset can be downloaded or
>>> accessed easily. Conversely it is considered non-bulk if the citizens are
>>> limited to just getting parts of the dataset (for example, if restricted to
>>> querying a web form and retrieving a few results at a time from a very
>>> large database).
>>>
>>>
>>>
>>> On Mon Nov 03 2014 at 3:25:59 PM Mor Rubinstein <morchickit at gmail.com>
>>> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> Again, thanks for writing.
>>>>
>>>> The only chain that we mentioned in the tutorial is as follows:
>>>> If the data is not available, then the system will mark the rest of the
>>>> questions as 'no'.
>>>>
>>>> There is no other chain in the system, and we expected each parameter
>>>> to be taken into consideration independently. This is done, among other
>>>> reasons, to allow different stakeholders in the open government sphere
>>>> to understand what they need to focus on in order to improve their
>>>> openness.
>>>>
>>>> I will update the reviewers guide, the site and the tutorial today in
>>>> order to ensure that we have consistency, and for the documentation of
>>>> the next Index.
>>>>
>>>> Thank you guys for bringing it up, you are making the index better. :-)
>>>>
>>>> All the best,
>>>> Mor
>>>>
>>>> On Mon, Nov 3, 2014 at 2:04 PM, Pierre Chrzanowski <
>>>> pierre.chrzanowski at gmail.com> wrote:
>>>>
>>>>> Thanks Graeme,
>>>>>
>>>>> I think that Simon was referring to the transnational level criteria
>>>>> for government spending data.
>>>>>
>>>>> @Christian, @Mor would be good to clarify chained / dependent
>>>>> questions. It is true there is no proper guideline on that.
>>>>>
>>>>> All the best
>>>>> Pierre
>>>>>
>>>>>
>>>>> On Mon Nov 03 2014 at 2:20:18 PM Graeme Jones <jonesiom at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Pierre
>>>>>>
>>>>>> 2/ and 4/  I had a specific email exchange with Christian / Mor to
>>>>>> clarify chained or independent (independent) to ensure consistency ;O)
>>>>>>
>>>>>> 3b/  I think experienced people in the #opendata community typically
>>>>>> side with the lowest common denominator, you are benchmarking to improve so
>>>>>> hopefully not already perfect or nothing left to do!
>>>>>>
>>>>>> 3b/  similarly the issue is often willing volunteers and/or unpaid
>>>>>> hours.  I might have been able to persuade someone else to independently
>>>>>> contribute/review Isle of Man submissions but difficult to justify
>>>>>> unquantified unpaid hours to do the same for other jurisdictions -- last
>>>>>> time I did submissions for about 16 countries and this time I allocated any
>>>>>> spare unpaid hours to briefly review Jersey (ran out of time on Guernsey)
>>>>>> but added some data on other jurisdictions such as UAE, US Virgin Islands,
>>>>>> etc.
>>>>>>
>>>>>> people that know what/how to look are thin on the ground in big
>>>>>> countries never mind little countries, hence the importance of mentors
>>>>>> office hours initiatives etc
>>>>>>
>>>>>> 3b/  the push towards a localised UK OGL and financereports.gov.im
>>>>>> were large steps in an offshore country and required *lots* of unpaid hours
>>>>>> on lobbying, slidedecks, favours such as indirect legal opinion from HM
>>>>>> Attorney General, frontline staff training on data cleansing, etc.
>>>>>> sorry, perhaps I have missed something, but the financereports.gov.im
>>>>>> microsite shows govt spending in a timescale at least as good as most of
>>>>>> the best countries and better than most other countries and under a
>>>>>> localised UK OGL -- the OGL in conjunction with independent criteria is
>>>>>> largely why the Isle of Man is higher in the charts
>>>>>>
>>>>>> in fact the end result of a ranking last year was the Isle of Man
>>>>>> Government requested membership of the Open Government Partnership, surely
>>>>>> exactly what anyone in the open government movement should aspire to
>>>>>> achieve?
>>>>>>
>>>>>> also scheduled discussions already include a shift to real time
>>>>>> reporting of the national accounts with data visualisation as a
>>>>>> minister/voter/taxpayer frontend
>>>>>>
>>>>>> Best regards,
>>>>>> Graeme Jones
>>>>>>
>>>>>> On 3 November 2014 12:00, <open-data-census-request at lists.okfn.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Message: 1
>>>>>>> Date: Mon, 03 Nov 2014 11:20:20 +0000
>>>>>>> From: Pierre Chrzanowski <pierre.chrzanowski at gmail.com>
>>>>>>> To: open-data-census <open-data-census at lists.okfn.org>
>>>>>>> Cc: "okfn-fr-members at lists.okfn.org" <okfn-fr-members at lists.okfn.org>,
>>>>>>>         "Simon Chignard - data.gouv.fr" <simon at data.gouv.fr>
>>>>>>> Subject: [Open-data-census] Serious inconsistencies in the application
>>>>>>>         of the methodology
>>>>>>>
>>>>>>> Hi list, I am forwarding a message from Simon Chignard who is
>>>>>>> concerned about the lack of quality and consistency in the current
>>>>>>> submissions.
>>>>>>>
>>>>>>> I think his feedback should be carefully taken into account in the
>>>>>>> reviewing process.
>>>>>>>
>>>>>>> Best
>>>>>>> Pierre
>>>>>>>
>>>>>>> PS: the text below is a Google translation of an email written in
>>>>>>> French to the OKF France members list
>>>>>>>
>>>>>>> ---
>>>>>>> Hello all,
>>>>>>>
>>>>>>> This weekend I spotted what seem to me to be serious inconsistencies
>>>>>>> in the application of the methodology of the Open Data Index 2014. I
>>>>>>> am alerting you because this raises the question of the reliability
>>>>>>> of the tool.
>>>>>>>
>>>>>>> 1 / An example: the assessment of open Zipcodes / Postcodes.
>>>>>>>
>>>>>>> Consider the postal code file for Spain, Sweden, Canada and France.
>>>>>>>
>>>>>>> In these four countries, the situation is the same: a more or less
>>>>>>> public operator (Correos, Postnummer, Canada Post and La Poste)
>>>>>>> sells the postal code file on demand.
>>>>>>>
>>>>>>> Yet, these are the scores on the same file:
>>>>>>>
>>>>>>> Zipcode / Canada: 55%
>>>>>>> http://global.census.okfn.org/entry/ca/postcodes
>>>>>>>
>>>>>>> Zipcode / Spain: 45%
>>>>>>> http://global.census.okfn.org/entry/es/postcodes
>>>>>>>
>>>>>>> Zipcode / France: 10%
>>>>>>> http://global.census.okfn.org/entry/fr/postcodes
>>>>>>>
>>>>>>> Zipcode / Sweden: 55%
>>>>>>> http://global.census.okfn.org/entry/se/postcodes
>>>>>>>
>>>>>>>
>>>>>>> 2 / What is at issue
>>>>>>>
>>>>>>> The question here is whether the criteria are chained or
>>>>>>> independent.
>>>>>>>
>>>>>>> In France we (collectively) considered the criteria to be chained.
>>>>>>> This means that if the data is not available then we mark all other
>>>>>>> criteria red.
>>>>>>> However, in all the other countries I looked at, each criterion was
>>>>>>> taken separately. They consider that data which is legally sold and
>>>>>>> closed may still be available online, be current, be downloadable
>>>>>>> in bulk, etc.
>>>>>>>
>>>>>>> I took the example of Zipcodes but there is the same problem for
>>>>>>> other evaluations, for example here:
>>>>>>> http://global.census.okfn.org/entry/si/companies
>>>>>>>
>>>>>>> 3 / An assessment that differs between countries
>>>>>>>
>>>>>>> When we look at the evaluations in detail, we also see that the
>>>>>>> criteria are applied more or less strictly.
>>>>>>>
>>>>>>> An example: Zipcode / Slovenia: 55%
>>>>>>> http://global.census.okfn.org/entry/si/postcodes - the commentary
>>>>>>> states:
>>>>>>> Data is available from Post of Slovenia, but is hidden in HTML
>>>>>>> format, not available in bulk and additional skills are needed to
>>>>>>> extract it. Geodetska uprava (Slovenian equivalent of UK Ordnance
>>>>>>> Survey) resells bulk data with additional GIS information.
>>>>>>>
>>>>>>> So data that merely has to be scraped deserves a score of 55%?
>>>>>>>
>>>>>>> One for the road: Finland / Spending: 90%
>>>>>>> http://global.census.okfn.org/entry/fi/spending - Certain assets
>>>>>>> data are available on Finnish data portal Avoindata.fi. More
>>>>>>> information from Netra will be opened in the future.
>>>>>>>
>>>>>>> There is clearly a problem in applying the methodology as
>>>>>>> described, which is meant to evaluate current availability and not
>>>>>>> availability "in the future."
>>>>>>>
>>>>>>> 3 / A reviewer who is also the editor for a country
>>>>>>>
>>>>>>> I looked in detail at the ratings for the Isle of Man, which gets
>>>>>>> such a good score for its Government Spending file (100%).
>>>>>>> The evaluation and comment are here:
>>>>>>> http://global.census.okfn.org/entry/im/spending
>>>>>>>
>>>>>>>
>>>>>>> The proposed link is this one: http://financereports.gov.im - it in
>>>>>>> no way corresponds to the criteria of the methodology.
>>>>>>>
>>>>>>> The problem seems even more serious for this country: contrary to
>>>>>>> the response Mor gave to Pierre, it is one and the same person who
>>>>>>> proposed the evaluation and validated it.
>>>>>>>
>>>>>>> 4 / Why is that a problem?
>>>>>>>
>>>>>>> There are therefore clearly major inconsistencies in how the
>>>>>>> criteria are applied to each country. If the goal is to produce a
>>>>>>> ranking of countries (and not to assess each one individually),
>>>>>>> that is a problem. And even a serious problem, to the extent that
>>>>>>> 10 places in the ranking can be decided by close to 10%!
>>>>>>>
>>>>>>> The only solution, it seems to me, is for the OKF to ensure that
>>>>>>> the assessment is consistent for all countries. Otherwise it is the
>>>>>>> credibility of the ranking that is called into question.
>>>>>>>
>>>>>>> Simon
>>>>>>>
>>>>>>> PS: the issue had also already been raised in 2012 for the W3C
>>>>>>> ranking:
>>>>>>> https://lists.okfn.org/pipermail/euopendata/2013-February/001153.html
>>>>>>> - so I do not feel that the problem is only being discovered now.
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> open-data-census mailing list
>>>>>>> open-data-census at lists.okfn.org
>>>>>>> https://lists.okfn.org/mailman/listinfo/open-data-census
>>
>>
>> --
>>
>> Rufus Pollock
>> Founder and President | skype: rufuspollock | @rufuspollock
>> <https://twitter.com/rufuspollock>
>> Open Knowledge <http://okfn.org/> - see how data can change the world
>> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on
>> Facebook <https://www.facebook.com/OKFNetwork> | Blog
>> <http://blog.okfn.org/>
>>
>> The Open Knowledge Foundation is a not-for-profit organisation.  It is
>> incorporated in England & Wales as a company limited by guarantee, with
>> company number 05133759.  VAT Registration № GB 984404989. Registered
>> office address: Open Knowledge Foundation, St John’s Innovation Centre,
>> Cowley Road, Cambridge, CB4 0WS, UK.
>>
>

