[Open-data-census] Open Data Census_proposals for improvement from Russia

Tatyana Tolsteneva tatyana.tolsteneva at svobodainfo.org
Mon Oct 28 08:52:58 GMT 2013


Dear colleagues,

Since September of this year, Ivan Pavlov, Chair of the Freedom of Information Foundation Board, and I have taken part in the Open Data Census as country editors. We have done this with great pleasure and interest. We are very grateful that this unique opportunity exists and that we have a chance to observe the process from the inside.

In Russia, the Open Data Census is valued so highly that the country's Census rating is used as an indicator of how efficiently Russia's national open data development strategy is being implemented.

This year, we have paid close attention to the OKFN's initiative and have noticed that both the evaluation process and the Census interface are now even better than when we started working as country editors. Clearly, the Open Knowledge Foundation is striving to ensure that the Census is recognized as an impartial measure of transparency. We greatly respect the open manner of work you have chosen in order to improve the initiative, your willingness to entertain proposals for improving the study, and your readiness to discuss them constructively.

It is in this spirit of the OKFN's desire for improvement and impartiality that we wish to make some additional recommendations that may not yet have been taken into consideration in developing the Census process. We understand that in order to run a successful crowd-sourcing project, the information sought should be kept simple and the interface quick and easy to use. Unfortunately, we also see that this approach has a weakness: simplifying the evaluation process may negatively affect the fairness of the study.

For the last nine years, our Foundation has been carrying out comparative studies and ratings of the informational openness of government entities in Russia and other countries of the former USSR. It is our great hope that our experience in this area can be of use in developing the Open Data Census.

There are a few specific areas where we see the potential to improve the impartiality of the system:

·       Publish Methodology: We believe that the methodology of your studies of the level of open data development in various countries should be formalized and published. While the FAQ section covers some aspects of the research, it is not enough to answer specific questions should they arise. In our opinion, the methodology should also include a clear description of the contents of each information category included in each dataset for evaluation, and a comprehensive description of each dataset evaluation criterion (Data availability Questions). The international nature of the study necessitates this: in different countries, the datasets evaluated within the Open Data Census are compiled in different ways, and their contents are interpreted differently at the national level.

·       Benefit of the Doubt Policy: Evaluators may run into difficulties in evaluating data, and these can affect the results of the study. In such cases, when evaluating the state of open data in Russia, we relied on the in dubio pro reo principle: in other words, the research subject receives the benefit of the doubt when datasets are evaluated. This ensures that a weakness of the study does not result in a negative bias against the subject (in this case, a country). We apply this same principle when carrying out research and evaluation for our Foundation.

·       Sub-Categories for Data Sets: One potential area for methodology improvement could be the separate evaluation of each sub-category of information included in a specific dataset. For example, when evaluating transport schedules, one should know clearly which sub-categories of information can or must be present in such schedules. Since schedule contents (the quality of datasets) can differ from country to country, each sub-category included in this dataset should be evaluated separately for the study to be objective.

From our point of view, the current options for answering the Open Data Census questions ("yes", "no", and "unclear") often do not allow the study's potential to be fully realized. For instance, if part of a dataset under evaluation is provided for free and part for a fee, we cannot reflect this correctly within the three-point evaluation scale; therefore, when the results are summarized, such a situation will not be reflected reliably. We believe this can be solved if there is a clear list of conditions a dataset should meet to be evaluated as "yes", "no", or "unclear" for each sub-category included in it.

·       Set out a clear description of study stages from the outset: This will help Country Editors and participants in the crowd-sourcing process to plan their expectations and resources around the timing and outcomes of each study stage and of the entire process. It also lends clarity and additional transparency to the process.

·       “Terms of Use” vs. “Openly Licensed”: As you may be aware, our organisation is composed mainly of lawyers, sociologists and IT specialists. Our legal training leads us to pay particular attention to wording, including the wording of the rules governing the study.

One particular instance that caught our attention is the evaluation criterion for “openly licensed” as stated in the FAQ:

"It needs to state the terms of use or license that allow anyone to freely use, reuse or redistribute the data (subject at most to attribution or sharealike requirements). It is vital that a license is available (if there's no license, the data is not openly licensed)."

As you can see, one phrase allows "terms of use" to serve as a condition for a positive evaluation, just as a license does, but another phrase states that a license is mandatory. Such wording is contradictory, and we consider this contradiction to be a rather important point.

Open licensing is not only a technical but also a legal issue. Different countries can solve this legal issue in quite different ways, depending on the peculiarities of their national legal regimes governing information. Such peculiarities should not affect a country's position in the Census rating. We believe that the methodology would improve significantly if a lawyer well versed in both the international and the national specifics of open data legal regulation took part in upgrading it.

Data use conditions similar to those contained in an open data license can also be guaranteed by national legislation, covering not only specific datasets but entire categories of information. If established in this way, such terms of use apply to specific datasets by default, without a direct reference to any license whenever a dataset is published; moreover, the terms of use may not even be mentioned together with the dataset. At the same time, such default terms of use can fully meet the letter and spirit of your study.

We think it is unjust if the formal absence of a license for the use of state information deprives a country of 30 points for each of the 10 datasets evaluated, even though national legislation contains direct provisions under which the terms of use of state information meet all the conditions of an "open data license". This is what happened in the case of Russia, and possibly of other countries.

We would like to humbly propose a few additional points that could be useful in improving procedural aspects of the study.

1.     The mechanism of Country Editors' work would become more transparent if the system stored and displayed all versions of scores for any dataset and any country approved by a Country Editor at any time, together with the name of the Country Editor who approved them.

2.     It would become easier to process each dataset if the system allowed users to post comments on any of the nine questions for each dataset, allowed more than one comment per question, and displayed each comment author's name.

3.     The study could be substantially enriched and extended if there were a possibility for constructive discussion of each country's evaluation scores. The current version of the Census does not take into account a lack of consensus between different Country Editors, although different experts can hold different views on the evaluation of the same dataset. The current version shows only the most recently entered scores and cannot show that alternative views on the evaluation exist.

4.     There is one more point of practical significance: when data for a single country is printed, the color coding of evaluation scores is not displayed, so one has to add the current scores manually.
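
As a rough illustration of proposals 1 to 3 above, the following Python sketch shows one way a record could keep every approved version of a dataset's scores, attach signed comments to individual questions, and report questions on which Country Editors disagree. It is only a sketch under stated assumptions: the class names, field names, question labels and dataset name are hypothetical and are not taken from the Census software.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ScoreVersion:
    """One approved set of answers, never overwritten (proposal 1)."""
    editor: str            # Country Editor who approved this version
    answers: dict          # question label -> "yes" / "no" / "unclear"
    approved_at: datetime


@dataclass(frozen=True)
class Comment:
    """A signed comment on one question of one dataset (proposal 2)."""
    author: str
    question: str
    text: str


@dataclass
class DatasetRecord:
    country: str
    dataset: str
    versions: list = field(default_factory=list)
    comments: list = field(default_factory=list)

    def approve(self, editor, answers, when=None):
        """Append a new version instead of replacing the previous one."""
        when = when or datetime.now(timezone.utc)
        self.versions.append(ScoreVersion(editor, dict(answers), when))

    def comment(self, author, question, text):
        """Any number of comments may be attached to the same question."""
        self.comments.append(Comment(author, question, text))

    def current(self):
        """Latest approved version, or None if nothing was approved yet."""
        return self.versions[-1] if self.versions else None

    def disputed_questions(self):
        """Questions where editors' latest answers differ (proposal 3).

        A question one editor answered and another left out also counts
        as disputed in this sketch.
        """
        latest_by_editor = {}
        for version in self.versions:   # later versions replace earlier ones
            latest_by_editor[version.editor] = version.answers
        questions = {q for a in latest_by_editor.values() for q in a}
        return sorted(
            q for q in questions
            if len({a.get(q) for a in latest_by_editor.values()}) > 1
        )


# Hypothetical usage with made-up editors and question labels.
record = DatasetRecord("Russia", "Transport timetables")
record.approve("Editor A", {"openly licensed": "no", "free of charge": "yes"})
record.approve("Editor B", {"openly licensed": "yes", "free of charge": "yes"})
record.comment("Editor B", "openly licensed",
               "Terms of use are set by national legislation.")
print(record.disputed_questions())   # ['openly licensed']
```

Because versions are appended rather than replaced, the full history and the name of each approving editor remain available for display, while the most recent version can still be shown as the current score.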

A number of the improvements recently introduced by the Census, for instance the indication of scoring points for each dataset and the public display of Country Editors' names on the country census page, have made the Census more usable and the evaluation procedure for each country more transparent. At the same time, we notice some more substantial methodology changes in the current annual cycle of the Census; in particular, the scoring system has changed. Improving the system is surely very important, but we believe it would be best to modify the methodology before a new evaluation cycle begins, so that all participants of the Census can learn and understand the rules of the game before it starts.

We understand that, if implemented, some of our advice could create difficulties for using a crowd-sourcing approach in your study, since the evaluation procedure would become more complicated and resource-intensive for everyone involved. We believe that, to solve this problem, a single study cycle could cover one or two datasets instead of ten.

We would be very glad to have the opportunity to discuss these and any other proposals on the development of the Census with other interested parties. Any questions from your side are welcome.

We are terribly sorry for such a long letter, and thank you very much for your attention to it.

Yours sincerely,



Tatyana Tolsteneva
Development manager
tatyana.tolsteneva at svobodainfo.org

Freedom of Information Foundation

(formerly known as Institute for Information Freedom Development)

P.O. Box 527, St.-Petersburg, 192007, Russia
Phone:  +7 812 766-03-66 

Fax: +7 812 766-52-61 

Email: info at svobodainfo.org

www.svobodainfo.org
