[open-government] Long term preservation and archival for Open Data

Tree Martschink treemartschink at gmail.com
Wed Oct 9 18:59:58 UTC 2013


I'm actually working on a related issue right now for DC Council,
largely motivated by the possibility of the Uniform Electronic Legal
Material Act becoming law in the District of Columbia.  We're
exploring ways for the Council to begin directly publishing (that is,
without a "publisher" such as Lexis or Westlaw) the official version
of the DC Code, and along with it all the records that are part of the
legislative process.   As a consequence, this will also require
updating our current archiving solution to something other than Rows
of Massive Filing Cabinets.

The long and short of it is, this comes down to signatures and
authentication.  Whether you're trying to introduce a PDF transcript
of legislative testimony into evidence at trial, or use 10 years of
JSON-formatted procurement spending data for econometric analysis, if
there's no way to verify the authenticity of the records, scraping and
mirroring is a stopgap at best.
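
A minimal sketch of what checking a mirrored copy might look like,
assuming the custodian publishes a manifest of SHA-256 digests
alongside the records.  The function names and sample records are
hypothetical.  A digest only shows that a copy matches the manifest;
the manifest itself would still need a real digital signature from
the publisher, which is not shown here.

```python
import hashlib
import hmac
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a record as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(records: Dict[str, bytes]) -> Dict[str, str]:
    """Custodian side (hypothetical): map each record name to its digest."""
    return {name: sha256_hex(content) for name, content in records.items()}


def verify_copy(name: str, content: bytes, manifest: Dict[str, str]) -> bool:
    """Mirror side (hypothetical): check a copy against the manifest."""
    expected = manifest.get(name)
    if expected is None:
        # A record the custodian never listed cannot be verified.
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, sha256_hex(content))


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    official = {
        "testimony.pdf": b"%PDF-1.4 sample transcript bytes",
        "procurement.json": b'{"vendor": "Example Co", "amount": 100}',
    }
    manifest = build_manifest(official)

    unchanged = official["procurement.json"]
    altered = b'{"vendor": "Example Co", "amount": 999}'
    print(verify_copy("procurement.json", unchanged, manifest))
    print(verify_copy("procurement.json", altered, manifest))
```

The point of the sketch is only that a scraped or mirrored file can be
checked against something the custodian published; without that
published, signed reference there is nothing to check against.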

All that is to say, the archiving issue is closely tied to custodial
publication, and we've just begun exploring our options.  I'll be
interested to hear what everyone has to say.

Tree

On Wed, Oct 9, 2013 at 12:43 PM, Ton Zijlstra <ton.zijlstra at gmail.com> wrote:
> Interesting question Ivan!
>
> In general I think governments cannot be presumed to keep supplying data for
> the sake of re-users only, for instance when the government's purpose for
> the data collection no longer exists.
>
> There are however various scenarios where mirroring of data might make
> sense:
> - Government bodies reneging on earlier open data commitments or taking
>   steps towards less transparency
> - Government shutdowns as in the US (unlikely elsewhere in the world)
> - Government bodies dissolving without transfer of data
>   tasks/responsibilities
> - Budget cuts hitting open data provision
> - etc.
>
> A lot depends on the data itself as well. Some data rapidly becomes
> useless or outdated once archived, other than for archival purposes.
> For other types of data, having historic data may actually be more valuable
> than just the current data. (E.g. I've been involved in a small project
> where the government only published today's values, but provided no
> historic data, which we addressed by archiving the daily releases.)
>
> best,
> Ton
>
> ---------------------------------------------------------------------------
> Interdependent Thoughts
> Ton Zijlstra
>
> ton at tonzijlstra.eu
> +31-6-34489360
>
> http://zylstra.org/blog
>
> ---------------------------------------------------------------------------
>
>
> On Wed, Oct 9, 2013 at 6:13 PM, Ivan Begtin <ibegtin at gmail.com> wrote:
>>
>> Dear colleagues,
>>    most of us are involved in open data activities, and the availability
>> of open data is a critical issue when we want to re-use it.
>>
>> Right now we have a few examples of data that was published earlier and
>> later disappeared.
>> Sometimes it happens because government information systems are updated or
>> closed, sometimes when a "government shutdown" happens (like data.gov right
>> now), and sometimes when government agencies are disbanded.
>>
>> I know that there are some archival initiatives for government
>> websites, such as the UK web archival initiative
>> (http://www.nationalarchives.gov.uk/webarchive/) and similar projects in
>> other countries (USA, Australia, Hong Kong and so on).
>>
>> As I understand it, none of these initiatives covers datasets, and when
>> data.gov is unavailable the only chance to get the data is to look at other
>> commercial/non-profit projects that re-publish data.gov datasets for their
>> own use.
>>
>> So I would like to launch a discussion about long-term preservation and
>> archival of datasets published by governments, and not only governments.
>>
>> What do you think, from experience in your countries: do we need to
>> launch long-term preservation efforts, or is it not an issue right now?
>>
>>
>>
>> --
>> Best Regards,
>>   Ivan Begtin
>>
>> Director of NGO "Informational Culture"
>> email: ibegtin at infoculture.ru
>> phone: +7 499 500 96 58, +7 910 426 68 83
>> website: http://infoculture.ru
>>
>> _______________________________________________
>> open-government mailing list
>> open-government at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/open-government
>> Unsubscribe: http://lists.okfn.org/mailman/options/open-government
>>
>



