[od-discuss] A harmonised Open Format definition

Aaron Wolf wolftune at riseup.net
Tue Apr 21 16:30:53 UTC 2015

I think this concept is excellent: consolidating the definitions and
clarifying matters with a page of recognized Open Formats.

Your points here bring up a serious problem though.

If HTML and JPEG formats are considered non-machine-readable then we
absolutely *have* to *remove* the machine-readable requirement from the
Open Definition. This is a very serious issue. The OD covers things like
photographs and other images along with stuff like the writings on a
blog. It is absolutely unacceptable to have the "machine-readable" part
in the requirements for format in the OD if it excludes these things!

I think we need to fix OD 2.1 to clarify that what is considered an Open
Format depends on the type of content. Obviously, a JPEG screenshot of a
webpage is not an Open Format for the webpage content, but HTML is. But
we cannot say that a JPEG photograph is non-Open format. We could say
that Open Data specifically should not be HTML… but I'm not certain
about that bit.
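
To make the idea concrete, here is a minimal Python sketch of a
content-dependent check. The names (OPEN_FORMATS_BY_CONTENT,
is_open_format) and the table entries are hypothetical, chosen only to
illustrate the examples above; this is not an agreed list and not part
of the OD.

```python
# Hypothetical illustration: whether a format counts as "open" depends on
# the kind of content it carries. The table is an assumption for
# illustration only, not an agreed or official list.
OPEN_FORMATS_BY_CONTENT = {
    "photograph": {"jpeg"},
    "web page": {"html"},
    "tabular data": {"csv", "json", "xml"},
}


def is_open_format(content_type: str, file_format: str) -> bool:
    """Return True if file_format is listed as open for content_type."""
    allowed = OPEN_FORMATS_BY_CONTENT.get(content_type, set())
    return file_format.lower() in allowed


if __name__ == "__main__":
    print(is_open_format("photograph", "JPEG"))  # JPEG photograph
    print(is_open_format("web page", "JPEG"))    # JPEG screenshot of a page
    print(is_open_format("web page", "HTML"))    # the page itself
```

Whether HTML should count for data is left out of the table on purpose,
since that question is still open.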

This absolutely must be addressed and clarified.

FWIW, I collected some initial bits about what qualifies as Open Format
at https://snowdrift.coop/p/snowdrift/w/en/formats-repositories and in
that case it is listed by the sort of project we're talking about. I
would love to see this more formally included in the OD.


On 04/21/2015 06:18 AM, Stephen Gates wrote:
> Hello Open Knowledge and Open Data Institute friends,
> I would like to explore the possibility of aligning the Open Definition
> <http://opendefinition.org/od/>, Open Data Census
> <http://census.okfn.org> and Open Data Certificates
> <https://certificates.theodi.org> definitions for Open Format. This
> would enable the Census, Certificate and other open data tools to refer
> to the Open Definition for a definition of Open Format, in the same way
> they currently do for Open Licences.
> To extend this concept further, I would like to mirror the Conformant
> Licenses <http://opendefinition.org/licenses/> page in the Open
> Definition with a Conformant Formats page. This would provide a list of
> file formats that conform with the Open Format definition. New formats
> could be submitted for assessment. Common formats (e.g. XML, JSON, KML,
> CSV, etc.) would be seeded on the page. Similar to the non-conformant
> licences <http://opendefinition.org/licenses/nonconformant/>, partially
> conforming formats could also be captured (e.g. XLS, SHP). This would
> cater for the spectrum of open file formats proposed by Tim Berners-Lee
> in his 5 star scheme <http://5stardata.info>.
> The respective definitions or help text are:
> *Open Definition* draft 2.1
> https://github.com/okfn/opendefinition/blob/master/source/open-definition-2.1-dev.markdown 
> The *work* /must/ be machine-readable and provided in an open format. An
> open format is one which places no restrictions, monetary or otherwise,
> upon its use and can be fully processed with at least one
> free/libre/open-source software tool. Data /should/ be provided in bulk
> where possible.
> *Open Data Census* 
> see format, machine readable and bulk rows in Google Sheet,
>  https://docs.google.com/spreadsheet/ccc?key=0AqR8dXc6Ji4JdFI0QkpGUEZyS0wxYWtLdG1nTk9zU3c&usp=drive_web#gid=0
> *Format*:
> This question describes the form that the data is available in. For
> example, for tabular data it might be: Excel, CSV, HTML or even PDF. For
> geodata it might be shapefiles, geojson or something else. If available
> in multiple formats, the format descriptors are listed separated with
> commas. Any further information is put in the comments section.
> *Machine Readable*:
> Files are digital, yes, but not all can be processed or parsed easily by
> a computer. In order to answer this question, you would need to look at
> the dataset's file type.
> As a rule of thumb the following file types are machine readable:
> - XLS
> - CSV
> - JSON
> - XML
> If the files are in the following formats, they are NOT machine readable:
> - HTML
> - PDF
> - DOC
> - GIF
> - JPEG
> - PPT
> If you have a different file type and you don’t know if it’s machine
> readable or not, send an email to the Open Data Census list.
> *Bulk*:
> Data is available in bulk if the whole dataset can be downloaded easily.
> It is considered non-bulk if the citizens are limited to getting parts
> of the dataset through an online interface.
> For example, if restricted to querying a web form and retrieving a few
> results at a time from a very large database.
> *Open Data Certificates*
> Question: Is this data in a standard open format?
> Help Text: Open standards are created through a fair, transparent and
> collaborative process. Anyone can implement them and there’s lots of
> support so it’s easier for you to share data with more people. For
> example, XML, CSV and JSON are open standards. _Read more_… (links
> to https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/183962/Open-Standards-Principles-FINAL.pdf)
> *Proposed changes*
> *A harmonised definition*
> The *work* /must/ be machine-readable and provided in an open format. An
> open format is one which places no restrictions, monetary or otherwise,
> upon its use and can be fully processed with at least one
> free/libre/open-source software tool.
> In addition:
> - Data /should/ be provided in bulk, i.e. the whole dataset can be
> downloaded easily.
> - An open format /should/ be documented so it can be freely implemented
> by others.
> - An open format /should/ be defined through a fair, transparent and
> collaborative process.
> *Open Data Census and Open Data Certificates*
> Adjust questions and help text to reference the Open Format definition
> and/or conformant licenses page.
> *Open Definition site*
> - Consider changing the page names from “Conformant Licences” and
> “Conformant Formats” to “Open Licences” and “Open Formats”.
> - Delete the open format definition page
> <http://opendefinition.org/ofd/>. It is replaced by the Open Formats
> page and the updated Open Definition.
> *What do you think?*
> Is this worth progressing? Could this extend to Open APIs like Web Map
> Services (WMS)?
> thanks
> Stephen Gates
> (localiser of the Open Data Census and Open Data Certificates in Australia)
