[od-discuss] OD 2.1 draft

Peter Murray-Rust pm286 at cam.ac.uk
Sat Jan 17 13:06:07 UTC 2015


I've spent 25 years hacking proprietary formats and have mixed feelings. A
large number of formats have implicit semantics, and these are supplied only
by the proprietary processing software. A common and notorious example is
units of measurement (pounds? what sort?), country codes, etc.

If you can't convert and roundtrip reliably then the closed format is bound
(often not clearly) to the proprietary processing software. Often the
proprietary documentation is closed or, more frequently, insufficient (even
to customers). I have found vendors who offer "Open APIs" only to discover
that this means they make their docs available to customers but under a
secrecy agreement. So "Open" means "Documented", not OKD-compliant.
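
A minimal sketch of the round-trip test mentioned above, in Python. The
record, the field names and the CSV export are invented for illustration and
do not describe any particular vendor format. The point is only that when a
unit lives in the processing software rather than in the exported file,
export followed by import does not reproduce the original data.

```python
import csv
import io

# Hypothetical record: the original tool knows the unit of "mass".
original = [{"sample": "A1", "mass": 2.5, "unit": "lb"}]

def export_rows(rows, fields):
    """Write only the named fields to CSV text; other fields are dropped."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def import_rows(text):
    """Read CSV text back into dicts, restoring the numeric type of 'mass'."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["mass"] = float(row["mass"])
    return rows

def roundtrips(rows, fields):
    """True if export followed by import reproduces the rows exactly."""
    return import_rows(export_rows(rows, fields)) == rows

lossy = roundtrips(original, ["sample", "mass"])             # unit left implicit
lossless = roundtrips(original, ["sample", "mass", "unit"])  # unit made explicit
```

Here `lossy` is False and `lossless` is True: the export that leaves the unit
implicit cannot be round-tripped without the software that knew it.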

In compsci things are not too bad. Apache and others create tools that read
proprietary formats such as Word and PDF (yes, I know they are now "Open
Standards", but they are often difficult to read - I work daily with Apache
PDFBox and there are many PDFs that can only be read with Acrobat).

It's much worse outside general computing - scientific instruments and
medical devices can be unreadable.

So, as I write this I have changed my view to argue against this. An Open
Data document should ideally have an Open community process for its
definition and documentation and at least one Open tool for reading it.

Can a PowerPoint or XLS file be Open? I don't know, because I haven't started
to hack with the Apache tools. Word documents are just about OK - I've worked
with DOCX and with Microsoft and know how MS-specific they are, but at least
there are Open tools.

So for me data in proprietary format is not Open unless there are open
tools and open documentation.


On Fri, Jan 16, 2015 at 10:34 PM, Andrew Katz <Andrew.Katz at moorcrofts.com>
wrote:

> Hi Aaron
>
>
> > On 16 Jan 2015, at 17:35, Aaron Wolf <wolftune at riseup.net> wrote:
> >
> > I'm sympathetic to the idea that there is value in acknowledging when
> > data is Open even when delivered in a non-Open format which is at least
> > openable… but on the other hand, it's probably best to go all the way and
> > require Open formats because we can simply say things like "the data from
> > this city's government is almost fully Open but for the format, however, we
> > have taken the data and re-released it in an Open format, so now it is
> > fully Open!"
>
> Spot on. I agree entirely. I retain a little concern that the proprietary
> format might contain additional information which is not easily
> translatable, or accessible, into a truly open format, but, in general, if
> I can get the information I’m looking for in an .xls file, so long as I can
> easily export it into .csv (for example) and lawfully redistribute it in
> that format, I’m happy.
>
>
>
> >
> > So, as long as we acknowledge that it is possible for non-Open data to
> > be Openable (because the license is permissive enough to allow that), then
> > I'm satisfied. Perhaps we should do something aside from the Open
> > Definition to at least acknowledge this, even though it risks accepting a
> > level of laziness from publishers… I think this sort of grey-area is *good*
> > to acknowledge. It's just reality that there's grey, not everything is
> > completely black and white.
> >
> Yep.
>
>
> > Happy to hear thoughts from others.
>
> > Cheers,
> > Aaron
> >
> Best
>
> Andrew



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069