[od-discuss] Machine readability in v2.1

Rufus Pollock rufus.pollock at okfn.org
Wed Jul 29 12:14:06 UTC 2015


Good suggested amendment, Andrew. To summarize:

1.4 Machine Readability



The work should be provided in "machine-readable" form, that is one in
which the content can easily be accessed and processed by a computer, and
which is in a form in which modifications to individual data/content elements
can easily be performed.
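
As a rough illustration of the two halves of that wording (content a program can access, and individual elements it can modify), here is a minimal Python sketch. The sample data, the column names, and the helper name update_cell are all invented for illustration; nothing here is part of the Open Definition or the Open Data Handbook.

```python
import csv
import io

# Hypothetical sample data, invented purely for illustration.
SAMPLE_CSV = "city,population\nAlpha,100\nBeta,250\n"


def update_cell(csv_text, key_column, key, column, new_value):
    """Return csv_text with one cell changed in the row where key_column == key."""
    # Accessing the content: each row becomes a dict keyed by column name.
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)

    # Modifying an individual element: one cell in one row.
    for row in rows:
        if row[key_column] == key:
            row[column] = new_value

    # Writing the result back out in the same structured form.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


updated = update_cell(SAMPLE_CSV, "city", "Beta", "population", "300")
print(updated)
```

The same table delivered as a PDF or a scanned image would first need layout analysis or OCR before a single cell could be located, let alone changed, which is the distinction the wording is trying to capture.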


Rufus



On 29 July 2015 at 10:35, Andrew Stott <andrew.stott at dirdigeng.com> wrote:

> I would still be worried that this formulation could be interpreted as
> allowing PDFs of data. It needs to be the *content*, not the *form*, which
> needs to be easily accessed and processed by a computer. (Believe me, a raw
> PDF file is *much* easier for a computer to read than for a human!) So what
> about:
>
>
>
> The work should be provided in "machine-readable" form, that is one IN
> WHICH THE CONTENT can be easily accessed and processed by a computer, and
> which is in a form in which modifications to individual elements OF THE
> CONTENT can easily be performed.
>
>
>
> Regards
>
>
>
> Andrew
>
> *From:* od-discuss [mailto:od-discuss-bounces at lists.okfn.org] *On Behalf
> Of *Rufus Pollock
> *Sent:* 29 July 2015 09:53
> *To:* Andrew Rens
> *Cc:* od-discuss at lists.okfn.org
> *Subject:* [od-discuss] Machine readability in v2.1 (was: Re:
> [okfn-discuss] Open Definition 2.1 final draft)
>
>
>
> Just forking subject as the thread was heading off in new directions!
>
>
>
> I appreciate, as Mike points out, that there will be variation and context
> specificity in what exactly constitutes machine readability but I think the
> general principle can be made clear. I also appreciate that we are
> attempting that with the current phrasing. In the spirit of offering
> something concrete, what about a new section 1.4 as follows:
>
>
>
> 1.4 Machine Readability
>
>
>
> The work should be provided in "machine-readable" form, that is one that
> can be easily accessed and processed by a computer, and which is in a form in
> which modifications to individual data elements can easily be performed.
>
>
>
> I also note we have the following definition of machine readable in the
> Open Data Handbook:
>
>
>
> http://opendatahandbook.org/glossary/en/terms/machine-readable/
>
>
>
> <quote>
>
> Data in a data format that can be automatically read and processed by a
> computer, such as CSV, JSON, XML, etc. Machine-readable data must be
> structured data. Compare human-readable.
>
>
>
> Non-digital material (for example printed or hand-written documents) is by
> its non-digital nature not machine-readable. But even digital material need
> not be machine-readable. For example, consider a PDF document containing
> tables of data. These are definitely digital but are not machine-readable
> because a computer would struggle to access the tabular information - even
> though they are very human readable. The equivalent tables in a format such
> as a spreadsheet would be machine readable.
>
>
>
> As another example scans (photographs) of text are not machine-readable
> (but are human readable!) but the equivalent text in a format such as a
> simple ASCII text file or a text-processing format such as Microsoft Word
> file is machine readable.
>
>
>
> Note: The appropriate machine readable format may vary by type of data -
> so, for example, machine readable formats for geographic data may differ
> from those for tabular data.
>
> </quote>
>
>
>
> Regards,
>
>
>
> Rufus
>
>
>
> On 28 July 2015 at 22:05, Andrew Rens <andrewrens at gmail.com> wrote:
>
> Hi
>
> Perhaps it would be useful to be specific about "machine readable" in
> respect of data but expressly state that this specificity flows from the
> general principle in 1.3
> "The work *should* be provided in the form preferred for working with and
> making modifications to it" or whatever final wording is agreed.
> Additional wording would then stipulate: "When a work consists of or
> contains data then the preferred form for that data is a form that enables
> a recipient to use automated processes to use or modify the data as a whole
> or in part."
>
> This would help by showing how the principle would be applied to one kind
> of knowledge.
>
> Of course "automated processes", like "machine readable", requires some
> refinement - algorithmic processes perhaps?
>
> Andrew
>
>
>
>
>
>
> Andrew Rens
>
>
>
> On 28 July 2015 at 15:59, Aaron Wolf <wolftune at riseup.net> wrote:
>
>
>
> On 07/28/2015 03:44 PM, Mike Linksvayer wrote:
> > On 07/28/2015 10:21 AM, Aaron Wolf wrote:
> >> On 07/28/2015 01:07 PM, Benjamin Ooghe-Tabanou wrote:
> >>> Yes I agree also that the "as a whole" is fine regarding "bulk"
> >>>
> >>> As Rufus pointed out my main concern left is on machine-readability.
> >>> Aaron I understand we want the OD to handle a larger picture than just
> >>> data, but since it has historically been used primarily for data, I
> >>> just want to make sure we can keep doing it afterwards and do not lose
> >>> actual specific requirements.
> >>> That's why I proposed to simply replace the vague "in a form
> >>> preferred" sentence with a sentence addressing the specific case of
> >>> data, as it was agreed on earlier in the process.
> >>> As such, 1.3 first concerns "work" globally. Having at the end a "Data
> >>> must be machine readable" would add the proper precision.
> >>>
> >>> Benjamin Ooghe-Tabanou
> >>>
> >>>
> >>
> >> Adding "Data must be machine readable" to the end of 1.3 sounds fine to
> >> me. Let's do that.
> >
> > Looks like superfluous jargon to me:
> >
> > - the underlying issue of works being provided in a manner that the work
> > in question can be easily processed and manipulated is not specific to
> > data (even from a data-centric worldview, eg to mine data from 'content')
> >
>
> I am willing to consent to others' concerns, but I'm with Mike: 'should
> be provided in the form preferred for making modifications to it' — in
> principle, that means you have data you can actually use, i.e.
> machine-readable if that's the way you would usually manage the data.
>
> But, I could see changing 'making modifications' to 'working with and
> modifying' — working with data may be analyzing it but not modifying the
> data. So, to do analysis, you'd want it to be machine-readable, but this
> is independent of modifying the data.
>
> So, I think we need to have a better generalized wording here.
>
> I suggest 'provided in the form preferred for working with and making
> modifications to it'
>
> My concern here is about the "must" vs "should" aspect: If we used
> "must" would that say that my video is not "open" unless I provide all
> the source files? I have mixed feelings about that but certainly don't
> want it any stronger than "available upon request". We don't want to
> block the distribution of videos by making *all* distributions
> necessarily include all source files.
>
>
> > - machine-readability is not defined (with respect to what? eg a bitmap
> > image is read by a machine, even if it encodes a scan of 'data' from
> > a printout)
> >
>
> I had this same concern about "machine-readability", but I thought
> qualifying this as data-specific would be acceptable. But I'm not sure.
>
>
>
> > Mike
> >
> > _______________________________________________
> > od-discuss mailing list
>
> > od-discuss at lists.okfn.org
> > https://lists.okfn.org/mailman/listinfo/od-discuss
> > Unsubscribe: https://lists.okfn.org/mailman/options/od-discuss
> >
>
> --
> Aaron Wolf
> co-founder, Snowdrift.coop
> music teacher, wolftune.com
>
>
> --
>
> Rufus Pollock
>
> Founder and President | skype: rufuspollock | @rufuspollock
> <https://twitter.com/rufuspollock>
>
> Open Knowledge <http://okfn.org/> - *see how data can change the world*
>
> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on
> Facebook <https://www.facebook.com/OKFNetwork> |  Blog
> <http://blog.okfn.org/>
>
>

