[od-discuss] Machine readability in v2.1

Andrew Rens andrewrens at gmail.com
Wed Jul 29 12:15:13 UTC 2015


+1

Andrew Rens



On 29 July 2015 at 08:14, Rufus Pollock <rufus.pollock at okfn.org> wrote:

> Good suggested amendment Andrew. To summarize:
>
> 1.4 Machine Readability
>
>
>
> The work should be provided in "machine-readable" form, that is one in
> which the content can easily be accessed and processed by a computer, and
> which is in a form in which modifications to individual data/content elements
> can easily be performed.
>
>
> Rufus
>
>
>
> On 29 July 2015 at 10:35, Andrew Stott <andrew.stott at dirdigeng.com> wrote:
>
>> I would still be worried that this formulation could be interpreted as
>> allowing PDFs of data. It is the *content*, not the *form*, which
>> needs to be easily accessed and processed by a computer. (Believe me, a raw
>> PDF file is *much* easier for a computer to read than for a human!) So what
>> about:
>>
>>
>>
>> The work should be provided in "machine-readable" form, that is one IN
>> WHICH THE CONTENT can be easily accessed and processed by a computer, and
>> which is in a form in which modifications to individual elements OF THE
>> CONTENT can easily be performed.
>>
>>
>>
>> Regards
>>
>>
>>
>> Andrew
>>
>> *From:* od-discuss [mailto:od-discuss-bounces at lists.okfn.org] *On Behalf
>> Of *Rufus Pollock
>> *Sent:* 29 July 2015 09:53
>> *To:* Andrew Rens
>> *Cc:* od-discuss at lists.okfn.org
>> *Subject:* [od-discuss] Machine readability in v2.1 (was: Re:
>> [okfn-discuss] Open Definition 2.1 final draft)
>>
>>
>>
>> Just forking subject as the thread was heading off in new directions!
>>
>>
>>
>> I appreciate, as Mike points out, that there will be variation and
>> context specificity in what exactly constitutes machine readability but I
>> think the general principle can be made clear. I also appreciate that we
>> are attempting that with the current phrasing. In the spirit of offering
>> something concrete, what about a new section 1.4 as follows:
>>
>>
>>
>> 1.4 Machine Readability
>>
>>
>>
>> The work should be provided in "machine-readable" form, that is one that
>> can be easily accessed and processed by a computer, and which is in a form in
>> which modifications to individual data elements can easily be performed.
>>
>>
>>
>> I also note we have the following definition of machine readable in the
>> Open Data Handbook:
>>
>>
>>
>> http://opendatahandbook.org/glossary/en/terms/machine-readable/
>>
>>
>>
>> <quote>
>>
>> Data in a data format that can be automatically read and processed by a
>> computer, such as CSV, JSON, XML, etc. Machine-readable data must be
>> structured data. Compare human-readable.
>>
>>
>>
>> Non-digital material (for example printed or hand-written documents) is
>> by its non-digital nature not machine-readable. But even digital material
>> need not be machine-readable. For example, consider a PDF document
>> containing tables of data. These are definitely digital but are not
>> machine-readable because a computer would struggle to access the tabular
>> information - even though they are very human readable. The equivalent
>> tables in a format such as a spreadsheet would be machine readable.
>>
>>
>>
>> As another example scans (photographs) of text are not machine-readable
>> (but are human readable!) but the equivalent text in a format such as a
>> simple ASCII text file or a text-processing format such as Microsoft Word
>> file is machine readable.
>>
>>
>>
>> Note: The appropriate machine readable format may vary by type of data -
>> so, for example, machine readable formats for geographic data may differ
>> from those for tabular data.
>>
>> </quote>
>>
>>
>>
>> Regards,
>>
>>
>>
>> Rufus
>>
>>
>>
>> On 28 July 2015 at 22:05, Andrew Rens <andrewrens at gmail.com> wrote:
>>
>> Hi
>>
>> Perhaps it would be useful to be specific about "machine readable" in
>> respect of data but expressly state that this specificity flows from the
>> general principle in 1.3:
>> "The work *should* be provided in the form preferred for working with
>> and making modifications to it" or whatever the final agreed wording is.
>> Additional wording would then stipulate: "When a work consists of or
>> contains data then the preferred form for that data is a form that enables
>> a recipient to use automated processes to use or modify the data as a whole or
>> in part."
>>
>> This would help by showing how the principle would be applied to one kind
>> of knowledge.
>>
>> Of course "automated processes", like "machine readable", requires some
>> refinement - algorithmic processes perhaps?
>>
>> Andrew
>>
>>
>>
>>
>>
>>
>> Andrew Rens
>>
>>
>>
>> On 28 July 2015 at 15:59, Aaron Wolf <wolftune at riseup.net> wrote:
>>
>>
>>
>> On 07/28/2015 03:44 PM, Mike Linksvayer wrote:
>> > On 07/28/2015 10:21 AM, Aaron Wolf wrote:
>> >> On 07/28/2015 01:07 PM, Benjamin Ooghe-Tabanou wrote:
>> >>> Yes I agree also that the "as a whole" is fine regarding "bulk"
>> >>>
>> >>> As Rufus pointed out my main concern left is on machine-readability.
>> >>> Aaron I understand we want the OD to handle a larger picture than just
>> >>> data, but since it has historically been used primarily for data, I
>> >>> just want to make sure we can keep doing it afterwards and do not lose
>> >>> actual specific requirements.
>> >>> That's why I proposed to simply replace the vague "in a form
>> >>> preferred" sentence with a sentence spelling out the specific case of
>> >>> data, as was agreed on earlier in the process.
>> >>> As such, 1.3 first concerns "work" globally. Having at the end a "Data
>> >>> must be machine readable" would add the proper precision.
>> >>>
>> >>> Benjamin Ooghe-Tabanou
>> >>>
>> >>>
>> >>
>> >> Adding "Data must be machine readable" to the end of 1.3 sounds fine to
>> >> me. Let's do that.
>> >
>> > Looks like superfluous jargon to me:
>> >
>> > - the underlying issue of works being provided in a manner that the work
>> > in question can be easily processed and manipulated is not specific to
>> > data (even from a data-centric worldview, e.g. to mine data from 'content')
>> >
>>
>> I am willing to consent to others' concerns, but I'm with Mike: 'should
>> be provided in the form preferred for making modifications to it' — in
>> principle, that means you have data you can actually use, i.e.
>> machine-readable if that's the way you would usually manage the data.
>>
>> But, I could see changing 'making modifications' to 'working with and
>> modifying' — working with data may be analyzing it but not modifying the
>> data. So, to do analysis, you'd want it to be machine-readable, but this
>> is independent of modifying the data.
>>
>> So, I think we need to have a better generalized wording here.
>>
>> I suggest 'provided in the form preferred for working with and making
>> modifications to it'
>>
>> My concern here is about the "must" vs "should" aspect: If we used
>> "must" would that say that my video is not "open" unless I provide all
>> the source files? I have mixed feelings about that but certainly don't
>> want it any stronger than "available upon request". We don't want to
>> block the distribution of videos by making *all* distributions
>> necessarily include all source files.
>>
>>
>> > - machine-readability is not defined (with respect to what? e.g. a bitmap
>> > image is read by a machine, even if it encodes a scan of 'data' from
>> > a printout)
>> >
>>
>> I had this same concern about "machine-readability", but I thought
>> qualifying this as data-specific would be acceptable. But I'm not sure.
>>
>>
>>
>> > Mike
>> >
>> > _______________________________________________
>> > od-discuss mailing list
>>
>> > od-discuss at lists.okfn.org
>> > https://lists.okfn.org/mailman/listinfo/od-discuss
>> > Unsubscribe: https://lists.okfn.org/mailman/options/od-discuss
>> >
>>
>> --
>> Aaron Wolf
>> co-founder, Snowdrift.coop
>> music teacher, wolftune.com
>>
>>
>> --
>>
>> Rufus Pollock
>>
>> Founder and President | skype: rufuspollock | @rufuspollock
>> <https://twitter.com/rufuspollock>
>>
>> Open Knowledge <http://okfn.org/> - *see how data can change the world*
>>
>> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on
>> Facebook <https://www.facebook.com/OKFNetwork> |  Blog
>> <http://blog.okfn.org/>
>>
>>
>>
>
>
>