[od-discuss] OD v2 accepts Excel as OpenData?!???

Rufus Pollock rufus.pollock at okfn.org
Mon Nov 10 07:57:25 UTC 2014


On 10 November 2014 07:49, Gisle Hannemyr <gisle at ifi.uio.no> wrote:

> On 2014-11-08 20:48, Aaron Wolf wrote:
> > On 11/08/2014 09:55 AM, Mike Linksvayer wrote:
>
> >> Does published specification matter? Does an ad-hoc format that is not
> >> restricted by any means make a work non-open? Consider CSV variations.
>
> > Not sure here, but are there actually CSV variations where the
> > specification is not published?
>
> Are there any /real/ specifications for CSV published? :-)
>

Sort of: http://tools.ietf.org/html/rfc4180

See http://data.okfn.org/doc/csv for more.


> For the record: I've never seen a /complete/ and /authoritative/
> specification for most of the very many different CSV variations
> that people actually distribute.
>
> People may manage "Save as ..." in MS Excel to save the data in
> some CSV format, but unless the data is very simple, Excel doesn't
> export it in a /documented/ CSV format.  Excel exports in many
> different undocumented CSV formats, and the format you get
> depends on (among other things) Excel version, platform, locale
> and language settings.


I agree up to a point. CSV is so simple that people do vary it. The RFC does
mandate a stricter structure (specific line endings, etc.), as do specs like
Tabular Data Package (http://dataprotocols.org/tabular-data-package/).
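
As a rough illustration of what "stricter structure" means in practice,
here is a minimal Python sketch that checks two of the RFC 4180
conventions: CRLF record endings and the same number of fields in every
record. The function name rfc4180_issues is hypothetical, and the check is
deliberately partial; it is not a full validator.

```python
import csv
import io


def rfc4180_issues(text):
    """Return a list of deviations from two RFC 4180 conventions."""
    issues = []

    # newline="" passes line endings through to the csv module untouched.
    rows = list(csv.reader(io.StringIO(text, newline="")))

    # Convention 1: every record has the same number of fields.
    widths = {len(row) for row in rows if row}
    if len(widths) > 1:
        issues.append("records have differing field counts: %s" % sorted(widths))

    # Convention 2: records end in CRLF. This is a crude check on the raw
    # text, so it also flags a bare LF or CR inside a quoted field.
    leftover = text.replace("\r\n", "")
    if "\n" in leftover or "\r" in leftover:
        issues.append("line endings other than CRLF found")

    return issues
```

An empty list means neither problem was found, which is not the same as
the file being fully RFC-conformant.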


>
> > I mean if the character used for
> > separation is different, would we really say that nothing anywhere
> > publishes the specification for that character etc?
>
> In addition to the separation character, I've noticed different
> semantics for single and double quotes, escape characters
> (both for escaping the separation character, and for escaping
> the escape character), the treatment of newlines inside fields,
> and the encoding of non-ASCII characters.  Such variations
> appear in various combinations, and this fact, together with the
> missing published specifications, makes some CSV data sets far more
> difficult to use than I would like.
>

Yes, there are quite a few different dialects of CSV, which is why there is
the CSV Dialect Description Format:

http://dataprotocols.org/csv-dialect/
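
To make the idea concrete, here is a small Python sketch of reading a file
according to a JSON dialect description. The key names (delimiter,
quotechar, doublequote, skipinitialspace, lineterminator) are the ones
Python's csv module uses; I am assuming the description uses the same
names, so check them against the spec. The sample description, the sample
data and the function name read_with_dialect are made up for illustration.

```python
import csv
import io
import json

# Hypothetical dialect description for a semicolon-separated export
# that quotes fields with single quotes.
DESCRIPTION = """
{
  "dialect": {
    "delimiter": ";",
    "quotechar": "'",
    "doublequote": true,
    "skipinitialspace": false,
    "lineterminator": "\\n"
  }
}
"""


def read_with_dialect(text, description):
    """Parse CSV text using the options in a JSON dialect description."""
    options = json.loads(description)["dialect"]
    # csv.reader recognises line endings by itself, so the line terminator
    # only matters when writing; drop it before reading.
    options.pop("lineterminator", None)
    return list(csv.reader(io.StringIO(text, newline=""), **options))


rows = read_with_dialect("name;note\n'Smith; J';'it''s ok'\n", DESCRIPTION)
```

The point is that the publisher ships the description next to the data, so
the consumer does not have to guess the separator or the quoting rules.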


> I happen to deal with a lot of CSV data originating from various
> address-books and other GIS-resources, and I've resorted to writing
> my own "CSV normalizer" that converts all the different
> CSV variations I receive into a common format (prior to actual
> processing of the data).
>

Very understandable. The key point, though, is that you are able to process
that data with free/libre tools (even if the structure is a bit of a pain,
it's probably not as much pain as getting PDF or handwritten stuff!)
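
For what it's worth, a normalizer of that kind can be sketched in a few
lines of Python with only the standard library. This is not Gisle's tool,
just a minimal sketch under assumptions: the caller already knows the
source format (separator, quote character and so on), the input is already
decoded to text, and the target is comma-separated, double-quoted, CRLF
output. The function name normalize_csv is hypothetical.

```python
import csv
import io


def normalize_csv(text, **source_format):
    """Re-emit CSV text with comma separators, double quotes and CRLF endings.

    source_format takes csv.reader options such as delimiter or quotechar
    that describe the incoming variant.
    """
    reader = csv.reader(io.StringIO(text, newline=""), **source_format)

    # newline="" stops StringIO from translating the CRLF we write.
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\r\n",
    )
    writer.writerows(reader)
    return out.getvalue()
```

For example, normalize_csv(data, delimiter=";", quotechar="'") would turn a
semicolon-separated, single-quoted file into the common format. Character
encoding is a separate problem and has to be solved before this step.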


> So if a "freely available published specification" of the format
> is made part of the definition, then very few (none?) of the CSV
> formats that exist match the definition.  Maybe that is a good
> thing?  Personally, I would love to see CSV replaced by something
> more robust and self-documenting, such as XML.
>

Understood, though different folks may prefer different things.

Rufus



> --
> - gisle hannemyr [ gisle{at}hannemyr.no - http://folk.uio.no/gisle/ ]
> ========================================================================
>     "Don't follow leaders // Watch the parkin' meters" - Bob Dylan
> _______________________________________________
> od-discuss mailing list
> od-discuss at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/od-discuss
> Unsubscribe: https://lists.okfn.org/mailman/options/od-discuss
>



-- 

Rufus Pollock
Founder and President | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
Open Knowledge <http://okfn.org/> - see how data can change the world
http://okfn.org/ | @okfn <http://twitter.com/OKFN> | Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/>

The Open Knowledge Foundation is a not-for-profit organisation.  It is
incorporated in England & Wales as a company limited by guarantee, with
company number 05133759.  VAT Registration № GB 984404989. Registered
office address: Open Knowledge Foundation, St John’s Innovation Centre,
Cowley Road, Cambridge, CB4 0WS, UK.