[od-discuss] OD v2 accepts Excel as OpenData?!???

Gisle Hannemyr gisle at ifi.uio.no
Mon Nov 10 07:49:26 UTC 2014


On 2014-11-08 20:48, Aaron Wolf wrote:
> On 11/08/2014 09:55 AM, Mike Linksvayer wrote:

>> Does published specification matter? Does an ad-hoc format that is not
>> restricted by any means make a work non-open? Consider CSV variations.

> Not sure here, but are there actually CSV variations where the
> specification is not published?

Are there any /real/ specifications for CSV published? :-)

For the record: I've never seen a /complete/ and /authoritative/
specification for most of the very many different CSV variations
that people actually distribute.

People may manage "Save as ..." in MS Excel to save the data in
some CSV format, but unless the data is very simple, Excel doesn't
export it in a /documented/ CSV format.  Excel exports in many
different undocumented CSV formats - and the format you get
depends on (among other things) Excel version, platform, locale
and language settings.

> I mean if the character used for
> separation is different, would we really say that nothing anywhere
> publishes the specification for that character etc?

In addition to the separation character, I've noticed different
semantics for single and double quotes, escape characters
(both for escaping the separation character, and for escaping
the escape character), the treatment of newlines inside fields,
and the encoding of non-ASCII characters.  Such variations
appear in various combinations, and this fact, together with the
missing published specifications, makes some CSV data sets far more
difficult to use than I would like.
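
A minimal Python sketch of how these variations interact, using the
standard csv module.  The record and both "variants" are made up for
illustration; they are not taken from any real data set.  The same
record only parses correctly when the reader is told which delimiter,
quoting and escaping conventions the file uses.

```python
import csv
import io

# The same logical record, serialized in two hypothetical CSV variants.
expected = ["Hansen, Ola", 'said "hi"; left', "12.5"]

# Variant A: comma delimiter, double-quoted fields, quotes escaped by doubling.
variant_a = '"Hansen, Ola","said ""hi""; left",12.5\r\n'

# Variant B: semicolon delimiter, no quoting, backslash escapes the delimiter.
variant_b = 'Hansen, Ola;said "hi"\\; left;12.5\n'


def parse(text, **fmt):
    """Parse one-record CSV text with the given csv.reader format parameters."""
    return next(csv.reader(io.StringIO(text, newline=""), **fmt))


row_a = parse(variant_a, delimiter=",", quotechar='"', doublequote=True)
row_b = parse(variant_b, delimiter=";", quoting=csv.QUOTE_NONE, escapechar="\\")

# Reading variant B with the module's default settings (which match
# variant A's conventions) silently yields a different, wrong row.
wrong = parse(variant_b)
```

Here row_a and row_b both equal expected, while wrong does not, and
no error is raised to signal the mismatch.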

I happen to deal with a lot of CSV data originating from various
address-books and other GIS-resources, and I've resorted to writing
my own "CSV normalizer" that converts all the different
CSV variations I receive into a common format (prior to actual
processing of the data).
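
What such a normalizer might look like, as a rough Python sketch: the
function name normalize_csv and the chosen target format are
assumptions for illustration only.  It assumes the source variant
(encoding, delimiter, quoting) is already known for each supplier and
passed in by hand, rather than guessed.

```python
import csv
import io


def normalize_csv(raw, encoding, **source_fmt):
    """Re-serialize CSV bytes from a known source variant into one target form.

    Target form (an assumption for this sketch): UTF-8, comma-delimited,
    every field double-quoted, quotes escaped by doubling, LF line endings.
    """
    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), **source_fmt)
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        delimiter=",",
        quotechar='"',
        doublequote=True,
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerows(reader)
    return out.getvalue().encode("utf-8")


# Hypothetical input: Latin-1 encoded, semicolon-delimited, CRLF line endings.
raw = "Bjørn;Storgata 1\r\n".encode("latin-1")
normalized = normalize_csv(raw, "latin-1", delimiter=";")
```

Quoting every field in the output is a deliberate choice in this
sketch: it costs a few bytes but removes any ambiguity about embedded
delimiters or newlines for the processing step that follows.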

So if a "freely available published specification" of the format
is made part of the definition, then very few (none?) of the CSV
formats that exist match the definition.  Maybe that is a good
thing?  Personally, I would love to see CSV replaced by something
more robust and self-documenting, such as XML.
-- 
- gisle hannemyr [ gisle{at}hannemyr.no - http://folk.uio.no/gisle/ ]
========================================================================
    "Don't follow leaders // Watch the parkin' meters" - Bob Dylan


