[@OKau] Standard format for publishing CSV spatial data on data.gov.au - feedback, comments?

Craig Thomler craig.thomler at gmail.com
Wed May 6 04:33:48 UTC 2015

Hi Steve,

My questions and concerns are not about the specific fields you include,
but around the need for another standard.

While it might be entirely valid, even critical, to create this standard, I
think there's a process necessary to establish the need for it. You may
have already done this, but it wasn't detailed in your email.

Here are some questions to consider:

Can you explain what problems this standard will solve and why no existing
standards are able to solve them?

Did you consider extending an existing standard to solve these problems
before reinventing the wheel by creating another new standard (ANS)?

Does your standard make it harder or impossible for a data provider to meet
another standard? Particularly one that they are required to meet on a
mandatory basis for legal or contractual reasons?

What futureproofing is being built into this standard to stop the next
person working at National Maps, or elsewhere in government, from throwing
it out and creating another new standard?

Have you calculated the cost for data providers to meet your standard? What
benefits will they get to offset these costs?

What incentives are there for data providers to follow your standard? Will
you compensate them for the cost of meeting it?

Who will endorse this standard as an actual standard? Will it be an
industry or government-backed standard, or just a one-man/agency

Remember: "The nice thing about standards is that you have so many to
choose from." - Andrew S. Tanenbaum




Craig Thomler


*Mobile:* 0411 780 194 (*International:* +61 411 780 194)
*Phone:* 02 6161 4508 (*International: *+61 2 6161 4508)
*Skype:* craig.thomler

On 5 May 2015 at 16:09, Steve Bennett <stevage at gmail.com> wrote:

> Hi all,
>   By day, I work on National Map <http://nationalmap.gov.au>, which
> scours data.gov.au and other portals for open spatial data from state and
> federal open data portals (including local government data). It lets you
> choose which of these datasets to view on the map, primarily supporting
> quick assessment of value in a dataset or to answer basic questions.
> Now, we'd like to get a bit more systematic about harvesting CSV data.
> There are two main types:
> 1) Point data described by latitude and longitude
> 2) Administrative region described by reference to a known region such as
> a suburb, postcode, local government area (LGA), ABS statistical area etc.
> (Arbitrary line and polygon features are out of scope for now - they're
> better published as GeoJSON in any case).
> We'd really like to have an agreed standard that data providers can
> publish to, that is both supported by National Map (and other instances of
> Terria), but is a generally good, reusable format for data that will be
> used for other applications as well (including Excel, leaflet, CartoDB,
> QGIS...)
> ["We" in this case is NICTA, but I have a strong personal interest in
> this, as an Open Knowledge volunteer...]
> I'm tentatively calling it "Aus-Geo-CSV". I've started writing it up here:
> https://github.com/NICTA/nationalmap/wiki/Aus-Geo-CSV-standard-(proposed)
> It's probably best to comment here.
> Things I'd particularly like to know:
> - feedback on which fields are widely in use (particularly any history
> around why that's the case)
> - feedback on any likely difficulties in following the standard
> - what else should be in there? Should we be encouraging .vrt files to be
> provided? Do we need to mention character encodings, line endings, quoting,
> etc?
> - has someone else already done this, and better?
> - are there other administrative regions that are important to support?
> (I'm obviously focusing the spec on what National Map can and will support,
> but it can certainly be broader than that.)
> To make commenting easier, I'll quote the important bits here, but please
> do read the whole thing. (It will probably have changed since I wrote
> this...)
> In data.gov.au, datasets that conform to this standard SHOULD be tagged
> "aus-geo-csv". The dataset MUST NOT contain CSV files that do not conform
> (but may contain other non-CSV files).
> Latitude/longitude
>    - Preferred field names: lat, lon [the only format currently supported]
>    - Accepted field names: latitude, longitude; lat, lng
>    - Discouraged: x, y;
>    - Avoid: WKT (single column with data in POINT(-37.8 144.9) format);
>    easting,northing
> Each MUST be a number in decimal degrees (EPSG:4326). Numbers SHOULD NOT
> be enclosed in double quotes.
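
To make the latitude/longitude rules above concrete, here is a rough sketch of how a harvester might pick the coordinate columns and reject bad rows. The function names and the range check are illustrative assumptions, not anything taken from National Map or Terria.

```python
import csv
import io

# Column-name pairs from the quoted proposal, in order of preference:
# preferred (lat, lon), then the two accepted pairs.
LAT_LON_NAMES = [("lat", "lon"), ("latitude", "longitude"), ("lat", "lng")]


def find_lat_lon_columns(header):
    """Return the (lat, lon) column names found in header, or None."""
    names = {h.strip().lower() for h in header}
    for lat_name, lon_name in LAT_LON_NAMES:
        if lat_name in names and lon_name in names:
            return lat_name, lon_name
    return None


def read_points(text):
    """Return a list of (lat, lon) floats, skipping rows that fail the checks."""
    rows = csv.reader(io.StringIO(text))
    header = [h.strip().lower() for h in next(rows, [])]
    cols = find_lat_lon_columns(header)
    if cols is None:
        raise ValueError("no recognised latitude/longitude columns")
    lat_i, lon_i = header.index(cols[0]), header.index(cols[1])
    points = []
    for row in rows:
        try:
            lat, lon = float(row[lat_i]), float(row[lon_i])
        except (IndexError, ValueError):
            continue  # short row or non-numeric value
        # Decimal degrees must fall inside the valid geographic range.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((lat, lon))
    return points
```

Note that Python's csv module strips double quotes while parsing, so this sketch cannot tell whether the numbers were quoted in the file.
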
> Postcode
> A four digit postcode.
>    - Preferred field name: au:postcode
>    - Acceptable field names: postcode
>    - Discouraged: poa
> For greater precision, additional fields suburb and state MAY be
> provided. For example: postcode 3068, suburb Clifton Hill, state VIC.
> Local Government Area
> By name
>    - Preferred field name: au:lga
>    - Acceptable field names: lga
> The contents MUST be the short form of the LGA name, with no "City of",
> "Council" etc. For example: "Melbourne", "Greater Geelong". It SHOULD be
> capitalised like this.
> A separate state column MUST be provided, as LGA names are not unique
> across states. A separate au:lga_code column SHOULD be provided.
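
As an illustration of the LGA name rule above, a data provider might reduce council names to the short form along these lines. The proposal only names "City of" and "Council" as things to strip; the "City Council" suffix and both function names are assumptions added for the sketch, and the lists are certainly incomplete.

```python
import re

# "City of" and "Council" come from the proposal; "City Council" is an
# assumed extra case. A real list would need to be longer.
_PREFIX = re.compile(r"^(city of)\s+", re.IGNORECASE)
_SUFFIX = re.compile(r"\s+(city council|council)$", re.IGNORECASE)


def short_lga_name(name):
    """Strip the known prefixes and suffixes from a council name."""
    name = " ".join(name.split())  # collapse stray whitespace
    name = _PREFIX.sub("", name)
    name = _SUFFIX.sub("", name)
    return name


def lga_key(name, state):
    """Build a lookup key; state is included because names repeat across states."""
    return (state.strip().upper(), short_lga_name(name).lower())
```

For example, short_lga_name("City of Melbourne") gives "Melbourne", and lga_key pairs that name with the state so that same-named LGAs in different states stay distinct.
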
> By ID
>    - Preferred field name: au:lga_code
>    - Acceptable field name: lga_code
> This MUST be the 5 digit code described by the ABS
> <http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1270.0.55.003July%202011>.
> For example, Brisbane is 31000. Complete lists are available here
> <http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1270.0.55.003July%202011>
> .
> State
>    - Preferred field name: au:state
>    - Acceptable field names: state
>    - Discouraged: ste
> The contents MUST be the two or three-letter form of the state or
> territory ("VIC", "NT").
> *KM: Case-sensitive?*
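
The postcode, LGA code and state rules above could be checked with something like the following sketch. The full set of state abbreviations is an assumption (the proposal only gives "VIC" and "NT" as examples), the check is case-sensitive purely as a placeholder for the open question above, and check_row is a hypothetical helper.

```python
import re

# Assumed full set of state/territory abbreviations.
STATES = {"NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT"}

CHECKS = {
    "postcode": lambda v: re.fullmatch(r"[0-9]{4}", v) is not None,
    "lga_code": lambda v: re.fullmatch(r"[0-9]{5}", v) is not None,
    "state": lambda v: v in STATES,  # case-sensitive: an assumption
}

# Preferred field name first, then the acceptable alternative.
FIELD_ALIASES = {
    "postcode": ("au:postcode", "postcode"),
    "lga_code": ("au:lga_code", "lga_code"),
    "state": ("au:state", "state"),
}


def check_row(row):
    """Return a list of problems found in one CSV row (a dict of strings)."""
    problems = []
    for field, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in row:
                if not CHECKS[field](row[alias]):
                    problems.append(f"{alias}: invalid value {row[alias]!r}")
                break  # only check the first matching column name
    return problems
```

A value such as "800" fails the postcode check, which is what happens to postcodes with a leading zero when a spreadsheet treats the column as a number.
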
> Other administrative regions
> These region types are also supported:
>    - sa4: "Statistical area level 4
>    <http://www.abs.gov.au/ausstats/abs@.nsf/0/B01A5912123E8D2BCA257801000C64F2>"
>    (ABS)
>    - sa3: "Statistical area level 3
>    <http://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/E7369D1FCE596315CA257801000C64E5>"
>    (ABS)
>    - sa2: "Statistical area level 2
>    <http://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/88F6A0EDEB8879C0CA257801000C64D9>"
>    (ABS)
>    - sa1: "Statistical area level 1
>    <http://www.abs.gov.au/ausstats/abs@.nsf/0/7CAFD05E79EB6F81CA257801000C64CD>"
>    (ABS) [not currently supported by Terria]
>    - ced: "Commonwealth electoral division
>    <http://www.abs.gov.au/ausstats/abs@.nsf/0/9C8331F55896F9C5CA2578D40012CF99?opendocument>"
>    (ABS)
>    - sed: "State electoral division
>    <http://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/94496C7EA68A1522CA2578D40012CFB8>"
>    (ABS)
>    - ssc: "State suburbs
>    <http://www.abs.gov.au/AUSSTATS/abs@.nsf/Previousproducts/2C6132C0B332C336CA2578D40012CF76>"
>    (ABS)
>    - cnt2: "Two letter country codes" (ISO 3166-1 Alpha 2
>    <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2>)
>    - cnt3: "Three letter country codes" (ISO 3166-1 Alpha 3
>    <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3>)