[Open-data-census] Dataset Definitions

Andrew Stott andrew.stott at dirdigeng.com
Thu Oct 3 19:12:29 BST 2013


Rufus

 

Thanks for your post
http://lists.okfn.org/pipermail/open-data-census/2013-October/000219.html

 

However, I'd like to push back *strongly* on the proposed change to the
definition of "postcodes/zipcodes and their corresponding geolocations" and,
in particular, on defining geolocations as "boundaries/areas corresponding to
those postcodes".

 

"Boundaries" can be very demanding: they require a trace of each boundary,
which may or may not be a recognised administrative boundary.  In
some cases the boundaries overlap because of how postcodes are defined (they
generally reflect collections of delivery points linked to delivery offices
and not recognised administrative boundaries; see for instance the
complexities in the US, http://en.wikipedia.org/wiki/ZIP_code ).  Putting the
data together may also involve more than one government agency.

 

What is meant by "areas"?  Is this a geospatial bounding area (ie a
rectangle) or the *name* of the town to which the postal code relates?
The latter makes sense in Australia and, to some extent, the US, but means
a loss of data in countries such as the UK, Canada and the Netherlands, which
have finer-grained postal codes.  And turning a postal code into the name of
a town is not necessarily very useful.  A number of postcode databases
already identified in the Census give the name of the street in which the
postcode lies.

 

In my experience, the two standard use cases for a database of postcodes and
geolocations are as follows:

 

(1) I have a table of data in which each row has an attribute which is or
contains a postcode (perhaps as part of a postal address) - for instance
points of interest, crimes, houses for sale.  I want to plot each row on a
map.  So I need to translate the postcode into a lat/long.  A database of
postcodes and their lat/long (or the eastings and northings in the national
coordinate system, which can be converted to lat/longs by well known
formulae) allows me to do that.
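
As a minimal sketch of use case (1), assuming a hypothetical lookup table
that maps each postcode to a lat/long pair: the table name, the postcodes and
the coordinates below are made-up placeholders for illustration, not real
data, and a real lookup would be loaded from the open postcode database.

```python
# Hypothetical lookup table: postcode -> (latitude, longitude).
# The codes and coordinates are made-up placeholders, not real data.
POSTCODE_LATLONG = {
    "AB1 2CD": (51.5, -0.12),
    "EF3 4GH": (53.48, -2.24),
}


def normalise(postcode):
    """Upper-case and collapse whitespace so 'ab1  2cd' matches 'AB1 2CD'."""
    return " ".join(postcode.upper().split())


def geocode_rows(rows, lookup=POSTCODE_LATLONG):
    """Return copies of the rows with 'lat' and 'lon' added.

    Rows whose postcode is missing from the lookup get None for both.
    """
    geocoded = []
    for row in rows:
        lat, lon = lookup.get(normalise(row["postcode"]), (None, None))
        geocoded.append({**row, "lat": lat, "lon": lon})
    return geocoded


rows = [
    {"name": "house for sale", "postcode": "ab1  2cd"},
    {"name": "point of interest", "postcode": "ZZ9 9ZZ"},
]
geocoded = geocode_rows(rows)
```

Each geocoded row can then be plotted directly; rows left with None show
which postcodes the database does not cover.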

 

(2) I have an application in which I want the user to give a location by
postcode as a shorthand for the address, and I want to use that location to
describe the issue or to find other relevant locations.  The classic
examples are http://www.fixmystreet.com/  or a "where is my nearest?"
application (cf http://maps.camden.gov.uk/ ) including the famous toilet
finders.  Again I need to be able to convert the postcode into a lat/long or
equivalent, and then use it to locate the problem or measure distances to
other similarly coded locations.
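
A "where is my nearest?" lookup of the kind in use case (2) can be sketched
as below, assuming the user's postcode has already been converted to a
lat/long.  The haversine great-circle formula is used here as one common way
to measure distance; it is an illustrative choice, not necessarily what the
applications above use, and the place names and coordinates are made-up
placeholders.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points (degrees)."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest(user_latlong, places):
    """Return the place dict closest to the user's (lat, lon)."""
    lat, lon = user_latlong
    return min(places, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


# Made-up placeholder facilities, not real data.
places = [
    {"name": "facility A", "lat": 51.6, "lon": -0.1},
    {"name": "facility B", "lat": 52.5, "lon": -1.9},
]
closest = nearest((51.5, -0.12), places)
```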

 

In both these use cases the key requirement is that the database should be
of postcodes and their corresponding geospatial co-ordinates (in lat/long or
a system convertible to lat/long).  Your proposed definition does not ensure
that.

 

So I would instead make obtaining a lat/long the key requirement.  I
would also allow cases where the corresponding location information (eg town
name) can be converted to a lat/long through the use of other open data (eg
an open address register or gazetteer with lat/longs).

 

So I propose an alternative definition:

 

"a database of postcodes/zipcodes and their corresponding geospatial locations
in terms of a latitude and a longitude (or similar co-ordinates in an openly
published national co-ordinate system).  A database which gives a location
in terms of the name of a town or a street without lat/long co-ordinates is
not acceptable unless the name of the town or street can be further
converted to a latitude and longitude by means of other open data (eg an
open gazetteer with latitude and longitude attributes)."
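
One way to read this definition as a check is sketched below.  It assumes
each database record is a dict with optional "lat", "lon" and "place" keys
and that the open gazetteer is a simple mapping from place name to lat/long;
these field names, the gazetteer and its single entry are hypothetical
placeholders.

```python
# Hypothetical open gazetteer: place name -> (latitude, longitude).
# The entry is a made-up placeholder, not real data.
GAZETTEER = {"Exampleton": (52.0, -1.0)}


def resolve_location(record, gazetteer=GAZETTEER):
    """Return (lat, lon) for a postcode record, or None if unresolvable.

    Direct co-ordinates are preferred; otherwise the town or street name
    is looked up in the open gazetteer.
    """
    if record.get("lat") is not None and record.get("lon") is not None:
        return (record["lat"], record["lon"])
    return gazetteer.get(record.get("place"))


def is_acceptable(records, gazetteer=GAZETTEER):
    """True if every record can be resolved to a lat/long."""
    return all(resolve_location(r, gazetteer) is not None for r in records)


with_coords = [{"postcode": "AB1 2CD", "lat": 51.5, "lon": -0.12}]
with_known_town = [{"postcode": "EF3 4GH", "place": "Exampleton"}]
with_unknown_town = [{"postcode": "IJ5 6KL", "place": "Nowhereville"}]
```

Under this reading, the first two databases would be acceptable and the
third would not, because its town name cannot be converted to a lat/long.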

 

Regards

 

Andrew
