[ckan-discuss] PDF to replace RDF as primary format for data.southampton

Tim McNamara paperless at timmcnamara.co.nz
Fri Apr 1 10:26:16 BST 2011


On 1 April 2011 19:51, Christopher Gutteridge <cjg at ecs.soton.ac.uk> wrote:

> Due to the flak we got over my comments about PDF last week, we've been
> forced to re-evaluate it.
>
> We've made the difficult decision that RDF will no longer be the preferred
> format for information interchange on our data site. Full details about the
> decision here:
> http://blogs.ecs.soton.ac.uk/data/2011/04/01/pdf-selected-as-interchange-format/


This is a real shame. PDF is a presentation format for documents, not a
format for the interchange of data. I don't understand how PDF can be said
to facilitate "exchanging data"[1].

Your blog post makes two arguments: that PDF is useful to "maintain the layout of
complex data sets in the browser on the desktop, and via printed hard copy."
However, neither of those advantages has any bearing on the ability of
machines to consume the data. PDF ruins a parser's ability to preserve
the data's structure, because PDF text is completely unstructured.
Additionally, may I ask why the ability to generate paper copies is
relevant?
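
To illustrate the point about parsers, here is a minimal Python sketch. The
sample records and the function names are invented for illustration, and the
flattened string only approximates what a text extractor might return from a
rendered page; it is not taken from any real data.southampton file.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
STRUCTURED = 'building,room,capacity\n"Building 32",3077,40\n"Building 59",1001,120\n'

# Roughly what a text extractor might return from a rendered page:
# the same characters, but with column boundaries reduced to spaces.
FLATTENED = "building room capacity\nBuilding 32 3077 40\nBuilding 59 1001 120\n"


def parse_structured(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_flattened(text):
    """Naively split each line on whitespace and pair values with the header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


structured_rows = parse_structured(STRUCTURED)
flattened_rows = parse_flattened(FLATTENED)

print(structured_rows[0])
print(flattened_rows[0])
```

In the structured case the building name stays in one field. In the flattened
case "Building 32" is split in two, every later value shifts one column to the
left, and the real capacity is silently dropped. A consumer would have to guess
where the column boundaries were.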

If you wish for people to be able to visualise data, then by all means
provide information via a presentation format. However, your own website[2]
says that "If we make the data available in a structured way with a license
which allows reuse <http://www.nationalarchives.gov.uk/doc/open-government-licence/>
then our members, or anyone else, can build tools on top of it without needless
bureaucracy. That's common sense." If you publish data as PDF, i.e. remove
all structure, you make it impossible to build tools with those data.

I'm confused about why data.southampton would want to subvert its own cause.

Tim McNamara


[1] para 3
http://blogs.ecs.soton.ac.uk/data/2011/04/01/pdf-selected-as-interchange-format/
[2] http://data.southampton.ac.uk