[ckan-discuss] PDF to replace RDF as primary format for data.southampton

Peter Krantz peter.krantz at gmail.com
Fri Apr 1 10:40:07 BST 2011


On Fri, Apr 1, 2011 at 11:26, Tim McNamara <paperless at timmcnamara.co.nz> wrote:
> If you publish
> data as PDF, e.g. remove all structure, you make it impossible to build
> tools with those data.

That is not true. PDF is definitely not unstructured. By parsing the
PDF format you get a machine-readable representation. For example, the
specification [1] provides plenty of detail on how to parse a font
specification or a color in the PDF source, which you could then
convert to, say, a custom XML element.
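
A minimal sketch of that idea, assuming you already have a decompressed
page content stream as text (real PDFs usually compress these streams,
so this step is skipped here). The helper name operators_to_xml and the
XML element and attribute names are hypothetical, made up for
illustration only; the "rg" and "Tf" operators are the standard ones
for setting an RGB fill color and selecting a font.

```python
import re
import xml.etree.ElementTree as ET


def operators_to_xml(content_stream: str) -> ET.Element:
    """Map a few PDF content-stream operators to custom XML elements.

    Hypothetical helper: the element names ("page", "fill-color",
    "font") are invented for this example.
    """
    root = ET.Element("page")

    # "r g b rg" sets the non-stroking (fill) color in DeviceRGB.
    color = re.compile(r"([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+rg\b")
    # "/Name size Tf" selects a font resource and a font size.
    font = re.compile(r"/(\S+)\s+([\d.]+)\s+Tf\b")

    for m in color.finditer(content_stream):
        ET.SubElement(root, "fill-color", r=m.group(1), g=m.group(2), b=m.group(3))
    for m in font.finditer(content_stream):
        ET.SubElement(root, "font", name=m.group(1), size=m.group(2))
    return root


# A tiny hand-written content stream: select font F1 at 12pt, set a red fill.
sample = "BT /F1 12 Tf 1 0 0 rg (Hello) Tj ET"
print(ET.tostring(operators_to_xml(sample), encoding="unicode"))
```

This only recovers presentation details such as fonts and colors, not
the meaning of the data on the page.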

[1]: http://tiny.cc/april-1st

Regards,

Peter


