[ckan-discuss] PDF to replace RDF as primary format for data.southampton
Peter Krantz
peter.krantz at gmail.com
Fri Apr 1 10:40:07 BST 2011
On Fri, Apr 1, 2011 at 11:26, Tim McNamara <paperless at timmcnamara.co.nz> wrote:
> If you publish
> data as PDF, e.g. remove all structure, you make it impossible to build
> tools with those data.
That is not true. PDF is definitely not unstructured. By parsing the
PDF format you get a machine-readable representation. For example, the
specification [1] provides a lot of parsing detail on how to convert
a font specification or a color in the PDF source to, say, a custom XML
element.
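
As a rough sketch of that idea, and not something taken from the
specification itself: the rg and RG operators in a PDF content stream set
the fill and stroke colors from three RGB operands, so a few lines of
Python can turn them into XML. The element and attribute names (colors,
color, usage) and the function name are made up for illustration, and the
input is assumed to be an already decoded content stream with
non-negative operands.

```python
import re
import xml.etree.ElementTree as ET

# Three numeric operands followed by rg (fill color) or RG (stroke color).
# Simplification: signs and exponent notation are not handled.
COLOR_OP = re.compile(r"(?<!\S)([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+(rg|RG)(?!\S)")


def colors_to_xml(content_stream: str) -> str:
    """Convert RGB color operators in a decoded content stream to XML."""
    root = ET.Element("colors")  # hypothetical element name
    for r, g, b, op in COLOR_OP.findall(content_stream):
        ET.SubElement(root, "color", {
            "usage": "fill" if op == "rg" else "stroke",
            "r": r, "g": g, "b": b,
        })
    return ET.tostring(root, encoding="unicode")


# A red fill, a rectangle, then a blue-ish stroke color.
sample = "1 0 0 rg 10 10 100 50 re f 0 0.5 1 RG"
print(colors_to_xml(sample))
```

Content streams in real files are usually compressed, so they would have
to be decoded before anything like this could run on them.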
[1]: http://tiny.cc/april-1st
Regards,
Peter