[School-of-data] PDF vs ePUB vs Open Data, was: PDF Extraction Tools

Friedrich Lindenberg friedrich.lindenberg at okfn.org
Mon Feb 11 20:53:41 UTC 2013


On Mon, Feb 11, 2013 at 8:22 PM, M. Fioretti <mfioretti at nexaima.net> wrote:
> My question is: from an automatic data extraction point of view, what
> is better between ePUB and PDF? In other words, what should an Open
> Data activist, interested in that kind of data processing, accept or
> recommend as a format for ebooks from public administrations?
> My feeling is that ePUB is better, but I am not sure. What do you
> think?

An ePub is simply a Zip file containing HTML and really nice metadata. From
that point of view, it is much preferable to PDF as a publication standard
(unless people start wrapping complex structures like tables into images).
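
To illustrate the "Zip file with HTML and metadata" point, here is a minimal
sketch in Python using only the standard library. It assumes a DRM-free ePub
that follows the usual container layout (META-INF/container.xml pointing at an
OPF package file); the function name read_epub and the file name in the usage
note are made up for this example.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the ePub container and package (OPF) files.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub(path):
    """Return (metadata, content_docs) for a DRM-free ePub file.

    metadata maps Dublin Core element names (title, creator, ...) to lists
    of values; content_docs lists the XHTML files inside the archive.
    Hypothetical helper, not part of any library.
    """
    with zipfile.ZipFile(path) as zf:
        # The container file tells us where the package (OPF) file lives.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        opf_path = rootfile.get("full-path")
        package = ET.fromstring(zf.read(opf_path))

    # Collect the Dublin Core elements from the <metadata> section.
    dc_prefix = "{" + NS["dc"] + "}"
    metadata = {}
    meta_el = package.find("opf:metadata", NS)
    if meta_el is not None:
        for el in meta_el:
            if el.tag.startswith(dc_prefix):
                name = el.tag[len(dc_prefix):]
                metadata.setdefault(name, []).append((el.text or "").strip())

    # Manifest hrefs are relative to the directory holding the OPF file.
    base = posixpath.dirname(opf_path)
    content_docs = [
        posixpath.join(base, item.get("href"))
        for item in package.findall("opf:manifest/opf:item", NS)
        if item.get("media-type") == "application/xhtml+xml"
    ]
    return metadata, content_docs
```

Calling read_epub("report.epub") on such a file would give the title, author
and language as plain strings, plus the list of XHTML documents, which can
then be fed to any ordinary HTML parser. Tables that are real HTML tables
survive this; tables embedded as images do not.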

Some ePub files are encrypted with DRM, but this is true of PDF as well. Both
can be ... dealt with.


 - Friedrich

> TIA,
> Marco