[OpenSpending] Extracting data from PDFs
Lucy Chambers
lucy.chambers at okfn.org
Thu Dec 20 11:17:35 UTC 2012
Hi all,
I figured you might be able to help. My colleague, Michael, is writing
a course on Optical Character Recognition for the School of Data
project.
He's done the easy, nicely formatted PDFs. Now he's looking for some
real-life, nasty examples of PDFs that people have to deal with.
Scanned or photographed PDFs would be ideal, or just really tricky ones, so
that we get a good range of difficulty across the course.
Any pointers would be very helpful. It's really nice to base these courses on
real data that people have actually been grappling with!
Lucy
--
Lucy Chambers
Project Coordinator,
School of Data & OpenSpending
Open Knowledge Foundation
Skype: lucyfediachambers
Twitter: @lucyfedia