[Open-Legislation] msg foo

Stefan Sels stefan at sels.com
Tue Jan 11 23:14:16 UTC 2011


On 11/01/2011 23:48, stef wrote:
> besides ocr being very important, i would also like to stress the need
> for converting pdf to some sensible markup. not much: bullets, headings,
> footnotes should be recognized. tables would be a big bonus. lots of
> documents are only available as pdfs, and for datamining them we need a
> good representation of the text.

well, some nifty scripting would be great, like with attachments and mail 
(but no base64 hell or similar). like: if you detect the structure, you 
prepare it with markup; if not, you add an image with a MIME type.
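a rough sketch of that fallback rule, using Python's standard email package. the function name, the page tuples and the PNG subtype are assumptions made up for illustration, and the OCR step itself is left out. note that the standard library will still base64-encode the image part on the wire, so this does not avoid that part.

```python
from email.message import EmailMessage
from html import escape


def build_message(pages):
    """Build one mail from scanned pages.

    pages is a list of (text, image_bytes) tuples. text is None when
    recognition failed for that page (hypothetical input format).
    """
    msg = EmailMessage()
    msg["Subject"] = "scanned document"
    body = []
    fallbacks = []
    for number, (text, image) in enumerate(pages, start=1):
        if text is not None:
            # Recognized page: goes into the body as markup.
            body.append(f"<p>{escape(text)}</p>")
        else:
            # Unrecognized page: leave a pointer, attach the image later.
            body.append(f"<p>[page {number}: see attached image]</p>")
            fallbacks.append((number, image))
    msg.set_content("<html><body>" + "".join(body) + "</body></html>",
                    subtype="html")
    for number, image in fallbacks:
        # Assumes the scanner produced PNG; adjust the subtype otherwise.
        msg.add_attachment(image, maintype="image", subtype="png",
                           filename=f"page-{number}.png")
    return msg
```

recognized pages end up as escaped paragraphs in the html body, everything else becomes an image/png attachment named after its page number.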

I could even imagine converting it into HTML and then sorting it into an 
IMAP server, letting you access it via your mail or web client.
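filing the converted HTML into an IMAP mailbox could look roughly like this with Python's imaplib. the helper names, the mailbox and the host in the usage comment are placeholders, not an existing setup; the connection is passed in so the filing step can be tried without a server.

```python
import imaplib
import time
from email.message import EmailMessage


def html_message(subject, html):
    """Wrap converted HTML in a mail message (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(html, subtype="html")
    return msg


def file_into_imap(conn, mailbox, msg):
    """Append one message to a mailbox on a logged-in IMAP connection."""
    stamp = imaplib.Time2Internaldate(time.time())
    status, _ = conn.append(mailbox, "", stamp, msg.as_bytes())
    if status != "OK":
        raise RuntimeError(f"IMAP APPEND failed: {status}")
    return status


# Usage against a real server (host and credentials are placeholders):
# with imaplib.IMAP4_SSL("imap.example.org") as conn:
#     conn.login("user", "secret")
#     file_into_imap(conn, "INBOX", html_message("fax 1", "<p>hello</p>"))
```

once the message is in the mailbox, any mail client or webmail frontend that talks to that server can show it.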

like

*send a fax/email/paper letter
to a pseudonymous address (.onion?)

<voodoo here>

*have an auto-generated userid/password login (like with anonbox)
*download the file as pdf
*download the file as a markup document with attachments
*forward it via email, fax, whatever.

you just need a server in a good jurisdiction for that.

iceland anyone? there are great flights for 299 EUR from germany :)

greetings,
   stefan



