[Open-Legislation] All your data are belong to us
stef
stefan.marsiske at gmail.com
Mon May 2 01:52:11 UTC 2011
On Mon, Apr 25, 2011 at 09:47:23AM +0200, JOSEFSSON Erik wrote:
> After a discussion on iindep, I thought this might be a better list for discussing this subject.
> Tratten only scrapes oeil. But oeil does not display e.g. committee deadlines for amendments. For that you'd need a script which monitors each committee page, e.g. IMCO:
>
> http://www.europarl.europa.eu/meetdocs/2009_2014/organes/imco/imco_7leg_meetinglist.htm
>
> and downloads "MEETING DOCUMENTS AVAILABLE", e.g.:
>
> http://www.europarl.europa.eu/meetdocs/2009_2014/organes/imco/imco_20110411_1500.htm
>
> and looks for draft agendas and minutes, e.g.:
>
> http://www.europarl.europa.eu/meetdocs/2009_2014/documents/imco/oj/863/863898/863898en.pdf
>
>
> In such documents the deadlines (and other interesting stuff) are displayed like this:
>
> 24. Modernisation of public procurement
> IMCO/7/05500
> 2011/2048(INI) COM(2011)0015
> Rapporteur: Heide Rühle (Verts/ALE)
> Responsible: IMCO
> Opinions: INTA
> CONT Bart Staes (Verts/ALE)
> ECON Decision: no opinion
> EMPL Julie Girling (ECR)
> ENVI Åsa Westlund (S&D)
> ITRE Konrad Szymanski (ECR)
> REGI
> JURI Decision: no opinion
> * Exchange of views
> * Deadline for tabling amendments:14 July 2011, 12.00
>
>
> I'd be very interested in a script that extracts that last line. Any such hacks around? :-)
sir, your wish:
https://github.com/pudo/parltrack/tree/master/parltrack/scrapers
It currently dumps the tabling deadlines, plus a JSON dump of all data scraped from the
PDFs; see an example here: http://privatepaste.com/download/90a0716f02
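
For illustration only, the core of pulling out that last line might be sketched roughly as below. This is not the parltrack code. It assumes the agenda PDF has already been converted to plain text by some other tool, and that deadlines always follow the "14 July 2011, 12.00" pattern shown in the quoted agenda item. The names DEADLINE_RE and extract_deadlines are made up for this sketch.

```python
import re
from datetime import datetime

# Hypothetical pattern, modelled on the single example line in the quoted
# agenda: "Deadline for tabling amendments:14 July 2011, 12.00".
# Whitespace around the colon is optional because the sample has none.
DEADLINE_RE = re.compile(
    r"Deadline for tabling amendments\s*:\s*"
    r"(\d{1,2}) (\w+) (\d{4}),\s*(\d{1,2})\.(\d{2})"
)


def extract_deadlines(text):
    """Return one datetime per tabling deadline found in the given text."""
    deadlines = []
    for day, month, year, hour, minute in DEADLINE_RE.findall(text):
        stamp = f"{day} {month} {year} {hour}:{minute}"
        # %B expects an English month name under the default C locale.
        deadlines.append(datetime.strptime(stamp, "%d %B %Y %H:%M"))
    return deadlines


# Abridged copy of the agenda item quoted above, used as sample input.
SAMPLE = """\
24. Modernisation of public procurement
IMCO/7/05500
2011/2048(INI) COM(2011)0015
* Exchange of views
* Deadline for tabling amendments:14 July 2011, 12.00
"""

if __name__ == "__main__":
    for deadline in extract_deadlines(SAMPLE):
        print(deadline.isoformat())
```

Real agendas may well phrase or lay out the deadline differently from item to item, so the pattern would probably need loosening once it is run against more documents.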
--
gpg: https://www.ctrlc.hu/~stef/stef.gpg
gpg fp: F617 AC77 6E86 5830 08B8 BB96 E7A4 C6CF A84A 7140