[okfn-be] DierenTheater, parsing (lachambre|dekamer).be (was Re: presentation)

Pieter Colpaert pieter.colpaert at okfn.org
Sun Mar 4 01:09:02 UTC 2012


Hi Laurent,

This is great work! How do you wish to advertise this to the public?
What is the end-product you have in mind?

I think for this to become really useful for end-users we need an
information architect to put all these things in order.

What do you think?

Kind regards,

Pieter

On 02/25/2012 04:35 AM, Laurent Peuch wrote:
> Ohai,
> 
> Some news: DierenTheater now provides nl data as well as fr data.
> 
> Sadly, at the time I'm writing this, the (lachambre|dekamer).be server
> is down again (it's the website with the worst uptime I know), so I
> can't show you the result right now.
> 
>>> You can see the result here http://dieren.vnurpa.ethylix.be/lachambre/document/
>>> One example with quite a lot of data http://dieren.vnurpa.ethylix.be/lachambre/document/1825/
> 
> This often happens on the weekend. I'm wondering whether it's because
> the dude responsible for rebooting the server is at home or because
> the admins just shut the server down for the weekend.
> 
> Anyway, my next step will be to add a REST API. I don't know how much
> time this will take. Django seems to come with a lot of apps to do
> this, maybe I'll find one that matches what I want.
> 
> Also, I first wanted to parse the whole website before doing those 2
> previous steps, but my motivation didn't follow (nothing fun seems to
> be left to parse), so I'm building the API instead to confront my work
> with reality and get feedback.
> 
> The data parsed so far:
> * all deputies and some related information
> * all commissions and their members
> * all law projects and propositions (the "documents")
> * all written questions (there are more than 60 000 of them since the
>   48th legislature)
> * annual reports
> 
> (Note: the web interface showing the data isn't fully up to date.)
> 
> I'm not parsing any PDFs yet.
> 
> After this, the choice will be between parsing new data and building
> an intelligent automatic update strategy (doing a sequential parse of
> the whole website every night seems a bit overkill, and some parts,
> like the commission agendas, are likely to change way more often).
> 
> Have a nice weekend,
> 
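The update strategy mentioned in the quoted message (re-parsing
fast-changing parts more often than a full nightly crawl) could look
roughly like the sketch below. Everything in it is an assumption made
for illustration: the section names, the refresh intervals, and the
names REFRESH_INTERVALS and sections_due are hypothetical and are not
taken from the DierenTheater code.

```python
from datetime import datetime, timedelta

# Hypothetical refresh intervals per section. The values are guesses,
# not measurements of how often (lachambre|dekamer).be really changes.
REFRESH_INTERVALS = {
    "commission_agendas": timedelta(hours=6),
    "documents": timedelta(days=1),
    "written_questions": timedelta(days=1),
    "deputies": timedelta(days=7),
    "commissions": timedelta(days=7),
    "annual_reports": timedelta(days=30),
}


def sections_due(last_parsed, now):
    """Return the sections that need re-parsing, most overdue first.

    last_parsed maps a section name to the datetime of its last
    successful parse. A section missing from the mapping is treated
    as never parsed and is scheduled first.
    """
    due = []
    for section, interval in REFRESH_INTERVALS.items():
        previous = last_parsed.get(section)
        if previous is None:
            # Never parsed: treat as infinitely overdue.
            due.append((float("inf"), section))
            continue
        # Dividing two timedeltas yields a float: how many intervals
        # have elapsed since the last parse.
        overdue_ratio = (now - previous) / interval
        if overdue_ratio >= 1:
            due.append((overdue_ratio, section))
    due.sort(key=lambda item: item[0], reverse=True)
    return [section for _, section in due]
```

A periodic job (cron, for instance) could call sections_due() and run
only the parsers for the names it returns, instead of walking the whole
website sequentially every night. Ordering by how overdue a section is
means that if the server goes down mid-run, the stalest data has
already been refreshed.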


-- 
OKFN Belgium vzw/asbl
+32 (0) 486/747122



