[okfn-bg] Bulgarian Parliament Open Data

Rufus Pollock rufus.pollock at okfn.org
Sat Dec 3 18:04:54 UTC 2011

On 1 December 2011 16:58, Boyan Yurukov <yurukov at gmail.com> wrote:
> Hello,
> I recently updated the scraping software that gathers information on
> the Bulgarian parliament. While they release some open data, it is
> poorly formed XML and incomplete. I've fixed it, added data from the
> website, linked it and remixed it. It has 7400 data points and is 56 MB (11 MB
> zipped). The current data is from the last 10 years. The data
> includes:
> - MP profiles - general bio, supported bills, absences, external
> consultants, participation in groups, committees and previous
> parliaments
> - Data on bills - laws, law propositions, decisions and official
> declarations; links to texts, bill history, etc.
> - Parliament groups - members, proposed bills, changes of members over
> time, consultants
> - Parliament committees - members, discussed bills, documents, changes
> of members over time, consultants
> - Parliament committee sittings - when, where, attending MPs, what
> points were discussed, transcripts and resulting reports
> - Parliament delegations - members, changes of members over time
> - Parliament friendship groups - members
> - Parliament procurement data - all public procurement requests with
> dates, description and procurement registry numbers
> - Inverse lookup lists on all of the above.
> - XSD describing the XML data.

Amazing work, Boyan, and I love the full dataset entry:


:-) (we're about to (re-)introduce a follow extension on the DataHub
and this is going to be one of the first things we'll be following).
It might also be nice to get a brief blog post from you for a series
we're starting on interesting datasets on the DataHub.

> You can find all this on:
> http://parliament.yurukov.net/index_en.html
> Here's all the data:
> http://parliament.yurukov.net/data/data.zip
> And this is the scraping code as open source project:
> https://github.com/yurukov/Bulgarian-Parliament-Open-Data
> What is missing in this data is transcripts from the plenary meetings.
> There's data on those for the past 20 years, but it's not indexed. I'm
> working on scraping/indexing those. I've also set up a cron job to
> update all the data above twice a week. The content of bills and
> transcripts is in Bulgarian, but the structure and tags are in English.
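
For anyone who wants a quick look inside the archive before reading the
XSD, here is a minimal Python sketch that counts the XML files in a zip
by their root element. It assumes only that data.zip is an ordinary zip
containing .xml files; the function name summarize_xml_zip is made up
for this illustration, and nothing here is taken from the scraping code
or assumes any particular element names in the real data.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_xml_zip(zip_bytes):
    """Count the .xml files in a zip archive, grouped by root element tag.

    zip_bytes is the raw content of the archive, for example the bytes
    of a previously downloaded data.zip.
    """
    roots = Counter()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Skip directories and anything that is not an XML file.
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as handle:
                try:
                    # Only the first "start" event (the root element) is
                    # inspected, so a file that is malformed further in
                    # is still counted under its root tag.
                    for _event, element in ET.iterparse(handle, events=("start",)):
                        roots[element.tag] += 1
                        break
                except ET.ParseError:
                    # Files that fail before any element is seen.
                    roots["<unparseable>"] += 1
    return roots
```

To try it on a local copy, read the file with open("data.zip", "rb")
and pass the bytes to summarize_xml_zip; the returned Counter maps each
root tag to the number of files that use it.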

Sounds fantastic :-)

