[iRail] BeLaws

Pieter Colpaert pieter.colpaert at gmail.com
Sat Feb 12 16:01:15 UTC 2011


Hi list!

I had trouble looking up some Belgian laws, so I decided to scrape them and
build my own "one-field, Google-like" search site. Since I use it for the NPO
and some people were interested in it, I thought it would be worth sharing
my results so far:

The project is at http://github.com/iRail/BeLaws

1. Scraping: The scraping is done. All the laws have been downloaded to our
server. You can download them yourself (which will take a very long time)
using the fetcher script in the scraper directory.
2. Parsing: Tim Esselens will write a Perl script to parse the HTML
files into a more readable format. He will add an API to it so that
everyone can write their own applications for looking up the law.
3. Hosting: The project will be hosted at belaws.iRail.be when it's
ready.
4. Interface: We put everything in Apache Lucene (an indexer and
full-text search engine). I'm writing a Java Servlet for it. You can
see the current results in the attached screenshots.
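
To give an idea of what steps 2 and 4 amount to, here is a rough sketch in
Python using only the standard library. It is not the BeLaws code: the real
parser is planned in Perl and the real index is Apache Lucene. The function
names and the sample documents below are invented for illustration. The
sketch strips the HTML down to text, builds a small inverted index (term ->
set of document ids), and answers a one-field query by returning the
documents that contain every query term.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the text content of `html` with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Build an inverted index from a dict of doc id -> HTML string."""
    index = defaultdict(set)
    for doc_id, html in docs.items():
        for term in tokenize(html_to_text(html)):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing all terms of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample documents, not real scraped laws.
SAMPLE_DOCS = {
    "law-1": "<html><body><h1>Law A</h1><p>Railway transport of passengers</p></body></html>",
    "law-2": "<p>Transport of goods by road</p>",
}

if __name__ == "__main__":
    index = build_index(SAMPLE_DOCS)
    print(sorted(search(index, "railway transport")))
```

Lucene does far more than this (ranking, stemming, on-disk storage, phrase
queries), so treat the sketch only as a picture of the idea behind an
indexer plus a single search field.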

Pieter

-- 
+32 (0) 486 74 71 22
iRail vzw/asbl

http://project.iRail.be
-------------- next part --------------
A non-text attachment was scrubbed...
Name: screenshot7.png
Type: image/png
Size: 14985 bytes
Desc: not available
URL: <http://lists.okfn.org/pipermail/irail/attachments/20110212/3b2a45cf/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: screenshot8.png
Type: image/png
Size: 12786 bytes
Desc: not available
URL: <http://lists.okfn.org/pipermail/irail/attachments/20110212/3b2a45cf/attachment-0005.png>

