[iRail] API Liveboards - scraper failure due to new mobile railtime website

Pieter Colpaert pieter.colpaert at gmail.com
Tue Sep 6 16:04:08 UTC 2011


I agree. We no longer parse any HTML with regex.
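
To illustrate the point, here is a minimal sketch in Python (not the PHP code iRail actually uses) comparing a brittle regex with a real HTML parser. The markup in `SAMPLE` is hypothetical and does not reproduce the real m.railtime.be pages; `CellCollector`, `parse_rows` and `BRITTLE` are names made up for this example.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup: two rows that differ only in quoting and whitespace.
SAMPLE = """
<table>
  <tr class="row"><td>16:04</td><td>Oostende</td></tr>
  <tr class='row' ><td>16:12</td>
      <td>Li&egrave;ge</td></tr>
</table>
"""

# A regex written against the first row's exact layout.
BRITTLE = r'<tr class="row"><td>(.*?)</td><td>(.*?)</td></tr>'


class CellCollector(HTMLParser):
    """Collect the text of every <td>, grouped per <tr>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:  # tolerate a <td> outside any <tr>
                self.rows.append([])
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append instead of assigning.
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_rows(html):
    parser = CellCollector()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.rows


if __name__ == "__main__":
    print(len(re.findall(BRITTLE, SAMPLE)))  # the regex sees only one row
    print(parse_rows(SAMPLE))                # the parser sees both rows
```

The regex silently misses the second row as soon as the quoting or whitespace changes, while the parser keeps working and also decodes the entity.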

Pieter

On Tue, 2011-09-06 at 18:00 +0200, Ludovic Gasc wrote:
> I've found an article on why you mustn't use regex with HTML:
> http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
> 
> On Tue, Aug 30, 2011 at 12:58 AM, Ludovic Gasc <gmludo at gmail.com> wrote:
> > Small remark:
> > I've used this library: http://simplehtmldom.sourceforge.net/manual.htm
> >
> > For me, it will be easier to adapt your source code quickly with an
> > HTML parser than with a regex.
> >
> > If it's a problem for somebody, please give us the right regex to do this.
> >
> > On Tue, Aug 30, 2011 at 12:56 AM, Ludovic Gasc <gmludo at gmail.com> wrote:
> >> Hi,
> >>
> >> As discussed on the IRC channel, this is a patch to fix this problem:
> >> https://github.com/iRail/iRail/pull/15
> >>
> >> Thanks for your remarks.
> >>
> >> Yours.
> >>
> >> PS: It's my first patch for iRail, so please be indulgent with me ;-)
> >>
> >> On Sun, Aug 28, 2011 at 11:25 PM, Pieter Colpaert
> >> <pieter.colpaert at gmail.com> wrote:
> >>> Hi list,
> >>>
> >>> Today the guys from railtime launched a new website, m.railtime.be,
> >>> without notifying us (although we asked DUO, Infrabel and NMBS to do
> >>> so). Of course, our liveboard API, which still scrapes this site,
> >>> broke.
> >>>
> >>> Quentin made a quick fix for it and partially rewrote the scraper. A lot
> >>> of code was reusable, although they changed the way you ask for a
> >>> specific station. Now you have to give the exact railtime name AND the
> >>> exact railtime id. As we have those in our database, we could quickly
> >>> fix this as well. Why did they do this? No idea.
> >>>
> >>> If anything is still not functioning correctly, please wait until after
> >>> Wednesday, when both of our examinations are finished, before notifying
> >>> Quentin and me ;)
> >>>
> >>> Of course, the api is still at:
> >>> http://api.irail.be/liveboard/?station=brussel%20centraal&lang=nl&format=json
> >>>
> >>> Kind regards,
> >>>
> >>> Pieter
> >>>
> >>>
> >>> --
> >>> iRail vzw/asbl
> >>> +32 (0) 486/747122
> >>>
> >>> _______________________________________________
> >>> iRail mailing list
> >>> iRail at list.irail.be
> >>> http://lists.rootspirit.com/mailman/listinfo/irail
> >>>
> >>
> >>
> >>
> >> --
> >> Ludovic Gasc
> >>
> >
> >
> >
> > --
> > Ludovic Gasc
> >
> 
> 
> 
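
For anyone calling the liveboard API quoted above from a script, this is a rough sketch of building that URL with correct percent-encoding. It is in Python only for illustration; `liveboard_url` is a hypothetical helper, and the endpoint and the `station`, `lang` and `format` parameters are taken solely from the URL quoted in this thread.

```python
from urllib.parse import quote, urlencode

# Endpoint exactly as quoted in this thread.
BASE = "http://api.irail.be/liveboard/"


def liveboard_url(station, lang="nl", fmt="json"):
    """Hypothetical helper: build a liveboard request URL."""
    params = {"station": station, "lang": lang, "format": fmt}
    # quote_via=quote encodes spaces as %20; the default would emit '+'.
    return BASE + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(liveboard_url("brussel centraal"))
    # http://api.irail.be/liveboard/?station=brussel%20centraal&lang=nl&format=json
```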

-- 
iRail vzw/asbl
+32 (0) 486/747122



