Opening up rail performance data

Dafydd Vaughan on 12 July 2011

I admit it; I’m a bit of a train buff.  I don’t stand around at the end of platforms recording the numbers of trains, but I do like to know what is going on and how everything works.

I’ve been a regular user of trains for nearly 10 years.  When I was in college, I had to use the train to get to classes.  Before I moved to Cardiff in December, I commuted to work by train every day.  Now, I travel by train to meetings in London on a pretty regular basis.  I like to think that I’m a bit of an expert at train travel.

One of the things that has always intrigued me is the set of rail performance figures published every few weeks. Somewhere in a secluded corner of a station will be a poster, sometimes titled “Passenger Charter Figures”, that is supposed to tell you how many trains run ‘on time’.

Normally (at least every time I’ve ever looked), the train company seems to have achieved its targets.  I always thought this was quite strange given the number of times my train was delayed or didn’t show up at all.

It turns out that there are numerous loopholes and tricks* they can use to make their figures look good.  For example, ‘on time’ doesn’t mean on time; it actually means the train arrives at its destination no more than 5 minutes late (10 minutes for intercity travel).  Note the use of ‘destination’.  The train could depart 30 minutes late and drop me off at my stop 25 minutes late; but as long as it reaches its destination no more than 5 minutes late (which for my journey could be nearly 5 hours later), the train is classed as ‘on time’.  Timetables have lots of padding built in, which helps with this.
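To make the loophole concrete, here is a rough Python sketch of that rule.  The function names, the service labels and the example journey are all made up for illustration, and the thresholds are simply the 5 and 10 minute figures quoted above; this is not how the industry actually calculates its numbers.

```python
# Illustrative only: lateness limits in minutes, as quoted in the post.
ON_TIME_LIMIT_MINUTES = {"local": 5, "intercity": 10}


def counts_as_on_time(minutes_late_by_stop, service_type="local"):
    """Judge a train by its lateness at the final stop only."""
    # Each entry is (station name, minutes late); the last one is the destination.
    final_lateness = minutes_late_by_stop[-1][1]
    return final_lateness <= ON_TIME_LIMIT_MINUTES[service_type]


def late_stops(minutes_late_by_stop, service_type="local"):
    """List the stations where the same train was over the limit."""
    limit = ON_TIME_LIMIT_MINUTES[service_type]
    return [stop for stop, late in minutes_late_by_stop if late > limit]


# Hypothetical journey: leaves 30 minutes late, reaches my stop 25 minutes
# late, then makes up time and reaches the destination 4 minutes late.
journey = [("Origin", 30), ("My stop", 25), ("Destination", 4)]

print(counts_as_on_time(journey))  # True
print(late_stops(journey))         # ['Origin', 'My stop']
```

The train counts as ‘on time’ even though it was well over the limit everywhere a passenger like me would have noticed.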

Luckily, last year I got to work on a joint project with Passenger Focus (the rail passenger watchdog) on ways of making these figures more useful.  We produced a basic HTML prototype to show how more detailed performance figures could be used by the public to hold train companies to account.

Unfortunately, even though much more detailed performance information is collected (the time a train passes each monitoring point on the network is recorded by Network Rail), it isn’t published, let alone open for re-use.  It’s generally considered ‘commercially sensitive’, which is ridiculous.  Even if we wanted to use the existing figures (which, in my view, are crap), we wouldn’t be able to: they aren’t open either.

We demoed our prototype and our ideas to a few people including the Office of Rail Regulation (ORR), the public body that sets the terms of rail company operator licences, in the hope that they would force the rail companies (or Network Rail) to open up the data.

We are, of course, not the only ones who have been trying to get this data opened up. Many other individuals and groups have been campaigning to the plethora of organisations involved. Some of them have been campaigning for a lot longer than we have.

Last week the government announced a huge number of public datasets to open up.  Amongst these, two important elements stand out:

  • Office of Rail Regulation to increase the amount of data published relating to service performance and complaints by May 2012.
  • Rail timetable information to be published weekly by National Rail from December 2011.

Obviously, everyone involved in the various campaigns has managed to get the issue to register on the radar of ORR!  Publishing this data is a huge and welcome step forward.  Everyone involved deserves a pat on the back.

But, we’re not there yet. We now need to make sure that the published performance data is detailed enough to be useful to the public.  Then, once it’s published, we need to make sure there are tools available that use it, so that the public can finally start to hold their train companies to account.

* the train companies would say that their figures are independently audited and there are no tricks and loopholes.

Photo Credit: AndrewHA (Flickr CC)

Edit: added a few links.