Journey history as open data, and cooperation with developers

Last April, I blogged about Chromaroma and its use of Oyster journey data from Transport for London. Since then, I've continued to hold up Chromaroma as one of the best examples of what can be done with journey data, even without cooperation from the transit authority.

When I first covered Chromaroma, I pointed out that they were screen-scraping the Transport for London site in order to retrieve Oyster journey histories, and I discussed some potential options for avoiding what is an inherently inelegant process, including the use of OAuth for authentication, and the definition of a common format for journey history data interchange.
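To make the idea of a common interchange format a little more concrete, here is a minimal sketch in Python of what a single journey record and its JSON serialization might look like. Everything in it is an assumption for illustration: the `Journey` class, its field names, and the `journeys_to_json` and `journeys_from_json` helpers are hypothetical, and do not reflect any format that TfL, Chromaroma, or anyone else has actually defined.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime


# Hypothetical record layout; not an actual TfL or Chromaroma format.
@dataclass
class Journey:
    card_id: str        # opaque identifier for the fare card
    origin: str         # station or stop where the rider tapped in
    destination: str    # station or stop where the rider tapped out
    tap_in: datetime
    tap_out: datetime
    fare_pence: int     # fare charged, in the smallest currency unit


def journeys_to_json(journeys):
    """Serialize journeys to JSON, writing timestamps as ISO 8601 strings."""
    def encode(journey):
        record = asdict(journey)
        record["tap_in"] = journey.tap_in.isoformat()
        record["tap_out"] = journey.tap_out.isoformat()
        return record

    return json.dumps({"journeys": [encode(j) for j in journeys]}, indent=2)


def journeys_from_json(text):
    """Parse the JSON produced by journeys_to_json back into Journey objects."""
    journeys = []
    for record in json.loads(text)["journeys"]:
        record["tap_in"] = datetime.fromisoformat(record["tap_in"])
        record["tap_out"] = datetime.fromisoformat(record["tap_out"])
        journeys.append(Journey(**record))
    return journeys
```

With a format along these lines, an application could consume journey histories from any agency that exported them, without needing to know how each agency's Web site is laid out.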

Now that I've proposed the development of a system based on journey data, I'd like to revisit how Chromaroma has been doing.

Since I don't live in London, I'm a bit out of the loop, and this is old news by now. In August 2011, with no notice to developers, Transport for London redesigned the Oyster online interface. This immediately broke the processes Chromaroma used to fetch users' journey histories.

Chromaroma's official blog post on the matter is a depressing story:

The marketing department, the press department and the teams in charge of behavioural change all get it — and they see the value. Sadly, there are still TfL departments that seem unwilling to get it. More importantly, there are individuals who don’t seem to see the world they are operating in, and it appears it is these people who are pulling the strings.

Influential people within the organisation seem opposed to innovation, Open Data, and working with external developers. Despite all the talk of open data and new ideas, there is still an overriding attitude of not taking risks or deviating from the status quo. The general concerns are only of cost, and not those of the future. Or the consumer.

Our favourite TfL quote has to be “Who wants to play games? In this age of austerity, no one wants to play games. I can’t see any value in this type of website.”

The post goes on to describe how the developers had offered to pay for the development of an OAuth-enabled interface for the data, or provide an interface so that users could at least extract their own journey histories from the system (presumably in a machine-readable format).

While many transit agencies have been quick to adopt GTFS and offer their schedule information online, progress has been much slower on other forms of data, like performance statistics and journey history. In large part, this seems to be due to the effort involved—agencies already using common software for scheduling (like Trapeze or HASTUS) can easily export their schedule data to GTFS, whereas offering an entirely new data feed involves considerably more development effort.

Of course, whenever you are screen-scraping, as Chromaroma is, you run the risk that the Web site you're scraping might change in some way that breaks your software. So I'd like to make clear that I don't take issue with Transport for London changing the design of the Oyster site. On the contrary, my concern is with TfL's refusal to work with the Chromaroma developers in any meaningful way, even when they offered to pay for improvements to TfL's systems.
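The fragility of screen-scraping is easy to demonstrate. The sketch below uses Python's standard `html.parser` module to pull journey descriptions out of table cells. The markup in `OLD_PAGE` and `NEW_PAGE`, the `journey` class name, and the `scrape_journeys` function are all invented for illustration; I have no idea what the Oyster site's HTML actually looked like before or after the redesign, or how Chromaroma's scraper was written.

```python
from html.parser import HTMLParser


class JourneyCellParser(HTMLParser):
    """Collect the text of every <td class="journey"> cell (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "td" and ("class", "journey") in attrs:
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def scrape_journeys(html):
    """Return journey descriptions, failing loudly if none are found."""
    parser = JourneyCellParser()
    parser.feed(html)
    parser.close()
    if not parser.cells:
        # An empty result most likely means the page layout has changed.
        raise ValueError("no journey cells found; page layout may have changed")
    return [cell.strip() for cell in parser.cells]


# Invented examples of a page before and after a redesign.
OLD_PAGE = '<table><tr><td class="journey">Bank to Angel</td></tr></table>'
NEW_PAGE = '<ul><li class="trip">Bank to Angel</li></ul>'
```

Calling `scrape_journeys(OLD_PAGE)` finds the journey, while `scrape_journeys(NEW_PAGE)` raises an error even though the same information is still on the page. The best a scraper can do is notice the breakage quickly; only a supported data feed removes the problem.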

Some transit agencies, including New York's MTA, engage in spirited discussions with developers, continuously working to improve the data they provide, knowing that better data translates directly to more satisfied riders. Regrettably, it seems that many other agencies have yet to learn that lesson.