Last week, ReadWriteWeb profiled WMATA’s open data efforts, from the agency’s initial (and ultimately unsuccessful) attempts to monetize the data, through the release of a real-time API and GTFS feed, to the eventual inclusion of WMATA’s data in Google Transit.
The ReadWriteWeb article paints this as a complete success story, in which, as David Alpert puts it, WMATA “got religion on open data”.
The reality is somewhat different. A GTFS feed and a real-time data API may have been a substantial step forward in 2008, when this process started, but today there are many other categories of data WMATA could expose, and open, interoperable formats it could use to do so (particularly for real-time data). In addition, WMATA’s communications with developers could be better. While some agencies have active discussion groups where agency staff communicate freely with developers, at WMATA developers still get a somewhat chilly reception.
How could WMATA’s open data efforts be improved? Here are four suggestions:
- Increase the richness of existing data: Many of WMATA’s existing feeds are useful, but only to a point. For example, there’s a feed for alerts, but it only covers the rail system. In addition, the metadata provided in the feed is weak; it’s hard to determine the severity of an incident or the affected area from the metadata alone. And while there is a feed for rail predictions, there’s no feed to tell you where the trains are, which for some applications is more useful than predictions.
- Make use of open standards for real-time data: It’s great that WMATA provides real-time data, but unfortunately the API used is specific to WMATA. Any developer who builds an app to work with WMATA’s API can’t take that code and re-use it in another city. The alternative would be for WMATA to adopt an open standard like SIRI or GTFS-realtime to disseminate real-time data. This way, developers can build an app once, and have it work with data sources provided by a range of transit authorities, without having to write more code for each additional city or transit authority.
- Provide a wider range of data: The data that WMATA provides is fairly basic: bus and rail stop locations, schedules, predictions, and related data. But there’s a lot more the agency could release, including performance data and ridership statistics. I’ve previously written about the need for WMATA to provide more performance data. In New York, the MTA provides several useful data sets having to do with fare media sales and turnstile usage, while Transport for London provides access to ridership data from its Rolling Origin & Destination Survey.
- Provide archival data: For the most part, transit agencies tend to provide data on current service, but sometimes developers and researchers are best served by having access to archival data. As an example, I’ve recently started a project to archive bus position data—but what if the agency provided certain historical data sets itself? This is something that the MTA in New York has done with both bus and rail data.
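To make the archival suggestion a little more concrete, here is a minimal sketch of what storing position snapshots might look like. It is not WMATA's API, nor the code behind any existing archive: the `VehiclePosition` fields, the `positions` table, and the function names are all assumptions made up for illustration, and fetching from a live feed is deliberately left out.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class VehiclePosition:
    """One observed vehicle location (hypothetical field names)."""
    vehicle_id: str
    route_id: str
    lat: float
    lon: float
    observed_at: int  # Unix timestamp, in seconds, of the observation


def open_archive(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the archive database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS positions (
            vehicle_id  TEXT    NOT NULL,
            route_id    TEXT    NOT NULL,
            lat         REAL    NOT NULL,
            lon         REAL    NOT NULL,
            observed_at INTEGER NOT NULL,
            PRIMARY KEY (vehicle_id, observed_at)
        )
        """
    )
    return conn


def _row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM positions").fetchone()[0]


def archive_snapshot(
    conn: sqlite3.Connection, positions: Iterable[VehiclePosition]
) -> int:
    """Store one polling snapshot; return how many new rows were added.

    Polling a feed repeatedly tends to return the same observation more
    than once, so duplicates (same vehicle, same timestamp) are ignored.
    """
    before = _row_count(conn)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO positions VALUES (?, ?, ?, ?, ?)",
            [
                (p.vehicle_id, p.route_id, p.lat, p.lon, p.observed_at)
                for p in positions
            ],
        )
    return _row_count(conn) - before


def history(
    conn: sqlite3.Connection, vehicle_id: str
) -> List[Tuple[int, float, float]]:
    """Return (observed_at, lat, lon) for one vehicle, oldest first."""
    rows = conn.execute(
        "SELECT observed_at, lat, lon FROM positions "
        "WHERE vehicle_id = ? ORDER BY observed_at",
        (vehicle_id,),
    )
    return list(rows)
```

A polling script would call `archive_snapshot` on each fetch and later use `history` to reconstruct a vehicle's path. Keying on vehicle and timestamp is one possible design choice; an agency publishing its own historical data might well organize it differently.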