Synoptic first!

So, you’re a transit agency (or vendor, consultant, system integrator, etc.), and you’ve decided to develop an API to expose your real-time data. Perhaps you’ve gotten queries from developers like “I want to be able to query an API and get next bus arrivals at a stop…”.

It’s hard to say “no”, but fulfilling that developer’s request may not be the best way to go. If you have limited resources available to expose your data, there are better approaches available, which will in the long term enable the development of more advanced applications.

Building Momentum with open data, open source, and open architecture

Max Gano, a Solution Architect with Sound Transit Research and Technology, describes his work in transit technology with the following catchphrase: “open data + open source + open architecture”.

WMATA’s ambitious new strategic plan, Momentum, includes a number of technology-related goals. How can the principles of open data, open source, and open architecture help us build Momentum?

As part of Momentum’s goal to “[m]eet or exceed customer expectations by consistently Delivering Quality Service”, WMATA seeks to “[m]ake it easy and intuitive to plan, pay, and ride”, with specific ‘strategic actions’ including to “[p]rovide readily-understandable and useful real-time information in stations, stops, and on vehicles” and “[p]rovide transit riders with a regional trip planning system that works for all systems and provides real-time information in vehicles, in stations, at bus stops, and on any device”.

These are high-level strategic goals, not detailed technical specifications, but the intent is still clear. Now, how do we achieve those goals? If past experience is any indication, they’ll be achieved by handing the contract off to a big-name vendor who will extract a significant sum from WMATA to deliver one of the major off-the-shelf transit passenger information systems; the project will probably come in late and over-budget, and when it’s done there probably still won’t be a usable API for developers.

This isn’t just an opportunity to take pot shots at government IT contractors; it’s a scenario which has played itself out over and over again. One of the latest and most visible examples has been healthcare.gov—and while many of the criticisms leveled at the site unfairly neglect to account for the complexity mandated by the Affordable Care Act itself, the reality is that what was built could have been built better, faster, and cheaper. Take The Health Sherpa, for example. Using the same data which powers healthcare.gov, they’ve built a bare-bones plan finder which (gasp!) actually works! Now, Health Sherpa doesn’t handle the enrollment side of the transaction, one of the largest pain points to date—but it still demonstrates that you can provide core functionality with a better, faster, cheaper alternative.

It’s the same way with transit data. Using proven open-source packages like OneBusAway and OpenTripPlanner, we can advance WMATA’s strategic goals, and we can advance them without the cost, delay, and dysfunction traditionally associated with government IT procurement. With open data, open source, and open architecture, we can deliver a cutting-edge product, while saving money and putting a better product in riders’ hands faster.

But to do this, we need the region’s transit authorities to fully embrace open data and open system architectures. Breaking free from the cost, inflexibility, and data silos traditionally associated with transit passenger information systems means fully considering open-source products alongside their proprietary counterparts, and demanding that agencies work with and alongside—not against—open-source developer communities, both locally and across cyberspace. Leveraging open standards like SIRI and GTFS-realtime, publishing high-quality open data, and collaborating with developers to enhance the quality and utility of data products all contribute to a better end product, and, ultimately, help advance the strategic goals which WMATA has outlined in Momentum.

With OneBusAway and OpenTripPlanner we already have the technology to deliver much of what WMATA seeks as part of Momentum. Are we completely there? No. Both of these projects are works in progress, with contributions coming in on a daily basis from a global community of software developers. But in terms of core features, OpenTripPlanner is already an outstanding package for trip planning (having proven itself in Portland), including real-time routing for transit and bike sharing systems, and OneBusAway provides a level of transit data integration yet unmatched in the region, as the Mobility Lab’s pilot shows. As these packages mature (including via support from forward-thinking transit agencies) the advancements benefit transit riders around the world, in every city where they’ve been deployed.

By using open-source software, agencies are no longer tied to a single vendor for service, support, and new features. No longer are they tied to costly service contracts, exorbitantly-priced change orders, or business-driven mandates to upgrade to a vendor’s latest-and-greatest (more expensive) product. Instead, agencies can hire their own staff developers, or contract with the vendor of their choice to build new features, maintain the system, and advance their strategic goals.

Similarly, because these tools are built around open infrastructure, there’s nothing stopping an agency from swapping out components of these systems with other alternatives built around the same data standards. Silos are non-existent in an open-architecture world. There’s no vendor lock-in, no fear of proprietary data formats, and no risk in experimenting with alternative tools.

Taking these tools from hundreds of thousands of lines of code on a server (or even just a conceptual plan) to a polished, production-ready service is a different kind of task than just writing a check to a vendor and walking away, but it’s hardly an insurmountable challenge: New York’s MTA has experienced extraordinary success in developing Bus Time on top of OneBusAway, saving money and delivering more features faster than proprietary alternatives. Along the way they’ve collaborated with the global developer community: this interactive list of routes was built not by the system integrator responsible for the Bus Time project, not by an MTA staff developer, but by a member of the global developer community (yours truly, in fact).

Simply put, the model works, and the benefits are considerable. Can our region’s transit agencies overcome their technophobia and intransigence in time to avoid a healthcare.gov-esque disaster?

GTFS-realtime for WMATA buses

I’ve posted many times about the considerable value of open standards for real-time transit data. While it’s always best if a transit authority offers its own feeds using open standards like GTFS-realtime or SIRI, converting available real-time data from a proprietary API into an open format still gets the job done. After a few months of kicking the problem around, I’ve finally written a tool to produce GTFS-realtime StopTimeUpdate, VehiclePosition, and Alert messages for Metrobus, as well as GTFS-realtime Alert messages for Metrorail.

The tool, wmata-gtfsrealtime, isn’t nearly as straightforward as it might be, because while the WMATA API appears to provide all of the information you’d need to create a GTFS-realtime feed, you’ll quickly discover that the route, stop, and trip identifiers returned by the API bear no relation to those used in WMATA’s GTFS feed.

One of the basic tenets of GTFS-realtime is that it is designed to directly integrate with GTFS, and for that reason identifiers must be shared across GTFS and GTFS-realtime feeds.
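To make the shared-identifier requirement concrete, here is a minimal sketch of a check that every trip ID referenced by a real-time feed also exists in the GTFS feed. The helper names (`load_ids`, `unmatched_ids`) and the sample data are hypothetical; the only thing taken from GTFS itself is that `trips.txt` is a CSV file with a `trip_id` column.

```python
import csv
import io


def load_ids(csv_text, column):
    """Collect the distinct values of one column from a GTFS CSV file's text."""
    return {row[column] for row in csv.DictReader(io.StringIO(csv_text))}


def unmatched_ids(realtime_ids, gtfs_ids):
    """Return the real-time identifiers that have no counterpart in the GTFS feed."""
    return sorted(set(realtime_ids) - set(gtfs_ids))


# Hypothetical sample data, standing in for a real trips.txt and a real feed.
trips_txt = "route_id,service_id,trip_id\n10A,1,T1\n10A,1,T2\n"
gtfs_trip_ids = load_ids(trips_txt, "trip_id")
missing = unmatched_ids(["T1", "X9"], gtfs_trip_ids)
```

Any identifier that ends up in `missing` is one a GTFS-realtime consumer would be unable to tie back to the schedule.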

In WMATA’s case, this means that it is necessary to first map routes in the API to their counterparts in the GTFS feed, and then, for each vehicle, map its trip to the corresponding trip in the GTFS feed. This is done by querying a OneBusAway TransitDataService (via Hessian remoting) for active trips for the mapped route, then finding the active trip which most closely matches the vehicle’s trip.

Matching is done by constructing a metric space in which the distance between a stoptime in the API data and its counterpart in the GTFS feed is defined as an (x, y, t) tuple—that is, our notion of “distance” becomes distance in both space and time. The distances fed into the metric are actually halved, in order to bias the scores towards matching based on time, while allowing some leeway for stops which are wrongly located in either the GTFS or real-time data.
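The idea can be sketched roughly as follows. This is not the code from wmata-gtfsrealtime, only an illustration under stated assumptions: stop times are (x, y, t) tuples with x and y in meters and t in seconds, the spatial components are the ones halved, the components are combined as a Euclidean distance, and a candidate trip is scored by summing each API stop time's distance to its nearest GTFS stop time. All function names are hypothetical.

```python
import math


def stoptime_distance(a, b, spatial_weight=0.5):
    """Space-time distance between two (x, y, t) stop times.

    Assumption: x and y are in meters, t in seconds. The spatial terms are
    scaled by spatial_weight (0.5 halves them) so that time dominates.
    """
    dx = (a[0] - b[0]) * spatial_weight
    dy = (a[1] - b[1]) * spatial_weight
    dt = a[2] - b[2]
    return math.sqrt(dx * dx + dy * dy + dt * dt)


def trip_score(api_stoptimes, gtfs_stoptimes):
    """Sum, over the API stop times, of the distance to the nearest GTFS stop time."""
    return sum(
        min(stoptime_distance(a, g) for g in gtfs_stoptimes)
        for a in api_stoptimes
    )


def best_match(api_stoptimes, candidates):
    """Pick the candidate trip ID with the lowest score.

    candidates maps a GTFS trip ID to that trip's list of (x, y, t) stop times.
    """
    return min(candidates, key=lambda trip_id: trip_score(api_stoptimes, candidates[trip_id]))


# Hypothetical example: trip "A" runs at the same time as the vehicle's trip,
# trip "B" serves the same stops ten minutes later.
api_trip = [(0, 0, 0), (100, 0, 60)]
candidates = {
    "A": [(0, 0, 0), (100, 0, 60)],
    "B": [(0, 0, 600), (100, 0, 660)],
}
matched = best_match(api_trip, candidates)
```

With the spatial terms halved, a stop that is misplaced by a few dozen meters costs less than a schedule offset of a minute, which is the bias towards time described above.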

The resulting algorithm will map all but one or two of the 900-odd vehicles on the road during peak hours. Spot-checking arrivals for stops in OneBusAway against arrivals for the same stop in NextBus shows relatively good agreement; of course, considering that NextBus is a “black box”, unexplained variances in NextBus arrival times are to be expected.

You may wonder why we can’t provide better data for Metrorail; the answer is simple: the API is deficient. As I’ve previously discussed, the rail API only provides the same data you get from looking at the PIDS in stations. Unfortunately, that’s not what we need to produce a GTFS-realtime feed. At a minimum, we would need to be able to get a list of all revenue trains in the system, including their current schedule deviation, and a trip ID which would either match a trip ID in the GTFS feed, or be something we could easily map to a trip ID in the GTFS feed.

This isn’t how it’s supposed to be. Look at this diagram, then, for a reality check, look at this one (both are from a presentation by Jamey Harvey, WMATA’s former Enterprise Architect). WMATA’s data management practices are, to say the least, sorely lacking. For most data, there’s no single source of truth. The problem is particularly acute for bus stops; one database might have the stop in one location and identified with one ID, while another database might have the same physical stop identified with a different number, and coordinates that place it in an entirely different location.

Better data management practices would make it easier for developers to develop innovative applications which increase the usability of transit services, and, ultimately, improve mobility for the entire region. Isn’t that what it’s supposed to be about, at the end of the day?

OneBusAway might be coming to Ride On, maybe?

Today on GitHub I came across this commit. I don’t quite know what’s going on, but it sure looks to me like someone at Greenhorne & O’Mara or Ride On has been experimenting with OneBusAway and Ride On’s data.

This is something in which I am keenly interested. But unlike in other cities, here there seems to be almost no interest in connecting transit agencies with each other and with local developers. There’s great value in doing both: connecting transit agencies together helps reduce duplicated effort and provide riders with harmonized, federated services. But even more importantly, connecting transit agencies with interested developers can provide transit riders with services that might have been cost-prohibitive or otherwise infeasible for those agencies to develop in-house or through conventional procurement methods.

There are a lot of innovative developers out there, with lots of great ideas. It’s unreasonable to expect transit authorities to shoulder the risk of incubating all of those ideas, some of which might fail spectacularly, but it’s quite another thing for transit authorities to, on a best-effort basis, provide those developers with the data they need to bring their ideas to fruition.

I’d have thought that this would fall within the remit of the Mobility Lab, but more than a year after the launch of the Mobility Lab, that still hasn’t happened.

So, while there’s plenty going on, there’s not a whole lot of coordination, whether between agencies or between agencies and the community. Thus we have nearly a half-dozen real-time sign projects going on in the region—and who knows how much more duplicated work is being done, with everyone toiling behind closed doors!

Contrast that to New York City, where the MTA has been working—transparently—to develop MTA Bus Time, based on OneBusAway. When the agency began work on a GTFS-realtime feed for real-time subway arrivals from the IRT ATS system, once again, they turned to the community to get comments on the proposed specification. MTA developers are active on the agency’s mailing list to respond to questions and bug reports from developers.

In Portland, TriMet worked with OpenPlans to develop OpenTripPlanner, transparently, in full view of the community. OpenTripPlanner has proven to be a huge success, powering a first-of-its-kind regional, intermodal trip planner.

Transit in the Washington, D.C. area isn’t all that different from transit in Portland or New York. Sure, the modes vary from city to city, and the Lexington Avenue Line by itself carries more passengers in one day than the Metrorail and Metrobus systems combined, but at its core, transit is transit. If it worked in New York, if it worked in Portland, it can work here.

This doesn’t have to be hard; in all seriousness, it takes about a minute to create a new Google Group.

When everyone works together, we can all help make transit better.

Bringing OpenTripPlanner and OneBusAway to DC to improve rider experience

I want to bring OpenTripPlanner and OneBusAway to DC. Why? Simply put, because they’re a lot better than what we’ve got now.

WMATA’s trip planner has no API for developers, returns illogical and nonsensical results for some trips (which can be due in part to data quality problems), is based on a costly, proprietary product, and has a clunky, outdated-looking interface. As for leading-edge features like real-time and intermodal trip planning (including bicycling and bike sharing)? Dream on.