Apps ≇ frequency

Mobile apps for real-time passenger information are neither approximately nor actually equal to frequency of service. (And yes, “neither approximately nor actually equal to” is the name of the character in the title of this post.)

But that doesn’t mean real-time passenger information isn’t valuable. On the contrary, it’s immensely valuable, in the right circumstances. For discretionary riders, who can vary their arrival and departure times, real-time passenger information is valuable. For passengers who have somewhere to wait before the bus or train comes (the proverbial “have another drink before you go”), it’s valuable. But for passengers, particularly transit-dependent passengers, who are trying to mesh the geometry of transit with their complex lives, nothing beats frequency of service.

Consider an example: you will depart Event A at 1:00 PM, and must be at Event B by 2:00 PM. A bus route connects the two, and the trip takes 45 minutes. If the route runs every 15 minutes (or more frequently), you have a good chance of making your second appointment, possibly even if you miss one arrival (or if the trip loses some time en route). But if the bus runs every 20 minutes? Every 30 minutes? It becomes a game of chance. You might make your appointment, or you might not. Knowing when the bus will come does nothing to change the inexorable geometry of a low-frequency transit network: you may know when the bus is coming, but it’s still not going to get you to your appointment on time.
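
To put rough numbers on that example, here is a minimal sketch rather than a real model of transit reliability. It assumes you reach the stop at a random moment relative to the schedule, so your wait is spread evenly between zero and one full headway, and it assumes buses run exactly on time once they arrive. The function name is made up for illustration.

```python
from fractions import Fraction


def on_time_probability(slack_min: int, headway_min: int) -> Fraction:
    """Chance that the next bus comes within slack_min minutes.

    Assumption: the wait is uniform on [0, headway_min), i.e. you show up
    at a random point in the schedule and buses are perfectly punctual.
    """
    return min(Fraction(1), Fraction(slack_min, headway_min))


# Leave at 1:00 PM, due by 2:00 PM, 45-minute ride: 15 minutes of slack.
slack = 60 - 45

for headway in (10, 15, 20, 30, 60):
    p = on_time_probability(slack, headway)
    print(f"bus every {headway:2d} min: {float(p):.0%} chance of arriving on time")
```

Under those assumptions the chance of making the appointment falls from certain at a 15-minute headway to three in four at 20 minutes and one in two at 30 minutes, and no app changes those odds; it only tells you sooner which way the coin landed.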

Some people might say, “But you can call or text and push your appointment back!” Sure, some people can. If you’re fortunate enough to be in a privileged position where you can dictate other people’s schedules, then you’re all set. But most of us simply have to be where we’re supposed to be, when we’re supposed to be there. So while it may help to know just how late you’re going to be, that neither excuses nor mitigates the impacts.

This is why transit planner and consultant Jarrett Walker says “frequency is freedom”. Sure, apps may help reduce wait time, but if a transit service is simply too infrequent to be useful, discretionary riders won’t ride, and captive riders will suffer.

Additionally, the benefits of real-time passenger information really only become apparent when the information provided to passengers is accurate and reliable. This isn’t a nuts-and-bolts post, so I will refrain from naming particular vendors or transit agencies, but not all real-time information is created equal.

It doesn’t do passengers any good, for example, when they arrive at a bus stop just as their app tells them the bus should be arriving, only to find that the bus departed several minutes prior. Nor does it do them any good to stand at a bus stop (in the cold, in several feet of snow, in the blazing summer heat, etc.), watching an app count down from ten minutes, to five minutes, to one minute, and then back up again, with no bus in sight. When these things happen, passengers become disillusioned. They lose faith in the system. In the short term, they give up on the bus or the train and call a cab or book an Uber or walk. In the long term, they begin making plans that allow them to avoid transit; perhaps they even buy a car.

As a software developer, and one who works on real-time passenger information systems, I’m not going to say that apps aren’t good. But I am also a transit rider, and I know there’s a balance. Two Sundays ago, for example, after leaving the Conveyal TRB Welcome Party in Columbia Heights, I walked over to 16th Street to catch an S bus home. OneBusAway told me that the next bus was 18 minutes away (I’d just missed the previous bus), and there wasn’t a thing I could do about it. App or no app, I was going to sit and wait in the cold for another 18 minutes until the bus arrived. Arguably I should have checked OneBusAway before I left, but that’s precisely what “frequency is freedom” means: it was late, I was tired, and I was ready to go home. Not in another 18 minutes, but now (or maybe in another 5 or 10 minutes, but certainly not 18).

Telling people to plan their lives around a transit app just isn’t a good way to lure them out of their cars or endear them to transit. It’s much more compelling (and leads to a much more usable transit system) when we can simply tell people “show up at a stop, and there’ll be a bus in 10 minutes or less”. It’s also not a problem app developers can solve alone; as the cliché goes, when all you have is a hammer, everything looks like a nail. Providing reliable real-time passenger information is a good first step towards improving the usability of a transit network, and one that is often far less expensive than actually increasing frequency of service. But that doesn’t mean our work is done once the app goes live; on the contrary, we’ve only just begun.

WMATA’s half-hearted open data hurts everyone

I’ve written before about WMATA’s API for train positions and API for bus route information. This time, it’s WMATA’s API for elevator and escalator status that is cause for concern. It’s good that WMATA provides this data in a machine-readable format—in fact, they’re one of only a handful of agencies to do so—but as with WMATA’s other APIs, the implementation is half-hearted at best.

Inconsistent data, the absence of a formal developer relations mechanism, and unexplained, unannounced outages are bad for everyone. They make WMATA look bad, obviously. But more importantly, they make developers look bad, and reduce the incentive for local developers to build applications using WMATA’s data. When someone finds that an app doesn’t work, or that they’re getting stale, incomplete, or inconsistent data, their first instinct is usually to blame the app or the app’s developer, not WMATA.

What’s specifically wrong with the ELES API?

  • 11-day outage, made worse by nonexistent developer relations:
    From March 28 to April 9, 2012, the ELES feed returned static data. This outage was never acknowledged publicly by WMATA, in any medium.

    Because WMATA does not provide any public point of contact for developer relations, there was no way for developers to formally report the problem, nor any way for developers to get useful information like an estimated time to resolution.

    An API outage such as this may seem like the sort of thing that would only impact a handful of transit data nerds, but rest assured, there were absolutely real-world impacts: elevator-dependent Metrorail users who relied on mobile applications that used data from the API found themselves trapped at stations where the stale data led them to erroneously believe that an elevator was in service.

    While this may have been a one-time problem, the underlying issue remains: how could a critical service have gone down for 11 days with no public notice?

  • Feed missing information from the Web site:
    Like much of the information in WMATA’s open data initiative, the ELES API presents the same data as is presented on WMATA’s Web site…or at least that’s how it’s supposed to be.

    In reality, while the Web site lists “estimated return to service” dates for each elevator/escalator, that information is omitted from the API. In addition, others have observed that the API feed and Web site don’t always seem to be in sync. This could create considerable confusion for riders who sometimes check the Web site directly and sometimes use an app which gets data from the API.

  • Feed missing information necessary for maximum usefulness:
    Before presenting this point, it’s important to explain how the elevator outage information is used by elevator-dependent riders. When an elevator-dependent rider sees that there’s an elevator outage at a transfer station that will affect them, they generally avoid the outage by transferring at another station (for example, at Fort Totten rather than Gallery Place).

    But if it’s at their origin or destination station, then they can either use another nearby station (like Judiciary Square rather than Gallery Place), or they can call for a shuttle.

    Calling for a shuttle is a difficult, time-consuming process, but in many cases, especially for outlying stations, it’s a necessity.

    Neither WMATA’s Web site nor the API contains a key piece of information needed by elevator-dependent riders: where to go to get a shuttle (which station, which exit at that station, etc.). This information is displayed on the in-station PIDS (Passenger Information Display System) signs, but is simply not available on the Web in any format.

  • No master list of units:
    As I explained when I wrote about WMATA’s performance monitoring program, including the agency’s Vital Signs Report, only summary statistics are available for WMATA’s elevators and escalators. Want to know which specific units have the best or worst track records? Want to know if a major overhaul has improved a unit’s availability? Want to know how the units at transfer stations hold up, compared to their peers at less-trafficked stations? You can’t, at least not with the data in the Vital Signs Report.

    But that doesn’t mean it’s absolutely impossible to compute those statistics; it just takes more work. For one thing, you can forget about getting historical data. However, if you’re willing to archive data from the ELES API, you can actually create your own statistics. Store each snapshot of the feed in a database, and over time you’ll build up a record of which units were out of service, and when. Transfer the result of that into an OLAP cube, and you can slice and dice to your heart’s content. Want a report on units at transfer stations? Done. Want stats on outages specifically at peak hours? Done. Want a report just on your home station? Done.

    There’s only one piece missing: a list of all elevators and escalators in the Metrorail system. Why is this necessary? In order to compute statistics with the outage data, we have to know how many units there are (in statistical terms, the universe). Of course, we can find out from WMATA’s Web site that there are a total of 588 escalators and 239 elevators, but that’s only good enough for computing the same system-wide metric that the Vital Signs Report provides. Any more detailed analysis (at a per-station level, at a per-line level, or any of the examples given above) requires knowing not just how many units there are, but the IDs of those units and their locations, so that statistics can be computed on a per-station, or even per-unit, level.

    If WMATA had made a real commitment to transparency and open data, and if there were a developer liaison appointed, I’d imagine it might take a day or two to get such a master list of units made available as a CSV or XML file. Somewhere in the 100 TB of data managed by WMATA, there must surely be a list of these 827 units.

    But there isn’t even anyone to ask for the data. And, to make matters worse, every such request is treated with suspicion and mistrust. There’s no sense of developers working cooperatively with WMATA; it is, from the outset, combative. Yes, some of these data will make WMATA look bad, but some will make the agency look good—especially when it can be shown that a major overhaul, such as is taking place now at Dupont Circle and will soon take place at Bethesda, improves the reliability of the overhauled units. Besides, transparency isn’t about releasing the data that make you look good, it’s about releasing data, period.
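
The archive-and-compute approach described under the last point can be sketched in a few lines. This is only an illustration under stated assumptions: it does not call the real ELES API (each poll is passed in as a plain list of unit IDs reported out of service), the table layout and function names are invented here, and the unit IDs and station names are made up to stand in for the master list that WMATA does not publish.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_archive(path=":memory:"):
    """Create (or open) the archive: one table of polls, one of outages."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS poll (polled_at TEXT PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS outage (
            polled_at TEXT NOT NULL,
            unit_id   TEXT NOT NULL,
            PRIMARY KEY (polled_at, unit_id)
        );
        """
    )
    return conn


def record_poll(conn, polled_at, units_out):
    """Store one snapshot: the poll time plus every unit reported out."""
    stamp = polled_at.isoformat()
    with conn:  # commit the poll row and its outage rows together
        conn.execute("INSERT OR IGNORE INTO poll VALUES (?)", (stamp,))
        conn.executemany(
            "INSERT OR IGNORE INTO outage VALUES (?, ?)",
            [(stamp, unit_id) for unit_id in units_out],
        )


def availability(conn, master_units):
    """Share of polls in which each master-list unit was NOT reported out."""
    polls = conn.execute("SELECT COUNT(*) FROM poll").fetchone()[0]
    if polls == 0:
        return {}
    times_out = dict(
        conn.execute("SELECT unit_id, COUNT(*) FROM outage GROUP BY unit_id")
    )
    return {unit: 1 - times_out.get(unit, 0) / polls for unit in master_units}


# Made-up unit IDs and stations, standing in for the missing master list.
master = {"UNIT-1": "Station X", "UNIT-2": "Station X", "UNIT-3": "Station Y"}

conn = open_archive()
start = datetime(2012, 4, 10, 8, 0, tzinfo=timezone.utc)
snapshots = [["UNIT-1"], ["UNIT-1", "UNIT-3"], [], ["UNIT-1"]]  # four hourly polls
for hour, units_out in enumerate(snapshots):
    record_poll(conn, start + timedelta(hours=hour), units_out)

per_unit = availability(conn, master)
for unit, share in per_unit.items():
    print(f"{unit} ({master[unit]}): available in {share:.0%} of polls")
```

Note that UNIT-2 never appears in any snapshot, so the archive alone cannot tell you it exists. Only the master list lets it show up as fully available, which is exactly why the list matters for anything finer than a system-wide number.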

What’s the point of all this, then? When General Manager Sarles says that he “[doesn’t] want to hide problems”, or that the Metro Forward campaign is making tangible improvements for riders, I expect to see data to back up those assertions.

When elevator-dependent riders have to cope with yet another outage, I don’t want them to find out for the first time when they get to their destination and the only notice they have is a cone in front of the elevator door. I want there to be timely (and, more importantly, meaningful) information available, in a wide variety of formats, including a high-quality API that encourages app developers to build tools that further increase the accessibility and further widen the dissemination of that information.

Why do I expect these things? I expect these things because Metrorail is supposed to be “America’s Subway”, a world-class system at the forefront of technological innovation and operational excellence. Right now, it is neither of those things. Instead, it is a system where riders climb up and down stopped escalators in dimly lit stations and hope that their train does not pass over another poorly maintained track circuit which shall fail to detect that it has become occupied and engender yet another fatal collision. It is a system where secrecy and the maintenance of fiefdoms are the norm, not transparency and cooperation for the good of the riding public.

I don’t claim that open data (and better still, open data that is timely and meaningful) will solve all of those problems, but it is a small step forward, and a step that WMATA could easily take using its existing infrastructure.

More on Metrobus stop names

A few weeks ago, at the Mobility Lab Transit Hack Day, I mentioned the small matter of the Metrobus stops at Rosslyn all being named “N MOORE ST + ROSSLYN STATION BUS BA”. Since I have the Metrobus stop database (all 11430 stops) loaded into MongoDB, it’s easy to write a MapReduce query to find all of the stops where a single stop name is associated with more than one stop ID. It’s perfectly normal for a stop name to be associated with two stop IDs, since at most locations you’ll have one stop on each side of the street. However, there are a total of 86 stop names which are associated with three or more stop IDs. Most appear to be cases where there are two stops in one direction in a given block, but some could cause real confusion:

  • COLLEGE PARK UMD STATION + BUS BAY: stop IDs 3003451, 3003913, 3003944
  • N MOORE ST + ROSSLYN STATION BUS BA: stop IDs 6000818, 6000824, 6000827, 6000882
  • TYSONS WESTPARK TRAN STA + BUS BAY: stop IDs 5001919, 5001941, 5001981, 5002109
  • WEST FALLS CHURCH STATION + BUS BAY: stop IDs 5001951, 5001958, 5002067, 5002151

At some of these locations, the bus bays are some distance from each other (and at WFC, there are bays on both sides of the station, although all of the Metrobus bays happen to be on the south side), so not being able to distinguish one bus bay from another poses a real wayfinding problem.
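
The duplicate-name check itself is simple to reproduce. The sketch below swaps the MongoDB MapReduce query for a plain in-memory grouping in Python, so it is an equivalent illustration and not the query actually used. The Rosslyn and College Park IDs come from the list above; the last two rows are a made-up pair representing the ordinary case of one stop on each side of the street.

```python
from collections import defaultdict


def names_with_many_ids(stops, min_ids=3):
    """Map each stop name shared by at least min_ids stop IDs to those IDs."""
    ids_by_name = defaultdict(set)
    for stop_id, name in stops:
        ids_by_name[name].add(stop_id)
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) >= min_ids
    }


stops = [
    ("6000818", "N MOORE ST + ROSSLYN STATION BUS BA"),
    ("6000824", "N MOORE ST + ROSSLYN STATION BUS BA"),
    ("6000827", "N MOORE ST + ROSSLYN STATION BUS BA"),
    ("6000882", "N MOORE ST + ROSSLYN STATION BUS BA"),
    ("3003451", "COLLEGE PARK UMD STATION + BUS BAY"),
    ("3003913", "COLLEGE PARK UMD STATION + BUS BAY"),
    ("3003944", "COLLEGE PARK UMD STATION + BUS BAY"),
    # Hypothetical pair: one stop on each side of the street (not real data).
    ("9999901", "EXAMPLE ST + SAMPLE AVE"),
    ("9999902", "EXAMPLE ST + SAMPLE AVE"),
]

suspicious = names_with_many_ids(stops)
for name, ids in suspicious.items():
    print(f"{name}: stop IDs {', '.join(ids)}")
```

Setting the threshold at three IDs leaves the normal two-sided pairs alone and reports only the names that are shared more widely.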

For the most part, this is a symptom of the fact that until the (relatively recent) advent of the open data movement, the development of GTFS, etc., the data contained in a transit agency’s internal databases would never be seen by the general public. Many transit agencies (even large ones) still generate timetables using manual or semi-manual processes, so errors in the data don’t necessarily result in errors in the published timetables. Thus oddities can lurk in the data for years, until someone finally comes along and starts looking at the data more closely. Stop names are a perfect example of that—if anything, only the stop names for major timepoints would have any public visibility. In addition, you’ll notice that all of these names are truncated at 34 or 35 characters. It’s entirely possible that somewhere within WMATA’s enterprise systems, the names are stored in full, and they only get truncated at some later point in the pipeline.

As an aside, there are even more interesting things in the data. All public WMATA stop IDs are seven digits long, but there are 519 stops in the Metrobus data whose IDs have fewer than seven digits, including stop 4229, which has no name, and stop 17014, whose name is “NO”.
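
A check for those short IDs is a one-line filter. This sketch assumes the stop IDs are held as strings (so that length means digit count) and uses a made-up function name; the three sample rows are the stops mentioned in this post.

```python
def short_stop_ids(stops, expected_digits=7):
    """Return (stop_id, name) pairs whose ID has fewer digits than expected."""
    return [(sid, name) for sid, name in stops if len(sid) < expected_digits]


stops = [
    ("6000818", "N MOORE ST + ROSSLYN STATION BUS BA"),
    ("4229", ""),      # no name at all
    ("17014", "NO"),
]

flagged = short_stop_ids(stops)
for sid, name in flagged:
    print(f"stop {sid}: name={name!r}")
```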

An open-architecture multi-modal transit information display

I am a firm believer in the notion that one of the best ways to improve the usability of public transit (and thus improve ridership) is to make transit information more accessible. Smartphone apps for transit services are all the rage nowadays, but not everyone has a smartphone, and more to the point, not everyone necessarily wants to have to get out their smartphone and click through an app to find out when the next bus or train is coming. One of the more universally accessible approaches is to encourage transit agencies, municipalities, and even private citizens and businesses to install dynamic signage to display real-time transit information.

As I’ve mentioned before, there are already do-it-yourself real-time displays for San Francisco, Chicago, and certain routes in New York, and I put together a rudimentary proof-of-concept for Metrorail a few weeks ago.

The problem with each of these, though, is that they’re one-off projects. Each one is designed for a particular city; they all have their own unique configuration mechanisms and are generally tied to a single transit authority.

What is really needed is an open-architecture framework for intermodal dynamic signage, something that makes it easy to integrate bus, rail, and bike sharing information from multiple agencies, in any city where the data is available, and display it on anything ranging from an iPad to a Chumby to an old PC hooked up to a TV, to professional-grade digital signage displays. For the past few weeks, I’ve been working on a project to try to make that a reality.
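
To make the idea of an open architecture concrete, here is one possible shape for it. This is a sketch only and not the design of the project described in this post. Every class, function, agency, and route name below is hypothetical. The point is that the display code depends on one small interface, and each agency's feed is wrapped in an adapter that implements it.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Arrival:
    agency: str
    mode: str          # e.g. "bus" or "rail"
    route: str
    destination: str
    minutes: int       # predicted minutes until arrival


class ArrivalSource(Protocol):
    """What the display needs from any agency adapter."""

    def arrivals(self) -> list[Arrival]: ...


class StaticSource:
    """Stand-in adapter that returns canned data instead of calling a feed."""

    def __init__(self, arrivals):
        self._arrivals = list(arrivals)

    def arrivals(self) -> list[Arrival]:
        return list(self._arrivals)


def soonest(sources, limit=5):
    """Merge arrivals from every source and keep the soonest few."""
    merged = [a for source in sources for a in source.arrivals()]
    return sorted(merged, key=lambda a: a.minutes)[:limit]


def render(rows):
    """Format rows as fixed-width text lines for any kind of screen."""
    return [
        f"{a.minutes:>3} min  {a.route:<6} {a.destination} ({a.agency})"
        for a in rows
    ]


# Hypothetical agencies and routes, for illustration only.
rail = StaticSource([Arrival("Agency A", "rail", "Red", "Downtown", 4),
                     Arrival("Agency A", "rail", "Red", "Uptown", 11)])
bus = StaticSource([Arrival("Agency B", "bus", "42", "Main St", 7)])

rows = soonest([rail, bus], limit=3)
print("\n".join(render(rows)))
```

Bike sharing does not fit an arrival record (it reports bikes and docks available, not minutes), so a fuller design would need a second record type alongside `Arrival`; that is left out here to keep the sketch short.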

41 of the top 50 transit agencies use Twitter. Does yours?

The National Transit Database is a fantastic resource for statistics on transit systems in the United States. One of the simpler NTD products is its set of agency profiles, including profiles for the top 50 agencies in the country (measured by unlinked trips). I’ve been interested in transit agencies’ use of social media, and Twitter in particular, for some time. So, two days ago, I decided it might make an interesting exercise to look up each of the top 50 agencies, and see who uses Twitter, and who doesn’t.

I put the results in a Google Docs spreadsheet; mouse over a Twitter username to see the latest Tweet, or click on the link to view the user on Twitter.

In simple terms, 41 of the top 50 agencies (using data from reporting year 2009) use Twitter. The agencies that do not are:

Of these, I’m most surprised to see that the CTA does not use Twitter. On the whole, though, I’d say the results are tentatively encouraging. I did not perform a detailed assessment to try to judge meaningful use; all I was looking for was whether an agency had an official presence on Twitter. It’s entirely possible that some of these agencies, despite having a Twitter account, do not use it to alert riders to disruptions in real time, or do not respond to Tweets from riders. Many of the agencies, though, do seem to use Twitter as a real-time, two-way channel (as it’s intended to be used), and that’s a great step forward in terms of passenger communications.