A few weeks ago, at the Mobility Lab Transit Hack Day, I mentioned the small matter of the Metrobus stops at Rosslyn all being named “N MOORE ST + ROSSLYN STATION BUS BA”. Since I have the Metrobus stop database (all 11,430 stops) loaded into MongoDB, it’s easy to write a MapReduce query that finds every stop name associated with more than one stop ID. It’s perfectly normal for a stop name to be associated with two stop IDs, since most locations have one stop on each side of the street. However, 86 stop names are associated with three or more stop IDs. Most appear to be cases where there are two stops in one direction in a given block, but some could cause real confusion:
- COLLEGE PARK UMD STATION + BUS BAY: stop IDs 3003451, 3003913, 3003944
- N MOORE ST + ROSSLYN STATION BUS BA: stop IDs 6000818, 6000824, 6000827, 6000882
- TYSONS WESTPARK TRAN STA + BUS BAY: stop IDs 5001919, 5001941, 5001981, 5002109
- WEST FALLS CHURCH STATION + BUS BAY: stop IDs 5001951, 5001958, 5002067, 5002151
At some of these locations, the bus bays are some distance from each other (and at WFC, there are bays on both sides of the station, although all of the Metrobus bays happen to be on the south side), so not being able to distinguish one bus bay from another poses a real wayfinding problem.
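The grouping behind that count can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the MapReduce query itself: the field names `stop_id` and `stop_name` are borrowed from GTFS `stops.txt` and are an assumption about how the records are stored, the function name `names_with_many_ids` is made up, and the "EXAMPLE ST + 1ST ST" stop with its IDs is invented to show the ordinary two-sided case.

```python
from collections import defaultdict

# Sample records. The Rosslyn IDs come from the list above; the
# "EXAMPLE ST" stop and its IDs are invented for illustration.
sample_stops = [
    {"stop_id": "6000818", "stop_name": "N MOORE ST + ROSSLYN STATION BUS BA"},
    {"stop_id": "6000824", "stop_name": "N MOORE ST + ROSSLYN STATION BUS BA"},
    {"stop_id": "6000827", "stop_name": "N MOORE ST + ROSSLYN STATION BUS BA"},
    {"stop_id": "6000882", "stop_name": "N MOORE ST + ROSSLYN STATION BUS BA"},
    {"stop_id": "9999001", "stop_name": "EXAMPLE ST + 1ST ST"},
    {"stop_id": "9999002", "stop_name": "EXAMPLE ST + 1ST ST"},
]


def names_with_many_ids(stops, min_ids=3):
    """Return {stop_name: sorted stop IDs} for names shared by at least min_ids stops."""
    ids_by_name = defaultdict(set)
    for stop in stops:
        # "Map" step: key each stop ID by its name.
        ids_by_name[stop["stop_name"]].add(stop["stop_id"])
    # "Reduce" step: keep only the names with enough distinct IDs.
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) >= min_ids
    }


suspicious = names_with_many_ids(sample_stops)
for name, ids in suspicious.items():
    print(f"{name}: stop IDs {', '.join(ids)}")
```

Setting `min_ids=2` would also report the ordinary one-stop-per-side pairs, which is why the threshold of three is the interesting one here.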
For the most part, this is a symptom of the fact that, until the relatively recent advent of the open data movement, GTFS, and the like, the data in a transit agency’s internal databases would never be seen by the general public. Many transit agencies (even large ones) still generate timetables using manual or semi-manual processes, so errors in the data don’t necessarily result in errors in the published timetables. Thus oddities can lurk in the data for years, until someone finally comes along and looks at it more closely. Stop names are a perfect example: if anything, only the stop names for major timepoints would have had any public visibility. In addition, you’ll notice that all of these names are truncated at 34 or 35 characters. It’s entirely possible that somewhere within WMATA’s enterprise systems, the names are stored in full, and they only get truncated at some later point in the pipeline.
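The length observation is easy to check by tallying name lengths. The sketch below uses only the four names listed earlier; the variable names are arbitrary, and running the same tally over the full stop list is how one would look for a hard cutoff.

```python
from collections import Counter

# The four stop names quoted in the list above.
bus_bay_names = [
    "COLLEGE PARK UMD STATION + BUS BAY",
    "N MOORE ST + ROSSLYN STATION BUS BA",
    "TYSONS WESTPARK TRAN STA + BUS BAY",
    "WEST FALLS CHURCH STATION + BUS BAY",
]

# Count how many names have each length.
length_counts = Counter(len(name) for name in bus_bay_names)

for length, count in sorted(length_counts.items()):
    print(f"{length} characters: {count} name(s)")
```

A sharp pile-up at one or two lengths, with nothing longer, is the usual signature of a fixed-width field somewhere upstream.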
As an aside, there are even more interesting things in the data. All public WMATA stop IDs are seven digits long, but 519 stops in the Metrobus data have IDs with fewer than seven digits, including stop 4229, which has no name, and stop 17014, whose name is “NO”.
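Picking out those odd records is a simple filter on ID length. In this minimal sketch the field names are again assumed from GTFS, the helper `is_public_style_id` is a made-up name, and representing the nameless stop with an empty string is a guess (the real record might use a null or omit the field entirely).

```python
def is_public_style_id(stop_id):
    """True if the ID looks like a public stop ID: exactly seven digits."""
    return stop_id.isdigit() and len(stop_id) == 7


# One normal stop plus the two oddities mentioned above.
mixed_stops = [
    {"stop_id": "6000818", "stop_name": "N MOORE ST + ROSSLYN STATION BUS BA"},
    {"stop_id": "4229", "stop_name": ""},  # empty name is an assumed representation
    {"stop_id": "17014", "stop_name": "NO"},
]

# Keep only the stops whose IDs do not match the seven-digit pattern.
odd_stops = [s for s in mixed_stops if not is_public_style_id(s["stop_id"])]

for stop in odd_stops:
    print(stop["stop_id"], repr(stop["stop_name"]))
```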