Updates to tph.py

I have made some updates to tph.py, my tool for generating plots of transit service levels from GTFS feeds. Most importantly, these updates fix compatibility issues which kept it from working with certain agencies’ GTFS feeds. To start, here are two plots generated from the BART GTFS feed, using the new version of tph.py.

The first plot is not all that useful, since it mixes rail service with the AirBART bus shuttle, but it does demonstrate a point which will be explained later. The second plot is more useful: it shows that the BART system was not designed to provide frequent service to the extremities of the network. While service through the core peaks at 22 trains per hour, most branches get four trains per hour on average.

Anyway, supporting the BART GTFS feed required two major changes: supporting feeds which do not use the direction_id field, and supporting feeds which use the frequencies.txt file (which is used in the BART GTFS feed for AirBART, hence its inclusion above) rather than explicit stop times for every trip. As a result of these changes, tph.py should now support any GTFS feed. However, feeds which do not use the direction_id field do require additional configuration to assign directions to routes and trips. This is all covered in the new documentation on the configuration file format.
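To give a rough idea of what supporting frequencies.txt involves, here is a minimal sketch of turning one frequency entry into individual departures and counting them per hour. It is not the code tph.py actually uses; the function names are made up for illustration, and it assumes the usual GTFS reading of an entry, where a trip departs every headway_secs seconds from start_time up to (but not including) end_time.

```python
def parse_gtfs_time(value):
    """Convert a GTFS HH:MM:SS string to seconds since the start of the service day.

    GTFS allows hours of 24 or more for trips that run past midnight.
    """
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def expand_frequency(start_time, end_time, headway_secs):
    """Return the departure times (in seconds) implied by one frequencies.txt row.

    Hypothetical helper: assumes departures start at start_time and repeat
    every headway_secs until end_time is reached (end_time itself excluded).
    """
    start = parse_gtfs_time(start_time)
    end = parse_gtfs_time(end_time)
    return list(range(start, end, headway_secs))


def trips_per_hour(departures):
    """Count departures in each clock hour of the service day."""
    counts = {}
    for departure in departures:
        hour = departure // 3600
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# A shuttle running every 15 minutes from 6:00 to 8:00.
departures = expand_frequency("06:00:00", "08:00:00", 900)
hourly = trips_per_hour(departures)
print(hourly)
```

Once a frequency-based trip has been expanded like this, it can be counted in the same way as a trip with explicit stop times.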

In addition, tph.py’s innards have had an overhaul; it no longer uses Google’s transitfeed module for parsing GTFS feeds. Instead, it uses a fork of the gtfs module. gtfs imports the feed and stores it in a SQLite database, using SQLAlchemy. This takes time up front, but makes tph.py a lot faster to run. It also makes the code cleaner; some operations which previously required several nested for loops can now be done with a single SQL query.
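As an illustration of the kind of query that can stand in for nested loops, here is a small sketch using Python's built-in sqlite3 module. The table and column names are assumptions made up for this example (including a departure_secs column holding seconds since the start of the service day); they are not the schema the gtfs module actually creates, and tph.py itself goes through SQLAlchemy rather than raw SQL strings.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trips (trip_id TEXT PRIMARY KEY, route_id TEXT);
CREATE TABLE stop_times (
    trip_id TEXT,
    stop_sequence INTEGER,
    departure_secs INTEGER
);
""")

conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("t1", "A"), ("t2", "A"), ("t3", "B")],
)
conn.executemany(
    "INSERT INTO stop_times VALUES (?, ?, ?)",
    [
        ("t1", 1, 21600), ("t1", 2, 22200),  # 06:00 and 06:10
        ("t2", 1, 23400), ("t2", 2, 24000),  # 06:30 and 06:40
        ("t3", 1, 25200), ("t3", 2, 25800),  # 07:00 and 07:10
    ],
)

# Count trips per route per hour, keyed on each trip's first stop.
# Without SQL this would be a loop over routes, then trips, then stop times.
query = """
SELECT t.route_id,
       st.departure_secs / 3600 AS hour,
       COUNT(*) AS trip_count
FROM trips AS t
JOIN stop_times AS st ON st.trip_id = t.trip_id
WHERE st.stop_sequence = (
    SELECT MIN(stop_sequence) FROM stop_times WHERE trip_id = t.trip_id
)
GROUP BY t.route_id, hour
ORDER BY t.route_id, hour
"""
rows = conn.execute(query).fetchall()
for route_id, hour, trip_count in rows:
    print(route_id, hour, trip_count)
```

The database does the grouping and counting, so the Python side only has to read the finished rows.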