Where does Chromaroma's data come from?
A few days ago, I was reading about Chromaroma, "a game that shows you your movements and location as you swipe your Oyster Card in and out of the Tube". Immediately, a question came to mind: where are they getting the journey data from? I poked around the site and couldn't readily find an answer, so I decided to try signing up. My own Oyster account no longer works, so I was not able to fully complete the sign-up process.
That said, it looks like they're just impersonating users and screen-scraping data from the Oyster online interface, very much akin to what Yodlee does with financial data. It's not the worst possible approach, but it still has its flaws. If a user changes their password, the data-fetching process will fail. If TfL changes the design of the Oyster online interface, the process will fail. In the worst-case scenario, if Chromaroma suffers a breach, user data, including online credentials, could be compromised. If users have used those credentials on other sites, they could face a real problem.
The ideas behind Chromaroma are sound, but we can learn from the web development community at large to provide better ways to access private transit data (like journey histories), and to provide better formats for the distribution of that data.
TfL does offer what appears to be an anonymized, aggregated feed of Oyster journey data, but it wouldn't be suitable for Chromaroma, or for any other application that depends on having a specific user's information. So, for those applications, how can we provide access to a specific user's journey history securely and in a manner that respects users' privacy rights? How can we ensure that users remain in control of their own data? I think it's reasonable to look at best practices for other web-based applications, and see how they can be applied here. The solution that has been adopted by many sites, from Twitter to Google, is OAuth.
In an OAuth-enabled future, after signing up for Chromaroma, you'd be redirected to the Oyster website to allow Chromaroma to access your data. If you didn't already have an Oyster account, you'd have the option to register your Oyster card online; otherwise, you'd log in to your existing Oyster account. You'd then be given a clear summary of the data Chromaroma would have access to, as well as an opportunity to preview the data. At that point, you'd be able to approve (or deny) the access, and return to Chromaroma.
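To make that flow a little more concrete, here is a minimal sketch of the redirect and the return trip, written in the style of an OAuth 2.0 authorization-code request. Everything specific in it is an assumption for illustration: the `oyster.example` endpoint, the `journeys.read` scope and the function names are hypothetical, and nothing here describes a service TfL actually offers.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint, used only for illustration.
AUTHORIZE_URL = "https://oyster.example/oauth/authorize"


def build_authorization_url(client_id, redirect_uri, scope):
    """Return (url, state) for sending the user to the provider's consent page."""
    state = secrets.token_urlsafe(16)  # anti-CSRF value, checked again on return
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,  # what the user will be asked to share
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state


def handle_callback(callback_url, expected_state):
    """Return the authorization code, or None if the user denied access."""
    query = parse_qs(urlsplit(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    if "error" in query:
        return None  # the user clicked "deny" on the consent page
    return query["code"][0]


url, state = build_authorization_url(
    "chromaroma", "https://chromaroma.example/callback", "journeys.read"
)
```

The application never sees the user's Oyster password at any point; it only receives a code that it would later exchange for an access token, a step left out of this sketch.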
Notice that in this scenario, users are shown exactly what data is being shared. This is also a good opportunity to remind users of the applicable privacy policies, as well as the fact that they can return at any time to sever the link. With OAuth, unlike password-based schemes, users can revoke any one application's access without affecting the others.
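The per-application revocation is easy to picture from the provider's side. The sketch below is a toy in-memory model, not how any real OAuth provider stores its grants; the `GrantStore` class and its method names are hypothetical.

```python
import secrets


class GrantStore:
    """Tracks which applications a user has authorised, one token per grant."""

    def __init__(self):
        self._tokens = {}  # token -> (user, app)

    def grant(self, user, app):
        """Issue a fresh token tying this user to this application."""
        token = secrets.token_urlsafe(24)
        self._tokens[token] = (user, app)
        return token

    def revoke(self, user, app):
        """Delete the tokens this user issued to this app; return how many."""
        doomed = [t for t, owner in self._tokens.items() if owner == (user, app)]
        for token in doomed:
            del self._tokens[token]
        return len(doomed)

    def is_valid(self, token):
        return token in self._tokens


store = GrantStore()
game_token = store.grant("alice", "chromaroma")
other_token = store.grant("alice", "some-other-app")
store.revoke("alice", "chromaroma")  # other_token keeps working
```

Compare this with a shared password: changing it is the only way to cut off one application, and doing so cuts off all of them at once.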
In addition, as a matter of data transparency and portability, users should be able to download a machine-readable version of their own journey history from the online interface—they shouldn't have to fiddle with OAuth just to get a CSV or JSON or XML dump of their trips. This also raises a good point—an HTML table isn't really the best way to distribute this information. In fact, the best option would be the development of an open standard, along the same lines as GTFS and SIRI, for journey history interchange. This need not be a complex XML schema; it could be as simple as a specification of fields in a CSV file, but it would make it easier for developers to build applications which consume journey data, and even aggregate it across multiple transit providers.
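As a rough illustration of how little such a specification would need, here is a sketch that reads a journey history from a CSV file with a fixed set of columns. The field names and the sample rows are invented for this example; no such standard exists yet, and the fares shown are placeholders rather than real prices.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Hypothetical column list; a real specification would define its own.
FIELDS = ["start_time", "end_time", "origin", "destination", "mode", "fare"]


@dataclass
class Journey:
    start_time: datetime
    end_time: datetime
    origin: str
    destination: str
    mode: str
    fare: str  # kept as text so no rounding is introduced


def read_journeys(text):
    """Parse CSV text into Journey records, rejecting unexpected headers."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != FIELDS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return [
        Journey(
            start_time=datetime.fromisoformat(row["start_time"]),
            end_time=datetime.fromisoformat(row["end_time"]),
            origin=row["origin"],
            destination=row["destination"],
            mode=row["mode"],
            fare=row["fare"],
        )
        for row in reader
    ]


# Made-up sample data in the hypothetical format.
SAMPLE = """start_time,end_time,origin,destination,mode,fare
2010-11-20T08:15:00,2010-11-20T08:27:00,Victoria,Oxford Circus,tube,1.80
2010-11-20T17:40:00,2010-11-20T17:55:00,Oxford Circus,Victoria,tube,1.80
"""

journeys = read_journeys(SAMPLE)
minutes = [(j.end_time - j.start_time).total_seconds() / 60 for j in journeys]
```

Because every provider would publish the same columns, an application could merge files from several transit agencies just by concatenating the parsed lists.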