Google Transit: How (and Why) the Search Giant is Remapping Public Transportation

2/21/12

You can’t talk to a Googler for very long without hearing them recite the company’s mission statement: to organize the world’s information and make it universally accessible and useful. Not only does it sound noble, but it’s an all-purpose answer for the sorts of nosey questions tech journalists pose, like why Google would want to buy a company that compiles restaurant reviews (i.e. Zagat), or why it cares about flight reservation systems for airlines (ITA), or why it’s spending $30 million to encourage private companies to send robots to the Moon (the Google Lunar X Prize).

Of course, Google’s mission statement long ago ceased to be a full explanation of its intentions, or of its true impact. Google might like to be seen as a mere arranger of information—the meekly efficient librarian who puts the books back in the stacks every night. But the reality is that the company is too big, too wealthy, and too ambitious to step lightly on the world’s data. There isn’t a marketplace or a category of knowledge that Google can “organize” without remaking it in the process.

In areas like book publishing, video entertainment, and mobile communications, Google’s expanding reach has been exhaustively covered by the press. But there’s one area where Google (NASDAQ: GOOG) has exercised a transformative influence almost completely outside the spotlight of media attention: public transportation. The changes are easy to overlook, especially if you never step out of your car, or if you only ride the bus or subway in your own city. But there’s been a dramatic shift over the last five years in the way people plan trips on public transportation and the way transit agencies communicate with their riders—and Google is the main instigator.

This revolution, as with almost everything the company does, is proceeding at Internet scale. More than 475 transit agencies in the U.S. and around the world now submit their operating schedules to Google, which publishes the data as part of its Google Maps service. So whether you’re accessing a map from a desktop browser or a smartphone, you can figure out how to get where you’re going by bus or train, not just by car. To see arrival and departure times for thousands of bus and train lines, you can simply click on the little blue icons that denote transit stops (at least, you can if you’re using a desktop browser or an Android phone).

Live departure times in Google Maps for Mobile

The file format that Google invented in 2006 to make all this possible, called GTFS, has become the de facto world standard for sharing transit data. And now Google is pushing a related standard that enables agencies to alert riders about service delays in real time—thus answering that age-old question, “When’s my bus coming?” So far, Google is displaying these live transit updates for only four U.S. cities (Boston, Portland, OR, San Diego, and San Francisco) and two European cities (Madrid, Spain, and Turin, Italy). But it hopes to add many, many more.

Google’s activism in public transit is having widespread ripple effects. Most importantly, the company’s services are making it easier for public-transit users to plan their bus or train trips to minimize waits and missed connections. In theory, better experiences for riders translate into higher ridership, greater revenues for transit agencies, and less congestion on streets and highways.

On top of that, Google’s leadership has opened up space for a whole ecosystem of transit-app startups. It’s not as if Google invented the idea of putting transit data online—that’s been going on since at least 1994, when a pair of University of California students created a website called Transitinfo.org to tie together data from 26 transit agencies around the Bay Area. (It’s now called 511 Transit.) But the emergence of a common standard for publishing transit schedules has enabled independent developers who started out building apps tailored to their local systems to think much bigger.

Just look at Embark, a Y Combinator-funded startup in San Francisco. The company’s first mobile trip-planning app in 2008 covered only the Bay Area Rapid Transit (BART) system. Now the startup makes apps for 12 transit systems in nine metropolitan areas, including London. “After BART we tried to make this something we could scale to other cities, and without a foundation based on standards that would have been pretty hard,” says David Hodge, Embark’s co-founder and CEO. “I don’t imagine anyone else [but Google] could have set a standard.”

The rise of GTFS has also helped to spur a larger “open government data” movement that cuts across areas like healthcare, energy, and education. And at transit agencies that were initially slow to publish their route and schedule information in digital form, including New York City’s MTA and Washington, D.C.’s Metro system, it has created irresistible pressure to open the data vaults and cooperate with outside developers.

But the most interesting thing about Google Transit—the company’s catch-all name for its transit agency data feeds—may be what it says about the company’s politics. Simply put, Google thinks people should drive less. That’s why it has its own bus fleet for shuttling San Francisco-based employees to the Googleplex in Mountain View every day; that’s why it’s researching robot cars; and that’s why driving directions on Google Maps are now supplemented by walking and biking directions as well as public-transit schedules.

If Google engineers could manage it, they’d probably try to undo the last seven decades of urban sprawl. Short of that, they think making mass transportation more efficient is one of the best ways to curb traffic congestion and carbon emissions.

“The biggest thing holding us back in the U.S. is land use patterns,” says Brian Ferris, a Google Transit engineer based in Zurich, Switzerland. “European cities are more compact, so public transportation dollars go a lot farther. In the U.S., huge parts of our cities were built after the automobile came to prominence. But we can’t change American cities tomorrow. What we can do is flip it around and ask how we can use information to make better decisions about where to live and how to commute.”

The Dream Is Alive in Portland

For the first three years of its life, GTFS stood for Google Transit Feed Specification. In 2009, Google proposed changing the name so that the G would henceforth stand for General—a sign of either magnanimity or pride, depending on your point of view. In any case, the creation of the standard, and the un-Googling of the name, make an interesting story.

Like so many current Google products, Google Transit emerged from “20 percent time,” the company’s way of encouraging employees to work on side projects that might bear unexpected fruit. The 20-percenter in this case was Chris Harrelson, a software engineer who’d joined Google Research after finishing a PhD at UC Berkeley on routing problems in public transportation systems. In mid-2005, Harrelson was monkeying with ways to incorporate transit data into Google Maps. That was when he heard from Tim and Bibiana McHugh, married IT managers at TriMet, the transit agency for Portland, OR. The McHughs were big believers in open data, and they wanted to partner with Google to make planning a trip around Portland by public transit as easy as planning a drive.

Harrelson was game, and he worked with Tim McHugh to write a program to export TriMet’s data into a file that could easily be fed into Google’s geospatial database. In December 2005, Google turned on Google Transit, with Portland as the first city providing bus and light-rail schedules within Google Maps. Harrelson added data for Seattle’s transit system in 2006, using the same data-dump format McHugh had devised. In 2007, Google published the format as the Google Transit Feed Specification.

There was nothing particularly complex about GTFS. Agencies willing to share their schedules simply needed to create about a dozen text files full of comma-delimited data showing the latitudes and longitudes of each stop on their system, the times buses and trains were supposed to arrive at each stop, and a few other details. Here are the first four lines from the stop-times file for TriMet:

trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_type,shape_dist_traveled,timepoint
2666662,08:53:00,08:53:00,13170,1,45th Ave,0,0,0.0,1
2666662,08:54:26,08:54:26,7631,2,45th Ave,0,0,877.4,0
2666662,08:56:31,08:56:31,7625,3,45th Ave,0,0,2163.1,0
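
To show how little machinery a file like this demands, here is a minimal Python sketch that reads the sample rows above using only the standard library. The function names `to_seconds` and `load_stop_times` are illustrative assumptions, not part of GTFS or of any Google tool, and a real feed would be read from the agency's file rather than from an inline string.

```python
import csv
import io

# The sample rows quoted above, with the header joined onto a single line.
SAMPLE = """\
trip_id,arrival_time,departure_time,stop_id,stop_sequence,stop_headsign,pickup_type,drop_off_type,shape_dist_traveled,timepoint
2666662,08:53:00,08:53:00,13170,1,45th Ave,0,0,0.0,1
2666662,08:54:26,08:54:26,7631,2,45th Ave,0,0,877.4,0
2666662,08:56:31,08:56:31,7625,3,45th Ave,0,0,2163.1,0
"""


def to_seconds(hhmmss):
    """Convert an HH:MM:SS string to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def load_stop_times(text):
    """Return the rows of a stop-times file as dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_stop_times(SAMPLE)

# Scheduled travel time between each pair of consecutive stops on the trip.
for prev, cur in zip(rows, rows[1:]):
    travel = to_seconds(cur["arrival_time"]) - to_seconds(prev["departure_time"])
    print(f"stop {prev['stop_id']} -> stop {cur['stop_id']}: {travel} seconds")
```

The times are handled as plain seconds after midnight rather than as clock objects, a choice that also copes with schedules that list times past 24:00:00 for trips running beyond midnight.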

The entire GTFS feed for TriMet adds up to only 169 megabytes. “Portland deserves a lot of credit in this space,” says Google’s Ferris. “What I like about GTFS is that it is, at the end of the day, just the raw data. You can build almost anything with that.”

In the public-transit world—not a community historically known for rapid innovation—the impact of GTFS was immediate and electrifying. Transit agencies that had been casting about for more efficient ways to get route information and advisories to their customers suddenly had a consistent way to share their raw schedule data with outside developers, who would in turn repackage it for riders. Before that, each agency had taken its own approach to such data requests, and usually ended up having to reformat its data over and over, depending on the intended use.

“I was providing schedules in different formats to different people,” recalls Timothy Moore, longtime website manager for BART. “I was giving 511 Transit one look. I was giving some guy creating shopping-mall kiosks another look. I was thinking that if I could just release it in one format, it would make my life a lot easier. So when Google released GTFS in 2007 we were, I think, the first ones besides the originators to jump on.” Because BART was an early GTFS adopter, it was the only transit agency with a dedicated iPhone app on the day Apple turned on the iTunes App Store in 2008. (It’s called iBART and was developed by Embark, then known as Pandav.)

In truth, not every transit agency has been equally enthusiastic about standardization. “The default position of a transit agency is to protect its data and not open it up in a way that is accessible for developers,” says Embark co-founder Hodge. In some cases, agencies had relationships with outside vendors who claimed contractual rights to schedule data. In others, agencies didn’t want to give the data away for fear of losing Google AdSense ad revenue on their own websites.

But to Moore, selling or advertising against schedule data is like charging for menus in a restaurant. “I have watched transit agencies try to monetize schedules for years and nobody has been successful,” he says. “Markets like the MTA and the D.C. Metro fought sharing this data for a very long time, and it seems to me that there was a lot of fallout from that with their riders. This is not our data to hoard—that’s my bottom line.”

It took “the power of Google,” in Hodge’s words, to break the logjam. By 2009, so many transit agencies had begun to use GTFS—and the data was turning up in so many places other than Google Maps—that Joe Hughes, a U.K.-based software engineer working on Google Transit, proposed renaming the standard. “Given the wide use of the format…the ‘Google’ in GTFS is increasingly a misnomer, one that makes some potential users shy away from adopting GTFS,” Hughes wrote in a forum post for Google Transit contributors. And he wanted the change to be more than cosmetic: Hughes said it was time to hand ongoing development of the specification over to the larger community of transit agencies and app developers.

The Boys on the Bus

It’s safe to say there’s been more innovation in the world of public-transit trip planning in the last four years than in the previous four decades. Take the example of OneBusAway, a real-time guide to the Seattle-area transit system created by Googler Brian Ferris back when he was a graduate student at the University of Washington.

OneBusAway on the iPhone

In its first, pre-GTFS iteration, OneBusAway was a mere side project for Ferris, something to fill his evenings during a summer research fellowship at Intel. The system used the old-fashioned File Transfer Protocol (FTP) to grab data from servers at King County Metro Transit. Riders could then get bus arrival times by keying in a stop number on their mobile phone.

But once Ferris decided to scale up the system to incorporate data from Sound Transit and other regional systems—and to base his whole PhD dissertation on the project—he needed to standardize. So he followed TriMet’s example. “The first major rewrite of OneBusAway for multi-agency support was to natively support GTFS,” Ferris says. “I didn’t want to have to keep reinventing the wheel.”

The change allowed Ferris to extend the system to the entire Puget Sound area. Today OneBusAway offers real-time bus, light-rail, and ferry arrival information for nine agencies in the region, and is accessible by Web, phone, and SMS, as well as smartphone apps for iOS, Android, and Windows Phone. Area commuters use it to plan 50,000 trips per week. While Ferris himself has moved on to Google, King County Metro Transit, Sound Transit, Pierce Transit, and UW recently committed $150,000 to keep the app running at least through the end of 2012. (That’s music to the ears of at least one Seattleite: Xconomy’s own Curt Woodward, who tells me that OneBusAway is “indispensable…hands down the only good way of navigating the bus system in Seattle.”)

“I felt with OneBusAway that I was having a real impact on people,” Ferris says. “People would stop me on the street and say, ‘This is changing the way I live, the way I get around.’ Open data and standardization is what made that possible.”

Embark’s founder tells a similar story. Because its first application, iBART, used GTFS, the company was well positioned to build similar transit apps for other cities. “It certainly wasn’t easy going,” says David Hodge, who started the company with Ian Leighton three and a half years ago. “We had to convince a lot of transit agencies to give us their data. But it would have been much more of an uphill battle” if these agencies hadn’t already been using GTFS to send their data to Google Transit.

Embark's iBART app on the iPhone

Embark’s free, ad-supported apps also prove that a little openness can support a lot of innovation. The startup’s iBART app and its sister apps for transit systems in Boston, Chicago, London, Long Island, New Jersey, New York, Philadelphia, and Washington are arguably far cooler than anything Google has developed. One nice feature: the apps keep working—that is, you can still input a starting point and a desired ending point and get back a route and schedule recommendation—even when you’re underground and cut off from the Internet. The app sends you a push notification if your usual train is running late. Embark even adjusts its estimates of walking times between stations according to measurements of local citizens’ customary walking pace. (This varies quite a bit between cities, interestingly.)

“We think there is a lot of room for people like us to make applications that are very tailored for specific regions, and to add features that Google may not be interested in,” says Hodge. This month, Embark’s New York City app beat out 41 other apps for the $5,000 grand prize in the MTA App Quest. And back in its home city, San Francisco, the startup’s app continues to win more users: about 3 percent of all trips taken on BART begin with a query on iBART, Hodge says. “If you think about how many people are planning trips, that’s a bunch,” he says.

Still, it’d be wrong to attribute all of these changes to Google and GTFS. Hodge says Embark and other transit-app startups are “riding a number of waves,” the biggest being the arrival of the mobile app store concept in North America and Europe, largely thanks to Apple. Wave number two is the spread of cheap and accurate location-finding technology such as GPS. Then there’s the general ubiquity of Internet-connected smartphones, which are quickly weaning people from their 2005-era habit of printing out a map at home before they leave on a trip. “Our thesis is that in the age of the smartphone, you shouldn’t have to think about how to get somewhere,” says Hodge. Clearly, millions of consumers now share that thesis.

Events Occur In Real Time

As important as it was to get transit schedules off of printed bus-station placards and onto the Internet, that was just the first step in the modernization of trip planning. GTFS applies only to “static” data—the ideal, theoretical schedule to which bus drivers and train conductors try to adhere. But as any rider of public transit knows, theory and reality often—quite often—diverge.

If your morning bus to work was running 10 minutes late and you knew that in advance, you could have one more cup of coffee at home before grabbing your umbrella and saying goodbye to the kitty. That’s the whole concept behind Live Transit Updates, a feature added to Google Maps for six cities last June. If you’re in Boston, Portland, San Diego, San Francisco, Madrid, or Turin and you click on the Google Maps icon for a public transit stop, you’ll see live departure times—meaning, the predicted time the next bus or train will leave, based on real-time location data for vehicles traversing the system.

If there are service alerts, detours, or system-wide delays, you’ll see those too. “No more waiting on the corner wondering when the bus is coming,” says Martha Welsh, a strategic partner development manager on the Google Transit team. “Having that information gives people a little bit more control over their lives.”

The real-time updates do make Google Transit far more useful. But there’s a reason Google hasn’t announced any new partners in the Live Transit Updates program since it was introduced eight months ago: the technology behind it is much more complex and expensive to implement. Observers say they doubt that the revolution Google sparked when it introduced GTFS will have a sequel in the realm of real-time data—or if it does, it will be much more gradual.

For starters, transit agencies that want to provide live updates need to collect live data—i.e., the latitude and longitude of every bus and train, logged at the most frequent possible intervals. This usually means installing a GPS device on every vehicle and wirelessly transmitting the data back to a control center. Agencies must then condense this data into files full of locations and timestamps, publish the files to the Internet, and republish them as soon as there’s new data, so that Google can crunch the numbers and continuously update its predicted arrival times.
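
As a rough illustration of the number-crunching involved, the Python sketch below takes one observed departure and pushes the resulting delay onto the stops that follow. It is a deliberately naive assumption about how a prediction might be made, not a description of Google's method; the schedule is borrowed from the TriMet sample earlier in the article, and the observation and the function name `predict_departures` are hypothetical.

```python
def to_seconds(hhmmss):
    """Convert an HH:MM:SS string to seconds after midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hhmmss(total_seconds):
    """Convert seconds after midnight back to an HH:MM:SS string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Scheduled (stop_id, departure_time) pairs for one trip, taken from the
# TriMet stop-times sample quoted earlier in the article.
SCHEDULE = [("13170", "08:53:00"), ("7631", "08:54:26"), ("7625", "08:56:31")]


def predict_departures(schedule, observed_stop_id, observed_time):
    """Naive prediction: carry the delay seen at one stop to all later stops.

    Returns (delay_in_seconds, [(stop_id, predicted_time), ...]).
    """
    stop_ids = [stop_id for stop_id, _ in schedule]
    index = stop_ids.index(observed_stop_id)
    delay = to_seconds(observed_time) - to_seconds(schedule[index][1])
    predictions = [
        (stop_id, to_hhmmss(to_seconds(scheduled) + delay))
        for stop_id, scheduled in schedule[index + 1:]
    ]
    return delay, predictions


# Hypothetical observation: the vehicle actually left stop 7631 at 08:58:26.
delay, predictions = predict_departures(SCHEDULE, "7631", "08:58:26")
print(f"running {delay} seconds late")
for stop_id, predicted in predictions:
    print(f"stop {stop_id}: predicted {predicted}")
```

A production system would presumably also account for traffic, dwell times, and the chance that a late bus makes up time, which is part of why the real-time problem is so much harder than publishing a static schedule.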

To enable all that, Google introduced a new standard in 2011 called GTFS-realtime. It builds on GTFS, but is a different animal, since it includes new feed types for trip updates, service alerts, and vehicle positions, as well as provisions for constantly refreshing this data throughout the day. In an advisory to agencies, Google puts it this way: “Because GTFS-realtime allows you to present the actual status of your fleet, the feed needs to be updated regularly—preferably whenever new data comes in from your Automatic Vehicle Location system.”
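
The three kinds of information named above can be pictured with a short Python sketch. The classes and field names here are simplified, hypothetical stand-ins chosen for readability, not the schema defined by the GTFS-realtime specification, and the feed contents are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TripUpdate:
    """A schedule deviation for one trip at one stop."""
    trip_id: str
    stop_id: str
    delay_seconds: int


@dataclass
class VehiclePosition:
    """Where a vehicle was at a given moment."""
    trip_id: str
    latitude: float
    longitude: float
    timestamp: int  # seconds since the Unix epoch


@dataclass
class ServiceAlert:
    """A rider-facing notice that applies to one or more stops."""
    header: str
    affected_stop_ids: List[str]


def notices_for_stop(feed, stop_id):
    """Collect the rider-facing lines that apply to a single stop."""
    lines = []
    for entity in feed:
        if isinstance(entity, TripUpdate) and entity.stop_id == stop_id:
            minutes = round(entity.delay_seconds / 60)
            lines.append(f"Trip {entity.trip_id}: about {minutes} min late")
        elif isinstance(entity, ServiceAlert) and stop_id in entity.affected_stop_ids:
            lines.append(f"Alert: {entity.header}")
    return lines


# Made-up feed contents; only the trip and stop IDs echo the TriMet sample.
feed = [
    TripUpdate(trip_id="2666662", stop_id="7625", delay_seconds=240),
    VehiclePosition(trip_id="2666662", latitude=45.52, longitude=-122.68,
                    timestamp=1329843600),
    ServiceAlert(header="Detour on 45th Ave", affected_stop_ids=["7625", "7631"]),
]

for line in notices_for_stop(feed, "7625"):
    print(line)
```

Unlike the static files, a list like this goes stale within moments, so an agency would have to regenerate and republish it continuously throughout the day.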

That bland statement contains a world of hurt. “It takes a lot more to create and maintain a GTFS-realtime feed than it does for a GTFS feed,” says BART’s Moore. “It’s frankly a little complicated. I think it’s going to be interesting to see how agencies adapt to that standard.”

To get technical for a moment, GTFS-realtime is based on “protocol buffers,” a compact binary format for encoding structured data as short messages. Google engineers invented protocol buffers because they needed something faster and more streamlined than XML, the usual language for exchanging data on the Web. The problem is that it takes a real programmer to master the concept. A transit agency may be lucky enough to have a spreadsheet jockey like Tim McHugh who can generate GTFS files, but it probably doesn’t have developers trained in Google’s peculiar data-encoding philosophy.
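
The size argument can be made concrete with a small Python comparison. This sketch is an assumption-laden stand-in: it uses the standard library's fixed-layout `struct` packing, which is not the actual protocol buffer encoding (that format uses tagged fields and variable-length integers), and the vehicle report and XML element names are invented for illustration.

```python
import struct

# One hypothetical vehicle report: trip ID, latitude, longitude, Unix timestamp.
trip_id, latitude, longitude, timestamp = 2666662, 45.5231, -122.6765, 1329843600

# The same report spelled out as text with XML tags (invented element names).
xml_message = (
    "<vehicle>"
    f"<trip_id>{trip_id}</trip_id>"
    f"<latitude>{latitude}</latitude>"
    f"<longitude>{longitude}</longitude>"
    f"<timestamp>{timestamp}</timestamp>"
    "</vehicle>"
).encode("utf-8")

# "<IddI" = little-endian, unpadded: uint32, float64, float64, uint32.
binary_message = struct.pack("<IddI", trip_id, latitude, longitude, timestamp)

print(len(xml_message), "bytes as XML")
print(len(binary_message), "bytes as packed binary")

# The receiver recovers the same four values from the packed bytes.
decoded = struct.unpack("<IddI", binary_message)
print(decoded)
```

The trade-off is visible even at this scale: the packed message is a fraction of the size, but nobody can read it without knowing the exact layout, which is the kind of knowledge a small agency's staff may not have.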

On top of that challenge, many agencies outsource the problem of automatically determining vehicle locations and generating arrival-time predictions to commercial vendors. While they might be able to figure out GTFS-realtime, these vendors aren’t always eager to feed their data straight to Google. “In many cases, there are sticky contractual arrangements about who owns the data and the predictions,” says Moore.

When it comes to the future of GTFS-realtime, “the jury is still out,” says Embark’s Hodge. “There are expectations baked into it that would require transit authorities to track their vehicles in ways that most of them don’t, and to make predictions in ways that most of them can’t. I like the idea of a real-time data standard. I just think GTFS-realtime is too ahead of its time to be truly adoptable.”

The main concern that Hodge, Moore, and others seem to be expressing is that Google designed GTFS-realtime to suit its own ambitions, rather than the needs or capabilities of the transit agencies. It’s the first sign of friction in what, since the release of GTFS in 2007, has been a virtual lovefest.

Ferris, the creator of OneBusAway, is now one of the lead engineers at Google responsible for maintaining and extending GTFS and GTFS-realtime. He says Google is doing its best to respect the limitations of transit agencies while still leaving room for future innovation. “Realtime is a whole order of magnitude more complex than static scheduling—there is just no way around it,” he says. “We wanted to push the envelope in what we support. We wanted something more complex in terms of using a protocol buffer definition optimized for streaming, which gets us a lot more data. But it’s always a tension. We don’t want the spec to be this massive thing that could take five weeks just to parse through. We want this to be a spec that anyone can work with and propose features and make that happen, without us being the elephant in the room.”

The Best Computer Is the One You Have With You

The reality, of course, is that Google can’t enter any room without being the elephant. And in many ways, that’s a positive thing. When Google bought a small geographical information systems (GIS) startup called Keyhole back in 2004, it wound up disrupting the whole digital-mapping industry, where expensive, professional desktop software had previously ruled. Now anybody can open a free Google map on their smartphone, browse a virtual globe in Google Earth, or get detailed directions from Penzance to Tintagel. (If you ask Google Transit to show you how to get from Union Square in San Francisco to Pioneer Square in Seattle entirely on public transportation, it will oblige.)

Few other companies could have brought about such a swift change—or moved so quickly to take advantage of advances in mobile and location-finding technology.

“For me, personally, Google Earth on the phone is something I could only dream of in the year 2000,” says Chikai Ohazama, a Keyhole co-founder who’s now director of product management for Google’s Mobile Geo team. “You barely had broadband penetration. There was no 3D graphics on desktops, let alone phones. But today all the dreams we had have come true on the phone.”

Google Maps for Mobile on an Android device

Indeed, if there’s an overarching logic to Google’s involvement in transit data, and location information more generally, the smartphone is its organizing premise. Like its rival Apple, Google sees your phone as an intelligent gateway to a growing world of content, applications, and local information. Since it’s the computer you always have with you, it’s the one you’re most likely to use to navigate your way across town, and to zero in on a particular store or restaurant once you get there. “We like to say a phone has eyes, ears, skin, and a sense of location,” Katie Watson, head of Google’s communications team for mobile technologies, told me last year. “It’s always with you in your pocket or purse. We really want to leverage that.”

In fact, to understand Google’s vision for mobile maps at its fullest, you have to experience it through Google’s mobile operating system, Android. If you’re browsing a map on an Android phone, you can see transit data instantly by tapping on the blue icon for your local bus, train, or streetcar stops. (These icons aren’t clickable on other mobile platforms.) And only on an Android phone can you access related features like 3D maps, a terrain layer, indoor views, turn-by-turn or stop-by-stop navigation, and Places, Google’s Yelp-like catalogue of business locations. Yes, Google Maps still works on iPhones, Windows phones, BlackBerry devices, and Symbian devices—but the experience feels impoverished by comparison.

“What’s really great about Google Maps for mobile is that it offers one-stop shopping,” says Google’s Martha Welsh. “It’s not just about getting from Point A to Point B, it’s really about the opportunity to explore and interact with your environment.”

So far, Google isn’t making aggressive use of its map- or navigation-related products to serve advertisements. (On the Web, you’ll see an occasional keyword-based ad on Google’s street-view and indoor-view pages for businesses, but I’ve never come across an ad on a Google mobile map or a transit data page.) That’s not to say that Google has ruled out monetizing these services. It’s just that for now, they’re offered as part of the larger family of free products—from Gmail to Chrome to Picasa—that make Google so sticky.

The more transit data Google can provide to its mobile users, the more confident they’ll feel that the bus or train will get them to their destination on time (which is why the company is so committed to GTFS-realtime). And the better they’ll feel about leaving the car at home—or not buying one in the first place.

Indeed, if you listen to a public-transit enthusiast like Brian Ferris—who says he hasn’t owned a car in almost eight years—you begin to wonder what other forms of anti-driving persuasion the company may plan to apply.

One natural extension of Google Transit, Ferris suggests, would be a software tool that shows people hunting for a house or an apartment how long their commute to work would be by bus or car—or how much they’ll pay for car insurance and parking in each neighborhood. “If we can capture information about all the external costs we don’t represent now…[and] if we can give you as much information as possible when it comes time to make a decision about where to live or whether to get into a cab versus a car versus a bus, those are the ways we can encourage people to use public transit,” says Ferris. It’s all just another example of “organizing the world’s information,” he says. But like so many of Google’s ideas, it may be one that will help reorganize the world along the way.

Wade Roush is a contributing editor at Xconomy.