In Post-CD Era, Gracenote Makes a Big Business of Content Recognition

For Gracenote, the media database company, Christmas used to be the busiest day of the year. They called it “iPod Day,” because that was when millions of people would be unwrapping their new iPods and then rushing to their computers to rip dozens of CDs so they’d have some music to play.

For every ripped album, Apple’s iTunes software would contact Gracenote’s database, which would run the music through its audio recognition system and send back track and artist names and cover art.

“We’d see these massive spikes, three times or four times the regular load,” Gracenote CEO Stephen White says. “We had to architect the whole service so that we could support Apple on that one day.”

Sales of iPods peaked in the U.S. a long time ago—back in 2008—and CD sales have been plummeting for years as device owners have turned to direct MP3 downloads from online stores or streaming services like Pandora or Spotify. But the history that White is recounting isn’t as ancient as you might think. It wasn’t until 2010 that Gracenote’s Christmas rush began to abate, he says.

That’s mostly thanks to globalization. “As iTunes has come to Korea and Russia and India, you have a tremendous number of new consumers going through the same evolution,” White says. In other words, people outside the U.S. and Europe still buy most of their music on CD, and still need to rip it and run it through Gracenote. “Globally, the CD recognition numbers have started to flatten out, but they have not started to decline yet.”

Gracenote CEO Stephen White joined the company in 2011.

Still, the CD is a dying format, and White says Gracenote isn’t counting on revenues from its music fingerprinting service to buoy it forever. Unbeknownst to most consumers, the 350-employee subsidiary of Sony has spent years building up other businesses. It has large divisions that work with TV manufacturers, cable networks, and other content distributors on video recognition technology, and with automakers on systems that help drivers use voice commands to navigate their music playlists. Last year the company exceeded $100 million in revenue for the first time.

Behind everything there’s still Gracenote’s massive database of 130 million songs and more than 1 million movies and TV shows. But these days, the lookups against that data are coming from hundreds of types of devices, from smartphones to smart TVs to set-top boxes to in-car entertainment systems.

In fact, the Gracenote service handles 10 billion queries a month—which, if it were a search engine, would make it bigger than Bing (but not quite as big as Google).

“We see ourselves as the underlying knowledge base that links together all of these various fragmented music and video assets all over the world,” White says. “So ultimately, yes, we do understand that the CD will go away. And we have built a diversified business to allow us to add value in other parts of the ecosystem and make up for it over time.”

I visited Gracenote’s headquarters in Emeryville, CA, just as executives were getting ready for the International CES show in Las Vegas, and got a survey of its historic offerings as well as a bunch of edgier projects that show where the company is going in the near future.

Just look at Gracenote-powered apps like Habu Music—which lets you browse your music collection according to the mood you’re in—and it’s clear how far the company has grown beyond its origins in the early 1990s as the Compact Disc Database (CDDB), the first large collection of disc names, track names, and other metadata about the music files on CDs.

“A lot of folks think of us as a metadata company, but we are really not,” White says. “We are a technology company that happens to have a tremendous amount of metadata.”

Still, at the core of the business is the care and feeding of Gracenote’s media databases. Record labels, artists, publishers, and movie and TV studios submit more than 100,000 new songs and videos to the company every week, and a large editorial team at Gracenote does nothing but curate and enhance this data.

There’s software that creates a fingerprint for every file, and plenty of machine learning algorithms to help describe elements of the submissions, such as their mood and tempo. But humans are needed to train the systems, and to annotate things machines don’t know. (White says it’s very important, for example, to have unique fingerprints in the system for the “clean” and “explicit” versions of pop songs; software has a hard time telling the difference.)

It’s partly because of this human element, by the way, that White doubts there will ever be a single standard for the metadata describing songs and videos, the way the EXIF data embedded in a digital photo can tell you things like where it was taken and at what exposure. He says the idea of a universal identifier is both a holy grail and a red herring for the media industry, in part because of cultural differences.

“For example, what we call ‘world music’ in the U.S. is not world music anywhere else,” he says. “You need a flexible way to represent metadata based on the geography and the use case.”

Beyond identifying albums and tracks from ripped CDs, Gracenote’s databases power newer services like Apple’s iTunes Match and Amazon’s Cloud Player. These programs scan your computer to see what songs you’ve got on your hard drive (no matter where you obtained them) and give you access to cloud versions of the same tunes, playable from anywhere. Both services cost $25 per year, and White says they give music publishers the ability to earn back some of the money they’ve lost to free download sites like The Pirate Bay and

“The reality is the music industry has recognized that there are pirated version of content out there,” he says. “There’s not much they can do about that, and they would rather participate in the revenue of unlocking those items in the cloud.”

A big part of Gracenote’s revenue—35 percent, according to White—comes from another industry that’s struggling to adapt to consumers’ new media consumption habits, namely automakers.

Gracenote’s history in the automotive business stretches back to the early 2000s, when co-founder and chief technology officer Ty Roberts was summoned to Japan by Pioneer, the giant home and car electronics company. Pioneer was working on a system that would allow car owners to rip CDs to the hard drives inside their GPS devices. It wanted Gracenote to supply software to both identify the songs—a challenge, given that cars had no network connectivity in those days—and let drivers use voice commands to play specific songs.

“As Ty is apt to do, he said ‘We’ll figure this out, give me a couple of weeks,’” White says. “He came back here and told the team ‘All we have to do is take our database and find a way to put it in a car.’ They thought he was crazy. But six months later we shipped the first code.”

White calls the project with Pioneer “the birth of the digital media experience in the car,” and it blossomed into big current-day contracts like one with Ford to supply the music recognition and interaction components of the Ford Sync entertainment system. Gracenote’s most important contribution to Ford Sync, says White, was a vocabulary of phonetic strings adapted to the world of music.

“A normal text-to-speech engine doesn’t handle music very well,” White says. “There are so many nicknames and non-standard pronunciations, like Sade, AC/DC, 311, and that’s compounded when you go international—50 Cent in French is not ‘Cinquante Cents,’ and Elvis, in Mandarin, is not Elvis, he’s ‘The Cool Cat.’ We did the artist names and nicknames and variations in 30 languages to enable these engines to work properly.”
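To make the idea concrete, here is a minimal sketch of how a table of spoken variants might map onto canonical artist names. It is only an illustration: the table contents, the `SPOKEN_ALIASES` name, and both functions are hypothetical, and nothing here reflects Gracenote's actual phonetic data or the Ford Sync software.

```python
# Hypothetical alias table: spoken form -> canonical artist name.
# The entries are illustrative only, not Gracenote's real vocabulary.
SPOKEN_ALIASES = {
    "shah day": "Sade",
    "a c d c": "AC/DC",
    "three eleven": "311",
    "fifty cent": "50 Cent",
}


def normalize(utterance):
    """Lowercase, turn hyphens into spaces, and collapse repeated whitespace."""
    return " ".join(utterance.lower().replace("-", " ").split())


def resolve_artist(utterance):
    """Return the canonical artist name for a spoken form, or None if unknown."""
    return SPOKEN_ALIASES.get(normalize(utterance))


print(resolve_artist("Three  Eleven"))
print(resolve_artist("Shah-Day"))
```

A production system would presumably work with phonetic strings per language rather than plain text keys, which is the part White says took the real effort.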

Gracenote's MoodGrid interface for in-dash infotainment systems.

But with in-dash infotainment systems getting more sophisticated, drivers and passengers aren’t limited to interacting with their music by voice. At the CES show in both 2012 and 2013, Gracenote showed off a touchscreen system it calls MoodGrid, which lets listeners pick something peppy or poky, depending on what kind of stimulus they need. The grid’s x axis runs from “calm” to “energetic,” and the y axis runs from “dark” to “positive.”
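The two-axis layout is easy to picture in code. The sketch below assumes each track carries two scores between 0 and 1, one for energy and one for positivity, and sorts tracks into grid cells. The score fields, the five-by-five grid size, and the sample tracks are all assumptions made for illustration; the article does not describe how MoodGrid is actually built.

```python
from dataclasses import dataclass

GRID_SIZE = 5  # hypothetical number of cells per axis


@dataclass(frozen=True)
class Track:
    title: str
    energy: float      # x axis: 0.0 = calm, 1.0 = energetic
    positivity: float  # y axis: 0.0 = dark, 1.0 = positive


def cell_for(track, size=GRID_SIZE):
    """Map a track's two mood scores to a (column, row) grid cell."""
    col = min(int(track.energy * size), size - 1)
    row = min(int(track.positivity * size), size - 1)
    return col, row


def build_grid(tracks, size=GRID_SIZE):
    """Group track titles by the grid cell their mood scores fall into."""
    grid = {}
    for track in tracks:
        grid.setdefault(cell_for(track, size), []).append(track.title)
    return grid


# Made-up sample library.
library = [
    Track("Slow Rain", 0.10, 0.25),
    Track("Sunrise Drive", 0.90, 0.95),
    Track("Night Shift", 0.85, 0.15),
]
grid = build_grid(library)
print(grid.get((4, 4)))  # tracks in the most energetic, most positive cell
```

Tapping a cell on the touchscreen would then amount to looking up one key in `grid`.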

The system—which taps into music on a car owner’s iPod or smartphone, and can also connect with streaming systems like Spotify or Rhapsody—is already showing up in hardware from manufacturers like Garmin. “The auto guys love this, because they can create simple touch interfaces that sit on top of large collections of music,” White says. The same technology behind MoodGrid is available to smartphone owners in the form of the Habu Music app, developed by the Gracenote-owned app studio Gravity Mobile.

Helping consumers navigate huge media collections is also a challenge for smart TV makers, which is why Gracenote’s third major business revolves around video. Manufacturers know consumers are tired of old-fashioned electronic programming guides that organize shows according to channel and time. “The interfaces for TV listings have been pretty stale for a long time,” White says. And anyway, “Very few people outside of sports fans actually watch TV shows when they air.”

Using its metadata about video content, Gracenote has been able to construct snazzier interfaces that let users search broadcast schedules and streaming catalogs and see recommendations based on their viewing history. For Sony and Philips televisions sold in Europe, for example, Gracenote supplied a system called eyeQ that shows apps, broadcast TV channels, and on-demand content on a single home screen. And if you have a Sony Xperia device, you can run your Sony TV using a Gracenote-powered programming guide that’s been shrunk down to tablet size.

White says Gracenote wants to enable even more “second screen” experiences where TV viewers are watching a show on their big screen and simultaneously interacting with related content on their tablets or smartphones. To sync up the two, Gracenote has built a content recognition system called Entourage that can listen to a TV show’s audio track, or even analyze the video signal, to figure out how far the program has advanced. NBC Universal’s Syfy Channel uses Entourage in its Syfy for iPad app to provide fans of two shows, Haven and Face Off, with content such as challenges and character back-stories.
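How might a tablet work out how far along a show is just by listening? One common approach to this kind of problem, sketched below, is to match short audio fingerprints against a reference index and let the matches vote on a playback position. This is a generic offset-voting technique, not a description of Entourage's internals; the tokens, the `REFERENCE` index, and the program name are all hypothetical.

```python
from collections import Counter

# Hypothetical reference index: fingerprint token -> (program, seconds into program).
REFERENCE = {
    "a1": ("example-show", 0),
    "b2": ("example-show", 5),
    "c3": ("example-show", 10),
    "d4": ("example-show", 15),
}


def estimate_position(samples):
    """Estimate the current playback position from overheard fingerprints.

    samples: list of (token, seconds since listening began).
    Returns (program, offset in seconds) or None if nothing matched.
    """
    votes = Counter()
    for token, heard_at in samples:
        match = REFERENCE.get(token)
        if match is None:
            continue  # unrecognized audio, such as room noise
        program, offset = match
        # Each match implies where the program was when listening began.
        votes[(program, offset - heard_at)] += 1
    if not votes:
        return None
    (program, start_offset), _ = votes.most_common(1)[0]
    latest = max(heard_at for _, heard_at in samples)
    return program, start_offset + latest


heard = [("b2", 0), ("zz", 2), ("c3", 5), ("d4", 10)]
print(estimate_position(heard))
```

In this made-up run, three of the four samples agree that listening began five seconds into the program, so the estimate ten seconds later is the 15-second mark. The stray token is simply ignored.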

Entourage is also behind an experimental ad replacement technology that Gracenote showed off at CES. By monitoring the video signal, the system knows when a commercial is coming up, and seamlessly superimposes an ad downloaded from the Internet over the one in the broadcast signal. If the TV or set-top box equipped with the system also has some info about the viewer’s age, gender, and location, then it can “target advertisements to you based on what you care about and who you are,” White says.

The advertisers who paid for the overwritten ads might not care for the whole idea. But “nobody is trying to start World War IV,” White says. “This is a $70 billion industry, and it’s really about how we increase the value to the advertiser and the consumer, and how we all participate in the upside that results.”

Gracenote was founded in 1998, the same year as Google, and in at least one way its rise has paralleled the search giant’s: both companies thrive by managing data about data. Gracenote arguably has the more difficult task, since songs and videos are much harder to interpret than HTML Web pages—and they come with a lot less metadata. But that’s exactly why people are willing to pay Gracenote for its services, whereas Google has always had to give its search results away for free and monetize them through advertising.

As a growing list of manufacturers and networks sign up to license Gracenote’s software and data, it’s starting to look like Christmas every day of the year.

Wade Roush is the producer and host of the podcast Soonish and a contributing editor at Xconomy.
