Can The Echo Nest Stay Aloft in the Turbulent Music-Recommendation Industry?
As I walked through a windy, chilly Somerville on my way to visit music-discovery startup The Echo Nest yesterday, last weekend’s sudden shutdown at Matchmine weighed on my mind. After all, The Echo Nest is in the same general business as the now-defunct Needham, MA, company: building software that helps digital media companies provide Web users with personalized recommendations, by rating the intrinsic qualities of songs or other media files in the companies’ catalogs and matching them with the tastes expressed by users. And both companies were built around the belief that today’s dominant recommendation systems, based on a decade-old technique called collaborative filtering, do a lousy job of helping people discover new music and other media.
But after pouring nearly two years of work and $10 million in financing into its MatchKey service, Matchmine was unable to find enough clients to keep its main investor, the Kraft Group, from pulling the plug in a bid to help stem its own cash flow crisis. So I had lots of questions for the guys at The Echo Nest—who just brought their company out of stealth mode last month—about whether there’s a real demand for new music recommendation technologies.
As it turns out, they had some pretty good answers. While the company hasn’t said which media companies it’s working with, CEO Jim Lucchese says the startup has been approached by a long list of music-driven organizations—think social networking sites, Internet radio stations, and the like—that are racing to upgrade their recommendation services to compete with hot online music platforms like Pandora and Last.fm. “Recommendation has moved from a nice-to-have to a must-have,” Lucchese says. “There are a lot of companies that need to get into parity with their competitors.” The Echo Nest plans to start talking about some of its customers in a couple of months, Lucchese says, and by the end of the first quarter of 2009, he predicts, “we’ll be powering applications on a number of the comScore top 10 music properties,” referring to websites monitored by the audience measurement service comScore Inc.
An MIT spinoff founded in 2005 (nearly two years before Matchmine), The Echo Nest may also be benefiting from an early decision to stay small and raise little capital. That meant the company could afford to wait a little while as the Internet music economy picked up steam.
“I can’t speak directly to Matchmine’s experience, but there are companies in the recommendation space that over the last couple of years raised tens and tens of millions of dollars, and we did not,” Lucchese says. “Until recently, we were five people. For a company like Matchmine, part of the problem could be that they were at full ramp-up a little bit too early for the market.” But in the last couple of months demand has really picked up, Lucchese adds: “We have a very long pipeline right now of deals where people are saying ‘We need to get this out by the first quarter.’”
So what exactly is this technology that The Echo Nest’s clients are in such a rush to implement?
The company’s story starts at the MIT Media Lab, where its co-founders and co-CTOs, Tristan Jehan and Brian Whitman, met as graduate students in Professor Barry Vercoe’s “Music, Mind, and Machine” group. Jehan was developing software that automatically picked out the tempo, rhythm, and other parameters in songs. Whitman was applying text retrieval techniques to information about music—and he says he was “pretty pissed off” about the state of the art in Web-based music recommendation services.
“Amazon and almost every large online store uses sales and clickstream data to do things like saying, ‘Okay, Jen bought these four records, and you bought three of them, and you don’t know Jen, but you should probably buy that fourth one as well,’” Whitman says. That’s collaborative filtering—and it was largely invented by Media Lab professor Pattie Maes (an Xconomist) and her students in the mid-1990s (her recommendation company Firefly Network was bought by Microsoft in 1998 for $40 million).
But while collaborative filtering “sounds like a great idea,” says Whitman, “when you apply it to music, stuff just gets lost.” Music with lower sales volume, and therefore less clickstream or sales data, will never show up in a collaborative-filtering-based search. So while collaborative filtering makes tracks and albums from popular bands even more popular, it marginalizes newer or edgier groups (a point I also made in my first profile of Matchmine).
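To make Whitman's complaint concrete, here is a minimal Python sketch of purchase-based collaborative filtering. It is not the code of Amazon or any other store; the shopper names, album names, and the overlap-count weighting are all assumptions invented for illustration. It shows the "Jen bought four, you bought three" logic, and why an album nobody has bought yet can never surface.

```python
from collections import Counter

# Hypothetical purchase histories, invented for illustration.
purchases = {
    "jen": {"album_a", "album_b", "album_c", "album_d"},
    "you": {"album_a", "album_b", "album_c"},
    "sam": {"album_a", "album_d"},
}
# Hypothetical catalog: one album has no sales data at all.
catalog = {"album_a", "album_b", "album_c", "album_d", "obscure_debut"}


def recommend(user, purchases):
    """Rank albums the user lacks by overlap-weighted votes from other buyers."""
    mine = purchases[user]
    votes = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared purchases act as the similarity weight
        for album in theirs - mine:
            votes[album] += overlap
    return [album for album, score in votes.most_common() if score > 0]


recs = recommend("you", purchases)
print(recs)  # ['album_d']

# Albums with no purchase data can never collect a vote.
never_recommended = catalog - set().union(*purchases.values())
print(never_recommended)  # {'obscure_debut'}
```

Because votes come only from other shoppers' baskets, the album with no sales stays invisible however good it is, which is the marginalizing effect described above.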
Whitman was by no means the first to find collaborative filtering unsatisfying. Back in 2000, a composer and record producer named Tim Westergren had started the Music Genome Project, an effort by scores of trained musicologists to rate songs according to nearly 400 attributes such as melody, harmony, instrumentation, and rhythm. That led to the creation of Pandora, a free streaming music service that uses the attribute scores to help users discover new music by comparing it to the tunes they already like.
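The attribute-based approach can be sketched in a few lines of Python. This is only an illustration, not Pandora's actual method: the three attribute names, the 0-to-1 scores, and the choice of cosine similarity are all assumptions, standing in for the nearly 400 attributes the musicologists rate by hand.

```python
import math

# Hypothetical attribute scores on a 0-to-1 scale, invented for illustration.
songs = {
    "seed_track": {"tempo": 0.8, "acoustic_guitar": 0.1, "vocal_harmony": 0.7},
    "close_match": {"tempo": 0.75, "acoustic_guitar": 0.2, "vocal_harmony": 0.65},
    "far_match": {"tempo": 0.2, "acoustic_guitar": 0.9, "vocal_harmony": 0.1},
}


def cosine(a, b):
    """Cosine similarity between two attribute dictionaries."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(seed, songs):
    """Return the other songs, ordered from most to least similar to the seed."""
    others = [name for name in songs if name != seed]
    return sorted(others, key=lambda n: cosine(songs[seed], songs[n]), reverse=True)


print(most_similar("seed_track", songs))  # ['close_match', 'far_match']
```

Nothing here depends on sales or clicks, so a brand-new song can be recommended as soon as someone has scored it.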
Pandora is brilliant, in my personal opinion. [Editor’s note: I’d have gone with “wicked awesome,” but the point’s a valid one nonetheless.] It offers free, 24/7 access to a very large catalog of music. The Pandora iPhone app, which makes all of that music available to mobile-device owners on the go, has brought the company hundreds of thousands of new listeners. And Westergren has become a poster boy and activist for Internet radio, the existence of which has been threatened by attempts in Washington, D.C., to hike the royalties musicians can earn on streaming music.
But as cool as Pandora is, it suffers from a major limitation: human analysts can only listen to so many songs every day. In its eight years, the Music Genome Project has catalogued only 1.5 million songs. That’s a small fraction of the total amount of music available over the Internet: the social music community iMeem, for example, has more than 5 million tracks.
It’s a great idea to base music recommendations on the actual, measured characteristics of a given song, rather than guess about whether one person shares musical tastes with another they’ve never met. But to scale it up to the entire universe of music, Whitman argues, you have to turn to automated techniques. One is “machine listening,” the technology Jehan was investigating at the Media Lab, which can quickly produce a boiled-down quantitative representation of a song. But equally important is basic text search technology, which is needed to identify the artists associated with each song and to find out what people are saying online about the music they like.
After all, “You can’t assume that just because two songs sound the same to a computer, you will like both of them,” Whitman points out. Look at mainstream rock and Christian rock: they may share similar tempos and rhythms, but they appeal to very different audiences. “The difference there is completely cultural,” says Whitman. “The text-retrieval stuff I do can tell you a lot about the music that the audio will never tell you.”
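Whitman's point can be illustrated with a small Python sketch that blends an acoustic score with a cultural one. The Echo Nest has not published its formula, so everything here is an assumption: the feature names, the descriptive terms, the 50/50 weighting, and the use of Jaccard overlap for the cultural side.

```python
def jaccard(a, b):
    """Share of descriptive terms two songs have in common."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def acoustic_similarity(x, y):
    """1.0 when normalized features match, falling toward 0.0 as they diverge."""
    diffs = [abs(x[k] - y[k]) for k in x]
    return 1.0 - sum(diffs) / len(diffs)


def hybrid_similarity(a, b, acoustic_weight=0.5):
    """Weighted blend of acoustic and cultural similarity (weight is an assumption)."""
    acoustic = acoustic_similarity(a["audio"], b["audio"])
    cultural = jaccard(a["terms"], b["terms"])
    return acoustic_weight * acoustic + (1 - acoustic_weight) * cultural


# Hypothetical songs: identical audio features, very different write-ups.
mainstream_rock = {
    "audio": {"tempo": 0.7, "loudness": 0.8},
    "terms": {"arena", "radio", "guitar"},
}
christian_rock = {
    "audio": {"tempo": 0.7, "loudness": 0.8},
    "terms": {"worship", "faith", "guitar"},
}

print(acoustic_similarity(mainstream_rock["audio"], christian_rock["audio"]))
print(hybrid_similarity(mainstream_rock, christian_rock))
```

On audio alone the two songs look identical; once the terms people use about them are mixed in, the combined score drops well below a perfect match.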
Whitman and Jehan hung out together a lot at the Media Lab, and after they both got their PhDs in 2005, they decided to join forces to try to solve the music recommendation problem. Vercoe, Whitman’s thesis advisor, became their first angel investor—“The first time he’s ever invested in a student,” according to Lucchese. And Echo Nest became the first music-discovery company to emphasize what Whitman calls “this combination of acoustic and cultural information.”
What they spent three years building—and finally unveiled at the Demo 08 conference in San Diego in September (watch a video of their presentation here)—is the “Musical Brain,” a software-as-a-service platform that developers building music-driven websites can tap into for recommendations, as well as automated analysis of new songs and feeds of music-related media from across the Internet.
The Musical Brain has too many features to describe here, but for the most part they fall into Whitman’s two categories—the acoustic and the cultural. On the acoustic side, the software can listen to an entire song in about 2 seconds, then use various digital signal processing techniques to identify every unique “segment”—every drum beat, trumpet note, or lyrical syllable—plus factors like pitch, loudness, and timbre. The average pop song yields 2,000 to 3,000 segments, which are summarized in a single XML file that goes into the Musical Brain’s database.
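To give a rough idea of what a per-song summary like that might look like, here is a Python sketch that writes a few segments to XML. The company has not published its file format, so the element names, attribute names, and sample values below are entirely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical segments: (start_seconds, duration_seconds, pitch_hz, loudness_db).
# A real pop song would yield thousands of these.
segments = [
    (0.00, 0.25, 220.0, -12.5),
    (0.25, 0.30, 246.9, -10.0),
    (0.55, 0.20, 261.6, -14.0),
]


def segments_to_xml(track_id, segments):
    """Serialize a list of segments into a single XML string."""
    root = ET.Element("track", id=track_id, segment_count=str(len(segments)))
    for start, duration, pitch, loudness in segments:
        ET.SubElement(
            root,
            "segment",
            start=f"{start:.2f}",
            duration=f"{duration:.2f}",
            pitch=f"{pitch:.1f}",
            loudness=f"{loudness:.1f}",
        )
    return ET.tostring(root, encoding="unicode")


xml_text = segments_to_xml("demo-track", segments)
print(xml_text)
```

The signal processing that finds the segments in the first place is the hard part and is not shown; this only illustrates the boiled-down record that would go into a database.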
On the text side—which “we have a lot more people working on, because it’s huge and messy,” in Whitman’s words—the company is attempting to catalog every document on the Internet that’s about music. Every time someone writes a blog post about an album or a concert, every time a college newspaper publishes a music review, the Musical Brain finds it and uses natural language processing techniques to figure out what artist or song the text is about, then pick out the specific terms each writer is applying to that artist or song. In this way, the Brain slowly builds up a picture of what Internet users are thinking and saying about music. For example, says Whitman, “We could tell you with a certain probability that today on the blogs, people thought Radiohead was ‘angular’ or ‘emotional.’”
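A toy version of that term-counting idea might look like the Python below. It is a sketch under heavy assumptions: the four snippets and the descriptor list are made up, and a plain substring check stands in for the natural language processing the company actually uses to work out which artist a document is about.

```python
import re
from collections import Counter

# Hypothetical snippets standing in for crawled blog posts and reviews.
documents = [
    "Radiohead sounded angular and emotional at last night's show.",
    "The new Radiohead record is emotional, maybe their most emotional yet.",
    "Radiohead remain an angular, restless band.",
    "A sunny pop single with nothing angular about it.",
]
# Hypothetical vocabulary of descriptive terms to look for.
DESCRIPTORS = {"angular", "emotional", "sunny", "restless"}


def term_probabilities(artist, documents, descriptors):
    """Fraction of documents mentioning the artist that also use each term."""
    mentioning = [d.lower() for d in documents if artist.lower() in d.lower()]
    if not mentioning:
        return {}
    counts = Counter()
    for doc in mentioning:
        words = set(re.findall(r"[a-z']+", doc))  # count each term once per document
        counts.update(words & descriptors)
    return {term: counts[term] / len(mentioning) for term in sorted(counts)}


probs = term_probabilities("Radiohead", documents, DESCRIPTORS)
print(probs)
```

Run daily over fresh documents, counts like these would give the kind of "with a certain probability, today" statement Whitman describes.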
One of the hardest parts of the problem, Whitman says, is parsing names—figuring out which bands and artists bloggers, reviewers, and commenters are actually talking about. That’s partly because “there are bands named after just about everything you can imagine,” he says, from Grizzly Bear to CSS. (That one gives the programmers at The Echo Nest fits, since CSS also stands for Cascading Style Sheets, the style-sheet language widely used to control the appearance of Web pages.) The Musical Brain also has to compensate for people’s inability to spell the names of certain artists, like Björk or Britney Spears—there are at least 30 common variations on Britney’s name alone.
“It’s the least sexy part of the digital music market, and no one likes to talk about it, but reconciling artist names is a huge problem,” says Lucchese. “We’ve seen some commercial music services that are still trying to figure out how to tie together ‘Elvis Costello’ and ‘Elvis Costello and the Attractions.’ But we ended up building natural language processing capabilities that are pretty unique and turn out to have real value for the market.”
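The spelling half of the problem can be sketched with Python's standard library. This is not The Echo Nest's approach, which the company has not described in detail; the canonical artist list, the accent-stripping step, and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
import unicodedata

# Hypothetical list of canonical artist names.
CANONICAL_ARTISTS = ["Björk", "Britney Spears", "Elvis Costello", "Radiohead"]


def normalize(name):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(unaccented.lower().split())


def reconcile(raw_name, canonical=CANONICAL_ARTISTS, cutoff=0.8):
    """Map a possibly misspelled name to a canonical artist, or None if unsure."""
    lookup = {normalize(c): c for c in canonical}
    matches = difflib.get_close_matches(
        normalize(raw_name), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


print(reconcile("Bjork"))            # Björk
print(reconcile("Brittany Spears"))  # Britney Spears
print(reconcile("Elvis Costello and the Attractions"))  # None
```

The last line is the telling one: string similarity alone does not tie "Elvis Costello and the Attractions" back to "Elvis Costello," so cases like that would need alias rules or surrounding context on top of fuzzy matching.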
Name reconciliation, of course, isn’t the kind of service you could sell directly to music consumers—which is why “We are not interested in being a consumer-facing music service ourselves,” says Lucchese. “We are a provider of enabling technology to other music services.” That could mean anything from an ad-supported social networking site like Bebo or MySpace to an editorial outlet like Rolling Stone, Vibe, or Spin, to an Internet radio station or listening community like iMeem, to a paid music service like iTunes or Rhapsody, to a provider of mobile music downloads. (Those are just examples—Lucchese won’t confirm or deny that the company is working with any of these companies.)
The Echo Nest now has nine employees. It closed its first venture financing round—an undisclosed sum contributed by Waltham, MA-based Commonwealth Capital Ventures—in September, and will probably double in size over the next 12 months, according to Lucchese. (I imagine that it’s getting a few resumes from former Matchmine employees.) Its customers will interact with the Musical Brain through the extensive application programming interfaces (APIs) that the company is building for specific services such as recommendations or feeds. “The more you use our platform—the more you hit the APIs—the more you pay,” Lucchese explains. That way, customers don’t have to sink a lot of money into the service up front, and The Echo Nest gets a slice of the revenues as demand for online music services increases. “We definitely want to share in the growth of an industry that’s expanding at 35 percent per year.”
But there’s one downside to being a tools provider: you only look as smart as the people who are actually putting your tools to work. Whether The Echo Nest’s recommendation system ultimately outperforms Pandora’s “will come down to how well our customers use our tools to create new experiences for users,” Lucchese acknowledges. “How you actually measure the difference is a tough one. That said, there’s one area that’s really easy to measure, and that’s the shitty recommendations that people are getting from collaborative filtering systems. If you’re just looking at transaction data, you are prone to give out recommendations that people immediately distrust.”
Lucchese is right—I can’t remember the last time I bought something at Amazon based on the site’s automated recommendations. I can, however, remember the last time I set up a personalized Pandora radio station. It was last month, and it helped me discover a whole series of musicians who have a style similar to jazz pianist Robert Glasper, one of my current favorites. So The Echo Nest is riding a trend—and the big question for it, as for every other technology startup, is whether it can outrun the economic downturn and transform its service into real revenues before the capital dries up. Matchmine couldn’t. The Echo Nest just might.