Lexalytics Moves to Boston to Exploit New Market for Sentiment Analysis

3/29/10

Lexalytics, whose text-analytics software can measure, among other things, whether a digital document is full of praise or insults, did not get off to a superlative start back in 2003.

To begin with, its investors almost closed the company down. Lexalytics got started when the venture funders behind a Woburn, MA-based content management startup called LightSpeed Software decided to consolidate that company on the West Coast. “They were going to close the East Coast operation, so I basically convinced them to give it to me to avoid the shutdown costs,” says Jeff Catlin, a former LightSpeed general manager who, together with a LightSpeed engineer named Mike Marshall, salvaged the Woburn operation, moved it to Amherst, MA, and renamed it Lexalytics.

But three months later, a wrinkle cropped up. Marshall, a UK citizen working in America on a green card, was deported. “They shipped him back, and we didn’t see each other for about three years,” recalls Catlin, Lexalytics’ CEO.

Marshall remained as chief technology officer, working remotely, and the company worked through its rough patch. Today, business is booming. In fact, the startup has outgrown its Amherst location—it’s already hired everyone it could recruit out of the UMass Amherst computer science department, Catlin says—and this month it opened a new headquarters office here in Boston.

The startup’s current momentum was a long time building, and was partly the result of some long-overdue luck, according to Catlin. Sentiment extraction, the ability to measure the emotional tone of a news story or a product review or a customer complaint, has long been one of Lexalytics’ specialties. But only in the last 18 months or so has demand for sentiment extraction software become red-hot, as companies in many industries have realized how the technology might help them with tasks like brand reputation monitoring and algorithmic investing.

“Looking back from a historical perspective, we were brilliant,” says Catlin. “We were the first vendor to do sentiment analysis, which landed us a number of big clients like Cisco, and we are now the recognized leader in that spot. I’d love to say it was really well thought-out and reasoned, but at the time we were just thinking, ‘What would be a cool feature to add?’”

Unlike Cambridge, MA-based Crimson Hexagon, Watertown, MA-based Cymfony, and a cluster of sentiment analysis startups in Seattle like Appature and Evri, Lexalytics doesn’t directly serve companies that want to know what people are saying about them on the Web. But the 20-employee startup does sell its software libraries to many of the firms that do this, including Cymfony and ScoutLabs. “A lot of those vendors use us under the hood to provide their sentiment analysis and entity extraction,” says Catlin.

Entity extraction is the process of tagging a digital document to identify key people, places, companies, products, e-mail addresses, themes, and messages. Once that’s done, Lexalytics’ software can also parse a document’s grammar, word order, and vocabulary to determine who’s speaking about whom, then score the emotional tone of each statement.

“Behind the scenes we have dictionaries of tonal phrases—typically, adjective-noun or adverb phrases—so that [the software] knows when it sees ‘horrible disaster’ or ‘wonderful day’ that those are sentiments, and who they belong to,” says Catlin.
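The dictionary approach Catlin describes can be illustrated with a minimal Python sketch. It is not Lexalytics’ code: the phrase list, the scores, and the function name `score_sentences` are all assumptions made for illustration, and the sketch skips the harder step of working out whom each sentiment belongs to.

```python
import re

# Hypothetical tonal-phrase dictionary. The phrases come from Catlin's quote;
# the numeric scores are illustrative assumptions, not Lexalytics' data.
TONAL_PHRASES = {
    "horrible disaster": -1.0,
    "wonderful day": 1.0,
}


def score_sentences(text):
    """Split text into sentences and sum the scores of known tonal phrases in each."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    results = []
    for sentence in sentences:
        lowered = sentence.lower()
        score = sum(
            value for phrase, value in TONAL_PHRASES.items() if phrase in lowered
        )
        results.append((sentence, score))
    return results


scored = score_sentences(
    "The launch was a horrible disaster. We had a wonderful day at the park."
)
for sentence, score in scored:
    print(f"{score:+.1f}  {sentence}")
```

A production system would presumably combine such lookups with the grammar parsing and entity tagging described above, so that each score attaches to a specific person, company, or product rather than to a whole sentence.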

Documents processed by Lexalytics’ software, called Salience, come back as XML files riddled with new metadata that companies can use to draw inferences or soup up their search results with related information. “You give us a document that’s a foot long, and we give you back one that’s three feet long,” says Catlin. “The best applications are with search vendors like FAST and Endeca who use the technology to make their solutions better. Google is great if you know what you are looking for, but if you have no idea what you are looking for, you need the data to tell you what’s going on. The metadata lets you start digging through that.”

Catlin gives a hypothetical example. “Say you want to know who is hot in the news today and who they are related to. It turns out Bill Gates is hot. You click on that and get the concept and the sentiments and the other people and companies that are mentioned around it, and you may find out that it’s about some energy company that he’s funding. You don’t have to have a great question at the start to find that out.”

Crucially, the software can pursue connections like this automatically, without a human involved, which is why Lexalytics’ technology is also attractive to clients like Thomson Reuters, the financial news and services giant. Catlin says the organization is using Salience to tailor the input for algorithmic trading software that attempts to get ahead of the market by gathering trend information and Web buzz (what traders call “alpha”) faster than people can.

“Because they are Thomson Reuters, they have a very real-time stream of news, and they can dump that information into their trade execution system before anybody else can get it,” says Catlin. “They’ve found they can do 60 to 70 basis points better than the market when scoring sentiment on news.” (A basis point is a hundredth of a percent.)
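To put those figures in familiar terms, 60 to 70 basis points is 0.60 to 0.70 percent. The short Python sketch below does the conversion and applies it to a hypothetical $1 million portfolio; the portfolio size and the function names are assumptions for illustration, and the article does not say over what time period the outperformance was measured.

```python
def basis_points_to_percent(bps):
    """Convert basis points to percent (1 basis point = 0.01 percent)."""
    return bps / 100


def excess_return(portfolio_value, bps):
    """Dollar value of a given number of basis points on a portfolio."""
    return portfolio_value * bps / 10_000


# Hypothetical portfolio size, chosen only to make the arithmetic concrete.
PORTFOLIO = 1_000_000

for bps in (60, 70):
    percent = basis_points_to_percent(bps)
    dollars = excess_return(PORTFOLIO, bps)
    print(f"{bps} bp = {percent:.2f}% = ${dollars:,.0f} on ${PORTFOLIO:,}")
```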

Lexalytics hasn’t taken any venture cash, and is operating at a profit. The startup shares sales and marketing expenses with UK-based Infonic, a document management company with which it has formed a joint venture. When I interviewed Catlin and Lexalytics marketing vice president Christine Sierra, it was the very first time they’d used the phone in their new office on Congress Street, just a stone’s throw from Thomson Financial.

Moving to Boston not only makes recruiting and growth easier, Catlin says, it also brings the company closer to other major customers like Endeca, Northern Light, and FAST, now a division of Microsoft. “The market seems to be quite receptive to this technology right now, and it’s easier to grow a business in a metro area like Boston than in Amherst,” says Catlin, who will split his time between the old Amherst office and the new Boston headquarters.

To make its technology accessible to companies outside the narrow fields of enterprise search or financial services, Lexalytics introduced a new Web-based service last week called Lexascope. It’s not a full-blown reputation monitoring platform; Catlin says the company has toyed with launching such a service but doesn’t want to compete directly with its own customers. Rather, it’s an application programming interface, or API, that lets any organization plug its own Web monitoring applications into an online version of the Salience engine, which will trawl through articles, blogs, tweets, surveys, forums, and other documents and extract the major entities, themes, and sentiments. Aimed at marketers and content management specialists, the “freemium” service is available at no cost for up to 1,000 documents per day. For a $400 monthly fee, users can process up to 50,000 documents a day.

Many of the natural language processing, machine learning, and statistical modeling techniques behind sentiment extraction have been around for a decade or more. But as with Lexalytics itself, it’s only in the last couple of years that the parts have begun to mesh well, says Catlin.

“Back in 2003 or 2004, a lot of people were talking about it, but for the most part they couldn’t do it,” he says. “But as time goes by, hardware gets better and research comes out modifying the older techniques. We have glued a bunch of technology ideas together to come up with a whole that is more complete than any of the parts. You don’t have to stand on one foot and wave a rubber chicken counterclockwise to make it work. We are far enough downstream now so the stuff really works to solve real problems.”

Wade Roush is a contributing editor at Xconomy.
