The Web’s Last Word on Words

10/26/11

(Page 2 of 3)

taking this set of services and rolling it out to major publishers and building a business,” says Hyrkin. “A lot of the under-the-hood stuff, over the next year, is going to become more apparent.”

To which McKean adds, in a characteristic bit of word play: “It will be over the hood—mounted right on the hood.”

One of the founding precepts at Wordnik was that language now evolves faster than traditional dictionaries can keep up with. Thanks in part to the Web itself, words can pop up out of nowhere, experience meteoric careers, and flame out just as quickly. Think of “birther,” which used to be slang for “heterosexual” but now connotes someone who questions whether President Obama was born in the United States, or “sheenius,” coined by some anonymous Internet punster this spring to distill Charlie Sheen’s unique brilliance at turning an attention-getting personal meltdown into a career-advancing move. By constantly scouring the Web for fresh text and feeding it into the Word Graph, Wordnik can detect such novel words and meanings well before human lexicographers have a chance to get a grip on them.

But to make sense of it all, the company had to come up with some new technology. “When I joined the company three years ago, what Erin was out to solve was figuring out what things mean automatically, from the relationships in the text,” says Tam. “Almost out of necessity, we came up with this idea of a word graph that relates all of these ideas with other ideas. In this graph we have tens of millions of words and on the order of 50 million relationships.”

It’s raining words: The Wordnik front page displays a cascade of recently searched words.

The graph knows, for example, that “sheenius” can mean “genius,” but that it can also mean “crazy person.” Inside the Word Graph, even simple words like “run” possess a radiating web of probabilistic links to various concepts—running down a street, running out of money, a run on a bank. And the webs are always growing, fed by sources both highbrow and low, such as NPR, Simon & Schuster, Forbes, CNN, the UK’s Guardian newspaper, and Twitter.
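For readers who want a concrete picture, here is a minimal sketch of what a “radiating web of probabilistic links” could look like in code. It is not Wordnik’s implementation, which the article does not describe; the `WordGraph` class, its method names, and the observation counts are all invented for illustration, and the “probabilities” are simply each sense’s share of the observations for a word.

```python
from collections import defaultdict


class WordGraph:
    """Toy word graph (hypothetical): weighted links from words to senses."""

    def __init__(self):
        # word -> sense -> number of times that sense was observed
        self._counts = defaultdict(lambda: defaultdict(int))

    def observe(self, word, sense, count=1):
        """Record that `word` was seen used in the given sense."""
        self._counts[word][sense] += count

    def senses(self, word):
        """Return (sense, share) pairs, most frequent sense first."""
        links = self._counts.get(word, {})
        total = sum(links.values())
        if total == 0:
            return []
        ranked = sorted(links.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(sense, n / total) for sense, n in ranked]


graph = WordGraph()
# Counts below are made up for the example, not real usage data.
graph.observe("sheenius", "genius", 3)
graph.observe("sheenius", "crazy person", 1)
graph.observe("run", "running down a street", 5)
graph.observe("run", "running out of money", 3)
graph.observe("run", "a run on a bank", 2)

for sense, share in graph.senses("run"):
    print(f"run -> {sense}: {share:.2f}")
```

In this toy version, feeding in new text just means calling `observe` again, so the shares shift as usage changes. A production system would presumably infer the senses from context rather than being handed them.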

“The words go in as fast as we can find them,” says McKean. She calls the graph’s job “supernatural language processing,” a pun on the computer-science discipline of natural language processing.

It’s all fodder for what is perhaps the Web’s most interesting dictionary. When you look up a word at Wordnik.com, you get definitions from sources like WordNet, the Century Dictionary and Cyclopedia, Wiktionary, and the American Heritage Dictionary; etymologies; examples from blogs, newspapers, and magazines; hypernyms (more generic or abstract terms); words found in similar contexts; a reverse dictionary (words that contain your word in their definition); tweets; related images from Flickr; audio pronunciations; lookup statistics; and even tags, comments, and lists generated by Wordnik’s community of 75,000 registered members. It’s all enough to make a word maven swoon.
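The reverse dictionary is the easiest of these features to picture in code. The sketch below is one simple way to build such a lookup, assuming a plain mapping of headwords to definition text; the function name `build_reverse_index` and the three sample definitions are invented here and are not drawn from Wordnik or its sources.

```python
import re
from collections import defaultdict


def build_reverse_index(definitions):
    """Map each word used in a definition to the headwords it helps define."""
    index = defaultdict(set)
    for headword, text in definitions.items():
        # Lowercase and split on anything that is not a letter or apostrophe.
        for token in re.findall(r"[a-z']+", text.lower()):
            index[token].add(headword)
    return index


# Invented sample definitions, for illustration only.
definitions = {
    "sprint": "to run at full speed over a short distance",
    "jog": "to run at a slow, steady pace",
    "stroll": "to walk in a leisurely way",
}

index = build_reverse_index(definitions)
reverse_lookup = sorted(index["run"])
print(reverse_lookup)
```

Looking up “run” here returns the headwords whose definitions mention it. A real service would likely also handle inflected forms such as “running,” which this sketch ignores.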

But what’s the practical import of it all? “We are, right now, in the midst of answering that question,” says Hyrkin. “You can use these definitions and meanings to drive a whole range of activities.”

One early user of the Word Graph is TaskRabbit, the San Francisco-based site where users can farm out small jobs such as picking up groceries to vetted errand-runners. The two startups have a pair of investors in common, Floodgate and Baseline, but Tam says …

Wade Roush is a contributing editor at Xconomy.

