Weaving Words with Wordle: A Talk with IBM’s Jonathan Feinberg

Xconomy chose to set up shop in Kendall Square because we wanted to be at the epicenter of investment and innovation in the Boston area. But it was just luck that we ended up right across the street from IBM’s Cambridge research facility at 1 Rogers Street, which is home to both the Collaborative User Experience (CUE) Research group and the new Center for Social Software. The creative software engineers, user interface designers, and visualization researchers in the group have come up with a whole gallery of fascinating tools for studying the way business people (and average Web surfers) interact with data, including Many Eyes, a community portal for visualization experiments.

But one of the Web-based applications included in Many Eyes has taken on a surprising life of its own outside IBM. It’s a free visualization tool called Wordle, where users can dump a bunch of text into a window and then see it automatically yet artfully arranged into a cloud of words, with the size of each word corresponding to its frequency in the original passage. As the Wordle site puts it, it’s “a toy for generating ‘word clouds’ from text that you provide.”

These aren’t your father’s tag clouds: they’re playful, colorful, and made from attractive fonts. And they fill up the space around their center points in a clever way that often seems to say something profound about the nature of the ideas in the text. In fact, Wordle achieved minor fame during the 2008 presidential campaign, when several newspapers and other media outlets used the tool to analyze publications and speeches by the major candidates. It’s also found its way into popular culture: a Google image search for Wordle diagrams turns up more than 200,000 examples. (The Wordle word cloud on this page is based on the text of this article, and the one on page 2 is based on Lincoln’s Gettysburg Address.)

I dropped by the CUE Research group’s offices several weeks ago for a conversation with Wordle’s creator and shepherd, Jonathan Feinberg. Though Feinberg is a senior software engineer at CUE, he developed Wordle as a personal project, and maintains it on a server outside of IBM. Martin Wattenberg, a mathematician who leads the CUE Research group’s Visual Communications Lab, was also on hand for the talk. Below is an abridged transcription.

Xconomy: Where did Wordle come from?

Jonathan Feinberg: It came from a Lotus project I worked on called Dogear. It was a social bookmarking engine. In every piece of social software there’s a tag cloud, and I have implemented some, but I thought they were ugly and boring. I was prompted by Dogear to think about the ways you could fit words together on a page. The core bit of code I wrote was a Java applet within Dogear called TagExplorer, but when Dogear came out as a product, TagExplorer was not part of it, because Java applets are considered too slow to load. Then two years later, while clearing out one of my workspaces, I stumbled onto the code and thought I’d like to do something with it. That’s when I built Wordle. So, the sand that formed the pearl was a tag cloud, but it’s really divorced from that idea now.

There was no way to put it out there as a branded IBM thing. I didn’t even begin working on the real Web application for Wordle until I had gotten permission to use the code externally. The lawyers and the product managers didn’t want this code for a product, and I got permission to use it non-commercially, which served me well, because I don’t like doing business. So Wordle the Web app belongs to me personally, not ManyEyes.

Martin Wattenberg: But ManyEyes has Wordle in it, and it’s become one of the most popular visualization types on ManyEyes.

X: What’s really wrong with conventional tag clouds?

MW: If you look at how the academic community has viewed things like tag clouds, it’s been with a certain amount of suspicion and skepticism about what they convey. And that skepticism has been backed up by studies. If you are going to try to convey a list of word frequencies and measure what people remember about a text, it’s not clear that tag clouds give you any benefit over, say, a list of words ordered by frequency.

That said, when Wordle went up, Jon positioned it as a toy. To me there is a fine line between a toy and an experiment. We’ve had thousands of people from all over the Web experimenting with Wordle and finding all sorts of things to do with it. There is a blog on the top 25 uses for Wordle. For me that is an indication that there is a lot more to it than just being a toy. There are teachers who have found that it’s extremely useful as part of their teaching. And maybe if you are analyzing a text very seriously, a Wordle diagram has no advantage over a list of word frequencies. But if you are trying to convey something emotionally, Wordle has a huge advantage.

There is this analogy with cameras. A Wordle is almost like a Polaroid camera for text. It lets people take snapshots that mean something to them. We’ve found more than one blog where guys talk about getting “boyfriend points” by making Wordles out of their love letters to their girlfriends. All of these things point to really interesting new uses of visualization.

JF: But how am I ever going to get back all of the husband points I lost by spending all that time making Wordle?

MW: One of the interesting things we’ve seen is a bunch of fairly high-profile media outlets like the Boston Globe and the Washington Post have used Wordles to illustrate articles about speeches. That indicates that it is in some sense more than a toy. People who are professional political analysts find this an interesting way to illustrate their points. Jon looked at tag clouds, thought about what was wrong with them, and created this fantastic alternative. And by putting it on the Web, we’ve seen ways that it’s useful that I don’t think we would have invented ourselves or trusted. It’s a little counterintuitive, and it’s certainly contrary to the usual paradigm of interface design, where you have a task in mind and an audience in mind before you start.

JF: Calling it a toy was a defensive measure. I don’t want to make any claims for Wordle as a visualization tool that gets you an accurate idea of your text. People write to me and say ‘Wow, I made a Wordle and now I see I wasn’t writing about what I thought I was writing about.’ But I feel very resistant to those kinds of analyses.

MW: That raises a research question about what people are getting out of this way of visualizing text and what they think they’re getting. Wordle really is the starting point of an important line of research. We are looking at the way it’s being used, and are in the midst of writing our first paper on it. By putting it out in public without making any strong claims about it, we get to see …

