Weaving Words with Wordle: A Talk with IBM’s Jonathan Feinberg

3/16/09

how people use it. And this fits very well with the goals of the Social Software Center, which is a living laboratory where we put things in public and see how they’re used.

X: But you just said Wordle is maintained outside of the center, by Jon himself.

JF: Yes, but the center is creating a venue inside the company for projects like Wordle. Wordle just predates that effort. I am one of the point people on trying to set up a legal, policy, and technology infrastructure for making things like this happen inside IBM.

X: Were there any big technical hurdles to creating Wordle?

JF: The vast majority of the work was in creating a good user experience. The core layout algorithm is basically the same as when I first made it. In a normal tag cloud, you’re lining up words from the top left to the bottom right. You take up as much vertical space as the words need. You get to the end of the line and you do a line break. I wanted to see how much of the space I could fill in. The basic technique came from Martin, who has used it in many visualizations and artworks. It’s a randomized algorithm where you throw stuff on the screen and if it’s overlapping with something else, you move it around a bit. I had to add a lot more computer science to it to make it fast enough. If you do the purely random algorithm, it will take five minutes, and in Wordle it takes a few seconds. It treats the words as graphics. Internally they become shapes, and I manipulate those shapes.

The concurrency is extremely complicated. Managing things that are happening at the same time, without creating inconsistent states in memory, took a lot of thought. Coming up with and exposing different layout options about whether words are vertical or horizontal—all that stuff took a ton of work.
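To make the "throw it on the screen, and if it overlaps, move it around a bit" idea concrete, here is a minimal Python sketch. It is not Wordle's actual code. It assumes words can be approximated by axis-aligned rectangles (the interview says Wordle works with the real letter shapes), that "moving around" means stepping outward along a spiral from a random starting point, and that size scales linearly with word count. The names `Box`, `layout`, `step`, and `max_tries` are all hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Box:
    """A placed word, approximated by a rectangle centred at (x, y)."""
    word: str
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other: "Box") -> bool:
        # Two axis-aligned rectangles overlap when their centres are closer
        # than half the summed extents on both axes.
        return (abs(self.x - other.x) * 2 < self.w + other.w
                and abs(self.y - other.y) * 2 < self.h + other.h)


def layout(weights, seed=0, step=0.5, max_tries=20000):
    """Place words greedily, largest first, nudging each until it fits.

    `weights` maps word -> count. The sizing rule and the spiral search are
    assumptions made for this sketch, not details taken from the interview.
    """
    rng = random.Random(seed)
    placed = []
    for word, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        h = 4.0 * weight          # assumed: height proportional to count
        w = 0.6 * h * len(word)   # assumed: fixed-width glyphs
        # Random starting point: "throw it on the screen".
        x0, y0 = rng.uniform(-50, 50), rng.uniform(-10, 10)
        for i in range(max_tries):
            # "Move it around a bit": walk outward along a spiral.
            angle = step * i
            radius = step * angle
            box = Box(word,
                      x0 + radius * math.cos(angle),
                      y0 + radius * math.sin(angle),
                      w, h)
            if not any(box.overlaps(p) for p in placed):
                placed.append(box)
                break
        else:
            raise RuntimeError(f"no free position found for {word!r}")
    return placed


if __name__ == "__main__":
    counts = {"nation": 5, "dedicated": 4, "people": 3, "great": 3, "war": 2}
    for b in layout(counts):
        print(f"{b.word:10s} centre=({b.x:7.1f}, {b.y:7.1f}) size={b.w:.0f}x{b.h:.0f}")
```

The sketch checks every new word against every word already placed, which is the slow, naive version of the idea. The extra work Feinberg mentions for speed, and the handling of true glyph shapes and rotated words, is deliberately left out.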

(Image: Wordle diagram of the Gettysburg Address)

X: You’re talking about the way the user controls the appearance of a Wordle?

JF: A user has influence but not control. There are a couple of different settings. One is the orientation of the words—horizontal or vertical or some combination. There is the question of where the words want to be on the screen; two different choices are exposed in the Web app, one where they’re randomly distributed across the center line and the other where they prefer to be in alphabetical order. You have control over the font and the palette, although you don’t have control over how the colors get assigned to words.

X: Who owns the copyright on a finished Wordle? The text probably belongs to the person using the software, but many of the design decisions are built into the software. So the program seems to raise an interesting question about ownership.

JF: There is a little boilerplate language about how you can embed a link to Wordle in your blog. Aside from that you have to either take a screen shot, or print to PDF. You can distribute that however you like. The license says you can take any image you make and do anything you like, as long as you give attribution to the website Wordle.net.

MW: On ManyEyes, one of the ways we handle this is to have you agree to the terms of service when you sign up, and assert that you have the right to reproduce whatever data you are putting up. But yeah, it’s one of the interesting aspects of this new kind of transformation that can take place on the Web. You are mixing authorship.

X: What if you make a Wordle using someone else’s words, what then?

JF: I think you’d be hard pressed to find a lawyer who would assert that a Wordle is an infringement. It’s just a list of words. It’s fair use.

MW: What is amazing to me is the range of uses people are putting it to. You can do a lab study of something like Wordle and get a result where you don’t see any benefit, but you can’t extrapolate from that to the field. It shows how important it is to do this sort of experiment in the world at scale. The world is very creative, and it will figure out things to do with the technology that would never come up at all from just looking at a visualization widget. And you have to embrace that.

A good example of this came from the Boston Globe. They did a beautiful comparison of the Obama and McCain campaign websites, and what they discovered was that McCain was mentioning Obama a whole lot, and Obama was not mentioning McCain very much at all, and that showed up loud and clear in the two Wordles. It was like looking at a photo. It immediately got that point across and made you want to look further.

JF: One of the ways that a Wordle functions is that you get not only a picture of the relative frequency of words but you can get happy random juxtapositions of words that are conducive to associative thinking. It’s generating ideas about something that otherwise wouldn’t have occurred to you. It’s like a data toy. I don’t think that it is fair to call it a “mere” toy, because play is a very powerful thing. In calling it a toy, I’m not saying it’s not something valuable and useful.

X: I wonder what a professor of literature or a rhetorician might have to say about Wordle, and the way it takes texts and mixes up their meanings and suggests new ones that the original author may never have intended.

MW: You can argue that chopping up the text into individual words is an assault on the text. In some sense, there is a natural controversy there. But focusing on what a rhetorician would say about this may not be the right focus. Trying to figure out why it’s so appealing has to do with other things. If you hear a speech read aloud, it’s read with expression; there is a cadence to every expression that is unique. Wordle brings back some of that unique cadence on the page. It’s almost a return to the richness of the spoken word.

JF: You’re going much further than I’d go.

MW: Well, this is just me speaking and speculating. This is an active area of research for us. And this general line of investigation, into how you illustrate text online, is central to what the Visual Communication Lab is doing right now. The traditional visualization question is how do you see a terabyte database all at once? How do you see your customer transactions all at once? The future question may be, how do you see a million blogs at once, or how do you see every book ever written all at once? Those are very important questions for visualization.

X: Do you plan to add any features to Wordle, or eventually come out with Wordle 2.0?

JF: I fix bugs, but I haven’t added any features for a very long time. If there is something interesting and important to do, I’ll do it, sure. But it’s so useful and so deeply enjoyed by so many people right now, why break it? I’ve said no to many features—not just in the interest of protecting my own time, but also in the interest of protecting how easy it is to use for newcomers.

Wade Roush is a contributing editor at Xconomy.
