The Inevitability of Archiving Social Networking Data

3/27/13

Making A Case For Email Archiving In 1985

In 1985, I was part of a team at Carnegie Mellon University (CMU) that was doing some very radical things, including deploying Internet e-mail to the entire campus. Some people were quite skeptical about the whole enterprise, fearing in particular that letting “the masses” loose on the Internet would dilute the quality of its content. In retrospect I can’t say they were wrong, but they were certainly swimming against the tide.

I, on the other hand, was insanely optimistic, and knew in my gut that the CMU campus was about to witness the birth of e-mail as a large-scale social medium. I felt that it would be useful to archive as much as possible of the content of that experiment, as a potential resource for future researchers (privacy concerns were not yet high on anyone’s radar).

Therefore, I tried to convince someone, anyone, on campus to start archiving e-mail. This would have been a major endeavor, because we expected many megabytes of data, and thus it required the resources of the campus computing center or library to make it happen. Both of these institutions turned me down with a variant of, “Why would anyone want to look at old e-mail?” My intuition told me that someone would, but at the time I couldn’t make a good enough case for the school to make the investment and pay for all the magnetic tapes, so the data was lost forever.

Where Archiving Data Stands Now

Just a few decades later, of course, e-mail archiving has become a big business and is actually a legal requirement for many companies. Now, no one questions the need for e-mail archiving under many circumstances and it’s hard to recall how outrageous the idea seemed less than 30 years ago when I was at CMU.

Today, the situation with regard to archiving social networking data resembles the e-mail archiving landscape a quarter century ago. People see social networking posts as ephemeral and largely unimportant in the long run, but we are beginning to see the shortsightedness of that perspective.

To begin with, some businesses—particularly in heavily regulated industries such as finance—have begun to use social networking for sensitive matters they’d prefer to keep out of e-mail, precisely because the latter is archived. That’s understandable, but it won’t be long before the regulators catch up. If it makes sense to legally require archiving for e-mail in a given industry, the same logic will apply to social networking data in time.

Moreover, social networking is part of a great evolution of business data from a structured, file-oriented organization to a looser, communication-oriented organization. Nobody ever planned for e-mail to become the central repository for corporate data, and yet today, by some estimates, as much as 90 percent of that data can be found in e-mail (including attachments). As social networking communications supplement that pattern, everything that has driven the desire to archive e-mail will similarly drive the archiving of social networking data.

So What’s Worth Archiving?

It’s very easy to look at the typical social networking post and say, “This isn’t worth archiving.” Even if the majority of communications are unimportant, the value of archiving the important ones will ultimately justify the maintenance of an archive—and will drive regulatory requirements to hold on to everything for a certain interval, “just in case” the data is needed in the future.

But although regulatory requirements are a very good reason to archive your data (it’s always good to stay on the right side of the law), there is an even better reason in the long run: data mining. When you archive all your communications, you create, in your archive, a potential gold mine of business insight. From your social network data, for example, you might be able to track customer sentiment and responsiveness over time. You might also be able to figure out who in your organization has used social networking effectively to engage with customers and who has not. By analyzing your business’s communication patterns, you might be able to identify key influencers who can help or hurt your business in a big way.

I believe that we’re at a turning point in the use of “big data” archives to derive deep analytical insights. Where I work, at Mimecast, we’re starting to do that with e-mail today. I find it hard to imagine that the same won’t be true of social networking data in a few years, yet I hear the same voices of skepticism I heard in 1985. The only difference is that archiving no longer costs a fortune in magnetic tapes, as it did when I was trying to make my case at CMU. If anything, today’s better archiving technology and lower costs should make the adoption of social network archiving even faster than it was for e-mail archiving.

Nathaniel Borenstein is chief scientist at e-mail management firm Mimecast. Based in Michigan, he is the co-creator of the MIME e-mail standard and previously co-founded First Virtual Holdings and NetPOS.

