How Big Data Is Changing Everything
There’s a radical transformation happening in information technology today, one that promises to be every bit as significant, and every bit as disruptive to existing business models, as Web applications were in the 1990s and virtualization was in the first decade of the 21st century. It’s a foundational change in the way enterprises, their employees, and their customers manage, share, and secure the staggering amounts of data that pass through their hands every day. It will make data available at higher speeds, on more massive scales, and at lower costs than anyone could have imagined even a few years ago. It’s Storage 3.0, and it’s happening right now.
The big story in IT today is “big data”: the almost inconceivable volumes of digital information created and delivered by sensors, financial transactions, video surveillance, Web logs, animation studios, genomics, online gaming networks and a practically unlimited number of other sources. It’s the inevitable but still breathtaking extension of Moore’s Law: There’s more of everything now, on corporate networks, on home computers and on mobile devices. More data is being produced by more endpoints, and the data that’s being produced, like high-definition video, is denser than would have been imaginable even a few years ago. All those ones and zeroes have to be stored somewhere (and, crucially, many enterprises want to keep their data forever), and IT systems worldwide are strained to the limit and beyond as they try to accommodate that demand.
A vast array of solutions, some of them enterprise-class, some consumer-oriented, has emerged to deliver the data storage capacity the world is crying out for. E-mail users frustrated with their Internet service providers’ data caps can use services like Box and Dropbox for oversized attachments. Online backup sites like Carbonite, one of the portfolio investments at Menlo Ventures (where John is a partner), reduce the risk of data loss caused by system failure. Amazon and Apple let businesses and individuals keep everything from financial information to family photos in the cloud. The multibillion-dollar investments being made in these projects highlight our insatiable demand for storage, and the major business benefits to those who can harness it at scale. But these systems are not well suited to big-data analytics.
The benefits of analyzing big data—that’s data measured not in gigabytes or terabytes, but in petabytes—are far-reaching, and they’re only beginning to be realized. The Human Genome Project is leveraging petabytes of DNA sequencing data as it transforms medical research, and automakers are crunching equally huge amounts of safety test data to improve their new-car designs. But smaller businesses, too, are generating vast amounts of data that they want to store, analyze, and preserve. Casinos now store and mine petabytes of video surveillance data. Gaming companies collect all their users’ in-game interactions to find ways to improve retention and monetization. Digital advertising companies collect and process terabytes of display, mobile and video ad impressions daily to improve campaign performance. Retailers analyze consumer purchases side-by-side with their ad campaigns to optimize revenues and gross margin dollars per shopping basket. Every business with a product or customer base of any size is feeling the competitive pressure to get smarter with its data, and do it now.
But traditional data storage architectures (what we now think of as Storage 1.0) can’t keep up with the demands of these environments. Fibre Channel is too rigid and too complex for setting up and managing fast-growing multi-petabyte data farms. And legacy storage arrays are simply too costly to maintain at the petabyte level.
Storage 2.0 emerged in the last decade, with significant improvements on existing storage array designs and smart software features like thin provisioning, deduplication and storage tiering. The changes, and their bottom-line potential, got noticed, with providers like 3PAR, Isilon Systems and Spinnaker Networks bought up in a series of acquisitions worth more than $12 billion. (3PAR and Spinnaker were also Menlo Ventures investments.) These deals injected badly needed innovation into the storage industry, but they didn’t change the …