Crocodoc’s HTML Document Viewer Infiltrates the Enterprise

It wasn’t that long ago that you could only read Word documents in Word, you could only view PowerPoint decks in PowerPoint, and you could only read PDFs in Acrobat. But without fully realizing it, we’ve come to the end of an era—the era when reading a digital document required a specialized document viewer (usually the same program that was used to create the document).

Now, more often than not, you can open any common document inside the program you’re already using, whether that’s a Web browser, the e-mail program on your smartphone or tablet, a social networking system like Yammer or LinkedIn, or a file-sharing tool like Dropbox.

You probably didn’t realize it, but there’s one tiny startup powering much of this change. It’s called Crocodoc—and you’ll likely be hearing a lot more about it in the future.

Here at Xconomy, we’ve been following this Massachusetts-born startup, founded by MIT graduate Ryan Damico and three fellow developers, for about four years now. The team started out under the name WebNotes, with a focus on tools for annotating Web pages. In 2010 they got into the Y Combinator startup accelerator program, renamed themselves Crocodoc, and took on the much larger problem of allowing groups to collaborate on editing a document online, no matter what the document type: PowerPoint, PDF, Word, Photoshop, JPEG, or PNG.

A sample page rendered in Crocodoc

In the process, they had to build an embeddable viewer that could take apart any document and reassemble it accurately within a Web browser. And as soon as they’d finished that, they had to tear their own system apart and rebuild it around HTML5 rather than Flash, the Adobe multimedia format that’s edging closer and closer to extinction.

The result of all that iterating is what’s probably the world’s most flexible and faithful HTML5-based document viewer: when you open a PDF, PowerPoint, or Word document in Crocodoc, the Web version looks exactly like the native version, even though it’s basically been stripped down and re-rendered from scratch. When I talked with Damico in February of 2011, the startup had visions of building on this technology to become a kind of central, Web-based clearinghouse for everyone’s documents—a cross between Scribd, Dropbox, and Google Docs, but with a focus on consumers, and with prettier viewing tools.

In the last year, though, Crocodoc’s direction has changed dramatically. Damico and his colleagues realized that it would be smarter to partner with the fastest-growing providers of document-sharing services and social business tools than to try to compete with them. “The massive, seismic change for us is that we had a huge opportunity to partner with Dropbox and LinkedIn and SAP and Yammer, and let them build on top of Crocodoc and make it into a core piece of their own products,” Damico says.

In other words, every time an office worker opens a document from within a Web app like Dropbox or Yammer, they’re activating a white-label version of Crocodoc that’s been customized to look like it’s part of the surrounding app. The startup still offers a personal version of its Web-based document viewer, but “Crocodoc proper is our enterprise offering now,” says Damico.

That’s a pretty huge deal for a four-person startup. Dropbox had 50 million users at last count; Yammer had 5 million. Another big Crocodoc user is Edmodo, a San Francisco-based startup that provides a Facebook-like social networking service to 7 million K-12 teachers and students.

“What’s happening at a high level here is that you don’t need software anymore. Desktop software is being replaced by the Web,” says Damico. “If you are a user of Dropbox, you can view PowerPoints, Word documents, PDFs, and the app doesn’t slow down—it’s all in HTML5. As desktop software starts to dwindle, companies are turning to us as the online place for these things.”

Having raised just $1 million from Y Combinator, SV Angel, 500 Startups, and a handful of Silicon Valley and East Coast angel investors, Crocodoc is now in a position most Web startups would envy: It earns cash every time one of its partners’ users converts a document into HTML5, whether the conversion takes place on Crocodoc’s cloud servers or within the partners’ data center. “White labeling is a very simple business model,” Damico says. “It starts at pennies per document, with volume discounts for large customers. It allows us to grow along with our customers.”

As I’ve explained in previous stories, the core of Crocodoc’s technology is a rendering engine that can reproduce pixel-perfect versions of native documents in a format that any Web browser can understand. You’ve probably seen a Word or PDF document displayed in a Google Docs browser window; that’s actually just a big, fuzzy, graphical image of the original document. “It loads slowly and it doesn’t look very good,” says Damico.

To create a high-fidelity version of a native document that still loads quickly, you have to understand the structure of the document at a deep level, Damico says. “What is a heading, what is a paragraph, what is the kerning, what is the spacing?” Then you have to tell the browser how to reconstruct the document using nothing but style sheets and the other tools of HTML5. “We think everyone is going to be using HTML5, so we are focused on building the Ferrari of HTML5 document viewers.”
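To make that idea concrete, here is a minimal sketch of what "reconstructing a document with style sheets" could look like. It is not Crocodoc's code, and nothing here comes from the company: the `TextRun` record, its fields, and the `run_to_html` and `page_to_html` helpers are all hypothetical names invented for illustration. The sketch assumes that some earlier step has already parsed the native file into text runs with known positions, sizes, and letter spacing; it then emits plain HTML elements positioned with inline CSS, rather than a flat image of the page.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TextRun:
    """One run of text plus the layout facts a parser might recover.

    Every field is an assumption made for this sketch.
    """
    text: str
    role: str                      # "heading" or "paragraph"
    x_pt: float                    # distance from the left page edge, in points
    y_pt: float                    # distance from the top page edge, in points
    font_size_pt: float
    letter_spacing_pt: float = 0.0  # crude stand-in for kerning/tracking


def run_to_html(run: TextRun) -> str:
    """Render one run as an absolutely positioned HTML element."""
    tag = "h1" if run.role == "heading" else "p"
    style = (
        f"position:absolute;left:{run.x_pt}pt;top:{run.y_pt}pt;"
        f"font-size:{run.font_size_pt}pt;"
        f"letter-spacing:{run.letter_spacing_pt}pt;"
        "margin:0"
    )
    # Escape the text so characters like "&" and "<" survive as content.
    return f'<{tag} style="{style}">{escape(run.text)}</{tag}>'


def page_to_html(runs, width_pt: float = 612.0, height_pt: float = 792.0) -> str:
    """Wrap all runs in a fixed-size page container (US Letter by default)."""
    body = "\n".join(run_to_html(r) for r in runs)
    return (
        f'<div class="page" style="position:relative;'
        f'width:{width_pt}pt;height:{height_pt}pt">\n{body}\n</div>'
    )


if __name__ == "__main__":
    sample = [
        TextRun("Q1 & Q2 Results", "heading", 72.0, 72.0, 24.0, 0.5),
        TextRun("Revenue grew.", "paragraph", 72.0, 110.0, 11.0),
    ]
    print(page_to_html(sample))
```

Because the output is real text in real elements, the browser can render it sharply at any zoom level and let readers select or search it, which is the contrast Damico draws with viewers that show a picture of the page. A production converter would also have to handle fonts, images, tables, and vector graphics, all of which this sketch ignores.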

Even so, the switch to making a white-label product that would run inside enterprise Web apps like Yammer or Dropbox was something Crocodoc “didn’t expect at all,” he says. Three things happened in 2011 to spur the shift. First, the Crocodoc team noticed that desktop software’s downswing was accelerating: Developers were creating browser-based versions of everything from tax preparation software to image and video editing suites. “HTML5 apps were mimicking desktop apps, but no one had done that for documents yet,” says Damico.

Second, “social enterprise” companies came into their own in 2011, mixing document sharing with collaboration and social networking functions. Dropbox and Box raised enormous venture rounds, Yammer grew exponentially, and Jive went public. (Now Google and Microsoft are chasing after the same market.) “It didn’t take a rocket scientist to see that documents were a key part of social enterprise software—and if you could just bring Word, PowerPoint, and Acrobat into all of these apps popping up in that space, there would be huge value there,” Damico says.

Finally, the social enterprise companies themselves came knocking on Crocodoc’s door. “The thing that really pushed us over the edge was we got e-mails from all these companies who liked our consumer products, and wanted to know if we could build something for them,” Damico recounts.

Yammer was the first big enterprise customer: Crocodoc built an embedded document viewer that resembled the other tools inside the company’s Facebook-like collaboration system. “It wasn’t clear at first if that would be just a one-off thing, but it turned out to be the tip of the iceberg,” says Damico. Today SAP uses Crocodoc to send PowerPoints to users of its iPad app. LinkedIn uses it in its Recruiter product, which lets job screeners view the Word and PDF resumes of LinkedIn members. Edmodo uses it in classrooms.

Until very recently, Crocodoc consisted solely of its four founders: Damico, Peter Lai, Matt Long, and Bennett Rogers. Now Damico says the company is “aggressively hiring” (it has four open positions). After a May 1 announcement about its partnerships with Dropbox, SAP, LinkedIn, and Yammer, the company received “an overwhelming amount of interest,” Damico says. “Literally, we haven’t been able to keep up with it.” The number of documents the company is converting, which was already in the “millions per month,” doubled within 30 days after the announcement, he says.

Of course, the digital document business is a venerable and cutthroat one, where the old guard (think Microsoft, Xerox, and EMC) isn’t likely to move aside without resistance. But Damico says he isn’t worried that some deep-pocketed incumbent will come up with a competing document viewer. “We’ve been developing this for years now, and I think we have reached a point where our domain expertise runs so deep that it would be hard for anyone to move as quickly as we can,” he says. “Plus, we do have some protections around the IP.”

If there was a single inflection point for Crocodoc, Damico told me in May, it was winning Yammer as a customer for the white-label version of the viewer. So I asked him this week whether he expects to keep his flagship customer, now that it’s becoming part of Microsoft.

“Yammer is still using Crocodoc’s viewer as a core part of their service,” he says. “While Microsoft has a great online version of its document editing tools, they’re not built for embedding into other products and don’t come close to Crocodoc in terms of rendering quality, security, or ease of integration. As long as Yammer continues to focus on building the best possible experience for their users, their ongoing use of Crocodoc’s viewer should be a no-brainer.”

Wade Roush is a contributing editor at Xconomy.

