Seattle’s Growing Advantage in the Cloud
Cloud computing and biotech are the two most important nonlinearly growing economic sectors. These two sectors intersect in Seattle in a unique way that has important implications for all involved. Small changes now will make big differences in what our lives are like decades from now, and Seattleites will have a ringside seat.
For now, only three organizations have the resources and outlook to be cloud providers: Amazon, Google, and Microsoft. Two of the three are headquartered in the Seattle area, and the third, Google, has a research presence here. Only Microsoft and Amazon appear interested in supplying cloud services per se, and those two are really setting the cloud agenda. This gives Seattle a planet-wide dominance it enjoys in no other economic area except perhaps global health. Such dominance shouldn’t be taken for granted (see: aerospace), but for now if you want to drive the cloud agenda in research, development, startups, or bizdev, you are going to spend time in Seattle.
While Seattle is a top-tier biotech hub, there is no area of biotech where Seattle predominates the way it does in cloud computing. However, it happens that Microsoft’s and Amazon’s cloud groups take a disproportionate interest in biotech. Amazon’s cloud leadership includes people with strong biotech backgrounds, and Amazon’s new South Lake Union campus is literally surrounded by cutting-edge biotech research. Health IT is a key sector for Microsoft, and the Azure group has reached out to genomics researchers and others.
Compared to big cloud users like FarmVille and Netflix, biotech isn’t a big cloud consumer, and it probably never will be the biggest. Conversely, recent developments make the cloud very important to biotech. The most important of these is next-gen DNA sequencing, which uses new chemistries to produce lower-quality DNA sequences very, very cheaply. At the same time, lower read quality increases the computational task of assembling reads. The result is that computational analysis costs are often higher than the wet chemistry costs, sometimes many times higher.
Consider why this situation might cause DNA sequencer makers to have a collective forehead-slap. Instruments are inexpensive with ever-increasing capacity. Reagents are cheap and bound to get much, much cheaper. Yet computation costs are going up. What’s the one part of the business the instrument makers don’t have a big piece of? Computational analysis. Oops.
As computation needs rise, cloud computing can make a big difference. Cloud computing promises systematically lower computation and storage costs, and frictionless scalability. Local companies like Geospiza and LabKey (as well as my company, Insilicos) exploit these advantages to offer computational services that biotech instrument vendors may now be wishing they owned and controlled.
Until recently, biotech researchers typically didn’t track computing expenses because the expenses were usually small and many researchers didn’t pay for computing out of their own budgets anyway. Those days are over for DNA sequencing. Other areas of biotech, such as protein folding, routinely have big computation problems, and in proteomics and many other areas, computation problems are growing faster than computer capacity. In future, much of biology research will have to plan and budget for computation as an integral part of most experiments.
Inevitably, most of this complicated computing will be done using cloud computing. Cloud computing has such economies of scale that it ultimately wins for a lot of things, particularly where computation demand fluctuates heavily over a period of hours to days. Scientific computing fits this profile: big computation, varying drastically over the course of an experiment. Consequently, scientists who use cloud computing will have an edge over those who do not, and in due time computational biology will largely be performed in the cloud.
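To make the fluctuating-demand argument concrete, here is a rough back-of-the-envelope sketch in Python. It compares owning enough machines to cover an experiment's peak against renting only what each hour requires. Every figure in it (the 72-hour demand profile, the per-machine-hour prices, and the function names) is a hypothetical assumption chosen for illustration, not a real vendor price or a measured workload.

```python
# Illustrative only: all numbers are made-up assumptions, not real prices.
# Costs are kept in whole cents to avoid floating-point rounding.

def owned_cost_cents(hourly_demand, cents_per_machine_hour):
    """Cost of owning enough machines for the peak, paid for in every hour."""
    peak_machines = max(hourly_demand)
    return peak_machines * len(hourly_demand) * cents_per_machine_hour


def rented_cost_cents(hourly_demand, cents_per_machine_hour):
    """Cost of renting exactly the machines needed in each hour."""
    return sum(hourly_demand) * cents_per_machine_hour


# Hypothetical experiment: 72 hours, mostly quiet, with a 6-hour burst
# of 200 machines in the middle.
demand = [2] * 30 + [200] * 6 + [2] * 36

# Assume rented machine-hours cost twice as much as owned ones.
owned = owned_cost_cents(demand, cents_per_machine_hour=5)
rented = rented_cost_cents(demand, cents_per_machine_hour=10)

print(f"owned for peak:  ${owned / 100:.2f}")
print(f"rented per hour: ${rented / 100:.2f}")
```

Under these assumed numbers, renting comes out cheaper even at double the hourly rate, because owned machines sized for the burst sit idle most of the time. With flat demand the comparison reverses, which is why the shape of the workload matters as much as the price.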
But what kind of cloud?
Microsoft and Amazon have different things in mind when they say “cloud.” For Microsoft, the cloud has computers running Windows and something much like SQL Server. The pain of porting existing Windows programs is minimal (or at least minimized), and it might even be practical to dynamically move applications from a local environment into the cloud and back, as demand requires. Furthermore, Microsoft expands the concept of “cloud” to include resources that an organization owns and operates and that are shared only within that organization.
Amazon’s cloud, in contrast, runs standard (and mostly open source) software like Linux and Hadoop. Porting applications that already use these standards to Amazon is straightforward, but Amazon doesn’t appear interested in resources outside of its own operation centers. Amazon views itself as a compute utility with a scale-driven price advantage. Amazon doesn’t have an opinion on what you run locally. For Amazon, a “private cloud” is a dedicated cloud run on Amazon’s equipment, which you access via a VPN.
To parody these positions, Microsoft’s cloud is exactly like your current Windows computer, except you never have to install new software; Amazon’s cloud is whatever you want it to be, so long as it runs on Amazon’s hardware.
These differing viewpoints will have a profound effect on computational biology. Will computational biology influence the future of cloud computing? I think it’s likely. Cloud vendors ought to encourage computational biology as a model customer because it is likely more profitable to serve than most other big services: it makes modest demands on two costly things, availability and latency. Computational biologists can afford to wait, sometimes for quite a long time, without serious consequences, and they can often afford to redo the occasional computation that fails. Businesses like Facebook and FarmVille can’t afford to wait or start over, because lost ad impressions are lost forever. Science is thus a desirable cloud customer, and so cloud vendors are being rational when they court scientists and hold their work up as an example to others.
So what do scientists want? Mostly what everyone wants: more for less. The danger for Microsoft is that science is often a place where Microsoft is not well accepted. If there’s a danger for Amazon, it lies in scientists’ preference for vibrant marketplaces (of goods as well as ideas). Will scientific computing force Microsoft to be more open, yet ultimately strengthen its hand against Amazon as the cloud computing market grows? It’s too soon to know, but it will be fun to watch. At the least, scientists in Seattle will enjoy unique access to the people who are building the computers of the future.