Open Source Biology Movement Co-Founder, Merck’s Eric Schadt, Leaves for New Job
Merck’s Eric Schadt, one of the creative forces in a fledgling effort to start an open-source computing movement for biology, is leaving that role after a couple of months for a new full-time job.
Schadt, 44, the executive director of Merck’s Rosetta Inpharmatics division in Seattle, has agreed to become the chief scientific officer for Menlo Park, CA-based Pacific Biosciences. PacBio, for short, has raised more than $190 million in the past five years from the likes of Kleiner Perkins Caufield & Byers and other investors to develop new instruments that will make it possible to sequence entire genomes much more cheaply, and in greater detail, than can be done today.
This new gig means Schadt, a world leader in using mathematical models to show how genetic abnormalities lead to disease, will give up his full-time role in Seattle at Sage Bionetworks. This is the open-source biology nonprofit he incorporated in February with fellow Merck executive Stephen Friend, who corralled $5 million in anonymous donations for the effort. It certainly doesn’t sound like an auspicious beginning for a co-founder to leave such a full-time project so soon, but Schadt insists he will maintain close ties. He plans to keep his house in Seattle and travel here one day a week to consult for Sage. He also plans to make sure PacBio machines in coming years will provide the richly detailed data on different genetic profiles that will be the essential raw material needed if Sage is going to catch on with biologists.
“I’ll still be involved with Sage,” Schadt says. “Sage’s main goal is to provide open access to enable researchers to analyze complex genetic databases. I realized that one of the things that’s critical for it to be successful is that we generate the right kind of data for it.”
Friend, who will remain based in Seattle as the chief executive of Sage, said Schadt will be able to continue driving the project forward. “He will be able to accelerate data generation for the project” while at PacBio, Friend says. He added that Schadt has been thinking for months about how to get potent enough “fuel,” or data, to make Sage’s engine run right. “It’s like if you have a big engine, and not enough fuel to start the engine, that’s not a good thing.”
So by working on developing the new tools for producing data, Schadt hopes he will give researchers around the world a sort of clay to work with as they build new mathematical models to help make these complex data sets talk to each other. Then others on the Sage project will devise software or user interfaces that can put all this complex information into a format people can understand so they can put it to real-world use for crafting new experiments, or even prescribing a certain drug.
This is all pretty heady stuff. But the basic idea behind Sage, as I described in March, is built on work Friend and Schadt directed at Merck’s Rosetta division for the past seven years. Their premise is that vast networks of genes get perturbed, or thrown off-kilter, in complex diseases like cancer, diabetes, and obesity. Scientists can’t just pick one faulty gene or protein and make a magic bullet to shut it down. But what if researchers around the world capturing genomic profiles on patients could get all of their data to talk to each other through a free, open database?
For example, a researcher in Seattle looking at how all 35,000 genes in breast cancer patients are dialed on or off at a certain stage of illness might be able to make critical comparisons to readouts from another scientist. By pooling data from around the world, and integrating efforts to capture clinical, genetic, and other molecular data, researchers should get a much sharper picture of what’s going wrong with certain diseases, and how to better go about treating them.
Besides helping scientists aim higher, this will make medicine more transparent than ever, Friend told me back in March. Physicians could look at genetic profiles from their patients, match them up with the Sage database, and then prescribe the medicine most likely to work. The FDA could look for insight into the proper balance between the risk and benefit of a drug. Health insurers could look at drugs for certain patients that have the greatest likelihood of success, and pay for ones that work. Drug companies could use the database to weed out treatments that are bound to fail or cause side effects for patients with certain genetic profiles, potentially saving years of wasted effort and hundreds of millions of dollars.
Merck has committed to donating proprietary know-how and equipment to get the project up and running before it closes down the Rosetta Inpharmatics division in Seattle. That hand-off to Sage, which is being established at the Fred Hutchinson Cancer Research Center, is on track to be completed by July 1, Friend says. The project also hopes to build a staff of as many as 30 people, drawing partially from the talent pool at Rosetta.
There are all kinds of potential obstacles for a project like this, not the least of which will be how to handle intellectual property that Sage wants researchers to put in the public domain for the betterment of science. Friend declined to say much about how issues like this are being resolved, or which other labs have agreed to participate. (He says more details will be available later in the summer.)
Schadt says he foresees a day when more researchers will want to participate in the project to help make sense of crushing data loads that will become available from third-generation sequencing machines like those from PacBio, which are expected to become commercially available in the second half of 2010. These machines will produce 20 times the amount of data of current instruments, providing rich detail on minute genetic variations that make slightly different proteins and on how genes are turned on or off, as well as longer snapshots of how RNA is transcribed from all that underlying code.
Schadt made his name at Merck by creating mathematical models to predict how certain genetic profiles influence the way people respond to certain treatments. He downplayed his own impact, saying other researchers around the world can do that work for the project. Friend will focus his energies on the computing “platform,” which partly means turning all this data into a usable form that biologists can get comfortable enough with to form their own hypotheses from it, Schadt says.
We’ve written a lot lately about companies that see opportunities in helping biologists decipher the coming tsunami of genomic data through software programs—like Victoria, BC-based GenoLogics, Seattle-based Geospiza, and Redmond, WA-based Microsoft. It sure sounds like there’s room for all comers, including a nonprofit open source movement like Sage, to enter the fray.
“When you have terabytes of data available on individuals, and petabytes on thousands of individuals that will be available in a matter of days, how will labs be able to handle that and interrogate it in useful ways?” Schadt says. Sage hopes to make it useful, he says.