PacBio, Harvard Use Fast Gene Sequencer to Crack DNA Code of Haitian Cholera Strain
Epidemics can travel around the world in a matter of hours on airplanes, killing people before anyone really knows what hit them. But now that DNA sequencing has gotten so cheap and so fast, researchers are finding ways to precisely identify the bugs in real time to find out where they came from, and how best to fight back.
The latest evidence is emerging from a team of scientists at Harvard Medical School, who, with the help of new sequencing machines in development at Menlo Park, CA-based Pacific Biosciences (NASDAQ: PACB), were able to sequence the full genome of the deadly cholera bug that has plagued Haiti since October. The results, which show the Haitian strain originated not in nearby Caribbean waters but in Southeast Asia, are being published today online in the New England Journal of Medicine.
PacBio made financial news in October when it raised $200 million in its initial public offering, money it hopes to use to capture a slice of the growing market for fast, cheap gene sequencing. The company has been generating buzz in scientific circles for its new technology, which is supposed to be able to sequence entire genomes for as little as a few hundred dollars and in as little as 15 minutes, raising the bar ever higher in a sequencing industry currently led by San Diego-based Illumina (NASDAQ: ILMN) and Carlsbad, CA-based Life Technologies (NASDAQ: LIFE). But today’s publication marks the first time that the capability of PacBio’s machine has been featured in the world’s leading medical journal for playing a role in addressing an ongoing global health concern.
“We hope this will be [the] first of many papers” in the New England Journal, says PacBio’s chief scientific officer, Eric Schadt. “It’s a good sign the technology is maturing nicely.”
The outbreak of cholera was confirmed in Haiti on October 21, providing an urgent new challenge for infectious disease experts. The epidemic has made more than 90,000 people sick, killed more than 2,000 people, and spread all over Haiti and into the Dominican Republic, according to a report yesterday in the Los Angeles Times, which cited figures from the U.S. Centers for Disease Control and Prevention.
Matt Waldor, an infectious disease specialist at Harvard Medical School and a Howard Hughes Medical Institute investigator, called up PacBio CEO Hugh Martin on November 6, Schadt says. Waldor didn’t have a prior relationship with the company, but had heard about the PacBio machine’s ability to sequence entire genomes of organisms very quickly, and asked Martin whether this was a task for the company’s new machine. The next day, a Sunday, the PacBio executive team got together for a “pow-wow,” as Schadt says, and moved quickly. “We decided to definitely do it,” Schadt says. “It was right in the sights of what we thought the machine was good at.”
The next day, Waldor and colleagues grew cultures of the cholera bacterial strains of interest in their lab so there would be enough of a sample to generate DNA sequences from. The samples were shipped to the PacBio headquarters in California on November 10, where they could be run through one of the company’s prototype machines, Schadt says. Raw sequence information came off the instrument in about 90 minutes, Schadt says.
The team at PacBio ultimately produced five genomes in total. Two came from bacteria isolated from individual patients in Haiti, another was from a recent outbreak in Bangladesh, a fourth was from Peru, and the fifth was a historical reference strain from a 1971 outbreak in Bangladesh, Schadt says.
The genomes themselves weren’t that big. While a human genome has 3 billion chemical units of DNA, these bacterial strains had just 4.1 million to 4.5 million chemical units. But the sheer mass of data wasn’t really the problem here. There are significant structural variations among the different strains, and such variations are hard to detect on today’s machines, which generate full genomes by looking at relatively short stretches of the whole genome at a time, Schadt says. The PacBio machine, he says, had an advantage in that it looks at much longer stretches of DNA, called reads, which enabled it to piece together the full bacterial genomes quickly, and to easily identify the subtle variations that make one strain distinct from another.
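The advantage of longer reads that Schadt describes can be illustrated with a toy example. The sketch below is not PacBio's method or any real assembler; the genome string, the repeat, and the read positions are all made up for illustration. It only shows that a short read falling entirely inside a repeated stretch matches more than one place in a genome, while a longer read that reaches into the unique sequence on either side of the repeat matches only one.

```python
def count_occurrences(genome: str, read: str) -> int:
    """Count the positions (overlaps allowed) where `read` matches `genome` exactly."""
    count = 0
    start = genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count


# Hypothetical toy genome: the same 10-letter repeat appears twice,
# each copy surrounded by different flanking sequence.
REPEAT = "ACGTACGTAC"
genome = "TTGCA" + REPEAT + "GGATC" + REPEAT + "CCTAG"

# A short read taken entirely from inside the first repeat copy.
short_read = genome[7:13]

# A longer read that covers the first repeat copy plus part of both flanks.
long_read = genome[2:18]

short_hits = count_occurrences(genome, short_read)
long_hits = count_occurrences(genome, long_read)

print(f"short read ({len(short_read)} letters) matches {short_hits} positions")
print(f"long read ({len(long_read)} letters) matches {long_hits} position(s)")
```

In this toy case the short read cannot be placed with certainty because both repeat copies look identical to it, whereas the long read is anchored by its flanks. Real genomes and real reads also involve sequencing errors, which this sketch ignores.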
Getting the precise sequence is thought to be important for responding to infectious disease quickly, because it can provide valuable data for public health officials. In this case, some scientists hypothesized that the Haitian cholera epidemic might be coming from nearby Caribbean waters, or possibly from Latin America. By sequencing all five of those genomes for comparison, the scientists were able to say with confidence that the strain actually came from Southeast Asia, Schadt says. The scientists were also able to get a deeper understanding of how much damage the new strain is likely to inflict on people, and how likely it is to continue to spread. Based on how pathogenic the bug appears, one of the paper’s senior authors, John J. Mekalanos, is now advancing a new strategy to develop a vaccine against cholera that could be given across Latin America, Schadt says.
“Because of this strain’s increased fitness, and pathogenicity, the fear is it will dominate across Latin America,” Schadt says.
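The idea of comparing an outbreak genome against reference genomes to find its nearest relative can be sketched in a few lines. This is a simplified illustration, not the analysis the Harvard and PacBio team performed: the sequences and the names `reference_A`, `reference_B`, `reference_C`, and `outbreak_isolate` are invented, the sequences are assumed to be already aligned and of equal length, and similarity is measured by simply counting mismatched letters.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length, pre-aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference(query: str, references: dict[str, str]) -> str:
    """Return the name of the reference sequence with the fewest mismatches to `query`."""
    return min(references, key=lambda name: hamming_distance(query, references[name]))


# Made-up toy sequences standing in for aligned reference genomes.
references = {
    "reference_A": "ACGTTGCAAC",
    "reference_B": "ACGATGCTAC",
    "reference_C": "TCGATCCTAG",
}

# Made-up toy sequence standing in for a newly sequenced outbreak isolate.
outbreak_isolate = "ACGATGCTAG"

for name, seq in references.items():
    print(name, hamming_distance(outbreak_isolate, seq))

print("closest:", closest_reference(outbreak_isolate, references))
```

With these invented sequences the isolate differs from `reference_B` at a single position, so that reference is reported as the closest. A real comparison would work on millions of letters and would also have to account for insertions, deletions, and larger structural differences between strains.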
This group certainly wasn’t the only one in the world working feverishly over the past couple of months on the cholera problem. The CDC released raw genome sequence data, which it obtained using Illumina machines, into a public repository known as GenBank. The Harvard Medical School/PacBio team, besides writing up its findings in today’s New England Journal, has deposited its raw sequence data in the National Center for Biotechnology Information’s (NCBI) public database, Schadt says.
As Schadt says, this kind of experiment should become increasingly common in laboratories whenever new epidemics pop up. It’s the sort of thing that would have been impossible even two years ago, when getting this kind of information cost too much and took too long.
“Given the speed and granularity in kinds of runs we can do, this kind of project now becomes possible,” Schadt says.