Angry at the Genome
In 2004, I was an enthusiastic postdoctoral researcher in Eric Lander’s lab at the Broad Institute, with the job I had dreamed of since I was 10 years old. Growing up in Paducah, KY, I read Isaac Asimov’s The Genetic Code. And while I understood nothing of its meaning, I fell in love with the idea of being a human geneticist when I grew up.
I had a particular disease passion that had also been part of the plan since that time: autoimmune genetics. You see, I have a remarkable family. Nearly one-third of my relatives within three degrees have an autoimmune disorder. Even at my young age, I somehow knew those weren’t good odds. I knew that “things run in families” and that my family seemed to have autoimmunity in spades. You can imagine my surprise when, 20 years later, I realized I was in fact a human geneticist studying autoimmune disease in the most renowned think tank in genomics.
It was a thrilling time to be a geneticist. The human genome sequence was complete. The first thorough map of variation in the genome (single nucleotide polymorphisms or SNPs) was nearly complete. Unconstrained by data to the contrary, we felt we were turning a corner toward truly identifying the variation that conferred risk of disease.
But in May of 2004, I began to get very nervous because of an unexpected result we found with one of the most talented teams of autoimmune geneticists in existence: the International Multiple Sclerosis Genetics Consortium. Parenthetically, these folks are absolutely who you want at the front lines of genomic inquiry. They are dogged, thoughtful, and careful about the research they do.
At that time, we were following up on one of the key variants that conferred risk of multiple sclerosis or MS: HLA-DRB1*1501 (or “DR2”). As background, about 40 percent of all patients with MS have the DR2 variation in their genome. By comparison, only 20 percent of the general population has this variant. When you run the statistics, it turns out that this is probably one of the strongest associations in all of autoimmune genetics. So it seemed very reasonable to all of us involved that if we gathered enough patients who had MS and looked separately at the patients with and without DR2, we might uncover two types of MS.
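To make “running the statistics” concrete, here is a minimal sketch of the odds ratio implied by those two frequencies. It is only an illustration: the function name is hypothetical, and it treats the general-population frequency as a stand-in for the frequency among unaffected people, which is a reasonable approximation only because MS is uncommon. It is not the consortium's actual analysis.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying a variant, from carrier frequencies in two groups."""
    odds_cases = freq_cases / (1 - freq_cases)           # odds of carrying the variant among patients
    odds_controls = freq_controls / (1 - freq_controls)  # odds of carrying it in the comparison group
    return odds_cases / odds_controls


# Frequencies quoted in the text: 40 percent of MS patients, 20 percent of the general population.
dr2_odds_ratio = odds_ratio(0.40, 0.20)
print(f"DR2 odds ratio: {dr2_odds_ratio:.2f}")
```

An odds ratio between 2 and 3 is large by the standards of common-variant genetics, where most risk alleles carry odds ratios only slightly above 1.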
To imagine this hypothesis, I visualize genetic “skylines.” While MS may appear to be a “single” disease population based on clinical measures, we hypothesized that the disease resulted from two different genetic skylines. Our experiment was to determine whether, once we genotyped everyone and separated out the individuals carrying the most significant variant, DR2, we would immediately recognize two different landscapes.
Why would this be an important experiment? Our hope was that if a patient had an MS skyline that contained a particular genetic variant, this might mean they were better served by one drug therapy versus another. Biotech and pharma companies might specifically design clinical studies for therapies targeted at those skylines. Or even better, novel gene associations might reveal themselves when the “noise” of the architecture was reduced. We might find new associations, and these findings would provide novel targets for drug discovery.
However, we found no additional gene associations. No existing associations were stronger in one population than in the other. In short, the skylines were pretty much uninterpretable, with the exception of that previously known variation, DR2.
Most of the team was undaunted by this finding and excited to dig deeper into the genome to understand every additional peak and valley of genetic risk. However, I was devastated. For me, the disease hit very close to home and I was disappointed that there would not be actionable data for some time. So I headed into drug discovery project management, all the while hoping I was wrong and that additional time and hard work by my colleagues would prove me so.
In truth, I didn’t think much about the genome until 2010 at Thanksgiving, when 23andMe offered a $99 deal for a 500K SNP map of my genome. Perhaps surprisingly, even with my family history, I was pretty certain I wouldn’t find anything that might upset me. Why? I was 37 years old, so nearly past the window of onset for many autoimmune diseases. Moreover, my husband and I currently don’t plan on having children, so any untoward variant would be unlikely to inspire worry for the next generation.
I couldn’t have been more wrong. My genome did upset me. Not because I found a variant that is certain to become a major health burden for me in the next 60 years, but rather because I realized there was very little actionable information in the data. In short, I realized that my genome for the most part revealed nothing about my past, current or future health.
I’m not the first person to realize this. Many folks have probably felt the same way when they view their own profiles. I believe my greater frustration is directly related to the insights I’ve gained from my time at the front lines of drug discovery and human genetics. With this special vantage point, I’m now not sure that the architecture of the genome will ever provide guidance for treatment of most diseases (Mendelian genetic disease and oncology aside).
Finding a drug that provides a therapeutic benefit at doses that are much lower than those that cause toxic side effects is probably one of the most challenging jobs on earth. For many years, drug researchers have all been hopeful that the genome would reveal its secrets for some of the tough diseases like MS, and that they’d find drug targets that allowed them to develop precise, targeted, “heat-seeking missile” type therapies, even if for just a subset of all patients. Or even better, that the data would allow pre-identification of folks likely to have disease so that pre-treatment might be possible before irreversible damage occurred.
When I looked at my own genome with the latest in genetic meta-analysis data, I realized that, on the genetics alone, I might have concluded that I had MS. Now obviously, it wouldn’t be “healthy” for anyone to be on immunosuppressive therapies for 20 years if such treatment was not necessary, and in my case it would not have been. So we are left with a key question: How much more data (and what kinds of data) would we need to collect to differentiate genomes like mine, which (so far) has proved unaffected, from genomes with similar risk factors like that of my cousin, who developed the disease at 28 years old?
My fear is that for most complex diseases there are not enough patients on earth (in extant generations) to differentiate fully between individuals who will develop disease and those who will not. In fact, current research suggests that, for many diseases, we have now sampled enough of the complex genetic-disease patient population to definitively rule out that possibility. Moreover, the data suggest that while we may eventually be able to describe all the alleles that confer risk of disease, we will never be able to pinpoint for most patients, even related patients, the precise set of variants that gave them their disease. Or to quote a Boston sage: “we will never be able to differentiate casual from causal” at the level of the patient.
In some ways this is easiest to explain using that original example I gave of DR2. Forty percent of MS patients have a DR2 allele, while 20 percent of unaffected individuals have this allele. Clearly DR2 variants increase your risk of disease. However, it is entirely possible that while DR2 is involved in driving MS in some DR2-positive patients, it may actually play no role in others (just as it evidently causes no disease in the 20 percent of the unaffected population who are DR2 positive). It could simply be a case of “true, true, and unrelated.”
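A rough back-of-the-envelope calculation shows how little a single risk allele says about any one person. The sketch below applies Bayes' rule to the two DR2 frequencies above. The prevalence of 1 in 1,000 is purely an illustrative assumption and does not come from the study, and the function name is hypothetical.

```python
def risk_given_carrier(prevalence, freq_in_patients, freq_in_unaffected):
    """P(disease | carrier) by Bayes' rule, from carrier frequencies in each group."""
    affected_carriers = prevalence * freq_in_patients
    unaffected_carriers = (1 - prevalence) * freq_in_unaffected
    return affected_carriers / (affected_carriers + unaffected_carriers)


ASSUMED_PREVALENCE = 0.001  # illustrative assumption (1 in 1,000), not a figure from the text

# Risk for someone who carries DR2: 40 percent of patients, 20 percent of unaffected people.
carrier_risk = risk_given_carrier(ASSUMED_PREVALENCE, 0.40, 0.20)

# Risk for someone who does not carry DR2, using the complementary frequencies.
non_carrier_risk = risk_given_carrier(ASSUMED_PREVALENCE, 0.60, 0.80)

print(f"Risk if DR2 positive: {carrier_risk:.3%}")
print(f"Risk if DR2 negative: {non_carrier_risk:.3%}")
```

Under that assumed prevalence, a DR2 carrier's risk comes out at roughly 0.2 percent. That is more than double the non-carrier's risk, yet the overwhelming majority of carriers never develop MS, which is why the allele alone cannot separate a genome like mine from my cousin's.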
To be clear, I would love to be wrong about this. And I hope that the response to this article is that statistical geneticists take up arms to destroy my hypothesis that a futility analysis is likely to be positive for most complex genetic diseases.
But in the near term, I’m banking on the continued determination of my colleagues in drug discovery, who work every day to improve the efficacy and lessen the side effects of their candidate drugs, be it by clever delivery, thoughtful structural drug design, or thorough preclinical and clinical assessments. These folks are some of the hardest working people I know, and the natural architecture of the genome is not making their jobs any easier. I’m also banking on the ingenuity of my colleagues in genomics and proteomics as they collaborate with industry to find a path to bring the genome to bear on drug development, insofar as that is possible today.