ENCODE and the Truth
Constant facts of all scientific endeavors – Nature always wins. The truth will come out.
An absorbing article last month in The Scientist called The A@#hole Scientist wondered if scientists could be a-holes, as it called them. Of course they can. Simple logic. Scientists are human and humans can be a-holes. QED.
According to the article, a-hole scientists act by “denying funding to one’s competitors, disparaging them or their results, getting them fired, delaying publication of their work—all simply for vanity, supposed superiority, or envy dressed up as rightness.”
In essence, they make it personal.
I was struck by these examples recently while watching the continuing discussions of ENCODE’s work.
And these discussions also reflect directly on something I helped write for Xconomy on ENCODE, providing us further insights into human aspects of doing science.
Let’s work our way through this.
“All simply for vanity, supposed superiority, or envy dressed up as rightness” reminded me of Richard Feynman. Not because he was an a-hole but because he was such a keen observer of humanity.
No more so than in an influential speech he gave in 1974 entitled Cargo Cult Science.
Every scientist should read it. It does an amazing job describing the human enterprise of research.
Smart researchers get the science wrong, fool themselves into believing that what they WANT to be true IS actually true, allow their own vanity, supposed superiority or envy to impair progress:
“We’ve learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. Nature’s phenomena will agree or they’ll disagree with your theory. And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven’t tried to be very careful in this kind of work.”
Researchers can be the easiest people to fool. Feynman gives some classic examples demonstrating how scientists fooled themselves for decades, because they ‘believed’ they knew the facts. Personal hubris and ego, plus the rush to gain some fame and excitement, can be very powerful.
But Nature always wins. The truth will come out.
Researchers often make all sorts of biased decisions that have little to do with rational thinking.
They can act like the a-holes described in The Scientist article, sometimes to support biased models, as Feynman discussed.
Science is made up of people that messily move the field forward for personal reasons, often based on their own emotional needs.
It is amazing that it works at all.
Feynman provides a hint – reputation. Science works because it is not alchemy; it is the very public display of the work. With reputation on the line.
It requires publication of research, discussion of the work amongst peers and the defense of ideas. Some models wither rapidly. Others withstand the onslaughts and become powerful models of Nature.
Science is about putting research models – faulty as they are – out into the public arena to fight with other ideas. A healthy competition eventually sculpts a model with some semblance of reality, even though carved by frail humans.
To paraphrase Feynman – Nature and reality will agree or disagree within the public battle of ideas.
This brings me back to ENCODE and its work. Because there has now been a strong pushback on their model describing the transcriptional importance of genomic DNA sequences.
The discussion seems to have many aspects described in “Cargo Cult Science” and “The A@#hole Scientist” – surety of views, lack of humility, disparaging attacks and even a hint of workaday a-hole moves.
And one side in particular has made it personal.
Dan Graur, the lead author of a paper critical of ENCODE’s work, said: “Everything that Encode claims is wrong. Their statistics are horrible, for a start. This is not the work of scientists. This is the work of a group of badly trained technicians.”
Graur’s paper is a biting critique of ENCODE’s work, using words like “absurd”, “fallacy”, “transgressions” and “hype” – and that is just in the abstract.
Graur (I’ll italicize papers to prevent confusion with the author) is a snarky paper throughout, causing titters amongst many scientists. One compared it to a vulture picking apart a wildebeest carcass.
A vivid metaphor but one usually applied to movie reviews, not dry research papers. There are reasons most scientific discussions are impersonal and use the passive voice.
The peculiar nature of the paper – its direct attack on the work of others; its use of first person; its heavy discussion of semantics – is more a personal argument than a logical one.
To me, it is harmful in its hubris, in its personal tone and in its seeming surety that ENCODE is not only wrong science but bad science, done by bad scientists; disparaging both the results and the researchers.
Now, that is just my opinion and has nothing to do with the data. Nature always wins and the truth will come out, without regard for the character of the researchers or how personal their arguments.
So let’s look at the data.
A major area of disagreement between Graur and ENCODE is something Mark Minie focused on in our Xconomy article. ENCODE resolves some complex issues by “re-defining the gene of a multi-cellular organism as a simple, easily studied biomolecular unit—the RNA transcript of the DNA sequence.”
Graur disagrees with this, feeling that transcription is not sufficient to show function. For example, from the paper (my bold):
“The human genome is rife with dead copies of protein-coding and RNA-specifying genes that have been rendered inactive by mutation. These elements are called pseudogenes (Karro et al. 2007). Pseudogenes come in many flavors (e.g., processed, duplicated, unitary) and, by definition, they are nonfunctional. The measly handful of “pseudogenes” that have so far been assigned a tentative function (e.g., Sassi et al. 2007; Chan et al. 2013) are, by definition, functional genes, merely pseudogene look-alikes. Up to a tenth of all known pseudogenes are transcribed (Pei et al. 2012); some are even translated in tumor cells (e.g., Kandouz et al. 2004). Pseudogene transcription is especially prevalent in pluripotent stem cells, testicular and germline cells, as well as cancer cells such as those used by ENCODE to ascertain transcription (e.g., Babushok et al. 2011). Comparative studies have repeatedly shown that pseudogenes, which have been so defined because they lack coding potential due to the presence of disruptive mutations, evolve very rapidly and are mostly subject to no functional constraint (Pei et al. 2012). Hence, regardless of their transcriptional or translational status, pseudogenes are nonfunctional!”
A pseudogene, even if transcribed or translated, can never be functional. Let’s look at this definition and how it might affect real world decisions – such as requests for funding.
Say a researcher came to you with a proposal examining pseudogenes. Searching genomic databases, using some algorithmic magic, they found at least 15,000 pseudogenes in the human genome. About 1,500 of these “dead copies of protein-coding and RNA-specifying genes” might be transcribed into RNA.
Do any of the pseudogenes have a biological function? There are no data yet. They want to look and need money to continue. Would you fund it?
Following the Graur view, this proposal should be denied, because it looks at something worthless. A pseudogene is a dead copy of a functional gene, even if it is transcribed. Looking for function is bad science, done by poorly trained technicians.
Now, if that proposal were taken to ENCODE, they would have a different view. Transcription is very important. A transcribed pseudogene could be functional; let’s find out what it really does.
I am glad that Graur devotees were not responsible for funding such proposals, that they were not able to deny funding for the projects, to prevent us from learning more about Nature.
Because guess what? When we look at supposedly nonfunctional pseudogenes, we find function.
The very Chan paper Graur pooh-poohs above as “measly” is the example I just used.
The authors hypothesized that transcribed pseudogenes might regulate protein production through various RNA-based silencing processes.
They then demonstrated that some pseudogenes had a biological effect. In one case a pseudogene produced specific RNAs that controlled the growth of tumors.
Pseudogenes with actual biological functions? According to Graur, a pseudogene is by definition not functional. Graur tries to argue around this, saying these are not “real” pseudogenes but are pseudogene “look-alikes”.
Sounds like an ad hoc hypothesis to me, moving the goalposts after the ball is in the air. “Pseudogenes have no function.” “How about this one that does?” “It is not a ‘real’ pseudogene. It is a pseudogene look-alike.”
Maybe. But which proponent – Graur or ENCODE – would even have looked in the first place?
Functional pseudogenes are not just a “measly” one-hit wonder. Let’s just look at some papers that have come out in February. One describes a database of transcribed pseudogenes, identifying short RNAs that could not only inhibit protein production but also increase it.
How about this paper – using the ENCODE database – finding that several genes, including a pseudogene, may have “evolved under purifying selection, suggesting that their roles are essential and non-redundant”?
A pseudogene that is not translated into protein, that was actively selected and is essential. Found with the help of the ENCODE database.
Then there is this paper entitled “A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells.”
The transcribed RNA from this pseudogene binds to mRNA, causing degradation similar to that reported in Chan. It also binds to the working gene’s DNA, causing epigenetic changes.
One pseudogene with multiple biological functions.
This report’s lead author – Kevin Morris – said (my bold):
“Importantly, the observations presented here tell us that some long non-coding RNAs (thought to be junk), in this case emanating from a pseudogene, are mechanistically relevant and act, in the case of PTEN, as a master regulator of both transcription and translation. It shows that long-non-coding RNAs, even though not conserved, have a great potential to evolve into important regulatory units and have been overlooked as a major mechanistic player in human cells.”
A follower of Graur likely would not have supported these projects. Because they are sure that non-coding, non-conserved biologic fossils such as pseudogenes have no function.
And that decision would have been wrong.
Non-conserved, non-coding pseudogenes could be major players in human cells. How often does this happen? We do not know yet.
But when we look, we find them.
Openhelix suggests that the value of ENCODE is its database which can now be examined for the gold that exists there. A database that might not exist if the Graur view prevailed.
ENCODE’s definition provides us with the impetus to find out if a pseudogene is functional. Graur does not.
The Graur definition hampers further investigations, suggesting that we should not even look because the answer is already known.
ENCODE is most likely not fully correct. That is why its work is put out there. We need to know how it withstands the onslaughts of public examination – such as Graur – to see where its faults lie.
But it appears to me that Graur has already been shown to be faulty. Transcription alone – from a nonfunctional, non-coding, non-conserved protein gene – can have huge biological ramifications.
I wonder whose view of pseudogene function Feynman would have suggested leads to cargo cult science? Which one has its view of rightness clothed in supposed superiority?
Which one more greatly limits further exploration because of surety that does not really exist, and which one drives further exploration?
ENCODE’s view will likely be altered as we learn more. That is how science usually works.
And in a few years the authors of Graur might wish that their paper had shown a little more humility, a little less personality and a little less snark. As I said, there are reasons why research papers are so dry.
Scientists use the passive voice and an impersonal approach in research discussions for a simple purpose – to provide personal distance from the scientific models they describe. Impersonal descriptions make it easier to separate personality from the research, especially if one is eventually shown to have used an imperfect model.
Temporary fame and excitement do not long hide incorrect science. Anyone can propose an imperfect model, can fool themselves, and can be shown by Nature to be wrong.
Nothing reveals an a-hole faster than being on the wrong side of Nature. No one wants that reputation.
Thus, we write “The work was done” rather than “I did the work.” It’s easier to say “The research was incorrect” than “I was wrong.”
Because Nature is always right.
The truth will come out.
And scientists are only human.