Uncommon Descent Serving The Intelligent Design Community

Proteins Fold As Darwin Crumbles

A Review Of The Case Against A Darwinian Origin Of Protein Folds By Douglas Axe, Bio-Complexity, Issue 1, pp. 1-12

Proteins adopt higher-order structures (e.g., alpha helices and beta sheets) that define their functional domains.  Years ago Michael Denton and Craig Marshall reviewed this higher structural order in proteins and proposed that protein folding patterns could be classified into a finite number of discrete families whose construction might be constrained by a set of underlying natural laws (1).  In his latest critique, Biologic Institute molecular biologist Douglas Axe raises the ever-pertinent question of whether Darwinian evolution can adequately explain the origin of protein folds, given the vast search space of possible sequences for moderately large proteins, say 300 amino acids in length.  To begin, Axe introduces his readers to the sampling problem: given the postulated maximum number of distinct physical events that could have occurred since the universe began (10^150), evolution cannot have had enough time to sample the roughly 10^390 (20^300) possible amino-acid sequences of a 300 amino acid long protein.
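
The sizes in the sampling problem are easier to compare in log space. The following is a minimal sketch, not taken from Axe's paper: the function name and constants are illustrative, and the only inputs are the figures quoted in the paragraph above (20 amino acids, 300 positions, a bound of 10^150 events).

```python
import math


def log10_sequence_space(length: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of possible sequences of the given length."""
    return length * math.log10(alphabet_size)


# Figures quoted in the review; the variable names are illustrative only.
PROTEIN_LENGTH = 300      # residues
LOG10_MAX_EVENTS = 150    # postulated bound on physical events, as a power of ten

log10_space = log10_sequence_space(PROTEIN_LENGTH)
shortfall = log10_space - LOG10_MAX_EVENTS

print(f"sequence space is about 10^{log10_space:.1f}")
print(f"it exceeds the event bound by about 10^{shortfall:.1f}")
```

Working with exponents rather than the raw numbers keeps the comparison readable and avoids floating-point overflow.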

The battle cry often heard in response to this apparently insurmountable barricade is that even though probabilistic resources would not allow a blind search to stumble upon any given protein sequence, the chances of finding a particular protein function might be considerably better.  Countering such a facile dismissal of reality, we find that proteins must meet very stringent sequence requirements if a given function is to be attained.  And size is important.  We find that enzymes, for example, are large in comparison to their substrates.  Protein structuralists have argued that size is crucial for assuring the stability of protein architecture.

Axe has raised the bar of the discussion by pointing out that very often enzyme catalytic functions depend on more than just their core active sites.  In fact enzymes almost invariably contain regions that prep, channel and orient their substrates, as well as a multiplicity of co-factors, in readiness for catalysis.  Carbamoyl Phosphate Synthetase (CPS) and the Proton Translocating Synthase (PTS) stand out as favorites amongst molecular biologists for showing how enzyme complexes are capable of simultaneously coordinating such processes.  Overall, each of these complexes contains 1400-2000 amino acid residues distributed amongst several proteins, all of which are required for activity.

Axe employs a relatively straightforward mathematical rationale for assessing the plausibility of finding novel protein functions through a Darwinian search.  Using bacteria as his model system (chosen because of their relatively large population sizes), he shows how a culture of 10^10 bacteria passing through 10^4 generations per year over five billion years would produce a maximum of 5×10^23 novel genotypes.  This number represents the ‘upper bound’ on the number of new protein sequences, since many of the differences in genotype would not generate “distinctly new proteins”.  Extending this further, for such a search to succeed, a novel protein function requiring a 300 amino acid sequence (20^300 possible sequences) would have to be achievable in roughly 10^366 different ways (20^300 / 5×10^23).
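
The arithmetic behind these figures can be checked with exact integers. This is a sketch under the stated assumptions only (the population size, generation rate and time span quoted in the review); the helper and variable names are hypothetical, not Axe's notation.

```python
def order_of_magnitude(n: int) -> int:
    """Return floor(log10(n)) for a positive integer by counting its digits."""
    return len(str(n)) - 1


# Figures quoted in the review; names are illustrative only.
POPULATION = 10**10            # bacteria in the culture
GENERATIONS_PER_YEAR = 10**4
YEARS = 5 * 10**9

# Upper bound on the number of novel genotypes sampled.
trials = POPULATION * GENERATIONS_PER_YEAR * YEARS

# All possible sequences of a 300-residue protein built from 20 amino acids.
sequences = 20**300

# Number of sequences that would have to perform the function
# for one of them to be expected among the sampled genotypes.
required = sequences // trials

print(f"trials = 5 x 10^{order_of_magnitude(trials)}")
print(f"sequence space is of order 10^{order_of_magnitude(sequences)}")
print(f"required functional sequences are of order 10^{order_of_magnitude(required)}")
```

Python integers have arbitrary precision, so the division is exact and no rounding enters the orders of magnitude.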

Ultimately we find that proteins do not tolerate this extraordinary level of “sequence indifference”.  High profile mutagenesis experiments on beta lactamases and bacterial ribonucleases have shown that functionality is decisively eradicated when a mere 10% of amino-acids are substituted in conservative regions of these proteins.  A more in-depth breakdown of data from a beta lactamase domain and the enzyme chorismate mutase has further reinforced the conclusion that very few protein sequences can actually perform a desired function; so few in fact that they are “far too rare to be found by random sampling”.

But Axe’s landslide evaluation does not end here.  He further considers the possibility that disparate protein functions might share similar amino-acid identities and that therefore the jump between functions in sequence space might be realistically achievable through random searches.  Sequence alignment studies between different protein domains do not support such an escape from the sampling problem.  While the identification of a single amino acid conformational switch has been heralded in the peer-reviewed literature as a convincing example of how changes in folding can occur with minimal adjustments to sequence, what we find is that the resulting conformational variants are unstable at physiological temperatures.  Moreover, such a change has only been achieved in vitro and most probably does not meet the rigorous demands for functionality that play out in a true biological context.  What we also find is that there are 21 other amino-acid substitutions that must be in place before the conformational switch is observed.

Axe closes his compendious dismantling of protein evolution by exposing the shortcomings of modular assembly models that purport to explain the origin of new protein folds.  The highly cooperative nature of structural folds in any given protein means that stable structures tend to form all at once at the domain (tertiary structure) level rather than at the fold (secondary structure) level of the protein.  Context is everything.  Indeed, experiments have held up the assertion that binding interfaces between different forms of secondary structure are sequence dependent (i.e., non-generic).  Consequently a much anticipated “modular transportability of folds” between proteins is highly unlikely.

Metaphors are everything in scientific argumentation.  And Axe’s story of a random search for gem stones dispersed across a vast multi-level desert serves him well for illustrating the improbabilities of a Darwinian search for novel folds.  Axe’s own experience has shown that reluctance to accept his probabilistic argument stems not from some non-scientific point of departure in what he has to say but from deeply held prejudices against the end point that naturally follows.  Rather than a house of cards crumbling on slippery foundations, the case against the neo-Darwinian explanation is an edifice built on a firm substratum of scientific authenticity.  So much so that critics of those who, like Axe, have stood firm in promulgating their case had better take note.

Read Axe’s paper at: http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.1

Further Reading

  1. Michael Denton, Craig Marshall (2001), Laws of form revisited, Nature Volume 410, p. 417
Comments
Petrushka: Sometimes (but not always) your comments fall below any level of credibility:
It is possible to describe crystal formation in the language of physics and chemistry, without reference to the concept of information. It is also possible to describe imperfect crystal formation without reference to the concept of information. It is possible to describe DNA replication without reference to information. DNA can be replicated without reference to its “meaning.” So what I’m asking is what part of the replication or imperfect replication requires reference to the concept of information.
Information is what is replicated. If information were not there, how could you replicate it? So the answer to your question, "what part of the replication or imperfect replication requires reference to the concept of information", is very simple: the information which is replicated. Even a child would easily understand that.
gpuccio
July 13, 2010, 12:57 PM PDT
Recognizing that active functional information was introduced by a designer at specific moments in natural history, or continuously, or with any other modality (which is an issue completely open to empirical analysis) can certainly help us understand how the designer did it and why, at least to a certain degree.
So the basic entailment of ID is that some unspecified agency did some unspecified thing or things at unspecified times and places using unspecified methods for unspecified purposes, and that has made all the difference?
Petrushka
July 13, 2010, 12:57 PM PDT
veilsofmaya: Just to take a momentary rest from molecular biology, here are some comments on some points you raised with PaV:
If there are no constraints, how do you explain this particular rate? You simply have no answer, other than “That’s what the designer happened to have chosen.”, which is a non-answer.
I have never thought that there are no constraints for the designer. The designer acts in a context and according to laws, and that context and those laws are rich in constraints.
It’s unclear how the knowledge that intelligent designer “did it” will get us anywhere more “efficiently” since it doesn’t answer any of these questions I posed. In fact, positing a designer effectively draws a line that claims we cannot hope to understand how the designer did it. It’s a non-explanation.
I absolutely don't agree. Recognizing that active functional information was introduced by a designer at specific moments in natural history, or continuously, or with any other modality (which is an issue completely open to empirical analysis) can certainly help us understand how the designer did it and why, at least to a certain degree. And it can certainly help us understand the nature of the design, and its workings. On the contrary, going on believing that biological information was generated according to a model which is completely wrong can only help us to remain in confusion.
Since ID refuses to address these questions, the fact that we too are designers does not give us any special insight. We still have to figure the answers to these questions for ourselves. Essentially, we’re in the same boat as if there was no designer because we can’t know anything about him/her other than the supposed act of design.
I absolutely disagree that "ID refuses to address these questions". What ID says is that the identity of the designer and the knowledge of the modalities of implementation of the design are not necessary for design detection. And that is absolutely true. But that does not mean that, once the design detection and the ID scenario have been achieved, we cannot go on with further questions: who is the designer, how did he implement design, with which modalities in time and space, and so on. While personal ideas about how far we can find the final answers to those questions may vary, there is no doubt that those questions can be scientifically addressed. And I have always been very clear that I believe that many detailed answers can and will be found.
Finally, an agent could use a process that happens to closely match what neo-Darwinism predicts, with the exception that it was supposedly chosen rather than naturally occurring.
I have always been very clear about what I think of that argument: it is rubbish. If the empirical facts are shown to "closely match what neo-Darwinism predicts", for me ID is falsified. In that case, I would promptly admit that neo-Darwinism is the best explanation, and that no designer hypothesis is necessary.
Again, this is why I’m suggesting that ID is a convoluted elaboration of neo-Darwinism. It attempts to explain away neo-Darwinism, rather than explain what we observe.
Simply not true, also in the light of what I have said before.
gpuccio
July 13, 2010, 12:52 PM PDT
Consilient?
Zach Bailey
July 13, 2010, 12:51 PM PDT
In what way is Darwinism different from Classical materialism?
It's the difference between a conjecture and a theory having fifteen decades of accumulated observation and experiment that are consistent and consilient.
Petrushka
July 13, 2010, 12:50 PM PDT
Cabal: Several thousand years ago Lucretius (following Epicurus) said nature evolved into being through a slow process of accretion and adaptation. To me it looks like some of the ancients said “I don’t see God, so God couldn’t have created nature.” Some thousands of years have got us to “nature did it by a mysterious process called Natural Selection with the assistance of deep time.” In what way is Darwinism different from Classical materialism?
allanius
July 13, 2010, 12:46 PM PDT
It is possible to describe crystal formation in the language of physics and chemistry, without reference to the concept of information. It is also possible to describe imperfect crystal formation without reference to the concept of information. It is possible to describe DNA replication without reference to information. DNA can be replicated without reference to its "meaning." So what I'm asking is what part of the replication or imperfect replication requires reference to the concept of information.
Petrushka
July 13, 2010, 12:45 PM PDT
In the language of physics and chemistry, what exactly is an “error?”
A near, but not exact, replica.
Petrushka
July 13, 2010, 12:38 PM PDT
Replication with heredity REQUIRES an abstraction of the parent.
Replication is chemistry. Copy errors are chemistry. If not, point to the place in the process that is not chemistry. And if it's not chemistry, what exactly is it?
Petrushka
July 13, 2010, 12:36 PM PDT
In the language of physics and chemistry, what exactly is an "error?"
Apollos
July 13, 2010, 12:33 PM PDT
Petrushka, Replication with heredity REQUIRES an abstraction of the parent.
Upright BiPed
July 13, 2010, 12:33 PM PDT
veilsofmaya:
So, even though the meaning of GC content may not have “changed”, its presence does not necessarily indicate that a sequence is protein-coding.
It never has. But it is a good clue.
So, why would they acknowledge that GC-rich sequences have a 50% chance of coding proteins? Why would they take the time to discover the orphans in question have a GC content of 55%, which is higher than the average for the human genome (39%) and similar to protein coding genes in cross-species counterparts (53%)?
Here is what they say: "The orphans have a GC content of 55%, which is much higher than the average for the human genome (39%) and similar to that seen in protein-coding genes with cross-species counterparts (53%). The high-GC content reflects the orphans’ tendency to occur in gene-rich regions. We examined the ORF lengths of the orphans, relative to their GC-content. The orphans have relatively small ORFs (median 393 bp), and the distribution of ORF lengths closely resembles the mathematical expectation for the longest ORF that would arise by chance in a transcript-derived form human genomic DNA with the observed GC-content" IOW, they never acknowledge that the high GC content can simply be a sign that they are protein coding genes. On the contrary, they unilaterally interpret that data in the sense that they are ORFs which arise by chance "in gene-rich regions". That interpretation is the only one given, and there is no discussion of the other interpretation, that they could really be protein coding genes, in that paragraph.
merely being GC-rich is considered insufficient to remain in the catalog.
The fact is that merely being ORFs should be enough to remain in that catalog, unless valid contrary arguments are provided (and they were not).
Obviously, their uniqueness prohibits the use of homology-based methods directly on human orphans themselves because, well, they are human orphans.
That's exactly the point.
However, this in no way prohibits comparing the properties of human orphans to the properties of well studied human genes that do have homology in multiple orthologs.
Again, which properties?
This in no way guarantees failure.
Yes, if you look at the "properties" they considered (see later).
In addition to mice and dog, macaque and chimpanzee, If we included human beings as a potential fourth ortholog for comparison with human orphans, they showed strong differentiation in three out of four cases. However, clearly, no such comparison could be made between human orphans vs human genes that have orthologs in humans as there is no such thing.
???? What does that mean? Please, explain. I cannot comment on something which I simply cannot understand. Just to be clear, I again maintain that the properties they checked (scores, indels, etc.) are all dependent on a comparison between orthologues in different species. Therefore, they cannot tell us anything about genes which, we already know, have no orthologues in other species. I quote here from the supplementary material of the paper: "SI Figure 6 Fig. 6. RFC score and indel patterns. (a) Illustration showing how RFC score is calculated for a pairwise alignment. Species 1 shows a human putative gene sequence in which translation starts in reading frame 0 (that is, codons are read from the first base). Each human base can be assigned as being in codon position 0, 1, or 2. Species 2 shows the orthologous DNA sequence in the mouse genome, aligned to the human sequence with gaps indicated by dashes." Emphasis mine.
Given that no other method was available in this fourth case, a different metric was used, based on the estimated rate of possible deletions and additions necessary for the majority of these orphans to be protein coding from the previous ortholog: chimpanzees. This metric was verified as part of the independent check for published articles. Out of the 1,177 orphans, only 12 were found to have experimental evidence of protein coding.
????? Is it possible to be more obscure?
Even if this fourth case was discarded, human orphans still showed strong differentiation across multiple factors in three out of four cases.
Yes, it is...
gpuccio
July 13, 2010, 12:31 PM PDT
Petro- George Orwell called, he said he wants you to read his book 1984 and do a 5 page book report about thought control and language.
Phaedros
July 13, 2010, 12:23 PM PDT
chemicals do not form encoded abstractions of themselves.
They do, however, replicate themselves, with occasional errors. Words and phrases like "information" and "encoded abstractions" are equivocations, an attempt to prove something by surreptitiously changing definitions. If you wish to argue what chemicals can do or cannot do, demonstrate it in the language of chemistry. What physical process required by evolution violates any established laws of physics or chemistry? Try responding without invoking metaphorical language.
Petrushka
July 13, 2010, 12:10 PM PDT
Cabal, ever since your appearance here, your arguments have always been the cheapest of the cheap. I don't mean cheap in the sense of a "cheap shot", I mean cheap in the sense of having the least amount of intellectual integrity. There have been numerous times where people have tried "in vain" to get you to get off your God complex long enough to hear the evidence for design instead; long enough for you to come face-to-face with the actual issues regarding what is observed through science. It has been futile. Like one of those wind-up robots that mindlessly walks into a wall, then spins around with an unchanged expression and starts walking off in another direction, you never seem to understand or engage a damn thing. You are therefore left to constantly demean and misrepresent the ID argument. After all, that is so much easier than trying to grapple with the fact that chemicals do not form encoded abstractions of themselves.
Upright BiPed
July 13, 2010, 10:04 AM PDT
Pav,
Second, ID tries to explain what we see
So far, I have searched in vain for that explanation. What has ID contributed to our understanding beyond the level of Genesis? To me it looks like the ancients said "This is beyond comprehension, it must be the work of God." Some thousands of years have got us to "This is too complex; it must be the work of a designer." In what way is ID different than classical creationism?
Cabal
July 13, 2010, 08:35 AM PDT
veilsofmaya:
Again, this is why I’m suggesting that ID is a convoluted elaboration of neo-Darwinism. It attempts to explain away neo-Darwinism, rather than explain what we observe.
First, this isn't a logical statement. If ID is trying to explain away neo-Darwinism, then it's anti-neo-Darwinist, not a form of neo-Darwinism. Second, ID tries to explain what we see. My earlier point about population genetics/neo-Darwinism being dead addresses the fact that we are now seeing genetic mechanisms at work that are so sophisticated, with such higher levels of interplay than ever suspected, that PG/ND just can't begin to deal with them. Kimura---BTW, I've read his book on Neutral Theory, have you?---makes it clear that good old fashioned PG can't really cope with the discoveries of the 60's, and nothing I've ever read (and I've read Fisher's work, I've read portions of Sewall Wright's works, Kimura, etc) can come up with any satisfactory explanations of what organisms/cells do, other than, of course, the kinds of adaptive strategies that cells have and which, like bacteria cells switching from lactose to sucrose metabolism, have been documented. But these are almost trivial examples of what cells can actually do. (And, BTW, in a less than year-old study, for bacterial colonies that "switch" their metabolism, contrary to population genetic dogma, the colonies don't become "fixed"; that is, they don't 100% switch over---a small fraction retains the original function. This, again, shows just how little PG can really explain). We're dealing with sophisticated machinery driven by a "software system" that is mind-boggling in its complexity. For example, I talked earlier about RNA editing. It's entirely possible that long stretches of DNA are so constructed that depending on 'where' the editing takes place, entirely different protein/regulatory RNA stretches are produced. This means that it's not the 'codon' structure that's important, but each individual bp. In this case, then, very high conservation of nucleotide bases would be required. And, in the case of ORFs, this is what we see.
Now it is the position of Darwinists that this mind-bogglingly complex machinery came about simply by chance---please don't bother to announce that because of NS the whole Darwinist project is not random, since we know that the replicative demands for the building up of tiny changes to the genome are generally beyond anything that RV+NS can deal with, which is, of course, the whole point of Behe's Edge of Evolution. That ID, in the face of these elaborately complex cellular mechanisms and machines, says that this is the product of design, not of chance, seems to me to be rather sensible. Don't you agree? And this seems to be a much better "[explanation of] what we observe" than neo-Darwinism could ever be. Now you may disagree, but for me, at a personal level, I'm completely convinced that every kind of explanation that neo-Darwinism has given in the past---to which the musings of the evo-devo people will always be firmly attached---is an utter failure, other than, of course, the trivial kinds of adaptations that we see taking place (and I mean 'trivial' in the sense of involving only simple, basic pathways). To me, all of this is now beyond argumentation. And, for me, it's simply a matter of waiting for Darwinists to "throw in the towel". And that's why I say that fifty years hence, they'll be rolling on the floor laughing at what was once considered 'true science'. As an analogy, all of this is like watching 'scientists' examining the remains of a crashed, alien space vessel and claiming the whole time that what they're looking at and investigating can really be explained by natural processes alone. Well, please excuse me if I say: "No, it can't."
PaV
July 13, 2010, 01:51 AM PDT
@Pav (#225) You wrote:
With RNA editing in play, protein sequences are to a degree now disconnected from the genomic sequence we observe. Coupling this with the tremendous level of variation, intraspecifically, that genome wide studies have shown, there is no longer room for what we call “population genetics”. It’s become meaningless.
PaV, that population genetics is only part of the picture is non-controversial. Nor does its incompleteness imply it is "meaningless" any more than the incompleteness of quantum gravity implies that quantum mechanics is "meaningless." Also, could you be referring to the study referenced in this passage from the "Distinguishing protein-coding and noncoding genes in the human genome" paper? "Finally, the creation of more rigorous catalogs of protein-coding genes for human, mouse, and dog will also aid in the creation of catalogs of noncoding transcripts. This should help propel understanding of these fascinating and potentially important RNAs." This strongly suggests one of the benefits of the methodology is to determine which ORFs are not directly related to protein coding so their indirect influence can be studied further. As such, it's unclear how laying the foundation for this sort of study is "meaningless."
So, if you want to ask about “rates”, well that’s just a game that molecular biologists play using the assumption that they really understand what’s happening at the genomic level.
That we have obvious gaps in our understanding of the human genome is non-controversial. No one suggests otherwise. Nor is this the question I asked. Again, unless you're suggesting it all happened instantaneously, there is a rate at which change occurred. And if the paper is correct, then older genes actually did change slower than newer genes. Nor would it be impossible for a designer to intentionally decide to change or create newer genes faster than older genes. So, again, why would we observe a rate that is even remotely close to what neo-Darwinism predicts, rather than, say, 10,000+ all at once? If there are no constraints, how do you explain this particular rate? You simply have no answer, other than "That's what the designer happened to have chosen.", which is a non-answer.
In fifty years, supposing the world is still here, they will look back at the articles written over the last twenty years, and they’ll be rolling on the floor laughing so silly will the thinking appear to them.
And you will somehow be immune from such future review? Again, with the exception of dramatic effect, this is non-controversial. Nor is it unexpected given that the problem space is enormous and the limitations of our current technical abilities. However, this doesn't mean that neo-Darwinism is dead. Instead, it may mean we gain a better understanding of the mechanisms behind it.
As I say, population genetics, and with it, neo-Darwinism, is dead. Right now it’s no more than an amusing pastime.
I'm not making a positive argument here. Instead, the question I'm asking on this thread is, are the claims made actually supported by the papers you cited? I don't see it. It's not clear that population genetics is "dead", whatever that means. Nor is it clear it would be the death of neo-Darwinism. This simply doesn't follow.
As to ID “explaining these observations”, you seem to infer that neo-Darwinism can explain them. Well it can’t…..
You're the one making the assumption by making the claim. Again, for any particular recent or forthcoming discovery to kill Darwinism it would also require the absence of corresponding recent or forthcoming discoveries that explain them. Your claim clearly suggests you have some specific reason to think the former will appear while the latter will not. What is this reason?
just like ID can’t explain it. But ID can point future research in the right direction much more efficiently than neo-D can, and that’s its importance.
It's unclear how the knowledge that intelligent designer "did it" will get us anywhere more "efficiently" since it doesn't answer any of these questions I posed. In fact, positing a designer effectively draws a line that claims we cannot hope to understand how the designer did it. It's a non-explanation. Since ID refuses to address these questions, the fact that we too are designers does not give us any special insight. We still have to figure the answers to these questions for ourselves. Essentially, we're in the same boat as if there was no designer because we can't know anything about him/her other than the supposed act of design. Finally, an agent could use a process that happens to closely match what neo-Darwinism predicts, with the exception that it was supposedly chosen rather than naturally occurring. Again, this is why I'm suggesting that ID is a convoluted elaboration of neo-Darwinism. It attempts to explain away neo-Darwinism, rather than explain what we observe.
veilsofmaya
July 12, 2010, 09:48 PM PDT
@Gupccio (#226) you wrote:
And it is not true that “now we know”: knowledge about the meaning of CG content has not changed, as far as I know.
Gupccio, While the CG-rich sequences represent a possibility for protein coding, there are a number of other factors present. As our understanding of these factors changes, so does the resulting probability that any GC-rich sequence may be protein coding. So, even though the meaning of GC content may not have "changed", it's presence does not necessary indicate that a sequence is protein-coding.
The authors of the paper have simply unilaterally chosen not to consider it a as a valid point against their assumptions, not even at the level of discussion. That’s serious methodological bias.
So, why would they acknowledge that GC-rich sequences have a 50% change of coding proteins? Why would they take the time to discover the orphans in question have a GC content of 55%, which is higher than the average for the human genome (39%) and similar to protein coding genes in cross-species counterparts (53%)? Clearly, this is part of the discussion and was considered while developing the methodology. However, the aspect being discussed is that merely being GC-rich is considered insufficient to remain in the catalog.
Instead, they analyze in detail the results of two scores which are obviously calculated in relation to the presence of homologues in other species, and which are therefore not applicable to a set of ORFs which by definition are human orphans.
To quote the paper… Characterizing the Orphans. We characterized the properties of the orphans to see whether they resemble those seen for protein-coding genes or expected for randoms ORFs arising in noncoding transcripts. In addition to conserved properties, RFC scores and codon substitution frequency, ORF lengths were examined, which took into consideration GC content. Human orphans had the opportunity to match the characteristics of well studied human genes with orthologs of the dog and mouse, macaque and chimpanzee, but did not. It's unclear how this failed opportunity is not applicable to human orphans or how failure was somehow guaranteed.
Can you understand that verb: prohibits? That’s not because of any authority, but because the basic principles of methodology prohibit to use a score which is not a appropriate to the subject we are studying: in this case, the two scores applied were bound to give a negative result for truly orphan genes, simply because they are orphans.
Obviously, their uniqueness prohibits the use homology-based methods directly on human orphans themselves because, well, they are human orphans. However, this in now way prohibits comparing the properties of human orphans to the properties of well studied human genes that do have homology in multiple orthologs. This in no way guarantees failure. In addition to mice and dog, macaque and chimpanzee, If we included human beings as a potential fourth ortholog for comparison with human orphans, they showed strong differentiation in three out of four cases. However, clearly, no such comparison could be made between human orphans vs human genes that have orthologs in humans as there is no such thing. Given that no other method was available in this fourth case, a different metric was used, based on the estimated rate of possible deletions and additions necessary for the majority of these orphans to be protein coding from the previous ortholog: chimpanzees. This metric was verified as part of the independent check for published articles. Out of the 1,177 orphans, only 12 were found to have experimental evidence of protein coding. . Even if this fourth case was discarded, human orphans still showed strong differentiation across multiple factors in three out of four cases.veilsofmaya
July 12, 2010, 09:36 PM PDT
veilsofmaya (#221): I am happy that you took the time to clarify your reading of the paper. As I understand it, ORFs have the potential to be protein coding. Orphans are ORFs that are assumed to code proteins or have experimental evidence of coding proteins, but are not found in other species. That's correct. The question is, should specific ORFs be considered part of the human genome given we now know being GC-rich alone is not necessarily a good indication of protein coding. In other words, are they really orphan genes or ORFs that do not code proteins. The question is legitimate, but it needs legitimate answers. GC content is a very good clue to the protein coding nature of an ORF. Obviously, it is not in itself conclusive (nothing in itself is conclusive except for the real demonstration of the protein, but as I said that is available only for part of the ORFs in the various databases, and still the other ORFs are retained as ORFs, and counted as genes, until some specific contrary evidence is found). And it is not true that "now we know": knowledge about the meaning of GC content has not changed, as far as I know. The authors of the paper have simply unilaterally chosen not to consider it as a valid point against their assumptions, not even at the level of discussion. That's serious methodological bias. Instead, they analyze in detail the results of two scores which are obviously calculated in relation to the presence of homologues in other species, and which are therefore not applicable to a set of ORFs which by definition are human orphans. That was one of my points, and it is made very clearly in the project description quoted by PaV: "Moreover, the uniqueness of ORFan genes prohibits use of any of homology-based methods that have traditionally been employed to establish gene function." Can you understand that verb: prohibits?
That's not because of any authority, but because the basic principles of methodology prohibit the use of a score which is not appropriate to the subject we are studying: in this case, the two scores applied were bound to give a negative result for truly orphan genes, simply because they are orphans. This is another serious methodological bias. In comparison, GC content can be considered an unbiased estimator of the protein coding nature of an orphan ORF, because it has nothing to do with homologues in other species. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation; the alternative hypothesis is that these ORFs are valid human genes that reflect gene innovation in the primate lineage or gene loss in other lineages. You keep quoting this paragraph from the paper without really understanding what they are saying. The key concept here is in the primate lineage. IOW, they are happy to admit that an ORF is a valid human gene provided that it has at least some homologues in the primate lineage, while they reject as "wholly implausible" that new genes may have arisen in the human species alone. But the only correct, unbiased statement should be the following: "However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation in any species; lack of evolutionary conservation, even in primates, is not justification for excluding ORFs, unless one assumes as absolute truth one's own expectation from one's own theory about how genes arise." While orphans were studied as a group, it’s the specific characteristics they exhibit which form the basis for the proposed methodology. If you only read the summary, I can see how you might conclude orphans are being singled out for merely being orphans. However, in the case of this paper, the use of the term ‘orphans’ is referring to a group of ORFs with specific properties.
That's simply not true. I have carefully read the whole paper, and I can't see any "specific property" of this set of human orphans which is presented as justification for excluding them from the list of ORFs which are retained as possible protein coding genes, except for: a) two scores which, as previously said, are not appropriate to the question; b) the simple fact that they have no homologues in primates; c) the indirect fact that proteins are not known, which is true of many of the ORFs retained in the list, and which is to be expected for genes which have been found only recently, and about which no literature has accumulated; d) the GC content, which is in favour of their protein coding nature. So, to which "specific properties" are you referring? Finally, the paper clearly indicates that removal would help create a catalog of non-coding ORFs for further study. That's simply hypocritical. As I have shown in my post #219, the immediate result of the paper is that those ORFs will as a rule no longer be considered in the general research about human protein coding genes.
gpuccio
July 12, 2010, 09:05 AM PDT
Dear Veils: Neo-Darwinism is meaningless, passé, done with. I've been reading up on ORFans, and this gets you into what's called RNA editing, which occurs on the pre-mRNA strands. With RNA editing in play, protein sequences are to a degree now disconnected from the genomic sequence we observe. Coupling this with the tremendous level of intraspecific variation that genome-wide studies have shown, there is no longer room for what we call "population genetics". It's become meaningless. So, if you want to ask about "rates", well that's just a game that molecular biologists play using the assumption that they really understand what's happening at the genomic level. This just isn't so. Older, younger genes; faster, slower rates. All this means is that if you take all the mutations found in some genomic line, and divide that number by the time since it split off from its last known ancestor, it turns out to be higher/lower than average. Garbage in; garbage out. In fifty years, supposing the world is still here, they will look back at the articles written over the last twenty years, and they'll be rolling on the floor laughing, so silly will the thinking appear to them. When you have RNA editing that allows insertions and deletions, which can give rise to reading frame shifts, and that can convert certain bases into others---and, at select spots---and when this editing is extremely important (as in the proper functioning of the human brain), then we're looking at a functional system that surpasses our ability to grasp---at least for now. As I say, population genetics, and with it, neo-Darwinism, is dead. Right now it's no more than an amusing pastime. As to ID "explaining these observations", you seem to imply that neo-Darwinism can explain them. Well it can't, just like ID can't explain it. But ID can point future research in the right direction much more efficiently than neo-D can, and that's its importance.
PaV
July 12, 2010, 05:16 AM PDT
@PaV (#220) You wrote:
Wouldn’t the sensible thing to do be to test these human orphan genes for function? They don’t seem to be inclined to do anything like that.
PaV, wouldn't that be the sensible thing for ID to do as well? Yet, nearly all of the research cited here is published by scientists who are supposedly hiding things. This includes the experimental evidence for protein coding in 15 known orphan genes in the papers cited. How is this possible if they are not "inclined to do anything like that?" Furthermore, when will ID explain why the designer chose to mutate genes at just the specific rate we observe? After all, if the designer actually chooses a particular rate, any rate could have been selected, including changing tens of thousands of genes all at once. Why would the level of mutation be even remotely close to what Darwinism predicts? How does ID explain this? When will ID explain the method the designer used to determine which genes to change and the specific order to change them? Clearly, such information would be incredibly useful in a wide range of applications, from designing organisms to clean up oil spills, create new energy sources, synthesize drugs, etc. Also, how did the designer change just the right genes while leaving the rest completely unchanged? Surely, such information would be incredibly useful in gene therapy, DNA repair, targeting specific kinds of diseases such as cancer, etc. Wouldn't explaining these observations be the sensible thing for ID to do? However, ID "doesn't seem to be inclined to do anything like that." In fact, as I've claimed before, this lack of explanation is why ID is a convoluted elaboration of Darwinism.
veilsofmaya
July 11, 2010, 10:29 PM PDT
@PaV (#217) Here, it looks like you've found something closer to the actual paper we've been discussing (in the realm of primates, rather than bacteria) which you think supports the demise of Darwinism. You wrote:
Reading through this thread, I get the feeling that this is the dirty little secret that Darwinists don’t want to talk about, so devastating to their theory is it. The above paper talks about TE’s and gene duplications as possible mechanisms. But there remain {at least} about 15 genes which appear to have developed de novo.
First, mutations at a higher rate than predicted are not "devastating" for Darwinism, as our predictions are based on the specific mechanisms we currently know of and our specific understanding of how they are applied. What would have been "devastating" was the discovery that each organism had its own form of DNA based on completely different molecules. This is because we had yet to discover DNA or the role it plays in evolution when Darwinism was formulated. Second, would this "secret" include the experimental evidence of protein coding for 12 of the orphans I've repeatedly mentioned in this thread? Is repeating or publishing findings what you'd expect someone to do if they wanted to keep them a "secret?" Finally, as for the paper you referenced, the information it contains represents knowledge we had yet to gain. Specifically, at some point in the past, we did not know younger genes seem to mutate faster than older genes. However, the very same process you claim has supposedly put Darwinism "up a creek without a paddle" could turn around and provide Darwinism paddles in spades. This is the nature of discovery. You must assume that we have the ability to discover that newer genes mutate faster yet lack the ability to explain why. In other words, you must assume explanations do not exist or they cannot be discovered. On what basis have you reached this conclusion?
veilsofmaya
July 11, 2010, 10:24 PM PDT
@PaV (#212) PaV, you've only quoted the summary of the NIH project on orphans. This leaves several open questions. - What methodology was used to identify these ORFs as orphans? - Was there any overlap of the orphans in this project and the paper behind the article Bornagain77 referenced? - What properties did these specific orphans exhibit which indicated they were likely to code proteins and are they present in all orphans, including those in the human genome? This information is absent from the summary of the project. As such, it's unclear if the project has any bearing on the conclusions reached by the "Distinguishing protein-coding and noncoding genes in the human genome" paper, or is a representation of orphans as a whole. For example, are the rates of evolution in bacteria the same as in human beings or other species? In other words, even if a vast majority of orphans in bacteria have been determined to be protein coding (by some means absent in the summary) this does not mean that the vast majority of orphans as a whole are protein coding.
I think this rather seals the deal for Darwin’s demise.
While I can see how you might assume this is the case, it appears to be just that: an assumption. Of course, despite being just a summary, this doesn't seem to have prevented you from referencing it anyway.
veilsofmaya
July 11, 2010, 10:22 PM PDT
@gpuccio (#208) gpuccio, I apologize if I've put words in your mouth. As I understand it, ORFs have the potential to be protein coding. Orphans are ORFs that are assumed to code proteins or have experimental evidence of coding proteins, but are not found in other species. The question is, should specific ORFs be considered part of the human genome given we now know being GC-rich alone is not necessarily a good indication of protein coding. In other words, are they really orphan genes or ORFs that do not code proteins. Now, to the quote I referenced, expanded to clarify.
But positive experimental evidence of protein coding is lacking for a lot of ORFs, human and not human. If researchers had to follow the criteria you suggest, the databases of genes should be drastically reduced!
As I've mentioned in this thread, I'm not making a positive argument. Instead, I'm suggesting that the claims made are NOT supported by the specific paper cited. So, when I'm referring to ORFs that lack experimental evidence, I'm referring to orphans which are non-conserved because this is what the paper specifically indicates. Perhaps you're suggesting that allowing conserved ORFs that do not have positive experimental evidence to remain is the "problem?" But this seems unlikely given that the paper is clear regarding this issue. So what other reason do you have? To quote the paper… However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation; the alternative hypothesis is that these ORFs are valid human genes that reflect gene innovation in the primate lineage or gene loss in other lineages. While orphans were studied as a group, it's the specific characteristics they exhibit which form the basis for the proposed methodology. If you only read the summary, I can see how you might conclude orphans are being singled out for merely being orphans. However, in the case of this paper, the use of the term 'orphans' is referring to a group of ORFs with specific properties. Remember, my objection was to Bornagain77's conclusions based on the ScienceDaily article and additional articles which were not related. This is precisely the kind of assumption I've been referring to, which is visible on this thread and others. In regards to the birth and death rate between chimpanzees and humans, this was a hypothesis which appears to be supported by other experimental evidence in published literature. Nor would the resulting methodology prevent new experimental evidence from including these ORFs in the future. Finally, the paper clearly indicates that removal would help create a catalog of non-coding ORFs for further study. Again, no positive argument here.
I'm only noting that the papers in question do not support the claims made. In fact, they suggest otherwise.
veilsofmaya
July 11, 2010, 10:20 PM PDT
gpuccio: As I say, it's their dirty little secret. Wouldn't the sensible thing to do be to test these human orphan genes for function? They don't seem to be inclined to do anything like that.
PaV
July 10, 2010, 08:02 PM PDT
PaV: About how much a previous paper can influence future research: From the paper you cite at #217: "It has recently been argued that most of the annotated human orphan proteins are likely to be spurious ORFs that are not functional (Clamp et al. 2007). Here we only considered human gene products that showed significant similarity to putative macaque and chimpanzee proteins and, with this data set, we reached quite different conclusions regarding the possible functionality of orphan genes." IOW, in this paper the 1000+ purely human orphans were not considered. So, the Clamp paper has already influenced the research approach of others, and will continue to do so.
gpuccio
July 10, 2010, 03:48 PM PDT
One final note, full disclosure: as I was looking around at papers, I noticed that Howard Ochman first made the claim about E. coli function in a 2004 paper---although the part about still being under selective constraint was not.
PaV
July 10, 2010, 12:02 PM PDT
Here's another paper: Origin of Primate Orphan Genes: A Comparative Genomics Approach http://mbe.oxfordjournals.org/cgi/reprint/26/3/603 The Wikipedia site on orphan genes has almost nothing: two small entries, maybe five sentences altogether. Reading through this thread, I get the feeling that this is the dirty little secret that Darwinists don't want to talk about, so devastating to their theory is it. The above paper talks about TE's and gene duplications as possible mechanisms. But there remain {at least} about 15 genes which appear to have developed de novo. Well, there's this other paper: Relaxed Purifying Selection and Possibly High Rate of Adaptation in Primate Lineage-Specific Genes [http://gbe.oxfordjournals.org/cgi/reprint/2/0/393] that wants to find some kind of answer to how these orphan genes arose de novo: was it a diminished purifying NS---that is, NS leaving those nasty, or not so nasty, mutations alone? But this doesn't appear to be the case. They rule it out. So, Option Two: positive NS. They ran two tests for positive NS; one showed no positive NS taking place, and the other test showed a little---but there were questionable elements to it. So, basically, they're up a creek without a paddle. Here are these de novo genes, and there's only a hint that NS is acting differently in the origination process. And even if it were shown that positive selection is involved, it would have to be taking place at a rate 4 times normal---how then would they explain that? (And, of course, there's then Nachman's Paradox and Haldane's Dilemma to contend with on a gargantuan scale.) So they are definitely in trouble here, and I don't think they want to talk about it publicly. Again, it's their dirty little secret.
Finally, as a general commentary, this appearance of distinct species-specific gene structure reminds me of the rise of the Neutral Theory, Kimura's attempt to deal with the immense polymorphism found by the electrophoresis studies of the 60's, none of which was "predicted" by neo-Darwinism. As we know, the Neutral Theory really ended up being "non-Darwinian", and was attacked as such. Here we go again. Molecular biology providing us with information that is completely in discord with dominant neo-Darwinian theory---AND, the "gene duplication with neutral drift" scenario that is so comfortable to the boys---and this time around they are carefully trying to shove it under the rug. But then there's Doug Axe, and his devastating analysis. One would hope that some time soon a bold group of biologists would say, "Enough is enough!" Alas.
PaV
July 10, 2010, 11:50 AM PDT
PaV, thanks for finding the paper directly implicating functionality for ORFans. Amazing what a little light can do. 8)
bornagain77
July 10, 2010, 09:29 AM PDT