Uncommon Descent Serving The Intelligent Design Community

Proteins Fold As Darwin Crumbles


A review of "The Case Against a Darwinian Origin of Protein Folds" by Douglas Axe, Bio-Complexity, Issue 1, pp. 1-12

Proteins adopt higher-order structures (e.g., alpha helices and beta sheets) that define their functional domains.  Years ago Michael Denton and Craig Marshall reviewed this higher structural order in proteins and proposed that protein folding patterns can be classified into a finite number of discrete families whose construction might be constrained by a set of underlying natural laws (1).  In his latest critique, Biologic Institute molecular biologist Douglas Axe raises the ever-pertinent question of whether Darwinian evolution can adequately explain the origin of protein folds, given the vast search space of possible sequence combinations for a moderately large protein of, say, 300 amino acids.  Axe begins by introducing his readers to the sampling problem: given the postulated maximum number of distinct physical events that could have occurred since the universe began (10^150), we cannot assume that evolution has had enough time to search the 10^390 possible amino-acid combinations of a 300-amino-acid protein.

The battle cry often heard in response to this apparently insurmountable barricade is that even though probabilistic resources would not allow a blind search to stumble upon any given protein sequence, the chances of finding a particular protein function might be considerably better.  Against such a facile dismissal, Axe observes that proteins must meet very stringent sequence requirements if a given function is to be attained.  And size matters: enzymes, for example, are large in comparison to their substrates, and protein structuralists have shown that size is crucial for ensuring the stability of protein architecture.

Axe raises the bar of the discussion by pointing out that enzyme catalytic functions very often depend on more than just their core active sites.  In fact, enzymes almost invariably contain regions that prep, channel and orient their substrates, as well as a multiplicity of co-factors, in readiness for catalysis.  Carbamoyl phosphate synthetase (CPS) and the proton-translocating ATP synthase stand out as favorites amongst molecular biologists for showing how enzyme complexes are capable of simultaneously coordinating such processes.  Each of these complexes contains 1,400-2,000 amino acid residues distributed amongst several proteins, all of which are required for activity.

Axe employs a relatively straightforward mathematical rationale for assessing the plausibility of finding novel protein functions through a Darwinian search.  Using bacteria as his model system (chosen because of their relatively large population sizes), he shows how a culture of 10^10 bacteria passing through 10^4 generations per year over five billion years would produce a maximum of 5×10^23 novel genotypes.  This number represents the upper bound on the number of new protein sequences, since many of the differences in genotype would not generate "distinctly new proteins".  Extending this further, a novel protein function requiring a 300-amino-acid sequence (20^300, or about 10^390, possible sequences) would have to be achievable in roughly 10^366 different ways (20^300 / 5×10^23) for such a search to stand a realistic chance of finding it.
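
The arithmetic in this paragraph is simple enough to check directly.  Below is a minimal sketch in Python (whose arbitrary-precision integers keep the numbers exact); the figures used (10^10 bacteria, 10^4 generations per year, five billion years, a 300-residue protein over a 20 amino-acid alphabet) are the ones quoted above:

```python
# Back-of-the-envelope check of the sampling argument summarized above.
# All figures are taken from the text, not from the paper itself.

population = 10**10           # bacteria in the culture
generations_per_year = 10**4
years = 5 * 10**9

# Upper bound on distinct genotypes the culture could ever sample.
max_trials = population * generations_per_year * years
assert max_trials == 5 * 10**23

# Size of the sequence space for a 300-amino-acid protein.
sequence_space = 20**300                      # about 2.0 x 10^390

# For a blind search to have a realistic chance, the function would need
# to be realizable in roughly sequence_space / max_trials distinct ways.
ways_required = sequence_space // max_trials  # about 4.1 x 10^366

print("trials upper bound: 5e23")
print("sequence space: ~1e%d" % (len(str(sequence_space)) - 1))   # ~1e390
print("required redundancy: ~1e%d" % (len(str(ways_required)) - 1))  # ~1e366
```

The point the computation makes concrete is that the bound on trials (5×10^23) is smaller than the sequence space by more than 366 orders of magnitude, which is exactly the redundancy a function would need for a random sample of that size to hit it.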

Ultimately we find that proteins do not tolerate this extraordinary level of "sequence indifference".  High-profile mutagenesis experiments on beta-lactamases and bacterial ribonucleases have shown that functionality is decisively eradicated when a mere 10% of amino acids are substituted in conserved regions of these proteins.  A more in-depth breakdown of data from a beta-lactamase domain and the enzyme chorismate mutase has further reinforced the conclusion that very few protein sequences can actually perform a desired function; so few, in fact, that they are "far too rare to be found by random sampling".

But Axe's evaluation does not end here.  He further considers the possibility that disparate protein functions might share similar amino-acid identities, and that the jump between functions in sequence space might therefore be realistically achievable through random searches.  Sequence alignment studies between different protein domains do not support such an escape from the sampling problem.  While the identification of a single-amino-acid conformational switch has been heralded in the peer-reviewed literature as a convincing example of how changes in folding can occur with minimal adjustments to sequence, the resulting conformational variants are unstable at physiological temperatures.  Moreover, such a change has only been achieved in vitro and most probably does not meet the rigorous demands for functionality that play out in a true biological context.  What we also find is that 21 other amino-acid substitutions must be in place before the conformational switch is observed.

Axe closes his dismantling of protein evolution by exposing the shortcomings of modular assembly models that purport to explain the origin of new protein folds.  The highly cooperative nature of structural folds in any given protein means that stable structures tend to form all at once at the domain (tertiary structure) level rather than at the fold (secondary structure) level of the protein.  Context is everything.  Indeed, experiments have upheld the assertion that binding interfaces between different forms of secondary structure are sequence-dependent (i.e., non-generic).  Consequently, the much-anticipated "modular transportability of folds" between proteins is highly unlikely.

Metaphors matter in scientific argumentation, and Axe's story of a random search for gemstones dispersed across a vast multi-level desert serves him well for illustrating the improbability of a Darwinian search for novel folds.  Axe's own experience has shown that reticence towards accepting his probabilistic argument stems not from some non-scientific point of departure in what he has to say, but from deeply held prejudices against the end point that naturally follows.  Far from being a house of cards on slippery foundations, the case against the neo-Darwinian explanation is an edifice built on a firm substratum of scientific authenticity.  So much so that critics of those who, like Axe, have stood firm in promulgating their case had better take note.

Read Axe’s paper at: http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.1

Further Reading

  1. Michael Denton, Craig Marshall (2001), Laws of form revisited, Nature Volume 410, p. 417
Comments
This thread, and its implications, has been over the top. And now, with this latest post from PAV, it's simply astounding. Thanks GP and BA (and Petrush and Veils) for talking it through. Very illuminating.
Upright BiPed
July 10, 2010, 08:40 AM PDT
PaV: Thank you indeed for this very pertinent contribution, which very clearly confirms the points BA and I were trying to make. I hope veilsofmaya may appreciate it too :)
gpuccio
July 10, 2010, 07:43 AM PDT
The person submitting the project was Howard Ochman, at the University of Arizona.
PaV
July 10, 2010, 07:35 AM PDT
I've been following this discussion along these last few days. Szostak's article appears relatively old---2000, I think. Here's a proposal to the NIH for a project involving ORFans. It was submitted this year:

DESCRIPTION (provided by applicant): The majority of genes in bacterial genomes, even in species for which extensive experimental evidence is available, are of hypothetical or unknown functions. The aims of this proposal are to investigate this enigmatic class of genes by elucidating the source and functions of "ORFans", i.e., sequences within a genome that encode proteins having no homology (and often no structural similarity) to proteins in any other genome. Moreover, the uniqueness of ORFan genes prohibits use of any of homology-based methods that have traditionally been employed to establish gene function. Thus, these genes present a major challenge to discovering their roles in bacterial genomes. In many respects, these genes constitute the most intriguing portion of bacterial genomes because they give clue to how new genes originate, and likely contribute to the remarkable diversification and adaptation of bacteria. Although it has been hypothesized that ORFans might represent non-coding regions rather than actual genes, we have recently established that the vast majority that ORFans present in the E. coli genome are under selective constraints and encode functional proteins. By combining experimental and bioinformatic approaches, the present proposal will analyze the origins, functions and structural properties of ORFans, and how they have assumed key roles in cellular function.

I think this rather seals the deal for Darwin's demise.
PaV
July 10, 2010, 07:33 AM PDT
Petrushka, you state: "It's the difference between a targeted search, which evolution isn't, and exploitation of fitness gradients, without targets."

That's the theory, yet the evidence says that even "without targets" evolution goes downhill.

Reductive Evolution Can Prevent Populations from Taking Simple Adaptive Paths to High Fitness - May 2010
Excerpt: Despite the theoretical existence of this short adaptive path to high fitness, multiple independent lines grown in tryptophan-limiting liquid culture failed to take it. Instead, cells consistently acquired mutations that reduced expression of the double-mutant trpA gene. Our results show that competition between reductive and constructive paths may significantly decrease the likelihood that a particular constructive path will be taken.
http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.2

Testing Evolution in the Lab With Biologic Institute's Ann Gauger - audio
http://www.idthefuture.com/2010/05/testing_evolution_in_the_lab_w.html
bornagain77
July 10, 2010, 06:24 AM PDT
gpuccio, here is an excellent article on protein folding (chaperonins) that just came out on Crevo:

Proteins Fold Who Knows How
Excerpt: New work published in Cell shows that this "chaperone" device speeds up the proper folding of the polypeptide when it otherwise might get stuck on a "kinetic trap." A German team likened the assistance to narrowing the entropic funnel. "The capacity to rescue proteins from such folding traps may explain the uniquely essential role of chaperonin cages within the cellular chaperone network," they said. GroEL+GroES therefore "rescues" protein that otherwise might misfold and cause damage to the cell. The GroEL barrel and its GroES cap spend 7 ATP energy molecules opening and closing. The process can work in reverse, taking a misfolded protein and unfolding it as well. It might take several rounds for a complex protein to reach its native fold. These chaperonins operate in bacteria as well as higher organisms - and they are not the only chaperones. "Bacterial cells generally contain multiple, partly redundant chaperone systems that function in preventing the aggregation of newly synthesized and stress-denatured proteins," the authors said. "In contrast to all other components of this chaperone network, the chaperonin, GroEL, and its cofactor, GroES, are uniquely essential, forming a specialized nano-compartment for single protein molecules to fold in isolation."
http://www.creationsafaris.com/crev201007.htm#20100709a

So gpuccio, on top of multiple layers of error correction that prevent "random changes" from occurring in the DNA and amino acid sequences in the first place, we now have redundant layers of folding machines preventing the proteins from exploring "random" structures as well. And just how is "the fact" of evolution supposed to occur if it is prevented from occurring in the first place?
bornagain77
July 10, 2010, 05:52 AM PDT
Can’t you see the difference? 1 is completely different from 2, and hugely less powerful.
That's the question I'd address. In actual living things, natural selection addresses all changes to all parts of the genomes of all individuals in parallel. Artificial selection essentially waits for some specific advantage to appear. It's the difference between a targeted search, which evolution isn't, and exploitation of fitness gradients, without targets.
Petrushka
July 9, 2010, 09:39 PM PDT
veilsofmaya: excuse me, but I believe your confusion is huge. I cannot follow you any more on that line. I apologize. Just one example:
"But positive experimental evidence of protein coding is lacking for a lot of ORFs, human and not human." Which is the precisely the point of the paper. Again, you seem to suggest that we should assume these orphans code proteins because you believe they are “designed” rather than based on experimental evidence. Merely being GC-rich is not a clear indicator of coding proteins.
The point of the paper?? I am afraid you are confusing ORFs with orphans. I am saying that "positive experimental evidence of protein coding is lacking for a lot of ORFs, human and not human", and you answer that I "seem to suggest that we should assume these orphans code proteins because I believe they are 'designed'"!!! This is a complete non sequitur. All the best.
gpuccio
July 9, 2010, 01:12 PM PDT
@gpuccio (#204) The point I've been making on this entire thread is that the options or conclusions presented are not supported by the papers cited. If you have an opinion, that's fine. However, as I've illustrated, "hate" for orphan ORFs, that genes were actually removed using this particular methodology alone, or that removal would prevent further study, is clearly not evident in this paper. Furthermore, papers have been cited as providing support despite not actually being directly related. That they are "interesting" does not mean they support your position. You wrote:
I never said anything critical about those passages,
You implied the research paper was merely a tautology, which it's clearly not. Again to quote the paper… Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes. These passages of the study showed the specific properties of non-conserved ORFs identified using this particular methodology matched those of random controls. This was in strong contrast to conserved ORFs.
I am afraid I cannot even start to understand what you are saying here. Why “it’s no longer obvious that they are new”? They are new. They have no known homologue. That’s why they are called orphans. Nobody, not even the paper’s authors, denies that.
I'm afraid that you're not trying very hard. They are obviously new ORFs, but not necessarily genes that code new proteins, which is the subject of the entire paper.
There, the simple argument of their being orphans in just the human species, not allowing enough darwinian time for their evolution, has been considered sufficient. Which is the reason for my criticism.
But what is this criticism based on? Are you assuming they are new protein-encoding genes because you're assuming they were "designed"? This is not evident. The paper is not saying these orphans cannot be found to be protein coding in some explicit way. Again, this is the topic of other research projects.
And in what sense “what we know about genes has changed significantly since they were added”? I am not aware of any such change.
See references (1-3) from the paper.
“That any of these ORFs are new genes is the goal of some other research project”. What does that mean? ORFs are potentially considered genes, until differently proven.
Other research projects are how 12 of the 1,000 or so ORFs in question were determined to be protein coding. The goal of this project was to set a baseline on which other research could build. Nor would the application of this methodology exclude such discoveries.
But positive experimental evidence of protein coding is lacking for a lot of ORFs, human and not human.
Which is precisely the point of the paper. Again, you seem to suggest that we should assume these orphans code proteins because you believe they are "designed" rather than based on experimental evidence. Merely being GC-rich is not a clear indicator of coding proteins.
If researchers had to follow the criteria you suggest, the databases of genes should be drastically reduced!
If the methodology was followed, the number of genes would be:
A. Reduced due to attrition, not merely removal of orphans. The database would be more accurate.
B. Experimental evidence that showed an orphan was protein coding would immediately cause it to be reinstated.
Again, this is setting a baseline by which all positive data would then be applied. Instead, you seem to imply it would somehow permanently ban these orphans from being added, which is clearly not the case.
veilsofmaya
July 9, 2010, 12:44 PM PDT
Petrushka: What do you mean? Please, be less cryptic, and sometimes spend a few more words to clarify your thoughts! The random variation induced by mutagenic PCR is random (targeted, but random). The intelligent selection is intelligent selection. RV + IS = intelligent protein engineering. (What do you mean by "tainted"?)

I had already stated all that, and you had already read it. I paste it here again, in case you are too lazy to go back to my post:

"So, where is the problem? The problem is that 'random mutations coupled to selection' is bottom up protein engineering. It is directed engineering. And what is 'directed'? The mutations are not 'directed', although I could argue that they are in some way 'targeted' (a choice is made about how to produce the mutations, in how many rounds, with what rate of mutation, and so on). But that is a minor point. I can agree that the induced mutations can to some degree be considered a simulation of 'natural' RV. The major point is about the selection. This is intelligent selection, not natural selection. Therefore, this is directed engineering, and not spontaneous evolution, nor is it in any way a correct simulation of it. I am really amazed that such a striking difference seems so elusive to many intelligent people. I will try to give my explicit definitions, to better make my point:

1) Natural selection: any system with replicators where, after some replicator through various means has developed some new 'property', such property is expanded in the population as a consequence of the spontaneous improvement in replication that the new property confers.

2) Intelligent selection: any system with replicators where, after some replicator through various means has developed some new 'property', the property is actively recognized, measured, selected and expanded in the population by the system itself, independently from any objective advantage conferred by the property to the replicator.

Can't you see the difference? 1 is completely different from 2, and hugely less powerful. 2 is intelligent selection: the function 'does not act'; it is only recognized, because the system has been set up to recognize it. ATP binding in the Szostak system is recognized by a purification system engineered to recognize it, but it does not confer any advantage in replication (in this case, the PCR replication system). Intelligent selection is very efficient in bottom up engineering. It can easily recognize a desired property even in minimal form, and amplify it, and develop it through rounds of mutation and selection. NS can do nothing like that, unless a truly functional property in a truly living context is first built, and unless it is capable of conferring a reproductive advantage. That's why both protein 18-19 and protein DX are the product of intelligent protein engineering."
gpuccio
July 9, 2010, 08:28 AM PDT
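
The two selection schemes defined in the comment above can be caricatured in code. The toy simulation below is purely illustrative (the population sizes, threshold, and mutation scale are invented, not drawn from any real experiment): under "intelligent selection" the experimenter measures the property directly and amplifies the top replicators every round, while under the "natural selection" scheme a replicator gains a reproductive edge only once the property crosses a threshold where it actually pays off in replication.

```python
import random

# Each "replicator" is just a number standing in for some measurable
# property. This is a sketch of the two definitions above, nothing more.

def intelligent_selection(pop, rounds=20):
    # The experimenter measures the property directly and keeps the top 10%
    # each round, regardless of whether it helps replication at all.
    for _ in range(rounds):
        pop.sort(reverse=True)
        survivors = pop[: max(1, len(pop) // 10)]
        pop = [x + random.gauss(0, 0.1) for x in survivors * 10]
    return max(pop)

def natural_selection(pop, threshold=5.0, rounds=20):
    # A replicator reproduces better only once the property is strong enough
    # to matter in context; below the threshold it is invisible to selection.
    for _ in range(rounds):
        next_gen = []
        for x in pop:
            offspring = 2 if x > threshold else 1
            next_gen += [x + random.gauss(0, 0.1) for _ in range(offspring)]
        pop = next_gen[:1000]  # cap the population size
    return max(pop)

random.seed(0)
start = [random.gauss(0, 1) for _ in range(1000)]
best_is = intelligent_selection(list(start))
best_ns = natural_selection(list(start))
print(best_is, best_ns)
```

The contrast it illustrates: the first scheme ratchets upward every round because the measurement itself does the selecting, while the second does essentially nothing until the property is already strong enough to confer a reproductive advantage.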
Now, stop kidding. We have better things to do.
The alternative is that you consider artificial selection somehow tainted. I suppose that's to maintain the assertion that the result is intelligently designed.
Petrushka
July 8, 2010, 07:16 PM PDT
veilsofmaya: I disagree with your reading, but I am afraid we cannot go on forever on the same points. I want to specify that I have nothing to object to in your points 01 and 02. My objections were only relative to the discussion about the 1000 or so orphans: the previous filtering operations are OK for me. I never said anything critical about those passages, so I don't understand why you mention them here: just to show that the authors are good guys?

It's no longer obvious that they are new. This is because what we know about genes has changed significantly since they were added. That any of these ORFs are new genes is the goal of some other research project. In fact, as indicated above, the filtering process would be a helpful starting point for such a project.

I am afraid I cannot even start to understand what you are saying here. Why "it's no longer obvious that they are new"? They are new. They have no known homologue. That's why they are called orphans. Nobody, not even the paper's authors, denies that. And in what sense has "what we know about genes changed significantly since they were added"? I am not aware of any such change. "That any of these ORFs are new genes is the goal of some other research project." What does that mean? ORFs are potentially considered genes, until proven otherwise. That has not changed. The (correct) filtering process in points 1 and 2 has called into question some human ORFs as protein-coding genes on the basis of reasonable arguments. That's not the same for the last part, regarding the orphans. There, the simple argument of their being orphans in just the human species, not allowing enough Darwinian time for their evolution, has been considered sufficient. Which is the reason for my criticism.

Having the potential to code proteins is not the same as positive experimental evidence that a majority of them code proteins.

But positive experimental evidence of protein coding is lacking for a lot of ORFs, human and not human. If researchers had to follow the criteria you suggest, the databases of genes would have to be drastically reduced! And so on... I am sorry, but I cannot go on forever with this. You are free to consider that paper as methodologically correct. I don't.
gpuccio
July 8, 2010, 01:52 PM PDT
@gpuccio (#177) You wrote:
The only test was if new genes were new, which is a tautology
This is a highly simplistic interpretation which ignores much of the research performed.

01. The entire catalog was filtered from scratch, beginning with an assumption that none of the genes were orphans. A new protocol was developed by which the criteria focused on human, mouse and dog genomes due to the high quality of sequence data available. Development of this specific process was a key part of the study as:
A. It would be part of the methodology for evaluating future proposed additions to the human gene catalog.
B. It identified pseudogenes that slipped into the Ensembl catalog.
C. It identified numerous errors in human genome annotations.
D. It identified 36 putative genes as valid genes, including 10 primate specific.
E. It allowed for more detailed analysis and comparison of the properties of the resulting orphans with ortholog and random controls, rather than merely determining they were orphans.

From the paper:

Finally, we note that the careful filtering of the human gene catalog above was essential to the analysis above, because it eliminated pseudogenes and artifacts that would have prevented accurate analysis of the properties of the orphans.

02. When this process was applied to the Ensembl (v38) catalog, an additional 598 genes were found due to more accurate identification of cross-species counterparts. This was due to the use of the more accurate dog and mouse sequence data. The filtering process would have resulted in a net loss due to attrition, not merely removal of orphans previously classified as such using earlier methods.
[If it wasn't a tautology], the paper would have been something like: “Let’s see how many of human orphan genes, which are obviously new, have some independent demonstration of a corresponding protein in scientific literature”.
It's no longer obvious that they are new. This is because what we know about genes has changed significantly since they were added. That any of these ORFs are new genes is the goal of some other research project. In fact, as indicated above, the filtering process would be helpful starting point for such a project.
My complaint is that they categorize as implausible what they are observing (1000 new ORFs, potential protein coding genes, in humans) only because they can’t explain how they could have arisen in such a short time by RV and NS.
Having the potential to code proteins is not the same as positive experimental evidence that a majority of them code proteins. Other researchers observed that 12 of the orphans identified in the study had been previously identified as coding. This is not implausible. Nor was the fact that over 1,000 human ORFs which were not present in other specific mammals had been previously entered into a human gene catalog before specific recent discoveries had been made. While each specific ORF is a potential gene, as a group they exhibit specific properties that make them unlikely to be actually protein coding. Specifically:

Recent studies have made clear that the human genome encodes an abundance of non-protein-coding transcripts (1-3). Simply by chance, noncoding transcripts may contain long ORFs. This is particularly so because noncoding transcripts are often GC-rich, whereas stop codons are AT-rich. Indeed, a random GC-rich sequence (50% GC) of 2 kb has a ≈50% chance of harboring an ORF ≥400 bases long.

Furthermore, out of these 1000 ORFs, the paper is *not* suggesting that any one in particular does not have the potential to be a new gene. Instead, they are suggesting that, in the absence of any other positive experimental evidence, it's unlikely that a majority of them would be protein coding. Nor is this solely based on addition and deletion rates specific to Darwinism. The recommendation is that, unless research suggests otherwise, they should be reclassified as non-coding, just as ORFs external to the catalog would not be added given what we now know. Nor would this be a barrier to re-entry for these ORFs should other experimental evidence be found in the future. Finally, reclassification does not exclude an ORF from further study. In fact, it seems quite the opposite. To quote the paper:

Finally, the creation of more rigorous catalogs of protein-coding genes for human, mouse, and dog will also aid in the creation of catalogs of noncoding transcripts.
This should help propel understanding of these fascinating and potentially important RNAs. That you see the resulting net attrition as "hate" appears to be caused by constant misrepresentations which have been addressed time and time again.
veilsofmaya
July 8, 2010, 01:35 PM PDT
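
The GC-content point quoted from the paper in the comment above can be illustrated with a small calculation. This is a rough independence sketch (codon positions treated as independent, a single reading frame, no start-codon requirement), not the paper's own analysis: with base frequencies set by GC content g, the three stop codons (TAA, TAG, TGA) become rarer as g rises, so long stop-free stretches, i.e. spurious ORFs, become increasingly common in GC-rich transcripts.

```python
# Sketch: probability that a stretch of n_codons contains no stop codon,
# as a function of GC content. Assumes independent bases with
# P(G) = P(C) = g/2 and P(A) = P(T) = (1-g)/2; one reading frame; no
# start-codon requirement. An illustration, not the paper's analysis.

def stop_codon_prob(g: float) -> float:
    a = (1 - g) / 2          # P(A) = P(T)
    c = g / 2                # P(C) = P(G)
    # Stops: TAA = t*a*a, TAG = t*a*g, TGA = t*g*a
    return a * a * a + 2 * (a * a * c)

def p_no_stop(g: float, n_codons: int) -> float:
    # Probability that none of n_codons is a stop codon.
    return (1 - stop_codon_prob(g)) ** n_codons

# At uniform base composition (g = 0.5) a stop occurs with the familiar
# probability 3/64 per codon.
assert abs(stop_codon_prob(0.5) - 3 / 64) < 1e-12

# 133 codons is roughly a 400-nt ORF, the length quoted in the paper.
for g in (0.5, 0.6, 0.7):
    print("GC=%.0f%%: P(stop)/codon=%.4f, P(no stop in 133 codons)=%.4f"
          % (g * 100, stop_codon_prob(g), p_no_stop(g, 133)))
```

Note this only shows the trend per reading-frame window: a 2 kb transcript offers thousands of candidate start positions across multiple reading frames, which multiplies these per-window chances considerably, so the sketch is consistent with, but does not by itself reproduce, the paper's ≈50% figure.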
Petrushka: Now, stop kidding. We have better things to do. If you want, and if you can, you will find all my arguments very clearly in my #195 (or in the other parts of my exchange with rna). As you perfectly know.
gpuccio
July 8, 2010, 09:25 AM PDT
The important thing is not to attribute the results of intelligent engineering to randomness, or to RV + NS. That’s simply wrong.
Are you suggesting that the induced variation was not random?
Petrushka
July 8, 2010, 06:38 AM PDT
Petrushka: a study can accomplish several related goals, provided that the different goals are clearly stated and that the conclusions are kept well separated. That is not the case in that study. And I believe you can traverse any space, gradient or not, continuous or not, through the right quantity and quality of intelligent engineering (including RV + intelligent selection). Otherwise, intelligent agents, including humans, could never design proteins. The important thing is not to attribute the results of intelligent engineering to randomness, or to RV + NS. That's simply wrong. IOW, research must be honest and clear, in its aims, in its procedures, in its methodology, in its epistemology, and in its conclusions.
gpuccio
July 8, 2010, 06:34 AM PDT
Obviously, because the aim of the study was to analyze how many functional sequences could be found in a random library, not how many functional sequences could be found after a process of mutation and selection.
I see no reason why a study can't accomplish several related goals. Random functional sequences would have purely academic interest, but finding random sequences with minimal function that could be enhanced through mutation and selection would be a much richer and more evocative finding. It's something that would be an obvious follow-up study under any circumstances. It speaks directly to the question of whether protein functionality is a gradient that can be traversed through mutation and selection.
Petrushka
July 8, 2010, 06:19 AM PDT
Petrushka: Apparently it eluded Darwin, since artificial selection gave him the idea for natural selection. An inference by analogy, it seems... just like ID! :)
gpuccio
July 7, 2010, 11:37 PM PDT
Petrushka: "Why not? What is the justification for excluding mutation and selection?" Obviously, because the aim of the study was to analyze how many functional sequences could be found in a random library, not how many functional sequences could be found after a process of mutation and selection.
gpuccio
July 7, 2010, 05:25 PM PDT
The major point is about the selection. This is intelligent selection. Not natural selection. Therefore, this is directed engineering, and not spontaneous evolution, nor is it in any way a correct simulation of it. I am really amazed that such a striking difference seems so elusive to many intelligent people.
Apparently it eluded Darwin, since artificial selection gave him the idea for natural selection.
So, again, why the mutations? Why didn’t they just expand and purify the existing sequences? They knew the exact sequences. Why change them? What is the methodological justification of such a procedure?
Why not? What is the justification for excluding mutation and selection?
Petrushka
July 7, 2010, 04:34 PM PDT
rna (#190): First of all I would like to really thank you for your detailed and very competent comments, and especially for the very serious, respectful and dedicated tone of your discourse. I really appreciate that, and believe me, it's not so common to experience such a fruitful exchange, even in disagreement. That said, I must all the same disagree with you on some important points. I will try again to explain why, as clearly as possible. Then I leave it to you. We can probably agree to disagree on those points, if you think we have made our respective positions clear. But if you have any further point to make, I will be happy to go on with the discussion. So, I will try to follow your arguments in the order they are given.

a) "both proteins bind ATP and fold into the same structure. Their affinities for ATP differ a bit as does their stability against heat denaturation. If you take the same protein from different organisms, let's say myoglobin from a whale and a human, they will differ in their affinity for oxygen and their stability. So would you say that they differ in their function and folding?"

Well, it's not a very important point for me to define the differences between protein 18-19 and protein DX. If you prefer the first, or if you think that they are grossly equivalent, that's fine with me. I was just quoting the opinion of the researchers who created protein DX, but maybe they are in some way partial to their creature :). It is certain that protein DX was created from protein 18-19 through further rounds of "directed evolution" (see below for comments on that), but after all it is not a rule that every effort at protein engineering must be very successful. It is true, also, that many of the subsequent "experiments" (hydrolysis, in vivo testing) were carried out with protein DX. But again, this is not an important point.

b) "How could they be the product of directed engineering? The structure was not known at the time and could not have been guessed since there was no sequence homology to any known protein. It was not known which amino acids in the sequence contributed to ligand binding or folding. The mutagenesis in these three rounds was random for every amino acid position and could have led to an exchange to any other amino acid at every single position of the protein. So it was random mutations coupled to selection that yielded the improvement in protein stability and affinity."

This is probably the most important point of all, so I will spend some more words on it. You certainly know that protein engineering can be done in two different ways: top-down and bottom-up. In that sense, protein Top7 is a good example of top-down design (the protein was designed starting from current knowledge of protein sequences and protein folding), while our ATP-binding protein is a good example of bottom-up design. It is no accident that those two proteins are listed in SCOP under class 11 (Designed Proteins), as the first two fold classes: 1. New fold designs (Top7); 2. In vitro evolution products (our protein; not sure if 18-19 or DX, probably the first). Fold 2, like fold 1, indeed includes only one protein, ours, under the classification: Superfamily: Function-directed selections -> 1. Artificial nucleotide binding protein. I paste here the beginning of the protein entry:

HEADER ARTIFICIAL NUCLEOTIDE BINDING PROTEIN 28-JAN-04 1UW1 TITLE A NOVEL ADP- AND ZINC-BINDING FOLD FROM FUNCTION-DIRECTED TITLE 2 IN VITRO EVOLUTION COMPND MOL_ID: 1; COMPND 2 MOLECULE: ARTIFICIAL NUCLEOTIDE BINDING PROTEIN (ANBP); COMPND 3 CHAIN: A; COMPND 4 FRAGMENT: NUCLEOTIDE BINDING DOMAIN; COMPND 5 ENGINEERED: YES;

Well, that's SCOP's point of view. But I owe you mine. You say: "The structure was not known at the time and could not have been guessed since there was no sequence homology to any known protein." In a bottom-up process, the structure is never known in advance. That's why the process is bottom-up. You could object that a bottom-up process can start from the sequence of some known protein, to modify it. That's true. But in this case, the process of engineering deliberately starts from random sequences. So, it's a bottom-up process starting from random sequences. Do we agree on that? The start is random.

You say: "It was not known which amino acids in the sequence contributed to ligand binding or folding." That's true. You say: "The mutagenesis in these three rounds was random for every amino acid position and could have led to an exchange to any other amino acid at every single position of the protein." That's true. You say: "So it was random mutations coupled to selection that yielded the improvement in protein stability and affinity." That's true.

So, where is the problem? The problem is that "random mutations coupled to selection" is bottom-up protein engineering. It is directed engineering. And what is "directed"? The mutations are not "directed", although I could argue that they are in some way "targeted" (a choice is made about how to produce the mutations, in how many rounds, with what rate of mutation, and so on). But that is a minor point. I can agree that the induced mutations can to some degree be considered a simulation of "natural" RV. The major point is the selection. This is intelligent selection, not natural selection. Therefore, this is directed engineering, not spontaneous evolution, nor is it in any way a correct simulation of it. I am really amazed that such a striking difference seems so elusive to many intelligent people.

I will try to give my explicit definitions, to better make my point:

1) Natural selection: any system with replicators where, after some replicator through various means has developed some new "property", that property is expanded in the population as a consequence of the spontaneous improvement in replication that the new property confers.

2) Intelligent selection: any system with replicators where, after some replicator through various means has developed some new "property", the property is actively recognized, measured, selected and expanded in the population by the system itself, independently of any objective advantage conferred by the property to the replicator.

Can't you see the difference? 1 is completely different from 2, and hugely less powerful. 2 is intelligent selection: the function "does not act"; it is only recognized, because the system has been set up to recognize it. ATP binding in the Szostak system is recognized by a purification system engineered to recognize it, but it confers no advantage in replication (in this case, the PCR replication system). Intelligent selection is very efficient in bottom-up engineering. It can easily recognize a desired property even in minimal form, and amplify it, and develop it through rounds of mutation and selection. NS can do nothing like that, unless a truly functional property in a truly living context is first built, and unless it is capable of conferring a reproductive advantage. That's why both protein 18-19 and protein DX are the product of intelligent protein engineering.

We have in nature a very good model of intelligent protein engineering, realized through a very brilliant algorithm of random search and intelligent selection. I have often pointed to it on this blog. It's the mechanism of antibody maturation. In it, the low-affinity antibody of the primary repertoire, after the first exposure to the antigen, undergoes maturation through a process, still not completely understood, of targeted random mutation and intelligent selection, using the antigen epitopes (probably stored in the antigen-presenting cells) as a selecting tool. And the results are brilliant.

c) "normally, it is possible to predict the 3D structure of a protein correctly when you find another protein with a known structure and ~30% sequence similarity. Thus, 80% sequence IDENTITY is a pretty good hint that their structures are very similar. (Of course, there are cases where a single amino acid change changes the structure completely, but that is maybe the topic of another discussion.) There are many cases where structures of such pairs of proteins have been determined experimentally to support this notion."

But there is no reason why this should be one of those cases. As I have already argued, these two proteins (I mean the B family ancestor and protein 18-19) are 80% similar because the second is derived from the first. Their history, their "common descent", if you want, explains their similarity. But that tells us nothing about the structure, because the structure was probably selected through the rounds of mutation and intelligent selection. Otherwise, why would the ATP-binding property have increased so much through those rounds?

d) "If you look at the structure of 18-19 or DX you see that the binding site for ATP is made up of the two aromatic amino acids phenylalanine and tyrosine and an arginine (at specific positions in the sequence). These amino acids are already the same in the original B family, supporting the idea that the function of ATP binding was present from the beginning. Importantly, the tyrosine is also involved in ATP cleavage, so even that was there from the beginning."

Here I have to definitely disagree. I have never denied that some simple binding to ATP took place in the original sequences. That's obvious; otherwise they would not have been selected. And the two AAs you cite may well have been responsible for that. But how can you conflate that with the high ATP-binding activity which was developed later, and which reasonably depends on the acquired folding, which almost certainly was not present in the beginning? How can you deny what the authors themselves state clearly, that they introduced the mutagenic-selective rounds to increase ATP binding and improve folding?

"One possible explanation for this low level of ATP-binding is conformational heterogeneity, possibly reflecting inefficient folding of these primordial protein sequences. In an effort to increase the proportion of these proteins that fold into an ATP-binding conformation, we mutagenized the library and carried out further rounds of in vitro selection and amplification."

And here comes the final point: why are we here, forced to speculate about the possible properties of the original sequences, or at least of the B family ancestors? It's simple: because the researchers, instead of doing what was consistent with their initial purpose and with their methodological context, did another thing. Instead of purifying and studying the family B proteins, instead of defining their structure, instead of measuring their ATP-binding activity, instead of trying to assess how functional those sequences derived rather directly from their random library were, they chose to change those sequences, to improve their folding and their ATP-binding activity. That's the simple truth. If they had acted in a methodologically correct way, we would probably still not know the final truth, but at least we would be discussing the properties of a sequence which was really in a random library. Instead, we are wasting our time discussing protein 18-19, which is the product of mutation and intelligent selection, and speculating about how similar it could possibly be to its ancestor. So, again, why the mutations? Why didn't they just expand and purify the existing sequences? They knew the exact sequences. Why change them? What is the methodological justification for such a procedure?

e) "I don't get the fundamental physical difference between 'sticking to ATP' in the original B family and the ATP binding of e.g. DX. As I pointed out above, the important amino acids binding to ATP via electrostatic and aromatic stacking interactions are already in place in the original B family. In more general terms, if you look at the amino acid exchanges between the B family and 18-19, many of these exchanges are between very similar amino acids. E.g. lysines are exchanged for arginines, which as you very well know have similar biophysical properties. Such changes in other proteins normally go hand in hand with keeping the structure and the function."

I can agree with you on one thing: we are certainly not sure that all 16 AA changes are functionally important. Some of them could be just neutral mutations. But many, certainly, are not. I have already stated my answer to the other point: the binding to ATP was already present in some form, but not the folding, and not the special conformation which made possible the higher binding affinity, the molecule's stability, and probably even the primordial hydrolytic activity.

f) "as I pointed out above, that is not true. Furthermore, what is the reason for you to think that only strong binding counts as function? If you look at ATP affinities for natural ATP-binding proteins, e.g. the NBD domains of different ABC transporters, you find that their ATP affinities vary between ~100 nM and > 1 mM, which is 10,000-fold weaker than protein 18-19. These are functional natural proteins depending on ATP binding ..."

What is not true? You are certainly not implying that those three rounds did not accomplish anything. Then why were they performed? Just to spend time? Regarding the problem of ATP affinity, your example (ABC transporters), which you certainly know much better than I do, is a very good demonstration of the difference between a true function and the simple biochemical binding of a molecule. Just a couple of hints from Wikipedia for those who are reading (I am sure you don't need them):

"The common feature of all ABC transporters is that they consist of two distinct domains, the transmembrane domain (TMD) and the nucleotide-binding domain (NBD)."

"The structural architecture of ABC transporters consists minimally of two TMDs and two ABCs."

"The ABC domain consists of two domains, the catalytic core domain similar to RecA-like motor ATPases and a smaller, structurally diverse α-helical subdomain that is unique to ABC transporters. The larger domain typically consists of two β-sheets and six α-helices, where the catalytic Walker A motif (GXXGXGKS/T, where X is any amino acid) or P-loop and Walker B motif (ΦΦΦΦD, of which Φ is a hydrophobic residue) is situated. The helical domain consists of three or four helices and the ABC signature motif, also known as the LSGGQ motif, linker peptide or C motif. The ABC domain also has a glutamine residue residing in a flexible loop called the Q loop, lid or γ-phosphate switch, that connects the TMD and ABC. The Q loop is presumed to be involved in the interaction of the NBD and TMD, particularly in the coupling of nucleotide hydrolysis to the conformational changes of the TMD during substrate translocation. The H motif or switch region contains a highly conserved histidine residue that is also important in the interaction of the ABC domain with ATP. The name ATP-binding cassette is derived from the diagnostic arrangement of the folds or motifs of this class of proteins upon formation of the ATP sandwich and ATP hydrolysis."

"Dimer formation of the two ABC domains of transporters requires ATP binding."

"Nucleotide binding is required to ensure the electrostatic and/or structural integrity of the active site and contribute to the formation of an active NBD dimer. Binding of ATP is stabilized by the following interactions: (1) ring-stacking interaction of a conserved aromatic residue preceding the Walker A motif and the adenosine ring of ATP, (2) hydrogen bonds between a conserved lysine residue in the Walker A motif and the oxygen atoms of the β- and γ-phosphates of ATP, and coordination of these phosphates and some residues in the Walker A motif with the Mg2+ ion, and (3) γ-phosphate coordination with the side chain of serine and backbone amide groups of glycine residues in the LSGGQ motif. In addition, a residue that suggests the tight coupling of ATP binding and dimerization is the conserved histidine in the H-loop. This histidine contacts residues across the dimer interface in the Walker A motif and the D loop, a conserved sequence following the Walker B motif."

And, especially, this final point: "ABC transporters are active transporters, that is, they require energy in the form of adenosine triphosphate (ATP) to translocate substrates across cell membranes. These proteins harness the energy of ATP binding and/or hydrolysis to drive conformational changes in the transmembrane domain (TMD) and consequently transport molecules."

So, the real problem is not how strongly you bind ATP, but rather what you do as a consequence of that binding.

g) "Do you think that its harmfulness results from it being 'unnatural'?"

No, I think that its harmfulness derives from its being simple and gross, and lacking a true function. If its binding to ATP were lower, or if it were simply "less expressed", it would just be useless.

h) "Why have they gone on modifying? For biophysical experiments such as measuring affinity, determining a structure, etc., you need to produce a certain amount of the protein, normally by overexpression in bacteria, in much larger amounts than what you have in the actual selection experiments (where you produce your protein by a process called in vitro translation, in smaller amounts), and you need to be able to purify it. Thus, 18-19 was the better choice for these experiments because it was easier to handle."

Purifying and expressing the protein could have been done without intentionally modifying it through rounds of mutational PCR. They could just have used simple PCR. Whatever you may say, the introduction of mutations before selection was a deliberate choice, an intentional act of engineering, contrary to the aims of the experiment, and totally unjustified. And I agree that protein 18-19 was "easier to handle". That's exactly my point. It should not have been.

In the end, I want again to thank you for your part in this discussion. Any possible "strength" in my words is in no way directed against you, but simply motivated by my sincere convictions about the subject.
gpuccio
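The two definitions above (natural vs. intelligent selection) can be illustrated with a deliberately crude toy model. Everything in this sketch is invented for illustration: the genomes are bitstrings, the "property" is a bit count, and the fitness assumptions follow gpuccio's definitions (the property confers no replication advantage, so only the externally applied screen can amplify it). It is not a simulation of the Szostak experiment or of any real biology.

```python
import random

random.seed(0)
TARGET = 20          # number of positions that define the measured "property"
POP, GENS = 200, 30  # population size and number of generations

def score(genome):
    # the measured "property": how many of the first TARGET bits are 1
    return sum(genome[:TARGET])

def mutate(genome, rate=0.01):
    # flip each bit independently with the given probability
    return [b ^ 1 if random.random() < rate else b for b in genome]

def evolve(intelligent):
    pop = [[random.randint(0, 1) for _ in range(100)] for _ in range(POP)]
    for _ in range(GENS):
        if intelligent:
            # intelligent selection: the system measures the property and
            # keeps the best half, regardless of any replication advantage
            pop.sort(key=score, reverse=True)
            parents = pop[:POP // 2]
        else:
            # "natural" selection with no advantage: the property does not
            # affect replication here, so reproducers are chosen blindly
            parents = random.sample(pop, POP // 2)
        pop = [mutate(p) for p in parents for _ in (0, 1)]
    return max(score(g) for g in pop)

print("intelligent selection, best score:", evolve(True))
print("blind replication,     best score:", evolve(False))
```

Under these assumptions the external screen drives the property toward its maximum in a few dozen generations, while blind replication leaves it hovering near the random-sequence baseline, which is the asymmetry the definitions describe.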
July 7, 2010 at 03:46 PM PDT
Petrushka, I answered that question yesterday,
Sorry. I find this site extremely hard to navigate. Threads get pushed down rather quickly, and there's no indicator of whether a thread of interest is active. I simply lose track of comments. It doesn't help that when viewing long threads with lots of comments, the entire thread has to load. On my connection it can take over a minute for a thread to load. Enough complaining. I'll try to be more careful.
Petrushka
July 7, 2010 at 03:23 PM PDT
rna, as well, I would like you to explain to me exactly how a 1 in 10^12 (trillion) rarity of functional proteins vs. useless proteins would even begin to explain the evolution of larger mammals with smaller populations... say, the evolution of the whales within 10 to 50 million years? Whale Evolution Vs. Population Genetics - Richard Sternberg PhD. in Evolutionary Biology - video: http://www.metacafe.com/watch/4165203
bornagain77
July 7, 2010 at 01:03 PM PDT
rna, I find your response to gpuccio to be "excuses", not reasons. The burning question is: why in the world did they even mess with the 1 in 10^12 protein if it was truly functional? For the 1 in 10^12 number to retain integrity as a true measure of the rarity of functional proteins, the blatant manipulation of the protein must in fact be factored into the result or removed from the experiment entirely. Why are you trying so desperately to defend what is clearly a "jerry-rigged" result? Do you really think that if you add enough "excuses" for the manipulation it will legitimize the result? Face it, rna: they were caught with their hand in the cookie jar, and the result is no more legitimate than an insane man's claims to being the king of America.
bornagain77
July 7, 2010 at 12:06 PM PDT
Petrushka, I answered that question yesterday, as to what I think the age of the earth is, on David Tyler's post, after you had asked me about it. https://uncommondescent.com/intelligent-design/macroscopic-life-in-the-palaeoproterozoic/#comment-358566 As I find it peculiar today, I also thought it very peculiar for you to ask that question yesterday, as I had just posted several comments, prior to your question, on his thread pertaining exactly to the extremely long "Intelligently Designed terra-forming of the earth". Posts that should have left no doubt whatsoever that I believe in an ancient age for the earth. But I'm curious as to why you should ask today, since it has nothing to do with the issue at hand. Are you trying to divert attention away from this biased "kick the orphan genes out into the street" paper by trying to attack what you falsely think to be a young-earth weakness in me, instead of trying to defend whatever scant merits the paper might or might not actually have? Do you in fact agree the paper is without true scientific integrity, as is clearly apparent to both me and gpuccio? Are you now admitting you have no recourse but to "attack the man" by trying to undermine what you falsely perceive to be a lack of integrity in my judgment? Exactly what is your reasoning in employing such a shallow, and I might add intellectually dishonest, ploy to defend a paper that is in fact not worth defending in the first place?
bornagain77
July 7, 2010 at 11:53 AM PDT
# 172 gpuccio

"a) The structures of 18-19 and DX may be virtually identical, but their folding and functional properties are not ..."

Both proteins bind ATP and fold into the same structure. Their affinities for ATP differ a bit, as does their stability against heat denaturation. If you take the same protein from different organisms, let's say myoglobin from a whale and a human, they will differ in their affinity for oxygen and their stability. So would you say that they differ in their function and folding?

"... because both are the product of directed engineering ..."

How could they be the product of directed engineering? The structure was not known at the time and could not have been guessed, since there was no sequence homology to any known protein. It was not known which amino acids in the sequence contributed to ligand binding or folding. The mutagenesis in these three rounds was random for every amino acid position and could have led to an exchange to any other amino acid at every single position of the protein. So it was random mutations coupled to selection that yielded the improvement in protein stability and affinity.

"b) Both 18-19 and DX differ from the original B family for 16 AAs, 20%. That's a lot, ..."

Normally, it is possible to predict the 3D structure of a protein correctly when you find another protein with a known structure and ~30% sequence similarity. Thus, 80% sequence IDENTITY is a pretty good hint that their structures are very similar. (Of course, there are cases where a single amino acid change changes the structure completely, but that is maybe the topic of another discussion.) There are many cases where the structures of such pairs of proteins have been determined experimentally to support this notion.

"... especially if you consider that those AAs are exactly those mutations which were actively selected to confer both folding and function ..."

If you look at the structure of 18-19 or DX, you see that the binding site for ATP is made up of the two aromatic amino acids phenylalanine and tyrosine and an arginine (at specific positions in the sequence). These amino acids are already the same in the original B family, supporting the idea that the function of ATP binding was present from the beginning. Importantly, the tyrosine is also involved in ATP cleavage - so even that was there from the beginning.

"The only information we have about function in the original B family is that they stuck to ATP enough to be separated from the other sequences."

I don't get the fundamental physical difference between 'sticking to ATP' in the original B family and the ATP binding of e.g. DX. As I pointed out above, the important amino acids binding to ATP via electrostatic and aromatic stacking interactions are already in place in the original B family. In more general terms, if you look at the amino acid exchanges between the B family and 18-19, many of these exchanges are between very similar amino acids. E.g. lysines are exchanged for arginines, which as you very well know have similar biophysical properties. Such changes in other proteins normally go hand in hand with keeping the structure and the function.

"... And it's those 3 rounds which found the necessary mutations to confer folding and 'function' (in the sense of a strong binding to ATP ..."

As I pointed out above, that is not true. Furthermore, what is the reason for you to think that only strong binding counts as function? If you look at ATP affinities for natural ATP-binding proteins, e.g. the NBD domains of different ABC transporters, you find that their ATP affinities vary between ~100 nM and > 1 mM, which is 10,000-fold weaker than protein 18-19. These are functional natural proteins depending on ATP binding ...

"e) myoglobin"

Myoglobin binds oxygen when there is a high enough oxygen concentration, diffuses, and releases oxygen when it moves into an environment with lower oxygen concentrations; thereby it helps to regulate oxygen levels. DX in a cell binds ATP when there is ATP, diffuses, and releases ATP when ATP concentrations in its environment are low. Where is the fundamental difference?

"And do you really believe that its incorporation in bacteria was harmful only because it was overexpressed?"

Strictly, I cannot answer that, because they did not do the proper control experiment: expressing a natural ATP-binding protein with the same ATP affinity at the same intracellular protein levels. From experience, overexpression of a protein often decreases the fitness of the poor cells forced to produce it; it drains a lot of resources. If that protein then binds ATP and thereby depletes the cell of it, that's serious damage. Do you think that its harmfulness results from it being 'unnatural'?

"The experimenter's declared purpose was to look for naturally occurring functional sequences in a random library"

First paragraph of their paper: "The frequency of occurrence of functional proteins in collections of random sequences is an important constraint on models of the evolution of biological proteins. Here we have experimentally determined this frequency by isolating proteins with a specific function from a large random-sequence library of known size. We selected for proteins that could bind a small molecule target with high affinity and specificity as a way of identifying amino-acid sequences that could form a three-dimensional folded state with a well-defined binding site and therefore exhibit an arbitrary specific function. ATP was chosen as the target for binding to allow comparison with known biological ATP-binding motifs ..."

"why have they gone on modifying those sequences by designed evolution, if not in order to build some apparent function which obviously was not there in the beginning ..."

If the ATP-binding ability had not been there in the beginning, the sequences would never have made it through the first rounds of selection. For further arguments, see above. Why have they gone on modifying? For biophysical experiments such as measuring affinity, determining a structure, etc., you need to produce a certain amount of the protein, normally by overexpression in bacteria, in much larger amounts than what you have in the actual selection experiments (where you produce your protein by a process called in vitro translation, in smaller amounts), and you need to be able to purify it. Thus, 18-19 was the better choice for these experiments because it was easier to handle.
rna
July 7, 2010 at 10:03 AM PDT
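The 80% figure debated in this exchange is ordinary pairwise sequence identity over a pair of already-aligned, equal-length sequences. A minimal sketch of the calculation (the two 10-residue strings are made-up placeholders, not the real 18-19 or B-family sequences):

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Toy 10-residue example: 8 of 10 positions match -> 80.0
print(percent_identity("MKVLADGTRW", "MKVLSDGTRA"))
```

For scale: 16 differences over an 80-residue protein give 16/80 = 20%, i.e. exactly the 80% identity the two commenters are arguing about.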
gpuccio, here is an article that just came out on protein folding: Computer program takes on protein puzzle. "Though the proteins assemble themselves in nature almost instantly, the Rice team's algorithm took weeks to run the simulation. Still, that was far faster than others have achieved." http://www.physorg.com/news197658752.html
bornagain77
July 7, 2010 at 09:06 AM PDT
veils, it is abundantly clear to see the preconceived bias of the researchers drove these results severely astray of any meaningful point to be made in the research.
Preconceived bias? How old is the earth, BA?
Petrushka
July 7, 2010 at 08:57 AM PDT
gpuccio: here is a bit of trivia on protein folding: Youthful Aging Depends on Proper Protein Folding. Excerpt: "If you were to make small chains consisting of only five amino acids (linked like beads on a string), but using all possible combinations of the 20 different amino acids of which our proteins are composed, then the number of possible three-dimensional molecular configurations arising from all those five-unit chains would be 104,857,600,000! (Ripley, where are you when we need you?) The number is that large because: (1) there are 3,200,000 different ways (20 x 20 x 20 x 20 x 20) in which a 5-unit chain can be made from among the 20 amino acids; and (2) the number of different configurations that are possible when those chains twist and turn and fold in on themselves is 32,768 for each chain (based on certain molecular-geometric considerations)." http://www.life-enhancement.com/article_template.asp?id=2015
bornagain77
July 7, 2010 at 08:53 AM PDT
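The arithmetic in the excerpt quoted above is easy to verify directly. The 32,768 configurations-per-chain figure is taken on the article's own terms (it is the article's assumption, not an independently established number):

```python
# Number of distinct 5-residue chains from a 20-amino-acid alphabet
chains = 20 ** 5
print(chains)  # prints 3200000

# The article's assumed number of fold configurations per chain
configs_per_chain = 32_768  # = 2**15

total = chains * configs_per_chain
print(total)  # prints 104857600000
```

So the headline figure of 104,857,600,000 is simply 20^5 multiplied by 32,768, and it already illustrates, at a chain length of only five residues, how quickly the combinatorial space grows.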
gpuccio, for me this paper is a very good example of exactly the type of thinking that the scientific method was supposedly set up to remove. Richard Feynman addressed this type of thinking in his essay "Cargo Cult Science": "The first principle is that you must not fool yourself--and you are the easiest person to fool. So you have to be very careful about that. After you've not fooled yourself, it's easy not to fool other scientists. You just have to be honest in a conventional way after that. [...] It's a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty--a kind of leaning over backwards. For example, if you're doing an experiment, you should report everything that you think might make it invalid--not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked--to make sure the other fellow can tell they have been eliminated. Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can--if you know anything at all wrong, or possibly wrong--to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it. [...] In summary, the idea is to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another. [...] But this long history of learning how to not fool ourselves--of having utter scientific integrity--is, I'm sorry to say, something that we haven't specifically included in any particular course that I know of. We just hope you've caught on by osmosis."
bornagain77
July 7, 2010 at 05:04 AM PDT