Uncommon Descent Serving The Intelligent Design Community

Proteins Fold As Darwin Crumbles


A Review Of The Case Against A Darwinian Origin Of Protein Folds By Douglas Axe, Bio-Complexity, Issue 1, pp. 1-12

Proteins adopt a higher-order structure (e.g., alpha helices and beta sheets) that defines their functional domains.  Years ago Michael Denton and Craig Marshall reviewed this higher structural order in proteins and proposed that protein folding patterns could be classified into a finite number of discrete families whose construction might be constrained by a set of underlying natural laws (1).  In his latest critique, Biologic Institute molecular biologist Douglas Axe raises the ever-pertinent question of whether Darwinian evolution can adequately explain the origins of protein folds, given the vast search space of possible sequences for moderately large proteins, say 300 amino acids in length.  To begin, Axe introduces his readers to the sampling problem: given the postulated maximum number of distinct physical events that could have occurred since the universe began (10^150), we cannot suppose that evolution has had enough time to sample the 10^390 possible amino-acid combinations of a 300 amino acid long protein.
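As a rough check on the figures quoted above, the size of the sequence space can be worked out on a log scale. This is only an illustrative sketch: the function name and constants are chosen here for the example and are not taken from Axe's paper, and the 10^150 bound is simply the figure quoted in this review.

```python
import math

AMINO_ACIDS = 20          # standard amino-acid alphabet
PROTEIN_LENGTH = 300      # length used in the review's example
LOG10_MAX_EVENTS = 150    # bound on physical events, as quoted in the review


def log10_sequence_space(length, alphabet=AMINO_ACIDS):
    """Return log10 of the number of sequences of the given length."""
    return length * math.log10(alphabet)


log10_space = log10_sequence_space(PROTEIN_LENGTH)
print(f"20^{PROTEIN_LENGTH} is about 10^{log10_space:.1f}")
print(f"gap to the event bound: about 10^{log10_space - LOG10_MAX_EVENTS:.0f}")
```

Working with logarithms avoids handling 390-digit numbers directly; the comparison itself is only as good as the two input figures.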

The battle cry often heard in response to this apparently insurmountable barricade is that even though probabilistic resources would not allow a blind search to stumble upon any given protein sequence, the chances of finding a particular protein function might be considerably better.  Countering such a facile dismissal, Axe argues that proteins must meet very stringent sequence requirements if a given function is to be attained.  And size is important.  Enzymes, for example, are large in comparison to their substrates, and protein structuralists have argued that this size is crucial for ensuring the stability of protein architecture.

Axe has raised the bar of the discussion by pointing out that very often enzyme catalytic functions depend on more than just their core active sites.  In fact enzymes almost invariably contain regions that prep, channel and orient their substrates, as well as a multiplicity of co-factors, in readiness for catalysis.  Carbamoyl Phosphate Synthetase (CPS) and the Proton Translocating Synthase (PTS) stand out as favorites amongst molecular biologists for showing how enzyme complexes are capable of simultaneously coordinating such processes.  Overall each of these complexes contains 1400-2000 amino acid residues distributed amongst several proteins, all of which are required for activity.

Axe employs a relatively straightforward mathematical rationale for assessing the plausibility of finding novel protein functions through a Darwinian search.  Using bacteria as his model system (chosen because of their relatively large population sizes) he shows how a culture of 10^10 bacteria passing through 10^4 generations per year over five billion years would produce a maximum of 5×10^23 novel genotypes.  This number represents the ‘upper bound’ on the number of new protein sequences, since many of the differences in genotype would not generate “distinctly new proteins”.  Extending this further, for such a search to succeed, a novel protein function requiring a 300 amino acid sequence (20^300 possible sequences) would have to be achievable in roughly 10^366 different ways (20^300 / 5×10^23).
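The arithmetic in this paragraph can be reproduced in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the population size, generation rate and time span are simply the values quoted in the review, not independent estimates.

```python
import math


def log10_total_trials(population, generations_per_year, years):
    """log10 of the upper bound on genotypes sampled: population x generations x years."""
    return (math.log10(population)
            + math.log10(generations_per_year)
            + math.log10(years))


def log10_required_targets(length, log10_trials, alphabet=20):
    """log10 of how many sequences must carry a function for the trials to expect one hit."""
    return length * math.log10(alphabet) - log10_trials


log10_trials = log10_total_trials(1e10, 1e4, 5e9)
log10_targets = log10_required_targets(300, log10_trials)

print(f"trials: about 10^{log10_trials:.1f}")
print(f"required functional sequences: about 10^{log10_targets:.1f}")
```

The second function simply divides the size of the sequence space by the number of trials, which is the ratio the review gives as 20^300 / 5×10^23.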

Ultimately we find that proteins do not tolerate this extraordinary level of “sequence indifference”.  High profile mutagenesis experiments of beta lactamases and bacterial ribonucleases have shown that functionality is decisively eradicated when a mere 10% of amino-acids are substituted in conservative regions of these proteins.  A more in-depth breakdown of data from a beta lactamase domain and the enzyme chorismate mutase  has further reinforced the pronouncement that very few protein sequences can actually perform a desired function; so few in fact that they are “far too rare to be found by random sampling”.

But Axe’s landslide evaluation does not end here.  He further considers the possibility that disparate protein functions might share similar amino-acid identities and that therefore the jump between functions in sequence space might be realistically achievable through random searches.  Sequence alignment studies between different protein domains do not support such an exit to the sampling problem.  While the identification of a single amino acid conformational switch has been heralded in the peer-review literature as a convincing example of how changes in folding can occur with minimal adjustments to sequence, what we find is that the resulting conformational variants are unstable at physiological temperatures.  Moreover such a change has only been achieved in vitro and most probably does not meet the rigorous demands for functionality that play out in a true biological context.  What we also find is that there are 21 other amino-acid substitutions that must be in place before the conformational switch is observed. 

Axe closes his compendious dismantling of protein evolution by exposing the shortcomings of modular assembly models that purport to explain the origin of new protein folds.  The highly cooperative nature of structural folds in any given protein means that stable structures tend to form all at once at the domain (tertiary structure) level rather than at the fold (secondary structure) level of the protein.  Context is everything.  Indeed experiments have held up the assertion that binding interfaces between different forms of secondary structure are sequence dependent (i.e., non-generic).  Consequently a much anticipated “modular transportability of folds” between proteins is highly unlikely.

Metaphors are everything in scientific argumentation.  And Axe’s story of a random search for gem stones dispersed across a vast multi-level desert serves him well for illustrating the improbabilities of a Darwinian search for novel folds.  Axe’s own experience has shown that reticence towards accepting his probabilistic argument stems not from some non-scientific point of departure in what he has to say but from deeply held prejudices against the end point that naturally follows.  Rather than a house of cards crumbling on slippery foundations, the case against the neo-Darwinian explanation is an edifice built on a firm substratum of scientific authenticity.  So much so that critics of those who, like Axe, have stood firm in promulgating their case, better take note. 

Read Axe’s paper at: http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.1

Further Reading

  1. Michael Denton, Craig Marshall (2001), Laws of form revisited, Nature Volume 410, p. 417
Comments
BA: I think veilsofmaya is in absolute good faith, and that he is trying to express his own judgements on the matter. I could agree with you that he is probably sidetracked by too much faith in scientific literature in itself, a bias which can easily be excused in those who do not have to deal with it professionally (veils, I apologize if I am wrong, but I don't think your approach is really "skeptical" enough; if you had the same experience I have with scientific literature in my own field, maybe you would be a little more suspicious about scientific papers). With that, I am not encouraging hyperskepticism: all my remarks about the paper start from the paper itself, assuming good faith in the authors, but freely criticizing the explicit methodology expressed in the paper itself as implicitly revealing a cognitive bias. All that is perfectly legitimate. One may agree or not on the specific points, but it is obvious that scientific papers can and must be scrutinized for their methodology and conclusions.
gpuccio
July 7, 2010, 04:21 AM PDT
veils, as gpuccio has clearly pointed out, in his usual clear and well-reasoned way, GC content was an obvious indication that orphans are most likely protein coding genes. I listed two other unbiased methods in 178: codon equiprobability for determining protein encoding, and prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence: https://uncommondescent.com/intelligent-design/proteins-fold-as-darwin-crumbles/#comment-358544 These are unbiased methods that could easily have been implemented but were not. As has also been pointed out earlier, if they were really interested in "cleaning up" the protein coding catalog, as you insist they were, then they should have removed all of the "accepted genes" that have not been properly matched to protein sequences yet, i.e. they failed to apply their criteria consistently. They should have gone through their “accepted” gene coding sequences to eliminate all the ones that have failed to be matched directly to proteins so far (and this could be perhaps several thousand genes given the >39% of orphan enzymatic activities of humans). That you would use RFC score and CSF score as the lead-off to try to defend this extremely poor excuse for a scientific paper is really sad and says a lot about their, and your, ulterior motives, since the scores in fact, as gpuccio already pointed out earlier, reflect exactly the point they want to make in the end. veils, it is abundantly clear that the preconceived bias of the researchers drove these results severely astray of any meaningful point to be made in the research. My question to you is why in blue blazes are you trying so desperately to defend a paper that is so obviously void of scientific integrity? Of what possible benefit is it for you to sully your integrity by vainly trying to defend their lack thereof? What possible reward do you have in defending a bankrupt materialistic theory that promises you nothing but death in the first place?
Wake up veils!!!! Nickelback - Savin' Me http://www.youtube.com/watch?v=jPc-o-4Nsbk
bornagain77
July 7, 2010, 04:06 AM PDT
veilsofmaya: Just a couple of comments. RFC score and CSF score are obviously related to the existence of homologues. I can't see how they could be correctly applied to orphan genes, if not to confirm what has been already established, that they are orphans. Instead, the GC content is an independent clue, and it points towards confirming that they are protein coding genes. Whatever you say about removing or not removing and all other considerations, I think you seem to have a naive idea of how a scientific community works today. While it is true that any new researcher can make new points, change old interpretations and so on (that's still true, thank God, except maybe if you name ID), it is equally true that a big and authoritative paper like the one we are discussing is likely to very much influence the general way of thinking and future research. That's why researchers can be held responsible for the kind of methodology and reasoning they use when drawing conclusions from their data, because many people, even in the scientific community, will be influenced mainly by their conclusions, and not their data. So, faulty reasoning can and must be criticized. Finally, I think you are right that the paper about orphan enzymatic activities is not a direct support to the protein coding nature of human orphan genes, but it is an interesting piece of information anyway.
gpuccio
July 7, 2010, 12:07 AM PDT
@bornagain77 (#172) Born, Please read carefully… You wrote:
…they were not the least bit justified to remove orphans from the Gene database based primarily, and overwhelmingly…
Didn't happen. Nothing was actually removed. Fiction created to generate outrage and appeal to emotion. Evidence of protein coding, among others, would exclude an orphan from being removed.
…on unwarranted neo-Darwinian assumptions…
Regardless of what theory the hypothesis was based on, it was rigorously applied and tested, and the results matched the prediction. In regards to reading frame conservation...
The RFC score shows virtually no overlap between the well studied genes and the random controls (SI Fig. 5). Only 1% of the random controls exceed the threshold of RFC >90, whereas 98.2% of the well studied genes exceed this threshold. The situation is similar for the full set of 18,752 genes with cross-species counterparts, with 97% exceeding the threshold (Fig. 2 a). The RFC score is slightly lower for more rapidly evolving genes, but the RFC distribution for even the top 1% of rapidly evolving genes is sharply separated from the random controls (SI Fig. 5). By contrast, the orphans show a completely different picture. They are essentially indistinguishable from matched random controls (Fig. 2 b) and do not resemble even the most rapidly evolving subset of the 18,572 genes with cross-species counterparts. In short, the set of orphans shows no tendency whatsoever to conserve reading frame.
In regards to codon substitution frequency…
The results again showed strong differentiation between genes with cross-species counterparts and orphans. Among 16,210 genes with simple orthology, 99.2% yielded CSF scores consistent with the expected evolution of protein-coding genes. By contrast, the 1,177 orphans include only two cases whose codon evolution pattern indicated a valid gene. Upon inspection, these two cases were clear errors in the human gene annotation; by translating the sequence in a different frame, a clear cross-species orthologs can be identified.
Again, you seem to be complaining for no other reason than that it's based on neo-darwinism. However, neo-darwinism also predicts that viruses, cancer cells, etc. would mutate and become resistant to drugs. We know this occurs. Are you suggesting we should *not* take this into account when both administering existing treatments and developing new treatments because it's based on neo-darwinism? From the original paper…
Our focus has been on excluding putative genes from the human catalogs. We have not explored whether there are additional protein-coding genes that have not yet been included, although it is clear that cross-species analysis can be helpful in identifying such genes. Preliminary analysis from our own group and others suggests that there may be a few hundred additional protein-coding genes to be found but that the final total is likely to remain under ~21,000. The largest open question concerns very short peptides, which may still be seriously underestimated.
Furthermore, it's clear one of the goals is to clean up the human gene catalog to improve future study of both genes…
As a result, the human gene catalog has remained in considerable doubt. The resulting uncertainty hampers biomedical projects, such as systematic sequencing of all human genes to discover those involved in disease.
And even promote the future study of non-coding transcripts…
Finally, the creation of more rigorous catalogs of protein-coding genes for human, mouse, and dog will also aid in the creation of catalogs of noncoding transcripts. This should help propel understanding of these fascinating and potentially important RNAs.
Your assertion that removal would cause "neglect" is yet another crystal clear example of perpetuating the myth that darwinism hampers research. The irony is that a paper that you yourself quoted clearly and explicitly suggests otherwise. Finally, in regards to the second paper you referenced, it's not clear which, if any, of the "orphan enzymes" listed are the same as the orphan genes determined to be non-coding in the first. Specifically, the first paper was focused only on orphan genes already in the human genome, while the second paper references orphan enzymes in 287 species. The maximum number of valid orphan enzymes in a single species was 18, which is close to the 12 orphan genes previously identified in human beings.
veilsofmaya
July 6, 2010, 08:56 PM PDT
gpuccio, here is an article that just came out involving ATP, which reminded me that neither Szostak, nor rna, nor petrushka so much as offered a "just so" story for how the ATP molecule arose before the ATP enzyme that produces it. It would have been nice for them to at least try to do that, so as to be able to justify using Szostak's experiment for truly ascertaining the ability of purely material processes to even generate proteins (functional or non-functional) in the first place.
Nanomachines in the Powerhouse of the Cell: Architecture of the Largest Protein Complex of Cellular Respiration Elucidated - July 2010 Excerpt: The total surface of all mitochondrial membranes in a human body covers about 14.000 square meter. This accounts for a daily production of about 65 kg of ATP. (A little over 143 pounds). http://www.sciencedaily.com/releases/2010/07/100702100414.htm
Your Inner Locomotive Revealed - July 2010 http://www.creationsafaris.com/crev201007.htm#20100706a
Further notes:
Evolution vs ATP Synthase - Molecular Machine - video http://www.metacafe.com/watch/4012706
The ATP Synthase Enzyme - exquisite motor necessary for first life - video http://www.youtube.com/watch?v=W3KxU63gcF4
bornagain77
July 6, 2010, 07:36 AM PDT
So gpuccio, in layman's terms, what you are trying to tell me is that they actually knew when they wrote the paper that an unbiased reading of GC content directly indicated the 1177 orphans, or at least a very high percentage of them, were active protein encoding regions? That is truly incredible!!! A more biased example of science would be hard to find if this is true. To quote your quote from the paper: "The orphans have a GC content of 55%, which is much higher than the average for the human genome (39%) and similar to that seen in protein-coding genes with cross-species counterparts (53%)." gpuccio, as far as I can see, that just about seals the deal for Intelligent Design because, as you well know, there simply is no way for material processes, operating within the known laws of physics, especially the second law and Conservation of Information, to account for the generation of even one completely unique gene/protein, much less 1000. 1000 completely unique genes in humans is like overkill times 1000. Sure, the t's have to be crossed and the i's dotted, i.e. orphans have to be properly matched (proteins to genes) and their isolation from primates has to be ensured, but other than that task, which is certainly easier said than done, the evidence is certainly VERY ID friendly.
bornagain77
July 6, 2010, 06:27 AM PDT
BA: here is another interesting piece of the paper: "ORF lengths. The orphans have a GC content of 55%, which is much higher than the average for the human genome (39%) and similar to that seen in protein-coding genes with cross-species counterparts (53%). The high-GC content reflects the orphans’ tendency to occur in gene-rich regions. We examined the ORF lengths of the orphans, relative to their GC-content. The orphans have relatively small ORFs (median 393 bp), and the distribution of ORF lengths closely resembles the mathematical expectation for the longest ORF that would arise by chance in a transcript-derived from human genomic DNA with the observed GC-content (SI Fig. 4)." IOW, the GC content in human orphans points to them being protein coding genes, but as they have already decided that they are not, they interpret the fact the other way round: as they are GC rich, it is more likely that they appear to contain long enough ORFs, because a stop codon is less likely to be observed frequently (stop codons are AT rich). Bias, again. You assume what you want to find. From Wikipedia: "GC ratios and coding sequence: Within a long region of genomic sequence, genes are often characterised by having a higher GC-content in contrast to the background GC-content for the entire genome."
gpuccio
July 6, 2010, 05:37 AM PDT
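The stop-codon point raised in the comment above can be illustrated with a toy model. The sketch assumes (this is an assumption of the example, not something stated by the paper or the commenters) that bases are drawn independently with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2; real genomic sequence is not this simple. The function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Chance that a random codon is TAA, TAG or TGA under the toy model."""
    at_base = (1.0 - gc) / 2.0   # probability of A (and of T)
    gc_base = gc / 2.0           # probability of G (and of C)
    taa = at_base ** 3
    tag = at_base * at_base * gc_base
    tga = at_base * gc_base * at_base
    return taa + tag + tga


def expected_codons_per_stop(gc):
    """Mean number of codons read up to and including the first stop (geometric)."""
    return 1.0 / stop_codon_probability(gc)


# GC fractions quoted in the thread: genome average 39%, orphans 55%.
for gc in (0.39, 0.55):
    print(f"GC {gc:.0%}: P(stop) = {stop_codon_probability(gc):.4f}, "
          f"about {expected_codons_per_stop(gc):.1f} codons per stop")
```

Under this toy model a GC-rich sequence hits a stop codon less often, so longer open reading frames arise by chance; it says nothing either way about whether a given ORF is actually translated.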
gpuccio, I realized something about the "kicking the orphans out in the street" paper: not only did the authors completely fail to cite the literature that argues strongly for a rigorous effort to match orphan protein sequences to orphan genetic sequences, they also failed to apply their criteria consistently, i.e. they should have gone through the "accepted" gene coding sequences to eliminate all the ones that have failed to be matched directly to proteins so far (perhaps several thousand). But more importantly they also failed to mention, as I pointed out earlier in 129 and 133, that there actually are unbiased methods for determining the likelihood of whether a gene is protein encoding or not that are completely "theory neutral". As I referenced in 129: codon equiprobability of determining protein encoding https://uncommondescent.com/intelligent-design/proteins-fold-as-darwin-crumbles/#comment-358458 and as I referenced in 133: Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence: https://uncommondescent.com/intelligent-design/proteins-fold-as-darwin-crumbles/#comment-358463 Thus the authors truly are without excuse, since they allowed their preconceived ideas to directly dictate what the evidence must say instead of using the unbiased scientific methods for determining the likelihood of protein encoding capability, methods that were readily, and easily, available to them to work with. Indeed it seems the thought of being unbiased in their study did not even cross their minds, as you pointed out earlier: “Such a model would require a prodigious rate of gene birth in mammalian lineages and a ferocious rate of gene death erasing the huge number of genes born before the divergence from chimpanzee. We reject such a model as wholly implausible.” I wonder what Francis Bacon, who popularized the scientific method, would think about that statement?
bornagain77
July 6, 2010, 04:20 AM PDT
veilsofmaya (#171): Your attempt at defending what cannot be defended has become so generic, non-technical and convoluted that frankly I will not go on discussing it. As I believe that you are really convinced of what you say, I respect your opinions. Just a couple of quick comments on your final sum up:
The question being asked is, in the absence of some other systematic method, if the only thing you know about a group of ORFs is that they are all orphans in respect to its closest relatives, are the vast majority likely to be non-coding? A specific test was hypothesized, rigorously applied and tested. The result? They indeed found that, based on pre-existing research, only 12 out of 1,177 reported protein encoding. So, it appears the real complaint here isn’t that the prediction was actually applied, was wrong or that it would be applied over and above research that suggested any specific orphan actually did code proteins, but that it was based on common descent.
The only test was whether new genes were new, which is a tautology. I can't see how that can be seriously "hypothesized, rigorously applied and tested". The observation about reported proteins was in no way the object of the paper, but only an indirect confirmation which, in itself, was not convincing, as I have already argued. Otherwise, the paper would have been something like: "Let's see how many of the human orphan genes, which are obviously new, have some independent demonstration of a corresponding protein in scientific literature". And my complaint is not that their model and final conclusions are based on common descent. As you should know, I have nothing against common descent. My complaint is that their model and final conclusions are based on the blind assumption that new genes must be causally explained by darwinian theory, and therefore need times and modalities compatible with RV and NS to emerge (or at least, compatible with darwinists' biased evaluation of those times and modalities: as you know, for us IDists no empirical time is compatible with that explanatory model :) ). My complaint is that they categorize as implausible what they are observing (1000 new ORFs, potential protein coding genes, in humans) only because they can't explain how they could have arisen in such a short time by RV and NS. I will then answer your explicit question:
I’ll pose the same question to you as I have to Bornagain77: Are you suggesting that Darwinism fails at explaining phenomena or that it cannot be used to explain phenomena because it’s somehow biased against or “hates” ID?
It's very clear for me. I have always "suggested" (indeed, stated and tried to demonstrate in detail) that "Darwinism fails at explaining phenomena", and that is obviously because it is a theory both internally inconsistent and unsupported by facts. That statement refers only to darwinism as a causal model, not to common descent, just to stay clear. Its inconsistency rests mainly in the fact that it uses a causal model based on RV + NS, and that the random part (RV), if quantitatively analyzed (which is absolutely necessary in an explanatory model), does not have the probabilistic power to explain what it should explain. That, in itself, makes the explanatory model logically inconsistent. It is also unsupported by facts because no empirical evidence exists that such a mechanism can really cause macroevolution. I hope my position is clear. The problem of the bias, or hate, against ID is a logical consequence of that. If you are part of a majority holding practically all power in the scientific community, and that in the name of a wrong theory, it is very easy to predict your feelings against those in a minority group who are seriously trying to point to the falsity of your theory, and even to give an alternative explanation which is incompatible with your personal beliefs and general views of reality. That's elementary psychology. Calling it "bias" is perhaps a little too refined...
gpuccio
July 6, 2010, 12:03 AM PDT
gpuccio, you may find this paper interesting for a rough figure as to exactly how many orphan enzymes remain unsequenced and therefore unmatched to genetic sequences: Orphan enzymes could be an unexplored reservoir of new drug targets Excerpt: Despite the immense progress of genomics, and the current availability of several hundreds of thousands of amino acid sequences, >39% of well-defined enzyme activities (as represented by enzyme commission, EC, numbers) are not associated with any sequence. There is an urgent need to explore the 1525 orphan enzymes (enzymes having EC numbers without an associated sequence) to bridge the wide gap that separates knowledge of biochemical function and sequence information. Strikingly, orphan enzymes can even be found among enzymatic activities successfully used as drug targets. Here, knowledge of sequence would help to develop molecular-targeted therapies, suppressing many drug-related side-effects. http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6T64-4JKSJS1-4&_user=10&_coverDate=04%2F30%2F2006&_rdoc=1&_fmt=high&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1391475609&_rerunOrigin=google&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=b41619d945ab08d7048188505925db1a
bornagain77
July 5, 2010, 06:22 PM PDT
As well, to reiterate their primary justification in removing the orphans, as stated in their own words and as quoted by gpuccio: "And so, the answer is simple: “Such a model would require a prodigious rate of gene birth in mammalian lineages and a ferocious rate of gene death erasing the huge number of genes born before the divergence from chimpanzee. We reject such a model as wholly implausible.” But such a model is not implausible at all. It is only implausible for darwinists. But if you reflect on the obvious fact that humans and chimps are very different, that humans are practically unique in their ability of abstract intelligent thoughts, that they have changed the world they live in under many respects, that they have built varied civilizations and explored reality scientifically and in other ways, then few hundreds of new genes in their basic level proteome information could in some way seem justified…"
bornagain77
July 5, 2010, 04:08 PM PDT
veils, to reiterate 161 and 162, since you seem to have completely missed them: the whole point, as is amply illustrated in the paper I referenced in 157, is that they were not the least bit justified to remove orphans from the Gene database based primarily, and overwhelmingly, on unwarranted neo-Darwinian assumptions, and they should have in fact conducted as thorough a search of orphan enzyme activity in humans as possible in order to fully validate the orphans' removal from the Gene database, as the article I referenced clearly stated:
Results: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles.
Conclusion: This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.
veils, if you agree with their extremely biased methodology, a methodology which in fact gave only a passing nod to a thorough cross check, you are in fact condoning forcing the evidence to fit a preconceived solution. No wonder Darwinists have been able to get away with such deception for so long, they literally make up the rules of science as they go along! i.e. Why is Darwinism true? Because the evidence says so. Why does the evidence say so? Well, because Darwinism is true of course! As gpuccio clearly said earlier, “it would be hard to find more circular reasoning”.
bornagain77
July 5, 2010, 04:02 PM PDT
@gpuccio (#163) You wrote:
Yes, adding all ORFs of a certain length to a library is standard procedure. Obviously, anybody can give further analysis and propose reviews. But the main standard to hypothesize that a gene is a protein coding gene in bioinformatics is that it is an open reading frame, whether it has homologues or not. That’s why orphan genes are called orphan genes. Because they are possible protein coding genes (ORFs) without homologues (orphan).
Gpuccio, First, it seems you've assumed the "standard procedure" currently in place is "non-biased", has a good track record or has not been made obsolete by recent discoveries about identifying ORFs. On what basis have you made this assumption? Second, it seems your argument depends on the end result of actually removing these orphans from the library based on this qualification alone. However…
- The removal has not yet occurred. This was a test developed to check a prediction about whether orphan genes were likely to be coding.
- Merely having a status of "orphan" is not recommended as the only criterion for removing them from the library. (All orphans should be removed regardless of any research, such as the discovery of coding proteins)
- Future ORFs would not be barred if other research suggested they were protein-coding despite being an orphan.
- Removal is not a claim of universal non-function. Nor does it demand that future search for function will not be performed.
Now, it is not new or surprising that darwinists hate orphan genes. They fit very badly in their causal model, and it’s not a case that Larry Moran has been particularly critic of the concept itself. One of the main purposes of darwinists is to demonstrate that orphan genes do not exists, or that if they exist they are very, very few…
I fail to see how this "test" reveals "hate" for orphan genes. I'll pose the same question to you as I have to Bornagain77: Are you suggesting that Darwinism fails at explaining phenomena, or that it cannot be used to explain phenomena because it's somehow biased against or "hates" ID?
The paper made it quite clear that there is currently no scientific justification for excluding ORFs merely because they fail to show evolutionary conservation. This is because an ORF can be labeled many different ways, including as an orphan, as protein-coding, etc. That any particular gene is an orphan does not mean it cannot be protein-coding or that it is universally non-functional. The question being asked is: in the absence of some other systematic method, if the only thing you know about a group of ORFs is that they are all orphans with respect to their closest relatives, are the vast majority likely to be non-coding? A specific test was hypothesized and rigorously applied. The result? They indeed found that, based on pre-existing research, only 12 out of 1,177 were reported to encode proteins.
So, it appears the real complaint here isn't that the prediction was actually applied, that it was wrong, or that it would be applied over and above research suggesting that a specific orphan actually did code for proteins, but that it was based on common descent.
To reiterate, this is similar to Bornagain77's claim regarding NP-complete problems. That quantum computing may be unable to solve all NP-complete problems in polynomial time does not mean a specific quantum algorithm could not be used to solve a specific NP-complete problem by exploiting its problem space. Suggesting that all orphans are unlikely to be protein-coding is not the same as banning any specific orphan from the library even if it were found to encode proteins. Nor would it prevent any specific orphan from being found to have some other function.
veilsofmaya
July 5, 2010, 03:46 PM PDT
rna: let's try to come to a reciprocal understanding:
a) The structures of 18-19 and DX may be virtually identical, but their folding and functional properties are not, at least according to what the creators of DX state. Anyway, that's not really important, because both are the product of directed engineering.
b) Both 18-19 and DX differ from the original B family by 16 AAs, 20%. That's a lot, especially if you consider that those AAs are exactly the mutations which were actively selected to confer both folding and function.
c) I have seen no evidence in the papers that the original B family sequences had the same structure, or any significant function. As far as I know, that's only your conjecture. The 80% sequence similarity is easily explained by the history of those proteins (after all, they were evolved using the B family as a seed), but there is no evidence that in itself it confers structure or function. The only information we have about function in the original B family is that they stuck to ATP enough to be separated from the other sequences. If you have any other information about structure and function of the original sequences, I would be happy to know it.
d) I am well aware that the selection procedure used in the first rounds was not protein engineering. If you read my posts, you will see that I have never said anything different. It's the three rounds of mutational PCR followed by selection which are protein engineering. And it's those 3 rounds which found the necessary mutations to confer folding and "function" (in the sense of a strong binding to ATP).
e) I think that something more can be said about the function of myoglobin than about protein 18-19. The following is from wikipedia:
"The binding affinities for oxygen between myoglobin and hemoglobin are important factors for their function. Both myoglobin and hemoglobin bind oxygen well when the concentration of oxygen is really high (e.g. in the lung), however, hemoglobin is more likely to release oxygen in areas of low concentration (e.g. in tissues). Since hemoglobin binds oxygen less tightly than myoglobin in muscle tissues, it can effectively transport oxygen throughout the body and deliver it to the cells. Myoglobin, on the other hand, would not be as efficient in transferring oxygen. It does not show the cooperative binding of oxygen because it would take up oxygen and only release it in extreme conditions. Myoglobin has a strong affinity for oxygen that allows it to store oxygen in muscle effectively. This is important when the body is starved for oxygen, such as during anaerobic exercise. During that time, the carbon dioxide level in the blood stream is extremely high and lactic acid concentration builds up in muscles. Both of these factors cause myoglobin (and hemoglobin) to release oxygen, protecting the body tissues from getting damaged under harsh conditions. If the concentration of myoglobin is high within the muscle cells, the organism is able to hold its breath for a much longer period of time. Myoglobin, an iron-containing protein in muscle, receives oxygen from the red blood cells and transports it to the mitochondria of muscle cells, where the oxygen is used in cellular respiration to produce energy. Each myoglobin molecule has one heme prosthetic group located in the hydrophobic cleft in the protein. The function of myoglobin is notable from Millikan's review (1) in which he put together an accomplished study to establish that myoglobin is formed adaptively in tissues in response to oxygen needs and that myoglobin contributes to the oxygen supply of these tissues. Oxymyoglobin regulates both oxygen supply and utilization by acting as a scavenger of the bioactive molecule nitric oxide. Nitric oxide is generated continuously in the myocyte. Oxymyoglobin reacts with NO to form harmless nitrates, with concomitant formation of ferric myoglobin, which is recycled through the action of the intracellular enzyme metmyoglobin reductase. Flogel (2) conducted a study that showed how the interaction of NO and oxymyoglobin controls cardiac oxygen utilization."
f) The problem about defining function is that, if you are interested in the occurrence of function in random sequences as evidence for the darwinian model, then you should stick to selectable functions which can confer a reproductive advantage in a living being. Do you really believe that protein DX has such a property? Or, even better, one of the original B family sequences? And do you really believe that its incorporation in bacteria was harmful only because it was overexpressed? What scenario would you suggest where "normally expressed" protein DX would confer a reproductive advantage? Or, even better, one of the original B family sequences?
g) You ask: "But what choice does the experimenter have if he is looking for atp-binding molecules." The answer is simple: the experimenter's intention was not to look for atp-binding molecules. The experimenter's declared purpose was to look for naturally occurring functional sequences in a random library. To do that, all they had to do was to select and expand atp-binding sequences (as they have done in the first rounds) and then study those sequences and prove that they were functional. So, why have they gone on modifying those sequences by designed evolution, if not in order to build some apparent function which obviously was not there in the beginning, so that they could state that they had found "function" in a random library? That procedure is completely unnecessary and biased. Its purpose was not to prove what had to be proved, but to give the false impression of having proved it. Again, this is not science, but ideologically driven research.
gpuccio
July 5, 2010, 02:25 PM PDT
gpuccio, I actually have a little evidence about the timespan between the OOL and the Cambrian explosion that gives some strong clues as to the "designed terra-forming" that was going on during that time. Maybe as soon as we get off this topic I will be able to show you some of it.
bornagain77
July 5, 2010, 02:20 PM PDT
Sorry for the wrong choice of words, rna, but do you agree the exponent must be added to correctly keep in line with what is going on, at least to a certain extent? If not, please go through only 19 original libraries to prove me wrong. I can easily see the correct approach for determining true probability very quickly falling in line with Sauer's 1 in 10^64 number and with Axe's 1 in 10^77 number. As well, rna, it seems to me that it is only through sheer want of any evidence whatsoever with which to make their case, even remotely, that evolutionists are so willing to claim that this "hoodwinked" 1 in 10^12 result represents anything close to true functionality.
bornagain77
July 5, 2010, 02:15 PM PDT
#144, 145 bornagain
"... the original 1 in 10^12 number be MULTIPLIED by at least 18 or 19 ..."
Your choice of words.
rna
July 5, 2010, 02:03 PM PDT
#146 gpuccio, let's start with the simple things:
" ..That’s why I defined “gross” the structure of protein 18-19, and I suppose that the structure of protein DX could be considered more refined. ... "
The structures of 18-19 and DX are virtually identical, as you can see from the Journal of Molecular Biology paper I quoted above, and as might be expected from their ~80% sequence conservation. Since the proteins from the original B family have similar levels of sequence similarity, they most likely also adopt a very similar structure. There is ample precedent for that in hundreds of other protein families. Many naturally occurring proteins need the addition of ligands, other additives or binding partners to be stably folded and to remain functional. (A whole class of functional proteins is designated as 'intrinsically unfolded'.) If you want to work with such proteins in the lab, you normally find a family member with more suitable properties, usually from a different organism and with some mutations in the sequence.
" ... It is no more function than the capacity of EDTA to subtract calcium from blood in a test tube..."
So myoglobin is in your opinion not a functional protein? Go on, try living without it. The same goes for ferritin or calbindin or ...
"It is completely wrong to state, as many have done, that protein 18-19 or protein DX are the product of a purely random procedure."
Of course the procedure was not random: it was an experiment. What was random was the starting sequence library. The first question of the experiment was how many proteins with a given function (defined here by the experimenter as the simple function of ATP-binding ability) one can find in a number (10^12) of random sequences. The answer is at least 4 dominating families with this capability after eight rounds of selection. The selection procedure employed in the first eight rounds is just a procedure to find those sequences; it does not modify what is in the original library. One could instead synthesize 10^12 random DNA sequences, translate them into proteins one by one and characterize each individual protein for ATP-binding ability. This is not possible in a normal lab, so you have to use a selection + amplification procedure as your magnifying glass to find the functional sequences. No 'engineering' so far.
The really amazing number is that 0.1% of all sequences bound to ATP after the first round, i.e. 10^9 of the 10^12 sequences. Thus, function in the form of ATP-binding is very abundant in random sequences. This is contrary to all the claims, very abundant on this blog, that function is very, very hard to find starting from random sequences. It is even more astonishing when one takes into account that the 10^12 sequences synthesized cover only a tiny, and also randomly selected, fraction of the overall search space (20^80).
The additional rounds use 'random mutations' to improve the originally selected sequences. To characterize this as 'protein engineering' is a bit misleading in my opinion. And this can and does only improve a function that was already present after eight rounds of selection. The only thing that is really designed in this experiment is the 'fitness landscape' for the selection: a one-dimensional one with ATP-binding capability as the only parameter. But what choice does the experimenter have if he is looking for ATP-binding molecules?
rna
July 5, 2010, 01:28 PM PDT
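A quick check of the arithmetic in rna's comment above, as a sketch that takes the comment's figures at face value (a library of 10^12 random sequences, 0.1% retained after the first round, and 80-residue proteins built from 20 amino acids). The variable names are illustrative and are not taken from the paper under discussion.

```python
from math import log10

# Figures taken at face value from the comment; the names are illustrative.
LIBRARY_SIZE = 10**12      # random sequences in the starting library
SEQUENCE_LENGTH = 80       # residues per random protein
ALPHABET_SIZE = 20         # standard amino acids

# 0.1% of the library, computed with integers to avoid rounding error.
binders_round_one = LIBRARY_SIZE // 1000

# Number of distinct 80-residue sequences, as an exact Python integer.
search_space = ALPHABET_SIZE ** SEQUENCE_LENGTH

# Work in log10 so the sizes can be compared as powers of ten.
log10_search_space = log10(search_space)
log10_fraction_sampled = log10(LIBRARY_SIZE) - log10_search_space

print("retained after round one:", binders_round_one)
print("log10 of search space:", round(log10_search_space, 2))
print("log10 of fraction sampled:", round(log10_fraction_sampled, 2))
```

On these assumptions the library covers roughly one part in 10^92 of the 20^80 possible sequences, which is the "tiny fraction" the comment refers to. The calculation says nothing about how specific or selectable the first-round binding was, which is the point in dispute.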
BA: Yes, I had read of those fossils when Petrushka pointed to them. I must say that I am intrigued, but cautious. I am intrigued because I do think that the vast spread of time between OOL and the Ediacara - Cambrian explosions is really a bit of a mystery, and I would appreciate any new information on what could have happened in those 3 billion years or more. Cautious, because obviously those fossils are too little, and there are many possible interpretations of them.
Even the appearance of eukaryotes is in itself something of a mystery. I have recently found a paper which states that the original ancestor was more a eukaryote than a prokaryote. So, both OOL and what came after still hold great challenges to our knowledge. It's not an exaggeration to say that the only thing I am really sure of is that biological information is designed :)
Anyway, I don't think that if we discover new and strange ancient forms of life, like the fossils in question, that will change in any measure the powerful meaning of the two metazoan explosions. The Ediacara and Cambrian events are so strange and amazing in themselves that I doubt that any new finding will be able to lessen their cognitive impact. And let's remember that those explosions are not about the appearance of eukaryotic or multicellular life, but rather about the sudden appearance of multiple, complex, macroscopic body plans, in two successive waves, probably unrelated one to the other. That's something, isn't it?
gpuccio
July 5, 2010, 01:04 PM PDT
gpuccio, though slightly off topic (but not by much), this article which came up on Crevo may interest you:
Do New Fossils Soften the Cambrian Explosion?
Excerpt: Second, these fossils are of dubious interpretation. They may be nothing more than fairy-ring colonies growing outward like bacteria in a Petri dish. Perhaps the matlike remains were flexible enough to fold on the inside in some cases. There is no indication of a coelom or tissue differentiation. They do not appear transitional to Ediacaran fossils, let alone to Cambrian animals.
http://www.creationsafaris.com/crev201007.htm#20100705a
bornagain77
July 5, 2010, 11:53 AM PDT
bornagain77
they should have in fact conducted as thorough a search of orphan enzyme activity in humans as possible in order to fully validate the orphans removal from the Gene database
This is in fact quite interesting. Perhaps a happy medium in the meantime would be to classify these "orphans" as "currently unknown" and leave them be for now? After all, nobody is talking about deleting the data itself, just reorganizing it really. In that light, do you still object so fiercely?
Ken Morley
July 5, 2010, 10:48 AM PDT
veilsofmaya:
"If all you knew about a ORF was that it was an orphan (and it were long enough and happen by chance to fall between start and stop signals) would we have been justified in adding it to library?"
Yes, adding all ORFs of a certain length to a library is standard procedure. Obviously, anybody can give further analysis and propose reviews. But the main standard to hypothesize that a gene is a protein coding gene in bioinformatics is that it is an open reading frame, whether it has homologues or not. That's why orphan genes are called orphan genes. Because they are possible protein coding genes (ORFs) without homologues (orphan).
Now, it is not new or surprising that darwinists hate orphan genes. They fit very badly in their causal model, and it's no accident that Larry Moran has been particularly critical of the concept itself. One of the main purposes of darwinists is to demonstrate that orphan genes do not exist, or that if they exist they are very, very few... Couple that with a firm refusal to discuss OOL (a la Petrushka), and you have partially solved that bad problem of having to explain how genes arose.
Human orphans are even worse than generic orphans. You have read the point in the paper: simply stated, if we find that humans have hundreds of new protein coding genes, which are not shared even by other primates, how do we explain that? After all, humans are recent, they reproduce slowly, and they are not that many. Some reshuffling and a bundle of mutations in HARs can always be tolerated, provided that we can affirm that the genomes of humans and chimps are 99% identical or something similar. But 1000 new genes? And so, the answer is simple:
"Such a model would require a prodigious rate of gene birth in mammalian lineages and a ferocious rate of gene death erasing the huge number of genes born before the divergence from chimpanzee. We reject such a model as wholly implausible."
But such a model is not implausible at all. It is only implausible for darwinists. If you reflect on the obvious fact that humans and chimps are very different, that humans are practically unique in their capacity for abstract intelligent thought, that they have changed the world they live in in many respects, that they have built varied civilizations and explored reality scientifically and in other ways, then a few hundred new genes in their basic level proteome information could in some way seem justified...
And if it is true, as ID holds, that new genes do not come out of nothing, nor do they come out of slow RV + NS, then you can see that the model our darwinist friends reject as "wholly implausible" is not implausible at all. That's what is called a cognitive bias.
gpuccio
July 5, 2010, 10:48 AM PDT
veils, if you agree with their extremely biased methodology, a methodology which in fact gave only a passing nod to a thorough cross-check, you are in fact condoning forcing the evidence to fit a preconceived solution. No wonder Darwinists have been able to get away with such deception for so long: they literally make up the rules of science as they go along! I.e., why is Darwinism true? Because the evidence says so. Why does the evidence say so? Well, because Darwinism is true, of course! As gpuccio clearly said earlier, "it would be hard to find more circular reasoning".
bornagain77
July 5, 2010, 10:45 AM PDT
veils, the whole point, as is amply illustrated in the paper I referenced in 157, is that they were not in the least justified in removing orphans from the Gene database based primarily, and overwhelmingly, on unwarranted neo-Darwinian assumptions, and they should in fact have conducted as thorough a search of orphan enzyme activity in humans as possible in order to fully validate the orphans' removal from the Gene database, as the article I referenced clearly stated:
Results: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles.
Conclusion: This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.
bornagain77
July 5, 2010, 10:36 AM PDT
@bornagain77 (#156)
Born, please point out exactly where universal non-function was claimed. For example, this was addressed in the article...
Despite having gene-like characteristics, these open reading frames may not encode proteins. Instead, they might have other functions or possibly none at all.
Which I noted here...
Again, not being classified as a gene does not necessitate universal non-function.
Just because an ORF or gene is classified as an orphan does not mean it is universally non-functional. As with "junk DNA", this misrepresentation has been addressed on multiple occasions, yet it continues to be brought up time and time again.
veilsofmaya
July 5, 2010, 10:27 AM PDT
In (#158) I should have written: If all you knew about an ORF was that it was an orphan (and it was long enough and happened by chance to fall between start and stop signals), would we have been justified in adding it to the library?
veilsofmaya
July 5, 2010, 10:14 AM PDT
@ gpuccio (#151)
The darwinist a priori assumptions in their reasoning are especially obvious here:
Gpuccio, again, the question being asked by the paper was: were these genes incorrectly added in the first place? If so, what is one possible way to help identify false positives now and in the future?
The experimental evidence is thus consistent with our conclusion that the vast majority of nonconserved ORFs are not protein-coding. In the handful of cases where experimental evidence exists or is found in the future, the genes can be restored to the catalog on a case-by-case basis.
Note that the section you quoted from was: Orphans Do Not Represent Protein-Coding Genes. The misunderstanding is similar to the fact that quantum computing cannot solve all NP-complete problems in polynomial time but can be highly effective on specific NP-complete problems.
If all you knew about a gene was that it was an orphan (and it was long enough and happened by chance to fall between start and stop signals), would we have been justified in adding it to the library? If all orphans were coding, then all 1,177 would have had to have changed from their ancestor, which they reject as implausible.
Now, you might suggest this is biased, but remember the question is whether ORFs should be added merely because they are orphans, which is itself a designation based on whether they exist in species with a common ancestor. We can rephrase this as, "Should an ORF be considered a gene merely because it's not present in a species which shares a common ancestor?" The results of the paper strongly suggest the answer is no.
veilsofmaya
July 5, 2010, 10:10 AM PDT
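The working definition of an ORF used in this exchange (a stretch that is long enough and falls between start and stop signals) can be sketched in a few lines of Python. This is a deliberately naive illustration, not the procedure used by any annotation pipeline or by the paper under discussion: it scans only the forward strand, treats ATG as the only start codon, and the function name `find_orfs` and the `min_codons` cutoff are assumptions chosen for the example.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each forward-strand ORF.

    An ORF here runs from an ATG through the first in-frame stop codon.
    min_codons counts the start and stop codons and is only a placeholder
    chosen so that the toy example stays short.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == START_CODON:
                # Walk codon by codon until the first in-frame stop.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        n_codons = (j + 3 - i) // 3
                        if n_codons >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j                   # resume scanning after the stop
                        break
            i += 3
    return orfs


# Toy sequence: one ORF of five codons (ATG AAA TTT GGG TAA) in frame 2.
example = "CCATGAAATTTGGGTAACC"
print(find_orfs(example))
```

A sequence passing such a filter is only a candidate. Whether it actually encodes a protein, and whether a lack of homologues should count against it, is exactly what the two sides of this thread disagree about.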
BA: Interesting paper. You are really untiring! So, we have both orphan genes and orphan enzymatic activities. I hope some of those can be matched...
gpuccio
July 5, 2010, 09:35 AM PDT
And veils, did you happen to notice the article I referenced in 153?
A survey of orphan enzyme activities
Abstract
Background: Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such “orphan” activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature.
Results: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles.
Conclusion: This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.
http://www.biomedcentral.com/1471-2105/8/244
And though this study is not solely a study of human orphans, it surely gives ample reason to believe that they were much, much too hasty in removing the orphans from the gene count, and in fact gives fairly clear evidence that Darwinists are impeding science by their methodology of excluding sequences from the gene database that could go a long way toward identifying the large percentage of orphan enzyme activities!!!
bornagain77
July 5, 2010, 09:31 AM PDT
veilsofmaya (#154): I think I had in some way anticipated those points in my post #151. While I appreciate your contribution to understanding what the article is really saying, I believe that our main point about a serious cognitive bias remains valid, as I have tried to show. I will be happy to have any further feedback from you on that.
gpuccio
July 5, 2010, 09:19 AM PDT