Uncommon Descent Serving The Intelligent Design Community

Consider the opossum: the evidence for common descent

Remarkably, the recent spate of articles over at Evolution News and Views (see here, here and here) attacking the claim that vitellogenin pseudogenes in humans provide scientific evidence for common descent all missed the point that Professor Dennis Venema was making. His point was not about the mere existence of pseudogenes, but about the spatial pattern in the genes. The pattern is strikingly clear if we compare chickens with opossums. And since humans belong to the same class as opossums (namely, mammals), any scientific evidence that chickens and opossums have a common ancestor also counts as strong prima facie evidence that chickens and humans have one.

I’d like to acknowledge at the outset the kind assistance given to me by Professor S. Joshua Swamidass, without whose thoughtful advice this post would not have been possible. My expertise lies in the field of philosophy rather than biology. I have endeavored to be as careful as possible in stating the scientific case for common descent; however, if there are any (unintentional) scientific errors in this post, then I take full responsibility for them. I would also like to make it clear that despite my criticisms (in this post) of Dr. Jeffrey Tomkins and the authors of the three articles written in response to Professor Venema over at Evolution News and Views, I do not wish to impugn their personal and scientific integrity.

Professor Venema’s five-part series on Vitellogenin and Common Ancestry is titled, Vitellogenin and Common Ancestry: Does BioLogos have egg on its face? (February 11, 2016). Venema responds to criticisms made by creationist Dr. Jeffrey Tomkins in sections three and four.

As I see it, the recent articles written in response to Professor Venema over at Evolution News and Views suffer from seven fundamental flaws:

(i) ignoring the main evidence for common descent;

(ii) faulty statistics: misconstruing the evidence for common descent;

(iii) changing the definitions of key terms (e.g. “pseudogene”);

(iv) obscuring the issue, by appealing to possible functions of pseudogenes, as an explanation for their presence in both chickens and mammals;

(v) engaging in wild speculation which goes far beyond the available evidence;

(vi) reliance on flawed analogies; and

(vii) theologizing the argument for common descent.

Let’s look at each in turn.

1. Ignoring the main evidence for common descent

Let me begin with a short definition. The term synteny simply refers to the condition of two or more genes, which may or may not be linked, being located on the same chromosome. The term shared synteny refers to the fact that in different species of animals, the spatial arrangement of genes on a chromosome is often conserved: not only do we find the same genes, but we find them in the same order along the same chromosome. In a nutshell, Professor Venema’s argument is that shared synteny is best explained by the hypothesis of common ancestry.
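To make the idea of shared synteny concrete, here is a minimal Python sketch. The gene names are the ones discussed in this post, but the orders listed are placeholders chosen purely for illustration, not the arrangement reported by Brawand et al., and the function name same_relative_order is hypothetical. The sketch checks whether the genes two species have in common appear in the same relative order, allowing for a gene that has been lost in one lineage and for the whole block being read from the opposite strand.

```python
def same_relative_order(ref_genes, other_genes):
    """Return True if the genes shared by both lists occur in the same
    relative order, read either forwards or backwards.
    Assumes each gene name appears at most once per list."""
    ref_set, other_set = set(ref_genes), set(other_genes)
    shared_ref = [g for g in ref_genes if g in other_set]
    shared_other = [g for g in other_genes if g in ref_set]
    if len(shared_ref) < 2:
        return False  # too few shared genes to say anything about order
    return shared_ref == shared_other or shared_ref == shared_other[::-1]


# ASSUMPTION: illustrative gene orders only, not taken from any genome assembly.
species_a = ["ELTD1", "VIT1", "SSX2IP", "VIT2", "VIT3", "CTBS"]
species_b = ["ELTD1", "VIT1", "SSX2IP", "VIT2", "CTBS"]           # one gene lost
species_c = ["CTBS", "VIT3", "VIT2", "SSX2IP", "VIT1", "ELTD1"]   # block reversed
shuffled = ["VIT2", "ELTD1", "CTBS", "VIT1", "SSX2IP", "VIT3"]    # order scrambled

print(same_relative_order(species_a, species_b))
print(same_relative_order(species_a, species_c))
print(same_relative_order(species_a, shuffled))
```

The first two comparisons preserve the order and the third does not. A real synteny analysis works on aligned genomic coordinates rather than tidy lists of names, so this is only a picture of the concept.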

The evidence for the common descent of egg-laying birds (such as the chicken) and mammals is handily summarized in the third article of Professor Venema’s five-part series, which is titled, Vitellogenin and Common Ancestry: Reading Tomkins:

Evolutionarily speaking, the observed shared synteny for the VIT regions in humans and chickens makes a prediction about what we should find in other mammals. Since the last common ancestral population of humans and chickens lived prior to the evolution of all mammals, we would expect to (at least potentially) find these regions in any other mammal we care to sequence – with the understanding that these sequences might be missing if they have been lost in a particular lineage…

The researchers thus looked for VIT genes in a diverse number of mammals, and, not surprisingly, found them in the same arrangement as seen in chickens and humans. One example comes from a marsupial mammal – the opossum. Just like for humans, the opossum VIT genes are riddled with mutations that prevent them from being translated into proteins. Despite those mutations, however, enough VIT gene remnants and their non-gene flanking sequences remain in opossums to easily identify them – nested between the same functional genes we see in humans and chickens. In fact, in opossums, more of the VIT2 and VIT3 sequences remain than in the human genome, and more of the DNA flanking VIT1 remains the same.

These findings, then, match what common ancestry predicts if indeed humans, chickens, and opossums share a common ancestral population deep in the past. Opossums, since they do not lay eggs, do not require VIT genes any more than placental mammals like humans do. Nonetheless, they too have remnants of these genes in the exact places in their genomes that common ancestry would predict. Moreover, the researchers found that several other placental and marsupial mammals also have VIT pseudogenes. As you might expect, however, egg-laying mammals (such as the platypus) retain a functional VIT gene that they use to perform bulk yolk transfer to their embryos.

In summary, what we see is a broad pattern of evidence that supports the hypothesis that placental and marsupial mammals share common ancestral populations with egg-laying mammals, and more distantly, other egg-laying vertebrates such as birds…

The true “main evidence” for the remains of VIT genes in the human genome is as we have discussed: the overall match of sequences between placental / marsupial mammals and egg-laying organisms over large spans of DNA, including flanking regions. This is the evidence that needs to be addressed – and Tomkins does not even mention it, let alone address it. (Bolding mine – VJT.)

So, how did Professor Venema’s critics over at Evolution News and Views deal with this evidence? Amazingly, they almost completely ignored it.

Remarkably, the first Evolution News and Views post in response to Professor Venema, which is titled, Functional Pseudogenes and Common Descent (May 23, 2016), made no mention of opossums or other mammals. It only mentioned chickens and human beings. The article quoted a single sentence from Professor Venema referring to shared synteny, but it failed to provide a definition of this term, let alone a discussion of its significance. In other words, it completely missed the point that Venema was making.

The final ENV response to Professor Venema, titled, Humans, Chickens, and the Vitellogenin Pseudogene — Summing Up (Evolution News and Views, May 25, 2016), was no better. It totally ignored the evidence from shared synteny, and it focused exclusively on humans and chickens, as it tried to make a case that there was no scientific evidence for common descent:

Let’s take a moment to summarize the results of our comments on the vitellogenin pseudogene and its meaning for the question of universal common ancestry. According to the data, the debate concerns six total genes supposedly shared by humans and chickens

Three of those genes (ELTD1, SSX2IP, CTBS) are functional in both humans and chickens. No evidence of pseudogenes — shared or otherwise — is there.

Two of those genes (VIT2, VIT3) are functional in chickens and non-functional in humans, but according to the data our colleague Dr. Gauger showed, they are hardly found in humans at all. Arguably, they aren’t there.

One of those genes — the supposed “vitegellenin (sic) pseudogene” (VIT1) — is functional in chickens, but according to Tomkins (2015) it is also part of a functional gene in humans. It may not be making egg yolk but it’s not a non-functional stretch of DNA. Indeed, as Ann Gauger notes, the sequence alignment between the human and chicken versions is very low, so it’s not clear if they are the same gene.

Thus the relevant data yield the following conclusion: When we look at this block of six genes supposedly shared by humans and chickens, there are exactly zero non-functional pseudogenes shared between humans and chickens. (Bolding mine – VJT.)

The article says nothing about the spatial order of these genes, along the chromosome on which they are found.

Of the three articles written in response to Professor Venema over at Evolution News and Views, only Dr. Ann Gauger’s article, The Vitellogenin Pseudogene Story: Unequally Yoked (Evolution News and Views, May 24, 2016), discusses the significance of synteny, as well as the evidence from other mammals:

Synteny refers to how well chromosomal sequences from different species align with one another. Genes can be in the same general order and location between species, for example rat and mouse, or chimp and human. If they align well, evolutionists take the alignment as evidence for common ancestry. Sometimes, the gene sequences may be interrupted by deletions or insertions, and stop codons, which prevent the gene from making functional protein.

These “inactivated” genes are called pseudogenes, and are taken by evolutionists as further evidence for common descent. Their presence is explained as the remnants of once functional genes broken by mutation and no longer needed by the organism.

Egg-bearing animals use proteins called vitellogenins to transport nutrients in their egg yolk. Each vitellogenin is a very long complicated protein composed of many exons (stretches of DNA that must be copied into RNA and then spliced together into one contiguous piece before being translated into vitellogenin). An article by Dennis Venema lays out the supposed story: humans retain the remnants of DNA that used to code for egg yolk proteins called vitellogenins. Since humans don’t lay eggs, the argument goes, these “vitellogenin” pseudogenes, now long since mutated into near unrecognizability, must be inherited from common ancestors who did lay eggs. Venema claims that all three human “vitellogenin” pseudogenes, VIT1 through VIT3, show traces of sequence similarity to the functioning vitellogenin genes of the chicken…

This story is based on a 2008 paper by Brawand et al. that discusses yolk proteins in egg-laying animals and mammals. In that paper they identified the region of the chicken genome where vitellogenin genes are located, then found the similar regions in humans, dogs, and various marsupials and platypus, to see if vitellogenin pseudogenes could be found in the right syntenic neighborhoods. (Bolding mine – VJT.)

Dr. Gauger deserves credit for squarely addressing the evidence. She suggests that the shared synteny of the vitellogenin VIT1, VIT2 and VIT3 genes as well as the nearby ELTD1, SSX2IP and CTBS genes in chickens, marsupials and placental mammals could be due to some function which the vitellogenin gene fragments possess in mammals: “This similar order could be due to ancestry or functional reasons.” As we’ll see below, that won’t work – especially if the vitellogenin gene fragments are part of a long non-coding RNA, as is claimed by Dr. Jeffrey Tomkins (whose work Dr. Gauger cites). And as we’ll see in the following section, her claim that “evidence for vitellogenin pseudogenes in human and dog genomes… is very weak” is based on a flawed reading of the evidence. But at least Dr. Gauger fully realizes what the problem is, and she makes an honest attempt to address it.

Advantage: common descent

In a recent email communication, Professor S. Joshua Swamidass observes that the hypothesis of common descent readily explains two features of the data:

1. Why is the genetic signal stronger in the opossum (a marsupial mammal) than in human beings, who are placentals? The common descent model has an answer to this question: the various lines of mammals made the transition away from eggs at different times, so they inactivated their yolk genes at different times. How does the design model answer this question?

2. The identical spatial ordering of the genes (shared synteny) in different groups of animals. The common descent model can readily account for this fact: the shared ordering was inherited from a common ancestor. How does the design model explain it?

In short: every pattern that the common descent model explains is therefore evidence for common descent, or at least common descent plus design.

I’d like to ask Dr. Ann Gauger a question I posed to Dr. Cornelius Hunter in my previous post:

Do you accept that if hypothesis A readily explains an empirical fact F and hypothesis B does not, then F (taken by itself) constitutes scientific evidence for A over B? Or putting it another way, if a fact F is predicted by hypothesis A, and compatible with hypothesis B but not predicted by B, then do you agree that F constitutes scientific evidence for A over B? If not, why not?

To sum up: while the hypothesis of common design is consistent with the data, it fails to explain many of the patterns we see in the data. The hypothesis of common descent, on the other hand, explains these patterns; that is why we count them as evidence for common descent.
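The principle behind the question above can be put in numbers. The following Python sketch is one way of reading it, using the standard odds form of Bayes' rule; the probabilities are invented for illustration and are not estimates for any real data set, and the function name updated_odds is hypothetical.

```python
def updated_odds(prior_odds, p_fact_given_a, p_fact_given_b):
    """Odds form of Bayes' rule: posterior odds of A over B equal the
    prior odds multiplied by the likelihood ratio P(F|A) / P(F|B)."""
    return prior_odds * (p_fact_given_a / p_fact_given_b)


# ASSUMPTION: made-up numbers. Hypothesis A predicts fact F (F is very
# probable if A is true); hypothesis B merely permits F.
prior_odds = 1.0          # start out indifferent between A and B
p_fact_given_a = 0.9      # A predicts F
p_fact_given_b = 0.1      # B is compatible with F but does not predict it

print(updated_odds(prior_odds, p_fact_given_a, p_fact_given_b))
```

Whenever F is more probable under A than under B, the ratio exceeds 1 and the odds shift towards A, even though F does not refute B.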

2. Faulty Statistics: Misconstruing the evidence for common descent

The second major flaw in the Evolution News and Views articles written in reply to Professor Dennis Venema’s five-part series on vitellogenin and common ancestry relates to the way in which they present the evidence for matching between different groups (taxa) of animals. The ENV articles are marred by the use of faulty statistics, coupled with a mis-reading of the 2008 paper by Brawand et al., which compares the vitellogenin data for various groups of animals.

(a) Getting the data on opossums wrong

Dr. Ann Gauger understates the evidence for the existence of vitellogenin pseudogenes in humans and other mammals. Here’s how she presents it in her article, The Vitellogenin Pseudogene Story: Unequally Yoked (Evolution News and Views, May 24, 2016):

Patches of sequence similarity to the chicken genome that might be interpreted as pseudogenes can be found in syntenic regions of marsupial genomes. Evidence for vitellogenin pseudogenes in human and dog genomes, on the other hand, is very weak. It is practically nonexistent for vitellogenin genes VIT2 and VIT3 (it is not statistically significant compared to the genomic background).

Dr. Gauger minimizes the significance of the evidence for vitellogenin pseudogenes VIT2 and VIT3 in marsupials (such as opossums) by saying that it “might be interpreted as pseudogenes.” In other words, she’s not even sure that opossums possess these pseudogenes. [NOTE: In a comment below, Origenes suggests that what Dr. Gauger is doing here is querying the common scientific definition of “pseudogene.” Even if this interpretation is correct, the comments in section 3 below would still apply – VJT.] However, if we examine the 2008 paper by Brawand et al. which she cites in her post, we find that the authors explicitly state that the matches between chickens and opossums are “real coding sequence matches” for the VIT1, VIT2 and VIT3 exons:

Figure 3. Genome Alignment (Dot Plot Representing SIM Alignments) of Opossum/Chicken Syntenic Regions VIT1-VIT3 Regions

The chain with the best cumulative score is shown. Alignment of flanking genes confirms the synteny of the aligned regions. The subsets of alignments corresponding to VIT exons of the best chain for all three regions have significantly higher scores than genomic background hits in the chain (p < 0.05, Mann-Whitney U test). This shows that VIT1-VIT3 exon matches in opossum represent nonrandom hits and thus correspond to real coding sequence matches.

In addition, we need to keep in mind the fact that the p-values which Dr. Gauger mentions in her article refer only to a single match. However, when one takes into account the fact that there are multiple matches in this region (a point on which I’ll elaborate below), it becomes apparent that the odds of these matches being in all the right places are very, very low. It isn’t enough to merely ask whether there is a statistically significant match between one sequence in a chicken and another sequence in a mammal (e.g. a human or an opossum). What we need to examine is the totality of the evidence.

It appears that Dr. Gauger has failed to fully grasp the strength of the genetic similarities between chickens and marsupial mammals, such as the opossum.

(b) Human vs. chicken: Do human beings have vitellogenin pseudogenes?

What about chickens versus human beings? Dr. Gauger writes:

Evidence for vitellogenin pseudogenes in human and dog genomes, on the other hand, is very weak. It is practically nonexistent for vitellogenin genes VIT2 and VIT3 (it is not statistically significant compared to the genomic background).

Regarding VIT2 and VIT3, Dr. Gauger is correct. Brawand et al. acknowledge: “The coding sequence matches for VIT2/3 may be too short to provide statistical significance or partially spurious.”

Dr. Gauger then proceeds to discuss the remaining VIT1 pseudogene:

The remaining gene, VIT1, has two patches of similarity in its putative former coding sequence, according to a supplemental figure in the paper.

The best of them is a patch 150 bases long (out of 42,637 total bases for the gene!) that has roughly 50 percent identity, by my estimation, and a few deletions to help make things match up. According to the authors, there is a 95 percent chance that the amount of similarity between VIT1 for humans and dogs, and the chicken VIT1, is not due to random chance — but that’s just at the borderline for statistical significance.

(i) Is the match statistically significant?

Professor Swamidass has two comments which are germane here. First, he points out that as a matter of standard practice, the data relating to similarity in the paper by Brawand et al. should be considered correct, unless proven otherwise. If Dr. Gauger thinks that the authors’ claims of similarity are doubtful, then I would invite her to show the exact DNA sequence she used, so that interested readers can perform a BLAST by themselves (there is a website for this), to verify both the match to the vitellogenin chicken gene and to the human sequence. This doesn’t resolve the issue of picking the right DNA sequence, but it is a start. Selecting the right parameters is important, too: if you use the wrong gapped parameter (a mistake Tomkins is notorious for making), then there will be discrepancies.

Second, in response to Dr. Gauger’s statement that the level of similarity between humans and chickens for the VIT1 pseudogene is “just at the borderline” for statistical significance, Professor Swamidass notes that the p-value quoted by Brawand et al. refers only to a single match. However, there are MULTIPLE matches in this region. When you combine this evidence with the fact that these matches are all in the right places (you can very approximately get this by just multiplying all the p-values together, or use the Fisher formula for a better number) the significance is very high, even for the human regions. The odds of this being due to chance are astronomically low.
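For readers who would like to see how several borderline p-values combine, here is a minimal Python sketch of Fisher's method. The p-values in the example are placeholders, not figures from Brawand et al., and the method assumes the individual tests are independent, which may not hold exactly for neighbouring matches. Note that simply multiplying the p-values gives a number that is too small to be read as a p-value in its own right; Fisher's method corrects for this.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For an even
    number of degrees of freedom the upper tail has the closed form
    exp(-t) * sum_{i<k} t**i / i!, where t = X / 2.
    """
    k = len(p_values)
    t = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 0.0
    for i in range(k):
        total += term          # adds t**i / i!
        term *= t / (i + 1)    # next term of the series
    return math.exp(-t) * total


# ASSUMPTION: four hypothetical matches, each only borderline significant.
example = [0.05, 0.05, 0.05, 0.05]
print("naive product:  ", math.prod(example))
print("Fisher combined:", fisher_combined_p(example))
```

The combined value is larger than the naive product but far smaller than any single p-value, which is the point being made: several individually borderline matches add up to strong evidence.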

(ii) “Only a 62% level of similarity”: Has Tomkins made another egregious error?

Dr. Gauger calculates that there is only about a 50% level of identity between human and chicken DNA for the 150 base pair fragment that she mentions. Only 50% identity? Hang on a minute! Dr. Jeffrey Tomkins calculates in his 2015 paper that the level of similarity is 62%, not 50%. So which is it? It would be very helpful if Dr. Gauger could oblige readers by submitting her BLAST test parameters.

If anything, Dr. Tomkins’ 62% figure is likely to be an underestimate, since he has, on previous occasions, made outlandish claims about low levels of similarity between human and chimp DNA, which turned out to be flat wrong. He once claimed that our DNA was less than 70% similar to the chimp’s – a claim that he had to retract when he discovered a computer bug in the BLAST algorithm. Then he claimed that the true figure was 88% – a claim that has since been eviscerated by computer programmer Glenn Williamson – see his recent blog article, Is 1% a Myth? Very briefly, the reason why Tomkins arrives at his figure of 88% is his use of the ungapped parameter in BLAST+. Instead of focusing on the best match alone when calculating the degree of similarity between human and chimp DNA sequences, Tomkins takes the average of all the matches – good and bad alike – which brings his average down.

And how did Dr. Tomkins obtain that 62% similarity figure, anyway? Here’s what he says in the “Materials and Methods” section of his 2015 paper:

All genomic sequences were downloaded from the UCSC genome browser website using either the web interface or a Perl script written by author Tomkins. Pairwise DNA alignments were performed using the Geneious software package with the following parameters: global alignment with free end gaps, cost matrix of identity 1.0/0.0, gap open penalty of 3, and a gap extension penalty of 3. These parameters were employed due to the low homology of the sequences being aligned.

In other words, Dr. Tomkins admits that his choice of parameters assumed a low homology between the relevant sequences in humans and chickens, at the outset. Hmmmm.

UPDATE: I get mail from Glenn Williamson

A few days ago, Glenn Williamson emailed me, saying that he had performed a quick BLAST. Here are the top four hits that he found:

8,620,1097,1,2789062,2788597,74.74,361,483,43500,3e-77
8,24111,24485,1,2758283,2757910,73.59,287,390,43500,1e-56
8,41677,42033,1,2715035,2714688,74.31,269,362,43500,1e-55
8,42618,42930,1,2714120,2713832,68.22,219,321,43500,3e-20

He added that he had carved out a smaller chunk of human chromosome 1 so that the blast would run quicker, and he advised anyone who was interested in checking his results to add 76,000,000 to the 5th and 6th fields.

Glenn Williamson also made the following quick observations:

· About 1,500 base pairs can be aligned with around 73% identity. That’s much more than the 150 base pairs that Tomkins chose to focus on. Readers will recall that Tomkins claimed only a 62% identity, even for this short segment.
· It’s in reverse.
· There is some basic synteny there (the four hits are in the right order).
· There are some repeats in the results, but they are relatively short (around 30 base pairs each).
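Glenn Williamson's observations can be checked directly from the four lines he sent. The Python sketch below does so. The column names are my own inference from the numbers (for instance, 361 / 483 gives the 74.74 in the first row); they are not labels supplied with the output. The 76,000,000 offset is the one he asked readers to add to the 5th and 6th fields.

```python
import csv
from io import StringIO

# The four hits exactly as quoted above.
HITS = """\
8,620,1097,1,2789062,2788597,74.74,361,483,43500,3e-77
8,24111,24485,1,2758283,2757910,73.59,287,390,43500,1e-56
8,41677,42033,1,2715035,2714688,74.31,269,362,43500,1e-55
8,42618,42930,1,2714120,2713832,68.22,219,321,43500,3e-20
"""

# ASSUMPTION: column meanings inferred from the values, not stated in the source.
FIELDS = ["qseqid", "qstart", "qend", "sseqid", "sstart", "send",
          "pident", "nident", "length", "qlen", "evalue"]
OFFSET = 76_000_000  # added to fields 5 and 6 to recover chromosome coordinates


def parse_hits(text):
    """Parse comma-separated hits and shift subject coordinates by OFFSET."""
    hits = []
    for raw in csv.reader(StringIO(text)):
        if not raw:
            continue
        hit = dict(zip(FIELDS, raw))
        for key in ("qstart", "qend", "sstart", "send", "nident", "length"):
            hit[key] = int(hit[key])
        hit["sstart"] += OFFSET
        hit["send"] += OFFSET
        hits.append(hit)
    return hits


hits = parse_hits(HITS)
total_length = sum(h["length"] for h in hits)
total_identical = sum(h["nident"] for h in hits)
pooled_identity = 100 * total_identical / total_length

# "It's in reverse": subject start lies after subject end in every hit.
all_reverse = all(h["sstart"] > h["send"] for h in hits)

# "The four hits are in the right order": query positions rise while
# subject positions fall, as expected for a reverse-strand match.
colinear = all(a["qstart"] < b["qstart"] and a["sstart"] > b["sstart"]
               for a, b in zip(hits, hits[1:]))

print(total_length, total_identical, round(pooled_identity, 1))
print(all_reverse, colinear)
```

Pooling the four hits gives 1,136 identical positions out of 1,556 aligned, or about 73%, which agrees with the figures in the first bullet point above.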

I’ll leave it to readers to judge whether or not Dr. Tomkins has accurately presented the evidence for genetic similarity between humans and chickens in the vitellogenin pseudogene.

(iii) Does the overall level of similarity need to be high, anyway?

Finally, Professor Joshua Swamidass points out that the neutral theory of evolution predicts that the overall genetic similarity between chickens and humans would not be terribly high. After all, we’re talking about a 70-million-year-old gene here, so it’s hardly surprising that there would be a relatively weak match. In contrast, the relevant pseudogenes in the opossum are much more recent, and for that reason, they have a much stronger match. Common descent is capable of explaining this pattern; common design, by itself, cannot.

Summing up: Has Evolution News and Views presented the evidence accurately?

In short: Dr. Gauger’s assertion that the degree of similarity between VIT1 for humans and chickens is “just at the borderline for statistical significance” rests on a mis-reading of Brawand et al.’s 2008 paper. Regrettably, Dr. Gauger’s statement was further distorted in the final, anonymous Evolution News and Views article written in response to Venema, which baldly states: “as Ann Gauger notes, the sequence alignment between the human and chicken versions is very low, so it’s not clear if they are the same gene.” That, I am afraid, constitutes a serious mis-reading of the genetic data, and I’m not sure Dr. Gauger herself would endorse that way of summarizing the evidence.

3. Changing the definitions of key terms

In her article, The Vitellogenin Pseudogene Story: Unequally Yoked (Evolution News and Views, May 24, 2016), Dr. Ann Gauger begins by defining the term “pseudogene” in its standard sense:

Sometimes, … gene sequences may be interrupted by deletions or insertions, and stop codons, which prevent the gene from making functional protein.

These “inactivated” genes are called pseudogenes, and are taken by evolutionists as further evidence for common descent.

So far, so good. But later on in her article, Dr. Gauger approvingly cites a 2015 Answers in Genesis article by Dr. Jeffrey Tomkins, who declares that a gene fragment which turns out to have a function can no longer be called a pseudogene:

…the alleged vtg [vitellogenin] fragment in human is not a pseudogene remnant at all, but a functional enhancer element in the fifth intron of a “genomic address messenger” (GAM) gene… These combinatorial data clearly show that it is a functional enhancer element in a GAM gene expressed in the human brain — strongly challenging the idea that this sequence is an egg-laying pseudogene genomic fossil.

Swamidass comments that by citing this passage from Dr. Tomkins, which defines pseudogenes in terms of their total lack of functionality, Dr. Gauger has effectively changed the definition of a pseudogene, setting aside the standard definition. Even if VTG in humans were a lncRNA with an important function, it would still be a pseudogene, because no protein is being expressed from it, it exhibits similarity to VTG in chickens, and it is in the correct place in the genome.

To be fair, I should mention that Dr. Venema himself, back in 2010, wrote a post in which he inaccurately referred to pseudogenes as “non-functional” (see here and here). The Evolution News and Views article Functional Pseudogenes and Common Descent (May 23, 2016) points out Dr. Venema’s errors, which were made six years ago. But as they say, two wrongs don’t make a right. And in his 2016 series, Vitellogenin and Common Ancestry: Does BioLogos have egg on its face?, Dr. Venema is very careful to state that pseudogenes can acquire new functions, after having lost their original ones:

The major problem with [Tomkins’] argument is that it subscribes to a false dichotomy: that this sequence is either a VIT1 pseudogene fragment or a functional part of another gene. From an evolutionary perspective, there is no issue with it being both. Part of evolutionary theory is the expectation that occasionally some sequences, after losing their original function, may come under natural selection to be repurposed to another function. The technical term for this process is exaptation, and many examples of it are known.

As we’ll see below, even a functional pseudogene constitutes powerful evidence for common descent, if it can be shown that its function is a derived one.

4. Obscuring the issue by appealing to possible functions of pseudogenes

Left: Brain of human embryo at 4.5 weeks, showing interior of forebrain.
Middle: Brain interior at 5 weeks.
Right: Brain viewed at midline at 3 months. Images courtesy of Wikipedia.

In his article over at Answers in Genesis, Dr. Tomkins marshals what he considers to be a strong case that the alleged vitellogenin pseudogene remnant in human beings is actually a functional enhancer element in a gene which is expressed in the human brain:

…[T]he real story is that the alleged 150 base vtg sequence is not a pseudogene remnant at all, but a functional enhancer element in the fifth intron of a “genomic address messenger” (GAM) gene. This particular GAM gene produces long noncoding RNAs that have been experimentally shown to selectively inhibit the translation of known target genes, a majority of which have been implicated in a variety of human diseases. Messenger RNAs from this particular gene are also known to be expressed in a variety of human brain tissues in both fetal and mature subjects in three separate studies. (Bolding mine – VJT.)

Now, I’m no biologist, but I feel bound to point out that two highly respected biologists who are both Christians have highlighted problems with Dr. Tomkins’ arguments.

(a) Venema on why functionality in a pseudogene doesn’t weaken the case for common descent

In part four of his five-part article, which is titled, Vitellogenin and Common Ancestry: Tomkins’ false dichotomy, Professor Venema explains why functionality in a pseudogene does not invalidate the case for common descent. When scientists study a pseudogene, the question they need to ask is not whether it is functional or not, but whether the functionality is original or derived:

Part of evolutionary theory is the expectation that occasionally some sequences, after losing their original function, may come under natural selection to be repurposed to another function. The technical term for this process is exaptation, and many examples of it are known. Certainly a long, non-coding RNA gene could arise at this location in the human genome and this sequence could be exapted as a regulatory sequence – but there is no hint of admitting this possibility in Tomkins’ work… Rather, it seems enough to Tomkins to suggest that the sequence is functional – and that this alone will be enough for him to convince his readership that this fragment is “not a real pseudogene.”

Put more simply, evidence of function does not erase the evidence for prior history.

Even though the evidence that this sequence is functional in humans is rather thin, the main issue is that even if its function were convincingly demonstrated in the future, it would not remove the evidence that this sequence was once part of a functional VIT gene – evidence that Tomkins has either not addressed, or denied outright. (Bolding mine – VJT.)

(b) Why Dr. Tomkins’ suggestion can’t explain shared synteny

The Evolution News and Views post in response to Professor Venema, titled, Functional Pseudogenes and Common Descent (May 23, 2016), argues that the vitellogenin pseudogene in human DNA has a function, after all:

Tomkins has presented evidence that the VIT1 pseudogene sits in part of an intron that is transcribed and produces long non-coding RNAs of the type we know often have function.

In her article, The Vitellogenin Pseudogene Story: Unequally Yoked (Evolution News and Views, May 24, 2016), Dr. Ann Gauger goes even further: she addresses the problem of shared synteny (which was discussed above) by proposing that “[t]his similar order could be due either to ancestry or functional reasons.”

However, in a recent post on Sandwalk, Professor Larry Moran writes: “There are thousands of lincRNAs but currently there are only about 200 that have known functions and not all of these are even human.” The point Moran makes is a vital one: Function in lncRNA is the exception rather than the rule. As such, the onus is on Intelligent Design proponents to demonstrate a function, rather than merely hypothesizing it.

Four problems with Dr. Tomkins’ proposal

When I contacted Professor Swamidass regarding the argument for the functionality of the long non-coding RNA produced by the VIT1 pseudogene, which was put forward in a 2015 article by Dr. Jeffrey Tomkins and recently defended in several posts over at Evolution News and Views, he had four major criticisms to make regarding the claim that this lncRNA has a function, and that its presence can therefore be explained equally well by the hypothesis of common design.

(i) How important is the function, anyway?

First, it simply isn’t enough to show that the lncRNA is functional in some way. What needs to be shown is that the lncRNA is functional in an important way. Neither Dr. Gauger’s article, nor the article she cites by Dr. Jeffrey Tomkins, demonstrates that the pseudogene has an important function. The vast majority of lncRNAs are not important; they merely express noise. Drs. Gauger and Tomkins make a case that the lncRNA produced by the vitellogenin pseudogene could be important. That doesn’t mean it is important. Just because a lncRNA gets transcribed, it doesn’t necessarily mean that it’s important.

At this point, readers might want to ask: what would constitute reasonable evidence that the lncRNA in question has an important function? That’s a fair question. Professor Swamidass suggested that the discovery of SNPs (single nucleotide polymorphisms) associated with disease in this lncRNA would be good evidence that it is indeed important. While he acknowledged that evidence of this sort might turn up some day, he thought it extremely unlikely.

For the time being, the claim that common design minus common descent is just as good at explaining the genetic evidence remains unsubstantiated.

(ii) The functionality of lncRNA doesn’t depend on its location

The second point made by Professor Swamidass is that Drs. Gauger and Tomkins also need to demonstrate that the functionality exhibited by the lncRNA depends tightly on its being located at this particular position in the genome. The problem here is that in the rare cases when a lncRNA is functional, its functionality is NOT dependent on its position in the genome (it is a trans-acting element, not a cis element). Hence even if the lncRNA in question were functional, Drs. Gauger and Tomkins would still need to explain why it is found in the exact same place in the genome, in mammals and birds. From a design point of view (in the absence of common descent), there would be no reason for this positioning in the genome. To suggest, as Dr. Gauger does in her article, The Vitellogenin Pseudogene Story: Unequally Yoked, that “This similar order could be due either to ancestry or functional reasons” is to engage in unproven speculation about how lncRNAs work, which runs counter to how biologists currently understand their action.

(iii) Position, position, position

Third, Swamidass notes that the explanation proposed by Dr. Tomkins merely attempts to explain the similarity between individual elements (homology), while ignoring their positioning in the genome (synteny). Swamidass considers Dr. Tomkins’ point about homology to be quite reasonable, but it leaves the argument from synteny untouched. Why are the homologous elements positioned in the same way in the genome, in humans, opossums and chickens? Dr. Tomkins supplies no reason.

Now, Dr. Tomkins might wish to argue that in some cases, positioning is important. However, Professor Swamidass informs me that such cases constitute the exception, rather than the rule, in mammalian systems (microbes are a very different matter). The burden of proof is therefore on Dr. Tomkins to show that the order of elements in the human genome is important for the function he proposes.
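
To make the distinction between homology and synteny concrete, here is a minimal Python sketch. It is only an illustration: the gene labels (geneA, geneX and so on) and the three species orderings are hypothetical placeholders, not real genome annotations, and counting shared neighboring pairs is a deliberately crude stand-in for the synteny analyses that biologists actually perform.

```python
# Toy illustration only: the gene labels and orderings below are hypothetical
# placeholders, not real annotations from any genome.

def shared_genes(order_a, order_b):
    """Homology-style check: genes present in both lists, ignoring position."""
    return set(order_a) & set(order_b)


def conserved_adjacencies(order_a, order_b):
    """Synteny-style check: neighboring gene pairs that are adjacent in both lists.

    Pairs are stored as frozensets, so a region read in the opposite
    direction still counts as the same neighborhood.
    """
    def adjacencies(order):
        return {frozenset(pair) for pair in zip(order, order[1:])}

    return adjacencies(order_a) & adjacencies(order_b)


# geneX stands in for the element of interest; geneA, geneB, geneC are its neighbors.
species_1 = ["geneA", "geneX", "geneB", "geneC"]  # reference order
species_2 = ["geneA", "geneX", "geneB", "geneC"]  # same genes, same order
species_3 = ["geneB", "geneA", "geneC", "geneX"]  # same genes, scrambled order

print(len(shared_genes(species_1, species_3)))           # 4
print(len(conserved_adjacencies(species_1, species_2)))  # 3
print(len(conserved_adjacencies(species_1, species_3)))  # 0
```

In this toy setup, species_1 and species_3 share every gene, so a check based on homology alone cannot tell them apart from species_1 and species_2. Only the adjacency check registers that the order is preserved in one comparison and lost in the other, which is the kind of positional pattern that an explanation based on similarity alone leaves unaddressed.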

(iv) The design enigma

Finally, Swamidass raises an interesting theological/philosophical question. Suppose that the vitellogenin pseudogene in humans, and the genes located near it, were all designed. Did the Designer have the power to put these genes in a different order? As far as biologists can tell, this would have been a very easy thing to accomplish. If they are right, then we are confronted with a design enigma: why were we designed in a way which looks just like the pattern we’d expect, if we arose by a process of common descent? (This is especially true for shared synteny, for which we have no biological explanation except common descent.)

Summing up: the Evolution News and Views articles written in response to Professor Venema endeavor to show that the genetic data relating to pseudogenes is consistent with their having been designed. But the real question is: what is the best explanation of the data? The authors of the ENV articles have not put forward a coherent explanation for shared synteny. For instance, why, if lncRNA is trans-acting, is it located in this place in the genome, AND why is it similar to the VTG gene? The authors do not say.

5. Engaging in wild speculation which goes far beyond the available evidence

The articles attacking Dr. Venema over at Evolution News and Views also rely heavily on speculation which goes far beyond the available evidence, as the following extracts reveal.

The author of the anonymous article, Functional Pseudogenes and Common Descent (May 23, 2016), proposes that the vitellogenin fragments in humans may turn out to serve some common function in chickens and humans, in addition to their function of making egg yolk in chickens. Note the speculative language, which I’ve highlighted in bold:

At best, the human vitellogenin “pseudogene” only represents a small fraction of the chicken version of the gene. One could initially surmise that the fragment (or fragments) of the vitellogenin “pseudogene” that humans have (and use for some function) may not be the part (or parts) crucial only for making egg yolk in chickens. We may be using it for a non-egg-yolk related function that’s also found in chickens.

…[Or] perhaps the chicken vitellogenin gene produces not only egg yolk-related proteins, but also RNAs that have other roles or functional interactions in chickens. We may be using our “vitellogenin pseudogene” for a similar RNA-based function or interaction that chickens do…

In either case, our “vitellogenin pseudogene” and the chicken version would turn into a mere example of homologous DNA performing a homologous functions — something we see all the time in biology and which can be explained by common design just as easily as by common descent. It doesn’t appear that the specific function of our “vitellogenin gene” has been explored yet, and this would be an interesting question to investigate. The hypotheses offered here could very well turn out to be true.

In a similar fashion, Dr. Ann Gauger resorts to speculation in her article, The Vitellogenin Pseudogene Story: Unequally Yoked (Evolution News and Views, May 24, 2016), where she argues that even if humans turn out to possess a vitellogenin pseudogene, it appears to have a function which is related to an overlapping gene:

So what if the similarity is statistically significant? What apparent similarity there is could well be due to an overlapping gene with an entirely different function that is present in that stretch of sequence in the chicken, marsupial, dog, and human genomes. (I am guessing it is present in the other genomes — I know it is present in humans.) Indeed, there is evidence of another gene with other possible functions in that region of the genome…

The long non-coding RNAs mentioned above [by Tomkins in his 2015 paper – VJT] are widely believed to have many important regulatory functions in the cell. They are implicated in long- and short-range interactions between genes, the way the DNA loops, whether genes are sequestered or not — all these things and more are affected. (Bolding mine – VJT.)

Two points need to be made here. First, Intelligent Design proponents are fond of holding evolutionists up to ridicule for their speculative proposals on how life may have arisen from inanimate matter, or how macroevolutionary transitions may take place, even in the absence of intelligent guidance. I know; I’ve engaged in this sort of ridicule myself. But we need to be consistent here, or we risk being labeled as hypocrites. If it’s unscientific of evolutionists to engage in speculation about the origin of life or the mechanism of macroevolution in the absence of hard evidence, then it’s equally unscientific of Intelligent Design advocates to engage in speculation about possible functions of gene fragments in the absence of hard evidence. What’s sauce for the goose is sauce for the gander.

Second, the above proposals fail to address the main evidence for common descent cited by Dr. Venema, namely, “the overall match of sequences between placental / marsupial mammals and egg-laying organisms over large spans of DNA, including flanking regions.” It’s the spatial pattern which needs to be explained, and not just the presence of the genes.

6. Use of flawed analogies

In addition to the problems listed above, the Evolution News and Views article, Functional Pseudogenes and Common Descent (May 23, 2016), makes use of a flawed analogy in its attempt to weaken the case for common descent:

…[A]s a second point, even if humans are using our “vitellogenin gene” for entirely different purposes than chickens do, this still doesn’t provide evidence for common ancestry. Why? Because we often see in technological designs that similar parts can be used for very different purposes. A plastic ring in one design might be used for blowing bubbles, but in another it helps seal the connections between two pipes. Or a plastic container in an outboard boat motor holds fuel, but in another technological design it holds dishwashing liquid. Using similar parts for different purposes is easily accommodated by common design.

I would respond that while a ring can be used for different purposes, we would not expect to find a ring showing signs of wear and tear in a new contraption. The wear and tear suggests that the ring was borrowed from somewhere else. Likewise, the peculiar matching patterns between our vitellogenin pseudogene and those of chickens and opossums suggest that we are dealing with a gene that has a long history, and that was formerly used for something else. Only common descent can account for the shared synteny described in Professor Venema’s five-part series.

7. Theologizing the argument for common descent

Finally, one of the Evolution News and Views articles responding to Professor Venema makes the mistake of claiming that the case for common descent rests upon theological assumptions. Here’s an excerpt from the anonymous article, Functional Pseudogenes and Common Descent (Evolution News and Views, May 23, 2016):

The main issue is that evolutionists have commonly argued that non-functionality in shared pseudogenes is what provides evidence for common ancestry. They argue that God would not put “broken” shared DNA in multiple species and thus this must be evidence for common ancestry over intelligent design (or special creation, or whatever). We’ve seen many theistic and atheistic evolutionists treat pseudogenes in precisely this manner…

Evolutionists claim that these pseudogenes provide special evidence for evolution because God would not create different species with shared non-functional DNA in the same location. Therefore, they argue, pseudogenes must be evidence for shared ancestry. (Bolding mine – VJT.)

Let’s be perfectly clear: the argument for common descent can be formulated without resorting to speculation about what God would or would not have done. In order to illustrate this point, imagine that living things on Earth were actually designed by an alien from Alpha Centauri (pictured above), named Alec. If we knew this to be the case, what could we infer about the way in which Alec the alien made living things? Quite a lot.

The question of why Alec produced different groups of living things with the same spatial patterns in their genes (shared synteny) would still be a valid one. How might we explain that fact?

We might suppose that Alec kept the original design of the genes on his computer, and then copied it over to the ancestors of reptiles, birds and mammals, on separate occasions. That would be a case of common design without common descent – although it would still invite the obvious question: why would Alec re-use a pattern of genes that served a purpose in reptiles and birds when designing mammals, even though it serves absolutely no purpose in mammals? But the real reason why this explanation is a poor one is that it’s ad hoc: Alec keeps re-using the original design because he feels like it, or he’s too lazy to change it. A much better explanation would be to suppose that from the outset, Alec planned to generate every kind of living thing by a process of common descent from an original stock, intervening only when natural processes were unable to overcome some macroevolutionary hurdle required to generate a new structural design in a class of creatures. Once we impute this original decision to Alec, the rationale for the non-functional similarities in the patterns of the genes of different classes of organisms becomes immediately apparent. What’s more, all of the non-functional similarities can be explained in one fell swoop.

Someone might object that this explanation is ad hoc, too: after all, we might ask why Alec chose to use an evolutionary mechanism to generate the diversity of living things, when he had so many other alternative mechanisms at his disposal. What the objection overlooks is that the best scientific explanations, other things being equal, are the most parsimonious ones. It’s far simpler for scientists to make a single ad hoc assumption than to make a multitude of such assumptions. Consequently, if it were ever proved that life on Earth had been designed by aliens, scientists would still be justified in inferring that they used an evolutionary mechanism to generate the variety of species we see on Earth today.

The above argument assumed that the Designer was an alien, but it would work equally well if the Designer were an angel, or God, or any intelligent agent.

Conclusion

I’d like to conclude with a quote from the young-earth creationist biologist, Todd Wood, whom no-one can accuse of bias:

While common design could be a reasonable first step to explain similarity of functional genes, it is difficult to explain why pseudogenes with the exact same substitutions or deletions would be shared between species that did not share a common ancestor.
(The Chimpanzee Genome and the Problem of Biological Similarity, Occasional Papers of the BSG, No. 7, 20 February 2006, pp. 1-18.)

Why, indeed?

Now, I can certainly understand why someone might feel that notwithstanding the strong scientific evidence pointing to common ancestry, the authority of Scriptural passages which (on a plain reading) teach the special creation of man, such as Genesis 1:26-27, Genesis 2:7 and Genesis 2:21-24, trumps the verdict of science. Fair enough. That’s an argument I can respect. But to deny the strength of the scientific evidence for common descent in the first place is a sign of a peculiar kind of intellectual obstinacy, to my way of thinking.

An overwhelmingly strong scientific case can be made that life on Earth was designed. That alone should be enough to make belief in Intelligent Design reasonable. I believe that we in the ID movement should stick to our strengths. It does our cause no good if we query the very strong scientific evidence for common descent, which in no way weakens the case for Intelligent Design.

Comments
As usual pucci, all of your "facts" are wrong. Just because you are incapable of both literature searches and learning new information, this does not mean that the information is not out there. We have learned a lot about protein evolution during the last 50 years of research and there is much evidence for it. Again, try a simple pubmed search and you will find an avalanche of papers on the evolutionary history of many protein families. Maybe you'll even learn something, who knows.
Alicia Cartelli
June 28, 2016, 04:30 AM PDT
gpuccio @299
[...] in the proteome we find a lot of function, nobody can really say if it is optimal or not, but certainly it seems to work very well.
Good point.
The question remains: how did those 1000+ bits of optimal functional information appear?
Good question.
Dionisio
June 25, 2016, 08:13 PM PDT
Hi Gpuccio
d) In all that we can observe, we see constant evidence of negative selection, IOWs conservation of functional information which already exists, or just the emergence of new blocks of information, be them new genes, or new parts of a gene, or simply functional modifications of existing genes, always with definite discrete functional jumps, some of them smaller, some of them really amazing. And any jump above, say, 120 bits is well beyond the probabilistic resources of our whole biological planet.
This statement is absolutely valid but not always easy for biologists to see. Even those with valid PhD degrees from major universities can struggle with this concept based on the almost infinite sequential space of the genome. I have seen a few who are not completely tied to certain world views come around after months of rigorous debate. I participate on a few blogs where Alicia's opinion is the majority and see some progress in participants understanding the problems with current evolutionary mechanisms like RM+NS, neutral theory, etc. Bottom line: this is very obvious to you, me and Dr. JDD, but takes time for others to see clearly. This topic was first brought up at the Wistar conference in 1967 and is still being debated 50 years later.
bill cole
June 25, 2016, 02:19 PM PDT
Dr JDD, thank you for your thoughts. I have come to a similar conclusion. Although Joe has pointed me to a very clever model of population genetics, it does not explain the evolution of new features like GPS that are mission critical to the journey through common ancestor theory. As your point articulates, it is single dimensional. GPS (or a genetic equivalent like wings) requires new genetic sequences, which we have no known mechanism for generating.
bill cole
June 25, 2016, 01:51 PM PDT
Dr JDD and Bill Cole: Thank you for your always interesting comments. I would like to add a few thoughts: 1) Regarding Alicia, I can appreciate (her?) for a short time, and sometimes I have found some interesting points in her posts. But in time she becomes unbearable, and most times uses the most arrogant and pointless arguments as though they were absolute truth that everybody should revere. So, I will go on with my mixed feeling (and behaviours) with this person. 2) Regarding optimal or suboptimal function. The point is simple: in the proteome we find a lot of function, nobody can really say if it is optimal or not, but certainly it seems to work very well. Let's remember that in the famous rugged landscape paper (Hayashi) the wildtype could not be found by experimental means, and only highly suboptimal solutions emerged, none of which was a path to the wildtype. So, let's say that Stat 3 gains its 1000+ bits of functional information in cartilaginous and bony fish, and that those 1000+ bits are conserved to humans, for 400+ million years. OK, I suppose Alicia would say that those 1000+ bits are some optimal information and that's the reason they are conserved. OK, let's say that is the case. The question remains: how did those 1000+ bits of optimal functional information appear? The answer of darwinists (including Alicia) is always the same: it came through gradual evolution, by RV + NS. But where is the evidence for that? In their minds, and nowhere else. Facts: a)In the genome - proteome there is no trace of any intermediate sequence between the proteins which existed before vertebrates, and the protein which we find today in the oldest vertebrates (sharks). There is, instead, this amazing conservation between sharks and humans (and everything else in between). 
b) Similarly, there is absolutely no evidence of any "suboptimal" form of the Stat 3 protein (or of any other similar case) which could represent at the same time an intermediate function and an intermediate sequence. IOWs, what the darwinist theory badly needs and never finds. c) Any hypothetical suboptimal intermediate should have been functional enough to confer some reproductive advantage in its due time, so that it could expand and be fixed. That's the only way intermediates can help. IOWs, each intermediate must have been under real positive selection, in its due time. But, for some strange reason, there is no trace of those processes. d) In all that we can observe, we see constant evidence of negative selection, IOWs conservation of functional information which already exists, or just the emergence of new blocks of information, be them new genes, or new parts of a gene, or simply functional modifications of existing genes, always with definite discrete functional jumps, some of them smaller, some of them really amazing. And any jump above, say, 120 bits is well beyond the probabilistic resources of our whole biological planet. e) What can never be seen is positive selection which generates new complex functional information, as documented by definite paths where gradual suboptimal states are capable of being naturally selected and expanded, and are at the same time sequence steps to the final "optimal" state. There is no trace of that, IOWs, of the mechanism itself which should explain everything in the mind of darwinists. f) So, the observable facts are: a proteome which is full of evidence of the sudden appearance of new information, of the conservation of existing information by negative selection, and of the modification of what can be modified by neutral variation, without any functional consequence. IOWs, a proteome which screams design and conservation of design, while wholly falsifies the "theory" of gradual paths to complex functional information. 
So, the final question is: what shall we follow, observable facts and good reason, or Alicia's personal fantasies?
gpuccio
June 25, 2016, 01:39 PM PDT
Hi Bill, I am certainly no expert on simulations or codes nor modeling evolution. I may claim to have some expertise on understanding what phage libraries can do, biological relevance of protein function, cell biology systems, etc, but what I state above is just using my knowledge as a scientist combined with the ability to critically read a published piece of work. But you do not need to be a scientist or have done that for decades to be able to do so, as others here prove. It seems to me though that simulations that use other systems are pitiful for trying to understand these things. They work with 1 goal in mind typically and this is dreadfully inaccurate for biological systems. For example, as you mention it, see: http://boxcar2d.com/ It seems to me you are simply trying to evolve a car to travel the furthest. But notice some key features, for example, you need a wheel, it needs to turn...the path is defined, the number of shapes and objects are very limited, etc, etc. However, it is 1-dimensional in its goal - travel the furthest. Now take for example a living organism. It requires the ability to replicate, and survive to a point of replication. But to achieve that it requires the utilization of energy, and replicating in a way that information (!) is passed on and retained. Even at the simplest point, this is multifactorial. This is where the discussions of individual proteins really could not convince me even if you can show a "simplified" version of the protein. The point is biologically speaking it is not about 1 protein. If it accumulates too much, it is toxic to the cell. If it interacts with the wrong thing too much, it can be toxic to the cell. If it is not good enough for the substrate's relative biological concentration, it is of little use. Just because you can create in isolation or even simulate a simplified version of a protein or a pathway to a more complex protein from a simple one, does not mean it would work in a biological multi-factorial system.
And earlier on in evolution, there would likely have been less redundancy, so such pathways are even harder to achieve. That is why I come back to the point I made above - why didn't they put some of these "simpler" SH3 domains into the src protein and show functionality? If they cannot even do that, how can you even maintain this is a rational path for the evolution of a protein? There is a reason they have a more complex amino acid composition - undoubtedly it is required under the constraints of the biological system. However the truth still remains that these simulations and other attempts to model protein evolution from simple to present day observations have all woefully failed. Therefore, it is all hearsay, just-so stories and ad hoc reasoning with no sound basis except a good imagination.
Dr JDD
June 25, 2016, 01:05 PM PDT
Dr JDD, thank you very much for going through the paper in detail. I agree with Mung that we should not gang up on Alicia despite her frequent use of put downs and ad hominem attacks. She is often supporting a minority opinion on this blog and I know how difficult this can be. The sequential space problem is real and there are not any solutions yet that have delivered a mechanism that can generate new genetic sequences. The counter argument is that these sequences started out as simple and then through natural selection became more complex. No one has come close to modeling this without a target (knowing the sequence and comparing the mutated change to the final working sequence). The closest is a reference Dr Joe Felsenstein gave to me in a discussion on the Sandwalk blog.
But check out the simulations of Karl Sims, or the program breve that enables you to simulate a similar case, or the Boxcar2d system which does something similar (I have left out links because these are easy to find using a browser).
I would be interested in your opinion on whether this simulation shows the ability to generate primitive genetic sequences. Whether it is 20AA or 5AA in the sequence, random change will degrade it. I understand Alicia's frustration trying to defend the existence of a valid evolutionary mechanism given the genome is a sequence. Sequences are great at creating diversity but suck at trying to create them with a random search (their use as a password is an example). Passwords are designed to block random search. Some are only 4 digits long, e.g. ATM passwords, and do a good job securing your bank account.
bill cole
June 25, 2016, 12:31 PM PDT
Thanks for your comments gpuccio - and good to see that you share my thoughts on the results presented in that paper. Mung - I agree as well: I am not one to hang anyone out to dry, nor am I one to dish out the ad hominems intentionally (even though Alicia has said many negative things to me in the past). I am happy to withdraw my assessment of Alicia as being intentionally dishonest if Alicia can either demonstrate emphatically that there is a misunderstanding of these results or state that she herself was wrong and misinterpreted the results to state something that this work does not support. I sincerely believe that gpuccio is correct and that this paper, as we understand it, emphatically does not support the assertions made about tolerance of mutations in general and/or available sequence space. Also we must consider the biological context and not just "something has limited function". This is essential, as I believe gpuccio has well laid out here with the example of STAT3 - and there are many other examples of large proteins that show little to no variation as you look across the species. The question remains - how did RM+NS arrive at such a sequence when it is clear it is so important? The assertion Alicia is making, that this is the optimal sequence and that it matters, so that after millions of years evolution arrives at that optimal sequence and it doesn't change, is completely ad hoc. There is no evidence to support that, unless one can show a route that STAT3 could have taken or at least simpler predecessor molecules. This is therefore no different to saying "goddidit" but somehow it seems acceptable to then slate and claim that someone does not understand basic biology for not accepting this "just-so" story (that lacks any evidence to support it).
Dr JDD
June 25, 2016, 11:36 AM PDT
Alicia provides entertainment for free! Let's not be too harsh with her.
Mung
June 25, 2016, 10:33 AM PDT
Dr JDD: Thank you for your kind advice at #291. I got the full article! :) And above all, thank you for your competent, balanced and fascinating comments at #290 and #292. I fully agree with what you say. This is another kind of paper which says interesting things, but draws inappropriate conclusions from those things, regarding supposed evolutionary implications. We can be grateful for the interesting work, but there is no need to share the conclusions. What we certainly do not need to share are Alicia's "conclusions" from the paper, which are well beyond reason and honesty! Thank you again.
gpuccio
June 25, 2016, 09:34 AM PDT
Daniel King: "You should be ashamed of yourself for resorting to such desperate and meaningless ad hominems." No ad hominem. Just an explicit judgment about Alicia's moral character, as expressed in her management of the discussion here. A judgment for which I take full responsibility, and with which of course you are free to disagree. As you are free to give similar judgments about my moral character as expressed in my discussions here.
gpuccio
June 25, 2016, 08:57 AM PDT
I think it would be helpful to outline the findings of the paper from Baker’s lab that Alicia says that she cites almost every time she is at UD. Bear in mind this citation is supposed to support the assertion that functional protein sequences are highly common and therefore this vastly reduces the search space for a given sequence. It is also supposed to support the notion that most amino acids in a given protein are mutable, which is also seen as support for many solutions to a functional problem with regards to protein sequence. I should note here whilst I agree with the thought that “perfectly optimised function” is not necessary (i.e. a reduced function over the “final” wild type sequence can be sufficient), this reduced function has to be physiological. In other words, if an enzyme can still catalyse a reaction but this now takes 12 hours instead of 12 seconds, you have a problem if the product of that reaction is indeed necessary to support a particular cellular role necessary to life/survival. Further to this, physiological relevance extends not only in efficiency but in other parameters such as affinity/concentration (i.e. there needs to be enough of the substrate around for the affinity of the protein for that substrate, so physiological levels, not supraphysiological levels) and also the needs of the system as a whole and the properties the sequence invokes. Good examples of the last point are things like protease susceptibility, degradation/turnover rates, etc. Further, the environment can play a role. A good example of this is thermophilic bacteria, where the most “optimal” enzyme or protein sequence may not be the most efficient – in fact it is highly unlikely to be the most efficient, but rather, it will be the most efficient in the context of being able to function at say 70-90°C (i.e. thermostability is a driving factor here).
Now I am not saying those things cannot come about through RM+NS (especially thermophilic bacteria), however I would personally suggest that the starting point could not have come around through RM+NS (in other words, a thermophile may have arisen through RM+NS from an ancestor that did not possess thermophilic enzymes. Possibly.). So, let’s return to the citation at hand and try to see if this supports the original assertion. In this paper, the researchers tried to see if they could use a simplified amino acid output (instead of the full 20 common amino acids found in most life today) to make a domain of a functional protein. This is an important note, as they simply are looking at the SH3 domain of the src protein. This domain is not a “functional protein” in the sense of on its own in the cell serving a cellular function. It is a domain of the src protein that binds to proline rich sequences. Proline is an amino acid that is perhaps the most unusual one as it provides less flexibility in polypeptides that make up proteins, and induces structural features such as “kinks”. It has a (non-aromatic) ring-like sidechain to achieve this. It is known that proline-rich repeats form helical-like polypeptides with good features for other proteins to interact with. The authors state:
The SH3 domain has a complex beta-barrel-like structure wherein residues spread throughout the sequence come together to create the binding site for a proline-rich peptide. Because peptide binding requires proper folding of the SH3 domain, selection for binding activity necessarily selects for the SH3 fold.
In order to achieve this, they used phage display combinatorial libraries (which can allow for the assessment of many millions of different mutants). To isolate those mutants that passed the criteria they panned the phage with paramagnetic beads that were coated with proline rich peptides. Now here is a very important point that they make (remember, this is about the 57-amino acid SH3 domain from the src protein, not about the src protein itself, which is about 535 amino acids):
Combinatorial libraries of SH3 variants displayed on the surface of M13 phage were constructed in which all residues not involved in binding were biased towards a small set of amino acids. [emphasis JDD]
That is a very important point, which we will come back to. Next, what reduced amino acid set could they use? They attempted to use one nonpolar and two polar amino acids (on the basis, as they cite, that cooperatively folded helical structures obtained from random sequences were predominantly composed of these); however, this 3-amino acid alphabet did not work. What they found was that the majority of the alanines (A) and glycines (G) in the wild type sequence could not be replaced, so they included these 2 along with isoleucine (I), lysine (K) and glutamate (E). These are then the 5 amino acids they used. Next, another important statement the authors make:
At positions where structural and phylogenetic data suggested that one of the reduced alphabet residues might not be tolerated, additional residues were included…
Having performed the technical work, they had a large number of hits but for obvious reasons could only check a limited number had properly folded. However, these libraries were performed I believe in 3 segments across the protein domain and then had to be stitched together. At the splicing sites further simplification was attempted, and 2 variants came through this process (FP1 and FP2). FP2 was the most simplified variant, and 40 of the 45 residues attempted to be simplified had been simplified. But remember – the original SH3 domain was 57 amino acids. So where did the 45 come from? Well, as stated earlier, they only mutated those residues not involved in binding! So 12 of the 57 were deemed critical for proline-rich peptide binding. Then out of the remaining (presumably providing mainly structural characteristics to form the beta-barrel), they managed to get 40 to be simplified. The 95% number already discussed in this thread then comes from this statement:
Three of the five positions that resisted simplification are at or near the binding site; the protein scaffold that supports the binding site is 95% I, K, E, A, and G.
That is a bit different from claiming 95% of amino acids in a protein can tolerate change. Now to be fair, the authors do address the 12 non-mutated binding residues:
Since there are exposed large hydrophobic residues at the binding site that may compromise stability, further simplification could likely be achieved if the requirement for binding were relaxed. For example, of the 12 positions held fixed in this study, half were shown to tolerate alanine substitutions in the Sm5 SH3 domain; judging from the effects on expression levels, only one of these mutants appeared to have significantly reduced stability.
Now this is surprising: if they could truly have mutated those residues along with simplifying the residues not involved in proline recognition, this would have had a much greater impact and made a bigger story to tell. Yet they have not published anything of the sort, but simply reference a paper that mutates one residue at a time to an A to determine critical residues. This is a far cry from mass mutation, and I do think it is telling that they have not done this themselves (which usually means they did it but it did not work). The authors go on to characterise some of the structural features and states of the variants they have produced, but I am not sure how important it is to go into detail, as I think the above spells out quite clearly that what this paper was being cited to show is emphatically not what it demonstrates. So a reflection point here:
- Does this paper support the assertion that whole functional proteins can retain biologically relevant function when the vast majority of their amino acids are simplified to a 5-letter aa alphabet?
- Does this paper support the assertion that 95% of a protein’s residues are mutatable?
- In the last 20 years, is this the best support of the above assertions? Where are the papers looking at simplifying, say, ATP synthase or something fundamental to most of life from the simplest form known?
- Who do you think is really being honest here?
Dr JDD
June 25, 2016, 05:04 AM PDT
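The residue counts quoted in the comment above can be tallied in a few lines. This is only a sketch in Python: the figures (57, 12 and 40) are the ones reported in the comment, not re-derived from the paper, and the names `DOMAIN_LENGTH`, `FIXED_BINDING`, `SIMPLIFIED` and `percent` are made up for illustration.

```python
# Residue counts as reported in the comment above (not re-derived from the paper).
DOMAIN_LENGTH = 57   # residues in the SH3 domain
FIXED_BINDING = 12   # binding residues held fixed, never simplified
SIMPLIFIED = 40      # residues in variant FP2 that accepted the reduced alphabet


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Positions where simplification was attempted at all.
targeted = DOMAIN_LENGTH - FIXED_BINDING

share_of_targeted = percent(SIMPLIFIED, targeted)
share_of_domain = percent(SIMPLIFIED, DOMAIN_LENGTH)

print(f"positions targeted: {targeted}")
print(f"simplified, as share of targeted positions: {share_of_targeted:.1f}%")
print(f"simplified, as share of the whole domain: {share_of_domain:.1f}%")
```

The point of separating the two denominators is that the share of simplified residues looks quite different depending on whether the 12 fixed binding residues are counted.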
Gpuccio: If you want access to this article go to academia.edu (or google the full title of the article and it will be one of the top hits) and you can create an account as someone with an interest in the science and get access to the pdf. I recommend it - it is an interesting paper.
Dr JDD
June 24, 2016, 11:19 PM PDT
Alicia - why are you being dishonest here? All they demonstrated was that to make a beta-barrel like fold that could bind saturating amounts of proline rich peptides immobilised on beads (low hanging fruit for "function") they could get away with replacing most of the 57 amino acids with a reduced amino acid alphabet of 5 (rather than 20). Additionally, of those 57 amino acids, 12 were not attempted to be simplified!!! Presumably because all function and/or folding was lost. Further, of the remaining, at least 4-5 for each variant they discuss did not tolerate a reduced amino acid substitution. Finally, let me reiterate that this was an SH3 domain of the src protein. It is not a protein of any use itself, simply a proline binding domain of a larger functional protein. And as already stated, it is not that difficult to get a structural feature (a beta-barrel like fold), nor to bind proline rich peptides (you will know how unique proline is and how it kinks polypeptides and can be easily interacted with by other polypeptides, even more so when binding is selected for by immobilising a high concentration of proline rich polypeptides on beads), but how is this a "functional protein" of any use in a cellular organism? And again, I come back to my original question - why did they not insert these reduced aa mutants back into the context of the whole src protein and demonstrate that the functionality of that protein was maintained? This paper does not support your assertion and you are dishonest if you continue to suggest it does!
Dr JDD
June 24, 2016, 11:16 PM PDT
Alicia, do you realize that 3^200 is quite a bit smaller than 5^200? Do you know how much smaller?
bill cole
June 24, 2016, 06:57 PM PDT
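For anyone who wants to put a number on the question in the comment above, here is a rough sketch in Python. It assumes that "3^200" and "5^200" stand for the number of possible sequences of length 200 over a 3-letter or 5-letter alphabet; the helper names `log10_sequences` and `log10_ratio` are invented for this example, and the 20-letter case is included only for comparison.

```python
import math

LENGTH = 200  # sequence length used in the comment above


def log10_sequences(alphabet_size: int, length: int = LENGTH) -> float:
    """Return log10 of the number of sequences of the given length."""
    return length * math.log10(alphabet_size)


def log10_ratio(larger: int, smaller: int, length: int = LENGTH) -> float:
    """Return log10 of (larger**length / smaller**length)."""
    return log10_sequences(larger, length) - log10_sequences(smaller, length)


for size in (3, 5, 20):
    print(f"{size}^{LENGTH} is about 10^{log10_sequences(size):.1f}")

print(f"5^{LENGTH} / 3^{LENGTH} is about 10^{log10_ratio(5, 3):.1f}")
```

On these assumptions 5^200 is larger than 3^200 by a factor of roughly 10^44. Working in logarithms keeps the output readable, although Python integers could also hold the exact values.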
Alicia
Billy, are you kidding, I said I wasn’t sure if it was three originally. So it’s five. Either way it’s a fraction of what is used by living organisms today. Are you so clueless that you don’t realize whether it’s three or five doesn’t matter? Billy, seriously, sit this one out.
What is your evidence that it has ever been less than 20? If so, how many nucleotides did DNA have in your 5 amino acid world? If still 4, can you explain how having 5 amino acids made RMNS more than a just-so story? The argument strategy you are using makes me believe you do not understand the sequence space problem that gpuccio has been trying to educate you on. By saying gpuccio does not understand biology you are losing credibility, especially when you thought the NPC was a single binding entity. Alicia, are you capable of debating without ad hominem put-downs?
bill cole
June 24, 2016, 06:54 PM PDT
gpuccio to Cartelli:
You are not capable of any serious and respectful discussion
What an arrogant and hostile thing to say!
..and your only partial excuse could be that your understanding of biology is really ridiculous.
You should be ashamed of yourself for resorting to such desperate and meaningless ad hominems.
Daniel King
June 24, 2016, 05:24 PM PDT
Billy, are you kidding, I said I wasn't sure if it was three originally. So it's five. Either way it's a fraction of what is used by living organisms today. Are you so clueless that you don't realize whether it's three or five doesn't matter? Billy, seriously, sit this one out.
Alicia Cartelli
June 24, 2016, 04:56 PM PDT
Alicia
“If it were true that proteins can be rebuilt with 5 amino acids….” Pucci, it is true. I just sent you the paper, which you demanded, that showed exactly that
Do you really believe that the paper submitted supports this? You made a claim that we could create protein structures with 3 amino acids, yet your paper refutes your claim. What's going on? When you say proteins, do you mean that all 100000 human proteins only require 5 amino acids? The paper you cited is 19 years old. Do you have any follow-up supporting evidence?
bill cole
June 24, 2016, 03:37 PM PDT
"If it were true that proteins can be rebuilt with 5 amino acids...." Pucci, it is true. I just sent you the paper, which you demanded, that showed exactly that. Do you live in a constant state of denial of scientific evidence or do you just have the attention span of a goldfish? And now you move the goalposts, demanding "libraries" of functional proteins. You obviously have no idea how experimental biology works. It's not me with a lack of understanding in biology, it's you. Sorry to burst your bubble.
Alicia Cartelli
June 24, 2016, 02:02 PM PDT
Alicia: Your initial statement:
"Bill, most proteins consist of hundreds of amino acids, many of these amino acids can be swapped out for other amino acids and the protein will still function. Proteins have been re-built in the lab using only three different amino acids I think it was."
The paper you link (the abstract, the rest is paywalled):
"Functional rapidly folding proteins from simplified amino acid sequences
David S. Riddle, Jed V. Santiago, Susan T. Bray-Hall, Nikunj Doshi, Viara P. Grantcharova, Qian Yi & David Baker
Abstract
Early protein synthesis is thought to have involved a reduced amino acid alphabet. What is the minimum number of amino acids that would have been needed to encode complex protein folds similar to those found in nature today? Here we show that a small beta-sheet protein, the SH3 domain, can be largely encoded by a five letter amino acid alphabet but not by a three letter alphabet. Furthermore, despite the dramatic changes in sequence, the folding rates of the reduced alphabet proteins are very close to that of the naturally occurring SH3 domain. This finding suggests that despite the vast size of the search space, the rapid folding of biological sequences to their native states is not the result of extensive evolutionary optimization. Instead, the results support the idea that the interactions which stabilize the native state induce a funnel shape to the free energy landscape sufficient to guide the folding polypeptide chain to the proper structure."
You are really a buffoon. Dr JDD has already commented very well. If I had access to the details of the paper, I would comment in more detail. If it were true that proteins can be rebuilt with 5 amino acids, and remain functional, why don't they do it? It should be so easy to build libraries of functional proteins with 5 AAs only.
You are not capable of any serious and respectful discussion, and your only partial excuse could be that your understanding of biology is really ridiculous.
gpuccio
June 24, 2016, 01:23 PM PDT
That is the "exact reference to the paper" that pucci asked for. Don't all get your panties in a bunch now.
Alicia Cartelli
June 24, 2016, 01:20 PM PDT
Let me clarify my question in post 280 (as I know where you will take this if I do not clarify): I am referring to the function of the entire protein and also physiologically relevant function. I am aware they isolated mutants through binding to proline rich peptides immobilised on beads, but this is neither physiological nor difficult to achieve or very meaningful. Certainly though, it is not something that demonstrates 95% of a protein can be mutated and still retain its function. That is a very misleading and false assessment.
Dr JDD
June 24, 2016, 11:04 AM PDT
Alicia @ 277: What exactly are you claiming that reference demonstrates? That the sequence space for a specified functional protein is vast (only 5 aa necessary)? It seems to me that you are stretching what the Baker paper you referenced demonstrates. It seems to only show that 5 aa are necessary to get the simple beta-barrel fold found in an SH3 domain. It also shows that some of the substituted forms based on a simple aa availability can fold as fast or even faster than wt and show good stability. The problem is, a beta-barrel is not a function. It is a structural feature or a type of protein fold. Has anyone put this into a functional protein with a beta-barrel and demonstrated function is retained (at a rate reasonable for a cellular function)? Interestingly, the authors state in the introduction when referring to alpha helices: "In complementary studies, entire helical bundle architectures have been built from reduced amino acid alphabet but for the most part they do not appear to have the ordered packing characteristic of biological proteins." Note the last part of the sentence.
Dr JDD
June 24, 2016, 10:30 AM PDT
Alicia
Early protein synthesis is thought to have involved a reduced amino acid alphabet. What is the minimum number of amino acids that would have been needed to encode complex protein folds similar to those found in nature today? Here we show that a small beta-sheet protein, the SH3 domain, can be largely encoded by a five letter amino acid alphabet but not by a three letter alphabet. Furthermore, despite the dramatic changes in sequence, the folding rates of the reduced alphabet proteins are very close to that of the naturally occurring SH3 domain. This finding suggests that despite the vast size of the search space, the rapid folding of biological sequences to their native states is not the result of extensive evolutionary optimization. Instead, the results support the idea that the interactions which stabilize the native state induce a funnel shape to the free energy landscape sufficient to guide the folding polypeptide chain to the proper structure.
How do you think this paper supports your position that RMNS is the cause of diversity?
bill cole
June 24, 2016, 10:29 AM PDT
Oooh, another literature bluff! How fun!
Eric Anderson
June 24, 2016, 09:19 AM PDT
http://www.nature.com/nsmb/journal/v4/n10/abs/nsb1097-805.html
Took me all of five seconds to find it, pucci.
Alicia Cartelli
June 24, 2016, 06:57 AM PDT
Alicia: The reference, please. Exact reference to the paper. Otherwise, I will not go on discussing with you.
gpuccio
June 24, 2016, 04:10 AM PDT
"But neutral evolution goes on anyway. Again, how is it that we have such a low ka/ks for stat 3? You have not answered."
Yes, I did, you must not like to read. I said that STAT3 has maintained its sequence because of its importance in both development and the immune response. Small changes in the sequence of this protein do not mean a loss of function, but they can lower function enough so as to alter development or the immune response. Negatively affecting these processes does not mean immediate selection against, but over timescales of millions of years, the changes will be selected against. This is what preserves these proteins' sequences.
"Exact reference, please"
I mention this research pretty much every time I come to UD. Obviously you are impermeable to knowledge. The David Baker lab did the work I was referring to, but it was 5 amino acids. They found that for some proteins, substitutions were tolerated at 95% of amino acids.
Yes, these things are all expendable, as I said, depending on conditions. Certain amino acids are essential for us and certain ones are not because of our evolutionary past. Certain amino acids were abundant for our ancestors and the pathways to make those molecules became expendable. The loss of those pathways remains in the descendants of those species, which includes us.
"And your distinction between “function” and “optimal function” is really pitiful."
They are two very different things. The first is simply the protein's ability to carry out its molecular function, such as binding DNA or another protein, or both, whatever. The second takes into account the protein's role in a certain pathway and the effect on the species if the protein undergoes changes. These two things do not always change in a 1:1 ratio. You simply don't have a good enough grasp of molecular biology, nor an evolutionary perspective whatsoever.
Bill, for the last time, the NPC importing things and exporting things is controlled by protein interactions via localization signals. That's proteins BINDING to each other and the sequences can be very different, as you can see by the localization signals in your own quote.
"I don’t believe"
Like I said, when it comes to biology, nobody cares what you "believe in."
Alicia Cartelli
June 23, 2016, 07:10 PM PDT
Nope, not me. I do recommend you read the "War is Over" thread though if you're interested in lost genes.
MatSpirit
June 22, 2016, 10:32 PM PDT