In my second response to Arthur Hunt on the origin of the T-urf13 gene (which specifies a mitochondrial ligand-gated pore-forming receptor for T-toxin in maize), I briefly mentioned, towards the end of the post, Hunt’s comments on the Panda’s Thumb blog regarding the Axe (2004) result on the rarity of catalytic domains within sequence space.
As I noted in my previous post, Axe’s 2004 JMB paper is not an isolated result. I cited a number of papers which reached similar conclusions about the rarity of functional domains within sequence space. One study, published in Nature in 2001 by Keefe & Szostak, documented that more than a million million random sequences were required in order to stumble upon a functioning ATP-binding protein, a protein substantially smaller than the transmembrane protein specified by T-urf13, the gene discussed by Hunt. In addition, I noted, a similar result was obtained by Taylor et al. in their 2001 PNAS paper. This paper examined the AroQ-type chorismate mutase and arrived at a similarly low prevalence: 1 in 10^24 for the 93-amino-acid enzyme, which, when adjusted to reflect a sequence of the same length as the 150-amino-acid section analysed from Beta-lactamase, yields a result of 1 in 10^53. Yet another paper, by Sauer and Reidhaar-Olson (1990), reported on “the high level of degeneracy in the information that specifies a particular protein fold,” and gave a figure of 1 in 10^63. In my previous post, I also strongly encouraged Arthur Hunt and others to read Douglas Axe’s excellent review article in Bio-complexity, which covers this topic in more detail, as well as the recently published The Nature of Nature: Examining the Role of Naturalism in Science, which is highly accessible for non-specialists.
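To put the figures quoted above on a common footing, here is a minimal Python sketch of the underlying arithmetic: if a fraction p of sequences is functional, the chance that N independent random draws include at least one functional sequence is 1 - (1 - p)^N. The prevalence values are the ones quoted in this post; the number of trials, `N_TRIALS`, is a purely hypothetical figure chosen for illustration and is not taken from any of the cited papers. Independent, uniform sampling is likewise an assumption of the sketch, not a claim about how any real process searches sequence space.

```python
import math

def prob_at_least_one_hit(p, n):
    """Probability that n independent draws, each succeeding with
    probability p, yield at least one success: 1 - (1 - p)**n.
    log1p/expm1 keep the result accurate when p is extremely small."""
    return -math.expm1(n * math.log1p(-p))

# Prevalence estimates as quoted in the text (1 in 10^k).
quoted_estimates = {
    "Keefe & Szostak 2001 (ATP binding)": 1e-12,
    "Taylor et al. 2001 (93 residues)": 1e-24,
    "Taylor et al. 2001 (adjusted to 150 residues)": 1e-53,
    "Sauer & Reidhaar-Olson 1990": 1e-63,
    "Axe 2004": 1e-77,
}

# Hypothetical number of random trials; an assumption for illustration only.
N_TRIALS = 1e40

for label, p in quoted_estimates.items():
    chance = prob_at_least_one_hit(p, N_TRIALS)
    print(f"{label}: p = {p:.0e}, P(at least one hit) = {chance:.3g}")
```

Changing `N_TRIALS` shows how strongly the outcome depends on the assumed number of trials relative to the assumed prevalence; the sketch says nothing about which prevalence estimate is the right one.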
Yesterday, I posted a short italicised update to my previous article, having now looked somewhat more closely at the article to which Hunt referred me. For those who missed it, allow me to highlight just a few of the points at which Hunt errs.
The key shortcoming of Hunt’s analysis appears to be the conflation of (a) the rarity of functional folds in sequence space, and (b) the ability to optimise those functional folds. But it was the purpose of Axe’s 2004 JMB paper to provide an estimate for the former, and not the latter.
Axe’s research set out to ascertain the prevalence, within the space of combinatorial possibilities, of sequence variants with a particular hydropathic signature which could form a functional structure. Hunt tells us that “Axe deliberately identified and chose for study a temperature sensitive variant. In altering the enzyme in this way, he molded a variant that would be exquisitely sensitive to mutation.” And, indeed, Axe did begin with an extremely weak (temperature-sensitive) variant, since a newly evolving fold would be expected to be poorly functional. And why would Axe do this? Because he sought to detect variants operating at the lowest level (the threshold, if you will) of detectability.
Axe sought to provide an estimate of the rarity of functional folds in all sequence space, which he gives as 1 in 10^77. This estimate was extrapolated from the number of variants which were able to carry out the function, no matter how weakly, of the TEM-1 Beta-lactamase enzyme. The graphic in Hunt’s article seems to me to betray something of a misapprehension of Axe’s result and experimental purpose, and also appears to misconstrue the real-life scenario. His graphic illustrates the shape of a generously favourable fitness landscape for one particular fold. What he should have shown is the landscape for all sequence space, portraying functional folds, as they are in the real world of biology, as isolated peaks. The Darwinian mechanism may well be able to optimise a protein to a higher and higher level of function if by chance it can locate the base of a smooth fitness peak. But the problem facing neo-Darwinism is its impotence in finding those functional peaks in the first place.
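The distinction drawn here between climbing a peak and finding one can be illustrated with a toy model. The following is a minimal sketch, not a model of real protein sequence space: it assumes a one-dimensional landscape that is flat (fitness 0) everywhere except for narrow, tent-shaped peaks, and a climber that only ever steps to a strictly fitter neighbour. The function names, the landscape size, the peak position and the peak width are all hypothetical choices made for illustration. Note that the isolated-peak shape is an input to the sketch rather than something it demonstrates; the actual shape of the landscape is precisely the empirical question under dispute.

```python
def make_landscape(size, peak_centers, half_width):
    """Fitness is 0 on the plain and rises linearly to 1.0 at each peak centre."""
    fitness = [0.0] * size
    for centre in peak_centers:
        for offset in range(-half_width, half_width + 1):
            pos = centre + offset
            if 0 <= pos < size:
                height = 1.0 - abs(offset) / (half_width + 1)
                fitness[pos] = max(fitness[pos], height)
    return fitness

def hill_climb(fitness, start):
    """Step to the fitter neighbour until no neighbour is strictly fitter."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(fitness)]
        best = max(neighbours, key=lambda p: fitness[p])
        if fitness[best] <= fitness[pos]:
            return pos  # stuck: either on a summit or on the flat plain
        pos = best

def fraction_reaching_summit(fitness):
    """Fraction of all starting positions from which the climber ends on a summit."""
    hits = sum(1 for s in range(len(fitness))
               if fitness[hill_climb(fitness, s)] == 1.0)
    return hits / len(fitness)

# Hypothetical toy parameters: 1000 positions, one peak of half-width 5.
landscape = make_landscape(1000, [500], 5)
print(hill_climb(landscape, 497))           # starts on the slope, ends at 500
print(hill_climb(landscape, 100))           # starts on the plain, stays at 100
print(fraction_reaching_summit(landscape))  # 0.013
```

In this toy setting the climber succeeds from every start on or adjacent to the slope and from nowhere else, so the success rate is set entirely by how much of the space the peaks are assumed to occupy.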
In summary, then, we can conclude that Arthur Hunt appears to substantially misapprehend the significance of Axe’s result. The key shortcoming of Hunt’s argument is that he conflates two very different questions: the rarity of functional protein folds in sequence space, and the difficulty of optimising those folds. Consider a fitness landscape comprising a few thousand peaks, each one representing a different functional fold. These peaks are extremely rare and, moreover, widely dispersed throughout sequence space. If by some fluke one landed at the base of one of those peaks, then it stands to reason that one might be able to scale that peak by a Darwinian-type process. But if one were to land somewhere on the flat plain of non-functionality, miles from any peak, the Darwinian model relies too heavily on random chance to be considered a viable means of locating a functional peak by blind search. This problem, of course, is compounded manyfold by the need for multiple, functionally specific proteins which must work together in even the cell’s most basic activities. In sum, there is no reason to think that such a search is even plausible.