Evolutionist: You’re Misrepresenting Natural Selection

How could the most complex designs in the universe arise all by themselves? How could biology’s myriad wonders be fueled by random events such as mutations?

229 Responses to Evolutionist: You’re Misrepresenting Natural Selection

  1. Just great- eliminate the bad designs and then randomly vary the working designs until they too get eliminated. And THAT is how to do it.

  2.

    With monkeys typing random letters, but an intelligent agent selecting them, you COULD generate “methinks it is like a weasel” or even the collected works of Shakespeare in a reasonable amount of time. You wouldn’t say it is impossible in this manner to generate something clever through random letters, would you? So for once I think I disagree with you, though I may be misunderstanding you. The problem with natural selection is it isn’t intelligent; you have to have intelligence hidden somewhere in the process, either an intelligent mutation generator, or an intelligent selector. But to generate animals and plants, I think you need much more than artificial selection, where an intelligent agent selects based on visible traits; you would need a selector who could see what is going on at the microscopic level and envision where useless mutations might accumulate to become useful.
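The "random letters plus a selector" idea in the comment above can be sketched as a small program. This is only an illustration under stated assumptions: the target phrase, the alphabet, the brood size of 100, and the 5% per-letter mutation rate are all choices made here, not anything specified in the thread, and the selector simply compares each candidate against a known target.

```python
import random
import string

# Assumed for illustration: uppercase letters plus space, and a fixed target phrase.
ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"


def score(candidate, target=TARGET):
    """Count the positions where the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent, replacing each letter with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c for c in parent)


def weasel(target=TARGET, offspring=100, rate=0.05, seed=0, max_generations=10_000):
    """Repeatedly breed random variants and keep the one closest to the target."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    generations = 0
    while parent != target and generations < max_generations:
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        brood.append(parent)  # keep the parent so the best score never drops
        parent = max(brood, key=lambda s: score(s, target))
        generations += 1
    return parent, generations


if __name__ == "__main__":
    best, generations = weasel()
    print(best, "after", generations, "generations")
```

Note that the selector here has the target written into it, which is exactly the "intelligent selector" the comment describes; the sketch says nothing about whether an unguided environment can play that role.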

  3. Granville:

    I absolutely agree with you. Intelligent selection is a powerful principle, as shown in bottom up protein engineering.

    There are two fundamental differences between IS and NS:

    a) IS can select for any defined function, even if not immediately useful. NS can select only for those functions that give a reproductive advantage in a specific context (that is, an extremely tiny subset of all possible functions).

    b) IS can select functions even at very low levels. IOWs, IS can recognize a function even in its raw manifestation, and then optimize it. NS requires that the function level be high enough that it can give the reproductive advantage at the phenotypic level.

    Both points are extremely important, and both points are the consequence of the intervention of intelligence and purpose in the process. Moreover, bottom up IS can well be integrated with top down engineering in the design process.

    All those possibilities are denied to non intelligent processes.

  4. The problem with natural selection is it isn’t intelligent; you have to have intelligence hidden somewhere in the process, either an intelligent mutation generator, or an intelligent selector.

    The weeds randomly disperse their seeds. Those that land in fertile soil survive, while the others fail to germinate. That’s natural selection at work.

    According to Granville, the weeds should not take over my garden. Can Granville please explain to the weeds that they are doing it wrong?

  5. Neil Rickert:

    The weeds randomly disperse their seeds. Those that land in fertile soil survive, while the others fail to germinate. That’s natural selection at work.

    No, it isn’t. It is only natural selection when the differential reproduction is due to heritable random variation.

    I have explained this to you already and even provided the references. That means for you to keep misrepresenting natural selection you must have some serious issues.

  6. It is only natural selection when the differential reproduction is due to heritable random variation.

    When the weed reproduces, the majority of its offshoots will be in about the same location. So the location is heritable.

    Whether or not “natural selection” is the technically correct term does not matter here. The point is that the situation is similar enough that Granville’s reasoning should show that the weeds won’t take over gardens. But they do.

  7. When the weed reproduces, the majority of its offshoots will be in about the same location. So the location is heritable.

    Wow, just wow. You have no idea what you are talking about.

    And no your “reasoning” has nothing to do with anything Granville said. But please at least attempt to make your case.

  8. The problem with natural selection is it isn’t intelligent; you have to have intelligence hidden somewhere in the process, either an intelligent mutation generator, or an intelligent selector.

    Why? Selection is itself, in a sense, ‘intelligent’, without any connotation of awareness or purpose. It sifts solutions and discards the poorer, thereby conserving the better. Which is what intelligent designers do. But with or without the I-word, it needs nothing ‘hidden’ beyond what it provides: enrichment of populations in alleles that cause their bearers to produce more offspring than the alternatives. The selective agent can be cold, or drought, or predators or prey &c. These selective agents need no access to microscopic levels, or an ‘envisioning’ capacity. They just cull, differentially.

  9. The problem is with natural selection there isn’t any selecting going on.

    The Origin of Theoretical Population Genetics (University of Chicago Press, 1971), reissued in 2001 by William Provine:

    Natural selection does not act on anything, nor does it select (for or against), force, maximize, create, modify, shape, operate, drive, favor, maintain, push, or adjust. Natural selection does nothing….Having natural selection select is nifty because it excuses the necessity of talking about the actual causation of natural selection. Such talk was excusable for Charles Darwin, but inexcusable for evolutionists now. Creationists have discovered our empty “natural selection” language, and the “actions” of natural selection make huge, vulnerable targets. (pp. 199-200)

    Thanks for the honesty Will.

  10. Chas D:

    No. NS is only differential reproduction. The “selection” is not a selection, but rather the result of the interaction between reproductive function in the replicators and the constraints on the environment. But it is the reproductive function in the replicators that, essentially, gives the results that we improperly call “selection”. As darwinists always say, you cannot have NS unless you have replicators.

    So, an objective environment in itself selects nothing. Replicators “select themselves”, but they do that in interaction with the existing environment. That’s why NS can “select” only one thing: reproductive advantage. Because it is a byproduct of reproduction, and nothing more.

  11. What you are all missing, even you, Dr. Sewell, is that it is not obvious that even with intelligence in the picture a major modification of a complex system is possible one small step at a time if there is a requirement that the system continue to function after each such step.

    For example, consider a WWII fighter, say the P51 Mustang. Can you imagine any series of incremental changes that would transform it into a jet fighter, say the F80, and have the plane continue to function after each change? To transform a piston engine fighter into a jet fighter requires multiple simultaneous changes for it to work–an entirely new type of engine, different engine placement, different location of the wings, different cockpit controls and dials, changes to the electrical system, different placement of the fuel tanks, new air intake systems, different materials to withstand the intense heat of the jet exhaust, etc., etc., etc. You can’t make these changes in a series of small steps and have a plane that works after each step, no matter how much intelligence is input into the process.

    Now both a P51 and an F80 are complex devices, but any living organism, from the simplest cell on up to a large multicellular plant or animal, is many orders of magnitude more complex than a fighter plane. If you believe that it is possible to transform a reptile with a bellows lung, solid bones and scales, say, into a bird with a circular flow lung, hollow bones, and feathers by a series of small incremental changes each of which not only results in a functioning organism, but a more “fit” one, then the burden of proof is squarely on your shoulders, because the idea is absurd on the face of it.

  12. NS is only differential reproduction.

    No – NS is also differential survival. And Drift is differential survival/reproduction too, without the causal link to particular alleles that characterises NS.

    As darwinists always say, you cannot have NS unless you have replicators.

    Some … darwinists (FFS!) may say that, but the interaction is not with replication. The interaction is between the environment and instances of an entity. That some of these entities go on to replicate is relevant, but it is incorrect to say that “Replicators select themselves”. If there is variation in ability to withstand cold, and it gets colder, the population – breeding and non-breeding, genetically and non-genetically equipped – becomes enriched in cold-survivors, and impoverished in the cold-susceptible. The environment ‘selects’ these individuals by killing them, and selects the remainder by default. Those that are left can go on to breed, and those whose cold-tolerance was genetic can go on to pass that to offspring. But it is important to note that the selection has culled both non-breeding individuals (infertile, too old, surplus) and breeding individuals. Only the latter are evolutionarily important, hence the significance of the ‘replicator’ emphasis, but the process is blind to reproductive capacity.

    For semantic reasons one could insist that selection demands a decision, and hence a decider. To which I would say … meh. Intelligent and active choice of hairy individuals for breeding is indistinguishable from the tendency of hairier individuals to survive cold, in terms of the effect on the population.
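The cold-tolerance example in the comment above can be put into numbers. This is a minimal sketch with invented figures: the population sizes and the per-type survival fractions are assumptions chosen only to show how differential culling shifts the make-up of a population, whatever one decides to call the process.

```python
def cull(population, survival):
    """Apply a per-type survival fraction to each group (expected numbers, no randomness)."""
    return {kind: count * survival[kind] for kind, count in population.items()}


def frequency(population, kind):
    """Share of the whole population made up by one type."""
    return population[kind] / sum(population.values())


if __name__ == "__main__":
    # Hypothetical starting population and survival rates for one cold winter.
    before = {"tolerant": 200.0, "susceptible": 800.0}
    winter = {"tolerant": 0.9, "susceptible": 0.3}

    after = cull(before, winter)
    print("tolerant share before:", round(frequency(before, "tolerant"), 3))
    print("tolerant share after: ", round(frequency(after, "tolerant"), 3))
```

The cull itself makes no reference to whether an individual can breed, which matches the comment's point; whether the shift persists into the next generation depends on how much of the tolerance is heritable, which this sketch does not model.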

  13. Chas D:

    Even if there is not great difference, for my reasoning, between reproduction and survival, still I must say that IMO you are wrong.

    Reproduction is the parameter that is relevant for the neo darwinian algorithm, not survival. Survival without reproduction does not influence in any way the evolution of genomes. It can be true that, in most cases, better survival and better reproduction are connected, but that is not always the case. So, I maintain my point: NS is about differential reproduction.

    Intelligent and active choice of hairy individuals for breeding is indistinguishable from the tendency of hairier individuals to survive cold, in terms of the effect on the population.

    That’s correct. Indeed, it is only for complex functions that you clearly see the difference. IS can build complex functions. NS cannot. That’s, indeed, the main point of ID theory.

  14. Bruce:

    You are essentially correct, but please consider that engineered modifications can be implemented in a complex organism while retaining the old functionality, and then the new plan can be activated when everything is ready. I am not saying that’s the way it was done, but that it is possible.

    For instance, and just to stay simple, one or more new proteins could be implemented using duplicated, non translated genes as origin. Or segments of non coding DNA. That’s, indeed, very much part of some darwinian scenarios.

    The difference with an ID scenario is that, once a gene is duplicated and inactivated, it becomes non visible to NS. So, intelligent causes can very well act on it without any problem, while pure randomness, mutations and drift, will be free to operate in neutral form, but will still have the whole wall of probabilistic barriers against them.

  15. Even if there is not great difference, for my reasoning, between reproduction and survival, still I must say that IMO you are wrong. Reproduction is the parameter that is relevant for the neo darwinian algorithm, not survival.

    I agree to a point; as I stated, in evolutionary terms survival only matters if it leads to enhanced reproductive output for the allele of interest. But you were making the bolder claim that “The “selection” is not a selection, but rather the result of the interaction between reproductive function in the replicators and the constraints on the environment.”

    But the selective interaction is NOT with reproductive function, but with instances of potential reproducers. I don’t see what stops that being termed a “selection”, by whatever agency is operating on a particular trait, just as my Christmas chocs become enriched in soft centres by my selection of the caramels.

    it is only for complex functions that you clearly see the difference. IS can build complex functions. NS cannot.

    What stops it? ID boils down to opinions such as the above. My opinion is that NS – more properly, the combination of selection and drift – can build complex functions from raw mutation. I think the typical formulation (articulated by Bruce David below) is perhaps a little restrictive – it is not essential that every step be more fit than its predecessor, but that it not be overly less fit. Other than that, I simply disagree that this restriction exists. Which may have the regulars guffawing into their Christmas punch at the deluded ‘darwinist’, but …

  16.

    Bruce,

    You have a good point, and in fact the evidence is that major advances (new orders, classes and phyla) in evolution did NOT occur gradually. I was not actually arguing that evolution WAS due to intelligent selection of random mutations, was just saying that didn’t seem absolutely impossible. In any case, my point was that it IS possible to produce something significant (eg, the works of Shakespeare) through randomly generated letters, provided an intelligent selector, for example, Wm Shakespeare, is there to do the selecting. Of course he could write it faster if he typed it himself, but if he has to wait for the monkey to type the letters he wants, he can still produce great works in a reasonable amount of time :-)

  17. Chas D:

    Other than that, I simply disagree that this restriction exists.

    OK, but that’s the whole ID theory. You simply disagree, without considering the arguments. Your choice.

  18. gpuccio:

    You are essentially correct, but please consider that engineered modifications can be implemented in a complex organism while retaining the old functionality, and then the new plan can be activated when everything is ready.

    This cannot be done in the WWII fighter plane transformation example I gave above (#3), and I contend it cannot be done if one is modifying, say, a bellows lung into an avian lung. You can’t have an avian lung waiting in the wings, so to speak, ready to be activated when complete. For one thing, it requires changes to other systems in the organism–circulatory, nervous, etc. For another, it would hardly increase the fitness of the organism for it to have two completely different lungs in its body, one functioning and one under construction. Furthermore, the whole point of Darwinism is that the astronomical odds against the construction of such structures by random mutations are mitigated by building them one small step at a time via random mutations selected by natural selection. To build the entire structure which is then “activated when everything is ready” contradicts the basic premise of the Darwinian explanation.

  19. Bruce:

    It obviously contradicts the premise of the darwinian explanation. Indeed, it is a possible model of design explanation.

    I agree with the points you make, but still I think we must consider the possibility of a semi gradual design at least in some cases. For example, new proteins could be designed that way, and contribute to new design plans. The general implementation of design in natural history remains, IMO, vastly an object for future research.

  20. Dr. Sewell,

    I am a great admirer of yours, by the way. I think your analysis of the development of life and human technology vis a vis the Second Law is a significant contribution to the whole debate.

    I agree with you regarding Shakespeare in particular and literature, art, and music in general. However I think that functioning systems are a different type of entity, and have significant restrictions on how it is possible to develop and modify them (again, assuming that there is a requirement that each new version must work). I further believe that this is often overlooked by both sides of the Darwinism/ID debate.

  21. Ok, I won’t argue with that. And certainly, an exploration of how the design of living systems could have been implemented is a worthy research project.

  22. b) IS can select functions even at very low levels. IOWs, IS can recognize a function even in its raw manifestation, and then optimize it. NS requires that the function level be high enough that it can give the reproductive advantage at the phenotypic level.

    Perhaps you can provide an example.

    I’m thinking that animal and plant breeding has focused more on specific “function” rather than on the kind of diversity that forestalls extinction.

    I’m thinking of crop failures resulting from cloned vegetables — potatoes and bananas come to mind.

    Intelligent selection is very good at maximizing a specific trait, but I’m not aware of any theory of selective breeding that would maximize survival of populations over long time frames.

  23. Chas D:

    “Other than that, I simply disagree that this restriction exists.”

    OK, but that’s the whole ID theory. You simply disagree, without considering the arguments. Your choice.

    Oh, I’ve considered ‘em! A contrary position is not always an indicator of a lack of the same thought that you yourself have put in! But one would need something to put the matter beyond the purview of mere opinion-trading – a methodology that would enable the distinction to be made between ‘conventional’ evolutionary processes – mutation, selection and drift – and those of an interventionist character. The mainstream is under no obligation to demonstrate the non-existence of ID’s barrier to undirected change.

  24. Chas D:

    My opinion is that NS – more properly, the combination of selection and drift – can build complex functions from raw mutation.

    Great as long as you understand that is an unsupported opinion.

    Also ID is not anti-evolution so your “‘conventional’ evolutionary processes” is nothing but an equivocation.

    The mainstream does have an obligation to present positive evidence for their opinions. And mainstream should not think their opinions amount to scientific evidence.

  25. The mainstream is under no obligation to demonstrate the non-existence of ID’s barrier to undirected change.

    How is that any different from stating that the mainstream is under no obligation to demonstrate the vast potential for undirected change resulting from natural selection?

    You’re claiming that natural selection is responsible for much or most of biological diversity, but that no one is obligated to provide evidence.

    You’re right. No one is ‘obligated’ to support anything they claim. And no one is obligated to take it seriously.

    But surely you must realize that the ‘cornerstone of biology’ must come with evidence. That would be necessary even if no one challenged it. That’s how science works – not just floating a proposed engine of change and leaving to others the burden of determining what barriers it faces.

    Except, that is, in the bizarro world of evolutionary science where anything that’s plausible is true, and anything that isn’t plausible can be true if it has to be.

    It’s like the 5 year old playing softball with the grown-ups. He’s weak and helpless, so you suspend the rules for him. You throw him an easy pitch, swing the bat for him, and everyone stands still and cheers for as long as it takes him to run the bases.

  26. Chas D:

    Oh, I’ve considered ‘em! A contrary position is not always an indicator of a lack of the same thought that you yourself have put in!

    Chas, please have some respect for my intelligence. I did not mean that you have never considered the ID arguments in your private thoughts. How can I know what you have considered or not considered?

    I just meant that in your post there was no argument that took into explicit consideration any of the ID arguments, and that you had expressed your disagreement without motivating it in any way. I apologize if my meaning was not clear enough.

    The mainstream is under no obligation to demonstrate the non-existence of ID’s barrier to undirected change.

    Well, I would say that “the mainstream”, whatever that means, is under strong obligation to demonstrate that mainstream theories work and are credible and consistent. As ID theory has expressed many detailed and explicit arguments for the falsification of a specific mainstream theory, the obligation to answer those points remains, mainstream or not mainstream.

    Unless you believe that just being many (“mainstream”) gives a right to be always right…

  27. the evidence is that major advances (new orders, classes and phyla) in evolution did NOT occur gradually.

    What evidence?

    And why should new orders, classes or phyla require more “major advances” than, say, the evolution of a marine or flying mammal from a land-dwelling one? Or indeed from any speciation event?

  28. As ID theory has expressed many detailed and explicit arguments for the falsification of a specific mainstream theory…

    Really? What physical laws are violated by mainstream theories of evolution?

  29. No known speciation event has produced new body plans with new body parts.

  30. What physical laws are violated by mainstream theories of evolution?

    What explanation of the evolution of anything is there that could be examined to determine whether it violates any physical laws?

    Put another way, if you carefully steer clear of ever explaining anything in specific terms, your explanation will almost certainly not violate any physical laws. If you see that as a strength and not a weakness, more power to you.

  31. In that case please point to the “new body plans with new body parts” that you think did not arise from a speciation event.

  32. What explanation of the evolution of anything is there that could be examined to determine whether it violates any physical laws?

    Well, the tracking of finch-beak-sizes with seed-size availability in the Galapagos, for example.

    There’s an evolutionary explanation – what physical laws does it violate?

  33. Wrong Elizabeth. YOU have to show that speciation can give rise to “new body plans with new body parts”.

    However you can’t even get from prokaryotic to eukaryotic- endosymbiosis only explains the power-plants, nothing else. So perhaps you could start there.

  34. To keep that in context, by nature the falsification of a vague explanation with no specifics cannot itself be very specific.

    Several days ago I challenged anyone to pick something, anything they wish, and offer the evolutionary explanation of it. (To be reasonable, I asked for something more than colored cichlid fishes.) Frog feet, bat wings, anything. Rather than unreasonably choosing some complex phenomenon and demanding the evolutionary explanation, I reasonably asked for an evolutionary explanation of something, anything. Use the cornerstone of biology to explain something biological.

    After a few comments asking questions about the question the conversation went dead. Maybe it was the holidays. But they’re over and no one has answered what should be the easiest question ever. (Please don’t ask me another question about the question.)

    Here is the question. It’s so simple that it requires no clarification, and I clarified it anyway. Then everyone apparently got busy.

    The answer is no, you haven’t said anything that violates any physical laws. Technically you haven’t said anything at all.

  35. Wrong Elizabeth. YOU have to show that speciation can give rise to “new body plans with new body parts”.

    Well, I can’t do that without knowing what data you have in mind.

    However you can’t even get from prokaryotic to eukaryotic- endosymbiosis only explains the power-plants, nothing else. So perhaps you could start there.

    Well, there are a number of theories, but clearly it’s harder to explain something so remote in the past for which there are so little data. However, symbiosis is one current hypothesis.

    But the origin of eukaryotes wouldn’t be an example of “new body plans” or “body parts” anyway. The earliest eukaryotes would have been single celled organisms.

  36. Well, the tracking of finch-beak-sizes with seed-size availability in the Galapagos, for example.

    There’s an evolutionary explanation – what physical laws does it violate?

    There are two ways I could answer that. First, I could point out that you’re equivocating. Darwin claimed that evolution was an explanation of the origin of species. If you wish to redefine it as the process of minor variations within species then we can shake hands and be done with it.

    That’s the best answer. The other is that even in this simple case, you haven’t even provided an explanation, only an observation. You still haven’t stated anything that can violate or not violate physical laws.

    The example is so trivial that I’m not interested in splitting hairs over it. I don’t deny that natural selection occurs.

    Am I moving the goalposts? Not intentionally. You moved them first. I asked for an evolutionary explanation and you moved the posts closer by reducing it to its most trivial, non-explanatory definition.

    Do you see the difficulty this creates? I’m trying to be fair and reasonable. But if I set the goalposts by asking anything specific such as how beaks evolved, then I’m being unreasonable and asking too much. But if I give you leeway to set them as you wish, you place them one inch in front of the ball.

    It makes me feel trifled with. If that’s all that can be explained with evolution then so be it. There’s nothing to discuss. Everyone will stop asserting that evolution explains anything more significant in biology, and I won’t argue.

  37. Asking what physical laws are violated is rather pointless anyway. Flipping a penny a hundred times and getting all heads doesn’t violate any natural laws.
    Please don’t jump on it – my point is not a comparison of flipping coins to evolution.
    But that a sequence of events does not violate laws does not make it plausible, probable, or a reasonable explanation. And that’s entirely apart from the reality that in this case there are no sequences of events to be considered probable, plausible, or otherwise.

  38. Every possible sequence of flipped coins is equally likely or unlikely.

    Predicting a specific sequence is unlikely, but finding a sequence unlikely after the tosses is pointless.
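The arithmetic behind the coin-flip exchange can be checked directly. This is a small sketch assuming a fair coin and independent flips; the function names are made up for illustration. It computes the probability of one exact sequence and, separately, the probability of a given number of heads in any order, since those are two different questions.

```python
from fractions import Fraction
from math import comb


def sequence_probability(sequence):
    """Probability of one exact sequence of fair, independent coin flips."""
    return Fraction(1, 2) ** len(sequence)


def heads_count_probability(flips, heads):
    """Probability of getting exactly `heads` heads in `flips` flips, in any order."""
    return Fraction(comb(flips, heads), 2 ** flips)


if __name__ == "__main__":
    all_heads = "H" * 100
    mixed = "HT" * 50

    # Any one exact sequence of 100 flips has the same probability, 1 / 2**100.
    print(sequence_probability(all_heads) == sequence_probability(mixed))

    # The number of heads is a different quantity: many sequences share a count of 50.
    print(float(heads_count_probability(100, 100)))
    print(float(heads_count_probability(100, 50)))
```

Exact fractions are used so that the equality test between the two sequences is not affected by floating-point rounding.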

  39. How about I just cut to the chase? Darwin and a bunch of other folks claim that variation and selection (with a few other odds and ends thrown in) explain the origin of species.

    I claim that they explain only variations within species.

    When pressed for specifics, the only evidence available supports my position, not theirs or yours.

    That’s not a vindication of ID by any means. It doesn’t even prove that evolution cannot explain the origin of species. What it proves is that there is nothing to disprove. There is nothing to test, nothing to falsify.

    Save your phylogenetic trees and fossils, as no account of genetic variation combined with selection can be extracted from them. Hence they neither provide nor contribute to even a hypothetical evolutionary explanation of anything.

    I don’t like it, but you leave me no choice. There is nothing left to debate, and not to be rude, but I’m bored to death with asking the same simple question and receiving elusive, circular, speculative, or trivial answers. So I’m bookmarking this post and declaring victory on this front.

    You may now commence telling me how unjustified and presumptuous I am. Of course, why wouldn’t you? I’m bored with finch beaks, cichlid fishes and nylon-eating bacteria. If that’s what you’ve got then that’s what you’ve got. The origin of species or a species it isn’t.

    I’m raising the flag. Anytime you want it come and get it. The game isn’t over. There’s no clock and no whistle. But everyone wants to talk and no one wants to kick the ball. No one even wants to defend the other goal. I’m not unreasonable. How many people have asked for a simple hypothesis, and how many times? I issued the challenge in the most reasonable, generous terms possible and the crickets are still chirping. I’m bored and I’m going home.

  40. What evidence, Elizabeth? Please have a look here:

    http://www.darwinsdilemma.org/pdf/faq.pdf
    http://www.darwinsdilemma.org/
    http://www.nature.com/news/eni.....ria-1.9714
    http://www.arn.org/blogs/index.php/literature

    In “The Edge of Evolution”, Dr. Michael Behe argues that phyla were probably separately designed because each phylum has its own kernel that requires design. He also suggests that new orders (or families, or genera – he’s not yet sure which) are characterized by unique cell types, which he thinks must have been intelligently designed, because the number of protein factors in their gene regulatory network (about ten) well exceeds the number that might fall into place naturally (three).

  41. Well, the tracking of finch-beak-sizes with seed-size availability in the Galapagos, for example.

    There’s an evolutionary explanation – what physical laws does it violate?

    There are two ways I could answer that. First, I could point out that you’re equivocating. Darwin claimed that evolution was an explanation of the origin of species. If you wish to redefine it as the process of minor variations within species then we can shake hands and be done with it.

    I’m not redefining it at all. But speciation, as opposed to adaptation down a single lineage, is a horizontal concept, and if a population subdivides into two non- or rarely-interbreeding subpopulations, that longitudinal process of adaptation will result in two different species.

    That’s the best answer. The other is that even in this simple case, you haven’t even provided an explanation, only an observation. You still haven’t stated anything that can violate or not violate physical laws.

    Sure I have, and nothing does. The physical laws involved are multiple, from the chemical and biochemical laws involved in the reproductive processes to the climatic and geophysical laws involved in environmental change, to the basic laws of physics involved in any natural hazard. All are involved; none are violated.

    The example is so trivial that I’m not interested in splitting hairs over it. I don’t deny that natural selection occurs.

    Good.

    Am I moving the goalposts? Not intentionally. You moved them first. I asked for an evolutionary explanation and you moved the posts closer by reducing it to its most trivial, non-explanatory definition.

    An explanation of what, then?

    Do you see the difficulty this creates? I’m trying to be fair and reasonable. But if I set the goalposts by asking anything specific such as how beaks evolved, then I’m being unreasonable and asking too much. But if I give you leeway to set them as you wish, you place them one inch in front of the ball.

    You want an evolutionary explanation for the origin of beaks? Well, I don’t know much about beak evolution, but I’m sure there’s a literature on the subject.

    There’s a paper here I just googled up, for instance:

    http://onlinelibrary.wiley.com.....20825/full

    I haven’t read it, but if you want to argue that there is no evolutionary explanation for beak evolution, you’d need to point to the holes in the specific hypotheses.

    What I’m saying, and what any evolutionist is saying, is that all this stuff happened incrementally, whether it’s a change in beak size, or a change from non-beak to a slightly beakish thing. It’s all micro-evolution, including speciation, the thing about speciation being that it’s micro-evolution down two separated lineages.

    It makes me feel trifled with. If that’s all that can be explained with evolution then so be it. There’s nothing to discuss. Everyone will stop asserting that evolution explains anything more significant in biology, and I won’t argue.

    You aren’t being trifled with. What is happening is that you are dismissing as trifling the thing that is absolutely crucial. I can only think it’s a scaling problem – incremental changes look vast when collapsed across millions of years and tiny when collapsed across a couple of years. But that doesn’t mean that the processes are fundamentally different. Even on the Galapagos there are populations where there is more within-population breeding than between-population breeding. Scale that up across the years and across larger changes and you’ve got speciation.

  42. Asking what physical laws are violated is rather pointless anyway. Flipping a penny a hundred times and getting all heads doesn’t violate any natural laws.
    Please don’t jump on it – my point is not a comparison of flipping coins to evolution.
    But that a sequence of events does not violate laws does not make it plausible, probable, or a reasonable explanation. And that’s entirely apart from the reality that in this case there are no sequences of events to be considered probable, plausible, or otherwise.

    And that’s a very important point.

    And it’s a misunderstanding of the probabilities involved that lies at the heart of the ID error.

    Dembski simply got it wrong.

  43. There’s a paper here I just googled up, for instance:

    No dice. It asserts various aspects of beak evolution in broad, vague terms, calling each a ‘classic example of evolution.’

    Evolution is (roughly) a process of variation and selection. Guess how many times this paper explains the selection of a genetic variation.

    It does reference specific observed selective causes with regard to those ubiquitous finches, while stating that the specific molecular variations being selected are unknown.

    How typical. One more research paper on evolution missing just one thing – the evolution. It’s a shell game, moving around fossils and the regulatory differences between existing bird species while tossing out wild guesses at selective causes, but only for broad phenotypic changes. The authors are even vague when they guess.

    What statements in this paper are substantiated so that they may be disputed? The observed differences between chicken beaks and duck bills?

    How does one falsify a statement such as this which is made and then never supported:

    The recruitment of forelimbs as wings allowed a newly found mobility resulting from flight and opened vast eco-morphological possibilities. However, this change came at a cost, because animals now needed to develop a new feeding mechanism without the use of forearms. This development exerted selection pressures on the morphology of the face; a strong, lightweight, and effective feeding apparatus had to evolve, leading to an amazing transformation of the snout into a large range of beak shapes adapted to different ecological niches.

    The selective pressure that led to the evolution of a new feature was the evolution of a previous new feature. How convenient. How vacuous and unsupported. How creative and how pointless.

    What I said stands. A vague narrative that steers completely clear of the mechanics of evolution cannot be falsified. Placing it in the same paper as a genetic comparison of extant species does not make it a serious hypothesis. If this is the theory of how beaks evolved, then there is none.

  44. Well, none of that is what I would call evidence that phyla were more differentiated at the bifurcation point than any pair of speciating sub-populations, although once we go back that far, clearly the sampling of evidence will be much sparser, and we will have a much less clear picture of what the earliest members of each phylum looked like.

    I don’t find Behe’s argument that each phylum has a radically different “kernel” very convincing. Sure, prokaryotic cells and eukaryotic cells are different, but, as I said, we have at least one theory (symbiosis) that might explain that. And in any case for non-sexually reproducing organisms, “speciation” is a poor term – what we must postulate is cloning populations that clone along with their symbiotic inclusions. Which is perfectly possible (indeed even we “inherit” parental gut flora).

    I think you are making the mistake of assuming that because “phyla” is a term that refers not only to the earliest exemplars of each phylum but also to the entire lineage from each, that those earliest exemplars were as different from each other as we, for example, are from trees, or bacteria. It’s really important to be clear when we are talking longitudinally (adaptation over time) and when laterally (subdivisions of populations into separate lineages).

    Still, if you want to talk about gene regulatory networks, fair enough. Joe was talking about body plans. They aren’t the same.

  45. No dice. It asserts various aspects of beak evolution in broad, vague terms, calling each a ‘classic example of evolution.’

    Evolution is (roughly) a process of variation and selection. Guess how many times this paper explains the selection of a genetic variation.

    It does reference specific observed selective causes with regard to those ubiquitous finches, while stating that the specific molecular variations being selected are unknown.

    How typical. One more research paper on evolution missing just one thing – the evolution. It’s a shell game, moving around fossils and the regulatory differences between existing bird species while tossing out wild guesses at selective causes, but only for broad phenotypic changes. The authors are even vague when they guess.

    What statements in this paper are substantiated so that they may be disputed? The observed differences between chicken beaks and duck bills?

    How does one falsify a statement such as this which is made and then never supported:

    The recruitment of forelimbs as wings allowed a newly found mobility resulting from flight and opened vast eco-morphological possibilities. However, this change came at a cost, because animals now needed to develop a new feeding mechanism without the use of forearms. This development exerted selection pressures on the morphology of the face; a strong, lightweight, and effective feeding apparatus had to evolve, leading to an amazing transformation of the snout into a large range of beak shapes adapted to different ecological niches.

    The selective pressure that led to the evolution of a new feature was the evolution of a previous new feature. How convenient. How vacuous and unsupported. How creative and how pointless.

    What I said stands. A vague narrative that steers completely clear of the mechanics of evolution cannot be falsified. Placing it in the same paper as a genetic comparison of extant species does not make it a serious hypothesis. If this is the theory of how beaks evolved, then there is none.

    I have no idea if it’s “the theory” of how beaks evolved from non-beaks. As I said, I just googled it up. I haven’t read it.

    My point is that there is no difference (or rather, evolutionary theory does not posit a difference) between the processes by which a small beak evolves to become a slightly larger beak, and the processes by which a non-beak evolves to become a slight beak. You seem to be trying to carve nature at joints that aren’t there. You accept that a beak can evolve to become larger, but you can’t accept that a slightly beaky bone can evolve to become a very bony beak.

    But you present no good argument (that I have read) for where these apparently categorical cleavage points appear.

    It isn’t speciation because, as I’ve said, speciation is simply incremental adaptation down non-interbreeding lineages.

    So what is it? What is the bulwark you are seeing that we claim is not there?

    Because unless you can produce evidence for such a bulwark, there is no bulwark to explain. Macroevolution is simply microevolution over a longer time-scale and larger environmental changes.

  46. And it’s a misunderstanding of the probabilities involved that lies at the heart of the ID error.

    How exactly does one understand the probabilities correctly when no one will even attempt to specify what may or may not have happened?

    I don’t care if Dembski got it wrong. Who got it right?

    Use evolution to explain something more than a variation within species. If someone ever bothers to even offer a hypothesis that incorporates the actual mechanics of evolution to explain a case of evolution then we can talk about falsification. Nothing has merited that. There is nothing to refute. It is disqualified from the competition by its refusal to show up. The success, failure, or nonexistence of competing explanations is irrelevant. Game over.

  47. My point is that there is no difference (or rather, evolutionary theory does not posit a difference) between the processes by which a small beak evolves to become a slightly larger beak, and the processes by which a non-beak evolves to become a slight beak.

    Evolutionary theory does not posit a difference – that’s another way of saying that it asserts that they are the same. Again, how convenient.

    Meanwhile, in evolutionary terms of genetic variation and selection it does not fail, but rather does not even attempt to explain the evolutionary transition between any two beak forms, not even the finches. So it asserts that the transitions it doesn’t explain are no different from the other transitions it doesn’t explain.

    I can see why someone would have a hard time falsifying that.

  48. Elizabeth,

    The change from prokaryote to eukaryote is a new body plan.

    Prokaryotic and Eukaryotic Cells

  49. Well, only if by “body plan” you mean “cell type”.

    Why not say “cell type”? “Body plan” is normally used to refer to a property of multi-cellular organisms.

  50. How exactly does one understand the probabilities correctly when no one will even attempt to specify what may or may not have happened?

    There are a great many hypotheses specifying what may or may not have happened. Almost any data paper you find on evolution will be a test of a specific hypothesis about what may or may not have happened.

    I don’t care if Dembski got it wrong. Who got it right?

    Nobody. Dembski’s mistake was to answer the wrong question. The question he tried to answer is a useless question. It’s completely uncomputable.

    Use evolution to explain something more than a variation within species. If someone ever bothers to even offer a hypothesis that incorporates the actual mechanics of evolution to explain a case of evolution then we can talk about falsification. Nothing has merited that.

    OK, here is a clue to the problem: what do you mean by “a case of evolution”?

    Please give an example.

    I’d call the Grants’ finch beaks “a case of evolution”, but you don’t. I’d call a speciation event “a case of evolution” but you don’t.

    So what do you call “a case of evolution”?

    There is nothing to refute. It is disqualified from the competition by its refusal to show up. The success, failure, or nonexistence of competing explanations is irrelevant. Game over.

    The game hasn’t even started yet. There is nothing to refute because we apparently aren’t even talking about the same thing. What you appear to have is a straw man – a “case of evolution” that bears no resemblance to any phenomenon any evolutionist has ever sought to explain.

    Can you say what you understand by that term?

  51. Evolutionary theory does not posit a difference – that’s another way of saying that it asserts that they are the same. Again, how convenient.

    Yes, evolutionary theory asserts that all adaptation is incremental – that “macroevolution” is micro-evolution over longer timescales. There are some other features that we observe over longer timescales, of course, such as speciation (bifurcation of population) and population-level effects, but down any one lineage, it’s microevolution all the way. If you think this is false, please say why.

    Meanwhile, in evolutionary terms of genetic variation and selection it does not fail, but rather does not even attempt to explain the evolutionary transition between any two beak forms, not even the finches.

    Yes it does – it does exactly that. The transition between a mean 5.7 mm beak and a mean 5.8 mm beak is a result of the finches with the slightly larger beaks in Generation X being better able to crack open the larger seeds and thus survive to leave more finchlings in Generation Y, who inherit the larger-beak-producing alleles of their parents. That’s the explanation. You understand it. I understand it. What is the problem?

    So it asserts that the transitions it doesn’t explain are no different from the other transitions it doesn’t explain.

    I don’t know what you mean. Just keep going. Keep those El Ninos with their big seeds coming. Next year, the mean beak size of Generation Z is 5.9. Then 6.0. Then 6.1, then 6.2. Soon, we have no beaks in the population smaller than 5.8, and some are even 6.5, hitherto unheard of.

    The population has transitioned to a brand-new beak size.
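
    To put rough numbers on that, here is a minimal Python sketch of directional selection. Everything in it is an illustrative assumption rather than the Grants’ data: the starting mean of 5.7 mm, the spread, the rule that birds below the current mean fail to breed, and the small random variation added to each offspring.

```python
import random
import statistics


def next_generation(beaks, rng, min_viable, noise_sd=0.1):
    """Birds with beaks of at least min_viable survive; each offspring
    inherits a random survivor's beak size plus small random variation."""
    survivors = [b for b in beaks if b >= min_viable]
    if not survivors:
        raise ValueError("population went extinct")
    return [rng.choice(survivors) + rng.gauss(0.0, noise_sd) for _ in beaks]


def simulate(generations=10, pop_size=500, seed=1):
    """Return the mean beak size (mm) for the start and each later generation."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    beaks = [rng.gauss(5.7, 0.2) for _ in range(pop_size)]
    means = [statistics.mean(beaks)]
    for _ in range(generations):
        # Hypothetical selection rule: in a big-seed year, birds below
        # the current population mean leave no offspring.
        threshold = statistics.mean(beaks)
        beaks = next_generation(beaks, rng, threshold)
        means.append(statistics.mean(beaks))
    return means


means = simulate()
print([round(m, 2) for m in means])
```

    The selection rule here is far harsher than anything observed in the field; the only point is that repeated small shifts in the same direction add up to a mean that no bird in the starting population had.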

    I can see why someone would have a hard time falsifying that.

    Well, for a start, science does not, in fact, proceed by falsification. It proceeds by fitting data to models, and testing those models against new data.

    Which has been done, over and over again. You change the environment, and the population adapts – beaks grow, spots shrink, resistance increases.

    It works – in the field, in the lab, in computer models. It’s extremely powerful.

    And yet people like you insist that somehow we are cheating – failing to explain something that needs to be explained.

    Well, what is that thing? What is the barrier that you claim that microevolution by means of natural selection from a constantly enriched gene-pool cannot leap?

    And how do you demonstrate that that barrier is real?

  52. Scott,

    Your argument brought an analogy to mind. Here’s an imaginary dialogue between you and an astronomer:

    Scott: I agree that the theory of gravity holds within our solar system, but extrapolating it to interstellar and galactic scales is unwarranted.

    Astronomer: What barrier are you aware of that prevents gravity from acting over interstellar distances?

    Scott: Your error is in assuming that it does.

    Astronomer: But we see evidence that it does.

    Scott: Have you ever directly measured the force between two stars?

    Astronomer: We can’t. But stars behave exactly as we would expect if gravity were operating on them. To support your view, we’d have to posit that 1) gravity mysteriously stops working on some scale larger than the solar system, and 2) there is a different, unknown mechanism that makes it appear that gravity is still working on the larger scales! Are you serious?

    Scott: Game over.

  53. Petrushka:

    Really? What physical laws are violated by mainstream theories of evolution?

    Now, that’s really silly. Why should physical laws be violated? Neo darwinian theory is not about physical laws, it simply violates the laws of logic and the laws of probability and the laws of empirical explanation. Isn’t that enough?

  54. Elizabeth:

    Hi! Let’s see:

    OK, here is a clue to the problem: what do you mean by “a case of evolution”? Please give an example.

    What about the emergence of basic protein domains? That is a good case of macroevolution, I would say.

    There are a great many hypotheses specifying what may or may not have happened. Almost any data paper you find on evolution will be a test of a specific hypothesis about what may or may not have happened.

    What about a credible hypothesis about the above? With a serious analysis of the probabilistic credibility of the random part?

    Sure, prokaryotic cells and eukaryotic cells are different, but, as I said, we have at least one theory (symbiosis) that might explain that.

    You must be kidding. Please, explain all the new protein information in eukaryotes (which is a lot), and the new, completely new, cellular organization. In what way would symbiosis explain that? I am not saying that symbiosis is not a part of the process. It probably is. But to say that it might explain the transition is really the overstatement of the century.

    And so on, and so on…

  55. Elizabeth:

    Yes, evolutionary theory asserts that all adaptation is incremental – that “macroevolution” is micro-evolution over longer timescales. There are some other features that we observe over longer timescales, of course, such as speciation (bifurcation of population) and population-level effects, but down any one lineage, it’s microevolution all the way. If you think this is false, please say why.

    Evolutionary theory asserts what is wrong. In no way macroevolution is microevolution over longer timescales. That’s pure imagination, and bad imagination all the way.

    At molecular level, we have simply no example of macroevolution. Do you understand that? No example.

    We have no molecular model of one single macroevolutionary transition that could realistically be explained by the neo darwinian model of RV + NS.

    That’s why I think that your statement is false. It is against logic. It is against evidence. It is against facts. It is against mathematics and statistics. It is against engineering. And there is not a single empirical support of it.

    Is that enough?

  56. Petrushka:

    Every possible sequence of flipped coins is equally likely or unlikely. Predicting a specific sequence is unlikely, but finding a sequence unlikely after the tosses is pointless.

    No! Not again. This is really an offense to your own intelligence!

    You should know very well that finding a sequence that is part of some specific, and extremely unlikely, subset of events is not pointless at all.

    What do you think the second law is about? Do you pre-specify the unlikely states of gas molecules? Is it pointless to say that ordered states are too unlikely to emerge spontaneously?
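
    The difference between one specific sequence and a subset of sequences can be checked with a little arithmetic. The Python sketch below assumes 100 fair, independent flips; the two subsets (“at least 90 heads” and “between 40 and 60 heads”) are picked purely for illustration.

```python
from fractions import Fraction
from math import comb

N = 100  # number of fair, independent coin flips (assumption)
TOTAL = 2 ** N  # size of the whole sample space

# Any one fully specified sequence, all heads or otherwise, is equally likely.
p_specific = Fraction(1, TOTAL)

# Probability of landing anywhere in the subset "at least 90 heads".
p_mostly_heads = Fraction(sum(comb(N, k) for k in range(90, N + 1)), TOTAL)

# Probability of landing anywhere in the subset "between 40 and 60 heads".
p_near_even = Fraction(sum(comb(N, k) for k in range(40, 61)), TOTAL)

print(float(p_specific), float(p_mostly_heads), float(p_near_even))
```

    Every individual sequence has the same probability, but the subsets do not: the size of the subset, not the probability of any single member, decides how likely it is that a random draw lands in it.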

    Please, answer these points if you really mean what you said in your post. Or just admit that you say some of the things you say for mere propaganda.

  57. material.infantacy

    “No! Not again. This is really an offense to your own intelligence!”

    I know, this is getting old. Nobody’s interested in the fact that, of a set of possible outcomes in a sample space, any single one of them coming about has a probability of exactly one. Yet this is what we are hearing as an argument against the improbability of functional specification. This is the same as saying, that for a sample space S, the probability of S occurring is one, or P(S) = 1.0, which is axiomatic, and is irrelevant to the issue at hand.

    However for an event F in a sample space S, the probability of F occurring is n(F)/n(S). If the set F consists of all potentially functional sequences, then the probability of F occurring is less than one; and the probability never changes (because the size of F doesn’t change) regardless of whether it occurred or not at some point in history.

    To suggest otherwise is also to suggest that any sequence is as good as any other. This cannot possibly be the case. Not all sequences fold, which means that P(F) < 1.0. This alone should suggest that the axiom P(S) = 1.0 couldn’t be more irrelevant, and that the set F will remain an unchanging subset — and a minuscule one at that, given its proportion to S.

    For a set S, which is a universal set consisting of every possible sequence of length n, there is a subset F which consists of every sequence that results in a folded protein.

    P(F) = n(F) / n(S)

    That is, the probability of F occurring is equal to the number of elements in F divided by the number of elements in S. The set F is objectively improbable, unchanging, and contains the subset of all potentially functional proteins. (That is, the size of the subset of functional proteins is bounded by the size of the subset F.)
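
    As a toy illustration of that ratio, the Python sketch below enumerates a tiny sample space and counts a subset of it. The two-letter alphabet, the length of 8, and the rule standing in for “folds” are all hypothetical placeholders with no biochemical meaning, and the ratio is only a probability if every sequence is assumed equally likely.

```python
from fractions import Fraction
from itertools import product

ALPHABET = "HP"  # toy two-letter alphabet (assumption)
LENGTH = 8       # toy sequence length (assumption)


def toy_folds(seq):
    """Hypothetical stand-in for 'folds': exactly half the letters are H.
    This is an arbitrary rule chosen so the count is easy to check."""
    return seq.count("H") == LENGTH // 2


# S: every possible sequence of the given length over the alphabet.
S = ["".join(letters) for letters in product(ALPHABET, repeat=LENGTH)]

# F: the subset of S that satisfies the toy rule.
F = [seq for seq in S if toy_folds(seq)]

# P(F) = n(F) / n(S), assuming each sequence in S is equally likely.
p_F = Fraction(len(F), len(S))
print(len(F), len(S), p_F)
```

    With a real protein alphabet and realistic lengths S cannot be enumerated this way, so n(F) has to be estimated rather than counted.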

    Let’s see an end to the “as equally improbable as any other sequence” straw man, which is really just a rehash of Miller’s fallacious “cards” analogy.

  58. Ummmm for a single-celled organism the “cell type” is the body plan. A body is a collection of masses…

  59. Sure, that is possible, but it doesn’t speak too well for the Intelligent Designer. What an inefficient way to make things! Takes forever. Lots of mistakes, death, disease, suffering, bloodshed, etc. I wouldn’t be too impressed with such a Designer myself.

    I prefer an omniscient Designer who knew how to do it right from the beginning and looked back over His creation and said “It is very good.” Then later, when sin entered the world, things started to fall apart which explains why we see all the problems in the world today.

  60. Here is an interesting write-up on a study by Bozorgmehr published in Complexity on Dec. 22, 2010 that seems to falsify evolution and show the limits of natural selection:

    First, here are some quotes from the article. To read the original write-up, please check out the web address listed at the end of this post.

    The writer says that Bozorgmehr dealt “a falsifying blow to natural selection”.

    Many arguably “beneficial” mutations have been observed to incur some sort of cost and so can be classified as a form of antagonistic pleiotropy.

    Indeed, the place and extent of natural selection as a force for change in molecular biology have been questioned in recent years.

    Moreover, several well-known factors such as the linkage and the multilocus nature of important phenotypes tend to restrain the power of Darwinian evolution, and so represent natural limits to biological change.

    Selection, being an essentially negative filter, tends to act against variation including mutations previously believed to be innocuous.

    The idea that natural selection is more permissive with duplicated genes was analyzed by Bozorgmehr.

    In conclusion, he noted that accidental gene duplication clearly adds to the size of some genomes. “However, in all of the examples given above, known evolutionary mechanisms were markedly constrained in their ability to innovate and to create any novel information,” he said. “This natural limit to biological change can be attributed mostly to the power of purifying selection, which, despite being relaxed in duplicates, is nonetheless ever-present.”

    Then he examined cases of co-option cited by Darwinists, but found, again, that “a proclivity toward functional stability and the conservation of information, as opposed to any adventurous innovation, predominates.”

    The various postduplication mechanisms entailing random mutations and recombinations considered were observed to tweak, tinker, copy, cut, divide, and shuffle existing genetic information around, but fell short of generating genuinely distinct and entirely novel functionality.

    Good news: “Gradual natural selection is no doubt important in biological adaptation and for ensuring the robustness of the genome in the face of constantly changing environmental pressures.” Bad news: “However, its potential for innovation is greatly inadequate as far as explaining the origination of the distinct exonic sequences that contribute to the complexity of the organism and diversity of life.”

    He didn’t offer a replacement evolutionary theory, but warned that any new contender must think holistically about the cell (cf. 04/02/2008). “Any alternative/revision to Neo-Darwinism has to consider the holistic nature and organization of information encoded in genes, which specify the interdependent and complex biochemical motifs that allow protein molecules to fold properly and function effectively.”

    Bozorgmehr did not refer to intelligent design, and did not cite any ID sources, but arrived at the same conclusions about the natural limits to biological change that creationists and ID advocates have been preaching for decades.

    http://crev.info/2011/01/evolu.....falsified/

  61. material.infantacy:

    Thank you for explaining it so clearly!

    Petrushka, your answer, please.

  62. material.infantacy and Petrushka:

    I would like to add, for clarity, that the probability that is pertinent for the analysis of a specific instance of neo darwinian “explanation” can be defined with much greater detail.

    For instance, let’s say that we are analyzing the credibility of the neo darwinian explanation for the emergence of a basic new domain at some point in natural history, let’s say at the emergence of eukaryotes.

    In that case, the probability we have to consider is the probability of a new domain that is naturally selectable in that context.

    That means that we have to look for a specific subset of P(F) (the subset of folded proteins), the subset of folded proteins unrelated at sequence level to already existing proteins in the proteome, with a new fold and a specific new biochemical function. Let’s call that P(NUFF), for New Unrelated Folded and Functional.

    Then we have to look for an even smaller subset of that, the NUFF that are naturally selectable in that context, IOWs that can confer, by themselves, a reproductive advantage in the context of the living being where the transition is supposed to happen (prokaryotes, I suppose, or a symbiosis of them).

    Let’s call that P(NUFFNS).

    That is the tiny subset of probability we are looking for if we want to believe in a darwinian explanation of the emergence of a single new basic protein domain in the course of natural history (an event that took place at least 2000 times in natural history).

    Just to be precise…

  63. tjguy:

    The paper is exactly right.

    Let’s just comment very simply on the problems of the duplicated gene mechanism as a neodarwinian tool. It’s complete nonsense.

    The simple truth is that NS acts as negative selection to keep the already existing information. We see the results of that everywhere in the proteome: the same function is maintained in time and in different species, even if the primary sequence can vary in time because of neutral variation. So, negative NS conserves the existing function, and allows only neutral or quasi neutral variation. In that sense it works against any emergence of completely new information from the existing one, even if it can tolerate some limited “tweaking” of what already exists (microevolution).

    I suppose that darwinists, or at least some of them, are aware of that difficulty as soon as one tries to explain completely new information, such as a new basic protein domain. Not only the darwinian theory cannot explain it, it really works against it.

    So, the duplicated gene mechanism is invoked.

    The problem is that the duplicated gene, to be free to vary and to leave the original functional island, must no longer be translated and no longer be functional. Indeed, that happens very early in the history of a duplicated gene, because many forms of variation will completely inactivate it as a functional ORF, as we can see all the time with pseudogenes.

    So, one of the two:

    a) either the duplicated gene remains functional and contributes to the reproduction, so that negative NS can preserve it. In that case, it cannot “move” to new unrelated forms of function.

    b) or the duplicated gene immediately becomes non functional, and is free to vary.

    The important point is that case a) is completely useless to the darwinian explanation.

    Case b) allows free transitions, but they are no more visible to NS, at least not until a new functional ORF (with the necessary regulatory sites) is generated. IOWs, all variation from that point on becomes neutral by definition.

    But neutral variation, while free of going anywhere, is indeed free of going anywhere. That means: freedom is accompanied by the huge rising of the probability barriers. As we know, finding a new protein domain by chance alone is exactly what ID has shown to be empirically impossible.

    IOWs, the neo darwinian “explanation” is silly and wrong.

  64. Bozorgmehr is a crank with no training in biological science who somehow managed to get his phoney papers past inadequate peer-review.

    He’s an internet troll who was so successful he moved on to trolling scientific journals.

    I sometimes suspect he’s an atheist sockpuppet.

  65. Evolutionary theory asserts what is wrong. In no way macroevolution is microevolution over longer timescales. That’s pure imagination, and bad imagination all the way.

    At molecular level, we have simply no example of macroevolution. Do you understand that? No example.

    We have no molecular model of one single macroevolutionary transition that could realistically be explained by the neo darwinian model of RV + NS.

    That’s why I think that your statement is false. It is against logic. It is against evidence. It is against facts. It is against mathematics and statistics. It is against engineering. And there is not a single empirical support of it.

    Is that enough?

    No. First of all, you need to give an operational definition of “macroevolution”, otherwise I cannot even begin to evaluate your assertion.

    You seem to be implying something like “irreducible complexity”, in which case your argument would be circular.

  66. I would like to add, for clarity, that the probability that is pertinent for the analysis of a specific instance of neo darwinian “explanation” can be defined with much greater detail.

    For instance, let’s say that we are analyzing the credibility of the neo darwinian explanation for the emergence of a basic new domain at some point in natural history, let’s say at the emergence of eukaryotes.

    OK

    In that case, the probability we have to consider is the probability of a new domain that is naturally selectable in that context.

    No. There is your first mistake. There is no warrant for the assumption that a new domain has to be advantageous (result in greater reproductive success for its phenotype than for phenotypes lacking the new domain). As long as it does not seriously impair reproductive success there is no reason why it should not “emerge”, i.e. appear in one individual and be replicated down that lineage.

    That means that we have to look for a specific subset of P(F) (the subset of folded proteins), the subset of folded proteins unrelated at sequence level to already existing proteins in the proteome, with a new fold and a specific new biochemical function. Let’s call that P(NUFF), for New Unrelated Folded and Functional.

    Nope. You just need the subset of sequences that result in a foldable protein that is not disastrously disadvantageous to the phenotype in the current environment.

    Then we have to look for an even smaller subset of that, the NUFF that are naturally selectable in that context, IOWs that can confer, by themselves, a reproductive advantage in the context of the living being where the transition is supposed to happen (prokaryotes, I suppose, or a symbiosis of them).

    Now you are double-counting, by equivocating with the word “function”, by conflating genotype with phenotype. If a protein is made, but has no effect on the reproductive success of the phenotype it doesn’t have a “function” in any relevant sense. It merely exists. And provided the first few generations down that lineage survive, it will exist in sufficiently many copies that it is likely to hang around for a long time.

    Let’s call that P(NUFFNS).

    That is the tiny subset of probability we are looking for if we want to believe in a darwinian explanation of the emergence of a single new basic protein domain in the course of natural history (an event that took place at least 2000 times in natural history).

    Better to call it P(sequence-that-results-in-a-folded-protein), which may be a tiny subset of all possible sequences, but still may be quite high given the immense number of opportunities for sequence mutations to occur.

    And we simply do not know how large that subset of sequences is, nor indeed whether some of their precursors also result in reproductive advantage for their bearers.

    Just to be precise…

    It isn’t precise at all. It’s wrong in a number of respects (the sequences coding for a new domain don’t have to be currently advantageous to appear, and there is no good reason to assume the precursors of those sequences don’t confer reproductive advantage), and we don’t in any case know the size of the subset of DNA sequences that result in foldable proteins, though we may know roughly the size of the subset that have actually appeared in living things.

    This is exactly the problem Petrushka mentioned, and which IDists tend to dismiss as ignorance because they think they’ve dealt with it. You haven’t.

    You have pre-defined the target as those DNA sequences that result in foldable proteins that form part of modern functional proteins, and assume that that small target comprises the only possible target, forgetting that there may be a vast set of DNA sequences that would also result in protein domains, and another vast set of proteins comprised of those domains that, in some alternative universe, might also prove to confer reproductive advantage in some alternative biosphere.

    And this, to repeat, is the fundamental error at the heart of ID: to look at what exists, and say: what exists must be a tiny subset of what might exist (correct), and is also the only subset possible that could result in life (incorrect) and is therefore extremely improbable. Which is fallacious, because based on a false premise.

    Quite apart from the fallacy that only sequences that confer reproductive advantage on the phenotype can be replicated, and the fallacy that a sequence that confers a new advantage can have had no precursors that also conferred an advantage (in some different way). And the general confusion between genotype and phenotype.

  67. Well, in that case be specific. You have been challenging Darwinists for ages to explain the origin of “new body plans” by which most of us assumed you meant, well, something to do with body plans, i.e. cell differentiation in multicellular organisms, for which hox genes are crucial.

    Turns out you meant “new cell type”, which has nothing to do with cell differentiation.

    If you want to have a technical discussion, it helps to be precise in your language.

  68. First of all, on what grounds do you assert that early eukaryotes had “new, completely new, cellular organisation”?

    Your argument is, again, circular. You assert that there is a non-stepwise change, then defy evolutionists to explain it by step-wise changes.

    First demonstrate that the change must have been non-stepwise, otherwise there is no explanandum.

  69. Elizabeth:

    First of all, you need to give an operational definition of “macroevolution”, otherwise I cannot even begin to evaluate your assertion.

    A molecular transition to a new function implying more than 150 bits of dFSCI.

  70. A molecular transition to a new function implying more than 150 bits of dFSCI.

    Thanks. That needs a fair bit of unpacking though.

    First of all, do you mean by “molecular transition” a series of changes in a DNA sequence (let’s call the starting sequence A1 and the “new” sequence A2) where A2 confers enhanced reproductive success on the phenotype relative to A1, and where A2 has 150 more bits of dFSCI than A1?

    Second, if so, how do you define A1 and A2?

    Third, how do you compute the dFSCI of each sequence?

  71. Also, why can’t macroevolution, by your definition, consist of, for example, 150 microevolutionary steps of 1 bit increase in dFSCI each?

    What I’m trying to get at here is what the categorical boundary is supposed to be that divides microevolution from macroevolution?

  72. Elizabeth:

    No. There is your first mistake. There is no warrant for the assumption that a new domain has to be advantageous (result in greater reproductive success for its phenotype than for phenotypes lacking the new domain). As long as it does not seriously impair reproductive success there is no reason why it should not “emerge” i.e. appear in one individual and be replicated down that lineage.

    There is no mistake. It’s simply you that don’t understand the reasoning.

    I am evaluating the probabilities of a certain final event to happen in a random system. The final event is supposed to be a naturally selectable function, because that would be the event that “stops” the random system and implies a necessity mechanism (the expansion of the selected trait). Unless and until we get to that result, the system is random. And therefore we can evaluate the probabilities of a subset of states versus all possible states.

    You have made that mistake many times. You seem to believe that neutral or quasi neutral variation can just happen, and that’s enough for you. Nothing could be farther from the truth. It is obvious that neutral variation can happen, but the problem is, it is random variation. All states have the same probability to be reached by neutral variation. So, neutral variation is a perfectly random system where all unrelated states have the same probability to be reached. But, if you want to defend an algorithm such as neodarwinism, that assumes that NS has an important role to explain how unlikely results are obtained in a random system, then you need to reach naturally selectable results. It’s as simple as this.

    Nope. You just need the subset of sequences that result in a foldable protein that is not disastrously disadvantageous to the phenotype in the current environment.

    Same mistake as above, and even worse. What you say has no meaning. Why do you need a foldable protein at all? If neutral evolution takes place in a duplicated, inactivated gene, any sequence is neutral. You don’t even need the sequence to be an ORF. Any random sequence is certainly not disastrously disadvantageous to the phenotype, if it is not translated. If, on the other hand, the variation applies to a functional sequence, any sequence that reduces or loses the original function will be visible to negative selection. See also my post 11.1.

    Now you are double-counting, by equivocating with the word “function”, by conflating genotype with phenotype. If a protein is made, but has no effect on the reproductive success of the phenotype it doesn’t have a “function” in any relevant sense. It merely exists. And provided the first few generations down that lineage survive, it will exist in sufficiently many copies that it is likely to hang around for a long time.

    Always the same error. I am discussing the probabilities of reaching a naturally selectable result in a random system of variation. If you cannot understand that, you cannot understand any part of my reasoning.

    Better to call it P(sequence-that-results-in-a-folded-protein), which may be a tiny subset of all possible sequences, but still may be quite high given the immense number of opportunities for sequence mutations to occur.

    No. Wrong! A folded protein is not functional and is not naturally selectable. Until a functional selectable result occurs, a foldable protein is not different than a non foldable sequence that is not translated. And why should a foldable non functional protein be translated? If that were the case, all living cells would be replete with foldable non functional proteins that are not “disastrously disadvantageous to the phenotype”. I suppose that’s not the case.

    And we simply do not know how large that subset of sequences is, nor indeed whether some of their precursors also result in reproductive advantage for their bearers.

    That’s really an “argument from ignorance”, if I ever saw one! Yes, we don’t know, because nobody has ever been able to show those precursors, either in the proteome or in the lab. But you can always hope and dream…

    It isn’t precise at all. It’s wrong in a number of respects (the sequences coding for a new domain don’t have to be currently advantageous to appear, and there is no good reason to assume the precursors of those sequences don’t confer reproductive advantage), and we don’t in any case know the size of the subset of DNA sequences that result in foldable proteins, though we may know roughly the size of the subset that have actually appeared in living things.

    You are simply misrepresenting my argument. I never said that a sequence has to be “currently advantageous” to appear. That’s only your imagination. What I said is that a sequence has to be currently advantageous to be naturally selected, and that up to that point all sequences have the same probability to appear. Is that the same thing, in your opinion?

    The size of the subset of foldable proteins is interesting, but not relevant. The only relevant subset is the one I defined, the subset of naturally selectable proteins. All the rest is random variation.

    This is exactly the problem Petrushka mentioned, and which IDists tend to dismiss as ignorance because they think they’ve dealt with it. You haven’t.

    We have dealt with it, and we will continue to deal with it. Because it is an important problem, and we believe it has to be dealt with. That does not mean that the problem is completely solved, obviously.

    Darwinists, if they were scientifically honest, should deal with it with the same urgency (some, indeed, have tried, with terrible methodology and false results). Because the problem is fundamental for their own theory.

    Instead, most darwinists, including you, just try to hide behind the supposed impossibility to solve the problem, so that their wrong theories may continue to be believed for some more time.

    The problem can be solved, and we have a lot of indications about what the solution is. And the solution is exactly what ID has shown.

    You have pre-defined the target as those DNA sequences that result in foldable proteins that form part of modern functional proteins, and assume that that small target comprises the only possible target, forgetting that there may be a vast set of DNA sequences that would also result in protein domains, and another vast set of proteins comprised of those domains that, in some alternative universe, might also prove to confer reproductive advantage in some alternative biosphere.

    Wow! Have you lost your mind? I have done nothing like that. I have defined a subset as (I quote myself):

    “That means that we have to look for a specific subset of P(F) (the subset of folded proteins), the subset of folded proteins unrelated at sequence level to already existing proteins in the proteome, with a new fold and a specific new biochemical function. Let’s call that P(NUFF), for New Unrelated Folded and Functional.

    Then we have to look for an even smaller subset of that, the NUFF that are naturally selectable in that context, IOWs that can confer, by themselves, a reproductive advantage in the context of the living being where the transition is supposed to happen (prokaryotes, I suppose, or a symbiosis of them).

    Let’s call that P(NUFFNS).”

    Where in that is the concept you attribute to me? Just to be precise, “those DNA sequences that result in foldable proteins that form part of modern functional proteins”?

    Why do you put in my mouth things I have not said?

    I will not comment about the “alternative universe” and “alternative biosphere” part, just out of respect for you. I believe you must be really desperate to use those arguments. And anyway, what in the world does an alternative universe have to do with the probabilities of a selectable function arising in a specific biological context, such as my example of prokaryotes transitioning to eukaryotes?

    And this, to repeat, is the fundamental error at the heart of ID: to look at what exists, and say: what exists must be a tiny subset of what might exist (correct), and is also the only subset possible that could result in life (incorrect) and is therefore extremely improbable. Which is fallacious, because based on a false premise.

    Wow again! Where have I (or ID, for what I know) ever said that “what exists is the only subset possible that could result in life”? I have never said that, I am really sure of that. For the simple reason that I don’t believe it, and I usually don’t say things I don’t believe.

    But I have said a lot of times that already existing information and complexity pose huge constraints on what can be useful in that context. IOWs, if you have to find something useful in an existing strain of bacteria, your options are radically limited, and the possibilities that other forms of life based on fire and lithium could possibly exist in another galaxy are scarcely a help!

    Quite apart from the fallacy that only sequences that confer reproductive advantage on the phenotype can be replicated,

    Totally invented fallacy. I never said that.

    and the fallacy that a sequence that confers a new advantage can have had no precursors that also conferred an advantage (in some different way).

    Totally invented fallacy. I certainly said, and say, that those precursors have never been shown. And I have said many times that science is not made on “mere possibilities”. It usually needs some facts, you know?

    And the general confusion between genotype and phenotype.

    What confusion? I have no confusion about that. That is a serious accusation. Please, detail it.

  73. Elizabeth:

    Again, I am not asserting that. I am asserting that eukaryotes have new cellular organization. Early eukaryotes are, I suppose, your personal dream. I will be happy to comment on them as soon as you let me make their acquaintance.

  74. Elizabeth,

    You have it all bass-ackwards. We don’t have to say what prevents anything. YOU have to demonstrate what allows it.

    That means you have to produce POSITIVE evidence for what you claim can/ did happen.

    The edge of stochastic processes appears to be set at two new protein-to-protein binding sites. So that would be one categorical boundary. Good luck getting over it…

  75. Elizabeth:

    If you remembered the definition of dFSCI, that I have patiently discussed with you many times, you would know that dFSCI refers to the total information necessary to get a function.

    If your 1 bit increases are functional and selectable, then show them. If they are not, then the probabilistic barrier remains the same.
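    To make the two scenarios in this exchange concrete, here is a minimal arithmetic sketch (not a biological model). It assumes, purely for illustration, that a 150-bit target is hit by uniform random sampling with probability 2^-150 per trial in the single-leap case, and that in the stepwise case each 1-bit step succeeds with probability 1/2 and is retained once found. The function names are hypothetical. Which assumption applies to real proteins is exactly what is in dispute here.

```python
def expected_trials_single_leap(bits: int) -> int:
    """Expected number of trials when all `bits` must be right at once.

    Assumption: each trial succeeds independently with probability 2**-bits,
    so the expectation of the geometric distribution is 2**bits.
    """
    return 2 ** bits


def expected_trials_stepwise(bits: int) -> int:
    """Expected number of trials when each 1-bit step is selectable.

    Assumption: each step succeeds with probability 1/2 and, once found,
    is kept, so each step costs 2 trials on average.
    """
    return 2 * bits


if __name__ == "__main__":
    print(expected_trials_stepwise(150))               # 300
    print(expected_trials_single_leap(150) == 2**150)  # True
```

    The gap between the two numbers comes entirely from the assumption that intermediate steps are selectable and retained; if they are not, the stepwise case reduces to the single-leap case.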

  76. Elizabeth,

    Your position can’t even explain the existence of HOX genes. So stop using what you need to explain in the first place. Your position can’t explain cellular differentiation.

    That said I linked to the differences between the two plans- prokaryote and eukaryote.

    A single-celled organism has a body and therefore a body-plan. I cannot help it that you are not a biologist and don’t have the knowledge required to follow along.

  77. Elizabeth:

    Microevolution is a random variation in the range of what a biological system can achieve, that gives a functional selectable result. Some forms of antibiotic resistance are microevolution, and they are well documented. A single amino acid change can confer antibiotic resistance and be selected, in the presence of the antibiotic. That is well known, and observed.

    One amino acid is about 4.32 bits of information. That is in the range of routine variation in a bacterial culture.

    150 bits corresponds to about 35 AAs. A transition requiring 35 AAs to confer a new function has never been observed to occur in any biological context.
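    The arithmetic behind these two figures can be checked with a short sketch. It assumes 20 amino acids that are equally likely at a site, so one fully specified site carries log2(20) bits; the constant and function names are only illustrative.

```python
import math

# Assumption: 20 standard amino acids, each equally probable at a site.
BITS_PER_AA = math.log2(20)


def aa_sites_for_bits(bits: float) -> float:
    """Number of fully specified amino acid sites that carry `bits` bits."""
    return bits / BITS_PER_AA


if __name__ == "__main__":
    print(round(BITS_PER_AA, 2))             # 4.32
    print(round(aa_sites_for_bits(150), 1))  # 34.7
```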

  78. Elizabeth:

    Let’s say, more correctly, that the transition has a dFSCI of 150 bits.

    A1 would be the existing protein that “evolves” to A2. A2 would be a new protein, with a new function, a naturally selectable new function.

    A2 would differ from A1 of 150 bits of functional information.

    The best way to imagine that is that A2 differs from A1 in 35 AAs that must necessarily change exactly to the new value for the function to emerge.

    As this is not usually the case, we can apply a Durston style computation, attributing to each site that changes an informational value in Fits that corresponds to the reduction in uncertainty that each AA site implies.

    So, at the two extremes, if one site must necessarily have one amino acid, its Fit value will be 4.32. The function implies a complete reduction of uncertainty at that site. If instead any AA can stay at that site, its Fit value is 0 (no reduction of uncertainty is implied by the function). And similarly for all intermediate possibilities.

    The sum of all Fit values at the changing sites gives the dFSCI of the transition.

    When the starting protein (A1) is totally unrelated, as is the case for new basic protein domains, the total dFSCI of the new protein can be approximated by the Durston method applied to its protein family.
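    A simplified sketch of the per-site calculation described above: the Fit value of a site is taken as log2(20) minus the Shannon entropy of the amino acids observed at that site in an alignment, and the site values are summed. This is only an illustration under stated assumptions (a uniform 20-letter ground state, a gap-free alignment of equal-length sequences, and hypothetical function names), not Durston's published procedure in full; with only a handful of sequences the entropy estimate is unreliable.

```python
import math
from collections import Counter

ALPHABET_SIZE = 20  # assumption: 20 amino acids, uniform ground state


def site_fits(column: str) -> float:
    """Reduction in uncertainty (in bits) at one aligned site.

    Computed as log2(20) minus the Shannon entropy of the observed residues.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(ALPHABET_SIZE) - entropy


def total_fits(alignment: list[str]) -> float:
    """Sum of per-site Fit values over a gap-free alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    columns = ("".join(seq[i] for seq in alignment) for i in range(length))
    return sum(site_fits(col) for col in columns)


if __name__ == "__main__":
    # A site that always holds the same residue: full reduction of uncertainty.
    print(round(site_fits("AAAA"), 2))  # 4.32
    # A site where all 20 residues occur equally often: no reduction.
    print(abs(round(site_fits("ACDEFGHIKLMNPQRSTVWY"), 2)))  # 0.0
```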

  79. Elizabeth:

    I have made an explicit reasoning in my post 11.1 that is independent from what Bozorghmehr may have said, or from what he may be, and of which I take full responsibility. Can you please comment on that? Unless you think that I am a crank too :)

  80. Elizabeth:

    I don’t want to interfere, but there is probably no reason to fight about words, while it is possible to agree on a shared meaning.

    It is true that “body plan” is usually referred to metazoa, but it is also true that “cell type” is also used to distinguish between different cell types in a multicellular being.

    It is certainly true that, for unicellular beings, that cell is the body. And it is true that it has an inner organization. That we call it “body plan” or not is a free, and not so relevant, choice.

    That the inner structures of prokaryotes and of unicellular eukaryotes are very different is, I believe, a fact.

    It’s not only an offence to Petrushka’s own intelligence – it’s a science stopper. To any question “Why is this fact so?” you don’t look for explanations that are more or less likely. You just say, “It’s a fact, so it had to happen – probability = 1”

    You find the Lord’s prayer inscribed on Precambrian sediment? Any random pattern is equally unlikely in advance. But this one has already happened, so it has a probability of 1.

    Mathematics, or sophistry?

  82.
    material.infantacy

    gpuccio,

    “That is the tiny subset of probability we are looking for if we want to believe in a darwinian explanation of the emergence of a single new basic protein domain in the course of natural history (an event that took place at least 2000 times in natural history).”

    That’s how it looks to me as well.

    The probabilities of course multiply as each of those protein domains are considered. Even if it were granted that NUFFNS was close in size to F (the folding set), it’s difficult to see how a random search could yield much success.

  83. Elizabeth:

    No. There is your first mistake. There is no warrant for the assumption that a new domain has to be advantageous (result in greater reproductive success for its phenotype than for phenotypes lacking the new domain). As long as it does not seriously impair reproductive success there is no reason why it should not “emerge” i.e. appear in one individual and be replicated down that lineage.
    There is no mistake. It’s simply you that don’t understand the reasoning.
    I am evaluating the probabilities of a certain final event to happen in a random system. The final event is supposed to be a naturally selectable function, because that would be the event that “stops” the random system and implies a necessity mechanism (the expansion of the selected trait). Unless and until we get to that result, the system is random. And therefore we can evaluate the probabilities of a subset of states versus all possible states.

    My position is that your reasoning is faulty. I also think it is confused. Firstly you assume a “final event”. There are no “final events” in evolution. It is a continuous process. And you don’t define “random system”. It’s not in itself a precise term. Do you mean “undirected”? Or do you mean “stochastic”? I will assume the latter for now, and parse your first sentence as: “I am evaluating the probability of a given event happening in a stochastic system.” And what do you mean by “naturally selectable function”? A DNA sequence that may at some point confer reproductive advantage on the phenotype? If not what? And why would such sequence “stop” the “random system”? And what do you mean by “the expansion of the selected trait”?
    Unless you can explain these things I cannot agree that “we can evaluate the probability of a subset of states versus all possible states”.

    You have made that mistake many times. You seem to believe that neutral or quasi neutral variation can just happen, and that’s enough for you. Nothing could be farther from the truth. It is obvious that neutral variation can happen,

    Indeed it is.

    but the problem is, it is random variation,

    Again, what do you mean by “random”? Please give me a precise definition, because it is a word with a great many imprecise meanings and very few precise ones.

    All states have the same probability to be reached by neutral variation.

    No, they do not. Not all neutral variations are equally probable, and some states (again I’m not sure what you mean by “states” – the state of the DNA sequence? The state of the phenotype? The state of the population?) are therefore less probable than others. Also, some states may have no neutral-only pathways to them. And in any case, “neutral” is not an absolute term – a variant may be neutral (confer no reproductive advantage relative to phenotypes lacking that variant) in one context, advantageous in another, disadvantageous in another.

    So, neutral variation is a perfectly random system where all unrelated states have the same probability to be reached.

    So even without a clear definition of “a perfectly random system” this statement must be false. But it would certainly help if I knew what you meant by “a perfectly random system”.

    But, if you want to defend an algorithm such as neodarwinism, that assumes that NS has an important role to explain how unlikely results are obtained in a random system, then you need to reach naturally selectable results. It’s as simple as this.

    I’m not aware that “neodarwinism” is an algorithm, so I’m not attempting to “defend” it as such. But yes, of course, natural selection has an important role in evolutionary change – it’s the name we give to the process by which allele prevalence in one generation is biased in favour of those that confer reproductive success in the previous one. It is uncontroversial even among IDists, who term it “microevolution”. It doesn’t explain “unlikely results” – it explains how traits that confer reproductive advantage in a given environment are likely to become more prevalent in that population – it explains adaptation, in other words.
    It’s not that “you need to reach naturally selectable results”. It’s that the resulting allele prevalence will be influenced by what confers reproductive success.

    Nope. You just need the subset of sequences that result in a foldable protein that is not disastrously disadvantageous to the phenotype in the current environment.
    Same mistake as above, and even worse. What you say has no meaning. Why do you need a foldable protein at all?

    I haven’t made a mistake, see above. And I was simply taking your case – how do you get a new protein domain? A protein domain is just a DNA sequence that codes for an amino acid chain with a 3D structure. So any DNA sequence that codes for such a thing for the first time will be a “new protein domain”, whether or not that sequence proves advantageous for the phenotype in which it first occurs. And as long as it doesn’t prove disastrous, it will be replicated many times in that lineage.

    If neutral evolution takes place in a duplicated, inactivated gene, any sequence is neutral.

    You seem to be very confused about the meaning of “neutral”. “Neutral” in genetics simply means “does not confer reproductive advantage on the phenotype in the current environment”. That advantage is also relative – sometimes it is compared with the parental sequence, at its first appearance, sometimes with the sequence’s peers. The first might be neutral, the second advantageous, for the same allele.

    You don’t even need the sequence to be an ORF. Any random sequence is certainly not disastrously disadvantageous to the phenotype, if it is not translated.

    Exactly. So there is plenty of scope for new sequences to be replicated many times, even if they confer no advantage to the phenotype at first appearance.

    If, on the other hand, the variation applies to a functional sequence, any sequence that reduces or loses the original function will be visible to negative selection. See also my post 11.1.

    Metaphors like “visible to negative selection” are very misleading. Negative selection is a way of describing a consequence, it isn’t an agent, and nothing is “visible to” it. It can’t “see”. If a sequence already confers some reproductive benefit to the phenotype then, clearly, a variation that disables that function will tend to result in that lineage (if we are talking about unicellular life) decreasing in prevalence. This means that sequences that confer reproductive benefit are more likely to be conserved. This is non-controversial, indeed, an essential strand in evolutionary theory.

    Now you are double-counting, by equivocating with the word “function”, by conflating genotype with phenotype. If a protein is made, but has no effect on the reproductive success of the phenotype it doesn’t have a “function” in any relevant sense. It merely exists. And provided the first few generations down that lineage survive, it will exist in sufficiently many copies that it is likely to hang around for a long time.
    Always the same error. I am discussing the probabilities of reaching a naturally selectable result in a random system of variation. If you cannot understand that, you cannot understand any part of my reasoning.

    It’s not that I don’t understand it, it’s that in my view, the error is yours! Or, at any rate, the confusion. Your position seems, as I suggested, to boil down to the IC argument – that some advantageous sequences can only be reached by a series of non-advantageous changes. And you are providing no evidence that a protein domain is IC.

    Better to call it P(sequence-that-results-in-a-folded-protein), which may be a tiny subset of all possible sequences, but still may be quite high given the immense number of opportunities for sequence mutations to occur.
    No. Wrong! A folded protein is not functional and is not naturally selectable.

    How do you know? There is no way of knowing, without observing the phenotype in which it occurs in the environment in which that phenotype lives. And, as I said, it doesn’t matter anyway, although if the folded protein in question confers some reproductive advantage, then it is more likely to be conserved. I think you need to unpack this term “naturally selectable”. I am not sure what you mean by it, and I suspect that what you think you mean by it will make no sense when you unpack it.

    Until a functional selectable result occurs, a foldable protein is not different than a non foldable sequence that is not translated. And why should a foldable non functional protein be translated? If that were the case, all living cells would be replete with foldable non functional proteins that are not “disastrously disadvantageous to the phenotype”. I suppose that’s not the case.

    Clearly any sequence that does not result in any effects on the reproductive success of the phenotype is not selected, i.e. is neutral. By definition. That includes untranslated protein sequences, translated protein sequences that are not expressed in tissues in which they provide any selective advantage, and sequences that don’t do anything at all. As to “why should a foldable non functional protein be translated?” – the proximal answer is biochemical (if the required translational sequences are present and activated by other chemical signals), and the distal answer (the teleonomic answer) could be because phenotypes bearing sequences in which it is translated under the conditions in which it is translated reproduce better than those that don’t, resulting in greater prevalence of their genotypes in the population.

    And we simply do not know how large that subset of sequences is, nor indeed whether some of their precursors also result in reproductive advantage for their bearers.
    That’s really an “argument from ignorance”, if I ever saw one! Yes, we don’t know, because nobody has ever been able to show those precursors, either in the proteome or in the lab. But you can always hope and dream…

    Gpuccio, it may well be an “argument from ignorance” but the argument is simply: you cannot compute the probability of an event unless you know what the probability space is. If you are ignorant of the probability space, then you can’t compute it. Therefore you can’t infer design from the low level of that probability. It’s not evolutionists who are making probability arguments, it’s IDists. And you can’t compute a probability in ignorance! Not without making a Bayesian stab at the likelihood that your priors are correct.

    It isn’t precise at all. It’s wrong in a number of respects (the sequences coding for a new domain don’t have to be currently advantageous to appear, and there is no good reason to assume the precursors of those sequences don’t confer reproductive advantage), and we don’t in any case know the size of the subset of DNA sequences that result in foldable proteins, though we may know roughly the size of the subset that have actually appeared in living things.
    You are simply misrepresenting my argument. I never said that a sequence has to be “currently advantageous” to appear. That’s only your imagination.

    OK, cool. I wasn’t intentionally misrepresenting your argument, I’m just trying to make sense of it. It still makes little sense to me for reasons I have given above.

    What I said is that a sequence has to be currently advantageous to be naturally selected, and that up to that point all sequences have the same probability to appear. Is that the same thing, in your opinion?

    Well, clearly a sequence has to be currently [reproductively] advantageous to be naturally selected because those two things are synonymous. Your second clause is simply wrong. Not all sequences have the same probability of appearing. I’m not even sure what you mean by that, but I can’t think of any sense in which it could be true. I think one of the problems with your approach is that you don’t specify your priors. Given, for example, the (non-functional) sequence: ABCBBCDADA which is more probable as the next variant: ABCBBBDADA, ABCBBCDADADA, or BCDDADCBAB?
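    One way to make that question concrete is the toy model below. It is only a sketch under loudly stated assumptions: a four-letter alphabet (A, B, C, D), substitutions only, and a hypothetical per-site substitution rate `mu` with the three alternative letters equally likely. Under those assumptions, the single-substitution variant is vastly more probable as the next sequence than the variant that differs at every site. The insertion variant (ABCBBCDADADA) falls outside a substitution-only model and would need its own rate.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def substitution_probability(parent: str, child: str,
                             mu: float = 0.01, alphabet_size: int = 4) -> float:
    """Probability of `child` arising from `parent` in one generation.

    Assumptions (illustrative only): each site mutates independently with
    probability `mu`, and a mutated site becomes each of the other letters
    with equal probability.
    """
    d = hamming(parent, child)
    n = len(parent)
    return (mu / (alphabet_size - 1)) ** d * (1 - mu) ** (n - d)


if __name__ == "__main__":
    parent = "ABCBBCDADA"
    near = "ABCBBBDADA"  # differs at one site
    far = "BCDDADCBAB"   # differs at all ten sites
    print(hamming(parent, near), hamming(parent, far))  # 1 10
    p_near = substitution_probability(parent, near)
    p_far = substitution_probability(parent, far)
    print(p_near > p_far)  # True
```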

    The size of the subset of foldable proteins is interesting, but not relevant. The only relevant subset is the one I defined, the subset of naturally selectable proteins. All the rest is random variation.

    Well, you need to define “naturally selectable” proteins, and also how you identify the subset.

    This is exactly the problem Petrushka mentioned, and which IDists tend to dismiss as ignorance because they think they’ve dealt with it. You haven’t.
    We have dealt with it, and we will continue to deal with it. Because it is an important problem, and we believe it has to be dealt with. That does not mean that the problem is completely solved, obviously.

    It’s neither solved nor solvable. Therefore not only does it not yield the claimed Design inference, it’s a fruitless approach to detecting design.

    Darwinists, if they were scientifically honest, should deal with it with the same urgency (some, indeed, have tried, with terrible methodology and false results). Because the problem is fundamental for their own theory.

    It has nothing to do with scientific honesty and everything to do with a sound understanding of probability and the scientific method. You cannot falsify a scientific hypothesis by concluding that the observations are “improbable” given the hypothesis. You can only falsify a null hypothesis that way. Evolutionary theory is not a null hypothesis.

    Instead, most darwinists, including you, just try to hide behind the supposed impossibility to solve the problem, so that their wrong theories may continue to be believed for some more time.

    This is not true. It’s the whole approach that is misguided. There are plenty of ways of doing research into evolution that do not involve trying to compute the probability that it did not happen that way, which is impossible. To argue that we are “hid[ing] behind the supposed impossibility” is ludicrous. It’s just not how science is done. It doesn’t work. It’s invalid.

    The problem can be solved, and we have a lot of indications about what the solution is. And the solution is exactly what ID has shown.

    I have tried to demonstrate to you why a) the problem is not a problem, b) it can’t be solved and c) it wouldn’t tell you anything even if it could be. The problem is ill-posed. In fact, to be really frank, it isn’t posed at all.
    If you disagree, please pose it, precisely :)

    You have pre-defined the target as those DNA sequences that result in foldable proteins that form part of modern functional proteins, and assume that that small target comprises the only possible target, forgetting that there may be a vast set of DNA sequences that would also result in protein domains, and another vast set of proteins comprised of those domains that, in some alternative universe, might also prove to confer reproductive advantage in some alternative biosphere.
    Wow! Have you lost your mind? I have done nothing like that. I have defined a subset as (I quote myself):
    “That means that we have to look for a specific subset of P(F) (the subset of folded proteins), the subset of folded proteins unrelated at sequence level to already existing proteins in the proteome, with a new fold and a specific new biochemical function. Let’s call that P(NUFF), for New Unrelated Folded and Functional.
    Then we have to look for an even smaller subset of that, the NUFF that are naturally selectable in that context, IOWs that can confer, by themselves, a reproductive advantage in the context of the living being where the transition is supposed to happen (prokaryotes, I suppose, or a symbiosis of them).
    Let’s call that P(NUFFNS).”
    Where in that is the concept you attribute to me? Just to be precise, “those DNA sequences that result in foldable proteins that form part of modern functional proteins”?
    Why do you put in my mouth things I have not said?

    As I said, I’m trying to understand you. Your words, as written, make no sense to me. I don’t know what you mean by the terms you are using (such as “naturally selectable” and “random system” ) so I’m having to make tentative assumptions. Clearly erroneous ones, but I hope at least it is becoming clear why you are unclear to me ?

    I will not comment about the “alternative universe” and “alternative biosphere” part, just out of respect for you. I believe you must be really desperate to use those arguments. And anyway, what in the world does an alternative universe have to do with the probabilities of a selectable function arising in a specific biological context, such as my example of prokaryotes transitioning to eukaryotes?

    No, I am not “really desperate”. Boy, this has come round full circle. I’m simply pointing out that you haven’t come close (and cannot ever come close) to defining the probability space. That’s why your argument suffers from the exact flaw Petrushka pointed out: you restrict the “target” space to something only a little larger than what is observed, and then express astonishment at the low probability that this tiny space should have been hit. You have absolutely no basis on which to define that target space so narrowly – indeed it is uncomputable, as I have said. I wasn’t talking about multiverses or anything abstruse like that, just what might have happened if what actually happened didn’t. After all, your own existence, you, gpuccio, in all your uniqueness and complexity, might never have existed, were it not for a whole series of events that could easily have taken a different turn. How do you know that a completely different set of protein domains was not just as viable as the ones we observe? Or countless sets of protein domains?

    And this, to repeat, is the fundamental error at the heart of ID: to look at what exists, and say: what exists must be a tiny subset of what might exist (correct), and is also the only subset possible that could result in life (incorrect) and is therefore extremely improbable. Which is fallacious, because based on a false premise.
    Wow again! Where have I (or ID, for what I know) ever said that “what exists is the only subset possible that could result in life”? I have never said that, I am really sure of that. For the simple reason that I don’t believe it, and I usually don’t say things I don’t believe.

    Good. Then how do you compute the subset?

    But I have said a lot of times that already existing information and complexity pose a huge constraint on what can be useful in that context. IOWs, if you have to find something useful in an existing strain of bacteria, your options are radically limited, and the possibilities that other forms of life based on fire and lithium could possibly exist in another galaxy are scarcely a help!

    Yes, indeed (and I did not have in mind “life based on fire and lithium” but only, in this context, life based on a different set of protein domains), but you have it backwards. Already existing information and complexity does indeed pose “a huge constraint” but not just on “what can be useful in that context” but on what variants will be viable in that context. For example, short arms may be advantageous relative to no arms, but once short arms exist, only longer arms may offer any advantage, and once longer arms exist, short arms may be disadvantageous. And, at a molecular level, once a sequence confers a reproductive advantage it will be highly conserved, meaning that only variants that confer increased reproductive advantage will tend to be retained in the population. “Natural selection” therefore is a kind of ratchet mechanism by means of which incremental improvements in adaptation to an environment can be steadily gained. You are correct, of course, in saying that in bacteria, this imposes particular constraints, because horizontal gene transfer mechanisms are limited and haphazard. However, once sexual reproduction got going, whole new avenues opened up, because now sequences could be mixed and matched, genes could propagate independently of the rest of the genome, giving rise to much more effective filtering of advantageous sequences.


    Quite apart from the fallacy that only sequences that confer reproductive advantage on the phenotype can be replicated,

    Totally invented fallacy. I never said that.

    Good. In that case I’m not sure what you meant when you said that you needed to “reach” a “naturally selectable” sequence. I hope you can explain that. I’m doing my best here, gpuccio, and I apologise for my part in misunderstanding you, but I do think that the major part of the misunderstanding arises from confusion in your own thinking.

    and the fallacy that a sequence that confers a new advantage can have had no precursors that also conferred an advantage (in some different way).
    Totally invented fallacy. I certainly said, and say, that those precursors have never been shown. And I have said many times that science is not made on “mere possibilities”. It usually needs some facts, you know?

    You cannot make an inference from lack of evidence. If there can be advantageous precursors, then the thing is not “irreducibly complex”. If you want to infer design from “irreducible complexity” then you need to show that there can have been no advantageous precursors. The onus is on the one making the claim. Evolutionists aren’t claiming that there was no intelligent designer. It is IDists who are claiming that there was.

    And the general confusion between genotype and phenotype.
    What confusion? I have no confusion about that. That is a serious accusation. Please, detail it.

    It seemed to me apparent in your use of terms like “function” and “naturally selectable” and “sequence”. If you are not confused about that, then perhaps the confusion arises from some other source.
    But you certainly, in my view, need to sort out what you mean by those two terms before going any further.
    Anyway, nice to talk to you (if only to disagree with you!) again!
    A happy new year to you

    Lizzie

  84. Not really a “personal dream” but clearly, if evolutionary theory is correct, they must have existed.

    So if you want to use them to falsify evolutionary theory you need to show that they did not exist.

    What you cannot do is say: look eukaryotes leapt into existence fully formed, therefore ID, unless you can show that that indeed happened.

    Nor can you say: unless evolutionists can show simpler earlier protoeukaryotes, ID is true.

    You can, however, say: evolutionists don’t have a good explanation for how complex eukaryotes emerged from simpler ones.

    And most evolutionists would be perfectly happy to agree.

    I’ll repeat what I said to you in my recent post: evolutionists do not claim that there is no ID; they/we merely claim that there is no basis for an ID inference, and plenty of basis for our theory that complex functional living things can evolve by means of replication with heritable variation in reproductive success.

  85. I’m not “fight[ing] about words”. I just wish that people would either use them more precisely, or say more precisely what they mean by them.

    Joe has been challenging evolutionists to explain “new body plans” for quite a while, and most of the responses have been in terms of hox genes. If, instead, he was talking about “new cell types” it would have saved a lot of angst if he’d explained that.

    However, now he has. Good.

  86. Let’s say, more correctly, that the transition has a dFSCI of 150 bits.

    A1 would be the existing protein that “evolves” to A2. A2 would be a new protein, with a new function, a naturally selectable new function.

    So let’s get this absolutely clear: A1 is a DNA sequence that specifies a protein with a certain function, without which the phenotype would be worse off, right?

    A2 is a variant of that sequence that specifies a variant of that protein that serves a different function, without which the phenotype would be worse off?

    And A2 has 150 more bits than A1?

    Please correct me if I have this wrong.

    A2 would differ from A1 of 150 bits of functional information.

    OK.

    The best way to imagine that is that A2 differs from A1 in 35 AAs that must necessarily change exactly to the new value for the function to emerge.

    So you are proposing that sequence A1 is vital, and that A2 is an even better replacement for A1, but has 35 different AAs, that must all be present for the protein to be made at all?

    As this is not usually the case, we can apply a Durston style computation, attributing to each site that changes an informational value in Fits that corresponds to the reduction in uncertainty that each AA site implies.

    What is “not usually the case”? Are you suggesting that one DNA protein coding sequence mutates to a very different sequence coding for a very different protein in a single step? What evidence do you have for this?

    So, at the two extremes, if one site must necessarily have one aminoacid, its Fit value will be 4.32 The function implies a complete reduction of uncertainty at that site. If instead any AA can stay at that site, its Fit value is 0 (no reduction of uncertainty is implied by the function). And similarly for all intermediate possibilities.

    The sum of all Fit values at the changing sites gives the dFSCI of the transition.
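    The per-site arithmetic described above can be sketched as follows. This is only an illustration in the spirit of the Durston-style computation mentioned in the comment, assuming a uniform 20-letter ground state and taking each site's Fit value as log2(20) minus the Shannon entropy of the residues observed at that site; the function names and the toy alignment columns are hypothetical, and the published method may differ in detail.

```python
import math
from collections import Counter

AA_ALPHABET_SIZE = 20
MAX_BITS = math.log2(AA_ALPHABET_SIZE)  # about 4.32 bits per site


def site_fits(column: str) -> float:
    """Fits at one site: log2(20) minus the Shannon entropy of the
    residues observed in that alignment column."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return MAX_BITS - entropy


def total_fits(columns) -> float:
    """Sum of per-site Fit values over the sites considered."""
    return sum(site_fits(c) for c in columns)


# Hypothetical columns: one fully constrained site, one unconstrained site.
fixed = "A" * 20                 # only one residue ever observed
free = "ACDEFGHIKLMNPQRSTVWY"    # all 20 residues equally common

print(round(site_fits(fixed), 2))        # fully constrained site
print(round(abs(site_fits(free)), 2))    # unconstrained site
print(round(total_fits([fixed] * 35), 1))  # 35 fully constrained sites
```

    Under these assumptions 35 fully constrained sites sum to roughly 151 bits, which is where the correspondence between 150 bits and about 35 AAs comes from.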

    When the starting protein (A1) is totally unrelated, as is the case for new basic protein domains, the total dFSCI of the new protein can be approximated by the Durston method applied to its protein family.

    Are you envisaging 35 new triplets or 35 changed triplets?

  87. Elizabeth:

    If you remembered the definition of dFSCI, that I have patiently discussed with you many times, you would know that dFSCI refers to the total information necessary to get a function.

    Get to a function from where?

    If your 1 bit increases are functional and selectable, then show them. If they are not, then the probabilistic barrier remains the same.

    What does “functional and selectable” mean? What would functional but not selectable mean, or selectable but not functional? Or selectable but not selected?

    Selection means no more, nor less, than the phenomenon by which a DNA sequence confers reproductive advantage on the phenotype. If it does so, it is functional, clearly. If it confers no net reproductive advantage, but does have an effect, you wouldn’t normally call it a “function” in evolutionary terms.

    Now, clearly, when we observe microevolution (increase in finch beak size in response to changes in available seed size, for instance) we are seeing functional changes – an increased prevalence of alleles that tend to promote larger beaks. Over time, if El Nino events become more common, any new alleles that tend to promote even larger beaks will also become more prevalent. Mean beak size will continue to increase until still larger beaks are no longer advantageous.

    At what stage would you determine that 150 bits of functional information had been added? And how would you quantify the increase in each generation? Recall that the mean beak depth steadily increases from year to year.

    Elizabeth:

    Microevolution is a random variation in the range of what a biological system can achieve, that gives a functional selectable result. Some forms of antibiotic resistance are microevolution, and they are well documented. A single aminoacid change can confer antibiotic resistance and be selected, in the presence of the antibiotic. That is well known, and observed.

    One aminoacid is about 4.32 bits of information. That is in the range of routine variation in a bacterial culture.

    150 bits corresponds to about 35 AAs. A transition requiring 35 AAs to confer a new function has never been observed to occur in any biological context.

    So what makes you think that any new function requires a single step of 35 new AAs or AA changes?

  88. F/N: I think this thread has in it some key discussions, so I have taken liberty to clip and highlight, here. Pardon delay in notification, the struggle with the black screen of sudden death is not over yet — and Linux is looking better and better to me, 2nd time it has saved my neck. KF

  89. tjguy:
    The paper is exactly right.
    Let’s just comment very simply the problems of the duplicated gene mechanism as a neodarwinian tool. It’s complete nonsense.
    The simple truth is that NS acts as negative selection to keep the already existing information. We see the results of that everywhere in the proteome: the same function is maintained in time and in different species, even if the primary sequence can vary in time because of neutral variation. So, negative NS conserves the existing function, and allows only neutral or quasi neutral variation. In that sense it works against any emergence of completely new information from the existing one, even if it can tolerate some limited “tweaking” of what already exists (microevolution).

    Again, your metaphor of NS “acting” is misleading you. Natural selection is an effect, not an agent. It doesn’t “act… as negative selection to keep the already existing information”. It doesn’t “act” at all. It is simply the phenomenon by which DNA sequences that only result in reproductive success if unaltered (or only slightly altered) are highly conserved. In other words it’s a way of describing sequences that are both vital and “brittle” – most variants are lethal. It doesn’t “work against the emergence of completely new information”, it simply means that certain sequences are unlikely to be propagated in mutated form. That doesn’t mean the sequence can’t be duplicated, or other less brittle or less vital sequences mutated and propagated, so it does not have any implications at all for the “emergence of completely new information”.

    I suppose that darwinists, or at least some of them, are aware of that difficulty as soon as one tries to explain completely new information, such as a new basic protein domain. Not only the darwinian theory cannot explain it, it really works against it.
    So, the duplicated gene mechanism is invoked.

    Not invoked, gpuccio, observed, as one of several mechanisms by which new genes, as opposed to new alleles, are created.

    The problem is that the duplicated gene, to be free to vary and to leave the original functional island, must be no more translated and no more functional. Indeed, that happens very early in the history of a duplicated gene, because many forms of variation will completely inactivate it as a functional ORF, as we can see all the time with pseudogenes.

    Not at all. It can be translated and be functional until it undergoes a mutation that renders it non-functional (and becomes, as you say, a pseudogene). It doesn’t have to become non-functional in order to vary. Indeed, one point that Atheistoclast does make, which is quite an interesting one, although he doesn’t make it very clearly, and I’m not even sure he ever really got his own point, is that to some extent, the duplicated gene will sometimes tend to be conserved as a “spare”, as he says, although I think that is a very misleading way of putting it, and indeed, misled him (not that that is very difficult). If the duplicated allele is a useful, but not especially common one, in a sexually reproducing population, individuals with two copies of it may have more offspring than individuals with one, because they are more likely to end up with offspring with at least one. In fact, on a TR thread in which Atheistoclast was developing this idea, he and I both made simple computer models (his was appalling) and I did in fact show that this was true, and so an increasing proportion of the population end up with both the new gene and the old, both still functioning. However, once the prevalence of both in the population gets very high, the reproductive advantage to having a duplicate starts to drop (because your chance of meeting a mate who also has at least one copy starts to rise, so you don’t “need the spare”), and so any inactivating mutation in one of the copies ceases to come with a drop in reproductive success. And once inactivated, it is now “free to mutate”, with the concomitant possibility that one of those subsequent mutations will yield a functional gene.
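    The “spare copy” argument above can be illustrated with a back-of-envelope calculation. This is not the model from the TR thread, which is not reproduced here; it is a minimal sketch under stated assumptions: each copy carried by the focal parent is passed to an offspring independently with probability 1/2, and the mate passes on at least one copy with a probability that stands in for how common the allele is in the population. All names are hypothetical.

```python
def p_offspring_has_copy(n_copies: int, p_mate_transmits: float) -> float:
    """Probability that an offspring inherits at least one functional copy.

    Assumes each of the focal parent's n_copies is transmitted
    independently with probability 1/2 (a simplifying assumption).
    """
    p_none_from_focal = 0.5 ** n_copies
    p_none_from_mate = 1.0 - p_mate_transmits
    return 1.0 - p_none_from_focal * p_none_from_mate


def spare_advantage(p_mate_transmits: float) -> float:
    """Gain in that probability from carrying two copies rather than one."""
    return (p_offspring_has_copy(2, p_mate_transmits)
            - p_offspring_has_copy(1, p_mate_transmits))


# As the allele becomes more common in mates, the benefit of the spare shrinks.
for p in (0.0, 0.5, 0.9, 1.0):
    print(f"mate transmits with p={p:.1f}: advantage of spare = {spare_advantage(p):.3f}")
```

    Under these assumptions the benefit of the second copy falls towards zero as the allele becomes common, which is the point at which an inactivating mutation in one copy would stop costing anything.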

    So, one of the two:
    a) either the duplicated gene remains functional and contributes to the reproduction, so that negative NS can preserve it. In that case, it cannot “move” to new unrelated forms of function.
    b) or the duplicated gene immediately becomes non functional, and is free to vary.
    The important point is that case a) is completely useless to the darwinian explanation.
    Case b) allows free transitions, but they are no more visible to NS, at least not until a new functional ORF (with the necessary regulatory sites) is generated. IOWs, all variation from that point on becomes neutral by definition.

    And you are back with your agency language again! “Visible to NS” is meaningless in this context (if ever meaningful in any context). In any case you’ve excluded the most likely middle which is, as I’ve explained, that the duplicates are conserved until most of the population have at least one copy of the good allele, and two copies of the gene, at which point, having two good copies of the good allele ceases to be strongly advantageous, if advantageous at all, and the copy (or the original, it doesn’t matter) is “freed” from conserving forces i.e. the phenotypes suffer no drop in reproductive success if the sequence suffers a deactivating mutation.
    Binary thinking seems to be the problem here. Also absolutist thinking. “Neutrality”, “advantageousness”, “deleteriousness” are not absolute terms, they are relative, and they change over time. Evolution is a dynamic process, and the prevalence of alleles in the population is itself part of the environment that determines fitness, so there are powerful feedback loops, as I described above (in my model). And once conserving forces towards having two copies are relaxed, one is free to mutate into something else that may come in handy.

    But neutral variation, while free of going anywhere, is indeed free of going anywhere. That means: freedom is accompanied by the huge rising of the probability barriers. As we know, finding a new protein domain by chance alone is exactly what ID has shown to be empirically impossible.

    No. Again, you forget what is given. The duplicated gene is certainly not “free to go anywhere” – it’s highly constrained by what it is when it stops being what it was. In other words, if it was originally a protein-coding gene, it’s not going to go far from being a protein coding gene. And “protein coding space” is highly clustered – any changes to an ex-protein coding sequence is far more likely to hit a new protein coding sequence than, for example, some sequence pulled nucleotide by nucleotide from a black bag. And that’s what we see – genes in “families” where the gene-lineage is apparent in the sequence.

    IOWs, the neo darwinian “explanation” is silly and wrong.

    Well, Bozorgmehr is both silly and wrong. I don’t think you are silly. But I do think you are wrong :)

  90. Bozorghmehr posted here for a few days and was banned. He seems to have made a few converts, though.

    My memory could be faulty, but I think it was his conspiracy theories that got him banned.

  91. It was his holocaust denial.

    One of his many trolls.

  92. You should know very well that finding a sequence that is part of some specific, and extremely unlikely, subset of events is not pointless at all.

    You are arguing as if the sequence is specified in advance. I’ve asked you repeatedly how you know in advance what a sequence will do. Unless you can demonstrate otherwise, I don’t think you can tell if a sequence is random or just one character change from being functional.

    If you are going to claim that design is possible you need to have a theory of design, a process other than evolution that makes finding functional sequences possible.

  93. Elizabeth:

    Nice to talk to you again, for me too! And my best wishes of a happy New Year.

    I don’t think I have the time now to answer all your points. IMO, the confusion of your arguments is huge, but I am sure you think the same of mine :)

    I will just start by clarifying some of my terms which seem to give you problems.

    a) A “random system” is a system whose behaviour cannot be described in terms of necessity laws, usually because we cannot know all the variables and/or all the laws, but which can still be described well enough in terms of some probability model.

    The tossing of a coin is a random system.

    Genetic variation, if we don’t consider the effects of NS (differential reproduction) is a random system. NS introduces an element of necessity, due to the interaction of reproductive functions and the environment.

    b)”Function” and “naturally selectable function” are two different things. Please, look at my answers to englishmaninistanbul (and Petrushka) here:

    http://www.uncommondescent.com.....ent-412908

    for my definition of “function”. The biochemical action of an enzyme is its function.

    A “naturally selectable function” is a biochemical function that, in a specific biological being, confers a reproductive advantage. IOWs, if you add the gene that implements that function, the reproducer with that gene will reproduce better, and the prevalence of the gene in the population will be amplified: that’s what I mean by “expansion”.

    c) For NS I just mean what darwinists mean, only I refer always to the molecular aspect. So, a molecular variation can be:

    c1) Non visible to NS: that variation in no way modifies the reproductive potential of the reproducer compared to the rest of the population. That’s what a “neutral” variation is.

    c2) Visible to negative selection: The bearer of the variation will reproduce worse compared to the rest of the population. The variation will tend, more or less, to be eliminated from the population genome.

    c3) Visible to positive selection: the bearer of the variation reproduces better, and will expand in the population (or, which is the same, those who do not have the new trait will reproduce relatively worse). The new trait, conferring the new biological function, for instance a new enzymatic activity, will expand in the population genome.

    I can’t see what is the problem with those concepts. Could you please confirm if you accept them, and if not, why? Otherwise, it’s hopeless to go on…

  94. Elizabeth:
    Nice to talk to you again, for me too! And my best wishes of a happy New Year.
    I don’t think I have the time now to answer all your points. IMO, the confusion of your arguments is huge, but I am sure you think the same of mine
    I will just start by clarifying some of my terms which seem to give you problems.
    a) A “random system” is a system whose behaviour cannot be described in terms of necessity laws, usually because we cannot know all the variables and/or all the laws, but which can still be described well enough in terms of some probability model.
    The tossing of a coin is a random system.

    OK, thanks. That’s what I would call a “stochastic” system, but I’m happy to use your term. However I have issues about those “necessity laws”. It seems to me that “necessity laws” are also stochastic, and what we have instead is a continuum of uncertainty. For instance, radioactive decay is, at one level, a “necessity law” and we can make highly precise predictions about half-lives etc at the macro level, but very poor predictions at the level of individual decay events, at which point we are reduced to a probability distribution as with any stochastic system.
    So I’d say that all systems are stochastic. So it doesn’t tell us much in itself ?

    Genetic variation, if we don’t consider the effects of NS (differential reproduction) is a random system. NS introduces an element of necessity, due to the interaction of reproductive functions and the environment.

    Well, as I said, I’d say it’s all stochastic, and in no case are we dealing with flat probability distributions. Mutations have associated non-flat probability distributions, as does the probability that a variant with given properties will propagate through a population.

    b)”Function” and “naturally selectable function” are two different things. Please, look at my answers to englishmaninistanbul (and Petrushka) here:
    http://www.uncommondescent.com…..ent-412908
    for my definition of “function”. The biochemical action of an enzyme is its function.

    I do not find that link at all enlightening. Please give me an example of a function that is not “naturally selectable”.

    A “naturally selectable function” is a biochemical function that, in a specific biological being, confers a reproductive advantage. IOWs, if you add the gene that implements that function, the reproducer with that gene will reproduce better, and the prevalence of the gene in the population will be amplified: that’s what I mean by “expansion”.

    The reason this makes no sense, and, as a result, your distinction between “function” and “naturally selectable function” is a distinction without a difference, IMO, is that you have, again, confused genotype with phenotype! Genes do not “implement a function”. They are part of a large system of genes that work together to produce a co-ordinated functional phenotype. Most genes are expressed in many organs and tissues, and, depending on where and when they are expressed, serve a different function, but none are solely responsible for that function.
    Perhaps the distinction you are trying to get at is between a gene that is sometimes expressed as a protein (i.e. the DNA has the “function” of sometimes resulting in a protein) and a gene that makes the difference between a successfully reproducing organism and an unsuccessful one. But I think that’s a very poor use of terms myself. Unless a gene does something that contributes to the reproductive success of the organisms of a population I don’t see how you can say it has a function. And if it does, then it’s by definition selected. That would include, of course, any gene that contributes to the maintenance of the organism.

    c) For NS I just mean what darwinists mean, only I refer always to the molecular aspect. So, a molecular variation can be:
    c1) Non visible to NS: that variation in no way modifies the reproductive potential of the reproducer compared to the rest of the population. That’s what a “neutral” variation is.
    c2) Visible to negative selection: The bearer of the variation will reproduce worse compared to the rest of the population. The variation will tend, more or less, to be eliminated from the population genome.
    c3) Visible to positive selection: the bearer of the variation reproduces better, and will expand in the population (or, which is the same, those who do not have the new trait will reproduce relatively worse). The new trait, conferring the new biological function, for instance a new enzymatic activity, will expand in the population genome.
    I can’t see what is the problem with those concepts. Could you please confirm if you accept them, and if not, why? Otherwise, it’s hopeless to go on…

    Well, they seem unnecessarily complicated and potentially misleading. For a start, “natural selection” occurs at the level of the phenotype, not the genotype (as I keep saying, you seem to confuse these levels). Natural selection is, to a “Darwinist”, simply heritable variation in reproductive success. You cannot consider natural selection at the molecular level only. And in place of your c subdivisions, I would say:
    C1) A neutral variant is one that either confers no greater reproductive success on its bearer than the parental variant did on its bearer in the same environment or that confers no greater reproductive success on its bearer in the current environment than the mean reproductive success of all other variants in the population.
    C2) A deleterious variant is one that confers reduced reproductive success etc.
    C3) A beneficial variant is one that confers increased reproductive success etc.
    But bear in mind that if the environment changes, that variant may become either beneficial or deleterious. Most new variants are near-neutral when they first appear. Bear also in mind that allele frequency itself is a powerful element in the environment.

    Would you agree?
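
    The C1 to C3 definitions above can be sketched as a small function. This is only an illustration under stated assumptions: the name classify_variant, the tolerance parameter, and every number below are hypothetical and not taken from any data.

```python
def classify_variant(variant_success, population_mean, tolerance=0.0):
    """Label a variant by its reproductive success relative to the population mean.

    All inputs are hypothetical relative reproductive success values.
    """
    diff = variant_success - population_mean
    if abs(diff) <= tolerance:
        return "neutral"
    return "beneficial" if diff > 0 else "deleterious"


# Hypothetical numbers: the same variant measured in two environments.
variant = {"no_penicillin": 1.0, "penicillin": 1.4}
population = {"no_penicillin": 1.0, "penicillin": 0.6}

for env in variant:
    print(env, classify_variant(variant[env], population[env]))
```

    The label belongs to the pair (variant, environment), not to the variant alone, which is the caveat about changing environments made above.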

  95. Elizabeth:

    So I’d say that all systems are stochastic. So it doesn’t tell us much in itself?

    No. I would not say so. Conventional physics is deterministic. You have taken an example from quantum physics, which is both deterministic and probabilistic, at different points.

    In conventional physics, the tossing of a coin is deterministic, but as we cannot know all the variables implied, we can better describe it with a probabilistic system, and if we assume it is a fair coin, a discrete uniform distribution, with each of the two events having a 0.5 probability, will describe the system quite well.

    Even in quantum mechanics, the evolution of the wave function is strictly deterministic. It’s only the “collapse” of the wave function that is probabilistic, and I must say that the meaning of probability in the quantum world is controversial: it could be intrinsic randomness, or just a pseudo-randomness like in classical random systems, although I suppose that the first position prevails.

    Well, as I said, I’d say it’s all stochastic, and in no case are we dealing with flat probability distributions. Mutations have associated non-flat probability distributions, as does the probability that a variant with given properties will propagate through a population.

    Confusion, again. I suppose that with “flat” you mean a uniform probability distribution. But there is no need that the probability distribution that describes a system should be flat. It can be any valid probability function. The system is random all the same.

    That mutations can have slightly different probabilities is true. But I will try to explain, when we arrive at that point, why that is not really relevant.

    Instead, your observation about the effect of NS in propagating a mutation (I suppose you are referring to NS, as you speak of “properties”) is a necessity effect. That has nothing to do with the random system. It is a necessity algorithm that is “coupled” to the random system that generates RV. More on that later (I hope).

    I do not find that link at all enlightening. Please give me an example of a function that is not “naturally selectable”.

    Most biochemical functions are not “naturally selectable” in a specific context. Suppose I introduce by genetic engineering a gene that codes for some human protein into a bacterial population, where the biochemical context that uses the biochemical function of the gene is not present. In that case, the simple existence of a protein that, while retaining its biochemical function, is not in the context to use it, will not add to the reproductive potential of the bacteria. It will not be naturally selectable.

    To be naturally selectable, a protein gene must code for a protein that can be usefully integrated in the existing biochemical environment of the species where it emerges.

    Genes do not “implement a function”.

    Hey, that’s really sticking to words and being fastidious. A protein coding gene codes for a protein. The protein implements a function. OK, we are writing on a blog, in a hurry. Could you please try to understand what I mean? The gene has the information for the protein. The protein implements a function. I have no confusion about genotype and phenotype.

    They are part of a large system of genes that work together to produce a co-ordinated functional phenotype.

    What does that mean? The glucose 6 phosphatase gene codes for a specific enzyme, that has a specific biological function. That such a function is integrated in a higher level organization is true, but that does not make the local function less real. And you cannot integrate functions that do not exist into a higher level organization. Your reasoning really makes no sense.

    If your darwinist education prevents you, an intelligent person, from understanding that the local function of an enzyme is different from the global function of an integrated protein system, that still is different from the general reproductive function of a living being, then I must say that darwinism is really much more dangerous for the human mind than I believed :)

    Most genes are expressed in many organs and tissues, and, depending on where and when they are expressed, serve a different function,

    No. Their biochemical activity remains the same.

    Unless a gene does something that contributes to the reproductive success of the organisms of a population I don’t see how you can say it has a function.

    Hey! What are you saying? An enzyme retains its biochemical function in the lab, in a cell free system, where it does not certainly contribute to any reproductive success!

    DNA polymerase is standardly used in labs, because it does what it does. I don’t think that allowing us to perform PCR is contributing to some reproductive success. And yet, we can perform that technique in the lab because the involved enzymes retain their biochemical functions even if completely separated from the living beings where they were formed, and even if artificially built in the lab in a cell free system. How can you make such senseless statements?

    Well, they seem unnecessarily complicated and potentially misleading. For a start, “natural selection” occurs at the level of the phenotype, not the genotype (as I keep saying, you seem to confuse these levels).

    I am not confusing anything. NS does happen at the phenotype level, but as its result is differential reproduction, its relevant effect is expanding or reducing the instances of a specific genome. That’s the only thing that is relevant to evolution. So, the genetic variation gives a phenotypic effect, the phenotypic effect determines the expansion or contraction of the population, and therefore the expansion or contraction of the new gene. And so? Must I write all that each time to make you happy? What changes?

    You cannot consider natural selection at the molecular level only.

    Why? I consider the effects of NS at the molecular level, because that is the level where the information is, and it’s information we are debating. Why in the world cannot I do that?

    C1) A neutral variant is one that either confers no greater reproductive success on its bearer than the parental variant did on its bearer in the same environment or that confers no greater reproductive success on its bearer in the current environment than the mean reproductive success of all other variants in the population.

    Simple and intuitive indeed! Look at mine:

    c1) Non visible to NS: that variation in no way modifies the reproductive potential of the reproducer compared to the rest of the population. That’s what a “neutral” variation is.

    Where is your problem? The variant is different: either that difference changes something in the reproductive power of the varied being, or not. And your complex, and useless, definition is misleading to our discussion. After all, we are looking for how new functional proteins emerge (do you remember? that was the scenario). If a modified gene (and protein sequence, just to be complete) does not code for a protein that is new and functional, it can be in some clone that expands or not, for other reasons linked to other parts of the genome, or just because of drift: all that is possible, but in no way that will select for molecular function. In no way it will help a new functional protein to emerge.

    You are always at the same point: you imagine that functional proteins can emerge by chance alone, and still you want to keep NS in the field, just to be sure. Well, your reasoning is faulty and wrong.

    But bear in mind that if the environment changes, that variant may become either beneficial or deleterious. Most new variants are near-neutral when they first appear. Bear also in mind that allele frequency itself is a powerful element in the environment.

    The simple reason why you run away from the molecular scenario is that, at that level, nothing of what you say has any sense.

    Again, proteins must fold well, must have an active site, must have a specific, often amazing, biochemical activity. Otherwise they are useless. And even if they have that biochemical activity, they can be useless just the same in the wrong context. And so?

    Penicillinase is not useful if there is no penicillin in the environment. But it still is a wonderful biochemical machine.

    But if you have to metabolize nylon, you use the same penicillinase structure, with minimal modifications, because it’s always an esterase activity you need.

    If you had not penicillinase, nylonase could never emerge. Except in Ono’s imagination…

  96. Elizabeth:

    Just a comment on this statement from you:

    No. Again, you forget what is given. The duplicated gene is certainly not “free to go anywhere” – it’s highly constrained by what it is when it stops being what it was. In other words, if it was originally a protein-coding gene, it’s not going to go far from being a protein coding gene. And “protein coding space” is highly clustered – any changes to an ex-protein coding sequence is far more likely to hit a new protein coding sequence than, for example, some sequence pulled nucleotide by nucleotide from a black bag. And that’s what we see – genes in “families” where the gene-lineage is apparent in the sequence.

    Again, what do you mean? My discussion is about the 2000 protein superfamilies that are not clustered at all. It’s obvious that, in a single family or superfamily, there are similarities. That’s why I don’t usually discuss the evolution inside families (that can, anyway, be discussed). I usually admit that the evolution inside families is of two kinds:

    a) With preservation of more or less the same function, and variation of the primary sequence out of neutral variation.

    b) With slight, sometimes more radical, “tweaking” of the function, and levels of variation that could be borderline for a design detection (5 – 10 amino acids).

    But the true ID argument is about the fundamental islands, the basic domains. Those are not clustered. They are completely isolated at primary sequence level (less than 10% homology), have different folding and different function.

    That’s why I say that, in that context, each unrelated state has more or less the same probabilities, and therefore we can assume a practically uniform distribution.

    Let’s be more clear. Let’s start from a duplicated gene. You make a big fuss about the gene retaining for some time its original function, the double allele, and so on. None of that is relevant.

    Let’s say that, if that duplicate gene will become the seed for a new domain, unrelated to the original at sequence level (as it must happen, because new domains do appear), at some point it must lose its original function and become inactive. At this point, all mutations are neutral. As mutations accrue, its sequence will no more be related to the original sequence. At this point, we can safely say that any unrelated state of the sequence has the same probability to be reached by a random walk as any other. Your reasoning that “if it was originally a protein-coding gene, it’s not going to go far from being a protein coding gene” has no sense. Any stop codon, or frameshift mutation, will make an ORF no more an ORF. And even if the ORF is still transcribed or translated, the protein will no more fold or be functional. We will have a pseudogene, or a protein coding gene that codes for a completely non functional protein (a protein is a protein even if it does not fold, even if it has no function).

    So, new domains are exactly that: some sequence pulled nucleotide by nucleotide from a black bag. Or rather, some sequence that has to be reached by a completely blind random walk from a completely unrelated point of the search space. (Indeed, the idea of the black bag is incorrect, because it is related to a random search by successive extractions, and not to a random walk. The correct model is a random walk).

  97. So, new domains are exactly that: some sequence pulled nucleotide by nucleotide from a black bag. Or rather, some sequence that has to be reached by a completely blind random walk from a completely unrelated point of the search space.

    You are simply assuming that there is no historical path leading to modern domain sequences. You’ve done no research to demonstrate the lack of a path. You have no theory that speaks to the question of whether there is a path.

    All you have are sequences that have no living cousins.

    That is not positive evidence to support an unobserved designer.

  98. Elizabeth:
    So I’d say that all systems are stochastic. So it doesn’t tell us much in itself?
    No. I would not say so. Conventional physics is deterministic. You have taken an example from quantum physics, which is both deterministic and probabilistic, at different points.
    In conventional physics, the tossing of a coin is deterministic, but as we cannot know all the variables implied, we can better describe it with a probabilistic system, and if we assume it is a fair coin, a discrete uniform distribution, with each of the two events having a 0.5 probability, will describe the system quite well.
    Even in quantum mechanics, the evolution of the wave function is strictly deterministic. It’s only the “collapse” of the wave function that is probabilistic, and I must say that the meaning of probability in the quantum world is controversial: it could be intrinsic randomness, or just a pseudo-randomness like in classical random systems, although I suppose that the first position prevails.

    You put your finger on it IMO when you say “as we cannot know all the variables implied”. Yes indeed. And we never do. Even if at some fundamental level the universe proves to have no quantum uncertainty (as at least one eminent theoretical physicist has proposed) all our models (and we are talking about models here) must be stochastic. Now physicists work to find tolerances and insist on 5 sigma confidence, while life scientists are often content with two. But every single effect we observe comes with stochastic variance. We can never know all the variables. What we can do is estimate probability distributions, and also the extent to which those distributions are orthogonal.

    Well, as I said, I’d say it’s all stochastic, and in no case are we dealing with flat probability distributions. Mutations have associated non-flat probability distributions, as does the probability that a variant with given properties will propagate through a population.
    Confusion, again. I suppose that with “flat” you mean a uniform probability distribution. But there is no need that the probability distribution that describes a system should be flat. It can be any valid probability function. The system is random all the same.

    I agree. That was my point. All systems are stochastic. What we need are the relevant probability distributions, not a division into stochastic and non-stochastic processes. They are all stochastic. Some merely have more variance than others. This is, btw, why I avoid the term “RM+NS” and get cross when people say that mutation is random and natural selection isn’t. Both variance generation (“RM”) and differential reproduction (“NS”) are stochastic processes. And both have probability distributions that are biased in favour of reproductive success.

    That mutations can have slightly different probabilities is true. But I will try to explain, when we arrive at that point, why that is not really relevant.

    OK.

    Instead, your observation about the effect of NS in propagating a mutation (I suppose you are referring to NS, as you speak of “properties”) is a necessity effect. That has nothing to do with the random system. It is a necessity algorithm that is “coupled” to the random system that generates RV. More on that later (I hope).

    Well, as I keep saying, I think these terms and categories are misleading you. Darwinian evolution is extremely simple: if organisms reproduce with heritable variation in reproductive success, variants that tend to reproduce more successfully will become more prevalent. That’s the algorithm. It’s not “coupled” to anything. What is “coupled” within the algorithm, is reproduction with heritable variation in reproductive success. Take away either half of the “coupling” and you don’t have an algorithm. I honestly think that much of the flak copped by Darwinism boils down to overcomplication. That’s all it is. What is genuinely complicated of course are the mechanisms by which variation is generated and replicated. And those are fascinating, and many are still a mystery. But the algorithm is dead simple.
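
    To make the “dead simple” algorithm concrete, here is a minimal sketch, assuming a toy population with just two heritable variants, A and B, where A has a small reproductive advantage. It is a Wright-Fisher-style illustration and not a model of any real organism; the function name simulate and all parameter values are made up for the example.

```python
import random


def simulate(generations=200, pop_size=1000, advantage=0.05,
             start_freq=0.01, seed=0):
    """Track the frequency of variant A over time (hypothetical toy model)."""
    rng = random.Random(seed)
    freq = start_freq
    history = [freq]
    for _ in range(generations):
        # Heritable variation in reproductive success: A's expected share
        # of the offspring is weighted by its relative advantage.
        weight_a = freq * (1 + advantage)
        expected = weight_a / (weight_a + (1 - freq))
        # Reproduction is still a stochastic draw, so drift is included.
        count = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = count / pop_size
        history.append(freq)
        if freq in (0.0, 1.0):
            break  # variant A is lost or fixed
    return history


history = simulate()
print("generations run:", len(history) - 1)
print("final frequency of A:", history[-1])
```

    With a small advantage and a rare starting variant, A can still be lost by chance in some runs, which fits the point that the differential reproduction step is itself stochastic.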

    I do not find that link at all enlightening. Please give me an example of a function that is not “naturally selectable”.
    Most biochemical functions are not “naturally selectable” in a specific context. Suppose I introduce by genetic engineering a gene that codes for some human protein into a bacterial population, where the biochemical context that uses the biochemical function of the gene is not present. In that case, the simple existence of a protein that, while retaining its biochemical function, is not in the context to use it, will not add to the reproductive potential of the bacteria. It will not be naturally selectable.

    Notice that you could only give me an example from Intelligent Design :) Try again please :) This is important.

    To be naturally selectable, a protein gene must code for a protein that can be usefully integrated in the existing biochemical environment of the species where it emerges.

    Sure. Because if it codes for a protein that performs no function for the phenotype, and can perform no function for the phenotype, then it doesn’t have a function, does it? Sure a protein might be generated and not do anything for the organism. But then it would be a functionless protein wouldn’t it? How can it have a function and not do anything useful? Isn’t that an oxymoron? Does it work in Italian?

    Genes do not “implement a function”.
    Hey, that’s really sticking to words and being fastidious.

    Yup. I know. The devil is in the details. I’m trying to get past those waving hands (Granville’s too) and these high level metaphors and personification (NS as agent) and down to the nitty gritty of what actually happens!

    A protein coding gene codes for a protein. The protein implements a function.

    No, the protein doesn’t “implement a function”. The protein may play a role in some function but that doesn’t mean it “implements” it. These things matter. A protein produced in the wrong place and/or the wrong time may cause serious disease. The function is much larger than the protein. At least it is if we are talking about multicellular organisms. Maybe not so much if we are talking about unicellular organisms (which are not my speciality :))

    OK, we are writing on a blog, in a hurry. Could you please try to understand what I mean? The gene has the information for the protein. The protein implements a function. I have no confusion about genotype and phenotype.

    I know you think you don’t, and I know you think I am being difficult. But from where I’m standing you do have a problem, and I think it’s because you have over-simplified to the point of falsification the role that proteins play in a functioning organism. I’ve recommended this video many times, I don’t know whether you have seen it. I think it’s excellent (and a refreshing antidote to Dawkins’ “Selfish Gene” concept):
    http://videolectures.net/eccs07_noble_psb/
    I do wish people on this blog would watch it and take note. It would actually improve some of the arguments both for and against Darwinian evolution!

    They are part of a large system of genes that work together to produce a co-ordinated functional phenotype.
    What does that mean? The glucose 6 phosphatase gene codes for a specific enzyme, that has a specific biological function. That such a function is integrated in a higher level organization is true, but that does not make the local function less real. And you cannot integrate functions that do not exist into a higher level organization. Your reasoning really makes no sense.

    Yes it does :) I didn’t say that functions couldn’t be local. Indeed it’s important that they are. If you start producing enzymes in the wrong tissue at the wrong time, you produce disorders. But if the local system doesn’t help the phenotype (the whole organism) reproduce then it’s not fulfilling a function, whatever else it does. Not in the normal sense of the word “function” anyway. It would be what my husband calls our cats: “do-nothing-machines”.

    If your darwinist education prevents you, an intelligent person, from understanding that the local function of an enzyme is different from the global function of an integrated protein system, that still is different from the general reproductive function of a living being, then I must say that darwinism is really much more dangerous for the human mind than I believed

    Well, it doesn’t. So retract that thought.

    Most genes are expressed in many organs and tissues, and, depending on where and when they are expressed, serve a different function,
    No. Their biochemical activity remains the same.

    I didn’t say it didn’t. They nonetheless may serve different functions, i.e. some genes are pleiotropic. Even within one organ, a single gene, for example a neurotransmitter transporter, may serve different functions depending on where, how, and the degree to which it is expressed.

    Unless a gene does something that contributes to the reproductive success of the organisms of a population I don’t see how you can say it has a function.
    Hey! What are you saying? An enzyme retains its biochemical function in the lab, in a cell free system, where it does not certainly contribute to any reproductive success!

    Well, it retains its biochemical properties – it still catalyses the same reactions. I wouldn’t call that a function. I’d call it a biochemical property.

    DNA polymerase is standardly used in labs, because it does what it does. I don’t think that allowing us to perform PCR is contributing to some reproductive success. And yet, we can perform that technique in the lab because the involved enzymes retain their biochemical functions even if completely separated from the living beings where they were formed, and even if artificially built in the lab in a cell free system. How can you make such senseless statements?

    It could actually be a language problem. I was helping my son translate some Spanish poetry this evening, and we were both commenting that one Spanish word can have many English equivalents, and so you lose in translation the multivalence of the Spanish word. I’ve found the same with Italian too. English is wonderful for precision, but romance languages can be better for philosophy! So no, my statement is not senseless. But if you want to use the word “function” to refer to what I would call simply a chemical or physical “property”, then it is important that we keep those meanings separate and do not equivocate between the two. By “function” I mean: serves some purpose, whether teleologic or teleonomic. In the case of components of living things, I mean teleonomic purpose – the role a component plays in contributing to the reproductive success (including the maintenance) of the organism. I might also use an enzyme teleologically to help me get my clothes clean, or even as some kind of reagent in a lab, in which case I would be giving it a function. But just sitting there catalysing for no purpose, that, I wouldn’t call having a “function”, merely exhibiting a biochemical property.

    Well, they seem unnecessarily complicated and potentially misleading. For a start, “natural selection” occurs at the level of the phenotype, not the genotype (as I keep saying, you seem to confuse these levels).
    I am not confusing anything. NS does happen at the phenotype level, but as its result is differential reproduction, its relevant effect is expanding or reducing the instances of a specific genome. That’s the only thing that is relevant to evolution. So, the genetic variation gives a phenotypic effect, the phenotypic effect determines the expansion or contraction of the population, and therefore the expansion or contraction of the new gene. And so? Must I write all that each time to make you happy? What changes?

    OK. I wasn’t sure what you meant by “expansion”. I’d call that “propagation” or “replication” :)

    You cannot consider natural selection at the molecular level only.
    Why? I consider the effects of NS at the molecular level, because that is the level where the information is, and it’s information we are debating. Why in the world cannot I do that?

    Because you need also to consider the phenotype!!!!! The phenotype does not even exist at the molecular level!

    C1) A neutral variant is one that either confers no greater reproductive success on its bearer than the parental variant did on its bearer in the same environment or that confers no greater reproductive success on its bearer in the current environment than the mean reproductive success of all other variants in the population.
    Simple and intuitive indeed! Look at mine:
    c1) Non visible to NS: that variation in no way modifies the reproductive potential of the reproducer compared to the rest of the population. That’s what a “neutral” variation is.
    Where is your problem? The variant is different: either that difference changes something in the reproductive power of the varied being, or not. And your complex, and useless, definition is misleading to our discussion. After all, we are looking for how new functional proteins emerge (do you remember? that was the scenario). If a modified gene (and protein sequence, just to be complete) does not code for a protein that is new and functional, it can be in some clone that expands or not, for other reasons linked to other parts of the genome, or just because of drift: all that is possible, but in no way that will select for molecular function. In no way it will help a new functional protein to emerge.

    Oh boy. My definition is not complex at all. It doesn’t require us to imagine NS as having metaphorical “eyes” to which “variation” might be “visible”. It just tells you what the word means. Let’s leave the rest for now:

    You are always at the same point: you imagine that functional proteins can emerge by chance alone, and still you want to keep NS in the field, just to be sure. Well, your reasoning is faulty and wrong.

    And here we are right back where we started with “chance alone”. Look, my position, as I’ve said, is that the entire system is stochastic (i.e. has “chance” elements) both variation and “NS”. Of course I want “NS in the field”!!! It’s sitting there already!!! I do understand that you think my “reasoning is faulty and wrong” but from where I’m standing you seem not to have understood any part of it! Perhaps, as I said, that is a language problem. I will sleep on this and see if I can be clearer.
    Communication is hard!

    But bear in mind that if the environment changes, that variant may become either beneficial or deleterious. Most new variants are near-neutral when they first appear. Bear also in mind that allele frequency itself is a powerful element in the environment.
    The simple reason why you run away from the molecular scenario is that, at that level, nothing of what you say has any sense.

    I’m not “run[ning] away from the molecular scenario”. I honestly have no clue why you think so. What I am insisting on is that when we evaluate the function specified by DNA (a molecule) we need to do so in terms of its effect on the phenotype. You keep saying you are remembering the phenotype, then you ignore it!

    Again, proteins must fold well, must have an active site, must have a specific, often amazing, biochemical activity. Otherwise they are useless.

    And even if they have that biochemical activity, they can be useless just the same in the wrong context. And so?
    Penicillinase is not useful if there is no penicillin in the environment. But it still is a wonderful biochemical machine.
    Sure. But has no function, in my usage. It may have very special biochemical properties.

    But if you have to metabolize nylon, you use the same penicillinase structure, with minimal modifications, because it’s always an esterase activity you need.
    If you had not penicillinase, nylonase could never emerge. Except in Ono’s imagination…

    Exactly. A functional protein can have a non-functional precursor. That’s what I’ve been saying all along. And that non-functional precursor might exist because it was, in the past, functional.

  99. Petrushka:

    I have done the discussion about the paths thousands of times. I cannot always repeat everything. The above discussion was about the random part. Please, reread all my post with honesty, and don’t change the discussion.

    We are still waiting for your answer about probabilities, and for a defense, or a retraction, of your statement in post 2.3.2.1.14.

  100. Elizabeth:

    Frankly, I don’t think it is a problem of language. I think it is a cognitive problem. With respect, your cognitive problem.

    What can I say? I disagree with all, but I am tired. Maybe tomorrow…

  101. Unicellular organisms have a body and therefore they also have a body plan.

    HOX genes require an explanation before you can use them for any explanations.

  102. Let’s be more clear. Let’s start from a duplicated gene. You make a big fuss about the gene retaining for some time its original function, the double allele, and so on. None of that is relevant.

    Well, it was relevant to the point you originally made, which presented only two alternative scenarios, and dismissed both! I presented a third, and defended that!

    Let’s say that, if that duplicate gene will become the seed for a new domain, unrelated to the original at sequence level (as it must happen, because new domains do appear), at some point it must lose its original function and become inactive.

    And immediately you introduce new cans of worms! The “seed” for “a new domain”? Do you just mean that the duplicated sequence may eventually become the coding sequence for a new gene? What is “seed” doing in there? And why must it become inactive before it does something new? Why could it not simply undergo a series of point mutations and produce a slightly different protein at each change? Until one day, the protein produced turns out to have a useful new function? Or undergo some change to its promoter so it is now expressed in a different tissue, or at a different developmental stage? Or, as you say, it could become inactive, or active but useless (produce some harmless but non-functional protein).

    At this point, all mutations are neutral.

    No they aren’t, necessarily. Some may well be deleterious if they result in a toxic protein.

    As mutations accrue, its sequence will no more be related to the original sequence. At this point, we can safely say that any unrelated state of the sequence has the same probability to be reached by a random walk as any other.

    I agree that the longer it remains functionless, the further it will move from its “parent” sequence. Its destiny may well be a pseudogene.

    Your reasoning that “if it was originally a protein-coding gene, it’s not going to go far from being a protein coding gene” has no sense. Any stop codon, or frameshift mutation, will make an ORF no more an ORF.

    That’s true. But back mutations are also possible. Also restorations of the reading frame. And some frameshifts result in viable proteins.

    And even if the ORF is still transcribed or translated, the protein will no more fold or be functional.

    Most probably. But not necessarily. Brand new genes are probably relatively rare (compared with new alleles, or with duplications). But that doesn’t mean they won’t ever happen. Mutations are common.

    We will have a pseudogene, or a protein coding gene that codes for a completely non functional protein (a protein is a protein even if it does not fold, even if it has no function).

    Sometimes indeed this will happen. More often than not, in fact. Hence the notorious “junk DNA”. But “rarely” is a lot different from “never”, especially given so many opportunities.

    So, new domains are exactly that: some sequence pulled nucleotide by nucleotide from a black bag.

    No.

    I might even run a simulation to show you that this is false. The variance you’d get by selecting nucleotide by nucleotide from a black bag is orders of magnitude higher than the variance you’d get if you started with a certain sequence, mutated it in various ways known to happen, and weeded out those that turned out to be lethal.

    Or rather, some sequence that has to be reached by a completely blind random walk from a completely unrelated point of the search space. (Indeed, the idea of the black bag is incorrect, because it is related to a random search by successive extractions, and not to a random walk. The correct model is a random walk).

    Yes, the correct model is a random model. Good, I don’t have to write the simulation! Or perhaps I do. I’ll give it a go, tomorrow.
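
A minimal sketch of the kind of simulation mentioned above, assuming a four-letter nucleotide alphabet, point substitutions only, and no selection step (the weeding out of lethal variants is left out). The function names and parameter values are illustrative and not anything specified in this thread. It compares how far sequences drawn from the “black bag” land from a parent sequence with how far a short mutational walk from that parent gets:

```python
import random

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def random_sequence(length, rng):
    """'Black bag' model: every position drawn independently and uniformly."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def point_mutation_walk(seq, steps, rng):
    """Random-walk model: apply `steps` single-base substitutions to `seq`."""
    bases = list(seq)
    for _ in range(steps):
        pos = rng.randrange(len(bases))
        # Substitute a base that differs from the current one.
        bases[pos] = rng.choice([b for b in ALPHABET if b != bases[pos]])
    return "".join(bases)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def compare(length=300, steps=10, trials=200, seed=0):
    """Return (mean bag distance, mean walk distance, max walk distance)."""
    rng = random.Random(seed)
    parent = random_sequence(length, rng)
    bag = [hamming(parent, random_sequence(length, rng)) for _ in range(trials)]
    walk = [
        hamming(parent, point_mutation_walk(parent, steps, rng))
        for _ in range(trials)
    ]
    return sum(bag) / trials, sum(walk) / trials, max(walk)


if __name__ == "__main__":
    bag_mean, walk_mean, walk_max = compare()
    print("mean distance from parent, black bag:", bag_mean)
    print("mean distance from parent, short walk:", walk_mean)
    print("largest distance reached by a walk:", walk_max)
```

Under these assumptions the bag draws sit at roughly three quarters of the sequence length from the parent, while a walk of k substitutions can never end up more than k positions away. Whether that difference matters for new domains is exactly what the two sides here dispute; the sketch only shows that the two models are not interchangeable.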

  103. Well, I think it’s yours :) But then I would, wouldn’t I?

    Sleep well :)

  104. I need not explanation or retraction. Every sequence of coin tosses is exactly as probable as every other sequence.

    You seem to be arguing as if certain sequences are predicted or specified in advance, but the fact is your metric depends on a sequence having some consequence or function before you label it specified.

    You have no independent way of declaring a sequence to be functional, that is before you test it for function or observe it to be functional.

    You have no theory of why some sequences are functional and others not, no means of predicting the function of sequences.

    Therefore RMNS and design have the same problem (or require the same characteristics of functional space). If function is truly scattered and sparse, with no connecting ridges, then design and evolution are both impossible.

    If functional space is connectable, then the design hypothesis is unnecessary.

  105. 105
    material.infantacy

    Testing

    <

    >

    xn

    A1

  106. 106
    material.infantacy

    “I need not explanation or retraction. Every sequence of coin tosses is exactly as probable as every other sequence….You have no independent way of declaring a sequence to be functional, that is before you test it for function or observe it to be functional.”

    The example I provided at #10 obviates the need to define function in order to arrive at an objective assessment that function is non-arbitrary, and exists in a remote part of protein sequence space. The equality of outcome for any given sequence is entirely irrelevant.

    Here’s another stab at it.

    As already stated, the set of functional sequences is a subset of the set of folding sequences, which is a subset of all ordered sequences. This implies that for a protein to be functional, it needs to fold — and since we know that a subset of sequences fold, we have an objective measure that there exists a set of objectively defined sequences, of which some will be functional, and apart from which none will be.

    Again, the set of functional sequences is a subset of the set of folding sequences, which provides an objective boundary for the number of functional protein folds.

    S = {all possible sequences of length n}
    F = {elements in S which fold}
    F1 = {functional sequences}

    F1 ⊆ F ⊂ S
    n(F1) ≤ n(F) so P(F1) ≤ P(F)
    P(F) = n(F) / n(S) < 1.0

    There is a partition of S, such that F ∪ F’ = S, and F ∩ F’ = {}. (The set F’ is the complement of F, or every element of S which is not in F.) This is to reiterate that functional folds are not arbitrary, regardless of whether or not we can determine/define them. Also, n(F) is minuscule in size compared to n(F’).

    Here’s what’s at issue:

    a) not all sequences fold;

    b) set F1 (functional sequences) is a subset of F (folding sequences), so the size of F1 is bounded by the size of F, n(F1) ≤ n(F), implying that the probability of F1 occurring is less than or equal to the probability of F occurring;

    c) n(F) > 1/n(S), that is, F is not the same as some arbitrary sequence in S, n(F) > 1 (because there is more than one sequence which folds).

    d) functional sequences are irrelevant here (see point b) and only folding sequences need be considered for the objective assessment that not all sequences are equal with regard to having function.

    Since we know that not all sequences are equal — that is, some of them fold, and of the ones that do some of them can have a function — it cannot possibly be more irrelevant that any given sequence is just as improbable as any other. As a matter of fact, it couldn’t be more relevant that sequence specificity is king with regard to biological function.

    Again, to suggest otherwise is to imply that in a biological context, one sequence is as good as any other, which is objectively not the case.
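
The set relations above can be checked on a toy sequence space small enough to enumerate. This is only a sketch: the two-letter alphabet, the length of 8, and both predicates (`toy_folds`, `toy_functional`) are hypothetical stand-ins invented for illustration and say nothing about real protein physics or about the actual size of F.

```python
from fractions import Fraction
from itertools import product

ALPHABET = "HP"  # hypothetical two-letter alphabet
N = 8            # toy sequence length, small enough to enumerate


def toy_folds(seq):
    """Hypothetical stand-in for 'this sequence folds'."""
    return seq.count("H") >= 6


def toy_functional(seq):
    """Hypothetical stand-in for 'this folding sequence is functional'."""
    return toy_folds(seq) and seq.startswith("HH")


# S: all sequences of length N; F: those that "fold"; F1: the "functional" ones.
S = ["".join(p) for p in product(ALPHABET, repeat=N)]
F = {s for s in S if toy_folds(s)}
F1 = {s for s in F if toy_functional(s)}

p_F = Fraction(len(F), len(S))    # P(F)  = n(F)  / n(S)
p_F1 = Fraction(len(F1), len(S))  # P(F1) = n(F1) / n(S)
p_single = Fraction(1, len(S))    # probability of any one specific sequence

if __name__ == "__main__":
    print("n(S) =", len(S), " n(F) =", len(F), " n(F1) =", len(F1))
    print("P(F) =", p_F, " P(F1) =", p_F1, " P(single) =", p_single)
```

In the toy space every individual sequence has the same probability 1/n(S), yet P(F1) ≤ P(F) < 1 still holds. The sketch can do this only because the predicates are given in advance; for real sequences, how to obtain n(F) is the open question.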

  107. 107
    material.infantacy

    Correction:

    “c) n(F) > 1/n(S), that is, F is not the same as some arbitrary sequence in S, n(F) > 1 (because there is more than one sequence which folds).”

    Should be

    “c) P(F) > 1/n(S), that is, F is not the same as some arbitrary sequence in S, n(F) > 1 (because there is more than one sequence which folds).”

  108. As already stated, the set of functional sequences is a subset of the set of folding sequences, which is a subset of all ordered sequences.

    That’s not entirely true. There are sequences that participate in regulatory networks, and there are probably sequences that have unidentified functions.

    But unless you have a database of functional sequences or a theory of what makes a sequence function, you can’t design, except by cut and try.

    My point is and always has been that in the absence of the ability to distinguish a functional sequence independently of trial and error, design is impossible. As design advocates you really need to demonstrate that design is theoretically possible.

    You could answer my challenge by simply telling me how to distinguish a functional sequence with one character altered from a batch of randomly generated sequences. If you can’t tell that a non-functional sequence has 149 bits of dFSCI, you can’t claim the functional sequence has 150 bits.

    You don’t have a metric. You have a bulls eye painted after the arrow has landed. You simply can’t wait for the coins to be tossed before defining a sequence as a target.

    Design advocates assert that designers have some shortcut to knowing that a sequence will result in a useful fold, but they haven’t proposed a design process that doesn’t use evolution.

  109. 109
    material.infantacy

    “You have a bulls eye painted after the arrow has landed.”

    Clearly this is not the case, as has been shown. Proteins which fold do so regardless of whether anyone’s “painted a bullseye” around them. Your claim has been refuted.

    Venn diagram: Draw a rectangle which represents the sample space. In the center of that, draw a circle which represents the set of folding sequences. In the center of that circle, draw another which represents the functional set. There’s the target, and it’s objectively real. It cannot reasonably be denied. Plain and simple, neat and clean.

    The target which constitutes folding, functional proteins is carved upon the face of reality by the laws of physics. It doesn’t get any more objective than that.

    Miller’s Blunder

  110. Not only is design impossible (except by an omnipotent deity I guess), but unless you know what proportion of sequences result in potentially useful proteins (or, indeed, are potentially useful regulatory sequences) there’s no way of computing the probability that any one will arise “by chance” nor how closely clustered the useful stuff is.

    Which is (of many) reasons why the entire probability calculation approach is misconceived.

  111. should read:

    “Which is one (of many) reasons….”

  112. Clearly this is not the case, as has been shown. Proteins which fold do so regardless of whether anyone’s “painted a bullseye” around them. Your claim has been refuted.

    Venn diagram: Draw a rectangle which represents the sample space. In the center of that, draw a circle which represents the set of folding sequences. In the center of that circle, draw another which represents the functional set. There’s the target, and it’s objectively real. It cannot reasonably be denied. Plain and simple, neat and clean.

    The target which constitutes folding, functional proteins is carved upon the face of reality by the laws of physics. It doesn’t get any more objective than that.

    Well, yes, it does. It only starts to get “objective” when you can list the contents of your functional set.

    Until you’ve done that, firstly you have no way of knowing whether the “functional set” is coterminous with the folded set, nor the size of the folded set, nor do you know how similar the elements of the functional set are, nor in what contexts they prove functional, nor how close those contexts are.

    In other words you know not one single relevant datum with which to compute your probability.

    Unless you assert, without foundation, that the contents of the subset are the observed functional folds. Which would, as Petrushka says, be drawing your target round the arrow.

  113. 113
    material.infantacy

    “Well, yes, it does. It only starts to get “objective” when you can list the contents of your functional set.”

    I don’t need to list the elements to objectively assess, with sufficient clarity, that function is objective.

    “In other words you know not one single relevant datum with which to compute your probability.”

    Irrelevant. The issue at hand is whether function is some sort of post hoc imposition, or whether it’s objectively real. Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary.

    Whether we can tell if a sequence will be functional, or even fold, is not the issue. Folding sequences fold, regardless of an observer’s judgment as to function. Therefore, in a sample space consisting of all sequences of length n, there is a set of folding sequences — a predetermined target.

  114. I don’t need to list the elements to objectively assess, with sufficient clarity, that function is objective.

    You need to count the elements to know what proportion of each set are also members of the larger sets. And unless you know what they are you can’t do that. Or any of the other stuff I mentioned that you would have to do.

    “In other words you know not one single relevant datum with which to compute your probability.”

    Irrelevant. The issue at hand is whether function is some sort of post hoc imposition, or whether it’s objectively real. Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary.

    It certainly isn’t irrelevant! We are talking about probability, right? And it’s also incorrect. Tell me how you would assess, objectively the proportion of DNA sequences that would result in a folding protein. Then tell me how you would assess, objectively, which one of those are “functional”.

    Whether we can tell if a sequence will be functional, or even fold, is not the issue. Folding sequences fold, regardless of an observer’s judgment as to function.

    Sure. But how do you tell whether a sequence is a folding sequence?

    Therefore, in a sample space consisting of all sequences of length n, there is a set of folding sequences — a predetermined target.

    Yes. But if you want to compute the probability of a sequence being a folding sequence then you have to have a way of determining what proportion of possible sequences are folding sequences.

    If not, why not?

  115. P:

    Pardon, but there you go again with your inappropriate “painted target” metaphor.

    Kindly, tell us how seeing the ATP synthase as what it is, a motor — a two-port converting energy into shaft power, is painting a target after the fact. Instead of recognising a case of something we do know something about: a motor, only, using molecular nanotech and in the context of living organisms.

    Likewise, — and here we deliberately move to an analogy — how seeing kinesin with vesicles in tow along the microtubules as a miniature walking truck, is an after the fact, dismissible subjective evaluation.

    In short, you are playing at distractive, label and dismiss rhetoric, not serious discussion on the merits. That is really sad.

    And, that WE do not yet know how to design working proteins etc, does not mean that no-one out there does. That’s like a tribe in the deep Amazon seeing an aeroplane flying over, and saying, we do not know how to do it, so we cannot say that this is credibly a designed object on the evident signs. So, it “must” be a natural object that was always there for all we can say.

    Please, think again.

    The proper issue, is, what are the candidate adequate causal factors that can produce such items with FSCO/I, and what empirical warrant can allow us to decide which is a best explanation.

    Pardon, but I get the strong impression that you know that motors etc are made by designers, and are only credibly explained on design. So, you are desperately straining every talking point device, to shift focus.

    Red herrings and strawmen, in short.

    Which are fallacious.

    So, kindly get back on track and deal with the issue on the table: if you wish to suggest that the objects in view are explicable on blind chance plus mechanical necessity in some plausible initial environment, then show us how that environment leads to these objects, and substantiate with observations.

    Failing such, we will have every right to conclude that you are trying to distract attention from what you cannot answer cogently on the merits of observed fact, and then dismiss what is not convenient for your preferred view.

    So, kindly fill in the blanks:

    a: Major candidate causal explanations for _______________,_______________, _______________, . . . and _______________, are ________________ .

    b: Evidence, per observations for the first candidate is __________________

    c: Evidence, per observations for the second candidate is __________________

    d: Repeat as often as required.

    e: Of these, on factual adequacy, coherence and explanatory power and elegance, candidate ____ is the best explanation, because ________________ .

    Thanks in advance.

    GEM of TKI

    PS: Having a metric that tells us when something that is specific, complex and functional is beyond the reasonable reach of chance and necessity is a relevant factor in the above. That is exactly what the log reduced chi metric you seem to want to dismiss does:

    Chi_500 = I*S – 500, functionally specific bits beyond the solar system atomic state resources threshold
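
A minimal sketch of the expression above, assuming that I is an information measure in bits and that S is a 0/1 flag marking whether the item is judged functionally specific. The comment does not define either symbol, so both readings are assumptions, as are the function name and the example values.

```python
def chi_500(info_bits, is_specific):
    """Chi_500 = I*S - 500, with S read as a 0/1 flag (an assumption)."""
    s_flag = 1 if is_specific else 0
    return info_bits * s_flag - 500


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(chi_500(600, True))   # 600 bits, judged specific
    print(chi_500(600, False))  # same bits, not judged specific
```

On this reading the arithmetic is trivial; the contested part is how I and S are assigned in the first place, which the sketch takes as given.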

  116. With respect, kf, you are moving the goalposts. We were talking about the probability of a mutated sequence coding for a potentially functionally useful folding protein, not the probability of a mutated sequence coding for a specific complex function (which requires a great many stretches of code).

    Nobody suggests that complex functions arose ex nihilo by some unselected series of mutations. That’s why natural selection (the subject of the OP) is important.

  117. 117
    material.infantacy

    “Tell me how you would assess, objectively the proportion of DNA sequences that would result in a folding protein. “

    By the laws of physics. I’m standing by the notion that if something is necessitated, then its nature can be determined.

    Aren’t there, in principle, ways to determine the proportion of folding sequences to not? I seem to remember something. Regardless it is non-arbitrary, and there is a target for functional protein sequences — the folded set.

    Unless you are making the claim that folding is arbitrary instead of deterministic, and that proteins can happily exist without folding, there is an objective target for protein function in sequence space.

    I don’t need to list the elements to objectively assess, with sufficient clarity, that function is objective.

    You need to count the elements to know what proportion of each set are also members of the larger sets. And unless you know what they are you can’t do that. Or any of the other stuff I mentioned that you would have to do.

    We only need to “count” the elements to determine the specific probability, not whether one exists. We know that not any sequence will fold, even that fewer fold than not.

    This gives us the basis for determining that a probability exists — that in a sample space S, there is a set F of proteins that will fold. There is also a set F’ (the complement) such that F ∪ F’ = S, and that F ∩ F’ = {}. This is a partition on S.

    That’s a target, the set F. We can negotiate specifics, such as the value of n(F)/n(F’), or whether all sequences in F can be functional, but we need not do so to determine that a target exists.

    Miller’s card analogy

    “But if you want to compute the probability of a sequence being a folding sequence then you have to have a way of determining what proportion of possible sequences are folding sequences.

    If not, why not?”

    Explained above.

  118. 118
    material.infantacy

    “We were talking about the probability of a mutated sequence coding for a potentially functionally useful folding protein….”

    Not exactly. We are talking about whether there is a probability — whether an objective, predetermined target exists as a space F (folded) as a subset of S (all sequences), for which n(F) < n(S); that is, F is a proper subset of S. And that functional protein sequences are a subset of F.

    I vote yes.

  119. But you haven’t told us how you “predetermine it”!

    So how can you compute the probability?

  120. 120
    material.infantacy

    Do all sequences fold? Yes or no. If no, then there is a target in S (sequence space).

    Whether there is an objective target. That is the issue. There either is or there isn’t. Which is it?

    This isn’t about the probability of function, but whether function is a subset of sequence space — yes, it is.

  121. 121
    material.infantacy

    Here are some questions to ponder.

    S = {sequence space}
    F = {folding sequences}

    Is F a subset of S ?
    Is the size of F less than the size of S ?
    Is the probability of F greater than zero and less than one?
    Do we need to know the probability value before knowing that 0 < P(F) < 1 ?
    For any trial in S, are there two possible outcomes, F and F’ ?
    Is F determined by the laws of physics?
    Does F change from time to time?

    Answering those questions will be useful in determining whether folding is deterministically specific to a subset of sequences.

  122. You wrote:

    Not exactly. We are talking about whether there is a probability — whether an objective, predetermined target exists as a space F (folded) as a subset of S (all sequences), for which n(F) [is less than] n(S); that is, F is a proper subset of S. And that functional protein sequences are a subset of F.

    I vote yes.

    Now you say:

    This isn’t about the probability of function, but whether function is a subset of sequence space — yes, it is.

    So yes, we were talking about probability. Specifically, you took Petrushka to task as follows:

    For a set S, which is a universal set consisting of every possible sequence of length n, there is a subset F which consists of every sequence that results in a folded protein.

    P(F) = n(F) / n(S)

    That is, the probability of F occurring is equal to the number of elements in F divided by the number of elements in S. The set F is objectively improbable, unchanging, and contains the subset of all potentially functional proteins. (That is, the size of the subset of functional proteins is bounded by the size of the subset F.)

    No, the set F is not “objectively improbable” because you haven’t calculated it. It’s not calculable. n(F) may be only slightly smaller than n(S). And how, in any case, are you computing sequence lengths, and for what maximum length of genome? And even if you were able to calculate P(F), you’d still have to compute 1-(1-P(F))^N, where N is the number of opportunities for a new sequence, in order to get the probability that one of those F sequences would turn up. And those opportunities themselves will be constrained by the sequences already generated.

    So the thing is not “objectively improbable”. We simply do not know, and cannot compute, the probability.
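
For N independent trials that each succeed with probability p, the chance of at least one success is 1 - (1 - p)^N. The sketch below evaluates that in a numerically stable way. Independence of trials is an assumption (the comment above notes that real opportunities are constrained by earlier sequences), and the inputs in the example are placeholders, not measured values.

```python
import math


def prob_at_least_one(p, n_trials):
    """1 - (1 - p)**n_trials for independent trials, valid for 0 <= p < 1.

    log1p and expm1 keep precision when p is far too small for the
    naive formula to survive floating-point rounding.
    """
    return -math.expm1(n_trials * math.log1p(-p))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the scale effect.
    print(prob_at_least_one(0.5, 2))
    print(prob_at_least_one(1e-74, 1e40))
```

The result depends as strongly on N as on p, which is why neither number can be dropped from the calculation.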

    And in any case, “function” is not synonymous with “coding for a foldable protein”. There are plenty of functional but non-coding regions of the genome. And not all genes that code for a foldable protein perform a function. Some genes are never expressed; some are expressed but don’t do anything that contributes to the life and reproductive capacity of the phenotype.

  123. 123
    material.infantacy

    “Now you say:

    This isn’t about the probability of function, but whether function is a subset of sequence space — yes, it is.”

    It’s what I said all along. Just because I proposed that 0 < P(F) < 1 (or other inequalities) doesn’t mean that I was calculating the probability of a function.

    You’re going a long way to avoid admission that an objective target exists. Nothing of what you say has refuted it.

    You seem to be operating under the assumption that because I can’t compute an exact probability for a given function, that no objective target exists.

    You say that n(F) may be only slightly smaller than n(S). Regardless, n(F) < n(S) and so for any trial there is either F or not F, and 0 < P(F) < 1. That’s a target. That’s the claim. That’s the demonstration.

    It objectively exists, and that’s what you won’t acknowledge. You suggest that it might be more probable than I think. Fine. It’s still a target in S.

    The target is in F, so regardless of how improbable it is (and yes I think it’s improbable) it is only one of the two given outcomes: P(F) + P(F’) = 1. This is true regardless of whether I can determine the size of F.

    “And in any case, “function” is not synonymous with “coding for a foldable protein”.”

    Indeed, but “function” is a subset of “folding” which is why I used folding in the first place.

    Regarding the probability of F, isn’t there an estimate of 10^-74 for a 150 length chain? If you don’t like that number, do you have one which makes a difference? What value do you think would be reasonable?

    Since proteins can’t perform functions unless they first fold into stable structures, Axe’s measure of the frequency of folded sequences within sequence space also provided a measure of the frequency of functional proteins—any functional proteins—within that space of possibilities. Indeed, by taking what he knew about protein folding into account, Axe estimated the ratio of (a) the number of 150-amino-acid sequences that produce any functional protein whatsoever to (b) the whole set of possible amino-acid sequences of that length. Axe’s estimated ratio of 1 to 10^74 implied that the probability of producing any properly sequenced 150-amino-acid protein at random is also about 1 in 10^74. In other words, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every 10^74 attempts.

    Meyer, Stephen C. (2009-06-06). Signature in the Cell (pp. 210-211). Harper Collins, Inc.. Kindle Edition.
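
Purely as arithmetic on the figures quoted above, and assuming an alphabet of 20 amino acids (an assumption, since the passage does not state the alphabet size), the size of the 150-residue sequence space and the number of sequences the quoted ratio would imply can be worked out in logarithms. The sketch takes the quoted ratio at face value and does not assess whether it is right.

```python
import math

ALPHABET_SIZE = 20         # assumed number of amino acids
LENGTH = 150               # chain length from the quoted passage
QUOTED_RATIO_LOG10 = -74   # the quoted ratio, 1 in 10^74

# log10 of n(S) = 20^150, computed in logs to avoid huge integers.
log10_total = LENGTH * math.log10(ALPHABET_SIZE)

# log10 of the count implied by multiplying n(S) by the quoted ratio.
log10_implied = log10_total + QUOTED_RATIO_LOG10

if __name__ == "__main__":
    print(f"n(S) is about 10^{log10_total:.1f}")
    print(f"implied count is about 10^{log10_implied:.1f}")
```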

  124. Elizabeth:

    I will discuss just one point: your refusal to admit the fundamental difference between necessity and a random system. If we can’t agree on that, it’s completely useless to go on discussing anything.

    You say:

    You put your finger on it IMO when you say “as we cannot know all the variables implied”. Yes indeed. And we never do. Even if at some fundamental level the universe proves to have no quantum uncertainty (as at least one eminent theoretical physicist has proposed) all our models (and we are talking about models here) must be stochastic. Now physicists work to find tolerances and insist on 5 sigma confidence, while life scientists are often content with two. But every single effect we observe comes with stochastic variance. We can never know all the variables. What we can do is estimate probability distributions, and also the extent to which those distributions are orthogonal.

    This is an extremely serious misrepresentation of scientific epistemology.

    An explanatory model based on necessity is a set of logic relations and mathematical rules that connect events, according to rigid determinism, to give a causal explanation of observed facts.

    Newton’s law of gravity gives us a definite mathematical relationship between mass and gravitational force. That relation is not in any way random. It is a necessity relation. A causal relation, where mass is the cause of gravitational force.

    The evolution of a wave function in quantum mechanics is strictly deterministic, and has nothing to do with probability.

    All science works that way. Even biological sciences work largely by necessity models.

    Probabilistic models, obviously, are very useful where a necessity model cannot be explicitly built, and still a probability function can well describe, to some extent, what we observe.

    So, the repeated tossing of a coin is well described by a simple probabilistic model, while a single toss of a coin is certainly a deterministic problem that cannot easily be solved by a necessity model, even if in theory it can be solved.

    The two ways to describe reality are deeply different: they have different forms, and different rules.

    They are so different that the whole methodology of science is aimed at distinguishing them.

    Take Fisher’s hypothesis testing, for instance, which is widely used as a research methodology in the biological sciences. As you certainly know, the purpose of the test is to affirm a causal necessity, or to deny it. Observed data are analyzed, and observed differences, or effects, are statistically tested to compute the probability of those effects if only random noise were the explanation: that is the null hypothesis. If the null hypothesis, that is an explanation out of random effects, is considered too improbable, it is rejected, and the necessity explanation that was the initial hypothesis of the researchers is usually assumed.
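
As an illustration of the null-hypothesis step just described, here is a minimal sketch using coin tosses: it computes the probability of seeing at least a given number of heads if the coin is fair, which is the null hypothesis in this toy case. The one-sided form, the function name, and the example counts are choices made for illustration.

```python
from math import comb


def upper_tail_p_value(heads, tosses):
    """P(X >= heads) for X ~ Binomial(tosses, 1/2), the fair-coin null."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses


if __name__ == "__main__":
    # Hypothetical observations: 60 heads out of 100 tosses.
    print(upper_tail_p_value(60, 100))
```

A small value says only that the fair-coin null fits the data poorly; which alternative explanation replaces it is a separate step.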

    As you can see, that is completely different from your bold statement that “all our models (and we are talking about models here) must be stochastic”. That is a huge mistake. Most scientific models are not stochastic at all. Some of them are, like part of thermodynamics, and the collapse of wave function in quantum physics.

    So, where does your confusion come from? I believe you make a huge mistake, confounding the model, that is often not stochastic at all, and observed data, that always have a component of random noise.

    Random noise is an empirical reality. Random sampling is a cause of random error in most biological contexts. In physics, as in all science, measurement always comes with some error. The presence of random error in empirical data is exactly the reason why those data are often analyzed statistically to distinguish between the effects of random noise and the assumed effect of necessity. But that does not mean, in any way, that the model is stochastic. The explanatory model is usually based on necessity, it assumes specific cause and effect relations, where the probability of an event, given the cause, is 1 or 0.

    So, you are calling the model stochastic, while the only contribution of statistical analysis in the procedure is to rule out random causes as an explanation. That is a big cognitive mistake.

    You cite 5 sigma, presumably referring to its use in physics. Well, I quote here what 5 sigma means, and what it is used for (emphasis is mine):

    “Word of the Week: Five Sigma

    September 23, 2011
    by Lori Ann White
    The realm of particle physics is the quantum realm, and physicists who venture there must play by the rules of probability. Major discoveries don’t come from a single repetition of a single experiment. Instead, researchers looking for new particles or strange processes repeat an experiment, such as smacking protons together at nearly the speed of light, over and over again, millions or billions or trillions of times – the more times the better.

    Even then, as the researchers sort through the results, interesting lumps and bumps in the data don’t automatically translate into, “We found it!” Interesting lumps and bumps can, and do, happen by chance.

    That’s why statistical analysis is so important to particle physics research. The statistical significance of a particular lump or bump – the probability that it did not appear by chance alone – must be determined. If chance could have supplied it, no dice, and no discovery.

    The yardstick by which this significance is measured is called standard deviation (generally denoted by the lowercase Greek letter sigma). In statistics, the standard deviation is a measure of the spread of data in a normal probability distribution – the well-known bell curve that determines the grades of many students, to their dismay.

    On that bell curve, researchers plot the probability that their interesting lump or bump is due to chance alone. If that point is more than five sigma – five standard deviations – from the center of the bell curve, the probability of it being random is smaller than one in one million. Only then can particle physicists shout, “Eureka!” (a la Archimedes) … but without running naked through the streets of Athens.”

    Again, as you can see, physicists use the 5 sigma threshold to exclude “that their interesting lump or bump is due to chance alone”. IOWs, to affirm that it is evidence of their necessity model. QED.
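
The tail probability behind a sigma threshold can be computed directly. This sketch assumes a standard normal distribution and a one-sided tail; the quoted article does not say which convention it uses, so the one-sided choice is an assumption.

```python
import math


def one_sided_tail(sigmas):
    """P(Z >= sigmas) for a standard normal variable Z."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2))


if __name__ == "__main__":
    for k in (2, 5):
        print(k, "sigma:", one_sided_tail(k))
```

Under these assumptions the five sigma tail comes out below one in a million, consistent with the figure in the quoted article, while two sigma corresponds to a few percent.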

    That is exactly how ID uses the concept of dFSCI: a bump that is too unlikely to be due to chance alone. IOWs, a result that cannot emerge in a random system, but that requires a cause. In that case, the cause is design.

    Again, if we cannot agree on those basic epistemological concepts, there is no hope in discussing. All my reasonings are based on my epistemology, in which I very much believe. How can I discuss anything with you, if you believe completely different things (I really cannot understand what and why) at the epistemological level?

  125. It’s what I said all along. Just because I proposed that 0 < P(F) < 1 (or other inequalities) doesn’t mean that I was calculating the probability of a function.

    You’re going a long way to avoid admission that an objective target exists. Nothing of what you say has refuted it.

    You seem to be operating under the assumption that because I can’t compute an exact probability for a given function, that no objective target exists.

    What’s “objective” about a target that you can’t define?

    Sure, a “target” exists = we know that the set of sequences that code for foldable proteins isn’t an empty set. That is trivially true, and not in dispute. If you want to add the adjective “objective” to that clear fact, then feel free, but I don’t see what it adds.

    But if you want then to calculate the probability that a randomly drawn sequence will include one of those foldable proteins then you can’t do it. So you can’t claim then that “The set F is objectively improbable”.

    We simply don’t know how improbable it is because a) we don’t know how many elements of set S are also members of set F, and b) we don’t know how many draws there were.

    All you can say is that set F is smaller than set S, where F is the set of sequences that result in foldable proteins.
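
    To make the dependence on both unknowns concrete, here is a minimal sketch. It assumes independent, uniformly random draws from S, which is a simplification made for this example and not something established in the thread; `p_at_least_one`, `p` and `k` are hypothetical names, and the values in the loop are placeholders, not estimates.

```python
import math

def p_at_least_one(p, k):
    """Probability that k independent draws, each landing in F with
    probability p = n(F)/n(S), produce at least one member of F.
    Requires 0 <= p < 1."""
    # Equivalent to 1 - (1 - p)**k, but written with log1p/expm1 so that
    # a very small p does not simply round away in floating point.
    return -math.expm1(k * math.log1p(-p))

# Illustrative values only: neither p nor k is actually known.
for p in (1e-74, 1e-50, 1e-20):
    for k in (1e20, 1e40):
        print(f"p={p:.0e}  k={k:.0e}  P(at least one)={p_at_least_one(p, k):.3g}")
```

    The answer swings from negligible to near certainty depending on which p and k are plugged in, which is why both need to be known before any probability claim can be made.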

    Regarding the probability of F, isn’t there an estimate of 10^-74 for a 150 length chain? If you don’t like that number, do you have one which makes a difference? What value do you think would be reasonable?

    We don’t know, which is Petrushka’s point. As I understand it (and I assume Petrushka does too) Axe “drew his target round an arrow”. There’s a good article about it on Panda’s Thumb, with nice illustrations:

    http://pandasthumb.org/archive.....st-fa.html

    As Petrushka said:

    My point is and always has been that in the absence of the ability to distinguish a functional sequence independently of trial and error, design is impossible.

    And as I added:

    Not only is design impossible (except by an omnipotent deity I guess), but unless you know what proportion of sequences result in potentially useful proteins (or, indeed, are potentially useful regulatory sequences) there’s no way of computing the probability that any one will arise “by chance” nor how closely clustered the useful stuff is.

    For the heck of it, I’m just writing a quick simulation to get some kind of feel for how often a randomly generated genome of length N will contain a sequence that starts with a start codon, ends with a stop codon, and consists of a whole number of triplets (i.e. codes for an amino acid chain with no introns).

    Not sure it will help though, because I still can’t get from there to whether the resulting protein chains would fold stably or not.
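
    A minimal sketch of what such a simulation might look like follows. It is not the simulation referred to above: it assumes equal base frequencies, scans the forward strand only, and uses the standard start codon ATG and stop codons TAA, TAG and TGA. The function names and the `min_codons` parameter are hypothetical. As noted, it says nothing about whether the resulting chains would fold.

```python
import random

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def random_genome(n, rng):
    """Uniform random DNA string of length n (equal base frequencies assumed)."""
    return "".join(rng.choice("ACGT") for _ in range(n))

def has_orf(genome, min_codons=1):
    """True if some ATG is followed, in the same reading frame, by a stop
    codon, with at least min_codons codons (start included) before the stop."""
    n = len(genome)
    for i in range(n - 2):
        if genome[i:i + 3] != START:
            continue
        # Walk forward one triplet at a time from just after the start codon.
        for j in range(i + 3, n - 2, 3):
            if genome[j:j + 3] in STOPS:
                if (j - i) // 3 >= min_codons:
                    return True
                break  # this frame closed too early; try the next ATG
    return False

def orf_fraction(n, trials=500, min_codons=1, seed=0):
    """Fraction of random genomes of length n containing such a sequence."""
    rng = random.Random(seed)
    hits = sum(has_orf(random_genome(n, rng), min_codons) for _ in range(trials))
    return hits / trials

for n in (30, 100, 300):
    print(n, orf_fraction(n, min_codons=10))
```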

    If you are really interested in the answer, you could join
    folding@home:

    http://folding.stanford.edu/

    It’s my New Year’s Resolution :)

  126. Elizabeth:

    I will discuss just one point: your refusal to admit the fundamental difference between necessity and a random system. If we can’t agree on that, it’s completely useless to go on discussing anything.

    gpuccio, I’ve already said that I find the term “random system” hopelessly ambiguous and imprecise.

    I am happy with the term “stochastic”, which all systems are, to some extent, but can be regarded as non-stochastic at certain scales.

    Please don’t characterise my position as a “refusal to admit” anything. I am not refusing to “admit” anything. I am simply pointing out something that you yourself pointed out, which is that all models include unknowns, error terms, if you like, representing either unmodelled variables that are orthogonal (with luck) to modelled factors or even inbuilt uncertainties (such as quantum uncertainty). The degree to which unmodelled variables impact the effects of interest is the degree to which we have to consider the system stochastic when modelling it.

    That’s all.

    I’ll read the rest of your post now.

  127. Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary.

    Folding ceased to be the most important criterion for utility several hundred million years ago. Most evolution in multi-celled organisms takes place in regulatory genes.

    Of course utility is not arbitrary. It is constrained by selection. My point is you have no way to calculate probability, because — independent of selection — you have no way of determining which sequences are useful and which are not.

    If you wish to argue that design is even possible, demonstrate a method of determining utility that is independent of selection.

    The problem for both evolution and design is determining utility. Evolution solves the problem by trying all variations in the sequence neighborhood. That is the point of the Lenski experiment — demonstrating that this is what actually happens.

    In one sense it makes no difference whether mutation is random. If everything is tried it makes no difference what the order of trials is. It could just as effectively be done starting at one end of a sequence and progressing to the other.

    Now ID proponents could go the Thornton route and do the painstaking research to see if cousin sequences really can be connected. That would be interesting.

  128. We only need to “count” the elements to determine the specific probability

    Based on actual experiment, the probability of finding any arbitrary functional sequence that is one step removed from an existing sequence is one, because populations buy all the lottery tickets.

  129. Elizabeth:

    Now, you cannot retract what you have stated. I quote another statement of yours (emphasis mine):

    I agree. That was my point. All systems are stochastic. What we need are the relevant probability distributions, not a division into stochastic and non-stochastic processes. They are all stochastic. Some merely have more variance than others. This is, btw, why I avoid the term “RM+NS” and get cross when people say that mutation is random and natural selection isn’t. Both are variance generation (“RM”) and differential reproduction (“NS”) stochastic processes. And both have probability distributions that are biased in favour of reproductive success.

    The confusion here is very clear. You refuse the epistemological difference between RV and NS, which is extremely serious, and then you strangely state that “both have probability distributions that are biased in favour of reproductive success.” What does that mean?

    RV is a random system (yes, I do prefer that term to “stochastic”). It includes all the events of variation that happen in the genome, and that are caused by a great number of variables, none of which has “probability distributions that are biased in favour of reproductive success”. Please, name one cause of random variation that has that type of “probability distribution”, and explain why.

    In what sense are single point mutations “biased in favour of reproductive success”? Or chromosomal deletions? Or anything else?

    You say such strange things exactly because you don’t want to admit the difference between RV and NS.

    Let’s take NS. Is it a random principle? No. It is a necessity relation. It expresses the very trivial concept that, in a reproducing population, if some variation has a negative effect on reproduction, the variated genome will probably be less represented, and the other way round if the variation has a positive effect on reproduction.

    Now, it is true that, due to existing random noise in the reproducing population, positive variations can sometimes be lost, and vice versa. In that sense, the necessity effect of the principle of NS can be diluted, and we need to take that into account. But still, the relationship between some specific variation and reproduction is a causal relation, a relation of necessity. Its effect will be modulated by other effects, but still it is a necessity relation.
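
    How often a positive variation is lost to noise can be sketched with a toy simulation. The model below is a standard haploid Wright-Fisher model with selection, swapped in here purely for illustration; nobody in this thread proposed it, and the population size, the selection coefficients and the name `fixation_fraction` are all assumptions of the example.

```python
import random

def fixation_fraction(pop_size, s, trials, seed=0):
    """Fraction of runs in which a variant with relative fitness 1 + s,
    starting from a single copy, takes over a population of fixed size."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        k = 1  # current number of copies of the variant
        while 0 < k < pop_size:
            # Chance that any one offspring descends from a variant parent,
            # with parents weighted by fitness.
            p = k * (1 + s) / (k * (1 + s) + (pop_size - k))
            # Binomial sampling of the next generation.
            k = sum(1 for _ in range(pop_size) if rng.random() < p)
        if k == pop_size:
            fixed += 1
    return fixed / trials

for s in (0.0, 0.05, 0.2):
    print(s, fixation_fraction(pop_size=50, s=s, trials=1000))
```

    With s = 0 the expected fraction is 1/pop_size. In this toy model a larger s raises the fraction, yet most runs that start from a single copy still end with the variant lost, so the same code shows both the directional effect and the dilution by noise.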

    Let’s make an example. If some very fundamental protein that is required for cell division mutates in a way that it loses all function (let’s say a frameshift mutation), reproduction becomes impossible. That effect has no probability distribution at all. It cannot even be mitigated by other factors. Negative NS is often a very strong necessity principle.

    You cannot treat those things with “probability distributions”, or just play games saying that they are stochastic. Function is a question of necessity. A machine works because it is made to work. Its working is the necessary result of its structure. Probability does not help here. We have to understand the function, the structure, and the relationship between the two. That is a work of necessity, of understanding logical and causal relations.

    You cannot “avoid the term “RM+NS””. People who say that “mutation is random and natural selection isn’t” are simply right. Variation in the genome is random, unless it is designed. NS is a necessity principle that depends critically on how the variation modifies function. Its effect is not to modify the genome, but simply to change the percentage representation of what exists in the population. There is a big difference.

    The only “engine of variation”, the only thing that modifies the genome, is a series of random modifications, none of them “biased in favour of reproductive success”. Whatever they are, single point mutations, deletions, inversions, sexual recombination, whatever, they are random, because there is no explicit cause and effect relationship between the cause of variation and the effect on genomic function. A single point mutation happens randomly in the genome. Even if the probabilities are not the same for all mutations, nothing in that probability distribution is “biased in favour of reproductive success”.

    The only mechanism “biased in favour of reproductive success”, except for design, is NS, the necessity relationship between the type of variation (positive, negative, neutral) and the reproductive function.

    That’s why RV and NS must be recognized as different, and treated separately.

  130. You say:
    You put your finger on it IMO when you say “as we cannot know all the variables implied”. Yes indeed. And we never do. Even if at some fundamental level the universe proves to have no quantum uncertainty (as at least one eminent theoretical physicist has proposed) all our models (and we are talking about models here) must be stochastic. Now physicists work to find tolerances and insist on 5 sigma confidence, while life scientists are often content with two. But every single effect we observe comes with stochastic variance. We can never know all the variables. What we can do is estimate probability distributions, and also the extent to which those distributions are orthogonal.
    This is an extremely serious misrepresentation of scientific epistemology.

    I don’t agree.

    An explanatory model based on necessity is a set of logical relations and mathematical rules that connect events, according to rigid determinism, to give a causal explanation of observed facts.

    Well, I accept this as your definition of “an explanatory model based on necessity”. It doesn’t resemble most explanatory models.

    Newton’s law of gravity gives us a definite mathematical relationship between mass and gravitational force. That relation is not in any way random. It is a necessity relation. A causal relation, where mass is the cause of gravitational force.

    Again, that word “random”. You really need to give a tight definition of it, because it has no accepted tight definition in English. And we simply do not know whether gravity is subject to quantum uncertainty or not. Newton’s law is indeed deterministic, but it is only a law, not an explanation, not a theory. And all a law is, is a mathematical description that holds broadly true. That doesn’t make it a causal relation. We do not know whether mass is the cause of gravitational force. It may be that gravitational force is the cause of mass. Or it could be that “cause” is itself merely a model we use to denote temporal sequence, and ceases to make sense when considering space-time. But I’m no physicist, so I can’t comment further except to say that you are making huge and unwarranted assumptions here.

    The evolution of a wave function in quantum mechanics is strictly deterministic, and has nothing to do with probability.

    Depends what you mean by “nothing to do with probability”. I am talking about scientific (explanatory) models. There is always unmodelled variance, if only experimental error. It is often possible to use deterministic models, even when the underlying processes are indeterminate. Similarly we often have to use stochastic models even when the underlying processes are determinate.
    I think a big problem (and I find it repeatedly in ID conversations) concerns the word “probability” itself, which is almost as problematic as “random”. Sometimes people use it as a substitute for “frequency” (as in probability distributions which are based on observed frequency distributions). At other times they use it to mean something closer to “likelihood”. And at yet other times they use it as a measure of certainty. We need to be clear as to which sense we are using the word in, and not equivocate between mathematically very different usages, especially if the foundation of your argument is probabilistic (as ID arguments generally are).

    All science works that way. Even biological sciences work largely by necessity models.

    No. Pretty well all sciences, and certainly life sciences, use models in which the error term is extremely important. And biology is full of stochastic models. In fact I simply couldn’t do my job without stochastic models (and I work in life-science, but in close collaboration with physicists).

    Probabilistic models, obviously, are very useful where a necessity model cannot be explicitly built, and still a probability function can well describe, to some extent, what we observe.

    I think you are confusing a law with a model. A law, generally, is an equation that seems to be highly predictive in certain circumstances, although there are always residuals – always data points that don’t lie on the line given by the equation, and these are not always measurement error. We often come up with mathematical laws, even in life sciences, but that doesn’t mean that the laws represent some fundamental “law of necessity”. It just means that, usually within a certain data range (as with Newton, and Einstein, whose laws break down beyond certain data limits) relationships can be summarised by a mathematical function fairly reliably – perhaps very reliably sometimes.

    So, the repeated tossing of a coin is well described by a simple probabilistic model, while a single toss of a coin is certainly a deterministic problem that cannot easily be solved by a necessity model, even if in theory it can be solved.

    This is a false distinction in my opinion. You can describe the results of a single coin toss by a probabilistic model just as well as you can describe the results of repeated tosses. But if you want to predict the results of an individual toss, as opposed to the aggregate results of many tosses, you need to build a more elaborate model that takes into account all kinds of extra data, including the velocity, spin, distance and angle etc of the coin. And you cannot possibly know all the factors, so there will still be an error term in your equation. In other words, predictive models always have error terms; sometimes these can be practically ignored; at other times, you need to characterise the distribution of the error terms and build a full stochastic model.

    The two ways to describe reality are deeply different: they have different forms, and different rules.
    They are so different that the whole methodology of science is aimed at distinguishing them.

    I agree that characterising uncertainty is fundamental to scientific methodology. I disagree that stochastic and non-stochastic models are “deeply different”. In fact I’d say that a non-stochastic model is just a special case of a stochastic model where the error term is assumed to be zero.

    Take Fisher’s hypothesis testing, for instance, which is widely used as research methodology in biological sciences. As you certainly know, the purpose of the test is to affirm a causal necessity, or to deny it.

    No. That is not the purpose of Fisher’s test, which has nothing to do with “causal necessity” per se (although it can be used to support a causal hypothesis). Nor can you use Fisher’s test to “deny” a “causal necessity”. If students attempt to do that in their essays for me they lose marks! You can use Fisher’s test to support a hypothesis, in which case you can conclude that the observed data are unlikely to be observed under the null hypothesis (the hypothesis that your study hypothesis is false). However, if Fisher’s test tells you that your observed data are quite likely to be observed under the null, you cannot conclude that your hypothesis is false, merely that you have no warrant for claiming that it is true.

    Observed data are analyzed, and observed differences, or effects, are statistically tested to compute the probability of those effects if only random noise were the explanation: that is the null hypothesis. If the null hypothesis, that is an explanation out of random effects, is considered too improbable, it is rejected, and the necessity explanation that was the initial hypothesis of the researchers is usually assumed.

    Right. Except that a good scientist will then attempt to devise an alternative hypothesis that could also account for the observed data. But if you “retain the null”, by Fisher’s test, you cannot conclude that your hypothesis is false. Fisher’s test cannot be used to falsify any hypothesis except the null. It cannot be used in the Popperian sense of falsification in other words.
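
    The asymmetry described here (a small p-value counts against the null, a large one proves nothing either way) can be seen in a minimal sketch. It uses a permutation test on the difference between two group means, which is one simple Fisher-style significance test chosen for illustration; the thread does not say which test is meant, and the function name and data are hypothetical.

```python
import random

def permutation_p_value(a, b, n_perm=5000, seed=0):
    """Estimated probability, under the null hypothesis that group labels
    are exchangeable, of a mean difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign the group labels at random
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(sum(left) / len(left) - sum(right) / len(right)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

control = [10, 11, 12, 13, 14, 15, 16, 17]
treated = [20, 21, 22, 23, 24, 25, 26, 27]
print(permutation_p_value(control, treated))  # small: data unlikely under the null
print(permutation_p_value(control, control))  # large: no warrant for any claim
```

    Note what the function returns: the probability of data this extreme given the null. It is not the probability that the null, or the study hypothesis, is true.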

    As you can see, that is completely different from your bold statement that “all our models (and we are talking about models here) must be stochastic”. That is a huge mistake. Most scientific models are not stochastic at all. Some of them are, like part of thermodynamics, and the collapse of the wave function in quantum physics.

    It is entirely compatible with my statement. Typically, we use Fisher’s test to test the difference between two summary statistics. Those summary statistics are models, and they come with associated error terms. Without those error terms we could not compute Fisher’s test statistics, which actually incorporate error terms.

    So, where does your confusion come from? I believe you make a huge mistake, confounding the model, that is often not stochastic at all, and observed data, that always have a component of random noise.

    I am certainly not confounding the two! Scientific methodology involves fitting models to data. Yes, you can build a non-stochastic model, but you still have to deal with the error term, in other words the residuals, aka the stuff impacting on your data that you haven’t modelled. And you can either make assumptions about the distributions of your residuals (assume a Gaussian, for instance) or you can actually include specified distributions for the uncertain factors in your model. If you don’t (say you report a non-stochastic model on the assumption that the residuals are normally distributed, and they aren’t), you will be making a serious error, and your model will be unreliable.

    Random noise is an empirical reality. Random sampling is a cause of random error in most biological contexts. In physics, as in all science, measurement always comes with some error. The presence of random error in empirical data is exactly the reason why those data are often analyzed statistically to distinguish between the effects of random noise and the assumed effect of necessity. But that does not mean, in any way, that the model is stochastic. The explanatory model is usually based on necessity, it assumes specific cause and effect relations, where the probability of an event, given the cause, is 1 or 0.

    In my view you are making some serious statistical errors here, I think at a conceptual level. I think it all derives from your unpacked concept “random”. “Random noise” is simply unmodelled variance (as you’ve said yourself). For instance, we can often reduce the residuals in our models by including a covariate that models some of that “noise”, so it ceases to be noise – we’ve found, in effect, a systematic relationship between some of the previously unmodelled variance and a simply modelled factor. Age, for instance, in my field, or “working memory capacity” is a useful one, as measured by digit span.
    Furthermore, sampling error is not a “cause of random error”, but represents the variability in summary statistics of samples resulting from variance in the population that is not included in your model.
    Certainly some variance is due to measurement error. But it would be very foolish to assume that you have modelled every variable impacting on your data apart from measurement error.
    You are also making a false distinction between “the effects of random noise and the assumed effect of necessity”. Apart from measurement error, all the rest of your “random noise” may well be “effects of necessity”. What makes them “noise” is simply the fact that you haven’t modelled them. Model them, and they become “effects of necessity”. And, as I said, you can model them as covariates, or you can model them as stochastic terms. Either way, you aren’t going to get away without a stochastic term in your model, even if it just appears as the error term.
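
    The point about covariates can be illustrated with a small sketch on synthetic data. Everything in it is assumed for illustration: the variable names, the linear age effect of -0.5 per year and the Gaussian measurement noise are invented, not taken from any study.

```python
import random

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residual_variance(y, fitted):
    """Mean squared residual: the variance the model leaves unexplained."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)

rng = random.Random(1)
age = [rng.uniform(20, 70) for _ in range(200)]
# Synthetic scores: a systematic age effect plus measurement noise.
score = [100 - 0.5 * a + rng.gauss(0, 3) for a in age]

# Model 1: a single mean. The age effect ends up in the "noise".
mean_score = sum(score) / len(score)
var_mean_only = residual_variance(score, [mean_score] * len(score))

# Model 2: age included as a covariate. Part of the "noise" is now modelled.
slope, intercept = fit_line(age, score)
var_with_age = residual_variance(score, [intercept + slope * a for a in age])

print(var_mean_only, var_with_age)
```

    The data are identical in both fits. Only the model changes, and with it how much of the variance is labelled “noise”. A residual term remains even in the second model.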

    So, you are calling the model stochastic, while the only contribution of statistical analysis in the procedure is to rule out random causes as an explanation. That is a big cognitive mistake.

    I disagree. I think yours is the cognitive mistake, and I think it is the mistake of inadequately considering what “random causes” means. “Random causes” are not “explanations”. They are the opposite of “explanations”. They are the unexplained aka unmodelled variance in your data. Thinking that “random” is an “explanation” is a really big cognitive mistake! But you are not the only person on this board to make it.

    You cite 5 sigma, presumably referring to its use in physics. Well, I quote here what 5 sigma means, and what it is used for (emphasis is mine):
    “Word of the Week: Five Sigma
    September 23, 2011
    by Lori Ann White
    The realm of particle physics is the quantum realm, and physicists who venture there must play by the rules of probability. Major discoveries don’t come from a single repetition of a single experiment. Instead, researchers looking for new particles or strange processes repeat an experiment, such as smacking protons together at nearly the speed of light, over and over again, millions or billions or trillions of times – the more times the better.
    Even then, as the researchers sort through the results, interesting lumps and bumps in the data don’t automatically translate into, “We found it!” Interesting lumps and bumps can, and do, happen by chance.
    That’s why statistical analysis is so important to particle physics research. The statistical significance of a particular lump or bump – the probability that it did not appear by chance alone – must be determined. If chance could have supplied it, no dice, and no discovery.
    The yardstick by which this significance is measured is called standard deviation (generally denoted by the lowercase Greek letter sigma). In statistics, the standard deviation is a measure of the spread of data in a normal probability distribution – the well-known bell curve that determines the grades of many students, to their dismay.
    On that bell curve, researchers plot the probability that their interesting lump or bump is due to chance alone. If that point is more than five sigma – five standard deviations – from the center of the bell curve, the probability of it being random is smaller than one in one million. Only then can particle physicists shout, “Eureka!” (a la Archimedes) … but without running naked through the streets of Athens.”
    Again, as you can see, physicists use the 5 sigma threshold to exclude “that their interesting lump or bump is due to chance alone”. IOWs, to affirm that it is evidence of their necessity model. QED.
    That is exactly how ID uses the concept of dFSCI: a bump that is too unlikely to be due to chance alone. IOWs, a result that cannot emerge in a random system, but that requires a cause. In that case, the cause is design.

    And that is exactly what is wrong with ID. Lori Ann White has done what I always reprimand students for doing – saying that their alpha criterion allows them to rule out effects that are “due to chance alone”. It does no such thing. All it does is to allow them to say, as you said yourself, that the observed results are unlikely to be observed if the null is true. Chance doesn’t “cause” anything. But lots of unmodelled factors do. However, under the null hypothesis, only rarely will those unmodelled factors combine to give you results like those you have observed.
    And ID simply does not attempt to model those unmodelled factors. It simply assumes that under the null (no design) the observed data will be very rare. In other words, it assumes what it sets out to demonstrate, which is fallacious.
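
    For reference, the tail probabilities behind the “two sigma” and “five sigma” conventions mentioned above can be computed directly. This is a minimal sketch that assumes a standard normal test statistic under the null; `upper_tail` is a hypothetical helper name.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z, i.e. the one-sided p-value."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (2, 3, 5):
    one_sided = upper_tail(z)
    two_sided = 2 * one_sided
    print(f"{z} sigma: one-sided {one_sided:.3g}, two-sided {two_sided:.3g}")
```

    These numbers are probabilities of a result at least that extreme given that the null is true. They are consistent with the “smaller than one in one million” figure in the quoted article, but they are not the probability that the result is “due to chance”.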

    Again, if we cannot agree on those basic epistemological concepts, there is no hope in discussing. All my reasonings are based on my epistemology, in which I very much believe. How can I discuss anything with you, if you believe completely different things (I really cannot understand what and why) at the epistemological level?

    I agree we can make no progress without agreeing on basic statistical principles. My position is that yours are erroneous. I hope the above assists you in understanding why I think so.
    Cheers, and a happy new year to you!

    Lizzie

  131. @material.infantacy

    By the laws of physics. I’m standing by the notion that if something is necessitated, then its nature can be determined.

    Aren’t there, in principle, ways to determine the proportion of folding sequences to not? I seem to remember something.

    Not that I’m aware of. That’s the problem with biology. Things that seem in principle to be “knowable” turn out not to be, essentially because once things get a bit complicated, with feedback effects, the maths becomes horrendous and the only thing you can do is model – or empirically observe.

    Like weather, really. We understand weather extremely well, but we still can’t make good forecasts for more than a couple of days at a time. That’s not because we can’t in principle model it very well, it’s because large effects depend on tiny differences in starting conditions, and we cannot possibly know the precise starting conditions.
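
    Sensitivity to starting conditions is easy to demonstrate with a toy system. The logistic map below is not a weather model; it is swapped in as the simplest well-known chaotic system, and the starting value, the perturbation of 1e-10 and the function name are arbitrary choices for this sketch.

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x), a fully deterministic rule."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)  # almost identical starting condition

gaps = [abs(x - y) for x, y in zip(a, b)]
for step in (0, 10, 20, 30, 40, 50, 60):
    print(step, gaps[step])
```

    The rule is known exactly and contains no random term, yet a difference in the tenth decimal place of the starting value grows until the two trajectories are unrelated.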

    Maths is becoming an empirical science :)

    It’s also, incidentally, why, despite being a neuroscientist, I have no real worries that neuroscientists will ever be able to make accurate predictions about what people will do or think, even though I do hold the position that what we do or think is a function of our brains and their inputs. Simply telling each other what we think and what we intend to do will remain the best method indefinitely, IMO, in other words the model that we are autonomous decision-makers is the best model we will ever have :)

  132. material.infantacy

    This conversation has split into at least two parts: whether an objective target for functional sequences exists in sequence space; and whether the folding set is improbable. These are distinct issues.

    Elizabeth wrote:

    ”Sure, a “target” exists = we know that the set of sequences that code for foldable proteins isn’t an empty set. That is trivially true, and not in dispute. If you want to add the adjective “objective” to that clear fact, then feel free, but I don’t see what it adds.”

    That it’s trivially true was apparent to me as well, which makes this whole conversation a wonder. It should be noted though, that the folding set is certainly not empty, but it’s also not the same as the entire sequence space.

    0 < n(F) < n(S)

    S = {sequence space}
    F = {folding sequences}
    F1 = {functional sequences}
    F1 ⊆ F ⊂ S

    I add “objective” because it’s clear that set F exists. Not all sequences fold, and of those that do, a subset can have function. The implication is that when we observe a function, the space in which it exists is determined by physics, not by the ad hoc rationalization of the observer.

    It should be clear that I chose to focus on the folding set because it obviates the need to debate about what may or may not be functional in some context. Functional proteins are determined (in this sense) by naturalistic laws, because they exist as a subset of proteins that fold. One may object that not all folded proteins are functional; good. But one cannot argue that functional proteins are not folded. So we have a partition on S, consisting of F’ and F.

    F1 ⊆ F and F1 ⊈ F’

    This means that the coins/cards analogy, in which each sequence is as improbable as the next, is irrelevant, because F occurring changes the probability of F1 having occurred. None of this reasoning requires an explicit calculation, only the bare knowledge that some proteins fold, but not all do.

    P(F1) < P(F1|F)

    The above is also trivially true. Proteins exist in F, and proteins which can add function in a biological context exist in F1. If we observe F1, then we know that F occurred. If we know that F has occurred, the probability of F1 having occurred is greater. Again, this is trivially true, which is why it can’t be true that what constitutes function is arbitrary. If we observe a functional protein, we know that it exists in a narrow subset of S, because F is a narrow subset of S. (You may reason that it’s less narrow than suggested, but it doesn’t change the core argument.)
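
    Spelled out, the inequality follows in one step from the definition of conditional probability. The derivation assumes some probability measure P on S with 0 < P(F) < 1 and P(F1) > 0; it does not assume any particular value for either.

```latex
\begin{align*}
F_1 \subseteq F \;&\Rightarrow\; F_1 \cap F = F_1, \\
P(F_1 \mid F) &= \frac{P(F_1 \cap F)}{P(F)} = \frac{P(F_1)}{P(F)}, \\
0 < P(F) < 1,\; P(F_1) > 0 \;&\Rightarrow\; P(F_1 \mid F) > P(F_1).
\end{align*}
```

    The size of the gap between the two sides depends entirely on P(F), which is the quantity in dispute.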

    P(F) ≠ 1/n(S)

    That is, if F occurs, it’s not the same as any sequence in S occurring, because n(F) > 1 and n(F) < n(F’), which is perhaps something we can agree on, unless there’s good reason to think that more sequences will fold properly than sequences which won’t.

    As to the improbability of F, I’m sticking with 10^-74, which is a researched estimate. You may have reason for thinking that this estimate is flawed, and if you would like to make a case for that I’d be happy to read it; however you can revise that number down by many orders of magnitude and it doesn’t change the argument. If we say that 1 in 10^50 will potentially fold, then we still have a narrow subset F, which is highly improbable to find with a random trial.

    “We simply don’t know how improbable it is because a) we don’t know how many elements of set S are also members of set F, and b) we don’t know how many draws there were.”

    First, this isn’t about draws/trials because I’m not talking about what evolution may or may not accomplish. I’m painstakingly stating the obvious, that there exists an objective subset of S in which functional proteins must reside. I agree that we don’t know how improbable it is. I’m willing to withdraw “objectively improbable” and state instead that it is “likely improbable” or something of the sort. Again, even if we take the 10^-74 estimate down to 10^-50, we’re still looking at something highly improbable.

    Regardless, a target space for function exists and is not ad hoc nor post hoc — because it exists in a narrow and unchanging subset of the sequence space.

    If your position is that the existence of F is trivially true, hence “objective,” then we agree, and can cease straining gnats; and we can put to rest the notion that the argument “equally improbable as any other sequence” has any bearing on the observation of function.

    That was Miller’s argument and entirely misses the point; it is irrelevant.

    I’m arguing against the notion that any sequence is as good as any other with respect to function. If a protein has a function, then it is a subset of F, which is a subset of S, which makes it anything but arbitrary with respect to sequence space, and likely highly improbable.

  133. This conversation has split into at least two parts: whether an objective target for functional sequences exists in sequence space; and whether the folding set is improbable. These are distinct issues.

    Yes. Good thinking :)

    Elizabeth wrote:
    “Sure, a “target” exists = we know that the set of sequences that code for foldable proteins isn’t an empty set. That is trivially true, and not in dispute. If you want to add the adjective “objective” to that clear fact, then feel free, but I don’t see what it adds.”
    That it’s trivially true was apparent to me as well, which makes this whole conversation a wonder. It should be noted though, that the folding set is certainly not empty, but it’s also not the same as the entire sequence space.
    0 < n(F) < n(S)
    S = {sequence space}
    F = {folding sequences}
    F1 = {functional sequences}
    F1 ⊂ F ⊂ S
    I add “objective” because it’s clear that set F exists. Not all sequences fold, and of those that do, a subset can have function. The implication is that when we observe a function, the space in which it exists is determined by physics, not by the ad hoc rationalization of the observer.
    It should be clear that I chose to focus on the folding set because it obviates the need to debate about what may or may not be functional in some context. Functional proteins are determined (in this sense) by naturalistic laws, because they exist as a subset of proteins that fold. One may object that not all folded proteins are functional; good. But one cannot argue that functional proteins are not folded. So we have a partition on S, consisting of F’ and F.
    F1 ⊂ F and F1 ⊄ F’

    Yes to most of the above, but a couple of caveats: first you cannot assume that unfolded proteins or short peptides never serve any function in an organism; second, and it is related, function has to be considered in terms of the reproductive success of the phenotype in a given context. You can’t look at a protein, or a peptide, or an RNA molecule, and say: this one is functional; this one isn’t. Just because one of those things is coded by DNA and produced in a cell doesn’t mean it is functional in the only sense that matters – the reproductive success of the phenotype.
    So it is misleading, in my view, to characterise sequences, as you have done, simply as “folding” or “folding and functional”. There will be non-folding functional sequences and non-functional folding sequences, and non-coding functional sequences, and sequences that are folding but only functional if some non-coding functional sequence is present, and some of all the above that will only be functional if expressed in some tissue, or in some tissue in certain environments.

    This means that the coins/cards analogies — that each sequence is as improbable as the next — is irrelevant, because F occurring changes the probability of F1 having occurred. None of this reasoning requires an explicit calculation, only the bare knowledge that some proteins fold, but not all do.
    P(F1) < P(F1|F)
    The above is also trivially true. Proteins exist in F, and proteins which can add function in a biological context exist in F1. If we observe F1, then we know that F occurred. If we know that F has occurred, the probability of F1 having occurred is greater. Again, this is trivially true, which is why it can’t be true that what constitutes function is arbitrary. If we observe a functional protein, we know that it exists in a narrow subset of S, because F is a narrow subset of S. (You may reason that it’s less narrow than suggested, but it doesn’t change the core argument.)

    OK, but with the caveats aforementioned :)
    P(F) ≠ 1/n(S)
    That is, if F occurs, it’s not the same as any sequence in S occurring, because n(F) > 1 and n(F) < n(F’), which is perhaps something we can agree on, unless there’s good reason to think that more sequences will fold properly than sequences which won’t.
    I don’t know of a reason, but I am still concerned about the assumption that only sequences that “fold properly” can have a function. I think this is a serious flaw. If we are talking about the emergence of protein coding sequences very early in the emergence of life, even simple non-folding peptides may have had a function in promoting successful reproduction, even if later such peptide sequences turn out to be invariably non-functional. And the reason it’s important is that early functional sequences can bootstrap later more complex ones, as we see in genetic algorithms. Early genetic patterns that confer an advantage form the basis for later more complex ones which confer greater advantage, after which, the re-emergence of those early ones is disadvantageous, not advantageous.

    As to the improbability of F, I’m sticking with 10^-74, which is a researched estimate. You may have reason for thinking that this estimate is flawed, and if you would like to make a case for that I’d be happy to read it; however you can revise that number down by many orders of magnitude and it doesn’t change the argument. If we say that 1 in 10^50 will potentially fold, then we still have a narrow subset F, which is highly improbable to find with a random trial.

    I have a great many reasons for thinking that the estimate is not only flawed, but that the quantity cannot be estimated at all, unless we take your assumption that only foldable proteins can ever be functional; and even then, we don’t currently have the ability to estimate the proportion of foldable sequences out of all possible sequences, because we don’t know how to predict folding from the sequence alone.

    “We simply don’t know how improbable it is because a) we don’t know how many elements of set n(S) are also members of set n(F), and nor do we know how many draws there were.”
    First, this isn’t about draws/trials because I’m not talking about what evolution may or may not accomplish. I’m painstakingly stating the obvious, that there exists an objective subset of S in which functional proteins must reside. I agree that we don’t know how improbable. I’m willing to withdraw “objectively improbable” and state instead that it is “likely improbable” or something of the sort. Again, even if we take the 10^-74 estimate down to 10^-50, we’re still looking at something highly improbable.

    But then your conclusion that folding proteins are “highly improbable” hangs overwhelmingly on that estimate! Which, for a great many reasons, given above, is highly flawed.

    Regardless, a target space for function exists and is not ad hoc nor post hoc — because it exists in a narrow and unchanging subset of the sequence space.

    No. The “target space” is not unchanging at all, and sequence space itself is constantly changing. In fact I think the problem here is that you are not placing the problem in fitness space at all, and that is critical. Whether a sequence is “functional” or not depends on whether it affects the phenotype’s position in fitness space. And fitness space is constantly changing. So, to take a toy model: if we start with a very small genome, as in a GA, we have a small sequence space, and we start with just the minimum functional sequences present to keep the population renewing itself. Now, let’s say that (as I seem to recall) DNA sequences with predominantly GC bonds are more stable than ones with predominantly AT bonds. At that point GC bonds are “functional” if organisms with more GC bonds are more likely to reproduce successfully than others. So “GC” or GG or even GAC are “functional” sequences. Later, certain sequences may result in RNA molecules that for some reason or another (perhaps they catalyse some beneficial reaction) improve reproductive efficiency, and so they become “functional” and the predominance or otherwise of GC bonds ceases to matter. Already the “target” has changed.

    If your position is that the existence of F is trivially true, hence “objective,” then we agree, and can cease straining gnats; and we can put to rest the notion that the argument “equally improbable as any other sequence” has any bearing on the observation of function.
    That was Miller’s argument and entirely misses the point; it is irrelevant.

    No, it isn’t irrelevant! In fact it’s exactly the error Axe makes, and on which basis you concluded that folding proteins are “highly improbable”. The fact that there exists a finite number of potentially functional folding proteins (which I dispute, for reasons I have given), and that this therefore constitutes an “objective” target, is no use unless you know how large that set is. To be precise in the card analogy: it is as though you assert (correctly) that in a deck of unknown but finite size there are an unknown but finite number of possible hands that could win an unknown but finite number of card games. You are then dealt a hand from that deck and told that you have won one of the games. You then erroneously assume that because that number of winning hands is finite, and less than the total number of possible hands, that being dealt any one of them is highly improbable, and that therefore you have been extraordinarily lucky. You have mistaken, I suggest, the knowledge that the winning set is finite for the knowledge that it is a very tiny subset of all possible hands, and the reason you have done so is that someone (Axe in this case) has inferred from the fact that the hand you were dealt was a winner in a game he now knows exists (because you won it) that it was the only game in town. In other words he drew the target far too tightly round an observed hit.
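    To put some numbers on the card analogy, here is a small sketch. Everything in it is hypothetical: the deck in the analogy is of unknown size, so a standard 52-card deck with 5-card hands is assumed purely for illustration, and the three winning-set sizes are invented. The point is only that the chance of being dealt a winner is n(winning)/n(all hands), so "finite and smaller than the total" leaves the answer almost completely open.

```python
from fractions import Fraction
from math import comb

def p_win(n_winning: int, n_hands: int) -> Fraction:
    """Probability that one uniformly dealt hand lies in the winning set."""
    if not 0 <= n_winning <= n_hands:
        raise ValueError("winning set must be a subset of all hands")
    return Fraction(n_winning, n_hands)

# Assumed deck for illustration only: 5-card hands from 52 cards.
n_hands = comb(52, 5)

# Three invented sizes for the winning set, all finite and all below n_hands.
for n_winning in (1, n_hands // 1000, n_hands // 2):
    print(n_winning, float(p_win(n_winning, n_hands)))
```

    All three cases satisfy "the winning set is finite and smaller than the set of all hands", yet the probabilities differ by many orders of magnitude.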

    I’m arguing against the notion that any sequence is as good as any other with respect to function. If a protein has a function, then it is a subset of F, which is a subset of S, which makes it anything but arbitrary with respect to sequence space, and likely highly improbable.

    And my point is that this oversimplifies the problem to the point of falsification! The target is not unchanging, and indeed, the hitting of a target itself changes the target space. The system is, literally, chaotic, full of feedback loops. Think Markov chains if you like, or at least Bayesian probabilities, but it is really important, I suggest, to consider the system dynamically, not statically, and that where you are at any given point not only constrains where you can go next, but opens up new functional possibilities.
    Anyway, it’s nearly 2012 now, so a happy new year to you, and nice to talk to you! I do enjoy getting down to brass tacks! Thanks!

    Lizzie

    PS: had to get page info to find the code for < thanks for providing it!

  134.
    material.infantacy

    Happy new year to you as well, Lizzie. I still don’t think we’re on the same page, and I believe you’re smuggling in some points which have no real bearing on the issue at hand, so I’ll get back to those points later if I get more time.

    It appears to me, that no matter how you slice it, there’s a target in S which remains (at least reasonably) fixed. It’s narrow, and consists of peptides which successfully fold. Within that set the probability of function is significantly greater than it is given S. Even with some rather generous considerations, function is not arbitrary with respect to S, AFAICT.

    If we could only put Miller’s cards analogy six feet under (and those like it) I’d be happy to give it a rest.

    “PS: had to get page info to find the code for < thanks for providing it!”

    Yeah that one has caused me a fair amount of trouble without the code. I found this page for mathy unicode symbols (enjoy). I only wish that the tags for superscript and subscript functioned properly in the comments here. We can dream.

  135. Well, I’m certainly not trying to “smuggle” anything! I’m trying to be as obvious as I possibly can be! I think these things matter!

    But IF we accept, for the sake of argument, that non-folding proteins are non-functional, and that only a subset of folding proteins can be functional, and that no other sequences are functional THEN, I accept that the subset of functional sequences is probably a very small subset of total sequences.

    But that is a huge IF, and I don’t accept any of those premises! Nor do I accept Axe’s estimate.

    But I will concede that Miller’s point is crude, and not applicable to yours, although a legitimate one to make against Dembski’s argument: even though Dembski claims to have sorted it by his “specification” concept, I do not accept that he has, but that’s a slightly different issue. Also a legitimate one to make against Axe.

    And, indeed a legitimate one against many ID arguments I have seen (those “islands of function” for instance :))

    Thanks for the link! The greater-than symbol is the biggest problem as not only does it not render, it is parsed as tag code, and sometimes you lose a great chunk of post!

    Anyway, about to take my glass of champagne off to bed!

  136.
    material.infantacy

    “Anyway, about to take my glass of champagne off to bed!”

    Felicitations! =D And thanks for a generous footnote.

  137. Elizabeth:

    First of all, happy New Year to you and all!

    As a first “activity” in the new year, I would like to make some comments about function and folding in proteins, to add to the interesting discussion you are having with MI.

    I see things this way:

    You are right to say that there are many functions that do not depend on protein folding. First of all, it is perfectly true that many functions do not depend on proteins at all. A lot of other molecules are highly functional in the biological context. Some of them are much simpler than proteins. Among others, the regulatory importance of small peptides and small DNA or RNA sequences has been proved.

    But that’s not the point. The point is not that function requires proteins, or folded proteins. The point is that a lot of specific biochemical functions, absolutely fundamental for any living beings, do require folded proteins, because folded proteins are the only kind of molecules that can do those things.

    Think of all the enzymatic activities on which life is based: metabolism, DNA duplication, transcription, translation, cell cycle, membrane receptors, membrane transport, movement, and so on. All these things are realized by the contribution of thousands of individual proteins, each of them very complex, each of them efficiently folded. In no other way could those things be attained.

    So, we need not equate function with protein folding. But we can certainly say that a special subset, a very big subset, of all biological functions, and especially non trivial biochemical activities, require some well folded protein to be achieved. The living world as we know it could never exist without folded, biochemically active proteins.

    Again I must reject here your attempts to define function in relation to reproduction. I define function as a well recognizable, non trivial biochemical activity that can allow us, even in the lab, even in the absence of any life, least of all reproduction, to obtain important biochemical results that could never occur spontaneously, like the enzymatic acceleration of biochemical reactions.

    So, these objectively definable biochemical functions do need complex folded proteins. We know no other way in the world to do those things.

    So, the subset of folding and biochemically functional proteins is certainly an important subset of the protein space, without which no life as we know it could exist.

    It is true that folding does not ensure function. A folded protein, to be functional, must have further requisites, like one or more active sites, and in most cases, the ability to undergo specific, useful, conformational modifications as the result of active site interaction. Many proteins that fold have no non trivial biochemical activity.

    So, there is no doubt that we can define a subset of protein space that is the subset of well folding proteins, and a subset of that subset, that is the space of folding proteins with non trivial biochemical function.

    And we can certainly affirm that this last subset is absolutely indispensable to life as we know it. Indeed, it is certainly the most important subset of biological molecules, as far as biological biochemical activity is considered.

  138. Elizabeth:

    First of all, happy New Year to you and all!

    It’s looking good so far! I wish the same to you.

    As a first “activity” in the new year, I would like to make some comments about function and folding in proteins, to add to the interesting discussion you are having with MI.

    I see things this way:

    You are right to say that there are many functions that do not depend on protein folding. First of all, it is perfectly true that many functions do not depend on proteins at all. A lot of other molecules are highly functional in the biological context. Some of them are much simpler than proteins. Among others, the regulatory importance of small peptides and small DNA or RNA sequences has been proved.

    But that’s not the point. The point is not that function requires proteins, or folded proteins. The point is that a lot of specific biochemical functions, absolutely fundamental for any living beings, do require folded proteins, because folded proteins are the only kind of molecules that can do those things.

    But that is circular. It’s like saying that I require two legs because without two legs I couldn’t do the things I need two legs to do. But actually, it’s perfectly possible to do lots of things without two legs, just not those things specific to two legs. This is an example of “drawing the target round the arrow”. You see a folded protein, and you say: this protein is essential, because without that protein, this function couldn’t be performed. Sure. But there may be many useful functions that could be performed by some protein we don’t happen to have, yet we get on fine. For example, we can’t synthesise Vitamin C. Most mammals can. Is it required? No. Would it be useful? Yes. Moreover, if a population evolves to rely on something (being able to synthesise vitamin C for instance) and then loses it, then it could be disastrous. However, if it evolves to do without, then it won’t. So you can’t remove a part, watch the system fail, then say: “therefore this part is essential to all such systems”. That is what is wrong with Behe’s argument. If you remove the keystone of an arch, the arch fails. That does not mean that arches are unbuildable.

    Think of all the enzymatic activities on which life is based: metabolism, DNA duplication, transcription, translation, cell cycle, membrane receptors, membrane transport, movement, and so on. All these things are realized by the contribution of thousands of individual proteins, each of them very complex, each of them efficiently folded. In no other way could those things be attained.

    Again, you are working backwards from a functioning system. My first car was extremely simple, and if it stopped working was easy to fix. If the starter motor jammed, as it frequently did, I just dunted it with a hammer I kept in the door compartment for the purpose. Now I have a car that is hugely complex. It has automatic gears, decides itself whether to run on battery or petrol (it’s a Toyota Prius). It tells me if there is something behind me when I back up, and won’t go further if there is. And if any one thing goes wrong with it, the whole thing stops (which fortunately hasn’t happened to me yet) and has to be towed in for specialist repair. It is easy to damage, fatally, a complex system by removing a single part. That does not mean that any simpler system will not function. This is the heart of Behe’s fallacy.

    So, we need not equate function with protein folding. But we can certainly say that a special subset, a very big subset, of all biological functions, and especially non trivial biochemical activities, require some well folded protein to be achieved. The living world as we know it could never exist without folded, biochemically active proteins.

    And this, as I said, is fallacious. It’s one of several fundamental mistakes made by ID proponents, that render the argument untenable.

    Again I must reject here your attempts to define function in relation to reproduction. I define function as a well recognizable, non trivial biochemical activity that can allow us, even in the lab, even in the absence of any life, least of all reproduction, to obtain important biochemical results that could never occur spontaneously, like the enzymatic acceleration of biochemical reactions.

    You can define it how you like as long as we are clear :) But if you are going to call the activity of an enzyme a “function” when it is sitting in a test-tube, then you are going to have to define “spontaneously” as well. Are you saying that an enzyme in a test-tube has to be asked politely in order to perform its function, unlike, say, an inorganic catalyst? I don’t think so :)

    So would you also say an inorganic catalyst has a “function”? If not, why not? If so, then how does “function” in your sense differ from “physical/chemical properties”?

    So, these objectively definable biochemical functions do need complex folded proteins. We know no other way in the world to do those things.

    So, the subset of folding and biochemically functional proteins is certainly an important subset of the protein space, without which no life as we know it could exist.

    Life as we know it, sure. But we aren’t talking about “life as we know it”. We are talking about life before we knew it. Life as we have to infer it was in the past, based on fossil evidence and evidence from biochemistry and geology and other sciences. It is, as I have said, completely fallacious to assume that only “life as we know it” is viable. It would, as I’ve said elsewhere, be assuming your conclusion.

    It is true that folding does not ensure function. A folded protein, to be functional, must have further requisites, like one or more active sites, and in most cases, the ability to undergo specific, useful, conformational modifications as the result of active site interaction. Many proteins that fold have no non trivial biochemical activity.

    And many biological catalysts are not proteins at all, yet by your definition have a “function”. And some proteins with “trivial biochemical activity” nonetheless have a “function” (in terms of effect on the phenotype).

    So, there is no doubt that we can define a subset of protein space that is the subset of well folding proteins, and a subset of that subset, that is the space of folding proteins with non trivial biochemical function.

    And we can certainly affirm that this last subset is absolutely indispensable to life as we know it. Indeed, it is certainly the most important subset of biological molecules, as far as biological biochemical activity is considered.

    By defining “function” so idiosyncratically, you are, IMO, tying yourself in a knot. Either “function”, as you define it, applies to any compound with any biochemical effect or you need to unpack “non trivial”, which, it seems to me, lands you back in phenotypic effects. You have talked about these proteins being “necessary for life” and yet in the same breath you say that the effect on reproductive success is irrelevant.

    How can a protein that has no effect on reproductive success have a function that is anything other than “trivial”? And how can a compound, regardless of the complexity of its biochemical activity or lack of it, that has an effect on the reproductive success of the phenotype, not be “functional”? If something is “necessary for life” it must have a positive effect on reproductive success, no? And if it doesn’t, then it’s neither necessary nor “functional” is it?

    Do you see the problem?

  139. Oops, from “As a first “activity” in the new year…” to “… because folded proteins are the only kind of molecules that can do those things” are gpuccio’s words, not mine.

    Sorry for the confusion (if a mod could fix the tags that would be awesome :))

  140. Elizabeth:

    Do you see the problem?

    No. There is no problem at all.

    Please, go back to my definition of dFSCI and of function, and you will find the answers. I sum them up here, for your convenience :) :

    a) To evaluate dFSCI, an observer can objectively define any function he likes. A defined function does not imply that the observed object is designed. I have also given the example of a stone that can be defined as implementing the function of paperweight. I am not implying by that that the stone is designed for that function.

    b) A complex function, one that needs many bits (for instance, at least 150 for a biological context) of functional information to work, allows us to make a design inference.

    That’s what I meant by “non trivial”. Trivial functions are often non designed, but they can well be defined.

    Is that clear?

  141. Elizabeth:

    And many biological catalysts are not proteins at all, yet by your definition have a “function”. And some proteins with “trivial biochemical activity” nonetheless have a “function” (in terms of effect on the phenotype).

    OK. And there is no reason to think those things are designed, if only a few bits of information can give us that function.

  142. Elizabeth:

    By defining “function” so idiosyncratically, you are, IMO, tying yourself in a knot.

    It’s a knot I am perfectly comfortable in.

    Either “function”, as you define it, applies to any compound with any biochemical effect

    Yes, it does. A compound with a biochemical effect can be defined as functional, for that specific effect. But if the existence of that compound, and its presence in the context where it reacts, can be easily explained, and doesn’t require any functional information to exist, then there is no reason to infer design just because a chemical reaction happens somewhere on our planet.

    But if I observe a lot of chemical reagents arranged in an ordered way, so that they can react one with the other, in order and in the exact proportions, so that some rare chemical result may be obtained, it is perfectly reasonable to ask whether that context is designed. Maybe it is not designed. Maybe the configuration I observe could easily happen in a non design context. That’s why I have to analyze the “null hypothesis” of a random result in a random system.

    But take an enzyme. It accelerates a chemical reaction that would not occur, except at trivial rates, in the absence of that enzyme. So I define a function.

    And what allows the function to work? It is the specific sequence of aminoacids (let’s say 300) in that enzyme. And is there an explanation for that specific sequence to be there?

    Those are the questions that we in ID want to answer. Logical, reasonable, scientific questions.

    You and darwinists seem not to be interested in asking those questions. Again, it’s your choice.

    But we will go on asking them, and answering them.

    or you need to unpack “non trivial”, which, it seems to me, lands you back in phenotypic effects.

    No. It lands me back in functional complexity.

  143. OK, let’s step back a bit then (literally, heh).

    Can I ask you a couple of questions?

    First:

    Let’s take a gene for some neuromodulatory protein. And let’s say there are two alleles for this protein, and they differ in one amino acid. Where in one allele there is methionine, in the other there is valine.

    Both proteins perform the same function in the phenotype. Which, if either, sequence contains more dFSCI?

    Second: take another gene, with two alleles. In both there are tandem repeats of a 48 base pair sequence. However in one allele there are seven repeats, and in the other, nine. Again, both proteins perform the same function in the phenotype.

    Which, if either, sequence contains more dFSCI?

    If these questions are not answerable, can you explain why?

  144. You and darwinists seem not to be interested in asking those questions

    Absolutely not. Just check the literature.

  145. I mean, absolutely not true! The biological literature is full of studies addressing those questions.

  146. But those are not definitions!

    Do you have actual definitions somewhere?

  147. Hey Elizabeth, strange that you call ID arguments untenable yet you cannot offer anything in support of the claims for your position.

    The fallacy of your position appears to be that there is supporting evidence for it.

  148. Unfortunately the explanations lack evidentiary support.

    You can’t even refute Dr Behe’s claim that two new protein-to-protein binding sites are beyond the capability of stochastic processes.

    Maybe in the New Year….

  149. Elizabeth:

    a) They contain the same dFSCI. The variation is neutral. In Durston’s method, those variations in a protein family are exactly the way we compute the dFSCI of the family.

    b) Please, tell me more about the gene and its function, otherwise I cannot answer. In general, at least for a protein coding gene, those variations that do not affect function do not contribute to dFSCI.
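    For readers who want to see how variation within a protein family can enter such a calculation, here is a rough sketch in the spirit of the Durston approach as I understand it: for each aligned site, take the entropy of a uniform distribution over the 20 amino acids and subtract the entropy actually observed across the family, then sum over sites. The uniform null state, the function names and the two-sequence toy alignment are all assumptions made for illustration; this is not the published implementation.

```python
import math
from collections import Counter

def site_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues observed at one aligned site."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def functional_bits(alignment: list) -> float:
    """Sum over sites of (uniform null entropy - observed entropy), in bits.

    Assumes an ungapped alignment of equal-length sequences and a uniform
    null distribution over 20 amino acids (an assumption of this sketch).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")
    null_entropy = math.log2(20)
    return sum(null_entropy - site_entropy("".join(col))
               for col in zip(*alignment))

# Invented toy family: the second site tolerates either M or V.
print(functional_bits(["AM", "AV"]))
# Same length, but no variation tolerated at either site.
print(functional_bits(["AM", "AM"]))
```

    In this toy case the site that tolerates two residues contributes one bit less than a fully conserved site, which is how tolerated variation lowers the family-level figure in a calculation of this kind.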

  150. So if I told you that in the first case, the val allele was associated with greater reproductive success than the met allele, and in the second case that the 7 repeat allele was associated with greater reproductive success than the 9 repeat, how would that affect your calculation of dFSCI?

    Or would it make no difference?

  151. Elizabeth:

    I paste here some recent short definitions of dFSCI I have given. I have given much more detailed definitions, but at the moment I don’t know how to retrieve them:

    “Just to try some “ID for dummies”:

    Functionally specified information. It’s not difficult. I need a string of bits that contains the minimal information necessary to do something. The information that is necessary to do something is functionally specified. Isn’t that simple?

    Complexity: How many bits do I need to achieve that “something”? It’s not difficult. It is simple. Programmers know very well that, if they want more functions, they have to write more code. Let’s take the minimal code that can do some specific thing. That is the specified complexity for that function.”

    And about complexity:

    “My definition of complexity in dFSCI is very simple: given a digital string that carries the information for an explicitly defined function, complexity (expressed in bits as -log2) is simply the ratio between the number of functional states (the number of sequences carrying the information for the function) and the search space (the number of possible sequences).

    More in detail, some approximations must be made. For a protein family, the search space will usually be calculated for the mean length of the proteins in that family, as 20^length. The target space (the number of functional sequences) is the most difficult part to evaluate. The Durston method gives a good approximation for protein families, while in principle it can be approximated even for a single protein if enough is known about its structure function relationship (that at present cannot be easily done, but knowledge is growing increasingly in that field).

    This ratio expresses well the probability of finding the target space by a random search or a random walk from an unrelated state.

    dFSCI can be categorized in binary form (present or absent) if a threshold is established. The threshold must obviously be specific for each type of random system, and take into account the probabilistic resources available to the system itself.

    For a generic biological system on our planet, I have proposed a threshold of 150 bits (see a more detailed discussion here):

    http://www.uncommondescent.com…..ent-410355

    As already discussed, the measurement of dFSCI applies only to a transition or search or walk that is reasonably random. Any explicitly known necessity mechanism that applies to the transition or search or walk will redefine the dFSCI for that object.

    Moreover, it is important to remember that the value of dFSCI is specific for one object and for one explicitly defined function.”
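
    The definition quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the example numbers (a length of 100 and a target space of 2^200 sequences) are assumptions made for the example, not measured values. Logarithms are used throughout so that 20^length never has to be computed directly.

```python
import math

def dfsci_bits(target_space_log2: float, length: int, alphabet_size: int = 20) -> float:
    """Return -log2(target space / search space), in bits.

    The search space is alphabet_size ** length (20 ** length for proteins),
    and the target space is supplied as its base-2 logarithm.
    """
    search_space_log2 = length * math.log2(alphabet_size)
    return search_space_log2 - target_space_log2

def exceeds_threshold(bits: float, threshold_bits: float = 150.0) -> bool:
    """Binary categorization: is the value above the proposed threshold?"""
    return bits > threshold_bits

# Hypothetical example: a 100-residue family with an assumed
# target space of 2**200 functional sequences.
example_bits = dfsci_bits(target_space_log2=200.0, length=100)
print(round(example_bits, 2), exceeds_threshold(example_bits))

# A single site where exactly one amino acid out of 20 works:
print(round(dfsci_bits(target_space_log2=0.0, length=1), 2))
```

    The hard part in practice is the target space, which this sketch simply takes as an input.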

  152. Elizabeth:

    It depends. You must tell me why the two alleles are associated with different reproduction. If the difference in the two alleles is the cause of different reproduction, then I suppose that the function cannot be exactly the same, or that there must be some indirect mechanism that connects the difference to the reproductive effect.

    Another possibility is that the two alleles are only linked to other proteins that influence different reproduction.

    You see, only darwinists can be content with a statement such as “the val allele was associated with greater reproductive success than the met allele”. All normal people would simply ask: why?

    It’s about causes, remember. About necessity models, and cause effect relationships. That’s what science is about. Science is not only descriptive. It tries to explain things.

  153. Well, can you calculate it for the two pairs of alleles I’ve given you?

    I can hunt out the actual sequences, but for now, perhaps you could just give me the formula with the parameters, and I will try to supply the parameters?

    But I’m not clear where in your definition you plug in the advantageousness of the variant.

    If you are going to measure the increase in dFSCI of a sequence that results in increased fitness, or some potentially macroevolutionary change, how do you evaluate the increase, and where does it go in your dFSCI calculation?

  154. Elizabeth:

    The Durston method applies to protein families, to hundreds of sequences that have the same function.

    The function is considered a constraint on uncertainty in the protein sequence. Therefore, those AAs that have never varied contribute 4.3 bits to the total dFSCI, while those that can freely change contribute 0 bits. All intermediate situations contribute correspondingly to the reduction of uncertainty given by the function for each site, computed according to Shannon’s formula.

    Please, review Durston’s paper for more information.
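
    As a rough illustration of the per-site calculation described above, here is a minimal Python sketch. It is a simplification, not Durston’s published procedure: it assumes a uniform baseline of log2(20) bits per site, ignores gaps and sample-size corrections, and uses a made-up toy alignment. The names `site_entropy` and `functional_bits` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = 20  # assumed size of the alphabet for the uniform baseline

def site_entropy(column):
    """Shannon entropy (bits) of the residues observed at one aligned site."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def functional_bits(alignment):
    """Sum over sites of (maximum uncertainty - observed uncertainty)."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    max_entropy = math.log2(AMINO_ACIDS)  # about 4.32 bits
    return sum(max_entropy - site_entropy(col) for col in zip(*alignment))

# Toy alignment (invented): sites 1 and 3 never vary,
# site 2 is split evenly between K and R.
toy_alignment = ["MKV", "MRV", "MKV", "MRV"]
print(round(functional_bits(toy_alignment), 2))
```

    A fully conserved site contributes log2(20), about 4.3 bits, and a site split evenly between two residues contributes one bit less. With only a handful of sequences, a site can never show the full 4.3 bits of observed uncertainty, so small samples overstate conservation.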

  155. Elizabeth:

    For the “advantageousness of the variant”, please answer my post 25.1.1.1.1

  156. Elizabeth:

    While waiting for your answers, I will give you a general answer to your question, with a few assumptions:

    So, let’s assume that protein A and protein B differ for one AA. Let’s say that protein B has some added function, that determines better reproduction vs the replicators with protein A. We can evaluate the dFSCI of the transition from A to B, for the added function.

    If only that specific AA substitution confers that added function (IOWs, if val and only val confers the added function when substituted at that site), then the dFSCI of the transition is 4.3 bits.

  157. It depends. You must tell me why the two alleles are associated with different reproduction. If the difference in the two alleles is the cause of different reproduction, then I suppose that the function cannot be exactly the same, or that there must be some indirect mechanism that connects the difference to the reproductive effect.

    But we can observe that the two alleles ARE associated with different reproduction rates, without knowing why, simply by measuring reproduction rates in two populations, one of which has one allele and one of which has the other.

    Are you saying that if you observe that an allele is associated with increased fitness, you cannot estimate the dFSCI of the increase unless you know the mechanism?

    Can you even say for sure whether it is an increase or decrease?

    Another possibility is that the two alleles are only linked to other proteins that influence different reproduction.

    Let’s say that we’ve done an experiment and actually inserted the relevant allele into the genomes of experimental animals. Could you do it then?

    You see, only darwinists can be content with a statement such as “the val allele was associated with greater reproductive success than the met allele”. All normal people would simply ask: why?

    Who says this darwinist is content? Darwinists are normal people and indeed ask “why?” What I am asking you is how you compute the dFSCI of the change, because unless you can give an exemplar of a step-change of > bits then there is nothing that demands explanation!

    And this (partly hypothetical) case is simple: we have two pairs of proteins, slightly different, whose sequence is known, and which produce different reproduction rates.

    It’s about causes, remember. About necessity models, and cause effect relationships. That’s what science is about. Science is not only descriptive. It tries to explain things.

    Yes, of course. I’m not disputing that (did you respond to my response to your post on Fisher? I’m not sure I checked). But it’s also about measuring things, and in this instance I would like to know how you would measure the difference in dFSCI between two alleles that result in two slightly different proteins, one associated with greater reproductive success than the other.

    Let’s say that we think that the difference is due to more efficient dopamine function in the allele associated with greater fitness.

  158. Sorry, this new format is difficult to track. What answers are you waiting for?

  159. OK, thanks!

    So, let’s assume that protein A and protein B differ for one AA. Let’s say that protein B has some added function, that determines better reproduction vs the replicators with protein A. We can evaluate the dFSCI of the transition from A to B, for the added function.

    Let’s say the sequence is identical except that in the met allele there is an AUG whereas in the val allele there is a GUG.

    If only that specific AA substitution confers that added function (IOWs, if val and only val confers the added function when substituted at that site), then the dFSCI of the transition is 4.3 bits.

    Would the answer be the same if the met allele was AUG and the val allele was GUC? Or is it 4.3 bits regardless of how many differences there are between the two triplets?

    And two more questions:

    If the environment now changes, and the met allele becomes the more advantageous of the two, does the val allele still have 4.3 bits more dFSCI than the met allele, or does the met allele get those bits back, as it were?

    And the second question is about my second pair of alleles. The 7 repeat has 7 repeats of a 48 base pair section and the 9 repeat 9 repeats. The 7 repeat allele is more advantageous. What is the dFSCI of the difference?

    Thanks :)

  160. Elizabeth:

    Well, an answer to this:

    “It depends. You must tell me why the two alleles are associated with different reproduction. If the difference in the two alleles is the cause of different reproduction, then I suppose that the function cannot be exactly the same, or that there must be some indirect mechanism that connects the difference to the reproductive effect.

    Another possibility is that the two alleles are only linked to other proteins that influence different reproduction.”

    Meanwhile, I would say that I generally reason on the AA sequence, assuming that synonymous mutations do not affect the protein function (I know that is not always true, let’s say it is an approximation). It is probably possible to reason on the nucleotide sequence, but Durston works at AA level, and so do I.

    Regarding the question about the environment change, we have to define the added function at biochemical level, whatever it is. The change of 4.3 bits is tied to adding the biochemical difference. Whether or not it is useful in the environment, nothing changes. We define the function as “locally” as possible, usually in terms of biochemical properties. It is obviously possible to define the function more generically, such as “adding reproductive power in this specific environment”, but that would be scarcely useful, and anyway tied to a particular environment. As I have tried to repeat many times, dFSCI works if it describes the information needed to obtain some very well defined, objective property. Biochemical activities in a specified context are very good examples of a well specified function.

    About the repeat allele, I really need more details about the function. I can only say that in general, a repetition is a compressible feature, and it implies few bits of information. Indeed, a simple repetition can probably be explained algorithmically, and I don’t believe it contributes much to dFSCI. dFSCI is high in pseudorandom sequences, that cannot be efficiently compressed, but still convey information.

  161. But, gpuccio, the more you tell me about your dFSCI, the more useless it seems to me as a metric! It doesn’t seem to be highly correlated with usefulness. And useless information isn’t really information, is it? What good is information if it doesn’t tell you what you need to know?

    That’s why I asked you whether the val allele would still have 4.3 bits more information than the met allele if the environment changed so that the met allele becomes the allele that confers greater fitness.

    That seems to me a really crucial question.

    Meanwhile, I would say that I generally reason on the AA sequence, assuming that synonymous mutations do not affect the protein function (I know that is not always true, let’s say it is an approximation). It is probably possible to reason on the nucleotide sequence, but Durston works at AA level, and so do I.

    But val to met is not a synonymous mutation and it does affect function. And you are right, even “silent” mutations (GUA to GUC for instance) can apparently affect function. Which is where your definition of function starts to fall apart, or rather, where we have to distinguish carefully between function as in “helps the organism reproduce” and function as in “has some biochemical properties”. It is relatively easy to measure the functional value of a variant in my sense (by measuring relative fitness); I have no idea how you measure functional value simply from the biochemistry. That’s why I keep accusing you of forgetting the phenotype! Surely it is the effect of a sequence on the phenotype that determines its functional value? And shouldn’t that be a major input into your information metric?

    Regarding the question about the environment change, we have to define the added function at biochemical level, whatever it is. The change of 4.3 bits is tied to adding the biochemical difference. Whether or not it is useful in the environment, nothing changes. We define the function as “locally” as possible, usually in terms of biochemical properties. It is obviously possible to define the function more generically, such as “adding reproductive power in this specific environment”, but that would be scarcely useful, and anyway tied to a particular environment. As I have tried to repeat many times, dFSCI works if it describes the information needed to obtain some very well defined, objective property. Biochemical activities in a specified context are very good examples of a well specified function.

    But this makes no sense to me. I stipulated that, originally, the val allele provided “added function” i.e. conferred greater fitness, and you said that it represented an additional 4.3 bits of information. Now, that same allele, in a different environment, confers less fitness than the met allele. So are you saying it never did add 4.3 bits of information? Or are you saying that any change of one AA adds one bit of information, even if that change makes the phenotype less fit?

    Or are you, as you seem to be, saying that dFSCI can only be measured with a well-defined function in a specific context? If so, it seems entirely irrelevant to evolutionary theory, which is all about adaptation to context!

    And yet you say that only if there is a step-change increase of > 150 bits in dFSCI can we call a change “macroevolution”, right? So can you give an actual example of macroevolution, as thus defined, and say how you computed the size of the step-change?

  162. Umm the whole point is there ain’t any examples of macroevolution so defined. And conferring greater fitness is not an added function.

  163. Umm the whole point is there ain’t any examples of macroevolution so defined.

    Exactly Joe. That is precisely the point I’m making. So until you or someone else provides one, there’s nothing that needs to be explained.

  164. No one can provide an example of something that doesn’t exist. That’s the point- your position doesn’t have any examples to call upon.

  165. Elizabeth:

    Only now I have read your post 19.2. Here are my comments.

    Well, I do believe that you don’t understand statistics well. Let’s see:

    Again, that word “random”. You really need to give a tight definition of it, because it has no accepted tight definition in English.

    I have given it. I quote from my post 13:

    “a) A “random system” is a system whose behaviour cannot be described in terms of necessity laws, usually because we cannot know all the variables and/or all the laws, but which can still be described well enough in terms of some probability model.

    The tossing of a coin is a random system.

    Genetic variation, if we don’t consider the effects of NS (differential reproduction) is a random system. NS introduces an element of necessity, due to the interaction of reproductive functions and the environment.”

    I can be more clear, and say that a system is not “random” because it is not governed by laws of necessity. I have always been clear about that. It’s our model of the system that is probabilistic, because that’s the best model that can describe it. Much of your confusion in the following derives from your misunderstanding of my point about that.

    And we simply do not know whether gravity is subject to quantum uncertainty or not.

    Again confusion. I have never spoken of “gravity”, but of “Newton’s law of gravity”. Again you misunderstand: I am speaking of the scientific model, and you answer about the noumenon!

    Newton’s law is indeed deterministic, but it is only a law, not an explanation – not a theory. And all a law is a mathematical description that holds broadly true. That doesn’t make it a causal relation.

    From Wikipedia:

    “A scientific law is a statement that explains what something does in science just like Newton’s law of universal gravitation. A scientific law must always apply under the same conditions, and implies a causal relationship between its elements. The law must be confirmed and broadly agreed upon through the process of inductive reasoning. As well, factual and well-confirmed statements like “Mercury is liquid at standard temperature and pressure” are considered to be too specific to qualify as scientific laws. A central problem in the philosophy of science, going back to David Hume, is that of distinguishing scientific laws from principles that arise merely accidentally because of the constant conjunction of one thing and another.”

    A law, like a theory, is about causal relations. A theory is usually wider than a simple law, but essentially they are the same thing: an explanatory model of reality, based on logical and mathematical relations.

    We do not know whether mass is the cause of gravitational force. It may be that gravitational force is the cause of mass. Or it could be that “cause” is itself merely a model we use to denote temporal sequence, and ceases to make sense when considering space-time. But I’m no physicist, so I can’t comment further except to say that you are making huge and unwarranted assumptions here.

    And you are completely wrong here. Philosophically, you can dispute whether we can ever assess a true causal relation (Hume tried to do exactly that). But science is all about assumed causal relations.

    Newton’s model does assume that mass is the cause of gravitational force, and not the other way round. Again, here, you make a terrible confusion between scientific methodology and modeling, and mere statistical analysis of data. While statistics can never tell us which is the cause in a relation, methodology can (with all the limits of scientific knowledge, obviously). For instance, if there is a statistical correlation between two variables, that’s all we can say at the statistical level. But, if one variable precedes the other in time, methodology tells us that a causal relation, if assumed, can be only in one direction.
    That’s why a scientific model is much more than the description of statistical relations. A model makes causal assumptions, and tries to explain what we observe.

    Obviously, we well know that scientific models are never absolute, and never definitive. But they can be very good just the same.

    Depends what you mean by “nothing to do with probability”.

    I thought it was clear. It means that the evolution of a wave function is mathematically computed, in a strictly deterministic way. There is no probability there.

    I am talking about scientific (explanatory) models.

    I am too.

    There is always unmodelled variance, if only experimental error.

    Now, I will be very clear here, because this is the single point where you bring most of the confusion in.

    Any system, except for quantum measurements implying a collapse of the wave function, is considered to be deterministic, in physics. Therefore, in principle, any system could be modeled with precision if we knew all the variables, and all the laws implied. That would leave no room for probability, exactly the opposite of what you state.

    But, obviously, there is ample space for probability in science. So, in science, we model those parts of the system that we understand according to causal relations (the “necessity” parts), and we describe probabilistically those parts that we cannot model that way (the “random” parts).

    If you compute a regression line, you are creating a necessity model (the regression line is a mathematical object), assuming a specific mathematical relation between two variables. If your model is good, it will explain part of the variance you observe, and if you are happy with that, you can make methodological assumptions, propose a theory and a causal relationship. That is a methodological activity, and it is supported by statistical analysis, but not in any way determined by it.

    And you will have residuals, obviously. Still unexplained variance. What are they? The effects of other variables, obviously, including sampling error, measurement errors, and whatever else. It is obvious that, if we could model everything, we would have a strict necessity system (unless agents endowed with free will are implied :) ). But we treat that part as random variance, exactly because we can’t model it. If we can model part of that residual variance by some new necessity relation, then we can refine our model, which will become better.

    That’s what you seem not to understand. The model is explanatory, and whenever possible it is based on necessity and assumed causal relations. The data will always have some random component, because usually we cannot model all necessity interactions, even if only because of measurement errors. Quantum mechanics, for its probabilistic part, is the only model I know that is supposedly based on intrinsic randomness (and even that is controversial).
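
    The split between the fitted line and the residuals discussed above can be made concrete with a small Python sketch. It is only an illustration: the data are simulated (a true slope of 2, an intercept of 1 and Gaussian noise are assumptions chosen for the example), and the helper names are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def explained_fraction(xs, ys):
    """Fraction of the variance in ys accounted for by the fitted line (R^2)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_resid / ss_total

# Simulated data: a deterministic part (2x + 1) plus unmodelled noise.
rng = random.Random(42)
xs = list(range(50))
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, explained_fraction(xs, ys))
```

    The fitted line plays the role of the deterministic part of the model, and whatever `explained_fraction` leaves over is the residual variance that both sides of this exchange are arguing about.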

    It is often possible to use deterministic models, even when the underlying processes are indeterminate.

    What does that mean? All processes, in principle, are determinate (except for what was said about quantum processes). If we use a deterministic model, it’s because we believe that it describes well, to a point, what really happens. We may be wrong, obviously. But that’s the idea, when we do science.

    Similarly we often have to use stochastic models even when the underlying processes are determinate.

    Wrong. We use probabilistic models when we have no successful model based on necessity. And the underlying processes are always determinate; it’s our ability to describe them that determines the use of a necessity model or of a probabilistic model.

    I think a big problem (and I find it repeatedly in ID conversations) concerns the word “probability” itself, which is almost as problematic as “random”. Sometimes people use it as a substitute for “frequency” (as in probability distributions which are based on observed frequency distributions) . At other times they use it to mean something closer to “likelihood”. And at yet other times they use it as a measure of certainty. We need to be clear as to which sense we are using the word, and not equivocate between mathematically very different usages, especially if the foundation of your argument is probabilistic (as ID arguments generally are).

    That again shows great confusion. There is no doubt that the nature of probability is a very controversial, and essentially unsolved, philosophical problem. If you have time to spend, you can read about that here:

    http://plato.stanford.edu/entr.....interpret/

    But the philosophical difficulties in interpreting probabilities have never prevented scientists from using the theory of probability efficiently, no more than the controversial interpretations of quantum mechanics have prevented its efficient use in physics.

    You seem to have strange difficulties with words like probability, and random. Only “stochastic” seems to reassure you, for reasons that frankly I cannot grasp :)

    For your convenience,

    No. Pretty well all sciences, and certainly life sciences, use models in which the error term is extremely important. And biology is full of stochastic models. In fact I simply couldn’t do my job without stochastic models (and I work in life-science, but in close collaboration with physicists).

    Again, a model can be a necessity model, and still take into account error terms, that will be treated probabilistically. The word itself, “error”, refers to a concept of necessity.
    Measurement errors, for instance, create random noise (unless they are systematic) that makes the detection of the necessity relation more difficult.

    I think you are confusing a law with a model. A law, generally, is an equation that seems to be highly predictive in certain circumstances, although there are always residuals – always data points that don’t lie on the line given by the equation, and these are not always measurement error. We often come up with mathematical laws, even in life sciences, but that doesn’t mean that the laws represent some fundamental “law of necessity”. It just means that, usually within a certain data range (as with Newton, and Einstein, whose laws break down beyond certain data limits) relationships can be summarised by a mathematical function fairly reliably – perhaps very reliably sometimes.

    ????? What does that mean?

    Laws, models and theories are essentially the same kind of thing. OK, laws are more restricted, and usually more strongly supported by data. But the principle is the same: we create logical and mathemathical models to explain what we observe.

    The residuals are not necessarily evidence that the law is wrong. As already said, they can often be explained by measurement errors, or by the interference of unknown variables. And of course, sometimes they do show that the law is wrong.

    And what do you mean when you say: “but that doesn’t mean that the laws represent some fundamental “law of necessity””
    Laws are laws of necessity. How fundamental they are, depends only on how well they explain facts. And obviously, if you mean, with “fundamental”, absolute, then no scientific law, or model, or theory will ever be “absolute”. But they can be very good and very important.

    This is a false distinction in my opinion. You can describe the results of a single coin toss by a probabilistic model just as well as you can describe the results of repeated tosses.

    ???? What do you mean? How can probability help you describe “the results of a single coin toss”? What can you say? Maybe it will be heads? Is that a description of the event?

    But if you toss the coin many times, you can observe mathematical regularities. For instance, you can say that the percent of heads will become nearer to 50% as the number of tosses increases. That is a mathematical relation, but as you can see it is not a necessity one: it is based on a probability distribution, a different mathematical object.
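
    The regularity described above is easy to see in a simulation. The following Python sketch assumes a fair coin modelled by a pseudo-random generator with a fixed seed; the function name is hypothetical, and the exact fractions printed depend on the seed.

```python
import random

def heads_fraction(n_tosses, rng):
    """Simulate n_tosses of a fair coin and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    # The fraction tends to settle near 0.5 as n grows,
    # although no single toss is predicted by the model.
    print(n, heads_fraction(n, rng))
```

    Nothing in this model says what the next toss will be; it only describes the aggregate.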

    But if you want to predict the results of an individual toss, as opposed to the aggregate results of many tosses, you need to build a more elaborate model that takes into account all kinds of extra data, including the velocity, spin, distance and angle etc of the coin.

    OK. As I have said from the beginning, each single toss is determined. Completely determined.

    And you cannot possibly know all the factors, so there will still be an error term in your equation.

    Yes, but most times the error term can be made small enough that the prediction is empirically accurate. Otherwise we could never compute trajectories, orbits, and so on.

    In other words, predictive models always have error terms; sometimes these can be practically ignored; at other times, you need to characterise the distribution of the error terms and build a full stochastic model.

    As already said, the fact that a data analysis includes a probabilistic evaluation of errors does not mean that the explanatory model is not a necessity model.

    I agree that characterising uncertainty is fundamental to scientific methodology. I disagree that stochastic and non-stochastic models are “deeply different”. In fact I’d say that a non-stochastic model is just a special case of a stochastic model where the error term is assumed to be zero.

    Again you unduly stress the error term. Random errors are an empirical problem, but they do not imply that a necessity theory is wrong. The evaluation of a theory implies much more than the error term. It means evaluating how well it explains observed facts, whether it contains internal inconsistencies, and whether there are better necessity theories that can explain the same data.

    No. That is not the purpose of Fisher’s test, which has nothing to do with “causal necessity” per se (although it can be used to support a causal hypothesis).

    Please, read again what I wrote:

    “Take Fisher’s hypothesis testing, for instance, that is widely used as research methodology in biological sciences. As you certainly know, the purpose of the test is to affirm a causal necessity, or to deny it.”

    Well I could have repeated: the purpose of the test as used in biological sciences. Must you always be so fastidious, and without reason?

    You know as well as I do that Fisher’s hypothesis testing is used methodologically to affirm, or deny, or just leave undetermined, some specific theories. As I have already said, it is never the statistical analysis in itself that affirms or denies: it’s the methodological context, supported by the statistical analysis.

    Nor can you use Fisher’s test to “deny” a “causal necessity”.

    Why not? If you compute the beta error and power, you can deny a specific effect of some predefined minimal size with a controlled error risk, exactly as you do when you affirm the effect with the error risk given by the alpha error. Otherwise, why should medical studies have sufficient statistical power?

    However, if Fisher’s test tells you that your observed data are quite likely to be observed under the null, you cannot conclude that your hypothesis is false, merely that you have no warrant for claiming that it is true.

    I don’t agree. If the power of your study is big enough, you can conclude that if a big enough effect compatible with your model had been present, your research should have detected it (always with a possibility of error measured by the beta error).

    Right. Except that a good scientist will then attempt to devise an alternative hypothesis that could also account for the observed data.

    Alternative hypotheses must always be considered. That is a fundamental of methodology. If you reject the null hypothesis, that does not automatically validate your model. I am very well aware of that. Let’s say that, if you reject the null hypothesis, you usually propose your model as the best explanation, after having duly considered all other explanations you are aware of.

    But if you “retain the null”, by Fisher’s test, you cannot conclude that your hypothesis is false. Fisher’s test cannot be used to falsify any hypothesis except the null. It cannot be used in the Popperian sense of falsification in other words.

    Well, I was not speaking of a logical falsification, but of an empirical falsification.

    The reasoning is simple. You propose a model. You empirically define some effect that derives from the model, and a minimal threshold of that effect that is empirically interesting. Then you run the experiment, sizing your samples so that the power of the test will be 95%. Then you find a p value of 0.30. You don’t reject the null hypothesis. At the same time, with a power of 95% and a beta error of 5%, you can affirm that it is very unlikely that your model is good, at least that it is good enough to assume an effect of the size you had initially assumed. Empirically, that means that your model not only is not a good explanation, but realistically it is not a useful explanation at all.

    That is empirical falsification, not in the Popper sense, but in the sense that counts in biological research.
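
    The step of “sizing your samples so that the power of the test will be 95%” can be sketched as follows. This is a textbook normal approximation for a two-sided comparison of two group means, not a procedure taken from this thread; the standardized effect size of 0.5 is an arbitrary assumption for the example, and `sample_size_per_group` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.95):
    """Approximate n per group for a two-sided, two-sample z-test.

    effect_size is the minimal interesting difference between the two
    group means, expressed in standard deviations.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # guards the alpha error
    z_power = standard_normal.inv_cdf(power)          # guards the beta error
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Assumed minimal interesting effect: half a standard deviation.
print(sample_size_per_group(0.5))
```

    With samples of that size, a non-significant result argues against an effect at least as large as the chosen threshold, at the stated beta risk. It says nothing about smaller effects.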

    More in another post.

  166. I’ll have to respond in pieces, gpuccio, but let me start at the end:

    Well, I was not speaking of a logical falsification, but of an empirical falsification.

    The reasoning is simple. You propose a model. You empirically define some effect that derives from the model, and a minimal threshold of that effect that is empirically interesting. Then you run the experiment, sizing your samples so that the power of the test will be 95%. Then you find a p value of 0.30. You don’t reject the null hypothesis. At the same time, with a power of 95% and a beta error of 5%, you can affirm that it is very unlikely that your model is good, at least that it is good enough to assume an effect of the size you had initially assumed. Empirically, that means that your model not only is not a good explanation, but realistically it is not a useful explanation at all.

    I agree that you can falsify the hypothesis that an effect is greater than some threshold effect size.

    Cool :)

  167. Probability isn’t especially mysterious, you just need to be clear what you mean by it in any given context.

    Sometimes it’s used as a frequency estimate, sometimes as a measure of uncertainty. The two aren’t the same, though, mathematically, so it’s important to distinguish, and to interpret the word appropriately.

    And the reason I prefer “stochastic” to “random” is that it has a much more precise meaning. “Random” is a disaster :)

  168. Right. I’m glad you agree with me that macroevolution, by gpuccio’s definition, doesn’t exist.

    So there’s no point in his claiming that it can’t be explained by evolutionary theory, is there?

    (Did you inadvertently back the wrong horse there, Joe? ;))

  169. Elizabeth:

    Well, I am happy that I can make my essay without losing too many marks :)

  170. Elizabeth:

    If that is your problem, you can be reassured: in my reasoning, probability is always used as a frequency estimate.

    The concept of reduction of uncertainty is used in Durston’s method to approximate the target space in protein families, and is treated mathematically according to Shannon’s concepts. The reduction of uncertainty due to the constraint of the specific function is computed from the probability of each amino acid at each site, but is a different concept. So, there is absolutely no problem here.
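
    A rough sketch of the kind of calculation described above, assuming the ground state is 20 equiprobable amino acids and that the reduction in uncertainty at each site is log2(20) minus the Shannon entropy of the observed residues at that site. The toy alignment and the function names are hypothetical, and this is a simplified illustration rather than Durston's published procedure.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (bits) of the residues observed at one aligned site."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def functional_bits(alignment, alphabet_size=20):
    """Sum over sites of (ground-state entropy - observed entropy), in bits."""
    ground = math.log2(alphabet_size)  # assumed equiprobable ground state
    return sum(ground - site_entropy(col) for col in zip(*alignment))


# Hypothetical toy alignment: two conserved sites, one site split between D and E.
alignment = ["ACD", "ACE", "ACD", "ACE"]
print(functional_bits(alignment))
```

    A fully conserved site contributes log2(20) bits and a site split evenly between two residues contributes one bit less. The estimate depends entirely on how well the aligned sequences sample the functional space, which is the point under dispute in the comments that follow.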

    And I would be happy if you could specify in what sense “stochastic” is more precise than “random”.

  171. heh.

    Did I just verify that I am human?

  172. Yes, I thought you were using probability in the frequentist sense.

    The problem there, though, is frequency estimates depend on having data sampled from the entire relevant population, and that’s what we don’t have in the case of protein domains. We only have data sampled from the population that happened to make it.

    A bit like estimating the sex ratio of the human population from the sex ratio in CEOs.

    “Stochastic”, in English, is more precise than “random” because “random” has many meanings in regular English usage, including “equiprobable” and “purposeless”. In Italian (or the Italian equivalent) it may be more precise.

    “Stochastic” however is used much more rarely, and with much more precise meaning, and denotes a system, or process, or model, that is not deterministic.

    Even a straightforward regression model is a stochastic model because it contains an error term, assumed to be normally distributed. That’s in addition, of course, to the error associated with the fitted parameters.
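
    A small sketch of what "a regression model is a stochastic model" means in practice, assuming a straight-line model with a normally distributed error term. The parameter values, seed and function names are illustrative assumptions.

```python
import random


def simulate(n, intercept, slope, sigma, rng):
    """Draw data from y = intercept + slope * x + e, with e ~ Normal(0, sigma)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, sigma) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Ordinary least squares for one predictor; returns intercept, slope, residuals."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, residuals


rng = random.Random(0)
xs, ys = simulate(2000, intercept=1.0, slope=2.0, sigma=1.0, rng=rng)
a, b, res = fit_ols(xs, ys)
print(round(a, 2), round(b, 2))
```

    The fitted intercept and slope land close to the generating values but not exactly on them, and the residuals are the part of the data the deterministic line does not account for.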

  173. Elizabeth:

    Yes, I use it in the frequentist sense.

    In the case of protein domains, we have data about functional domains taken from 4 billion years of sampling. That’s something. Again, we are not looking for final and precise measurements, but for a reasonable estimate.

    If a functional protein emerged 4 billion years ago, neutral variation has reasonably explored the functional space, while keeping the function the same. Even if the functional space had not been completely explored, that would still be a good approximation.

    Protein families include sequences with big differences, further evidence that the functional space, or a great part of it, has been traversed in the course of evolution.

    When you get, by the Durston method, functional complexities of hundreds of bits for many protein families, you would have to assume that only a very tiny part of the functional space has been explored by neutral evolution to ignore that finding. There is absolutely no reason to believe that. All the data we have, both from observation of the proteome and its natural history, and from lab data, confirm that the functional space is extremely small compared to the search space.

    Just consider the rugged landscape paper:

    “Experimental Rugged Fitness Landscape in Protein Sequence Space”

    “Although each sequence at the foot has the potential for evolution, adaptive walking may cease above a relative fitness of 0.4 due to mutation-selection-drift balance or trapping by local optima. It should be noted that the stationary fitness determined by the mutation-selection-drift balance with a library size of N(d)all is always lower than the fitness at which local optima with a basin size of d reach their peak frequencies (Figure 4). This implies that at a given mutation rate of d, most adaptive walks will stagnate due to the mutation-selection-drift balance but will hardly be trapped by local optima. Although adaptive walking in our experiment must have encountered local optima with basin sizes of 1, 2, and probably 3, the observed stagnations are likely due only to the mutation-selection-drift balance. Therefore, stagnation was overcome by increasing the library size. In molecular evolutionary engineering, larger library size is generally favorable for reaching higher stationary fitness, while the mutation rate, d, may be adjusted to maintain a higher degree of diversity but should not exceed the limit given by N = N(d)all to keep the stationary fitness as high as possible. In practice, the maximum library size that can be prepared is about 10^13 [28,29]. Even with a huge library size, adaptive walking could increase the fitness, ~W, up to only 0.55. The question remains regarding how large a population is required to reach the fitness of the wild-type phage. The relative fitness of the wild-type phage, or rather the native D2 domain, is almost equivalent to the global peak of the fitness landscape. By extrapolation, we estimated that adaptive walking requires a library size of 10^70 with 35 substitutions to reach comparable fitness.”

    Emphasis mine.

    Don’t you think that “a library size of 10^70” (strangely similar as a number to some of Axe’s estimates) for “35 substitutions to reach comparable fitness” (strangely similar to my threshold for biological dFSCI) means something, in a lab experiment based on retrieving an existing function in a set where NS is strongly working?

  174. Elizabeth:

    In the meantime, I will try to finish answering your post 19.2:

    Scientific methodology involves fitting models to data. Yes, you can build a non-stochastic model, but you still have to deal with the error term, in other words the residuals, aka the stuff impacting on your data that you haven’t modelled. And you can either make assumptions about the distributions of your residuals (assume a Gaussian, for instance) or you can actually include specified distributions for the uncertain factors in your model. If you don’t – you report a non-stochastic model with the assumption that the residuals are normally distributed, and they aren’t, you will be making a serious error, and your model will be unreliable.

    Well, yes and no. Statistical modeling is one of my favourite activities, and I would say that a normal distribution of residuals, for instance in a regression model, is what we expect if our necessity model really explains our data completely, except for random error.

    Now, that random error could just be measurement error (as is more likely in a physics experiment, where the relation between causal model and data is often more direct and simple). In that case, we certainly expect a normal distribution of residuals. Another possibility, much more common in biology and medicine, is that our causal model, while true, explains only part of what we observe. In biology, there may be a lot of hidden variables that we cannot take into account (each individual is different, each individual is much more complex than we can understand, and so on). That’s why our biological (and especially medical) causal models rarely explain all, or even much, of what we observe. To give an example, if we test a new drug, we don’t expect it to cure 100% of the cases. It may cure just 15%. And yet, the causal relation between drug administration and cure can be strong and reliable.

    In the same way, our regression between two variables in a diagnostic hypothesis (with a specific causal model in the background) can be strongly significant, but the effect size (such as R^2) can be relatively small. And yet, the information is sometimes precious, and the causal relation strongly assumed.

    Now the point is, if our supposed cause is really acting in data, but it is only one of many causes, our statistical relation will be diluted by many other effects, not only by “errors”, such as measurement errors.

    The point is: if those other diluting effects are many and independent, then we expect that our residuals will be normally distributed.

    But take the case that there is one major hidden causal factor in the system, that we don’t know of. Then we can expect that residuals will not be normally distributed. IOWs, residuals will contain structure that is not accounted for in the model. Identifying that structure and adding the unknown term to the original model leads to a better model, but is not always possible.

    Anyway, in that case, the fact that residuals are not normally distributed does not mean that our causal model is not good: our supposed cause can still be a very valid and credible cause, but we have reason to believe that there is at least another big, detectable cause that may explain the structure in residuals, and we have the duty to look for it, if there is a chance to understand what it is.

    So, what you say is partially right, but in no way does it invalidate the importance of statistical analysis to detect causal relations in data, even when they are diluted by measurement errors, or simply by other unknown causal relations. In the end, the purpose of science is and remains to detect causes.
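
    A sketch of the contrast drawn above between many small independent unmodelled effects and one large hidden factor. The effect sizes, sample size and seed are illustrative assumptions, and excess kurtosis is used here only as one convenient summary of departure from normality.

```python
import random


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; close to 0 for normal data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1)
n = 5000

# Case 1: each residual is the sum of 50 small, independent unmodelled effects.
many_small = [sum(rng.uniform(-1, 1) for _ in range(50)) for _ in range(n)]

# Case 2: one large hidden binary factor (+3 or -3) plus ordinary measurement noise.
one_hidden = [rng.choice((-3.0, 3.0)) + rng.gauss(0, 1) for _ in range(n)]

print(excess_kurtosis(many_small), excess_kurtosis(one_hidden))
```

    The first set of residuals looks approximately normal. The second is bimodal, with strongly negative excess kurtosis, which is the kind of structure that suggests a missing term in the model.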

  175. Elizabeth:

    Let’s go on:

    “Random noise” is simply unmodelled variance (as you’ve said yourself). For instance, we can often reduce the residuals in our models by including a covariate that models some of that “noise”, so it ceases to be noise – we’ve found, in effect, a systematic relationship between some of the previously unmodelled variance and a simply modelled factor. Age, for instance, in my field, or “working memory capacity” is a useful one, as measured by digit span.

    OK. That’s perfectly in line with what I said in my previous post.

    Furthermore, sampling error, is not a “cause of random error”, but represents the variability in summary statistics of samples resulting from variance in the population that is not included in your model.

    Why not? Sampling error is similar to measurement error. We want to measure something in a population, and instead we measure it in a sample. That implies a possible error in the measurement, because the sample does not represent the population perfectly. If the sampling technique is really random, then the error is random too. And, obviously, it depends critically on sample size, as is well known. That applies also to tests comparing two or more samples. There, sampling error is treated as the possible source of random differences between the groups because of random sampling, differences that could appear to be effects of a necessity cause (such as a true difference between groups), but are not.

    Certainly some variance is due to measurement error. But it would be very foolish to assume that you have modelled every variable impacting on your data apart from measurement error.

    I certainly don’t assume any such thing. I could never work in medicine, that way! But sampling error is a random error due to sampling, and not another causal variable. And anyway my point is that, even if there are other variables that I have not modeled, still I can often reliably assume a causal relation for my modeled variable.

    You are also making a false distinction between “the effects of random noise and the assumed effect of necessity”. Apart from measurement error, all the rest of your “random noise” may well be “effects of necessity”. What makes them “noise” is simply the fact that you haven’t modelled them. Model them, and they become “effects of necessity”. And, as I said, you can model them as covariates, or you can model them as stochastic terms. Either way, you aren’t going to get away without a stochastic term in your model, even if it just appears as the error term.

    Right and wrong. As I have said, everything in the model is “effect of necessity” (if we are not experimenting about quantum mechanics), even random noise. The difference is in how we treat those effects.

    The effects of one strong, but unknown, variable will show as “structure” in our residuals, for instance, and will deserve special attention because that variable could be detected and added to the model.

    But the effects of many unknown independent variables, all of them contributing slightly to what we observe, can be treated only probabilistically, whether they are measurement errors, sampling error, or hidden variables.

    So, to sum up:

    a) All that we observe is the result of necessity

    b) Much of what we observe can be treated only probabilistically, not because it is essentially different, but because the form and impact of the cause-effect relationship is beyond a detailed necessity treatment.

    c) We must always be well aware of what we are doing: are we treating some variable as a possible necessity cause, and affirming that causal relation, or are we just modelling unknown variables probabilistically? That’s where your arguments are confused.

    I disagree. I think yours is the cognitive mistake, and I think it is the mistake of inadequately considering what “random causes” means. “Random causes” are not “explanations”. They are the opposite of “explanations”. They are the unexplained aka unmodeled variance in your data. Thinking that “random” is an “explanation” is a really big cognitive mistake! But you are not the only person on this board to make it.

    This is definitely wrong. Random causes, that is unknown causes that we model probabilistically, can very well be an explanation of what we observe. They explain probabilistically, exactly because they are not modelled in detail. But explain they do, just the same.

    Let’s take the simplest experimental model, where we test differences in some test variable in two different groups, a test group and a control group, by the methodology of Fisher’s hypothesis testing.

    The reasoning goes as follows: the groups differ only for the tested variable, and our null hypothesis is that our variable has no causal relation with what we observe in the data. And yet, in the data we do observe a difference (well, it would be very unlikely not to observe any).

    At this point, we have two possible explanations competing: we accept the null hypothesis, and assume that the observed difference is well explained by the known possible cause that is sampling error, treated probabilistically, and not in detail. Or we reject the null hypothesis, and assume that our variable (or any other possible causal model) explains the observed difference. The decision, as you know, is usually taken on probabilistic terms, because the hypothesis we are rejecting or assuming as best explanation is a probabilistic hypothesis, where the cause is treated probabilistically.

    Therefore, “random causes” are causes just the same. Their treatment is probabilistic, but just the same we can have good motives to assume that those causes, treated probabilistically, are the best explanation for what we observe.
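
    The two-group reasoning above can be sketched as an exact permutation test. This is an illustrative stand-in for the general Fisher-style logic rather than the specific test anyone in the thread has in mind, and the data values and function name are hypothetical.

```python
from itertools import combinations


def permutation_p_value(test, control):
    """Exact two-sided permutation p-value for a difference in group means.

    Under the null, group labels are exchangeable, so every way of splitting
    the pooled values into groups of the original sizes is equally likely.
    """
    pooled = test + control
    n_test = len(test)
    n_control = len(pooled) - n_test
    total = sum(pooled)

    def abs_mean_diff(test_sum):
        return abs(test_sum / n_test - (total - test_sum) / n_control)

    observed = abs_mean_diff(sum(test))
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_test):
        count += 1
        # Small tolerance guards against floating-point ties.
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            hits += 1
    return hits / count


similar = permutation_p_value([1, 3, 5, 7, 9, 11], [2, 4, 6, 8, 10, 12])
shifted = permutation_p_value([1, 2, 3, 4, 5, 6], [101, 102, 103, 104, 105, 106])
print(similar, shifted)
```

    In the first comparison a difference at least as large as the observed one arises in most relabellings, so random assignment alone accounts for it. In the second it arises in only 2 of the 924 possible splits, so the null is rejected.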

    The important point, that darwinists never want to recognize, is that when we assume a probabilistic cause in our model, the only way to analyze it, to decide if it is a credible cause or not, is a probabilistic analysis.

    That is exactly the point of ID. Neo-Darwinism assumes RV as the engine of variation. That is a causal assumption for something that can be treated only probabilistically. Therefore, darwinists have the duty to analyze its probabilistic credibility. They don’t want to do that, so we do it for them :) .

  176. Elizabeth:

    And that is exactly what is wrong with ID. Lori Ann White has done what I always reprimand students for doing – saying that their alpha criterion allows them to rule out effects that are “due to chance alone”. It does no such thing. All it does is to allow them to say, as you said yourself, that the observed results are unlikely to be observed if the null is true.

    And that’s exactly what is wrong in your reasoning. Why do you reprimand your students?

    Now, as you say: “All it does is to allow them to say, as you said yourself, that the observed results are unlikely to be observed if the null is true”

    All? What else are you looking for? The observed results are so unlikely (for instance, five sigma) that the only reasonable empirical choice is to reject the null hypothesis. IOWs, to “rule out effects that are ‘due to chance alone’”.

    Now, what you have said seems senseless to me, but I will try to make sense out of it, giving possible interpretations that could be in some way true. Please, let me know if that’s what you meant:

    a) Their alpha criterion does not allow them to automatically affirm their H1 hypothesis. Perfectly true, but it’s not what you said.

    b) Their alpha criterion does not allow them to exclude that some small effects due to chance alone are however present in the system. True, but absolutely irrelevant. Why are you interested in subliminal random effects, when you have a five sigma in favour of a non random explanation?

    c) Their alpha criterion does not allow them to logically exclude an explanation “due to chance alone”. True, but who cares? Our science is mainly empirical. It can almost never falsify data interpretations “logically”. Five sigma is a very good empirical falsification, and an absolute reason to reject the null hypothesis.

    So again, why do you reprimand your students? (I care for them, you know…) :)
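
    For scale, the “five sigma” figure mentioned above can be converted to a tail probability, assuming the test statistic is normally distributed under the null. The function name is an illustrative choice.

```python
from statistics import NormalDist


def tail_probability(sigmas, two_sided=False):
    """P(Z >= sigmas) for a standard normal Z, optionally doubled for two tails."""
    one_sided = NormalDist().cdf(-sigmas)
    return 2 * one_sided if two_sided else one_sided


print(tail_probability(5))                  # one-sided tail at five sigma
print(tail_probability(5, two_sided=True))  # two-sided tail at five sigma
```

    The one-sided value is about 2.9e-7, roughly 1 in 3.5 million. That number is the probability of data at least this extreme if the null is true. It is not the probability that the null is true.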

  177. And that’s exactly what is wrong in your reasoning. Why do you reprimand your students?

    Now, as you say: “All it does is to allow them to say, as you said yourself, that the observed results are unlikely to be observed if the null is true”

    All? What else are you looking for? The observed results are so unlikely (for instance, five sigma) that the only reasonable empirical choice is to reject the null hypothesis.

    Yes.

    IOWs, to “rule out effects that are ‘due to chance alone’”.

    No. That is not the same thing. “The null is true” is not the same as saying “the results are due to chance alone”.

    Because “due to chance” is meaningless. “Chance” doesn’t cause things.

    This is not a nitpick.

    Now, what you have said seems senseless to me, but I will try to make sense out of it, giving possible interpretations that could be in some way true. Please, let me know if that’s what you meant:

    a) Their alpha criterion does not allow them to automatically affirm their H1 hypothesis. Perfectly true, but it’s not what you said.

    Well, it allows them to claim that their H1 is supported. I’d be happy if they said that.

    b) Their alpha criterion does not allow them to exclude that some small effects due to chance alone are however present in the system. True, but absolutely irrelevant. Why are you interested in subliminal random effects, when you have a five sigma in favour of a non random explanation?

    Well, they should bear that in mind, but that is not the source of my objection to the phrase “the results are unlikely to be due to chance alone”.

    c) Their alpha criterion does not allow them to logically exclude an explanation “due to chance alone”. True, but who cares? Our science is mainly empirical. It can almost never falsify data interpretations “logically”. Five sigma is a very good empirical falsification, and an absolute reason to reject the null hypothesis.

    I have no objection to rejecting the null. I object to them claiming that it is unlikely their results are due to “chance alone”.

    So again, why do you reprimand your students? (I care for them, you know…)

    I care for them too :)

    Because chance does not cause anything. It’s just a way of saying that whatever caused something, it was not something you predicted, or could easily have predicted.

    And the reason it isn’t a nitpick is that if you reject the null, you have to be really clear what the null is. The null is not “chance alone”. The null is “H1 is false”.

    And working out the expected distribution under the null is (as I’m sure you agree) an integral part of the test, and not always easy.

  178. Elizabeth:

    Because chance does not cause anything. It’s just a way of saying that whatever caused something, it was not something you predicted, or could easily have predicted.

    I have tried to explain in detail why that is wrong, but if you just object to the wording, we can reach an agreement.

    Let’s say that not rejecting the null means that a probabilistic explanation, based on our understanding of the system and our probabilistic modeling of it, explains quite well what we observe. For instance, in a test-control experiment, the probabilistic variation due to random sampling can well explain the observed difference.

    Rejecting the null means the opposite: that some other explanation, possibly the causal model hypothesized by the researchers, is reasonably needed.

    So, as you can see, chance does cause things, if we just mean that true deterministic causes, that we model probabilistically because there is no other way to do that, are the best explanation for what we observe.

    That’s the only thing I ever meant, and the only thing necessary for ID theory.

    So, your final statement in your post 19.2:

    Chance doesn’t “cause” anything. But lots of unmodelled factors do. However, under the null hypothesis, only rarely will those unmodelled factors combine to give you results like those you have observed.
    And ID simply does not attempt to model those unmodelled factors. It simply assumes that under the null (no design) the observed data will be very rare. In other words, it assumes what it sets out to demonstrate, which is fallacious.

    is completely unwarranted. It’s not that ID is not “attempting to model unmodelled factors”. It’s simply that those factors (such as what amino acid will change because of random mutation) cannot be modeled explicitly in a necessity form, and must be modelled probabilistically. It’s not some strange position of IDists, it’s the only way to proceed scientifically.

    And ID does not assume anything like what you state. It just tries to model probabilistically the system that, according to neo darwinist theory, is the probabilistic cause of the emergence of genetic information.

  179. material.infantacy

    Hi Elizabeth, It’s been a couple of days, but I wanted to clarify my argument (while I have the chance) and try and address some of the points you raised in 21.1. I hope you had a good new year.

    Let me restate the case I’ve made, that within protein sequence space there exists an objective target space for function which is determined by the laws of physics, and not by the post hoc definition of a given observer. (I don’t mean to be overly repetitious, but I want to restate my argument with regard to some of the points you raised.)

    S = {sequence space}
    F = {folding proteins}
    F1 = {functional proteins}

    F1 ⊆ F ⊂ S
    n(F1) ≤ n(F) < n(S), which implies
    0 < P(F1) ≤ P(F) < 1, for a single trial

    The function n(X) determines the number of elements in a given set X.

    The set S consists of all sequences of a given length n. For a little more specificity, let’s assume that n is greater than or equal to 150.

    n ≥ 150

    The set F consists of those sequences in S which will fold into stable proteins. This set is determined by the laws of physics. This set is unchanging. That is, it’s the same today as it was yesterday and will be tomorrow. This set is deterministic. If a sequence can be folded into a stable three-dimensional structure, then it exists in this set. I’m making the following assumptions.

    n(F) < n(S)

    That is the same as saying that not all sequences will fold properly. In addition, I’m claiming that the set F is a narrow subset of S — there are many more elements in F’ (the complement of F) than there are in F:

    n(F) < n(F’).

    This also implies that for a partition 0 < n(F) < n(S), n(F) is closer to zero than it is to n(S); that is, n(F) / n(S) < 1/2. This implies that P(F) < 0.5 for any single trial.

    The set F1 is a subset of F which consists of all folded proteins that can function in a biological context (any context, past, present, or future). If a protein can possibly be functional and beneficial to any organism at any time, then it exists in set F1. It’s important to note that F1 may, in my assumptions, be equal to F. This is another way of saying that every protein which folds may indeed be functional in some biological context: some organism, at some point in past, present, or future. That is,

    F1 ⊆ F, which implies
    n(F1) ≤ n(F).

    This makes F1 a static set regardless of what RV+NS might do at some point in an organism’s history. The size of the set F1 is bounded by the size of F.

    Given the above, it is clear that F1 is not arbitrary. Let me illustrate. If all of sequence space, the set S, is a dart board, and the set F is a smaller target within the confines of the dart board (or we can say that it is several small targets scattered around the board uniformly), when a dart is thrown by a blindfolded participant, we are far more likely to miss a target in F than we are to hit one.

    That being the case, F1 can’t be arbitrary — striking a functional sequence is less likely than striking a non-functional one. Not only is the probability for F1 closer to zero than to one, but if we know that we struck a space in F then the probability of F1 having occurred is greater.

    P(F1) < P(F1|F)

    Striking a target in F increases the probability that F1 has also been struck.
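
    The set relations above can be checked by brute force on a toy sequence space. Everything in this sketch is a hypothetical stand-in: a two-letter alphabet, length-10 sequences, and arbitrary string rules in place of real folding and function. It only illustrates the inequalities that follow from F1 ⊆ F ⊂ S, and says nothing about how narrow F actually is for proteins.

```python
from itertools import product

ALPHABET = "HP"  # hypothetical two-letter alphabet
LENGTH = 10      # toy sequence length


def folds(seq):
    """Arbitrary stand-in for 'folds into a stable structure' (assumption)."""
    return "HHPP" in seq


def functional(seq):
    """Arbitrary stand-in for 'functional': folds and starts with H (assumption)."""
    return folds(seq) and seq.startswith("H")


S = ["".join(p) for p in product(ALPHABET, repeat=LENGTH)]  # sequence space
F = [s for s in S if folds(s)]                              # folding sequences
F1 = [s for s in F if functional(s)]                        # functional sequences

p_f = len(F) / len(S)
p_f1 = len(F1) / len(S)
p_f1_given_f = len(F1) / len(F)
print(len(S), len(F), len(F1), p_f, p_f1, p_f1_given_f)
```

    Because F1 is contained in F and F is a proper subset of S, P(F1) ≤ P(F) and P(F1) < P(F1|F) hold whatever rules are plugged in for `folds` and `functional`, as long as F1 is non-empty.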

    There are two implications to this:

    1) Striking the target is not as likely as missing it. So the cards analog (Miller’s) takes a dirt nap — one sequence is not as good as any other with respect to function — and although each individual sequence has equal probability (axiomatic), functional sequences do not share an equal probability with non-functional ones: P(F1) < P(F1’) because P(F) < P(F’). So if a functional sequence is found, we can infer a rare event, and not one that is probabilistically insignificant, as is suggested by the cards analogy, and ones like it.

    2) When a target in F1 is struck, the fact that it may be functional is not post hoc but objective. One cannot be accused of fitting the data to some arbitrary notion of function, because not all sequences can be functional; and the ones that can are contained to F. So the notion that the observation of function is dreamt up by pro-design ideologues is incorrect. Function exists within a narrow subset F, the folding proteins, which are determined by the laws of physics. We don’t have protein function outside of F. When we observe F1, we are not drawing an arrow around the target, we are observing an empirically validated, physical constraint on S: the set F.

    The estimate provided by Axe’s research suggests that 10^-74 of all sequences will fold into stable structures. This is provisional. However the notion that it is impossible to estimate how many sequences will fold is untenable, since this is determined by the laws of physics.

    I’ll include again the quote from Meyer which puts this in layman’s terms:

    Since proteins can’t perform functions unless they first fold into stable structures, Axe’s measure of the frequency of folded sequences within sequence space also provided a measure of the frequency of functional proteins—any functional proteins—within that space of possibilities. Indeed, by taking what he knew about protein folding into account, Axe estimated the ratio of (a) the number of 150-amino-acid sequences that produce any functional protein whatsoever to (b) the whole set of possible amino-acid sequences of that length. Axe’s estimated ratio of 1 to 10^74 implied that the probability of producing any properly sequenced 150-amino-acid protein at random is also about 1 in 10^74. In other words, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every 10^74 attempts.
    Meyer, Stephen C. (2009-06-06). Signature in the Cell (pp. 210-211). Harper Collins, Inc.. Kindle Edition.

    Here are some objections you raised, or variations on them (let me know if I’ve missed the point with any of these) along with some comments.

    a) Axe’s research is wrong.

    This may very well be the case. But unless research is proposed which drastically reduces this estimate, we can infer, at the least, that F is very narrow. I do not accept that any sort of reasonable estimate is just plain impossible. This is provisional, of course. Even if it were shown that say, 1 out of every 10^20 sequences can fold into stable proteins, we would still have a narrow subset of S, called F, within which function exists — and so it could not be considered that function resides with arbitrary sequences in regard to S; we would just have a much larger target.

    b) We may find that long, unfolded sequences have some sort of function.

    That might indeed be demonstrated at some point. However empirically, we find the opposite to be true: if sequences fold into stable proteins, there is a possibility of function; otherwise, no. Again, this is provisional.

    c) RV+NS, the NDE mechanism, makes the set F1 non-static. That is, function is determined relative to environmental pressures and so forth, meaning that what is functional is subject to change.

    This is addressed earlier. Any potentially functional sequence, for any point in time, exists in set F1, which is bounded by set F. If there is a contextually selectable functional set, it is a subset of F1, and thereby a subset of F.

    SEL = {functions selectable by N.S.}

    SEL ⊆ F1 ⊆ F

    If NDE can act significantly outside of the bounds of F, it should be demonstrable. This is provisional.

    My claim is that NDE is not free to drift in and out of the set F unconstrained (RV could explore spaces which do not fold, but N.S. cannot select for them, as I understand it). It can certainly try, and it’s reasonable to speculate that such might be possible in principle, but AFAICT empirical observations do not warrant this. Even if it were shown that in limited cases, long polypeptides could serve a function in some narrow context, stable protein function is crucial to biological function and acts as the rule, not the exception; and the rule is important — exceptions do not exist. (That is not to say that unfolded polypeptides will never be shown to have some limited, contextual role within an organism, only that they cannot substitute for folded, highly specific, functional proteins. I think this was one of gpuccio’s points earlier.)

    Thanks for a good discussion. I appreciate that you took some time with my previous comments.

    Best,
    m.i.

  180. Thanks for your responses M.I.

    My holiday is over, but I’ll bookmark the thread and hope to get back to it!

    Some food for thought.

    Thanks!

    Lizzie

  181. Elizabeth:
    Because chance does not cause anything. It’s just a way of saying that whatever caused something, it was not something you predicted, or could easily have predicted.
    I have tried to explain in detail why that is wrong, but if you just object to the wording, we can reach an agreement.
    Let’s say that not rejecting the null means that a probabilistic explanation, based on our understanding of the system and our probabilistic modeling of it, explains quite well what we observe. For instance, in a test-control experiment, the probabilistic variation due to random sampling can well explain the observed difference.
    Rejecting the null means the opposite: that some other explanation, possibly the causal model hypothesized by the researchers, is reasonably needed.
    So, as you can see, chance does cause things, if we just mean that true deterministic causes, that we model probabilistically because there is no other way to do that, are the best explanation for what we observe.

    Well, I think it’s important to be precise. If by “chance causes things” you mean “we made a stochastic model that fit the data”, then fair enough. But why not be precise? Because it’s easy to equivocate accidentally if you use language imprecisely, and in Dembski’s formulation he rejects the “Chance” null and infers design without ever making a stochastic model. And evolutionists would, and do, argue that an appropriate stochastic model fits the data very well.

    That’s the only thing I ever meant, and the only thing necessary for ID theory.

    Well, depends on the theory. It’s a major problem in Dembski’s. And in Behe’s.

    So, your final statement in your post 19.2:
    Chance doesn’t “cause” anything. But lots of unmodelled factors do. However, under the null hypothesis, only rarely will those unmodelled factors combine to give you results like those you have observed.
    And ID simply does not attempt to model those unmodelled factors. It simply assumes that under the null (no design) the observed data will be very rare. In other words, it assumes what it sets out to demonstrate, which is fallacious.

    is completely unwarranted. It’s not that ID is not “attempting to model unmodelled factors”. It’s simply that those factors (such as what amino acid will change because of random mutation) cannot be modeled explicitly in a necessity form, and must be modelled probabilistically. It’s not some strange position of IDists, it’s the only way to proceed scientifically.

    Right. That’s what I’ve been saying! It requires a stochastic model! But I’m not seeing those stochastic models.

    And ID does not assume anything like what you state. It just tries to model probabilistically the system that, according to neo darwinist theory, is the probabilistic cause of the emergence of genetic information.

    Not that I’ve seen. Dembski doesn’t. Do you have a reference?

    Don’t rush, though, because I’m going to have to take another break :)

    I’ve bookmarked the thread though. And thank you for a very interesting conversation. We’ve at least managed a couple of points of agreement :)

    Cheers

    Lizzie

  182. It is possible for Axe’s research to have been correctly done, but not present a problem for evolution.

    Axe did not test an evolutionary scenario as Thornton did. Axe did not ask an evolutionary question, which would be: given two alleles, can you get from one to the other by small steps?

    I’m not convinced that the protein problem doesn’t support evolution rather than present a problem. In the entire history of life there have been only 2000 or so protein domains invented. That’s about one every two million years (and they are spread out over several billion years, so any putative designer would have had to make multiple interventions and many visits).

    Some domains have sequence relatives and some don’t. But not having living relatives does not mean that one is specially created. It simply means that you have no living cousins.

    The most interesting thing about protein domains is that nearly all of them have been invented by microbes, which is what you would expect if it requires large numbers of trials. Microbes have large populations and short reproductive cycles.

    The designer of proteins has been rather stingy with vertebrates. Most evolution of vertebrates has skipped the necessity of new proteins in favor of regulatory networks. And whatever you might say about the objective existence of protein folds, the utility of bone length and such is settled in the arena of reproductive success.

  183. Petrushka:

    I find your comments reasonable this time, although obviously I don’t agree with the substance of them.

    I believe that the 2000 basic protein superfamilies have no sequence relatives. Indeed, the number of unrelated groupings at sequence level in SCOP (less than 10% homology) is about 6000.

    As I have said many times, it is inconceivable that all your “cousins” should systematically die, especially when, if you introduce NS in the scenario, there should be hundreds of thousands of them (or probably more).

    The majority of protein domains have been “invented by microbes” (indeed, a little more than 50%). But it is true that only a small number is found in higher animals.

    The reason for that is not necessarily the one you give (although that could contribute). Another reason can well be that most relevant biochemical activities have already been designed at that stage, and regulatory networks become more important than ever.

    I don’t deny the utility of bone lengths, only I would like to know the molecular basis. Simple dimensions could be regulated by simple factors, while morphological and functional connections seem better candidates for complex regulatory information.

  184. As I have said many times, it is inconceivable that all your “cousins” should systematically die

    The systematic elimination of less functional protein coding sequences is no more mysterious than the systematic extinction of less intelligent hominids.

    Or the systematic extinction of marsupials from South America when a land bridge enabled the crossing over of more competitive mammals.

    I concede that research needs to be done to demonstrate a path to protein coding. It appears that this kind of research will be very difficult, possibly impossible for the foreseeable future.

    So I suspect you will be able to hold on to your opinion.

  185. Petrushka:

    I concede that research needs to be done to demonstrate a path to protein coding.

    Thank you for conceding. I agree, anyway.

    It appears that this kind of research will be very difficult, possibly impossible for the foreseeable future.

    I am more optimistic. And I really look forward to that.

    So I suspect you will be able to hold on to your opinion.

    And you to yours. At least for a while. Not too much, I hope :)

  186. Elizabeth:

    You say:

    If by “chance causes things” you mean “we made a stochastic model that fit the data”, then fair enough. But why not be precise? Because it’s easy to equivocate accidentally if you use language imprecisely, and in Dembski’s formulation he rejects the “Chance” null and infers design without ever making a stochastic model. And evolutionists would, and do, argue that an appropriate stochastic model fits the data very well.

    I don’t really think that your comments about Dembski, and Behe, are correct.

    The point in ID is that the random system imagined by darwinists cannot explain the data, not even with the introduction of NS. I think Dembski has stated very clearly that he assumes a uniform distribution for the random system, that is the only reasonable thing to do. Then, he establishes the UPB (indeed, too generous a threshold) as a limit of what a random system can empirically explain.

    Maybe he has not gone into the biological details (but Behe certainly has).

    Anyway, while waiting to understand better the nature of your objections to Dembski and Behe, I will try to analyze the biological random system for you, and to show that it cannot explain data. I have really already done that, but it could be useful to review the reasoning in detail and in order, now that maybe we have clarified some epistemological points.

    In next post.

  187. Elizabeth:

    Now, let’s model the system of random variation in a reasonable biological model. To do that, I will try to do what darwinists never do: to be specific, and to refer always to explicit causal models.

    a) First of all, what are the data that we are trying to explain. For consistency, I will stick to my usual scenario: the emergence, in the course of natural history, of about 2000 – 6000 unrelated protein domains (according to how we group them). That kind of event has happened repeatedly in natural history, and I believe that we can agree that those 2000 or so protein domains are the essential core of biological function in life as we know it. IOWs, they are very important empirical data that need to be causally explained.

    b) The main characteristics of our 2000 or so basic protein domains are:

    b1) they are completely unrelated at sequence level: less than 10% homology.

    b2) they have specific and different 3D structure due to very good and stable folding.

    b3) they have specific and different biochemical function (biochemical activity, if you prefer), that can be objectively defined and measured.

    b4) There are no known functional intermediates that bridge those groups, neither in the proteome nor in the lab.

    c) Neo darwinism has a tentative explanation for the emergence of those protein structures: they would be the result of RV + NS.

    d) Let’s leave alone, for the moment, NS, and consider RV. What does RV mean? It means any form of modification of the genome that is not designed and whose result cannot be described by a necessity model, but only by a probabilistic model. So, RV includes many proposed mechanisms: single point mutation, insertions and deletion, inversion, frameshift mutation, chromosomal modifications, duplications, sexual rearrangements, and so on.

    e) Now, why do we call all those mechanisms “RV”? As we have discussed, each of those mechanisms is operated in accord to biochemical necessity laws. But the result of the variation cannot in any way be anticipated by a necessity model. It can only be described probabilistically.

    Let’s take, for example, single point mutation: It can occur at any site, even if some sites are more likely than others. We can never say, by biochemical computations, which nucleotide will change as a result of some duplication error. But we can try to describe those events probabilistically. That is true for all the “variation engines” considered. Even drift is random, although in a strict sense it does not change the genome, but only the representations of some genomes in the population.

    So, we can say that all variation that changes the genome (excluding possible design interventions) is random, in the sense that it can only be described probabilistically.

    f) But how can we describe probabilistically those variations? It’s not easy because, as we have said, not all events have the same probability. The probability of a single point mutation is not the same as that of an inversion, for example. Different biochemical mechanisms explain different forms of variation.

    g) There is a way, however, to simplify the reasoning, if we stick to an explicit scenario: the emergence of a new protein domain.

    h) In principle, there are at least two ways to generate a functional sequence of codons corresponding to a functional sequence of AAs, IOWs to generate a functional protein coding gene:

    h1) It could emerge gradually “from scratch”, starting with a codon and adding the others.

    h2) It could emerge by a random walk from some existing sequence, by gradual or less gradual modifications of the sequence.

    I will stick to the second scenario, because I believe that nobody is really proposing the first.

    i) So, what we need in our scenario is:

    i1) a starting sequence

    i2) a series of modifications that realize a random walk

    i3) a final result

    Now, I can already hear your objections about the target, and so on. But try to understand the context. We are trying to explain how existing functional domains originated. What I am doing is exploring the probability that one specific existing functional protein domain originated by means of RV such as we observe in biological systems. I will deal after with problems such as the contribution of NS, or the idea that other functional sequences could have arisen. So, please follow me.

    j) The starting sequence. I will stick to the more common scenarios:

    j1) the starting sequence is an existing, unrelated gene (IOWs, a gene coding for some other protein, with a different basic domain, different sequence, different structure, different function).

    j2) the starting sequence is an existing piece of non coding DNA.

    Now, the only way to simplify the discussion is to stress that the starting sequence and the final sequence are totally unrelated at sequence level, and that all the variations happen at sequence level. That implies that, if we call A the starting sequence, and B the final sequence, A must completely lose its characteristic primary sequence, to get to B. In the same way, A has to lose its 3D structure and function.

    Now, the fundamental point: once A changes so much that it becomes unrelated to its original sequence, at that point all unrelated states have more or less the same probability to be reached by a random walk.
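
    The claim about an unconstrained walk can be looked at with a toy simulation. This is only a sketch under strong assumptions that are not stated in the comment: fixed length, substitutions only (no insertions, deletions or duplications), every site and every residue equally likely, and no selection at all. The names `neutral_walk` and `identity` are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def identity(a, b):
    """Fraction of aligned positions (no gaps) where two sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def neutral_walk(start, n_steps, rng):
    """Apply n_steps random substitutions, each at a uniformly chosen site
    with a uniformly chosen residue (which may equal the old one)."""
    seq = list(start)
    for _ in range(n_steps):
        pos = rng.randrange(len(seq))
        seq[pos] = rng.choice(AMINO_ACIDS)
    return "".join(seq)

rng = random.Random(1)
start = "".join(rng.choice(AMINO_ACIDS) for _ in range(150))
early = neutral_walk(start, 10, rng)    # still close to the start
late = neutral_walk(start, 5000, rng)   # identity near the chance level
```

    Under these assumptions the identity to the starting sequence decays toward the chance level of about 1 in 20. Whether a real lineage is ever free of selective constraint for that long is exactly what is disputed in the replies.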

    More on that in next post.

  188. Elizabeth:

    Now, let’s model the system of random variation in a reasonable biological model. To do that, I will try to do what darwinists never do: to be specific, and to refer always to explicit causal models.

    I do wish you’d stop throwing out these completely unsupported, and IMO completely erroneous generalisations. You can’t get a scientific paper published (or only with difficulty) if you aren’t specific, and, if your hypothesis is a causal one, without specifying your causal model. And yet, tens of thousands of papers on evolutionary biology are published each year.

    a) First of all, what are the data that we are trying to explain. For consistency, I will stick to my usual scenario: the emergence, in the course of natural history, of about 2000 – 6000 unrelated protein domains (according to how we group them). That kind of event has happened repeatedly in natural history, and I believe that we can agree that those 2000 or so protein domains are the essential core of biological function in life as we know it. IOWs, they are very important empirical data that need to be causally explained.

    OK, so you are talking specifically about the evolution of protein domains.

    Yes, we can agree that the protein domains that are the core of life as we know it are the core of life as we know it. We do not, of course, know that they are the core of life as it might have been had some things turned out a bit differently. Hence my repeated reminder of the importance of correct specification of the null hypothesis.

    b) The main characteristics of our 2000 or so basic protein domains are:
    b1) they are completely unrelated at sequence level: less than 10% homology.
    b2) they have specific and different 3D structure due to very good and stable folding.
    b3) they have specific and different biochemical function (biochemical activity, if you prefer), that can be objectively defined and measured.
    b4) There are no known functional intermediates that bridge those groups, neither in the proteome nor in the lab.
    c) Neo darwinism has a tentative explanation for the emergence of those protein structures: they would be the result of RV + NS.
    d) Let’s leave alone, for the moment, NS, and consider RV. What does RV mean? It means any form of modification of the genome that is not designed and whose result cannot be described by a necessity model, but only by a probabilistic model. So, RV includes many proposed mechanisms: single point mutation, insertions and deletion, inversion, frameshift mutation, chromosomal modifications, duplications, sexual rearrangements, and so on.
    e) Now, why do we call all those mechanisms “RV”? As we have discussed, each of those mechanisms is operated in accord to biochemical necessity laws. But the result of the variation cannot in any way be anticipated by a necessity model. It can only be described probabilistically.
    Let’s take, for example, single point mutation: It can occur at any site, even if some sites are more likely than others. We can never say, by biochemical computations, which nucleotide will change as a result of some duplication error. But we can try to describe those events probabilistically. That is true for all the “variation engines” considered.

    More or less.

    Even drift is random, although in a strict sense it does not change the genome, but only the representations of some genomes in the population.

    Using random in the sense in which you are using it, of course drift is random. That’s why it’s called drift! It belongs in the “NS” part of your discussion, though, not the “RV” part. As you say, it does not change the genome. It is part of the process by which certain genotypes become more prevalent, as is NS. But you can dispense with drift as an additional factor simply by modelling differential reproduction stochastically.

    So, we can say that all variation that changes the genome (excluding possible design interventions) is random, in the sense that it can only be described probabilistically.
    f) But how can we describe probabilistically those variations? It’s not easy because, as we have said, not all events have the same probability. The probability of a single point mutation is not the same as that of an inversion, for example. Different biochemical mechanisms explain different forms of variation.

    Exactly. Both differential reproduction and variation generation are stochastic processes based on what you would call “necessity laws”. Everything is caused by something, it’s just that to model it, you have to take a stab at the probability distribution of those causal events. And remember also that we are dealing with a dynamic system here, not a static one, in which the state of the system at any given time strongly constrains what happens next. In other words, given Genome X, the probability that its descendent will resemble it very closely is astronomically higher than the probability that it will not.

    g) There is a way, however, to simplify the reasoning,

    There is indeed :)

    if we stick to an explicit scenario: the emergence of a new protein domain.
    h) In principle, there are at least two ways to generate a functional sequence of codons corresponding to a functional sequence of AAs, IOWs to generate a functional protein coding gene:
    h1) It could emerge gradually “from scratch”, starting with a codon and adding the others.
    h2) It could emerge by a random walk from some existing sequence, by gradual or less gradual modifications of the sequence.
    I will stick to the second scenario, because I believe that nobody is really proposing the first.

    Right. And a random walk is indeed the scenario I described, in which the state the sequence is in at Time t strongly constrains the state it is in at Time t+1.

    i) So, what we need in our scenario is:
    i1) a starting sequence
    i2) a series of modifications that realize a random walk
    i3) a final result
    Now, I can already hear your objections about the target, and so on.

    Nope that’s fine. I accept that your target is a sequence that codes for a protein domain.

    But try to understand the context. We are trying to explain how existing functional domains originated. What I am doing is exploring the probability that one specific existing functional protein domain originated by means of RV such as we observe in biological systems. I will deal after with problems such as the contribution of NS, or the idea that other functional sequences could have arisen. So, please follow me.
    j) The starting sequence. I will stick to the more common scenarios:
    j1) the starting sequence is an existing, unrelated gene (IOWs, a gene coding for some other protein, with a different basic domain, different sequence, different structure, different function).
    j2) the starting sequence is an existing piece of non coding DNA.
    Now, the only way to simplify the discussion is to stress that the starting sequence and the final sequence are totally unrelated at sequence level,

    What do you mean by “unrelated”? One will be the direct descendent of the other! Or do you mean “completely different” sequence – if so, why?

    and that all the variations happen at sequence level.

    As opposed to what?

    That implies that, if we call A the starting sequence, and B the final sequence, A must completely lose its characteristic primary sequence, to get to B. In the same way, A has to lose its 3D structure and function.

    Why? And what do you mean by “its 3D structure and function”? You mean that the protein it codes for has to not be coded for at some point in the change? What?

    Now, the fundamental point: once A changes so much that it becomes unrelated to its original sequence, at that point all unrelated states have more or less the same probability to be reached by a random walk.

    But why should it ever reach that state? Why are you insisting that a new domain (“B”) must be totally dissimilar to its parent, “A”? That seems to me a quite unsafe assumption. I know there is a fair amount of research into the origins of protein domains, but obviously it’s not my field. But what makes you think that this research is wrong?

  189. Elizabeth:

    Well, it seems we are beginning to communicate better.

    Before going on with the reasoning, let’s try to clarify the small misunderstandings we still have.

    Using random in the sense in which you are using it, of course drift is random. That’s why it’s called drift! It belongs in the “NS” part of your discussion, though, not the “RV” part. As you say, it does not change the genome. It is part of the process by which certain genotypes become more prevalent, as is NS. But you can dispense with drift as an additional factor simply by modelling differential reproduction stochastically.

    That’s OK for me: NS and drift act in similar ways: changing the representation of some genome in the population. Both cannot change the genome. And still, it’s important to remember that drift, unlike NS, is totally random because the fact that some gene becomes more represented in the population because of drift has no necessity relation with the gene itself (the effect is random), while NS expands, or reduces, the representation of genes according to a necessity relation (the causal effect of the specific variation on reproduction). If we agree on that, we can go on.

    What do you mean by “unrelated”? One will be the direct descendent of the other! Or do you mean “completely different” sequence – if so, why?

    Yes, it means that A and B have completely different primary sequences. That is already implied by the premise (they belong to different basic domains, and basic domains have less than 10% homologies, which is more or less the random similarity you can expect between completely different random sequences of that type and length).
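
    As a rough check on the "random similarity" figure, the expected identity between two independent random sequences can be computed directly. This sketch assumes an ungapped, position-by-position comparison and independent sites; `expected_identity` is a hypothetical helper, and the skewed composition in the second call is made up for illustration.

```python
def expected_identity(freqs):
    """Expected fraction of matching positions between two independent
    random sequences whose residues are drawn with relative weights `freqs`.
    For each position the match probability is the sum of p_i squared."""
    total = sum(freqs)
    probs = [f / total for f in freqs]
    return sum(p * p for p in probs)

uniform = expected_identity([1] * 20)            # 20 equally likely residues: 1/20
skewed = expected_identity([5] * 4 + [1] * 16)   # hypothetical uneven composition
```

    With 20 equally likely residues the expectation is 5%, and uneven residue frequencies push it higher, so a figure somewhat below 10% is in the range of chance similarity. Real alignments allow gaps, which changes the baseline.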

    IOWs, here we are not discussing how a protein in a superfamily can be transformed into another protein of the same superfamily, with high homology, similar structure, and similar, or slightly different, function. We are discussing how a new protein domain can emerge from an existing protein domain, or from some non coding DNA. That’s why the starting sequence is certainly unrelated to the final one in the case of a protein belonging to another, previously existing, protein domain. In the case of non coding DNA, we are not sure of anything, but obviously there is no reason in the world why non coding DNA should be potentially related to the new protein domain, except in the case that it was designed for that.

    and that all the variations happen at sequence level. As opposed to what?

    What I mean is that it is the primary sequence that varies, at each random event. Therefore, if the primary sequences are unrelated and therefore distant in the search space, the variation has to traverse the search space anyway.

    We can imagine the search space with the following topology: the distance between states (sequences) is defined as the percent of homology. Sequences with less than 10% homology are absolutely distant in the search space. And that’s exactly our case. No protein A can become protein B, with a completely different sequence, without passing through the “distance” that separates the two states.

    Why? And what do you mean by “its 3D structure and function”? You mean that the protein it codes for has to not be coded for at some point in the change? What?

    Well, if the gene has been duplicated, and the original gene is kept functional by the effect of negative NS, then the duplicated gene is free to change in neutral mode (that would be the subject of my next post, if we can arrive there). The same is true for the start from non coding DNA.

    If the variation must happen in a functional gene, I really don’t know how that could work. The emergence of the new function would inevitably imply, long before the new function emerges, the loss of the old function. I suppose that’s why darwinists very early elaborated the model of gene duplication.

    But if the duplicated gene changes in neutral mode, it will very early become unrelated to the original gene, while being at the same time unrelated to the new gene (unless design guides the transition, and probably even in that case).

    At this point, I could anticipate my next point. But let’s first finish our work on what has already been said.

    But why should it ever reach that state? Why are you insisting that a new domain (“B”) must be totally dissimilar to its parent, “A”? That seems to me a quite unsafe assumption. I know there is a fair amount of research into the origins of protein domains, but obviously it’s not my field. But what makes you think that this research is wrong?

    Again, the difference between A and B is implied by the fact that they belong to different basic domains, or superfamilies. And, as I have already said, as far as I know there is no model that shows intermediates between protein domains, either in the proteome, or in the lab. I have been saying these things for years now, and no interlocutor has ever shown any “research about the origin of protein domains” that contradicts that. So, please, show me the research you think of, and in that case I will try to explain why it is wrong.

    But I am not holding my breath…

    So, if you answer these points, I will go on about the role of NS and the computation of probabilities.

  190. Thank you for conceding. I agree, anyway.

    It would be good of you to respond to the business of cousin extinction. We see it all the time when one variant is noticeably superior to another.

    My concession is that as the technology permits, it needs to be addressed.

  191. GPuccio,

    So, if you answer these points, I will go on about the role of NS and the computation of probabilities.

    Could I ask you to continue regardless of whether or not you get the feedback. It is really interesting. Cheers.

  192. Elizabeth:
    Well, it seems we are beginning to communicate better.
    Before going on with the reasoning, let’s try to clarify the small misunderstandings we still have.
    Using random in the sense in which you are using it, of course drift is random. That’s why it’s called drift! It belongs in the “NS” part of your discussion, though, not the “RV” part. As you say, it does not change the genome. It is part of the process by which certain genotypes become more prevalent, as is NS. But you can dispense with drift as an additional factor simply by modelling differential reproduction stochastically.
    That’s OK for me: NS and drift act in similar ways: changing the representation of some genome in the population. Both cannot change the genome. And still, it’s important to remember that drift, unlike NS, is totally random because the fact that some gene becomes more represented in the population because of drift has no necessity relation with the gene itself (the effect is random), while NS expands, or reduces, the representation of genes according to a necessity relation (the causal effect of the specific variation on reproduction). If we agree on that, we can go on.

    Well, as I said, you can model the two together by modelling differential reproduction stochastically, and including a bias to represent natural selection. I’d be very unhappy about calling that a “necessity relation” because there is no “necessity” that any one phenotypic feature will always result in increased (or reduced) reproduction. Whether that feature is “selected” or not has a statistical answer, not a “necessity” answer. This is why I think that the chance vs necessity distinction is unhelpful, and potentially misleading.
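
    Modelling differential reproduction stochastically with a bias can be sketched as a haploid Wright-Fisher simulation. This is a textbook toy model chosen for illustration, not something either commenter specified; the population size, selection coefficient `s` and run counts below are arbitrary assumptions.

```python
import random

def wright_fisher(pop_size, init_count, s, generations, rng):
    """Track copies of a variant. Each generation is a binomial resample of
    the population; s = 0 gives pure drift, s > 0 biases sampling toward
    the variant. Returns the count after every generation."""
    count = init_count
    history = [count]
    for _ in range(generations):
        if 0 < count < pop_size:
            p = count / pop_size
            # selection enters only as a shift in sampling probability
            p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
            count = sum(rng.random() < p_sel for _ in range(pop_size))
        history.append(count)
    return history

def fixation_fraction(pop_size, init_count, s, generations, runs, seed=0):
    """Fraction of independent runs in which the variant takes over."""
    rng = random.Random(seed)
    fixed = sum(
        wright_fisher(pop_size, init_count, s, generations, rng)[-1] == pop_size
        for _ in range(runs)
    )
    return fixed / runs

neutral = fixation_fraction(50, 5, 0.0, 300, 200)   # drift only
favoured = fixation_fraction(50, 5, 0.2, 300, 200)  # drift plus a bias
```

    In this sketch the favoured variant fixes more often than the neutral one, but it is still lost in some runs, which is the sense in which selection has a statistical rather than an all-or-nothing answer.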

    What do you mean by “unrelated”? One will be the direct descendent of the other! Or do you mean “completely different” sequence – if so, why?
    Yes, it means that A and B have completely different primary sequences. That is already implied by the premise (they belong to different basic domains, and basic domains have less than 10% homologies, which is more or less the random similarity you can expect between completely different random sequences of that type and length).

    This seems circular. Clearly the descendent of one domain will be “related” to the other (by definition). After many intervening generations and changes, there may be no commonality, but that doesn’t mean they are unrelated.

    IOWs, here we are not discussing how a protein in a superfamily can be transformed into another protein of the same superfamily, with high homology, similar structure, and similar, or slightly different, function. We are discussing how a new protein domain can emerge from an existing protein domain, or from some non coding DNA. That’s why the starting sequence is certainly unrelated to the final one in the case of a protein belonging to another, previously existing, protein domain. In the case of non coding DNA, we are not sure of anything, but obviously there is no reason in the world why non coding DNA should be potentially related to the new protein domain, except in the case that it was designed for that.

    But you are assuming that two dissimilar proteins are unrelated. They may not be – it may simply be that the intermediate versions are no longer extant. Think of a word chain:
    Hand
    Hard
    Hold
    Hood
    Food
    Foot.
    Foot is a “new domain”. If hard, hold, hood, and food are no longer extant, we cannot assign it to the same “family” as “hand”. But that doesn’t mean that “hand” wasn’t ancestral to it.
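
    The word chain can be checked mechanically. The sketch below counts differing letters between neighbours in the chain exactly as listed, and between the two ends; `hamming` is a hypothetical helper name.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

chain = ["hand", "hard", "hold", "hood", "food", "foot"]

# letters changed at each step of the chain
step_changes = [hamming(a, b) for a, b in zip(chain, chain[1:])]  # [1, 2, 1, 1, 1]

# letters the first and last word still share, position by position
end_to_end_shared = len(chain[0]) - hamming(chain[0], chain[-1])  # 0
```

    Each step changes one letter, except "hard" to "hold", which changes two as written. The endpoints share no letter in any position, so similarity alone would not reveal the ancestry if the intermediates were lost.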

    and that all the variations happen at sequence level. As opposed to what?
    What I mean is that it is the primary sequence that varies, at each random event. Therefore, if the primary sequences are unrelated and therefore distant in the search space, the variation has to traverse the search space anyway.
    We can imagine the search space with the following topology: the distance between states (sequences) is defined as the percent of homology. Sequences with less than 10% homology are absolutely distant in the search space. And that’s exactly our case. No protein A can become protein B, with a completely different sequence, without passing through the “distance” that separates the two states.

    I don’t know what “absolutely distant” means. And I don’t see why you are assuming that the distance can’t be traversed.

    Why? And what do you mean by “its 3D structure and function”? You mean that the protein it codes for has to not be coded for at some point in the change? What?
    Well, if the gene has been duplicated, and the original gene is kept functional by the effect of negative NS, then the duplicated gene is free to change in neutral mode (that would be the subject of my next post, if we can arrive there). The same is true for the start from non coding DNA.

    Well, either is free to change as long as the other remains functional (if it’s important).

    If the variation must happen in a functional gene, I really don’t know how that could work. The emergence of the new function would inevitably imply, long before the new function emerges, the loss of the old function. I suppose that’s why darwinists very early elaborated the model of gene duplication.

    Well, that’s certainly one possible mechanism.

    But if the duplicated gene changes in neutral mode, it will very early become unrelated to the original gene, while being at the same time unrelated to the new gene (unless design guides the transition, and probably even in that case).

    Well, I still don’t know what you mean by “unrelated”.

    At this point, I could anticipate my next point. But let’s first finish our work on what has already been said.
    But why should it ever reach that state? Why are you insisting that a new domain (“B”) must be totally dissimilar to its parent, “A”? That seems to me a quite unsafe assumption. I know there is a fair amount of research into the origins of protein domains, but obviously it’s not my field. But what makes you think that this research is wrong?
    Again, the difference between A and B is implied by the fact that they belong to different basic domains, or superfamilies. And, as I have already said, as far as I know there is no model that shows intermediates between protein domains, either in the proteome, or in the lab. I have been saying these things for years now, and no interlocutor has ever shown any “research about the origin of protein domains” that contradicts that. So, please, show me the research you think of, and in that case I will try to explain why it is wrong.

    This is not my field, but there seems to be a substantial literature on protein domain evolution. I think you are possibly making the error of regarding “protein domains” as some absolute category (rather like “species”, which often suffers from a similar problem) as opposed to a convenient simplifying categorisation by scientists to refer to common evolutionary units.
    Have you done a literature search?

    But I am not holding my breath…
    So, if you answer these points, I will go on about the role of NS and the computation of probabilities.

    OK. I’m sort of interested in how you compute the probabilities, but I’m a bit concerned by some of your assumptions.

  193. Elizabeth:

    I will be brief about your concerns, because otherwise I will never get to the other points.

    a) The necessity relation is the way variation affects reproduction according to a cause-effect relation between the variation in biochemical function of the affected protein and the reproductive fitness. This is a necessity relation, although you seem to do your best not to admit it. That random components can influence the final reproductive result is true, but in no way does this fact “cancel” the causal relationship between protein function variation and differential reproduction. That causal relation is what interests us, because all the rest (the random components) will not influence the basic computation of probabilities, while the causal influence of protein function variation, being not random, will have a definite effect on the probabilistic scenario, as I am going to show.

    This seems circular. Clearly the descendant of one domain will be “related” to the other (by definition). After many intervening generations and changes, there may be no commonality, but that doesn’t mean they are unrelated.

    No. That’s not the meaning I am giving to “unrelated”. As I have clearly stated, by “unrelated” I mean that they have no sequence homology. Nothing more, nothing less. I ask you to accept this meaning for the following discussion.

    But you are assuming that two dissimilar proteins are unrelated. They may not be – it may simply be that the intermediate versions are no longer extant. Think of a word chain:

    I am not discussing at this point the “possible” intermediates. That is another discussion, for another moment. Here, unrelated just means “with no primary sequence homology”. Nothing else.

    I don’t know what “absolutely distant” means. And I don’t see why you are assuming that the distance can’t be traversed.

    Absolutely distant just means that there is the highest distance in the sequence space, because the two sequences have no homology. Two sequences cannot be more distant than that. And I have never said, least of all “assumed”, that the distance “cannot be traversed”. What I said is “No protein A can become protein B, with a completely different sequence, without passing through the “distance” that separates the two states.” That is not “the distance cannot be traversed”. It is, on the contrary, “the distance must be traversed”, if we have to reach B from A. Do you think it’s the same concept? I have to ask you to stick more precisely to what I say, if we have to go on with the discussion.

    Well, either is free to change as long as the other remains functional (if it’s important).

    Of course.

    Well, I still don’t know what you mean by “unrelated”.

    Well, I hope at least that is now clear.

    This is not my field, but there seems to be a substantial literature on protein domain evolution. I think you are possibly making the error of regarding “protein domains” as some absolute category (rather like “species”, which often suffers from a similar problem) as opposed to a convenient simplifying categorisation by scientists to refer to common evolutionary units.
    Have you done a literature search?

    It is clear that this is not your field. There is no explanation of the origin of protein domains, as you can see from Axe’s paper about that issue. My definition of protein domains is taken from the literature, and from SCOP, the database of the known proteome classification. The classification I often use (2000 independent, unrelated domains) corresponds to SCOP’s concept of “protein superfamilies”. The result of 6000 groupings sharing less than 10% homology is taken from SCOP, too.

    The following is from the SCOP site:

    “Classification

    Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, superfamily and fold, described below. The exact position of boundaries between these levels are to some degree subjective. Our evolutionary classification is generally conservative: where any doubt about relatedness exists, we made new divisions at the family and superfamily levels. Thus, some researchers may prefer to focus on the higher levels of the classification tree, where proteins with structural similarity are clustered.

    The different major levels in the hierarchy are:

    Family: Clear evolutionary relationship
    Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absence of high sequence identity; for example, many globins form a family though some members have sequence identities of only 15%.

    Superfamily: Probable common evolutionary origin
    Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable are placed together in superfamilies. For example, actin, the ATPase domain of the heat shock protein, and hexokinase together form a superfamily.

    Fold: Major structural similarity
    Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases, these differing peripheral regions may comprise half the structure. Proteins placed together in the same fold category may not have a common evolutionary origin: the structural similarities could arise just from the physics and chemistry of proteins favoring certain packing arrangements and chain topologies.”

    As you can see, at superfamily level, it is only “probable” that the proteins in the same superfamily share a common evolutionary origin. Therefore, the 2000 superfamilies grouping is still a very “generous” grouping. I usually take that number because I believe it is a fair intermediate between the “1000 foldings” concept and the “6000 groupings sharing less than 10% homology” concept.

    That’s it for your last comments. Now, on to probabilities.

  194. Elizabeth (and Eugene):

    I think the best way to go on is tracing my general reasoning, and then going into details.

    So, here it is:

    a) If biological systems were modified only by random variation, any transition could be simply modeled probabilistically.

    Let’s follow this reasoning just to start:

    a1) Our problem is how a new basic protein domain (a new protein superfamily) emerges in the course of evolution. I think we can agree that the general models are practically all transition models. I don’t believe that anyone believes that a new sequence is built “from scratch”, just adding one nucleotide to the other. So, I will focus on transition models (indeed, the reasoning could be very similar also for non transition models).

    a2) In a transition model, the new superfamily in some way must arise from existing sequences. To be general enough, we will consider three variants: a pre-existing functional protein gene, a duplicated and inactivated gene, and a non coding sequence.

    a3) In all three cases, the original sequence is unrelated to the final sequence, in the sense I have defined. If it is a functional protein gene, because it is part of another superfamily. If it is a duplicated, inactivated gene, the reason is the same. If it is a non coding DNA sequence, there is simply no reason to believe that any non coding sequence should be near, at sequence level, to what we will obtain: that is simply too unlikely, and would be assuming that the starting point for some strange reason has already “almost found” the target.

    b) The transition from A to B, in the measure that it is only the result of RV (of all kinds, including drift) can be described as a random walk. Each new state generated by any variation event is a “step” in the random walk, and a probabilistic “attempt” at generating a new sequence. Drift does not generate new states, but changes the representation of existing states in the population. But, being a random phenomenon that can affect any existing state in the same way, it does not change the probabilistic scenario.

    c) How can we describe probabilistically that scenario, where only RV acts, and some specific functional target emerges? Our first approach will be to ask: what is the probability to reach B (the specific functional target that did emerge in natural history) from A (any unrelated starting sequence)?

    d) I will make now a very simple statement: if A is completely free to change, then our search space is essentially made of two gross subsets: a much smaller one (near-A), which could be defined as all the sequences that share some homology with A (let’s say more than 10% homology); and a much bigger one (not-near-A), all the rest of possible sequences.

    e) It should be obvious that we cannot simply apply a uniform distribution to the search space. Indeed, the sequences in near-A are much more likely to be reached by a random walk, and in a smaller number of steps. The probability of being reached is grossly dependent on how similar each sequence is to A.

    f) But B, by definition, is in not-near-A. Now, it should be equally obvious that all sequences in not-near-A have more or less the same probability of being reached by a random walk starting from A. If n_SP is the number of sequences in the search space, 1/n_SP is the probability of each sequence of being reached if we apply a uniform probability distribution. As the total probability is the sum of the probability of reaching a state in near-A + the probability of reaching a state in not-near-A, and as the states in near-A are more likely than the states in not-near-A, we can safely assume that the states in not-near-A have a lower probability than 1/n_SP, and that their probability can be considered grossly uniform in not-near-A. That means simply that if we take the probability of each unrelated state to be simply 1/n_SP, we are certainly overestimating it.
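
    The step in (f) can be written out as a short worked inequality. This is only a sketch of the argument as stated: the symbols N, n_near and p(s) are introduced here for illustration, and the premise that every near-A state is more probable than the uniform value is the assumption made in (e), not something derived.

```latex
% Notation (introduced for this sketch):
%   N        = n_{SP}, the number of sequences in the search space
%   n_{near} = number of sequences in near-A
%   p(s)     = probability that the random walk from A reaches sequence s
\begin{align*}
\sum_{s \in \text{near-A}} p(s) \;+\; \sum_{s \in \text{not-near-A}} p(s) &= 1 \\
\text{Premise from (e): } p(s) > \tfrac{1}{N} \ \text{for every } s \in \text{near-A}
  \;\Longrightarrow\; \sum_{s \in \text{near-A}} p(s) &> \frac{n_{near}}{N} \\
\Longrightarrow\; \sum_{s \in \text{not-near-A}} p(s) &< \frac{N - n_{near}}{N} \\
\Longrightarrow\; \bar{p}_{\text{not-near-A}}
  = \frac{1}{N - n_{near}} \sum_{s \in \text{not-near-A}} p(s) &< \frac{1}{N}
\end{align*}
```

    Note that this bounds only the average probability over not-near-A. Applying it to each individual unrelated state relies on the further assumption, stated in (f), that the probability is grossly uniform within not-near-A.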

    I will stop here for now, and wait for comments (Eugene, you are obviously invited to the discussion)

  195. Dr Liddle,

    As regards word chains it is a stretch of imagination to assume that semantics can change “en route”. If I now all of a sudden start overriding English words with new meanings without telling you about it, that will be the end of our communication. For anything like word chains to happen in practice one needs to assume that the sender and the receiver must a-priori agree (or synchronously be told) on semantics for the entire length of the word chain. Semantics is totally independent of physicality. Another example is Ken Miller’s suggested preadaptational use of Behe’s malfunctional mousetrap as a catapult: the system must be told of the alternative use a priori. Haphazard semantic switching is nonsensical.

    First, someone has to decide what information needs to be passed and only then is its semantics instantiated into physicality. Upon massive observation, semantics goes first, while its instantiation happens second. In other words, (new) semantics cannot emerge spontaneously without agency. There are simply no observations whatsoever to support the assertion that the opposite can ever be the case.

  196. Well obviously it is not an analogy to push very far! I just used it to demonstrate that there are two senses of “relatedness” here, and there is a danger of conflating them.

  197. It is clear that this is not your field. There is no explanation of the origin of protein domains, as you can see from Axe’s paper about that issue.

    No, I cannot see that. And I see other papers that do attempt explanations.

    So why should I (as a non-expert) accept your assertion (or Axe’s) that there is no explanation?

    My fundamental point here (and where I do have some expertise) is not the details of protein evolution, but on the method by which you calculate the probability when you simply do not have the information with which to calculate all possible evolutionary paths, nor, if you did, to calculate what possible reproductive benefit, in what environments, each intermediate step might confer.

  198. Why should we believe that A becomes B?

  199. Elizabeth:

    And I see other papers that do attempt explanations.

    Please, quote a paper that gives an explicit evolutionary path for a new protein domain. Otherwise, accept that no such path has ever been shown.

    My fundamental point here (and where I do have some expertise) is not the details of protein evolution, but on the method by which you calculate the probability when you simply do not have the information with which to calculate all possible evolutionary paths

    If that is your position, we can stop here. I have said many times that “possible evolutionary paths” are not a scientific argument, unless they are shown to exist. I will calculate the probability according to what is known, not according to what is imagined or just hoped. This is science, and not a silly game.

    So, if you want to go on with a scientific methodology, let’s go on, but know that I will not consider “possible unknown evolutionary paths”, of which nobody has ever shown that they are possible, as a scientific argument, and I will never accept that it is my duty to show that they are not logically possible. I am satisfied that they have never been found, and therefore they do not exist empirically as part of an explanation.

  200. Petrushka:

    Well, B appears at some point. Either some A becomes B, or B is built from scratch. I am discussing the transition scenario, because it is by far the most accepted, even in darwinian reasoning. The “from scratch” scenario is not better for darwinists, as should be obvious.

  201. GPuccio,

    I believe the computation of probabilities should reflect how you define your neighborhood. A neighbourhood of a state s is a set of states S* reachable from s in one move. A move is clearly a local modification of the configuration of a system in a given state. One can define a neighborhood operator mapping from a given state s to its neighbor states in S*.

    Now, there can be local search operators which can induce what is called very large neighbourhoods whereby a lot of states are reachable in one move. It all depends how you define your move operators, i.e. what your system is allowed to do in one move. E.g. what you define as not-near for one neighbourhood operator, will in fact be within one move for a different operator. That is in fact the only thing I would highlight at the moment.
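
    To make the point about neighbourhood operators concrete, here is a minimal sketch on bit strings. The function names are made up for illustration, and “block inversion” is taken here to mean reversing the order of a contiguous block, which is only one possible reading. It shows that a state several bit differences away under single-bit flips can be one move away under a different operator.

```python
from typing import Set


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def point_flip_neighbours(s: str) -> Set[str]:
    """All bit strings reachable from s by inverting exactly one bit."""
    flip = {"0": "1", "1": "0"}
    return {s[:i] + flip[s[i]] + s[i + 1:] for i in range(len(s))}


def block_inversion_neighbours(s: str) -> Set[str]:
    """All distinct strings reachable by reversing one block of length >= 2.

    Assumption: 'inversion' means reversing the order of the block.
    """
    out = set()
    for i in range(len(s)):
        for j in range(i + 2, len(s) + 1):
            t = s[:i] + s[i:j][::-1] + s[j:]
            if t != s:
                out.add(t)
    return out


if __name__ == "__main__":
    s = "110100"
    flips = point_flip_neighbours(s)
    blocks = block_inversion_neighbours(s)
    print("single-bit neighbours:", len(flips))
    print("farthest block-inversion neighbour (Hamming):",
          max(hamming(s, t) for t in blocks))
```

    Under the single-bit operator every neighbour is at Hamming distance 1, while reversing the whole string is a single move under the block operator even though every position changes.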

    I am not a biologist so apologies if I get something wrong. The simplest operator I can think of is the point mutator. It is easier for me to think in terms of bit strings, so my point mutator will just invert, delete one bit from or add one bit to the initial string (at least this set of abilities to the best of my knowledge is assumed by Gregory Chaitin in his metalife models).

    A more powerful operator would move blocks of bits around as whole chunks, which can be thought of as cut & paste. We can also think of copy & paste and bit block inversions. Now if we allow our operator to do all of that in addition to point mutations (and specify a-priori the probability to choose each particular type of move, say, 0.5 for point mutations, 0.4 for block cut & paste & 0.1 for block inversions), we can have a pretty powerful/diverse operator. Correct me if I am wrong but I think that a transposon operator could be an example of the above.
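
    A composite operator of this kind could be sketched roughly as follows. The weights 0.5 / 0.4 / 0.1 are the ones suggested above; everything else (function names, how blocks and positions are drawn, treating inversion as reversal of a block) is an assumption made for the sake of a runnable toy, not a model of any real biological mechanism.

```python
import random


def point_mutation(bits: str, rng: random.Random) -> str:
    """Invert, delete, or insert a single bit at a random position."""
    kind = rng.choice(["invert", "delete", "insert"])
    if kind == "insert" or not bits:
        i = rng.randrange(len(bits) + 1)
        return bits[:i] + rng.choice("01") + bits[i:]
    i = rng.randrange(len(bits))
    if kind == "invert":
        return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]
    return bits[:i] + bits[i + 1:]  # delete


def _random_block(bits: str, rng: random.Random):
    """Pick a non-empty contiguous block [i, j) uniformly over start, then end."""
    i = rng.randrange(len(bits))
    j = rng.randrange(i + 1, len(bits) + 1)
    return i, j


def cut_and_paste(bits: str, rng: random.Random) -> str:
    """Remove a random block and reinsert it at a random position."""
    if len(bits) < 2:
        return bits
    i, j = _random_block(bits, rng)
    block, rest = bits[i:j], bits[:i] + bits[j:]
    k = rng.randrange(len(rest) + 1)
    return rest[:k] + block + rest[k:]


def block_inversion(bits: str, rng: random.Random) -> str:
    """Reverse the order of a random block (assumed meaning of 'inversion')."""
    if len(bits) < 2:
        return bits
    i, j = _random_block(bits, rng)
    return bits[:i] + bits[i:j][::-1] + bits[j:]


# Move types with the a-priori probabilities proposed in the comment.
OPERATORS = [(0.5, point_mutation), (0.4, cut_and_paste), (0.1, block_inversion)]


def mutate(bits: str, rng: random.Random) -> str:
    """Apply one move, chosen according to the weights in OPERATORS."""
    weights = [w for w, _ in OPERATORS]
    ops = [op for _, op in OPERATORS]
    (op,) = rng.choices(ops, weights=weights, k=1)
    return op(bits, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    state = "1101001110"
    for _ in range(5):
        state = mutate(state, rng)
        print(state)
```

    Note that cut & paste and block reversal only rearrange existing bits, so the composition of the string is preserved; only the point mutation can change its length or content.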

    It appears that you see additions of new bits as problematic.

    I am aware of Douglas Axe’s work which suggests that once you have some function in such a vast configuration space as is the case with proteins, functionality itself necessarily means ‘isolatedness’ in the configuration space. I am absolutely fine with that.

    I am also aware of David Abel’s thoughts on this, which I also value. He maintains that genetic information is in fact prescriptive and therefore novel genetic information cannot be generated spontaneously. His argumentation is based on the stark absence of any observations that prescriptive information can ever do so anywhere else in nature. Here I would just say that while I am absolutely happy with Abel’s argumentation in principle, the details are a gray area to me. I think that it is quite fuzzy in reality. In any event, please continue your posting.

  202. Eugene:

    Your points are very correct, but they do not change much.

    Always referring to biological systems, point mutation is probably the most common event, and point mutation will give a state in the near-A set. Movements of whole blocks will still give results in the near-A set, at least in most cases, because large parts of sequence will remain unchanged, and strong homology will still be recognizable.

    The main event that will allow a sudden “leap” to the not-near-A set would be a frameshift mutation. It is true that the result of a frameshift mutation is somehow determined by the original sequence, but it is still unrelated to the original sequence except for the connection deriving from the shift in codons and the genetic code interpretation of codons. There is no reason to believe that such a kind of relation can have any connection with biochemical functionality. So, a frameshift mutation just will lead to not-near-A in a random point in relation to the function of B. IOWs, we are immediately in the situation where all unrelated results are equiprobable. It is no accident that the only explicit model of “evolution” by a frameshift mutation, the origin of nylonase according to Ono, was completely wrong.
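
    The effect of a frameshift on the translated sequence can be illustrated with a toy example. The sketch below uses the standard genetic code; the DNA string and the inserted base are invented purely for illustration, and the identity measure is a naive position-by-position comparison, not a real alignment.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a: str, b: str) -> float:
    """Naive ungapped identity over the shorter of the two sequences."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / n


if __name__ == "__main__":
    original = "ATGGCCAAGCTGACCGGTGAATTC"          # made-up coding sequence
    shifted = original[:3] + "T" + original[3:]    # one base inserted after codon 1
    print(translate(original))  # MAKLTGEF
    print(translate(shifted))   # MCQADR*I
    print(percent_identity(translate(original), translate(shifted)))
```

    In this toy case a single inserted base leaves the nucleotide sequence almost identical, while every amino acid after the insertion point changes (and a stop codon happens to appear), which is the sense in which the frameshifted product lands far from A at the protein level.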

    I don’t think that additions of new bits are problematic. Many mutational events change the length of the sequence. In the computation, I reason for a specified length (usually the final length of B) only because it is easier to model the system. It is certainly possible to extend the reasoning to variable length sequences, and it would certainly not help the darwinian model, but it is simply too complex for me to do that. Usually, however, the proteins in a family, having the same function, are clustered around some mean length, and can be aligned for that main sequence. That’s the length and sequence that is considered, for instance, in Durston’s method.

    A transposon operator is certainly a very interesting agent, especially for the possible intelligent modeling of non coding DNA in order to prepare a new protein coding gene. But if it just moves existing blocks, homology will be largely maintained.

    My point is: let’s start with a clear idea of how to model a purely random biological system, whatever the variation operators acting in it. Then we can model how NS can apply to such a system, and when.

    My starting point remains: in a biological system, unless NS can operate, all variation is random in respect of the emergence of a new functional state that is completely unrelated, at the sequence level, to already existing functional states. For these unrelated states, the probability of each state to be reached by a purely random walk is approximately the same, and it is necessarily somewhat lower than 1/n_SP, because unrelated states are less probable than related states. This is my starting point.

    In my next post, I will start modeling the effect of NS on such a system.

  203. GP,

    That is much appreciated. The only other thing (as a clarification for me) I wanted to raise but forgot in my previous post is 10% homology meaning no relation. Homology is just sequence similarity, right? If so, as far as I understand, this is supposedly the min one can ever get due to, so to speak, alphabet limitations. In other words, having just 4 letters in our alphabet, there’s a limited number of permutations you can get for words and, consequently, our words are bound to have something in common. Am I right in thinking that it is what you mean?

  204. Eugene:

    Yes, 10% homology is more or less what you can randomly expect, but it is referred to the protein sequence and the 20 AAs alphabet. SCOP is a protein database.

    In SCOP, you have a tool called ASTRAL, which gives you the identifiers of groupings sharing less than some percent homology, or having an E-value >= to some value. The lowest homology threshold accepted is 10%, and that gives you 6311 identifiers. The highest E-value is 10, and that gives you 6094 identifiers.

    So yes, less than 10% homology, for protein sequences, can be considered a random result, and does not allow any assumption of evolutionary relationship.
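
    As a rough illustration of what “random” similarity looks like, the sketch below compares pairs of random sequences over the 20-letter amino acid alphabet. It is a toy under strong assumptions: residues are drawn uniformly, sequences have equal length, and positions are compared directly with no alignment or gaps. Under those assumptions the expected identity is 1/20 = 5%; real proteins have non-uniform composition and real comparisons use alignments with gaps, both of which raise the background figure, so this does not reproduce how ASTRAL computes its thresholds.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues (equal-length, ungapped)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def mean_random_identity(length: int, pairs: int, seed: int = 0) -> float:
    """Average identity over random pairs drawn uniformly from AMINO_ACIDS."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        a = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        b = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        total += percent_identity(a, b)
    return total / pairs


if __name__ == "__main__":
    print(percent_identity("ACDE", "ACDF"))      # 75.0
    print(mean_random_identity(length=100, pairs=200))
```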

  205. Elizabeth and Eugene:

    OK, let’s go to NS.

    As I have already said, with great resistance from Elizabeth, NS is a necessity mechanism that applies to particular situations that can arise in the purely random system of biological variation.

    I will try to summarize here why it is so.

    I will refer to existing functional proteins in the proteome. I will call “function” their known biochemical activity, and nothing else. Please, Elizabeth, bear with me about this word use. It will make things simpler in the discussion. I take full responsibility for the definition, and will not conflate any other meaning.

    The important point is that function, as defined, depends strictly on primary sequence. For each known protein function in the proteome, defined as restrictively as possible, we have only a limited set of sequences that allow the function in the final protein. We call that set the functional set for that specific function. While its size is certainly controversial, there is no doubt that it can in principle be defined.

    So now we call any variation to an existing functional protein, or any addition of a new functional protein, a “variation of function”.

    This is the first point. Variation at the sequence level in an existing protein, or the addition of a new state in a non functional sequence, can be the cause of a variation of function, or not.

    We call any variation of sequence “neutral” if it is not the cause of any variation of function: it neither modifies an existing function in an existing protein, nor adds a new biochemical function that did not exist before.

    Please, note that at this level I am not considering reproductive success at all.

    We call any variation of sequence “positive” if it optimizes an existing protein function in the biochemical context where it acts (this would require some clarifications, but for the moment let’s go on), or, more important, if it adds a new biochemical function.

    We call any variation of sequence “negative” if it causes a reduction or loss of an existing protein function.

    Let’s call these three possibilities “the local biochemical effect”.

    The important point is that this effect strictly depends on the protein sequence, and on the sequence-structure-biochemical function relationship. Therefore, this local effect is completely derived from the laws of biochemistry, and is therefore completely a necessity effect of the sequence variation.

    Next, we will consider the relationship between local effect and differential reproduction.

  206. Excellent, got it.

  207. Elizabeth and Eugene:

    So, NS is defined as differential reproduction in a population. OK, so we must connect that concept to our original concept of “variation of function” at the local, biochemical level.

    Let’s try to be simple.

    a) A neutral variation of function can have no causal effect on differential reproduction of its own. It can be in some way connected to differential reproduction, but only indirectly, not because of the variation of function, because indeed no local function has changed.

    So, for instance, a neutral variation could expand, or be lost, because of drift. But that effect will be random in relation to the function of the sequence itself, because it can affect in the same way all sequences, whatever that function. So, an allele with some neutral variation could be expanded because it is “linked” to some other selectable functional gene because of its position in the chromosome. But again, this has no relationship with the sequence itself and its function. It is, again, a random effect that has nothing to do with the sequence and its function.

    What I am saying is that even differential reproduction is a random effect if it has no connection with a causal effect of the genetic variation of the sequence, and with its biochemical function. No effect of that kind will ever favour some specific sequence because of its sequence function relationship, which is what we need if we want to change the results of a purely random system. IOWs, indirect effects due to drift or to the selection of something else will not change the probability of getting our B, because all unrelated states share the same probability, and drift or selection unrelated to some specific sequence does not change that fact.

    b) A negative local variation will usually favour worse differential reproduction. That will not always be the case, and I suppose that’s where Elizabeth would say that a process is stochastic. And that’s true. But the connection between loss of local function and worse reproduction will usually be a necessity relation, that can be diluted, or changed, or even inverted, by random effects of other variables.

    A special case would be that of some forms of antibiotic resistance, where what is a partial loss of local function can be a function gain because the functional structure that has negatively (or neutrally) changed is also the target of the antibiotic (see Behe’s considerations on “burning the bridges”).

    c) Finally, the addition (or optimization) of a biochemical function can have a causal positive effect on differential reproduction, or it can have no effect, or it can have a causal negative effect on differential reproduction. That seems intuitive enough, so I will not go into details for the moment. Here too, the cause effect relation can be diluted or changed by random effects.

    So, to go on, we just give the following definitions:

    1) We call unrelated differential reproduction any effect (drift or NS of other genes) that has no cause effect relation to the specific local variation of function.

    2) We call pertinent NS all effects on differential reproduction that can be causally connected to the variation in local biochemical function, whatever the connection, and however modified by random effects they can be.

    In the set of pertinent NS, the only one that interests us in this discussion, we will call:

    a) Positive NS all cases of better differential reproduction connected causally to local function variation (even if it were a negative local function variation).

    b) Negative NS all cases of worse differential reproduction connected causally to local function variation (even if it were a positive local function variation).

    c) No NS (neutral scenarios) all cases where local function variation has no causal relationship with differential reproduction (either because there is no local function variation, or because the variation does not affect reproduction).

  208. Elizabeth and Eugene:

    So, we have already said that unrelated differential reproduction does not change the computation of probabilities for some specific output B in a basic random system of biological variation. The probabilities to get an unrelated state will remain the same, whatever unrelated differential reproduction is present in the system (drift, or selection of other genes).

    But how do the three scenarios of pertinent NS affect the computation of probabilities? That is not always obvious.

    Let’s start with the easiest: No NS does not change the scenario. The probabilities remain the same. That is rather obvious, because no necessity mechanism is acting, and the system remains random.

    Let’s go now to negative NS. It is a very important factor, and it acts all the time.
    The main effect of negative NS is to remove, at least in part, local function variation that is negative, or positive in the “wrong way”. I will stick to the first situation, which should by far be the most common.

    The effect of negative NS is very obvious when the local variation implies total, or serious, loss of local function, and the local function is really necessary for life or reproduction. In that case, the clone with the variation is removed. In all other cases, it can survive, but it usually reproduces worse, except for rare cases where environmental factors can occasionally expand it (see the case of S hemoglobin and malaria).

    From the perspective of our random walk towards B, what changes as the result of negative NS? The answer is easy. In most cases, it will be impossible for a functional gene to “walk” out of near-A. Or very difficult.

    If the gene is no longer functional (duplicated and inactivated), or if the starting sequence is non coding, negative NS will have no effect.

    So, just to be simple, the main effect of negative NS will be to keep the functional information as it is, or not to act at all. Nothing in that improves our probabilities to reach B.

    I will deal with positive NS in next post, because that is really the crucial point.

  209. Natural selection is defined as differential reproduction DUE TO heritable random mutations.

    If you have differential reproduction due to something else then you do not have natural selection.

  210. Joe:

    That’s exactly my point. But I wanted to include a brief discussion about unrelated differential reproduction because Elizabeth, and other interlocutors, usually bring it out in the middle of the discussion, in the form of drift or of other accidental involvements of genes in indirect selection. So, I thought it was better to rule those aspects out just from the beginning.

    I have called the neutral scenario “No NS” to underline that it completes the spectrum of possibilities about NS.

  211. Yes, that’s why I always describe the evolutionary algorithm as heritable variance in reproductive success.

    But even that isn’t quite accurate, because non-heritable, or temporarily heritable, or culturally heritable phenotypic features may also result in variance in reproductive success and serve to filter the gene pool at the level of the population.

    For example, it is possible that epigenetic variance may serve to keep the gene pool rich and the population more adaptable to changing environments.

  212. Elizabeth:

    I am not sure if that is a comment to what I have written up to now. Anyway, as soon as I can I will go on.

  213. It was to Joe, but not exclusively :)

  214. Elizabeth, Eugene, Joe, or whoever is still in tune:

    As I don’t like to leave things unfinished, I will deal in this last post with the analysis of positive NS. I quote my own definition:

    a) Positive NS: all cases of better differential reproduction connected causally to local function variation (even if it were a negative local function variation).

    So, the main point in positive NS is that the variation in local biochemical function has a positive causal effect on differential reproduction.

    I would like to start by saying that IMO the role of positive NS is definitely overemphasized. I believe that RV and negative NS are the principles explaining most of what we can observe (except for design :) ). Positive NS is usually invoked to explain what can only be designed, but in reality it does not explain it at all, because of its intrinsic limitations.

    So, let’s see what is necessary for positive NS to take place:

    a) A variation in local biochemical function must take place by RV. Positive NS can never take place if no variation in local function is present (IOWs, it does not act if only neutral variation of sequence, without any variation of local biochemical function, takes place).

    b) The local variation of biochemical function, be it positive or negative, must be the cause of a positive variation in differential reproduction.

    c) That positive variation in differential reproduction must be able to express itself as an expansion of the original sequence variation to all, or most, or at least a significant part of the population. We call this phenomenon “expansion” of the local variation. It is important to stress that, while the positive influence of the variation on reproduction is a necessity effect that can be detailed and understood in terms of cause and effect, the final expansion is not always a necessary consequence, and can be “modulated”, or even prevented, by random effects (drift, other independent and unrelated variables). For instance, a local variation that has a positive effect on reproduction could be lost early because of drift.

    In the following discussion, I will restrict the scenario to the following situations:

    1) The local variation is an increase in local function, or the appearance of a new function, and the new, or increased, function has a direct positive effect on reproduction.

    2) The necessity effect is strong enough to determine, often if not always, a significant expansion of the sequence variation in the population, in spite of random effects.

    The first assumption should be the most common scenario, and simplifies the discussion. The second assumption means only that I am granting maximum power to positive NS in my discussion, something that darwinists should appreciate :)

    So, let’s go on.

    Because of RV, an existing local biochemical function increases, or a new local biochemical function appears. That is the starting point.

    Now, I should recall here that our problem is the origin of new protein domains, and in particular the transition from A to B. B, being a new functional domain, can obviously be expanded by positive NS, but that is not relevant to our discussion. Our discussion is about how B arises. So, the thing that should be selected by positive NS (to help us get B) is some sequence that can “bridge” the transition between the unrelated A and B, and change the probabilistic modeling of the transition by introducing a necessity effect. We call that selectable sequence an “intermediate”, let’s say A1.

    The simple event of the appearance is different according to which of the original scenarios we take.

    In the case of a functional gene evolving to another new unrelated functional gene, many difficulties arise. The original function must be lost, sooner or later. So, we must explain how the loss of the original function is compatible with reproductive success. Moreover, the function of A1 could be related to the function of A, or to the function of B, or to neither. In general, all these difficulties make the scenario of direct transition from one function to a completely different one scarcely credible, even for darwinists.

    The duplicated gene scenario is better. Here, one of the two genes can be inactivated sooner or later, and become free to change through not-near-A. Negative NS is no longer an obstacle, because the gene is no longer functional, and all variation is by definition neutral.

    But another difficulty arises: if the gene is not functional, it is probably not translated. The reason is simple: if non functional sequences were translated, the cell would be filled with non functional, non folding proteins, which can scarcely be considered a good way to evolve. Non functional proteins are usually a big problem.

    So, let’s say that the gene is not translated, and it varies, and at some moment it reaches the state A1. OK, now it can be selected. But before, it must be translated again. So, either the cell knows in some way that we have reached a functional sequence in the genome (but how?), or it is simply lucky, and translation is casually reactivated when the functional state is reached.

    The non coding DNA scenario is similar to the duplicated gene scenario.

    Anyway, let’s ignore these difficulties, and let’s say that we have A1, and that A1 is translated. Now, positive NS can finally act.

    OK, but let’s look a moment at A1 first. What makes A1 “an intermediate”? IOWs, what properties must A1 have to be really useful in the transition from A to B?

    a) It must be a sequence intermediate: IOWs, it must have some homology to A and some homology to B. That, and only that, would make the transition from A to A1 easier than the transition from A to B, and so also for the transition from A1 to B. IOWs, A1 must be in the not-completely-away-from A, and at the same time in the not-completely-away-from B.

    b) It must have a local biochemical function, and that function must have a positive effect on differential reproduction.

    c) It must be translated, and enough integrated in the existing scenario so that its local function can be correctly used. That is important too, because no function is useful if it is expressed too little or too much.

    Another important point: if and when a selectable sequence appears, in all cases positive NS does not act alone; it always acts together with negative NS.

    I will be more clear. A1 appears, and is expressed. It confers a reproductive advantage to the cell. At that point:

    a) Negative NS will tend to eliminate RV that affects the newly acquired function. IOWs it will protect the functional part of A1. We call this effect of negative NS: “fixation”.

    b) Positive NS will tend to expand the original variated clone to most or all the population. That can also be described as negative NS acting on the non variated cells, but the concept is the same, so I will keep my terminology. We call this effect of positive NS: “expansion”.

    So, to sum up, our A1 is fixed and expanded. In the first scenario, that will imply the loss of A. In the other two, that is not the case.

    Well, I think this post is becoming too long. I will stop here for the moment.

  215. Elizabeth, Eugene, Joe, or whoever is still in tune:

    So, let’s go on, and possibly finish.

    We have said that pure negative NS cannot help in the emergence of new protein domains. Indeed, it is an obstacle to that, because it preserves existing function.

    But positive NS is different.

    Once a new sequence appears that is naturally selectable, both negative NS and positive NS can act on it:

    a) Negative NS will fix the new information, and preserve it.

    b) Positive NS will expand the new gene to most or all the population.

    First of all, I want to specify that I am not dealing here with the selection, fixation and expansion of the final target, B. As we find B in the proteome as a protein domain, we take for granted that, once it emerged, it was selected.

    But our problem is how to explain the emergence of B.

    For that, if we don’t want to rely only on RV (which would be folly, because B has high dFSCI), we need an intermediate that is naturally selectable.

    That does change the probabilistic scenario. Some ID supporters have claimed the opposite, but I believe they are wrong.

    Now, I want to clarify that for the origin of protein domains no naturally selectable intermediate is known: therefore, the following discussion could simply be omitted. However, as I believe it is important to know how to model NS, I will do it just the same. It can be applied to any possible selectable intermediate in a transition.

    I have already listed the properties that an intermediate must have, if it has to be useful in our scenario:

    a) It must be a sequence intermediate: IOWs, it must have some homology to A and some homology to B. That, and only that, would make the transition from A to A1 easier than the transition from A to B, and so also for the transition from A1 to B. IOWs, A1 must be in the not-completely-away-from A, and at the same time in the not-completely-away-from B. (I must add here that, on second thought, it is not really necessary that the intermediate share some homology with A: it would be enough that its dFSCI is lower than the dFSCI of B, IOWs that it is easier to reach from A because its function is less complex than the function of B. It must instead share some homology with B).

    b) It must have a local biochemical function, and that function must have a positive effect on differential reproduction

    c) It must be translated, and enough integrated in the existing scenario so that its local function can be correctly used. That is important too, because no function is useful if it is expressed too little or too much.

    Well, in the following discussion I will make some assumptions and concessions.

    1) I will assume that we are dealing with the scenario of a duplicated, inactivated gene. That is, IMO, the most favorable scenario for neodarwinism, because it bypasses the obstacle of negative NS on A, and allows for free neutral variation to happen.
    2) I will grant that, as soon as some naturally selectable sequence is reached by neutral variation, by some miracle the gene is translated again, and fully integrated in the existing scenario, so that it may be selected. See how good I am?
    3) I will assume that both negative and positive NS can act completely on the new selectable gene: IOW, that once it emerges, it will be, in a short time, both completely fixed and expanded to the whole population.
    4) I will grant that the intermediate can go on experiencing neutral RV to reach B, while retaining the function, and therefore the fixation, of the partial functional sequence it already has reached. Granting that, I will not have to go into details about the function of the intermediate (if similar to A, to B, or just different).
    5) I will assume that the intermediate is exactly “half way”, at sequence level, between A and B. Therefore, it divides the transition from A to B into two separate transitions, each with about half the bits of dFSCI of the whole transition.

    As you can see, all these assumptions and concessions are favorable to the neodarwinian scenario. Some of them are just necessary to make the discussion easier.

    So, the scenario now is the following:

    a) We have a duplicated, inactivated A.

    b) The transition from A to B has a dFSCI of, say, 300 bits.

    c) We call the intermediate A1.

    d) A1 is functional and naturally selectable.

    e) A1 shares part of the sequence with B. Let’s grant that the sequence it shares with B will be fixed, and corresponds to about half of the functional complexity of B.

    f) The transition between A and B can therefore be attained through two separate transitions: from A to A1, and then from A1 to B.

    g) The dFSCI of the transition from A to A1 is, say, 150 bits. Therefore, A1 can be reached much more easily, starting from A.

    h) But what happens when A1 is reached? As said, we assume that it is translated and integrated, and that its function is naturally selectable. We also assume that NS will act optimally on it, and in a short time. Therefore, two things happen:
    h1) The functional part of the A1 sequence is fixed, and will not change any more. We are also assuming that this part of the sequence is shared with B, to make things as good as possible.
    h2) A1, that was in the beginning present only in one cell, is expanded to the whole population.

    Now, let’s stop here for a moment. What has happened?

    We have got a result, the transition from A to A1, that implies “only” 150 bits of functional complexity.

    As a result of the necessity mechanisms of negative and positive NS, half of the sequence for B is now fixed, and the following transition, from A1 to B, needs to find “only” the other half of the functional complexity of B. So, let’s say that the second transition has too a dFSCI of 150 bits. This result is ensured by the action of negative NS, that fixes the part we have already attained protecting it from further change.
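    The halving assumed here can be checked with a few lines of exact arithmetic. This is only a minimal sketch, assuming that a functional complexity of b bits is read as a probability of 1:2^b per random attempt; the helper name prob_from_bits is hypothetical and is used just for illustration.

```python
from fractions import Fraction

def prob_from_bits(bits):
    """Per-attempt probability for a target of the given complexity (assumed reading: 1 in 2^bits)."""
    return Fraction(1, 2**bits)

p_direct = prob_from_bits(300)  # A -> B in a single step
p_half = prob_from_bits(150)    # A -> A1, and likewise A1 -> B

# Two independent 150-bit steps multiply back to the 300-bit probability.
print(p_half * p_half == p_direct)

# Each half step is 2^150 times more likely per attempt than the whole transition.
print(p_half / p_direct == 2**150)
```

    Exact fractions are used so that no rounding enters the comparison.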

    What about the probabilistic resources? Well, in the beginning A1 is represented only in one cell, but because of the action of positive NS it is soon present in the whole population. Therefore, from now on, the probabilistic resources to find B from A1 are the same as they were to find A1 from A at the beginning.

    As both the probability (the dFSCI of the transition) and the probabilistic resources of the two events are the same, we can model the whole system as follows:
    A random system where we get twice the same event, with the same probability.

    In the next, and final, post I will show how this situation can be modeled probabilistically, and why it differs from the simple random transition from A to B. IOWs. I will show the probabilistic variation due to the intervention of the necessity mechanism of NS.

  216. Elizabeth, Eugene, Joe, or whoever is still in tune:

    Now, I am perfectly aware that the scenario I have described in the previous post is rather artificial: most of my assumptions will never be exactly that way, and some of them are really unlikely. But the point is: all of them are favorable to the neo darwinian scenario, and many of them simplify the computation. So, please, take this model for what it is: a first attempt at modelling a difficult scenario. I will accept any reasonable correction or proposal gladly. I apologize to my ID friends for having probably conceded too much to the adversary, but it is only for the sake of discussion.

    So, the probabilistic model I will use is the cumulative binomial distribution. I will first give an example with the toss of a die, and then apply the results to our scenario.

    The probability of getting a six tossing one die once is, as all know, 1:6, about 0.166666667.

    Let’s say we toss two dice, and want to know the probability of getting two sixes. In a single toss, it is obviously 0.166666667^2, that is about 0.027777778.

    Let’s say that getting two sixes is our B.

    Now, let’s say that our probabilistic resources are ten tosses.

    According to the cumulative binomial distribution, the probability of getting at least one event of probability 0.027777778 in ten attempts is 0.2455066.

    Now, our alternative scenario, the intermediate scenario, will be similar to tossing one die ten times, and getting at least two sixes.

    Again, the binomial distribution tells us that the probability for that event is 0.5154833.

    So, as you can see, with these numbers the scenario where we get two individual events has roughly doubled the probabilities.

    Now, I must say that I have made simulations, and the increase of probability depends very much on the number of attempts in relation to the global probability of the first event. However, once the number of attempts is more than a handful, the probabilities of the first scenario are always lower than those of the second scenario.
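    The two dice figures can be reproduced directly from the binomial formula. The following is a minimal sketch in Python; the function name prob_at_least is hypothetical, and the list of attempt counts in the loop is an arbitrary choice made only to show how the comparison depends on the number of attempts.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), as one minus the lower tail."""
    return 1.0 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))

p_six = 1 / 6

# First scenario: two dice per toss, at least one double six in ten tosses.
direct = prob_at_least(1, 10, p_six**2)
# Second scenario: one die, at least two sixes in ten tosses.
stepwise = prob_at_least(2, 10, p_six)
print(f"direct = {direct:.7f}, stepwise = {stepwise:.7f}")

# How the comparison changes with the number of attempts.
for n in (1, 2, 3, 4, 10, 100):
    print(n, prob_at_least(1, n, p_six**2), prob_at_least(2, n, p_six))
```

    Note that with only one to three tosses the ordering is reversed (a single toss of one die cannot give two sixes at all); from four tosses on, the second scenario is the more probable one.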

    So, let’s apply those concepts to our gene scenario. We need to assume specific numbers to do that.

    So, as I have proposed in the last post, let’s say that the probability of getting B from A through a pure random walk is 1:2^300 (dFSCI of B of 300 bits).

    Now, we assume that the probabilistic resources (number of random variations in the whole population in a definite period of time) is 2^100 (that’s about 10^30 states, not a small number at all).

    So:

    First scenario: the transition from A to B happens by pure random variation, and has a probability of 1:2^300. The probability of getting B from A at least once, in 2^100 attempts, is, according to the binomial distribution:

    6.223017 * 10^-61

    Second scenario: the transition from A to A1 is naturally selected, and then the transition from A1 to B happens with the same probability and the same probabilistic resources, as the effect of selection. The probability of having two events, each of probability 1:2^150, in 2^100 attempts, is, according to the binomial distribution:

    3.944307 * 10^-31

    We have an improvement of 30 orders of magnitude (in base 10). About 99 bits.
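    With numbers this small, ordinary floating point loses the result to rounding, so a check needs extended precision. Here is a minimal sketch using Python's decimal module; the working precision of 300 digits and the function name prob_at_least are my own choices for illustration, not part of the original computation.

```python
from decimal import Decimal, getcontext

# Assumed working precision: comfortably more digits than the ~91 needed to represent 1 - 2^-300.
getcontext().prec = 300

def prob_at_least(k_min, n, p):
    """P(X >= k_min), k_min in (1, 2), for X ~ Binomial(n, p) with huge int n and tiny Decimal p."""
    ln_q = (1 - p).ln()                     # ln(1 - p)
    p_zero = (n * ln_q).exp()               # P(X = 0) = (1 - p)^n
    if k_min == 1:
        return 1 - p_zero
    p_one = n * p * ((n - 1) * ln_q).exp()  # P(X = 1) = n p (1 - p)^(n - 1)
    return 1 - p_zero - p_one

n_attempts = 2**100

# First scenario: at least one event of probability 1:2^300.
direct = prob_at_least(1, n_attempts, Decimal(2) ** -300)
# Second scenario: at least two events of probability 1:2^150.
stepwise = prob_at_least(2, n_attempts, Decimal(2) ** -150)
# Gain expressed in bits (base 2 logarithm of the ratio).
gain_bits = (stepwise / direct).ln() / Decimal(2).ln()

print(f"direct   = {direct:.6E}")
print(f"stepwise = {stepwise:.6E}")
print(f"gain     = {gain_bits:.2f} bits")
```

    To leading order the two results are 2^-200 and 2^-101, which is where the gain of about 99 bits comes from; the sketch agrees with the quoted figures to several significant digits.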

    Again, let’s remember however that I have considered almost impossibly favorable conditions. In reality, the gain would certainly be less than that.

    So, I have tried to show:

    a) that the existence of a naturally selectable intermediate can highly improve the probability of a transition.

    b) That it is definitely possible to compute the difference in probability, provided that the intermediate is known and can be evaluated.

    My main point here is: if neo darwinists want really to try to explain protein domains and other forms of biological information, they must face the reality of probabilistic modelling, and not evade it. The modelling can be done and will be done.

    And it will demonstrate that the neo darwinian model is incapable of explaining reality.

    Well, this is the last post in the series. Any comments?

  217. I don’t really think that your comments about Dembski, and Behe, are correct.

    The point in ID is that the random system imagined by darwinists cannot explain the data, not even with the introduction of NS. I think Dembski has stated very clearly that he assumes a uniform distribution for the random system, that is the only reasonable thing to do. Then, he establishes the UPB (indeed, too generous a threshold) as a limit of what a random system can empirically explain.

    Well, I certainly do not agree that assuming a uniform distribution of anything is “the only reasonable thing to do”! Very few things in the world have uniform distributions, phases of wave forms being about the only thing I can think of.

    So if that is the only “stochastic model” he has proposed, then it is self-evidently inadequate! When I have some probability distribution of results for which I want to know the expected distribution under the null, I very carefully design a stochastic model of the null! And often, even with phase distributions, you find that under the null the distribution is only uniform with infinitely large samples!

    And in non-linear systems (and Darwinian systems are highly non-linear) you certainly can’t make any safe assumptions about what you will see under the Darwinian null, for exactly the same reason that we cannot safely assume (at least in Britain) that if you hang your washing out in the morning, it won’t be soaking wet when you get home.

    Maybe he has not gone into the biological details (but Behe certainly has).

    Please point me to a stochastic model of the null produced by Behe.

    Anyway, while waiting to understand better the nature of your objections to Dembski and Behe, I will try to analyze the biological random system for you, and to show that it cannot explain data. I have really already done that, but it could be useful to review the reasoning in detail and in order, now that maybe we have clarified some epistemological points.

    OK :)

  218. Thanks for going to all this effort, gpuccio!

    I’ve read it carefully, and I think I have understood some of your terminology that was confusing me earlier (I assume it is a translation issue): for example, where you used the word “fix” or “fixed” you seem to mean what in English we call “conserved”. “Fixed”, in population genetics, means something different – that all members of a population have that sequence. Similarly, by “expanded”, we’d normally say “propagated”, or “increase in prevalence”. And where you say “pertinent NS” we’d say “NS”, i.e. biased drift, as opposed to unbiased drift (which we’d call drift!). But that’s OK, because I think I see what you are saying, and in particular, I’m grateful for the distinction you are making between “local” function and the effect of the phenotype.

    While I understand that you think you have been generous to Darwinism, I don’t actually think you have (:)) because I think you have made a number of untenable assumptions. But I’d like to focus on just two issues for now:

    Firstly: Even if I accept for the sake of argument your idea that in general a variant that has a “negative” local effect (the protein does less than it used to, for instance) will tend to have a deleterious effect on the reproductive success of the phenotype, while a variant that has a “positive” local effect (the protein does more than it used to) will be more likely to have a beneficial effect on reproductive success of the phenotype, you seem to have ignored the fact that the effect of a variant on the reproductive success of the phenotype is not simply a function of the variant; it is just as importantly a function of the current environment. What is more, what improves reproductive success in one environment may decrease it in the next; and what is still more is that the environment itself is a function of the prevalence of variants in the populations (in the valley of the blind, the one-eyed man is king, but in the valley of the one-eyed kings, the two-eyed man is emperor, and may well eat the kings for lunch).

    This is why I think it is best to think of NS as a special case of drift (or of “pertinent NS” as a special case of NS, as you put it!), as you do, but you need to add another term; if we think of NS as biased drift, that bias is going to change constantly, not only as a function of external factors but of the evolving (in the sense of change of allele frequency over time) population itself.

    This means that rather than thinking of a new variant as having an immediate beneficial effect (and risking being lost through drift), it may be better to think of new variants as generally being neutral at the time they appear (and as you say, seriously deleterious ones will be rapidly weeded out by negative NS), a proportion of which will drift to substantial prevalence, where they remain dormant in the population, not doing much, except making the population varied, until the environment changes, and the section of the population with a particular variant starts to do better as a result of that variant, at which point it rapidly becomes much more prevalent.
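    To make the drift part of this concrete, here is a toy sketch, assuming a haploid Wright-Fisher population of constant size (a textbook simplification that I am choosing for illustration, not a model of any real population). It follows a single new neutral variant until it is either lost or present in every individual; the function names are hypothetical.

```python
import random

def neutral_mutant_fixes(pop_size, rng):
    """Follow one new neutral variant until it is lost (False) or reaches the whole population (True)."""
    copies = 1
    while 0 < copies < pop_size:
        freq = copies / pop_size
        # Each member of the next generation copies a randomly chosen parent (binomial sampling).
        copies = sum(rng.random() < freq for _ in range(pop_size))
    return copies == pop_size

def fixation_fraction(pop_size, trials, seed=0):
    """Fraction of independent runs in which the neutral variant takes over the population."""
    rng = random.Random(seed)
    return sum(neutral_mutant_fixes(pop_size, rng) for _ in range(trials)) / trials

# Theory for this toy model: a new neutral variant takes over with probability 1 / pop_size.
print(fixation_fraction(pop_size=50, trials=2000))
```

    Most new neutral variants are lost within a few generations, but a predictable minority spread through drift alone, with no selective advantage at all.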

    I’d also point out, in the same vein, that the environment also includes the genetic environment, which is also undergoing constant change (through drift), and a sequence that was neutral in one genotype may be highly beneficial in another (gene-gene interaction). For instance, a gene for blue skin might be useless in a furry critter, but highly advantageous in a bald one (example deliberately extreme to make my point :)

    My second point is more serious and concerns your scenario for inactivated duplicates. We now know that regulatory genes are extremely important, and the degree to which a coding gene is expressed will depend on the state of the regulatory genes (whether switched on or off), which in turn will often be a function of the level of protein in the cell (i.e. homeostasis) – so that two perfectly functional identical genes (original and copy), both making the same protein can both be activated, and yet not produce too much protein.

    Moreover, it is perfectly possible that an inactivating variant of one of the genes could take the form of its only being rarely “switched on” (i.e. in response to a rare environmental signal). If so, it is now free, as you say, to change (because it isn’t doing much), and will be switched “on” occasionally, and make a useless protein, but so rarely that it doesn’t get in the way of the phenotype. And so this variant has a sporting chance of being propagated by drift.

    Not only that, but it will not be highly conserved, and soon there will be lots of variants of it and its protein in the population. Now, change the external (for simplicity’s sake) environment so that the switch is operated much more often (perhaps there’s more oxygen in the environment, or more sunlight); now, those individuals who have variant proteins that are damaging will die off; those that have variant proteins that are neutral will survive; and any whose variant protein is actually useful, will thrive, and come to dominate the population.

    Now, I’m not saying that is what occurred, or will occur; I’m merely saying that that, right there, off the top of my head, is a perfectly possible scenario in which a gene could remain active but not often expressed enough to be important, in one environment, and thus be free to acquire variations in the population which would then be heavily selected by an environmental change that increased expression rate.

    And that is what is wrong with your binomial calculations: yes, the probability of two simultaneous rare-ish events is lower than the probability of each separately. But, because of neutral drift, simultaneity is not required, and because of the importance of regulatory genes and their role in maintaining homeostasis (a highly selectable function), a rich pool of neutral variance can become a rich pool of selectable variance with small changes in environmental conditions.

    And we can simulate this very easily, and demonstrate that populations with plenty of neutral variance are much more robust in the face of environmental change than more homogeneous genomes (which is why small populations are in far greater danger of extinction than you might expect, and why “hybrid vigor” is so called).

    And to make my most important point: this is why we simply cannot infer “design” from low probabilities of an alternative. It is, simply, an argument from ignorance (or even an argument from lack of imagination!) In rich feedback systems (non-linear aka chaotic systems) it is extraordinarily hard to model all the things that might happen, and certainly premature to decide that because routes A, B and C are unlikely that A, B and C are the only non-design routes there are. Not to mention the Texas-sharp-shooter problem of estimating the probability of getting B from C via A1, when we don’t know whether Z from A via A2 might have been just as awesome.

    In other words we simply do not know, and cannot know, the distribution of possible end points under the null of Darwinian evolution, and this is what makes any probability based ID argument invalid.

    Which is not to say that ID is wrong! Just that it’s the wrong way of going about testing the hypothesis, and also means that the grounds for thinking it a reasonable inference right now are invalid, seeing as they are mostly – all? – based on probability – or rather improbability – estimates.

    At least that’s how I see it :)

    Over to you, and thanks again for your posts :)

    Lizzie

  219. I agree. But this mistake is hardly surprising given their antes here: http://richarddawkins.net/disc.....nts?page=7

    In that page I argued that Nature could not build a hut and got some flak for it. There are 2 reasons, I believe, why the abilities of NS are misrepresented:

    1) It is implicitly assumed that natural selection will weed out the bad variations and keep the better ones. This is false, since natural selection has no foresight and is under no compulsion to do such.

    2) Darwin’s understanding of NS (as I recall) was based on Malthus’s posits in his opus ‘An Essay on the Principle of Population’, which have evidently been disproved by the very fact that technology has increased food production. My assumption here is that readers are familiar with Malthus’s posits and that they have been shown to be of little or no effect.

  220. Elizabeth:

    Thank you for your reply. I believe you are the first who has really read all that I wrote here. My compliments! :)

    I will try to briefly answer your three arguments:

    a) All your observations are correct, but don’t change anything. In my modelling, anything that is not related to the functional sequence that confers the local, biochemical activity, cannot improve its probabilities of emerging through a random walk. That’s why I say that, whatever the random effects happening, including the modifications in the genetic and outer environments, none of these effects changes the basic fact that all unrelated sequences have the same probability of emerging through a random walk, and that such probability is certainly lower than 1/n_SP. I invite you to make a simulation for that, if you can (I would not be able). You can take the space of decimal numbers with 6 digits (not too big, 10^6). Then you fix a starting sequence. Then choose a number of unrelated sequences, sharing no digit, or at most 1 digit, as you like, with the starting sequence. Then try a random walk from the starting sequence, and count how many random single digit variations are necessary to reach each of the unrelated states, like in a Monte Carlo simulation. Do it many times, and compute the mean empirical probability of getting each different unrelated state from the starting sequence.

    According to my reasoning, that probability will be similar for all unrelated states, and it will be lower than 1/10^6 (because the special subset of related states, sharing digits with the starting sequence, can be reached more easily).
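    A simulation along these lines could look like the following sketch. It makes several assumptions that are not spelled out above: a "single digit variation" is taken to mean replacing one randomly chosen position with a different random digit, the "empirical probability" is taken to be the fraction of walks that reach the target within a fixed budget of variations, and the demo uses 3 digits instead of 6 only to keep the run time short. All function names are hypothetical.

```python
import random

def mutate(seq, rng):
    """Single digit variation (assumed): one random position gets a different random digit."""
    pos = rng.randrange(len(seq))
    new_digit = (int(seq[pos]) + rng.randrange(1, 10)) % 10  # never equal to the old digit
    return seq[:pos] + str(new_digit) + seq[pos + 1:]

def steps_to_reach(start, target, rng, max_steps):
    """Number of variations until the walk first hits target, or None if the budget runs out."""
    seq = start
    for step in range(max_steps + 1):
        if seq == target:
            return step
        seq = mutate(seq, rng)
    return None

def hit_fraction(start, target, trials, max_steps, seed=0):
    """Fraction of independent walks that reach target within max_steps variations."""
    rng = random.Random(seed)
    hits = sum(steps_to_reach(start, target, rng, max_steps) is not None for _ in range(trials))
    return hits / trials

# Demo on a 3 digit space (10^3 states): two unrelated targets and one related target.
start = "000"
for target in ("123", "987", "003"):
    print(target, hit_fraction(start, target, trials=100, max_steps=1000))
```

    For the full 6 digit space the same functions can be called with 6 digit strings and a much larger max_steps, at the cost of a long run in plain Python.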

    Now, you can change all the variables you like, but there is no way any variable will make the sequence, say, 361449, specially probable, unless we choose variables that specifically favour that sequence. There may be variables that favour that sequence, but they will not be more probable than those that favour any other sequence, so we are again in a random situation and a uniform distribution for unrelated states.

    b) The answer to your second point is easy: what you say is correct, but here I have really been generous, and conceded a scenario where the gene could change neutrally in any possible way, and any functional state reached would be immediately translated and selected. Please, read carefully my post and you will see that that is what I wrote. Therefore, your arguments to explain how that could be possible do not seem specially relevant.

    c) Here I disagree. Again, you make a fundamental epistemological error here.

    You say:

    “And to make my most important point: this is why we simply cannot infer “design” from low probabilities of an alternative. It is, simply, an argument from ignorance (or even an argument from lack of imagination!)”

    That’s wrong. We are looking for the best explanation. We choose between existing explanations. An explanation that has extremely low probabilities of being true is not a valid explanation at all.

    You are simply saying that another explanation could be possible. OK. Provide it. Science is made with explicit, detailed explanations, not with wishful thinking.

    Unless and until you can offer an explicit alternative, whose probabilities can be really computed and shown to be acceptable, the only competition is between existing explanations. The only problem is, you are not willing to accept design as an explanation, and believe that any so-called naturalistic explanation, even if not shown, even if not likely to exist, is better than a design explanation.

    That is dogma and prejudice, not science. And that dogma and prejudice are the essence of all the “argument from ignorance” statements, so common in the darwinist field.

  221. Elizabeth:

    Just one more clarification about a point I had missed. I think you make a technical error here:

    “And that is what is wrong with your binomial calculations: yes, the probability of two simultaneous rare-ish events is lower than the probability of each separately. But, because of neutral drift, simultaneity is not required, and because of the importance of regulatory genes and their role in maintaining homeostasis (a highly selectable function), a rich pool of neutral variance can become a rich pool of selectable variance with small changes in environmental conditions.”

    Simultaneity is not an issue here. That is a common misunderstanding among darwinists. For the computation of probabilities, it is not necessary that all the variations happen simultaneously. I don’t know if that is what you meant here, but I want to specify that, because it is an important point.

    In my example with dice, in the first case, the important thing is not that we toss two dice simultaneously. We could better conceive of the case as a single die with 36 faces.

    The fact is, if only the final result (B) is selectable, we have to have the final complete result (B, with all its dFSCI) before selection can happen. It is not important (nor likely) that all the variation happens simultaneously: but it must all be present in the end, if the function has to be available for selection.

    Instead, in the second scenario, the intermediate can be selected. That makes the probabilistic scenario different, and I believe that my application of the binomial distribution to that situation is correct (but I will accept any correction, if I am wrong). The important point is, that second probabilistic scenario is appropriate only if the first event (the intermediate) is completely selected, both negatively (“fixed”, or conserved) and positively (“expanded”, or propagated).

  222. But gpuccio, the reason we Darwinists keep going on about non-simultaneity is that if one mutation precedes another, and drifts through the population, there are vastly more opportunities for the second mutation to occur in an individual that already bears the first!

    And we see this happening in simulations all the time. Even slightly deleterious mutations can readily drift through a population, and if such a mutation is a necessary precursor to some advantageous feature, then the chances of the second mutation occurring in some individual become very high.

    That’s why your binomial calculation doesn’t work. You need to factor in the probability of the first mutation being propagated through the population, and the size of that population in order to find out how many opportunities there are likely to be for the second one to occur in an organism that already has the first.
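    As a rough illustration of what “factoring in” propagation and population size could look like, here is a minimal sketch of neutral drift in a Wright-Fisher model. The model choice, the function name, and the idea of counting carrier-births as “opportunities” for the second mutation are assumptions made for the example, not something specified in this thread.

```python
import random

def drift_opportunities(pop_size, generations, seed=0):
    """Neutral Wright-Fisher drift of a single new mutation.

    Each generation, every one of pop_size offspring inherits the mutation
    with probability equal to its current frequency.
    Returns (final carrier count, total carrier-births over all generations).
    """
    rng = random.Random(seed)
    carriers = 1          # the first mutation starts in one individual
    opportunities = 0     # births in which a second mutation could occur
    for _ in range(generations):
        freq = carriers / pop_size
        carriers = sum(1 for _ in range(pop_size) if rng.random() < freq)
        opportunities += carriers
    return carriers, opportunities

if __name__ == "__main__":
    # Repeat many runs to see how often the mutation is lost and how
    # many opportunities accumulate when it is not.
    for seed in range(5):
        print(drift_opportunities(pop_size=1000, generations=200, seed=seed))
```

    Multiplying the opportunity count by a per-birth rate for the second mutation gives a rough expected number of double mutants in one run; averaging over many runs is what a comparison with the binomial calculation would need.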

    Really off now!

    Cheers

    Lizzie

  223. Elizabeth:

    No. I believe you are mistaken about that. As I have said many times, any mutation can readily drift, certainly not only those that will be useful.

    The point is, each new sequence generated by RV is an attempt. All unrelated states have the same, uniform probability to be reached. All unrelated states have the same, uniform probability to drift. Therefore, my computation is perfectly correct.

  224. Do we have any real-world examples of a mutation coming to fixation in a population of say > 1000?

    Is there any evidence for a slightly deleterious mutation readily drifting through a population?

  225. material.infantacy

    “…those 2000 or so protein domains are the essential core of biological function in life as we know it. IOWs, they are very important empirical data that need to be causally explained.”

    [Just jotting down thoughts as they occur to me. It doesn't happen often. xp]

    This raises the issue that there are two distinct, putative material mechanisms which need to account for the same ability to design and instantiate functional protein products. The first mechanism, an OOL scenario, sequences and manufactures proteins that must be present in the second, DNA-based replication system. The first system must also give rise to the second in its entirety.

    So either both disparately functioning systems produce the same types of functional protein structures, or perhaps there is a third theory, the overarching theory of evolution, which demands that specific, sequenced, folded, functional proteins come about in either system.

  226. material.infantacy

    “i) So, what we need in our scenario is:

    i1) a starting sequence

    i2) a series of modifications that realize a random walk

    i3) a final result”

    I imagine that a homologue, hopeful to mutate into a target sequence, has a much higher probability of drifting away from the sequence than toward it. However, given the right mutation rate and enough trials, we should see some drift closer — but then the probability goes down for successive attempts.

    Let’s consider a 50% homologue sequence for a 100 amino protein. 50 of the aminos match type and position in the target sequence, so there is a 0.5 probability of a point mutation hitting the right position, and a .05 probability of it hitting the right residue, resulting in a 0.0125 probability of scoring a total win, moving the homologue closer toward the target.

    However the sample space changes.

    The new sequence now has 51 matching, and 49 non-matching elements. The next mutation has a slightly higher probability of drifting away. If we’re drifting, we’re in a remarkably declining probability for finding the function.

    I’m not trusting my reasoning here, nor my hasty math. But some quick induction leaves me with (50!)*(0.01)^50 *(0.05)^50 = 2.7E-101 chance of success, of the homologue drifting to find the target.

    If beginning with a 90% homologue (100 residues) it works out to (10!)*(0.01)^10 *(0.05)^10 = 3.5E-27 chance of the homologue drifting to the target. If my numbers turn out to be anything close to proper, then I wonder what they would look like with a 500 residue sequence, or more.
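    The arithmetic above can be checked with a short sketch. It assumes, as the formula does, that every successive point mutation must be a hit: with k mismatches left in a sequence of length n, the mutation lands on a mismatched position with probability k/n and picks the right residue with probability 1/20 = 0.05. The function name and the choice to work in base-10 logarithms are illustrative; logarithms are used because the raw probability can underflow a floating-point number for longer sequences.

```python
import math

def log10_all_hits(length, matching, alphabet=20):
    """log10 of the probability that every successive mutation fixes a mismatch.

    Equals log10( m! * (1/length)^m * (1/alphabet)^m ), where
    m = length - matching is the number of mismatched positions.
    """
    total = 0.0
    for remaining in range(length - matching, 0, -1):
        total += math.log10(remaining / length) + math.log10(1 / alphabet)
    return total

if __name__ == "__main__":
    for length, matching in ((100, 50), (100, 90), (500, 250)):
        print(length, matching, log10_all_hits(length, matching))
```

    For the 50% and 90% homologues of a 100-residue sequence this reproduces the 2.7E-101 and 3.5E-27 figures quoted above. It does not model walks that drift away and later recover, nor any contribution from selection.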

    This is without considering what advantage positive selection might contribute, if one or more of the transitional sequences happens to fold, function, and confer a selective advantage along the way. We can at least imagine that if some intermediate sequence becomes locked in by NS, that chances would improve, although I couldn’t say how much. Perhaps not much, since we still need to account for the transition from A -> F, then from F -> B. We would also need to consider the space of the folding sequences set, and the contextual problem that a protein conferring an advantage in some particular context might not confer any advantage at the specific point in time required. IOW, an intermediate sequence may fold properly, but its function may or may not be useful to the organism at its particular stage of development.

    This is all just stream-of-consciousness commenting.

  227. material.infantacy

    “Now, the fundamental point: once A changes so much that it becomes unrelated to its original sequence, at that point all unrelated states have more or less the same probability to be reached by a random walk.”

    I need to spend some time reasoning this through.

  228. material.infantacy

    Correction: second paragraph, 0.0125 -> 0.025

  229. Time to update some basic comprehensions:

    Natural Selection Is Ubiquitous

    Higgs Particle? Dark Energy/Matter? Epigenetics?
    These Are YOK!
    Update Concepts-Comprehension…
    http://universe-life.com/2011/.....d-whither/

    Evolution Is The Quantum Mechanics Of Natural Selection.
    The quantum mechanics of every process is its evolution.
    Quantum mechanics are mechanisms, possible or probable or actual mechanisms of natural selection.
    =================

    Universe-Energy-Mass-Life Compilation
    http://universe-life.com/2012/.....mpilation/

    A. The Universe

    From the Big-Bang it is a rationally commonsensical conjecture that the gravitons, the smallest base primal particles of the universe, must be both mass and energy, i.e. inert mass yet in motion even at the briefest fraction of a second of the pre Big Bang singularity. This is rationally commonsensical since otherwise the Big would not have Banged, the superposition of mass and energy would not have been resolved.

    The universe originates, derives and evolves from this energy-mass dualism which is possible and probable due to the small size of the gravitons.

    Since gravitation is the propensity of energy reconversion to mass and energy is mass in motion, gravity is the force exerted between mass formats.

    All the matter of the universe is a progeny of the gravitons evolutions, of the natural selection of mass, of some of the mass formats attaining temporary augmented energy constraint in their successive generations, with energy drained from other mass formats, to temporarily postpone, survive, the reversion of their own constitutional mass to the pool of cosmic energy fueling the galactic clusters expansion set in motion by the Big Bang.

    B. Earth Life

    Earth Life is just another mass format. A self-replicating mass format. Self-replication is its mode of evolution, natural selection. Its smallest base primal units are the RNAs genes.

    The genesis of RNAs genes, life’s primal organisms, is rationally commonsensical thus highly probable, the “naturally-selected” RNA nucleotides.

    Life began/evolved on Earth with the natural selection of inanimate RNA, then of some RNA nucleotides, then arriving at the ultimate mode of natural selection, self-replication.

    C. Know Thyself. Life Is Simpler Than We Are Told, Including Origin-Nature Of Brain-Consciousness-“Spirituality”***

    The origin-reason and the purpose-fate of life are mechanistic, ethically and practically valueless. Life is the cheapest commodity on Earth.

    As Life is just another mass format, due to the oneness of the universe it is commonsensical that natural selection is ubiquitous for ALL mass formats and that life, self-replication, is its extension. And it is commonsensical, too, that evolutions, broken symmetry scenarios, are ubiquitous in all processes in all disciplines and that these evolutions are the “quantum mechanics” of the processes.

    Human life is just one of many nature’s routes for the natural survival of RNAs, the base primal Earth organisms.

    Life’s evolution, self-replication:

    Genes (organisms) to genomes (organisms) to mono-cellular to multicellular organisms:

    Individual mono-cells to cooperative mono-cells communities, “cultures”.

    Mono-cells cultures evolve their communication, neural systems, then further evolving nerved multicellular organisms.

    Human life is just one of many nature’s routes for the natural survival of RNAs, the base Earth organism.

    It is up to humans themselves to elect the purpose and format of their life as individuals and as group-members.

    Dov Henis (comments from 22nd century)

    An Embarrassingly Obvious Theory Of Everything
    http://universe-life.com/2011/.....verything/

