EV Ware: Dissection of a Digital Organism

Can undirected Darwinian evolution create information?

A celebrated paper titled “Evolution of Biological Information,” which describes a computer program named ev, says yes. It claims that ev illustrates the following properties of evolution.

  • “[Ev shows] how life gains information.” Specifically “that biological information… can rapidly appear in genetic control systems subjected to replication, mutation and selection.”
  • Ev illustrates punctuated equilibrium: “The transition [i.e. convergence] is rapid, demonstrating that information gain can occur by punctuated equilibrium.”
  • Ev disproves “Behe’s … definition of ‘irreducible complexity’ … (‘a single system composed of several well-matched, interacting parts that contribute to the basic function, wherein the removal of any one of the parts causes the system to effectively cease functioning’).”

With a wonderfully friendly GUI (graphical user interface) from the people at EvoInfo.org, it is easy to show that without front-loaded, programmed information about the search, ev simply will not work. These claims therefore ring hollow.

The goal of ev is to identify a given string of bits (ones and zeros). The ev program works because of the structure created by the writer of the program. A Hamming oracle in ev, for example, tells you how close your guess is to the correct answer. Contrast this with undirected random search, where you are told only “No, your guess is wrong” or “Yes, your guess is right.” At a trillion trials per second, it would take “about 12 566 000 000 000 000 000 ([over] twelve and a half quintillion) years” to find the ev target using undirected random search. To identify the target string of bits, the Hamming oracle allows the number of trials to be reduced to thousands, hundreds, and even tens.
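The contrast between the two kinds of oracle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ev's code or the EV Ware GUI's code: the function names, the 40-bit target, and the simple "flip one bit and keep it if the distance drops" strategy are all hypothetical choices made for the example.

```python
import random


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def blind_search(target, rng, max_trials):
    """Yes/no oracle: each random guess learns only 'right' or 'wrong'."""
    n = len(target)
    for trial in range(1, max_trials + 1):
        guess = [rng.randint(0, 1) for _ in range(n)]
        if guess == target:
            return trial
    return None  # gave up without finding the target


def hamming_guided_search(target, rng, max_trials):
    """Hamming oracle: flip one random bit and keep the flip only if the
    reported distance to the target went down (hypothetical strategy)."""
    n = len(target)
    guess = [rng.randint(0, 1) for _ in range(n)]
    best = hamming_distance(guess, target)  # first oracle query
    trials = 1
    while best > 0 and trials < max_trials:
        i = rng.randrange(n)
        guess[i] ^= 1                       # try flipping bit i
        trials += 1
        d = hamming_distance(guess, target)
        if d < best:
            best = d                        # closer: keep the flip
        else:
            guess[i] ^= 1                   # farther: undo the flip
    return trials if best == 0 else None


if __name__ == "__main__":
    rng = random.Random(0)
    target = [rng.randint(0, 1) for _ in range(40)]
    print("blind search trials: ", blind_search(target, rng, 10_000))
    print("guided search trials:", hamming_guided_search(target, rng, 10_000))
```

With a 40-bit target the blind search has about one chance in a trillion per guess, so it will almost certainly give up, while the guided search typically finishes within a few hundred queries. The only difference between the two functions is what the oracle reports back.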

EvoInfo’s EV Ware GUI works in your browser and is easy to use.

ALSO: See the GUI autopsy results for Dawkins’s METHINKS*IT*IS*LIKE*A*WEASEL  at EvoInfo.org

61 Responses to EV Ware: Dissection of a Digital Organism

  1. So, basically, if all proves accurate: Information > neo-Darwinism.

    lol

    Man, if the Conservation of Information Theory proves to be as true as, say, the 2nd Law of Thermodynamics, it will be the death knell of neo-Darwinism.

  2. From the discussion of the above paper, “Evolution of biological information” by Thomas D. Schneider.

    We find:
    ” The ev model shows explicitly how [...] thereby completely answering the creationists.”

    “The ev model can also be used to succinctly address two other creationist arguments.”

    “This situation fits Behe’s (34) definition of ‘irreducible complexity’ exactly [...]”

    “So, contrary to probabilistic arguments by Spetner [...]”

    ———–

    …but I thought ID and creationist arguments were supposed to be untestable.

    Just saying. :P

  3. By the way, that paper is 8 years old.

  4. But I guess that’s why Baylor said it was “much celebrated”… I just didn’t recall that paper. I must have missed the party! :P

  5. And all this demonstrates once again that the great majority of Darwinists do not understand beans about information systems like DNA and should not be allowed to teach biology! Ha! ;-)

  6. Can someone please dumb this down so that I can understand it? Thanks.

  7. Platonist,

    The Ev program uses what is called an “oracle” that gives it information about the search space. Oracles can give more or less information, depending on how they answer a question.

    For example, pretend you are in a field 20 acres large and you want to find an easter egg. You ask the oracle “Is the egg here, where I’m standing?” and the oracle can reply in two different ways: she can say “No”, and leave it at that or she can say “You are 23.5 feet away.” Obviously the second way of answering provides much more information, and you’ll be able to find the egg much faster.

    The Ev simulation uses the second type of oracle which returns a “Hamming Distance”, telling the algorithm how far away from a target the “organism” is. Using this information, one can zoom in on the target rather quickly.

    This oracle is a HUGE source of information. Just using the Hamming Oracle and disabling the “evolutionary” part of the algorithm, you can zoom in even faster. (See “Hamming Search” in the GUI.) This means that the evolutionary part of the search actually hinders the process, it doesn’t add anything to it. Ev turns out to use the available information inefficiently.

    In addition to this, the structure of the digital organism is predisposed to create genomes with few binding sites (zeroes in the GUI). This helps reduce the search space as well, although the information provided by the oracle dwarfs the information gained by the structure. You can test this yourself: after a few thousand runs using both “Ev Bindings” (few ones, lots of zeros) and “Random Bindings” (equal parts on average of zeros and ones), Ev will take longer on average to find random target sites than it will just using “ev” binding sites. (I ran the tests; I can post the numbers if you’d like.) You can also randomly converge on a target of all zeroes, which should be impossible unless there is a bias towards genomes with lots of zeroes.

  8. The Dawkinites should be careful. If evolution can be proven by computer program, they will be proving intelligent design because the programmer is (moderately, in this case) intelligent.

    I think I will write a program to use evolution to evolve a tracked vehicle, thereby proving that automobiles and similar conveyances were not designed, but evolved.

  9. Thanks Atom.

    So what is being said is that the computer program shows nothing. Only that information from an intelligence is required.

    With that said, let’s all hope that 2009 is good to ID. 2008 perhaps wasn’t the greatest year, but let’s not give up hope.

    Darwinism is bankrupt and God exists.

  10. Platonist,

    What is being said is that the Ev program does not represent a “free” source of information. This has implications for other evolutionary algorithms (if they use similar oracles), but the article/GUI is primarily concerned with Schneider’s Ev.

    I disagree; 2008 was a good year for ID, since ID research was being done. Politics and opinions do not change truth; as long as the ID project is moving forward, then ID is having a good year. The only bad year for ID is when ID research is not being done.

    Atom

  11. Platonist @ 6

    Can someone please dumb this down so that I can understand it? Thanks.

    I agree. The question is, what is meant by “information” in this context? Information in DNA, for example, if it can be said to exist at all, does not appear to be the same as the information being conveyed in these posts. There is no ‘meaning’ in the sense of that which is intended by a sender or that which is apprehended by a recipient.

    Atom @ 7 offers as an illustration:

    For example, pretend you are in a field 20 acres large and you want to find an easter egg. You ask the oracle “Is the egg here, where I’m standing?” and the oracle can reply in two different ways: she can say “No”, and leave it at that or she can say “You are 23.5 feet away.”

    As a metaphor for what is happening it is helpful but the difference is that in neither the computer program nor evolution are there intelligent agents asking or answering questions in this way. It is analogous to what is happening in the program but not the same. By the same token could it be that what is happening in the program is analogous to what is supposed to happen in evolution but not the same?

  12. Seversky,

    Welcome to UD.

    An evolutionary search (or any heuristic search method) attempts to improve the chances of a search over brute force exhaustive search (which is not possible for large search spaces) or random sampling (which also doesn’t work fast enough on average for large spaces.) To improve the chances of a search, you must limit the search space by eliminating some sub-space where the target is not likely to be found. Any time you eliminate possibilities, you impart information, which is mathematically the reduction of “uncertainty” or possibilities.
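    The "reduction of possibilities" idea can be put in numbers. The sketch below assumes every remaining possibility is equally likely, so that narrowing a space from one size to another yields log2(before / after) bits in the Shannon sense. The function name and the example sizes are hypothetical, chosen only for illustration.

```python
import math


def bits_gained(possibilities_before, possibilities_after):
    """Bits of information from narrowing a search space, assuming all
    remaining possibilities are equally likely (an assumption of this sketch)."""
    return math.log2(possibilities_before / possibilities_after)


if __name__ == "__main__":
    space = 2 ** 16  # e.g. all 16-bit strings
    print(bits_gained(space, space // 2))  # ruling out half the space
    print(bits_gained(space, 1))           # narrowing to a single string
```

    Ruling out half of the possibilities is worth one bit; narrowing all 16-bit strings down to a single one is worth sixteen.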

    With enough information, you can eliminate all sub-spaces but the target. Again, using my analogy, you can find any easter egg in any field given enough information. Depending on the quality of the information (how much sub-space it eliminates) and how well you use that information, you can greatly reduce the time it takes to find the target.

    Ev uses information given by the Hamming Oracle to retain those genomes “closer” to a target space. It knows it is closer or further away because the Hamming Oracle gives it this information. Without this information, the search doesn’t work. (See Random Output mode in the GUI)

    And you are correct, organisms in the real world cannot hope to have this information given to them, unless an intelligent agent somehow provided it or programmed it. Saying that fitness functions encode this information only pushes the problem and information back one level to the fitness functions.

    So saying that Ev is a viable model for evolution or somehow represents a “free” source of information is false and misleading.

    Atom

  13. Atom:

    Wonderful work! As is evident from most of the discussions here at UD, demonstrating that algorithms cannot generate CSI remains the main point of ID. While we rely on the work of our theorists (Dembski and Marks) to get ever better theoretical demonstrations of that, your practical implementation, showing clearly to us non-mathematicians what is really at stake, is extremely useful. I have often used your weasel ware GUI to help friends who are not familiar with mathematical concepts understand what is really happening in the different models. The ev ware is another precious tool.

    I can’t understand why some people find it difficult to understand the fundamental intuition behind these analyses: these programs already know the target! They just refrain from giving you immediately the correct answer, because otherwise there would be no game, and give it to you in small pieces. But they know the answer!

    The evolutionary process, as it is conceived, does not know the answer. Indeed, it is not even interested in it. Natural selection can only select function, not information. GAs, instead, select information. Their only meaning, in practice, is to show that: if I already know information, I can select it. What an achievement!

    I have always thought that the only true evolutionary simulation should be like that: take a system (a computer) and implement in it simple digital replicators subject to random variation (possibly at an adjustable rate). And then just wait for their “evolution”. We have all that is necessary. One could say: but where is NS? Well, NS is in the same place where it is supposed to be in natural history: it is in the rules of the system and in the rules of the replicator. The replicator has all the chances to become more efficient by random variation and profit from the rules of the system to become something better. So, just wait!

    But the moment the programmer, tired of that infinite wait, starts saying: well, let’s help it a bit; after all, we know what we want to achieve.

    Well, I suppose that’s exactly what a patient designer has been doing…

  14. Atom,

    Ok, I’ll bite. I have a few thoughts on Ev I would like to share.

    First, I think the EvoInfo.org site misses the point of the ev program somewhat. The ev program was designed with the intent of showing how evolutionary processes can increase the amount of information (Shannon Entropy) in a simulated DNA strand. It does this by selecting, randomly mutating and reproducing many strands over many generations. Your presentation on EvoInfo.org doesn’t make clear that this is the intent of the ev program, only that its function is to find a match for patterns or targets. While it looks like from the ev source code that a comparison between already existing DNA strands does occur. This is a completely untargeted relative comparison. There is no target against which each DNA strand is compared.

    Second, about your general objection: you stated that the program functions by means of an oracle which gives the program information about the search space. If I understand your objection correctly, this somehow invalidates the ev program as an evolution simulation because the oracle is providing the program with information that a true evolution simulation would not otherwise have. If true, I disagree. The “oracle” or fitness function in ev is not adding any new information directly or indirectly to any of the DNA strands. It is simply selecting which organisms make it to the next round of mutation and reproduction. That is it. There really is no “target” that ev attempts to find. The oracle really is not an oracle; it really should be called an “environment” or fitness function instead.

    The fitness function or “oracle” in ev exists simply to reproduce the environment that any real DNA strand might have to exist in. Just as an avalanche rushing down the mountain is selecting strongly rooted trees from weakly rooted trees for survival to the next round of reproduction, the selection mechanism or fitness function in ev is doing nothing more. No additional active, free, or non-free information is being added to the trees that live or die, just decisions on which ones live and which ones die. Just like in nature.

    I don’t necessarily care if you respond to my critiques above. However, I would ask that you show me and the others here on this forum where in the source code of ev active, free, or non-free information is being added to the DNA strand.

    P.S Will this get me banned? I heard it was a tough crowd over here :)

    jdaggs

  15. Seversky:

    You ask:

    “The question is, what is meant by “information” in this context? Information in DNA, for example, if it can be said to exist at all, does not appear to be the same as the information being conveyed in these posts. There is no ‘meaning’ in the sense of that which is intended by a sender or that which is apprehended by a recipient.”

    Your question probably shows a lack of familiarity with the ID concepts.

    Of course there is meaning in DNA, and that meaning corresponds to the specified information. As you probably know, information in ID theory means that some result is fixed out of all possible theoretical results (in a system). So, if we are talking of a binary string of 130 bits, for instance, like in the example Atom makes in commenting on the GUI, any single random string is information with a complexity of 1 : 2^130. That kind of information is only a probability, and has nothing to do with meaning. Indeed, in Shannon’s information theory, meaning is not even an issue. Shannon’s theory is a theory about information in this blind sense, and not about meaning.

    On the contrary, specified information corresponds broadly to our intuitive concept of meaning. Specified information is a subset of all possible information, usually a very small one. “Specified” means all information which has some properties which allow us (intelligent observers) to distinguish that information from generic random information.

    There are many ways that information can be specified (see Dembski). But for our purposes, only one is important: functional specification. Information is functionally specified when, in the right context, it can do something which would be impossible without it.

    Going back to your example (DNA and these posts): both are examples of functionally specified complex information. These posts are information which, in the context of the English language, transmits to the reader some specific knowledge or thought. DNA (the protein-coding genes) is information which, in the context of the language of the DNA code, transmits to the translation system the correct functional sequence of a protein.

    In both cases the meaning is abstract, and is encoded in a symbolic language. Both cases are examples of a functional message being conveyed through a symbolic language. Both cases are CSI.

    Just to show you the similarity. I can use this post to send a message to you, a fellow biologist, saying:

    Hey friend, this is the protein whose properties you should study. Just synthesize it and study how it folds. Here is the sequence:

    GTGCTGTGAACTGCTTCATCAGGCCATCTGGcCCCCTTGTTAATAATCTAATTACeCTAGGTCTAAGTAGAGTTTGACGTCCAATGAGCGTTT

    As you can see, I have used this post exactly to do what DNA does; to convey a specific useful information.

    I can agree that these posts can convey a greater variety and complexity of meanings, but after all DNA is only a static mass memory, while we are using these posts to communicate in almost real time. But there is CSI, and therefore meaning, in both.

  16. Sev:

    This might help:

    “Now we believe that the DNA is a code. That is, the order of bases (the letters) makes one gene different from another gene (just as one page of print is different from another)”

    Source?

    Oh, a certain to-be Sir Francis Crick, in a letter to his son Michael, March 19, 1953. [Cited, Thaxton, here.]

    GEM of TKI

    PS: Defining FSCI and showing its roots in OOL research on the informational macromolecules of life, c. 1970′s – 80′s. (FSCI is the relevant subset of Orgel’s CSI.)

  17. …furthermore, I cannot see that any computer algorithm can establish (in principle) that a digital organism is biologically viable at EVERY stage in its “search” towards the target.

  18. The “Oracle” in nature is differential reproduction. The information it provides is whether, given either a change in the environment or a change in the organism, the result is more or less reproduction.

    The child’s guessing game of “warmer or colder” is the only answer an oracle imitating natural selection needs to give. Warmer means more successful reproduction and colder means less successful.

    This seems to logically result in the simplest, fastest reproducer coming out on top every time in an environment with limited resources. Indeed, that’s what origin-of-life experiments with self-replicating ribozymes have shown: you get something uber-simple that takes over the whole flask. “Evolution” (Darwinian or otherwise) produces not a single simplest fastest reproducer (due to variations in environments) but if the winner is declared by metrics of biomass and/or number of individuals then the simplest creatures, bacteria, also thought to be the first creatures, are the winners.

    Predation, weaponization, and defense might conceivably account for a departure from simplicity and reproductive speed.

    The bottom line for explanations of evolution, as always, remains with how the first reproducer that was able to play the warmer/colder game came about and in that regard probably the most difficult thing to explain is how the abstract genetic code came about.

    One of the quotes on the EvoInfo page zeroes in on this:

    “The information content of amino acid sequences cannot increase until a genetic code with an adapter function has appeared. Nothing which even vaguely resembles a code exists in the physio-chemical world. One must conclude that no valid scientific explanation of the origin of life exists at present.”

    Hubert Yockey, “Self Organization Origin of Life Scenarios and Information Theory,” Journal of Theoretical Biology 91 (1981): 13.

    Some of the other quotes make related points. The selection of quotes on the EvoInfo home page is excellent, by the way, and makes for an enjoyable reading.

  19. jdaggs:

    This is just a request for information, in order to understand better. Indeed, I don’t know the ev program in detail, so I would like to be sure I understand how it works.

    I have tried to read the paper, and I am interested in understanding how the selection process works, because I think that is the most relevant point.

    I quote here a couple of phrases from the paper:

    “…the program arbitrarily chose the (16) site locations, which are fixed for the duration of the run”

    If I understand correctly, that would mean that the program decides the 16 site locations and knows them.

    “The organisms are subjected to rounds of selection and mutation. First, the number of mistakes made by each organism in the population is determined. Then the half of the population making the least mistakes is allowed to replicate.”

    So, it seems, selection happens by determining how much the mutated organism can localize the site locations known to the program. In other words, these simple organisms are mutated (both the localizing “gene” and the sequence where the localization has to be made) in order to match a pre-fixed distribution of localization sites. It seems to me that the pre-fixed distribution is indeed a target. And the measurement of the mistakes is certainly an oracle. And the measurement can be done only because the program already knows where the sites have to be localized.

    Am I wrong? (this is not a rhetorical question: I just want to understand)

    And now, some personal comments.

    Again, I would like to emphasize that selection in these simulation programs seems to be based on one of two methods:

    a) The results are directly matched against the information (the “solution”). That should be the case in the weasel example.

    b) The results are “measured” for some property related to the known solution (so called “fitness function”). That would be the case in the ev program, if I understand correctly.

    Well, I believe that neither a) nor b) is a model of natural selection. For a) it is rather obvious. The program knows the solution, and uses it directly to get the solution. Nothing could be more trivial.

    But for b) too, the program has to know the solution, in order to measure against it, although more indirectly. So here too we have a lot of active information in the program.

    We are here in a context similar to the intelligent selection in protein engineering, or to the intelligent selection in antibody maturation: a random variation is applied to a well specified target, and a very intelligent selection is made in order to get a pre-fixed result. In the case of protein engineering, that can be the measurement of the desired function. In the case of antibody maturation, that would be the measurement of the affinity for the (known) antigen, whose information is stored in the immune system (probably in the antigen presenting cells).

    In both cases, the system knows what it wants to select, measures it, and selects. That’s the big difference with NS. In NS, the system is supposed to be completely blind. Both the environment and the replicators have no idea of what they should achieve. The emerging function must emerge without any pre-conceived accommodation, only by virtue of a random acquisition of some unexpected advantage in the existing, blind system formed by the environment and the replicator.

    I think the difference is really big. It means that the programmer must know nothing of what will be selected. It means that the selected function has to emerge on its own merit. It has to be functional, and not only “measurable”. It is easy to “evolve” measurable functions when we know what the function must be: we restrict our target, use random variation on it, and are then able to measure any possible increase, even if very small, of our expected function. The immune system does that in antibody maturation. Protein engineers do the same thing. GA programmers do the same thing.

    But none of that is in any way a model of NS. And none of that is a model of CSI generation without active information.

  20. 1. Biological information is related to biological function.

    2. Shannon information does not care about meaning nor function.

    And I would say:

    3.- Biological information is not the sequence of nucleotides.

    4. The sequence is important only in carrying out the instructions embedded on the DNA via the pre-determined genetic code- as in which codons represent which amino acids, start and stop positions.

  21. jdaggs,

    Thank you for your thoughts.

    The “target” in Ev is a string that is bound at only 16, fixed binding sites, and is not bound at any other location. Indeed, the “Error Count” shows the Hamming Distance or the distance from this target. (It counts how many sites it is bound at that it shouldn’t be bound at as well as how many it still needs to be bound at.)

    Once you have this information, you can zero in on a target extremely rapidly as the Hamming Search demonstrates. Hamming Search mode doesn’t even use an “evolutionary” reproduction scheme; it just efficiently extracts the information already given to us by the same Hamming Oracle that Ev uses. Armed with that information, we can find any target of a fixed length in N + 1 queries.
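    One way to arrive at the N + 1 figure can be sketched as follows. This is an assumed strategy written for illustration, not the code in “ev_simulation_array_based.js”, and the class and function names are hypothetical: query the all-zeros string once, then probe each of the N positions with a single one. A probe that comes back closer than the baseline marks a one in the target.

```python
class HammingOracle:
    """Holds a hidden target and reports only the Hamming distance to it."""

    def __init__(self, target):
        self._target = list(target)
        self.queries = 0  # how many times the oracle has been asked

    def distance(self, guess):
        self.queries += 1
        return sum(g != t for g, t in zip(guess, self._target))


def hamming_search(oracle, n):
    """Recover an n-bit target with 1 baseline query plus n single-bit probes."""
    baseline = oracle.distance([0] * n)  # distance of all-zeros = number of ones
    recovered = []
    for i in range(n):
        probe = [0] * n
        probe[i] = 1
        # Setting bit i moves us closer only if the target has a one there.
        recovered.append(1 if oracle.distance(probe) < baseline else 0)
    return recovered


if __name__ == "__main__":
    oracle = HammingOracle([1, 0, 1, 1, 0, 0, 1, 0])
    print(hamming_search(oracle, 8), "found in", oracle.queries, "queries")
```

    No population, mutation, or reproduction is involved; everything the search learns comes from the distances the oracle hands back.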

    In regards to your question about where the information input is in the source code, it is given by the oracle that evaluates the search space and a given genome string and returns the Hamming Distance (Error Count) of this genome, so that selection can do its work. I can highlight the code, but it is easy to find (I included a link to the source code at the bottom of the GUI. It is in the “ev_simulation_array_based.js” file.) Does the Oracle itself represent a free source of information? No, because the oracle itself needs to be given the information about the target. There is no free source of information, we just push the problem back one level.

    Furthermore, as mentioned before, the very data structure used by Ev to represent a digital organism and “binding” produces phenotypes with lots of “zeros” (not bound) and few “ones” (bound). This increases the odds of finding a string with few ones and lots of zeros.

    Atom

    PS As far as I know, you will only be banned if you become disrespectful or if you disagree with a certain mod in a political post. Just steer clear of politics on the site and be as courteous as you would be face to face with people, and you will be here a while.

  22. Addendum:

    jdaggs wrote:

    While it looks like from the ev source code that a comparison between already existing DNA strands does occur. This is a completely untargeted relative comparison. There is no target against which each DNA strand is compared.

    The “comparison” is based on a metric, in this case the Hamming Distance. The Hamming Distance is based on the divergence from a fixed target (phenotype, or output, string.) So in short, your statement “This is a completely untargeted relative comparison” is mistaken.

    To return to the example I gave earlier, let’s say we have 10 searchers in our field for the eggs. They each wander to a spot in the field, and then I assign a number to them based on their exact distance from the egg. You then do your “relative” comparison among these searchers using the Distance numbers I assigned earlier, eliminate the 5 farthest (furthest?), and replace them with 5 new searchers placed randomly one foot away from each of the remaining searchers. Then repeat the process.
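    The egg-hunt procedure can be written out as a short sketch. The field size (a square roughly 933 feet on a side, which is about 20 acres), the egg location, and all function names are assumptions made for the example. Note that the egg's position is used in exactly one place: the evaluation step that assigns each searcher a distance.

```python
import math
import random


def distance_to_egg(searcher, egg):
    """Evaluation step: the only place the egg's location is consulted."""
    return math.hypot(searcher[0] - egg[0], searcher[1] - egg[1])


def one_round(searchers, egg, rng, step_feet=1.0):
    """Keep the nearer half, then place one newcomer a step away from each survivor."""
    ranked = sorted(searchers, key=lambda s: distance_to_egg(s, egg))
    survivors = ranked[: len(ranked) // 2]  # "relative" comparison on assigned numbers
    newcomers = []
    for x, y in survivors:
        angle = rng.uniform(0.0, 2.0 * math.pi)  # random direction, fixed step
        newcomers.append((x + step_feet * math.cos(angle),
                          y + step_feet * math.sin(angle)))
    return survivors + newcomers


def closest(searchers, egg):
    return min(distance_to_egg(s, egg) for s in searchers)


if __name__ == "__main__":
    rng = random.Random(0)
    side = 933.0          # assumed square field, about 20 acres
    egg = (400.0, 250.0)  # hypothetical egg location
    searchers = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(10)]
    print("closest at start:", round(closest(searchers, egg), 1), "ft")
    for _ in range(2000):
        searchers = one_round(searchers, egg, rng)
    print("closest at end:  ", round(closest(searchers, egg), 1), "ft")
```

    The selection step never looks at the egg directly, yet the nearest searcher can never get farther away from one round to the next, because the ranking it relies on was computed from the egg's position.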

    It is clear where the Oracle’s information is input: it is at the “evaluation” step, which is assigning a number based on a fixed target. The selection step uses this number in the comparisons. Again, even though the information is hidden one step back in the process it is still there. Therefore, Ev does not demonstrate a free source of information; it only demonstrates the inefficient use of Active Information given by the Hamming Oracle.

    Atom

  23. I’m not sure about the Hamming oracle being so bad. People seem to be hung up on details and overlooking that it is just a simulated fitness function. I mean, I’ve seen some of the challenges people make here over ‘active information’ and how much of it can be put in. For some people, the mere act of writing a genetic algorithm is enough to load in Active Information, which means that no simulations of nature can ever be run.

    The Hamming oracle is giving out no more than a value. It has to be a value with a range, because organisms must compete. So yes, it’s giving information, but no more than an environment does (an awful lot of fitnesses can be summed up with a single number, after all. Leg length. Adhesive stickiness. Body weight. etc.).

    Atom:

    And you are correct, organisms in the real world cannot hope to have this information given to them, unless an intelligent agent somehow provided it or programmed it.

    This is clearly untrue! What if dna encodes for body size, and mice have to escape down some small holes? If I simulated that on a computer, I would of course be testing the digital bodies against digital holes, and telling them how well they did. That’s active information! Are you telling me that this invalidates the real life example, or are you agreeing that actually, natural information is easily passed from environment to lifeforms? Because I’m sure the population of mice will notice that there aren’t as many fat ones around.

    Saying that fitness functions encode this information only pushes the problem and information back one level to the fitness functions.

    Which in nature, arise as a result of lawlike processes (hills have slopes, some trees are taller, etc.). If you’re saying that ev loads in active information, this is directly analogous; you are saying that environments load in information to lifeforms.

    The next question, of course, is how much information does an environment create by natural processes? I believe it will be a lot more than the few hundred bits of ev.

  24. Venus Mousetrap,

    Let me cut to the heart of your criticism. You wrote:

    The Hamming oracle is giving out no more than a value. It has to be a value with a range, because organisms must compete. So yes, it’s giving information, but no more than an environment does (an awful lot of fitnesses can be summed up with a single number, after all. Leg length. Adhesive stickiness. Body weight. etc.)

    An environment does not return a value when evaluating fitness that is like “ten DNA bases away from a functional organ.” (I know you aren’t saying it does, but to be similar to Ev, it would have to.) Ev is giving the distance from a small functional subspace (how many binding sites the genome string missed), and this information alone is enough to quickly find the target, in N + 1 steps, using Hamming Search. No evolutionary search is needed and indeed, using one is inefficient. The math and GUI decisively show this. Try it yourself.

    Remember, the Hamming Oracle is not a free source of information. It needs information about the target to work properly. (If not, then how could it ever provide information about “distance” from the target? It would degenerate to a needle-in-the-haystack oracle.)

    So if nature acts as a Hamming Oracle, what encoded the information about the target space into nature (the fitness function)? What chose the correct fitness function that encodes this information about the target space?

    Again, you are only pushing the information problem up one level.

    Atom

  25. Continued:

    Now let’s examine fitness functions themselves.

    You may argue that nature provides information about “closeness” by rewarding those genomes that are “closer” to a functional target with more offspring. But then your fitness function (the fitness slope itself) would encode the information about closeness to target. Indeed, there are many fitness slopes which would not work: ones with sparse islands of functionality, ones where there are no smooth paths to highly functional states, ones that rewarded genomes that were actually far away from functional states (relative to whatever function you are trying to develop), etc. What chose this particular fitness function/landscape out of all the possible ones?

    Any time you exclude possibilities (reduce uncertainty), you are mathematically providing information (in the Shannon sense.) We have now pushed the problem up one level to selecting the proper fitness function.
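    As a rough illustration of that last point, the following Python sketch counts the bits gained when a set of equally likely candidates is narrowed down. The function name bits_gained is hypothetical, and the uniform prior over candidates is an assumption made only to keep the arithmetic simple.

```python
from math import comb, log2


def bits_gained(candidates_before, candidates_after):
    """Shannon information (in bits) from narrowing a set of equally
    likely candidates down to a smaller set."""
    return log2(candidates_before / candidates_after)


n = 8  # length of a toy bit string

# A needle-in-the-haystack "yes" singles out 1 candidate out of 2**n.
print(bits_gained(2**n, 1))

# A Hamming-style answer "your guess is 2 bits off" leaves only the
# comb(n, 2) strings at distance 2 from the guess.
print(bits_gained(2**n, comb(n, 2)))
```

    Under these assumptions even a single distance answer excludes most of the space, which is the sense in which the oracle (or a fitness function that behaves like one) supplies information.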

  26. In the above post:

    “(uncertainty)” technically should read “(reduce uncertainty)”

    in the last paragraph…

  27. An environment does not return a value

    How about the number of dead bodies? :P

    More seriously, environmental conditions can funnel a search but the problem is twofold:

    1) the funneling is usually not balanced like an intelligent search would be. In engineering, evolutionary searches should use a generalized (not explicitly defined) target so as to find solutions to problems that we might not have conceived of otherwise. The problem is that nature is usually too generalized and cannot efficiently optimize the search within reasonable time constraints.

    2) Environmental conditions necessary for funneling the search may not be available for all functionality. I wrote about this at length fairly recently but it was ignored…possibly because the comment was quite lengthy.

  28. Thanks Patrick, but dead bodies (I know you were kidding) do not tell you how far away from a functional subset you are.

    The points you bring up in your other post are apropos here: not every fitness function will lead you to the functional subset, therefore you need the correct type of fitness landscape to hold for a given duration of time for self-selection (differential reproduction) to have any reasonable hope of ever locating that functional subset.

  29. DaveScot (18),

    “Evolution” (Darwinian or otherwise) produces not a single simplest fastest reproducer (due to variations in environments) but if the winner is declared by metrics of biomass and/or number of individuals then the simplest creatures, bacteria, also thought to be the first creatures, are the winners.

    Ah, so word gets around. Yes, DaveScot, there is an evolutionary free lunch. Wolpert and Macready did not take time into account. Going from the simple to the complex is just the way nature works, and it happens to be the best way to proceed in optimization of a “really black” black box.

    There is no reason to impute design to a natural process that goes from simplicity to complexity.

    Predation, weaponization, and defense might conceivably account for a departure from simplicity and reproductive speed.

    These are successful departures from simplicity and reproductive speed. I would describe biological complexity as a pyramid, with each level continually generating variations that usually fail and sometimes succeed, but everyone would imagine far too small a base. It is one thing to consider that unicellular organisms account for most of the earth’s biomass, and quite another to consider that they account for almost all of its reproductive trials.

  30. Another old post relevant to this topic:

    Dodgen Daily

  31. Venus Mousetrap:

    please, see also my points at #19 about NS.

    I believe there are two fundamental errors in your reasoning.

    The first, and most important, has been clearly shown by Atom in #24 and 25. The problem of metrics is essential, and corresponds to the situation in my second case of “artificial” NS in GAs:

    “b) The results are “measured” for some property related to the known solution (so called “fitness function”). That would be the case in the ev program, if I understand correctly.”

    Now, you can have two different kinds of metrics. In one case, you can return a number which measures the real distance from the result. That is a very explicit kind of oracle, and obviously requires perfect knowledge of the solution. It is no different from the weasel situation.

    In the second case, let’s say a generic fitness function, the measurement is not absolute, but relative: it can measure the emergence, or the increase, of a pre-determined function, with great sensitivity and specificity. Here again, even if the exact information for the solution is not needed, you need knowledge of two very important and almost equivalent things: what is the function you want to attain, and how to measure it efficiently, even at very low levels.

    To understand the importance of just knowing what the function is, let’s take the example of protein engineering through partial targeted random variation, which I quote in my post #19. Just imagine what would happen if the engineers, after applying a round of random mutations, had to measure any possible emergence or increase of any possible function, even at very low level, and then select each case. The method would immediately become empirically useless. Instead, if you can measure for only one function, the one for which the whole experimental setting (including choice of the original library) was programmed, and promote any tiny sign of that function against all other possibilities, then the system works. So, the advance choice of what to look at is fundamental, and is an explicit act of engineering.

    And you should also recognize the importance of being able to measure the function with a specific, intelligently programmed measurement procedure. That allows you to measure any observable variation in function within the limits of the sensitivity of your measurement system, and you will be careful to intelligently provide a measurement system which is sensitive enough. Even if you choose to use in your GA a binary oracle, based on a threshold, the setting of the threshold is still an indirect form of metrics.

    You can object that NS is something like that: it works with a threshold, and utilizes a binary oracle. That’s correct, but the important point is that the measurement in NS is neither specific nor sensitive, because it is extremely indirect: indeed, it is not in any way connected to the searched function itself (also because there is no searched function: NS is by definition blind), and it selects on the basis of any generic improvement in survival. Please, reflect that improvement in survival implies an extremely high threshold for any function: the function must be present, relevant, and must be, alone, capable of significantly increasing the survival of the whole replicator. And the measurement is also extremely slow.

    To understand how fundamental are these differences, let’s take the second example which I quote in my post #19: antibody maturation in the immune system after the primary immune response. Here the system, after having applied targeted partial random variation, selects for increased affinity to the pre-determined antigen (which is known to the system, because it is the antigen which primarily activated the immune response). In a few months, the antibody affinity for that antigen greatly increases.
    But just think if the immune system had to find what clone has expressed the increased affinity just by measuring the survival increase: in that case, assuming as a thought experiment that the new clones could be genetically transmitted to different individuals, the model would work only if the individuals with a higher affinity for that antigen invariably could survive better.

    That is not only a slow way of measurement, it is a way which would never work, because there is no reason that a higher affinity in an antibody to that specific antigen must be a factor which improves survival in a generic situation. That’s why the immune system builds a generic low sensitivity library of antibodies for each one of us (the primary repertoire), and engineers higher affinity antibodies in each case where a specific antigen has been met (the secondary response). That makes absolute sense, and is based on careful engineering which allows the system to work on two basic pieces of information: what the antigen is (information given by the primary immune response), and how to measure the affinity of each clone against it (very intelligent information coded into the immune system itself).

    So, I conclude pointing at the second error in your reasoning, which should by now be self-evident. You say:

    “The next question, of course, is how much information does an environment create by natural processes? I believe it will be a lot more than the few hundred bits of ev.”

    No, it’s exactly the contrary. The environment passes not a lot of information, it is just as it is, and that becomes information for the replicator only indirectly through the rough measurement implied in survival. But consider the case of antibodies. If the replicator received information from the environment about all possible existing antigens, that would be meaningless to it. It’s only the restriction of information (this antigen is more important now) which allows the replicator to react intelligently through a specific measurement system. In other words, useful (functional) information is a restriction of possibilities, not an increase of them. You gain useful information when you can restrict the field of possibilities. That’s why CSI (and in particular FSCI) is the most useful information: it gives you the possibilities which work, and you can work without trying all the others.

    That’s why ev is so powerful. Because it has so many bits of CSI in it. Remember, bits of CSI are a measure of how improbable that information is, in other words they are a measure of how much you are restricting the field vs a random search. I quote from the important paper by Durston, Chiu, Abel and Trevors in “Theoretical Biology and Medical Modelling”:

    “The measure of Functional Sequence Complexity, denoted as ζ, is defined as the change in functional uncertainty from the ground state H(Xg(ti)) to the functional state H(Xf(ti))”

    and

    “The resulting unit of measure is defined on the joint data and functionality variable, which we call Fits (or Functional bits). The unit Fit thus defined is related to the intuitive concept of functional information, including genetic instruction and, thus, provides an important distinction between functional information and Shannon information”
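    The quoted definition, a drop in uncertainty from a ground state to a functional state, can be sketched in a few lines of Python. This is only an illustration, not the method of the paper: the toy alignment is invented, the names site_entropy and functional_bits are hypothetical, and the ground state is assumed to be a uniform distribution over a 20-letter alphabet.

```python
from collections import Counter
from math import log2


def site_entropy(column):
    """Shannon entropy (bits) of the symbols observed at one aligned site."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def functional_bits(alignment, alphabet_size=20):
    """Sum over sites of (ground-state entropy - functional-state entropy).

    Assumption: the ground state is uniform over the alphabet, so its
    entropy is log2(alphabet_size) at every site."""
    h_ground = log2(alphabet_size)
    return sum(h_ground - site_entropy(col) for col in zip(*alignment))


# Invented toy alignment of four "functional" sequences, three sites each.
toy_alignment = ["ACD", "ACE", "ACD", "ACE"]
print(functional_bits(toy_alignment))
```

    Sites that never vary contribute the full log2(20) bits each, while the third site, which tolerates two residues equally, contributes one bit less.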

    It’s that simple. A program like ev contains many Fits of functional information. The environment doesn’t.

  32. Patrick

    The problem is that nature is usually too generalized and cannot efficiently optimize the search within reasonable time constraints.

    Haldane’s Dilemma

  33. Going from the simple to the complex is just the way nature works-Sal Gal

    Evidence please.

    There is no reason to impute design to a natural process that goes from simplicity to complexity.

    Mere complexity does not get one a design inference.

    And do you have evidence for such a process?

  34. gpuccio,
    there’s a mistake in the sequence you’ve presented.

  35. sparc:

    Maybe a useful mutation? :-)

  36. Maybe a useful mutation

    If as I assume you’ve used the IUPAC code to write the sequence there is a non-defined character in the sequence.
    (IIRC I’ve answered before but the comment didn’t show up here. Maybe I forgot to hit the submit button)

  37. My thanks for the previous comments but I find I am still grappling with the concept of information. The problem is that it seems to be used to mean different things in different contexts. It can mean the thoughts and opinions being shared through a blog like this, it can mean what is represented by electrons being shuffled around the circuitry of a computer or it can mean the shapes and arrangements of molecules in the gelatinous blob of a tiny living cell. Are they, in fact, different things or do they have something underlying in common which qualifies as information?

    What we commonly think of as information seems to be what is called semantic information embodied in messages passed between intelligent agents such as ourselves. That involves intention and the capacity to extract meaning from messages which can also be distinguished from background ‘noise’. I found this passage from an article called “The Information Challenge” by Richard Dawkins which was helpful:

    Redundancy is any part of a message that is not informative, either because the recipient already knows it (is not surprised by it) or because it duplicates other parts of the message. In the sentence “Rover is a poodle dog”, the word “dog” is redundant because “poodle” already tells us that Rover is a dog. An economical telegram would omit it, thereby increasing the informative proportion of the message. “Arr JFK Fri pm pls mt BA Cncrd flt” carries the same information as the much longer, but more redundant, “I’ll be arriving at John F Kennedy airport on Friday evening; please meet the British Airways Concorde flight”. Obviously the brief, telegraphic message is cheaper to send (although the recipient may have to work harder to decipher it – redundancy has its virtues if we forget economics). Shannon wanted to find a mathematical way to capture the idea that any message could be broken into the information (which is worth paying for), the redundancy (which can, with economic advantage, be deleted from the message because, in effect, it can be reconstructed by the recipient) and the noise (which is just random rubbish).

    I understand from the illustration about the Concorde flight how a message can be stripped down to its bare essentials in terms of information and that Shannon expressed this in a mathematical form in which the meaning was irrelevant but what, exactly, is information?

    The question I asked myself is this, the message about the Concorde flight would have told the recipient something they didn’t know before, namely, when and where the traveller’s flight was due to arrive. But suppose the sender was uncertain whether the message had been received so sent it again just to be safe, would it still contain information? Suppose the recipient had read the first message, they would no longer be surprised or informed by the second message, yet it was exactly the same as the first, so what is the information it contained?

    It seems to me that information is not so much a property of the message as it is a description of the relationship between the message and the recipient or, more precisely, the change the message causes in the state of the recipient. In a sense, it’s a process rather than an attribute. In the case of the Concorde flight message, the first one changed the state of the recipient by adding new knowledge, the second did nothing because the knowledge was already there.

    The other problem is that I can see how it would be possible to construct a broad definition of ‘information’ that would encompass both semantic information and what happens in a computer or a living cell. On that basis, for example, you could show how an organism acquires new ‘information’ from the environment as Paul Davies has argued. But I also found a piece on a blog by philosopher John Wilkins which argues that it is misleading to think of DNA and what happens at the genetic level as information at all:

    A recent New Scientist article poses the often-posed question in the title. The answer is mine. Forgive me as I rant and rave on a bugbear topic…

    OK, I know that we live in the “information age” and far be it from me to denigrate the work of Shannon, Turing and von Neumann, but it’s gotten out of hand. Information has become the new magical substance of the age, the philosopher’s stone. And, well, it just isn’t.

    In the article linked, physicist William Bialek at Princeton University argues that there is a minimum amount of information that organisms need to store in order to be alive. “How well we do in life depends on our actions matching the external conditions,” he says. “But actions come from ‘inside’, so they must be based on some internal variables.”

    This is a massive fallacy. Really, really massive. Consider what it relies upon apart from superficial authority and technobabble: it means that organisms must be computers, that they must store data in variables, and that nothing can occur unless it is based on an internal program. For gods’ sakes, hasn’t Bialek heard of causality? You know, physical properties that cause states of affairs? Or is he going to go the John Wheeler route and claim that everything is information (in which case, why care about the information of living systems)?

    Calling everything information is massive projection, or even anthropomorphism. It takes something that exists as a semantic or cognitive property and projects it out to all that exists. It makes observers the sole reality. In biology, the concept of information has been abused in just this way, but it’s a peculiarly twentieth century phenomenon. And that’s not coincidental – in 1948, Shannon and Wiener both presented radical and influential conceptions of information – one based on communication [1], and the other on control [2]. Previously, in the 1930s, Alan Turing had developed the notion of a computer, and in 1950 [3] he started the ongoing interest in computation as a form of cognition. So, three senses of “information” got conflated in popular (and technical) imagination, and shortly afterwards, the term was applied to genes, but (and this is often forgotten) just in terms of causal specificity – genes “coded for” proteins by a physical process of templating.

    But people have gotten all enthusiastic for “information” (bearing in mind the etymology of enthusiast as “in-godded one”), and as a result lots of rather silly claims have been made – not all by physicists by any means – about information in biology.

    We need to specify carefully what counts as information and what doesn’t. I personally think that DNA is not information – allow me to explain why.

    http://scienceblogs.com/evolvi....._for_l.php

    I’m not sure where I stand on this and he admits it is a minority view but it is provocative.

  38. Seversky:

    I think we should be clear about two very different meanings of the word “information” in scientific discourse.

    One is Shannon’s entropy H. Although that value is often referred to as information, it is just a measure of uncertainty. Indeed, Shannon’s theory has nothing to do with meaning. So, you must be aware that a random sequence, with no meaning, has the highest value of H. In other words, it is not compressible, it is not redundant, and if you have to communicate it you need the highest number of bits. But it may well mean nothing.
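    A small Python sketch can show that H measures uncertainty and not meaning. The function name shannon_entropy is hypothetical, and the estimate below is the simplest possible one: it looks only at single-symbol frequencies and ignores any correlation between neighbouring symbols.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq):
    """Per-symbol Shannon entropy (bits), estimated from symbol frequencies."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * log2(c / total) for c in counts.values())


constant = "AAAAAAAAAAAAAAAA"   # no uncertainty about the next symbol
two_symbols = "AACCAACCAACCAACC"  # two equally frequent symbols
four_symbols = "ACGTTGCAGTACCATG"  # four equally frequent symbols

for s in (constant, two_symbols, four_symbols):
    print(s, shannon_entropy(s))
```

    For a four-letter alphabet the value can never exceed log2(4) = 2 bits per symbol, and a sequence whose symbols are equally frequent reaches that ceiling whether or not it means anything.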

    When you say:

    “What we commonly think of as information seems to be what is called semantic information embodied in messages passed between intelligent agents such as ourselves.”

    You are instead obviously referring to what is usually called “meaning”. The concept which tries to explicitly and objectively formalize “meaning” is the concept of specification, for instance as it is given in Dembski.

    In other words, any complex sequence which can be “recognized” as special by intelligent agents is “specified”.

    Specification comes in different flavors. As I have already said, the most useful form of specification is functional specification, but there are other forms: for instance, even a random sequence may become specified if it is given in advance before it is found in the supposed random event (pre-specification).

    But let’s go back to functional specification: it is a sequence which can do something specific in a specific context. Human language (these posts) is a form of functional specification. And so is a computer program, or the project for a machine.

    You say:

    “The problem is that it seems to be used to mean different things in different contexts. It can mean the thoughts and opinions being shared through a blog like this, it can mean what is represented by electrons being shuffled around the circuitry of a computer or it can mean the shapes and arrangements of molecules in the gelatinous blob of a tiny living cell.”

    I don’t agree that they are different things. You seem to confound the software with the hardware, the specified information with the hardware where it is implemented. In the sense of specified information, of “software”, our opinions, the arrangement of electrons in a computer, and the arrangement of molecules in the cell are one and the same thing: software, specified information. They are, obviously, different software. And they are “written” in different hardware. But information is essentially an immaterial concept, and is independent of the hardware, as we well know.

    In other words, you can describe the arrangement of the molecules in a cell through a computer code, or in one of these posts, through human language, but you are still describing the same information.

    Finally, I have to really disagree with Wilkins: DNA contains specified information, and a lot of it. That’s beyond any doubt. I have read what Wilkins says, but I disagree completely. If you just use the very simple definitions of information above, and not the abstruse and unnatural self-definitions given by Wilkins, you will see that there may be no doubt that DNA stores a lot of information, exactly in the same sense that a hard disk stores a lot of functional programs: DNA is a mass memory, and it contains “at least” all the information for the synthesis of the proteins in an organism.

  39. Seversky,

    I think you are right that information gets used in different ways. But we tend to use it in common language as data that mean something. And that is how I believe most use it here.

    So information is just a piece of data and each nucleotide is a data point. Search the internet for definitions of the word. For example, go here

    http://www.onelook.com/?w=information&ls=a

    Use for information such things as

    news; intelligence; words

    facts; data; learning; lore

    Each nucleotide is a piece of information just as each molecule in a rock is a piece of information. Both DNA and a rock are complex so each is an example of complex information. However, some units of the information in the DNA specify something else, for example a gene specifies a protein and sometimes RNA. And some of these proteins and RNA have functions and some of these proteins and RNA work together as functional units or systems. So the information in the DNA is complex, specified, and the elements specified are functional. Life is the only place this appears in nature. It appears quite frequently with human intelligence. Now it is quite possible that not all the DNA in a genome is specified and functional and may be just junk but a large part is not.

    Sometimes people invoke Shannon’s concepts, which I do not understand, to get at the unlikelihood of a particular combination that has functionality. The unlikelihood of a particular combination of data that specifies a function is usually extremely low. However, I cannot pronounce on the appropriateness of Shannon’s concepts to do so but common sense indicates the near zero probabilities of most of the combinations and that is what is at issue.

    I cannot see how DNA is not information under the every day meaning of the term and I assume it is information also on some of the esoteric uses of the term.

  40. Seversky #37

    Nice comment.

    The problem is that it seems to be used to mean different things in different contexts.

    Of course! The informal word “information” has a range of informal meanings and has had several different formal interpretations.

    It seems to me that information is not so much a property of the message as it is a description of the relationship between the message and the recipient or, more precisely, the change the message causes in the state of the recipient.

    I absolutely agree. The use of the word “information” in the phrase CSI is a different use from the usual English language use. CSI is just the probability of an observed outcome being of a certain type given some assumptions about how the outcome was generated. The word “information” gives CSI a certain gravitas and links it with the idea that the outcome might have been deliberately arranged or “designed”.

  41. Mark:

    You say:

    “The use of the word “information” in the phrase CSI is a different use from the usual English language use. CSI is just the probability of an observed outcome being of a certain type given some assumptions about how the outcome was generated. The word “information” gives CSI a certain gravitas and links it with the idea that the outcome might have been deliberately arranged or “designed”.”

    As it often happens, I partially agree with you: in a sense, that is exactly the important point of ID.

    CSI is an “objective” definition of an objective property of an outcome, which is objectively recognizable and measurable.

    So, you are right: CSI is not “in itself” the meaning, the useful information. It is an objective property which, according to ID theory, allows us to “infer” correctly that that output is designed, and therefore it represents a meaning, or, as Seversky says, a form of “semantic information embodied in messages passed between intelligent agents such as ourselves”. The aspect of CSI which more directly “corresponds” to the common concept of “meaning” (without being completely equal to it) is specification.

    The key aspect is that design is “inferred” from the existence of CSI. CSI is not design. CSI is an objective formal property that we in ID believe, on the basis of empirical observations and of theoretical considerations, to be invariably associated with the process of design. So, we could say that CSI (once correctly defined, observed and measured) is a fact (an observable), or if you prefer an observable property of facts, while the design inference is a theory (just to remain epistemologically correct).

    Obviously, some people may not agree, as we well know, that CSI is objectively observable and measurable, but that is another aspect. In other words, let’s say that “if” (as I believe) it is true that CSI is a property which can be objectively observed and measured, then it is an observable, while the design inference is anyway a scientific theory, an inference.

    So, your point is good, but it is not the word “information” which “gives CSI a certain gravitas and links it with the idea that the outcome might have been deliberately arranged or “designed”” (at least, not in ID): in ID, it is a whole scientific theory which accomplishes that.

  42. Mark (and GP et al):

    A few remarks:

    1] [Shannon] Info is . . . tied to probability

    First, let us note that the basic Shannon-style metric of information is a probabilistic metric, here excerpting my note, Section A at a point where I build on Connor’s classic remarks:

    . . . let us consider a source that emits symbols from a vocabulary: s1,s2, s3, . . . sn, with probabilities p1, p2, p3, . . . pn. That is, in a “typical” long string of symbols, of size M [say this web page], the average number that are some sj, J, will be such that the ratio J/M –> pj, and in the limit attains equality. We term pj the a priori — before the fact — probability of symbol sj. Then, when a receiver detects sj, the question arises as to whether this was sent. [That is, the mixing in of noise means that received messages are prone to misidentification.] If on average, sj will be detected correctly a fraction, dj of the time, the a posteriori — after the fact — probability of sj is by a similar calculation, dj. So, we now define the information content of symbol sj as, in effect how much it surprises us on average when it shows up in our receiver:

    I = log [dj/pj], in bits [if the log is base 2, log2] . . . Eqn 1

    This immediately means that the question of receiving information arises AFTER an apparent symbol sj has been detected and decoded. That is, the issue of information inherently implies an inference to having received an intentional signal in the face of the possibility that noise could be present. Second, logs are used in the definition of I, as they give an additive property: for, the amount of information in independent signals, si + sj, using the above definition, is such that:

    I total = Ii + Ij . . . Eqn 2

    In short, a metric based on probabilities is inherent to the generally used concept of information in info theory.
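    Eqns 1 and 2 can be checked numerically with a short Python sketch. The function name info_bits and the example probabilities are hypothetical, chosen only for illustration. The a posteriori probability defaults to 1, which corresponds to a noiseless channel.

```python
from math import isclose, log2


def info_bits(p_prior, p_posterior=1.0):
    """Eqn 1: I = log2(d_j / p_j), the information in bits carried by a
    symbol with a priori probability p_j and a posteriori probability d_j."""
    return log2(p_posterior / p_prior)


# A symbol that shows up one time in four, received without noise.
print(info_bits(0.25))

# Eqn 2: for independent symbols the probabilities multiply,
# so the information adds.
p_i, d_i = 0.25, 0.9
p_j, d_j = 0.10, 0.8
joint = info_bits(p_i * p_j, d_i * d_j)
print(isclose(joint, info_bits(p_i, d_i) + info_bits(p_j, d_j)))
```

    The rarer a symbol is a priori, the more bits its detection carries, and noise (a posteriori probability below 1) reduces that figure.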

    And, the issue of distinguishing signal from noise, or message from lucky noise’s mimic, is an issue of inference to design. One that is probabilistically based.

    Coming out the starting gate.

    2] Meaningful information:

    In a letter that should be better known, March 19, 1953, Crick wrote to his son as follows:

    “Now we believe that the DNA is a code. That is, the order of bases (the letters) makes one gene different from another gene (just as one page of print is different from another) . . . “

    In short, it has long been recognised that DNA stores functional information. (Which is what messages are about in the end: they do a job when received by a suitable receiver and “sink.”)

    3] Sequence complexity:

    Trevors and Abel in this 2005 paper made a distinction between random, ordered and functional sequence complexity, based on a 3-d metric space:

    OSC: high algorithmic compressibility, low algorithmic functionality, low complexity

    RSC: highest complexity, low algorithmic compressibility and functionality.

    FSC: Fairly high complexity, a bit higher compressibility than RSC, high algorithmic functionality.[cf Fig 4.]

    You will see that compressibility and complexity carry an inverse relationship, and that high order sequences are too rigidly defined to carry much information [they are a repeating block]. Random sequences resist compression, as they essentially have to be listed to describe them.

    Functional sequences of course will have some redundancy in them [for error detection and correction i.e. resistance to corruption], but have low periodicity otherwise. They of course are constructed to function and so have high functionality.
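    The compressibility contrast among ordered, random and functional sequences can be illustrated with a short Python sketch. It uses zlib as a rough, practical stand-in for algorithmic compressibility, which is an assumption and not what Trevors and Abel measured. The sample English text is invented here to play the part of a functional sequence.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (smaller = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Ordered: one short block repeated over and over.
ordered = b"AB" * 200

# Random: bytes drawn from a seeded pseudo-random generator.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(400))

# Functional stand-in: ordinary English prose.
text = (
    b"Functional sequences carry a message that does a job for its receiver. "
    b"They reuse common words and follow rules of spelling and grammar, so a "
    b"general purpose compressor can squeeze them somewhat, yet each sentence "
    b"still has to be spelled out because nothing in it simply repeats a short "
    b"block. That mix of mild redundancy and low periodicity is the point of "
    b"this toy comparison."
)

for name, data in (("ordered", ordered), ("text", text), ("random", noise)):
    print(name, round(compression_ratio(data), 3))
```

    On this proxy the repeating block shrinks to a small fraction of its size, the random bytes do not shrink at all, and the prose lands in between.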

    This brings us back to the flooded fitness landscape issue.

    4] Flooded fitness landscapes and local vs broadcasting oracles vs maps and helicopters:

    Atom, first thanks for sharing the EIL’s work with us. I want to play a thought exercise using the landscape analogy.

    When a fitness landscape has a threshold of functionality, it means that unless you are at least at the shores of an island of function, you cannot access differential success as a clue to climb towards optimal function.

    For a sufficiently complex space (the ones relevant to FSCI and CSI), we know that the search resources of the cosmos are vastly inadequate for reasonable chances of success at reaching islands of function from arbitrary initial points.

    Many GA's -- apart from dealing with comparatively speaking toy-scale complexity -- allow in effect broadcasting oracles so that non functional states that are close enough can get "pointed right" towards function. Worse, non-functional states are rewarded and allowed to continue varying until they become functional. (This might work for micro-evo changes but we need to deal with macro-evo changes, and with getting to first complex function.)

    Once one then reaches to function, voila, one can hill-climb to optimal function, on the assumption that we have nice ascents -- if there are overly high and steep cliffs, that is another problem.
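    The difference between a landscape that "points right" everywhere and one with a threshold of functionality can be sketched in Python. Everything here is a toy assumption: the names graded_fitness, island_fitness and hill_climb are hypothetical, genomes are 32-bit strings, and the climber accepts a single-bit change only if it strictly improves fitness.

```python
import random


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def graded_fitness(genome, target):
    """Broadcasting-oracle style: every state is told how close it is."""
    return len(target) - hamming(genome, target)


def island_fitness(genome, target, radius=2):
    """Threshold style: zero everywhere except within `radius` bits of the target."""
    d = hamming(genome, target)
    return radius + 1 - d if d <= radius else 0


def hill_climb(fitness, start, target, steps, rng):
    """Flip one random bit per step; keep the flip only if fitness rises."""
    genome = list(start)
    best = fitness(genome, target)
    for _ in range(steps):
        i = rng.randrange(len(genome))
        genome[i] ^= 1
        score = fitness(genome, target)
        if score > best:
            best = score
        else:
            genome[i] ^= 1  # revert a change that did not help
    return genome


rng = random.Random(0)
target = [rng.randint(0, 1) for _ in range(32)]
start = [1 - bit for bit in target]  # as far from the target as possible

on_slope = hill_climb(graded_fitness, start, target, 5000, rng)
on_sea = hill_climb(island_fitness, start, target, 5000, rng)
print(hamming(on_slope, target), hamming(on_sea, target))
```

    With the graded function the climber walks straight to the target. With the island function it never leaves its starting point, because no neighbouring state offers any differential success to climb on.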

    But now, what if you have a map, even a less than perfect one [or an imaginary map . . .] that will often get you into the near neighbourhood of archipelagos of function?

    That may well allow short range explorations that get you on an island and allow island hopping.

    So, we see how imaginative creative ideas and suitably restricted trial and error may work as tools of design. Indeed, if general location A does not work after reasonable tests and trials, one may proceed to suspect hot zone B and so on. (Cf here mineral exploration.)

    In short, design can use constrained random search in the context of a hot zone.

    5] Is CSI observable and/or measurable?

    I have discussed a simple approach: consider a [pseudo-]vector {L, S, C}

    –> L: degree of contingency, low being 0 and high being 1. [We observe such routinely when we identify natural regularities and seek explanatory laws that cover such patterns. Similarly, we see high contingency when we see that under reasonably similar circumstances outcomes scatter sufficiently to give us pause. Cf dice and how people behave for undirected and directed contingency.]

    –> S: specificity, especially functional specificity, high being 1 and low 0. (This we observe every time we contrast the words just past with say fhuwyuwg7824rqw3jfgyqw3tg.)

    –> C: complexity, here being indicated by equivalent storage capacity in bits, with 500 – 1,000 bits being the threshold for sufficient complexity. (E.g. cf DNA.)

    Next, multiply: L*S*C, and take the result.

    We then identify for this simple case CSI as the zone in which the product exceeds the range 500 – 1,000 bits, i.e. we have a crude metric for FSC in the T-A sense. [If we fail at either of the first two stages, we don’t get above 0.]
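    The steps above can be sketched in a few lines of Python. This is only an illustration of the multiply-and-threshold rule as stated in this comment; the function names, the 0/1 encoding of L and S, and the band labels are assumptions made for the sketch, not part of any published metric.

```python
def crude_fsci_bits(contingency: int, specificity: int, capacity_bits: float) -> float:
    """Return the product L*S*C, with L and S treated as 0/1 flags (an assumption)."""
    if contingency not in (0, 1) or specificity not in (0, 1):
        raise ValueError("L and S are treated as 0/1 flags in this sketch")
    return contingency * specificity * capacity_bits


def classify(product_bits: float, lower: float = 500, upper: float = 1000) -> str:
    """Place the product relative to the 500 - 1,000 bit threshold band."""
    if product_bits < lower:
        return "below threshold band"
    if product_bits <= upper:
        return "within threshold band"
    return "above threshold band"


# Illustrative inputs: a contingent, specific string with 60 bits of capacity,
# and a non-specific string with 2,000 bits of capacity (S = 0 zeroes the product).
print(classify(crude_fsci_bits(1, 1, 60)))
print(classify(crude_fsci_bits(1, 0, 2000)))
```

    Note that the second case lands below the band purely because one flag is 0, which is the "fail at either of the first two stages" rule.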

    Of course Dr WmAD has done a far more sophisticated form, but his form is conceptually similar enough that we can see what is going on.

    But, clearly we have an approach that can achieve significant intersubjective consensus, and is premised on items that are in fact objectively and routinely identifiable.

    It is also a discrete state metric that gives an interval scale with a pass-fail threshold. We use such scales routinely to determine who accesses educational levels, and promotions or even firings. So, to reject or raise a stonewall of objections to the approach in this case is plainly selectively hyperskeptical.

    Indeed, just by reading and taking seriously posts in this thread, we have intuitively used such an approach in deciding that posts are real messages, not mere lucky noise mimics.

    __________

    I trust these points help us set the discussion on a more balanced footing.

    GEM of TKI

  43. A few points:

    - Props to Atom for his mad javascripting skillz. Very well done.

    - I agree that Schneider tends to overstate the implications of his results. In particular, I don’t see how his results refute Behe’s ideas. Like Dawkins, Schneider gets quite excited about his own code. But I think all coders go through that stage, and some of us never get past it.

    - If you choose all zeroes and random input, Atom’s code takes around 370 queries (averaged from 10,000 trials), while Marks and Dembski’s code takes around 440. I can explain one reason for this discrepancy if anyone’s interested.

    - I’m baffled by the following statements:

    “The ability of ev to find its target in much less that twelve and a half quintillion years is not due the evolutionary program”

    “the ability of ev to find its target is not due to the evolutionary algorithm used, but is rather due to the active information residing in the digital organism.”

    “It is the active information introduced by the computer programmer and not the evolutionary program that reduced the difficulty of the problem to a manageable level.”

    “The active information from the Hamming oracle and ev’s number cruncher are responsible for the rapid convergence – not the evolutionary program.”

    I can’t reconcile these statements with the following fact: If you replace Schneider’s evolutionary algorithm with random sampling, the performance degradation renders the search unmanageable.

    Which brings up a question. The above fact indicates a significant amount of active information, but in which component(s) of the search does it reside? The evolutionary algorithm, the number cruncher, or the Hamming oracle?

    There are even more fundamental issues in play with regards to the EvoInfo Lab’s understanding of Schneider’s paper, and with the active information concept in general, but I don’t want to wear out my welcome.

  44. R0b,

    Thanks.

    As for the discrepancies with Marks/Dembski’s Matlab version, there were some design decisions that were made and differences in my implementation. You will notice some differences in results, but nothing materially relevant (in my opinion.) If you find a significant difference between my javascript implementation and Schneider’s version (other than memory issues – a web browser isn’t a C++ box, so I had to limit the GUI to keep browsers from locking up), then please let me know.

    You wrote:

    I can’t reconcile these statements with the following fact: If you replace Schneider’s evolutionary algorithm with random sampling, the performance degradation renders the search unmanageable.

    Replacing Hamming Search or any other search with random sampling will also make the search intractable. You can replace the Ev strategy with either a Hamming Search or a Ratchet search, and you will still eventually converge on the target. (Hamming will converge quickly always, while Ratchet usually works, but not always.) So the “Ev Strategy” (Schneider’s evolutionary search strategy – mutation and selection) isn’t the main source of the information. The selection step takes advantage of the information presented to the search by the Hamming Oracle.

    The Hamming Oracle has access to information about the target space and passes this along to the searches that use it. The second source of Active Information is the skewing of “randomness” towards phenotypes with few ones and many zeros. As you saw, this can allow you to converge on a target using random input, which shouldn’t happen if all phenotypes were equally likely. Also, changing the target to random sites will make the Ev search take longer, demonstrating the application of negative Active Information. (the assumption about the target space, many zeros and few ones, is false on average.)
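    A minimal sketch of why distance feedback matters, written in Python rather than the GUI’s JavaScript. It is not the EvoInfo Lab’s code: `HammingOracle` and `hamming_search` are hypothetical names, and the one-pass bit-flip strategy is an assumed stand-in for any search that exploits the oracle’s output. The 131-bit length is used only as an illustrative size.

```python
import random


class HammingOracle:
    """Holds a hidden target and reports the Hamming distance of each guess."""

    def __init__(self, target):
        self._target = list(target)
        self.queries = 0

    def distance(self, guess):
        self.queries += 1
        return sum(g != t for g, t in zip(guess, self._target))


def hamming_search(oracle, n_bits, rng):
    """Flip each bit once, keeping a flip only if the reported distance drops."""
    guess = [rng.randint(0, 1) for _ in range(n_bits)]
    best = oracle.distance(guess)
    for i in range(n_bits):
        if best == 0:
            break
        guess[i] ^= 1
        d = oracle.distance(guess)
        if d < best:
            best = d          # the bit was wrong; keep the flip
        else:
            guess[i] ^= 1     # the bit was right; undo the flip
    return guess


N_BITS = 131  # illustrative size only
rng = random.Random(0)
target = [rng.randint(0, 1) for _ in range(N_BITS)]
oracle = HammingOracle(target)
found = hamming_search(oracle, N_BITS, rng)
print(found == target, oracle.queries)

# A yes/no oracle gives no such gradient: uniform guessing over all
# 2**N_BITS strings needs on the order of 2**N_BITS queries.
print(2 ** N_BITS)
```

    With this strategy the search never needs more than one query per bit plus the initial one, which is the sense in which the oracle, rather than the particular search wrapped around it, carries the information about the target.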

    Atom

  45. Slightly off topic – but related. I would like to invite anyone to submit simple examples for themselves or others to calculate the CSI. It seems to me this should be informative. I have set up a blog post to do this and submitted a very straightforward example of my own – a hand of 13 spades. Oleg already has one slightly more complicated example, which inspired me to do this.

    No trick is intended. I just want to see when and how it can be done.

  46. Mark:

    For the record here, I note that I have commented at your blog thread as follows; with onward link to the thread in which your commenter Oleg raised the question of a 60-element bit string as a candidate for CSI status.

    _______________

    Mark:

    For the record, I observe here that in your previous thread, I used a far simpler but nonetheless effective metric to address Oleg’s case.

    His case was 60 bits long so even if functional and specific in that functionality, not CSI as below the relevant range that on the crude metric begins at 500 – 1,000 functional and specific bits.

    For the further record, I have contrasted the case of biofunctional DNA, pointing out that even with parasitic lifeforms that start at about 100,000 4-state elements, we are well within the CSI territory, with 200 k+ functionally specific bits.

    In short, CSI is not only measurable but can be measured in simple as well as sophisticated ways. The material issue is settled at the outset already, then.

    And, that is all I need to do here.

    G’day.

    GEM of TKI

    _____________

    Onlookers may wish to look at point 5 in 42 above for the simple but I believe useful enough metric I used, and the other points here give its context.

    I trust that this will help clarify the record there and here.

    GEM of TKI

  47. Mark (and those interested):

    I have posted a couple of answers on your blog, but just excuse me if I will not necessarily follow up a long discussion there. The main reason (but not the only one) is that my (I hope intelligent) resources to design posts are limited, and I do believe that the important discussions are here.

    But I have great respect for you and for your blog. And anyway, if I really believe that I have something useful to say there, I will try to do that.

  48. GP (and Mark):

    I took a look at the thread over at MF’s blog on calculating CSI, on seeing your post just above.

    GP, you have with great patience composed a more than adequate response, and then undertook onward interactions with the set of commenters there; and in far more than a mere two posts!

    I applaud your effort, and agree with its overall substance, and even moreso with your ever so gracious tone.

    Having already spent some time at CO, I will simply note here on a few points:

    1 –> For those who wish to look, I have updated my online note to highlight metrics of FSCI, first the intuitively obvious functionally specified bit, and also have given an excerpt on Dr Dembski’s more sophisticated 2005 model.

    2 –> Venus, in the old English Common Law, murder was defined as maliciously causing the death of an innocent within “a year and a day.” The line has to be drawn somewhere, and you will note that in the case of interest, (1) the EF is designed to be biased towards false negatives, (2) I in fact use a range from 500 – 1,000 bits to take in reasonable cases of islands of functionality in the wider config space, and (3) the biologically relevant cases are far, far beyond the upper end of the threshold band.

    3 –> I also note that I responded to a stated bit string of unspecified functionality of 60 bits length. It is not a natural occurrence observed “in the wild”, and alphanumerical characters are contingent. It could be the result of a program that forces each bit, or it could be the result of a [pseudo-]random process. I simply pointed out that as given there is not enough length to be relevant to the FSCI criterion.

    4 –> Subsequently, it was stated that this is the product of a definitive process in mathematics, by reference to an example of a pseudorandom string based on an algorithm, by Dr Dembski. In short, per the report taken as credible, it is the result of an algorithm, which is of course in all directly known cases, designed. And indeed, once we bring in the workings of the algorithm to generate the bit string, we immediately will observe that the complexity jumps up hugely; credibly well beyond the threshold, once we take in the statement of the algorithm, the compiling of it, and the physical execution required to implement the stated output.

    I trust that these notes will be helpful.

    GEM of TKI

  49. Re #47

    gpuccio

    I would be happy to discuss the calculation of CSI on any blog but I only have the ability to make new posts on my own. That’s why I am using it.

    Anyhow thanks for your contributions over there. It is a bit disappointing that no one is able to give a worked example of CSI as defined in “Specification: The Pattern That Signifies Intelligence”. I thought it would be possible at least for a bridge hand of 13 spades. However, I note that you are not entirely happy with the definition of CSI in that paper so I can’t blame you!

  50. Gpuccio
    First thanks for participating, without you it would be a very one-sided discussion.

    My prime interest was in checking my understanding of Dembski’s paper and also how many ID proponents understand it. As you have admitted you don’t fully understand it, I guess the answer so far is zero :-)

    I will leave comments on your own method of calculating CSI until after I have walked the dog…

  51. Sorry posted that last comment on the wrong blog!

  52. Mark

    Dembski’s metric of CSI is a mathematical model of great generality and technical depth, more for his technical peers than for anyone else. Summing it up, I have excerpted as follows, from pp 17 – 24 or so, and using X for chi and f for phi:

    define fS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T. [26] . . . . where M is the number of semiotic agents [S] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit [issue or state resolvable to yes/no, hi/lo etc] operations that the known, observable universe could have performed throughout its entire multi-billion year history.[31] . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity of T given [chance hypothesis] H [in bits] . . . as

    X = –log2[10^120 ·fS(T)·P(T|H)].

    A bit of commentary:

    1 –> As you know information is an inverse probability metric, so to get it in bits we do a negative log to base 2 metric;

    2 –> This, working with a modified conditional probability of being in the relevant target zone, on a relevant chance hyp [often a quasi-flat distribution across microstates as is common in stat thermodynamics and as reflects the Laplace indifference criterion.]

    3 –> We then in essence multiply the probability by an upper bound that measures the available search resources of the cosmos, and some other adjusting factors. [The idea being that if one has a low probability 1/n, n tries will more or less bring the odds of seeing the event at least once to an appreciable fraction of 1.]

    4 –> fS(T) is a multiple of the base probability that brings out “sufficiently similar” patterns in the target zone, i.e. we are interested in seeing any of a cluster of materially similar outcomes.

    5 –> As his technical bottomline:

    >> . . . if 10^120·fS(T)·P(T|H) < 1/2 or, equivalently, that if X = –log2[10^120·fS(T)·P(T|H)] > 1, then it is less likely than not on the scale of the whole universe, with all replicational and specificational resources factored in, that E should have occurred according to the chance hypothesis H. Consequently, we should think that E occurred by some process other than one characterized by H. >>

    6 –> In short if X >> 1, it is highly unlikely that a given entity or event has occurred by chance. And since highly contingent aspects of events — the focus of the discussion — are the opposite to the pattern of regularity we see in cases of lawlike necessity, this leaves intelligent agency as the best explanation.

    Now, as you know, I start from a different place, the status of OOL research circa mid 1980′s by which time CSI had been recognised conceptually and FSCI had been identified as the key characteristic of life, a characteristic that is well known from intelligently designed systems.

    Accordingly, I have long intuitively used and have now explicitly identified a simpler and more readily usable metric, functionally specific bits. And once the number of FS Bits approaches 500, design is a probable explanation. As it approaches 1,000, that would allow for islands of functionality that equal the number of quantum states in our universe across its lifespan, and still have them so isolated that they are at 1 in 10^150 of possible configs. So, once we pass the upper limit of the band, it is morally certain — note my use of a term of epistemic responsibility, not proof beyond all doubt — that an event is not a matter of chance or necessity but design. (Of course the biologically relevant cases, such as the DNA of the simplest life forms, start at 100′s to 1,000 times that upper limit.)

    A similar metric could be done for K-compressibility specified entities, which would get us to a simple metric for CSI.

    Now, you wish to work with a hand of 13 cards from 52 in a standard deck that “by chance” or “by design” would have all 13 spades in it:

    A deck of 52 cards will have 52C13 [~ 635 * 10^9, I believe, if my freebie software calculator is right] possible 13-card hands, and of these precisely one will be 13 spades, and having 13 hearts, clubs or diamonds would be materially similar. P(T|H) would I believe be 1/52C13, and fS(T) would be 4 [or a similar number].

    X = – log2[10^120 * 4 * 1/52C13]

    ~ (- 361), i.e. much lower than 1

    That is — as can be inferred from the high value of the odds of being in the target zone [~ 1 in 10^12] relative to the UPB — the Dembski X-metric rules “not sufficiently likely to be designed to be ruled a clear case of design.”
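    The arithmetic for the 13-spades hand can be checked with a short Python sketch. The function name `chi_bits` is hypothetical, and the values fS(T) = 4 and P(T|H) = 1/52C13 are simply the ones assumed in this comment; the logarithms are summed rather than multiplying the raw factors, to keep the numbers in floating-point range.

```python
import math


def chi_bits(p_target: float, f_s: float, resources: int = 10**120) -> float:
    """X = -log2(resources * f_s * p_target), evaluated as a sum of logs."""
    return -(math.log2(resources) + math.log2(f_s) + math.log2(p_target))


hands = math.comb(52, 13)          # number of distinct 13-card hands
x = chi_bits(p_target=1 / hands, f_s=4)

print(hands)                        # 635013559600
print(round(x))                     # -361
```

    Since X is far below 1, the metric does not rule the hand a clear case of design, matching the figure quoted above.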

    I trust that helps.

    G’day

    GEM of TKI

  53. Hi Atom. The difference I noticed between your version of ev, Marks and Dembski’s version, and Schneider’s version is the number of potential binding sites. Yours seems to have 125, while Marks and Dembski’s has 131 and Schneider’s has 256. This discrepancy doesn’t affect the point that the EvoInfo Lab is making with regards to ev, but it does affect the reported numbers.

    My objection to the claim that ev’s efficiency “is not due to the evolutionary program” is this: Marks said in a presentation, with regards to ev, “Some recently proposed evolutionary models are shown, surprisingly, to offer negative added information to the design process and therefore perform worse than random sampling.” But it turns out that the evolutionary algorithm actually performs many, many orders of magnitude better than random sampling, so why do we not now conclude that it offers positive active information?

    But the larger issue is this: Since search efficiency depends on the components being well matched, why do some components get credit and others not? Not even the Hamming oracle is good or bad in and of itself. If the search algorithm interprets the oracle’s output as Hamming closeness rather than distance, the search will move directly away from the target, and we’d be better off with a yes/no fitness function.
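    The point about interpreting the oracle’s output can be made concrete with a small Python sketch. It is an assumed toy search, not ev or the EvoInfo Lab’s code: `bit_flip_search` and its `treat_as_closeness` flag are hypothetical names. The oracle is identical in both runs; only the search’s reading of its number changes.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def bit_flip_search(target, start, treat_as_closeness=False):
    """One pass of bit flips, keeping a flip when the search thinks it helped.

    With treat_as_closeness=True the search wrongly treats a larger
    Hamming distance as better, so it climbs away from the target.
    """
    guess = list(start)
    score = hamming_distance(guess, target)
    for i in range(len(guess)):
        guess[i] ^= 1
        new = hamming_distance(guess, target)
        improved = new > score if treat_as_closeness else new < score
        if improved:
            score = new
        else:
            guess[i] ^= 1  # undo the flip
    return guess, score


target = [1, 0, 1, 1, 0, 0, 1, 0]
start = [0] * len(target)

print(bit_flip_search(target, start))                           # ends at distance 0
print(bit_flip_search(target, start, treat_as_closeness=True))  # ends at distance 8
```

    The misreading search ends on the bitwise complement of the target, the farthest possible point, which is the sense in which the match between oracle and algorithm, rather than either one alone, determines the outcome.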

    Another general problem is the fact that Schneider’s statements, and his entire analysis, are in strictly classical information terms, but the EvoInfo Lab has responded in terms of active information without explaining how the two frameworks are connected. Just because Schneider, the EvoInfo Lab, and Brillouin all use the word “information” doesn’t mean that they’re talking about the same thing.

    On a positive note, it is certainly true that the target sequence is well-matched with the perceptron. That’s a good find on the part of the EvoInfo Lab.

  54. Hey R0b,

    You wrote:

    The difference I noticed between your version of ev, Marks and Dembski’s version, and Schneider’s version is the number of potential binding sites. Yours seems to have 125, while Marks and Dembski’s has 131 and Schneider’s has 256

    In Schneider’s original paper, he explains that the first 125 nucleotides of the 256 encode the weight matrix and threshold value (as they do in Marks’ version and mine), while the rest represent the possible binding site locations. The last positions (256-266) are not evaluated in any version of the algorithm (Schneider’s, Marks’ or mine), and are just used for allowing the sliding window to operate correctly.

    Mine rounds the binding site locations to 125, since it makes for a nice grid, and would actually help the Ev simulation succeed, rather than hinder it (since there are fewer sites to match/mismatch.)

    Your point about the output from the Hamming Oracle possibly being misused to hinder a search actually bolsters Marks’/Dembski’s point: you would be applying negative Active Information.

    As for the cross section of Shannon Information and Active Information, that is covered in the paper co-authored by Marks and Dembski that introduces the metric. They link to the paper from the introduction, but it looks like they removed the link temporarily. Perhaps you can ask one of them for a copy if interested.

    Atom

  55. Addendum:

    Forgot to specify, Schneider’s version evaluates positions 126-261, while the last 5 positions are used for the sliding window.

    My position “1″ on the GUI corresponds to position 132 of the 256 nucleotides.

  56. Hi again, Atom. Schneider’s paper states: “Evaluations to determine mistakes are for the first 256 positions on the genome.” So in Schneider’s version of ev, the number of mistakes ranges from 0 to 256, while in the EvoInfo Lab’s version it ranges from 0 to 131. Any binding site in the first 125 positions would constitute a mistake for Schneider’s ev. That Schneider’s ev allows for more than 131 mistakes is obvious if you run the java GUI version for a few iterations with selection turned off. I hope that clears it up.

    Yes, a mismatched oracle and search algorithm would constitute negative active information. My point is that it doesn’t make sense to attribute this active information to either the oracle or the search algorithm individually, since it’s the mismatch that’s the problem. Likewise for positive active information.

    I realize that the match between the perceptron logic and the small number of 1′s in the oracle’s target helps the search immensely given either Schneider’s evolutionary algorithm or random sampling. But I could come up with a search algorithm for which the perceptron/oracle match does more harm than good. That’s the larger issue I referred to above. And we’re still left with the smaller issue, which is that Dembski & Marks’s original logic for attributing negative active information to the evolutionary algorithm attributes a boatload of positive active information to it once the MATLAB script bugs are fixed.

    As far as the accompanying paper referenced several times in the info, I’ve tried many times in the past to get it and it’s never been available from the website. I would certainly welcome an explanation of the connection between Shannon info and active info, as the concepts appear to be conflated in the EvoInfo Lab’s objections to Schneider’s statements. It seems to me that if you cast ev as a classical information system, complete with a message, a noisy channel, and a receiver, all of Schneider’s statements about information WRT ev make sense. He’s certainly not talking about active information or CSI.

  57. R0b,

    I have not run Schneider’s version, but have the paper with me. Here is a screenshot from the relevant section, with the line you pointed out highlighted: Screenshot

    The graphic and explanation clearly shows the first 125 positions being used for the weight matrix and threshold. I think the reference to “256″ nucleotides is, in context, compared to the 266 nucleotides total. (The next line seems to confirm this.) Also, 256 would exclude any nucleotides after the last ev binding site, which occurs at position 256. This also lends weight to my interpretation.

    Also, it would seem odd to run the weight matrix nucleotides through the algorithm to see if any binding sites exist in them (which they never will) since the method outlined in the text of the paper seems to indicate that the second portion represents the binding area. (From the Ev paper: “The organism has two parts, a weight matrix gene and a binding site region.” emphasis mine)

    However, I have not run Schneider’s version, so if I am in error on this point, I apologize.

    Again, if you are correct, I actually help the Ev case by providing a smaller target. So while the criticism may be valid (though I think it isn’t, for the reasons outlined above), it would actually help Schneider’s case, and so is irrelevant for polemical purposes.

  58. R0b,

    Re-reading the relevant portions of the paper, I think you may be correct that all 256 bases are run through the evaluation engine. There is a line on the second page that says:

    The weight matrix gene for an organism is translated and then every position of that organism’s genome is evaluated by the matrix.

    (emphasis mine)

    So I apologize for that and stand corrected.

    My version of the GUI, therefore, is still a mini-version of Schneider’s version: fewer possible binding sites, only 125 in total vs. 256 (or 131), but this doesn’t harm the algorithms involved in any way (and actually helps them.)

    It would be possible to modify my javascript objects (if you’re interested), since I set a constant to control the beginning of the evaluated region. It shouldn’t make a difference in the general results, but the option is there if you care to use it.

    Thanks for the find.

    Atom

  59. Atom, no problem. You’re right that this makes no difference to EvoInfo Lab’s points of criticism, only to the numbers.

  60. R0b,

    Prof. Marks will include a note on the introduction explicitly stating that the GUI is not an exact one-to-one model of Schneider’s simulation. That should make things transparent enough.

    Atom

  61. Atom, thanks. This is a nit-picky issue, but at least it’s concrete enough for all of us to come to agreement on it, which is pretty rare in this debate.

    Again, there are bigger issues. Here’s another one: Not only is the active info metric a function of the search as a whole rather than of the individual components, it’s actually a function of a search model, not the modeled process itself. If we want to use the active info metric to draw conclusions about the underlying process, we have to establish some non-arbitrary way of modeling processes as searches.

    Take ev for example. Schneider did not cast ev as a search — it was Marks and Dembski who did that. They could have lumped the perceptron into the fitness function, in which case the search space would be of size 4^261 with a good-sized chunk occupied by what they would call the target. This search model would have significantly less active info than their model has.

    Even worse, they could have defined the target any way they wanted to, as Schneider said nothing about a target. Had they defined it to be the opposite of the message that gets conveyed to the genome, the search would be a guaranteed failure with tons of negative active information.

    BTW, Schneider needs to be careful with his approach too, as it has the same problem of arbitrary modeling. For instance, what part of a signal constitutes a message, and what part is noise? It depends on how you model it. Given Schneider’s approach, I see nothing wrong with his statements about information gain, but his conclusions regarding the underlying biology may not be justified.
