20 January 2009
Two forthcoming peer-reviewed pro-ID articles in the math/eng literature
William Dembski
The publications page at EvoInfo.org has just been updated. Two forthcoming peer-reviewed articles that Robert Marks and I did are now up online (both should be published later this year).*
——————————————————-
“Conservation of Information in Search: Measuring the Cost of Success”
William A. Dembski and Robert J. Marks II
Abstract: Conservation of information theorems indicate that any search algorithm performs on average as well as random search without replacement unless it takes advantage of problem-specific information about the search target or the search-space structure. Combinatorics shows that even a moderately sized search requires problem-specific information to be successful. Three measures to characterize the information required for successful search are (1) endogenous information, which measures the difficulty of finding a target using random search; (2) exogenous information, which measures the difficulty that remains in finding a target once a search takes advantage of problem-specific information; and (3) active information, which, as the difference between endogenous and exogenous information, measures the contribution of problem-specific information for successfully finding a target. This paper develops a methodology based on these information measures to gauge the effectiveness with which problem-specific information facilitates successful search. It then applies this methodology to various search tools widely used in evolutionary search.
[ pdf draft ]
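The three measures named in the abstract can be made concrete with a small sketch. It assumes that each quantity is the negative base-2 logarithm of a success probability: p for blind search and q for the assisted search. The function name and the example probabilities are hypothetical, chosen only for illustration.

```python
import math

def information_measures(p_blind, q_assisted):
    """Return (endogenous, exogenous, active) information in bits.

    Assumption: each measure is -log2 of a success probability.
    p_blind    -- probability that random (blind) search finds the target
    q_assisted -- probability that the assisted search finds the target
    """
    if not (0 < p_blind <= 1 and 0 < q_assisted <= 1):
        raise ValueError("probabilities must lie in (0, 1]")
    endogenous = -math.log2(p_blind)     # difficulty for blind search
    exogenous = -math.log2(q_assisted)   # difficulty that remains with assistance
    active = endogenous - exogenous      # contribution of problem-specific information
    return endogenous, exogenous, active

# Hypothetical numbers: blind search succeeds 1 time in 1024,
# the assisted search 1 time in 4.
endo, exo, act = information_measures(1 / 1024, 1 / 4)
print(endo, exo, act)
```

With these made-up numbers the assisted search is credited with 8 bits of active information (10 bits endogenous minus 2 bits exogenous). A search that does worse than blind search (q smaller than p) yields a negative value.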
——————————————————-
“The Search for a Search: Measuring the Information Cost of Higher Level Search”
William A. Dembski and Robert J. Marks II
Abstract: Many searches are needle-in-the-haystack problems, looking for small targets in large spaces. In such cases, blind search can stand no hope of success. Success, instead, requires an assisted search. But whence the assistance required for a search to be successful? To pose the question this way suggests that successful searches do not emerge spontaneously but need themselves to be discovered via a search. The question then naturally arises whether such a higher-level “search for a search” is any easier than the original search. We prove two results: (1) The Horizontal No Free Lunch Theorem, which shows that average relative performance of searches never exceeds unassisted or blind searches. (2) The Vertical No Free Lunch Theorem, which shows that the difficulty of searching for a successful search increases exponentially compared to the difficulty of the original search.
[ pdf draft ]
—————
*For obvious reasons I’m not sharing the names of the publications until the articles are actually in print.

1
sallyann
01/20/2009
2:34 pm
Congratulations Dr Dembski! How soon before the whole Darwinian edifice crumbles?
Here is a poem to celebrate
There was a young man named Bill
Who found science to be quite a thrill
But he hated old Charlie
Found his “Origins” gnarly
And so swiftly moved in for the kill
2
Collin
01/20/2009
3:19 pm
Well, congratulations. That is good news. BTW, are there 2 Marks (I and II)?
It seems like math/engineering journals, perhaps being more grounded in the real world, are willing to follow where the evidence leads.
3
skynetx
01/20/2009
3:50 pm
Congratulations to both Dr. Dembski and R. J. Marks. This is brilliant stuff.
4
bornagain77
01/20/2009
4:00 pm
Great, I’ve been waiting for this paper for a long time.
5
R0b
01/20/2009
4:11 pm
A hearty congratulations to Drs. Dembski and Marks. These papers obviously represent quite a bit of work.
The claim that these papers are pro-ID will, of course, be immediately challenged. Personally, I don’t see how that claim can be legitimately defended. It seems that information shared between the landscape/target and the search algorithm can be accounted for by deterministic processes, or simply by the way that the search is modeled.
For example, Marks and Dembski model their understanding of Dawkins’s weasel algorithm as a “partitioned” search, but it could be equivalently modeled as 28 independent searches, one for each position in the phrase. In that case, the 28 searches are blind, so there is no active information involved.
Regardless, I wish the EvoInfo Lab all the best in its research.
6
critter
01/20/2009
4:22 pm
Can you provide a publication date?
7
CJYman
01/20/2009
4:30 pm
Hello Rob,
The claim is legitimately defended since one of the papers proves that it is just as difficult to find the problem as it is to find a match between landscape and search procedure to find the problem any faster. IOW, if it is extremely improbable that the chemical constituents for a human brain will randomly coalesce to form that functioning human brain, then it is at least as extremely improbable (if not more so) for the evolutionary process (encoded in the environment and laws of nature) to be discovered. Basically, the laws and environment which allow the ratcheting filter to discover the brain are just as hard to find by search as the human brain would be. Information is continually moved back a search. Hence, the fine tuning argument.
Furthermore, if the problem were modeled as 28 independent searches, how would you guarantee successfully finding and locking on any one letter and then positioning the results of the independent searches into the correct phrase? Will 28 blind searches truly be able to perform such a feat?
8
R0b
01/20/2009
7:24 pm
CJYman, thank you for your comment.
It’s interesting that simple natural laws can be more information-rich than human brains, but such is the concept of active information.
If you try to model the evolution of the human brain as a search, you’re immediately faced with some choices that seem arbitrary. For instance, what is the target? Have we reached it? Maybe we’re stuck in a local optimum. If natural laws prevent us from ever reaching the ultimate target, then they have negative active information.
Now try modeling the search that found our current natural laws. What is the search space? What’s the target? Is active info a property of actual processes, or of the choices we make when we model those processes as searches?
What is the connection between active info and intelligence or design? Consider the following information about an objective function: “It’s smooth.” (Assume that smoothness is well-defined, e.g., neighboring nodes differ by no more than 1.) The amount of active information in that simple fact depends on the size of the search space. If active info implies intelligence, then we have arbitrarily high intelligence associated with a simple fact.
Another example: Suppose my target is a lake. A stick dropped in a river tends to find its way to the target without searching every corner of the universe, so we’re talking massive amounts of active info. How does the stick know which way to go? Obviously, the same gravity that determines the location of the lake (at a local minimum) also guides the stick. The target location and the search algorithm are not independent. Is this dependency intelligent?
As far as the weasel searches, if Search1 searches for the first letter, Search2 searches for the second letter, etc. then all letters will be found. Marks and Dembski’s weasel algorithm is nothing more than all of these searches happening in parallel. One model has lots of active info, and the other has none. So again it seems that the active info metric depends heavily on how we choose to model a process.
I always enjoy talking with you, CJYman.
9
jerry
01/20/2009
8:26 pm
Is anyone interested in explaining how these search algorithms are related to ID and evolution? What are they searching? And how does this relate to biological processes? If anyone can put this in simple language, it would be appreciated.
I realize that it has something to do with transforming one DNA string into another, and with the new DNA string generating something functional and useful. But while the mathematical rigor may be necessary for these journals, what is it in simple terms?
Anyone?
10
notedscholar
01/20/2009
9:33 pm
Well this would be interesting…. except the abstracts show that the papers will constitute a slew of technobabble and idiosyncratic terminology, and layered levels of abstraction, no doubt never even mentioning theism once.
Not surprising.
NS
http://sciencedefeated.wordpress.com/
11
JT
01/20/2009
10:14 pm
On one level I agree with Dr. Dembski, and it is hard to imagine how anyone possibly could not. He is stating in a more complicated way the following:
The probability of a bitstring is proportional to the number of bits in the smallest program that will generate it.
That’s why a “search for a search for search” cannot be more probable than the original search.
However, what is kind of amazing to me, is that there is still a very strong commitment to Dualism in Dr. Dembski’s work which I think is complicating his whole endeavor.
He demands we take away a certain part of a computer algorithm and say, “That’s not really part of the search algorithm – that was added into it.” And the inevitable source for this essential ingredient divorced from the algorithm proper is of course the miraculous, inscrutable and decidedly uncomputational (in Dr. Dembski’s mind) marvelous Human Mind. (And please remember to capitalize, out of respect.)
I can’t imagine Leon Brillouin thinking his observation was any more than obvious.
A computer is an extremely simple device, everyone should understand that (think Turing Machine). But a computer algorithm can be as arbitrarily complicated as needed. And you can’t come in and strip part of it out and say, “That’s not part of the algorithm! That’s the result of a Human Mind!” It is part of the algorithm. And even if it came from a human mind, you haven’t shown a human mind isn’t an algorithm too.
Don’t really want to start a huge heated debate though.
12
William Wallace
01/21/2009
2:31 am
Well, I wonder who the NCSE will “Richard Sternberg” (verb) after they are published.
13
Laminar
01/21/2009
6:12 am
From the CoS paper:
“In evolutionary search, a large number offspring is often generated, and the more fit offspring are selected for the next generation. When some offspring are correctly announced as more fit than others, external knowledge is being applied to the search giving rise to active information.”
Deciding that an individual has achieved the search target is to decide that it is more fit than one that hasn’t. All search algorithms require some knowledge of the solution, namely what qualifies as a solution. The fitness function in this case is just a binary, ‘all or nothing’ choice. The problem here is that reducing the evaluation criterion of a GA or other type of hill-climbing algorithm to an all-or-nothing choice converts these algorithms into random searches: if a ‘hill climber’ can’t measure the slope, then it is just doing a random walk.
Providing an evaluation criterion that is non-binary does not necessarily imply or require knowledge about the search in hand; it just requires that the search space has certain properties in order for the search to be effective. In other words, if your fitness landscape is flat with a single pinnacle of fitness, then a graduated fitness function is of no help.
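The point about binary versus graded evaluation can be illustrated with a minimal sketch. It assumes a toy problem that is not taken from the paper: the target is the all-ones bitstring, and the climber flips one random bit per step and keeps the result if it is no worse. All function names and parameters here are hypothetical.

```python
import random

def hill_climb(n_bits, fitness, max_steps, rng):
    """Single-bit-flip hill climber toward the all-ones string.

    Returns the number of mutation steps used, or None if the budget runs out.
    """
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    for steps in range(max_steps + 1):
        if all(current):                        # halting condition: target reached
            return steps
        candidate = current[:]
        candidate[rng.randrange(n_bits)] ^= 1   # flip one randomly chosen bit
        if fitness(candidate) >= fitness(current):
            current = candidate                 # keep the candidate if it is no worse
    return None

def graded_fitness(bits):
    return sum(bits)                  # counts correct bits, so there is a slope

def binary_fitness(bits):
    return 1 if all(bits) else 0      # all or nothing: flat except at the target

def mean_steps(fitness, trials=20, n_bits=12, max_steps=20000, seed=0):
    """Average steps over several runs; failed runs count as the full budget."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        used = hill_climb(n_bits, fitness, max_steps, rng)
        total += max_steps if used is None else used
    return total / trials

print("graded:", mean_steps(graded_fitness))
print("binary:", mean_steps(binary_fitness))
```

Under the binary function every non-target candidate ties with the current string, so every flip is accepted and the climber wanders at random until it happens to land on the target. The graded function rejects flips that lose a correct bit, which is what lets it climb.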
“A “monkey at a typewriter” is often used to illustrate the viability of random evolutionary search.”
This is an example of a random search with a binary evaluation criterion (it either is the works of Shakespeare or it isn’t). There is no descent with modification and no fitness metric, just a halting condition. It is NOT an evolutionary search.
14
Sal Gal
01/21/2009
9:49 am
The notion of “performance” in the context of NFL is nonstandard. Performance is defined as the quality of a sample of n points in the search space and their associated fitness values. The running time of a search algorithm is entirely ignored. This is reasonable only if the time to evaluate fitness does not vary much from one point in the search space to the next — e.g., as in typical combinatorial optimization problems.
It is obvious that the time required for “fitness evaluation” in biological evolution varies enormously from one type of organism to the next. Evolution is not just a matter of information, but also of time, and extant NFL analyses do not tell us about “performance” in any conventional sense of the term.
When time is taken into account, generally superior “searches” do emerge quite naturally [paper in review].
15
CJYman
01/21/2009
9:49 am
Hello Rob,
I definitely enjoy discussing these things with you as well. Your style of conversation is a breath of fresh air (no obfuscation, honest responses, and no personal attacks).
I apologize if I can’t follow this topic for that long, as I am back to work now, but here goes for now. And I apologize for my extreme inability to be brief and to the point. Sometimes, I just think that a little in depth rant is necessary to get the idea across. So I will be posting responses to your #8 in sections.
Rob, you state:
“It’s interesting that simple natural laws can be more information-rich than human brains, but such is the concept of active information.”
Yes, the laws themselves are simple mathematical descriptions of regularities; however, the relation of laws to each other is not simple, and it is in this complex organization of law and initial conditions where the active information “resides.” In fact, if you understand the fine tuning argument from physics you will understand that our set of laws and initial conditions (cosmological constant) are part of an extremely small set that would allow life, much less even provide enough time for evolution to occur. Life’s existence is teetering on a knife’s edge of a combination of laws among a vast majority of possible mathematical laws which wouldn’t support any notion of life as an evolving information processing machine.
That is what information is about. It is not about the laws as merely descriptions of regularity themselves but about the *organization* of those laws, ie: which laws are being utilized, which initial conditions are being utilized, and what values are the laws set at?
That is what ID critics fail to realize. There is objectively more to our universe than law and chance. There are organizations of law and chance. This is known as information and it can be assigned a probability — its information content in bits.
In fact, the organization of natural laws may be so complex that they operate in such a way as to provide a framework for consciousness. The organization of laws themselves may even be so complex as to cause the universe itself to be an intelligent system. If a chess program, “merely” a sufficiently organized collection of logic gates, can be intelligent (model the future and move toward a target of winning the game) then there is no theoretical reason why our universe can’t be intelligent in at least that same way.
To say that a chess program is only law and chance and thus has no need for previous intelligence is blatant misinformation. It’s not necessarily the law and chance involved that requires previous intelligence (law and chance is just what it is), it is the highly specific and improbable organization of law and chance — the information — that required previous intelligence (foresight).
16
CJYman
01/21/2009
9:56 am
Rob (#8):
“If you try to model the evolution of the human brain as a search, you’re immediately faced with some choices that seem arbitrary. For instance, what is the target?”
As far as I can tell, there is no need for arbitrary measurement. Out of all possible combinations of the chemical constituents to make a brain, the brain is an extremely improbable combination of chemicals. In fact, with enough calculation power, an objective number of probability could be given to the brain. I’m sure for the purposes of debate, no one would object to that number being unfathomably small, seeing that every scientist acknowledges that it is indeed the most complex system by a long shot within our universe (that we are aware of), even compared to a space shuttle, which is a pretty darn complex system.
If we wouldn’t expect as a matter of probability, in the history of our planet, for the constituent chemicals which make up a brain to randomly coalesce into a brain, then we both understand that we need some type of fine tuned laws to allow the brain in the first place and then a ratcheting mechanism based on those same laws to bring us closer to the brain. Thus we have a filtering process which is needed to discover the brain. Now, we both seem to realize that the brain itself won’t come together via chance and law absent that filtering process. So the question becomes, can that filtering process and the organization of laws which account for it be discovered via merely an arbitrary collection of laws and initial conditions (chance and law)? This becomes a search for that filter (search process which raises the probability for discovering/generating the brain). So we still haven’t accounted for the ability to find the brain at better than chance performance by saying “evolution did it.” We still need to search for that evolutionary algorithm: the incredibly fine tuned laws and initial conditions of our universe, those highly improbable physical constants and variables which drive some physicists to marvel that it seems a “superintellect has monkeyed with physics.”
Rob:
“Have we reached it? Maybe we’re stuck in a local optimum. If natural laws prevent us from ever reaching the ultimate target, then they have negative active information.”
It doesn’t matter if we are stuck in a local maximum, it only matters if we have indeed reached one of the targets. Even if we don’t reach all of the targets (which we will never know since we are within the program) then that makes no difference for us having reached at least one of the targets. Not knowing all the targets makes no difference in calculating the positive active information for targets which have been reached.
I think that a good example for the point that is being made is this: take some background noise and an arbitrary collection of laws (definitely only law and chance absent any previous foresight to organize the laws and initial conditions in any specific way for any specific target) and run a program which causes the noise and laws to interact with each other and see if any active information is generated. Are any specified or pre-specified patterns discovered at better than chance performance?
17
CJYman
01/21/2009
10:04 am
Rob (#8):
“Now try modeling the search that found our current natural laws. What is the search space? What’s the target? Is active info a property of actual processes, or of the choices we make when we model those processes as searches?”
Active info is a property of the organization necessary for the actual processes to produce better than chance performance. Taking this back a step further, if chance and law absent previous foresight will not cause an organization of law and chance to produce active information, then active information is also a property of foresight. So far, no one has shown that active information can be generated absent previous foresight (applying knowledge of the problem to be solved or search space into the behavior/organization of the search algorithm as it relates to the search space). Of course, that obviously adds credibility to the ID Hypothesis.
Rob:
“Another example: Suppose my target is a lake. A stick dropped in a river tends to find its way to the target without searching every corner of the universe, so we’re talking massive amounts of active info. How does the stick know which way to go? Obviously, the same gravity that determines the location of the lake (at a local minimum) also guides the stick. The target location and the search algorithm are not independent. Is this dependency intelligent?”
I don’t think your example models the concept of active information correctly, since no one can create an evolutionary algorithm based on whichever principles you are saying are at work. I think the flaw in the example would be in actually calculating for active information: how many sticks fall into rivers naturally, and then what percentage of those sticks would actually make it to any lake. Then we would have to randomly drop sticks in random rivers and see what percentage make it to your target lake compared to any lake. This would seem to be akin to assuming that any collection of background noise and laws will eventually produce life and an evolutionary algorithm and then generate intelligence at better than chance performance. Well, sure you can make that as your hypothesis, but good luck with trying to test that or even coming up with a mathematical model to show that is theoretically possible.
Rob:
“As far as the weasel searches, if Search1 searches for the first letter, Search2 searches for the second letter, etc. then all letters will be found. Marks and Dembski’s weasel algorithm is nothing more than all of these searches happening in parallel. One model has lots of active info, and the other has none. So again it seems that the active info metric depends heavily on how we choose to model a process.”
The problem is not merely finding the letters. The problem is first locking the letters in place when they are found and then placing those locked letters into the correct positions to reach the final phrase, i.e., search 1 goes in position a, search 2 goes in position b, etc. Which metric are you using to determine when each search locks on a letter? Is this metric possible without any knowledge of the target?
IOW, try simulating the search you propose without imposing problem-specific information about when each search is to stop and which search goes in which position. Without knowledge about the final target, your search procedure seems even harder for chance and law to accomplish since now there are two steps instead of only one.
18
CJYman
01/21/2009
11:04 am
As an aside, active information is caused by the organization of the algorithm (matching of search procedure to search space) and this is why the algorithm can not generate active information any more than it can create itself. Thus, if it can not create itself absent previous foresight, the active information is generated by a higher level search ad infinitum or by a system which can model the future and generate targets (foresighted system).
19
pubdef
01/21/2009
1:11 pm
Pardon me for jumping in, and for not having read any of this post or comments, but I thought this might be an opportune moment to ask a question I’ve had for a while, on the general subject of “material” and “information.”
Suppose the course of a creek is altered by a rock that falls into it. Better, suppose that the new course leads the creek into country that is more amenable to creek-dom, and the creek becomes a river.
Is the rock “information?” My real point: how is this different from DNA (or RNA — I don’t have the specifics of that science)? A physical substance affecting another physical substance, with extensive consequences in the long run?
20
R0b
01/21/2009
2:06 pm
CJYman, thanks for the comments. I think we’re going to have to talk far more concretely if we want to avoid talking past each other.
With regards to weasel, you say:
I need to describe my model better. SearchAlgorithm1 sends a guess to Oracle1, and Oracle1 responds with “yes” if the guess is “M”, or “no” otherwise. Once SearchAlgorithm1 gets a “yes” back, it stops searching. This is a blind search as Dembski describes it, and it involves no more problem-specific information than any other blind search. Likewise with SearchAlgorithm2, SearchAlgorithm3, etc. Oracle2 says “yes” for the letter “E”, Oracle3 for the letter “T”, etc. When all searches are done, SearchAlgorithm1 has the letter “M”, SearchAlgorithm2 has the letter “E”, etc.
You might ask, how does each SearchAlgorithm know which Oracle to query? Isn’t that problem-specific information? I would respond, how does any search algorithm know what oracle to query? How does it know to query at all? How does it know what search space to sample from? How does it know to stop when the oracle tells it that it found the target? How does it know to not stop before that?
All of this information is built into the search model, and Marks and Dembski do not count it as active information. The only info that counts as active info is that which allows the search to perform better than random search. My weasel model performs random searches, so it has no active info, according to Marks and Dembski’s definition.
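The model described above can be sketched in a few lines. This is only an illustration under stated assumptions: a 27-symbol alphabet (26 capital letters plus space), uniform guessing with replacement, and hypothetical helper names such as make_oracle and blind_search.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase + " "    # assumed alphabet: 26 letters plus space
TARGET = "METHINKS IT IS LIKE A WEASEL"     # 28 positions

def make_oracle(letter):
    """Oracle for one position: it answers only yes (True) or no (False)."""
    return lambda guess: guess == letter

def blind_search(oracle, rng):
    """Guess uniformly at random (with replacement) until the oracle says yes."""
    queries = 0
    while True:
        guess = rng.choice(ALPHABET)
        queries += 1
        if oracle(guess):
            return guess, queries

def independent_searches(target, rng):
    """Run one blind search per position; return the phrase and the query counts."""
    results = [blind_search(make_oracle(ch), rng) for ch in target]
    phrase = "".join(letter for letter, _ in results)
    return phrase, [count for _, count in results]

rng = random.Random(0)
phrase, counts = independent_searches(TARGET, rng)
print(phrase)                                   # the assembled phrase
print("rounds if run in parallel:", max(counts))

# Difficulty of a single blind query under each modeling choice, in bits:
bits_per_letter = math.log2(len(ALPHABET))          # one position at a time
bits_whole_phrase = len(TARGET) * bits_per_letter   # whole phrase at once
print(bits_per_letter, bits_whole_phrase)
```

Each individual search is blind in the sense described above, yet the assembled phrase always matches the target. Whether the roughly 133 bits for the whole phrase count as active information depends on whether the process is modeled as one search over phrases or as 28 searches over letters.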
I think the same principle applies to your response to the stick-in-the-river example. According to my model of that physical process, the stick finds the target far faster than it would through random sampling. You may think that a different model is more appropriate, and that’s exactly my point. Marks and Dembski’s framework provides no criteria for determining whether a model is appropriate or not, so the choice is arbitrary. This means that the active information metric is, to some degree, arbitrary when applied to real processes rather than pre-specified search models. Note that the two papers referenced in the OP apply the metric only to pre-specified models.
Note also that neither paper makes any attempt to connect the active info concept to intelligence, design, teleology, etc. So there’s a gap there that Dembski needs to fill in order to support his claim that the papers are pro-ID.
21
R0b
01/21/2009
2:28 pm
For anyone who has read the second paper: To my math-challenged eyes, it appears that the only type of information that their vertical NFLT addresses is what they call “importance sampling” in the first paper. I must be missing something, so I need someone to hold my hand here.
I would expect the phrase “search for a search” to refer to a search for a better-than-random search algorithm. But the meta-search space (M(omega) on page 4) is a space of probability distributions, not a space of algorithms. If we’re restricting ourselves to stateless search algorithms and no fitness/cost function, then the only information we can use is a probability distribution to bias our sampling. But this leaves out common search strategies like genetic algorithms.
For example, if the active information I’m given is:
- Fitness function f is smooth
- Target T is at the maximum of f
then that information helps immensely in choosing a search algorithm. But neither of those items of information is a probability distribution, so neither of them fits into Marks and Dembski’s meta-search space.
Again, I’m math-challenged and I know I’m missing something.
22
Prof_P.Olofsson
01/21/2009
5:02 pm
Getting papers published is always nice so congratulations to the authors! However, I don’t see how these 2 papers qualify as “pro-ID” until it is demonstrated how they are relevant to evolutionary biology. Remember that such claims were made for the original paper by Wolpert et al. until it was pointed out that it has assumptions that do not apply to evolutionary biology.
23
Toronto
01/21/2009
6:00 pm
Evolution is a blind, unguided process. ID is a goal-driven process. Since ID has its goal and evolution doesn’t care for one, neither is searching for anything.
How do you measure the efficiency of a function they’re not actually performing?
In other words, what do these two papers contribute to the resolution of the ID/evolution argument?
24
Sal Gal
01/21/2009
8:26 pm
Wolpert and Macready stated outright that their work applied to combinatorial optimization. No one in his right mind would model biological evolution as combinatorial optimization. Setting up as a target, say, the set of all length-100 sequences of nucleotides coding for a particular protein is highly unrealistic. Sure, that gives you a combinatorial optimization problem, but to say that any natural process sought for an encoding of a particular length is absurd.
I don’t think it’s a particularly well kept secret that the genetic sequences that are prime for evolution into genes coding for new proteins are duplicate genes and pseudogenes.
There is huge inconsistency among IDists on the matter of functionality of DNA sequences. They often push the point that most or all of the genome is designed, and that we simply have a great deal to learn about what non-coding sequences do. But when they want to portray evolution as utterly improbable, they say that a sequence of bases is categorically fit if it codes for a prespecified protein, and is categorically unfit otherwise. Now which is it? Can a non-coding sequence contribute to fitness, or not?
Geneticists believe that some pseudogenes serve functions, even though they don’t code for proteins. So why should the fitness of a genetic sequence be dichotomous? Why should we not consider that a non-coding sequence may pass through a succession of functions before coding for a protein? Just as the IDists say, who knows what functions we have yet to discover?
25
Domoman
01/21/2009
10:30 pm
Glad to hear about your articles Mr. Dembski! You seem to be successfully pulling into the area that neo-Darwinists seem to deem the most important, that is, peer reviewed articles. Once more ID themed papers get into the peer reviewed sections of science, who’s to say neo-Darwinism will hold up at all?
Especially considering, if I understand the idea behind the conservation of information theory correctly, it will literally be the death-knell of neo-Darwinism!
26
William Dembski
01/21/2009
11:17 pm
As for the relevance of this work to biology, let me remind commenters that Thomas Schneider used his ev program to argue against Behe and for the power of natural selection in biological evolution and that Rob Pennock cited his work on AVIDA likewise to argue against Behe and for evolution (Pennock cited this not in his NATURE article but in his Dover expert witness report).
So if you’ve got a problem with the applicability of the research at the Evolutionary Informatics Lab to real-life biological evolution, take it up with Schneider and Pennock.
27
tribune7
01/21/2009
11:51 pm
how they are relevant to evolutionary biology.
Everything is related to evolutionary biology, Professor. If it wasn’t for Darwin, we’d still be walking around in animal skins and rubbing two sticks together to make a fire.
Don’t you read, Newsweak? Oh, I misspelled it. Horrors.
28
Prof_P.Olofsson
01/22/2009
9:53 am
tribune[27],
I wrote “relevant” which is not the same as “related.” Don’t you read Merriam-Webster?
29
gpuccio
01/22/2009
9:53 am
Sal Gal (#24):
“I don’t think it’s a particularly well kept secret that the genetic sequences that are prime for evolution into genes coding for new proteins are duplicate genes and pseudogenes.”
It is not a particularly well kept secret that darwinists do think that way. They have to use the hypothesis of duplication as a first step because that helps to keep the previous function at the level of the original gene. That seems useful, if you are supposing that unguided evolution goes from protein A with function A1 to protein B with function B1. But, at the same time, operating on a duplicate gene implies that negative selection can no longer act.
“There is huge inconsistency among IDists on the matter of functionality of DNA sequences. They often push the point that most or all of the genome is designed, and that we simply have a great deal to learn about what non-coding sequences do. But when they want to portray evolution as utterly improbable, they say that a sequence of bases is categorically fit if it codes for a prespecified protein, and is categorically unfit otherwise. Now which is it? Can a non-coding sequence contribute to fitness, or not?”
I understand that you are not a biologist, but please, let us go back to real examples. Protein coding genes are only 1.5% of the human genome. The non coding DNA is almost certainly functional, and its functions are almost certainly regulatory. But we still understand poorly what those functions are and how they work.
There is absolutely no inconsistency in ID about that matter. We do believe that non coding DNA is functional (at least a great part of it; personally, I believe it is almost completely functional). And we do believe that its functions are regulatory. But, when we “want to portray evolution as utterly improbable”, we do choose the model of protein coding genes, because that’s the model we know about, and that’s the model on which darwinist theory has been built. In other words, we do not deal with the information in non coding DNA for exactly the same reason why darwinists do not deal with it: because we don’t know where it is, and how it is encoded.
But where is the inconsistency? Let’s say that we are dealing with how that 1.5% of the genome which codes for proteins was generated. That is more than enough to prove that darwinian evolution is “utterly improbable”. In other words, in that 1.5% we have information for about 20,000 proteins (at least), and we have to explain how it emerged.
Moreover, a protein coding gene is a protein coding gene. It encodes a protein sequence. And there is only one kind of fitness for a sequence of nucleotides which encodes a protein sequence: the protein must be functional. Or are you suggesting that a protein coding gene evolves first as a regulatory gene, and then is “coopted” as a protein coding gene? Are you sharing the darwinist folly to that point?
In the end, the problem is simple: we have two proteins, A and B, completely different one from the other, and someone says that B is derived from A, or that both are derived from a common ancestor. Well, the information in those proteins is digital, and for me that means that any search based on random variation has to traverse the combinatorial space defined by protein sequences of approximately that length. That is the problem with random variation and a digital search. And if you say that NS changes the facts, that may be true, but NS is a model of necessity, and you have to build that model, to detail where and how NS can work. Simply hoping that it can work anyway will not do.
Regulatory functions are another question altogether. They are still vastly unknown. But, if we knew more about them, they would certainly be a much greater problem for darwinists, because indeed regulation is a higher level of information.
30
gpuccio
01/22/2009
10:38 am
R0b:
Just a few thoughts on what you say:
1) Natural laws do not imply any new information except obviously the possible information in the original setting of the laws themselves: that is an aspect of the fine tuning argument, which is very different, and separate, from the main ID argument of information in biology. Once a necessity mechanism is detailed, there is no added information there. The mechanism has to follow the laws. Functional information can be superimposed on contingent structures, not on necessary ones.
The information in biology is very definite: it is digital, and it is functional. Necessity has no role in creating it, except for the possible role of the NS model. NS is a model of necessity, but it is a model which has never been detailed, except for very trivial microevolutionary events. And I am not only saying that it has not been proved: I am saying that no real molecular model has ever been detailed, for any important protein transition, for example, where the role of NS is substantiated.
2) I will not answer your weasel argument because, like you, I am not a mathematician. But I will only say that I don’t agree with your concept. If you have to get the final phrase, you have to know the solution for each search, and the correct relationship between those solutions. That equals, obviously, knowing the searched phrase. In other words, you can get the weasel phrase only if you already know it. But then why search for it?
It’s not the same if you have to look for one protein with one function you need. Take protein engineers, for instance. They know what they want (the function), but they do not know how to achieve it (they don’t know the sequence which will have that function, notwithstanding all the knowledge we have about proteins). So, they utilize partial random search, and function measuring for selection. And still, they have to work a lot to get some results. And function measuring is much more sensitive than just NS (selecting a function only when it is strong enough to provide, by itself, a reproductive advantage).
3) Always to get back to biology, in evolution there is IMO no search model: just replicators and an environment, and random variation. NS is only a consequence of the interaction between the replicator and the environment. But the theory of darwinian evolution implies a search, because it assumes that the growing information in higher beings is derived from the information in lower beings through unguided mechanisms. In other words, darwinian theory interprets a completely neutral system (the replicator and its environment) as a system which can find new information and new intelligent patterns. That’s why we speak of a search: because darwinian evolution needs a system which acts as a search. The only problem is that such a system is not really effecting any search, and that’s exactly why the theory does not work. That’s why you will never obtain better digital replicators just by random noise in a digital system. The concept simply does not work. Mathematicians can argue about why it does not work, but we do know that it does not work: you just have to try to detail it or put it to test, either in a computer environment or in a biological environment. The only space where that system seems to work is in the mind of darwinists.
31
tribune7
01/22/2009
10:41 am
PO–I wrote “relevant” which is not the same as “related.”
Relevant is relevant to related and related is related to relevant
Don’t you read Merriam-Webster?
It has an awful plot
32
gpuccio
01/22/2009
10:43 am
pubdef (#19):
“Suppose the course of a creek is altered by a rock that falls into it. Better, suppose that the new course leads the creek into country that is more amenable to creek-dom, and the creek becomes a river.
Is the rock “information?” My real point: how is this different from DNA (or RNA — I don’t have the specifics of that science)? A physical substance affecting another physical substance, with extensive consequences in the long run?”
No, the rock is not information. And your example has nothing to do with DNA. DNA (at least, the protein coding genes) is digital information which stores a sequence which, when translated into an aminoacid sequence, has a specific function. We know the code, we can read it. We know, in many cases, the product (the protein) and its function. What has all that to do with your rock?
33
gpuccio
01/22/2009
10:51 am
Laminar:
“This is an example of a random search with a binary evaluation criterion (it either is the works of Shakespeare or it isn’t). There is no descent with modification and no fitness metric, just a halting condition. It is NOT an evolutionary search.”
Well, NS is a very limited oracle, and it can only judge if a reproductive advantage has been achieved or not. I really think that all that talking of landscapes and fitness functions creates a lot of confusion. Let’s be simple: NS can do only two things: expand a genome if a perceptible reproductive advantage has been achieved (positive selection); or eliminate it if there has been a significant loss of function (negative selection). That is the only oracle you have in the darwinian model, nothing else. And I am still expecting that a credible molecular model for macroevolution be detailed with that kind of oracle correctly utilized, and the random variation part correctly computed. Is that asking too much from such an extraordinary scientific theory as darwinian evolution is supposed to be?
34
Prof_P.Olofsson
01/22/2009
12:17 pm
WilliamDembski[26],
I skimmed the “search for a search” paper. It’s a nice paper. I think there is a simpler proof of “conservation of uniformity” for finite Omega, one that uses elementary methods and does not invoke weak convergence of measures. Consider the vector [p(1),...,p(n-1), p(n)] where [p(1),...,p(n-1)] is uniformly distributed and p(1)+…+p(n-1)+p(n)=1. Integrating over the simplex gives the joint pdf and the marginal of each p(i), and the expected value in the marginal is 1/n.
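The 1/n claim can be checked numerically. The following is only a minimal sketch, not anything taken from the paper: it assumes that "uniformly distributed" means uniform on the probability simplex, draws such vectors by normalising independent exponential variables (a standard way to obtain a Dirichlet(1,...,1) sample), and averages each coordinate. The function names and the sample size are illustrative choices.

```python
import random


def uniform_simplex_point(n, rng):
    """Draw (p(1), ..., p(n)) uniformly from the probability simplex.

    Normalising n independent Exp(1) draws gives a Dirichlet(1, ..., 1)
    vector, which is the uniform distribution on the simplex.
    """
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_coordinates(n, samples, seed=0):
    """Monte Carlo estimate of E[p(i)] for each coordinate i."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(samples):
        for i, p in enumerate(uniform_simplex_point(n, rng)):
            sums[i] += p
    return [s / samples for s in sums]


if __name__ == "__main__":
    n = 5
    for i, m in enumerate(mean_coordinates(n, 100_000), start=1):
        print(f"E[p({i})] ~ {m:.4f}   (1/n = {1 / n:.4f})")
```

Each estimated mean should sit close to 1/n, which is what integrating the marginal density (n-1)(1-x)^(n-2) over [0,1] gives analytically.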
I understand that the authors are pro-ID but I don’t see how the paper itself can be labeled “pro-ID.” What is the logic behind such a claim? Because the darwinian search obviously is not uniform, it has not been chosen according to the uniform distribution induced by the Kantorovich-Wasserstein metric, hence there is support for ID?
I also don’t understand your comment. What do Schneider, Pennock, and Behe have to do with how your paper is relevant to biology?
35
Toronto
01/22/2009
2:21 pm
gpuccio @30
Evolution performs no search, it is in itself, a result.
Evolution is not the trip, it’s the destination you arrive at.
Whether implied by Darwin or anyone else, there is no search process involved.
Since evolution has no goal, we can’t tell where it’s going or how easy it will be to get there.
36
R0b
01/22/2009
2:56 pm
gpuccio:
Since Dembski and Marks put no restriction on what processes can be modeled as searches, any process that isn’t uniformly random involves active information. (In fact, even uniformly random processes can be modeled as active-info-rich searches by simply modeling the search space in such a way that the sampling isn’t uniform. If a process generates random circles whose diameters are uniformly distributed, simply define the search space in terms of the areas rather than the diameters.)
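The circle example can be made concrete. This is a rough sketch under stated assumptions: active information is taken as log2(q/p), the difference between endogenous information -log2(p) and exogenous information -log2(q) described in the abstract; the diameter range [0, 1] and the target ("area in the lowest 10% of the possible area range") are hypothetical choices made only for illustration.

```python
import math


def active_information_bits(p_baseline, q_search):
    """Active information = endogenous - exogenous = log2(q / p)."""
    return math.log2(q_search / p_baseline)


# Hypothetical setup: diameters are drawn uniformly from [0, 1].
MAX_AREA = math.pi / 4            # area of the largest circle (d = 1)
TARGET_AREA = 0.1 * MAX_AREA      # target: area in the lowest 10% of the range

# Baseline: the search space is modelled in terms of area, sampled uniformly,
# so the target is hit with probability 0.1.
p = TARGET_AREA / MAX_AREA

# Actual process: uniform diameters. The target corresponds to
# d <= sqrt(0.1), which has probability sqrt(0.1) on [0, 1].
q = 2 * math.sqrt(TARGET_AREA / math.pi)

if __name__ == "__main__":
    print(f"p = {p:.4f}, q = {q:.4f}")
    print(f"active information = {active_information_bits(p, q):.2f} bits")
```

With these numbers the uniformly random process shows about 1.66 bits of active information purely because the space was described by area instead of diameter.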
So natural laws are active-information-rich according to Marks and Dembski’s definitions. Their claim is that this active info must be referred to a higher-level search (or, presumably, to intelligence, although they don’t say that in the papers), which can in turn be referred to an even higher-level search, ad infinitum. Regularities that just are don’t seem to be an option in this framework.
The active information metric is defined in terms of a prespecified target T. If prespecified targets are a problem, then they’re a problem for Marks and Dembski’s whole framework.
Who is “we”? The idea that biological evolution is searching for something is certainly not the mainstream view.
Genetic algorithms certainly do work. Whether they scale well enough to produce our earthly biota is another question.
37
pubdef
01/22/2009
3:23 pm
#32:
I will concede that my point may be “clear as mud,” and I appreciate that anyone makes an attempt to answer it.
What I’m trying to say is: notwithstanding the characterization of DNA as “digital information,” isn’t it really a physical configuration of molecules that interacts with other physical objects/particles, leading to a far flung range of physical consequences?
Another stab at it: DNA seems to be viewed as software, but isn’t it really hardware?
38
Upright BiPed
01/22/2009
3:30 pm
“Another stab at it: DNA seems to be viewed as software, but isn’t it really hardware?”
No more than a CD-ROM is hardware; it has a physical reality, but it is the information contained within that is its purpose.
DNA is a vessel of information (of the functionally specified variety) that animates and organizes inanimate matter into living tissue.
It is the only thing in the universe that does so.
39
Sal Gal
01/22/2009
4:10 pm
Evolution of Avida programs is not combinatorial optimization. When the running time for Avida is bounded, proceeding from shorter programs to longer programs is generally superior to proceeding from longer programs to shorter programs. Avida avails itself of that free lunch.
Only you and Bob, to my knowledge, have tried to bring “no free lunch” results to bear on Avida. So I believe it is reasonable to take up the issue with you. And I’m curious as to whether you and he elected to remove the attacks on ev and Avida from the first paper, or if the reviewers required it.
As for ev, Ray has contended that you have not understood what he meant to demonstrate. I have not reread his paper to see if he’s right, and I won’t bother, because it seems to me that you’re beating a dead horse. Computational studies today are more sophisticated than ev.
40
Sal Gal
01/22/2009
4:26 pm
gpuccio,
Long time no see, friend.
I think you may have missed my point, which was a criticism of modeling the fitness of a DNA sequence as all-or-nothing. I am saying that I think it’s inappropriate for Dembski and Marks to attribute all “warmer or colder” information to an “oracle” or an “assistant.”
41
Sal Gal
01/22/2009
4:36 pm
P.S. to 39: Free associating on Toms — that’s Schneider, not Ray.
42
gpuccio
01/22/2009
5:47 pm
pubdef:
“What I’m trying to say is: notwithstanding the characterization of DNA as “digital information,” isn’t it really a physical configuration of molecules that interacts with other physical objects/particles, leading to a far flung range of physical consequences?
Another stab at it: DNA seems to be viewed as software, but isn’t it really hardware?”
As Upright BiPed has already said, DNA is a support for information: you cannot have pure software, you always have software written on a hardware support. The “physical configuration of molecules that interacts with other physical objects/particles” is only the biochemical structure of the DNA molecule. But the special sequence of nucleotides which, in a symbolic code, encodes the sequence of aminoacids in a specific protein is pure digital information. There is no physical or biochemical law which determines that sequence. The sequence is preserved in DNA, and it must have been created in some way. Darwinists believe that it was created through RV + NS. We (IDists) believe that it is the product of design. But in no case can it be the product of necessity. There is no law of necessity which can output the correct sequence of nucleotides which corresponds to the aminoacid sequence of, say, myoglobin, according to a definite symbolic genetic code (the genetic code we observe in biological beings).
Darwinists, and even many biologists, are really confused when they speak of non uniform distributions, or of landscapes, or of fitness functions, as though those concepts could apply to the random generation of new nucleotide sequences. There are no laws which can generate randomly a sequence of nucleotides in a pattern which is really distant from a uniform distribution, least of all in a pattern which may have anything to do with the genetic code or with functional aminoacid sequences. All that talk about evolutionary algorithms is just smoke in your eyes.
The truth is that any component in darwinian theory which can apply to the possible generation of information in a way which differs from a random search on a uniform distribution must be related to NS. And NS is a form of necessity. Call it evolutionary algorithm or any other name, it is only NS which can make a difference.
But NS is not an omnipotent principle. Like all laws of necessity, it must be modeled according to necessity. It is indeed a potential oracle, but an oracle of which we know all too well the limits: it can only expand genomes with a reproductive advantage or eliminate genomes with important loss of function. Outside those limits, NS is non existent.
So, again, darwinists should be able to show where NS can act to “join” informational leaps which are in the range of what a random search over a uniform distribution can empirically accomplish. If they cannot do that, they have nothing: not a model, not a theory, nothing.
And believe me, they cannot do that.
43
gpuccio
01/22/2009
6:08 pm
Sal Gal:
Long time no see, friend. It’s always a pleasure to meet you.
You say I may have missed your point, but I don’t believe that. I have purposefully avoided discussing Dembski and Marks’ points, mainly for lack of competence on my part.
But I have commented on what you were saying. I quote again:
“There is huge inconsistency among IDists on the matter of functionality of DNA sequences.”
That’s not true, as I have tried to show. We do believe that, in general, DNA sequences are functional, but in different ways: protein coding genes are functional as “storage” of aminoacid sequences; non coding DNA is functional as regulatory information, at present poorly understood, and probably much more complex than the “simpler” protein coding information. Let’s say that protein coding DNA codes for the “effectors”, while non coding DNA codes, in some mysterious way, for the procedures. Or for part of them (other unknown components will probably be discovered in time).
Let’s remember that at present we know almost nothing about the procedures. We don’t know how transcription is regulated, how and why cells differentiate in myriad forms, how multicellular plans are achieved, how intercellular communication is controlled at higher levels of integration, and so on. Procedures. In any software, those are the real thing.
I quote again:
“But when they want to portray evolution as utterly improbable, they say that a sequence of bases is categorically fit if it codes for a prespecified protein, and is categorically unfit otherwise. Now which is it? Can a non-coding sequence contribute to fitness, or not?”
As already said, we reason on the protein coding genes, because that is the part we understand (and that “we”, for once, includes both darwinists and IDists). We will reason about the regulation part when we understand it. And believe me, that will not be good news for darwinists.
So, leaving alone Dembski and Marks, where is it that I have missed your point?
And is my point clear? To answer explicitly your question, a non coding sequence can certainly contribute to fitness, but only to the fitness related to its specific function, probably a regulatory one. It certainly cannot contribute to the fitness inherent in a protein coding gene, which is another thing, works in another way, has another symbolic code, and so on.
So, for protein coding genes, we have to go back to the only possible oracle if we are to believe in unguided, non designed evolution: NS. I have dealt with that in my previous post, so I refer you to that. In other words, all “warmer or colder” effect must be attributed only to natural selection (or to design). And it is a very “threshold” effect: only if the message is “very warm” (the new replicator replicates better than the old ones, and can expand in the population) can the variation in genome be fixed, and we have something different from a random search on a uniform distribution. Or, alternatively, if the message is “very cold”, and the new genome is terminated.
In all other cases, there is no oracle, and no other supporting or miraculous force. We are back to random search over a uniform distribution of possibilities. Or to design.
44
R0b
01/22/2009
6:41 pm
Not that anyone cares, but I answered my own question from #21. The piece I was missing was the fact that the amount of active information changes over the course of the search. The fact “the fitness function is smooth” contains no active information initially, since by itself it provides no guidance as to the location of the target. That’s why the first query of genetic algorithms is typically random. It’s only after we start querying that the distribution starts to become biased.
So maybe the vertical NFLT is general enough to handle all kinds of information about the target and fitness landscape. I would be interested to see it formally applied to a smooth fitness landscape.
45
R0b
01/22/2009
7:39 pm
A comment on the last conclusion of the 1st paper:
So whoever publishes an algorithm for solving a particular problem should make explicit (1) the size of the problem space and (2) the efficiency of the algorithm applied to that problem.
What I find interesting is the idea that someone would want to hide how well their algorithm performs.
46
Laminar
01/22/2009
8:04 pm
gpuccio:33
“Well, NS is a very limited oracle, and it can only judge if a reproductive advantage has been achieved or not.”
Whether an agent reproduces or not is always a binary event, obviously. How NS differentiates between agents that reproduce or not is not a binary event, it is a probability. Roughly speaking, the fitter the individual the greater their probability of reproducing and, depending on the type of GA, higher fitness individuals have a greater chance of producing greater numbers of offspring.
“NS can do only two things: expand a genome if a perceptible reproductive advantage has been achieved (positive selection); or eliminate it if there has been a significant loss of function (negative selection).”
Genomes with a lower fitness than their parents still have a probability of reproducing, it is just lower. Genomes with the same fitness as their parents have the same probability of reproducing, even if they are different. And that all assumes that the fitness landscape stays the same, which in biology and in some GA implementations is not always the case – you can end up fitter than your parents because of an environmental change even if at birth you were less fit.
Either way we are talking (and the paper is talking) about search algorithms and my point still stands – A binary fitness evaluation presents no slopes, even if they actually exist in the fitness landscape. Removing the ability to detect any graduated differences in fitness between individuals or iterations effectively converts a variable fitness landscape into a flat one. A hill climbing algorithm is ONLY a hill climbing algorithm if there is a hill to climb and a GA is ONLY a GA if there is some way of ranking individuals. Otherwise you just have an individual taking a random walk, or a population of individuals taking a random walk. To imply that these strategies are ‘adding information’ is like arguing that an aeroplane is no better at flying than a car because the aeroplane can’t fly without wings.
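The flattening effect of a binary evaluation can be seen in a toy hill climber. What follows is a minimal sketch under assumptions of my own (a 12-bit string, single-bit mutations, accept a move if the score does not drop); it is not a model of any particular GA or of biology. The same climber is run once with a graded oracle and once with an all-or-nothing oracle.

```python
import random


def graded_oracle(candidate, target):
    """Graded fitness: number of positions that match the target."""
    return sum(c == t for c, t in zip(candidate, target))


def binary_oracle(candidate, target):
    """All-or-nothing fitness: 1 only for an exact match."""
    return 1 if candidate == target else 0


def climb(oracle, target, rng, max_queries=200_000):
    """Single-bit-flip climber; returns the number of oracle queries used.

    A move is accepted when the score does not decrease. With the binary
    oracle every non-target string scores 0, so every move is accepted
    and the climber degenerates into a random walk.
    """
    current = [rng.randint(0, 1) for _ in target]
    score = oracle(current, target)
    queries = 1
    while current != target and queries < max_queries:  # halting condition
        trial = current[:]
        trial[rng.randrange(len(trial))] ^= 1  # flip one random bit
        trial_score = oracle(trial, target)
        queries += 1
        if trial_score >= score:
            current, score = trial, trial_score
    return queries if current == target else None


if __name__ == "__main__":
    target = [1] * 12
    for name, oracle in (("graded", graded_oracle), ("binary", binary_oracle)):
        runs = [climb(oracle, target, random.Random(seed)) for seed in range(10)]
        print(name, runs)
```

On a 12-bit problem the graded runs typically finish in tens of queries, while the binary runs wander for thousands, since nothing distinguishes one non-target string from another.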
47
Sal Gal
01/23/2009
12:32 am
Bill Dembski,
When you first announced your publications here at UD, you linked them to complex specified information. Yet I can find no explicit reference to CSI. And I genuinely do not see it between the lines. Would you explain the connection to CSI?
48
gpuccio
01/23/2009
1:45 am
R0b:
“What I find interesting is the idea that someone would want to hide how well their algorithm performs.”
The concept is probably that the author of a simulation of evolutionary search would want, and indeed does want, to hide how his algorithm performs well only because of the specific information about the solution which has gone into the programming of the algorithm. The idea is simple: that makes the algorithm efficient, but it makes it a very bad evolutionary simulation, because a true evolutionary model must know nothing in advance of the solution it finds. Is that clear?
49
djmullen
01/23/2009
4:30 am
What I don’t understand is what does searching have to do with evolution? It looks to me like the ultimate concern of the authors is searching through all possible genomes for the tiny proportion of those genomes that will construct and operate an organism that can successfully reproduce.
But every living organism that exists is the product of organisms that are already in that tiny proportion of the possible genomes and if they successfully reproduce then their DNA is already in it too.
In practice, they do this by making only small changes to the DNA they hand down to their offspring. This keeps the results of their so-called “search” either inside the sweet spot or very close to it. No other “search strategy” is necessary.
Evolution says that every living organism since the first self-reproducing molecule is already in the genomic sweet spot and all they have to do is keep their offspring in it too.
50
Joseph
01/23/2009
8:08 am
1- “Evolution” is not being debated
2- Whether or not “evolution” is goal-oriented is being debated.
3- In any goal-oriented scenario the goal is being searched for.
51
Joseph
01/23/2009
8:11 am
To all of those who doubt these papers are pro-ID:
Can you point to any peer-reviewed papers that demonstrate the power of blind, undirected processes pertaining to biology?
52
gpuccio
01/23/2009
9:12 am
djmullen:
No, things are very different from what you are saying. You say:
“But every living organism that exists is the product of organisms that are already in that tiny proportion of the possible genomes and if they successfully reproduce then their DNA is already in it too.”
Absolutely not. First of all, we cannot speak of the whole genome (that would be beyond any possible analysis). We have to analyze some single portion of genomes.
As I have said (see my answers to Sal Gal) the only portion we can really compare (because it is the only portion we really understand) is the protein coding genes. Now, while there are some similarities between some proteins in different species, others are completely different. Practically each species has proteins which are absolutely peculiar to that species.
I quote here from a previous post of mine:
“The proteins we do know (and we know a lot of them) are really interspersed in the search space, in myriads of different and distant “islands” of functionality.
You don’t have to take my word for that. It’s not an abstract and mathematical argument. We know protein sequences. Just look at them.
Go, for example, to the SCOP site, and just look at the hierarchical classification of protein structures: classes (7), folds (1086), superfamilies (1777), families (3464). Then, spend a little time, as I have done, taking a couple of random different proteins from two different classes, or even from the same superfamily, and go to the BLAST site and try to blast them one against the other, and see how much “similarity” you find: you will probably find none. And if you BLAST a single protein against all those known, you will probably find similarities only with proteins of the same kind, if not with the same protein in different species. Sometimes, partial similarities are due to common domains for common functions, but even that leaves anyway enormous differences in terms of aminoacid sequence.”
Just to give you an idea: bacteria have about 500-2000 protein genes, while humans have 20000-25000. Drosophila has about 14000.
Each protein, and each protein functionality, is a different island in the sea of possible sequences. We have hundreds of thousands of different proteins. Some of the most complex proteins are more than 2000 aminoacids long.
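For a sense of scale, the raw size of the sequence space is easy to compute. This is a back-of-the-envelope sketch that assumes the 20 standard amino acids and a fixed length; it counts all possible sequences only, and says nothing about how many of them are functional or about how any search moves through the space.

```python
import math

AMINO_ACIDS = 20  # assumption: the 20 standard amino acids


def sequence_space_bits(length):
    """log2 of the number of distinct sequences of the given length."""
    return length * math.log2(AMINO_ACIDS)


def sequence_space_digits(length):
    """Number of decimal digits in 20**length (exact integer arithmetic)."""
    return len(str(AMINO_ACIDS ** length))


if __name__ == "__main__":
    for length in (100, 500, 2000):
        print(
            f"length {length}: about {sequence_space_bits(length):.0f} bits, "
            f"a count with {sequence_space_digits(length)} decimal digits"
        )
```

For a length of 2000 this gives roughly 8,644 bits, that is, a count of sequences with 2,603 decimal digits.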
So, maybe you are not familiar with biology (no problem there), but when you say:
“This keeps the results of their so-called “search” either inside the sweet spot or very close to it. No other “search strategy” is necessary.”
you really don’t know what you are speaking of.
53
Prof_P.Olofsson
01/23/2009
10:06 am
Joseph[51],
You’re avoiding the issue. If you claim these papers to be “pro-ID” you ought to demonstrate what bearing they have upon biology.
As for the kind of papers you ask for, you can check out Rick Durrett’s publications for a start.
54
Prof_P.Olofsson
01/23/2009
10:19 am
R0b[44],
As you can see, there is no mention of fitness functions in the “search for a search” paper. A “search” is defined as a probability measure over the search space (see the construction on page 2, B); this probability measure may or may not involve the use of fitness functions.
55
CJYman
01/23/2009
12:45 pm
Prof. Olofsson:
“As for the kind of papers you ask for, you can check out Rick Durrett’s publications for a start.”
Does he show how background noise (chance) combined with an arbitrary collection of laws (set of laws put together absent any consideration for future results — absent foresight) will produce a system of signs and the mechanisms to process them, a search space amenable [where functions aren't spaced too far apart] to a ratcheting filter (natural selection), and the laws necessary to create environments that will allow life to evolve to ultimately highly improbable and functional machines and intelligence (systems with the ability to model the future and generate targets). Basically does he show that life and evolution to produce intelligence will occur from any arbitrarily chosen set of laws and initial conditions?
I would definitely be interested in seeing such research.
56
R0b
01/23/2009
1:08 pm
Thanks Prof. Olofsson. On reading II.B, I see that I was completely misinterpreting their vertical NFLT examples. It appears to be perfectly general. That’ll teach me to try to speed-read technical papers.
57
R0b
01/23/2009
1:51 pm
gpuccio:
The active information metric measures the algorithm’s performance relative to the given problem. To say that an algorithm performs well because of X bits active information is to say that it performs well because it performs well.
Actually, no. Efficient algorithms have lots of active information, by definition. Does that make them bad?
“True evolutionary models”, assuming they perform better than random sampling, have plenty of active information. GA’s typically know nothing in advance about the location of optima, but they do act on the assumption that the landscape is reasonably smooth. The correctness of that assumption constitutes active information.
58
Toronto
01/23/2009
2:13 pm
Joseph[50]
gpuccio @[48] seems to agree that the process of evolution has no goal.
If evolution had a goal it would be a form of ID.
59
R0b
01/23/2009
3:05 pm
Joseph:
Interestingly, Marks and Dembski’s work cannot address that question, because you have to know that there is a goal and know what it is before you can even begin an analysis in their framework.
Actually, we need to distinguish between a goal in the codomain of the objective function vs. a goal in the domain. If, as in a prototypical optimization problem, the goal is defined in terms of the codomain, it makes sense to describe the process as a search for a solution to a problem. If the goal is defined in terms of the domain, then we’re not searching for anything, but simply moving toward a goal whose location is already known.
Correct me if I’m wrong, but ID scenarios seem to imply the latter. That is, the designer was not using evolution to find a solution to a problem, but rather to instantiate an already-known solution.
60
gpuccio
01/23/2009
3:07 pm
R0b:
“GA’s typically know nothing in advance about the location of optima, but they do act on the assumption that the landscape is reasonably smooth.”
I don’t agree. GAs know a lot about what they want to find, and how to find it. The weasel is just a gross example, but wouldn’t you agree that the weasel algorithm knows exactly the solution it is searching for? Other GAs may not know exactly the solution, but they do know a lot of other precious information. They certainly don’t know (or assume) only “that the landscape is reasonably smooth”.
I am not specially interested in GAs, being absolutely convinced that they are completely useless, and that they don’t model anything, except for the ability of their programmer. But many times, here and elsewhere, I have declared what a true computer simulation of darwin’s theory should be, usually receiving no answer or comments. IMO, such a simulation should work more or less as follows:
1) Take a digital environment: a computer running some operating system, and any software we like. The digital environment can be stable or change, but the important point is that it should in no way be programmed for the simulation.
2) Take a program which generates digital replicators, and let it run in the system. The program can incorporate some system of random variation, which can be regulated as we like, but it has to be completely random, and we can apply any random probability distribution we want, but again the important point is that no programming must be introduced which has any relationship with the simulation. The whole system will thus be blind to the simulation, exactly like darwinian theory assumes.
3) Just let the system and the software run, and wait. For what? For any variation in the replicator which is spontaneously selected by the system as useful.
That would really be an evolutionary simulation. You have everything: a system which is not designed to run the simulation (in other words, a blind system), but which has its own rules and characteristics, like any true landscape; replicators which start as functional (we are not simulating OOL here), and are subject to random variation, adjustable as we like; and, above all, no active information introduced in the system.
That, and only that, would be a real simulation of darwinian evolution. Can you see the differences with existing GAs?
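A toy sketch of the three steps above, under loudly stated assumptions: the "environment" is nothing more than a fixed amount of memory and a rule that copying takes one tick per symbol, and every name and parameter here (`simulate`, `capacity`, `indel_rate`) is invented for illustration. No line of the code scores or compares replicators; shorter genomes simply finish copying sooner, so any differential reproduction comes from that rule and from random erasure when memory is full. It is far simpler than the operating-system environment described above and is not offered as the simulation itself.

```python
import random

def simulate(ticks=500, capacity=100, start_len=20, indel_rate=0.05, seed=0):
    """Toy replicator world with no scoring function (all rules are assumptions).

    A replicator is a list of symbols. Copying costs one tick per symbol,
    a copy may gain or lose one symbol at random, and when memory is full a
    replicator chosen uniformly at random is erased.
    Returns the mean genome length after each tick.
    """
    rng = random.Random(seed)
    # Each entry is [genome, ticks left until the current copy completes].
    pop = [[[0] * start_len, start_len] for _ in range(10)]
    history = []
    for _ in range(ticks):
        newborns = []
        for rep in pop:
            rep[1] -= 1
            if rep[1] <= 0:
                child = list(rep[0])
                if rng.random() < indel_rate:
                    # Random variation: delete or insert a single symbol.
                    if rng.random() < 0.5 and len(child) > 1:
                        child.pop(rng.randrange(len(child)))
                    else:
                        child.insert(rng.randrange(len(child) + 1), rng.randrange(4))
                newborns.append([child, len(child)])
                rep[1] = len(rep[0])  # the parent starts its next copy
        for baby in newborns:
            if len(pop) >= capacity:
                pop.pop(rng.randrange(len(pop)))  # death is blind to the genome
            pop.append(baby)
        history.append(sum(len(g) for g, _ in pop) / len(pop))
    return history
```

Plotting or printing the returned list shows how the mean genome length drifts over time; what trend appears, if any, is for the run to show.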
61
R0b
01/23/2009
3:41 pm
gpuccio:
I said that GAs typically know nothing about the location of the target. But yes, what they are trying to find and how to find it is incorporated into the GA.
No, I wouldn’t, but I think our disagreement is semantic. You seem to see the objective function as part of the GA, while I see the two as separate. The weasel search algorithm, for instance, starts with a random string precisely because it has no idea what the target string is. The fitness function (oracle), on the other hand, knows the target string.
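The separation described here can be sketched as follows. This is one weasel-style variant written for illustration, not Dawkins' original program: the names `make_oracle` and `search`, the brood size, and the mutation rate are all assumptions. Only the oracle closure holds the target string; the search loop is told the string length and otherwise sees nothing but mismatch counts.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "

def make_oracle(target):
    """Return a function reporting how many characters of a candidate are wrong.

    Only this closure holds the target string.
    """
    def oracle(candidate):
        return sum(a != b for a, b in zip(candidate, target))
    return oracle

def search(oracle, length, offspring=100, rate=0.05, max_generations=5000, seed=0):
    """Mutation-and-selection loop that sees only the oracle's scores."""
    rng = random.Random(seed)
    # Start from a random string: the search has no idea what the target is.
    parent = "".join(rng.choice(ALPHABET) for _ in range(length))
    for generation in range(max_generations):
        if oracle(parent) == 0:
            return parent, generation
        brood = [
            "".join(rng.choice(ALPHABET) if rng.random() < rate else c for c in parent)
            for _ in range(offspring)
        ]
        # Keep whichever string the oracle scores best (fewest mismatches).
        parent = min(brood + [parent], key=oracle)
    return parent, max_generations

oracle = make_oracle("METHINKS IT IS LIKE A WEASEL")
best, generations = search(oracle, length=28)
print(best, generations)
```

Whether the oracle counts as part of "the GA" is the semantic question at issue; the code only makes the boundary between the two explicit.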
And yet GAs are able to find solutions that their programmers don’t know beforehand.
And I’m not sure what you mean when you say that they don’t model anything. If you’re referring to biological evolution, then obviously nobody claims that computers can model that with any degree of fidelity, although they can provide us with some insight into the variation+selection process on a small scale.
Active information is defined in terms of a target. What is the target of the proposed simulation?
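For reference, a minimal sketch of the three measures named in the abstract, assuming the usual convention that information is the negative base-2 logarithm of a success probability. The symbols `p` (success probability of blind search) and `q` (success probability of the assisted search) and the function names are labels chosen here; the example numbers are illustrative only.

```python
import math

def endogenous_information(p):
    """Difficulty of blind search: -log2 of its success probability p."""
    return -math.log2(p)

def exogenous_information(q):
    """Difficulty remaining for the assisted search with success probability q."""
    return -math.log2(q)

def active_information(p, q):
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return endogenous_information(p) - exogenous_information(q)

# Illustrative target: one uniform guess at a 28-character string over 27 symbols.
p = 27.0 ** -28
print(round(endogenous_information(p), 1))  # about 133.1 bits
```

Both `p` and `q` are probabilities of hitting a specified target, which is why the question of what the target is matters for the definition.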
62
gpuccio
01/23/2009
5:29 pm
R0b:
“Active information is defined in terms of a target. What is the target of the proposed simulation?”
That’s exactly the point: my proposed simulation has no target, other than verifying whether a simulation which has no target can obtain some new functional target. That’s exactly the point, and that’s exactly what darwinian evolution is believed to be: a process with no target, which can find some extraordinarily functional targets. I don’t believe that can happen. But darwinists believe exactly that. So why don’t they try to simulate that kind of process, instead of trying to realize processes which see very well, and then pretending that they are simulations of a blind process?
Again, and to be clear to the extreme, I admit that the simulation can exercise natural selection, but not intelligent selection. In other words, the selection must arise spontaneously from the interaction between the replicator and the system, without any previous programming. In that way, and only in that way, a true (blind) natural selection can take place versus any possible new useful function which may arise from true blind random variation. Any possible useful new function is the target. No specific target at all, the greatest possible target of all: any possible new useful function. True blindness, true random variation, true blind natural selection. Isn’t that what darwinists have been preaching for all these years?
63
R0b
01/23/2009
6:12 pm
gpuccio, first of all, I don’t understand what you mean when you say that the system must not be designed to run the simulation. What are we simulating? Biological evolution? If so, then the system should have the characteristics of our terrestrial environment. It needs to simulate whatever it’s supposed to simulate.
Regardless, selection will be determined by the rules and characteristics of the system. The replicators that are better aligned to the environment will reproduce more. Such replicators are more functional, in terms of that environment, than other replicators. I don’t know where the line is between “more functional” and “new function”, so that’s something you’ll need to define for me.
64
gpuccio
01/23/2009
6:44 pm
R0b:
I will try to clarify.
1) The system must not be designed to run the simulation, just as the environment (in darwinian theory) is not designed to produce life (or any higher form of life). The system will obviously have its rules of functionality (its “fitness functions”), but those rules must not have been written in view of the simulation. Just as it would have happened in natural history, according to darwinian theory.
2) Yes, we are simulating biological evolution in a computer environment. Obviously, a computer system cannot have the characteristics of our terrestrial environment, neither in my simulation nor in any other GA simulation. And, least of all, can the digital replicators have the characteristics of true biological replicators. But that’s not the point. The point is that we are simulating the model according to which random noise, plus a natural selection deriving from the unprogrammed interaction between modified replicators and a functional environment, can build up new functional information and generate more complex and functional replicators. That’s the point of the simulation. Even if the digital context is obviously different from the biological one (and there is no escape from that, in any digital simulation), the logical model is the same. So the simulation, although not perfect, is appropriate, while traditional GAs are not appropriate because they are not modeling a blind natural selection, and therefore those models are logically completely different from what they are trying to simulate.
3) The point is exactly to verify if what you call “the replicators that are better aligned to the environment” will really arise in this model. I believe and expect they never will.
4) You say: “I don’t know where the line is between “more functional” and “new function”, so that’s something you’ll need to define for me.”. There is no line. Any new replicator which is better aligned to the environment and spontaneously expands in the environment at the expense of the previous forms is more functional and has developed a new function. There is no difference between the two concepts.
5) Personally, I do believe that those more functional replicators will never arise in such a system, as I believe that they have never arisen in natural history by a similar causal mechanism. But I am ready to analyze any possible outcome of such a simulation. Moreover, if a new form of replicator is really selected, it would be easy to verify what the new function is, how it gave the new replicator a reproductive advantage, and above all what is the statistical boundary which has been overcome (how many functional bits have been added). According to darwinian theory, even if the first advantages may be trivial (within the range of statistical expectations), in time the accumulation of such useful variations should give functional variations of the order of CSI (500 bits). My belief and expectation, on the contrary, is that we would never acknowledge even the first, microevolutionary steps, or if we do they will be absolutely trivial variations, and they will stop at that level.
I hope that is clear. I really believe that, if we want to simulate the logical model of darwinian theory, we have to proceed that way. Otherwise, we are only playing games, doing one thing and claiming we are doing another.
And, by the way, that’s what I meant when I said that GAs don’t model anything. I meant that they don’t model anything even distantly related to the logical model of darwinian evolution. But it is true that they can model an intelligently designed way to intelligently find a solution.
65
Toronto
01/23/2009
7:51 pm
gpuccio[62]
How about this?
A 256 bit “DNA” based fractal generator.
char DNA[32];
where DNA[0] to DNA[15] select any one of f1() to f16() in any order with the parameters from DNA[16] to DNA[31].
x=f1(DNA[16],0){yadayada…};
x=f2(DNA[17],x){yadayada…};
x=….
r=f16(DNA[31],x){yadayada…};
The result r contains the seed and fractal generator.
The final output, determined at birth, is used either to generate a video bitmap or fed to a DA converter for sound output which means each DNA/life, doesn’t know what it’s generating as an output, audio or video.
Lifeforms die based on voter input. If too many visitors on the website don’t like the song or picture, it’s removed from the environment. If an output gets a lot of thumbs up, it survives and modifies a bit. If it survives long enough, (100 votes?), it gets to mate with another successful DNA/life whose offspring have attributes of both parents.
In the US, enough generations might produce a picture of an eagle, while in Italy, you might end up with something that sounds like opera.
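A rough sketch of how that scheme might look in code. Everything concrete here is an assumption made for illustration: the sixteen arithmetic functions in `FUNCS` are arbitrary stand-ins for f1() to f16(), the mutation rate is invented, and `votes_up` is a placeholder callback for the website visitors, who cannot be simulated.

```python
import random

# Hypothetical stand-ins for f1()..f16(): each maps (parameter, x) to a new x.
FUNCS = [lambda p, x, k=k: (x * (k + 2) + p) % 65521 for k in range(16)]

def express(dna):
    """Chain 16 functions selected by dna[0:16], fed parameters dna[16:32]."""
    if len(dna) != 32:
        raise ValueError("expected a 32-byte (256-bit) genome")
    x = 0
    for i in range(16):
        x = FUNCS[dna[i] % 16](dna[16 + i], x)
    return x  # r: the seed handed to the picture or sound generator

def mutate(dna, rng, rate=0.05):
    """Survivors 'modify a bit': each byte is occasionally replaced at random."""
    return bytes(rng.randrange(256) if rng.random() < rate else b for b in dna)

def mate(a, b, rng):
    """Offspring take each byte from one parent or the other."""
    return bytes(rng.choice(pair) for pair in zip(a, b))

def step(population, votes_up, rng):
    """Remove genomes whose output is voted down; the rest survive and mutate."""
    survivors = [d for d in population if votes_up(express(d))]
    return [mutate(d, rng) for d in survivors]
```

A run would start from random 32-byte genomes, for example `bytes(rng.randrange(256) for _ in range(32))`, and call `step` once per round of voting.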
66
Sal Gal
01/23/2009
8:52 pm
gpuccio,
I have agreed with your general argument, though not the details, for years. But you have not spoken to the death of virtual organisms. You also have not addressed the massive parallelism of life. There are an estimated 5 X 10^30 bacteria on earth at present. I won’t guess how many elementary self-replicators there might have been prior to cellular life, but will suggest that they may have reproduced more rapidly than bacteria. In any case, our computational resources are relatively tiny.
Various folks, including me, who are interested in both information and evolution think of organisms as modeling their environments. As long as the virtual environments are uninteresting, there is no reason to expect virtual organisms to be anything but the same. One thing that makes an environment “interesting” is that there are many ways to garner resources and reproduce.
I should mention that GAs have discovered bugs in application-specific software the GA developers had not examined. That is, the GAs came by fit individuals that exploited errors made by programmers other than those who wrote the GAs. This is not sensational, but it is qualitatively what you said you wanted to see.
P.S.–Thanks for pointing me to SCOP and BLAST. I’m interested, but I just can’t follow up at the moment.
67
Mark Frank
01/24/2009
4:46 am
But you have not spoken to the death of virtual organisms. You also have not addressed the massive parallelism of life.
Gpuccio
Sal Gal is spot on. Any simulation of RM+NS has to deal with both RM and NS. RM is random in the sense that the mutation is independent of the selection criteria. But NS, which is simulated by the fitness function, is far from random. What confuses the issue is that the fitness functions for real species are very complicated, hard to determine, and keep changing over time. Nevertheless they are not random. If the criteria for survival in each generation had no relation to the criteria for survival in the previous generation then evolution would not get off the ground.
GAs typically specify a simple and unchanging fitness function. This is like a world where there is only one attribute that matters in the struggle to survive. For example, imagine a world where the only organisms to survive were those with the greatest power/weight ratio. Then the mechanism of RM+NS would lead to species with ever greater P/W ratios. This world has just as much knowledge of its target as a GA, but would still be a Darwinian process. You see something like this when artificial selection limits the fitness function to one or two criteria desired by man.
You seem to be demanding that the fitness function arise spontaneously in the simulation. But, just like the real world, any simulation must include a process for selection based on something.
68
gpuccio
01/24/2009
6:24 am
Sal Gal and Mark:
As usual, very interesting discussion with you. I offer my counter arguments to what you say, and in defense of my “thought simulation”.
Sal Gal:
“But you have not spoken to the death of virtual organisms.”
Death can well be included in the original programming of the replicators. You can program the replicators as you like, provided that you don’t use any explicit or implicit form of frontloading, which could be anyway easily revealed by an impartial scrutiny of the original software.
“There are an estimated 5 X 10^30 bacteria on earth at present. I won’t guess how many elementary self-replicators there might have been prior to cellular life, but will suggest that they may have reproduced more rapidly than bacteria. In any case, our computational resources are relatively tiny.”
Well, I am not so sure they are so tiny, although I have not done the calculations. Even if we work with fewer replicators, in a digital environment we can greatly speed up variation and the evaluation of replicating fitness. And anyway, I am not saying that such a simulation is easy, quick, or that it could be done with small computational resources. I am only saying two things:
a) It is possible in principle to realize that simulation.
b) It is the only appropriate kind of digital simulation for darwinian theory.
“As long as the virtual environments are uninteresting, there is no reason to expect virtual organisms to be anything but the same. One thing that makes an environment “interesting” is that there are many ways to garner resources and reproduce.”
You can make the operating system as interesting and varied as you like, provided that it contains no specific code to recognize and encourage any specific thing which you want to attain. In other words, the selection must be self-selection, deriving from the interaction between the functional replicators and the rules of the environment, and must in no way be programmed by the designer of the simulation.
“This is not sensational, but it is qualitatively what you said you wanted to see.”
I have no problem in admitting that GAs can do a lot of things, like any other software can. The only thing they cannot do is simulate the logical mechanism of darwinian evolution.
What I wanted to see is some form of evolution of CSI in a simulation of the kind I have suggested. It is obvious that GAs, being intelligently designed programs, can find solutions which the designer did not know in advance. If a computing software is designed to solve an equation, that’s exactly because the designer does not know in advance the solution. But that does not make it a simulation of darwinian theory.
And finally, I do believe that you should play a little with real proteins. It’s rather easy, and it could be an amazing experience.
Mark, next post is for you.
69
gpuccio
01/24/2009
6:54 am
Mark:
“Any simulation of RM+NS has to deal with both RM and NS. RM is random in the sense that the mutation is independent of the selection criteria. But NS, which is simulated by the fitness function, is far from random.”
Well, that’s exactly my point. Existing GAs deal certainly with RM, but they absolutely don’t deal with NS. All of them deal with Intelligent Selection (IS).
Let’s speak a little of the famous “fitness function”. In reality, no fitness function exists. Functions are just our intelligent creations. The problem is, the fitness functions created in GAs don’t represent in any way what is assumed in darwinian theory. And that’s not only because biological reality is different from digital reality: that is a basic problem of all simulations, both GAs and the one I am proposing. That is certainly a limit, but it is not my point.
My point is that the fitness function in GAs is an intelligent function, devised exactly to obtain, in some more or less indirect way, what the designer of the simulation wants to obtain. That’s only design in a cheap tuxedo.
In darwinian theory, NS is only a blind effect which derives from the interaction of two different realities: the environment, or landscape, or whatever you want to call it, and the functional reality which we call the replicator. Two points are essential to define some process as NS:
a) The environment must be totally blind to the replicator, in other words in no way it must be designed to favor some specific solutions in the replicator. It can, obviously, favor whatever solution arises “naturally”, as a consequence of necessity deriving from its internal rules and structure. But no “fitness function” must be purposefully written in the environment by the designer. The only “fitness function” will be a description of how the environment works, or is structured, or changes, exactly as it should happen in darwinian theory.
b) The replicator can be as functional and as complex as we want: in my simulation, it represents the result of OOL (which we are not simulating here). The only requirement is that nothing must be frontloaded in the replicator to “guide” or help the future variations. In other words, any future variation must be truly random (and, as I have said, the variation mechanism can apply any statistical distribution you like, uniform or not, provided that no information is inputted about some specific functional solution).
That is the concept of NS as it is outlined in darwinian theory. That’s what we have to simulate. Anything which does not have those two properties, is neither NS nor a simulation of it. It is just some form, more or less in disguise, of IS.
“Nevertheless it is not random.”
I have never said that NS is random. It is a process of necessity. But it has to satisfy the two criteria I have detailed, otherwise it is not NS. In other words, NS is a “blind” (not “random”) process of necessity.
“Then the mechanism of RM+NS would lead to species with ever greater P/W ratios.”
In that simulated world, you would only obtain (if you are lucky) the same replicators, with slightly greater P/W ratios (as far as that is possible without damaging the existing functions, and by means of simple random variations, which would never approach the level of CSI). In other words, you attain (if you are lucky) exactly what you specified in your intelligent fitness function. If you specify greater P/W ratios, you can obtain that and nothing else. If you specify flying objects (whatever that may mean in a digital environment) you can obtain that and nothing else. If you don’t specify anything, you obtain nothing. That’s exactly, IMO, the intuitive meaning of the concept of active information.
And specification, as you well know, is the first requirement for CSI and design.
“But, just like the real world, any simulation must include a process for selection based on something.”
The real world, in darwinian thought, does not “include” any “process” for selection. NS is just a consequence of how the real world is, and of how a replicator works. It is not a “process”. It is a blind effect, due to blind laws of necessity. Maybe a theistic evolutionist could argue that the environment is designed to produce life, but that’s another story. I believe, to your merit, that you are not a theistic evolutionist.
70
Mark Frank
01/24/2009
7:56 am
Gpuccio
The real world, in darwinian thought, does not “include” any “process” for selection. NS is just a consequence of how the real world is, and of how a replicator works. It is not a “process”. It is a blind effect, due to blind laws of necessity.
First a comment on terminology: a process does not have to be designed. Stalactites grow through a process; stars are born, shine and then die through a process. These are both blind. They just happen.
But let’s not get tied into semantics. It appears that for you it is a key issue that the fitness function in a GA is designed, while the fitness function in the real world is not. This is the big thing that invalidates GAs as simulations of the Darwinian process as far as you are concerned. Correct?
Now do you accept that some GAs do generate complex solutions to designed fitness functions? So they establish that RM can generate complex and unanticipated solutions – even if the problem they address is anticipated?
If so, surely this is a significant piece of evidence in favour of the Darwinian process? All that remains is to show that it is possible for RM to generate complex and unanticipated solutions to problems which stem from the natural world rather than the simulators’ brains. While it may be difficult to create such problems in a simulation it is not clear why they differ in principle. They may be more complex but they don’t present any new problems in principle. What are the key differences between a designed fitness function and a “blind” one such as NS? They both eliminate some individuals in a systematic way. Why should a process that can solve one type of problem not be able to solve the other?
Imagine someone did discover (a la Douglas Adams) that the earth was in fact a giant machine designed to generate complex life forms through RM and a complex environment. Would this suddenly invalidate the Darwinian logic?
71
Joseph
01/24/2009
8:33 am
2- Whether or not “evolution” is goal-oriented is being debated.
yes “evolution” as it is currently accepted and applied.
ID doesn’t say. All ID says is that there are some things in the universe (and perhaps the universe itself) which show signs of being intelligently designed.
As for the implementation that would be another question that can be pursued after design is determined.
72
gpuccio
01/24/2009
9:45 am
Mark:
Your objections allow me to go into further detail about very important aspects of the question.
First of all, I agree with you not to get tied into semantics about the word “process”. I think we agree about what NS is in the darwinian theory, and that’s the important thing.
So, let’s go on. You say:
“It appears that for you it is a key issue that the fitness function in a GA is designed, while the fitness function in the real world is not. This is the big thing that invalidates GAs as simulations of the Darwinian process as far as you are concerned. Correct?”
Yes, it’s perfectly correct. But I would like to add immediately that it is not only a question of formal difference: it is a question of utter substance, because in GAs the fitness function (and the algorithm which uses it) are designed in a way which incorporates a lot of “active information” (or, if we don’t want to get tied into semantics, of useful knowledge) about the problem to be solved, plus a lot of intelligent planning about how to best solve it. It is not a small difference. More about that later.
“Now do you accept that some GAs do generate complex solutions to designed fitness functions? So they establish that RM can generate complex and unanticipated solutions – even if the problem they address is anticipated?”
Here the matter becomes more tricky. Let’s try to analyze it better. GAs, like any intelligent software, can certainly generate solutions. And the solution is by definition unanticipated, while the problem is certainly anticipated. IMO, what differentiates GAs from other software is that GAs use a random search as part of the algorithm. That’s why I have always compared GAs to other engineering processes which do the same, like protein engineering and antibody maturation.
Now, to understand the question better, let’s take an example which uses only necessity: a software program which can calculate successive digits of pi. Now, let’s pretend you are calculating the nth digit of pi: the solution is unanticipated (we don’t know it in advance), but the problem is anticipated (we have to know the right algorithm which can correctly calculate that digit). The same algorithm can calculate many digits of pi (I suppose…). So, we have here an example of a program based on pure necessity (a mathematical formula) which can give us a solution which becomes ever more complex with each new digit. But please, take notice of two important things:
a) the algorithm is of pure necessity, and does not imply any random search.
b) even if the complexity of the solution can increase, the specification remains the same, and is linked to the mathematical definition of pi. No new specification is ever created by the program.
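As a concrete instance of such a pure-necessity program, here is a sketch using one well-known method, Gibbons' unbounded spigot algorithm. The comment does not say which formula is meant, so the choice of algorithm is an assumption; the point it illustrates is that the same fixed rule yields digit after digit with no random step anywhere.

```python
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi one at a time (Gibbons' spigot algorithm).

    Uses only exact integer arithmetic; there is no random search anywhere.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: absorb one more term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

first_ten = list(islice(pi_digits(), 10))
print(first_ten)  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
```

The generator can be run for as many digits as desired; the specification (the definition of pi) never changes, only the length of the output.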
Now, let’s go to intelligent algorithms which incorporate some random search as part of the process. Apart from examples which are pure propaganda, like Dawkins’ weasel, where the program could as well write down the solution instead of looking for it by random search, that kind of algorithm, like those used in protein engineering, has a definite reason to exist: in that case, again, the designer knows the problem but not the solution, but he knows no algorithm based on necessity to reach the solution. In other words, he cannot calculate the solution. That’s typically our situation with protein functions: we may know what function we are searching for, but we have no idea of which amino acid sequence can implement it.
That’s why, instead of a process of calculation based on necessity, we can adopt a process of trial and error, based on limited random search and some form of intelligent selection, usually a very sensitive measure of the desired function after each step of limited random variation.
Such a method, if correctly designed, works. It is not easy, it is not quick, but it works. We know that.
But it is important to understand why it works. We have a lot of intelligent programming and active information here, at different steps:
a) First of all, there is usually a careful selection of the sequences we start from. That can be very important, and is based on what we understand of protein function.
b) The random variation step is usually tailored as much as possible. For instance, in antibody maturation, it is applied only to the part of the sequence which has to be “improved”, and not to the whole protein coding gene. In other words, it is random, but “targeted” variation.
c) The engineer never asks random variation to do what it cannot do. In other words, the process of variation has the role of achieving as much variation as statistical laws allow to achieve with the available resources, and not more. That’s why each step of limited variation has to be followed by a step of necessity (measurement and selection).
d) Finally, and most important, the selection is intelligent selection, and not natural selection. That does not mean only that it is implemented by an intelligent engineer, but also, and especially, that it is completely based on our intelligent understanding of the problem: here is where most of the active information slips in.
The difference is fundamental. I decide what function I am looking for: I am not expecting “any possible useful function”, but a specific solution to a specific problem. I “measure” it, even at levels which could have no significant relevance in the general context of the environment. In other words, I recognize any approach to my function, by my capacity to define and measure it, and not by a spontaneous advantage which the function implies. My selection is therefore artificial, intelligent, and guided by a lot of active information about the result.
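Points (a) to (d) can be sketched as a loop in which limited, targeted variation alternates with measurement and selection. This is an illustrative toy, not a real protein engineering protocol: `measure` stands for whatever assay the engineer supplies, and the names, library size, and rates are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def targeted_mutation(seq, region, rng, rate=0.1):
    """Vary only positions inside region; everything else is copied unchanged."""
    return "".join(
        rng.choice(AMINO_ACIDS) if i in region and rng.random() < rate else c
        for i, c in enumerate(seq)
    )

def improve(start, region, measure, rounds=50, library=200, seed=0):
    """Alternate limited random variation with measurement and selection.

    start   : the carefully chosen starting sequence (point a)
    region  : the positions allowed to vary (point b)
    measure : the engineer's assay of the desired function (point d)
    """
    rng = random.Random(seed)
    best = start
    for _ in range(rounds):
        # Point (c): each round asks variation only for a small library.
        variants = [targeted_mutation(best, region, rng) for _ in range(library)]
        best = max(variants + [best], key=measure)
    return best

# A made-up assay: count tryptophans (W) in the sequence.
result = improve("AAAAAAAAAA", range(3, 6), lambda s: s.count("W"))
print(result)
```

All the knowledge about the desired result enters through the three arguments the caller chooses, which is where the comment locates the active information.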
More in the next post.
73
gpuccio
01/24/2009
10:24 am
Mark:
So, let’s try to draw some conclusions. You say:
“If so, surely this is a significant piece of evidence in favour of the Darwinian process? All that remains is to show that it is possible for RM to generate complex and unanticipated solutions to problems which stem from the natural world rather than the simulators’ brains.”
Here is where your reasoning is not correct, in the light of what we have said before. The fact that highly sophisticated pieces of engineering can find solutions to specific problems, even by intelligently using random search, is in no way evidence in favour of the darwinian process. And it is not only “the problem” which “stems from the simulator’s brain”, but especially the process of solution, even if the solution itself is not known in advance. And the process of solution is an intelligent creation, a fruit of design, and it incorporates not only the active information about the problem, but also the intelligent elaboration of the designer, his intuitions, his patient work, his purpose, his general knowledge and view of the world, and who knows how many other things.
How could all that “stem from the natural world”? The natural world is as it is. Nothing stems from it, not problems, not processes of solution, not solutions. The best we can say is that, from the interaction between the replicator (which, at least, is a functional entity) and the natural world (which is only a passive scenario) the functions inherent in the replicator can be favoured, or suppressed, according to how they fare in that scenario. And that’s exactly what I had requested in my proposed simulation: a passive scenario (although an organized one, with its laws and structures), a functional replicator, and random variation. That’s all that is needed. In other words, the simulator must put his intelligence in building that scenario, but not in solving it.
“While it may be difficult to create such problems in a simulation it is not clear why they differ in principle.”
I hope I have explained why they differ in principle.
“What are the key differences between a designed fitness function and a “blind” one such as NS? They both eliminate some individuals in a systematic way.”
Indeed, the fitness function is an ambiguous abstraction. Let’s say that it is the replicator itself which survives or does not survive, according to the resources it has (in NS); while it survives or does not survive according to the planned expectations of the programmer (in IS). That is a lot of difference. The “fitness function” of the natural world, whatever it may be, incorporates no design and no intelligence. The fitness function in an algorithm is a well defined product of design, and incorporates a lot.
“Imagine someone did discover (a la Douglas Adams) that the earth was in fact a giant machine designed to generate complex life forms through RM and a complex environment. Would this suddenly invalidate the Darwinian logic?”
Well, maybe I was wrong and you really are a theistic evolutionist in disguise! You had fooled me with all that false reasoning about a materialistic view of the world.
Really, I don’t want to discuss here the inconsistencies of TEs. But if you are just implying that the earth is a giant machine designed to produce life by ETs, then I can answer: no, it isn’t. If it were, we would see that. We would see the active information somewhere.
Now, I am not saying that the earth is not tailored to favour life: I do believe it is. If it were not, life would simply be impossible. In a sense, the whole universe is tailored to favour life (see the fine tuning argument). But that is the most I can concede to TEs. That “favouring” does not include the generation of specific protein sequences, or of all the other information we observe in the biological world. That’s why the biological world is so different from the non living world. For it, another level of design is necessary.
Finally, I can admit that biological information could be generated by pure random variation and Intelligent Selection, like in GAs. But that is exactly an ID scenario (although not my favourite one).
74
gpuccio
01/24/2009
10:36 am
Toronto (#65):
Your example, as I understand it, is interesting, but absolutely not pertinent to our discussion.
First of all, a fractal generator is a necessity algorithm with some random seed. It is an example of “organization” based on necessity, but not of “functional information”. Your fractals perform no function.
Second, as you say: “Lifeforms die based on voter input. If too many visitors on the website don’t like the song or picture, it’s removed from the environment. If an output gets a lot of thumbs up, it survives and modifies a bit.” And so on. In other words, the results are selected by intelligent conscious people, according to their intelligent and conscious sensibility. That has nothing to do with NS, and has nothing to do with function. It is just a collective Rorschach test where people select what they like (or recognize) more. In a sense, it is an interesting variation of the weasel example (and I mean Shakespeare, not Dawkins).
“In the US, enough generations might produce a picture of an eagle, while in Italy, you might end up with something that sounds like opera.”
That’s exactly my point. The audience is modeling the content according to its projected representations. In a way, it’s a form of (artistic) design.
If I have missed something of what you mean, please let me know.
75
Mark Frank
01/24/2009
11:28 am
Gpuccio
You have written a lot. But I think it will be enough to concentrate on this bit:
And it is not only “the problem” which “stems from the simulator’s brain”, but especially the process of solution, even if the solution itself is not known in advance. And the process of solution is an intelligent creation, a fruit of design, and it incorporates not only the active information about the problem, but also the intelligent elaboration of the designer, his intuitions, his patient work, his purpose, his general knowledge and view of the world, and who knows how many other things.
The process of solution of a GA has three parts:
(a) The initial conditions
(b) The variation mechanism
(c) The elimination mechanism
I am going to leave aside (a) and (b). I am sure there are GAs where these are done at random with no attempt to tune them to the end result. The real issue is (c).
You use grand phrases about the designers’ work, intuition and intelligence but in the end this work is going to manifest itself as an elimination mechanism – a set of criteria for deciding who will survive and who will not. It doesn’t matter a toss whether the designer laboured on it for 20 years with the genius of Leonardo da Vinci or stumbled on it while doodling and decided to give it a go. We need to understand what it is about a GA elimination process which makes a GA an unacceptable simulation of NS. It isn’t the amount of work that was done beforehand!
You hint at one difference when you write in the context of protein design:
I recognize any approach to my function, by my capacity to define and measure it, and not by a spontaneous advantage which the function implies.
In other words the designer has worked out that the fitness function = selection criteria = measurement system will lead to the required function. But note that it is the measurement that decides what survives and thus plays the role of natural selection. It looks like your concern is that the measurement leads to an end objective over and above satisfying the measurement while NS does not. But not all GAs work that way. Some simply have a fitness function. Suppose for example you use a GA to solve the travelling salesman problem. Then the fitness function is simply the shorter solution survives – the end objective is the same as the fitness function.
To summarise. Suppose we have a GA with the following properties:
(a) Initial states are selected randomly within the domain space
(b) Variation is not in any way related to the fitness function or any external objective
(c) Survivors are selected in a manner chosen by the programmer on a whim and with no other end than to generate survivors that do well on the measurement used
Would this satisfy you?
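A minimal sketch of a GA with properties (a) to (c), applied to the travelling salesman example, could look like the following. It is only an illustration under stated assumptions: the function names, the swap mutation, the population size and the "keep the shorter half" rule are all hypothetical choices, not taken from any particular GA discussed in this thread.

```python
import math
import random

def tour_length(tour, cities):
    """Total length of the closed tour visiting the cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def mutate(tour, rng):
    """(b) Variation: swap two random positions, with no reference to fitness."""
    child = tour[:]
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def run_ga(cities, pop_size=30, generations=200, seed=0):
    rng = random.Random(seed)
    n = len(cities)
    # (a) Initial states: random permutations of the city indices.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    history = []  # best tour length seen in each generation
    for _ in range(generations):
        # (c) Elimination: the shorter half survives. The measurement
        # (tour length) is the only objective; there is no further target.
        population.sort(key=lambda t: tour_length(t, cities))
        survivors = population[: pop_size // 2]
        history.append(tour_length(survivors[0], cities))
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    best = min(population, key=lambda t: tour_length(t, cities))
    return best, history
```

Because the survivors are carried over unchanged, the best tour length in `history` can never get worse from one generation to the next. Whether this counts as a fair stand-in for NS is exactly the point under dispute.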
76
Patrick
01/24/2009
12:50 pm
1. The active information in a fitness function acts as a long term “funnel”. This funnel can contain an explicit target, like with the weasel example.
As gpuccio said, “It is obvious that GAs, being intelligently designed problems, can find solutions which the designer did not know in advance.” This is because the funnel contains a specific target that is general in scope and is applied over the long term regardless of short term considerations.
For example, you could be looking for the best shape for an antenna and there could be several different fitness functions. I’m just guessing but I’m presuming the landscape for an antenna GA should be very smooth in the best case scenarios. This example starts with an antenna that gets 1 dBi.
a) tested against explicit set of shapes
b) specific test by sampling signals and any increase, however small, in forward gain is rewarded
c) the more complex the shape the better but everything is rejected except those shapes that have a gain of 1 dBi higher than the previous generation (as in, the steps in the pathway are in 1 dBi incremental jumps).
d) the more complex the shape the better but everything is rejected except those shapes that have a gain of 12 dBi or higher.
e) complexity and length is rewarded. Each generation is checked for a better antenna, but is not rejected.
(a) should find the target very fast, with performance getting worse and worse with each subsequent version. I’d be surprised if (d) gets anything. And as a funnel (e) is so overly generalized that it may never find anything useful. Although you’d probably end up with a monster of an antenna.
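As a rough illustration of how the acceptance rule alone changes the outcome, here is a toy sketch of rules (b), (c) and (d). Everything in it is an assumption made for the example: the one-parameter "shape" `x`, the linear `gain` landscape, the step size and the seed are hypothetical, and nothing here models a real antenna. Rules (a) and (e) are left out.

```python
import random

def gain(x):
    """Hypothetical smooth landscape: forward gain in dBi for shape parameter x."""
    return 1.0 + x  # the starting shape x = 0 gets 1 dBi

RULES = {
    # (b) any increase in gain, however small, is rewarded
    "b": lambda parent, child: gain(child) > gain(parent),
    # (c) only jumps of at least 1 dBi over the parent are kept
    "c": lambda parent, child: gain(child) >= gain(parent) + 1.0,
    # (d) everything below 12 dBi is rejected
    "d": lambda parent, child: gain(child) >= 12.0,
}

def climb(rule, steps=200, max_step=1.5, seed=0):
    """Mutate x by a bounded random step; keep the child only if the rule accepts it."""
    rng = random.Random(seed)
    accept = RULES[rule]
    x = 0.0
    for _ in range(steps):
        child = x + rng.uniform(-max_step, max_step)
        if accept(x, child):
            x = child
    return gain(x)

results = {rule: climb(rule) for rule in RULES}
```

With a single step limited to 1.5 dBi, rule (d) can never accept anything when starting from 1 dBi, so it stays at the starting gain, while (b) accepts every improving step. That matches the guess above only for this deliberately simple landscape.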
2. The main issue is this:
If I may re-interpret you:
a) fitness functions in nature are not static, and will often not apply uniformly over many generations. Some fitness functions may not exist at all for certain problems, or they’re so overly generalized that the search is not properly funneled
b) neutral mutations are allowed but no fitness function can target specific configurations of them or guide them toward long term goals. If a series of neutral mutations manages to hit upon a configuration without any guidance some fitness functions will activate at that point. The weasel program is a good example since many intermediates do not have a function (comprise no English word) in the short term yet they’re selected for anyway.
c) most fitness functions must be limited to a short term target achievable in single step pathways. A new set of fitness functions will be generated based upon the new functional configuration of the replicator.
d) in order to be realistic the fitness function must not contain a long term target that is specific in scope yet cannot currently affect the organism/creature in the short term. For example, you cannot have a fitness function targeting the long term functionality of the flagellum but you could have hypothetical fitness functions for intermediates.
e) in nature there are examples where deleterious/destructive mutations may provide benefit in limited environments. There must be no preference for fitness functions that imply both benefit and constructiveness.
f) some overly generalized long term fitness functions can be used but there must be a maximum ceiling balanced against other considerations. For example, for a hunter a generalized fitness function could be “an increase in speed in order to catch my prey”. Problem is, this must be balanced against things like an extremely high metabolism and increasing speed at the expense of strength required to take down the prey.
77
gpuccio
01/24/2009
12:52 pm
Mark:
“Would this satisfy you?”
No. The replicator must survive or not survive for its intrinsic capacity to survive or not survive in the environment. In other words, the variation must increase the true replicating ability of the replicator: it’s not the designer that has to decide who survives according to a predetermined, and searched for, function.
That’s exactly what is assumed in darwinian theory: the new function must be such that it increases the reproductive ability of the new form. You understand that it is a very severe restraint to what functions can be selected. So you cannot underestimate this point. Solving the travelling salesman problem in a shorter way does not usually help a software to replicate. It has to be a function inherent to the replication in that environment. Only that kind of function can be selected by NS. Other kinds of selection are IS. That is the weakest point in the concept of NS, and you cannot get rid of it so simply.
Moreover, you are underestimating the importance of the metrics. A measurement is a very sensitive metric, where you can put the threshold as low as you want or can. In NS, that is completely different. The threshold of measurement of a function is necessarily very high, because that function must provide some real reproductive advantage. Only a few functions can do that, and only at really functional levels of activity. That is a big, big problem for NS.
Finally, you are underestimating the difference between the two kinds of selection: negative and positive. Negative selection has the role of eliminating failures (by far the most likely results, with RV). But positive selection must expand the mutated individual so that it comes to represent the whole population, or most of it. For a mutated individual to expand, it has to acquire a true reproductive advantage vs the previous form, because the previous form is still perfectly functional, and the single mutated individual must compete with all the others so that they are suppressed, and it expands. That is no simple result to be obtained. It is no accident that the best examples of “positive” selection are those of antibiotic resistance, where a single and very powerful artificial aggressive agent can suppress the normal population and let only the lucky carriers of a mutation survive. But that is certainly not the general scenario for all supposed cases of NS.
So, as you can see, I insist that to simulate the darwinian mechanism, you have to demonstrate that the results of random variation can generate a true spontaneous advantage in some replicators, and that such spontaneous advantages can cumulate to the point of generating true new complex functions (CSI).
The necessity of overcoming the barrier of better survival is the biggest obstacle for darwinian mechanism. Such a barrier, in a complex replicator, can be overcome only by complex functions and adjustments, which cannot derive from simple random variations.
And, if the darwinian model is true, why shouldn’t it work in a digital environment? There are infinite ways in which a replicating software can profit from the digital environment where it runs: better programs can occupy better spaces of the digital environment, compete for the computing resources, or directly attack competitors. Computer viruses do that. But I suppose that they are usually designed. Why cannot “better” computer viruses, or anything like that, come out of random digital noise in a digital environment? Why cannot we simulate that?
The truth is that we all know that such a simulation would never work. Because the model it is based on has never worked, and never will work. In the same way that plasmodium falciparum has never acquired the capacity to survive at certain temperatures, or to survive in carriers of sickle cell anemia. You just don’t acquire that kind of functions that way.
By the way, have you an example of GA solving the travelling salesman problem in a shorter way (and, possibly, with a new and different algorithm, which would be similar to generating a new protein)? I would be interested in that.
78
Patrick
01/24/2009
1:03 pm
I think I’ll attempt to summarize in one sentence: The fitness function must not contain a long term goal that is not applicable to short term goals, and these goals are very general in scope applying only to competitiveness against other replicators and not “functionality for function’s sake”.
79
gpuccio
01/24/2009
1:22 pm
Patrick:
I think that in general I agree with what you say, but I really am not comfortable with the whole concept of fitness function. It looks extremely artificial to me.
Moreover, I am always thinking in terms of molecular biology. Too many discourses remain generic and useless because they are not brought to the essential level of molecular biology. That’s also what Behe tries to remind always.
What I wonder is: how can a specific change in a protein bring about a reproductive advantage in all general cases? Obviously, we have very extreme (and artificial) cases where a small change can bring a great survival advantage: antibiotic resistance (of the simpler form) is a good example, and it is no accident that it is almost the only example. And, as Behe correctly points out, it is a case of lucky disease, of loss of information which becomes useful due to exceptional circumstances.
But in general, how can a simple mutation in one protein generate a phenotypic change that is good enough to give a true reproductive advantage and to expand? Again, I mean beyond the few cases of microevolution, which always deal with very small changes and adaptations in the same island of functionality?
The transition from one protein to a different one is a virtual impossibility in almost all cases. There are no functional intermediates, when you have to change completely the primary sequence, and the folding, and the active site.
And beyond that, in almost all cases, how could the appearance of a new protein with some elementary biochemical activity be useful in an integrated and complex cellular system? We know very well that you don’t need just a new protein, but a correct regulation of its transcription, translation and post-translational events, and a series of finely tuned interactions of that protein with a lot of other proteins and cascades in the cell, before you get functionality. In other words, most cellular functions are IC from the beginning. That’s why I have always found very strange the focus on the flagellum, just because Behe used that model in his book. Almost everything is IC in a cell!
I would like to repeat here that the focus of darwinists on duplicated genes as the basis for evolution is rather symptomatic. We should remember that a duplicated gene is the only way to work at developing a new functional protein without losing the original function. Indeed, that’s what any programmer does when he wants to work at some part of the code and change it: he copies it, and works on the copy, not to destroy the original.
But, apart from the lucky circumstances of having the right genes in copy for our evolutionary experiments, we should remember that applying variation on a non functional copy of a gene has an unpleasant consequence which is often underestimated: that negative selection cannot any more control what is happening. In other words, if the duplicated gene is no more transcribed and functional, no negative selection can eliminate the bad mutations which compromise the original function. So, all the original useful information will be quickly lost, and unless and until a new functional configuration arises, no positive selection can apply. In other words, mutations on a non functional gene become neutral, and we are in the full ocean of non functional possibilities, from which probably nothing has ever emerged.
80
Patrick
01/24/2009
1:50 pm
I’d say it IS artificial, since fitness functions in GAs are typically not dynamically morphing but are instead statically and uniformly defined at the outset. Although it’s possible for the programmer to step in and tweak the function once a plateau is reached. But my point was to try and limit the usage of fitness functions to be more realistic.
The problem is that it makes writing the program much more difficult since a large variety of fitness functions will need to be accounted for. Essentially, to make the project feasible you’d need the software to dynamically adjust the fitness functions realistically at runtime without constant programmer intervention.
Now I have heard of this being done but only in reference to AI research. In the late 90s a friend of mine wrote a basic AI that then self-modified via a dynamically adjusted GA and other forms of input. Supposedly the resulting AI was pretty smart even on the limited hardware of that time. Unfortunately, the project got axed when the investor died in the World Trade Center.
And I’d also agree that I’ve never heard of a simulation that models all the biological constraints you mention.
81
Toronto
01/24/2009
3:46 pm
gpuccio[74]
I think my example at [65] satisfies the above. As a “black box”, it generates new CSI according to the environment. The environment supplies the fitness functions in the form of people voting. These fitness functions are constantly changing according to the moods and needs of the voters.
The output bit stream need not be used strictly as video or audio but left up to the environment, e.g., a group of unknown users could use the output as indexes into a list of possible stock picks for trading purposes.
The “black box” never changes, only the CSI as perceived by the user.
As far as the use of a fractal generator is concerned, any process could be used to generate the bit stream. The user is as blind to the internal process as the programmer was to the external goal.
Only the single bit change in the “DNA” is required after a successful survival.
82
Patrick
01/24/2009
4:28 pm
Toronto,
Fractals have been discussed on UD by gpuccio and myself before. Still, I wouldn’t mind seeing your particular proposal carried out, if only to see what it results in from an artistic perspective.
83
Toronto
01/24/2009
5:32 pm
Patrick[82]
My proposal does not require the properties of a fractal generator. The bit stream could be generated as the result of a function applying a bit mask to n bits of PI starting from location X. The goal is still hidden from the programmer and it will change based on the whims of the environment, the end-user.
I too would like to see the output, but I would use it to generate short sentences. If for every million chars of output I got 3 or 4 short runs of proper grammar, say 5 to 7 words each, I would place them in an array. After a few thousand runs, I would have thousands of arrays which I could select again via my black box by simply deciding to interpret the output in a different way.
An automated author!
84
gpuccio
01/24/2009
5:51 pm
Toronto:
I think I have answered your #65 in my #74. I cannot see any CSI in your example. I cannot see any function or complexity in the output you describe. And the users are not an environment there, but just conscious observers who project their representations. It has nothing to do with what we are discussing here, but if you want to think differently, you are welcome.
85
JayM
01/24/2009
6:45 pm
gpuccio @77
I’ve been following this discussion with considerable interest. While my personal interests when writing software for my own amusement lean more toward cellular automata, I have implemented a couple of genetic algorithms.
First, this has sparked a number of ideas that I probably lack the time to implement fully. Thank you for the mental kickstart.
However, I think you might be focusing on the wrong level of abstraction. The fitness function in a genetic algorithm (GA) that is simulating random mutation plus natural selection (RM+NS) is a simplified model of the ability of an individual organism to survive and reproduce in the simulated environment. That is the replicator’s “intrinsic capacity to survive or not survive in the environment.”
That being said, if you want to see a GA where ability to replicate is measured directly, see Thomas Ray’s Tierra. It does exactly that.
Yes, and this is exactly what is modeled by many GAs. It is possible to learn about the capabilities and limitations of RM+NS without taking the simulation down to the level of individual atoms.
Could you expound on this? I don’t see the severe restraint (aside from the limitations of current computing hardware).
Only that kind of function is selected in the real world of competing organisms. The point of (some) GAs is to show that RM+NS can generate complex, unexpected, and varied solutions to surviving in particular environments. That’s a different level of abstraction than simple replication, but the mechanisms being used are the same.
Again, GAs measure the capabilities and limitations of random mutation plus selection (usually without a known end goal) in various environments. Those environments generally don’t mirror the real world, although Tierra reflects a small subset of it. That is immaterial to the discussion, however, because GAs show that the process of random mutation followed by selection does have certain capabilities in a wide variety of environments.
My personal view is that GAs could be used to flesh out Dr. Behe’s ideas in The Edge of Evolution to provide a better understanding of where the edge lies and what types of problems cannot be solved by RM+NS. I’m a math and software geek, though, so I would think that. I do believe we need significantly more computing horsepower for such research.
JJ
86
Toronto
01/24/2009
7:09 pm
gpuccio[84]
Evolution is not concerned with CSI as it is defined as a process that has no specific goal. The term belongs to the ID point of view, not Darwin’s.
My model is what you asked for, which is a darwinian mechanism.
From the ID point of view, it generates CSI, because the observer rejects any information that does not lead to his specific requirements which may be quite complex.
From the point of view of the box, there is no predefined specific output, but the viewer sees what he requested, CSI.
Voting need not be done by human viewers, it could be another black box looking for a mate.
87
gpuccio
01/24/2009
9:31 pm
JayM:
thank you for the interesting reflections. A few thoughts:
The problem with many fitness functions in GAs is, IMO, that while they pretend to model NS, they are really introducing active information, and therefore realizing IS.
If a GA does not do that, we could analyze its results, and see how much it is really modeling something. But we need to know the code in the details for that.
I have found the link about Tierra very interesting. Unfortunately, there is probably not enough detail there to really understand, but my impression is that it could be something closer to what I think should be done. The limit here seems to be that the original organisms were 80 bytes long, if the article is right, and the derived organisms were either not very different from them (79 bytes), or much shorter (45). The shorter ones were essentially a subset of the original ones, and parasitic of them, because they had to use part of the originals’ code to replicate.
In other words, my impression is that in general all the viruses were using approximately the same code, with minor variations. That is very interesting, but it would be essentially a simulation of microevolution, in the range of random probabilistic resources applied to an existing code. Obviously, we should know in detail the individual codes to really understand what happened, and why the modified code was sometimes more efficient than the original one. But I don’t think that any really new code has been generated here, and certainly not with the characteristics of CSI.
I am sure that we can simulate microevolutionary events in a real simulation. Events of that kind are possible, and well documented even in biology (see antibiotic resistance). It is the assumption that a cumulation of simple microevolutionary events can bring about new complex functions (let’s say, a completely new code of at least 500 bits, a new algorithm, and so on), and not just reshuffle or slightly modify existing ones, which should be tested in a true simulation. But if you have further details about that, please let me know.
You ask for further elaboration on this concept “You understand that it is a very severe restraint to what functions can be selected.”
What I mean is that if NS can only expand new replicators which have a detectable reproductive advantage, then not all useful functions can apply, indeed only a tiny subset of them. Many complex functions, while potentially interesting and useful in the long term, would never give an immediate reproductive advantage, and could never be selected by NS. Moreover, a new function must be well integrated into the existing system of replication, before it can translate into a true advantage.
Going back to the biological world, you must understand that protein functions, even when searched by protein engineering algorithms, appear in the beginning as very low biochemical affinities, detectable by some sensitive measurement system, but completely useless in the real cell environment. They have to be intelligently selected and amplified, before a true powerful biochemical function can be reached. And still that function would have to be integrated into what already exists, and carefully regulated (the synthesis of the protein started at the right moment, and stopped when it is no more necessary, the protein concentration regulated at the right level, and so on). Think, for instance, of protein cascades, where all the components of the cascade must be present for the final result to be obtained, and each protein has to be present in different concentrations, from very low to very high, so that the cascade may amplify the original signal. And the signal must come from the right source, and be translated to the right effector. And still the effect must be strong enough to give a reproductive advantage, before it can be selected.
So, what I mean here is that NS, as it is conceived in the real biological world, and at the molecular level, is a very, very poor oracle. It can do very little, probably almost nothing at the level of complexity which is already present even in the simplest autonomous living beings, bacteria and archaea. Because the more an organism is complex, the more difficult it is to obtain an immediate reproductive advantage by a simple step variation. And bacteria and archaea are very complex.
So, to assume that NS is responsible for the emergence of all the existing proteomes, where each protein is hundreds of aminoacids long, and most proteins are deeply different one from another, and have different functions, and almost none of those functions can be useful by itself, and all that scenario has to be regulated and integrated, not only within a single cell, but among myriads of different cells, in multicellular organisms, and so on, well, to believe that is real folly.
So, please remember that ID has never affirmed that RV cannot generate something useful: the ID assumption is that RV cannot generate something useful and complex. That’s why the concept of CSI has two parts: the specification and the complexity. It is the complexity which avoids the false positives due to random variation. But it is the specification which connects the complexity to design.
88
William Dembski
01/24/2009
10:35 pm
I should note that our approach subsumes fitness functions but is considerably more general. Fitness functions alter the probabilities in a search. Our measure of active information focuses on that change in probabilities. But there are other ways to alter probabilities than by introducing a fitness function.
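To make "change in probabilities" concrete, the three measures from the abstract above can be sketched as follows. This assumes the usual log-probability form, with `p` the success probability of blind random search and `q` the success probability of the assisted search; the function names and the example numbers are illustrative only.

```python
import math

def endogenous_information(p):
    """Difficulty of the problem for blind random search, in bits."""
    return -math.log2(p)

def exogenous_information(q):
    """Difficulty that remains once the search uses problem-specific information."""
    return -math.log2(q)

def active_information(p, q):
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return math.log2(q / p)

# Illustrative numbers: blind search succeeds 1 time in 1024,
# the assisted search succeeds 1 time in 4.
p, q = 1 / 1024, 1 / 4
bits_gained = active_information(p, q)
```

Nothing in these definitions refers to a fitness function: any mechanism that moves the success probability from `p` to `q` is credited with the same number of bits.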
89
pubdef
01/25/2009
1:36 am
gpuccio:
I have to admit that you lost me after this point; I don’t really understand what “necessity” is in this context, and don’t really care at this moment. But I think there’s a problem with the portion of your post that I reproduced here.
The “special sequence of nucleotides” — the nucleotides are physical objects, interacting with other physical objects. You, apparently, are asserting that their sequence is a “code.” I maintain that the only difference from a rock in a stream is a matter of degree; they are both physical objects interacting with other physical objects. DNA is much more complicated, but how does that constitute “information” in a way that makes it fundamentally different from the rock? Where along the continuum of complexity does a physical interaction attain the status of “information?”
I know that geneticists and others in science refer to DNA as a “code,” but I see that as a nomenclature that describes its function by analogy. To argue that the genetic “code” is evidence of ID is to assume the conclusion, i.e., that DNA is a product of intelligence — a “code” like Morse code or computer source code — when there is no empirical evidence of teleological origin.
90
Mark Frank
01/25/2009
2:37 am
Gpuccio
No. The replicator must survive or not survive for its intrinsic capacity to survive or not survive in the environment. In other words, the variation must increase the true replicating ability of the replicator: it’s not the designer that has to decide who survives according to a predetermined, and searched for, function.
I am really struggling with this. What are you asking for? This is a simulation not the real thing. No simulated life form is going to die or survive unless there is a mechanism in the software for doing that. The programmer must create that mechanism.
What does an “intrinsic” capacity to survive mean in this context?
What is the “true” replicating ability as opposed to any other replicating ability?
It is almost as if you want the environment and the die/survive mechanism to develop through evolution as well as the individuals that live in that environment.
Go back to the example of artificial selection. In this case the real world fitness function is the product of a designer. If I breed pigeons I decide which ones survive and on what basis. Suppose I breed pigeons on the basis of speed and the result is a pigeon that has a radically different breast bone structure (I don’t design the breast bone structure – in fact I may not even know it exists). Would this not be an impressive demonstration of Darwinian mechanisms in action? But the selection mechanism (speed) is completely designed.
91
gpuccio
01/25/2009
8:49 am
pubdef:
Why do you say that DNA is not a code, and that the concept of a genetic code is only an analogy? That is simply not true. DNA, in its protein coding parts (which, as you probably know, are only 1.5% in the human genome), stores information through a specific symbolic code, which works in the same way as the Morse code you cite. I am not implying here that the word “code” necessarily indicates that it is designed: as you say, that would be assuming the conclusion. I am using the word “code” in a very elementary sense (the same as geneticists have always used it): a symbolic language which bears symbolic information for something else.
The genetic code is made of codons (three consecutive nucleotides). Of all the possible 64 combinations, each has a specific meaning: it corresponds to one of the 20 aminoacids, or to a stop signal.
The important point is that there is no biochemical reason (law of necessity) why, say, UCU corresponds to Serine. The connection is purely symbolic, and is guaranteed only by the fact that the translation system recognizes the UCU codon and connects it to the aminoacid Serine. The connection is realized by the tRNA molecules, which recognize the codon in the mRNA, and (at another site of the molecule) link to the right aminoacid and transfer it to the growing protein sequence. So, the connection is purely symbolic: nobody knows why UCU, in particular, corresponds to Serine. UCU is just a word which represents a meaning, the aminoacid Serine. The system works only because all of its parts use the same code or language.
And beyond the genetic code, the gene coding sequence is information. The gene coding for the protein myoglobin has a sequence of 154 x 3 = 462 nucleotides which code for the 154 aminoacid sequence of the protein myoglobin. That sequence is unique to that protein, and is the sequence which allows the function of myoglobin.
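The codon arithmetic can be checked with a short sketch of the standard genetic code as a lookup table. The compact `BASES`/`AMINO` encoding and the `translate` helper are assumptions chosen for brevity, and the short mRNA string used with it is a made-up example, not part of the myoglobin gene.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...); "*" marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(mrna):
    """Read an mRNA string three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

amino_acids = set(CODON_TABLE.values()) - {"*"}
stop_codons = [c for c, aa in CODON_TABLE.items() if aa == "*"]
coding_nucleotides = 154 * 3  # codons needed for a 154-residue protein
```

Here `CODON_TABLE["UCU"]` is `"S"` (Serine), the table has 64 entries covering 20 amino acids plus the stop signal, and `translate("AUGUCUUAA")` reads AUG, UCU and then stops at UAA. The table records the mapping; it does not say anything about why the mapping is what it is.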
That’s what I mean when I say that there is no law of necessity which can output the sequence of myoglobin (or any other functional protein sequence). If you just try to synthesize a random protein sequence from a pool of random aminoacids, you can get any possible sequence. Biochemical laws do not privilege any specific sequence, and least of all a functional sequence like myoglobin. If you want to synthesize myoglobin, you need to know the primary sequence of myoglobin: you have to know that specific sequence of 154 aminoacids. In other words, you need a specific information.
You ask: “Where along the continuum of complexity does a physical interaction attain the status of “information?”
The answer is very simple: when some particular complexity assumes a configuration which can give you a specific useful information, which can allow you to do something which otherwise you could never do.
92
Mark Frank
01/25/2009
11:12 am
Gpuccio
Re #90.
This all depends on what you mean by “code”. Code, meaning and symbol are all words with a number of uses in English. An important distinction is between what Grice calls non-natural and natural meaning. They can be associated with an agreement among people to associate one thing with another, for example the letters Au with the metal gold. On the other hand they may depend on some kind of causal relationship other than an arbitrary agreement – for example dole queues are a symbol of an economic depression.
UCU causes the production of Serine. This is because of biochemistry – not some arbitrary agreement. Therefore it falls into the second category of symbol.
You write:
The important point is that there is no biochemical reason (law of necessity) why, say, UCU corresponds to Serine. The connection is purely symbolic, and is guaranteed only by the fact that the translation system recognizes the UCU codon and connects it to the aminoacid Serine.
But that is only to say that UCU causes Serine in the context of the translation system – given the presence of the translation system then there is every necessity that UCU leads to Serine. Drinking alcohol causes road accidents – but only in the context of driving a car.
93
jerry
01/25/2009
11:22 am
Pubdef,
Your arguments are fatuous. They seem more like the attempts of a disruptive toddler than those of a constructive adult. Comparing a rock in the middle of the stream to an ordered set of molecules which then set in motion a series of steps that end up with a completely different ordered set of molecules that operate in context with several other ordered sets of molecules, where each of these ordered sets of molecules has physical properties which are functionally useful for an organism, is one of the more inane arguments I have ever heard.
Keep up the good work because it is objections to ID such as yours that make our case easier. Remember we are not trying to convince you because we long ago knew such attempts were useless but we are trying to convince those who are seriously trying to understand the issues. So thank you for your efforts. People like you make our job easier.
94
jerry
01/25/2009
11:31 am
Mark Frank,
I don’t want to leave you out. So thank you too for your inane arguments. The more we have people like you and pubdef, the easier it is for us. The two of you are setting standards for what can be used against ID and we appreciate your efforts.
95
JayM
01/25/2009
12:14 pm
gpuccio @86
Do you have any particular fitness functions in mind, other than those like Dawkins’s Weasel that do explicitly encode their desired end state? One of my favorite GA examples is the evolved antenna. This is a much better example of what GAs can do than the Weasel. In this model, the environment consists of the physical laws regarding antennas and the other antenna designs. Each genome is rated against the others in that environment. The only thing that the simulation itself does is model random mutation followed by selection where the chance of reproducing is proportional to the fitness in the environment at that time.
The only information introduced in this model is the relative fitness of the genome. Although greatly simplified, this is exactly how information from a real world environment is communicated to a population of organisms. I don’t see how this requires intelligent selection.
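The loop described above (random mutation, then selection with reproduction chance proportional to fitness) can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `fitness` function here is a hypothetical placeholder that counts 1-bits, not an antenna physics model, and unlike the antenna case it has an obvious optimum and rates each genome on its own rather than against the others. It illustrates the mechanics of the loop, not the evolved-antenna result.

```python
import random


def fitness(genome):
    # Hypothetical stand-in for the environment. The +1 keeps every
    # selection weight positive so random.choices never sees all zeros.
    return 1 + sum(genome)


def mutate(genome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def next_generation(population, rate, rng):
    # Random mutation first, then fitness-proportional selection.
    mutated = [mutate(g, rate, rng) for g in population]
    weights = [fitness(g) for g in mutated]
    chosen = rng.choices(mutated, weights=weights, k=len(population))
    return [list(g) for g in chosen]


def evolve(pop_size=30, genome_len=16, generations=40, rate=0.02, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, rate, rng)
    return population


if __name__ == "__main__":
    final = evolve()
    print(sum(fitness(g) for g in final) / len(final))
```

The only place the environment touches the population is the `weights` list, so replacing `fitness` with a different rating function changes what is selected without changing the loop.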
I believe the paper linked from the page I linked to gives enough information. Would you agree, though, that if the GA behaved as I described, it really is modeling RM+NS?
My apologies, the page I linked to wasn’t the best. Here is a much better one, that includes links to many publications on Tierra and alternative implementations. The “organisms” are actually quite different from each other.
I’m a little confused by this. In your original post to which I replied, you said:
This is exactly what Tierra does. I’m not sure why you’re now adding additional constraints about new code and CSI. What is your objection?
In any case, the code of the digital organisms is available, and you can even run Tierra on your own machine to see the behavior you are looking for.
This is where I think there is extensive potential for ID research. That being said, I believe there are GAs out there that do mathematical theorem proving and have come up with new algorithms.
I’m afraid I’m getting more, rather than less, confused about your precise objections to GAs.
Ah, I see, thank you. That is definitely a restriction. Much as I might like retractable wheels built-in to my feet, that is not something that can evolve from our current body plan.
This ties in with my previous comment about rich areas for ID research. Where can we get to from where we are?
JJ
96
Prof_P.Olofsson
01/25/2009
12:31 pm
WilliamDembski[87],
Indeed. I also pointed this out in [54].
Minor point: On page 2, there is no need to restrict yourself to finite-dimensional vectors; Tychonoff’s Theorem takes care of the infinite-dimensional case which is relevant if you don’t want to fix the number of steps in advance.
I still wonder about the relevance to evolutionary biology of searching for a search according to the probability measure induced by the Kantorovich-Wasserstein metric. I’m not trying to annoy you; I think it’s a very relevant question.
Maybe jerry [93] can give me an answer?
97
jerry
01/25/2009
2:15 pm
Prof_P.Olofsson,
Are you ready to say the correspondence of the nucleotides in DNA to the production of functional proteins does not act like a code? If you are, then I will gladly include you with pubdef and Mark Frank.
But you should also look at my comment #9 about the current thread.
I do detect a small lack of constructive criticism on your part. Yours is one of pointing out the flaws in others’ arguments, which is well and good but I do not see any attempt at helping others solve the problem other than your criticism. For example, can the problems you raised in the past about the flagellum be solved or ameliorated somewhat if there were probability estimates of the number of potentially functional proteins from the totality of possible proteins? Now I do not know enough about the technicalities of either probability theory or the behavior of random polymers of amino acids to make any intelligent assessment but I bet that there are some that do. By the way I have hardly read all your comments so my assessment could be quite wrong. I was only using the sampling of the ones I have read to make my judgment and like any statistical analysis there is a potential error. The sampling also indicates a cordial and generally nice person.
By the way I was a mathematics major in college and went to Duke graduate school on a fellowship for mathematics before leaving for the military and a change of life. My leaving had nothing to do with my work in math but I never really got back into it and eventually went to Stanford for an MBA where I initially was enamored with Operations Research because of my math background. I decided to pursue marketing instead. I had a number of statistics courses in later years as part of a different Ph.D program. So while the technicalities of statistics are a distant memory, other things I remember quite well. I can recognize constructive behavior or the lack of it when I see it. That is something I never lost because I see it or the lack of it every day.
I look forward to you proving me wrong because I believe you could be extremely helpful. But if you were, you would probably suffer for it with your colleagues. Actually what could be a better way to discredit ID than to actively help it and then have those you helped admit that what they were trying to show was a dead end and then have them thank you for all that you tried to do for them? If such a thing happened to me I would glow with satisfaction.
98
pubdef
01/25/2009
2:32 pm
Jerry — guilty as charged. It must have been past my bedtime when I posted that last comment.
But in all seriousness — it seems that a large part of the argument of ID relies on the concept of “information,” and I’m honestly at sea about what exactly that is. And gpuccio: this is not very helpful.
In other words, a physical interaction attains the status of “information” when it can give you “a specific useful information.”
As long as it appears to me that ID relies on how complex something is, I don’t think I’ll be impressed. In my limited experience, evolutionary biologists have been quite aware of how complex nature can be. I didn’t get very far into “The Blind Watchmaker” before it was overdue, but I did get through the chapter in which Dawkins describes echolocation in bats in greater and greater detail, all the while making analogies to radar developed for planes in WWII.
99
Sal Gal
01/25/2009
2:52 pm
gpuccio (and pubdef):
In a coding system, as defined by humans, there are both encoder and decoder.
We observe cells doing work to frame sequences of discrete objects and transform them into sequences of discrete objects of a fundamentally different kind. This looks like decoding, but with no empirical observation of an encoder, we are not justified in calling it that.
Humans recognized the analogy to decoding, and began to perform encoding operations. The fact that there exist genetically engineered organisms decoding human transmissions implies neither that organisms typically decode nor that genetic information is generally due to human-like activity. In fact, if we adopt the perspective that imperfect replication of DNA is erroneous, then algorithmic information usually enters the genome by error.
100
Sal Gal
01/25/2009
2:52 pm
gpuccio (and Bill Dembski),
Your wonderment at the transformation of sequences of bases into proteins is not an indication of its improbability. You, an entity within the universe in which you and I agree that the process exists, cannot transport the two of us without and demonstrate to me the objective probability of a universe in which the process exists.
Arguments from improbability are absurd attempts to circumvent absurdity. (What is the CSI of the semiotic agent in terms of which CSI is defined?) My knowledge of the Creator is experiential, not logical. I can tell other people steps that might lead them to similar experience, but I cannot speak sensibly of the experience itself.
This verse is not popular with preachers. It never grabbed me until I saw it in an apologetics presentation of Bob Marks. The title of the slide is “Faith before Science.” Bob and I have different beliefs, but I greatly admire people who make such statements outright.
101
gpuccio
01/25/2009
2:58 pm
JayM:
I find your last post very balanced, and I don’t think that you are confused about my points. In a sense, I think you have understood them very well, and that we agree on much. I will try to clarify what seems to be still not clear.
1) I am not so aware of the current status of GAs, so I apologize for not being able to enter into details. I will use your indications to deepen my knowledge, and maybe we can go further in the discussion.
2) I think it should be clear that I have no objection that well programmed GAs can achieve specific answers. In the same way, I have no objection that, if a GA really does not introduce active information, its results can be considered interesting data, and should be carefully evaluated. I remain of the opinion, however, that the best simulation of the “concept” of NS is the kind I have suggested, which seems to be implemented in Tierra. In a sense, any artificial introduction of a fitness function could potentially introduce active information, and it could be difficult to demonstrate or understand how in each single case. That’s why relying only on the spontaneous properties (or, if you want, fitness function) of an independent digital system seems the best solution.
3) I agree with you that Tierra, at least at first scrutiny, does what I have requested. I am happy with that, but obviously I will try to understand more about it, thanks to your new links. Just give me some time. But I can well accept that Tierra can be the kind of simulation which I have in mind.
4) I am in no way “adding additional constraints about new code and CSI”. I have no objection to Tierra, if it really works as I understand. I have only said that its results seem to me, at my current level of understanding, good simulations of microevolutionary events. I am adding no constraints about CSI: I just think that Tierra has not produced new CSI. I will be more explicit. I think that Tierra has produced some optimization of an existing code, within the range of a random search. That is interesting, and it is certainly important data, but it is not CSI.
But I don’t want to anticipate a discussion about that until I can personally understand the data from Tierra. I am interested essentially in two aspects:
a) How much new functional complexity has been generated (that is, how many new functional bits).
b) If new functional specifications have emerged (that is, for instance, new algorithms, and not only an optimization of the existing algorithm and code). That was the meaning of my expression “new code”, but I understand that it probably was not precise enough.
5) I absolutely agree with you that there is extensive potential for ID research in all that. I am convinced that if ID theorists and darwinist theorists (and anybody else interested) could work together in mutual respect, instead of fighting, and compare their views in the field of active research and theoretical confrontation, much good would ensue for science.
102
gpuccio
01/25/2009
3:15 pm
pubdef (#97):
“In other words, a physical interaction attains the status of “information” when it can give you “a specific useful information.””
I can’t see your problem. The physical interaction of this answer to your post (all the physical events which cause the bits of my answer to appear on your screen) has the status of information because you can read the words, and understand the meaning (although it does not seem to be particularly useful to you, but that’s another story). A protein coding gene is information because it can instruct the translation system about the correct aminoacid sequence.
If we cannot agree on such simple concepts, I really don’t know what else to say. Again, I am not trying to imply “immediately”, with that, that the sequence of nucleotides was in some way written by a designer. That is the conclusion of the whole ID theory, but I don’t want in any way to imply that in my concept of code or of information. I am just saying that a protein coding gene contains in its organized sequence, not in its general biochemical structure, the information for a functional protein. That remains true even if that information was produced by unguided mechanisms like RV and NS, as darwinists believe.
And the information is stored through a symbolic code. That is simply true. But I am afraid I cannot say it more clearly than I already have.
You say:
“As long as it appears to me that ID relies on how complex something is, I don’t think I’ll be impressed. In my limited experience, evolutionary biologists have been quite aware of how complex nature can be.”
But the main concept in ID is not complexity, but specification. Complexity is everywhere, but specification is characteristic of designed things.
The problem is, some things are designed and specified, but they are simple. In that case, the specification we observe could alternatively be the product of a random process, and we cannot infer design with certainty, even if it was really the cause of what we observe. Those are the famous false negatives, which are part of the ID theory.
That’s where complexity becomes important. Associated to specification, it becomes the rule to infer design with only virtual (logical) possibility of false positives. But it’s specification which is the real characteristic product of design, and not complexity.
103
Mark Frank
01/25/2009
3:17 pm
# 97
pubdef
I think you charge yourself unfairly. You asked what is it that makes a rock in a stream not a code, while DNA is a code, other than greater complexity. Gpuccio, being a gentleman, attempted to answer it, and this clarified what he thinks a code is. Seems like it was a good question to ask.
104
gpuccio
01/25/2009
3:39 pm
Sal Gal (#98 and 99):
OK, in the genetic code we observe the decoding. But if something is decoded, it must have been encoded. I am not saying that in itself that proves that it was encoded by a designer. It could have been encoded through the amazing works of RV + NS, as darwinists believe, but encoded it is just the same.
I don’t want to start here a useless discussion about terms. I just think that you, and pubdef, are charging me with what I have never said. I am saying only two things:
a) The genetic code is a symbolic code, in the sense that the information stored in DNA as protein coding genes can only be retrieved by means of a complex system, the translation system, where exactly the same symbolic correspondences are embedded. And that correspondence is not in any way connected to biochemical laws, but only to a semantic connection between the stored information in DNA and the translating system in tRNAs.
b) The information in protein coding genes is functional information, because it is perfectly apt to guide the synthesis of a perfectly functional protein. Please notice that the function is in no way present in the DNA sequence (DNA can never act as an enzyme), but arises only in the final protein, as a consequence of the information of the DNA. The possibility of errors in the process does not change anything of that.
And I cannot understand to what you are referring when you speak of my “wonderment at the transformation of sequences of bases into proteins” and to some “argument from improbability” about that which should imply the whole universe and its Creator.
I am not wondering at the transformation of anything. I am just saying a very simple thing, that the transformation of a sequence of bases into a protein can never happen because of natural biochemical laws. It requires a very complex system of biochemical machines. In the same way, a newspaper cannot come out of mechanical laws of the universe, if you have not the journalists, the printing machine, and so on.
I am very surprised that I must discuss with you about those simple things. What has that to do with the universe, the Creator, and arguments from improbability? If you think that it is an argument from improbability that a newspaper is the product of journalists and of a printing apparatus, well, I am all for arguments from improbability.
The real question for the genetic code, and for the information in DNA, is: Why and how is it there? Why and how is it as it is?
I don’t think that this is a stupid question, or an “attempt to circumvent absurdity”, or that trying to answer it strictly depends on our personal experience of the Creator (although I would probably agree with you that, in a sense, everything does). Darwinists have been trying to answer that question for decades, without directly drawing from their personal experience of the Creator, and I certainly don’t blame them for that, even if I don’t agree with their answers. ID is just trying to answer it in a different, and IMO much better, way.
105
gpuccio
01/25/2009
4:03 pm
Mark:
There is probably still much to say, but I will try to be brief. Please refer also to my other answers to others.
#89:
Obviously a simulation is not the real thing, because digital entities are not biological entities. But what I ask is that the same logical concept is tested in the simulation, which is believed to act in the real thing.
As NS in the darwinian theory acts according to the fitness function of the environment, which is written by nobody, I ask that in a good simulation of it, the simulated NS derive from the fitness function intrinsic in the digital environment where the simulation is run (such as the operating system, and any other software or hardware resource present in the environment), and not from an artificial fitness function introduced by the programmer of the simulation.
That is not an absurd or unrealistic request. Indeed, as you can see, JayM has understood it very well, and pointed to an example (Tierra) where my requests seem to be satisfied. Therefore, it is possible to satisfy those requests, and I am convinced that such a premise guarantees a much better simulation of the “concept” of NS. I have no pretense that any digital simulation can really model the “substance” of biological reality.
You say:
“What does an “intrinsic” capacity to survive mean in this context?
What is the “true” replicating ability as opposed to any other replicating ability?”
It’s very simple: I mean the natural capacity of the digital replicators to replicate and survive in the digital environment, not because of some judgement or measurement made on them by an artificial fitness function, but simply because their code can better utilize the resources in the digital environment.
You say:
“It is almost as if you want the environment and the die/survive mechanism to develop through evolution as well as the individuals that live in that environment.”
No, that was never my point. The environment is already set at the beginning, like any digital environment is. It could change, or not, but not “through evolution”, because the environment in natural history is not supposed to change “through evolution”. Similarly, the die/survive mechanism is initially set by the programmer, through his programming of digital replicators which can survive and replicate in that environment. That is done by the programmer, because we are not simulating OOL. But any successive variations in survival or death of modified replicators would happen as a consequence of what evolution has done, plus the intrinsic rules of the system.
And no, I don’t think that artificial selection is “an impressive demonstration of Darwinian mechanisms in action”. Your example with pigeons, if it were true, would only demonstrate that a different breast bone structure can arise by RV in pigeons, and that an intelligent observer interested in speed can select it because it helps gain speed. But it tells nothing about NS. To show a complete darwinian mechanism, you should show that the different breast bone structure, after arising by RV, is selected and expanded in the population of pigeons through a reproductive advantage, be it due to speed or to any other associated function. Can you see the difference?
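A toy Python sketch may help picture the kind of setup described here. It is not Tierra's code, and every name and parameter in it is hypothetical. The programmer fixes the environment (a memory of limited capacity, one CPU cycle per replicator per tick) and the starting replicator; no function rates the replicators. Differences in survival come only from the fact that copying a genome of a given length takes that many cycles, and that a newborn overwrites the oldest replicator when memory is full. That overwrite rule is itself written by the programmer, as part of the environment.

```python
import random
from collections import deque


def run_soup(capacity=50, start_length=20, ticks=2000,
             mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Each replicator is [length, progress]; the deque is ordered oldest-first.
    soup = deque([[start_length, 0]])
    for _ in range(ticks):
        newborns = []
        for rep in soup:
            rep[1] += 1                    # one CPU cycle per replicator per tick
            if rep[1] >= rep[0]:           # copying takes `length` cycles
                rep[1] = 0
                child_length = rep[0]
                if rng.random() < mutation_rate:
                    # Copy error: the child is one unit longer or shorter.
                    child_length = max(1, child_length + rng.choice((-1, 1)))
                newborns.append([child_length, 0])
        for child in newborns:
            if len(soup) >= capacity:
                soup.popleft()             # memory full: the oldest is overwritten
            soup.append(child)
    return [rep[0] for rep in soup]


if __name__ == "__main__":
    lengths = run_soup()
    print(min(lengths), max(lengths), len(lengths))
```

In this sketch the only heritable trait is length, so whatever happens is at most an optimization of the existing replicator, not the appearance of a new function.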
106
jerry
01/25/2009
4:06 pm
pubdef,
Information gets used in different ways. But we tend to use it in common language as data that mean something. And that is how I believe most use it here.
So information is just a piece of data and each nucleotide is a data point. Search the internet for definitions of the word. For example, go here
http://www.onelook.com/?w=information&ls=a
Use for information such things as
news; intelligence; words
facts; data; learning; lore
Each nucleotide in a DNA string is thus a piece of information. Each molecule in rock is a piece of information. If one wants to go further down the structure, then be my guest. Each substructure down to the quarks will be pieces of information.
Now the DNA and rock are also complex. So each is an example of complex information. However, some units of the information in the DNA specify something else, for example a gene specifies a protein and sometimes RNA. And some of these proteins and RNA have functions and some of these proteins and RNA work together as functional units or systems. So the information in the DNA is complex and specified, and the elements specified are functional. Life is the only place this appears in nature. It appears quite frequently with human intelligence. Now it is quite possible that not all the DNA in a genome is specified and functional and may be just junk but a large part is not. Hence DNA is functionally complex specified information or FCSI. This is what the whole debate is about.
This discussion has appeared several times in the last few weeks and the anti ID people think they have scored points by asking for a definition of information when the simplest definition from a dictionary will suffice.
107
gpuccio
01/25/2009
4:12 pm
Mark:
# 92:
“On the other hand they may depend on some kind of causal relationship other than an arbitrary agreement – for example dole queues are a symbol of an economic depression.
UCU causes the production of Serine. This is because of biochemistry – not some arbitrary agreement. Therefore it falls into the second category of symbol.”
No, UCU causes the production of Serine because of an agreement between the code in DNA and the code in the translation system. That agreement is not due to biochemistry, although obviously the single steps of the process of recognition (like the coupling of codon and anticodon in the tRNA) follow the laws of biochemistry.
You say:
“But that is only to say that UCU causes Serine in the context of the translation system – given the presence of the translation system then there is every necessity that UCU leads to Serine.”
That’s true. But there is no reason of necessity that the same codon (UCU) be connected to Serine both in the information embedded in DNA and in the recognizing codon in tRNA. The relationship between UCU and Serine has no biochemical cause, and is purely symbolical. And that relationship is the same in two completely different parts of the cell. That’s all I am saying.
For further elaboration on this, please check my answer to Sal Gal at # 104.
108
Patrick
01/25/2009
4:28 pm
1. Depends on the information content for this “new” bone structure, does it not? But let’s assume it’s something radical (not minor) and ONLY Darwinian mechanisms are at work and functional intermediates are always 2-3 steps away. That would at least show that Darwinian mechanisms related to variation are at least up to the task for this one object.
2. As you said the long term goal for the selection mechanism is designed. This gets back to the funneling problem for natural conditions described in #76. Can you think of a hypothetical environment that could produce the same effect in pigeons?
This brings to mind my comment on natural selection and squirrels. I could artificially breed squirrels and presumably I could eventually produce a flying squirrel. But what if I could not?
109
Patrick
01/25/2009
4:41 pm
As for the information/code and rocks discussion: a code is an abstraction. How is a rock in a stream an abstraction? The positioning of the sun and earth does not encode information in itself. The positioning of ink on a paper does, but the atomic properties of the ink do not inherently encode anything related to the information being outputted. There are real properties involved, which may influence the usage of a code, but the information content is not inherent to these real properties. A series of rocks encoding my name on a plane would be an abstraction that does not rely directly on the properties of the rocks, or streams, or whatever.
110
jerry
01/25/2009
5:12 pm
I have a question since I do not understand the basic argument of this thread. Is the question: Can sexual reproduction amongst members of a population with a specific gene pool find any possible combination of elements in that gene pool? And if so what are the limits for the gene pool?
Now there are three sources for alleles or other genetic elements in the gene pool, those already currently in the gene pool in at least one of the members of the population of the gene pool, a mutation on a genomic element of a gamete of a member in the gene pool essentially adding a new element to the gene pool and then there is possible recombination of a DNA sequence during sexual reproduction that can also add something new to the gene pool.
Not all combinations of these genomic elements exist in the population but it can be theorized that a currently non existing combination may have some reproductive benefits and would be selected if it showed up. And so the issue is how likely such a combination will show up in a future member of the population. Combinations not available because of the current makeup of the gene pool are just not possible unless there were infusions of new genetic material from some place.
So is the question about how easy or how long it will take to produce a combination that has better selection possibilities and secondly whether that combination represents anything significantly new in terms of functional capabilities? In other words is a flying squirrel currently possible within the gene pool of the squirrel population? And if so how long or how difficult would it be to get to the genetic combination that allows a squirrel to fly? Or is the combination of changes in the genome to get there just so implausible or even impossible because no combination of genomic elements in the population could ever lead to it so that naturalistic methods have no chance whatsoever? And Dembski and Marks are trying to show that whatever combination of fortuitous events took place, it would never reach certain places even if it were theoretically possible. And the term “search” has no meaning because nothing is really searching and the term is just used to describe the results of possible naturalistic processes.
Or are we just having intellectual fun with this discussion?
I know Darwin just viewed the organism as infinitely malleable and the right combination of environment and luck would lead to almost anything you can imagine.
I guess I am trying to answer my own question posed at #9.
111
Mark Frank
01/25/2009
5:27 pm
Gpuccio – I will leave the definition of code and symbol and concentrate on simulation.
I ask that in a good simulation of it, the simulated NS derive from the fitness function intrinsic in the digital environment where the simulation is run (such as the operating system, and any other software or hardware resource present in the environment), and not from an artificial fitness function introduced by the programmer of the simulation.
At last I understand what you want from a simulation although I cannot see why you create this constraint.
I don’t think Tierra is going to satisfy you. It is hard from the available documentation to understand how Tierra decides which individuals to eliminate but I am willing to bet the programmer didn’t just rely on Windows or Linux to do the elimination. There is a reference to code written to eliminate individuals based on their lack of reproductive success. I am guessing this means how many they spawn i.e. fecundity not fitness. But anyway it appears to be written for the purpose.
But in any case stop and think. Suppose we do create a simulation where individuals reproduce and die without any code written to deliberately kill them off, i.e. it is just the hardware and operating system. The hardware and operating system are designed by someone – well a team of people. They didn’t design it to eliminate individuals but it does the job and no doubt by inspecting the code it would be possible to deduce the algorithm it (accidentally) implements for eliminating individuals.
Now suppose someone writes code deliberately that implements the same algorithm. They have now written a fitness function.
What’s the difference? Why do you accept it when the code was a by-product of the OS but not when it was written intentionally for that purpose implementing the identical algorithm?
112
Joseph
01/25/2009
6:11 pm
pubdef et al.,
In the paper “The origin of biological information and the higher taxonomic categories”, Stephen C. Meyer wrote:
see also
113
Prof_P.Olofsson
01/26/2009
12:40 am
jerry[97],
And you’re a descendant of vikings if I remember correctly!
To answer your question, no, I am not ready to say that.
I see your comment [9] now, I missed it before. I agree. I would also like to know how these results are supposed to be relevant to biology. The only logic I can imagine for the “search for a search” article is that either the darwinian search algorithm is chosen according to the Kantorovich-Wasserstein probability distribution, or else there is support for ID. I could not argue for such a claim if my life depended on it. For starters, what is it supposed to mean that an organism “searches for a search”? Then, why does it have to search according to the aforementioned probability measure? What does that measure even mean intuitively?
I have actually offered some constructive criticism about the math in the “search for a search” paper. As for applications, my point is that if somebody claims that an article is “pro-ID” then I think it is fair to ask how, when there are no pro-ID claims in the actual article. The fact that the authors are pro-ID is not enough for me. So I’m left to guess but I can’t come up with anything reasonable that I can even argue against. In that sense, yes, I suppose I am not being constructive but I don’t have much to be constructive about.
Let your viking blood cool down now and be nice to Mark Frank! He’s a very nice person.
114
jerry
01/26/2009
1:36 am
Prof_P.Olofsson,
Like everyone from Sweden, I can trace my ancestry back to an Ericsson so I tell my kids they are descendants of Leif Ericsson.
I am cool, it has only been above freezing once in the last 10 days where I live. You don’t like my sarcasm about the anti ID comments? Some are pretty silly.
Did you see my comment at #110 where I try to answer my own question? I am just trying to get some understanding of what all the technical stuff is about. Maybe you can help me.
115
Mark Frank
01/26/2009
5:23 am
Jerry
Re your post #106.
You finished with:
This discussion has appeared several times in the last few weeks and the anti ID people think they have scored points by asking for a definition of information when the simplest definition from a dictionary will suffice.
This makes a point which runs through much of the discussion. Words such as “information”, “symbol”, “code” and “meaning” are bandied about as though their use was obvious and unambiguous. But they are not. Even the link you provide has multiple different simple definitions of information.
I wrote a detailed response to this. It is too long for a reasonable comment so I posted it here
116
djmullen
01/26/2009
7:23 am
gpuccio:
I’ll try to express myself better with an example. Let’s use an extremely small bacteria, one with only a single protein gene. “Protein gene” refers to a portion of its DNA whose pattern of DNA bases specifies the pattern of amino acids in a protein. (To simplify the discussion, we’ll assume that the stretch of protein gene DNA is contiguous and codes for a single protein.) Further assume that this particular bacteria has been happily reproducing for some time and that its offspring are viable bacteria, also capable of reproducing, and that it wants to reproduce again. Where is the bacteria going to get the exact sequence of DNA bases it needs in its protein gene that will specify the correct sequence of amino acids to produce the needed protein?
Dembski and Marks seem to think the bacteria has to find this sequence via some sort of a search process, which starts with every conceivable DNA sequence that will fit into the gene and rejects all but the single sequence of DNA bases that will produce the correct protein. At least that’s what I gather from reading the words in the paper and ignoring the math, which is beyond me. This would be a gargantuan task. If the protein gene was only 400 base-pairs long, there would be 4^400 or 6.6 E+240 different combinations to try. The universe will burn to a cinder long before the bacteria even gets properly started with this task.
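For anyone who wants to check the arithmetic, here is a minimal Python sketch of the size of that space. The function names are just illustrative; the only inputs are the 4 bases and the 400 base-pair length assumed above.

```python
import math

def search_space_size(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of the given length over the alphabet."""
    return alphabet_size ** length

def scientific_parts(n: int) -> tuple[float, int]:
    """Split a large positive integer into (mantissa, exponent) with n ~ mantissa * 10**exponent."""
    log10_n = math.log10(n)          # math.log10 accepts arbitrarily large ints
    exponent = math.floor(log10_n)
    mantissa = 10 ** (log10_n - exponent)
    return mantissa, exponent

size = search_space_size(4, 400)     # 4 bases, 400 positions
mantissa, exponent = scientific_parts(size)
print(f"4^400 is about {mantissa:.2f} E+{exponent}")
```

This gives a figure on the order of 6.6 E+240, in line with the number quoted above.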
What the bacteria does instead is much simpler – it merely makes an exact copy of its DNA, including the section that codes for that protein, and hands that copy down to its offspring. No searching is necessary. Since the bacteria has been using its DNA and the protein it encodes for to successfully reproduce, we know that the protein gene is a good one because successful reproduction is our definition of success. No searching whatsoever is necessary.
Now let me show you when the bacteria actually does “search” the search space and how it greatly improves the odds of finding a workable protein gene by searching only areas of the search space that are “close” to itself. Let’s use a larger bacteria, one with 500 protein genes. That means that 500 stretches of its DNA encode for proteins and we know that all of those proteins “work” because this bacteria is successfully reproducing. Suppose that a single base-pair in one of the 500 protein genes gets mutated during reproduction. That means that the other 499 protein genes are good ones because they didn’t mutate and are identical to their parent’s. So the bacteria isn’t choosing a point in the search space at random, it’s choosing a point that is so close to where it started from that 499 out of 500 protein genes are known to be correct. The new genome is obviously far more likely to be serviceable than one chosen at random from the search space, which would amount to mutating every single base pair.
Think of evolution this way: three or four billion years ago, a sub-microscopic molecule was randomly put together that was able to reproduce itself before the same forces that assembled it tore it apart. Its offspring was also able to reproduce itself before it was destroyed. This puts the molecule squarely in the tiny section of the humongous search space that is able to successfully reproduce. The molecule doesn’t have to search for the sweet spot, it’s already in it, in a section that’s likely enough so that natural forces could randomly assemble a molecule that’s in it in a few hundred million years or less. All the molecule has to do to keep its offspring in the sweet spot is to make an exact duplicate of itself. No searching necessary.
When the molecule is mutated, then the new, different offspring is tested to see if it’s also in the sweet spot. It tests itself by trying to stay intact and reproduce. Again, only nearby sections of the search space are tested. If the molecule is 100 atoms long and we only change a single atom, then 99% of the molecule is “known good” and we’re only testing the effect of the single changed atom. This is much more likely to be successful than randomly assembling a 100 atom long molecule and seeing if it works. If it doesn’t work, the new molecule is destroyed. But if the new pattern is good, then we now have TWO different species of molecules that successfully self-reproduce and evolution is off and running.
This is how evolution works. Every viable species is always in the sweet spot. It is never necessary to search for a successful DNA pattern to hand down to the offspring, you merely duplicate your known good pattern. If that pattern mutates, you are only then searching the genome space for a viable DNA pattern and you are only searching that space in the immediate vicinity of your known good pattern. I honestly don’t understand why Dembski and Marks are so concerned with searching large spaces. That has nothing to do with evolution.
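To make the “499 out of 500” point concrete, here is a toy Python sketch. It is only an illustration under made-up assumptions: a hypothetical genome of 500 genes of 400 bases each, random letters standing in for real sequences, and no model of viability at all. It just counts how many genes stay identical to the parent’s after one point mutation versus after drawing a whole new genome at random.

```python
import random

BASES = "ACGT"

def random_genome(n_genes, gene_len, rng):
    """A toy genome: a list of random base strings, one per gene."""
    return ["".join(rng.choice(BASES) for _ in range(gene_len)) for _ in range(n_genes)]

def point_mutate(genome, rng):
    """Copy the genome and change exactly one base in one gene to a different base."""
    child = list(genome)
    g = rng.randrange(len(child))
    pos = rng.randrange(len(child[g]))
    old = child[g][pos]
    new = rng.choice([b for b in BASES if b != old])
    child[g] = child[g][:pos] + new + child[g][pos + 1:]
    return child

def unchanged_genes(parent, child):
    """Count the genes that are identical in parent and child."""
    return sum(p == c for p, c in zip(parent, child))

rng = random.Random(0)                      # fixed seed so the run is repeatable
parent = random_genome(500, 400, rng)
mutant = point_mutate(parent, rng)          # one step away from the parent
scrambled = random_genome(500, 400, rng)    # a fresh random point in the space

print(unchanged_genes(parent, mutant))      # 499 by construction
print(unchanged_genes(parent, scrambled))   # a match would need all 400 bases of a gene to coincide
```

The sketch says nothing about whether the mutant actually works; it only shows how close a point mutant stays to its parent compared with a random draw.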
117
JayM
01/26/2009
7:58 am
djmullen @116
That was an incredibly lucid explanation, thank you. Of course, it supports my personal contention that the “edge of evolution” (where have I heard that phrase before?) is a rich area for ID research, so I’m a friendly audience for you.
While I would like to see more peer-reviewed papers supporting ID, I find your and Professor Olofsson’s arguments persuasive. After re-reading Dr. Dembski’s two papers, I can’t honestly portray them as supportive of ID. There is a gap between the math and what we know about biology.
I look forward to someone making the necessary connections.
JJ
118
gpuccio
01/26/2009
9:39 am
djmullen:
Do you really mean what you are saying?
Let’s start with your (obviously imaginary) bacterium with only one protein. First of all, in case you forgot it, we still have to explain how he got that protein (that is, we still have to solve the problem of OOL). You will probably say that it originated from simpler proteins. But the smallest proteins which can fold and exercise an autonomous biochemical function as enzymes are usually 80-100 aminoacids long (one of the smallest reported subunits is 62). So, where are the simpler proteins (functional proteins) from which the actual ones we see today originated?
But let’s leave that alone, for now. You have your one protein bacterium. Now you state the most incredible thing: that when a bacterium duplicates its DNA, there is no search necessary. That is incredible, not because it is not true, but because it is so trivial that I really can’t understand why you have to state it. Who has ever thought to deny that? It is obvious that DNA duplication is not a search!
But let’s go on. Now, let’s compare your one protein bacterium with a (more realistic) 500 protein bacterium. And let’s suppose that the one protein bacterium is the progenitor of the other one (being simpler, that’s a fair supposition). Now just tell me: whence did all the other 499 proteins come?
You say: it’s easy, they came by evolution, just substituting an aminoacid at a time, and always remaining in the sweet spot of functional proteins.
And I say, are you kidding? You are simply ignoring a very essential fact: the 500 proteins are completely different one from the other. Do you want an example?
In the genome of E. coli, a well known bacterium, we can find at least 2018 proteins described, grouped in 525 groups of similar proteins. The biggest group includes 52 proteins, while the smallest groups include only two proteins each, and there are 291 of them.
Now, let’s take, randomly, two proteins from two of the smallest groups.
One is called dinG, is described as a “Probable ATP-dependent helicase”, and is 716 aminoacids long.
The second is called clpA, is described as “ATP-dependent Clp protease ATP-binding subunit” (it is indeed a subunit of a more complex protein), and is 758 aminoacids long.
Now, I have taken these two examples completely at random, from two of the 291 smaller groups (of two similar proteins) in the E. coli genome. I could have taken thousands of different pairs. I have just selected two proteins with approximately the same length, for a more clear discussion.
Then I have blasted the two sequences with blastp, on the NCBI site. That software, which is routinely used for research, looks for possible alignments between two (or more) protein sequences. This is the result:
Only four possible partial alignments were found. The best was 49 aminoacids long, and presented 15 identities (15/49). The second was 7/8. The third was 12/30, and the fourth 6/11.
In other words, even considering all four possible alignments (which, obviously, are not really compatible one with the other), we get only 40 identities out of two sequences which are each more than 700 aminoacids long.
So I ask you: how did these two proteins (and all the other hundreds of different proteins, with hundreds of different foldings, 3D structures and functions) arise? How were these two completely different functional sequences found, remaining always in your imaginary “sweet spot”? How can your concept of “no search” apply to that?
Just answer that. And, as you seem to be familiar with combinatorial computing, I need not remind you that the combinatorial space of 700 aminoacid sequences is 20^700.
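For readers who want to check the figures, here is a small Python sketch of the arithmetic. The alignment numbers are the ones quoted above from the blastp output; the variable names are just illustrative, and comparing the identities against the length of the shorter protein (716) is my own choice of yardstick.

```python
import math

# (identities, aligned length) for the four partial alignments reported above
alignments = [(15, 49), (7, 8), (12, 30), (6, 11)]

total_identities = sum(ident for ident, _ in alignments)
total_aligned = sum(length for _, length in alignments)

shorter_protein = 716                       # dinG; clpA is 758
identity_fraction = total_identities / shorter_protein

# Size of the space of 700-residue sequences over 20 amino acids, as a power of ten
log10_space = 700 * math.log10(20)

print(total_identities, total_aligned)      # 40 98
print(f"{identity_fraction:.1%} of the shorter sequence")
print(f"20^700 is about 10^{log10_space:.0f}")
```

So the 40 identities amount to under 6% of the shorter sequence, and 20^700 is a number of roughly 911 digits.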
119
Prof_P.Olofsson
01/26/2009
9:54 am
jerry[114],
My family tree starts with a guy named Sven in the early 17th century. Doesn’t get more Swedish than that.
I need to visit you; here in San Antonio it hardly ever freezes and I would love to see some winter. And listen to sarcastic comments…
Anyway, yes, I did see your attempt to answer yourself. I don’t think we can sort this out until we learn how these results are intended to be relevant to ID/biology. The construction in the “search for a search” paper is very technical (I haven’t read the other paper). I understand the technicalities but cannot see the connection to biology.
Skål!
PO
120
JayM
01/26/2009
10:37 am
gpuccio @118
I think you’re asking a different question than djmullen was answering.
There are two issues being conflated here. The first, which I think is what you’re focusing on, is origin of life. That is, of course, a fascinating area of research.
The second, which djmullen discussed so eloquently, is the result of evolutionary mechanisms once life, or at least populations of replicating entities, exists. This is where I have trouble seeing the connection between Dr. Dembski’s two papers and ID. Once you have a population of imperfect replicators, djmullen is absolutely correct when he notes that there is no need to search a large genome space. Most replicators will have the same genetic makeup as their parent(s). Some will mutate but still maintain far more similarity than difference. That’s equivalent to just looking around the (very) local region of genome space for equivalent or better fitness.
Over many generations, the genomic makeup of the population may change, even significantly. At no point, however, is there a need to search prohibitively large regions when starting from a viable point.
How to get to that point and the size of the connected, viable regions are interesting questions.
JJ
121
CJYman
01/26/2009
10:43 am
It seems that some people here are not quite understanding what a search is in relation to these papers. The word “search” in this context has no intrinsic teleology associated with it. It merely describes the flipping of bits (bit operations), while describing the search space in terms of bits. This is where the probability of a pattern comes into play.
The rest of this is kinda lengthy, so I’m gonna post the rest in sections.
What is the probability that a given pattern (measured in bits) will be generated by unguided (random) bit operations? That is simply calculated as the pattern’s probability — assuming a uniform search space.
Now, the point of these papers is to ask [and answer] the question: “how can the probability of generating a pattern be increased *and what is the probability of finding a way to increase the probability of finding that given pattern*?” Little bit of a tongue twister, but read it a couple times and it’ll make sense.
Well, basically, there can exist a search procedure (bit flipping operation) which rejects some bit operations/flips and keeps others. However, if the search space is uniform, this does not increase the probability for finding a given pattern. What needs to happen is that the search procedure needs to be matched with the proper search space that will allow the filter to actually improve the probability for finding a given pattern. This has already been proven in the NFLT.
Dembski and Marks’s paper picks up from there. What is the probability of matching a search space to a search procedure in order to increase the probability of finding a given pattern? They have merely shown that the difficulty of finding that match can be no less (and apparently increases exponentially with every higher-level search) than the difficulty of finding the given pattern within a uniform search space in the first place. So appealing to a non-uniform search space and an evolutionary filter does not provide a solution to increasing the probability, since the probability (information) is merely moved back a step to the probability of finding the set of laws and initial conditions which will provide a non-uniform search space and ratcheting filter to increase the probability of finding the given pattern.
Thus, according to the math provided by Dembski and Marks, it is just as improbable to find a given pattern (measured against a uniform search space) as it is to find the search procedure (and landscape) to increase the probability of finding that given pattern.
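For those who prefer numbers, here is a rough Python sketch of the three information measures named in the first paper’s abstract, as I understand them: endogenous information as the negative base-2 log of the blind-search success probability p, exogenous information as the same for the assisted-search probability q, and active information as the difference. The values of p and q below are made up purely for illustration (a 20-bit pattern, and an assisted search assumed to succeed with probability 2^-5).

```python
import math

def endogenous_information(p: float) -> float:
    """Bits of difficulty for blind (uniform random) search with success probability p."""
    return -math.log2(p)

def exogenous_information(q: float) -> float:
    """Bits of difficulty remaining when the assisted search succeeds with probability q."""
    return -math.log2(q)

def active_information(p: float, q: float) -> float:
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return endogenous_information(p) - exogenous_information(q)

p = 2 ** -20   # assumed: one specific 20-bit pattern under uniform bit flips
q = 2 ** -5    # assumed: success probability of some hypothetical assisted search

print(endogenous_information(p))   # 20.0 bits
print(exogenous_information(q))    # 5.0 bits
print(active_information(p, q))    # 15.0 bits credited to the assistance
```

On this reading, the “search for a search” question is what it costs to come by those 15 bits in the first place.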
122
CJYman
01/26/2009
10:46 am
Some people here seem to be asking, “How does this relate to life, evolution, and ID Theory?” Well, first, life is founded upon the processing of a system of signs/units (measurable in binary digits). The organization of these bits form patterns and a certain percentage of those patterns, when processed produce function which may aid in the survival of the system itself. Since the bits are subject to mutations and transfers of many types, they are subject to bit flips and operations.
Thus, the patterns are subject to search (as defined in the first para above in my previous comment). This is why mutations can change one functional or non-functional pattern into another functioning or non-functioning pattern. When a functional pattern is generated, it can be said that that pattern has been found by searching through the possible patterns. Obviously evolution by natural selection provides the ratcheting filter whereby some patterns are kept and others are rejected.
Now, “How does this relate to ID?” Since it is just as improbable to find the pattern which can be processed into function as it is improbable to find the match between search procedure and search space which will increase the probability of finding the functional pattern, then the probability of finding the set of laws and initial conditions to allow the structure of life and evolution from a uniform search space is just as improbable as finding the results of life and evolution.
So, we can have an infinite regress of fortuitous and highly improbable matchings of search space to search procedure (active information) to ultimately increase the probability drastically of finding a system which can apply foresight (the brain), or we can hypothesize other options. The infinite regress seems too similar to “it’s turtles all the way down,” and I am personally unaware of any evidence of a tested pre-universe evolutionary process, and so this provides no real explanation for highly improbable, functional results seen in life. “Evolution-did-it” doesn’t cut it anymore. That evolution must now be explained since it is also just as difficult to find within a higher order search space as the human brain would be to find in a uniform search space — that is, by chance.
If we must ultimately explain everything in terms of law and chance, then it seems that the only real option, other than “turtles (evolutionary algorithms) all the way down” would be to simulate evolution generating itself from background noise (chance) and an arbitrary collection of laws (set of laws put together absent any consideration for future results – absent foresight).
But is there another option? Of course, ID Theory says that there is. Systems which are capable of foresight (modeling the future and producing targets) can also increase the probability of finding a given pattern. Intelligent systems do this by applying foresight when matching the proper search procedure to search space in order to increase the probability of finding a given pattern. This has been observed and evidence provided by the NFLT shows that a simulation of evolution will not work unless “problem specific information” about the search space and target is incorporated into the behavior of the algorithm. Basically, this provides evidence that evolution can not even be simulated without some knowledge of the characteristics of the target being incorporated into the programming of the behavior of the interaction between search procedure and search space. So if future knowledge of the target is necessary to even simulate evolution, what happens to the hypothesis that evolution needs no foresight?
Put all of this together and we see that the ID Hypothesis is valid and scientific since it is based on observation and is the only verified option that is provided by the math within these two papers. Furthermore, it is falsifiable by showing through testing and observation that just law and chance absent foresight can increase the probability of finding a given target. So far, no one has shown that to be the case and the NFLT provides evidence that it may be practically impossible.
123
pubdef
01/26/2009
11:51 am
Patrick #109:
Thank you, that’s helpful and simple. My question now becomes: where is the “abstraction” in DNA? I think we’re projecting the abstraction onto it. Before we came along, there was just this mechanism — if one series of base pairs was present, a cell developed in one way, if another base pair, another course of development. Grossly oversimplified, perhaps (probably), but I don’t think it’s fundamentally off base. Anyway, we looked at it and conceived of the “abstraction” of “instructions” or “code.”
124
Sal Gal
01/26/2009
12:06 pm
gpuccio,
As Patrick points out, a code is an abstraction. All models are abstractions. You are generally very good about avoiding reification. But here you have slipped up and treated the abstract code as physical, and have gone from there to say that if scientists model genes as encoding proteins, then something must have done the encoding. Consider that you are doing with the genetic code what others do with fitness functions.
This goes to the crux of the matter. In mainstream evolutionary theory, there is no encoder. Merely changing the bases in a chromosome is not encoding. Exchange of DNA among organisms, perhaps mediated by viruses, is not encoding.
I commented on probability because you seem to regard complex biological structures and processes as improbable. Tell me if you in fact do not. I mentioned “wonderment” because your degree of wonder at complexity may be converted into a subjective (Bayesian) probability, but not an objective (frequentist) probability. No one can assign an objective probability to the proposition that our empirical observations of the universe may be accounted for strictly in terms of matter, energy, and their interactions. And this spells death for improbability-based arguments that some events have causes attributable to something other than interactions of matter and energy.
125
Sal Gal
01/26/2009
12:10 pm
Dembski and Marks can do fancy things with probabilities, but in the applications Dembski really cares about, he cannot get the probabilities he needs. I have always said that active information may have engineering applications.
126
Laminar
01/26/2009
12:29 pm
CJYman:
The problem is that ID can also invalidate the paper if you want it to. We know that evolutionary search spaces in biology are non-uniform so arguing that they couldn’t have been ‘found’ just begs the answer that they were designed – which is exactly the position of many theistic evolutionists.
127
Patrick
01/26/2009
12:38 pm
Leaving aside the points related to OOL and the initial genetic toolkit, this is the main issue:
The unqualified assumption is that for every potential long-range target there is a series of fitness-improving functional (or at least non-deleterious) intermediates in the “(very) local region of genome space”.
ID proponents think that for most starting “viable points” in genome space the search landscape could be likened to a valley surrounded by sheer cliffs. The valley consists of other viable points in local genome space, but is limited in scope.
Some ID proponents believe these sheer cliffs can be traversed by intelligent mechanisms. Darwinists who think that “many micro-evolutionary events leads to macro-evolution” think there exist very narrow paths burrowing to other valleys, but even these are very difficult to traverse, and we’re lacking evidence for their existence. Other Darwinists think there are mechanisms that allow the traversal of the cliffs to produce macro-evolution.
And when it comes to long-range targets this leads to standing challenges like the flagellum. Although I’d say that gpuccio at #118 does a better job of describing the problem than I do in those previous discussions.
128
Patrick
01/26/2009
12:55 pm
Huh?? Your objection just confuses the issues. Unguided variation may not be an “encoder” (which I presume you’re saying is inherently an intelligent agent) but changes in information must meet the coding scheme in order to be valid for transcription. Error correction just enforces this requirement of being encoded properly.
And I’d like to see how you’d get from a basic chemical replicator to an information-based replicator with rules defining a coding scheme.
In any case, this particular discussion already took place not too long ago in A Simple Gene Origination Calculation.
129
gpuccio
01/26/2009
1:38 pm
JayM (#120):
I am afraid that you are wrong in what you say, and again that derives from little familiarity with biology (which, obviously, is not your fault). The problem is that darwinists must have convinced people, more or less indirectly, of a lot of things that are absolutely false.
The point is that what I say does not refer “only” to the problem of OOL. It is perfectly true also for the problem of evolution.
Indeed, I could well have made my example with two proteins from different species. Let’s be clear: some proteins can be found in many species, at different levels of natural history, and remain very similar, or change in a more or less limited range. But a lot of proteins arise practically “de novo” at some time in the course of natural history, even if they can persist after in many species.
Are you suggesting that practically all the essential proteins, and 3d structures of proteins, were already present in the mythical LUCA? Well, that’s not the case. Almost any species has proteins which are exclusively found in that species, and have no homology with other known proteins.
In a recent paper, “What makes species unique? The contribution of proteins with obscure features”, available online at Genome Biology, a comparison of “predicted proteomes derived from 10 different sequenced genomes, including budding and fission yeast, worm, fly, mosquito, Arabidopsis, rice, mouse, rat, and human” showed that “7.5% of the proteins with defined features (PDFs) were species specific (17,554 in total)” and “60% of the Proteins with obscure features (POFs) identified in these 10 proteomes (44,236 in total) were species specific”. And “Approximately one-quarter of eukaryotic proteins are POFs”.
The simplest bacteria have approximately 500 proteins. E. Coli has more than 2000. Eukaryotes are much more complex. Do you believe that all the proteins in eukaryotes can be traced to those in simple bacteria, with only minor evolutionary “play”? No. That’s not true.
C. elegans, maybe the simplest multicellular being, a small worm made of only 1000 cells, has a proteome of about 20000 proteins, almost the same as humans. Do you believe they are the same? Or that all human proteins are very similar to those in the proteome of C. elegans?
New molecular functions arise in new species. New proteins, new protein cascades, new regulations. New organs, new tissues, new cells. There is a lot of novelty, wherever you look, in biology.
djmullen’s reasonings of “sweet places” and “no searches” are pure imagination, and bear no relationship to biological reality. And you accept them for the same reason: you don’t know how things are.
Again, that’s not your fault, or djmullen’s fault. It is really strange how many people believe that the differences between species are easily explained by some reshuffling of existing information. Perhaps they read too much Dawkins (certainly not a good idea), and their best model is dog breeding. But sometimes I wonder how darwinists have succeeded in spreading so much disinformation about biology, and in hypnotizing people making them believe things that have no scientific basis.
130
gpuccio
01/26/2009
2:03 pm
Sal Gal:
I can only quote myself (#104):
“I am saying only two things:
a) The genetic code is a symbolic code, in the sense that the information stored in DNA as protein coding genes can only be retrieved by means of a complex system, the translation system, where exactly the same symbolic correspondences are embedded. And that correspondence is not in any way connected to biochemical laws, but only to a semantic connection between the stored information in DNA and the translating system in tRNAs.
b) The information in protein coding genes is functional information, because it is perfectly apt to guide the synthesis of a perfectly functional protein. Please notice that the function is in no way present in the DNA sequence (DNA can never act as an enzyme), but arises only in the final protein, as a consequence of the information of the DNA. The possibility of errors in the process does not change anything of that.”
To be even more clear, I am saying:
1) UCU is translated to serine in practically all protein coding genes.
2) That happens for no biochemical property which connects the tri-nucleotide UCU to the aminoacid Serine, but only because in all those beings there is another molecule, a specific tRNA, which has an anticodon in a key position which can connect (through biochemical laws) to the UCU codon in mRNA, and that same molecule (the specific tRNA) has another, separated site, in its structure, which links the aminoacid Serine and attaches it to the growing protein sequence.
3) The result of all that is that Serine strangely appears at the right functional place in all existing functional proteins.
4) The same can be said for all the other 19 aminoacids, and for all the other codons in the genetic code (except the stop codons), where each aminoacid is “mounted” by the appropriate tRNA, recognizing the appropriate codon on mRNA.
5) I am curious about that, and would like to know why it is that way. I don’t accept that as a law of nature. And I don’t accept the explanations of darwinists (provided that they have ever offered one).
6) If you are not curious about that, or if deeper problems prevent you from asking yourself such trivial questions, I really don’t know what else to say.
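To illustrate points 1) to 4) for readers who think in code, here is a minimal Python sketch that treats the tRNA correspondence as a plain lookup table. It is a deliberate simplification: the table lists only a handful of codons from the standard genetic code (not all 64), the example mRNA string is invented, and nothing here models the actual biochemistry of tRNAs or ribosomes.

```python
# Partial codon table (standard genetic code); None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met",
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": None, "UAG": None, "UGA": None,
}

def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time and look up each codon until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]   # KeyError for codons missing from this partial table
        if residue is None:
            break
        peptide.append(residue)
    return peptide

print(translate("AUGUCUUUUGGCUAA"))   # ['Met', 'Ser', 'Phe', 'Gly']
```

The mapping from UCU to Ser lives entirely in the table, which is the role played in the cell by the specific tRNA described in point 2).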
In case you want to go on with the discussion, could you please explain, as clearly as possible, what you don’t agree with in the above, very simple, points? And what in them is “reification”? Thank you.
And again, I certainly admire the complexity and creativity in nature, and often I am really overwhelmed at its depth and beauty. I feel many different emotions and intuitions when I look at nature, both in a garden or in a biology text (although the garden is usually better).
But what I really “wonder” at is the darwinist folly, and how many intelligent people can share such a theory and such arguments without any apparent doubt or second thought. That is a real wonderment for me. That I would bet against, in my best Bayesian moments, as absolutely improbable, if I did not see it happen every day.
131
Mark Frank
01/26/2009
2:37 pm
Gpuccio
I am thinking about your post #129. I wish there were a biochemist watching this discussion but I have a logical query.
If:
a lot of proteins arise practically “de novo” at some time in the course of natural history, even if they can persist after in many species.
what is the mechanism that causes that to happen (whether it be guided or not)? It is presumably a rearrangement of DNA in the genome and we know the mechanisms involved:
* point mutation – clearly not sufficient to produce a radically new protein
* crossing over during meiosis – but you seem to be saying new proteins arise in asexual species as well as sexual
It seems that what we are really talking about is:
* insertions, deletions, and duplications.
These operations could instantly create new proteins that appear radically different from their predecessors. But they would be working with strings of DNA that had proven effective before. The unit being rearranged is much larger than a single base pair or even a single amino acid.
I don’t have enough biochemistry to know if this makes sense but it seems worth pursuing.
132
JayM
01/26/2009
4:05 pm
gpuccio @129
Certainly. There is also a lot of commonality. The ratio between those two is one of the pieces of evidence for common descent. Organisms with a more recent common ancestor share more features than those who diverged earlier.
With all due respect, my knowledge of biology is more than sufficient to recognize that djmullen’s comments are spot on. It’s very simple to understand just by looking at yourself and your parents. You do not share a genome with either. In fact, you have a number of mutations so that your genome is not even a proper subset of the combined genomes of your parents. Nonetheless, you live. You are a viable organism. That demonstrates beyond a shadow of a doubt that there is a “sweet spot” in genome space within which the evolutionary mechanisms that operate between one generation and the next work.
This is why I have great difficulty seeing how Dr. Dembski’s two papers are at all applicable to the ID issue. I could see how they might be applied to the origin of life problem, but evolutionary biology starts from a viable point and explores nearby points, most of which are also viable. Evolutionary mechanisms do not search the whole possible multidimensional genome space.
I’ve been enjoying our conversation, but your patronizing tone in this post is unappealing.
You seem to be conflating multiple topics. In the context of observed evolutionary mechanisms, Dr. Dembski’s papers do not seem to be obviously supportive of ID. The fact that small changes to a viable genome result in other viable genomes, as shown by the differences between you and your parents, indicates that the viable regions in genome space are not uniformly distributed. That means that mathematical discussions of searching the whole space, however elegant they may be, do not apply in this domain.
The broader question of just how far these mechanisms can go is, as you point out, quite interesting. The fact that there are limits at one scale does not invalidate djmullen’s explanation at the scale he was discussing.
JJ
133
JayM
01/26/2009
4:16 pm
CJYman @121
This does not, as you touch on below, reflect the mechanisms that are part of modern evolutionary theory (MET). The search space for viable organisms is not uniform and the mechanisms are not random (although some do include a random component).
If we’re going to argue that MET is an insufficient explanation for what we observe in the natural world, we need to address what evolutionary biologists actually say.
I just read the NFLT papers I could find on the web. I’m not sure I’d summarize quite the same way, but I think I get your point.
This seems to be a significant change to ID theory. Instead of attempting to demonstrate that the mechanisms of MET are insufficient, we’re now reduced to arguing that the physical laws that result in those mechanisms are unlikely. The cosmological argument is interesting, but it cedes all of biology to ID opponents.
MET doesn’t have anything to say about how the mechanisms arose, it merely shows how they explain what we see. Please correct me if I’m misunderstanding you, but it seems that you’re saying that Dr. Dembski’s papers provide support for the idea that the universe is designed, but have nothing to say about the likelihood that the mechanisms of MET can account for all the biological diversity we see.
JJ
134
gpuccio
01/26/2009
4:43 pm
Mark:
excuse me if I have not answered your previous post (I will do that as soon as possible), but as you see the discussion has shifted to other issues. So I answer your #131 first.
Yes, you are right: great changes take place at molecular level in natural history, and we don’t know why or how. We in ID are certain that those changes are guided, for the reasons you know, but for the rest we are as ignorant as the darwinists.
It is perfectly possible that a new protein comes out of an initial gene duplication, as darwinists believe. That’s what programmers often do when they want to modify an existing code. And then in some way the new code is superimposed.
I don’t believe too much in the importance of working with big strings, because that has many restraints. In many cases you would need to change individual nucleotides at the right place, and point mutation remains the most powerful way to do that.
I suggest here, just for discussion, some possible mechanisms of guided programming of a new protein. They are purely hypothetical, but I will mention where there could be some corresponding model in biology:
1) Guided variation: a variation which is not random, but pseudorandom, and where special events are intelligently favoured (for instance, at quantum level). That could include both point mutation and other mechanisms, like deletion, inversion, and so on. Guided variation could also be mediated through intermediate tools, like transposons, ERVs, and others.
2) Random targeted hypermutation: we have an example of that in antibody maturation, but that is realized by an algorithm embedded in the immune system, and is not destined to genetic transmission. However that mechanism works, but it has to be coupled to:
3) Intelligent selection. In other words, the results which are in the sense of the needed change must be kept by a guided, intelligent intervention, even if they are not yet truly functional. For instance, the change in function could be measured, even if minimal, and preserved, as it happens in antibody maturation when the affinity for the original antigen is in some way measured (probably by the antigen presenting cells) and determines the suppression or retention of the new clone. That is similar to what happens in GAs, and it could take place in the context of the individual biological being, and not through the much slower (and imaginary) process of NS. Or, if the designer already knows the solution, he could act as a direct oracle, maintaining only the variations which are in accord with the information to be implemented (a la weasel). In all cases, intelligent selection, as we know from the example of appropriate GAs, can realize very quickly what NS can never achieve.
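To make the “a la weasel” case in point 3 concrete, here is a minimal sketch in the style of Dawkins’s weasel program. It is only an illustration: the function name, alphabet, mutation rate and offspring count are assumptions chosen for the example, not anything taken from the papers or from biology. The point it shows is that the selection step compares every candidate against a target that is already known.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # assumed alphabet for the illustration


def weasel(target, offspring=100, rate=0.05, seed=0, max_generations=10_000):
    """Cumulative selection against a known target string (hypothetical helper)."""
    rng = random.Random(seed)

    def score(candidate):
        # The "oracle": counts positions that already match the known target.
        return sum(a == b for a, b in zip(candidate, target))

    # Start from a random string of the same length as the target.
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    generations = 0
    while parent != target and generations < max_generations:
        # Each child copies the parent, mutating each position with probability `rate`.
        children = [
            "".join(rng.choice(ALPHABET) if rng.random() < rate else c for c in parent)
            for _ in range(offspring)
        ]
        best = max(children, key=score)
        # Keep the best child only if it is at least as close to the target.
        if score(best) >= score(parent):
            parent = best
        generations += 1
    return parent, generations


result, generations = weasel("METHINKS IT IS LIKE A WEASEL")
print(result, generations)
```

All the information about the target enters through `score`; replacing that function with one that does not know the target is what changes the character of the search.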
But, while guided variation, targeted random variation and intelligent selection are certainly possible mechanisms by which a designer can implement intelligent information in the genome, huge mysteries do remain. First of all, is the implementation of the information really gradual, like traditional darwinism supposes? In other words, does the designer act in a very slow and gradual way? It is possible, but there are a number of issues against that: first of all, obviously, natural history, which is almost certainly discontinuous, especially at three major points: OOL, and the Ediacaran and Cambrian explosions. And the whole fossil record, in general, does not favour continuity, as Gould had well understood. And finally, as I have said often, we cannot really think that change can come through one new gene, however powerful. Real change in a complex system requires a redefinition of multiple genes and parameters, of the regulation, of the procedures, and of many other things. It is no accident that even very similar species are sometimes very different at the genomic level. Moreover, we still don’t know how and where regulations and procedures are really implemented in the genome (or in some other place).
The difficulty of obtaining even a single new protein, however, is evidently troubling darwinists ever more. That can explain the recent attempts to find an important role to frameshift mutations, which are the only hope for traditional darwinism to obtain radically new sequences in a “simple” way. So, although frameshift mutations are totally unrealistic as a way to obtain functional information, darwinists are trying just the same to exploit them in their “models”, driven probably by sheer desperation. What a pity that the only empirical support for those theories comes from the old model of nylonase, and that that model has been recently shown to be false (if you don’t believe me, just go to the Wikipedia page about nylon eating bacteria, and read carefully the linked paper by Negoro et al, to see how the famous nylonase, for years boasted by darwinists as an example of new function created by a single frameshift mutation, is instead a classical example of microevolution, where the mutation of one or two aminoacids allows a new substrate affinity in an existing esterase).
135
Mark Frank
01/26/2009
5:06 pm
Gpuccio
You wrote:
So, although frameshift mutations are totally unrealistic as a way to obtain functional information, darwinists are trying just the same to exploit them in their “models”, driven probably by sheer desperation. What a pity that the only empirical support for those theories comes from the old model of nylonase, and that that model has been recently shown to be false (if you don’t believe me, just go to the Wikipedia page about nylon eating bacteria, and read carefully the linked paper by Negoro et al, to see how the famous nylonase, for years boasted by darwinists as an example of new function created by a single frameshift mutation, is instead a classical example of microevolution, where the mutation of one or two aminoacids allows a new substrate affinity in an existing esterase).
About the fourth paragraph in Wikipedia reads:
A series of recent studies by a team led by Seiji Negoro of the University of Hyogo, Japan, suggest that in fact no frameshift mutation was involved in the evolution of the 6-aminohexanoic acid hydrolase.[1] However, many other genes have been discovered which did evolve by gene duplication followed by a frameshift mutation affecting at least part of the gene. A 2006 study found 470 examples in humans alone.[2]
with suitable references.
What gives?
136
Mark Frank
01/26/2009
5:26 pm
Gpuccio
I then went on to read the paper you reference in #129. It seemed surprisingly easy to understand. Of course I may have got it wrong, but it appears to conclude that there is a high (92.5%) conservation of proteins among the PDF proteins – those with clearly defined function. This among organisms as diverse as yeast and mammals with a last common ancestor hundreds of millions of years ago.
There is much lower conservation among POF proteins. But these are the proteins for which the function is not understood. And a lot of them have quite distinctive characteristics – so they cannot be treated as like PDF proteins for which the function has yet to be discovered.
Many of these POF proteins might do nothing at all. Those that do something we have no idea whether it is the whole string that is relevant or just a part of the protein that is active or how wide a range of other proteins could perform the same function. If these things are true then this would greatly lighten selective pressure and under Darwinian assumptions they would diversify very rapidly – because a wide range of changes would be viable.
Am I failing to understand something?
137
jerry
01/26/2009
7:06 pm
Mark Frank,
Use data as a synonym for the word information. The molecule has a name so you can use that or better yet since we know we are talking about nucleotides, so use A, T, C, G. Thus the information in the DNA or some segment of the DNA is the molecule A, T, C or G or a string of these letters. It is simple as that.
For a rock you can use the molecule at some coordinate and then proceed to list every other molecule at each coordinate point. I cannot imagine why anyone would want to do this but it theoretically could be done and when through you would have a very complex set of information with each data point an individual piece of information.
Now each triplet of letters in the DNA or certain parts of the DNA is another piece of information. What is so hard about this? If you want to pursue some philosophically obscure path to analyze this example be my guest but the rest of us will proceed on.
Now certain segments of this DNA can be thought of as a unit and that unit is another level of information. Think of letter, word and sentence. And these sentences have meaning and that meaning is a string of polymers of amino acids which people call proteins.
It is all quite simple and straightforward. So when we say that DNA contains information and complex information it becomes very obvious what is meant but so far it may not be any different than the rock. But the DNA data goes further and relates or specifies something else while the data in the rock has just hit a stone wall and is just still a rock. And we can go further and look at the thing specified and we see that it has a function, so the nucleotides in DNA are information, complex, specify something else and this something else is functional.
We might even take this to a new level above this and say that the functional element or protein or RNA then becomes part of a coordinated system and maybe we can call the data or information in DNA something like systematic, functional complex, specified information or SFCSI. We have now left the rock in a distant galaxy as we accelerated to light speed to get to where the data is leading us to. And who knows that if we can determine new levels we will have a longer abbreviation that we can put in our Mickey Mouse decoder ring to get the meaning of life.
If you want to quibble over this, be my guest but I do not see the point of it.
I assume that these papers of Dembski and Marks are looking at the likelihood a sequence of mutations can lead a particular DNA string from one functionally complex specified point to another that is not related to it. If that is not what it is about then maybe someone could enlighten us peons.
138
CJYman
01/26/2009
8:55 pm
JayM (#133):
“This does not, as you touch on below, reflect the mechanisms that are part of modern evolutionary theory (MET). The search space for viable organisms is not uniform and the mechanisms are not random (although some do include a random component).”
That’s the point. Since evolution is not random, what allows it? The answer: a matching of search space to search procedure which is just as improbable as the effects which it produces. ie: if we wouldn’t expect the chemical constituents which make up a human brain to randomly coalesce in someone’s backyard pool, then we shouldn’t expect background noise and an arbitrary set of laws to produce life and evolution.
JayM:
“If we’re going to argue that MET is an insufficient explanation for what we observe in the natural world, we need to address what evolutionary biologists actually say.”
But neither myself nor these two papers say that MET (+ all other evolutionary mechanisms yet to be discovered) is insufficient to explain the biology which we see around us. These papers take evolution as granted. These papers merely get to the foundational point of the debate — “what causes life and evolution” and “is life and evolution possible absent previous foresight” and “will background noise and an arbitrary set of laws cause life and evolution.” The rest of my post which you responded to outlines the significance of these questions and how these two papers plus ID Theory help to provide answers.
JayM:
“I just read the NFLT papers I could find on the web. I’m not sure I’d summarize quite the same way, but I think I get your point.”
I have also read through them and they stress the significance of matching search procedure to algorithm in order to achieve any significant results. They even conclude (from what I can remember) by discussing the importance of incorporating problem specific information into the behavior of the algorithm in order to solve the problem.
JayM:
“This seems to be a significant change to ID theory. Instead of attempting to demonstrate that the mechanisms of MET are insufficient, we’re now reduced to arguing that the physical laws that result in those mechanisms are unlikely.”
Actually, that isn’t strictly an ID argument. Poking holes in evolution is not equal to an ID argument. Many non-IDers, such as James Shapiro, point to the insufficiency of MET to explain the diversity of life which we observe. I believe he is on the right track researching cellular non-random genetic engineering in which living organisms control their own evolution to an extent.
ID Theory, by definition, is a search for patterns which signify previous intelligence. Once I actually began to understand ID Theory, I noticed that it is not an anti-evolution argument. In fact, these two papers begin to provide evidence that evolution is one of those patterns which signifies previous intelligence. CSI and IC are other patterns which also signify previous intelligence.
JayM:
“The cosmological argument is interesting, but it cedes all of biology to ID opponents.”
Not so. Life and evolution itself is evidence for a previous intelligence. Think about it a bit. There would be no cosmological argument if life didn’t exist. There would be no patterns to explain and thus no cosmological ID. Biological ID merely recognizes that certain features of life (including the system of life itself and its evolving process) require previous intelligence and Cosmological ID provides evidence that this intelligence is fundamental to the laws of nature.
JayM:
“MET doesn’t have anything to say about how the mechanisms arose, it merely shows how they explain what we see.”
That’s pretty much the point. So when you see your friendly prof for the public understanding of science foaming at the mouth attempting to say that evolution proves that there is no teleology within life or nature, then you know that he is way out of his element. ID Theory is interested in quantifying the patterns produced by intelligence and what causes life and evolution.
JayM:
“Please correct me if I’m misunderstanding you, but it seems that you’re saying that Dr. Dembski’s papers provide support for the idea that the universe is designed, but have nothing to say about the likelihood that the mechanisms of MET can account for all the biological diversity we see.”
They provide evidence that life and evolution are designed. They show that in order for a system (including evolution) to increase the probabilities of discovering a function/pattern or solve a problem, the higher order system which causes that evolution must be at least as improbable. So, as a foundational explanation, we need to look for something which can increase probabilities and foresighted systems are capable of that, however background noise and arbitrary collections of laws (set of laws created without any consideration for future results) are not up to that challenge.
Other people have demonstrated that MET is most likely not up to the task of generating all the biological diversity and function which we see (wasn’t that the point of the Altenburg 16). However, that only means that we don’t know everything about how evolution works yet. Big Deal … there’s much to learn about the mechanisms.
139
jerry
01/26/2009
11:11 pm
“Other people have demonstrated that MET is most likely not up to the task of generating all the biological diversity and function which we see (wasn’t that the point of the Altenburg 16). However, that only means that we don’t know everything about how evolution works yet. Big Deal … there’s much to learn about the mechanisms.”
The MET is an evolving synthesis of processes that account for changes in a population over time. As such it is far different from what it was in the early 1940s when it was initially finalized. It changed dramatically with the discovery of the structure of DNA and all the multitude of processes surrounding microbiology in the next 40 years after Crick and Watson discovered the DNA structure. It is constantly changing today as things such as epigenetics are added.
The two constants in all the various syntheses have been natural selection and naturalistic mechanisms for providing variation. Gradualism is even being thrown to the wolves but never natural selection and never naturalistic mechanisms of variation. Changes to the MET are not thought of as supporting ID.
Read Hunter’s new website. They now say that organisms have the capability to evolve built in. Is this support for ID according to the evolutionary biologists? No, because this capability was selected after it developed naturally. Evolutionary biologists have more outs from a sticky situation than Jack Bauer is capable of executing in 24 hours.
140
sparc
01/26/2009
11:28 pm
Do you think that it will really help to introduce yet another term? Currently we have IC (Behe) and CSI (Dembski) which seem well established in the ID community. In addition one occasionally finds FCSI (KairosFocus, Jerry, gpuccio) and FSIC (Gordon Mullings). And now Jerry just introduced SFCSI.
Will a six letter abbreviation (e.g. IFSCSI) be the next step in the evolution of CSI?
141
gpuccio
01/27/2009
1:50 am
Mark:
You are, as usual, a careful reader.
# 135:
I had obviously read with great attention the other papers mentioned in the Wikipedia page. I did not mention them explicitly in my post on purpose, but I was referring exactly to them when I wrote:
“The difficulty of obtaining even a single new protein, however, is evidently troubling darwinists ever more. That can explain the recent attempts to find an important role to frameshift mutations, which are the only hope for traditional darwinism to obtain radically new sequences in a “simple” way. So, although frameshift mutations are totally unrealistic as a way to obtain functional information, darwinists are trying just the same to exploit them in their “models”, driven probably by sheer desperation.”
I hoped that someone mentioned them, so that we could deepen the discussion.
Those papers are really the product of wishful thinking, or if we want of desperation. The only interesting thing in them is that the authors frankly acknowledge that, without the frameshift mutation mechanisms, it would really be difficult for the model of darwinian evolution to explain the observed diversity of proteins.
But the papers themselves are absolutely generic and speculative. They are of the kind: let’s take all possible frameshifts in the human proteome, and just blast them against the existing genome, and see if we find some partial homologies somewhere more frequently than we would expect in a completely random system. In other words, they are only abstractly playing with thousands of sequences to give some support to an unbelievable assumption.
The only real example of an observed frameshift mutation which would give a functional protein cited by them as a basis for their assumptions is, obviously, nylonase.
And, obviously, nylonase is bogus. It is one of the most beautiful examples of a bogus assumption emphasized by darwinists for years as “truth” only because it seemed to give support to them against IDists. It is a very good example of what dogmatism can do to scientific reasoning.
Very briefly, the fairy tale of nylonase’s frameshift origin begins (and ends) with a paper from Susumu Ohno on PNAS in April 1984. It was a paper exactly of the kind of the more recent ones cited in Wikipedia: highly abstract, and unsupported. Ohno had just observed that there was a “possible” ORF in the genome of the plasmid containing the gene for nylonase which “could” have been, in the past, a protein gene for a “never observed” ancestral protein which “could” have given origin to the observed gene for nylonase by a frameshift mutation.
It is interesting that Ohno was suggesting that an observed protein (nylonase) “could” have originated from a hypothetical, and never observed, protein by an hypothetical, and statistically almost impossible, mechanism. That could have remained just an interesting but bizarre paper about an interesting, but almost certainly false, theory, if darwinists had not decided to make of it a favoured piece of propaganda, much like they have done with another useful fairy tale, Matzke’s theory about the “evolution” of the flagellum.
So, the frameshift origin of nylonase became “truth”, and frameshift mutations became the tool of darwinian evolution to quickly and efficiently traverse, by statistical magic, that ocean of improbability which, at the same time, darwinists were swearing there was no need to traverse. And why not? They had empirical evidence for that!
We had to wait more than twenty years to get rid of those myths, by means of the serious work of those serious researchers who, thank God, still exist.
And please, take notice that the supporting myth that nylonase was a “de novo” protein, emerged in a few decades by the wonderful mechanisms of darwinian evolution, was based on a quick and superficial statement, in the paper about its discovery, that it presented no obvious homology to a bunch of known proteins, and no other apparent function than digesting nylon. Both these statements were, obviously, false.
Finally, I must thank Zachriel for having prompted me to review the current literature about nylonase by citing it on your blog. It took a little time, but it was really rewarding. Strangely, the important news about those very important findings had not been boasted around by darwinists, and even we IDists had probably not noticed it. But I must give credit to Wikipedia for correctly citing it, even if with the understandable, but very awkward, attempt at covering the facts by the prompt citation of those other absolutely irrelevant papers.
142
gpuccio
01/27/2009
2:06 am
Mark (# 136):
I quoted that paper just to show you an example of how many proteins exist in the proteome which are specific to one single species. That means that they show no homology to proteins in other species. Even in the category of PDFs, they were citing 17,554 species specific proteins! That seems really a lot to me.
That does not mean, obviously, that all the remaining proteins are conserved in all the species. It just means that they show some homology (certainly of various level and significance) to “some” other proteins in the tested genomes.
In the interesting article linked in Paul Nelson’s recent thread about the tree of life, we find the following interesting statement:
“The battle came to a head in 2006. In an ambitious study, a team led by Peer Bork of the European Molecular Biology Laboratory in Heidelberg, Germany, examined 191 sequenced genomes from all three domains of life – bacteria, archaea and eukaryotes (complex organisms with their genetic material packaged in a nucleus) – and identified 31 genes that all the species possessed and which showed no signs of ever having been horizontally transferred.”
31 genes. That’s remarkable, isn’t it?
And about homology, please remember that the assumption that any level of homology in protein sequences is evidence of unguided derivation is only another myth of darwinism.
First of all, homology is tested against the null hypothesis of randomness. Therefore, it is perfectly obvious that we will find some homologies between proteins which have a similar function, for the same reason that all motorcars have wheels. That’s certainly statistically improbable in a random assemblage of pieces, but it is certainly expected in a collection of machines which are designed to move.
And finally, even if homology can show derivation, in no way it proves “unguided” derivation.
So, to sum up, I think that the origin of “de novo” proteins does remain a completely unsolved mystery for darwinian theory.
143
djmullen
01/27/2009
3:33 am
gpuccio:
Although proteins are interesting, my main concern is with the subject of this thread, the Dembski and Marks papers and how they relate to evolution. As you say, when copying DNA, “…there is no search necessary.” You even say this fact is so trivial, I don’t even have to state it.
So I ask again, what do Dembski and Marks’ papers on searching through the vast search space of a particular genome have to do with evolution? So far as I know, evolution never does anything like that. It either does a direct copy of the parental DNA, which means no search at all, or it mutates a very very tiny percentage of the parental DNA which means it “searches” an area of the genome very very close to the known-good genome of the parents. I don’t see any relevance of the Dembski-Marks papers to evolution at all.
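To put rough numbers on the contrast between mutating a tiny part of the parental DNA and searching the whole space, here is a small sketch. The sequence length of 1000 bases and the function names are assumptions made for illustration, and only single-base substitutions are counted (no insertions, deletions or duplications).

```python
def point_mutation_neighbours(length, alphabet_size=4):
    """Number of sequences exactly one substitution away from a given sequence."""
    return length * (alphabet_size - 1)


def sequence_space_size(length, alphabet_size=4):
    """Number of all possible sequences of the given length."""
    return alphabet_size ** length


n = 1000  # assumed length, chosen only for illustration
neighbours = point_mutation_neighbours(n)
total = sequence_space_size(n)

print(neighbours)       # sequences reachable by one point substitution
print(len(str(total)))  # number of decimal digits in the size of the full space
```

The one-step neighbourhood grows linearly with sequence length while the full space grows exponentially, which is the distinction the question above turns on.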
144
gpuccio
01/27/2009
6:23 am
djmullen (143):
if you have followed all the previous discussions about proteins, you should already have the answers.
To sum up:
1) Proteins are the essential material on which evolution has to work.
2) All the proteins we can observe are extremely varied both as primary sequence and as function. Many of them are even species specific.
3) The ocean of possible protein sequences is, indeed, an ocean, and a very big one. Functional proteins are located in myriads of small separated islands in that ocean, as can be easily verified comparing their primary sequences, which are well known in great detail for a lot of them (just think of my example of two real proteins, completely different one from the other, with two different functions and 3D structure, and a combinatorial space of more than 20^700).
4) If you believe that all the existing proteome is derived from some original ancestor, with a much smaller proteome and simpler functions, and by unguided darwinian means, that implies that the ocean of possible sequences has been traversed myriads of times in the course of natural history, and even more amazingly at OOL, in some mysterious prebiotic setting.
5) That’s why Dembski and Marks’ work about random searches is absolutely relevant, if you want to defend darwinian theory. I am not going into the details of their work, but relevant to biology it certainly is.
6) If you go on affirming that evolution “mutates a very very tiny percentage of the parental DNA”, then you have to explain why proteins are so different and interspersed in the ocean of sequences. Or show that in all cases bridges exist between the islands of function where a selectable and growing function is maintained in all the intermediates, and the jumps between one intermediate and the others are always tiny enough that no important search has to be performed. I really doubt that anyone has ever done that, or will ever do that, even for one single important case (I mean, for one case which is not simply a microevolutionary, insulated case of one-two aminoacids substitution with immediate and definitive selection). Nobody can build such a model for a very simple reason: it does not exist. And, to save darwinian theory from the paradise of inconsistent fantasies, that should be necessarily done not for one, but for all known cases of different proteins with different functions.
I hope that answers more explicitly your questions.
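For scale, the 20^700 figure in point 3 can be turned into bits. This is a back-of-the-envelope sketch only: it assumes a 700-residue sequence over the 20 standard amino acids, a single specific target sequence, and a uniform random draw, so that the result corresponds to what the first abstract calls endogenous information. The variable names are hypothetical.

```python
import math

RESIDUES = 700       # sequence length taken from the 20^700 figure above
AMINO_ACIDS = 20     # standard amino acid alphabet

# Size of the whole sequence space (Python integers are arbitrary precision).
space_size = AMINO_ACIDS ** RESIDUES

# Probability of drawing one specific sequence uniformly at random is 1 / space_size,
# so -log2 of that probability is RESIDUES * log2(AMINO_ACIDS) bits.
bits = RESIDUES * math.log2(AMINO_ACIDS)

print(len(str(space_size)))  # number of decimal digits in 20^700
print(round(bits, 1))        # difficulty of blind search for one sequence, in bits
```

If many sequences perform the same function, the target is larger than one sequence and the figure in bits drops accordingly; how much larger is exactly what the thread is disputing.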
145
JayM
01/27/2009
7:42 am
CJYman @138
Thanks for the detailed reply. It appears that I do understand you correctly. In order for Dr. Dembski to present these two papers as support for ID, though, he (or someone) needs to make the links you hint at explicit. Unless one subscribes to the multiverse theory, it isn’t clear that the physical laws of this universe are the result of any kind of search (or search for search) process.
That’s what we have to prove rather than assert, though. MET mechanisms demonstrably have some ability to allow biological organisms to adapt within some limits, so we can’t make a blanket statement that foresight is absolutely required. We need to find out what the limits are and prove them both mathematically and experimentally before we can claim to have positive evidence for ID theory.
I hope that Dr. Dembski’s papers are eventually seen as one small step in that direction.
JJ
146
JayM
01/27/2009
7:57 am
gpuccio @144
The fact that most evolutionary mechanisms result in tiny changes between generations is not just something that djmullen is “affirming”, it is a repeated empirical observation. Even “large” changes like duplication or frameshifts don’t change the genome significantly in the next generation, although they can provide more variety in subsequent generations.
There is much that ID proponents can legitimately argue against MET, but even hinting that djmullen’s characterization is inaccurate suggests a lack of understanding that can be exploited by ID opponents. We need to be careful in our public statements.
I would turn this around and suggest that it is an excellent avenue for ID research. If it can be shown that no path exists, that would destroy MET.
The obvious objection to this suggestion is that we’re being asked to prove a negative. The problem is not, however, as intractable as that. There are a limited number of mechanisms in MET. There are a limited number of base pairs and amino acids. There is at least some historical record of what paths were actually taken. It would be difficult, but by no means impossible to show that a particular protein could not have arisen by MET mechanisms. This would be a true vindication of The Edge of Evolution.
JJ
147
JayM
01/27/2009
8:05 am
gpuccio @144
This overstates the case somewhat. While some full proteins are species specific, most do have underlying partial homologies. It is those that are used as evidence for common descent.
I’m afraid that “extremely varied” isn’t supported by the empirical evidence. There is certainly enough difference to make one question how non-intelligent mechanisms could be the cause, but the underlying amino acid sequences are not all over the map as you imply.
Again, we in the ID camp need to be very careful to address the real positions held by ID opponents rather than strawmen that not only can be easily knocked down but that can also be used to distract endlessly from our core message.
JJ
148
Laminar
01/27/2009
8:36 am
gpuccio:
Obviously I’m not an expert in biology like you, and I’m assuming that all the claims you make about these islands of functional proteins are based on published research. But it is still hard to see how two research papers about the mathematics of search algorithms, which don’t explain why they are relevant to biology, are actually applicable to biology. In fact, as a computer scientist I found it hard to see how the first paper could be applied to any real-world situation, given the way the authors seem to misunderstand some of these algorithms.
What is needed is for an expert in the field like you to take all this evidence on the awkwardness of these protein search spaces and the evidence presented by Dembski and Marks, write it up as a paper and submit it to a biology journal. I doubt it would get published because of the conspiracy but you would still be able to post it on the web as evidence of the way these darwinians are suppressing the truth.
149
gpuccio
01/27/2009
8:50 am
JayM:
# 146
“The fact that most evolutionary mechanisms result in tiny changes between generations is not just something that djmullen is “affirming”, it is a repeated empirical observation.”
What do you mean? Obviously a single point mutation changes only one nucleotide, but if it is a frameshift or a stop mutation a whole protein can change. Deletions and inversions can cause great changes in one single step. The problem is that all those steps are random, and that’s why a frameshift mutation, which is the equivalent of a blind long leap into the ocean, has practically no hope of finding an “island”.
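As a toy illustration of the frameshift point, here is a minimal Python sketch that translates a short DNA string with the standard genetic code, then deletes one nucleotide and translates again. The sequence and the helper names (`translate`, `delete_base`) are hypothetical, made up only for illustration; they are not taken from any real gene, and the sketch says nothing about how often such a change is useful.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(dna: str, index: int) -> str:
    """Return the sequence with the nucleotide at `index` removed."""
    return dna[:index] + dna[index + 1:]


original = "ATGGCCATTGTAATGGGCCGC"  # hypothetical 21-nucleotide sequence
shifted = delete_base(original, 3)  # single-nucleotide deletion

print(translate(original))  # reading frame intact
print(translate(shifted))   # every codon after the deletion is read differently
```

In this made-up case the deletion changes every downstream codon and also introduces an early stop codon, so the translated product is both different and shorter.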
I have the impression that you and djmullen are maybe suggesting that a change can be “small” in regard to the whole genome. But we should not reason about the whole genome. That is senseless. Our unit of reasoning must remain the single protein-coding gene.
There are single point mutations which are incompatible with life. Much more difficult would be to find a single point mutation which gives a reproductive advantage to a complex organism.
Empirical observation of “useful” mutations is limited to microevolutionary events, like antibiotic resistance and even nylonase. In those observed events, the edge of evolution seems to be, at present, about two coordinated useful amino acid substitutions. That is very little to explain how you can get to a new protein of 700 amino acids. Therefore, there is no empirical observation of mechanisms which can eliminate the need for a blind search in the ocean of sequence possibilities.
“but even hinting that djmullen’s characterization is inaccurate suggests a lack of understanding that can be exploited by ID opponents. We need to be careful in our public statements.”
djmullen’s characterization is inaccurate. I take full responsibility for this statement. And I think I am very careful in my public statements (which, obviously, does not mean that I cannot be wrong, like anybody else).
“The obvious objection to this suggestion is that we’re being asked to prove a negative.”
We have nothing to prove. What darwinists are assuming is obviously incredible. It’s they who have to provide some example or model to make it at least debatable. Otherwise, they have no theory, they have no model, and especially they have no empirical support. Which is exactly the case.
“This would be a true vindication of The Edge of Evolution.”
The edge of evolution is already vindicated. But, obviously, we can do even better, and we will.
150
gpuccio
01/27/2009
9:30 am
JayM:
# 147
“This overstates the case somewhat. While some full proteins are species specific, most do have underlying partial homologies. It is those that are used as evidence for common descent.”
And so? Even if you group the proteins which have strong homologies, you still have myriads of separated islands. Who has said that an island must be made of only one protein? Even if myoglobin is present in many living beings, it is still completely different from insulin, or from c-myc. And who has ever criticized the concept of common descent? I accept it. But what has common descent to do with traversing the ocean of possibilities?
Let’s take the example of myoglobin. At some point in natural history, this protein appears. It is about 154 amino acids long. Where did it come from? The fact that, after its appearance, it is maintained in most species means only that the function has been exploited in most species. But the traversing is necessary to get to that function, the first time. That’s why I said that djmullen’s statement that DNA replication was not a search was suspiciously trivial. Maybe he (and you) have not fully considered that the search happens before, when you have to find the information, and not when you simply copy or transmit it?
“I’m afraid that “extremely varied” isn’t supported by the empirical evidence.”
As an answer, I quote again from my previous posts:
from my post # 52:
“The proteins we do know (and we know a lot of them) are really interspersed in the search space, in myriads of different and distant “islands” of functionality.
You don’t have to take my word for that. It’s not an abstract and mathematical argument. We know protein sequences. Just look at them.
Go, for example, to the SCOP site, and just look at the hierarchical classification of protein structures: classes (7), folds (1086), superfamilies (1777), families (3464). Then, spend a little time, as I have done, taking a couple of random different proteins from two different classes, or even from the same superfamily, and go to the BLAST site and try to blast them one against the other, and see how much “similarity” you find: you will probably find none. And if you BLAST a single protein against all those known, you will probably find similarities only with proteins of the same kind, if not with the same protein in different species. Sometimes, partial similarities are due to common domains for common functions, but even that leaves anyway enormous differences in terms of amino acid sequence.”
from my post # 118:
“In the genome of E. coli, a well known bacterium, we can find at least 2018 proteins described, grouped in 525 groups of similar proteins. The biggest group includes 52 proteins, while the smallest groups include only two proteins each, and there are 291 of them.”
Do you still think that “extremely varied” is not supported by the empirical evidence? What empirical evidence are you suggesting?
“but the underlying amino acid sequences are not all over the map as you imply.”
That is simply wrong. They are. Go back to the two proteins I took at random from the E. coli genome. They are more than 700 amino acids long, each one of them. In the best possible alignments, four different ones, you can find only 40 identities (not consecutive). Those two proteins are “all over the map”. And the same can be said if you take any other pair of proteins from different groups in the E. coli genome, or from different families or superfamilies in the SCOP classification. And even proteins with homologies are very different one from the other. Homologies are partial. Some are very partial.
Choosing a conserved part of the c-myc sequence (a very important transcription factor), I have performed a blastp search with a sequence of only 7 consecutively conserved amino acids. 7 amino acids is not much; they correspond to a search space of “only” about 10^9. The search was done against the whole database of known proteins, and guess what? Only c-myc molecules had that exact sequence in their primary structure. You had to drop at least one amino acid to find that sequence in other kinds of proteins.
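The “10^9” figure can be checked with a few lines of arithmetic. This is only a sketch, assuming an alphabet of the 20 standard amino acids and treating every position as independent; the function names are hypothetical, and the 154-residue length is the myoglobin length mentioned earlier in the thread.

```python
import math

AMINO_ACID_ALPHABET = 20  # assumption: the 20 standard amino acids


def sequence_space_size(length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    return AMINO_ACID_ALPHABET ** length


def log10_space_size(length: int) -> float:
    """Base-10 logarithm of the sequence space size, for very long sequences."""
    return length * math.log10(AMINO_ACID_ALPHABET)


print(sequence_space_size(7))           # 1280000000, i.e. about 1.3 x 10^9
print(round(log10_space_size(154), 1))  # 200.4, i.e. about 10^200 sequences
```

The count only measures how many sequences exist; it does not by itself say how many of them are functional.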
Can you see how terribly specific even a small sequence of amino acids can be? Just consider that the specific immunological response in higher animals and humans is targeted against very small amino acid sequences, the epitopes, whose length is usually less than 10 amino acids, and yet those small sequences are so specific that they are the basis for our defenses.
So, you can see that protein sequences are at the same time:
a) extremely varied and interspersed on the whole map of possible sequences
b) extremely specific, with very short sequences representing often a definite signature for one protein.
“Again, we in the ID camp need to be very careful to address the real positions held by ID opponents”
That’s exactly what I am trying to be.
“rather than strawmen that not only can be easily knocked down but that can also be used to distract endlessly from our core message.”
I don’t think I am using any strawmen. You can see that I am very precise and explicit in my arguments and in my answers, trying to address exactly what others, including you, are saying. If sometimes I misunderstand what the other is saying, I am ready to apologize.
151
CJYman
01/27/2009
9:34 am
JayM:
“Thanks for the detailed reply. It appears that I do understand you correctly. In order for Dr. Dembski to present these two papers as support for ID, though, he (or someone) needs to make the links you hint at explicit.”
I see where you are coming from and I think that the reason he doesn’t make the connection explicit is because it seems that these papers are more of a response to inflated claims about how evolutionary algorithms can account for biodiversity. These papers were merely showing that the extent to which an evolving system increases probabilities is proportional to the improbability of finding that system (evolutionary algorithm) in the first place. The papers [along with the NFLT] explicitly detail the fact that in order to increase probability of generating a given pattern, a set of laws matching search space structure and search procedure, which is even more improbable, must first be generated. Thus, evolution is not, as some have claimed, a free source of information — information being an increase of probability. This increase of probability, labeled active information in the paper, is merely pushed back to higher and higher levels.
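For readers who want the three measures in concrete form, here is a minimal Python sketch. It assumes the definitions summarized in the abstract of the first paper, with information measured in bits: endogenous information from the success probability `p` of blind search, exogenous information from the success probability `q` of the assisted search, and active information as their difference. The function names and the example probabilities are hypothetical.

```python
import math


def endogenous_information(p: float) -> float:
    """Difficulty of the problem under blind search: -log2(p)."""
    return -math.log2(p)


def exogenous_information(q: float) -> float:
    """Difficulty remaining for the assisted search: -log2(q)."""
    return -math.log2(q)


def active_information(p: float, q: float) -> float:
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return math.log2(q / p)


# Hypothetical example: blind search succeeds with probability 1/1024,
# while the assisted search succeeds with probability 1/4.
p, q = 1 / 1024, 1 / 4
print(endogenous_information(p))  # 10.0 bits
print(exogenous_information(q))   # 2.0 bits
print(active_information(p, q))   # 8.0 bits
```

A negative value would mean the assisted search does worse than blind search.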
What the paper does show is that this increase in probability seen in the operation of evolution still needs an explanation. Evolution is not an explanation of its own success at increasing probability anymore than an EA can create itself absent foresight.
From reading the paper, we can see that we have a few options … infinite regress of active info. which is akin to “turtles all the way down” and never truly explains increases in information (probability). That “explanation” merely explains away by handwaving.
The other two options are merely chance and law, or foresighted systems. As I have already explained, no one has shown background noise (chance) and an arbitrary set of laws (set of laws with no regard for future results) to be up to the task of generating active information. If someone showed such an experiment, the math within these papers would be falsified, since they forbid such increases in probability. However, we do know through our own experience and through the creation of AI systems that foresight-using systems (systems which model the future and generate targets) do exist and do increase probabilities of generating specified or pre-specified patterns.
I have explicitly, yet briefly, outlined the ID connection in my post #122.
JayM:
“Unless one subscribes to the multiverse theory, it isn’t clear that the physical laws of this universe are the result of any kind of search (or search for search) process.”
The multiverse theory provides no more of an explanation for life and evolution than it provides an “explanation” for this conversation we are having. It is merely a chance of the gaps “explanation” and there are no criteria for when we can invoke it and when we can’t. Can we invoke infinite chance to explain the results of all scientific tests? If so, then we would never discover anything since the proper explanation is that “the multiverse did it.” Sure, using an infinite resource of chance comes in handy when trying to explain everything and anything, but is it the best explanation? This is where other ID arguments come in, such as CSI. If a computer program based on only chance and law (absent previous foresight) won’t produce CSI, why would an infinite amount of such programs be a better explanation rather than foresight, which is indeed characterized by its generation of CSI?
As to the universe being the result of search, as long as it is one of any number of possibilities in “quantum chaos,” then it is a search as defined by these two papers and as I previously explained in my post #121. At least our universe is confined to an extremely small set of possible mathematically described universes which even allow life and evolution. If, however, our universe is the only option, then no, there is no search involved, but this would raise other extremely intriguing questions.
However, the papers would still show that any simulation of evolution requires the previous raising of probabilities before the evolutionary algorithm even starts. As we all know, intelligent humans raise the probabilities by introducing problem specific information into the behavior of the interaction between search algorithm and search space, by programming law and chance in such an improbable way as to solve a given problem in the future. That basic concept has been proven in the NFLT and extended in these two papers. Moreover, human intelligence uses its foresight to add problem specific information.
IOW, all simulations of evolution require foresight so, barring this universe being the only option, on what grounds could someone say that the “real deal” requires no foresight as a scientific fact or that there is “no evidence for teleology within life” or that “ID is unscientific?”
152
jerry
01/27/2009
9:58 am
sparc,
You should hang out with Americans more often. You will pick up common sense and some humor. I know you have referred to us as dummkopfs but you read things too literally. My use of the term SFCSI was meant to show the absurdity of Mark Frank’s objections and my guess you too if you deny FCSI. Do you deny the reasoning of FCSI or that DNA is information or a code? If so then you can join us as dummkopfs.
For those who do not know, sparc is a biologist who is unable to provide any critical analysis on the evolution debate. And as I said earlier on this thread we need more like sparc here who only can contribute inane remarks when they disparage ID. Go Sparc.
153
Patrick
01/27/2009
12:14 pm
CJYman,
If I may layman-ize: Darwinian processes are generally limited to searching for short range targets in local genome space. In order to search for long range targets a set of specific conditions relevant to the long term (“a set of laws matching search space structure and search procedure”) must first be found as well. And a Darwinian search in nature is limited in scope to that which increases overall survivability, not functionality for merely functionality’s sake as with engineering projects using GAs.
Does that summarize correctly?
154
Prof_P.Olofsson
01/27/2009
12:42 pm
CJYman and others,
The only logic I can see in how the “search for a search” paper supports ID would be to claim:
“Either
(a) the darwinian search algorithm was chosen according to the Kantorovich-Wasserstein distribution
or
(b) it was designed.”
Note that the K-W distribution on M(Omega) depends on what metric you choose on Omega so there is no unique way to interpret “randomly choosing a search.”
Informal claims such as “it is as difficult to find a search as it is to find a target” are not supported by the paper unless you make a lot of arbitrary assumptions.
155
Prof_P.Olofsson
01/27/2009
12:56 pm
CJYman and others,
Even if we do accept all the assumptions about “searching for a search,” shouldn’t we compute probabilities rather than talk about averages? I’ll give an example following Dembski&Marks’s construction on page 2 in the “search for a search.”
Let Omega={0,1} and search for 0. If we choose at random, the probability is 1/2 to find it. If we first choose a probability distribution on Omega at random, the probability to find 0 is still 1/2 (compute the expected value of p where p is uniform on [0,1]). That’s the “conservation of uniformity” in this example.
Now instead ask “What is the probability to do better than random search?” This is the probability to find a p that is greater than 1/2 and as p is uniform on [0,1], there is a 50% chance to beat random search. If we “search for a search” twice, we have a 75% chance to beat random search at least once. And so on. In other words, it is easy to beat random search by repeatedly randomly searching for searches.
Now assume that Omega is huge but still finite, Omega={x1,…,xn} where we search for x1. [Note that x1 may represent an entire search string so finding x1 in one step may mean that we have found some target in a finite number of steps.]
The probability to beat random search turns out to be about 37% [approximately = exp(-1)], regardless of n, and searching twice gives us about 60% chance to succeed.
Conclusion: If we “randomly search for a search” at least twice, we are more likely than not to beat random search.
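The figures above can be reproduced with a short Python sketch. It assumes that “choosing a probability distribution on Omega at random” means drawing it uniformly from the probability simplex, so that the probability assigned to x1 follows a Beta(1, n-1) distribution; this is one reading of the construction, not necessarily the one the paper intends. The function names are hypothetical.

```python
import math
import random


def prob_beat_random(n: int) -> float:
    """P(p1 > 1/n) when (p1, ..., pn) is uniform on the simplex.

    The marginal of p1 is Beta(1, n-1), whose upper tail at x is
    (1 - x)**(n - 1).
    """
    return (1.0 - 1.0 / n) ** (n - 1)


def prob_at_least_once(p: float, tries: int) -> float:
    """Probability that at least one of `tries` independent draws succeeds."""
    return 1.0 - (1.0 - p) ** tries


def simulate(n: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of prob_beat_random(n)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(1, n - 1) > 1.0 / n for _ in range(trials))
    return hits / trials


print(prob_beat_random(2))                             # 0.5
print(prob_at_least_once(prob_beat_random(2), 2))      # 0.75
print(prob_beat_random(10**6), math.exp(-1))           # both close to 0.368
print(prob_at_least_once(prob_beat_random(10**6), 2))  # about 0.60
print(simulate(1000))                                  # close to 0.368
```

Under a different way of choosing the random distribution the numbers would change, which is the point about model assumptions made in [154].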
156
Mark Frank
01/27/2009
2:05 pm
Gpuccio
This business of frameshifts is rather interesting. I wish I had time to really understand the papers. But there is a broader thing that concerns me.
I can’t see that frameshift is a Darwinian concept. It is just a way of generating a DNA string from a parent that gives a radically different protein string. All other known methods, it appears, do not generate radically different proteins. So if we come across a protein which is radically different from others and there is no evidence that frameshift was involved, then there is a problem knowing how that protein was generated. But it is a problem for all theories that assume DNA is created by modification of parent DNA. Unless you believe God inserts complete DNA strings into cells from time to time, this is equally a problem for a non-Darwinian theory.
In fact frameshifts could be interpreted as evidence for ID using classic ID arguments. If, as you say, it is highly improbable that a frameshift will generate something useful, then the presence of over two hundred proteins created by frameshift is extraordinarily improbable – unless big G was controlling the shifts.
On the other hand, if the proteins were generated by some as yet not understood method of modifying parents, then it might be a much more plausible mechanism.
157
JayM
01/27/2009
2:27 pm
gpuccio @150
The underlying partial homologies are evidence that supports MET mechanisms of incremental change. You can’t consider just the full protein as though it came into being all at once. Because even those species specific proteins have significant overlap with proteins in other species, evolutionary biologists can and do consider them consistent with MET.
Could you point to some literature cites that support this claim? I have looked for such information for quite some time, because it seems that this is a very promising area for ID research. I haven’t found any clear results that show that genome space is fragmented in this way.
An evolutionary biologist would probably say it came from a protein that was about 153 amino acids long. Or from a very slightly different protein that was about 154 amino acids long.
Unless ID researchers can show this is a mathematical impossibility,
the MET mechanisms remain scientifically credible.
While I appreciate your effort, that isn’t a particularly rigorous process for tracking homologies. You might identify a new homology with that approach, but failure to find one after a handful of searches doesn’t invalidate the entire corpus of evolutionary molecular biology.
Identifying proteins that are a result of a genome in an unreachable area of genome space will require extensive, painstaking work to identify candidates and trace them back to their progenitors in a recent common ancestor with another species. Taking humans as an example, if MET mechanisms are sufficient you should see similar molecular constructs, for a certain level of complexity, in chimpanzees. Other great apes should show less similarity, and other primates less still. If you find a relatively complex construct (protein, gene, etc.) without a corresponding similarity, you’ll have a solid candidate for further research.
I would love to see ID researchers taking this tack.
There are certainly a lot of different proteins, but again you can’t assert that they can’t be arrived at through MET mechanisms without looking at the probable precursors and overlapping sequences in other organisms. MET does not claim that the whole protein came into being in one fell swoop. We have to address what MET actually says, not what we wish it said.
Your work with BLAST does not, unfortunately, support your claims. Instead of looking at entire proteins or genes, you must consider subsequences shared with likely precursors and other descendants from those precursors. This is exactly what molecular biologists and bioinformaticists do on a daily basis. Numerous homologies have been identified and the literature contains more every day.
I will be delighted if you have really identified the disconnected islands that disprove MET, and if you have you should definitely publish your results, but I don’t see support for that in what you’ve presented so far.
ID opponents make some strained arguments, but one that is valid is that anyone who discovered what you claim here would be world famous. It stretches credibility to think that everyone with access to the same tools you used could be Expelled and prevented from publishing.
JJ
158
JayM
01/27/2009
2:41 pm
CJYman @151
I still don’t see this from the papers themselves. When I saw the title about peer-reviewed papers supporting ID, I was excited. After reading them, I can see it will be all too easy for ID opponents to dismiss them as having no biological relevance.
I would love to see a third paper that ties this all together.
Again, this argument seems to be that the physical laws of the universe are designed, but that, given them as they are, evolutionary mechanisms are sufficient to explain biological diversity.
I suspect this is not Dr. Dembski’s position.
That’s great, but it still seems to cede the field of biology to modern evolutionary theory.
An aside due to my inner geek: Is there a formal proof that CSI cannot be generated from evolution simulations?
I don’t believe that “all simulations of evolution require foresight.” We’ve discussed several different genetic algorithms in recent threads here and Dawkins’ Weasel is the exception rather than the rule. I’m not sure how this is relevant to the two papers, though.
JJ
159
Prof_P.Olofsson
01/27/2009
2:55 pm
And, to [154] and [155], I’d like to add a third question: Even if we adopt the model that searches are being searched for via a probability distribution, how can we argue that the distribution must be uniform? Even if you don’t believe the darwinian search can be successful, you have to admit that it exists and is reasonably well understood: DNA is replicated, mutations cause imperfections which lead to new genotypes (vastly simplified). How would you argue that this search is no more likely than one that is uniform over the entire sequence space (“blind search”), which cannot even be given a reasonable biological or physical meaning? The recurring use of the uniform distribution in many attempts to apply math to biology is a model assumption that must be argued just like any other model assumption.
160
Prof_P.Olofsson
01/27/2009
3:07 pm
And, finally, I know that my comments [154,155,159], although directly related to the Dembski & Marks paper, are aside of the main topics of the thread. If somebody is interested, we may take it off the air as well.
161
CJYman
01/27/2009
3:34 pm
JayM:
“That’s great, but it still seems to cede the field of biology to modern evolutionary theory.”
It cedes biology to evolution and it takes evolution for itself.
JayM:
“An aside due to my inner geek: Is there a formal proof that CSI cannot be generated from evolution simulations?”
The only formal proof presented is that CSI won’t generate itself through only chance and law and this is implicit within these two papers. Since CSI is a measurement of finding a specified or pre-specified pattern at better than chance performance, problem specific information is required to find it (according to NFLT). Problem specific information is measured as active information within these two papers. Do you see the connection?
CSI can be transferred through evolution, but will never be generated from scratch by evolution. Why? Because these two papers show that active information continually regresses to higher and higher levels of search. So, if active information is necessary for the transferring of CSI, then evolution can’t generate CSI. It can only transform active information into CSI.
JayM:
“I don’t believe that “all simulations of evolution require foresight.” We’ve discussed several different genetic algorithms in recent threads here and Dawkins’ Weasel is the exception rather than the rule.”
My apologies for not being clear.
If we are discussing any simulation relevant to biology, then evolution is the discovery and increasing of CSI. Can you provide any simulation of biological evolution which discovered CSI from initial conditions that were not programmed with any consideration for future results?
JayM:
“I’m not sure how this is relevant to the two papers, though.”
It’s more relevant to the NFLT, where it is important that problem specific information be incorporated into the behavior of the algorithm (almost an exact quote), thus searching for a given pattern becomes as difficult as searching for the problem specific information necessary to discover that pattern at better than chance performance. These two papers merely provide a metric for measuring problem specific information and showing that it regresses to higher and higher level searches.
The relevance to ID is that we seem to have only three choices which I’ve already discussed … an infinite regress of active info., a falsification of the paper by showing active info generating through background noise and arbitrary collection of laws, or use a foresighted system to program the initial conditions — actually incorporating problem specific information through knowledge of the future problem to be solved.
162
CJYman
01/27/2009
3:36 pm
Hello Prof. Olofsson,
I’m not sure that I follow you, however if what you are stating is true and actually has practical effect (ie: actually happens in the real world), why doesn’t anyone just show that background noise (chance) and an arbitrary set of laws (set of laws collected without any consideration for future results — absent foresight) will produce systems which process signs/units and evolve into greater and greater specified complexity, and just falsify ID theory and be done with it. If evolution is so powerful absent previous foresight, why do programmers bother programming boundary conditions for future known targets into the evolutionary algorithms used to solve problems (ie: max efficiency antenna shape)?
163
Prof_P.Olofsson
01/27/2009
4:16 pm
CJYman[162],
Let me clarify: I am questioning the claims that the “search for a search” paper is pro-ID as was claimed in the introduction. In order to be considered pro-ID, it would have to have some implications for biology.
I ask, for example, (1) what is the rationale behind claiming that search algorithms are chosen according to probability distributions, in particular the K-W distribution [154]? and (2) if they are, why do these distributions have to be uniform [159]? and point out (3) even if they are uniform, they are still quite likely to beat random search [155].
164
R0b
01/27/2009
4:21 pm
Prof_P.Olofsson:
One of the arbitrary assumptions involved in measuring active information even violates Dembski’s own repeated warning. He has told us several times that “how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration.” And yet, the active information measure depends very much on how we individuate the possibilities.
165
R0b
01/27/2009
5:02 pm
Prof_P.Olofsson:
Indeed, Olle Haggstrom’s response to the active info approach is titled “Uniform distribution is a model assumption”. The question asked by Haggstrom and others is, why should we always expect uniform randomness? In fact, it’s an impossible expectation. If everything were characterized by a uniform distribution, then that would be a non-uniform distribution of distributions.
166
R0b
01/27/2009
5:38 pm
CJYman:
Just for clarification, are you saying that this formal proof is implicit in these two papers? (An implicit formal proof sounds like an oxymoron to me.) If not, where is this formal proof to be found? Thanks.
167
gpuccio
01/27/2009
6:15 pm
JayM (# 157):
I appreciate your efforts, but I am afraid I can only repeat for you what I have already said about djmullen: your characterization is inaccurate.
“The underlying partial homologies are evidence that supports MET mechanisms of incremental change.”
Absolutely not. There is nothing in the homologies which shows incremental change in the sense of a gradual modification of the function. The conservation of domains has a functional basis, but sometimes the same domains and 3D structures are obtained, in different species, with very different primary sequences. And there are a lot of different domains, and different foldings, whose biochemical nature is completely different. If you were familiar with the biology of proteins, you would never say the things you say.
It is true that darwinists affirm that proteins have formed by incremental change in function, but that is only one of the many just so stories. There is no empirical confirmation for that, and a lot of theoretical impossibilities. Why do you think that ID exists at all?
“You can’t consider just the full protein as though it came into being all at once.”
I am not saying that it came into being at once, I am saying that it was designed. But you seem to ignore that parts of one protein are not functional. You need a whole protein, with its correct folding and domain, to get a function. Very big proteins can be multidomain, and can be deconstructed in different functional portions, but proteins like myoglobin are very compact globular proteins, and the whole sequence must be there. That’s why myoglobin has approximately the same length in most species, with only minor variations. That is true of many important proteins.
“An evolutionary biologist would probably say it came from a protein that was about 153 amino acids long. Or from a very slightly different protein that was about 154 amino acids long.
Unless ID researchers can show this is a mathematical impossibility,
the MET mechanisms remain scientifically credible.”
We all know what evolutionary biologists would say, and that’s why we are here in ID. A 153 aminoacids myoglobin would still be a myoglobin, with the same 3D structure and the same function. Not all myoglobins are “exactly” 154 aa (that is the length of the human form, and of most examples). Some are of 146, 147, 149 and so on. But those are minor differences, in part probably due to neutral evolution. But the protein is essentially the same.
But there is no model which can explain the step by step derivation of myoglobin from some other simpler protein with a different function. And remember, such a model should explain not only the variations in aa sequence, but also the variations in function which allow the selection and expansion of each single intermediate. Again, that’s why I am in ID, and not in the darwinian field. You can stay where you like, or just go on observing and posting.
“Could you point to some literature cites that support this claim? I have looked for such information for quite some time, because it seems that this is a very promising area for ID research. I haven’t found any clear results that show that genome space is fragmented in this way.”
There is no need to support my claim. The data are there, for anybody to look at them. I have just shown you a few examples. Go to SCOP, do some blast homework, and you will see for yourself. Everybody who works with proteins knows what I am saying.
“While I appreciate your effort, that isn’t a particularly rigorous process for tracking homologies. You might identify a new homology with that approach, but failure to find one after a handful of searches doesn’t invalidate the entire corpus of evolutionary molecular biology.”
Blastp is one of the most widely used tools to look for homologies. I have also used ClustalX, with the same results. Do you really believe that there is a way to find significant homologies between two proteins like dinG and clpA? Not even darwinists are so smart.
And the entire corpus of evolutionary molecular biology is based on false assumptions, and that’s what invalidates it.
“Taking humans as an example, if MET mechanisms are sufficient you should see similar molecular constructs, for a certain level of complexity, in chimpanzees.”
Again you miss the point. There are proteins which are almost identical in extremely different species. That is easily explained as functional conservation, but tells us nothing about the emergence of the function. Neutral evolution, and especially synonymous mutations, are interesting tools to try to understand variations due to time and random errors, but tell us nothing of the mechanism of function generation.
“MET does not claim that the whole protein came into being in one fell swoop. We have to address what MET actually says, not what we wish it said.”
I am still waiting for them to say how that happened. Can you help?
“I will be delighted if you have really identified the disconnected islands that disprove MET, and if you have you should definitely publish your results, but I don’t see support for that in what you’ve presented so far.”
If you can’t see what is under the eyes of all, I really can’t help you.
“ID opponents make some strained arguments, but one that is valid is that anyone who discovered what you claim here would be world famous. It stretches credibility to think that everyone with access to the same tools you used could be Expelled and prevented from publishing.”
I would be very pleased if I could become famous so easily. Unfortunately, one does not become famous for saying what is known to all, and is easily retrieved in public databases. I am just discussing the interpretation of what is known to all. The only reason why I spent all this time giving you some real examples, is that you (and djmullen) were saying completely wrong things with a certainty which, I hope, could derive only from your ignorance of the matter.
168
R0b
01/27/2009
6:43 pm
CJYman:
One problem with the connection that you’re drawing is equivocation on the word “chance”. Under the active info framework, chance means a uniform distribution. Under the old CSI framework, it meant any distribution, including the distribution conferred by RM+NS.
Dembski in NFL: Chance as I characterize it thus includes necessity, chance (as it is ordinarily used), and their combination.
Dembski in his Specification paper: Moreover, H, here, is the relevant chance hypothesis that takes into account Darwinian and other material mechanisms.
So products of evolution have no CSI, by definition, in spite of the fact that evolutionary processes have boatloads of active information.
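For readers following the terminology, here is a minimal sketch of the three measures as the abstract quoted in the post describes them: endogenous information for blind (uniform) search, exogenous information for the assisted search, and active information as their difference. The function names and the two probabilities are hypothetical, chosen only to make the arithmetic visible; nothing here is taken from the papers' own examples.

```python
import math

def endogenous_information(p_blind: float) -> float:
    """Difficulty, in bits, of hitting the target by blind (uniform) search."""
    return -math.log2(p_blind)

def exogenous_information(q_assisted: float) -> float:
    """Difficulty, in bits, that remains for the assisted search."""
    return -math.log2(q_assisted)

def active_information(p_blind: float, q_assisted: float) -> float:
    """Difference between the two: the bits credited to problem-specific information."""
    return endogenous_information(p_blind) - exogenous_information(q_assisted)

# Hypothetical numbers, for illustration only.
p = 2.0 ** -20   # blind search succeeds with probability 1 in 2^20
q = 2.0 ** -4    # assisted search succeeds with probability 1 in 16

bits_added = active_information(p, q)
print(bits_added)  # 20 bits endogenous minus 4 bits exogenous
```

Note that the baseline `p` is fixed by the uniform assumption, which is exactly the point under dispute in this comment.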
169
gpuccio
01/27/2009
6:52 pm
Mark (#156):
I come back to you with real pleasure. Your posts are always intelligent, sincere and creative.
I agree with you that frameshift mutations as a source of functional variation should not be a classical darwinian concept. The fact that they are pursuing it so earnestly, apparently even after the debunking of nylonase, is just a sign, for me, of how desperate they are. Apparently, not all darwinist biologists share JayM’s faith in an unfragmented proteome.
“So if we come across a protein which is radically different from others and there is no evidence that frameshift was involved, then there is a problem knowing how that protein was generated.”
You bet!
“But it is a problem for all theories that assume DNA is created by modification of parent DNA. Unless you believe God inserts complete DNA strings into cells from time to time then, this equally a problem for a non-Darwinian theory.”
Yes and no. The implementation of information remains a problem, but if you have not to look for that information, or at least if you can look for it with intelligent means, the main obstacle is overcome.
Let’s take an example. In the case of antibody maturation, which I have cited, the intelligent procedure embedded in the immune system realizes a very significant increase of the affinity of the primary antibody response in just a few months. As I have said, that is attained through targeted hypermutation and intelligent selection through direct measurement of the affinity of the new clones to the antigen. And still, that is a rather indirect method, because the system does not know in advance the sequence to be found, but only the function to be measured.
But let’s imagine that an intelligent force, which knows the sequence to be obtained, like the weasel algorithm, could just “fix”, by some biochemical, or quantum, method which we presently don’t know, any random variation which is correct. Or, better still, just induce the right variation directly, “guiding” the apparently random events at the molecular level. Wouldn’t that make the implementation of the information extremely easier?
You see, the real problem is the search. It’s the search which is completely out of any possibility, without any intelligence and any information about the result to be attained. But, as GAs show, if you have a good oracle, you are in a completely different situation.
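The scenario described above, in which a process that already knows the target simply keeps any random variation that happens to be correct, can be sketched as follows. This is a toy illustration of that specific "fix what matches" variant, not Dawkins's original program; the name `latching_search`, the alphabet, and the seed are assumptions made for the example.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "  # 27 symbols

def latching_search(target: str, rng: random.Random) -> int:
    """Re-randomize only the wrong positions each round; return rounds needed."""
    if any(ch not in ALPHABET for ch in target):
        raise ValueError("target must use only uppercase letters and spaces")
    current = [rng.choice(ALPHABET) for _ in target]
    rounds = 0
    while "".join(current) != target:
        rounds += 1
        for i, wanted in enumerate(target):
            # The oracle step: a position is redrawn only if it is still wrong,
            # so correct letters are "fixed" once found.
            if current[i] != wanted:
                current[i] = rng.choice(ALPHABET)
    return rounds

target = "METHINKS IT IS LIKE A WEASEL"
rounds = latching_search(target, random.Random(0))
blind_space = len(ALPHABET) ** len(target)  # size of the space a blind search faces

print(rounds, blind_space)
```

The contrast between `rounds` and `blind_space` is the whole point of the comparison: the per-position feedback, not the random draws, does the work.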
And, in a sense, God could well insert complete DNA strings into cells from time to time. After all, that’s what plasmids and transposons, and even ERVs, seem to do. Why couldn’t God do the same, directly or indirectly?
Frankly, frameshift mutations are so problematic that I don’t believe that even the Designer would use them. But I think there is some literature about genes which can be read in two different ways, both functional, through a different (frameshift) start of transcription (but I can’t remember the details now). Well, that would be tremendously smart design! Like some brilliant enigmas, or some paintings by Escher.
170
djmullen
01/28/2009
3:39 am
gpuccio:
Let’s see, all I have to do to convince you is to basically document every single evolutionary step from the first reproducing chemical to you, me and every other protein in every other organism on the planet?
Fine. Let’s see you do the same thing. You say that some proteins could not possibly have been produced by a series of small evolutionary steps. OK. Which proteins? If they weren’t produced by evolution, then what steps were taken to produce them? Was some mega-mutation necessary to produce some particular protein, changing hundreds or thousands of DNA bases at once? Which bases were changed? What were they before and after the change? When did this happen? How was it done? How was the correct pattern found? Please document your claims. Saying “The Designer” is not documenting anything unless you can tell me about The Designer. Who is he, where is he, how did he get this info? Do you have any independent evidence of his existence other than your protein sequences which you claim can’t be produced by evolution?
While we’re on this subject, what is your expertise in proteins? I’ve been reading your messages about BLASTing protein sequences and I’m uncomfortably reminded of Salvador Cordova’s Avida fiasco. Have you had any training on proteins or the BLAST software?
171
gpuccio
01/28/2009
4:44 am
djmullen:
How boring. Typical behavior once again. When the discussion is no more rewarding, just:
1) Make extreme and false affirmations about what the other has said:
“Let’s see, all I have to do to convince you is to basically document every single evolutionary step from the first reproducing chemical to you, me and every other protein in every other organism on the planet?”
I would have been happy with a single example for one protein, just as a start…
2) Shoot a series of irrelevant and rather silly questions as a form of personal attack:
“Fine. Let’s see you do the same thing. You say that some proteins could not possibly have been produced by a series of small evolutionary steps. OK. Which proteins?”
Practically all of them.
“If they weren’t produced by evolution, then what steps were taken to produce them?”
Please see my detailed answers to Mark at #134 and #169. In brief, possibly the same mutations and selections hypothesized by darwinists, but intelligently guided, and not random.
“Was some mega-mutation necessary to produce some particular protein, changing hundreds or thousands of DNA bases at once?”
Not necessarily. But it’s always a possibility.
“Which bases were changed?”
Those which, according to the design, needed to be changed.
“What were they before and after the change?”
Before the change they were those of the ancestor, after the change those of the target protein.
“When did this happen?”
In the course of natural history, as new species appeared. But some of my IDist friends would prefer the theory of frontloading, which I personally don’t like very much. You choose.
“How was it done?”
I don’t know. That’s open to research. But again, if you read my answers to Mark, you will find some suggestions.
“How was the correct pattern found?”
By design. Again, in my answers to Mark you will find two different possibilities: a) the designer knew the solution; b) the designer knew the searched for function, and could measure it.
“Please document your claims. Saying “The Designer” is not documenting anything unless you can tell me about The Designer.”
Then how is it that you are here, on an ID blog? Let me understand, you are listening to people who have always stated clearly that they have, at present, no scientific knowledge about the designer, and you go on discussing in detail many serious issues of the theory with them, and now suddenly you come out with such fundamental disbelief in its basic premises? Tell me, what’s happened?
“Who is he, where is he, how did he get this info?”
My personal (non scientific) opinion? He is a God, He is transcendent, but also very much present in His creation, He can easily get all the info He needs. But I would not discuss these non scientific points here.
“Do you have any independent evidence of his existence other than your protein sequences which you claim can’t be produced by evolution?”
A lot. But it’s not scientific evidence, at least not yet. But, at the scientific level, I am very happy with my protein sequences which can’t be produced by evolution. Very happy indeed.
3) Bring the personal attack at another level, challenging the other’s competence and doubting his “authority”, possibly tying him to some other supposedly uncomfortable people:
“While we’re on this subject, what is your expertise in proteins? I’ve been reading your messages about BLASTing protein sequences and I’m uncomfortably reminded of Salvador Cordova’s Avida fiasco. Have you had any training on proteins or the BLAST software?”
First of all, my expertise in proteins is not your business. This is a blog, and not a scientific journal. And I, and all the other sincere people who come here, have never spoken “from authority”, or requested any academic title or position from others.
Second, I feel honored of being compared to Salvador Cordova.
And third, as I have said many times here and it’s not a secret, I am a Medical Doctor.
And finally, I am not really looking forward to your response.
172
Mark Frank
01/28/2009
6:12 am
Gpuccio
Re #169
Thanks to you I am learning a lot of biochemistry. Debate is a very powerful learning process if properly facilitated – in this case we are doing well at facilitating ourselves.
The example of antibody maturation is rather interesting. If I understand it correctly, it really shows up the difference between non-design approaches and design approaches to solving problems. In particular, because it is a restricted problem, it makes clear the negative nature of the evidence for the design – at least in this case.
As I am not a biochemist I will check my understanding of some basics:
1) When an antigen enters a vertebrate system then after an initial generic response the vertebrate immune system creates an antibody that is specific to that antigen (it locks on to it and marks it for destruction by T cells)
2) This specific antibody is created by vastly increasing the usual rate of mutation in some specific genes and areas of genes (but not in the germ cells so the mutation is not inherited)
3) Once a successful antibody is created presumably there is then some feedback to the genes to increase production of the successful antibody
I will also assume that
4) The target area (the combinations that will successfully lock onto the antigen) is so small and the domain so large that it is effectively impossible to reach the target by simply spinning the wheels and hoping to come down with the right combination.
5) There is no value in “partially fitting” the antigen – a mutation either fits or it doesn’t
Now we have an interesting scientific problem. A non-design approach is to explore how that search might work. Options include:
* The antigen somehow causes mutations which are closer to the target
* The target area of all possible antigens is not randomly distributed around the domain but is clustered – so if the starting position(s) for mutations was in the cluster it would greatly increase the chances of hitting the target
* Mutations are not random (random in the sense that any position in the relevant genes is equally likely to mutate and is equally likely to change to one of the other three bases). And this lack of randomness increases the chances of hitting a target.
There may be other options to explore but these exist and I understand they are currently being actively researched.
Now what about the design approach?
You are positing that an intelligence intervenes every time an antigen enters an individual vertebrate to direct the search for an antibody. The evidence for this intervention is the improbability of all current non-design solutions.
Now here is the crunch. What evidence is there for this design explanation other than the failure of known non-design explanations? When talking of bacterial flagella etc this point gets obscured. It is possible to argue for irreducible complexity, and talk about codes and symbols etc, as positive evidence for design, because you are talking about the mystery of the whole transcription system etc. But in this case things are simpler. We are taking most of biology and the transcription system for granted. As far as this issue is concerned all the rest might be designed or not. We are only asking how do mutations end up creating the appropriate antibodies? You can see the negative nature of the design explanation by imagining that someone comes up with a plausible non-design explanation. What evidence remains for a design explanation? None. By finding a plausible non-design explanation you have removed all evidence for design. Ergo the only evidence for design was the lack of a plausible non-design explanation.
173
gpuccio
01/28/2009
8:42 am
Mark:
My compliments, you have understood much about the immune system, but you are still missing some key points (not your fault). And I am afraid you are also missing my point in citing that system as an example (again, not your fault: I wrote very quickly, taking too many things for granted).
So I will proceed this way: in this post I will try to make a general review of how the immune system works, staying as simple as possible (not an easy task). And in my next post I will elaborate better on my point about that system in our discussion.
First of all, it should be clear that for simplicity we will talk only of the B cell response (antibody mediated).
So, B cells are specialized lymphocytes present in the body wherever there is lymphatic tissue, and in the blood. They derive from hematopoietic stem cells, which become committed to maturation as B lymphocytes. During the first part of life (fetal life and the first few years) the immune system undergoes a process of ontogenesis and maturation which brings it to full functionality. During that process, mature B cells are formed which we will call here “virgin” B cells (because they have never met the antigen).
Now, a very important thing happens during the maturation of virgin B cells. In the single clones of B cells, a recombination happens at the level of DNA, in a series of genes which are implied in antibody production. As a consequence of that DNA recombination, each final clone of B cells has a different DNA (in that tiny part), and produces a different antibody. Each of those clones is represented by a number of lymphocytes with the same DNA recombination, and is distributed in the body. So, that is what is called the primary antibody repertoire. It is through it that we can have the primary immune response to almost any antigen in nature.
Now, we have to notice some important points:
a) The target of antibodies are mainly small peptide segments, usually about ten aminoacids long, called epitopes. While antigens are big molecules, usually proteins, epitopes are the real unit which is recognized by antibodies. So, the combinatorial space of possible epitopes is big, but not huge (about 20^10).
b) The primary repertoire is achieved through a process of recombination which includes many random components, in a very controlled scenario. That’s what allows it to achieve a repertoire which is blind, in the sense that it covers in an interspersed, random way, the combinatorial space of possible epitopes. No information from outer antigens is used in that process. In the end, primary antibodies are in enough number (nobody knows how many) and sufficiently interspersed in the space, that almost any possible antigen will meet one, and usually many, antibodies which can bind to it. Indeed, the primary immune response is polyclonal. But for the same reasons the affinity of the primary antibodies to one specific antigen is almost always low.
c) B (and T) lymphocytes are practically the only cells in the body which undergo a somatic modification of their DNA. Their DNA is therefore slightly different from the DNA of all other cells.
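As a quick check on the figure in point a), the size of the epitope space can be computed directly. This assumes, as the comment does, 20 possible amino acids at each of about ten positions; the variable names are just for illustration.

```python
AMINO_ACIDS = 20      # standard amino acid alphabet
EPITOPE_LENGTH = 10   # "about ten aminoacids long", as stated in point a)

epitope_space = AMINO_ACIDS ** EPITOPE_LENGTH

print(f"{epitope_space:,}")    # 10,240,000,000,000
print(f"{epitope_space:.3e}")  # 1.024e+13
```

So "big, but not huge" here means roughly ten trillion possible epitopes.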
Now, let’s go to the primary response. When some antigen enters the organism, it is processed by specialized cells, the antigen presenting cells, whose role is to expose the epitopes on their surface, correctly associated to other molecules, so that virgin B cells may be exposed to them, until the clones which have an antibody which can bind the epitope are found and their proliferation begins, stimulated and controlled by other lymphocytes, the regulatory T cells. The competent clones proliferate and start producing their antibodies, giving so a first defense to the organism, after just a few days from the exposure to the antigen. That is the primary response. It is specific, but at low levels of affinity.
Well, I think I will stop here and continue in the next post.
174
gpuccio
01/28/2009
9:18 am
Mark:
Well, we have seen how the primary antibody response takes place. I would like to remark again that the primary repertoire is built without any information from outside antigens, and that its production can be already considered a highly engineered procedure, using random variation in a strictly controlled scenario, and attaining the best blind defense for any (or most) possible antigens.
But once an antigen really enters the body, information is inputted from the environment in the form of the epitopes, and is processed and stored in the antigen presenting cells (APCs). It’s that information which allows the specific selection of active antibodies from the pool of the repertoire, and their amplification through proliferation of the appropriate B cells. Let’s remember also that that process is strictly controlled by the T compartment.
Then, in the following months (3-6), ensues the very interesting process of antibody maturation. That is less well understood, but we know many things about it.
The process implies two different mechanisms:
a) Somatic hypermutation, probably due to specific mutating enzymes, strictly targeted to the small region of DNA coding for the active site recognizing the antigen (about 2000 bp)
b) Selection of the resulting mutated clones according to their affinity to the antigen: the clones with higher affinity are stimulated, and those with lower affinity are suppressed. The process is cyclically repeated in the Germinal Center of the lymph node, until high affinity is achieved. Future, secondary responses to the same antigen are based on those high affinity antibodies, and are much more efficient.
A few comments. The hypermutation is the random part of the process, but again it is highly engineered: it takes place only in a very short segment of DNA (the appropriate one). It is probably well controlled as to its rate and modality, and is actively accomplished through appropriate enzymes. And it takes place in a very precise window of the reproductive cycle of the lymphocyte.
Now, let’s go to the selection. It is almost certainly based upon exposure of each mutated clone to the antigen stored on APCs, and the following inhibition or stimulation of the clones is probably effected by the usual T regulation lymphocytes, and certainly in a very precise manner, very strictly dependent on the measured affinity.
So, in this second process we see again the application of targeted and controlled random variation, like in the building of the primary repertoire, but this time it:
a) starts with a pool of antibodies which are already a functional island of the space (they have been selected from the primary repertoire by the exposition to the antigen).
b) is followed by measurement of the resulting affinity after the variation, and the consequent stimulation or inhibition of the clone.
In other words, here the information inherent in the antigen plays a fundamental role, and “guides” the process towards its goal.
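The cycle described above (targeted random variation, then stimulation or suppression according to measured affinity) can be sketched as a toy loop. This is only an illustration under loud assumptions: antibodies are short strings, `affinity` is a made-up score counting matches against a hidden `best_binder`, and `hypermutate`, `mature`, and every parameter are hypothetical names, not a model of real germinal-center biology. The loop never reads `best_binder` directly; it only sees the affinity score, mirroring the point that the system knows the function to be measured but not the sequence to be found.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def affinity(antibody: str, best_binder: str) -> int:
    """Toy affinity: number of positions agreeing with a hidden ideal binder."""
    return sum(a == b for a, b in zip(antibody, best_binder))

def hypermutate(antibody: str, rng: random.Random) -> str:
    """Random point change at one position of the targeted region."""
    pos = rng.randrange(len(antibody))
    return antibody[:pos] + rng.choice(AMINO_ACIDS) + antibody[pos + 1:]

def mature(start: str, best_binder: str, cycles: int,
           clones_per_cycle: int, rng: random.Random) -> str:
    """Repeat variation plus affinity-based selection for a number of cycles."""
    best = start
    for _ in range(cycles):
        clones = [hypermutate(best, rng) for _ in range(clones_per_cycle)]
        clones.append(best)  # the parent clone competes too
        # Selection step: the highest-affinity clone is "stimulated",
        # the rest are "suppressed".
        best = max(clones, key=lambda ab: affinity(ab, best_binder))
    return best

rng = random.Random(42)
best_binder = "".join(rng.choice(AMINO_ACIDS) for _ in range(12))
primary = "".join(rng.choice(AMINO_ACIDS) for _ in range(12))  # low-affinity start
matured = mature(primary, best_binder, cycles=200, clones_per_cycle=20, rng=rng)

print(affinity(primary, best_binder), affinity(matured, best_binder))
```

Because the parent always stays in the pool, affinity in this sketch can only stay the same or rise from cycle to cycle.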
I hope that’s clear enough. If you have any doubts, please ask.
Now, in the following post, my comments and my point.
175
rna
01/28/2009
9:34 am
gpuccio:
I would be really careful in any statements regarding the possible distribution of functional proteins in sequence space. This is an area under intense experimental investigation and I don’t think a consensus view is emerging yet. Let me give some examples:
The first comes actually from DNA or RNA in vitro selections. This is of course an artificial and simplified example but it yields an idea about the distribution of functional solutions to a chemical problem (enzyme activity, binding of a ligand, regulation) in sequence space. In vitro selection experiments typically start with a pool of random RNA sequences of 40 to 70 nucleotides in length e. g. with a sequence space of 4 to the power of 40 or ~ 10 to the power of 24. Due to technical limitations normally only 10 to the power of 14 molecules can be synthesized and investigated (more do not fit in a test tube). Thus only a very tiny fraction of the possible sequence space can be explored in these experiments. Yet, in these experiments normally not only one but multiple solutions to a given problem are found and normally one must restrict oneself to look only for the most abundant solutions. This seems to indicate that the sequence space is rather rich in possible solutions = functional sequences and moreover that for any given problem multiple structurally and sequentially unrelated solutions can be found. The functional RNAs found in such experiments do not only do trivial things but there have been molecules e. g. working as RNA-polymerases, catalyzing peptide-bond formation or catalyzing chemical reactions that are not even possible with naturally occurring protein enzymes.
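To put numbers on the in vitro selection example, the fraction of sequence space that such a pool can sample is easy to work out. The figures below are the ones given in the comment (40 nucleotides, four bases, about 10^14 molecules); the variable names are only illustrative.

```python
BASES = 4
LENGTH = 40               # shortest pool length mentioned (40 to 70 nt)
POOL_SIZE = 10 ** 14      # molecules that can be synthesized and screened

sequence_space = BASES ** LENGTH            # 4^40 = 2^80, about 1.2e24
sampled_fraction = POOL_SIZE / sequence_space

print(f"{sequence_space:.2e}")
print(f"{sampled_fraction:.1e}")
```

Even at the shortest length, fewer than one sequence in ten billion is ever tested, which is what makes finding multiple unrelated solutions in such pools informative.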
Second, many examples are known where proteins with totally different folds and unrelated in sequence catalyze exactly the same reaction using the same chemistry such as the proteases subtilisin and trypsin but there are many more examples.
Third, the same reaction can be catalyzed using enzymes not only unrelated in structure and sequence but also using very different chemistry such as the metalloproteases and the cysteine proteases but examples for many other enzyme classes can be found. If anything this argues for a sequence space rich in functional proteins.
Fourth, there are examples where the exchange of a single amino acid or very few amino acids leads to a novel stable three-dimensional fold of a protein with a change in fold being a prerequisite for a possible novel function. On the other hand each functional protein sequence seems to be surrounded by similar sequences with the same fold and function in sequence space such as the functional myoglobins from different species.
Thus, the assumption you made of functional proteins being remote islands in sequence space is maybe a little premature.
176
gpuccio
01/28/2009
9:45 am
Mark:
Now, that’s where you have misunderstood my point:
“You are positing that an intelligence intervenes every time an antigen enters an individual vertebrate to direct the search for an antibody. The evidence for this intervention is the improbability of all current non-design solutions.”
No, I was not saying that. I don’t believe that a conscious intelligence (the designer) is directly active in the process of the immune response.
What I believe is that the process of immune response, in all its parts, is a very intelligent algorithmic procedure embedded in the immune system, probably in that part of the genome which controls procedures, and which we don’t understand very much. For the rest, it works like any other software: a complex algorithm which produces intelligent results mechanically, because it has been programmed to do so.
So, the designer does not intervene directly, but he does intervene indirectly, through the programmed procedure.
Now you will have noticed that the procedure is very complex, and that it requires the careful interaction of many different cell types. So, it is certainly a very good example of complex reality which suggests design.
But that was not the reason why I have cited it in our discussion. As you remember, our discussion was about the modalities by which specific proteins could be obtained in a design perspective. I quote from my post:
“Let’s take an example. In the case of antibody maturation, which I have cited, the intelligent procedure embedded in the immune system realizes a very significant increase of the affinity of the primary antibody response in just a few months. As I have said, that is attained through targeted hypermutation and intelligent selection through direct measurement of the affinity of the new clones to the antigen. And still, that is a rather indirect method, because the system does not know in advance the sequence to be found, but only the function to be measured.”
So, my reasoning is as such. We don’t know the modalities by which the designer implements the information “directly”, when he designs the genome or its variations, but we have at least this interesting model where the designer “indirectly” designs proteins, through an embedded procedure. And this model is interesting because it uses partial random search, coupled to function measurement for selection. That is also the main method used by protein engineers today.
So, as we know that the designer uses that kind of algorithm in the model of the immune system, it could be reasonable to hypothesize that he could use it also in building the genome information, but this time “directly”, for instance through targeted, guided random variation (think of some possible hypermutation process on a duplicated gene), followed by function measurement and direct selection (which need not pass through the long process of genome expansion through reproductive advantage; the “good” results could just be kept, and the bad results passed again through hypermutation).
Alternatively, I have offered also the possibility that the designer may work knowing already the solution: in that case, he could still use targeted hypermutation, but just “keep” the correct nucleotides, without any preliminary function measurement.
The third alternative is direct, intelligent mutation. That would obviously be the easiest way.
That was simply my argument. I hope it is clear now.
177
gpuccio
01/28/2009
10:14 am
rna:
Thank you for your very correct post, with which I agree completely. What you say is true, but I am afraid you have misunderstood my point (I must not be in good form in these days).
In the above discussions, I was not arguing on how rich the general space of proteins is of functional islands, or on how big those islands of functionalities are. I have often discussed that problem elsewhere, and, although I do believe that the islands of functionality are really distant and interspersed, I am perfectly aware that the problem is not at all solved, and that it is very important. All the facts that you cite are correct, even if there are other aspects, which I will not detail here. Mark can witness how, even on his blog, I have often pointed to the measurement of the size of the functional space as to one of the fundamental problems, and like you I expect important clarifications from ongoing research in that field.
But in my posts I was just responding to the statements made by others that all functional proteins would lie in a “sweet spot” of the search space, and that therefore no search was really necessary. That is simply false.
Now, to be more clear, let’s take all the known proteome from any database of functional proteins. Let’s forget, for a moment, the relationship with function, and let’s look simply at their primary structures. What I was saying is that those primary structures are absolutely interspersed in the space of possible sequences (obviously, with similar proteins sharing various levels of homology). That is absolutely true, and requires a search to get to the different islands of functional structures. We may discuss on how easy it could be to get to some island in a search (in other words, we could discuss if the search space is more like the coast of Maine or like the Pacific Ocean), but it is absolutely true that it is an ocean with a lot of islands.
The things you say confirm my point: it is true that two proteins may have similar 3d structure and function, and completely different primary structure (although that’s more an exception than the rule). But you see, random variation (the search) is supposed to work on the primary structure, knowing nothing of the function. It is only the selection part which is interested in the function, and with all the restraints which I have already discussed in this thread. So, those two proteins are distant islands for the search, because of their different primary structures.
Therefore, while you say “Thus, the assumption you made of functional proteins being remote islands in sequence space is maybe a little premature.”, I have to counter that I made no assumption: the functional proteins we know “are” remote islands in sequence space. You may argue that they could be very big islands, and not so distant as I believe. That can be discussed. But there is no doubt that they are islands, and that they are interspersed in the ocean. The “sweet spot” is only a sweet invention.
178
Prof_P.Olofsson
01/28/2009
10:25 am
ROb[165],
Good point!
179
Mark Frank
01/28/2009
11:22 am
Gpuccio
That was very clear and informative, thanks. There were a number of things I did not understand, both about antibody maturation and what you believe. I realise now that you proposed antibody maturation only as an analogy of the supposed design process and not as an example.
Mark
180
CJYman
01/28/2009
12:22 pm
Prof. Olofsson:
“Let me clarify: I am questioning the claims that the “search for a search” paper is pro-ID as was claimed in the introduction. In order to be considered pro-ID, it would have to have some implications for biology.”
It only shows that random search won’t match search algorithm to search space in order to increase the probability of finding a given target. From what I understand, NFLT has already proven that in order to increase the probability of search, the search procedure needs to be matched to the correct search space.
The NFLT could be presented with Dembski and Marks’s paper as an hypothesis with ID connections as I’ve previously explained in my comments #121 and 122, and could be falsified as shown in my comment #162.
Simply, if biological evolution can be modeled as a search and if our set of natural laws can be modeled as a search (which many physicists seem to do in discussing the values of our laws in relation to all possible mathematical universes) then the implications of the paper apply to all aspects of nature including biology.
Prof. Olofsson:
“I ask, for example, (1) what is the rationale behind claiming that search algorithms are chosen according to probability distributions, in particular the K-W distribution [154]? and (2) if they are, why do these distributions have to be uniform [159]? and point out (3) even if they are uniform, they are still quite likely to beat random search [155].”
1) It seems that the only assumption necessary is that the laws generating the search algorithm and search space are one out of many possible mathematical configurations of those laws and that whatever produces the search algorithm and search space matches those laws in a fashion which is blind to future results. Thus, the matching of (values for) laws = a search through a probability of all possible (values for) laws. I don’t see how this assumption could be controversial unless the universe and its laws were eternal (without beginning) and the only possibility.
2) They don’t have to be. Don’t the papers merely show that the degree to which a search space is non-uniform is the same degree to which the higher order search for that search space is non-uniform? As I have explained previously (in #121 and/or 122), this allows us the option of an infinite regress of active information.
3) Are you saying that any type of search through a uniform search space will return better than chance performance? Doesn’t that directly contradict the NFLT? Are you saying that searching for an evolutionary algorithm through unguided processes (matching of search procedure to search space with no consideration for future results) will stand a better chance of being found than attempting to locate the original pattern that the search algorithm is to find at better than chance results? It seems that you are saying that it is easier for non-foresighted processes to find the matching of algorithm and search space responsible for finding an optimized antenna than it is to find that optimized antenna by random search. Can you provide any evidence for this? If that were true, then I ask again:
… if what you are stating is true and actually has practical effect (ie: actually happens in the real world), why doesn’t anyone just show that background noise (chance) and an arbitrary set of laws (set of laws collected without any consideration for future results — absent foresight) will produce systems which process signs/units and evolve into greater and greater specified complexity, and just falsify ID theory and be done with it. If evolution is so powerful absent previous foresight, why do programmers bother programming boundary conditions for future known targets into the evolutionary algorithms used to solve problems (ie: max efficiency antenna shape)?
181
CJYman
01/28/2009
12:28 pm
Rob:
“Informal claims such as “it is as difficult to find a search as it is to find a target” are not supported by the paper unless you make a lot of arbitrary assumptions.
One of the arbitrary assumptions involved in measuring active information even violates Dembski’s own repeated warning. He has told us several times that “how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration.” And yet, the active information measure depends very much on how we individuate the possibilities.”
First, those assumptions could be a part of an hypothesis which incorporates the math within these papers. As such, until the assumptions are falsified or the math is shown to be incorrect, we have a standing scientific hypothesis.
Second, what are those arbitrary assumptions? You seem to be saying that these assumptions have to do with how we measure active information by individuating the possibilities, yet I don’t see this as a problem at all, since the probability of a pattern is measured against a non-arbitrary uniform probability distribution. Breaking up the pattern and convoluting it into a couple of different searches would be the arbitrary action, and I’m not sure that would even return a different measurement when these separate searches are each measured against a uniform probability distribution. As far as I understand, Active info = probability associated with bit operations taken to find the pattern – probability of pattern measured against a uniform distribution. Neither the number of bit operations nor the uniform probability distribution are arbitrary figures, so neither is their difference (active info) arbitrary.
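For reference, the three measures named in the first paper's abstract can be put in numbers. The sketch below assumes the usual log-probability form: endogenous information is -log2(p) for the blind-search success probability p, and exogenous information is -log2(q) for the assisted-search success probability q. On that reading, active information is a difference of logarithms, log2(q/p), and not a difference of raw probabilities. The function names and the values of p and q are illustrative assumptions, not figures from the papers or from this thread.

```python
import math


def endogenous_information(p: float) -> float:
    """Difficulty of the target under blind search, in bits: -log2(p)."""
    return -math.log2(p)


def exogenous_information(q: float) -> float:
    """Difficulty that remains under the assisted search, in bits: -log2(q)."""
    return -math.log2(q)


def active_information(p: float, q: float) -> float:
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return endogenous_information(p) - exogenous_information(q)


# Hypothetical numbers: a target occupying 1 part in 2**20 of the space,
# which the assisted search hits with probability 1 in 2**5.
p = 1 / 2**20
q = 1 / 2**5
print(endogenous_information(p))  # bits of difficulty for blind search
print(exogenous_information(q))   # bits of difficulty remaining
print(active_information(p, q))   # bits credited to the assistance
```

Note that the result depends on which baseline probability p is chosen, which is exactly the point under dispute in this exchange.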
Rob:
“The recurring use of the uniform distribution in many attempts to apply math to biology is a model assumption that must be argued just like any other model assumption.
Indeed, Olle Haggstrom’s response to the active info approach is titled “Uniform distribution is a model assumption”. The question asked by Haggstrom and others is, why should we always expect uniform randomness? In fact, it’s an impossible expectation. If everything were characterized by a uniform distribution, then that would be a non-uniform distribution of distributions.”
Actually no, that would not be a non-uniform distribution of distributions, if *everything* is characterized by a uniform distribution. If you actually mean *everything*, then that would include the distribution of distributions. Simply put, the foundational randomness and chaos from which everything arises would be uniform and would not be biased toward any specific outcome. Thus, when Dembski assumes uniformity, he takes the perfect non-teleological non-biased starting point and begins from there.
However, there is no need to expect “uniform randomness” in order to make sense of these papers. Olle Haggstrom seems to be missing the point in that one option of the papers is that there can be an infinite regress of active information and thus no true uniform search space exists “outside” of our universe. The implications of this would be that there has always been an infinite regress of bias in the foundational search space to produce our universe, life, evolution, and intelligence. However, that infinite regress of active info is only one of the options which I have discussed in my comment #122, and if there is indeed “uniform randomness,” then the ID position becomes more of an obvious and accurate choice.
One way to begin to answer if the preceding assumptions are justified is to “test randomness” and see if it has any tendency to match non-uniform search spaces with search algorithms to generate active information. As long as no active information is generated, we can conclude that a random source (background noise and set of laws put together with no consideration for future results) will remain uniform with respect to randomly generated search procedures.
Any other assumption is merely that – an arbitrary, unnecessary, and unfounded assumption.
Excellent ID research, eh?
CJYman:
The only formal proof presented is that CSI won’t generate itself through only chance and law and this is implicit within these two papers.
Rob:
“Just for clarification, are you saying that this formal proof is implicit in these two papers? (An implicit formal proof sounds like an oxymoron to me.) If not, where is this formal proof to be found? Thanks.”
Sorry for not being more clear, but it is the fact that CSI (as a measurement of better than chance performance) necessarily requires active info which is implicit in these papers. The formal proof that chance and law (barring previous active info) won’t generate active info is what these papers seem to produce. Thus, the formal proof of active info not being generated via chance and law also formally proves that CSI won’t generate from merely chance and law.
182
Prof_P.Olofsson
01/28/2009
12:59 pm
CJY[180],
That seems to be part of the ID folklore but it is not what the NFLT (Wolpert & Macready 1997) says. It states that all search algorithms are equally good/bad if the fitness function is chosen uniformly. So, if the conclusion of NFLT is false, all we can conclude is that the uniformity assumption is not satisfied. In evolutionary biology, it is quite obvious that the conclusion does not hold (for example, the Darwinian search clearly beats random search in cases we all can agree upon such as chloroquine resistance, per Michael Behe) and that the assumptions are not met (for example, the fitness function that makes all genotypes equally fit is not very likely); hence, the NFLT simply does not apply. The ID folklore now seems to be that (a) either NFLT applies and Darwin loses (we all agree about that) or (b) the NFLT does not apply and Darwin loses, which is completely unsubstantiated.
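The NFLT statement above (equal performance when the fitness function is chosen uniformly) can be checked by brute force on a toy case. This is only a sketch under stated assumptions: a three-point search space, binary fitness values, and two made-up non-revisiting search strategies (`fixed_order` and `adaptive` are hypothetical names, not anything from Wolpert & Macready). It averages the best fitness found after k evaluations over all eight possible fitness functions.

```python
from itertools import product

POINTS = (0, 1, 2)   # toy search space
VALUES = (0, 1)      # possible fitness values


def fixed_order(history):
    """Visit points 0, 1, 2 in order, ignoring what was observed."""
    visited = {x for x, _ in history}
    return next(x for x in POINTS if x not in visited)


def adaptive(history):
    """Start at point 2, then pick the next point based on the last fitness seen."""
    if not history:
        return 2
    visited = {x for x, _ in history}
    remaining = [x for x in POINTS if x not in visited]
    return remaining[0] if history[-1][1] == 1 else remaining[-1]


def best_after(algorithm, f, steps):
    """Best fitness seen after `steps` evaluations of fitness table f."""
    history = []
    for _ in range(steps):
        x = algorithm(history)
        history.append((x, f[x]))
    return max(y for _, y in history)


def average_performance(algorithm, steps):
    """Average of best_after over every fitness function, weighted uniformly."""
    functions = list(product(VALUES, repeat=len(POINTS)))
    return sum(best_after(algorithm, f, steps) for f in functions) / len(functions)


for k in (1, 2, 3):
    print(k, average_performance(fixed_order, k), average_performance(adaptive, k))
```

The two columns agree for every k only because all eight fitness functions are weighted equally. Averaging over a restricted or weighted set of functions removes that guarantee, which is the sense in which the theorem stops applying once the uniformity assumption fails.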
More ID folklore. Assuming uniformity means that you make an assumption. The only way to possibly make suc