
NEWS FLASH: Dembski’s CSI caught in the act


Dembski’s CSI concept has come under serious question, dispute and suspicion in recent weeks here at UD.

After diligent patrolling, the cops announce a bust: acting on some tips from unnamed sources, they have caught the miscreants in the act!

From a comment in the MG smart thread, courtesy of Dembski’s NFL (2007 edn):

___________________

>>NFL as just linked, pp. 144 & 148:

144: “. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information, (T, E) constitutes CSI because T [i.e. “conceptual information,” effectively the target hot zone in the field of possibilities] subsumes E [i.e. “physical information,” effectively the observed event from that field], T is detachable from E, and T measures at least 500 bits of information . . . ”

148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology.

I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . .

Biological specification always refers to function . . . In virtue of their function [a living organism’s subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . . “

Here we see all the suspects together caught in the very act.

Let us line up our suspects:

1: CSI,

2: events from target zones in wider config spaces,

3: joint complexity-specification criteria,

4: 500-bit thresholds of complexity,

5: functionality as a possible objective specification

6: biofunction as specification,

7: origin of CSI as the key problem of both origin of life [Eigen’s focus] and Evolution, origin of body plans and species etc.

8: equivalence of CSI and complex specification.

Rap, rap, rap!

“How do you all plead?”

“Guilty as charged, with explanation your honour. We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.”

“Guilty!”

“Throw the book at them!”

CRASH! >>

___________________

So, now we have heard from the horse’s mouth.

What are we to make of it, in light of Orgel’s conceptual definition from 1973 and the recent challenges to CSI raised by MG and others?

That is:

. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [The Origins of Life (John Wiley, 1973), p. 189.]

And, what about the more complex definition in the 2005 Specification paper by Dembski?

Namely:

define ϕS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T. [26] . . . . where M is the number of semiotic agents [S’s] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit operations that the known, observable universe could have performed throughout its entire multi-billion year history.[31] . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity [χ] of T given [chance hypothesis] H [in bits] . . . as  [the negative base-2 log of the conditional probability P(T|H) multiplied by the number of similar cases ϕS(t) and also by the maximum number of binary search-events in our observed universe 10^120]

χ = – log2[10^120 ·ϕS(T)·P(T|H)]  . . . eqn n1

How about this (we are now embarking on an exercise in “open notebook” science):

1 –> 10^120 ~ 2^398

2 –> Following Hartley, we can define Information on a probability metric:

I = – log(p) . . .  eqn n2

3 –> So, writing D2 for the specification factor ϕS(T) and p for P(T|H), we can re-present the Chi-metric:

Chi = – log2(2^398 * D2 * p)  . . .  eqn n3

Chi = Ip – (398 + K2) . . .  eqn n4, where Ip = – log2(p) and K2 = log2(D2)

4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities.
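
For readers who want to check the log reduction numerically, here is a small sketch in Python. The values of ϕS(T) and P(T|H) are placeholders chosen only to show that eqn n1 and the reduced form (eqns n3/n4) return the same number; they are not drawn from any published case.

```python
import math

# Dembski 2005 metric, eqn n1:  Chi = -log2( 10^120 * phi_S(T) * P(T|H) )
def chi_2005(phi_s, p):
    return -math.log2(1e120 * phi_s * p)

# Log-reduced form (eqns n2-n4):  Chi = Ip - (log2(10^120) + K2),
# where Ip = -log2(p) and K2 = log2(phi_S(T)); log2(10^120) is ~398.6, rounded to 398 above.
def chi_reduced(phi_s, p):
    Ip = -math.log2(p)
    K2 = math.log2(phi_s)
    return Ip - (math.log2(1e120) + K2)

# Placeholder values, purely for illustration:
phi_s = 1e20   # stand-in for phi_S(T)
p = 1e-150     # stand-in for P(T|H)
print(round(chi_2005(phi_s, p), 2), round(chi_reduced(phi_s, p), 2))  # both print ~33.2 bits
```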

5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. In short, VJT’s CSI-lite is an extension and simplification of the Chi-metric. He explains in the just linked (and building on the further linked):

The CSI-lite calculation I’m proposing here doesn’t require any semiotic descriptions, and it’s based on purely physical and quantifiable parameters which are found in natural systems. That should please ID critics. These physical parameters should have known probability distributions. A probability distribution is associated with each and every quantifiable physical parameter that can be used to describe each and every kind of natural system – be it a mica crystal, a piece of granite containing that crystal, a bucket of water, a bacterial flagellum, a flower, or a solar system . . . .

Two conditions need to be met before some feature of a system can be unambiguously ascribed to an intelligent agent: first, the physical parameter being measured has to have a value corresponding to a probability of 10^(-150) or less, and second, the system itself should also be capable of being described very briefly (low Kolmogorov complexity), in a way that either explicitly mentions or implicitly entails the surprisingly improbable value (or range of values) of the physical parameter being measured . . . .

my definition of CSI-lite removes Phi_s(T) from the actual formula and replaces it with a constant figure of 10^30. The requirement for low descriptive complexity still remains, but as an extra condition that must be satisfied before a system can be described as a specification. So Professor Dembski’s formula now becomes:

CSI-lite = – log2[10^120 · 10^30 · P(T|H)] = – log2[10^150 · P(T|H)] . . . eqn n1a

. . . .the overall effect of including Phi_s(T) in Professor Dembski’s formulas for a pattern T’s specificity, sigma, and its complex specified information, Chi, is to reduce both of them by a certain number of bits. For the bacterial flagellum, Phi_s(T) is 10^20, which is approximately 2^66, so sigma and Chi are both reduced by 66 bits. My formula makes that 100 bits (as 10^30 is approximately 2^100), so my CSI-lite computation represents a very conservative figure indeed.

Readers should note that although I have removed Dembski’s specification factor Phi_s(T) from my formula for CSI-lite, I have retained it as an additional requirement: in order for a system to be described as a specification, it is not enough for CSI-lite to exceed 1; the system itself must also be capable of being described briefly (low Kolmogorov complexity) in some common language, in a way that either explicitly mentions pattern T, or entails the occurrence of pattern T. (The “common language” requirement is intended to exclude the use of artificial predicates like grue.) . . . .

[As MF has pointed out] the probability p of pattern T occurring at a particular time and place as a result of some unintelligent (so-called “chance”) process should not be multiplied by the total number of trials n during the entire history of the universe. Instead one should use the formula (1–(1-p)^n), where in this case p is P(T|H) and n=10^120. Of course, my CSI-lite formula uses Dembski’s original conservative figure of 10^150, so my corrected formula for CSI-lite now reads as follows:

CSI-lite = – log2(1 – (1 – P(T|H))^(10^150)) . . . eqn n1b

If P(T|H) is very low, then this formula will be very closely approximated [HT: Giem] by the formula:

CSI-lite = – log2[10^150 · P(T|H)]  . . . eqn n1c
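
As a quick numerical sanity check on the claim that eqn n1b is closely approximated by eqn n1c when P(T|H) is very low, here is a sketch with placeholder probabilities; the exact form is computed via log1p/expm1 to avoid underflow.

```python
import math

N = 1e150  # the conservative bound used in CSI-lite

def csi_lite_exact(p):
    # eqn n1b: -log2(1 - (1 - p)^N), evaluated stably
    return -math.log2(-math.expm1(N * math.log1p(-p)))

def csi_lite_approx(p):
    # eqn n1c: -log2(N * p), the low-probability approximation
    return -math.log2(N * p)

for p in (1e-160, 1e-155, 1e-148):   # placeholder values, illustrative only
    print(p, round(csi_lite_exact(p), 3), round(csi_lite_approx(p), 3))
# For very small p the two columns agree; once N*p is no longer small they diverge,
# which is why the approximation is stated only for very low P(T|H).
```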

6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon-derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, the zone is so deeply isolated that a chance-dominated process is maximally unlikely to find it; intelligent agents, of course, routinely produce information beyond such a threshold.

7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents.

8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.

9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley. Let’s augment here [Apr 17], on a comment in the MG progress thread:

Shannon measured info-carrying capacity, towards one of his goals: metrics of the carrying capacity of comms channels — as in who was he working for, again?

CSI extended this to meaningfulness/function of info.

And in so doing, observed that this — due to the required specificity — naturally constricts the zone of the space of possibilities actually used, to island[s] of function.

That specificity-complexity criterion links:

I: an explosion of the scope of the config space to accommodate the complexity (as every added bit DOUBLES the set of possible configurations),  to

II: a restriction of the zone, T, of the space used to accommodate the specificity (often to function/be meaningfully structured).

In turn that suggests that we have zones of function that are ever harder for chance based random walks [CBRW’s] to pick up. But intelligence does so much more easily.

Thence, we see that if we have a metric for the information involved that surpasses the threshold beyond which a CBRW is no longer a plausible explanation, we can confidently infer to design as best explanation.

Voila, we need an info-beyond-the-threshold metric. Once we have a reasonable estimate of the direct or implied specific and/or functionally specific (especially code-based) information in an entity of interest, we have an estimate of, or a credible substitute for, the value of – log2(P(T|H)). In particular, if the estimate comes from direct inspection of storage capacity and of the relative frequencies with which code symbols are used, we may evaluate the average [functionally or otherwise] specific information per symbol. This is a version of Shannon’s weighted average information per symbol, the H-metric, H = – Σ pi * log(pi), also known as informational entropy [there is an arguable link to thermodynamic entropy, cf here] or uncertainty.
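
A small sketch of that H-metric, estimating average information per symbol from observed symbol frequencies; the sample string is just an illustration.

```python
from collections import Counter
from math import log2

def shannon_H(message):
    """Average information per symbol, H = -sum(p_i * log2(p_i)),
    with p_i estimated as each symbol's relative frequency in the message."""
    counts = Counter(message)
    n = len(message)
    return -sum((c / n) * log2(c / n) for c in counts.values())

text = "the quick brown fox jumps over the lazy dog"   # illustrative sample only
H = shannon_H(text)
print(round(H, 3), "bits/symbol")             # average info per symbol
print(round(H * len(text), 1), "bits total")  # capacity-style estimate for the whole string
```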

As in (using Chi_500 for VJT’s CSI_lite [UPDATE, July 3: and S for a dummy variable that is 1/0 accordingly as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):

Chi_500 = Ip*S – 500,  bits beyond the [solar system resources] threshold  . . . eqn n5

Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6

Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a

[UPDATE, July 3: So, if we have a string of 1,000 fair coins and toss them at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi_500 metric, I would be high, 1,000 bits, but S would be 0, so the value for Chi_500 would be – 500, i.e. well within the possibilities of chance. However, if we came to the same string later and saw that the coins somehow now had the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. That is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were converted into coins and tables etc. and tossed at an impossibly fast rate, it would still be impossible to sample enough of the space of possibilities W to have confidence that something from so unrepresentative a zone T could reasonably be explained on chance. So, as long as an intelligent agent capable of choice is possible, choice — i.e. design — would be the rational, best explanation on the sign observed: functionally specific, complex information.]
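
The coin illustration can be put into a few lines (a sketch: S is assigned by inspection, exactly as in the text above, not computed).

```python
def chi_500(I_bits, S):
    """Eqn n5: bits beyond the 500-bit solar-system threshold.
    S is 1 if the configuration is independently specified (e.g. functional/meaningful), else 0."""
    return I_bits * S - 500

# 1,000 fair coins tossed at random: high capacity, but not specified
print(chi_500(1000, S=0))   # -500 -> comfortably within the reach of chance

# The same 1,000 coins later found spelling out ASCII text: capacity unchanged, now specified
print(chi_500(1000, S=1))   # +500 -> beyond the threshold; design inferred as best explanation
```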

10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework. Excerpting:

Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribosomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
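
To make the per-site calculation concrete, here is a toy sketch of the log2(20) – H idea: entropy at each aligned site, subtracted from the ground-state maximum, then summed over sites. The three "aligned sequences" below are invented for illustration and are far too few for a real estimate; Durston et al. work from large protein-family alignments.

```python
from collections import Counter
from math import log2

def fits(aligned_seqs):
    """Toy FSC estimate: sum over sites of log2(20) - H(site), where H(site) is the
    Shannon entropy of the amino acids observed at that aligned position."""
    total = 0.0
    for site in zip(*aligned_seqs):           # walk the alignment column by column
        counts = Counter(site)
        n = len(site)
        H = -sum((c / n) * log2(c / n) for c in counts.values())
        total += log2(20) - H
    return total

# Three invented 10-residue "sequences" (illustration only):
toy_alignment = ["MKVLAGHTWQ",
                 "MKVLSGHTWQ",
                 "MRVLAGHSWQ"]
print(round(fits(toy_alignment), 1), "fits (toy value, not a real protein family figure)")
```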

11 –> So, Durston et al are targeting the same goal, but have chosen a different path from the start-point of the Shannon-Hartley log probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground to a functional state on investigation of protein family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:

Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:

RecA: 242 AA, 832 fits, Chi: 332 bits beyond

SecY: 342 AA, 688 fits, Chi: 188 bits beyond

Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond  . . . results n7
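
Those three results can be reproduced mechanically (fits values as quoted above from Durston’s Table 1; the subtraction is the whole calculation):

```python
durston_fits = {          # (AA length, fits) as quoted above from Durston et al., Table 1
    "RecA":      (242,  832),
    "SecY":      (342,  688),
    "Corona S2": (445, 1285),
}

for name, (aa, fits_val) in durston_fits.items():
    chi_500 = fits_val - 500      # bits beyond the solar-system threshold (eqn n5, S = 1)
    chi_1000 = fits_val - 1000    # bits beyond the observed-cosmos threshold (eqn n6)
    print(f"{name}: {aa} AA, {fits_val} fits -> Chi_500 = {chi_500}, Chi_1000 = {chi_1000}")
```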

The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )

In short one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage unit bits [= no.  of  AA’s * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained.]

12 –> I guess I should not leave off the simple, brute force X-metric that has been knocking around UD for years.

13 –> The idea is that we can judge information in or reducible to bits, as to whether it is or is not contingent and complex beyond 1,000 bits. If so, C = 1 (and if not C = 0). Similarly, functional specificity can be judged by seeing the effect of disturbing the information by random noise [where codes will be an “obvious” case, as will be key-lock fitting components in a Wicken wiring diagram functionally organised entity based on nodes, arcs and interfaces in a network], to see if we are on an “island of function.” If so, S = 1 (and if not, S = 0).

14 –> We then look at the number of bits used, B — more or less the number of basic yes/no questions needed to specify the configuration [or, to store the data], perhaps adjusted for coding symbol relative frequencies — and form a simple product, X:

X = C * S * B, in functionally specific bits . . . eqn n8.
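
A minimal sketch of the brute-force X-metric just defined. The yes/no judgments behind C and S are supplied by the analyst, as points 13 and 14 describe; only the arithmetic is automated here.

```python
def x_metric(bits, contingent, functionally_specific):
    """Eqn n8: X = C * S * B, in functionally specific bits.
    C = 1 if the item is contingent and its capacity exceeds the 1,000-bit threshold, else 0.
    S = 1 if it is functionally specific (random perturbation degrades function), else 0.
    B = number of bits used to store/specify the configuration."""
    C = 1 if (contingent and bits > 1000) else 0
    S = 1 if functionally_specific else 0
    return C * S * bits

# A 300-character functional ASCII text at ~7 bits/character:
print(x_metric(300 * 7, contingent=True, functionally_specific=True))   # 2100 functional bits
# 2,100 bits of random noise: complex but not functionally specific:
print(x_metric(2100, contingent=True, functionally_specific=False))     # 0
```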

15 –> This is of course a direct application of the per aspect explanatory filter, (cf. discussion of the rationale for the filter here in the context of Dembski’s “dispensed with” remark) and the value in bits for a large file is the familiar number we commonly see such as a Word Doc of 384 k bits. So, more or less the X-metric is actually quite commonly used with the files we toss around all the time. That also means that on billions of test examples, FSCI in functional bits beyond 1,000 as a threshold of complexity is an empirically reliable sign of intelligent design.

______________

All of this adds up to a conclusion.

Namely, that there is excellent reason to see that:

i: CSI and FSCI are conceptually well defined (and are certainly not “meaningless”),

ii: trace to the work of leading OOL researchers in the 1970’s,

iii: have credible metrics developed on these concepts by inter alia Dembski and Durston, Chiu, Abel and Trevors, metrics that are based on very familiar mathematics for information and related fields, and

iv: are in fact — though this is hotly denied and fought tooth and nail — quite reliable indicators of intelligent cause where we can do a direct cross-check.

In short, the set of challenges raised by MG over the past several weeks has collapsed. END

Comments
MathGrrl @169
If the paper you are touting claims that ev is a targeted search then it is wrong.
Hi MathGrrl, welcome back. Let's start with this claim about ev, which I find strange indeed. But perhaps I am just not understanding what you mean. GA's are by definition targeted searches. If a GA was not a targeted search it would perform no better than a random search, and thus there would be no point in using a GA. If ev is (or uses) a GA, ev uses a targeted search. Why is my argument not valid? Note: It doesn't matter what the specific target or targets are in ev, and not identifying them does not impact the validity of the argument I present.
A detailed description of how to solve a problem by first specifying the precise starting conditions and then following a set of simple steps that lead to the final solution is known as an algorithm. An algorithm is characterized by:
- a precise statement of the starting conditions, which are the inputs to the algorithm;
- a specification of the final state of the algorithm, which is used to decide when the algorithm will terminate;
- a detailed description of the individual steps, each of which is a simple and straightforward operation that will help move the algorithm towards its final state.
(Explorations in Computing)
Frankly, as a "MathGrrl" I'd expect you to know this.
The Genetic Algorithm is an Adaptive Strategy and a Global Optimization technique. It is an Evolutionary Algorithm and belongs to the broader study of Evolutionary Computation.
The objective of the Genetic Algorithm is to maximize the payoff of candidate solutions in the population against a cost function from the problem domain. The strategy for the Genetic Algorithm is to repeatedly employ surrogates for the recombination and mutation genetic mechanisms on the population of candidate solutions, where the cost function (also known as objective or fitness function) applied to a decoded representation of a candidate governs the probabilistic contributions a given candidate solution can make to the subsequent generation of candidate solutions.
Listing (below) provides an example of the Genetic Algorithm implemented in the Ruby Programming Language. The demonstration problem is a maximizing binary optimization problem called OneMax that seeks a binary string of unity (all '1' bits). The objective function provides only an indication of the number of correct bits in a candidate string, not the positions of the correct bits.
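The Ruby listing referred to here (titled "Genetic Algorithm" in the quoted source) is not reproduced in the comment. As a stand-in, here is a minimal OneMax genetic algorithm sketch in Python; it illustrates the technique being described, and is not Schneider's ev and not the book's listing.

```python
import random

def onemax(bits):                      # fitness: number of '1' bits (no positional feedback)
    return sum(bits)

def evolve(n_bits=64, pop_size=100, generations=200, p_mut=None):
    p_mut = p_mut or 1.0 / n_bits
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=onemax, reverse=True)
        if onemax(scored[0]) == n_bits:            # all-ones target reached
            return scored[0]
        parents = scored[: pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randint(1, n_bits - 1)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if random.random() < p_mut else bit for bit in child]  # mutation
            children.append(child)
        pop = children
    return max(pop, key=onemax)

best = evolve()
print(onemax(best), "of 64 bits correct")
```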
If you've read Schneider's ev paper you'll know why the text above is in bold.

Mung
April 29, 2011 at 12:45 PM PDT
MathGrrl:
If the paper you are touting claims that ev is a targeted search then it is wrong.
Unfortunately you have to do more than just say it. Perhaps you can write a letter to the journal informing them of your objections. But all that is moot because you have been given a rigorously defined concept of CSI and you just refuse to grasp it. I say shame on you for that and shame on us for continuing to respond to an obvious waste of bandwidth...

Joseph
April 29, 2011 at 11:10 AM PDT
F/N: For convenience, I excerpt the clip from 151 above, on how Information is quantitatively defined: ___________ >> 2 –> I turn to my trusty Princs of Comm Systems, 2nd edn, Taub and Schilling (McGraw Hill, 1986), p. 512, Sect. 13.2 (which follows my good old Connor, as cited and used in my always linked; cf as well Harry Robertson’s related development of thermodynamic Entropy in Statistical Thermophysics (PH, 1998), pp. 3 – 6, 7, 36, etc, as also cited and used):
Let us consider a communication system in which the allowable messages are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [My nb: i.e. the a posteriori probability in my online discussion is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by Ik = (def) log2 1/pk (13.2-1)
3 –> In short, Dembski’s use of the term “information” is precisely correct, though it differs from the terminology used by others. 4 –> Schneider should have checked (or should be more familiar with the field) before so dismissively correcting. >> ___________ Schneider had sought to "correct" Dembski for using this commonplace definition and quantification of information. He thereby inadvertently reveals his want of familiarity with common, well accepted and effective usage. And once we understand that the common metric of info in bits we commonly see is based on the usage cited from T & S [and could be from many other sources], the objection on want of adequate definition collapses. For, the log reduction of the Dembski metric shows that it is about identifying info on configs in identifiable target zones, that are sufficiently isolated that it is maximally implausible for them to be arrived at by chance plus necessity. Whilst, posts in this thread for a handy example [one that has been repeatedly used in responding to MG and which she has consistently ducked or brushed aside to make her favourite talking points], are demonstrations of how routinely intelligence is able to get to such islands of function in vast config spaces. Let us have done with this notion and talking point that CSI is ill defined and meaningless. GEM of TKIkairosfocus
April 29, 2011 at 8:31 AM PDT
MG:
There have been no such answers . . .
Pardon, but that is now plainly an empty declaration in light of the above in this thread and the OP. CSI has been adequately conceptually understood and described from the 1970's, and it has been adequately mathematically modelled for a decade or more. In the log reduction on basic rules of logs, the connexion from the Dembski chi metric to the observable world of information content of functioning systems is made, and the threshold type approach is justified on a needle in a haystack search challenge, on the gamut of our solar system or our observed cosmos. With the solar system scope metric in hand:
Chi_500 = Ip - 500, bits beyond the threshold
. . . we may directly insert the Durston et al results for information content of 35 protein families, as is now shown in point 11 of the OP:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
Other applications once we have a reasonable estimate for information content, are plainly possible on the lines of this paradigmatic case. The link to the simple brute force X-metric, is also plain. Let's do a direct test: we know that we have cases of randomly generated valid text up to 20 - 25 ASCII characters [spaces of up to 10^50 or so possibilities]. Can you provide a case for at least 72 characters? 143? Let's ask:
a: What does that want of cases in point have to do with the Planck-time quantum state resources of our solar system and our observed cosmos, per the discussion in Abel's 2009 plausibility metric paper?
b: How does this tie in with the statistical foundations for the second law of thermodynamics?
c: And, BTW, do you observe the cited definition of INFORMATION as a metric from Taub and Schilling?
d: Do you recognise that this is therefore a valid and sufficiently rigorous definition as is commonly used in telecomms work?
e: Do you see that in the log reduced form that is precisely the definition of information used?
Before your remarks can have any further weight, you need to cogently respond to the issues summarised here, and especially here and here above (with a glance here also helpful). The sub thread on Schneider's ev from here at 126 would also require serious attention. Schneider's sense of vagueness regarding the CSI concept is self-induced. He plainly has not addressed the concept that a set of bits or the like can specify a config space of possibilities. In such a space, we may define a zone of interest T which, if sufficiently isolated, will be maximally hard to find on undirected chance plus necessity. Indeed, that is the obvious error in ev itself. For Schneider seems to have failed to realise that he has started within an island of function, and that he is proceeding on a nice trend and an evident metric of distance to target that yields a warmer/colder signal. Indeed, his graph of a ramp up to the target with hunting oscillations, is diagnostic [at least to one familiar with the behaviour of closed loop control systems]. What Schneider has provided is a model of what is not in dispute -- not even with modern Young Earth Creationists, i.e. relatively minor variations within a functioning body plan. He does not have a model that accounts for the origin of such body plans on undirected chance plus necessity. (His confusion of artificial selection with natural selection was telling. He has not realised that HE is the source of the crucial active information that explains better than random walk based search performance.) Indeed, he inadvertently provides a demonstration of the best, empirically warranted, explanation for arriving on such an island of function. Namely, design. GEM of TKI PS: Your attempt to push the ball back into Dr Torley's court after he has more than abundantly explained why he asks you to at least provide a summary, is telling. Especially for one who above confused a log reduction with a probability calculation.kairosfocus
April 29, 2011 at 7:52 AM PDT
kairosfocus and Mung, With respect to your discussions of ev, I think there are three points that you haven't addressed.

The first is that Schneider, like myself and others, finds Dembski's concept of CSI mathematically vague. He has to make some assumptions about what it means in order to even begin to calculate it. This is why a rigorous mathematical definition and detailed examples are so important.

The second point is the discussion of Schneider's "horserace" to beat the UPB. You both make a big issue about Schneider tweaking the parameters of the simulation, population size and mutation rate in particular, but you don't discuss the fact that, once the parameters are set, a small subset of known evolutionary mechanisms does generate Shannon information. This goes back to my discussion with gpuccio on Mark Frank's blog where we touched on the ability of evolutionary mechanisms to result in populations that are better suited to their environment than were their parent populations. That, in turn, suggests that, while it might be possible to make a case for cosmological ID, there is no need to posit the involvement of intelligent agency in biology.

The third point is that, despite a lot of discussion about ev, neither of you has provided a detailed calculation of CSI for my ev scenario. This was a particularly interesting topic in my discussion with gpuccio in its impact on his calculations. I would be very interested in reading what you think of that part of the thread.

MathGrrl
April 29, 2011 at 6:33 AM PDT
Mung,
I have my doubts as to whether ev even qualifies as a genetic algorithm, I’ll need to do some more reading. So what’s missing? Crossover.
ev implements a very simple subset of known evolutionary mechanisms, not including crossover. That doesn't mean it's not a GA. Interestingly, even using such a small subset of known evolutionary mechanisms, ev still demonstrates the same behavior that Schneider researched for his PhD thesis.

MathGrrl
April 29, 2011 at 6:32 AM PDT
vjtorley,
If you want me to answer your questions, I’m afraid you’ll have to (a) explain the cases you’re describing, in non-technical language (i.e. something a bright 12-year-old could grasp), and (b) explain why you think they pose a threat to Professor Dembski’s concept of complex specified information, in a summary of two or three pages at most.
I'm happy to explain any of the scenarios that you feel are too underspecified. Rather than guessing at your points of confusion, could you please explain where you find the jargon too thick? I would also note that I'm not claiming that my scenarios "pose a threat" to Dembski's CSI metric, I'm saying that I have not seen a mathematically rigorous definition of that metric nor any examples of how to calculate it for scenarios such as those I present.

MathGrrl
April 29, 2011 at 6:32 AM PDT
Joseph,
You forgot one main point R0bb - ev is a targeted search, which means it is an irrelevant example.

MathGrrl:
This is not correct.
Yes it is and I have provided the peer-reviewed paper that exposes it as such.
The nice thing about science is that it is objective. People can look at the empirical evidence and determine whether or not it supports a claim. If the paper you are touting claims that ev is a targeted search then it is wrong. If you disagree, please refer to the ev paper to identify the target of the search. You may want to read Schneider's PhD thesis for background information.

MathGrrl
April 29, 2011 at 6:31 AM PDT
Mung,
You are boringly repetitious.
I assure you that when an ID proponent presents a rigorous mathematical definition of CSI and demonstrates how to calculate it for my four scenarios, I'll stop repeating my request.

MathGrrl
April 29, 2011 at 6:31 AM PDT
kairosfocus,
No one there [in my guest thread] was able to present a rigorous mathematical definition of CSI based on Dembski’s description. If you can, please do so and demonstrate how to calculate it for the four scenarios I describe there.
Pardon directness: this is the proverbial stuck record, repeating long since reasonably and credibly answered demands, without any responsiveness to the fact that they have been answered, again and again.
There have been no such answers. Not even ID proponents can provide a rigorous mathematical definition that is consistent with Dembski's description of CSI, nor has anyone here thus far provided detailed example calculations for my four scenarios. As noted previously, you provide no basis for the numbers used in your calculations. If you are willing and able to demonstrate exactly how you arrived at your numbers, in the context of a rigorous mathematical definition of CSI, I would be delighted to then apply that objective metric to other systems.

MathGrrl
April 29, 2011 at 6:30 AM PDT
kairosfocus,
The issue at the heart of the CSI/FSCI challenge is to arrive at the shores of such islands of function from arbitrary initial points in config spaces.
No, the issue is that ID proponents make the claim that CSI is a clear indicator of the involvement of intelligent agency but cannot define the metric rigorously or show how to objectively calculate it for real world scenarios. That means that their claims are unsupported.

MathGrrl
April 29, 2011 at 6:30 AM PDT
kairosfocus,
Please provide references to where I have done so . . .
Kindly cf 44 ff above, for my earlier responses; which relate to your 36 ff.
I still see no mathematically rigorous definition of CSI nor any detailed calculations for my four scenarios. As previously noted, you do not explain how you arrive at your numbers used in post 44. The determination of how many bits of CSI are present is exactly the question being discussed. I would be interested in seeing your more detailed explanation.

MathGrrl
April 29, 2011 at 6:30 AM PDT
My dear interlocutors, I apologize for disappearing for the past week; real world responsibilities intervened. I am attempting to continue the discussion in the two most active child threads of my original guest post. I hope you'll continue as well.

MathGrrl
April 29, 2011 at 6:29 AM PDT
PS: How negative feedback control works.

kairosfocus
April 27, 2011 at 3:42 AM PDT
Mung: I find Schneider's web pages -- pardon* -- often very hard to wade through; too busy and disorganised. (That is why I tend to use clips.) ___________________
*Pardon, again: I suggest instead layout as an article, with an index near the top of the page, and perhaps use of text boxes. Or, use of unequal width columns similar to a blog page.
Let's slice up his highlight:
Replication, mutation and selection are necessary and sufficient for information gain to occur. This process is called evolution.
Replication in ev requires a huge, fine tuned, multi-component background algorithmic process that is not only intelligently designed but controlled and protected from chance variation, If not, the process would break down rapidly. Thus, we see tha the variation in view is tightly controlled within an island of fine tuned, complex and specific function. Which is: intelligently designed. "Evolution" on intelligent design . . . and as a part of the design. (As in Wallace's Intelligent Evolution.) Is that what Schneider really wishes to demonstrate or acknowledges/claims demonstrating? Next, the variation and selection are plainly within an island of defined function and are crucially dependent on nice trends of performance and a warmer-colder metric, as can be seen from what happens in the graph at the top of his page when the selection filter is turned off, i.e the system wanders away from the target. That sort of ramp and hold vs wander away is a characteristic signature of a negative feedback controlled process: set target point, adjust plant, test o/p relative to target, adjust process towards target, compare fed back performance, adjust plant on differential, hold as differential falls to zero; or, at least hold within the hunting oscillations. That is, there is a targetting here on a warmer-colder metric. A feature of feedback control. And judging by noisiness, ev lacks damping and so is prone to oscillations and breakout of control. Which is what the tweaking in 126 is speaking of. In effect a Hamming digital distance to target oracle of some form is at work, just as Dembski et al have pointed out. And yes, there is a measure of performance that is trending as the ramp part of the graph shows, i.e there is a fitness function of some type, or better, a FIT-to target function. So, the fundamental problem with the genome variation and we started with a random genome claim is what it conceals (probably inadvertently, I think there is a sincerity here but one that is blind what it is not inclined to see):
1: In each instance, you are searching well within a match of 500 bits, which is within the search limit that the design inference accepts.

2: That you are able to search and match implies there are designed algors offstage doing the real work.

3: You have a perceptron that biases you to sparse 1 codes, in a context where that is what you need, i.e. you are loading the dice heavily. (If you need 6's and the dice are loaded so you get 6's 80% of the time, you have shifted the uncertainties of dice tossing dramatically.)

4: You start within an island of function, as in effect every "random" index value will have a measurable function. In reality, by far and away most of the in-principle possible configs of genomes -- which START at 100+ k bits (notice how we never see GA's that start at that sort of level!) -- are decidedly non-functional. And in fact your program is a large one and most of the bits in its informational strings are NOT allowed to vary at random.

5: Your fitness metric implies a conveniently nice trend-y response on the underlying config possibilities, so you can build in the algor that says climb uphill (rewarding increments in warmer/colder) and get where you want. [Contrast my black hole variant on using the Mandelbrot set as a fitness function: if trends are not nice, then the whole model fails. What gives you a right to assume/construct a nice trend?] (A toy sketch of such a warmer/colder search appears just after this list.)

6: As your documentation at 126 shows, several parameters have to be set just right -- fine tuned -- for the components to work together to get the desired results.

7: To get that fine tuning, Schneider, obviously, was in effect running life over and over again, until things worked out. As in, where do you have the planets and sub-cosmi to run life over and over to get the one that is just right? And, how did you know how to get the range of variation and probability distribution that would set up a population to hit just the right peak?
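
A toy sketch of the warmer/colder point (an editorial illustration in Python, not a model of ev itself): a random-flip search that is told whether each change moves it closer to a fixed 500-bit target, via a Hamming distance oracle, reaches the target in a few thousand trials.

```python
import random

target = [random.randint(0, 1) for _ in range(500)]      # any fixed 500-bit target

def hamming(a, b):                                        # the "warmer/colder" oracle
    return sum(x != y for x, y in zip(a, b))

candidate = [random.randint(0, 1) for _ in range(500)]
steps = 0
while hamming(candidate, target) > 0:
    trial = candidate[:]
    i = random.randrange(500)
    trial[i] ^= 1                                         # flip one randomly chosen bit
    if hamming(trial, target) <= hamming(candidate, target):   # keep the flip if it is no colder
        candidate = trial
    steps += 1
print(steps, "trials to hit a 500-bit target when distance-to-target feedback is available")
```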
In that context, if one narrows focus to the genome and its move from random to matched, well it looks like you have got functional info on a free lunch. Indeed, that's why he was crowing on how NATURAL selection was beating the UPB in an afternoon. But, once one pulls back the tight focus, one sees that a lot of intelligently directed things were going on offstage that were critical to getting that performance. Intelligent things that amount to serious dice loading and capture of good results, discarding the bad ones. At most, ev gives a picture of some of how micro evo -- which is not in dispute -- allows variations to fit niches. This is comparable to how bacteria put in nutrient mixes with unusual sugars or the like, may well adapt to eat the new stuff. But these pay a fitness cost because as a rule something got broken to get the enzyme[s] required to break up and use the new sugar. This has nothing to do with explaining where the bacterium with all its integrated functionality, came from. Including the set of enzymes that were so set up that fiddling a bit would allow the organism to survive on unusual nutrients. There is no free lunch, and Schneider is evidently distracting himself from his own contributions to his results that -- to ft his preconceptions -- he ascribes to "natural" selection. GEM of TKIkairosfocus
April 27, 2011 at 1:26 AM PDT
Does ev qualify as an EA?

Mung
April 26, 2011 at 8:35 PM PDT
Hi kairosfocus, What do you think of Schneider's use of Shannon Information? http://www.ccrnp.ncifcrf.gov/~toms/paper/ev/ I'm suspicious of how he decides that there's been a reduction in uncertainty at the binding site and therefore an increase in Shannon information. It also occurred to me today that he creates 64 organisms and each is generated with a random genome. So while he should be starting off on an island, he really isn't! He's maximizing his chances to search different locations in the space.

Mung
April 26, 2011 at 5:58 PM PDT
F/N: The onward discussion by No-Man here, and my suggestions here in that thread, seem to be relevant. I clip the latter: _________ >> F/N: Applying a modified Chi-metric: I nominate a modded, log-reduced Chi metric for plausible thresholds of inferring sufficient complexity AND specificity for inferring to design as best explanation on a relevant gamut:
(a) Chi’_500 = Ip*S – 500, bits beyond the solar system threshold
(b) Chi’_1000 = Ip*S – 1,000, bits beyond the observed cosmos threshold
. . . where Ip is a measure of explicitly or implicitly stored information in the entity and S is a dummy variable taking 1/0 according as [functional] specificity is plausibly inferred on relevant data. [This blends in the trick used in the simplistic, brute force X-metric mentioned in the just linked.] 500 and 1,000 bits are swamping thresholds for solar system and cosmological scales. For the latter, we are looking at the number of Planck time quantum states of the observed cosmos being 1 in 10^150 of the implied config space of 1,000 bits. For a solar system with ours as a yardstick, 10^102 Q-states would be an upper limit, and 10^150 or so possibilities for 500 bits would swamp it by 48 orders of magnitude. (Remember, the fastest chemical interactions take about 10^30 Planck time states and organic reactions tend to be much, much slower than that.) So, the reduced Dembski metric can be further modified to incorporate the judgement of specificity, and non-specificity would lock out being able to surpass the threshold of complex specificity. I submit that a code-based function beyond 1,000 bits, where codes are reasonably specific, would classify. Protein functional fold-ability constraints would classify on the sort of evidence often seen. Functionality based on Wicken wiring diagram organised parts that would be vulnerable to perturbation would also qualify, once the description list of nodes, arcs and interfaces would exceed the the relevant thresholds. [In short, I am here alluding to how we reduce and represent a circuit or system drawing or process logic flowchart in a set of suitably structured strings.] So, some quantification is perhaps not so far away as might at first be thought. Your thoughts? >> __________ GEM of TKIkairosfocus
April 26, 2011 at 4:14 PM PDT
MG (et al): Still waiting . . . GEM of TKI

kairosfocus
April 26, 2011 at 3:21 PM PDT
MG (et al and Graham): If you are still monitoring this thread on the significance and credibility of CSI as a properly scientifically grounded metric pointing to design as the best explanation for what is sufficiently complex and specific, it is open for a response. Please note the guide to the thread (including answers to your main 4 q's and the second string of q's, a response to your "meaningless" claim, and a response to Schneider's ev and claims on CSI) at 1 above. G'day GEM of TKI

kairosfocus
April 25, 2011 at 4:24 AM PDT
Mung: The random walk backed up by a lawlike filter is not incapable of CSI by DEFINITION, but by being overwhelmed by the needle in the haystack challenge; an analysis backed up by observations. That is why Dembski went out of his way to identify and define a lower limit to number of bits whereby beyond this level, zones of interest are so isolated that it is not credible to land on the zone within the available search resources, UNLESS one is using active information. Such active information includes oracles that attract through things like warmer/colder signals, and the like. Intelligence routinely arrives at such zones of interest beyond such isolation thresholds, but in so doing, it is precisely not using a blind search backed by trial and error tests. E.g. consider posts in this thread. In the OP this thread, we saw that the Dembski type Chi metric can be reduced to exactly the bits beyond a reasonable threshold metric deduced: Chi_500 = Ip - 500, bits beyond Let us observe how MG, plainly coming from Schneider's perspective, has been unable to address its significance. She even managed to confuse a log reduction with a probability calculation. And if she has been taught that it is INCORRECT to use the Hartely-suggested log metric for information, then the confusion is maximised. (And of course only dumb IDiots and Creationists do that . . . ) Here is Wiki on self-information, as a simple clarification:
the self-information I(wn) associated with outcome wn with probability P(wn) is: I(wn) = log (1/P(wn)) = - log (P(wn)) . . . . This measure has also been called surprisal, as it represents the "surprise" of seeing the outcome (a highly improbable outcome is very surprising). This term was coined by Myron Tribus in his 1961 book Thermostatics and Thermodynamics. The information entropy of a random event is the expected value of its self-information.
That should help clarify. GEM of TKI

kairosfocus
April 24, 2011 at 3:07 PM PDT
In NFL, Dembski begins discussion of ev in Chapter 4 Section 9, "Following the Information Trail." But prior to that, in the same chapter, he writes:
HI MOM!
Wait, that's not what he wrote. How did that get there? Here are the actual quotes:
Technically, Dawkins's target sequence is not long enough for its probability to fall below the 1 in 10^150 universal probability bound or correspondingly for its complexity to surpass the 500-bit universal complexity bound. Dawkins's target sequence therefore does not qualify as complex specified information in the strict sense - see sections 2.8 and 3.9. Nonetheless, for practical purposes the complexity is sufficient to illustrate specified complexity.)
In general, then, evolutionary algorithms generate not true specified complexity but at best the appearance of specified complexity.
So if Dembski has anything to say about the ability of ev to generate CSI it should certainly be understood in the context of what he wrote earlier in the chapter. It looks like Schneider went straight to the section on ev and thus failed to understand it in context. With that said, Dembski has in one fell swoop dispensed with 75% of MathGrrl's scenarios, unless she wants to argue that they are not EA's. That also raises a question: with one of four possible bases per site, what is the minimum number of sites required to encode information exceeding the UPB and the universal complexity bound? iirc, the genome length of an ev organism was 256 sites, but each individual binding site is only 16 (16x16=256).
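
On the arithmetic behind that question (an editorial sketch, using the 500-bit and 1,000-bit thresholds from the OP; four bases give at most log2(4) = 2 bits per site):

```python
from math import log2, ceil

bits_per_site = log2(4)                      # 4 bases -> at most 2 bits per site
for threshold in (500, 1000):
    print(threshold, "bits needs at least", ceil(threshold / bits_per_site), "sites")
# 500 bits -> 250 sites; 1,000 bits -> 500 sites.
# A 256-site ev genome tops out at 256 * 2 = 512 bits of raw capacity.
```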
Mung
April 24, 2011 at 2:33 PM PDT
Instead of driving down to my storage unit to pull out my copy of NFL, I just bought the Kindle version. Go Kindle!
Also, in No Free Lunch, Dembski asked where the “CSI” came from in Ev runs (p. 212 and following). So Ev creates “CSI”.
Dembski writes the following:
To see that the Darwinian mechanism is incapable of generating specified complexity, it is necessary to consider the mathematical underpinnings of that mechanism, to wit, evolutionary algorithms. By an evolutionary algorithm I mean any well-defined mathematical procedure that generates contingency via some chance process and then sifts it via some law-like process.
So it's pretty clear where, if there is any CSI, it does not come from.

Mung
April 24, 2011 at 12:45 PM PDT
Schneider should have checked (or should be more familiar with the field) before so dismissively correcting.
The opinion I'm developing of Schneider from reading his online postings is that he just expects to be believed because he is, after all, just exposing the creationists as the frauds they are. He certainly doesn't seem to take Dembski seriously or even consider that he might be doing original work, even though he has what, three doctorates? (How his wife stands him I don't know ;) )

Mung
April 24, 2011 at 12:29 PM PDT
28 --> Were these Schneider's dismissed "subjective" specifications, useless in guiding the selection of available components and the creation of a reasonably successfully configured system? 29 --> Or, later on when I spotted how to take juice bottle caps, cellulose sponges, and Al mains cable lying on the ground after a hurricane (Hugo) to make soldering iron stands for effectively zero cost, was that subjectivity pretty useless, or the specificity of configuration meaningless? 30 --> Or is my son's current exercise to convert some card lying around into a version of Kreigspeil, useless? Later on he tangles it up with specificity, which is a terrible term: 31 --> More inappropriately denigratory dismissal Biological specification always refers to function. An organism is a functional system comprising many functional subsystems. In virtue of their function these systems embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified ... (page 148) 32 --> Compare Schneider's clip with mine in the OP, to see how this has been twisted by extraction from context:
148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology. I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . . Biological specification always refers to function . . . In virtue of their function [a living organism's subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . ."
33 --> In other words, the issue is the Wicken wiring diagram, whereby specific items are organised in specific ways to fulfill a function. And Wicken has been on record since 1979:
‘Organized’ systems are to be carefully distinguished from ‘ordered’ systems. Neither kind of system is ‘random,’ but whereas ordered systems are generated according to simple algorithms [[i.e. “simple” force laws acting on objects starting from arbitrary and common- place initial conditions] and therefore lack complexity, organized systems must be assembled element by element according to an [[originally . . . ] external ‘wiring diagram’ with a high information content . . . Organization, then, is functional complexity and carries information. It is non-random by design or by selection, rather than by the a priori necessity of crystallographic ‘order.’ [[“The Generation of Complexity in Evolution: A Thermodynamic and Information-Theoretical Discussion,” Journal of Theoretical Biology, 77 (April 1979): p. 353, of pp. 349-65.]
So I'll take it that if one can make a significant sequence logo from a set of sequences, then that pattern is 'specified'. Clearly Dembski would want these to fall under his roof because they represent the natural binding patterns of proteins on DNA. 34 --> this of course points straight back to the problem already dealt with whereby Schneider, having designed the system and how it operates, imagines that the ev program is a faithful representation of undirected natural selection, so he csn infer form his intelligently designed system that undirected natural processes can find isolated islands of function [he starts on an island of function] and so spontaneously create CSI out of lucky noise. (Note: the concept of "specified" is the point where Dembski injects the intelligent agent that he later "discovers" to be design! 35 --> Willful strawman misrepresentation. Schneider knows, or -- on easily accessible corrective materials -- should know better. 36 --> The analytical issue Schneider simply will not engage is the attempt to land on zones/islands of interest that are deeply isolated in large config spaces by undirected chance and necessity. 37 --> That search space challenge is central to Dembski's work, and it is central to the statistical foundation of the second law of thermodynamics: statistical miracles are not to be relied on, and some things are so remote that they are not credibly observable on the gamut of our observed cosmos, by undirected chance and necessity. 38 --> that is why Dembski keeps on highlighting threshold metrics as have been elaborated int eh OP, and reduced to the thresholds above. Those thresholds are reasonable scales for config spaces where blind chance random walks and trial and error become utterly implausible as explanations. 39 --> For reasons Abel explained in his 2009 paper. This makes the whole argument circular. 40 --> Wrong, and a denigratory dismissal Dembski wants "CSI" rather than a precise measure such as Shannon information because that gets the intelligent agent in. 41 --> Pummelling the strawman. Instead look at the reason why zones of interest in config spaces are significant, and seek to understand why the needle in a haystack challenge is a challenge. If he detects "CSI", then by his definition he automatically gets an intelligent agent. 42 --> How rhetorically convenient it is to dismiss an inference to best empirically anchored explanation across known causal factors -- chance, necessity, art -- as a circular a priori assumption. 43 --> This is also an atmosphere-clouding and poisoning turnabout false accusation as there is documented proof of a priori materialism censoring origins science. The error is in presuming a priori that the information must be generated by an intelligent agent.) 44 --> In fact the real error is in your presumption of materialism a la Lewontin, and projecting unto Dembski the same error. In fact an inference to best explanation on the known facts of cause and patterns of empirical signs is routinely used to distinguish chance, necessity and agency as causes.>> __________________ Professor Schneider, plainly, needs to re-assess his approach and analysis. GEM of TKIkairosfocus
April 24, 2011 at 11:31 AM PDT
Mung: On following your link to Schneider, I clipped this, which I am going to mark up on points, as rhetorical games of a very familiar ilk -- guess where MG got her talking points -- are being played: _______________ >>On page 127 Dembski introduces the information as I(E) = def - log2 P(E), where P(E) is the probability of event E. 1 --> This is actually following a standard and well-accepted professional Eng'g usage of the term and its quantification, following Hartley's suggestion and Shannon's work. This needs to be established, to expose what follows. 2 --> I turn to my trusty Princs of Comm Systems, 2nd edn, Taub and Schilling (McGraw Hill, 1986), p. 512, Sect. 13.2 (which follows my good old Connor, as cited and used in my always linked; cf as well Harry Robertson's related development of thermodynamic Entropy in Statistical Thermophysics (PH, 1998), pp. 3 - 6, 7, 36, etc, as also cited and used):
Let us consider a communication system in which the allowable messages are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [My nb: i.e. the a posteriori probability in my online discussion is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by Ik = (def) log2 1/pk (13.2-1)
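To make the quoted definition concrete before the markup continues, here is a minimal Python sketch -- an illustration only, not code from Taub and Schilling, Shannon or Dembski -- of the per-message surprisal Ik = log2(1/pk) and of the weighted-average measure H that comes up at point 6 below:

```python
# A minimal sketch (illustration only) of the quoted definitions:
# surprisal Ik = log2(1/pk) for one correctly received message, and the
# weighted average H = -SUM pi*log2(pi), the "uncertainty" of point 6 below.
from math import log2

def surprisal(p):
    """Information, in bits, conveyed by a correctly received message of probability p."""
    return log2(1.0 / p)

def shannon_H(probs):
    """Average information per symbol, H = -sum(pi * log2 pi)."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(surprisal(0.5))          # 1.0 bit for a fair-coin outcome
print(shannon_H([0.5, 0.5]))   # 1.0 bit/symbol
print(shannon_H([0.9, 0.1]))   # ~0.47 bits/symbol for a biased source
```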
3 --> In short, Dembski's use of the term "information" is precisely correct, though it differs from the terminology used by others.

4 --> Schneider should have checked (or should be more familiar with the field) before so dismissively correcting.

Actually this is the surprisal introduced by Tribus in his 1961 book on thermodynamics.

5 --> Improperly dismissive.

Be that as it may, it is only a short step from this to Shannon's uncertainty (which he cites on page 131)

6 --> H = - SUM_i [pi * log2 pi], i.e. the weighted average information per symbol

and from there to Shannon's information measure, so it is reasonable to use the significance of Shannon's information for determining complexity . . .

7 --> Here we see that he is ducking the key point in Shannon's work: Shannon did not address the content, definiteness or meaningfulness of the information, as his interest was in things like the carrying capacity of lines for teletype or the like.

8 --> Dembski is precisely concerned with such, and so are Durston et al as excerpted in the OP, point 10. Thus their distinguishing of the ground state from the functional state, and their addressing of a metric of the increment on so moving, based on Shannon's H.

9 --> Nor is this distinction immaterial or erroneous. If one is interested in the functionality of specific configurations in a message [text vs gibberish, for instance], it is a reasonable step to set out to identify it and how it may be measured.

What is "Specified"? This seems to be that the 'event' has a specific pattern.

10 --> As can be seen in the clip from pp. 144 and 148 in NFL, Dembski has a very specific distinction in mind, namely that the particular event in question comes from a zone of interest in a config space of the possible states of the relevant info-storing entity.

11 --> So, if one observes a particular config E, but any config in a set T that includes E would do just as well, we need to address not the likelihood of getting E but of landing anywhere in T.

12 --> T being the zone of interest in the config space, the relative proportion of the space taken up by T is a pretty good first indicator of how specific it is relative to other plausibly possible configs.

13 --> It is precisely at this point that Schneider goes off the rails, as he plainly (for years) cannot or will not see the significance of a restricted zone of interest in a wider config space.

In the book [NFL] he runs around with a lot of vague definitions.

14 --> Resort to sophomoric, belittling characterisation of what one does not understand or accept.

15 --> For instance, in the case that Dembski discusses in his 2005 paper, he picks the case from Dan Brown's Da Vinci Code, where the protagonists must enter the correct code for a bank vault the first time or they will be locked out. This is a singleton set, and the specification is twofold: (i) the right code, and (ii) no mistakes and re-tries permitted.

16 --> In the event, a clue had to be de-scrambled to form the first several members of the Fibonacci sequence, and we see the functional config E being specific to a one-shot correct code for the vault.

17 --> This explanatory discussion, which helps those who have difficulty understanding that specifications relate to tight zones of interest in large config spaces, has been online at Dembski's site since 2005 or so. So, no critic in 2011 should be posting information that does not reckon with this, if s/he is at all concerned to be fair minded.
18 --> But in fact, such clustering of configurations in zones of interest is quite familiar, and fairly easy to understand. So familiar that we have a common phrase for the point to be made: searching for a needle in a haystack. (A term that Dembski uses.)

"Specification depends on the knowledge of subjects.

19 --> But of course, design specs set out the desired configs of a working system, before it is implemented on the ground.

20 --> In fact, as any experienced designer will confirm, getting the design specs right [and acceptable to the relevant stakeholders] is a key part of a good design job, and it is on the right specs that identification of components/items and their configurations to achieve a target performance is undertaken.

Is specification therefore subjective? Yes." Page 66

21 --> Subjective has two meanings, and if one equivocates, one may twist the proper sense. It is subjects who set designs or who describe the characteristics of zones of interest in possible config spaces. No meaning-understanding and -expressing subject, no specification of significance.

That means that only if Dembski says something is specified, it is.

22 --> Off the rails. And demeaningly abusive.

23 --> On the contrary, specifications are very useful indeed, and can be pretty objective. Just look at the drawings for a house, or a car engine, or a circuit diagram, etc.

That's pretty useless of course.

24 --> Contemptuously abusive and dismissive, where in fact it is Schneider who plainly is either unfamiliar with the business of design -- has he ever built a house? -- or is being willfully deceptive of those who look to him for intellectual leadership.

25 --> In either case, his whole argument is dead at this point.

26 --> FYI, Professor Schneider and MG, there is an excellent reason why software like AutoCAD is so much in demand among engineers. And, when I used to have to troubleshoot [and build] scientific instrumentation for a living, almost the first thing I wanted was the system and circuit etc. drawings.

27 --> When I had to design and build them, I needed specifications to guide what was to be done, based on what was technically possible and reasonably affordable.
(A 0.1 deg C resolving thermometer of less than 1 - 2 mm diameter for inserting into the shells of living oysters comes to mind, for the implication was that we could not use thermocouple probes, which would poison the critters, wild oysters being studied towards oyster farming. I ended up using glass-encapsulated thermistor probes sealed into a plastic pen body, and using a calibration curve to linearise a VCO output used as a single-ramp A/D. Precision turned out to be of order 0.01 C. Did the job, within available resources at a time of austerity. It could detect a drop of ice-cold water dropped into a beaker.)
[ . . . ]
kairosfocus
April 24, 2011 at 11:30 AM PDT
My own CSI Challenge

Schneider presents Two cases of "Complex Specific Information"

If the binding sites produced by ev happened to match the binding sites from the Hengen paper, and the sequence logos matched each other, would he have been surprised? If so, why? Would we be warranted in inferring that someone had fiddled with the results? Why?
Mung
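For readers who want to see the quantity that a sequence logo and a binding-site comparison rest on, here is a minimal sketch -- an illustration with made-up sites, not Schneider's ev code, and omitting his small-sample correction -- of the usual per-position information sum for aligned DNA sites (2 bits minus the positional uncertainty):

```python
# Rough sketch of the information content of a set of aligned DNA binding sites,
# the quantity a sequence logo displays: roughly the sum over positions of
# (2 - H_position) bits; Schneider's small-sample correction is omitted here.
from math import log2
from collections import Counter

def site_information(aligned_sites):
    """aligned_sites: list of equal-length DNA strings (A/C/G/T)."""
    n = len(aligned_sites)
    length = len(aligned_sites[0])
    total = 0.0
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_sites)
        H = -sum((c / n) * log2(c / n) for c in counts.values())
        total += 2.0 - H   # 2 bits is the maximum uncertainty for 4 bases
    return total

sites = ["TTGACA", "TTGACT", "TAGACA", "TTGATA"]   # made-up example sites
print(round(site_information(sites), 2), "bits across", len(sites[0]), "positions")
```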
April 24, 2011 at 09:34 AM PDT
F/N 2: To get an idea of how hard it is to land in the border zone of non-blue and non-black space -- the growing space -- call up the M-brot applet here, head for sea horse valley [between body and head] with your mouse cross-hairs, and try to pick the border zone by hand. You will find it much easier to end up in the sea or in the black hole. And that is when you are close! (Notice, too, how the sea in this implementation gives a gradient towards the functional edge zone. Think of how a slope-sensor algorithm would be able to hill-climb towards the border zone regardless of the fact that in the sea there is no function to reward.) I am beginning to think that a bit of experimentation with an algorithm like this will give a clearer idea of what is going on than essays of many words. So, have some fun . . .
kairosfocus
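For anyone who would rather probe the border numerically than by mouse, here is a minimal escape-time sketch (an illustration only; the applet linked above is not this code). Points inside the set never escape, points far outside escape almost immediately, and points near the border take noticeably longer -- which is the "hard to hit by hand" zone the exercise above is about:

```python
# Minimal escape-time probe near "sea horse valley" (around c = -0.75 + small*i).
# Points inside the Mandelbrot set never escape; points far outside escape at once;
# points near the border escape only after many iterations.

def escape_count(c, max_iter=1000):
    z = 0j
    for k in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:       # once |z| > 2 the orbit must diverge
            return k
    return max_iter            # treated as "did not escape" (inside, for our purposes)

for c in (-0.75 + 0.01j,       # very close to the body/head pinch: slow escape
          -0.75 + 0.30j,       # further from the border: quick escape
          -2.50 + 0.00j,       # far out in the "sea": immediate escape
          -0.10 + 0.10j):      # well inside the main cardioid: never escapes
    print(c, escape_count(c))
```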
April 24, 2011 at 03:17 AM PDT
F/N, re 145:
Sch: Also, in No Free Lunch, Dembski asked where the “CSI” came from in Ev runs (p. 212 and following). So Ev creates “CSI”.
Schneider is here strawmannising Dembski and knocking over the convenient strawman. This comes out in what we just saw about how ev actually "creates" information: the program is the on-stage proxy for its designer. GIGO. The pummelling of the strawman:
Was it, as Dembski suggests in No Free Lunch (page 217), that I programmed Ev to make selections based on mistakes? Again, no. Ev makes those selections independently of me. Fortunately I do not need to sit there providing the selection at every step! The decisions are made given the current state, which in turn depends on the mutations which come from the random number generator. I personally never even see those decisions.
So, who designed the algorithm, set up the decision nodes and branch programs? The loops? Try two options:
A: ev created itself out of lucky noise?

B: Schneider wrote ev?
No prizes for guessing which is correct. So, even though Schneider was not personally present to decide, he programmed the decision for his mechanical proxy, the computer: an idiot robot that -- notoriously -- will do exactly what you tell it, regardless of consequences.

So, when he caricatures and pummels Dembski as though Dembski suggested that Schneider MANUALLY made every uphill-climbing decision from zero function, that is something Schneider knew -- or full well SHOULD have known -- was a demeaning and disrespectful strawman misrepresentation. That mean-spirited joke at Dembski's expense also admirably serves to distract attention from the very real behind-the-scenes strings leading from the idiot robot with the GIGO problem (the problem responsible for the existence of the software industry) to the creator of the program, who set up the sort of hill-climbing from a zero base we have seen.

FYI, every strawman argument is automatically a distractive red herring. It just adds the entertainment factor of a strawman about to be beaten up, or soaked in ad hominems and burned after being beaten up. And the notion that ev is actually making DECISIONS is the worst kind of anthropomorphising of that idiot robot, the computer.

GEM of TKI
kairosfocus
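To make the "who decided?" point concrete, here is a schematic of the generic mutate-and-select loop such programs run -- a sketch only, emphatically not Schneider's ev source, using a Weasel-style Hamming oracle of the kind discussed further down the thread. Every item marked in the comments is chosen and fixed in advance by the human programmer:

```python
# Schematic of a generic evolutionary hill-climber (NOT Schneider's ev source code).
# Every commented item below is chosen and fixed in advance by the program's designer.
import random

TARGET = "METHINKS"                        # designer-chosen target / specification
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"    # designer-chosen representation

def fitness(genome):
    # Designer-supplied scoring rule: a Hamming-style "warmer/colder" oracle.
    return sum(a == b for a, b in zip(genome, TARGET))

def mutate(genome, rate=0.05):
    # Designer-chosen mutation operator and rate.
    return "".join(random.choice(ALPHABET) if random.random() < rate else ch
                   for ch in genome)

random.seed(0)
parent = "".join(random.choice(ALPHABET) for _ in TARGET)
for generation in range(100_000):
    child = mutate(parent)
    if fitness(child) >= fitness(parent):  # designer-chosen selection rule
        parent = child
    if parent == TARGET:
        break
print(generation, parent, fitness(parent))
```

Nothing in this loop decides anything the programmer did not already decide; the random number generator only determines when the pre-built decision rules fire.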
April 24, 2011 at 02:49 AM PDT
Mung and Ilion: Ah, but you see, in Schneider's thinking ev only simulates what he thinks real evolution does -- it is all based on his work with real organisms [on MICRO-evo of already functioning organisms!] -- so the work to get the program set up right is all of no account. Don't watch that curtain, the strings going up from the dancing ventriloquist's puppets, the smoke, the mirrors, the trap doors under the stage, etc! It's magic [natural selection . . . ].

That is why Schneider, on his racehorse page, thought that the tweaked program showed how natural selection beat the Dembski bound in an afternoon. It never dawned on him that THE LIMIT IS A PHYSICAL ONE -- THE NUMBER OF PLANCK-TIME QUANTUM STATES OF 10^80 ATOMS ACROSS THE THERMODYNAMIC LIFESPAN OF THE COSMOS OR THE SOLAR SYSTEM. Where, 10^20 P-times are needed to carry out the fastest nuclear, strong-force interactions. So, if you are beating the performance of random search before such a limit, it is because you have intelligently injected active information. But if Dembski et al are IDiots, their opinions are of no weight.

Never mind that the same basic concepts are foundational to the statistical basis for the second law of thermodynamics, and for decades have pointed to 1 in 10^50 as more or less an upper limit to plausibly observable events in a lab-scale setting. (A limit that, not so coincidentally, is what the random text search strings are running up to, and the tierra program has hit.)

Config spaces double for every additional bit, and if you take 200 bits as a reasonable upper lab-observability threshold, 398 bits or 500 bits are 200 - 300 doublings beyond that. My own preferred 1,000-bit limit is 800 doublings beyond that. (A short arithmetic sketch of these figures is appended after this comment.) If you are finding needles in haystacks that big, it is because you know where to look. No ifs, ands or buts about it. And, in the real case of cell-based life, you are beyond about 200 k bits of functional information. That is two hundred TIMES the number of doublings to get to 1,000 bits.

When you look at the integrated functional specificity involved, to get a metabolising entity with a stored-data von Neumann self-replicator, with step-by-step regulatory controls and complex, organised molecular nanomachines, the conclusions are pretty obvious. Save to those utterly determined -- never mind the absurdities we just saw exposed -- NOT to see them. I guess I was about 5 or 6 when my mom taught me that there is none so blind as s/he who refuses to -- WILL not -- see.

Let's spell out a few points:

1 --> Ev is a software program run on a computer, using computer languages to specify code in execution of a designed program that takes in and stores inputs, manipulates them according to a process logic [an algorithm] and generates outputs.

2 --> In that process, there are step-by-step sequences taken, there are decision points, there are branches. All fed in and tweaked to work by the designer.

3 --> Opportunity and means and motive to get it "right."

4 --> GAs are based on setting a so-called fitness landscape that is like the heights of the points on a contour map, sitting on the underlying base plane of some config space of possibilities.

5 --> That map is based on a function that tends to have nice trends of slope pointing conveniently uphill. (There are many, many functions that do not have reliable trends like that -- think of cliffs that drop you into sinkholes too deep and broad to climb back out of, sort of like a ring island with a huge sunken volcano caldera in the middle. So, to specify such a nice function is a major intelligent intervention.)

6 --> Then, you have to have controlled random change in a genome that conveniently does not rapidly accumulate to damage the key functions that allow you to climb the hill.

7 --> Which hill-climbing algorithms are themselves very intelligently designed and programmed to push upslope to the heights that you "know" are there.

8 --> Let's take back up the Mandelbrot set example looked at above. Let the central black space be the caldera of doom. (Cf. here for a discussion of the zone of the plane in which the set is found, including equations for the head-bulb and the main part of the cardioid body. The diagram here will show that the familiar head runs out towards -2 on the reals, and the double-lobe body is between 1 and -1 on the imaginary axis and does not quite go to +0.5 on the reals, linking to the head at about -0.7 on the real axis. This amazing infinity of points is in a box 3 wide by 2 high, with zero being about 1/2 a unit in from the RH end. The vastly wider sea of points beyond this zone is all flat blue on the usual rendering.)

9 --> Now, run your hill-climbing, noting that there are infinitely many points that will drop you dead, and there is a vast sea of non-function around.

10 --> Start out in that sea. Oops, you are a non-starter. Try again. Oops, by overwhelming probability you keep on landing in the sea of non-function. And since the ring of possibilities there is going to be smallish, a random walk there is not going to allow your hill-climbing to get started.

11 --> You are stuck and cannot climb. (Unless your hill-climber somehow knows the direction to the island of function and can pull you in, i.e. there is a Hamming-distance oracle that gives a warmer-colder signal. This was the tactic used by Weasel.)

12 --> See why it is highly significant to observe that these evolutionary algorithms start and work within an island of intelligently input function?

13 --> And/or that they use a Hamming oracle to pull you in towards the island of function?

14 --> Q: What do I mean by that?
A: Well, look at Schneider's graph for how ev hill climbs, noting that HE SAYS THAT THE INITIAL RUNS ARE AT EFFECTIVELY ZERO FUNCTION OR ZERO INFORMATION, more or less the same thing. So, how is something with minimal or flat zero function reproducing itself on differential success, enabling a hill-climb?
15 --> The logic of ev and similar EAs is already falling apart.

16 --> The answer is that there is a lot of implicit function providing the context in which ev can hill-climb on minor differences in the targeting metric, aka the genome.

17 --> There is a Hamming oracle on distance to target, or something that is set up to pick up slight increments in proximity to target [on the in-built assumption that heading uphill moves you closer to home . . . ].

18 --> The inclusion of such a targeting routine and assumption is doubtless excused as being how real evolution works, so you are simply modelling reality.

19 --> Now, let us remember our black hole in the heart of the Mandelbrot island, with infinitely many extensions to drop you off into non-performance in the hill-climbing zone.

20 --> Climb away from the shoreline. The variants that go out to sea are eliminated. Those that climb tend to do so until they hit a black hole.

21 --> But, while they are running, the variability part of the entity is a small fraction. So, we have sudden appearance, stasis with minor adaptive variations, and eventual sudden disappearance.

22 --> A model that is about micro-evo of already existing functional forms, and which explains minor variation to fit niches, then disappearance.

23 --> Sounds familiar:
. . . long term stasis following geologically abrupt origin of most fossil morphospecies, has always been recognized by professional paleontologists. [[Gould, S J, The Structure of Evolutionary Theory (2002), p. 752.] . . . . The great majority of species do not show any appreciable evolutionary change at all. These species appear in the section [[first occurrence] without obvious ancestors in the underlying beds, are stable once established and disappear higher up without leaving any descendants." [[p. 753.] . . . . proclamations for the supposed ‘truth’ of gradualism - asserted against every working paleontologist’s knowledge of its rarity - emerged largely from such a restriction of attention to exceedingly rare cases under the false belief that they alone provided a record of evolution at all! The falsification of most ‘textbook classics’ upon restudy only accentuates the fallacy of the ‘case study’ method and its root in prior expectation rather than objective reading of the fossil record. [[p. 773.]
24 --> We do have a model of evolution: one of micro-evo on existing, functional body plans, with a proneness to drop off into non-performance.

25 --> Add in an unpredictable, gradual or sudden variability in the island's topography of function, and voila, we have a model that fits the fossil record pretty well.

26 --> Only, it shows that small-scale random changes rewarded by hill-climbing do not explain what was needed: MACRO-evo.

27 --> BTW, notice in the diagram, too, how when the hill-climbing filter is cut off we see a rapid descent downslope. That is, the island topography is confirmed.

_______

Schneider's ev confirms that it is active information that is the source of the new information in the output, and that it takes design to get you to, and keep you on, an island of function. So, the system is targeted and tuned, and it depends on a nice topography for the islands of function it works within. In short, the functional info that appears in the o/p was built in from the intelligent actions of the designer of the program. Ev does not overturn design theory, but illustrates its point about specified complexity and its credible cause.

GEM of TKI
kairosfocus
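As promised above, a short back-of-envelope sketch of the threshold arithmetic. This is an illustration only; the 10^45 Planck-times-per-second and 10^25-second figures are the ones usually used to derive the 10^150 bound, and are taken here as assumptions:

```python
# Back-of-envelope check of the threshold figures quoted in the comment above.
# Assumed resource estimate behind the ~10^150 bound: ~10^80 particles,
# ~10^45 Planck times per second, ~10^25 seconds of cosmic history.
particles, ticks_per_second, seconds = 10**80, 10**45, 10**25
resource = particles * ticks_per_second * seconds
print(f"resource: ~10^{len(str(resource)) - 1} Planck-time states")

# Config spaces double with every added bit: 2^bits possible configurations.
for bits in (200, 398, 500, 1000):
    configs = 2**bits
    print(f"{bits:4d} bits -> ~10^{len(str(configs)) - 1} configurations")
```

On these assumptions, 500 bits matches the ~10^150 state count and 1,000 bits is roughly 150 orders of magnitude beyond it, which is the sense in which the doubling counts in the comment are meant.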
April 24, 2011 at 02:22 AM PDT