Uncommon Descent Serving The Intelligent Design Community

NEWS FLASH: Dembski’s CSI caught in the act


Dembski’s CSI concept has come under serious question, dispute and suspicion in recent weeks here at UD.

After diligent patrolling, the cops announce a bust: acting on tips from unnamed sources, they have caught the miscreants in the act!

From a comment in the MG smart thread, courtesy Dembski’s NFL (2007 edn):

___________________

>>NFL as just linked, pp. 144 & 148:

144: “. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information, (T, E) constitutes CSI because T [i.e. “conceptual information,” effectively the target hot zone in the field of possibilities] subsumes E [i.e. “physical information,” effectively the observed event from that field], T is detachable from E, and T measures at least 500 bits of information . . . ”

148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology.

I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . .

Biological specification always refers to function . . . In virtue of their function [a living organism’s subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . . “

Here we see all the suspects together caught in the very act.

Let us line up our suspects:

1: CSI,

2: events from target zones in wider config spaces,

3: joint complexity-specification criteria,

4: 500-bit thresholds of complexity,

5: functionality as a possible objective specification

6: biofunction as specification,

7: origin of CSI as the key problem of both origin of life [Eigen’s focus] and of evolution (origin of body plans, species, etc.),

8: equivalence of CSI and complex specification.

Rap, rap, rap!

“How do you all plead?”

“Guilty as charged, with explanation your honour. We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.”

“Guilty!”

“Throw the book at them!”

CRASH! >>

___________________

So, now we have heard from the horse’s mouth.

What are we to make of it, in light of Orgel’s conceptual definition from 1973 and the recent challenges to CSI raised by MG and others?

That is:

. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [The Origins of Life (John Wiley, 1973), p. 189.]

And, what about the more complex definition in the 2005 Specification paper by Dembski?

Namely:

define ϕS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T. [26] . . . . where M is the number of semiotic agents [S’s] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit operations that the known, observable universe could have performed throughout its entire multi-billion year history.[31] . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity [χ] of T given [chance hypothesis] H [in bits] . . . as  [the negative base-2 log of the conditional probability P(T|H) multiplied by the number of similar cases ϕS(t) and also by the maximum number of binary search-events in our observed universe 10^120]

χ = – log2[10^120 · ϕS(T) · P(T|H)]  . . . eqn n1

How about this (we are now embarking on an exercise in “open notebook” science):

1 –> 10^120 ~ 2^398

2 –> Following Hartley, we can define information on a probability metric (logs taken to base 2, so the unit is bits):

I = – log2(p)  . . .  eqn n2

3 –> So, we can re-present the Chi-metric (writing D2 for the ϕS(T) factor):

Chi = – log2(2^398 * D2 * p)  . . .  eqn n3

Chi = Ip – (398 + K2)  . . .  eqn n4, where Ip = – log2(p) and K2 = log2(D2)

4 –> That is, the Dembski CSI Chi-metric is a measure of information for samples from a target zone T, on the presumption of a chance-dominated process, expressed in bits beyond a threshold of at least 398 bits, a threshold covering 10^120 possibilities.
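To make the reduction concrete, here is a minimal Python sketch (the values of p and ϕS(T) below are purely illustrative assumptions, not figures from Dembski); it evaluates eqn n1 directly in log space and checks that it agrees with the reduced form of eqn n4:

```python
import math

# Illustrative, assumed inputs -- not values from Dembski's paper
p = 1e-200       # P(T|H): probability of the target zone under the chance hypothesis
phi_S = 1e20     # phi_S(T): number of patterns at least as simply describable as T

# Eqn n1, evaluated in log space to avoid overflow/underflow:
# Chi = -log2(10^120 * phi_S * p) = -(120*log2(10) + log2(phi_S) + log2(p))
chi_direct = -(120 * math.log2(10) + math.log2(phi_S) + math.log2(p))

# Reduced form (eqn n4): Chi = Ip - (398 + K2), with Ip = -log2(p) and K2 = log2(D2)
Ip = -math.log2(p)
K2 = math.log2(phi_S)
chi_reduced = Ip - (120 * math.log2(10) + K2)  # 120*log2(10) ~ 398.6, rounded to 398 in the text

print(round(chi_direct, 2), round(chi_reduced, 2))  # both ~ 199.32 bits
```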

5 –> Where also K2 is a further increment to the threshold, one that naturally peaks at about 100 additional bits. In short, VJT’s CSI-lite is an extension and simplification of the Chi-metric. He explains in the just-linked post (building on the further-linked one):

The CSI-lite calculation I’m proposing here doesn’t require any semiotic descriptions, and it’s based on purely physical and quantifiable parameters which are found in natural systems. That should please ID critics. These physical parameters should have known probability distributions. A probability distribution is associated with each and every quantifiable physical parameter that can be used to describe each and every kind of natural system – be it a mica crystal, a piece of granite containing that crystal, a bucket of water, a bacterial flagellum, a flower, or a solar system . . . .

Two conditions need to be met before some feature of a system can be unambiguously ascribed to an intelligent agent: first, the physical parameter being measured has to have a value corresponding to a probability of 10^(-150) or less, and second, the system itself should also be capable of being described very briefly (low Kolmogorov complexity), in a way that either explicitly mentions or implicitly entails the surprisingly improbable value (or range of values) of the physical parameter being measured . . . .

my definition of CSI-lite removes Phi_s(T) from the actual formula and replaces it with a constant figure of 10^30. The requirement for low descriptive complexity still remains, but as an extra condition that must be satisfied before a system can be described as a specification. So Professor Dembski’s formula now becomes:

CSI-lite = – log2[10^120 · 10^30 · P(T|H)] = – log2[10^150 · P(T|H)] . . . eqn n1a

. . . .the overall effect of including Phi_s(T) in Professor Dembski’s formulas for a pattern T’s specificity, sigma, and its complex specified information, Chi, is to reduce both of them by a certain number of bits. For the bacterial flagellum, Phi_s(T) is 10^20, which is approximately 2^66, so sigma and Chi are both reduced by 66 bits. My formula makes that 100 bits (as 10^30 is approximately 2^100), so my CSI-lite computation represents a very conservative figure indeed.

Readers should note that although I have removed Dembski’s specification factor Phi_s(T) from my formula for CSI-lite, I have retained it as an additional requirement: in order for a system to be described as a specification, it is not enough for CSI-lite to exceed 1; the system itself must also be capable of being described briefly (low Kolmogorov complexity) in some common language, in a way that either explicitly mentions pattern T, or entails the occurrence of pattern T. (The “common language” requirement is intended to exclude the use of artificial predicates like grue.) . . . .

[As MF has pointed out] the probability p of pattern T occurring at a particular time and place as a result of some unintelligent (so-called “chance”) process should not be multiplied by the total number of trials n during the entire history of the universe. Instead one should use the formula (1–(1-p)^n), where in this case p is P(T|H) and n=10^120. Of course, my CSI-lite formula uses Dembski’s original conservative figure of 10^150, so my corrected formula for CSI-lite now reads as follows:

CSI-lite = – log2(1 – (1 – P(T|H))^(10^150)) . . . eqn n1b

If P(T|H) is very low, then this formula will be very closely approximated [HT: Giem] by the formula:

CSI-lite = – log2[10^150 · P(T|H)]  . . . eqn n1c
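As a quick numerical check of the approximation step just quoted, here is a small Python sketch (the value of P(T|H) is an arbitrary assumption, chosen only to be very small); it evaluates eqn n1b stably and compares it with the simplified eqn n1c:

```python
import math

p = 1e-180   # assumed, very small P(T|H)
n = 1e150    # the 10^150 bound used in eqn n1b

# Eqn n1b: -log2(1 - (1 - p)^n), computed stably:
# (1 - p)^n = exp(n * log1p(-p)), so 1 - (1 - p)^n = -expm1(n * log1p(-p))
tail = -math.expm1(n * math.log1p(-p))
csi_lite_n1b = -math.log2(tail)

# Eqn n1c: the small-p approximation, -log2(10^150 * p), kept in log space
csi_lite_n1c = -(math.log2(n) + math.log2(p))

print(round(csi_lite_n1b, 3), round(csi_lite_n1c, 3))  # both ~ 99.658 bits
```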

6 –> So, the idea of the Dembski metric in the end (debates about peculiarities in derivation notwithstanding) is that if the Hartley-Shannon-derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, the zone is so deeply isolated that a chance-dominated process is maximally unlikely to find it; intelligent agents, of course, routinely produce information beyond such a threshold.

7 –> In addition, the only observed cause of information beyond such a threshold is the now-proverbial intelligent semiotic agent.

8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.

9 –> So, we now have a simple but fairly sound context in which to understand the Dembski result, conceptually and mathematically [cf. more details here], tracing back to Orgel and onward to Shannon and Hartley. Let us augment here [Apr 17], drawing on a comment in the MG progress thread:

Shannon measured info-carrying capacity, toward one of his goals: metrics of the carrying capacity of communications channels (as in, who was he working for, again?).

CSI extended this to meaningfulness/function of info.

And in so doing, it observed that this required specificity naturally constricts the zone of the space of possibilities actually used to island[s] of function.

That specificity-complexity criterion links:

I: an explosion of the scope of the config space to accommodate the complexity (as every added bit DOUBLES the set of possible configurations),  to

II: a restriction of the zone, T, of the space used to accommodate the specificity (often to function/be meaningfully structured).

In turn that suggests that we have zones of function that are ever harder for chance-based random walks [CBRWs] to pick up, but which intelligence finds much more easily.

Thence, we see that if a metric for the information involved surpasses the threshold beyond which a CBRW is no longer a plausible explanation, then we can confidently infer to design as the best explanation.

Voila: we need an info-beyond-the-threshold metric. And, once we have a reasonable estimate of the direct or implied specific and/or functionally specific (especially code-based) information in an entity of interest, we have an estimate of, or a credible substitute for, the value of – log2(P(T|H)). In particular, if the value of information comes from direct inspection of storage capacity and of code-symbol patterns of use, leading to an estimate of relative frequencies, we may evaluate the average [functionally or otherwise] specific information per symbol used. This is a version of Shannon’s weighted-average information per symbol, the H-metric, H = – Σ pi * log2(pi), which is also known as informational entropy [there is an arguable link to thermodynamic entropy, cf. here] or uncertainty.
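For concreteness, here is a minimal Python sketch of that H-metric (the symbol frequencies below are invented purely for illustration, not taken from any real code or protein):

```python
import math

def shannon_H(freqs):
    """Average information per symbol, H = -sum(p_i * log2(p_i)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)

# Assumed, illustrative relative frequencies of symbol use (each list sums to 1.0)
flat_usage   = [0.25, 0.25, 0.25, 0.25]   # four equiprobable symbols
skewed_usage = [0.50, 0.25, 0.15, 0.10]   # a biased, redundant usage pattern

print(shannon_H(flat_usage))    # 2.0 bits/symbol -- the flat-distribution maximum
print(shannon_H(skewed_usage))  # ~1.74 bits/symbol -- redundancy lowers the average
```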

As in (using Chi_500 for VJT’s CSI-lite [UPDATE, July 3: and S for a dummy variable that is 1/0 according as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):

Chi_500 = Ip*S – 500,  bits beyond the [solar system resources] threshold  . . . eqn n5

Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6

Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a
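As a quick arithmetic check on the config-space sizes quoted in eqns n6 and n6a, a short sketch (nothing assumed beyond the powers of two themselves):

```python
import math

for bits in (500, 1000, 1024):
    exp10 = bits * math.log10(2)                  # log10 of 2^bits
    mantissa = 10 ** (exp10 - math.floor(exp10))  # leading digits
    print(f"2^{bits} ~ {mantissa:.2f} * 10^{int(exp10)}")
# 2^500  ~ 3.27 * 10^150
# 2^1000 ~ 1.07 * 10^301
# 2^1024 ~ 1.80 * 10^308
```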

[UPDATE, July 3: So, if we have a string of 1,000 fair coins and toss them at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi_500 metric, I would be high, 1,000 bits, but S would be 0, so the value of Chi_500 would be – 500, i.e. well within the possibilities of chance. However, if we came to the same string later and saw that the coins somehow now carried the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. That is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were converted into coins and tables etc. and tossed at an impossibly fast rate, it would still be impossible to sample enough of the space of possibilities W for something from so unrepresentative a zone T to be reasonably explained by chance. So, as long as an intelligent agent capable of choice is possible, choice, i.e. design, would be the rational, best explanation of the sign observed: functionally specific, complex information.]
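The coin-toss update above can be condensed into a few lines of Python; this is only a sketch of the Chi_500 bookkeeping of eqn n5, with the specificity flag S supplied by the observer rather than computed:

```python
def chi_500(I_bits, S):
    """Chi_500 = Ip*S - 500: bits beyond the solar-system threshold (eqn n5)."""
    return I_bits * S - 500

# 1,000 fair coins tossed at random: high capacity, but no independent specification
print(chi_500(1000, S=0))   # -500 -> comfortably within the reach of chance

# The same 1,000 coins later found spelling out ~143 ASCII characters of text:
# the capacity I is unchanged, but the configuration is now specific, so S = 1
print(chi_500(1000, S=1))   # 500 -> 500 bits beyond the threshold
```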

10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework. Excerpting:

Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribosomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
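As a toy illustration of the per-site calculation described in the excerpt (the aligned columns below are fabricated for demonstration and correspond to no real protein family), here is a minimal Python sketch: the functional information at a site is log2(20) minus that site’s Shannon entropy, and the Fit value for the whole alignment is the sum over sites.

```python
import math
from collections import Counter

def site_fits(column):
    """Functional information at one aligned site: log2(20) - H(X_f), in Fits."""
    counts = Counter(column)
    total = len(column)
    H = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(20) - H

# Assumed toy alignment: each string is one aligned site (column) across 8 sequences
columns = [
    "GGGGGGGG",   # fully conserved site: H = 0, so ~4.32 Fits
    "GGGGAAAA",   # two residues, equally common: H = 1, so ~3.32 Fits
    "GAVLIFWY",   # highly variable site: H = 3, so only ~1.32 Fits
]

total_fits = sum(site_fits(col) for col in columns)
print(round(total_fits, 2))  # ~8.97 Fits for this toy three-site alignment
```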

11 –> So, Durston et al are targeting the same goal, but have chosen a different path from the start-point of the Shannon-Hartley log-probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground state to a functional state on investigation of protein-family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:

Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:

RecA: 242 AA, 832 fits, Chi: 332 bits beyond

SecY: 342 AA, 688 fits, Chi: 188 bits beyond

Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond  . . . results n7
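The three results above reduce to one subtraction each; a short sketch using the Fit values quoted from Durston’s Table 1:

```python
# Fit values as quoted above from Durston et al., Table 1
durston_fits = {
    "RecA":      832,
    "SecY":      688,
    "Corona S2": 1285,
}

THRESHOLD = 500  # bits: the solar-system threshold of eqn n5

for protein, fits in durston_fits.items():
    print(f"{protein}: Chi_500 = {fits - THRESHOLD} bits beyond the threshold")
# RecA: 332, SecY: 188, Corona S2: 785 -- matching results n7
```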

The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )

In short, one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol, so we can see how redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage-unit bits [= no. of AAs * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained].

12 –> I guess I should not leave off the simple, brute force X-metric that has been knocking around UD for years.

13 –> The idea is that we can judge information in or reducible to bits, as to whether it is or is not contingent and complex beyond 1,000 bits. If so, C = 1 (and if not C = 0). Similarly, functional specificity can be judged by seeing the effect of disturbing the information by random noise [where codes will be an “obvious” case, as will be key-lock fitting components in a Wicken wiring diagram functionally organised entity based on nodes, arcs and interfaces in a network], to see if we are on an “island of function.” If so, S = 1 (and if not, S = 0).

14 –> We then look at the number of bits used, B — more or less the number of basic yes/no questions needed to specify the configuration [or, to store the data], perhaps adjusted for coding symbol relative frequencies — and form a simple product, X:

X = C * S * B, in functionally specific bits . . . eqn n8.

15 –> This is of course a direct application of the per-aspect explanatory filter (cf. discussion of the rationale for the filter here, in the context of Dembski’s “dispensed with” remark), and the value in bits for a large file is the familiar number we commonly see, such as a Word Doc of 384 k bits. So, more or less, the X-metric is actually quite commonly used with the files we toss around all the time. That also means that, on billions of test examples, FSCI in functional bits beyond the 1,000-bit threshold of complexity is an empirically reliable sign of intelligent design.
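A minimal Python sketch of that brute-force X-metric (the C and S judgments are supplied by the investigator, as in the description above; the 384-kbit figure simply echoes the Word-document example):

```python
def x_metric(bits, contingent_complex, functionally_specific):
    """X = C * S * B (eqn n8): functionally specific bits, or 0 if either test fails."""
    C = 1 if contingent_complex else 0       # contingent and complex beyond 1,000 bits?
    S = 1 if functionally_specific else 0    # noise-sensitive island of function?
    return C * S * bits

# A ~384 kbit document judged contingent, complex and functionally specific:
print(x_metric(384_000, True, True))    # 384000 functionally specific bits

# A same-sized block of random noise: complex, but not specific, so X = 0
print(x_metric(384_000, True, False))   # 0
```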

______________

All of this adds up to a conclusion.

Namely, that there is excellent reason to see that:

i: CSI and FSCI are conceptually well defined (and are certainly not “meaningless”),

ii: trace to the work of leading OOL researchers in the 1970’s,

iii: have credible metrics developed on these concepts by inter alia Dembski and Durston, Chiu, Abel and Trevors, metrics that are based on very familiar mathematics for information and related fields, and

iv: are in fact — though this is hotly denied and fought tooth and nail — quite reliable indicators of intelligent cause where we can do a direct cross-check.

In short, the set of challenges raised by MG over the past several weeks has collapsed. END

Comments
I've just completed the first essay by Bruce Gordon in "The Nature of Nature" in which he puts forth eight reasons to prefer an intelligent design explanation over a neo-Darwinian gradualism explanation. In four of the eight (1, 3, 7, and 8) he appeals to complex specified information. Could MathGrrl have a valid point? But then, is anyone here really denying that ID theorists claim that CSI is a reliable indicator of design? The Gordon essay has an appendix in which he writes: "How does one distinguish the product of intelligence from the product of chance? One way is to give a rigorous mathematical characterization of design in terms of conformity of an event to a pattern of very small probability that is constructible on the basis of knowledge that is independent of the occurrence of the event itself." (Wow. Is it possible that's what Dembski means by a specification?) The mathematics involved look very similar to that which appears in the paper by Dembski which MathGrrl claims to have read and which she references in her OP. If someone who has the "Nature of Nature" book could compare the mathematics between the two I'd like to hear their comments.
Mung
April 19, 2011 at 04:51 PM PDT
Of particular interest to me is the recent claim by MathGrrl that:
Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Now apart from the fact that this is a vague and muddled statement (couldn't a random number generator just as well generate arbitrary amounts of Shannon information?) Schneider himself actually makes no such claim. So what is MathGrrl really trying to say? Well, trying to reconstruct from the context, MathGrrl in post #66 is responding to kairosfocus @44. MathGrrl:
I don’t see where you’re getting your 266 bit value
From the calculation done by PaV. kairosfocus @3
15: PAV has also pointed out that ev, the best of the breed per Schneider, peaks out at less than 300 bits of search, on a computer (which is far faster and more effective at searching than real-world generations on the ground would be); i.e. well within the thresholds that the various metrics acknowledge as reachable by chance-dominated search.
So MathGrrl appears to be arguing that ev can easily exceed the 500 bit threshold. It really seems to me that she's arguing over something that's not in dispute. Intelligently designed and manipulated (and there's no doubt that Schneider had to intelligently intervene to get his program to meet the new criteria) computer programs can do things that a random search cannot. Horse Race How much information did he have to inject in the new search for a search in order for the program to succeed?Mung
April 19, 2011 at 04:03 PM PDT
It's taking more than 24 hours for my posts to pass through moderation, so I've started reproducing them over at ARN starting HERE. I don't expect anyone to respond to posts there, so I'll try to be careful about what I say there. That said, I am providing links there to my posts here. Please check there occasionally to see if I've posted anything that hasn't shown up here yet.Mung
April 19, 2011 at 03:30 PM PDT
Talking point: The fact that [controlled, limited] random change + [artificial, algorithmic] selection [matched to a specified fitness metric on the space of possibilities in an island of function set up by designers of the relevant GA program] can produce complex designs [when run on equally intelligently designed and programmed computers]* doesn’t mean it’s the fastest or most efficient way of doing so . . . _________ * Has anyone actually OBSERVED a case of known chance plus blind necessity without intelligent guidance producing novel functionally specific complex genetic information beyond, say, 1,000 bits -- 500 bases -- of complexity? (Duplications of existing functional info don't count for reasons identified in 19 - 20 above.) More details in response to a request for explanation, here. +++++ Oops, I'se be a very bad boy . . . spoiling the rhetorical force of the objection by inserting the material parts that are usually omitted when it is made. "Well, I couldn't resist those hot oatmeal and raisin cookies, mama . . . " WHACK! GEM of TKIkairosfocus
April 19, 2011 at 02:25 PM PDT
Onlookers: Now that the unpleasant piece of housecleaning is out of the way, let us look on MG's playbook of objections, point by point: _________________ >> 12.1 Publish a mathematically rigorous definition of CSI 1 --> CSI is primarily a description of the real world pattern of complex functionally organised things that work based on some sort of Wicken wiring diagram. So this is wrong headed, as has been repeatedly highlighted. 2 --> Dembski and others have produced quantitative models of what CSI is about, and have in so doing made more or less good enough for govt work metrics. 3 --> The reduced form of the Chi metric is useful, whether in Torley's 500-bit threshold form or the 398 + K2 threshold form. So is the simple brute force X-metric. And, no number of drumbeat repetitions of dismissive hyperskeptical objections will change that. Utility will defeat cavails everytime. 4 --> The significance of the relevant thresholds for searches on random walk driven or dominated process, is plain. 5 --> if you reject these, either you believe in statistical miracles as commonplace [for which we find no cogent evidence], or you believe the cosmos was programmed to produce life and the sort of variety we see. Which boils down to you believe in a form of design model. 6 --> And of course point 11 in the OP now shows the application of the reduced Demsbki metric to the biological world, with success. 12.2 Provide real evidence for CSI claims 7 --> Willfully vague and hyperskeptical. If you mean that the only empirically known, observed source of CSI and especially FSCI is intelligence, look around you and ask where the web posts, library books, etc came from. Then ask if you have ever seen the like produced by chance plus blind mechanical necessity. 12.3 Apply CSI to identify human agency where it is currently not known 8 --> A strawman: the inference to design is not an inference to the designer. That twerdun, not whodunit. 12.4 Distinguish between chance and design in archaeoastronomy 9 --> the idea here is whether certain alignments of site layouts, stones etc in ancient sites were by accident or by design. 10 --> But in fact this test was long since met by archaeologists, e.g. when it was detected that the best explanation for certain features of Stonehenge were aligned with solstices, and would work for day counting etc. 11 --> The layout can be specified as a nodes, arcs and interfaces wireframe network, and reduced to a net list, The scale of the net list can be specified in turn and tested for whether it was beyond a threshold. 12 --> The inferred function can then be tested for variability to see if it fits in an island of function; for alignment of solstices and equinoxes or to specific prominent stars that is not hard to do and has been long done. Accuracy counts, and in particular accuracy against proper motion where that is relevant or against precession of the equinoxes. (This can even help date things.) 13 -->there will be marginal cases, where there is not a clearly identifiable and specific function, e.g any lines on the ground can probably be aligned with some star or other at some time of the year. 14 --> in these cases, as designed, the FSCI test will rule conservatively: chance contingency not choice contingency. A false NEGATIVE. 12.5 Apply CSI to archaeology 15 --> Done, as just seen. 12.6 Provide a more detailed account of CSI in biology 16 --> Notice how vague this is? 
DNA is rich in 4-state, digitally coded information beyond the 500 or 1,000 bit thresholds, many times over, as repeatedly documented. RNA templates off this and puts it to work. 17 --> Proteins are coded for and work on DNA through RNA. 18 --> The complex functional organisation of the cell can be reduced to node and arc diagrams, starting with say the network of cellular metabolic reactions and the regulatory networks. Nodes, arcs and interfaces can be reduced to netlists, and counted up in bits. We already know the answer: design, emphatically; on being very functionally specific and well past the 1,000 bit threshold. 19 --> In addition, we have the infinite monkey analysis to tell us that it is utterly implausible that something so complexly and specifically organised will be feasible by blind random walks and mechanical necessity on the gamut of the observable cosmos. 20 --> This is not rejected for want of empirical or analytical support, but for want of fit with the prevailing evolutionary materialistic agenda in science as exposed by Lewontin, Sagan, the US NAS, etc. Indeed, the cite from the paper at this point is all too inadvertently revealing of the Lewontin materialist a priori at work:
It is our expectation that application of the "explanatory filter" to a wide range of biological examples will, in fact, demonstrate that "design" will be invoked for all but a small fraction of phenomena [what is the evidence trying to tell you?], and that most biologists would find that many of these classifications are "false positive" attributions of "design."[In short a naked appeal to the evo mat consensus of the new magisterium]
12.7 Use CSI to classify the complexity of animal communication 21 --> Vague and overly broad. Animals in general do not communicate using abstract,functionally specific complex symbol sets or strings. 22 --> In the case that leaps to mind, bee dances, this is obviously genetically programmed, and it would trace to the FSCI in DNA, which is designed on the inference from FSCI. 23 --> It is astonishing how many times the challenge cases are tossed out by the authors, but he cases that have obvious answers that show that the FSCI metrics and the explanatory filter have reasonable answers even if you differ with the conclusions, are passed over in a telling silence. 24 --> If bird songs are symbolic and functional with complexity that can be discerned beyond the 1,00 bits then that would point to design as the source. The real issue would be where the design rests, e.g are the birds giving evidence of verbal communication, and same for the dolphins or whales. 25 --> Show the function and the complex specificity then we can look at what the design filter says about type of source. 26 --> If whales do have personal signatures that are evidently deliberately constructed on an individual basis then that is a sign that the whales are intelligent enough to do that. Which would be great news, and would compound our guilt over our wanton slaughter of these wonderful creatures. 12.8 Animal cognition 27 --> The pattern of a rat traversing a maze and learning the pattern shows some measure of deliberation on the part of the rat, i.e some measure of design. 28 --> Oddly while MG casts this up as a challenge, the authors give a grudging concession:
We note the use of examples in Dembski's work involving a laboratory rat traversing a maze as an indication of the applicability of CSI to animal cognition [16, 17, 19].
29 --> in other words, a success by the EF on FSCI!>> _________________ Plainly, the list of objections is by no means so formidable as it is made out to be. And, the design approach with tools such as FSCI, and IC, is promising; or in fact it is routinely used on an informal basis, we do qualitatively recognise functionally specific and complex organisation and associated information all the time and instinctively or intuitively infer from it to design, e.g the very posts on this thread. GEM of TKIkairosfocus
April 19, 2011 at 02:07 PM PDT
Onlookers: It is a pity that I have to start this by speaking to a serious challenge MG needs to address before she can sit to the table of reasonable discourse. Let's get it out of the way then move on to the substance of the burst of comments she put up today. Now, you know by now that MG has been simply recirculating objections and dismissals, even implying that the numbers I clipped -- by way of illustrative example -- from the excerpts she gave or that were otherwise accessible wee taken out of the air. Maybe she did not bother to read, maybe she is willfully accusing me of wrongdoing, but in any case she is out of contact with the observable facts. Someone who accuses others of being "dishonest" -- not merely in error but willfully deceitful -- has a stiff burden of proof to meet, which MG simply has not done. That is sad, and it is a further demonstration of just how completely her challenge over the past weeks has collapsed. Now too, I don't know if I have overlooked something, but somehow it looks like the case where I have used Durston metrics of FSC to give info estimated to fit into the reduced Chi-metric somehow has gone without comment by MG. If I have seen correctly, that may be the most telling thing of all, as the case is the one with indisputably bioloogical info reduced through the Demsbki metric. If I have not seen correctly could someone kindly draw that to my attention? GEM of TKIkairosfocus
April 19, 2011 at 01:15 PM PDT
Onlookers: Joseph made a good catch above in pointing out that "Shannon Info" is not funcitonal info or complex specified info more specifically. A flipped coin can easily generate unlimited amounts of Shannon info. But if your flipped coin starts counting up in binary or outputs a text say from Shakespeare and does so beyond 1,000 bits of info, you better sit up and take notice of that coin. GEM of TKIkairosfocus
April 19, 2011 at 01:06 PM PDT
Robb: Random walks do not create 500+ or 1,000+ bits worth of functionally specific, complex information. (Notice the limitations of the infinite monkeys random text generators as already discussed above. Notice, too the reduction of Dembski's Chi metric to a bits beyond a threshold measure. Once you can by some reasonable basis estimate information to create specific function in bits, and do the comparison.) GEM of TKIkairosfocus
April 19, 2011 at 01:03 PM PDT
Robb: Now, above you spoke much about the need to reckon with all possible chance hypotheses before one can evaluate an information metric. Not really. That may be analytically so in principle for a deductive type proof, but in fact in engineering work, information metrics look at the constraints on the information-bearing units in the string statistically. The simplest model is the equiprobable model, and then we move away from that by studying actual symbol patterns, e.g the famous E is far more common in English than say X, so it conveys a lot less "surprise" thence information when it shows up. When we look at the DNA chain or the AA amine-carboxylic acid backbone of a protein, we find there are few if any physical constraints on chaining, so it is a reasonable first rough model to use a flat random distribution; as PAV did. Beyond that, the solution is to do what Durston et al did, observe statistical frequencies of occurrence within the messages, which gives the probability patterns to feed into the pi log pi summation. Facts of real distributions trump speculative models of possible distributions, every time. In their case this was in the context of protein chains to give biofunction. Empirically quite adequate (and BTW, above I linked a similar example from Bradley that has sat in my notes for years). The actual sampling of a big enough cross section of a population can tell you a lot about the population, never mind perverse cases like the novel deliberately written in the 1930's without a single letter E. (How they got along without the, he, she etc and so forth escapes me. That sentence just past has eight e's in it.) Going beyond, what you are really debating is thresholds of reasonable searchability. We know that since search algor apart from random search has to be -- in our observation intelligently -- tuned to the environment to produce superior results on average relative to random walk [and since the non-foresighted, non-informed search for a tuned search is exponentially more difficult than the simple random search], random walk is a good enough on average metric.A detuned hill climber will in effect send you astray. Same message again. Unless, you are willing to argue that there are hidden cosmic laws that rig the thermodynamics of those warm little ponds to come up with the right clusters of pre life molecules and to organise them in the right networks to do metabolic work and to have associated von Neumann replicator facilities. That is tantamount to saying that the cosmos is rigged to produce life in environments in Galactic and circumstellar habitable zones of spiral galaxies. Which is a design view in all but acknowledged name. Now, let's look at what a 500 bit threshold (the evident upper tendency of the 398 bits + K2) does. The solar system has in it about 10^102 possible quantum states -- Planck time states, 10^20 times faster than strong force nuclear interactions -- in the time since formation. 398 bits corresponds to 10^120 states, which swamps the search resources, especially the CHEMICAL interaction search resources. (Fast, strongly ionic inorganic reactions may go in like 10^-15 s, but organic reactions are often millisecond to second or worse affairs. There is a reason why the ribosome is running at about 100 AA/s.) That is why 500 bits is a reasonable threshold for search complexity. Notice, we are here counting by comparison with atomic level interaction times. That is going to swamp any non-foresighted process. 
As you know, my choice of threshold in my X-metric and in the comparable Chi-1000 metric is 1,000 bits. I set that up by squaring the number of relevant states for the cosmos across its lifespan. The 10^80 atoms of the observed cosmos, running in Planck time steps, for the thermodynamic lifespan of the cosmos -- ~ 50 million times the timeline since the usual date for the big bang -- cannot sample 1 in 10^150th part of the configs for 1,000 bits. That is not a probability barrier, that is a search resources barrier. It does not matter what hypothesis -- apart from the miracle of life being written into the laws and initial circumstances of physics in a way presently hidden to us -- you pick, blind chance is going to be simply impotent to sample enough of such a pace to give a plausible expectation that islands of function in it will be found, given that such are fairly specific, in order to carry out code or algorithm or language related functions, or even specifying nodes, arcs and interfaces in a multiple component system. Brute force works every time. At the same time, 1,000 bits is 125 bytes. Or 143 ASCII characters. From the programmer's view, you are not going to put together a significant all-up control system in that span. Oh, you could store some sort of access-key password, or the like, but you have simply displaced the real processing work elsewhere. BTW, that is the common core problem with avida, ev, tierra etc, they are effectively doing searchable range access key passwords and calling them "genes" or some nice sales name like that. (But make those access keys long enough and make the relevant configs that work realistically rare in the field of possible configs and see what will happen; like say 125 bytes worth of looking, for passwords that are like 1 in 10^150 of the space in aggregate: infinite monkeys, on steroids. As I keep on saying, if you see a lottery that keeps getting won, it is designed to be winnable.) The heavy lifting is going on elsewhere and they are saying this pattern triggers option A or option B. That is why I keep pointing out that you cannot start within an island of function and do more than model microevo, which is not in dispute. Not even by modern young earth creationists. In the real world, any TRULY successful macro evo theory that starts with a warm little pond has to credibly get to cells with metabolism and code based self-replicating facilities, writing the codes along the way. Then, it has to get to tissue-and enzyme etc protein codes and regulatory networks to implement complex body plans, with 10 mn+ bases worth of DNA for each major plan, dozens of times over. On Earth (or Earth plus Mars)! That requires so many statistical miracles that the chance variation and natural selection model is a non starter. Full stop. Darwin's theory was a pre information age theory, and it has had to be force-fitted over the information technology findings that have come out over the past 60 or so years. The wheels are wobbling and coming off . . . GEM of TKIkairosfocus
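The search-resources comparison in the comment above can be checked with a few lines of arithmetic. This is a rough order-of-magnitude sketch only, using the approximate figures the comment itself cites (10^80 atoms, Planck-time steps, a thermodynamic lifespan of about 50 million times the time since the big bang); the physical constants are assumptions rounded for illustration.

```python
import math

atoms       = 1e80                      # atoms in the observed cosmos (figure used above)
planck_time = 5.39e-44                  # seconds, approximate Planck time
lifespan_s  = 50e6 * 13.7e9 * 3.156e7   # ~50 million times ~13.7 Gy, in seconds

states = atoms * (lifespan_s / planck_time)  # upper bound on Planck-time states available
configs_1000_bits = 2.0 ** 1000              # ~1.07e301 possible 1,000-bit configurations

print(f"states available ~ 10^{math.log10(states):.0f}")                      # ~ 10^149
print(f"fraction sampled ~ 10^{math.log10(states / configs_1000_bits):.0f}")  # ~ 10^-152
# i.e. well under 1 in 10^150 of the space, consistent with the claim in the comment
```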
April 19, 2011 at 12:30 PM PDT
Specification: The Pattern That Signifies Intelligence This is the paper referenced in MathGrrl's original OP. Look at the title. My gosh. I wonder if there's anything in it about specification as that term is understood and used by Wm. Dembski himself.Mung
April 19, 2011 at 12:10 PM PDT
MG @40
If, as seems increasingly likely, you can’t provide a rigorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please just say so and we can move on.
In case you haven't noticed, I'm not playing your game. MG @41
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it.
Schneider claims to have created it. Do you doubt him? MG @67
As noted in my guest thread, specification is one of the more vague aspects of CSI. Some ID proponents don’t seem to have a problem with the specification[s] I suggested...
I have a problem with them. As I wrote in @59 above:
Then I’d try to find out if MG even had a clue what Dembski means by a specification, or if she even read the paper that she was quoting from in her OP.
Unfortunately, that post is still in the moderation queue, so you probably haven't seen it yet. [Note to mods: Please take me out of the moderation queue. It makes it near impossible to carry on a conversation with such long delays. Thank you.]
Why do you think that “Produces at least X amount of protein Y” is not a specification in Dembski’s sense? Please reference his published descriptions of CSI that support your view.
Please reference the Dembski paper you referenced in your original OP. You did read it, didn't you?Mung
April 19, 2011 at 11:52 AM PDT
Onlookers: You will note that I explicitly previously cited actual output numbers given by the claimed evolutionary algorithms, to illustrate how the Chi metric turns information numbers into bits beyond a threshold value and answers the question of whether it is reasonable that the output could have come from chance, if it is functionally specific or the like. But that is not the real elephant in the middle of the room. At this time MG is refusing to observe the underlying and inescapable problem with the "arbitrary" quantities of information "produced" by her favourite programs. As I pointed out -- now, above, and previously and as others have pointed out al the way back to the NFL ch 4 -- the core problem can be seen with the example of using the Mandelbrot set used as a fitness function. Namely, the random walk does not create functional information de novo, it only samples the output of a built in function that defines fitness, and it progresses on the instructions of a hill-climbing algorithm. As is a commonplace, a processor takes in an input, processes it and transforms based on its instructions, to give an output. Where do the actual sources of information -- inputs, algorithms, fitness functions and the like -- come from? INTELLIGENCE. Worse, in fact, PAV is quite right. When such a program searches in random walk space, it has strict limits on what it can explore absent being already confined to an island of function. If we have a space of 1,000 bits worth of possibilities, the search capacity of the cosmos is fruitlessly exhausted without sampling even 1 in 10^150th of the config space. So, the performance and the actual capacity to generate info by real random walks is going to be appreciably less than 500 bits. Absent the sort of helpers I highlighted in 44 ff, these program would be facing exactly the limits given by the random text genrators. Spaces of order 10^50 or so possibilities are searchable for islands of function within present computing resources. those within 10^500 possibilites are searchable within the resources of the whole cosmos. But those of 1,000 and beyond are beyond reach. So, if someone presents a program that starts in an island of function and does a hill-climbing tour, you know where the information really came from: active information injected (without realising that) by the programmers. GEM of TKIkairosfocus
April 19, 2011 at 11:35 AM PDT
MG @66
Again, I don’t see where you’re getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
MG, you have a bad habit of citing things without having read them. I just loved this bit:
We've beaten the so-called "Universal Probability Bound" in an afternoon using natural selection! (emphasis mine)
Mung
April 19, 2011 at 11:32 AM PDT
Loved that first Mandelbrot Set youtube vid, lol. Also, if youtube is to be believed, tornadoes actually avoid junkyards.Mung
April 19, 2011 at 11:18 AM PDT
Mathgrrl: Thank you for your post. As for specifics about the calculation of CSI for the test cases discussed by Elsberry, I have already written a post outlining the methodology by which his questions could be resolved: https://uncommondescent.com/intelligent-design/a-test-case-for-csi/ All we really need are two things: (i) hard data relating to the probability distributions of various patterns in nature, and (ii) a detailed inventory of the mode of origin of each of the various patterns that are observed to arise in the natural world. (If we can't do this for large and/or complex patterns, we can at least do it for small and/or simple ones.) Given these, Elsberry's questions become tractable: even large, complex patterns can be decomposed into smaller parts, on which we can perform the requisite mathematical calculations. Compiling the relevant data as well as the inventory of origins is a formidable task that will take some decades, however, even with millions of volunteers co-operating. Your put-downs of Professor Dembski's CSI metric relate to just one factor in his equation: the number of other patterns that are as easily describable as the pattern we are investigating. In any case, as I pointed out, this is the least significant factor in the equation for CSI. Replacement of this factor by 10^30 renders the calculation both objective and (given the requisite data and probability distribution - see above) computable, and yields a figure very close to Professor Dembski's CSI. You have yet to respond to my challenge regarding the four scenarios you describe:
Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you.
I'm still waiting.vjtorley
April 19, 2011 at 09:37 AM PDT
MG: I am sorry to have to be direct, but your further action in the teeth of well-merited correction now demands such:
your endlessly repeated selectively hyperskeptical talking points and supercilious dismissals are now both plainly delusional and willfuly slanderously dismissive.
You have called people, for no good reason, dishonest - and please don't hide behind subterfuges such as no, I have only asked questions. To raise such suggestions is itself an accusation. One that any fair minded and informed reader will at once see is groundless. As a matter of fact, adequate definitions have been provided, in great details and for many weeks now. In addition, responses to your challenges have been given and the reduced form of the Demski metric, with the Durston case, has provided us with a handy list that shows how the approach can indeed be applied fairly simply to biological systems. (In this context, Robb's recycled objections on possibilities for chance hyps are irrelevant: we know the patterns and the ranges from the test of life itself in the range of protein family variations.) Please, MG, change your ways. GEM of TKIkairosfocus
April 19, 2011 at 07:32 AM PDT
MathGrrl:
As noted above, Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Specified information is a specified subset of Shannon information -> i.e. it is Shannon information with meaning/functionality.
Joseph
April 19, 2011 at 07:31 AM PDT
vjtorley,
I’d like to clarify. In my original post on the CSI scanner, I argued that Dembski’s CSI was calculable, but not computable.
Yes, I read that. While I appreciate your efforts and look forward to further discussion with you on that topic, in this context yours is a distinction without a difference. You seem to recognize that CSI as described by Dembski cannot be used to calculate an objective, numerical metric. Claims by ID proponents that rely on such a metric are therefore clearly unsupported.
In a subsequent post, I then provided you with a simplified version of CSI, which I christened CSI-lite. CSI-lite is both calculable and computable.
That's fine, but it's not CSI as described by Dembski. Dembski's metric is the one generally accepted by ID proponents. If you can demonstrate that your CSI-lite is an unambiguous indicator of the involvement of intelligent agency, I'll be happy to spend some time testing those claims. Wesley Elsberry and Jeffrey Shallit have documented several excellent tests for your metric:
12.1 Publish a mathematically rigorous definition of CSI
12.2 Provide real evidence for CSI claims
12.3 Apply CSI to identify human agency where it is currently not known
12.4 Distinguish between chance and design in archaeoastronomy
12.5 Apply CSI to archaeology
12.6 Provide a more detailed account of CSI in biology
12.7 Use CSI to classify the complexity of animal communication
12.8 Animal cognition
MathGrrl
April 19, 2011 at 06:53 AM PDT
PaV,
when the ev program produces less than 96 bits of actual information
As noted above, Schneider shows how to generate arbitrary amounts of Shannon information via ev. Does this constitute CSI? If not, why not?MathGrrl
April 19, 2011 at 06:52 AM PDT
PaV,
Because something is difficult to demonstrate, doesn’t mean it doesn’t exist.
While you are correct in general, in the case of quantitative metrics, as Dembski and other ID proponents claim CSI to be, lack of a rigorous mathematical definition does, in fact, mean that it doesn't exist.
I’ve already walked you through an example of how CSI works. Do you dispute that I had done that?
Yes. I have yet to see anyone provide a rigorous mathematical definition of CSI that is consistent with Dembski's published descriptions, nor have I seen anyone demonstrate how to calculate CSI objectively according to such a definition.
Instead you continue with your DEMAND that these four scenarios be analyzed…….or else!!!
I had no idea that my words had such force. Here I was thinking I was just asking for an explanation from the people who claim to understand the concept. I'll type more softly when replying to you in the future.
So, how about your first scenario: “A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.” First, why do you think “Produces at least X amount of protein Y” is a “specification”. CSI deals with events. So, please tell us, what is the event.
As noted in my guest thread, specification is one of the more vague aspects of CSI. Some ID proponents don't seem to have a problem with the specification I suggested (see a couple of the comments above in this thread, for example). Others, like you, seem to have a different concept. Why do you think that "Produces at least X amount of protein Y" is not a specification in Dembski's sense? Please reference his published descriptions of CSI that support your view.MathGrrl
April 19, 2011 at 06:52 AM PDT
kairosfocus,
1: the fact that no ID proponent can calculate CSI for my scenarios
You plainly did not look at the posts at 19 above.
I did. You tossed out a few numbers but certainly didn't provide a rigorous mathematical definition of CSI that is compatible with Dembski's published descriptions.
Scenario 1, the doubling of a DNA string produced no additional FSCI, but the act of duplication implies a degree of complexity that might have to be hunted down, but would be most likely well beyond 500 bits or 73 bytes of info.
I'm not entirely sure what you mean by that last sentence, but Dembski clearly states that CSI should be able to identify the features of intelligent agency in an object "even if nothing is known about how they arose". Do you disagree with that? If you're basing your calculation solely on the length of the genome, gene duplications easily exceed your 500 bit limit.
Scenarios 2 – 4 were computer sims, and as PAV long since noted Ev was within 300 bits, far below the significance threshold. 266 bits – 500 bits = – 234 bits lacking.
Again, I don't see where you're getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Scenario 3, top of no 20, point 14, on the cited bit number I found, and corrected: Chi_tierra = 22 – 500 = – 478
More numbers pulled out of thin air, with no CSI calculation to be seen. You seem to be assuming that the 22 instruction parasite appears de novo. In fact it never appears earlier than a few thousand generations into a run. One reason I included this scenario is to understand how CSI calculations take known evolutionary mechanisms into consideration. As near as I can tell from your vague description, your version of CSI doesn't consider them at all.
Scenario 4, Steiner has numbers up to 36 bits I find just now: Chi_steiner = 36 – 500 = – 464
And you finish with yet more arbitrary numbers and no rigorous mathematical definition of CSI. It is very easy to specify Steiner problems that require more than 500 bit genomes to compute. The bottom line is that you haven't provided a rigorous mathematical definition of CSI. Your "calculations" are completely arbitrary and seem to consist of little more than raising two to the power of whatever number you arbitrarily select for each of my scenarios. That's neither convincing nor descriptive enough to show how to calculate CSI objectively.
MathGrrl
April 19, 2011 at 06:51 AM PDT
kairosfocus, I'm not a math educator, but thanks for the compliment. Again, Dembski defines specified complexity in terms of all relevant chance hypotheses. Can you tell us how to determine what chance hypotheses are relevant? Do you understand that if we consider only the chance hypothesis of random noise, we get a plethora of false positives? Would you like some examples?R0bb
April 19, 2011 at 06:33 AM PDT
MG: Please be reminded of the corrections to your assertions of yesterday, starting from 44 above. They include cases of unwarranted and demonstrably false accusations against the characters of others on your part, so please pay close attention. Attention is also particularly called to 11 above, which integrated Durston with Dembski and provides three examples from the 35 that are directly possible, of Chi metrics of biological, real life systems. GEM of TKIkairosfocus
April 19, 2011 at 05:33 AM PDT
PPPS: Also, on OOL. In that context, Robb, you are looking at getting from some warm little pond to first life, a metabolising entity with an in-built von Neumann self-replicating facility. The evidence from observed life -- the only relevant actual observations in hand -- is that we are looking at about 100+ k bits worth of info in the blueprint storage section or tape of the vNSR. This is vastly beyond 500 - 1,000 bits. And, there is no observational evidence to support a hypothesis of a nested staircase of simpler components forming an OBSERVED stepwise sequence to that first living cell. Not even a staircase with some treads missing: it is all but wholly missing, apart from some speculative just-so stories and some autocatalysis under irrelevant chemical circumstances. I submit that on the evidence in hand the best explanation for OOL, just as for the origin of a fine-tuned cosmos that is supportive of C-chemistry cell based life, is design. And if design is already seriously on the table for these two cases, there is no reason not to entertain it for the emergence of body plans, including our own.
kairosfocus
April 19, 2011 at 03:02 AM PDT
PPS: On the issue of non-uniform probability distributions. Robb, first and foremost, the reduction of the Chi-metric to a threshold renders the old-fashioned "how do you set up probability distributions" objections moot. If you accept that info theory is an existing, scientific, established discipline, then you will know that information is as a rule measured on a standard metric -- and yes, this is per the short discussion of info theory 101 in my always linked briefing -- tracing to Hartley and Shannon:

I_i = – log2(p_i)

p(T|H) is a relevant form of the probability in question, as the probabilities are in the context of an assumed distribution of the likelihood of given symbols in a given code. (In the OOL context, Bradley builds on Yockey as clipped here in my always linked APP 1, point 9; this has been a couple of clicks away for years, from every post I have made as a comment at UD, Robb.) If you look at no 11 above, you will see that Durston et al give one very relevant practical way to define the chance hyp alternative: by studying sequence distributions of observed protein families, AKA sampling the population.

In general, you should be aware that the definition of the probability distribution of symbols is an integral part of standard info theory; that is why Shannon defined a metric of average information per symbol, based on a weighted sum:

H = – [SUM over i] p_i * log2(p_i)

That H-metric is exactly what Durston et al build on, and it then allows us to use the reduced, threshold version of the Dembski Chi metric to deduce, as was posted above in comment 11 and now appears in the OP as revised:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond.
Remember, by accepting p as given in info theory, we may then proceed to reduce Dembski's Chi metric:
From: Chi = – log2(10^120 * phi_S(T) * p(T|H))
To: Chi_500 = Ip – 500, in bits beyond a threshold of complexity
In this form, what is going on is far plainer to see, i.e. we are measuring in bits beyond a threshold. Debate how you set that threshold all you want; there is no good reason to think that 500 – 1,000 bits will not be adequate for all practical purposes. Do you care to suggest that we can easily and routinely observe random walks finding islands of function in spaces of at least 10^150 or 10^301 possible configs? Also, given how the Durston case fits in, we can see that the objections raised above are moot -- they say in effect that you are disputing standard info theory. Fine, that's how science progresses; your task now is: provide an alternative metric for I to the ubiquitous bit, and persuade the world that the bit is inadequate as a metric of information because you can raise exotic probability distribution objections. And, your case is . . . ?
kairosfocus
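As a concrete companion to the reduction just described, here is a minimal Python sketch under the stated assumptions: information is taken on the negative-log (bits) metric, the threshold is 500 bits, and the fits values are the ones quoted from Durston et al above (they are inputs to the sketch, not recomputed here); the function and variable names are illustrative only.

```python
import math

# Minimal sketch of the reduction described above, under the stated assumptions:
# information is measured on the negative-log metric, I = -log2(p), and the
# reduced metric is Chi_500 = Ip - 500 (bits beyond a 500-bit threshold).
# The functional-bits ("fits") values below are the ones quoted in the comment
# from Durston et al.; they are inputs here, not recomputed.

def self_information(p: float) -> float:
    """Shannon-Hartley self-information of an outcome with probability p, in bits."""
    return -math.log2(p)

def avg_info_per_symbol(probabilities) -> float:
    """Shannon's H = -sum(p_i * log2(p_i)), average information per symbol, in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def chi_500(ip_bits: float, threshold: float = 500.0) -> float:
    """Reduced Chi metric: bits beyond the threshold of complexity."""
    return ip_bits - threshold

durston_fits = {"RecA": 832, "SecY": 688, "Corona S2": 1285}
for name, fits in durston_fits.items():
    print(f"{name}: Chi_500 = {fits} - 500 = {chi_500(fits):+.0f} bits beyond")

# The per-symbol metrics on a uniform 4-symbol (DNA-like) alphabet:
print(self_information(0.25))            # 2.0 bits per equiprobable base
print(avg_info_per_symbol([0.25] * 4))   # 2.0 bits average per symbol
```

Running it reproduces the three "bits beyond" figures quoted above (332, 188 and 785); the arithmetic is nothing more than the subtraction shown.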
April 19, 2011 at 02:42 AM PDT
Robb: Pardon, but you seem to be falling into the same question-begging trap that MG has. (I was going to reply to her challenge here, but saw that you provided a live, fresh case in point.)

The basic problem with programs like Ev etc, as was -- AGAIN -- pointed out in 19 - 20 above and -- AGAIN -- brushed aside, is that they start from within defined islands of function, where the very act of setting up the working program feeds in oodles of functional specificity in the form of matching the search to the space and the so-called fitness or objective function metrics involved. That is why the better approach is the sort of program that we see in the infinite monkeys type tests, which show us that we can reasonably search spaces of about 170 bits worth of possibilities, but there is a clear challenge to search a space of 500 - 1,000 or more bits worth of possibilities. Notice the cited challenges of pre-loaded active info in ev as cited in 19 - 20.

All of this rests on the NFL results that lead to the conclusion that the non-foresighted search for a search is exponentially harder than the random walk driven direct search for zones of interest. It is intelligently inputted, purposeful information that matches search to config space and objective function, that feeds in warmer/colder metrics that allow hill climbing within islands of function, and more. In short, there is a serious problem of question-begging in the assumption that intelligently designed, complex algorithms with loads of intelligently input information involved in their function are creating new information, rather than simply transforming -- usefully -- already existing information added by their creators.

For instance, there is nothing in the wonderful output of a Mandelbrot set program that is not already pre-loaded in the inputted algorithm and start-up information, as well as of course the routines that set up the lovely and admittedly quite complex displays. To see what I am driving at, consider a Mandelbrot set program that selects points to draw out and display based on a random walk across the domain of the pre-built functions. As background, Wiki sums up:
The Mandelbrot set is a particular mathematical set of points, whose boundary generates a distinctive and easily recognisable two-dimensional fractal shape . . . . More technically, the Mandelbrot set is the set of values of c in the complex plane for which the orbit of 0 under iteration of the complex quadratic polynomial z_(n+1) = z_n^2 + c remains bounded.[1] That is, a complex number, c, is part of the Mandelbrot set if, when starting with z_0 = 0 and applying the iteration repeatedly, the absolute value of z_n never exceeds a certain number (that number depends on c) however large n gets. For example, letting c = 1 gives the sequence 0, 1, 2, 5, 26,…, which tends to infinity. As this sequence is unbounded, 1 is not an element of the Mandelbrot set. On the other hand, c = i (where i is defined as i^2 = −1) gives the sequence 0, i, (−1 + i), −i, (−1 + i), −i, ..., which is bounded and so i belongs to the Mandelbrot set. Images of the Mandelbrot set display an elaborate boundary that reveals progressively ever-finer recursive detail at increasing magnifications. The "style" of this repeating detail depends on the region of the set being examined. The set's boundary also incorporates smaller versions of the main shape, so the fractal property of self-similarity applies to the whole set, and not just to its parts.
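To make the quoted definition concrete, here is a minimal Python sketch of the bounded-orbit (escape-time) test it describes; the iteration cap and the escape radius of 2 are conventional implementation choices and are not part of the quoted text.

```python
def mandelbrot_escape_count(c: complex, max_iter: int = 100) -> int:
    """Iterate z -> z**2 + c from z = 0 and return how many iterations it takes
    |z| to exceed 2 (a standard escape radius), or max_iter if the orbit stays
    bounded for the whole run (treated here as membership in the set)."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter

# The two examples from the quoted passage:
print(mandelbrot_escape_count(1 + 0j))   # escapes almost immediately: 1 is not in the set
print(mandelbrot_escape_count(0 + 1j))   # stays bounded: i belongs to the set
```

The returned iteration count is exactly the "how many iterations it takes for a point to run away" figure used for the colour bands described next.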
The zone of the boundary is often rendered in colours depending on how many iterations it takes for a point to run away, and we can often see a lovely pattern as a result. (Video: here, a second vid with explanation is here -- this last illustrates the sampling approach I will discuss below . . . ) But the colour pattern is wholly determined by the mathematics of the set and the algorithm that colours points depending on how they behave. So, now, let us consider a random walk sample of the points in the complex plane, in the context of the M-brot set serving as in effect the stand-in for a traditional fitness function:
1: At the first, the output of such a program would seem to be a random pattern of colours, scattered all over the place, but

2: After a long enough time of running, that random walk based M-brot set display will look much like that of the traditional versions [cf here [overview] and here [border zone showing beautifully rich and complex details of the seahorse valley between the "head" and the "body" of what I like to call the M-brot bug . . . yes, I am a M-brot set enthusiast]],

3: The only difference of consequence being that the test points were picked at random across time, on a "dart-throw" sampling basis, or its equivalent, a random walk across the complex plane.

4: Sampling theory tells us that if a field of possibilities is sampled at random, a representative picture of the overall population will gradually be built up.

5: Compare the two programs, the traditional and the random walk versions. Is there any material difference introduced by the random walk? Patently, no. (A minimal code sketch of this sampling comparison appears after point 19 below.)

6: Now, introduce a bit of hill climbing -- the colour bands in an M-brot program are usually based on the number of cycles until something begins to run away -- and let the random walk wander in towards the traditional black part, the zone of fixed solutions.

7: That the walking population now migrates from the far field towards the black peak zones is a built in design.

8: Has that wandering in and picking up values where there are nice solutions to the problem ADDED fresh information that was not implicit in the built in program?

9: Not at all.

10: Just so, when you build a hill-climbing algorithm that starts within an island of function and wanders in towards peak zones by hill climbing, the results may look surprising, but the results are simply making explicit, by transformation, what was built in as functional capacity; they are not creating previously non-existing information out of thin air. (And if you make the fitness landscape dynamic, that makes no material difference, apart from making the wandering permanent. Indeed, that points to a reason for a designer to build in evolvability: the need to keep adapted successfully to a varying fitness landscape in an island of function.)

11: In short, we have serious reason -- I here exploit the increasingly acknowledged link between energy, entropy and information in thermodynamics -- to believe that an informational "free lunch machine" equivalent of a classic perpetual motion machine is no more credible than the latter.

12: That is not to say we can simply close our minds to the possibility of such; just, those who propose a free lunch information machine need to show that the machine is not simply transforming in-built information.

13: The soundest way to do that is to do what the random text generator programs above have done, with a sufficient threshold of real function that we are indeed modelling macro evolution of body plans, not extrapolating from a model of microevo, which is not warranted on search space grounds.

14: However, the dismissive sniffing we have seen on "tornado in a junkyard [builds a jumbo jet]" strongly suggests that the threshold of complexity problem is real and resented, so it is dismissed rather than cogently addressed.
15: So, let me note: the threshold of challenge does not begin from Sir Fred Hoyle's multi-megabit issue of a tornado spontaneously assembling a flyable jumbo jet; it starts with maybe trying to build an instrument panel gauge by that same tornado method; or even,

16: trying to see if a tornado passing through the nuts and bolts aisle of a hardware store would spontaneously match the right nut to the right bolt and screw it in to hold a shelf together.

17: I beg to submit that, as a matter of routine common sense, if one sees the right sized nut and bolt in the proper holes, lined up and screwed down to the right torque, one infers to design, not a tornado passing through the nuts and bolts aisle of your local hardware shop.

18: What the reduced Dembski metric and related metrics are doing is giving a rationale for that inference, on the information behind the required organisation and the threshold of complexity that makes it reasonable to infer to design.

19: Let us again remind ourselves:

From: Chi = – log2(10^120 * phi_S(T) * p(T|H))
To: Chi_500 = Ip – 500, in bits beyond a threshold of complexity
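Picking up the forward reference at point 5: here is a minimal, hypothetical Python sketch of the random-walk ("dart-throw") sampling described in points 1 – 5. The window bounds, step size and sample count are illustrative choices, not anything specified in the comment; the only point it demonstrates is that the classification of each visited point is fixed by the same built-in iteration, so the sampling order adds nothing to the picture that accumulates.

```python
import random

def mandelbrot_escape_count(c: complex, max_iter: int = 100) -> int:
    """Same bounded-orbit test as the sketch after the Wikipedia quote above."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter

def random_walk_sample(n_steps: int = 10_000, step: float = 0.05, seed: int = 0):
    """Random-walk ('dart-throw') sampling of the usual Mandelbrot view window.
    Each visited cell is classified by the built-in iteration; the walk only
    changes the order of visits, not what the accumulated picture converges to."""
    random.seed(seed)
    x, y = -0.5, 0.0                       # start near the middle of the window
    samples = {}
    for _ in range(n_steps):
        x = max(-2.0, min(1.0, x + random.uniform(-step, step)))
        y = max(-1.5, min(1.5, y + random.uniform(-step, step)))
        samples[(round(x, 2), round(y, 2))] = mandelbrot_escape_count(complex(x, y))
    return samples

counts = random_walk_sample()
in_set = sum(1 for v in counts.values() if v == 100)
print(f"{len(counts)} cells sampled, {in_set} never escaped (treated as in the set)")
```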
This little thought exercise therefore means that so-called evolutionary algorithms can so far only credibly model MICRO-evolution, not body plan origination macroevolution. Micro-evo as described is not in dispute, not even by modern young earth creationists. So, to avoid suspicions of bait and switch tactics, it is incumbent on developers and promoters of evolutionary algorithms that they address the problem of first needing to get to the shores of islands of function before they proceed to hill-climb on such islands.

Such considerations also strongly suggest something that is emerging as it begins to be observed that mutations are often not simply random but follow patterns: micro-level evolvability and adaptability -- up to probably about the level of the genus to the family -- are built into the design of living systems, probably to confer robustness and the ability to fit with niches. [Cf discussion that starts here, esp the videos on the whale and on cichlids. Also, the one on thought provokers on what mutations are.]

Where the serious barrier to evolutionary mechanisms comes in is the origin of major body plans, and that includes the very first one. So far, I have not seen anything that suggests to me that a solution to the macro-level problem is feasible on the blind watchmaker type approach. Extrapolations from what is equivalent to microevo to macroevo do not help matters. GEM of TKI

PS: Robb, you are a math educator. I think you need to address the reduction of the Dembski metric shown in the OP above and again commented on just above.
kairosfocus
April 19, 2011 at 02:20 AM PDT
PaV:
This is stupidity of the highest order. Look, sweetheart: when the ev program produces less than 96 bits of actual information (that's right, we're dealing with 16 sites each six bases long, and perfect positioning doesn't take place), then per Dembski's definition this doesn't rise to the level of CSI. To then go on and determine the actual "chance hypothesis" serves no usefulness whatsoever. It would be an exercise in masochism, and no more.
It depends on what information you're measuring. Schneider measures the information in the locations of the binding sites, as do Dembski & Marks et al (although they measure it differently than Schneider, coming up with 90 bits as opposed to Schneider's 64 bits). You're measuring the information in the particular bases in the binding sites, not the locations. Regardless of what you're measuring, all you need to do is run ev multiple times to generate 500+ bits, so the amount generated by a single run isn't relevant.
Then, knowing that DNA nucleotide bases are equiprobable given known chemistry/quantum numbers, any nucleotide base has a 0.25 chance of being at any position along the DNA strand. For a sequence of X bases such that 2X = –log_2(10^–150) = log_2(10^150), CSI would be present. Q.E.D.
At best, you've eliminated a hypothesis consisting of random mutation and nothing else. Has anyone proposed such a hypothesis?

According to Dembski's definition, specified complexity is multivalent, with one value for each relevant chance hypothesis. And for Dembski, all material processes, stochastic or deterministic, are chance hypotheses. So if you come up with a single number when you're measuring the CSI of something, you need to explain why there's only one relevant chance hypothesis. Can you point me to a definition of CSI that explains how we determine which chance hypotheses are relevant? In order to define CSI as a property or measure of real-world things and events, it's not enough to formulate it as a function of H, e.g. –log2(10^120*φ_S(T)*P(T|H)). You also have to define H.

Dembski's repeated warning that specified complexity needs to be based on all relevant chance hypotheses is for good reason. Nature is full of phenomena that are improbable under a random noise hypothesis, but very probable under the laws of nature. And many can be described simply, and are therefore specified. So restricting CSI calculations to a random noise hypothesis produces a plethora of false positives. And yet all attempted CSI calculations that I've seen have been based solely on the random noise hypothesis, including all of the attempts I've seen on this board. And that goes for Dembski's attempts, as well as Durston's FSC calculations (he assumes that ground state = null state), and Marks & Dembski's CoI (they assume that the search-for-a-search is blind), and even Sewell's SLoT math. Random noise is the ubiquitous model that pervades ID.

So, under the rigorous definition of CSI, how is H defined? How do I determine what chance hypotheses are relevant?
R0bb
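To illustrate the multivalence point numerically, here is a minimal, hypothetical Python sketch: the formula quoted above, –log2(10^120*φ_S(T)*P(T|H)), returns a different value for each chance hypothesis H supplied, so a single CSI number presupposes a particular choice of H. Every probability and the φ_S(T) value below are made-up placeholders for illustration, not measurements of any real system.

```python
import math

# Hypothetical illustration of the multivalence point above: the same target T
# gets a different specified-complexity value under each chance hypothesis H.
# The P(T|H) values below are made-up placeholders, not measurements; phi_S(T)
# is likewise set to an arbitrary placeholder specification count.

PHI_S_T = 1e5          # placeholder value for phi_S(T)
FACTOR_10_120 = 1e120  # the 10^120 factor appearing in the quoted formula

def specified_complexity(p_t_given_h: float) -> float:
    """Chi = -log2(10^120 * phi_S(T) * P(T|H)), per the formula quoted above."""
    return -math.log2(FACTOR_10_120 * PHI_S_T * p_t_given_h)

chance_hypotheses = {
    "uniform random noise": 1e-180,    # placeholder probability under pure noise
    "noise + selection model": 1e-40,  # placeholder under a richer mechanism
}
for name, p in chance_hypotheses.items():
    print(f"Chi under '{name}': {specified_complexity(p):.1f} bits")
```

With these placeholder inputs the first hypothesis yields a positive value and the second a negative one, which is the sense in which the measure is one-value-per-hypothesis rather than a single number.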
April 18, 2011 at 11:51 PM PDT
VJT @52
Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you.
Heck, I'd be happy to see them one at a time. Then I'd try to find out if MG even had a clue what Dembski means by a specification, or if she even read the paper that she was quoting from in her OP.
I have no intention of undertaking a course in evolutionary algorithms in order to answer your question; I’m afraid I simply don’t have the time.
I've ordered a couple books which I hope will show me how to code some in my favorite language.
Mung
April 18, 2011 at 04:37 PM PDT
MG Talking point: I want to learn enough about CSI to be able to test whether or not evolutionary mechanisms are capable of generating it. Thus far it is not sufficiently well defined for me to do so. Based on some ID proponents' personal definitions of CSI, it appears that evolutionary mechanisms can generate it, but those aren't the same as Dembski's CSI.

Nicely vague of course, so trying to nail down the "personal definitions" will be like trying to nail down a shingle on a fog bank. Indeed, the plain -- and very post modernist -- intent is that there is no coherent, objectively real, observable pattern being described and no resulting objectively correct definition of what CSI is [and by extension its functional subset FSCI], so such is to be taken as whatever one picks to make of it. Just as, these days, marriage itself is being taken as a wax nose to be bent how one wills, on grounds that the opposite sexes do not form a natural, objective complementarity. (In short, this perspective is yet another manifestation of the radical, amoral relativism rooted in evolutionary materialism that Plato warned against in the Laws Bk X, 2,350 years ago.) Makes a pretty handy strawman. Probably -- sadly -- the underlying intent of the whole rhetorical exercise.

In corrective steps:

1 --> CSI was not defined by Dembski. Yes, not. It is fundamentally a description of an observable characteristic of many things, first in the world of technology, then also in the world of cell based life. For instance, compare the "Wicken wiring diagrams" of a petroleum plant and the biochemical reaction pathways of the living cell, in Fig. I.2 here, also the layout of a computer motherboard here, or the regulatory networks of DNA activation and control in Figs. G.8 (a) - (d) here.

2 --> The objectively controlled description responding to real-world phenomena and objects is in the very words themselves: jointly complex and specified information [and related organisation], an aspect of objects, phenomena and processes that exhibit the cognate: specified complexity.

3 --> Thus we come to Orgel, describing the way (a) the strikingly complex organisation of life forms differs from (b) randomness AND from (c) simply ordered entities, in the explicit context of the origin of life [note the title of the work]; thus, the context of unicellular organisms:
. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [[The Origins of Life (John Wiley, 1973), p. 189.]
4 --> Thus, on contrasted concrete exemplars, we may properly and OBJECTIVELY observe and distinguish simple order [not complex], specified complexity [= complex organisation], and randomness [complex but not correlated with a principle of organisation or order].

5 --> Noticing such objective, observable material differences and expressing them in words is a first step to understanding and modelling; indeed, it has been aptly said that the agenda of science is to describe, explain, predict and control (or, at least, influence). MG's evasiveness when she has been pressed on whether or not Orgel is meaningful in the above quote is therefore sadly revealing of an underlying inadvertent anti-scientific spirit. Utterly telling.

6 --> As of this point, though, we have an ostensive definition clarified by pointing out examples and counter examples, and giving rise to a trichotomy of complexity: order, organisation, randomness.

7 --> This will be picked up, not only by Dembski et al, but by Trevors, Abel and co, who define and distinguish orderly, functional [function is one principle of organisation . . . ] and random sequence -- string: s-t-r-i-n-g -- complexity.

8 --> Since complex networked structures can be reduced to network lists of related and structured strings, per nodes, arcs and interfaces, this focus on strings is without loss of generality.

9 --> The issue of function brings to bear the closely related remark of Wicken:
‘Organized’ systems are to be carefully distinguished from ‘ordered’ systems. Neither kind of system is ‘random,’ but whereas ordered systems are generated according to simple algorithms [[i.e. “simple” force laws acting on objects starting from arbitrary and commonplace initial conditions] and therefore lack complexity, organized systems must be assembled element by element according to an [[originally . . . ] external ‘wiring diagram’ with a high information content . . . Organization, then, is functional complexity and carries information. It is non-random by design or by selection [Wicken plainly hoped natural selection would be adequate . . . ], rather than by the a priori necessity of crystallographic ‘order.’ [“The Generation of Complexity in Evolution: A Thermodynamic and Information-Theoretical Discussion,” Journal of Theoretical Biology, 77 (April 1979): p. 353, of pp. 349-65.]
10 --> Observe: in discussing the issue of CSI, MG NEVER responsibly or cogently addresses these key conceptual discussions; she only tries to drive a dismissive rhetorical wedge between Orgel-Wicken and design thinkers. (Observe, also, how she tries to drive a similar rhetorical magic wedge between Abel, Trevors, Chiu and Durston and Dembski et al. FYI, MG, Joseph is right: Dembski's quantification of CSI, as well as that of Durston et al, is in a context of using the classic Shannon-Hartley negative log probability metric for information [as an index of complexity and improbability of access by a random walk driven search algorithm or natural process], integrating into improbability/surprise in that sense specific functionality and/or meaningfulness. In addition, Durston et al use an extension of Shannon's average information per symbol metric, H, to assess the jump in degree of function as one moves from ground state to functional state, this last being an island or zone of function in a wider space of possible configurations, the overwhelming majority of which are non-functional. That is why the Durston metric can easily be incorporated into the reduced Dembski metric, yielding the values of Chi for the 35 protein families, as may be seen in the revised point 11 of the original post above.)

11 --> This is a crucial error and is responsible for her onward blunders. She apparently cannot bring herself to conceive or acknowledge that Dembski et al could be trying to do -- or even, succeeding in doing! -- just what we read in the OP as cited from NFL pp 144, 148: building on the thinkers who went before.

12 --> As for the "ignorant, stupid, insane or wicked/dishonest" who hang around UD and try to think along the lines laid out above, producing "personal definitions" . . .

13 --> Now in fact, what Dembski explicitly did was to try to quantify what CSI is about. As a first pass, we may see his statement in NFL, p. 144, that MG never addresses on the merits -- much less, in context. Let's break it up into points to see what it is doing:
“. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information,
a: (T, E) constitutes CSI because
b: T [i.e. "conceptual information," effectively the target hot zone in the field of possibilities] subsumes
c: E [i.e. "physical information," effectively the observed event from that field],
d: T is detachable from E, and
e: T measures at least 500 bits of information . . . ”
14 --> In short, the observed event E that carries information comes from an independently describable set, T, where membership in T involves 500 or more bits of information per the standard negative log probability metric.

15 --> Dembski is therefore giving a metric, with 500 bits as a threshold where the odds of getting to E by a chance driven random walk are 1 in 10^150 or worse.

16 --> In the 2005 elaboration, he gives the more complex expression that we have reduced: Chi = – log2(10^120 * phi_S(T) * p(T|H)), or Chi = Ip – (398 + K2), bits beyond a threshold.

17 --> That threshold tends (unsurprisingly) to max out at 500 bits, as VJT has deduced.

18 --> A metric of information in bits beyond a threshold of sufficient complexity -- one at which available random walk driven search resources would be all but certainly fruitlessly exhausted on the relevant real world gamut of search -- is plainly not meaningless.

18a --> It is also quite well supported empirically. The best random walk tests to date -- i.e. ones with no mechanisms that transmute inputted information into output information and so may mislead us into thinking it is coming up as a free lunch (Dawkins' Weasel is notorious in this regard . . . ) -- are probably the "monkeys at keyboards" tests, and to date the capital examples run like this one from Wikipedia:
One computer program run by Dan Oliver of Scottsdale, Arizona, according to an article in The New Yorker, came up with a result on August 4, 2004: After the group had worked for 42,162,500,000 billion billion monkey-years, one of the "monkeys" typed, “VALENTINE. Cease toIdor:eFLP0FRjWK78aXzVOwm)-‘;8.t" The first 19 letters of this sequence can be found in "The Two Gentlemen of Verona". Other teams have reproduced 18 characters from "Timon of Athens", 17 from "Troilus and Cressida", and 16 from "Richard II".[20] A website entitled The Monkey Shakespeare Simulator, launched on July 1, 2003, contained a Java applet that simulates a large population of monkeys typing randomly, with the stated intention of seeing how long it takes the virtual monkeys to produce a complete Shakespearean play from beginning to end. For example, it produced this partial line from Henry IV, Part 2, reporting that it took "2,737,850 million billion billion billion monkey-years" to reach 24 matching characters: RUMOUR. Open your ears; 9r"5j5&?OWTY Z0d...
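A minimal, hypothetical Python sketch of the kind of random-typing test just quoted; the alphabet, target prefix and trial counts are placeholders chosen for illustration, not a reproduction of the cited runs. It simply tracks the longest prefix of a target ever matched by purely random typing, which is the quantity the quoted results report.

```python
import random
import string

# Hypothetical toy version of the "monkeys at keyboards" test described above:
# generate random characters and track the longest prefix of a target phrase
# ever matched. The target and trial counts are placeholders; the illustration
# is only that matched lengths grow very slowly while the space of possible
# strings grows exponentially with length.

ALPHABET = string.ascii_uppercase + " .,;'"   # placeholder character set
TARGET = "VALENTINE. CEASE TO"                # placeholder target prefix

def longest_prefix_matched(trials: int, seed: int = 0) -> int:
    random.seed(seed)
    best = 0
    for _ in range(trials):
        k = 0
        while k < len(TARGET) and random.choice(ALPHABET) == TARGET[k]:
            k += 1
        best = max(best, k)
    return best

for trials in (10_000, 1_000_000):
    print(f"{trials} trials: longest prefix matched = {longest_prefix_matched(trials)}")
```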
18b --> The best case search results are of order 24 ASCII characters, or spaces of 128^24 = 3.74*10^50, taking up less than 170 bits; well within the 500 bit threshold. It has been observed that trial and error can find islands of function in spaces of 10^50 or so possibilities, corresponding to 170 or so bits.

19 --> In short, trial and error on random walks is strictly limited in what it can achieve. And, if the threshold of function T is of order 500 or more bits, then we have good reason to believe that such exercises will never of their own accord find such zones of function.

20 --> For the only observed cell-based living systems, the DNA complement starts north of 100,000 4-value bases, or 200 k bits. (In fact the estimate for the minimally complex independent living cell is about 300 k bases, or 600 k bits.)

21 --> There is no empirical evidence of a ladder of pre-life entities that mounted up stepwise to this. And, given the evidence that the living cell comprises a complex metabolic system integrated with a code based von Neumann self-replicator, its minimal threshold of functional complexity is certainly well past 500 or 1,000 bits.

22 --> The latter is 125 bytes or 143 ASCII characters, wholly inadequate to construct any control software system of consequence. And yet, the number of possible configs for 1,000 bits is 1.07*10^301, over ten times the square of the number of Planck time states of the 10^80 or so atoms of the observed cosmos, across the estimated thermodynamic lifespan of some 50 million times the usual timeline from the big bang. (A Planck time is so short that the fastest, strong force nuclear interactions take about 10^20 -- a hundred billion billion -- Planck times.)

22a --> In short, the blind chance plus mechanical necessity based search resources of the cosmos could not credibly find an island of function in a config space corresponding to 1,000 bits, much less 100,000.

23 --> And when it comes to the origin of main body plans, we are looking at 10+ mn bases of novel DNA information, dozens of times over.

24 --> The only observed, known cause of such degrees of functional complexity -- e.g. as in the posts in this thread -- is intelligent design. That observation is backed up by the sort of analysis of search space challenges we have just seen, a challenge that is only known to be overcome by the injection of active information by intelligence.

25 --> Now of course there have been hot debates on how probabilities are assigned to configs and how the scopes of islands of function can be estimated.

26 --> On one side, if there are currently hidden laws of physics that steer warm little ponds to form life and then shape life into body plans, then that is tantamount to saying nature is carrying out a complex program, and is front loaded to produce life. This is of course a form of design view.

27 --> On another side, the speculation is that there is a vast number of unobserved sub-cosmi and ours just happened to get lucky in that vastly larger pool of resources. This is of course a convenient and empirically unwarranted speculation. Metaphysics, not physics. (And, as the discussion of cosmological fine tuning here points out, it points straight back to an intelligent necessary being as the root of the multiverse capable of producing a sub-cosmos like ours.)

28 --> The simple brute force X-metric was developed against that backdrop.
29 --> It uses a complexity threshold of 1,000 bits so that cosmos-scope search -- the only empirically warranted maximum scope -- is utterly swamped by the scale of the config space.

30 --> Since we can directly observe functional specificity, it uses that judgement to set the value of S = 1/0.

31 --> Similarly, we can directly observe contingency and complexity beyond 1,000 bits, giving C = 1/0.

32 --> We can easily convert information measures [remember the nodes and arcs diagram and net list technique] to bits, or directly observe them in bits, as we see all around us, so we use the number of bits, B.

33 --> This brings up and warrants the only "personal" definition of FSCI shown at UD: X = C * S * B.

34 --> But by direct comparison, this is essentially comparable to the transformed version of the Chi metric, if we were to use a 1,000 bit threshold [and maybe we should now begin to subscript Chi to indicate the thresholds being used]: Chi_1000 = Ip – 1,000, in bits beyond the threshold.

35 --> The only gap is that for B we usually simply use the metric of physical information string capacity, without bothering to look at how much redundancy is in the code leading to some ability to compress. (That usually does not push us much beyond ~ 50% loss-less compression for typical file sizes, so the metric is -- on the intended rough and ready basis -- comparable to the Dembski one.)

______________

In short, we have excellent reason to see that CSI and FSCI are meaningful concepts, are capable of being turned into quantitative models and metrics, and directly apply to technological systems and biological ones. Indeed, since this has been specifically challenged, then denied, we must note -- point 11 OP and comment 11 -- that the Durston FSC metric and the reduced Dembski Chi-metric can easily be integrated to show 35 values of Chi for protein families. This talking point also collapses. GEM of TKI
kairosfocus
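As a companion to points 29 – 35, a minimal Python sketch of the brute-force X-metric and the Chi_1000 variant as just described; the example input (a 200,000-bit functionally specific string, i.e. the ~100 k four-state bases cited at point 20) is taken from the figures quoted earlier in the comment, and the final line simply reproduces the 1.07*10^301 config-space figure for 1,000 bits.

```python
# Minimal sketch of the brute-force X-metric described in points 29-35 and the
# 1,000-bit-threshold Chi variant. The example input (200,000 bits, i.e. ~100 k
# four-state bases at 2 bits per base) follows the figures quoted earlier in
# the comment; C and S are the 1/0 judgement flags described at points 30-31.

def x_metric(contingent_complex: int, functionally_specific: int, bits: int) -> int:
    """X = C * S * B, with C and S each 1 or 0 and B the capacity in bits."""
    return contingent_complex * functionally_specific * bits

def chi_1000(ip_bits: int) -> int:
    """Chi_1000 = Ip - 1000, bits beyond a 1,000-bit threshold."""
    return ip_bits - 1000

genome_bits = 200_000                         # ~100 k bases at 2 bits per base
print(x_metric(1, 1, genome_bits))            # 200000
print(chi_1000(genome_bits))                  # 199000 bits beyond the threshold
print(f"2**1000 ≈ {float(2**1000):.3e}")      # ~1.07e301 possible configurations
```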
April 18, 2011 at 04:18 PM PDT
kairosfocus: Congratulations on your excellent posts from 44-48. Well done!
StephenB
April 18, 2011 at 03:50 PM PDT