# Siding with Mathgrrl on a point, and offering an alternative to CSI v2.0

May 18, 2013 | Posted by scordova under Complex Specified Information, Intelligent Design, Mathematics |

There are two versions of the metric for Bill Dembski’s CSI. One version can be traced to his book *No Free Lunch* published in 2002. Let us call that “CSI v1.0”.

Then in 2005 Bill published *Specification: The Pattern that Signifies Intelligence*, where he includes the version identifier "v1.22", but perhaps it would be better to call the concepts in that paper CSI v2.0 since, like Windows 8, it has some radical differences from its predecessor and will come up with different results. Some end users of the concept of CSI prefer CSI v1.0 over v2.0.

It was very easy to estimate CSI numbers in version 1.0 and then argue later over whether the subjective patterns used to deduce CSI were independent and not postdictive. Trying to calculate CSI in v2.0 is cumbersome, and I don't even try anymore. As a matter of practicality, when discussing origin-of-life or biological evolution, ID-sympathetic arguments are framed in terms of improbability, not CSI v2.0. In contrast, calculating CSI v1.0 is a transparent transformation from improbability: simply take the negative logarithm of the probability.

I = -log2(P)

In that respect, I think MathGrrl (whose real identity he revealed here) has scored a point with respect to questioning the ability to calculate CSI v2.0, especially when it would have been a piece of cake in CSI v1.0.

For example, take 500 coins, and suppose they are all heads. The CSI v1.0 score is 500 bits. The calculation is transparent and easy, and accords with how we calculate improbability. Try doing that with CSI v2.0 and justifying the calculation.
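For readers who want to check the arithmetic, the v1.0 score is a one-liner. A minimal sketch (the function name is mine, not from Dembski's papers):

```python
import math

def csi_v1_bits(probability):
    """CSI v1.0 score as described above: I = -log2(P)."""
    return -math.log2(probability)

# 500 fair coins, all heads: P = (1/2)^500
print(csi_v1_bits(0.5 ** 500))  # 500.0 bits
```

The same function handles any pre-specified sequence of 500 coins, since each such sequence also has probability (1/2)^500.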

Similarly, with pre-specifications (specifications already known to humans, like the Champernowne sequences), if we found 500 coins in a sequence that matched a Champernowne sequence, we could argue the CSI score is 500 bits as well. But try doing that calculation in CSI v2.0. For more complex situations, one might get different answers depending on whom you are talking to, because CSI v2.0 depends on the UPB and things like the number of possible primitive subjective concepts in a person's mind.

The motivation for CSI v2.0 was to account for the possibility of slapping a pattern on after the fact and calling something "designed". v2.0 was crafted to account for the possibility that someone might see a sequence of physical objects (like coins) and argue that the patterns in evidence were designed because he sees some pattern in the coins familiar to him but to no one else. The problem is that everyone has different life experiences, and each will project his own subjective view of what constitutes a pattern. v2.0 tried to use some mathematics to create a threshold whereby one could infer, even if the recognized pattern was subjective and unique to the observer of a design, that chance would not be a likely explanation for the coincidence.

For example, if we saw a stream of bits which someone claims was generated by coin flips, but the bit stream corresponds to the Champernowne sequence, some will recognize the stream as designed and others will not. How then, given the subjective perceptions each observer has, can the problem be resolved? There are methods suggested in v2.0 which in and of themselves would not be inherently objectionable, but then v2.0 tries to quantify how likely the subjective perception is to arise by chance, and convolves this calculation with the probability of the objects emerging by chance. Hence we mix the probability of an observer concocting a pattern in his head by chance with the probability that an event or object happens by chance, and after some gyrations out pops a CSI v2.0 score. v1.0 does not involve such heavy calculations regarding the random chance that an observer formulates a pattern in his head, and thus is more tractable. So why the move from v1.0 to v2.0? The v1.0 approach has limitations which v2.0 does not. However, I recommend that when v1.0 is available, use v1.0!

The question of postdiction is an important one, but if I may offer an opinion: many designs in biology don't require the exhaustive rigor attempted in v2.0 to determine whether our design inferences are postdictive (the result of our imagination) or whether the designed artifacts themselves are inherently evidence against a chance hypothesis. This can be done using simpler mathematical arguments.

For example, if we saw 500 fair coins all heads, do we actually have to consider human subjectivity when looking at the pattern and concluding it is designed? No. Why? We can make an alternative mathematical argument: a set of coins that is all heads is sufficiently inconsistent with the binomial distribution for randomly tossed coins that we can reject the chance hypothesis. And since the physics of fair coins rules out physical law as the cause of the configuration, we can then infer design. There is no need to delve into the question of subjective human specification to make the design inference in that case. CSI v2.0 is not needed, and CSI v1.0, which says we have 500 bits of CSI, is sufficient.

Where this method (v1.0 plus pure statistics) fails is in recognizing design in a sequence of coin flips that follows something like the Champernowne sequence. Here the question of how likely it is for humans to make the Champernowne sequence special in their minds becomes a serious one, and it is difficult to calculate that probability. I suppose that is what motivated Jason Rosenhouse to argue that the sort of specifications used by ID proponents aren't useful for biology. But that is not completely true if the specifications used by ID proponents can be formulated without subjectivity (as I did in the example with the coins).

The downside of the alternative approach (using CSI v1.0 and pure statistics) is that it does not include the use of otherwise legitimate human subjective constructs (like the notion of a motor) in making design arguments. Some, like Michael Shermer or my friend Allen MacNeill, might argue that we are merely projecting our notions of design when we say something looks like a motor or a communication system or a computer, and that the perception of design owes more to our projection than to any inherent design. The alternative approach I suggest is immune from this objection, even though it is far more limited in scope.

Of course I believe something is designed if it looks like a motor (the flagellum), a telescope (the eye), a microphone (the ear), a speaker (some species of bird can imitate an incredible range of sounds), a sonar system (bat and whale sonar), an electric field sensor (sharks), a magnetic field navigation system (monarch butterflies), etc. The alternative method I suggest will not directly detect design in these objects quite so easily, since pure statistics are hard pressed to describe the improbability of such features in biology, even though it is so apparent these features are designed. CSI v2.0 was an ambitious attempt to cover these cases, but it came with substantial computational challenges in arriving at information estimates. I leave it to others to calculate CSI v2.0 for these cases.

Here is an example of using v1.0 in biology regarding homochirality. Amino acids can be left or right handed. Physics and chemistry dictate that left-handed and right-handed amino acids arise mostly (not always) in equal amounts unless there is a specialized process (like living cells) that creates them. Stanley Miller’s amino acid soup experiments created mixtures of left and right handed amino acids, a mixture we would call racemic (a mix of right and left-handed amino acids) versus the homochiral variety (only left-handed) we find in biology.

Worse for the proponents of mindless origins of life, even homochiral amino acids will racemize spontaneously over time (some half-lives are on the order of hundreds of years), and they will deaminate. Further, when Sidney Fox tried to polymerize homochiral amino acids into protoproteins, they racemized due to the extreme heat, many failed to form chains at all, and the chains he did create had few if any alpha peptide bonds. And in the unlikely event the amino acids do polymerize in a soup, they can undergo hydrolysis. These considerations are consistent with the familiar observation that when something is dead, it tends to remain dead and moves farther away from any chance of resuscitation over time.

I could go on and on, but the point is that we can provisionally say the binomial distribution I used for coins also applies to the homochirality in living creatures, and hence we can make the design inference and assert a biopolymer has *at least* -log2(1/2^N) = N bits of CSI v1.0 based on N stereoisomer residues. One might try to calculate CSI v2.0 for this case, but being lazy I will stick to the CSI v1.0 calculation. Easier is sometimes better.

So how can the alternative approach (CSI v1.0 and pure statistics) detect design in something like the flagellum or the DNA encoding and decoding system? It cannot do so as comprehensively as CSI v2.0, but v1.0 can argue for design in the components. As I argued qualitatively in the article Coordinated Complexity – the key to refuting postdiction and single target objections, one can formulate an observer-independent specification (such as I did with the 500 coins being all heads) by appeal to pure statistics. I gave the example of how the FBI convicted cheaters who used false shuffles even though no formal specification of design was asserted. They merely had to use common sense (which can be described mathematically as cross-correlation or autocorrelation) to detect the cheating.

Here is what I wrote:

The opponents of ID argue something along the lines: “take a deck of cards, randomly shuffle it, the probability of any given sequence occurring is 1 out of 52 factorial or about 8×10^67 — Improbable things happen all the time, it doesn’t imply intelligent design.”

In fact, I found one such Darwinist screed here:

Creationists and “Intelligent Design” theorists claim that the odds of life having evolved as it has on earth is so great that it could not possibly be random. Yes, the odds are astronomical, but only if you were trying to PREDICT IN ADVANCE how life would evolve.

http://answers.yahoo.com/question/index?qid=20071207060800AAqO3j2

Ah, but what if cards dealt from one random shuffle are repeated by another shuffle, would you suspect Intelligent Design? A case involving this is reported in the FBI website: House of Cards

In this case, a team of cheaters bribed a casino dealer to deal cards and then reshuffle them in same order that they were previously dealt out (no easy shuffling feat!). They would arrive at the casino, play cards which the dealer dealt and secretly record the sequence of cards dealt out. Thus when the dealer re-shuffled the cards and dealt out the cards in the exact same sequence as the previous shuffle, the team of cheaters would be able to play knowing what cards they would be dealt, thus giving them substantial advantage. Not an easy scam to pull off, but they got away with it for a long time.

The evidence of cheating was confirmed by videotape surveillance because the first random shuffle provided a specification to detect intelligent design of the next shuffle. The next shuffle was intelligently designed to preserve the order of the prior shuffle.

Biology is rich with self-specifying systems like the auto-correlatable sequence of cards in the example above. The simplest example is life's ability to make copies of itself through a process akin to Quine computing. Physics and chemistry make Quine systems possible, but simultaneously improbable. Computers, as a matter of principle, cannot exist if they have no degrees of freedom which permit high improbability in some of their constituent systems (like computer memory banks).

We can see that the correlation between a parent organism and its offspring is not the result of chance, and thus we can reject the chance hypothesis for that correlation. One might argue that though the offspring (copy) is not the product of chance, the process of copying is the product of a mindless copy machine. True, but we can then further estimate the probability of randomly implementing the particular Quine computing algorithms (which make it possible for life to act like a computerized copy machine). The act of a system making copies is not in and of itself spectacular (salt crystals do that), but the act of making improbable copies via an improbable copying machine? That is what is spectacular.

I further pointed out that biology is rich with systems that can be likened to login/password or lock-and-key systems. That is, the architecture of the system is such that the components are constrained to obey a certain pattern or else the system will fail. In that sense, the targets for individual components can be shown to be specified without having to calculate the chances the observer is randomly formulating subjective patterns onto the presumably designed object.

That is to say, even though there are infinitely many ways to make lock-and-key combinations, that does not imply that the emergence of a lock-and-key system is probable! Unfortunately, Darwinists will implicitly say, "there are an infinite number of ways to make life, therefore we can't use probability arguments," but they fail to see the error in their reasoning, as demonstrated with the lock-and-key analogy.

This simplified methodology using v1.0, though not capable of saying “the flagellum is a motor and therefore is designed”, is capable of asserting “individual components (like the flagellum assembly instructions) are improbable hence the flagellum is designed.”

But I will admit, the step of invoking the login/password or lock-and-key metaphor is a step outside of pure statistics, and making the argument for design in the case of login/password and lock-and-key metaphors more rigorous is a project of future study.

Acknowledgments:

Mathgrrl, though we’re opponents in this debate, he strikes me a decent guy

NOTES:

The fact that life makes copies motivated Nobel Laureate Eugene Wigner to hypothesize a biotonic law in physics. That was ultimately refuted. Life does not copy via a biotonic law but through computation (and the emergence of computation is not attributable to physical law in principle, just as software cannot be explained by hardware alone).

### 117 Responses to *Siding with Mathgrrl on a point, and offering an alternative to CSI v2.0*


First and foremost, I apologize to you if I hurt your feelings in your last post Sal.,,, It’s a delicate balance to walk above the fray without getting personal and I stumble much more than I would like from that delicate mark. ,,,This recent article by Dr. Dembski may be of some interest to your post if you, or some reader, has not seen it yet:

Before They’ve Even Seen Stephen Meyer’s New Book, Darwinists Waste No Time in Criticizing Darwin’s Doubt – William A. Dembski – April 4, 2013

Excerpt: In the newer approach to conservation of information, the focus is not on drawing design inferences but on understanding search in general and how information facilitates successful search. The focus is therefore not so much on individual probabilities as on probability distributions and how they change as searches incorporate information. My universal probability bound of 1 in 10^150 (a perennial sticking point for Shallit and Felsenstein) therefore becomes irrelevant in the new form of conservation of information whereas in the earlier it was essential because there a certain probability threshold had to be attained before conservation of information could be said to apply. The new form is more powerful and conceptually elegant. Rather than lead to a design inference, it shows that accounting for the information required for successful search leads to a regress that only intensifies as one backtracks. It therefore suggests an ultimate source of information, which it can reasonably be argued is a designer. I explain all this in a nontechnical way in an article I posted at ENV a few months back titled “Conservation of Information Made Simple” (go here). ,,,

,,, Here are the two seminal papers on conservation of information that I’ve written with Robert Marks:

“The Search for a Search: Measuring the Information Cost of Higher-Level Search,” Journal of Advanced Computational Intelligence and Intelligent Informatics 14(5) (2010): 475-486

“Conservation of Information in Search: Measuring the Cost of Success,” IEEE Transactions on Systems, Man and Cybernetics A, Systems & Humans, 5(5) (September 2009): 1051-1061

For other papers that Marks, his students, and I have done to extend the results in these papers, visit the publications page at http://www.evoinfo.org

http://www.evolutionnews.org/2.....70821.html

The problem is that there isn't a 1.0 or a 2.0. The 2005 paper refined his earlier works; it didn't replace them. For example, he now admits that a universal UPB of 10^150 was too high and that some scenarios could have a much lower UPB; the UPB is now dynamic as opposed to static.

Sal, interesting and thought-provoking post, thanks.

I wonder. Your coin example calls out an objective sequence for a design inference: 500 heads out of 500 coins. However, might it be possible to say something similar about sequences with only 1 coin tails, or 2, and so on?

Consider this in terms of uncertainty. If our sequence specification is 500 heads, then there is zero uncertainty per bit: for any coin in the sequence, the probability that it will be heads is 1 and the probability of tails is 0.

Now consider sequences with exactly 1 tails among 499 heads. In this case we’ve increased uncertainty from zero to around 0.021 bits of informational entropy. For two tails and 498 heads, we end up with around 0.038 bits. These numbers come from this formula for entropy:

U = -[p0*log2(p0) + p1*log2(p1)], where p0 is the probability of tails and p1 is the probability of heads for each coin.

Constructing a case where zero uncertainty translates to maximum specificity, and complete uncertainty yields minimum specificity:

S = N – U*N, where N is the number of coins and U is the informational entropy per coin.

Plugging in the case with 500 heads, which would equate to maximum specificity and minimum uncertainty:

S = 500 – 0*500 = 500 bits of specificity

For minimum specificity and maximum uncertainty:

S = 500 – 1*500 = 0 bits of specificity

That would be the case where each coin had an equal probability of being heads or tails.

If at maximum specificity we have all heads, we still have 500 bits of specificity. For the example with a single tails and 499 heads:

S = 500 – 0.021*500 = 489.5 bits of specificity
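These U and S formulas are easy to put into code. Here is a sketch of the method exactly as described above (function names are mine); note that using the unrounded entropy gives approximately 489.6 bits rather than 489.5, since 0.021 is a rounded value of U:

```python
import math

def entropy_per_coin(p_tails):
    """U = -[p0*log2(p0) + p1*log2(p1)], per-coin Shannon entropy."""
    total = 0.0
    for p in (p_tails, 1.0 - p_tails):
        if p > 0.0:  # skip zero-probability terms (0*log2(0) -> 0)
            total -= p * math.log2(p)
    return total

def specificity(n, p_tails):
    """S = N - U*N, the specificity measure proposed above."""
    return n * (1.0 - entropy_per_coin(p_tails))

print(round(specificity(500, 0.0), 1))      # 500.0 (all heads, zero uncertainty)
print(round(specificity(500, 1 / 500), 1))  # 489.6 (one tails among 499 heads)
print(round(specificity(500, 0.5), 1))      # 0.0 (fair coin, maximum uncertainty)
```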

I think this is interesting. At maximum specificity, the sequence of all heads yields 500 bits, which is 3.27E150, breaking the universal probability bound. At 489.5 bits of specificity, which is exactly one tails, we get 2.26E147, which is shy of the UPB. Of course as we increase the string length, that is, the number of coins in the sequence, we would find cases where one, two, or any number of tails would still yield results above the UPB.

It seems to me that if this is reasonable, then it may provide objective criteria for more than just the extreme case of all heads or all tails in a sequence. We can actually apply uncertainty to determining the specification content.

Hopefully my reasoning is sound and I haven’t made any significant errors.

Nice post, Sal! In my opinion, the ideal way to determine CSI would be a two-pronged approach that uses -log2(P) to determine Shannon information, coupled with some sort of database that a sequence could be compared against to determine specification.

Chance,

With 498 coins heads, and 2 coins tails, would we say that this is obviously designed? There are many possibilities to achieve this with 500 coins. Some examples

TT…..HHHHHHHH

or

HHHH…..TT…..HHHH

or

HHHHHHH……..THTH

We would have to count how many of these exist and that will yield a certain probability for hitting one of those configurations. I’d probably use something like the binomial distribution to calculate the improbability in this case, and that will yield the bit number. The same sort of approach would work for 2 heads and 1 tail ratios.

I’m open to differing opinions, but the point is, whatever the number of CSI bits, we have methods outside CSI to tell us something is improbable.

I haven't examined your calculations in detail. Neil Rickert is the mathematician among us, as is DiEb; DiEb especially seems quite eager to offer critiques.

But I’ll take a stab:

P(498 heads,2 tails) =

P(2) = [500!/ ( 2! (500-2)!)] .5^2 (1 – .5)^(500-2) =

3.811 x 10^-146, which is -log2(3.811 x 10^-146) ≈ 483 bits

I used this website:

http://stattrek.com/online-cal.....omial.aspx to calculate the binomial probability

For 2 heads and 1 tail ratio

P(500/3 tails) = P(166 tails), which works out to about 45 bits

My calculation yielded 491 bits. The way I got it was going to

http://stattrek.com/online-cal.....omial.aspx

Then entering the following parameters:

Probability of success on a single trial : .5

Number of trials: 500

Number of successes: 1

Then press calculate and it does that cumbersome binomial probability via this formula:

P(1) = P(499) = [500!/ ( 1! (500-1)!)] .5^1 (1 – .5)^(500-1) =

This will yield a probability of

1.527 x 10^-148 which can be expressed in bits by

-log2(1.527 x 10^-148) = 491 bits
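Both binomial figures can also be checked exactly without the online calculator, using Python's integer `math.comb` (the helper name is mine):

```python
import math

def binomial_bits(n, k):
    """-log2 of the probability of exactly k tails in n fair-coin flips."""
    # math.comb does exact integer arithmetic before the final division
    return -math.log2(math.comb(n, k) / 2 ** n)

print(round(binomial_bits(500, 1)))  # 491 (1 tails, 499 heads)
print(round(binomial_bits(500, 2)))  # 483 (2 tails, 498 heads)
```

This reproduces the 491-bit and 483-bit figures derived above.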

Sal, thanks for your response.

Using the uncertainty/specificity formula at #3, here’s what I get:

2 tails, 498 heads: ~481 bits

1/3 tails, 2/3 heads: ~41 bits

The numbers certainly aren’t exactly the same as yours, but the numbers get closer together when uncertainty is low, as in the 2 tails case. So perhaps they converge for large values of N.

Just to note, with 1000 tosses, 100 tails and 900 heads still comes in above 500 bits. There certainly appears to be some sort of objective specificity in those ratios.

In any case, it appears that cases with 1 heads, or 2, and so on, could still be considered complex and specified for larger values of N.

As a footnote, dealing with specificity from an uncertainty perspective does require a slightly different way of looking at the sequence.

When considering 1 tails and 499 heads, instead of insisting upon exactly 1 tails in a string of 500 coins, we consider each coin to have an entropy relating to the probability of 1/500. This is a Shannon information perspective.

U = -[p1*log2(p1) + p2*log2(p2)]

= -[1/500*log2(1/500) + 499/500*log2(499/500)]

= -[0.002*log2(0.002) + 0.998*log2(0.998)]

= ~[0.0179 + 0.0029] = ~0.021 bits of entropy per coin

Then U can be used in the specificity equation:

S = N – U*N, where N is the number of coin tosses.

S = 500 – 0.021*500 = 489.5 bits (specified, but not above the UPB)

I’m guessing that this value converges on the true probability with large N. I can’t say whether this method is better overall than the binomial approach, but it appears less cumbersome.

My understanding is there is the specification and then there is the outcome.

The uncertainty for the outcome of 500 fair coins is always 500 bits.

It's a little hard to say there is uncertainty in ANY specification, since the specification is precisely defined, but we can say a given 500-bit specification has 500 bits of information, a 499-bit specification has 499 bits of information, etc. How can a specification be less than 500 bits if the number of coins is 500? This happens when a specification is actually composed of sub-specifications.

The measure of information is the measure of the reduction of uncertainty. When 500 coins are flipped, we have removed 500 bits of uncertainty because we now know what the outcome is, and thus we have 500 bits of information, whereas before the flipping we had 500 bits of uncertainty. So you can see uncertainty and information mirror each other.

We have a rather counterintuitive result: if we lump together several sub-specifications where each is 500 bits, the composite probability of hitting one of those specifications isn't

500bits + 500bits + 500 bits ….

it is (gosh I hope I don’t mess this up):

-log2( 1/2^500 + 1/2^500 …) bits

Hence when we have 1 coin tails and 499 coins heads, we have the following 500 possibilities (sub-specifications, if you will):

1. T H H ……H

or

2. H T H ……H

or

3. H H T ……H

…..

….

498. H H H …. T H H

499. H H H…… H T H

or

500. H H H……H H T

That tallies to 500 different possible outcomes where one of the coins is tails. That is to say when we have 1 coin tails and 499 heads, there are actually 500 ways this can be achieved with 500 coins. The specification that there are “1 tail and 499 heads” is actually 500 sub-specifications as enumerated above.

One can extend the previous calculation to tally these 500 cases:

-log2( 1/2^500 + 1/2^500…..) = -log2(500/2^500) = 491 bits

which was the same result using the binomial probability.

The specification of 1 coin tails and 499 heads is a single specification of 491 bits composed of 500 separate sub-specifications, where each sub-specification has 500 bits. You can see the counterintuitive result: by adding five hundred 500-bit specifications, you get one 491-bit specification!

That is the way I understand things anyway.
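The union-of-sub-specifications sum above can be checked directly with exact arithmetic. A sketch, assuming as above 500 disjoint outcomes each of probability 1/2^500:

```python
import math
from fractions import Fraction

# 500 disjoint sub-specifications (the tails coin in position 1, 2, ..., 500),
# each with probability 1/2^500; probabilities of disjoint events add.
p_union = 500 * Fraction(1, 2 ** 500)
bits = -math.log2(p_union)
print(round(bits))  # 491
```

Using `Fraction` keeps the sum exact; the conversion to a float happens only inside the final logarithm.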

Well I tested the numbers for 100 tails out of 1000 tosses with the binomial calculator you provided, and the numbers are not converging, so perhaps there is something wrong with my approach at #3. Wouldn’t be the first time. 😉

Well, there are many Darwinists claiming superior intellect over us humble IDists. I'm waiting for them to give us free-of-charge peer review and fix any mistakes we've made. They always seem eager to "fix" our misunderstandings, so I eagerly await hearing from them.

The difference between what I proposed in #3 and what you're suggesting in #9 is that my calculation for 1 tails out of 500 tosses is the same as presuming an unfair coin that comes up tails only 1 out of 500 throws, which is very low uncertainty per toss, about 0.021 bits. Contrast this with a fair coin, which is exactly 1 bit.

Fair coin:

U = -[0.5*log2(0.5) + 0.5*log2(0.5)]

= [0.5 + 0.5] = 1 bit per toss, or maximum uncertainty.

As the coin becomes more biased, the measurement gets smaller and uncertainty (entropy) is reduced. I thought that this would make sense when applied to the bit measurement for N throws, but I’m not so sure at this point.

Anyway, thanks for indulging me Sal. It was fun.

By the way, the reduction in uncertainty does make some intuitive sense.

For 500 heads, there is no uncertainty as to the value of each toss, it will always be heads.

However, with 499 heads and 1 tails, there is some uncertainty introduced, in that the tails could come up in any one of 500 positions. We still know that we'll get 499 heads, but we lose some certainty about each individual toss.

Another way of saying it: if we know for sure that we'll get 1 (and only 1) tail, the question is which of the above enumerated sub-specifications will emerge. The probability that any one of the 500 specifications will emerge is:

P(a specific 1 tail, 499 head configuration) = 1/500

which translates to

-log2(P) = -log2(1/500) = 9 bits

notice that I calculated the specification for 1 tail as 491 bits

Take the above result of 9 bits and look at this:

500 – 9 = 491

If I were the observer, and said, “Chance Ratcliff, there is 1 tail and 499 heads” I would have reduced your uncertainty in what the outcome of the experiment was by 491 bits, or equivalently I have communicated to you 491 bits of information.

The only thing you wouldn't know at that point is where the tails coin is located. You still have 9 bits of uncertainty. When I tell you the position, I've reduced your uncertainty about the coins by another 9 bits (or equivalently increased your information by 9 bits), hence I will then have communicated a total of 491 + 9 = 500 bits of information about the outcome of the experiment. Once you have 500 bits of information, your knowledge of the outcome is complete.

Yes, I believe that is correct. If the coin is not fair, though, that number will go down. You can plug the numbers for, say, 1/4 probability of tails and 3/4 heads into that simple entropy equation and you'll get ~0.811 bits per toss.

WRT specification, we could specify “1 tails and 499 heads” if we wanted, and that specification would include some uncertainty, so there would be 500 cases where we could get that same outcome. I was thinking that this could be viewed through the lens of Shannon entropy. Perhaps it can. In that case the uncertainty is present in our specification, which could be satisfied by multiple outcomes.

Any time we talk bits we are talking Shannon entropy. I know I get flak about this, but Shannon entropy is a measure of uncertainty, not disorder.

A 500 meg disk has 500 megs of shannon entropy (or shannon information). Information and uncertainty are different sides of the same coin (pun intended).

Sal @14, yes that makes sense.

OK, now that I’m using the binomial calculator correctly:

Trials = 100, 1 tails 99 heads

Binomial method: 93.4 bits

My method: 94.8 bits

Difference: 1.4 bits

Trials = 500, 1 tails 499 heads

Binomial method: 491.0 bits

My Method: 492.5 bits

Difference: 1.5 bits

Trials = 1000, 1 tails 999 heads

Binomial method: 990.0 bits

My Method: 991.5 bits

Difference: 1.5 bits

In each case the difference is small but constant. Huh. Well at least I wasn’t far off the mark, but I’d sure like to understand why the delta hovers at 1.5 bits.
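As an editorial sketch of the 1.5-bit puzzle: computing both methods exactly as the formulas appear in this thread, the gap between the binomial bits and the entropy-method bits settles at log2(e) ≈ 1.4427 bits as N grows, because (N-1)*log2(N/(N-1)) tends to log2(e). (With the formulas as written I get the entropy figure slightly *below* the binomial one, so the tabulated values above may have been computed a bit differently, but the near-constant ~1.44-bit magnitude matches.)

```python
import math

def binomial_bits(n):
    """-log2 P(exactly 1 tails in n fair flips): n - log2(n)."""
    return n - math.log2(n)

def entropy_bits(n):
    """The S = N - U*N method with p = 1/n, as in comment #3."""
    p, q = 1.0 / n, 1.0 - 1.0 / n
    u = -(p * math.log2(p) + q * math.log2(q))
    return n * (1.0 - u)

for n in (100, 500, 1000, 10 ** 6):
    print(n, round(binomial_bits(n) - entropy_bits(n), 4))
# The gap tends to log2(e) ~ 1.4427 bits as n grows.
```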

I presume we are all keeping in mind that measuring so-called Shannon information is *not* measuring CSI. It is only measuring the complexity (improbability, surprise factor, however else you want to describe it). The "specification" part of CSI is quite often not subject to calculation, at least not with the Shannon bit-calculation approach.

Eric, would you agree that 500 heads out of 500 trials is an objective specification having 500 bits of information? If so, could “1 tails and 499 heads” also be an objective specification with 500 sequences matching the specification? In the latter case, the uncertainty is higher, and I was toying with the idea that we might quantify this with Shannon entropy.

Chance,

There is something with respect to caculating Shannon Entropy:

U = I = -p(x0)*log2(p(x0)) - p(x1)*log2(p(x1)) - p(x2)*log2(p(x2)) - ... - p(xn)*log2(p(xn))

each xi is a microstate, and there are 2^500 microstates, or ways that 500 coins can be configured. So it's rather painful to use this form of Shannon entropy, where n = 2^N and N = 500.

Shannon included it to be complete, but if you have the assumption that every microstate (a complete collection of 500 coins) is equally probable, you can simplify the above torturous equation to

U = I = N bits = 500 bits

where N is the number of coins, in this case 500.

Alternatively we can simply take the number of possible microstates and take the logarithm of them. The number of microstates is 2^500, so

U = I = log2(2^500) = 500 bits

or alternatively we can take the probability of any specific outcome, in this case 1/2^500 and then take the alternative formula frequently used in Bill’s writings:

I = -log2(P) = -log2(1/2^500) = 500 bits

I saw you tried to make a variation of the more tortured form in your calculations. Maybe that will help you connect the dots.

The reason various forms are used is that sometimes the bits can more easily be calculated using probability, while at other times it is easier to count microstates. The torturous version is the most comprehensive, but if there are special conditions (like fair coins), the calculations can be simplified to the forms I was using.

The torturous version of the Shannon entropy is frequently used to calculate bandwidth in communication systems with an analog connection such as in fiber optic or wireless communication, but for nice digital bitwise communication, the simpler formulations will suffice.

RE: #20, in the first case of 500 heads, there is zero uncertainty, and 500 bits of info. In the second case, there is 1 tails and 499 heads, with a delta of 9 bits more uncertainty, and so 491 bits of information, as Sal calculated it. (My method at #3 produces 492.5 bits, 1.5 bits higher.)

So Eric, you are correct this is not the same as CSI, but I thought it was interesting nonetheless.

CSI v1.0 can be amenable to Shannon bit-calculations for specification in some cases.

CSI v2.0 is not.

If you can’t describe the number of bits in the specification for v1.0, it is harder to make the design inference formally. Some constructs are really difficult to analyze, like, say, how improbable the assembly of a house of cards is. Formalisms only work well for tractable examples; things like 747s? Forget it. But the nice thing is that for such complex systems, do you really have to make the calculations? For Darwinists, no probability figure will ever satisfy them, so it won’t matter in their case anyway.

Some of the more trivial cases (like homochirality and the computers in life) are sufficient for me at a personal level. I have no ambition to persuade anyone or any group of people that are committed to some mindless account. No amount of numbers will ever be persuasive to them. These considerations are for people like myself who are skeptical but sympathetic. These exercises help reassure me that the Darwinists have only assertions, nothing even attempting the rigor the ID side provides.

The calculations for some of the specifications described by Chance Ratcliff are under the assumption of v1.0.

Sal @21, you are correct, and I understand the simplification. But I couldn’t use the simplification for what I was doing. It may have been wholly inappropriate for me to apply it the way I did, because in truth a biased coin that comes up tails 1 out of 500 tosses is not exactly the same as a sequence of 500 coins with exactly 1 tails. Yet the similarity in outcomes between my method and yours is striking, imo (except for the mystery 1.5 bits). So that more complete form was necessary to calculate a biased coin, which was the model I applied.

Thanks much for helping me with this and for spending the time.

Here is an example of a statistically constructed specification that has no subjective component. Pure statistics would reject the chance hypothesis. It deals with a convergence of patterning between rat and mouse genomes that can’t be attributed to common descent, with huge correlations across huge data samples. See the graph “The Mystery Signal” at the bottom of Sternberg’s article:

http://www.evolutionnews.org/2.....32961.html

There are tons of these cross correlating features in biology that scream out “STEGANOGRAPHY”.

CSI v1.0 and statistics are good for identifying these as designs, but really, practically speaking you don’t need to go that far, your eyes will tell you there is design such as found merely by looking in Sternberg’s graph.

Obscure data, yes. But hey, that’s why you need to read Uncommon Descent to uncover gems that few are privy to.

Sal

Sal, I think I understand you better in #21 now.

We could use -log2(P) where P is “the number of ways to succeed” divided by the total search space. Is that correct? I get just over 491 bits using that form for 1 tails and 499 heads.

RE: #26 if that’s the case we can calculate the number of ways to succeed as C(n,k) where n is the number of trials and k is the number of tosses coming up tails.

C(n,k) = n! / k!(n-k)!

It’s pretty simple then:

Bits = -log2[C(n,k)/n]

I think that’s correct but my brain is beginning to rebel now.

Yes, exactly. With respect to the discrete math you applied using combinations, I think it may work for the case of just 1 coin, but look at your formulation and see how close it is to the form on this webpage for the binomial probability:

http://www.regentsprep.org/Reg.....Lesson.htm

I’m getting sleepy, so take what I said with a shaker of salt….

Sal

Yep, it doesn’t hold. I should have checked, and I should have known I’m too burnt to think straight. Hasta mañana.

Actually I think you’re missing a power. Since probability of heads is same as tails, the form of the binomial probability reduces to:

Bits = -log2[C(n,k)/2^n]
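With the power restored, the formula checks out numerically. A sketch using Python’s `math.comb` (my example, requires Python 3.8+):

```python
import math

def surprisal_bits(n, k):
    """-log2[C(n,k)/2^n]: surprisal of getting exactly k tails
    in n fair coin tosses."""
    return n - math.log2(math.comb(n, k))

print(surprisal_bits(500, 0))  # 500.0 (all heads)
print(surprisal_bits(500, 1))  # ~491.03 (1 tails, 499 heads)
```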

Chance Ratcliff,

By the way, I’m glad you’re using the special form of Shannon’s entropy. If you used the general form, that would involve writing out 2^500 terms…

Otherwise stated, the number of terms in the general form would equal the Universal Probability Bound of 2^500 trials, so you’d be up really late trying to solve that equation.

Yep. That looks like it works. Apparently you’re not as tired as I am; at least that’s my version of the story.

Lol! I think it would still simplify, but I’m too loopy to think about it much more. It’s quite odd that my original #3 formulation approximates it with an error of 1.5 bits when N is greater than 100. I won’t even try to account for that tonight.

RE: #33, come to think of it, that error rate of 1.5 bits assumes 1 success in each of the search spaces. The error is likely not constant for differing numbers of trials. Stick a fork in me, I’m done for the night. Really. 😛

Chance:

500 heads in a row could be a specification in certain instances, but the 500 bits of information you mention is a calculation of the complexity, not the specification. (Incidentally, 500 heads can be written: “heads; repeat 499 times” or some other way that uses a much shorter description than a 500-length string. But that is a sidenote.)

I think we can see the futility of this when we look at two sequences:

tobeornottobethatisthequestion

vs.

brnottstinoisqotebeeootthuathe

Exact same letters. Exact same Shannon “information.” The calculation of entropy/probability/surprise/whatever-we-want-to-call-it is exactly the same.
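That equality is easy to verify: since the two strings are anagrams, a letter-frequency entropy calculation cannot tell them apart. A minimal sketch (my own, not Eric’s):

```python
from collections import Counter
import math

def letter_entropy(s):
    """Shannon entropy in bits per letter, from the string's own
    observed letter frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

s1 = "tobeornottobethatisthequestion"
s2 = "brnottstinoisqotebeeootthuathe"

# Anagrams share the same letter counts, so the entropies agree
print(letter_entropy(s1), letter_entropy(s2))
```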

Yet the first is a specification; the second is not. We clearly recognize it as such. What is it about the first that allows us to determine that it is a specification? It isn’t because we’ve gone through some calculation and assessed the specification quantitatively. Rather, it is because there is recognizable meaning/function to the sequence.

To be sure, I think it would be neat if we could somehow mathematically and quantitatively assess and calculate specification. Perhaps in some very narrow and rare instances we can. But I’m very skeptical that specification, as a general matter, is subject to mathematical quantification (unlike complexity, which oftentimes can be readily calculated). Most of the time specification is much more of a logical or practical or experiential assessment, than a mathematical one.

One final thought:

Usually (for the moment I’m willing to consider it may not be always; but usually) specification is a binary yes-no determination, not a sliding scale of amount of specification like we use for complexity. In other words, I’m not sure it is helpful or meaningful to say, in effect, “if the amount of specification [units] gets beyond a certain threshold then we’ll say we’re dealing with a specification.” Generally it either is or it isn’t a specification — yes or no. Then if the answer is yes we use the complexity measurement (which is quantifiable and can be on a sliding scale) to determine whether the complexity side of the assessment has been satisfied.

Our subjective perception is based on experience, which is legitimate and which is somewhat beyond the scope of v1.0. v2.0 tries to address it, and, not surprisingly, it is substantially harder to pursue.

It is a worthy pursuit, but, perhaps for some the smaller steps are needed.

One reason I posted this is that I’m preparing pedagogical materials for college students interested in ID. When Allen MacNeill at Cornell used v2.0 in his ID class in 2006, it just about crushed everyone!

Eric @35, I’m about burned out for the evening, but I’ll take a stab at this comment of yours:

But if the complexity of the specification is greater than the UPB, then CSI = yes, to put it simplistically. So the calculation needs to be in the context of an inequality: CSI is present when the specificity is greater than the UPB, or S > 500. That’s our boolean yes/no. The presumed specificity S is calculated in terms of uncertainty, for certain specifications like “0 tails in 500 trials” or “100 tails in 1000 trials”, both of which result in a bit content greater than the UPB.

I’m not sure it’s ultimately correct, but that’s the basic reasoning. See #3 (and onward) for my original attempt, which isn’t technically correct but gives the gist of the reasoning.

I believe that from an operational, practical standpoint, most of the designs we’ll formally demonstrate as designed will be of the v1.0 variety; the rest will be intuitively discovered.

For example, even today, people who reject ID treat much of the operation of an organism as a functioning entity and are doing reverse engineering. The only ones who are seriously missing out are Darwinists who insist things are junk.

v1.0 methods are very good for discovering grammar. Using correlation, meaning can be discovered in some cases. For example, we were able to decode the meaning of the genetic code. One only had to assume a design existed, and with some clever discovery of correlation, the meaning of the code was constructed. Here is a meaning table in biology that was unwittingly deduced by v1.0 methods long before those methods were codified by ID proponents:

http://www.lucasbrouwers.nl/bl.....c-code.jpg

The question of the Designer was not necessary; it was enough to assume there existed a design in the engineering sense.

The sort of things that Sternberg discovered are on a whole other level. They might help us elucidate how things connect together and really work. It could be like the Rosetta stone, or stones.

For things that v2.0 can confirm as designed, in practice we already assume those things are designed. For example, we call eyes “eyes” even though the perception of what constitutes an eye is subjective; we don’t need any formalisms to convince us that they fundamentally serve the purpose of helping the organism see.

The formalisms might help demonstrate that the evolutionary path could not be one based on random mutation, and because of the No Free Lunch theorems and population genetics, neither could the evolutionary path be through a process of natural selection. But if one is committed to dismissing ID, no amount of formalism will convince them anyway, save a very few people; and of the people who changed sides, the formalisms had very little to do with their change of mind (e.g. Dean Kenyon, John Sanford, Michael Behe). Common sense was far more important.

I have studied the formalisms for my own benefit over the years, just to help make sure that the perception of design wasn’t some accident of human imagination, and because I’m a doubting Thomas by nature. The examples I gave above were enough of a starting point for me, and how I concluded that the perception of design wasn’t a misperception of human imagination.

What else really convinces me of design? The behavior of the Darwinists — many of them rail, intimidate, abuse, and demean, but never back up their claims with facts and coherent reasoning, just sophistry, misrepresentation, and equivocation. That’s the conduct indicative of those who have no case.

I didn’t think that mathgrrl was a “decent guy” when I read the post you linked. He admitted to resorting to dishonesty then accused all of us of being intellectually dishonest. He used a heavy dose of shaming techniques to make us ashamed of being ignorant and wicked “intelligent design creationists.” I also think that he is being willfully ignorant about CSI being a useful concept even if it is not mathematically definable yet (just as entropy is a useful concept even when you don’t have a mathematical rubric for it). I know that he was told that in the many debates on this website on CSI yet he failed to even bring that up in his post summarizing the debates. This was (imo) done purposefully and therefore dishonestly. So no, I don’t think he’s a decent guy.

Eric @35,

Yes I think you are correct here. On further reflection, I think that when uncertainty is low, chance is out of the picture, as in the case of zero tails out of 500 tosses, or 100 tails out of 1000 tosses, both of which have more than 500 bits of entropy. So here we can rule out chance by examining the uncertainty contained in a string, but we cannot rule out necessity by this same method. Some sequences with low uncertainty will have algorithmic simplicity.

This is only true in a first-order approximation of English text, where letter frequencies are taken into account, instead of pairs or triplets. The first thing to note is that both of those strings, by comparison to a truly random sequence, exhibit a signal, a reduced uncertainty, because they will tend to correspond to English letter frequencies, such that ‘e’ will occur with around 0.13 probability and ‘q’ with negligible frequency.

We can differentiate the strings from a Shannon perspective by moving to a second-order approximation of English, which takes into account the relative frequencies of letter pairs. In this case, the first string has a reduced uncertainty over the second. In essence, we could transmit the first message with fewer bits. As we move up to triplets, or word approximations, the uncertainty decreases more. However it should be noted that we’re smuggling information in at the same time, because any approximation of English text is a context, which specifies probabilistic details about language.

Clearly there’s an enigma with regard to intelligence, language, specification, design, etc. And while specification may not be amenable to precise mathematical definition, we can apply principles like I alluded to above to determine whether specification is more likely in the former or the latter strings. There is less uncertainty in the former, if properties of English text are taken into account.
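One way to make the second-order point concrete is to compare the bigram (letter-pair) distributions of Eric’s two strings. A rough sketch of my own; with strings this short the gap is small:

```python
from collections import Counter
import math

def bigram_entropy(s):
    """Entropy in bits of the distribution of adjacent letter pairs."""
    pairs = Counter(s[i:i + 2] for i in range(len(s) - 1))
    n = sum(pairs.values())
    return -sum((c / n) * math.log2(c / n) for c in pairs.values())

s1 = "tobeornottobethatisthequestion"
s2 = "brnottstinoisqotebeeootthuathe"

# The meaningful string repeats pairs such as "th", "to", and "be",
# so its pair distribution tends to be slightly less uncertain
print(bigram_entropy(s1), bigram_entropy(s2))
```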

I tend to agree, and I’m certainly not suggesting that specification can be reduced to a mathematical formula; but that’s not to say that math is unuseful in exploring the properties of specification and its output. I think it’s possible that language will turn out to be the more illuminating property of designed things.

Pure math, or any design detection methodology, will not detect all possible designs. In fact, our best design detection methods will only detect a very small subspace of all possible designs.

It is fortuitous, dare I say Provident, we detect any design at all. But the fact that we have designs in biology which we can detect, I’d say that was by Design.

Chance @40:

Thanks for your thoughts.

Any time we talk about uncertainty, we are talking about complexity. I think if you look back carefully at the additional rules of English you bring to the table to lessen the uncertainty you will see that what you are really calculating — again — is complexity, not specification.

As to #37:

Again, I think you are mixing the two concepts up as though they were one. The UPB relates to complexity. If the UPB is exceeded then we have a certain level of complexity, but not necessarily CSI. We can have a ‘C’ way beyond the UPB, but the ‘S’ will still be missing unless it is found on its own merits. I’m not sure we’re saying different things here, but just wanted to make sure.

Correction to my #40:

“both of which have more than 500 bits of entropy”

->

“both of which have 500 or more bits of entropy”

_______

Sal,

I recently had a similar thought, that the quality of conduct is often proportional to the strength of one’s argument.

On further comment. It may be helpful to think of it this way:

One implication of a mathematical calculation to measure specification, as some people may be proposing, is that specification then becomes a scale. As a result, A can be more specified than B.

This being the case, when does A become specified enough to say that it is truly specified? We can’t measure specification (using whatever unit we can imagine) in terms of the universal probability bound. It can be measured only in terms of whether it is sufficiently specified. Therefore, we would require not only a measuring unit, but also a rational cutoff point for “specified enough,” which essentially is a percentage. Is it 100% specified, 90% specified, 50% specified? And 50% of what? Some idealized, hypothetical specification?

Is a Ferrari more specified than a Ford? Against what are we measuring, and how could we even in principle make such a determination?

Harking back to biology, is the bacterial flagellum specified? Sure, we all recognize that it is. We recognize it because we look at it functionally, logically, and experientially. In contrast, if we take the position that its specification can be measured and calculated, then we are, by definition, now forced to ask “But how much is it specified? And is it specified enough?” Maybe the bacterial flagellum is specified but it isn’t specified enough to warrant a design inference?

I still haven’t made up my mind about whether specification can be amenable to mathematical calculation. Perhaps in some rare cases it can. But I think the whole effort to quantify and calculate and measure specification is heading down the wrong path and will serve primarily to confuse rather than enlighten. Indeed, some of the critics of the design inference have wrongly (in my view) gone down this path and demanded an objective mathematical calculation of specification before they will admit that something is specified. It seems like the wrong way to approach things.

No need for a cutoff; we can merely assert that one hypothesis is more convincing than the others. That is probably true of the way we make decisions: we don’t necessarily decide absolutely that something is right, but we do rank which things look like better propositions.

So, I don’t try to say something is necessarily true because it is improbable, but something (like ID) is more believable as an explanation if it can be shown the alternative is improbable (like mindless origins).

FWIW, Dawkins still uses the chance hypothesis as an argument for the origin of life. The chances are very good then that he is wrong.

I should add, there is a dimension of this that is sometimes forgotten. Future random chance tosses of a system of 500 coins will tend to evolve it into a “racemic” mixture of heads and tails. Particularly in biology, not only is it improbable for homochirality to emerge, it is even more improbable that it will remain that way over time. That’s sort of hard to capture in a specification, so in my example above, the specificity number of N bits for N amino acids was awfully generous to the chance hypothesis. It’s much more than N bits.

For certain limited situations, such as with coins and homochirality, we can state specifications as having a certain number of bits.

For example, the outcome of 500 coins conveys 500 bits of information. “All coins heads” is a 500-bit specification of a possible outcome. “1 coin tails, the rest heads” is a 491-bit specification, etc. The use of “bits” is just a measure of improbability, that’s all. We may affix to it fancy names like “Shannon uncertainty,” “Shannon entropy,” or “Shannon information.” Mathematically, it’s a measure of improbability.

We attach the name Shannon because he was able to use the math of improbability to come up with Shannon’s theorem of communication, which tells us how much bandwidth we can theoretically pump through a wire. In constructing and defending his theorem, he coined the notion of the “bit”. It was an incredible intellectual achievement.

Very rare cases indeed, but thankfully there are designs we can use this limited approach on. There are perhaps other ways to detect design, but that is outside the scope of my present research. I salute efforts to go beyond the methods I outlined in this thread.

Eric @42, I am suggesting a correlation between specificity and the properties of the output. However you are correct. My phrase, “complexity of the specification” is not appropriate in the context, and has caused confusion. Thanks for bringing it up.

What I’m suggesting at #40 is that the text is more likely to be specified when the uncertainty is low. In the two strings you provided, the first one has less uncertainty than the second, and at the same time more specificity (meaning). These appear inversely proportional. Can we always expect that to be the case when English text is contrasted to random sequences? Yes, I think so. And as I said in the first paragraph of #40, I think we are ruling out chance explanations with low uncertainties, but that doesn’t specifically rule out necessity, or causes with algorithmic simplicity.

You are correct. I’m not calculating specification. I’m calculating the uncertainty in the output. And I kept referring to it as specificity, but only because of the inference to its increased likelihood when uncertainty is reduced. This is an artifact of my original #3, where I associated low uncertainty with the presence of specification. In actuality, I think low uncertainty only rules out chance; it doesn’t rule out necessity.

Does that seem more reasonable and consistent? Thanks for giving me the opportunity to try to be more concise. And I think you make good points about the elusiveness of quantifying specification. However, isn’t “specification” in essence a description by which an object or phenomenon can be reproduced, and aren’t we interested in how complex or simple this description is at some point when evaluating design?

why is it that Joe’s name is the first one that popped into my head when I read this? Likely because he is the poster child for IDists and their arguments both here and across the interwebs!

Footnote to #46.

How do we rule out chance? In a sequence of 1000 coin tosses, 100 coming up tails, there are about 536 bits of entropy. This doesn’t necessarily mean specification, but it sure seems to eliminate chance, and that increases the likelihood of either design or necessity.

Yes, no?
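For what it’s worth, the 536-bit figure checks out using the -log2[C(n,k)/2^n] form discussed earlier in the thread (a quick sketch of mine):

```python
import math

n, k = 1000, 100  # 1000 tosses, exactly 100 tails
bits = n - math.log2(math.comb(n, k))
print(bits)  # about 535.6, i.e. roughly 536 bits
```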

Footnote 2 for #46,

If we must import a context for English letter frequencies in order to reduce uncertainty in a string, wouldn’t this increase the likelihood that meaning is present, and decrease the likelihood of either chance or necessity?

Eric @44,

That may be the case, but in order to know a path is the wrong one, sometimes it has to be traversed a little. It’s not my intention to cause any confusion, only to explore something that occurred to me when reading Sal’s OP. There might be some relationship between entropy and design. If it turns out not to be the case, I’ll still be satisfied with my attempt to understand why.

Eric @44,

I can only say that I think it helps to rule out chance. It also helps to rule out necessity. If both of these things can be ruled out, it strengthens the inference to design. Other methods would better help us understand what specificity is, and what content it has. I don’t think there’s mutual exclusivity here.

scordova @45:

We aren’t talking about a design hypothesis vs., say, Darwinism here. We’re talking about whether something is specified. The bacterial flagellum either is or isn’t a specification. Or are you suggesting that the flagellum could be categorized as “kind of” specified, or “almost specified” or some other such scalable notion?

One might be tempted to imagine a flagellum with additional parts that would allow it to, say, rotate faster. One might then think that this means such a flagellum has more specification. But such a flagellum would not be more specified; it would be more complex. Any way you try to describe those additional parts and their interaction with the existing parts and the improbability of those parts arising on their own would inexorably lead you back to the complexity side of the CSI equation.

Regardless, if we take the position that something can be more or less of a specification, then — by definition — in order for the design inference to work we must have a cutoff.* Otherwise we can never be sure that we really have a specification, and the critical response is quite simple: (i) show me your calculation of specification, and (ii) you haven’t demonstrated that this particular item is specified enough.

—–

* Just like we have a cutoff for complexity. With complexity we can readily see that there is a continuum, with things being more or less complex. Thus, we have to put in place a cutoff to avoid false positives, typically understood as the universal probability bound. A continuum of specification would require exactly the same kind of cutoff to properly detect design and avoid false positives.

Hi franklin,

I will take on any evolutionist in a debate, especially one that involves evidence. 😛

Where’s CJYman?

I’d advise you to brush up on a few (well, pretty much all) details first… like maybe set theory for starters. Your ignorance of the subject, as well as your refusal to admit your ignorance and stubborn refusal to correct that deficiency, has produced much laughter and jocularity on the interwebs. Keep it up your IDs best point man!

correction: “Keep it up you’re IDs best point man!

Yeah set theory, now that will help evos win their case. Too bad they don’t have any evidence for their position so they have to focus on irrelevant BS.

Footnote 3 for #46,

Is it possible in principle for a string to have detectable specification when uncertainty is maximal? If not, then a string can only have discernible meaning when its entropy can be reduced. This is not a sufficient condition for specification, but perhaps a necessary one.

Also, I think it needs to be stressed that just because a signal can be detected, it does not follow that the detection also entails the details. Just because there might be a metric which allows us to discern that specificity is more likely to be present in a string, it does not mean that we’ve described the specification or even quantified it. Consider two piles of strings, sorted based on their entropy. One pile has high entropy, the other has a more moderate amount. Which pile will be more likely to contain meaningful phrases?

This seems like something that could be explored empirically.
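A toy version of that sorting experiment, assuming first-order letter entropy as the metric (my sketch; a real test would use many long strings and proper corpora):

```python
from collections import Counter
import math
import random
import string

def letter_entropy(s):
    """Shannon entropy in bits per letter from observed frequencies."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
phrase = "tobeornottobethatisthequestion"
noise = ["".join(random.choices(string.ascii_lowercase, k=30))
         for _ in range(5)]

# Sort candidates low-entropy-first; English text, with its skewed
# letter frequencies, tends toward the low-entropy pile
piles = sorted([phrase] + noise, key=letter_entropy)
print(piles[0])  # the English phrase tends to sort first
```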

And franklin, only a moron would say that I am ignorant of set theory just because I disagree with an arbitrary rule. And here you are.

Grow up loser…

Chance Ratcliff,

The longer the string, the higher the Shannon entropy needed to describe it. Meaning is discernible if:

1. the designer of the string is working to ensure you understand the meaning

2. you had some luck

Here is a non-string object where we humans hope someone will figure out the meaning:

http://en.wikipedia.org/wiki/Voyager_Golden_Record

In the world of formal languages, meaning is never uncovered unless the observer has the capability and tools to discern meaning, and that entails that the observer is provided tons of information to decode meaning in a language. Example: the Java language interpreter. For that matter, any computer language interpreter or compiler.

It isn’t that you have a disagreement with set theory it is that you are clueless about it and don’t have the wherewithal to recognize that simple fact. When folks who do understand set theory try to explain where you are mistaken it flies right over your head. Why don’t you run some of your assertions about set theory here to folks at UD and see what they think.

As it has been pointed out to you, from the Stanford Encyclopedia of Philosophy (remember its importance to nested hierarchies as well):

Set Theory is the mathematical science of the infinite. It studies properties of sets, abstract objects that pervade the whole of modern mathematics. The language of set theory, in its simplicity, is sufficiently universal to formalize all mathematical concepts and thus set theory, along with Predicate Calculus, constitutes the true Foundations of Mathematics. As a mathematical theory, Set Theory possesses a rich internal structure, and its methods serve as a powerful tool for applications in many other fields of Mathematics. Set Theory, with its emphasis on consistency and independence proofs, provides a gauge for measuring the consistency strength of various mathematical statements.

and this does seem quite appropriate for you: “I recently had a similar thought, that the quality of conduct is often proportional to the strength of one’s argument.”

Sal @60, I don’t disagree with anything there. I’m suggesting that any string which contains a discernible message cannot have maximum entropy, like a random sequence would have. So I’m not suggesting uncovering specific meanings, but rather ruling out meaning in cases where the symbols approach total randomness, which I’m attempting to correlate with high uncertainty.

Letter Frequencies

If meaning is present in a string, then some sort of signal will be discernible. Not the specific meaning, but the increased likelihood of the presence of any meaning. Of course, the longer the string, the more entropy. But take for instance a couple of 1000-letter sequences. The first is from the Declaration of Independence, and the second is totally random at 4,755 bits and maximum entropy.

Analysis of the first string would produce letter frequencies approaching those linked to above. This would reduce entropy. Further analysis, like the frequency of letter pairs or letter triplets, would likely reduce entropy even more. The first string would have less entropy than the second; and if it didn’t, it couldn’t contain a discernible meaning.

Do you think this is mistaken?
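Here is a small sketch of that claim, using a short English sample as a stand-in for the 1000-letter passage (the sample text and the 26-letter alphabet are my assumptions):

```python
from collections import Counter
import math

def letter_entropy(s):
    """First-order entropy in bits per letter over a-z only."""
    letters = [c for c in s.lower() if c.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Stand-in for the Declaration of Independence excerpt
english = ("we hold these truths to be self evident "
           "that all men are created equal")

uniform_max = math.log2(26)  # ~4.70 bits/letter for a uniform random sequence
print(letter_entropy(english) < uniform_max)  # True: English carries a signal
```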

for the onlookers here is Joe’s conceptual understanding of set theory….see if you can make any sense out of it.

http://intelligentreasoning.blogspot.com/

A Tale of Two Sets

–

Take two sets of whole numbers- A and B. Set A contains every single number set B has plus one number B does not have.

Now take an arbitrary measuring system and voila, both sets are the same size!

Careful here. A random sequence has close to maximum ALGORITHMIC entropy; that is not the same as SHANNON entropy.

If I have a 10 meg zip file that expands to 500 megs and then can be recompressed down to 5 megs, how much information does the zip file really have? The answer is in the eye of the beholder! You could make a case for either number as being the Shannon entropy, and practically speaking most engineers don’t care, as long as they get paid to make compression and decompression algorithms.

For example, 500,000 coins all heads has 500,000 bits of Shannon entropy, but it may have only a few bits of algorithmic entropy (because it takes only a few bits to represent the string, relative to its actual size).

Zipped-up compressed files and JPEG files are packed with nearly maximum algorithmic entropy. They are far more meaningful than an empty string of 500,000 zeros (which has low ALGORITHMIC entropy). Hence there is the case where the high algorithmic entropy file (which looks like disordered white noise, but is not) has far more meaning than a low algorithmic entropy file (which is all zeros).
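Compression makes the contrast tangible. A sketch with zlib (my example; the exact byte counts depend on the compressor):

```python
import os
import zlib

n = 500_000
ordered = b"0" * n        # like 500,000 heads: low algorithmic entropy
noise = os.urandom(n)     # like white noise: high algorithmic entropy

ordered_size = len(zlib.compress(ordered, 9))
noise_size = len(zlib.compress(noise, 9))

# The ordered string collapses to a tiny fraction of its size;
# the random bytes barely shrink at all
print(ordered_size, noise_size)
```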

I think I’ll just leave it at that. Thanks for your help. Eric too.

Chance @50:

Absolutely there is. It is relevant to the calculation of complexity.

@62:

While it is true that any grammatical sentence will be less random, on average, than a purely random string of letters, that does not necessarily translate to meaning. Back to my example:

tobeornottobethatisthequestion

and

brnottstinoisqotebeeootthuathe

have the exact same amount of Shannon “information.” The same will hold true if I incorporate into my calculation the relative frequency of letters in, for example, the English alphabet. Still get the same result on a letter-by-letter calculation.

Now we could step up a level and calculate whole words, but in that case we would still end up with a situation where:

tobeornottobethatisthequestion

and

bebetotoquestionthenotthatoris

have the exact same Shannon information.

It won’t be until we have actually stepped up to the level of a grammatical sentence (or at least coherent phrases) as our minimal search parameter that we actually start getting away from the purely statistical Shannon calculation into being able to search for actual meaning/function/specification. Yes, we could then search for whole meaningful sentences (or phrases), but that would mean we have really just snuck the meaning/specification in through the back door. At that point we have defined a particular specification and we are just searching to see if we can find it.

Beyond a general observation that meaningful sequences are typically not characterized by pure randomness, I don’t think Shannon calculations are able, by definition, to distinguish between functional, coherent, meaningful sequences and complete gibberish.
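Eric’s point is easy to verify numerically. A sketch (Python; the scrambled string from his example is the same multiset of letters, so any frequency-based measure cannot tell the two apart):

```python
import math
from collections import Counter

def letter_entropy_bits(s: str) -> float:
    """Per-letter empirical Shannon entropy; blind to letter order."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

meaningful = "tobeornottobethatisthequestion"
scrambled = "brnottstinoisqotebeeootthuathe"

# Same letters, same frequencies => identical letter-level entropy,
# even though only one of the strings means anything.
assert sorted(meaningful) == sorted(scrambled)
print(letter_entropy_bits(meaningful))
print(letter_entropy_bits(scrambled))
```

Both calls print the same value, confirming that the Shannon calculation by itself cannot separate meaning from gibberish.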

Eric, thanks for your additional comments. I think the conversation has gone as far as it will. I appreciate your efforts, and Sal’s, to engage my points directly.

You are welcome, Chance. Talking to you was a lot of fun, and I wish more of my thread were like this one.

franklin:

Yes, it is. That is the only issue with infinite sets.

And I have asked, and no one has answered, what practical application this arbitrary alignment has. IOW your Stanford quote-mine is meaningless, as it does not deal with this specific case.

People have answered you in many different ways and with different examples. That you aren’t capable of grasping the concepts and math of set theory is no one’s problem but your own.

Do you really think there are two different values, i.e., ‘infinity’ and ‘infinity + 1’?

And the alignment is anything but arbitrary; it is a direct one-to-one mapping. I know it must be tough for you to grasp these mathematical concepts, but perhaps if you applied yourself, or even asked oleg to explain it to you yet again, you might be able to pick up what most everyone else already understands.

franklin,

Infinity cannot be measured. Also I have answered those people who can’t even address my explanations- just as you cannot.

The alignment is arbitrary for the reasons provided. Now you can ignore my explanations, but your ignorance is not a refutation.

Again, what practical application is there in saying that two sets, that cannot be measured, are the same size?

Joe, you may have answered those people but there is one problem in that your answers are incredibly and obviously wrong to anyone who understands set theory. That you cannot grasp the concepts in set theory is your problem.

Your alignments make no sense at all and underscore your inability to comprehend the subject matter.

Everyone who understands set theory can ignore your incredibly wrong answers, alignments, and assertions. Try educating yourself on set theory before pontificating on it. Your inadequacies in grasping and understanding the subject matter become immediately obvious when you go off half-cocked in your delusional posts where you think that you actually understand set theory.

Go visit the Stanford site if you are really interested in understanding the practical applications of set theory. Your ignorance does nothing to refute anything in set theory but it is kinda funny observing your antics.

You should ask yourself why everyone else understands set theory and you don’t. Is everyone else wrong or does the problem lie with you. The answer is obvious to everyone! If you need help maybe Dembski or KF can give you a hand with the material.

franklin,

Is {1,2,3,4,…} a proper subset of {0,1,2,3,…} because its 1 matches/aligns with the superset’s 0 or with its 1?

And again, I did not ask about the practical applications for set theory. So why are you blathering on as if I did? It’s as if you think your belligerence is really going to hurt me. It doesn’t. I will just keep correcting you as you spew.

Also:

Take two sets of whole numbers- A and B. Set A contains every single number set B has plus one number B does not have. In what way can these two sets be the same size?And what practical application does it have to say they are the same size?

Why do you keep ignoring that?

@Joe:

In that way:

What’s the problem?

I keep ignoring it because it is so wrong it is pathetic and it is also not my job to educate you although many others have already tried to disabuse you of your ignorance on the subject. But don’t let your ignorance stop you continue to carry on which is providing much humor albeit at your expense.

See #74: another individual who understands set theory!

pssst… JWTIL the problem is Joe and his lack of understanding. Sorta like a mega case of the ‘arrogance of ignorance’ that characterizes his online persona.

If you mean by size the cardinality, it does matter.

It helps you solve problems in calculus or at least determine if you can solve a problem using certain methods.

Let f(x) = 1 for all reals on the interval [0,1]

the Riemann integral of f(x) over [0,1] is 1

let g(x) = 1 for all rationals and 0 for all irrationals

there is no Riemann integral for g(x)

even though there are an infinite number of rationals and an infinite number of irrationals between 0 and 1, you can’t do a 1-to-1 mapping between them, so you won’t be able to say what the Riemann integral of g(x) is; in other words, G(x) = ????

The fact the rationals and irrationals don’t have the same cardinality is therefore important.

Many times when doing real world applied math, it is helpful when we can take something with a finite number of points (like say the gas molecules in a box) and come up with an infinite idealized fluid model that isn’t exact but has easier math. It’s important to know when we can make such leaps from “finite exact but computationally impossible calculations” to “infinitesimal approximation but computationally possible calculations”. Such considerations as above then become very important in determining if our approximate methods will give us usable answers.
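Sal’s integrability example can be sketched numerically. The hack below models “rational” sample points with exact `Fraction`s and “irrational” ones with floats, purely for illustration (floats cannot really be irrational); the point is that the Riemann sum of g depends entirely on which sample points you pick, so no single limit exists:

```python
import math
from fractions import Fraction

def riemann_sum(f, n, tag):
    """Riemann sum of f on [0,1] with n equal subintervals; `tag` picks the sample point."""
    dx = Fraction(1, n)
    return sum(f(tag(k, n)) * dx for k in range(n))

f = lambda x: 1  # constant function: Riemann integrable, integral 1

def g(x):
    """Dirichlet-type function: 1 on rationals, 0 on irrationals.
    Modeling hack: exact Fractions stand for rationals, floats for irrationals."""
    return 1 if isinstance(x, Fraction) else 0

rational_tag = lambda k, n: Fraction(k, n)                     # rational sample points
irrational_tag = lambda k, n: k / n + math.sqrt(2) / (10 * n)  # irrational sample points

print(riemann_sum(f, 1000, rational_tag))    # 1, regardless of the tags
print(riemann_sum(g, 1000, rational_tag))    # 1: every sample point is rational
print(riemann_sum(g, 1000, irrational_tag))  # 0: every sample point is irrational
```

For f the sums agree no matter how you sample; for g they never settle down, which is what it means for the Riemann integral not to exist.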

Yes, the bijection and the arbitrary rule.

That’s the problem. However, it is minor because obviously it has no impact on anything and is a matter of debate- measuring infinite sets and the number of power sets. It seems like bijection is a tool for the lazy who don’t want to actually do the work. “Oh, infinite sets- same size- bijection”

If that ever has some practical use I will change my opinion of bijection’s use on infinite sets.

By the way, have at it, guys, here in this thread. It looks like we’ve discussed away the original post. So if you want to conduct your off-topic discussions here, go ahead; better here than in another thread.

Enjoy!

Sal:

But wait- rationals are an infinite set. And irrationals are an infinite set. Bijection says they have the same cardinality- and just so you know that is the point I am arguing. I don’t think all infinite sets are the same size. I don’t believe Cantor did either. But I don’t know of a way to tell.

“Infinite is infinite, dude”

Sal,

My apologies and thank you for your input wrt sets and CSI.

JWTruthInLove,

My issue is, as I stated, with supersets and subsets we use one alignment. And then with infinite sets we use another alignment.

It seems to me we do that just because no one wants to actually think about it because we cannot really comprehend infinity.

And franklin- YOU are pathetic. Your inability to think outside of your sock-puppet is duly noted.

@Joe:

Please provide a citation for that claim!

Again, wiki helps:

It doesn’t matter whether the set is finite or infinite, or what your favorite alignment is.

If you have a problem with the definition, make up your own definition.

Try

Rational Numbers Countable, Home School Math

which helps one understand

Cantor’s Diagonal Argument

Which shows the reals are uncountable.

If the reals are uncountable, and the reals are composed of the rationals (countable) and the irrationals, then it stands to reason the irrationals are uncountable; hence the irrationals have a higher cardinality than the rationals.
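The countability half of that argument can be sketched directly: walk the diagonals p + q = 2, 3, 4, … of the p/q grid, skipping unreduced duplicates, and every positive rational gets a finite index (a minimal Python sketch of the standard diagonal enumeration):

```python
from fractions import Fraction
from math import gcd

def enumerate_rationals(limit):
    """List positive rationals by diagonals p+q = 2, 3, 4, ..., skipping duplicates."""
    out = []
    s = 2
    while len(out) < limit:
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:  # skip unreduced duplicates like 2/4
                out.append(Fraction(p, q))
                if len(out) == limit:
                    break
        s += 1
    return out

first = enumerate_rationals(20)
print(first)  # first entries: 1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, ...
```

Because every positive rational p/q sits on diagonal p + q, each one eventually appears at some finite position, which is exactly what “countable” means; Cantor’s diagonal argument shows no such listing can exist for the reals.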

@Joe:

You have a problem with abstract thinking, which is necessary in math (or computer science).

So there are infinite sets that do not have a one-to-one correspondence, ie not all infinite sets have the same cardinality.

Sal, I thought someone refuted Cantor’s diagonal argument…

And no JW, I don’t have an issue with abstract thinking. Just because we cannot comprehend infinity doesn’t mean I have issues with abstract thinking.

And I understand what wiki says and what the rule is. I don’t have to agree with it. Geez, students get taught evolutionism in school; do they have to agree with it? Or are people allowed to challenge something that may not be as established as thought?

LoL! I was mocking my attackers and now keiths sez I am confused!

But wait- rationals are an infinite set. And irrationals are an infinite set. Bijection says they have the same cardinality

mocking you guys, keiths- you and mr infinity = infinity.

@Joe:

Who is keith?

Why not? It’s just a definition. Make up your own definition and call it “Joe’s cardinality”.

Evolutionism entails more than just definitions.

Sal,

It was his Continuum Hypothesis that people have been debating- my bad.

keiths is the guy who baldly asserted that unguided evolution is by far a better explanation for what we observe than ID.

And using my cardinality the set of integers that includes 0 is greater than the set of integers that doesn’t.

As I said, it has everything the other one has PLUS something else.

Can anyone tell me what logical contradiction that dredges up? My opponents fear the worst yet cannot say what that is.

And evolutionism entails crap in the guise of science.

Looks like I’m getting brain Cantor

That’s an example of a relation of your cardinality. What’s your definition of cardinality (aka “Joe’s cardinality”)?

“It warps the minds of our children and weakens the resolve of our allies”.

As for mapping rationals to irrationals, it is done the ordinal way 😛

Normally, Joe’s cardinality is the actual number of elements. In the case of infinite sets, then, we can only say greater than, less than, or don’t know.

If one infinite set obviously contains the same elements as another infinite set AND has elements the other does not, then it has a greater cardinality.
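For what it’s worth, the rule as stated can at least be pinned down for finite sets. A hypothetical sketch (the function name is invented here; none of this is standard set theory, and the interesting infinite case cannot be represented in a Python set at all):

```python
def joec_compare(a: set, b: set) -> str:
    """Joe's subset-comparison rule, finite case only:
    cancel the shared members, then inspect what remains on each side."""
    only_a, only_b = a - b, b - a
    if only_a and not only_b:
        return "A > B"
    if only_b and not only_a:
        return "A < B"
    if not only_a and not only_b:
        return "A == B"
    return "don't know"  # leftovers on both sides: the rule is silent

A = {0, 1, 2, 3, 4}        # non-negative integers (truncated)
B = {1, 2, 3, 4}           # positive integers (truncated)
print(joec_compare(A, B))  # A has everything B has, plus 0
```

On finite sets this agrees with ordinary cardinality; the dispute in the thread is entirely about whether the rule can be extended to infinite sets.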

Nice finishing quote, btw.

The problem is we are finite human beings trying to construct theorems about things that have an infinite number of members.

Hofstadter called this “finitistic reasoning”. One might argue on principle whether it is legitimate at all to use our finite reasoning skills to make statements about things that are infinite.

But it is our nature to try to extrapolate ideas out to infinity where things are not always so clear.

I don’t think I can add anything more to questions of set theory. Have at it, gentlemen, and thanks to all who offered comments on my thread, both on and off topic.

@Joe:

I like the definition. It may not be useful in algorithmic theory but someone somewhere might actually find a use for it. Let’s call the function which returns “Joe’s cardinality” of a set JOEC.

Sal,

Your input was very welcome and appreciated. On one hand I thought that I knew that all infinite sets were not equal and on the other I had my opponents yelling “infinity is infinity you IDIOT”.

And now I see the value in saying infinite sets are not equal- thanks to you.

JW, Thank you. One never knows when something new and different will become handy. 😉

I am still trying to figure out what logical inconsistencies my definition brings about…

JWTruthInLove (Sal too)- the following is what I have come up with wrt infinite sets and cardinality:

The Number Line Hypothesis

With respect to infinite sets (with a fixed starting point), it has been said that the set of all non-negative integers (set A) is the same size, ie has the same cardinality, as the set of all positive integers (set B).

I have said that set A (the set of all non-negative integers) has a greater cardinality than set B (the set of all positive integers). My argument is that set A consists of and contains all the members of set B AND it has at least one element that set B does not.

That is the set comparison method. Members that are the same cancel each other and the remains are inspected to see if there is any difference that can be discerned with them.

Numbers are not arbitrarily assigned positions along the number line. With set sizes, ie cardinality, the question should be “How many points along the number line does this set occupy?”. If the answer is finite, then you just count. If it is infinite, then you take a look at the finite, because what happens in the finite can be extended into the infinite (that’s what the ellipsis means, ie keep going, following the pattern put in place by the preceding members).

With that in mind, that numbers are points along the number line and the finite sets the course for the infinite, with infinite sets you have to consider each set’s starting point along the line and the interval of its count. Then you check a chunk (line segment) of each set to see how many points each set occupies (for the same chunk). The chunk should be big enough to make sure you have truly captured the pattern of each set being compared.

The set with the most points along the number line segment has the greater cardinality.

For set A = {0.5, 1.5, 2.5, 3.5,…} and set B = {1,2,3,4,…}, set A’s cardinality is greater than or equal to set B’s. It all depends on where along the number line you look.

As opposed to looking down infinity and saying “Gee, it goes on forever so they must be the same”, I look back from infinity, say “Hey, look what came before this point”, and ask whether we can use that to make any determinations about sets.
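As a purely illustrative sketch of the “chunk of the number line” idea (hypothetical code; this is Joe’s proposal, not standard cardinality, and the answer visibly depends on which chunk you pick):

```python
def points_in_segment(start, step, lo, hi):
    """Count members of the arithmetic set {start, start+step, start+2*step, ...}
    that fall inside the segment [lo, hi]."""
    count, x = 0, start
    while x <= hi:
        if x >= lo:
            count += 1
        x += step
    return count

# Joe's sets: A = {0.5, 1.5, 2.5, ...} and B = {1, 2, 3, ...}
for lo, hi in [(0, 0.9), (0, 10)]:
    a = points_in_segment(0.5, 1.0, lo, hi)
    b = points_in_segment(1, 1, lo, hi)
    print(f"[{lo}, {hi}]: A occupies {a} points, B occupies {b}")
```

On [0, 0.9] set A occupies one point and B none; on [0, 10] each occupies ten. That is the “greater than or equal, depending on where you look” behavior described above, and also why the segment-counting rule does not pin down a single answer.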

Applying finite minds to the problems of the infinite is bound to present difficulties. One could make the argument that it’s inappropriate to even try, but human nature is such we’ll try any way.

Here is a powerful example of what happens when we extrapolate out to infinity. You can find yourself concluding that:

1 = 0

See:

Grandi Series
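The trouble the link points to is easy to state: grouping the Grandi series two different ways gives two different “sums” (a sketch in LaTeX; neither grouping is actually legitimate, since the series does not converge):

```latex
S = 1 - 1 + 1 - 1 + 1 - 1 + \cdots
S = (1 - 1) + (1 - 1) + (1 - 1) + \cdots = 0 + 0 + 0 + \cdots = 0
S = 1 - (1 - 1) - (1 - 1) - \cdots = 1 - 0 - 0 - \cdots = 1
\Rightarrow \text{naively, } 0 = 1
```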

Consider the following two sets:

SET 1: integers greater than 0 (a member of this set is designated Y)

SET 2: integers greater than 10 (a member of this set is designated X)

Superficially it would seem SET 1 has 10 more members than SET 2. But then again, what happens when we’re dealing with infinity gets strange.

A math professor would say, “prove that the two have the same cardinality.”

If he did so on a homework assignment I’d say something like:

Is this proof valid? Good enough to make the grade. I’d probably get docked points for the less terse form of the proof.

Questions of its ultimate validity I leave to others, but that is the accepted answer and it seems to work. But what did I say about applying finitistic reasoning to questions of infinity? The claims appear to be not so clear, or at least counterintuitive. The fact that the cardinality of all the reals from 0 to 1 is equal to the cardinality of all the reals from 0 to 2 seems really astonishing. The proof would be:

SET 1: all reals from 0 to 1 (symbolized by Y)

SET 2: all reals from 0 to 2 (symbolized by X)
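The proofs themselves are elided above, but in both cases they are the standard textbook bijections, sketched here for the onlookers using the symbols Y and X already defined:

```latex
\text{Integers: } X = Y + 10 \text{ maps } \{1, 2, 3, \dots\} \text{ onto } \{11, 12, 13, \dots\}
\text{Reals: } X = 2Y \text{ maps } [0, 1] \text{ onto } [0, 2]
```

Each map is one-to-one and onto, so by the standard definition each pair of sets has the same cardinality.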

@Joe: It’s confusing if you use the same term “cardinality” for the actual cardinality and JOEC.

A = {x : x in N}

B = {f(x) : x in N}, f(x) = x + 1

Am I correct to assume that JOEC(A) > JOEC(B) is true?

No one is saying that, except for you.

This one-to-one mapping seems to make numbers arbitrary. Also, the strange thing about infinity is that it makes a small percentage in the finite world really, really close to 0. So a difference of ten numbers in an infinite world would be almost as close to 0% as one can get.

Also “my” cardinality deals with the number of elements in a set. What does actual cardinality stand for?

For argument’s sake only:

Let’s say that my methodology is correct and the set of all non-negative integers has a greater cardinality than the set of all positive integers.

How would that affect anything? (other than meaning Sal’s above proofs are wrong)

And thank you Sal and JWTruthInLove. It may go a little slow here because my last calculus class was in 1992- so be gentle….

I’m wondering how we could form a bijection between sets A and B if A = {all positive integers} and B = {all nonnegative integers} when we define a mapping between sets as F:A→B, such that F(a) = a. (Set B would include zero, where set A would not). It seems that such a condition could never be satisfied between these sets. Is there a practical way to resolve this disparity? It seems like a logical contradiction to me.

Considering any discrete case of A and B containing numbers less than N, we would always get a set containing zero when taking the complement of the intersection between sets A and B: (A ∩ B)’ = {0}, indicating that the cardinality of A and B are different: |A| ≠ |B|. No bijection would exist for a mapping F:A→B where F(a) = a.

Why should this not be so for the infinite case?

as to:

Yet:

Moreover, Gödel derived incompleteness, at least in part, by studying the infinite. As to ‘presenting difficulties’, the not too subtle hint of the following video is that ‘studying the infinite’ was ‘dangerous knowledge’

Footnote to #104,

We could define F:A→B as F(a) = a-1, and this would appear to allow for a bijection in the infinite case, but this seems little better than craftiness.

Yet if A = {all positive integers} and B = {all nonnegative integers}, then isn’t A ⊂ B true even for infinite sets? If so, then they can’t have the same cardinality, at least as the definition would apply to discrete cases.

I’m not inclined to think of infinity as a quantity, if only because for two functions, f(x) = 10^x and g(x) = log(x), both f and g approach infinity in the limit as x approaches infinity. Graphing these two functions makes it clear how absurd this is if infinity is treated like a quantity!
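Chance’s F(a) = a - 1 “craftiness” can at least be sanity-checked on finite slices (a sketch; finite checks of course prove nothing about the infinite case, which is exactly the difficulty being discussed):

```python
# F(a) = a - 1 pairs the positive integers with the non-negative integers.
# Check one-to-one and onto behavior on a large finite slice.
N = 100_000
A = range(1, N + 1)   # slice of {all positive integers}
B = set(range(0, N))  # matching slice of {all nonnegative integers}

F = lambda a: a - 1
image = {F(a) for a in A}

print(len(image) == len(A))  # one-to-one: no two inputs collide
print(image == B)            # onto: every element of the B-slice is hit
```

Every element of the A-slice lands on a distinct element of the B-slice with nothing left over, even though A is a proper subset of B when both are viewed inside the integers; that coexistence is what the bijection definition of “same cardinality” accepts and Chance’s intuition resists.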

The mappings are arbitrary. The construction of what you want to put in a set is arbitrary, so in that sense, a set contains what you arbitrarily choose to put in it.

At issue is: if you make certain arbitrary constructs, what will their properties be?

Perhaps disturbing is that, with set theory, it becomes apparent the real numbers are not the only arbitrary conceptual entities one can concoct.

Can other mathematical “number” systems be concocted? Yes, and surprisingly, they have utility. Like:

http://booster911.hubpages.com.....Arithmetic

where 1 + 1 + 1 = 1

So yeah, members of sets can be arbitrary constructions with arbitrary properties. We can construct math systems that behave in the familiar way, and math systems that don’t.

The modulo-2 polynomials, strange as they are, are vital in Information Technology. Whether we choose to call some members of a set “numbers” is maybe a matter of convenience; what they really are is based on the rules and properties we project on them via unprovable axioms, such as this set of unprovable axioms for real numbers:

http://www-history.mcs.st-and......es/L5.html

These axioms were a codification of the way we sort of expect numbers to behave based on how we experience reality.

Some daring mathematicians said, “What if we assume different properties? What happens?” And you get other mathematical systems. Set theory helps to deduce those properties rigorously.
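The vital IT role of mod-2 polynomials mentioned above is easiest to see in CRC error checks, where bit strings are treated as polynomials and “subtraction” is XOR (so 1 + 1 = 0, and hence 1 + 1 + 1 = 1). A minimal sketch (the 4-bit generator x^3 + x + 1 is chosen purely for illustration):

```python
def crc_remainder(data_bits: str, poly_bits: str) -> str:
    """Long division of bit-polynomials where addition/subtraction is XOR (no carries)."""
    data = list(data_bits + "0" * (len(poly_bits) - 1))  # append room for the remainder
    for i in range(len(data_bits)):
        if data[i] == "1":
            for j, p in enumerate(poly_bits):
                data[i + j] = str(int(data[i + j]) ^ int(p))  # subtract = XOR in GF(2)
    return "".join(data[-(len(poly_bits) - 1):])

msg = "11010011101100"
rem = crc_remainder(msg, "1011")  # generator polynomial x^3 + x + 1
print(rem)                        # '100' for this message/polynomial pair
```

A sender appends the remainder to the message, making the whole transmission divisible by the generator; a receiver re-divides and treats any nonzero remainder as evidence of corruption.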

Are those strange math systems useful? Sometimes. One strange, previously taboo, math system was non-Euclidean geometry, which became the basis of much of modern physics.

One might ask, is there an inherently true math? A lot of mathematicians might respond with, “What does it matter as long as it’s beautiful?”

Why? You’re extrapolating finitistic reasoning to situations involving infinity. It looks like the quality of being infinite makes lots of things possible that would be otherwise impossible for mere mortal finitistic systems.

Sal, I wouldn’t equate eternality with infinity. 😉 But I’d say that infinity is only abstractly useful, such as when dealing with limits. I doubt such a quantity could be concretely real, at least in this universe.

In other words, Black Magic, Fenomenal Black Magic (FBM).

Chance,

The bijection is formed ordinally, set A’s first element with set B’s first element, regardless of what the actual number is. And yes it appears to be nothing but craftiness.

There are many mathematicians who find it offensive and denigrating that their idealized world could have any counterpart in reality.

There was a mathematician by the name of Ito who created Ito’s calculus. He was later mortified to find that people found applications of his calculus in finance.

Sal,

Again I thank you. My demon doesn’t like the word “concoct” used in relation to the word “math”.

On New Year’s Eve 2006 I posted:

So yes, I understand the arbitrary nature of set theory.

Thanks again, much to think about…

Joe @111, ahh, I see. Thanks. Yes, crafty. 😉

Sal @112, point taken. (That was humorous the way you put it.) I try not to be dogmatic about it, but infinity looks to me like something, which by definition, can never, ever be traversed.

So much for uniformitarianism…

To Chance, JWTruthInLove, and Sal-

The roller coaster hypothesis:

When dealing with certain infinite sets- make all elements = e, and then it’s {e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,e,…} all the way down, for all similarly classified sets. Just like going over the top and down a steep, never-ending roller coaster drop, in which you reach terminal velocity and just keep going, and going and going.

How to compare infinite sets of natural numbers, so that proper subsets are also strictly smaller than their supersets:

HT Winston Ewert