Uncommon Descent Serving The Intelligent Design Community

“Conservation of Information” — on the choice of expression


Conservation of information as developed in several articles (see the publications page at www.evoinfo.org) by Robert Marks and me has come in for criticism not only conceptually but also terminologically. None of the conceptual criticisms has in our view succeeded. To be sure, more such criticisms are likely to be forthcoming. But as this work increasingly gets into the peer-reviewed literature, it will be harder and harder to dismiss.

That leaves the terminological criticism. Some have objected that a conservation law requires that the quantity in question remain unchanged. Take conservation of energy, which states that in an isolated system energy may change forms but total energy remains constant. Some have argued that what we are calling conservation of information is more like entropy. But that's not the case either. Entropy, as characterized by the second law of thermodynamics, measures the diffusion of usable energy and is therefore guaranteed (with overwhelming probability) to increase. Hence entropy, unless usable energy is already in a maximally diffuse state, will change and cannot rightly be regarded as falling under a conservation principle.

Conservation of information, by contrast, occupies a middle ground between conservation of energy and entropy. Conservation of information says that the information that must be put into a search for it to successfully locate a target cannot fall below the information that the search outputs in successfully locating the target. Robert Marks and I show that this characterization of conservation of information is non-tautological. But as stated, it suggests that as we move logically upstream and try to account for successful search, the information cost of success cannot fall below a certain lower bound.

Strictly speaking, what is conserved then is not the actual inputs of information to make a search successful but the minimum information cost required for success. Inefficiencies in information usage may lead to more information being inputted into a search than is outputted. Conservation of information thus characterizes information costs when such inefficiencies are avoided. Thus it seems to Robert Marks and me that the expression “conservation of information” is in fact appropriate.

Comments
Atom:
That would mean that it is possible that P(Lb) = P(Lh), but the steps to get there would still need to be filled in.
One way to fill them in would be to break down P(Lh) as follows. P(Lh) is: the probability of selecting search 1 from the high-level space, times the probability of search 1 finding the low-level target, plus the probability of selecting search 2 from the high-level space, times the probability of search 2 finding the low-level target, plus the probability of selecting search 3 from the high-level space, times the probability of search 3 finding the low-level target, etc.

Using my notation from here, the probability of selecting any given search from the high-level space is 1/|O2|, so we can factor 1/|O2| out of the sum, and the sum is sum(O2)/|O2|. And according to your insight, that is P(Lb).

Alternatively, we could argue from symmetry. In all of Marks and Dembski's examples, the high-level space doesn't favor any low-level points over any other low-level points, so every low-level point is equally likely to be selected. This means that the probability of low-level success is equivalent to that of a low-level blind search.
R0b
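This decomposition can be checked exactly with a few lines of code. The success probabilities in O2 below are invented purely for illustration:

```python
# O2 is the higher-level space of searches; entry i is the (hypothetical)
# probability that search i finds the low-level target.
O2 = [0.30, 0.20, 0.10, 0.00, 0.40, 0.00]

# P(Lh) by the term-by-term decomposition: the probability of selecting
# each search (1/|O2|) times that search's probability of success, summed.
p_Lh = sum((1 / len(O2)) * qi for qi in O2)

# Factoring 1/|O2| out of the sum leaves sum(O2)/|O2|:
assert abs(p_Lh - sum(O2) / len(O2)) < 1e-12
print(p_Lh)  # approximately 1/6 for these made-up numbers
```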
June 9, 2009 at 09:33 AM PDT
Atom:
If it is correct, however, then I wouldn’t say that P(Lb) = P(Lh), since the point of the assisted search is to raise the probability of success by some factor. I would think P(Lh) > P(Lb).
Yes, your interpretation is exactly correct. But keep in mind that the "assisted" search is randomly pulled from a higher-level space of searches, and could either increase or decrease the likelihood of success at the lower level. If the higher-level target is a set of "good" searches, and if the higher-level search is successful, then your inequality P(Lh) > P(Lb) is certainly correct. But I'm not including those assumptions in Lh.

When I say P(Lb) = P(Lh), I mean that all of the following methods have the same probability of finding the low-level target:
1) Blindly select a point in the low-level space.
2) Blindly select a search from the higher-level space and use it to select a point in the lower-level space.
3) Blindly select a search from the 3rd-level space, use it to select a search from the 2nd-level space, and use that search to select a point in the lowest-level space.
etc.

This assumption seems to hold in all of Marks and Dembski's latest examples. (Although it might not hold for examples IV.1, IV.2, and IV.3 in an older paper. I'm assuming that there are implied levels beyond those that are shown in these examples, and that these implied higher levels even things out.)

I know my terminology and prose stink, but hopefully you can dig through the obscurity to find some logic inside. After all, the key insight came from you.
R0b
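The symmetry claim here lends itself to a quick Monte Carlo sketch. The setup below is entirely invented: the higher-level space holds N "block" searches, each sampling uniformly from a window of K points, arranged so that no low-level point is favored overall:

```python
import random

random.seed(0)
N = 20            # low-level space size (invented for illustration)
target = 7        # the low-level target point
K = 5             # each search samples uniformly from a block of K points
trials = 100_000

def run_search(i):
    # Search i picks a point uniformly from the block {i, ..., i+K-1} mod N.
    return (i + random.randrange(K)) % N

# Method 1: blind low-level search.
hits1 = sum(random.randrange(N) == target for _ in range(trials))

# Method 2: blindly choose one of the N block-searches, then run it.
# Every point lies in exactly K of the N blocks, so no point is favored.
hits2 = sum(run_search(random.randrange(N)) == target for _ in range(trials))

print(hits1 / trials, hits2 / trials)  # both close to 1/20 = 0.05
```

An individual block-search can be much better or worse than blind search at hitting the target, yet picking the search blindly washes that advantage out, which is the point being made.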
June 8, 2009 at 03:15 PM PDT
Addendum: I guess I missed this part
Lh: Low-level target found by a search that was found by a high-level blind search.
That would mean that it is possible that P(Lb) = P(Lh), but the steps to get there would still need to be filled in.
Atom
June 8, 2009 at 03:13 PM PDT
Hey R0b, Thanks for the one line version. Not to be a gadfly, but I followed your new proof until this line:
Your observation is that P(Lb)=P(Lh), so:
Even though this is meant to represent my condition, I am not following how it is equivalent, given your notation. Perhaps I'm misreading it. I take P(Lb) to be "Probability of finding low-level target T using blind search" and P(Lh) to be "Probability of finding low-level target T using assisted search." If this is incorrect, please let me know.

If it is correct, however, then I wouldn't say that P(Lb) = P(Lh), since the point of the assisted search is to raise the probability of success by some factor. I would think P(Lh) > P(Lb). I'm sure I'm missing something, so any clarification and patience would be appreciated. Thanks,
Atom
June 8, 2009 at 01:49 PM PDT
Thanks Atom. In my email I mentioned that the logic behind the LCI could be stated in a single line. Here's my attempt to do so. I'm posting it here not in hopes of a response, but so that I can forget about it and refer back to this comment if I need to.

First, your observation regarding Marks and Dembski's implied condition in defining higher-level search spaces leads to a fact that greatly simplifies things. Namely, the probability of success at a given level is independent of how many higher levels there are. That is, when we talk about the probability of success, we don't need to specify whether the probability is based on a blind search, or on a search that was found by a blind search, or on a search that was found by a search that was found by a blind search, etc., because the probability is the same regardless.

The LCI says that finding a good search which in turn finds the low-level target is no easier than simply finding the low-level target. By noting that the probability of finding the low-level target is the same in both cases, the above sentence becomes self-evident, since the former case has the added condition that the high-level target is also found. So the LCI is simply saying that finding targets A and B is no easier than finding target A.

We can show formally how the above translates into the LCI. Event definitions:
Lb: Low-level target found by blind search.
Lh: Low-level target found by a search that was found by a high-level blind search.
Hb: High-level target found by a blind search.

Start with the statement above. The following is true regardless of how Lh and Hb are defined:
P(Lh & Hb) <= P(Lh)
Restating:
P(Lh|Hb)*P(Hb) <= P(Lh)
Your observation is that P(Lb) = P(Lh), so:
P(Lh|Hb)*P(Hb) <= P(Lb)
And that is the LCI.

We can also rearrange it:
P(Hb) <= P(Lb)/P(Lh|Hb)
And take the negative log to put it in information notation:
I(Hb) >= I(Lb) - I(Lh|Hb)
And that's the more familiar form of the LCI: the information cost of finding a search is at least as much as the active info (endogenous minus exogenous info) of that search.
R0b
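This derivation can be checked numerically on a toy setup. Every number below is invented for illustration: a low-level space of 10 points with one target, and ten candidate searches whose average success rate is chosen so that P(Lb) = P(Lh) holds by construction:

```python
import math

# A low-level space of 10 points with one target, so blind low-level
# search succeeds with probability 0.1.
p_Lb = 0.1

# Ten hypothetical searches in the higher-level space; q[i] is the
# chance that search i finds the low-level target.  The list is chosen
# so the average equals p_Lb, i.e. P(Lb) = P(Lh).
q = [0.5, 0.3, 0.1, 0.05, 0.05, 0.0, 0.0, 0.0, 0.0, 0.0]
p_Lh = sum(q) / len(q)
assert abs(p_Lh - p_Lb) < 1e-12

# High-level target: the "good" searches, here those with q[i] >= 0.3.
good = [qi for qi in q if qi >= 0.3]
p_Hb = len(good) / len(q)               # 0.2: a blind high-level pick is good
p_Lh_given_Hb = sum(good) / len(good)   # 0.4: success rate of a good search

# The LCI chain: P(Lh & Hb) = P(Lh|Hb) * P(Hb) <= P(Lh) = P(Lb)
assert p_Lh_given_Hb * p_Hb <= p_Lb

# Information form: I(Hb) >= I(Lb) - I(Lh|Hb)
info = lambda p: -math.log2(p)
assert info(p_Hb) >= info(p_Lb) - info(p_Lh_given_Hb)
```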
June 8, 2009 at 10:44 AM PDT
R0b, Please contact me off-list through my website contact form (atomthaimortal.com), if you're able. Someone wants to give you credit for your work.
Atom
June 4, 2009 at 10:28 AM PDT
Conservation of information says that the information that must be inputted into a search for it to successfully locate a target cannot fall below the information that a search outputs in successfully locating a target. Strictly speaking, what is conserved then is not the actual inputs of information to make a search successful but the minimum information cost required for success.
Thank you for the succinct statements, Dr. Dembski. Also, I find the conversation on this thread rather amusing. Please continue. Generally speaking, the guy who does the work gets to choose the name. Hence, we have californium and einsteinium. Dembski would be well within his rights to name it after his fictional pet goat, Biff.
tragic mishap
June 4, 2009 at 07:04 AM PDT
[34]: "However, a given point on the earth's surface has only one gravitational potential. A given outcome, e.g. a protein, can have different levels of "information" simultaneously depending on the target under consideration."

But potentials aren't 'conserved', forces are. So an electron at a particular elevation on earth would have both an electrical potential and simultaneously a gravitational potential. Same particle, two potentials; and both forces are conserved.
PaV
June 4, 2009 at 01:34 AM PDT
Sorry, the previous comment was misrendered because of less-thans and greater-thans. Here's the second paragraph:

To see that Marks and Dembski's conservation principle follows from this mathematical fact, we can rearrange the equation to get:
P(K) <= P(S)/P(S|K)
and take the negative log to render in information terms:
I(K) >= I(S) - I(S|K)
In Marks and Dembski's terminology, this says that the information cost of the problem-specific knowledge is at least as great as the active info (that is, the endogenous info minus the exogenous info).
R0b
June 3, 2009 at 09:42 AM PDT
Jehu, I'm afraid I don't see the contradiction. The former statement seems rather obvious to me, and the latter is a mathematical fact: P(K)*P(S|K) <= P(S). Which statement do you think is false?

To see that Marks and Dembski's conservation principle follows from this mathematical fact, we can rearrange the equation to get:
P(K) <= P(S)/P(S|K)
and take the negative log to render in information terms:
I(K) >= I(S) - I(S|K)
In Marks and Dembski's terminology, this says that the information cost of the problem-specific knowledge is at least as great as the active info (that is, the endogenous info minus the exogenous info).
R0b
June 3, 2009 at 09:39 AM PDT
R0b, Can you reconcile these two statements that you made?
On the contrary, unless we’re provided with such information, we can’t find targets any faster than random sampling.
And
That is, the probability of having problem-specific knowledge AND succeeding with that knowledge is no greater than succeeding without that knowledge.
I am sorry, but you seem to be contradicting yourself.
Jehu
June 3, 2009 at 09:12 AM PDT
Dr. Dembski:
Moreover, I’m encouraged that the engineering community is open to my ideas and willing to publish them.
It bears noting that the two peer-reviewed papers make no controversial (i.e. ID) claims. The only way I can think of to connect those papers to ID is via the notion that intelligence creates information, while nature does not. Unfortunately, I see no logical or empirical support for that notion. As Atom's example in 23 shows, we humans use problem-specific information, but on what basis is it claimed that we create it? On the contrary, unless we're provided with such information, we can't find targets any faster than random sampling.
R0b
June 3, 2009 at 06:29 AM PDT
I see the anti-IDists are still complaining about definitions. Yet their position doesn't have anything that is rigorously defined. Ya know, all you guys have to do to refute Dembski and Marks is to demonstrate that their idea of information is reducible to matter, energy, chance and necessity.
Joseph
June 3, 2009 at 05:32 AM PDT
On the ISCID boards a long time ago, I tossed out the idea that the vagaries of information in biology be viewed as work and not as a state variable. From the essay (sorry, I can't seem to embed the link - go to ISCID and search for all essays by member 179, it's the essay from March 10, 2002):
"B. Which brings me to a second point. Usually, (in my reading, at least), information content is reflective of the informational entropy of a system. Entropy, in turn, is usually taken as a state variable – the informational entropy of, say, a protein is independent of the pathway by which the protein originated. The preceding indicates that complexity does not share this property. It follows (at least to me) that the property “complex specified information” (CSI) is not a state variable, and thus should not be rigorously equated with information per se. I would suggest that a better analogy to be used here is that of thermodynamic work. Work is a property that is pathway-dependent – the amount of work obtained in going from state A to state B is determined as much by pathway as the inherent thermodynamic properties of the initial and final states (although the poises of the state variables do affect the work that can be done). It seems (naively, to be sure) that CSI would be better defined in terms of some sort of informational “work”, rather than inherent information content. (This would take into account the pathway dependence of the assignment of complexity, as indicated in the preceding.)"
Arthur Hunt
June 3, 2009 at 05:31 AM PDT
I think they have defined it a little, but I think we can do better if more time is spent on it.
Frost122585
June 3, 2009 at 02:46 AM PDT
"I would however like the concept of specificity to be a little more thoroughly developed though." Seconded - "information" is defined in terms of the probability of meeting a specified target. Therefore, without an objective definition of "specified" there is no objective definition of information. Is this still the most authoritive attempt to define specification?Mark Frank
June 3, 2009 at 02:42 AM PDT
I would like for Bill to explain shortly how this new work on COI fits in with the NFL (no free lunch) theorems. I think the conceptual idea that specified complexity cannot be purchased without intelligence is the correct thesis for this kind of statistical and mathematical side of ID. When I first read about NFL I was really taken aback at how brilliant a conceptual criticism it really was. I would however like the concept of specificity to be a little more thoroughly developed though.
Frost122585
June 3, 2009 at 12:32 AM PDT
Re #12: "Here's another way of looking at conservation laws. Gravity is considered a conserved force, that is, it does not change with time. Yet, the gravitational potential at any point on the surface of the earth varies because of the differences in altitude from one area of the world to another."

However, a given point on the earth's surface has only one gravitational potential. A given outcome, e.g. a protein, can have different levels of "information" simultaneously depending on the target under consideration.
Mark Frank
June 2, 2009 at 10:36 PM PDT
R0b
That is, the probability of having problem-specific knowledge AND succeeding with that knowledge is no greater than succeeding without that knowledge.
What? Is that a typo?
Jehu
June 2, 2009 at 10:20 PM PDT
I agree with serendipity that the LCI is no more or less a conservation law than the 2LoT, but I have no problem with either of them being labeled a conservation law.

The term information, on the other hand, does seem to add confusion to Marks and Dembski's account. They refer to active information as a measure of content (as well as the content itself), but refer to endogenous information as a measure of difficulty. We would expect endogenous and exogenous information to describe disjoint content, but what content, if any, do they refer to?

If we look at it from a classical information standpoint, the content is the outcome of the event whose probability is being measured. Since endogenous and exogenous information measure the probability of the search succeeding, their content is boolean: a simple "yes" as opposed to "no". The confusing part is that this is not the information we seek when we search. It's like when my wife asks me if I know where her keys are, and I say "yes", pretending that she's only interested in whether I know, and not in the location of the keys.

The concepts seem much more straightforward when described in terms of probability. Consider the following fact:
P(K)*P(S|K) <= P(S)
That's Marks and Dembski's conservation principle. That is, the probability of having problem-specific knowledge AND succeeding with that knowledge is no greater than succeeding without that knowledge. Why couch this in terms of "information" when it's perfectly clear in probabilistic terms?
R0b
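The fact quoted here is easy to verify on a concrete joint distribution. The four probabilities below are made up for illustration and just need to sum to 1:

```python
# Made-up joint distribution over K (having problem-specific knowledge)
# and S (search success).
p_K_and_S    = 0.15  # have the knowledge and succeed
p_K_and_notS = 0.05  # have the knowledge but fail anyway
p_notK_and_S = 0.02  # succeed without the knowledge
# (remaining 0.78: neither knowledge nor success)

p_K = p_K_and_S + p_K_and_notS   # 0.20
p_S = p_K_and_S + p_notK_and_S   # 0.17
p_S_given_K = p_K_and_S / p_K    # 0.75

# P(K)*P(S|K) is just P(K and S), and a conjunction can never be more
# probable than either of its conjuncts:
assert p_K * p_S_given_K <= p_S
print(p_K * p_S_given_K, p_S)  # roughly 0.15 vs 0.17
```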
June 2, 2009 at 09:11 PM PDT
Dr. Dembski, I have to disagree with your conclusion. LCI is not a marketing term. If you say cost is conserved, then call it LCC.
Nakashima
June 2, 2009 at 08:33 PM PDT
Atom, I think you're being oversensitive. I chose my Moonie example precisely because I knew it would seem ridiculous to almost every reader here (unless Jonathan Wells happens to be lurking), both theists and atheists. That makes it the perfect demonstration of the fact that precedence by itself does not constitute justification.

As for ISCID, I'm not speaking of its popularity. I'm speaking of the fact that it does not represent the scientific zeitgeist, and I stand by that characterization. Do you disagree?
serendipity
June 2, 2009 at 07:27 PM PDT
PS And yes, I consider it insulting when you try to get cute with me in a way you'd never do in my presence. Have respect and keep it civil.
Atom
June 2, 2009 at 06:56 PM PDT
seren, Aside from your unnecessary jab at ISCID, you've made your point. I wasn't aware of a stronger existing precedent in computation theory. If that's the case, your different usage would make sense. But if you'd like to have conversations with me in the future, you'll steer clear of the "I'm-so-clever" little references to ISCID's popularity and Moonies. I have limited time and prefer not to spend it on people who would think to insult me behind a keyboard.
Atom
June 2, 2009 at 06:54 PM PDT
Atom writes:
If the precedent is there, even with two, it is still a precedent.
If a precedent set by two people counts as justification, then millions of ridiculous ideas are justified by precedent. For example, hundreds of thousands of Moonies think that Reverend Moon is the second coming of Christ. Do you accept that precedent?
Furthermore, I remember seeing many discussions on ISCID in its heyday about a “4th Law of Thermodynamics” (with relation to information) where many similar ideas were discussed and if I remember correctly, the phrase Conservation of Information was also used in association with those concepts.
The ISCID forums aren't exactly the first place I'd go if I were trying to gauge the scientific zeitgeist. I have seen the phrase "conservation of information" used in reference to reversible computation. However, this usage is legitimate because information is actually conserved in those processes: the system contains the same amount of information after a reversible computation as it does before. In irreversible computations, information is destroyed and lost forever, which is another reason why a "Law of Conservation of Information" is inappropriate. Information, as we know it, is not conserved. Even "active information" is not conserved, as Dembski admits.
...if you don’t like the phrase, please come up with a better one and share it with others.
Since Dembski and Marks are proposing the law, it's up to them to provide an accurate name for it. I am here to dispute the conclusion of Dembski's opening post:
Thus it seems to Robert Marks and me that the expression “conservation of information” is in fact appropriate.
Because the LCI is neither about conservation nor about information, the name "Law of Conservation of Information" is inappropriate.
serendipity
June 2, 2009 at 05:42 PM PDT
serendipity, I don't think Dembski is basing his usage on just the two examples he mentioned, though I don't think they should be dismissed either. If the precedent is there, even with two, it is still a precedent. Furthermore, I remember seeing many discussions on ISCID in its heyday about a "4th Law of Thermodynamics" (with relation to information) where many similar ideas were discussed, and if I remember correctly, the phrase Conservation of Information was also used in association with those concepts. (I could be mistaken, but the phrase already sounded familiar to me when Dembski and Marks used it.)

I'm sure Dembski can point to even more examples, but again, this isn't something I want to waste time arguing over; if you don't like the phrase, please come up with a better one and share it with others. If it is better than Dembski's, I'm sure it will catch on.
Atom
June 2, 2009 at 04:46 PM PDT
Atom writes:
But why argue about notation? You’re free to come up with your own notation scheme. I won’t stop you.
I'm not arguing about notation. I'm pointing out that the name "Law of Conservation of Information" is highly misleading. The name of the "law" is the topic of this thread, after all.
serendipity
June 2, 2009 at 04:44 PM PDT
Atom asks:
This is a simple definition. Perhaps you can explain what you find unclear about it?
I haven't said that the definition of "active information" is unclear. I've stated that the phrase "Law of Conservation of Information" is highly misleading.
You disapprove of this and say it is similar to entropy.
I've shown that it's parallel to entropy, and that the difference Dr. Dembski cites is not a difference at all.
Dembski clearly stated his reasons for using conservation, citing past precedence among other things, so I won’t fault him for that.
He cited the precedence of two people who may or may not use the word "conservation" in the loose way that he does (I don't have access to Medawar and Schaffer, so I can't say). Compare that to the consensus among scientists for using "conservation" to refer to situations where a quantity neither increases nor decreases.

Suppose we add the LCI to the pantheon of recognized conservation laws. Look what happens:
Q. Is the law of conservation of mass/energy about mass/energy?
A. Yes.
Q. Is mass/energy conserved?
A. Yes.
Q. Is the law of conservation of charge about charge?
A. Yes.
Q. Is charge conserved?
A. Yes.
Q. Is the law of conservation of angular momentum about angular momentum?
A. Yes.
Q. Is angular momentum conserved?
A. Yes.
...
Q. Is the law of conservation of information about information?
A. Well, no. It only applies to active information.
Q. Oh. Well, is "active information" conserved?
A. Well, no. It can decrease.
Q. What was the name of that law again?
serendipity
June 2, 2009 at 04:26 PM PDT
serendipity, Dembski and Marks begin with p, which is the probability of finding a target in a search space using a null, blind search. They take the log base 2 of this probability to define the endogenous information, or in simpler terms, the inherent "difficulty" of the search problem. They then consider an assisted search, which finds the target with probability q, where q > p. They take the log base 2 of this, and define this as the exogenous information. They then define the active information as the difference between the endogenous and exogenous information, or log(q/p). It is useful in measuring how much information the assisted search adds towards finding the target. The assisted search will find the target in fewer queries than blind search, which is why it is fitting that its information measure relative to the problem is greater than that of the null search.

To use an everyday example, the probability of my finding my keys in my apartment by brute force is p. The probability of finding them using information about their location (my wife telling me where they are) is q. Furthermore, q >> p, and I will have to search fewer places once I have the information associated with q. Therefore, Dembski and Marks' conventions work well when applied to real problems. We want to remember that q imparts more problem-specific information than p, and their notation scheme reflects this.

But why argue about notation? You're free to come up with your own notation scheme. I won't stop you.
Atom
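This walkthrough maps directly onto a few lines of arithmetic. The numbers below are invented: a hypothetical 1-in-1000 blind search improved to 1-in-10 by assistance:

```python
import math

p = 1 / 1000  # hypothetical: blind search finds the target 1 time in 1000
q = 1 / 10    # hypothetical: assisted search finds it 1 time in 10

endogenous = -math.log2(p)       # inherent difficulty of the problem
exogenous = -math.log2(q)        # difficulty remaining under assistance
active = endogenous - exogenous  # information the assistance contributes

# active equals log2(q/p), the improvement the assisted search buys:
assert abs(active - math.log2(q / p)) < 1e-9
print(round(endogenous, 2), round(exogenous, 2), round(active, 2))
# roughly 9.97, 3.32, and 6.64 bits
```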
June 2, 2009 at 04:04 PM PDT
To see just how idiosyncratic "active information" is, consider a blind search and an augmented search, both of which successfully locate a given target. According to classic measures of information such as Shannon's and Kolmogorov's, both searches yield the same information (they found the same target, after all). To Dembski and Marks, the augmented search yields more information than the blind search, despite the fact that they both found the same target.
serendipity
June 2, 2009 at 03:33 PM PDT