Cox's theorem, named after the physicist Richard Threlkeld Cox, is a derivation of the laws of probability theory from a certain set of postulates. This derivation justifies the so-called "logical" interpretation of probability. As the laws of probability derived by Cox's theorem are applicable to any propositions, logical probability is a variety of Bayesian probability. Other forms of Bayesianism, such as the subjective interpretation, are given other justifications.
Cox's assumptions
Cox wanted his system to satisfy the following desiderata
The postulates as stated here are taken from Arnborg and Sjödin (1999).
"Common sense" includes consistency with Aristotelian logic when
statements are completely plausible or implausible.
The postulates as originally stated by Cox were not mathematically rigorous (although better than the informal description above), e.g., as noted by Halpern (1999a, 1999b). However, it appears to be possible to augment them with various mathematical assumptions made either implicitly or explicitly by Cox to produce a valid proof.
Cox's axioms are:
- The plausibility of a proposition determines the plausibility of the proposition's negation; either decreases as the other increases.
- The plausibility of the conjunction [A & B] of two propositions A, B, depends only on the plausibility of B and that of A given that B is true. (From this Cox eventually infers that multiplication of probabilities is associative, and then that it may as well be ordinary multiplication of real numbers.)
- Suppose [A & B] is equivalent to [C & D]. If we acquire new information A and then acquire further new information B, and update all probabilities each time, the updated probabilities will be the same as if we had first acquired new information C and then acquired further new information D. Since multiplication of probabilities can be taken to be ordinary multiplication of real numbers, this requirement becomes a functional equation.
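For reference, the functional equation usually given at this point in presentations of Cox's argument can be written as below. Here f is taken to be the function from the negation postulate, mapping the plausibility of a proposition to the plausibility of its negation. The symbols f, y, z and the exponent m are notation assumed here, not fixed by the text above.

```latex
% Consistency (functional) equation for the negation function f
y \, f\!\left(\frac{f(z)}{y}\right) = z \, f\!\left(\frac{f(y)}{z}\right)

% One family of solutions, for a positive constant m
f(x) = \left(1 - x^{m}\right)^{1/m}

% Check: with this f, both sides reduce to the same symmetric expression
y \, f\!\left(\frac{f(z)}{y}\right)
  = \left(y^{m} + z^{m} - 1\right)^{1/m}
  = z \, f\!\left(\frac{f(y)}{z}\right)
```

The check in the last line only confirms that this family satisfies the equation; showing that suitably regular solutions must take this form is the harder part of the proof.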
Cox's theorem implies that any plausibility model that meets the
postulates is equivalent to the subjective probability model, i.e.,
can be converted to the probability model by rescaling.
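As a rough illustration of what "rescaling" means here, the following sketch assumes a hypothetical plausibility scale w = p^(1/m) for a fixed positive m. The function names and the choice m = 2 are illustrative assumptions, not part of Cox's or Jaynes's presentation. On that scale the conjunction rule is still plain multiplication, the negation rule takes a distorted form, and raising w to the power m recovers ordinary probabilities.

```python
M = 2.0  # assumed exponent relating the plausibility scale to probability


def to_plausibility(p: float, m: float = M) -> float:
    """Map a probability p in [0, 1] to the hypothetical scale w = p**(1/m)."""
    return p ** (1.0 / m)


def to_probability(w: float, m: float = M) -> float:
    """Rescale a plausibility w back to a probability via w**m."""
    return w ** m


def negate(w: float, m: float = M) -> float:
    """Plausibility of not-A as a function of the plausibility of A."""
    return (1.0 - w ** m) ** (1.0 / m)


def conjoin(w_b: float, w_a_given_b: float) -> float:
    """Plausibility of (A and B) from w(B) and w(A|B): ordinary multiplication."""
    return w_b * w_a_given_b


# Example probabilities (arbitrary illustrative numbers).
p_b, p_a_given_b = 0.5, 0.3

w_b = to_plausibility(p_b)
w_a_given_b = to_plausibility(p_a_given_b)

# Conjunction computed on the plausibility scale, then rescaled.
p_ab = to_probability(conjoin(w_b, w_a_given_b))

# Negation computed on the plausibility scale, then rescaled.
p_not_b = to_probability(negate(w_b))

print(round(p_ab, 10), round(p_not_b, 10))
```

Up to floating-point error, the rescaled values agree with the ordinary product rule (0.5 times 0.3) and the ordinary complement rule (1 minus 0.5), which is the sense in which the two descriptions are equivalent.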
Implications of Cox's postulates
The laws of probability derivable from these postulates are the following (Jaynes, 2003). Here w(A|B) is the "plausibility" of the proposition A given B, and m is some positive number.
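With w(A|B) denoting the plausibility of A given B and m a positive number, the three general laws are commonly written as below. This is a reconstruction following the usual presentation of the result rather than a quotation; the numbering is chosen to match the references to rule 2 (negation) and rule 3 (conjunction) in this article.

```latex
\begin{enumerate}
  \item Certainty is represented by $w(A \mid B) = 1$.
  \item $w^{m}(A \mid B) + w^{m}(\lnot A \mid B) = 1$.
  \item $w(A, B \mid C) = w(A \mid C)\, w(B \mid A, C) = w(B \mid C)\, w(A \mid B, C)$.
\end{enumerate}
```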
The postulates imply only these general properties. These are equivalent to the usual laws of probability under some conventions, namely that the scale of measurement runs from zero to one and that the plausibility function, conventionally denoted P or Pr, is equal to w^m. (We could equivalently have chosen to measure probabilities from one to infinity, with infinity representing certain falsehood.) With these conventions, we obtain the laws of probability in a more familiar form:
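Writing Pr for the m-th power of w on a zero-to-one scale, the familiar form would read as follows. This is a sketch reconstructed from the conventions stated above, numbered so that rule 2 is the negation rule and rule 3 is the conjunction rule.

```latex
\begin{enumerate}
  \item Certain truth is represented by $\Pr(A \mid B) = 1$,
        and certain falsehood by $\Pr(A \mid B) = 0$.
  \item $\Pr(A \mid B) + \Pr(\lnot A \mid B) = 1$.
  \item $\Pr(A, B \mid C) = \Pr(A \mid C)\, \Pr(B \mid A, C)
        = \Pr(B \mid C)\, \Pr(A \mid B, C)$.
\end{enumerate}
```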
Rule 2 is a rule for negation, and rule 3 is a rule for conjunction. Any proposition built from conjunction, disjunction, and negation can be equivalently rephrased using conjunction and negation alone (by De Morgan's laws, "A or B" is equivalent to "not (not A and not B)"), so we can now handle any compound proposition.
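To make this reduction concrete, the sketch below uses a small made-up joint distribution over two propositions A and B; the numbers and helper names are illustrative assumptions. It computes the probability of "A or B" using only the negation and conjunction rules, by way of the equivalence of "A or B" with "not (not A and not B)", and compares the result with the familiar sum rule.

```python
# Hypothetical joint distribution over the truth values of (A, B).
joint = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}


def prob(event) -> float:
    """Probability of the set of worlds (a, b) for which event(a, b) is true."""
    return sum(p for (a, b), p in joint.items() if event(a, b))


def conditional(event, given) -> float:
    """Pr(event | given), computed from the joint table."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)


# Negation rule: Pr(not A) = 1 - Pr(A).
p_not_a = 1.0 - prob(lambda a, b: a)

# Negation rule, conditional on not A: Pr(not B | not A) = 1 - Pr(B | not A).
p_not_b_given_not_a = 1.0 - conditional(lambda a, b: b, lambda a, b: not a)

# Conjunction rule: Pr(not A, not B) = Pr(not A) * Pr(not B | not A).
p_neither = p_not_a * p_not_b_given_not_a

# Negation rule once more: Pr(A or B) = 1 - Pr(not A, not B).
p_a_or_b = 1.0 - p_neither

# Familiar sum rule for comparison: Pr(A) + Pr(B) - Pr(A, B).
p_sum_rule = (
    prob(lambda a, b: a) + prob(lambda a, b: b) - prob(lambda a, b: a and b)
)

print(p_a_or_b, p_sum_rule)
```

The two printed values should agree up to floating-point error, which shows that no separate rule for disjunction is needed.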
The laws thus derived yield finite additivity of probability, but not countable additivity. The measure-theoretic formulation of Kolmogorov assumes that a probability measure is countably additive. This slightly stronger condition is necessary for the proof of certain theorems; however, it is not clear what difference countable additivity makes in practice.
Interpretation and further discussion
Cox's theorem has come to be used as one of the justifications for the
use of Bayesian probability theory. For example, in Jaynes (2003) it is
discussed in detail in chapters 1 and 2 and is a cornerstone for the
rest of the book. Probability is interpreted as a formal system of
logic, the natural extension of Aristotelian logic (in which every
statement is either true or false) into the realm of reasoning in the
presence of uncertainty.
It has been debated to what degree the theorem excludes alternative
models for reasoning about uncertainty. For example, if certain
"unintuitive" mathematical assumptions were dropped, then alternatives
could be devised; Halpern (1999a) provides one such example.
However, Arnborg and Sjödin (1999, 2000a, 2000b) suggest additional
"common sense" postulates, which would allow the assumptions to be
relaxed in some cases while still ruling out the Halpern example.
The original formulation of Cox's theorem is in Cox (1946), which is extended with additional results and more discussion in Cox (1961). Jaynes (2003) cites Abel (1826) as the first known instance of the associativity functional equation that is used in the proof of the theorem. Aczél (1966) refers to the "associativity equation", lists 98 references to works that discuss or use it, and gives a proof that does not require differentiability (pages 256-267).
References and external links