In the previous post, I presented a couple of arguments for the Principal Principle. The first was based on the following fact: for many plausible measures of distance between credence functions, if a credence function $c$ violates the Principal Principle, then there is a credence function $c'$ that satisfies it such that $c'$ is closer to each of the credence functions that match the possible objective chances than $c$ is. That is, if $ch_w$ is the chance function at world $w$, then for all worlds $w$, $c'$ is closer to $ch_w$ than $c$ is. So if you think, as Alan Hájek does, that credences aim to match the objective chances, then it seems that you should obey the Principal Principle. The second argument was based on the following fact: for many plausible measures of distance between credence functions, if a credence function $c$ violates the Principal Principle, then there is a credence function $c'$ that satisfies it such that each possible objective chance function expects $c'$ to be more accurate than it expects $c$ to be (where accuracy is proximity to the omniscient credence function). That is, for all worlds $w$, $ch_w$ expects $c'$ to be closer to the omniscient credences than it expects $c$ to be. So if you think that the objective chances should guide decisions when they speak univocally in favour of one action over another, then it seems that you should obey the Principal Principle.
For example, suppose you know that the chance of a coin landing heads is between 0.6 and 0.8 inclusive, and suppose that your credence in it landing heads is 0.5. Then, according to the first of these arguments, you are irrational because there is another credence (for instance, 0.6) that is closer to matching the chances than your credence is, regardless of what the chances are.
And according to the second of these arguments, you are irrational because there is another credence (for instance, 0.6) that each possible chance function expects to be more accurate than it expects your credence to be.
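Both claims in this example are easy to check numerically. The sketch below is mine, not part of the original arguments: it uses squared Euclidean distance as the divergence (one admissible Bregman divergence) and sweeps a grid of possible chance values in [0.6, 0.8].

```python
# A minimal numerical check of the two dominance claims in the coin example,
# assuming squared Euclidean distance as the divergence D.

def sq_dist(p, q):
    """Squared Euclidean distance between two credences in heads,
    treating a credence p as the pair (p, 1 - p) over {heads, tails}."""
    return (p - q) ** 2 + ((1 - p) - (1 - q)) ** 2

c, c_prime = 0.5, 0.6
chances = [0.6 + 0.01 * i for i in range(21)]  # grid over [0.6, 0.8]

# Claim 1: c' = 0.6 is closer to every possible chance than c = 0.5 is.
assert all(sq_dist(ch, c_prime) < sq_dist(ch, c) for ch in chances)

# Claim 2: every possible chance function expects c' to be more accurate
# than c, where inaccuracy at a world is squared distance from the
# omniscient credence (1 at heads-worlds, 0 at tails-worlds).
def expected_inaccuracy(ch, cr):
    return ch * sq_dist(1, cr) + (1 - ch) * sq_dist(0, cr)

assert all(expected_inaccuracy(ch, c_prime) < expected_inaccuracy(ch, c)
           for ch in chances)
```

In this one-dimensional case the dominating credence 0.6 is just the point of the interval [0.6, 0.8] nearest to 0.5, which is what the convex-hull projection results below deliver in general.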
In this post, we will consider two arguments for van Fraassen's Generalized Reflection Principle and Conditionalization (which follows from Generalized Reflection) (van Fraassen, 1999). They are related to the two arguments for the Principal Principle considered above. The first is a beautiful new argument due to Robbie Williams (Williams, ms). The second is my take on Robbie's argument. Throughout, I'll assume that all credence functions are probability functions.
Williams' argument for GRP and Conditionalization
We consider an agent with a credence function $c$ and an updating rule $\mathbf{R}$. $\mathbf{R}$ takes a partition $\mathcal{E}$ and an element of that partition $E$ and returns a credence function $c_{\mathbf{R}(\mathcal{E}, E)}$. We think of $c_{\mathbf{R}(\mathcal{E}, E)}$ as the credence function that the updating rule would mandate were the agent to receive evidence $E$ from partition $\mathcal{E}$. And we demand that $c_{\mathbf{R}(\mathcal{E}, E)}(E) = 1$. That is, updating in the light of evidence $E$ ought to make an agent certain of $E$.
For instance, if I am about to perform an experiment, I will typically know the partition from which my evidence will come: perhaps I know that my measuring instrument will read 1, 2, or 3. Then I know that my evidence will come from the partition $\mathcal{E} = \{1, 2, 3\}$. An updating rule takes a partition and an element of the partition and tells you what your new credence function should be if you learn that element of the partition.
Williams asks us to consider the following situation. Suppose $D$ is a measure of distance between credence functions: for the purpose of his argument, $D$ could be Squared Euclidean Distance, or cross-entropy, or any other Bregman divergence. And suppose that $\mathcal{E}$ is a partition. Now suppose that there is some credence function $c'$ that is closer to each $c_{\mathbf{R}(\mathcal{E}, E)}$ (for $E$ in $\mathcal{E}$) than the agent's credence function $c$ is. This, Williams claims, would make the agent irrational. That is, for Williams, it is irrational to have a credence function and an updating rule such that, for some partition, the credence function is further than it needs to be from the various posterior credence functions that the updating rule would recommend in the light of the various elements of the partition. In symbols:
Future Credence Dominance Suppose I have credence function $c$ and I endorse updating rule $\mathbf{R}$. Suppose $\mathcal{E}$ is a partition. And suppose there is $c'$ such that
\[
D(c_{\mathbf{R}(\mathcal{E}, E)}, c') < D(c_{\mathbf{R}(\mathcal{E}, E)}, c)
\]
for all $E$ in $\mathcal{E}$. Then I am irrational.
Thus, suppose I am about to perform an experiment, and I thereby know that I will learn an element of the partition $\{1, 2, 3\}$. Suppose further that my updating rule tells me to adopt a credence of 0.1 if I learn 1, 0.2 if I learn 2, and 0.3 if I learn 3. But suppose that I currently have credence 0.5. Then, according to Future Credence Dominance, I am irrational because there is another credence (for instance, 0.3) that is closer to each of my possible future credences than my current credence is.
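To make the example concrete, here is a minimal numerical check; the choice of squared Euclidean distance for $D$ is mine, and any Bregman divergence would serve.

```python
# Checking the dominance claim in the experiment example, with squared
# Euclidean distance standing in for the Bregman divergence D.

def sq_dist(p, q):
    return (p - q) ** 2

posteriors = [0.1, 0.2, 0.3]  # credences mandated on learning 1, 2, 3
c, c_prime = 0.5, 0.3

# c' = 0.3 is closer to every possible future credence than c = 0.5 is,
# so Future Credence Dominance convicts c of irrationality.
assert all(sq_dist(p, c_prime) < sq_dist(p, c) for p in posteriors)
```

Here 0.3 is the point of the convex hull [0.1, 0.3] of the possible future credences nearest to 0.5, in line with Lemma 1 below.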
What epistemic norm follows from Future Credence Dominance along with the claim that $D$ must be a Bregman divergence? The answer is: van Fraassen's Generalized Reflection Principle.
Generalized Reflection Suppose I have credence function $c$ and I endorse updating rule $\mathbf{R}$, and suppose $\mathcal{E}$ is a partition. Then I am irrational unless, for all propositions $X$,
\[
c(X) = \sum_{E \in \mathcal{E}} c(E) c_{\mathbf{R}(\mathcal{E}, E)}(X)
\]
That is, my current credence in a proposition ought to be my expected future credence in it. Notice that this is a norm that applies to credence function-updating rule pairs. That this follows from Future Credence Dominance is a consequence of the following two Lemmas:
Lemma 1 Suppose $\mathbf{R}$ is an updating rule. Let $\mathbf{R}(\mathcal{E}) = \{c_{\mathbf{R}(\mathcal{E}, E)} : E \in \mathcal{E}\}$. Then
- If $c \not \in \mathbf{R}(\mathcal{E})^+$, then there is $c' \in \mathbf{R}(\mathcal{E})^+$ such that, for all $E$ in $\mathcal{E}$,\[D(c_{\mathbf{R}(\mathcal{E}, E)}, c') < D(c_{\mathbf{R}(\mathcal{E}, E)}, c)\]
- If $c \in \mathbf{R}(\mathcal{E})^+$, then there is no $c' \neq c$ such that, for all $E$ in $\mathcal{E}$, \[ D(c_{\mathbf{R}(\mathcal{E}, E)}, c') \leq D(c_{\mathbf{R}(\mathcal{E}, E)}, c) \]
Proof. This is a special case of the theorem to which we appealed in the accuracy-based argument for Probabilism and the accuracy-based argument for the Principal Principle. Suppose $\mathcal{X}$ is a set of credence functions; then, for any credence function $c$ that lies outside the convex hull $\mathcal{X}^+$ of $\mathcal{X}$, there is a credence function $c'$ that lies inside $\mathcal{X}^+$ such that $c'$ is closer to each member of $\mathcal{X}$ than $c$ is.
$\Box$
Lemma 2 (van Fraassen, 1999) $c \in \mathbf{R}(\mathcal{E})^+$ iff $c$ satisfies Generalized Reflection.
Proof. From right to left is straightforward. Thus, suppose $c \in \mathbf{R}(\mathcal{E})^+$. That is, there are $\lambda_E \geq 0$, for $E$ in $\mathcal{E}$, with $\sum_{E \in \mathcal{E}} \lambda_E = 1$, such that
\[
c(X) = \sum_{E \in \mathcal{E}} \lambda_E c_{\mathbf{R}(\mathcal{E}, E)}(X)
\]
for all $X$. Thus, in particular, if $E'$ is in $\mathcal{E}$, then
\[
c(E') = \sum_{E \in \mathcal{E}} \lambda_E c_{\mathbf{R}(\mathcal{E}, E)}(E')
\]
But, by stipulation,
\[
c_{\mathbf{R}(\mathcal{E}, E)}(E') = \left \{ \begin{array}{ll}
1 & \mbox{ if } E' = E \\
0 & \mbox{ if } E' \neq E
\end{array}
\right.
\]
So $\lambda_{E'} = c(E')$, as required.
$\Box$
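The key step of the proof, that the mixture weights are recoverable as the prior credences in the cells, can be seen in a toy example. The worlds, cells, and numbers below are my own illustration.

```python
# A toy check of Lemma 2's key step. Worlds w1, w2 lie in cell E1 and
# w3 is cell E2; each posterior is certain of its own cell, as stipulated.

E1, E2 = {"w1", "w2"}, {"w3"}
post_E1 = {"w1": 0.7, "w2": 0.3, "w3": 0.0}   # certain of E1
post_E2 = {"w1": 0.0, "w2": 0.0, "w3": 1.0}   # certain of E2

def mix(lam, p, q):
    """Convex combination lam*p + (1 - lam)*q of two credence functions."""
    return {w: lam * p[w] + (1 - lam) * q[w] for w in p}

lam = 0.4
c = mix(lam, post_E1, post_E2)

# Because each posterior gives its own cell probability 1 and the other
# cell probability 0, the mixture weight on post_E1 is exactly c(E1):
assert abs(sum(c[w] for w in E1) - lam) < 1e-12
assert abs(sum(c[w] for w in E2) - (1 - lam)) < 1e-12
```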
Thus, we have the following argument for Generalized Reflection:
- The distance between credence functions ought to be measured by a Bregman divergence $D$.
- Future Credence Dominance
- Lemmas 1 and 2
- Therefore, Generalized Reflection
And this is simultaneously an argument for Conditionalization. After all, as van Fraassen pointed out, Generalized Reflection is equivalent to Conditionalization. Recall, Conditionalization says:
Conditionalization Suppose I have credence function $c$ and I endorse updating rule $\mathbf{R}$. Then I am irrational unless
\[
c_{\mathbf{R}(\mathcal{E}, E)}(X) = c(X | E)
\]
Theorem 1 (van Fraassen, 1999) Generalized Reflection iff Conditionalization
Proof. Suppose $c$, $\mathbf{R}$ satisfy Generalized Reflection. Then
\begin{eqnarray*}
c(X | E') & = & \frac{c(XE')}{c(E')} \\
& = & \frac{\sum_{E \in \mathcal{E}} c(E)c_{\mathbf{R}(\mathcal{E}, E)}(XE')}{\sum_{E \in \mathcal{E}} c(E)c_{\mathbf{R}(\mathcal{E}, E)}(E')} \\
& = & \frac{c(E')c_{\mathbf{R}(\mathcal{E}, E')}(XE')}{c(E')c_{\mathbf{R}(\mathcal{E}, E')}(E')} \\
& = & c_{\mathbf{R}(\mathcal{E}, E')}(X)
\end{eqnarray*}
Now suppose $c$, $\mathbf{R}$ satisfy Conditionalization. Then
\begin{eqnarray*}
c(X) & = & \sum_{E \in \mathcal{E}} c(E) c(X|E) \\
& = & \sum_{E \in \mathcal{E}} c(E) c_{\mathbf{R}(\mathcal{E}, E)}(X)
\end{eqnarray*}
$\Box$
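The left-to-right direction of Theorem 1 can be verified on a small example: a prior over three worlds, a two-cell partition, and posteriors obtained by conditionalization, for which Generalized Reflection holds world by world. The numbers are my own.

```python
# Checking that conditionalizing posteriors satisfy Generalized Reflection:
# the prior is the c(E)-weighted mixture of the posteriors.

prior = {"w1": 0.2, "w2": 0.3, "w3": 0.5}
E1, E2 = {"w1", "w2"}, {"w3"}

def conditionalize(c, E):
    """Posterior on learning E: renormalize c within E, zero outside."""
    cE = sum(c[w] for w in E)
    return {w: (c[w] / cE if w in E else 0.0) for w in c}

post_E1, post_E2 = conditionalize(prior, E1), conditionalize(prior, E2)
cE1 = sum(prior[w] for w in E1)

# Generalized Reflection, checked at each world (hence at each proposition):
for w in prior:
    mixture = cE1 * post_E1[w] + (1 - cE1) * post_E2[w]
    assert abs(prior[w] - mixture) < 1e-12
```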
Thus, Future Credence Dominance gives us not only Generalized Reflection, but also Conditionalization, since the two norms are equivalent.
But one might wonder how compelling Future Credence Dominance is. Why should we care about getting close to our future credences? Here's a suggestion, together with some worries about it.
One might care about proximity to one's future credences because one cares about proximity to the omniscient credences and one believes that those future credences will be close to the omniscient credences. That is, one wishes to be accurate, and one believes that one's future credences are accurate. There seem to be two ways of making precise the accuracy that we attribute to our future credences:
- On the first, we say that my future credences are guaranteed to be closer to the omniscient credences than my current credences are. That is, we assume that, for each partition $\mathcal{E}$ and each $E \in \mathcal{E}$, \[ D(v_w, c_{\mathbf{R}(\mathcal{E}, E)}) < D(v_w, c)\]for all $w$ in $E$. For any $D$, there are updating rules that have this property. Now, we might try to justify Future Credence Dominance as follows: suppose $c'$ is closer to each $c_{\mathbf{R}(\mathcal{E}, E)}$ than $c$ is; then, since each $c_{\mathbf{R}(\mathcal{E}, E)}$ is closer to $v_w$ than $c$ is, for $w$ in $E$, it will be the case that $c'$ is closer to $v_w$ than $c$ is, for each world $w$. But unfortunately, this argument isn't valid: that $c'$ is closer than $c$ to an intermediary, and that the intermediary is closer than $c$ to $v_w$, does not guarantee that $c'$ is closer than $c$ to $v_w$, since Bregman divergences need not satisfy the triangle inequality. The conclusion doesn't follow.
- On the second, we say that my future credences are expected to be closer to the omniscient credences than my current credences are. Expected by whom? By me. Again, for any $D$, there are updating rules that have this property. In fact, by Greaves and Wallace's result from a couple of posts ago, updating by conditionalization always has this property. But now we seem to be very close to the Greaves and Wallace argument. If we value proximity to our future credences because we expect them to be more accurate than we expect our current credences to be, surely we'll value most the updating rule that we expect to give the most accurate future credences. As Greaves and Wallace's argument shows, that rule is always conditionalization. So we have no need of a further argument.
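The Greaves and Wallace point can be illustrated numerically. The sketch below is my own toy instance, with the Brier score as the measure of inaccuracy: the prior expects conditionalization to do better than an alternative rule that also becomes certain of the learned cell but redistributes credence uniformly within it.

```python
# Comparing the prior's expected inaccuracy (Brier score) of two updating
# rules on a toy three-world example.

worlds = ["w1", "w2", "w3"]
prior = {"w1": 0.2, "w2": 0.3, "w3": 0.5}
partition = [{"w1", "w2"}, {"w3"}]

def brier(w_true, q):
    """Inaccuracy of q at w_true: squared distance from omniscience."""
    return sum(((1.0 if w == w_true else 0.0) - q[w]) ** 2 for w in worlds)

def cond_rule(E):
    """Update by conditionalization on E."""
    cE = sum(prior[w] for w in E)
    return {w: (prior[w] / cE if w in E else 0.0) for w in worlds}

def uniform_rule(E):
    """Become certain of E but spread credence uniformly within it."""
    return {w: (1.0 / len(E) if w in E else 0.0) for w in worlds}

def expected_inaccuracy(rule):
    """Prior's expectation of the inaccuracy of the rule's recommendation."""
    return sum(prior[w] * brier(w, rule(E))
               for E in partition for w in E)

# Conditionalization beats the alternative by the prior's lights:
assert expected_inaccuracy(cond_rule) < expected_inaccuracy(uniform_rule)
```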
That's my concern about Future Credence Dominance. In the next section, I consider a different argument for Conditionalization that goes through Generalized Reflection.
Another argument for GRP and Conditionalization
Recall the arguments for the Principal Principle considered in the previous post: on the first, we showed that, if an agent values proximity to the objective chances, then she should satisfy the Principal Principle; on the second, we showed that, if she values proximity to the omniscient credences, but takes the objective chances to guide her actions whenever they speak univocally, she should satisfy the Principal Principle. The argument for Generalized Reflection, and therefore for Conditionalization, had a similar structure to the first argument for the Principal Principle: if an agent values proximity to her future credences, she ought to satisfy Generalized Reflection (and therefore Conditionalization). In this section, I'll give an argument that has a similar structure to the second argument for the Principal Principle: I'll point out that, if an agent values proximity to the omniscient credences (that is, she values accuracy), but takes her future credences to guide her actions whenever they speak univocally, she should satisfy Generalized Reflection (and therefore Conditionalization).
Here's the norm we'll use in place of Future Credence Dominance:
Future Credence Expected Dominance Suppose I have credence function $c$ and I endorse updating rule
$\mathbf{R}$. Suppose $\mathcal{E}$ is a partition. And suppose there
is $c'$ such that each $c_{\mathbf{R}(\mathcal{E}, E)}$ expects $c'$ to be more accurate than it expects $c$ to be: that is,
\[
\sum_w c_{\mathbf{R}(\mathcal{E}, E)}(w) D(v_w, c') < \sum_w c_{\mathbf{R}(\mathcal{E}, E)}(w) D(v_w, c)
\]
for all $E$ in $\mathcal{E}$. Then I am irrational.
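Here is a toy instance of the norm, with my own numbers and the Brier score (squared Euclidean distance from the omniscient credences) as the measure of inaccuracy: a credence function outside the convex hull of the possible posteriors is expectation-dominated from every posterior's point of view.

```python
# A toy instance of Future Credence Expected Dominance under the Brier score.
# Worlds w1, w2 lie in cell E1; w3 is cell E2.

worlds = ["w1", "w2", "w3"]
post_E1 = {"w1": 0.5, "w2": 0.5, "w3": 0.0}   # certain of E1
post_E2 = {"w1": 0.0, "w2": 0.0, "w3": 1.0}   # certain of E2

def brier(v, q):
    """Squared distance of credence function q from omniscient function v."""
    return sum((v[w] - q[w]) ** 2 for w in worlds)

def expected_inaccuracy(p, q):
    """p's expectation of q's inaccuracy, summing over worlds."""
    def omniscient(w):
        return {x: 1.0 if x == w else 0.0 for x in worlds}
    return sum(p[w] * brier(omniscient(w), q) for w in worlds)

# c lies outside the convex hull of the posteriors (any mixture of them
# gives w1 and w2 equal credence); c' is a mixture, hence inside the hull.
c = {"w1": 0.5, "w2": 0.2, "w3": 0.3}
c_prime = {"w1": 0.35, "w2": 0.35, "w3": 0.3}

# Each possible future credence function expects c' to beat c:
for p in (post_E1, post_E2):
    assert expected_inaccuracy(p, c_prime) < expected_inaccuracy(p, c)
```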
Now, in the previous post, I mentioned that we have the following result:
Lemma 3 If $\mathcal{X}$ is a set of probability functions and $c$ lies outside $\mathcal{X}^+$, then there is $c'$ that lies inside $\mathcal{X}^+$ such that every probability function in $\mathcal{X}$ expects $c'$ to be more accurate than it expects $c$ to be.
This, together with Future Credence Expected Dominance, gives us that one's credence function ought to lie in the convex hull of one's possible future credences. And, as we saw above (Lemma 2), if $c$ lies in the convex hull of the possible future credences, then $c$ satisfies Generalized Reflection; by Theorem 1, the possible future credences must then be obtained by Conditionalization. Thus, we have the following argument:
- The distance between credence functions ought to be measured by a Bregman divergence $D$.
- Future Credence Expected Dominance
- Lemmas 2 and 3
- Therefore, Generalized Reflection (and therefore, Conditionalization)
Is Future Credence Expected Dominance plausible? I think so. Of course, one's future credences will rarely speak univocally in favour of one option over another. But, when they do, one should take their advice.
How does this argument compare to the Greaves and Wallace argument? Greaves and Wallace argue for Conditionalization by pointing out that it is the updating rule that looks best from the point of view of our current credence function; the present argument proceeds by pointing out that if we plan to update other than by Conditionalization, then our current credence function doesn't look optimal from the point of view of each of the future credence functions mandated by the updating rule. Thus, the present argument (and also Williams' argument) avoids a common objection to the Greaves and Wallace argument: the objection says that we shouldn't judge an updating rule by the lights of a credence function that the updating rule will lead us to replace. Instead, we are using the future credence functions with which the updating rule will replace our current credence function to judge our current credence function. Jason Konek has been looking at other ways in which we might justify updating rules by considering the point of view of the future credence functions to which the updating rule will give rise. I'll post on that later.