Reviving an old argument for Conditionalization

As a dear departed friend used to tell me, only a fool never changes their mind. So, endeavouring not to be a fool, I have changed my mind about something. In fact, I've changed it back, having changed it once before. The matter is a certain argument that Hannes Leitgeb and I proposed for the Bayesian norm of Conditionalization, which says that you should update your credences in response to evidence by conditionalizing them on it. A suitably reconstructed and generalized version of the argument has three premises:

  1. Credal Veritism The sole fundamental source of epistemic value for credences is their accuracy.
  2. Strict Propriety Every legitimate measure of the accuracy of a credence function at a world is strictly proper. (That is, every probabilistic credence function expects itself to be more accurate than any other credence function.)
  3. Maximize Evidentially-Truncated Subjective Expected Utility You should choose an option that maximizes a quantity we might call evidentially-truncated subjective expected utility, which is just subjective expected utility but calculated not over all the possible states of the world, but only over those that are compatible with your evidence.
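
In symbols: if $C$ is your credence function, $E$ is the proposition that captures your total evidence, and $U(o, w)$ is the utility of option $o$ at world $w$ (the notation is mine, not part of the original argument), then the evidentially-truncated subjective expected utility of $o$ is$$\sum_{w \in E} C(w)\, U(o, w)$$and premise 3 says you should choose an option that maximizes this quantity, rather than the ordinary subjective expected utility $\sum_{w \in W} C(w)\, U(o, w)$.
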
[Image: Sapphire stubbornly refusing to receive any visual evidence]

It is then straightforward to derive Conditionalization as a conclusion by showing that the credence function that maximizes evidentially-truncated expected accuracy, from the point of view of your prior credence function and using a strictly proper measure of accuracy, is the one obtained from your prior by conditionalizing on the evidence. First, suppose $C$ is your prior and that it is probabilistic. Then, if $E$ is your evidence and $C(E) > 0$, $C(-\mid E)$ is also probabilistic. And so, if $\mathfrak{A}$ is a strictly proper accuracy measure, then$$\sum_{w \in W} C(w \mid E) \mathfrak{A}(C(-\mid E), w) > \sum_{w \in W} C(w \mid E) \mathfrak{A}(C', w)$$for any $C' \neq C(-\mid E)$. But $C(w \mid E) = 0$ if $E$ is false at $w$ and $C(w \mid E) = \frac{C(w)}{C(E)}$ if $E$ is true at $w$, and so, multiplying both sides by $C(E) > 0$, we have$$\sum_{w \in E} C(w)\mathfrak{A}(C(-\mid E), w) > \sum_{w \in E} C(w) \mathfrak{A}(C', w)$$as required.
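
To make this concrete, here's a minimal numerical check in Python. It assumes the (negative) Brier score as the strictly proper accuracy measure, and the four-world prior, the evidence proposition, and the randomly generated rival posteriors are toy choices of mine, not part of the argument.

```python
import random

# Toy setup: four worlds, a probabilistic prior C, and evidence E = {0, 1}.
worlds = [0, 1, 2, 3]
C = [0.1, 0.2, 0.3, 0.4]
E = {0, 1}

def brier_accuracy(cred, w):
    """Accuracy of credence function `cred` at world `w`, measured by the negative
    Brier score over the singleton propositions. This measure is strictly proper."""
    return -sum((cred[v] - (1.0 if v == w else 0.0)) ** 2 for v in worlds)

def truncated_expected_accuracy(prior, cred, evidence):
    """Evidentially-truncated expected accuracy: sum only over worlds in the evidence."""
    return sum(prior[w] * brier_accuracy(cred, w) for w in evidence)

# The posterior obtained by conditionalizing C on E.
C_given_E = [C[w] / sum(C[v] for v in E) if w in E else 0.0 for w in worlds]

# Conditionalizing should beat any rival posterior; test against random probabilistic rivals.
random.seed(0)
best = truncated_expected_accuracy(C, C_given_E, E)
for _ in range(10_000):
    raw = [random.random() for _ in worlds]
    rival = [x / sum(raw) for x in raw]
    assert truncated_expected_accuracy(C, rival, E) <= best + 1e-12

print("Conditionalizing on E maximizes evidentially-truncated expected accuracy.")
```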

Later, I came to worry about the third premise of the argument. Dmitri Gallow has formulated two of the main concerns well. First, he notes that the standard official justifications for maximizing expected utility do not support maximizing evidentially-truncated expected utility--for one thing, the credences by which we weight the utilities of an option at different states of the world when we calculate this quantity don't sum to 1. Second, he argues that the intuitive motivation for that principle relies on considerations that are not available to the credal veritist--specifically, it relies on evidential considerations. 

I was convinced by these arguments, and by Gallow's remedy, which I adopted in my book, Epistemic Risk and the Demands of Rationality: Gallow thinks you should maximize expected utility, but your measure of accuracy should change when you receive evidence, so that it assigns the same neutral constant value to any credence function at any world at which your evidence is not true; and doing this also favours conditionalizing. But I've come to think I was wrong to do this; I've come to think that Hannes and I were right all along. In this post, I try to answer Gallow's worries.

1. Justifying our decision rule

To answer Gallow's concerns, I'll adapt work by Martin Peterson and Kenny Easwaran to give a direct argument for premise 3, the decision-theoretic norm, Maximize Evidentially-Truncated Subjective Expected Utility.

Peterson and Easwaran both set out to provide what Peterson calls an ex post justification of Maximize Subjective Expected Utility. This contrasts with the more familiar ex ante justifications furnished by Savage's representation theorem. An ex ante justification begins with preferences over options, places constraints on those preferences (as well as the space of options), and proves that, for any preferences that satisfy them, there is a unique probability function and a utility function unique up to positive linear transformation such that you weakly prefer one option to another iff the expected utility of the first is at least as great as the expected utility of the second. An ex post justification, on the other hand, begins with probabilistic credences and utilities already given, places constraints on how you combine them to build preferences over options, and proves that preferences so built order options by their subjective expected utility. I will adapt their constraints so that the preferences so built order options instead by their evidentially-truncated subjective expected utility.

So, suppose we have a finite set of states of the world $W$, and a probability function $P$ over $W$. And suppose the options are functions that take each world in $W$ and return a real number that measures the utility of that option at that world. Then we introduce a few related definitions:
  • A fine-graining of $W$ is a finite set of worlds $W'$ together with a surjective function $h: W' \rightarrow W$. We write it $W'_h$.
  • Given an option $o$ defined on $W$, a fine-graining $W'_h$, and an option $o'$ defined on $W'$, $o'$ is the fine-graining of $o$ relative to $h$ if, for all $w'$ in $W'$, $o'(w') = o(h(w'))$. We write it $o'_h$.
  • Given a probability function $P$ defined on $W$, a fine-graining $W'_h$, and a probability function $P'$ defined on $W'$, $P'$ is a fine-graining of $P$ relative to $h$ if, for all $w$ in $W$,$$P(w)= \sum_{\substack{w' \in W' \\ h(w') = w}} P(w')$$We write it $P'_h$.
  • Given a proposition $E \subseteq W$, a fine-graining $W'_h$, and a proposition $E' \subseteq W'$, $E'$ is the fine-graining of $E$ relative to $h$ if, for all $w'$ in $W'$, $w'$ is in $E'$ iff $h(w')$ is in $E$. We write it $E'_h$.
And suppose $E \subseteq W$ is a proposition that represents our total evidence at the time in question. We will lay down constraints on weak preference orderings $\preceq_{P'_h, W'_h}$ over options defined on $W'$, where $W'_h$ is a fine-graining of $W$ and $P'_h$ is a fine-graining of $P$ relative to $h$. Note that this imposes constraints on a weak preference order of options defined on $W$ itself, since the identity function on $W$ defines a fine-graining of $W$, and $P$ is a fine-graining of itself relative to this function.
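
Here's a quick Python sketch of these definitions, just to fix ideas; the particular worlds, the surjection $h$, and the numbers are illustrative choices of mine.

```python
# Coarse-grained worlds, and a fine-graining of them via a surjection h: W' -> W.
W = ["rain", "dry"]
W_fine = ["rain-warm", "rain-cold", "dry-warm", "dry-cold"]
h = {"rain-warm": "rain", "rain-cold": "rain",
     "dry-warm": "dry", "dry-cold": "dry"}

# A probability function P on W, and a fine-graining P' of P relative to h:
# P(w) must equal the sum of P'(w') over the fine-grained worlds w' that h maps to w.
P = {"rain": 0.3, "dry": 0.7}
P_fine = {"rain-warm": 0.1, "rain-cold": 0.2, "dry-warm": 0.4, "dry-cold": 0.3}
assert all(abs(P[w] - sum(p for w2, p in P_fine.items() if h[w2] == w)) < 1e-12
           for w in W)

# An option o on W (a utility for each world), and its fine-graining o' relative to h.
o = {"rain": -1.0, "dry": 2.0}
o_fine = {w2: o[h[w2]] for w2 in W_fine}

# The evidence E as a subset of W, and its fine-graining E' relative to h.
E = {"rain"}
E_fine = {w2 for w2 in W_fine if h[w2] in E}
```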

Reflexivity $\preceq_{P'_h, W'_h}$ is reflexive.

Transitivity $\preceq_{P'_h, W'_h}$ is transitive.

These are pretty natural assumptions.

Dominance For any options $o$, $o'$ defined on $W'$,
  • If $o(w') \leq o'(w')$, for all $w'$ in $E'_h$, then $o \preceq_{P'_h, W'_h} o'$;
  • If $o(w') < o'(w')$, for all $w'$ in $E'_h$, then $o \prec_{P'_h, W'_h} o'$.
This says that, if one option is at least as good as another at every world compatible with my evidence, I weakly prefer the first to the second; and if one option is strictly better than another at every world compatible with my evidence, I strictly prefer the first to the second.

Grain Invariance $o_h \preceq_{P'_h, W'_h} o^*_h$ iff $o \preceq_{P, W} o^*$.

This says that the grain at which I describe two options shouldn't make any difference to how I order them.

Trade-Off Indifference If, for two possible worlds $w'_i, w'_j$ in $E'_h$, 
  • $P'_h(w'_i) = P'_h(w'_j)$,
  • $o_h(w'_i) - o^*_h(w'_i) = o^*_h(w'_j) - o_h(w'_j)$,
  • $o_h(w'_k) = o^*_h(w'_k)$, for all $w'_k \neq w'_i, w'_j$,
then $o_h \sim_{P'_h, W'_h} o^*_h$.

This says that, when two options differ only in their utilities at two equiprobable worlds compatible with my evidence, and the first is better than the second at one of those worlds by exactly the amount by which the second is better than the first at the other, I should be indifferent between them.

Then we have the following theorem:

Main Theorem (Peterson 2004; Easwaran 2014) If, for every fine-graining $W'_h$ of $W$ and every fine-graining $P'_h$ of $P$ relative to $h$, $\preceq_{P'_h, W'_h}$ satisfies Reflexivity, Transitivity, Dominance, Grain Invariance, and Trade-Off Indifference, then, for any two options $o$, $o^*$ defined on $W$,$$o \preceq_{P, W} o^* \Leftrightarrow \sum_{w \in E} P(w)o(w) \leq \sum_{w \in E} P(w)o^*(w)$$
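
Here's a toy check (the worlds, prior, evidence, and options are illustrative choices of mine) that ordering options by their evidentially-truncated expected utility behaves as Dominance and Trade-Off Indifference require:

```python
W = [0, 1, 2, 3]
P = {w: 0.25 for w in W}   # a uniform prior, for simplicity
E = {0, 1}                 # worlds 2 and 3 are ruled out by the evidence

def truncated_eu(o):
    """Evidentially-truncated subjective expected utility: weight utilities only at worlds in E."""
    return sum(P[w] * o[w] for w in E)

# Trade-Off Indifference: o1 and o2 differ only at the equiprobable worlds 0 and 1,
# and o1's advantage at world 0 exactly offsets o2's advantage at world 1, so they tie.
o1 = {0: 5.0, 1: 1.0, 2: -100.0, 3: 7.0}
o2 = {0: 4.0, 1: 2.0, 2: -100.0, 3: 7.0}
assert truncated_eu(o1) == truncated_eu(o2)

# Dominance: o3 is strictly worse than o1 at every world in E, so it ranks strictly
# below o1, however well it does at the worlds the evidence has ruled out.
o3 = {0: 0.0, 1: 0.0, 2: 1000.0, 3: 1000.0}
assert truncated_eu(o3) < truncated_eu(o1)
```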

Hopefully this argument goes some way towards answering Gallow's first concern, namely, that there's no principled reason to choose between options in the way Hannes Leitgeb and I imagined, that is, by maximizing evidentially-truncated subjective expected utility.

2. Does it appeal to non-veritist considerations?

Does this argument address Gallow's second worry? Here it is in Gallow's words:

"Why should you stop regarding the worlds outside of $E$ as epistemically possible, and thereby completely discount the accuracy of your credences at worlds outside of E? The natural answer to that question is: 'because those worlds are incompatible with your evidence.' This answer relies upon a norm like 'do not value accuracy at a world if it is incompatible with your evidence'. But this is a distinctively evidential norm. And we have done nothing to explain why someone who pursues accuracy alone, and cares not at all about evidence per se except insofar as it helps them attain their goal of accuracy, will have reason to abide by this evidential norm." (Emphasis in the original; page 10)

An appealing consequence of justifying Maximize Evidentially-Truncated Subjective Expected Utility using Peterson's and Easwaran's strategy is that we can see exactly where in our argument we make reference to the evidence. It is in Dominance and in Trade-Off Indifference. And, when we look at these, I think we can see how to meet Gallow's demand for an explanation. We can explain why someone whose only goal is the pursuit of accuracy should have preferences that obey both of these principles. The reason is that they care about actual accuracy; when we say they care about accuracy, we mean that they care about how accurate their credence function actually is. And what evidence does is rule out some worlds, telling us they're not actual. And so, if the accuracy of one credence function is greater than the accuracy of another at every world compatible with my evidence, then I should prefer the first to the second, since it is now certain that the actual accuracy of the first is greater than the actual accuracy of the second. It doesn't matter if the first is less accurate at some world that's incompatible with my evidence, because my evidence has ruled out that world; it's told me I'm not at it, and I care only about my accuracy at the world I inhabit. So that's how we motivate Dominance. In some sense, of course, it's an evidential norm, since what it demands depends on our evidence, and our motivation of course had to talk of evidence; but it is also a motivation that a credal veritist should find compelling.

And similarly for Trade-Off Indifference. Suppose two options differ only in that the first is better than the second at a world incompatible with the evidence and the second is better than the first by the same amount at an equiprobable world compatible with the evidence. Then this should not necessarily lead to indifference, since the second world might be the actual world, while the first, our evidence tells us, is not; and so the first option's betterness at the world where it is better is no compensation for its poorer performance at the world where the second option is better. So we can motivate the restriction of Trade-Off Indifference to worlds compatible with the evidence by pointing, again, to the fact that the credal veritist cares about their actual accuracy, and the evidence tells them something about which world is actual.

3. Resistance to evidence revisited

Just to tie this in to the previous blogpost, it's worth repeating that there's a curious parallel between this argument and the Good-Myrvold approach to gathering evidence that I've been thinking about recently. On that approach, you're considering an evidential situation, and you're deciding whether or not to place yourself in it. Such a situation is characterized by which proposition you'll learn at each different way the world might be. If you also know how you'll respond to whatever proposition you learn, you can use measures of accuracy, or epistemic utility functions more generally, to calculate the expected epistemic utility of putting yourself in the evidential situation.

However, in the cases of resistance to evidence that interest Mona Simion, you've already gathered the evidence, and you're evaluating what to do in response. At that point, the argument Hannes and I gave kicks in. And it tells you to conditionalize: doing so has greater evidentially-truncated expected accuracy than adopting any other posterior. And so, while there might be cases in which you shouldn't put yourself in the evidential situation, because doing so doesn't maximize expected epistemic utility, if you do nonetheless find yourself in that situation and you learn some evidence, whether because you chose irrationally or because you were placed in it against your will, then you should update by conditionalizing on that evidence.

Comments

  1. Super interesting! Can I summarise the view like this: even though there are no evidential norms governing your doxastic states, there are evidential norms governing rational choice?

    If that's right, then am I right that this approach will sometimes say that it's instrumentally irrational to maximise expected utility? For instance, suppose you're facing the decision from the miners puzzle, and you've received evidence which implies that the miners are in shaft A, but for whatever reason you haven't updated on this evidence---you didn't notice it, or you were distracted, or whatever. Then, covering shaft A will dominate covering shaft B and not covering either shaft in all worlds consistent with your evidence, so the Dominance principle will say that you should most prefer covering shaft A. But, since you currently think they're just as likely to be in shaft B as shaft A, not covering either maximises expected utility. So maximising expected utility is sometimes instrumentally irrational.

    I can see someone saying that, if you don't respond to the evidence, it's not evidence that you have, and that we should understand the dominance principle so that it only applies to evidence in your possession. But I don't think we can say that on this view, since it's important to the view that you can possess evidence even when you haven't (yet) responded to it.

  2. Thanks very much for this, Dmitri! Yes, that's exactly the view! And you're right it has this consequence. It's important for me that it has the consequence you mention in the final paragraph because I want to use it to deal with the sort of 'resistance to evidence' cases that Mona Simion identifies. (In fact, as I mention in the previous post, it's not obvious that the phenomenon she's describing always involves having evidence you don't respond to; I think she also counts cases in which you respond in the wrong way as cases of resisting evidence, where the right way to respond is determined not by your priors but by the evidential probabilities. But as a sceptic about evidential probabilities, I want to try to explain what goes wrong in her cases without appealing to that.) But in any case, I think my approach gets the miners case you describe right. After all, we often say that it's not fully rational to maximize expected utility from the point of view of credences that have been formed irrationally. For instance, if I receive some evidence, update on it in some way that wildly diverges from Conditionalization, and then face a decision, you might reasonably say that there's something irrational about me choosing in the face of that decision by maximizing expected utility from the point of view of my credences. And that's similar to what's happening here: I've received evidence, but haven't updated on it; rationality tells me I should update on it and by Conditionalization; so when I'm choosing I shouldn't maximize expected utility from the point of view of my actual credences.

