M-Phi: A blog dedicated to mathematical philosophy. Jeffrey Ketland.<br /><br /><b>Hurwicz's Criterion of Realism and decision-making under massive uncertainty</b> (posted 2020-07-12, updated 2020-07-13)<br /><br />For a PDF version of this post, click <a href="https://drive.google.com/file/d/1XEMWmypE_lptZ_Ikm7SJ3bFT-9tDKtDr/view?usp=sharing" target="_blank">here</a>.<br /><br />[UPDATE: After posting this, Johan Gustafsson got in touch and it seems he and I have happened upon similar points via slightly different routes. His paper is <a href="http://johanegustafsson.net/papers/decisions-under-ignorance-and-the-individuation-of-states-of-nature.pdf" target="_blank">here</a>. He takes his axioms from Binmore's <i>Rational Decisions</i>; Binmore took them from Milnor's 'Games against Nature'. Hurwicz and Arrow also cite Milnor, but Hurwicz's original characterisation appeared before Milnor's paper, and he cites Chernoff's Cowles Commission Discussion Paper: Statistics No. 326A as the source of his axioms.]<br /><br />In 1951, Leonid Hurwicz, a Polish-American economist who would go on to share the Nobel prize for his work on mechanism design, published a series of short notes as part of the Cowles Commission Discussion Paper series, in which he introduced a new decision rule for choice in the face of massive uncertainty. The situations that interested him were those in which your evidence is so sparse that it does not allow you to assign probabilities to the different possible states of the world. These situations, he thought, fall outside the remit of Savage's expected utility theory.<br /><br />The rule he proposed is called <i>Hurwicz's Criterion of Realism</i> or just <i>the Hurwicz Criterion</i>. 
He introduced it, in the form in which it is usually stated, in February 1951 in the Cowles Commission Discussion Paper: Statistics No. 356 -- the title was <a href="https://cowles.yale.edu/sites/default/files/files/pub/cdp/s-0356.pdf" target="_blank">'A Class of Criteria for Decision-Making under Ignorance'</a>. The Hurwicz Criterion says that you should choose an option that maximises what I'll call its <i>Hurwicz score</i>, which is a particular weighted average of its best-case utility and its worst-case utility. A little more formally: We follow Hurwicz and let an option be a function $a$ from a set $W$ of possible states of the world to the real numbers $\mathbb{R}$. Now, you begin by setting the weight $0 \leq \alpha \leq 1$ you wish to assign to the best-case utility of an option, and then you assign the remaining weight $1-\alpha$ to its worst-case utility. Then the Hurwicz score of option $a$ is just $$H^\alpha(a) := \alpha \max_{w \in W} a(w) + (1-\alpha) \min_{w \in W} a(w)$$<br /><br />However, reading his other notes in the Cowles series that surround this brief three-page note, it's clear that Hurwicz's chief interest was not so much in this particular form of decision rule, but rather in any such rule that determines the optimal choices solely by looking at their best- and worst-case scenarios. The Hurwicz Criterion is one such rule, but there are others. You might, for instance, weight the best- and worst-cases not by fixed constant coefficients, but by coefficients that change with the minimum and maximum values, or change with the difference between them or with their ratio. One of the most interesting contributions of these papers that surround the one in which Hurwicz gives us his Criterion is a characterization of rules that depend only on best- and worst-case utilities. Hurwicz gave a rather inelegant initial version of that characterization in Cowles Commission Discussion Paper: Statistics No. 
370, published at the end of 1951 -- the title was <a href="https://cowles.yale.edu/sites/default/files/files/pub/cdp/s-0370.pdf" target="_blank">'Optimality Criteria for Decision-Making under Ignorance'</a>. Kenneth Arrow then seems to have helped clean it up, and they published the new version together in the <a href="https://www.cambridge.org/core/books/studies-in-resource-allocation-processes/appendix-an-optimality-criterion-for-decisionmaking-under-ignorance/7846B18137B686F377133D3C7AA4404A" target="_blank">Appendix</a> of their edited volume, to which they contributed most of the chapters, often with co-authors, <a href="https://www.cambridge.org/core/books/studies-in-resource-allocation-processes/B120D93AE7A55F249285BCA429E5EB87" target="_blank"><i>Studies in Resource Allocation Processes</i></a>. The version with Arrow is still reasonably involved, but the idea is quite straightforward, and it is remarkable how strong a restriction Hurwicz obtains from seemingly weak and plausible axioms. This really seems to me a case where axioms that seem quite innocuous on their own can combine in interesting ways to make trouble. So I thought it might be interesting to give a simplified version that has all the central ideas.<br /><br />Here's the framework:<br /><br /><i>Possibilities and possible worlds. </i>Let $\Omega$ be the set of possibilities. A possible world is a set of possibilities--that is, a subset of $\Omega$. And a set $W$ of possible worlds is a partition of $\Omega$. That is, $W$ presents the possibilities in $\Omega$ at a certain level of grain. So if $\Omega = \{\omega_1, \omega_2, \omega_3\}$, then $\{\{\omega_1\}, \{\omega_2\}, \{\omega_3\}\}$ is the most fine-grained set of possible worlds, but there are coarser-grained sets as well, such as $\{\{\omega_1, \omega_2\}, \{\omega_3\}\}$ or $\{\{\omega_1\}, \{\omega_2, \omega_3\}\}$. 
(This is not quite how Hurwicz understands the relationship between different sets of possible states of the world -- he talks of deleting worlds rather than clumping them together, but I think this formalization better captures his idea.)<br /><br /><i>Options</i>. For any set $W$ of possible worlds, an option defined on $W$ is simply a function from $W$ into the real numbers $\mathbb{R}$. So an option $a : W \rightarrow \mathbb{R}$ takes each world $w$ in $W$ and assigns a utility $a(w)$ to it. (Hurwicz refers to von Neumann and Morgenstern to motivate the assumption that utilities can be measured by real numbers.)<br /><br /><i>Preferences</i>. For any set $W$ of possible worlds, there is a preference relation $\preceq_W$ over the options defined on $W$. (Hurwicz states his result in terms of optimal choices rather than preferences. But I think it's a bit easier to see what's going on if we state it in terms of preferences. There's then a further question as to which options are optimal given a particular preference ordering, but we needn't address that here.)<br /><br />Hurwicz's goal was to lay down conditions on these preference relations such that the following would hold:<br /><br /><b>Hurwicz's Rule </b>Suppose $a$ and $a'$ are options defined on $W$. Then<br /><br /><b>(H1)</b> If<br /><ul><li>$\min_w a(w) = \min_w a'(w)$</li><li>$\max_w a(w) = \max_w a'(w)$</li></ul>then $a \sim_W a'$. That is, you should be indifferent between any two options with the same maximum and minimum.<br /><br /><b>(H2)</b> If<br /><ul><li>$\min_w a(w) < \min_w a'(w)$</li><li>$\max_w a(w) < \max_w a'(w)$</li></ul>then $a \prec_W a'$. That is, you should prefer one option to another if the worst case of the first is better than the worst case of the second and the best case of the first is better than the best case of the second.<br /><br />Here are the four conditions or axioms:<br /><br /><b>(A1)</b> <b>Structure</b> $\preceq_W$ is reflexive and transitive. 
<br /><br /><b>(A2) Weak Dominance </b><br /><ol><li>If $a(w) \leq a'(w)$ for all $w$ in $W$, then $a \preceq_W a'$.</li><li>If $a(w) < a'(w)$ for all $w$ in $W$, then $a \prec_W a'$.</li></ol>This is a reasonably weak version of a standard norm on preferences.<br /><br /><b>(A3) Permutation Invariance </b>For any set of worlds $W$ and any options $a, a'$ defined on $W$, if $\pi : W \cong W$ is a permutation of the worlds in $W$ and if $a'(w) = a(\pi(w))$ for all $w$ in $W$, then $a \sim_W a'$.<br /><br />This just says that it doesn't matter to you which worlds receive which utilities -- all that matters are the utilities received. <br /><br /><b>(A4) Coarse-Graining Invariance </b>Suppose $W = \{\ldots, w_1, w_2, \ldots\}$ is a set of possible worlds and suppose $a, a'$ are options on $W$ with $a(w_1) = a(w_2)$ and $a'(w_1) = a'(w_2)$. Then let $W' = \{\ldots, w_1 \cup w_2, \ldots\}$, so that $W'$ has the same worlds as $W$ except that, instead of $w_1$ and $w_2$, it has their union. And define options $b$ and $b'$ on $W'$ as follows: $b(w_1 \cup w_2) = a(w_1) = a(w_2)$ and $b'(w_1 \cup w_2) = a'(w_1) = a'(w_2)$, and $b(w) = a(w)$ and $b'(w) = a'(w)$ for all other worlds. Then $a \sim_W a'$ iff $b \sim_{W'} b'$.<br /><br />This says that if two options don't distinguish between two worlds, it shouldn't matter to you whether they are defined on a fine- or coarse-grained space of possible worlds.<br /><br />Then we have the following theorem:<br /><br /><b>Theorem</b> <b>(Hurwicz)</b> (A1) + (A2) + (A3) + (A4) $\Rightarrow$ (H1) + (H2).<br /><br />Here's the proof. Assume (A1) + (A2) + (A3) + (A4). First, we'll show that (H1) follows. We'll sketch the proof only for the case in which $W = \{w_1, w_2, w_3\}$, since that gives all the crucial moves. So denote an option on $W$ by a triple $(a(w_1), a(w_2), a(w_3))$. Now, suppose that $a$ and $a'$ are options defined on $W$ with the same minimum, $m$, and maximum, $M$. 
Let $n$ be the middle value of $a$ and $n'$ the middle value of $a'$.<br /><br />Now, first note that<br />$$(m, m, M) \sim_W (m, M, M)$$ After all, $(m, m, M) \sim_W (M, m, m)$ by Permutation Invariance. And, by Coarse-Graining Invariance, $(m, M, M) \sim_W (M, m, m)$ iff $(m, M) \sim_{W'} (M, m)$, where $W' = \{w_1, w_2 \cup w_3\}$. And, by Permutation Invariance and the reflexivity of $\sim_{W'}$, $(m, M) \sim_{W'} (M, m)$. So $(m, M, M) \sim_W (M, m, m) \sim_W (m, m, M)$, as required. And now we have, by previous results, Permutation Invariance, and Weak Dominance: <br />$$a \sim_W (m, n, M) \preceq_W (m, M, M) \sim_W (m, m, M) \preceq_W (m, n', M) \sim_W a'$$<br />and <br />$$a' \sim_W (m, n', M) \preceq_W (m, M, M) \sim_W (m, m, M) \preceq_W (m, n, M) \sim_W a$$ <br />And so, by transitivity, $a \sim_W a'$. That gives (H1).<br /><br />For (H2), suppose $a$ has worst case $m$, middle case $n$, and best case $M$, while $a'$ has worst case $m'$, middle case $n'$, and best case $M'$. And suppose $m < m'$ and $M < M'$. Then$$a \sim_W (m, n, M) \preceq_W (m, M, M) \sim_W (m, m, M) \prec_W (m', n', M') \sim_W a'$$as required. $\Box$<br /><br />In a follow-up blog post, I'd like to explore Hurwicz's conditions (A1-4) in more detail. I'm a fan of his approach, not least because I want to use something like his decision rule within the framework of accuracy-first epistemology to understand how we select our first credences -- our ur-priors or superbaby credences (see <a href="https://www.cambridge.org/core/journals/episteme/article/jamesian-epistemology-formalised-an-explication-of-the-will-to-believe/5DD3912B582124D812DFAC948CE75BF3" target="_blank">here</a>). But I now think Hurwicz's focus on only the worst-case and best-case scenarios is too restrictive. So I have to grapple with the theorem I've just presented. That's what I hope to do in the next post. But here's a quick observation. (A1-4), while plausible at first sight, sail very close to inconsistency. 
For instance, (A1), (A3), and (A4) are inconsistent when combined with a slight strengthening of (A2). Suppose we add the following to (A2) to give (A2$^\star$):<br /><br />3. If $a(w) \leq a'(w)$ for all $w$ in $W$ and $a(w) < a'(w)$ for some $w$ in $W$, then $a \prec_W a'$.<br /><br />Then we know from above that $(m, m, M) \sim_W (m, M, M)$, but (A2$^\star$) entails that $(m, m, M) \prec_W (m, M, M)$, which gives a contradiction.<br /><br />Richard Pettigrew<br /><br /><b>Update on updating -- or: a fall from favour</b> (posted 2020-07-06)<br /><br />For a PDF version of this post, click <a href="https://drive.google.com/file/d/1ITjFYDAloTZQkKl1MVVqZkwtxfWTzWhV/view?usp=sharing" target="_blank">here</a>. <br /><br />Life comes at you fast. Last week, I wrote <a href="https://m-phi.blogspot.com/2020/07/updating-by-minimizing-expected.html" target="_blank">a blogpost</a> extolling the virtues of the following scoring rule, which I called the enhanced log rule: $$\mathfrak{l}^\star_1(x) = -\log x + x \ \ \ \ \ \mbox{and}\ \ \ \ \ \ \ \mathfrak{l}^\star_0(x) = x$$I noted that it is strictly proper and therefore furnishes an accuracy dominance argument for Probabilism. And I showed that, if we restrict attention to credence functions defined over partitions, rather than full algebras, it is the unique strictly proper scoring rule that delivers Conditionalization when you ask for the posterior that minimizes expected inaccuracy with respect to the prior and under the constraint that the posterior credence in the evidence must be 1. But then Catrin Campbell-Moore asked the natural question: what happens when you focus attention instead on full algebras rather than partitions? And looking into this revealed that things don't look so rosy for the enhanced log score. 
Indeed, if we focus just on the algebra built over three possible worlds, we see that every strictly proper scoring rule delivers the same updating rule, and it is not Conditionalization.<br /><br />Let's see this in more detail. First, let $\mathcal{W} = \{w_1, w_2, w_3\}$ be our set of possible worlds. And let $\mathcal{F}$ be the algebra over $\mathcal{W}$. That is, $\mathcal{F}$ contains the singletons $\{w_1\}$, $\{w_2\}$, $\{w_3\}$, the pairs $\{w_1, w_2\}$, $\{w_1, w_3\}$, and $\{w_2, w_3\}$ and the tautology $\{w_1, w_2, w_3\}$. Now suppose that your prior credence function is $(p_1, p_2, p_3 = 1-p_1-p_2)$. And suppose that you learn evidence $E = \{w_1, w_2\}$. Then we want to find the posterior, among those that assign credence 1 to $E$, that minimizes expected inaccuracy. Such a posterior will have the form $(x, 1-x, 0)$. Now let $\mathfrak{s}$ be the strictly proper scoring rule by which you measure inaccuracy. Then you wish to minimize:<br />\begin{eqnarray*}<br />&& p_1[\mathfrak{s}_1(x) + \mathfrak{s}_0(1-x) + \mathfrak{s}_0(0) + \mathfrak{s}_1(x+(1-x)) + \mathfrak{s}_1(x+0) + \mathfrak{s}_0((1-x)+0)] + \\<br />&& p_2[\mathfrak{s}_0(x) + \mathfrak{s}_1(1-x) + \mathfrak{s}_0(0) + \mathfrak{s}_1(x+(1-x)) + \mathfrak{s}_0(x+0) + \mathfrak{s}_1((1-x)+0)] +\\<br />&& p_3[\mathfrak{s}_0(x) + \mathfrak{s}_0(1-x) + \mathfrak{s}_1(0) + \mathfrak{s}_0(x+(1-x)) + \mathfrak{s}_1(x+0) + \mathfrak{s}_1((1-x) +0)] <br />\end{eqnarray*}<br />Now, ignore the constant terms, since they do not affect the minima; replace $p_3$ with $1-p_1-p_2$; and group terms together. Then we get:<br />\begin{eqnarray*}<br />&& \mathfrak{s}_1(x)(1+p_1 - p_2) + \mathfrak{s}_1(1-x)(1-p_1 + p_2) + \\<br />&& \mathfrak{s}_0(x)(1-p_1 + p_2) + \mathfrak{s}_0(1-x)(1+p_1 - p_2)<br />\end{eqnarray*}<br />Now, divide through by 2, which again doesn't affect the minimization, and note that$$\frac{1+p_i-p_j}{2} = p_i + \frac{1-p_i-p_j}{2}$$. 
Then we have<br />\begin{eqnarray*}<br />&& (p_1 + \frac{1-p_1-p_2}{2})\mathfrak{s}_1(x) + (p_2 + \frac{1-p_1-p_2}{2})\mathfrak{s}_0(x) + \\<br />&& (p_2 + \frac{1-p_1-p_2}{2})\mathfrak{s}_1(1-x) + (p_1 + \frac{1-p_1-p_2}{2})\mathfrak{s}_0(1-x) <br />\end{eqnarray*} <br />Now, $\mathfrak{s}$ is strictly proper. And $p_2 + \frac{1 -p_1 -p_2}{2} = 1 - (p_1 + \frac{1-p_1-p_2}{2})$. So providing $p_1 + \frac{1-p_1-p_2}{2} \leq 1$ and $p_2 + \frac{1-p_1-p_2}{2} \leq 1$, the posterior that minimizes expected inaccuracy from the point of view of the prior and that assigns credence 1 to $E$ is $(x, 1-x, 0)$ where:$$x = p_1 + \frac{1-p_1-p_2}{2}\ \ \ \ \mbox{and}\ \ \ \ 1-x = p_2 + \frac{1-p_1-p_2}{2}$$And this is very much not Conditionalization. It turns out then, that no strictly proper scoring rule gives Conditionalization on full algebras in this manner.<br /><br />Richard Pettigrew<br /><br /><b>Updating by minimizing expected inaccuracy -- or: my new favourite scoring rule</b> (posted 2020-07-03, updated 2020-07-05)<br /><br />For a PDF version of this post, click <a href="https://drive.google.com/file/d/16gsOrCiW2zsSn8y4NEOe0F2JOW0VcIee/view?usp=sharing" target="_blank">here</a>. <br /><br />One of the central questions of Bayesian epistemology concerns how you should update your credences in response to new evidence you obtain. The proposal I want to discuss here belongs to an approach that consists of two steps. First, we specify the constraints that your evidence places on your posterior credences. 
Second, we specify a means by which to survey the credence functions that satisfy those constraints and pick one to adopt as your posterior.<br /><br />For instance, in the first step, we might say that when we learn a proposition $E$, we must become certain of it, and so it imposes the following constraint on our posterior credence function $Q$: $Q(E) = 1$. Or we might consider the sort of situation Richard Jeffrey discussed, where there is a partition $E_1, \ldots, E_m$ and credences $q_1, \ldots, q_m$ with $q_1 + \ldots + q_m = 1$ such that your evidence imposes the constraint: $Q(E_i) = q_i$, for $i = 1, \ldots, m$. Or the situation van Fraassen discussed, where your evidence constrains your posterior conditional credences, so that there is a credence $q$ and propositions $A$ and $B$ such that your evidence imposes the constraint: $Q(A|B) = q$.<br /><br />In the second step of the approach, on the other hand, we might follow objective Bayesians like Jon Williamson, Alena Vencovská, and Jeff Paris and say that, from among those credence functions that respect your evidence, you should pick the one that, on a natural measure of informational content, contains minimal information, and which thus goes beyond your evidence as little as possible (Paris & Vencovská 1990, Williamson 2010). Or we might follow what I call the method of minimal mutilation proposed by Persi Diaconis and Sandy Zabell and pick the credence function among those that respect the evidence that is closest to your prior according to some measure of divergence between probability functions <a href="https://amstat.tandfonline.com/doi/abs/10.1080/01621459.1982.10477893" target="_blank">(Diaconis & Zabell 1982)</a>. 
Or, you might proceed as Hannes Leitgeb and I suggested and pick the credence function that minimizes expected inaccuracy from the point of view of your prior, while satisfying the constraints the evidence imposes <a href="https://www.journals.uchicago.edu/doi/abs/10.1086/651318" target="_blank">(Leitgeb & Pettigrew 2010)</a>. In this post, I'd like to fix a problem with the latter proposal.<br /><br />We'll focus on the simplest case: you learn $E$ and this requires you to adopt a posterior $Q$ such that $Q(E) = 1$. This is also the case in which the norm governing it is least controversial. The largely undisputed norm in this case says that you should conditionalize your prior on your evidence, so that, if $P$ is your prior and $P(E) > 0$, then your posterior should be $Q(-) = P(-|E)$. That is, providing you assigned a positive credence to $E$ before you learned it, your credence in the proposition $X$ after learning $E$ should be your prior credence in $X$ conditional on $E$.<br /><br />In order to make the maths as simple as possible, let's assume you assign credences to a finite set of worlds $\{w_1, \ldots, w_n\}$, which forms a partition of logical space. Given a credence function $P$, we write $p_i$ for $P(w_i)$, and we'll sometimes represent $P$ by the vector $(p_1, \ldots, p_n)$. Let's suppose further that your measure of the inaccuracy of a credence function is $\mathfrak{I}$, which is generated additively from a scoring rule $\mathfrak{s}$. 
That is,<br /><ul><li>$\mathfrak{s}_1(x)$ measures the inaccuracy of credence $x$ in a truth;</li><li>$\mathfrak{s}_0(x)$ measures the inaccuracy of credence $x$ in a falsehood;</li><li>$\mathfrak{I}(P, w_i) = \mathfrak{s}_0(p_1) + \ldots + \mathfrak{s}_0(p_{i-1} ) + \mathfrak{s}_1(p_i) + \mathfrak{s}_0(p_{i+1} ) + \ldots + \mathfrak{s}_0(p_n)$.</li></ul>Hannes and I then proposed that, if $P$ is your prior, you should adopt as your posterior the credence function $Q$ such that<br /><ol><li>$Q(E) = 1$;</li><li>for any other credence function $Q^\star$ for which $Q^\star(E) = 1$, the expected inaccuracy of $Q$ by the lights of $P$ is less than the expected inaccuracy of $Q^\star$ by the lights of $P$. </li></ol>Throughout, we'll denote the expected inaccuracy of $Q$ by the lights of $P$ when inaccuracy is measured by $\mathfrak{I}$ as $\mathrm{Exp}_\mathfrak{I}(Q | P)$. Thus,<br />$$ \mathrm{Exp}_\mathfrak{I}(Q | P) = \sum^n_{i=1} p_i \mathfrak{I}(Q, w_i)$$<br />At this point, however, a problem arises. There are two inaccuracy measures that tend to be used in statistics and accuracy-first epistemology. 
The first is the <i>Brier inaccuracy measure</i> $\mathfrak{B}$, which is generated by the <i>quadratic scoring rule</i> $\mathfrak{q}$:<br />$$\mathfrak{q}_0(x) = x^2\ \ \ \mbox{and}\ \ \ \ \mathfrak{q}_1(x) = (1-x)^2$$<br />So<br />$$\mathfrak{B}(P, w_i) = 1-2p_i + \sum^n_{j=1} p_j^2$$<br />The second is the <i>local log inaccuracy measure</i> $\mathfrak{L}$, which is generated by what I'll call here the <i>basic log score</i> $\mathfrak{l}$:<br />$$\mathfrak{l}_0(x) = 0\ \ \ \ \mbox{and}\ \ \ \ \mathfrak{l}_1(x) = -\log x$$<br />So<br />$$\mathfrak{L}(P, w_i) = -\log p_i$$<br />The problem is that both have undesirable features for this purpose: the Brier inaccuracy measure does not deliver Conditionalization when you take the approach Hannes and I described; the local log inaccuracy measure does give Conditionalization, but while it is strictly proper in a weak sense, the basic log score that generates it is not; and relatedly, but more importantly, the local log inaccuracy measure does not furnish an accuracy dominance argument for Probabilism. Let's work through this in more detail.<br /><br />According to the standard Bayesian norm of Conditionalization, if $P$ is your prior and $P(E) > 0$, then your posterior after learning at most $E$ should be $Q(-) = P(-|E)$. That is, when I remove all credence from the worlds at which my evidence is false, in order to respect my new evidence, I should redistribute it to the worlds at which my evidence is true <i>in proportion to my prior credence in those worlds</i>.<br /><br />Now suppose that I update instead by picking the posterior $Q$ for which $Q(E) = 1$ and that minimizes expected inaccuracy as measured by the Brier inaccuracy measure. 
Then, at least in most cases, when I remove all credence from the worlds at which my evidence is false, in order to respect my new evidence, I redistribute it <i>equally to the worlds at which my evidence is true</i>---not in proportion to my prior credence in those worlds, but equally to each, regardless of my prior attitude.<br /><br />Here's a quick illustration in the case in which you distribute your credences over three worlds, $w_1$, $w_2$, $w_3$ and the proposition you learn is $E = \{w_1, w_2\}$. Then we want to find a posterior $Q = (x, 1-x, 0)$ with minimal expected Brier inaccuracy from the point of view of the prior $P = (p_1, p_2, p_3)$. Then:<br />\begin{eqnarray*}<br />& & \mathrm{Exp}_\mathfrak{B}((x, 1-x, 0) | (p_1, p_2, p_3))\\<br />& = & p_1[(1-x)^2 + (1-x)^2 + 0^2] + p_2[x^2 + x^2 + 0^2] +p_3[x^2 + (1-x)^2 + 1]<br />\end{eqnarray*} <br />Differentiating this with respect to $x$, and using $p_1 + p_2 + p_3 = 1$, gives $$-4p_1 + 4x - 2p_3$$ which equals 0 iff $$x = p_1 + \frac{p_3}{2}$$ Thus, providing $p_1 + \frac{p_3}{2}, p_2 + \frac{p_3}{2} \leq 1$, the posterior that minimizes expected Brier inaccuracy while respecting the evidence is $$Q = \left (p_1 + \frac{p_3}{2}, p_2 + \frac{p_3}{2}, 0 \right )$$ And this is typically not the same as Conditionalization demands.<br /><br />Now turn to the local log measure, $\mathfrak{L}$. Here, things are actually a little complicated by the fact that $-\log 0 = \infty$. After all, $$\mathrm{Exp}_\mathfrak{L}((x, 1-x, 0)|(p_1, p_2, p_3)) = -p_1\log x - p_2 \log (1-x) - p_3 \log 0$$ and this is $\infty$ regardless of the value of $x$. So every value of $x$ minimizes, and indeed maximizes, this expectation. As a result, we have to look at the situation in which the evidence imposes the constraint $Q(E) = 1-\varepsilon$ for $\varepsilon > 0$, and ask what happens as we let $\varepsilon$ approach 0. 
Then<br />$$\mathrm{Exp}_\mathfrak{L}((x, 1-\varepsilon-x, \varepsilon)|(p_1, p_2, p_3)) = -p_1\log x - p_2 \log (1-\varepsilon-x) - p_3 \log \varepsilon$$<br />Differentiating this with respect to $x$ gives <br />$$-\frac{p_1}{x} + \frac{p_2}{1-\varepsilon - x}$$<br />which equals 0 iff <br />$$x = (1-\varepsilon) \frac{p_1}{p_1 + p_2}$$<br />And this approaches Conditionalization as $\varepsilon$ approaches 0. So, in this sense, as Ben Levinstein pointed out, the local log inaccuracy measure gives Conditionalization, and indeed Jeffrey Conditionalization or Probability Kinematics as well <a href="https://doi.org/10.1086/666064" target="_blank">(Levinstein 2012)</a>. So far, so good. <br /><br />However, throughout this post, and in the two derivations above---the first concerning the Brier inaccuracy measure and the second concerning the local log inaccuracy measure---we assumed that all credence functions must be probability functions. That is, we assumed Probabilism, the other central tenet of Bayesianism alongside Conditionalization. Now, if we measure inaccuracy using the Brier measure, we can justify that, for then we have the accuracy dominance argument, which originated mathematically with Bruno de Finetti, and was given its accuracy-theoretic philosophical spin by Jim Joyce (de Finetti 1974, Joyce 1998). That is, if your prior or your posterior isn't a probability function, then there is an alternative that is and that is guaranteed to be more Brier-accurate. However, the local log inaccuracy measure doesn't furnish us with any such argument. One very easy way to see this is to note that the non-probabilistic credence function $(1, 1, \ldots, 1)$ over $\{w_1, \ldots, w_n\}$ dominates <i>all other credence functions</i> according to the local log measure. After all, $\mathfrak{L}((1, 1, \ldots, 1), w_i) = -\log 1 = 0$, for $i = 1, \ldots, n$, while $\mathfrak{L}(P, w_i) > 0$ for any $P$ with $p_i < 1$ for some $i = 1, \ldots, n$. 
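<br /><br />To make the dominance claim concrete, here is a minimal numerical sketch in Python. The function names <code>local_log_inaccuracy</code> and <code>dominates</code>, and the sample prior $(0.5, 0.3, 0.2)$, are illustrative assumptions of mine rather than anything taken from a library or from the papers cited; the sketch simply checks that $(1, 1, 1)$ has local log inaccuracy 0 at every world, while the sample probabilistic credence function has strictly positive inaccuracy at every world.

```python
import math


def local_log_inaccuracy(credences, i):
    """Local log inaccuracy L(P, w_i) = -log p_i (i is a 0-based world index)."""
    p_i = credences[i]
    # -log 0 is treated as infinity, following the convention used in the text.
    # log(1 / p_i) equals -log(p_i) and avoids a negative zero when p_i == 1.
    return math.inf if p_i == 0 else math.log(1.0 / p_i)


def dominates(c1, c2):
    """True if c1 is no more inaccurate than c2 at every world and strictly
    less inaccurate at some world, with inaccuracy measured by the local log measure."""
    scores1 = [local_log_inaccuracy(c1, i) for i in range(len(c1))]
    scores2 = [local_log_inaccuracy(c2, i) for i in range(len(c2))]
    never_worse = all(s1 <= s2 for s1, s2 in zip(scores1, scores2))
    sometimes_better = any(s1 < s2 for s1, s2 in zip(scores1, scores2))
    return never_worse and sometimes_better


all_ones = (1.0, 1.0, 1.0)       # non-probabilistic: the credences sum to 3
sample_prior = (0.5, 0.3, 0.2)   # hypothetical probabilistic credence function

print(dominates(all_ones, sample_prior))   # prints True
print(dominates(sample_prior, all_ones))   # prints False
```

Any other probabilistic credence function over three worlds could be substituted for the sample prior; the first check should come out the same way, since at least one of its credences must fall below 1.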
<br /><br />Another related issue is that the scoring rule $\mathfrak{l}$ that generates $\mathfrak{L}$ is not strictly proper. A scoring rule $\mathfrak{s}$ is said to be strictly proper if every credence expects itself to be the best. That is, for any $0 \leq p \leq 1$, $p\mathfrak{s}_1(x) + (1-p) \mathfrak{s}_0(x)$ is minimized, as a function of $x$, at $x = p$. But $-p\log x + (1-p)0 = -p\log x$ is always minimized, as a function of $x$, at $x = 1$, where $-p\log x = 0$. Similarly, an inaccuracy measure $\mathfrak{I}$ is strictly proper if, for any probabilistic credence function $P$, $\mathrm{Exp}_\mathfrak{I}(Q | P) = \sum^n_{i=1} p_i \mathfrak{I}(Q, w_i)$ is minimized, as a function of $Q$, at $Q = P$. Now, in this sense, $\mathfrak{L}$ is not strictly proper, since $\mathrm{Exp}_\mathfrak{L}(Q | P) = \sum^n_{i=1} p_i \mathfrak{L}(Q, w_i)$ is minimized, as a function of $Q$, at $Q = (1, 1, \ldots, 1)$, as noted above. Nonetheless, if we restrict our attention to probabilistic $Q$, $\mathrm{Exp}_\mathfrak{L}(Q | P) = \sum^n_{i=1} p_i \mathfrak{L}(Q, w_i)$ is minimized at $Q = P$. In sum: $\mathfrak{L}$ is only a reasonable inaccuracy measure to use if you already have an independent motivation for Probabilism. But accuracy-first epistemology does not have that luxury. One of the central roles of an inaccuracy measure in that framework is to furnish an accuracy dominance argument for Probabilism.<br /><br />So, we ask: is there a scoring rule $\mathfrak{s}$ and resulting inaccuracy measure $\mathfrak{I}$ such that:<br /><ol><li>$\mathfrak{s}$ is a strictly proper scoring rule;</li><li>$\mathfrak{I}$ is a strictly proper inaccuracy measure; </li><li>$\mathfrak{I}$ furnishes an accuracy dominance argument for Probabilism;</li><li>If $P(E) > 0$, then $\mathrm{Exp}_\mathfrak{I}(Q | P)$ is minimized, as a function of $Q$ among credence functions for which $Q(E) = 1$, at $Q(-) = P(-|E)$.</li></ol>Straightforwardly, (1) entails (2). 
And, by a result due to <a href="https://ieeexplore.ieee.org/document/5238758" target="_blank">Predd, et al.</a>, (1) also entails (3) (Predd 2009). So we seek $\mathfrak{s}$ with (1) and (4). Theorem 1 below shows that essentially only one such $\mathfrak{s}$ and $\mathfrak{I}$ exist and they are what I will call the <i>enhanced log score</i> $\mathfrak{l}^\star$ and the <i>enhanced log inaccuracy measure $\mathfrak{L}^\star$</i>:<br />$$\mathfrak{l}^\star_0(x) = x\ \ \ \ \mathrm{and}\ \ \ \ \mathfrak{l}^\star_1(x) = -\log x + x$$<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-7vxZ5CP5wRk/Xv7hM5GV8MI/AAAAAAAAEyU/igefz95p5w8-ofxcuRaOr8TRgRkQywZowCLcBGAsYHQ/s1600/enhanced-log.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="444" data-original-width="720" height="246" src="https://1.bp.blogspot.com/-7vxZ5CP5wRk/Xv7hM5GV8MI/AAAAAAAAEyU/igefz95p5w8-ofxcuRaOr8TRgRkQywZowCLcBGAsYHQ/s400/enhanced-log.jpeg" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The enhanced log score $\mathfrak{l}^\star$. $\mathfrak{s}_0$ in yellow; $\mathfrak{s}_1$ in blue.</td></tr></tbody></table><br /><br />Before we state and prove the theorem, there are some features of this scoring rule and its resulting inaccuracy measure that are worth noting. Juergen Landes has identified this scoring rule for a different purpose <a href="https://doi.org/10.1016/j.ijar.2015.05.007" target="_blank">(Proposition 9.1, Landes 2015)</a>.<br /><br /><br /><b>Proposition 1 </b><i>$\mathfrak{l}^\star$ is strictly proper</i>.<br /><br /><i>Proof.</i> Suppose $0 \leq p \leq 1$. 
Then<br />$$\frac{d}{dx} \left [ p\mathfrak{l}^\star_1(x) + (1-p)\mathfrak{l}^\star_0(x) \right ] = \frac{d}{dx} \left [ p[-\log x + x] + (1-p)x \right ] = -\frac{p}{x} + 1 = 0$$ iff $p = x$. $\Box$<br /><br /><b>Proposition 2</b> <i>If $P$ is non-probabilistic, then $P^\star = \left (\frac{p_1}{\sum_k p_k}, \ldots, \frac{p_n}{\sum_k p_k} \right )$ accuracy dominates $P = (p_1, \ldots, p_n)$</i>.<br /><br /><i>Proof</i>. $$\mathfrak{L}^\star(P^\star, w_i) = -\log\left ( \frac{p_i}{\sum_k p_k} \right ) + 1 = -\log p_i + \log\sum_k p_k + 1$$ and $$\mathfrak{L}^\star(P, w_i) = -\log p_i + \sum_k p_k$$ But $\log x + 1 \leq x$, for all $x> 0$, with equality iff $x = 1$. So, if $P$ is non-probabilistic, then $\sum_k p_k \neq 1$ and $$\mathfrak{L}^\star(P^\star, w_i) < \mathfrak{L}^\star(P, w_i)$$ for $i = 1, \ldots, n$. $\Box$<br /><br /><b>Proposition 3</b> <i>If $P$ is probabilistic, $\mathfrak{L}^\star(P, w_i) = 1 + \mathfrak{L}(P, w_i)$</i>.<br /><br /><i>Proof</i>.<br />\begin{eqnarray*}<br />\mathfrak{L}^\star(P, w_i) & = & p_1 + \ldots + p_{i-1} + (-\log p_i + p_i ) + p_{i+1} + \ldots + p_n \\<br />& = & -\log p_i + 1 \\<br />& = & 1 + \mathfrak{L}(P, w_i) <br />\end{eqnarray*} <br /> $\Box$<br /><br /><b>Corollary 1</b> <i>If $P$, $Q$ are probabilistic, then</i><br />$$\mathrm{Exp}_{\mathfrak{L}^\star}(Q | P) = 1 + \mathrm{Exp}_\mathfrak{L}(Q | P)$$<br /><br /><i>Proof</i>. By Proposition 3. $\Box$<br /><br /><b>Corollary 2 </b><i>Suppose $E_1, \ldots, E_m$ is a partition and $0 \leq q_1, \ldots, q_m \leq 1$ with $\sum^m_{i=1} q_i = 1$. Then, among $Q$ for which $Q(E_i) = q_i$ for $i = 1, \ldots, m$, $\mathrm{Exp}_{\mathfrak{L}^\star}(Q |P)$ is minimized at the Jeffrey Conditionalization posterior $Q(-) = \sum^m_{i=1} q_iP(-|E_i)$</i>.<br /><br /><i>Proof</i>. This follows from Corollary 1 and Theorem 5.1 from (Diaconis & Zabell 1982). 
$\Box$<br /><br />Having seen $\mathfrak{l}^\star$ and $\mathfrak{L}^\star$ in action, let's see that they are unique in having this combination of features.<br /><br /><b>Theorem 1</b> <i>Suppose $\mathfrak{s}$ is a strictly proper scoring rule and $\mathfrak{I}$ is the inaccuracy measure it generates. And suppose that, for any $\{w_1, \ldots, w_n\}$ and any $E \subseteq \{w_1, \ldots, w_n\}$, and any probabilistic credence function $P$, the probabilistic credence function $Q$ that minimizes the expected inaccuracy of $Q$ with respect to $P$ with the constraint $Q(E) = 1$, and when inaccuracy is measured by $\mathfrak{I}$, is $Q(-) = P(-|E)$. Then the scoring rule is</i><br /><i>$$\mathfrak{s}_1(x) = -\log x +x\ \ \ \ \mbox{and}\ \ \ \ \mathfrak{s}_0(x) = x$$ or any affine transformation of this</i>.<br /><br /><i>Proof. </i>First, we appeal to the following lemma (Proposition 2, Predd, et al. 2009):<br /><br /><b>Lemma 1</b><br /><br />(i) <i>Suppose $\mathfrak{s}$ is a continuous strictly proper scoring rule. Then define$$\varphi_\mathfrak{s}(x) = -x\mathfrak{s}_1(x) - (1-x)\mathfrak{s}_0(x)$$Then $\varphi_\mathfrak{s}$ is differentiable on $(0, 1)$ and convex on $[0, 1]$ and $$\mathrm{Exp}_\mathfrak{I}(Q | P) - \mathrm{Exp}_\mathfrak{I}(P | P) = \sum^n_{i=1} \varphi_\mathfrak{s}(p_i) - \varphi_\mathfrak{s}(q_i) - \varphi_\mathfrak{s}^\prime (q_i)(p_i - q_i)$$</i> (ii)<i> Suppose $\varphi$ is differentiable on $(0, 1)$ and convex on $[0, 1]$. Then let</i><br /><ul><li><i>$\mathfrak{s}^\varphi_1(x) = - \varphi(x) - \varphi'(x)(1-x)$ </i></li><li><i>$\mathfrak{s}^\varphi_0(x) = - \varphi(x) - \varphi'(x)(0-x)$</i></li></ul><i>Then $\mathfrak{s}^\varphi$ is a strictly proper scoring rule.</i><br /><i><br /></i><i>Moreover, $\mathfrak{s}^{\varphi_\mathfrak{s}} = \mathfrak{s}$.</i><br /><br />Now, let's focus on $\{w_1, w_2, w_3, w_4\}$ and let $E = \{w_1, w_2, w_3\}$. Let $p_1 = a$, $p_2 = b$, $p_3 = c$. 
Then we wish to minimize<br />$$\mathrm{Exp}_\mathfrak{I}((x, y, 1-x-y, 0) | (a, b, c, 1-a-b-c))$$<br />Now, by Lemma 1 (writing $\varphi$ for $\varphi_\mathfrak{s}$, and up to a term for $w_4$ that does not depend on $x$ or $y$), <br />\begin{eqnarray*}<br />&& \mathrm{Exp}_\mathfrak{I}((x, y, 1-x-y, 0) | (a, b, c, 1-a-b-c)) \\<br />& = & \varphi(a) - \varphi(x) - \varphi'(x)(a-x)\\<br />& + & \varphi(b) - \varphi(y) - \varphi'(y)(b-y) \\ <br />& + & \varphi(c) - \varphi(1-x-y) - \varphi'(1-x-y)(c - (1-x-y)) \\<br />& + & \mathrm{Exp}_\mathfrak{I}((a, b, c, 1-a-b-c) | (a, b, c, 1-a-b-c)) <br />\end{eqnarray*} <br />Thus:<br />\begin{eqnarray*}<br />&& \frac{\partial}{\partial x} \mathrm{Exp}_\mathfrak{I}((x, y, 1-x-y, 0) | (a, b, c, 1-a-b-c))\\<br />& = & \varphi''(x)(x-a) - ((1-x-y) - c) \varphi''(1-x-y)<br />\end{eqnarray*} <br />and<br />\begin{eqnarray*}<br />&& \frac{\partial}{\partial y} \mathrm{Exp}_\mathfrak{I}((x, y, 1-x-y, 0) | (a, b, c, 1-a-b-c))\\<br />& = & \varphi''(y)(y-b) - ((1-x-y) - c) \varphi''(1-x-y)<br />\end{eqnarray*} <br />which are both 0 iff$$\varphi''(x)(x-a) = \varphi''(y)(y-b) = ((1-x-y) - c) \varphi''(1-x-y)$$ Now, suppose this is true for $x = \frac{a}{a+b+c}$ and $y = \frac{b}{a + b+ c}$. Then, for all $0 \leq a, b, c \leq 1$ with $a + b + c \leq 1$, $$a\varphi'' \left ( \frac{a}{a+b+c} \right ) = b\varphi'' \left ( \frac{b}{a+b+c} \right ) $$<br />We now wish to show that, for some constant $k$, $\varphi''(x) = \frac{k}{x}$ for all $0 < x < 1$. If we manage that, then it follows that $\varphi'(x) = k\log x + m$ and $\varphi(x) = kx\log x + (m-k)x$. And we know from Lemma 1:<br />\begin{eqnarray*}<br />& & \mathfrak{s}_0(x) \\<br />& = & - \varphi(x) - \varphi'(x)(0-x) \\<br />& = & - [kx\log x + (m-k)x] - [k\log x + m](0-x) \\<br />& = & kx<br />\end{eqnarray*}<br />and<br />\begin{eqnarray*}<br />&& \mathfrak{s}_1(x) \\<br />& = & - \varphi(x) - \varphi'(x)(1-x) \\<br />& = & - [kx\log x + (m-k)x] - [k\log x + m](1-x) \\<br />& = & -k\log x + kx - m <br />\end{eqnarray*}<br />Now, first, let $f(x) = \varphi''\left (\frac{1}{x} \right )$. 
Thus, it will suffice to prove that $f(x) = kx$ for some constant $k$. For then $\varphi''(x) = \varphi''\left (\frac{1}{\frac{1}{x}} \right ) = f \left ( \frac{1}{x} \right ) = \frac{k}{x}$, as required. And to prove $f(x) = kx$, we need only show that $f'(x)$ is a constant function: if $f(x) = kx + d$, then the identity displayed next gives $ad = bd$ whenever $a \neq b$, and so $d = 0$. We know that, for all $0 \leq a, b, c \leq 1$ with $a + b + c \leq 1$, we have<br />$$a f \left ( \frac{a + b + c}{a} \right ) = bf \left ( \frac{a + b + c}{b} \right )$$<br />So$$<br />\frac{d}{dx} a f \left ( \frac{a + b + x}{a} \right ) = \frac{d}{dx} bf \left ( \frac{a + b + x}{b} \right )<br />$$So, for all $0 \leq a, b, c \leq 1$ with $a + b + c \leq 1$<br />$$<br />f'\left (\frac{a+b+c}{a} \right ) = f'\left (\frac{a + b + c}{b} \right )<br />$$We now show that, for all $x > 1$, $f'(x) = f'(2)$, which will suffice to show that it is constant. First, we consider $2 \leq x$. Then let<br />$$a = \frac{1}{x}\ \ \ \ \ b = \frac{1}{2}\ \ \ \ \ c = \frac{1}{2}-\frac{1}{x}$$<br />Then<br />$$f'(x) = f'\left (\frac{a + b + c}{a} \right ) = f'\left (\frac{a + b + c}{b} \right ) = f'(2)$$<br />Second, consider $1 < x < 2$. Then pick $2 \leq y$ such that $\frac{1}{x} + \frac{1}{y} \leq 1$. Then let<br />$$a = \frac{1}{x}\ \ \ \ \ b = \frac{1}{y}\ \ \ \ \ c = 1 - \frac{1}{x} - \frac{1}{y}$$<br />Then<br />$$f'(x) = f'\left (\frac{a + b + c}{a} \right ) = f'\left (\frac{a + b + c}{b} \right ) = f'(y) = f'(2)$$<br />as required. $\Box$<br /><br />Posted by Richard Pettigrew (http://www.blogger.com/profile/07828399117450825734)<br /><br />Deterministic updating and the symmetry argument for Conditionalization (posted 2019-12-06)<br /><br />According to the Bayesian, when I learn a proposition to which I assign a positive credence, I should update my credences so that my new unconditional credence in a proposition is my old conditional credence in that proposition conditional on the proposition I learned. 
Thus, if $c$ is my credence function before I learn $E$, and $c'$ is my credence function afterwards, and $c(E) > 0$, then it ought to be the case that $$c'(-) = c(-|E) := \frac{c(-\ \&\ E)}{c(E)}$$ There are many arguments for this Bayesian norm of updating. Some pay attention to the pragmatic costs of updating any other way (Brown 1976; Lewis 1999); some pay attention to the epistemic costs, which are spelled out in terms of the accuracy of the credences that result from the updating plans (Greaves & Wallace 2006; Briggs & Pettigrew 2018); others show that updating as the Bayesian requires, and only updating in that way, preserves as much as possible about the prior credences while still respecting the new evidence (Diaconis & Zabell 1982; Dietrich, List, and Bradley 2016). And then there are the symmetry arguments that are our focus here (Hughes & van Fraassen 1985; van Fraassen 1987; Grove & Halpern 1998).<br /><br />In <a href="https://link.springer.com/article/10.1007/s11098-019-01377-y" target="_blank">a recent paper</a>, I argued that the pragmatic and epistemic arguments for Bayesian updating are based on an unwarranted assumption, which I called <i>Deterministic Updating</i>. An <i>updating plan</i> says how you'll update in response to a specific piece of evidence. Such a plan is <i>deterministic</i> if there's a single credence function that it says you'll adopt in response to that evidence, rather than a range of different credence functions that you might adopt in response. Deterministic Updating says that your updating plan for a particular piece of evidence should be deterministic. 
That is, if $E$ is a proposition you might learn, your plan for responding to receiving $E$ as evidence should take the form:<br /><ul><li><i>If I learn $E$, I'll adopt $c'$</i> </li></ul>rather than the form:<br /><ul><li><i>If I learn $E$, I might adopt $c'$, I might adopt $c^+$, and I might adopt $c^*$</i>.</li></ul>Here, I want to show that the symmetry arguments make the same assumption.<br />Let's start by laying out the symmetry argument. Suppose $W$ is a set of possible worlds, and $F$ is an algebra over $W$. Then an <i>updating plan</i> on $M = (W, F)$ is a function $U^M$ that takes a credence function $P$ defined on $F$ and a proposition $E$ in $F$ and returns the set of credence functions that the updating plan endorses as responses to learning $E$ for those with credence function $P$. Then we impose three conditions on a family of updating plans $U$.<br /><br /><b>Deterministic Updating</b> This says that an updating plan should endorse at most one credence function as a response to learning a given piece of evidence. That is, for any $M = (W, F)$ and $E$ in $F$, $U^M$ endorses at most one credence function as a response to learning $E$. That is, $|U^M(P, E)| \leq 1$ for all $P$ on $F$ and $E$ in $F$.<br /><br /><b>Certainty</b> This says that any credence function that an updating plan endorses as a response to learning $E$ must be certain of $E$. That is, for any $M = (W, F)$, $P$ on $F$ and $E$ in $F$, if $P'$ is in $U^M(P, E)$, then $P'(E) = 1$.<br /><br /><b>Symmetry</b> This condition requires a bit more work to spell out. Very roughly, it says that the way that an updating plan would have you update should not be sensitive to the way the possibilities are represented. More precisely: Let $M = (W, F)$ and $M' = (W', F')$. Suppose $f : W \rightarrow W'$ is a surjective function. That is, for each $w'$ in $W'$, there is $w$ in $W$ such that $f(w) = w'$. And suppose for each $X$ in $F'$, $f^{-1}(X) = \{w \in W | f(w) \in X\}$ is in $F$. 
Then the worlds in $W'$ are coarse-grained versions of the worlds in $W$, and the propositions in $F'$ are coarse-grained versions of those in $F$. Now, given a credence function $P$ on $F$, let $f(P)$ be the credence function over $F'$ such that $f(P)(X) = P(f^{-1}(X))$. Then Symmetry requires that the credence functions that result from updating $f(P)$ by $E'$ in $F'$ using $U^{M'}$ be the image under $f$ of the credence functions that result from updating $P$ on $f^{-1}(E')$ using $U^M$. That is, $U^{M'}(f(P), E') = f(U^M(P, f^{-1}(E')))$.<br /><br />Now, van Fraassen proves the following theorem, though he doesn't phrase it like this because he assumes Deterministic Updating in his definition of an updating rule:<br /><br /><b>Theorem (van Fraassen)</b> <i>If $U$ satisfies Deterministic Updating, Certainty, and Symmetry, then $U$ is the conditionalization updating plan. That is, if $M = (W, F)$, $P$ is defined on $F$ and $E$ is in $F$ with $P(E) > 0$, then $U^M(P, E)$ contains only one credence function $P'$ and $P'(-) = P(-|E)$.</i><br /><br />The problem is that, while Certainty is entirely uncontroversial and Symmetry is very plausible, there is no particularly good reason to assume Deterministic Updating. But the argument cannot go through without it. To see this, consider the following updating rule:<br /><ul><li>If $0 < P(E) < 1$, then $V^M(P, E) = \{v_w | w \in W\ \&\ w \in E\}$, where $v_w$ is the credence function on $F$ such that $v_w(X) = 1$ if $w$ is in $X$, and $v_w(X) = 0$ if $w$ is not in $X$ ($v_w$ is sometimes called the <i>valuation function</i> for $w$, or the <i>omniscient credence function</i> at $w$).</li><li>If $P(E) = 1$, then $V^M(P, E) = \{P\}$.</li></ul>That is, if $P$ is not already certain of $E$, then $V^M$ takes any credence function on $F$ and any proposition in $F$ and returns the set of valuation functions for the worlds in $W$ at which that proposition is true. 
Otherwise, it keeps $P$ unchanged.<br /><br />It is easy to see that $V$ satisfies Certainty, since $v_w(E) = 1$ for each $w$ in $E$. To see that $V$ satisfies Symmetry, the crucial fact is that $f(v_w) = v_{f(w)}$. First, take a credence function in $V^{M'}(f(P), E')$: that is, $v_{w'}$ for some $w'$ in $E'$. Since $f$ is surjective, there is some $w$ in $W$ with $f(w) = w'$. Then $w$ is in $f^{-1}(E')$ and so $v_w$ is in $V^M(P, f^{-1}(E'))$. And $f(v_w) = v_{f(w)} = v_{w'}$, so $v_{w'}$ is in $f(V^M(P, f^{-1}(E')))$. Next, take a credence function in $f(V^M(P, f^{-1}(E')))$. That is, $f(v_w)$ for some $w$ in $f^{-1}(E')$. Then $f(v_w) = v_{f(w)}$ and thus $f(v_w)$ is in $V^{M'}(f(P), E')$, as required.<br /><br />So $V$ satisfies Certainty and Symmetry, but it is not the Bayesian updating rule.<br /><br />Now, perhaps there is some further desirable condition that $V$ fails to meet? Perhaps. And it's difficult to prove a negative existential claim. But one thing we can do is to note that $V$ satisfies all the conditions on updating plans on sets of probabilities that Grove & Halpern explore as they try to extend van Fraassen's argument from the case of precise credences to the case of imprecise credences. All, that is, except Deterministic Updating, which they also impose. Here they are:<br /><br /><b>Order Invariance</b> This says that updating first on $E$ and then on $E \cap F$ should result in the same posteriors as updating first on $F$ and then on $E \cap F$. $V$ satisfies this because, either way, you end up with $$V^M(P, E \cap F) = \{v_w : w \in W\ \&\ w \in E \cap F\}$$<br /><br /><b>Stationarity</b> This says that updating on $E$ should have no effect if you are already certain of $E$. That is, if $P(E) = 1$, then $U^M(P, E) = \{P\}$. The second clause of our definition of $V$ ensures this.<br /><br /><b>Non-Triviality </b>This says that there's some prior that is less than certain of the evidence such that updating it on the evidence leads to some posteriors that the updating plan endorses. 
That is, for some $M = (W, F)$, some $P$ on $F$, and some $E$ in $F$, $U^M(P, E) \neq \emptyset$. Indeed, $V$ will satisfy this for any $P$ and any $E$ with $0 < P(E) < 1$.<br /><br />So, in sum, it seems that van Fraassen's symmetry argument for Bayesian updating shares the same flaw as the pragmatic and epistemic arguments: it relies on Deterministic Updating, and yet that assumption is unwarranted.<br /><br /><h2>References</h2><ol class="BibliographyWrapper"><li class="Citation"><div class="CitationContent" id="CR1">Briggs, R. A., & Pettigrew, R. (2018). An accuracy-dominance argument for conditionalization. <em class="EmphasisTypeItalic ">Noûs</em>. <span class="ExternalRef"> <a href="https://doi.org/10.1111/nous.12258" rel="noopener" target="_blank"><span class="RefSource">https://doi.org/10.1111/nous.12258</span></a></span></div></li><li class="Citation"><div class="CitationContent" id="CR3">Brown, P. M. (1976). Conditionalization and expected utility. <em class="EmphasisTypeItalic ">Philosophy of Science</em>, <em class="EmphasisTypeItalic ">43</em>(3), 415–419.</div></li><li class="Citation"><div class="CitationContent" id="CR5">Diaconis, P., & Zabell, S. L. (1982). Updating subjective probability. <em class="EmphasisTypeItalic ">Journal of the American Statistical Association</em>, <em class="EmphasisTypeItalic ">77</em>(380), 822–830.</div></li><li class="Citation"><div class="CitationContent" id="CR6">Dietrich, F., List, C., & Bradley, R. (2016). Belief revision generalized: A joint characterization of Bayes’s and Jeffrey’s rules. <em class="EmphasisTypeItalic ">Journal of Economic Theory</em>, <em class="EmphasisTypeItalic ">162</em>, 352–371.</div></li><li class="Citation"><div class="CitationContent" id="CR8">Greaves, H., & Wallace, D. (2006). Justifying conditionalization: Conditionalization maximizes expected epistemic utility. <em class="EmphasisTypeItalic ">Mind</em>, <em class="EmphasisTypeItalic ">115</em>(459), 607–632.</div></li><li class="Citation"><div class="CitationContent" id="CR9">Grove, A. J., & Halpern, J. Y. (1998). Updating sets of probabilities. In <em class="EmphasisTypeItalic ">Proceedings of the 14th conference on uncertainty in AI</em> (pp. 173–182). San Francisco, CA: Morgan Kaufmann.</div></li><li class="Citation"><div class="CitationContent" id="CR14">Lewis, D. (1999). Why conditionalize? In <em class="EmphasisTypeItalic ">Papers in metaphysics and epistemology</em> (pp. 403–407). Cambridge: Cambridge University Press.</div></li></ol><br /><br />Posted by Richard Pettigrew (http://www.blogger.com/profile/07828399117450825734)<br /><br />CFP (Formal Philosophy, Gdansk) (posted 2019-06-27)<div dir="ltr" style="text-align: left;" trbidi="on"><br />The International Conference for Philosophy of Science and Formal Methods in Philosophy (CoPS-FaM-19) of the Polish Association for Logic and Philosophy of Science will take place on December 4-6, 2019 at the University of Gdansk (in cooperation with the University of Warsaw). 
Extended abstract submission: August 31, 2019.<br /><br />*Keynote speakers*<br />Hitoshi Omori (Ruhr-Universität Bochum)<br />Oystein Linnebo (University of Oslo)<br />Miriam Schoenfield (MIT)<br />Stanislav Speransky (St. Petersburg State University)<br />Katya Tentori (University of Trento)<br /><br />Full submission details available at:<br /><a href="http://lopsegdansk.blogspot.com/p/cops-fam-19-cfp.html">http://lopsegdansk.blogspot.com/p/cops-fam-19-cfp.html</a><br /><br /><br />*Programme Committee*<br />Patrick Blackburn (University of Roskilde)<br />Cezary Cieśliński (University of Warsaw)<br />Matteo Colombo (Tilburg University)<br />Juliusz Doboszewski (Harvard University)<br />David Fernandez Duque (Ghent University)<br />Benjamin Eva (University of Konstanz)<br />Benedict Eastaugh (LMU Munich)<br />Federico Faroldi (Ghent University)<br />Michał Tomasz Godziszewski (University of Warsaw)<br />Valentin Goranko (Stockholm University)<br />Rafał Gruszczyński (Nicolaus Copernicus University)<br />Alexandre Guay (University of Louvain)<br />Zalan Gyenis (Jagiellonian University)<br />Ronnie Hermens (Utrecht University)<br />Leon Horsten (University of Bristol)<br />Johannes Korbmacher (Utrecht University)<br />Louwe B. Kuijer (University of Liverpool)<br />Juergen Landes (LMU Munich)<br />Marianna Antonnutti Marfori (LMU Munich)<br />Frederik Van De Putte (Ghent University)<br />Jan-Willem Romeijn (University of Groningen)<br />Sonja Smets (University of Amsterdam)<br />Anthia Solaki (University of Amsterdam)<br />Jan Sprenger (University of Turin)<br />Stanislav Speransky (St. Petersburg State University)<br />Tom F. 
Sterkenburg (LMU Munich)<br />Johannes Stern (University of Bristol)<br />Allard Tamminga (University of Groningen)<br />Mariusz Urbański (Adam Mickiewicz University)<br />Erik Weber (Ghent University)<br />Leszek Wroński (Jagiellonian University)<br /><br />*Local Organizing Committee:*<br />Rafal Urbaniak<br />Patryk Dziurosz-Serafinowicz<br />Pavel Janda<br />Pawel Pawlowski<br />Paula Quinon<br />Weronika Majek<br />Przemek Przepiórka<br />Małgorzata Stefaniak</div><br /><br />Posted by Rafal Urbaniak (http://www.blogger.com/profile/10277466578023939272)<br /><br />What is conditionalization and why should we do it? (posted 2019-05-17)<br /><br />The three central tenets of traditional Bayesian epistemology are these:<br /><br /><b>Precision</b> Your doxastic state at a given time is represented by a credence function, $c$, which takes each proposition $X$ about which you have an opinion and returns a single numerical value, $c(X)$, that measures the strength of your belief in $X$. By convention, we let $0$ represent your minimal credence and we let $1$ represent your maximal credence.<br /><br /><b>Probabilism</b> Your credence function should be a probability function. That is, you should assign minimal credence (i.e. 0) to necessarily false propositions, maximal credence (i.e. 1) to necessarily true propositions, and your credence in the disjunction of two propositions whose conjunction is necessarily false should be the sum of your credences in the disjuncts.<br /><br /><b>Conditionalization</b> You should update your credences by conditionalizing on your total evidence.<br /><br />Note: Precision sets out the way in which doxastic states will be represented; Probabilism and Conditionalization are norms that are stated using that representation.<br /><br />Here, we will assume Precision and Probabilism and focus on Conditionalization. 
In particular, we are interested in what exactly the norm says; and, more specifically, which versions of the norm are supported by the standard arguments in its favour. That is, we are interested in what versions of the norm we can justify using the existing arguments. We will consider three versions of the norm; and we will consider four arguments in its favour. For each combination, we'll ask whether the argument can support the norm. In each case, we'll notice that the standard formulation relies on a particular assumption, which we call Deterministic Updating and which we formulate precisely below. We'll ask whether the argument really does rely on this assumption, or whether it can be amended to support the norm without that assumption. Let's meet the interpretations and the arguments informally now; then we'll be ready to dive into the details.<br /><br />Here are the three interpretations of Conditionalization. According to the first, Actual Conditionalization, Conditionalization governs your actual updating behaviour.<br /><br /><b>Actual Conditionalization (AC)</b> <br /><br />If<br /><ul><li>$c$ is your credence function at $t$ (we'll often refer to this as your prior);</li><li>the total evidence you receive between $t$ and $t'$ comes in the form of a proposition $E$ learned with certainty;</li><li>$c(E) > 0$;</li><li>$c'$ is your credence function at the later time $t'$ (we'll often refer to this as your posterior);</li></ul>then it should be the case that $c'(-) = c(-|E) = \frac{c(-\ \&\ E)}{c(E)}$. 
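<br /><br />To make the formula in AC concrete, here is a minimal sketch in Python. It assumes a finite set of worlds and represents a credence function as a dictionary from worlds to probabilities, with a proposition represented as a set of worlds. The function names conditionalize and credence, and the three-world prior, are hypothetical choices made for this illustration only.

```python
from fractions import Fraction


def credence(c, proposition):
    """Credence that c assigns to a proposition (a set of worlds)."""
    return sum(p for w, p in c.items() if w in proposition)


def conditionalize(prior, evidence):
    """Return the posterior c(-|E) = c(- & E) / c(E) over worlds.

    Raises ValueError when c(E) = 0, since AC says nothing in that case.
    """
    c_E = credence(prior, evidence)
    if c_E == 0:
        raise ValueError("c(E) = 0: conditionalization is undefined")
    return {
        w: (p / c_E if w in evidence else Fraction(0))
        for w, p in prior.items()
    }


# Hypothetical prior over three worlds; E is true at w1 and w2 only.
prior = {"w1": Fraction(1, 2), "w2": Fraction(1, 4), "w3": Fraction(1, 4)}
E = {"w1", "w2"}
posterior = conditionalize(prior, E)
```

On this prior, $c(E) = \frac{3}{4}$, so the posterior assigns $\frac{2}{3}$ to $w_1$, $\frac{1}{3}$ to $w_2$, and $0$ to $w_3$. It is certain of $E$, and it preserves the ratios of the prior credences in the worlds at which $E$ is true.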
<br />According to the second, Plan Conditionalization, Conditionalization governs the updating behaviour you would endorse in all possible evidential situations you might face:<br /><br /><b>Plan Conditionalization (PC)</b> <br /><br />If<br /><ul><li>$c$ is your credence function at $t$;</li><li>the total evidence you receive between $t$ and $t'$ will come in the form of a proposition learned with certainty, and that proposition will come from the partition $\mathcal{E} = \{E_1, \ldots, E_n\}$;</li><li>$R$ is the plan you endorse for how to update in response to each possible piece of total evidence,</li></ul>then it should be the case that, if you were to receive evidence $E_i$ and if $c(E_i) > 0$, then $R$ would exhort you to adopt credence function $c_i(-) = c(-|E_i) = \frac{c(-\ \&\ E_i)}{c(E_i)}$.<br /><br />According to the third, Dispositional Conditionalization, Conditionalization governs the updating behaviour you are disposed to exhibit.<br /> <br /><b>Dispositional Conditionalization (DC)</b> <br /><br />If<br /><ul><li>$c$ is your credence function at $t$;</li><li>the total evidence you receive between $t$ and $t'$ will come in the form of a proposition learned with certainty, and that proposition will come from the partition $\mathcal{E} = \{E_1, \ldots, E_n\}$;</li><li>$R$ is the plan you are disposed to follow in response to each possible piece of total evidence,</li></ul>then it should be the case that, if you were to receive evidence $E_i$ and if $c(E_i) > 0$, then $R$ would exhort you to adopt credence function $c_i(-) = c(-|E_i) = \frac{c(-\ \&\ E_i)}{c(E_i)}$.<br /><br />Next, let's meet the four arguments. Since it will take some work to formulate them precisely, I will give only an informal gloss here. 
There will be plenty of time to see them in high-definition in what follows.<br /><br /><b>Diachronic Dutch Book or Dutch Strategy Argument (DSA)</b> This purports to show that, if you violate conditionalization, there is a pair of decisions you might face, one before and one after you receive your evidence, such that your prior and posterior credences lead you to choose options when faced with those decisions that are guaranteed to be worse by your own lights than some alternative options (Lewis 1999).<br /><br /><b>Expected Pragmatic Utility Argument (EPUA)</b> This purports to show that, if you will face a decision after learning your evidence, then your prior credences will expect your updated posterior credences to do the best job of making that decision if they are obtained by conditionalizing on your priors (Brown 1976).<br /><br /><b>Expected Epistemic Utility Argument (EEUA)</b> This purports to show that your prior credences will expect your posterior credences to be best epistemically speaking if they are obtained by conditionalizing on your priors (Greaves & Wallace 2006).<br /><br /><b>Epistemic Utility Dominance Argument (EUDA)</b> This purports to show that, if you violate conditionalization, then there will be alternative priors and posteriors that are guaranteed to be better epistemically speaking, when considered together, than your priors and posteriors (Briggs & Pettigrew 2018).<br /><br /><h2>The framework</h2><br />In the following sections, we will consider each of the arguments listed above. As we will see, these arguments are concerned directly with updating plans or dispositions, rather than actual updating behaviour. That is, the items that they consider don't just specify how you in fact update in response to the particular piece of evidence you actually receive. 
Rather, they assume that your evidence between the earlier and later time will come in the form of a proposition learned with certainty (Certain Evidence); they assume the possible propositions that you might learn with certainty by the later time form a partition (Evidential Partition); and they assume that each of the propositions you might learn with certainty is one about which you had a prior opinion (Evidential Availability); and then they specify, for each of the possible pieces of evidence in your evidential partition, how you might update if you were to receive it.<br /><br />Some philosophers, like David Lewis (1999), assume that all three assumptions---Certain Evidence, Evidential Partition, Evidential Availability---hold in all learning situations. Others deny one or more. Richard Jeffrey (1992) denies Certain Evidence and Evidential Availability; Jason Konek (2019) denies Evidential Availability but not Certain Evidence; Bas van Fraassen (1999), Miriam Schoenfield (2017), and Jonathan Weisberg (2007) deny Evidential Partition. But all agree, I think, that there are certain important situations when all three assumptions are true; there are certain situations where there is a set of propositions that forms a partition and about each member of which you have a prior opinion, and the possible evidence you might receive at the later time comes in the form of one of these propositions learned with certainty. Examples might include: when you are about to discover the outcome of a scientific experiment, perhaps by taking a reading from a measuring device with unambiguous outputs; when you've asked an expert a yes/no question; when you step on the digital scales in your bathroom or check your bank balance or count the number of spots on the back of the ladybird that just landed on your hand. 
So, if you disagree with Lewis, simply restrict your attention to these cases in what follows.<br /><br />As we will see, we can piggyback on conclusions about plans and dispositions to produce arguments about actual behaviour in certain situations. But in the first instance, we will take the arguments to address plans and dispositions defined on evidential partitions primarily, and actual behaviour only secondarily. Thus, to state these arguments, we need a clear way to represent updating plans or dispositions. We will talk neutrally here of an updating rule. If you think conditionalization governs your updating dispositions, then you take it to govern the updating rule that matches those dispositions; if you think it governs your updating intentions, then you take it to govern the updating rule you intend to follow.<br /><br />We'll introduce a slew of terminology here. You needn't take it all in at the moment, but it's worth keeping it all in one place for ease of reference.<br /><br /><b>Agenda</b> We will assume that your prior and posterior credence functions are defined on the same set of propositions $\mathcal{F}$, and we'll assume that $\mathcal{F}$ is finite and $\mathcal{F}$ is an algebra. We say that $\mathcal{F}$ is your <i>agenda</i>. <br /><br /><b>Possible worlds</b> Given an agenda $\mathcal{F}$, the set of possible worlds relative to $\mathcal{F}$ is the set of classically consistent assignments of truth values to the propositions in $\mathcal{F}$. We'll abuse notation throughout and write $w$ for (i) a truth value assignment to the propositions in $\mathcal{F}$, (ii) the proposition in $\mathcal{F}$ that is true at that truth value assignment and only at that truth value assignment, and (iii) what we might call the omniscient credence function relative to that truth value assignment, which is the credence function that assigns maximal credence (i.e. 1) to all propositions that are true on it and minimal credence (i.e. 
0) to all propositions that are false on it. <br /><br /><b>Updating rules</b> An <i>updating rule</i> has two components:<br /><ul><li>a set of propositions, $\mathcal{E} = \{E_1, \ldots, E_n\}$. This contains the propositions that you might learn with certainty at the later time $t'$; each $E_i$ is in $\mathcal{F}$, so $\mathcal{E} \subseteq \mathcal{F}$; $\mathcal{E}$ forms a partition;</li><li>a set of sets of credence functions, $\mathcal{C} = \{C_1, \ldots, C_n\}$. For each $E_i$, $C_i$ is the set of possible ways that the rule allows you to respond to evidence $E_i$; that is, it is the set of possible posteriors that the rule permits when you learn $E_i$; each $c'$ in $C_i$ in $\mathcal{C}$ is defined on $\mathcal{F}$.</li></ul><br /><b>Deterministic updating rule</b> We say that an updating rule $R = (\mathcal{E}, \mathcal{C})$ is <i>deterministic</i> if each $C_i$ is a singleton set $\{c_i\}$. That is, for each piece of evidence there is exactly one possible response to it that the rule allows.<br /><br /><b>Stochastic updating rule</b> A <i>stochastic updating rule</i> is an updating rule $R = (\mathcal{E}, \mathcal{C})$ equipped with a probability function $P$. $P$ records, for each $E_i$ in $\mathcal{E}$ and $c'$ in $C_i$, how likely it is that you will adopt $c'$ in response to learning $E_i$. We write this $P(R^i_{c'} | E_i)$, where $R^i_{c'}$ is the proposition that says that you adopt posterior $c'$ in response to evidence $E_i$.<br /><ul><li>We assume $P(R^i_{c'} | E_i) > 0$ for all $c'$ in $C_i$. If the probability that you will adopt $c'$ in response to $E_i$ is zero, then $c'$ does not count as a response to $E_i$ that the rule allows.</li><li>Note that every deterministic updating rule is a stochastic updating rule for which $P(R^i_{c'} | E_i) = 1$ for each $c'$ in $C_i$. If $R = (\mathcal{E}, \mathcal{C})$ is deterministic, then, for each $E_i$, $C_i = \{c_i\}$. 
So let $P(R^i_{c_i} | E_i) = 1$.</li></ul><br /><b>Conditionalizing updating rule</b> An updating rule $R = (\mathcal{E}, \mathcal{C})$ is a <i>conditionalizing rule</i> for a prior $c$ if, whenever $c(E_i) > 0$, $C_i = \{c_i\}$ and $c_i(-) = c(-|E_i)$.<br /><br /><b>Conditionalizing pairs</b> A pair $\langle c, R \rangle$ of a prior and an updating rule is a <i>conditionalizing pair</i> if $R$ is a conditionalizing rule for $c$.<br /><br /><b>Pseudo-conditionalizing updating rule</b> Suppose $R = (\mathcal{E}, \mathcal{C})$ is an updating rule. Then let $\mathcal{F}^*$ be the smallest algebra that contains all of $\mathcal{F}$ and also $R^i_{c'}$ for each $E_i$ in $\mathcal{E}$ and $c'$ in $C_i$. (As above, $R^i_{c'}$ is the proposition that says that you adopt posterior $c'$ in response to evidence $E_i$.) Then an updating rule $R$ is a <i>pseudo-conditionalizing rule</i> for a prior $c$ if it is possible to extend $c$, a credence function defined on $\mathcal{F}$, to $c^*$, a credence function defined on $\mathcal{F}^*$, such that, for each $E_i$ in $\mathcal{E}$ and $c'$ in $C_i$, $c'(-) = c^*(-|R^i_{c'})$. That is, each posterior is the result of conditionalizing the extended prior $c^*$ on the evidence to which it is a response and the fact that it was your response to this evidence. <br /><br /><b>Pseudo-conditionalizing pair</b> A pair $\langle c, R \rangle$ of a prior and an updating rule is a <i>pseudo-conditionalizing pair</i> if $R$ is a pseudo-conditionalizing rule for $c$.<br /><br />Let's illustrate these definitions using an example. Condi is a meteorologist. There is a hurricane in the Gulf of Mexico. She knows that it will make landfall soon in one of the following four towns: Pensacola, FL; Panama City, FL; Mobile, AL; or Biloxi, MS. She calls a friend and asks whether it has hit yet. It has. Then she asks whether it has hit in Florida. 
At this point, the evidence she will receive when her friend answers is either $F$---which says that it made landfall in Florida, that is, in Pensacola or Panama City---or $\overline{F}$---which says it hit elsewhere, that is, in Mobile or Biloxi. Her prior is $c$:<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-l4KoBV2YGyM/XN7NZ-8qoII/AAAAAAAAB_c/PpFzZaB4ZdojM0zKfmDmOQym6HLVAW0jwCLcBGAs/s1600/Screenshot%2B2019-05-17%2Bat%2B16.02.59.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="147" data-original-width="969" height="48" src="https://3.bp.blogspot.com/-l4KoBV2YGyM/XN7NZ-8qoII/AAAAAAAAB_c/PpFzZaB4ZdojM0zKfmDmOQym6HLVAW0jwCLcBGAs/s320/Screenshot%2B2019-05-17%2Bat%2B16.02.59.png" width="320" /></a></div><br />Her evidential partition is $\mathcal{E} = \{F, \overline{F}\}$. And here are some posteriors she might adopt:<br /><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-QX8gJ3c3osU/XN7ONCqoFWI/AAAAAAAAB_k/dLiRx-M1M4csuinu0wDxGIRX71EFMBk7ACLcBGAs/s1600/Screenshot%2B2019-05-17%2Bat%2B16.05.03.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="412" data-original-width="956" height="137" src="https://1.bp.blogspot.com/-QX8gJ3c3osU/XN7ONCqoFWI/AAAAAAAAB_k/dLiRx-M1M4csuinu0wDxGIRX71EFMBk7ACLcBGAs/s320/Screenshot%2B2019-05-17%2Bat%2B16.05.03.png" width="320" /></a></div><br />And here are four possible rules she might adopt, along with their properties:<br /><br /><div class="separator" style="clear: both; text-align: center;"></div><div class="separator" style="clear: both; text-align: center;"><a href="https://3.bp.blogspot.com/-K-mtoeYnsMM/XN7OPcTNDJI/AAAAAAAAB_o/ZaQnIvJX7Y0rnxEIUfMT-7w-S2R76-pfwCEwYBhgL/s1600/Screenshot%2B2019-05-17%2Bat%2B16.06.10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" 
data-original-height="328" data-original-width="1110" height="94" src="https://3.bp.blogspot.com/-K-mtoeYnsMM/XN7OPcTNDJI/AAAAAAAAB_o/ZaQnIvJX7Y0rnxEIUfMT-7w-S2R76-pfwCEwYBhgL/s320/Screenshot%2B2019-05-17%2Bat%2B16.06.10.png" width="320" /></a></div><br />As we will see below, for each of our four arguments for conditionalization---DSA, EPUA, EEUA, and EUDA---the standard formulation of the argument assumes a norm that we will call Deterministic Updating:<br /><br /><b>Deterministic Updating (DU)</b> Your updating rule should be deterministic.<br /><br />As we will see, this is crucial for the success of these arguments. In what follows, I will present each argument in its standard formulation, which assumes Deterministic Updating. Then I will explore what happens when we remove that assumption.<br /><br /><h2>The Dutch Strategy Argument (DSA)</h2><br />The DSA and EPUA both evaluate updating rules by their pragmatic consequences. That is, they look to the choices that your priors and/or your possible posteriors lead you to make and they conclude that they are optimal only if your updating rule is a conditionalizing rule for your prior.<br /><br /><h3>DSA with Deterministic Updating</h3><br />Let's look at the DSA first. In what follows, we'll take a decision problem to be a set of options that are available to an agent: e.g. accept a particular bet or refuse it; buy a particular lottery ticket or don't; take an umbrella when you go outside, take a raincoat, or take neither; and so on. The idea behind the DSA is this. One of the roles of credences is to help us make choices when faced with decision problems. They play that role badly if they lead us to make one series of choices when another series is guaranteed to serve our ends better. The DSA turns on the claim that, unless we update in line with Conditionalization, our credences will lead us to make such a series of choices when faced with a particular series of decision problems. 
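<br /><br />The idea of one series of choices being guaranteed to serve your ends better than another can be made concrete with a small sketch. It is only an illustration under assumptions that are not part of the argument itself: the two worlds, the payoffs in pence, and the function names are all hypothetical, and payoffs are simply added across the two decisions, which presupposes that utility is linear in money.

```python
# Hypothetical two-world example; none of these names or payoffs come from the post.
WORLDS = ("rain", "dry")

def combine(first, second):
    """Payoff of making both choices, world by world (assumes utility linear in money)."""
    return {w: first[w] + second[w] for w in WORLDS}

def strongly_dominates(better, worse):
    """True if `better` pays strictly more than `worse` at every world."""
    return all(better[w] > worse[w] for w in WORLDS)

def weakly_dominates(better, worse):
    """True if `better` never pays less than `worse` and pays more at some world."""
    return (all(better[w] >= worse[w] for w in WORLDS)
            and any(better[w] > worse[w] for w in WORLDS))

# Earlier decision problem: pay 60p for a bet that returns 100p if it rains, or decline.
buy_bet = {"rain": 40, "dry": -60}
decline_first = {"rain": 0, "dry": 0}

# Later decision problem: accept 40p to take the other side of that bet, or decline.
sell_bet = {"rain": -60, "dry": 40}
decline_second = {"rain": 0, "dry": 0}

chosen = combine(buy_bet, sell_bet)                    # the series of choices made
alternative = combine(decline_first, decline_second)   # the series that was available

print(chosen)                                   # net payoff at each world
print(strongly_dominates(alternative, chosen))  # is the alternative guaranteed better?
```

In this toy case, declining both bets leaves you no better or worse off at either world, whereas accepting both loses 20 pence however things turn out.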
<br /><br />Here, we restrict attention to a particular class of decision problems you might face. They are the decision problems in which, for each available option, its outcome at a given possible world obtains for you a certain amount of a particular quantity, such as money or chocolate or pure pleasure, and your utility is linear in that quantity---that is, obtaining some amount of that quantity increases your utility by the same amount regardless of how much of the quantity you already have. The quantity is typically taken to be money, and we'll continue to talk like that in what follows. But it's really a placeholder for some quantity with this property. We restrict attention to such decision problems because, in the argument, we need to combine the outcome of one decision, made at the earlier time, with the outcome of another decision, made at the later time. So we need to ensure that the utility of a combination of outcomes is the sum of the utilities of the individual outcomes. <br /><br />Now, as we do throughout, we assume that the prior $c$ and the possible posteriors $c_1, \ldots, c_n$ permitted by a deterministic updating rule $R$ are all probability functions. And we will assume further that, when your credences are probabilistic, and you face a decision problem, then you should choose from the available options one of those that maximises expected utility relative to your credences.<br /><br />With this in hand, let's define two closely related features of a pair $\langle c, R \rangle$ that are undesirable from a pragmatic point of view, and might be thought to render that pair irrational. 
First:<br /><br /><b>Strong Dutch Strategies</b> $\langle c, R \rangle$ is vulnerable to a <i>strong Dutch strategy</i> if there are two decision problems, $\mathbf{d}$, $\mathbf{d}'$ such that<br /><ol><li>$c$ requires you to choose option $A$ from the possible options available in $\mathbf{d}$;</li><li>for each $E_i$ and each $c'$ in $C_i$, $c'$ requires you to choose $B$ from $\mathbf{d}'$;</li><li>there are alternative options, $X$ in $\mathbf{d}$ and $Y$ in $\mathbf{d}'$, such that, at every possible world, you'll receive more utility from choosing $X$ and $Y$ than you receive from choosing $A$ and $B$. In the language of decision theory, $X + Y$ strongly dominates $A + B$.</li></ol><b>Weak Dutch Strategies</b> $\langle c, R \rangle$ is vulnerable to a <i>weak Dutch strategy</i> if there are decision problems $\mathbf{d}$ and, for each $c'$ in $C_i$ in $\mathcal{C}$, $\mathbf{d}'_{c'}$ such that<br /><ol><li>$c$ requires you to choose $A$ from $\mathbf{d}$;</li><li>for each $E_i$ and each $c'$ in $C_i$, $c'$ requires you to choose $B^i_{c'}$ from $\mathbf{d}'_{c'}$;</li><li>there are alternative options, $X$ in $\mathbf{d}$ and, for each $E_i$ and $c'$ in $C_i$, $Y^i_{c'}$ in $\mathbf{d}'_{c'}$, such that (a) for each $E_i$, each world in $E_i$, and each $c'$ in $C_i$, you'll receive at least as much utility at that world from choosing $X$ and $Y^i_{c'}$ as you'll receive from choosing $A$ and $B^i_{c'}$, and (b) for some $E_i$, some world in $E_i$, and some $c'$ in $C_i$, you'll receive strictly more utility at that world from $X$ and $Y^i_{c'}$ than you'll receive from $A$ and $B^i_{c'}$. </li></ol>Then the Dutch Strategy Argument is based on the following mathematical fact (de Finetti 1974):<br /><br /><b>Theorem 1</b> Suppose $R$ is a deterministic updating rule. 
Then:<br /><ol><li>if $R$ is not a conditionalizing rule for $c$, then $\langle c, R \rangle$ is vulnerable to a strong Dutch strategy;</li><li>if $R$ is a conditionalizing rule for $c$, then $\langle c, R \rangle$ is not vulnerable even to a weak Dutch strategy.</li></ol>That is, if your updating rule is not a conditionalizing rule for your prior, then your credences will lead you to choose a strongly dominated pair of options when faced with a particular pair of decision problems; if it is a conditionalizing rule, that can't happen.<br /><br />Now that we have seen how the argument works, let's see whether it supports the three versions of conditionalization that we met above: Actual (AC), Plan (PC), and Dispositional (DC) Conditionalization. Since they speak directly of rules, let's begin with PC and DC.<br /><br />The DSA shows that, if you endorse a deterministic rule that isn't a conditionalizing rule for your prior, then there is a pair of decision problems, one that you'll face at the earlier time and the other at the later time, where your credences at the earlier time and your planned credences at the later time will require you to choose a dominated pair of options. And it seems reasonable to say that it is irrational to endorse a plan when you will be rendered vulnerable to a Dutch Strategy if you follow through on it. So, for those who endorse deterministic rules, DSA plausibly supports Plan Conditionalization.<br /><br />The same is true of Dispositional Conditionalization. Just as it is irrational to <i>plan</i> to update in a way that would render you vulnerable to a Dutch Strategy if you were to stick to the plan, it is surely irrational to be <i>disposed</i> to update in a way that will render you vulnerable in this way. So, for those whose updating dispositions are deterministic, DSA plausibly supports Dispositional Conditionalization.<br /><br />Finally, AC. 
There are various ways to move from either PC or DC to AC, but each one of them requires some extra assumptions. For instance:<br /><br />(I) I might assume: (i) between an earlier and a later time, there is always a partition such that you know that the strongest piece of evidence you might receive between those times is a proposition from that partition learned with certainty; (ii) if you know you'll receive evidence from some partition, you are rationally required to plan how you will update on each possible piece of evidence before you receive it; and (iii) if you plan how to respond to evidence before you receive it, you are rationally required to follow through on that plan once you have received it. Together with PC + DU, these give AC.<br /><br />(II) I might assume: (i) you have updating dispositions. So, if you actually update other than by conditionalization, then it must be a manifestation of a disposition other than conditionalizing. Together with DC + DU, this gives AC.<br /><br />(III) I might assume: (i) that you are rationally required to update in a way that can be represented as the result of updating on a plan that you were rationally permitted to endorse or as the result of dispositions that you were rationally permitted to have, even if you did not in fact endorse any plan prior to receiving the evidence nor have any updating dispositions. Again, together with PC + DU or DC + DU, this gives AC.<br /><br />Notice that, in each case, it was essential to invoke Deterministic Updating (DU). As we will see below, this causes problems for AC. <br /><br /><h3>DSA without Deterministic Updating</h3><br />We have now seen how the DSA proceeds if we assume Deterministic Updating. But what if we don't? 
Consider, for instance, rule $R_3$ from our list of examples above:<br />$$R_3 = (\mathcal{E} = \{F, \overline{F}\}, \mathcal{C} = \{\{c^\circ_F, c^+_F\}, \{c^\circ_{\overline{F}}, c^+_{\overline{F}}\}\})$$<br />That is, if Condi learns $F$, rule $R_3$ allows her to update to $c^\circ_F$ or to $c^+_F$. And if she receives $\overline{F}$, it allows her to update to $c^\circ_{\overline{F}}$ or to $c^+_{\overline{F}}$. Notice that $R_3$ violates conditionalization thoroughly: it is not deterministic; and, moreover, as well as not mandating the posteriors that conditionalization demands, it does not even permit them. Can we adapt the DSA to show that $R_3$ is irrational? No. We cannot use Dutch Strategies to show that $R_3$ is irrational because it isn't vulnerable to them.<br /><br />To see this, we first note that, while $R_3$ is not deterministic and not a conditionalizing rule, it is a pseudo-conditionalizing rule. And to see that, it helps to state the following representation theorem for pseudo-conditionalizing rules.<br /><br /><b>Lemma 1</b> $R$ is a pseudo-conditionalizing rule for $c$ iff<br /><ol><li>for all $E_i$ in $\mathcal{E}$ and $c'$ in $C_i$, $c'(E_i) = 1$, and</li><li>$c$ is in the convex hull of the possible posteriors that $R$ permits.</li></ol>But note:$$c(-) = 0.4c^\circ_F(-) + 0.4c^+_F(-) + 0.1c^\circ_{\overline{F}}(-) + 0.1 c^+_{\overline{F}}(-)$$<br />So $R_3$ is pseudo-conditionalizing. What's more:<br /><br /><b>Theorem 2</b><br /><ul><li>If $R$ is not a pseudo-conditionalizing rule for $c$, then $\langle c, R \rangle$ is vulnerable at least to a weak Dutch Strategy, and possibly also a strong Dutch Strategy.</li><li>If $R$ is a pseudo-conditionalizing rule for $c$, then $\langle c, R \rangle$ is not vulnerable to a weak Dutch Strategy.</li></ul>Thus, $\langle c, R_3 \rangle$ is not vulnerable even to a weak Dutch Strategy. 
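<br /><br />The two conditions of Lemma 1 are easy to check mechanically once the credence functions are written down. The sketch below does this for a rule with the same shape as $R_3$. The posteriors in it are hypothetical stand-ins (the actual values appear only in the figures above), the function names are my own, and only the mixing weights 0.4, 0.4, 0.1, 0.1 are taken from the text. Note that it verifies condition 2 for a given set of weights rather than searching the convex hull.

```python
from math import isclose

WORLDS = ("Pensacola", "Panama City", "Mobile", "Biloxi")
F = {"Pensacola", "Panama City"}   # landfall in Florida
NOT_F = {"Mobile", "Biloxi"}       # landfall elsewhere

def prob(c, event):
    """Credence that c (a dict from worlds to probabilities) gives to an event."""
    return sum(c[w] for w in event)

def mixture(credences, weights):
    """Weighted average of credence functions, world by world."""
    return {w: sum(k * c[w] for c, k in zip(credences, weights)) for w in WORLDS}

def conditionalize(c, event):
    """c(- | event), assuming c(event) > 0."""
    total = prob(c, event)
    return {w: (c[w] / total if w in event else 0.0) for w in WORLDS}

def same(c1, c2):
    """Equality of credence functions up to floating-point error."""
    return all(isclose(c1[w], c2[w], abs_tol=1e-12) for w in WORLDS)

# Hypothetical posteriors, two permitted responses per piece of evidence.
post_F_1 = {"Pensacola": 0.4, "Panama City": 0.6, "Mobile": 0.0, "Biloxi": 0.0}
post_F_2 = {"Pensacola": 0.6, "Panama City": 0.4, "Mobile": 0.0, "Biloxi": 0.0}
post_N_1 = {"Pensacola": 0.0, "Panama City": 0.0, "Mobile": 0.4, "Biloxi": 0.6}
post_N_2 = {"Pensacola": 0.0, "Panama City": 0.0, "Mobile": 0.6, "Biloxi": 0.4}
rule = [(F, [post_F_1, post_F_2]), (NOT_F, [post_N_1, post_N_2])]

# Build a prior as the mixture with the weights quoted in the text.
posteriors = [p for _, responses in rule for p in responses]
prior = mixture(posteriors, [0.4, 0.4, 0.1, 0.1])

# Condition 1 of Lemma 1: every permitted posterior is certain of its evidence.
certain_of_evidence = all(
    isclose(prob(p, E), 1.0) for E, responses in rule for p in responses
)

# The rule nonetheless fails to permit the conditionalized posterior on F.
permits_conditionalization = any(
    same(p, conditionalize(prior, F)) for p in rule[0][1]
)

print(prior, certain_of_evidence, permits_conditionalization)
```

With these stand-in numbers the prior is a mixture of the permitted posteriors, each posterior is certain of the evidence it responds to, and yet neither permitted response to $F$ is the result of conditionalizing on $F$.<br /><br />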
The DSA, then, cannot say what is irrational about Condi if she begins with prior $c$ and either endorses $R_3$ or is disposed to update in line with it. Thus, the DSA cannot justify Deterministic Updating. And without DU, it cannot support PC or DC either. After all, $R_3$ violates each of those, but it is not vulnerable even to a weak Dutch Strategy. And moreover, each of the three arguments for AC breaks down because they depend on PC or DC. The problem is that, if Condi updates from $c$ to $c^\circ_F$ upon learning $F$, she violates AC; but there is a non-deterministic updating rule---namely, $R_3$---that allows $c^\circ_F$ as a response to learning $F$, and, for all DSA tells us, she might have rationally endorsed $R_3$ before learning $F$ or she might rationally have been disposed to follow it. Indeed, the only restriction that DSA can place on your actual updating behaviour is that you should become certain of the evidence that you learned. After all:<br /><br /><b>Theorem 3</b> Suppose $c$ is your prior and $c'$ is your posterior. Then there is a rule $R$ such that:<br /><ol><li>$c'$ is in $C_i$, and</li><li>$R$ is a pseudo-conditionalizing rule for $c$</li></ol>iff $c'(E_i) = 1$.<br /><br />Thus, at the end of this section, we can conclude that, whatever is irrational about planning to update using non-deterministic but pseudo-conditionalizing updating rules, it cannot be that following through on those plans leaves you vulnerable to a Dutch Strategy, for it does not. And similarly, whatever is irrational about being disposed to update in those ways, it cannot be that those dispositions will equip you with credences that lead you to choose dominated options, for they do not. With PC and DC thus blocked, our route to AC is therefore also blocked.<br /><br /><h2>The Expected Pragmatic Utility Argument (EPUA)</h2><br />Let's look at EPUA next. Again, we will consider how our credences guide our actions when we face decision problems. 
In this case, there is no need to restrict attention to monetary decision problems. We will only consider a single decision problem, which we face at the later time, after we've received the evidence, so we won't have to combine the outcomes of multiple options as we did in the DSA. The idea is this. Suppose you will make a decision after you receive whatever evidence it is that you receive at the later time. And suppose that you will use your later updated credence function to make that choice---indeed, you'll choose from the available options by maximising expected utility from the point of view of your new updated credences. Which updating rules does your prior expect will lead you to make the best choice?<br /><br /><h3>EPUA with Deterministic Updating</h3><br />Suppose you'll face decision problem $\mathbf{d}$ after you've updated. And suppose further that you'll use a deterministic updating rule $R$. Then, if $w$ is a possible world and $E_i$ is the element of the evidential partition $\mathcal{E}$ that is true at $w$, the idea is that we take the pragmatic utility of $R$ relative to $\mathbf{d}$ at $w$ to be the utility at $w$ of whatever option from $\mathbf{d}$ we should choose if our posterior credence function were $c_i$, as $R$ requires it to be at $w$. But of course, for many decision problems, this isn't well defined because there is no unique option in $\mathbf{d}$ that maximises expected utility by the lights of $c_i$; rather there are sometimes many such options, and they might have different utilities at $w$. Thus, we need not only $c_i$ but also a selection function, which picks a single option from any set of options. If $f$ is such a selection function, then let $A^{\mathbf{d}}_{c_i, f}$ be the option that $f$ selects from the set of options in $\mathbf{d}$ that maximise expected utility by the lights of $c_i$. 
And let<br />$$u_{\mathbf{d},f}(R, w) = u(A^{\mathbf{d}}_{c_i, f}, w).$$<br />Then the EPUA turns on the following mathematical fact (Brown 1976):<br /><br /><b>Theorem 4</b> Suppose $R$ and $R^\star$ are both deterministic updating rules. Then:<br /><ul><li>If $R$ and $R^\star$ are both conditionalizing rules for $c$, and $f$, $g$ are selection functions, then for all decision problems $\mathbf{d}$ $$\sum_{w \in W} c(w) u_{\mathbf{d}, f}(R, w) = \sum_{w \in W} c(w) u_{\mathbf{d}, g}(R^\star, w)$$</li><li>If $R$ is a conditionalizing rule for $c$, and $R^\star$ is not, and $f$, $g$ are selection functions, then for all decision problems $\mathbf{d}$, $$\sum_{w \in W} c(w) u_{\mathbf{d}, f}(R, w) \geq \sum_{w \in W} c(w) u_{\mathbf{d}, g}(R^\star, w)$$with strict inequality for some decision problems $\mathbf{d}$.</li></ul>That is, a deterministic updating rule maximises expected pragmatic utility by the lights of your prior just in case it is a conditionalizing rule for your prior.<br /><br />As in the case of the DSA above, then, if we assume Deterministic Updating (DU), we can establish PC and DC, and on the back of those AC as well. After all, it is surely irrational to plan to update in one way when you expect another way to guide your actions better in the future; and it is surely irrational to be disposed to update in one way when you expect another to guide you better. And as before there are the same three arguments for AC on the back of PC and DC.<br /><br /><h3>EPUA without Deterministic Updating</h3><br />How does EPUA fare when we widen our view to include non-deterministic updating rules as well? An initial problem is that it is no longer clear how to define the pragmatic utility of such an updating rule relative to a decision problem at a possible world. 
Above, we said that, relative to a decision problem $\mathbf{d}$ and a selection function $f$, the pragmatic utility of rule $R$ at world $w$ is the utility of the option that you would choose when faced with $\mathbf{d}$ using the credence function that $R$ mandates at $w$ and $f$: that is, if $E_i$ is true at $w$, then<br />$$u_{\mathbf{d}, f}(R, w) = u(A^{\mathbf{d}}_{c_i, f}, w).$$<br />But, if $R$ is not deterministic, there might be no single credence function that it mandates at $w$. If $E_i$ is the piece of evidence you'll learn at $w$ and $R$ permits more than one credence function in response to $E_i$, then there might be a range of different options in $\mathbf{d}$, each of which maximises expected utility relative to a different credence function $c'$ in $C_i$. So what are we to do?<br /><br />Our response to this problem depends on whether we wish to argue for Plan or Dispositional Conditionalization (PC or DC). Suppose, first, that we are interested in DC. That is, we are interested in a norm that governs the updating rule that records how you are disposed to update when you receive certain evidence. Then it seems reasonable to assume that the updating rule that records your dispositions is stochastic. That is, for each possible piece of evidence $E_i$ and each possible response $c'$ in $C_i$ to that evidence that you might adopt in response to receiving that evidence, there is some objective chance that you will respond to $E_i$ by adopting $c'$. As we explained above, we'll write this $P(R^i_{c'} | E_i)$, where $R^i_{c'}$ is the proposition that you receive $E_i$ and respond by adopting $c'$. 
Then, if $E_i$ is true at $w$, we might take the pragmatic utility of $R$ relative to $\mathbf{d}$ and $f$ at $w$ to be the expectation of the utility of the options that each permitted response to $E_i$ (and selection function $f$) would lead us to choose:<br />$$u_{\mathbf{d}, f}(R, w) = \sum_{c' \in C_i} P(R^i_{c'} | E_i) u(A^{\mathbf{d}}_{c', f}, w)$$<br />With this in hand, we have the following result: <br /><br /><b>Theorem 5</b> Suppose $R$ and $R^\star$ are both updating rules. Then:<br /><ul><li>If $R$ and $R^\star$ are both conditionalizing rules for $c$, and $f$, $g$ are selection functions, then for all decision problems $\mathbf{d}$, $$\sum_{w \in W} c(w) u_{\mathbf{d}, f}(R, w) = \sum_{w \in W} c(w) u_{\mathbf{d}, g}(R^\star, w)$$</li><li>If $R$ is a conditionalizing rule for $c$, and $R^\star$ is a stochastic but not conditionalizing rule, and $f$, $g$ are selection functions, then for all decision problems $\mathbf{d}$,$$\sum_{w \in W} c(w) u_{\mathbf{d}, f}(R, w) \geq \sum_{w \in W} c(w) u_{\mathbf{d}, g}(R^\star, w)$$with strict inequality for some decision problems $\mathbf{d}$.</li></ul>This shows the first difference between the DSA and EPUA. The latter, but not the former, provides a route to establishing Dispositional Conditionalization (DC). If we assume that your dispositions are governed by a chance function, and we use that chance function to calculate expectations, then we can show that your prior will expect your posteriors to do worse as a guide to action unless you are disposed to update by conditionalizing on the evidence you receive.<br /><br />Next, suppose we are interested in Plan Conditionalization (PC). In this case, we might try to appeal again to Theorem 5. 
To do that, we must assume that, while there are non-deterministic updating rules that we might endorse, they are all at least stochastic updating rules; that is, they all come equipped with a probability function that determines how likely it is that I will adopt a particular permitted response to the evidence. That is, we might say that the updating rules that we might endorse are either deterministic or non-deterministic-but-stochastic. In the language of game theory, we might say that the updating strategies between which we choose are either pure or mixed. And then Theorem 5 will show that we should adopt a deterministic-and-conditionalizing rule, rather than any deterministic-but-non-conditionalizing or non-deterministic-but-stochastic rule. The problem with this proposal is that it seems just as arbitrary to restrict to deterministic and non-deterministic-but-stochastic rules as it was to restrict to deterministic rules in the first place. Why should we not be able to endorse a non-deterministic and non-stochastic rule---that is, a rule that says, for at least one possible piece of evidence $E_i$ in $\mathcal{E}$, there are two or more posteriors that the rule permits as responses, but does not endorse any chance mechanism by which we'll choose between them? But if we permit these rules, how are we to define their pragmatic utility relative to a decision problem and at a possible world?<br /><br />Here's one suggestion. Suppose $E_i$ is the proposition in $\mathcal{E}$ that is true at world $w$. And suppose $\mathbf{d}$ is a decision problem and $f$ is a selection rule. Then we might take the pragmatic utility of $R$ relative to $\mathbf{d}$ and $f$ and at $w$ to be the average utility of the options that each permissible response to $E_i$ and $f$ would choose when faced with $\mathbf{d}$. 
That is,$$u_{\mathbf{d}, f}(R, w) = \frac{1}{|C_i|} \sum_{c' \in C_i} u(A^{\mathbf{d}}_{c', f}, w)$$where $|C_i|$ is the size of $C_i$, that is, the number of possible responses to $E_i$ that $R$ permits. If that's the case, then we have the following:<br /><br /><b>Theorem 6</b> Suppose $R$ and $R^\star$ are updating rules. Then if $R$ is a conditionalizing rule for $c$, and $R^\star$ is not deterministic, not stochastic, and not a conditionalizing rule for $c$, and $f$, $g$ are selection functions, then for all decision problems $\mathbf{d}$,<br />$$\sum_{w \in W} c(w) u_{\mathbf{d}, f}(R, w) \geq \sum_{w \in W} c(w) u_{\mathbf{d}, g}(R^\star, w)$$with strict inequality for some decision problems $\mathbf{d}$.<br /><br />Put together with Theorems 4 and 5, this shows that our prior expects us to do better by endorsing a conditionalizing rule than by endorsing any other sort of rule, whether that is a deterministic and non-conditionalizing rule, a non-deterministic but stochastic rule, or a non-deterministic and non-stochastic rule.<br /><br />So, again, we see a difference between DSA and EPUA. Just as the latter, but not the former, provides a route to establishing DC without assuming Deterministic Updating, so the latter but not the former provides a route to establishing PC without DU. And from both of those, we have the usual three routes to AC. This means that EPUA explains what might be irrational about endorsing a non-deterministic updating rule, or having dispositions that match one. If you do, there's some alternative updating rule that your prior expects to do better as a guide to future action.<br /><br /><h2>Expected Epistemic Utility Argument (EEUA)</h2><br />The previous two arguments criticized non-conditionalizing updating rules from the standpoint of pragmatic utility. The EEUA and EUDA both criticize such rules from the standpoint of epistemic utility. 
The idea is this: just as credences play a pragmatic role in guiding our actions, so they play other roles as well---they represent the world; they respond to evidence; they might be more or less coherent. These roles are purely epistemic. And so just as we defined the pragmatic utility of a credence function at a world when faced with a decision problem, so we can also define the epistemic utility of a credence function at a world---it is a measure of how valuable it is to have that credence function from a purely epistemic point of view. <br /><br /><h3>EEUA with Deterministic Updating</h3><br />We will not give an explicit definition of the epistemic utility of a credence function at a world. Rather, we'll simply state two properties that we'll take measures of such epistemic utility to have. These are widely assumed in the literature on epistemic utility theory and accuracy-first epistemology, and I'll defer to the arguments in favour of them that are outlined there (Joyce 2009, Pettigrew 2016, Horowitz 2019).<br /><br />A local epistemic utility function is a function $s$ that takes a single credence and a truth value---either true (1) or false (0)---and returns the epistemic value of having that credence in a proposition with that truth value. Thus, $s(1, p)$ is the epistemic value of having credence $p$ in a truth, while $s(0, p)$ is the epistemic value of having credence $p$ in a falsehood. A global epistemic utility function is a function $EU$ that takes an entire credence function defined on $\mathcal{F}$ and a possible world and returns the epistemic value of having that credence function when the propositions in $\mathcal{F}$ have the truth values they have in that world.<br /><br /><b>Strict Propriety</b> A local epistemic utility function $s$ is <i>strictly proper</i> if each credence expects itself and only itself to have the greatest epistemic utility. 
That is, for all $0 \leq p \leq 1$,$$ps(1, x) + (1-p) s(0, x)$$<br />is uniquely maximised, as a function of $x$, at $x = p$.<br /><br /><b>Additivity</b> A global epistemic utility function is <i>additive</i> if, for each proposition $X$ in $\mathcal{F}$, there is a local epistemic utility function $s_X$ such that the epistemic utility of a credence function $c$ at a possible world is the sum of the epistemic utilities at that world of the credences it assigns. If $w$ is a possible world and we write $w(X)$ for the truth value (0 or 1) of proposition $X$ at $w$, this says:$$EU(c, w) = \sum_{X \in \mathcal{F}} s_X(w(X), c(X))$$<br /><br />We then define the epistemic utility of a deterministic updating rule $R$ in the same way we defined its pragmatic utility above: if $E_i$ is true at $w$, and $C_i = \{c_i\}$, then<br />$$EU(R, w) = EU(c_i, w)$$Then the standard formulation of the EEUA turns on the following theorem (Greaves & Wallace 2006):<br /><br /><b>Theorem 7</b> Suppose $R$ and $R^\star$ are deterministic updating rules. Then:<br /><ul><li>If $R$ and $R^\star$ are both conditionalizing rules for $c$, then$$\sum_{w \in W} c(w) EU(R, w) = \sum_{w \in W} c(w) EU(R^\star, w)$$</li><li>If $R$ is a conditionalizing rule for $c$ and $R^\star$ is not, then$$\sum_{w \in W} c(w) EU(R, w) > \sum_{w \in W} c(w) EU(R^\star, w)$$</li></ul>That is, a deterministic updating rule maximises expected epistemic utility by the lights of your prior just in case it is a conditionalizing rule for your prior.<br /><br />So, as for DSA and EPUA, if we assume Deterministic Updating, we obtain an argument for PC and DC, and indirectly one for AC too.<br /><br /><h3>EEUA without Deterministic Updating</h3><br />If we don't assume Deterministic Updating, the situation here is very similar to the one we encountered above when we considered EPUA. Suppose $R$ is a non-deterministic but stochastic updating rule. 
Then, as above, we let its epistemic utility at a world be the expectation of the epistemic utility that the various possible posteriors permitted by $R$ take at that world. That is, if $E_i$ is the proposition in $\mathcal{E}$ that is true at $w$, then$$EU(R, w) = \sum_{c' \in C_i} P(R_{c'} | E_i) EU(c', w)$$Then, we have a similar result to Theorem 5:<br /><br /><b>Theorem 8</b> Suppose $R$ and $R^\star$ are updating rules. Then if $R$ is a conditionalizing rule for $c$, and $R^\star$ is stochastic but not a conditionalizing rule for $c$, then<br />$$\sum_{w \in W} c(w) EU(R, w) > \sum_{w \in W} c(w) EU(R^\star, w)$$<br /><br />Next, suppose $R$ is a rule that is neither deterministic nor stochastic. Then we let its epistemic utility at a world be the average epistemic utility that the various possible posteriors permitted by $R$ take at that world. That is, if $E_i$ is the proposition in $\mathcal{E}$ that is true at $w$, then<br />$$EU(R, w) = \frac{1}{|C_i|}\sum_{c' \in C_i} EU(c', w)$$And again we have a similar result to Theorem 6:<br /><br /><b>Theorem 9 </b>Suppose $R$ and $R^\star$ are updating rules. Then if $R$ is a conditionalizing rule for $c$, and $R^\star$ is not deterministic, not stochastic, and not a conditionalizing rule for $c$, then<br />$$\sum_{w \in W} c(w) EU(R, w) > \sum_{w \in W} c(w) EU(R^\star, w)$$<br /><br />So the situation is the same as for EPUA. Whether we assess a rule by looking at how well the posteriors it produces guide our future actions, or how good they are from a purely epistemic point of view, our prior will expect a conditionalizing rule for itself to be better than any non-conditionalizing rule. And thus we obtain PC and DC, and indirectly AC as well.<br /><br /><h2>Epistemic Utility Dominance Argument (EUDA)</h2><br />Finally, we turn to the EUDA. In EPUA and EEUA, we assess the pragmatic or epistemic utility of the updating rule from the viewpoint of the prior. 
In DSA, we assess the prior and updating rule together, and from no particular point of view; but, unlike the EPUA and EEUA, we do not assign utilities, either pragmatic or epistemic, to the prior and the rule. In EUDA, like in DSA and unlike EPUA and EEUA, we assess the prior and updating rule together, and again from no particular point of view; but unlike in DSA and like in EPUA and EEUA, we assign utilities to them---in particular, epistemic utilities---and assess them with reference to those.<br /><br /><h3>EUDA with Deterministic Updating</h3><br />Suppose $R$ is a deterministic updating rule. Then, as before, if $E_i$ is true at $w$, let the epistemic utility of $R$ be the epistemic utility of the credence function $c_i$ that it mandates at $w$: that is,$$EU(R, w) = EU(c_i, w).$$<br />But this time also let the epistemic utility of the pair $\langle c, R \rangle$ consisting of the prior and the updating rule be the sum of the epistemic utility of the prior and the epistemic utility of the updating rule: that is,$$EU(\langle c, R \rangle, w) = EU(c, w) + EU(R, w) = EU(c, w) + EU(c_i, w)$$<br />Then the EUDA turns on the following mathematical fact (Briggs & Pettigrew 2018):<br /><br /><b>Theorem 10 </b> Suppose $EU$ is an additive, strictly proper epistemic utility function. And suppose $R$ and $R^\star$ are deterministic updating rules. Then:<br /><ul><li>if $\langle c, R \rangle$ is non-conditionalizing, there is $\langle c^\star, R^\star \rangle$ such that, for all $w$, $$EU(\langle c, R \rangle, w) < EU(\langle c^\star, R^\star \rangle, w)$$</li><li>if $\langle c, R \rangle$ is conditionalizing, there is no $\langle c^\star, R^\star \rangle$ such that, for all $w$, $$EU(\langle c, R \rangle, w) < EU(\langle c^\star, R^\star \rangle, w)$$</li></ul>That is, if $R$ is not a conditionalizing rule for $c$, then together they are $EU$-dominated; if it is a conditionalizing rule, they are not. 
Thus, like EPUA and EEUA and unlike DSA, if we assume Deterministic Updating, EUDA gives PC, DC, and indirectly AC.<br /><br /><h3>EUDA without Deterministic Updating</h3><br />Now suppose we permit non-deterministic updating rules as well as deterministic ones. In this case, there are two approaches we might take. On the one hand, we might define the epistemic utility of non-deterministic rules, both stochastic and non-stochastic, just as we did for EEUA. That is, we might take the epistemic utility of a stochastic rule at a world to be the expectation of the epistemic utility of the various posteriors that it permits in response to the evidence that you obtain at that world; and the epistemic utility of a non-stochastic rule at a world is the average of those epistemic utilities. This gives us the following result:<br /><br /><b>Theorem 11 </b> Suppose $EU$ is an additive, strictly proper epistemic utility function. Then, if $\langle c, R \rangle$ is not a conditionalizing pair, there is an alternative pair $\langle c^\star, R^\star \rangle$ such that, for all $w$, $$EU(\langle c, R \rangle, w) < EU(\langle c^\star, R^\star \rangle, w)$$And this therefore supports an argument for PC and DC and indirectly AC as well.<br /><br />On the other hand, we might consider more fine-grained possible worlds, which specify not only the truth value of all the propositions in $\mathcal{F}$, but also which posterior I adopt. We can then ask: given a particular pair $\langle c, R \rangle$, is there an alternative pair $\langle c^\star, R^\star \rangle$ that has greater epistemic utility at every fine-grained world by the lights of $EU$? If we judge updating rules by this standard, we get a rather different answer. 
If $E_i$ is the element of $\mathcal{E}$ that is true at $w$, and $c'$ is in $C_i$ and $c^{\star \prime}$ is in $C^\star_i$, then we write $w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}}$ for the more fine-grained possible world we obtain from $w$ by adding that $R$ updates to $c'$ and $R^\star$ updates to $c^{\star\prime}$ upon receipt of $E_i$. And let<br /><ul><li>$EU(\langle c, R \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}} ) = EU(c, w) + EU(c', w)$</li><li>$EU(\langle c^\star, R^\star \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}} ) = EU(c^\star, w) + EU(c^{\star\prime}, w)$</li></ul>Then:<br /><b>Theorem 12 </b>Suppose $EU$ is an additive, strictly proper epistemic utility function. Then:<br /><ul><li>If $\langle c, R \rangle$ is a pseudo-conditionalizing pair, there is no alternative pair $\langle c^\star, R^\star\rangle$ such that, for all $E_i$ in $\mathcal{E}$, $w$ in $E_i$, $c'$ in $C_i$ and $c^{\star\prime}$ in $C^\star_i$, $$EU(\langle c, R \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}} ) < EU(\langle c^\star, R^\star \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}})$$</li><li>There are pairs $\langle c, R \rangle$ that are non-conditionalizing and non-pseudo-conditionalizing for which there is no alternative pair $\langle c^\star, R^\star\rangle$ such that, for all $E_i$ in $\mathcal{E}$, $w$ in $E_i$, $c'$ in $C_i$ and $c^{\star\prime}$ in $C^\star_i$, $$EU(\langle c, R \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}} ) < EU(\langle c^\star, R^\star \rangle, w\ \&\ R^i_{c'}\ \&\ R^{\star i}_{c^{\star \prime}})$$</li></ul>Interpreted in this way, then, and without the assumption of Deterministic Updating, EUDA is the weakest of all the arguments. Where DSA at least establishes that your updating rule should be pseudo-conditionalizing for your prior, even if it does not establish that it should be conditionalizing, EUDA does not establish even that. 
<br /><br /><h2>Conclusion</h2><br />One upshot of this investigation is that, so long as we assume Deterministic Updating (DU), all four arguments support the same conclusions, namely, Plan and Dispositional Conditionalization, and also Actual Conditionalization. But once we drop DU, that agreement vanishes.<br /><br />Without DU, DSA shows only that, if we plan to update using a particular rule, it should be a pseudo-conditionalizing rule for our prior; and similarly for our dispositions. As a result, it cannot support AC. Indeed, it can support only the weakest restrictions on our actual updating behaviour, since nearly any such behaviour can be seen as an implementation of a pseudo-conditionalizing rule.<br /><br />EPUA and EEUA are much more hopeful. Let's consider our updating dispositions first. It seems natural to assume that, even if these are not deterministic, they are at least governed by some objective chances. If so, this gives a natural definition of the pragmatic and epistemic utilities of my updating dispositions at a world---they are expectations of the pragmatic and epistemic utilities of the posteriors, calculated using the objective chances. And, relative to that, we can in fact establish DU---we no longer need to assume it. With that in hand, we regain DC and two of the routes to AC.<br /><br />Next, let's consider the updating plans we endorse. It also seems natural to assume that those plans, if not deterministic, might not be stochastic either. And, if that's the case, we can take their pragmatic or epistemic utility at a world to be the average pragmatic or epistemic utility of the different possible credence functions they endorse as responses to the evidence you gain at that world. And, relative to that, we can again establish DU. And with it PC and two of the routes to AC.<br /><br />Finally, EUDA is a mixed bag. 
Understanding the epistemic and pragmatic utility of an updating rule as we have just described gives us DU and with it PC, DC, and AC. But if we take a fine-grained approach, we cannot even establish that your updating rule should be a pseudo-conditionalizing rule for your prior.<br /><br /><h2>Proofs</h2>For proofs of the theorems in this post, please see the paper version <a href="https://drive.google.com/open?id=1aoWwDDOWDjF6jXCY2WyuFqkW1wPV8CsK" target="_blank">here</a>. Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com33tag:blogger.com,1999:blog-4987609114415205593.post-72304981680549323752019-03-05T08:12:00.002+00:002019-03-06T06:15:37.267+00:00Dutch Books, Money Pumps, and 'By Their Fruits' ReasoningThere is a species of reasoning deployed in some of the central arguments of formal epistemology and decision theory that we might call 'by their fruits' reasoning. It seeks to establish certain norms of rationality that govern our mental states by showing that, if your mental states fail to satisfy those norms, they lead you to make choices that have some undesirable feature. Thus, just as we might know false prophets by their behaviour, and corrupt trees by their evil fruit, so can we know that certain attitudes are irrational by looking not to them directly but to their consequences. For instance, the Dutch Book argument seeks to establish the norm of Probabilism for credences, which says that your credences should satisfy the axioms of the probability calculus. And it does this by showing that, if your credences do not satisfy those axioms, they will lead you to enter into a series of bets that, taken together, lose you money for sure (Ramsey 1931, de Finetti 1937). The Money Pump argument seeks to establish, among other norms, the norm of Transitivity for preferences, which says that if you prefer one option to another and that other to a third, you should prefer the first option to the third. 
And it does this by showing that, if your preferences are not transitive, they will lead you, again, to make a series of choices that loses you money for sure (Davidson et al. 1955). Both of these arguments use 'by their fruits' reasoning. In this paper, I will argue that such arguments fail. I will focus particularly on the Dutch Book argument so that I can illustrate the points with examples. But the objections I raise apply equally to Money Pump arguments.<br /><br /><h2>The Dutch Book argument: an example</h2><br />Joachim is more confident that Sarah is an astrophysicist and a climate activist (proposition $A\ \&\ B$) than he is that she is an astrophysicist (proposition $A$). He is 60% confident in $A\ \&\ B$ and only 30% confident in $A$. But $A\ \&\ B$ entails $A$. So, intuitively, Joachim's credences are irrational.<br /><br />How can we establish this? According to the Dutch Book argument, we look to the choices that Joachim's credences will lead him to make. The first premise of that argument posits a connection between credences and betting behaviour. Suppose $X$ is a proposition and $S$ is a number, positive, negative, or zero. Then a £$S$ bet on $X$ is a bet that pays £$S$ if $X$ is true and £$0$ if $X$ is false. £$S$ is the stake of the bet. The first premise of the Dutch Book argument says that, if you have credence $p$ in $X$, you will buy a £$S$ bet on $X$ for anything less than £$pS$. That is, the more confident you are in a proposition, the greater a proportion of the stake you are prepared to pay to buy it. Thus, in particular:<br /><ul><li>Bet 1: Joachim will buy a £$100$ bet on $A\ \&\ B$ for £$50$;</li><li>Bet 2: Joachim will sell a £$100$ bet on $A$ for £$40$.</li></ul>The total net gain of these bets, taken together, is guaranteed to be negative. Thus, his credences will lead him to perform a pair of actions that, taken together, loses him money for sure. This is the second premise of the Dutch Book argument against Joachim. 
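<br /><br />To see why the total is guaranteed to be negative, here is a short worked calculation. It assumes Joachim trades at exactly the prices listed in Bet 1 and Bet 2; the symbol $G$ for his total net gain in pounds is introduced here purely for illustration. Both trades are licensed by the first premise: $50 < 0.6 \times 100$, and selling a £$100$ bet on $A$ for £$40$ amounts to buying a £$(-100)$ bet on $A$ for £$(-40)$, where $-40 < 0.3 \times (-100)$. Since $A\ \&\ B$ entails $A$, there are three cases to consider:$$\begin{array}{lll} A\ \&\ B \text{ true:} & G = (100 - 50) + (40 - 100) & = -10 \\ A \text{ true, } B \text{ false:} & G = (0 - 50) + (40 - 100) & = -110 \\ A \text{ false:} & G = (0 - 50) + (40 - 0) & = -10 \end{array}$$So, on these assumptions, Joachim ends up at least £$10$ worse off however the world turns out.<br /><br />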
We say that this pair of actions (buy the first bet for £$50$; sell the second for £$40$) is <i>dominated</i> by the pair of actions in which he refuses to enter into either bet (refuse the first bet; refuse the second). The latter pair is guaranteed to result in greater total value than the former pair; the latter pair results in no loss and no gain, while the former results in a loss for sure. The third premise of the Dutch Book argument contends that, since it is undesirable to choose a pair of dominated options, it is irrational to have credences that lead you to do this. Ye shall know them by their fruits.<br /><br />Thus, a Dutch Book argument has three premises. The first premise posits a connection between having a particular credence in a proposition and accepting certain bets on that proposition. The second is a mathematical theorem that shows that, if the first premise is true, and if your credences do not satisfy the probability axioms, they will lead you to make a series of choices that is dominated by some alternative series of choices you might have made instead. The third premise says that your credences are irrational if, together with the connection posited in the first premise, they lead you to choose a dominated series of options. My objection is this: there is no account of the connection between credences and betting behaviour that makes both the first and third premise plausible; those accounts strong enough to make the third premise plausible are too strong to make the first premise plausible. Our strategy will be to enumerate the possible putative accounts of that connection and show that either the first or the third premise is false when we adopt that account.<br /><br />Let $C(p, X)$ be the proposition that you have credence $p$ in proposition $X$; and let $B(x, S, X)$ be the proposition that you pay £$x$ for a £$S$ bet on $X$. 
Then the first premise of the Dutch Book argument has the following form:<br /><br />For all credences $p$, propositions $X$, prices $x$, and stakes $S$, if $x < pS$<br />$$O(C(p, X) \rightarrow B(x, S, X))$$where $O$ is a modal operator. But which modal operator? Different answers to this constitute different versions of the connection between credences and betting behaviour that appears in the first and third premise of the Dutch Book argument. We will consider six different candidate operators and argue that none makes the first and third premises both true. The six candidates are: metaphysical necessity; nomological necessity; nomological high probability; deontic necessity; deontic possibility; and the modality of defeasible reasons.<br /><br /><h2>$O$ is metaphysical necessity</h2><br />We begin with metaphysical modality. According to this account, the first premise of the Dutch Book argument says that it is metaphysically impossible to have a credence of $p$ in $X$ while refusing to pay £$x$ for a £$S$ bet on $X$ (for $x < pS$). If you were to refuse such a bet, that would simply mean that you do not have that credence. This sort of account would be appealing to a behaviourist, who seeks an operational definition of what it means to have a particular precise credence in a proposition---a definition in terms of betting behaviour might well satisfy them.<br /><br />If this account were true, the third premise of the Dutch Book argument would be plausible. If having a set of mental states were to guarantee as a matter of metaphysical necessity that you'd make a dominated series of choices when faced with a particular series of decisions, that seems sufficient to show that those credences are irrational. The problem is that, as David Christensen (1996) shows, the account itself cannot be true. 
Christensen's central point is this: credences are often and perhaps typically connected to betting behaviour and decision-making more generally; but they are often and perhaps typically connected to other things as well, such as emotional states, conative states, and other doxastic states. If I have a high credence that my partner loves me, I'm likely to pay a high proportion of the stake for a bet on it; but I'm also likely to feel joy, plan to spend more time with him, hope that his love continues, and believe that we will still be together in five years' time. What's more, none of these connections is obviously more important than any other in determining that a mental state is a credence. And each might fail while the others hold. Indeed, as Christensen notes, in Dutch Book arguments, we are concerned precisely with those cases in which there is a breakdown of the rationally required connections between credences, namely, the connections described by the probability axioms. Having a credence in one proposition usually leads you to have at least as high a credence in another proposition it entails. But, as we saw in Joachim's case, this connection can break down. So, just as Joachim's case shows that it is metaphysically possible to have a particular credence that has all the other connections that we typically associate with it except the connection to other credences, so it must be at least metaphysically possible to have a credence that has all the other connections that we associate with it but not the connection to betting behaviour posited by the first premise. Such a mental state would still count as the credence in question because of all the other connections; but it wouldn't give rise to the apparently characteristic betting behaviour that is required to run the Dutch Book argument. Moreover, note that we need not assume that the credence has none of the usual connections to betting behaviour. Consider Joachim again. 
Every Dutch Book to which he is vulnerable involves him buying a bet on $A\ \&\ B$ and selling a bet on $A$. That is, it involves him buying a bet on $A\ \&\ B$ with a positive stake and buying a bet on $A$ with a negative stake. So he would evade the argument if his credence in $A\ \&\ B$ were to lead him to buy the bets <i>with any stake</i> that the first premise says they will, while his credence in $A$ were only to lead him to buy the bets <i>with positive stake</i> that the first premise says they will. In this case, we'd surely say he has the credences we assign to him. But he would not be vulnerable to a Dutch Book argument.<br /><br />Thus, if $O$ is metaphysical necessity, the third premise might well be true; but the first premise is false.<br /><br /><h2>$O$ is nomological necessity</h2><br />Learning from the problems with the previous proposal, we might retreat to a weaker modality. For instance, we might suggest that $O$ is a nomological modality. There are two that it might be. We might say that the connection between credences and betting behaviour posited by the first premise is nomologically necessary---that is, it is entailed by the laws of nature. Or, we might say that it is nomologically highly probable---that is, the objective chance of the consequent given the antecedent is high. Let's take them in turn.<br /><br />First, $O$ is nomological necessity. The problem with this is the same as the problem with the suggestion from the previous section that $O$ is metaphysical necessity. Above, we imagined a mental state that had all the other features we'd typically expect of a particular credence in a proposition, except some range of connections to betting behaviour that was crucial for the Dutch Book argument. We noted that this would still count as the credence in question. All that needs to be added here is that the example we considered is not only metaphysically possible, but also nomologically possible. 
That is, this is not akin to an example in which the fine structure constant is different from what it actually is---in that case, it would be metaphysically possible, but nomologically impossible. There is no law of nature that entails that your credence will lead to particular betting behaviour.<br /><br />Thus, again, the first premise is false.<br /><br /><h2>$O$ is nomological high probability</h2><br />Nonetheless, while it is not guaranteed by the laws of nature that an individual with a particular credence in a proposition will engage in the betting behaviour posited by the first premise, it does seem plausible that they are very likely to do so---that is, the objective chance that they will display the behaviour given that they have the credence is high. In other words, while weakening from metaphysical to nomological necessity doesn't make the first premise plausible, weakening further to nomological high probability does. So let's suppose, then, that $O$ is nomological high probability. Unfortunately, this causes two problems for the third premise.<br /><br />Here's the first. Suppose I have credences in 1,000 mutually exclusive and exhaustive propositions. And suppose each credence is $\frac{1}{1,001}$. So they sum to $\frac{1,000}{1,001}$ rather than $1$, and thus violate Probabilism. Suppose further that each credence is 99% likely to give rise to the betting behaviour mentioned in the first premise of the Dutch Book argument; and suppose that whether one of the credences does or not is independent of whether any of the others does or not. Then the objective chance that the set of 1,000 credences will give rise to the betting behaviour that will lose me money for sure is $0.99^{1,000} \approx 0.00004 \approx \frac{1}{23,163}$. And this tells against the third premise. After all, what is so irrational about a set of credences that will lead to a dominated series of choices less than once in every 20,000 times I face the bets described in the Dutch Book argument against me?<br /><br />Here's the second problem. 
On the account we are considering, having a particular credence in a proposition makes it highly likely that you'll bet in a particular way. Let's say, then, that you violate Probabilism, and your credences do indeed result in you making a dominated series of choices. The third premise infers from this that your credences are irrational. But why lay the blame at the credences' door? After all, there is another possible culprit, namely, the probabilistic connection between the credence and the betting behaviour. Consider an analogy. Suppose that, as the result of some bizarre causal pathway, when I fall in love, it is very likely that I will feed myself a diet of nothing but mud and leaves for a week. I hate the taste of the mud and the leaves make me very sick, and so I lower my utility considerably by responding in this way. But I do it anyway. In this case, we would not, I think, say that it is irrational to fall in love. Rather, we'd say that what is irrational is my response to falling in love. Similarly, suppose I make a dominated series of choices and thus reveal some irrationality in myself. Then, for all the Dutch Book argument says, it might be that the irrationality lies not in the credences, but rather in my response to having those credences. <br /><br />Thus, on this account, the first premise is plausible, but the third premise is unmotivated, for it imputes irrationality to my credences when it might instead lie in my response to having those credences.<br /><br /><h2>$O$ is deontic necessity</h2><br />A natural response to the argument of the previous section is that the analogy between the credence-betting connection and the love-diet connection fails because the first is a rationally appropriate connection, while the latter is not. 
This leads us to suggest, along with Christensen (1996), that the connection between credences and betting behaviour at the heart of the Dutch Book argument is not governed by a descriptive modality, such as metaphysical or nomological modality, but rather by a prescriptive modality, such as deontic modality. In particular, it suggests that what the first premise says is not that someone with a particular credence in a proposition <i>will</i> or <i>might</i> or <i>probably will</i> accept certain bets on that proposition; but rather that they <i>should</i> or <i>may</i> or <i>have good but defeasible reason</i> to do so.<br /><br />Let's begin with deontic necessity. Here, my objection is that, if this is the modality at play in the first and third premise, then the argument is self-defeating. To see why, consider Joachim again. Suppose the modality is deontic necessity, and suppose that the first premise is true. So Joachim is rationally required to make a dominated series of choices---buy the £$100$ bet on $A\ \&\ B$ for £$50$; sell the £$100$ bet on $A$ for £$40$. Now suppose further that the third premise is true as well---it does, after all, seem plausible on this account of the modality involved. Then we conclude that Joachim's credences are irrational. But surely it is not rationally required to choose in line with irrational credences. Surely what is rationally required of Joachim instead is that he should correct his irrational credences so that they are now rational, and he should then choose in line with his new rational credences. Now, whatever other features they have, his new rational credences must obey Probabilism. If not, they will be vulnerable to the Dutch Book argument and thus irrational. But the Converse Dutch Book Theorem shows that, if they obey Probabilism, they will not rationally require or even permit Joachim to make a dominated series of choices. 
And, in particular, they neither require nor permit him to accept both of the bets described in the original argument. But from this we can conclude that the first premise is false. Joachim's original credences do not rationally require him to accept both of the bets; instead, rationality requires him to fix up those credences and choose in line with the credences that result. But those new fixed up credences do not require what the first premise says they require. Indeed, they don't even permit what the first premise says they require. So, if the premises of the Dutch Book argument are true, Joachim's credences are irrational, and thus the first premise of the argument is false.<br /><br />Thus, on this account, the Dutch Book argument is self-defeating: if it succeeds, its first premise is false.<br /><br /><h2>$O$ is deontic possibility</h2><br />A similar problem arises if we take the modality to be deontic possibility, rather than necessity. On this account, the first premise says not that Joachim is required to make each of the choices in the dominated series of choices, but rather that he is permitted to do so. The third premise must then judge a person irrational if they are permitted to accept each choice in a dominated series of choices. If we grant that, we can conclude that Joachim's credences are irrational. And again, we note that rationality therefore requires him to fix up those credences first and then to choose in line with the fixed up credences. But just as those fixed up credences don't <i>require</i> him to make each of the choices in the dominated series, so they don't <i>permit</i> him to make them either. 
So the Dutch Book argument, if successful, undermines its first premise again.<br /><br />Again, the Dutch Book argument is self-defeating.<br /><br /><h2>$O$ is the modality of defeasible reasons</h2><br />The final possibility we will consider: Joachim's credences neither rationally require nor rationally permit him to make each of the choices in the dominated series; but perhaps we might say that each credence gives him a <i>pro tanto</i> or defeasible reason to accept the corresponding bet. That is, we might say that Joachim's credence of 60% in $A\ \&\ B$ gives him a <i>pro tanto</i> or defeasible reason to buy a £$100$ bet on $A\ \&\ B$ for £$50$, while his credence of 30% in $A$ gives him a <i>pro tanto</i> or defeasible reason to sell a £$100$ bet on $A$ for £$40$. As we saw above, those reasons must be defeasible, since they will be defeated by the fact that Joachim's credences, taken together, are irrational. Since they are irrational, he has stronger reason to fix up those credences and choose in line with the fixed up ones than he has to choose in line with his original credences. But his original credences nonetheless still provide some reason in favour of accepting the bets.*<br /><br />Rendered thus, I think the first premise is quite plausible. The problem is that the third premise is not. It must say that it is irrational to have any set of mental states where (i) each state in the set gives <i>pro tanto</i> reason to make a particular choice and (ii) taken together, that series of choices is dominated by another series of choices. But that is surely false. Suppose I believe this car in front of me is two years old and I also believe it's done 200,000 miles. The first belief gives me <i>pro tanto</i> or defeasible reason to pay £$5,000$ for it. The second gives me <i>pro tanto</i> reason to sell it for £$500$ as soon as I own it. Doing both of these things will lose me £$4,500$ for sure. But there is nothing irrational about my two beliefs. 
The problem arises only if I make decisions in line with the reasons given by just one of the beliefs, rather than taking into account my whole doxastic state. If I were to attend to my whole doxastic state, I'd never pay £$5,000$ for the car in the first place. And the same might be said of Joachim. If he pays attention only to the reasons given by his credence in $A\ \&\ B$ when he considers the bet on that proposition, and pays attention only to the reasons given by his credence in $A$ when he considers the bet on that proposition, he will choose a dominated series of options. But if he looks to the whole credal state, and if the Dutch Book argument succeeds, he will see that its irrationality defeats those reasons and gives him stronger reason to fix up his credences and act in line with those. In sum, there is nothing irrational about a set of mental states each of which individually gives you <i>pro tanto</i> or defeasible reason to choose an option in a dominated series of options.<br /><br />On this account, the first premise may be true, but the third is false.<br /><br /><h2>Conclusion</h2><br />In conclusion, there is no account of the modality involved in the first and third premises of the Dutch Book argument that can make both premises true. Metaphysical and nomological necessity are too strong to make the first premise true. Nomological high probability is not, but it does not make the third premise true. Deontic necessity and possibility render the argument self-defeating, for if the argument succeeds, the first premise must be false. Finally, the modality of defeasible reasons, like nomological high probability, renders the first premise plausible. But it is not sufficient to secure the third premise.<br /><br />Before we conclude, let's consider briefly how these considerations affect money pump arguments. 
The first premise of a money pump argument does not posit a connection between credences and betting behaviour, but between preferences and betting behaviour. In particular: if I prefer one option to another, there will be some small amount of money I'll be prepared to pay to receive the first option rather than the second. As with the Dutch Book argument, the question arises what the modal force of this connection is. And indeed the same candidates are available. What's more, the same considerations tell against each of those candidates. Just as credences are typically connected not only to betting behaviour but also to emotional states, intentional states, and other doxastic states, so preferences are typically connected to emotional states, intentional states, and other preferences. If I prefer one option to another, then this might typically lead me to pay a little to receive the first rather than the second; but it will also typically lead me to hope that I will receive the first rather than the second, to fear that I'll receive the second, to intend to choose the first over the second when faced with such a choice, and to have a further preference for the first and a small loss of money over the second. And again the connections to behaviour are no more central to this preference than the connections to the emotional states of hope and fear, the intentions to choose, and the other preferences. So the modal force of the connection posited by the first premise cannot be metaphysical or nomological necessity. And for the same reasons as above, it cannot be nomological high probability, deontic necessity or possibility, or the modality of defeasible reasons. In each case, the same objections hold. <br /><br />So these two central instances of 'by their fruits' reasoning fail. 
We cannot give an account of the connection between the mental states and their evil fruit that renders the argument successful.<br /><br />[* Thanks to Jason Konek for pushing me to consider this account.]<br /><h2>References</h2><br /><ul><li>Christensen, D. (1996). Dutch-Book Arguments Depragmatized: Epistemic Consistency for Partial Believers. <i>The Journal of Philosophy</i>, 93(9), 450–479.</li><li>Davidson, D., McKinsey, J. C. C., & Suppes, P. (1955). Outlines of a Formal Theory of Value, I. <i>Philosophy of Science</i>, 22(2), 140–160.</li><li>de Finetti, B. (1937). Foresight: Its Logical Laws, Its Subjective Sources. In H. E. Kyburg & H. E. Smokler (Eds.), <i>Studies in Subjective Probability</i>. Huntington, N. Y.: Robert E. Krieger Publishing Co.</li><li>Ramsey, F. P. (1931). Truth and Probability. In <i>The Foundations of Mathematics and Other Logical Essays</i> (pp. 156–198).</li></ul>Posted by Richard Pettigrew (http://www.blogger.com/profile/07828399117450825734).<br /><br />Credences in vague propositions: supervaluationist semantics and Dempster-Shafer belief functions<br />Published 2019-02-02T17:38:00.000+00:00; updated 2019-02-02T19:10:44.208+00:00.<br /><br />Safet is considering the proposition $R$, which says that the handkerchief in his pocket is red. Now, suppose we take <i>red</i> to be a vague concept. And suppose we favour a supervaluationist semantics for propositions that involve vague concepts. According to such a semantics, there is a set of legitimate precisifications of the concept <i>red</i>, and a proposition that involves that concept is true if it is true relative to all legitimate precisifications, false if false relative to all legitimate precisifications, and neither if true relative to some and false relative to others. 
So <i>London buses are red</i> is true, <i>Daffodils are red</i> is false, and <i>Cherry blossom is red</i> is neither.<br /><br />Safet is assigning a credence to $R$ and a credence to its negation $\overline{R}$. He assigns 20% to $R$ and 20% to $\overline{R}$. Normally, we'd say that he is irrational, since his credences in mutually exclusive and exhaustive propositions don't sum to 100%. What's more, we'd demonstrate his irrationality using either<br /><br />(i) a sure loss betting argument, which shows there is a finite series of bets, each of which his credences require him to accept but which, taken together, are guaranteed to lose him money; or<br /><br />(ii) an accuracy argument, which shows that there are alternative credences in those two propositions that are guaranteed to be closer to the ideal credences.<br /><br />However, in Safet's case, both arguments fail.<br /><br />Take the sure loss betting argument first. According to that, my credences require me to sell a £100 bet on $R$ for £30 and sell a £100 bet on $\overline{R}$ for £30. Thus, I will receive £60 from the sale of these two bets. Usually the argument proceeds by noting that, however the world turns out, either $R$ is true or $\overline{R}$ is true. So I will have to pay out £100 regardless. And I'm therefore guaranteed to lose £40 overall. But, in a supervaluationist semantics, this assumption isn't true. If Safet's handkerchief is a sort of pinkish colour, $R$ will be neither true nor false, and $\overline{R}$ will be neither true nor false. So I won't have to pay out either bet, and I'll gain £60 overall. <br /><br />Next, take the accuracy argument. According to that, my credences are more accurate the closer they lie to the ideal credences; and the ideal credence in a true proposition is 100% while the ideal credence in a proposition that isn't true is 0%. 
Then, given that the measure of distance between credence functions has a particular property, we usually show that there are alternative credences in $R$ and $\overline{R}$ that are closer to each set of ideal credences than Safet's are. For instance, if we measure the distance between two credence functions $c$ and $c'$ using the so-called squared Euclidean distance, so that $$SED(c, c') = (c(R) - c'(R))^2 + (c(\overline{R}) - c'(\overline{R}))^2$$ then credences of 50% in both $R$ and $\overline{R}$ are guaranteed to be closer than Safet's to the credences of 100% in $R$ and 0% in $\overline{R}$, which are ideal if $R$ is true, and closer than Safet's to the credences of 0% in $R$ and 100% in $\overline{R}$, which are ideal if $\overline{R}$ is true. Now, if $R$ is a classical proposition, then this covers all the bases: either $R$ is true or $\overline{R}$ is. But since $R$ has a supervaluationist semantics, there is a further possibility. After all, if Safet's handkerchief is a sort of pinkish colour, $R$ will be neither true nor false, and $\overline{R}$ will be neither true nor false. So the ideal credences will be 0% in $R$ and 0% in $\overline{R}$. And 50% in $R$ and 50% in $\overline{R}$ is not closer than Safet's to those credences. Indeed, Safet's are closer.<br /><br />So our usual arguments that try to demonstrate that Safet is irrational fail. So what happens next? The answer was given by Jeff Paris (<a href="http://www.maths.manchester.ac.uk/~jeff/papers/15.ps" target="_blank">'A Note on the Dutch Book Method'</a>). He argued that the correct norm for Safet is not Probabilism, which requires that his credence function is a probability function, and therefore declares him irrational. Instead, it is Dempster-Shaferism, which requires that his credence function is a Dempster-Shafer belief function, and therefore declares him rational. 
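(A quick check of that last claim, offered as an illustrative sketch rather than as part of Paris's argument. It anticipates the definition of a belief function given below, and it assumes, purely for illustration, that $\mathcal{F}$ is the four-element algebra $\{\bot, R, \overline{R}, \top\}$ and that Safet also assigns $c(\bot) = 0$ and $c(\top) = 1$. Then (DS1a) and (DS1b) hold, the instances of (DS2) for $\bot$, $R$ and $\overline{R}$ hold trivially, and the instance for $\top$ is $$c(\top) = 1 \geq c(R) + c(\overline{R}) - c(\bot) = 0.2 + 0.2 - 0 = 0.4$$ So, on these assumptions, Safet's credence function is a Dempster-Shafer belief function, though it is not a probability function, since $c(R) + c(\overline{R}) \neq 1$.) 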
To establish this, Paris showed how to tweak the standard sure loss betting argument for Probabilism, which depends on a background logic that is classical, to give a sure loss betting argument for Dempster-Shaferism, which depends on a background logic that comes from the supervaluationist semantics. To do this, he borrowed an insight from Jean-Yves Jaffray (<a href="https://link.springer.com/article/10.1007/BF00159221" target="_blank">'Coherent bets under partially resolving uncertainty and belief functions'</a>). Robbie Williams then appealed to Jaffray's theorem to tweak the accuracy argument for Probabilism to give an accuracy argument for Dempster-Shaferism (<a href="https://www.jstor.org/stable/41653758" target="_blank">'Generalized Probabilism: Dutch Books and Accuracy Domination'</a>). However, Jaffray's result doesn't explicitly mention supervaluationist semantics. And neither Paris nor Williams fill in the missing details. So I thought it might be helpful to lay out those details here.<br /><br />I'll start by sketching the argument. Then I'll go into the mathematical detail. So first, the law of credences that we'll be justifying. We begin with a definition. Throughout we'll consider only credence functions on a finite Boolean algebra $\mathcal{F}$. We'll represent the propositions in $\mathcal{F}$ as subsets of a set of possible worlds.<br /><br /><b>Definition (belief function)</b> Suppose $c : \mathcal{F} \rightarrow [0, 1]$. Then $c$ is a Dempster-Shafer belief function if<br /><ul><li>(DS1a) $c(\bot) = 0$</li><li>(DS1b) $c(\top) = 1$</li><li>(DS2) For any proposition $A$ in $\mathcal{F}$,$$c(A) \geq \sum_{B \subsetneqq A} (-1)^{|A-B|+1}c(B)$$</li></ul>Then we state the law:<b> </b><br /><br /><b>Dempster-Shaferism</b> $c$ should be a D-S belief function. <br /><br />Now, suppose $Q$ is a set of legitimate precisifications of the concepts that are involved in the propositions in $\mathcal{F}$. 
Essentially, $Q$ is a set of functions each of which takes a possible world and returns a classically consistent assignment of truth values to the propositions in $\mathcal{F}$. Given a possible world $w$, let $A_w$ be the strongest proposition that is true at $w$ on all legitimate precisifications in $Q$. If $A = A_w$ for some world $w$, we say that $A$ is a <i>state description for </i>$w$.<br /><br /><b>Definition (belief function$^*$) </b>Suppose $c : \mathcal{F} \rightarrow [0, 1]$. Then $c$ is a Dempster-Shafer belief function$^*$ relative to a set of precisifications if $c$ is a Dempster-Shafer belief function and<br /><ul><li>(DS3) For any proposition $A$ in $\mathcal{F}$ that is not a state description for any world, $$c(A) = \sum_{B \subsetneqq A} (-1)^{|A-B|+1}c(B)$$</li></ul><b>Dempster-Shaferism$^*$</b> $c$ should be a Dempster-Shafer belief function$^*$.<br /><br />It turns out that Dempster-Shaferism$^*$ is the strongest credal norm that we can justify using sure loss betting arguments and accuracy arguments. The sure loss betting argument is based on the following assumption: Let's say that a £$S$ bet on a proposition $A$ pays out £$S$ if $A$ is true and £0 otherwise. Then, if your credence in $A$ is $p$, you are required to pay anything less than £$pS$ for a £$S$ bet on $A$. With that in hand, we can show that you are immune to a sure loss betting argument iff your credence function is a Dempster-Shafer belief function$^*$. That is, if your credence function violates Dempster-Shaferism$^*$, then there is a finite set of bets on propositions in $\mathcal{F}$ such that (i) your credences require you to accept each of them, and (ii) together, they lose you money in all possible worlds. 
If your credence function satisfies Dempster-Shaferism$^*$, there is no such set of bets.<br /><br />The accuracy argument is based on the following assumption: The ideal credence in a proposition at a world is 1 if that proposition is true at the world, and 0 otherwise; and the distance from one credence function to another is measured by a particular sort of measure called a Bregman divergence. With that in hand, we can show that you are immune to an accuracy dominance argument iff your credence function is a Dempster-Shafer belief function$^*$. That is, if your credence function violates Dempster-Shaferism$^*$, then there is an alternative credence function that is closer to the ideal credence function than yours at every possible world. If your credence function satisfies Dempster-Shaferism$^*$, there is no such alternative.<br /><br />So much for the sketch of the arguments. Now for some more details. Suppose $c : \mathcal{F} \rightarrow [0, 1]$ is a credence function defined on the set of propositions $\mathcal{F}$. Often, we don't have to assume anything about $\mathcal{F}$, but in the case we're considering here, we must assume that it is a finite Boolean algebra. In both sure loss arguments and accuracy arguments, we need to define a set of functions, one for each possible world. In the sure loss arguments, these specify when certain bets pay out; in the accuracy arguments, they specify the ideal credences. In the classical case and in the supervaluationist case that we consider here, they coincide. Given a possible world $w$, we abuse notation and write $w : \mathcal{F} \rightarrow \{0, 1\}$ for the following function:<br /><ul><li>$w(A) = 1$ if $A$ is true at $w$---that is, if $A$ is true on all legitimate precisifications at $w$;</li><li>$w(A) = 0$ if $A$ is not true at $w$---that is, if $A$ is false on some (possibly all) legitimate precisifications at $w$. 
</li></ul>Then, given our assumptions, we have that a £$S$ bet on $A$ pays out £$Sw(A)$ at $w$; and we have that $w(A)$ is the ideal credence in $A$ at $w$. Now, let $\mathcal{W}$ be the set of these functions. And let $\mathcal{W}^+$ be the convex hull of $\mathcal{W}$. That is, $\mathcal{W}^+$ is the smallest convex set that contains $\mathcal{W}$. In other words, $\mathcal{W}^+$ is the set of convex combinations of the functions in $\mathcal{W}$. There is then <a href="https://m-phi.blogspot.com/2013/09/the-mathematics-of-dutch-book-arguments.html" target="_blank">a general result</a> that says that $c$ is vulnerable to a sure loss betting argument iff $c$ is not in $\mathcal{W}^+$. And there is another general result that says that $c$ is accuracy dominated iff $c$ is not in $\mathcal{W}^+$. To complete our argument, therefore, we must show that $\mathcal{W}^+$ is precisely the set of Dempster-Shafer belief functions$^*$. That's the central purpose of this post. And that's what we turn to now.<br /><br />We start with some definitions that allow us to give an alternative characterization of the Dempster-Shafer belief functions and belief functions$^*$.<br /><br /><b>Definition (mass function)</b> Suppose $m : \mathcal{F} \rightarrow [0, 1]$. Then $m$ is a mass function if<br /><ul><li>(M1) $m(\bot) = 0$</li><li>(M2) $\sum_{A \in \mathcal{F}} m(A) = 1$</li></ul><b>Definition (mass function$^*$)</b> Suppose $m : \mathcal{F} \rightarrow [0, 1]$. 
Then $m$ is a mass function$^*$ relative to a set of precisifications if $m$ is a mass function and <br /><ul><li>(M3) For any proposition $A$ in $\mathcal{F}$ that is not the state description of any world, $m(A) = 0$.</li></ul><b>Definition ($m$ generates $c$)</b> If $m$ is a mass function and $c$ is a credence function, we say that $m$ generates $c$ if, for all $A$ in $\mathcal{F}$, $$c(A) = \sum_{B \subseteq A} m(B)$$ That is, a mass function generates a credence function iff the credence assigned to a proposition is the sum of the masses assigned to the propositions that entail it.<br /><br /><b>Theorem 1</b><br /><ul><li>$c$ is a Dempster-Shafer belief function iff there is a mass function $m$ that generates $c$.</li><li>$c$ is a Dempster-Shafer belief function$^*$ iff there is a mass function$^*$ $m$ that generates $c$.</li></ul><i>Proof of Theorem 1</i> Suppose $m$ is a mass function that generates $c$. Then it is straightforward to verify that $c$ is a D-S belief function. Suppose $c$ is a D-S belief function. Then let$$m(A) = c(A) - \sum_{B \subsetneqq A} (-1)^{|A-B|+1}c(B)$$ This is non-negative, by (DS2), since $c$ is a belief function. It is then straightforward to verify that $m$ is a mass function. And it is straightforward to see that $m(A) = 0$ iff $c(A) = \sum_{B \subsetneqq A} (-1)^{|A-B|+1}c(B)$. That completes the proof.<br /><br /><b>Theorem 2</b> $c$ is in $\mathcal{W}^+$ iff $c$ is a Dempster-Shafer belief function$^*$.<br /><br /><i>Proof of Theorem 2 </i>Suppose $c$ is in $\mathcal{W}^+$. So $c(-) = \sum_{w \in \mathcal{W}} \lambda_w w(-)$. Then:<br /><ul><li>if $A$ is the state description for world $w$ (that is, $A = A_w$), then let $m(A) = m(A_w) = \lambda_w$;</li><li>if $A$ is not a state description of any world, then let $m(A) = 0$.</li></ul>Then $m$ is a mass function$^*$. And $m$ generates $c$. So $c$ is a Dempster-Shafer belief function$^*$.<br /><br />Suppose $c$ is a Dempster-Shafer belief function$^*$ generated by a mass function$^*$ $m$. 
Then for world $w$, let $\lambda_w = m(A_w)$. Then $c(-) = \sum_{w \in \mathcal{W}} \lambda_w w(-)$. So $c$ is in $\mathcal{W}^+$.<br /><br />This completes the proof. And with the proof we have the sure loss betting argument and the accuracy dominance argument for Dempster-Shaferism$^*$ when the propositions about which you have an opinion are governed by a supervaluationist semantics.<br /><br />Posted by Richard Pettigrew (http://www.blogger.com/profile/07828399117450825734).<br /><br />Dutch Books and Reflection<br />Published 2018-10-26T09:24:00.000+01:00; updated 2018-10-26T09:24:13.976+01:00.<br /><br /><div style="text-align: justify;">I've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">-------<br /><br />In <a href="https://m-phi.blogspot.com/2018/10/dutch-books-and-conditionalization.html" target="_blank">yesterday's post</a>, I walked through the Dutch Strategy argument for Conditionalization. Today, I'd like to think about a standard objection to it. As van Fraassen (1984) pointed out, we can give a seemingly analogous Dutch Strategy argument for his Reflection Principle, which says:<br /><br /><b>Reflection Principle</b> If $c(C_{c_i}) > 0$, then $c(X | C_{c_i}) = c_i(X)$.<br /><br />We'll consider the details of the argument below. Now, van Fraassen took this argument to count in favour of his Reflection Principle. Indeed, since the Reflection Principle looks implausible on all accounts of credence except van Fraassen's voluntarism, he appealed to this Dutch Strategy Argument for the Reflection Principle in an argument for voluntarism. 
But most philosophers have seen a <i>modus tollens</i> where van Fraassen saw a <i>modus ponens</i>. After all, the Reflection Principle demands a level of deference to your future credences that is sometimes simply not rationally permitted, let alone required. For instance, if Sandy knows that between Monday and Tuesday, he will take a drug that makes him enormously and irrationally under-confident in extreme climate scenarios and over-confident in more moderate scenarios, his confidence in <i>Medium</i> on Monday conditional on being 90% confident in <i>Medium</i> on Tuesday, for instance, should not be 90%---it should be less than that (Christensen 1991). Thus, from the denial of the Reflection Principle, many infer that the Dutch Strategy argument in its favour is invalid, and from that they infer that all such arguments are invalid, and thus they cast doubt on the particular Dutch Strategy argument for conditionalizing. <br /><br />R. A. Briggs responds to this objection to the Dutch Strategy argument for conditionalizing by arguing that, contrary to appearances, there is a disanalogy between the two Dutch Strategy arguments. This allows us to reject the argument for the Reflection Principle as invalid, while retaining the argument for conditionalizing as valid. To see Briggs' point, let's place the two arguments side by side. First, the Dutch Strategy Argument for Rule Conditionalization. Suppose my credence function at $t$ is $c$, suppose $c(E) > 0$, and suppose that, if I learn $E$ and nothing more between $t$ and $t'$, I will adopt $c'$ at $t'$, where $c'(X) \neq c(X | E)$, for some $X$. 
Then there are sets of bets $B$, $B'_E$, and $B'_{\overline{E}}$ such that $c$ requires me to accept $B$, $c'$ requires me to accept $B'_E$, and any credence function requires me to accept $B'_{\overline{E}}$, where, taken together, $B$ and $B'_E$ will lose me money in all worlds at which $E$ is true and, taken together, $B$ and $B'_{\overline{E}}$ will lose me money in all worlds at which $E$ is false. Second, the Dutch Strategy Argument for Reflection. Suppose $c(C_{c_i}) > 0$ and suppose that $c(X | C_{c_i}) \neq c_i(X)$, for some $X$. Then there are bets $B$, $B'_{C_{c_i}}$, and $B'_{\overline{C_{c_i}}}$ such that $c$ requires me to accept $B$, $c_i$ requires me to accept $B'_{C_{c_i}}$, and any credence function requires me to accept $B'_{\overline{C_{c_i}}}$, where, taken together, the bets in $B$ and $B'_{C_{c_i}}$ will lose me money in all worlds at which my credence function at $t'$ is indeed $c_i$, and, taken together, $B$ and $B'_{\overline{C_{c_i}}}$ will lose me money in all worlds at which $c_i$ is not my credence function at $t'$. So, as Briggs points out, if you will update other than conditionalizing if you learn $E$, then whatever evidence comes your way---whether $E$ or something else---the strategy described will generate bets that, taken together, will lose you money at all worlds at which your evidence is true. That is, they will lose you money at all epistemically possible worlds, which is what is required to establish irrationality. But, if you violate Reflection, then whatever credence function you adopt at $t'$---whether $c_i$ or something else---the strategy described will generate bets that, taken together, will lose you money at all worlds at which you adopt that credence function. However, that is not necessarily all epistemically possible worlds. For you might not know what your credences are at $t'$. 
In that case, even if I actually adopt $c_i$ at $t'$, there will nonetheless be an epistemically possible world at which I didn't adopt that, and then the bets in $B$ and $B'_{C_{c_i}}$, taken together, might not lose me money. And that blocks the Dutch Strategy argument for Reflection.<br /><br />However, while Briggs successfully blocks the argument for Reflection in its strong, general form, Anna Mahtani (2012) points out that they do not block a Dutch Strategy argument for a weak, more specific version of Reflection:<br /><br /><b>Weak Reflection Principle</b> If at $t'$ you will know what your credence function is, then if $c(C_{c_i}) > 0$, then $c(X | C_{c_i}) = c_i(X)$.<br /><br />After all, if you satisfy the antecedent of the principle, then it cannot be that, after you adopt credence function $c_i$, it is still epistemically possible that you have some different credence function. <br /><br />Now, the Weak Reflection Principle is hardly more plausible than the stronger version. That is, it is still very implausible. Knowing that his credences will be completely luminous to him after he takes the mind-altering drug should not make Sandy any more inclined to defer to the credences he will end up having after taking it. Thus, the objection to the Dutch Strategy argument for Conditionalization remains intact.<br /><br />How, then, should we respond to this objection? The first thing to note is that, in a sense, the Dutch Strategy argument for Reflection does not actually target Reflection. Or, at least, it doesn't target it directly. One way to see this is to note that, unlike the versions of conditionalization we have been considering, Reflection is a synchronic norm. It says something about how your credences should be at $t$. It says nothing about how your credences at $t$ should relate to your credences at $t'$, only how your credences <i>at $t$</i> about your credences at $t'$ should relate to your other credences <i>at $t$</i>. 
But the Dutch Strategy argument involves bets that your credences at $t$ require you to accept, as well as bets that your credences at $t'$ require you to accept. You can violate Reflection, and have probabilistic credences---so the Converse Dutch Book Theorem shows that there is no synchronic Dutch Book argument against your credences; that is, there is no set of bets that $c$ alone requires you to accept that will lose you money at all epistemically possible worlds.<br /><br />So what's going on? The key fact is this: if you violate Reflection, and you have a deterministic updating rule, then that updating rule cannot possibly be a conditionalizing rule. After all, suppose $c(C_{c_i}) > 0$ and $c(X | C_{c_i}) \neq c_i(X)$ and you learn $C_{c_i}$ and nothing more. Then, since you learn $C_{c_i}$, it must be true and thus your new credence function must be $c_i$. But your violation of Reflection ensures that $c_i$ is not the result of conditionalizing on your evidence $C_{c_i}$. So the Dutch Strategy argument for Reflection does not target Reflection itself; rather, it targets the updating rule you are forced to adopt because you violate Reflection. <br /><br />Consider an analogous case---Briggs also considers this analogy, but draws a different moral from it. Suppose you think that it is irrational to have a set of beliefs that can't possibly all be true. Now, suppose you have the following second-order belief: you believe that you believe a contradiction, such as $X\ \&\ \overline{X}$. Then that belief itself might be true. So, by your standards, on its own, it is not irrational. However, suppose we now consider what your attitude to $X\ \&\ \overline{X}$ is. Whatever attitude you have, you are guaranteed to have a false belief: if you do believe the contradiction, your second-order belief is true, but your first-order belief is false; if you don't believe the contradiction, then your second-order belief itself is false. 
In this case, we might say that the belief itself is not irrational---it might be true, and it might be supported by your evidence. But its presence forces your total doxastic state to be irrational.<br /><br />The same thing is going on in the case of Reflection. Just as you think that it is irrational to have beliefs that cannot all be true, you also think it is irrational to have credences that require you to enter into bets that lose you money for sure. And just as the single second-order belief that you believe a contradiction is possibly true, so by the Converse Dutch Book Theorem, a credence function that violates Reflection doesn't require you to accept any bets that will lose you money for sure. However, just as the single belief that you believe a contradiction forces you to have an attitude to the contradiction (either believing it or not) that ensures that your total doxastic state (first- and second-order beliefs together) includes a false belief, so your violation of Reflection forces you to adopt an updating rule that is vulnerable to a Dutch Strategy. For this reason, we can allow that you are irrational if you are vulnerable to a Dutch Strategy without rendering violations of Reflection irrational, just as we can allow that you are irrational if you have beliefs that are guaranteed to include some falsehoods without rendering the second-order belief that you believe a contradiction irrational. Both force you to adopt some other sort of doxastic state---a first-order belief or an updating rule---that is irrational. But they are not themselves irrational. This saves the Dutch Strategy argument for Conditionalization.<br /><br /><h2>References</h2><ul><li>Briggs, R. A. (2009). Distorted Reflection. <i>Philosophical Review</i>, 118(1), 59–85.</li><li>Christensen, D. (1991). Clever Bookies and Coherent Beliefs. <i>Philosophical Review</i>, 100(2), 229–247.</li><li>Mahtani, A. (2012). Diachronic Dutch Book Arguments. 
<i>Philosophical Review</i>, 121(3), 443–450.</li></ul></div>Posted by Richard Pettigrew (http://www.blogger.com/profile/07828399117450825734).<br /><br />Dutch Books and Conditionalization<br />Published 2018-10-25T13:13:00.000+01:00; updated 2018-10-25T13:13:12.666+01:00.<br /><br /><div style="text-align: justify;">I've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">-------</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">In this post, I'm interested in Dutch Book or sure loss arguments for updating norms, which say how you should change your credences in response to new evidence.<br /><br />Roughly speaking, there are two sorts of updating norm: the first governs <i>actual</i> features of your updating behaviour, such as the credence function you <i>actually</i> adopt after receiving new evidence; the second governs both <i>actual</i> <i>and counterfactual</i> features of your updating behaviour, such as the credence function you <i>would</i> adopt or the credence functions you <i>might</i> adopt were you to learn one thing, the credence function you <i>would</i> adopt or the credence functions you <i>might</i> adopt were you to learn some other thing. As we will see, there are no good sure loss arguments for the first sort of norm. We'll see this in two ways. First, we'll see that, while there's one sort of sure loss that you are vulnerable to if you actually don't update in the prescribed way, you are also vulnerable to this if you actually do update in the prescribed way. 
Second, we'll see that, while there's a second sort of sure loss that doesn't have the problems of the first, it is possible actually to update in the prescribed way and yet still be vulnerable to this sort of sure loss (if you wouldn't have updated in the prescribed way had you learned something other than what you actually learned), and it is possible to not actually update in the prescribed way and yet nonetheless not be vulnerable to this sort of sure loss (if you might have updated differently in the light of the evidence you actually received, and if those other possible updates differ from your actual update in a certain way).<br /><br />To illustrate these points, let's pursue an example throughout the chapter. Consider Sandy. On Monday, Sandy is 40% confident that the global mean surface temperature will rise by between 0 and 1 degrees Celsius in the next 100 years. That is his unconditional credence in the proposition <i>Medium</i> on Monday. As well as that unconditional credence, he also has various conditional credences on Monday. For instance, let <i>CO$_2$ High</i> be the proposition that current CO$_2$ levels are greater than 420ppm, and let <i>CO$_2$ Low</i> be its negation. On Monday, Sandy has a conditional credence in <i>Medium</i> given <i>CO$_2$ High</i>---it is 60%. And he has a conditional credence in <i>Medium</i> given <i>CO$_2$ Low</i>---it is 20%. By definition, his conditional credence in one proposition $A$ given another $B$ is the proportion of his credence in $B$ he gives also to $A$. That is, it is the ratio of his credence in the conjunction $A\ \&\ B$ to his credence in $B$. Now, on Tuesday, Sandy learns <i>CO$_2$ High</i>. In the light of this, he updates his credences. His new unconditional credence in <i>Medium</i> is 70%. We naturally judge him irrational. 
If he were rational, his unconditional credences on Tuesday would be the same as his conditional credences on Monday given <i>CO$_2$ High</i>, the proposition he gained as evidence in between times. In the jargon, he would update by conditionalizing on his new evidence, <i>CO$_2$ High</i>. But he doesn't.<br /><br />So Sandy violates a putative norm that governs his actual updating behaviour:</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>Actual Conditionalization </b> If</div><ul style="text-align: justify;"><li>$c$ is your credence function at $t$,</li><li>$c'$ is your credence function at a later time $t'$,</li><li>$E$ is the strongest evidence you obtained between $t$ and $t'$, and</li><li>$c(E) > 0$,</li></ul><div style="text-align: justify;">then it ought to be that $c'(X) = c(X | E)$, for all $X$.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Is there a sure loss argument against Sandy? As we said at the beginning, there is one sort of sure loss argument against him, but this proves too much; there is also another sort of sure loss argument, but it does not apply to Sandy only on the basis of his actual updating behaviour---if it applies to him at all, it is on the basis of other modal features of him. <br /><br />Throughout, we'll assume that Sandy's credences are probabilistic on Monday and probabilistic on Tuesday. As a result, the Converse Dutch Book Theorem tells us that there is no Dutch Book that can be made against his credences on Monday and no Dutch Book that can be made against his credences on Tuesday. Perhaps, though, there is some sort of Dutch Book we can make against the combination of his credences at the two different times. That is, perhaps there is a <i>diachronic Dutch Book</i> against Sandy. 
This would consist of a set of bets offered on Monday and a set of bets offered on Tuesday; his Monday credences would have to require him to accept the Monday bets and his Tuesday credences would have to require him to accept the Tuesday bets; and, taken together, those bets would be guaranteed to lose him money. We might say that you are irrational if you are vulnerable to a diachronic Dutch Book. <br /><br />In fact, it turns out that there is such a diachronic Dutch Book against Sandy. His credences require him to sell a £100 bet on <i>Medium</i> for £45 on Monday, since he's 40% confident in <i>Medium</i> on Monday. And they require him to buy a £100 bet on <i>Medium</i> for £55 on Tuesday, since he's 70% confident in <i>Medium</i> on Tuesday. Taken together, these two bets will lose Sandy £10 in all epistemically possible worlds. Thus, if vulnerability to a diachronic Dutch Book is sufficient for irrationality, then Sandy is irrational.<br /><br />The problem with this argument is that Sandy is also required to accept that same pair of bets on Monday and Tuesday if he updates by conditionalizing on the evidence he learns, namely, <i>CO$_2$ High</i>. In that case, his credences on Monday are the same as before and thus require him to sell a £100 bet on <i>Medium</i> for £45 on Monday. And his credence in <i>Medium</i> on Tuesday is 60%, rather than 70%, and that still requires him to pay £55 for a £100 bet on <i>Medium</i>. So he is sure to lose £10. And indeed, unless he retains exactly the same credences between Monday and Tuesday, there will always be a pair of bets, one offered on Monday that his Monday credences require him to accept, and one offered on Tuesday that his Tuesday credences require him to accept, that, taken together, will lose him money at all epistemically possible worlds. So, if vulnerability to a diachronic Dutch Book is sufficient for irrationality, then Sandy is irrational, but so is anyone who ever changes any of their credences. 
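<br /><br />It may help to check the arithmetic behind that pair of bets. What follows is only a sketch; the labels $p$, $q$, $s$ and $b$ are introduced here for illustration and are not part of the example itself. On Monday, Sandy receives £45 and must pay out £100 if <i>Medium</i> is true; on Tuesday, he pays £55 and receives £100 if <i>Medium</i> is true. So his net payoff in pounds is $$\text{if \emph{Medium} is true: } (45 - 100) + (100 - 55) = -10, \qquad \text{if \emph{Medium} is false: } (45 - 0) + (0 - 55) = -10$$ More generally, suppose his credence in a proposition is $p$ on Monday and $q > p$ on Tuesday, and pick prices $s$ and $b$ for a £100 bet on it with $$100p < s < b < 100q$$ Selling at $s$ has positive expected value by the lights of the Monday credence, buying at $b$ has positive expected value by the lights of the Tuesday credence, and the payouts cancel, leaving a net payoff of $s - b < 0$ however things turn out. (If $q < p$, swap the roles: he buys on Monday and sells on Tuesday.) With $p = 0.4$, $s = 45$ and $b = 55$, the inequalities hold whether $q$ is $0.7$ or $0.6$. On this test, then, any change of credence between the two days would count as irrational. 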
And that surely can't be right.<br /><br />So the existence of a diachronic Dutch Book against your actual updating behaviour is not sufficient for irrationality. But why not? One natural answer is this. Come Tuesday, both we and Sandy know that he in fact learned <i>CO$_2$ High</i> (which we'll call $E$) and updated his credence in <i>Medium</i> (which we'll abbreviate $M$) on the basis of that. But, on Monday, it is still open at least from Sandy's own point of view which of $E$ or $\overline{E}$ he will learn. And, were he to learn $\overline{E}$ instead, he might well have updated his credences in a different way.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Let's suppose he would. In fact, contrary to the description of the example so far, let's suppose that, whichever he learns, he'll update by conditionalizing. So, if he learns $E$, he'll become 60% confident in $M$, and if he learns $\overline{E}$, he'll become 20% confident in $M$. Then his Monday credences require him to sell a £100 bet on $M$ for £45, and his Tuesday credences should he learn $E$ require him to buy a £100 bet on $M$ for £55, thereby losing him money whether or not $M$ is true. But his Tuesday credences should he learn $\overline{E}$ do not require him to buy a £100 bet on $M$ for £55. Indeed, they require him to refuse to buy that bet. <br /><br />This suggests the following sort of argument against someone who will update by something other than conditionalizing in the face of some evidence they might acquire. Suppose $c$ is your credence function at time $t$---it is defined on $\mathcal{F}$. There's some proposition $E$ in $\mathcal{F}$ that you might learn as evidence between an earlier time $t$ and a later time $t'$. And you'll learn $E$ just in case it's true. And suppose $c(E) > 0$. If you learn $E$---that is, if $E$ is true---you'll adopt credence function $c'$. 
If you don't learn $E$---that is, if $E$ is false---we don't know how you'll respond---perhaps it isn't determined. Then we'll say that you are vulnerable to a <i>moderate Dutch Strategy</i> if there are</div><ul style="text-align: justify;"><li>bets $B$ that $c$ requires you to accept,</li><li>bets $B'_E$ that $c'$ requires you to accept, and</li><li>bets $B'_{\overline{E}}$ that <i>any</i> credence function requires you to accept</li></ul><div style="text-align: justify;">such that</div><ul style="text-align: justify;"><li>the bets in $B$ and $B'_E$, taken together, lose you money in all worlds at which $E$ is true, and</li><li>the bets in $B$ and $B'_{\overline{E}}$, taken together, lose you money in all worlds at which $E$ is false.</li></ul><div style="text-align: justify;">And we'll say that you are irrational if you are vulnerable to a moderate Dutch Strategy. Now, if we accept this, we can give an argument for updating by conditionalizing that appeals to sure loss bets. Following R. A. Briggs' presentation of David Lewis' argument, here's how (Lewis 1999, Briggs 2009).<br /><br />Suppose $c'(X) = r' < r = c(X|E)$ and $c(E) = d > 0$. Then let $0 < \varepsilon < \frac{d(r-r')}{3}$. 
Then it is easy to see that your credences require you to accept the following bets:</div><div style="text-align: justify;"><br /></div><div class="separator" style="clear: both; text-align: justify;"><a href="https://4.bp.blogspot.com/-ik7YG3BeLrY/W9Gq9E0ceqI/AAAAAAAABnE/GnNbLd-Z1b8Sn-8rMennUaUJc-UqEMBxgCEwYBhgL/s1600/Screenshot%2B2018-10-25%2Bat%2B12.35.59.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="160" data-original-width="615" height="102" src="https://4.bp.blogspot.com/-ik7YG3BeLrY/W9Gq9E0ceqI/AAAAAAAABnE/GnNbLd-Z1b8Sn-8rMennUaUJc-UqEMBxgCEwYBhgL/s400/Screenshot%2B2018-10-25%2Bat%2B12.35.59.png" width="400" /></a></div><div style="text-align: justify;"></div><div style="text-align: justify;"></div><div style="text-align: justify;"></div><div style="text-align: justify;">After all, Bets 1 and 2 have positive expected value relative to $c$, and Bet 3 has positive expected value relative to $c'$, which you will adopt at $t'$ if $E$ is true. And it is easy to calculate that, if $E$ is true, then Bets 1, 2, and 3 taken together lose you money; and if $E$ is false, then Bets 1 and 2 taken together lose you money.<br /><br />Now notice that this argument is directed against someone who will update by something other than conditionalization on certain evidence she might receive. Thus, at least on the face of it, it is not directed against Sandy's <i>actual updating behaviour</i>, but rather against his <i>dispositions to update in different ways depending on the evidence he receives</i>---what we might call his <i>updating rule</i>. That is, the object of criticism against which the Dutch Strategy argument is posed is Sandy's updating rule. One way to see this is to ask what would happen if Sandy instead learned $\overline{E}$ and updated on $\overline{E}$ by conditionalizing on it. 
Then, even though his actual updating behaviour would have been in line with Actual Conditionalization, he would nonetheless still have been vulnerable to a Dutch Strategy because he would have strayed from conditionalization had he learned $E$ instead. This shows that Dutch Strategy arguments target irreducibly modal features of an agent---that is, they target rules or dispositions, not actual behaviour. We will see this again below. Thus, we might take the Dutch Strategy argument to establish the following norm, at least in the first instance:<br /> </div><div style="text-align: justify;"><b>Rule Conditionalization</b> If</div><ul style="text-align: justify;"><li>$c$ is your credence function at $t$,</li><li>if $E$ is the strongest evidence you obtain between $t$ and $t'$, then you will adopt $c'$ as your credence function at $t'$,</li><li>$c(E) > 0$,</li></ul><div style="text-align: justify;">then it ought to be that $c'(X) = c(X|E)$, for all $X$.<br /><br />The crucial difference between Actual and Rule Conditionalization lies in the modal status of the second clause. Whereas Actual Conditionalization targets what you <i>actually have</i> done, Rule Conditionalization targets what you <i>will</i> do.<br /><br />Now, it might seem that we can salvage an argument for Actual Conditionalization from Rule Conditionalization. Sandy violates Actual Conditionalization because his unconditional credence in $M$ on Tuesday is 70% while his conditional credence in $M$ given $E$ on Monday is 60%. But surely it is then true on Monday that he <i>will</i> adopt a credence of 70% in $M$ on Tuesday <i>if he learns $E$</i>. That is, he violates Rule Conditionalization as well.<br /><br />But there is a problem with that reasoning. Suppose Sandy's credences don't evolve deterministically. That is, suppose that, while on Tuesday it turns out that he <i>in fact</i> responded to learning $E$ by raising his confidence in $M$ to 70%, he <i>might have</i> responded differently. 
For instance, suppose that there was some possibility that he responded to the evidence $E$ by dropping his confidence to 50%. Then the Dutch strategy against Sandy described above has a hole. It tells us what to do if he learns $E$ <i>and responds by becoming 70% confident in $M$</i>. And it tells us what to do if he learns $\overline{E}$. But it says nothing about what to do if he learns $E$ <i>and he drops his confidence in $M$ to 50%</i>. And indeed it turns out that it isn't always possible to fill that gap. Thus, what the standard Dutch Strategy argument sketched above shows is that there is always a Dutch strategy against someone with a <i>deterministic</i> update rule that would make them stray from conditionalizing in some cases. Now, it turns out that, for certain non-deterministic updating rules, we can create Dutch Strategies against them too. But not all of them. Indeed, there are non-deterministic ways to update your credences that <i>always</i> lead you to not conditionalize, but for which there is no strategy for creating a Dutch Book against you.<br /><br />I'll illustrate this with an example first, and then I'll state the central fact of which the example is a particular case. Suppose Sandy does not update by a purely deterministic rule. 
Rather, his credences develop as depicted in Figure 1.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"></div><div style="text-align: justify;"></div><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-lwYKetrDvcY/W9GrUKbrvvI/AAAAAAAABnM/isGDNhrIofoJ37Y9_l2o24_dJJJ5xmPXgCLcBGAs/s1600/Screenshot%2B2018-10-25%2Bat%2B12.36.42.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="806" data-original-width="530" height="400" src="https://4.bp.blogspot.com/-lwYKetrDvcY/W9GrUKbrvvI/AAAAAAAABnM/isGDNhrIofoJ37Y9_l2o24_dJJJ5xmPXgCLcBGAs/s400/Screenshot%2B2018-10-25%2Bat%2B12.36.42.png" width="262" /></a></td></tr><tr align="left"><td class="tr-caption">Figure 1: Sandy's credences on Monday are given by $c$. On Tuesday, he learns $E$ or $\overline{E}$. If $E$, he adopts either $c_1$ or $c_2$, but it is not determined which. If $\overline{E}$, he adopts either $c_3$ or $c_4$, but it is not determined which.</td></tr></tbody></table><div style="text-align: justify;"><br />Thus, whatever happens, Sandy will not update by conditionalizing. However, there is no Dutch strategy against him. The reason is that, while Sandy does not update by conditionalizing on his strongest evidence on Tuesday, there is a way of representing him <i>as if he were updating by conditionalization on some evidence, namely, the identity of his credence function on Tuesday</i>. 
First, notice that Sandy's credence function $c$ on Monday is the average of the possible credence functions he might adopt on Tuesday---that is, for any $X$ in $\mathcal{F}$,</div><div style="text-align: justify;">$$c(X) = \frac{1}{4}c_1(X) + \frac{1}{4}c_2(X) + \frac{1}{4}c_3(X) + \frac{1}{4}c_4(X)$$<br />Now, suppose we expand the set of propositions $\mathcal{F}$ to which Sandy assigns credences by adding, for each possible future credence function $c_i$, the proposition $C_{c_i}$, which says that $c_i$ is Sandy's credence function on Tuesday. And then suppose we extend $c$ to $c^*$, which is defined on this expanded set of propositions as follows: given a possible world $w$, let <br />\[<br />c^*(w\ \&\ C_{c_i}) = \frac{1}{4}c_i(w)<br />\]<br />Then it's easy to verify that $c^*(X) = c(X)$ for any $X$ in $\mathcal{F}$. So $c^*$ really is an extension of $c$. And we can also see that<br />\[<br />c^*(M | C_{c_i}) = \frac{c^*(M\ \&\ C_{c_i})}{c^*(C_{c_i})} = \frac{\frac{1}{4}c_i(M)}{\frac{1}{4}} = c_i(M)<br />\]<br />So, $c_i$ is the result of conditionalizing $c^*$ on the proposition that $c_i$ is Sandy's credence function on Tuesday. As we see in Theorem 1 below, there can be no Dutch Strategy against Sandy because he can be represented <i>as if</i> he is updating on these propositions about his Tuesday credences, even though that is not <i>in fact</i> how his updating proceeds. This shows again that whatever sure loss argument we have for updating, it does not target actual updating, but rather updating rules or dispositions. For it is possible to have an update rule that makes it certain that you will violate Actual Conditionalization, and yet not be vulnerable to a Dutch Strategy.<br /><br />Before we state our theorem, some terminology: Suppose $c$ is your credence function at $t$. It is defined on a set of propositions $\mathcal{F}$. 
Suppose $\mathcal{E} = \{E_1, \ldots, E_n\} \subseteq \mathcal{F}$ is a partition that contains the strongest propositions you might learn between $t$ and $t'$. Suppose $\mathcal{C} = \{c_1, \ldots, c_m\}$ is the set of possible credence functions you might adopt at $t'$. They are also defined on $\mathcal{F}$. So:</div><ul style="text-align: justify;"><li>for each $E_i$ in $\mathcal{E}$, there is at least one $c_j$ in $\mathcal{C}$ such that $c_j(E_i) = 1$---that is, for every possible piece of evidence you might acquire, there is some possible future credence function that is a response to that evidence;</li><li>for each $c_j$ in $\mathcal{C}$, there is exactly one $E_i$ in $\mathcal{E}$ such that $c_j(E_i) = 1$---that is, every possible future credence function is a response to exactly one of these possible pieces of evidence.</li></ul><div style="text-align: justify;">Then we say that <i>there is a strong Dutch Strategy against you</i> iff there are sets of bets $B$ and $B'$ such that</div><ul style="text-align: justify;"><li>$c$ requires you to accept the bets in $B$,</li><li>$c_j$ requires you to accept the bets in $B'$, for all $1 \leq j \leq m$, and</li><li>taken together, the bets in $B$ and $B'$ lose you money in all epistemically possible worlds.</li></ul><div style="text-align: justify;">And we say that <i>there is a weak Dutch Strategy against you</i> iff there are sets of bets $B, B'_1, \ldots, B'_m$ such that</div><ul style="text-align: justify;"><li>$c$ requires you to accept the bets in $B$,</li><li>$c_j$ requires you to accept the bets in $B'_j$, for all $1 \leq j \leq m$, and</li><li>the bets in $B$ and $B'_j$, taken together, lose you money at all worlds at which you have credence function $c_j$ at time $t'$.</li></ul><div style="text-align: justify;">Note: if you are vulnerable to a strong Dutch Strategy, you're certainly vulnerable to a weak Dutch Strategy; if you are not vulnerable to a weak Dutch Strategy, then you cannot be vulnerable to a 
strong Dutch Strategy.<br /><br />We say that you are <i>representable as a conditionalizer</i> iff there is an extension of $c, c_1, \ldots, c_m$ to credence functions $c^*, c^*_1, \ldots, c^*_m$ defined on $(\mathcal{F} \cup \{C_{c_1}, \ldots, C_{c_m}\})^*$ such that </div><ul style="text-align: justify;"><li>$c^*_i(C_{c_i}) = 1$, for $1 \leq i \leq m$;</li><li>$c^*_i(X) = c^*(X | C_{c_i})$ for $X$ in $(\mathcal{F} \cup \{C_{c_1}, \ldots, C_{c_m}\})^*$</li></ul>where $(\mathcal{F} \cup \{C_{c_1}, \ldots, C_{c_m}\})^*$ is the smallest algebra to contain $\mathcal{F}$ and each $C_{c_i}$.<br /><br /><div style="text-align: justify;"><b>Theorem 1</b></div><ul style="text-align: justify;"><li>If you are not representable as a conditionalizer, there is a strong Dutch Strategy against you;</li><li>If you are representable as a conditionalizer, there is no weak (or strong) Dutch Strategy against you.</li></ul><div style="text-align: justify;">I won't give the full proof here, but it runs roughly as follows: you are representable as a conditionalizer iff $c$ is in the convex hull of $\{c_1, \ldots, c_m\}$; and if $c$ is not in the convex hull of $\{c_1, \ldots, c_m\}$, then there is a set of bets that $c$ requires you to buy for one price, while each $c_i$ requires you to sell them for a lower price.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Now, suppose your updating rule is deterministic. Then, for each $E_i$, there is exactly one $c_j$ in $\mathcal{C}$ such that $c_j(E_i) = 1$. Thus, in this case, your updating rule is vulnerable to a strong Dutch Strategy if it is not a conditionalizing rule, and not vulnerable even to a weak Dutch Strategy if it is. Thus, we have an extra argument for Rule Conditionalization. In some ways it strengthens the standard argument presented above, for it shows that the same set of bets can be offered at $t'$ regardless of what credence function you end up having. 
But in some ways it weakens that argument, for it relies on the assumption that there is a finite set of possible future credence functions you might adopt at $t'$.<br /><br />Now, of course, you might object that updating other than by a deterministic rule is irrational: your evidence, together with your prior credences, should determine your new credences; there should not be many possible ways you might respond to the same piece of evidence. This may be true, and if we supplement the Dutch Strategy argument with this assumption, we obtain a Dutch Strategy for conditionalizing. But note that the argument is no longer purely pragmatic. It is now partly evidentialist, because it incorporates this evidentialist principle that we have not and cannot justify on pragmatic grounds---we cannot specify how you will go wrong in your decisions if you update using a non-deterministic rule.</div><div style="text-align: justify;"><br /></div><h2 style="text-align: justify;">References</h2><ul style="text-align: justify;"><li>Briggs, R. A. (2009). Distorted Reflection. <i>Philosophical Review</i>, 118(1), 59–85. </li><li>Lewis, D. (1999). Why Conditionalize? In <i>Papers in Metaphysics and Epistemology</i>. Cambridge, UK: Cambridge University Press. </li></ul><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><br /></div>Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com3tag:blogger.com,1999:blog-4987609114415205593.post-36650031827530635152018-09-26T11:37:00.001+01:002018-09-26T11:37:02.304+01:00Assistant professorship in formal philosophy (Gdansk)<div dir="ltr" style="text-align: left;" trbidi="on"><br class="Apple-interchange-newline" /><span style="background-color: white; color: #1d2129; font-family: system-ui, -apple-system, system-ui, ".SFNSText-Regular", sans-serif; font-size: 14px;">A tenure-track job in formal philosophy in Gdansk is available. Polish language skills not required. 
The application deadline is November 23, 2018. Details <a href="http://entiaetnomina.blogspot.com/2018/09/assistant-professorship-in-formal.html">here</a>. </span></div>Rafal Urbaniakhttp://www.blogger.com/profile/10277466578023939272noreply@blogger.com3tag:blogger.com,1999:blog-4987609114415205593.post-28407807093046798202018-08-29T12:02:00.002+01:002018-08-29T17:56:18.166+01:00A new (?) sort of Dutch Book argument: exploitability vs dominance<h2>The exploitability-implies-irrationality argumentative strategy</h2><br />In decision theory, we often wish to impose normative constraints either on an agent's preference ordering or directly on the utility function that partly determines it. We might demand, for instance, that your preferences should not be cyclical, or that your utility function should discount the future exponentially. And in Bayesian epistemology, we often wish to impose normative constraints on credences. We might demand, for instance, that your credence in one proposition should be no greater than your credence in another proposition that it entails. In both cases, we often use a particular argumentative strategy to establish these norms: we'll call it the <i>exploitability-implies-irrationality strategy</i> (or <i>EII</i>, for short). I want to start by arguing that this is a bad argumentative strategy; and then I want to describe a way to replace it with a good argumentative strategy that is inspired by the problem we have identified with EII. I want to finish by sketching a version of the good argumentative strategy that would replace the EII strategy in the case of credal norms; that is, in the case of the Dutch Book argument. I leave it open here whether a similar strategy can be made to work in the case of preferences or utility functions. 
(I think this alternative argument strategy is new---it essentially combines an old result by Mark Schervish (1989) with a more recent result by Joel Predd and his co-authors at Princeton (2009); so it wouldn't surprise me at all if something similar has been proposed before---I'd welcome any information about this.)<br /><br />The EII strategy runs as follows:<br /><br /><b>(I) Mental state-action link.</b> It begins by claiming that, for anyone with a particular mental state---a preference ordering, a utility function, a credence function, or some combination of these---it is rationally required of them to choose in a particular way when faced with a decision problem.<br /><br /><i>Some examples</i>:<br />(i) someone with preference ordering $a \prec b$ is rationally required to pay some amount of money to receive $b$ rather than $a$;<br />(ii) someone with credence $p$ in proposition $X$ should pay £$(p-\varepsilon)$ for a bet that pays out £1 if $X$ is true and £0 if $X$ is false---call this a £1 bet on $X$.<br /><br /><b>(II) Mathematical theorem.</b> It proceeds to show that, for anyone with a mental state that violates the norm in question, there are decision problems the agent might face such that, if she does, then there are choices she might make in response to them that <i>dominate</i> the choices that premise (I) says are rationally required of her as a result of her mental state. That is, the first set of choices is guaranteed to leave her better off than the second set of choices.<br /><br /><i>Some examples</i>:<br />(i) if $c \prec a \prec b \prec c$, then rationality requires you to pay to get $a$ rather than $c$, pay again to get $b$ rather than $a$, and pay again to get $c$ rather than $b$. 
If, instead, you'd just chosen $c$ at the beginning, and refused to pay anything to swap, you'd be better off for sure now.<br />(ii) if you have credence $p$ in $XY$ and a credence $q < p$ in $X$, then you will sell a £1 bet on $X$ for £$(q + \varepsilon)$, and you'll buy a £1 bet on $XY$ for £$(p-\varepsilon)$. Providing $3\varepsilon < p - q$, it is easy to see that, taken together, these bets lose you money for sure, and thus refusing both bets is guaranteed to leave you better off.<br /><br /><b>(III) Action-rationality link. </b>The final premise says that, if there is some series of decision problems such that the choices your mental states rationally require you to make are dominated by some other set of choices you might have made instead, then your mental states are irrational.<br /><br /><i>Some examples</i>:<br />(i) By (I-III)(i), we conclude that preferences $c \prec a \prec b \prec c$ are irrational.<br />(ii) By (I-III)(ii), we conclude that having a higher credence in a conjunction than in one of the conjuncts is irrational.<br /><br />Now, there are often problems with the instance of (I) that is used in such EII arguments. For instance, there are many reasons to think rationality does not require someone with credence $p$ in $X$ to pay £$(p - \varepsilon)$ for a £1 bet on $X$. But my focus here is on (III).<br /><br /><h2>The Problem with the Action-Rationality Link </h2><br />The problem with (III) is this: It is clear that it is irrational to make a series of decisions when there is an alternative series that is guaranteed to do better---it is irrational because, when you act, you are attempting to maximise your utility and doing what you have done is guaranteed to be suboptimal as a means to that end; there is an alternative you can know a priori would serve that end better. 
But it is much less clear why it is irrational to have mental states that require you to make a dominated series of decisions when faced with a particular decision problem. When you choose a dominated option, you are irrational because there's something else you could have done that is guaranteed to serve your ends better. But when you have mental states that require you to choose a dominated option, that alone doesn't tell us that there is anything else you could have done---any alternative mental states you could have had---that are guaranteed to serve your ends better.<br /><br />Of course, there is often something else you could have done that would not have required you to make the dominated choice. Let's focus on the case of credences. The Dutch Book Theorem shows that, if your credences are not probabilistic, then there's a series of decision problems and a dominated series of options from them that those credences require you to choose. The Converse Dutch Book Theorem shows that, if your credences are instead probabilistic, then there is no such series of decision problems and options. So it's true that there's something else you could do that's guaranteed not to require you to make a dominated choice. But making a dominated choice is not an eventuality so dreadful and awful that, if your credences require you to do it in the face of one particular sort of decision problem, they are automatically irrational, regardless of what they lead you to do in the face of any other decision problem and regardless of how likely it is that you face a decision problem in which they require it of you.<br /><br />After all, for all the Dutch Book or Converse Dutch Book Theorem tell you, it might be that your non-probabilistic credences lead you to choose badly when faced with the very particular Dutch Book decision problem, but lead you to choose extremely profitably when faced with many other decision problems. 
And indeed, even in the case of the Dutch Book decision problem, it might be that your non-probabilistic credences require you to choose in a way that leaves you a little poorer for sure, while all the alternative probabilistic credences require you to choose in a way that leaves you with the possibility of great gain, but also the risk of great loss. In this case, it is not obvious that the probabilistic credences are to be preferred. Furthermore, you might have reason to think that it is extremely unlikely you will ever face the Dutch Book decision problem itself. Or at least much more probable that you'll face other decision problems where your credences don't lead you to choose a dominated series of options. For all these reasons, the mere possibility of a series of decision problems from which your credences require you to choose a dominated series of options is not sufficient to show that your credences are irrational. To do this, we need to show that there are some alternative credences that are in some sense sure to serve you better as you face the decision problems that make up your life. Without such alternatives that do better, pointing out a flaw in some mental state does not show that it is irrational, even if there are other mental states without the flaw---for those alternative mental states might have other strikes against them that the mental state in question does not have.<br /><br /><h2>A new Dutch Book argument </h2><br />So our question is now: Is there any sense in which, when you have non-probabilistic credences, there are some alternative credences that are guaranteed to serve you better as a guide in your decision-making? 
Borrowing from work by Mark Schervish (<a href="https://projecteuclid.org/euclid.aos/1176347398" target="_blank">'A General Method for Comparing Probability Assessors', 1989</a>) and Ben Levinstein (<a href="https://www.journals.uchicago.edu/doi/full/10.1086/693444" target="_blank">'A Pragmatist's Guide to Epistemic Utility', 2017</a>), I want to argue that there is.<br /><br /><h3>The pragmatic utility of an individual credence </h3><br />Our first order of business is to create a utility function that measures how good individual credences are as a guide to decision-making. Then we'll take the utility of a whole credence function to be the sum of the utilities of the credences that comprise it. (In fact, I think there's a way to do all this without that additivity assumption, but I'm still ironing out the creases in that.)<br /><br />Suppose you assign credence $p$ to proposition $X$. Our job is to say how good this credence is as a guide to action. The idea is this:<br /><ul><li>an act is a function from states of the world to utilities---let $\mathcal{A}$ be the set of all acts;</li><li>an $X$-act is an act that assigns the same utility to all the worlds at which $X$ is true, and assigns the same utility to all worlds at which $X$ is false---let $\mathcal{A}_X$ be the set of all $X$-acts;</li><li>a decision problem is a set of acts; that is, a subset of $\mathcal{A}$---let $\mathcal{D}$ be the set of all decision problems;</li><li>an $X$-decision problem is a set of $X$-acts; that is, a subset of $\mathcal{A}_X$---let $\mathcal{D}_X$ be the set of all $X$-decision problems.</li></ul>We suppose that there is a probability function $P$ that says how likely it is that the agent will face different $X$-decision problems---since the set of $X$-decision problems is infinite, we actually take $P$ to be a probability density function. The idea here is that $P$ is something like an objective chance function. 
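<br /><br />To fix ideas, here is a toy $X$-decision problem. It is offered only as an illustration: the price $x$ and the act names are labels introduced here, and I assume for simplicity that utility is linear in money. Let $D = \{\mathrm{buy}, \mathrm{refuse}\}$, where $\mathrm{buy}$ is the act of paying £$x$ for a £1 bet on $X$ and $\mathrm{refuse}$ is the act of declining it. Then $\mathrm{buy}$ has utility $1 - x$ at every world at which $X$ is true and $-x$ at every world at which $X$ is false, while $\mathrm{refuse}$ has utility $0$ everywhere, so both are $X$-acts. By the lights of credence $p$ in $X$, $$\mathrm{Exp}_p(u(\mathrm{buy})) = p(1 - x) + (1-p)(-x) = p - x, \qquad \mathrm{Exp}_p(u(\mathrm{refuse})) = 0$$ So an expected utility maximiser with credence $p$ chooses $\mathrm{buy}$ from $D$ if $p > x$ and $\mathrm{refuse}$ if $p < x$. Two credences that lie on opposite sides of $x$ therefore choose differently from $D$, and how much that difference matters overall depends on how much weight $P$ gives to problems like $D$.<br /><br />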
With that in hand, we take the pragmatic utility of credence $p$ in proposition $X$ to be the expected utility of the choices that credence $p$ in $X$ will lead you to make when faced with the decision problems you will encounter. That is, it is the integral, relative to measure $P$, over the possible $X$-decision problems $D$ in $\mathcal{D}_X$ you might face, of the utility of the act you'd choose from $D$ using $p$, weighted by the probability that you'd face $D$. Given $D$ in $\mathcal{D}_X$, let $D^p$ be the act you'd choose from $D$ using $p$---that is, $D^p$ is one of the acts in $D$ that maximises expected utility by the lights of $p$. Thus, for any $D$ in $\mathcal{D}_X$, and any act $a$ in $D$,$$\mathrm{Exp}_p(u(a)) \leq \mathrm{Exp}_p(u(D^p))$$ Then we define the pragmatic utility of credence $p$ in $X$ when $X$ is true as follows:<br />$$g_X(1, p) = \int_{\mathcal{D}_X}u(D^p, X) dP$$ And we define the pragmatic utility of credence $p$ in $X$ when $X$ is false as follows:<br />$$g_X(0, p) = \int_{\mathcal{D}_X}u(D^p, \overline{X}) dP$$ These are slight modifications of Schervish's and Levinstein's definitions.<br /><br /><br /><h3>$g$ is a strictly proper scoring rule</h3><br />Our next order of business is to show that the utility function $g_X$ is a strictly proper scoring rule. That is, $\mathrm{Exp}_p(g_X(q)) = pg_X(1, q) + (1-p)g_X(0, q)$ is uniquely maximised, as a function of $q$, at $q = p$. 
We show this now:<br />\begin{eqnarray*}<br />\mathrm{Exp}_p(g_X(q)) & = & pg_X(1, q) + (1-p)g_X(0, q)\\<br />& = & p \int_{\mathcal{D}_X}u(D^q, X) dP + (1-p) \int_{\mathcal{D}_X}u(D^q, \overline{X}) dP \\<br />& = & \int_{\mathcal{D}_X}p u(D^q, X) + (1-p) u(D^q, \overline{X}) dP\\<br />& = & \int_{\mathcal{D}_X} \mathrm{Exp}_p(u(D^q)) dP <br />\end{eqnarray*} <br />But, by the definition of $D^q$, if $q \neq p$, then, for all $D$ in $\mathcal{D}_X$,<br />$$\mathrm{Exp}_p(u(D^q)) \leq \mathrm{Exp}_p(u(D^p))$$<br />and, for some $D$ in $\mathcal{D}_X$,<br />$$\mathrm{Exp}_p(u(D^q)) < \mathrm{Exp}_p(u(D^p))$$<br />Now, for two credences $p$ and $q$ in $X$, we say that a set of decision problems separates $p$ and $q$ if (i) each decision problem in the set contains only two available acts, (ii) for each decision problem in the set, $p$ expects one act to have higher expected value and $q$ expects the other to have higher expected value. Then, as long as there is some set of decision problems such that (i) that set separates $p$ and $q$ and (ii) $P$ assigns positive probability to this set, then <br />$$\mathrm{Exp}_p(g(q)) < \mathrm{Exp}_p(g(p))$$ And so the scoring rule $g$ is strictly proper.<br /><br /><h3>The pragmatic utility of a whole credence function </h3><br />The scoring rule $g_X$ we have just defined assigns pragmatic utilities to individual credences in $X$. In the next step, we define $G$, a pragmatic utility function that assigns pragmatic utilities to whole credence functions. We take the utility of a credence function to be the sum of the utilities of the individual credences it assigns. Suppose $c : \mathcal{F} \rightarrow [0, 1]$ is a credence function defined on the set of propositions $\mathcal{F}$. Then: $$G(c, w) = \sum_{X \in \mathcal{F}} g_X(w(X), c(X))$$ where $w(X) = 1$ if $X$ is true at $w$ and $w(X) = 0$ if $X$ is false at $w$. 
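<br /><br />To make the definitions of $g_X$ and $G$ concrete, here is a minimal numerical sketch in Python. It is an illustration under stated assumptions, not part of the argument: it assumes that $P$ concentrates on two-act $X$-decision problems whose utilities are drawn uniformly from $[0, 1]$, it estimates the integrals by averaging over a random sample, and it uses the same sample for every proposition. All the function names are hypothetical.

```python
import random

def sample_problems(n, seed=0):
    """Draw n two-act X-decision problems (an assumed stand-in for P).
    Each act is a pair (utility if X is true, utility if X is false)."""
    rng = random.Random(seed)
    return [[(rng.random(), rng.random()), (rng.random(), rng.random())]
            for _ in range(n)]

def choose(problem, p):
    """D^p: an act in the problem that maximises expected utility by the lights of p."""
    return max(problem, key=lambda act: p * act[0] + (1 - p) * act[1])

def g(truth_value, p, problems):
    """Sample-average estimate of g_X(truth_value, p)."""
    index = 0 if truth_value == 1 else 1
    return sum(choose(d, p)[index] for d in problems) / len(problems)

def expected_score(p, q, problems):
    """Exp_p(g_X(q)) = p * g_X(1, q) + (1 - p) * g_X(0, q)."""
    return p * g(1, q, problems) + (1 - p) * g(0, q, problems)

def G(credences, world, problems):
    """Additive pragmatic utility of a whole credence function.
    `credences` maps proposition names to credences; `world` maps them to 0 or 1."""
    return sum(g(world[x], credences[x], problems) for x in credences)

problems = sample_problems(5000)
p = 0.3
grid = [i / 20 for i in range(21)]
# By the lights of p, no rival credence q on the grid should score better than p itself.
best_q = max(grid, key=lambda q: expected_score(p, q, problems))
print(best_q, expected_score(p, best_q, problems))
```

Because each sampled problem is resolved by maximising expected utility relative to the credence being scored, the estimate inherits the weak inequality from the proof exactly. Strictness depends on the sample containing problems that separate $p$ from $q$.<br /><br />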
When $G$ is defined as a sum in this way, we say that $G$ is generated from the scoring rules $g_X$ for $X$ in $\mathcal{F}$.<br /><br /><h3>Predd, et al.'s Dominance Result</h3><br />Finally, we appeal to a theorem due to Predd, et al. (<a href="https://ieeexplore.ieee.org/document/5238758/" target="_blank">'Probabilistic Coherence and Proper Scoring Rules', 2009</a>):<br /><br /><b>Theorem (Predd, et al. 2009) </b>Suppose $G$ is generated from strictly proper scoring rules $g_X$ for $X$ in $\mathcal{F}$. Then,<br />(I) if $c$ is not a probability function, then there is a probability function $c^*$ such that $G(c, w) < G(c^*, w)$ for all worlds $w$;<br />(II) if $c$ is a probability function, then there is no credence function $c^* \neq c$ such that $G(c, w) \leq G(c^*, w)$ for all worlds $w$.<br /><br />This furnishes us with a new pragmatic argument for probabilism. And indeed, now that we have a pragmatic utility function that is generated from strictly proper scoring rules, we can take advantage of all of the epistemic utility arguments that make that same assumption, such as Greaves and Wallace's argument for Conditionalization, my arguments for the Principal Principle, the Principle of Indifference, linear pooling in judgment aggregation cases, and so on.<br /><br />In this argument, we see that non-probabilistic credences are irrational not because there is some series of decision problems such that, when faced with them, the credences require you to make a dominated series of choices. 
Rather, they are irrational because there are alternative credences that are guaranteed to serve you better on average as a guide to action---however the world turns out, the expected or average utility you'll gain from making decisions using those alternative credences is greater than the expected or average utility you'll gain from making decisions using the original credences.Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com6tag:blogger.com,1999:blog-4987609114415205593.post-56954510927030670032018-08-06T20:50:00.003+01:002018-08-06T20:50:38.285+01:00Postdoc in formal epistemology & law<div dir="ltr" style="text-align: left;" trbidi="on"><div style="text-align: justify;">A postdoc position (3 years, fixed term) in the Chair of Logic, Philosophy of Science and Epistemology is available at the Department of Philosophy, Sociology, and Journalism, University of Gdansk, Poland. The application deadline is September 15, 2018. More details <a href="https://entiaetnomina.blogspot.com/2018/08/postdoc-position-in-formal-epistemology.html">here</a>.</div></div>Rafal Urbaniakhttp://www.blogger.com/profile/10277466578023939272noreply@blogger.com1tag:blogger.com,1999:blog-4987609114415205593.post-87835110580353858692018-07-30T21:16:00.002+01:002018-07-30T21:27:18.688+01:00The Dutch Book Argument for RegularityI've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. 
So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.<br /><br />-------<br /><br />We say that a probabilistic credence function $c : \mathcal{F} \rightarrow [0, 1]$ is <i>regular</i> if $c(A) > 0$ for all propositions $A$ in $\mathcal{F}$ such that there is some world at which $A$ is true.<br /><br /><b>The Principle of Regularity (standard version)</b> If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, rationality requires that $c$ is regular.<br /><br />I won't specify which worlds are in the scope of the quantifier over worlds that occurs in the antecedent of this norm. It might be all the logically possible worlds, or the metaphysically possible worlds, or the conceptually possible worlds; it might be the epistemically possible worlds. Different answers will give different norms. But we needn't decide the issue here. We'll just specify that it's the same set of worlds that we quantify over in the Dutch Book argument for Probabilism when we say that, if your credences aren't probabilistic, then there's a series of bets they'll lead you to enter into that will lose you money<i> at all possible worlds</i>. <br /><br />In this post, I want to consider the almost-Dutch Book Argument for the norm of Regularity. Here's how it goes: Suppose you have a credence $c(A) = 0$ in a proposition $A$, and suppose that $A$ is true at world $w$. Then, recall, the first premise of the standard Dutch Book argument for Probabilism:<br /><br /><b>Ramsey's Thesis</b> If your credence in a proposition $X$ is $c(X) = p$, then you're permitted to pay £$pS$ for a bet that returns £$S$ if $X$ is true and £$0$ if $X$ is false, for any $S$, positive or negative or zero.<br /><br />So, since $c(A) = 0$, your credences will permit you to sell the following bet for £0: if $A$, you must pay out £1; if $\overline{A}$, you will pay out £0. But selling this bet for this price is weakly dominated by refusing the bet. 
Selling the bet at that price loses you money in all $A$-worlds, and gains you nothing in $\overline{A}$-worlds. Whereas refusing the bet neither loses nor gains you anything in any world. Thus, your credences permit you to choose a weakly dominated act. So they are irrational. Or so the argument goes. I call this the almost-Dutch Book argument for Regularity since it doesn't punish you with a sure loss, but rather with a possible loss with no compensating possible gain.<br /><br />If this argument works, it establishes the standard version of Regularity stated above. But consider the following case. $A$ and $B$ are two logically independent propositions -- <i>It will be rainy tomorrow</i> and <i>It will be hot tomorrow</i>, for instance. You have only credences in $A$ and in the conjunction $AB$. You don't have credences in $\overline{A}$, $A \vee B$, $A\overline{B}$, and so on. What's more, your credences in $A$ and $AB$ are equal, i.e., $c(A) = c(AB)$. That is, you are exactly as confident in $A$ as you are in its conjunction with $B$. Then, in some sense, you violate Regularity, though you don't violate the standard version we stated above. After all, since your credence in $A$ is the same as your credence in $AB$, you must give no credence whatsoever to the worlds in which $A$ is true and $B$ is false. If you did, then you would set $c(AB) < c(A)$. But you don't have a credence in $A\overline{B}$. So there is no proposition true at some worlds to which you assign a credence of 0. Thus, the almost-Dutch Book argument sketched above will not work. 
We need a different Dutch Book argument for the following version of Regularity:<br /><br /><b>The Principle of Regularity (full version)</b> If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that there is an extension $c^*$ of $c$ to a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$ such that $c^* : \mathcal{F}^* \rightarrow [0, 1]$ is regular.<br /><br />It is this principle that you violate if $c(A) = c(AB)$ when $A$ and $B$ are logically independent. For any probabilistic extension $c^*$ of $c$ that assigns a credence to $A\overline{B}$ must assign it credence 0 even though there is a world at which it is true.<br /><br />How are we to give an almost-Dutch Book argument for this version of Regularity? There are two possible approaches.<br /><br />On the first, we strengthen the first premise of the standard Dutch Book argument. Ramsey's Thesis says: if you have credence $c(X) = p$ in $X$, then you are permitted to pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. The stronger version says:<br /><br /><b>Strong Ramsey's Thesis</b> If every extension $c^*$ of $c$ to a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$ is such that $c^*(X) = p$, then you are permitted to pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$.<br /><br />The idea is that, if every extension assigns the same credence $p$ to $X$, then you are in some sense committed to assigning credence $p$ to $X$. And thus, you are permitted to enter into whichever bets you'd be permitted to enter into if you actually had credence $p$.<br /><br />On the second approach to giving an almost-Dutch Book argument for the full version of the Regularity principle, we actually provide an almost-Dutch Book using just the credences that you do in fact assign. Suppose, for instance, you have credence $c(A) = c(AB) = 0.5$. 
Then you will sell for £5 a bet that pays out £10 if $A$ and £0 if $\overline{A}$, while you will buy for £5 a bet that pays £10 if $AB$ and £0 if $\overline{AB}$. Then, if $A$ is true and $B$ is true, you will have a net gain of £0, and similarly if $A$ is false. But if $A$ is true and $B$ is false, you will lose £10. Thus, you face the possibility of loss with no possibility of gain. Now, the question is: can we always construct such almost-Dutch Books? And the answer is that we can, as the following theorem shows:<br /><b><br /></b><b>Theorem 1 (Almost-Dutch Book Theorem for Full Regularity) </b>Suppose $\mathcal{F} = \{X_1, \ldots, X_n\}$ is a set of propositions. Suppose $c : \mathcal{F} \rightarrow [0, 1]$ is a credence function that cannot be extended to a regular probability function on a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$. Then there is a sequence of stakes $S = (S_1, \ldots, S_n)$, such that if, for each $1 \leq i \leq n$, you pay £$(c(X_i) \times S_i)$ for a bet that pays out £$S_i$ if $X_i$ and £0 if $\overline{X_i}$, then the total price you'll pay is at least the payoff of these bets at all worlds, and more than the payoff at some.<br /><br />That is,<br />(i) for all worlds $w$,<br />$$S\cdot (w - c) = S \cdot w - S \cdot c = \sum^n_{i=1} S_iw(X_i) - \sum^n_{i=1} S_ic(X_i) \leq 0$$<br />(ii) for some worlds $w$, <br />$$S\cdot (w - c) = S \cdot w - S \cdot c = \sum^n_{i=1} S_iw(X_i) - \sum^n_{i=1} S_ic(X_i) < 0$$<br />where $w(X_i) = 1$ if $X_i$ is true at $w$ and $w(X_i) = 0$ if $X_i$ is false at $w$. We call $w(-)$ the indicator function of $w$. <br /><br /><i>Proof sketch. </i>First, recall de Finetti's observation that your credence function $c : \mathcal{F} \rightarrow [0, 1]$ is a probability function iff it is in the convex hull of the indicator functions of the possible worlds -- that is, iff $c$ is in $\{w(-) : w \mbox{ is a possible world}\}^+$. 
Second, note that, if your credence function can't be extended to a regular credence function, it sits on the boundary of this convex hull. In particular, if $W_c = \{w' : c = \sum_w \lambda_w w \Rightarrow \lambda_{w'} > 0\}$, then $c$ lies on the boundary surface created by the convex hull of $W_c$. Third, by the Supporting Hyperplane Theorem, there is a vector $S$ such that $S$ is orthogonal to this boundary surface and thus:<br />(i) $S \cdot (w-c) = S \cdot w - S \cdot c = 0$ for all $w$ in $W_c$; and<br />(ii) $S \cdot (w-c) = S \cdot w - S \cdot c < 0$ for all $w$ not in $W_c$.<br />Fourth, recall that $S \cdot w$ is the total payout of the bets at world $w$ and $S \cdot c$ is the price you'll pay for it. $\Box$Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com0tag:blogger.com,1999:blog-4987609114415205593.post-11733654836568522122018-07-26T08:19:00.002+01:002018-07-26T11:32:59.271+01:00Dutch Strategy Theorems for Conditionalization and Superconditionalization<br />I've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.<br /><br />----- <br /><br />Many Bayesians formulate the update norm of Bayesian epistemology as follows:<br /><br /><b>Bayesian Conditionalization</b> If<br />(i) your credence function at $t$ is $c : \mathcal{F} \rightarrow [0, 1]$,<br />(ii) your credence function at a later time $t'$ is $c' : \mathcal{F} \rightarrow [0, 1]$,<br />(iii) $E$ is the strongest evidence you acquire between $t$ and $t'$,<br />(iv) $E$ is in $\mathcal{F}$, <br />then rationality requires that, if $c(E) > 0$, then for all $X$ in $\mathcal{F}$, $$c'(X) = c(X|E) = \frac{c(XE)}{c(E)}$$<br /><br />I don't. 
One reason you might fail to conditionalize between $t$ and $t'$ is that you re-evaluate the options between those times. You might disavow the prior that you had at the earlier time, perhaps decide it was too biased in one way or another, or not biased enough; perhaps you come to think that it doesn't give enough consideration to the explanatory power one hypothesis would have were it true, or gives too much consideration to the adhocness of another hypothesis; and so on. Now, it isn't irrational to change your mind. So surely it can't be irrational to fail to conditionalize as a result of changing your mind in this way. On this, I agree with van Fraassen.<br /><br />Instead, I prefer to formulate the update norm as follows -- I borrow the name from Kenny Easwaran:<br /><br /><b>Plan Conditionalization</b> If<br />(i) your credence function at $t$ is $c: \mathcal{F} \rightarrow [0, 1]$,<br />(ii) between $t$ and $t'$ you will receive evidence from the partition $\{E_1, \ldots, E_n\}$,<br />(iii) each $E_i$ is in $\mathcal{F}$ <br />(iv) at $t$, your updating plan is $c'$, so that $c'_i : \mathcal{F} \rightarrow [0, 1]$ is the credence function you will adopt if $E_i$,<br />then rationality requires that, if $c(E_i) > 0$, then for all $X$ in $\mathcal{F}$, $$c'_i(X) = c(X | E_i)$$<br /><br />I want to do two things in this post. First, I'll offer what I think is a new proof of the Dutch Strategy or Diachronic Dutch Book Theorem that justifies Plan Conditionalization (I haven't come across it elsewhere, though Ray Briggs and I used the trick at the heart of it for our accuracy dominance theorem in <a href="https://onlinelibrary.wiley.com/doi/abs/10.1111/nous.12258" target="_blank">this paper</a>). Second, I'll explore how that might help us justify other norms of updating that concern situations in which you don't come to learn any proposition with certainty. 
We will see that we can use the proof I give to justify the following standard constraint on updating rules: Suppose the evidence I receive between $t$ and $t'$ is not captured by any of the propositions to which I assign a credence -- that is, there is no proposition $e$ to which I assign a credence that is true at all and only the worlds at which I receive the evidence I actually receive between $t$ and $t'$. As a result, there is no proposition $e$ that I learn with certainty as a result of receiving that evidence. Nonetheless, I should update my credence function from $c$ to $c'$ in such a way that it is possible to extend my earlier credence function $c$ to a credence function $c^*$ so that: (i) $c^*$ does assign a credence to $e$, and (ii) my later credence $c'(X)$ in a proposition $X$ is the credence that this extended credence function $c^*$ assigns to $X$ conditional on me receiving evidence $e$ -- that is, $c'(X) = c^*(X | e)$. That is, I should update <i>as if</i> I had assigned a credence to $e$ at the earlier time and then updated by conditionalizing on it.<br /><br />Here's the Dutch Strategy or Diachronic Dutch Book Theorem for Plan Conditionalization:<br /><br /><b>Definition (Conditionalizing pair)</b> Suppose $c$ is a credence function and $c'$ is an updating rule defined on $\{E_1, \ldots, E_n\}$. We say that $(c, c')$ <i>is a conditionalizing pair</i> if, whenever $c(E_i) > 0$, then for all $X$, $c'_i(X) = c(X | E_i)$.<br /><br /><b>Dutch Strategy Theorem</b> Suppose $(c, c')$ is not a conditionalizing pair. Then<br />(i) there are two acts $A$ and $B$ such that $c$ prefers $A$ to $B$, and<br />(ii) for each $E_i$, there are two acts $A'_i$ and $B'_i$ such that $c'_i$ prefers $A'_i$ to $B'_i$,<br />and yet, for each $E_i$, $A + A'_i$ has lower utility than $B + B'_i$ at all worlds at which $E_i$ is true.<br /><br />We'll now give the proof of this.<br /><br />First, we describe a way of representing pairs $(c, c')$. 
Both $c$ and each $c'_i$ are defined on the same set $\mathcal{F} = \{X_1, \ldots, X_m\}$. So we can represent $c$ by the vector $(c(X_1), \ldots, c(X_m))$ in $[0, 1]^m$, and we can represent each $c'_i$ by the vector $(c'_i(X_1), \ldots, c'_i(X_m))$ in $[0, 1]^m$. And we can represent $(c, c')$ by concatenating all of these representations to give:<br />$$(c, c') = c \frown c'_1 \frown c'_2 \frown \ldots \frown c'_n$$<br />which is a vector in $[0, 1]^{m(n+1)}$.<br /><br />Second, we use this representation to give an alternative characterization of conditionalizing pairs. First, three pieces of notation:<br /><ul><li>Let $W$ be the set of all possible worlds. </li><li>For any $w$ in $W$, abuse notation and write $w$ also for the credence function on $\mathcal{F}$ such that $w(X) = 1$ if $X$ is true at $w$, and $w(X) = 0$ if $X$ is false at $w$.</li><li>For any $w$ in $W$, let $$(c, c')_w = w \frown c'_1 \frown \ldots \frown c'_{i-1} \frown w \frown c'_{i+1} \frown \ldots \frown c'_n$$ where $E_i$ is the element of the partition that is true at $w$.</li></ul><b>Lemma 1</b> If $(c, c')$ is not a conditionalizing pair, then $(c, c')$ is not in the convex hull of $\{(c, c')_w : w \in W\}$, which we write $\{(c, c')_w : w \in W\}^+$.<br /><br /><i>Proof of Lemma 1. </i>If $(c, c')$ is in $\{(c, c')_w : w \in W\}^+$, then there are $\lambda_w \geq 0$ such that<br /><br />(1) $\sum_{w \in W} \lambda_w = 1$,<br />(2) $c(X) = \sum_{w \in W} \lambda_w w(X)$<br />(3) $c'_i(X) = \sum_{w \in E_i} \lambda_w w(X) + \sum_{w \not \in E_i} \lambda_w c'_i(X)$.<br /><br />By (2), we have $\lambda_w = c(w)$. So by (3), we have $$c'_i(X) = c(XE_i) + (1-c(E_i))c'_i(X)$$ So, if $c(E_i) > 0$, then $c'_i(X) = c(X | E_i)$.<br /><br />Third, we use this alternative characterization of conditionalizing pairs to specify the acts in question. Suppose $(c, c')$ is not a conditionalizing pair. Then $(c, c')$ is outside $\{(c, c')_w : w \in W\}^+$. 
Now, let $(p, p')$ be the orthogonal projection of $(c, c')$ into $\{(c, c')_w : w \in W\}^+$. Then let $(S, S') = (c, c') - (p, p')$. That is, $S = c - p$ and $S'_i = c'_i - p'_i$. Now pick $w$ in $W$. Then the angle between $(S, S')$ and $(c, c')_w - (c, c')$ is obtuse and thus<br />$$(S, S') \cdot ((c, c')_w - (c, c')) = -\varepsilon_w < 0$$<br /><br />Thus, define the acts $A$, $B$, $A'_i$ and $B'_i$ as follows:<br /><ul><li>The utility of $A$ at $w$ is $S \cdot (w - c) + \frac{1}{3}\varepsilon_w$;</li><li>The utility of $B$ at $w$ is 0;</li><li>The utility of $A'_i$ at $w$ is $S'_i \cdot (w - c'_i) + \frac{1}{3}\varepsilon_w$;</li><li>The utility of $B'_i$ at $w$ is 0.</li></ul> Then the expected utility of $A$ by the lights of $c$ is $\sum_w c(w)\frac{1}{3}\varepsilon_w > 0$, while the expected utility of $B$ is 0, so $c$ prefers $A$ to $B$. And the expected utility of $A'_i$ by the lights of $c'_i$ is $\sum_w c'_i(w)\frac{1}{3}\varepsilon_w > 0$, while the expected utility of $B'_i$ is 0, so $c'_i$ prefers $A'_i$ to $B'_i$. But the utility of $A + A'_i$ at $w$ is<br />$$S \cdot (w - c) + S'_i \cdot (w - c'_i) + \frac{2}{3}\varepsilon_w = (S, S') \cdot ((c, c')_w - (c, c')) + \frac{2}{3}\varepsilon_w = - \frac{1}{3}\varepsilon_w < 0$$<br />where $E_i$ is true at $w$. While the utility of $B + B'_i$ at $w$ is 0.<br /><br />This completes our proof. $\Box$<br /><br />You might be forgiven for wondering why we are bothering to give an alternative proof for a theorem that is already well-known. David Lewis proved the Dutch Strategy Theorem in a handout for a seminar at Princeton in 1972, Paul Teller then reproduced it (with full permission and acknowledgment) in a paper in 1973, and Lewis finally published his handout in 1997 in his collected works. Why offer a new proof?<br /><br />It turns out that this style of proof is actually a little more powerful. 
To see why, it's worth comparing it to an alternative proof of the Dutch Book Theorem for Probabilism, which I described in <a href="http://m-phi.blogspot.com/2013/09/the-mathematics-of-dutch-book-arguments.html" target="_blank">this post</a> (it's not original to me, though I'm afraid I can't remember where I first saw it!). In the standard Dutch Book Theorem for Probabilism, we work through each of the axioms of the probability calculus, and say how you would Dutch Book an agent who violates it. The axioms are: Normalization, which says that $c(\top) = 1$ and $c(\bot) = 0$; and Additivity, which says that $c(A \vee B) = c(A) + c(B) - c(AB)$. But consider an agent with credences only in the propositions $\top$, $A$, and $A\ \&\ B$. Her credences are: $c(\top) = 1$, $c(A) = 0.4$, $c(A\ \&\ B) = 0.7$. Then there is no axiom of the probability calculus that she violates. And thus the standard proof of the Dutch Book Theorem is no help in identifying any Dutch Book against her. Yet she is Dutch Bookable. And she violates a more expansive formulation of Probabilism that says, not only are you irrational if your credence function is not a probability function, but also if your credence function <i>cannot be extended to a probability function</i>. So the standard proof of the Dutch Book Theorem can't establish this more expansive version. But the alternative proof I mentioned above can.<br /><br />Now, something similar is true of the alternative proof of the Dutch Strategy Theorem that I offered above (I happened upon this while discussing Superconditionalizing with Jason Konek, who uses similar techniques in his argument for J-Kon, the alternative to Jeffrey's Probability Kinematics that he proposes in his paper, <a href="https://philpapers.org/rec/KONTAO-2" target="_blank">'The Art of Learning'</a>, which was runner-up for last year's Sander's Prize in Epistemology). 
In Lewis' proof of that theorem: First, if you violate Plan Conditionalization, there must be $E_i$ and $X$ such that $c(E_i) > 0$ and $c'_i(X) \neq c(X|E_i)$. Then you place bets on $XE_i$, $\overline{E_i}$ at the earlier time $t$, and a bet on $X$ at $t'$. These bets then together lose you money in any world at which $E_i$ is true. Now, it might seem that you must have the required credences to make those bets just in virtue of violating Plan Conditionalization. But imagine the following is true of you: between $t$ and $t'$, you'll obtain evidence from the partition $\{E_1, \ldots, E_n\}$. And, at $t'$, you'll update on this evidence using the rule $c'$. That is, if $E_i$, then you'll adopt the new credence function $c'_i$ at time $t'$. Now, you don't assign credences to the propositions in $\{E_1, \ldots, E_n\}$. Perhaps this is because you don't have the conceptual resources to formulate these propositions. So while you will update using the rule $c'$, this is not a rule you consciously or explicitly adopt, since to state it would require you to use the propositions in $\{E_1, \ldots, E_n\}$. So it's more like you have a disposition to update in this way. Now, how might we state Plan Conditionalization for such an agent? We can't demand that $c'_i(X) = c(X|E_i)$, since $c(X | E_i)$ is not defined. Rather, we demand that there is some extension $c^*$ of $c$ to a set of propositions that does include each $E_i$ such that $c'_i(X) = c^*(X | E_i)$. 
Thus, we have:<br /><br /><b>Plan Superconditionalization</b> If<br />(i) your credence function at $t$ is $c : \mathcal{F} \rightarrow [0, 1]$,<br />(ii) between $t$ and $t'$ you will receive evidence from the partition $\{E_1, \ldots, E_n\}$,<br />(iii) at $t$, your updating plan is $c'$, so that $c'_i : \mathcal{F} \rightarrow [0, 1]$ is the credence function you plan to adopt if $E_i$,<br />then rationality requires that there is some extension $c^*$ of $c$ for which, if $c^*(E_i) > 0$, then for all $X$, $$c'_i(X) = c^*(X | E_i)$$<br /><br />And it turns out that we can adapt the proof above for this purpose. Say that $(c, c')$ is a superconditionalizing pair if there is an extension $c^*$ of $c$ such that, if $c^*(E_i) > 0$, then for all $X$, $c'_i(X) = c^*(X | E_i)$. Then we can prove that if $(c, c')$ is not a superconditionalizing pair, then $(c, c')$ is not in $\{(c, c')_w : w \in W\}^+$. Here's the proof from above adapted to our case: If $(c, c')$ is in $\{(c, c')_w : w \in W\}^+$, then there are $\lambda_w \geq 0$ such that<br /><br />(1) $\sum_{w \in W} \lambda_w = 1$,<br />(2) $c(X) = \sum_{w \in W} \lambda_w w(X)$<br />(3) $c'_i(X) = \sum_{w \in E_i} \lambda_w w(X) + \sum_{w \not \in E_i} \lambda_w c'_i(X)$.<br /><br />Define the following extension $c^*$ of $c$: $c^*(w) = \lambda_w$. Then, by (3), we have $$c'_i(X) = c^*(XE_i) + (1-c^*(E_i))c'_i(X)$$ So, if $c^*(E_i) > 0$, then $c'_i(X) = c^*(X | E_i)$, as required. $\Box$<br /><br />Now, this is a reasonably powerful version of conditionalization. For instance, as Skyrms showed <a href="http://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199652808.001.0001/acprof-9780199652808-chapter-7" target="_blank">here</a>, if we make one or two further assumptions on the extension of $c$ to $c^*$, we can derive Richard Jeffrey's Probability Kinematics from Plan Superconditionalization. 
That is, if the evidence $E_i$ will lead you to set your new credences across the partition $\{B_1, \ldots, B_k\}$ to $q_1, \ldots, q_k$, respectively, so that $c'_i(B_j) = q_j$, then your new credence $c'_i(X)$ must be $\sum^k_{j=1} c(X | B_j)q_j$, as Probability Kinematics demands. Thus, Plan Superconditionalization places a powerful constraint on updating rules for situations in which the proposition stating your evidence is not one to which you assign a credence. Other cases of this sort include the Judy Benjamin problem and the many cases in which MaxEnt is applied.Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com4tag:blogger.com,1999:blog-4987609114415205593.post-75607084411413354532018-07-25T12:19:00.001+01:002018-07-25T12:19:05.435+01:00Deadline for PhD position in formal epistemology & law extended<div dir="ltr" style="text-align: left;" trbidi="on"><a href="http://entiaetnomina.blogspot.com/2018/04/one-phd-position-in-formal-epistemology.html">This position</a> is still available. Deadline extended to September 7, 2018.</div>Rafal Urbaniakhttp://www.blogger.com/profile/10277466578023939272noreply@blogger.com0tag:blogger.com,1999:blog-4987609114415205593.post-39135622723093007182018-07-25T11:06:00.002+01:002018-07-26T08:24:10.197+01:00On the Expected Utility Objection to the Dutch Book Argument for Probabilism<br />I've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature. The following came up while thinking about Brian Hedden's paper <a href="https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1468-0068.2011.00842.x" target="_blank">'Incoherence without Exploitability'.</a><br /><br /><h2>What is Probabilism? 
</h2><br />Probabilism says that your credences should obey the axioms of the probability calculus. Suppose $\mathcal{F}$ is the algebra of propositions to which you assign a credence. Then we let $0$ represent the lowest possible credence you can assign, and we let $1$ represent the highest possible credence you can assign. We then represent your credences by your credence function $c : \mathcal{F} \rightarrow [0, 1]$, where, for each $A$ in $\mathcal{F}$, $c(A)$ is your credence in $A$.<br /><br /><b>Probabilism</b><br />If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that:<br />(P1a) $c(\bot) = 0$, where $\bot$ is a necessarily false proposition;<br />(P1b) $c(\top) = 1$, where $\top$ is a necessarily true proposition;<br />(P2) $c(A \vee B) = c(A) + c(B)$, for any mutually exclusive propositions $A$ and $B$ in $\mathcal{F}$.<br /><br />This is equivalent to:<br /><br /><b>Partition Probabilism</b><br />If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that, for any two partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$,$$\sum^m_{i=1} c(X_i) = 1= \sum^n_{j=1} c(Y_j)$$<br /><br /><h2>The Dutch Book Argument for Probabilism</h2><br />The Dutch Book Argument for Probabilism has three premises. The first, which I will call <i>Ramsey's Thesis</i> and abbreviate <i>RT</i>, posits a connection between your credence in a proposition and the prices you are rationally permitted or rationally required to pay for a bet on that proposition. 
The second, known as the <i>Dutch Book Theorem</i>, establishes that, if you violate Probabilism, there is a set of bets you might face, each with a price attached, such that (i) by Ramsey's Thesis, for each bet, you are rationally required to pay the attached price for it, but (ii) the sum of the prices of the bets exceeds the highest possible payout of the bets, so that, having paid each of those prices, you are guaranteed to lose money. The third premise, which we might call the <i>Domination Thesis</i>, says that credences are irrational if they mandate you to make a series of decisions (i.e., paying certain prices for the bets) that is guaranteed to leave you worse off than another series of decisions (i.e., refusing to pay those prices for the bets)---in the language of decision theory, paying the attached price for each of the bets is <i>dominated</i> by refusing each of the bets, and credences that mandate you to choose dominated options are irrational. The conclusion of the Dutch Book Argument is then Probabilism. Thus, the argument runs:<br /><br /><b>The Dutch Book Argument for Probabilism</b><br />(DBA1) Ramsey's Thesis<br />(DBA2) Dutch Book Theorem<br />(DBA3) Domination Thesis<br />Therefore,<br />(DBAC) Probabilism<br /><br />The argument is valid. The second premise is a mathematical theorem. Thus, if the argument fails, it must be because the first or third premise is false, or both. In this paper, we focus on the first premise, and the expected utility objection to it. So, let's set out that premise in a little more detail.<br /><br />In what follows, we assume that (i) you are risk-neutral, and (ii) that there is some quantity such that your utility is linear in that quantity---indeed, we will speak as if your utility is linear in money, but that is just for ease of notation and familiarity; any quantity would do. Neither (i) nor (ii) is realistic, and indeed these idealisations are the source of other objections to Ramsey's Thesis. 
But they are not our concern here, so we will grant them.<br /><br /><b>Ramsey's Thesis (RT)</b> Suppose your credence in $X$ is $c(X)$. Consider a bet that pays you £$S$ if $X$ is true and £0 if $X$ is false, where $S$ is a real number, either positive, negative, or zero---$S$ is called the <i>stake</i> of the bet. You are offered this bet for the price £$x$, where again $x$ is a real number, either positive, negative, or zero. Then:<br />(i) If $x < c(X) \times S$, you are rationally required to pay £$x$ to enter into this bet;<br />(ii) If $x = c(X) \times S$, you are rationally permitted to pay £$x$ and rationally permitted to refuse;<br />(iii) If $x > c(X) \times S$, you are rationally required to refuse.<br /><br />Roughly speaking, Ramsey's Thesis says that, the more confident you are in a proposition, the more you should be prepared to pay for a bet on it. More precisely, it says: (a) if you have minimal confidence in that proposition (i.e. 0), then you should be prepared to pay nothing for it; (b) if you have maximal confidence in it (i.e. 1), then you should be prepared to pay the full stake for it; (c) for levels of confidence in between, the amount you should be prepared to pay increases linearly with your credence.<br /><br /><h2>The Expected Utility Objection</h2><br />We turn now to the objection to Ramsey's Thesis (RT) we wish to treat here. Hedden (2013) begins by pointing out that we have a general theory of how credences and utilities should guide action: <br /><br /><blockquote class="tr_bq">Given a set of options available to you, expected utility theory says that your credences license you to choose the option with the highest expected utility, defined as:</blockquote><blockquote>$$\mathrm{EU}(A) = \sum_i P(O_i|A) \times U(O_i)$$<br />On this view, we should evaluate which bets your credences license you to accept by looking at the expected utilities of those bets. 
(Hedden, 2013, 485)</blockquote><br />He considers the objection that this only applies when credences satisfy Probabilism, but rejects it:<br /><br /><blockquote class="tr_bq">In general, we should judge actions by taking the sum of the values of each possible outcome of that action, weighted by one's credence that the action will result in that outcome. This is a very intuitive proposal for how to evaluate actions that applies even in the context of incoherent credences. (Hedden, 2013, 486)</blockquote><br />Thus, Hedden contends that we should always choose by maximising expected utility relative to our credences, whether or not those credences are coherent. Let's call this principle <i>Maximise Subjective Expected Utility</i> and abbreviate it <i>MSEU</i>. He then observes that MSEU conflicts with RT. Consider, for instance, Cináed, who is 60% confident it will rain and 20% confident it won't. According to RT, he is rationally required to sell for £65 a bet in which he pays out £100 if it rains and £0 if it doesn't. But the expected utility of this bet for him is$$0.6 \times (-100 + 65) + 0.2 \times (-0 + 65) = -8$$That is, it has lower expected utility than refusing to sell the bet, since his expected utility for doing that is$$0.6 \times 0 + 0.2 \times 0 = 0$$So, while RT says you must sell that bet for that price, MSEU says you must not. So RT and MSEU are incompatible, and Hedden claims that we should favour MSEU. There are two ways to respond to this. On the first, we try to retain RT in some form in spite of Hedden's objection---I call this the <i>permissive response</i> below. On the second, we try to give a pragmatic argument for Probabilism using MSEU instead of RT---I call this the <i>bookless response</i> below. 
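<br /><br />The arithmetic in Cináed's case can be checked mechanically. What follows is only a minimal sketch in Python: the function name <code>subjective_eu</code> and the dictionary layout are illustrative assumptions of mine, not anything in Hedden's paper. It computes credence-weighted utility without assuming the credences sum to 1, and it treats selling the bet as buying a bet with stake $-100$ at price $-65$, so that clause (i) of RT applies.<br /><br />

```python
from fractions import Fraction


def subjective_eu(credences, utilities):
    """Credence-weighted sum of utilities; the credences need not sum to 1."""
    return sum(credences[state] * utilities[state] for state in credences)


# Cinaed's incoherent credences from the example (0.6 and 0.2, kept exact).
credences = {"rain": Fraction(3, 5), "no_rain": Fraction(1, 5)}

price, payout = 65, 100

# Selling the bet: he receives the price and pays out only if it rains.
sell = {"rain": price - payout, "no_rain": price}
refuse = {"rain": 0, "no_rain": 0}

eu_sell = subjective_eu(credences, sell)
eu_refuse = subjective_eu(credences, refuse)

# RT, clause (i): with stake S = -100 and price x = -65,
# the bet is required when x < c(rain) * S.
rt_requires_sale = -price < credences["rain"] * (-payout)

print(eu_sell, eu_refuse, rt_requires_sale)
```

<br />Under these assumptions RT requires the sale, while MSEU ranks selling strictly below refusing, which is exactly the conflict described above.<br /><br />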
In the following sections, I will consider these in turn.<br /><br /><h2>The Permissive Response</h2><br />While Hedden is right to say that maximising expected utility in line with Maximise Subjective Expected Utility (MSEU) is intuitively rational even when your credences are incoherent, so is Ramsey's Thesis (RT). It is certainly intuitively correct that, to quote Hedden, ''we should judge actions by taking the sum of the values of each possible outcome of that action, weighted by one's credence that the action will result in that outcome.'' But it is also intuitively correct that, to quote from our gloss of Ramsey's Thesis above, ''(a) if you have minimal confidence in that proposition (i.e. 0), then you should be prepared to pay nothing for it; (b) if you have maximal confidence in it (i.e. 1), then you should be prepared to pay the full stake for it; (c) for levels of confidence in between, the amount you should be prepared to pay increases linearly with your credence.'' What are we to do in the face of this conflict between our intuitions?<br /><br />One natural response is to say that choosing in line with RT is rationally permissible and choosing in line with MSEU is also rationally permissible. When your credences are coherent, the dictates of MSEU and RT are the same. But when you are incoherent, they are sometimes different, and in that situation you are allowed to follow either. In particular, faced with a bet and proposed price, you are permitted to pay that price if it is permitted by RT <i>and</i> you are permitted to pay it if it is permitted by MSEU.<br /><br />If this is right, then we can resurrect the Dutch Book Argument with a permissive version of RT as the first premise:<br /><br /><b>Permissive Ramsey's Thesis</b> Suppose your credence in $X$ is $c(X)$. Consider a bet that pays you £$S$ if $X$ is true and £0 if $X$ is false. You are offered this bet for the price £$x$. 
Then:<br />(i) If $x \leq c(X) \times S$, you are rationally permitted to pay £$x$ to enter into this bet.<br /><br />And we could then amend the third premise---the Domination Thesis (DBA3)---to ensure we could still derive our conclusion. Instead of saying that credences are irrational if they <i>mandate</i> you to make a series of decisions that is guaranteed to leave you worse off than another series of decisions, we might say that credences are irrational if they <i>permit</i> you to make a series of decisions that is guaranteed to leave you worse off than another series of decisions. In the language of decision theory, instead of saying only that credences that <i>mandate</i> you to choose dominated options are irrational, we say also that credences that <i>permit</i> you to choose dominated options are irrational. We might call this the <i>Permissive Domination Thesis</i>.<br /><br />Now, by weakening the first premise in this way, we respond to Hedden's objection and make the premise more plausible. But we strengthen the third premise to compensate and perhaps thereby make it less plausible. However, I imagine that anyone who accepts one of the versions of the third premise---either the Domination Thesis or the Permissive Domination Thesis---will also accept the other. Having credences that <i>mandate</i> dominated choices may be worse than having credences that <i>permit</i> such choices, but both seem sufficient for irrationality. Perhaps the former makes you <i>more</i> irrational than the latter, but it seems clear that the ideally rational agent will have credences that do neither. 
And if that's the case, then we can replace the standard Dutch Book Argument with a slight modification:<br /><br /><b>The Permissive Dutch Book Argument for Probabilism</b><br />(PDBA1) Permissive Ramsey's Thesis<br />(PDBA2) Dutch Book Theorem<br />(PDBA3) Permissive Domination Thesis<br />Therefore,<br />(PDBAC) Probabilism<br /><br /><h2>The Bookless Response</h2><br />Suppose you refuse even the permissive version of RT, and insist that coherent and incoherent agents alike should choose in line with MSEU. Then what becomes of the Dutch Book Argument? As we noted above, Hedden shows that it fails---MSEU is not sufficient to establish the conclusion. In particular, Hedden gives an example of an incoherent credence function that is not Dutch Bookable via MSEU. That is, there are no sets of bets with accompanying prices such that (a) MSEU will demand that you pay each of those prices, and (b) the sum of those prices is guaranteed to exceed the sum of the payouts of that set of bets. However, as we will see, accepting individual members of such a set of bets is just one way to make bad decisions based on your credences.<br /><br />Consider Hedden's example. In it, you assign credences to propositions in the algebra built up from three possible worlds, $w_1$, $w_2$, and $w_3$. 
Here are some of your credences:<br /><ul><li>$c(w_1 \vee w_2) = 0.8$ and $c(w_3) = 0$</li><li>$c(w_1) = 0.7$ and $c(w_2 \vee w_3) = 0$</li></ul>Now, consider the following two options, $A$ and $B$, whose utilities in each state of the world are set out in the following table:<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-ekyYtB-ehnw/W1hJ6FuFhHI/AAAAAAAAAvE/1lIqibrpdTsF3vuG8g6RYm2WBpHBgmMewCLcBGAs/s1600/IMG_6046.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="494" data-original-width="1016" height="96" src="https://4.bp.blogspot.com/-ekyYtB-ehnw/W1hJ6FuFhHI/AAAAAAAAAvE/1lIqibrpdTsF3vuG8g6RYm2WBpHBgmMewCLcBGAs/s200/IMG_6046.jpg" width="200" /></a></div><br />Then notice first that $A$ dominates $B$---that is, the utility of $A$ is higher than that of $B$ in every possible state of the world. But, using your incoherent credences, you assign a higher expected utility to $B$ than to $A$. Your expected utility for $A$---which must be calculated relative to your credences in $w_1$ and $w_2 \vee w_3$, since the utility of $A$ given $w_1 \vee w_2$ is undefined---is $0.7 \times 78 + 0 \times 77 = 54.6$. And your expected utility for $B$---which must be calculated relative to your credences in $w_1 \vee w_2$ and $w_3$, since the utility of $B$ given $w_2 \vee w_3$ is undefined---is $0.8 \times 74 + 0 \times 75 = 59.2$. So, while Hedden might be right that MSEU won't leave you vulnerable to a Dutch Book, it will leave you vulnerable to choosing a dominated option. And since what is bad about entering a Dutch Book is that it is a dominated option---it is dominated by the option of refusing the bets---the invulnerability to Dutch Books should be no comfort to you.<br /><br />Now, this raises the question: For which incoherent credences can MSEU lead you to choose a dominated option? 
Is it <i>all</i> incoherent credences, in which case we would have a new Dutch Book Argument for Probabilism from MSEU rather than RT? Or is it some subset? Below, we prove a theorem that answers that. First, a weakened version of Probabilism:<br /><br /><b>Bounded Probabilism</b> If $c : \mathcal{F}\rightarrow [0, 1]$ is your credence function, then rationality requires that:<br />(BP1a) $c(\bot) = 0$, where $\bot$ is a necessarily false proposition;<br />(BP1b) There is $0 < M \leq 1$ such that $c(\top) = M$, where $\top$ is a necessarily true proposition;<br />(BP2) $c(A \vee B) = c(A) + c(B)$, if $A$ and $B$ are mutually exclusive.<br /><br />Bounded Probabilism says that you should have lowest possible credence in necessary falsehoods, some positive credence---not necessarily 1---in necessary truths, and your credence in a disjunction of two incompatible propositions should be the sum of your credences in the disjuncts.<br /><br /><b>Theorem 1</b> The following are equivalent:<br />(i) $c$ satisfies Bounded Probabilism<br />(ii) For all options $A$, $B$, if $A$ dominates $B$, then $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$.<br /><br />The proof is in the Appendix below. Thus, even without Ramsey's Thesis or the permissive version described above, you can still give a pragmatic argument for a norm that lies very close to Probabilism, namely, Bounded Probabilism. On its own, this argument cannot say what is wrong with someone who gives less than the highest possible credence to necessary truths, but it does establish the other requirements that Probabilism imposes. 
To see just how close to Probabilism lies Bounded Probabilism, consider the following two norms, which are equivalent to it:<br /><br /><b>Scaled Probabilism</b> If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that there is $0 < M \leq 1$ and a probability function $p : \mathcal{F} \rightarrow [0, 1]$ such that $c(-) = M \times p(-)$.<br /><br /><b>Bounded Partition Probabilism</b> If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that, for any two partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$,$$\sum^m_{i=1} c(X_i) = \sum^n_{j=1} c(Y_j)<br />$$Then<br /><br /><b>Lemma 2</b> The following are equivalent:<br />(i) Bounded Probabilism<br />(ii) Scaled Probabilism<br />(iii) Bounded Partition Probabilism<br /><br />As before, the proof is in the Appendix.<br /><br />So, on its own, MSEU can deliver us very close to Probabilism. But it cannot establish (P1b), namely, $c(\top) = 1$. However, I think we can also appeal to a highly restricted version of the Permissive Ramsey's Thesis to secure (P1b) and push us all the way to Probabilism.<br /><br />Consider Dima and Esther. They both have minimal confidence---i.e. 0---that it won't rain tomorrow. But Dima has credence 0.01 that it will rain, while Esther has credence 0.99 that it will. If we permit only actions that maximise expected utility, then Dima and Esther are required to pay exactly the same prices for bets on rain---that is, Dima will be required to pay a price exactly when Esther is. After all, if £$S$ is the payoff when it rains, £0 is the payoff when it doesn't, and $x$ is a proposed price, then $0.01\times (S- x) + 0 \times (0-x) \geq 0$ iff $0.99 \times (S-x) + 0 \times (0-x) \geq 0$ iff $S \geq x$. So, according to MSEU, Dima and Esther are rationally required to pay anything up to the stake of the bet for such a bet. But this is surely wrong. 
It is surely at least permissible for Dima to refuse to pay a price that Esther accepts. It is surely permissible for Esther to pay £99 for a bet on rain that pays £100 if it rains and £0 if it doesn't, while Dima refuses to pay anything more than £1 for such a bet, in line with Ramsey's Thesis. Suppose Dima were offered such a bet for the price of £99, and suppose she then defended her refusal to pay that price saying, 'Well, I only think it's 1% likely to rain, so I don't want to risk such a great loss with so little possible gain when I think the gain is so unlikely'. Then surely we would accept that as a rational defence.<br /><br />In response to this, defenders of MSEU might concede that RT is sometimes the correct norm of action when you are incoherent, but only in very specific cases, namely, those in which you have a positive credence in a proposition, minimal credence (i.e. 0) in its negation, and you are considering the price you might pay for a bet on that proposition. In all other cases---that is, in any case in which your credences in the proposition and its negation are both positive, or in which you are considering an action other than a bet on a proposition---you should use MSEU. I have some sympathy with this. But, fortunately, this restricted version is all we need. After all, it is precisely by applying Ramsey's Thesis to such a case that we can produce a Dutch Book against someone with $c(\bot) = 0$ and $c(\top) < 1$---we simply offer to pay them £$c(\top) \times 100$ for a bet in which they will pay out £100 if $\top$ is true and £0 if it is false; this is then guaranteed to lose them £$100 \times (1-c(\top))$, which is positive. 
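<br /><br />To make the size of that sure loss concrete, here is a small Python sketch. The function name and the sample credence of 0.9 are illustrative assumptions only. The agent sells a bet on $\top$ at the price RT attaches to it, and, since $\top$ is true in every world, always pays out the full stake.<br /><br />

```python
from fractions import Fraction


def net_from_selling_bet_on_tautology(c_top, stake=100):
    """Net gain for an agent who sells, at the RT price c_top * stake,
    a bet that pays out `stake` if the tautology is true.
    The tautology is true in every world, so the payout is always made."""
    price_received = c_top * stake
    return price_received - stake


c_top = Fraction(9, 10)  # assumed credence in the tautology, below the maximum of 1
net = net_from_selling_bet_on_tautology(c_top)

print(net)
```

<br />The result is negative whenever the credence in $\top$ falls short of 1, and it equals $-100 \times (1 - c(\top))$.<br /><br />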
Thus, we end up with a disjunctive pragmatic argument for Probabilism: if $c(\bot) = 0$ and $c(\top) < 1$, then RT applies and we can produce a Dutch Book against you; if you violate Probabilism in any other way, then you violate Bounded Probabilism and we can then produce two options $A$ and $B$ such that $A$ dominates $B$, but your credences, via MSEU, dictate that you should choose $B$ over $A$. This, then, is our bookless pragmatic argument for Probabilism:<br /><br /><b>Bookless Pragmatic Argument for Probabilism</b><br />(BPA1) If $c$ violates Probabilism, then either (i) $c(\bot) = 0$ and $c(\top) < 1$, or (ii) $c$ violates Bounded Probabilism.<br />(BPA2) If $c(\bot) = 0$ and $c(\top) < 1$, then RT applies, and there is a bet on $\top$ such that you are required by RT to pay a higher price for that bet than its guaranteed payoff. Thus, there are options $A$ and $B$ (namely, <i>refuse the bets</i> and <i>pay the price</i>), such that $A$ dominates $B$, but RT demands that you choose $B$ over $A$.<br />(BPA3) If $c$ violates Bounded Probabilism, then by Theorem 1, there are options $A$ and $B$ such that $A$ dominates $B$, but MSEU demands that you choose $B$ over $A$. Therefore, by (BPA1), (BPA2), and (BPA3),<br />(BPA4) If $c$ violates Probabilism, then there are options $A$ and $B$ such that $A$ dominates $B$, but rationality requires you to choose $B$ over $A$.<br />(BPA5) Domination Thesis<br />Therefore,<br />(BPAC) Probabilism<br /><br /><h2>Conclusion</h2><br />The Dutch Book Argument for Probabilism assumes Ramsey's Thesis, which determines the prices an agent is rationally required to pay for a bet. Hedden argues that Ramsey's Thesis is wrong. He claims that Maximise Subjective Expected Utility determines those prices, and it often disagrees with RT. 
In our Permissive Dutch Book Argument, I suggested that, in the face of that disagreement, we might be permissive: agents are permitted to pay any price that is required or permitted by RT and they are permitted to pay any price that is required or permitted by MSEU. In our Bookless Pragmatic Argument, I then explored what we might do if we reject this permissive response and insist that only prices permitted or required by MSEU are permissible. I showed that, in that case, we can give a pragmatic argument for Bounded Probabilism, which comes close to Probabilism, but doesn't quite reach; and I showed that, if we allow RT in the very particular cases in which it agrees better with intuition than MSEU does, we can give a pragmatic argument for Probabilism.<br /><br /><h2>Appendix: Proof of Theorem 1</h2><br /><b>Theorem 1 </b>The following are equivalent:<br />(i) $c$ satisfies Bounded Probabilism<br />(ii) For all options $A$, $B$, if $A$ dominates $B$, then $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$.<br /><br />($\Rightarrow$) Suppose $c$ satisfies Bounded Probabilism. Then, by Lemma 2, there is $0 < M \leq 1$ and a probability function $p$ such that $c(-) = M \times p(-)$. Now suppose $A$ and $B$ are actions. Then<br /><ul><li>$\mathrm{EU}_c(A) = \mathrm{EU}_{M \times p}(A) = M \times \mathrm{EU}_p(A)$</li><li>$\mathrm{EU}_c(B) = \mathrm{EU}_{M \times p}(B) = M \times \mathrm{EU}_p(B)$</li></ul>Thus, $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$ iff $\mathrm{EU}_p(A) > \mathrm{EU}_p(B)$. And we know that, if $A$ dominates $B$ and $p$ is a probability function, then $\mathrm{EU}_p(A) > \mathrm{EU}_p(B)$.<br /><br />($\Leftarrow$) Suppose $c$ violates Bounded Probabilism. 
Then there are partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$ such that $$\sum^m_{i=1} c(X_i) = x < y = \sum^n_{j=1} c(Y_j)$$We will now define two acts $A$ and $B$ such that $A$ dominates $B$, but $\mathrm{EU}_c(A) < \mathrm{EU}_c(B)$.<br /><ul><li>For any $X_i$ in $\mathcal{X}$, $$U(A, X_i) = y - i\frac{y-x}{2(m + 1)}$$</li><li>For any $Y_j$ in $\mathcal{Y}$,$$U(B, Y_j) = x + j\frac{y-x}{2(n + 1)}$$</li></ul>Then the crucial facts are:<br /><ul><li>For any two $X_i \neq X_j$ in $\mathcal{X}$,$$U(A, X_i) \neq U(A, X_j)$$</li><li>For any two $Y_i \neq Y_j$ in $\mathcal{Y}$,$$U(B, Y_i) \neq U(B, Y_j)$$</li><li>For any $X_i$ in $\mathcal{X}$ and $Y_j$ in $\mathcal{Y}$, $$x < U(B, Y_j) < \frac{x+y}{2} < U(A, X_i) < y$$</li></ul>So $A$ dominates $B$, but$$\mathrm{EU}_c(A) = \sum^m_{i=1} c(X_i) U(A, X_i) < \sum^m_{i=1} c(X_i) \times y = xy$$<br />while$$\mathrm{EU}_c(B) = \sum^n_{j=1} c(Y_j) U(B, Y_j) > \sum^n_{j=1} c(Y_j) \times x = yx$$So $\mathrm{EU}_c(B) > \mathrm{EU}_c(A)$, as required.Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com0tag:blogger.com,1999:blog-4987609114415205593.post-13476030767335627332018-07-19T09:26:00.001+01:002018-07-26T08:24:39.726+01:00What is Probabilism?<br />I've just signed a contract with Cambridge University Press to write <a href="https://richardpettigrew.com/books/the-dutch-book-argument/" target="_blank">a book on the Dutch Book Argument</a> for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.<br /><br />---- <br /><br />Probabilism is the claim that your credences should satisfy the axioms of the probability calculus. 
Here is an attempt to state the norm more precisely, where $\mathcal{F}$ is the algebra of propositions to which you assign credences and $c$ is your credence function, which is defined on $\mathcal{F}$, so that $c(A)$ is your credence in $A$, for each $A$ in $\mathcal{F}$.<br /><br /><b>Probabilism (initial formulation)</b> <br /><ul><li>(Non-Negativity) Your credences should not be negative. In symbols: $c(A) \geq 0$, for all $A$ in $\mathcal{F}$.</li><li>(Normalization I) Your credence in a necessarily false proposition should be 0. In symbols: $c(\bot) = 0$.</li><li>(Normalization II) Your credence in a necessarily true proposition should be 1. In symbols: $c(\top) = 1$.</li><li>(Finite Additivity) Your credence in the disjunction of two mutually exclusive propositions should be the sum of your credences in the disjuncts. In symbols: $c(A \vee B) = c(A) + c(B)$.</li></ul>This sort of formulation is fairly typical. But I think it's misleading in various ways.<br /><br />As is often pointed out, 0 and 1 are merely conventional choices. Like utilities, we can measure credences on different scales. But what are they conventional choices for? It seems to me that they must represent the lowest possible credence you can have and the highest possible credence you can have, respectively. After all, what we want Normalization I and II to say is that we should have lowest possible credence in necessary falsehoods and highest possible credence in necessary truths. It follows that Non-Negativity is not a normative constraint on your credences, which is how it is often presented. Rather, it follows immediately from the particular representation of our credences that we have chosen. Suppose we chose a different representation, where -1 represents the lowest possible credence and 1 represents the highest. 
Then Normalization I and II would say that $c(\bot) = -1$ and $c(\top) = 1$, so Non-Negativity would be false.<br /><br />One upshot of this is that Non-Negativity is superfluous once we have specified the representation of credences that we are using. But another is that Probabilism incorporates not only normative claims, such as Normalization I and II and Finite Additivity, but also a metaphysical claim, namely, that there is a lowest possible credence that you can have and a highest possible credence that you can have. Without that, we couldn't specify the representation of credences in such a way that we would want to sign up to Normalization I and II. Suppose that, for any credence you can have, there is a higher one that you could have. Then there is no credence that I would want to demand you have in a necessary truth--for any I demanded, it would be better for you to have one higher. So I either have to say that all credences in necessary truths are rationally forbidden, or all are rationally permitted, or I pick some threshold above which any credence is rationally permitted. And the same goes, mutatis mutandis, for credences in necessary falsehoods. I'm not sure what the norm of credences would be if our credences were unbounded in one or other or both directions. But it certainly wouldn't be Probabilism.<br /><br />So Non-Negativity is not a normative claim, but rather a trivial consequence of a metaphysical claim together with a conventional choice of representation. The metaphysical claim is that there is a minimal and a maximal credence; the representation choice is that 0 will represent the minimal credence and 1 will represent the maximal credence.<br /><br />Next, suppose we make a different conventional choice. Suppose we pick real numbers $a$ and $b$, and we say that $a$ represents minimal credence and $b$ represents maximal credence. Then clearly Normalization I becomes $c(\bot) = a$ and Normalization II becomes $c(\top) = b$. 
But what of Finite Additivity? This looks problematic. After all, if $a = 10$ and $b = 30$, and $c(A) = 20 = c(\overline{A})$, then Finite Additivity demands that $c(\top) = c(A \vee \overline{A}) = c(A) + c(\overline{A}) = 40$, which is greater than the maximal credence. So Finite Additivity makes an impossible demand on an agent who seems to have perfectly rational credences in $A$ and $\overline{A}$, given the representation.<br /><br />The reason is that Finite Additivity, formulated as we formulated it above, is peculiar to very specific representations of credences, such as the standard one on which 0 stands for minimal credence and 1 stands for maximal credence. The correct formulation of Finite Additivity in general says: $c(A \vee B) = c(A) + c(B) - c(A\ \&\ B)$, for any propositions $A$, $B$ in $\mathcal{F}$. Thus, in the case we just gave above, if $c(A\ \&\ \overline{A}) = 10$, in keeping with the relevant version of Normalization I, we have $c(A \vee \overline{A}) = 20 + 20 - 10 = 30$, as required. So we see that it's wrong to say that Probabilism says that your credence in the disjunction of two mutually exclusive propositions should be the sum of your credences in the disjuncts--that's actually only true on some representations of your credences (namely, those for which 0 represents minimal credence).<br /><br />Bringing all of this together, I propose the following formulation of Probabilism:<br /><br /><b>Probabilism (revised formulation)</b> <br /><ul><li>(Bounded credences) There is a lowest possible credence you can have; and there is a highest possible credence you can have.</li><li>(Representation) We represent the lowest possible credence you can have using $a$, and we represent the highest possible credence you can have using $b$.</li><li>(Normalization I) Your credence in a necessarily false proposition should be the lowest possible credence you can have. 
In symbols: $c(\bot) = a$.</li><li>(Normalization II) Your credence in a necessarily true proposition should be the highest possible credence you can have. In symbols: $c(\top) = b$.</li><li>(Finite Additivity) $c(A \vee B) = c(A) + c(B) - c(A\ \&\ B)$, for any propositions $A$, $B$ in $\mathcal{F}$.</li></ul>We call such a credence function a probability$_{a, b}$ function. How can we be sure this is right? Here are some considerations in its favour:<br /><br /><b>Switching representations </b><br />(i)<b> </b>Suppose $c(-)$ is a probability$_{a, b}$ function. Then $\frac{1}{b-a}c(-) - \frac{a}{b-a}$ is a probability function (or probability$_{0, 1}$ function).<br />(ii) Suppose $c(-)$ is a probability function and $a, b$ are real numbers. Then $c(-)(b-a) + a$ is a probability$_{a, b}$ function.<br /><br /><b>Dutch Book Argument</b><br />The standard Dutch Book Argument for Probabilism assumes that, if you have credence $p$ in proposition $X$, then you will pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. But this assumes that you have credences between 0 and 1, inclusive. What is the corresponding assumption if you represent credences in a different scale? Shorn of its conventional choice of representation, the assumption is: (a) you will pay £$0$ for a bet on $X$ if you have minimal credence in $X$; (b) you will pay £$S$ for a bet on $X$ if you have maximal credence in $X$; (c) the price you will pay for a bet on $X$ increases linearly with your credence in $X$. Translated into a framework in which we measure credence on a scale from $a$ to $b$, the assumption is then: you will pay £$\frac{p-a}{b-a}S$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. 
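<br /><br />As a quick check on that translated assumption, here is a minimal Python sketch. The function names <code>to_unit_scale</code> and <code>betting_price</code> are my own illustrative choices, and the scale from 10 to 30 is borrowed from the Finite Additivity example above. The sketch assumes $a < b$.<br /><br />

```python
from fractions import Fraction


def to_unit_scale(credence, a, b):
    """Map a credence measured on the scale [a, b] onto the scale [0, 1]."""
    return Fraction(credence - a, b - a)


def betting_price(credence, stake, a=0, b=1):
    """Price for a bet paying `stake` if X and 0 otherwise, when credence
    in X is measured on a scale from a (minimal) to b (maximal)."""
    return to_unit_scale(credence, a, b) * stake


# Credences measured on a scale from 10 to 30, with a stake of 100.
low = betting_price(10, 100, a=10, b=30)   # minimal credence
mid = betting_price(20, 100, a=10, b=30)   # midpoint of the scale
high = betting_price(30, 100, a=10, b=30)  # maximal credence

print(low, mid, high)
```

<br />Minimal credence fetches a price of nothing, maximal credence fetches the full stake, and the price is linear in between, matching conditions (a) to (c).<br /><br />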
And, with this assumption, we can find Dutch Books against any credence function that isn't a probability$_{a, b}$ function.<br /><br /><b>Accuracy Dominance Argument</b><br />The standard Accuracy Dominance Argument for Probabilism assumes that, for each world, the ideal or vindicated credence function at that world assigns 0 to all falsehoods and 1 to all truths. Of course, if we represent minimal credence by $a$ and maximal credence by $b$, then we'll want to change that assumption. We'll want to say instead that the ideal or vindicated credence function at a world assigns $a$ to falsehoods and $b$ to truths. Once we say that, for any credence function that isn't a probability$_{a, b}$ function, there is another credence function that is closer to the ideal credence function at all worlds.<br /><br />So, the usual arguments for having a credence function that is a probability function when you represent your credences on a scale from 0 to 1 can be repurposed to argue that you should have a credence function that is a probability$_{a, b}$ function when you represent your credences on a scale from $a$ to $b$. And that gives us good reason to think that the second formulation of Probabilism above is correct.<br /><br /><br />Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com0tag:blogger.com,1999:blog-4987609114415205593.post-52914705473919112162018-07-16T11:02:00.000+01:002018-07-16T11:02:49.107+01:00Yet another assistant professorship in formal philosophy @ University of Gdansk<div dir="ltr" style="text-align: left;" trbidi="on"><div style="text-align: left;"><a href="http://entiaetnomina.blogspot.com/2016/12/assistant-professorship-in-mathematical.html" style="text-align: justify;" target="_blank">Some time ago</a><span style="text-align: justify;"> the Chair of Logic, Philosophy of Science and Epistemology had an opening in formal philosophy that since then has been filled. 
Now, another position (leading to a permanent position upon second renewal) is available (so, there'll be three tenure-track faculty members working on formal philosophy). <a href="http://entiaetnomina.blogspot.com/2018/07/yet-another-assistant-professorship-in.html">Details.</a></span></div></div>Rafal Urbaniakhttp://www.blogger.com/profile/10277466578023939272noreply@blogger.com0tag:blogger.com,1999:blog-4987609114415205593.post-59484165495972496612018-07-16T08:49:00.003+01:002018-07-16T08:49:58.175+01:00Lecturer Position in Logic and Philosophy of Language (MCMP) <br /><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">The Ludwig-Maximilians-University Munich is one of the largest and most prestigious universities in Germany.</span><span lang="EN-US" style="mso-ansi-language: EN-US;"></span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Ludwig-Maximilians-University Munich is seeking applications for one</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: center;"><b style="mso-bidi-font-weight: normal;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Lecturer Position (equivalent to Assistant Professorship) </span></b></div><div class="MsoPlainText" style="text-align: center;"><b style="mso-bidi-font-weight: normal;"><span lang="EN-US" style="mso-ansi-language: EN-US;">in Logic and Philosophy of Language</span></b></div><div class="MsoPlainText" style="text-align: center;"><span lang="EN-US" style="mso-ansi-language: EN-US;">(for three years, with the possibility of extension)</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">at the Chair of Logic and Philosophy of Language (Professor 
Hannes Leitgeb) and the Munich Center for Mathematical Philosophy (MCMP) at the Faculty of Philosophy, Philosophy of Science and Study of Religion. The position, which is to start on December 1, 2018, is for three years with the possibility of extension for another three years.</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">The appointee will be expected (i) to do philosophical research, especially in logic and philosophy of language, (ii) to teach five hours a week in areas relevant to the chair, and (iii) to participate in the administrative work of the MCMP.</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">The successful candidate will have a PhD in philosophy or logic, will have teaching experience in philosophy and logic, and will have carried out research in logic and related areas (such as philosophy of logic, philosophy of language, philosophy of mathematics, formal epistemology).</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">Your workplace is centrally located in Munich and is very easy to reach by public transport. 
We offer you an interesting and responsible job with good training and development opportunities.</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">The employment takes place within the TV-L scheme.</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">The position is initially limited to November 30, 2021.</span></div><div class="MsoNormal" style="line-height: normal; text-align: left;"><span lang="EN-US" style="font-family: "Calibri",sans-serif; letter-spacing: 0pt; mso-ansi-language: EN-US;">Furthermore, given equal qualification, severely physically challenged applicants will be preferred</span><span lang="EN" style="mso-ansi-language: EN;">.</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">There is the possibility of part-time employment.</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">The application of women is strongly welcome.</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Applications (including CV, certificates, list of publications, list of courses taught, a writing sample and a description of planned research projects (1000-1500 words)) should be sent either by email (ideally all requested documents in just one PDF document) or by mail to</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Ludwig-Maximilians-Universität München</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Faculty of Philosophy, Philosophy of Science and Study of Religion</span></div><div class="MsoPlainText" 
style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">Chair of Logic and Philosophy of Language / MCMP</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="DE">Geschwister-Scholl-Platz 1</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="DE">80539 München</span></div><div class="MsoPlainText" style="text-align: left;"><span lang="DE">e-Mail: </span><span class="MsoHyperlink"><span lang="DE" style="font-family: "Calibri",sans-serif; mso-bidi-font-family: "Times New Roman";"><a href="mailto:office.leitgeb@lrz.uni-muenchen.de">office.leitgeb@lrz.uni-muenchen.de</a></span></span><span lang="DE"></span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: EN-US;">by <b style="mso-bidi-font-weight: normal;">September 1, 2018</b>. If possible, we very much prefer applications by email.</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">In addition, we ask for two letters of reference, which must be sent by the referees directly to the above address (e-mail preferred).</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN" style="mso-ansi-language: EN;">For further questions, please contact us by e-mail at <span class="MsoHyperlink"><span style="font-family: "Calibri",sans-serif; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi;"><a href="mailto:office.leitgeb@lrz.uni-muenchen.de">office.leitgeb@lrz.uni-muenchen.de</a></span></span>.</span><span lang="EN-US" style="mso-ansi-language: EN-US;"></span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div class="MsoPlainText" style="text-align: left;"><span lang="EN-US" style="mso-ansi-language: 
EN-US;">More information about the MCMP can be found at </span><span class="MsoHyperlink"><span lang="EN-US" style="font-family: "Calibri",sans-serif; mso-ansi-language: EN-US; mso-bidi-font-family: "Times New Roman";"><a href="http://www.mcmp.philosophie.uni-muenchen.de/index.html">http://www.mcmp.philosophie.uni-muenchen.de/index.html</a></span></span><span lang="EN-US" style="mso-ansi-language: EN-US;">.</span></div><div class="MsoPlainText" style="text-align: left;"><br /></div><div style="text-align: left;"> <span lang="EN-US" style="font-family: "Calibri",sans-serif; font-size: 11.0pt; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "LMU CompatilFact"; mso-bidi-language: AR-SA; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: DE; mso-hansi-theme-font: minor-latin;">The German description of the position is to be found at <span class="MsoHyperlink"><span style="font-family: "Calibri",sans-serif; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "LMU CompatilFact"; mso-hansi-theme-font: minor-latin;"><a href="https://www.uni-muenchen.de/aktuelles/stellenangebote/wissenschaft/20180704161330.html">https://www.uni-muenchen.de/aktuelles/stellenangebote/wissenschaft/20180704161330.html</a></span></span></span> </div><div style="text-align: left;"><br /></div><style><!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:0; mso-generic-font-family:roman; mso-font-pitch:variable; mso-font-signature:-536870145 1107305727 0 0 415 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-536859905 -1073732485 9 0 511 0;} @font-face {font-family:"LMU CompatilFact"; panose-1:2 11 6 4 2 2 2 2 2 4; mso-font-alt:Calibri; mso-font-charset:0; mso-generic-font-family:auto; mso-font-pitch:variable; mso-font-signature:-2147483473 66 0 0 9 0;} /* Style 
Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:""; margin:0cm; margin-bottom:.0001pt; line-height:12.0pt; mso-line-height-rule:exactly; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"LMU CompatilFact"; mso-fareast-font-family:"Times New Roman"; mso-bidi-font-family:"LMU CompatilFact"; letter-spacing:.6pt; mso-ansi-language:DE; mso-fareast-language:DE;} a:link, span.MsoHyperlink {mso-style-priority:99; mso-style-unhide:no; font-family:"Times New Roman",serif; mso-bidi-font-family:"Times New Roman"; color:blue; text-decoration:underline; text-underline:single;} a:visited, span.MsoHyperlinkFollowed {mso-style-noshow:yes; mso-style-priority:99; color:purple; mso-themecolor:followedhyperlink; text-decoration:underline; text-underline:single;} p.MsoPlainText, li.MsoPlainText, div.MsoPlainText {mso-style-priority:99; mso-style-link:"Testo normale Carattere"; margin:0cm; margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; mso-bidi-font-size:10.5pt; font-family:"Calibri",sans-serif; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi; mso-ansi-language:DE; mso-fareast-language:EN-US;} span.TestonormaleCarattere {mso-style-name:"Testo normale Carattere"; mso-style-priority:99; mso-style-unhide:no; mso-style-locked:yes; mso-style-link:"Testo normale"; mso-bidi-font-size:10.5pt; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi; mso-fareast-language:EN-US;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-size:11.0pt; mso-ansi-font-size:11.0pt; mso-bidi-font-size:11.0pt; mso-ansi-language:DE; mso-fareast-language:DE;} @page WordSection1 {size:612.0pt 792.0pt; margin:70.85pt 2.0cm 2.0cm 
2.0cm; mso-header-margin:36.0pt; mso-footer-margin:36.0pt; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} --></style>Vincenzo Crupihttp://www.blogger.com/profile/08069145846190162517noreply@blogger.com1tag:blogger.com,1999:blog-4987609114415205593.post-67854095470407479962018-07-03T15:41:00.002+01:002018-07-03T15:41:33.894+01:00Entia et Nomina 2018 (August 28-29, Gdansk)<div dir="ltr" style="text-align: left;" trbidi="on"><div style="text-align: justify;">This is the seventh conference in the Entia et NominA series (previous editions took place in Poland, Belgium and India), which features workshops for researchers in formally and analytically oriented philosophy, in particular in <b>epistemology, logic, and philosophy of science</b>. The distinctive format of the workshop requires participants to distribute extended abstracts or full papers a couple of weeks before the workshop and to prepare extended comments on another participant's paper.</div><div><br /></div><div><b>Invited speakers</b></div><div>- Zalan Gyenis (Jagiellonian University, Poland)</div><div>- Masashi Kasaki (Nagoya University, Japan)</div><div>- Martin Smith (University of Edinburgh, Scotland)</div><div><br /></div><div><br /></div><div><b>Dates: </b></div><div>- Submission deadline: July 20</div><div>- Decisions: August 1</div><div>- Workshop: August 28-29</div><div><br /></div><div>For more details on the workshop and submission, consult the pdf file with full CFP:</div><div><br /></div><div><a href="https://drive.google.com/file/d/1XMYKaVEDyidZ6W901gDrVVN3_vIk4k_6/view?usp=sharing" target="_blank">Entia et Nomina FULL CFP</a></div></div>Rafal Urbaniakhttp://www.blogger.com/profile/10277466578023939272noreply@blogger.com103tag:blogger.com,1999:blog-4987609114415205593.post-38613388398177303672018-02-12T20:31:00.000+00:002018-07-26T08:20:15.610+01:00An almost-Dutch Book argument for the Principal PrinciplePeople often talk about the synchronic <a 
href="https://plato.stanford.edu/entries/dutch-book/#BasiDutcBookArguForProb" target="_blank">Dutch Book argument for Probabilism</a> and the <a href="https://plato.stanford.edu/entries/dutch-book/#DiacDutcBookArgu" target="_blank">diachronic Dutch Strategy argument for Conditionalization</a>. But the synchronic Dutch Book argument for the Principal Principle is mentioned less. That's perhaps because, in one sense, there couldn't possibly be such an argument. As the Converse Dutch Book Theorem shows, provided you satisfy Probabilism, there can be no Dutch Book made against you -- that is, there is no set of bets, each of which you will consider fair or favourable on its own, but which, when taken together, lead to a sure loss for you. So you can violate the Principal Principle without being vulnerable to a sure loss, provided you satisfy Probabilism. However, there is a related argument for the Principal Principle. And conversations with a couple of philosophers recently made me think it might be worth laying it out.<br /><br />Here is the result on which the argument is based:<br /><br />(I) Suppose your credences violate the Principal Principle but satisfy Probabilism. Then there is a book of bets and a price such that: (i) you consider that price favourable for that book -- that is, your subjective expectation of the total net gain is positive; (ii) every possible objective chance function considers that price unfavourable -- that is, the objective expectation of the total net gain is guaranteed to be negative.<br /><br />(II) Suppose your credences satisfy both the Principal Principle and Probabilism. Then there is no book of bets and price such that: (i) you consider that price favourable for that book; (ii) every possible objective chance function considers that price unfavourable.<br /><br />Put another way:<br /><br />(I') Suppose your credences violate the Principal Principle. 
Then there are two actions $a$ and $b$ such that: you prefer $b$ to $a$, but every possible objective chance function prefers $a$ to $b$.<br /><br />(II') Suppose your credences satisfy the Principal Principle. For any two actions $a$ and $b$: if every possible objective chance function prefers $a$ to $b$, then you prefer $a$ to $b$.<br /><br />To move from (I) and (II) to (I') and (II'), let $b$ be the action of accepting the book of bets at the given price and let $a$ be the action of rejecting it: in (I), you consider the price favourable, and so prefer accepting, while every possible chance function considers it unfavourable, and so prefers rejecting. <br /><br />The proof splits into two parts:<br /><br />(1) First, we note that a credence function $c$ satisfies the Principal Principle iff $c$ is in the closed convex hull of the set of possible chance functions.<br /><br />(2) Second, we prove that:<br /><br />(2I) If a probability function $c$ lies outside the closed convex hull of a set of probability functions $\mathcal{X}$, then there is a book of bets and a price such that the expected total net gain from that book at that price by the lights of $c$ is positive, while the expected total net gain from that book at that price by the lights of each $p$ in $\mathcal{X}$ is negative.<br /><br />(2II) If a probability function $c$ lies inside the closed convex hull of a set of probability functions $\mathcal{X}$, then there is no book of bets and price such that the expected total net gain from that book at that price by the lights of $c$ is positive, while the expected total net gain from that book at that price by the lights of each $p$ in $\mathcal{X}$ is negative.<br /><br />Here's the proof of (2), which I lift from my <a href="https://drive.google.com/file/d/11hxCUJAKLk7_6_WARz56z6ITm9lX4U5y/view" target="_blank">recent justification of linear pooling</a> -- the same technique is applicable since the Principal Principle essentially says that you should set your credences by applying linear pooling to the possible objective chances.<br /><br />First:<br /><ul><li>Let $\Omega$ be the set of possible worlds</li><li>Let $\mathcal{F} = \{X_1, 
\ldots, X_n\}$ be the set of propositions over which our probability functions are defined. So each $X_i$ is a subset of $\Omega$.</li></ul>Now:<br /><ul><li>We represent a probability function $p$ defined on $\mathcal{F}$ as a vector in $\mathbb{R}^n$, namely, $p = \langle p(X_1), \ldots, p(X_n)\rangle$.</li><li>Given a proposition $X$ in $\mathcal{F}$ and a stake $S$ in $\mathbb{R}$, we define the bet $B_{X, S}$ as follows: $$B_{X, S}(\omega) = \left \{ \begin{array}{ll}<br />S & \mbox{if } \omega \in X \\<br />0 & \mbox{if } \omega \not \in X<br />\end{array}<br />\right.$$ So $B_{X, S}$ pays out $S$ if $X$ is true and $0$ if $X$ is false.</li><li>We represent the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ as a vector in $\mathbb{R}^n$, namely, $S = \langle S_1, \ldots, S_n\rangle$. </li></ul><br /><b>Lemma 1</b><br />If $p$ is a probability function on $\mathcal{F}$, the expected payoff of the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ by the lights of $p$ is $$S \cdot p = \sum^n_{i=1} p(X_i)S_i$$<br /><b>Lemma 2</b><br />Suppose $c$ is a probability function on $\mathcal{F}$, $\mathcal{X}$ is a set of probability functions on $\mathcal{F}$, and $\mathcal{X}^+$ is the closed convex hull of $\mathcal{X}$. Then, if $c \not \in \mathcal{X}^+$, there is a vector $S$ and $\varepsilon > 0$ such that, for all $p$ in $\mathcal{X}$, $$S \cdot p < S \cdot c - \varepsilon$$<br /><i>Proof of Lemma</i> <i>2</i>. Suppose $c \not \in \mathcal{X}^+$. Then let $c^*$ be the closest point in $\mathcal{X}^+$ to $c$. Then let $S = c - c^*$. Then, for any $p$ in $\mathcal{X}$, the angle $\theta$ between $S$ and $p - c$ is obtuse and thus $\mathrm{cos}\, \theta < 0$. So, since $S \cdot (p - c) = ||S||\, ||p - c|| \mathrm{cos}\, \theta$ and $||S||, ||p - c|| > 0$, we have $S \cdot (p - c) < 0$. And hence $S \cdot p < S \cdot c$. 
What's more, since $\mathcal{X}^+$ is closed, $c$ is not a limit point of $\mathcal{X}^+$, and thus there is $\delta > 0$ such that $||p - c|| > \delta$ for all $p$ in $\mathcal{X}$. Thus, there is $\varepsilon > 0$ such that $S \cdot p < S \cdot c - \varepsilon$, for all $p$ in $\mathcal{X}$.<br /><br />We now derive (2I) and (2II) from Lemmas 1 and 2:<br /><br />Let $\mathcal{X}$ be the set of possible objective chance functions. If $c$ violates the Principal Principle, then $c$ is not in $\mathcal{X}^+$. Thus, by Lemma 2, there is a book of bets $\sum^n_{i=1} B_{X_i, S_i}$ and $\varepsilon > 0$ such that, for any objective chance function $p$ in $\mathcal{X}$, $S \cdot p < S \cdot c - \varepsilon$. By Lemma 1, $S \cdot p$ is the expected payout of the book of bets by the lights of $p$, while $S \cdot c$ is the expected payout of the book of bets by the lights of $c$. Now, suppose we were to offer an agent with credence function $c$ the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ for the price of $S \cdot c - \frac{\varepsilon}{2}$. Then this would have positive expected net gain by the lights of $c$, namely $\frac{\varepsilon}{2}$, but negative expected net gain by the lights of each $p$ in $\mathcal{X}$, since $S \cdot p - (S \cdot c - \frac{\varepsilon}{2}) < -\frac{\varepsilon}{2}$. This gives (2I).<br /><br />(2II) then holds because, when $c$ is in the closed convex hull of $\mathcal{X}$, its expectation of a random variable is in the closed convex hull of the expectations of that random variable by the lights of the probability functions in $\mathcal{X}$. 
Thus, if the expectation of a random variable is negative by the lights of all the probability functions in $\mathcal{X}$, then its expectation by the lights of $c$ is not positive.<br /><br /><br />Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com2tag:blogger.com,1999:blog-4987609114415205593.post-63562356791457858572018-01-01T20:39:00.000+00:002018-07-26T08:19:47.707+01:00A Dutch Book argument for linear poolingOften, we wish to aggregate the probabilistic opinions of different agents. They might be experts on the effects of housing policy on people sleeping rough, for instance, and we might wish to produce from their different probabilistic opinions an aggregate opinion that we can use to guide policymaking. Methods for undertaking such aggregation are called <i>pooling operators</i>. They take as their input a sequence of probability functions $c_1, \ldots, c_n$, all defined on the same set of propositions, $\mathcal{F}$. And they give as their output a single probability function $c$, also defined on $\mathcal{F}$, which is the aggregate of $c_1, \ldots, c_n$. (If the experts have non-probabilistic credences and if they have credences defined on different sets of propositions or events, problems arise -- I've written about these <a href="http://m-phi.blogspot.co.uk/2017/03/a-dilemma-for-judgment-aggregation.html" target="_blank">here</a> and <a href="http://m-phi.blogspot.co.uk/2017/09/aggregating-abstaining-experts.html" target="_blank">here</a>.) Perhaps the simplest are the <i>linear pooling operators</i>. Given a set of non-negative weights, $\alpha_1, \ldots, \alpha_n \leq 1$ that sum to 1, one for each probability function to be aggregated, the linear pool of $c_1, \ldots, c_n$ with these weights is: $c = \alpha_1 c_1 + \ldots + \alpha_n c_n$. 
So the probability that the aggregate assigns to a proposition (or event) is the weighted average of the probabilities that the individuals assign to that proposition (event) with the weights $\alpha_1, \ldots, \alpha_n$.<br /><br />Linear pooling has had a hard time recently. <a href="http://onlinelibrary.wiley.com/doi/10.1111/nous.12143/abstract" target="_blank">Elkin and Wheeler</a> reminded us that linear pooling almost never preserves unanimous judgments of independence; <a href="https://link.springer.com/article/10.1007/s11098-014-0350-8" target="_blank">Russell, et al.</a> reminded us that it almost never commutes with Bayesian conditionalization; and <a href="http://eprints.lse.ac.uk/80762/1/Bradley_Learning%20from%20others_2017.pdf" target="_blank">Bradley</a> showed that aggregating a group of experts using linear pooling almost never gives the same result as you would obtain from updating your own probabilities in the usual Bayesian way when you learn the probabilities of those experts. I've tried to defend linear pooling against the first two attacks <a href="https://drive.google.com/file/d/0B-Gzj6gcSXKrWHNLZzF6TERraWc/view" target="_blank">here</a>. In that paper, I also offer a positive argument in favour of that aggregation method: I argue that, if your aggregate is not a result of linear pooling, there will be an alternative aggregate that each expert expects to be more accurate than yours; if your aggregate is a result of linear pooling, this can't happen. Thus, my argument is a non-pragmatic, accuracy-based argument, in the same vein as Jim Joyce's non-pragmatic vindication of probabilism. 
In this post, I offer an alternative, pragmatic, Dutch book-style defence, in the same vein as the standard Ramsey-de Finetti argument for probabilism.<br /><br />My argument is based on the following fact: <b>if your aggregate probability function is not a result of linear pooling, there will be a series of bets that the aggregate will consider fair but which each expert will expect to lose money (or utility); if your aggregate is a result of linear pooling, this can't happen.</b> Since one of the things we might wish to use an aggregate to do is to help us make communal decisions, a putative aggregate cannot be considered acceptable if it will lead us to make a binary choice one way when every expert agrees that it should be made the other way. Thus, we should aggregate credences using a linear pooling operator.<br /><br />We now prove the mathematical fact behind the argument, namely, that if $c$ is not a linear pool of $c_1, \ldots, c_n$, then there is a bet that $c$ will consider fair, and yet each $c_i$ will expect it to lose money; the converse is straightforward.<br /><br />Suppose $\mathcal{F} = \{X_1, \ldots, X_m\}$. Then:<br /><ul><li>We can represent a probability function $c$ on $\mathcal{F}$ as a vector in $\mathbb{R}^m$, namely, $c = \langle c(X_1), \ldots, c(X_m)\rangle$.</li><li>We can also represent a book of bets on the propositions in $\mathcal{F}$ by a vector in $\mathbb{R}^m$, namely, $S = \langle S_1, \ldots, S_m\rangle$, where $S_i$ is the stake of the bet on $X_i$, so that the bet on $X_i$ pays out $S_i$ dollars (or utiles) if $X_i$ is true and $0$ dollars (or utiles) if $X_i$ is false.</li><li>An agent with probability function $c$ will be prepared to pay $c(X_i)S_i$ for a bet on $X_i$ with stake $S_i$, and thus will be prepared to pay $S \cdot c = c(X_1)S_1 + \ldots + c(X_m)S_m$ dollars (or utiles) for the book of bets with stakes $S = \langle S_1, \ldots, S_m\rangle$. 
(As is usual in Dutch book-style arguments, we assume that the agent is risk neutral.)</li><li>This is because $S \cdot c$ is the expected payout of the book of bets with stakes $S$ by the lights of probability function $c$.</li></ul>Now, suppose $c$ is not a linear pool of $c_1, \ldots, c_n$. So $c$ lies outside the convex hull of $\{c_1, \ldots, c_n\}$. Let $c^*$ be the closest point to $c$ inside that convex hull. And let $S = c - c^*$. Then, for each $i$, the angle $\theta$ between $S$ and $c_i - c$ is obtuse and thus $\mathrm{cos}\, \theta < 0$ (see diagram below). So, since $S \cdot (c_i - c) = ||S||\, ||c_i - c|| \mathrm{cos}\, \theta$ and $||S||, ||c_i - c|| > 0$, we have $S \cdot (c_i - c) < 0$. And hence $S \cdot c_i < S \cdot c$. But recall:<br /><ul><li>$S \cdot c$ is the amount that the aggregate $c$ is prepared to pay for the book of bets with stakes $S$; and </li><li>$S \cdot c_i$ is expert $i$'s expected payout of the book of bets with stakes $S$.</li></ul>Thus, each expert will expect that book of bets to pay out less than $c$ will be willing to pay for it.<br /><div class="separator" style="clear: both; text-align: center;"><a href="https://1.bp.blogspot.com/-Z4J-OXKKzu8/WkqZZPqWmBI/AAAAAAAAApQ/wwuZLqQwtzIUzt17WzSiE5sycbnfaOlFwCLcBGAs/s1600/IMG_3856.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1200" data-original-width="1600" height="300" src="https://1.bp.blogspot.com/-Z4J-OXKKzu8/WkqZZPqWmBI/AAAAAAAAApQ/wwuZLqQwtzIUzt17WzSiE5sycbnfaOlFwCLcBGAs/s400/IMG_3856.JPG" width="400" /></a></div><br /><br /><br />Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com4tag:blogger.com,1999:blog-4987609114415205593.post-78601383681488027912017-10-10T18:47:00.000+01:002017-10-10T18:47:29.584+01:00Two Paradoxes of Belief (by Roy T Cook)This was posted originally at the <a 
href="https://www.blogger.com/The%20Liar%20paradox%20arises%20via%20considering%20the%20Liar%20sentence:%20%20L:%20L%20is%20not%20true.%20%20and%20then%20reasoning%20in%20accordance%20with%20the:%20%20T-schema:%20%20%E2%80%9C%CE%A6%20is%20true%20if%20and%20only%20if%20what%20%CE%A6%20says%20is%20the%20case.%E2%80%9D%20%20Along%20similar%20lines,%20we%20obtain%20the%20Montague%20paradox%20(or%20the%20%E2%80%9Cparadox%20of%20the%20knower%E2%80%9C)%20by%20considering%20the%20following%20sentence:%20%20M:%20M%20is%20not%20knowable.%20%20and%20then%20reasoning%20in%20accordance%20with%20the%20following%20two%20claims:%20%20Factivity:%20%20%E2%80%9CIf%20%CE%A6%20is%20knowable%20then%20what%20%CE%A6%20says%20is%20the%20case.%E2%80%9D%20%20Necessitation:%20%20%E2%80%9CIf%20%CE%A6%20is%20a%20theorem%20(i.e.%20is%20provable),%20then%20%CE%A6%20is%20knowable.%E2%80%9D%20%20Put%20in%20very%20informal%20terms,%20these%20results%20show%20that%20our%20intuitive%20accounts%20of%20truth%20and%20of%20knowledge%20are%20inconsistent.%20Much%20work%20in%20logic%20has%20been%20carried%20out%20in%20attempting%20to%20formulate%20weaker%20accounts%20of%20truth%20and%20of%20knowledge%20that%20(i)%20are%20strong%20enough%20to%20allow%20these%20notions%20to%20do%20substantial%20work,%20and%20(ii)%20are%20not%20susceptible%20to%20these%20paradoxes%20(and%20related%20paradoxes,%20such%20as%20Curry%20and%20Yablo%20versions%20of%20both%20of%20the%20above).%20A%20bit%20less%20well%20known%20that%20certain%20strong%20but%20not%20altogether%20implausible%20accounts%20of%20idealized%20belief%20also%20lead%20to%20paradox.%20%20The%20puzzles%20involve%20an%20idealized%20notion%20of%20belief%20(perhaps%20better%20paraphrased%20at%20%E2%80%9Crational%20commitment%E2%80%9D%20or%20%E2%80%9Cjustifiable%20belief%E2%80%9D),%20where%20one%20believes%20something%20in%20this%20sense%20if%20and%20only%20if%20(i)%20one%20explicitly%20believes%20it,%20or%20(ii)%20one%20is%20somehow%20committed%20to%20the%20claim%20e
ven%20if%20one%20doesn%E2%80%99t%20actively%20believe%20it.%20Hence,%20on%20this%20understanding%20belief%20is%20closed%20under%20logical%20consequence%20%E2%80%93%20one%20believes%20all%20of%20the%20logical%20consequences%20of%20one%E2%80%99s%20beliefs.%20In%20particular,%20the%20following%20holds:%20%20B-Closure:%20%20%E2%80%9CIf%20you%20believe%20that,%20if%20%CE%A6%20then%20%CE%A8,%20and%20you%20believe%20%CE%A6,%20then%20you%20believe%20%CE%A8.%E2%80%9D%20%20Now,%20for%20such%20an%20idealized%20account%20of%20belief,%20the%20rule%20of%20B-Necessitation:%20%20B-Necessitation:%20%20%E2%80%9CIf%20%CE%A6%20is%20a%20theorem%20(i.e.%20is%20provable),%20then%20%CE%A6%20is%20believed.%E2%80%9D%20%20is%20extremely%20plausible%20%E2%80%93%20after%20all,%20presumably%20anything%20that%20can%20be%20proved%20is%20something%20that%20follows%20from%20things%20we%20believe%20(since%20it%20follows%20from%20nothing%20more%20than%20our%20axioms%20for%20belief).%20In%20addition,%20we%20will%20assume%20that%20our%20beliefs%20are%20consistent:%20%20B-Consistency:%20%20%E2%80%9CIf%20I%20believe%20%CE%A6,%20then%20I%20do%20not%20believe%20that%20%CE%A6%20is%20not%20the%20case.%E2%80%9D%20%20So%20far,%20so%20good.%20But%20neither%20the%20belief%20analogue%20of%20the%20T-schema:%20%20B-schema:%20%20%E2%80%9C%CE%A6%20is%20believed%20if%20and%20only%20if%20what%20%CE%A6%20says%20is%20the%20case.%E2%80%9D%20%20nor%20the%20belief%20analogue%20of%20Factivity:%20%20B-Factivity:%20%20%E2%80%9CIf%20you%20believe%20%CE%A6%20then%20what%20%CE%A6%20says%20is%20the%20case.%E2%80%9D%20%20is%20at%20all%20plausible.%20After%20all,%20just%20because%20we%20believe%20something%20(or%20even%20that%20the%20claim%20in%20question%20follows%20from%20what%20we%20believe,%20in%20some%20sense)%20doesn%E2%80%99t%20mean%20the%20belief%20has%20to%20be%20true!%20%20There%20are%20other,%20weaker,%20principles%20about%20belief,%20however,%20that%20are%20not%20intuitively%20implausible,%20but%20when%20combined%20with%2
0B-Closure,%20B-Necessitation,%20and%20B-Consistency%20lead%20to%20paradox.%20We%20will%20look%20at%20two%20principles%20%E2%80%93%20each%20of%20which%20captures%20a%20sense%20in%20which%20we%20cannot%20be%20wrong%20about%20what%20we%20think%20we%20don%E2%80%99t%20believe.%20%20The%20first%20such%20principle%20we%20will%20call%20the%20First%20Transparency%20Principle%20for%20Disbelief:%20%20TPDB1:%20%20%E2%80%9CIf%20you%20believe%20that%20you%20don%E2%80%99t%20believe%20%CE%A6%20then%20you%20don%E2%80%99t%20believe%20%CE%A6.%E2%80%9D%20%20In%20other%20words,%20although%20many%20of%20our%20beliefs%20can%20be%20wrong,%20according%20to%20TPDB1%20our%20beliefs%20about%20what%20we%20do%20not%20believe%20cannot%20be%20wrong.%20The%20second%20principle,%20which%20is%20a%20mirror%20image%20of%20the%20first,%20we%20will%20call%20the%20Second%20Transparency%20Principle%20for%20Disbelief:%20%20TPDB2:%20%20%E2%80%9CIf%20you%20don%E2%80%99t%20believe%20%CE%A6%20then%20you%20believe%20that%20you%20don%E2%80%99t%20believe%20%CE%A6.%E2%80%9D%20%20In%20other%20words,%20according%20to%20TPDB2%20we%20are%20aware%20of%20(i.e.%20have%20true%20beliefs%20about)%20all%20of%20the%20facts%20regarding%20what%20we%20don%E2%80%99t%20believe.%20%20Either%20of%20these%20principles,%20combined%20with%20B-Closure,%20B-Necessitation,%20and%20B-Consistency,%20lead%20to%20paradox.%20I%20will%20present%20the%20argument%20for%20TPBD1.%20The%20argument%20for%20TPDB2%20is%20similar,%20and%20left%20to%20the%20reader%20(although%20I%20will%20give%20an%20important%20hint%20below).%20%20Consider%20the%20sentence:%20%20S:%20It%20is%20not%20the%20case%20that%20I%20believe%20S.%20%20Now,%20by%20inspection%20we%20can%20understand%20this%20sentence,%20and%20thus%20conclude%20that:%20%20(1)%20What%20S%20says%20is%20the%20case%20if%20and%20only%20if%20I%20do%20not%20believe%20S.%20%20Further,%20(1)%20is%20something%20we%20can,%20via%20inspecting%20the%20original%20sentence,%20informally%20prove.%20(Or,%20if%20we%20w
ere%20being%20more%20formal,%20and%20doing%20all%20of%20this%20in%20arithmetic%20enriched%20with%20a%20predicate%20%E2%80%9CB(x)%E2%80%9D%20for%20idealized%20belief,%20a%20formal%20version%20of%20the%20above%20would%20be%20a%20theorem%20due%20to%20G%C3%B6del%E2%80%99s%20diagonalization%20lemma.)%20So%20we%20can%20apply%20B-Necessitation%20to%20(1),%20obtaining:%20%20(2)%20I%20believe%20that:%20what%20S%20says%20is%20the%20case%20if%20and%20only%20if%20I%20do%20not%20believe%20S.%20%20Applying%20a%20version%20of%20B-Closure,%20this%20entails:%20%20(3)%20I%20believe%20S%20if%20and%20only%20if%20I%20believe%20that%20I%20do%20not%20believe%20S.%20%20Now,%20assume%20(for%20reductio%20ad%20absurdum)%20that:%20%20(4)%20I%20believe%20S.%20%20Then%20combining%20(3)%20and%20(4)%20and%20some%20basic%20logic,%20we%20obtain:%20%20(5)%20I%20believe%20that%20I%20do%20not%20believe%20S.%20%20Applying%20TPDB1%20to%20(5),%20we%20get:%20%20(6)%20I%20do%20not%20believe%20S.%20%20But%20this%20contradicts%20(4).%20So%20lines%20(4)%20through%20(6)%20amount%20to%20a%20refutation%20of%20line%20(4),%20and%20hence%20a%20proof%20that:%20%20(7)%20I%20do%20not%20believe%20S.%20%20Now,%20(7)%20is%20clearly%20a%20theorem%20(we%20just%20proved%20it),%20so%20we%20can%20apply%20B-Necessitation,%20arriving%20at:%20%20(8)%20I%20believe%20that%20I%20do%20not%20believe%20S.%20%20Combining%20(8)%20and%20(3)%20leads%20us%20to:%20%20(9)%20I%20believe%20S.%20%20But%20this%20obviously%20contradicts%20(7),%20and%20we%20have%20our%20final%20contradiction.%20%20Note%20that%20this%20argument%20does%20not%20actually%20use%20B-Consistency%20(hint%20for%20the%20second%20argument%20involving%20TPDB2:%20you%20will%20need%20B-Consistency!)%20%20These%20paradoxes%20seem%20to%20show%20that,%20as%20a%20matter%20of%20logic,%20we%20cannot%20have%20perfectly%20reliable%20beliefs%20about%20what%20we%20don%E2%80%99t%20believe%20%E2%80%93%20in%20other%20words,%20in%20this%20idealized%20sense%20of%20belief,%20there%20are%20alway
s%20things%20that%20we%20believe%20that%20we%20don%E2%80%99t%20believe,%20but%20in%20actuality%20we%20do%20believe%20(the%20failure%20of%20TPDB1),%20and%20things%20that%20we%20don%E2%80%99t%20believe,%20but%20don%E2%80%99t%20believe%20that%20we%20don%E2%80%99t%20believe%20(the%20failure%20of%20TPDB2).%20At%20least,%20the%20puzzles%20show%20this%20if%20we%20take%20them%20to%20force%20us%20to%20reject%20both%20TPDB1%20and%20TPDB2%20in%20the%20same%20way%20that%20many%20feel%20that%20the%20Liar%20paradox%20forces%20us%20to%20abandon%20the%20full%20T-Schema.%20%20Once%20we%E2%80%99ve%20considered%20transparency%20principles%20for%20disbelief,%20it%E2%80%99s%20natural%20to%20consider%20corresponding%20principles%20for%20belief.%20There%20are%20two.%20The%20first%20is%20the%20First%20Transparency%20Principle%20for%20Belief:%20%20TPB1:%20%20%E2%80%9CIf%20you%20believe%20that%20you%20believe%20%CE%A6%20then%20you%20believe%20%CE%A6.%E2%80%9D%20%20In%20other%20words,%20according%20to%20TPD1%20our%20beliefs%20about%20what%20we%20believe%20cannot%20be%20wrong.%20The%20second%20principle,%20again%20is%20a%20mirror%20image%20of%20the%20first,%20is%20the%20Second%20Transparency%20Principle%20for%20Belief:%20%20TPB2:%20%20%E2%80%9CIf%20you%20believe%20%CE%A6%20then%20you%20believe%20that%20you%20believe%20%CE%A6.%E2%80%9D%20%20In%20other%20words,%20according%20to%20TPB2%20we%20are%20aware%20of%20all%20of%20the%20facts%20regarding%20what%20we%20believe.%20%20Are%20either%20of%20these%20two%20principles,%20combined%20with%20B-Closure,%20B-Necessitation,%20and%20B-Consistency,%20paradoxical?%20If%20not,%20are%20there%20additional,%20plausible%20principles%20that%20would%20lead%20to%20paradoxes%20if%20added%20to%20these%20claims?%20I%E2%80%99ll%20leave%20it%20to%20the%20reader%20to%20explore%20these%20questions%20further.%20%20A%20historical%20note:%20Like%20so%20many%20other%20cool%20puzzles%20and%20paradoxes,%20versions%20of%20some%20of%20these%20puzzles%20first%20appeared%20in%20th
e%20work%20of%20medieval%20logician%20Jean%20Buridan.">OUPBlog</a>. This is the first in a series of cross-posted blogs by <a href="https://cla.umn.edu/about/directory/profile/cookx432" target="_blank">Roy T Cook</a> (Minnesota) from the OUPBlog series on <a href="https://blog.oup.com/category/series-columns/paradoxes-puzzles-roy-cook/" target="_blank">Paradox and Puzzles</a>.<br /><br />The Liar paradox arises via considering the Liar sentence:<br /><br />L: L is not true.<br /><br />and then reasoning in accordance with the:<br /><br />T-schema:<br /><br />“Φ is true if and only if what Φ says is the case.”<br /><br />Along similar lines, we obtain the Montague paradox (or the “paradox of the knower”) by considering the following sentence:<br /><br />M: M is not knowable.<br /><br />and then reasoning in accordance with the following two claims:<br /><br />Factivity:<br /><br />“If Φ is knowable then what Φ says is the case.”<br /><br />Necessitation:<br /><br />“If Φ is a theorem (i.e. is provable), then Φ is knowable.”<br /><br />Put in very informal terms, these results show that our intuitive accounts of truth and of knowledge are inconsistent. Much work in logic has been carried out in attempting to formulate weaker accounts of truth and of knowledge that (i) are strong enough to allow these notions to do substantial work, and (ii) are not susceptible to these paradoxes (and related paradoxes, such as Curry and Yablo versions of both of the above). It is a bit less well known that certain strong but not altogether implausible accounts of idealized belief also lead to paradox.<br /><br />The puzzles involve an idealized notion of belief (perhaps better paraphrased as “rational commitment” or “justifiable belief”), where one believes something in this sense if and only if (i) one explicitly believes it, or (ii) one is somehow committed to the claim even if one doesn’t actively believe it. 
Hence, on this understanding belief is closed under logical consequence – one believes all of the logical consequences of one’s beliefs. In particular, the following holds:<br /><br />B-Closure:<br /><br />“If you believe that, if Φ then Ψ, and you believe Φ, then you believe Ψ.”<br /><br />Now, for such an idealized account of belief, the rule of B-Necessitation:<br /><br />B-Necessitation:<br /><br />“If Φ is a theorem (i.e. is provable), then Φ is believed.”<br /><br />is extremely plausible – after all, presumably anything that can be proved is something that follows from things we believe (since it follows from nothing more than our axioms for belief). In addition, we will assume that our beliefs are consistent:<br /><br />B-Consistency:<br /><br />“If I believe Φ, then I do not believe that Φ is not the case.”<br /><br />So far, so good. But neither the belief analogue of the T-schema:<br /><br />B-schema:<br /><br />“Φ is believed if and only if what Φ says is the case.”<br /><br />nor the belief analogue of Factivity:<br /><br />B-Factivity:<br /><br />“If you believe Φ then what Φ says is the case.”<br /><br />is at all plausible. After all, just because we believe something (or even that the claim in question follows from what we believe, in some sense) doesn’t mean the belief has to be true!<br /><br />There are other, weaker, principles about belief, however, that are not intuitively implausible, but when combined with B-Closure, B-Necessitation, and B-Consistency lead to paradox. We will look at two principles – each of which captures a sense in which we cannot be wrong about what we think we don’t believe.<br /><br />The first such principle we will call the First Transparency Principle for Disbelief:<br /><br />TPDB1:<br /><br />“If you believe that you don’t believe Φ then you don’t believe Φ.”<br /><br />In other words, although many of our beliefs can be wrong, according to TPDB1 our beliefs about what we do not believe cannot be wrong. 
The second principle, which is a mirror image of the first, we will call the Second Transparency Principle for Disbelief:<br /><br />TPDB2:<br /><br />“If you don’t believe Φ then you believe that you don’t believe Φ.”<br /><br />In other words, according to TPDB2 we are aware of (i.e. have true beliefs about) all of the facts regarding what we don’t believe.<br /><br />Either of these principles, combined with B-Closure, B-Necessitation, and B-Consistency, leads to paradox. I will present the argument for TPDB1. The argument for TPDB2 is similar, and left to the reader (although I will give an important hint below).<br /><br />Consider the sentence:<br /><br />S: It is not the case that I believe S.<br /><br />Now, by inspection we can understand this sentence, and thus conclude that:<br /><br />(1) What S says is the case if and only if I do not believe S.<br /><br />Further, (1) is something we can, via inspecting the original sentence, informally prove. (Or, if we were being more formal, and doing all of this in arithmetic enriched with a predicate “B(x)” for idealized belief, a formal version of the above would be a theorem due to Gödel’s diagonalization lemma.) So we can apply B-Necessitation to (1), obtaining:<br /><br />(2) I believe that: what S says is the case if and only if I do not believe S.<br /><br />Applying a version of B-Closure, this entails:<br /><br />(3) I believe S if and only if I believe that I do not believe S.<br /><br />Now, assume (for reductio ad absurdum) that:<br /><br />(4) I believe S.<br /><br />Then combining (3) and (4) and some basic logic, we obtain:<br /><br />(5) I believe that I do not believe S.<br /><br />Applying TPDB1 to (5), we get:<br /><br />(6) I do not believe S.<br /><br />But this contradicts (4). 
So lines (4) through (6) amount to a refutation of line (4), and hence a proof that:<br /><br />(7) I do not believe S.<br /><br />Now, (7) is clearly a theorem (we just proved it), so we can apply B-Necessitation, arriving at:<br /><br />(8) I believe that I do not believe S.<br /><br />Combining (8) and (3) leads us to:<br /><br />(9) I believe S.<br /><br />But this obviously contradicts (7), and we have our final contradiction.<br /><br />Note that this argument does not actually use B-Consistency (hint for the second argument involving TPDB2: you will need B-Consistency!).<br /><br />These paradoxes seem to show that, as a matter of logic, we cannot have perfectly reliable beliefs about what we don’t believe – in other words, in this idealized sense of belief, there are always things that we believe that we don’t believe, but in actuality we do believe (the failure of TPDB1), and things that we don’t believe, but don’t believe that we don’t believe (the failure of TPDB2). At least, the puzzles show this if we take them to force us to reject both TPDB1 and TPDB2 in the same way that many feel that the Liar paradox forces us to abandon the full T-Schema.<br /><br />Once we’ve considered transparency principles for disbelief, it’s natural to consider corresponding principles for belief. There are two. The first is the First Transparency Principle for Belief:<br /><br />TPB1:<br /><br />“If you believe that you believe Φ then you believe Φ.”<br /><br />In other words, according to TPB1 our beliefs about what we believe cannot be wrong. The second principle, which again is a mirror image of the first, is the Second Transparency Principle for Belief:<br /><br />TPB2:<br /><br />“If you believe Φ then you believe that you believe Φ.”<br /><br />In other words, according to TPB2 we are aware of all of the facts regarding what we believe.<br /><br />Is either of these two principles, combined with B-Closure, B-Necessitation, and B-Consistency, paradoxical? 
If not, are there additional, plausible principles that would lead to paradoxes if added to these claims? I’ll leave it to the reader to explore these questions further.<br /><br />A historical note: Like so many other cool puzzles and paradoxes, versions of some of these puzzles first appeared in the work of medieval logician Jean Buridan.Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com2tag:blogger.com,1999:blog-4987609114415205593.post-8893471410512231962017-09-10T09:49:00.001+01:002017-09-10T09:49:19.900+01:00Aggregating abstaining expertsIn a series of posts a few months ago (<a href="http://m-phi.blogspot.co.uk/2017/03/a-dilemma-for-judgment-aggregation.html" target="_blank">here</a>, <a href="http://m-phi.blogspot.co.uk/2017/03/a-little-more-on-aggregating-incoherent.html" target="_blank">here</a>, and <a href="http://m-phi.blogspot.co.uk/2017/03/aggregating-incoherent-credences-case.html" target="_blank">here</a>), I explored a particular method by which we might aggregate expert credences when those credences are incoherent. The result was this <a href="https://drive.google.com/file/d/0B-Gzj6gcSXKrSTRZRGNxOUdIR3M/view?usp=sharing" target="_blank">paper</a>, which is now forthcoming in <i>Synthese</i>. The method in question was called <i>the coherent approximation principle</i> (CAP), and it was introduced by Daniel Osherson and Moshe Vardi in <a href="https://www.cs.rice.edu/~vardi/papers/geb06.pdf" target="_blank">this</a> 2006 paper. CAP is based on what we might call <i>the principle of minimal mutilation</i>. We begin with a collection of credence functions, $c_1$, ..., $c_n$, one for each expert, and some of which might be incoherent. What we want at the end is a single coherent credence function $c$ that is the aggregate of $c_1$, ..., $c_n$. 
The principle of minimal mutilation says that $c$ should be as close as possible to the $c_i$s -- when aggregating a collection of credence functions, you should change them as little as possible to obtain your aggregate.<br /><br />We can spell this out more precisely by introducing a <i>divergence</i> $\mathfrak{D}$. We might think of this as a measure of how far one credence function lies from another. Thus, $\mathfrak{D}(c, c')$ measures the distance from $c$ to $c'$. We call these measures <i>divergences</i> rather than <i>distances</i> or <i>metrics</i>, since they do not have the usual features that mathematicians assume of a metric: we assume $\mathfrak{D}(c, c') \geq 0$, for any $c, c'$, and $\mathfrak{D}(c, c') = 0$ iff $c = c'$, but we do not assume that $\mathfrak{D}$ is symmetric nor that it satisfies the triangle inequality. In particular, we assume that $\mathfrak{D}$ is an <i>additive Bregman divergence</i>. The standard example of an additive Bregman divergence is <i>squared Euclidean distance</i>: if $c$, $c'$ are both defined on the set of propositions $F$, then<br />$$<br />\mathrm{SED}(c, c') = \sum_{X \in F} |c(X) - c'(X)|^2<br />$$In fact, $\mathrm{SED}$ is symmetric, but it does not satisfy the triangle inequality. The details of this family of divergences needn't detain us here (but see here and here for more). Indeed, we will simply use $\mathrm{SED}$ throughout. But a more general treatment would look at other additive Bregman divergences, and I hope to do this soon.<br /><br />Now, suppose $c_1$, ..., $c_n$ is a set of expert credence functions. And suppose $c_i$ is defined on the set of propositions $F_i$. And suppose that $\mathfrak{D}$ is an additive Bregman divergence -- you might take it to be $\mathrm{SED}$. Then how do we define the aggregate $c$ that is obtained from $c_1$, ..., $c_n$ by a minimal mutilation? We let $c$ be the coherent credence function such that the sum of the distances from $c$ to the $c_i$s is minimal. 
That is,<br />$$<br />\mathrm{CAP}_{\mathfrak{D}}(c_1, \ldots, c_n) = \mathrm{arg\ min}_{c \in P_{F_i}} \sum^n_{i=1} \mathfrak{D}(c, c_i)<br />$$<br />where $P_{F_i}$ is the set of coherent credence functions over $F_i$.<br /><br />As we see in my paper linked above, if each of the credence functions is defined over the same set of propositions -- that is, if $F_i = F_j$, for all $1 \leq i, j \leq n$ -- then:<br /><ul><li>if $\mathfrak{D}$ is squared Euclidean distance, then this aggregate is the <i>straight linear pool</i> of the original credences; if $c$ is defined on the partition $X_1$, ..., $X_m$, then the straight linear pool of $c_1$, ..., $c_n$ is this:$$c(X_j) = \frac{1}{n}c_1(X_j) + ... + \frac{1}{n}c_n(X_j)$$</li><li>if $\mathfrak{D}$ is the generalized Kullback-Leibler divergence, then the aggregate is the <i>straight geometric pool</i> of the originals; if $c$ is defined on the partition $X_1$, ..., $X_m$, then the straight geometric pool of $c_1$, ..., $c_n$ is this: $$c(X_j) = \frac{1}{K}(c_1(X_j)^{\frac{1}{n}} \times ... \times c_n(X_j)^{\frac{1}{n}})$$where $K$ is a normalizing factor.</li></ul>(For more on these types of aggregation, see <a href="http://personal.lse.ac.uk/list/PDF-files/OpinionPoolingReview.pdf" target="_blank">here</a> and <a href="https://link.springer.com/article/10.1007/s11098-014-0350-8" target="_blank">here</a>.)<br /><br />In this post, I'm interested in cases where our agents have credences in different sets of propositions. For instance, the first agent has credences concerning the rainfall in Bristol tomorrow and the rainfall in Bath, but the second has credences concerning the rainfall in Bristol and the rainfall in Birmingham.<br /><br />I want to begin by pointing to a shortcoming of CAP when it is applied to such cases. It fails to satisfy what we might think of as a basic desideratum of such procedures. 
To illustrate this desideratum, let's suppose that the three propositions $X_1$, $X_2$, and $X_3$ form a partition. And suppose that Amira has credences in $X_1$, $X_2$, and $X_3$, while Benito has credences only in $X_1$ and $X_2$. In particular:<br /><ul><li>Amira's credence function is: $c_A(X_1) = 0.3$, $c_A(X_2) = 0.6$, $c_A(X_3) = 0.1$.</li><li>Benito's credence function is: $c_B(X_1) = 0.2$, $c_B(X_2) = 0.6$.</li></ul>Now, notice that, while Amira's credence function is defined on the whole partition, Benito's is not. But, nonetheless, Benito's credences uniquely determine a coherent credence function on the whole partition:<br /><ul><li>Benito's extended credence function is: $c^*_B(X_1) = 0.2$, $c^*_B(X_2) = 0.6$, $c^*_B(X_3) = 0.2$.</li></ul>Thus, we might expect our aggregation procedure to give the same result whether we aggregate Amira's credence function with Benito's or with Benito's extended credence function. That is, we might expect the same result whether we aggregate $c_A$ with $c_B$ or with $c^*_B$. After all, $c^*_B$ is in some sense implicit in $c_B$. An agent with credence function $c_B$ is committed to the credences assigned by credence function $c^*_B$.<br /><br />However, CAP does not do this. As mentioned above, if you aggregate $c_A$ and $c^*_B$ using $\mathrm{SED}$, then the result is their linear pool: $\frac{1}{2}c_A + \frac{1}{2}c^*_B$. Thus, the aggregate credence in $X_1$ is $0.25$; in $X_2$ it is $0.6$; and in $X_3$ it is $0.15$. The result is different if you aggregate $c_A$ and $c_B$ using $\mathrm{SED}$: the aggregate credence in $X_1$ is $0.2625$; in $X_2$ it is $0.6125$; in $X_3$ it is $0.125$.<br /><br />Now, it is natural to think that the problem arises here because Amira's credences are getting too much say in how far a potential aggregate lies from the agents, since she has credences in three propositions, while Benito only has credences in two. 
And, sure enough, $\mathrm{CAP}_{\mathrm{SED}}(c_A, c_B)$ lies closer to $c_A$ than to $c_B$ and closer to $c_A$ than the aggregate of $c_A$ and $c^*_B$ lies. And it is equally natural to try to solve this potential bias in favour of the agent with more credences by normalising. That is, we might define a new version of CAP:<br />$$<br />\mathrm{CAP}^+_{\mathfrak{D}}(c_1, \ldots, c_n) = \mathrm{arg\ min}_{c' \in P_{F_i}} \sum^n_{i=1} \frac{1}{|F_i|}\mathfrak{D}(c', c_i)<br />$$<br />However, this doesn't help. Using this definition, the aggregate of Amira's credence function $c_A$ and Benito's extended credence function $c^*_B$ remains the same; but the aggregate of Amira's credence function and Benito's original credence function changes -- the aggregate credence in $X_1$ is $0.25333$; in $X_2$, it is $0.61333$; in $X_3$, it is $0.13333$. Again, the two ways of aggregating disagree.<br /><br />So here is our desideratum in general:<br /><br /><b>Agreement with Coherent Commitments (ACC)</b> Suppose $c_1$, ..., $c_n$ are coherent credence functions, with $c_i$ defined on $F_i$, for each $1 \leq i \leq n$. And let $F = \bigcup^n_{i=1} F_i$. Now suppose that, for each $c_i$ defined on $F_i$, there is a unique coherent credence function $c^*_i$ defined on $F$ that extends $c_i$ -- that is, $c_i(X) = c^*_i(X)$ for all $X$ in $F_i$. Then the aggregate of $c_1$, ..., $c_n$ should be the same as the aggregate of $c^*_1$, ..., $c^*_n$.<br /><br />CAP does not satisfy ACC. Is there a natural aggregation rule that does? Here's a suggestion. Suppose you wish to aggregate a set of credence functions $c_1$, ..., $c_n$, where $c_i$ is defined on $F_i$, as above. 
Then we proceed as follows.<br /><ol><li>First, let $F = \bigcup^n_{i=1} F_i$.</li><li>Second, for each $1 \leq i \leq n$, let $$c^*_i = \{c : \mbox{$c$ is coherent & $c$ is defined on $F$ & $c(X) = c_i(X)$ for all $X$ in $F_i$}\}$$ That is, while $c_i$ represents a precise credal state defined on $F_i$, $c^*_i$ represents an imprecise credal state defined on $F$. It is the set of coherent credence functions on $F$ that extend $c_i$. That is, it is the set of coherent credence functions on $F$ that agree with $c_i$ on propositions in $F_i$. Thus, if, like Benito, your coherent credences on $F_i$ uniquely determine your coherent credences on $F$, then $c^*_i$ is just the singleton that contains that unique extension. But if your credences over $F_i$ do not uniquely determine your coherent credences over $F$, then $c^*_i$ will contain more coherent credence functions.</li><li>Finally, we take the aggregate of $c_1$, ..., $c_n$ to be the credence function $c$ that minimizes the total distance from $c$ to the $c^*_i$s. The problem is that there isn't a single natural definition of the distance from a point to a set of points, even when you have a definition of the distance between individual points. I adopt a very particular measure of such distances here; but it would be interesting to explore the alternative options in greater detail elsewhere. Suppose $c$ is a credence function and $C$ is a set of credence functions. 
Then $$\mathfrak{D}(c, C) = \frac{\mathrm{min}_{c' \in C}\mathfrak{D}(c, c') + \mathrm{max}_{c' \in C}\mathfrak{D}(c, c')}{2}$$ With this in hand, we can finally give our aggregation procedure:$$\mathrm{CAP}^*_{\mathfrak{D}}(c_1, \ldots, c_n) = \mathrm{arg\ min}_{c' \in P_F} \sum^n_{i=1} \mathfrak{D}(c', c^*_i)$$ </li></ol>The first thing to note about CAP$^*$ is that, unlike the original CAP, or CAP$^+$, it automatically satisfies ACC.<br /><br />Let's now see CAP$^*$ in action.<br /><ul><li>Since CAP$^*$ satisfies ACC, the aggregate for $c_A$ and $c_B$ is the same as the aggregate for $c_A$ and $c^*_B$, which is just their straight linear pool.</li><li>Next, suppose we wish to aggregate Amira with a third agent, Cleo, who has a credence only in $X_1$, which she assigns $0.5$ -- that is, $c_C(X_1) = 0.5$. Then $F = \{X_1, X_2, X_3\}$, and $$c^*_C = \{c : c(X_1) = 0.5, 0 \leq c(X_2) \leq 0.5, c(X_3) = 1 - c(X_1) - c(X_2)\}$$ So, $$\mathrm{CAP}^*_{\mathfrak{D}}(c_A, c_C) = \mathrm{arg\ min}_{c' \in P_F} \mathfrak{D}(c', c_A) + \mathfrak{D}(c', c^*_C)$$Working through the calculation for $\mathfrak{D} = \mathrm{SED}$, we obtain the following aggregate: $c(X_1) = 0.4$, $c(X_2) = 0.425$, $c(X_3) = 0.175$.</li><li>One interesting feature of CAP$^*$ is that, unlike CAP, we can apply it to individual agents. Thus, for instance, suppose we wish to take Cleo's single credence in $X_1$ and 'fill in' her credences in $X_2$ and $X_3$. Then we can use CAP$^*$ to do this. Her new credence function will be $$c'_C = \mathrm{CAP}^*_{\mathrm{SED}}(c_C) = \mathrm{arg\ min}_{c' \in P_F} \mathrm{SED}(c', c^*_C)$$ That is, $c'_C(X_1) = 0.5$, $c'_C(X_2) = 0.25$, $c'_C(X_3) = 0.25$. Rather unsurprisingly, $c'_C$ is the midpoint of the line formed by the imprecise probabilities $c^*_C$. Now, notice: the aggregate of Amira and Cleo given above is just the straight linear pool of Amira's credence function $c_A$ and Cleo's 'filled in' credence function $c'_C$. 
I would conjecture that this is generally true: filling in credences using CAP$^*_{\mathrm{SED}}$ and then aggregating using straight linear pooling always agrees with aggregating using CAP$^*_{\mathrm{SED}}$. And perhaps this generalises beyond SED.</li></ul>Richard Pettigrewhttp://www.blogger.com/profile/07828399117450825734noreply@blogger.com1
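The Amira and Benito figures quoted for CAP and CAP$^+$ with $\mathrm{SED}$ can be checked numerically. What follows is a minimal sketch, not code from the paper: the function name `cap_sed`, the dictionary representation of credence functions, and the `normalise` flag are all illustrative assumptions. It also assumes that the credences are defined on cells of a single partition and that the minimiser lies in the interior of the simplex, so that the only constraint needing a Lagrange multiplier is that the aggregate sums to 1. It does not implement CAP$^*$.

```python
def cap_sed(credences, normalise=False):
    """Minimise the (optionally 1/|F_i|-weighted) sum of squared Euclidean
    distances to the given credence functions, subject to summing to 1.

    credences: list of dicts mapping a partition cell's name to a credence.
    normalise: if True, weight agent i by 1/|F_i| (the CAP+ variant).
    Assumption: the minimiser has no negative entries, so the
    non-negativity constraints can be ignored.
    """
    cells = sorted(set().union(*credences))
    weight = {x: 0.0 for x in cells}  # total weight of agents with a credence in x
    total = {x: 0.0 for x in cells}   # weighted sum of the credences in x
    for c in credences:
        w = 1.0 / len(c) if normalise else 1.0
        for x, value in c.items():
            weight[x] += w
            total[x] += w * value
    # Stationarity gives c(x) = (total[x] + mu) / weight[x];
    # mu is fixed by requiring the aggregate to sum to 1.
    mu = (1.0 - sum(total[x] / weight[x] for x in cells)) / sum(
        1.0 / weight[x] for x in cells
    )
    return {x: (total[x] + mu) / weight[x] for x in cells}


c_A = {"X1": 0.3, "X2": 0.6, "X3": 0.1}       # Amira
c_B = {"X1": 0.2, "X2": 0.6}                  # Benito
c_B_star = {"X1": 0.2, "X2": 0.6, "X3": 0.2}  # Benito's unique extension

cap_partial = cap_sed([c_A, c_B])                       # CAP on c_A, c_B
cap_extended = cap_sed([c_A, c_B_star])                 # CAP on c_A, c*_B
cap_plus_partial = cap_sed([c_A, c_B], normalise=True)  # CAP+ on c_A, c_B

for name, result in [("CAP(c_A, c_B)", cap_partial),
                     ("CAP(c_A, c*_B)", cap_extended),
                     ("CAP+(c_A, c_B)", cap_plus_partial)]:
    print(name, {x: round(v, 5) for x, v in result.items()})
```

Under these assumptions the three results should come out close to the values reported in the post: roughly $(0.2625, 0.6125, 0.125)$, $(0.25, 0.6, 0.15)$, and $(0.25333, 0.61333, 0.13333)$ respectively, which is the disagreement that motivates ACC.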