Wednesday, 26 September 2018

Assistant professorship in formal philosophy (Gdansk)


A tenure-track job in formal philosophy in Gdansk is available. Polish language skills not required. The application deadline is November 23, 2018. Details here

Wednesday, 29 August 2018

A new (?) sort of Dutch Book argument: exploitability vs dominance

The exploitability-implies-irrationality argumentative strategy


In decision theory, we often wish to impose normative constraints either on an agent's preference ordering or directly on the utility function that partly determines it. We might demand, for instance, that your preferences should not be cyclical, or that your utility function should discount the future exponentially. And in Bayesian epistemology, we often wish to impose normative constraints on credences. We might demand, for instance, that your credence in one proposition should be no greater than your credence in another proposition that it entails. In both cases, we often use a particular argumentative strategy to establish these norms: we'll call it the exploitability-implies-irrationality strategy (or EII, for short). I want to start by arguing that this is a bad argumentative strategy; and then I want to describe a way to replace it with a good argumentative strategy that is inspired by the problem we have identified with EII. I want to finish by sketching a version of the good argumentative strategy that would replace the EII strategy in the case of credal norms; that is, in the case of the Dutch Book argument. I leave it open here whether a similar strategy can be made to work in the case of preferences or utility functions. (I think this alternative argument strategy is new---it essentially combines an old result by Mark Schervish (1989) with a more recent result by Joel Predd and his co-authors at Princeton (2009); so it wouldn't surprise me at all if something similar has been proposed before---I'd welcome any information about this.)

The EII strategy runs as follows:

(I) Mental state-action link. It begins by claiming that, for anyone with a particular mental state---a preference ordering, a utility function, a credence function, or some combination of these---it is rationally required of them to choose in a particular way when faced with a decision problem.

Some examples:
(i) someone with preference ordering $a \prec b$ is rationally required to pay some amount of money to receive $b$ rather than $a$;
(ii) someone with credence $p$ in proposition $X$ should pay £$(p-\varepsilon)$ for a bet that pays out £1 if $X$ is true and £0 if $X$ is false---call this a £1 bet on $X$.

(II) Mathematical theorem. It proceeds to show that, for anyone with a mental state that violates the norm in question, there are decision problems the agent might face such that, if she does, then there are choices she might make in response to them that dominate the choices that premise (I) says are rationally required of her as a result of her mental state. That is, the first set of choices is guaranteed to leave her better off than the second set of choices.

Some examples:
(i) if $c \prec a \prec b \prec c$, then rationality requires you to pay to get $a$ rather than $c$, pay again to get $b$ rather than $a$, and pay again to get $c$ rather than $b$. If, instead, you'd just chosen $c$ at the beginning, and refused to pay anything to swap, you'd be better off for sure now.
(ii) if you have credence $p$ in $XY$ and a credence $q < p$ in $X$, then you will sell a £1 bet on $X$ for £$(q + \varepsilon)$, and you'll buy a £1 bet on $XY$ for £$(p-\varepsilon)$. Providing $2\varepsilon < p - q$, it is easy to see that, taken together, these bets lose you money for sure, and thus refusing both bets is guaranteed to leave you better off.
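To make example (ii) concrete, here's a quick check of the payoffs in Python (the particular values of $p$, $q$ and $\varepsilon$ are just illustrative):

```python
# Credences: p in the conjunction XY, q < p in the conjunct X.
p, q, eps = 0.8, 0.6, 0.05   # chosen so that 2*eps < p - q

def net_payoff(x_true, y_true):
    """Total payoff, in pounds, of the two bets the credences license."""
    # Sell a £1 bet on X for £(q + eps): receive the price, pay £1 if X.
    sell_x = (q + eps) - (1 if x_true else 0)
    # Buy a £1 bet on XY for £(p - eps): pay the price, receive £1 if XY.
    buy_xy = (1 if (x_true and y_true) else 0) - (p - eps)
    return sell_x + buy_xy

# Whatever the truth values of X and Y, the net payoff is negative:
for x in (True, False):
    for y in (True, False):
        print(x, y, round(net_payoff(x, y), 2))
```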

(III) Action-rationality link. The final premise says that, if there is some series of decision problems such that the choices your mental states rationally require you to make are dominated by some other set of choices you might have made instead, then your mental states are irrational.

Some examples:
(i) By (I-III)(i), we conclude that preferences $c \prec a \prec b \prec c$ are irrational.
(ii) By (I-III)(ii), we conclude that having a higher credence in a conjunction than in one of the conjuncts is irrational.

Now, there are often problems with the instance of (I) that is used in such EII arguments. For instance, there are many reasons to think rationality does not require someone with credence $p$ in $X$ to pay £$(p - \varepsilon)$ for a £1 bet on $X$. But my focus here is on (III).

The Problem with the Action-Rationality Link


The problem with (III) is this: It is clear that it is irrational to make a series of decisions when there is an alternative series that is guaranteed to do better---it is irrational because, when you act, you are attempting to maximise your utility and doing what you have done is guaranteed to be suboptimal as a means to that end; there is an alternative you can know a priori would serve that end better. But it is much less clear why it is irrational to have mental states that require you to make a dominated series of decisions when faced with a particular decision problem. When you choose a dominated option, you are irrational because there's something else you could have done that is guaranteed to serve your ends better. But when you have mental states that require you to choose a dominated option, that alone doesn't tell us that there is anything else you could have done---any alternative mental states you could have had---that are guaranteed to serve your ends better.

Of course, there is often something else you could have done that would not have required you to make the dominated choice. Let's focus on the case of credences. The Dutch Book Theorem shows that, if your credences are not probabilistic, then there's a series of decision problems and a dominated series of options from them that those credences require you to choose. The Converse Dutch Book Theorem shows that, if your credences are instead probabilistic, then there is no such series of decision problems and options. So it's true that there's something else you could do that's guaranteed not to require you to make a dominated choice. But making a dominated choice is not an eventuality so dreadful and awful that, if your credences require you to do it in the face of one particular sort of decision problem, they are automatically irrational, regardless of what they lead you to do in the face of any other decision problem and regardless of how likely it is that you face a decision problem in which they require it of you.

After all, for all the Dutch Book and Converse Dutch Book Theorems tell you, it might be that your non-probabilistic credences lead you to choose badly when faced with the very particular Dutch Book decision problem, but lead you to choose extremely profitably when faced with many other decision problems. And indeed, even in the case of the Dutch Book decision problem, it might be that your non-probabilistic credences require you to choose in a way that leaves you a little poorer for sure, while all the alternative probabilistic credences require you to choose in a way that leaves you with the possibility of great gain, but also the risk of great loss. In this case, it is not obvious that the probabilistic credences are to be preferred. Furthermore, you might have reason to think that it is extremely unlikely you will ever face the Dutch Book decision problem itself. Or at least you might have reason to think it much more probable that you'll face other decision problems where your credences don't lead you to choose a dominated series of options. For all these reasons, the mere possibility of a series of decision problems from which your credences require you to choose a dominated series of options is not sufficient to show that your credences are irrational. To do this, we need to show that there are some alternative credences that are in some sense sure to serve you better as you face the decision problems that make up your life. Without such alternatives that do better, pointing out a flaw in some mental state does not show that it is irrational, even if there are other mental states without the flaw---for those alternative mental states might have other strikes against them that the mental state in question does not have.

A new Dutch Book argument


So our question is now: Is there any sense in which, when you have non-probabilistic credences, there are some alternative credences that are guaranteed to serve you better as a guide in your decision-making? Borrowing from work by Mark Schervish ('A General Method for Comparing Probability Assessors', 1989) and Ben Levinstein ('A Pragmatist's Guide to Epistemic Utility', 2017), I want to argue that there is.

The pragmatic utility of an individual credence 


Our first order of business is to create a utility function that measures how good individual credences are as a guide to decision-making. Then we'll take the utility of a whole credence function to be the sum of the utilities of the credences that comprise it. (In fact, I think there's a way to do all this without that additivity assumption, but I'm still ironing out the creases in that.)

Suppose you assign credence $p$ to proposition $X$. Our job is to say how good this credence is as a guide to action. The idea is this:
  • an act is a function from states of the world to utilities---let $\mathcal{A}$ be the set of all acts;
  • an $X$-act is an act that assigns the same utility to all the worlds at which $X$ is true, and assigns the same utility to all worlds at which $X$ is false---let $\mathcal{A}_X$ be the set of all $X$-acts;
  • a decision problem is a set of acts; that is, a subset of $\mathcal{A}$---let $\mathcal{D}$ be the set of all decision problems;
  • an $X$-decision problem is a set of $X$-acts; that is, a subset of $\mathcal{A}_X$---let $\mathcal{D}_X$ be the set of all $X$-decision problems.
We suppose that there is a probability function $P$ that says how likely it is that the agent will face different $X$-decision problems---since the set of $X$-decision problems is infinite, we actually take $P$ to be a probability density function. The idea here is that $P$ is something like an objective chance function. With that in hand, we take the pragmatic utility of credence $p$ in proposition $X$ to be the expected utility of the choices that credence $p$ in $X$ will lead you to make when faced with the decision problems you will encounter. That is, it is the integral, relative to measure $P$, over the possible $X$-decision problems $D$ in $\mathcal{D}_X$ you might face, of the utility of the act you'd choose from $D$ using $p$, discounted by the probability that you'd face $D$. Given $D$ in $\mathcal{D}_X$, let $D^p$ be the act you'd choose from $D$ using $p$---that is, $D^p$ is one of the acts in $D$ that maximises expected utility by the lights of $p$. Thus, for any $D$ in $\mathcal{D}_X$, and any act $a$ in $D$,$$\mathrm{Exp}_p(u(a)) \leq \mathrm{Exp}_p(u(D^p))$$ Then we define the pragmatic utility of credence $p$ in $X$ when $X$ is true as follows:
$$g_X(1, p) = \int_{\mathcal{D}_X}u(D^p, X) dP$$ And we define the pragmatic utility of credence $p$ in $X$ when $X$ is false as follows:
$$g_X(0, p) = \int_{\mathcal{D}_X}u(D^p, \overline{X}) dP$$ These are slight modifications of Schervish's and Levinstein's definitions.
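To get a feel for these definitions, here is a toy discretisation (mine, not Schervish's or Levinstein's): finitely many $X$-decision problems, with $P$ a discrete distribution over them, so the integrals become sums. All the acts and probabilities below are invented for illustration:

```python
# An X-act is a pair (utility if X is true, utility if X is false).
# An X-decision problem is a list of X-acts; P gives each problem a probability.
problems = [
    ([(10, -5), (0, 0)], 0.5),   # accept or refuse a bet on X
    ([(-5, 10), (0, 0)], 0.3),   # accept or refuse a bet against X
    ([(4, 1), (2, 3)], 0.2),     # two mildly X-sensitive options
]

def choose(p, acts):
    """D^p: an act in D that maximises expected utility by the lights of p."""
    return max(acts, key=lambda a: p * a[0] + (1 - p) * a[1])

def g(truth, p):
    """g_X(truth, p): P-expected utility of the acts that credence p selects."""
    return sum(prob * choose(p, acts)[0 if truth else 1]
               for acts, prob in problems)

print(g(1, 0.7), g(0, 0.7))   # pragmatic utility of credence 0.7 in X
```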


$g$ is a strictly proper scoring rule


Our next order of business is to show that this utility function $g_X$ is a strictly proper scoring rule. That is, $\mathrm{Exp}_p(g_X(q)) = pg_X(1, q) + (1-p)g_X(0, q)$ is uniquely maximised, as a function of $q$, at $q = p$. We show this now:
\begin{eqnarray*}
\mathrm{Exp}_p(g_X(q)) & = & pg_X(1, q) + (1-p)g_X(0, q)\\
& = & p \int_{\mathcal{D}_X}u(D^q, X) dP + (1-p) \int_{\mathcal{D}_X}u(D^q, \overline{X}) dP \\
& = & \int_{\mathcal{D}_X}p u(D^q, X) + (1-p) u(D^q, \overline{X}) dP\\
& = & \int_{\mathcal{D}_X} \mathrm{Exp}_p(u(D^q)) dP
\end{eqnarray*}
But, by the definition of $D^p$, for all $D$ in $\mathcal{D}_X$,
$$\mathrm{Exp}_p(u(D^q)) \leq \mathrm{Exp}_p(u(D^p))$$
and, if $q \neq p$, for some $D$ in $\mathcal{D}_X$,
$$\mathrm{Exp}_p(u(D^q)) < \mathrm{Exp}_p(u(D^p))$$
Now, for two credences $p$ and $q$ in $X$, we say that a set of decision problems separates $p$ and $q$ if (i) each decision problem in the set contains only two available acts, and (ii) in each decision problem in the set, one act maximises expected utility by the lights of $p$ while the other maximises expected utility by the lights of $q$. Then, as long as there is some set of decision problems that separates $p$ and $q$ and to which $P$ assigns positive probability, we have
$$\mathrm{Exp}_p(g_X(q)) < \mathrm{Exp}_p(g_X(p))$$ And so the scoring rule $g_X$ is strictly proper.
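Using the same sort of toy discretisation as before (invented decision problems and probabilities), one can check numerically that no credence $q$ does better than $p$ by the lights of $p$, and that $q$ does strictly worse whenever a positively-weighted problem separates $p$ and $q$:

```python
# Toy X-decision problems (acts as (utility if X, utility if not-X)) with
# their probabilities under P -- all invented for illustration.
problems = [
    ([(10, -5), (0, 0)], 0.4),
    ([(-5, 10), (0, 0)], 0.3),
    ([(4, 1), (2, 3)], 0.3),   # these two acts separate credences around 0.5
]

def choose(p, acts):
    return max(acts, key=lambda a: p * a[0] + (1 - p) * a[1])

def g(truth, p):
    return sum(prob * choose(p, acts)[0 if truth else 1]
               for acts, prob in problems)

def exp_score(p, q):
    """Expected pragmatic utility of credence q, by the lights of credence p."""
    return p * g(1, q) + (1 - p) * g(0, q)

p = 0.6
grid = [i / 100 for i in range(101)]
# No q ever does better than p by p's own lights:
assert all(exp_score(p, q) <= exp_score(p, p) + 1e-12 for q in grid)
# And q = 0.4 does strictly worse, since the third problem separates it from 0.6:
print(exp_score(p, 0.4), exp_score(p, p))
```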

The pragmatic utility of a whole credence function


The scoring rule $g_X$ we have just defined assigns pragmatic utilities to individual credences in $X$. In the next step, we define $G$, a pragmatic utility function that assigns pragmatic utilities to whole credence functions. We take the utility of a credence function to be the sum of the utilities of the individual credences it assigns. Suppose $c : \mathcal{F} \rightarrow [0, 1]$ is a credence function defined on the set of propositions $\mathcal{F}$. Then: $$G(c, w) = \sum_{X \in \mathcal{F}} g_X(w(X), c(X))$$ where $w(X) = 1$ if $X$ is true at $w$ and $w(X) = 0$ if $X$ is false at $w$. In this situation, we say that $G$ is generated from the scoring rules $g_X$ for $X$ in $\mathcal{F}$.




Predd, et al.'s Dominance Result


Finally, we appeal to a theorem due to Predd, et al. ('Probabilistic Coherence and Proper Scoring Rules', 2009):

Theorem (Predd, et al. 2009) Suppose $G$ is generated from continuous strictly proper scoring rules $g_X$ for $X$ in $\mathcal{F}$. Then,
(I) if $c$ is not a probability function, then there is a probability function $c^*$ such that, $G(c, w) < G(c^*, w)$ for all worlds $w$;
(II) if $c$ is a probability function, then there is no credence function $c^* \neq c$ such that $G(c, w) \leq G(c^*, w)$ for all worlds $w$.
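For a minimal illustration, take $\mathcal{F} = \{X, \overline{X}\}$ and generate $G$ from the Brier score $g_X(w, x) = -(w - x)^2$, which is continuous and strictly proper. The particular non-probabilistic credence function below is just an example:

```python
# F = {X, not-X}; worlds w1 (X true) and w2 (X false), as indicator vectors.
worlds = [(1, 0), (0, 1)]

def G(c, w):
    """Brier-generated pragmatic utility: sum of -(w(Y) - c(Y))^2 over Y in F."""
    return sum(-(wy - cy) ** 2 for wy, cy in zip(w, c))

c = (0.6, 0.6)        # not a probability function: the credences sum to 1.2
c_star = (0.5, 0.5)   # a probability function that dominates it

for w in worlds:
    print(G(c, w), G(c_star, w))   # c_star scores strictly better at both worlds
```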

This furnishes us with a new pragmatic argument for probabilism. And indeed, now that we have a pragmatic utility function that is generated from strictly proper scoring rules, we can take advantage of all of the epistemic utility arguments that make that same assumption, such as Greaves and Wallace's argument for Conditionalization, my arguments for the Principal Principle, the Principle of Indifference, linear pooling in judgment aggregation cases, and so on.

In this argument, we see that non-probabilistic credences are irrational not because there is some series of decision problems such that, when faced with them, the credences require you to make a dominated series of choices. Rather, they are irrational because there are alternative credences that are guaranteed to serve you better on average as a guide to action---however the world turns out, the expected or average utility you'll gain from making decisions using those alternative credences is greater than the expected or average utility you'll gain from making decisions using the original credences.

Monday, 6 August 2018

Postdoc in formal epistemology & law

A postdoc position (3 years, fixed term) in the Chair of Logic, Philosophy of Science and Epistemology is available at the Department of Philosophy, Sociology, and Journalism, University of Gdansk, Poland. The application deadline is September 15, 2018. More details here.

Monday, 30 July 2018

The Dutch Book Argument for Regularity

I've just signed a contract with Cambridge University Press to write a book on the Dutch Book Argument for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.

-------

We say that a probabilistic credence function $c : \mathcal{F} \rightarrow [0, 1]$ is regular if $c(A) > 0$ for all propositions $A$ in $\mathcal{F}$ such that there is some world at which $A$ is true.

The Principle of Regularity (standard version) If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, rationality requires that $c$ is regular.

I won't specify which worlds are in the scope of the quantifier over worlds that occurs in the antecedent of this norm. It might be all the logically possible worlds, or the metaphysically possible worlds, or the conceptually possible worlds; it might be the epistemically possible worlds. Different answers will give different norms. But we needn't decide the issue here. We'll just specify that it's the same set of worlds that we quantify over in the Dutch Book argument for Probabilism when we say that, if your credences aren't probabilistic, then there's a series of bets they'll lead you to enter into that will lose you money at all possible worlds.

In this post, I want to consider the almost-Dutch Book Argument for the norm of Regularity. Here's how it goes: Suppose you have a credence $c(A) = 0$ in a proposition $A$, and suppose that $A$ is true at world $w$. Now recall the first premise of the standard Dutch Book argument for Probabilism:

Ramsey's Thesis If your credence in a proposition $X$ is $c(X) = p$, then you're permitted to pay £$pS$ for a bet that returns £$S$ if $X$ is true and £$0$ if $X$ is false, for any $S$, positive or negative or zero.

So, since $c(A) = 0$, your credences will permit you to sell the following bet for £0: if $A$, you must pay out £1; if $\overline{A}$, you will pay out £0. But selling this bet for this price is weakly dominated by refusing the bet. Selling the bet at that price loses you money in all $A$-worlds, and gains you nothing in $\overline{A}$-worlds. Whereas refusing the bet neither loses nor gains you anything in any world. Thus, your credences permit you to choose a weakly dominated act. So they are irrational. Or so the argument goes. I call this the almost-Dutch Book argument for Regularity since it doesn't punish you with a sure loss, but rather with a possible loss with no compensating possible gain.
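The payoff comparison driving the argument is easily tabulated (a sketch with the £1 stake from the text):

```python
def net(a_true, sell_bet):
    """Payoff of selling for £0 a bet that pays out £1 if A, versus refusing."""
    if not sell_bet:
        return 0                      # refusing: nothing gained, nothing lost
    return 0 - (1 if a_true else 0)   # price received is £0; pay out £1 if A

# Selling never beats refusing, and is strictly worse at every A-world:
for a in (True, False):
    print(a, net(a, True), net(a, False))
```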

If this argument works, it establishes the standard version of Regularity stated above. But consider the following case. $A$ and $B$ are two logically independent propositions -- It will be rainy tomorrow and It will be hot tomorrow, for instance. You have only credences in $A$ and in the conjunction $AB$. You don't have credences in $\overline{A}$, $A \vee B$, $A\overline{B}$, and so on. What's more, your credences in $A$ and $AB$ are equal, i.e., $c(A) = c(AB)$. That is, you are exactly as confident in $A$ as you are in its conjunction with $B$. Then, in some sense, you violate Regularity, though you don't violate the standard version we stated above. After all, since your credence in $A$ is the same as your credence in $AB$, you must give no credence whatsoever to the worlds in which $A$ is true and $B$ is false. If you did, then you would set $c(AB) < c(A)$. But you don't have a credence in $A\overline{B}$. So there is no proposition true at some worlds to which you assign a credence of 0. Thus, the almost-Dutch Book argument sketched above will not work. We need a different Dutch Book argument for the following version of Regularity:

The Principle of Regularity (full version) If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that there is an extension $c^*$ of $c$ to a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$ such that $c^* : \mathcal{F}^* \rightarrow [0, 1]$ is regular.

It is this principle that you violate if $c(A) = c(AB)$ when $A$ and $B$ are logically independent. For any probabilistic extension $c^*$ of $c$ that assigns a credence to $A\overline{B}$ must assign it credence 0 even though there is a world at which it is true.

How are we to give an almost-Dutch Book argument for this version of Regularity? There are two possible approaches.

On the first, we strengthen the first premise of the standard Dutch Book argument. Ramsey's Thesis says: if you have credence $c(X) = p$ in $X$, then you are permitted to pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. The stronger version says:

Strong Ramsey's Thesis If every extension $c^*$ of $c$ to a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$ is such that $c^*(X) = p$, then you are permitted to pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$.

The idea is that, if every extension assigns the same credence $p$ to $X$, then you are in some sense committed to assigning credence $p$ to $X$. And thus, you are permitted to enter into whichever bets you'd be permitted to enter into if you actually had credence $p$.

On the second approach to giving an almost-Dutch Book argument for the full version of the Regularity principle, we actually provide an almost-Dutch Book using just the credences that you do in fact assign. Suppose, for instance, you have credence $c(A) = c(AB) = 0.5$. Then you will sell for £5 a bet that pays out £10 if $A$ and £0 if $\overline{A}$, while you will buy for £5 a bet that pays £10 if $AB$ and £0 if $\overline{AB}$. Then, if $A$ is true and $B$ is true, you will have a net gain of £0, and similarly if $A$ is false. But if $A$ is true and $B$ is false, you will lose £10. Thus, you face the possibility of loss with no possibility of gain. Now, the question is: can we always construct such almost-Dutch Books? And the answer is that we can, as the following theorem shows:

Theorem 1 (Almost-Dutch Book Theorem for Full Regularity) Suppose $\mathcal{F} = \{X_1, \ldots, X_n\}$ is a set of propositions. Suppose $c : \mathcal{F} \rightarrow [0, 1]$ is a credence function that cannot be extended to a regular probability function on a full algebra $\mathcal{F}^*$ that contains $\mathcal{F}$. Then there is a sequence of stakes $S = (S_1, \ldots, S_n)$, such that if, for each $1 \leq i \leq n$, you pay £$(c(X_i) \times S_i)$ for a bet that pays out £$S_i$ if $X_i$ and £0 if $\overline{X_i}$, then the total price you'll pay is at least the payoff of these bets at all worlds, and more than the payoff at some.

That is,
(i) for all worlds $w$,
$$S\cdot (w - c) = S \cdot w - S \cdot c = \sum^n_{i=1} S_iw(X_i) - \sum^n_{i=1} S_ic(X_i) \leq 0$$
(ii) for some worlds $w$,
$$S\cdot (w - c) = S \cdot w - S \cdot c = \sum^n_{i=1} S_iw(X_i) - \sum^n_{i=1} S_ic(X_i) < 0$$
where $w(X_i) = 1$ if $X_i$ is true at $w$ and $w(X_i) = 0$ if $X_i$ is false at $w$. We call $w(-)$ the indicator function of $w$.

Proof sketch. First, recall de Finetti's observation that your credence function $c : \mathcal{F} \rightarrow [0, 1]$ is a probability function iff it is in the convex hull of the indicator functions of the possible worlds -- that is, iff $c$ is in $\{w(-) : w \mbox{ is a possible world}\}^+$. Second, note that, if your credence function can't be extended to a regular probability function, it sits on the boundary of this convex hull. In particular, if $W_c = \{w' : c = \sum_w \lambda_w w \Rightarrow \lambda_{w'} > 0\}$, then $c$ lies on the boundary surface created by the convex hull of $W_c$. Third, by the Supporting Hyperplane Theorem, there is a vector $S$ such that $S$ is orthogonal to this boundary surface and thus:
(i) $S \cdot (w-c) = S \cdot w - S \cdot c = 0$ for all $w$ in $W_c$; and
(ii) $S \cdot (w-c) = S \cdot w - S \cdot c < 0$ for all $w$ not in $W_c$.
Fourth, recall that $S \cdot w$ is the total payout of the bets at world $w$ and $S \cdot c$ is the price you'll pay for it. $\Box$
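Tying the theorem back to the earlier example: with $c(A) = c(AB) = 0.5$, the stakes vector $S = (-10, 10)$ (sell a £10 bet on $A$, buy a £10 bet on $AB$) realises the theorem's inequalities. A quick check:

```python
# F = {A, AB}; credences c(A) = c(AB) = 0.5; stakes S = (-10, 10) mean:
# sell a £10 bet on A for £5, buy a £10 bet on AB for £5.
c = (0.5, 0.5)
S = (-10, 10)

# The three relevant world types, as indicator vectors (w(A), w(AB)):
worlds = {"A & B": (1, 1), "A & not-B": (1, 0), "not-A": (0, 0)}

# S . (w - c) is total payoff minus total price at w:
nets = {name: sum(s * (wx - cx) for s, wx, cx in zip(S, w, c))
        for name, w in worlds.items()}
print(nets)   # zero at two world types, a sure £10 loss at A & not-B
```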

Thursday, 26 July 2018

Dutch Strategy Theorems for Conditionalization and Superconditionalization


I've just signed a contract with Cambridge University Press to write a book on the Dutch Book Argument for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.

-----

Many Bayesians formulate the update norm of Bayesian epistemology as follows:

Bayesian Conditionalization  If
(i) your credence function at $t$ is $c : \mathcal{F} \rightarrow [0, 1]$,
(ii) your credence function at a later time $t'$ is $c' : \mathcal{F} \rightarrow [0, 1]$,
(iii) $E$ is the strongest evidence you acquire between $t$ and $t'$,
(iv) $E$ is in $\mathcal{F}$,
then rationality requires that, if $c(E) > 0$, then for all $X$ in $\mathcal{F}$, $$c'(X) = c(X|E) = \frac{c(XE)}{c(E)}$$

I don't. One reason you might fail to conditionalize between $t$ and $t'$ is that you re-evaluate your opinions between those times. You might disavow the prior that you had at the earlier time, perhaps decide it was too biased in one way or another, or not biased enough; perhaps you come to think that it doesn't give enough consideration to the explanatory power one hypothesis would have were it true, or gives too much consideration to the adhocness of another hypothesis; and so on. Now, it isn't irrational to change your mind. So surely it can't be irrational to fail to conditionalize as a result of changing your mind in this way. On this, I agree with van Fraassen.

Instead, I prefer to formulate the update norm as follows -- I borrow the name from Kenny Easwaran:

Plan Conditionalization If
(i) your credence function at $t$ is $c: \mathcal{F} \rightarrow [0, 1]$,
(ii) between $t$ and $t'$ you will receive evidence from the partition $\{E_1, \ldots, E_n\}$,
(iii) each $E_i$ is in $\mathcal{F}$,
(iv) at $t$, your updating plan is $c'$, so that $c'_i : \mathcal{F} \rightarrow [0, 1]$ is the credence function you will adopt if $E_i$,
then rationality requires that, if $c(E_i) > 0$, then for all $X$ in $\mathcal{F}$, $$c'_i(X) = c(X | E_i)$$

I want to do two things in this post. First, I'll offer what I think is a new proof of the Dutch Strategy or Diachronic Dutch Book Theorem that justifies Plan Conditionalization (I haven't come across it elsewhere, though Ray Briggs and I used the trick at the heart of it for our accuracy dominance theorem in this paper). Second, I'll explore how that might help us justify other norms of updating that concern situations in which you don't come to learn any proposition with certainty. We will see that we can use the proof I give to justify the following standard constraint on updating rules: Suppose the evidence I receive between $t$ and $t'$ is not captured by any of the propositions to which I assign a credence -- that is, there is no proposition $e$ to which I assign a credence that is true at all and only the worlds at which I receive the evidence I actually receive between $t$ and $t'$. As a result, there is no proposition $e$ that I learn with certainty as a result of receiving that evidence. Nonetheless, I should update my credence function from $c$ to $c'$ in such a way that it is possible to extend my earlier credence function $c$ to a credence function $c^*$ so that: (i) $c^*$ does assign a credence to $e$, and (ii) my later credence $c'(X)$ in a proposition $X$ is the credence that this extended credence function $c^*$ assigns to $X$ conditional on me receiving evidence $e$ -- that is, $c'(X) = c^*(X | e)$. That is, I should update as if I had assigned a credence to $e$ at the earlier time and then updated by conditionalizing on it.

Here's the Dutch Strategy or Diachronic Dutch Book Theorem for Plan Conditionalization:

Definition (Conditionalizing pair) Suppose $c$ is a credence function and $c'$ is an updating rule defined on $\{E_1, \ldots, E_n\}$. We say that $(c, c')$ is a conditionalizing pair if, whenever $c(E_i) > 0$, then for all $X$, $c'_i(X) = c(X | E_i)$.

Dutch Strategy Theorem Suppose $(c, c')$ is not a conditionalizing pair. Then
(i) there are two acts $A$ and $B$ such that $c$ prefers $A$ to $B$, and
(ii) for each $E_i$, there are two acts $A_i$ and $B_i$ such that $c'_i$ prefers $A_i$ to $B_i$,
and, for each $E_i$, $A + A_i$ has greater utility than $B + B_i$ at all worlds at which $E_i$ is true.

We'll now give the proof of this.

First, we describe a way of representing pairs $(c, c')$. Both $c$ and each $c'_i$ are defined on the same set $\mathcal{F} = \{X_1, \ldots, X_m\}$. So we can represent $c$ by the vector $(c(X_1), \ldots, c(X_m))$ in $[0, 1]^m$, and we can represent each $c'_i$ by the vector $(c'_i(X_1), \ldots, c'_i(X_m))$ in $[0, 1]^m$. And we can represent $(c, c')$ by concatenating all of these representations to give:
$$(c, c') = c \frown c'_1 \frown c'_2 \frown \ldots \frown c'_n$$
which is a vector in $[0, 1]^{m(n+1)}$.

Second, we use this representation to give an alternative characterization of conditionalizing pairs. First, three pieces of notation:
  • Let $W$ be the set of all possible worlds.
  • For any $w$ in $W$, abuse notation and write $w$ also for the credence function on $\mathcal{F}$ such that $w(X) = 1$ if $X$ is true at $w$, and $w(X) = 0$ if $X$ is false at $w$.
  • For any $w$ in $W$, let $$(c, c')_w = w \frown c'_1 \frown \ldots \frown c'_{i-1} \frown w \frown c'_{i+1} \frown \ldots \frown c'_n$$ where $E_i$ is the element of the partition that is true at $w$.
Lemma 1 If $(c, c')$ is not a conditionalizing pair, then $(c, c')$ is not in the convex hull of $\{(c, c')_w : w \in W\}$, which we write $\{(c, c')_w : w \in W\}^+$.

Proof of Lemma 1. If $(c, c')$ is in $\{(c, c')_w : w \in W\}^+$, then there are $\lambda_w \geq 0$ such that

(1) $\sum_{w \in W} \lambda_w = 1$,
(2) $c(X) = \sum_{w \in W} \lambda_w w(X)$
(3) $c'_i(X) = \sum_{w \in E_i} \lambda_w w(X) + \sum_{w \not \in E_i} \lambda_w c'_i(X)$.

By (2), we have $\lambda_w = c(w)$. So by (3), we have $$c'_i(X) = c(XE_i) + (1-c(E_i))c'_i(X)$$ So, if $c(E_i) > 0$, then $c'_i(X) = c(X | E_i)$. $\Box$

Third, we use this alternative characterization of conditionalizing pairs to specify the acts in question. Suppose $(c, c')$ is not a conditionalizing pair. Then $(c, c')$ is outside $\{(c, c')_w : w \in W\}^+$. Now, let $(p, p')$ be the orthogonal projection of $(c, c')$ into $\{(c, c')_w : w \in W\}^+$. Then let $(S, S') = (c, c') - (p, p')$. That is, $S = c - p$ and $S'_i = c'_i - p'_i$. Now pick $w$ in $W$. Then the angle between $(S, S')$ and $(c, c')_w - (c, c')$ is obtuse and thus
$$(S, S') \cdot ((c, c')_w - (c, c')) = -\varepsilon_w < 0$$

Thus, define the acts $A$, $B$, $A'_i$ and $B'_i$ as follows:
  • The utility of $A$ at $w$ is $S \cdot (w - c) + \frac{1}{3}\varepsilon_w$;
  • The utility of $B$ at $w$ is 0;
  • The utility of $A'_i$ at $w$ is $S'_i \cdot (w - c'_i) + \frac{1}{3}\varepsilon_w$;
  • The utility of $B'_i$ at $w$ is 0.
Then the expected utility of $A$ by the lights of $c$ is $\sum_w c(w)\frac{1}{3}\varepsilon_w > 0$, while the expected utility of $B$ is 0, so $c$ prefers $A$ to $B$. And the expected utility of $A'_i$ by the lights of $c'_i$ is $\sum_w c'_i(w)\frac{1}{3}\varepsilon_w > 0$, while the expected utility of $B'_i$ is 0, so $c'_i$ prefers $A'_i$ to $B'_i$. But the utility of $A + A'_i$ at $w$ is
$$S \cdot (w - c)  + S'_i \cdot (w - c'_i) + \frac{2}{3}\varepsilon_w = (S, S') \cdot ((c, c')_w - (c, c')) + \frac{2}{3}\varepsilon_w = - \frac{1}{3}\varepsilon_w < 0$$
where $E_i$ is true at $w$. While the utility of $B + B'_i$ at $w$ is 0.

This completes our proof. $\Box$
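Here is a numerical sketch of the construction for the smallest possible case: $\mathcal{F} = \{X\}$, a two-cell partition with $E_1 = X$, and two worlds. The credences are invented for illustration, and the convex hull is just a line segment, so the orthogonal projection can be computed in closed form:

```python
# Smallest case: F = {X}, partition {E1, E2} with E1 = X; worlds w1 (X true)
# and w2 (X false). A pair (c, c') is the vector (c(X), c'_1(X), c'_2(X)).
c, c1, c2 = 0.5, 0.7, 0.0    # non-conditionalizing: c'_1(X) != c(X | E1) = 1
z = (c, c1, c2)              # the pair (c, c')
v1 = (1.0, 1.0, c2)          # (c, c')_{w1}: c and c'_1 replaced by w1
v2 = (0.0, c1, 0.0)          # (c, c')_{w2}: c and c'_2 replaced by w2

dot = lambda x, y: sum(a * b for a, b in zip(x, y))
sub = lambda x, y: tuple(a - b for a, b in zip(x, y))

# Orthogonally project z onto the segment [v2, v1] (the convex hull here):
d = sub(v1, v2)
t = dot(sub(z, v2), d) / dot(d, d)
proj = tuple(b + t * a for a, b in zip(d, v2))
S = sub(z, proj)             # the vector (S, S'_1, S'_2)

# epsilon_w = -(S, S') . ((c, c')_w - (c, c')) is positive at each world:
eps1 = -dot(S, sub(v1, z))
eps2 = -dot(S, sub(v2, z))

# Utilities of A and A'_i at the world where E_i is true:
A_at_w1 = S[0] * (1 - c) + eps1 / 3
A1_at_w1 = S[1] * (1 - c1) + eps1 / 3
A_at_w2 = S[0] * (0 - c) + eps2 / 3
A2_at_w2 = S[2] * (0 - c2) + eps2 / 3

print(eps1, eps2)                               # both positive
print(A_at_w1 + A1_at_w1, A_at_w2 + A2_at_w2)   # both equal -(1/3)*eps_i < 0
```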

You might be forgiven for wondering why we are bothering to give an alternative proof for a theorem that is already well-known. David Lewis proved the Dutch Strategy Theorem in a handout for a seminar at Princeton in 1972, Paul Teller then reproduced it (with full permission and acknowledgment) in a paper in 1973, and Lewis finally published his handout in 1997 in his collected works. Why offer a new proof?

It turns out that this style of proof is actually a little more powerful. To see why, it's worth comparing it to an alternative proof of the Dutch Book Theorem for Probabilism, which I described in this post (it's not original to me, though I'm afraid I can't remember where I first saw it!). In the standard Dutch Book Theorem for Probabilism, we work through each of the axioms of the probability calculus, and say how you would Dutch Book an agent who violates it. The axioms are: Normalization, which says that $c(\top) = 1$ and $c(\bot) = 0$; and Additivity, which says that $c(A \vee B) = c(A) + c(B) - c(AB)$. But consider an agent with credences only in the propositions $\top$, $A$, and $A\ \&\ B$.  Her credences are: $c(\top) = 1$, $c(A) = 0.4$, $c(A\ \&\ B) = 0.7$. Then there is no axiom of the probability calculus that she violates. And thus the standard proof of the Dutch Book Theorem is no help in identifying any Dutch Book against her. Yet she is Dutch Bookable. And she violates a more expansive formulation of Probabilism that says, not only are you irrational if your credence function is not a probability function, but also if your credence function cannot be extended to a probability function. So the standard proof of the Dutch Book Theorem can't establish this more expansive version. But the alternative proof I mentioned above can.
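For the record, here is the Dutch Book against that agent, using the prices Ramsey's Thesis licenses ($c(A) = 0.4$, $c(A\ \&\ B) = 0.7$, £10 stakes):

```python
def net(a, b):
    """Buy a £10 bet on A & B for £7; sell a £10 bet on A for £4."""
    buy_ab = (10 if (a and b) else 0) - 7   # price £7 = 0.7 x £10
    sell_a = 4 - (10 if a else 0)           # price £4 = 0.4 x £10
    return buy_ab + sell_a

for a, b in [(True, True), (True, False), (False, True), (False, False)]:
    print(a, b, net(a, b))   # a guaranteed loss at every world
```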

Now, something similar is true of the alternative proof of the Dutch Strategy Theorem that I offered above. (I happened upon this while discussing Superconditionalizing with Jason Konek, who uses similar techniques in his argument for J-Kon, the alternative to Jeffrey's Probability Kinematics that he proposes in his paper, 'The Art of Learning', which was runner-up for last year's Sanders Prize in Epistemology.) Lewis' proof of that theorem runs as follows: first, if you violate Plan Conditionalization, there must be $E_i$ and $X$ such that $c(E_i) > 0$ and $c'_i(X) \neq c(X|E_i)$. You then place bets on $XE_i$ and $\overline{E_i}$ at the earlier time $t$, and a bet on $X$ at $t'$. Together, these bets lose you money in any world at which $E_i$ is true. Now, it might seem that, just in virtue of violating Plan Conditionalization, you must have the credences required to make those bets. But imagine the following is true of you: between $t$ and $t'$, you'll obtain evidence from the partition $\{E_1, \ldots, E_n\}$; and, at $t'$, you'll update on this evidence using the rule $c'$. That is, if $E_i$, then you'll adopt the new credence function $c'_i$ at time $t'$. But you don't assign credences to the propositions in $\{E_1, \ldots, E_n\}$. Perhaps this is because you lack the conceptual resources to formulate those propositions. So, while you will update using the rule $c'$, this is not a rule you consciously or explicitly adopt, since stating it would require you to use the propositions in $\{E_1, \ldots, E_n\}$; it's more like you have a disposition to update in this way. Now, how might we state Plan Conditionalization for such an agent? We can't demand that $c'_i(X) = c(X|E_i)$, since $c(X | E_i)$ is not defined. Rather, we demand that there is some extension $c^*$ of $c$ to a set of propositions that does include each $E_i$ such that $c'_i(X) = c^*(X | E_i)$. Thus, we have:

Plan Superconditionalization If
(i) your credence function at $t$ is $c : \mathcal{F} \rightarrow [0, 1]$,
(ii) between $t$ and $t'$ you will receive evidence from the partition $\{E_1, \ldots, E_n\}$,
(iii) at $t$, your updating plan is $c'$, so that $c'_i : \mathcal{F} \rightarrow [0, 1]$ is the credence function you plan to adopt if $E_i$,
then rationality requires that there is some extension $c^*$ of $c$ for which, if $c^*(E_i) > 0$, then for all $X$, $$c'_i(X) = c^*(X | E_i)$$

And it turns out that we can adapt the proof above for this purpose. Say that $(c, c')$ is a superconditionalizing pair if there is an extension $c^*$ of $c$ such that, if $c^*(E_i) > 0$, then for all $X$, $c'_i(X) = c^*(X | E_i)$. Then we can prove that if $(c, c')$ is not a superconditionalizing pair, then $(c, c')$ is not in $\{(c, c')_w : w \in W\}^+$. Here's the proof from above adapted to our case: If $(c, c')$ is in $\{(c, c')_w : w \in W\}^+$, then there are $\lambda_w \geq 0$ such that

(1) $\sum_{w \in W} \lambda_w = 1$,
(2) $c(X) = \sum_{w \in W} \lambda_w w(X)$
(3) $c'_i(X) = \sum_{w \in E_i} \lambda_w w(X) + \sum_{w \not \in E_i} \lambda_w c'_i(X)$.

Define the following extension $c^*$ of $c$: $c^*(w) = \lambda_w$. Then, by (3), we have $$c'_i(X) = c^*(XE_i) + (1-c^*(E_i))c'_i(X)$$ So, if $c^*(E_i) > 0$, then $c'_i(X) = c^*(X | E_i)$, as required. $\Box$
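For what it's worth, here is a toy numerical check (my own, with made-up weights and worlds) of the rearrangement at the end of this proof: given weights $\lambda_w$, condition (3) pins $c'_1$ down to the conditionalization of the extension $c^*$.

```python
# Toy check of the final step: with c*(w) = lambda_w, solving condition (3)
# for c'_1(X) yields c*(X | E1), and the solution really does satisfy (3).

lam = {'w1': 0.1, 'w2': 0.2, 'w3': 0.3, 'w4': 0.4}   # the lambda_w, summing to 1
E1 = {'w1', 'w2'}                                     # one cell of the partition

def c_star(prop):
    """The extension c* defined by c*(w) = lambda_w."""
    return sum(lam[w] for w in prop)

X = {'w1', 'w3'}                                      # an arbitrary proposition

# Solving (3) for c'_1(X) gives c*(X & E1) / c*(E1), i.e. c*(X | E1).
c1_X = c_star(X & E1) / c_star(E1)

# Verify that this value satisfies (3): c'_1(X) = c*(XE1) + (1 - c*(E1)) c'_1(X).
print(abs(c1_X - (c_star(X & E1) + (1 - c_star(E1)) * c1_X)) < 1e-9)   # True
```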

Now, this is a reasonably powerful version of conditionalization. For instance, as Skyrms showed here, if we make one or two further assumptions on the extension of $c$ to $c^*$, we can derive Richard Jeffrey's Probability Kinematics from Plan Superconditionalization. That is, if the evidence $E_i$ will lead you to set your new credences across the partition $\{B_1, \ldots, B_k\}$ to $q_1, \ldots, q_k$, respectively, so that $c'_i(B_j) = q_j$, then your new credence $c'_i(X)$ must be $\sum^k_{j=1} c(X | B_j)q_j$, as Probability Kinematics demands. Thus, Plan Superconditionalization places a powerful constraint on updating rules for situations in which the proposition stating your evidence is not one to which you assign a credence. Other cases of this sort include the Judy Benjamin problem and the many cases in which MaxEnt is applied.
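The Probability Kinematics formula mentioned here can be sketched in a few lines of code; the worlds, partition, and numbers below are my own toy choices, not from Skyrms or Jeffrey.

```python
# A minimal sketch of Jeffrey's Probability Kinematics: the new credence in X
# is the q_j-weighted average of the old conditional credences c(X | B_j).

c = {'w1': 0.25, 'w2': 0.25, 'w3': 0.25, 'w4': 0.25}   # old credences by world
B = [{'w1', 'w2'}, {'w3', 'w4'}]                        # the partition {B_1, B_2}
q = [0.8, 0.2]                                          # mandated new credences

def cr(prop):
    return sum(c[w] for w in prop)

def kinematics(X):
    """c'(X) = sum_j c(X | B_j) * q_j, as Probability Kinematics demands."""
    return sum(qj * cr(X & Bj) / cr(Bj) for Bj, qj in zip(B, q))

print(kinematics({'w1', 'w2'}))   # recovers the mandated 0.8 on B_1
print(kinematics({'w1', 'w3'}))   # 0.8 * 0.5 + 0.2 * 0.5 = 0.5
```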

Wednesday, 25 July 2018

Deadline for PhD position in formal epistemology & law extended

This position is still available. Deadline extended to September 7, 2018.

On the Expected Utility Objection to the Dutch Book Argument for Probabilism


I've just signed a contract with Cambridge University Press to write a book on the Dutch Book Argument for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature. The following came up while thinking about Brian Hedden's paper 'Incoherence without Exploitability'.

What is Probabilism?


Probabilism says that your credences should obey the axioms of the probability calculus. Suppose $\mathcal{F}$ is the algebra of propositions to which you assign a credence. Then we let $0$ represent the lowest possible credence you can assign, and we let $1$ represent the highest possible credence you can assign. We then represent your credences by your credence function $c : \mathcal{F} \rightarrow [0, 1]$, where, for each $A$ in $\mathcal{F}$, $c(A)$ is your credence in $A$.

Probabilism
If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that:
(P1a) $c(\bot) = 0$, where $\bot$ is a necessarily false proposition;
(P1b) $c(\top) = 1$, where $\top$ is a necessarily true proposition;
(P2) $c(A \vee B) = c(A) + c(B)$, for any mutually exclusive propositions $A$ and $B$ in $\mathcal{F}$.

This is equivalent to:

Partition Probabilism
If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that, for any two partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$,$$\sum^m_{i=1} c(X_i) = 1=  \sum^n_{j=1} c(Y_j)$$

The Dutch Book Argument for Probabilism


The Dutch Book Argument for Probabilism has three premises. The first, which I will call Ramsey's Thesis and abbreviate RT, posits a connection between your credence in a proposition and the prices you are rationally permitted or rationally required to pay for a bet on that proposition. The second, known as the Dutch Book Theorem, establishes that, if you violate Probabilism, there is a set of bets you might face, each with a price attached, such that (i) by Ramsey's Thesis, for each bet, you are rationally required to pay the attached price for it, but (ii) the sum of the prices of the bets exceeds the highest possible payout of the bets, so that, having paid each of those prices, you are guaranteed to lose money. The third premise, which we might call the Domination Thesis, says that credences are irrational if they mandate you to make a series of decisions (i.e., paying certain prices for the bets) that is guaranteed to leave you worse off than another series of decisions (i.e., refusing to pay those prices for the bets). In the language of decision theory, paying the attached price for each of the bets is dominated by refusing each of the bets, and credences that mandate you to choose dominated options are irrational. The conclusion of the Dutch Book Argument is then Probabilism. Thus, the argument runs:

The Dutch Book Argument for Probabilism
(DBA1) Ramsey's Thesis
(DBA2) Dutch Book Theorem
(DBA3) Domination Thesis
Therefore,
(DBAC) Probabilism

The argument is valid. The second premise is a mathematical theorem. Thus, if the argument fails, it must be because the first or third premise is false, or both. In this paper, we focus on the first premise, and the expected utility objection to it. So, let's set out that premise in a little more detail.

In what follows, we assume that (i) you are risk-neutral, and (ii) that there is some quantity such that your utility is linear in that quantity---indeed, we will speak as if your utility is linear in money, but that is just for ease of notation and familiarity; any quantity would do. Neither (i) nor (ii) is realistic, and indeed these idealisations are the source of other objections to Ramsey's Thesis. But they are not our concern here, so we will grant them.

Ramsey's Thesis (RT) Suppose your credence in $X$ is $c(X)$. Consider a bet that pays you £$S$ if $X$ is true and £0 if $X$ is false, where $S$ is a real number, either positive, negative, or zero---$S$ is called the stake of the bet. You are offered this bet for the price £$x$, where again $x$ is a real number, either positive, negative, or zero. Then:
(i) If $x < c(X) \times S$, you are rationally required to pay £$x$ to enter into this bet;
(ii) If $x = c(X) \times S$, you are rationally permitted to pay £$x$ and rationally permitted to refuse;
(iii) If $x > c(X) \times S$, you are rationally required to refuse.

Roughly speaking, Ramsey's Thesis says that, the more confident you are in a proposition, the more you should be prepared to pay for a bet on it. More precisely, it says: (a) if you have minimal confidence in that proposition (i.e. 0), then you should be prepared to pay nothing for it; (b) if you have maximal confidence in it (i.e. 1), then you should be prepared to pay the full stake for it; (c) for levels of confidence in between, the amount you should be prepared to pay increases linearly with your credence.
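Glosses (a)-(c) can be bundled into a small decision rule; the function name and the numbers below are illustrative only.

```python
# A small sketch of Ramsey's Thesis as a decision rule: given credence p in X
# and stake S, classify a proposed price x by comparing it with p * S.

def rt_verdict(p, S, x):
    """Return RT's verdict on paying price x for a bet with stake S on X."""
    fair = p * S
    if x < fair:
        return 'required to pay'
    if x == fair:
        return 'permitted either way'
    return 'required to refuse'

print(rt_verdict(0.0, 100, 10))   # minimal credence: refuse any positive price
print(rt_verdict(1.0, 100, 99))   # maximal credence: pay anything below the stake
print(rt_verdict(0.5, 100, 50))   # the fair price: permitted either way
```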

The Expected Utility Objection


We turn now to the objection to Ramsey's Thesis (RT) we wish to treat here. Hedden (2013) begins by pointing out that we have a general theory of how credences and utilities should guide action:

Given a set of options available to you, expected utility theory says that your credences license you to choose the option with the highest expected utility, defined as:
$$\mathrm{EU}(A) = \sum_i P(O_i|A) \times U(O_i)$$
On this view, we should evaluate which bets your credences license you to accept by looking at the expected utilities of those bets. (Hedden, 2013, 485)

He considers the objection that this only applies when credences satisfy Probabilism, but rejects it:

In general, we should judge actions by taking the sum of the values of each possible outcome of that action, weighted by one's credence that the action will result in that outcome. This is a very intuitive proposal for how to evaluate actions that applies even in the context of incoherent credences. (Hedden, 2013, 486)

Thus, Hedden contends that we should always choose by maximising expected utility relative to our credences, whether or not those credences are coherent. Let's call this principle Maximise Subjective Expected Utility and abbreviate it MSEU. He then observes that MSEU conflicts with RT. Consider, for instance, Cináed, who is 60% confident it will rain and 20% confident it won't. According to RT, he is rationally required to sell for £65 a bet in which he pays out £100 if it rains and £0 if it doesn't. But the expected utility of this bet for him is$$0.6 \times (-100 + 65) + 0.2 \times (-0 + 65) = -8$$That is, it has lower expected utility than refusing to sell the bet, since his expected utility for doing that is$$0.6 \times 0 + 0.2 \times 0 = 0$$So, while RT says you must sell that bet for that price, MSEU says you must not. So RT and MSEU are incompatible, and Hedden claims that we should favour MSEU. There are two ways to respond to this. On the first, we try to retain RT in some form in spite of Hedden's objection---I call this the permissive response below. On the second, we try to give a pragmatic argument for Probabilism using MSEU instead of RT---I call this the bookless response below. In the following sections, I will consider these in turn.
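Cináed's two expected-utility calculations can be checked directly; the code below simply replays the arithmetic in the text.

```python
# Cináed's credences: incoherent, since 0.6 + 0.2 < 1.
c_rain, c_no_rain = 0.6, 0.2

# Selling the bet for £65: he gains £65 but pays out £100 if it rains.
eu_sell = c_rain * (-100 + 65) + c_no_rain * (0 + 65)
# Refusing to sell: nothing changes either way.
eu_refuse = c_rain * 0 + c_no_rain * 0

print(eu_sell, eu_refuse)   # approximately -8 and 0: MSEU says refuse the sale
```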

The Permissive Response


While Hedden is right to say that maximising expected utility in line with Maximise Subjective Expected Utility (MSEU) is intuitively rational even when your credences are incoherent, so is Ramsey's Thesis (RT). It is certainly intuitively correct that, to quote Hedden, ''we should judge actions by taking the sum of the values of each possible outcome of that action, weighted by one's credence that the action will result in that outcome.'' But it is also intuitively correct that, to quote from our gloss of Ramsey's Thesis above, ''(a) if you have minimal confidence in that proposition (i.e. 0), then you should be prepared to pay nothing for it; (b) if you have maximal confidence in it (i.e. 1), then you should be prepared to pay the full stake for it; (c) for levels of confidence in between, the amount you should be prepared to pay increases linearly with your credence.'' What are we to do in the face of this conflict between our intuitions?

One natural response is to say that choosing in line with RT is rationally permissible and choosing in line with MSEU is also rationally permissible. When your credences are coherent, the dictates of MSEU and RT are the same. But when you are incoherent, they are sometimes different, and in that situation you are allowed to follow either. In particular, faced with a bet and proposed price, you are permitted to pay that price if it is permitted by RT and you are permitted to pay it if it is permitted by MSEU.

If this is right, then we can resurrect the Dutch Book Argument with a permissive version of RT as the first premise:

Permissive Ramsey's Thesis Suppose your credence in $X$ is $c(X)$. Consider a bet that pays you £$S$ if $X$ is true and £0 if $X$ is false. You are offered this bet for the price £$x$. Then:
(i) If $x \leq c(X) \times S$, you are rationally permitted to pay £$x$ to enter into this bet.

And we could then amend the third premise---the Domination Thesis (DBA3)---to ensure we could still derive our conclusion. Instead of saying that credences are irrational if they mandate you to make a series of decisions that is guaranteed to leave you worse off than another series of decisions, we might say that credences are irrational if they permit you to make a series of decisions that is guaranteed to leave you worse off than another series of decisions. In the language of decision theory, instead of saying only that credences that mandate you to choose dominated options are irrational, we say also that credences that permit you to choose dominated options are irrational. We might call this the Permissive Domination Thesis.

Now, by weakening the first premise in this way, we respond to Hedden's objection and make the premise more plausible. But we strengthen the third premise to compensate and perhaps thereby make it less plausible. However, I imagine that anyone who accepts one of the versions of the third premise---either the Domination Thesis or the Permissive Domination Thesis---will also accept the other. Having credences that mandate dominated choices may be worse than having credences that permit such choices, but both seem sufficient for irrationality. Perhaps the former makes you more irrational than the latter, but it seems clear that the ideally rational agent will have credences that do neither. And if that's the case, then we can replace the standard Dutch Book Argument with a slight modification:

The Permissive Dutch Book Argument for Probabilism
(PDBA1) Permissive Ramsey's Thesis
(PDBA2) Dutch Book Theorem
(PDBA3) Permissive Domination Thesis
Therefore,
(PDBAC) Probabilism

The Bookless Response


Suppose you refuse even the permissive version of RT, and insist that coherent and incoherent agents alike should choose in line with MSEU. Then what becomes of the Dutch Book Argument? As we noted above, Hedden shows that it fails---MSEU is not sufficient to establish the conclusion. In particular, Hedden gives an example of an incoherent credence function that is not Dutch Bookable via MSEU. That is, there are no sets of bets with accompanying prices such that (a) MSEU will demand that you pay each of those prices, and (b) the sum of those prices is guaranteed to exceed the sum of the payouts of that set of bets. However, as we will see, accepting individual members of such a set of bets is just one way to make bad decisions based on your credences.

Consider Hedden's example. In it, you assign credences to propositions in the algebra built up from three possible worlds, $w_1$, $w_2$, and $w_3$. Here are some of your credences:
  • $c(w_1 \vee w_2) = 0.8$ and $c(w_3) = 0$
  • $c(w_1) = 0.7$ and $c(w_2 \vee w_3) = 0$
Now, consider the following two options, $A$ and $B$, whose utilities in each state of the world are set out in the following table:

$$\begin{array}{c|ccc} & w_1 & w_2 & w_3 \\ \hline A & 78 & 77 & 77 \\ B & 74 & 74 & 75 \end{array}$$
Then notice first that $A$ dominates $B$---that is, the utility of $A$ is higher than $B$ in every possible state of the world. But, using your incoherent credences, you assign a higher expected utility to $B$ than to $A$. Your expected utility for $A$---which must be calculated relative to your credences in $w_1$ and $w_2 \vee w_3$, since  the utility of $A$ given $w_1 \vee w_2$ is undefined---is $0.7 \times 78 + 0 \times 77 = 54.6$. And your expected utility for $B$---which must be calculated relative to your credences in $w_1 \vee w_2$ and $w_3$, since the utility of $B$ given $w_2 \vee w_3$ is undefined---is $0.8 \times 74 + 0 \times 75 = 59.2$. So, while Hedden might be right that MSEU won't leave you vulnerable to a Dutch Book, it will leave you vulnerable to choosing a dominated option. And since what is bad about entering a Dutch Book is that it is a dominated option---it is dominated by the option of refusing the bets---the invulnerability to Dutch Books should be no comfort to you.
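For concreteness, here is the example in code; the world-by-world utilities are reconstructed from the expected-utility calculations above.

```python
# World-by-world utilities, read off the partitions on which A and B are
# defined: A takes value 78 on w1 and 77 on w2-or-w3; B takes value 74 on
# w1-or-w2 and 75 on w3.
U_A = {'w1': 78, 'w2': 77, 'w3': 77}
U_B = {'w1': 74, 'w2': 74, 'w3': 75}

# A dominates B: strictly better at every world.
dominates = all(U_A[w] > U_B[w] for w in U_A)

# Yet the incoherent credences rank B above A.
eu_A = 0.7 * 78 + 0.0 * 77   # relative to c(w1) = 0.7, c(w2 or w3) = 0
eu_B = 0.8 * 74 + 0.0 * 75   # relative to c(w1 or w2) = 0.8, c(w3) = 0

print(dominates, eu_A < eu_B)   # True True: MSEU prefers the dominated option
```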

Now, this raises the question: for which incoherent credences is it guaranteed that MSEU won't lead you to choose a dominated option? Is it all incoherent credences, in which case we would have a new Dutch Book Argument for Probabilism from MSEU rather than RT? Or is it some subset? Below, we prove a theorem that answers this question. First, a weakened version of Probabilism:

Bounded Probabilism If $c : \mathcal{F}\rightarrow [0, 1]$ is your credence function, then rationality requires that:
(BP1a) $c(\bot) = 0$, where $\bot$ is a necessarily false proposition;
(BP1b) There is $0 < M \leq 1$ such that $c(\top) = M$, where $\top$ is a necessarily true proposition;
(BP2) $c(A \vee B) = c(A) + c(B)$, if $A$ and $B$ are mutually exclusive.

Bounded Probabilism says that you should have lowest possible credence in necessary falsehoods, some positive credence---not necessarily 1---in necessary truths, and your credence in a disjunction of two incompatible propositions should be the sum of your credences in the disjuncts.

Theorem 1 The following are equivalent:
(i) $c$ satisfies Bounded Probabilism
(ii) For all options $A$, $B$, if $A$ dominates $B$, then $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$.

The proof is in the Appendix below. Thus, even without Ramsey's Thesis or the permissive version described above, you can still give a pragmatic argument for a norm that lies very close to Probabilism, namely, Bounded Probabilism. On its own, this argument cannot say what is wrong with someone who gives less than the highest possible credence to necessary truths, but it does establish the other requirements that Probabilism imposes. To see just how close Bounded Probabilism lies to Probabilism, consider the following two norms, which are equivalent to it:

Scaled Probabilism  If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that there is $0 < M \leq 1$ and a probability function $p : \mathcal{F} \rightarrow [0, 1]$ such that $c(-) = M \times p(-)$.

Bounded Partition Probabilism  If $c : \mathcal{F} \rightarrow [0, 1]$ is your credence function, then rationality requires that, for any two partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$,$$\sum^m_{i=1} c(X_i) = \sum^n_{j=1} c(Y_j)$$

Then:

Lemma 2 The following are equivalent:
(i) Bounded Probabilism
(ii) Scaled Probabilism
(iii) Bounded Partition Probabilism

As before, the proof is in the Appendix.

So, on its own, MSEU can deliver us very close to Probabilism. But it cannot establish (P1b), namely, $c(\top) = 1$. However, I think we can also appeal to a highly restricted version of the Permissive Ramsey's Thesis to secure (P1b) and push us all the way to Probabilism.

Consider Dima and Esther. They both have minimal confidence---i.e. 0---that it won't rain tomorrow. But Dima has credence 0.01 that it will rain, while Esther has credence 0.99 that it will. If we permit only actions that maximise expected utility, then Dima and Esther are required to pay exactly the same prices for bets on rain---that is, Dima will be required to pay a price exactly when Esther is. After all, if £$S$ is the payoff when it rains, £0 is the payoff when it doesn't, and $x$ is a proposed price, then $0.01\times (S- x) + 0 \times (0-x) \geq 0$ iff $0.99 \times (S-x) + 0 \times (0-x) \geq 0$ iff $S \geq x$. So, according to MSEU, Dima and Esther are rationally required to pay anything up to the stake of the bet for such a bet. But this is surely wrong. It is surely at least permissible for Dima to refuse to pay a price that Esther accepts. It is surely permissible for Esther to pay £99 for a bet on rain that pays £100 if it rains and £0 if it doesn't, while Dima refuses to pay anything more than £1 for such a bet, in line with Ramsey's Thesis. Suppose Dima were offered such a bet for the price of £99, and suppose she then defended her refusal to pay that price saying, 'Well, I only think it's 1% likely to rain, so I don't want to risk such a great loss with so little possible gain when I think the gain is so unlikely'. Then surely we would accept that as a rational defence.
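The point about Dima and Esther can be verified directly: with credence 0 in no-rain, MSEU's verdict depends only on whether the price exceeds the stake. The sketch below uses illustrative prices.

```python
# With c(no rain) = 0, the EU of paying price x for the £S bet on rain is
# c(rain) * (S - x), whose sign is independent of c(rain). So MSEU issues
# exactly the same verdicts for Dima (0.01) and Esther (0.99).

def mseu_accepts(c_rain, c_no_rain, stake, price):
    """True iff paying the price for the bet on rain has non-negative EU."""
    return c_rain * (stake - price) + c_no_rain * (0 - price) >= 0

stake = 100
verdicts_dima = [mseu_accepts(0.01, 0.0, stake, x) for x in (1, 50, 99)]
verdicts_esther = [mseu_accepts(0.99, 0.0, stake, x) for x in (1, 50, 99)]

print(verdicts_dima == verdicts_esther)   # True: MSEU cannot tell them apart
```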

In response to this, defenders of MSEU might concede that RT is sometimes the correct norm of action when you are incoherent, but only in very specific cases, namely, those in which you have a positive credence in a proposition, minimal credence (i.e. 0) in its negation, and you are considering the price you might pay for a bet on that proposition. In all other cases---that is, in any case in which your credences in the proposition and its negation are both positive, or in which you are considering an action other than a bet on a proposition---you should use MSEU. I have some sympathy with this. But, fortunately, this restricted version is all we need. After all, it is precisely by applying Ramsey's Thesis to such a case that we can produce a Dutch Book against someone with $c(\bot) = 0$ and $c(\top) < 1$---we simply offer to pay them £$c(\top) \times 100$ for a bet in which they will pay out £100 if $\top$ is true and £0 if it is false; this is then guaranteed to lose them £$100 \times (1-c(\top))$, which is positive. Thus, we end up with a disjunctive pragmatic argument for Probabilism: if $c(\bot) = 0$ and $c(\top) < 1$, then RT applies and we can produce a Dutch Book against you; if you violate Probabilism in any other way, then you violate Bounded Probabilism and we can then produce two options $A$ and $B$ such that $A$ dominates $B$, but your credences, via MSEU, dictate that you should choose $B$ over $A$. This, then, is our bookless pragmatic argument for Probabilism:

Bookless Pragmatic Argument for Probabilism
(BPA1) If $c$ violates Probabilism, then either (i) $c(\bot) = 0$ and $c(\top) < 1$, or (ii) $c$ violates Bounded Probabilism.
(BPA2) If $c(\bot) = 0$ and $c(\top) < 1$, then RT applies, and there is a bet on $\top$ such that you are required by RT to pay a higher price for that bet than its guaranteed payoff. Thus, there are options $A$ and $B$ (namely, refuse the bets and pay the price), such that $A$ dominates $B$, but RT demands that you choose $B$ over $A$.
(BPA3) If $c$ violates Bounded Probabilism, then by Theorem 1, there are options $A$ and $B$ such that $A$ dominates $B$, but MSEU demands that you choose $B$ over $A$. Therefore, by (BPA1), (BPA2), and (BPA3),
(BPA4) If $c$ violates Probabilism, then there are options $A$ and $B$ such that $A$ dominates $B$, but your credences demand that you choose $B$ over $A$.
(BPA5) Domination Thesis
Therefore,
(BPAC) Probabilism

Conclusion


The Dutch Book Argument for Probabilism assumes Ramsey's Thesis, which determines the prices an agent is rationally required to pay for a bet. Hedden argues that Ramsey's Thesis is wrong. He claims that Maximise Subjective Expected Utility determines those prices, and it often disagrees with RT. In our Permissive Dutch Book Argument, I suggested that, in the face of that disagreement, we might be permissive: agents are permitted to pay any price that is required or permitted by RT, and they are permitted to pay any price that is required or permitted by MSEU. In our Bookless Pragmatic Argument, I then explored what we might do if we reject this permissive response and insist that only prices permitted or required by MSEU are permissible. I showed that, in that case, we can give a pragmatic argument for Bounded Probabilism, which comes close to Probabilism but doesn't quite reach it; and I showed that, if we allow RT in the very particular cases in which it agrees better with intuition than MSEU does, we can give a pragmatic argument for Probabilism.

Appendix: Proof of Theorem 1


Theorem 1 The following are equivalent:
(i) $c$ satisfies Bounded Probabilism
(ii) For all options $A$, $B$, if $A$ dominates $B$, then $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$.

($\Rightarrow$) Suppose $c$ satisfies Bounded Probabilism. Then, by Lemma 2, there is $0 < M \leq 1$ and a probability function $p$ such that $c(-) = M \times  p(-)$. Now suppose $A$ and $B$ are actions. Then
  • $\mathrm{EU}_c(A) = \mathrm{EU}_{M \times  p}(A) = M \times  \mathrm{EU}_p(A)$
  • $\mathrm{EU}_c(B) = \mathrm{EU}_{M \times  p}(B) = M \times  \mathrm{EU}_p(B)$
Thus, $\mathrm{EU}_c(A) > \mathrm{EU}_c(B)$ iff $\mathrm{EU}_p(A) > \mathrm{EU}_p(B)$. And we know that, if $A$ dominates $B$ and $p$ is a probability function, then $\mathrm{EU}_p(A) > \mathrm{EU}_p(B)$.

($\Leftarrow$) Suppose $c$ violates Bounded Probabilism. Then there are partitions $\mathcal{X} = \{X_1, \ldots, X_m\}$ and $\mathcal{Y} = \{Y_1, \ldots, Y_n\}$ such that $$\sum^m_{i=1} c(X_i) = x < y = \sum^n_{j=1} c(Y_j)$$We will now define two acts $A$ and $B$ such that $A$ dominates $B$, but $\mathrm{EU}_c(A) < \mathrm{EU}_c(B)$.
  • For any $X_i$ in $\mathcal{X}$, $$U(A, X_i) = y - i\frac{y-x}{2(m + 1)}$$
  • For any $Y_j$ in $\mathcal{Y}$,$$U(B, Y_j) = x + j\frac{y-x}{2(n + 1)}$$
Then the crucial facts are:
  • For any two $X_i \neq X_j$ in $\mathcal{X}$,$$U(A, X_i) \neq U(A, X_j)$$
  • For any two $Y_i \neq  Y_j$ in $\mathcal{Y}$,$$U(B, Y_i) \neq U(B, Y_j)$$
  • For any $X_i$ in $\mathcal{X}$ and $Y_j$ in $\mathcal{Y}$, $$x < U(B, Y_j) < \frac{x+y}{2} < U(A, X_i) < y$$
So $A$ dominates $B$, but$$\mathrm{EU}_c(A) = \sum^m_{i=1} c(X_i) U(A, X_i) < \sum^m_{i=1} c(X_i) \times y = xy$$
while$$\mathrm{EU}_c(B) = \sum^n_{j=1} c(Y_j) U(B, Y_j) > \sum^n_{j=1} c(Y_j) \times x = yx$$So $\mathrm{EU}_c(B) > \mathrm{EU}_c(A)$, as required. $\Box$
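As a sanity check on this construction (my own, with toy numbers), we can build $A$ and $B$ by the recipe above from an incoherent credence function and confirm both dominance and the reversed expected-utility ranking.

```python
# Build A and B by the recipe in the proof for an incoherent c whose
# partition sums are x = 0.6 and y = 1.0, then check both crucial facts.

cX = [0.3, 0.3]   # credences across partition X: they sum to x = 0.6
cY = [0.5, 0.5]   # credences across partition Y: they sum to y = 1.0
x, y = sum(cX), sum(cY)
m, n = len(cX), len(cY)

# U(A, X_i) = y - i(y-x)/(2(m+1)); U(B, Y_j) = x + j(y-x)/(2(n+1)).
U_A = [y - (i + 1) * (y - x) / (2 * (m + 1)) for i in range(m)]
U_B = [x + (j + 1) * (y - x) / (2 * (n + 1)) for j in range(n)]

# Every A-utility exceeds (x+y)/2 and every B-utility falls below it,
# so A dominates B at every world.
dominates = min(U_A) > (x + y) / 2 > max(U_B)

eu_A = sum(c * u for c, u in zip(cX, U_A))
eu_B = sum(c * u for c, u in zip(cY, U_B))
print(dominates, eu_A < eu_B)   # True True: the dominated option B wins on EU
```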

Thursday, 19 July 2018

What is Probabilism?


I've just signed a contract with Cambridge University Press to write a book on the Dutch Book Argument for their Elements in Decision Theory and Philosophy series. So over the next few months, I'm going to be posting some bits and pieces as I get properly immersed in the literature.

----

Probabilism is the claim that your credences should satisfy the axioms of the probability calculus. Here is an attempt to state the norm more precisely, where $\mathcal{F}$ is the algebra of propositions to which you assign credences and $c$ is your credence function, which is defined on $\mathcal{F}$, so that $c(A)$ is your credence in $A$, for each $A$ in $\mathcal{F}$.

Probabilism (initial formulation)
  • (Non-Negativity) Your credences should not be negative. In symbols: $c(A) \geq 0$, for all $A$ in $\mathcal{F}$.
  • (Normalization I) Your credence in a necessarily false proposition should be 0. In symbols: $c(\bot) = 0$.
  • (Normalization II) Your credence in a necessarily true proposition should be 1. In symbols: $c(\top) = 1$.
  • (Finite Additivity) Your credence in the disjunction of two mutually exclusive propositions should be the sum of your credences in the disjuncts. In symbols: $c(A \vee B) = c(A) + c(B)$.
This sort of formulation is fairly typical. But I think it's misleading in various ways.

As is often pointed out, 0 and 1 are merely conventional choices. Like utilities, we can measure credences on different scales. But what are they conventional choices for? It seems to me that they must represent the lowest possible credence you can have and the highest possible credence you can have, respectively. After all, what we want Normalization I and II to say is that we should have lowest possible credence in necessary falsehoods and highest possible credence in necessary truths. It follows that Non-Negativity is not a normative constraint on your credences, which is how it is often presented. Rather, it follows immediately from the particular representation of our credences that we have chosen to use. Suppose we chose a different representation, where -1 represents the lowest possible credence and 1 represents the highest. Then Normalization I and II would say that $c(\bot) = -1$ and $c(\top) = 1$, so Non-Negativity would be false.

One upshot of this is that Non-Negativity is superfluous once we have specified the representation of credences that we are using. But another is that Probabilism incorporates not only normative claims, such as Normalization I and II and Finite Additivity, but also a metaphysical claim, namely, that there is a lowest possible credence that you can have and a highest possible credence that you can have. Without that, we couldn't specify the representation of credences in such a way that we would want to sign up to Normalization I and II. Suppose that, for any credence you can have, there is a higher one you could have. Then there is no credence that I would want to demand you have in a necessary truth--for any I demanded, it would be better for you to have one higher. So I either have to say that all credences in necessary truths are rationally forbidden, or all are rationally permitted, or I pick some threshold above which any credence is rationally permitted. And the same goes, mutatis mutandis, for credences in necessary falsehoods. I'm not sure what the norm of credences would be if our credences were unbounded in one or other or both directions. But it certainly wouldn't be Probabilism.

So Non-Negativity is not a normative claim, but rather a trivial consequence of a metaphysical claim together with a conventional choice of representation. The metaphysical claim is that there is a minimal and a maximal credence; the representation choice is that 0 will represent the minimal credence and 1 will represent the maximal credence.

Next, suppose we make a different conventional choice. Suppose we pick real numbers $a$ and $b$, and we say that $a$ represents minimal credence and $b$ represents maximal credence. Then clearly Normalization I becomes $c(\bot) = a$ and Normalization II becomes $c(\top) = b$. But what of Finite Additivity? This looks problematic. After all, if $a = 10$ and $b = 30$, and $c(A) = 20 = c(\overline{A})$, then Finite Additivity demands that $c(\top) = c(A \vee \overline{A}) = c(A) + c(\overline{A}) = 40$, which is greater than the maximal credence. So Finite Additivity makes an impossible demand on an agent who seems to have perfectly rational credences in $A$ and $\overline{A}$, given the representation.

The reason is that Finite Additivity, formulated as we formulated it above, is peculiar to very specific representations of credences, such as the standard one on which 0 stands for minimal credence and 1 stands for maximal credence. The correct formulation of Finite Additivity in general says: $c(A \vee B) = c(A) + c(B) - c(A\ \&\ B)$, for any propositions $A$, $B$ in $\mathcal{F}$. Thus, in the case we just gave above, if $c(A\ \&\ \overline{A}) = 10$, in keeping with the relevant version of Normalization I, we have $c(A \vee \overline{A}) = 20 + 20 - 10 = 30$, as required. So we see that it's wrong to say that Probabilism says that your credence in the disjunction of two mutually exclusive propositions should be the sum of your credences in the disjuncts--that's actually only true on some representation of your credences (namely, those for which 0 represents minimal credence).
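The arithmetic of this example is easy to check; here is a trivial sketch.

```python
# On the a = 10, b = 30 scale, the general form
# c(A or B) = c(A) + c(B) - c(A & B) gives the right answer where the
# naive sum overshoots the maximum credence.

a, b = 10, 30
c_A = 20
c_notA = 20
c_contradiction = a    # c(A & not-A) = 10, per the relevant Normalization I

naive = c_A + c_notA                        # 40: exceeds the maximum b = 30
general = c_A + c_notA - c_contradiction    # 30: equals c(tautology) = b

print(naive, general)   # 40 30
```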

Bringing all of this together, I propose the following formulation of Probabilism:

Probabilism (revised formulation)
  • (Bounded credences) There is a lowest possible credence you can have; and there is a highest possible credence you can have.
  • (Representation) We represent the lowest possible credence you can have using $a$, and we represent the highest possible credence you can have using $b$ (where $a < b$).
  • (Normalization I) Your credence in a necessarily false proposition should be the lowest possible credence you can have. In symbols: $c(\bot) = a$.
  • (Normalization II) Your credence in a necessarily true proposition should be the highest possible credence you can have. In symbols: $c(\top) = b$.
  • (Finite Additivity) $c(A \vee B) = c(A) + c(B) - c(A\ \&\ B)$, for any propositions $A$, $B$ in $\mathcal{F}$.
We call such a credence function a probability$_{a, b}$ function. How can we be sure this is right? Here are some considerations in its favour:

Switching representations 
(i) Suppose $c(-)$ is a probability$_{a, b}$ function. Then $\frac{1}{b-a}c(-) - \frac{a}{b-a}$ is a probability function (or probability$_{0, 1}$ function).
(ii) Suppose $c(-)$ is a probability function and $a, b$ are real numbers with $a < b$. Then $c(-)(b-a) + a$ is a probability$_{a, b}$ function.
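These switching facts are easy to check numerically. Here is a hedged Python sketch (the function names `to_01` and `from_01` are my own, not standard):

```python
# Sketch of the rescaling maps in (i) and (ii); the function names are mine.

def to_01(c, a, b):
    """Map a probability_{a,b} value to the corresponding probability_{0,1} value."""
    return (c - a) / (b - a)

def from_01(c, a, b):
    """Map a probability_{0,1} value to the corresponding probability_{a,b} value."""
    return c * (b - a) + a

a, b = 10, 30
# A probability_{10,30} function on {bot, A, not-A, top}:
c_ab = {'bot': 10, 'A': 20, 'notA': 20, 'top': 30}
c_01 = {X: to_01(v, a, b) for X, v in c_ab.items()}
print(c_01)   # {'bot': 0.0, 'A': 0.5, 'notA': 0.5, 'top': 1.0}

# The two maps are inverses, so we can switch representations freely:
assert all(from_01(c_01[X], a, b) == c_ab[X] for X in c_ab)
```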

Dutch Book Argument
The standard Dutch Book Argument for Probabilism assumes that, if you have credence $p$ in proposition $X$, then you will pay £$pS$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. But this assumes that you have credences between 0 and 1, inclusive. What is the corresponding assumption if you represent credences in a different scale? Shorn of its conventional choice of representation, the assumption is: (a) you will pay £$0$ for a bet on $X$ if you have minimal credence in $X$; (b) you will pay £$S$ for a bet on $X$ if you have maximal credence in $X$; (c) the price you will pay for a bet on $X$ increases linearly with your credence in $X$. Translated into a framework in which we measure credence on a scale from $a$ to $b$, the assumption is then: you will pay £$\frac{p-a}{b-a}S$ for a bet that pays £$S$ if $X$ and £$0$ if $\overline{X}$. And, with this assumption, we can find Dutch Books against any credence function that isn't a probability$_{a, b}$ function.
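To illustrate, here is a Python sketch of such a Dutch Book on a 10-to-30 scale; the credences and stakes are made up for the example:

```python
# Hedged sketch: a Dutch Book against a credence function on a 10-30 scale
# that violates the general additivity axiom. The numbers are made up.
a, b = 10, 30

def price(p, stake):
    """The price paid for a bet paying `stake` if X and 0 otherwise,
    given credence p in X on the a-to-b scale: ((p - a)/(b - a)) * stake."""
    return (p - a) / (b - a) * stake

# Credences: c(A) = 20 and c(not-A) = 25. General additivity would demand
# c(A v not-A) = 20 + 25 - 10 = 35 > 30, so these credences are incoherent.
c_A, c_notA = 20, 25

# Book: a 1-unit bet on A plus a 1-unit bet on not-A.
total_price = price(c_A, 1) + price(c_notA, 1)   # 0.5 + 0.75 = 1.25

# Exactly one of A, not-A is true at each world, so the book pays exactly 1.
net = 1.0 - total_price
print(net)   # -0.25: a sure loss at every world
```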

Accuracy Dominance Argument
The standard Accuracy Dominance Argument for Probabilism assumes that, for each world, the ideal or vindicated credence function at that world assigns 0 to all falsehoods and 1 to all truths. Of course, if we represent minimal credence by $a$ and maximal credence by $b$, then we'll want to change that assumption. We'll want to say instead that the ideal or vindicated credence function at a world assigns $a$ to falsehoods and $b$ to truths. Once we say that, we can show that, for any credence function that isn't a probability$_{a, b}$ function, there is another credence function that is closer to the ideal credence function at all worlds.
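As an illustration, here is a Python sketch that finds a dominating probability$_{a, b}$ function for one non-probability$_{a, b}$ credence function, using squared Euclidean distance (a Brier-style measure) as the measure of inaccuracy; the numbers and the projection step are illustrative assumptions, not taken from the post:

```python
import numpy as np

# Illustrative scale endpoints and credences.
a, b = 10, 30

# Vindicated credence functions over (A, not-A) on the a-b scale:
v_true = np.array([float(b), float(a)])    # world where A is true
v_false = np.array([float(a), float(b)])   # world where A is false

# A non-probability_{10,30} credence function: 20 + 25 - 10 = 35 != 30.
c = np.array([20.0, 25.0])

# The probability_{10,30} functions over (A, not-A) form the segment between
# the two vindicated points; project c onto that segment.
d = v_false - v_true
t = np.clip(np.dot(c - v_true, d) / np.dot(d, d), 0.0, 1.0)
c_star = v_true + t * d                    # (17.5, 22.5)

# c_star is closer to the vindicated function at *every* world:
for v in (v_true, v_false):
    assert np.sum((c_star - v) ** 2) < np.sum((c - v) ** 2)
print(c_star)   # [17.5 22.5]
```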

So, the usual arguments for having a credence function that is a probability function when you represent your credences on a scale from 0 to 1 can be repurposed to argue that you should have a credence function that is a probability$_{a, b}$ function when you represent your credences on a scale from $a$ to $b$. And that gives us good reason to think that the second formulation of Probabilism above is correct.


Monday, 16 July 2018

Yet another assistant professorship in formal philosophy @ University of Gdansk

Some time ago the Chair of Logic, Philosophy of Science and Epistemology had an opening in formal philosophy, which has since been filled. Now another position (leading to a permanent position upon second renewal) is available (so there'll be three tenure-track faculty members working on formal philosophy). Details.

Lecturer Position in Logic and Philosophy of Language (MCMP)


The Ludwig-Maximilians-University Munich is one of the largest and most prestigious universities in Germany.

Ludwig-Maximilians-University Munich is seeking applications for one

Lecturer Position (equivalent to Assistant Professorship) 
in Logic and Philosophy of Language
(for three years, with the possibility of extension)

at the Chair of Logic and Philosophy of Language (Professor Hannes Leitgeb) and the Munich Center for Mathematical Philosophy (MCMP) at the Faculty of Philosophy, Philosophy of Science and Study of Religion. The position, which is to start on December 1, 2018, is for three years with the possibility of extension for another three years.

The appointee will be expected (i) to do philosophical research, especially in logic and philosophy of language, (ii) to teach five hours a week in areas relevant to the chair, and (iii) to participate in the administrative work of the MCMP.

The successful candidate will have a PhD in philosophy or logic, will have teaching experience in philosophy and logic, and will have carried out research in logic and related areas (such as philosophy of logic, philosophy of language, philosophy of mathematics, formal epistemology).

Your workplace is centrally located in Munich and is very easy to reach by public transport. We offer you an interesting and responsible job with good training and development opportunities.
The employment takes place within the TV-L scheme.
The position is initially limited to November 30, 2021.
Furthermore, given equal qualification, severely physically challenged applicants will be preferred.
There is the possibility of part-time employment.
The application of women is strongly welcome.

Applications (including CV, certificates, list of publications, list of courses taught, a writing sample and a description of planned research projects (1000-1500 words)) should be sent either by email (ideally all requested documents in just one PDF document) or by mail to

Ludwig-Maximilians-Universität München
Faculty of Philosophy, Philosophy of Science and Study of Religion
Chair of Logic and Philosophy of Language / MCMP
Geschwister-Scholl-Platz 1
80539 München

by September 1, 2018. If possible, we very much prefer applications by email.

In addition, we ask for two letters of reference, which must be sent by the reviewers directly to the above address (e-mail preferred).

For further questions, you can contact us by e-mail at office.leitgeb@lrz.uni-muenchen.de.

More information about the MCMP can be found at http://www.mcmp.philosophie.uni-muenchen.de/index.html.

The German description of the position is to be found at https://www.uni-muenchen.de/aktuelles/stellenangebote/wissenschaft/20180704161330.html 

Tuesday, 3 July 2018

Entia et Nomina 2018 (August 28-29, Gdansk)

This is the seventh conference in the Entia et Nomina series (previous editions took place in Poland, Belgium and India), which features workshops for researchers in formally and analytically oriented philosophy, in particular in epistemology, logic, and philosophy of science. The distinctive format of the workshop requires participants to distribute extended abstracts or full papers a couple of weeks before the workshop and to prepare extended comments on another participant's paper.

Invited speakers
- Zalan Gyenis (Jagiellonian University, Poland)
- Masashi Kasaki (Nagoya University, Japan)
- Martin Smith (University of Edinburgh, Scotland)


Dates: 
- Submission deadline: July 20
- Decisions: August 1
- Workshop: August 28-29

For more details on the workshop and submission, consult the pdf file with full CFP:

Monday, 12 February 2018

An almost-Dutch Book argument for the Principal Principle

People often talk about the synchronic Dutch Book argument for Probabilism and the diachronic Dutch Strategy argument for Conditionalization. But the synchronic Dutch Book argument for the Principal Principle is mentioned less. That's perhaps because, in one sense, there couldn't possibly be such an argument. As the Converse Dutch Book Theorem shows, provided you satisfy Probabilism, there can be no Dutch Book made against you -- that is, there is no set of bets, each of which you will consider fair or favourable on its own, but which, when taken together, lead to a sure loss for you. So you can violate the Principal Principle without being vulnerable to a sure loss, provided you satisfy Probabilism. However, there is a related argument for the Principal Principle. And conversations with a couple of philosophers recently made me think it might be worth laying it out.

Here is the result on which the argument is based:

(I) Suppose your credences violate the Principal Principle but satisfy Probabilism. Then there is a book of bets and a price such that: (i) you consider that price favourable for that book -- that is, your subjective expectation of the total net gain is positive; (ii) every possible objective chance function considers that price unfavourable -- that is, the objective expectation of the total net gain is guaranteed to be negative.

(II) Suppose your credences satisfy both the Principal Principle and Probabilism. Then there is no book of bets and a price such that: (i) you consider that price favourable for that book; (ii) every possible objective chance function considers that price unfavourable.

Put another way:

(I') Suppose your credences violate the Principal Principle but satisfy Probabilism. There are two actions $a$ and $b$ such that: you prefer $b$ to $a$, but every possible objective chance function prefers $a$ to $b$.

(II') Suppose your credences satisfy the Principal Principle and Probabilism. For any two actions $a$ and $b$: if every possible objective chance function prefers $a$ to $b$, then you prefer $a$ to $b$.

To move from (I) and (II) to (I') and (II'), let $b$ be the action of accepting the book of bets at the given price and let $a$ be the action of rejecting it.

The proof splits into two parts:

(1) First, we note that a credence function $c$ satisfies the Principal Principle iff $c$ is in the closed convex hull of the set of possible chance functions.

(2) Second, we prove that:

(2I) If a probability function $c$ lies outside the closed convex hull of a set of probability functions $\mathcal{X}$, then there is a book of bets and a price such that the expected total net gain from that book at that price by the lights of $c$ is positive, while the expected total net gain from that book at that price by the lights of each $p$ in $\mathcal{X}$ is negative.

(2II) If a probability function $c$ lies inside the closed convex hull of a set of probability functions $\mathcal{X}$, then there is no book of bets and a price such that the expected total net gain from that book at that price by the lights of $c$ is positive, while the expected total net gain from that book at that price by the lights of each $p$ in $\mathcal{X}$ is negative.

Here's the proof of (2), which I lift from my recent justification of linear pooling -- the same technique is applicable since the Principal Principle essentially says that you should set your credences by applying linear pooling to the possible objective chances.

First:
  • Let $\Omega$ be the set of possible worlds
  • Let $\mathcal{F} = \{X_1, \ldots, X_n\}$ be the set of propositions over which our probability functions are defined. So each $X_i$ is a subset of $\Omega$.
Now:
  • We represent a probability function $p$ defined on $\mathcal{F}$ as a vector in $\mathbb{R}^n$, namely, $p = \langle p(X_1), \ldots, p(X_n)\rangle$.
  • Given a proposition $X$ in $\mathcal{F}$ and a stake $S$ in $\mathbb{R}$, we define the bet $B_{X, S}$ as follows: $$B_{X, S}(\omega) =  \left \{ \begin{array}{ll}
    S & \mbox{if } \omega \in X \\
    0 & \mbox{if } \omega \not \in X
    \end{array}
    \right.$$ So $B_{X, S}$ pays out $S$ if $X$ is true and $0$ if $X$ is false.
  • We represent the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ as a vector in $\mathbb{R}^n$, namely, $S = \langle S_1, \ldots, S_n\rangle$. 

Lemma 1
If $p$ is a probability function on $\mathcal{F}$, the expected payoff of the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ by the lights of $p$ is $$S \cdot p = \sum^n_{i=1} p(X_i)S_i$$
Lemma 2
Suppose $c$ is a probability function on $\mathcal{F}$, $\mathcal{X}$ is a set of probability functions on $\mathcal{F}$, and $\mathcal{X}^+$ is the closed convex hull of $\mathcal{X}$. Then, if $c \not \in \mathcal{X}^+$, then there is a vector $S$ and $\varepsilon > 0$ such that, for all $p$ in $\mathcal{X}$, $$S \cdot p < S \cdot c - \varepsilon$$
Proof of Lemma 2.  Suppose $c \not \in \mathcal{X}^+$. Then let $c^*$ be the closest point in $\mathcal{X}^+$ to $c$ (which exists, since $\mathcal{X}^+$ is closed), and let $S = c - c^*$. Since $c \not \in \mathcal{X}^+$, we have $||S|| > 0$. Now, since $\mathcal{X}^+$ is convex and $c^*$ is the point in it closest to $c$, for any $p$ in $\mathcal{X}$ the angle between $S$ and $p - c^*$ is not acute, and so $S \cdot (p - c^*) \leq 0$. Hence $$S \cdot p \leq S \cdot c^* = S \cdot c - S \cdot (c - c^*) = S \cdot c - ||S||^2$$ So, setting $\varepsilon = \frac{1}{2}||S||^2 > 0$, we have $S \cdot p < S \cdot c - \varepsilon$, for all $p$ in $\mathcal{X}$.
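Here is a small numerical illustration of Lemma 2 and the resulting bet construction; the chance functions and the credence function below are invented for the example:

```python
import numpy as np

# Invented example: two possible chance functions over (A, not-A), and a
# probabilistic credence function c outside their convex hull (a segment).
chances = [np.array([0.2, 0.8]), np.array([0.6, 0.4])]
c = np.array([0.9, 0.1])

# Closest point c* in the hull to c: project onto the segment, as in the proof.
p0, p1 = chances
d = p1 - p0
t = np.clip(np.dot(c - p0, d) / np.dot(d, d), 0.0, 1.0)
c_star = p0 + t * d
S = c - c_star          # the stakes of the separating book of bets

# Lemma 2: S . p < S . c for every possible chance function p.
for p in chances:
    assert np.dot(S, p) < np.dot(S, c)

# Any price strictly between max_p S.p and S.c is favourable by the lights
# of c but unfavourable by the lights of every chance function -- this is (2I).
price = (max(np.dot(S, p) for p in chances) + np.dot(S, c)) / 2
print(np.dot(S, c) > price > max(np.dot(S, p) for p in chances))   # True
```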

We now derive (2I) and (2II) from Lemmas 1 and 2:

Let $\mathcal{X}$ be the set of possible objective chance functions. If $c$ violates the Principal Principle, then $c$ is not in $\mathcal{X}^+$. Thus, by Lemma 2, there is a book of bets $\sum^n_{i=1} B_{X_i, S_i}$ and $\varepsilon > 0$ such that, for any objective chance function $p$ in $\mathcal{X}$, $S \cdot p < S \cdot c - \varepsilon$. By Lemma 1, $S \cdot p$ is the expected payout of the book of bets by the lights of $p$, while $S \cdot c$ is the expected payout of the book of bets by the lights of $c$. Now, suppose we were to offer an agent with credence function $c$ the book of bets $\sum^n_{i=1} B_{X_i, S_i}$ for the price of $S \cdot c - \frac{\varepsilon}{2}$. Then accepting at that price would have positive expected net gain by the lights of $c$, but negative expected net gain by the lights of each $p$ in $\mathcal{X}$. This gives (2I).

(2II) then holds because, when $c$ is in the closed convex hull of $\mathcal{X}$, its expectation of a random variable is in the closed convex hull of the expectations of that random variable by the lights of the probability functions in $\mathcal{X}$. Thus, if the expectation of a random variable is negative by the lights of all the probability functions in $\mathcal{X}$, then its expectation by the lights of $c$ is not positive.


Monday, 1 January 2018

A Dutch Book argument for linear pooling

Often, we wish to aggregate the probabilistic opinions of different agents. They might be experts on the effects of housing policy on people sleeping rough, for instance, and we might wish to produce from their different probabilistic opinions an aggregate opinion that we can use to guide policymaking. Methods for undertaking such aggregation are called pooling operators. They take as their input a sequence of probability functions $c_1, \ldots, c_n$, all defined on the same set of propositions, $\mathcal{F}$. And they give as their output a single probability function $c$, also defined on $\mathcal{F}$, which is the aggregate of $c_1, \ldots, c_n$. (If the experts have non-probabilistic credences, or if they have credences defined on different sets of propositions or events, problems arise -- I've written about these here and here.) Perhaps the simplest are the linear pooling operators. Given non-negative weights $\alpha_1, \ldots, \alpha_n$ that sum to 1, one for each probability function to be aggregated, the linear pool of $c_1, \ldots, c_n$ with these weights is: $c = \alpha_1 c_1 + \ldots + \alpha_n c_n$. So the probability that the aggregate assigns to a proposition (or event) is the weighted average of the probabilities that the individuals assign to that proposition (event) with the weights $\alpha_1, \ldots, \alpha_n$.
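As a quick sketch, a linear pooling operator might be implemented like this (the function name and the example numbers are mine, not from the post):

```python
# Minimal sketch of a linear pooling operator; names and numbers are mine.

def linear_pool(credence_functions, weights):
    """Pool proposition-by-proposition: a weighted average of the experts."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1) < 1e-9
    propositions = credence_functions[0].keys()
    return {X: sum(w * cf[X] for w, cf in zip(weights, credence_functions))
            for X in propositions}

# Two experts' credences in 'rain', pooled with weights 0.25 and 0.75:
c1 = {'rain': 0.8}
c2 = {'rain': 0.4}
pooled = linear_pool([c1, c2], [0.25, 0.75])
print(pooled['rain'])   # 0.25*0.8 + 0.75*0.4, i.e. 0.5 up to rounding
```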

Linear pooling has had a hard time recently. Elkin and Wheeler reminded us that linear pooling almost never preserves unanimous judgments of independence; Russell et al. reminded us that it almost never commutes with Bayesian conditionalization; and Bradley showed that aggregating a group of experts using linear pooling almost never gives the same result as you would obtain from updating your own probabilities in the usual Bayesian way when you learn the probabilities of those experts. I've tried to defend linear pooling against the first two attacks here. In that paper, I also offer a positive argument in favour of that aggregation method: I argue that, if your aggregate is not a result of linear pooling, there will be an alternative aggregate that each expert expects to be more accurate than yours; if your aggregate is a result of linear pooling, this can't happen. Thus, my argument is a non-pragmatic, accuracy-based argument, in the same vein as Jim Joyce's non-pragmatic vindication of probabilism. In this post, I offer an alternative, pragmatic, Dutch book-style defence, in the same vein as the standard Ramsey-de Finetti argument for probabilism.

My argument is based on the following fact: if your aggregate probability function is not a result of linear pooling, there will be a book of bets that the aggregate will consider fair but on which each expert will expect to lose money (or utility); if your aggregate is a result of linear pooling, this can't happen. Since one of the things we might wish to use an aggregate to do is to help us make communal decisions, a putative aggregate cannot be considered acceptable if it will lead us to make a binary choice one way when every expert agrees that it should be made the other way. Thus, we should aggregate credences using a linear pooling operator.

We now prove the mathematical fact behind the argument, namely, that if $c$ is not a linear pool of $c_1, \ldots, c_n$, then there is a book of bets that $c$ will consider fair, and yet each $c_i$ will expect to lose money on it; the converse is straightforward.

Suppose $\mathcal{F} = \{X_1, \ldots, X_m\}$. Then:
  • We can represent a probability function $c$ on $\mathcal{F}$ as a vector in $\mathbb{R}^m$, namely, $c = \langle c(X_1), \ldots, c(X_m)\rangle$.
  • We can also represent a book of bets on the propositions in $\mathcal{F}$ by a vector in $\mathbb{R}^m$, namely, $S = \langle S_1, \ldots, S_m\rangle$, where $S_i$ is the stake of the bet on $X_i$, so that the bet on $X_i$ pays out $S_i$ dollars (or utiles) if $X_i$ is true and $0$ dollars (or utiles) if $X_i$ is false.
  • An agent with probability function $c$ will be prepared to pay $c(X_i)S_i$ for a bet on $X_i$ with stake $S_i$, and thus will be prepared to pay $S \cdot c = c(X_1)S_1 + \ldots + c(X_m)S_m$ dollars (or utiles) for the book of bets with stakes $S = \langle S_1, \ldots, S_m\rangle$. (As is usual in Dutch book-style arguments, we assume that the agent is risk neutral.)
  • This is because $S \cdot c$ is the expected pay out of the book of bets with stakes $S$ by the lights of probability function $c$.
Now, suppose $c$ is not a linear pool of $c_1, \ldots, c_n$. So $c$ lies outside the convex hull of $\{c_1, \ldots, c_n\}$. Let $c^*$ be the closest point to $c$ inside that convex hull. And let $S = c - c^*$. Then the angle $\theta$ between $S$ and $c_i - c$ is obtuse and thus $\mathrm{cos}\, \theta < 0$ (see diagram below). So, since $S \cdot (c_i - c) = ||S||\, ||c_i - c|| \mathrm{cos}\, \theta$ and $||S||, ||c_i - c|| > 0$ (since $c$ lies outside the hull while each $c_i$ lies inside it), we have $S \cdot (c_i - c) < 0$. And hence $S \cdot c_i < S \cdot c$. But recall:
  • $S \cdot c$ is the amount that the aggregate $c$ is prepared to pay for the book of bets with stakes $S$; and 
  • $S \cdot c_i$ is the expert $i$'s expected pay out of the book of bets with stakes $S$.
Thus, each expert will expect that book of bets to pay out less than $c$ will be willing to pay for it.
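The construction in this proof can be illustrated numerically; the expert credences and the putative aggregate below are invented for the example:

```python
import numpy as np

# Invented example: two experts' credence functions over (X1, X2), and a
# putative aggregate c that is not a linear pool of them.
experts = [np.array([0.3, 0.7]), np.array([0.5, 0.5])]
c = np.array([0.8, 0.2])

# As in the proof: project c onto the convex hull of the experts (a segment),
# and let the stakes be S = c - c*.
e0, e1 = experts
d = e1 - e0
t = np.clip(np.dot(c - e0, d) / np.dot(d, d), 0.0, 1.0)
c_star = e0 + t * d
S = c - c_star

# The aggregate pays S . c for the book with stakes S, but every expert's
# expected payout S . c_i falls short of that price: each expects a net loss.
for ci in experts:
    assert np.dot(S, ci) < np.dot(S, c)
print("every expert expects a loss at the price the aggregate considers fair")
```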