## Tuesday, 25 March 2014

### Buchak on risk and rationality II: the virtues of expected utility theory

In a previous post, I gave an overview of the alternative to expected utility theory that Lara Buchak formulates and defends in her excellent new book, Risk and Rationality (Buchak 2013).  Buchak dubs the alternative risk-weighted expected utility theory.  It permits agents to have risk-sensitive attitudes.  In this post and the next one, I wish to argue that risk-weighted expected utility theory is right about the constraints that rationality places on our external attitudes, but wrong about the way our internal attitudes ought to combine to determine those external attitudes (for the internal/external attitude terminology, as well as other terminology in this post, please see the previous post):  that is, I agree with the axioms Buchak demands our preferences satisfy, but I disagree with the way she combines probabilities, utilities, and risk attitudes to determine those preferences.  I wish to argue that, in fact, we ought to combine our internal attitudes in exactly the way that expected utility theory suggests.  In order to maintain both of these positions, I will have to redescribe the outcomes to which we assign utilities.  I do this in the next post.  In this post, I want to argue that all the effort that we will go to in order to effect this redescription is worth it.  That is, I want to argue that there are good reasons for thinking that an agent's internal attitudes ought to be combined to give her external attitudes in exactly the way prescribed by expected utility theory. (These three posts will together provide the basis for my commentary on Buchak's book at the Pacific APA this April.)

The disagreement between Buchak and the expected utility theorist is two-fold: the expected utility theorist posits two relevant internal attitudes where Buchak posits three; and the expected utility theorist endorses the EU rule of combination where Buchak endorses the REU rule of combination.  How can one tell between different rules of combination?  It is commonly assumed that representation theorems help us to do this.  But this is a mistake.  A representation theorem presupposes a rule of combination: it demonstrates that, relative to a particular rule of combination, for any agent whose preferences satisfy certain axioms, there are internal attitudes with certain properties (unique to some extent) such that these internal attitudes determine the preferences in line with that rule of combination.  Thus, Savage's representation theorem runs as follows:

Theorem (Savage) If $\succeq$ satisfies the Savage axioms, there is a unique probability function $p$ and unique-up-to-affine-transformation utility function $u$ such that $\succeq$ is determined by $p$ and $u$ in line with the EU rule of combination.

Now, consider the following alternative to the EU rule of combination (Zynda 2000):

The EU$^2$ rule of combination Suppose $f = \{E_1, x_1; \ldots; E_n, x_n\}$.  Then let
$$EU^2_{p, u}(f) = \sum^n_{i=1} p(E_i)^2u(x_i)^2$$
Then it ought to be that
$$f \succeq g \Longleftrightarrow EU^2_{p, u}(f) \geq EU^2_{p, u}(g)$$

Then we have the following straightforward corollary to Savage's theorem:

Theorem If $\succeq$ satisfies the Savage axioms, there is a unique probability$^2$ function $p$ and unique-up-to-affine$^2$-transformation utility function $u$ such that $\succeq$ is determined by $p$ and $u$ in line with the EU$^2$ rule of combination (where $p$ is a probability$^2$ function iff $p^2$ is a probability function, and $u$ is unique-up-to-affine$^2$-transformation iff $u^2$ is unique-up-to-affine-transformation).

Thus, Savage's theorem is agnostic between these two rules of combination.  So how, then, are we to tell between EU and EU$^2$, or between EU and REU?
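The agnosticism can be checked numerically. The following sketch (with illustrative numbers of my own, and assuming nonnegative utilities so that square roots are defined) shows that if $(p, u)$ represents a preference ordering via the EU rule, then the probability$^2$ function $\sqrt{p}$ paired with $\sqrt{u}$ assigns every act exactly the same value via the EU$^2$ rule, and so represents the very same ordering:

```python
# Illustration: an EU representation by (p, u) is also an EU^2 representation
# by (sqrt(p), sqrt(u)).  All numbers are made up for the example; utilities
# are assumed nonnegative so that square roots exist.
import math

def eu(act, p, u):
    """EU rule: sum of p(E_i) * u(x_i) over the act's event/outcome pairs."""
    return sum(p[e] * u[x] for e, x in act)

def eu2(act, p, u):
    """EU^2 rule: sum of p(E_i)^2 * u(x_i)^2."""
    return sum(p[e] ** 2 * u[x] ** 2 for e, x in act)

p = {"E1": 0.3, "E2": 0.7}
u = {"win": 4.0, "lose": 1.0}

# "Probability^2" counterpart: sqrt_p is a probability^2 function, since
# its square is the probability function p.
sqrt_p = {e: math.sqrt(v) for e, v in p.items()}
sqrt_u = {x: math.sqrt(v) for x, v in u.items()}

f = [("E1", "win"), ("E2", "lose")]
g = [("E1", "lose"), ("E2", "win")]

# The EU value of each act under (p, u) coincides with its EU^2 value under
# (sqrt_p, sqrt_u), so the two representations order all acts identically.
print(eu(f, p, u), eu2(f, sqrt_p, sqrt_u))
print(eu(g, p, u), eu2(g, sqrt_p, sqrt_u))
```

Since every term $p(E_i)\,u(x_i)$ equals $\bigl(\sqrt{p(E_i)}\bigr)^2\bigl(\sqrt{u(x_i)}\bigr)^2$, no pattern of preferences can discriminate between the two rules.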

Here is an argument in favour of the EU rule of combination.  It draws heavily on an argument due to Bruno de Finetti (p. 136, de Finetti 1974):
1. It ought to be that an agent weakly prefers one option to another iff the agent's estimate of the utility of the first is at least her estimate of the utility of the second.
2. (de Finetti) An agent's estimate of a quantity ought to be her subjective expectation of it.
3. Therefore, $\succeq$ ought to be determined by the EU rule of combination.

The first premise is supposed to be intuitively plausible.  Suppose I desire only wine: obtaining as much of it as possible is my only goal.  And suppose my estimate of the quantity of wine in the bottle on my left is greater than my estimate of the quantity of wine in the bottle on my right.  And suppose that, nonetheless, I weakly prefer the bottle on my right.  Then it would seem I am irrational.  Likewise, if I desire only utility, then I would be irrational if my estimate of the utility of an act $g$ were higher than my estimate of the utility of another act $f$ and yet I were to weakly prefer $f$ to $g$.  This is the argument for premise (1).

The second premise is based on a mathematical argument together with a plausible claim about the goodness of estimates.  Estimates, so the argument goes, are better the closer they are to the true quantity they estimate.  Thus, if I estimate that the volume of wine remaining in my glass is 73ml and my friend estimates that it is 79ml and in fact it is 80ml, then her estimate is better than mine.  According to this argument, this claim extends to sets of estimates as well:  a series of estimates of a series of quantities is better the closer it lies to the series of true values of those quantities.  But how do we measure the distance between a series of estimates and the true values of those estimates?  According to de Finetti, we represent a series of estimates of a series of quantities $X_1$, $\ldots$, $X_n$ by a vector $\vec x = (x_1, \ldots, x_n)$ where $x_i$ is the estimate of $X_i$; and we represent a state of the world $s$ by a vector $\vec s = (X_1(s), \ldots, X_n(s))$ where $X_i(s)$ is the value of $X_i$ in  state $s$.  Then we measure the distance between $\vec x$ and $\vec s$ as the square of the Euclidean distance between them.  Thus, the badness of an estimate $\vec x$ at a state of the world $s$ is given by:
$$SED(\vec x, \vec s) = \sum^n_{i=1} (x_i - X_i(s))^2$$
(The same argument can be run for a broad range of distance measures: indeed, any of the so-called Bregman divergences will do the job (Bregman 1967) (see this post). But I will consider only SED for the sake of simplicity.)
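As a quick sketch of the measure (using the wine-glass numbers from above), SED simply sums the squared componentwise errors of a vector of estimates:

```python
# Minimal sketch of the squared-Euclidean-distance measure of how bad a
# vector of estimates is at a state.  Numbers are from the wine-glass
# example: the true volume is 80ml.
def sed(estimates, true_values):
    """Squared Euclidean distance between estimates and the true values."""
    return sum((x - v) ** 2 for x, v in zip(estimates, true_values))

print(sed([73], [80]))  # my estimate:       (73 - 80)^2 = 49
print(sed([79], [80]))  # my friend's:       (79 - 80)^2 = 1, so hers is better
```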

Now we said above that estimates are better the closer they lie to the true value of the quantity that they estimate.  But this sounds a lot like something we might want to say about credences:  a credence in a true proposition is better the closer it lies to the maximal credence, which is 1; and a credence in a false proposition is better the closer it lies to the minimal credence, which is 0.  Thus, we can conceive of a credence in a proposition $X$ as an estimate of a particular quantity, namely, the so-called *indicator quantity of $X$*:  this is the quantity (denoted $I_X$) such that $I_X(s) = 1$ if $X$ is true in state $s$ and $I_X(s) = 0$ if $X$ is false in state $s$.  Thus, suppose that an agent has credences $\vec p = (p_1, \ldots, p_n)$ in states $s_1$, $\ldots$, $s_n$.  And she has estimates $\vec x = (x_1, \ldots, x_m)$ in quantities $X_1$, $\ldots$, $X_m$.  Then she has estimates $\vec x \vec p = (x_1, \ldots, x_m, p_1, \ldots, p_n)$ in quantities $X_1$, $\ldots$, $X_m$, $I_{s_1}$, $\ldots$, $I_{s_n}$.  And thus the badness of those estimates in a state $s$ is given by:
$$B(\vec x \vec p, \vec s) = \sum^m_{i=1} (x_i - X_i(s))^2 + \sum^n_{i=1} (p_i - I_{s_i}(s))^2$$
Now, we say that an agent's estimates $\vec x \vec p$ are expectational if:
1. $\sum_i p_i = 1$
2. $x_i = \sum^n_{j=1} p_j X_i(s_j)$, for each $i = 1, \ldots, m$
Then de Finetti proved the following theorem:

Theorem (de Finetti)
1. If $\vec x \vec p$ is not expectational, then there is an expectational $\vec x'\vec p'$ such that $$SED(\vec x' \vec p', \vec s) < SED(\vec x \vec p, \vec s)$$ for all states $s$ in $S$.
2. If $\vec x \vec p$ is expectational, then there is no $\vec x'\vec p' \neq \vec x \vec p$ such that $$SED(\vec x' \vec p', \vec s) \leq SED(\vec x \vec p, \vec s)$$ for all states $s$ in $S$.
That is, if one's estimates are not expectational, then there are expectational estimates that are closer to the true values of the quantities being estimated at any state of the world, and thus are better qua estimates; and if one's estimates are expectational, then this never happens.  This is the argument for premise (2).
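The dominance half of the theorem can be checked numerically. The sketch below (with illustrative numbers of my own: two states, one quantity $X$ with $X(s_1) = 0$ and $X(s_2) = 10$) starts from a non-expectational vector of credences and an estimate, projects it onto the set of expectational vectors (which in this two-state case is just the line segment between the two state vectors), and confirms that the resulting expectational estimates are strictly closer to the truth in every state:

```python
# Numerical check of de Finetti's dominance theorem in a toy case with two
# states and one quantity X, where X(s1) = 0 and X(s2) = 10.  All numbers
# are illustrative assumptions, not taken from de Finetti.
def sed(v, w):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, w))

# State vectors (X(s), I_{s1}(s), I_{s2}(s)) for each state s.
s1 = (0.0, 1.0, 0.0)
s2 = (10.0, 0.0, 1.0)

# Non-expectational estimates: the credences sum to 1.2, and the estimate
# of X is not the corresponding expectation.
a = (4.0, 0.7, 0.5)

# Project a onto the segment between s1 and s2: in this two-state case the
# expectational vectors are exactly the convex combinations of s1 and s2.
d = tuple(y - x for x, y in zip(s1, s2))
t = sum((ai - xi) * di for ai, xi, di in zip(a, s1, d)) / sum(di ** 2 for di in d)
t = min(1.0, max(0.0, t))
b = tuple(xi + t * di for xi, di in zip(s1, d))

print([round(x, 6) for x in b])  # [4.0, 0.6, 0.4]: credences sum to 1,
                                 # and 0.6 * 0 + 0.4 * 10 = 4
print(sed(b, s1) < sed(a, s1))   # True: strictly better in state s1
print(sed(b, s2) < sed(a, s2))   # True: strictly better in state s2
```

The projection is expectational and strictly closer to the truth whichever state is actual; that is what it means for the non-expectational estimates to be dominated.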

This, then, is the argument in favour of the EU rule of combination over the EU$^2$ rule and the REU rule.  Thus, my concern with Buchak's rule of combination is this: an agent who determines her preferences on the basis of the REU rule does not align those preferences with her estimates of the utilities of the outcomes of the acts.  Or so it seems.  As we will see in the next post, there is a way of redescribing the outcomes of each act so that any agent with preferences Buchak permits can be taken to have determined them using the EU rule of combination.  Hopefully, the argument of this post has at least motivated this redescription strategy.

## References

• Bregman, L. M. (1967). The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200–217.
• Buchak, L. (2013). Risk and Rationality. Oxford University Press.
• de Finetti, B. (1974). Theory of Probability, volume I. John Wiley & Sons, New York.
• Zynda, L. (2000). Representation Theorems and Realism about Degrees of Belief. Philosophy of Science, 67(1):45–69.