Monday, 7 April 2014

Buchak on risk and rationality III: the redescription strategy

This is the third in a series of three posts in which I rehearse what I hope to say at the Author Meets Critics session for Lara Buchak's tremendous* new book Risk and Rationality at the Pacific APA in a couple of weeks.  The previous two posts are here and here.  In the first post, I gave an overview of risk-weighted expected utility theory, Buchak's alternative to expected utility theory.  In the second post, I gave a prima facie reason for worrying about any departure from expected utility theory: if an agent violates expected utility theory (perhaps whilst exhibiting the sort of risk-sensitivity that Buchak's theory permits), then her preferences amongst the acts don't line up with her estimates of the value of those acts.  In this post, I want to consider a way of reconciling the preferences Buchak permits with the normative claims of expected utility theory.

I will be making a standard move.  I will be redescribing the space of outcomes in such a way that we can understand any Buchakian agent as setting her preferences over acts in line with her expectation (and thus estimate) of the value of each act.


Redescribing the outcomes


Moving from expected utility theory to risk-weighted expected utility theory involves an agent moving from evaluating an act in the way illustrated in Figure 1 below to evaluating it in the way illustrated in Figure 2 below.

Figure 1: As in previous posts, $h = \{u_1, F_1; u_2, F_2; u_3, F_3\}$.  The expected utility $EU_{p, u}(h)$ of $h$ is given by the dark grey area.  It is obtained by summing the areas of the three horizontal rectangles.  Their areas are $p(F_1 \vee F_2 \vee F_3)u_1 = u_1$, $p(F_2 \vee F_3)(u_2 - u_1)$, and $p(F_3)(u_3 - u_2)$.

Figure 2: The risk-weighted expected utility $REU_{r_2, p, u}(h)$ of $h$ is given by the dark grey area, where $r_2(x) := x^2$.

In order to begin to see how we can redescribe the REU rule of combination as an instance of the EU rule of combination, we reformulate the REU rule in the way illustrated in Figure 3.

Figure 3: Again, the risk-weighted expected utility $REU_{r_2, p, u}(f)$ of $f$ is given by the grey area, where $r_2(x) = x^2$.


Thus, we can reformulate $REU_{r, p, u}(f)$ as follows:$$REU_{r, p, u}(f) = \sum^{k-1}_{j=1} (r(p(F_j \vee \ldots \vee F_k)) - r(p(F_{j+1} \vee \ldots \vee F_k)))u_j + r(p(F_k))u_k$$And we can reformulate this as follows: $REU_{r, p, u}(f) =$
$$\sum^{k-1}_{j=1} p(F_j) \frac{r(p(F_j \vee \ldots \vee F_k)) - r(p(F_{j+1} \vee \ldots \vee F_k))}{p(F_j \vee \ldots \vee F_k) - p(F_{j+1} \vee \ldots \vee F_k)}u_j + p(F_k)\frac{r(p(F_k))}{p(F_k)}u_k$$since $p(F_j \vee \ldots \vee F_k) - p(F_{j+1} \vee \ldots \vee F_k) = p(F_j)$.

Now, suppose we let$$
s_j = \left \{ \begin{array}{ll}
\frac{r(p(F_j \vee \ldots \vee F_k)) - r(p(F_{j+1} \vee \ldots \vee F_k))}{p(F_j \vee \ldots \vee F_k) - p(F_{j+1} \vee \ldots \vee F_k)} & \mbox{if } j = 1, \ldots, k-1\\
& \\
\frac{r(p(F_j))}{p(F_j)} & \mbox{if } j = k
\end{array}
\right.$$
Then we have:$$REU_{r, p, u}(f) = \sum^k_{j=1} p(F_j)s_ju_j$$
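The equivalence of the two formulations can be checked numerically.  What follows is only a minimal sketch: the function names are my own, the utilities $u_1, u_2, u_3$ are hypothetical placeholders, and I assume that the outcomes are listed from worst to best, that every $p(F_j)$ is positive, and that $r(0) = 0$ (so that the final term $r(p(F_k))u_k$ can be handled in the same way as the others).

```python
from math import isclose, sqrt

def tail_probs(p):
    """Return [p(F_1 v ... v F_k), ..., p(F_k), 0.0] for event probabilities p."""
    return [sum(p[j:]) for j in range(len(p))] + [0.0]

def reu_rank_form(r, p, u):
    """REU as the sum over j of (r(tail_j) - r(tail_{j+1})) * u_j, assuming r(0) = 0."""
    t = tail_probs(p)
    return sum((r(t[j]) - r(t[j + 1])) * u[j] for j in range(len(p)))

def s_weights(r, p):
    """The factors s_j: the change in r over the change in tail probability, p(F_j)."""
    t = tail_probs(p)
    return [(r(t[j]) - r(t[j + 1])) / p[j] for j in range(len(p))]

def reu_weighted_form(r, p, u):
    """REU as the sum over j of p(F_j) * s_j * u_j."""
    return sum(pj * sj * uj for pj, sj, uj in zip(p, s_weights(r, p), u))

# Event probabilities chosen so the tails are 1, 0.7, 0.4; utilities are hypothetical.
p = [0.3, 0.3, 0.4]
u = [1.0, 2.0, 3.0]
for name, r in [("r_2", lambda x: x ** 2), ("r_0.5", sqrt)]:
    a, b = reu_rank_form(r, p, u), reu_weighted_form(r, p, u)
    print(name, round(a, 4), round(b, 4), isclose(a, b))
```

With the identity function in place of $r$, every $s_j$ is 1 and both functions return the ordinary expected utility.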

Reformulating Buchak's rule of combination in this way suggests two accounts of it.  On the first, utilities attach ultimately to outcomes $x_i$, and they are weighted not by an agent's probabilities but rather by a function of those probabilities that encodes the agent's attitude to risk (given by a risk function).  On this account, we group $p(F_j)s_j$ together to give this weighting.  Thus, we assume that this weighting has a particular form: it is obtained from a probability function $p$ and a risk function $r$ to give $p(F_j)s_j$; this weighting then attaches to $u_j$ to give $(p(F_j)s_j)u_j$.

On the second account, probabilities do provide the weightings for utility, as in the EU rule of combination, but utilities attach ultimately to outcome-act pairs $(x_i, f)$.  On this account, we group $s_ju_j$ together to give this utility; this utility is then weighted by $p(F_j)$ to give $p(F_j)(s_ju_j)$.  That is, we say that an agent's utility function is defined on a new outcome space:  it is not defined on a set of outcomes $\mathcal{X}$, but on a particular subset of $\mathcal{X} \times \mathcal{A}$, which we will call $\mathcal{X}^*$.  $\mathcal{X}^*$ is the set of outcome-act pairs $(x_i, f)$ such that $x_i$ is a possible outcome of $f$:  that is, $\mathcal{X}^* = \{(x, f) : \exists s \in \mathcal{S}(f(s) = x)\}$.  Now, just as the first account assumed that the weightings of the utilities had a certain form---namely, that they are generated by a risk function and probability function in a certain way---so this account assumes something about the form of the new utility function $u^*$ on $\mathcal{X}^*$:  we assume that a certain relation holds between the utility that $u^*$ assigns to outcome-act pairs in which the act is the constant act over the outcome and the utility that $u^*$ assigns to outcome-act pairs in which this is not the case.  We assume that the following holds:$$u^*(x, f) = s_ju^*(x, \overline{x})$$where $F_j$ is the event in which $f$ gives $x$.  If a utility function on $\mathcal{X}^*$ satisfies this property, we say that it encodes attitudes to risk relative to risk function $r$.  Thus, on this account an agent evaluates an act as follows:
  • She begins with a risk function $r$ and a probability function $p$.
  • She then assigns utilities to all constant outcome-act pairs $(x, \overline{x})$, defining $u^*$ on $\overline{\mathcal{X}}^* = \{(x, \overline{x}) : x \in \mathcal{X}\} \subseteq \mathcal{X}^*$.
  • Finally, she extends $u^*$ to cover all outcome-act pairs in $\mathcal{X}^*$ in the unique way required in order to make $u^*$ a utility function that encodes attitudes to risk relative to $r$. That is, she obtains $u^*(x, f)$ by weighting $u^*(x, \overline{x})$ in a certain way that is determined by the agent's probability function and her attitudes to risk.
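The three steps just listed can be sketched in code.  This is an illustration under stated assumptions rather than anything taken from Buchak's book: the names `extend_u_star` and `expected_utility` are mine, an act is represented simply as a map from outcome labels to the probability of the event that yields them, the constant-pair utilities and the split of probability between $x_2$ and $x_3$ are hypothetical, and I assume $r(0) = 0$ and $r(1) = 1$.

```python
from collections import defaultdict
from math import isclose

def extend_u_star(r, act, u_const):
    """Extend u* from constant pairs (x, x-bar) to the pairs (x, f) of one act f.

    act     : dict mapping outcome label -> probability that f yields it
    u_const : dict mapping outcome label -> u*(x, x-bar)
    Returns a dict mapping outcome label -> u*(x, f) = s_j * u*(x, x-bar).
    """
    # Amalgamate outcomes with equal constant-pair utility into a single event F_j.
    event_prob = defaultdict(float)
    for x, px in act.items():
        event_prob[u_const[x]] += px

    # Work down from the best utility level, tracking the tail probability above it.
    s_by_level = {}
    tail_above = 0.0
    for level in sorted(event_prob, reverse=True):
        tail = tail_above + event_prob[level]
        s_by_level[level] = (r(tail) - r(tail_above)) / event_prob[level]
        tail_above = tail

    return {x: s_by_level[u_const[x]] * u_const[x] for x in act}

def expected_utility(act, u_star):
    """Ordinary expectation of u* over the outcome-act pairs of the act."""
    return sum(act[x] * u_star[x] for x in act)

# Hypothetical inputs: tails 1, 0.7, 0.4; x2 and x3 share a utility level.
h = {"x1": 0.3, "x2": 0.1, "x3": 0.2, "x4": 0.4}
u_const = {"x1": 1.0, "x2": 2.0, "x3": 2.0, "x4": 3.0}
u_star_h = extend_u_star(lambda x: x ** 2, h, u_const)
print(u_star_h, expected_utility(h, u_star_h))
```

Applied to a constant act, the extension returns the constant-pair utility unchanged, since the single factor is $(r(1) - r(0))/1 = 1$; and the expectation of the extended $u^*$ over an act coincides with that act's risk-weighted expected utility.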
Let's see this in action in our example act $h$; we'll consider $h$ from the point of view of two risk functions, $r_2(x) = x^2$ and $r_{0.5}(x) = \sqrt{x}$.  Recall: $r_2$ is a risk-averse risk function; $r_{0.5}$ is risk-seeking.  We begin by assigning utility to all constant outcome-act pairs $(x, \overline{x})$:


Then we do the same trick as above and amalgamate the outcome-act pairs with the same utility:  thus, again, $F_1$ is the event in which the act gives outcome-act pair $(x_1, h)$, $F_2$ is the event in which it gives $(x_2, h)$ or $(x_3, h)$, and $F_3$ the event in which it gives $(x_4, h)$.  Next, we assign utilities to $(x_1, h)$, $(x_2, h)$, $(x_3, h)$, and $(x_4, h)$ in such a way as to make $u^*$ encode attitudes to risk relative to the risk function $r$.

Let's start by considering the utility of $(x_1, h)$, the lowest outcome of $h$.  Suppose our risk function is $r_2$; then $u^*(x_1, h) :=$ $$\frac{r_2(p(F_1 \vee F_2 \vee F_3)) - r_2(p(F_2 \vee F_3))}{p(F_1 \vee F_2 \vee F_3) - p(F_2 \vee F_3)}u^*(x_1, \overline{x_1}) = \frac{r_2(1) - r_2(0.7)}{1-0.7}u^*(x_1, \overline{x_1}) = 1.7u^*(x_1, \overline{x_1})$$And now suppose our risk function is $r_{0.5}$; then $u^*(x_1, h) :=$ $$\frac{r_{0.5}(p(F_1 \vee F_2 \vee F_3)) - r_{0.5}(p(F_2 \vee F_3))}{p(F_1 \vee F_2 \vee F_3) - p(F_2 \vee F_3)}u^*(x_1, \overline{x_1}) = \frac{r_{0.5}(1) - r_{0.5}(0.7)}{1-0.7}u^*(x_1, \overline{x_1}) \approx 0.54u^*(x_1, \overline{x_1})$$Thus, the risk-averse agent---that is, the agent with risk function $r_2$---values this lowest outcome $x_1$ as the result of $h$ more than she values the same outcome as the result of a certain gift of $x_1$, whereas the risk-seeking agent---with risk function $r_{0.5}$---values it less.  And this is true in general:  if $r(x) < x$ for all $0 < x < 1$, the lowest outcome as a result of $h$ will be more valuable than the same outcome as a result of the constant act on that outcome; if $r(x) > x$ for all $0 < x < 1$, it will be less valuable.

Next, let us consider the utility of $(x_4, h)$, the highest outcome of $h$.  Suppose her risk function is $r_2$; then$$u^*(x_4, h) := \frac{r_2(p(F_3))}{p(F_3)}u^*(x_4, \overline{x_4}) = \frac{r_2(0.4)}{0.4}u^*(x_4, \overline{x_4}) = 0.4u^*(x_4, \overline{x_4})$$
And now suppose her risk function is $r_{0.5}$; then$$u^*(x_4, h) := \frac{r_{0.5}(p(F_3))}{p(F_3)}u^*(x_4, \overline{x_4}) = \frac{r_{0.5}(0.4)}{0.4}u^*(x_4, \overline{x_4}) \approx 1.58u^*(x_4, \overline{x_4})$$Thus, the risk-averse agent---that is, the agent with risk function $r_2$---values this highest outcome $x_4$ as the result of $h$ less than she values the same outcome as the result of a certain gift of $x_4$, whereas the risk-seeking agent---with risk function $r_{0.5}$---values it more.  And, again, this is true in general:  if $r(x) < x$ for all $0 < x < 1$, the highest outcome as a result of $h$ will be less valuable than the same outcome as a result of the constant act on that outcome; if $r(x) > x$ for all $0 < x < 1$, it will be more valuable.

This seems right.  The risk-averse agent wants the highest utility, but also cares about how sure she was to obtain it.  Thus, if she obtains $x_1$ from $h$, she knows she was guaranteed to obtain at least this much utility from $h$ or from $\overline{x_1}$ (since $x_1$ is the lowest possible outcome of each act).  But she also knows that $h$ gave her some chance of getting more utility.  So she values $(x_1, h)$ more than $(x_1, \overline{x_1})$.  But if she obtains $x_4$ from $h$, she knows she was pretty lucky to get this much utility, while she knows that she would have been guaranteed that much if she had obtained $x_4$ from $\overline{x_4}$.  So she values $(x_4, h)$ less than $(x_4, \overline{x_4})$.  And similarly, but in reverse, for the risk-seeking agent.

Finally, let's consider the utilities of $(x_2, h)$ and $(x_3, h)$, the middle outcomes of $h$.  They will have the same value, so we need only consider the utility of $(x_2, h)$.  Suppose her risk function is $r_2$; then $u^*(x_2, h) :=$ $$\frac{r_2(p(F_2 \vee F_3)) - r_2(p(F_3))}{p(F_2 \vee F_3) - p(F_3)}u^*(x_2, \overline{x_2}) = \frac{r_2(0.7) - r_2(0.4)}{0.7-0.4}u^*(x_2, \overline{x_2}) = 1.1u^*(x_2, \overline{x_2})$$Thus, again, the agent with risk function $r_2$ assigns higher utility to obtaining $x_2$ as a result of $h$ than to obtaining $x_2$ as the result of $\overline{x_2}$.  But this is not generally true of risk-averse agents.  Consider, for instance, a more risk-averse agent, who has a risk function $r_3(x) := x^3$.  Then $u^*(x_2, h) :=$ $$\frac{r_3(p(F_2 \vee F_3)) - r_3(p(F_3))}{p(F_2 \vee F_3) - p(F_3)}u^*(x_2, \overline{x_2}) = \frac{r_3(0.7) - r_3(0.4)}{0.7-0.4}u^*(x_2, \overline{x_2}) = 0.93u^*(x_2, \overline{x_2})$$Again, this seems right.  As we said above, the risk-averse agent wants the highest utility, but she also cares about how sure she was to obtain it.  The less risk-averse agent---whose risk function is $r_2$---is sufficiently sure that $h$ would obtain for her at least the utility of $x_2$ and possibly more that she assigns higher value to getting $x_2$ as a result of $h$ than to getting it as a result of $\overline{x_2}$.  The more risk-averse agent---whose risk function is $r_3$---is not sufficiently sure of this, so she assigns it lower value.  And reversed versions of these points can be made for risk-seeking agents with risk functions $r_{0.5}$ and $r_{0.333}$, for instance.  Thus, we can see why it makes sense to demand of an agent that her utility function $u^*$ on $\mathcal{X}^*$ encodes attitudes to risk relative to a risk function in the sense that was made precise above.
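The weighting factors for $h$ can be recomputed directly from the three tail probabilities $1$, $0.7$ and $0.4$.  The sketch below is mine (the function name `risk_weights` is not Buchak's) and assumes only that $r(0) = 0$; it prints, for each risk function, the factors applied to the lowest, middle and highest outcomes of $h$ in that order.

```python
from math import sqrt

def risk_weights(r, tails):
    """Factors s_1, ..., s_k for an act whose tail probabilities are `tails`.

    tails = [p(F_1 v ... v F_k), ..., p(F_k)], in strictly decreasing order.
    Each factor is the change in r divided by the change in tail probability.
    """
    ext = list(tails) + [0.0]  # treat the empty tail as probability 0, with r(0) = 0
    return [
        (r(ext[j]) - r(ext[j + 1])) / (ext[j] - ext[j + 1])
        for j in range(len(tails))
    ]

TAILS = [1.0, 0.7, 0.4]  # p(F_1 v F_2 v F_3), p(F_2 v F_3), p(F_3)
RISK_FUNCTIONS = {
    "r_2": lambda x: x ** 2,
    "r_3": lambda x: x ** 3,
    "r_0.5": sqrt,
}

for name, r in RISK_FUNCTIONS.items():
    print(name, [round(s, 2) for s in risk_weights(r, TAILS)])
```

For $r_2$ the factors come out at roughly $1.7$, $1.1$ and $0.4$; for $r_3$ the middle factor drops to roughly $0.93$; and for $r_{0.5}$ the lowest and highest factors are roughly $0.54$ and $\sqrt{0.4}/0.4 \approx 1.58$.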

Now, just as we saw that Savage's representation theorem is agnostic between EU and EU$^2$, we can now see that Buchak's representation theorem is agnostic between a version of REU in which utilities attach to elements of $\mathcal{X}$, and a version of EU in which utilities attach to elements of $\mathcal{X}^*$.

Theorem 1 (Buchak) If $\succeq$ satisfies the Buchak axioms, there is a unique probability function $p$, unique risk function $r$, and unique-up-to-affine-transformation utility function $u$ on $\mathcal{X}$ such that $\succeq$ is determined by $r$, $p$, and $u$ in line with the REU rule of combination.

And we have the following straightforward corollary, just as we had in the Savage case above:

Theorem 2 If $\succeq$ satisfies the Buchak axioms, there is a unique probability function $p$ and unique-up-to-affine$^*$-transformation utility function $u^*$ on $\mathcal{X}^*$ that encodes attitudes to risk relative to a risk function such that $\succeq$ is determined by $p$ and $u^*$ in line with the EU rule of combination (where $u^*$ is unique-up-to-affine$^*$-transformation if $u^*|_{\overline{\mathcal{X}}^*}$ is unique-up-to-affine-transformation).

Thus, by redescribing the set of outcomes to which our agent assigns utilities, we can see how her preferences in fact line up with her estimates of the utility of her acts, as required by the de Finetti-inspired argument given in the previous post.

What's wrong with redescription?


Buchak considers the sort of redescription strategy of which this is an instance and raises what amount to two objections (Chapter 4, Buchak 2014).  (She raises a further objection against versions of the redescription strategy that attempt to identify certain outcome-act pairs to give a more coarse-grained outcome space; but these do not affect my proposal.)

The problem of proliferation


One potential problem that arises when one moves from assigning utilities to $\mathcal{X}$ to assigning them to $\mathcal{X}^*$ is that an element in the new outcome space is never the outcome of more than one act:  $(x, f)$ is a possible outcome of $f$ but not of $g \neq f$.  Thus, this outcome never appears in the expected utility (or indeed risk-weighted expected utility) calculation of more than one act.  The result is that very few constraints are placed on the utilities that must be assigned to these new outcomes and the probabilities that must be assigned to the propositions in order to recover a given preference ordering on the acts via the EU (or REU) rule of combination.  Suppose $\succeq$ is a preference ordering on $\mathcal{A}$.  Then, for each $f \in \mathcal{A}$, pick a real number $r_f$ such that $f \succeq g$ iff $r_f \geq r_g$.  Now there are many ways to do this, and they are not all affine transformations of one another---indeed, any strictly increasing $\tau : \mathbb{R} \rightarrow \mathbb{R}$ will take one such assignment to another.  Now pick any probability function $p$ on $\mathcal{F}$.  Now, given an act $f = \{x_1, E_1; \ldots; x_n, E_n\}$, the only constraint on the values $u^*(x_1, f)$, $\ldots$, $u^*(x_n, f)$ is that $\sum_i p(E_i)u^*(x_i, f) = r_f$.  And this of course permits many different values.  (In general, for $r \in \mathbb{R}$ and $0 \leq \alpha_1, \ldots, \alpha_n$ with $\sum_i \alpha_i = 1$, there are many sequences $\lambda_1, \ldots, \lambda_n \in \mathbb{R}$ such that $\sum_i \lambda_i \alpha_i = r$.)  Buchak dubs this phenomenon belief and desire proliferation (p. 140, Buchak 2014).
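To see how loose the constraint $\sum_i p(E_i)u^*(x_i, f) = r_f$ is, here is a small sketch that produces as many different unconstrained utility assignments as one likes for a single act, all with the same expectation.  The function names, the probabilities and the target value are hypothetical choices of mine, and I assume the act has at least two events with positive probability.

```python
from math import isclose

def expectation(probs, values):
    """The sum over i of p(E_i) * u*(x_i, f)."""
    return sum(p * v for p, v in zip(probs, values))

def constant_solution(probs, target):
    """The simplest assignment: every outcome-act pair gets utility r_f."""
    return [target] * len(probs)

def perturbed_solution(probs, target, delta):
    """Another assignment with the same expectation.

    Raises the first utility by delta / p(E_1) and lowers the second by
    delta / p(E_2), so the two changes cancel in the expectation.
    """
    values = constant_solution(probs, target)
    values[0] += delta / probs[0]
    values[1] -= delta / probs[1]
    return values

probs = [0.3, 0.3, 0.4]  # hypothetical p(E_1), p(E_2), p(E_3)
r_f = 1.65               # hypothetical number representing f in the ordering

for delta in (0.0, 0.5, -2.0):
    values = perturbed_solution(probs, r_f, delta)
    print(values, isclose(expectation(probs, values), r_f))
```

Each choice of `delta` gives a different assignment, and none is ruled out unless $u^*$ is required to have a particular form.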

Why is this a problem?  There are a number of reasons to worry about belief and desire proliferation.  There is the epistemological worry that, if utilities and probabilities are as loosely constrained as this, it is not possible to use an agent's observed behaviour to predict her unobserved behaviour.  Divining her preferences between two acts will teach us nothing about the utilities she assigns to the outcomes of any other acts since those outcomes are unique to those acts.  Also,  those who wish to use representation theorems for the purpose of radical interpretation will be concerned by the complete failure of the uniqueness of the rationalisation of preferences that such a decision theory provides.

Neither of these objections seems fatal to me.  But in any case, the version of the redescription strategy presented here avoids them altogether.  The reason is that I placed constraints on the sort of utility function $u^*$ an agent can have over $\mathcal{X}^*$:  I demanded that $u^*$ encode attitudes to risk; that is, $u^*(x, f)$ is defined in terms of $u^*(x, \overline{x})$ in a particular way.  And we saw in Theorem 2 above that, for any agent whose preferences satisfy the Buchak axioms, there is a unique probability function $p$ and a unique utility function $u^*$ on $\mathcal{X}^*$ that encodes attitudes to risk relative to some risk function such that together $p$ and $u^*$ generate the agent's preferences in accordance with the EU rule of combination.

Ultimate ends and the locus of utility


Buchak's second objection initially seems more worrying (pp. 137-8, Buchak 2014).  A theme running through Risk and Rationality is that decision theory is the formalisation of instrumental or means-end reasoning.  One consequence of this is that an account of decision theory that analyses an agent as engaged in something other than means-end reasoning is thereby excluded.

Buchak objects to the redescription strategy on these grounds.  According to Buchak, to understand an agent as engaged in means-end reasoning, one must carefully distinguish the means and the ends:  in Buchak's framework, the means are the acts and the ends are the outcomes.  One must then assign utilities to the ends only.  Of course, in terms of these utilities and the agent's probabilities and possibly other representations of internal attitude such as the risk function, one can then assign value or utility to the means.  But the important point is that this value or utility that attaches to the means is assigned on the basis of the assignment of utility to the ultimate ends.  Thus, while there is a sense in which we assign a value or utility to means---i.e. acts---in expected utility theory, this assignment must depend ultimately on the utility we attach to ends---i.e. outcomes.

Thus, a first pass at Buchak's second complaint against the redescription strategy is this: the redescription strategy assigns utilities to something other than ends---it assigns utilities to outcome-act pairs, and these are fusions of means and ends.  Thus, an agent analysed in accordance with the redescription strategy is not understood as engaged in means-end reasoning.

However, this seems problematic in two ways.  Whether they constitute ultimate ends or not, there are at least two reasons why an agent must assign utilities to outcome-act pairs rather than outcomes on their own.  That is, there are two reasons why at least this part of the redescription strategy---namely, the move from $\mathcal{X}$ to $\mathcal{X}^*$---is necessary irrespective of the need to accommodate risk in expected utility theory.

Firstly, utilities must attach to the true outcomes of an act.  But these true outcomes aren't the sort of thing we've been calling an outcome here.  When I choose Safe over Risky and receive $£50$, the outcome of that act is not merely $£50$; it is $£50$ as the result of Safe.  Thus, the true outcomes of an act are in fact the elements of $\mathcal{X}^*$---they are what we have been calling the outcome-act pairs.

Of course, at this point, Buchak might accept that utilities attach to outcome-act pairs, but insist that it is nonetheless a requirement of rationality that an agent assign the same utility to two outcome-act pairs with the same outcome component; that is, $u^*(x, f) = u^*(x, g)$; that is, while utilities attach to fusions of means and ends, they must be a function only of the ends.  But the second reason for attaching utilities to outcome-act pairs tells against this claim in general.  The reason is this:  As Bernard Williams urges, it is neither irrational nor even immoral to assign higher utility to a person's death as a result of something other than my agency than to that same person's death as a result of my agency (Williams 1973).  This, one might hold, is what explains our hesitation when we are asked to choose between killing one person or letting twelve people including that person be shot by a firing squad:  I assign higher utility to the death of a prisoner at the hands of the firing squad than to the death of that prisoner at my hands. Thus, it is permissible in at least some situations to care about the act that gives rise to the outcome and let one's utility in an outcome-act pair be a function also of the act.

Nonetheless, this is not definitive.  After all, Buchak could reply that this is peculiar to acts that have morally relevant consequences.  Acts such as those in the Allais paradox do not have morally relevant consequences; but the redescription strategy still requires us to make utilities depend on acts as well as outcomes in those cases.  Thus, for non-moral acts $f$ and $g$, Buchak might say, it is a requirement of rationality that $u^*(x, f) = u^*(x, g)$, even if it is not such a requirement for moral cases.  And this would be enough to scupper the redescription strategy.

However, it is not clear why the moral and non-moral cases should differ in this way.  Consider the following decision problem:   I must choose whether to shoot an individual or not; I know that, if I do not shoot him, someone else will.  I strictly prefer not shooting him to shooting him.  My reasoning might be reconstructed as follows:  I begin by assigning a certain utility to this person's death as the result of something other than my agency---natural causes, for instance, or murder by a third party.  Then, to give my utility for his death at my hand, I weight this original utility in a certain way, reducing it on the basis of the action that gave rise to the death.  Thus, the badness of the outcome-act pair (X's death, My agency) is calculated by starting with the utility of another outcome-act pair with the same outcome component---namely, (X's death, Not my agency)---and then weighting that utility based on the act component.  We might call (X's death, Not my agency) the reference pair attached to the outcome X's death.  The idea is that the utility we assign to the reference pair attached to an outcome comes closest to what we might think of as the utility that attaches solely to the outcome; the reference pair attached to an outcome $x$ is the outcome-act pair $(x, f)$ for which the act $f$ contributes least to the utility of the pair.

Now this is exactly analogous to what the redescription strategy proposes as an analysis of risk-sensitive behaviour.  In that case, when one wishes to calculate the utility of an outcome-act pair $(x, f)$, one begins with the utility one attaches to $(x, \overline{x})$.  Then one weights that utility in a certain way that depends on the riskiness of the act.  This gives the utility of $(x, f)$.  Thus, if we take $(x, \overline{x})$ to be the reference pair attached to the outcome $x$, then this is analogous to the moral case above.  In both cases, we can recover something close to the notion of utility for ultimate ends or pure outcomes (i.e. elements of $\mathcal{X}$):  the utility of the pure outcome $x$---to the extent that such a utility can be meaningfully said to exist---is $u^*(x, \overline{x})$, the utility of the reference pair attached to $x$.  That seems right.  Strictly speaking, there is little sense to asking an agent for the utility they assign to a particular person's death; one must specify whether or not the death is the result of that agent's agency. But we often do give a utility to that sort of outcome; and when we do, I submit, we give the utility of the reference pair.  Similarly, we often assign a utility to receiving $£50$, even though the request makes little sense without specifying the act that gives rise to that pure outcome:  again, we give the utility of $£50$ for sure, that is, the utility of $(£50, \overline{£50})$.

Understood in this way, the analysis of a decision given by the redescription strategy still portrays the agent as engaged in means-end reasoning.  Of course, there are no pure ultimate ends to which we assign utilities.  But there is something that plays that role, namely, reference pairs.  An agent's utility for an outcome-act pair $(x, f)$ is calculated in terms of her utility for the relevant reference pair, namely, $(x, \overline{x})$; and the agent's value for an act $f$ is calculated in terms of her utilities for each outcome-act pair $(x, f)$ where $x$ is a possible outcome of $f$.  Thus, though the value of an act on this account is not ultimately grounded in the utilities of pure, ultimate outcomes of that act, it is grounded in the closest thing that makes sense, namely, the utilities of the reference pairs attached to the pure, ultimate outcomes of the act.

Conclusion


Buchak proposes a novel decision theory.  It is formulated in terms of an agent's probability function, utility function, and risk function.  It permits a great many more preference orderings than orthodox expected utility theory.  On Buchak's theory the utility that is assigned to an act is not the expectation of the utility of its outcome; rather it is the risk-weighted expectation.  But the argument of the second post in this series suggested that estimates should be expectations and the utility of an act should be the estimate of the utility of its outcome.  In this post, I have tried to reconcile the preferences that Buchak endorses with the conclusion of de Finetti's argument.  To do this, I redescribed the outcome space so that utilities were attached ultimately to outcome-act pairs rather than outcomes themselves.  This allowed me to capture precisely the preferences that Buchak permits, whilst letting the utility of an act be the expectation of the utility it will produce.  The redescription strategy raises some questions:  Does it prevent us from using decision theory for certain epistemological purposes?  Does it fail to portray agents as engaged in means-end reasoning?  In the final section of this post, I tried to answer these questions.

* I'm reliably informed by a native speaker of American English that "tremendous" is rarely used on that side of the Atlantic to mean "extremely good"; rather, it is used to mean "extremely large".  So, to clarify: Buchak's book is extremely good, but not extremely large.  Divided by a common language indeed.
