Thursday, 23 June 2022

Aggregating for accuracy: another accuracy argument for linear pooling

 A PDF of this blogpost is available here.

I don't have an estimate for how long it will be before the Greenland ice sheet collapses, and I don't have an estimate for how long it will be before the average temperature at Earth's surface rises more than 3°C above pre-industrial levels. But I know a bunch of people who do have such estimates, and I might hope that learning theirs will help me set mine. Unfortunately, each of these people has a different estimate for each of these two quantities. What should I do? Should I pick one of them at random and adopt their estimates as mine? Or should I pick some compromise between them? If the latter, which compromise?

Cat inaccurately estimates width of step
The following fact gives a hint. An estimate of a quantity, such as the number of years until an ice sheet collapses or the number of years until the temperature rises by a certain amount, is better the closer it lies to the true value of the quantity and worse the further it lies from this. There are various ways to measure the distance between estimate and true value, but we'll stick with a standard one here, namely, squared error, which takes the distance to be the square of the difference between the two values. Then the following is simply a mathematical fact: taking the straight average of the group's estimates of each quantity as your estimate of that quantity is guaranteed to be better, in expectation, than picking a member of the group at random and simply deferring to them. This is sometimes called the Diversity Prediction Theorem, or it's a corollary of what goes by that name.

The result raises a natural question: Is it only by taking the straight average of the group's estimates as your own that you can be guaranteed to do better, in expectation, than by picking at random? Or is there another method for aggregating the estimates that also has this property? As I'll show, only straight averaging has this property. If you combine the group's estimates in any other way to give your own, there is a possible set of true values for which your estimates lie further from the truth than, in expectation, the estimates of a randomly chosen member do. The question is natural, and the answer is not difficult to prove, so I'm pretty confident this has been asked and answered before; but I haven't been able to find it, so I'd be grateful for a reference if anyone has one.

Let's make all of this precise. We have a group of $m$ individuals; each of them has an estimate for each of the quantities $Q_1, \ldots, Q_n$. We represent individual $j$ by the sequence $X_j = (x_{j1}, \ldots, x_{jn})$ of their estimates of these quantities. So $x_{ji}$ is the estimate of quantity $Q_i$ by individual $j$. Suppose $T = (t_1, \ldots, t_n)$ is the sequence of true values of these quantities. So $t_i$ is the true value of $Q_i$. Then the disvalue or badness of individual $j$'s estimates, as measured by squared error, is:$$(x_{j1} - t_1)^2 + \ldots + (x_{jn} - t_n)^2$$The disvalue or badness of individual $j$'s estimate of quantity $Q_i$ is $(x_{ji} - t_i)^2$, and the disvalue or badness of their whole set of estimates is the sum of the disvalues of their individual estimates. We write $\mathrm{SE}(X_j, T)$ for this sum. That is,$$\mathrm{SE}(X_j, T) = \sum_i (x_{ji} - t_i)^2$$Then the Diversity Prediction Theorem says that, for any $X_1, \ldots, X_m$ that are not all identical and any $T$,$$\mathrm{SE}\left (\frac{1}{m}X_1 + \ldots + \frac{1}{m}X_m, T \right ) < \frac{1}{m}\mathrm{SE}(X_1, T) + \ldots + \frac{1}{m}\mathrm{SE}(X_m, T)$$And we wish to prove a sort of converse, namely, if $V \neq \frac{1}{m}X_1 + \ldots + \frac{1}{m}X_m$, then there is a possible set of true values $T = (t_1, \ldots, t_n)$ such that$$\mathrm{SE}(V, T) > \frac{1}{m}\mathrm{SE}(X_1, T) + \ldots + \frac{1}{m}\mathrm{SE}(X_m, T)$$I'll give the proof below.
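
Here is a quick numerical check of the Diversity Prediction Theorem's inequality, a sketch in Python with randomly generated estimates and true values rather than anything from the examples above:

```python
import random

def se(x, t):
    """Squared error of a vector of estimates x against true values t."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, t))

random.seed(0)
m, n = 5, 3  # 5 individuals, 3 quantities

for trial in range(1000):
    X = [[random.uniform(0, 100) for _ in range(n)] for _ in range(m)]
    T = [random.uniform(0, 100) for _ in range(n)]
    avg = [sum(X[j][i] for j in range(m)) / m for i in range(n)]
    lhs = se(avg, T)                              # error of the straight average
    rhs = sum(se(X[j], T) for j in range(m)) / m  # expected error of a random pick
    assert lhs <= rhs + 1e-9, (lhs, rhs)

print("Straight averaging never did worse than picking at random, in expectation.")
```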

Why is this interesting? One question at the core of those parts of philosophy that deal with collectives and their attitudes is this: How should you aggregate the opinions of a group of individuals to give a single set of opinions? When the opinions come in numerical form, such as when they are estimates of quantities or when they are probabilities, there are a number of proposals. Taking the straight arithmetic average as we have done here is just one. How are we to decide which to use? Standard arguments proceed by identifying a set of properties that only one aggregation method boasts, and then arguing that the properties in the set are desirable given your purpose in doing the aggregation in the first place. The result we have just noted might be used to mount just such an argument: when we aggregate estimates, we might well want a method that is guaranteed to produce aggregate estimates that are better, in expectation, than picking at random, and straight averaging is the only method that does that. 

Finally, here's a slightly more general version of the result, which considers not just straight averages but also weighted averages; the proof is also given.

Proposition Suppose $\lambda_1, \ldots, \lambda_m$ is a set of weights, so that $0 \leq \lambda_j \leq 1$ and $\sum_j \lambda_j = 1$. Then, if $V \neq \lambda_1 X_1 + \ldots + \lambda_mX_m$, there is a possible set of true values $T = (t_1, \ldots, t_n)$ such that$$\mathrm{SE}(V, T) > \lambda_1\mathrm{SE}(X_1, T) + \ldots + \lambda_m\mathrm{SE}(X_m, T)$$

Proof. The left-hand side of the inequality is
$$
\mathrm{SE}(V, T) = \sum_i (v_i - t_i)^2 = \sum_i v_i^2 - 2\sum_i v_it_i + \sum_i t^2_i
$$The right-hand side of the inequality is
\begin{eqnarray*}
\sum_j \lambda_j \mathrm{SE}(X_j, T) & = & \sum_j \lambda_j \sum_i (x_{ji} - t_i)^2 \\
& = & \sum_j \lambda_j \sum_i \left ( x^2_{ji} - 2x_{ji}t_i + t_i^2 \right ) \\
& = & \sum_{i,j} \lambda_j x^2_{ji} - 2\sum_{i,j} \lambda_j x_{ji}t_i + \sum_i t_i^2
\end{eqnarray*}
So $\mathrm{SE}(V, T) > \sum_j \lambda_j \mathrm{SE}(X_j, T)$ iff$$
\sum_i v_i^2 - 2\sum_i v_it_i > \sum_{i,j} \lambda_j x^2_{ji} - 2\sum_{i,j} \lambda_j x_{ji}t_i
$$iff$$
  2\left ( \sum_i \left ( \sum_j \lambda_j x_{ji}- v_i \right) t_i \right ) > \sum_{i,j} \lambda_j x^2_{ji} - \sum_i v_i^2
$$And, if $(v_1, \ldots, v_n) \neq (\sum_j \lambda_j x_{j1}, \ldots,\sum_j \lambda_j x_{jn})$, there is $i$ such that $\sum_j \lambda_j x_{ji} - v_i \neq 0$. So we can choose $T = (t_1, \ldots, t_n)$ so that the inequality holds: set $t_k = 0$ for all $k \neq i$, and choose $t_i$ with the same sign as $\sum_j \lambda_j x_{ji} - v_i$ and large enough that the left-hand side exceeds the fixed right-hand side, as required.
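
The proof's recipe for the witnessing $T$ can be made concrete. The following Python sketch, with randomly chosen estimates, weights, and an aggregate $V$ other than the weighted average (all invented for illustration), picks a coordinate where $V$ departs from the weighted average, places a sufficiently large true value there, and then checks that $V$ does worse than deferring to a weighted-random pick:

```python
import random

def se(x, t):
    """Squared error of a vector of estimates x against true values t."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, t))

random.seed(1)
m, n = 4, 3
X = [[random.uniform(0, 10) for _ in range(n)] for _ in range(m)]
lam = [0.1, 0.2, 0.3, 0.4]                       # weights summing to 1
V = [random.uniform(0, 10) for _ in range(n)]    # some aggregate other than the weighted average

wavg = [sum(lam[j] * X[j][i] for j in range(m)) for i in range(n)]

# Find a coordinate where V differs from the weighted average ...
i0 = next(i for i in range(n) if abs(wavg[i] - V[i]) > 1e-9)
d = wavg[i0] - V[i0]

# ... and make t_{i0} large with the same sign as d, zero elsewhere.
const = sum(lam[j] * X[j][i] ** 2 for j in range(m) for i in range(n)) - sum(v ** 2 for v in V)
K = abs(const) / (2 * abs(d)) + 1
T = [0.0] * n
T[i0] = K if d > 0 else -K

lhs = se(V, T)
rhs = sum(lam[j] * se(X[j], T) for j in range(m))
print(lhs > rhs)   # True: at this T, V is worse than the weighted average of individual errors
```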

Wednesday, 18 May 2022

Should we agree? III: the rationality of groups

In the previous two posts in this series (here and here), I described two arguments for the conclusion that the members of a group should agree. One was an epistemic argument and one a pragmatic argument. Suppose you have a group of individuals. Given an individual, we call the set of propositions to which they assign a credence their agenda. The group's agenda is the union of its members' agendas; that is, it includes any proposition to which some member of the group assigns a credence. The precise conclusion of the two arguments I described is this: the group is irrational if there is no single probability function defined on the group's agenda that gives the credences of each member of the group when restricted to their agenda. Following Matt Kopec, I called this norm Consensus.

Cats showing a frankly concerning degree of consensus
 

Both arguments use the same piece of mathematics, but they interpret it differently. Both appeal to mathematical functions that measure how well our credences achieve the goals that we have when we set them. There are (at least) two such goals: we aim to have credences that will guide our actions well and we aim to have credences that will represent the world accurately. In the pragmatic argument, the mathematical function measures how well our credences achieve the first goal. In particular, they measure the utility we can expect to gain by having the credences we have and choosing in line with them when faced with whatever decision problems life throws at us. In the epistemic argument, the mathematical function measures how well our credences achieve the second goal. In particular, they measure the accuracy of our credences. As we noted in the second post on this, work by Mark Schervish and Ben Levinstein shows that the functions that measure these goals have the same properties: they are both strictly proper scoring rules. The arguments then appeal to the following fact: given a strictly proper scoring rule, if the members of a group do not agree on the credences they assign in the way required by Consensus, then there are some alternative credences they might assign instead that are guaranteed to be better according to that scoring rule.

I'd like to turn now to assessing these arguments. My first question is this: In the norm of Probabilism, rationality requires something of an individual, but in the norm of Consensus, rationality requires something of a group of individuals. We understand what it means to say that an individual is irrational, but what could it mean to say that a group is irrational?

Here, I follow Kenny Easwaran's suggestion that collective entities---in his case, cities; in my case, groups---can be said quite literally to be rational or irrational. For Easwaran, a city is rational "to the extent that the collective practices of its people enable diverse inhabitants to simultaneously live the kinds of lives they are each trying to live." As I interpret him, the idea is this: a city, no less than its individual inhabitants, has an end or goal or telos. For Easwaran, for instance, the end of a city is enabling its inhabitants to live as they wish to. And a city is irrational if it does not provide---in its physical and technological infrastructure, its byelaws and governing institutions---the best means to that end among those that are available. Now, we might disagree with Easwaran's account of a city's ends. But the template he provides by which we might understand group rationality is nonetheless helpful. Following his lead, we might say that a group, no less than its individual members, has an end. For instance, its end might be maximising the total utility of its members, or it might be maximizing the total epistemic value of their credences. And it is then irrational if it does not provide the best means to that end among those available. So, for instance, as long as agreement between members is available, our pragmatic and epistemic arguments for Consensus seem to show that a group whose ends are as I just described does not provide the best means to its ends if it does not deliver such agreement.

Understanding group rationality as Easwaran does helps considerably. As well as making sense of the claim that the group itself can be assessed for rationality, it also helps us circumscribe the scope of the two arguments we've been exploring, and so the scope of the version of Consensus that they justify. After all, it's clear on this conception that these arguments will only justify Consensus for a group if

  1. that group has the end of maximising total expected pragmatic utility or total epistemic utility, i.e., maximising the quantities measured by the mathematical functions described above;
  2. there are means available to it to achieve Consensus.

So, for instance, a group of sworn enemies hellbent on thwarting each other's plans is unlikely to have as its end maximising total utility, while a group composed of randomly selected individuals from across the globe is unlikely to have as its end maximising total epistemic utility, and indeed a group so disparate might lack any ends at all.

And we can easily imagine situations in which there are no available means by which the group could achieve Consensus, perhaps because it would be impossible to set up reliable lines of communication.

This allows us to make sense of two of the conditions that Donald Gillies places on the groups to which he takes his sure loss argument to apply (this is the first version of the pragmatic argument for Consensus; the one I presented in the first post and then abandoned in favour of the second version in the second post). He says (i) the members of the group must have a shared purpose, and (ii) there must be good lines of communication between them. Let me take these in turn to understand their status more precisely.

It's natural to think that, if a group has a shared purpose, it will have as its end maximising the total utility of the members of the group. And indeed in some cases this is almost certainly true. Suppose, for instance, that every member of a group cares only about the amount of biodiversity in a particular ecosystem that is close to their hearts. Then they will have the same utility function, and it is natural to say that maximising that shared utility is the group's end. But of course maximising that shared utility is equivalent to maximising the group's total utility, since the total utility is simply the shared utility scaled up by the number of members of the group.

However, it is also possible for a group to have a shared purpose without its end being to maximise total utility. After all, a group can have a shared purpose without each member taking that purpose to be the one and only valuable end. Imagine a different group: each member cares primarily about the level of biodiversity in their preferred area, but each also cares deeply about the welfare of their family. In this case, you might take the group's end to be maximising biodiversity in the area in question, particularly if it was this shared interest that brought them together as a group in the first place, but maximising this good might require the group not to maximise total utility, perhaps because some members of the group have family who are farmers and who will be adversely affected by whatever is the best means to the end of greater biodiversity.

What's more, it's possible for a group to have as its end maximising total utility without having any shared purpose at all. For instance, a certain sort of utilitarian might say that the group of all sentient beings has as its end the maximisation of the total utility of its members. But that group does not have any shared purpose.

So I think we can use the pragmatic and epistemic arguments to determine the groups to which the norm of Consensus applies, or at least the groups for which our pragmatic and epistemic arguments can justify its application. It is those groups that have as their end either maximising the total pragmatic utility of the group, or maximising their total epistemic utility, or maximising some weighted average of the two---after all, the weighted average of two strictly proper scoring rules, one measuring epistemic utility and one measuring pragmatic utility, is itself a strictly proper scoring rule. Of course, this requires an account of when a group has a particular end. This, like all questions about when collectives have certain attitudes, is delicate. I won't say anything more about it here.

Let's turn next to Gillies' claim that Consensus applies only to groups between whose members there are reliable lines of communication. In fact, I think our versions of the arguments show that this condition lives a strange double life. On the one hand, if such lines of communication are necessary to achieve agreement across the group, then the norm of Consensus simply does not apply to a group when these lines of communication are impossible, perhaps because of geographical, social, or technological barriers. A group cannot be judged irrational for failing to achieve something it could not possibly achieve, however much closer it would get to its goal if it could achieve that.

On the other hand, if such lines of communication are available, and if they increase the chance of agreement among members of the group, then our two arguments for Consensus are equally arguments for establishing such lines of communication, providing that the cost of doing so is outweighed by the gain in pragmatic or epistemic utility that comes from achieving agreement.

But these arguments do something else as well. They lend nuance to Consensus. In some cases in which some lines of communication are available but others aren't, or are too costly, our arguments still provide norms. Take, for instance, a case in which some central planner is able to communicate a single set of prior credences that each member of the group should have, but after the members start receiving evidence, this central planner can no longer coordinate their credences. And suppose we know that the members will receive different evidence: they'll be situated in different places, and so they'll see different things, have access to different information sources, and so on. So we know that, if they update on the evidence they receive in the standard way, they'll end up having different credences from one another and therefore violating Consensus. You might think, from looking at Consensus, that the group would do better, both pragmatically and epistemically, if each of its members were to ignore whatever evidence were to come in and to stick with their prior regardless in order to be sure that they remain in agreement and satisfy Consensus both in their priors and their posteriors.

In fact, however, this isn't the case. Let's take an extremely simple example. The group has just two members, Ada and Baz. Each has opinions only about the outcomes of two independent tosses of a fair coin. So the possible worlds are HH, HT, TH, TT. Ada will learn the outcome of the first, and Baz will learn the outcome of the second. A central planner can communicate to them a prior they should adopt, but that central planner can't receive information from them, and so can't receive their evidence and pool it and communicate a shared posterior to them. How should Ada and Baz proceed? How should they pick their priors, and what strategies should each adopt for updating when the evidence comes in? The entity we're assessing for rationality is the quadruple that contains Ada's prior together with her plan for updating, and Baz's prior together with his plan for updating. Which of these are available? Well, nothing constrains Ada's priors and nothing constrains Baz's. But there are constraints on their updating rules. Ada's updating rule must give the same recommendation at any two worlds at which her evidence is the same---so, for instance, it must give the same recommendation at HH as at HT, since all she learns at both is that the first coin landed heads. And Baz's updating rule must give the same recommendation at any two worlds at which his evidence is the same---so, for instance, it must give the same recommendation at HH as at TH. Then consider the following norm:

Prior Consensus Ada and Baz should have the same prior and both should plan to update on their private evidence by conditioning on it.

And the argument for this is that, if they don't, there's an alternative quadruple of priors and plans that (i) satisfies the constraint outlined above and (ii) together has greater total epistemic utility at each possible world; and there's an alternative quadruple of priors and plans that (i) satisfies the constraint outlined above and (ii) together has greater total expected pragmatic utility at each possible world. This is a corollary of an argument that Ray Briggs and I gave, and that Michael Nielsen corrected and improved on. So, if Ada and Baz agree on their prior, and plan to stick with it rather than update on their evidence because that way they'll retain agreement, then they'll be accuracy dominated and pragmatically dominated.

You might wonder how this is possible. After all, whatever evidence Ada and Baz each receive, Prior Consensus requires them to update on it in a way that leads them to disagree, and we know that they are then accuracy and pragmatically dominated. This is true, and it would tell against the priors + updating plans recommended by Prior Consensus if there were some way for Ada and Baz to communicate after their evidence came in. It's true that, for each possible world, there is some credence function such that if, at each world, Ada and Baz were to have that credence function rather than the ones they obtain by updating their shared prior on their private evidence, then they'd end up with greater total accuracy and pragmatic utility. But, without the lines of communication, they can't have that.
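
To see the dominance claim in miniature, here is a hedged Python check of the Ada and Baz example, using the Brier score as an illustrative strictly proper measure of inaccuracy (lower is better). It compares two pairs of plans that both start from the shared uniform prior: ignore the evidence and keep the prior, or condition on the private evidence. The conditioning plans have lower total inaccuracy at every one of the four worlds:

```python
worlds = ["HH", "HT", "TH", "TT"]
prior = {w: 0.25 for w in worlds}  # shared uniform prior over two fair, independent tosses

def condition(p, evidence):
    """Conditionalise the prior p on the set of worlds consistent with the evidence."""
    total = sum(p[w] for w in evidence)
    return {w: (p[w] / total if w in evidence else 0.0) for w in p}

def brier(c, actual):
    """Brier inaccuracy of credence function c at the actual world (lower is better)."""
    return sum((c[w] - (1.0 if w == actual else 0.0)) ** 2 for w in c)

for actual in worlds:
    ada_evidence = [w for w in worlds if w[0] == actual[0]]  # Ada learns the first toss
    baz_evidence = [w for w in worlds if w[1] == actual[1]]  # Baz learns the second toss

    stick = brier(prior, actual) + brier(prior, actual)
    update = brier(condition(prior, ada_evidence), actual) + brier(condition(prior, baz_evidence), actual)
    print(actual, round(stick, 3), round(update, 3), update < stick)   # True at every world
```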

So, by looking in some detail at the arguments for Consensus, we come to understand better the groups to which it applies and the norms that apply to those groups to which it doesn't apply in its full force.

Friday, 6 May 2022

Should we agree? II: a new pragmatic argument for consensus

There is a PDF version of this blogpost available here.

In the previous post, I introduced the norm of Consensus. This is a claim about the rationality of groups. Suppose you've got a group of individuals. For each individual, call the set of propositions to which they assign a credence their agenda. They might all have quite different agendas: some of them might overlap, others might not. We might say that the credal states of these individual members cohere with one another if there is some probability function that is defined for any proposition that appears in any member's agenda, and the credences each member assigns to the propositions in their agenda match those assigned by this probability function to those propositions. Then Consensus says that a group is irrational if it does not cohere.

A group coming to consensus

In that post, I noted that there are two sorts of argument for this norm: a pragmatic argument and an epistemic argument. The pragmatic argument is a sure loss argument. It is based on the fact that, if the individuals in the group don't agree, there is a series of bets that their credences require them to accept that will, when taken together, lose the group money for sure. In this post, I want to argue that there is a problem with the sure loss argument for Consensus. It isn't peculiar to this argument, and indeed applies equally to any argument that tries to establish a rational requirement by showing that someone who violates it is exploitable. Indeed, I've raised it elsewhere against the sure loss argument for Probabilism (Section 6.2, Pettigrew 2020) and the money pump argument against non-exponential discounting and changing preferences in general (Section 13.7.4, Pettigrew 2019). I'll describe the argument here, and then offer a solution based on work by Mark Schervish (1989) and Ben Levinstein (2017). I've described this sort of solution before (Section 6.3, Pettigrew 2020), and Jason Konek (ta) has recently put it to interesting work addressing an issue with Julia Staffel's (2020) account of degrees of incoherence.

Sure loss and money pump arguments judge the rationality of attitudes, whether credences or preferences, by looking at the quality of the choices they require us to make. As Bishop Butler said, probability is the very guide of life. These arguments evaluate credences by exactly how well they provide that guide. So they are teleological arguments: they attempt to derive facts about the epistemic right---namely, what is rationally permissible---from facts about the pragmatic good---namely, leading to pragmatically good choices.

Say that one sequence of choices dominates another if, taken together, the first leads to better outcomes for sure. Say that a collection of attitudes is exploitable if there is a sequence of decision problems you might face such that, if faced with them, these attitudes will require you to make a dominated sequence of choices.

For instance, take the sure loss argument for Probabilism: if you violate Probabilism because you believe $A\ \&\ B$ more strongly than you believe $A$, your credence in the former will require you to pay some amount of money for a bet that pays out a pound if $A\ \&\ B$ is true and nothing if it's false, and your credence in the latter will require you to sell for less money a bet that pays out a pound if $A$ is true and nothing if it's false; yet you'd be better off for sure rejecting both bets. So rejecting both bets dominates accepting both; your credences require you to accept both; so your credences are exploitable. Or take the money pump argument against cyclical preferences: if you prefer $A$ to $B$ and $B$ to $C$ and $C$ to $A$, then you'll choose $B$ when offered a choice between $B$ and $C$, you'll then pay some amount to swap to $A$, and you'll then pay some further amount to swap to $C$; yet you'd be better off for sure simply choosing $C$ in the first place and not swapping either time that possibility was offered. So choosing $C$ and sticking with it dominates the sequence of choices your preferences require; so your preferences are exploitable.
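
For concreteness, here is a minimal Python sketch of the sure loss construction in the first of those examples, with invented numbers: an agent with credence 0.8 in $A\ \&\ B$ but only 0.6 in $A$ pays £0.79 for a £1 bet on the conjunction and sells a £1 bet on $A$ for £0.61, and the pair of bets loses money at every world:

```python
from itertools import product

cr_A_and_B = 0.8   # illustrative credences: more confident in the conjunction ...
cr_A = 0.6         # ... than in the conjunct, so Probabilism is violated

buy_price = 0.79   # pay this for a bet paying £1 if A & B: sanctioned, since 0.79 < 0.8 * 1
sell_price = 0.61  # sell for this a bet paying £1 if A: sanctioned, since you value it at only £0.60

for A, B in product([True, False], repeat=2):
    payoff_bought = (1.0 if (A and B) else 0.0) - buy_price   # the bet we bought on A & B
    payoff_sold = sell_price - (1.0 if A else 0.0)            # the bet we sold on A
    net = payoff_bought + payoff_sold
    print(f"A={A!s:5} B={B!s:5}  net = £{net:+.2f}")          # negative in every case
```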

But, I contend, the existence of a sequence of decision problems in response to which your attitudes require you to make a dominated series of choices does not on its own render those attitudes irrational. After all, it is just one possible sequence of decision problems you might face. And there are many other sequences you might face instead. The argument does not consider how your attitudes will require you to choose when faced with those alternative sequences, and yet surely that is relevant to assessing those attitudes. However bad the dominated series of choices is that the attitudes require you to make when faced with the sequence of decision problems described in the argument for exploitability, there may be another sequence of decision problems where those same attitudes require you to make a series of choices that are very good; indeed, they might be so good that they somehow outweigh the badness of the dominated sequence. So, instead of judging your attitudes by looking only at the outcome of choosing in line with them when faced with a single sequence of decision problems, we should rather judge them by looking at the outcome of choosing in line with them when faced with any decision problem that might come your way, weighting each by how likely you are to face it, to give a balanced view of the pragmatic benefits of having those credences. That's the approach I'll present now, and I'll show that it leads to a new and better pragmatic argument for Probabilism and Consensus.

As I presented them, the sure loss arguments for Probabilism and Consensus both begin with a principle that I called Ramsey's Thesis. This is a claim about the prices that an individual's credence in a proposition requires her to pay for a bet on that proposition. It says that, if $p$ is your credence in $A$ and $x < pS$, then you are required to pay $£x$ for a bet that pays out $£S$ if $A$ is true and $£0$ if $A$ is false. Now in fact this is a particular consequence of a more general norm about how our credences require us to choose. Let's call the more general norm Extended Ramsey's Thesis. It says how our credence in a proposition requires us to choose when faced with a series of options, all of whose payoffs depend only on the truth or falsity of that proposition. Given a proposition $A$, let's say that an option is an $A$-option if its payoffs at any two worlds at which $A$ is true are the same, and its payoffs at any two worlds at which $A$ is false are the same. Then, given a credence $p$ in $A$ and an $A$-option $a$, we say that the expected payoff of $a$ by the lights of $p$ is
$$
p \times \text{payoff of $a$ when $A$ is true} + (1-p) \times \text{payoff of $a$ when $A$ is false}
$$Now suppose you face a decision problem in which all of the available options are $A$-options. Then Extended Ramsey's Thesis says that you are required to pick an option whose expected payoff by the lights of your credence in $A$ is maximal.*
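
A minimal sketch of Extended Ramsey's Thesis, with a made-up menu of $A$-options, each represented simply by its payoff if $A$ is true and its payoff if $A$ is false:

```python
def expected_payoff(p, option):
    """Expected payoff of an A-option by the lights of credence p in A.
    An option is a pair (payoff if A is true, payoff if A is false)."""
    pay_true, pay_false = option
    return p * pay_true + (1 - p) * pay_false

def choose(p, options):
    """Pick an option with maximal expected payoff by the lights of p."""
    return max(options, key=lambda o: expected_payoff(p, o))

# Illustrative menu: accept a £0.40 bet that pays £1 if A, or reject it.
accept, reject = (0.6, -0.4), (0.0, 0.0)
print(choose(0.5, [accept, reject]))  # (0.6, -0.4): accepting maximises expected payoff
print(choose(0.3, [accept, reject]))  # (0.0, 0.0): rejecting does
```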

Next, we make a move that is reminiscent of the central move in I. J. Good's argument for Carnap's Principle of Total Evidence (Good 1967). We say what we take the payoff to be of having a particular credence in a particular proposition given a particular way the world is and when faced with a particular decision problem. Specifically, we define the payoff of having credence $p$ in the proposition $A$ when that proposition is true, and when you're faced with a decision problem $D$ in which all of the options are $A$-options, to be the payoff when $A$ is true of whichever $A$-option available in $D$ maximises expected payoff by the lights of $p$. And we define the payoff of having credence $p$ in the proposition $A$ when that proposition is false, and when you're faced with a decision problem $D$ in which all of the options are $A$-options, to be the payoff when $A$ is false of whichever $A$-option available in $D$ maximises expected payoff by the lights of $p$. So the payoff of having a credence is the payoff of the option you're required to pick using that credence.

Finally, we make the move that is central to Schervish's and Levinstein's work. We now know the payoff of having a particular credence in proposition $A$ when you face a decision problem in which all options are $A$-options. But of course we don't know which such decision problems we'll face. So, when we evaluate the payoff of having a credence in $A$ when $A$ is true, for instance, we look at all the decision problems populated by $A$-options we might face and weight them by how likely we are to face them, and then take the payoff of having that credence when $A$ is true to be the expected payoff of the $A$-options it would lead us to choose when faced with those decision problems. And then we note, as Schervish and Levinstein themselves note: if we make certain natural assumptions about how likely we are to face different decisions, then this resulting measure of the pragmatic payoff of having credence $p$ in proposition $A$ is a continuous and strictly proper scoring rule. That is, mathematically, the functions we use to measure the pragmatic value of a credence are identical to the functions we use to evaluate the epistemic value of a credence in the epistemic utility argument for Probabilism and Consensus.**
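
Here is a hedged illustration of the Schervish-Levinstein point under one very simple assumption about the decision problems we face: each is a take-it-or-leave-it £1 bet on $A$ whose price is drawn uniformly from £0 to £1. The sketch estimates the pragmatic payoff of holding credence $p$ when $A$ is true and when it is false, and then checks that, from the point of view of any probability $q$, the credence that maximises expected pragmatic payoff is $q$ itself, which is what strict propriety requires. The uniform price distribution is just one choice; the results cover a much wider class of assumptions.

```python
import random

random.seed(0)
PRICES = [random.random() for _ in range(20000)]  # bet prices we might face, uniform on [0, 1]

def pragmatic_value(p, a_true):
    """Average payoff of holding credence p in A, over random £1 bets at the sampled prices.
    With credence p you accept a bet priced at x exactly when p > x (Extended Ramsey's Thesis)."""
    total = 0.0
    for x in PRICES:
        if p > x:
            total += (1.0 - x) if a_true else -x
    return total / len(PRICES)

grid = [i / 100 for i in range(101)]
for q in [0.2, 0.5, 0.8]:   # the probability doing the expecting
    best_p = max(grid, key=lambda p: q * pragmatic_value(p, True) + (1 - q) * pragmatic_value(p, False))
    print(q, best_p)         # best_p lands at (or right next to) q
```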

With this construction in place, we can piggyback on the theorems stated in the previous post to give new pragmatic arguments for Probabilism and Consensus. First: Suppose your credences do not obey Probabilism. Then there are alternative ones you might have instead that do obey that norm and, at any world, if we look at each decision problem you might face and ask what payoff you'd receive at that world were you to choose from the options in that decision problem as the two different sets of credences require, and then weight those payoffs by how likely you are to face that decision to give their expected payoff, then the alternatives will always have the greater expected payoff. This gives strong reason to obey Probabilism.

Second: Take a group of individuals. Now suppose the group's credences do not obey Consensus. Then there are alternative credences each member might have instead such that, if they were to have them, the group would obey Consensus and, at any world, if we look at each decision problem each member might face and ask what payoff that individual would receive at that world were they to choose from the options in that decision problem as the two different sets of credences require, and then weight those payoffs by how likely they are to face that decision to give their expected payoff, then the alternatives will always have the greater expected total payoff when this is summed across the whole group.

So that is our new and better pragmatic argument for Consensus. The sure loss argument points out a single downside to a group that violates the norm. Such a group is vulnerable to exploitation. But it remains silent on whether there are upsides that might balance out that downside. The present argument addresses that problem. It finds that, if a group violates the norm, there are alternative credences they might have that are guaranteed to serve them better in expectation as a basis for decision making.

* Notice that, if $x < pS$, then the expected payoff of a bet that pays $S$ if $A$ is true and $0$ if $A$ is false is
$$
p(-x + S) + (1-p)(-x) = pS - x
$$
which is positive. So, if the two options are accept or reject the bet, accepting maximises expected payoff by the lights of $p$, and so it is required, as Ramsey's Thesis says.

** Konek (ta) gives a clear formal treatment of this solution. For those who want the technical details, I'd recommend the Appendix of that paper. I think he presents it better than I did in (Pettigrew 2020).

References

Good, I. J. (1967). On the Principle of Total Evidence. The British Journal for the Philosophy of Science, 17, 319–322.

Konek, J. (ta). Degrees of incoherence, dutch bookability & guidance value. Philosophical Studies.

Levinstein, B. A. (2017). A Pragmatist’s Guide to Epistemic Utility. Philosophy of Science, 84(4), 613–638.

Pettigrew, R. (2019). Choosing for Changing Selves. Oxford, UK: Oxford University Press.

Pettigrew, R. (2020). Dutch Book Arguments. Elements in Decision Theory and Philosophy. Cambridge, UK: Cambridge University Press.

Schervish, M. J. (1989). A general method for comparing probability assessors. The Annals of Statistics, 17, 1856–1879.

Staffel, J. (2020). Unsettled Thoughts. Oxford University Press.


Thursday, 5 May 2022

Should we agree? I: the arguments for consensus

You can find a PDF of this blogpost here.

Should everyone agree with everyone else? Whenever two members of a group have an opinion about the same claim, should they both be equally confident in it? If this is sometimes required of groups, of which ones is it required and when? Whole societies at any time in their existence? Smaller collectives when they're engaged in some joint project?

Of course, you might think these are purely academic questions, since there's no way we could achieve such consensus even if we were to conclude that it is desirable, but that seems too strong. Education systems and the media can be deployed to push a population towards consensus, and indeed this is exactly how authoritarian states often proceed. Similarly, social sanctions can create incentives for conformity. So it seems that a reasonable degree of consensus might be possible.

But is it desirable? In this series of blogposts, I want to explore two formal arguments. They purport to establish that groups should be in perfect agreement; and they explain why getting closer to consensus is better, even if perfect agreement isn't achieved---in this case, a miss is not as good as a mile. It's still a long way from their conclusions to practical conclusions about how to structure a society, but they point sufficiently strongly in a surprising direction that it is worth exploring them. In this first post, I set out the arguments as they have been given in the literature and polish them up a bit so that they are as strong as possible.

Since they're formal arguments, they require a bit of mathematics, both in their official statement and in the results on which they rely. But I want to make the discussion as accessible as possible, so, in the main body of the blogpost, I state the arguments almost entirely without formalism. Then, in the technical appendix, I sketch some of the formal detail for those who are interested.

Two sorts of argument for credal norms

There are two sorts of argument we most often use to justify the norms we take to govern our credences: there are pragmatic arguments, of which the betting arguments are the most famous; and there are epistemic arguments, of which the epistemic utility arguments are the most well known.

Take the norm of Probabilism, for instance, which says that your credences should obey the axioms of the probability calculus. The betting argument for Probabilism is sometimes known as the Dutch Book or sure loss argument.* It begins by claiming that the maximum amount you are willing to pay for a bet on a proposition that pays out a certain amount if the proposition is true and nothing if it is false is proportional to your credence in that proposition. Then it shows that, if your credences do not obey the probability axioms, there is a set of bets each of which they require you to accept, but which when taken together lose you money for sure; and if your credences do obey those axioms, there is no such set of bets.

The epistemic utility argument for Probabilism, on the other hand, begins by claiming that any measure of the epistemic value of credences must have certain properties.** It then shows that, by the lights of any epistemic utility function that does have those properties, if your credences do not obey the probability axioms, then there are alternatives that are guaranteed to have greater epistemic utility than yours; and if they do obey those axioms, there are no such alternatives.

Bearing all of this in mind, consider the following two facts.

(I) Suppose we make the same assumptions about which bets an individual's credences require them to accept that we make in the betting argument for Probabilism. Then, if two members of a group assign different credences to the same proposition, there is a bet the first should accept and a bet the second should accept that, taken together, leave the group poorer for sure (Ryder 1981, Gillies 1991). 

(II) Suppose we measure the epistemic value of credences using an epistemic utility function that boasts the properties required of it by the epistemic utility argument for Probabilism. Then, if two members of a group assign different credences to the same proposition, there is a single credence such that the group is guaranteed to have greater total epistemic utility if every member adopts that single credence in that proposition (Kopec 2012).

Given the epistemic utility and betting arguments for Probabilism, neither (I) nor (II) is very surprising. After all, one consequence of Probabilism is that an individual must assign the same credence to two propositions that have the same truth value as a matter of logic. But from the point of view of the betting argument or the epistemic utility argument, this is structurally identical to the requirement that two different people assign the same credence to the same proposition, since obviously a single proposition necessarily has the same truth value as itself! However we construct the sure loss bets against the individual who violates the consequence of Probabilism, we can use an analogous strategy to construct the sure loss bets against the pair who disagree in the credences they assign. And however we construct the alternative credences that are guaranteed to be more accurate than the ones that violate the consequence of Probabilism, we can use an analogous strategy to construct the alternative credence that, if adopted by all members of the group that contains two individuals who currently disagree, would increase their total epistemic utility for sure.
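
Fact (II) is easy to check numerically with the Brier score (an illustrative choice of epistemic utility measure, here reported as inaccuracy, so lower is better) and some invented credences: if two members assign different credences to the same proposition, then both adopting the midpoint strictly reduces the group's total inaccuracy whether the proposition is true or false.

```python
def total_brier(credences, w):
    """Total Brier inaccuracy of a list of credences in A at truth value w (1 or 0)."""
    return sum((c - w) ** 2 for c in credences)

p1, p2 = 0.7, 0.4           # two members disagree about the same proposition (illustrative numbers)
m = (p1 + p2) / 2           # the single compromise credence

for w in (1, 0):            # A true, A false
    print(w, total_brier([p1, p2], w), total_brier([m, m], w))
    # the compromise pair is strictly less inaccurate at both truth values
```

For other strictly proper measures the dominating compromise credence needn't be the midpoint, but fact (II) says that some such single credence always exists.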

Just as a betting argument and an epistemic utility argument aim to establish the individual norm of Probabilism, we might ask whether there is a group norm for which we can give a betting argument and an epistemic utility argument by appealing to (I) and (II). That is the question I'd like to explore in these posts. In the remainder of this post, I'll spell out the details of the epistemic utility argument and the betting argument for Probabilism, and then adapt those to give analogous arguments for Consensus.

The Epistemic Utility Argument for Probabilism

Two small bits of terminology first:

  • Your agenda is the set of propositions about which you have an opinion. We'll assume throughout that all individuals have finite agendas.
  • Your credence function takes each proposition in your agenda and returns your credence in that proposition.

With those in hand, we can state Probabilism:

Probabilism Rationality requires of an individual that their credence function is a probability function. 

What does it mean to say that a credence function is a probability function? There are two cases to consider.

First, suppose that, whenever a proposition is in your agenda, its negation is as well; and whenever two propositions are in your agenda, their conjunction and their disjunction are as well. When this holds, we say that your agenda is a Boolean algebra. And in that case your credence function is a probability function if two conditions hold: first, you assign the minimum possible credence, namely 0, to any contradiction and the maximum possible credence, namely 1, to any tautology; second, your credence in a disjunction is the sum of your credences in the disjuncts less your credence in their conjunction (just like the number of people in two groups is the number in the first plus the number in the second less the number in both).

Second, suppose that your agenda is not a Boolean algebra. In that case, your credence function is a probability function if it is possible to extend it to a probability function on the smallest Boolean algebra that contains your agenda. That is, it's possible to fill out your agenda so that it's closed under negation, conjunction, and disjunction, and then extend your credence function so that it assigns credences to those new propositions in such a way that the result is a probability function on the expanded agenda. Defining probability functions on agendas that are not Boolean algebras allows us to say, for instance, that, if your agenda is just It will be windy tomorrow and It will be windy and rainy tomorrow, and you assign credence 0.6 to It will be windy and 0.8 to It will be windy and rainy, then you violate Probabilism because there's no way to assign credences to It won't be windy, It will be windy or rainy, It won't be rainy, etc. in such a way that the result is a probability function.
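
That extendability check can be run mechanically as a small linear program over the four ways the weather could turn out: we ask whether any probability distribution over those possibilities assigns 0.6 to It will be windy and 0.8 to It will be windy and rainy. A hedged sketch using scipy's linear-programming routine (an implementation convenience; nothing in the argument depends on it):

```python
from scipy.optimize import linprog

# Worlds: (windy & rainy, windy & not rainy, not windy & rainy, not windy & not rainy)
A_eq = [
    [1, 1, 1, 1],   # probabilities sum to 1
    [1, 1, 0, 0],   # credence in "windy"           should be 0.6
    [1, 0, 0, 0],   # credence in "windy and rainy" should be 0.8
]
b_eq = [1, 0.6, 0.8]

# Feasibility problem: any non-negative solution would be a probabilistic extension.
result = linprog(c=[0, 0, 0, 0], A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4)
print(result.success)   # False: no extension exists, so Probabilism is violated
```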

The Epistemic Utility Argument for Probabilism begins with three claims about how to measure the epistemic value of a whole credence function. The first is Individual Additivity, which says that the epistemic utility of a whole credence function is simply the sum of the epistemic utilities of the individual credences it assigns. The second is Continuity, which says that, for any proposition, the epistemic utility of a credence in that proposition is a continuous function of that credence. And the third is Strict Propriety, which says that, for any proposition, each credence in that proposition should expect itself to have greater epistemic utility than it expects any alternative credence in that proposition to have. With this account in hand, the argument then appeals to a mathematical theorem, which tells us two consequences of measuring epistemic value using an epistemic utility function that has the three properties just described, namely, Individual Additivity, Continuity, and Strict Propriety.

(i) For any credence function that violates Probabilism, there is a credence function defined on the same agenda that satisfies it and that has greater epistemic utility regardless of how the world turns out. In this case, we say that the alternative credence function dominates the original one. 

(ii) For any credence function that is a probability function, there is no credence function that dominates it. Indeed, there is no alternative credence function that is even as good as it at every world. For any alternative, there will be some world where that alternative is strictly worse.

The argument concludes by claiming that an option is irrational if there is some alternative that is guaranteed to be better and no option that is guaranteed to be better than that alternative.

The Epistemic Utility Argument for Consensus

As I stated it above, and as it is usually stated in the literature, Consensus says that, whenever two members of a group assign credences to the same proposition, they should assign the same credence. But in fact the epistemic argument in its favour establishes something stronger. Here it is: 

Consensus Rationality requires of a group that there is a single probability function defined on the union of the agendas of all of the members of the group such that the credence function of each member assigns the same credence to any proposition in their agenda as this probability function does.

This goes further than simply requiring that all agents agree on the credence they assign to any proposition to which they all assign credences. Indeed, it would place constraints even on a group whose members' agendas do not overlap at all. For instance, if you have credence 0.6 that it will be rainy tomorrow, while I have credence 0.8 that it will be rainy and windy, the pair of us will jointly violate Consensus, even though we don't assign credences to any of the same propositions, since no probability function assigns 0.6 to one proposition and 0.8 to the conjunction of that proposition with another one. In these cases, we say that the group's credences don't cohere.

One notable feature of Consensus is that it purports to govern groups, not individuals, and we might wonder what it could mean to say that a group is irrational. I'll return to that in a later post. It will be useful to have the epistemic utility and betting arguments for Consensus to hand first.

The Epistemic Utility Argument for Consensus begins, as the epistemic argument for Probabilism does, with Individual Additivity, Continuity, and Strict Propriety. And it adds to those Group Additivity, which says that the group's epistemic utility is the sum of the epistemic utilities of the credence functions of its members. With this account of group epistemic value in hand, the argument then appeals again to a mathematical theorem, but a different one, which tells us two consequences of Group and Individual Additivity, Continuity, and Strict Propriety:***

(i) For any group that violates Consensus, there is, for each individual, an alternative credence function defined on their agenda that they might adopt such that, if all were to adopt these, the group would satisfy Consensus and it would have greater total epistemic utility regardless of how the world turns out. In this case, we say that the alternative credence functions collectively dominate the original ones.

(ii) For any group that satisfies Consensus, there are no credence functions the group might adopt that collectively dominate it.

The argument concludes by assuming again the norm that an option is irrational if there is some alternative that is guaranteed to be better.
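
To make the collective dominance concrete, here is a hedged Python sketch of the rainy/windy pair described above, using the Brier score as the measure of epistemic utility (reported as inaccuracy, so lower is better). Following the construction in the technical appendix below, it projects the pair's combined credences onto the set of coherent combined credences, the closed convex hull of the possible truth-value assignments, and checks that the projected credences have lower total inaccuracy at every world:

```python
import numpy as np
from scipy.optimize import minimize

# Joint credences: you have 0.6 in Rain, I have 0.8 in Rain & Windy.
c = np.array([0.6, 0.8])

# Worlds, written as (truth value of Rain, truth value of Rain & Windy).
worlds = np.array([[1, 1], [1, 0], [0, 0]])   # (0, 1) is impossible, so it is left out

# Project c onto the closed convex hull of the worlds (Brier score = squared Euclidean distance).
def dist(mu):
    return np.sum((mu @ worlds - c) ** 2)

res = minimize(dist, x0=np.ones(3) / 3, method="SLSQP",
               bounds=[(0, 1)] * 3,
               constraints=[{"type": "eq", "fun": lambda mu: mu.sum() - 1}])
c_star = res.x @ worlds
print(np.round(c_star, 3))                    # roughly (0.7, 0.7): a coherent pair of credences

# The projection collectively dominates: less total Brier inaccuracy at every world.
for w in worlds:
    print(w, np.sum((c - w) ** 2) > np.sum((c_star - w) ** 2))   # True at each world
```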

The Sure Loss Argument for Probabilism

The Sure Loss Argument for Probabilism begins with a claim that I call Ramsey's Thesis. It tells you the prices at which your credences require you to buy and sell bets. It says that, if your credence in $A$ is $p$, and $£x < £pS$, then you should be prepared to pay $£x$ for a bet that pays out $£S$ if $A$ is true and $£0$ if $A$ is false. And this is true for any stakes $S$, whether positive, negative, or zero. Then it appeals to a mathematical theorem, which tells us two consequences of Ramsey's Thesis.

(i) For any credence function that violates Probabilism, there is a series of bets, each of which your credences require you to accept, that, taken together, lose you money for sure.

(ii) For any credence function that satisfies Probabilism, there is no such series of bets.

The argument concludes by assuming a norm that says that it is irrational to have credences that require you to make a series of choices when there is an alternative series of choices you might have made that would be better regardless of how the world turns out.

The Sure Loss Argument for Consensus

The Sure Loss Argument for Consensus also begins with Ramsey's Thesis.  It appeals to a mathematical theorem that tells us two consequences of Ramsey's Thesis.

(i) For any group that violates Consensus, there is a series of bets, each offered to a member of the group whose credences require that they accept it, that, taken together, lose the group money for sure.

(ii) For any group that satisfies Consensus, there is no such series of bets.

And it concludes by assuming that it is irrational for the members of a group to have credences that require them to make a series of choices when there is an alternative series of choices they might have made that would be better for the group regardless of how the world turns out.
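
And here is a minimal Python sketch of the corresponding sure loss for the rainy/windy pair from earlier, with prices chosen just inside what Ramsey's Thesis licenses (the particular numbers are invented for illustration):

```python
from itertools import product

# You have credence 0.6 in Rain; I have credence 0.8 in Rain & Windy.
# Ramsey's Thesis with stake S = -1 licenses you to pay -£0.61 for a bet that pays
# -£1 if Rain, i.e. to sell a £1 bet on Rain for £0.61, since -0.61 < 0.6 * -1.
# It licenses me to pay £0.79 for a £1 bet on Rain & Windy, since 0.79 < 0.8 * 1.

for rain, windy in product([True, False], repeat=2):
    your_net = 0.61 - (1.0 if rain else 0.0)                 # you sold the bet on Rain
    my_net = (1.0 if (rain and windy) else 0.0) - 0.79       # I bought the bet on Rain & Windy
    print(f"rain={rain!s:5} windy={windy!s:5}  group net = £{your_net + my_net:+.2f}")
    # the group's net is negative at every world: a sure loss
```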

So now we have the Epistemic Utility and Sure Loss Arguments for Consensus. In fact, I think the Sure Loss Argument doesn't work. So in the next post I'll say why and provide a better alternative based on work by Mark Schervish and Ben Levinstein. But in the meantime, here's the technical appendix.

Technical appendix

First, note that Probabilism is the special case of Consensus when the group has only one member. So we focus on establishing Consensus.

Some definitions to begin:

  • If $c$ is a credence function defined on the agenda $\mathcal{F}_i = \{A^i_1, \ldots, A^i_{k_i}\}$, represent it as a vector as follows:$$c = \langle c(A^i_1), \ldots, c(A^i_{k_i})\rangle$$
  • Let $\mathcal{C}_i$ be the set of credence functions defined on $\mathcal{F}_i$, represented as vectors in this way.
  • If $c_1, \ldots, c_n$ are credence functions defined on $\mathcal{F}_1, \ldots, \mathcal{F}_n$ respectively, represent them collectively as a vector as follows:
    $$
    c_1 \frown \ldots \frown c_n = \langle c_1(A^1_1), \ldots, c_1(A^1_{k_1}), \ldots, c_n(A^n_1), \ldots, c_n(A^n_{k_n}) \rangle
    $$
  • Let $\mathcal{C}$ be the set of sequences of credence functions defined on $\mathcal{F}_1, \ldots, \mathcal{F}_n$ respectively, represented as vectors in this way. 
  • If $w$ is a classically consistent assignment of truth values to the propositions in $\mathcal{F}_i$, represent it as a vector $$w = \langle w(A^i_1), \ldots, w(A^i_{k_i})\rangle$$ where $w(A) = 1$ if $A$ is true according to $w$, and $w(A) = 0$ if $A$ is false according to $w$.
  • Let $\mathcal{W}_i$ be the set of classically consistent assignments of truth values to the propositions in $\mathcal{F}_i$, represented as vectors in this way.
  • If $w$ is a classically consistent assignment of truth values to the propositions in $\mathcal{F} = \bigcup^n_{i=1} \mathcal{F}_i$, represent the restriction of $w$ to $\mathcal{F}_i$ by the vector $$w_i = \langle w(A^i_1), \ldots, w(A^i_{k_i})\rangle$$So $w_i$ is in $\mathcal{W}_i$. And represent $w$ as a vector as follows:
    $$
    w = w_1 \frown \ldots \frown w_n = \langle w(A^1_1), \ldots, w(A^1_{k_1}), \ldots, w(A^n_1), \ldots, w(A^n_{k_n})\rangle
    $$
  • Let $\mathcal{W}$ be the set of classically consistent assignments of truth values to the propositions in $\mathcal{F}$, represented as vectors in this way.

Then we have the following result, which generalizes a result due to de Finetti (1974):

Proposition 1 A group of individuals with credence functions $c_1, \ldots, c_n$ satisfy Consensus iff $c_1 \frown \ldots \frown c_n$ is in the closed convex hull of $\mathcal{W}$.

We then appeal to two sets of results. First, one concerning epistemic utility measures, which generalizes a result due to Predd et al. (2009):

Theorem 1

(i) Suppose $\mathfrak{A}_i : \mathcal{C}_i \times \mathcal{W}_i \rightarrow [0, 1]$ is a measure of epistemic utility that satisfies Individual Additivity, Continuity, and Strict Propriety. Then there is a Bregman divergence $\mathfrak{D}_i : \mathcal{C}_i \times \mathcal{C}_i \rightarrow [0, 1]$ such that $\mathfrak{A}_i(c, w) = -\mathfrak{D}_i(w, c)$.

(ii) Suppose $\mathfrak{D}_1, \ldots, \mathfrak{D}_n$ are Bregman divergences defined on $\mathcal{C}_1, \ldots, \mathcal{C}_n$, respectively. And suppose $\mathcal{Z}$ is a closed convex subset of $\mathcal{C}$. And suppose $c_1 \frown \ldots \frown c_n$ is not in $\mathcal{Z}$. Then there is $c^\star_1 \frown \ldots \frown c^\star_n$ in $\mathcal{Z}$ such that, for all $z_1 \frown \ldots \frown z_n$ in $\mathcal{Z}$,
$$
\sum^n_{i=1} \mathfrak{D}_i(z_i, c^\star_i) < \sum^n_{i=1} \mathfrak{D}_i(z_i, c_i)
$$

So, by Proposition 1, if a group $c_1, \ldots, c_n$ does not satisfy Consensus, then $c_1 \frown \ldots \frown c_n$ is not in the closed convex hull of $\mathcal{W}$, and so, by Theorem 1 (taking $\mathcal{Z}$ to be that closed convex hull), there is $c^\star_1 \frown \ldots \frown c^\star_n$ in the closed convex hull of $\mathcal{W}$ (and so, by Proposition 1 again, a group of credence functions that satisfies Consensus) such that, for all $w = w_1 \frown \ldots \frown w_n$ in $\mathcal{W}$,$$\sum^n_{i=1} \mathfrak{A}_i(c_i, w_i) < \sum^n_{i=1} \mathfrak{A}_i(c^\star_i, w_i)$$as required.

Second, concerning bets, which is a consequence of the Separating Hyperplane Theorem:

Theorem 2
Suppose $\mathcal{Z}$ is a closed convex subset of $\mathcal{C}$. And suppose $c_1 \frown \ldots \frown c_n$ is not in $\mathcal{Z}$. Then there are vectors
$$
x = \langle x^1_1, \ldots, x^1_{k_1}, \ldots, x^n_1, \ldots, x^n_{k_n}\rangle
$$
and
$$
S = \langle S^1_1, \ldots, S^1_{k_1}, \ldots, S^n_1, \ldots, S^n_{k_n}\rangle
$$
such that, for all $x^i_j$ and $S^i_j$,
$$
x^i_j < c_i(A^i_j)S^i_j
$$
and, for all $z$ in $\mathcal{Z}$,
$$
\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j > \sum^n_{i=1} \sum^{k_i}_{j=1} z^i_jS^i_j
$$

So, by Proposition 1, if a group $c_1, \ldots, c_n$ does not satisfy Consensus, then $c_1 \frown \ldots \frown c_n$ is not in the closed convex hull of $\mathcal{W}$, and so, by Theorem 2, there is $x = \langle x^1_1, \ldots, x^1_{k_1}, \ldots, x^n_1, \ldots, x^n_{k_n}\rangle$ and $S = \langle S^1_1, \ldots, S^1_{k_1}, \ldots, S^n_1, \ldots, S^n_{k_n}\rangle$ such that (i) $x^i_j < c_i(A^i_j)S^i_j$ and (ii) for all $w$ in $\mathcal{W}$,
$$\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j > \sum^n_{i=1} \sum^{k_i}_{j=1} w(A^i_j)S^i_j$$
But then (i) says that the credences of individual $i$ require them to pay $£x^i_j$ for a bet on $A^i_j$ that pays out $£S^i_j$ if $A^i_j$ is true and $£0$ if it is false. And (ii) says that the total price of these bets across all members of the group---namely, $£\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j$---is greater than the amount the bets will pay out at any world---namely, $£\sum^n_{i=1} \sum^{k_i}_{j=1} w(A^i_j)S^i_j$.

* This was introduced independently by Frank P. Ramsey (1931) and Bruno de Finetti (1937). For overviews, see (Hájek 2008, Vineberg 2016, Pettigrew 2020).

**Much of the discussion of these arguments in the literature focusses on versions on which the epistemic value of a credence is taken to be its accuracy. This literature begins with Rosenkrantz (1981) and Joyce (1998). But, following Joyce (2009) and Predd et al. (2009), it has been appreciated that we need not necessarily assume that accuracy is the only source of epistemic value in order to get the argument going.

*** Matthew Kopec (2012) offers a proof of a slightly weaker result. It doesn't quite work because it assumes that all strictly proper measures of epistemic value are convex, when they are not---the spherical scoring rule is not. I offer an alternative proof of this stronger result in the technical appendix below.

References

de Finetti, B. (1937 [1980]). Foresight: Its Logical Laws, Its Subjective Sources. In H. E. Kyburg, & H. E. K. Smokler (Eds.) Studies in Subjective Probability. Huntingdon, N. Y.: Robert E. Kreiger Publishing Co.

de Finetti, B. (1974). Theory of Probability, vol. I. New York: John Wiley & Sons.

Gillies, D. (1991). Intersubjective probability and confirmation theory. The British Journal for the Philosophy of Science, 42(4), 513–533.

Hájek, A. (2008). Dutch Book Arguments. In P. Anand, P. Pattanaik, & C. Puppe (Eds.) The Oxford Handbook of Rational and Social Choice, (pp. 173–195). Oxford: Oxford University Press.

Joyce, J. M. (1998). A Nonpragmatic Vindication of Probabilism. Philosophy of Science, 65(4), 575–603.

Joyce, J. M. (2009). Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief. In F. Huber, & C. Schmidt-Petri (Eds.) Degrees of Belief. Dordrecht and Heidelberg: Springer.

Kopec, M. (2012). We ought to agree: A consequence of repairing Goldman’s group scoring rule. Episteme, 9(2), 101–114.

Pettigrew, R. (2020). Dutch Book Arguments. Cambridge University Press.

Predd, J., Seiringer, R., Lieb, E. H., Osherson, D., Poor, V., & Kulkarni, S. (2009). Probabilistic Coherence and Proper Scoring Rules. IEEE Transactions on Information Theory, 55(10), 4786–4792.

Ramsey, F. P. (1926 [1931]). Truth and Probability. In R. B. Braithwaite (Ed.) The Foundations of Mathematics and Other Logical Essays, chap. VII, (pp. 156–198). London: Kegan, Paul, Trench, Trubner & Co.

Rosenkrantz, R. D. (1981). Foundations and Applications of Inductive Probability. Atascadero, CA: Ridgeview Press.

Ryder, J. (1981). Consequences of a simple extension of the Dutch Book argument. The British Journal for the Philosophy of Science, 32(2), 164–167.

Vineberg, S. (2016). Dutch Book Arguments. In E. N. Zalta (Ed.) Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University.


Tuesday, 13 April 2021

What we together risk: three vignettes in search of a theory

For a PDF version of this post, see here.

Many years ago, I was climbing Sgùrr na Banachdich with my friend Alex. It's a mountain in the Black Cuillin, a horseshoe of summits that surround Loch Coruisk at the southern end of the Isle of Skye. It's a Munro---that is, it stands over 3,000 feet above sea level---but only just---it measures 3,166 feet. About halfway through our ascent, the mist rolled in and the rain came down heavily, as it often does near these mountains, which attract their own weather system. At that point, my friend and I faced a choice: to continue our attempt on the summit or begin our descent. Should we continue, there were a number of possible outcomes: we might reach the summit wet and cold but not injured, with the mist and rain gone and in their place sun and views across to Bruach na Frìthe and the distinctive teeth-shaped peaks of Sgùrr nan Gillean; or we might reach the summit without injury, but the mist might remain, obscuring any view at all; or we might get injured on the way and either have to descend early under our own steam or call for help getting off the mountain. On the other hand, should we start our descent now, we would of course have no chance of the summit, but we were sure to make it back unharmed, for the path back is good and less affected by rain.

Alex and I had climbed together a great deal that summer and the summer before. We had talked at length about what we enjoyed in climbing and what we feared. To the extent that such comparisons make sense and can be known, we both knew that we both gained exactly the same pleasure from reaching a summit, the same additional pleasure if the view was clear; we gained the same displeasure from injury, the same horror at the thought of having to call for assistance getting off a mountain. What's more, we both agreed exactly on how likely each possible outcome was: how likely we were to sustain an injury should we persevere; how likely that the mist would clear in the coming few hours; and so on. Nonetheless, I wished to turn back, while Alex wanted to continue.

How could that be? We both agreed how good or bad each of the options was, and both agreed how likely each would be were we to take either of the courses of action available to us. Surely we should therefore have agreed on which course of action would maximise our expected utility, and therefore agreed which would be best to undertake. Yes, we did agree on which course of action would maximise our expected utility. However, no, we did not therefore agree on which was best, for there are theories of rational decision-making that do not demand that you must rank options by their expected utility. These are the risk-sensitive decision theories, and they include John Quiggin's rank-dependent decision theory and Lara Buchak's risk-weighted expected utility theory. According to Quiggin's and Buchak's theories, what you consider best is not determined only by your utilities and your probabilities, but also by your attitudes to risk. The more risk-averse will give greater weight to the worst-case scenarios and less to the best-case ones than expected utility demands; the more risk-inclined will give greater weight to the best outcomes and less to the worst than expected utility does; and the risk-neutral person will give exactly the weights prescribed by expected utility theory. So, perhaps I preferred to begin our descent from Sgùrr na Banachdich while Alex preferred to continue upwards because I was risk-averse and he was risk-neutral or risk-seeking, or I was risk-neutral and he was risk-seeking. In any case, he must have been less risk-averse than I was.
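To make this concrete, here is a small Python sketch of a rank-dependent evaluation of the sort Quiggin's and Buchak's theories employ: outcomes are ordered from worst to best, and the probability of doing at least that well is passed through a risk function before it weights each improvement. The utilities and probabilities for the mountain case are invented for illustration, and the quadratic risk function is just one example of a risk-averse weighting.

```python
# Risk-weighted (rank-dependent) expected utility, sketched.
# For outcomes with utilities u_1 <= ... <= u_n and probabilities p_1, ..., p_n:
#   REU = u_1 + sum_{i >= 2} r(P(utility >= u_i)) * (u_i - u_{i-1})
# With r(x) = x this is ordinary expected utility; a convex r (e.g. r(x) = x**2)
# gives the worst outcomes extra weight, i.e. risk-aversion.

def reu(outcomes, r):
    """outcomes: list of (probability, utility) pairs; r: risk function."""
    outcomes = sorted(outcomes, key=lambda pu: pu[1])      # worst to best
    value = outcomes[0][1]
    for i in range(1, len(outcomes)):
        prob_at_least = sum(p for p, _ in outcomes[i:])    # P(utility >= u_i)
        value += r(prob_at_least) * (outcomes[i][1] - outcomes[i - 1][1])
    return value

# Invented numbers for the choice on the mountain:
descend = [(1.0, 0.0)]                # the sure thing: a safe, viewless descent
continue_up = [(0.10, -15.0),         # injured and needing help off the mountain
               (0.45, 2.0),           # summit reached, but still in mist
               (0.45, 5.0)]           # summit reached, with the view

risk_neutral = lambda x: x            # recovers expected utility
risk_averse = lambda x: x ** 2        # one possible risk-averse weighting

for label, r in [("risk-neutral", risk_neutral), ("risk-averse", risk_averse)]:
    print(label, "descend:", reu(descend, r), "continue:", round(reu(continue_up, r), 3))

# With these numbers, the risk-neutral evaluation favours continuing (1.65 > 0),
# while the risk-averse one favours descending (about -0.62 < 0), even though
# the utilities and probabilities are shared.
```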

Of course, as it turned out, we sat on a mossy rock in the rain and discussed what to do. We decided to turn back. Luckily, as it happened, for a thunderstorm hit the mountains an hour later at just the time we'd have been returning from the summit. But suppose we weren't able to discuss the decision. Suppose we'd roped ourselves together to avoid getting separated in the mist, and he'd taken the lead, forcing him to make the choice on behalf of both of us. In that case, what should he have done?

As I will do throughout these reflections, let me simply report my own reaction to the case. I think, in that case, Alex should have chosen to descend (and not only because that was my preference---I'd have thought the same had it been he who wished to descend and me who wanted to continue!). Had he chosen to continue---even if all had turned out well and we'd reached the summit unharmed and looked over the Cuillin ridge in the sun---I would still say that he chose wrongly on our behalf. This suggests the following principle (in joint work, Ittay Nissan Rozen and Jonathan Fiat argue for a version of this principle that applies in situations in which the individuals do not assign the same utilities to the outcomes):

Principle 1  Suppose two people assign the same utilities to the possible outcomes, and assign the same probabilities to the outcomes conditional on choosing a particular course of action. And suppose that you are required to choose between those courses of action on their behalf. Then you must choose whatever the more risk-averse of the two would choose.

However, I think the principle is mistaken. A few years after our unsuccessful attempt on Sgùrr na Banachdich, I was living in Bristol and trying to decide whether to take up a postdoctoral fellowship there or a different one based in Paris (a situation that seems an unimaginable luxury and privilege when I look at today's academic job market). Staying in Bristol was the safe bet; moving to Paris was a gamble. I already knew what it would be like to live in Bristol and what the department was like. I knew I'd enjoy it a great deal. I'd visited Paris, but I didn't know what it would be like to live there, and I knew the philosophical scene even less. I knew I'd enjoy living there, but I didn't know how much. I figured I might enjoy it a great deal more than Bristol, but also I might enjoy it somewhat less. The choice was complicated because my partner at the time would move too, if that's what we decided to do. Fortunately, just as Alex and I agreed on how much we valued the different outcomes that faced us on the mountain, so my partner and I agreed on how much we'd value staying in Bristol, how much we'd value living in Paris under the first, optimistic scenario, and how much we'd value living there under the second, more pessimistic scenario. We also agreed how likely the two Parisian scenarios were---we'd heard the same friends describing their experiences of living there, and we'd drawn the same conclusions about how likely we were to value the experience ourselves to different extents. Nonetheless, just as Alex and I had disagreed on whether or not to start our descent despite our shared utilities and probabilities, so my partner and I disagreed on whether or not to move to Paris. Again the more risk-averse of the two, I wanted to stay in Bristol, while he wanted to move to Paris. Again, of course, we sat down to discuss this. But suppose that hadn't been possible. Perhaps my partner had to make the decision for both of us at short notice and I was not available to consult. How should he have chosen?

In this case, I think either choice would have been permissible. My partner might have chosen Paris or he might have chosen Bristol and either of these would have been allowed. But of course this runs contrary to Principle 1.

So what is the crucial difference between the decision on Sgùrr na Banachdich and the decision whether to move cities? In each case, there is an option---beginning our descent or staying in Bristol---that is certain to have a particular level of value; and there is an alternative option---continuing to climb or moving to Paris---that might give less value than the sure thing, but might give more. And, in each case, the more risk-averse person prefers the sure thing to the gamble, while the more risk-inclined prefers the gamble. So why must someone choosing for me and Alex in the first case choose to descend, while someone choosing for me and my partner in the second case may choose either Bristol or Paris?

Here's my attempt at a diagnosis: in the choice of cities, there is no risk of harm, while in the decision on the mountain, there is. In the first case, the gamble opens up a possible outcome in which we're harmed---we are injured, perhaps quite badly. In the second case, the gamble doesn't do that---we countenance the possibility that moving to Paris might not be as enjoyable as remaining in Bristol, but we are certain it won't harm us! This suggests the following principle:

Principle 2  Suppose two people assign the same utilities to the possible outcomes, and assign the same probabilities to the outcomes conditional on choosing a particular course of action. And suppose that you are required to choose between those courses of action on their behalf. Then there are two cases: if one of the available options opens the possibility of a harm, then you must choose whatever the more risk-averse of the two would choose; if neither of the available options opens the possibility of a harm, then you may choose an option if at least one of the two would choose it. 

So risk-averse preferences do not always take precedence, but they do when harms are involved. Why might that be?

A natural answer: to expose someone to the risk of a harm requires their consent. That is, when there is an alternative option that opens no possibility of harm, you are only allowed to choose an option that opens up the possibility of a harm if everyone affected would consent to being subject to that risk. So Alex should only choose to continue our ascent and expose us to the risk of injury if I would consent to that, and of course I wouldn't, since I'd prefer to descend. But my partner is free to choose the move to Paris even though I wouldn't choose that, because it exposes us to no risk of harm.

A couple of things to note: First, in our explanation, reference to risk-aversion, risk-neutrality, and risk-inclination has dropped out. What is important is not who is more averse to risk, but who consents to what. Second, our account will only work if we employ an absolute notion of harm. That is, I must say that there is some threshold and an option harms you if it causes your utility to fall below that threshold. We cannot use a relative notion of harm on which an option harms you if it merely causes your utility to fall. After all, using a relative notion of harm, the move to Paris will harm you should it turn out to be worse than staying in Bristol.

The problem with Principle 2 and the explanation we have just given is that it does not generalise to cases in which more than two people are involved. That is, the following principle seems false:

Principle 3  Suppose each member of a group of people assigns the same utilities to the possible outcomes, and assigns the same probabilities to the outcomes conditional on choosing a particular course of action. And suppose that you are required to choose between those courses of action on their behalf. Then there are two cases: if one of the available options opens the possibility of a harm, then you must choose whatever the most risk-averse of them would choose; if neither of the available options opens the possibility of a harm, then you may choose an option if at least one member of the group would choose it.

A third vignette might help to illustrate this.

I grew up between two power stations. My high school stood in the shadow of the coal-fired plant at Cockenzie, while the school where my mother taught stood in the lee of the nuclear plant at Torness Point. And I was born two years after the Three Mile Island accident and the Chernobyl tragedy happened as I started school. So the risks of nuclear power were somewhat prominent growing up. Now, let's imagine a community of five million people who currently generate their energy from coal-fired plants---a community like Scotland in 1964, just before its first nuclear plant was constructed. This community is deciding whether to build nuclear plants to replace its coal-fired ones. All agree that having a nuclear plant that suffered no accidents would be vastly preferable to having coal plants, and all agree that a nuclear plant that suffered an accident would be vastly worse than the coal plants. And we might imagine that they also all assign the same probability to the prospective nuclear plants suffering an accident---perhaps they all defer to a recent report from the country's atomic energy authority. But, while they agree on the utilities and the probabilities, they don't all have the same attitudes to risk. In the end, 4.5 million people prefer to build the nuclear facilities, while half a million, who are more risk-averse, prefer to retain the coal-fired alternatives. Principle 3 says that, for someone choosing on behalf of this population, the only option they can choose is to retain the coal-fired plants. After all, a nuclear accident is clearly a harm, and there are individuals who would suffer that harm who would not consent to being exposed to the risk. But surely that's wrong. Surely, despite such opposition, it would be acceptable to build the nuclear plant.

So, while Principle 2 might yet be true, Principle 3 is wrong. And I think my attempt to explain the basis of Principle 2 must be wrong as well, for if it were right, it would also support Principle 3. After all, in no other case I can think of in which a lack of consent is sufficient to block an action does that block disappear if there are sufficiently many people in favour of the action.

So what general principles underpin our reactions to these three vignettes? Why do the preferences of the more risk-averse individuals carry more weight when one of the outcomes involves a harm than when they don't, but not enough weight to overrule a significantly greater number of more risk-inclined individuals? That's the theory I'm in search of here.

Tuesday, 6 April 2021

Believing is said of groups in many ways

For a PDF version of this post, see here.

In defence of pluralism

Recently, after a couple of hours discussing a problem in the philosophy of mathematics, a colleague mentioned that he wanted to propose a sort of pluralism as a solution. We were debating the foundations of mathematics, and he wanted to consider the claim that there might be no single unique foundation, but rather many different foundations, no one of them better than the others. Before he did so, though, he wanted to preface his suggestion with an apology. Pluralism, he admitted, is unpopular wherever it is proposed as a solution to a longstanding philosophical problem. 

I agree with his sociological observation. Philosophers tend to react badly to pluralist solutions. But why? And is the reaction reasonable? This is pure speculative generalisation based on my limited experience, but I've found that the most common source of resistance is a conviction that there is a particular special role that the concept in question must play; and moreover, in that role, whether or not something falls under the concept determines some important issue concerning it. So, in the philosophy of mathematics, you might think that a proof of a mathematical proposition is legitimate just in case it can be carried out in the system that provides the foundation for mathematics. And, if you allow a plurality of foundations of differing logical strength, the legitimacy of certain proofs becomes indeterminate---relative to some foundations, they're legit; relative to others, they aren't. Similarly, you might think that a person who accidentally poisons another person is innocent of murder if, and only if, they were justified in their belief that the liquid they administered was not poisonous. And, if you allow a plurality of concepts of justification, then whether or not the person is innocent might become indeterminate.

I tend to respond to such concerns in two ways. First, I note that, while the special role that my interlocutor picks out for the concept we're discussing is certainly among the roles that this concept needs to play, it isn't the only one; and it is usually not clear why we should take it to be the most important one. One role for a foundation of mathematics is to test the legitimacy of proofs; but another is to provide a universal language that mathematicians might use, and that might help them discover new mathematical truths (see this paper by Jean-Pierre Marquis for a pluralist approach that takes both of these roles seriously).

Second, I note that we usually determine the important issues in question independently of the concept and then use our determinations to test an account of the concept, not the other way around. So, for instance, we usually begin by determining whether we think a particular proof is legitimate---perhaps by asking what it assumes and whether we have good reason for believing that those assumptions are true---and then see whether a particular foundation measures up by asking whether the proof can be carried out within it. We don't proceed the other way around. And we usually determine whether or not a person is innocent independently of our concept of justification---perhaps just by looking at the evidence they had and their account of the reasoning they undertook---and then see whether a particular account of justification measures up by asking whether the person is innocent according to it. Again, we don't proceed the other way around.

For these two reasons, I tend not to be very moved by arguments against pluralism. Moreover, while it's true that pluralism is often greeted with a roll of the eyes, there are a number of cases in which it has gained wide acceptance. We no longer talk of the probability of an event but distinguish between its chance of occurring, a particular individual's credence in it occurring, and perhaps even its evidential probability relative to a body of evidence. That is, we are pluralists about probability. Similarly, we no longer talk of a particular belief being justified simpliciter, but distinguish between propositional, doxastic, and personal justification. We are, along some dimensions at least, pluralists about justification. We no longer talk of a person having a reason to choose one thing rather than another, but distinguish between their internal and external reasons.

I want to argue that we should extend pluralism to so-called group beliefs or collective beliefs. Britain believes lockdowns are necessary to slow the virus. Scotland believes it would fare well economically as an independent country. The University believes the pension fund has been undervalued and requires no further increase in contributions in the near future to meet its obligations in the further future. In 1916, Russia believed Rasputin was dishonest. In each of these sentences, we seem to ascribe a belief to a group or collective entity. When is it correct to do this? I want to argue that there is no single answer. Rather, as Aristotle said of being, believing is said of groups in many ways---that is, a pluralist account is appropriate.

I've been thinking about this recently because I've been reading Jennifer Lackey's fascinating new book, The Epistemology of Groups (all page numbers in what follows refer to that). In it, Lackey offers an account of group belief, justified group belief, group knowledge, and group assertion. I'll focus here only on the first.

Lackey's treatment of group belief

Three accounts of group belief

Lackey considers two existing accounts of group belief as well as her own proposal. 

The first, due to Margaret Gilbert and with amendments by Raimo Tuomela, is a non-summative account that treats groups as having 'a mind of their own'. Lackey calls it the Joint Acceptance Account (JAA). I'll stick with the simpler Gilbert version, since the points I'll make don't rely on Tuomela's more involved amendment (24):

JAA  A group $G$ believes that $p$ iff it is common knowledge in $G$ that the members of $G$ individually have intentionally and openly expressed their willingness jointly to accept that $p$ with the other members of $G$.

The second, due to Philip Pettit, is a summative account that treats group belief as strongly linked to individual belief. Lackey calls it the Premise-Based Aggregation Account (PBAA) (29). Here's a rough paraphrase:

PBAA  A group $G$ believes that $p$ iff there is some collection of propositions $q_1, \ldots, q_n$ such that (i) it is common knowledge among the operative members of $G$ that $p$ is true iff each $q_i$ is true, (ii) for each operative member of $G$, they believe $p$ iff they believe each $q_i$, and (iii) for each $q_i$, the majority of operative members of $G$ believe $q_i$.

Lackey's own proposal is the Group Agent Account (GAA) (48-9):

GAA  A group $G$ believes that $p$ iff (i) there is a significant percentage of $G$'s operative members who believe that $p$, and (ii) are such that adding together the bases of their beliefs that $p$ yields a belief set that is not substantively incoherent.

Group lies (and bullshit) and judgment fragility: two desiderata for accounts of group belief

To distinguish between these three accounts, Lackey enumerates four desiderata for accounts of group belief that she takes to tell against JAA and PBAA and in favour of GAA. The first three are related to an objection to Gilbert's account of group belief that was developed by K. Brad Wray, A. W. M. Meijers, and Raul Hakli in the 2000s. According to this, JAA makes it too easy for groups to actively, consciously, and intentionally choose what they believe: all they need to do is intentionally and openly express their willingness jointly to accept the proposition in question. Lackey notes two consequences of this: (a) on such an account, it is difficult to give a satisfactory account of group lies (or group bullshit, though I'll focus on group lies); (b) on such an account, whether or not a group believes something at a particular time is sensitive to the group's situation at that time in a way that beliefs should not be.

So Lackey's first desideratum for an account of group belief is that it must be able to accommodate a plausible account of group lies (and the second that it accommodate group bullshit, but as I said I'll leave that for now). Suppose each member of a group strongly believes $p$ on the basis of excellent evidence that they all share, but they also know that the institution will be culpable of a serious crime if it is taken to believe $p$. Then they might jointly agree to accept $\neg p$. And, if they do, Gilbert must say that they do believe $\neg p$. But were they to assert $\neg p$, we would take the group to have lied, which would require that it believes $p$. The point is that, if a group's belief is so thoroughly within its voluntary control, it can manipulate it whenever it likes in order to avoid ever lying in situations in which dishonesty would be subject to censure.

Lackey's third desideratum for an account of group belief is that such belief should not be rendered sensitive in certain ways to the situation in which the group formed it. Suppose that, on the basis of the same shared evidence, a substantial majority of members of a group judge the horse Cisco most likely to win the race, the horse Jasper next most likely, and the horse Whiskey very unlikely to win. But, again on the basis of this same shared body of evidence, the remaining minority of members judge Whiskey most likely to win, Jasper next most likely, and Cisco very unlikely to win. The group would like a consensus before it reports its opinion, but time is short---the race is about to begin, say, and the group has been asked for its opinion before the starting gates open. So, in order to achieve something close to a consensus, it unanimously agrees to accept that Jasper will win, even though he is everyone's second favourite. Yet we might also assume that, had time not been short, the majority would have been able to persuade the minority of Cisco's virtues; and, in that case, they'd unanimously agree to accept that Cisco will win. So, according to Gilbert's account, under time pressure, the group believes Jasper will win, while with world enough and time, they would have believed that Cisco will win. Lackey holds that no account of group belief should make it sensitive to the situation in which it is formed in this way, and thus rejects JAA.

Lackey argues that any account of group belief must satisfy the two desiderata we've just considered. I agree that we need at least one account of group belief that satisfies the first desideratum, but I'm not convinced that all need do this---but I'll leave that for later, when I try to motivate pluralism. For now, I'd like to explain why I'm not convinced that any account needs to satisfy the third desideratum, the one concerning sensitivity to the situation in which the belief is formed. After all, we know from various empirical studies in social psychology, as well as our experience as thinkers and reasoners and believers, that our ordinary beliefs as individuals are sensitive to the situation in which they're formed in just the sort of way that Lackey wishes to rule out for the beliefs of groups. One of the central theses of Amos Tversky and Daniel Kahneman's work is that we use a different reasoning system when we are forced to make a judgment under time pressure from the one we use when more time is available. So, when my implicit biases are mobilised under time pressure, I might come to believe that a particular job candidate is incompetent, while I might judge them to be competent were I to have more time to assess their track record and override my irrational hasty judgment. And, whenever we are faced with a complex body of evidence that, on the face of it, seems to point in one direction, but which, under closer scrutiny, points in the opposite direction, we will form a different belief if we must do so under time pressure than if we have greater leisure to unpick and balance the different components of the evidence. If individual beliefs can be sensitive to the situation in which they're formed in this way, I see no reason why group beliefs might not also be sensitive in this way.

Before moving on, I'd like to consider whether the PBAA---Pettit's premise-based aggregation account---satisfies Lackey's first desideratum. If it doesn't, it can't be for the same reason that Gilbert's JAA doesn't. After all, according to the PBAA, the group's belief is no more under its voluntary control than the beliefs of its individual members. If, for each $q_i$, a majority believes $q_i$, then the group believes $p$. The only way a group could manipulate its belief is by manipulating the beliefs of its members. But if that sort of manipulation rules out a group belief, Lackey's account is just as vulnerable.

So why does Lackey think that PBAA cannot adequately account for group lies? She considers a case in which the three board members of a tobacco company know that smoking is safe to health iff it doesn't cause lung cancer and it doesn't cause emphysema and it doesn't cause heart disease. The first member believes it doesn't cause lung cancer or heart disease, but believes it does cause emphysema, and so believes it is not safe to health; the second believes it doesn't cause emphysema or heart disease, but it does cause lung cancer, and so believes it is not safe to health; and the third believes it doesn't cause lung cancer or emphysema, but it does cause heart disease, and so believes it is not safe to health. The case is illustrated in Table 1.

Table 1. Does the member believe that smoking causes...?

             Lung cancer    Emphysema    Heart disease    Believes it safe?
Member 1     No             Yes          No               No
Member 2     Yes            No           No               No
Member 3     No             No           Yes              No

Then each board member believes it is not safe to health, but PBAA says that the group believes it is, because a majority (first and third) believe it doesn't cause lung cancer, a majority (second and third) believe it doesn't cause emphysema, and a majority (first and second) believe it doesn't cause heart disease. If the company then asserts that it is safe to health, then Lackey claims that it lies, while PBAA says that it believes the proposition it asserts and so does not lie.
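The structure here is the familiar discursive dilemma, and it can be checked mechanically. Here is a small Python sketch of the case in Table 1; the member labels and the helper function are mine, for illustration only.

```python
# Premise-by-premise aggregation versus each member's own conclusion (Table 1).
# True means the member believes smoking does NOT cause the condition.
beliefs = {
    "member 1": {"lung cancer": True,  "emphysema": False, "heart disease": True},
    "member 2": {"lung cancer": False, "emphysema": True,  "heart disease": True},
    "member 3": {"lung cancer": True,  "emphysema": True,  "heart disease": False},
}

def majority(values):
    values = list(values)
    return sum(values) > len(values) / 2

# Each member's own conclusion: safe iff they think it causes none of the three.
conclusions = {m: all(b.values()) for m, b in beliefs.items()}
print(conclusions)  # every member concludes that smoking is NOT safe to health

# Premise-based aggregation: majority vote on each premise, then draw the conclusion.
premises = beliefs["member 1"].keys()
group_premises = {p: majority(beliefs[m][p] for m in beliefs) for p in premises}
print(group_premises)                 # each premise passes by a 2-1 majority
print(all(group_premises.values()))   # so the premise-based group belief is 'safe'
```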

I think this case is a bit tricky. I suspect our reaction to it is influenced by our knowledge of how the real-world version played out and the devastating effect it has had. So let us imagine that this group of three is not the board of a tobacco company, but the scientific committee of a public health organisation. The structure of the case will be exactly the same, and the nature of the organisation should not affect whether or not belief is present. Now suppose that, since the stakes are so high, each member would only come to believe of a specific putative risk that it is not present if their credence that it is not present is above 95%. That is, there is some pragmatic encroachment here to the extent that the threshold for belief is determined in part by the stakes involved. And suppose further that the first member of the scientific committee has credence 99% that smoking doesn't cause lung cancer, 99% that it doesn't cause heart disease, and 93% that it doesn't cause emphysema. And let's suppose that, by a tragic bout of bad luck that has bestowed on them very misleading evidence, the evidence available to them supports these credences. Then their credence that smoking is safe to health must be at most 93%---since the probability of a conjunction must be at most the probability of any of the conjuncts---and thus below 95%. So the first member doesn't believe it is safe to health. And suppose the same for the other two members of the committee, but for the other combinations of risks. So the second is 99% sure it doesn't cause emphysema and 99% sure it doesn't cause heart disease, but only 93% sure it doesn't cause lung cancer. And the third is 99% sure it doesn't cause lung cancer and 99% sure it doesn't cause emphysema, but only 93% sure it doesn't cause heart disease. So none of the three believe that smoking is safe to health. The case is illustrated in Table 2. 

Table 2. Credence that smoking does not cause...

             Lung cancer    Emphysema    Heart disease    Safe to health
Member 1     99%            93%          99%              at most 93%
Member 2     93%            99%          99%              at most 93%
Member 3     99%            99%          93%              at most 93%

However, just averaging the group's credences in each of the three specific risks, we might say that it is 97% sure that smoking doesn't cause lung cancer, 97% sure it doesn't cause emphysema, and 97% sure it doesn't cause heart disease ($\frac{0.99 + 0.99 + 0.93}{3} = 0.97$). And it is then possible that the group assigns a higher than 95% credence to the conjunction of these three. And, if it does, it seems to me, the PBAA may well get things right, and the group does not lie if it says that smoking carries no health risks.

Nonetheless, I think the PBAA cannot be right. In the example I just described, I noted that, just taking a straight average gives, for each specific risk, a credence of 97% that it doesn't exist. And I noted that it's then possible that the group credence that smoking is safe to health is above 95%. But of course, it's also possible that it's below 95%. This would happen, for instance, if the group were to take the three risks to be independent. Then the group credence that smoking is safe to health would be a little over 91%---too low for the group to believe it given the stakes. But PBAA would still say that the group believes that smoking is safe to health. The point is that PBAA is not sufficiently sensitive to the more fine-grained attitudes to the propositions that lie behind the beliefs in those propositions. Simply knowing what each member believes about the three putative risks is not sufficient to determine what the group thinks about them. You also need to look to their credences.
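To make the arithmetic explicit, here is a small Python sketch of the two calculations just described: straight linear averaging of the credences in Table 2, and then the conjunction under an assumption that the group treats the three risks as independent. The 95% threshold is the one stipulated in the example.

```python
# Credences that smoking does NOT cause each condition (as in Table 2).
credences = {
    "member 1": {"lung cancer": 0.99, "emphysema": 0.93, "heart disease": 0.99},
    "member 2": {"lung cancer": 0.93, "emphysema": 0.99, "heart disease": 0.99},
    "member 3": {"lung cancer": 0.99, "emphysema": 0.99, "heart disease": 0.93},
}
threshold = 0.95   # the stakes-sensitive threshold for belief in the example

# Straight linear average of the members' credences in each premise: 0.97 each.
pooled = {p: sum(c[p] for c in credences.values()) / len(credences)
          for p in credences["member 1"]}
print(pooled)

# If the group treats the three risks as independent, its credence that smoking
# is safe to health is the product of the pooled credences: about 0.913 < 0.95.
group_credence_safe = 1.0
for value in pooled.values():
    group_credence_safe *= value
print(round(group_credence_safe, 3), group_credence_safe > threshold)
```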

Of course, there are lots of reasons to dislike straight averaging as a means for pooling credences---it can't preserve judgments of independence, for instance---and lots of reasons to dislike the naive application of a threshold or Lockean view of belief that is in the background here---it gives rise to the lottery paradox. But it seems that, for any reasonable method of probabilistic aggregation and any reasonable account of the relationship between belief and credence, there will be cases like this in which the PBAA says the group believes a proposition when it shouldn't. So I agree with Lackey that the PBAA sometimes gets things wrong, but I disagree about exactly when.

Base fragility: a further desideratum

Consider an area of science in which two theories vie for precedence, $T_1$ and $T_2$. Half of the scientists working in this area believe the following:

  • ($A_1$) $T_1$ is simpler than $T_2$,
  • ($B_1$) $T_2$ is more explanatory than $T_1$,
  • ($C_1$) simplicity always trumps explanatory power in theory choice.

These scientists consequently believe $T_1$. The other half of the scientists believe the following: 

  • ($A_2$) $T_2$ is simpler than $T_1$,
  • ($B_2$) $T_1$ is more explanatory than $T_2$,
  • ($C_2$) explanatory power always trumps simplicity in theory choice.

These scientists consequently believe $T_1$. So all scientists believe $T_1$. But they do so for diametrically opposed reasons. Indeed, all of their beliefs about the comparisons between $T_1$ and $T_2$ are in conflict, but because their views about theory choice are also in conflict, they end up believing the same theory. Does the scientific community believe $T_1$? Lackey says no. In order for a group to believe a proposition, the bases of the members' beliefs must not be substantively incoherent. In our example, for half of the members, the basis of their belief in $T_1$ is $A_1\ \&\ B_1\ \&\ C_1$, while for the other half, it's $A_2\ \&\ B_2\ \&\ C_2$. And $A_1$ contradicts $A_2$, $B_1$ contradicts $B_2$, and $C_1$ contradicts $C_2$. The bases are about as incoherent as can be. 

Is Lackey correct to say that the scientific community does not believe $T_1$ in this case? I'm not so sure. For one thing, attributing belief in $T_1$ would help to explain a lot of the group's behaviour. Why does the scientific community fund and pursue research projects that are of interest only if $T_1$ is true? Why does the scientific community endorse and teach from textbooks that give much greater space to expounding and explaining $T_1$? Why do departments in this area hire those with the mathematical expertise required to understand $T_1$ when that expertise is useless for understanding $T_2$? In each case, we might say: because the community believes $T_1$.

Lackey raises two worries about group beliefs based in incoherent bases: (i) they cannot be subject to rational evaluation; (ii) they cannot coherently figure in accounts of collective deliberation. On (ii), it seems to me that the group belief could figure in deliberation. Suppose the community is deliberating about whether to invite a $T_1$-theorist or a $T_2$-theorist to give the keynote address at the major conference in the area. It seems that the group's belief in the superiority of $T_1$ could play a role in the discussions: 'Yes, we want the speaker who will pose the greatest challenge intellectually, but we don't want to hear a string of falsehoods, so let's go with the $T_1$-theorist,' they might reason.

On (i): Lackey asks what we would say if the group were to receive new evidence that $T_1$ has greater simplicity and less explanatory power than we initially thought. For the first half of the group, this would make their belief in $T_1$ more justified; for the second half, it would make their belief less justified. What would it do to the group's belief? Without an account of justification for group belief, it's hard to say. But I don't think the incoherent bases rule out an answer. For instance, we might be reliabilists about group justification. And if we are, then we look at all the times that the members of the group have made judgments about simplicity and explanatory power that have the same pattern as they have this time---that is, half one way, half the other---and we look at the proportion of those times that the group belief---formed by whatever aggregation method we favour---has been true. If it's high, then the belief is justified; if it's not, it's not. And we can do that for the group before and after this new evidence comes in. And by doing that, we can compare the level of justification for the group belief.

Of course, this is not to say that reliabilism is the correct account of justification for group beliefs. But it does suggest that incoherent bases don't create a barrier to such accounts.

Varieties of group belief

One thing that is striking when we consider different proposed accounts of group belief is how large the supervenience base might be; that is, how many different features of a group $G$ might partially determine whether or not it believes a proposition $p$. Here's a list, though I don't pretend that it's exhaustive:

(1) The beliefs of individual members of the group

(1a) Some accounts are concerned only with individual members' beliefs in $p$; others are interested in members' beliefs beyond that. For instance, a simple majoritarian account is interested only in members' beliefs in $p$. But Pettit's PBAA is interested instead in members' beliefs in each proposition from a set $q_1, \ldots, q_n$ whose conjunction is equivalent to $p$. And Lackey's GAA is interested in the members' beliefs in $p$ as well as the members' beliefs that form the bases for their belief in $p$ when they do believe $p$.

(1b) Some accounts are concerned with the individual beliefs of all members of the group, some only with so-called operative members. For instance, some will say that what determines whether a company believes $p$ is only whether or not members of their board believe $p$, while others will say that all employees of the company count.

(2) The credences of individual members of the group

There are distinctions corresponding to (1a) and (1b) here as well.

(3) The outcomes of discussions between the members of the group

(3a) Some will say that only discussions that actually take place make a difference---you might say that, before a discussion takes place, the members of the group each believe $p$, but after they discuss it and retain those beliefs, you can say that the group believes $p$; others will say that hypothetical discussions can also make a difference---if individual members would dramatically change their beliefs were they to discuss the matter, that might mean the group does not believe, even if all members do.

(3b) Some will say that it is not the individual members' beliefs after discussion that is important, but their joint decision to accept $p$ as the group's belief. (Margaret Gilbert's JAA is such an account.)

(4) Belief-forming structures within the group

(4a) Some groups are extremely highly structured, and some of these structures relate to group belief formation. Some accounts of group belief acknowledge this by talking of 'operative members' of groups, and taking their attitudes to have greater weight in determining the group's attitude. For instance, it is common to say that the operative members of a company are its board members; the operative members of a British university might be its senior management team; the operative members of a trade union might be its executive committee. But of course many groups have much more complex structures than these. For instance, many large organisations are concerned with complex problems that break down into smaller problems, each of which requires a different sort of expertise to understand. The World Health Organization (WHO) might be such an example, or the Intergovernmental Panel on Climate Change (IPCC), or Médecins Sans Frontières (MSF). In each case, there might be a rigid reporting structure whereby subcommittees report their findings to the main committee, but each subcommittee might form its own subcommittees that report to them; and there might be strict rules about how the findings of a subcommittee must be taken into account by the committee to which it reports before that committee itself reports upwards. In such a structure, the notion of operative members and their beliefs is too crude to capture what's necessary.

(5) The actions of the group 

(5a) Some might say that a group has a belief just in case it acts in a way that is best explained by positing a group belief. Why does the scientific community persist in appointing only $T_1$-theorists and no $T_2$-theorists? Answer: It believes $T_1$. (I think Kenny Easwaran and Reuben Stern take this view in their recent joint work.)

So, in the case of group beliefs, the disagreement between different accounts does not concern only the conditions on an agreed supervenience base; it also concerns the extent of the supervenience base itself. Now, this might soften us up for pluralism, but it is hardly an argument. To give an argument, I'd like to consider a range of possible accounts and, for each, describe a role that group beliefs are typically taken to play and for which this account is best suited.

Group beliefs as summaries

One thing we do when we ascribe beliefs to groups is simply to summarise the views of the group. If I say that, in 1916, Russia believed that Rasputin was dishonest, I simply give a summary of the views of people who belong to the group to which 'Russia' refers in this sentence, namely, Russians alive in 1916. And I say roughly that a substantial majority believed that he was dishonest. 

For this role, a simple majoritarian account (SMA) seems best:

SMA  A group $G$ believes $p$ iff a substantial majority of members of $G$ believes $p$.

There is an interesting semantic point in the background here. Consider the sentence: 'At the beginning of negotiations at Brest-Litovsk in 1917-8, Russia believed Germany's demands would be less harsh than they turned out to be.' We might suppose that, in fact, this belief was not widespread in Russia, but it was almost universal among the Bolshevik government. Then we might nonetheless say that the sentence is true. At first sight, it doesn't seem that SMA can account for this. But it might do if 'Russia' refers to different groups in the two different sentences: to the whole population in 1916 in the first sentence; to the members of the Bolshevik government in the second. 

I'm tempted to think that this happens a lot when we discuss group beliefs. Groups are complex entities, and the name of a group might be used in one sentence to pick out some subset of its structure---just its members, for instance---and in another sentence some other subset of its structure---its members as well as its operative group, for instance---and in another sentence yet some further subset of its structure---its members, its operative group, and the rules by which the operative group abide when they are debating an issue.

Of course, this might look like straightforward synecdoche, but I'm inclined to think it's not, because it isn't clear that there is one default referent of the term 'Russia' such that all other terms are parasitic on that. Rather, there are just many many different group structures that might be picked out by the term, and we have to hope that context determines this with sufficient precision to evaluate the sentence.

Group beliefs as attitudes that play a functional role

An important recent development in our understanding of injustice and oppression has been the recognition of structural forms of racism, sexism, ableism, homophobia, transphobia, and so on. The notion is contested and there are many competing definitions, but to illustrate the point, let me quote from a recent article in the New England Journal of Medicine that considers structural racism in the US healthcare system:

All definitions [of structural racism] make clear that racism is not simply the result of private prejudices held by individuals, but is also produced and reproduced by laws, rules, and practices, sanctioned and even implemented by various levels of government, and embedded in the economic system as well as in cultural and societal norms (Bailey, et al. 2021).

The point is that a group---a university, perhaps, or an entire healthcare system, or a corporation---might act as if it holds racist or sexist beliefs, even though no majority of its members holds those beliefs. A university might pay academics who are women less, promote them less frequently, and so on, even while few individuals within the organisation, and certainly not a majority, believe that women's labour is worth less, and that women are less worthy of promotion. In such a case, we might wish to ascribe those beliefs to the institution as a whole. After all, on certain functionalist accounts of belief, to have a belief simply is to be in a state that has certain causal relationships with other states, including actions. And the state of a group is determined not only by the state of the individuals within it but also by the other structural features of the group, such as its laws, rules and practices. And if the states of the individuals within the group, combined with these laws, rules and practices give rise to the sort of behaviour that we would explain in an individual by positing a belief, it seems reasonable to do so in the group case as well. What's more, doing so helps to explain group behaviour in just the same way that ascribing beliefs to individuals helps to explain their behaviour. (As mentioned above, I take it that Kenny Easwaran and Reuben Stern take something like this view of group belief.)

Group beliefs as ascriptions that have legal standing

In her book, Lackey pays particular attention to cases of group belief that are relevant to corporate culpability and liability. In the 1970s, did the tobacco company Philip Morris believe that their product is hazardous to health, even while they repeatedly denied it? Between 1998 and 2014, did Volkswagen believe that their diesel emissions reports were accurate? In 2003, did the British government believe that Iraq could deploy biological weapons within forty-five minutes of an order to do so? Playing this role well is an important job for an account of group belief. It can have very significant real world consequences: Do those who trusted the assertions of tobacco companies and became ill as a result receive compensation? Do governments have a case against car manufacturers? Should a government stand down?

In fact, I think the consequences are often so large and, perhaps more importantly, so varied that the decision whether or not to put them in train should not depend on the applicability of a single concept with a single precise definition. Consider cases of corporate culpability. There are many ways in which this might be punished. We might fine the company. We might demand that it change certain internal policies or rules. We might demand that it change its corporate structure. We might do many things. Some will be appropriate and effective if the company believes a crucial proposition in one sense; some appropriate if it believes that proposition in some other sense. For instance, a fine does many things, but among them is this: it affects the wealth of the company's shareholders, who will react by putting pressure on the company's board. Thus, it might be appropriate to impose a fine if we think that the company believed the proposition that it denied in its public assertions in the sense that a substantial majority of its board believed it. On the other hand, demanding that the company change certain internal policies or rules would be appropriate if the company believes the proposition that it publicly denied in the sense that it is the outcome of applying its belief-forming rules and policies (such as, for instance, the nested set of subcommittees that I imagined for the WHO or the IPCC or MSF above).

The point is that our purpose in ascribing culpability and liability to a group is essentially pragmatic. We do it in order to determine what sort of punishment we might mete out. This is perhaps in contrast to cases of individual culpability and liability, where we are interested also in the moral status of the individual's action independent of how we respond to it. But, in many cases, such as when a corporation has lied, which punishment is appropriate depends on the sense in which the company believed the negation of the proposition it asserted in its lie---that is, on which of the many ways in which a group can believe is in play.

So it seems to me that, even if this role were the only role that our concept of group belief had to play, pluralism would be appropriate. Groups are complex entities and there are consequently many ways in which we can seek to change them in order to avoid the sorts of harms that arise when they behave badly. We need different concepts of group belief in order to identify which is appropriate in a given case.

It's perhaps worth noting that, while Lackey opens her book with cases of corporate culpability, and this is a central motivation for her emphasis on group lying, it isn't clear to me that her group agent account (GAA) can accommodate all cases of corporate lies. Consider the following situation. The board of a tobacco company is composed of eleven people. Each of them believes that tobacco is hazardous to health. However, some believe it for very different reasons from the others. They have all read the same scientific literature on the topic, but six of them remember it correctly and the other five remember it incorrectly. The six who remember it correctly remember that tobacco contains chemical A and remember that when chemical A comes into contact with tissue X in the human body, it causes cancer in that tissue; and they also remember that tobacco does not contain chemical B and they remember that, when chemical B comes into contact with tissue Y in the human body, it does not cause cancer in that tissue. The five who remember the scientific literature incorrectly believe that tobacco contains chemical B and believe that when chemical B comes into contact with tissue Y in the human body, it causes cancer in that tissue; and they also believe that tobacco does not contain chemical A and they believe that, when chemical A comes into contact with tissue X in the human body, it does not cause cancer in that tissue. So, all board members believe that smoking causes cancer. However, the bases of their beliefs form an incoherent set. The two propositions on which the six base their belief directly contradict the two propositions on which the five base theirs. The board then issues a statement saying that tobacco does not cause cancer. The board is surely lying, but according to GAA, it is not, because the bases of the members' beliefs conflict and so the board does not believe that tobacco causes cancer.