Thursday, 5 May 2022

Should we agree? I: the arguments for consensus

You can find a PDF of this blogpost here.

Should everyone agree with everyone else? Whenever two members of a group have an opinion about the same claim, should they both be equally confident in it? If this is sometimes required of groups, of which ones is it required and when? Whole societies at any time in their existence? Smaller collectives when they're engaged in some joint project?

Of course, you might think these are purely academic questions, since there's no way we could achieve such consensus even if we were to conclude that it is desirable, but that seems too strong. Education systems and the media can be deployed to push a population towards consensus, and indeed this is exactly how authoritarian states often proceed. Similarly, social sanctions can create incentives for conformity. So it seems that a reasonable degree of consensus might be possible.

But is it desirable? In this series of blogposts, I want to explore two formal arguments. They purport to establish that groups should be in perfect agreement; and they explain why getting closer to consensus is better, even if perfect agreement isn't achieved---in this case, a miss is not as good as a mile. It's still a long way from their conclusions to practical conclusions about how to structure a society, but they point sufficiently strongly in a surprising direction that it is worth exploring them. In this first post, I set out the arguments as they have been given in the literature and polish them up a bit so that they are as strong as possible.

Since they're formal arguments, they require a bit of mathematics, both in their official statement and in the results on which they rely. But I want to make the discussion as accessible as possible, so, in the main body of the blogpost, I state the arguments almost entirely without formalism. Then, in the technical appendix, I sketch some of the formal detail for those who are interested.

Two sorts of argument for credal norms

There are two sorts of argument we most often use to justify the norms we take to govern our credences: there are pragmatic arguments, of which the betting arguments are the most famous; and there are epistemic arguments, of which the epistemic utility arguments are the best known.

Take the norm of Probabilism, for instance, which says that your credences should obey the axioms of the probability calculus. The betting argument for Probabilism is sometimes known as the Dutch Book or sure loss argument.* It begins by claiming that the maximum amount you are willing to pay for a bet on a proposition that pays out a certain amount if the proposition is true and nothing if it is false is proportional to your credence in that proposition. Then it shows that, if your credences do not obey the probability axioms, there is a set of bets each of which they require you to accept, but which when taken together lose you money for sure; and if your credences do obey those axioms, there is no such set of bets.

The epistemic utility argument for Probabilism, on the other hand, begins by claiming that any measure of the epistemic value of credences must have certain properties.** It then shows that, by the lights of any epistemic utility function that does have those properties, if your credences do not obey the probability axioms, then there are alternatives that are guaranteed to have greater epistemic utility than yours; and if they do obey those axioms, there are no such alternatives.

Bearing all of this in mind, consider the following two facts.

(I) Suppose we make the same assumptions about which bets an individual's credences require them to accept that we make in the betting argument for Probabilism. Then, if two members of a group assign different credences to the same proposition, there is a bet the first should accept and a bet the second should accept that, taken together, leave the group poorer for sure (Ryder 1981, Gillies 1991). 

(II) Suppose we measure the epistemic value of credences using an epistemic utility function that boasts the properties required of it by the epistemic utility argument for Probabilism. Then, if two members of a group assign different credences to the same proposition, there is a single credence such that the group is guaranteed to have greater total epistemic utility if every member adopts that single credence in that proposition (Kopec 2012).

Given the epistemic utility and betting arguments for Probabilism, neither (I) nor (II) is very surprising. After all, one consequence of Probabilism is that an individual must assign the same credence to two propositions that have the same truth value as a matter of logic. But from the point of view of the betting argument or the epistemic utility argument, this is structurally identical to the requirement that two different people assign the same credence to the same proposition, since obviously a single proposition necessarily has the same truth value as itself! However we construct the sure loss bets against the individual who violates the consequence of Probabilism, we can use an analogous strategy to construct the sure loss bets against the pair who disagree in the credences they assign. And however we construct the alternative credences that are guaranteed to be more accurate than the ones that violate the consequence of Probabilism, we can use an analogous strategy to construct the alternative credence that, if adopted by all members of the group that contains two individuals who currently disagree, would increase their total epistemic utility for sure.

Just as a betting argument and an epistemic utility argument aim to establish the individual norm of Probabilism, we might ask whether there is a group norm for which we can give a betting argument and an epistemic utility argument by appealing to (I) and (II). That is the question I'd like to explore in these posts. In the remainder of this post, I'll spell out the details of the epistemic utility argument and the betting argument for Probabilism, and then adapt those to give analogous arguments for Consensus.

The Epistemic Utility Argument for Probabilism

Two small bits of terminology first:

  • Your agenda is the set of propositions about which you have an opinion. We'll assume throughout that all individuals have finite agendas.
  • Your credence function takes each proposition in your agenda and returns your credence in that proposition.

With those in hand, we can state Probabilism:

Probabilism: Rationality requires of an individual that their credence function is a probability function.

What does it mean to say that a credence function is a probability function? There are two cases to consider.

First, suppose that, whenever a proposition is in your agenda, its negation is as well; and whenever two propositions are in your agenda, their conjunction and their disjunction are as well. When this holds, we say that your agenda is a Boolean algebra. And in that case your credence function is a probability function if two conditions hold: first, you assign the minimum possible credence, namely 0, to any contradiction and the maximum possible credence, namely 1, to any tautology; second, your credence in a disjunction is the sum of your credences in the disjuncts less your credence in their conjunction (just like the number of people in two groups is the number in the first plus the number in the second less the number in both).

Second, suppose that your agenda is not a Boolean algebra. In that case, your credence function is a probability function if it is possible to extend it to a probability function on the smallest Boolean algebra that contains your agenda. That is, it's possible to fill out your agenda so that it's closed under negation, conjunction, and disjunction, and then extend your credence function so that it assigns credences to those new propositions in such a way that the result is a probability function on the expanded agenda. Defining probability functions on agendas that are not Boolean algebras allows us to say, for instance, that, if your agenda is just It will be windy tomorrow and It will be windy and rainy tomorrow, and you assign credence 0.6 to It will be windy and 0.8 to It will be windy and rainy, then you violate Probabilism because there's no way to assign credences to It won't be windy, It will be windy or rainy, It won't be rainy, etc., in such a way that the result is a probability function.
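To see why no extension exists in this example, a one-line derivation may help. Here $P$ stands for any candidate probability function on the expanded agenda, and $W$ and $R$ abbreviate It will be windy tomorrow and It will be rainy tomorrow; these symbols are introduced only for this illustration.

$$
P(W) = P(W \wedge R) + P(W \wedge \neg R) \geq P(W \wedge R)
$$

So every probability function assigns at least as much to $W$ as it does to $W \wedge R$, whereas the credences in the example assign $0.6$ to the former and $0.8$ to the latter.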

The Epistemic Utility Argument for Probabilism begins with three claims about how to measure the epistemic value of a whole credence function. The first is Individual Additivity, which says that the epistemic utility of a whole credence function is simply the sum of the epistemic utilities of the individual credences it assigns. The second is Continuity, which says that, for any proposition, the epistemic utility of a credence in that proposition is a continuous function of that credence. And the third is Strict Propriety, which says that, for any proposition, each credence in that proposition should expect itself to have greater epistemic utility than it expects any alternative credence in that proposition to have. With this account in hand, the argument then appeals to a mathematical theorem, which tells us two consequences of measuring epistemic value using an epistemic utility function that has the three properties just described, namely, Individual Additivity, Continuity, and Strict Propriety.

(i) For any credence function that violates Probabilism, there is a credence function defined on the same agenda that satisfies it and that has greater epistemic utility regardless of how the world turns out. In this case, we say that the alternative credence function dominates the original one. 

(ii) For any credence function that is a probability function, there is no credence function that dominates it. Indeed, there is no alternative credence function that is even as good as it at every world. For any alternative, there will be some world where that alternative is strictly worse.
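To make (i) concrete, here is a minimal sketch in Python. It assumes one particular epistemic utility function, the negative Brier score, which is additive, continuous, and strictly proper; the theorem itself covers many others. The credences are the windy-and-rainy ones from the example above, and the alternative (0.7 in both propositions) is my illustrative choice, not the only dominating option. The function names are hypothetical.

```python
from fractions import Fraction as F

# Agenda: (Windy, Windy-and-Rainy). These are the three classically
# consistent assignments of truth values to that pair of propositions.
WORLDS = [(1, 1), (1, 0), (0, 0)]

def brier_utility(credences, world):
    """Negative Brier score: the sum, over propositions, of minus the
    squared distance between the credence and the truth value."""
    return -sum((truth - c) ** 2 for c, truth in zip(credences, world))

def dominates(alternative, original):
    """True if `alternative` has strictly greater utility at every world."""
    return all(brier_utility(alternative, w) > brier_utility(original, w)
               for w in WORLDS)

original = (F(3, 5), F(4, 5))       # 0.6 in Windy, 0.8 in Windy-and-Rainy
alternative = (F(7, 10), F(7, 10))  # a probabilistic alternative

for w in WORLDS:
    print(w, brier_utility(original, w), brier_utility(alternative, w))
print(dominates(alternative, original))
```

At each of the three worlds the alternative scores strictly better, so it dominates the original credences in the sense just defined.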

The argument concludes by claiming that an option is irrational if there is some alternative that is guaranteed to be better and no option that is guaranteed to be better than that alternative.

The Epistemic Utility Argument for Consensus

As I stated it above, and as it is usually stated in the literature, Consensus says that, whenever two members of a group assign credences to the same proposition, they should assign the same credence. But in fact the epistemic argument in its favour establishes something stronger. Here it is: 

Consensus: Rationality requires of a group that there is a single probability function defined on the union of the agendas of all of the members of the group such that the credence function of each member assigns the same credence to any proposition in their agenda as this probability function does.

This goes further than simply requiring that all agents agree on the credence they assign to any proposition to which they all assign credences. Indeed, it would place constraints even on a group whose members' agendas do not overlap at all. For instance, if you have credence 0.6 that it will be rainy tomorrow, while I have credence 0.8 that it will be rainy and windy, the pair of us will jointly violate Consensus, even though we don't assign credences to any of the same propositions, since no probability function assigns 0.6 to one proposition and 0.8 to the conjunction of that proposition with another one. In these cases, we say that the group's credences don't cohere.

One notable feature of Consensus is that it purports to govern groups, not individuals, and we might wonder what it could mean to say that a group is irrational. I'll return to that in a later post. It will be useful to have the epistemic utility and betting arguments for Consensus to hand first.

The Epistemic Utility Argument for Consensus begins, as the epistemic argument for Probabilism does, with Individual Additivity, Continuity, and Strict Propriety. And it adds to those Group Additivity, which says that a group's epistemic utility is the sum of the epistemic utilities of the credence functions of its members. With this account of group epistemic value in hand, the argument then appeals again to a mathematical theorem, but a different one, which tells us two consequences of Group and Individual Additivity, Continuity, and Strict Propriety:***

(i) For any group that violates Consensus, there is, for each individual, an alternative credence function defined on their agenda that they might adopt such that, if all were to adopt these, the group would satisfy Consensus and it would be more accurate regardless of how the world turns out. In this case, we say that the alternative credence functions collectively dominate the original ones.

(ii) For any group that satisfies Consensus, there are no credence functions the group might adopt that collectively dominate it.
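Here is a small sketch of collective domination in the simplest case: two members who assign different credences to a single proposition. It again assumes the negative Brier score as the measure of epistemic utility, and it has both members move to the group's mean credence, which is one choice that works for that particular measure; other measures may call for a different meeting point. The names are hypothetical.

```python
from fractions import Fraction as F

def group_utility(credences, truth_value):
    """Group Additivity: the group's utility is the sum of its members'
    utilities, each measured here by the negative Brier score."""
    return -sum((truth_value - c) ** 2 for c in credences)

def consensus_at_mean(credences):
    """Every member adopts the group's mean credence (illustrative choice)."""
    mean = sum(credences) / len(credences)
    return [mean] * len(credences)

original = [F(3, 10), F(7, 10)]          # two members who disagree about A
consensus = consensus_at_mean(original)  # both move to 1/2

for truth_value in (0, 1):               # A false, then A true
    print(truth_value,
          group_utility(original, truth_value),
          group_utility(consensus, truth_value))
```

For the Brier score, the total squared error of the original credences exceeds that of the mean by exactly the sum of squared deviations from the mean, which is positive whenever two members disagree. That is why the improvement holds whether the proposition is true or false.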

The argument concludes by assuming again the norm that an option is irrational if there is some alternative that is guaranteed to be better.

The Sure Loss Argument for Probabilism

The Sure Loss Argument for Probabilism begins with a claim that I call Ramsey's Thesis. It tells you the prices at which your credences require you to buy and sell bets. It says that, if your credence in $A$ is $p$, and $£x < £pS$, then you should be prepared to pay $£x$ for a bet that pays out $£S$ if $A$ is true and $£0$ if $A$ is false. And this is true for any stakes $S$, whether positive, negative, or zero. Then it appeals to a mathematical theorem, which tells us two consequences of Ramsey's Thesis.

(i) For any credence function that violates Probabilism, there is a series of bets, each of which your credences require you to accept, that, taken together, lose you money for sure.

(ii) For any credence function that satisfies Probabilism, there is no such series of bets.
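A minimal sketch of (i), assuming a very simple violation of Probabilism: credence 0.6 in a proposition A and 0.6 in its negation. The function `must_accept` encodes Ramsey's Thesis as stated above; the particular prices and stakes are my illustrative choices, and all names are hypothetical.

```python
from fractions import Fraction as F

def must_accept(credence, price, stake):
    """Ramsey's Thesis: a credence p requires paying x for a bet with
    stake S whenever x < p * S."""
    return price < credence * stake

def net_gain(bets, world):
    """Total payout minus total price at a world.
    `bets` is a list of (proposition, price, stake) triples."""
    return sum(stake * world[prop] - price for prop, price, stake in bets)

# Credences in A and in not-A sum to 6/5, so they violate Probabilism.
credences = {"A": F(3, 5), "not A": F(3, 5)}

# Two bets, each priced at 0.55 and each paying 1 if its proposition is true.
bets = [("A", F(11, 20), 1), ("not A", F(11, 20), 1)]

worlds = [{"A": 1, "not A": 0}, {"A": 0, "not A": 1}]

print(all(must_accept(credences[p], x, s) for p, x, s in bets))
print([net_gain(bets, w) for w in worlds])
```

Each bet is one the credences require you to accept, yet exactly one of them pays out at any world, so you pay 1.10 in total and receive 1 for sure.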

The argument concludes by assuming a norm that says that it is irrational to have credences that require you to make a series of choices when there is an alternative series of choices you might have made that would be better regardless of how the world turns out.

The Sure Loss Argument for Consensus

The Sure Loss Argument for Consensus also begins with Ramsey's Thesis. It appeals to a mathematical theorem that tells us two consequences of Ramsey's Thesis.

(i) For any group that violates Consensus, there is a series of bets, each offered to a member of the group whose credences require that they accept it, that, taken together, lose the group money for sure.

(ii) For any group that satisfies Consensus, there is no such series of bets.
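As an illustration of (i), consider a sketch with two members who assign credences 0.7 and 0.3 to the same proposition A. It relies on the fact that Ramsey's Thesis allows negative stakes: a bet with stake -1 and price -0.35 amounts to receiving 0.35 now and paying out 1 if A is true. The prices are my illustrative choices and the names are hypothetical.

```python
from fractions import Fraction as F

def must_accept(credence, price, stake):
    """Ramsey's Thesis: accept whenever price < credence * stake."""
    return price < credence * stake

# One bet per member: (member's credence in A, price, stake).
offers = [
    (F(7, 10), F(13, 20), 1),    # pays 0.65 for a bet that pays 1 if A
    (F(3, 10), F(-7, 20), -1),   # receives 0.35, pays out 1 if A
]

def group_net_gain(offers, a_is_true):
    """Sum over members of payout minus price; a_is_true is 0 or 1."""
    return sum(stake * a_is_true - price for _, price, stake in offers)

print(all(must_accept(c, x, s) for c, x, s in offers))
print(group_net_gain(offers, 1), group_net_gain(offers, 0))
```

The two payouts cancel whatever the truth value of A, so the group is left with only the difference in prices, a loss of 0.30 at every world.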

And it concludes by assuming that it is irrational for the members of a group to have credences that require them to make a series of choices when there is an alternative series of choices they might have made that would be better for the group regardless of how the world turns out.

So now we have the Epistemic Utility and Sure Loss Arguments for Consensus. In fact, I think the Sure Loss Argument doesn't work. So in the next post I'll say why and provide a better alternative based on work by Mark Schervish and Ben Levinstein. But in the meantime, here's the technical appendix.

Technical appendix

First, note that Probabilism is the special case of Consensus when the group has only one member. So we focus on establishing Consensus.

Some definitions to begin:

  • If $c$ is a credence function defined on the agenda $\mathcal{F}_i = \{A^i_1, \ldots, A^i_{k_i}\}$, represent it as a vector as follows:$$c = \langle c(A^i_1), \ldots, c(A^i_{k_i})\rangle$$
  • Let $\mathcal{C}_i$ be the set of credence functions defined on $\mathcal{F}_i$, represented as vectors in this way.
  • If $c_1, \ldots, c_n$ are credence functions defined on $\mathcal{F}_1, \ldots, \mathcal{F}_n$ respectively, represent them collectively as a vector as follows:
    $$
    c_1 \frown \ldots \frown c_n = \langle c_1(A^1_1), \ldots, c_1(A^1_{k_1}), \ldots, c_n(A^n_1), \ldots, c_n(A^n_{k_n}) \rangle
    $$
  • Let $\mathcal{C}$ be the set of sequences of credence functions defined on $\mathcal{F}_1, \ldots, \mathcal{F}_n$ respectively, represented as vectors in this way. 
  • If $w$ is a classically consistent assignment of truth values to the propositions in $\mathcal{F}_i$, represent it as a vector $$w = \langle w(A^i_1), \ldots, w(A^i_{k_i})\rangle$$ where $w(A) = 1$ if $A$ is true according to $w$, and $w(A) = 0$ if $A$ is false according to $w$.
  • Let $\mathcal{W}_i$ be the set of classically consistent assignments of truth values to the propositions in $\mathcal{F}_i$, represented as vectors in this way.
  • If $w$ is a classically consistent assignment of truth values to the propositions in $\mathcal{F} = \bigcup^n_{i=1} \mathcal{F}_i$, represent the restriction of $w$ to $\mathcal{F}_i$ by the vector $$w_i = \langle w(A^i_1), \ldots, w(A^i_{k_i})\rangle$$So $w_i$ is in $\mathcal{W}_i$. And represent $w$ as a vector as follows:
    $$
    w = w_1 \frown \ldots \frown w_n = \langle w(A^1_1), \ldots, w(A^1_{k_1}), \ldots, w(A^n_1), \ldots, w(A^n_{k_n})\rangle
    $$
  • Let $\mathcal{W}$ be the set of classically consistent assignments of truth values to the propositions in $\mathcal{F}$, represented as vectors in this way.

Then we have the following result, which generalizes a result due to de Finetti (1974):

Proposition 1 A group of individuals with credence functions $c_1, \ldots, c_n$ satisfies Consensus iff $c_1 \frown \ldots \frown c_n$ is in the closed convex hull of $\mathcal{W}$.
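To illustrate Proposition 1, here is a sketch that builds $\mathcal{W}$ for the two-person example from the main text (you have an opinion about Rainy, I have an opinion about Rainy-and-Windy) and tests membership in its convex hull. The encoding of propositions as functions of two atoms is my own assumption. The membership test is a closed form that works only for this three-world case, where the weights on the worlds are unique; a general test would need a linear-programming solver, which I leave out.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical encoding: each proposition is a function of two atoms.
agendas = [
    [lambda rainy, windy: rainy],             # F_1: your agenda
    [lambda rainy, windy: rainy and windy],   # F_2: my agenda
]

def worlds(agendas):
    """The set W: one concatenated 0/1 vector per consistent truth assignment."""
    vectors = {
        tuple(int(prop(rainy, windy)) for agenda in agendas for prop in agenda)
        for rainy, windy in product([False, True], repeat=2)
    }
    return sorted(vectors)

def hull_weights(a, b):
    """Weights on (0, 0), (1, 0), (1, 1) that mix to the point (a, b).
    They always sum to 1; (a, b) is in the hull iff none is negative."""
    return [1 - a, a - b, b]

def satisfies_consensus(a, b):
    return all(weight >= 0 for weight in hull_weights(a, b))

W = worlds(agendas)
print(W)
print(satisfies_consensus(F(3, 5), F(4, 5)))    # credences 0.6 and 0.8
print(satisfies_consensus(F(7, 10), F(7, 10)))  # credences 0.7 and 0.7
```

With credences 0.6 and 0.8 the weight on the world where it is rainy but not windy would have to be negative, so the concatenated vector lies outside the hull and the pair violates Consensus.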

We then appeal to two sets of results. First, a result concerning epistemic utility measures, which generalizes a result due to Predd et al. (2009):

Theorem 1

(i) Suppose $\mathfrak{A}_i : \mathcal{C}_i \times \mathcal{W}_i \rightarrow [-\infty, 0]$ is a measure of epistemic utility that satisfies Individual Additivity, Continuity, and Strict Propriety. Then there is a Bregman divergence $\mathfrak{D}_i : \mathcal{C}_i \times \mathcal{C}_i \rightarrow [0, \infty]$ such that $\mathfrak{A}_i(c, w) = -\mathfrak{D}_i(w, c)$.

(ii) Suppose $\mathfrak{D}_1, \ldots, \mathfrak{D}_n$ are Bregman divergences defined on $\mathcal{C}_1, \ldots, \mathcal{C}_n$, respectively. And suppose $\mathcal{Z}$ is a closed convex subset of $\mathcal{C}$. And suppose $c_1 \frown \ldots \frown c_n$ is not in $\mathcal{Z}$. Then there is $c^\star_1 \frown \ldots \frown c^\star_n$ in $\mathcal{Z}$ such that, for all $z_1 \frown \ldots \frown z_n$ in $\mathcal{Z}$,
$$
\sum^n_{i=1} \mathfrak{D}_i(z_i, c^\star_i) < \sum^n_{i=1} \mathfrak{D}_i(z_i, c_i)
$$

So, by Proposition 1, if a group $c_1, \ldots, c_n$ does not satisfy Consensus, then $c_1 \frown \ldots \frown c_n$ is not in the closed convex hull of $\mathcal{W}$, and so, by Theorem 1, there is $c^\star_1 \frown \ldots \frown c^\star_n$ in the closed convex hull of $\mathcal{W}$ such that, for all $w$ in $\mathcal{W}$, $$\sum^n_{i=1} \mathfrak{A}_i(c_i, w_i) < \sum^n_{i=1} \mathfrak{A}_i(c^\star_i, w_i)$$ as required.
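For readers who would like to see a Bregman divergence at work, here is a sketch. It uses the standard definition $\mathfrak{D}(x, y) = \varphi(x) - \varphi(y) - \nabla\varphi(y) \cdot (x - y)$ with the generator $\varphi(x) = \sum_j x_j^2$, which yields squared Euclidean distance and so corresponds to the Brier score; that choice of generator is my assumption, and other strictly convex generators give other divergences. The group is the Rainy and Rainy-and-Windy pair again, and $c^\star$ is the Euclidean projection of their credences onto the convex hull of $\mathcal{W}$.

```python
from fractions import Fraction as F

def bregman(phi, grad_phi, x, y):
    """D(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# Generator for squared Euclidean distance (assumed for this illustration).
sum_of_squares = lambda x: sum(xi ** 2 for xi in x)
grad_sum_of_squares = lambda x: [2 * xi for xi in x]

def total_divergence(world, group):
    """Sum over members i of D_i(w_i, c_i). Each member's agenda here
    contains a single proposition, so each vector has length one."""
    return sum(bregman(sum_of_squares, grad_sum_of_squares, (w_i,), (c_i,))
               for w_i, c_i in zip(world, group))

W = [(0, 0), (1, 0), (1, 1)]         # worlds for (Rainy, Rainy-and-Windy)
c = (F(3, 5), F(4, 5))               # the group's credences, outside the hull
c_star = (F(7, 10), F(7, 10))        # their projection onto the hull

for w in W:
    print(w, total_divergence(w, c_star), total_divergence(w, c))
```

At every world the total divergence from the world to $c^\star$ is smaller than the total divergence to $c$, which is the inequality in Theorem 1(ii) restricted to the vertices of the hull.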

Second, a result concerning bets, which is a consequence of the Separating Hyperplane Theorem:

Theorem 2
Suppose $\mathcal{Z}$ is a closed convex subset of $\mathcal{C}$. And suppose $c_1 \frown \ldots \frown c_n$ is not in $\mathcal{Z}$. Then there are vectors
$$
x = \langle x^1_1, \ldots, x^1_{k_1}, \ldots, x^n_1, \ldots, x^n_{k_n}\rangle
$$
and
$$
S = \langle S^1_1, \ldots, S^1_{k_1}, \ldots, S^n_1, \ldots, S^n_{k_n}\rangle
$$
such that, for all $x^i_j$ and $S^i_j$,
$$
x^i_j < c_i(A^i_j)S^i_j
$$
and, for all $z$ in $\mathcal{Z}$,
$$
\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j > \sum^n_{i=1} \sum^{k_i}_{j=1} z^i_jS^i_j
$$

So, by Proposition 1, if a group $c_1, \ldots, c_n$ does not satisfy Consensus, then $c_1 \frown \ldots \frown c_n$ is not in the closed convex hull of $\mathcal{W}$, and so, by Theorem 2, there are $x = \langle x^1_1, \ldots, x^1_{k_1}, \ldots, x^n_1, \ldots, x^n_{k_n}\rangle$ and $S = \langle S^1_1, \ldots, S^1_{k_1}, \ldots, S^n_1, \ldots, S^n_{k_n}\rangle$ such that (i) $x^i_j < c_i(A^i_j)S^i_j$ and (ii) for all $w$ in $\mathcal{W}$,
$$\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j > \sum^n_{i=1} \sum^{k_i}_{j=1} w(A^i_j)S^i_j$$
But then (i) says that the credences of individual $i$ require them to pay $£x^i_j$ for a bet on $A^i_j$ that pays out $£S^i_j$ if $A^i_j$ is true and $£0$ if it is false. And (ii) says that the total price of these bets across all members of the group---namely, $£\sum^n_{i=1} \sum^{k_i}_{j = 1} x^i_j$---is greater than the amount the bets will pay out at any world---namely, $£\sum^n_{i=1} \sum^{k_i}_{j=1} w(A^i_j)S^i_j$.
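Conditions (i) and (ii) can be checked mechanically for any proposed vectors $x$ and $S$. Here is a sketch of such a check, applied to the Rainy and Rainy-and-Windy group from the main text. The particular prices and stakes are ones I picked by hand rather than derived from the Separating Hyperplane Theorem, and the function names are hypothetical.

```python
from fractions import Fraction as F

def requires_acceptance(credences, prices, stakes):
    """Condition (i): each price is below credence times stake."""
    return all(x < c * s for c, x, s in zip(credences, prices, stakes))

def sure_loss(prices, stakes, worlds):
    """Condition (ii): the total price exceeds the total payout at every world."""
    total_price = sum(prices)
    return all(total_price > sum(w_j * s for w_j, s in zip(w, stakes))
               for w in worlds)

W = [(0, 0), (1, 0), (1, 1)]     # worlds for (Rainy, Rainy-and-Windy)
c = (F(3, 5), F(4, 5))           # your credence in Rainy, mine in Rainy-and-Windy
S = (-1, 1)                      # you sell a bet on Rainy, I buy one on the conjunction
x = (F(-13, 20), F(3, 4))        # you receive 0.65, I pay 0.75

print(requires_acceptance(c, x, S))
print(sure_loss(x, S, W))
print(sum(x))
```

The group pays 0.10 on balance, and the combined payout is never positive, since the conjunction cannot be true when Rainy is false.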

* This was introduced independently by Frank P. Ramsey (1931) and Bruno de Finetti (1937). For overviews, see (Hájek 2008, Vineberg 2016, Pettigrew 2020).

** Much of the discussion of these arguments in the literature focusses on versions on which the epistemic value of a credence is taken to be its accuracy. This literature begins with Rosenkrantz (1981) and Joyce (1998). But, following Joyce (2009) and Predd et al. (2009), it has been appreciated that we need not necessarily assume that accuracy is the only source of epistemic value in order to get the argument going.

*** Matthew Kopec (2012) offers a proof of a slightly weaker result. It doesn't quite work because it assumes that all strictly proper measures of epistemic value are convex, when they are not---the spherical scoring rule is not. I offer an alternative proof of this stronger result in the technical appendix below.

References

de Finetti, B. (1937 [1980]). Foresight: Its Logical Laws, Its Subjective Sources. In H. E. Kyburg, & H. E. Smokler (Eds.) Studies in Subjective Probability. Huntington, N. Y.: Robert E. Krieger Publishing Co.

de Finetti, B. (1974). Theory of Probability, vol. I. New York: John Wiley & Sons.

Gillies, D. (1991). Intersubjective probability and confirmation theory. The British Journal for the Philosophy of Science, 42(4), 513–533.

Hájek, A. (2008). Dutch Book Arguments. In P. Anand, P. Pattanaik, & C. Puppe (Eds.) The Oxford Handbook of Rational and Social Choice, (pp. 173–195). Oxford: Oxford University Press.

Joyce, J. M. (1998). A Nonpragmatic Vindication of Probabilism. Philosophy of Science, 65(4), 575–603.

Joyce, J. M. (2009). Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief. In F. Huber, & C. Schmidt-Petri (Eds.) Degrees of Belief. Dordrecht and Heidelberg: Springer.

Kopec, M. (2012). We ought to agree: A consequence of repairing Goldman’s group scoring rule. Episteme, 9(2), 101–114.

Pettigrew, R. (2020). Dutch Book Arguments. Cambridge University Press.

Predd, J., Seiringer, R., Lieb, E. H., Osherson, D., Poor, V., & Kulkarni, S. (2009). Probabilistic Coherence and Proper Scoring Rules. IEEE Transactions on Information Theory, 55(10), 4786–4792.

Ramsey, F. P. (1926 [1931]). Truth and Probability. In R. B. Braithwaite (Ed.) The Foundations of Mathematics and Other Logical Essays, chap. VII, (pp. 156–198). London: Kegan, Paul, Trench, Trubner & Co.

Rosenkrantz, R. D. (1981). Foundations and Applications of Inductive Probability. Atascadero, CA: Ridgeview Press.

Ryder, J. (1981). Consequences of a simple extension of the Dutch Book argument. The British Journal for the Philosophy of Science, 32(2), 164–167.

Vineberg, S. (2016). Dutch Book Arguments. In E. N. Zalta (Ed.) Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University.

