Friday, 17 March 2023

The Robustness of the Diversity Prediction Theorem II: problems with asymmetry

There is a PDF of this post here.

Take a quantity whose value you wish to estimate---the basic reproduction number for a virus; the number of jelly beans in a jar at the school fĂȘte; the global temperature rise caused by doubling the concentration of CO2 in the atmosphere; the number of years before humanity goes extinct. Ask a group of people to provide their estimate of that value, and take the mean of their answers. The Diversity Prediction Theorem says that, if you measure distance as squared difference, so that, for example, the distance from $2$ to $5$ is $(2-5)^2$, the distance that the mean of the answers lies from the true value will be equal to the average distance from the answers to the true value less a quantity that measures the diversity of the answers, namely, the average distance from the answers to the mean answer. 

In a previous blogpost, I asked: to what extent is this result a quirk of squared difference? Extending a result due to David Pfau, I showed that it is true of exactly the Bregman divergences. But there's a problem. In the Diversity Prediction Theorem, we measure the diversity of estimates as the average distance from the individual answers to the mean answer. But why this, and not the average distance to the individual answers from the mean answer? Of course, if we use squared difference, then these values are the same, because squared difference is a symmetric measure of distance: the squared difference from one value to another is the same as the squared difference from the second value to the first. And so one set of answers will be more diverse than another according to one definition if it is more diverse according to the other definition. But squared difference is in fact the only symmetric Bregman divergence. So, for all other definitions of Bregman divergence, the two definitions of diversity come apart.

This has an odd effect on the Diversity Prediction Theorem. One of the standard lessons the theorem is supposed to teach is that the mean of a more diverse group is more accurate than the mean of a less diverse group. In fact, even the original version of the theorem doesn't tell us that. It tells us that the mean of a more diverse group is more accurate than the mean of a less diverse group when the average distance from the truth is the same for both groups. But, if we use a non-symmetric distance measure, i.e., one of the Bregman divergences that isn't squared error, and we use the alternative measure of diversity mentioned in the previous paragraph---that is,  the average distance to the individual answers from the mean answer---then we can get a case in which the mean of a less diverse group is more accurate than the mean of a more diverse group, even though the average distance from the answers to the truth is the same for both groups. So it seems that we have three choices: (i) justify using squared difference only; (ii) justify the first of the two putative definitions of diversity in terms of average distance between mean answer and the answers; (iii) give up the apparent lesson of the Diversity Prediction Theorem that diversity leads to more accurate average answers. For my money, I think (iii) is the most plausible.

Tuesday, 14 March 2023

The Robustness of the Diversity Prediction Theorem I: generalizing the result

There is a PDF version of this post here.

Pick a question to which the answer is a number. What is the population of Weston-super-Mare? How many ants are there in the world? Then go out into the street and ask the first ten people you meet for their answers. Look how far those answers lie from the truth. Now take the average of the answers---their mean. And look how far it lies from the truth. If you measure the distance between an answer and the true value in the way many statisticians do---that is, you take the difference and square it---then you'll notice that the distance of the average from the truth is less than the average distance from the truth. I can predict this with confidence because it is no coincidence, nor even just a common but contingent outcome---it's a mathematical fact. And it remains a mathematical fact for many alternative ways of measuring the distance from an answer to the truth---essentially, all that's required is that the measure is strictly convex in its first argument (Larrick, et al. 2012). That's a neat result! In expectation, you're better off going with the  average answer than picking an answer at random and going with that.

But, if we keep using squared difference, we get an even more interesting result. The distance the average answer lies from the truth is equal to the average distance the answers lie from the truth less a quantity that measures the diversity of the answers. Scott Page calls this the Diversity Prediction Theorem and uses it as part of his general argument that diversity is valuable in group reasoning and decision-making. The quantity that measures the diversity of the answer is the average distance the answers lie from the mean answer: the answers in a homogeneous set will, on average, lie close to their mean, while those in a heterogeneous set will, on average, lie far from their mean. This has an intriguing corollary: given two collections of answers with the same average distance from the truth, the mean of the more diverse one will be more accurate than the mean of the less diverse one. That's another neat result! It isn't quite as neat as people sometimes make out. Sometimes, they say that it shows that more diverse sets of answers are more accurate; but that's only true when you're comparing sets with the same average distance from the truth; increasing diversity can often increase the average distance from the truth. But it's still a neat result!

Thursday, 4 August 2022

Should longtermists recommend hastening extinction rather than delaying it?

I'm cross-posting this from the Effective Altruism Forum. It's the latest version of my critique of longtermism that uses Lara Buchak's risk-weighted expected utility theory.

Here's the abstract: Longtermism is the view that the most urgent global priorities, and those to which we should devote the largest portion of our current resources, are those that focus on ensuring a long future for humanity, and perhaps sentient or intelligent life more generally, and improving the quality of those lives in that long future. The central argument for this conclusion is that, given a fixed amount of a resource that we are able to devote to global priorities, the longtermist’s favoured interventions have greater expected goodness than each of the other available interventions, including those that focus on the health and well-being of the current population. In this paper, I argue that, even granting the longtermist's axiology and their consequentialist ethics, we are not morally required to choose whatever option maximises expected utility, and may not be permitted to do so. Instead, if their axiology and consequentialism is correct, we should choose using a decision theory that is sensitive to risk, and allows us to give greater weight to worse-case outcomes than expected utility theory. And such decision theories do not recommend longtermist interventions. Indeed, sometimes, they recommend hastening human extinction. Many, though not all, will take this as a reductio of the longtermist's axiology or consequentialist ethics. I remain agnostic on the conclusion we should draw. 

A kitten playing the long game

Wednesday, 27 July 2022

Self-recommending decision theories for imprecise probabilities

A PDF version of this post is available here.

The question of this blogpost is this: Take the various decision theories that have been proposed for individuals with imprecise probabilities---do they recommend themselves? It is the final post in a trilogy on the topic of self-recommending decision theories (the others are here and here).

One precise kitten and one imprecise kitten

Tuesday, 19 July 2022

More on self-recommending decision theories

A PDF of this blogpost can be found here.

Last week, I wrote about how we might judge a decision theory by its own lights. I suggested that we might ask the decision theory whether it would choose to adopt itself as a decision procedure if it were uncertain about which decisions it would face. And I noted that many instances of Lara Buchak's risk-weighted expected utility theory (REU) do not recommend themselves when asked this question. In this post, I want to give a little more detail about that case, and also note a second decision theory that doesn't recommend itself, namely, $\Gamma$-Maximin (MM), a decision theory designed to be used when uncertainty is modeled by imprecise probabilities.

A cat judging you...harshly
 

Monday, 11 July 2022

Self-recommending decision theories

A PDF of this blogpost is available here.

Once again, I find myself stumbling upon a philosophical thought that seems so natural that I feel reasonably confident it must have been explored before, but I can't find where. So, in this blogpost, I'll set it out in the hope that a kind reader will know where to find a proper version already written up fully.*

I'd like to develop a type of objection that might be raised against a theory of rational decision-making. Here, I'll raise it against Lara Buchak's risk-weighted expected utility theory, in particular, but there will be many other theories to which it applies.

In brief, the objection applies to decision theories that are not self-recommending. That is, it applies to a decision theory if there is a particular instance of that theory that recommends that you use some alternative decision theory to make your decision; if you were to use this decision theory to choose which decision theory to use to make your choices, it would tell you to choose a different one, and not itself. We might naturally say that a decision theory that is not self-recommending in this sense is not a coherent means by which to make decisions, and that seems to be a strong strike against it.

A self-recommending Timothy Dalton

Thursday, 23 June 2022

Aggregating for accuracy: another accuracy argument for linear pooling

 A PDF of this blogpost is available here.

I don't have an estimate for how long it will be before the Greenland ice sheet collapses, and I don't have an estimate for how long it will be before the average temperature at Earth's surface rises more than 3C above pre-industrial levels. But I know a bunch of people who do have such estimates, and I might hope that learning theirs might help me set mine. Unfortunately, each of these people has a different estimate for each of these two quantities. What should I do? Should I pick one of them at random and adopt their estimates as mine? Or should I pick some compromise between them? If the latter, which compromise?

Cat inaccurately estimates width of step

Wednesday, 18 May 2022

Should we agree? III: the rationality of groups

In the previous two posts in this series (here and here), I described two arguments for the conclusion that the members of a group should agree. One was an epistemic argument and one a pragmatic argument. Suppose you have a group of individuals. Given an individual, we call the set of propositions to which they assign a credence their agenda. The group's agenda is the union of its member's agendas; that is, it includes any proposition to which some member of the group assigns a credence. The precise conclusion of the two arguments we describe is this: the group is irrational if there no single probability function defined on the group's agenda that gives the credences of each member of the group when restricted to their agenda. Following Matt Kopec, I called this norm Consensus. 

Cats showing a frankly concerning degree of consensus

Friday, 6 May 2022

Should we agree? II: a new pragmatic argument for consensus

There is a PDF version of this blogpost available here.

In the previous post, I introduced the norm of Consensus. This is a claim about the rationality of groups. Suppose you've got a group of individuals. For each individual, call the set of propositions to which they assign a credence their agenda. They might all have quite different agendas, some of them might overlap, others might not. We might say that the credal states of these individual members cohere with one another if there is a some probability function that is defined for any proposition that appears in any member's agenda, and the credences each member assigns to the propositions in their agenda match those assigned by this probability function to those propositions. Then Consensus says that a group is irrational if it does not cohere.

A group coming to consensus

Thursday, 5 May 2022

Should we agree? I: the arguments for consensus

You can find a PDF of this blogpost here.

Should everyone agree with everyone else? Whenever two members of a group have an opinion about the same claim, should they both be equally confident in it? If this is sometimes required of groups, of which ones is it required and when? Whole societies at any time in their existence? Smaller collectives when they're engaged in some joint project?

Of course, you might think these are purely academic questions, since there's no way we could achieve such consensus even if we were to conclude that it is desirable, but that seems too strong. Education systems and the media can be deployed to push a population towards consensus, and indeed this is exactly how authoritarian states often proceed. Similarly, social sanctions can create incentives for conformity. So it seems that a reasonable degree of consensus might be possible.

But is it desirable? In this series of blogposts, I want to explore two formal arguments. They purport to establish that groups should be in perfect agreement; and they explain why getting closer to consensus is better, even if perfect agreement isn't achieved---in this case, a miss is not as good as a mile. It's still a long way from their conclusions to practical conclusions about how to structure a society, but they point sufficiently strongly in a surprising direction that it is worth exploring them. In this first post, I set out the arguments as they have been given in the literature and polish them up a bit so that they are as strong as possible.

Since they're formal arguments, they require a bit of mathematics, both in their official statement and in the results on which they rely. But I want to make the discussion as accessible as possible, so, in the main body of the blogpost, I state the arguments almost entirely without formalism. Then, in the technical appendix, I sketch some of the formal detail for those who are interested.

Tuesday, 13 April 2021

What we together risk: three vignettes in search of a theory

For a PDF version of this post, see here.

Many years ago, I was climbing SgĂčrr na Banachdich with my friend Alex. It's a mountain in the Black Cuillin, a horseshoe of summits that surround Loch Coruisk at the southern end of the Isle of Skye. It's a Munro---that is, it stands over 3,000 feet above sea level---but only just---it measures 3,166 feet. About halfway through our ascent, the mist rolled in and the rain came down heavily, as it often does near these mountains, which attract their own weather system. At that point, my friend and I faced a choice: to continue our attempt on the summit or begin our descent. Should we continue, there were a number of possible outcomes: we might reach the summit wet and cold but not injured, with the mist and rain gone and in their place sun and views across to Bruach na FrĂŹthe and the distinctive teeth-shaped peaks of SgĂčrr nan Gillean; or we might reach the summit without injury, but the mist might remain, obscuring any view at all; or we might get injured on the way and either have to descend early under our own steam or call for help getting off the mountain. On the other hand, should we start our descent now, we would of course have no chance of the summit, but we were sure to make it back unharmed, for the path back is good and less affected by rain.

Tuesday, 6 April 2021

Believing is said of groups in many ways

For a PDF version of this post, see here.

In defence of pluralism

Recently, after a couple of hours discussing a problem in the philosophy of mathematics, a colleague mentioned that he wanted to propose a sort of pluralism as a solution. We were debating the foundations of mathematics, and he wanted to consider the claim that there might be no single unique foundation, but rather many different foundations, no one of them better than the others. Before he did so, though, he wanted to preface his suggestion with an apology. Pluralism, he admitted, is unpopular wherever it is proposed as a solution to a longstanding philosophical problem. 

Sunday, 14 March 2021

Permissivism and social choice: a response to Blessenohl

In a recent paper discussing Lara Buchak's risk-weighted expected utility theory, Simon Blessenohl notes that the objection he raises there to Buchak's theory might also tell against permissivism about rational credence. I offer a response to the objection here.

Wednesday, 6 January 2021

Life on the edge: a response to Schultheis' challenge to epistemic permissivism about credences

In their 2018 paper, 'Living on the Edge', Ginger Schultheis issues a powerful challenge to epistemic permissivism about credences, the view that there are bodies of evidence in response to which there are a number of different credence functions it would be rational to adopt. The heart of the argument is the claim that a certain sort of situation is impossible. Schultheis thinks that all motivations for permissivism must render situations of this sort possible. Therefore, permissivism must be false, or at least these motivations for it must be wrong.

Monday, 4 January 2021

Using a generalized Hurwicz criterion to pick your priors

Over the summer, I got interested in the problem of the priors again. Which credence functions is it rational to adopt at the beginning of your epistemic life? Which credence functions is it rational to have before you gather any evidence? Which credence functions provide rationally permissible responses to the empty body of evidence? As is my wont, I sought to answer this in the framework of epistemic utility theory. That is, I took the rational credence functions to be those declared rational when the appropriate norm of decision theory is applied to the decision problem in which the available acts are all the possible credence functions, and where the epistemic utility of a credence function is measured by a strictly proper measure. I considered a number of possible decision rules that might govern us in this evidence-free situation: Maximin, the Principle of Indifference, and the Hurwicz criterion. And I concluded in favour of a generalized version of the Hurwicz criterion, which I axiomatised. I also described which credence functions that decision rule would render rational in the case in which there are just three possible worlds between which we divide our credences. In this post, I'd like to generalize the results from that treatment to the case in which there any finite number of possible worlds.

Friday, 1 January 2021

How permissive is rationality? Horowitz's value question for moderate permissivism

Rationality is good; irrationality is bad. Most epistemologists would agree with this rather unnuanced take, regardless of their view of what exactly constitutes rationality and its complement. Granted this, a good test of a thesis in epistemology is whether it can explain why these two claims are true. Can it answer the value question: Why is rationality valuable and irrationality not? And indeed Sophie Horowitz gives an extremely illuminating appraisal of different degrees of epistemic permissivism and impermissivism by asking of each what answer it might give. Her conclusion is that the extreme permissivist -- played in her paper by the extreme subjective Bayesian, who thinks that satisfying Probabilism and being certain of your evidence is necessary and sufficient for rationality -- can give a satisfying answer to this question, or, at least, an answer that is satisfying from their own point of view. And the extreme impermissivist -- played here by the objective Bayesian, who thinks that rationality requires something like the maximum entropy distribution relative to your evidence -- can do so too. But, Horowitz argues, the moderate permissivist -- played by the moderate Bayesian, who thinks rationality imposes requirements more stringent than merely Probabilism, but who does not think they're stringent enough to pick out a unique credence function -- cannot. In this post, I'd like to raise some problems for Horowitz's assessment, and try to offer my own answer to the value question on behalf of the moderate Bayesian. (Full disclosure: If I'm honest, I think I lean towards extreme permissivism, but I'd like to show that moderate permissivism can defend itself against Horowitz's objection.)

Tuesday, 15 December 2020

Deferring to rationality -- does it preclude permissivism?

Permissivism about epistemic rationality is the view that there are bodies of evidence in response to which rationality permits a number of different doxastic attitudes. I'll be thinking here about the case of credences. Credal permissivism says: there are bodies of evidence in response to which rationality permits a number of different credence functions.

Thursday, 3 September 2020

Accuracy and Explanation in a Social Setting: thoughts on Douven and Wenmackers

For a PDF version of this post, see here.

In this post, I want to continue my discussion of the part of van Fraassen's argument against inference to the best explanation (IBE) that turns on its alleged clash with Bayesian Conditionalization (BC). In the previous post, I looked at Igor Douven's argument that there are at least some ways of valuing accuracy on which updating by IBE comes out better than BC. I concluded that Douven's arguments don't save IBE; BC is still the only rational way to update.

The setting for Douven's arguments was individualist epistemology. That is, he considered only the single agent collecting evidence directly from the world and updating in the light of it. But of course we often receive evidence not directly from the world, but indirectly through the opinions of others. I learn how many positive SARS-CoV-2 tests there have been in my area in the past week not my inspecting the test results myself but by listening to the local health authority. In their 2017 paper, 'Inference to the Best Explanation versus Bayes’s Rule in a Social Setting', Douven joined with Sylvia Wenmackers to ask how IBE and BC fare in a context in which some of my evidence comes from the world and some from learning the opinions of others, where those others are also receiving some of their evidence from the world and some from others, and where one of those others from whom they're learning might be me. Like Douven's study of IBE vs BC in the individual setting, Douven and Wenmackers conclude in favour of IBE. Indeed, their conclusion in this case is considerably stronger than in the individual case:

The upshot will be that if agents not only update their degrees of belief on the basis of evidence, but also take into account the degrees of belief of their epistemic neighbours, then the noted advantage of Bayesian updating [from Douven's earlier paper] evaporates and IBE does better than Bayes’s rule on every reasonable understanding of inaccuracy minimization. (536-7)

As in the previous post, I want to stick up for BC. As in the individualist setting, I think this is the update rule we should use in the social setting.

Monday, 24 August 2020

Accuracy and explanation: thoughts on Douven

For a PDF of this post, see here.

Igor has eleven coins in his pocket. The first has 0% chance of landing heads, the second 10% chance, the third 20%, and so on up to the tenth, which has 90% chance, and the eleventh, which has 100% chance. He picks one out without letting me know which, and he starts to toss it. After the first 10 tosses, it has landed tails 5 times. How confident should I be that the coin is fair? That is, how confident should I be that it is the sixth coin from Igor's pocket; the one with 50% chance of landing heads? According to the Bayesian, the answer is calculated as follows:$$P_E(H_5) = P(H_5 | E) = \frac{P(H_5)P(E | H_5)}{\sum^{10}_{i=0} P(H_i) P(E|H_i)}$$where

  • $E$ is my evidence, which says that 5 out of 10 of the tosses landed heads,
  • $P_E$ is my new posterior updating credence upon learning the evidence $E$,
  • $P$ is my prior,
  • $H_i$ is the hypothesis that the coin has $\frac{i}{10}$ chance of landing heads,
  • $P(H_0) = \ldots = P(H_{10}) = \frac{1}{11}$, since I know nothing about which coin Igor pulled from his pocket, and
  • $P(E | H_i) = \left ( \frac{i}{10} \right )^5 \left (\frac{10-i}{10} \right )^5$, by the Principal Principle, and since each coin toss is independent of each other one.

So, upon learning that the coin landed heads five times out of ten, my posterior should be:$$P_E(H_5) = P(H_5 | E) = \frac{P(H_5)P(E | H_5)}{\sum^{10}_{i=0} P(H_i) P(E|H_i)} = \frac{\frac{1}{11} \left ( \frac{5}{10} \right )^5\left ( \frac{5}{10} \right )^5}{\sum^{10}_{i=1}\frac{1}{11} \left ( \frac{i}{10} \right )^5 \left (\frac{10-i}{10} \right )^5 } \approx 0.2707$$But some philosophers have suggested that this is too low. The Bayesian calculation takes into account how likely the hypothesis in question makes the evidence, as well as how likely I thought the hypothesis in the first place, but it doesn't take into account that the hypothesis explains the evidence. We'll call these philosophers explanationists. Upon learning that the coin landed heads five times out of ten, the explanationist says, we should be most confident in $H_5$, the hypothesis that the coin is fair, and the Bayesian calculation does indeed give this. But we should be most confident in part because $H_5$ best explains the evidence, and the Bayesian calculation takes no account of this.