In August, I'll speak at a conference that aims to bring together what we might call 'mainstream epistemology' and 'formal epistemology'. I always find it hard to draw the line between these, since what people usually call mainstream epistemology often uses formal tools, such as those developed in epistemic logic or in modal semantics, and formal epistemology quite frequently uses none. But they do tend to differ in the way they represent their objects of study, namely, beliefs. Mainstream epistemology tends to favour a coarse-grained representation: you can believe a proposition, suspend on it, and perhaps disbelieve it, but those are usually the only options considered. Some recent work also talks of individuals thinking a proposition is true, being sure it is, holding it to be true, or accepting it, but even if we add all of those, that's still fairly few sorts of attitude. Formal epistemology, on the other hand, born as it was out of the foundations of statistics, the epistemology of science, and theory of rational choice under uncertainty, favours a more fine-grained approach: in the standard version, you can believe a proposition to a particular degree, which is represented by a real number at least 0, which is minimal belief, and at most 1, which is maximal, yielding a continuum of possible attitudes to each proposition; these degrees of belief we call your credences--they're the states you report when you say 'I'm 80% sure there's milk in the fridge' or 'I'm 50-50 whether it's going to rain in the next hour'. The theory of hyperreal credences and the various theories of imprecise credences, from sets of probability functions to sets of desirable gambles, provide even more fine-grained representations, but I'm no expert in those, so I'll set them aside here.
Coming as I do from fine-grained epistemology, for my contribution to the conference, I'd like to give the view from there of a topic that has been discussed a great deal recently in coarse-grained epistemology, hoping of course that there will be something I can offer that is useful to them. The topic is inquiry.
[Image: An inquiring mind]
1. Recent work on inquiry
In the recent literature on inquiry, mainstream epistemologists point out that epistemology has often begun at the point at which you already have your evidence, and it has then focussed on identifying the beliefs for which that evidence provides justification or which count as knowledge for someone with that evidence. Yet we are not mere passive recipients of the evidence we have. We often actively collect it. We often choose to put ourselves in positions from which we'll gather some pieces of evidence but not others: we'll move to a position from which we'll see or hear or smell how the world is in one respect but miss how it is in another; we'll prod the world in one way to see how it responds but we won't prod it in another; and so on. As many in the recent literature point out, this has long been recognised, but typically mainstream epistemologists have taken the norms that govern inquiry to be pragmatic or practical ones, not epistemic ones--we inquire in order to find out things that inform our practical decisions, and so the decision what to find out is governed by practical considerations, and epistemologists leave well alone. One upshot of this is that, where norms of inquiry might appear to clash with norms of belief, mainstream epistemologists have not found this troubling, since they're well aware that the pragmatic and the epistemic can pull in different directions. But, contends much of the recent literature, norms of inquiry, norms of evidence-gathering, and so on, are often epistemic norms not practical ones. And that renders clashes between those norms and norms of belief more troubling, since both are now epistemic.
What I'd like to do in my paper at this conference, and in this blogpost, is give a primer on a framework for thinking about norms of inquiry that is reasonably well-established in fine-grained epistemology, and then turn to some of the questions from the recent debate about inquiry and ask what this framework has to say about them. Questions will include: When should we initiate an inquiry, when should we continue it, when should we conclude it, and when should we reopen it or double-check our findings? Are there purely epistemic norms that govern these actions, as Carolina Flores and Elise Woodard contend? How do epistemic norms of inquiry relate to epistemic norms of belief or credence, and can they conflict, as Jane Friedman contends? How should we resolve the apparent puzzle raised by Jane Friedman's example of counting the windows in the Chrysler Building? How should we understand Julia Staffel's distinction between transitional attitudes and terminal attitudes (here, here, here)?
I'm not sure that much, if anything, of what I'll say will be new. I'm really drawing out an insight that dates back to I. J. Good's famous Value of Information Theorem, and has been pursued in Wayne Myrvold's version of that theorem within epistemic utility theory, some further developments of Myrvold's ideas by Alejandro Pérez Carballo, and some recent work on generalizing Good's result that begins with the economist John Geanakoplos and has been developed by Nilanjan Das, Miriam Schoenfield, and Kevin Dorst, among others.
2. Representing an individual as having credences
Throughout, we'll represent an individual's doxastic state by their credence function. Collect into a set all the propositions about which the individual has an opinion and call it their agenda. Their credence function is then a function that takes each proposition in their agenda and returns a real number, at least 0 and at most 1, that measures how strongly they believe that proposition. So when I say 'Chris is 65% sure it's going to rain', I'm ascribing to him a credence of 0.65 in the proposition that it is going to rain.
We'll assume throughout that our individual's credence function at any time is probabilistic. That is, it assigns 1 to all necessary truths, 0 to all necessary falsehoods, and the credence it assigns to a disjunction of two mutually exclusive propositions is the sum of the credences it assigns to the disjuncts. (I present this more formally in Appendix A.1.)
We'll also assume that, when they do receive evidence, our individual will respond by conditionalizing their credences on it. That is, their new unconditional credence in a proposition will be their old conditional credence in that proposition given the proposition learned as evidence. Put differently, it will be the proportion of their old credence in the proposition they've now learned that they also assigned to the proposition in question.
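To make conditionalization concrete, here is a minimal Python sketch. It rests on a toy model that is not in the text above: four possible worlds, made-up prior credences, and a credence function stored as a dictionary from worlds to numbers. The function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F

def credence_in(credence, proposition):
    """Credence in a proposition, modelled as the set of worlds at which it is true."""
    return sum(credence[w] for w in proposition)

def conditionalize(credence, evidence):
    """Return the credence function obtained by conditionalizing on `evidence`."""
    total = credence_in(credence, evidence)
    if total == 0:
        raise ValueError("cannot conditionalize on evidence with credence 0")
    return {w: (p / total if w in evidence else F(0)) for w, p in credence.items()}

# Illustrative prior over four worlds (the numbers are assumptions).
prior = {
    ("rain", "forecast"): F(3, 10),
    ("rain", "no forecast"): F(1, 10),
    ("dry", "forecast"): F(1, 10),
    ("dry", "no forecast"): F(1, 2),
}
rain = {w for w in prior if w[0] == "rain"}
rain_forecast = {w for w in prior if w[1] == "forecast"}

posterior = conditionalize(prior, rain_forecast)
print(credence_in(prior, rain))      # 2/5
print(credence_in(posterior, rain))  # 3/4
```

With these numbers, learning that rain is forecast raises the credence in rain from 2/5 to 3/4: of the 4/10 credence originally given to the forecast, 3/10 was also given to rain.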
3. Good on the pragmatic value of gathering evidence
While most discussion of Good's 1967 paper, 'On the Principle of Total Evidence', focuses on what has become known as the Value of Information Theorem, the real contribution lies in his account of the pragmatic value of gathering evidence.
This account begins with an account of the pragmatic value of a credence function. Suppose you will face a particular decision between a range of options (an option is defined by how much utility it has at each possible state of the world, and the utility of an option at a world is a real number that measures how much you value that option at that world). Then the standard theory of choice under uncertainty says that you should pick an option with maximal expected utility from the point of view of the credence function you have when you face the decision. So let's assume you'll do this. Then we define the pragmatic value for you, at a particular state of the world, of having a particular credence function when faced with a particular decision: it is the utility, at that state of the world, of the option that this credence function will lead you to pick from those available in the decision. This will be one of the options that maximizes expected utility from the point of view of that credence function; but there might be more than one that maximizes that, so we assume you have a way of breaking ties between them.
So, for instance, suppose I have to walk to the shops and I must decide whether or not to take an umbrella with me. And suppose I have credences concerning whether or not it will rain as I walk there. Let's suppose first that taking the umbrella uniquely maximizes expected value from the point of view of those credences. Then the pragmatic value of those credences at a world at which it does rain is the utility of walking to the shops in the rain with an umbrella, while their pragmatic value at a world at which it doesn't rain is the utility of walking to the shops with no rain carrying an umbrella. And similarly if leaving without the umbrella uniquely maximizes expected utility from the point of view of those credences, then their pragmatic value at a rainy world is the utility of walking to the shops in the rain without an umbrella, and their pragmatic value at a dry world is the utility of walking to the shops with no rain and no umbrella. And if they both maximize expected utility from the point of view of the credences, then the pragmatic value of the credences will depend on how I break ties.
Now, with the pragmatic value of a credence function defined relative to a particular decision you'll face, Good can define the pragmatic value of a particular episode of evidence-gathering relative to such a decision. We represent such an episode as follows: for each state of the world, we specify the strongest proposition you'll learn as evidence at that state of the world. Then the pragmatic value, at a particular world, of an episode of evidence-gathering is the pragmatic value, at that world, of the credence function you'll have after learning whatever evidence you'll gather at that world and updating on it by conditionalizing. So, holding fixed the decision problem you'll face, the pragmatic value of a credence function is the utility of the option it'll lead you to pick, and the pragmatic value of gathering evidence is the pragmatic value of the credence function it will lead you to have.
So, for instance, suppose I have to walk to the shops later and, at that point, I'll have to decide whether or not to take an umbrella with me. And suppose that, between now and then, I can gather evidence by looking at the weather forecast. If I do, I'll learn one of two things: rain is forecast, or rain is not forecast. And updating on that evidence, if I choose to gather it, will change my credences concerning whether or not it will rain on my way to the shops. Then what is the value, at a particular state of the world, of gathering evidence by looking at the forecast? Consider a world at which (i) rain is not forecast but (ii) it does rain; and suppose that, upon learning that rain is not forecast, I'll drop my credence in rain low enough that I'll not take my umbrella. Then the value of gathering evidence at that world is the utility of walking to the shops in the rain without an umbrella. In contrast, consider a world at which (i) rain is forecast but (ii) it doesn't rain; and suppose that, upon learning that rain is forecast, I raise my credence in rain high enough that I take the umbrella. Then the value of gathering evidence at that world is the utility of walking to the shops with no rain but carrying an umbrella. And so on.
This is Good's account of the pragmatic value, at a particular world, of a particular episode of evidence-gathering. With this in hand, we can now define the expected pragmatic value of such an episode, and we can also define the expected pragmatic value of not gathering evidence at all, since that is just the degenerate case of evidence-gathering in which you simply learn a tautology at every state of the world. Good's Value of Information Theorem then runs as follows: Fix a decision problem you'll face at a later time; and fix the way you break ties between a set of options when they all maximize expected utility; now suppose that, for no cost, you may gather evidence that will teach you which element of a particular partition is true; then the expected pragmatic value, from the point of view of your current credences, of gathering that evidence is at least as great as the expected pragmatic value, from the point of view of your current credences, of not gathering it; and, if you assign some positive credence to a state of the world in which the evidence you'll learn will change how you make the decision you'll face, then the expected pragmatic value of gathering the evidence is strictly greater than the expected pragmatic value of not gathering it. (I run through this using formal notation in Appendix A.2.)
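Good's account can be checked on the umbrella case with a small Python sketch. Everything numerical here is an assumption made for illustration (the prior credences, the utilities, and the rule that ties go to the option listed first), and the function names are hypothetical. The sketch computes the expected pragmatic value of checking the forecast and of the degenerate episode in which a tautology is learned.

```python
from fractions import Fraction as F

# Illustrative prior and utilities; all numbers are assumptions.
PRIOR = {
    ("rain", "forecast"): F(3, 10),
    ("rain", "no forecast"): F(1, 10),
    ("dry", "forecast"): F(1, 10),
    ("dry", "no forecast"): F(1, 2),
}
# An option is fixed by its utility at each world; here that depends only on the weather.
OPTIONS = {
    "umbrella": {"rain": 6, "dry": 8},
    "no umbrella": {"rain": 0, "dry": 10},
}

def utility(option, world):
    return OPTIONS[option][world[0]]

def conditionalize(credence, evidence):
    total = sum(credence[w] for w in evidence)
    return {w: (p / total if w in evidence else F(0)) for w, p in credence.items()}

def choose(credence):
    """Pick an option maximizing expected utility; ties go to the option listed first."""
    def expected_utility(option):
        return sum(p * utility(option, w) for w, p in credence.items())
    return max(OPTIONS, key=expected_utility)

def pragmatic_value(credence, world):
    """Utility, at `world`, of the option this credence function leads you to pick."""
    return utility(choose(credence), world)

def expected_value_of_gathering(prior, partition):
    """Expected pragmatic value of learning the true cell of `partition`."""
    total = F(0)
    for cell in partition:
        posterior = conditionalize(prior, cell)
        total += sum(prior[w] * pragmatic_value(posterior, w) for w in cell)
    return total

forecast = [
    {w for w in PRIOR if w[1] == "forecast"},
    {w for w in PRIOR if w[1] == "no forecast"},
]
no_evidence = [set(PRIOR)]  # degenerate episode: learn a tautology at every world

print(expected_value_of_gathering(PRIOR, no_evidence))  # 36/5
print(expected_value_of_gathering(PRIOR, forecast))     # 38/5
```

In this toy case the forecast can change the choice (no umbrella when rain is not forecast), so checking it has strictly greater expected pragmatic value, as the theorem leads us to expect.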
4. Myrvold on the epistemic value of gathering evidence
Good's theorem tells us something about when you have practical reason to engage in a certain sort of evidence-gathering. When it will teach you which element of a partition is true, when it costs nothing, and when you consider it possible it will change how you'll choose, then you should do it. But, as Wayne Myrvold shows, building on work by Graham Oddie and Hilary Greaves & David Wallace, there is also a version that tells us something about when you have epistemic reason to gather evidence. Alejandro Pérez Carballo has extended Myrvold's approach in various ways.
Recall: Good's insight is that the pragmatic value of a credence function is the utility of the option it leads you to choose, and the pragmatic value of an episode of evidence-gathering is the pragmatic value of the credence function it will lead you to have. But credence functions don't just have pragmatic value; we don't use them only to guide our decisions. We also use them to represent the world, and their purely epistemic value derives from how well they do that, regardless of whether we need them to help us choose.
Many ways of measuring this purely epistemic value have been proposed, but by far the most popular characterization of the legitimate epistemic utility functions says that they are all strictly proper, where this means that, if we measure epistemic utility in this way, any probabilistic credence function expects itself to have strictly greater epistemic utility than it expects any alternative credence function to have; that is, it thinks of itself as uniquely best from the epistemic point of view. (Jim Joyce defends this view here, and Robbie Williams and I have recently defended it here.)
Perhaps the most well-known strictly proper epistemic utility function is the so-called negative Brier score: given a proposition, we say that the omniscient credence in it is 1 if it's true and 0 if it's false; the Brier score of a credence function at a world is then obtained by taking each proposition to which it assigns a credence, taking the difference between the credence it assigns to that proposition and the omniscient credence in that proposition at that world, squaring that difference, and then summing these squared differences; the negative Brier score is then, as the name suggests, merely the negative of the Brier score. In the negative Brier score, each proposition is given equal weight in the sum, but we can also give greater weight to some propositions than others in order to record that we consider them more important. This gives a weighted negative Brier score. This is important in the current context, since it allows us to explain why it is better, epistemically speaking, to engage in some evidence-gathering episodes rather than others, even when the latter will improve certain credences more than the former will improve others; the explanation is that the credences the latter will improve are less important to us. So, one evidence-gathering episode might, in expectation, greatly improve the accuracy of my credences concerning how many blades of grass there are on my neighbour's lawn, while another might, in expectation, only slightly improve the accuracy of my credences about the fundamental nature of reality, and yet I might favour the latter because the propositions it concerns are more important to me.
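The verbal definition of the weighted negative Brier score translates directly into a few lines of Python. In this sketch, propositions are modelled as sets of worlds; the two-world example, the credences, and the weights are assumptions made only for illustration, and the function name is hypothetical.

```python
from fractions import Fraction as F

def negative_brier(credences, world, weights=None):
    """Weighted negative Brier score of `credences` at `world`.

    `credences` maps each proposition in the agenda (a frozenset of the worlds
    at which it is true) to a credence. With no weights, every proposition
    counts equally.
    """
    score = F(0)
    for proposition, credence in credences.items():
        omniscient = 1 if world in proposition else 0
        weight = 1 if weights is None else weights[proposition]
        score += weight * (credence - omniscient) ** 2
    return -score

RAIN = frozenset({"rainy"})
NO_RAIN = frozenset({"dry"})
credences = {RAIN: F(4, 5), NO_RAIN: F(1, 5)}

print(negative_brier(credences, "rainy"))  # -2/25
print(negative_brier(credences, "dry"))    # -32/25
print(negative_brier(credences, "rainy", weights={RAIN: 3, NO_RAIN: 1}))  # -4/25
```

Being 80% sure of rain scores well at the rainy world and badly at the dry one; tripling the weight on the rain proposition makes inaccuracy about it count three times as much.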
So now we have a way of assigning epistemic value to a credence function at a world. And so we can simply appeal to Good's insight to say that the epistemic value, at a world, of gathering evidence is the epistemic value of the credence function you'll end up with when you update on the evidence you'll get at that world. And now we can state Myrvold's epistemic version of Good's theorem: suppose you may gather evidence that will teach you which element of a particular partition is true, and suppose your epistemic utility function is strictly proper; then the expected epistemic value of gathering the evidence, from the point of view of your current credences, is always at least as great as the expected epistemic value of not gathering the evidence, from the same point of view; and, if you give some positive credence to a state of the world at which what you will learn will lead you to change your credences, then the expected epistemic value of gathering the evidence is strictly greater than the expected epistemic value of not doing so. (I present a more formal version in Appendix A.3.)
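Here is a small numerical check of the epistemic version, as a Python sketch rather than a proof. It assumes a made-up prior over four worlds, an agenda containing just the proposition that it will rain, and the unweighted negative Brier score as the strictly proper measure; the function names are hypothetical.

```python
from fractions import Fraction as F

# Illustrative prior; the numbers are assumptions.
PRIOR = {
    ("rain", "forecast"): F(3, 10),
    ("rain", "no forecast"): F(1, 10),
    ("dry", "forecast"): F(1, 10),
    ("dry", "no forecast"): F(1, 2),
}
RAIN = frozenset(w for w in PRIOR if w[0] == "rain")
AGENDA = [RAIN]  # assumption: only the rain proposition is scored

def credence_in(credence, proposition):
    return sum(credence[w] for w in proposition)

def conditionalize(credence, evidence):
    total = credence_in(credence, evidence)
    return {w: (p / total if w in evidence else F(0)) for w, p in credence.items()}

def negative_brier(credence, world):
    """Unweighted negative Brier score over AGENDA at `world`."""
    return -sum(
        (credence_in(credence, p) - (1 if world in p else 0)) ** 2 for p in AGENDA
    )

def expected_epistemic_value(prior, partition):
    """Expected epistemic value of learning the true cell of `partition`."""
    total = F(0)
    for cell in partition:
        posterior = conditionalize(prior, cell)
        total += sum(prior[w] * negative_brier(posterior, w) for w in cell)
    return total

forecast = [
    {w for w in PRIOR if w[1] == "forecast"},
    {w for w in PRIOR if w[1] == "no forecast"},
]
no_evidence = [set(PRIOR)]  # learn a tautology at every world

print(expected_epistemic_value(PRIOR, no_evidence))  # -6/25
print(expected_epistemic_value(PRIOR, forecast))     # -19/120
```

Since -19/120 is greater than -6/25, the prior expects checking the forecast to leave it with a more accurate credence in rain, in line with the theorem.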
4. Expanding the approach
What Good and Myrvold offer is almost what we need to investigate both pragmatic and epistemic reasons for gathering evidence and the norms to which they give rise. But we should make them a little more general before we move on.
First, gathering evidence is rarely cost-free, and so in any evaluation of whether to do so, we must include not only Good's account of its pragmatic value but also its cost; but of course that's easy to do. So the true pragmatic value of an evidence-gathering episode at a world is not just the utility at that world of the option you'll choose using the credence function it will lead you to have; it's that utility minus the cost of gathering the evidence.
Second, Good assumes that you know for sure which decision you'll face using your credences, but of course you might be uncertain of this. But again, it's easy to incorporate this: we simply ensure that our possible worlds specify not only the truth values of the propositions to which we assign credences, but also which decision we'll face with our credences; we then ensure that we assign credences to these; and, having done all this, we can define the pragmatic value of a credence function at a world to be the utility at that world of the option it would lead us to choose from the decision we face at that world; and then the pragmatic value of an evidence-gathering episode is again the pragmatic value of the credence function you'll end up with after gathering the evidence and updating on it. And Good's theorem still goes through with this amendment.
Third, Good's theorem and Myrvold's epistemic version only tell you how a particular evidence-gathering episode compares with gathering no evidence at all. But our options are rarely so limited. Often, we can choose between various different evidence-gathering episodes. Perhaps there are different partitions from which we can learn the true element, such as when I choose which weather app to open to learn its forecast, or when I choose to measure the weight of a chemical sample rather than its reactive properties. Good's insight and Myrvold's epistemic version allow us to compare these as well, as Pérez Carballo notes: we can simply compare the expected pragmatic or epistemic value of the different available evidence-gathering episodes and pick one that is maximal.
Fourth, Good's theorem and Myrvold's epistemic version only cover cases in which the evidence-gathering episode will teach you which element of a partition is true. This is very idealized, but it is true to a certain way in which we gather evidence in science. When I measure the weight of a chemical sample, or when I ask how many organisms in a given population are infected after exposure to a particular pathogen, there is a fixed partition from which my evidence will come: I'll learn the sample is this weight or that weight or another one; I'll learn the number of infected organisms was zero or one or two or...up to the size of the population. But of course there are many cases in which our evidence-gathering will not be partitional or even factive in this way; at one world I might learn it will rain tomorrow, at another, I might learn it will rain or snow; and, on some non-factive conceptions of evidence, I might learn it will snow at a world at which it won't. What happens to Good's theorem and Myrvold's epistemic version in these cases? The answer is that it depends. Building on work by the economist John Geanakoplos, Nilanjan Das has brought some order to the pragmatic case, but it would be interesting to see what happens in the epistemic case. (I discuss this a little in Appendix A.3.)
In any case, here's just a small example to give a flavour. It will either not rain tomorrow, rain lightly, or rain heavily. I have the opportunity to check the weather forecast. But it never commits fully, and it is typically a bit pessimistic. So, in the world where it won't rain, it'll report: no rain or light rain. In the world where it will rain lightly, it'll report: light rain or heavy rain. And in the world where it will rain heavily, it will also report: light rain or heavy rain. Should I check the weather forecast? In the practical case, it depends on the decision I'll face, but the important point is that, whatever regular credences I have in the three possibilities, there is a decision I might face such that I expect myself to face it better with my current credences than with the credences I'll acquire after gathering evidence from the forecast. The reason is that, given the way the evidence overlaps, I know that my credence in light rain will rise regardless of which evidence I obtain and update on--Das calls such evidence-gathering episodes biased inquiries. So consider a bet that pays out a pound if there is light rain and nothing if there isn't. Then there's a price I'll pay for that bet after gathering the evidence that I won't pay for it now, since I'll be more confident in it then for sure. So, from my current point of view, I shouldn't update on it. What about the epistemic case? Interestingly, in this case, at least if we measure epistemic utility using the negative Brier score, there are prior credences I might have from whose point of view gathering the evidence is the best thing to do, and prior credences from whose point of view not gathering the evidence is the best thing to do. So it's a mixed bag. And if we use another well-known epistemic utility function, namely, the negative log score, every prior credence function expects gathering the evidence to be best; and indeed this is true for any factive evidence-gathering episode.
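The mixed verdict for the negative Brier score can be illustrated numerically. The Python sketch below encodes the forecast example and compares gathering with not gathering for two priors. It makes assumptions the text leaves open: the agenda is taken to be the three atomic propositions (no rain, light rain, heavy rain), equally weighted, and the two priors are chosen by hand for illustration. The function names are hypothetical.

```python
from fractions import Fraction as F

WORLDS = ("none", "light", "heavy")

# The strongest proposition the forecast teaches you at each world.
EVIDENCE = {
    "none": {"none", "light"},
    "light": {"light", "heavy"},
    "heavy": {"light", "heavy"},
}

def conditionalize(prior, evidence):
    total = sum(prior[w] for w in evidence)
    return {w: (prior[w] / total if w in evidence else F(0)) for w in prior}

def neg_brier(credence, actual):
    # Assumption: the agenda is the three atomic propositions, equally weighted.
    return -sum((credence[w] - (1 if w == actual else 0)) ** 2 for w in credence)

def value_of_gathering(prior):
    """Expected negative Brier score of the posterior, by the prior's lights."""
    return sum(
        prior[w] * neg_brier(conditionalize(prior, EVIDENCE[w]), w) for w in prior
    )

def value_of_not_gathering(prior):
    """Expected negative Brier score of the prior, by its own lights."""
    return sum(prior[w] * neg_brier(prior, w) for w in prior)

uniform = {w: F(1, 3) for w in WORLDS}
skewed = {"none": F(1, 20), "light": F(9, 20), "heavy": F(1, 2)}

for prior in (uniform, skewed):
    # Biased inquiry: credence in light rain rises whatever is learned.
    assert all(conditionalize(prior, EVIDENCE[w])["light"] > prior["light"] for w in WORLDS)
    print(value_of_gathering(prior) > value_of_not_gathering(prior))
# prints True (uniform prior), then False (skewed prior)
```

On these assumptions, the uniform prior expects checking the forecast to improve its accuracy, while the prior that gives only 1/20 to no rain expects it to make things slightly worse.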
One response to this sort of case is to change how we update in response to evidence. For instance, Miriam Schoenfield proposes that we should update by conditionalizing not on our evidence itself but on the fact that we learned it. And, if we do this, Good's theorem and Myrvold's epistemic version are restored. The debate then centres on whether Schoenfield's way of updating is really available to us (see Gallow).
5. Questions from the recent literature
So the framework inspired by Good's insight, and given an epistemic twist by Myrvold, is very general. It provides very general norms for when to gather evidence. So it's natural to ask what these norms say about questions from the recent literature on inquiry.
5.1 Flores and Woodard on epistemic norms of evidence-gathering
Carolina Flores and Elise Woodard offer two compelling arguments that there are purely epistemic norms on inquiry: the first is defensive, showing that standard state-centred or evidentialist sources of resistance to such norms are misguided; the second is offensive, showing that we criticize one another in epistemic ways for our evidence-gathering practices, and arguing that this gives reason to think there are epistemic norms that govern those practices. Myrvold's adaptation of Good's insight adds a third argument. Here is a norm:
The Epistemic Norm of Inquiry: Gather evidence so as to maximize the expected epistemic utility of your future credence function.
It governs evidence-gathering and it is epistemic. It captures the idea that we care about our beliefs not just for their instrumental pragmatic value as guides to action, but also for their epistemic value as representations of the world we inhabit. And, as a result, there are purely epistemic norms that govern us when we can take steps to change them. Of course, as Flores and Woodard note, these are pro tanto norms--they can be overridden by other considerations, epistemic or moral or both. But they are norms all the same. And, of course, any pragmatic norm about evidence-gathering, such as that you should gather evidence so as to maximize the expected pragmatic utility of your future credence function given the decision you face with it, is also pro tanto--it can be overridden by moral or epistemic considerations. There is no primacy of the pragmatic.
What's more, the Epistemic Norm of Inquiry helps to make sense of Flores and Woodard's examples. Their first, Cloistered Claire, gets all of her evidence from a single source; their second, Gullible Gabe, gets his information from a source he believes to be reliable, but that his existing evidence suggests is not reliable; and their third, Lazy Larry, gets his evidence from a good source, but only attends to part of the evidence that source provides due to laziness. In each case, Flores and Woodard submit, we want to criticize their evidence-gathering practices on purely epistemic grounds. Let's discuss them in turn.
The case of Claire doesn't seem obvious to me. It seems that we can sometimes be in situations in which it's best to stick with a single source; that is, contra Flores and Woodard, it's not always best to diversify one's sources of evidence. For instance, if all but one of the news sources available to you are owned by people with a vested interest in a particular policy, you might reasonably stick with the only independent outlet. The explanation might be that gathering evidence from the others is a biased inquiry in Nilanjan Das's sense from above: that is, you can know in advance that it will raise your credence in some proposition, and so it is something your current credences tell you not to do from an epistemic point of view. So I think the devil will be in the details here. Flores and Woodard are certainly correct that it is sometimes best to diversify your sources of evidence, and in these cases you'll violate the Epistemic Norm of Inquiry by failing to do so; but sometimes it's best to stick with just one, and in those cases again the Epistemic Norm of Inquiry will entail that. How we fill in the details around Claire's case will determine which of these sorts of situations she's in.
The cases of Gabe and Larry are clearer. In Gabe's case, it might not be the evidence-gathering that is ultimately at fault, but his high credence that the source is reliable. From the point of view of a high credence that the source is reliable, the expected pragmatic or epistemic value of gathering evidence from it is likely to be high, and gathering it is therefore rationally required, given there are few costs. But that original high credence itself might be irrational because it isn't a good response to Gabe's evidence concerning the source's reliability.
In Larry's case, it's important that he ignores the further evidence his source can provide through laziness and not simply a lack of time. If he had only a little time with his source and couldn't attend to all the evidence it provides, then there might be nothing wrong with his evidence-gathering--he did the best he could under the constraints placed on him! But, as Flores and Woodard describe the case, he could have gathered more evidence, but he failed to do so. In that case, it's likely that the cost of gathering that extra evidence and updating on it is considerably less than the pragmatic or epistemic value he expects to get from it. And this explains why he is criticizable: he violates the Epistemic Norm of Inquiry.
5.2 Willard-Kyle on the conclusion of inquiry
So far, we have just been talking about evidence-gathering episodes, and not inquiry. But an inquiry is simply a sequence of such episodes, and we usually embark upon one with the aim of answering some question. When is an inquiry complete? When should you cease inquiring further? Some say when you have knowledge of the answer to the question at which the inquiry was aimed; some say when you have true belief in it; and so on.
Christopher Willard-Kyle argues that none of these answers can be right because, even after you've achieved any of these, it's always possible to improve your epistemic situation. For instance, you might obtain better knowledge of the correct answer: you might obtain a safer belief, even though your current belief is sufficiently safe to count as knowledge; or you might obtain the belief you currently have, but using an even more reliable process, even though your current belief was formed by a sufficiently reliable process. Good's insight and Myrvold's epistemic version shed light on this.
In fact, your pragmatic reasons for further inquiry can just run out, and from that point of view it can be irrational to pursue that inquiry any further. This happens if you care only about the pragmatic value of your credence function as a guide to action in the face of the decision you know you'll face with it. At some point, you come to know that all further evidence-gathering episodes that are actually available to you either won't change your mind about what to choose when faced with the decision, or that any that will change your mind are too costly. At this point, further inquiry is irrational from this myopic pragmatic point of view. While you might continue to improve your credence function from an epistemic point of view, you achieve no further gains from a pragmatic point of view.
In the epistemic case, however, things are different. Unless you somehow acquire certainty about the correct answer to the question at which your inquiry aims, there will always be some evidence-gathering episode that you'll expect to improve your credence function from a purely epistemic point of view, though of course that episode may not be available to you. Indeed, you will rarely acquire such certainty. After all, for most inquiries, the evidence-gathering episodes don't give definitive answers to the target question; they give definitive answers to related questions that bear on the target question, such as when I gather evidence about what the weather forecast says as part of my inquiry into whether or not it will rain tomorrow. So this vindicates Willard-Kyle's claim. There will nearly always be room for improvement from an epistemic point of view.
5.3 Staffel on transitional and terminal attitudes
This last point casts doubt on Julia Staffel's distinction between transitional and terminal attitudes in inquiry. On her account, during the course of an inquiry, we form transitional versions of the attitudes we seek--outright beliefs, perhaps, or precise credences. Only when the inquiry is complete do we form terminal versions of those attitudes. So, for instance, a detective who is methodically working her way through the body of evidence her team has amassed forms transitional credences concerning the identity of the culprit, and only after she has surveyed all this evidence does she form terminal credences on that matter. Staffel says that what distinguishes these attitudes is what we're prepared to do with them: for one thing, we are prepared to act on terminal attitudes but not on transitional ones. The views I've been describing here can shed light on Staffel's view.
Here's an argument that there can be no transitional attitudes that answer to Staffel's description. If I face a decision in the midst of my inquiry that I only expected to face at the end, it seems that I have no choice but to choose using the credences I have at that point, which have been obtained from my credences at the beginning of the inquiry by updating on the evidence I've received during its course to date. What else is available to me? Of course, there are my credences at the beginning of my inquiry. Should I use those instead? The problem with that suggestion is that those credences themselves don't think I should use them, at least if the evidence-gathering episodes I've embarked on so far are ones that the prior credences expect to have greater pragmatic value than not embarking on them, such as if those evidence-gathering episodes have the features that make Good's Value of Information Theorem applicable. Sure, my prior credences would have liked it even more if I'd got to complete my inquiry, but the world has prevented that and I must act now. So, if I must act either on my priors or on my current credences, which I hold mid-inquiry, I should act on my current ones, which suggests they're not transitional.
But the argument isn't quite right. If an inquiry is made up of a series of evidence-gathering episodes, each of which satisfies the conditions that make Good's Value of Information Theorem hold, then the argument works: there can be no transitional attitudes during such an inquiry. But not all inquiries are like that. Sometimes the whole sequence of evidence-gathering episodes is such that we expect our credence function to be better after they're all completed, but there are points in the course of the investigation at which we expect our credence function will be worse. This might happen, for instance, if we string together a bunch of biased inquiries, where those in the first stretch are biased in one direction and those in the second are biased in the other, but taken together, they aren't biased in either direction. For instance, suppose our detective divides up the evidence her team has collected into that which suggests the first suspect is guilty and that which suggests the second suspect is guilty. She plans to work through the first set first and the second set second. Then, while her prior credences expect the credences she'll have once she's worked through both sets to be better than the priors themselves, they also expect the credences she'll have once she's only worked through the first set to be worse. And so, if she's interrupted just as she completes the first set and suddenly has to make a decision she was hoping to make only at the end, she might well decide not to use her current credences. And in that sense they are transitional. Staffel considers a case very much like this in her recent book manuscript.
Of course, you might wonder how it could be rational to embark upon a series of evidence-gathering episodes with the expectation that, at various points along the way, you'll be doing worse than you were doing at the beginning. But that's a pretty standard thing we do: we commit to a sequence of actions that, if completed, will bring about benefits, but, if left half done, will leave us worse off; and we simply have to weigh our credence that we'll get to complete the sequence against the benefits if we do and the detriments if we don't and see whether it's worth it. And the same goes in the epistemic case.
Nonetheless, while I think the argument against transitional attitudes that I gave above fails to show there are no such attitudes, I think it might show that they're rather rarer than Staffel imagines. Most inquiries involve a sequence of evidence-gathering episodes each of which improves your credences in expectation; the minority are like the detective running through a series of individually biased but collectively unbiased inquiries.
5.4 Friedman's example of the Chrysler building
Let's turn now to an example that motivates much of Jane Friedman's recent contributions to the literature. I'll quote at length:
"I want to know how many windows the Chrysler Building in Manhattan has (say I’m in the window business). I decide that the best way to figure this out is to head down there myself and do a count. [...] Say it takes me an hour of focused work to get the count done and figure out how many windows that building has. [...] Now think about the hour during which I’m doing my counting. During that hour there are many other ways I could make epistemic gains. [...] First, I’m a typical epistemic subject and so I arrive at Grand Central with an extensive store of evidence: the body of total evidence, relevant to all sorts of topics and subject matters, that I’ve acquired over my lifetime. Second, I’m standing outside Grand Central Station for that hour and so the amount of perceptual information available to me is absolutely vast. [...] However, during my hour examining the Chrysler Building I barely do any of that. I need to get my count right, and to do that I really have to stay focused on the task. Given this, during that hour I don’t extend my current stores of knowledge by drawing inferences that aren’t relevant to my counting task, and I do my best to ignore everything else going on around me. And this seems to be exactly what I should be doing during that hour if I want to actually succeed in the inquiry I’m engaged in. [...] There is an important sense in which I succeed in inquiry by failing to respect my evidence for some stretch of time. It’s not that my success in this case comes by believing things my evidence doesn’t support, but it does come by ignoring a lot of my evidence and failing to come to know a great deal of what I’m in a position to know."
I think the natural thing to say here is that, as Friedman faces the Chrysler Building, she faces a choice between a number of different evidence-gathering episodes she might undertake. Some of them are the ones that form the inquiry she is there to undertake, namely, determining the number of windows in the building; some involve attending to sensory information and perhaps testimony that is available at the spot where she's ended up, but which is irrelevant to her inquiry; and some involve drawing inferences from the store of memories and other evidence she's previously collected, which is again irrelevant to her inquiry.
Of course, it's rather unusual to think of these last episodes as involving evidence-gathering. After all, you already have the evidence, and you're simply drawing conclusions from it that you haven't drawn before. But I think it's reasonable to view logical reasoning as doing something similar to what gathering empirical evidence does. In both cases, they are ruling out states of the world that are in some sense possible. When I see that it is raining, I rule out those states of the world in which it is not. And when I reason to the conclusion that one proposition entails another, I rule out those states of the world in which the former is true and the latter false. Now these latter states of the world were never truly possible; or at least, they were never logically possible. But until I did this reasoning, they were epistemically or personally possible for me, to borrow Ian Hacking's term. And we can give a version of credal epistemology on which they are the states of the world, sets of those represent the propositions to which we assign credences, suitably adapted probability axioms hold of them, and conditionalization is defined: building on Hacking's insights, I do this here.
So, having seen this we can understand the logical reasoning that Friedman doesn't do when she's in front of the Chrysler Building as just another sort of evidence she doesn't gather, just as she doesn't gather the evidence she might do if she were to attend to the conversation between the two commuters standing to her left, say. And once we do that, we can say that Friedman does the right thing by continuing with her window-counting inquiry so long as, at each stage, the evidence-gathering episode that comes next in that inquiry is the one that maximizes expected pragmatic or epistemic value among those episodes that are available to her. And if we see things in this way, there is no clash between an epistemic norm and a zetetic one. There are just two norms: gather evidence in the way that maximizes expected utility; and respond to that evidence by conditionalizing. And they govern what Friedman should do in front of the Chrysler Building.
6. Conclusion
One of the attractions of the picture I've been sketching here is that it is unified. It tells you there are pragmatic norms for inquiry and epistemic ones and all-things-considered norms as well. Each arises from a different source of value. The pragmatic norms arise from focussing only on the pragmatic value of credences as guides to action, and the pragmatic value of evidence-gathering as the pragmatic value of the credences it leads to. The norm is then: Gather evidence so as to maximize the expected pragmatic value of your credences. The epistemic norms arise from focussing only on the epistemic value of credences as representations of the world, and the epistemic value of evidence-gathering as the epistemic value of the credences it leads to. The norm is then: Gather evidence so as to maximize the expected epistemic value of your credences. But we can also combine the two sources of value into a measure of all-things-considered value, and then an analogous all-things-considered norm emerges.
Appendices
- Let $W$ be a set of possible worlds.
- Let $\mathcal{F}$ be the set of all subsets of $W$; these represent propositions.
- A credence function on $\mathcal{F}$ is a function $C : \mathcal{F} \rightarrow [0, 1]$.
- A credence function is probabilistic if (i) $C(\emptyset) = 0$, (ii) $C(W) = 1$, and (iii) $C(X \cup Y) = C(X) + C(Y)$ for $X \cap Y = \emptyset$.
- A credence function $C$ on $\mathcal{F}$ is regular if $C(w) > 0$, for all $w$ in $W$.
- Given a regular probabilistic credence function $C$ on $\mathcal{F}$, and a proposition $E$ in $\mathcal{F}$, we define $C_E$ as follows: for each $X$ in $\mathcal{F}$,$$C_E(X) = C(X \mid E) = \frac{C(X \cap E)}{C(E)}$$
- An option on $W$ is a function $o : W \rightarrow \mathbb{R}$.
- A decision problem on $W$ is a set of options on $W$.
- Given a probabilistic credence function $C$ and an option $o$, the expected utility of $o$ from the point of view of $C$ is $\sum_{w \in W} C(w)o(w)$.
- A tie-breaker function $t$ takes a set of options and returns a single option from among them.
- An evidence-gathering episode, or evidence function, $E$ is a function that takes each possible world $w$ in $W$ and returns a proposition $E(w)$ in $\mathcal{F}$.
- An evidence function is partitional if $\{E(w) \mid w \in W\}$ is a partition.
- An evidence function is factive if $w$ is in $E(w)$ for all $w$ in $W$.
- Given a probabilistic credence function $C$, a decision problem $D$, and a tie-breaker function $t$, if $O^D_C$ is the set of options in $D$ that maximize expected utility from the point of view of $C$, then let $o^{D,t}_C = t(O^D_C)$.
- Given a probabilistic credence function $C$, a decision problem $D$, a tie-breaker function $t$, and a world $w$, the pragmatic value at $w$ of $C$ for someone who faces $D$ is defined as follows: $PU^{D,t}(C, w) =o^{D,t}_C(w)$.
Good's Pragmatic Value of Information Theorem. Given a regular probabilistic credence function $C$, a partitional and factive evidence function $E$, a decision problem $D$, and a tie-breaker function $t$, if $o^{D,t}_C \neq o^{D,t}_{C_{E(w)}}$, for some $w$, then $$\sum_{w \in W} C(w)PU^{D, t}(C,w) < \sum_{w \in W} C(w)PU^{D, t}(C_{E(w)}, w)$$
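To make the definitions above concrete, here is a minimal sketch that checks the inequality in one toy case. The prior, the two bets, the evidence function, and all the function names are my own illustrative assumptions rather than anything from the main text, and the tie-breaker is simply "take the first maximizing option in list order".

```python
from fractions import Fraction as F

W = ("w1", "w2", "w3")

# Illustrative prior (an assumption): regular and probabilistic.
C = {"w1": F(2, 5), "w2": F(3, 10), "w3": F(3, 10)}

# A partitional, factive evidence function: it tells you whether w1 is actual.
E = {"w1": {"w1"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}}

# A decision problem: each option maps worlds to utilities.
bet_on_w1 = {"w1": 1, "w2": 0, "w3": 0}
bet_against_w1 = {"w1": 0, "w2": 1, "w3": 1}
D = [bet_on_w1, bet_against_w1]

def conditionalize(C, ev):
    """C_E: condition the credence function C on the set of worlds ev."""
    total = sum(C[w] for w in ev)
    return {w: (C[w] / total if w in ev else F(0)) for w in W}

def expected_utility(C, o):
    return sum(C[w] * o[w] for w in W)

def choose(C, D):
    """o^{D,t}_C, with t = first maximizing option in list order (an assumption)."""
    best = max(expected_utility(C, o) for o in D)
    return next(o for o in D if expected_utility(C, o) == best)

def value_without_evidence(C, D):
    """Expected pragmatic value of choosing with the prior C."""
    o = choose(C, D)
    return sum(C[w] * o[w] for w in W)

def value_with_evidence(C, D, E):
    """Expected pragmatic value, by C's lights, of choosing with C_{E(w)}."""
    return sum(C[w] * choose(conditionalize(C, E[w]), D)[w] for w in W)

print(value_without_evidence(C, D))  # 3/5
print(value_with_evidence(C, D, E))  # 1
```

Here the evidence changes the choice at $w_1$, so the theorem's condition is met and the expected value with evidence is strictly greater.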
- An epistemic utility function $EU$ takes a credence function $C$ on $\mathcal{F}$ and a world $w$ in $W$ and returns $EU(C, w)$, which measures the epistemic value of $C$ at $w$.
- $EU$ is strictly proper if, for any probabilistic credence function $C$ and any credence function $C'$, if $C \neq C'$, then $\sum_{w \in W} C(w)EU(C, w) > \sum_{w \in W} C(w)EU(C', w)$.
Myrvold's Epistemic Value of Information Theorem. Given a regular probabilistic credence function $C$, a partitional and factive evidence function $E$, and a strictly proper epistemic utility function $EU$, if $C \neq C_{E(w)}$, for some $w$, then $$\sum_{w \in W} C(w)EU(C,w) < \sum_{w \in W} C(w)EU(C_{E(w)}, w)$$
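The epistemic version can be checked in the same toy case. This is only a sketch under my own assumptions: the prior and the partition are illustrative, the function names are hypothetical, and the epistemic utility function is the unweighted negative Brier score summed over every proposition in $\mathcal{F}$.

```python
from fractions import Fraction as F
from itertools import chain, combinations

W = ("w1", "w2", "w3")

# All subsets of W, i.e. every proposition in F.
PROPS = [frozenset(s) for s in chain.from_iterable(
    combinations(W, r) for r in range(len(W) + 1))]

# Illustrative prior and partitional, factive evidence function (assumptions).
C = {"w1": F(2, 5), "w2": F(3, 10), "w3": F(3, 10)}
E = {"w1": {"w1"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}}

def conditionalize(C, ev):
    """C_E: condition the credence function C on the set of worlds ev."""
    total = sum(C[w] for w in ev)
    return {w: (C[w] / total if w in ev else F(0)) for w in W}

def credence_in(C, X):
    """Credence in proposition X, by additivity over the worlds in X."""
    return sum((C[w] for w in X), F(0))

def neg_brier(C, w):
    """Negative Brier score of C at w, summed over all propositions."""
    return -sum((int(w in X) - credence_in(C, X)) ** 2 for X in PROPS)

def expected_eu_before(C):
    return sum(C[w] * neg_brier(C, w) for w in W)

def expected_eu_after(C, E):
    return sum(C[w] * neg_brier(conditionalize(C, E[w]), w) for w in W)

print(expected_eu_before(C))    # -33/25
print(expected_eu_after(C, E))  # -3/5
```

Since $-33/25 < -3/5$, this prior expects its updated self to be more accurate, as the theorem says it must.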
Some strictly proper epistemic utility functions:
- The negative Brier score of $C$ at $w$ is$$-\sum_{X \in \mathcal{F}} |V_w(X) - C(X)|^2$$, where $V_w(X) = 0$ if $X$ is false at $w$ and $V_w(X) = 1$ if $X$ is true at $w$.
- Given a weighting $0 < \lambda_X$ for each $X$ in $\mathcal{F}$, the weighted negative Brier score of $C$ at $w$ relative to those weights is $$-\sum_{X \in \mathcal{F}} \lambda_X|V_w(X) - C(X)|^2$$
- The negative log score of $C$ at $w$ is $\log(C(w))$.
Consider the evidence function described in the main text:
- $W = \{w_1, w_2, w_3\}$
- $E(w_1) = \{w_1, w_2\}$, $E(w_2) = \{w_2, w_3\}$, $E(w_3) = \{w_2, w_3\}$
Then if you use the negative Brier score...
- This credence function expects evidence-gathering to be better than not: $(10/39, 5/13, 14/39)$.
- This credence function expects evidence-gathering to be worse than not: $(1/16, 7/16, 1/2)$.
But if you use the negative log score, every credence function expects evidence-gathering to be better than not. And indeed, for any factive evidence function, every credence function expects evidence-gathering to be better than not, if you use the log score. That's because, for any such evidence function, learning it will raise your credence in the true world and that's all the negative log score cares about.
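The claims about this non-partitional evidence function can be checked numerically. The sketch below uses the evidence function and the two credence functions given above; the function names are hypothetical, and the Brier score is the unweighted one summed over all propositions. A positive `gain` means the credence function expects evidence-gathering to be better than not.

```python
from fractions import Fraction as F
from itertools import chain, combinations
from math import log

W = ("w1", "w2", "w3")

# The evidence function from the appendix: factive but not partitional.
E = {"w1": {"w1", "w2"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}}

# All subsets of W, i.e. every proposition in F.
PROPS = [frozenset(s) for s in chain.from_iterable(
    combinations(W, r) for r in range(len(W) + 1))]

def conditionalize(C, ev):
    """C_E: condition the credence function C on the set of worlds ev."""
    total = sum(C[w] for w in ev)
    return {w: (C[w] / total if w in ev else F(0)) for w in W}

def neg_brier(C, w):
    """Negative Brier score of C at w, summed over all propositions."""
    return -sum((int(w in X) - sum((C[v] for v in X), F(0))) ** 2
                for X in PROPS)

def neg_log(C, w):
    """Negative log score: log of the credence in the true world.
    Finite here because factive evidence never rules out the true world."""
    return log(C[w])

def gain(C, score):
    """Expected score after updating minus expected score before, by C's lights."""
    before = sum(C[w] * score(C, w) for w in W)
    after = sum(C[w] * score(conditionalize(C, E[w]), w) for w in W)
    return after - before

C_better = dict(zip(W, (F(10, 39), F(5, 13), F(14, 39))))
C_worse = dict(zip(W, (F(1, 16), F(7, 16), F(1, 2))))

for name, C in (("C_better", C_better), ("C_worse", C_worse)):
    print(name, "Brier gain:", float(gain(C, neg_brier)),
          "log gain:", gain(C, neg_log))
```

The Brier gain comes out positive for the first credence function and negative for the second, while the log gain is positive for both.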
All great stuff, Richard. But can I ask about the title of Good's original paper, viz "On the Principle of Total Evidence"? If that principle means that you should conditionalise and so act on everything you know, it looks like Good's theorem presupposes that, rather than justifying it. That is, Good shows that the expected utility of gathering more free information is at least as great as proceeding without doing so--on the assumption that you'll conditionalise and act on that extra information. In this connection, I note you call Good's theorem the "Value of Information" theorem, not the "Principle of Total Evidence". Does Good discuss this? (Actually, why don't I find out for myself? I'll have a look now.)
Yes, he does discuss it, and observes that he's shown that what has at least as much expected utility is gathering extra information AND conditionalising on it. Hm. Well, he's shown that doing all that will PROBABLY bring wanted results ("at least as much EXPECTED utility"). Still, why should I actually do all that in a particular case, unless we presuppose that I should do what is probabilistically advised GIVEN my total evidence? Tricky. I am going to think about this some more.
Interesting, David! Actually, I think you can tack on Peter M. Brown's argument for conditionalization to show that really what maximizes expected utility is gathering-evidence-and-conditionalizing, and so I think we can get a fully pragmatic justification for the whole thing. And in the epistemic case, we can tack on Hilary Greaves and David Wallace's epistemic argument for conditionalization to give the same conclusion. Brown's article here: https://philpapers.org/rec/BROCAE-3. Greaves and Wallace here: https://philpapers.org/rec/GREJCC. I'm about to do another blogpost on something related, so I'll spell that out in more detail.