Wednesday, 28 May 2014

How inaccurate is your total doxastic state?

I've written a lot on this blog about ways in which we might measure the inaccuracy of an agent when she has precise numerical credences in propositions.  I've tried to describe the various ways in which philosophers have tried to use such measures to help argue for different principles of rationality that govern these credences.  For instance, Jim Joyce has argued that credences should satisfy the axioms of the probability calculus because any non-probabilistic credences are accuracy-dominated by probabilistic credences: that is, if $c$ is a non-probabilistic credence function, there is a probabilistic credence function $c^*$ such that $c^*$ is guaranteed to be more accurate than $c$.

Of course much of the epistemological literature is concerned with agents who have quite different sorts of doxastic attitudes.  It is concerned with agents who have not credences, which we might think of as partial beliefs, but rather full (or all-or-nothing, or categorical) beliefs.  One might wonder whether we can also describe ways of measuring the inaccuracy of these doxastic attitudes.  It turns out that we can.  The principles of rationality that follow have been investigated by (amongst others) Hempel, Maher, Easwaran, and Fitelson.  I'll describe some of the inaccuracy measures below.

This raises a question.  Suppose you think that credences and full beliefs are both genuine doxastic attitudes, neither of which can be reduced to the other.  Then it is natural to think that the inaccuracy of one's total doxastic state is the sum of the inaccuracy of the credal part and the inaccuracy of the full belief part.  Now suppose that you think that, while neither sort of attitude can be reduced to the other, there is a tight connection between them for rational believers.  Indeed, you accept a normative version of the Lockean thesis: that is, you say that an agent should have a belief in $p$ iff her credence in $p$ is at least $t$ (for some threshold $0.5 < t \leq 1$) and she should have a disbelief in $p$ iff her credence in $p$ is at most $1-t$.  Then it turns out that something rather unfortunate happens.  Joyce's accuracy dominance argument for probabilism described above fails.  It now turns out that there are non-probabilistic credence functions with the following properties: while they are accuracy-dominated, the rational total doxastic state that they generate via the normative Lockean thesis -- that is, the total doxastic state that includes those credences together with the full beliefs or disbeliefs that the normative Lockean thesis demands -- is not accuracy-dominated by any other total doxastic state that satisfies the normative Lockean thesis.

Let's see how this happens.  We need three ingredients:

Inaccuracy for credences

The inaccuracy of a credence $x$ in proposition $X$ at world $w$ is given by the quadratic scoring rule:
$$i(x, w) = \left\{ \begin{array}{ll}
(1-x)^2 & \mbox{if $X$ is true at $w$} \\
x^2 & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
Suppose $c = \{c_1, \ldots, c_n\}$ is a set of credences on a set of propositions $\mathbf{F} = \{X_1, \ldots, X_n\}$.  The inaccuracy of the whole credence function is given as follows:
$$I(c, w) = \sum_k i(c_k, w)$$

Inaccuracy for beliefs

Suppose $\mathbf{B} = \{b_1, \ldots, b_n\}$ is a set of beliefs, disbeliefs, and suspensions of judgment on a set of propositions $\mathbf{F} = \{X_1, \ldots, X_n\}$.  Thus, each $b_k$ is either a belief in $X_k$ (denoted $B(X_k)$), a disbelief in $X_k$ (denoted $D(X_k)$), or a suspension of judgment in $X_k$ (denoted $S(X_k)$).  Then the inaccuracy of attitude $b$ in proposition $X$ at world $w$ is given as follows: there is a reward $R$ for a true belief or a false disbelief; there is a penalty $W$ for a false belief or a true disbelief; and suspensions receive neither penalty nor reward regardless of the truth of the proposition in question.  We assume $R, W > 0$.  Since we are interested in measuring inaccuracy rather than accuracy, the reward then makes a negative contribution to inaccuracy and the penalty makes a positive contribution. Thus:
$$i(B(X), w) = \left\{ \begin{array}{ll}
-R & \mbox{if $X$ is true at $w$} \\
W & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
$$i(S(X), w) = \left\{ \begin{array}{ll}
0 & \mbox{if $X$ is true at $w$} \\
0 & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
$$i(D(X), w) = \left\{ \begin{array}{ll}
W & \mbox{if $X$ is true at $w$} \\
-R & \mbox{if $X$ is false at $w$}
\end{array} \right.$$
This then generates an inaccuracy measure on a set of beliefs $\mathbf{B}$ as follows:
$$I(\mathbf{B}, w) = \sum_k i(b_k, w)$$
Hempel noticed that, if $R = W$ and $p$ is a probability function, then: $B(X)$ uniquely maximises expected utility by the lights of $p$ iff $p(X) > 0.5$; $D(X)$ uniquely maximises expected utility by the lights of $p$ iff $p(X) < 0.5$; $S(X)$ maximises expected utility iff $p(X) = 0.5$, but in that situation, $B(X)$ and $D(X)$ do too.  Easwaran has investigated what happens if $R \neq W$.
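Hempel's observation can be checked directly. The following is a short worked derivation, a sketch that assumes utility is simply the negative of the inaccuracy scores defined above, and writes $q$ for $p(X)$ (the abbreviations $q$ and $EU_p$ are mine):

```latex
% Expected utility of each attitude towards X by the lights of p, with q = p(X)
\begin{align*}
EU_p(B(X)) &= qR - (1-q)W \\
EU_p(D(X)) &= (1-q)R - qW \\
EU_p(S(X)) &= 0
\end{align*}
% Special case R = W
\begin{align*}
EU_p(B(X)) &= R(2q - 1), & EU_p(D(X)) &= R(1 - 2q), & EU_p(S(X)) &= 0
\end{align*}
```

Since $R > 0$, the first quantity is the largest of the three exactly when $q > 0.5$, the second exactly when $q < 0.5$, and all three are equal to $0$ when $q = 0.5$.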

Lockean thesis

For some $0.5 < t \leq 1$:
  • A rational agent has a belief in $X$ iff $c(X) \geq t$;
  • A rational agent has a disbelief in $X$ iff $c(X) \leq 1-t$;
  • A rational agent suspends judgment in $X$ iff $1-t < c(X) < t$.

Inaccuracy for total doxastic state

We can now put these three ingredients together to give an inaccuracy measure for a total doxastic state that satisfies the normative Lockean thesis.  We state the measure as a measure of the inaccuracy of a credence $x$ in proposition $X$ at world $w$, since any total doxastic state that satisfies the normative Lockean thesis is completely determined by the credal part.
$$i_t(x, w) = \left\{ \begin{array}{ll}
(1-x)^2 - R & \mbox{if } t \leq x \leq 1 \mbox{ and } X \mbox{ is true} \\
(1-x)^2 & \mbox{if } 1-t < x < t \mbox{ and } X \mbox{ is true} \\
(1-x)^2 + W & \mbox{if } 0 \leq x \leq 1-t \mbox{ and } X \mbox{ is true} \\
x^2 + W & \mbox{if } t \leq x \leq 1 \mbox{ and } X \mbox{ is false} \\
x^2 & \mbox{if } 1-t < x < t \mbox{ and } X \mbox{ is false} \\
x^2 - R & \mbox{if } 0 \leq x \leq 1-t \mbox{ and } X \mbox{ is false}
\end{array} \right.$$
Finally, we give the total inaccuracy of such a doxastic state:
$$I_t(c, w) = \sum_k i_t(c_k, w)$$
Three things are interesting about this inaccuracy measure.  First, unlike the inaccuracy measures we usually deal with, it's discontinuous.  The inaccuracy of $x$ in $X$ is discontinuous at $t$ and at $1-t$.  If $X$ is true, this is because, as $x$ crosses the Lockean threshold $t$, it gives rise to a true belief, whose reward contributes negatively to the inaccuracy; and as it crosses the other Lockean threshold $1-t$, it gives rise to a true disbelief, whose penalty contributes positively to the inaccuracy.

Second, the measure is proper.  That is, each probabilistic set of credences expects itself to be amongst the least inaccurate.

Third, as mentioned above, there are non-probabilistic credence functions that are not accuracy-dominated when inaccuracy is measured by $I_t$.  Consider the following example. 
  • $\mathbf{F} = \{X, \neg X\}$.  That is, our agent has credences only in two propositions.
  • $c(X) = 0.6$ and $c(\neg X) = 0.5$.
  • $R = 0.4$, $W = 0.6$.  That is, the penalty for a false belief or true disbelief is fifty percent higher than the reward for a true belief.
  • $t = 0.6$.  That is, a rational agent has a belief in $X$ iff her credence is at least 0.6; and she has a disbelief in $X$ iff her credence is at most 0.4.  It's worth noting that, for probabilistic agents who specify $R$ and $W$ as we just have, satisfying the Lockean thesis with $t = 0.6$ will always minimize expected inaccuracy.
Then we have the following result:  There is no total doxastic state that satisfies the Lockean thesis that $I_t$-dominates $c$.

The following figure helps us to see why.

Here, we plot the possible credence functions on $\mathbf{F} = \{X, \neg X\}$ on the unit square.  The dotted lines represent the Lockean thresholds: a belief threshold for $X$ and a disbelief threshold for $X$; and similarly for $\neg X$.  The undotted diagonal line includes all the probabilistically coherent credence functions; that is, those for which the credence in $X$ and the credence in $\neg X$ sum to 1.  $c$ is the credence function described above.  It is probabilistically incoherent.  The lower right-hand arc includes all the possible credence functions that are exactly as inaccurate as $c$ when $X$ is true and inaccuracy is measured by $I$.  The upper left-hand arc includes all the possible credence functions that are exactly as inaccurate as $c$ when $\neg X$ is true and inaccuracy is measured by $I$.

Note that, in line with Joyce's accuracy-domination argument for probabilism, $c$ is $I$-dominated.  It is $I$-dominated by all of the credence functions that lie between the two arcs.  Some of these -- namely, those that also lie on the diagonal line -- are not themselves $I$-dominated.  This seems to rule out $c$ as irrational.  But of course, when we are considering not only the inaccuracy of $c$ but also the inaccuracy of the beliefs and disbeliefs to which $c$ gives rise in line with the Lockean thesis, our measure of inaccuracy is $I_t$, not $I$.  Notice that none of the credence functions that $I$-dominate $c$ also $I_t$-dominates it.  The reason is that every such credence function assigns $X$ a credence less than 0.6.  Thus, none of them gives rise to a full belief in $X$.  As a result, the decrease in $I$ that is obtained by moving to one of these does not exceed $R$, which is the accuracy 'boost' obtained by having the true belief in $X$ to which $c$ gives rise.  By checking cases, we can see further that no other credence function $I_t$-dominates $c$.
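The example can be spot-checked numerically. What follows is a minimal sketch rather than a proof: it searches only a finite grid of credence pairs (steps of 0.01), it uses exact rational arithmetic so that the thresholds are not blurred by rounding, and it tests for weak dominance (at least as accurate at both worlds, strictly more accurate at one), which is the more demanding test to survive. The function names `brier`, `lockean`, `total` and `dominates` are my own labels for $i$, $i_t$, the summed measures, and the dominance relation.

```python
from fractions import Fraction as F
from itertools import product

# Reward, penalty and Lockean threshold from the example: R = 0.4, W = 0.6, t = 0.6.
R, W, T = F(2, 5), F(3, 5), F(3, 5)


def brier(x, true):
    """Quadratic inaccuracy i(x, w) of credence x in a proposition."""
    return (1 - x) ** 2 if true else x ** 2


def lockean(x, true):
    """i_t(x, w): quadratic inaccuracy plus the score of the attitude the Lockean thesis demands."""
    score = brier(x, true)
    if x >= T:                      # credence at or above t: belief
        score += -R if true else W
    elif x <= 1 - T:                # credence at or below 1 - t: disbelief
        score += W if true else -R
    return score                    # otherwise suspension, which adds nothing


def total(measure, c):
    """Inaccuracy of c = (credence in X, credence in not-X) at the worlds (X true, X false)."""
    x, y = c
    return (measure(x, True) + measure(y, False),
            measure(x, False) + measure(y, True))


def dominates(measure, a, b):
    """True if a is no more inaccurate than b at both worlds and strictly less inaccurate at one."""
    ia, ib = total(measure, a), total(measure, b)
    return all(p <= q for p, q in zip(ia, ib)) and any(p < q for p, q in zip(ia, ib))


c = (F(3, 5), F(1, 2))              # c(X) = 0.6, c(not-X) = 0.5
grid = [F(k, 100) for k in range(101)]
points = list(product(grid, repeat=2))

brier_dominators = [p for p in points if dominates(brier, p, c)]
lockean_dominators = [p for p in points if dominates(lockean, p, c)]

print(total(lockean, c))            # I_t(c, w) at the two worlds
print(len(brier_dominators))        # number of grid points that I-dominate c
print(len(lockean_dominators))      # number of grid points that I_t-dominate c
```

On this grid some points $I$-dominate $c$ (for instance the coherent pair $(0.55, 0.45)$), while none $I_t$-dominates it. The pair $(0.5, 0.4)$ comes closest: it ties with $c$ under $I_t$ at both worlds, so it does not dominate.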

Is this a problem?  That depends on whether one takes credences and beliefs to be two separate, but related doxastic states.  If one does, and if one accepts further that the Lockean thesis describes the way in which they are related, then $I_t$ seems the natural way to measure the total doxastic state that arises when both are present.  But then one loses the accuracy-domination argument for probabilism.  However, one might avoid this conclusion if one were to say that, really, there are only credence functions; and that beliefs, to the extent they exist at all, are reducible to credences.  That is, if one were to take the Lockean thesis to be a reductionist claim rather than a normative claim, it would seem natural to measure the inaccuracy of a credence function using $I$ instead of $I_t$.  While one would still say that, as a credence in $X$ moves across the Lockean threshold for belief, it gives rise to a new belief, it would no longer seem right to think that this discontinuous change in doxastic state should give rise to a discontinuous change in inaccuracy; for the new belief is not really a genuinely new doxastic state; it is rather a way of classifying the credal state.

Tuesday, 27 May 2014

Is Mathematics Physics?

Is mathematics physics?

I think of mathematics as the physics of necessity. The major difference between physical entities (like atoms, the magnetic field $\mathbf{B}$, wavefunctions, etc.) and pure mathematical entities, like $\pi$ or $\aleph_0$ or $SU(3)$, is their modal status.

[Update 28 May 2014: I clarify what I mean a bit more. The idea, which I haven't quite figured out properly yet, is this. The idea I'm defending here was mentioned before in an earlier M-Phi post, Abstracta - The Way of Modal Invariance. As we move from possible world to possible world, the intrinsic properties of a pure mathematical object, like $\pi$ or $\aleph_0$, remain invariant. In all possible worlds, $\pi > 3$. And, in all possible worlds, $\aleph_0 > n$, for any $n \in \mathbb{N}$. Pure mathematical entities are, in some important sense, modally invariant: they do not "change" their relations amongst each other temporally (in time) or modally. However, the modal status of physical / concrete entities is different: their properties and relations to each other change as we move from world to world. In some worlds, there are atoms; in some there are no atoms. In some worlds (perhaps ours), $\nabla \cdot \mathbf{B} = 0$, and in some worlds, $\nabla \cdot \mathbf{B} = 4 \pi \rho_m$ (where $\rho_m$ is a magnetic density; see this for magnetic monopoles). In some worlds (ours, say, $w^{\ast}$), the set $B$ of Beatles (for definiteness, say 1 January 1966) has cardinality 4; and, in some world, say $w_1$, the set of Beatles has cardinality 3; and in some world, say $w_2$, the set of Beatles has cardinality 0. One might say that really there are three different sets, $B_{w^{\ast}}, B_{w_1}, B_{w_2}$. Or we can say that the property being a Beatle has different extensions at different worlds, because the contingent facts are varying. In these cases, the changes across time and across worlds reflect that the relevant facts are contingent, and involve concrete entities - entities which change over time and from world to world.]

There is a quite different, and opposing, idea that the role of mathematics in science is to "represent". There is a serious attempt to establish this view:
Field 1980, Science Without Numbers
This is a brilliant piece of work, full of extremely interesting ideas and insights. It introduces two main lines of argument:
  • First, it invokes conservativeness to explain both the "insubstantiality" and utility of mathematics (that is, it explains how one might accept and use a theory $T_m$ referring to mathematical objects without thinking the theory is true - rather it is a conservative extension of a purely "nominalistic" sub-theory $T_n$, which is the theory one takes to be true); 
  • second, it invokes representation theorems to explain the representational role of a mathematicized extension $T_m$ of a purely "nominalistic" theory $T_n$. 
In the first case, the central conservativeness claim is:
(Con) For any $\phi \in Sent(L_n)$, if $T_m \vdash \phi$, then $T_n \vdash \phi$.
In the second case, the central representation claim is
(Rep) Any model $\mathcal{M} \models T_n$ can be "nicely embedded" in a model of $T_m$.
People interested in these topics need to learn about the conservativeness results and the relevant representation theorems. They lie at the heart of the debate. I would call Field's book a classic of analytic metaphysics, along with other classics, such as Frege's Foundations of Arithmetic (1884), Russell's Principles of Mathematics (1903), Carnap's Der logische Aufbau der Welt (1928), Lewis's On the Plurality of Worlds (1986) and Parts of Classes (1991).

But I think Field's approach, despite its many important insights, is problematic for a number of reasons, quite complicated ones, and impossible to summarize easily. Many of the reasons are set out in detail in,
Burgess & Rosen 1997: A Subject with No Object.
I find the Burgess & Rosen response to the claim that "the role of mathematics is to represent" definitive. (However, very recently, there has been a piece of work developing a similar view to Field's: this is Arntzenius & Dorr (2012): see below.)

There has been a recent turn to "instrumentalism". To me, this seems a desperate move. The instrumentalist about mathematics claims that there is a separation of purely "nominalistic content" of a mixed assertion, e.g., a physical law like $\nabla \cdot \mathbf{B} = 0$, but without saying exactly what that content is. To me, this seems like mysticism.

Occasionally, when discussing these topics, there is unclarity about co-ordinate systems. The whole idea goes back to Descartes who, putting it anachronistically, noticed an "isomorphism" between geometric space and $\mathbb{R}^3$. This is why we talk of Cartesian co-ordinates, etc. Instead of talking about points, lines, surfaces and regions in space, we can talk of polynomial equations, $f(x,y,z) = 0$ (where the values of variables are real numbers) and use algebraic methods.

Let $(U,\phi)$ be a co-ordinate system on spacetime; a map that takes each spacetime point $p$ in the spacetime region $U$ to its co-ordinates $(x^0,x^1,x^2,x^3) = \phi(p)$, where $x^0,x^1,x^2,x^3 \in \mathbb{R}$. Let the image of $U$ under $\phi$ be $\phi[U] \subseteq \mathbb{R}^4$. The map $\phi$ has to preserve the physical topology of spacetime itself: the pre-image $\phi^{-1}[V]$ of any open set $V \subseteq \phi[U]$ has to be open in the physical topology.

Suppose that $\mathbf{B}$ is a physical vector field on spacetime (e.g., the magnetic field). Then there is a co-ordinate representation $\mathbf{B}^{\phi}$ of $\mathbf{B}$ relative to $\phi$. I.e., for any $x \in \phi[U]$,
$\mathbf{B}^{\phi}(x) = \mathbf{B}(\phi^{-1}(x))$
Then the function
$\mathbf{B}^{\phi} : \phi[U] \to \mathbb{R}^3$ 
represents the vector field $\mathbf{B}$ on spacetime. The fact that $\mathbf{B}^{\phi}$ represents the function $\mathbf{B}$ does not imply that $\mathbf{B}$ is not a function. It is simply one function representing another function.

Here I am just describing the usual practice of mathematical physics. It does not bear on "nominalism" unless one denies that there is such a function as $\mathbf{B}$, the magnetic field. If one wishes to argue, e.g., that $\mathbf{B}$ is not a function, or that we may dispense with it somehow, then one needs to present a detailed argument. Hartry Field did present an argument that, while physical fields such as $\mathbf{B}$ are indeed functions (his example involved the gravitational potential $\Phi$ and the mass density field $\rho$, but the underlying issues and reformulation method are the same), even so, a "physically equivalent" theory can be formulated using only primitive relational predicates on space-time points. Since then, almost no one has tried, because it is very hard technically to carry this through. The only exception is Frank Arntzenius & Cian Dorr 2012, "Calculus as Geometry". Their idea is not to deny that $\mathbf{B}$ is a function, but rather to expand the concrete ontology, and declare the whole fibre bundle concrete! They have concretized the abstracta. A review of their work by David Baker is here.

While mixed sets of concreta and physical functions (like $\mathbf{B}$, $F_{ab}$, wavefunctions, etc.) and so on do, perhaps, "explain", this is not the issue. For example, if we wish to refer to the average height of a Beatle, we assert that
  • there is a set $B$ of Beatles, 
  • each Beatle $p \in B$ has a height-in-metres $h(p) \in \mathbb{R}$, 
  • there is a cardinal number $N = |B| \neq 0$ of this set, 
and we define the average height(-in-metres) by adding these heights $h(p_1) + \dots$, and dividing by $N$. This is how mathematics is actually applied, in practice.

People assume the disciplines of mathematics and physics are "separate" and seem astonished that someone might dispute this. But it seems to me they are not separate. It's more accurate to say that they are the same thing: mathematics = physics. This kind of view is also advocated by Professor Max Tegmark of MIT. The difference between mathematics and physics has little to do with epistemology. It seems right that the central difference is that physics deals with contingencies and pure mathematics with necessities. For example:
it is contingent that $\nabla \cdot \mathbf{B} = 0$.
it is necessary that $(\mathbb{N}, <)$ is well ordered.

Ergo publishes first issue

Ergo -- a general, open access philosophy journal edited by Franz Huber and Jonathan Weisberg (Toronto) -- has published its first issue.

See here for a blog post by Anna Mahtani (LSE) on a paper by Mike Caie (Syracuse), which is published in the first issue.

Michael Caie's 'Calibration and Probabilism' (Guest post by Anna Mahtani)

Michael Caie has a really interesting paper forthcoming in Ergo. The paper is a criticism of van Fraassen's Calibration argument. It's carefully argued and technical, but here I just give the gist of Caie's argument, and highlight a point that I think it would be interesting to pursue.

Here's the rough idea behind van Fraassen's Calibration argument. We can begin with this thought: being calibrated is a 'good-making feature' (as Caie puts it) of an agent's credal state. And to say that an agent's credal state is calibrated is to say, roughly, that the agent's credence in claims of a particular type matches the frequency of truths of that type. So here's an example. Suppose that I am looking at a pack of cards. For each card in the pack (call them $c_1$, $c_2$, ... $c_{52}$), I have a credence of 1/4 that that card is a diamond ($D$). So we have a set of claims $\{D_{c_1}, D_{c_2}, \ldots, D_{c_{52}}\}$, and I have a credence of 1/4 in each claim in this set. Now in fact (as you would expect), exactly 1/4 of the claims in the set are true - and so my credence matches up with the frequency. Thus we say that my credal state is 'calibrated' over this set of claims. (In fact, calibration does not require my credence to exactly match the frequency; it requires that, for any $\epsilon > 0$, the difference between my credence and the frequency is lower than $\epsilon$.)

Now suppose instead that I am looking at just one card from the pack, say $c_{24}$. Again I have a credence of 1/4 in $D_{c_{24}}$. For whatever reason (I don't know about their existence, I have never considered them, ...) I have no credences about the other cards in the pack. So if we wanted to gather up into a set all my credences in claims of the form $D_{c_i}$, then there would be just one claim in that set: $D_{c_{24}}$. Thus for my credal state to be calibrated over this set, I must have either a credence of 1 in $D_{c_{24}}$ (if $D_{c_{24}}$ is true) or a credence of 0 in $D_{c_{24}}$ (if $D_{c_{24}}$ is false). But then being calibrated no longer seems like such a good-making feature: at least, it isn't something that we should require of every rational agent. What we do then, in Caie's words, is we 'abstract away from the limitations imposed by the numbers of propositions in this class'. We ask: if we added further relevant claims to the set, could we get the agent's credal state to be calibrated over the (new, extended) set? If so, then the agent's credal state is calibratable over the (original, in this case single-member) set. So in our example, we start with the set containing just $D_{c_{24}}$, and add other relevant claims to this set - perhaps $D_{c_1}$ and $D_{c_2}$ for example. (Exactly what makes a claim 'relevant' is an issue that Caie discusses in the paper - here for simplicity I'll skate over this and assume you get the idea). We suppose that just as I have a credence of 1/4 in $D_{c_{24}}$, so I have a credence of 1/4 in all these relevantly similar claims in the new extended set. Now can we - by adding claims in this way - get my credence calibrated over this extended set? In this example, we can: one way to do it would be just to add in relevant claims for all the 51 remaining cards. 
And because there is an extended set such that my credal state would be calibrated over that extended set - we can say that my credence function is calibratable over the original set.
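The card example can be made concrete with a few lines of code. This is only a sketch under simplifying assumptions: it uses the rough exact-match notion of calibration over a finite set of claims all held with the same credence, and the encoding of the pack (the first 13 of 52 cards are the diamonds) and the function names are hypothetical choices of mine.

```python
from fractions import Fraction

# Hypothetical encoding of the pack: True marks "card i is a diamond" (13 of 52 cards).
truths = [i < 13 for i in range(52)]
credence = Fraction(1, 4)           # the same credence in each claim D_{c_i}


def frequency(truth_values):
    """Proportion of true claims in a finite set of claims."""
    return Fraction(sum(truth_values), len(truth_values))


def calibrated(credence, truth_values):
    """Exact-match calibration over a finite set of claims held with the same credence."""
    return credence == frequency(truth_values)


full_pack = calibrated(credence, truths)        # all 52 claims
single_true = calibrated(credence, [True])      # only D_{c_24}, supposing it is true
single_false = calibrated(credence, [False])    # only D_{c_24}, supposing it is false
print(full_pack, single_true, single_false)
```

Over the whole pack the credence of 1/4 matches the frequency, whereas over the single-member set it matches in neither case; extending the single-member set to the full pack is what makes the credence calibratable.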

Van Fraassen's key claim is that if an agent's credence function is not calibratable (over some set over which it is defined), then the agent is irrational. Or - more accurately, an agent is irrational if (s)he has a credal state such that it can be determined a priori that if the agent has that credal state, then it is not calibratable (over some set over which it is defined). Van Fraassen shows that it follows from this principle that a rational agent will obey the probability axioms. This is a welcome result, because the probability axioms are intuitively compelling - and here we have an argument to the conclusion that rational agents' credence functions obey these axioms. What Caie argues in this paper, though, is that unwelcome results also follow from van Fraassen's principle.
The centrepiece of Caie's paper concerns this sentence, where $Cr_a$ refers to Annie's credence function:

(*)        $\neg Cr_a(T(*)) \geq 0.5$

The sentence (T) below is an instance of the T-schema, and plausibly Annie (if rational) has a credence of 1 in (T).

(T)        $T(*) \leftrightarrow \neg Cr_a(T(*)) \geq 0.5$

Caie shows that if Annie has a credence of 1 in (T), and any credence at all in (*), then her credal state is not calibratable. Thus, given van Fraassen's principle, Annie is classed as irrational. But this is an unwelcome result. As Caie rightly points out, (*) is not a liar sentence: it can be, say, true without contradiction. Furthermore Annie can have a credence of 1 in (T) and some credence in (*) without violating the probability axioms. Thus (Caie argues), van Fraassen's principle gives the wrong results here. (In fact, Caie acknowledges that it may seem that Annie's credence state should be classed as irrational if she has some credence in (*). He goes on to show that - given van Fraassen's principle - her credence state is classed as irrational if she has any credence greater than 0 in (T), and this is much harder to swallow).

We can see why if Annie has a credence in (*), then her credal state is not calibratable across the set containing (*) and (T). To see this, let's try to extend the set, and find a set over which Annie's credal state would be calibrated. In extending the set, we can introduce as many sentences $x$ as we like that are relevantly similar to (*); but to count as 'relevantly similar to (*)', these sentences $x$ must be such that $T(x) \leftrightarrow \neg Cr_a(T(x)) \geq 0.5$ holds. Let's start then by supposing that Annie's credence in (*) is less than 0.5. Then, to get her credal state to calibrate, we want to include in the extended set some false sentences $x$, that are relevantly similar to (*). But when we include a sentence $x$ that is relevantly similar to (*), we must suppose that Annie's credence in each $x$ is the same as her credence in (*), and her credence in (*) is less than 0.5. And whenever we have a sentence $x$ relevantly similar to (*), such that Annie's credence in $x$ is less than 0.5, that sentence will be true. This is because $T(x) \leftrightarrow \neg Cr_a(T(x)) \geq 0.5$ holds for all these $x$'s that are relevantly similar to (*).  Thus if Annie's credence in (*) is less than 0.5, then we cannot extend the set containing (*) and (T) to produce a set over which Annie's credal state would be calibrated. We can reason in a parallel way to show that we can't produce an extended set over which Annie's credal state is calibrated if Annie's credence in (*) is greater than or equal to 0.5. So - whatever Annie's credence in (*), her credence can't be calibrated over the set. Thus on van Fraassen's account, she is irrational. As Caie argues, this is an unwelcome result.
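The shape of this argument can be illustrated with a small sketch. It is a toy model, not Caie's formal result: following the gloss above, it treats a sentence relevantly similar to (*) as true exactly when the credence in it is below 0.5, uses exact-match calibration over finite extended sets, and checks only a grid of credences and three set sizes. The function names are hypothetical.

```python
from fractions import Fraction


def truth_value(credence):
    """A sentence relevantly similar to (*) is true exactly when the credence in it is below 0.5."""
    return credence < Fraction(1, 2)


def frequency_of_truths(credence, n):
    """Frequency of truths among n such sentences, all held with the same credence."""
    return Fraction(sum(truth_value(credence) for _ in range(n)), n)


credences = [Fraction(k, 100) for k in range(101)]

# Collect every credence (and set size) at which credence and frequency would match.
calibrated_at = [(r, n) for r in credences
                 for n in (1, 10, 100)
                 if frequency_of_truths(r, n) == r]
print(calibrated_at)
```

The list comes out empty: a credence below 0.5 makes every such sentence true (frequency 1), and a credence of 0.5 or more makes every such sentence false (frequency 0), so the credence never matches the frequency however many sentences are added.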

This argument from Caie reminds me of Sorensen's discussion of epistemic blindspots. You can have a claim that is perfectly consistent, but such that if you conjoin it with the further claim that it is believed (perhaps by a particular person or at a particular time - depending on the claim), the conjunction is inconsistent. Here is a simple example:

(i)    S does not have any beliefs.

This sentence is perfectly consistent, but conjoin (i) with the claim that S believes (i) and you get an inconsistent conjunction. Thus S cannot (as a matter of logical necessity) truly believe (i). The same goes for this Moorean sentence:

(ii)    'P and S does not believe that P'.

(ii) is consistent, but it would be inconsistent to state the conjunction of (ii) and the claim that S believes (ii). Thus S cannot truly believe (ii). There are even sentences such that S can neither truly believe the sentence, nor truly believe its negation. Here is an example:

 (iii)    S does not believe (iii).

The conjunction of (iii) together with the claim that S believes (iii) is inconsistent, so S cannot truly believe (iii). But it seems that S (if coherent) also cannot truly believe the negation of (iii). For if S believes the negation of (iii), then (if S is coherent) S does not believe (iii) - in which case (iii) is true, and the negation of (iii) is false. It seems then that S (if coherent) cannot either truly believe (iii), or truly believe the negation of (iii). Further, this can be figured out a priori. So doesn't it follow that S is incoherent if (s)he believes either (iii) or its negation?

I don't think this does follow, however. I think that whether an agent's outright belief set counts as irrational should depend simply on whether the contents of his or her beliefs are consistent, and both (iii) and its negation are consistent. We might be tempted to judge whether S's outright belief state is rational by thinking about whether the set of beliefs that S holds are such that S could hold all of these beliefs truly. But this introduces the quirks that we have seen: S can have a perfectly consistent set of beliefs, but because some of these are about her own belief state, we end up classing S as irrational. It seems better to me to judge whether S's outright belief state is rational by thinking about whether the set of beliefs that S holds are such that someone could hold all of those beliefs truly. But even this can lead us astray - for we will still have quirky cases of beliefs that someone has a particular belief. It is better simply to ask whether the content of the beliefs form a consistent set.

The same issue seems to arise in Caie's example. If Annie has a credence of 1 in (T), and any credence in (*), then Annie's credal state cannot be calibrated. That is, there is no way of extending Annie's credal state in such a way that it can be both Annie's credal state and calibrated. This is why (as Caie shows) on van Fraassen's account, Annie gets classed as irrational. However, there are ways of extending Annie's credal state in such a way that it can be calibrated: it just can't be both calibrated and Annie's credal state. Here then I think we have a new sort of epistemic blindspot: if we accept van Fraassen's account, then it seems that a rational credal state can be defined over both (T) and (*) - but not if it is Annie's.

One option, then, is to adjust van Fraassen's view to get around this problem. We could require a rational agent's credence state to be such that it can be extended into a calibrated credal state - but not necessarily into a credal state that would be calibrated if it was the agent's credal state. Caie has something to say about why this move is a mistake - but I think it might be worth pursuing.

Friday, 23 May 2014

The Quine-Putnam Indispensability Argument Again

An earlier post on the Quine-Putnam indispensability argument was linked on reddit and received some discussion there. It gets mentioned here occasionally because it was my thesis topic, and I tend to think the argument given by W.V. Quine and Hilary Putnam is not quite understood properly. It is sometimes expressed as an argument "for the existence of mathematical entities" -- as if the default position was nominalism (the non-existence of mathematical entities). It's certainly true that this was Quine's default up to around 1947, and perhaps that's why it generally gets formulated in that way. Quine was concerned with whether a "nominalistic" position - that is, a theory of reality which postulates only concrete entities - would be sufficient for the needs of science, and eventually concluded that it would not be. Scientific theories not only do, but also need to, refer to numbers, functions, sets, quantities, vector spaces, fibre bundles, Lie groups and so on. So, really the argument, as worked out by Quine and Putnam, is simply that science and nominalism don't combine well.

[I can't remember if I have mentioned this joke before: abstract entities include propositions. And nominalism is itself a proposition. Consequently, if nominalism is true, nominalism doesn't even exist.]

But there are other widely discussed formulations, the most widely known one being by Mark Colyvan, which begins from some sort of nominalistic view as the default, and then argues for an epistemological conclusion that we "ought to have ontological commitment" to mathematical objects because they are indispensable to science.

The lines of argument by Quine appeared in scattered writings over a period of forty years; and Hilary Putnam wrote two works focused on the topic: Philosophy of Logic (1971) and "What is Mathematical Truth?" (1975). In its strongest form, the argument pits modern science against nominalism.

For example, consider the magnetic field, $\mathbf{B}$, a physical quantity associated with all kinds of phenomena (e.g., light, which generates another joke: if you think the electromagnetic field is an invisible theoretical entity, then, since it actually is light, your view implies that light is invisible. I think I heard this joke against scientific anti-realism from Jeremy Butterfield). In technical terms, $\mathbf{B}$ is an axial vector field. In even more technical terms, the components $B_x,B_y,B_z$ are three of the components of the electromagnetic tensor field $F_{ab}$. In even more technical terms, the electromagnetic field $F_{ab}$ is a "curvature" of a "connection" on a fibre bundle; ... Putnam's own example was the gravitational field (or, more exactly, the potential $\Phi$), as appears in the field theoretic formulation of Newton's theory of gravity via Poisson's equation,
$\nabla^2 \Phi = 4 \pi G \rho$
The field $\mathbf{B}$ is a function, its domain the set of spacetime points (themselves usually thought of as concrete entities) and its range some vector space (the details aren't important: if you want, let it be $\mathbb{R}^3$). Nominalism is the claim that there are no numbers, sets, functions, propositions, sentences, Hilbert spaces, manifolds, etc., etc. Then a strong version of the argument of Quine and Putnam runs like this. The premises (i) and (ii), and the conclusion (C), are:
(i) $\mathbf{B}$ is a function.
(ii) The existence of functions is inconsistent with nominalism.
(C) If nominalism is true, there is no such thing as the magnetic field.
This argument doesn't get into details of theory formulations, or existential quantifiers, or claims about "ontological commitment". It simply says that, generally, physics and mathematics are mixed up together in a very deep way; and dispensing with physical quantities, like fields and wave functions and so on (which are functions with abstract value ranges), is no easy task: it would require a huge reformulation effort.

There are many responses to such arguments, and the debate can get quite technical quickly. My co-blogger Richard Pettigrew has written an interesting detailed response, "Indispensability arguments and instrumental nominalism" (RSL, 2012) mentioned here before, along with some other similar approaches.

JOB: post-doc, Groningen, Roots of Deduction project


Within the VIDI project ‘The Roots of Deduction’ led by Catarina Dutilh Novaes, the Faculty of Philosophy of the University of Groningen is advertising a 12-month post-doc position (salary scale 11.0, EUR 3.259 gross per month), to commence in January 2015 or shortly thereafter. The successful candidate will be expected to participate in the activities of the project (seminars, reading groups etc.) and to conduct research related to the topics covered by the project (no teaching obligations included).


Given the broad scope of the project, candidates with a number of different backgrounds will be considered, as long as they have a keen interest in the general topic of deductive proofs in logic and mathematics. In particular, candidates may have the following areas of expertise:

·      Ancient logic
·      Ancient mathematics (Greek as well as other traditions)
·      Philosophical logic and philosophy of logic
·      Philosophy of mathematics and philosophy of mathematical practice (proofs in particular)
·      Dialogical logic and games in logic
·      Deductive and mathematical cognition (psychology/cognitive science)

Candidate’s profile

Candidates will have just obtained or be about to obtain their PhD degrees. Since it was launched in 2011, the project has been running in a highly collaborative manner, and thus we are specifically looking for people who like to engage in collective research. Women and members of other underrepresented groups in philosophy are strongly encouraged to apply.


To apply, send a CV (including the names of two referees), a cover letter and a short (circa 1000 words) research statement, elaborating on how your research interests intersect with the topics of the project, and how you see yourself contributing to the project’s investigations. For the latter, candidates are strongly encouraged to familiarize themselves with the questions, results and activities of the project by consulting the project’s website. The research statement will be the most important factor when evaluating the applications.

Send all the application material to, with subject ‘Post-doc application’. For further inquiries, contact Catarina Dutilh Novaes at


·      Deadline for application: July 10th 2014
·      Interviews (through video-conferencing): in the week of August 18th to 22nd 2014
·      Start date: January 2015 or shortly thereafter


The Faculty of Philosophy at the University of Groningen is a stimulating, lively and internationally oriented community of excellent lecturers and researchers. In the past decade, it has been consistently ranked as one of the top philosophy programs in the Netherlands, both on research and on teaching. The faculty has an interdisciplinary outlook and maintains strong ties with other faculties in the university.

Thursday, 22 May 2014


Considering "... the most accurate report" has reminded me of a puzzle connected to the topic of "truthlikeness" or "accuracy" (also sometimes called "verisimilitude" or "approximate truth"), as might be expressed by,
$A$ is closer to the truth than $B$ is
where $A$ and $B$ are statements or theories.

David Miller published (1974, "Popper's Qualitative Theory of Verisimilitude", BJPS; and 1976, "Verisimilitude Redeflated", BJPS) an interesting problem concerning trying to make sense of this concept. Because the notion seems clearly relevant to any decent theory of scientific method, Sir Karl Popper had previously tried to develop an explication of the concept, but it turned out to suffer a serious (separate) problem, discovered by David Miller also, set out in the 1974 paper (roughly, on that explication, all false theories are as truthlike as each other).

But the other problem - the Language Dependence Problem - is this. (It is explained also in Sec. 1.4.4 of Oddie, "Truthlikeness", SEP.) The statements $A$ and $B$ are, we suppose, both false. But we should nonetheless like to make sense of what it might mean for $A$ to be closer to the truth (or, more accurate) than $B$ is. For a scientific example, we intuitively would like to say that Einstein's relativistic equation for the kinetic energy of a point particle of mass $m$ at speed $v$,
$E_k = \frac{1}{2}mv^2 + \frac{3}{8} m \frac{v^4}{c^2} + \dots$
is more accurate than the classical equation,
$E_k = \frac{1}{2}mv^2$.
(Miller 1975, "The Accuracy of Predictions" (Synthese), explains how the language dependence problem arises here also, for such comparisons.)
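For readers who want to see where the correction term comes from, here is a short derivation. It assumes the standard relativistic expression $E_k = (\gamma - 1)mc^2$ with Lorentz factor $\gamma$, neither of which is written out in the post, and expands in powers of $v^2/c^2$:

```latex
\[
\gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}
       = 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \dots
\]
\[
E_k = (\gamma - 1)\,mc^2
    = \frac{1}{2}mv^2 + \frac{3}{8}\,m\,\frac{v^4}{c^2} + \dots
\]
```

So the classical equation is the first term of the relativistic one, and the two agree ever more closely as $v/c \to 0$.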

Suppose $A$ and $B$ are false sentences in language $L$, and let the truth be $T$. I.e., $T$ is the single true statement that $A$ and $B$ are falsely approximating. Miller pointed out that, given some very natural ways to measure the "distance" between $A$ (or $B$) and the truth, a language relativity appears. One such way is to count the number of "errors" in a false statement; the statement with the fewest errors is then the closest to the truth.

I will give an example which is based on Miller's weather example, but a bit simpler. Let the language $L$ be a simple propositional language with, as its primitive sentences,
$R$ ("it is raining")
$C$ ("it is cold"). 
Suppose the truth $T$ is that it is not raining and it is cold. Let $A$ say that it is raining and it is cold and let $B$ say that it is raining and it is not cold. So, both $A$ and $B$ are false. In symbols, we have:
$T = \neg R \wedge C$.
$A = R \wedge C$.
$B = R \wedge \neg C$.
Which of $A$ or $B$ is more accurate? It seems intuitively clear that
(1) $A$ is closer to the truth than $B$ is. 
For $R \wedge C$ makes one error, while $R \wedge \neg C$ makes two errors. For those interested in the fancier details, the number of errors is the Hamming distance between the corresponding binary sequences. For this case, it amounts to (1,1) being closer to (0,1) than (1,0) is.
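The error count can be checked mechanically. The following is only a minimal sketch: the encoding of each statement as a pair (R, C) of 0/1 truth values and the helper name `hamming` are my own illustrative choices, not anything from Miller's papers.

```python
def hamming(u, v):
    """Number of positions at which two equal-length tuples differ."""
    if len(u) != len(v):
        raise ValueError("valuations must have the same length")
    return sum(a != b for a, b in zip(u, v))

# Each statement is encoded as (R, C): 1 = affirmed, 0 = denied.
T = (0, 1)  # not raining, cold (the truth)
A = (1, 1)  # raining, cold
B = (1, 0)  # raining, not cold

errors_A = hamming(A, T)
errors_B = hamming(B, T)
print(errors_A, errors_B)  # 1 2
```

On this encoding $A$ gets the raining part wrong only, while $B$ gets both parts wrong, which is claim (1).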

Miller's language dependence problem is that if we translate the statements into an equivalent language $L^{\ast}$, then we can reverse this evaluation! We can get the translation of $B$ to be closer to the truth than the translation of $A$ is.

First, we define $L^{\ast}$ to have primitive sentences $R$ and a new sentence $E$, whose translation into $L$ is "it is raining if and only if it is cold". I.e., $R \leftrightarrow C$. One can "invert" this translation, and see that the translation of $C$ into $L^{\ast}$ is given by $R \leftrightarrow E$. (This is because if $\phi \equiv (\alpha \leftrightarrow \beta)$, then $\beta \equiv (\phi \leftrightarrow \alpha)$.)

Next we translate $T$, $A$ and $B$ into $L^{\ast}$ as follows:
$T^{\ast} = \neg R \wedge \neg E$.
$A^{\ast} = R \wedge E$.
$B^{\ast} = R \wedge \neg E$.
Expressed in the new language $L^{\ast}$, we have:
(2) $B^{\ast}$ is closer to the truth than $A^{\ast}$ is.
For $B^{\ast}$ makes only one error, while $A^{\ast}$ makes two errors.
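The reversal can be reproduced in a few lines. This is a hedged sketch under my own encoding: an $L$-valuation is a pair (R, C), an $L^{\ast}$-valuation is a pair (R, E), and the function names `to_L_star` and `to_L` are hypothetical labels for the translation and its inverse.

```python
def hamming(u, v):
    """Number of positions at which two equal-length tuples differ."""
    return sum(a != b for a, b in zip(u, v))

def to_L_star(valuation):
    """Map an L-valuation (R, C) to the L*-valuation (R, E), where E = (R <-> C)."""
    r, c = valuation
    return (r, int(r == c))

def to_L(valuation):
    """Inverse translation: recover C as (R <-> E)."""
    r, e = valuation
    return (r, int(r == e))

T, A, B = (0, 1), (1, 1), (1, 0)  # as (R, C) in L
T_star, A_star, B_star = (to_L_star(x) for x in (T, A, B))

print(T_star, A_star, B_star)                            # (0, 0) (1, 1) (1, 0)
print(hamming(A, T), hamming(B, T))                      # 1 2
print(hamming(A_star, T_star), hamming(B_star, T_star))  # 2 1
```

Nothing about the world described has changed between the two printouts of distances; only the choice of primitive sentences has.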

Consequently, if we adopt this measure of distance from the truth, we can reverse closeness to truth or accuracy by simply translating into an equivalent language (i.e., one that has different primitives).

In technical terms, the problem is this. We have placed a metric $d$ on the set $X$ of propositional assignments (or models, if you like), the Hamming distance. Indeed, $X$ is just the set of four binary ordered pairs, i.e.,
$X = \{(0,0),(0,1),(1,0),(1,1)\}$
And the Hamming distances include, for example:
$d((0,0),(1,0)) = 1$
$d((1,0),(0,1)) = 2$
So, $(X,d)$ is then a metric space. A Miller-style translation from $L$ to $L^{\ast}$ induces a bijection $f: X \to X$ of this space, but this mapping $f$ is not an isometry of $d$.
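To see concretely that $f$ is a bijection but not an isometry, one can enumerate all pairs of points in $X$. The sketch below assumes the particular translation used in this post, which sends (R, C) to (R, E) with E true exactly when R and C agree; the names `f`, `hamming` and `changed` are illustrative.

```python
from itertools import combinations, product

def hamming(u, v):
    """Hamming distance d between two binary tuples of equal length."""
    return sum(a != b for a, b in zip(u, v))

def f(x):
    """Translation-induced map on X: (R, C) -> (R, E), with E = (R <-> C)."""
    r, c = x
    return (r, int(r == c))

X = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)

# f permutes X, so it is a bijection.
is_bijection = sorted(f(x) for x in X) == sorted(X)

# Pairs whose distance is not preserved by f.
changed = [(u, v) for u, v in combinations(X, 2)
           if hamming(u, v) != hamming(f(u), f(v))]

print(is_bijection)  # True
print(len(changed))  # 4
```

Only the two pairs that differ in the second coordinate alone keep their distance; the other four of the six pairs are stretched or shrunk, which is why a ranking by $d$ need not survive translation.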

Monday, 12 May 2014

Hannes Leitgeb on "scobel" (3sat) - 8 May 2014 (German only)

The programme "scobel" on 3sat aims to give the widest possible range of topics their own space and format. As host, Gert Scobel brings together, in an interdisciplinary way, the diversity of topics from culture, the natural sciences and humanities, and society. The thematic spectrum ranges from scientific research and its ethical and moral implications and effects on other disciplines and areas of life to literature, music, and current social theory and criticism. The 8 May edition is devoted to the question "Was ist normal?" ("What is normal?"). Hannes Leitgeb is also a guest in the studio. The video can still be viewed online until 14 May:

Wednesday, 7 May 2014

"Mathematicians are supposed to prove things *in* arithmetic, not *about* arithmetic!"

Hitler does not like Gödel's theorem one bit. Perhaps surprisingly, he displays a sophisticated understanding of the implications and presuppositions of the theorem. (In other words, there's some very solid philosophy of logic in the background -- I think I could teach a whole course on the material presupposed here.)

(Courtesy of Diego Tajer, talented young logician from Buenos Aires, continuing the best Monty Python tradition!)