Tuesday, 23 April 2013

For Mathematicians

Here's a nice shortish article (talk) by Mark Balaguer aiming to explain the basic ideas of philosophy of mathematics to mathematicians:
A Guide for the Perplexed: What Mathematicians Need to Know to Understand Philosophers of Mathematics
From the introductory paragraph:
My hope is to make clear for mathematicians what philosophers of mathematics are really up to and, also, to eliminate some confusions.

Monday, 22 April 2013

Fermat, set theory, and arithmetic (guest post by Colin McLarty)

This is a guest post by Colin McLarty, Truman P. Handy Professor of Intellectual Philosophy and Professor of Mathematics at Case Western Reserve University. It is a follow-up to a short post I wrote last month on his exciting current work on the foundations of mathematics. In this post, Colin explains to us what the whole project is about in just 1000 words.

--------------

Some philosophers suspect mathematicians don't care about foundations but only care about what works. But that elides the problem mathematicians constantly face: what will work? And it can promote the misapprehension that modern mathematics abandons intuition in favor of technicalities. Mathematics works by making rigor serve intuition.  Mathematicians use tools that help them see how to do what they want---without breaking down even in what I will call "deliberate, utterly reliable gaps".

By that I mean points in an argument where a mathematician cites a substantial, hard-to-prove result. The citing mathematician may or may not have once gone through the whole proof of that result, but certainly is not calling the whole proof to mind in citing it. The citing mathematician relies on that earlier result not only to be proved correctly, but to be stated in full precision, so it can be applied concisely out of context without fear of error. Major proofs today have many deliberate, utterly reliable gaps, as do their citations in turn.

These themes converged in the on-line row over whether Wiles's proof of Fermat's Last Theorem (FLT) uses Grothendieck universes. Universes are controversial in some circles, since they are sets large enough to model Zermelo–Fraenkel set theory (ZF), and so, by Gödel's second incompleteness theorem, ZF cannot prove they exist.

The term "universe" is not in Wiles's paper. Neither are proofs of most theorems he uses.  He gives citations which cite others in turn.  The citations often lead to the works where Grothendieck and colleagues established the modern methods of number theory (and about half of today’s category theory) using universes. As he depended on those proofs so he depended on universes.

One way out is never taken. Grothendieck knew that everything he does with universes in practice could also be done by discarding some larger-scale structures and treating others as mere ways of speaking rather than actual entities. Number theorists often say something like this would put their work on a ZF foundation. But they give no precise statement. And really doing it would distract from arithmetic by offering un-insightful set-theoretic complications for no serious foundational benefit. ZF itself is remote from arithmetic.

It is no surprise theoretically that a statement about numbers could be proved by high-level set theory. Gödel showed things like this have to happen sometimes, since any increase in the consistency strength of a foundation makes new number-theoretic statements provable. Consistency itself can be expressed by number-theoretic statements. But it is surprising in fact that FLT should be proved this way. We do not expect to see the Gödel phenomenon in such simple statements. I am working to lessen the surprise in the case of FLT and other recent number theory by bringing the proofs closer to arithmetic. I have formalized the whole Grothendieck toolkit in finite order arithmetic. That is the strongest theory that is commonly called "arithmetic". From that point of view, it is the simple theory of types including an axiom of infinity. From another viewpoint, it is the weakest theory that is commonly called "set theory". It is set theory using only numbers and sets of numbers and sets of sets of numbers, all built from numbers in some finite number of levels by bounded comprehension.
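Schematically, such a theory has variables for numbers, for sets of numbers, for sets of sets of numbers, and so on through the finite levels, with a comprehension scheme at each level of roughly the form
$\exists X^{n+1} \forall y^{n} (y \in X^{n+1} \leftrightarrow \phi(y))$,
where $\phi$ does not contain $X^{n+1}$ free. (This is one standard way of presenting the simple theory of types; the "bounded" version restricts which formulas $\phi$ are allowed, and details vary across presentations.)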

The version in my article "A finite order arithmetic foundation for cohomology" looks like Grothendieck’s to anyone but a professional logician. You can just replace a few foundational passages in the Grothendieck work by this foundation. It proves less than Grothendieck’s universes in principle.  But all the general theorems actually in the Grothendieck corpus follow verbatim as Grothendieck and his colleagues proved them.

On the other hand, this foundation is still much stronger than PA. It uses every finite level above PA, though only finite levels.

My current focus is to formalize the central Grothendieck tools at the logical strength of second or third order arithmetic. On one hand this will formalize the insight of practitioners who say their work with these tools really only uses "very small sets". And on the other hand it will bring the foundation within striking distance of methods of reverse mathematics, a well-developed discipline exploring the exact logical strength of mathematical results expressible in second order arithmetic. My article "Zariski cohomology in second order arithmetic" gives some progress on this front.

One goal is to take current methods of number theory, which textbooks and reference works justify by various combinations of Grothendieck universes and hand waving, and justify them rigorously in pretty much their current form in low order arithmetic. Essential to this goal is that most proofs do not get longer and their appearance is not much changed. The other goal is to show that the great number theoretic results proved by these tools can be proved in Peano Arithmetic. It would be great to find proofs in PA without changing the existing proofs very much. But that may not be possible. At any rate it is not intrinsic to the second goal. Showing these theorems can be proved in PA is likely to require serious advances in number theory. I can try to clear up the logical side.

As a philosophical goal I want to show how Grothendieck and many mathematicians since him have cared enough to either develop rigorous foundations for these tools or else to protest foundations they do not like—and others draw on these foundations without needing to highlight them. Grothendieck has been clear that the size of sets is not important to him but the conceptual unity of his toolkit is. I have shown that unity can be preserved without anything like the size of his original universes. I regard Grothendieck as developing the unity of intuition and rigor, in terms very like the post "Terry Tao on rigor in mathematics". I hope others will too.

Further thoughts on Priest's Inclosure Schema


After publishing my post on Priest’s Inclosure Schema (IS) a few days ago, I’ve had a number of interesting exchanges on the content of the post, including with Priest himself. So here are a few additional thoughts, in case anyone is interested.

Regarding the charge of extensional inadequacy (over- and undergeneration), I think others before me had already made sufficiently clear that the failure of the Curry paradox to fit into IS is a big blow if IS claims to be a formal explanans for the informal concept of paradoxes of self-reference. However, while Priest’s original claim seemed to pertain to paradoxes of self-reference specifically, he seems to have changed the intended scope of IS a bit, and now tends to talk about ‘Inclosure paradoxes’. I don’t think there is anything wrong with this ‘change of heart’, but it does have consequences for how we should conceive the role of IS in debates on paradoxes. To make sense of this development, let me turn to a distinction introduced by S. Shapiro (in the words of L. Horsten in his SEP entry on the philosophy of mathematics):
Shapiro draws a useful distinction between algebraic and non-algebraic mathematical theories (Shapiro 1997). Roughly, non-algebraic theories are theories which appear at first sight to be about a unique model: the intended model of the theory. We have seen examples of such theories: arithmetic, mathematical analysis… Algebraic theories, in contrast, do not carry a prima facie claim to be about a unique model. Examples are group theory, topology, graph theory, ….
By analogy, I would submit that IS was first introduced as a ‘non-algebraic theory’, intended to capture one very precise class of arguments, namely paradoxes of self-reference. But as things moved along, it became clear to Priest and others that IS in fact determines a different but possibly equally interesting class of arguments, which he refers to as Inclosure paradoxes. From this point of view, IS is now an ‘algebraic theory’: rather than starting with a given target-phenomenon and trying to formulate a formal account which would capture all of (and only) this phenomenon, IS is now a freestanding formal account, and it is a non-trivial question as to which class(es) of entities it accurately describes. (In non-algebraic theories, you start with the phenomenon and look for the theory; in algebraic theories, you start with the theory and look for the phenomenon.)

From this angle, it becomes a noteworthy observation, rather than an extensional failure, that Curry does not fit into IS, and that the sorites paradoxes and some reductio arguments do fit into it, thus unveiling some (surprising) structural similarities. In other words, if IS is intended as an ‘algebraic theory’, then the charges of over- and undergeneration do not get off the ground.

But it seems to me that this would represent a significant departure from how IS was originally presented in Priest’s 1994 paper, namely as a formal explanans for the class of self-referential paradoxes. I would suggest that proponents of IS could give us a clearer account of how exactly they see the role of IS in research on paradoxes (in particular, as a non-algebraic or as an algebraic theory, in Shapiro's sense). Priest has already been moving in this direction, for example when he claims that Inclosure paradoxes are those that have to do with contradiction and with the limits of thought as such. However, it is not yet clear to me why Curry does not concern the limits of thought as such (apart from the fact that it is not captured by IS…), so I look forward to the continuation of this debate.

It's Complicated

[This is a post using Newman-style reasoning to argue for the existence of natural properties and relations.]

Consider a claim like:
(1) The mind-independent world is complicated
One might deny that there is a mind-independent world (Idealism) or one might accept that there is, while insisting that it is "unknowable", adding that what is known is mentally constituted (Kantian Transcendental Idealism). Here, in asserting the latter, one does not merely mean that representations are mentally constituted, for this is a truism that no one denies. One means that what knowledge is about is also mentally constituted (e.g., that physical objects are representations; that space and time are representations). Idealism is not the truism that our thoughts and representations are somehow in, or connected with, our minds; it is the much stronger metaphysical claim that everything (Idealism) or almost everything (Kant) is mind-dependent.

Assuming that we're not Idealists, what might this statement (1) mean? It might mean:
(2) The cardinality of the mind-independent things is quite large (e.g., $>10^{50}$).
If this is what (1) means, then the complexity of the world is solely its cardinality. Therefore, a sound and complete description of the mind-independent world consists in a statement of the form:
(3) The cardinality of the mind-independent world $= \kappa$,
where $\kappa$ is some cardinal number. It should strike anyone as absurd to suppose that the ultimate goal of physics, chemistry, biology, etc., is simply to identify this number $\kappa$. (Cf. the punchline of Douglas Adams's joke "42".) So, I take it that this is not what the statement (1) means.

So, perhaps (1) means,
(4) There are mind-independent properties and relations amongst the mind-independent things, and their relations (e.g., scientific laws) are complicated.
Here "complexity" may mean something like the structural complexity of the truth set for a language containing predicates for these properties and relations. For example, the truth set for full arithmetic is more complicated than the truth set for arithmetic with just addition. For the latter is a recursive set, while the former is not recursive -- and in fact not even arithmetically definable. There are other ways of measuring complexity, notably Kolmogorov complexity, for finite strings, and various notions of computational complexity. Perhaps, if the world is finite, "complexity" might involve the Kolmogorov complexity of the simplest program that answers soundly all questions about the world.

However, independently of how one understands the concept of "complexity", one has to be careful. Suppose that by "property" or "relation" one means just any set of things, or any set of ordered pairs of things. These are properties in a very broad sense. It then follows, by Newman-style reasoning, that (4) is reducible to (3). For any structure (or classification, if you like) $\mathcal{A}$ can be imposed on some collection $C$ of things so long as there are enough of them.

To illustrate: consider a finite set $X = \{1, \dots, n\}$ of numbers, and partition it any way you like. Let the partition be $(Y_i \mid i \in I)$, where $I$ is the index set. I.e., the sets $Y_i$ are non-empty and disjoint, and $X = \bigcup_i Y_i$. Now, suppose that we have a collection $C$ of $n$ things, or physical objects, or what have you. Then it is easy to define a partition $(C_i \mid i \in I)$ of these things which is isomorphic to $(Y_i \mid i \in I)$. For since $C$ and $X$ have the same cardinality, let $f : C \to X$ be a bijection (this function enumerates the elements of $C$). Then, for each $i \in I$, define $C_i$ by:
$c \in C_i$ iff $f(c) \in Y_i$. 
By construction, this gives us an isomorphism. So, if we have a partition of $n$ natural numbers (the "mathematical model") and a collection $C$ of physical things of size $n$, we can partition $C$ isomorphically to the original partition. If there are no independent constraints built into $C$ itself beyond cardinality, we can impose any structure $\mathcal{A}$ we like onto $C$, modulo $C$ having cardinality at least as large as that of $\mathcal{A}$.
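The construction is mechanical enough to spell out as a small program (a sketch of my own, purely for illustration; the function name is made up):

# Impose a partition of X = {1, ..., n} on an arbitrary n-element
# collection C, using an enumerating bijection f : C -> X.
def transfer_partition(C, partition_of_X):
    f = {c: i + 1 for i, c in enumerate(C)}          # the bijection f : C -> X
    # c belongs to the i-th induced cell exactly when f(c) lies in Y_i
    return [{c for c in C if f[c] in Y_i} for Y_i in partition_of_X]

# Example: X = {1,2,3,4} partitioned as {1,2}, {3,4}; C is four "things".
print(transfer_partition(["a", "b", "c", "d"], [{1, 2}, {3, 4}]))
# -> [{'a', 'b'}, {'c', 'd'}], isomorphic to the original partition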

Consequently, if the reasonable sounding (4) is not to trivialize down to (3), the quantifier in "there are ... properties" must range over a special subset of the set of all properties in the broader sense. In principle, this might be any special subset. But, usually, what is intended is what metaphysicians call "natural properties". This is because what "selects" that subset as special is not the mind, but Nature. If one intends it to mean "there is a mind-dependent subset of properties ...", then one is back to Idealism, and this is almost certainly not what (1) is taken to mean by anyone.

So, if this reasoning is right, the most reasonable interpretation of "the mind-independent world is complicated" is:
(5) There are mind-independent natural properties and relations amongst the mind-independent things and their relations (e.g., scientific laws) are complicated.
And this is much more in keeping with scientific inquiry. However, note that (5) implies the existence of mind-independent natural properties and relations.

So, if there is a mind-independent world (Idealism is incorrect) and the mind-independent world is complicated, then either this mind-independent complexity consists merely in its cardinality, or it consists in the complexity of the laws and relations amongst natural properties and relations. In particular, if Idealism is incorrect but there are no natural properties or relations, then the complexity of the mind-independent world consists solely in its cardinality.

(I'm inclined to think that this latter position is, more or less, Kant's metaphysical view.)

Sunday, 21 April 2013

The Probability of a Carnap Sentence

In the simplest, "logical empiricist"-style, framework for the formalization of scientific theories, we have 1-sorted language $L_{O,T}$, where the vocabulary has been partitioned into O-predicates and T-predicates (it's easy to include constants and function symbols if one wishes; but it's simpler to omit them). And scientific theories are formulated in $L_{O,T}$. The language obtained by deleting the T-predicates can be denoted $L_{O}$ and is called the observational sublanguage of $L_{O,T}$.

Suppose that $\Theta(\vec{O}, \vec{T})$ is a single axiom for a finitely axiomatized theory in $L_{O,T}$, where $\vec{O}$ is a sequence of O-predicates and $\vec{T}$ is a sequence of T-predicates. Then the Ramsey sentence of $\Theta$ is defined by:
$\Re(\Theta) := \exists \vec{X} \Theta(\vec{O}, \vec{X})$, 
where $\vec{X} = (X_1, \dots)$ is a sequence of second-order variables matching the arities of the predicates $T_1, \dots$ in $\vec{T}$. So, the theoretical predicates have been replaced by second-order variables, and existentially quantified.
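To illustrate with a toy example of my own: if $\Theta$ is $\forall x (T_1(x) \to O_1(x))$, with one T-predicate and one O-predicate, then
$\Re(\Theta) = \exists X_1 \forall x (X_1(x) \to O_1(x))$.
(Note that this particular Ramsey sentence is trivially true: let $X_1$ be empty. This already hints at how much weaker a Ramsey sentence can be than the theory it comes from.)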

Nothing has been said about the meanings of the O-predicates and T-predicates. In principle, one could simply assume some $L_{O,T}$-interpretation $\mathcal{I}$, and let $(L_{O,T}, \mathcal{I})$ be the corresponding fully interpreted language. However, the logical empiricists---the first group of thinkers aiming to apply the newly emerging methods of mathematical logic to the formalization of scientific theories---did not adopt this approach. Instead, largely because of their empiricist metasemantics, they assumed only an $L_{O}$-interpretation $\mathcal{I}^{\circ}$, and consequently $(L_{O,T}, \mathcal{I}^{\circ})$ is a partially interpreted language.

Because the language is partially interpreted, for each O-predicate $O_i$, there is now a meaning, $(O_i)^{\mathcal{I}^{\circ}}$. How then do the T-predicates get their meanings? Certainly not by explicit definition in terms of O-predicates! In a sense, the new underlying idea is that the meanings of T-terms are not pinned down uniquely and independently of theory, but rather implicitly defined by theories themselves. The basic way of implementing this view of meaning is to consider the Carnap sentence of the theory $\Theta$, i.e.,
$\Re(\Theta) \to \Theta$
and to insist that this sentence is analytic --- true in virtue of meaning.

As Hannes Leitgeb pointed out in the talk I mentioned in yesterday's post "The Probability of a Ramsey Sentence", it now seems reasonable to assign probability 1 to the Carnap sentence. After all, if $\phi$ is analytically true, surely its probability should be 1, whether probability is understood subjectively or not. So, we assume that we have some probability function $Pr(.)$ defined over $L_{O,T}$-sentences.

What can we say about connections between the probabilities of theories and their ramsifications? Well, as explained in the previous post, if the Carnap sentence has probability 1, i.e.,
$Pr(\Re(\Theta) \to \Theta) = 1$
then we can show that,
$Pr(\Re(\Theta)) = Pr(\Theta)$.
On the other hand, suppose that the Carnap sentence has probability slightly lower than 1. E.g., suppose that,
$Pr(\Re(\Theta) \to \Theta) = 1 - \epsilon$
for some small parameter $\epsilon$. In this case, it follows that
$Pr(\Re(\Theta)) = Pr(\Theta) + \epsilon$.
Proof: By the Lemma in the previous post,
$Pr(\Theta) + Pr(\Theta \to \Re(\Theta)) = Pr(\Re(\Theta)) + Pr(\Re(\Theta) \to \Theta)$.
But $Pr(\Theta \to \Re(\Theta)) = 1$, because $\Theta \vdash \Re(\Theta)$ (assuming second-order logic). So,
$Pr(\Theta) + 1 = Pr(\Re(\Theta)) + 1 - \epsilon$.
So, $Pr(\Re(\Theta)) = Pr(\Theta) + \epsilon$, as required. QED.
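As a quick numerical illustration (with made-up numbers): if $Pr(\Theta) = 0.6$ and the Carnap sentence has probability $0.95$, so that $\epsilon = 0.05$, then $Pr(\Re(\Theta)) = 0.6 + 0.05 = 0.65$: the Ramsey sentence is slightly more probable than the theory.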

So, if the Carnap sentence for a theory has a probability lower than 1 by some amount, then the Ramsey sentence for the theory has a higher probability than the theory does, by that same amount. This makes sense intuitively, because the Ramsey sentence is, in a number of senses, weaker than the theory itself (unless it happens to be inconsistent, of course).

Saturday, 20 April 2013

The Probability of a Ramsey Sentence

This post is inspired by a recent very interesting talk, "Theoretical Terms and Induction", by Hannes Leitgeb at a conference on "Theoretical Terms" in Munich a couple of weeks ago (April 3-5th, 2013).

Hannes's talk is a response to the debate about whether a Ramsey sentence for a theory $\Theta$ can account for the inductive systematization of evidence given by $\Theta$ itself. This debate goes back to earlier work by Carl Hempel and Israel Scheffler (The Anatomy of Inquiry, 1963) and, in particular, a 1968 Journal of Philosophy paper, "Reflections on the Ramsey Method", by Scheffler. The debate has recently been revived in an interesting 2012 Synthese paper, "Ramsification and Inductive Inference", by Panu Raatikainen. Raatikainen's conclusion is that ramsification of a theory $\Theta$ damages the inductive systematization that $\Theta$ provides. I recommend interested readers consult his 2012 paper on this.

On Hannes's approach, one assigns a probability to a Ramsey sentence $\Re(\Theta)$, on the assumption that the corresponding Carnap sentence
$\Re(\Theta) \to \Theta$
has probability 1. Since Carnap himself insisted that the Carnap sentence of a theory is analytic, it seems reasonable, from his perspective, to assign it probability 1. On this Carnapian assumption, it can then be shown that the probability of a theory and the probability of its Ramsey sentence are the same. (Hannes's discussion also related these probabilistic conclusions to the notion of logical probability, counting models of a theory over a finite domain.)

To explain what's going on, note first that it's well-known that $\Theta$ and $\Re(\Theta)$ are deductively equivalent with respect to the observation language $L_O$. That is, for any $\phi \in L_O$, we have,
$\Theta \vdash \phi$ if and only if $\Re(\Theta) \vdash \phi.$ 
But suppose the Carnap sentence has probability 1. Then we can show that $\Theta$ and $\Re(\Theta)$ are probabilistically equivalent.

First, we give a lemma in probability theory:
Lemma:
$Pr(A) + Pr(A \to B) = Pr(B) + Pr(B \to A)$.
Proof. Reasoning using probability axioms,
$Pr(A \to B) = Pr(\neg A \vee B)$
= $Pr(\neg A) + Pr(B) - Pr(\neg A \wedge B)$
= $1 - Pr(A) + Pr(B) - Pr(\neg (B \to A))$
= $1 + Pr(B) - Pr(A) - 1 + Pr(B \to A)$
= $Pr(B) - Pr(A) + Pr(B \to A)$.
So:
$Pr(A) + Pr(A \to B) = Pr(B) + Pr(B \to A)$.
QED.
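For those who like to double-check such identities mechanically, here is a small script (my own sketch, not part of the argument) that verifies the Lemma on randomly generated probability distributions over the four truth-value combinations of $A$ and $B$:

import random

# Atoms (A,B): p[0]=(T,T), p[1]=(T,F), p[2]=(F,T), p[3]=(F,F)
def lemma_holds(p):
    pr_A = p[0] + p[1]                # Pr(A)
    pr_B = p[0] + p[2]                # Pr(B)
    pr_A_to_B = 1 - p[1]              # Pr(A -> B): false only at (T,F)
    pr_B_to_A = 1 - p[2]              # Pr(B -> A): false only at (F,T)
    return abs((pr_A + pr_A_to_B) - (pr_B + pr_B_to_A)) < 1e-12

for _ in range(1000):
    w = [random.random() for _ in range(4)]
    total = sum(w)
    assert lemma_holds([x / total for x in w])
print("Lemma verified on 1000 random distributions")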

Next, let $\Theta$ be a theory and let $\Re(\Theta)$ be its Ramsey sentence. Note that
$\Theta \vdash \Re(\Theta)$.
(That is, one can deduce $\Re(\Theta)$ from $\Theta$ in a system of second-order logic, using comprehension.)

It follows that,
$Pr(\Theta \to \Re(\Theta)) = 1$.
Suppose that the Carnap sentence, $\Re(\Theta) \to \Theta$, has probability 1. That is,
$Pr(\Re(\Theta) \to \Theta) = 1$.
Then the Lemma above gives:
$Pr(\Re(\Theta)) = Pr(\Theta)$.
So, given a theory $\Theta$, the probability of its Ramsey sentence equals the probability of the theory itself, on the assumption that its Carnap sentence has probability 1.

[UPDATE 20 April: I have made a few changes and modified the Lemma used to a slightly stronger one.]

Two objections to Priest's Inclosure Schema


(My student Rein van der Laan will be defending his Bachelor's thesis on Priest’s Inclosure Schema this week. It was in the process of supervising him that I developed my current ideas on the topic, which means that the content of this post is basically joint work with Rein.)

In a number of papers (such as this 1994 paper) and in his book Beyond the Limits of Thought (BtLoT), Graham Priest defends the claim that all paradoxes of self-reference can be adequately captured by the Inclosure Schema IS, which he formulates in the following way:

(1) Ω = {y : φ(y)} exists and ψ(Ω)               Existence
(2) if x ⊆ Ω and ψ(x), then:
      (a) δ(x) ∉ x               Transcendence
      (b) δ(x) ∈ Ω               Closure

The different paradoxes of self-reference would be generated by different instantiations of the schematic letters of the schema (for details, consult BtLoT).

There have been quite a few articles discussing IS in the meantime (among others by Abad, Grattan-Guinness, and Badici, with responses by Priest and Weber), in which a number of interesting objections have been raised against the idea that IS successfully describes all paradoxes of self-reference (the Liar, Russell’s paradox, etc.). Here I discuss two (not necessarily novel) objections that I think are quite problematic for Priest’s general project with IS -- in particular, that of arguing for the Principle of Uniform Solution: similar paradoxes must receive similar solutions. (Unsurprisingly, he goes on to claim that only dialetheism is able to offer a uniform solution to all these paradoxes.)

The over/undergeneration objection. One plausible way to understand what Priest is up to with IS is that it is intended as a formal explanans for the informal notion of ‘paradoxes of self-reference’. If this is correct, then it is legitimate to raise the question of whether IS gets the extension of the informal concept right; it may overgenerate (arguments which we do not want to count as self-referential paradoxes would fit into the schema) and/or undergenerate (it may fail to capture arguments which we do want to count as self-referential paradoxes).

As it turns out, IS seems both to over- and undergenerate. It overgenerates in that a number of reductio arguments which are not paradoxical, properly speaking, seem to conform to IS (as pointed out e.g. by Abad). One example would be Cantor’s diagonal argument for the uncountability of the real numbers (see this earlier blog post of mine for a presentation of the argument). And it undergenerates in that Curry’s paradox, which obviously (?) should count as a self-referential paradox, cannot be accounted for by means of IS. Priest is well aware of this limitation, but retorts:
[Curry] paradoxes belong to a quite different family. [They] do not involve negation and, a fortiori, contradiction. They therefore have nothing to do with contradictions at the limits of thought. (BtLoT, 169)
This seems odd, as the original claim seemed to be that IS was meant to describe paradoxes of self-reference in general, not only those involving a negation. (To be fair, Curry is the hardest of all paradoxes; as Graham himself says, Curry is hard on everyone…) At any rate, if IS both over- and undergenerates as a formal explanans of paradoxes of self-reference (which is at least what the original 1994 paper seems to claim it should be), this is not good news for Priest’s general project. (He may, of course, say that Curry falls out of the scope of IS and thus of the Principle of Uniform Solution, but the overgeneration charge still stands.)
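To see why Curry involves no negation, recall the standard presentation (not Priest's own formulation): a Curry sentence is one equivalent to a conditional with itself as antecedent,
$C \leftrightarrow (C \to p)$,
where $p$ is arbitrary. Assume $C$; the biconditional gives $C \to p$, so $p$ follows; discharging the assumption, we have proved $C \to p$ outright; the biconditional then gives $C$, and hence $p$. The reasoning uses only the conditional and the biconditional -- no negation anywhere.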

The form/matter objection. A useful and frequently cited definition of paradoxes is the one offered by Sainsbury (2009, 1): what characterizes a paradox is “an apparently unacceptable conclusion derived by apparently acceptable reasoning from apparently acceptable premises.” This means that one crucial component of a paradox is the degree of belief an agent attributes to the premises and the cogency of the reasoning, and the degree of disbelief she attributes to the conclusion. The ‘apparently’ clause does not need to entail a relativistic conception of paradoxes, but it does mean that a paradox has a perspectival component. Galileo’s paradox was paradoxical for Galileo and many others, but not for Cantor, who did not see the conclusion of the reasoning as unacceptable.

Now, there is an old but by now largely forgotten conception of the form and matter of an argument, according to which the matter of the argument is the ‘quality’ of its premises. In this vein, a materially defective argument is one where the premises are false, while a formally defective argument is one where the reasoning is not valid. (See this article of mine for the historical background of this conception.) On the basis of this idea, we could say that the matter of an argument corresponds to one’s degree of belief/disbelief in the premises/conclusion (again, perspectival), and the form corresponds to the structure of the argument. This would entail that paradoxes come in degrees, as a function of the agent's degrees of (dis)belief in the premises, reasoning and conclusion.

With this distinction in mind, we can see why IS fails to capture the extension of the concept of paradoxes of self-reference: it captures only the form of such arguments, but is silent concerning their matter (understood as the degrees of (dis)belief in premises and conclusion). The paradoxical nature of a paradox, however, is crucially determined by the degrees of (dis)belief in the premises and conclusion (as made clear in Sainsbury’s quote). [UPDATE: this sentence has been misunderstood by many people. Notice that I am here using an unconventional understanding of the matter of an argument (introduced in the previous paragraph), not the more familiar schematic notion of form vs. matter.] This is why IS cannot differentiate between a truly paradoxical argument and a reductio argument, intended to establish the falsity of one of the premises rather than being truly paradoxical.

[UPDATE: In BtLoT Priest introduces the restriction that different instantiations of IS must yield true premises for an argument to count as an instantiation of IS, and thus to be an inclosure paradox. This is why the Barber then does not count as an inclosure paradox. This restriction seems to me to be too strong, as often what is under discussion when a paradox emerges is whether the apparently acceptable premises are indeed as acceptable as they seem.]

So ultimately, the conclusion seems to be that IS fails to deliver what Priest wants it to deliver. Nevertheless, I firmly believe that the formulation of IS has been one of the most interesting and important developments in research on paradoxes of the last decades. It forces us to think about paradoxes with a much-needed higher level of generality, and thus leads to a new, deeper understanding of the phenomenon – even if the conclusion must be that IS cannot be the whole story after all.

UPDATE: Some further thoughts on the Inclosure Schema here.

Theoretical Terms in Mathematical Physics

Semantics is the theory of meanings of expressions, normally with a particular, fixed language in mind. In semantics, one might be interested in:
  • the meaning of "the" (in English)
  • the meaning of "and" (in English)
  • the meaning of adverbs (in English)
  • etc.
For example, the meaning of "the" in English for expressions of the form "the $F$" (definite descriptions) was given a famous analysis by Bertrand Russell in his article "On Denoting" (1905). This analysis is contextual (i.e., "the $F$" is not explicitly defined). Following Russell, the statement
The current Prime Minister of the UK studied PPE
is analysed as,
There is exactly one current Prime Minister of the UK and this person studied PPE. 
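In symbols (with 'PM' for 'is a current Prime Minister of the UK' and 'S' for 'studied PPE', abbreviations of my own):
$\exists x (PM(x) \wedge \forall y (PM(y) \to y = x) \wedge S(x))$.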
Metasemantics is the metatheory of semantics. In metasemantics one is interested in questions like:
  • what are languages, in general?
  • what is the status of claims about the semantic properties of languages?
  • how are languages acquired, spoken, implemented, cognized, grasped, etc., by minds?
One particularly pressing part of metasemantics concerns the semantics of theoretical expressions in science. A reasonable metasemantics for science aims to explain how meanings of theoretical expressions in, e.g., the language(s) of mathematical physics are grasped and assigned to linguistic strings. For example, how do we grasp the meanings of expressions in a passage like this:
Passage 1.
Consider a massive uncharged scalar field $\phi$ propagating on flat Minkowski spacetime $M$ with a potential $V(\phi)$. The field $\phi$ satisfies the Klein-Gordon equation,
$(\square + m^2) \phi + \frac{\partial V}{\partial \phi} = 0$.
Next let us consider the behaviour of this field when we consider a small graviton field $h_{\mu \nu}$ coupled to the energy tensor $T_{\mu \nu}$ of $\phi$.
(I've made this passage up, but it's the sort of thing one reads in a mathematical physics textbook or a paper.)

There are two main problems with the language of modern mathematical physics. The first is to understand how mathematical expressions obtain meaning. And the second is to understand how theoretical expressions like "massive uncharged scalar field", "potential", etc., obtain physical meanings.

We can sort of "undo" the physical content of Passage 1 as follows.
Passage 2.
Consider a scalar function $\phi$ on a differentiable manifold $M$ diffeomorphic to $\mathbb{R}^4$, with metric $g_{\mu \nu} = diag\{1,-1,-1,-1\}$. Let $\phi$ satisfy the equation,
$(\square + m^2) \phi + \frac{\partial V}{\partial \phi} = 0$,
for some $m \in \mathbb{R}^+$ and some function $V(\phi)$. Next let us consider the behaviour of this field when we consider a small symmetric (0,2) tensor $h_{\mu \nu}$ on $M$ coupled to the tensor $T_{\mu \nu}$ defined as follows ....
In Passage 2, we use notions like "manifold", "diffeomorphic", "$\mathbb{R}^4$", "scalar function", "metric", "$\mathbb{R}^+$", "symmetric (0,2) tensor". All of these can be defined in pure mathematics (and, in fact, reduced to the language of $ZF$ set theory, although this would be a bit nutty). For example,
A manifold $M$ is a topological space such that ... 
In Passage 1, however, we are imagining a possible physical world, whose underlying physical spacetime is rather like our actual spacetime would be if it were flat (i.e., Minkowski), along with certain physical fields with certain properties (i.e., a spin zero scalar field with mass $m$).

It strikes me as highly implausible to suppose that the notions from mathematical physics in Passage 1 can be somehow reduced to "logical constructions from sense data" as Bertrand Russell and Rudolf Carnap had hoped. But even so, it remains very unclear how human cognition can mentally represent how a hypothetical massive scalar field would behave under these circumstances. We can. It's just not clear how we can.

A metasemantic theory which accounts for the semantics of Passage 1 almost certainly will involve "heavy-duty" notions from Lewisian metaphysics: in particular, modality and "natural properties", etc.

Theoretical Terms: Defining E and B

It is sometimes claimed by philosophers of science that the meanings of theoretical terms are implicitly fixed by the total theory $\Theta$ (i.e., theoretical laws/equations plus correspondence rules) in which these terms appear. This is then the basis for the philosophical claim that a Carnap sentence,
$\Re(\Theta) \to \Theta$,
is analytic -- i.e., true in virtue of meaning. In a sense, on this view, theoretical terms are (second-order) Skolem constants (or, equivalently, Hilbertian $\epsilon$-terms).

This claim about the semantics of theoretical terms is, however, inconsistent with the standard practice of physics, for example. In physics, one usually adopts far more local definitions of theoretical terms.

For example, in electromagnetism, the Lorentz force law plays a crucial role, but Maxwell's equations do not. So, the following formulation by Professor James Sparks at the Mathematical Institute in Oxford corresponds fairly closely to the definitions that I learnt as a physics undergraduate a long, long time ago (at the Other Place):
The force on a point charge q at rest in an electric field $\mathbf{E}$ is simply
$\mathbf{F} = q \mathbf{E}$.
We used this to define $\mathbf{E}$ in fact. (Sparks, Lecture notes on "Electromagnetism", p. 12)
Notice that Maxwell's equations are not mentioned. The meaning of
"the electric field at point $\mathbf{r}$"
is not implicitly defined in terms of Maxwell's equations. Rather it is explicitly defined using the notion of a force on a charged test particle.
When the charge is moving the force law is more complicated. From experiments one finds that if $q$ at position $\mathbf{r}$ is moving with velocity $\mathbf{u} = d\mathbf{r}/dt$ it experiences a force
$\mathbf{F} = q \mathbf{E}(\mathbf{r}) + q \mathbf{u} \wedge \mathbf{B}(\mathbf{r})$. $\text{        }$ (2.8)
Here $\mathbf{B} = \mathbf{B}(\mathbf{r})$ is a vector field, called the magnetic field, and we may similarly regard the Lorentz force $\mathbf{F}$ in (2.8) as defining $\mathbf{B}$.
(Sparks, Lecture notes on "Electromagnetism", pp 12-13)
Again, notice that Maxwell's equations are not mentioned. The meaning of
"the magnetic field at point $\mathbf{r}$"
is not implicitly defined in terms of Maxwell's equations. Rather it is explicitly defined using the notion of a force on a charged test particle.

This is not the end of the story, of course. But it casts considerable doubt on the claim that the meanings of theoretical terms are given by implicit definitions within global theories. The definitions are far more local. It is not even clear that these local definitions fit the mould of what a logician would call a genuine explicit definition. But, in any case, one does not use the whole apparatus of Maxwell's equations to define the expressions "electric field" and "magnetic field".

Wednesday, 17 April 2013

The Newman Objection to Ramsey Sentence Structuralism

There appears to be considerable confusion about what the Newman Objection to Ramsey Sentence Structuralism actually is. Three examples of this that I have in mind are:
  • insisting that Newman shows that the Ramsey sentence $\Re(\Theta)$ of a theory $\Theta$ states only a cardinality condition, 
  • insisting that the structural content of a theory $\Theta$ (i.e., the content of $\Re(\Theta)$ beyond $\Theta$ having an empirically correct model) is more than cardinality content, 
  • redefining "empiricism" to be much stricter than, e.g., van Fraassen allows (i.e., empirical regularities). 
An example of this is Worrall 2007, "Miracles and Models: Why the Reports of the Death of Structural Realism May Be Exaggerated" (in O’Hear (ed.), Philosophy of Science (Royal Institute of Philosophy Supplement 61)):
The argument in its crispest form goes as follows:
1. SSR is committed to the view that the Ramsey sentence of any scientific theory T captures the full ‘cognitive content’ of that theory.
2. However, as Newman showed, the Ramsey sentence of any theory imposes only a very weak constraint on the universe—it amounts in essence to a mere cardinality constraint, and so if there are sufficiently many objects in the universe then the Ramsey-version of T, for any T, will be true.
3. However it is clear that standard scientific theories impose much more stringent constraints on the universe if they are to be true than merely a constraint on the minimum number of entities the world must include.
4. Hence SSR is committed to an account of the cognitive content of scientific theories that is plainly untenable and is, therefore, itself untenable. (Worrall 2007, pp. 140-141.)
("SSR" stands for "Structural Scientific Realism", which is another name for what I'm calling Ramsey sentence structuralism.)

But this is not the argument and this is not the Newman objection to Ramsey sentence structuralism. In fact, Premise 2 is mistaken and contradicts the basic technical result that is involved (see below). The objection is this:
Ramsey sentence structuralism $\approx$ Constructive empiricism
where "$\approx$" here means "is more or less equivalent to".

Here, Constructive empiricism is the anti-realist/instrumentalist view, associated with van Fraassen (The Scientific Image, 1980), that accepting a scientific theory $\Theta$ is believing that $\Theta$ has an empirically correct model $\mathcal{A}$. Ramsey sentence structuralism is the view that the synthetic content of a scientific theory $\Theta$ is given by its Ramsey sentence $\Re(\Theta)$. Strictly speaking, Constructive empiricism is primarily an epistemic view, concerning theoretical justification, whereas Ramsey sentence structuralism seems like a semantic view: a view about the content of theories. The connection between the epistemology and the semantics is that Ramsey sentence structuralism involves a strictly empiricist and descriptivist view of the semantics of "theoretical terms".

The objection is that there is little difference between advocating Ramsey sentence structuralism and advocating constructive empiricism. That is, Ramsey sentence structuralism is a form of anti-realism about scientific theories. Equivalently, so-called Structural "Realism" is a form of anti-realism.

There is a small difference between Ramsey sentence structuralism and Constructive empiricism, and this difference can be clarified by various results of the following sort, which use reasoning first given by Max Newman in 1928 against Russell's version of structuralism, set out in Russell 1927, The Analysis of Matter. Let $\Theta$ be a finitely axiomatized theory in a 2-sorted interpreted language $(L, \mathcal{I})$, where $\mathcal{I}$ is an interpretation of $L$. Let the cardinality of the theoretical domain of $\mathcal{I}$ be $\kappa$. Let $\Re(\Theta)$ be the ramsification of $\Theta$. Then:
$\Re(\Theta)$ is true if and only if $\Theta$ has an empirically correct model $\mathcal{A}$ whose theoretical domain has cardinality $\kappa$.
On this 2-sorted setup, one separates quantification over observables from quantification over non-observables, which allows one to define the cardinality of the theoretical domain and fix empirical content as being entirely about observable objects. But there are variations on this setup, and the corresponding results are rather sensitive to how one formulates the alleged O/T distinction and defines "empirical adequacy". However, all give more or less the same outcome.

I've written two articles on this topic, "Empirical Adequacy and Ramsification" (BJPS 2004) and "Empirical Adequacy and Ramsification II" (in Leitgeb & Hieke (eds.) 2009, Reduction, Abstraction, Analysis).

So, as I see it (and this is formulated also in Demopoulos & Friedman 1985 and many other places), the underlying Newman Objection runs as follows:
1. The truth of $\Re(\Theta)$ is equivalent to $\Theta$ having an empirically adequate model of the right cardinality.
2. Therefore, the sole content of $\Re(\Theta)$ beyond empirical adequacy is cardinality content.
3. Therefore, the sole difference between Constructive empiricism and Ramsey sentence structuralism is this cardinality content.
4. Therefore, Ramsey sentence structuralism is more or less equivalent to Constructive empiricism.
The central assumption in the argument, i.e., Premise 1, is a theorem of mathematical logic. The three further conclusions are drawn from this, modulo certain definitions. There is a certain unavoidable vagueness in the argument (e.g., concerning the exact framework for formalization of scientific theories, the definition of "empirical adequacy", and what counts as "more or less equivalent").

Many other authors have written on this topic. In particular, I'd draw attention to:
[UPDATE: 18th April. I've updated this post in a number of ways, including a quote from John Worrall and some further clarifications.]