Tuesday, 18 August 2015

Book review: John P. Burgess' Rigor and Structure (OUP)

Rigor and Structure, Burgess tells us in the preface, was originally intended to provide for mathematical structuralism the sort of survey that A Subject with No Object (Burgess & Rosen, 1999) provided for nominalism. However, the book that Burgess has ended up writing is importantly different from his earlier work with Rosen. In large part, this is because, for Burgess, not only is mathematical structuralism true --- whereas he took nominalism to be false --- but moreover it is a "trivial truism", at least as a description of modern mathematics from the beginning of the twentieth century onwards (Burgess, 2015, 111). Thus, instead of providing philosophical arguments in favour of mathematical structuralism, Burgess devotes the first half of the book (Chapters 1 and 2) to providing an historical account of how mathematics developed into the modern discipline of which mathematical structuralism is so obviously a true description. And this is where the other component of the title enters the story. For it is Burgess' contention that modern structuralist mathematics --- which he explores in the second half of the book, that is, in Chapters 3 and 4 --- is an inevitable consequence of the long quest for rigor, which began, so far as we know, with Euclid's Elements, and was completed by work in the nineteenth and early twentieth century that led to the arithmetization of analysis, the axiomatization of arithmetic, analysis, and geometry, the formulation of non-Euclidean geometries, and the founding of modern algebraic theories, such as group theory.

Thus, in the first two chapters of Rigor and Structure, Burgess asks two questions: What is mathematical rigor? Why did mathematicians strive so hard to achieve it throughout the period just described? To answer the first question, Burgess turns initially to the pronouncements of mathematicians themselves, but he finds little that is precise enough to satisfy a philosopher there. So he turns next to Aristotle and, looking to the Posterior Analytics, extracts the following suggestion:

Mathematical rigor requires that:
  • "every new proposition must be deduced from previously established propositions";
  • "every new notion must be defined in terms of previously explained notions";
  • there are primitive notions from which the chain of definitions begins;
  • there are primitive postulates from which the chain of deductions begins;
  • "the meaning of the primitives and the truth of the postulates must be evident".
(Burgess, 2015, 6-7)


As Burgess recognises, both of these components need some work before they can provide a faithful account of modern mathematics. Let's take them in reverse order.

As Burgess notes, the nature, source, unity, and justification of the primitive notions and postulates from which mathematics begins have changed significantly since Euclid. At that time, the primitive notions on which geometry was based were notions concerning actual physical space, whilst those at the beginning of the chains of definitions found in arithmetic --- to the extent there really was such a discipline --- concerned finite pluralities of objects (in Greek, arithmoi) (Mayberry, 2000). In short, mathematics at that point did not answer to Russell's famous description of it as the field in which "we never know what we are talking about" (Russell, 1959, 58). Both of its two branches had a specific subject matter and that subject matter was different in the two cases. By the beginning of the twentieth century, as Burgess explains, these features had changed in an important way. Burgess' historical account suggests a number of catalysts for the change. In the case of real analysis, the story runs roughly as follows: The controversy over the status of the infinitesimals used in calculations in Leibniz's version of the calculus --- a controversy driven partly by practical questions about when it was legitimate to divide by a quantity $d$ that one would later take to be 0 --- led to the development of more precise definitions of the key notions using $\epsilon$-$\delta$ methods. And these, in turn and in conjunction with a broader and more precise account of a mathematical function, led to counterexamples to conjectures in analysis that seemed, by the lights of our spatiotemporal intuition, certain to be true. Thus, the drive for greater rigor --- fueled by the recognition that unrigorous methods, such as intuition, were unreliable --- led the arithmetizers of analysis in the nineteenth century to lay down exactly what could be assumed about the objects, originally thought of as the real numbers, that they wished to treat (Dedekind, 1872).
But, as Burgess points out in what he calls the paradox of rigor,
a treatment of a given subject matter that is genuinely rigorous will ipso facto cease to be a treatment of that subject matter (alone). (Burgess, 2015, 65)
The point is that, once the primitive notions are specified and the postulates governing them are stated with the precision required by rigor --- that is, they are stated in a language whose basic components are only logical vocabulary supplemented with vocabulary that can express the primitive notions in question --- then any results deduced from those postulates will hold of any structure that satisfies the primitive postulates under some interpretation of the vocabulary that expresses the primitive notions. This follows from the purely formal nature of deduction. Thus, even if there were some entities, the real numbers, that were the intended subject matter of real analysis, once that topic was axiomatized and thereby made rigorous by Dedekind, it ensured that the results proved rigorously in that area would hold of any complete ordered field, whether the elements of its underlying set are real numbers, points in space, beer mugs, or Roman emperors. And thus, as an inevitable by-product of mathematics completing its project of rigorization, we obtained the version of real analysis of which the account given by structuralism is a trivial truism. Owing to the paradox of rigor, the pursuit of rigor in the nineteenth century made structuralism inevitable. What's more, there is a similar story for geometry, where the catalyst was the discovery of non-Euclidean geometries. Again, it was the process of rigorization that led to Hilbert's axiomatization of geometry and the practice of modern geometry, where again structuralism is obviously the correct account. Whether we can easily identify a similar catalyst in the case of arithmetic is less clear. Looking at the writings of Frege and Dedekind, my impression is that the catalyst here was the inadequacy of existing definitions of number, and even of the particular number 1. In any case, Dedekind's axiomatization rendered that area structuralist as well.
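The paradox of rigor can be illustrated in miniature. The sketch below is my own and not taken from Burgess' book. It swaps complete ordered fields, which cannot be checked exhaustively, for the finite group postulates, and it verifies a consequence of those postulates (uniqueness of the identity) by brute force rather than by deduction. The names `is_group`, `identity_is_unique`, `add_mod3`, and `combine` are hypothetical, as is the choice of emperors. The point is only that the same result holds whether the elements are numbers or Roman emperors, so long as the postulates are satisfied.

```python
from itertools import product


def is_group(elements, op, identity):
    """Check the group postulates for a finite interpretation."""
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    has_identity = all(
        op(identity, a) == a and op(a, identity) == a for a in elements
    )
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements
    )
    return closed and associative and has_identity and has_inverses


def identity_is_unique(elements, op):
    """A consequence of the postulates: exactly one two-sided identity."""
    identities = [
        e for e in elements
        if all(op(e, a) == a and op(a, e) == a for a in elements)
    ]
    return len(identities) == 1


# Interpretation 1: the numbers 0, 1, 2 under addition modulo 3.
numbers = {0, 1, 2}


def add_mod3(a, b):
    return (a + b) % 3


# Interpretation 2: three Roman emperors, with the operation obtained
# by relabelling the numbers above (an arbitrary illustrative choice).
emperors = ["Augustus", "Tiberius", "Caligula"]
label = {name: i for i, name in enumerate(emperors)}


def combine(a, b):
    return emperors[(label[a] + label[b]) % 3]


results = {
    "numbers": (
        is_group(numbers, add_mod3, 0),
        identity_is_unique(numbers, add_mod3),
    ),
    "emperors": (
        is_group(set(emperors), combine, "Augustus"),
        identity_is_unique(set(emperors), combine),
    ),
}
```

Both interpretations satisfy the postulates, and the consequence holds in both, although nothing in the check mentions what the elements are.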

Thus, by the early years of the twentieth century, the major areas of mathematics had shed the subject matters with which they were originally endowed; they had come to concern any structure that satisfies their primitive postulates after their primitive notions have been interpreted in that structure. So, the subject matter of mathematics had, by that point, become structures, just as structuralism, in its most general form, asserts. Just what these structures are and what primitive postulates hold of them is the concern of the second half of Burgess' book (Chapters 3 and 4). But before we turn to that, let us first consider the first component in the account of rigor that Burgess extracts from Aristotle's Posterior Analytics:

Mathematical rigor requires that:
  • "every new proposition must be deduced from previously established propositions";
  • "every new notion must be defined in terms of previously explained notions".
Thanks to Frege and Tarski, we have formal accounts of both parts of this condition. A deduction in a particular area of mathematics is formally rigorous if, for instance, it is a valid proof in a Hilbert-style proof system, where the non-logical axioms are the primitive postulates of the area in question. And a definition is formally rigorous if it is given in a formal logical language with only logical vocabulary and expressions for the primitive notions of the area in question. However, as everyone recognises, almost no proof in a modern mathematical journal is formally rigorous, and equally few employ only formally rigorous definitions. Indeed, few are even written entirely in a formal logical language. So mathematics counts proofs and definitions as rigorous that are not formally rigorous. How, then, does mathematics circumscribe the domain of rigorous proofs? Burgess' answer to this question begins with the cliché that a proof of a theorem is that which convinces its audience that the theorem is true. He dispatches that particular formulation quickly, but suggests an adaptation according to which a proof is that which convinces its audience that a formally rigorous proof of the theorem exists, even if it would be infeasible to write it out. But even that, he notes, is not adequate. When Terence Tao asserts a number-theoretic theorem, his testimony convinces me that the theorem is true and that there is a formally rigorous proof of it. But his assertion is no proof. Rather, Burgess says, what is required of a proof is that it convinces its audience of the existence of a formally rigorous proof in the right way. And what is the right way? According to Burgess, it is by giving "enough" of the steps of the formally rigorous proof itself (Burgess, 2015, 97). And what counts as "enough"? Burgess leaves his account at this point, which is a pity, for I think there is more that can be said.
As Burgess himself notes, what counts as enough will depend on the audience. Maryam Mirzakhani will need to see far fewer steps of a proof than most of us to be convinced that a formally rigorous proof exists. And herein lies a problem. Given her inductive evidence about her own exceptional ability to find proofs of theorems in ergodic theory, Mirzakhani may only need to see one step from the formally rigorous proof of a given theorem in that area to convince herself that the full formally rigorous proof exists --- and she may only need to see a trivial step at that. But we should not allow that a trivial step of a full formally rigorous proof can ever count as a proof, even for a Fields Medalist. The point is this: it is not that we need enough steps to convince us that there is a formally rigorous proof --- what we need are the right steps. We need the steps that require some ingenuity to come up with; the ones that are not just routine parts of many proofs in the given area. For instance, a proof of Cantor's Theorem --- which says that a set has fewer elements than subsets --- need only point out that, given a set $X$ and a function $f : X \rightarrow P(X)$, the set $\{x \in X : x \notin f(x)\}$ is not in the range of $f$. This is the only step that requires any ingenuity; all the others can be filled in routinely. It is this distinction between steps in a proof that are routine and those that are not that is missing from Burgess' account. Given their track records, Mirzakhani and Tao may need to see very few, if any, of the non-routine steps to be convinced that there is a formally rigorous proof of the theorem in question because they are able to supply the ingenuity themselves. But if the non-routine steps are not given explicitly, we would not wish to say that we have a rigorous proof, even relative to those two mathematicians as our audience.
Thus, we might adapt Burgess' already adapted definition and say that a rigorous proof is one that convinces its audience that there exists a formally rigorous proof by providing those steps in the formally rigorous proof that it is not simply routine to provide. Of course, as with Burgess' attempted account, there is still a great deal of vagueness, but let us leave the matter there.
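The one non-routine step in the proof of Cantor's Theorem mentioned above can be checked mechanically on small finite sets. The following is a minimal sketch of my own, not something from Burgess' book, and the function names are hypothetical. It is an exhaustive check over every function from a small set to its powerset, not a proof of the general theorem.

```python
from itertools import combinations, product


def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [
        frozenset(c)
        for r in range(len(xs) + 1)
        for c in combinations(xs, r)
    ]


def diagonal_set(X, f):
    """The set {x in X : x not in f(x)}, with f given as a dict."""
    return frozenset(x for x in X if x not in f[x])


def diagonal_always_missed(X):
    """True if, for every f : X -> P(X), the diagonal set is outside range(f)."""
    X = list(X)
    subsets = powerset(X)
    # Enumerate every function from X to P(X) as a tuple of image subsets.
    for images in product(subsets, repeat=len(X)):
        f = dict(zip(X, images))
        if diagonal_set(X, f) in images:
            return False
    return True


# One particular function on {0, 1, 2}, chosen arbitrarily for illustration.
example_f = {0: frozenset({0}), 1: frozenset(), 2: frozenset({0, 1, 2})}
example_diagonal = diagonal_set([0, 1, 2], example_f)
```

For the example function, only 1 fails to belong to its own image, so the diagonal set is {1}, which differs from each of the three images. The routine steps of the proof (that this shows no $f$ is surjective, and hence that $X$ has fewer elements than subsets) are exactly the ones the code does not need to spell out.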

Let us turn instead to the second half of the book. By the end of the first half, Burgess has brought us to the culmination of the rigorization of mathematics --- and, due to the paradox of rigor, mathematics has swapped the subject matters of physical space, finite pluralities, etc. for the subject matter of structures. At this point, Burgess asks: What are the structures that form the subject matter of mathematics? He canvasses various possibilities. First, in Chapter 3, he considers the set-theoretic structuralism of Bourbaki, also hinted at by Paul Benacerraf (1965), endorsed by John Mayberry (2000), and to some extent the current orthodoxy amongst mathematicians. This is a version of hard-headed or eliminative structuralism, which treats analysis as concerned with what is true in all complete ordered fields, rather than in some privileged one. For the set-theoretic structuralist, the complete ordered fields over which analysts quantify consist of an underlying set equipped with distinguished elements, functions, relations, subsets of its powerset, or other sorts of structure --- and similarly for other structures such as groups, $\omega$-sequences, etc. Second, also in Chapter 3, Burgess considers mystic or ante rem or sui generis structuralism of the sort championed by Stewart Shapiro (1997), and possibly Dedekind (1872) himself. On this view, there is just one structure for each isomorphism-type --- thus, real analysis is concerned with the unique ante rem structure that is a complete ordered field. And finally, in Chapter 4, Burgess considers the claims of category theory to provide a theory of the structures over which modern mathematics --- that is, mathematics after the advent of the axiomatic method --- quantifies.

In the end, Burgess sides with all of these possibilities and with none of them. One of the central theses of the book --- elaborated most explicitly in the final three sections of Chapter 3, pp. 145-58 --- is that mathematicians are indifferent to much that goes on at the foundational level of their subject. They espouse no opinions, nor say anything in their mathematical works, that could tell between any of the three possibilities just enumerated, nor between those and the many others not yet explicitly formulated. The correct account of mathematics, Burgess thinks, should respect this indifference; it should interpret mathematics in a way that does not go beyond the explicit commitments of its practitioners. That is, he agrees with Stewart Shapiro's principle of minimalism:
The [...] desideratum is to not attribute mathematical properties to mathematical objects unless those attributions are explicit or at least implicit in mathematics itself. (Shapiro, 2006, 110)
Burgess proposes that we interpret the assertions of mathematicians in their mathematical works as if they were writing a chapter in an encyclopedia of mathematics not unlike Bourbaki's Éléments de mathématique. By doing so, Burgess thinks, we remove from the individual mathematician the responsibility for ensuring that there is a complete and rigorous chain of definitions and deductions leading from her results back to the primitive notions and postulates of her area. She is responsible only for the rigor of any chain that leads to her work from the prior results on which her work builds. Burgess sums this up as follows:
what mathematical rigor requires [...] is not actual codification, but only potential codifiability. It must be possible to view new work as if it were a chapter in a codification, but it need not really be such a chapter, and may remain indifferent to all the many more or less arbitrary or conventional choices that would have to have been made on the way to the immediately preceding chapter. (Burgess, 2015, 156)
The arbitrary choices to which Burgess refers are these:
  • Choice of the ontology of structures. For instance, the choice whether to take structures to be set-theoretical, ante rem, category-theoretic, or of some other sort.
  • Choice of primitive postulates. For instance, the choice whether to include symbols for addition and multiplication in the axiomatization of arithmetic, or to define them using the recursion permitted by the induction axiom.
  • Choice of definitions of primitive notions. For instance, the choice whether to define the concept of prime number in such a way that 1 is prime.
Burgess quotes G. H. Hardy with approval on the topic of indifference:
What is essential in mathematics is that its symbols should be capable of some interpretation; generally they are capable of many, and then, so far as mathematics is concerned, it does not matter which we adopt. (Hardy, 1914, 15)
We might call the resulting view indifferentism.

Now, there is much to be said for indifferentism as a description of how mathematicians in fact view their own work and how they determine the norms that govern publishing results in their journals. But one problem for it as a general philosophical account of mathematics is that it makes it difficult to formulate precisely what any mathematician is in fact saying when she asserts a mathematical claim. That is, it poses a problem for those who wish to provide a semantics for mathematical language with a precision that befits the subject.

Take the number-theoretic claim that there are infinitely many primes. According to the set-theoretic structuralist, this statement has a precise semantic content. The set-theoretic structuralist takes structures to be set-theoretic and makes a choice about the axiomatization of arithmetic and about the definition of the concept of prime number; relative to those choices, the statement says that, in any structure that satisfies the axioms for arithmetic, there are infinitely many primes. And the ante rem structuralist and the category-theoretic foundationalists have their own precise accounts of the semantics of mathematical language. But what can an indifferentist like Burgess say? The natural line is to follow the eliminative structuralist, who interprets the statements of those who are indifferent between various options as universal quantifications over all of those options. But this raises two problems.

First: How are we to circumscribe the range of interpretations between which mathematicians are indifferent when they make a particular mathematical statement? Take their indifference between rival accounts of structure, for instance. We have listed three possible versions of structuralism --- set-theoretic, ante rem, and category-theoretic --- and noted that they each ascribe different precise semantic content to the statement that there are infinitely many primes. But of course there are many others already formulated that do the same --- Hellman's modal structuralism, for instance. And there are many yet to be formulated that will do the same again. In order to specify the range of admissible interpretations of a mathematician's statements, we have to be able to say what all of these different versions of structuralism between which she is indifferent have in common. But no such account is forthcoming.

The second problem that indifferentism raises for a precise semantics of mathematical language is this. As well as being able to circumscribe the range of interpretations between which mathematicians are indifferent, we must also say something about this new sort of object over which it forces mathematicians to quantify, namely, interpretations of their language. So now we need an account of interpretations of a language. And that account will include primitive postulates that state basic properties of these new objects. But then we face the question: Are they set-theoretic entities? Or category-theoretic? Or ante rem structures? Presumably the mathematician is indifferent between these different possibilities. But then the problem arises again and we are launched on a regress.

In sum: Burgess is right to emphasise the extent of the indifference that mathematicians show to foundations. But, by doing so, he raises a serious problem for anyone --- himself included --- who wishes to claim that what mathematicians say has precise semantic content. Their indifference to the ultimate meaning of their terms and to the sorts of objects over which they are quantifying creates this puzzle: it is not clear how you can be indifferent to the meaning of what you say and yet nonetheless speak with the precision demanded by mathematics as our most rigorous intellectual discipline.

Burgess explains in his preface that he wanted to write a book that is accessible equally to mathematicians and to philosophers. In this, he has succeeded completely. It will be no surprise to those familiar with Burgess' other work that this book is written with exceptional clarity. And it is packed full of Burgess' astute insights into the history of mathematics and its current practice. Given that his audience includes both mathematicians and philosophers, however, more extensive reference to the philosophical literature on some of the topics covered would have been welcome, so that interested mathematicians (and philosophers unfamiliar with these literatures) could follow up the discussion at greater length. Two examples: Easwaran's work on the nature of mathematical proof and the debate about the status of computer-aided proofs (Easwaran, 2009), which relates to Burgess' discussion of the same in the final section of Chapter 2, pp. 98-105; and Carter's analysis of structuralism from the point of view of mathematical practice, a paper that shares the methodology of Burgess' book (Carter, 2008). But that is a small concern, and search engines will fill the gap for those who are interested. Overall, mathematicians and philosophers alike will gain much from Burgess' specific insights about particular parts of mathematical practice; but they will also find his larger picture of modern mathematics extremely illuminating.

References

  • Benacerraf, P. (1965). What Numbers Could Not Be. The Philosophical Review, 74:47–73.
  • Burgess, J. P. (2015). Rigor and Structure. Oxford University Press, Oxford.
  • Burgess, J. P. and Rosen, G. (1999). A Subject With No Object: Strategies for Nominalistic Interpretation of Mathematics. Oxford University Press, Oxford.
  • Carter, J. (2008). Structuralism as a philosophy of mathematical practice. Synthese, 163(2):119–131.
  • Dedekind, R. (1872). Stetigkeit und irrationale Zahlen. Vieweg, Braunschweig.
  • Easwaran, K. (2009). Probabilistic proofs and transferability. Philosophia Mathematica, 17(3):341–362.
  • Hardy, G. H. (1914). A Course of Pure Mathematics. Cambridge University Press, Cambridge, 2nd edition.
  • Mayberry, J. (2000). The Foundations of Mathematics in the Theory of Sets. Cambridge University Press, Cambridge.
  • Russell, B. (1959). Mathematics and the Metaphysicians. In Mysticism and Logic, and other essays. George Allen & Unwin Ltd, London.
  • Shapiro, S. (1997). Philosophy of Mathematics: Structure and Ontology. Oxford University Press, Oxford.
  • Shapiro, S. (2006). Structure and Identity. In MacBride, F., editor, Identity and Modality. Oxford University Press.
