Diagonalization and the Diagonal Lemma
A couple of recent M-Phi posts by Catarina have discussed the topic of diagonalization and the Diagonal Lemma in Gödel's incompleteness results.
Because I can't be bothered to get out of bed, this post consists in some Sunday afternoon rambling about how diagonalization is usually understood and how the Diagonal Lemma is proved (i.e., how to define a fixed-point, or self-referential, sentence).
Gödel's original 1931 paper on incompleteness, "On Formally Undecidable Propositions in Principia Mathematica and Related Systems, I", is brimming with new ideas, of which four important ones are:
1. The (Usual) First-Order Language of Arithmetic
The language of arithmetic is a first-order language with identity, and with signature and is intended to talk about about the structure . The set of -terms, , is defined inductively, by starting with and all variables, , and then closing under applying the function symbols . Then, an atomic -formula is defined to be of the form , where . And the set of -formulas, , is defined inductively, by starting with atomic formulas , and then closing under applying the unary connective , the binary connective and quantifiers . (Other connectives, and , can be defined.)
It is useful sometimes to abuse notation and conflate the symbol and its denotation, the number ; or the symbol and its denotation, the function , etc. Context is usually sufficient to disambiguate. If one is considering a non-standard -interpretation (i.e., is not isomorphic to ), then will be the denotation of in ; the denotation of , and so on.
A bounded quantifier has the form or , where is a term, and is the formula . A bounded formula is a formula in which all quantifiers are bounded. A -sentence has the form , where is bounded.
2. Canonical Numerals for Numbers
Because of the presence of and in , there is a canonical way of denoting numbers, using canonical numerals. Roughly, the number is denoted by the term , with repeated times. More exactly, one defines a function
, which is the term . By induction, it is easy to prove that .
3. Robinson Arithmetic:
A very important weak system of arithmetic is Robinson arithmetic, . Its axioms are:
does not have as an axiom scheme the induction scheme,
with the induction scheme gives Peano arithmetic.
The system is sound and complete for certain classes of sentences. That is,
is a quantifier-free sentence, a bounded sentence or a -sentence.
In general, can correctly perform finite computations using successor, addition and multiplication. However, is not very good at algebra. For example, does not prove . Importantly, although has axioms for addition and multiplication, it does not have axioms for exponentiation.
3. Gödel Coding
Under gödel coding, each primitive symbol of the alphabet of is given a code number, . Also, each finite sequence of numbers can be coded as a number, sometimes written as . For example, one might use the coding are the primes in ascending order, beginning with 2. (And we assign as the code of the empty string .) For example, . Such numbers are then called sequence numbers.
Because -strings are sequences of primitive symbols, the code of an -string is defined as:
, is not the same as the code of the corresponding one-element string . For example, if , say, then .)
Because each -string now has a code, say , we can think of the numeral as a kind of name of . So, if , then the numeral is written: , and is sometimes called the gödel quotation name of .
So, using coding, the interpreted language of arithmetic is able to talk (indirectly) about the syntax of . For each syntactical entity has a kind of name in itself. Furthermore, special sets of (and relations amongst) syntactical entities become sets of numbers under coding. If is a set of syntactical entities, then is , the corresponding set of numbers. For example, the variables of are coded as a set of numbers; the set of terms is coded as a set of numbers. Etc. Then one can ask whether these sets (or relations) are "expressible" in the language of arithmetic.
In general, these syntactical sets and relations come out, under coding, as either primitive recursive, or recursive, or recursively enumerable.
4. Primitive Recursive Functions
The primitive recursive functions are nowadays defined a bit differently from Gödel's original approach. (See this SEP article for more details.) First,
and , we say that is obtained by primitive recursion from and just if, for all ,
There is a further operation on functions---minimization. Closing the primitive recursive functions under that operation then yields the recursive functions.
The notion of recursiveness can be extended to subsets of, and relations on, by considering the characteristic function of the set or relation. Suppose . The characteristic function is defined by: is (primitive) recursive, we say that is (primitive) recursive.
For a relation , the definitions are analogous.
5. Representability in
There are two main notions of "expressing" a numerical function or relation. The first is semantical: given some set , simply whether there is an -formula such that defines .
The other notion is syntactical, and always involves a theory in . Given a function , we say is representable in just when there is an -formula such that, for all ,
, a similar notion can be applied to subsets, and relations on, . Suppose . We say that is strongly representable in just when there is some formula , such that for all ,
is weakly representable in just when condition (i) holds, but with "if ... then" replaced by "... iff ..." (thanks to Kai Wehmeier for the correction).
An analogous definition can be given for a relation to be strongly or weakly representable.
The most important representation property for is:
; and then one shows that the representable functions are closed under composition, primitive recursion and minimization. Details of how to do this are given in standard textbooks, such as H. Enderton, A Mathematical Introduction to Logic. As Aldo Antonelli notes in the comments below, in order to show that functions obtained by primitive recursion are representable, one needs to show how sequence coding and decoding can be expressed in , which only has and as primitives. The crucial result required is Gödel's Beta Function Lemma, using the Chinese Remainder Theorem. These results need only to be proved in the metatheory, not in itself (which lacks induction).
The converse of the representability theorem is true as well: if is representable in , then is recursive.
A corollary of the representation theorem for is that all recursive sets and relations are strongly representable in . So, if is recursive, there is a formula such that, for all ,
, the set of (codes of) axioms is recursive (for example, a computer program can be written to decide of a given input string whether it is an axiom or not), indeed, primitive recursive. Furthermore, the proof-in- relation, " is a proof of (in )", is primitive recursive. So, there are -formulas which strongly represent (in ) such relations. In particular, there is a formula which strongly represents the proof-in- relation: is the code of the proof-in- of formula with code . Then the formula expresses " is a theorem of " and is written .
6. Diagonalization
Consider the binary syntactic operation on strings,
For example, the diagonalization of
is a formula of the language of arithmetic, with at most the variable free, then its diagonalization is defined to be:
for all free occurrence of in .
An alternative, but equivalent (in predicate logic) is:
, given by, for any -string ,
is primitive recursive.
Because all recursive functions are representable in , it follows that there is an -formula such that, for all ,
7. The Diagonal (or Fixed Point) Lemma
Let be an -formula with at most free. (If happens to have only one variables free, say , but not , then one can do a relabelling of variables to ensure that has only free.) We want to find a formula that is in some sense equivalent to .
First consider informally, the expression,
. Now says that a certain thing is . This thing is the diagonalization of "the diagonalization of is "---but this is precisely the statement itself. Such a statement is called a fixed-point sentence for the original formula , and it says of itself that it is .
So, to get this into the formalized language , we begin with some formula and we consider,
. Let be the diagonalization of . That is, is . So, trivially, we have,
. So, the term is the numeral . And thus, trivially, we have
represents , we have,
:
8. Diagonal Lemma using a Function Term
For the Diagonal Lemma, one needs that the external syntactic operation of diagonalization on a string be represented in the theory. Under coding, the diagonalization operation corresponds to a primitive recursive function, , the diagonal function.
has the property that all recursive functions are representable --- which means that for any recursive function , there is a formula which represents it. Note that the choice of such a formula is not unique. But is weak, and it is not always true that proves that this formula is functional. In particular, may not prove . For example, cannot prove the totality of exponentation.
However, if one extends with the induction scheme, then this situation changes. is strong enough to prove the totality of any primitive recursive function. In particular, one can choose the formula representing the diagonal function such that,
as follows:
be the result of adding this definition to . By the representablity of , we have,
. As noted, one needs an -formula expressing,
.
Above, "the diagonalization of is " was expressed with
, one can use instead
has the form , where is the term . To show that this works, note that we have from (*),
is a self-referential sentence containing a term provably (in ) denoting the code of . And we have a modified version of the Diagonal Lemma,
9. Gödel Sentence
For an arithmetic theory to which Gödel's incompleteness theorem applies, a gödel sentence is usually defined to be a fixed-point of the formula where is a formula expressing " is a theorem of ". That is,
instead of as our underlying syntax theory, can be expressed rather vividly as follows:
has the form , where denotes the code of .
Because I can't be bothered to get out of bed, this post consists in some Sunday afternoon rambling about how diagonalization is usually understood and how the Diagonal Lemma is proved (i.e., how to define a fixed-point, or self-referential, sentence).
Gödel's original 1931 paper on incompleteness, "On Formally Undecidable Propositions in Principia Mathematica and Related Systems, I", is brimming with new ideas, of which four important ones are:
(i) the notion of what is now called a primitive recursive function;There are short explanations of each of these below.
(ii) arithmetization of syntax (aka, gödel coding);
(iii) the diagonalization operation on a string;
(iv) given any-formula , the construction of a "fixed-point" or self-referential sentence equivalent to .
1. The (Usual) First-Order Language of Arithmetic
The language of arithmetic
It is useful sometimes to abuse notation and conflate the symbol
Although taking
A bounded quantifier has the form
2. Canonical Numerals for Numbers
Because of the presence of
by primitive recursion as follows:,
So, for example, the number 7 is denoted by
3. Robinson Arithmetic:
A very important weak system of arithmetic is Robinson arithmetic,
It is noteworthy that
Extending
The system
This holds whenif and only if is true.
In general,
3. Gödel Coding
Under gödel coding, each primitive symbol
where,
Because
(One needs to be a bit careful as the code of a primitive symbol, say.
Because each
So, using coding, the interpreted language of arithmetic is able to talk (indirectly) about the syntax of
In general, these syntactical sets and relations come out, under coding, as either primitive recursive, or recursive, or recursively enumerable.
4. Primitive Recursive Functions
The primitive recursive functions are nowadays defined a bit differently from Gödel's original approach. (See this SEP article for more details.) First,
The basic functions:One may then build up more functions by either composition, or by primitive recursion. That is, given functions
(i) the zero function, by .
(ii) the successor function, by .
(iii) the projection functions(with ), by .
The primitive recursive functions are then defined to be those functions obtainable from the basic functions by composition and primitive recursion.
There is a further operation on functions---minimization. Closing the primitive recursive functions under that operation then yields the recursive functions.
The notion of recursiveness can be extended to subsets of, and relations on,
Ifif , and 0 otherwise.
For a relation
5. Representability in
There are two main notions of "expressing" a numerical function or relation. The first is semantical: given some set
If so, we say that.
The other notion is syntactical, and always involves a theory
Given a theory
(i) ifWe say thatthen
(ii) if, then
An analogous definition can be given for a relation
The most important representation property for
Representability inTo prove this, one shows first that each of the basic functions is representable in
All recursive functions are representable in.
The converse of the representability theorem is true as well: if
A corollary of the representation theorem for
(i) ifAs it turns out, for a standard axiomatic theory, then .
(ii) if, then .
6. Diagonalization
Consider the binary syntactic operation on strings,
the result of substituting the quotation of stringAlthough this is binary, obviously, one can take this operation on the same argument, i.e.,for all free occurrences of a variable in string .
the result of substituting the quotation of stringInformally, this is (syntactic) diagonalization.for all free occurrences of a variable in string .
For example, the diagonalization of
iscontains 23 symbols
"Ifcontains 23 symbols" contains 23 symbols
I.e., the result of substituting the gödel quotation
An alternative, but equivalent (in predicate logic) is:
Clearly, this syntactic operation of diagonalization corresponds to a numerical function,
One can show (the details are tedious) thatif is an -formula with at most free;
if is not the code of such an -formula
Because all recursive functions are representable in
7. The Diagonal (or Fixed Point) Lemma
Let
First consider informally, the expression,
the diagonalization ofIf one diagonalizes this, one gets,is
the diagonalization of "the diagonalization ofCall this statementis " is
So, to get this into the formalized language
Call this formula
In particular,
Furthermore, since
Let us now reason about
This establishes the
Diagonal Lemma
Given an-formula , there is an -sentence such that
8. Diagonal Lemma using a Function Term
For the Diagonal Lemma, one needs that the external syntactic operation of diagonalization on a string be represented in the theory. Under coding, the diagonalization operation corresponds to a primitive recursive function,
However, if one extends
When this is so, one can introduce, by explicit definition, a new function symbol
Let
(*)Returning to the Diagonal Lemma, suppose that one wishes to find the fixed-point for a given formula
the diagonalization ofwhich one then diagonalizes to obtainis
Above, "the diagonalization of
But with a function term
One may then diagonalize this, to obtain a fixed point sentence
Note that
and so,
So,
Diagonal Lemma using a Term
Letbe extended with an explicit definition of a function term . Let be an -formula. Then there is a term such that,
9. Gödel Sentence
For an arithmetic theory
The construction above shows that, if we use
So,
Super stuff Jeff: must have been a rainy Sunday afternoon. The more of them you get, the better for us but the worse for you!
be the diagonalization of . That is, is .
makes an ascription not to itself but to the referent of namely . And likewise the sentence on the right hand side of the fixed point lemma namely is not ascribing to itself but to the the (code of) . Whereas, as you say:
is a self-referential sentence containing a term provably (in ) denoting the code of .
is F cannot ascribe F to itself since the sentence as a whole will have a larger Godel code than the code of the sub-string - a string of 'S's ending in zero whose code will be greater than the number, viz.the code of , which it stands for. (And to get a quine program you can't just take some source code blah and write print 'blah' because the initial [print '] and final ['] will be additional.)
ReplyDeleteThe nuts and bolts of my fixation about self-reference come out here. You don't get self-reference by diagonalising alone, nor in the fixed point lemma, but you do get it directly in your final section 8 function term way (though you don't need function terms to get the self-reference). One might say there is never literally self-reference in arithmetized syntax, always reference to numbers but since we are referring to abstract expression types who is to say these aren't numbers? (Whatever the existence of numbers and types comes to, let's not go there just now!).
Thus in your
Let
the formula
I've always thought these self-referential loops really neat. In the dear departed comment I'd said something to the effect that you can't do that just by (ordinary) quotation or just by Godel naming alone. No sentence ["blah" is F] can ascribe F to itself, if it is a predication applying to whatever the stuff inside the quotation marks is a token of. For the blah inside the quotation marks has to be a proper part of the whole sentence. Similarly under normal Godel-coding
So Godel's trick of diagonalising on a sentence which contains a representation of diagonalising is really clever as is Quine's ["yields a falsehood when appended to its own quotation" yields a falsehood when appended to its own quotation]. As are those quine programs: I suppose the basic trick is to construct a routine "printwice" such that [printwice 'blah'] prints the blah stuff twice, the second time in quotes, i.e. outputs [blah 'blah'] and then [printwice 'printwice'] does the job. In the Quine case we have a context [is F] which is such that ["blah" is F] doesn't just attribute a property to the blah stuff inside the quotes but more importantly attributes falsehood to something generated from the stuff inside the quote marks, which happens- well doesn't just happen- to be the sentence itself.
Thanks, Alan!
ReplyDeleteWhat you say on quotations is right; but when we have a formula with a *term* that denotes the formula itself, the term need not be the quotation. Rather the term may be something like "the diagonalization of "...." ". For example, in Smullyan's approach, he uses what he calls the norm of an expression. (See R. Smullyan, 1957, "Languages in Which Self-Reference is Possible", JSL 22: 55-67 (reprinted in Hintikka, 1969, Philosophy of Mathematics.)
Quine's example is very clever, and I use it in my lectures.
Jeff
Nicely done, Jeff, but the part where you non-chalantly point out that Q does not have a primitive symbol for exponentiation and then go on saying that it can nonetheless represent all (primitive) recursive functions -- well, that's a bit like being told that the murderer is the butler, but without being told what the murder weapon was.
ReplyDeleteAll the action lies precisely in bootstrapping addition and multiplication into a representation of all (primitive) recursive functions, which requires a coding of sequences (a feat famously pulled off by Gödel with his beta-function). Now, you probably don't want to go into the glorious detail of the Chinese Remainder Theorem, but your readers need to know precisely where the magic takes place.
But I'm picking at nits -- I really liked your exposition.
Thanks, Aldo! Yeah, you're right, I confess, I was being a bit sneaky about how we prove that all recursive functions are representable in Q (using sequence coding/decoding, the beta function, Chinese Remainder Theorem, etc.), given that we just have successor, plus and times as primitives in . Maybe I'll add an update, with a couple of links.
ReplyDeleteJeff
Hi Jeff! Neat post, thanks for that. Not that it matters to your exposition, but the definition of weak representability isn't quite right. You'll want to say that A is weakly representable if condition (i) holds with "iff" in place of "if... then". As it stands, every set of natural numbers is weakly representable, and by the same formula, at that (e.g. any truth-functional tautology).
ReplyDelete--Kai
Oops - yes, you're right, Kai. Will update it.
ReplyDeleteCheers, Jeff
Nice Post .Keep updating Artificial Intelligence Online Training
ReplyDeleteThanks for sharing it is important for me. I also searched for that from here. Visit our site Trend Micro belgie
ReplyDelete