# Gödel’s incompleteness theorems

In mathematical logic, Gödel’s incompleteness theorems, proved by Kurt Gödel in 1931, are two theorems stating inherent limitations of all but the most trivial formal systems for arithmetic of mathematical interest.

The theorems are also of considerable importance to the philosophy of mathematics. They are widely regarded as showing that Hilbert’s program to find a complete and consistent set of axioms for all of mathematics is impossible, thus giving a negative answer to Hilbert’s second problem. Authors such as J. R. Lucas have argued that the theorems have implications in wider areas of philosophy and even cognitive science as well as preventing any complete Theory of Everything from being found in physics, but these claims are less generally accepted.

## First incompleteness theorem

Gödel’s first incompleteness theorem, perhaps the single most celebrated result in mathematical logic, states that:

For any consistent, computably enumerable formal theory that proves basic arithmetical truths, an arithmetical statement that is true, but not provable in the theory, can be constructed. That is, any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and complete.

Here, “theory” refers to an infinite set of statements, some of which are taken as true without proof (these are called axioms), and others (the theorems) that are taken as true because they are implied by the axioms. “Provable in the theory” means “derivable from the axioms and primitive notions of the theory, using standard first-order logic”. A theory is “consistent” if it never proves a contradiction. “Can be constructed” means that some mechanical procedure exists which can construct the statement, given the axioms, primitives, and first-order logic. “Elementary arithmetic” consists merely of addition and multiplication over the natural numbers. The resulting true but unprovable statement is often referred to as “the Gödel sentence” for the theory, although there are infinitely many other statements in the theory that share with the Gödel sentence the property of being true but not provable from the theory.

The hypothesis that the theory is computably enumerable means that it is possible in principle to write a computer program that (if allowed to run forever) would list all the theorems of the theory and no other statements. In fact, it is enough to enumerate the axioms in this manner since the theorems can then be effectively generated from them.
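To make "listing all the theorems" concrete, here is a minimal sketch that enumerates the theorems of a toy formal system. The system chosen is the MIU string-rewriting system from Hofstadter's book, used purely as an illustrative assumption; it is far weaker than arithmetic, and the function names are hypothetical.

```python
from collections import deque
from itertools import islice

AXIOM = "MI"  # the single axiom of the toy MIU system (illustrative assumption)

def successors(s):
    """Apply each inference rule of the MIU system once to the string s."""
    out = []
    if s.endswith("I"):              # rule 1: xI -> xIU
        out.append(s + "U")
    if s.startswith("M"):            # rule 2: Mx -> Mxx
        out.append(s + s[1:])
    for i in range(len(s) - 2):      # rule 3: replace any III by U
        if s[i:i + 3] == "III":
            out.append(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):      # rule 4: delete any UU
        if s[i:i + 2] == "UU":
            out.append(s[:i] + s[i + 2:])
    return out

def enumerate_theorems():
    """Yield every theorem exactly once, breadth-first; runs forever."""
    seen = {AXIOM}
    queue = deque([AXIOM])
    while queue:
        s = queue.popleft()
        yield s
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)

first = list(islice(enumerate_theorems(), 5))
print(first)  # ['MI', 'MIU', 'MII', 'MIUIU', 'MIIU']
```

Every theorem eventually appears in the list, but the enumeration alone gives no way to conclude that a string such as "MU" will never appear. That gap between listing theorems and deciding theoremhood is the one the incompleteness theorems exploit.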

The first incompleteness theorem first appeared as “Theorem VI” in Gödel’s 1931 paper On Formally Undecidable Propositions in Principia Mathematica and Related Systems I. In Gödel’s original notation, it states:

“The general result about the existence of undecidable propositions reads as follows:

“Theorem VI. For every ω-consistent recursive class κ of FORMULAS there are recursive CLASS SIGNS r, such that neither v Gen r nor Neg(v Gen r) belongs to Flg(κ) (where v is the FREE VARIABLE of r).” (van Heijenoort translation and typesetting 1967:607. “Flg” is from “Folgerungsmenge = set of consequences” and “Gen” is from “Generalisation = generalization” (cf. Meltzer and Braithwaite 1962, 1992 edition:33-34).)

Roughly speaking, the Gödel statement, G, asserts: “G cannot be proven true”. If G were able to be proven true under the theory’s axioms, then the theory would have a theorem, G, which contradicts itself, and thus the theory would be inconsistent. But if G were not provable, then it would be true (for G expresses this very fact) and thus the theory would be incomplete.

The argument just given is in ordinary English and thus not mathematically rigorous. In order to provide a rigorous proof, Gödel represented statements by numbers; then the theory, which is already about numbers, also pertains to statements, including its own. Questions about the provability of statements are represented as questions about the properties of numbers, which would be decidable by the theory if it were complete. In these terms, the Gödel sentence is a claim that there does not exist a natural number with a certain property. A number with that property would encode a proof of the Gödel sentence itself, and from such a proof a contradiction could be derived within the theory. If there were such a number then the theory would be inconsistent, contrary to hypothesis. So, assuming the theory is consistent (as done in the theorem’s hypothesis) there is no such number, and the Gödel statement is true, but the theory cannot prove it. An important conceptual point is that we must assume that the theory is consistent in order to state that this statement is true.
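The idea of representing statements by numbers can be sketched in a few lines. The encoding below follows the prime-power scheme usually attributed to Gödel, but the symbol alphabet and the function names are assumptions made for illustration, not Gödel's own table of codes.

```python
# Hypothetical alphabet; each symbol gets a positive integer code.
SYMBOLS = ["0", "S", "+", "*", "=", "(", ")", "~", "&", "A", "x"]
CODE = {sym: i + 1 for i, sym in enumerate(SYMBOLS)}

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_number(formula):
    """Encode a symbol string as the product of p_i ** code(symbol_i)."""
    g = 1
    for p, sym in zip(primes(), formula):
        g *= p ** CODE[sym]
    return g

def decode(g):
    """Recover the symbol string from a valid Gödel number by factoring."""
    out = []
    for p in primes():
        if g == 1:
            break
        e = 0
        while g % p == 0:
            g //= p
            e += 1
        out.append(SYMBOLS[e - 1])
    return "".join(out)

print(godel_number("0=0"))  # 2**1 * 3**5 * 5**1 = 2430
print(decode(2430))         # 0=0
```

Because factorization into primes is unique, syntactic properties of a formula (its length, its first symbol, whether it is well formed) become arithmetical properties of its code, which is what lets a theory of numbers talk about its own statements.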

### Extensions of Gödel’s original result

Gödel demonstrated the incompleteness of the theory of Principia Mathematica, a particular theory of arithmetic, but a parallel demonstration could be given for any effective theory of a certain expressiveness. Gödel commented on this fact in the introduction to his paper, but restricted the proof to one system for concreteness. In modern statements of the theorem, it is common to state the effectiveness and expressiveness conditions as hypotheses for the incompleteness theorem, so that it is not limited to any particular formal theory.

Gödel’s original statement and proof of the incompleteness theorem requires the assumption that the theory is not just consistent but ω-consistent. A theory is ω-consistent if it is not ω-inconsistent, and is ω-inconsistent if there is a predicate P such that for every specific natural number n the theory proves ~P(n), and yet the theory also proves that there exists a natural number n such that P(n). That is, the theory says that a number with property P exists while denying that it has any specific value. The ω-consistency of a theory implies its consistency, but consistency does not imply ω-consistency. J. Barkley Rosser later strengthened the incompleteness theorem by finding a variation of the proof that does not require the theory to be ω-consistent, merely consistent. This is mostly of technical interest, since all true formal theories of arithmetic, that is, theories with only axioms that are true statements about natural numbers, are ω-consistent and thus Gödel’s theorem as originally stated applies to them. The stronger version of the incompleteness theorem that only assumes consistency, not ω-consistency, is now commonly known as Gödel’s incompleteness theorem.
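The definition of ω-inconsistency given above can be written compactly as follows. The notation is an assumption of this sketch: $\overline{n}$ stands for the numeral denoting the natural number $n$, and $T \vdash \varphi$ means that $T$ proves $\varphi$.

$$
T \vdash \neg P(\overline{n}) \ \text{ for every } n \in \mathbb{N}, \qquad \text{and yet} \qquad T \vdash \exists x\, P(x).
$$

No single finite proof exhibits a contradiction here, which is why such a theory can still be consistent in the ordinary sense.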

## Second incompleteness theorem

Gödel’s second incompleteness theorem can be stated as follows:

For any formal recursively enumerable (i.e. effectively generated) theory T including basic arithmetical truths and also certain truths about formal provability, T includes a statement of its own consistency if and only if T is inconsistent.

(Proof of the “if” part:) If T is inconsistent then anything can be proved, including that T is consistent. (Proof of the “only if” part:) If T is consistent then T does not include the statement of its own consistency. This follows from the first theorem.

There is a technical subtlety involved in the second incompleteness theorem, namely how exactly are we to express the consistency of T in the language of T. There are many ways to do this, and not all of them lead to the same result. In particular, different formalizations of the claim that T is consistent may be inequivalent in T, and some may even be provable. For example, first order arithmetic (Peano arithmetic or PA for short) can prove that the largest consistent subset of PA is consistent. But since PA is consistent, the largest consistent subset of PA is just PA, so in this sense PA “proves that it is consistent”. What PA does not prove is that the largest consistent subset of PA is, in fact, the whole of PA. (The term “largest consistent subset of PA” is rather vague, but what is meant here is the largest consistent initial segment of the axioms of PA ordered according to some criteria, e.g. by “Gödel numbers”, the numbers encoding the axioms as per the scheme used by Gödel mentioned above).

In the case of Peano arithmetic or any familiar explicitly axiomatized theory T, it is possible to define the consistency “Con(T)” of T in terms of the non-existence of a number with a certain property, as follows: “there does not exist an integer coding a sequence of sentences, such that each sentence is either one of the (canonical) axioms of T, a logical axiom, or an immediate consequence of preceding sentences according to the rules of inference of first order logic, and such that the last sentence is a contradiction”. However, for arbitrary T there is no canonical choice for Con(T).

The formalization of Con(T) depends on two factors: formalizing the notion of a sentence being derivable from a set of sentences, and formalizing the notion of being an axiom of T. Formalizing derivability can be done in canonical fashion, so given an arithmetical formula A(x) defining a set of axioms, we can canonically form the predicate Prov_A(P), which expresses that P is provable from the set of axioms defined by A(x). Using this predicate we can express Con(T) as “not Prov_A(‘P and not-P’)”. Solomon Feferman showed that Gödel’s second incompleteness theorem goes through when the formula A(x) is chosen so that it has the form “there exists a number n satisfying the decidable predicate R” for some R. In addition, Prov_A(P) must satisfy the so-called Hilbert-Bernays provability conditions:

1. If T proves P, then T proves Prov_A(P)

2. T proves 1., i.e. T proves that if T proves P, then T proves Prov_A(P)

3. T proves that if T proves that (P implies Q) then T proves that provability of P implies provability of Q
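In symbols, the three conditions are commonly rendered as below. The corner quotes $\ulcorner P \urcorner$, standing for the numeral of the Gödel number of $P$, and the subscripted $\mathrm{Prov}_A$ are notational assumptions of this sketch rather than the article's own notation.

$$
\begin{aligned}
&\text{1. If } T \vdash P, \text{ then } T \vdash \mathrm{Prov}_A(\ulcorner P \urcorner). \\
&\text{2. } T \vdash \mathrm{Prov}_A(\ulcorner P \urcorner) \rightarrow \mathrm{Prov}_A(\ulcorner \mathrm{Prov}_A(\ulcorner P \urcorner) \urcorner). \\
&\text{3. } T \vdash \mathrm{Prov}_A(\ulcorner P \rightarrow Q \urcorner) \rightarrow \bigl(\mathrm{Prov}_A(\ulcorner P \urcorner) \rightarrow \mathrm{Prov}_A(\ulcorner Q \urcorner)\bigr).
\end{aligned}
$$

Condition 2 is condition 1 stated inside $T$, and condition 3 says that $T$ knows provability is closed under modus ponens.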

Gödel’s second incompleteness theorem also implies that a theory T1 satisfying the technical conditions outlined above can’t prove the consistency of any theory T2 which proves the consistency of T1. This is because T1 can then prove that if T2 is consistent and proves the consistency of T1, then T1 is in fact consistent. For the claim that T1 is consistent has the form “for all numbers n, n has the decidable property of not being a code for a proof of contradiction in T1”. If T1 were in fact inconsistent, then T2 would prove for some n that n is the code of a contradiction in T1. But if T2 also proved that T1 is consistent, i.e. there is no such n, it would itself be inconsistent. We can carry out this reasoning in T1 and conclude that if T2 is consistent, then T1 is consistent. Since by the second incompleteness theorem T1 does not prove its own consistency, it can’t prove the consistency of T2 either.

This easy corollary of the second incompleteness theorem shows that there is no hope of proving e.g. the consistency of first order arithmetic using finitistic means provided we accept that finitistic means are correctly formalized in a theory the consistency of which is provable in PA. It’s generally accepted that the theory of primitive recursive arithmetic (PRA) is an accurate formalization of finitistic mathematics, and PRA is provably consistent in PA. Thus PRA can’t prove the consistency of PA. This is generally seen to show that Hilbert’s program, which is to use “ideal” mathematical principles to prove “real” (finitistic) mathematical statements by showing that the “ideal” principles are consistent by finitistically acceptable principles, can’t be carried out.

This corollary is actually what makes the second incompleteness theorem epistemically relevant. As Georg Kreisel remarked, it would actually provide no interesting information if a theory T proved its consistency. This is because inconsistent theories prove everything, including their consistency. Thus a consistency proof of T in T would give us no clue as to whether T really is consistent; no doubts about T’s consistency would be resolved by such a consistency proof. The interest in consistency proofs lies in the possibility of proving the consistency of a theory T in some theory T’ which is in some sense less doubtful than T itself, e.g. weaker than T. For most naturally occurring T and T’, such as T = Zermelo-Fraenkel set theory and T’ = primitive recursive arithmetic, the consistency of T’ is provable in T, and thus T’ can’t prove the consistency of T by the above corollary of the second incompleteness theorem.

The consistency of first-order arithmetic has been proved assuming that a certain ordinal called ε0 is well-founded. See Gentzen’s consistency proof.

### Original statement of Gödel’s Theorem XI

While contemporary usage calls it the “second incompleteness theorem”, in the original paper Gödel presented it as his “Theorem XI”. It is stated thus (in the following, “Section 2” is where his Theorem VI appears, and P is Gödel’s name for the formal system he works in, essentially the logic of Principia Mathematica together with the Peano axioms):

“The results of Section 2 have a surprising consequence concerning a consistency proof for the system P (and its extensions), which can be stated as follows:

“Theorem XI. Let κ be any recursive consistent[63] class of FORMULAS; then the SENTENTIAL FORMULA stating that κ is consistent is not κ-PROVABLE; in particular, the consistency of P is not provable in P,[64] provided P is consistent (in the opposite case, of course, every proposition is provable [in P]).” (Brackets in original added by Gödel “to help the reader”, translation and typography in van Heijenoort 1967:614)

[63] “κ is consistent” (abbreviated by “Wid(κ)”) is defined thus: Wid(κ) ≡ (Ex)(Form(x) & ~Bew_κ(x)).”

[64] “This follows if we substitute the empty class of FORMULAS for κ.”

(Note: In the original “Bew” has a negation-“bar” written over it, indicated here by ~. “Wid” abbreviates “Widerspruchsfreiheit = consistency”, “Form” abbreviates “Formel = formula”, “Bew” abbreviates “Beweisbar = provable” (translations from Meltzer and Braithwaite 1962, 1996 edition:33-34).)

## Meaning of Gödel’s theorems

Gödel’s theorems are theorems about first-order logic, and must ultimately be understood in that context. In formal logic, both mathematical statements and proofs are written in a symbolic language, one where we can mechanically check the validity of proofs so that there can be no doubt that a theorem follows from our starting list of axioms. In theory, such a proof can be checked by a computer, and in fact there are computer programs that will check the validity of proofs. (Automatic proof verification is closely related to automated theorem proving, though proving and checking the proof are usually different tasks.)

To be able to perform this process, we need to know what our axioms are. We could start with a finite set of axioms, such as in Euclidean geometry, or more generally we could allow an infinite list of axioms, with the requirement that we can mechanically check for any given statement whether it is an axiom from that set or not (an axiom schema). In computer science, this is known as having a recursive set of axioms. While an infinite list of axioms may sound strange, this is exactly what’s used in the usual axioms for the natural numbers, the Peano axioms: the inductive axiom is in fact an axiom schema — it states that if zero has any property and whenever any natural number has that property, its successor also has that property, then all natural numbers have that property — it does not specify which property and the only way to say in first-order logic that this is true of all definable properties is to have infinitely many statements, one for each property.
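The induction schema described above can be written out as follows. The symbols are assumptions of this sketch: $S$ denotes the successor function and $\varphi$ ranges over all formulas of the first-order language of arithmetic, so each choice of $\varphi$ yields one axiom.

$$
\bigl(\varphi(0) \land \forall n\,(\varphi(n) \rightarrow \varphi(S(n)))\bigr) \rightarrow \forall n\,\varphi(n)
$$

Since there are infinitely many formulas $\varphi$, the schema contributes infinitely many axioms, yet checking whether a given sentence is an instance of it is a purely mechanical matter of pattern matching.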

Gödel’s first incompleteness theorem shows that any formal system that includes enough of the theory of the natural numbers is incomplete: it contains statements that are neither provably true nor provably false. Or one might say, no formal system which aims to define the natural numbers can actually do so, as there will be true number-theoretical statements which that system cannot prove. This has severe consequences for the program of logicism proposed by Gottlob Frege and Bertrand Russell, which aimed to define the natural numbers in terms of logic.

The existence of an incomplete system is in itself not particularly surprising. For example, if you take Euclidean geometry and you drop the parallel postulate, you get an incomplete system (in the sense that the system does not contain all the true statements about Euclidean space). A system can be incomplete simply because you haven’t discovered all the necessary axioms.

What Gödel showed is that in most cases, such as in number theory or real analysis, you can never create a complete and consistent finite list of axioms, or even an infinite list that can be produced by a computer program. Each time you add a statement as an axiom, there will always be other true statements that still cannot be proved as true, even with the new axiom. Furthermore if the system can prove that it is consistent, then it is inconsistent.

It is possible to have a complete and consistent list of axioms that cannot be produced by a computer program (that is, the list is not computably enumerable). For example, one might take all true statements about the natural numbers to be axioms (and no false statements). But then there is no mechanical way to decide, given a statement about the natural numbers, whether it is an axiom or not.

Gödel’s theorem has another interpretation in the language of computer science. In first-order logic, the theorems are computably enumerable: you can write a computer program that will eventually generate any valid proof. You can ask whether they have the stronger property of being recursive: can you write a computer program that definitively determines whether a given statement is a theorem? Gödel’s theorem says that in general you cannot.

Many logicians believe that Gödel’s incompleteness theorems struck a fatal blow to David Hilbert’s program towards a universal mathematical formalism which was based on Principia Mathematica. The generally agreed-upon stance is that the second theorem is what specifically dealt this blow. However some believe it was the first, and others believe that neither did.

## Examples of undecidable statements

There are two distinct senses of the word “undecidable” in contemporary use. The first of these is the sense used in relation to Gödel’s theorems, that of a statement being neither provable nor refutable in a specified deductive system. The second sense is used in relation to computability theory and applies not to statements but to decision problems, which are countably infinite sets of questions each requiring a yes or no answer. Such a problem is said to be undecidable if there is no computable function that correctly answers every question in the problem set. The connection between these two is that if a decision problem is undecidable (in the recursion theoretical sense) then there is no consistent, effective formal system which proves for every question A in the problem either “the answer to A is yes” or “the answer to A is no”.

Because of the two meanings of the word undecidable, the term independent is sometimes used instead of undecidable for the “neither provable nor refutable” sense. The usage of “independent” is also ambiguous, however. Some use it to mean just “not provable”, leaving open whether an independent statement might be refuted.

Undecidability of a statement in a particular deductive system does not, in and of itself, address the question of whether the truth value of the statement is well-defined, or whether it can be determined by other means. Undecidability only implies that the particular deductive system being considered does not prove the truth or falsity of the statement. Whether there exist so-called “absolutely undecidable” statements, whose truth value can never be known or is ill-specified, is a controversial point among various philosophical schools.

One of the first problems suspected to be undecidable, in the second sense of the term, was the word problem for groups, first posed by Max Dehn in 1911, which asks if there is a finitely presented group for which no algorithm exists to determine whether two words are equivalent. This was shown to be the case in 1952.

The combined work of Gödel and Paul Cohen has given two concrete examples of undecidable statements (in the first sense of the term): The continuum hypothesis can neither be proved nor refuted in ZFC (the standard axiomatization of set theory), and the axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC axioms except the axiom of choice). These results do not require the incompleteness theorem. Gödel proved in 1940 that neither of these statements could be disproved in ZF or ZFC set theory. In the 1960s, Cohen proved that neither is provable from ZF, and the continuum hypothesis cannot be proven from ZFC.

In 1970, Soviet mathematician Yuri Matiyasevich showed that Hilbert’s Tenth Problem, posed in 1900 as a challenge to the next century of mathematicians, cannot be solved. Hilbert’s challenge sought an algorithm that decides whether a given Diophantine equation has a solution. A Diophantine equation is a polynomial equation in any number of variables with integer coefficients, for which integer solutions are sought; the equation in Fermat’s Last Theorem is a special case. Since there is only one equation but n variables, infinitely many solutions exist (and are easy to find) in the complex numbers; the problem becomes difficult (in general, impossible) when solutions are constrained to integer values. Matiyasevich, completing earlier work of Martin Davis, Hilary Putnam and Julia Robinson, showed that every recursively enumerable set can be defined by a Diophantine equation; since some recursively enumerable sets are not decidable, no such algorithm can exist.

In 1936, Alan Turing proved that the halting problem (the question of whether a given Turing machine halts on a given input) is undecidable, in the second sense of the term. This result was later generalized to Rice’s theorem.

In 1973, the Whitehead problem in group theory was shown to be undecidable, in the first sense of the term, in standard set theory.

In 1977, Paris and Harrington proved that the Paris-Harrington principle, a version of the Ramsey theorem, is undecidable in the axiomatization of arithmetic given by the Peano axioms but can be proven to be true in the larger system of second-order arithmetic.

Kruskal’s tree theorem, which has applications in computer science, is also undecidable from the Peano axioms but provable in set theory. In fact Kruskal’s tree theorem (or its finite form) is undecidable in a much stronger system codifying the principles acceptable on basis of a philosophy of mathematics called predicativism.

Goodstein’s theorem is a statement about certain sequences of natural numbers (Goodstein sequences), asserting that every such sequence eventually reaches zero; Kirby and Paris showed that it is undecidable in Peano arithmetic.
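For readers who have not met Goodstein sequences, the following sketch computes their first terms. Each step writes the current term in hereditary base b (exponents are themselves written in base b), replaces every b by b + 1, and subtracts one. The function names are hypothetical, and the code only illustrates the definition; it says nothing about the independence proof.

```python
def bump_base(n, b):
    """Write n in hereditary base b, then replace every b by b + 1."""
    result, exponent = 0, 0
    while n > 0:
        n, digit = divmod(n, b)
        if digit:
            # The exponent is itself rewritten hereditarily.
            result += digit * (b + 1) ** bump_base(exponent, b)
        exponent += 1
    return result

def goodstein(m, max_terms=50):
    """Return up to max_terms initial terms of the Goodstein sequence of m."""
    terms, base = [], 2
    while len(terms) < max_terms:
        terms.append(m)
        if m == 0:
            break
        m = bump_base(m, base) - 1
        base += 1
    return terms

print(goodstein(3))               # [3, 3, 3, 2, 1, 0]
print(goodstein(4, max_terms=4))  # [4, 26, 41, 60]
```

The sequence starting at 3 dies out after six terms, while the one starting at 4 grows for an astronomically long time before it finally reaches zero, which gives some intuition for why termination in general is hard to prove.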

Gregory Chaitin produced undecidable statements in algorithmic information theory and proved another incompleteness theorem in that setting. Chaitin’s theorem states that for any theory that can represent enough arithmetic, there is an upper bound c such that no specific number can be proven in that theory to have Kolmogorov complexity greater than c. While Gödel’s theorem is related to the liar paradox, Chaitin’s result is related to Berry’s paradox.

Douglas Hofstadter gives a notable alternative proof of incompleteness, inspired by Gödel, in his book Gödel, Escher, Bach.

## Limitations of Gödel’s theorems

The conclusions of Gödel’s theorems only hold for the formal systems that satisfy the necessary hypotheses (which have not been fully described in this article). Not all axiom systems satisfy these hypotheses, even when these systems have models that include the natural numbers as a subset. For example, there are first-order axiomatizations of Euclidean geometry and real closed fields that do not meet the hypotheses of Gödel’s theorems. The key fact is that these axiomatizations are not expressive enough to define the set of natural numbers or develop basic properties of the natural numbers.

A second limitation is that Gödel’s theorems only apply to systems that are used as their own proof systems. For example, the consistency of Peano arithmetic can be proved in set theory if set theory is consistent (however, one cannot prove that the latter is consistent in that framework). In 1936, Gerhard Gentzen proved the consistency of Peano arithmetic using a formal system which was more powerful in certain aspects than arithmetic, but less powerful than standard set theory.

## Discussion and implications

The incompleteness results affect the philosophy of mathematics, particularly versions of formalism, which use a single system of formal logic to define their principles. One can paraphrase the first theorem as saying, “we can never find an all-encompassing axiomatic system which is able to prove all mathematical truths, but no falsehoods.”

On the other hand, from a strict formalist perspective this paraphrase would be considered meaningless because it presupposes that mathematical “truth” and “falsehood” are well-defined in an absolute sense, rather than relative to each formal system.

The following rephrasing of the second theorem is even more unsettling to the foundations of mathematics:

If an axiomatic system can be proven to be consistent from within itself, then it is inconsistent.

Therefore, in order to establish the consistency of a system S, one needs to use some other more powerful system T, but a proof in T is not completely convincing unless T’s consistency has already been established without using S.

At first, Gödel’s theorems seemed to leave some hope—it was thought that it might be possible to produce a general algorithm that indicates whether a given statement is undecidable or not, thus allowing mathematicians to bypass the undecidable statements altogether. However, the negative answer to the Entscheidungsproblem, obtained in 1936, showed that no such algorithm exists.

There are some who hold that a statement that is unprovable within a deductive system may be quite provable in a metalanguage. And what cannot be proven in that metalanguage can likely be proven in a meta-metalanguage, recursively, ad infinitum, in principle. By invoking such a system of typed metalanguages, along with an axiom of Reducibility — which by an inductive assumption applies to the entire stack of languages — one may, for all practical purposes, overcome the obstacle of incompleteness.

Note that Gödel’s theorems only apply to sufficiently strong axiomatic systems. “Sufficiently strong” means that the theory contains enough arithmetic to carry out the coding constructions needed for the proof of the first incompleteness theorem. Essentially, all that is required are some basic facts about addition and multiplication as formalized, e.g., in Robinson arithmetic Q. There are even weaker axiomatic systems that are consistent and complete, for instance Presburger arithmetic which proves every true first-order statement involving only addition.

The axiomatic system may consist of infinitely many axioms (as first-order Peano arithmetic does), but for Gödel’s theorem to apply, there has to be an effective algorithm which is able to check proofs for correctness. For instance, one might take the set of all first-order sentences which are true in the standard model of the natural numbers. This system is complete; Gödel’s theorem does not apply because there is no effective procedure that decides if a given sentence is an axiom. In fact, that this is so is a consequence of Gödel’s first incompleteness theorem.

Another example of a specification of a theory to which Gödel’s first theorem does not apply can be constructed as follows: order all possible statements about natural numbers first by length and then lexicographically, start with an axiomatic system initially equal to the Peano axioms, go through your list of statements one by one, and, if the current statement can be neither proven nor disproven from the current axiom system, add it to that system. This creates a system which is complete, consistent, and sufficiently powerful, but not computably enumerable.

Gödel himself only proved a technically slightly weaker version of the above theorems; the first proof for the versions stated above was given by J. Barkley Rosser in 1936.

In essence, the proof of the first theorem consists of constructing a statement p within a formal axiomatic system that can be given a meta-mathematical interpretation of:

p = “This statement cannot be proven in the given formal theory”

As such, it can be seen as a modern variant of the Liar paradox, although unlike the classical paradoxes it’s not really paradoxical.

If the axiomatic system is consistent, Gödel’s proof shows that p cannot be proven in the system. Therefore p is true (p claims to be not provable, and it is not provable) yet it cannot be formally proved in the system. If the axiomatic system is ω-consistent, then the negation of p cannot be proven either, and so p is undecidable. In a system which is not ω-consistent (but consistent), either we have the same situation, or we have a false statement which can be proven (namely, the negation of p).
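The self-referential character of p can be stated as a provable equivalence. In the sketch below, $T$ is the formal system, $\mathrm{Prov}_T$ is its arithmetized provability predicate and $\ulcorner p \urcorner$ is the numeral of the Gödel number of p; this notation is an assumption introduced here.

$$
T \vdash p \leftrightarrow \neg\,\mathrm{Prov}_T(\ulcorner p \urcorner)
$$

The sentence p does not literally mention itself; it mentions a number that happens to be its own code, which is why no paradox arises.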

Adding p to the axioms of the system would not solve the problem: there would be another Gödel sentence for the enlarged theory. Theories such as Peano arithmetic, for which any computably enumerable consistent extension is incomplete, are called essentially incomplete.

### Minds and machines

Authors including J. R. Lucas have debated what, if anything, Gödel’s incompleteness theorems imply about human intelligence. Much of the debate centers on whether the human mind is equivalent to a Turing machine, or by the Church-Turing thesis, any finite machine at all. If it is, and if the machine is consistent, then Gödel’s incompleteness theorems would apply to it.

Hilary Putnam (1960) suggested that while Gödel’s theorems cannot be applied to humans, since they make mistakes and are therefore inconsistent, they may be applied to the human faculty of science or mathematics in general. If we are to believe that this faculty is consistent, then either we cannot prove its consistency, or it cannot be represented by a Turing machine.

### Postmodernism and continental philosophy

Appeals are sometimes made to the incompleteness theorems to support by analogy ideas which go beyond mathematics and logic. For instance, Régis Debray applies them to politics. A number of authors have commented, mostly negatively, on such extensions and interpretations, including Torkel Franzén, Alan Sokal and Jean Bricmont, Ophelia Benson and Jeremy Stangroom. The last two quote biographer Rebecca Goldstein commenting on the disparity between Gödel’s avowed Platonism and the anti-realist uses to which his ideas are put by humanist intellectuals.

### Theories of everything and physics

Stanley Jaki, followed much later by Stephen Hawking and others, has argued that Gödel’s theorem (or an argument analogous to it) implies that even the most sophisticated formulation of physics will be incomplete, and that therefore there can never be an ultimate theory that can be formulated as a finite number of principles, known for certain as “final”.

## Relationship with computability

As early as 1943, Kleene gave a proof of Gödel’s incompleteness theorem using basic results of computability theory. A basic result of computability shows that the halting problem is unsolvable: there is no computer program that can correctly determine, given a program P as input, whether P eventually halts when run with no input. Kleene showed that the existence of a complete effective theory of arithmetic with certain consistency properties would force the halting problem to be decidable, a contradiction. An exposition of this proof at the undergraduate level was given by Charlesworth (1980).

By enumerating all possible proofs, it is possible to enumerate all the provable consequences of any effective first-order theory. This makes it possible to search for proofs of a certain form. Moreover, the method of arithmetization introduced by Gödel can be used to show that any sufficiently strong theory of arithmetic can represent the workings of computer programs. In particular, for each program P there is a formula Q such that Q expresses the idea that P halts when run with no input. The formula Q says, essentially, that there is a natural number that encodes the entire computation history of P and this history ends with P halting.

If, for every such formula Q, either Q or the negation of Q were a logical consequence of the axiom system, then it would be possible, by enumerating enough theorems, to determine which of these is the case. In particular, for each program P, the axiom system would prove either “P halts when run with no input” or “P doesn’t halt when run with no input.”

Consistency assumptions imply that the axiom system is correct about these theorems. If the axioms prove that a program P doesn’t halt when the program P actually does halt, then the axiom system is inconsistent, because it is possible to use the complete computation history of P to make a proof that P does halt. This proof would just follow the computation of P step-by-step until P halts after a finite number of steps.

The mere consistency of the axiom system is not enough to obtain a contradiction, however, because a consistent axiom system could still prove the ω-inconsistent theorem that a program halts when it actually doesn’t halt. The assumption of ω-consistency rules this out: if the axiom system proves that a program halts, then the program actually does halt. Thus if the axiom system were consistent and ω-consistent, its proofs about which programs halt would correctly reflect reality. It would then be possible to effectively decide which programs halt by merely enumerating proofs in the system; this contradiction shows that no effective, consistent, ω-consistent formal theory of arithmetic that is strong enough to represent the workings of a computer can be complete.
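
The search described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_theorems` is a hypothetical stand-in for an enumeration of the theorems of a complete, correct theory (which, by the argument above, cannot exist for real arithmetic), and the sentence formats "P halts" / "P does not halt" are invented for the example.

```python
from typing import Callable, Iterator


def decide_halting(program: str, theorems: Callable[[], Iterator[str]]) -> bool:
    """Scan an enumeration of theorems until the halting question is settled.

    If the theory were complete and correct, one of the two sentences would
    eventually appear, so this loop would always terminate with the right answer.
    """
    halts = f"{program} halts"
    loops = f"{program} does not halt"
    for theorem in theorems():
        if theorem == halts:
            return True
        if theorem == loops:
            return False
    # A real enumeration is infinite; a finite toy one can run dry.
    raise RuntimeError("the enumeration never settled this program")


def toy_theorems() -> Iterator[str]:
    """Hypothetical, finite stand-in for a theorem enumerator (assumption)."""
    yield "0=0"
    yield "loop_forever does not halt"
    yield "return_at_once halts"


print(decide_halting("return_at_once", toy_theorems))
print(decide_halting("loop_forever", toy_theorems))
```

Since no program can decide halting, the sketch shows where the contradiction lands: the only assumption that can fail is the existence of the complete, correct enumerator.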

## Proof sketch for the first theorem

Throughout the proof we assume a formal system is fixed and satisfies the necessary hypotheses. The proof has three essential parts. The first part is to show that statements can be represented by natural numbers, known as Gödel numbers, and that properties of the statements can be detected by examining their Gödel numbers. This part culminates in the construction of a formula expressing the idea that a statement is provable in the system. The second part of the proof is to construct a particular statement that, essentially, says that it is unprovable. The third part of the proof is to analyze this statement to show that it is neither provable nor disprovable in the system.

### Arithmetization of syntax

The main problem in fleshing out the above-mentioned proof idea is the following: in order to construct a statement p that is equivalent to “p cannot be proved”, p would have to somehow contain a reference to p, which could easily give rise to an infinite regress. Gödel’s ingenious trick, which was later used by Alan Turing in his work on the Entscheidungsproblem, is to represent statements as numbers, which is often called the arithmetization of syntax.

To begin with, every formula or statement that can be formulated in our system gets a unique number, called its Gödel number. This is done in such a way that it is easy to mechanically convert back and forth between formulas and Gödel numbers. It is similar, for example, to the way English sentences are encoded as sequences (or “strings”) of numbers using ASCII: such a sequence is considered as a single (if potentially very large) number. Because our system is strong enough to reason about numbers, it is now also possible to reason about formulas within the system.
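
As a rough illustration of the ASCII analogy, the following Python sketch turns a formula written as a string into one natural number and back. The function names are made up for this example, and the byte-based encoding is not Gödel's original prime-power scheme; it only shows that the conversion is mechanical in both directions.

```python
def godel_number(formula: str) -> int:
    """Read the ASCII bytes of the formula as one big base-256 number."""
    return int.from_bytes(formula.encode("ascii"), "big")


def formula_of(number: int) -> str:
    """Invert godel_number (valid for strings that do not start with a NUL byte)."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("ascii")


n = godel_number("2*3=6")
print(n)              # a single (large) natural number
print(formula_of(n))  # the original formula again
```

Any injective, mechanically reversible encoding would serve equally well; the proof only needs that syntactic properties of formulas become arithmetical properties of their codes.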

A formula F(x) that contains exactly one free variable x is called a statement form or class-sign. As soon as x is replaced by a specific number, the statement form turns into a bona fide statement, and it is then either provable in the system, or not. For certain formulas one can show that for every natural number n, F(n) is true if and only if it can be proven (the precise requirement in the original proof is weaker, but for the proof sketch this will suffice). In particular, this is true for every specific arithmetic operation between a finite number of natural numbers, such as “2*3=6”.

Statement forms themselves are not statements and therefore cannot be proved or disproved. But every statement form F(x) can be assigned a Gödel number, which we will denote by G(F). The choice of the free variable used in the form F(x) is not relevant to the assignment of the Gödel number G(F).

Now comes the trick: The notion of provability itself can also be encoded by Gödel numbers, in the following way. Since a proof is a list of statements which obey certain rules, we can define the Gödel number of a proof. Now, for every statement p, we may ask whether a number x is the Gödel number of its proof. The relation between the Gödel number of p and x, the Gödel number of its proof, is an arithmetical relation between two numbers. Therefore there is a statement form Bew(y) that uses this arithmetical relation to state that a Gödel number of a proof of the formula with Gödel number y exists:

Bew(y) = ∃ x ( y is the Gödel number of a formula and x is the Gödel number of a proof of the formula encoded by y).

The name Bew is short for beweisbar, the German word for “provable”. An important feature of Bew is that if a statement p is provable in the system then Bew(G(p)) is also provable. This is because any proof of p would have a corresponding Gödel number, the existence of which causes Bew(G(p)) to be satisfied.
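
To make the idea of “proof checking as a relation between two numbers” concrete, here is a small Python sketch. Everything in it is an assumption chosen for illustration: it swaps arithmetic for Hofstadter's toy MIU string system, it codes strings in base 5 rather than with Gödel's scheme, and `bew_bounded` searches only below a bound, whereas the real Bew is an unbounded existential statement.

```python
from typing import Optional, Set

SYMBOLS = "MIU;"  # ';' separates the lines of a proof


def encode(text: str) -> int:
    """Code a string as a base-5 number whose digits are 1..4 (never 0)."""
    n = 0
    for ch in text:
        n = n * 5 + SYMBOLS.index(ch) + 1
    return n


def decode(n: int) -> Optional[str]:
    """Invert encode; return None if n is not the code of any string."""
    chars = []
    while n > 0:
        n, digit = divmod(n, 5)
        if digit == 0:
            return None
        chars.append(SYMBOLS[digit - 1])
    return "".join(reversed(chars))


def successors(s: str) -> Set[str]:
    """All strings obtainable from s by one MIU rule."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                       # xI  -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                     # Mx  -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])   # III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])         # UU  -> (nothing)
    return out


def is_proof_of(x: int, y: int) -> bool:
    """Decidable relation: x codes a derivation from the axiom MI ending in the string coded by y."""
    proof, formula = decode(x), decode(y)
    if proof is None or formula is None:
        return False
    lines = proof.split(";")
    if lines[0] != "MI" or lines[-1] != formula:
        return False
    return all(b in successors(a) for a, b in zip(lines, lines[1:]))


def bew_bounded(y: int, bound: int) -> bool:
    """Bounded stand-in for Bew(y): is there a proof code x below the bound?"""
    return any(is_proof_of(x, y) for x in range(bound))


print(bew_bounded(encode("MII"), 5000))  # found via the proof "MI;MII"
print(bew_bounded(encode("MU"), 5000))   # no proof code below the bound
```

The point of the sketch is the split it exhibits: `is_proof_of` is a mechanical check on two numbers, while provability itself is the existential closure of that check.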

### Diagonalization

The next step in the proof is to obtain a statement that says it is unprovable. Although Gödel constructed this statement directly, the existence of at least one such statement follows from the diagonal lemma, which says that for any sufficiently strong formal system and any statement form F there is a statement p such that the system proves

p ↔ F(G(p)).

We obtain p by letting F be the negation of Bew(x); thus p roughly states that its own Gödel number is the Gödel number of an unprovable formula.

The statement p is not literally equal to ~Bew(G(p)); rather, p states that if a certain calculation is performed, the resulting Gödel number will be that of an unprovable statement. But when this calculation is performed, the resulting Gödel number turns out to be the Gödel number of p itself. This is similar to the following sentence in English:

“, when preceded by itself in quotes, is unprovable.”, when preceded by itself in quotes, is unprovable.

This sentence does not directly refer to itself, but when the stated transformation is made the original sentence is obtained as a result, and thus this sentence asserts its own unprovability. The proof of the diagonal lemma employs a similar method.
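
The English example can be checked mechanically. The Python sketch below (function names invented for the illustration, straight quotes used instead of typographic ones) builds the sentence by preceding a fragment with its own quotation, then confirms that the object the sentence describes is the sentence itself.

```python
def precede_by_own_quotation(fragment: str) -> str:
    """Return the fragment preceded by itself in quotes."""
    return f'"{fragment}"{fragment}'


def described_object(sentence: str) -> str:
    """Perform the transformation the sentence talks about.

    The sentence quotes a fragment and makes a claim about that fragment
    "when preceded by itself in quotes"; this computes exactly that string.
    """
    closing_quote = sentence.index('"', 1)
    quoted_fragment = sentence[1:closing_quote]
    return precede_by_own_quotation(quoted_fragment)


FRAGMENT = ", when preceded by itself in quotes, is unprovable."
sentence = precede_by_own_quotation(FRAGMENT)

print(sentence)
print(described_object(sentence) == sentence)
```

No string in the construction contains itself; self-reference arises only because the described transformation happens to reproduce the whole sentence.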

### Proof of independence

We will now assume that our axiomatic system is ω-consistent. We let p be the statement obtained in the previous section.

If p were provable, then Bew(G(p)) would be provable, as argued above. But p asserts the negation of Bew(G(p)). Thus our system would be inconsistent, proving both a statement and its negation. This contradiction shows that p cannot be provable.

If the negation of p were provable, then Bew(G(p)) would be provable (because p was constructed to be equivalent to the negation of Bew(G(p))). However, for each specific number x, x cannot be the Gödel number of the proof of p, because p is not provable (from the previous paragraph). Thus on one hand the system proves there is a number with a certain property (that it is the Gödel number of the proof of p), but on the other hand, for every specific number x, we can prove that it does not have this property. This is impossible in an ω-consistent system. Thus the negation of p is not provable.

So the statement p is undecidable: it can neither be proved nor disproved within our system. ∎

Note that in every consistent system p is not provable (and is therefore true, since p asserts its own unprovability). The assumption of ω-consistency is required only to show that the negation of p is not provable. Thus:

• In an ω-consistent formal system, we may prove neither p nor its negation, and so p is undecidable.
• In a consistent formal system we may either have the same situation, or we may prove the negation of p; in the latter case, we have a statement (“not p”) which is false but provable.

Note that if one tries to “add the missing axioms” in order to avoid the undecidability of the system, then one has to add either p or “not p” as an axiom. But then the definition of “being a Gödel number of a proof” of a statement changes, which means that the statement form Bew(x) is now different. Thus when we apply the diagonal lemma to this new form Bew, we obtain a new statement p, different from the previous one, which will be undecidable in the new system if it is ω-consistent.

Rosser (1936) showed, by employing a Gödel sentence more complicated than p, that ordinary consistency sufficed for this proof.

### Boolos’s short proof

George Boolos (1998) vastly simplified the proof of the First Theorem, if one agrees that the theorem is equivalent to:

“There is no algorithm M whose output contains all true sentences of arithmetic and no false ones.”

“Arithmetic” refers to Peano or Robinson arithmetic, but the proof invokes no specifics of either. It is tacitly assumed that these systems allow ‘<’ and ‘×’ to have their usual meanings (these are also the only defined arithmetical notions the proof requires). The Gödel sentence draws on Berry’s paradox, except that “fewer than n symbols of the language of arithmetic” replaces “fewer than n natural language syllables.” Boolos proves the theorem in about two pages, employing the language of first order logic but invoking no facts about the connectives or quantifiers. The domain is the natural numbers, but the proof is innocent of infinity in any form.

Let [n] abbreviate (the natural number) n successive applications of the successor function, starting from 0. Boolos then defines several related predicates, starting with Cxz, which comes out true iff an arithmetic formula containing z symbols “names” (see below) the number x. The construction of C is only sketched. This sketch assumes that every formula has a Gödel number; this is the only mention of Gödel numbering in the entire proof. The other predicates are:

Bxy ↔ ∃z(z<y ∧ Cxz),
Axy ↔ ¬Bxy ∧ ∀a(a<x → Bay),
Fx ↔ ∃y((y=[10]×[k]) ∧ Axy). k = the number of symbols appearing in Axy.

Fx “names” n if the output of M includes the sentence ∀x(Fx ↔(x=[n])). Thus Berry’s paradox is formalized. The balance of the proof, requiring but 12 lines of text, shows that this sentence is true in a semantic sense, but no algorithm M will identify it as true. Thus arithmetic truth outruns proof. QED.

The proof is intuitionistically valid, and requires but two existential quantifiers. The proof nowhere mentions recursive functions or any facts from number theory; Boolos even claims that the proof dispenses with diagonalization. For more on this proof, see Berry’s paradox.

## Proof sketch for the second theorem

The main difficulty in proving the second incompleteness theorem is to show that various facts about provability used in the proof of the first incompleteness theorem can be formalized within the system using a formal predicate for provability. Once this is done, the second incompleteness theorem essentially follows by formalizing the entire proof of the first incompleteness theorem within the system itself.

Let p stand for the undecidable sentence constructed above, and assume that the consistency of the system can be proven from within the system itself. We have seen above that if the system is consistent, then p is not provable. The proof of this implication can be formalized within the system, and therefore the statement “p is not provable”, or “not P(p)” can be proven in the system.

But this last statement is equivalent to p itself (and this equivalence can be proven in the system), so p can be proven in the system. This contradiction shows that the system must be inconsistent.
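
The steps of this sketch can be laid out schematically. The notation below is an assumption made for the illustration: T names the system, Con(T) its formalized consistency statement, and Bew(G(p)) plays the role written P(p) above.

```latex
\begin{align*}
&T \vdash \mathrm{Con}(T) \rightarrow \neg\,\mathrm{Bew}(G(p))
  && \text{(first theorem, formalized inside } T\text{)}\\
&T \vdash p \leftrightarrow \neg\,\mathrm{Bew}(G(p))
  && \text{(diagonal lemma)}\\
&T \vdash \mathrm{Con}(T) \rightarrow p
  && \text{(from the two lines above)}
\end{align*}
```

So if T proved Con(T), it would also prove p, which the first theorem rules out for a consistent T.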

## Footnotes

1 The word “true” here is being used disquotationally; that is, the statement “G_T is true” means the same thing as G_T itself. Thus a formalist might reinterpret the claim

for every theory T satisfying the hypotheses, if T is consistent, then G_T is true

to mean

for every theory T satisfying the hypotheses, it is a theorem of Peano Arithmetic that Con(T) → G_T

where Con(T) is the natural formalization of the claim “T is consistent”, and G_T is the Gödel sentence for T.

2 Here Flg(κ) represents the theory generated by κ and “v Gen r” is a particular formula in the language of arithmetic.