van Dalen - Logic and Structure

Dirk van Dalen

Logic and Structure

Third, Augmented Edition

Springer-Verlag
Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest

Dirk van Dalen
Mathematical Institute
Utrecht University
Budapestlaan 6
3508 TA Utrecht
The Netherlands

Mathematics Subject Classification (1991): 03-01

ISBN 3-540-57839-0 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-57839-0 Springer-Verlag New York Berlin Heidelberg

Library of Congress Cataloging-in-Publication Data
Dalen, D. van (Dirk), 1932-
Logic and structure / Dirk van Dalen.
p. cm. - (Universitext)
Includes bibliographical references and index.
ISBN 3-540-57839-0
1. Logic, Symbolic and mathematical. I. Title.
QA9.D16 1994 511.3-dc20 94-3443 CIP

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1994
Printed in the United States of America
Typesetting: Camera-ready copy from the author using a Springer TeX macro package
SPIN: 10466452 41/3140 - 5 4 3 2 1 0 - Printed on acid-free paper

Preface

Logic appears in a 'sacred' and in a 'profane' form; the sacred form is dominant in proof theory, the profane form in model theory. The phenomenon is not unfamiliar; one observes this dichotomy also in other areas, e.g. set theory and recursion theory.
Some early catastrophes, such as the discovery of the set theoretical paradoxes (Cantor, Russell), or the definability paradoxes (Richard, Berry), make us treat a subject for some time with the utmost awe and diffidence. Sooner or later, however, people start to treat the matter in a more free and easy way. Being raised in the 'sacred' tradition, my first encounter with the profane tradition was something like a culture shock. Hartley Rogers introduced me to a more relaxed world of logic by his example of teaching recursion theory to mathematicians as if it were just an ordinary course in, say, linear algebra or algebraic topology. In the course of time I have come to accept this viewpoint as the didactically sound one: before going into esoteric niceties one should develop a certain feeling for the subject and obtain a reasonable amount of plain working knowledge. For this reason this introductory text sets out in the profane vein and tends towards the sacred only at the end.

The present book has developed out of courses given at the mathematics department of Utrecht University. The experience drawn from these courses and the reaction of the participants suggested strongly that one should not practice and teach logic in isolation. As soon as possible examples from everyday mathematics should be introduced; indeed, first-order logic finds a rich field of applications in the study of groups, rings, partially ordered sets, etc.

The role of logic in mathematics and computer science is two-fold - a tool for applications in both areas, and a technique for laying the foundations. The latter role will be neglected here; we will concentrate on the daily matters of formalised (or formalisable) science. Indeed, I have opted for a practical approach: I will cover the basics of proof techniques and semantics, and then go on to topics that are less abstract.
Experience has taught us that the natural deduction technique of Gentzen lends itself best to an introduction; it is close enough to actual informal reasoning to enable students to devise proofs by themselves. Hardly any artificial tricks are involved and at the end there is the pleasing discovery that the system has striking structural properties; in particular it perfectly suits the constructive interpretation of logic and it allows normal forms. The latter topic has been added to this edition in view of its importance in theoretical computer science. In chapter 3 we already have enough technical power to obtain some of the traditional and (even today) surprising model theoretic results.

The book is written for beginners without knowledge of more advanced topics; no esoteric set theory or recursion theory is required. The basic ingredients are natural deduction and semantics; the latter is presented in constructive and classical form.

In chapter 5 intuitionistic logic is treated on the basis of natural deduction without the rule of Reductio ad absurdum, and of Kripke semantics. Intuitionistic logic has gradually freed itself from the image of eccentricity and now it is recognised for its usefulness in e.g. topos theory and type theory, hence its inclusion in an introductory text is fully justified. The final chapter, on normalisation, has been added for the same reasons; normalisation plays an important role in certain parts of computer science; traditionally normalisation (and cut elimination) belong to proof theory, but gradually applications in other areas have been introduced. In chapter 6 we consider only weak normalisation; a number of easy applications is given.

Various people have contributed to the shaping of the text at one time or another; Dana Scott, Jane Bridge and Henk Barendregt have been most helpful for the preparation of the first edition.
Since then many colleagues and students have spotted mistakes and suggested improvements; this edition benefitted from the remarks of Eleanor McDonnell, A. Scedrov and Karst Koymans. To all of these critics and advisers I am grateful.

Progress has dictated that the traditional typewriter should be replaced by more modern devices; this book has been redone in LaTeX by Addie Dekker and my wife Doke. Addie led the way with the first three sections of chapter one and Doke finished the rest of the manuscript; I am indebted to both of them, especially to Doke who found time and courage to master the secrets of the LaTeX trade. Thanks go to Leen Kievit for putting together the derivations and for adding the finer touches required for a LaTeX manuscript. Paul Taylor's macro for proof trees has been used for the natural deduction derivations.

De Meern, June 1994
Dirk van Dalen

Table of Contents

0. Introduction
1. Propositional Logic
   1.1 Propositions and Connectives
   1.2 Semantics
   1.3 Some Properties of Propositional Logic
   1.4 Natural Deduction
   1.5 Completeness
   1.6 The Missing Connectives
2. Predicate Logic
   2.1 Quantifiers
   2.2 Structures
   2.3 The Language of a Similarity Type
   2.4 Semantics
   2.5 Simple Properties of Predicate Logic
   2.6 Identity
   2.7 Examples
   2.8 Natural Deduction
   2.9 Adding the Existential Quantifier
   2.10 Natural Deduction and Identity
3. Completeness and Applications
   3.1 The Completeness Theorem
   3.2 Compactness and Skolem-Löwenheim
   3.3 Some Model Theory
   3.4 Skolem Functions or How to Enrich Your Language
4. Second Order Logic
5. Intuitionistic Logic
   5.1 Constructive Reasoning
   5.2 Intuitionistic Propositional and Predicate Logic
   5.3 Kripke Semantics
   5.4 Some Model Theory
6. Normalisation
   6.1 Cuts
   6.2 Normalisation for Classical Logic
   6.3 Normalisation for Intuitionistic Logic
Bibliography
Index

0. Introduction

Without adopting one of the various views advocated in the foundations of mathematics, we may agree that mathematicians need and use a language, if only for the communication of their results and their problems. While mathematicians have been claiming the greatest possible exactness for their methods, they have been less sensitive as to their means of communication. It is well known that Leibniz proposed to put the practice of mathematical communication and mathematical reasoning on a firm base; it was, however, not before the nineteenth century that those enterprises were (more) successfully undertaken by G. Frege and G. Peano. No matter how ingeniously and rigorously Frege, Russell, Hilbert, Bernays and others developed mathematical logic, it was only in the second half of this century that logic and its language showed any features of interest to the general mathematician.

The sophisticated results of Gödel were of course immediately appreciated, but they remained for a long time technical highlights without practical use. Even Tarski's result on the decidability of elementary algebra and geometry had to bide its time before any applications turned up.

Nowadays the applications of logic to algebra, analysis, topology, etc. are numerous and well-recognised. It seems strange that quite a number of simple facts, within the grasp of any student, were overlooked for such a long time.
It is not possible to give proper credit to all those who opened up this new territory; any list would inevitably show the preferences of the author, and neglect some fields and persons.

Let us note that mathematics has a fairly regular, canonical way of formulating its material, partly by its nature, partly under the influence of strong schools, like the one of Bourbaki. Furthermore the crisis at the beginning of this century has forced mathematicians to pay attention to the finer details of their language and to their assumptions concerning the nature and the extent of the mathematical universe. This attention started to pay off when it was discovered that there was in some cases a close connection between classes of mathematical structures and their syntactical description. Here is an example:

It is well known that a subset of a group $G$ which is closed under multiplication and inverse, is a group; however, a subset of an algebraically closed field $F$ which is closed under sum, product, minus and inverse, is in general not an algebraically closed field. This phenomenon is an instance of something quite general: an axiomatizable class of structures is axiomatised by a set of universal sentences (of the form $\forall x_1, \ldots, x_n \varphi$, with $\varphi$ quantifier free) iff it is closed under substructures. If we check the axioms of group theory we see that indeed all axioms are universal, while not all the axioms of the theory of algebraically closed fields are universal. The latter fact could of course be accidental; it could be the case that we were not clever enough to discover a universal axiomatization of the class of algebraically closed fields. The above theorem of Tarski and Łoś tells us, however, that it is impossible to find such an axiomatization!

The point of interest is that for some properties of a class of structures we have simple syntactic criteria.
We can, so to speak, read the behaviour of the real mathematical world (in some simple cases) off from its syntactic description. There are numerous examples of the same kind, e.g. Lyndon's Theorem: an axiomatisable class of structures is closed under homomorphisms iff it can be axiomatised by a set of positive sentences (i.e. sentences which, in prenex normal form with the open part in disjunctive normal form, do not contain negations).

The most basic and at the same time monumental example of such a connection between syntactical notions and the mathematical universe is of course Gödel's completeness theorem, which tells us that provability in the familiar formal systems is extensionally identical with truth in all structures. That is to say, although provability and truth are totally different notions (the first is combinatorial in nature, the latter set theoretical), they determine the same class of sentences: $\varphi$ is provable iff $\varphi$ is true in all structures.

Given the fact that the study of logic involves a great deal of syntactical toil, we will set out by presenting an efficient machinery for dealing with syntax. We use the technique of inductive definitions and as a consequence we are rather inclined to see trees wherever possible; in particular we prefer natural deduction in the tree form to the linear versions that are here and there in use.

One of the amazing phenomena in the development of the foundations of mathematics is the discovery that the language of mathematics itself can be studied by mathematical means. This is far from a futile play: Gödel's incompleteness theorems, for instance, lean heavily on a mathematical analysis of the language of arithmetic, and the work of Gödel and Cohen in the field of the independence proofs in set theory requires a thorough knowledge of the mathematics of mathematical language. These topics are not within the scope of the present book, so we can confine ourselves to the simpler parts of the syntax.
Nonetheless we will aim at a thorough treatment, in the hope that the reader will realise that all these things which he suspects to be trivial, but cannot see why, are perfectly amenable to proof. It may help the reader to think of himself as a computer with great mechanical capabilities, but with no creative insight, in those cases where he is puzzled because 'why should we prove something so utterly evident'! On the other hand the reader should keep in mind that he is not a computer and that, certainly when he gets to chapter 3, certain details should be recognised as trivial.

For the actual practice of mathematics predicate logic is doubtlessly the perfect tool, since it allows us to handle individuals. All the same we start this book with an exposition of propositional logic. There are various reasons for this choice.

In the first place propositional logic offers in miniature the problems that we meet in predicate logic, but there the additional difficulties obscure some of the relevant features; e.g. the completeness theorem for propositional logic already uses the concept of 'maximal consistent set', but without the complications of the Henkin axioms.

In the second place there are a number of truly propositional matters that would be difficult to treat in a chapter on predicate logic without creating an impression of discontinuity that borders on chaos. Finally it seems a matter of sound pedagogy to let propositional logic precede predicate logic. The beginner can in a simple context get used to the proof theoretical, algebraic and model theoretic skills that would be overbearing in a first encounter with predicate logic.

All that has been said about the role of logic in mathematics can be repeated for computer science; the importance of syntactical aspects is even more pronounced than in mathematics, but it does not stop there. The literature of theoretical computer science abounds with logical systems, completeness proofs and the like.
In the context of type theory (typed lambda calculus) intuitionistic logic has gained an important role, whereas the technique of normalisation has become a staple diet for computer scientists.

1. Propositional Logic

1.1 Propositions and Connectives

Traditionally, logic is said to be the art (or study) of reasoning; so in order to describe logic in this tradition, we have to know what 'reasoning' is. According to some traditional views reasoning consists of the building of chains of linguistic entities by means of a certain relation '... follows from ...', a view which is good enough for our present purpose. The linguistic entities occurring in this kind of reasoning are taken to be sentences, i.e. entities that express a complete thought, or state of affairs. We call those sentences declarative. This means that, from the point of view of natural language, our class of acceptable linguistic objects is rather restricted.

Fortunately this class is wide enough when viewed from the mathematician's point of view. So far logic has been able to get along pretty well under this restriction. True, one cannot deal with questions, or imperative statements, but the role of these entities is negligible in pure mathematics. I must make an exception for performative statements, which play an important role in programming; think of instructions such as 'goto, if ... then, else ...', etc. For reasons given below, we will, however, leave them out of consideration.

The sentences we have in mind are of the kind '27 is a square number', 'every positive integer is the sum of four squares', 'there is only one empty set'. A common feature of all those declarative sentences is the possibility of assigning them a truth value, true or false. We do not require the actual determination of the truth value in concrete cases, such as for instance Goldbach's conjecture or Riemann's hypothesis. It suffices that we can 'in principle' assign a truth value.
Our so-called two-valued logic is based on the assumption that every sentence is either true or false; it is the cornerstone of the practice of truth tables.

Some sentences are minimal in the sense that there is no proper part which is also a sentence, e.g. $5 \in \{0, 1, 2, 5, 7\}$, or $2 + 2 = 5$; others can be taken apart into smaller parts, e.g. '$c$ is rational or $c$ is irrational' (where $c$ is some constant). Conversely, we can build larger sentences from smaller ones by using connectives. We know many connectives in natural language; the following list is by no means meant to be exhaustive: and, or, not, if ... then ..., but, since, as, for, although, neither ... nor ... . In ordinary discourse, and also in informal mathematics, one uses these connectives incessantly; however, in formal mathematics we will economise somewhat on the connectives we admit. This is mainly for reason of exactness. Compare, for example, the following two sentences: "$\pi$ is irrational, but it is not algebraic", "Max is a Marxist, but he is not humourless". In the second statement we may discover a suggestion of some contrast, as if we should be surprised that Max is not humourless. In the first case such a surprise cannot be so easily imagined (unless, e.g. one has just read that almost all irrationals are algebraic); without changing the meaning one can transform this statement into "$\pi$ is irrational and $\pi$ is not algebraic". So why use (in a formal text) a formulation that carries vague, emotional undertones? For these and other reasons (e.g. of economy) we stick in logic to a limited number of connectives, in particular those that have shown themselves to be useful in the daily routine of formulating and proving.

Note, however, that even here ambiguities loom. Each of the connectives has already one or more meanings in natural language. We will give some examples:

1. John drove on and hit a pedestrian.
2. John hit a pedestrian and drove on.
3. If I open the window then we'll have fresh air.
4. If I open the window then $1 + 3 = 4$.
5. If $1 + 2 = 4$, then we'll have fresh air.
6. John is working or he is at home.
7. Euclid was a Greek or a mathematician.

From 1 and 2 we conclude that 'and' may have an ordering function in time. Not so in mathematics; "$\pi$ is irrational and 5 is positive" simply means that both parts are the case. Time just does not play a role in formal mathematics. We could not very well say "$\pi$ was neither algebraic nor transcendent before 1882". What we would want to say is "before 1882 it was unknown whether $\pi$ was algebraic or transcendent".

In the examples 3-5 we consider the implication. Example 3 will be generally accepted; it displays a feature that we have come to accept as inherent to implication: there is a relation between the premise and conclusion. This feature is lacking in the examples 4 and 5. Nonetheless we will allow cases such as 4 and 5 in mathematics. There are various reasons to do so. One is the consideration that meaning should be left out of syntactical considerations. Otherwise syntax would become unwieldy and we would run into an esoteric practice of exceptional cases. This general implication, in use in mathematics, is called material implication. Some other implications have been studied under the names of strict implication, relevant implication, etc.

Finally 6 and 7 demonstrate the use of 'or'. We tend to accept 6 and to reject 7. One mostly thinks of 'or' as something exclusive. In 6 we more or less expect John not to work at home, while 7 is unusual in the sense that we as a rule do not use 'or' when we could actually use 'and'. Also, we normally hesitate to use a disjunction if we already know which of the two parts is the case, e.g. "32 is a prime or 32 is not a prime" will be considered artificial (to say the least) by most of us, since we already know that 32 is not a prime.
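The readings of 'if ... then' and 'or' adopted in examples 4, 5 and 7 can be made concrete in a small sketch. It assumes the standard truth-functional reading of material implication (false only when the premise is true and the conclusion false) and of inclusive 'or'; the function names and the Boolean stand-ins for 'I open the window' and 'we'll have fresh air' are illustrative assumptions, not notation from the text.

```python
def material_implication(premise: bool, conclusion: bool) -> bool:
    """'if premise then conclusion': false only when premise is true and conclusion false."""
    return (not premise) or conclusion


def inclusive_or(left: bool, right: bool) -> bool:
    """'left or right' in the mathematical sense: true when at least one part is true."""
    return left or right


# Hypothetical stand-ins for the everyday sentences; either value may be tried.
for window_open in (True, False):
    # Example 4: the conclusion 1 + 3 = 4 is true, so the implication holds
    # whether or not the window is opened.
    print("example 4:", material_implication(window_open, 1 + 3 == 4))

for fresh_air in (True, False):
    # Example 5: the premise 1 + 2 = 4 is false, so the implication holds
    # whatever happens to the air.
    print("example 5:", material_implication(1 + 2 == 4, fresh_air))

# Example 7: both parts are true, and the disjunction is still accepted.
print("example 7:", inclusive_or(True, True))
```

No relation between premise and conclusion enters the computation, which is the sense in which meaning is left out of the treatment of implication.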
Yet mathematics freely uses such superfluous disjunctions, for example "$2 \ge 2$" (which stands for "$2 > 2$ or $2 = 2$").

In order to provide mathematics with a precise language we will create an artificial, formal language, which will lend itself to mathematical treatment. First we will define a language for propositional logic, i.e. the logic which deals only with propositions (sentences, statements). Later we will extend our treatment to a logic which also takes properties of individuals into account.

The process of formalisation of propositional logic consists of two stages: (1) present a formal language, (2) specify a procedure for obtaining valid or true propositions.

We will first describe the language, using the technique of inductive definitions. The procedure is quite simple: first give the smallest propositions, which are not decomposable into smaller propositions; next describe how composite propositions are constructed out of already given propositions.

Definition 1.1.1. The language of propositional logic has an alphabet consisting of
(i) proposition symbols: $p_0, p_1, p_2, \ldots$,
(ii) connectives: $\wedge, \vee, \rightarrow, \neg, \leftrightarrow, \bot$,
(iii) auxiliary symbols: ( , ).

The connectives carry traditional names:
$\wedge$ - and - conjunction
$\vee$ - or - disjunction
$\rightarrow$ - if ..., then ... - implication
$\neg$ - not - negation
$\leftrightarrow$ - iff - equivalence, bi-implication
$\bot$ - falsity - falsum, absurdum

The proposition symbols and $\bot$ stand for the indecomposable propositions, which we call atoms, or atomic propositions.

Definition 1.1.2. The set PROP of propositions is the smallest set $X$ with the properties
(i) $p_i \in X$ ($i \in N$), $\bot \in X$,
(ii) $\varphi, \psi \in X \Rightarrow (\varphi \wedge \psi), (\varphi \vee \psi), (\varphi \rightarrow \psi), (\varphi \leftrightarrow \psi) \in X$,
(iii) $\varphi \in X \Rightarrow (\neg\varphi) \in X$.

The clauses describe exactly the possible ways of building propositions. In order to simplify clause (ii) we write $\varphi, \psi \in X \Rightarrow (\varphi \,\square\, \psi) \in X$, where $\square$ is one of the connectives $\wedge, \vee, \rightarrow, \leftrightarrow$.
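The three clauses of Definition 1.1.2 can be mirrored by a small datatype with one constructor per clause. The following is a minimal sketch, not part of the text: the class names `Atom`, `Falsum`, `Neg`, `Bin` and the function `show` are hypothetical, and the rendered strings insert spaces around binary connectives purely for readability.

```python
from dataclasses import dataclass
from typing import Union

CONNECTIVES = ("∧", "∨", "→", "↔")   # the binary connectives of clause (ii)


@dataclass(frozen=True)
class Atom:
    index: int                        # clause (i): the proposition symbol p_index


@dataclass(frozen=True)
class Falsum:
    pass                              # clause (i): the constant ⊥


@dataclass(frozen=True)
class Neg:
    body: "Prop"                      # clause (iii): (¬φ)


@dataclass(frozen=True)
class Bin:
    op: str                           # clause (ii): (φ □ ψ)
    left: "Prop"
    right: "Prop"

    def __post_init__(self) -> None:
        if self.op not in CONNECTIVES:
            raise ValueError(f"unknown connective: {self.op!r}")


Prop = Union[Atom, Falsum, Neg, Bin]


def show(phi: Prop) -> str:
    """Render a proposition with all the brackets that clauses (ii) and (iii) add."""
    if isinstance(phi, Atom):
        return f"p{phi.index}"
    if isinstance(phi, Falsum):
        return "⊥"
    if isinstance(phi, Neg):
        return f"(¬{show(phi.body)})"
    if isinstance(phi, Bin):
        return f"({show(phi.left)} {phi.op} {show(phi.right)})"
    raise TypeError("not a proposition")


# An illustrative proposition built from atoms by clauses (ii) and (iii).
example = Bin("∧", Bin("∨", Falsum(), Atom(32)), Neg(Atom(2)))
print(show(example))
```

Only values obtained by finitely many applications of the constructors exist, which is one way to read "the smallest set $X$" in the definition.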
Propositional Logic 1.1 Propositions and Connectives 9 A warning to the reader is in order here. We have used Greek letters cp, $ in the definition; are they propositions? Clearly we did not intend them to be so, as we want only those strings of symbols obtained by combining symbols of the alphabet in a correct way. Evidently no Greek letters come in at all! The explanation is that cp a nd 1 a re used as variables for propositions. Since 1 we want to study logic, we must use a language to discuss it in. As a rule this language is plain, everyday English. We call the language used to discuss logic our meta-language and cp a nd $ a re meta-variables for propositions. We could do without meta-variables by handling (ii) and (iii) verbally: if two propositions are given, then a new proposition is obtained by placing the connective between them and by adding brackets in front and at the end, etc. This verbal version should suffice to convince the reader of the advantage of the mathematical machinery. Note that we have added a rather unusual connective, I . Unusual, in the sense that it does not connect anything. Logical constant would be a better name. For uniformity we stick to our present usage. I is added for convenience, one could very well do without it, but it has certain advantages. One may note that there is something lacking, namely a symbol for the true proposition; we will indeed add another symbol, T , a s an abbreviation for the "true" proposition. Examples. ( ~ + P O ) , ((1 7 Vp32) A ( - 7 ~ 2 )E P ROP. ) 1, ( 4 @ P ROP ( A PI * p7, I t is easy to show that something belongs to P ROP ( just carry out the construction according to 1.1.2); it is somewhat harder to show that something does not belong to P ROP. We will do one example: 1 1 T h e o r e m 1.1.3 ( I n d u c t i o n Principle). Let A be a property, then A(cp) holds f or all cp E P ROP if A (pi), f or all i ,and A ( I ) , (i) A(cp), A($) =+ 4 (cp 11)h ( i 4 A(cp) =+ A ( ( 1 ~ ) ) . (4 Proof. 
Let X = {cp E P ROP ( A(cp)), t hen X satisfies (i), (ii) and (iii) of definition 1.1.2. So P ROP C X , i.e. for all cp E P ROP A(cp) holds. 0 We call an application of theorem 1.1.3 a proof by induction on cp. T he reader will note an obvious similarity between the above theorem and the principle of complete induction in arithmetic. The above procedure for obtaining all propositions, and for proving properties of propositions is elegant and perspicuous; there is another approach, however, which has its own advantages (in particular for coding): consider propositions as the result of a linear step-by-step construction. E.g. ( ( 7 p o ) + I ) is constructed by assembling it from its basic parts by using previously constructed parts: po . . . I . . . ( 7po) . . . ( ( i p o ) + I ) . T his is formalised as follows: Definition 1.1.4. ( a) A sequence 9 0 , .. . , cpn is called a formation sequence of cp if cp, = p a nd for all i n pi is atomic, or cpi = ( cpj 0 c pk) for certain j , k < i , or cpi = ( y j ) for certain j < i. ( b) cp is a subformula(cf. Exercise 7 ) of $ if p = 11, or q 2)and cp is a subformula of $1 or of $2, or $= $ = ( ~ $ 1 ) nd cp is a subformula of $1. a < I # ROP. P Observe that in this definition we are considering strings cp of symbols from the given alphabet; this mildly abuses our notational convention. p V ), ) , ) Examples. ( a) I , 2, p3, (1 P ~ ) ,( -(-I- V P ~ ) ( - 7 ~ 3 a nd ~ 3( - 7 ~ 3 a re b oth formation sequences of ( 1 ~ ~ ) . that formation sequences may contain Note 'garbage'. ( b) p , is a subformula of ( (p7V ( 1 ~ 2 )-) p l ) . ( pl + I) a subformula of -t is V (PI + I ) ) . ( ( ( ~ 2 (PI A Po)) We now give some trivial examples of proof by induction. In practice we actually only verify the clauses of the proof by induction and leave the conclusion to the reader. Suppose I E and X satisfies (i), (ii), (iii) of Definition 1.1.2. We X claim that Y = X I)also satisfies (i), (ii) and (iii). Since I , p i E X , also I , p i E Y . 
If cp,$ E Y , then p ,$ E X . Since X satisfies (ii) (cpO$) E X . From the form of the expressions it is clear that (cp $) # 7 1 I (look at the brackets), so ( p $) E X I) = Y . Likewise one shows that Y satisfies (iii). Hence X is not the smallest set satisfying (i), (ii) and (iii), so I c annot belong to P ROP. P roperties of propositions are established by an inductive procedure analogous to definition 1.1.2: first deal with the atoms, and then go from the parts to the composite propositions. This is made precise in 1 1 (1 1 (-7 1 1 - 1 . E ach proposition has an even number of brackets. Proof. ( i) Each atom has 0 brackets and 0 is even. (ii) Suppose cp a nd $ have 2 n, resp. 2m brackets, then (P 0 $) h as 2(n m + 1 ) brackets. (iii) Suppose cp has 2n brackets, then (-.cp) has 2(n + 1 ) brackets. + 10 1. Propositional Logic 1.1 Propositions and Connectives 11 2. E ach proposition has a formation sequence. Proof. ( i) If cp is an atom, then the sequence consisting of just cp is a formation sequence of cp. (ii) Let cpo, . . . ,vn a nd $ 0,. . . , $m be formation sequences of cp a nd $, t hen one easily sees that cpo, . . . ,cp,, $0,. . . , $ m, ( p n 0 $m) is a formation sequence of (cp 0 $I). (iii) Left t o the reader. 0 We can improve on 2: T h e o r e m 1.1.5. P ROP i s the set of all expressions having formation sequences. We now formulate' the general principle of T h e o r e m 1.1.6 ( Definition by R ecursion). Let mappings H o : A2 -+ A and H , : A -+ A be given and let H at be a m apping from the set of atoms i nto A, t hen there exists exactly one mapping F : P ROP -+ A s uch that Proof. Let F be the set of all expressions (i.e. strings of symbols) having formation sequences. We have shown above that P ROP F. Let cp have a formation sequence P O,.. . , cp,, we show cp E P ROP by induction on n . c I n concrete applications it is usually rather easily seen to be a correct principle. 
However, in general one has to prove the existence of a unique function satisfying the above equations. The proof is left as a n exercise, cf. Exercise 12. We give some examples of definition by recursion: 1. T he (parsing) tree of a proposition cp is defined by n = 0 : cp = cpo a nd by definition cp is atomic, so cp E P ROP. Suppose that all expressions with formation sequences of length m < n f a re in P ROP. B y definition cp, = ( pi cpj) for i , j < n , or cp, = ( 1 9 % )or i < n , or cp, is atomic. In the first case cpi a nd c pj have formation sequences of length i ,j < n , SO by induction hypothesis c pi,cpj E P ROP. As P ROP satisfies the clauses of definition 1.1.2, also (cpiOcpj) E P ROP. Treat negation likewise. The atomic case is trivial. Conclusion F C_ P ROP. 0 Theorem 1.1.5 is in a sense a justification of the definition of formation sequence. It also enables us to establish properties of propositions by ordinary induction on the length of formation sequences. In arithmetic one often defines functions by recursion, e.g. exponentiation is defined by xO = 1 a nd xYf = x . x , or the factorial function by O! = 1 Y a nd ( x + l ) ! = x ! . ( x+ 1 ). The jusitification is rather immediate: each value is obtained by using the preceding values (for positive arguments). There is an analogous principle in our syntax. E xample. T he number b(cp), of brackets of cp, can be defined as follows: The value of b(cp) can be computed by successively computing b($) for its subformulae $. A simpler way t o ,exhibit the trees consists of listing the atoms at the bottom, and indicating the connectives at the nodes. 12 1 . Propositional Logic 1 .1 Propositions and Connectives 13 Exercises 2. Show that ( (452 P R O P . 3. Let cp b e a subformula of $. Show that cp occurs in each formation sequence of $. + + + . 7 . 
Recast definition 1.1.4(b) in the form of a definition by recursion of the function sub : PROP → P(PROP) which assigns to each proposition φ the set sub(φ) of its subformulas.
8. Let #(T(φ)) be the number of nodes of T(φ). By the "number of connectives in φ" we mean the number of occurrences of connectives in φ. (In general #(A) stands for the number of elements of a (finite) set A.)
(a) If φ does not contain ⊥, show: number of connectives of φ + number of atoms of φ ≤ #(T(φ)).

2. The rank r(φ) of a proposition φ is defined by
    r(φ) = 0 for atomic φ,
    r((φ □ ψ)) = max(r(φ), r(ψ)) + 1,
    r((¬φ)) = r(φ) + 1.

In order to simplify our notation we will economise on brackets. We will always discard the outermost brackets and we will discard brackets in the case of negations. Furthermore we will use the convention that ∧ and ∨ bind more strongly than → and ↔ (cf. · and + in arithmetic), and that ¬ binds more strongly than the other connectives.

Examples.
    ¬φ ∨ φ stands for ((¬φ) ∨ φ),
    ¬(¬¬¬φ ∧ ⊥) stands for (¬((¬(¬(¬φ))) ∧ ⊥)),
    φ ∨ ψ → φ stands for ((φ ∨ ψ) → φ),
    φ → φ ∨ (ψ → χ) stands for (φ → (φ ∨ (ψ → χ))).

Warning. Note that those abbreviations are, properly speaking, not propositions.

In the proposition (p_1 → p_1) only one atom is used to define it; it is however used twice and it occurs at two places. For some purposes it is convenient to distinguish between formulas and formula occurrences. Now the definition of subformula does not tell us what an occurrence of φ in ψ is; we have to add some information. One way to indicate an occurrence of φ is to give its place in the tree of ψ, e.g. an occurrence of a formula in a given formula ψ is a pair (φ, k), where k is a node in the tree of ψ. One might even code k as a sequence of 0's and 1's, where we associate to each node the following sequence: ⟨ ⟩ (the empty sequence) to the top node, ⟨s_0, ..., s_{n−1}, 0⟩ to the left immediate
descendant of the node with sequence ⟨s_0, ..., s_{n−1}⟩ and ⟨s_0, ..., s_{n−1}, 1⟩ to the second immediate descendant of it (if there is one). We will not be overly formal in handling occurrences of formulas (or symbols, for that matter), but it is important that it can be done.

[in-out table: Smith in, Jones in]
"Smith is in" ∧ "Jones is in" is true iff "Smith is in" is true and "Jones is in" is true.

We write v(φ) = 1 (resp. 0) for "φ is true" (resp. false). Then the above consideration can be stated as v(φ ∧ ψ) = 1 iff v(φ) = v(ψ) = 1, or v(φ ∧ ψ) = min(v(φ), v(ψ)). One can also write it in the form of a truth table:

    ∧ | 0 1
    --+----
    0 | 0 0
    1 | 0 1

Exercises (continued)

4. If φ occurs in a shortest formation sequence of ψ then φ is a subformula of ψ.
5. Let r be the rank function.
(a) Show that r(φ) ≤ number of occurrences of connectives of φ,
(b) Give examples of φ such that < or = holds in (a),
(c) Find the rank of the propositions in exercise 1.
6. (a) Determine the trees of the propositions in exercise 1,
(b) Determine the propositions with the following trees.
[tree diagrams not reproduced]
8. (continued)
(b) #(sub(φ)) ≤ #(T(φ)).
(c) A branch of a tree is a maximal linearly ordered set. The length of a branch is the number of its nodes minus one. Show that r(φ) is the length of a longest branch in T(φ).
(d) Let φ not contain ⊥. Show: the number of connectives in φ + the number of atoms of φ ≤ 2^(r(φ)+1) − 1.
9. Show that a proposition with n connectives has at most 2n + 1 subformulas.
10. Show that the relation "is a subformula of" is transitive.
11. Show that for PROP we have a unique decomposition theorem: for each non-atomic proposition σ either there are two propositions φ and ψ such that σ = φ □ ψ, or there is a proposition φ such that σ = ¬φ.
12. (a) Give an inductive definition of the function F, defined by recursion on PROP from the functions H_at, H_□, H_¬, as a set F* of pairs.
(b) Formulate and prove for F* the induction principle.
(c) Prove that F* is indeed a function on PROP.
(d) Prove that it is the unique function on PROP satisfying the recursion equations.

1.2 Semantics

The task of interpreting propositional logic is simplified by the fact that the entities considered have a simple structure. The propositions are built up from rough blocks by adding connectives. The simplest parts (atoms) are of the form "grass is green", "Mary likes Goethe", "6 − 3 = 2", which are simply true or false. We extend this assignment of truth values to composite propositions, by reflection on the meaning of the logical connectives. Let us agree to use 1 and 0 instead of 'true' and 'false'. The problem we are faced with is how to interpret φ □ ψ, ¬φ, given the truth values of φ and ψ. We will illustrate the solution by considering the in-out-table for Messrs. Smith and Jones.

Conjunction. A visitor who wants to see both Smith and Jones wants the table to be in the position shown here, i.e.
[in-out table not reproduced]

One reads the truth table as follows: the first argument is taken from the leftmost column and the second argument is taken from the top row.

Disjunction. If a visitor wants to see one of the partners, no matter which one, he wants the table to be in one of the positions
[in-out tables not reproduced]
In the last case he can make a choice, but that is no problem: he wants to see at least one of the gentlemen, no matter which one. In our notation, the interpretation of ∨ is given by v(φ ∨ ψ) = 1 iff v(φ) = 1 or v(ψ) = 1. Shorter: v(φ ∨ ψ) = max(v(φ), v(ψ)). In truth table form:

    ∨ | 0 1
    --+----
    0 | 0 1
    1 | 1 1

Negation. The visitor who is solely interested in our Smith will state that "Smith is not in" if the table is in the position:
[in-out table not reproduced]
So "Smith is not in" is true if "Smith is in" is false. We write this as v(¬φ) = 1 iff v(φ) = 0, or v(¬φ) = 1 − v(φ). In truth table form:

    φ | ¬φ
    --+---
    0 | 1
    1 | 0

Falsum. An absurdity, such as "0 ≠ 0", "some odd numbers are even", "I am not myself", cannot be true.
So we put v(⊥) = 0. Strictly speaking we should add one more truth table, i.e. the table for ⊤, the opposite of falsum.

Verum. This symbol stands for a manifestly true proposition such as 1 = 1; we put v(⊤) = 1 for all v.

Implication. Our legendary visitor has been informed that "Jones is in if Smith is in". Now he can at least predict the following positions of the table
[in-out tables not reproduced]
If the table is in the position
[in-out table not reproduced]
then he knows that the information was false. The remaining case,
[in-out table not reproduced]
cannot be dealt with in such a simple way. There evidently is no reason to consider the information false, rather 'not very helpful', or 'irrelevant'. However, we have committed ourselves to the position that each statement is true or false, so we decide to call "If Smith is in, then Jones is in" true also in this particular case. The reader should realise that we have made a deliberate choice here; a choice that will prove a happy one in view of the elegance of the system that results. There is no compelling reason, however, to stick to the notion of implication that we just introduced. Various other notions have been studied in the literature; for mathematical purposes our notion (also called 'material implication') is however perfectly suitable.

Note that there is just one case in which an implication is false (see the truth table below); one should keep this observation in mind for future application, as it helps to cut down calculations. In our notation the interpretation of implication is given by v(φ → ψ) = 0 iff v(φ) = 1 and v(ψ) = 0. Its truth table is:

    → | 0 1
    --+----
    0 | 1 1
    1 | 0 1

We collect the foregoing in

Definition 1.2.1. A mapping v : PROP → {0, 1} is a valuation if
    v(φ ∧ ψ) = min(v(φ), v(ψ)),
    v(φ ∨ ψ) = max(v(φ), v(ψ)),
    v(φ → ψ) = 0 iff v(φ) = 1 and v(ψ) = 0,
    v(φ ↔ ψ) = 1 iff v(φ) = v(ψ),
    v(¬φ) = 1 − v(φ),
    v(⊥) = 0.

If a valuation is only given for atoms then it is, by virtue of the definition by recursion, possible to extend it to all propositions, hence we get:

Theorem 1.2.2.
If v is a mapping from the atoms into {0, 1}, satisfying v(⊥) = 0, then there exists a unique valuation ⟦·⟧_v, such that ⟦φ⟧_v = v(φ) for atomic φ.

It has become common practice to denote valuations as defined above by ⟦φ⟧, so we will adopt this notation. Since ⟦·⟧ is completely determined by its values on the atoms, ⟦φ⟧ is often denoted by ⟦φ⟧_v. Whenever there is no confusion we will delete the index v.

Theorem 1.2.2 tells us that each of the mappings v and ⟦·⟧_v determines the other one uniquely, therefore we call v also a valuation (or an atomic valuation, if necessary). From this theorem it appears that there are many valuations (cf. Exercise 4).

It is also obvious that the value ⟦φ⟧_v of φ under v only depends on the values of v on its atomic subformulae:

Lemma 1.2.3. If v(p_i) = v′(p_i) for all p_i occurring in φ, then ⟦φ⟧_v = ⟦φ⟧_v′.

Proof. An easy induction on φ. ∎

Equivalence. If our visitor knows that "Smith is in if and only if Jones is in", then he knows that they are either both in, or both out. Hence v(φ ↔ ψ) = 1 iff v(φ) = v(ψ). The truth table of ↔ is:

    ↔ | 0 1
    --+----
    0 | 1 0
    1 | 0 1

An important subset of PROP is that of all propositions φ which are always true, i.e. true under all valuations.

Definition 1.2.4.
(i) φ is a tautology if ⟦φ⟧_v = 1 for all valuations v,
(ii) ⊨ φ stands for 'φ is a tautology',
(iii) Let Γ be a set of propositions, then Γ ⊨ φ iff for all v: (⟦ψ⟧_v = 1 for all ψ ∈ Γ) ⇒ ⟦φ⟧_v = 1.

In words: Γ ⊨ φ holds iff φ is true under all valuations that make all ψ in Γ true. We say that φ is a semantical consequence of Γ. We write Γ ⊭ φ if Γ ⊨ φ is not the case.

The proof of the second part essentially uses the fact that ⊨ φ → ψ iff ⟦φ⟧_v ≤ ⟦ψ⟧_v for all v (cf. Exercise 6). ∎

The proof of the substitution theorem now immediately follows. The substitution theorem says in plain English that parts may be replaced by equivalent parts.
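Definition 1.2.4 quantifies over all valuations, but by Lemma 1.2.3 only the atoms occurring in the propositions matter, so for finitely many propositions a finite search suffices. The Python sketch below illustrates this; the tuple encoding of propositions and the names `atoms`, `value`, `entails` and `is_tautology` are assumptions made for the example, and the clauses for the connectives follow the truth conditions given in this section.

```python
from itertools import product

# Assumed encoding: atoms are strings ("bot" is falsum), ("not", φ) is ¬φ,
# (op, φ, ψ) is a binary compound with op in {"and", "or", "->", "<->"}.

def atoms(phi):
    """Set of atoms other than falsum occurring in phi."""
    if isinstance(phi, str):
        return set() if phi == "bot" else {phi}
    return set().union(*(atoms(part) for part in phi[1:]))

def value(phi, v):
    """Value of phi under the atomic valuation v (a dict atom -> 0 or 1)."""
    if phi == "bot":
        return 0
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == "not":
        return 1 - value(phi[1], v)
    a, b = value(phi[1], v), value(phi[2], v)
    if phi[0] == "and":
        return min(a, b)
    if phi[0] == "or":
        return max(a, b)
    if phi[0] == "->":
        return 0 if (a == 1 and b == 0) else 1
    if phi[0] == "<->":
        return 1 if a == b else 0
    raise ValueError(f"unknown connective: {phi[0]}")

def entails(gamma, phi):
    """Check Γ ⊨ φ for a finite list gamma by trying all relevant valuations."""
    names = sorted(atoms(phi).union(*(atoms(g) for g in gamma)))
    for bits in product((0, 1), repeat=len(names)):
        v = dict(zip(names, bits))
        if all(value(g, v) == 1 for g in gamma) and value(phi, v) != 1:
            return False
    return True

def is_tautology(phi):
    """⊨ φ is the special case of Γ ⊨ φ with Γ empty."""
    return entails([], phi)
```

For instance, `entails(["p", ("->", "p", "q")], "q")` checks the consequence φ, φ → ψ ⊨ ψ on atoms. The search visits 2ⁿ valuations for n atoms, so this is a brute-force check rather than an efficient decision procedure.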
There are various techniques for testing tautologies. One such (rather slow) technique uses truth tables. We give one example:

    (φ → ψ) ↔ (¬ψ → ¬φ)

    φ ψ | ¬φ ¬ψ | φ → ψ | ¬ψ → ¬φ | (φ → ψ) ↔ (¬ψ → ¬φ)
    0 0 |  1  1 |   1   |    1    |          1
    0 1 |  1  0 |   1   |    1    |          1
    1 0 |  0  1 |   0   |    0    |          1
    1 1 |  0  0 |   1   |    1    |          1

The last column consists of 1's only. Since, by lemma 1.2.3, only the values of φ and ψ are relevant, we had to check 2² cases. If there are n (atomic) parts we need 2ⁿ lines. One can compress the above table a bit, by writing it in the following form:

    (φ → ψ) ↔ (¬ ψ → ¬ φ)
     0 1 0  1  1 0 1 1 0
     0 1 1  1  0 1 1 1 0
     1 0 0  1  1 0 0 0 1
     1 1 1  1  0 1 1 0 1

Convention. φ_1, ..., φ_n ⊨ ψ stands for {φ_1, ..., φ_n} ⊨ ψ.

Note that "⟦φ⟧_v = 1 for all v" is another way of saying "⟦φ⟧ = 1 for all valuations".

Examples.
(i) ⊨ φ → φ; ⊨ ¬¬φ → φ; ⊨ φ ∨ ψ ↔ ψ ∨ φ,
(ii) φ, ψ ⊨ φ ∧ ψ; φ, φ → ψ ⊨ ψ; φ → ψ, ¬ψ ⊨ ¬φ.

One often has to substitute propositions for subformulae; it turns out to be sufficient to define substitution for atoms only. We write φ[ψ/p_i] for the proposition obtained by replacing all occurrences of p_i in φ by ψ. As a matter of fact, substitution of ψ for p_i defines a mapping of PROP into PROP, which can be given by recursion (on φ).

The following theorem spells out the basic property of the substitution of equivalent propositions.

Theorem 1.2.6 (Substitution Theorem). If ⊨ φ_1 ↔ φ_2, then ⊨ ψ[φ_1/p] ↔ ψ[φ_2/p], where p is an atom and distinct from ⊥.

The substitution theorem is actually a consequence of a slightly stronger lemma.

Let us make one more remark about the role of the two 0-ary connectives, ⊥ and ⊤. Clearly, ⊨ ⊤ ↔ ¬⊥, so we can define ⊤ from ⊥. On the other hand, we cannot define ⊥ from ⊤ and →; we note that from ⊤ we can never get anything but a proposition equivalent to ⊤ by using ∧, ∨, →, but from ⊥ we can generate ⊥ and ⊤ by means of applying ∧, ∨, →.

Exercises

1.
Check by the truth table method which of the following propositions are tautologies
(a) (¬φ ∨ ψ) ↔ (ψ → φ)
(b) φ → ((ψ → σ) → ((φ → ψ) → (φ → σ)))
(c) (φ → ¬φ) ↔ ¬φ
(d) ¬(φ → ¬φ)
(e) (φ → (ψ → σ)) ↔ ((φ ∧ ψ) → σ)
(f) φ ∨ ¬φ (principle of the excluded third)
(g) ⊥ ↔ (φ ∧ ¬φ)
(h) ⊥ → φ (ex falso sequitur quodlibet)
4. Show that there are 2^ℵ₀ valuations.

Proof (of the lemma behind the Substitution Theorem). Induction on ψ. We only have to consider ⟦φ_1 ↔ φ_2⟧_v = 1 (why?).

1.3 Some Properties of Propositional Logic

On the basis of the previous sections we can already prove a lot of theorems about propositional logic. One of the earliest discoveries in modern propositional logic was its similarity with algebras. Following Boole, an extensive study of the algebraic properties was made by a number of logicians. The purely algebraic aspects have since then been studied in the so-called Boolean Algebra. We will just mention a few of those algebraic laws.

Theorem 1.3.1. The following propositions are tautologies
    (φ ∨ ψ) ∨ σ ↔ φ ∨ (ψ ∨ σ)          (φ ∧ ψ) ∧ σ ↔ φ ∧ (ψ ∧ σ)          associativity
    φ ∨ ψ ↔ ψ ∨ φ                      φ ∧ ψ ↔ ψ ∧ φ                      commutativity
    φ ∨ (ψ ∧ σ) ↔ (φ ∨ ψ) ∧ (φ ∨ σ)    φ ∧ (ψ ∨ σ) ↔ (φ ∧ ψ) ∨ (φ ∧ σ)    distributivity
    ¬(φ ∨ ψ) ↔ ¬φ ∧ ¬ψ                 ¬(φ ∧ ψ) ↔ ¬φ ∨ ¬ψ                 De Morgan's laws
    φ ∨ φ ↔ φ                          φ ∧ φ ↔ φ                          idempotency

In order to apply the previous theorem in "logical calculations" we need a few more equivalences. This is demonstrated in the simple equivalence ⊨ φ ∧ (φ ∨ ψ) ↔ φ (exercise for the reader). For, by the distributive law ⊨ φ ∧ (φ ∨ ψ) ↔ (φ ∧ φ) ∨ (φ ∧ ψ) and ⊨ (φ ∧ φ) ∨ (φ ∧ ψ) ↔ φ ∨ (φ ∧ ψ), by idempotency and the substitution theorem. So ⊨ φ ∧ (φ ∨ ψ) ↔ φ ∨ (φ ∧ ψ). Another application of the distributive law will bring us back to start, so just applying the above laws will not eliminate ψ! We list therefore a few more convenient properties.

Lemma 1.3.2. If ⊨ φ → ψ, then ⊨ φ ∧ ψ ↔ φ and ⊨ φ ∨ ψ ↔ ψ.

Proof. Left to the reader. ∎

The following theorem establishes some equivalences involving various connectives. It tells us that we can "define" up to logical equivalence all connectives in terms of {∨, ¬}, or {→, ¬}, or {∧, ¬}, or {→, ⊥}. That is, we can find e.g.
a proposition involving only ∨ and ¬ which is equivalent to φ ↔ ψ, etc.

The list of Theorem 1.3.1 ends with the double negation law: ¬¬φ ↔ φ.

Proof. Compute the truth values of the left-hand and right-hand sides. ∎

We now have enough material to handle logic as if it were algebra. For convenience we write φ ≈ ψ for ⊨ φ ↔ ψ.

Lemma 1.3.5. ≈ is an equivalence relation on PROP, i.e.
    φ ≈ φ (reflexivity),
    φ ≈ ψ ⇒ ψ ≈ φ (symmetry),
    φ ≈ ψ and ψ ≈ σ ⇒ φ ≈ σ (transitivity).

Proof. Use ⊨ φ ↔ ψ iff ⟦φ⟧_v = ⟦ψ⟧_v for all v. ∎

We give some examples of algebraic computations, which establish a chain of equivalences.

1. ⊨ [φ → (ψ → σ)] ↔ [φ ∧ ψ → σ],
    φ → (ψ → σ) ≈ ¬φ ∨ (ψ → σ),            (1.3.4(b))
    ¬φ ∨ (ψ → σ) ≈ ¬φ ∨ (¬ψ ∨ σ),          (1.3.4(b) and subst. thm.)
    ¬φ ∨ (¬ψ ∨ σ) ≈ (¬φ ∨ ¬ψ) ∨ σ,         (ass.)
    (¬φ ∨ ¬ψ) ∨ σ ≈ ¬(φ ∧ ψ) ∨ σ,          (De Morgan and subst. thm.)
    ¬(φ ∧ ψ) ∨ σ ≈ (φ ∧ ψ) → σ,            (1.3.4(b))
So φ → (ψ → σ) ≈ (φ ∧ ψ) → σ.

We now leave out the references to the facts used, and make one long string. We just calculate till we reach a tautology.

2. ⊨ (φ → ψ) ↔ (¬ψ → ¬φ),
    ¬ψ → ¬φ ≈ ¬¬ψ ∨ ¬φ ≈ ψ ∨ ¬φ ≈ ¬φ ∨ ψ ≈ φ → ψ.
3. ⊨ φ → (ψ → φ),
    φ → (ψ → φ) ≈ ¬φ ∨ (¬ψ ∨ φ) ≈ (¬φ ∨ φ) ∨ ¬ψ.

Let us say that an n-ary logical connective $ is defined by its truth table, or by its valuation function, if ⟦$(p_1, ..., p_n)⟧ = f(⟦p_1⟧, ..., ⟦p_n⟧) for some function f.

Although we can apparently introduce many new connectives in this way, there are no surprises in stock for us, as all of those connectives are definable in terms of ∨ and ¬:

Theorem 1.3.6. For each n-ary connective $ defined by its valuation function, there is a proposition τ, containing only p_1, ..., p_n, ∨ and ¬, such that ⊨ τ ↔ $(p_1, ..., p_n).

Proof. Induction on n. For n = 1 there are 4 possible connectives with truth tables
[truth tables not reproduced]
One easily checks that the propositions ¬(p ∨ ¬p), p ∨ ¬p, p and ¬p will meet the requirements.

Suppose that for all n-ary connectives propositions have been found. Consider $(p_1, ..., p_n, p_{n+1}) with truth table:
[truth table not reproduced]
where i_k ≤ 1.

We consider two auxiliary connectives $_1 and $_2 defined by
    $_1(p_2, ..., p_{n+1}) = $(⊥, p_2, ..., p_{n+1}) and
    $_2(p_2, ..., p_{n+1}) = $(⊤, p_2, ..., p_{n+1}), where ⊤ = ¬⊥
(as given by the upper and lower half of the above table).

By the induction hypothesis there are propositions σ_1 and σ_2, containing only p_2, ..., p_{n+1}, ∨ and ¬ so that ⊨ $_i(p_2, ..., p_{n+1}) ↔ σ_i. From those two propositions we can construct the proposition τ:
    τ := (p_1 → σ_2) ∧ (¬p_1 → σ_1).

We have seen that ∨ and ∧ are associative, therefore we adopt the convention, also used in algebra, to delete brackets in iterated disjunctions and conjunctions; i.e. we write φ_1 ∨ φ_2 ∨ φ_3 ∨ φ_4, etc. This is alright, since no matter how we restore (syntactically correctly) the brackets, the resulting formula is determined uniquely up to equivalence.

Have we introduced all connectives so far? Obviously not. We can always invent new ones. Here is a famous one, introduced by Sheffer; φ | ψ stands for "not both φ and ψ". More precisely: φ | ψ is given by the following truth table

    Sheffer stroke
    | | 0 1
    --+----
    0 | 1 1
    1 | 1 0

Definition 1.3.8. If φ = ⋀_{i≤n} ⋁_{j≤m_i} φ_ij, where φ_ij is atomic or the negation of an atom, then φ is a conjunctive normal form. If φ = ⋁_{i≤n} ⋀_{j≤m_i} φ_ij, where φ_ij is atomic or the negation of an atom, then φ is a disjunctive normal form.

The normal forms are analogous to the well-known normal forms in algebra: ax² + byx is "normal", whereas x(ax + by) is not. One can obtain normal forms by simply "multiplying", i.e. repeated application of distributive laws.
In algebra there is only one "normal form"; in logic there is a certain duality between ∧ and ∨, so that we have two normal form theorems.

Theorem 1.3.9. For each φ there are conjunctive normal forms φ^∧ and disjunctive normal forms φ^∨, such that ⊨ φ ↔ φ^∧ and ⊨ φ ↔ φ^∨.

Proof. First eliminate all connectives other than ⊥, ∧, ∨ and ¬. Then prove the theorem by induction on the resulting proposition in the restricted language of ⊥, ∧, ∨ and ¬. In fact, ⊥ plays no role in this setting; it could just as well be ignored.
(a) φ is atomic. Then φ^∧ = φ^∨ = φ.
(b) φ = ψ ∧ σ. Then φ^∧ = ψ^∧ ∧ σ^∧. In order to obtain a disjunctive normal form we consider ψ^∨ = ⋁ ψ_i, σ^∨ = ⋁ σ_j, where the ψ_i's and σ_j's are conjunctions of atoms and negations of atoms. Now φ ≈ ψ ∧ σ ≈ ψ^∨ ∧ σ^∨ ≈ ⋁_{i,j} (ψ_i ∧ σ_j).

Proof of Theorem 1.3.6 (continued). Claim: ⊨ $(p_1, ..., p_{n+1}) ↔ τ.
If ⟦p_1⟧_v = 0, then ⟦p_1 → σ_2⟧_v = 1, so ⟦τ⟧_v = ⟦¬p_1 → σ_1⟧_v = ⟦σ_1⟧_v = ⟦$_1(p_2, ..., p_{n+1})⟧_v = ⟦$(p_1, p_2, ..., p_{n+1})⟧_v, using ⟦p_1⟧_v = 0 = ⟦⊥⟧_v. The case ⟦p_1⟧_v = 1 is similar.
Now expressing → and ∧ in terms of ∨ and ¬ (1.3.4), we have ⟦τ′⟧ = ⟦$(p_1, ..., p_{n+1})⟧ for all valuations (another use of lemma 1.2.3), where τ′ ≈ τ and τ′ contains only the connectives ∨ and ¬. ∎
For another solution see Exercise 7.

The above theorem and theorem 1.3.4 are pragmatic justifications for our choice of the truth table for →: we get an extremely elegant and useful theory. Theorem 1.3.6 is usually expressed by saying that ∨ and ¬ form a functionally complete set of connectives.
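The induction in the proof of Theorem 1.3.6 is effectively an algorithm. The following Python sketch builds, for a given truth function f, a proposition in ∨ and ¬ only. The tuple encoding and the names `express`, `value`, `Not`, `Or` and `majority` are hypothetical, chosen for this illustration; → and ∧ are eliminated through the equivalences φ → ψ ≈ ¬φ ∨ ψ and φ ∧ ψ ≈ ¬(¬φ ∨ ¬ψ), and ¬¬p_1 ∨ σ_1 is simplified to p_1 ∨ σ_1.

```python
from itertools import product

# Assumed encoding: atoms are strings, ("not", φ) is ¬φ, ("or", φ, ψ) is φ ∨ ψ.
def Not(phi):
    return ("not", phi)

def Or(phi, psi):
    return ("or", phi, psi)

def value(phi, v):
    """Value of a proposition built from atoms, "not" and "or" under v."""
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == "not":
        return 1 - value(phi[1], v)
    return max(value(phi[1], v), value(phi[2], v))

def express(f, names):
    """Proposition over the atoms `names`, using only ∨ and ¬, whose
    valuation function is f (f takes len(names) arguments from {0, 1})."""
    p = names[0]
    if len(names) == 1:
        # the four unary connectives, indexed by (f(0), f(1))
        return {
            (0, 0): Not(Or(p, Not(p))),
            (1, 1): Or(p, Not(p)),
            (0, 1): p,
            (1, 0): Not(p),
        }[(f(0), f(1))]
    rest = names[1:]
    sigma1 = express(lambda *xs: f(0, *xs), rest)  # first argument falsum
    sigma2 = express(lambda *xs: f(1, *xs), rest)  # first argument verum
    # τ = (p → σ2) ∧ (¬p → σ1), rewritten with ∨ and ¬ only
    left = Or(Not(p), sigma2)               # p → σ2
    right = Or(p, sigma1)                   # ¬p → σ1
    return Not(Or(Not(left), Not(right)))   # left ∧ right

def majority(a, b, c):
    """Example truth function: 1 iff at least two arguments are 1."""
    return 1 if a + b + c >= 2 else 0

names = ["p1", "p2", "p3"]
tau = express(majority, names)
print(all(value(tau, dict(zip(names, bits))) == majority(*bits)
          for bits in product((0, 1), repeat=3)))
```

The resulting proposition grows exponentially with the number of arguments, since each step of the induction uses two copies of the previous construction; the sketch trades size for a direct match with the proof.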
Likewise ∧, ¬ and →, ¬ and ⊥, → form functionally complete sets.

In analogy to the Σ and Π from algebra we introduce finite disjunctions and conjunctions:

Definition 1.3.7.
    ⋀_{i≤0} φ_i = φ_0,    ⋀_{i≤n+1} φ_i = (⋀_{i≤n} φ_i) ∧ φ_{n+1},
    ⋁_{i≤0} φ_i = φ_0,    ⋁_{i≤n+1} φ_i = (⋁_{i≤n} φ_i) ∨ φ_{n+1}.

Proof of Theorem 1.3.9 (continued). The last proposition is in normal form, so we equate φ^∨ to it.
(c) φ = ψ ∨ σ. Similar to (b).
(d) φ = ¬ψ. By induction hypothesis ψ has normal forms ψ^∨ and ψ^∧. ¬ψ ≈ ¬ψ^∨ ≈ ¬⋁⋀ ψ_ij ≈ ⋀⋁ ¬ψ_ij ≈ ⋀⋁ ψ′_ij, where ψ′_ij = ¬ψ_ij if ψ_ij is atomic, and ψ_ij = ¬ψ′_ij if ψ_ij is the negation of an atom. (Observe ¬ψ_ij ≈ ψ′_ij.) Clearly ⋀⋁ ψ′_ij is a conjunctive normal form for φ. The disjunctive normal form is left to the reader. ∎
For another proof of the normal form theorems see Exercise 7.

Exercises

1. Show by 'algebraic' means
    ⊨ (φ → ψ) ↔ (¬ψ → ¬φ), contraposition,
    ⊨ (φ → ψ) ∧ (ψ → σ) → (φ → σ), transitivity of →,
    ⊨ (φ → (ψ ∧ ¬ψ)) → ¬φ,
    ⊨ (φ → ¬φ) → ¬φ,
    ⊨ ¬(φ ∧ ¬φ),
    ⊨ φ → (ψ → φ ∧ ψ),
    ⊨ ((φ → ψ) → φ) → φ, Peirce's Law.

When looking at the algebra of logic in theorem 1.3.1, we saw that ∨ and ∧ behaved in a very similar way, to the extent that the same laws hold for both. We will make this 'duality' precise. For this purpose we consider a language with only the connectives ∨, ∧ and ¬.

Definition 1.3.10. Define an auxiliary mapping * : PROP → PROP recursively by
    φ* = ¬φ if φ is atomic,
    (φ ∧ ψ)* = φ* ∨ ψ*,
    (φ ∨ ψ)* = φ* ∧ ψ*,
    (¬φ)* = ¬φ*.

Example. ((p_0 ∧ ¬p_1) ∨ p_2)* = (p_0 ∧ ¬p_1)* ∧ p_2* = (p_0* ∨ (¬p_1)*) ∧ ¬p_2 = (¬p_0 ∨ ¬p_1*) ∧ ¬p_2 = (¬p_0 ∨ ¬¬p_1) ∧ ¬p_2 ≈ (¬p_0 ∨ p_1) ∧ ¬p_2.

Note that the effect of the *-translation boils down to taking the negation and applying De Morgan's laws.

Lemma 1.3.11. ⟦φ*⟧ = ⟦¬φ⟧.

Proof. Induction on φ. For atomic φ, ⟦φ*⟧ = ⟦¬φ⟧. ⟦(φ ∧ ψ)*⟧ = ⟦φ* ∨ ψ*⟧ = ⟦¬φ ∨ ¬ψ⟧ = ⟦¬(φ ∧ ψ)⟧. The cases ⟦(φ ∨ ψ)*⟧ and ⟦(¬φ)*⟧ are left to the reader. ∎

Corollary 1.3.12. ⊨ φ* ↔ ¬φ.

Proof. Immediate from Lemma 1.3.11. ∎

So far this is not the proper duality we have been looking for.
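Definition 1.3.10 and Lemma 1.3.11 lend themselves to a mechanical check. The sketch below, in Python, implements the *-translation on a hypothetical tuple encoding of propositions in ∨, ∧ and ¬ (the names `star` and `value` are assumptions of this example) and confirms for the proposition (p_0 ∧ ¬p_1) ∨ p_2 that φ* and ¬φ receive the same value under every valuation.

```python
from itertools import product

# Assumed encoding: atoms are strings, ("not", φ), ("and", φ, ψ), ("or", φ, ψ).

def star(phi):
    """The auxiliary *-translation of Definition 1.3.10."""
    if isinstance(phi, str):                      # φ* = ¬φ for atomic φ
        return ("not", phi)
    if phi[0] == "not":                           # (¬φ)* = ¬φ*
        return ("not", star(phi[1]))
    swapped = {"and": "or", "or": "and"}[phi[0]]  # ∧ and ∨ change places
    return (swapped, star(phi[1]), star(phi[2]))

def value(phi, v):
    """Value of phi under the atomic valuation v (a dict atom -> 0 or 1)."""
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == "not":
        return 1 - value(phi[1], v)
    a, b = value(phi[1], v), value(phi[2], v)
    return min(a, b) if phi[0] == "and" else max(a, b)

phi = ("or", ("and", "p0", ("not", "p1")), "p2")   # (p0 ∧ ¬p1) ∨ p2
names = ["p0", "p1", "p2"]
lemma_holds = all(
    value(star(phi), dict(zip(names, bits))) == 1 - value(phi, dict(zip(names, bits)))
    for bits in product((0, 1), repeat=len(names))
)
print(star(phi), lemma_holds)
```

The translation is purely syntactic, so `star(phi)` still contains the double negation ¬¬p_1; the check of the lemma is semantic and does not depend on simplifying it.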
We really just want to interchange ∧ and ∨. So we introduce a new translation.

Definition 1.3.13. The duality mapping ^d : PROP → PROP is recursively defined by
    φ^d = φ for φ atomic,
    (φ ∧ ψ)^d = φ^d ∨ ψ^d,
    (φ ∨ ψ)^d = φ^d ∧ ψ^d,
    (¬φ)^d = ¬φ^d.

Theorem 1.3.14 (Duality Theorem). ⊨ φ ↔ ψ ⇔ ⊨ φ^d ↔ ψ^d.

Proof. We use the *-translation as an intermediate step. Let us introduce the notion of simultaneous substitution to simplify the proof: σ[τ_0, ..., τ_n/p_0, ..., p_n] is obtained by substituting τ_i for p_i for all i ≤ n simultaneously (see Exercise 15). Observe that φ* = φ^d[¬p_0, ..., ¬p_n/p_0, ..., p_n], so φ*[¬p_0, ..., ¬p_n/p_0, ..., p_n] = φ^d[¬¬p_0, ..., ¬¬p_n/p_0, ..., p_n], where the atoms of φ occur among the p_0, ..., p_n. By the Substitution Theorem ⊨ φ^d ↔ φ*[¬p_0, ..., ¬p_n/p_0, ..., p_n]. The same equivalence holds for ψ. By Corollary 1.3.12 ⊨ φ* ↔ ¬φ, ⊨ ψ* ↔ ¬ψ. Since ⊨ φ ↔ ψ, also ⊨ ¬φ ↔ ¬ψ. Hence ⊨ φ* ↔ ψ*, and therefore ⊨ φ*[¬p_0, ..., ¬p_n/p_0, ..., p_n] ↔

Exercises (continued)

3. Show that {¬} is not a functionally complete set of connectives. Idem for {→, ∨} (hint: show that for each formula φ with only → and ∨ there is a valuation v such that ⟦φ⟧_v = 1).
4. Show that the Sheffer stroke, |, forms a functionally complete set (hint: ⊨ ¬φ ↔ φ | φ).
5. Show that the connective ↓, with valuation function ⟦φ ↓ ψ⟧ = 1 iff ⟦φ⟧ = ⟦ψ⟧ = 0, forms a functionally complete set (neither φ, nor ψ).
6. Show that | and ↓ are the only binary connectives $ such that {$} is functionally complete.
7. The functional completeness of {∨, ¬} can be shown in an alternative way. Let $ be an n-ary connective with valuation function ⟦$(p_1, ..., p_n)⟧ = f(⟦p_1⟧, ..., ⟦p_n⟧). We want a proposition τ (in ∨, ¬) such that ⟦τ⟧ = f(⟦p_1⟧, ..., ⟦p_n⟧). Suppose f(⟦p_1⟧, ..., ⟦p_n⟧) = 1 at least once. Consider all tuples (⟦p_1⟧, ..., ⟦p_n⟧) with f(⟦p_1⟧, ..., ⟦p_n⟧) = 1 and form corresponding conjunctions p̄_1 ∧ p̄_2 ∧ ... ∧ p̄_n such that p̄_i = p_i if ⟦p_i⟧ = 1, p̄_i = ¬p_i if ⟦p_i⟧ = 0.
Then show ⊨ (p̄_1¹ ∧ p̄_2¹ ∧ ... ∧ p̄_n¹) ∨ ... ∨ (p̄_1ᵏ ∧ p̄_2ᵏ ∧ ... ∧ p̄_nᵏ) ↔ $(p_1, ..., p_n), where the disjunction is taken over all n-tuples such that f(⟦p_1⟧, ..., ⟦p_n⟧) = 1. Alternatively, we can consider the tuples for which f(⟦p_1⟧, ..., ⟦p_n⟧) = 0. Carry out the details. Note that this proof of the functional completeness at the same time proves the Normal Form Theorems.
8. Let the ternary connective $ be defined by ⟦$(φ_1, φ_2, φ_3)⟧ = 1 ⇔ ⟦φ_1⟧ + ⟦φ_2⟧ + ⟦φ_3⟧ ≥ 2 (the majority connective). Express $ in terms of ∨ and ¬.

Proof of the Duality Theorem (conclusion). ... ↔ ψ*[¬p_0, ..., ¬p_n/p_0, ..., p_n]. Using the above relation between φ^d and φ* we now obtain ⊨ φ^d ↔ ψ^d. The converse follows immediately, as φ^dd = φ. ∎

The Duality Theorem gives us one identity for free for each identity we establish.

1.4 Natural Deduction

In the preceding sections we have adopted the view that propositional logic is based on truth tables, i.e. we have looked at logic from a semantical point of view. This, however, is not the only possible point of view. If one thinks of logic as a codification of (exact) reasoning, then it should stay close to the practice of inference making, instead of basing itself on the notion of truth. We will now explore the non-semantic approach, by setting up a system for deriving conclusions from premises. Although this approach is of a formal nature, i.e. it abstains from interpreting the statements and rules, it is advisable to keep some interpretation in mind. We are going to introduce a number of derivation rules, which are, in a way, the atomic steps in a derivation. These derivation rules are designed (by Gentzen) to render the intuitive meaning of the connectives as faithfully as possible.

There is one minor problem, which at the same time is a major advantage, namely: our rules express the constructive meaning of the connectives.
This advantage will not be exploited now, but it is good to keep it in mind when dealing with logic (it is exploited in intuitionistic logic).

One small example: the principle of the excluded third tells us that ⊨ φ ∨ ¬φ, i.e., assuming that φ is a definite mathematical statement, either it or its negation must be true. Now consider some unsolved problem, e.g. Riemann's Hypothesis, call it R. Then either R is true, or ¬R is true. However, we do not know which of the two is true, so the constructive content of R ∨ ¬R is nil. Constructively, one would require a method to find out which of the alternatives holds.

The propositional connective which has a strikingly different meaning in a constructive and in a non-constructive approach is the disjunction. Therefore we restrict our language for the moment to the connectives ∧, → and ⊥. This is no real restriction as {→, ⊥} is a functionally complete set.

Our derivations consist of very simple steps, such as "from φ and φ → ψ conclude ψ", written as:

    φ   φ → ψ
    ----------
        ψ

Exercises (continued)

9. Let the binary connective # be defined by
[truth table not reproduced]
Express # in terms of ∨ and ¬.
10. Determine conjunctive and disjunctive normal forms for ¬(φ ↔ ψ), ((φ → ψ) → ψ) → ψ, (φ → (φ ∧ ¬ψ)) ∧ (ψ → (ψ ∧ ¬φ)).
11. Give a criterion for a conjunctive normal form to be a tautology.
12. Prove
    ⋀_{i≤n} φ_i ∨ ⋀_{j≤m} ψ_j ≈ ⋀_{i≤n, j≤m} (φ_i ∨ ψ_j) and
    ⋁_{i≤n} φ_i ∧ ⋁_{j≤m} ψ_j ≈ ⋁_{i≤n, j≤m} (φ_i ∧ ψ_j).
13. The set of all valuations, thought of as the set of all 0-1-sequences, forms a topological space, the so-called Cantor space C. The basic open sets are finite unions of sets of the form {v | ⟦p_{i_1}⟧_v = ... = ⟦p_{i_n}⟧_v = 1 and ⟦p_{j_1}⟧_v = ... = ⟦p_{j_m}⟧_v = 0}, i_k ≠ j_p for k ≤ n; p ≤ m. Define a function ⟦ ⟧ : PROP → P(C) (subsets of Cantor space) by: ⟦φ⟧ = {v | ⟦φ⟧_v = 1}.
(a) Show that ⟦φ⟧ is a basic open set (which is also closed),
(b) ⟦φ ∨ ψ⟧ = ⟦φ⟧ ∪ ⟦ψ⟧; ⟦φ ∧ ψ⟧ = ⟦φ⟧ ∩ ⟦ψ⟧; ⟦¬φ⟧ = ⟦φ⟧^c,
(c) ⊨ φ ⇔ ⟦φ⟧ = C; ⟦⊥⟧ = ∅; ⊨ φ → ψ ⇔ ⟦φ⟧ ⊆ ⟦ψ⟧.
Extend the mapping to sets of propositions Γ by ⟦Γ⟧ = {v | ⟦φ⟧_v = 1 for all φ ∈ Γ}. Note that ⟦Γ⟧ is closed.
(d) Γ ⊨ φ ⇔ ⟦Γ⟧ ⊆ ⟦φ⟧.
14. We can view the relation ⊨ φ → ψ as a kind of ordering. Put φ ⊏ ψ := ⊨ φ → ψ and ⊭ ψ → φ.
(i) for each φ, ψ such that φ ⊏ ψ, find σ with φ ⊏ σ ⊏ ψ,
(ii) find φ_1, φ_2, φ_3, ... such that φ_1 ⊏ φ_2 ⊏ φ_3 ⊏ φ_4 ⊏ ...,
(iii) show that for each φ, ψ with φ and ψ incomparable, there is a least σ with φ, ψ ⊏ σ.
15. Give a recursive definition of the simultaneous substitution φ[ψ_0, ..., ψ_n/p_0, ..., p_n] and formulate and prove the appropriate analogue of the Substitution Theorem (theorem 1.2.6).

The propositions above the line are premises, and the one below the line is the conclusion. The above example eliminated the connective →. We can also introduce connectives. The derivation rules for ∧ and → are separated into

    INTRODUCTION RULES              ELIMINATION RULES

    (∧I)   φ   ψ                    (∧E)   φ ∧ ψ       φ ∧ ψ
          -------                          -----       -----
           φ ∧ ψ                             φ           ψ

           [φ]
            ⋮
    (→I)    ψ                       (→E)   φ   φ → ψ
          -------                          ----------
           φ → ψ                               ψ

We have two rules for ⊥, both of which eliminate ⊥, but introduce a formula.

                                           [¬φ]
                                             ⋮
    (⊥)     ⊥                       (RAA)    ⊥
          -----                            -----
            φ                                φ

As usual '¬φ' is used here as an abbreviation for 'φ → ⊥'.

The rules for ∧ are evident: if we have φ and ψ we may conclude φ ∧ ψ, and if we have φ ∧ ψ we may conclude φ (or ψ). The introduction rule for implication has a different form. It states that, if we can derive ψ from φ (as a hypothesis), then we may conclude φ → ψ (without the hypothesis φ). This agrees with the intuitive meaning of implication: φ → ψ means "ψ follows from φ". We have written the rule (→ I) in the above form to suggest a derivation. The notation will become clearer after we have defined derivations. For the time being we will write the premises of a rule in the order that suits us best; later we will become more fastidious.

The rule (→ E) is also evident on the meaning of implication. If φ is given and we know that ψ follows from φ, then we have also ψ.
The falsum rule, (⊥), expresses that from an absurdity we can derive everything (ex falso sequitur quodlibet), and the reductio ad absurdum rule, (RAA), is a formulation of the principle of proof by contradiction: if one derives a contradiction from the hypothesis ¬φ, then one has a derivation of φ (without the hypothesis ¬φ, of course). In both (→ I) and (RAA) hypotheses disappear; this is indicated by the striking out of the hypothesis. We say that such a hypothesis is cancelled.

Let us digress for a moment on the cancellation of hypotheses. We first consider implication introduction. There is a well-known theorem in plane geometry which states that "if a triangle is isosceles, then the angles opposite the equal sides are equal to one another" (Euclid's Elements, Book I, proposition 5). This is shown as follows: we suppose that we have an isosceles triangle and then, in a number of steps, we deduce that the angles at the base are equal. Thence we conclude that the angles at the base are equal if the triangle is isosceles.

Query 1: do we still need the hypothesis that the triangle is isosceles? Of course not! We have, so to speak, incorporated this condition in the statement itself. It is precisely the role of conditional statements, such as "if it rains I will use my umbrella", to get rid of the obligation to require (or verify) the condition. In abstracto: if we can deduce ψ using the hypothesis φ, then φ → ψ is the case without the hypothesis φ (there may be other hypotheses, of course).

Query 2: is it forbidden to maintain the hypothesis? Answer: no, but it clearly is superfluous. As a matter of fact we usually experience superfluous conditions as confusing or even misleading, but that is rather a matter of the psychology of problem solving than of formal logic. Usually we want the best possible result, and it is intuitively clear that the more hypotheses we state for a theorem, the weaker our result is.
Therefore we will as a rule cancel as many hypotheses as possible.

In the case of reductio ad absurdum we also deal with cancellation of hypotheses. Again, let us consider an example. In analysis we introduce the notion of a convergent sequence (a_n) and subsequently the notion "a is a limit of (a_n)". The next step is to prove that for each convergent sequence there is a unique limit; we are interested in the part of the proof that shows that there is at most one limit. Such a proof may run as follows: we suppose that there are two distinct limits a and a′, and from this hypothesis, a ≠ a′, we derive a contradiction. Conclusion: a = a′. In this case we of course drop the hypothesis a ≠ a′; this time it is not a case of being superfluous, but of being in conflict! So, both in the case of (→ I) and of (RAA), it is sound practice to cancel all occurrences of the hypothesis concerned.

In order to master the technique of Natural Deduction, and to get familiar with the technique of cancellation, one cannot do better than to look at a few concrete cases. So before we go on to the notion of derivation we consider a few examples.

[derivations I, II and III: tree diagrams not reproduced]

If we use the customary abbreviation '¬φ' for 'φ → ⊥', we can bring some derivations into a more convenient form. (Recall that ¬φ and φ → ⊥, as given in 1.2, are semantically equivalent.) We rewrite derivation II using the abbreviation:

[derivation: tree diagram not reproduced]

In the following example we use the negation sign and also the bi-implication; φ ↔ ψ for (φ → ψ) ∧ (ψ → φ).

[derivation: tree diagram not reproduced]

One can just as well present derivations as (linear) strings of propositions: we will stick, however, to the tree form, the idea being that what comes naturally in tree form should not be put in a linear straight-jacket.

We now have to define the notion of derivation in general. We will use an inductive definition to produce trees.

Notation:
cp + - + V' if cP ' cp' v v 1 1 v V' d cp c are derivations with conclusions cp, cp', t hen - - p' cp , a re derivations obtained by applying a derivation rule to cp ( and cp a nd 9'). d T he cancellation of a hypothesis is indicated as follows: if V is a derivation cp [@I with hypothesis $, t hen ' P (7 is a derivation with y5 cancelled. T he examples show us that derivations have the form of trees. We show the trees below: W ith respect to the cancellation of hypotheses, we note that one does not necessarily cancel all occurrences of such a proposition $. T his clearly is justified, as one feels t hat adding hypotheses does not make a proposition underivable (irrelevant information may always be added). It is a matter of prudence, however, to cancel as much as possible. Why carry more hypotheses t han necessary? Furthermore one may apply (-+ I) if there is no hypothesis available for cancellation e.g. Y , $ 4 + I is a correct derivation, using just (+ I ) . To sum ~ i t UP: given a derivation tree of 11 , we obtain a derivation tree of cp -+ 11, (or 'P) a t t he bottom of t he tree and striking out some (or all) occurrences, if of cp (or -y)on top of a tree. A few words on t he practical use of natural deduction: if you want to give a derivation fof a proposition it is advisable t o devise some kind of strategy, just 34 1. Propositional Logic 1.4 Natural Deduction 35 like in a game. Suppose that you want to show [cp A $ 4a] -+ [cp 4 ($ a)] (Example I II), t hen (since the proposition is an implicational formula) the rule (-+ I) suggests itself. SO t ry to derive cp -+ ($ -t a ) from cp A $ -+ a . 0 Now we know where to start and where to go to. To make use of cpA$ we want cpA$ (for (-+ E )),a nd to get cp -t ($ -+ a ) we want t o derive $ -t a from cp. So we may add cp a s a hypothesis and look for a derivation of $ a. Again, this asks for a derivation of a from $, so a dd $ as a hypothesis and look for a derivation of a . 
By now we have the following hypotheses available: cp A $ -+ a, cp a nd $. Keeping in mind that we want to eliminate cp A $ it is evident what we should do. The derivation I11 shows in detail how t o carry out the derivation. After making a number of derivations one gets the practical conviction that one should first take propositions apart from the bottom upwards, and then construct the required propositions by putting together the parts in a suitable way. This practical conviction is confirmed by the N ormalization Theorem, t o which we will return later. There is a particular point which tends t o confuse novices: + + + ID If cpA$ E X , t hen cpA$, pAy -ccp EX. 11, ( 2 1 ) If v E X , then I I E X and ' -cp If v V E X , then EX look very much alike. Are they not both cases of Reductio ad absurdum? As a matter of fact the leftmost derivation tells us (informally) that the assumption of cp leads to a contradiction, so cp c annot be the case. T his is in our terminology the meaning of "not " . T he rightmost derivation tells us that the assumption of l c p leads to a ntradiction, hence (by the same reasoning) 7 c p cannot be the case. So, on account of the meaning of negation, we only would get 7 - p . I t is by no means clear that l l c p is equivalent t o cp (indeed, this is denied by the intuitionists), so it is an extra property of our logic. (This is confirmed in a technical sense: i y c p -+ cp is not derivable in the system without RAA. We now return to our theoretical notions. $ T he bottom formula of a derivation is called its c onclusion. Since the class of derivations is inductively defined, we can mimic the results of section 1.1. E.g. we have a principle of induction o n 27: let A be a property. If A (D) for one element derivations and A is preserved under the clauses ( 2 ~ ( 2 +) )~ a nd (2 I ), t hen A (D) holds for all derivations. Likewise we can define mapp i n s on t he set of derivations by recursion (cf. Exercises 6 ,8). Definition 1.4.2. 
T he relation r t cp between sets of propositions and ~ ropositions s defined by: t here is a derivation with conclusion cp a nd with i "1 (uncancelled) hypotheses in r . 1 We say that cp is derivable from r . Note that by definition may contain many superfluous "hypotheses1'. The symbol k is called t urnstile . If r = 0,we write t cp, a nd we say that cp is a theorem. We could have avoided the notion of 'derivation' and taken instead the notion of 'derivabilityl as fundamental, see Exercise 9 . T he two notions, however, are closely related. r Definition 1.4.1. T he set of derivations is the smallest set X such that (1) T he one element tree cp belongs to X for all cp € PROP. 36 1. Propositional Logic 1.4 Natural Deduction 37 Proof. Immediate from the definition of derivation. We now list some theorems. 1 a nd ct a re used a s abbreviations. So now we have (cp - $) -- (-11, (cp + 1cp) (+ -- 79) -- (cp -+ --+ $1 -- $) * ('$ "P) 5. We already proved cp -- - 1cp as a n example. Conversely: Proof. T he result now follows. The numbers 6 a nd 7 a re left t o the reader. O T he system, outlined in this section, is called the "calculus of natural deduction" for a good reason. T hat is: its manner of making inferences corresponds t o the reasoning we intuitively use. The rules present means t o t ake formulas apart, or t o p ut t hem together. A derivation then consists of a skilful manipulation of the rules, the use of which is usually suggested by the form of the formula we want to prove. We will discuss one example in order to illustrate the general strategy of building derivations. Let us consider the converse of our previous example 111. To prove (cp A $ + a ) -+ [cp + ($ 4 a)]t here is just one initial step: . a=%me cp A $ +. o a nd t ry t o derive cp + ($ --+ a).Now we can either look a t t he assumption or a t t he desired result. 
Let us consider the latter one first: t o show cp -+ ( $ -+ g), we should assume cp a nd derive $ -- a , b ut for the latter we should assume $ a nd derive a. So, altogether we may assume cp A $ -- a a nd cp a nd $. Now the procedure itself: derive cp A $ from cp a nd $, a nd a from cp A $ a nd cp A $ a. P ut t ogether, we get the following derivation: + 4. For one direction, substitute I for a in 3, then k (cp Conversely: + $) + (T$ --t -9). 38 1. Propositional Logic 1.5 Completeness 39 Had we considered cp A $ -+ a first, then the only way to proceed is t o add cp A $ a nd apply -+ E. Now cp A $ either remains an assumption, or it is obtained from something else. It immediately occurs to the reader t o derive cp A $ from cp a nd $. B ut now he will build up the derivation we obtained above. Simple as this example seems, there are complications. In particular the rule of reductio ad absurdum is not nearly as natural as the other ones. Its use must be learned by practice; also a sense for the distinction between constructive and non-constructive will be helpful when trying to decide on when t o use it. Finally, we recall that T is an abbreviation for TIi.e. I-+ I ) . ( 6 . Analogous to the substitution operator for propositions we define a substitution operator for derivations. V[cp/p] is obtained by replacing each occurrence of p in each proposition in V by cp. Give a recursive definition of V[cp/p]. Show that V[cp/p] is a derivation if V is one, and that r I- a a r[cp/p] t- a[cp/p]. Remark: for several purposes finer notions of substitution a re required, but this one will do for us. 7. ( Substitution T h e o r e m ) I- (91 9 2 ) ($[cpl/p] ++ $[cpz/p]). Hint: use induction on $; t he theorem will also follow from the Substitution Theorem for +, once we have established the Completeness Theorem. + + + 8. T he s ize, s (V), of a derivation is the number of proposition occurrences in 2). Give an inductive definition of s (V). 
Show that one can prove properties of derivations by induction on the size.

9. Give an inductive definition of the relation $\vdash$ (use the list of Lemma 1.4.3), show that this relation coincides with the derived relation of Definition 1.4.2. Conclude that each $\Gamma$ with $\Gamma\vdash\varphi$ contains a finite $\Delta$, such that also $\Delta\vdash\varphi$.

1.5 Completeness

In the present section we will show that "truth" and "derivability" coincide, to be precise: the relations "$\models$" and "$\vdash$" coincide. The easy part of the claim is: "derivability" implies "truth"; for derivability is established by the existence of a derivation. The latter notion is inductively defined, so we can prove the implication by induction on the derivation.

Lemma 1.5.1 (Soundness). $\Gamma\vdash\varphi\Rightarrow\Gamma\models\varphi$.

Proof. Since, by Definition 1.4.2, $\Gamma\vdash\varphi$ iff there is a derivation $\mathcal{D}$ with all its hypotheses in $\Gamma$, it suffices to show: for each derivation $\mathcal{D}$ with conclusion $\varphi$ and hypotheses in $\Gamma$ we have $\Gamma\models\varphi$. We now use induction on $\mathcal{D}$.

(basis) If $\mathcal{D}$ has one element, then evidently $\varphi\in\Gamma$. The reader easily sees that $\Gamma\models\varphi$.

$(\wedge I)$ Induction hypothesis: $\begin{array}{c}\mathcal{D}\\ \varphi\end{array}$ and $\begin{array}{c}\mathcal{D}'\\ \varphi'\end{array}$ are derivations and for each $\Gamma$, $\Gamma'$ containing the hypotheses of $\mathcal{D}$, $\mathcal{D}'$, $\Gamma\models\varphi$, $\Gamma'\models\varphi'$. Now let $\Gamma''$ contain the hypotheses of $\begin{array}{c}\mathcal{D}\quad\mathcal{D}'\\ \varphi\quad\varphi'\\ \hline \varphi\wedge\varphi'\end{array}$. Choosing $\Gamma$ and $\Gamma'$ to be precisely the set of hypotheses of $\mathcal{D}$, $\mathcal{D}'$, we see that $\Gamma''\supseteq\Gamma\cup\Gamma'$. So $\Gamma''\models\varphi$ and $\Gamma''\models\varphi'$. Let $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma''$, then $[\![\varphi]\!]_v=[\![\varphi']\!]_v=1$, hence $[\![\varphi\wedge\varphi']\!]_v=1$. This shows $\Gamma''\models\varphi\wedge\varphi'$.

$(\wedge E)$ Induction hypothesis: for any $\Gamma$ containing the hypotheses of $\begin{array}{c}\mathcal{D}\\ \varphi\wedge\psi\end{array}$ we have $\Gamma\models\varphi\wedge\psi$. Consider a $\Gamma$ containing all hypotheses of $\begin{array}{c}\mathcal{D}\\ \varphi\wedge\psi\\ \hline \varphi\end{array}$ and $\begin{array}{c}\mathcal{D}\\ \varphi\wedge\psi\\ \hline \psi\end{array}$. It is left to the reader to show $\Gamma\models\varphi$ and $\Gamma\models\psi$.

$(\to I)$ Induction hypothesis: for any $\Gamma$ containing all hypotheses of $\begin{array}{c}\varphi\\ \mathcal{D}\\ \psi\end{array}$, $\Gamma\models\psi$. Let $\Gamma'$ contain all hypotheses of $\begin{array}{c}[\varphi]\\ \mathcal{D}\\ \psi\\ \hline \varphi\to\psi\end{array}$. Now $\Gamma'\cup\{\varphi\}$ contains all hypotheses of $\begin{array}{c}\varphi\\ \mathcal{D}\\ \psi\end{array}$, so if $[\![\varphi]\!]=1$ and $[\![\chi]\!]=1$ for all $\chi$ in $\Gamma'$, then $[\![\psi]\!]=1$. Therefore the truth table of $\to$ tells us that $[\![\varphi\to\psi]\!]=1$ if all propositions in $\Gamma'$ have value 1. Hence $\Gamma'\models\varphi\to\psi$.

$(\to E)$ An exercise for the reader.

$(\bot)$ Induction hypothesis: for each $\Gamma$ containing all hypotheses of $\begin{array}{c}\mathcal{D}\\ \bot\end{array}$, $\Gamma\models\bot$. Since $[\![\bot]\!]=0$ for all valuations, there is no valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma$. Let $\Gamma'$ contain all hypotheses of $\begin{array}{c}\mathcal{D}\\ \bot\\ \hline \varphi\end{array}$ and suppose that $\Gamma'\not\models\varphi$, then $[\![\psi]\!]=1$ for all $\psi\in\Gamma'$ and $[\![\varphi]\!]=0$ for some valuation. Since $\Gamma'$ contains all hypotheses of the first derivation we have a contradiction.

(RAA) Induction hypothesis: for each $\Gamma$ containing all hypotheses of $\begin{array}{c}\neg\varphi\\ \mathcal{D}\\ \bot\end{array}$, we have $\Gamma\models\bot$. Let $\Gamma'$ contain all hypotheses of $\begin{array}{c}[\neg\varphi]\\ \mathcal{D}\\ \bot\\ \hline \varphi\end{array}$ and suppose $\Gamma'\not\models\varphi$, then there exists a valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma'$ and $[\![\varphi]\!]=0$, i.e. $[\![\neg\varphi]\!]=1$. But $\Gamma''=\Gamma'\cup\{\neg\varphi\}$ contains all hypotheses of the first derivation and $[\![\psi]\!]=1$ for all $\psi\in\Gamma''$. This is impossible since $\Gamma''\models\bot$. Hence $\Gamma'\models\varphi$. $\Box$

This lemma may not seem very impressive, but it enables us to show that some propositions are not theorems, simply by showing that they are not tautologies. Without this lemma that would have been a very awkward task. We would have to show that there is no derivation (without hypotheses) of the given proposition. In general this requires insight in the nature of derivations, something which is beyond us at the moment.

Examples. $\not\vdash p_0$, $\not\vdash(\varphi\to\psi)\to\varphi\wedge\psi$.

In the first example take the constant 0 valuation. $[\![p_0]\!]=0$, so $\not\models p_0$ and hence $\not\vdash p_0$. In the second example we are faced with a meta proposition (a schema); strictly speaking it cannot be derivable (only real propositions can be). By $\vdash(\varphi\to\psi)\to\varphi\wedge\psi$ we mean that all propositions of that form (obtained by substituting real propositions for $\varphi$ and $\psi$, if you like) are derivable. To refute it we need only one instance which is not derivable. Take $\varphi=\psi=p_0$.

In order to prove the converse of Lemma 1.5.1 we need a few new notions. The first one has an impressive history; it is the notion of freedom from contradiction or consistency. It was made the cornerstone of the foundations of mathematics by Hilbert.

Definition 1.5.2. A set $\Gamma$ of propositions is consistent if $\Gamma\not\vdash\bot$.

In words: one cannot derive a contradiction from $\Gamma$. The consistency of $\Gamma$ can be expressed in various other forms:

Lemma 1.5.3. The following three conditions are equivalent:
(i) $\Gamma$ is consistent,
(ii) For no $\varphi$, $\Gamma\vdash\varphi$ and $\Gamma\vdash\neg\varphi$,
(iii) There is at least one $\varphi$ such that $\Gamma\not\vdash\varphi$.

Proof. Let us call $\Gamma$ inconsistent if $\Gamma\vdash\bot$, then we can just as well prove the equivalence of
(iv) $\Gamma$ is inconsistent,
(v) There is a $\varphi$ such that $\Gamma\vdash\varphi$ and $\Gamma\vdash\neg\varphi$,
(vi) $\Gamma\vdash\varphi$ for all $\varphi$.

(iv) $\Rightarrow$ (vi) Let $\Gamma\vdash\bot$, i.e. there is a derivation $\mathcal{D}$ with conclusion $\bot$ and hypotheses in $\Gamma$. By $(\bot)$ we can add one inference, $\bot\vdash\varphi$, to $\mathcal{D}$, so that $\Gamma\vdash\varphi$. This holds for all $\varphi$.
(vi) $\Rightarrow$ (v) Trivial.
(v) $\Rightarrow$ (iv) Let $\Gamma\vdash\varphi$ and $\Gamma\vdash\neg\varphi$. From the two associated derivations one obtains a derivation for $\Gamma\vdash\bot$ by $(\to E)$. $\Box$

Clause (vi) tells us why inconsistent sets (theories) are devoid of mathematical interest. For, if everything is derivable, we cannot distinguish between "good" and "bad" propositions. Mathematics tries to find distinctions, not to blur them.

In mathematical practice one tries to establish consistency by exhibiting a model (think of the consistency of the negation of Euclid's fifth postulate and the non-euclidean geometries). In the context of propositional logic this means looking for a suitable valuation.

Lemma 1.5.4. If there is a valuation such that $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma$, then $\Gamma$ is consistent.

Proof. Suppose $\Gamma\vdash\bot$, then by Lemma 1.5.1 $\Gamma\models\bot$, so for any valuation $v$: $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma$ $\Rightarrow$ $[\![\bot]\!]_v=1$. Since $[\![\bot]\!]_v=0$ for all valuations, there is no valuation with $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma$. Contradiction. Hence $\Gamma$ is consistent. $\Box$

Examples.
1. $\{p_0,\neg p_1,p_1\to p_2\}$ is consistent. A suitable valuation is one satisfying $[\![p_0]\!]=1$, $[\![p_1]\!]=0$.
2. $\{p_0,p_1,\ldots\}$ is consistent. Choose the constant 1 valuation.

Clause (v) of Lemma 1.5.3 tells us that $\Gamma\cup\{\varphi,\neg\varphi\}$ is inconsistent. Now, how could $\Gamma\cup\{\neg\varphi\}$ be inconsistent? It seems plausible to blame this on the derivability of $\varphi$. The following confirms this.

Lemma 1.5.5.
(a) $\Gamma\cup\{\neg\varphi\}$ is inconsistent $\Rightarrow$ $\Gamma\vdash\varphi$,
(b) $\Gamma\cup\{\varphi\}$ is inconsistent $\Rightarrow$ $\Gamma\vdash\neg\varphi$.

Proof. The assumptions of (a) and (b) yield the two derivations below:

$$\begin{array}{c}\neg\varphi\\ \mathcal{D}\\ \bot\end{array}\qquad\qquad\begin{array}{c}\varphi\\ \mathcal{D}'\\ \bot\end{array}$$

with conclusion $\bot$. By applying (RAA), and $(\to I)$, we obtain derivations with hypotheses in $\Gamma$, of $\varphi$, resp. $\neg\varphi$:

$$\begin{array}{c}[\neg\varphi]\\ \mathcal{D}\\ \bot\\ \hline \varphi\end{array}\ \text{RAA}\qquad\qquad\begin{array}{c}[\varphi]\\ \mathcal{D}'\\ \bot\\ \hline \neg\varphi\end{array}\ {\to}I$$

$\Box$

Definition 1.5.6. A set $\Gamma$ is maximally consistent iff
(a) $\Gamma$ is consistent,
(b) $\Gamma\subseteq\Gamma'$ and $\Gamma'$ consistent $\Rightarrow$ $\Gamma=\Gamma'$.

Remark. One could replace (b) by (b'): if $\Gamma$ is a proper subset of $\Gamma'$, then $\Gamma'$ is inconsistent. I.e., by just throwing in one extra proposition, the set becomes inconsistent.

Maximally consistent sets play an important role in logic. We will show that there are lots of them.

Here is one example: $\Gamma=\{\varphi\mid[\![\varphi]\!]=1\}$ for a fixed valuation. By Lemma 1.5.4 $\Gamma$ is consistent. Consider a consistent set $\Gamma'$ such that $\Gamma\subseteq\Gamma'$. Now let $\psi\in\Gamma'$ and suppose $[\![\psi]\!]=0$, then $[\![\neg\psi]\!]=1$, and so $\neg\psi\in\Gamma$. But since $\Gamma\subseteq\Gamma'$ this implies that $\Gamma'$ is inconsistent. Contradiction. Therefore $[\![\psi]\!]=1$ for all $\psi\in\Gamma'$, so by definition $\Gamma=\Gamma'$. From the proof of Lemma 1.5.11 it follows moreover, that this basically is the only kind of maximally consistent set we may expect.

The following fundamental lemma is proved directly. The reader may recognise in it an analogue of the Maximal Ideal Existence Lemma from ring theory (or the Boolean Prime Ideal Theorem), which is usually proved by an application of Zorn's Lemma.

Lemma 1.5.7. Each consistent set $\Gamma$ is contained in a maximally consistent set $\Gamma^*$.

Proof. There are countably many propositions, so suppose we have a list $\varphi_0,\varphi_1,\varphi_2,\ldots$ of all propositions (cf. Exercise 5). We define a non-decreasing sequence of sets $\Gamma_i$ such that the union is maximally consistent.

$$\Gamma_0=\Gamma,\qquad \Gamma_{n+1}=\begin{cases}\Gamma_n\cup\{\varphi_n\} & \text{if }\Gamma_n\cup\{\varphi_n\}\text{ is consistent,}\\ \Gamma_n & \text{else,}\end{cases}\qquad \Gamma^*=\bigcup\{\Gamma_n\mid n\geq 0\}.$$

(a) $\Gamma_n$ is consistent for all $n$. Immediate, by induction on $n$.

(b) $\Gamma^*$ is consistent. Suppose $\Gamma^*\vdash\bot$ then, by the definition of $\vdash$, there is a derivation $\mathcal{D}$ of $\bot$ with hypotheses in $\Gamma^*$; $\mathcal{D}$ has finitely many hypotheses $\psi_0,\ldots,\psi_k$. Since $\Gamma^*=\bigcup\{\Gamma_n\mid n\geq 0\}$, we have for each $i\leq k$: $\psi_i\in\Gamma_{n_i}$ for some $n_i$. Let $n$ be $\max\{n_i\mid i\leq k\}$, then $\psi_0,\ldots,\psi_k\in\Gamma_n$ and hence $\Gamma_n\vdash\bot$. But $\Gamma_n$ is consistent. Contradiction.

(c) $\Gamma^*$ is maximally consistent. Let $\Gamma^*\subseteq\Delta$ and $\Delta$ consistent. If $\psi\in\Delta$, then $\psi=\varphi_m$ for some $m$. Since $\Gamma_m\subseteq\Gamma^*\subseteq\Delta$ and $\Delta$ is consistent, $\Gamma_m\cup\{\varphi_m\}$ is consistent. Therefore $\Gamma_{m+1}=\Gamma_m\cup\{\varphi_m\}$, i.e. $\varphi_m\in\Gamma_{m+1}\subseteq\Gamma^*$. This shows $\Gamma^*=\Delta$. $\Box$

Lemma 1.5.8. If $\Gamma$ is maximally consistent, then $\Gamma$ is closed under derivability (i.e. $\Gamma\vdash\varphi\Rightarrow\varphi\in\Gamma$).

Proof. Let $\Gamma\vdash\varphi$ and suppose $\varphi\notin\Gamma$. Then $\Gamma\cup\{\varphi\}$ must be inconsistent. Hence $\Gamma\vdash\neg\varphi$, so $\Gamma$ is inconsistent. Contradiction. $\Box$

Lemma 1.5.9. Let $\Gamma$ be maximally consistent; then
(a) for all $\varphi$ either $\varphi\in\Gamma$, or $\neg\varphi\in\Gamma$,
(b) for all $\varphi,\psi$: $\varphi\to\psi\in\Gamma\Leftrightarrow(\varphi\in\Gamma\Rightarrow\psi\in\Gamma)$.

Proof. (a) We know that not both $\varphi$ and $\neg\varphi$ can belong to $\Gamma$. Consider $\Gamma'=\Gamma\cup\{\varphi\}$. If $\Gamma'$ is inconsistent, then, by 1.5.5, 1.5.8, $\neg\varphi\in\Gamma$. If $\Gamma'$ is consistent, then $\varphi\in\Gamma$ by the maximality of $\Gamma$.

(b) Let $\varphi\to\psi\in\Gamma$ and $\varphi\in\Gamma$. To show: $\psi\in\Gamma$. Since $\varphi,\varphi\to\psi\in\Gamma$ and since $\Gamma$ is closed under derivability (Lemma 1.5.8), we get $\psi\in\Gamma$ by $\to E$. Conversely: let $\varphi\in\Gamma\Rightarrow\psi\in\Gamma$. If $\varphi\in\Gamma$ then obviously $\Gamma\vdash\psi$, so $\Gamma\vdash\varphi\to\psi$. If $\varphi\notin\Gamma$, then $\neg\varphi\in\Gamma$, and hence $\Gamma\vdash\neg\varphi$. Therefore $\Gamma\vdash\varphi\to\psi$. $\Box$

Note that we automatically get the following:

Corollary 1.5.10. If $\Gamma$ is maximally consistent, then $\varphi\in\Gamma\Leftrightarrow\neg\varphi\notin\Gamma$, and $\neg\varphi\in\Gamma\Leftrightarrow\varphi\notin\Gamma$.

Lemma 1.5.11. If $\Gamma$ is consistent, then there exists a valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma$.

Proof. (a) By 1.5.7 $\Gamma$ is contained in a maximally consistent $\Gamma^*$.

(b) Define $v(p_i)=1$ if $p_i\in\Gamma^*$ and $v(p_i)=0$ else, and extend $v$ to the valuation $[\![\ ]\!]_v$.
Claim: $[\![\varphi]\!]_v=1\Leftrightarrow\varphi\in\Gamma^*$. Use induction on $\varphi$.
1. For atomic $\varphi$ the claim holds by definition.
2. $\varphi=\psi\wedge\sigma$. $[\![\varphi]\!]_v=1\Leftrightarrow[\![\psi]\!]_v=[\![\sigma]\!]_v=1\Leftrightarrow$ (induction hypothesis) $\psi,\sigma\in\Gamma^*$ and so $\varphi\in\Gamma^*$. Conversely $\psi\wedge\sigma\in\Gamma^*\Leftrightarrow\psi,\sigma\in\Gamma^*$ (1.5.8). The rest follows from the induction hypothesis.
3. $\varphi=\psi\to\sigma$. $[\![\psi\to\sigma]\!]_v=0\Leftrightarrow[\![\psi]\!]_v=1$ and $[\![\sigma]\!]_v=0\Leftrightarrow$ (induction hypothesis) $\psi\in\Gamma^*$ and $\sigma\notin\Gamma^*\Leftrightarrow\psi\to\sigma\notin\Gamma^*$ (by 1.5.9).

(c) Since $\Gamma\subseteq\Gamma^*$ we have $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma$. $\Box$

Corollary 1.5.12. $\Gamma\not\vdash\varphi\Leftrightarrow$ there is a valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma$ and $[\![\varphi]\!]=0$.

Proof. $\Gamma\not\vdash\varphi\Leftrightarrow\Gamma\cup\{\neg\varphi\}$ consistent $\Leftrightarrow$ there is a valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma\cup\{\neg\varphi\}$, or $[\![\psi]\!]=1$ for all $\psi\in\Gamma$ and $[\![\varphi]\!]=0$. $\Box$

Theorem 1.5.13 (Completeness Theorem). $\Gamma\vdash\varphi\Leftrightarrow\Gamma\models\varphi$.

Proof. $\Gamma\not\vdash\varphi\Rightarrow\Gamma\not\models\varphi$ by 1.5.12. The converse holds by 1.5.1. $\Box$

In particular we have $\vdash\varphi\Leftrightarrow\models\varphi$, so the set of theorems is exactly the set of tautologies.

The Completeness Theorem tells us that the tedious task of making derivations can be replaced by the (equally tedious, but automatic) task of checking tautologies. This simplifies the search for theorems considerably; for derivations one has to be (moderately) clever, for truth tables one has to possess perseverance.

For logical theories one sometimes considers another notion of completeness: a set $\Gamma$ is called complete if for each $\varphi$, either $\Gamma\vdash\varphi$, or $\Gamma\vdash\neg\varphi$. This notion is closely related to "maximally consistent". From Exercise 6 it follows that $Cons(\Gamma)=\{\sigma\mid\Gamma\vdash\sigma\}$ (the set of consequences of $\Gamma$) is maximally consistent if $\Gamma$ is a complete set. The converse also holds (cf. Exercise 10). Propositional logic itself (i.e.
the case $\Gamma=\emptyset$) is not complete in this sense, e.g. $\not\vdash p_0$ and $\not\vdash\neg p_0$.

There is another important notion which is traditionally considered in logic: that of decidability. Propositional logic is decidable in the following sense: there is an effective procedure to check the derivability of propositions $\varphi$. Put otherwise: there is an algorithm that for each $\varphi$ tests if $\vdash\varphi$.

The algorithm is simple: write down the complete truth table for $\varphi$ and check if the last column contains only 1's. If so, then $\models\varphi$ and, by the Completeness Theorem, $\vdash\varphi$. If not, then $\not\models\varphi$ and hence $\not\vdash\varphi$. This is certainly not the best possible algorithm, one can find more economical ones. There are also algorithms that give more information, e.g. they not only test $\vdash\varphi$, but also yield a derivation, if one exists. Such algorithms require, however, a deeper analysis of derivations. This falls outside the scope of the present book.

There is one aspect of the Completeness Theorem that we want to discuss now. It does not come as a surprise that truth follows from derivability. After all we start with a combinatorial notion, defined inductively, and we end up with 'being true for all valuations'. A simple inductive proof does the trick.

For the converse the situation is totally different. By definition $\Gamma\models\varphi$ means that $[\![\varphi]\!]_v=1$ for all valuations $v$ that make all propositions of $\Gamma$ true. So we know something about the behaviour of all valuations with respect to $\Gamma$ and $\varphi$. Can we hope to extract from such infinitely many set theoretical facts the finite, concrete information needed to build a derivation for $\Gamma\vdash\varphi$? Evidently the available facts do not give us much to go on. Let us therefore simplify matters a bit by cutting down the $\Gamma$; after all we use only finitely many formulas of $\Gamma$ in a derivation, so let us suppose that those formulas $\psi_1,\ldots,\psi_n$ are given. Now we can hope for more success, since only finitely many atoms are involved, and hence we can consider a finite "part" of the infinitely many valuations that play a role. That is to say only the restrictions of the valuations to the set of atoms occurring in $\psi_1,\ldots,\psi_n,\varphi$ are relevant. Let us simplify the problem one more step. We know that $\psi_1,\ldots,\psi_n\vdash\varphi$ ($\psi_1,\ldots,\psi_n\models\varphi$) can be replaced by $\vdash\psi_1\wedge\ldots\wedge\psi_n\to\varphi$ ($\models\psi_1\wedge\ldots\wedge\psi_n\to\varphi$), on the ground of the derivation rules (the definition of valuation). So we ask ourselves: given the truth table for a tautology $\sigma$, can we effectively find a derivation for $\sigma$? This question is not answered by the Completeness Theorem, since our proof of it is not effective (at least not prima facie so). It has been answered positively, e.g. by Post, Bernays and Kalmar (cf. Kleene IV, §29) and it is easily treated by means of Gentzen techniques, or semantic tableaux. We will just sketch a method of proof: we can effectively find a conjunctive normal form $\sigma^*$ for $\sigma$ such that $\vdash\sigma\leftrightarrow\sigma^*$. It is easily shown that $\sigma^*$ is a tautology iff each conjunct contains an atom and its negation, or $\neg\bot$, and glue it all together to obtain a derivation of $\sigma^*$, which immediately yields a derivation of $\sigma$.

Exercises

1. Check which of the following sets are consistent.
(a) $\{\neg p_1\wedge p_2\to p_0,\ p_1\to(\neg p_1\to p_2),\ p_0\leftrightarrow\neg p_2\}$,
(b) $\{p_0\to p_1,\ p_1\to p_2,\ p_2\to p_3,\ p_3\to\neg p_0\}$,
(c) $\{p_0\to p_1,\ p_0\wedge p_2\to p_1\wedge p_3,\ p_0\wedge p_2\wedge p_4\to p_1\wedge p_3\wedge p_5,\ \ldots\}$.

2. Show that the following are equivalent:
(a) $\{\varphi_1,\ldots,\varphi_n\}$ is consistent.
(b) $\not\vdash\neg(\varphi_1\wedge\varphi_2\wedge\ldots\wedge\varphi_n)$.
(c) $\not\vdash\varphi_1\wedge\varphi_2\wedge\ldots\wedge\varphi_{n-1}\to\neg\varphi_n$.

3. $\varphi$ is independent from $\Gamma$ if $\Gamma\not\vdash\varphi$ and $\Gamma\not\vdash\neg\varphi$. Show that: $p_1\to p_2$ is independent from $\{p_1\leftrightarrow p_0\wedge\neg p_2,\ p_2\to p_0\}$.

4. A set $\Gamma$ is independent if for each $\varphi\in\Gamma$: $\Gamma-\{\varphi\}\not\vdash\varphi$.
(a) Show that each finite set $\Gamma$ has an independent subset $\Delta$ such that $\Delta\vdash\varphi$ for all $\varphi\in\Gamma$.
(b) Let $\Gamma=\{\varphi_0,\varphi_1,\varphi_2,\ldots\}$. Find an equivalent set $\Gamma'=\{\psi_0,\psi_1,\ldots\}$ (i.e. $\Gamma\vdash\psi_i$ and $\Gamma'\vdash\varphi_i$ for all $i$) such that $\vdash\psi_{n+1}\to\psi_n$, but $\not\vdash\psi_n\to\psi_{n+1}$. Note that $\Gamma'$ may be finite.
(c) Consider an infinite $\Gamma'$ as in (b). Define $\sigma_0=\psi_0$, $\sigma_{n+1}=\psi_n\to\psi_{n+1}$. Show that $\Delta=\{\sigma_0,\sigma_1,\sigma_2,\ldots\}$ is independent and equivalent to $\Gamma'$.
(d) Show that each set $\Gamma$ is equivalent to an independent set $\Delta$.
(e) Show that $\Delta$ need not be a subset of $\Gamma$ (consider $\{p_0,\ p_0\wedge p_1,\ p_0\wedge p_1\wedge p_2,\ \ldots\}$).

5. Find an effective way of enumerating all propositions (hint: consider sets $\Gamma_n$ of all propositions of rank $\leq n$ with atoms from $p_0,\ldots,p_n$).

6. Show that a consistent set $\Gamma$ is maximally consistent if either $\varphi\in\Gamma$ or $\neg\varphi\in\Gamma$ for all $\varphi$.

7. Show that $\{p_0,p_1,p_2,\ldots,p_n,\ldots\}$ is complete.

8. (Compactness Theorem). Show: there is a $v$ such that $[\![\psi]\!]_v=1$ for all $\psi\in\Gamma$ $\Leftrightarrow$ for each finite subset $\Delta\subseteq\Gamma$ there is a $v$ such that $[\![\sigma]\!]_v=1$ for all $\sigma\in\Delta$. Formulated in terms of Exercise 13 of 1.3: $[\![\Gamma]\!]\neq\emptyset$ if $[\![\Delta]\!]\neq\emptyset$ for all finite $\Delta\subseteq\Gamma$.

9. Consider an infinite set $\{\varphi_1,\varphi_2,\varphi_3,\ldots\}$. If for each valuation there is an $n$ such that $[\![\varphi_n]\!]=1$, then there is an $m$ such that $\vdash\varphi_1\vee\ldots\vee\varphi_m$. (Hint: consider the negations $\neg\varphi_1,\neg\varphi_2,\ldots$ and apply Exercise 8.)

10. Show: $Cons(\Gamma)=\{\sigma\mid\Gamma\vdash\sigma\}$ is maximally consistent $\Leftrightarrow$ $\Gamma$ is complete.

11. Show: $\Gamma$ is maximally consistent $\Leftrightarrow$ there is a unique valuation such that $[\![\psi]\!]=1$ for all $\psi\in\Gamma$, where $\Gamma$ is a theory, i.e. $\Gamma$ is closed under $\vdash$ ($\Gamma\vdash\sigma\Rightarrow\sigma\in\Gamma$).

12. Let $\varphi$ be a proposition containing the atom $p$. For convenience we write $\varphi(\sigma)$ for $\varphi[\sigma/p]$. As before we abbreviate $\neg\bot$ by $\top$. Show:
(i) $\varphi(\top)\vdash\varphi(\top)\leftrightarrow\top$ and $\varphi(\top)\vdash\varphi(\varphi(\top))$.
(ii) $\neg\varphi(\top)\vdash\varphi(\top)\leftrightarrow\bot$; $\varphi(p),\neg\varphi(\top)\vdash p\leftrightarrow\bot$; $\varphi(p),\neg\varphi(\top)\vdash\varphi(\varphi(\top))$.
(iii) $\varphi(p)\vdash\varphi(\varphi(\top))$.

14. Let $\vdash\varphi\to\psi$.
We call $\sigma$ an interpolant if $\vdash\varphi\to\sigma$ and $\vdash\sigma\to\psi$, and moreover $\sigma$ contains only atoms common to $\varphi$ and $\psi$. Consider $\varphi(p,r)$, $\psi(r,q)$ with all atoms displayed. Show that $\varphi(\varphi(\top,r),r)$ is an interpolant (use Exercise 12, 13).

15. Prove the general Interpolation Theorem (Craig): For any $\varphi,\psi$ with $\vdash\varphi\to\psi$ there exists an interpolant (iterate the procedure of Exercise 13).

1.6 The Missing Connectives

The language of section 1.4 contained only the connectives $\wedge$, $\to$ and $\bot$. We already know that, from the semantical point of view, this language is sufficiently rich, i.e. the missing connectives can be defined. As a matter of fact we have already used the negation as a defined notion in the preceding sections.

It is a matter of sound mathematical practice to introduce new notions if their use simplifies our labour, and if they codify informal existing practice. This, clearly, is a reason for introducing $\neg$, $\leftrightarrow$ and $\vee$.

Now there are two ways to proceed: one can introduce the new connectives as abbreviations (of complicated propositions), or one can enrich the language by actually adding the connectives to the alphabet, and providing rules of derivation.

The first procedure was adopted above; it is completely harmless, e.g. each time one reads $\varphi\leftrightarrow\psi$, one has to replace it by $(\varphi\to\psi)\wedge(\psi\to\varphi)$. So it is nothing but a shorthand, introduced for convenience. The second procedure is of a more theoretical nature. The language is enriched and the set of derivations is enlarged. As a consequence one has to review the theoretical results (such as the Completeness Theorem) obtained for the simpler language.

We will adopt the first procedure and also outline the second approach.

Definition.
$\varphi\vee\psi:=\neg(\neg\varphi\wedge\neg\psi)$,
$\neg\varphi:=\varphi\to\bot$,
$\varphi\leftrightarrow\psi:=(\varphi\to\psi)\wedge(\psi\to\varphi)$.

N.B. This means that the above expressions are not part of the language, but abbreviations for certain propositions.

The properties of $\vee$, $\neg$ and $\leftrightarrow$ are given in the following:

[The statement of Lemma 1.6.2, clauses (i)-(vi), is not recoverable from the source.]

Proof. The only non-trivial part is (ii). We exhibit a derivation of $\sigma$ from $\Gamma$ and $\varphi\vee\psi$ (i.e. $\neg(\neg\varphi\wedge\neg\psi)$), given derivations $\mathcal{D}_1$ and $\mathcal{D}_2$ of $\Gamma,\varphi\vdash\sigma$ and $\Gamma,\psi\vdash\sigma$.

[Derivation tree not recoverable from the source.]

The remaining cases are left to the reader. $\Box$

Note that (i) and (ii) read as introduction and elimination rules for $\vee$, (iii) and (iv) as ditto for $\neg$, (vi) and (v) as ditto for $\leftrightarrow$. They legalise the following shortcuts in derivations:

[Rule schemas not recoverable from the source.]

Consider for example an application of $\vee E$:

[Derivation tree not recoverable from the source.]

This is a mere shorthand for

[Derivation tree not recoverable from the source.]

The reader is urged to use the above shortcuts in actual derivations, whenever convenient. As a rule, only $\vee I$ and $\vee E$ are of importance, the reader has of course recognised the rules for $\neg$ and $\leftrightarrow$ as slightly eccentric applications of familiar rules.

Examples. $\vdash(\varphi\wedge\psi)\vee\sigma\leftrightarrow(\varphi\vee\sigma)\wedge(\psi\vee\sigma)$.

[Derivation (1) not recoverable from the source.]

Conversely

[Derivation (2) not recoverable from the source.]

Combining (1) and (2) we get one derivation:

[Derivation tree not recoverable from the source.]

We now give a sketch of the second approach. We add $\vee$, $\neg$ and $\leftrightarrow$ to the language, and extend the set of propositions correspondingly. Next we add the rules for $\vee$, $\neg$ and $\leftrightarrow$ listed above to our stock of derivation rules. To be precise we should now also introduce a new derivability sign, we will however stick to the trusted $\vdash$ in the expectation that the reader will remember that now we are making derivations in a larger system. The following holds:

[The statement of the theorem is not recoverable from the source.]

Proof. Observe that by Lemma 1.6.2 the defined and the primitive (real) connectives obey exactly the same derivability relations (derivation rules, if you wish). This leads immediately to the desired result. Let us give one example.

$\varphi\vdash\neg(\neg\varphi\wedge\neg\psi)$ and $\psi\vdash\neg(\neg\varphi\wedge\neg\psi)$ (1.6.2 (i)), so by $\vee E$ we get $\varphi\vee\psi\vdash\neg(\neg\varphi\wedge\neg\psi)$ ... (1)
Conversely $\varphi\vdash\varphi\vee\psi$ (by $\vee I$), hence by 1.6.2 (ii) $\neg(\neg\varphi\wedge\neg\psi)\vdash\varphi\vee\psi$ ... (2)
Apply $\leftrightarrow I$ to (1) and (2), then $\vdash\varphi\vee\psi\leftrightarrow\neg(\neg\varphi\wedge\neg\psi)$. The rest is left to the reader. $\Box$

For more results the reader is directed to the exercises.

The rules for $\vee$, $\leftrightarrow$, and $\neg$ capture indeed the intuitive meaning of those connectives. Let us consider disjunction: $(\vee I)$: If we know $\varphi$ then we certainly know $\varphi\vee\psi$ (we even know exactly which disjunct). The $(\vee E)$-rule captures the idea of "proof by cases": if we know $\varphi\vee\psi$ and in each of both cases we can conclude $\sigma$, then we may outright conclude $\sigma$. Disjunction intuitively calls for a decision: which of the two disjuncts is given or may be assumed? This constructive streak of $\vee$ is crudely but conveniently blotted out by the identification of $\varphi\vee\psi$ and $\neg(\neg\varphi\wedge\neg\psi)$. The latter only tells us that $\varphi$ and $\psi$ cannot both be wrong, but not which one is right. For more information on this matter of constructiveness, which plays a role in demarcating the borderline between two-valued classical logic and effective intuitionistic logic, the reader is referred to Chapter 5.

Note that with $\vee$ as a primitive connective some theorems become harder to prove. E.g. $\vdash\neg(\neg\neg\varphi\wedge\neg\varphi)$ is trivial, but $\vdash\varphi\vee\neg\varphi$ is not. The following rule of the thumb may be useful: going from non-effective (or no) premises to an effective conclusion calls for an application of RAA.

Exercises

Consider the full language $\mathcal{L}$ with the connectives $\wedge$, $\to$, $\bot$, $\leftrightarrow$, $\vee$ and the restricted language $\mathcal{L}'$ with connectives $\wedge$, $\to$, $\bot$. Using the appropriate derivation rules we get the derivability notions $\vdash$ and $\vdash'$. We define an obvious translation from $\mathcal{L}$ into $\mathcal{L}'$:
$\varphi^+:=\varphi$ for atomic $\varphi$,
$(\varphi\,\Box\,\psi)^+:=\varphi^+\,\Box\,\psi^+$ for $\Box=\wedge,\to$,
$(\varphi\vee\psi)^+:=\neg(\neg\varphi^+\wedge\neg\psi^+)$, where $\neg$ is an abbreviation,
$(\varphi\leftrightarrow\psi)^+:=(\varphi^+\to\psi^+)\wedge(\psi^+\to\varphi^+)$,
$(\neg\varphi)^+:=\varphi^+\to\bot$.
Show
(i) $\vdash\varphi\leftrightarrow\varphi^+$,
(ii) $\vdash\varphi\Leftrightarrow\vdash'\varphi^+$,
(iii) $\varphi^+=\varphi$ for $\varphi\in\mathcal{L}'$.
(iv) Show that the full logic is conservative over the restricted logic, i.e. for $\varphi\in\mathcal{L}'$: $\vdash\varphi\Leftrightarrow\vdash'\varphi$.

Show that the Completeness Theorem holds for the full logic. Hint: use Exercise 2.

6. Show
(a) $\Gamma$ is complete $\Leftrightarrow$ ($\Gamma\vdash\varphi\vee\psi\Leftrightarrow\Gamma\vdash\varphi$ or $\Gamma\vdash\psi$, for all $\varphi,\psi$),
(b) $\Gamma$ is maximally consistent $\Leftrightarrow$ $\Gamma$ is a consistent theory and for all $\varphi,\psi$ ($\varphi\vee\psi\in\Gamma\Leftrightarrow\varphi\in\Gamma$ or $\psi\in\Gamma$).

2. Predicate Logic

2.1 Quantifiers

In propositional logic we used large chunks of mathematical language, namely those parts that can have a truth value. Unfortunately this use of language is patently insufficient for mathematical practice. A simple argument, such as "all squares are positive, 9 is a square, therefore 9 is positive" cannot be dealt with. From the propositional point of view the above sentence is of the form $\varphi\wedge\psi\to\sigma$, and there is no reason why this sentence should be true, although we obviously accept it as true. The moral is that we have to extend the language, in such a way as to be able to discuss objects and relations. In particular we wish to introduce means to talk about all objects of the domain of discourse, e.g. we want to allow statements of the form "all even numbers are a sum of two odd primes". Dually, we want a means of expressing "there exists an object such that ...", e.g. in "there exists a real number whose square is 2".

Experience has taught us that the basic mathematical statements are of the form "$a$ has the property $P$" or "$a$ and $b$ are in the relation $R$", etc. Examples are: "$n$ is even", "$f$ is differentiable", "$3=5$", "$7<12$", "$B$ is between $A$ and $C$". Therefore we build our language from symbols for properties, relations and objects. Furthermore we add variables to range over objects (so called individual variables), and the usual logical connectives now including the quantifiers $\forall$ and $\exists$ (for "for all" and "there exists").

We first give a few informal examples.
$\exists x P(x)$ - there is an $x$ with property $P$,
$\forall y P(y)$ - for all $y$ $P$ holds (all $y$ have the property $P$),
$\forall x \exists y (x = 2y)$ - for all $x$ there is a $y$ such that $x$ is two times $y$,
$\forall \varepsilon (\varepsilon > 0 \to \exists n (\frac{1}{n} < \varepsilon))$ - for all positive $\varepsilon$ there is an $n$ such that $\frac{1}{n} < \varepsilon$,
$x < y \to \exists z (x < z \wedge z < y)$ - if $x < y$, then there is a $z$ such that $x < z$ and $z < y$,
$\forall x \exists y (x \cdot y = 1)$ - for each $x$ there exists an inverse $y$.

We know from elementary set theory that functions are a special kind of relations. It would, however, be in flagrant conflict with mathematical practice to avoid functions (or mappings). Moreover, it would be extremely cumbersome. So we will incorporate functions in our language.

Roughly speaking the language deals with two categories of syntactical entities: one for objects - the terms, one for statements - the formulas. Examples of terms are: $17$, $x$, $(2 + 5) - 7$, $x^{3y+1}$.

What is the subject of predicate logic with a given language? Or, to put it differently, what are terms and formulas about? The answer is: formulas can express properties concerning a given set of relations and functions on a fixed domain of discourse. We have already met such situations in mathematics; we talked about structures, e.g. groups, rings, modules, ordered sets (see any algebra text). We will make structures our point of departure and we will get to the logic later.

In our logic we will speak about "all numbers" or "all elements", but not about "all ideals" or "all subsets", etc. Loosely speaking, our variables will vary over elements of a given universe (e.g. the $n \times n$ matrices over the reals), but not over properties or relations, or properties of properties, etc. For this reason the predicate logic of this book is called first-order logic, or also elementary logic. In everyday mathematics, e.g. analysis, one uses higher order logic. In a way it is a surprise that first-order logic can do so much for mathematics, as we will see.
A short introduction to second-order logic will be presented in chapter 4.

2.2 Structures

A group is a (non-empty) set equipped with two operations, a binary one and a unary one, and with a neutral element (satisfying certain laws). A partially ordered set is a set equipped with a binary relation (satisfying certain laws). We generalise this as follows:

Definition 2.2.1. A structure is an ordered sequence $(A, R_1, \ldots, R_n, F_1, \ldots, F_m, \{c_i \mid i \in I\})$, where $A$ is a non-empty set, $R_1, \ldots, R_n$ are relations on $A$, $F_1, \ldots, F_m$ are functions on $A$, and the $c_i$ ($i \in I$) are elements of $A$ (constants).

Warning. The functions $F_i$ are total, i.e. defined for all arguments; this sometimes calls for tricks, as with $0^{-1}$ (cf. p. 85).

Examples. $(\mathbb{R}, +, \cdot, {}^{-1}, 0, 1)$ - the field of real numbers, $(\mathbb{N}, <)$ - the ordered set of natural numbers.

We denote structures by Gothic capitals: $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \mathfrak{D}, \ldots$. The script letters are shown on page 102.

If we overlook for a moment the special properties of the relations and operations (e.g. commutativity of addition on the reals), then what remains is the type of a structure, which is given by the number of relations, functions (or operations), and their respective arguments, plus the number (cardinality) of constants.

Definition 2.2.2. The similarity type of a structure $\mathfrak{A} = (A, R_1, \ldots, R_n, F_1, \ldots, F_m, \{c_i \mid i \in I\})$ is a sequence $(r_1, \ldots, r_n; a_1, \ldots, a_m; \kappa)$, where $R_i \subseteq A^{r_i}$, $F_j : A^{a_j} \to A$, $\kappa = |\{c_i \mid i \in I\}|$ (cardinality of $I$).

The two structures in our example have (similarity) type $(-; 2, 2, 1; 2)$ and $(2; -; 0)$. The absence of relations or functions is indicated by $-$. There is no objection to extending the notion of structure to contain arbitrarily many relations or functions, but the most common structures have finite types (including finitely many constants).

It would, of course, have been better to use similar notations for our structures, i.e. $(A; R_1, \ldots, R_n; F_1, \ldots, F_m; c_i \mid i \in I)$, but that would be too pedantic.

If $R \subseteq A$, then we call $R$ a property (or unary relation); if $R \subseteq A^2$, then we call $R$ a binary relation; if $R \subseteq A^n$, then we call $R$ an $n$-ary relation.

The set $A$ is called the universe of $\mathfrak{A}$. Notation. $A = |\mathfrak{A}|$. $\mathfrak{A}$ is called (in)finite if its universe is (in)finite. We will mostly commit a slight abuse of language by writing down the constants instead of the set of constants; in the example of the field of real numbers we should have written $(\mathbb{R}, +, \cdot, {}^{-1}, \{0, 1\})$, but $(\mathbb{R}, +, \cdot, {}^{-1}, 0, 1)$ is more traditional.

Among the relations one finds in structures, there is a very special one: the identity (or equality) relation. Since mathematical structures, as a rule, are equipped with the identity relation, we do not list the relation separately. It does, therefore, not occur in the similarity type. We henceforth assume all structures to possess an identity relation and we will explicitly mention any exceptions. For purely logical investigations it makes, of course, perfect sense to consider a logic without identity, but the present book caters for readers from the mathematics or computer science community.

One also considers the "limiting cases" of relations and functions, i.e. 0-ary relations and functions. A 0-ary relation is a subset of $A^{\emptyset}$. Since $A^{\emptyset} = \{\emptyset\}$ there are two such relations: $\emptyset$ and $\{\emptyset\}$ (considered as ordinals: 0 and 1). 0-ary relations can thus be seen as truth values, which makes them play the role of the interpretations of propositions. In practice 0-ary relations do not crop up, e.g. they have no role to play in ordinary algebra. Most of the time the reader can joyfully forget about them; nonetheless we will allow them in our definition because they simplify certain considerations. A 0-ary function is a mapping from $A^{\emptyset}$ into $A$, i.e. a mapping from $\{\emptyset\}$ into $A$. Since the mapping has a singleton as domain, we can identify it with its range.
In this way 0-ary functions can play the role of constants. The advantage of the procedure is, however, negligible in the present context, so we will keep our constants.

Exercises

1. Write down the similarity type for the following structures:
(i) $(\mathbb{Q}, <, 0)$,
(ii) $(\mathbb{N}, +, \cdot, S, 0, 1, 2, 3, 4, \ldots, n, \ldots)$, where $S(x) = x + 1$,
(iii) $(\mathcal{P}(\mathbb{N}), \cup, \cap, {}^{c}, \emptyset)$,
(iv) $(\mathbb{Z}/(5), +, \cdot, -, {}^{-1}, 0, 1, 2, 3, 4)$,
(v) $(\{0, 1\}, \wedge, \vee, \to, \neg, 0, 1)$, where $\wedge, \vee, \to, \neg$ operate according to the ordinary truth tables,
(vi) $(\mathbb{R}, 1)$,
(vii) $(\mathbb{R})$,
(viii) $(\mathbb{R}, \mathbb{N}, <, T, {}^{2}, |\ |, -)$, where $T(a, b, c)$ is the relation "b is between a and c", ${}^{2}$ is the square function, and $|\ |$ the absolute value.

2. Give structures with type $(1, 1; -; 3)$, $(4; -; 0)$.

2.3 The Language of a Similarity Type

The considerations of this section are generalizations of those in section 1.1. Since the arguments are rather similar, we will leave a number of details to the reader. For convenience we fix the similarity type in this section: $(r_1, \ldots, r_n; a_1, \ldots, a_m; \kappa)$, where we assume $r_i \geq 0$, $a_j > 0$.

The alphabet consists of the following symbols:
1. Predicate symbols: $P_1, \ldots, P_n, \doteq$
2. Function symbols: $f_1, \ldots, f_m$
3. Constant symbols: $\bar{c}_i$ for $i \in I$
4. Variables: $x_0, x_1, x_2, \ldots$ (countably many)
5. Connectives: $\vee, \wedge, \to, \leftrightarrow, \neg, \bot, \forall, \exists$
6. Auxiliary symbols: $($, $)$, $,$

$\forall$ and $\exists$ are called the universal and existential quantifier. The curiously looking equality symbol has been chosen to avoid possible confusion; there are in fact a number of equality symbols in use: one to indicate the identity in the models, one to indicate the equality in the meta-language and the syntactic one introduced above. We will, however, practice the usual abuse of language, and use these distinctions only if it is really necessary. As a rule the reader will have no difficulty in recognising the kind of identity involved.

Next we define the two syntactical categories.

Definition 2.3.1. TERM is the smallest set $X$ with the properties
(i) $\bar{c}_i \in X$ ($i \in I$) and $x_i \in X$ ($i \in \mathbb{N}$),
(ii) $t_1, \ldots, t_{a_i} \in X \Rightarrow f_i(t_1, \ldots, t_{a_i}) \in X$, for $1 \leq i \leq m$.
TERM is our set of terms.

Definition 2.3.2. FORM is the smallest set $X$ with the properties:
(i) $\bot \in X$; $P_i \in X$ if $r_i = 0$; $t_1, \ldots, t_{r_i} \in TERM \Rightarrow P_i(t_1, \ldots, t_{r_i}) \in X$; $t_1, t_2 \in TERM \Rightarrow t_1 = t_2 \in X$,
(ii) $\varphi, \psi \in X \Rightarrow (\varphi \,\Box\, \psi) \in X$, where $\Box \in \{\wedge, \vee, \to, \leftrightarrow\}$,
(iii) $\varphi \in X \Rightarrow (\neg\varphi) \in X$,
(iv) $\varphi \in X \Rightarrow ((\forall x_i)\varphi), ((\exists x_i)\varphi) \in X$.
FORM is our set of formulas.

We have introduced $t_1 = t_2$ separately, but we could have subsumed it under the first clause. If convenient, we will not treat equality separately. The formulas introduced in (i) are called atoms. We point out that (i) includes the case of 0-ary predicate symbols, conveniently called proposition symbols.

A proposition symbol is interpreted as a 0-ary relation, i.e. as 0 or 1 (cf. 2.2.2). This is in accordance with the practice of propositional logic to interpret propositions as true or false. For our present purpose propositions are a luxury. In dealing with concrete mathematical situations (e.g. groups or posets) one has no reason to introduce propositions (things with a fixed truth value). However, propositions are convenient (and even important) in the context of Boolean-valued logic or Heyting-valued logic, and in syntactical considerations. We will, however, allow a special proposition: $\bot$, the symbol for the false proposition (cf. 1.2).

The logical connectives have what one could call a "domain of action", e.g. in $\varphi \to \psi$ the connective $\to$ yields the new formula $\varphi \to \psi$ from formulas $\varphi$ and $\psi$, and so $\to$ bears on $\varphi$, $\psi$ and all their parts. For propositional connectives this is not terribly interesting, but for quantifiers (and variable-binding operators in general) it is. The notion goes by the name of scope. So in $((\forall x)\varphi)$ and $((\exists x)\varphi)$, $\varphi$ is the scope of the quantifier. By locating the matching brackets one can effectively find the scope of a quantifier. If a variable, term or formula occurs in $\varphi$, we say that it is in the scope of the quantifier in $\forall x \varphi$ or $\exists x \varphi$.

Just as in the case of PROP, we have induction principles for TERM and FORM.

Lemma 2.3.3. Let $A(t)$ be a property of terms. If $A(t)$ holds for $t$ a variable or a constant, and if $A(t_1), A(t_2), \ldots, A(t_n) \Rightarrow A(f(t_1, \ldots, t_n))$, for all function symbols $f$, then $A(t)$ holds for all $t \in TERM$.

Proof. cf. 1.1.3. $\Box$

Lemma 2.3.4. Let $A(\varphi)$ be a property of formulas. If
(i) $A(\varphi)$ for atomic $\varphi$,
(ii) $A(\varphi), A(\psi) \Rightarrow A(\varphi \,\Box\, \psi)$,
(iii) $A(\varphi) \Rightarrow A(\neg\varphi)$,
(iv) $A(\varphi) \Rightarrow A((\forall x_i)\varphi), A((\exists x_i)\varphi)$ for all $i$,
then $A(\varphi)$ holds for all $\varphi \in FORM$.

Proof. cf. 1.1.3. $\Box$

We will straight away introduce a number of abbreviations. In the first place we adopt the bracket conventions of propositional logic. Furthermore we delete the outer brackets and the brackets round $\forall x$ and $\exists x$, whenever possible. We agree that quantifiers bind more strongly than binary connectives. Furthermore we join strings of quantifiers, e.g. $\forall x_1 x_2 \exists x_3 x_4 \varphi$ stands for $\forall x_1 \forall x_2 \exists x_3 \exists x_4 \varphi$. For better readability we will sometimes separate the quantifier and the formula by a dot: $\forall x . \varphi$. We will also assume that $n$ in $f(t_1, \ldots, t_n)$, $P(t_1, \ldots, t_n)$ always indicates the correct number of arguments.

A word of warning: the use of = might confuse a careless reader. The symbol "=" is used in the language $L$, where it is a proper syntactic object. It occurs in formulas such as $x_0 = x_7$, but it also occurs in the meta-language, e.g. in the form $x = y$, which must be read "x and y are one and the same variable".
However, the identity symbol in $x = y$ can just as well be the legitimate symbol from the alphabet, i.e. $x = y$ is a meta-atom, which can be converted into a proper atom by substituting genuine variable symbols for $x$ and $y$. Some authors use $\equiv$ for "syntactically identical", as in "x and y are the same variable". We will opt for "$=$" for the equality in structures (sets) and "$\doteq$" for the identity predicate symbol in the language. We will use $\doteq$ a few times, but we prefer to stick to a simple "$=$", trusting the alertness of the reader.

Example 2.3.5. Example of a language of type $(2; 2, 1; 1)$.
predicate symbols: $L$, $\doteq$
function symbols: $p$, $i$
constant symbol: $e$
Some terms: $t_1 := x_0$; $t_2 := p(x_1, x_2)$; $t_3 := p(e, e)$; $t_4 := i(x_7)$; $t_5 := p(i(p(x_2, e)), i(x_1))$.

(We have chosen a suggestive notation; think of the language of ordered groups: $L$ for "less than", $p$, $i$ for "product" and "inverse".) Note that the order in which the various symbols are listed is important. In our example $p$ has 2 arguments and $i$ has 1.

In mathematics there are a number of variable-binding operations, such as summation, integration and abstraction. Consider, for example, integration: in $\int \sin x \, dx$ the variable plays an unusual role for a variable. For $x$ cannot "vary"; we cannot (without writing nonsense) substitute any number we like for $x$. In the integral the variable $x$ is reduced to a tag. We say that the variable $x$ is bound by the integration symbol. Analogously we distinguish in logic between free and bound variables.

In defining various syntactical notions we again freely use the principle of definition by recursion (cf. 1.1.6). The justification is immediate: the value of a term (formula) is uniquely determined by the values of its parts. This allows us to find the value of $H(t)$ in finitely many steps.

Definition by Recursion on TERM: Let $H_0 : Var \cup Const \to A$ (i.e. $H_0$ is defined on variables and constants), $H_i : A^{a_i} \to A$, then there is a unique mapping $H : TERM \to A$ such that
$H(t) = H_0(t)$ for $t$ a variable or a constant,
$H(f_i(t_1, \ldots, t_{a_i})) = H_i(H(t_1), \ldots, H(t_{a_i}))$.

Definition by Recursion on FORM: Let $H_{at} : At \to A$ (i.e. $H_{at}$ is defined on atoms), $H_{\Box} : A^2 \to A$ ($\Box \in \{\vee, \wedge, \to, \leftrightarrow\}$), $H_{\neg} : A \to A$, $H_{\forall} : A \times \mathbb{N} \to A$, $H_{\exists} : A \times \mathbb{N} \to A$, then there is a unique mapping $H : FORM \to A$ such that
$H(\varphi) = H_{at}(\varphi)$ for atomic $\varphi$,
$H(\varphi \,\Box\, \psi) = H_{\Box}(H(\varphi), H(\psi))$,
$H(\neg\varphi) = H_{\neg}(H(\varphi))$,
$H(\forall x_i \varphi) = H_{\forall}(H(\varphi), i)$,
$H(\exists x_i \varphi) = H_{\exists}(H(\varphi), i)$.

Definition 2.3.6. The set $FV(t)$ of free variables of $t$ is defined by
(i) $FV(x_i) := \{x_i\}$, $FV(\bar{c}_i) := \emptyset$,
(ii) $FV(f(t_1, \ldots, t_n)) := FV(t_1) \cup \ldots \cup FV(t_n)$.

Remark. To avoid messy notation we will usually drop the indices and tacitly assume that the number of arguments is correct. The reader can easily provide the correct details, should he wish to do so.

Definition 2.3.7. The set $FV(\varphi)$ of free variables of $\varphi$ is defined by
(i) $FV(P(t_1, \ldots, t_p)) := FV(t_1) \cup \ldots \cup FV(t_p)$, $FV(t_1 = t_2) := FV(t_1) \cup FV(t_2)$, $FV(\bot) = FV(P) := \emptyset$ for $P$ a proposition symbol,
(ii) $FV(\varphi \,\Box\, \psi) := FV(\varphi) \cup FV(\psi)$, $FV(\neg\varphi) := FV(\varphi)$,
(iii) $FV(\forall x_i \varphi) := FV(\exists x_i \varphi) := FV(\varphi) - \{x_i\}$.

Substitution of formulas is defined as in the case of propositions; for convenience we use "\$" as a symbol for the propositional symbol (0-ary predicate symbol) that acts as a "place holder".

Definition 2.3.11. $\sigma[\varphi/\$]$ is defined by:
(i) $\sigma[\varphi/\$] := \sigma$ if $\sigma \not\equiv \$$, and $\sigma[\varphi/\$] := \varphi$ if $\sigma \equiv \$$, for atomic $\sigma$,
(ii) $(\sigma_1 \,\Box\, \sigma_2)[\varphi/\$] := \sigma_1[\varphi/\$] \,\Box\, \sigma_2[\varphi/\$]$, $(\neg\sigma_1)[\varphi/\$] := \neg\sigma_1[\varphi/\$]$, $(\forall y \sigma)[\varphi/\$] := \forall y . \sigma[\varphi/\$]$, $(\exists y \sigma)[\varphi/\$] := \exists y . \sigma[\varphi/\$]$.

Definition 2.3.8. $t$ or $\varphi$ is called closed if $FV(t) = \emptyset$, resp. $FV(\varphi) = \emptyset$. A closed formula is also called a sentence. A formula without quantifiers is called open. $TERM_c$ denotes the set of closed terms; $SENT$ denotes the set of sentences.

It is left to the reader to define the set $BV(\varphi)$ of bound variables of $\varphi$.

Continuation of Example 2.3.5. $FV(t_2) = \{x_1, x_2\}$; $FV(t_3) = \emptyset$; $FV(\varphi_2) = FV(t_3) \cup FV(t_4) = \{x_7\}$; $FV(\varphi_7) = \emptyset$; $BV(\varphi_4) = \emptyset$; $BV(\varphi_6) = \{x_0, x_1\}$. $\varphi_5$, $\varphi_6$, $\varphi_7$ are sentences.

Warning. $FV(\varphi) \cap BV(\varphi)$ need not be empty; in other words, the same variable may occur free and bound. To handle such situations one can consider free (resp. bound) occurrences of variables. When necessary we will informally make use of occurrences of variables.
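The recursive clauses for free variables can be transcribed almost literally into a program. The following is a minimal sketch, not part of the text: the class names (`Var`, `Const`, `Func`, `Atom`, `Bottom`, `Not`, `Bin`, `Quant`) are hypothetical, equality is treated as an ordinary atom with symbol "=", and the function `bv` is one possible answer to the exercise of defining the bound variables, given here as an assumption rather than as the book's definition.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# --- terms (hypothetical representation) ---
@dataclass(frozen=True)
class Var:
    index: int                      # the variable x_index

@dataclass(frozen=True)
class Const:
    name: str

@dataclass(frozen=True)
class Func:
    symbol: str
    args: Tuple["Term", ...]

Term = Union[Var, Const, Func]

# --- formulas (hypothetical representation) ---
@dataclass(frozen=True)
class Atom:                         # P(t1,...,tp); "=" is just another symbol;
    symbol: str                     # no arguments = proposition symbol
    args: Tuple[Term, ...] = ()

@dataclass(frozen=True)
class Bottom:
    pass

@dataclass(frozen=True)
class Not:
    sub: "Formula"

@dataclass(frozen=True)
class Bin:                          # binary connective: "and", "or", "->", "<->"
    op: str
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:                        # kind is "forall" or "exists"
    kind: str
    var: Var
    body: "Formula"

Formula = Union[Atom, Bottom, Not, Bin, Quant]

def fv_term(t: Term) -> frozenset:
    """Free variables of a term: variables count, constants do not."""
    if isinstance(t, Var):
        return frozenset({t})
    if isinstance(t, Const):
        return frozenset()
    return frozenset().union(*(fv_term(a) for a in t.args))

def fv(phi: Formula) -> frozenset:
    """Free variables of a formula, clause by clause."""
    if isinstance(phi, Atom):
        return frozenset().union(*(fv_term(a) for a in phi.args))
    if isinstance(phi, Bottom):
        return frozenset()
    if isinstance(phi, Not):
        return fv(phi.sub)
    if isinstance(phi, Bin):
        return fv(phi.left) | fv(phi.right)
    return fv(phi.body) - {phi.var}  # quantifier: remove the bound variable

def bv(phi: Formula) -> frozenset:
    """Bound variables (our own completion of the exercise): quantified variables."""
    if isinstance(phi, (Atom, Bottom)):
        return frozenset()
    if isinstance(phi, Not):
        return bv(phi.sub)
    if isinstance(phi, Bin):
        return bv(phi.left) | bv(phi.right)
    return bv(phi.body) | {phi.var}

def is_sentence(phi: Formula) -> bool:
    return not fv(phi)
```

With this representation a formula such as `Bin("->", Quant("forall", Var(1), Atom("=", (Var(1), Var(2)))), Atom("P", (Var(1),)))` has `Var(1)` in both `fv` and `bv`, which is exactly the situation the warning describes.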
Example. $\forall x_1 (x_1 = x_2) \to P(x_1)$ contains $x_1$ both free and bound, for the occurrence of $x_1$ in $P(x_1)$ is not within the scope of the quantifier.

In predicate calculus we have substitution operators for terms and for formulas.

Definition 2.3.9. Let $s$ and $t$ be terms, then $s[t/x]$ is defined by:
(i) $y[t/x] := y$ if $y \not\equiv x$, and $y[t/x] := t$ if $y \equiv x$; $c[t/x] := c$,
(ii) $f(t_1, \ldots, t_p)[t/x] := f(t_1[t/x], \ldots, t_p[t/x])$.
Note that in the clause (i) $y \equiv x$ means "x and y are the same variables".

Definition 2.3.10. $\varphi[t/x]$ is defined by:
(i) $\bot[t/x] := \bot$, $P[t/x] := P$ for propositions $P$, $P(t_1, \ldots, t_p)[t/x] := P(t_1[t/x], \ldots, t_p[t/x])$, $(t_1 = t_2)[t/x] := t_1[t/x] = t_2[t/x]$,
(ii) $(\varphi \,\Box\, \psi)[t/x] := \varphi[t/x] \,\Box\, \psi[t/x]$, $(\neg\varphi)[t/x] := \neg\varphi[t/x]$,
(iii) $(\forall y \varphi)[t/x] := \forall y \varphi[t/x]$ if $x \not\equiv y$, and $(\forall y \varphi)[t/x] := \forall y \varphi$ if $x \equiv y$; likewise for $\exists y \varphi$.

We will sometimes make simultaneous substitutions; the definition is a slight modification of definitions 2.3.9, 2.3.10 and 2.3.11. The reader is asked to write down the formal definitions. We denote the result of a simultaneous substitution of $t_1, \ldots, t_n$ for $y_1, \ldots, y_n$ in $t$ by $t[t_1, \ldots, t_n / y_1, \ldots, y_n]$ (similarly for $\varphi$). Note that a simultaneous substitution is not the same as its corresponding repeated substitution.

The quantifier clause in definition 2.3.10 forbids substitution for bound variables. There is, however, one more case we want to forbid: a substitution in which some variable after the substitution becomes bound. We will give an example of such a substitution; the reason why we forbid it is that it can change the truth value in an absurd way. At this moment we do not have a truth definition, so the argument is purely heuristic.

Example. $\exists x (y < x)[x/y] = \exists x (x < x)$. Note that the right-hand side is false in an ordered structure, whereas $\exists x (y < x)$ may very well be true. We make our restriction precise:

Definition 2.3.12. $t$ is free for $x$ in $\varphi$ if
(i) $\varphi$ is atomic,
(ii) $\varphi := \varphi_1 \,\Box\, \varphi_2$ (or $\varphi := \neg\varphi_1$) and $t$ is free for $x$ in $\varphi_1$ and $\varphi_2$ (resp. $\varphi_1$),
(iii) $\varphi := \exists y \psi$, or $\varphi := \forall y \psi$, and $y \notin FV(t)$ and $t$ is free for $x$ in $\psi$, where $x \not\equiv y$.

Examples.
1. $x_2$ is free for $x_0$ in $\exists x_3 P(x_0, x_3)$,
2.
$f(x_0, x_1)$ is not free for $x_0$ in $\exists x_1 P(x_0, x_3)$,
3. $x_5$ is free for $x_1$ in $P(x_1, x_3) \to \exists x_1 Q(x_1, x_2)$.

For all practical purposes the use of "t is free for x in $\varphi$" consists of the fact that the (free) variables of $t$ are not going to be bound after substitution in $\varphi$.

Lemma 2.3.13. $t$ is free for $x$ in $\varphi$ $\Leftrightarrow$ the variables of $t$ in $\varphi[t/x]$ are not in the scope of a quantifier.

Proof. Induction on $\varphi$.
- For atomic $\varphi$ the lemma is evident.
- $\varphi = \varphi_1 \,\Box\, \varphi_2$. $t$ is free for $x$ in $\varphi$ $\Leftrightarrow$ (def.) $t$ is free for $x$ in $\varphi_1$ and $t$ is free for $x$ in $\varphi_2$ $\Leftrightarrow$ (i.h.) the variables of $t$ in $\varphi_1[t/x]$ are not in the scope of a quantifier and the variables of $t$ in $\varphi_2[t/x]$ are not in the scope of a quantifier $\Leftrightarrow$ the variables of $t$ in $(\varphi_1 \,\Box\, \varphi_2)[t/x]$ are not in the scope of a quantifier.
- $\varphi = \neg\varphi_1$, similar.
- $\varphi = \forall y \psi$. $t$ is free for $x$ in $\varphi$ $\Leftrightarrow$ (def.) $y \notin FV(t)$ and $t$ is free for $x$ in $\psi$ $\Leftrightarrow$ (i.h.) the variables of $t$ are not in the scope of $\forall y$ and the variables of $t$ in $\psi[t/x]$ are not in the scope of (another) quantifier $\Leftrightarrow$ the variables of $t$ in $\varphi[t/x]$ are not in the scope of a quantifier. $\Box$

There is an analogous definition and lemma for the substitution of formulas.

Definition 2.3.14. $\varphi$ is free for \$ in $\sigma$ if:
(i) $\sigma$ is atomic,
(ii) $\sigma := \sigma_1 \,\Box\, \sigma_2$ (or $\neg\sigma_1$) and $\varphi$ is free for \$ in $\sigma_1$ and in $\sigma_2$ (or in $\sigma_1$),
(iii) $\sigma := \exists y \tau$ (or $\forall y \tau$) and $y \notin FV(\varphi)$ and $\varphi$ is free for \$ in $\tau$.

Lemma 2.3.15. $\varphi$ is free for \$ in $\sigma$ $\Leftrightarrow$ the free variables of $\varphi$ are in $\sigma[\varphi/\$]$ not in the scope of a quantifier.

Proof. As for Lemma 2.3.13. $\Box$

From now on we tacitly suppose that all our substitutions are "free for". For convenience we introduce an informal notation that simplifies reading and writing.

Notation. In order to simplify the substitution notation and to conform to an ancient suggestive tradition we will write down (meta-)expressions like $\varphi(x, y, z)$, $\psi(x, x)$, etc. This neither means that the listed variables occur free nor that no other ones occur free. It is merely a convenient way to handle substitution informally: $\varphi(t)$ is the result of replacing $x$ by $t$ in $\varphi(x)$; $\varphi(t)$ is called a substitution instance of $\varphi(x)$.

We use the languages introduced above to describe structures, or classes of structures of a given type. The predicate symbols, function symbols and constant symbols act as names for various relations, operations and constants. In describing a structure it is a great help to be able to refer to all elements of $|\mathfrak{A}|$ individually, i.e. to have names for all elements (if only as an auxiliary device). Therefore we introduce:

Definition 2.3.16. The extended language, $L(\mathfrak{A})$, of $\mathfrak{A}$ is obtained from the language $L$, of the type of $\mathfrak{A}$, by adding constant symbols for all elements of $|\mathfrak{A}|$. We denote the constant symbol belonging to $a \in |\mathfrak{A}|$ by $\bar{a}$.

Example. Consider the language $L$ of groups; then $L(\mathfrak{A})$, for $\mathfrak{A}$ the additive group of integers, has (extra) constant symbols $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{-1}, \overline{-2}, \overline{-3}, \ldots$. Observe that in this way 0 gets two names: the old one and one of the new ones. This is no problem: why should not something have more than one name?

Exercises

1. Write down an alphabet for the languages of the types given in Exercise 1 of section 2.2.
2. Write down five terms of the language belonging to Exercise 1, (iii), (viii). Write down two atomic formulas of the language belonging to Exercise 1, (vii) and two closed atoms for Exercise 1, (iii), (vi).
3. Write down an alphabet for languages of types $(3; 1, 1, 2; 0)$, $(-; 2; 0)$ and $(1; -; 3)$.
4.
Check which terms are free in substitution: (a) x for x in x = x , (b) y f o r x i n x = x , (c) x y for y in z = 0, ( d) n + y f o r y i n 3 x ( y = x ) , ( e) x y for z in 3w(w x = b ), the following cases, a nd carry o ut t he + + ( f ) x + w for z in Vw(x + z = n ) , ( g ) x + y for z in Vw(x + z = 0 ) A ( h) ~ Y (= x ), Z x+yfor zinVu(u=v) Vz(z = 9 ). -+ + 2 .4 Semantics T he art of interpreting (mathematical) statements presupposes a strict separation between "language" and the mathematical "universe" of entities. The objects of language are symbols, or strings of symbols, the entities of mathematics are numbers, sets, functions, triangles, etc. It is a matter for the philosophy of mathematics to reflect on the universe of mathematics; here we will simply accept it as given to us. Our requirements concerning the mathematical universe are, a t present, fairly modest. For example, ordinary set theory will do very well for us. Likewise our desiderata with respect 'to language are modest. We just suppose that there is an unlimited supply of symbols. The idea behind the semantics of predicate logic is very simple. Following Tarski, we assume that a statement a is true in a structure, if it is actu$lly t he case that a applies (the sentence "Snow is white" is true if snow actually is white). A m athematical example: "2 2 = $' is true in the structure of natural numbers (with addition) if 2 2 = 4 (i.e. if addition of the numbers 2 a nd 2 yields the number 4). Interpretation is the art of relating syntactic objects (strings of symbols) and states of affairs "in reality". We will start by giving an example of an interpretation in a simple case. We consider the structure U = ( Z, <, -, 0 ), i.e. t he ordered group of integers. The language has in its alphabet: predicatesymbols : =, L functionsymbols : P, M constantsymbol : Roughly speaking, we interpret m a s "its number", P a s plus, M as minus. Note that we interpret only closed terms. 
This stands t o reason, how should one assign a definite integer to x? Next we interpret sentences of L (U) by assigning one of the truth values 0 or 1 . As far as the propositional connectives are concerned, we follow the semantics for propositional logic. 41) v (t = s ) = 0, 0 else, A few remarks are in order. 1. I n fact we have defined a function v by recursion on cp. 2. T he valuation of a universally quantified formula is obtained by taking the minimum of all valuations of the individual instances, i.e. t he value is 1 ( true) iff all instances have the value 1. In this respect 'd is a generalisation of A. Likewise 3 is a generalisation of V. 3. v is uniquely determined by U, hence'v would be a more appropriate notation. For convenience we will, however, stick t o just v. 4. As in the semantics of propositional logic, we will write [cp]' for ~ '(cp), a nd when no confusion arises we will drop the subscript U. 5. It would be tempting to make our notation really uniform by writing it]' for t'. We will, however, keep both notations and use whichever is the most readable. The superscript notation has the drawback that it requires more brackets, but the I[ ]-notation does not improve readability. Examples. 1. ( P ( P ( 2 , 3 ) , ~ ( 7 ) ) ) = ~ ( 3 , 3 ) ' + ~ ( 7 ) '= ( +3') ' 2% (' = 2 3 -) 7 (-7) = - 2, 2. p=-I] = 0 , since 2 # -1, -3. @ A i -+ L (%,m)] = 1, since = i = 0 and [L(25, l o)] = 0; by the ll interpretation of the implication the value is 1, 4. [ Qx3y(L(x,y ))] = m in,(maxm[L(E,E))] [ (fi,m)] = 1 for m > n , so for fixed n, m axm[L(n,m)] = 1 , and hence min, m maxm[L(5i,E)] 1. = + + +, n + ++ L (U) has, in addition t o all that, constant symbols m for all m E Z. We first interpret the closed terms of L (U); t he interpretation t" of a term t is an element of Z . [o 68 2 . Predicate Logic 2.4 Semantics 69 Let US now present a definition of interpretation for t he general case. ,, Consider U = ( A, R 1, . . . , R n , F 1 , . . . 
, F {cili E I)) of a given similarity type ( T I , .. . , r n ; a l , . . . , a r n ;111). T he corresponding language has predicate symbols x l , . . . , R n, unction f symbols F 1 , . . . , F , a nd constant symbols z i. L (U), moreover, has constant symbols a for all a E IUI. D efinition 2.4.1. A n interpretation of the closed terms of L (U) in U, is a mapping (.)" : T E R M , + \%I satisfying: - c, (2) cp - a, a" = F ,(ty, . . . ,t:), where p = a i. ( ii) F i ( t l , . . . , t,))" D efinition 2.4.2. A n interpretation of the sentences cp of L (U) in U, is a mapping l[.] : S E N T -+ { 0,1), satisfying: a Definition 2.4.4. ( i) U cp i ff U k Cl(cp), (ii) cp iff U cp for all U (of the appropriate type), (iii) U r iff U for all 11, E r , U p ) , where r U {cp) consists of sentences. (iv) r cp i ff (U r + + + + + ++ * If U k a , we call U a model of a . I n general: if U r , we call U a We say that cp is true if p , cp is a semantic consequence of r if ?nodelof cp i.e. cp holds in each model of r . Note that this is all a straight-forward generalisation of 1.2.4. r r. + + If cp is a formula with free variables, say FV(cp) = { zl,. . . , z k), t hen we say t hat cp is satisfied by a l l . . . , ak E /%I if U k cp[al,. . . , E k / z l , . . . , zk],cp is called satisfiable in 2l if there are a l , . . . , ak such that cp is satisfied by a l, . . . , ak a nd cp is called satisfiable if it is satisfiable in some U. Note that cp is satisfiable in U i ff U 3z1 . . . zkcp. T he properties of the satisfaction relation are in understandable and convenient correspondence with the intuitive meaning of the connectives. + Proof. I mmediate from Definition 2.4.2. We will do two cases. (iv) U k cp + @ [Icp --+ + In = m a 4 1 - [Icp]n,[I$l~~) 1 . S upposeU k cp, = i.e. [Icp]" = 1 , then clearly [+]" = 1 , or U +. cp + 111, t hen cp + U I= +, a nd suppose U Conversely, let U = 1. [cp + $1" = m ax(1 - [[cp]%, [I+]") = 0.Hence [+]a = 0 a nd Contradiction. 
(vii) U 3xcp(x) @ m ax{([[cp(~)]Ula (%I) = 1 H t here is an a E IUI such E 0 = that [[cp(a]lU 1 H t here is an a E IUJ such that U cp(E). + In predicate logic there is a popular and convenient alternative for the valuation-notation: = I . We say t h a t u p is true, valid, in U" if Notation. U cp s tands for U b cp. T he relation is called the satisfaction relation. So far we have only defined truth for sentences of L (U). I n order t o extend k t o arbitrary formulas we introduce a new notation. Of course, the same notation is available in propositional logic - t here the cp for role of U is taken by the valuation, so one could very well write v u vnv = 1 + + + + + L emma 2.4.5 tells us that the interpretation of sentences in U r uns parallel t o t he construction of the sentences by means of the connectives. In other D efinition 2.4.3. Let FV(cp) = (21,. . . , z k ) , t hen Cl(cp) := V zl.. . zkp is the universal closure of cp (we assume the order of variables zi t o be fixed in some way). Words, we replace the connectives by their analogues in the meta-language and i nterpret the atoms by checking the relations in the structure. E g . , t ake our example of the ordered additive group of integers. U -Vx3y(x A P ( y , y )) H I t is not the case that for each number n there exists an m such t hat n = 2m @ not every number can be halved in U. T his clearly is correct, take for instance n = 1 . 70 2 . Predicate Logic 2.5 Simple Properties of Predicate Logic 71 symbols; a n Let us reflect for a moment on the valuation of i 0-ary relation is a subset of A @ = { @), .e. it is 0 or ( 0) a nd these are, = considered as ordinals, 0 or 1. So [[PIn P, and p is a t ruth value. This makes our definition perfectly reasonable. Indeed, without aiming for a systematic treatment, we may observe that formules correspond to subsets of A k, where k is the number of free variables. E.g. let FV(cp) = { zl,. . . 
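, $z_k$}, then we could stretch the meaning of $[\![\varphi]\!]_{\mathfrak{A}}$ a bit by putting

$$[\![\varphi]\!]_{\mathfrak{A}} = \{\langle a_1, \ldots, a_k\rangle \mid \mathfrak{A} \models \varphi(\overline{a}_1, \ldots, \overline{a}_k)\} = \{\langle a_1, \ldots, a_k\rangle \mid [\![\varphi(\overline{a}_1, \ldots, \overline{a}_k)]\!]_{\mathfrak{A}} = 1\}.$$

It is immediately clear that applying quantifiers to $\varphi$ reduces the "dimension", e.g. $[\![\exists x P(x,y)]\!]_{\mathfrak{A}} = \{a \mid \mathfrak{A} \models P(\overline{b}, \overline{a}) \text{ for some } b\}$, which is the projection of $[\![P(x,y)]\!]_{\mathfrak{A}}$ onto the $y$-axis.

Exercises

1. Let $\mathfrak{N} = \langle N, +, \cdot, S, 0\rangle$, and $L$ a language of type $\langle -; 2, 2, 1; 1\rangle$.
(i) Give two distinct terms $t$ in $L$ such that $t^{\mathfrak{N}} = 5$,
(ii) Show that for each natural number $n \in N$ there is a term $t$ such that $t^{\mathfrak{N}} = n$,
(iii) Show that for each $n \in N$ there are infinitely many terms $t$ such that $t^{\mathfrak{N}} = n$.

2. Let $\mathfrak{A}$ be the structure of exercise 1 (v) of section 2.2. Evaluate […].

3. Let $\mathfrak{A}$ be the structure of exercise 1 (viii). Evaluate […].

4. Which cases of Lemma 2.4.5 remain correct if we consider formulas in general?

5. For sentences $\sigma$ we have $\mathfrak{A} \models \sigma$ or $\mathfrak{A} \models \neg\sigma$. Show that this does not hold for $\varphi$ with $FV(\varphi) \neq \emptyset$. Show that not even for sentences $\models \sigma$ or $\models \neg\sigma$ holds.

6. Show for closed terms $t$ and formulas $\varphi$ (in $L(\mathfrak{A})$): $\mathfrak{A} \models t = \overline{[\![t]\!]_{\mathfrak{A}}}$ and $\mathfrak{A} \models \varphi(t) \leftrightarrow \varphi(\overline{[\![t]\!]_{\mathfrak{A}}})$. (We will also obtain this as a corollary to the Substitution Theorem, 2.5.8.)

7. Show that $\mathfrak{A} \models \varphi \Rightarrow \mathfrak{A} \models \psi$ for all $\mathfrak{A}$, implies $\models \varphi \Rightarrow \models \psi$, but not vice versa.

2.5 Simple Properties of Predicate Logic

Our definition of validity (truth) was a straightforward extension of the valuation-definition of propositional logic. As a consequence, formulas which are instances of tautologies are true in all structures $\mathfrak{A}$ (exercise 1). So we can copy many results from sections 1.2 and 1.3. We will use these results with a simple reference to propositional logic. The specific properties concerning quantifiers will be treated in this section. First we consider the generalisations of De Morgan's laws.

Theorem 2.5.1.
(i) $\models \neg\forall x\varphi \leftrightarrow \exists x\neg\varphi$
(ii) $\models \neg\exists x\varphi \leftrightarrow \forall x\neg\varphi$
(iii) $\models \forall x\varphi \leftrightarrow \neg\exists x\neg\varphi$
(iv) $\models \exists x\varphi \leftrightarrow \neg\forall x\neg\varphi$

Proof. If there are no free variables involved, then the above equivalences are almost trivial. We will do one general case.
(i) Let $FV(\forall x\varphi) = \{z_1, \ldots, z_k\}$, then we must show $\mathfrak{A} \models \forall z_1 \ldots z_k(\neg\forall x\varphi(x, z_1, \ldots, z_k) \leftrightarrow \exists x\neg\varphi(x, z_1, \ldots, z_k))$, for all $\mathfrak{A}$. So we have to show $\mathfrak{A} \models \neg\forall x\varphi(x, \overline{a}_1, \ldots, \overline{a}_k) \leftrightarrow \exists x\neg\varphi(x, \overline{a}_1, \ldots, \overline{a}_k)$ for arbitrary $a_1, \ldots, a_k \in |\mathfrak{A}|$. We apply the properties of $\models$ as listed in Lemma 2.4.5:
$\mathfrak{A} \models \neg\forall x\varphi(x, \overline{a}_1, \ldots, \overline{a}_k)$
$\Leftrightarrow \mathfrak{A} \not\models \forall x\varphi(x, \overline{a}_1, \ldots, \overline{a}_k)$
$\Leftrightarrow$ not for all $b \in |\mathfrak{A}|$: $\mathfrak{A} \models \varphi(\overline{b}, \overline{a}_1, \ldots, \overline{a}_k)$
$\Leftrightarrow$ there is a $b \in |\mathfrak{A}|$ such that $\mathfrak{A} \models \neg\varphi(\overline{b}, \overline{a}_1, \ldots, \overline{a}_k)$
$\Leftrightarrow \mathfrak{A} \models \exists x\neg\varphi(x, \overline{a}_1, \ldots, \overline{a}_k)$.
(ii) is similarly dealt with, (iii) can be obtained from (i), (ii), (iv) can be obtained from (i), (ii). □

The order of quantifiers of the same sort is irrelevant, and quantification over a variable that does not occur can be deleted.

Theorem 2.5.2.
(i) $\models \forall x\forall y\varphi \leftrightarrow \forall y\forall x\varphi$,
(ii) $\models \exists x\exists y\varphi \leftrightarrow \exists y\exists x\varphi$,
(iii) $\models \forall x\varphi \leftrightarrow \varphi$ if $x \notin FV(\varphi)$,
(iv) $\models \exists x\varphi \leftrightarrow \varphi$ if $x \notin FV(\varphi)$.

Proof. Left to the reader. □

We have already observed that $\forall$ and $\exists$ are, in a way, generalisations of $\wedge$ and $\vee$. Therefore it is not surprising that $\forall$ (resp. $\exists$) distributes over $\wedge$ (resp. $\vee$). $\forall$ (and $\exists$) distributes over $\vee$ (resp. $\wedge$) only if a certain condition is met.

Theorem 2.5.3.
(i) $\models \forall x(\varphi \wedge \psi) \leftrightarrow \forall x\varphi \wedge \forall x\psi$,
(ii) $\models \exists x(\varphi \vee \psi) \leftrightarrow \exists x\varphi \vee \exists x\psi$,
(iii) $\models \forall x(\varphi(x) \vee \psi) \leftrightarrow \forall x\varphi(x) \vee \psi$ if $x \notin FV(\psi)$,
(iv) $\models \exists x(\varphi(x) \wedge \psi) \leftrightarrow \exists x\varphi(x) \wedge \psi$ if $x \notin FV(\psi)$.

Proof. (i) and (ii) are immediate.
(iii) Let $FV(\forall x(\varphi(x) \vee \psi)) = \{z_1, \ldots, z_k\}$. We must show that $\mathfrak{A} \models \forall z_1 \ldots z_k[\forall x(\varphi(x) \vee \psi) \leftrightarrow \forall x\varphi(x) \vee \psi]$ for all $\mathfrak{A}$, so we show, using Lemma 2.4.5, that $\mathfrak{A} \models \forall x[\varphi(x, \overline{a}_1, \ldots, \overline{a}_k) \vee \psi(\overline{a}_1, \ldots, \overline{a}_k)] \Leftrightarrow \mathfrak{A} \models \forall x\varphi(x, \overline{a}_1, \ldots, \overline{a}_k) \vee \psi(\overline{a}_1, \ldots, \overline{a}_k)$ for all $\mathfrak{A}$ and all $a_1, \ldots, a_k \in |\mathfrak{A}|$. Note that in the course of the argument $a_1, \ldots, a_k$ remain fixed, so in the future we will no longer write them down; we write $(-)$ for the fixed parameters.
$\Leftarrow$: $\mathfrak{A} \models \forall x\varphi(x, -) \vee \psi(-)$ $\Leftrightarrow$ $\mathfrak{A} \models \forall x\varphi(x, -)$ or $\mathfrak{A} \models \psi(-)$ $\Leftrightarrow$ $\mathfrak{A} \models \varphi(\overline{b}, -)$ for all $b$, or $\mathfrak{A} \models \psi(-)$. If $\mathfrak{A} \models \psi(-)$, then also $\mathfrak{A} \models \varphi(\overline{b}, -) \vee \psi(-)$ for all $b$, and so $\mathfrak{A} \models \forall x(\varphi(x, -) \vee \psi(-))$. If for all $b$ $\mathfrak{A} \models \varphi(\overline{b}, -)$, then $\mathfrak{A} \models \varphi(\overline{b}, -) \vee \psi(-)$ for all $b$, so $\mathfrak{A} \models \forall x(\varphi(x, -) \vee \psi(-))$. In both cases we get the desired result.
$\Rightarrow$: We know that for each $b \in |\mathfrak{A}|$ $\mathfrak{A} \models \varphi(\overline{b}, -) \vee \psi(-)$. If $\mathfrak{A} \models \psi(-)$, then also $\mathfrak{A} \models \forall x\varphi(x, -) \vee \psi(-)$, so we are done. If $\mathfrak{A} \not\models \psi(-)$, then necessarily $\mathfrak{A} \models \varphi(\overline{b}, -)$ for all $b$, so $\mathfrak{A} \models \forall x\varphi(x, -)$ and hence $\mathfrak{A} \models \forall x\varphi(x, -) \vee \psi(-)$.
(iv) is similar. □

In the proof above we have demonstrated a technique for dealing with the extra free variables $z_1, \ldots, z_k$, that do not play an actual role. One chooses an arbitrary string of elements $a_1, \ldots, a_k$ to substitute for the $z_i$'s and keeps them fixed during the proof. So in future we will mostly ignore the extra variables.

WARNING. $\forall x(\varphi(x) \vee \psi(x)) \to \forall x\varphi(x) \vee \forall x\psi(x)$, and $\exists x\varphi(x) \wedge \exists x\psi(x) \to \exists x(\varphi(x) \wedge \psi(x))$ are not true.

One of the Cinderella tasks in logic is the bookkeeping of substitution, keeping track of things in iterated substitution, etc. We will provide a number of useful lemmas; none of them is difficult, it is a mere matter of clerical labour. A word of advice to the reader: none of these syntactical facts are hard to prove, nor is there a great deal to be learned from the proofs (unless one is after very specific goals, such as complexity of certain predicates); the best procedure is to give the proofs directly and only to look at the proofs in the book in case of emergency.

Lemma 2.5.4.
(i) Let $x$ and $y$ be distinct variables such that $x \notin FV(r)$, then $(t[s/x])[r/y] = (t[r/y])[s[r/y]/x]$,
(ii) let $x$ and $y$ be distinct variables such that $x \notin FV(s)$ and let $t$ and $s$ be free for $x$ and $y$ in $\varphi$, then $(\varphi[t/x])[s/y] = (\varphi[s/y])[t[s/y]/x]$,
(iii) let $\psi$ be free for $\$$ in $\varphi$, and let $t$ be free for $x$ in $\varphi$ and $\psi$, then $(\varphi[\psi/\$])[t/x] = (\varphi[t/x])[\psi[t/x]/\$]$,
(iv) let $\varphi$, $\psi$ be free for $\$_1$, $\$_2$ in $\sigma$, let $\psi$ be free for $\$_2$ in $\varphi$, and let $\$_1$ not occur in $\psi$, then $(\sigma[\varphi/\$_1])[\psi/\$_2] = (\sigma[\psi/\$_2])[\varphi[\psi/\$_2]/\$_1]$.

Proof. (i) Induction on $t$.
(ii) Induction on $\varphi$. Left to the reader.
(iii) Induction on $\varphi$.
- $\varphi = \bot$ or $P$ distinct from $\$$. Trivial.
- $\varphi = \$$. Then $(\$[\psi/\$])[t/x] = \psi[t/x]$ and $(\$[t/x])[\psi[t/x]/\$] = \$[\psi[t/x]/\$] = \psi[t/x]$.
- $\varphi = \varphi_1 \,\square\, \varphi_2$, $\neg\varphi_1$. Trivial.
- $\varphi = \forall y\varphi_1$. Then $((\forall y.\varphi_1)[\psi/\$])[t/x] = (\forall y.\varphi_1[\psi/\$])[t/x] = \forall y.((\varphi_1[\psi/\$])[t/x]) \overset{\text{i.h.}}{=} \forall y((\varphi_1[t/x])[\psi[t/x]/\$]) = ((\forall y\varphi_1)[t/x])[\psi[t/x]/\$]$. (Here 'i.h.' indicates the use of the induction hypothesis.)
- $\varphi = \exists y\varphi_1$. Idem.
(iv) Induction on $\sigma$. Left to the reader. □

We immediately get

Corollary 2.5.5.
(i) If $z \notin FV(t)$, then $t[\overline{a}/x] = (t[z/x])[\overline{a}/z]$,
(ii) If $z \notin FV(\varphi)$ and $z$ free for $x$ in $\varphi$, then $\varphi[\overline{a}/x] = (\varphi[z/x])[\overline{a}/z]$.

It is possible to pull out quantifiers from a formula. The trick is well-known in analysis: the bound variable in an integral may be changed. E.g. $\int x\,dx + \int \sin y\,dy = \int x\,dx + \int \sin x\,dx = \int (x + \sin x)\,dx$. In predicate logic we have a similar phenomenon.

Theorem 2.5.6 (Change of Bound Variables). If $x$, $y$ are free for $z$ in $\varphi$ and $x, y \notin FV(\varphi)$, then $\models \exists x\varphi[x/z] \leftrightarrow \exists y\varphi[y/z]$ and $\models \forall x\varphi[x/z] \leftrightarrow \forall y\varphi[y/z]$.

Proof. It suffices to consider $\varphi$ with $FV(\varphi) \subseteq \{z\}$. We have to show $\mathfrak{A} \models \exists x\varphi[x/z] \Leftrightarrow \mathfrak{A} \models \exists y\varphi[y/z]$ for any $\mathfrak{A}$.
$\mathfrak{A} \models \exists x\varphi[x/z]$ $\Leftrightarrow$ $\mathfrak{A} \models (\varphi[x/z])[\overline{a}/x]$ for some $a$ $\Leftrightarrow$ $\mathfrak{A} \models \varphi[\overline{a}/z]$ for some $a$ $\Leftrightarrow$ $\mathfrak{A} \models (\varphi[y/z])[\overline{a}/y]$ for some $a$ $\Leftrightarrow$ $\mathfrak{A} \models \exists y\varphi[y/z]$.
The universal quantifier is handled completely similarly. □

The upshot of this theorem is that one can always replace a bound variable by a "fresh" one, i.e. one that did not occur in the formula. From this one easily concludes

Corollary 2.5.7. Every formula is equivalent to one in which no variable occurs both free and bound.

We now can pull out quantifiers: $\forall x\varphi(x) \vee \forall x\psi(x) \leftrightarrow \forall x\varphi(x) \vee \forall y\psi(y)$ and $\forall x\varphi(x) \vee \forall y\psi(y) \leftrightarrow \forall xy(\varphi(x) \vee \psi(y))$, for a suitable $y$.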
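The pull-out of quantifiers, and the WARNING earlier in this section, can be sanity-checked by brute force on small finite structures. The following is only a sketch: the helper names (`interpretations`, `holds_everywhere`, and so on) are made up for this illustration, the formulas $\varphi$ and $\psi$ are taken to be unary predicates $P$ and $Q$, and a search over universes of size at most 3 can refute a formula but does not prove validity.

```python
from itertools import product

def interpretations(domain):
    """Yield every pair (P, Q) of unary relations (as sets) on the domain."""
    subsets = [
        {a for a, keep in zip(domain, bits) if keep}
        for bits in product([False, True], repeat=len(domain))
    ]
    return product(subsets, repeat=2)

def pulled_out_equivalence(domain, P, Q):
    """forall x P(x) or forall x Q(x)  <->  forall x forall y (P(x) or Q(y))."""
    left = all(a in P for a in domain) or all(a in Q for a in domain)
    right = all((a in P) or (b in Q) for a in domain for b in domain)
    return left == right

def naive_distribution(domain, P, Q):
    """forall x (P(x) or Q(x))  ->  forall x P(x) or forall x Q(x)."""
    antecedent = all((a in P) or (a in Q) for a in domain)
    consequent = all(a in P for a in domain) or all(a in Q for a in domain)
    return (not antecedent) or consequent

def holds_everywhere(check, max_size=3):
    """True if `check` holds for all P, Q on universes {0}, {0,1}, ..., up to max_size."""
    return all(
        check(list(range(n)), P, Q)
        for n in range(1, max_size + 1)
        for P, Q in interpretations(list(range(n)))
    )

pull_out_ok = holds_everywhere(pulled_out_equivalence)
naive_ok = holds_everywhere(naive_distribution)

# First countermodel to the naive distribution law on a two-element universe.
counterexample = next(
    (P, Q) for P, Q in interpretations([0, 1])
    if not naive_distribution([0, 1], P, Q)
)
```

The countermodel found splits the universe between $P$ and $Q$: every element satisfies one of them, but neither predicate holds everywhere. Renaming the second bound variable to $y$ is what makes the pulled-out version correct.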
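In order to handle predicate logic in an algebraic way we need the technique of substituting equivalents for equivalents.

Theorem 2.5.8 (Substitution Theorem).
(i) $\models t_1 = t_2 \to s[t_1/x] = s[t_2/x]$
(ii) $\models t_1 = t_2 \to (\varphi[t_1/x] \leftrightarrow \varphi[t_2/x])$
(iii) $\models (\varphi \leftrightarrow \psi) \to (\sigma[\varphi/\$] \leftrightarrow \sigma[\psi/\$])$

Proof. It is no restriction to assume that the terms and formulas are closed. We tacitly assume that the substitutions satisfy the "free for" conditions.

(i) Let $\mathfrak{A} \models t_1 = t_2$, i.e. $t_1^{\mathfrak{A}} = t_2^{\mathfrak{A}}$. Now use induction on $s$.
- $s$ is a constant or a variable. Trivial.
- $s = F(s_1, \ldots, s_k)$. Then $s[t_i/x] = F(s_1[t_i/x], \ldots)$ and $(s[t_i/x])^{\mathfrak{A}} = F((s_1[t_i/x])^{\mathfrak{A}}, \ldots)$. Induction hypothesis: $(s_j[t_1/x])^{\mathfrak{A}} = (s_j[t_2/x])^{\mathfrak{A}}$, $1 \le j \le k$. So $(s[t_1/x])^{\mathfrak{A}} = F((s_1[t_1/x])^{\mathfrak{A}}, \ldots) = F((s_1[t_2/x])^{\mathfrak{A}}, \ldots) = (s[t_2/x])^{\mathfrak{A}}$. Hence $\mathfrak{A} \models s[t_1/x] = s[t_2/x]$.

(ii) Let $\mathfrak{A} \models t_1 = t_2$, so $t_1^{\mathfrak{A}} = t_2^{\mathfrak{A}}$. We show $\mathfrak{A} \models \varphi[t_1/x] \Leftrightarrow \mathfrak{A} \models \varphi[t_2/x]$ by induction on $\varphi$.
- $\varphi$ is atomic. The case of a propositional symbol (including $\bot$) is trivial. So consider $\varphi = P(s_1, \ldots, s_k)$. $\mathfrak{A} \models P(s_1, \ldots, s_k)[t_1/x]$ $\Leftrightarrow$ $\mathfrak{A} \models P(s_1[t_1/x], \ldots)$ $\Leftrightarrow$ $\langle (s_1[t_1/x])^{\mathfrak{A}}, \ldots, (s_k[t_1/x])^{\mathfrak{A}}\rangle \in P$. By (i) $(s_j[t_1/x])^{\mathfrak{A}} = (s_j[t_2/x])^{\mathfrak{A}}$, $j = 1, \ldots, k$. So we get $\langle (s_1[t_1/x])^{\mathfrak{A}}, \ldots\rangle \in P$ $\Leftrightarrow \ldots \Leftrightarrow$ $\mathfrak{A} \models P(s_1, \ldots)[t_2/x]$.
- $\varphi = \varphi_1 \vee \varphi_2$, $\varphi_1 \wedge \varphi_2$, $\varphi_1 \to \varphi_2$, $\neg\varphi_1$. We consider the disjunction: $\mathfrak{A} \models (\varphi_1 \vee \varphi_2)[t_1/x]$ $\Leftrightarrow$ $\mathfrak{A} \models \varphi_1[t_1/x]$ or $\mathfrak{A} \models \varphi_2[t_1/x]$ $\overset{\text{i.h.}}{\Leftrightarrow}$ $\mathfrak{A} \models \varphi_1[t_2/x]$ or $\mathfrak{A} \models \varphi_2[t_2/x]$ $\Leftrightarrow$ $\mathfrak{A} \models (\varphi_1 \vee \varphi_2)[t_2/x]$. The remaining connectives are treated similarly.
- $\varphi = \exists y\psi$, $\varphi = \forall y\psi$. We consider the existential quantifier. $\mathfrak{A} \models (\exists y\psi)[t_1/x]$ $\Leftrightarrow$ $\mathfrak{A} \models \exists y(\psi[t_1/x])$ $\Leftrightarrow$ $\mathfrak{A} \models \psi[t_1/x][\overline{a}/y]$ for some $a$. By Lemma 2.5.4, $\mathfrak{A} \models \psi[t_1/x][\overline{a}/y]$ $\Leftrightarrow$ $\mathfrak{A} \models (\psi[\overline{a}/y])[t_1[\overline{a}/y]/x]$. Apply the induction hypothesis to $\psi[\overline{a}/y]$ and the terms $t_1[\overline{a}/y]$, $t_2[\overline{a}/y]$. Observe that $t_1$ and $t_2$ are closed, so $t_1[\overline{a}/y] = t_1$ and $t_2 = t_2[\overline{a}/y]$. We get $\mathfrak{A} \models \psi[t_2/x][\overline{a}/y]$, and hence $\mathfrak{A} \models \exists y\psi[t_2/x]$. The other implication is similar, so is the case of the universal quantifier.

(iii) Let $\mathfrak{A} \models \varphi \Leftrightarrow \mathfrak{A} \models \psi$. We show $\mathfrak{A} \models \sigma[\varphi/\$] \Leftrightarrow \mathfrak{A} \models \sigma[\psi/\$]$ by induction on $\sigma$.
- $\sigma$ is atomic. Both cases $\sigma = \$$ and $\sigma \neq \$$ are trivial.
- $\sigma = \sigma_1 \,\square\, \sigma_2$ (or $\neg\sigma_1$). Left to the reader.
- $\sigma = \forall x.\tau$. Observe that $\varphi$ and $\psi$ are closed, but even if they were not then $x$ could not occur free in $\varphi$, $\psi$. $\mathfrak{A} \models (\forall x.\tau)[\varphi/\$]$ $\Leftrightarrow$ $\mathfrak{A} \models \forall x(\tau[\varphi/\$])$. Pick an $a \in |\mathfrak{A}|$, then $\mathfrak{A} \models (\tau[\varphi/\$])[\overline{a}/x]$ $\overset{2.5.4}{\Leftrightarrow}$ $\mathfrak{A} \models (\tau[\overline{a}/x])[\varphi[\overline{a}/x]/\$]$ $\Leftrightarrow$ $\mathfrak{A} \models (\tau[\overline{a}/x])[\varphi/\$]$ $\overset{\text{i.h.}}{\Leftrightarrow}$ $\mathfrak{A} \models \tau[\overline{a}/x][\psi/\$]$ $\Leftrightarrow$ $\mathfrak{A} \models \tau[\overline{a}/x][\psi[\overline{a}/x]/\$]$ $\Leftrightarrow$ $\mathfrak{A} \models (\tau[\psi/\$])[\overline{a}/x]$. Hence $\mathfrak{A} \models \sigma[\varphi/\$] \Leftrightarrow \mathfrak{A} \models \sigma[\psi/\$]$.
The existential quantifier is treated similarly. □

Observe that in the above proof we have applied induction to "$\sigma[\varphi/\$]$ for all $\varphi$", because the substitution formula changed during the quantifier case. Note that also the $\sigma$ changed, so properly we are applying induction to the rank (or we have to formulate the induction principle 2.3.4 a bit more liberally).

Corollary 2.5.9.
(i) $[\![s[t/x]]\!] = [\![s[\overline{[\![t]\!]}/x]]\!]$
(ii) $[\![\varphi[t/x]]\!] = [\![\varphi[\overline{[\![t]\!]}/x]]\!]$

Proof. We apply the Substitution Theorem. Consider an arbitrary $\mathfrak{A}$. Note that $[\![\overline{[\![t]\!]}]\!] = [\![t]\!]$ (by definition), so $\mathfrak{A} \models \overline{[\![t]\!]} = t$. Now (i) and (ii) follow immediately. □

In a more relaxed notation, we can write (i) and (ii) as $[\![s(t)]\!] = [\![s(\overline{[\![t]\!]})]\!]$, or $\mathfrak{A} \models s(t) = s(\overline{[\![t]\!]})$, and $[\![\varphi(t)]\!] = [\![\varphi(\overline{[\![t]\!]})]\!]$, or $\mathfrak{A} \models \varphi(t) \leftrightarrow \varphi(\overline{[\![t]\!]})$. Observe that $[\![t]\!]$ ($= [\![t]\!]_{\mathfrak{A}}$) is just another way to write $t^{\mathfrak{A}}$.

Proofs involving detailed analysis of substitution are rather dreary but, unfortunately, unavoidable. The reader may simplify the above and other proofs by supposing the formulas involved to be closed. There is no real loss in generality, since we only introduce a number of constants from $L(\mathfrak{A})$ and check that the result is valid for all choices of constants.

We now really can manipulate formulae in an algebraic way. Again, write $\varphi$ eq $\psi$ for $\models \varphi \leftrightarrow \psi$.

Examples.

1.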
$\forall x\varphi(x) \to \psi$ eq $\neg\forall x\varphi(x) \vee \psi$ eq $\exists x(\neg\varphi(x)) \vee \psi$ eq $\exists x(\neg\varphi(x) \vee \psi)$ eq $\exists x(\varphi(x) \to \psi)$, where $x \notin FV(\psi)$.

2. $\forall x\varphi(x) \to \exists x\varphi(x)$ eq $\neg\forall x\varphi(x) \vee \exists x\varphi(x)$ eq $\exists x(\neg\varphi(x) \vee \varphi(x))$. The formula in the scope of the quantifier is true (already by propositional logic), so the formula itself is true.

Definition 2.5.10. A formula $\varphi$ is in prenex (normal) form if $\varphi$ consists of a (possibly empty) string of quantifiers followed by an open (i.e. quantifier free) formula. We also say that $\varphi$ is a prenex formula.

Examples. $\exists x\forall y\exists z\exists v(x = z \vee y = z \to v < y)$, $\forall x\forall y\exists z(P(x,y) \wedge Q(y,x) \to P(z,z))$.

By pulling out quantifiers we can reduce each formula to a formula in prenex form.

Theorem 2.5.11. For each $\varphi$ there is a prenex formula $\psi$ such that $\models \varphi \leftrightarrow \psi$.

Proof. First eliminate $\to$ and $\leftrightarrow$. Use induction on the resulting formula $\varphi'$. For atomic $\varphi'$ the theorem is trivial. If $\varphi' = \varphi_1 \vee \varphi_2$ and $\varphi_1$, $\varphi_2$ are equivalent to prenex $\psi_1$, $\psi_2$, then $\psi_1 = (Q_1 y_1) \ldots (Q_n y_n)\psi^1$ and $\psi_2 = (Q'_1 z_1) \ldots (Q'_m z_m)\psi^2$, where $Q_i$, $Q'_j$ are quantifiers and $\psi^1$, $\psi^2$ open. By Theorem 2.5.6 we can choose all bound variables distinct, taking care that no variable is both free and bound. Applying Theorem 2.5.3 we find $\models \varphi' \leftrightarrow (Q_1 y_1) \ldots (Q_n y_n)(Q'_1 z_1) \ldots (Q'_m z_m)(\psi^1 \vee \psi^2)$, so we are done. The remaining cases are left to the reader. □

In ordinary mathematics it is usually taken for granted that the benevolent reader can guess the intentions of the author, not only the explicit ones, but also the ones that are tacitly handed down through generations of mathematicians. Take for example the definition of convergence of a sequence: $\forall \varepsilon > 0\ \exists n \forall m(|a_n - a_{n+m}| < \varepsilon)$. In order to make sense out of this expression one has to add: the variables $n$, $m$ range over natural numbers. Unfortunately our syntax does not allow for variables of different sorts. So how do we incorporate expressions of the above kind? The answer is simple: we add predicates of the desired sort and indicate inside the formula the "nature" of the variable.

Example. Let $\mathfrak{A} = \langle R, Q, <\rangle$ be the structure of the reals with the set of rational numbers singled out, provided with the natural order. The sentence $\sigma := \forall xy(x < y \to \exists z(Q(z) \wedge x < z \wedge z < y))$ can be interpreted in $\mathfrak{A}$: $\mathfrak{A} \models \sigma$, and it tells us that the rationals are dense in the reals (in the natural ordering). We find this mode of expression, however, rather cumbersome. Therefore we introduce the notion of relativised quantifiers. Since it does not matter whether we express informally "$x$ is rational" by $x \in Q$ or $Q(x)$, we will suit ourselves and any time choose the notation which is most convenient. We use $(\exists x \in Q)$ and $(\forall x \in Q)$ as informal notation for "there exists an $x$ in $Q$" and "for all $x$ in $Q$". Now we can write $\sigma$ as $\forall xy(x < y \to \exists z \in Q(x < z \wedge z < y))$. Note that we do not write $(\forall xy \in R)(\ldots)$, since: (1) there is no relation $R$ in $\mathfrak{A}$, (2) variables automatically range over $|\mathfrak{A}| = R$.

Let us now define the relativisation of a quantifier properly:

Definition 2.5.12. If $P$ is a unary predicate symbol, then $(\forall x \in P)\varphi := \forall x(P(x) \to \varphi)$, $(\exists x \in P)\varphi := (\exists x)(P(x) \wedge \varphi)$.

This notation has the intended meaning, as appears from
$\mathfrak{A} \models (\forall x \in P)\varphi$ $\Leftrightarrow$ for all $a \in P^{\mathfrak{A}}$, $\mathfrak{A} \models \varphi[\overline{a}/x]$,
$\mathfrak{A} \models (\exists x \in P)\varphi$ $\Leftrightarrow$ there exists an $a \in P^{\mathfrak{A}}$ such that $\mathfrak{A} \models \varphi[\overline{a}/x]$.
The proof is immediate. We will often use informal notations, such as $(\forall x > 0)$ or $(\exists y \neq 1)$, which can be cast into the above form. The meaning of such notations will always be evident. One can restrict all quantifiers to the same set (predicate); this amounts to passing to a restricted universe (cf. Exercise 11).

It is a common observation that by strengthening a part of a conjunction (disjunction) the whole formula is strengthened, but that by strengthening $\varphi$ in $\neg\varphi$ the whole formula is weakened. This phenomenon has a syntactic origin, and we will introduce a bit of terminology to handle it smoothly.
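Returning briefly to Definition 2.5.12, the unfolding of relativised quantifiers can be illustrated on a finite structure. This is a minimal sketch under stated assumptions: the function names `forall_rel` and `exists_rel`, the universe $\{0, \ldots, 9\}$ and the predicate "is even" are all chosen for the illustration and do not come from the text.

```python
def forall_rel(universe, P, phi):
    """(forall x in P) phi, unfolded as forall x (P(x) -> phi(x))."""
    return all((not P(a)) or phi(a) for a in universe)

def exists_rel(universe, P, phi):
    """(exists x in P) phi, unfolded as exists x (P(x) and phi(x))."""
    return any(P(a) and phi(a) for a in universe)

universe = range(10)

def is_even(a):
    return a % 2 == 0

# The interpretation of P in the structure: the set of even elements.
evens = [a for a in universe if is_even(a)]

def below_nine(a):
    return a < 9

def above_eight(a):
    return a > 8

# Relativised quantifiers, computed by unfolding the definition.
rel_all = forall_rel(universe, is_even, below_nine)
rel_some = exists_rel(universe, is_even, above_eight)

# The intended meaning: quantify directly over the elements of P.
direct_all = all(below_nine(a) for a in evens)
direct_some = any(above_eight(a) for a in evens)

# Unrelativised quantifiers over the whole universe, for contrast.
plain_all = all(below_nine(a) for a in universe)
plain_some = any(above_eight(a) for a in universe)
```

Here the unfolded and the direct readings agree, while the unrelativised versions differ because of the odd element 9. Note the asymmetry in the definition: the universal quantifier is guarded by an implication, the existential one by a conjunction.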
We inductively define that a subformula occurrence $\varphi$ is positive (negative) in $\sigma$:

Definition 2.5.13. The notion "occurs positive (negative) in", $\varphi <^+ \sigma$ ($\varphi <^- \sigma$), is given by:
(i) $\varphi <^+ \varphi$
(ii) $\tau = \varphi \wedge \psi$, $\psi \wedge \varphi$, $\varphi \vee \psi$, $\psi \vee \varphi$, $\psi \to \varphi$ and $\tau <^+ \sigma$ $\Rightarrow$ $\varphi <^+ \sigma$
(iii) $\tau = \varphi \wedge \psi$, $\ldots$, $\psi \to \varphi$ and $\tau <^- \sigma$ $\Rightarrow$ $\varphi <^- \sigma$
(iv) $\tau = \varphi \to \psi$ and $\tau <^- \sigma$ $\Rightarrow$ $\varphi <^+ \sigma$
(v) $\tau = \varphi \to \psi$ and $\tau <^+ \sigma$ $\Rightarrow$ $\varphi <^- \sigma$
(vi) $\tau = \neg\varphi$ and $\tau <^- \sigma$ $\Rightarrow$ $\varphi <^+ \sigma$
(vii) $\tau = \neg\varphi$ and $\tau <^+ \sigma$ $\Rightarrow$ $\varphi <^- \sigma$
(viii) $\tau = \exists x\varphi$, $\forall x\varphi$ and $\tau <^+ \sigma$ $\Rightarrow$ $\varphi <^+ \sigma$
(ix) $\tau = \exists x\varphi$, $\forall x\varphi$ and $\tau <^- \sigma$ $\Rightarrow$ $\varphi <^- \sigma$

We could have restricted ourselves to $\wedge$, $\to$ and $\forall$, but it does not cost much extra space to handle the other connectives. Moreover, in intuitionistic logic the connectives are not interdefinable, so there we have to consider the full language. The following theorem makes the basic intuition clear: if a positive part of a formula increases in truth value then the formula increases in truth value (better: does not decrease in truth value). We express this role of positive and negative subformulas as follows:

[…]

Proof. Induction on $\sigma$. □

Exercises

1. Show that all propositional tautologies are true in all structures (of the right similarity type).

3. Show that the condition on $FV(\psi)$ in exercise 2 is necessary.

4. Show $\not\models \forall x\exists y\varphi \leftrightarrow \exists y\forall x\varphi$.

5. Show $\models \varphi \Rightarrow \models \forall x\varphi$ and $\models \exists x\varphi$.

6. Show $\not\models \exists x\varphi \to \forall x\varphi$.

7. Show $\not\models \exists x\varphi \wedge \exists x\psi \to \exists x(\varphi \wedge \psi)$.

8. Show that the condition on $x$, $y$ in Theorem 2.5.6 is necessary.

10. Show that the converses of exercise 9 (i)-(iii) and (v) do not hold.

11. Let $L$ have a unary predicate $P$. Define the relativisation $\sigma^P$ of $\sigma$ by
$\sigma^P := \sigma$ for atomic $\sigma$,
$(\varphi \,\square\, \psi)^P := \varphi^P \,\square\, \psi^P$,
$(\neg\varphi)^P := \neg\varphi^P$,
$(\forall x\varphi)^P := \forall x(P(x) \to \varphi^P)$,
$(\exists x\varphi)^P := \exists x(P(x) \wedge \varphi^P)$.
Let $\mathfrak{A}$ be a structure without functions and constants. Consider the structure $\mathfrak{B}$ with universe $P^{\mathfrak{A}}$ and relations which are restrictions of the relations of $\mathfrak{A}$, where $P^{\mathfrak{A}} \neq \emptyset$. Show $\mathfrak{A} \models \sigma^P \Leftrightarrow \mathfrak{B} \models \sigma$ for sentences $\sigma$. Why are only relations allowed in $\mathfrak{A}$?

12. Let $S$ be a binary predicate symbol. Show $\models \neg\exists y\forall x(S(y,x) \leftrightarrow \neg S(x,x))$. (Think of "$y$ shaves $x$" and recall Russell's barber's paradox.)

13. (i) Show that the condition "free for" cannot be dropped from 2.5.8.
(ii) Show $\models t = s \Rightarrow\ \models \varphi[t/x] \leftrightarrow \varphi[s/x]$.
(iii) Show $\models \varphi \leftrightarrow \psi \Rightarrow\ \models \sigma[\varphi/\$] \leftrightarrow \sigma[\psi/\$]$.

15. Show $\models \exists x(\varphi(x) \to \forall y\varphi(y))$. (It is instructive to think of $\varphi(x)$ as '$x$ drinks'.)

2.6 Identity

We have limited ourselves in this book to the consideration of structures with identity, and hence of languages with identity. Therefore we classified '$=$' as a logical symbol, rather than a mathematical one. We can, however, not treat $=$ as just some binary predicate, since identity satisfies a number of characteristic axioms, listed below.

$I_1$: $\forall x(x = x)$,
$I_2$: $\forall xy(x = y \to y = x)$,
$I_3$: $\forall xyz(x = y \wedge y = z \to x = z)$,
$I_4$: $\forall x_1 \ldots x_n y_1 \ldots y_n\big(\bigwedge_{i \le n} x_i = y_i \to t(x_1, \ldots, x_n) = t(y_1, \ldots, y_n)\big)$,
$\quad\ \forall x_1 \ldots x_n y_1 \ldots y_n\big(\bigwedge_{i \le n} x_i = y_i \to (\varphi(x_1, \ldots, x_n) \to \varphi(y_1, \ldots, y_n))\big)$.

One simply checks that $I_1$, $I_2$, $I_3$ are true in all structures $\mathfrak{A}$. For $I_4$, observe that we can suppose the formulas to be closed. Otherwise we add quantifiers for the remaining variables and add dummy identities, e.g.
$\forall z_1 \ldots z_k x_1 \ldots x_n y_1 \ldots y_n\big(\bigwedge_{i \le n} x_i = y_i \wedge \bigwedge_{i \le k} z_i = z_i \to t(x_1, \ldots, x_n) = t(y_1, \ldots, y_n)\big)$.
Now $(t(\overline{a}_1, \ldots, \overline{a}_n))^{\mathfrak{A}}$ defines a function $t^{\mathfrak{A}}$ on $|\mathfrak{A}|^n$, obtained from the given functions of $\mathfrak{A}$ by various substitutions, hence $a_i = b_i$ $(i \le n)$ $\Rightarrow$ $(t(\overline{a}_1, \ldots, \overline{a}_n))^{\mathfrak{A}} = (t(\overline{b}_1, \ldots, \overline{b}_n))^{\mathfrak{A}}$. This establishes the first part of $I_4$.

The second part is proved by induction on $\varphi$ (using the first part): e.g. consider the universal quantifier case and let $a_i = b_i$ for all $i \le n$.
$\mathfrak{A} \models \forall u\varphi(u, \overline{a}_1, \ldots, \overline{a}_n)$ $\Leftrightarrow$ $\mathfrak{A} \models \varphi(\overline{c}, \overline{a}_1, \ldots, \overline{a}_n)$ for all $c$ $\overset{\text{i.h.}}{\Leftrightarrow}$ $\mathfrak{A} \models \varphi(\overline{c}, \overline{b}_1, \ldots, \overline{b}_n)$ for all $c$ $\Leftrightarrow$ $\mathfrak{A} \models \forall u\varphi(u, \overline{b}_1, \ldots, \overline{b}_n)$.
So $\mathfrak{A} \models \big(\bigwedge_{i \le n} \overline{a}_i = \overline{b}_i\big) \to (\forall u\varphi(u, \overline{a}_1, \ldots, \overline{a}_n) \to \forall u\varphi(u, \overline{b}_1, \ldots, \overline{b}_n))$ for all $a_1, \ldots, a_n, b_1, \ldots, b_n$, hence
$\mathfrak{A} \models \forall x_1 \ldots x_n y_1 \ldots y_n\big(\bigwedge_{i \le n} x_i = y_i \to (\forall u\varphi(u, x_1, \ldots, x_n) \to \forall u\varphi(u, y_1, \ldots, y_n))\big)$.

Note that $\varphi$ (respectively $t$) in $I_4$ can be any formula (respectively term), so $I_4$ stands for infinitely many axioms. We call such an "instant axiom" an axiom schema.

The first three axioms state that identity is an equivalence relation. $I_4$ states that identity is a congruence with respect to all (definable) relations. It is important to realise that from the axioms alone, we cannot determine the precise nature of the interpreting relation. We explicitly adopt the convention that "$=$" will always be interpreted by real equality.

Exercises

1. Show $\models \forall x\exists y(x = y)$.

2. Show $\models \forall x(\varphi(x) \leftrightarrow \exists y(x = y \wedge \varphi(y)))$ and $\models \forall x(\varphi(x) \leftrightarrow \forall y(x = y \to \varphi(y)))$, where $y$ does not occur in $\varphi(x)$.

3. Show that $\models \varphi(t) \leftrightarrow \forall x(x = t \to \varphi(x))$ if $x \notin FV(t)$.

4. Show that the conditions in exercises 2 and 3 are necessary.

5. Consider $\sigma_1 = \forall x(x \sim x)$, $\sigma_2 = \forall xy(x \sim y \to y \sim x)$, $\sigma_3 = \forall xyz(x \sim y \wedge y \sim z \to x \sim z)$. Show that if $\mathfrak{A} \models \sigma_1 \wedge \sigma_2 \wedge \sigma_3$, where $\mathfrak{A} = \langle A, R\rangle$, then $R$ is an equivalence relation. N.B. $x \sim y$ is a suggestive notation for the atom $R(x, y)$.

6. Let $\sigma_4 = \forall xyz(x \sim y \wedge x \sim z \to y \sim z)$. Show that $\sigma_1, \sigma_4 \models \sigma_2 \wedge \sigma_3$.

7. Consider the schema $\sigma_5$: $x \sim y \to (\varphi[x/z] \to \varphi[y/z])$. Show that $\sigma_1, \sigma_5 \models \sigma_2 \wedge \sigma_3$. N.B. if $\sigma$ is a schema, then $\Delta \cup \{\sigma\} \models \varphi$ stands for $\Delta \cup \Sigma \models \varphi$, where $\Sigma$ consists of all instances of $\sigma$.

8. Derive the term-version of $I_4$ from the formula version.

2.7 Examples

We will consider languages for some familiar kinds of structures. Since all languages are built in the same way, we shall not list the logical symbols. All structures are supposed to satisfy the identity axioms $I_1$ to $I_4$. For a refinement see 2.10.2.

1. The language of identity. Type: $\langle -; -; 0\rangle$.

Alphabet.
Predicate symbol: $=$

The structures of this type are of the form $\mathfrak{A} = \langle A\rangle$, and satisfy $I_1$, $I_2$, $I_3$. (In this language $I_4$ follows from $I_1$, $I_2$, $I_3$, cf. 2.10 Exercise 5.)

In an identity structure there is so little "structure", that all one can virtually do is look for the number of elements (cardinality). There are sentences $\lambda_n$ and $\mu_n$ saying that there are at least (or at most) $n$ elements (Exercise 3, section 3.1):
$\lambda_n := \exists y_1 \ldots y_n \bigwedge_{i \neq j} y_i \neq y_j$, $(n > 1)$,
$\mu_n := \forall y_0 \ldots y_n \bigvee_{i \neq j} y_i = y_j$, $(n > 0)$.
So $\mathfrak{A} \models \lambda_n \wedge \mu_n$ iff $|\mathfrak{A}|$ has exactly $n$ elements. Since universes are not empty, $\models \exists x(x = x)$ always holds.

We can also formulate "there exists a unique $x$ such that $\ldots$".

Definition 2.7.1. $\exists! x\varphi(x) := \exists x(\varphi(x) \wedge \forall y(\varphi(y) \to x = y))$, where $y$ does not occur in $\varphi(x)$.

Note that $\exists! x\varphi(x)$ is an (informal) abbreviation.

2. The language of partial order. Type: $\langle 2; -; 0\rangle$.

Alphabet.
Predicate symbols: $=$, $\le$.

Abbreviations:
$x \neq y := \neg x = y$, $\quad x < y := x \le y \wedge x \neq y$,
$x > y := y < x$, $\quad x \ge y := y \le x$,
$x \le y \le z := x \le y \wedge y \le z$.

Definition 2.7.2. $\mathfrak{A}$ is a partially ordered set (poset) if $\mathfrak{A}$ is a model of
$\forall xyz(x \le y \le z \to x \le z)$,
$\forall xy(x \le y \le x \leftrightarrow x = y)$.

The notation may be misleading, since one usually introduces the relation $\le$ (e.g. on the reals) as a disjunction: $x < y$ or $x = y$. In our alphabet the relation is primitive; another symbol might have been preferable, but we chose to observe the tradition. Note that the relation is reflexive: $x \le x$.

Partially ordered sets are very basic in mathematics, they appear in many guises. It is often convenient to visualise posets by means of diagrams, where $a \le b$ is represented as equal or above (respectively to the right). One of the traditions in logic is to keep objects and their names apart. Thus we speak of function symbols which are interpreted by functions, etc. However, in practice this is a bit cumbersome. We prefer to use the same notation for the syntactic objects and their interpretations, e.g. if $\mathfrak{R} = \langle R, \le\rangle$ is the partially ordered set of reals, then $\mathfrak{R} \models \forall x\exists y(x \le y)$, whereas it should be something like $\forall x\exists y(x \,\overline{\le}\, y)$ to distinguish the symbol from the relation. The '$\le$' in $\mathfrak{R}$ stands for the actual relation and the '$\le$' in the sentence stands for the predicate symbol. The reader is urged to distinguish symbols in their various guises.

We show some diagrams of posets.

[Figure: diagrams of the posets $\mathfrak{A}_1, \ldots, \mathfrak{A}_4$]

From the diagrams we can easily read off a number of properties. E.g. $\mathfrak{A}_1 \models \exists x\forall y(x \le y)$ ($\mathfrak{A}_i$ is the structure with the diagram of figure $i$), i.e. $\mathfrak{A}_1$ has a least element (a minimum). $\mathfrak{A}_3 \models \forall x\neg\exists y(x < y)$, i.e. in $\mathfrak{A}_3$ no element is strictly less than another element.

Definition 2.7.3.
(i) $\mathfrak{A}$ is a (linearly or totally) ordered set if it is a poset and $\mathfrak{A} \models \forall xy(x \le y \vee y \le x)$ (each two elements are comparable).
(ii) $\mathfrak{A}$ is densely ordered if $\mathfrak{A} \models \forall xy(x < y \to \exists z(x < z \wedge z < y))$ (between any two elements there is a third one).

It is a moderately amusing exercise to find sentences that distinguish between structures and vice versa. E.g. we can distinguish $\mathfrak{A}_3$ and $\mathfrak{A}_4$ (from the diagram above) as follows: in $\mathfrak{A}_4$ there is precisely one element that is incomparable with all other elements, in $\mathfrak{A}_3$ there are more such elements. Put $\sigma(x) := \forall y(y \neq x \to \neg y \le x \wedge \neg x \le y)$. Then $\mathfrak{A}_4 \models \forall xy(\sigma(x) \wedge \sigma(y) \to x = y)$, but $\mathfrak{A}_3 \models \neg\forall xy(\sigma(x) \wedge \sigma(y) \to x = y)$.

3. The language of groups. Type: $\langle -; 2, 1; 1\rangle$.

Alphabet.
Predicate symbol: $=$
Function symbols: $\cdot$, ${}^{-1}$
Constant symbol: $e$

Notation: In order to conform with practice we write $t \cdot s$ and $t^{-1}$ instead of $\cdot(t, s)$ and ${}^{-1}(t)$.

Definition 2.7.4. $\mathfrak{A}$ is a group if it is a model of
$\forall xyz((x \cdot y) \cdot z = x \cdot (y \cdot z))$,
$\forall x(x \cdot e = x \wedge e \cdot x = x)$,
$\forall x(x \cdot x^{-1} = e \wedge x^{-1} \cdot x = e)$.

When convenient, we will write $ts$ for $t \cdot s$; we will adopt the bracket conventions from algebra. A group $\mathfrak{A}$ is commutative or abelian if $\mathfrak{A} \models \forall xy(xy = yx)$.

Commutative groups are often described in the language of additive groups, which have the following alphabet:
Predicate symbol: $=$
Function symbols: $+$, $-$
Constant symbol: $0$

4. The language of plane projective geometry. Type: $\langle 2; -; 0\rangle$.

The structures one considers are projective planes, which are usually taken to consist of points and lines with an incidence relation. In this approach the type would be $\langle 1, 1, 2; -; 0\rangle$. We can, however, use a more simple type, since a point can be defined as something that is incident with a line, and a line as something for which we can find a point which is incident with it. Of course this requires a non-symmetric incidence relation.

We will now list the axioms, which deviate somewhat from the traditional set. It is a simple exercise to show that the system is equivalent to the standard sets.

Alphabet.
Predicate symbols: $I$, $=$.

We introduce the following abbreviations: $\Pi(x) := \exists y(xIy)$, $\Lambda(y) := \exists x(xIy)$.

Definition 2.7.5. $\mathfrak{A}$ is a projective plane if it satisfies
$\gamma_0$: $\forall x(\Pi(x) \leftrightarrow \neg\Lambda(x))$,
$\gamma_1$: $\forall xy(\Pi(x) \wedge \Pi(y) \to \exists z(xIz \wedge yIz))$,
$\gamma_2$: $\forall uv(\Lambda(u) \wedge \Lambda(v) \to \exists x(xIu \wedge xIv))$,
$\gamma_3$: $\forall xyuv(xIu \wedge yIu \wedge xIv \wedge yIv \to x = y \vee u = v)$,
$\gamma_4$: $\exists x_0 x_1 x_2 x_3 u_0 u_1 u_2 u_3\big(\bigwedge x_i I u_i \wedge \bigwedge_{j = i-1 \,(\mathrm{mod}\ 3)} x_i I u_j \wedge \bigwedge_{j \neq i-1 \,(\mathrm{mod}\ 3),\ i \neq j} \neg x_i I u_j\big)$.

$\gamma_0$ tells us that in a projective plane everything is either a point or a line; $\gamma_1$ and $\gamma_2$ tell us that "any two points can be joined by a line" and "any two lines intersect in a point"; by $\gamma_3$ this line (or point) is unique if the given points (or lines) are distinct. Finally $\gamma_4$ makes projective planes non-trivial, in the sense that there are enough points and lines.

$\Pi^{\mathfrak{A}} = \{a \in |\mathfrak{A}| \mid \mathfrak{A} \models \Pi(\overline{a})\}$ and $\Lambda^{\mathfrak{A}} = \{b \in |\mathfrak{A}| \mid \mathfrak{A} \models \Lambda(\overline{b})\}$ are the sets of points and lines of $\mathfrak{A}$; $I^{\mathfrak{A}}$ is the incidence relation on $\mathfrak{A}$.

The above formalisation is rather awkward. One usually employs a two-sorted formalism, with $P, Q, R, \ldots$ varying over points and $\ell, m, n, \ldots$ varying over lines. The first axiom is then suppressed by convention. The remaining axioms become
$\gamma'_1$: $\forall PQ\exists\ell(PI\ell \wedge QI\ell)$,
$\gamma'_2$: $\forall\ell m\exists P(PI\ell \wedge PIm)$,
$\gamma'_3$: $\forall PQ\ell m(PI\ell \wedge QI\ell \wedge PIm \wedge QIm \to P = Q \vee \ell = m)$,
$\gamma'_4$: $\exists P_0 P_1 P_2 P_3 \ell_0 \ell_1 \ell_2 \ell_3\big(\bigwedge P_i I \ell_i \wedge \bigwedge_{j = i-1 \,(\mathrm{mod}\ 3)} P_i I \ell_j \wedge \bigwedge_{j \neq i-1 \,(\mathrm{mod}\ 3),\ i \neq j} \neg P_i I \ell_j\big)$.

The translation from one language to the other presents no difficulty. The above axioms are different from the ones usually given in the course in projective geometry. We have chosen these particular axioms because they are easy to formulate and also because the so-called Duality principle follows immediately (cf. 2.10, Exercise 6). The fourth axiom is an existence axiom, it merely says that certain things exist; it can be paraphrased differently: there are four points no three of which are collinear (i.e. on a line). Such an existence axiom is merely a precaution to make sure that trivial models are excluded. In this particular case, one would not do much geometry if there was only one triangle!

5. The language of rings with unity. Type: $\langle -; 2, 2, 1; 2\rangle$.

Alphabet.
Predicate symbol: $=$
Function symbols: $+$, $\cdot$, $-$
Constant symbols: $0$, $1$

Definition 2.7.6. $\mathfrak{A}$ is a ring (with unity) if it is a model of
$\forall xyz((x + y) + z = x + (y + z))$,
$\forall xy(x + y = y + x)$,
$\forall xyz((xy)z = x(yz))$,
$\forall xyz(x(y + z) = xy + xz)$,
$\forall x(x + 0 = x)$,
$\forall x(x + (-x) = 0)$,
$\forall x(1 \cdot x = x \wedge x \cdot 1 = x)$,
$0 \neq 1$.

A ring $\mathfrak{A}$ is commutative if $\mathfrak{A} \models \forall xy(xy = yx)$.
A ring $\mathfrak{A}$ is a division ring if $\mathfrak{A} \models \forall x(x \neq 0 \to \exists y(xy = 1))$.
A commutative division ring is called a field.

Actually it is more convenient to have an inverse-function symbol available in the language of fields, which therefore has type $\langle -; 2, 2, 1, 1; 2\rangle$. Therefore we add to the above list the sentences $\forall x(x \neq 0 \to x \cdot x^{-1} = 1 \wedge x^{-1} \cdot x = 1)$ and $0^{-1} = 1$. Note that we must somehow "fix the value of $0^{-1}$"; the reason will appear in 2.10, Exercise 2.

6. The language of arithmetic. Type: $\langle -; 2, 2, 1; 1\rangle$.

Alphabet.
Predicate symbol: $=$
Function symbols: $+$, $\cdot$, $S$
Constant symbol: $0$
($S$ stands for the successor function $n \mapsto n + 1$.)

Historically, the language of arithmetic was introduced by Peano with the intention to describe the natural numbers with plus, times and successor up to an isomorphism. This is in contrast to, e.g., the theory of groups, in which one tries to capture a large class of non-isomorphic structures. It has turned out, however, that Peano's axioms characterise a large class of structures, which we will call (lacking a current term) Peano structures. Whenever confusion threatens we will use the official notation for the zero-symbol: $\overline{0}$, but mostly we will trust the good sense of the reader.

[…]

The last axiom schema is called the induction schema or the principle of mathematical induction.

It will prove handy to have some notation. We define: $\overline{1} := S(\overline{0})$, $\overline{2} := S(\overline{1})$, and in general $\overline{n+1} := S(\overline{n})$,
$x < y := \exists z(x + Sz = y)$,
$x \le y := x < y \vee x = y$.

There is one Peano structure which is the intended model of arithmetic, namely the structure of the ordinary natural numbers, with the ordinary addition, multiplication and successor (e.g. the finite ordinals in set theory). We call this Peano structure the standard model $\mathfrak{N}$, and the ordinary natural numbers are called the standard numbers.

One easily checks that $\overline{n}^{\mathfrak{N}} = n$ and $\mathfrak{N} \models \overline{n} < \overline{m} \Leftrightarrow n < m$: by definition of interpretation we have $\overline{0}^{\mathfrak{N}} = 0$. Assume $\overline{n}^{\mathfrak{N}} = n$; then $\overline{n+1}^{\mathfrak{N}} = (S(\overline{n}))^{\mathfrak{N}} = \overline{n}^{\mathfrak{N}} + 1 = n + 1$. We now apply mathematical induction in the meta-language, and obtain $\overline{n}^{\mathfrak{N}} = n$ for all $n$. For the second claim see Exercise 13.

In $\mathfrak{N}$ we can define all kinds of sets, relations and numbers. To be precise we say that a $k$-ary relation $R$ in $\mathfrak{N}$ is defined by $\varphi$ if $\langle a_1, \ldots, a_k\rangle \in R \Leftrightarrow \mathfrak{N} \models \varphi(\overline{a}_1, \ldots, \overline{a}_k)$. An element $a \in |\mathfrak{N}|$ is defined in $\mathfrak{N}$ by $\varphi$ if $\mathfrak{N} \models \varphi(\overline{b}) \Leftrightarrow b = a$, or $\mathfrak{N} \models \forall x(\varphi(x) \leftrightarrow x = \overline{a})$.

Examples.
(a) The set of even numbers is defined by $E(x) := \exists y(x = y + y)$.
(b) The divisibility relation is defined by $x|y := \exists z(xz = y)$.
(c) The set of prime numbers is defined by $P(x) := \forall yz(x = yz \to y = 1 \vee z = 1) \wedge x \neq 1$.

We can say that we have introduced predicates $E$, $|$ and $P$ by (explicit) definition.

7. The language of graphs.

We usually think of graphs as geometric figures consisting of vertices and edges connecting certain of the vertices. A suitable language for the theory of graphs is obtained by introducing a predicate $R$ which expresses the fact that two vertices are connected by an edge. Hence, we don't need variables or constants for edges.

Alphabet.
Predicate symbols: $R$, $=$.

Definition 2.7.8. A graph is a structure $\mathfrak{A} = \langle A, R\rangle$ satisfying the following axioms:
$\forall xy(R(x, y) \to R(y, x))$,
$\forall x\neg R(x, x)$.

This definition is in accordance with the geometric tradition. There are elements, called vertices, of which some are connected by edges. Note that two vertices are connected by at most one edge. Furthermore there is no (need for an) edge from a vertex to itself. This is geometrically inspired; however, from the point of view of the numerous applications of graphs it appears that more liberal notions are required.

Examples.

[Figure: examples of graphs]

We can also consider graphs in which the edges are directed. A directed graph $\mathfrak{A} = \langle A, R\rangle$ satisfies only $\forall x\neg R(x, x)$.

Examples.

[Figure: examples of directed graphs]

If we drop the condition of irreflexivity then a "graph" is just a set with a binary relation. We can generalise the notion even further, so that more edges may connect a pair of vertices.

In order to treat those generalised graphs we consider a language with two unary predicates $V$, $E$ and one ternary predicate $C$. Think of $V(x)$ as "$x$ is a vertex", $E(x)$ as "$x$ is an edge", and $C(x, z, y)$ as "$z$ connects $x$ and $y$". A directed multigraph is a structure $\mathfrak{A} = \langle A, V, E, C\rangle$ satisfying the following axioms:
$\forall x(V(x) \leftrightarrow \neg E(x))$,
$\forall xyz(C(x, z, y) \to V(x) \wedge V(y) \wedge E(z))$.
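For finite structures the axioms of Definition 2.7.8 and of directed multigraphs can be checked mechanically. The sketch below is only an illustration: relations are represented as Python sets of tuples, and the function names and the sample structures are assumptions made here, not part of the text.

```python
def is_graph(A, R):
    """Check the two graph axioms: R is symmetric and irreflexive on A."""
    symmetric = all((y, x) in R for (x, y) in R)
    irreflexive = all((x, x) not in R for x in A)
    return symmetric and irreflexive

def is_directed_multigraph(A, V, E, C):
    """Check: forall x (V(x) <-> not E(x)) and
    forall x, y, z (C(x, z, y) -> V(x) and V(y) and E(z))."""
    vertex_or_edge = all((x in V) != (x in E) for x in A)
    well_typed = all(x in V and y in V and z in E for (x, z, y) in C)
    return vertex_or_edge and well_typed

# A triangle on three vertices: every pair of distinct vertices is connected.
A = {1, 2, 3}
triangle = {(x, y) for x in A for y in A if x != y}

triangle_is_graph = is_graph(A, triangle)
loop_is_graph = is_graph(A, triangle | {(1, 1)})   # a loop breaks irreflexivity
one_way_is_graph = is_graph(A, {(1, 2)})           # missing (2, 1) breaks symmetry

# Two vertices joined by two parallel directed edges e1 and e2.
M = {"a", "b", "e1", "e2"}
parallel_ok = is_directed_multigraph(
    M, {"a", "b"}, {"e1", "e2"}, {("a", "e1", "b"), ("a", "e2", "b")}
)
# Here the middle argument "b" is a vertex, not an edge, so the second axiom fails.
mistyped_ok = is_directed_multigraph(
    M, {"a", "b"}, {"e1", "e2"}, {("a", "b", "e1")}
)
```

The multigraph example shows what the extra predicates buy: two distinct edges between the same pair of vertices, which the single binary relation $R$ cannot express.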
The edges can be seen as arrows. By adding the symmetry condition, $\forall xyz(C(x, z, y) \to C(y, z, x))$, one obtains plain multigraphs.

Examples.

[Figure: examples of multigraphs]

Remark: The nomenclature in graph theory is not very uniform. We have chosen our formal framework such that it lends itself to treatment in first-order logic. For the purpose of describing multigraphs a two-sorted language (cf. geometry) is well-suited. The reformulation is left to the reader.

Exercises

1. Consider the language of partial order. Define predicates for (a) $x$ is the maximum; (b) $x$ is maximal; (c) there is no element between $x$ and $y$; (d) $x$ is an immediate successor (respectively predecessor) of $y$; (e) $z$ is the infimum of $x$ and $y$.

2. Give a sentence $\sigma$ such that $\mathfrak{A}_2 \models \sigma$ and $\mathfrak{A}_4 \models \neg\sigma$ (for $\mathfrak{A}_i$ associated to the diagrams of p. 82).

3. Let $\mathfrak{A}_1 = \langle N, \le\rangle$ and $\mathfrak{A}_2 = \langle Z, \le\rangle$ be the ordered sets of natural, respectively integer, numbers. Give a sentence $\sigma$ such that $\mathfrak{A}_1 \models \sigma$ and $\mathfrak{A}_2 \models \neg\sigma$. Do the same for $\mathfrak{A}_2$ and $\mathfrak{B} = \langle Q, \le\rangle$ (the ordered set of rationals). N.B. $\sigma$ is in the language of posets; in particular, you may not add extra constants, function symbols, etc.; defined abbreviations are of course harmless.

4. Let $\sigma = \exists x\forall y(x \le y \vee y \le x)$. Find posets $\mathfrak{A}$ and $\mathfrak{B}$ such that $\mathfrak{A} \models \sigma$ and $\mathfrak{B} \models \neg\sigma$.

5. Do the same for $\sigma = \forall xy\exists z[(x \le z \wedge y \le z) \vee (z \le x \wedge z \le y)]$.

6. Using the language of identity structures give an (infinite) set $\Gamma$ such that $\mathfrak{A}$ is a model of $\Gamma$ iff $\mathfrak{A}$ is infinite.

7. Consider the language of groups. Define the properties: (a) $x$ is idempotent; (b) $x$ belongs to the centre.

8. Let $\mathfrak{A}$ be a ring, give a sentence $\sigma$ such that $\mathfrak{A} \models \sigma \Leftrightarrow \mathfrak{A}$ is an integral domain (has no divisors of zero).

9. Give a formula $\sigma(x)$ in the language of rings such that $\mathfrak{A} \models \sigma(\overline{a}) \Leftrightarrow$ the principal ideal $(a)$ is prime (in $\mathfrak{A}$).

10. Define in the language of arithmetic: (a) $x$ and $y$ are relatively prime; (b) $x$ is the smallest prime greater than $y$; (c) $x$ is the greatest number with $2x < y$.

11. $\sigma := \forall x_1 \ldots x_n \exists y_1 \ldots y_m \varphi$ and $\tau := \exists y_1 \ldots y_m \psi$ are sentences in a language without identity, function symbols and constants, where $\varphi$ and $\psi$ are quantifier free. Show: $\models \sigma \Leftrightarrow \sigma$ holds in all structures with $n$ elements. $\models \tau \Leftrightarrow \tau$ holds in all structures with 1 element.

12. Monadic predicate calculus has only unary predicate symbols (no identity). Consider $\mathfrak{A} = \langle A, R_1, \ldots, R_n\rangle$ where all $R_i$ are sets. Define $a \sim b := a \in R_i \Leftrightarrow b \in R_i$ for all $i \le n$. Show that $\sim$ is an equivalence relation and that $\sim$ has at most $2^n$ equivalence classes. The equivalence class of $a$ is denoted by $[a]$. Define $B = A/\!\sim$ and $[a] \in S_i \Leftrightarrow a \in R_i$, $\mathfrak{B} = \langle B, S_1, \ldots, S_n\rangle$. Show $\mathfrak{A} \models \sigma \Leftrightarrow \mathfrak{B} \models \sigma$ for all $\sigma$ in the corresponding language. For such $\sigma$ show $\models \sigma \Leftrightarrow \mathfrak{A} \models \sigma$ for all $\mathfrak{A}$ with at most $2^n$ elements. Using this fact, outline a decision procedure for truth in monadic predicate calculus.

13. Let $\mathfrak{N}$ be the standard model of arithmetic. Show $\mathfrak{N} \models \overline{n} < \overline{m} \Leftrightarrow n < m$.

14. Let $\mathfrak{A} = \langle N, <\rangle$ and $\mathfrak{B} = \langle N, \triangle\rangle$, where $n \triangle m$ iff (i) $n < m$ and $n$, $m$ both even or both odd, or (ii) $n$ is even and $m$ odd. Give a sentence $\sigma$ such that $\mathfrak{A} \models \sigma$ and $\mathfrak{B} \models \neg\sigma$.

15. If $\langle A, R\rangle$ is a projective plane, then $\langle A, \breve{R}\rangle$ is also a projective plane (the dual plane), where $\breve{R}$ is the converse of the relation $R$. Formulated in the two-sorted language: if $\langle A_P, A_L, I\rangle$ is a projective plane, then so is $\langle A_L, A_P, \breve{I}\rangle$.

2.8 Natural Deduction

We extend the system of section 1.5 to predicate logic. For reasons similar to the ones mentioned in section 1.5 we consider a language with connectives $\wedge$, $\to$, $\bot$ and $\forall$. The existential quantifier is left out, but will be considered later.

We adopt all the rules of propositional logic and we add

$$\frac{\varphi(x)}{\forall x\,\varphi(x)}\ \forall I \qquad\qquad \frac{\forall x\,\varphi(x)}{\varphi(t)}\ \forall E$$

where in $\forall I$ the variable $x$ may not occur free in any hypothesis on which $\varphi(x)$ depends, i.e. an uncancelled hypothesis in the derivation of $\varphi(x)$. In $\forall E$ we, of course, require $t$ to be free for $x$.

$\forall I$ has the following intuitive explanation: if an arbitrary object $x$ has the property $\varphi$, then every object has the property $\varphi$. The problem is that none of the objects we know in mathematics can be considered "arbitrary". So instead of looking for the "arbitrary object" in the real world (as far as mathematics is concerned), let us try to find a syntactic criterion. Consider a variable $x$ (or a constant) in a derivation: are there reasonable grounds for calling $x$ "arbitrary"? Here is a plausible suggestion: in the context of the derivations we shall call $x$ arbitrary if nothing has been assumed concerning $x$. In more technical terms, $x$ is arbitrary at its particular occurrence in a derivation if the part of the derivation above it contains no hypotheses containing $x$ free.

We will demonstrate the necessity of the above restrictions, keeping in mind that the system at least has to be sound, i.e. that derivable statements should be true.

Restriction on $\forall I$:

[Derivation]

The $\forall$ introduction at the first step was illegal. So $\vdash 0 = 0 \to \forall x(x = 0)$, but clearly $\not\models 0 = 0 \to \forall x(x = 0)$ (take any structure containing more than just 0).

Restriction on $\forall E$:

[Derivation]

The $\forall$ elimination at the first step was illegal. Note that $y$ is not free for $x$ in $\neg\forall y(x = y)$. The derived sentence is clearly not true in structures with at least two elements.

We now give some examples of derivations. We assume that the reader has by now enough experience in cancelling hypotheses, so that we will no longer indicate the cancellations by encircled numbers.

[Derivations]

Let $x \notin FV(\varphi)$.

[Derivations]

In the righthand derivation $\forall I$ is allowed, since $x \notin FV(\varphi)$, and $\forall E$ is applicable. Note that $\forall I$ in the bottom left derivation is allowed because $x \notin FV(\varphi)$, for at that stage $\varphi$ is still (part of) a hypothesis.

The reader will have grasped the technique behind the quantifier rules: reduce a $\forall x\varphi$ to $\varphi$ and reintroduce $\forall$ later, if necessary. Intuitively, one makes the following step: to show "for all $x$ $\ldots x \ldots$" it suffices to show "$\ldots x \ldots$" for an arbitrary $x$. The latter statement is easier to handle. Without going into fine philosophical distinctions, we note that the distinction "for all $x$ $\ldots x \ldots$" versus "for an arbitrary $x$ $\ldots x \ldots$" is embodied in our system by means of the distinction "quantified statement" versus "free variable statement".

The reader will also have observed that under a reasonable derivation strategy, roughly speaking, elimination precedes introduction. There is a sound explanation for this phenomenon; its proper treatment belongs to proof theory, where normal derivations (derivations without superfluous steps) are considered. See Ch. 6. For the moment the reader may accept the above mentioned fact as a convenient rule of thumb.

We can formulate the derivability properties of the universal quantifier in terms of the relation $\vdash$:
$\Gamma \vdash \varphi(x) \Rightarrow \Gamma \vdash \forall x\varphi(x)$ if $x \notin FV(\psi)$ for all $\psi \in \Gamma$,
$\Gamma \vdash \forall x\varphi(x) \Rightarrow \Gamma \vdash \varphi(t)$ if $t$ is free for $x$ in $\varphi$.
The above implications follow directly from ($\forall I$) and ($\forall E$).

Since we have cast our definition of satisfaction in terms of valuations, which evidently contains the propositional logic as a special case, we can copy the cases of (1) the one element derivation, (2) the derivations with a propositional rule at the last step, from Lemma 1.6.1 (please check this claim). So we have to treat derivations with ($\forall I$) or ($\forall E$) as the final step.

($\forall I$) The derivation $\mathcal{D}$ of $\varphi(x)$, from which $\forall x\varphi(x)$ is concluded, has its hypotheses in $\Gamma$ and $x$ is not free in $\Gamma$. Induction hypothesis: $\Gamma \models \varphi(x)$, i.e. $\mathfrak{A} \models \Gamma(\overline{\mathbf{a}}) \Rightarrow \mathfrak{A} \models (\varphi(x))(\overline{\mathbf{a}})$ for all $\mathfrak{A}$ and all $\mathbf{a}$. It is no restriction to suppose that $x$ is the first of the free variables involved (why?). So we can substitute $\overline{a}_1$ for $x$ in $\varphi$. Put $\mathbf{a} = (a_1, \mathbf{a}')$. Now we have: for all $a_1$ and $\mathbf{a}' = (a_2, \ldots)$, $\mathfrak{A} \models \Gamma(\overline{\mathbf{a}}') \Rightarrow \mathfrak{A} \models \varphi(\overline{a}_1)(\overline{\mathbf{a}}')$; so for all $\mathbf{a}'$, $\mathfrak{A} \models \Gamma(\overline{\mathbf{a}}') \Rightarrow$ ($\mathfrak{A} \models (\varphi(\overline{a}_1))(\overline{\mathbf{a}}')$ for all $a_1$); so for all $\mathbf{a}'$, $\mathfrak{A} \models \Gamma(\overline{\mathbf{a}}') \Rightarrow \mathfrak{A} \models (\forall x\varphi(x))(\overline{\mathbf{a}}')$.
PI) r + O ur next goal is the correctness of the system of natural deduction for predicate logic. We first extend the definition of +. T his shows r /=Vxcp(x). ( Note that in this proof we have used Vx(a -+ ~ ( x ) -+ ( a ) V XT(X)), here x @ F V ( u ) , in the metalanguage. Of course w we may use sound principles on the metalevel). + D efinition 2.8.1. Let I' be a set of formulae and let { xil, x i z , .. .) = U { F V ( $ ) ( $E r u { a ) ) . If a is a sequence ( a l ,a z . . .)) of elements (repetitions J allowed) of I%\,t hen r ( a ) is obtained from T by replacing simultaneously in all formulas of r t he x z3 by Z J ( j 5 1 ) (for r = ($1 we write $(a)). We now define ( i) U r ( a ) if U $J for all $J E r ( a ) if U r ( a ) + U a ( a ) for all U, a . ( ii) r a + + + Induction hypothesis: T Vxcp(x), D Vxcp(x> 2.l /= (Vxcp(x))(a), i.e.U /= r ( a ) for all a a nd U. ~(t) cp(6)(a) for all b E IU(. I n particular we may r ( a ) , t hen U So let U take t [ ~ / z for 6, where we slightly abuse the notation; since there are finitely ] many variables z l , . . . , z,, we only need finitely many of the a i's, and we consider it therefore an ordinary simultaneous substitution. !2l ( cp[a/z])[t[a/z]/x], hence by Lemma 2.5.4, U b (cp[t/x])[a/z] o r I= (cp(t))(a). 0 W E) + + + + I n case only sentences are involved, the definition can be simplified: r+uifU+T+UkuforallU. If r = 0, we write /= u . We can paraphrase this definition as : r a , if for all structures U and all choices of a , a ( a ) is true in U if all hypotheses of T ( a ) are true in 2 . Now we can formulate L e m m a 2 .8.2 ( Soundness). I- k u Having established the soundness of our system, we can easily get nonderivability results. Examples. + 1. y Vx3ycp -, 3yvxcp. Take U = ( (0, I ), ( (0, I ) , ( 1 , O ) ) ) ( type (2; -; 0 )) and consider p := P ( x , y ), the predicate interpreted in U. U V x3yP(x, y), since for 0 we have ( 0 , l ) E P and for 1 we have (LO) E +r k a P. Proof. 
By definition of r t- a is suffices to show that for each derivation with hypothesis set r a nd conclusion a T k a . We use induction on D ( cf. 1.6.1 and exercise 2 ). B ut, U 3 yVxP(x, y), since for 0 we have (0,O) @ P a nd for 1 we have ( L l ) @ p. 94 2. Predicate Logic A3 2.9 Adding the Existential Quantifier 95 Consider 9 = ( R, P ) with P 3 = { (a, b) ( la - bl 1 1 ). 5 , Although variables and constants are basically different, they share some properties. Both constants and free variables may b e introduced in derivations through VE, b ut only free variables can be subjected to VI, - t hat is free variables can disappear in derivations by other than propositional means. I t --follows that a variable can take the place of a constant in a derivation but in general not vice versa. We make this precise as follows. Show r F cp + r t k t cpt, where t s tands for "derivable without using t (VI) or (YE)" (does the converse hold?) Conclude the consistency of predicate logic. Show that predicate logic is conservative over propositional logic (cf. definition 3.1.5). 2 .9 Adding the Existential Quantifier Let us introduce 3 x 9 a s an abbreviation for - V x y (Theorem 2.5.1 tells us t hat t here is a good reason for doing so). We can prove the following: Theorem 2.8.3. Let x be a variable not occurring in r o r p. ( i) r k cp + r [ x / c ] t- cp[x/c]. (ii) If c does not occur i n I', then r k cp(c) + r t- Vxcp(x). Proof. (ii) follows immediately from (i) by VI. (i) Induction on the deriva0 tion of r I- cp. Left to the reader. Lemma 2.9.1. ( i) cp(t) k 3 x 4 ~ ()t free for x in cp) ( 4 r,cp(x) t- 11, + r ,3xcp(x) t- 11, if x is not free in 11, o r any formula of F. Proof. (i) Observe that the result is rather obvious, changing c t o x is just as harmless as colouring c red - t he derivation remains intact. Exercises 1 . 
Show: (i) k Vx(cp(x) 1 1,(~)) (Vxcp(x) VX@(X)), (ii) k Vxcp(x) 4 -VXT(P(X), (iii) t Vxcp(x) 4 Vzcp(z)if zdoesnotoccurincp(x), (iv) k VxVycp(x, Y ) VyVxcp(x, Y ), Vxcp(x1x ), VxVycp(x, Y ) ( v) (vi) k Vx(cp(x) A 11,(x)) Vxcp(x) A Vx11,(~), ( vii) k Vx(cp -+ 11,(x)) ((P V X+(X)). 2 . E xtend the definition of derivation to the present system (cf. 1.5.1). + + + + (ii) + - ++ + 3. Show ( s(t) [ ~ l x ] ) " ( s((t[~i/x])") = [ z~x])'. 4. Show the inverse implications of 2.8.3. 5. Assign t o each atom P ( t 1 , . . . , t,) a proposition symbol, denoted by P . Now define a translation t from the language of predicate logic into the language of propositional logic by P ( t l , . . . ,t,))t := P a nd l t : = l ( cpO+)t := cpt O + t (,cp)t := -cpt (Vzcp)t := cpt I - RAA 11, 1 I . in r U {cp(x)) (only cp(x) is shown). Since p (x) ( that is, all occurrences of it) is cancelled and x does not occur free in F or $, we may apply VI. From the derivation we conclude that I , x 4 s ) t- $. '3 We can compress the last derivation into an elimination rule for 3 : E ~ l a n a t z o n T he subderivation top left is the given one; its hypotheses are . 96 2 . Predicate Logic 2.9 Adding the Existential Quantifier 97 I t is time now t o state the rules for V a nd 3 with more precision. We want % t o allow substitution of terms for some occurrences of the quantified variable in ( Y E ) a nd ( 3 E ) . T he following example motivates this. with the conditions: x is not free in $, or in a hypothesis of the subderivation of Q,o ther than c p(x). T his is easily seen to be correct since we can always fill in the missing d etails,as shown in the preceding derivation. By (i) we also have an introduction rule: - I for t free for x in cp. 3 3x P(x) E xamples of derivations. The result would not be derivable if we could only make substitutions for all occurrences at the same time. Yet, the result is evidently true. The proper formulation of the rules now is: cp(t) with the appropriate restrictions. 
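The reading of $\exists$ as $\neg\forall\neg$ adopted in this section can be checked semantically on a small finite structure. The following Python fragment is a minimal sketch and not part of the original text; the helper names `forall` and `exists` are illustrative assumptions. It reuses the two-element structure $(\{0,1\}, \{(0,1),(1,0)\})$ from the non-derivability example of section 2.8.

```python
from itertools import product

# Universe and binary relation of the structure ({0, 1}, {(0, 1), (1, 0)})
# taken from the non-derivability example (illustrative encoding).
UNIVERSE = [0, 1]
P = {(0, 1), (1, 0)}


def forall(pred, universe=UNIVERSE):
    """Truth value of 'for all x: pred(x)' over a finite universe."""
    return all(pred(a) for a in universe)


def exists(pred, universe=UNIVERSE):
    """'exists x: pred(x)', computed as the abbreviation 'not forall x: not pred(x)'."""
    return not forall(lambda a: not pred(a), universe)


# forall x exists y P(x, y)
forall_exists = forall(lambda x: exists(lambda y: (x, y) in P))
# exists y forall x P(x, y)
exists_forall = exists(lambda y: forall(lambda x: (x, y) in P))

# The abbreviation agrees with the direct reading (Python's any) on every
# unary predicate over the two-element universe; there are 2**2 of them.
abbreviation_ok = all(
    exists(lambda a, s=subset: s[a]) == any(subset)
    for subset in product([False, True], repeat=len(UNIVERSE))
)
```

Here `forall_exists` comes out true and `exists_forall` false, which is the semantic reason why $\forall x\exists y\varphi \to \exists y\forall x\varphi$ cannot be derivable in a sound system.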
We will also sketch the alternative approach, that of enriching the language.

Theorem 2.9.2. Consider predicate logic with the full language and rules for all connectives, then $\vdash \exists x\varphi(x) \leftrightarrow \neg\forall x\neg\varphi(x)$.

Proof. Compare 1.6.3. $\Box$

2.10 Natural Deduction and Identity

We will give rules, corresponding to the axioms $I_1$ - $I_4$ of section 2.6.

$$RI_1\ \frac{}{x = x} \qquad RI_2\ \frac{x = y}{y = x} \qquad RI_3\ \frac{x = y \quad y = z}{x = z}$$

$$RI_4\ \frac{x_1 = y_1, \ldots, x_n = y_n}{t(x_1, \ldots, x_n) = t(y_1, \ldots, y_n)} \qquad \frac{x_1 = y_1, \ldots, x_n = y_n \quad \varphi(x_1, \ldots, x_n)}{\varphi(y_1, \ldots, y_n)}$$

where $y_1, \ldots, y_n$ are free for $x_1, \ldots, x_n$ in $\varphi$. Note that we want to allow substitution of the variable $y_i$ ($i \le n$) for some and not necessarily all occurrences of the variable $x_i$. We can express this by formulating $RI_4$ in the precise terms of the simultaneous substitution operator: [rule figure]

Example. [derivation figures]

The above are three legitimate applications of $RI_4$ having three different conclusions.

The rule $RI_1$ has no hypotheses, which may seem surprising, but which certainly is not forbidden.

The rules $RI_4$ have many hypotheses; as a consequence the derivation trees can look a bit more complicated. Of course one can get all the benefits from $RI_4$ by a restricted rule, allowing only one substitution at a time.

Lemma 2.10.1. $\vdash I_i$ for $i = 1, 2, 3, 4$.

Proof. Immediate. $\Box$

We can weaken the rules $RI_4$ slightly by considering only the simplest terms and formulae.

Lemma 2.10.2. Let L be of type $\langle r_1, \ldots, r_n; a_1, \ldots, a_m; k\rangle$. If the rules

$$\frac{x_1 = y_1, \ldots, x_{r_i} = y_{r_i} \quad P_i(x_1, \ldots, x_{r_i})}{P_i(y_1, \ldots, y_{r_i})} \quad \text{for all } i \le n$$

and

$$\frac{x_1 = y_1, \ldots, x_{a_j} = y_{a_j}}{f_j(x_1, \ldots, x_{a_j}) = f_j(y_1, \ldots, y_{a_j})} \quad \text{for all } j \le m$$

are given, then the rules $RI_4$ are derivable.

Proof. We consider a special case. Let L have one binary predicate symbol and one unary function symbol.

(i) We show $x = y \vdash t(x) = t(y)$ by induction on t.
(a) $t(x)$ is a variable or a constant. Immediate.
(b) $t(x) = f(s(x))$. Induction hypothesis: $x = y \vdash s(x) = s(y)$. [derivation figure]

(ii) We show $\vec{x} = \vec{y}, \varphi(\vec{x}) \vdash \varphi(\vec{y})$.
(a) $\varphi$ is atomic, then $\varphi = P(t, s)$. t and s may (in this example) contain at most one variable each.
So it suffices to consider $x_1 = y_1, x_2 = y_2, P(t(x_1,x_2), s(x_1,x_2)) \vdash P(t(y_1,y_2), s(y_1,y_2))$ (i.e. $P(t[x_1,x_2/z_1,z_2], \ldots)$).

Now we get, by applying $\to E$ twice, from [derivation figure] and the following two instances of (i) [derivation figures] the required result, $(P(s_x, t_x) \to P(s_y, t_y))$. So $x_1 = y_1, x_2 = y_2 \vdash P(s_x, t_x) \to P(s_y, t_y)$, where $s_x = s(x_1,x_2)$, $s_y = s(y_1,y_2)$, $t_x = t(x_1,x_2)$, $t_y = t(y_1,y_2)$.

(b) $\varphi = \sigma \to \tau$. Induction hypotheses: $\vec{x} = \vec{y}, \sigma(\vec{y}) \vdash \sigma(\vec{x})$ and $\vec{x} = \vec{y}, \tau(\vec{x}) \vdash \tau(\vec{y})$. [derivation figure]

(c) $\varphi = \sigma \wedge \tau$, left to the reader.

(d) $\varphi = \forall z\psi(z, \vec{x})$. Induction hypothesis: $\vec{x} = \vec{y}, \psi(z, \vec{x}) \vdash \psi(z, \vec{y})$. [derivation figure] So $\vec{x} = \vec{y}, \forall z\psi(z, \vec{x}) \vdash \forall z\psi(z, \vec{y})$.

This establishes, by induction, the general rule. $\Box$

Exercises

1. Show that $\forall x(x = x), \forall xyz(x = y \wedge z = y \to x = z) \vdash I_2 \wedge I_3$ (using predicate logic only).

2. Show $\vdash \exists x(t = x)$ for any term t. Explain why all functions in a structure are total (i.e. defined for all arguments); think of $0^{-1}$.

5. Show that in the language of identity $I_1, I_2, I_3 \vdash I_4$.

6. Prove the following Duality Principle for projective geometry (cf. section 2.7, definition 2.7.5): If $\Gamma \vdash \varphi$ then also $\Gamma \vdash \varphi^d$, where $\Gamma$ is the set of axioms of projective geometry and $\varphi^d$ is obtained from $\varphi$ by replacing each atom $xIy$ by $yIx$. (Hint: check the effect of the translation d on the derivation of $\varphi$ from $\Gamma$.)

3. Completeness and Applications

3.1 The Completeness Theorem

Just as in the case of propositional logic we shall show that 'derivability' and 'semantical consequence' coincide. We will do quite a bit of work before we get to the theorem. Although the proof of the completeness theorem is not harder than, say, some proofs in analysis, we would advise the reader to read the statement of the theorem and to skip the proof at the first reading and to return to it later.
It is more instructive to go to the applications and it will probably give the reader a better feeling for the subject.

The main tool in this chapter is the

Lemma 3.1.1 (Model Existence Lemma). If $\Gamma$ is a consistent set of sentences, then $\Gamma$ has a model.

A sharper version is

Lemma 3.1.2. Let L have cardinality $\kappa$. If $\Gamma$ is a consistent set of sentences, then $\Gamma$ has a model of cardinality $\le \kappa$.

From 3.1.1 we immediately deduce Gödel's

Theorem 3.1.3 (Completeness Theorem). $\Gamma \vdash \varphi \Leftrightarrow \Gamma \models \varphi$.

We will now go through all the steps of the proof of the completeness theorem. In this section we will consider sentences, unless we specifically mention non-closed formulas. Furthermore '$\vdash$' will stand for 'derivability in predicate logic with identity'.

Just as in the case of propositional logic we have to construct a model and the only thing we have is our consistent theory. This construction is a kind of Baron von Münchhausen trick; we have to pull ourselves (or rather, a model) out of the quicksand of syntax and proof rules. The most plausible idea is to make a universe out of the closed terms and to define relations as the sets of (tuples of) terms in the atoms of the theory. There are basically two things we have to take care of: (i) if the theory tells us that $\exists x\varphi(x)$, then the model has to make $\exists x\varphi(x)$ true, and so it has to exhibit an element (which is in this case a closed term t) such that $\varphi(t)$ is true. This means that the theory has to prove $\varphi(t)$ for a suitable closed term t. This problem is solved in so-called Henkin theories. (ii) A model has to decide sentences, i.e. it has to say $\sigma$ or $\neg\sigma$ for each sentence $\sigma$. As in propositional logic, this is handled by maximal consistent theories.

Definition 3.1.4. (i) A theory T is a collection of sentences with the property $T \vdash \varphi \Rightarrow \varphi \in T$ (a theory is closed under derivability).
(ii) A set $\Gamma$ such that $T = \{\varphi \mid \Gamma \vdash \varphi\}$ is called an axiom set of the theory T. The elements of $\Gamma$ are called axioms.
(iii) T is called a Henkin theory if for each sentence $\exists x\varphi(x)$ there is a constant c such that $\exists x\varphi(x) \to \varphi(c) \in T$ (such a c is called a witness for $\exists x\varphi(x)$).

Note that $T = \{\sigma \mid \Gamma \vdash \sigma\}$ is a theory. For, if $T \vdash \varphi$, then $\sigma_1, \ldots, \sigma_k \vdash \varphi$ for certain $\sigma_i$ with $\Gamma \vdash \sigma_i$. [derivation figure] From the derivations $\mathcal{D}_1, \ldots, \mathcal{D}_k$ of $\Gamma \vdash \sigma_1, \ldots, \Gamma \vdash \sigma_k$ and $\mathcal{D}$ of $\sigma_1, \ldots, \sigma_k \vdash \varphi$ a derivation of $\Gamma \vdash \varphi$ is obtained, as indicated.

Definition 3.1.5. Let T and T' be theories in the languages L and L'.
(i) T' is an extension of T if $T \subseteq T'$,
(ii) T' is a conservative extension of T if $T' \cap L = T$ (i.e. all theorems of T' in the language L are already theorems of T).

Example of a conservative extension: Consider propositional logic P' in the language L' with $\to, \wedge, \bot, \leftrightarrow, \neg$. Then Exercise 2, section 1.6, tells us that P' is conservative over P.

Our first task is the construction of Henkin extensions of a given theory T, that is to say: extensions of T which are Henkin theories.

Definition 3.1.6. Let T be a theory with language L. The language $L^*$ is obtained from L by adding a constant $c_\varphi$ for each sentence of the form $\exists x\varphi(x)$. $T^*$ is the theory with axiom set $T \cup \{\exists x\varphi(x) \to \varphi(c_\varphi) \mid \exists x\varphi(x) \text{ closed, with witness } c_\varphi\}$.

Lemma 3.1.7. $T^*$ is conservative over T.

Proof. (a) Let $\exists x\varphi(x) \to \varphi(c)$ be one of the new axioms. Suppose $\Gamma, \exists x\varphi(x) \to \varphi(c) \vdash \psi$, where $\psi$ does not contain c and where $\Gamma$ is a set of sentences, none of which contains the constant c. We show $\Gamma \vdash \psi$ in a number of steps.

1. $\Gamma \vdash (\exists x\varphi(x) \to \varphi(c)) \to \psi$.
2. $\Gamma \vdash (\exists x\varphi(x) \to \varphi(y)) \to \psi$, where y is a variable that does not occur in the associated derivation. 2 follows from 1 by Theorem 2.8.3.
3. $\Gamma \vdash \forall y[(\exists x\varphi(x) \to \varphi(y)) \to \psi]$. This application of ($\forall I$) is correct, since c did not occur in $\Gamma$.
4. $\Gamma \vdash \exists y(\exists x\varphi(x) \to \varphi(y)) \to \psi$ (cf. example of 2.9).
5. $\Gamma \vdash (\exists x\varphi(x) \to \exists y\varphi(y)) \to \psi$ (cf. section 2.9, Exercises).
6. $\vdash \exists x\varphi(x) \to \exists y\varphi(y)$.
7. $\Gamma \vdash \psi$ (from 5, 6).

(b) Let $T^* \vdash \psi$ for a $\psi \in L$. By the definition of derivability $T \cup \{\sigma_1, \ldots, \sigma_n\} \vdash \psi$, where the $\sigma_i$ are the new axioms of the form $\exists x\varphi(x) \to \varphi(c)$. We show $T \vdash \psi$ by induction on n. For $n = 0$ we are done. Let $T \cup \{\sigma_1, \ldots, \sigma_{n+1}\} \vdash \psi$. Put $\Gamma' = T \cup \{\sigma_1, \ldots, \sigma_n\}$, then $\Gamma', \sigma_{n+1} \vdash \psi$ and we may apply (a). Hence $T \cup \{\sigma_1, \ldots, \sigma_n\} \vdash \psi$. Now by induction hypothesis $T \vdash \psi$. $\Box$

Although we have added a large number of witnesses to T, there is no evidence that $T^*$ is a Henkin theory, since by enriching the language we also add new existential statements $\exists x\tau(x)$ which may not have witnesses. In order to overcome this difficulty we iterate the above process countably many times.

Lemma 3.1.8. Define $T_0 := T$; $T_{n+1} := (T_n)^*$; $T_\omega := \bigcup\{T_n \mid n \ge 0\}$. Then $T_\omega$ is a Henkin theory and it is conservative over T.

Proof. Call the language of $T_n$ (resp. $T_\omega$) $L_n$ (resp. $L_\omega$).

(i) $T_n$ is conservative over T. Induction on n.

(ii) $T_\omega$ is a theory. Suppose $T_\omega \vdash \sigma$, then $\varphi_0, \ldots, \varphi_n \vdash \sigma$ for certain $\varphi_0, \ldots, \varphi_n \in T_\omega$. For each $i \le n$, $\varphi_i \in T_{m_i}$ for some $m_i$. Let $m = \max\{m_i \mid i \le n\}$. Since $T_k \subseteq T_{k+1}$ for all k, we have $T_{m_i} \subseteq T_m$ ($i \le n$). Therefore $T_m \vdash \sigma$. $T_m$ is (by definition) a theory, so $\sigma \in T_m \subseteq T_\omega$.

(iii) $T_\omega$ is a Henkin theory. Let $\exists x\varphi(x) \in L_\omega$, then $\exists x\varphi(x) \in L_n$ for some n. By definition $\exists x\varphi(x) \to \varphi(c) \in T_{n+1}$ for a certain c. So $\exists x\varphi(x) \to \varphi(c) \in T_\omega$.

(iv) $T_\omega$ is conservative over T. Observe that $T_\omega \vdash \sigma$ if $T_n \vdash \sigma$ for some n, and apply (i). $\Box$

As a corollary we get: $T_\omega$ is consistent if T is so. For suppose $T_\omega$ inconsistent, then $T_\omega \vdash \bot$. As $T_\omega$ is conservative over T (and $\bot \in L$), $T \vdash \bot$. Contradiction.

Our next step is to extend $T_\omega$ as far as possible, just as we did in propositional logic (1.5.7). We state a general principle:

Lemma 3.1.9 (Lindenbaum). Each consistent theory is contained in a maximally consistent theory.
Proof. We give a straightforward application of Zorn's Lemma. Let T be consistent. Consider the set A of all consistent extensions T' of T, partially ordered by inclusion. Claim: A has a maximal element.

1. Each chain in A has an upper bound. Let $\{T_i \mid i \in I\}$ be a chain. Then $T' = \bigcup T_i$ is a consistent extension of T containing all $T_i$'s (Exercise 2). So T' is an upper bound.
2. Therefore A has a maximal element $T_m$ (Zorn's lemma).
3. $T_m$ is a maximally consistent extension of T. We only have to show: if $T_m \subseteq T'$ and $T' \in A$, then $T_m = T'$. But this is trivial as $T_m$ is maximal in the sense of $\subseteq$.

Conclusion: T is contained in the maximally consistent theory $T_m$. $\Box$

Note that in general T has many maximally consistent extensions. The above existence is far from unique (as a matter of fact the proof of its existence essentially uses the axiom of choice).

We now combine the construction of a Henkin extension with a maximally consistent extension. Fortunately the property of being a Henkin theory is preserved under taking a maximally consistent extension. For, the language remains fixed, so if for an existential statement $\exists x\varphi(x)$ there is a witness c such that $\exists x\varphi(x) \to \varphi(c) \in T$, then trivially, $\exists x\varphi(x) \to \varphi(c) \in T_m$. Hence

Lemma 3.1.10. An extension of a Henkin theory with the same language is again a Henkin theory.

We now get to the proof of our main result.

Lemma 3.1.11 (Model Existence Lemma). If $\Gamma$ is consistent, then $\Gamma$ has a model.

Proof. Let $T = \{\sigma \mid \Gamma \vdash \sigma\}$ be the theory given by $\Gamma$. Any model of T is, of course, a model of $\Gamma$.

Let $T_m$ be a maximally consistent Henkin extension of T (which exists by the preceding lemmas), with language $L_m$.

We will construct a model of $T_m$ using $T_m$ itself. At this point the reader should realise that a language is, after all, a set, that is a set of strings of symbols. So, we will exploit this set to build the universe of a suitable model.

1. $A = \{t \in L_m \mid t \text{ is closed}\}$.
2. For each function symbol f we define a function $\hat f : A^k \to A$ by $\hat f(t_1, \ldots, t_k) := f(t_1, \ldots, t_k)$.
3. For each predicate symbol P we define a relation $\hat P \subseteq A^p$ by $\langle t_1, \ldots, t_p\rangle \in \hat P \Leftrightarrow T_m \vdash P(t_1, \ldots, t_p)$.
4. For each constant symbol c we define a constant $\hat c := c$.

Although it looks as if we have created the required model, we have to improve the result, because '=' is not interpreted as the real equality. We can only assert that

(a) The relation $t \sim s$ defined by $T_m \vdash t = s$ for $t, s \in A$ is an equivalence relation. By Lemma 2.10.1, $I_1, I_2, I_3$ are theorems of $T_m$, so $T_m \vdash \forall x(x = x)$, and hence (by $\forall E$) $T_m \vdash t = t$, or $t \sim t$. Symmetry and transitivity follow in the same way.

(b) $t_i \sim s_i$ ($i \le p$) and $\langle t_1, \ldots, t_p\rangle \in \hat P \Rightarrow \langle s_1, \ldots, s_p\rangle \in \hat P$.
$t_i \sim s_i$ ($i \le k$) $\Rightarrow \hat f(t_1, \ldots, t_k) \sim \hat f(s_1, \ldots, s_k)$ for all symbols P and f.

The proof is simple: use $T_m \vdash I_4$ (Lemma 2.10.1).

Once we have an equivalence relation, which, moreover, is a congruence with respect to the basic relations and functions, it is natural to introduce the quotient structure.

Denote the equivalence class of t under $\sim$ by $[t]$.

Define $\mathfrak{A} := \langle A/\!\sim, \tilde P_1, \ldots, \tilde P_n, \tilde f_1, \ldots, \tilde f_m, \{\tilde c_i \mid i \in I\}\rangle$, where

$\tilde P_i := \{\langle [t_1], \ldots, [t_{r_i}]\rangle \mid \langle t_1, \ldots, t_{r_i}\rangle \in \hat P_i\}$,
$\tilde f_j([t_1], \ldots, [t_{a_j}]) = [\hat f_j(t_1, \ldots, t_{a_j})]$,
$\tilde c_i := [\hat c_i]$.

One has to show that the relations and functions on $A/\!\sim$ are well-defined, but that is taken care of by (b) above.

Closed terms lead a kind of double life. On the one hand they are syntactical objects, on the other hand they are the stuff that elements of the universe are made from. The two things are related by $t^{\mathfrak{A}} = [t]$. This is shown by induction on t.

(i) $t = c$, then $t^{\mathfrak{A}} = \tilde c = [\hat c] = [c] = [t]$,
(ii) $t = f(t_1, \ldots, t_k)$, then $t^{\mathfrak{A}} = \tilde f(t_1^{\mathfrak{A}}, \ldots, t_k^{\mathfrak{A}}) \overset{\text{i.h.}}{=} \tilde f([t_1], \ldots, [t_k]) = [\hat f(t_1, \ldots, t_k)] = [f(t_1, \ldots, t_k)]$.

Furthermore we have $\mathfrak{A} \models \varphi(t) \Leftrightarrow \mathfrak{A} \models \varphi(\overline{[t]})$, by the above and by Exercise 6, section 2.4.

Claim.
$\mathfrak{A} \models \varphi \Leftrightarrow T_m \vdash \varphi$ for all sentences in the language $L_m$ of $T_m$ which, by the way, is also $L(\mathfrak{A})$, since each element of $A/\!\sim$ has a name in $L_m$. We prove the claim by induction on $\varphi$.

(i) $\varphi$ is atomic. $\mathfrak{A} \models P(t_1, \ldots, t_p) \Leftrightarrow \langle t_1^{\mathfrak{A}}, \ldots, t_p^{\mathfrak{A}}\rangle \in \tilde P \Leftrightarrow \langle [t_1], \ldots, [t_p]\rangle \in \tilde P \Leftrightarrow \langle t_1, \ldots, t_p\rangle \in \hat P \Leftrightarrow T_m \vdash P(t_1, \ldots, t_p)$. The case $\varphi = \bot$ is trivial.

(ii) $\varphi = \sigma \wedge \tau$. Trivial.

(iii) $\varphi = \sigma \to \tau$. We recall that, by Lemma 1.6.9, $T_m \vdash \sigma \to \tau \Leftrightarrow (T_m \vdash \sigma \Rightarrow T_m \vdash \tau)$. Note that we can copy this result, since its proof only uses propositional logic, and hence remains correct in predicate logic.
$\mathfrak{A} \models \sigma \to \tau \Leftrightarrow (\mathfrak{A} \models \sigma \Rightarrow \mathfrak{A} \models \tau) \overset{\text{i.h.}}{\Leftrightarrow} (T_m \vdash \sigma \Rightarrow T_m \vdash \tau) \Leftrightarrow T_m \vdash \sigma \to \tau$.

(iv) $\varphi = \forall x\psi(x)$. $\mathfrak{A} \models \forall x\psi(x) \Leftrightarrow \mathfrak{A} \not\models \exists x\neg\psi(x) \Leftrightarrow \mathfrak{A} \not\models \neg\psi(\bar a)$ for all $a \in |\mathfrak{A}| \Leftrightarrow$ for all $a \in |\mathfrak{A}|$ ($\mathfrak{A} \models \psi(\bar a)$). Assuming $\mathfrak{A} \models \forall x\psi(x)$, we get in particular $\mathfrak{A} \models \psi(c)$ for the witness c belonging to $\exists x\neg\psi(x)$. By induction hypothesis: $T_m \vdash \psi(c)$. $T_m \vdash \exists x\neg\psi(x) \to \neg\psi(c)$, so $T_m \vdash \psi(c) \to \neg\exists x\neg\psi(x)$. Hence $T_m \vdash \forall x\psi(x)$.
Conversely: $T_m \vdash \forall x\psi(x) \Rightarrow T_m \vdash \psi(t)$, so $T_m \vdash \psi(t)$ for all closed t, and therefore by induction hypothesis, $\mathfrak{A} \models \psi(t)$ for all closed t. Hence $\mathfrak{A} \models \forall x\psi(x)$.

Now we see that $\mathfrak{A}$ is a model of $\Gamma$, as $\Gamma \subseteq T_m$. $\Box$

The model constructed above goes by various names; it is sometimes called the canonical model or the (closed) term model. In logic programming the set of closed terms of any language is called the Herbrand universe or Herbrand domain, and the canonical model is called the Herbrand model.

In order to get an estimation of the cardinality of the model we have to compute the number of closed terms in $L_m$. As we did not change the language going from $T_\omega$ to $T_m$, we can look at the language $L_\omega$. We will indicate how to get the required cardinalities, given the alphabet of the original language L. We will use the axiom of choice freely, in particular in the form of absorption laws (i.e. $\kappa + \lambda = \kappa \cdot \lambda = \max(\kappa, \lambda)$ for infinite cardinals). Say L has type $\langle r_1, \ldots, r_n; a_1, \ldots, a_m; \kappa\rangle$.

1. Define
$TERM_0 := \{c_i \mid i \in I\} \cup \{x_j \mid j \in \mathbb{N}\}$,
$TERM_{n+1} := TERM_n \cup \{f_j(t_1, \ldots, t_{a_j}) \mid j \le m,\ t_k \in TERM_n \text{ for } k \le a_j\}$.
Then $TERM = \bigcup\{TERM_n \mid n \in \mathbb{N}\}$ (Exercise 5).
$|TERM_0| = \max(\kappa, \aleph_0) = \mu$.
Suppose $|TERM_n| = \mu$. Then $|\{f_j(t_1, \ldots, t_{a_j}) \mid t_1, \ldots, t_{a_j} \in TERM_n\}| = |TERM_n|^{a_j} = \mu^{a_j} = \mu$. So $|TERM_{n+1}| = \mu + \mu + \ldots + \mu$ ($m + 1$ times) $= \mu$.
Finally $|TERM| = \sum_{n \in \mathbb{N}} |TERM_n| = \aleph_0 \cdot \mu = \mu$.

2. Define
$FORM_0 := \{P_i(t_1, \ldots, t_{r_i}) \mid i \le n,\ t_k \in TERM\} \cup \{\bot\}$,
$FORM_{n+1} := FORM_n \cup \{\varphi \mathbin{\Box} \psi \mid \Box \in \{\wedge, \to\},\ \varphi, \psi \in FORM_n\} \cup \{\forall x_i\varphi \mid i \in \mathbb{N},\ \varphi \in FORM_n\}$.
Then $FORM = \bigcup\{FORM_n \mid n \in \mathbb{N}\}$ (Exercise 5). As in 1. one shows $|FORM| = \mu$.

3. The set of sentences of the form $\exists x\varphi(x)$ has cardinality $\mu$. It trivially is $\le \mu$. Consider $A = \{\exists x(x_0 = c_i) \mid i \in I\}$. Clearly $|A| = \kappa \cdot \aleph_0 = \mu$. Hence the cardinality of the existential statements is $\mu$.

4. $L_1$ has the constant symbols of L, plus the witnesses. By 3 the cardinality of the set of constant symbols is $\mu$. Using 1 and 2 we find $L_0$ has $\mu$ terms and $\mu$ formulas. By induction on n each $L_n$ has $\mu$ terms and $\mu$ formulas. Therefore $L_\omega$ has $\aleph_0 \cdot \mu = \mu$ terms and formulas. $L_\omega$ is also the language of $T_m$.

5. $L_\omega$ has at most $\mu$ closed terms. Since $L_1$ has $\mu$ witnesses, $L_\omega$ has at least $\mu$, and hence exactly $\mu$ closed terms.

6. The set of closed terms has $\le \mu$ equivalence classes under $\sim$, so $\|\mathfrak{A}\| \le \mu$.

All this adds up to the strengthened version of the Model Existence Lemma:

Lemma 3.1.12. $\Gamma$ is consistent $\Leftrightarrow$ $\Gamma$ has a model of cardinality at most the cardinality of the language.

Note the following facts:
- If L has finitely many constants, then L is countable.
- If L has $\kappa \ge \aleph_0$ constants, then $|L| = \kappa$.

The completeness theorem for predicate logic raises the same question as the completeness theorem for propositional logic: can we effectively find a derivation of $\varphi$ if $\varphi$ is true? The problem is that we don't have much to go on; $\varphi$ is true in all structures (of the right similarity type). Even though (in the case of a countable language) we can restrict ourselves to countable structures, the fact that $\varphi$ is true in all those structures does not give the combinatorial information necessary to construct a derivation for $\varphi$. The matter is at this stage beyond us. A treatment of the problem belongs to proof theory; Gentzen's sequent calculus or the tableau method are more suitable to search for derivations than natural deduction.

In the case of predicate logic there are certain improvements on the completeness theorem. One can, for example, ask how complicated the model is that we constructed in the model existence lemma. The proper setting for those questions is found in recursion theory. We can, however, have a quick look at a simple case.

Let T be a decidable theory with a countable language, i.e. we have an effective method to test membership (or, what comes to the same, we can test $\Gamma \vdash \varphi$ for a set of axioms of T). Consider the Henkin theory $T_\omega$ introduced in 3.1.8; $\sigma \in T_\omega$ if $\sigma \in T_n$ for a certain n. This number n can be read off from $\sigma$ by inspection of the witnesses occurring in $\sigma$. From the witnesses we can also determine which axioms of the form $\exists x\varphi(x) \to \varphi(c)$ are involved. Let $\{\tau_1, \ldots, \tau_n\}$ be the set of axioms required for the derivation of $\sigma$, then $T \cup \{\tau_1, \ldots, \tau_n\} \vdash \sigma$. By the rules of logic this reduces to $T \vdash \tau_1 \wedge \ldots \wedge \tau_n \to \sigma$. Since the constants $c_i$ are new with respect to T, this is equivalent to $T \vdash \forall z_1 \ldots z_k(\tau_1' \wedge \ldots \wedge \tau_n' \to \sigma')$ for suitable variables $z_1, \ldots, z_k$, where $\tau_1', \ldots, \tau_n', \sigma'$ are obtained by substitution. Thus we see that $\sigma \in T_\omega$ is decidable. The next step is the formation of a maximal extension $T_m$.

Let $\varphi_0, \varphi_1, \varphi_2, \ldots$ be an enumeration of all sentences of $T_\omega$. We add sentences to $T_\omega$ in steps.

step 0: $T_0 = T_\omega \cup \{\varphi_0\}$ if $T_\omega \cup \{\varphi_0\}$ is consistent, $T_0 = T_\omega \cup \{\neg\varphi_0\}$ else.
step n + 1: $T_{n+1} = T_n \cup \{\varphi_{n+1}\}$ if $T_n \cup \{\varphi_{n+1}\}$ is consistent, $T_{n+1} = T_n \cup \{\neg\varphi_{n+1}\}$ else.

$T^\circ = \bigcup T_n$ ($T^\circ$ is given by a suitable infinite path in the tree). It is easily seen that $T^\circ$ is maximally consistent. Moreover, $T^\circ$ is decidable. To test $\varphi_n \in T^\circ$ we have to test whether $\varphi_n \in T_n$ or $T_{n-1} \cup \{\varphi_n\} \vdash \bot$, which is decidable. So $T^\circ$ is decidable.

The model $\mathfrak{A}$ constructed in 3.1.11 is therefore also decidable in the following sense: the operations and relations of $\mathfrak{A}$ are decidable, which means that $\langle [t_1], \ldots, [t_p]\rangle \in \tilde P$ and $\tilde f([t_1], \ldots, [t_k]) = [t]$ are decidable.

Summing up we say that a decidable consistent theory has a decidable model (this can be made more precise by replacing 'decidable' by 'recursive').

Exercises

1. Consider the language of groups. $T = \{\sigma \mid \mathfrak{A} \models \sigma\}$, where $\mathfrak{A}$ is a fixed non-trivial group. Show that T is not a Henkin theory.

2. Let $\{T_i \mid i \in I\}$ be a set of theories, linearly ordered by inclusion. Show that $T = \bigcup\{T_i \mid i \in I\}$ is a theory which extends each $T_i$. If each $T_i$ is consistent, then T is consistent.

3. Show that $\lambda_n \vdash \sigma \Leftrightarrow \sigma$ holds in all models with at least n elements. $\mu_n \vdash \sigma \Leftrightarrow \sigma$ holds in all models with at most n elements. $\lambda_n \wedge \mu_n \vdash \sigma \Leftrightarrow \sigma$ holds in all models with exactly n elements. $\{\lambda_n \mid n \in \mathbb{N}\} \vdash \sigma \Leftrightarrow \sigma$ holds in all infinite models (for a definition of $\lambda_n$, $\mu_n$ cf. section 2.7).

4. Show that $T = \{\sigma \mid \lambda_2 \vdash \sigma\} \cup \{c_1 \ne c_2\}$ in a language with = and two constant symbols $c_1, c_2$, is a Henkin theory.

5. Show $TERM = \bigcup\{TERM_n \mid n \in \mathbb{N}\}$, $FORM = \bigcup\{FORM_n \mid n \in \mathbb{N}\}$ (cf. 1.1.5).

3.2 Compactness and Skolem-Löwenheim

Unless specified otherwise, we consider sentences in this section. From the Model Existence Lemma we get the following:

Theorem 3.2.1 (Compactness Theorem). $\Gamma$ has a model $\Leftrightarrow$ each finite subset $\Delta$ of $\Gamma$ has a model.

An equivalent formulation is: $\Gamma$ has no model $\Leftrightarrow$ some finite $\Delta \subseteq \Gamma$ has no model.

Proof. We consider the second version.
$\Leftarrow$: Trivial.
$\Rightarrow$: Suppose $\Gamma$ has no model, then by the Model Existence Lemma $\Gamma$ is inconsistent, i.e. $\Gamma \vdash \bot$. Therefore there are $\sigma_1, \ldots, \sigma_n \in \Gamma$ such that $\sigma_1, \ldots, \sigma_n \vdash \bot$. This shows that $\Delta = \{\sigma_1, \ldots, \sigma_n\}$ has no model. $\Box$

Let us introduce a bit of notation: $Mod(\Gamma) = \{\mathfrak{A} \mid \mathfrak{A} \models \sigma \text{ for all } \sigma \in \Gamma\}$. For convenience we will often write $\mathfrak{A} \models \Gamma$ for $\mathfrak{A} \in Mod(\Gamma)$. We write $Mod(\varphi_1, \ldots, \varphi_n)$ instead of $Mod(\{\varphi_1, \ldots, \varphi_n\})$. In general $Mod(\Gamma)$ is not a set (in the technical sense of set theory: $Mod(\Gamma)$ is most of the time a proper class). We will not worry about that since the notation is only used as an abbreviation.

Conversely, let $\mathcal{K}$ be a class of structures (we have fixed the similarity type), then $Th(\mathcal{K}) = \{\sigma \mid \mathfrak{A} \models \sigma \text{ for all } \mathfrak{A} \in \mathcal{K}\}$. We call $Th(\mathcal{K})$ the theory of $\mathcal{K}$.

We adopt the convention (already used in section 2.7) not to include the identity axioms in a set $\Gamma$; these will always be satisfied.

Examples.
1. $Mod(\forall xy(x \le y \wedge y \le x \leftrightarrow x = y), \forall xyz(x \le y \wedge y \le z \to x \le z))$ is the class of posets.
2. Let $\mathcal{G}$ be the class of all groups. $Th(\mathcal{G})$ is the theory of groups.

We can consider the set of integers with the usual additive group structure, but also with the ring structure, so there are two structures $\mathfrak{A}$ and $\mathfrak{B}$, of which the first one is in a sense a part of the second (category theory uses a forgetful functor to express this). We say that $\mathfrak{A}$ is a reduct of $\mathfrak{B}$, or $\mathfrak{B}$ is an expansion of $\mathfrak{A}$. In general

Definition 3.2.2. $\mathfrak{A}$ is a reduct of $\mathfrak{B}$ ($\mathfrak{B}$ an expansion of $\mathfrak{A}$) if $|\mathfrak{A}| = |\mathfrak{B}|$ and moreover all relations, functions and constants of $\mathfrak{A}$ occur also as relations, functions and constants of $\mathfrak{B}$.

Notation. $(\mathfrak{A}, S_1, \ldots, S_n, g_1, \ldots, g_m, \{a_j \mid j \in J\})$ is the expansion of $\mathfrak{A}$ with the indicated extras.

In the early days of logic (before "model theory" was introduced) Skolem (1920) and Löwenheim (1915) studied the possible cardinalities of models of consistent theories.
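Returning briefly to the notation $Mod(\Gamma)$: when the universe is a fixed finite set, the models of a finite set of sentences can simply be enumerated. The following Python fragment is a minimal sketch under that assumption and is not part of the original text; the function names `is_poset` and `finite_models` are illustrative. It lists all binary relations on a small universe that satisfy the two sentences of Example 1.

```python
from itertools import product


def is_poset(universe, leq):
    """Check the two sentences of Example 1 on a finite structure:
    forall x,y ((x <= y and y <= x) <-> x = y) and
    forall x,y,z ((x <= y and y <= z) -> x <= z)."""
    axiom1 = all(
        (((x, y) in leq) and ((y, x) in leq)) == (x == y)
        for x, y in product(universe, repeat=2)
    )
    axiom2 = all(
        (x, z) in leq
        for x, y, z in product(universe, repeat=3)
        if (x, y) in leq and (y, z) in leq
    )
    return axiom1 and axiom2


def finite_models(universe):
    """All binary relations on `universe` satisfying both sentences."""
    pairs = list(product(universe, repeat=2))
    models = []
    for bits in product([False, True], repeat=len(pairs)):
        leq = {p for p, keep in zip(pairs, bits) if keep}
        if is_poset(universe, leq):
            models.append(leq)
    return models


# The partial orders on the fixed universe {0, 1, 2}.
posets_on_three = finite_models(range(3))
```

The brute-force search is only feasible for very small universes (there are $2^{n^2}$ candidate relations), which is one way to see why $Mod(\Gamma)$ is treated in the text as an abbreviation rather than as an object to be computed.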
The following generalisation follows immediately from the preceding results.

Theorem 3.2.3 (Downward Skolem-Löwenheim Theorem). Let $\Gamma$ be a set of sentences in a language of cardinality $\kappa$, and let $\kappa < \lambda$. If $\Gamma$ has a model of cardinality $\lambda$, then $\Gamma$ has a model of cardinality $\kappa'$, with $\kappa \le \kappa' < \lambda$.

Proof. Add to the language $L$ of $\Gamma$ a set of fresh constants (not occurring in the alphabet of $L$) $\{c_i \mid i \in I\}$ of cardinality $\kappa$, and consider $\Gamma' = \Gamma \cup \{c_i \neq c_j \mid i, j \in I,\ i \neq j\}$. Claim: $Mod(\Gamma') \neq \emptyset$.
Consider a model $\mathfrak{A}$ of $\Gamma$ of cardinality $\lambda$. We expand $\mathfrak{A}$ to $\mathfrak{A}'$ by adding $\kappa$ distinct constants (this is possible: $|\mathfrak{A}|$ contains a subset of cardinality $\kappa$). $\mathfrak{A}' \in Mod(\Gamma)$ (cf. Exercise 3) and $\mathfrak{A}' \models c_i \neq c_j$ ($i \neq j$). Consequently $Mod(\Gamma') \neq \emptyset$. The cardinality of the language of $\Gamma'$ is $\kappa$. By the Model Existence Lemma $\Gamma'$ has a model $\mathfrak{B}'$ of cardinality $\le \kappa$, but, by the axioms $c_i \neq c_j$, the cardinality is also $\ge \kappa$. Hence $\mathfrak{B}'$ has cardinality $\kappa$. Now take the reduct $\mathfrak{B}$ of $\mathfrak{B}'$ in the language of $\Gamma$, then $\mathfrak{B} \in Mod(\Gamma)$ (Exercise 3). □

Proof of Theorem 3.2.4 (conclusion). Now take $\mathfrak{A}$ and expand it to $\mathfrak{A}' = (\mathfrak{A}, a_1, \dots, a_k)$, where the $a_i$ are distinct. Then obviously $\mathfrak{A}' \in Mod(\Gamma_0)$, so $\mathfrak{A}' \in Mod(\Delta)$. By the Compactness Theorem there is a $\mathfrak{B}' \in Mod(\Gamma')$. The reduct $\mathfrak{B}$ of $\mathfrak{B}'$ to the (type of the) language $L$ is a model of $\Gamma$. From the extra axioms in $\Gamma'$ it follows that $\mathfrak{B}'$, and hence $\mathfrak{B}$, has cardinality $\ge \mu$. We now apply the downward Skolem-Löwenheim Theorem and obtain the existence of a model of $\Gamma$ of cardinality $\mu$. □

We now list a number of applications.

Application I. Non-standard Models of PA.

Corollary 3.2.5. Peano's arithmetic has non-standard models.

Let $\mathcal{P}$ be the class of all Peano structures. Put $PA = Th(\mathcal{P})$. By the Completeness Theorem $PA = \{\sigma \mid \Sigma \vdash \sigma\}$ where $\Sigma$ is the set of axioms listed in section 2.7, Example 6. $PA$ has a model of cardinality $\aleph_0$ (the standard model $\mathfrak{N}$), so by the upward Skolem-Löwenheim Theorem it has models of every cardinality $\kappa > \aleph_0$.
These models are clearly not isomorphic to $\mathfrak{N}$. For more see 3.3. □

Examples.
1. The theory of real numbers, $Th(\mathbb{R})$, in the language of fields, has a countable model.
2. Consider Zermelo-Fraenkel's set theory $ZF$. If $Mod(ZF) \neq \emptyset$, then $ZF$ has a countable model. This fact was discovered by Skolem. Because of its baffling nature, it was called Skolem's paradox. One can prove in $ZF$ the existence of uncountable sets (e.g. the continuum), how can $ZF$ then have a countable model? The answer is simple: countability as seen from outside and from inside the model is not the same. To establish countability one needs a bijection to the natural numbers. Apparently a model can be so poor that it misses some bijections which do exist outside the model.

Theorem 3.2.4 (Upward Skolem-Löwenheim Theorem). Let $\Gamma$ have a language $L$ of cardinality $\kappa$, and $\mathfrak{A} \in Mod(\Gamma)$ with cardinality $\lambda \ge \kappa$. For each $\mu > \lambda$, $\Gamma$ has a model of cardinality $\mu$.

Proof. Add $\mu$ fresh constants $c_i$, $i \in I$, to $L$ and consider $\Gamma' = \Gamma \cup \{c_i \neq c_j \mid i \neq j,\ i, j \in I\}$. Claim: $Mod(\Gamma') \neq \emptyset$. We apply the Compactness Theorem. Let $\Delta \subseteq \Gamma'$ be finite. Say $\Delta$ contains new axioms with constants $c_{i_0}, \dots, c_{i_k}$, then $\Delta \subseteq \Gamma \cup \{c_{i_p} \neq c_{i_q} \mid p, q \le k,\ p \neq q\} = \Gamma_0$. Clearly each model of $\Gamma_0$ is a model of $\Delta$ (Exercise 1(i)).

Application II. Finite and Infinite Models.

Lemma 3.2.6. If $\Gamma$ has arbitrarily large finite models, then $\Gamma$ has an infinite model.

Proof. Put $\Gamma' = \Gamma \cup \{\lambda_n \mid n > 1\}$, where $\lambda_n$ expresses the sentence "there are at least $n$ distinct elements", cf. section 2.7, Example 1. Apply the Compactness Theorem. Let $\Delta \subseteq \Gamma'$ be finite, and let $\lambda_m$ be the sentence $\lambda_n$ in $\Delta$ with the largest index $n$. Verify that $Mod(\Delta) \supseteq Mod(\Gamma \cup \{\lambda_m\})$. Now $\Gamma$ has arbitrarily large finite models, so $\Gamma$ has a model $\mathfrak{A}$ with at least $m$ elements, i.e. $\mathfrak{A} \in Mod(\Gamma \cup \{\lambda_m\})$. So $Mod(\Delta) \neq \emptyset$.
By compactness $Mod(\Gamma') \neq \emptyset$, but in virtue of the axioms $\lambda_n$ a model of $\Gamma'$ is infinite. Hence $\Gamma'$, and therefore $\Gamma$, has an infinite model. □

We get the following simple

Corollary 3.2.7. Consider a class $K$ of structures which has arbitrarily large finite models. Then, in the language of the class, there is no set $\Sigma$ of sentences, such that $\mathfrak{A} \in Mod(\Sigma) \Leftrightarrow \mathfrak{A}$ is finite and $\mathfrak{A} \in K$.

Proof. Immediate. □

We can paraphrase the result as follows: the class of finite structures in such a class $K$ is not axiomatisable in first-order logic. We all know that finiteness can be expressed in a language that contains variables for sets or functions (e.g. Dedekind's definition), so the inability to characterise the notion of finite is a specific defect of first-order logic. We say that finiteness is not a first-order property. The corollary applies to numerous classes, e.g. groups, rings, fields, posets, sets (identity structures).

Application III. Axiomatisability and Finite Axiomatisability.

Definition 3.2.8. A class $K$ of structures is (finitely) axiomatisable if there is a (finite) set $\Gamma$ such that $K = Mod(\Gamma)$. We say that $\Gamma$ axiomatises $K$; the sentences of $\Gamma$ are called its axioms (cf. 3.1.3).

We now get a number of corollaries.

Corollary 3.2.11. The class of all infinite sets (identity structures) is axiomatisable, but not finitely axiomatisable.

Proof. $\mathfrak{A}$ is infinite $\Leftrightarrow \mathfrak{A} \in Mod\{\lambda_n \mid n \in \mathbb{N}\}$. So the axiom set is $\{\lambda_n \mid n \in \mathbb{N}\}$. On the other hand the class of finite sets is not axiomatisable, so, by Lemma 3.2.10, the class of infinite sets is not finitely axiomatisable. □

Corollary 3.2.12.
(i) The class of fields of characteristic $p$ ($> 0$) is finitely axiomatisable.
(ii) The class of fields of characteristic 0 is axiomatisable but not finitely axiomatisable.
(iii) The class of fields of positive characteristic is not axiomatisable.

Proof. (i) The theory of fields has a finite set $\Delta$ of axioms.
$\Delta \cup \{\bar{p} = 0\}$ axiomatises the class $\mathcal{F}_p$ of fields of characteristic $p$ (where $\bar{p}$ stands for $1 + 1 + \dots + 1$ ($p\times$)).
(ii) $\Delta \cup \{\bar{2} \neq 0, \bar{3} \neq 0, \dots, \bar{p} \neq 0, \dots\}$ axiomatises the class $\mathcal{F}_0$ of fields of characteristic 0. Suppose $\mathcal{F}_0$ was finitely axiomatisable, then by Lemma 3.2.9 $\mathcal{F}_0$ was axiomatisable by $\Gamma = \Delta \cup \{\bar{p}_1 \neq 0, \dots, \bar{p}_k \neq 0\}$, where $p_1, \dots, p_k$ are primes (not necessarily the first $k$ ones). Let $q$ be a prime greater than all $p_i$ (Euclid). Then $\mathbb{Z}/(q)$ (the integers modulo $q$) is a model of $\Gamma$, but $\mathbb{Z}/(q)$ is not a field of characteristic 0. Contradiction.
(iii) follows immediately from (ii) and Lemma 3.2.10. □

Corollary 3.2.13. The class $\mathcal{A}_c$ of all algebraically closed fields is axiomatisable, but not finitely axiomatisable.

Proof. Let $\sigma_n = \forall y_1 \dots y_n \exists x (x^n + y_1 x^{n-1} + \dots + y_{n-1} x + y_n = 0)$. Then $\Gamma = \Delta \cup \{\sigma_n \mid n \ge 1\}$ ($\Delta$ as in Corollary 3.2.12) axiomatises $\mathcal{A}_c$. To show non-finite axiomatisability, apply Lemma 3.2.9 to $\Gamma$ and find a field in which a certain polynomial does not factorise. □

For the classes of posets, ordered sets, groups, rings and Peano structures, axiom sets $\Gamma$ are listed in section 2.7.
The following fact is very useful:

Lemma 3.2.9. If $K = Mod(\Gamma)$ and $K$ is finitely axiomatisable, then $K$ is axiomatisable by a finite subset of $\Gamma$.

Proof. Let $K = Mod(\Delta)$ for a finite $\Delta$, then $K = Mod(\sigma)$, where $\sigma$ is the conjunction of all sentences of $\Delta$ (Exercise 4). Then $\sigma \models \psi$ for all $\psi \in \Gamma$ and $\Gamma \models \sigma$, hence also $\Gamma \vdash \sigma$. Thus there are finitely many $\psi_1, \dots, \psi_k \in \Gamma$ such that $\psi_1, \dots, \psi_k \vdash \sigma$. Claim: $K = Mod(\psi_1, \dots, \psi_k)$.
(i) $\{\psi_1, \dots, \psi_k\} \subseteq \Gamma$, so $Mod(\Gamma) \subseteq Mod(\psi_1, \dots, \psi_k)$.
(ii) From $\psi_1, \dots, \psi_k \vdash \sigma$ it follows that $Mod(\psi_1, \dots, \psi_k) \subseteq Mod(\sigma)$.
Using (i) and (ii) we conclude $Mod(\psi_1, \dots, \psi_k) = K$. □

This lemma is instrumental in proving non-finite-axiomatisability results. We need one more fact.

Lemma 3.2.10.
$K$ is finitely axiomatisable $\Leftrightarrow$ $K$ and its complement $K^c$ are both axiomatisable.

Proof. $\Rightarrow$. Let $K = Mod(\varphi_1, \dots, \varphi_n)$, then $K = Mod(\varphi_1 \wedge \dots \wedge \varphi_n)$. $\mathfrak{A} \in K^c$ (complement of $K$) $\Leftrightarrow \mathfrak{A} \not\models \varphi_1 \wedge \dots \wedge \varphi_n \Leftrightarrow \mathfrak{A} \models \neg(\varphi_1 \wedge \dots \wedge \varphi_n)$. So $K^c = Mod(\neg(\varphi_1 \wedge \dots \wedge \varphi_n))$.
$\Leftarrow$. Let $K = Mod(\Gamma)$, $K^c = Mod(\Delta)$. $K \cap K^c = Mod(\Gamma \cup \Delta) = \emptyset$ (Exercise 1). By compactness, there are $\varphi_1, \dots, \varphi_n \in \Gamma$ and $\psi_1, \dots, \psi_m \in \Delta$ such that $Mod(\varphi_1, \dots, \varphi_n, \psi_1, \dots, \psi_m) = \emptyset$, or
$Mod(\varphi_1, \dots, \varphi_n) \cap Mod(\psi_1, \dots, \psi_m) = \emptyset$, (1)
$K = Mod(\Gamma) \subseteq Mod(\varphi_1, \dots, \varphi_n)$, (2)
$K^c = Mod(\Delta) \subseteq Mod(\psi_1, \dots, \psi_m)$. (3)
(1), (2), (3) $\Rightarrow K = Mod(\varphi_1, \dots, \varphi_n)$. □

Corollary 3.2.14. The class of all torsion-free abelian groups is axiomatisable, but not finitely axiomatisable.

Proof. Exercise 15. □

Remark. In Lemma 3.2.9 we used the Completeness Theorem and in Lemma 3.2.10 the Compactness Theorem. The advantage of using only the Compactness Theorem is that one avoids the notion of provability altogether. The reader might object that this advantage is rather artificial since the Compactness Theorem is a corollary to the Completeness Theorem. This is true in our presentation; one can, however, derive the Compactness Theorem by purely model theoretic means (using ultraproducts, cf. Chang-Keisler), so there are situations where one has to use the Compactness Theorem. For the moment the choice between using the Completeness Theorem or the Compactness Theorem is largely a matter of taste or convenience.

By way of illustration we will give an alternative proof of Lemma 3.2.9 using the Compactness Theorem:
Again we have $Mod(\Gamma) = Mod(\sigma)$ (*). Consider $\Gamma' = \Gamma \cup \{\neg\sigma\}$.
$\mathfrak{A} \in Mod(\Gamma') \Leftrightarrow \mathfrak{A} \in Mod(\Gamma)$ and $\mathfrak{A} \models \neg\sigma$ $\Leftrightarrow \mathfrak{A} \in Mod(\Gamma)$ and $\mathfrak{A} \notin Mod(\sigma)$.
In view of (*) we have $Mod(\Gamma') = \emptyset$. By the Compactness Theorem there is a finite subset $\Delta$ of $\Gamma'$ with $Mod(\Delta) = \emptyset$.
It is no restriction to suppose that $\neg\sigma \in \Delta$, hence $Mod(\psi_1, \dots, \psi_k, \neg\sigma) = \emptyset$. It now easily follows that $Mod(\psi_1, \dots, \psi_k) = Mod(\sigma) = Mod(\Gamma)$. □

Application IV. Ordering Sets.
One easily shows that each finite set can be ordered, for infinite sets this is harder. A simple trick is presented below.

Theorem 3.2.15. Each infinite set can be ordered.

Proof. Let $|X| = \kappa \ge \aleph_0$. Consider $\Gamma$, the set of axioms for linear order (section 2.7.3). $\Gamma$ has a countable model, e.g. $\mathbb{N}$. By the upward Skolem-Löwenheim Theorem $\Gamma$ has a model $\mathfrak{A} = \langle A, < \rangle$ of cardinality $\kappa$. Since $X$ and $A$ have the same cardinality there is a bijection $f : X \to A$. Define $x < x' := f(x) < f(x')$. Evidently, $<$ is a linear order. □

In the same way one gets: Each infinite set can be densely ordered. The same trick works for axiomatisable classes in general.

3. If $\mathfrak{A}$ with language $L$ is a reduct of $\mathfrak{B}$, then $\mathfrak{A} \models \sigma \Leftrightarrow \mathfrak{B} \models \sigma$ for $\sigma \in L$.
5. $\Gamma \models \varphi \Rightarrow \Delta \models \varphi$ for a finite subset $\Delta \subseteq \Gamma$. (Give one proof using completeness, another proof using compactness on $\Gamma \cup \{\neg\varphi\}$.)
6. Show that well-ordering is not a first-order notion. Suppose that $\Gamma$ axiomatises the class of well-orderings. Add countably many constants $c_i$ and show that $\Gamma \cup \{c_{i+1} < c_i \mid i \in \mathbb{N}\}$ has a model.
7. If $\Gamma$ has only finite models, then there is an $n$ such that each model has at most $n$ elements.
8. Let $L$ have the binary predicate symbol $P$. $\sigma := \forall x \neg P(x, x) \wedge \forall xyz(P(x, y) \wedge P(y, z) \rightarrow P(x, z)) \wedge \forall x \exists y P(x, y)$. Show that $Mod(\sigma)$ contains only infinite models.
9. Show that $\sigma \vee \forall xy(x = y)$ has infinite models and a finite model, but no arbitrarily large finite models ($\sigma$ as in 8).
10. Let $L$ have one unary function symbol.
(i) Write down a sentence $\varphi$ such that $\mathfrak{A} \models \varphi \Leftrightarrow f^{\mathfrak{A}}$ is a surjection.
(ii) Idem for an injection.
(iii) Idem for a bijection (permutation).
(iv) Use (ii) to formulate a sentence $\sigma$ such that $\mathfrak{A} \models \sigma \Rightarrow \mathfrak{A}$ is infinite (Dedekind).
(v) Show that each infinite set carries a permutation without fixed points (cf. the proof of 3.2.15).

Definition 3.3.1.
(i) $f : |\mathfrak{A}| \to |\mathfrak{B}|$ is a homomorphism if $\langle a_1, \dots, a_k \rangle \in P_i^{\mathfrak{A}} \Rightarrow \langle f(a_1), \dots, f(a_k) \rangle \in P_i^{\mathfrak{B}}$ for all $P_i$, $f(F_j^{\mathfrak{A}}(a_1, \dots, a_p)) = F_j^{\mathfrak{B}}(f(a_1), \dots, f(a_p))$ for all $F_j$, and $f(c_i^{\mathfrak{A}}) = c_i^{\mathfrak{B}}$ for all $c_i$.
(ii) $f$ is an isomorphism if it is a homomorphism which is bijective and satisfies $\langle a_1, \dots, a_n \rangle \in P_i^{\mathfrak{A}} \Leftrightarrow \langle f(a_1), \dots, f(a_n) \rangle \in P_i^{\mathfrak{B}}$, for all $P_i$.

We write $f : \mathfrak{A} \to \mathfrak{B}$ if $f$ is a homomorphism from $\mathfrak{A}$ to $\mathfrak{B}$. $\mathfrak{A} \cong \mathfrak{B}$ stands for "$\mathfrak{A}$ is isomorphic to $\mathfrak{B}$", i.e. there is an isomorphism $f : \mathfrak{A} \to \mathfrak{B}$.

Definition 3.3.2. $\mathfrak{A}$ and $\mathfrak{B}$ are elementarily equivalent if for all sentences $\sigma$ of $L$, $\mathfrak{A} \models \sigma \Leftrightarrow \mathfrak{B} \models \sigma$.
Notation. $\mathfrak{A} \equiv \mathfrak{B}$.

Exercises
1. Show:
(i) $\Gamma \subseteq \Delta \Rightarrow Mod(\Delta) \subseteq Mod(\Gamma)$,
(ii) $K_1 \subseteq K_2 \Rightarrow Th(K_2) \subseteq Th(K_1)$,
(iii) $Mod(\Gamma \cup \Delta) = Mod(\Gamma) \cap Mod(\Delta)$,
(iv) $Th(K_1 \cup K_2) = Th(K_1) \cap Th(K_2)$,
(v) $K \subseteq Mod(\Gamma) \Leftrightarrow \Gamma \subseteq Th(K)$,
(vi) $Mod(\Gamma \cap \Delta) \supseteq Mod(\Gamma) \cup Mod(\Delta)$,
(vii) $Th(K_1 \cap K_2) \supseteq Th(K_1) \cup Th(K_2)$.
Show that in (vi) and (vii) $\supseteq$ cannot be replaced by $=$.
2. (i) $\Gamma \subseteq Th(Mod(\Gamma))$,
(ii) $K \subseteq Mod(Th(K))$,
(iii) $Th(Mod(\Gamma))$ is a theory with axiom set $\Gamma$.
11. Show: $\sigma$ holds for fields of characteristic zero $\Rightarrow$ $\sigma$ holds for all fields of characteristic $q > p$ for a certain $p$.
12. Consider a sequence of theories $T_i$ such that $T_i \neq T_{i+1}$ and $T_i \subseteq T_{i+1}$. Show that $\bigcup\{T_i \mid i \in \mathbb{N}\}$ is not finitely axiomatisable.
13. If $T_1$ and $T_2$ are theories such that $Mod(T_1 \cup T_2) = \emptyset$, then there is a $\sigma$ such that $T_1 \models \sigma$ and $T_2 \models \neg\sigma$.
14. (i) A group can be ordered $\Leftrightarrow$ each finitely generated subgroup can be ordered.
(ii) An abelian group $\mathfrak{A}$ can be ordered $\Leftrightarrow$ $\mathfrak{A}$ is torsion free. (Hint: look at all closed atoms of $L(\mathfrak{A})$ true in $\mathfrak{A}$.)
15. Prove Corollary 3.2.14.
16. Show that each countable, ordered set can be embedded in the rationals.
17.
Show that the class of trees cannot be axiomatised. Here we define a tree as a structure $\langle T, \le, t \rangle$, where $\le$ is a partial order, such that for each $a$ the predecessors form a finite chain $a = a_n < a_{n-1} < \dots < a_1 < a_0 = t$. $t$ is called the top.
18. A graph (with symmetric and irreflexive $R$) is called $k$-colourable if we can paint the vertices with $k$ different colours such that adjacent vertices have distinct colours. We formulate this by adding $k$ unary predicates $C_1, \dots, C_k$, plus the following axioms:
$\forall x \bigvee_i C_i(x)$,
$\bigwedge_{i \neq j} \forall x \neg(C_i(x) \wedge C_j(x))$,
$\bigwedge_i \forall xy(C_i(x) \wedge C_i(y) \rightarrow \neg R(x, y))$.
Show that a graph is $k$-colourable if each finite subgraph is $k$-colourable (De Bruijn-Erdős).

Note that $\mathfrak{A} \equiv \mathfrak{B} \Leftrightarrow Th(\mathfrak{A}) = Th(\mathfrak{B})$.

Lemma 3.3.3. $\mathfrak{A} \cong \mathfrak{B} \Rightarrow \mathfrak{A} \equiv \mathfrak{B}$.

Proof. Exercise 2. □

Definition 3.3.4. $\mathfrak{A}$ is a substructure (submodel) of $\mathfrak{B}$ (of the same type) if $|\mathfrak{A}| \subseteq |\mathfrak{B}|$; $P_i^{\mathfrak{B}} \cap |\mathfrak{A}|^n = P_i^{\mathfrak{A}}$, $F_j^{\mathfrak{B}} \upharpoonright |\mathfrak{A}|^n = F_j^{\mathfrak{A}}$ and $c_i^{\mathfrak{A}} = c_i^{\mathfrak{B}}$ (where $n$ is the number of arguments).

Notation. $\mathfrak{A} \subseteq \mathfrak{B}$. Note that it is not sufficient for $\mathfrak{A}$ to be contained in $\mathfrak{B}$ "as a set"; the relations and functions of $\mathfrak{B}$ have to be extensions of the corresponding ones on $\mathfrak{A}$, in the specific way indicated above.

Examples. The field of rationals is a substructure of the field of reals, but not of the ordered field of reals. Let $\mathfrak{A}$ be the additive group of rationals, $\mathfrak{B}$ the multiplicative group of non-zero rationals. Although $|\mathfrak{B}| \subseteq |\mathfrak{A}|$, $\mathfrak{B}$ is not a substructure of $\mathfrak{A}$. The well-known notions of subgroups, subrings, subspaces, all satisfy the above definition.

The notion of elementary equivalence only requires that sentences (which do not refer to specific elements, except for constants) are simultaneously true in two structures. We can sharpen the notion, by considering $\mathfrak{A} \subseteq \mathfrak{B}$ and by allowing reference to elements of $|\mathfrak{A}|$.

Definition 3.3.5. $\mathfrak{A}$ is an elementary substructure of $\mathfrak{B}$ (or $\mathfrak{B}$ is an elementary extension of $\mathfrak{A}$) if $\mathfrak{A} \subseteq \mathfrak{B}$ and for all $\varphi(x_1, \dots, x_n)$ in $L$ and $a_1, \dots, a_n \in |\mathfrak{A}|$, $\mathfrak{A} \models \varphi(\bar{a}_1, \dots, \bar{a}_n) \Leftrightarrow \mathfrak{B} \models \varphi(\bar{a}_1, \dots, \bar{a}_n)$.
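For finite structures the conditions of Definition 3.3.4 can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: each structure carries a single binary function, given as a table, and the names `is_substructure`, `table`, `Z6`, `evens` and `units` are hypothetical and do not occur in the text. It mirrors, in miniature, the example of the additive and multiplicative groups of rationals: the units modulo 6 form a subset of $\mathbb{Z}/6$, but with multiplication they are not a substructure of the additive group.

```python
from itertools import product


def table(universe, f):
    """Tabulate a binary function f on universe x universe."""
    return {(x, y): f(x, y) for x, y in product(universe, repeat=2)}


def is_substructure(a_universe, a_op, b_universe, b_op):
    """Check Definition 3.3.4 for structures with one binary function.

    a_op and b_op are dicts mapping pairs (x, y) to values.
    """
    if not a_universe <= b_universe:
        return False
    for pair in product(a_universe, repeat=2):
        # The function of A must be total on |A|^2 and stay inside |A| ...
        if pair not in a_op or a_op[pair] not in a_universe:
            return False
        # ... and it must be the restriction of the function of B.
        if a_op[pair] != b_op[pair]:
            return False
    return True


# Assumed toy structures: Z/6 with addition, and two candidate parts of it.
Z6 = set(range(6))
add6 = table(Z6, lambda x, y: (x + y) % 6)

evens = {0, 2, 4}
evens_add = table(evens, lambda x, y: (x + y) % 6)

units = {1, 5}
units_mul = table(units, lambda x, y: (x * y) % 6)

print(is_substructure(evens, evens_add, Z6, add6))  # True
print(is_substructure(units, units_mul, Z6, add6))  # False
```

The second call fails although the universe is a subset, because the operation on the smaller structure is not the restriction of the operation on the larger one; this is exactly the point of the remark that containment "as a set" is not sufficient.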
3.3 Some Model Theory

In model theory one investigates the various properties of models (structures), in particular in connection with the features of their language. One could say that algebra is a part of model theory; some parts of algebra indeed belong to model theory, other parts only in the sense of the limiting case in which the role of language is negligible. It is the interplay between language and models that makes model theory fascinating. Here we will only discuss the very beginnings of the topic.

In algebra one does not distinguish structures which are isomorphic; the nature of the objects is purely accidental. In logic we have another criterion: we distinguish between two structures by exhibiting a sentence which holds in one but not in the other. So, if $\mathfrak{A} \models \sigma \Leftrightarrow \mathfrak{B} \models \sigma$ for all $\sigma$, then we cannot (logically) distinguish $\mathfrak{A}$ and $\mathfrak{B}$.

Notation (for elementary substructures, Definition 3.3.5). $\mathfrak{A} \prec \mathfrak{B}$. We say that $\mathfrak{A}$ and $\mathfrak{B}$ have the same true sentences with parameters in $\mathfrak{A}$.

Fact 3.3.6. $\mathfrak{A} \prec \mathfrak{B} \Rightarrow \mathfrak{A} \equiv \mathfrak{B}$.

The converse does not hold (cf. Exercise 4).
Since we will often join all elements of $|\mathfrak{A}|$ to $\mathfrak{A}$ as constants, it is convenient to have a special notation for the enriched structure: $\hat{\mathfrak{A}} = (\mathfrak{A}, |\mathfrak{A}|)$.
If one wants to describe a certain structure $\mathfrak{A}$, one has to specify all the basic relationships and functional relations. This can be done in the language $L(\mathfrak{A})$ belonging to $\mathfrak{A}$ (which, incidentally, is the language of the type of $\hat{\mathfrak{A}}$).

Lemma 3.3.9. $\mathfrak{A} \prec \mathfrak{B} \Leftrightarrow \hat{\mathfrak{B}} \models Th(\hat{\mathfrak{A}})$.

Definition 3.3.7. The diagram, $Diag(\mathfrak{A})$, is the set of closed atoms and negations of closed atoms of $L(\mathfrak{A})$, which are true in $\mathfrak{A}$. The positive diagram, $Diag^+(\mathfrak{A})$, is the set of closed atoms $\varphi$ of $L(\mathfrak{A})$ such that $\mathfrak{A} \models \varphi$.

Example.
1. $\mathfrak{A} = \langle \mathbb{N} \rangle$. $Diag(\mathfrak{A}) = \{\bar{n} = \bar{n} \mid n \in \mathbb{N}\} \cup \{\bar{n} \neq \bar{m} \mid n \neq m;\ n, m \in \mathbb{N}\}$.
2. $\mathfrak{B} = \langle \{1, 2, 3\}, < \rangle$ (natural order).
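$Diag(\mathfrak{B}) = \{\bar{1} = \bar{1}, \bar{2} = \bar{2}, \bar{3} = \bar{3}, \bar{1} \neq \bar{2}, \bar{1} \neq \bar{3}, \bar{2} \neq \bar{3}, \bar{2} \neq \bar{1}, \bar{3} \neq \bar{1}, \bar{3} \neq \bar{2}, \bar{1} < \bar{2}, \bar{1} < \bar{3}, \bar{2} < \bar{3}, \neg \bar{2} < \bar{1}, \neg \bar{3} < \bar{2}, \neg \bar{3} < \bar{1}, \neg \bar{1} < \bar{1}, \neg \bar{2} < \bar{2}, \neg \bar{3} < \bar{3}\}$.

Diagrams are useful for lots of purposes. We demonstrate one here: We say that $\mathfrak{A}$ is isomorphically embedded in $\mathfrak{B}$ if there is an isomorphism $f$ from $\mathfrak{A}$ into a substructure of $\mathfrak{B}$.

Lemma 3.3.8. $\mathfrak{A}$ is isomorphically embedded in $\mathfrak{B}$ $\Leftrightarrow$ $\hat{\mathfrak{B}}$ is a model of $Diag(\mathfrak{A})$.

N.B. (on Lemma 3.3.9) $\mathfrak{A} \prec \mathfrak{B}$ holds "up to isomorphism". $\hat{\mathfrak{B}}$ is supposed to be of a similarity type which admits at least constants for all constant symbols of $L(\mathfrak{A})$.

Proof of Lemma 3.3.9. $\Rightarrow$. Let $\varphi(\bar{a}_1, \dots, \bar{a}_n) \in Th(\hat{\mathfrak{A}})$, then $\mathfrak{A} \models \varphi(\bar{a}_1, \dots, \bar{a}_n)$, and hence $\hat{\mathfrak{B}} \models \varphi(\bar{a}_1, \dots, \bar{a}_n)$. So $\hat{\mathfrak{B}} \models Th(\hat{\mathfrak{A}})$.
$\Leftarrow$. By 3.3.8, $\mathfrak{A} \subseteq \mathfrak{B}$ (up to isomorphism). The reader can easily finish the proof now. □

We now give some applications.

Application I. Non-standard Models of Arithmetic.
Recall that $\mathfrak{N} = \langle \mathbb{N}, +, \cdot, s, 0 \rangle$ is the standard model of arithmetic. We know that it satisfies Peano's axioms (cf. example 6, section 2.7). We use the abbreviations introduced in section 2.7.
Let us now construct a non-standard model. Consider $T = Th(\hat{\mathfrak{N}})$. By the Skolem-Löwenheim Theorem $T$ has an uncountable model $\mathfrak{M}$. Since $\mathfrak{M} \models Th(\hat{\mathfrak{N}})$, we have, by 3.3.9, $\mathfrak{N} \prec \mathfrak{M}$. Observe that $\mathfrak{N} \not\cong \mathfrak{M}$ (why?). We will have a closer look at the way in which $\mathfrak{N}$ is embedded in $\mathfrak{M}$.
We note that
$\mathfrak{N} \models \forall xyz(x < y \wedge y < z \rightarrow x < z)$ (1)
$\mathfrak{N} \models \forall xy(x < y \vee x = y \vee y < x)$ (2)
$\mathfrak{N} \models \forall x(\bar{0} \le x)$ (3)
$\mathfrak{N} \models \neg\exists x(\bar{n} < x \wedge x < \overline{n+1})$ (4)
Hence, $\mathfrak{N}$ being an elementary substructure of $\mathfrak{M}$, we have (1) and (2) for $\mathfrak{M}$, i.e. $\mathfrak{M}$ is linearly ordered. From $\mathfrak{N} \prec \mathfrak{M}$ and (3) we conclude that 0 is the first element of $\mathfrak{M}$. Furthermore, (4) with $\mathfrak{N} \prec \mathfrak{M}$ tells us that there are no elements of $\mathfrak{M}$ between the "standard natural numbers". As a result we see that $\mathfrak{N}$ is an initial segment of $\mathfrak{M}$.

Proof of Lemma 3.3.8. $\Rightarrow$. Let $f$ be an isomorphic embedding of $\mathfrak{A}$ in $\mathfrak{B}$, then $\mathfrak{A} \models P_i(\bar{a}_1, \dots, \bar{a}_n) \Leftrightarrow \mathfrak{B} \models P_i(f(a_1), \dots, f(a_n))$ and $\mathfrak{A} \models t(\bar{a}_1, \dots, \bar{a}_n) = s(\bar{a}_1, \dots, \bar{a}_n)$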
$\Leftrightarrow \mathfrak{B} \models t(f(a_1), \dots) = s(f(a_1), \dots)$ (cf. Exercise 2). By interpreting $\bar{a}$ as $f(a)$ in $\hat{\mathfrak{B}}$ (i.e. $\bar{a}^{\hat{\mathfrak{B}}} = f(a)$), we immediately see $\hat{\mathfrak{B}} \models Diag(\mathfrak{A})$.
$\Leftarrow$: Let $\hat{\mathfrak{B}} \models Diag(\mathfrak{A})$. Define a mapping $f : |\mathfrak{A}| \to |\mathfrak{B}|$ by $f(a) = \bar{a}^{\hat{\mathfrak{B}}}$. Then, clearly, $f$ satisfies the conditions of Definition 3.3.1 on relations and functions (since they are given by atoms and negations of atoms). Moreover if $a_1 \neq a_2$ then $\mathfrak{A} \models \neg\, \bar{a}_1 = \bar{a}_2$, so $\hat{\mathfrak{B}} \models \neg\, \bar{a}_1 = \bar{a}_2$. Hence $\bar{a}_1^{\hat{\mathfrak{B}}} \neq \bar{a}_2^{\hat{\mathfrak{B}}}$, and thus $f(a_1) \neq f(a_2)$. This shows that $f$ is an isomorphism. □

We will often identify $\mathfrak{A}$ with its image under an isomorphic embedding into $\mathfrak{B}$, so that we may consider $\mathfrak{A}$ as a substructure of $\mathfrak{B}$.
We have a similar criterion for elementary extension. We say that $\mathfrak{A}$ is elementarily embeddable in $\mathfrak{B}$ if $\mathfrak{A} \cong \mathfrak{A}'$ and $\mathfrak{A}' \prec \mathfrak{B}$ for some $\mathfrak{A}'$. Again, we often simplify matters by just writing $\mathfrak{A} \prec \mathfrak{B}$ when we mean "elementarily embeddable".

[Figure: the standard numbers form an initial segment of $\mathfrak{M}$, followed by the non-standard numbers.]

Remark: it is important to realise that (1) - (4) are not only true in the standard model, but even provable in $PA$. This implies that they hold not only in elementary extensions of $\mathfrak{N}$, but in all Peano structures. The price one has to pay is the actual proving of (1) - (4) in $PA$, which is more cumbersome than merely establishing their validity in $\mathfrak{N}$. However, anyone who can give an informal proof of these simple properties will find out that it is just one more (tedious, but not difficult) step to formalise the proof in our natural deduction system. Step-by-step proofs are outlined in the Exercises 25, 28.

So, all elements of $|\mathfrak{M}| - |\mathfrak{N}|$, the non-standard numbers, come after the standard ones. Since $\mathfrak{M}$ is uncountable, there is at least one non-standard number $a$. Note that $n < a$ for all $n$, so $\mathfrak{M}$ has a non-archimedean order (recall that $\bar{n} = 1 + 1 + \dots + 1$ ($n\times$)).
We see that the successor $S(n)$ ($= n + 1$) of a standard number is standard. Furthermore $\mathfrak{N} \models \forall x(x \neq \bar{0} \rightarrow \exists y(y + \bar{1} = x))$, so, since $\mathfrak{N} \prec \mathfrak{M}$, also $\mathfrak{M} \models \forall x(x \neq \bar{0} \rightarrow \exists y(y + \bar{1} = x))$, i.e. in $\mathfrak{M}$ each number, distinct from zero, has a (unique) predecessor. Since $a$ is non-standard it is distinct from zero, hence it has a predecessor, say $a_1$. Since successors of standard numbers are standard, $a_1$ is non-standard. We can repeat this procedure indefinitely and obtain an infinite descending sequence $a > a_1 > a_2 > a_3 > \dots$ of non-standard numbers. Conclusion: $\mathfrak{M}$ is not well-ordered.

However, non-empty definable subsets of $\mathfrak{M}$ do possess a least element. For, such a set is of the form $\{b \mid \mathfrak{M} \models \varphi(\bar{b})\}$, where $\varphi \in L(\mathfrak{N})$, and we know $\mathfrak{N} \models \exists x \varphi(x) \rightarrow \exists x(\varphi(x) \wedge \forall y(\varphi(y) \rightarrow x \le y))$. This sentence also holds in $\mathfrak{M}$ and it tells us that $\{b \mid \mathfrak{M} \models \varphi(\bar{b})\}$ has a least element if it is not empty.

The above construction not merely gave a non-standard Peano structure (cf. 3.2.5), but also a non-standard model of true arithmetic, i.e. it is a model of all sentences true in the standard model. Moreover, it is an elementary extension.
The non-standard models of $PA$ that are elementary extensions of $\mathfrak{N}$ are the ones that can be handled most easily, since the facts from the standard model carry over. There are also quite a number of properties that have been established for non-standard models in general. We treat two of them here.

The incompleteness of $PA$ has been re-established by quite different means by Paris-Kirby-Harrington, Kripke, and others. As a result we have now examples for $\gamma$, which belong to 'normal mathematics', whereas Gödel's $\gamma$, although purely arithmetical, can be considered as slightly artificial, cf. Barwise, Handbook of Mathematical Logic, D8.
$PA$ has a decidable (recursive) model, namely the standard model. That, however, is the only one. By the theorem of Tennenbaum all non-standard models of $PA$ are undecidable (not recursive).
Application II. Non-standard Real Numbers.
Similarly to the above application, we can introduce non-standard models for the real number system. We use the language of the ordered field $\mathbb{R}$ of real numbers, and for convenience we use the function symbol, $|\ |$, for the absolute value function. By the Skolem-Löwenheim Theorem there is a model ${}^*\mathbb{R}$ of $Th(\hat{\mathbb{R}})$ such that ${}^*\mathbb{R}$ has greater cardinality than $\mathbb{R}$. Applying 3.3.9, we see that $\mathbb{R} \prec {}^*\mathbb{R}$, so ${}^*\mathbb{R}$ is an ordered field, containing the standard real numbers. For cardinality reasons there is an element $a \in |{}^*\mathbb{R}| - |\mathbb{R}|$. For the element $a$ there are two possibilities:
(i) $|a| > |r|$ for all $r \in |\mathbb{R}|$,
(ii) there is an $r \in |\mathbb{R}|$ such that $|a| < r$.
In the second case $\{u \in |\mathbb{R}| \mid u < |a|\}$ is a bounded, non-empty set, which therefore has a supremum $s$ (in $\mathbb{R}$). Since $|a|$ is a non-standard number, there is no standard number between $s$ and $|a|$. By ordinary algebra, there is no standard number between 0 and $||a| - s|$. Hence $||a| - s|^{-1}$ is larger than all standard numbers. So in case (ii) there is also a non-standard number greater than all standard numbers. Elements satisfying the condition (i) above, are called infinite and elements satisfying (ii) are called finite (note that the standard numbers are finite).
We now list a number of facts, leaving the (fairly simple) proofs to the reader.

Theorem 3.3.10. The set of standard numbers in a non-standard model is not definable.

Proof. Suppose there is a $\varphi(x)$ in the language of $PA$, such that: $\mathfrak{M} \models \varphi(\bar{a}) \Leftrightarrow$ "$a$ is a standard natural number", then $\neg\varphi(x)$ defines the non-standard numbers. Since $PA$ proves the least number principle, we have $\mathfrak{M} \models \exists x(\neg\varphi(x) \wedge \forall y < x\, \varphi(y))$, or there is a least non-standard number. However, as we have seen above, this is not the case. So there is no such definition. □

A simple consequence is the

Lemma 3.3.11 (Overspill Lemma).
If $\varphi(\bar{n})$ holds in a non-standard model for infinitely many finite numbers $n$, then $\varphi(\bar{a})$ holds for at least one infinite number $a$.

Proof. Suppose that for no infinite $a$ $\varphi(\bar{a})$ holds, then $\exists y(x < y \wedge \varphi(y))$ defines the set of standard natural numbers in the model. This contradicts the preceding result. □

Our technique of constructing models yields various non-standard models of Peano's arithmetic. We have at this stage no means to decide if all models of $PA$ are elementarily equivalent or not. The answer to this question is provided by Gödel's incompleteness theorem, which states that there is a sentence $\gamma$ such that $PA \nvdash \gamma$ and $PA \nvdash \neg\gamma$.

1. ${}^*\mathbb{R}$ has a non-archimedean order.
2. There are numbers $a$ such that for all positive standard $r$, $0 < |a| < r$. We call such numbers, including 0, infinitesimals.
3. $a$ is infinitesimal $\Leftrightarrow$ $a^{-1}$ is infinite (for $a \neq 0$).
4. For each non-standard finite number $a$ there is a unique standard number $st(a)$ such that $a - st(a)$ is infinitesimal.

Infinitesimals can be used for elementary calculus in the Leibnizian tradition. We will give a few examples. Consider an expansion $\mathbb{R}'$ of $\mathbb{R}$ with a predicate for $\mathbb{N}$ and a function $v$. Let ${}^*\mathbb{R}'$ be the corresponding non-standard model such that $\mathbb{R}' \prec {}^*\mathbb{R}'$. We are actually considering two extensions at the same time. $\mathbb{N}$ is contained in $\mathbb{R}'$, i.e. singled out by a special predicate $N$. Hence $\mathbb{N}$ is extended, along with $\mathbb{R}'$, to ${}^*\mathbb{N}$. As is to be expected ${}^*\mathbb{N}$ is an elementary extension of $\mathbb{N}$ (cf. Exercise 14). Therefore we may safely operate in the traditional manner with real numbers and natural numbers. In particular we have in ${}^*\mathbb{R}'$ also infinite natural numbers available. We want $v$ to be a sequence, i.e. we are only interested in the values of $v$ for natural number arguments. The concepts of convergence, limit, etc. can be taken from analysis.
We will use the notation of the calculus.
The reader may try to give the correct formulation.
Here is one example: $\exists m \forall n > m(|v_n - v_m| < \varepsilon)$ stands for $\exists x(N(x) \wedge \forall y(N(y) \wedge y > x \rightarrow |v(y) - v(x)| < \varepsilon))$. Properly speaking we should relativise quantifiers over natural numbers (cf. 2.5.9), but it is more convenient to use variables of several sorts.

5. The sequence $v$ (or $(v_n)$) converges in $\mathbb{R}'$ iff for all infinite natural numbers $n, m$, $|v_n - v_m|$ is infinitesimal.

Proof. $(v_n)$ converges in $\mathbb{R}'$ if $\mathbb{R}' \models \forall\varepsilon > 0\, \exists n \forall m > n(|v_n - v_m| < \varepsilon)$. Assume that $(v_n)$ converges. Choose for $\varepsilon > 0$ an $n(\varepsilon) \in |\mathbb{R}'|$ such that $\mathbb{R}' \models \forall m > n(\varepsilon)(|v_{n(\varepsilon)} - v_m| < \varepsilon)$. Then also ${}^*\mathbb{R}' \models \forall m > n(\varepsilon)(|v_{n(\varepsilon)} - v_m| < \varepsilon)$. In particular, if $m, m'$ are infinite, then $m, m' > n(\varepsilon)$ for all $\varepsilon$. Hence $|v_m - v_{m'}| < 2\varepsilon$ for all $\varepsilon$. This means that $|v_m - v_{m'}|$ is infinitesimal.
Conversely, if $|v_n - v_m|$ is infinitesimal for all infinite $n, m$, then ${}^*\mathbb{R}' \models \forall m > n(|v_n - v_m| < \varepsilon)$ where $n$ is infinite and $\varepsilon$ standard, positive. So ${}^*\mathbb{R}' \models \exists n \forall m > n(|v_n - v_m| < \varepsilon)$, for each standard $\varepsilon > 0$. Now, since $\mathbb{R}' \prec {}^*\mathbb{R}'$, $\mathbb{R}' \models \exists n \forall m > n(|v_n - v_m| < \varepsilon)$ for $\varepsilon > 0$, so $\mathbb{R}' \models \forall\varepsilon > 0\, \exists n \forall m > n(|v_n - v_m| < \varepsilon)$. Hence $(v_n)$ converges. □

Definition 3.3.14. A theory with axioms $\Gamma$ in the language $L$, is called complete if for each sentence $\sigma$ in $L$, either $\Gamma \vdash \sigma$, or $\Gamma \vdash \neg\sigma$.

A complete theory leaves, so to speak, no questions open, but it does not prima facie restrict the class of models. In the old days mathematicians tried to find for such basic theories as arithmetic axioms that would determine up to isomorphism one model, i.e. to give a set $\Gamma$ of axioms such that $\mathfrak{A}, \mathfrak{B} \in Mod(\Gamma) \Rightarrow \mathfrak{A} \cong \mathfrak{B}$. The Skolem-Löwenheim Theorems have taught us that this is (barring the finite case) unattainable. There is, however, a significant notion:

Definition 3.3.15. Let $\kappa$ be a cardinal. A theory is $\kappa$-categorical if it has at least one model of cardinality $\kappa$ and if any two of its models of cardinality $\kappa$ are isomorphic.
Categoricity in some cardinality is not as unusual as one might think. We list some examples.

1. The theory of infinite sets (identity structures) is $\kappa$-categorical for all infinite $\kappa$.
Proof. Immediate, because here "isomorphic" means "of the same cardinality". □

2. The theory of densely ordered sets without end-points is $\aleph_0$-categorical.
Proof. See any textbook on set theory. The theorem was proved by Cantor using the so-called back-and-forth method. □

3. The theory of divisible torsion-free abelian groups is $\kappa$-categorical for $\kappa > \aleph_0$.
Proof. Check that a divisible torsion-free abelian group is a vector space over the rationals. Use the fact that vector spaces of the same dimension (over the same field) are isomorphic. □

4. The theory of algebraically closed fields (of a fixed characteristic) is $\kappa$-categorical for $\kappa > \aleph_0$.
Proof. Use Steinitz' Theorem: two algebraically closed fields of the same characteristic and of the same uncountable transcendence degree are isomorphic. □

The connection between categoricity and completeness, for countable languages, is given by Vaught's Theorem (3.3.16).

6. $\lim_{n \to \infty} v_n = a \Leftrightarrow |a - v_n|$ is infinitesimal for infinite $n$.
Proof. Similar to 5. □

We have only been able to touch the surface of "non-standard analysis". For an extensive treatment, see e.g. Robinson, Stroyan-Luxemburg.

We can now strengthen the Skolem-Löwenheim Theorems.

Theorem 3.3.12 (Downward Skolem-Löwenheim). Let the language $L$ of $\mathfrak{A}$ have cardinality $\kappa$, and suppose $\mathfrak{A}$ has cardinality $\lambda \ge \kappa$. Then there is a structure $\mathfrak{B}$ of cardinality $\kappa$ such that $\mathfrak{B} \prec \mathfrak{A}$.

Proof. See Corollary 3.4.11. □

Theorem 3.3.13 (Upward Skolem-Löwenheim). Let the language $L$ of $\mathfrak{A}$ have cardinality $\kappa$ and suppose $\mathfrak{A}$ has cardinality $\lambda \ge \kappa$. Then for each $\mu > \lambda$ there is a structure $\mathfrak{B}$ of cardinality $\mu$, such that $\mathfrak{A} \prec \mathfrak{B}$.

Proof. Apply the old upward Skolem-Löwenheim Theorem to $Th(\hat{\mathfrak{A}})$. □

In the completeness proof we used maximally consistent theories. In model theory these are called complete theories.
As a rule the notion is defined with respect to axiom sets.

Theorem 3.3.16 (Vaught's Theorem). If $T$ has no finite models and is $\kappa$-categorical for some $\kappa$ not less than the cardinality of $L$, then $T$ is complete.

Proof. Suppose $T$ is not complete. Then there is a $\sigma$ such that $T \nvdash \sigma$ and $T \nvdash \neg\sigma$. By the Model Existence Lemma, there are $\mathfrak{A}$ and $\mathfrak{B}$ in $Mod(T)$ such that $\mathfrak{A} \models \sigma$ and $\mathfrak{B} \models \neg\sigma$. Since $\mathfrak{A}$ and $\mathfrak{B}$ are infinite we can apply the Skolem-Löwenheim Theorem (upwards or downwards), so as to obtain $\mathfrak{A}'$ and $\mathfrak{B}'$, of cardinality $\kappa$, such that $\mathfrak{A} \equiv \mathfrak{A}'$, and $\mathfrak{B} \equiv \mathfrak{B}'$. But then $\mathfrak{A}' \cong \mathfrak{B}'$, and hence $\mathfrak{A}' \equiv \mathfrak{B}'$, so $\mathfrak{A} \equiv \mathfrak{B}$. This contradicts $\mathfrak{A} \models \sigma$ and $\mathfrak{B} \models \neg\sigma$. □

As a consequence we see that the following theories are complete:
1. the theory of infinite sets;
2. the theory of densely ordered sets without end-points;
3. the theory of divisible torsion-free abelian groups;
4. the theory of algebraically closed fields of fixed characteristic.

A corollary of the last fact was known as Lefschetz' principle: if a sentence $\sigma$, in the first-order language of fields, holds for the complex numbers, it holds for all algebraically closed fields of characteristic zero.

(2) write down all derivations of size 2, using $\sigma_1, \sigma_2, \varphi_1, \varphi_2$, with at most $\sigma_1, \sigma_2$ uncancelled,
...
($n$) write down all derivations of size $n$, using $\sigma_1, \dots, \sigma_n, \varphi_1, \dots, \varphi_n$, with at most $\sigma_1, \dots, \sigma_n$ uncancelled,
...
Each time we get only finitely many theorems and each theorem is eventually derived. The process is clearly effective (although not efficient).
We now observe

Lemma 3.3.17. If $\Gamma$ and $\Gamma^c$ (complement of $\Gamma$) are effectively enumerable, then $\Gamma$ is decidable.

Proof. Generate the lists of $\Gamma$ and $\Gamma^c$ simultaneously. In finitely many steps we will either find $\sigma$ in the list for $\Gamma$ or in the list for $\Gamma^c$. So for each $\sigma$ we can decide in finitely many steps whether $\sigma \in \Gamma$ or not. □

As a corollary we get the

Theorem 3.3.18.
If T i s effectively axiomatisable and complete, then T i s T his means that an "algebraic " theorem a concerning algebraically closed fields of characteristic 0 can be obtained by devising a proof by whatsoever means (analytical, topological, . . . ) for the special case of the complex numbers. decidable. Proof. Since T is complete, we have I k a or I' k -a for each a (where I' ' axiomatizes T ) . So a E TCe y a H I' t- l a . From the above sketch it follows that T a nd T C are effectively enumerable. By t he lemma T is decidable. 0 Decidability. We have seen in chapter I t hat there is an effective method to test whether a proposition is provable - by means of the truth table technique, since "truth = provability" . I t would be wonderful t o have such a method for predicate logic. Church has shown, however, that there is no such method (if we identify "effective" with "recursive") for general predicate logic. But there might be, and indeed there are, special theories which are decidable. A technical study of decidability belongs t o recursion theory. Here we will present a few informal considerations. If T, with language L, has a decidable set of axioms I ', t hen there is an effective method for enumerating all theorems of T . One can obtain such a enumeration as follows: (a) Make an effective list a l, a 2 , a s , . . . of all axioms of T ( this is possible because r is decidable), and a list p ~ , c p z ,... of all formulas of L. ( b) (1) write down all derivations of size 1, using using a l,cpl, with at most a uncancelled, 1 r A pplication. T he following theories are decidable: 1. 2. 3. 4. the t he t he t he theory theory theory theory of of of of infinite sets; densely ordered sets without end-points; divisible, torsion-free abelian groups; algebraically closed fields of fixed characteristic. : Proof. See the consequences of Vaught's Theorem (3.3.16). 
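The decision procedure behind Lemma 3.3.17 can be sketched in Python. This is only an illustration: the function name `decide` and the toy enumerators `evens` and `odds` are assumptions made here, and the sketch presupposes that both enumerations are infinite and that every candidate occurs in exactly one of them.

```python
from itertools import count

def decide(sigma, enum_gamma, enum_complement):
    """Decide whether sigma belongs to Gamma by running an enumeration of
    Gamma and an enumeration of its complement in alternation (hypothetical
    helper; both arguments are functions returning fresh generators)."""
    gamma = enum_gamma()
    complement = enum_complement()
    while True:
        # One step of each list per round: sigma must show up in one of
        # them after finitely many rounds.
        if next(gamma) == sigma:
            return True
        if next(complement) == sigma:
            return False

# Toy stand-in for an effectively enumerable set and its complement:
# Gamma = even natural numbers, complement = odd natural numbers.
def evens():
    return (2 * n for n in count())

def odds():
    return (2 * n + 1 for n in count())

print(decide(10, evens, odds))
print(decide(7, evens, odds))
```

For a complete, effectively axiomatised theory the two generators would enumerate the theorems of $T$ and the sentences whose negation is a theorem, as in the proof of Theorem 3.3.18.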
The effective enumerating is left to the reader (the simplest case is, of course, that of a finitely axiomatisable theory, e.g. (1), (2)). □

We will finally present one more application of the non-standard approach, by giving a non-standard proof of:

Lemma 3.3.19 (König's Lemma). An infinite, finitary tree has an infinite path.

A finitary tree, or fan, has the property that each node has only finitely many immediate successors ('zero successors' is included). By contraposition one obtains from König's Lemma the so-called Fan Theorem (which was actually discovered first):

Theorem 3.3.20. If in a fan all paths are finite then the length of the paths is bounded.

Note that if one considers the tree as a topological space, with its canonical topology (basic open sets "are" nodes), then König's Lemma is the Bolzano-Weierstrass Theorem and the Fan Theorem states the compactness.

We will now provide a non-standard proof of König's Lemma.

Let $T$ be a fan, and let $T^*$ be a proper elementary extension (use 3.3.13).

(1) The relation "... is an immediate successor of ..." can be expressed in the language of partial order:
$x <_i y := x < y \wedge \forall z(x \leq z \leq y \rightarrow x = z \vee y = z)$, where, as usual, $x < y$ stands for $x \leq y \wedge x \neq y$.

(2) If $a$ is standard, then its immediate successors in $T^*$ are also standard. Since $T$ is finitary, we can indicate $a_1, \ldots, a_n$ such that $T \models \forall x(x <_i \bar{a} \leftrightarrow \bigvee_{1 \leq k \leq n} \bar{a}_k = x)$. By $T \prec T^*$, we also have $T^* \models \forall x(x <_i \bar{a} \leftrightarrow \bigvee_{1 \leq k \leq n} \bar{a}_k = x)$, so if $b$ is an immediate successor of $a$ in $T^*$, then $b = a_k$ for some $k \leq n$, i.e. $b$ is standard.
Note that a node without successors in $T$ has no successors in $T^*$ either, for $T \models \forall x(x \leq \bar{a} \leftrightarrow x = \bar{a}) \Leftrightarrow T^* \models \forall x(x \leq \bar{a} \leftrightarrow x = \bar{a})$.

(3) In $T$ we have that a successor of a node is an immediate successor of that node or a successor of an immediate successor, i.e.
$T \models \forall xy(x < y \rightarrow \exists z(x \leq z <_i y))$ (*).
This is the case since for nodes $a$ and $b$ with $a < b$, $b$ must occur in the finite chain of all predecessors of $a$. So let $a = a_n < a_{n-1} < \ldots < a_i = b < a_{i-1} < \ldots$, then $a \leq a_{i+1} <_i b$. Since the desired property is expressed by a first-order sentence (*), (3) also holds for $T^*$.

Let $a^*$ be a non-standard element of $T^*$. We claim that $P = \{a \in |T| \mid a^* < a\}$ is an infinite path (i.e. a chain).

(i) $P$ is linearly ordered since $T \models \forall xyz(x \leq y \wedge x \leq z \rightarrow y \leq z \vee z \leq y)$ and hence for any $p, q \in P \subseteq |T^*|$ we have $p \leq q$ or $q \leq p$.

(ii) Suppose $P$ is finite with last element $b$, then $b$ has a successor and hence an immediate successor in $T^*$, which is a predecessor of $a^*$. By (2) this immediate successor belongs to $P$. Contradiction. Hence $P$ is infinite.

This establishes that $T$ has an infinite path. □

Quantifier Elimination

Some theories have the pleasant property that they allow the reduction of formulas to a particularly simple form: one in which no quantifiers occur. Without going into a general theory of quantifier elimination, we will demonstrate the procedure in a simple case: the theory $DO$ of dense order without end points, cf. 2.7.3(ii); 'without end points' is formulated as "$\forall x \exists yz(y < x \wedge x < z)$".

Let $FV(\varphi) = \{y_1, \ldots, y_n\}$, where all variables actually occur in $\varphi$. By standard methods we obtain a prenex normal form $\varphi'$ of $\varphi$, such that $\varphi' := Q_1 x_1 Q_2 x_2 \ldots Q_m x_m \psi(x_1, \ldots, x_m, y_1, \ldots, y_n)$, where each $Q_i$ is one of the quantifiers $\forall, \exists$. We will eliminate the quantifiers starting with the innermost one.

Consider the case $Q_m = \exists$. We bring $\psi$ into disjunctive normal form $\bigvee_j \psi_j$, where each $\psi_j$ is a conjunction of atoms and negations of atoms. First we observe that the negations of atoms can be eliminated in favor of atoms, since $DO \vdash \neg z = z' \leftrightarrow (z < z' \vee z' < z)$ and $DO \vdash \neg z < z' \leftrightarrow (z = z' \vee z' < z)$. So we may assume that the $\psi_j$'s contain only atoms.

By plain predicate logic we can replace $\exists x_m \bigvee_j \psi_j$ by the equivalent formula $\bigvee_j \exists x_m \psi_j$.
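The elimination of a single existential quantifier in front of a conjunction of atoms, which the text goes on to work out case by case, can be sketched in Python. The encoding is an assumption of this illustration and not the book's notation: an atom is a triple `(op, a, b)` with `op` either `"<"` or `"="`, variables are strings, the empty list stands for the true formula, and the hypothetical constant `FALSE` stands for the false one.

```python
FALSE = "FALSE"  # hypothetical marker for the refutable formula

def eliminate(x, atoms):
    """Return a list of atoms without x whose conjunction is equivalent,
    in dense linear orders without end points, to 'exists x: AND atoms',
    or FALSE when the conjunction is refutable."""
    rest, uppers, lowers, equals = [], [], [], []
    for op, a, b in atoms:
        if x not in (a, b):
            rest.append((op, a, b))            # atom not mentioning x
        elif a == x and b == x:
            if op == "<":
                return FALSE                   # x < x cannot hold
            # x = x always holds, so it is dropped
        elif op == "<" and a == x:
            uppers.append(b)                   # x < u
        elif op == "<":
            lowers.append(a)                   # v < x
        else:
            equals.append(b if a == x else a)  # w = x
    if equals:
        # Some w equals x: substitute the first such w for x.
        w0 = equals[0]
        new = ([("<", w0, u) for u in uppers]
               + [("<", v, w0) for v in lowers]
               + [("=", w0, w) for w in equals[1:]])
    else:
        # Density and absence of end points: a suitable x exists
        # exactly when every lower bound lies below every upper bound.
        new = [("<", v, u) for v in lowers for u in uppers]
    result = []
    for op, a, b in rest + new:
        if a == b:
            if op == "<":
                return FALSE
            continue
        result.append((op, a, b))
    return result

print(eliminate("x", [("<", "x", "u"), ("<", "v", "x")]))
```

With only upper bounds (or only lower bounds) the function returns the empty list, reflecting that the existential statement is provable outright in that situation.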
Notation: for the rest of this example we will use $\sigma \stackrel{*}{\leftrightarrow} \tau$ as an abbreviation for $DO \vdash \sigma \leftrightarrow \tau$.

We have just seen that it suffices to consider only formulas of the form $\exists x_m \bigwedge \sigma_p$, where each $\sigma_p$ is atomic. A systematic look at the conjuncts will show us what to do.

(1) If $x_m$ does not occur in $\bigwedge \sigma_p$, we can delete the quantifier (cf. 2.5.2).

(2) Otherwise, collect all atoms containing $x_m$ and regroup the atoms, such that we get $\bigwedge_i x_m < u_i \wedge \bigwedge_j v_j < x_m \wedge \bigwedge_k w_k = x_m \wedge \chi$, where $\chi$ does not contain $x_m$. Abbreviate this formula as $\tau \wedge \chi$. By predicate logic we have $\exists x_m(\tau \wedge \chi) \leftrightarrow \exists x_m \tau \wedge \chi$ (cf. 2.5.3). Since we want to eliminate $\exists x_m$, it suffices to consider $\exists x_m \tau$ only.

Now the matter has been reduced to bookkeeping. Bearing in mind that we are dealing with a linear order, we will exploit the information given by $\tau$ concerning the relative position of the $u_i$'s, $v_j$'s and $w_k$'s with respect to $x_m$.

(2a) $\tau := \bigwedge x_m < u_i \wedge \bigwedge v_j < x_m \wedge \bigwedge w_k = x_m$. Then $\exists x_m \tau \stackrel{*}{\leftrightarrow} \tau'$, with $\tau' := \bigwedge w_0 < u_i \wedge \bigwedge v_j < w_0 \wedge \bigwedge w_0 = w_k$ (where $w_0$ is the first variable among the $w_k$'s). The equivalence follows immediately by a model theoretic argument (i.e. $DO \models \exists x_m \tau \leftrightarrow \tau'$).

(2b) $\tau := \bigwedge x_m < u_i \wedge \bigwedge v_j < x_m$. Now the properties of $DO$ are essential. Observe that $\exists x_m(\bigwedge x_m < \bar{a}_i \wedge \bigwedge \bar{b}_j < x_m)$ holds in a densely ordered set if and only if all the $a_i$'s lie to the right of the $b_j$'s. So we get (by completeness) $\exists x_m \tau \stackrel{*}{\leftrightarrow} \bigwedge_{i,j} v_j < u_i$.

(2c) $\tau := \bigwedge x_m < u_i \wedge \bigwedge w_k = x_m$. Then $\exists x_m \tau \stackrel{*}{\leftrightarrow} \bigwedge w_0 < u_i \wedge \bigwedge w_k = w_0$.

(2d) $\tau := \bigwedge v_j < x_m \wedge \bigwedge w_k = x_m$. Cf. (2c).

(2e) $\tau := \bigwedge x_m < u_i$. Observe that $\exists x_m \tau$ holds in all ordered sets without a left endpoint. So we have $\exists x_m \tau \stackrel{*}{\leftrightarrow} \top$, since we work in $DO$.

(2f) $\tau := \bigwedge v_j < x_m$. Cf. (2e).

(2g) $\tau := \bigwedge w_k = x_m$. Then $\exists x_m \tau \stackrel{*}{\leftrightarrow} \bigwedge w_0 = w_k$.

Remarks. (i) The cases (2b), (2e) and (2f) make essential use of $DO$. (ii) It is often possible to introduce shortcuts, e.g. when a variable (other than $x_m$) occurs in two of the big conjuncts we have $\exists x_m \tau \stackrel{*}{\leftrightarrow} \bot$.

If the innermost quantifier is universal, we reduce it to an existential one by $\forall x_m \varphi \leftrightarrow \neg\exists x_m \neg\varphi$.

Now it is clear how to eliminate the quantifiers one by one.

Example.
$\exists xy(x < y \wedge \exists z(x < z \wedge z < y \wedge \forall u(u \neq z \rightarrow u < y \vee u = x)))$
$\stackrel{*}{\leftrightarrow} \exists xyz \forall u[x < y \wedge x < z \wedge z < y \wedge (u = z \vee u < y \vee u = x)]$
$\stackrel{*}{\leftrightarrow} \exists xyz \neg\exists u[\neg x < y \vee \neg x < z \vee \neg z < y \vee (\neg u = z \wedge \neg u < y \wedge \neg u = x)]$
$\stackrel{*}{\leftrightarrow} \exists xyz \neg\exists u[x = y \vee y < x \vee x = z \vee z < x \vee z = y \vee y < z \vee ((u < z \vee z < u) \wedge (u = y \vee y < u) \wedge (u < x \vee x < u))]$
$\stackrel{*}{\leftrightarrow} \exists xyz \neg[x = y \vee \ldots \vee y < z \vee (y < z \wedge y < x) \vee (y < z \wedge x < y) \vee (y < z \wedge y < x) \vee (y < z \wedge x < z) \vee (z < y \wedge y < x) \vee (z < y \wedge x < y) \vee (z < x \wedge y < x) \vee \top]$
$\stackrel{*}{\leftrightarrow} \exists xyz(\neg\top)$
$\stackrel{*}{\leftrightarrow} \bot$.

Evidently the above quantifier elimination for the theory of dense order without endpoints provides an alternative proof of its decidability. For, if $\varphi$ is a sentence, then $\varphi$ is equivalent to an open sentence $\varphi'$. Given the language of $DO$ it is obvious that $\varphi'$ is equivalent to either $\top$ or $\bot$. Hence, we have an algorithm for deciding $DO \vdash \varphi$. Note that we have obtained more: $DO$ is complete, since $DO \vdash \varphi \leftrightarrow \bot$ or $DO \vdash \varphi \leftrightarrow \top$, so $DO \vdash \neg\varphi$ or $DO \vdash \varphi$.

In general we cannot expect that much from quantifier elimination: e.g. the theory of algebraically closed fields admits quantifier elimination, but it is not complete (because the characteristic has not been fixed in advance); the open sentences may contain unprovable and unrefutable atoms such as $7 = 12$, $23 = 0$.

We may conclude from the existence of a quantifier elimination a certain model theoretic property, introduced by Abraham Robinson, which has turned out to be important for applications in algebra (cf. the Handbook of Mathematical Logic, A4).

Definition 3.3.21. A theory $T$ is model complete if for $\mathfrak{A}, \mathfrak{B} \in Mod(T)$: $\mathfrak{A} \subseteq \mathfrak{B} \Rightarrow \mathfrak{A} \prec \mathfrak{B}$.

We can now immediately obtain the following:

Theorem 3.3.22. If $T$ admits quantifier elimination, then $T$ is model complete.

Proof. Let $\mathfrak{A}$ and $\mathfrak{B}$ be models of $T$, such that $\mathfrak{A} \subseteq \mathfrak{B}$. We must show that $\mathfrak{A} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n) \Leftrightarrow \mathfrak{B} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n)$ for all $a_1, \ldots, a_n \in |\mathfrak{A}|$, where $FV(\varphi) = \{x_1, \ldots, x_n\}$.
Since $T$ admits quantifier elimination, there is a quantifier free $\psi(x_1, \ldots, x_n)$ such that $T \vdash \varphi \leftrightarrow \psi$. Hence it suffices to show $\mathfrak{A} \models \psi(\bar{a}_1, \ldots, \bar{a}_n) \Leftrightarrow \mathfrak{B} \models \psi(\bar{a}_1, \ldots, \bar{a}_n)$ for a quantifier free $\psi$. A simple induction establishes this equivalence. □

Some theories $T$ have a particular model that is, up to isomorphism, contained in every model of $T$. We call such a model a prime model of $T$.

Examples. (i) The rationals form a prime model for the theory of dense ordering without endpoints; (ii) The field of the rationals is the prime model of the theory of fields of characteristic zero; (iii) The standard model of arithmetic is the prime model of Peano's arithmetic.

Theorem 3.3.23. A model complete theory with a prime model is complete.

Proof. Left to the reader. □

Exercises

1. Let $\mathfrak{A} = \langle A, \leq \rangle$ be a poset. Show that $Diag^+(\mathfrak{A}) \cup \{\bar{a} \neq \bar{b} \mid a \neq b, a, b \in |\mathfrak{A}|\} \cup \{\forall xy(x \leq y \vee y \leq x)\}$ has a model. (Hint: use compactness). Conclude that every poset can be linearly ordered by an extension of its ordering.

2. If $f : \mathfrak{A} \cong \mathfrak{B}$ and $FV(\varphi) = \{x_1, \ldots, x_n\}$, show $\mathfrak{A} \models \varphi[a_1, \ldots, a_n / x_1, \ldots, x_n] \Leftrightarrow \mathfrak{B} \models \varphi[f(a_1), \ldots, f(a_n) / x_1, \ldots, x_n]$. In particular, $\mathfrak{A} \equiv \mathfrak{B}$.

3. Let $\mathfrak{A} \subseteq \mathfrak{B}$. $\varphi$ is called universal (existential) if $\varphi$ is prenex with only universal (existential) quantifiers.
(i) Show that for universal sentences $\varphi$: $\mathfrak{B} \models \varphi \Rightarrow \mathfrak{A} \models \varphi$.
(ii) Show that for existential sentences $\varphi$: $\mathfrak{A} \models \varphi \Rightarrow \mathfrak{B} \models \varphi$.
(Application: a substructure of a group is a group. This is one reason to use the similarity type $\langle -; 2, 1; 1 \rangle$ for groups, instead of $\langle -; 2; 0 \rangle$, or $\langle -; 2; 1 \rangle$, as some authors do).

4. Let $\mathfrak{A} = \langle N, < \rangle$, $\mathfrak{B} = \langle N - \{0\}, < \rangle$. Show: (i) $\mathfrak{A} \cong \mathfrak{B}$; (ii) $\mathfrak{A} \equiv \mathfrak{B}$; (iii) $\mathfrak{B} \subseteq \mathfrak{A}$; (iv) not $\mathfrak{B} \prec \mathfrak{A}$.

5. (Tarski). Let $\mathfrak{A} \subseteq \mathfrak{B}$. Show $\mathfrak{A} \prec \mathfrak{B} \Leftrightarrow$ for all $\varphi \in L$ and $a_1, \ldots, a_n \in |\mathfrak{A}|$, $\mathfrak{B} \models \exists y \varphi(y, \bar{a}_1, \ldots, \bar{a}_n) \Rightarrow$ there is an element $a \in |\mathfrak{A}|$ such that $\mathfrak{B} \models \varphi(\bar{a}, \bar{a}_1, \ldots, \bar{a}_n)$, where $FV(\varphi(y, \bar{a}_1, \ldots, \bar{a}_n)) = \{y\}$. Hint: for $\Leftarrow$ show (i) $t^{\mathfrak{A}}(\bar{a}_1, \ldots, \bar{a}_n) = t^{\mathfrak{B}}(\bar{a}_1, \ldots, \bar{a}_n)$ for $t \in L$, (ii) $\mathfrak{A} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n) \Leftrightarrow \mathfrak{B} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n)$ for $\varphi \in L$ by induction on $\varphi$ (use only $\vee, \neg, \exists$).

6. Another construction of a non-standard model of arithmetic: Add to the language $L$ of arithmetic a new constant $c$. Show $\Gamma = Th(\hat{\mathfrak{N}}) \cup \{c > \bar{n} \mid n \in |\mathfrak{N}|\}$ has a model $\mathfrak{M}$. Show that $\mathfrak{M} \not\cong \mathfrak{N}$. Can $\mathfrak{M}$ be countable?

7. Consider the ring $Z$ of integers. Show that there is an $\mathfrak{A}$ such that $Z \prec \mathfrak{A}$ and $Z \not\cong \mathfrak{A}$ (a non-standard model of the integers). Show that $\mathfrak{A}$ has an "infinite prime number", $p_\infty$. Let $(p_\infty)$ be the principal ideal in $\mathfrak{A}$ generated by $p_\infty$. Show that $\mathfrak{A}/(p_\infty)$ is a field $F$. (Hint: look at $\forall x($"$x$ not in $(p_\infty)$" $\rightarrow \exists yz(xy = 1 + zp_\infty))$, give a proper formulation and use elementary equivalence). What is the characteristic of $F$? (This yields a non-standard construction of the rationals from the integers: consider the prime field).

8. Use the non-standard model of arithmetic to show that "well-ordering" is not a first-order concept.

9. Use the non-standard model of the reals to show that "archimedean ordered field" is not a first-order concept.

10. Consider the language of identity with constants $c_i$ ($i \in N$) and $\Gamma = \{I_1, I_2, I_3\} \cup \{c_i \neq c_j \mid i, j \in N, i \neq j\}$. Show that the theory of $\Gamma$ is $k$-categorical for $k > \aleph_0$, but not $\aleph_0$-categorical.

11. Show that the condition "no finite models" in Vaught's Theorem is necessary (look at the theory of identity).

12. Let $X \subseteq |\mathfrak{A}|$. Define $X_0 = X \cup C$ where $C$ is the set of constants of $\mathfrak{A}$, $X_{n+1} = X_n \cup \{f(a_1, \ldots, a_m) \mid f$ in $\mathfrak{A}, a_1, \ldots, a_m \in X_n\}$, $X_\omega = \bigcup\{X_n \mid n \in N\}$. Show that $\mathfrak{B} = \langle X_\omega, R_1 \cap X_\omega^{r_1}, \ldots, R_n \cap X_\omega^{r_n}, f_1|X_\omega^{a_1}, \ldots, f_m|X_\omega^{a_m}, \{c_i \mid i \in I\} \rangle$ is a substructure of $\mathfrak{A}$. We say that $\mathfrak{B}$ is the substructure generated by $X$. Show that $\mathfrak{B}$ is the smallest substructure of $\mathfrak{A}$ containing $X$; $\mathfrak{B}$ can also be characterized as the intersection of all substructures containing $X$.

13. Let ${}^*R$ be a non-standard model of $Th(R)$. Show that $st$ (cf. page 123) is a homomorphism from the ring of finite numbers onto $R$. What is the kernel?

14. Consider $\mathfrak{R}' = \langle R, N, <, +, \cdot, -, {}^{-1}, 0, 1 \rangle$, where $N$ is the set of natural numbers. $L(\mathfrak{R}')$ has a predicate symbol $N$ and we can, restricting ourselves to $+$ and $\cdot$, recover arithmetic by relativizing our formulas to $N$ (cf. 2.5.9). Let $\mathfrak{R}' \prec {}^*\mathfrak{R}' = \langle {}^*R, {}^*N, \ldots \rangle$. Show that $\mathfrak{N} = \langle N, <, +, \cdot, 0, 1 \rangle \prec \langle {}^*N, <, +, \cdot, 0, 1 \rangle = {}^*\mathfrak{N}$ (Hint: consider for each $\varphi \in L(\mathfrak{N})$ the relativized $\varphi^N \in L(\mathfrak{R}')$).

15. Show that any Peano-structure contains $\mathfrak{N}$ as a substructure.

16. Let $L$ be a language without identity and with at least one constant. Let $\sigma = \exists x_1 \ldots x_n \varphi(x_1, \ldots, x_n)$ and $\Sigma_\sigma = \{\varphi(t_1, \ldots, t_n) \mid t_i$ closed in $L\}$, where $\varphi$ is quantifier free.
(i) $\models \sigma \Leftrightarrow$ each $\mathfrak{A}$ is a model of at least one sentence in $\Sigma_\sigma$. (hint: for each $\mathfrak{A}$, look at the substructure generated by $\emptyset$).
(ii) Consider $\Sigma_\sigma$ as a set of propositions. Show that for each valuation $v$ (in the sense of propositional logic) there is a model $\mathfrak{A}$ such that $[\varphi(t_1, \ldots, t_n)]_v = [\varphi(t_1, \ldots, t_n)]_{\mathfrak{A}}$, for all $\varphi(t_1, \ldots, t_n) \in \Sigma_\sigma$.
(iii) Show that $\models \sigma \Leftrightarrow\ \vdash \bigvee_{i=1}^{m} \varphi(t_1^i, \ldots, t_n^i)$ for a certain $m$ (hint: use Exercise 9, section 1.5).

17. Let $\mathfrak{A}, \mathfrak{B} \in Mod(T)$ and $\mathfrak{A} \equiv \mathfrak{B}$. Show that $Diag(\mathfrak{A}) \cup Diag(\mathfrak{B}) \cup T$ is consistent (use the Compactness Theorem). Conclude that there is a model of $T$ in which both $\mathfrak{A}$ and $\mathfrak{B}$ can be isomorphically embedded.

18. Consider the class $K$ of all structures of type $\langle 1; -; 0 \rangle$ with a denumerable unary relation. Show that any $\mathfrak{A}$ and $\mathfrak{B}$ in $K$ of the same cardinality $\kappa > \aleph_0$ are isomorphic. Show that $T = Th(K)$ is not $\kappa$-categorical for any $\kappa \geq \aleph_0$.

19. Consider a theory $T$ of identity with axioms $\lambda_n$ for all $n \in N$. In which cardinalities is $T$ categorical? Show that $T$ is complete and decidable. Compare the result with Exercise 10.

20. Show that the theory of dense order without end-points is not categorical in the cardinality of the continuum.

21. Consider the structure $\mathfrak{A} = \langle R, <, f \rangle$, where $<$ is the natural order, and where $f$ is a unary function. Let $L$ be the corresponding language. Show that there is no sentence $\sigma$ in $L$ such that $\mathfrak{A} \models \sigma \Leftrightarrow f(r) > 0$ for all $r \in R$. (hint: consider isomorphisms $x \mapsto x + k$).

22. Let $\mathfrak{A} = \langle A, \sim \rangle$, where $\sim$ is an equivalence relation with denumerably many equivalence classes, all of which are infinite. Show that $Th(\mathfrak{A})$ is $\aleph_0$-categorical. Axiomatize $Th(\mathfrak{A})$. Is there a finite axiomatisation? Is $Th(\mathfrak{A})$ $\kappa$-categorical for $\kappa > \aleph_0$?

23. Let $L$ be a language with one unary function symbol $f$. Find a sentence $\tau_n$, which says that "$f$ has a loop of length $n$", i.e. $\mathfrak{A} \models \tau_n \Leftrightarrow$ there are $a_1, \ldots, a_n \in |\mathfrak{A}|$ such that $f^{\mathfrak{A}}(a_i) = a_{i+1}$ ($i < n$) and $f^{\mathfrak{A}}(a_n) = a_1$. Consider a theory $T$ with axiom set $\{\beta, \neg\tau_1, \neg\tau_2, \neg\tau_3, \ldots, \neg\tau_n, \ldots\}$ ($n \in \omega$), where $\beta$ expresses "$f$ is bijective". Show that $T$ is $\kappa$-categorical for $\kappa > \aleph_0$. (hint: consider the partition $\{(f^{\mathfrak{A}})^i(a) \mid i \in \omega\}$ in a model $\mathfrak{A}$). Is $T$ $\aleph_0$-categorical? Show that $T$ is complete and decidable. Is $T$ finitely axiomatisable?

24. Put $T_\forall = \{\sigma \mid T \vdash \sigma$ and $\sigma$ is universal$\}$. Show that $T_\forall$ axiomatizes the theory of all substructures of models of $T$. Note that one part follows from Exercise 3. For the converse: let $\mathfrak{A}$ be a model of $T_\forall$ and consider $Diag(\mathfrak{A}) \cup T$. Use compactness.

25. We say that a theory is preserved under substructures if $\mathfrak{A} \subseteq \mathfrak{B}$ and $\mathfrak{B} \in Mod(T)$ implies $\mathfrak{A} \in Mod(T)$. (Łoś-Tarski). Show that $T$ is preserved under substructures iff $T$ can be axiomatized by universal sentences (use Exercise 24).

26. Let $\mathfrak{A} \equiv \mathfrak{B}$, show that there exists a $\mathfrak{C}$ such that $\mathfrak{A} \prec \mathfrak{C}$, $\mathfrak{B} \prec \mathfrak{C}$ (up to isomorphism). Hint: assume that the set of new constants of $\hat{\mathfrak{B}}$ is disjoint with the set of new constants of $\hat{\mathfrak{A}}$. Show that $Th(\hat{\mathfrak{A}}) \cup Th(\hat{\mathfrak{B}})$ has a model.

27. Show that the ordering $<$, defined by $x < y := \exists u(y = x + Su)$, is provably transitive in Peano's Arithmetic, i.e. $PA \vdash \forall xyz(x < y \wedge y < z \rightarrow x < z)$.
28. Show
(i) $PA \vdash \forall x(0 \leq x)$ (use induction on $x$),
(ii) $PA \vdash \forall x(x = 0 \vee \exists y(x = Sy))$ (use induction on $x$),
(iii) $PA \vdash \forall xy(x + y = y + x)$,
(iv) $PA \vdash \forall y(x < y \rightarrow Sx \leq y)$ (use induction on $y$),
(v) $PA \vdash \forall xy(x < y \vee x = y \vee y < x)$ (use induction on $x$, the case of $x = 0$ is simple, for the step from $x$ to $Sx$ use (iv)),
(vi) $PA \vdash \forall y \neg\exists x(y < x \wedge x < Sy)$ (compare with (iv)).

29. (i) Show that the theory $I_\infty$ of identity with "infinite universe" (cf. section 3.1, Exercise 3 or Exercise 19 above) admits quantifier elimination.
(ii) Show that $I_\infty$ has a prime model.

3.4 Skolem Functions or How to Enrich Your Language

In mathematical arguments one often finds passages such as "... there is an $x$ such that $\varphi(x)$ holds. Let $a$ be such an element, then we see that ...". In terms of our logic, this amounts to the introduction of a constant whenever the existence of some element satisfying a certain condition has been established. The problem is: does one thus strengthen the theory in an essential way? In a precise formulation: suppose $T \vdash \exists x \varphi(x)$. Introduce a (new) constant $a$ and replace $T$ by $T' = T \cup \{\varphi(a)\}$. Question: is $T'$ conservative over $T$, i.e. does $T' \vdash \psi \Rightarrow T \vdash \psi$ hold, for $\psi$ not containing $a$? We have dealt with a similar problem in the context of Henkin theories (section 3.1), so we can use the experience obtained there.

Theorem 3.4.1. Let $T$ be a theory with language $L$, such that $T \vdash \exists x \varphi(x)$, where $FV(\varphi) = \{x\}$, and let $c$ be a constant not occurring in $L$. Then $T \cup \{\varphi(c)\}$ is conservative over $T$.

Proof. By Lemma 3.1.7, $T' = T \cup \{\exists x \varphi(x) \rightarrow \varphi(c)\}$ is conservative over $T$. If $\psi \in L$ and $T' \cup \{\varphi(c)\} \vdash \psi$, then $T' \cup \{\exists x \varphi(x)\} \vdash \psi$, or $T' \vdash \exists x \varphi(x) \rightarrow \psi$. Since $T'$ is conservative over $T$ we have $T \vdash \exists x \varphi(x) \rightarrow \psi$. Using $T \vdash \exists x \varphi(x)$, we get $T \vdash \psi$. (For an alternative proof see Exercise 6). □

The above is but a special case of a very common piece of practice; if one, in the process of proving a theorem, establishes that "for each $x$ there is a $y$ such that $\varphi(x, y)$", then it is convenient to introduce an auxiliary function $f$ that picks a $y$ for each $x$, such that $\varphi(x, f(x))$ holds for each $x$. This technique usually invokes the axiom of choice. We can put the same question in this case: if $T \vdash \forall x \exists y \varphi(x, y)$, introduce a function symbol $f$ and replace $T$ by $T' = T \cup \{\forall x \varphi(x, f(x))\}$. Question: is $T'$ conservative over $T$? The idea of enriching the language by the introduction of extra function symbols, which take the role of choice functions, goes back to Skolem.

Definition 3.4.2. Let $\varphi$ be a formula of the language $L$ with $FV(\varphi) = \{x_1, \ldots, x_n, y\}$. Associate with $\varphi$ an $n$-ary function symbol $f_\varphi$, called the Skolem function (symbol) of $\varphi$. The sentence
$\forall x_1 \ldots x_n(\exists y \varphi(x_1, \ldots, x_n, y) \rightarrow \varphi(x_1, \ldots, x_n, f_\varphi(x_1, \ldots, x_n)))$
is called the Skolem axiom for $\varphi$.

Note that the witness of section 3.1 is a special case of a Skolem function (take $n = 0$): $f_\varphi$ is a constant.

Definition 3.4.3. If $T$ is a theory with language $L$, then $T^{sk} = T \cup \{\sigma \mid \sigma$ is a Skolem axiom for some formula of $L\}$ is the Skolem extension of $T$ and its language $L^{sk}$ extends $L$ by including all Skolem functions for $L$. If $\mathfrak{A}$ is of the type of $L$ and $\mathfrak{A}^{sk}$ an expansion of $\mathfrak{A}$ of the type of $L^{sk}$, such that $\mathfrak{A}^{sk} \models \sigma$ for all Skolem axioms $\sigma$ of $L$ and $|\mathfrak{A}| = |\mathfrak{A}^{sk}|$, then $\mathfrak{A}^{sk}$ is called a Skolem expansion of $\mathfrak{A}$.

The interpretation in $\mathfrak{A}^{sk}$ of a Skolem function symbol is called a Skolem function. Note that a Skolem expansion contains infinitely many functions, so it is a mild extension of our notion of structure. The analogue of 3.1.7 is

Theorem 3.4.4. (i) $T^{sk}$ is conservative over $T$. (ii) Each $\mathfrak{A} \in Mod(T)$ has a Skolem expansion $\mathfrak{A}^{sk} \in Mod(T^{sk})$.

Proof. We first show (ii). We only consider the case of formulas with $FV(\varphi) = \{x_1, \ldots, x_n, y\}$ for $n \geq 1$. The case $n = 0$ is similar, but simpler. It requires the introduction of new constants in $\mathfrak{A}$ (cf. Exercise 6). Let $\mathfrak{A} \in Mod(T)$ and $\varphi \in L$ with $FV(\varphi) = \{x_1, \ldots, x_n, y\}$. We want to find a Skolem function for $\varphi$ in $\mathfrak{A}$. Define $V_{a_1, \ldots, a_n} = \{b \in |\mathfrak{A}| \mid \mathfrak{A} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n, \bar{b})\}$. Apply AC to the set $\{V_{a_1, \ldots, a_n} \mid V_{a_1, \ldots, a_n} \neq \emptyset\}$: there is a choice function $F$ such that $F(V_{a_1, \ldots, a_n}) \in V_{a_1, \ldots, a_n}$. Define a Skolem function by
$F_\varphi(a_1, \ldots, a_n) = F(V_{a_1, \ldots, a_n})$ if $V_{a_1, \ldots, a_n} \neq \emptyset$, and $F_\varphi(a_1, \ldots, a_n) = e$ else, where $e \in |\mathfrak{A}|$.
Now it is a routine matter to check that indeed $\mathfrak{A}^{sk} \models \forall x_1 \ldots x_n(\exists y \varphi(x_1, \ldots, x_n, y) \rightarrow \varphi(x_1, \ldots, x_n, f_\varphi(x_1, \ldots, x_n)))$, where $F_\varphi = f_\varphi^{\mathfrak{A}^{sk}}$, and where $\mathfrak{A}^{sk}$ is the expansion of $\mathfrak{A}$ with all Skolem functions $F_\varphi$ (including the "Skolem constants", i.e. witnesses).
(i) follows immediately from (ii): Let $T \nvdash \psi$ (with $\psi \in L$), then there is a model $\mathfrak{A}$ of $T$ such that $\mathfrak{A} \not\models \psi$. Since $\psi \in L$, we also have $\mathfrak{A}^{sk} \not\models \psi$ (cf. section 3.2, Exercise 3), hence $T^{sk} \nvdash \psi$. □

Remark. It is not necessary (for 3.4.4) to extend $L$ with all Skolem function symbols. We may just add Skolem function symbols for some given set $S$ of formulas of $L$. We then speak of the Skolem extension of $T$ with respect to $S$ (or with respect to $\varphi$ if $S = \{\varphi\}$).

The following corollary confirms that we can introduce Skolem functions in the course of a mathematical argument, without essentially strengthening the theory.

Corollary 3.4.5. If $T \vdash \forall x_1 \ldots x_n \exists y \varphi(x_1, \ldots, x_n, y)$, where $FV(\varphi) = \{x_1, \ldots, x_n, y\}$, then $T' = T \cup \{\forall x_1 \ldots x_n \varphi(x_1, \ldots, x_n, f_\varphi(x_1, \ldots, x_n))\}$ is conservative over $T$.

Proof. Observe that $T'' = T \cup \{\forall x_1 \ldots x_n(\exists y \varphi(x_1, \ldots, x_n, y) \rightarrow \varphi(x_1, \ldots, x_n, f_\varphi(x_1, \ldots, x_n)))\} \vdash \forall x_1 \ldots x_n \varphi(x_1, \ldots, x_n, f_\varphi(x_1, \ldots, x_n))$. So $T' \vdash \psi \Rightarrow T'' \vdash \psi$. Now apply 3.4.4. □

The introduction of a Skolem extension of a theory $T$ results in the "elimination" of the existential quantifier in prefixes of the form $\forall x_1 \ldots x_n \exists y$. The iteration of this process on prenex normal forms eventually results in the elimination of all existential quantifiers.

The Skolem functions in an expanded model are by no means unique. If, however, $\mathfrak{A} \models \forall x_1 \ldots x_n \exists! y \varphi(x_1, \ldots, x_n, y)$, then the Skolem function for $\varphi$ is uniquely determined; we even have $\mathfrak{A}^{sk} \models \forall x_1 \ldots x_n y(\varphi(x_1, \ldots, x_n, y) \leftrightarrow y = f_\varphi(x_1, \ldots, x_n))$. We say that $\varphi$ defines the function $F_\varphi$ in $\mathfrak{A}^{sk}$, and $\forall x_1 \ldots x_n y(\varphi(x_1, \ldots, x_n, y) \leftrightarrow y = f_\varphi(x_1, \ldots, x_n))$ is called the definition of $F_\varphi$ in $\mathfrak{A}^{sk}$.

We may reasonably expect that with respect to Skolem functions the $\forall\exists!$-combination yields better results than the $\forall\exists$-combination. The following theorem tells us that we get substantially more than just a conservative extension result.

Theorem 3.4.6. Let $T \vdash \forall x_1 \ldots x_n \exists! y \varphi(x_1, \ldots, x_n, y)$, where $FV(\varphi) = \{x_1, \ldots, x_n, y\}$ and let $f$ be an $n$-ary symbol not occurring in $T$ or $\varphi$. Then $T^+ = T \cup \{\forall x_1 \ldots x_n y(\varphi(x_1, \ldots, x_n, y) \leftrightarrow y = f(x_1, \ldots, x_n))\}$ is conservative over $T$. Moreover, there is a translation $\tau \rightarrow \tau^\circ$ from $L^+ = L \cup \{f\}$ to $L$, such that
(1) $T^+ \vdash \tau \leftrightarrow \tau^\circ$,
(2) $T^+ \vdash \tau \Leftrightarrow T \vdash \tau^\circ$,
(3) $\tau = \tau^\circ$ for $\tau \in L$.

Proof. (i) We will show that $f$ acts just like a Skolem function; in fact $T^+$ is equivalent to the theory $T'$ of Corollary 3.4.5 (taking $f$ for $f_\varphi$).
(a) $T^+ \vdash \forall x_1 \ldots x_n \varphi(x_1, \ldots, x_n, f(x_1, \ldots, x_n))$. For, $T^+ \vdash \forall x_1 \ldots x_n \exists y \varphi(x_1, \ldots, x_n, y)$ and $T^+ \vdash \forall x_1 \ldots x_n y(\varphi(x_1, \ldots, x_n, y) \leftrightarrow y = f(x_1, \ldots, x_n))$. Now a simple exercise in natural deduction, involving $RI_4$, yields (a). Therefore $T' \subseteq T^+$ (in the notation of 3.4.5).
Conversely, $T'$ together with the uniqueness of $y$ (which $T$ already proves) yields the defining axiom of $f$, so $T^+$ and $T'$ are equivalent and $T^+$ is conservative over $T$ by 3.4.5.

(ii) The idea, underlying the translation, is to replace occurrences of $f(-)$ by a new variable and to eliminate $f$. Let $\tau \in L^+$ and let $f(-)$ be a term in $L^+$ not containing $f$ in any of its subterms. Then $\vdash \tau(\ldots, f(-), \ldots) \leftrightarrow \exists y(y = f(-) \wedge \tau(\ldots, y, \ldots))$, where $y$ does not occur in $\tau$, and $T^+ \vdash \tau(\ldots, f(-), \ldots) \leftrightarrow \exists y(\varphi(-, y) \wedge \tau(\ldots, y, \ldots))$. The right-hand side contains one occurrence of $f$ less than $\tau$. Iteration of the procedure leads to the required $f$-free formula $\tau^\circ$. The reader can provide the details of a precise inductive definition of $\tau^\circ$; note that one need only consider atomic $\tau$ (the translation extends trivially to all formulas). Hint: define something like "$f$-depth" of terms and atoms. From the above description of $\tau^\circ$ it immediately follows that $T^+ \vdash \tau \leftrightarrow \tau^\circ$. Now (2) follows from (i) and (1). Finally (3) is evident. □

As a special case we get the explicit definition of a function.

Corollary 3.4.7. Let $FV(t) = \{x_1, \ldots, x_n\}$ and $f \notin L$. Then $T^+ = T \cup \{\forall x_1 \ldots x_n(t = f(x_1, \ldots, x_n))\}$ is conservative over $T$.

Proof. We have $\forall x_1 \ldots x_n \exists! y(y = t)$, so the definition of $f$, as in 3.4.6, becomes $\forall x_1 \ldots x_n y(y = t \leftrightarrow y = f(x_1, \ldots, x_n))$, which, by the predicate and identity rules, is equivalent to $\forall x_1 \ldots x_n(t = f(x_1, \ldots, x_n))$. □

We call $f(x_1, \ldots, x_n) = t$ the explicit definition of $f$. One can also add new predicate symbols to a language in order to replace formulas by atoms.

Theorem 3.4.8. Let $FV(\varphi) = \{x_1, \ldots, x_n\}$ and let $Q$ be a predicate symbol not in $L$. Then
(i) $T^+ = T \cup \{\forall x_1 \ldots x_n(\varphi \leftrightarrow Q(x_1, \ldots, x_n))\}$ is conservative over $T$.
(ii) there is a translation $\tau \rightarrow \tau^\circ$ into $L$ such that
(1) $T^+ \vdash \tau \leftrightarrow \tau^\circ$,
(2) $T^+ \vdash \tau \Leftrightarrow T \vdash \tau^\circ$,
(3) $\tau = \tau^\circ$ for $\tau \in L$.

Proof. Similar to, but simpler than, the above. We indicate the steps; the details are left to the reader.
(a) Let $\mathfrak{A}$ be of the type of $L$. Expand $\mathfrak{A}$ to $\mathfrak{A}^+$ by adding a relation $Q^+ = \{\langle a_1, \ldots, a_n \rangle \mid \mathfrak{A} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n)\}$.
(b) Show $\mathfrak{A} \models T \Leftrightarrow \mathfrak{A}^+ \models T^+$ and conclude (i).
(c) Imitate the translation of 3.4.6. □

We call the extensions shown in 3.4.6, 3.4.7 and 3.4.8 extensions by definition. The sentences
$\forall x_1 \ldots x_n y(\varphi \leftrightarrow y = f(x_1, \ldots, x_n))$,
$\forall x_1 \ldots x_n(f(x_1, \ldots, x_n) = t)$,
$\forall x_1 \ldots x_n(\varphi \leftrightarrow Q(x_1, \ldots, x_n))$,
are called the defining axioms for $f$ and $Q$ respectively.

Extension by Definition belongs to the daily practice of mathematics (and science in general). If a certain notion, definable in a given language, plays an important role in our considerations, then it is convenient to have a short, handy notation for it. Think of "$x$ is a prime number", "$x$ is equal to $y$ or less than $y$", "$z$ is the maximum of $x$ and $y$", etc.

Examples.

1. Characteristic functions. Consider a theory $T$ with (at least) two constants $c_0, c_1$, such that $T \vdash c_0 \neq c_1$. Let $FV(\varphi) = \{x_1, \ldots, x_n\}$, then $T \vdash \forall x_1 \ldots x_n \exists! y((\varphi \wedge y = c_1) \vee (\neg\varphi \wedge y = c_0))$. (Show this directly or use the Completeness Theorem). The defining axiom for the characteristic function $K_\varphi$ is $\forall x_1 \ldots x_n y[((\varphi \wedge y = c_1) \vee (\neg\varphi \wedge y = c_0)) \leftrightarrow y = K_\varphi(x_1, \ldots, x_n)]$.

2. Definition by (primitive) Recursion. In arithmetic one often introduces functions by recursion, e.g. $x!$, $x^y$. The study of these and similar functions belongs to recursion theory; here we only note that we can conservatively add symbols and axioms for them. Fact (Gödel, Davis, Matijasevich): each recursive function is definable in $PA$, in the sense that there is a formula $\varphi$ of $PA$ such that
(i) $PA \vdash \forall x_1 \ldots x_n \exists! y \varphi(x_1, \ldots, x_n, y)$ and
(ii) for $k_1, \ldots, k_n, m \in N$: $f(k_1, \ldots, k_n) = m \Rightarrow PA \vdash \varphi(\bar{k}_1, \ldots, \bar{k}_n, \bar{m})$.
For details see Smorynski, 1991; Davis, 1958.

Before ending this chapter, let us briefly return to the topic of Skolem functions and Skolem expansions. As we remarked before, the introduction of Skolem functions allows us to dispense with certain existential quantifiers in formulas. We will exploit this idea to rewrite formulas as universal formulas (in an extended language!).

First we transform the formula $\varphi$ into prenex normal form $\varphi'$. Let us suppose that $\varphi' = \forall x_1 \ldots x_n \exists y \psi(x_1, \ldots, x_n, y, z_1, \ldots, z_k)$, where $z_1, \ldots, z_k$ are all the free variables in $\varphi$. Now consider
$T^* = T \cup \{\forall x_1 \ldots x_n z_1 \ldots z_k(\exists y \psi(x_1, \ldots, x_n, y, z_1, \ldots, z_k) \rightarrow \psi(x_1, \ldots, x_n, f(x_1, \ldots, x_n, z_1, \ldots, z_k), z_1, \ldots, z_k))\}$.
By Theorem 3.4.4 $T^*$ is conservative over $T$, and it is a simple exercise in logic to show that $T^* \vdash \forall x_1 \ldots x_n \exists y \psi(-, y, -) \leftrightarrow \forall x_1 \ldots x_n \psi(-, f(\ldots), -)$.

We now repeat the process and eliminate the next existential quantifier in the prefix of $\psi$; in finitely many steps we obtain a formula $\varphi^s$ in prenex normal form without existential quantifiers, which, in a suitable conservative extension of $T$ obtained by a series of Skolem expansions, is equivalent to $\varphi$.

Warning: the Skolem form $\varphi^s$ differs in kind from other normal forms, in the sense that it is not logically equivalent to $\varphi$. Theorem 3.4.4 shows that the adding of Skolem axioms to a theory is conservative, so we can safely operate with Skolem forms. The Skolem form $\varphi^s$ has the property that it is satisfiable if and only if $\varphi$ is so (cf. Exercise 4). Therefore it is sometimes called the Skolem form for satisfiability. There is a dual Skolem form $\varphi_s$ (cf. Exercise 5), which is valid if and only if $\varphi$ is so. $\varphi_s$ is called the Skolem form for validity.

Example. $\forall x_1 \exists y_1 \exists y_2 \forall x_2 \exists y_3 \forall x_3 \forall x_4 \exists y_4\, \varphi(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4, z_1, z_2)$.
step 1. Eliminate $y_1$: $\forall x_1 \exists y_2 \forall x_2 \exists y_3 \forall x_3 x_4 \exists y_4\, \varphi(x_1, x_2, x_3, x_4, f(x_1, z_1, z_2), y_2, y_3, y_4, z_1, z_2)$.
step 2. Eliminate $y_2$: $\forall x_1 x_2 \exists y_3 \forall x_3 x_4 \exists y_4\, \varphi(\ldots, f(x_1, z_1, z_2), g(x_1, z_1, z_2), y_3, y_4, z_1, z_2)$.
step 3. Eliminate $y_3$: $\forall x_1 x_2 x_3 x_4 \exists y_4\, \varphi(\ldots, f(x_1, z_1, z_2), g(x_1, z_1, z_2), h(x_1, x_2, z_1, z_2), y_4, z_1, z_2)$.
step 4.
Eliminate $y_4$: $\forall x_1 x_2 x_3 x_4\, \varphi(\ldots, f(x_1, z_1, z_2), g(x_1, z_1, z_2), h(x_1, x_2, z_1, z_2), k(x_1, x_2, x_3, x_4, z_1, z_2), z_1, z_2)$.

In Skolem expansions we have functions available which pick elements for us. We can exploit this to obtain elementary extensions.

Theorem 3.4.9. Consider $\mathfrak{A}$ and $\mathfrak{B}$ of the same type. If $\mathfrak{B}^{sk}$ is a Skolem expansion of $\mathfrak{B}$ and $\mathfrak{A}^* \subseteq \mathfrak{B}^{sk}$, where $\mathfrak{A}^*$ is some expansion of $\mathfrak{A}$, then $\mathfrak{A} \prec \mathfrak{B}$.

Proof. We use Exercise 5 of section 3.3. Let $a_1, \ldots, a_n \in |\mathfrak{A}|$, then $\mathfrak{B} \models \exists y \varphi(y, \bar{a}_1, \ldots, \bar{a}_n) \Leftrightarrow \mathfrak{B}^{sk} \models \varphi(f_\varphi(\bar{a}_1, \ldots, \bar{a}_n), \bar{a}_1, \ldots, \bar{a}_n)$, where $f_\varphi$ is the Skolem function for $\varphi$. Since $\mathfrak{A}^* \subseteq \mathfrak{B}^{sk}$, $f_\varphi^{\mathfrak{A}^*}(a_1, \ldots, a_n) = f_\varphi^{\mathfrak{B}^{sk}}(a_1, \ldots, a_n)$ and so $b = (f_\varphi(\bar{a}_1, \ldots, \bar{a}_n))^{\mathfrak{B}^{sk}} = (f_\varphi(\bar{a}_1, \ldots, \bar{a}_n))^{\mathfrak{A}^*} \in |\mathfrak{A}|$. Hence $\mathfrak{B}^{sk} \models \varphi(\bar{b}, \bar{a}_1, \ldots, \bar{a}_n)$. This shows $\mathfrak{A} \prec \mathfrak{B}$. □

Definition 3.4.10. Let $X \subseteq |\mathfrak{A}|$. The Skolem Hull $\mathfrak{S}_X$ of $X$ is the substructure of $\mathfrak{A}$ which is the reduct of the structure generated by $X$ in the Skolem expansion $\mathfrak{A}^{sk}$ of $\mathfrak{A}$ (cf. Exercise 12, section 3.3).

In other words $\mathfrak{S}_X$ is the smallest substructure of $\mathfrak{A}$, containing $X$, which is closed under all Skolem functions (including the constants).

Corollary 3.4.11. For all $X \subseteq |\mathfrak{A}|$, $\mathfrak{S}_X \prec \mathfrak{A}$.

We now immediately get the strengthening of the downward Skolem-Löwenheim Theorem formulated in Theorem 3.3.12, by observing that the cardinality of a substructure generated by $X$ is the maximum of the cardinalities of $X$ and of the language. This holds too in the present case, where infinitely many Skolem functions are added to the language.

Exercises

1. Consider the example concerning the characteristic function.
(i) Show $T^+ \vdash \forall x_1 \ldots x_n(\varphi \leftrightarrow K_\varphi(x_1, \ldots, x_n) = c_1)$.
(ii) Translate $K_\varphi(x_1, \ldots, x_n) = K_\varphi(y_1, \ldots, y_n)$.
(iii) Show $T^+ \vdash \forall x_1 \ldots x_n y_1 \ldots y_n(K_\varphi(x_1, \ldots, x_n) = K_\varphi(y_1, \ldots, y_n)) \leftrightarrow \forall x_1 \ldots x_n \varphi(x_1, \ldots, x_n) \vee \forall x_1 \ldots x_n \neg\varphi(x_1, \ldots, x_n)$.

2. Determine the Skolem forms of
(a) $\forall y \exists x(2x^2 + yx - 1 = 0)$,
(b) $\forall \varepsilon \exists \delta(\varepsilon > 0 \rightarrow (\delta > 0 \wedge \forall x(|x - a| < \delta \rightarrow |f(x) - f(a)| < \varepsilon)))$,
(c) $\forall x \exists y(x = f(y))$,
(d) $\forall xy(x < y \rightarrow \exists u(u < x) \wedge \exists v(y < v) \wedge \exists w(x < w \wedge w < y))$,
(e) $\forall x \exists y(x = y^2 \vee x = -y^2)$.

3. Let $\sigma^s$ be the Skolem form of $\sigma$. Consider only sentences.
(i) Show that $\Gamma \cup \{\sigma^s\}$ is conservative over $\Gamma \cup \{\sigma\}$.
(ii) Put $\Gamma^s = \{\sigma^s \mid \sigma \in \Gamma\}$. Show that for finite $\Gamma$, $\Gamma^s$ is conservative over $\Gamma$.
(iii) Show that $\Gamma^s$ is conservative over $\Gamma$ for arbitrary $\Gamma$.

4. A formula $\varphi$ with $FV(\varphi) = \{x_1, \ldots, x_n\}$ is called satisfiable if there is an $\mathfrak{A}$ and $a_1, \ldots, a_n \in |\mathfrak{A}|$ such that $\mathfrak{A} \models \varphi(\bar{a}_1, \ldots, \bar{a}_n)$. Show that $\varphi$ is satisfiable iff $\varphi^s$ is satisfiable.

5. Let $\sigma$ be a sentence in prenex normal form. We define the dual Skolem form $\sigma_s$ of $\sigma$ as follows: let $\sigma = (Q_1 x_1) \ldots (Q_n x_n)\tau$, where $\tau$ is quantifier free and the $Q_i$ are quantifiers. Consider $\sigma' = (\bar{Q}_1 x_1) \ldots (\bar{Q}_n x_n)\neg\tau$, where $\bar{Q}_i = \forall, \exists$ iff $Q_i = \exists, \forall$. Suppose $(\sigma')^s = (\bar{Q}_{i_1} x_{i_1}) \ldots (\bar{Q}_{i_k} x_{i_k})\neg\tau'$; then $\sigma_s = (Q_{i_1} x_{i_1}) \ldots (Q_{i_k} x_{i_k})\tau'$.
In words: eliminate from $\sigma$ the universal quantifiers and their variables just as the existential ones in the case of the Skolem form. We end up with an existential sentence.
Example. $(\forall x \exists y \forall z \varphi(x, y, z))_s = \exists y \varphi(c, y, f(y))$.
We suppose that $L$ has at least one constant symbol.
(a) Show that for all (prenex) sentences $\sigma$, $\models \sigma$ iff $\models \sigma_s$. (Hint: look at Exercise 4). Hence the name: "Skolem form for validity".
(b) Prove Herbrand's Theorem: $\vdash \sigma \Leftrightarrow\ \vdash \bigvee_{i=1}^{m} \sigma'_s(t_1^i, \ldots, t_n^i)$ for some $m$, where $\sigma'_s$ is obtained from $\sigma_s$ by deleting the quantifiers. The $t_j^i$ ($i \leq m$, $j \leq n$) are certain closed terms in the dual Skolem expansion of $L$. Hint: look at $\neg(\neg\sigma)^s$. Use Exercise 16, section 3.3.

6. Let $T \vdash \exists x \varphi(x)$, with $FV(\varphi) = \{x\}$. Show that any model $\mathfrak{A}$ of $T$ can be expanded to a model $\mathfrak{A}^*$ of $T$ with an extra constant $c$ such that $\mathfrak{A}^* \models \varphi(c)$. Use this for an alternative proof of 3.4.1.

7. Consider $I_\infty$, the theory of identity "with infinite universe" with axioms $\lambda_n$ ($n \in N$), and $I'_\infty$ with extra constants $c_i$ ($i \in N$) and axioms $c_i \neq c_j$ for $i \neq j$, $i, j \in N$. Show that $I'_\infty$ is conservative over $I_\infty$.

4. Second Order Logic

In first-order predicate logic the variables range over elements of a structure, in particular the quantifiers are interpreted in the familiar way as "for all elements $a$ of $|\mathfrak{A}|$ ..." and "there exists an element $a$ of $|\mathfrak{A}|$ ...". We will now allow a second kind of variable ranging over subsets of the universe and its Cartesian products, i.e. relations over the universe.

The introduction of these second-order variables is not the result of an unbridled pursuit of generality; one is often forced to take all subsets of a structure into consideration. Examples are "each bounded non-empty set of reals has a supremum", "each non-empty set of natural numbers has a smallest element", "each ideal is contained in a maximal ideal". Already the introduction of the reals on the basis of the rationals requires quantification over sets of rationals, as we know from the theory of Dedekind cuts. Instead of allowing variables for (and quantification over) sets, one can also allow variables for functions. However, since we can reduce functions to sets (or relations), we will restrict ourselves here to second-order logic with set variables.

When dealing with second-order arithmetic we can restrict our attention to variables ranging over subsets of $N$, since there is a coding of finite sequences of numbers to numbers, e.g. via Gödel's $\beta$-function, or via prime factorisation. In general we will, however, allow for variables for relations.

The introduction of the syntax of second-order logic is so similar to that of first-order logic that we will leave most of the details to the reader. The alphabet consists of symbols for
(i) individual variables: $x_0, x_1, x_2, \ldots$,
(ii) individual constants: $c_0, c_1, c_2, \ldots$
and for each $n \geq 0$,
(iii) $n$-ary set (predicate) variables: $X_0^n, X_1^n, X_2^n, \ldots$,
(iv) $n$-ary set (predicate) constants: $\bot, P_0^n, P_1^n, P_2^n, \ldots$,
(v) connectives: $\wedge, \to, \vee, \neg, \leftrightarrow, \exists, \forall$.
Finally we have the usual auxiliary symbols: ( , ) , , .

Remark. There are denumerably many variables of each kind. The number of constants may be arbitrarily large.

Formulas are inductively defined by:
(i) $X_i^0, P_i^0, \bot \in FORM$,
(ii) for $n > 0$: $X^n(t_1, \ldots, t_n) \in FORM$, $P^n(t_1, \ldots, t_n) \in FORM$,
(iii) $FORM$ is closed under the propositional connectives,
(iv) $FORM$ is closed under first- and second-order quantification.

Notation. We will often write $\langle x_1, \ldots, x_n \rangle \in X^n$ for $X^n(x_1, \ldots, x_n)$ and we will usually drop the superscript in $X^n$.

The semantics of second-order logic is defined in the same manner as in the case of first-order logic.

Definition 4.1. A second-order structure is a sequence $\mathfrak{A} = \langle A, A^*, c^*, R^* \rangle$, where $A^* = \langle A_n \mid n \in \mathbb{N} \rangle$, $c^* = \{c_i \mid i \in \mathbb{N}\} \subseteq A$, $R^* = \langle R_i^n \mid i, n \in \mathbb{N} \rangle$, and $A_n \subseteq \mathcal{P}(A^n)$, $R_i^n \in A_n$.

In words: a second-order structure consists of a universe $A$ of individuals and second-order universes of $n$-ary relations ($n \geq 0$), individual constants and set (relation) constants, belonging to the various universes. In case each $A_n$ contains all $n$-ary relations (i.e. $A_n = \mathcal{P}(A^n)$), we call $\mathfrak{A}$ full.

Since we have listed $\bot$ as a 0-ary predicate constant, we must accommodate it in the structure $\mathfrak{A}$. In accordance with the customary definitions of set theory, we write $0 = \emptyset$, $1 = \{0\}$ and $2 = \{0, 1\}$. Also we take $A^0 = 1$, and hence $A_0 \subseteq \mathcal{P}(A^0) = \mathcal{P}(1) = 2$. By convention we assign 0 to $\bot$. Since we also want a distinct 0-ary predicate (proposition) $\top := \neg\bot$, we put $1 \in A_0$. So, in fact, $A_0 = \mathcal{P}(A^0) = 2$.

Now, in order to define validity in $\mathfrak{A}$, we mimic the procedure of first-order logic.
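For a finite universe the second-order universes of a full structure (Definition 4.1) can simply be enumerated. The following is a minimal sketch under that assumption; the names `cartesian_power` and `full_universe` are hypothetical and serve only as illustration. It shows in particular why $A_0 = \mathcal{P}(A^0)$ has exactly two elements: $A^0$ contains only the empty tuple, so its subsets are the empty set (playing the role of $\bot$) and the one-element set (playing the role of $\top$).

```python
from itertools import combinations, product

def cartesian_power(universe, n):
    """All n-tuples over the universe; for n = 0 this is [()], a one-element set."""
    return list(product(universe, repeat=n))

def full_universe(universe, n):
    """P(A^n): every n-ary relation over the universe, each as a frozenset of tuples."""
    tuples = cartesian_power(universe, n)
    return [frozenset(chosen)
            for size in range(len(tuples) + 1)
            for chosen in combinations(tuples, size)]

A = [0, 1, 2]               # a hypothetical three-element universe of individuals
A0 = full_universe(A, 0)    # the two 0-ary relations (propositions)
A1 = full_universe(A, 1)    # all subsets of A
A2 = full_universe(A, 2)    # all binary relations on A
print(len(A0), len(A1), len(A2))
```

The sizes grow as $2^{|A|^n}$, which is why this enumeration is only practical for very small universes.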
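Given a structure $\mathfrak{A}$, we introduce an extended language $L(\mathfrak{A})$ with names $\overline{S}$ for all elements $S$ of $A$ and $A_n$ ($n \in \mathbb{N}$). The constants $R_i^n$ are interpretations of the corresponding constant symbols $P_i^n$. We define $\mathfrak{A} \models \varphi$, "$\varphi$ is true or valid in $\mathfrak{A}$", for closed $\varphi$.

Definition 4.2.
(i) $\mathfrak{A} \models \overline{S}$ if $S = 1$,
(ii) $\mathfrak{A} \models \overline{S}^n(\overline{s}_1, \ldots, \overline{s}_n)$ if $\langle s_1, \ldots, s_n \rangle \in S^n$,
(iii) the propositional connectives are interpreted as usual (cf. 1.2.1, 2.4.5),
(iv) $\mathfrak{A} \models \forall x\, \varphi(x)$ if $\mathfrak{A} \models \varphi(\overline{s})$ for all $s \in A$; $\mathfrak{A} \models \exists x\, \varphi(x)$ if $\mathfrak{A} \models \varphi(\overline{s})$ for some $s \in A$,
(v) $\mathfrak{A} \models \forall X^n \varphi(X^n)$ if $\mathfrak{A} \models \varphi(\overline{S}^n)$ for all $S^n \in A_n$; $\mathfrak{A} \models \exists X^n \varphi(X^n)$ if $\mathfrak{A} \models \varphi(\overline{S}^n)$ for some $S^n \in A_n$.

If $\mathfrak{A} \models \varphi$ we say that $\varphi$ is true, or valid, in $\mathfrak{A}$.

As in first-order logic we have a natural deduction system, which consists of the usual rules for first-order logic, plus extra rules for second-order quantifiers:
$\forall^2 I$: from $\varphi$ infer $\forall X^n \varphi$,
$\forall^2 E$: from $\forall X^n \varphi$ infer $\varphi^*$,
$\exists^2 I$: from $\varphi^*$ infer $\exists X^n \varphi$,
$\exists^2 E$: from $\exists X^n \varphi$ and a derivation of $\psi$ from the hypothesis $[\varphi]$ infer $\psi$,
where the conditions on $\forall^2 I$ and $\exists^2 E$ are the usual ones, and $\varphi^*$ is obtained from $\varphi$ by replacing each occurrence of $X^n(t_1, \ldots, t_n)$ by $\sigma(t_1, \ldots, t_n)$ for a certain formula $\sigma$, such that no free variables among the $t_i$ become bound after the substitution.

Note that $\exists^2 I$ gives us the traditional Comprehension Schema:
$\exists X^n \forall x_1 \ldots x_n[\varphi(x_1, \ldots, x_n) \leftrightarrow X^n(x_1, \ldots, x_n)]$,
where $X^n$ may not occur free in $\varphi$.

Proof. Apply $\exists^2 I$ to $\forall x_1 \ldots x_n(\varphi(x_1, \ldots, x_n) \leftrightarrow \varphi(x_1, \ldots, x_n))$ to obtain $\exists X^n \forall x_1 \ldots x_n(\varphi(x_1, \ldots, x_n) \leftrightarrow X^n(x_1, \ldots, x_n))$. Since the top line is derivable, we have a proof of the desired principle.

Conversely, $\exists^2 I$ follows from the comprehension principle, given the ordinary rules of logic. The proof is sketched here ($\vec{x}$ and $\vec{t}$ stand for sequences of variables or terms; assume that $X^n$ does not occur in $\sigma$). From the hypothesis $[\forall \vec{x}(\sigma(\vec{x}) \leftrightarrow X^n(\vec{x}))]$ and the premise $\varphi(\ldots, \sigma(\vec{t}), \ldots)$ we obtain $\varphi(\ldots, X^n(\vec{t}), \ldots)$ (step †), and from this $\exists X^n \varphi(\ldots, X^n(\vec{t}), \ldots)$ (step *). The hypothesis is then cancelled by an existential elimination against the comprehension instance $\exists X^n \forall \vec{x}(\sigma(\vec{x}) \leftrightarrow X^n(\vec{x}))$, which leaves $\exists X^n \varphi(\ldots, X^n(\vec{t}), \ldots)$.

In † a number of steps are involved, i.e. those necessary for the Substitution Theorem. In * we have applied a harmless $\exists$-introduction, in the sense that we went from an instance involving a variable to an existence statement, exactly as in first-order logic.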
This seems to beg the question, as we want to justify $\exists^2$-introduction. However, on the basis of the ordinary quantifier rules we have justified something much stronger than * on the assumption of the Comprehension Schema, namely the introduction of the existential quantifier, given a formula $\sigma$ and not merely a variable or a constant.

Since we can define $\forall^2$ from $\exists^2$ a similar argument works for $\forall^2 E$.

The extra strength of the second-order quantifier rules lies in $\forall^2 I$ and $\exists^2 E$. We can make this precise by considering second-order logic as a special kind of first-order logic (i.e. "flattening" second-order logic). The basic idea is to introduce special predicates to express the relation between a predicate and its arguments.

So let us consider a first-order logic with a sequence of predicates $Ap_0, Ap_1, Ap_2, Ap_3, \ldots$, such that each $Ap_n$ is $(n+1)$-ary. We think of $Ap_n(x, y_1, \ldots, y_n)$ as $x^n(y_1, \ldots, y_n)$. For $n = 0$ we get $Ap_0(x)$ as a first-order version of $X^0$, but that is in accordance with our intentions. $X^0$ is a proposition (i.e. something that can be assigned a truth value), and so is $Ap_0(x)$. We now have a logic in which all variables are first-order, so we can apply all the results from the preceding chapters.

For the sake of a natural simulation of second-order logic we add unary predicates $V, U_0, U_1, U_2, \ldots$, to be thought of as "is an element", "is a 0-ary predicate (i.e. proposition)", "is a 1-ary predicate", etc.

We now have to indicate axioms of our first-order system that embody the characteristic properties of second-order logic.

(i) $\forall xyz(U_i(x) \wedge U_j(y) \wedge V(z) \to x \neq y \wedge y \neq z \wedge z \neq x)$ for all $i \neq j$
(i.e. the $U_i$'s are pairwise disjoint, and disjoint from $V$).
(ii) $\forall x y_1 \ldots y_n(Ap_n(x, y_1, \ldots, y_n) \to U_n(x) \wedge \bigwedge_i V(y_i))$ for $n \geq 1$
(i.e. if $x, y_1, \ldots, y_n$ are in the relation $Ap_n$, then think of $x$ as a predicate, and the $y_i$'s as elements).
(iii) $U_0(c_0)$, $V(c_{2i+1})$, for $i \geq 0$, and $U_n(c_{3^i \cdot 5^n})$, for $i, n \geq 0$
(i.e. certain constants are designated as "elements" and "predicates").
(iv) $\forall z_1 \ldots z_m \exists x[U_n(x) \wedge \forall y_1 \ldots y_n(\bigwedge_i V(y_i) \to (\varphi^* \leftrightarrow Ap_n(x, y_1, \ldots, y_n)))]$, where $x \notin FV(\varphi^*)$, see below
(the first-order version of the comprehension schema; we assume that $FV(\varphi) \subseteq \{z_1, \ldots, z_m, y_1, \ldots, y_n\}$).
(v) $\neg Ap_0(c_0)$
(so there is a 0-ary predicate for 'falsity').

We claim that the first-order theory given by the above axioms represents second-order logic in the following precise way: we can translate second-order logic in the language of the above theory such that derivability is faithfully preserved.

The translation is obtained by assigning suitable symbols to the various symbols of the alphabet of second-order logic and defining an inductive procedure for converting composite strings of symbols. We put

$(x_i)^* := x_{2i+1}$, $(c_i)^* := c_{2i+1}$, for $i \geq 0$,
$(X_i^n)^* := x_{3^i \cdot 5^n}$, $(P_i^n)^* := c_{3^i \cdot 5^n}$, for $i \geq 0$, $n \geq 0$,
$(X_i^0)^* := Ap_0(x_{3^i})$, for $i \geq 0$,
$(P_i^0)^* := Ap_0(c_{3^i})$, for $i \geq 0$,
$(\bot)^* := Ap_0(c_0)$,
$(\varphi \mathbin{\Box} \psi)^* := \varphi^* \mathbin{\Box} \psi^*$ for binary connectives $\Box$, and $(\neg\varphi)^* := \neg\varphi^*$,
$(\forall x_i\, \varphi(x_i))^* := \forall x_i^*(V(x_i^*) \to \varphi^*(x_i^*))$,
$(\exists x_i\, \varphi(x_i))^* := \exists x_i^*(V(x_i^*) \wedge \varphi^*(x_i^*))$,
$(\forall X_i^n \varphi(X_i^n))^* := \forall (X_i^n)^*(U_n((X_i^n)^*) \to \varphi^*((X_i^n)^*))$,
$(\exists X_i^n \varphi(X_i^n))^* := \exists (X_i^n)^*(U_n((X_i^n)^*) \wedge \varphi^*((X_i^n)^*))$.

It is a tedious but routine job to show that $\vdash_2 \varphi \Leftrightarrow\ \vdash_1 \varphi^*$, where 2 and 1 refer to derivability in the respective second-order and first-order systems.

Note that the above translation could be used as an excuse for not doing second-order logic at all, were it not for the fact that the first-order version is not nearly so natural as the second-order one. Moreover, it obscures a number of interesting and fundamental features, e.g. validity in all principal models (see below) makes sense for the second-order version, whereas it is rather an extraneous matter with the first-order version.

Definition 4.3. A second-order structure $\mathfrak{A}$ is called a model of second-order logic if the comprehension schema is valid in $\mathfrak{A}$. If $\mathfrak{A}$ is full (i.e. $A_n = \mathcal{P}(A^n)$ for all $n$), then we call $\mathfrak{A}$ a principal (or standard) model.

From the notion of model we get two distinct notions of "second-order validity": (i) true in all models, (ii) true in all principal models.

Recall that $\mathfrak{A} \models \varphi$ was defined for arbitrary second-order structures; we will use $\models \varphi$ for "true in all models". By the standard induction on derivations we get $\vdash_2 \varphi \Rightarrow\ \models \varphi$. Using the above translation into first-order logic we also get $\models \varphi \Rightarrow\ \vdash_2 \varphi$. Combining these results we get

Theorem 4.4 (Completeness Theorem). $\vdash_2 \varphi \Leftrightarrow\ \models \varphi$.

Obviously, we also have: $\models \varphi \Rightarrow \varphi$ is true in all principal models. The converse, however, is not the case. We can make this plausible by the following argument:

(i) We can define the notion of a unary function in second-order logic, and hence the notions 'bijective' and 'surjective'. Using these notions we can formulate a sentence $\sigma$, which states "the universe (of individuals) is finite" (any injection of the universe into itself is a surjection).

(ii) Consider $\Gamma = \{\sigma\} \cup \{\lambda_n \mid n \in \mathbb{N}\}$. $\Gamma$ is consistent, because each finite subset $\{\sigma, \lambda_{n_1}, \ldots, \lambda_{n_k}\}$ is consistent, since it has a second-order model, namely the principal model over a universe with $n$ elements, where $n = \max(n_1, \ldots, n_k)$.

So, by the Completeness Theorem above, $\Gamma$ has a second-order model. Suppose now that $\Gamma$ has a principal model $\mathfrak{A}$. Then $|\mathfrak{A}|$ is actually Dedekind finite, and (assuming the axiom of choice) finite. Say $\mathfrak{A}$ has $n_0$ elements, then $\mathfrak{A} \not\models \lambda_{n_0+1}$. Contradiction.

So $\Gamma$ has no principal model. Hence the Completeness Theorem fails for validity w.r.t. principal models (and likewise compactness). To find a sentence that holds in all principal models, but fails in some model, a more refined argument is required.
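The finite half of this argument can be checked by brute force. The sketch below assumes a principal model over a small finite universe, so that function variables range over all unary functions; it represents functions as Python dictionaries rather than as relations, and the names `sigma_holds` and `lambda_holds` are hypothetical. It confirms that $\sigma$ holds over an $n$-element universe, which satisfies $\lambda_1, \ldots, \lambda_n$ but not $\lambda_{n+1}$.

```python
from itertools import product

def all_functions(universe):
    """Every unary function from the universe to itself, as a dict."""
    for values in product(universe, repeat=len(universe)):
        yield dict(zip(universe, values))

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, universe):
    return set(f.values()) == set(universe)

def sigma_holds(universe):
    """sigma: every injection of the universe into itself is a surjection."""
    return all(is_surjective(f, universe)
               for f in all_functions(universe) if is_injective(f))

def lambda_holds(universe, n):
    """lambda_n: there are at least n distinct elements."""
    return len(set(universe)) >= n

for size in range(1, 5):
    universe = list(range(size))
    print(size, sigma_holds(universe),
          lambda_holds(universe, size), lambda_holds(universe, size + 1))
```

No finite search of this kind can, of course, say anything about the non-principal models whose existence follows from the Completeness Theorem.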
A peculiar feature of second-order logic is the definability of all the usual connectives in terms of $\forall$ and $\to$.

Theorem 4.5.
(a) $\vdash_2 \bot \leftrightarrow \forall X^0.X^0$,
(b) $\vdash_2 \varphi \wedge \psi \leftrightarrow \forall X^0((\varphi \to (\psi \to X^0)) \to X^0)$,
(c) $\vdash_2 \varphi \vee \psi \leftrightarrow \forall X^0((\varphi \to X^0) \wedge (\psi \to X^0) \to X^0)$,
(d) $\vdash_2 \exists x\, \varphi \leftrightarrow \forall X^0(\forall x(\varphi \to X^0) \to X^0)$,
(e) $\vdash_2 \exists X^n \varphi \leftrightarrow \forall X^0(\forall X^n(\varphi \to X^0) \to X^0)$.

Proof. (a) is obvious. (b) From $\varphi \wedge \psi$ and the hypothesis $\varphi \to (\psi \to X^0)$ we get $X^0$ by two applications of $\to E$; cancelling the hypothesis and applying $\forall^2 I$ gives one direction. Conversely, substitute $\varphi \wedge \psi$ for $X^0$ by $\forall^2 E$ and use that $\varphi \to (\psi \to \varphi \wedge \psi)$ is derivable. (d) From $\exists x\, \varphi$ and the hypothesis $\forall x(\varphi \to X^0)$ we get $X^0$ by $\exists E$; conversely, substitute $\exists x\, \varphi$ for $X^0$ and use that $\forall x(\varphi \to \exists x\, \varphi)$ is derivable. (c) and (e) are left to the reader. $\Box$

In second-order logic we also have natural means to define identity for individuals. The underlying idea, going back to Leibniz, is that equals have exactly the same properties.

Definition 4.6 (Leibniz-identity). $x = y := \forall X(X(x) \leftrightarrow X(y))$.

This defined identity has the desired properties, i.e. it satisfies $I_1, \ldots, I_4$.

Theorem 4.7.
(i) $\vdash_2 x = x$,
(ii) $\vdash_2 x = y \to y = x$,
(iii) $\vdash_2 x = y \wedge y = z \to x = z$,
(iv) $\vdash_2 x = y \to (\varphi(x) \to \varphi(y))$.

Proof. Obvious. $\Box$

In case the logic already has an identity relation for individuals, say $\doteq$, we can show

Theorem 4.8. $\vdash_2 x \doteq y \leftrightarrow x = y$.

Proof. $\to$ is obvious, by $I_4$. $\leftarrow$ is obtained as follows: from $x = y$, i.e. $\forall X(X(x) \leftrightarrow X(y))$, we get $x \doteq x \leftrightarrow y \doteq x$ by $\forall^2 E$; since $x \doteq x$ holds, $y \doteq x$ and hence $x \doteq y$ follow. In $\forall^2 E$ we have substituted $z \doteq x$ for $X(z)$. $\Box$

We can also use second-order logic to extend Peano's Arithmetic to second-order arithmetic. We consider a second-order logic with (first-order) identity and one binary predicate constant $S$, which represents, intuitively, the successor relation. The following special axioms are added:

1. $\exists! x \forall y\, \neg S(y, x)$
2. $\forall x \exists! y\, S(x, y)$
3. $\forall xyz(S(x, z) \wedge S(y, z) \to x = y)$

For convenience we extend the language with numerals and the successor function. This extension is conservative anyway, under the following axioms:
(i) $\forall y\, \neg S(y, \overline{0})$,
(ii) $S(\overline{n}, \overline{n+1})$,
(iii) $y = x' \leftrightarrow S(x, y)$.

We now write down the induction axiom (N.B., not a schema, as in first-order arithmetic, but a proper axiom!).

4. $\forall X(X(0) \wedge \forall x(X(x) \to X(x')) \to \forall x X(x))$

The extension from first-order to second-order arithmetic is not conservative. It is, however, beyond our modest means to prove this fact.

One can also use the idea behind the induction axiom to give an (inductive) definition of the class of natural numbers in a second-order logic with axioms (1), (2), (3): $N$ is the smallest class containing 0 and closed under the successor operation.

Let $\nu(x) := \forall X[(X(0) \wedge \forall y(X(y) \to X(y'))) \to X(x)]$. Then, by the comprehension axiom, $\exists Y \forall x(\nu(x) \leftrightarrow Y(x))$.

As yet we cannot assert the existence of a unique $Y$ satisfying $\forall x(\nu(x) \leftrightarrow Y(x))$, since we have not yet introduced identity for second-order terms. Therefore, let us add identity relations for the various second-order terms, plus their obvious axioms. Now we can formulate the

Axiom of Extensionality. $\forall \vec{x}(X^n(\vec{x}) \leftrightarrow Y^n(\vec{x})) \leftrightarrow X^n = Y^n$.

So, finally, with the help of the axiom of extensionality, we can assert $\exists! Y \forall x(\nu(x) \leftrightarrow Y(x))$. Thus we can conservatively add a unary predicate constant $N$ with axiom $\forall x(\nu(x) \leftrightarrow N(x))$.

The axiom of extensionality is on the one hand rather basic: it allows definition by abstraction ("the set of all $x$, such that ..."). On the other hand it is rather harmless: we can always turn a second-order model without extensionality into one with extensionality by taking a quotient with respect to the equivalence relation induced by $=$.

Exercises

1. Show that the restriction on $X^n$ in the comprehension schema cannot be dropped (consider $\neg X(x)$).

2. Show $\Gamma \vdash_2 \varphi \Leftrightarrow \Gamma^* \vdash_1 \varphi^*$ (where $\Gamma^* = \{\psi^* \mid \psi \in \Gamma\}$). Hint: use induction on the derivation, with the comprehension schema and simplified $\forall$, $\exists$-rules. For the quantifier rules it is convenient to consider an intermediate step consisting of a replacement of the free variable by a fresh constant of the proper kind.

3. Prove (c) and (e) of Theorem 4.5.

4. Prove Theorem 4.7.

5. Give a formula $\varphi(X^2)$ which states that $X^2$ is a function.

6. Give a formula $\varphi(X^2)$ which states that $X^2$ is a linear order.

7.
Give a sentence $\sigma$ which states that the individuals can be linearly ordered without having a last element ($\sigma$ can serve as an infinity axiom).

8. Given second-order arithmetic with the successor function, give axioms for addition as a ternary relation.

9. Let a second-order logic with a binary predicate constant $<$ be given with extra axioms that make $<$ a dense linear ordering without end points. We write $x < y$ for $<(x, y)$. $X$ is a Dedekind Cut if $\exists x X(x) \wedge \exists x \neg X(x) \wedge \forall x(X(x) \wedge y < x \to X(y))$. Define a partial ordering on the Dedekind cuts by putting $X \leq X' := \forall x(X(x) \to X'(x))$. Show that this partial order is total.

10. Consider the first-order version of second-order logic (involving the predicates $Ap_n$, $U_n$, $V$) with the axiom of extensionality. Any model $\mathfrak{A}$ of this first-order theory can be "embedded" in the principal second-order model over $L_{\mathfrak{A}} = \{a \in |\mathfrak{A}| \mid \mathfrak{A} \models V(\overline{a})\}$, as follows. Define for any $r \in U_n$, $f(r) = \{\langle a_1, \ldots, a_n \rangle \mid \mathfrak{A} \models Ap_n(\overline{r}, \overline{a}_1, \ldots, \overline{a}_n)\}$. Show that $f$ establishes an "isomorphic" embedding of $\mathfrak{A}$ into the corresponding principal model. Hence principal models can be viewed as unique maximal models of second-order logic.

11. Formulate the axiom of choice ("for each number $x$ there is a set $X$ ...") in second-order arithmetic.

12. Show that in Definition 4.6 a single implication suffices.

5. Intuitionistic Logic

5.1 Constructive Reasoning

In the preceding chapters, we have been guided by the following, seemingly harmless extrapolation from our experience with finite sets: infinite universes can be surveyed in their totality. In particular we can determine in a global manner whether $\mathfrak{A} \models \exists x\, \varphi(x)$ holds, or not. To adapt Hermann Weyl's phrasing: we are used to thinking of infinite sets not merely as defined by a property, but as sets whose elements are so to speak spread out in front of us, so that we can run through them just as an officer in the police office goes through his file. This view of the mathematical universe is an attractive but rather unrealistic idealisation. If one takes our limitations in the face of infinite totalities seriously, then one has to read a statement like "there is a prime number greater than $N$" (for a given large number $N$) in a stricter way than "it is impossible that the set of primes is exhausted before $N$". For we cannot inspect the set of natural numbers in a glance and detect a prime. We have to exhibit a prime $p$ greater than $N$.

Similarly, one might be convinced that a certain problem (e.g. the determination of the saddle point of a zero-sum game) has a solution on the basis of an abstract theorem (such as Brouwer's fixed point theorem). Nonetheless one cannot always exhibit a solution. What one needs is a constructive method (proof) that determines the solution.

One more example to illustrate the restrictions of abstract methods. Consider the problem "Are there two irrational numbers $a$ and $b$ such that $a^b$ is rational?" We apply the following smart reasoning: suppose $\sqrt{2}^{\sqrt{2}}$ is rational, then we have solved the problem. Should $\sqrt{2}^{\sqrt{2}}$ be irrational, then $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$ is rational. In both cases there is a solution, so the answer to the problem is: Yes. However, should somebody ask us to produce such a pair $a, b$, then we have to engage in some serious number theory in order to come up with the right choice between the numbers mentioned above.

Evidently, statements can be read in an inconstructive way, as we did in the preceding chapters, and in a constructive way. We will in the present chapter briefly sketch the logic one uses in constructive reasoning. In mathematics the practice of constructive procedures and reasoning has been advocated by a number of people, but the founding fathers of constructive mathematics clearly are L. Kronecker and L.E.J. Brouwer. The latter presented a complete program for the rebuilding of mathematics on a constructive basis. Brouwer's mathematics (and the accompanying logic) is called intuitionistic, and in this context the traditional nonconstructive mathematics (and logic) is called classical.

There are a number of philosophical issues connected with intuitionism, for which we refer the reader to the literature, cf. Dummett, Troelstra-van Dalen.

Since we can no longer base our interpretations of logic on the fiction that the mathematical universe is a predetermined totality which can be surveyed as a whole, we have to provide a heuristic interpretation of the logical connectives in intuitionistic logic. We will base our heuristics on the proof-interpretation put forward by A. Heyting. A similar semantics was proposed by A. Kolmogorov; the proof-interpretation is called the Brouwer-Heyting-Kolmogorov (BHK)-interpretation.

The point of departure is that a statement $\varphi$ is considered to be true (or to hold) if we have a proof for it. By a proof we mean a mathematical construction that establishes $\varphi$, not a deduction in some formal system. For example, a proof of '$2 + 3 = 5$' consists of the successive constructions of 2, 3 and 5, followed by a construction that adds 2 and 3, followed by a construction that compares the outcome of this addition and 5.

The primitive notion is here "$a$ proves $\varphi$", where we understand by a proof a (for our purpose unspecified) construction. We will now indicate how proofs of composite statements depend on proofs of their parts.

($\wedge$) $a$ proves $\varphi \wedge \psi$ := $a$ is a pair $\langle b, c \rangle$ such that $b$ proves $\varphi$ and $c$ proves $\psi$.
($\vee$) $a$ proves $\varphi \vee \psi$ := $a$ is a pair $\langle b, c \rangle$ such that $b$ is a natural number and if $b = 0$ then $c$ proves $\varphi$, if $b \neq 0$ then $c$ proves $\psi$.
($\to$) $a$ proves $\varphi \to \psi$ := $a$ is a construction that converts any proof $p$ of $\varphi$ into a proof $a(p)$ of $\psi$.
($\bot$) no $a$ proves $\bot$.

In order to deal with the quantifiers we assume that some domain $D$ of objects is given.

($\forall$) $a$ proves $\forall x\, \varphi(x)$ := $a$ is a construction such that for each $b \in D$, $a(b)$ proves $\varphi(\overline{b})$.
($\exists$) $a$ proves $\exists x\, \varphi(x)$ := $a$ is a pair $\langle b, c \rangle$ such that $b \in D$ and $c$ proves $\varphi(\overline{b})$.

The above explanation of the connectives serves as a means of giving the reader a feeling for what is and what is not correct in intuitionistic logic. It is generally considered the intended intuitionistic meaning of the connectives.

Examples.

1. $\varphi \wedge \psi \to \varphi$ is true, for let $\langle a, b \rangle$ be a proof of $\varphi \wedge \psi$, then the construction $c$ with $c(a, b) = a$ converts a proof of $\varphi \wedge \psi$ into a proof of $\varphi$. So $c$ proves $(\varphi \wedge \psi \to \varphi)$.

2. $(\varphi \wedge \psi \to \sigma) \to (\varphi \to (\psi \to \sigma))$. Let $a$ prove $\varphi \wedge \psi \to \sigma$, i.e. $a$ converts each proof $\langle b, c \rangle$ of $\varphi \wedge \psi$ into a proof $a(b, c)$ of $\sigma$. Now the required proof $p$ of $\varphi \to (\psi \to \sigma)$ is a construction that converts each proof $b$ of $\varphi$ into a proof $p(b)$ of $\psi \to \sigma$. So $p(b)$ is a construction that converts a proof $c$ of $\psi$ into a proof $(p(b))(c)$ of $\sigma$. Recall that we had a proof $a(b, c)$ of $\sigma$, so put $(p(b))(c) = a(b, c)$; let $q$ be given by $q(c) = a(b, c)$, then $p$ is defined by $p(b) = q$. Clearly, the above contains the description of a construction that converts $a$ into a proof $p$ of $\varphi \to (\psi \to \sigma)$. (For those familiar with the $\lambda$-notation: $p = \lambda b.\lambda c.a(b, c)$, so $\lambda a.\lambda b.\lambda c.a(b, c)$ is the proof we are looking for.)

3. $\neg\exists x\, \varphi(x) \to \forall x\, \neg\varphi(x)$. We will now argue a bit more informally. Suppose we have a construction $a$ that reduces a proof of $\exists x\, \varphi(x)$ to a proof of $\bot$. We want a construction $p$ that produces for each $d \in D$ a proof of $\varphi(\overline{d}) \to \bot$, i.e. a construction that converts a proof of $\varphi(\overline{d})$ into a proof of $\bot$. So let $b$ be a proof of $\varphi(\overline{d})$, then $\langle d, b \rangle$ is a proof of $\exists x\, \varphi(x)$, and $a(d, b)$ is a proof of $\bot$. Hence $p$ with $(p(d))(b) = a(d, b)$ is a proof of $\forall x\, \neg\varphi(x)$. This provides us with a construction that converts $a$ into $p$.

The reader may try to justify some statements for himself, but he should not worry if the details turn out to be too complicated. A convenient handling of these problems requires a bit more machinery than we have at hand (e.g. $\lambda$-notation). Note, by the way, that the whole procedure is not unproblematic since we assume a number of closure properties of the class of constructions.

Now that we have given a rough heuristics of the meaning of the logical connectives in intuitionistic logic, let us move on to a formalisation. As it happens, the system of natural deduction is almost right. The only rule that lacks constructive content is that of Reductio ad Absurdum. As we have seen (p. 37), an application of RAA yields $\vdash \neg\neg\varphi \to \varphi$, but for $\neg\neg\varphi \to \varphi$ to hold informally we need a construction that transforms a proof of $\neg\neg\varphi$ into a proof of $\varphi$. Now $a$ proves $\neg\neg\varphi$ if $a$ transforms each proof $b$ of $\neg\varphi$ into a proof of $\bot$, i.e. there cannot be a proof $b$ of $\neg\varphi$. $b$ itself should be a construction that transforms each proof $c$ of $\varphi$ into a proof of $\bot$. So we know that there cannot be a construction that turns a proof of $\varphi$ into a proof of $\bot$, but that is a long way from the required proof of $\varphi$! (cf. ex. 1)

5.2 Intuitionistic Propositional and Predicate Logic

We adopt all the rules of natural deduction for the connectives $\vee, \wedge, \to, \bot, \exists, \forall$ with the exception of the rule RAA. In order to cover both propositional and predicate logic in one sweep we allow in the alphabet (cf. 2.3., p. 58) 0-ary predicate symbols, usually called proposition symbols.

Strictly speaking we deal with a derivability notion different from the one introduced earlier (cf. p. 35), since RAA is dropped; therefore we should use a distinct notation, e.g. $\vdash_i$. However, we will continue to use $\vdash$ when no confusion arises.

We can now adopt all results of the preceding parts that did not make use of RAA.
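The constructions of Examples 1 to 3 in Section 5.1 can be written down as ordinary higher-order functions. The following is a minimal sketch, assuming that a proof of a conjunction or of an existential statement is modelled as a Python pair and a proof of an implication or of a universal statement as a Python function; the names `proj1`, `curry` and `not_exists_to_forall_not` are hypothetical. Only the shape of the constructions is captured: nothing here checks that a value really is a proof of anything.

```python
def proj1(pair):
    """Example 1: a proof of (phi and psi) -> phi; it returns the first component."""
    a, b = pair
    return a

def curry(a):
    """Example 2: convert a proof a of (phi and psi) -> sigma
    into a proof of phi -> (psi -> sigma), i.e. lambda b. lambda c. a(b, c)."""
    return lambda b: lambda c: a((b, c))

def not_exists_to_forall_not(a):
    """Example 3: convert a proof a of not-exists x phi(x)
    into a proof p of forall x not-phi(x), with p(d)(b) = a(d, b)."""
    return lambda d: lambda b: a((d, b))

# Toy stand-ins for proofs, used only to exercise the constructions.
a = lambda pair: ("sigma-from", pair)
print(curry(a)("proof-of-phi")("proof-of-psi"))
```

Where the text writes $a(b, c)$ for the application of $a$ to the pair $\langle b, c \rangle$, the sketch passes the pair as a single argument.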
The following list may be helpful:

Lemma 5.2.1. (1) to (26) [list of 26 derivable formulas; among them, as cited below, (7) $\vdash \varphi \to \neg\neg\varphi$, (14) $\vdash (\varphi \to \psi) \to (\neg\psi \to \neg\varphi)$, (19) $\vdash \neg\exists x\, \varphi(x) \leftrightarrow \forall x\, \neg\varphi(x)$, (20) $\vdash \exists x\, \neg\varphi(x) \to \neg\forall x\, \varphi(x)$].

(Observe that (19) and (20) are special cases of (26) and (25).)

All of those theorems can be proved by means of straightforward application of the rules. Some well-known theorems are conspicuously absent, and in some cases there is only an implication one way; we will show later that these implications cannot, in general, be reversed.

From a constructive point of view RAA is used to derive strong conclusions from weak premises. E.g. in $\neg(\varphi \wedge \psi) \vdash \neg\varphi \vee \neg\psi$ the premise is weak (something has no proof) and the conclusion is strong, it asks for an effective decision. One cannot expect to get such results in intuitionistic logic. Instead there is a collection of weak results, usually involving negations and double negations.

In order to abbreviate derivations we will use the notation of a double line from $\Gamma$ to $\varphi$ in a derivation when there is a derivation for $\Gamma \vdash \varphi$ ($\Gamma$ has 0, 1 or 2 elements).

Theorem 5.2.2.
(1) $\vdash \neg\varphi \leftrightarrow \neg\neg\neg\varphi$,
(2) $\vdash (\varphi \wedge \neg\psi) \to \neg(\varphi \to \psi)$,
(3) $\vdash (\varphi \to \psi) \to (\neg\neg\varphi \to \neg\neg\psi)$,
(4) $\vdash \neg\neg(\varphi \to \psi) \leftrightarrow (\neg\neg\varphi \to \neg\neg\psi)$,
(5) $\vdash \neg\neg(\varphi \wedge \psi) \leftrightarrow (\neg\neg\varphi \wedge \neg\neg\psi)$,
(6) $\vdash \neg\neg\forall x\, \varphi(x) \to \forall x\, \neg\neg\varphi(x)$.

Proof.
(1) $\neg\varphi \to \neg\neg\neg\varphi$ follows from Lemma 5.2.1(7). For the converse we again use 5.2.1(7): from the hypothesis $\varphi$ we get $\neg\neg\varphi$, which contradicts $\neg\neg\neg\varphi$; hence $\neg\varphi$.
(2) From the hypotheses $\varphi \wedge \neg\psi$ and $\varphi \to \psi$ we get $\psi$ and $\neg\psi$, hence $\bot$; cancelling $\varphi \to \psi$ gives $\neg(\varphi \to \psi)$.
(3) Prove (3) also by using (14) and (15) from 5.2.1.
(4) Apply the intuitionistic half of the contraposition (Lemma 5.2.1(14)) to (2). For the converse we apply some facts from 5.2.1.
(5) $\to$: Apply (3) to $\varphi \wedge \psi \to \varphi$ and $\varphi \wedge \psi \to \psi$. The converse is obtained by a similar derivation.
(6) $\vdash \exists x\, \neg\varphi(x) \to \neg\forall x\, \varphi(x)$ by 5.2.1(20), so $\vdash \neg\neg\forall x\, \varphi(x) \to \neg\exists x\, \neg\varphi(x)$ by 5.2.1(14), hence $\vdash \neg\neg\forall x\, \varphi(x) \to \forall x\, \neg\neg\varphi(x)$ by 5.2.1(19). $\Box$

Most of the straightforward meta-theorems of propositional and predicate logic carry over to intuitionistic logic. The following theorems can be proved by a tedious but routine induction.

Theorem 5.2.3 (Substitution Theorem for Derivations). If $\mathcal{D}$ is a derivation and $\$$ a propositional atom, then $\mathcal{D}[\varphi/\$]$ is a derivation if the free variables of $\varphi$ do not occur bound in $\mathcal{D}$.

Theorem 5.2.4 (Substitution Theorem for Derivability). If $\Gamma \vdash \sigma$ and $\$$ is a propositional atom, then $\Gamma[\varphi/\$] \vdash \sigma[\varphi/\$]$, where the free variables of $\varphi$ do not occur bound in $\sigma$ or $\Gamma$.

Theorem 5.2.5 (Substitution Theorem for Equivalence).
$\Gamma \vdash (\varphi_1 \leftrightarrow \varphi_2) \to (\psi[\varphi_1/\$] \leftrightarrow \psi[\varphi_2/\$])$,
$\Gamma \vdash \varphi_1 \leftrightarrow \varphi_2 \Rightarrow \Gamma \vdash \psi[\varphi_1/\$] \leftrightarrow \psi[\varphi_2/\$]$,
where $\$$ is an atomic proposition, the free variables of $\varphi_1$ and $\varphi_2$ do not occur bound in $\Gamma$ or $\psi$, and the bound variables of $\psi$ do not occur free in $\Gamma$.

The proofs of the above theorems are left to the reader. Theorems of this kind always suffer from unaesthetic variable-conditions. In practical applications one always renames bound variables or considers only closed hypotheses, so that there is not much to worry about. For precise formulations cf. Ch. 6.

The reader will have observed from the heuristics that $\vee$ and $\exists$ carry most of the burden of constructiveness. We will demonstrate this once more in an informal argument.

There is an effective procedure to compute the decimal expansion of $\pi$ (3.1415927...). Let us consider the statement $\varphi_n$ := in the decimal expansion of $\pi$ there is a sequence of $n$ consecutive sevens.

Clearly $\varphi_{100} \to \varphi_{99}$ holds, but there is no evidence whatsoever for $\neg\varphi_{100} \vee \varphi_{99}$.

The fact that $\wedge, \to, \forall, \bot$ do not ask for the kind of decisions that $\vee$ and $\exists$ require is more or less confirmed by the following

Theorem 5.2.6. If $\varphi$ does not contain $\vee$ or $\exists$ and all atoms but $\bot$ in $\varphi$ are negated, then $\vdash \varphi \leftrightarrow \neg\neg\varphi$.

Proof. Induction on $\varphi$. We leave the proof to the reader. (Hint: apply 5.2.2.) $\Box$
By definition intuitionistic predicate (propositional) logic is a subsystem of the corresponding classical systems. Gödel and Gentzen have shown, however, that by interpreting the classical disjunction and existence quantifier in a weak sense, we can embed classical logic into intuitionistic logic. For this purpose we introduce a suitable translation:

Definition 5.2.7. The mapping $^\circ : FORM \to FORM$ is defined by
(i) $\bot^\circ := \bot$ and $\varphi^\circ := \neg\neg\varphi$ for atomic $\varphi$ distinct from $\bot$,
(ii) $(\varphi \wedge \psi)^\circ := \varphi^\circ \wedge \psi^\circ$,
(iii) $(\varphi \vee \psi)^\circ := \neg(\neg\varphi^\circ \wedge \neg\psi^\circ)$,
(iv) $(\varphi \to \psi)^\circ := \varphi^\circ \to \psi^\circ$,
(v) $(\forall x\varphi(x))^\circ := \forall x\varphi^\circ(x)$,
(vi) $(\exists x\varphi(x))^\circ := \neg\forall x\neg\varphi^\circ(x)$.

The mapping $^\circ$ is called the Gödel translation. We define $\Gamma^\circ = \{\varphi^\circ \mid \varphi \in \Gamma\}$. The relation between classical derivability ($\vdash_c$) and intuitionistic derivability ($\vdash_i$) is given by

Theorem 5.2.8. $\Gamma \vdash_c \varphi \Leftrightarrow \Gamma^\circ \vdash_i \varphi^\circ$.

Proof. It follows from the preceding chapters that $\vdash_c \varphi \leftrightarrow \varphi^\circ$, therefore $\Leftarrow$ is an immediate consequence of $\Gamma \vdash_i \varphi \Rightarrow \Gamma \vdash_c \varphi$. For $\Rightarrow$, we use induction on the derivation $\mathcal{D}$ of $\varphi$ from $\Gamma$.

1. $\varphi \in \Gamma$, then also $\varphi^\circ \in \Gamma^\circ$ and hence $\Gamma^\circ \vdash_i \varphi^\circ$.

2. The last rule of $\mathcal{D}$ is a propositional introduction or elimination rule. We consider two cases:

$\to I$: $\mathcal{D}$ derives $\psi$ from the hypothesis $[\varphi]$ and concludes $\varphi \to \psi$. Induction hypothesis: $\Gamma^\circ, \varphi^\circ \vdash_i \psi^\circ$. By $\to I$, $\Gamma^\circ \vdash_i \varphi^\circ \to \psi^\circ$, and so by definition $\Gamma^\circ \vdash_i (\varphi \to \psi)^\circ$.

$\vee E$: from $\varphi \vee \psi$, a derivation $\mathcal{D}_1$ of $\sigma$ from $[\varphi]$ and a derivation $\mathcal{D}_2$ of $\sigma$ from $[\psi]$, conclude $\sigma$. Induction hypothesis: $\Gamma^\circ \vdash_i (\varphi \vee \psi)^\circ$, $\Gamma^\circ, \varphi^\circ \vdash_i \sigma^\circ$, $\Gamma^\circ, \psi^\circ \vdash_i \sigma^\circ$ (where $\Gamma$ contains all uncancelled hypotheses involved). So $\Gamma^\circ \vdash_i \neg(\neg\varphi^\circ \wedge \neg\psi^\circ)$, $\Gamma^\circ \vdash_i \varphi^\circ \to \sigma^\circ$, $\Gamma^\circ \vdash_i \psi^\circ \to \sigma^\circ$. The result follows from the derivation below:

[Derivation tree not recoverable from the source.]

The remaining rules are left to the reader.

3. The last rule of $\mathcal{D}$ is the falsum rule. This case is obvious.

4. The last rule of $\mathcal{D}$ is a quantifier introduction or elimination rule. Let us consider two cases.

$\forall I$: $\mathcal{D}$ derives $\varphi(x)$ and concludes $\forall x\varphi(x)$. Induction hypothesis: $\Gamma^\circ \vdash_i \varphi(x)^\circ$. By $\forall I$, $\Gamma^\circ \vdash_i \forall x\varphi(x)^\circ$, so $\Gamma^\circ \vdash_i (\forall x\varphi(x))^\circ$.

$\exists E$: from $\exists x\varphi(x)$ and a derivation $\mathcal{D}_1$ of $\sigma$ from $[\varphi(x)]$, conclude $\sigma$. Induction hypothesis: $\Gamma^\circ \vdash_i (\exists x\varphi(x))^\circ$, $\Gamma^\circ, \varphi(x)^\circ \vdash_i \sigma^\circ$. So $\Gamma^\circ \vdash_i \neg\forall x\neg\varphi(x)^\circ$ and $\Gamma^\circ \vdash_i \forall x(\varphi(x)^\circ \to \sigma^\circ)$.

[Derivation tree garbled in the source; it applies $\forall E$ to $\forall x(\varphi(x)^\circ \to \sigma^\circ)$ and uses the cancelled hypothesis $[\varphi(x)^\circ]^1$.]

We now get $\Gamma^\circ \vdash_i \sigma^\circ$.

5. The last rule of $\mathcal{D}$ is RAA: $\mathcal{D}$ derives $\bot$ from $[\neg\varphi]$ and concludes $\varphi$. Induction hypothesis: $\Gamma^\circ, (\neg\varphi)^\circ \vdash_i \bot$, so $\Gamma^\circ \vdash_i \neg\neg\varphi^\circ$, and hence by Lemma 2.5.8 $\Gamma^\circ \vdash_i \varphi^\circ$.

Let us call formulas in which all atoms occur negated, and which contain only the connectives $\wedge, \to, \forall, \bot$, negative. The special role of $\vee$ and $\exists$ is underlined by

Corollary 5.2.9. Classical predicate (propositional) logic is conservative over intuitionistic predicate (propositional) logic with respect to negative formulae, i.e. $\vdash_c \varphi \Leftrightarrow \vdash_i \varphi$ for negative $\varphi$.

Proof. $\varphi^\circ$, for negative $\varphi$, is obtained by replacing each atom $p$ by $\neg\neg p$. Since all atoms occur negated we have $\vdash_i \varphi^\circ \leftrightarrow \varphi$ (apply 5.2.2(1) and 5.2.5). The result now follows from 5.2.8.

In some particular theories (e.g. arithmetic) the atoms are decidable, i.e. $\Gamma \vdash \varphi \vee \neg\varphi$ for atomic $\varphi$. For such theories one may simplify the Gödel translation by putting $\varphi^\circ := \varphi$ for atomic $\varphi$.

Observe that Corollary 5.2.9 tells us that intuitionistic logic is consistent iff classical logic is so (a not very surprising result!).

For propositional logic we have a somewhat stronger result than 5.2.8.

Theorem 5.2.10 (Glivenko's Theorem). $\vdash_c \varphi \Leftrightarrow \vdash_i \neg\neg\varphi$.

Proof. Show by induction on $\varphi$ that $\vdash_i \varphi^\circ \leftrightarrow \neg\neg\varphi$ (use 5.2.2), and apply 5.2.8.

5.3 Kripke Semantics

There are a number of (more or less formalised) semantics for intuitionistic logic that allow for a completeness theorem. We will concentrate here on the semantics introduced by Kripke since it is convenient for applications and it is fairly simple.

Heuristic motivation. Think of an idealised mathematician (in this context traditionally called the creative subject), who extends both his knowledge and his universe of objects in the course of time. At each moment $k$ he has a stock $\Sigma_k$ of sentences, which he, by some means, has recognised as true and a stock $A_k$ of objects which he has constructed (or created). Since at every moment $k$ the idealised mathematician has various choices for his future activities (he may even stop altogether), the stages of his activity must be thought of as being partially ordered, and not necessarily linearly ordered. How will the idealised mathematician interpret the logical connectives? Evidently the interpretation of a composite statement must depend on the interpretation of its parts, e.g. the idealised mathematician has established $\varphi$ or (and) $\psi$ at stage $k$ if he has established $\varphi$ at stage $k$ or (and) $\psi$ at stage $k$. The implication is more cumbersome, since $\varphi \to \psi$ may be known at stage $k$ without $\varphi$ or $\psi$ being known. Clearly, the idealised mathematician knows $\varphi \to \psi$ at stage $k$ if he knows that if at any future stage (including $k$) $\varphi$ is established, also $\psi$ is established. Similarly $\forall x\varphi(x)$ is established at stage $k$ if at any future stage (including $k$) for all objects $a$ that exist at that stage $\varphi(\bar{a})$ is established.

Evidently we must in case of the universal quantifier take the future into account since for all elements means more than just "for all elements that we have constructed so far"! Existence, on the other hand, is not relegated to the future. The idealised mathematician knows at stage $k$ that $\exists x\varphi(x)$ if he has constructed an object $a$ such that at stage $k$ he has established $\varphi(\bar{a})$. Of course, there are many observations that could be made, for example that it is reasonable to add "in principle" to a number of clauses. This takes care of large numbers, choice sequences etc. Think of $\forall xy\exists z(z = x^y)$: does the idealised mathematician really construct $10^{10}$ as a succession of units? For this and similar questions the reader is referred to the literature.

We will now formalise the above sketched semantics. For a first introduction it is convenient to consider a language without function symbols. Later it will be simple to extend the language. We consider models for some language $L$.

Definition 5.3.1. A Kripke model is a quadruple $\mathcal{K} = \langle K, \Sigma, C, D \rangle$, where $K$ is a (non-empty) partially ordered set, $C$ a function defined on the constants of $L$, $D$ a set valued function on $K$, $\Sigma$ a function on $K$ such that
- $C(c) \in D(k)$ for all $k \in K$,
- $D(k) \neq \emptyset$ for all $k \in K$,
- $\Sigma(k) \subseteq At_k$ for all $k \in K$,
where $At_k$ is the set of all atomic sentences of $L$ with constants for the elements of $D(k)$. $D$ and $\Sigma$ satisfy the following conditions:
(i) $k \leq l \Rightarrow D(k) \subseteq D(l)$,
(ii) $\bot \notin \Sigma(k)$, for all $k$,
(iii) $k \leq l \Rightarrow \Sigma(k) \subseteq \Sigma(l)$.

$D(k)$ is called the domain of $\mathcal{K}$ at $k$, the elements of $K$ are called nodes of $\mathcal{K}$. Instead of "$\varphi$ has auxiliary constants for elements of $D(k)$" we say for short "$\varphi$ has parameters in $D(k)$".

$\Sigma$ assigns to each node the 'basic facts' that hold at $k$. The conditions (i), (ii), (iii) merely state that the collection of available objects does not decrease in time, that a falsity is never established and that a basic fact that once has been established remains true in later stages. The constants are interpreted by the same elements in all domains (they are rigid designators).

Note that $D$ and $\Sigma$ together determine at each node $k$ a classical structure $\mathfrak{A}(k)$ (in the sense of 2.2.1). The universe of $\mathfrak{A}(k)$ is $D(k)$ and the relations of $\mathfrak{A}(k)$ are given by $\Sigma(k)$ as the positive diagram: $\langle \vec{a} \rangle \in R^{\mathfrak{A}(k)}$ iff $R(\bar{\vec{a}}) \in \Sigma(k)$. The conditions (i) and (iii) above tell us that the universes are increasing: $k \leq l \Rightarrow |\mathfrak{A}(k)| \subseteq |\mathfrak{A}(l)|$, and that the relations are increasing: $k \leq l \Rightarrow R^{\mathfrak{A}(k)} \subseteq R^{\mathfrak{A}(l)}$. Furthermore $c^{\mathfrak{A}(k)} = c^{\mathfrak{A}(l)}$ for all $k$ and $l$.
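For a finite model, conditions (i)-(iii) of Definition 5.3.1 can be checked mechanically. The following is a minimal sketch, not part of the text: the function name `check_kripke_conditions`, the encoding of the order as a set of pairs, and the string "bot" standing for falsum are all assumptions made for illustration.

```python
def check_kripke_conditions(nodes, leq, domain, sigma, bottom="bot"):
    """Return True iff the finite data satisfy conditions (i)-(iii).

    nodes  : list of node names
    leq    : set of pairs (k, l) meaning k <= l (assumed to be a partial order)
    domain : dict node -> set of objects, playing the role of D
    sigma  : dict node -> set of atomic sentences, playing the role of Sigma
    """
    for k in nodes:
        if not domain[k]:                 # D(k) must be non-empty
            return False
        if bottom in sigma[k]:            # (ii) falsum is never established
            return False
    for (k, l) in leq:
        if not domain[k] <= domain[l]:    # (i) domains do not shrink
            return False
        if not sigma[k] <= sigma[l]:      # (iii) basic facts persist
            return False
    return True


# A two-node chain k0 <= k1 in which a new object and a new fact appear at k1.
nodes = ["k0", "k1"]
leq = {("k0", "k0"), ("k0", "k1"), ("k1", "k1")}
domain = {"k0": {0}, "k1": {0, 1}}
sigma = {"k0": set(), "k1": {"P(0)"}}
good = check_kripke_conditions(nodes, leq, domain, sigma)

# Violates (iii): a fact known at k0 is forgotten at k1.
forgetful = {"k0": {"P(0)"}, "k1": set()}
bad = check_kripke_conditions(nodes, leq, domain, forgetful)
```

Here `good` holds and `bad` fails, since the second assignment lets an established basic fact disappear at a later stage.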
In $\Sigma(k)$ there are also propositions, something we did not allow in classical predicate logic. Here it is convenient for treating propositional and predicate logic simultaneously. The function $\Sigma$ tells us which atoms are "true" in $k$. We now extend $\Sigma$ to all sentences.

Lemma 5.3.2. $\Sigma$ has a unique extension to a function on $K$ (also denoted by $\Sigma$) such that $\Sigma(k) \subseteq Sent_k$, the set of all sentences with parameters in $D(k)$, satisfying:
(i) $\varphi \vee \psi \in \Sigma(k) \Leftrightarrow \varphi \in \Sigma(k)$ or $\psi \in \Sigma(k)$,
(ii) $\varphi \wedge \psi \in \Sigma(k) \Leftrightarrow \varphi \in \Sigma(k)$ and $\psi \in \Sigma(k)$,
(iii) $\varphi \to \psi \in \Sigma(k) \Leftrightarrow$ for all $l \geq k$ ($\varphi \in \Sigma(l) \Rightarrow \psi \in \Sigma(l)$),
(iv) $\exists x\varphi(x) \in \Sigma(k) \Leftrightarrow$ there is an $a \in D(k)$ such that $\varphi(\bar{a}) \in \Sigma(k)$,
(v) $\forall x\varphi(x) \in \Sigma(k) \Leftrightarrow$ for all $l \geq k$ and for all $a \in D(l)$, $\varphi(\bar{a}) \in \Sigma(l)$.

Proof. Immediate. We simply define $\varphi \in \Sigma(k)$ for all $k \in K$ simultaneously by induction on $\varphi$.

Notation. We write $k \Vdash \varphi$ for $\varphi \in \Sigma(k)$, pronounce '$k$ forces $\varphi$'.

Exercise for the reader: reformulate (i) - (v) above in terms of forcing.

Corollary 5.3.3.
(i) $k \Vdash \neg\varphi \Leftrightarrow$ for all $l \geq k$, $l \nVdash \varphi$.
(ii) $k \Vdash \neg\neg\varphi \Leftrightarrow$ for all $l \geq k$ there exists a $p \geq l$ such that $p \Vdash \varphi$.

Proof. $k \Vdash \neg\varphi \Leftrightarrow k \Vdash \varphi \to \bot \Leftrightarrow$ for all $l \geq k$ ($l \Vdash \varphi \Rightarrow l \Vdash \bot$) $\Leftrightarrow$ for all $l \geq k$, $l \nVdash \varphi$.
$k \Vdash \neg\neg\varphi \Leftrightarrow$ for all $l \geq k$, $l \nVdash \neg\varphi \Leftrightarrow$ for all $l \geq k$ not (for all $p \geq l$, $p \nVdash \varphi$) $\Leftrightarrow$ for all $l \geq k$ there is a $p \geq l$ such that $p \Vdash \varphi$.

The monotonicity of $\Sigma$ for atoms is carried over to arbitrary formulas.

Lemma 5.3.4 (Monotonicity of $\Vdash$). $k \Vdash \varphi$, $k \leq l \Rightarrow l \Vdash \varphi$.

Proof. Induction on $\varphi$.
atomic $\varphi$: the lemma holds by definition 5.3.1.
$\varphi = \varphi_1 \wedge \varphi_2$: let $k \Vdash \varphi_1 \wedge \varphi_2$ and $k \leq l$, then $k \Vdash \varphi_1 \wedge \varphi_2 \Leftrightarrow k \Vdash \varphi_1$ and $k \Vdash \varphi_2 \Rightarrow$ (ind. hyp.) $l \Vdash \varphi_1$ and $l \Vdash \varphi_2 \Leftrightarrow l \Vdash \varphi_1 \wedge \varphi_2$.
$\varphi = \varphi_1 \vee \varphi_2$: mimic the conjunction case.
$\varphi = \varphi_1 \to \varphi_2$: let $k \Vdash \varphi_1 \to \varphi_2$, $l \geq k$. Suppose $p \geq l$ and $p \Vdash \varphi_1$, then, since $p \geq k$, $p \Vdash \varphi_2$. Hence $l \Vdash \varphi_1 \to \varphi_2$.
$\varphi = \exists x\varphi_1(x)$: immediate.
$\varphi = \forall x\varphi_1(x)$: let $k \Vdash \forall x\varphi_1(x)$ and $l \geq k$. Suppose $p \geq l$ and $a \in D(p)$, then, since $p \geq k$, $p \Vdash \varphi_1(\bar{a})$. Hence $l \Vdash \forall x\varphi_1(x)$.

We will now present some examples. It suffices to indicate which atoms are forced at each node. We will simplify the presentation by drawing the partially ordered set and indicating the atoms forced at each node. For propositional logic no domain function is required (equivalently, a constant one, say $D(k) = \{0\}$), so we simplify the presentation accordingly.

[Figure: diagrams of the Kripke models for examples (a)-(d).]

(a) In the bottom node no atoms are known, in the second one only $\varphi$; to be precise $k_0 \nVdash \varphi$, $k_1 \Vdash \varphi$. By 5.3.3 $k_0 \Vdash \neg\neg\varphi$, so $k_0 \nVdash \neg\neg\varphi \to \varphi$. Note, however, that $k_0 \nVdash \neg\varphi$, since $k_1 \Vdash \varphi$. So $k_0 \nVdash \varphi \vee \neg\varphi$.

(b) $k_i \nVdash \varphi \wedge \psi$ ($i = 0, 1, 2$), so $k_0 \Vdash \neg(\varphi \wedge \psi)$. By definition, $k_0 \Vdash \neg\varphi \vee \neg\psi \Leftrightarrow k_0 \Vdash \neg\varphi$ or $k_0 \Vdash \neg\psi$. The first is false, since $k_1 \Vdash \varphi$, and the latter is false, since $k_2 \Vdash \psi$. Hence $k_0 \nVdash \neg(\varphi \wedge \psi) \to \neg\varphi \vee \neg\psi$.

(c) The bottom node forces $\psi \to \varphi$, but it does not force $\neg\psi \vee \varphi$ (why?). So it does not force $(\psi \to \varphi) \to (\neg\psi \vee \varphi)$.

(d) In the bottom node the following implications are forced: $\varphi_2 \to \varphi_1$, $\varphi_3 \to \varphi_2$, $\varphi_3 \to \varphi_1$, but none of the converse implications is forced, hence $k_0 \nVdash (\varphi_1 \leftrightarrow \varphi_2) \vee (\varphi_2 \leftrightarrow \varphi_3) \vee (\varphi_3 \leftrightarrow \varphi_1)$.

We will analyse the last example a bit further. Consider a Kripke model with two nodes as in (d), with some assignment $\Sigma$ of atoms. We will show that for four arbitrary propositions $\sigma_1, \sigma_2, \sigma_3, \sigma_4$: $k_0 \Vdash \bigvee_{1 \leq i < j \leq 4} \sigma_i \leftrightarrow \sigma_j$, i.e. from any four propositions at least two are equivalent.

There are a number of cases.
(1) At least two of $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ are forced in $k_0$. Then we are done.
(2) Just one $\sigma_i$ is forced in $k_0$. Then of the remaining propositions, either two are forced in $k_1$, or two of them are not forced in $k_1$. In both cases there are $\sigma_j, \sigma_{j'}$ such that $k_0 \Vdash \sigma_j \leftrightarrow \sigma_{j'}$.
(3) No $\sigma_i$ is forced in $k_0$. Then we may repeat the argument under (2).

[Figure: diagrams of the Kripke models for examples (e) and (f).]

(e) (i) $k_0 \Vdash \varphi \to \exists x\sigma(x)$, for the only node that forces $\varphi$ is $k_1$, and indeed $k_1 \Vdash \sigma(\bar{1})$, so $k_1 \Vdash \exists x\sigma(x)$. Now suppose $k_0 \Vdash \exists x(\varphi \to \sigma(x))$, then, since $D(k_0) = \{0\}$, $k_0 \Vdash \varphi \to \sigma(\bar{0})$. But $k_1 \Vdash \varphi$ and $k_1 \nVdash \sigma(\bar{0})$. Contradiction. Hence $k_0 \nVdash (\varphi \to \exists x\sigma(x)) \to \exists x(\varphi \to \sigma(x))$.

Remark. $(\varphi \to \exists x\sigma(x)) \to \exists x(\varphi \to \sigma(x))$ is called the independence of premise principle. It is not surprising that it fails in some Kripke models, for $\varphi \to \exists x\sigma(x)$ tells us that the required element $a$ for $\sigma(\bar{a})$ may depend on the proof of $\varphi$ (in our heuristic interpretation); while in $\exists x(\varphi \to \sigma(x))$ the element $a$ must be found independently of $\varphi$. So the right hand side is stronger.

(ii) $k_0 \Vdash \neg\forall x\psi(x) \Leftrightarrow k_i \nVdash \forall x\psi(x)$ ($i = 0, 1$). $k_1 \nVdash \psi(\bar{1})$, so we have shown $k_0 \Vdash \neg\forall x\psi(x)$. $k_0 \Vdash \exists x\neg\psi(x) \Leftrightarrow k_0 \Vdash \neg\psi(\bar{0})$. However, $k_1 \Vdash \psi(\bar{0})$, so $k_0 \nVdash \exists x\neg\psi(x)$. Hence $k_0 \nVdash \neg\forall x\psi(x) \to \exists x\neg\psi(x)$.

(iii) A similar argument shows $k_0 \nVdash (\forall x\psi(x) \to \tau) \to \exists x(\psi(x) \to \tau)$, where $\tau$ is not forced in $k_1$.

(f) $D(k_i) = \{0, \ldots, i\}$, $\Sigma(k_i) = \{\varphi(\bar{0}), \ldots, \varphi(\overline{i-1})\}$. $k_0 \Vdash \forall x\neg\neg\varphi(x) \Leftrightarrow$ for all $i$, $k_i \Vdash \neg\neg\varphi(\bar{j})$, $j \leq i$. The latter is true since for all $p > i$, $k_p \Vdash \varphi(\bar{j})$, $j \leq i$. Now $k_0 \Vdash \neg\neg\forall x\varphi(x) \Leftrightarrow$ for all $i$ there is a $j \geq i$ such that $k_j \Vdash \forall x\varphi(x)$. But no $k_j$ forces $\forall x\varphi(x)$. So $k_0 \nVdash \forall x\neg\neg\varphi(x) \to \neg\neg\forall x\varphi(x)$.

Remark. (1) Note that all formulas of (a) ... (f) are classically true.
(2) We have seen that $\neg\neg\forall x\varphi(x) \to \forall x\neg\neg\varphi(x)$ is derivable, and the reader may check that it holds in all Kripke models (or he may wait for the Soundness Theorem); the converse fails, however, in some models. The schema $\forall x\neg\neg\varphi(x) \to \neg\neg\forall x\varphi(x)$ is called the double negation shift (DNS).

The next thing to do is to show that Kripke semantics is sound for intuitionistic logic. We define a few more notions for sentences:
(i) $\mathcal{K} \Vdash \varphi$ if $k \Vdash \varphi$ for all $k \in K$.
(ii) $\Vdash \varphi$ if $\mathcal{K} \Vdash \varphi$ for all $\mathcal{K}$.

For formulas containing free variables we have to be more careful. Let $\varphi$ contain free variables, then we say that $k \Vdash \varphi$ iff $k \Vdash Cl(\varphi)$ (the universal closure). For a set $\Gamma$ and a formula $\varphi$ with free variables $x_{i_0}, x_{i_1}, x_{i_2}, \ldots$ (which we will denote by $\vec{x}$), we define $\Gamma \Vdash \varphi$ by: for all $\mathcal{K}$, $k \in K$ and for all $\vec{a} \in D(k)$ [$k \Vdash \psi(\vec{a})$ for all $\psi \in \Gamma \Rightarrow k \Vdash \varphi(\vec{a})$]. ($\vec{a} \in D(k)$ is a convenient abuse of language.)

Before we proceed we introduce an extra abuse of language which will prove extremely useful: we will freely use quantifiers in our meta-language. It will have struck the reader that the clauses in the definition of the Kripke semantics abound with expressions like "for all $l \geq k$", "for all $a \in D(k)$". It saves quite a bit of writing to use "$\forall l \geq k$", "$\forall a \in D(k)$" instead, and it increases systematic readability to boot. By now the reader is well used to the routine phrases of our semantics, so he will have no difficulty avoiding a confusion of quantifiers in the meta-language and the object-language.

By way of example we will reformulate the preceding definition:
$\Gamma \Vdash \varphi := (\forall \mathcal{K})(\forall k \in K)(\forall \vec{a} \in D(k))[\forall \psi \in \Gamma\,(k \Vdash \psi(\vec{a})) \Rightarrow k \Vdash \varphi(\vec{a})]$.

There is a useful reformulation of this "semantic consequence" notion.

Lemma 5.3.5. Let $\Gamma$ be finite, then $\Gamma \Vdash \varphi \Leftrightarrow\ \Vdash Cl(\bigwedge\Gamma \to \varphi)$ (where $Cl(X)$ is the universal closure of $X$).

Proof. Left to the reader.

Theorem 5.3.6 (Soundness Theorem). $\Gamma \vdash \varphi \Rightarrow \Gamma \Vdash \varphi$.

Proof. Use induction on the derivation $\mathcal{D}$ of $\varphi$ from $\Gamma$. We will abbreviate "$k \Vdash \psi(\vec{a})$ for all $\psi \in \Gamma$" by "$k \Vdash \Gamma(\vec{a})$". The model $\mathcal{K}$ is fixed in the proof.

(1) $\mathcal{D}$ consists of just $\varphi$, then obviously $k \Vdash \Gamma(\vec{a}) \Rightarrow k \Vdash \varphi(\vec{a})$ for all $k$ and $\vec{a} \in D(k)$.

(2) $\mathcal{D}$ ends with an application of a derivation rule.

($\wedge I$) Induction hypothesis: $\forall k\,\forall \vec{a} \in D(k)\,(k \Vdash \Gamma(\vec{a}) \Rightarrow k \Vdash \varphi_i(\vec{a}))$, for $i = 1, 2$. Now choose a $k \in K$ and $\vec{a} \in D(k)$ such that $k \Vdash \Gamma(\vec{a})$, then $k \Vdash \varphi_1(\vec{a})$ and $k \Vdash \varphi_2(\vec{a})$, so $k \Vdash (\varphi_1 \wedge \varphi_2)(\vec{a})$.
Note that the choice of $\vec{a}$ did not really play a role in this proof. To simplify the presentation we will suppress reference to $\vec{a}$ when it does not play a role.

($\wedge E$) Immediate.

($\vee I$) Immediate.

($\vee E$) Induction hypothesis: $\forall k(k \Vdash \Gamma \Rightarrow k \Vdash \varphi \vee \psi)$, $\forall k(k \Vdash \Gamma, \varphi \Rightarrow k \Vdash \sigma)$, $\forall k(k \Vdash \Gamma, \psi \Rightarrow k \Vdash \sigma)$. Now let $k \Vdash \Gamma$, then by i.h. $k \Vdash \varphi \vee \psi$, so $k \Vdash \varphi$ or $k \Vdash \psi$. In the first case $k \Vdash \Gamma, \varphi$, so $k \Vdash \sigma$. In the second case $k \Vdash \Gamma, \psi$, so $k \Vdash \sigma$. In both cases $k \Vdash \sigma$, so we are done.

($\to I$) Induction hypothesis: $(\forall k)(\forall \vec{a} \in D(k))(k \Vdash \Gamma(\vec{a}), \varphi(\vec{a}) \Rightarrow k \Vdash \psi(\vec{a}))$. Now let $k \Vdash \Gamma(\vec{a})$ for some $\vec{a} \in D(k)$. We want to show $k \Vdash (\varphi \to \psi)(\vec{a})$, so let $l \geq k$ and $l \Vdash \varphi(\vec{a})$. By monotonicity $l \Vdash \Gamma(\vec{a})$, and $\vec{a} \in D(l)$, so the ind. hyp. tells us that $l \Vdash \psi(\vec{a})$. Hence $\forall l \geq k\,(l \Vdash \varphi(\vec{a}) \Rightarrow l \Vdash \psi(\vec{a}))$, so $k \Vdash (\varphi \to \psi)(\vec{a})$.

($\to E$) Immediate.

($\bot$) Induction hypothesis: $\forall k(k \Vdash \Gamma \Rightarrow k \Vdash \bot)$. Since, evidently, no $k$ can force $\Gamma$, $\forall k(k \Vdash \Gamma \Rightarrow k \Vdash \varphi)$ is correct.

($\forall I$) The free variables in $\Gamma$ are $\vec{x}$, and $z$ does not occur in the sequence $\vec{x}$. Induction hypothesis: $(\forall k)(\forall \vec{a}, b \in D(k))(k \Vdash \Gamma(\vec{a}) \Rightarrow k \Vdash \varphi(\vec{a}, b))$. Now let $k \Vdash \Gamma(\vec{a})$ for some $\vec{a} \in D(k)$; we must show $k \Vdash \forall z\varphi(\vec{a}, z)$. So let $l \geq k$ and $b \in D(l)$. By monotonicity $l \Vdash \Gamma(\vec{a})$ and $\vec{a} \in D(l)$, so by the ind. hyp. $l \Vdash \varphi(\vec{a}, b)$. This shows $(\forall l \geq k)(\forall b \in D(l))(l \Vdash \varphi(\vec{a}, b))$, and hence $k \Vdash \forall z\varphi(\vec{a}, z)$.

($\forall E$) Immediate.

($\exists I$) Immediate.

($\exists E$) Induction hypothesis: $(\forall k)(\forall \vec{a} \in D(k))(k \Vdash \Gamma(\vec{a}) \Rightarrow k \Vdash \exists z\varphi(\vec{a}, z))$ and $(\forall k)(\forall \vec{a}, b \in D(k))(k \Vdash \varphi(\vec{a}, b), k \Vdash \Gamma(\vec{a}) \Rightarrow k \Vdash \sigma(\vec{a}))$. Here the variables in $\Gamma$ and $\sigma$ are $\vec{x}$, and $z$ does not occur in the sequence $\vec{x}$. Now let $k \Vdash \Gamma(\vec{a})$ for some $\vec{a} \in D(k)$, then $k \Vdash \exists z\varphi(\vec{a}, z)$. So let $k \Vdash \varphi(\vec{a}, b)$ for some $b \in D(k)$. By the induction hypothesis $k \Vdash \sigma(\vec{a})$.
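The propositional forcing clauses of Lemma 5.3.2 and the two-node model of example (a) can be tried out on a computer. The sketch below is only an illustration under stated assumptions: formulas are encoded as nested tuples, negation is defined as implication of falsum, and the names `forces`, `neg` and `model_a` are hypothetical, not taken from the text.

```python
def neg(f):
    """Negation as an abbreviation: not f  :=  f -> bot."""
    return ("imp", f, ("bot",))


def forces(model, k, f):
    """Decide k ||- f in a finite propositional Kripke model.

    model = (nodes, leq, sigma), where leq is a set of pairs (k, l) with
    k <= l and sigma maps each node to the set of atoms forced there.
    """
    nodes, leq, sigma = model
    tag = f[0]
    if tag == "atom":
        return f[1] in sigma[k]
    if tag == "bot":
        return False                      # falsum is never forced
    if tag == "and":
        return forces(model, k, f[1]) and forces(model, k, f[2])
    if tag == "or":
        return forces(model, k, f[1]) or forces(model, k, f[2])
    if tag == "imp":
        # implication looks at all later stages l >= k (including k)
        return all(not forces(model, l, f[1]) or forces(model, l, f[2])
                   for l in nodes if (k, l) in leq)
    raise ValueError("unknown connective: %r" % (tag,))


# Example (a): no atom is known at k0, and p is known at k1 >= k0.
model_a = (
    ["k0", "k1"],
    {("k0", "k0"), ("k0", "k1"), ("k1", "k1")},
    {"k0": set(), "k1": {"p"}},
)
p = ("atom", "p")
excluded_middle = ("or", p, neg(p))

dn_p = forces(model_a, "k0", neg(neg(p)))                    # k0 forces not not p
dn_elim = forces(model_a, "k0", ("imp", neg(neg(p)), p))     # but not (not not p -> p)
em = forces(model_a, "k0", excluded_middle)                  # nor p or not p
dn_em = forces(model_a, "k0", neg(neg(excluded_middle)))     # its double negation is forced
```

With this encoding `dn_p` and `dn_em` come out true while `dn_elim` and `em` come out false, matching example (a); the last value is what Glivenko's Theorem together with soundness predicts for the classically valid formula $\varphi \vee \neg\varphi$.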
For the Completeness Theorem we need some notions and a few lemmas.

Definition 5.3.7. A set of sentences $\Gamma$ is a prime theory with respect to a language $L$ if
(i) $\Gamma$ is closed under $\vdash$,
(ii) $\varphi \vee \psi \in \Gamma \Rightarrow \varphi \in \Gamma$ or $\psi \in \Gamma$,
(iii) $\exists x\varphi(x) \in \Gamma \Rightarrow \varphi(c) \in \Gamma$ for some constant $c$ in $L$.

The following is an analogue of the Henkin construction combined with a maximal consistent extension.

Lemma 5.3.8. Let $\Gamma$ and $\varphi$ be closed, then if $\Gamma \nvdash \varphi$, there is a prime theory $\Gamma'$ extending $\Gamma$ such that $\Gamma' \nvdash \varphi$.

In general one has to extend the language $L$ of $\Gamma$ by a suitable set of 'witnessing' constants.

Proof. Extend the language $L$ of $\Gamma$ by a denumerable set of constants to a new language $L'$. The required theory $\Gamma'$ is obtained by a series of extensions $\Gamma_0 \subseteq \Gamma_1 \subseteq \Gamma_2 \subseteq \ldots$. We put $\Gamma_0 := \Gamma$. Let $\Gamma_k$ be given such that $\Gamma_k \nvdash \varphi$ and $\Gamma_k$ contains only finitely many new constants. We consider two cases.

$k$ is even. Look for the first existential sentence $\exists x\psi(x)$ in $L'$ that has not yet been treated, such that $\Gamma_k \vdash \exists x\psi(x)$. Let $c$ be the first new constant not in $\Gamma_k$. Now put $\Gamma_{k+1} := \Gamma_k \cup \{\psi(c)\}$.

$k$ is odd. Look for the first disjunctive sentence $\psi_1 \vee \psi_2$ with $\Gamma_k \vdash \psi_1 \vee \psi_2$ that has not yet been treated. Note that not both $\Gamma_k, \psi_1 \vdash \varphi$ and $\Gamma_k, \psi_2 \vdash \varphi$, for then by $\vee E$ $\Gamma_k \vdash \varphi$. Now we put:
$\Gamma_{k+1} := \Gamma_k \cup \{\psi_1\}$ if $\Gamma_k, \psi_1 \nvdash \varphi$, and $\Gamma_{k+1} := \Gamma_k \cup \{\psi_2\}$ otherwise.

Finally: $\Gamma' := \bigcup_{k \geq 0} \Gamma_k$.

There are a few things to be shown.

1. $\Gamma' \nvdash \varphi$. We first show $\Gamma_i \nvdash \varphi$ by induction on $i$. For $i = 0$, $\Gamma_0 \nvdash \varphi$ holds by assumption. The induction step is obvious for $i$ odd. For $i$ even we suppose $\Gamma_{i+1} \vdash \varphi$. Then $\Gamma_i, \psi(c) \vdash \varphi$. Since $\Gamma_i \vdash \exists x\psi(x)$, we get $\Gamma_i \vdash \varphi$ by $\exists E$, which contradicts the induction hypothesis. Hence $\Gamma_{i+1} \nvdash \varphi$, and therefore by complete induction $\Gamma_i \nvdash \varphi$ for all $i$. Now, if $\Gamma' \vdash \varphi$ then $\Gamma_i \vdash \varphi$ for some $i$. Contradiction.

2. $\Gamma'$ is a prime theory.
(a) Let $\psi_1 \vee \psi_2 \in \Gamma'$ and let $k$ be the least number such that $\Gamma_k \vdash \psi_1 \vee \psi_2$. Clearly $\psi_1 \vee \psi_2$ has not been treated before stage $k$, and $\Gamma_h \vdash \psi_1 \vee \psi_2$ for $h \geq k$. Eventually $\psi_1 \vee \psi_2$ has to be treated at some stage $h \geq k$, so then $\psi_1 \in \Gamma_{h+1}$ or $\psi_2 \in \Gamma_{h+1}$, and hence $\psi_1 \in \Gamma'$ or $\psi_2 \in \Gamma'$.
(b) Let $\exists x\psi(x) \in \Gamma'$, and let $k$ be the least number such that $\Gamma_k \vdash \exists x\psi(x)$. For some $h \geq k$, $\exists x\psi(x)$ is treated, and hence $\psi(c) \in \Gamma_{h+1} \subseteq \Gamma'$ for some $c$.
(c) $\Gamma'$ is closed under $\vdash$. If $\Gamma' \vdash \psi$, then $\Gamma' \vdash \psi \vee \psi$, and hence by (a) $\psi \in \Gamma'$.

Conclusion: $\Gamma'$ is a prime theory containing $\Gamma$, such that $\Gamma' \nvdash \varphi$.

Note that in the above construction of $\Gamma_{n+1}$ we can easily skip (in the even case) a few constants, so that the resulting $\Gamma'$ does not necessarily contain all new constants. We will make use of this in the proof below.

The next step is to construct for closed $\Gamma$ and $\varphi$ with $\Gamma \nvdash \varphi$ a Kripke model, with $\mathcal{K} \Vdash \Gamma$ and $k \nVdash \varphi$ for some $k \in K$.

Lemma 5.3.9 (Model Existence Lemma). If $\Gamma \nvdash \varphi$ then there is a Kripke model $\mathcal{K}$ with a bottom node $k_0$ such that $k_0 \Vdash \Gamma$ and $k_0 \nVdash \varphi$.

Proof. We first extend $\Gamma$ to a suitable prime theory $\Gamma'$ such that $\Gamma' \nvdash \varphi$. $\Gamma'$ has the language $L'$ with set of constants $C'$. Consider a set of distinct constants $\{c^i_m \mid i \geq 0, m \geq 0\}$ disjoint with $C'$. A denumerable family of denumerable sets of constants is given by $C^i = \{c^i_m \mid m \geq 0\}$. We will construct a Kripke model over the poset of all finite sequences of natural numbers, including the empty sequence $\langle\,\rangle$, with their natural ordering, "initial segment of".

Define $C(\langle\,\rangle) := C'$ and $C(\vec{n}) = C(\langle\,\rangle) \cup C^0 \cup \ldots \cup C^{k-1}$ for $\vec{n}$ of positive length $k$. $L(\vec{n})$ is the extension of $L$ by $C(\vec{n})$, with set of atoms $At(\vec{n})$. Now put $D(\vec{n}) := C(\vec{n})$. We define $\Sigma(\vec{n})$ by induction on the length of $\vec{n}$.

$\Sigma(\langle\,\rangle) := \Gamma' \cap At(\langle\,\rangle)$. Suppose $\Sigma(\vec{n})$ has already been defined. Consider an enumeration $\langle\sigma_0, \tau_0\rangle, \langle\sigma_1, \tau_1\rangle, \ldots$ of all pairs of sentences in $L(\vec{n})$ such that $\Gamma(\vec{n}), \sigma_i \nvdash \tau_i$. Apply Lemma 5.3.8 to $\Gamma(\vec{n}) \cup \{\sigma_i\}$ and $\tau_i$ for each $i$. This yields a prime theory $\Gamma(\vec{n}, i)$ and $L(\vec{n}, i)$ such that $\sigma_i \in \Gamma(\vec{n}, i)$ and $\Gamma(\vec{n}, i) \nvdash \tau_i$.

Now put $\Sigma(\langle n_0, \ldots, n_{k-1}, i\rangle) := \Gamma(\vec{n}, i) \cap At(\vec{n}, i)$. We observe that all conditions for a Kripke model are met. The model reflects (like the model of 3.1.1) very much the nature of the prime theories involved.

Claim: $\vec{n} \Vdash \psi \Leftrightarrow \Gamma(\vec{n}) \vdash \psi$.

We prove the claim by induction on $\psi$.

- For atomic $\psi$ the equivalence holds by definition.

- $\psi = \psi_1 \wedge \psi_2$: immediate.

- $\psi = \psi_1 \vee \psi_2$.
(a) $\vec{n} \Vdash \psi_1 \vee \psi_2 \Leftrightarrow \vec{n} \Vdash \psi_1$ or $\vec{n} \Vdash \psi_2 \Rightarrow$ (ind. hyp.) $\Gamma(\vec{n}) \vdash \psi_1$ or $\Gamma(\vec{n}) \vdash \psi_2 \Rightarrow \Gamma(\vec{n}) \vdash \psi_1 \vee \psi_2$.
(b) $\Gamma(\vec{n}) \vdash \psi_1 \vee \psi_2 \Rightarrow \Gamma(\vec{n}) \vdash \psi_1$ or $\Gamma(\vec{n}) \vdash \psi_2$, since $\Gamma(\vec{n})$ is a prime theory (in the right language). So, by induction hypothesis, $\vec{n} \Vdash \psi_1$ or $\vec{n} \Vdash \psi_2$, and hence $\vec{n} \Vdash \psi_1 \vee \psi_2$.

- $\psi = \psi_1 \to \psi_2$.
(a) $\vec{n} \Vdash \psi_1 \to \psi_2$. Suppose $\Gamma(\vec{n}) \nvdash \psi_1 \to \psi_2$, then $\Gamma(\vec{n}), \psi_1 \nvdash \psi_2$. By the definition of the model there is an extension $\vec{m} = \langle n_0, \ldots, n_{k-1}, i\rangle$ of $\vec{n}$ such that $\Gamma(\vec{n}) \cup \{\psi_1\} \subseteq \Gamma(\vec{m})$ and $\Gamma(\vec{m}) \nvdash \psi_2$. By induction hypothesis $\vec{m} \Vdash \psi_1$ and by $\vec{m} \geq \vec{n}$ and $\vec{n} \Vdash \psi_1 \to \psi_2$, $\vec{m} \Vdash \psi_2$. Applying the induction hypothesis once more we get $\Gamma(\vec{m}) \vdash \psi_2$. Contradiction. Hence $\Gamma(\vec{n}) \vdash \psi_1 \to \psi_2$.
(b) The converse is simple, it is left to the reader.

- $\psi = \forall x\varphi(x)$.
(a) $\vec{n} \Vdash \forall x\varphi(x) \Leftrightarrow \forall \vec{m} \geq \vec{n}\ \forall c \in C(\vec{m})\,(\vec{m} \Vdash \varphi(c))$. Assume $\Gamma(\vec{n}) \nvdash \forall x\varphi(x)$, then for a suitable $i$, $\Gamma(\vec{n}, i) \nvdash \forall x\varphi(x)$ (take $\top$ for $\sigma_i$ in the above construction). Let $c$ be a constant in $L(\vec{n}, i)$, not in $\Gamma(\vec{n}, i)$, then $\Gamma(\vec{n}, i) \nvdash \varphi(c)$, and by induction hypothesis $\langle\vec{n}, i\rangle \nVdash \varphi(c)$. Contradiction.
(b) $\Gamma(\vec{n}) \vdash \forall x\varphi(x)$. Suppose $\vec{n} \nVdash \forall x\varphi(x)$, then $\vec{m} \nVdash \varphi(c)$ for some $\vec{m} \geq \vec{n}$ and for some $c \in L(\vec{m})$, hence $\Gamma(\vec{m}) \nvdash \varphi(c)$ and therefore $\Gamma(\vec{n}) \nvdash \forall x\varphi(x)$. Contradiction.

- $\psi = \exists x\varphi(x)$. The implication from left to right is obvious. For the converse we use the fact that $\Gamma(\vec{n})$ is a prime theory. The details are left to the reader.

We now can finish our proof. The bottom node forces $\Gamma$ and $\varphi$ is not forced.

We can get some extra information from the proof of the Model Existence Lemma: (i) the underlying partially ordered set is a tree, (ii) all sets $D(\vec{n})$ are denumerable.

From the Model Existence Lemma we easily derive the following

Theorem 5.3.10 (Completeness Theorem - Kripke). $\Gamma \vdash_i \varphi \Leftrightarrow \Gamma \Vdash \varphi$ ($\Gamma$ and $\varphi$ closed).

Proof. We have already shown $\Rightarrow$. For the converse we assume $\Gamma \nvdash_i \varphi$ and apply 5.3.9, which yields a contradiction.

Actually we have proved the following refinement: intuitionistic logic is complete for countable models over trees.

The above results are completely general (save for the cardinality restriction on $L$), so we may as well assume that $\Gamma$ contains the identity axioms $I_1, \ldots, I_4$ (2.6). May we also assume that the identity predicate is interpreted by the real equality in each world? The answer is no, this assumption constitutes a real restriction, as the following theorem shows.

Theorem 5.3.11. If for all $k \in K$, $k \Vdash \bar{a} = \bar{b} \Rightarrow a = b$ for $a, b \in D(k)$, then $\mathcal{K} \Vdash \forall xy(x = y \vee x \neq y)$.

Proof. Let $a, b \in D(k)$ and $k \nVdash \bar{a} = \bar{b}$, then $a \neq b$, not only in $D(k)$, but in all $D(l)$ for $l \geq k$, hence for all $l \geq k$, $l \nVdash \bar{a} = \bar{b}$, so $k \Vdash \bar{a} \neq \bar{b}$.

For a kind of converse, cf. Exercise 18.

The fact that the relation $a \sim_k b$ in $\mathfrak{A}(k)$, given by $k \Vdash \bar{a} = \bar{b}$, is not the identity relation is definitely embarrassing for a language with function symbols. So let us see what we can do about it. We assume that a function symbol $F$ is interpreted in each $k$ by a function $F_k$. We require $k \leq l \Rightarrow F_k \subseteq F_l$. $F$ has to obey $I_4$: $\forall \vec{x}\vec{y}(\vec{x} = \vec{y} \to F(\vec{x}) = F(\vec{y}))$. For more about functions see Exercise 34.

Lemma 5.3.12. The relation $\sim_k$ is a congruence relation on $\mathfrak{A}(k)$, for each $k$.

Proof. Straightforward, by interpreting $I_1 - I_4$.

For convenience we usually drop the index $k$. We now may define new structures by taking equivalence classes: $\mathfrak{A}^*(k) := \mathfrak{A}(k)/\sim_k$, i.e. the elements of $|\mathfrak{A}^*(k)|$ are equivalence classes $a/\sim_k$ of elements $a \in D(k)$, and the relations are canonically determined by $R^*_k(a/\sim, \ldots) \Leftrightarrow R_k(a, \ldots)$, similarly for the functions $F^*_k(a/\sim, \ldots) = F_k(a, \ldots)/\sim$.

The inclusion $\mathfrak{A}(k) \subseteq \mathfrak{A}(l)$, for $k \leq l$, is now replaced by a map $f_{kl}: \mathfrak{A}^*(k) \to \mathfrak{A}^*(l)$, where $f_{kl}$ is defined by $f_{kl}(a) = a^{\mathfrak{A}(l)}$ for $a \in |\mathfrak{A}^*(k)|$. To be precise: $a/\sim_k \mapsto a/\sim_l$, so we have to show $a \sim_k a' \Rightarrow a \sim_l a'$ to ensure the well-definedness of $f_{kl}$. This, however, is obvious, since $k \Vdash \bar{a} = \bar{a}' \Rightarrow l \Vdash \bar{a} = \bar{a}'$.

Claim 5.3.13. $f_{kl}$ is a homomorphism.

Proof. Let us look at a binary relation. $R^*_k(a/\sim, b/\sim) \Leftrightarrow R_k(a, b) \Leftrightarrow k \Vdash R(a, b) \Rightarrow l \Vdash R(a, b) \Leftrightarrow R_l(a, b) \Leftrightarrow R^*_l(a/\sim, b/\sim)$. The case of an operation is left to the reader.

The upshot is that we can define a modified notion of Kripke model.

Definition 5.3.14. A modified Kripke model for a language $L$ is a triple $\mathcal{K} = \langle K, \mathfrak{A}, f\rangle$ such that $K$ is a partially ordered set, $\mathfrak{A}$ and $f$ are mappings such that for $k \in K$, $\mathfrak{A}(k)$ is a structure for $L$ and for $k, l \in K$ with $k \leq l$, $f(k, l)$ is a homomorphism from $\mathfrak{A}(k)$ to $\mathfrak{A}(l)$ and $f(l, m) \circ f(k, l) = f(k, m)$, $f(k, k) = id$.

Notation. We write $f_{kl}$ for $f(k, l)$, and $k \Vdash^* \varphi$ for $\mathfrak{A}(k) \models \varphi$, for atomic $\varphi$.

Now one may mimic the development presented for the original notion of Kripke semantics. In particular the connection between the two notions is given by

Lemma 5.3.15. Let $\mathcal{K}^*$ be the modified Kripke model obtained from $\mathcal{K}$ by dividing out $\sim$. Then $k \Vdash \varphi(\vec{a}) \Leftrightarrow k \Vdash^* \varphi(\vec{a}/\sim)$ for all $k \in K$.

Proof. Left to the reader.

Corollary 5.3.16. Intuitionistic logic (with identity) is complete with respect to modified Kripke semantics.

Proof. Apply 5.3.9 and 5.3.15.

We will usually work with ordinary Kripke models, but for convenience we will often replace inclusions of structures $\mathfrak{A}(k) \subseteq \mathfrak{A}(l)$ by inclusion mappings $\mathfrak{A}(k) \hookrightarrow \mathfrak{A}(l)$.

5.4 Some Model Theory

We will give some simple applications of Kripke's semantics. The first ones concern the so-called disjunction and existence properties.

Definition 5.4.1. A set of sentences $\Gamma$ has the
(i) disjunction property (DP) if $\Gamma \vdash \varphi \vee \psi \Rightarrow \Gamma \vdash \varphi$ or $\Gamma \vdash \psi$,
(ii) existence property (EP) if $\Gamma \vdash \exists x\varphi(x) \Rightarrow \Gamma \vdash \varphi(t)$ for some closed term $t$
(where $\varphi \vee \psi$ and $\exists x\varphi(x)$ are closed).

In a sense DP and EP reflect the constructive character of the theory $\Gamma$ (in the frame of intuitionistic logic), since it makes explicit the clause 'if we have a proof of $\exists x\varphi(x)$, then we have a proof of a particular instance', similarly for disjunction.

Classical logic does not have DP or EP, for consider in propositional logic $p_0 \vee \neg p_0$. Clearly $\vdash_c p_0 \vee \neg p_0$, but neither $\vdash_c p_0$ nor $\vdash_c \neg p_0$!

Theorem 5.4.2. Intuitionistic propositional and predicate logic without function symbols have DP.

Proof. Let $\vdash \varphi \vee \psi$, and suppose $\nvdash \varphi$ and $\nvdash \psi$, then there are Kripke models $\mathcal{K}_1$ and $\mathcal{K}_2$ with bottom nodes $k_1$ and $k_2$ such that $k_1 \nVdash \varphi$ and $k_2 \nVdash \psi$. It is no restriction to suppose that the partially ordered sets $K_1, K_2$ of $\mathcal{K}_1$ and $\mathcal{K}_2$ are disjoint.

We define a new Kripke model with $K = K_1 \cup K_2 \cup \{k_0\}$ where $k_0 \notin K_1 \cup K_2$, see picture for the ordering.

[Figure: the ordering of $K$.]

We define $\mathfrak{A}(k) = \mathfrak{A}_1(k)$ for $k \in K_1$, $\mathfrak{A}(k) = \mathfrak{A}_2(k)$ for $k \in K_2$, and $\mathfrak{A}(k_0) = |\mathfrak{A}|$, where $|\mathfrak{A}|$ consists of all the constants of $L$, if there are any, otherwise $|\mathfrak{A}|$ contains only one element $a$. The inclusion mapping for $\mathfrak{A}(k_0) \hookrightarrow \mathfrak{A}(k_i)$ ($i = 1, 2$) is defined by $c \mapsto c^{\mathfrak{A}(k_i)}$ if there are constants; if not we pick $a_i \in \mathfrak{A}(k_i)$ arbitrarily and define $f_{01}(a) = a_1$, $f_{02}(a) = a_2$.

$\mathfrak{A}$ satisfies the definition of a Kripke model. The models $\mathcal{K}_1$ and $\mathcal{K}_2$ are 'submodels' of the new model in the sense that the forcing induced on $\mathcal{K}_i$ by that of $\mathcal{K}$ is exactly its old forcing, cf. Exercise 13. By the Completeness Theorem $k_0 \Vdash \varphi \vee \psi$, so $k_0 \Vdash \varphi$ or $k_0 \Vdash \psi$. If $k_0 \Vdash \varphi$, then $k_1 \Vdash \varphi$. Contradiction. If $k_0 \Vdash \psi$, then $k_2 \Vdash \psi$. Contradiction. So $\nvdash \varphi$ and $\nvdash \psi$ is not true, hence $\vdash \varphi$ or $\vdash \psi$.

Observe that this proof can be considerably simplified for propositional logic: all we have to do is place an extra node under $k_1$ and $k_2$ in which no atom is forced (cf. Exercise 19).

Theorem 5.4.3. Let the language of intuitionistic predicate logic contain at least one constant and no function symbols, then EP holds.

Proof. Let $\vdash \exists x\varphi(x)$ and $\nvdash \varphi(c)$ for all constants $c$. Then for each $c$ there is a Kripke model $\mathcal{K}_c$ with bottom node $k_c$ such that $k_c \nVdash \varphi(c)$. Now mimic the argument of 5.4.2 above, by taking the disjoint union of the $\mathcal{K}_c$'s and adding a bottom node $k_0$. Use the fact that $k_0 \Vdash \exists x\varphi(x)$.

The reader will have observed that we reason about our intuitionistic logic and model theory in a classical meta-theory. In particular we use the principle of the excluded third in our meta-language. This indeed detracts from the constructive nature of our considerations. For the present we will not bother to make our arguments constructive; it may suffice to remark that classical arguments can often be circumvented, cf. Ch. 6.

In constructive mathematics one often needs stronger notions than the classical ones. A paradigm is the notion of inequality. E.g. in the case of the real numbers it does not suffice to know that a number is unequal (i.e. not equal) to 0 in order to invert it. The procedure that constructs the inverse for a given Cauchy sequence requires that there exists a number $n$ such that the distance of the given number to zero is greater than $2^{-n}$. Instead of a negative notion we need a positive one; this was introduced by Brouwer and formalised by Heyting.

Definition 5.4.4. A binary relation $\#$ is called an apartness relation if
(i) $\forall xy(x = y \leftrightarrow \neg x \# y)$,
(ii) $\forall xy(x \# y \leftrightarrow y \# x)$,
(iii) $\forall xyz(x \# y \to x \# z \vee y \# z)$.

Examples.
1. For rational numbers the inequality is an apartness relation.
2. If the equality relation on a set is decidable (i.e. $\forall xy(x = y \vee x \neq y)$), then $\neq$ is an apartness relation (Exercise 22).
3. For real numbers the relation $|a - b| > 0$ is an apartness relation (cf. Troelstra-van Dalen).

We call the theory with axioms (i), (ii), (iii) of 5.4.4 AP, the theory of apartness (of course the obvious identity axiom $x_1 = x_2 \wedge y_1 = y_2 \wedge x_1 \# y_1 \to x_2 \# y_2$ is included).

Theorem 5.4.5. $AP \vdash \forall xy(\neg\neg x = y \to x = y)$.

Proof. Observe that $\neg\neg x = y \leftrightarrow \neg\neg\neg x \# y \leftrightarrow \neg x \# y \leftrightarrow x = y$.

We call an equality relation that satisfies the condition $\forall xy(\neg\neg x = y \to x = y)$ stable. Note that stable is essentially weaker than decidable (Exercise 23).

In the passage from intuitionistic theories to classical ones by adding the principle of the excluded third usually a lot of notions are collapsed, e.g. $\neg\neg x = y$ and $x = y$. Or conversely, when passing from classical theories to intuitionistic ones (by deleting the principle of the excluded third) there is a choice of the right notions. Usually (but not always) the strongest notions fare best. An example is the notion of linear order.

The theory of linear order, LO, has the following axioms:
(i) $\forall xyz(x < y \wedge y < z \to x < z)$,
(ii) $\forall xyz(x < y \to z < y \vee x < z)$,
(iii) $\forall xy(x = y \leftrightarrow \neg x < y \wedge \neg y < x)$.

One might wonder why we did not choose the axiom $\forall xy(x < y \vee x = y \vee y < x)$ instead of (ii); it certainly would be stronger! There is a simple reason: the axiom is too strong, it does not hold, e.g., for the reals.

We will next investigate the relation between linear order and apartness.

Theorem 5.4.6. The relation $x < y \vee y < x$ is an apartness relation.

Proof. An exercise in logic.
Conversely, Smoryhski has shown how to introduce an order relation in a Kripke model of A P : Let K ll- A P, t hen in each D (k) t he following is an equivalence relation: k ly a#b. (a) kl t a = a ~ a # a ,since I cl t- a = a we get kl t ~ a # a nd hence a k ly a #a. k ly b#a. t ( b) k I a#b H b#a, so obviously k ly a#b ly a#b, k ly b#c and suppose klt- a#c, then by axiom (iii) k l t a#b (c) let k or k IF c#b which contradicts the assumptions. So k If a #c. Observe that this equivalence relation contains the one induced by the identity; k l t a = b + k ly a#b. T he domains D (k) a re t hussplit up in equivalence classes, which can be linearly ordered in the classical sense. Since we want t o end up with a Kripke model, we have t o be a bit careful. Observe that equivalence classes may be split by passing to a higher node, e.g. if k < 1 and k ly a#b then 1 It- a#b is very well possible, but 1 ly a#b + k ly a#b. We take an arbitrary ordering of the equivalence classes of the bottom node (using the axiom of choice in our meta-theory if necessary). Next we indicate how t o order the equivalence classes in an immediate successor 1 of k. The 'new' elements of D(1) a re indicated by the shaded part. (i) Consider an equivalence class [ao],+n D (k), a nd look at the corresponding i set Bo := U{[aIlla E [aoIk}. T his set splits in a number of classes; we order those linearly. Denote the equivalence classes of a. by a ob (where b is a representative). Now the classes belonging to the b's are ordered, and we order all the classes on UBolao E D (k)) lexicographically according to the representation aob. (ii) Finally we consider the new equivalence classes, i.e. of those that are not equivalent to any b in U{Bolao E D (k)). We order those classes and put t hem in that order behind the classes of case (i). Under this procedure we order all equivalence classes in all nodes. 
We now define a relation $R_k$ for each $k$: $R_k(a, b) := [a]_k < [b]_k$, where $<$ is the ordering defined above. By our definition, $k \le l$ and $R_k(a, b)$ imply $R_l(a, b)$. We leave it to the reader to show that $I_4$ is valid, i.e. in particular $k \Vdash \forall x x' y(x = x' \wedge x < y \to x' < y)$, where $<$ is interpreted by $R_k$.

Observe that in this model the following holds:

(#) $\forall xy(x \# y \leftrightarrow x < y \vee y < x)$,

for in all nodes $k$, $k \Vdash a \# b \Leftrightarrow k \Vdash a < b$ or $k \Vdash b < a$. Now we must check the axioms of linear order.

(i) transitivity. $k_0 \Vdash \forall xyz(x < y \wedge y < z \to x < z)$ $\Leftrightarrow$ for all $k \ge k_0$, for all $a, b, c \in D(k)$: $k \Vdash a < b \wedge b < c \to a < c$ $\Leftrightarrow$ for all $k \ge k_0$, for all $a, b, c \in D(k)$ and for all $l \ge k$: $l \Vdash a < b$ and $l \Vdash b < c \Rightarrow l \Vdash a < c$. So we have to show $R_l(a, b)$ and $R_l(b, c) \Rightarrow R_l(a, c)$, but that is indeed the case by the linear ordering of the equivalence classes.
(ii) (weak) linearity. We must show $k_0 \Vdash \forall xyz(x < y \to z < y \vee x < z)$. Since (#) holds in our model, the problem is reduced to pure logic: show $AP + \forall xyz(x < y \wedge y < z \to x < z) + \forall xy(x \# y \leftrightarrow x < y \vee y < x) \vdash \forall xyz(x < y \to z < y \vee x < z)$. We leave the proof to the reader.
(iii) anti-symmetry. We must show $k_0 \Vdash \forall xy(x = y \leftrightarrow \neg x < y \wedge \neg y < x)$. As before the problem is reduced to logic. Show: $AP + \forall xy(x \# y \leftrightarrow x < y \vee y < x) \vdash \forall xy(x = y \leftrightarrow \neg x < y \wedge \neg y < x)$.

Now we have finished the job: we have put a linear order on a model with an apartness relation. We can now draw some conclusions.

Theorem 5.4.7. $AP + LO + (\#)$ is conservative over $LO$.

Proof. Immediate, by Theorem 5.4.6. □

Theorem 5.4.8 (van Dalen-Statman). $AP + LO + (\#)$ is conservative over $AP$.

Proof. Suppose $AP \nvdash \varphi$, then by the Model Existence Lemma there is a tree model $\mathcal{K}$ of $AP$ such that the bottom node $k_0$ does not force $\varphi$. We now carry out the construction of a linear order on $\mathcal{K}$; the resulting model $\mathcal{K}^*$ is a model of $AP + LO + (\#)$, and, since $\varphi$ does not contain $<$, $k_0 \nVdash \varphi$. Hence $AP + LO + (\#) \nvdash \varphi$. This shows the conservative extension result: $AP + LO + (\#) \vdash \varphi \Rightarrow AP \vdash \varphi$, for $\varphi$ in the language of $AP$. □

There is a convenient tool for establishing elementary equivalence between Kripke models:

Definition 5.4.9. (i) A bisimulation between two posets $A$ and $B$ is a relation $R \subseteq A \times B$ such that for each $a, a', b$ with $a \le a'$, $aRb$ there is a $b' \ge b$ with $a'Rb'$, and for each $a, b, b'$ with $aRb$, $b \le b'$ there is an $a' \ge a$ such that $a'Rb'$.
(ii) $R$ is a bisimulation between propositional Kripke models $\mathcal{A}$ and $\mathcal{B}$ if it is a bisimulation between the underlying posets and if $aRb \Rightarrow \Sigma(a) = \Sigma(b)$ (i.e. $a$ and $b$ force the same atoms).

Bisimulations are useful to establish elementary equivalence node-wise.

Lemma 5.4.10. Let $R$ be a bisimulation between $\mathcal{A}$ and $\mathcal{B}$, then for all $a, b, \varphi$: $aRb \Rightarrow (a \Vdash \varphi \Leftrightarrow b \Vdash \varphi)$.

Proof. Induction on $\varphi$. For atoms and conjunctions and disjunctions the result is obvious. Consider $\varphi = \varphi_1 \to \varphi_2$. Let $aRb$ and $a \Vdash \varphi_1 \to \varphi_2$. Suppose $b \nVdash \varphi_1 \to \varphi_2$, then for some $b' \ge b$, $b' \Vdash \varphi_1$ and $b' \nVdash \varphi_2$. By definition, there is an $a' \ge a$ such that $a'Rb'$. By induction hypothesis $a' \Vdash \varphi_1$ and $a' \nVdash \varphi_2$. Contradiction. The converse is completely similar. □

Corollary 5.4.11. If $R$ is a total bisimulation between $\mathcal{A}$ and $\mathcal{B}$, i.e. $\mathrm{dom}\,R = A$, $\mathrm{ran}\,R = B$, then $\mathcal{A}$ and $\mathcal{B}$ are elementarily equivalent ($\mathcal{A} \Vdash \varphi \Leftrightarrow \mathcal{B} \Vdash \varphi$).

We end this chapter by giving some examples of models with unexpected properties.

2. [Figure: Kripke model with maps $f$ and $g$] $f$ is the identity and $g$ is the canonical ring homomorphism $\mathbb{Z} \to \mathbb{Z}/(2)$. $\mathcal{K}$ is a model of the ring axioms (p. 88). Note that $k_0 \Vdash 3 \neq 0$, $k_0 \nVdash 2 = 0$, $k_0 \nVdash 2 \neq 0$ and $k_0 \nVdash \forall x(x \neq 0 \to \exists y(xy = 1))$, but also $k_0 \nVdash \exists x(x \neq 0 \wedge \forall y(xy \neq 1))$. We see that $\mathcal{K}$ is a commutative ring in which not all non-zero elements are invertible, but in which it is impossible to exhibit a non-invertible, non-zero element.

3. [Figure: Kripke model with maps $f$ and $g$] Again $f$ and $g$ are the canonical homomorphisms.
$\mathcal{K}$ is an intuitionistic, commutative ring, as one easily verifies. $\mathcal{K}$ has no zero-divisors: $k_0 \Vdash \neg\exists xy(x \neq 0 \wedge y \neq 0 \wedge xy = 0)$ $\Leftrightarrow$ for all $i$

(1) $k_i \nVdash \exists xy(x \neq 0 \wedge y \neq 0 \wedge xy = 0)$.

For $i = 1, 2$ this is obvious, so let us consider $i = 0$. $k_0 \Vdash \exists xy(x \neq 0 \wedge y \neq 0 \wedge xy = 0) \Leftrightarrow k_0 \Vdash m \neq 0 \wedge n \neq 0 \wedge mn = 0$ for some $m, n$. So $m \neq 0$, $n \neq 0$, $mn = 0$. Contradiction. This proves (1).

The cardinality of the model is rather undetermined. We know $k_0 \Vdash \exists xy(x \neq y)$ (take 0 and 1), and $k_0 \Vdash \neg\exists x_1x_2x_3x_4 \bigwedge_{1 \le i < j \le 4} x_i \neq x_j$. But note that $k_0 \nVdash \exists x_1x_2x_3 \bigwedge_{1 \le i < j \le 3} x_i \neq x_j$, $k_0 \nVdash \forall x_1x_2x_3x_4 \bigvee_{1 \le i < j \le 4} x_i = x_j$ and $k_0 \nVdash \neg\exists x_1x_2x_3 \bigwedge_{1 \le i < j \le 3} x_i \neq x_j$.

Observe that the equality relation in $\mathcal{K}$ is not stable: $k_0 \Vdash \neg\neg 0 = 6$, but $k_0 \nVdash 0 = 6$.

[Figure: Kripke model] $S_n$ is the (classical) symmetric group on $n$ elements. Choose $n \ge 3$. $k_0$ forces the group axioms (p. 86). $k_0 \Vdash \neg\forall xy(xy = yx)$, but $k_0 \nVdash \exists xy(xy \neq yx)$, and $k_0 \nVdash \forall xy(xy = yx)$. So this group is not commutative, but one cannot indicate non-commuting elements.

[Figure: Kripke model] Define an apartness relation by $k_1 \Vdash a \# b \Leftrightarrow a \neq b$ in $\mathbb{Z}/(2)$, idem for $k_2$. Then $\mathcal{K} \Vdash \forall x(x \# 0 \to \exists y(xy = 1))$. This model is an intuitionistic field, but we cannot determine its characteristic. $k_1 \Vdash \forall x(x + x = 0)$, $k_2 \Vdash \forall x(x + x + x = 0)$. All we know is $\mathcal{K} \Vdash \forall x(6 \cdot x = 0)$.

In the short introduction to intuitionistic logic that we have presented we have only been able to scratch the surface. We have intentionally simplified the issues so that a reader can get a rough impression of the problems and methods without going into the finer foundational details. In particular we have treated intuitionistic logic in a classical meta-mathematics, e.g. we have freely applied proof by contradiction (cf. 5.3.10). Obviously this does not do justice to constructive mathematics as an alternative mathematics in its own right. For this and related issues the reader is referred to the literature. A more constructive approach is presented in the next chapter.

Exercises

1. (informal mathematics). Let $\varphi(n)$ be a decidable property of natural numbers such that neither $\exists n\varphi(n)$, nor $\forall n\neg\varphi(n)$ has been established (e.g. "$n$ is the largest number such that $n$ and $n + 2$ are prime"). Define a real number $a$ by the Cauchy sequence:

$$a_n := \begin{cases} \sum_{i=1}^{n} 2^{-i} & \text{if } \neg\varphi(k) \text{ for all } k < n, \\ \sum_{i=1}^{k} 2^{-i} & \text{if } k < n \text{ and } \varphi(k) \text{ and } \neg\varphi(i) \text{ for } i < k. \end{cases}$$

Show that $(a_n)$ is a Cauchy sequence and that "$\neg\neg a$ is rational", but there is no evidence for "$a$ is rational".

4. Define the double negation translation $\varphi^{\neg\neg}$ of $\varphi$ by placing $\neg\neg$ in front of each subformula. Show $\vdash_i \varphi^{\circ} \leftrightarrow \varphi^{\neg\neg}$ and $\vdash_c \varphi \Leftrightarrow\ \vdash_i \varphi^{\neg\neg}$.

5. Show that for propositional logic $\vdash_i \neg\varphi \Leftrightarrow\ \vdash_c \neg\varphi$.

6. Intuitionistic arithmetic HA (Heyting's arithmetic) is the first-order intuitionistic theory with the axioms of page 85 as mathematical axioms. Show $HA \vdash \forall xy(x = y \vee x \neq y)$ (use the principle of induction). Show that the Gödel translation works for arithmetic, i.e. $PA \vdash \varphi \Leftrightarrow HA \vdash \varphi^{\circ}$ (where PA is Peano's (classical) arithmetic). Note that we need not doubly negate the atoms.

Show that PA is conservative over HA with respect to formulas not containing $\vee$ and $\exists$.

(b) Use the completeness theorem to establish the following theorems: $\nvdash \bigvee_{1 \le i < j \le n}(p_i \leftrightarrow p_j)$, for all $n > 2$.

Consider a propositional Kripke model $\mathcal{K}$, where the $\Sigma$ function assigns only subsets of a finite set $\Gamma$ of the propositions, which is closed under subformulas. We may consider the sets of propositions forced at a node instead of the node: define $[k] = \{\varphi \in \Gamma \mid k \Vdash \varphi\}$. The set $\{[k] \mid k \in K\}$ is partially ordered by inclusion. Define $\Sigma_\Gamma([k]) := \Sigma(k) \cap At$, show that the conditions of a Kripke model are satisfied; call this model $\mathcal{K}_\Gamma$, and denote the forcing by $\Vdash_\Gamma$. We say that $\mathcal{K}_\Gamma$ is obtained by filtration from $\mathcal{K}$.
(a) Show $[k] \Vdash_\Gamma \varphi \Leftrightarrow k \Vdash \varphi$, for $\varphi \in \Gamma$.
(b) Show that $\mathcal{K}_\Gamma$ has an underlying finite partially ordered set.
(c) Show that $\vdash \varphi \Leftrightarrow \varphi$ holds in all finite Kripke models.
(d) Show that intuitionistic propositional logic is decidable (i.e. there is a decision method for $\vdash \varphi$), apply 3.3.17.

Each Kripke model with bottom node $k_0$ can be turned into a model over a tree as follows: $K_{tr}$ consists of all finite increasing sequences $\langle k_0, k_1, \ldots, k_n \rangle$, $k_i < k_{i+1}$ $(0 \le i < n)$, and $U_{tr}(\langle k_0, \ldots, k_n \rangle) := U(k_n)$. Show $\langle k_0, \ldots, k_n \rangle \Vdash_{tr} \varphi \Leftrightarrow k_n \Vdash \varphi$, where $\Vdash_{tr}$ is the forcing relation in the tree model.

Give the simplified definition of a Kripke model for (the language of) propositional logic by considering the special case of def. 5.3.1 with $\Sigma(k)$ consisting of propositional atoms only, and $D(k) = \{0\}$ for all $k$.

11. Give an alternative definition of Kripke model based on the "structure-map" $U(k)$ and show the equivalence with definition 5.3.1 (without propositional atoms).

12. Prove the soundness theorem using lemma 5.3.5.

13. A subset $K'$ of a partially ordered set $K$ is closed (under $\le$) if $k \in K'$, $k \le l \Rightarrow l \in K'$. If $K'$ is a closed subset of the underlying partially ordered set $K$ of a Kripke model $\mathcal{K}$, then $K'$ determines a Kripke model $\mathcal{K}'$ over $K'$ with $D'(k) = D(k)$ and $k \Vdash' \varphi \Leftrightarrow k \Vdash \varphi$ for $k \in K'$ and $\varphi$ atomic. Show $k \Vdash' \varphi \Leftrightarrow k \Vdash \varphi$ for all $\varphi$ with parameters in $D(k)$, for $k \in K'$ (i.e. it is the future that matters, not the past).

(a) Show that $(\varphi \to \psi) \vee (\psi \to \varphi)$ holds in all linearly ordered Kripke models for propositional logic.
(b) Show that $LC \nvdash \sigma \Rightarrow$ there is a linear Kripke model of LC in which $\sigma$ fails, where LC is the propositional theory axiomatized by the schema $(\varphi \to \psi) \vee (\psi \to \varphi)$ (Hint: apply Exercise 15). Hence LC is complete for linear Kripke models (Dummett).

18. Consider a Kripke model $\mathcal{K}$ for decidable equality (i.e. $\forall xy(x = y \vee x \neq y)$). For each $k$ the relation $k \Vdash a = b$ is an equivalence relation.
Define a new model $\mathcal{K}'$ with the same partially ordered set as $\mathcal{K}$, and $D'(k) = \{[a]_k \mid a \in D(k)\}$, where $[a]_k$ is the equivalence class of $a$. Replace the inclusion of $D(k)$ in $D(l)$, for $k < l$, by the corresponding canonical embedding $[a]_k \mapsto [a]_l$. Define for atomic $\varphi$: $k \Vdash' \varphi := k \Vdash \varphi$, and show $k \Vdash' \varphi \Leftrightarrow k \Vdash \varphi$ for all $\varphi$.

19. Prove DP for propositional logic directly by simplifying the proof of 5.4.2.

20. Show that HA has DP and EP, the latter in the form: $HA \vdash \exists x\varphi(x) \Rightarrow HA \vdash \varphi(\bar{n})$ for some $n \in \mathbb{N}$. (Hint: show that the model, constructed in 5.4.2 and in 5.4.3, is a model of HA).

14. Give a modified proof of the model existence lemma by taking as nodes of the partially ordered set prime theories that extend $\Gamma$ and that have a language with constants in some set $C^0 \cup C^1 \cup \ldots \cup C^{k-1}$ (cf. proof of 5.3.8) (note that the resulting partially ordered set need not be (and, as a matter of fact, is not) a tree, so we lose something).

21. Consider predicate logic in a language without function symbols and constants. Show $\vdash \exists x\varphi(x) \Rightarrow\ \vdash \forall x\varphi(x)$. (Hint: add an auxiliary constant $c$, apply 5.4.3, and replace it by a suitable variable).

22. Show $\forall xy(x = y \vee x \neq y) \vdash AP$, where AP consists of the three axioms of the apartness relation, with $x \# y$ replaced by $\neq$.

24. Show that $k \Vdash \varphi \vee \neg\varphi$ for maximal nodes $k$ of a Kripke model, so $\Sigma(k) = Th(U(k))$ (in the classical sense). That is, "the logic in a maximal node is classical."

25. Give an alternative proof of Glivenko's theorem, using 15 and 24.

26. Consider a Kripke model with two nodes $k_0, k_1$, $k_0 < k_1$ and $U(k_0) = \mathbb{R}$, $U(k_1) = \mathbb{C}$. Show $k_0 \nVdash \neg\forall x(x^2 + 1 \neq 0) \to \exists x(x^2 + 1 = 0)$.

27. Let $\mathcal{O} = \mathbb{R}[X]/X^2$ be the ring of dual numbers. $\mathcal{O}$ has a unique maximal ideal, generated by $X$. Consider a Kripke model with two nodes $k_0, k_1$; $k_0 < k_1$ and $U(k_0) = \mathcal{O}$, $U(k_1) = \mathbb{R}$, with $f: \mathcal{O} \to \mathbb{R}$ the canonical map $f(a + bX) = a$. Show that the model is an intuitionistic field, define the apartness relation.

28. Show that $\forall x(\varphi \vee \psi(x)) \to (\varphi \vee \forall x\psi(x))$ $(x \notin FV(\varphi))$ holds in all Kripke models with constant domain function (i.e. $\forall kl(D(k) = D(l))$).

29. This exercise will establish the undefinability of propositional connectives in terms of other connectives. To be precise the connective $\$_1$ is not definable by $\$_2, \ldots, \$_n$ if there is no formula $\varphi$, containing only the connectives $\$_2, \ldots, \$_n$ and the atoms $p_0, p_1$, such that $\vdash p_0 \,\$_1\, p_1 \leftrightarrow \varphi$.
(i) $\vee$ is not definable by $\to, \wedge, \bot$. Hint: suppose $\varphi$ defines $\vee$, apply the Gödel translation.
(ii) $\wedge$ is not definable in $\to, \vee, \bot$. Consider the Kripke model with three nodes $k_1, k_2, k_3$ and $k_1 < k_3$, $k_2 < k_3$, $k_1 \Vdash p$, $k_2 \Vdash q$, $k_3 \Vdash p, q$. Show that all $\wedge$-free formulas are either equivalent to $\bot$ or are forced in $k_1$ or $k_2$.
(iii) $\to$ is not definable in $\wedge, \vee, \neg, \bot$. Consider the Kripke model with three nodes $k_1, k_2, k_3$ and $k_1 < k_3$, $k_2 < k_3$, $k_1 \Vdash p$, $k_2 \Vdash q$, $k_3 \Vdash p, q$. Show for all $\to$-free formulas $k_2 \Vdash \varphi \Rightarrow k_1 \Vdash \varphi$.

30. We consider now only propositions with a single atom $p$. Define a sequence of formulas by $\varphi_0 := \bot$, $\varphi_1 := p$, $\varphi_2 := \neg p$, $\varphi_{2n+3} := \varphi_{2n+1} \vee \varphi_{2n+2}$, $\varphi_{2n+4} := \varphi_{2n+2} \to \varphi_{2n+1}$ and an extra formula $\varphi_\infty := \top$. There is a specific set of implications among the $\varphi_i$, indicated in the diagram on the left. [Figure: diagram of the implications among the $\varphi_i$]
(i) Show that the following implications hold:
$\vdash \varphi_{2n+1} \to \varphi_{2n+3}$, $\vdash \varphi_{2n+1} \to \varphi_{2n+4}$, $\vdash \varphi_{2n+2} \to \varphi_{2n+3}$, $\vdash \varphi_0 \to \varphi_n$, $\vdash \varphi_n \to \varphi_\infty$.
(ii) Show that the following 'identities' hold:
$\vdash (\varphi_{2n+1} \to \varphi_{2n+2}) \leftrightarrow \varphi_{2n+2}$,
$\vdash (\varphi_{2n+2} \to \varphi_{2n+4}) \leftrightarrow \varphi_{2n+4}$,
$\vdash (\varphi_{2n+3} \to \varphi_{2n+1}) \leftrightarrow \varphi_{2n+4}$,
$\vdash (\varphi_{2n+4} \to \varphi_{2n+1}) \leftrightarrow \varphi_{2n+6}$,
$\vdash (\varphi_{2n+5} \to \varphi_{2n+1}) \leftrightarrow \varphi_{2n+1}$,
$\vdash (\varphi_{2n+6} \to \varphi_{2n+1}) \leftrightarrow \varphi_{2n+4}$,
$\vdash (\varphi_k \to \varphi_{2n+1}) \leftrightarrow \varphi_{2n+1}$ for $k \ge 2n + 7$,
$\vdash (\varphi_k \to \varphi_{2n+2}) \leftrightarrow \varphi_{2n+2}$ for $k \ge 2n + 3$.
Determine identities for the implications not covered above.
(iii) Determine all possible identities for conjunctions and disjunctions of $\varphi_i$'s (look at the diagram).
(iv) Show that each formula in $p$ is equivalent to some $\varphi_i$.
(v) In order to show that there are no other implications than those indicated in the diagram (and the compositions of course) it suffices to show that no $\varphi_n$ is derivable. Why?

6. Normalisation

6.1 Cuts

Derivations will systematically be converted into simpler ones by "elimination of cuts"; here is an example:

$$\dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \sigma\end{matrix}}{\psi \to \sigma}\,{\to}I \qquad \begin{matrix}\mathcal{D}' \\ \psi\end{matrix}}{\sigma}\,{\to}E \qquad \text{converts to} \qquad \begin{matrix}\mathcal{D} \\ \sigma\end{matrix}$$

In general, when the tree under consideration is a subtree of a larger derivation the whole subtree ending with $\sigma$ is replaced by the second one. The rest of the derivation remains unaltered. This is one of the features of natural deduction derivations: for a formula $\sigma$ in the derivation only the part above $\sigma$ is relevant to $\sigma$. Therefore we will only indicate conversions as far as required, but the reader will do well to keep in mind that we make the replacement inside a given bigger derivation.

We will introduce a number of notions in order to facilitate the treatment.

Definition 6.1.1. The formulas directly above the line in a derivation rule are called the premises, the formula directly below the line, the conclusion. In elimination rules the premise containing the connective is called the major premise, the other premises, if any, are called the minor premises.

Convention. The major premises will from now on appear on the left hand side.

Definition 6.1.2. A formula occurrence $\gamma$ is a cut in a derivation when it is the conclusion of an introduction rule and the major premise of an elimination rule. $\gamma$ is called the cut formula of the cut.

In the above example $\psi \to \sigma$ is a cut formula.

We will adopt a slightly changed $\forall I$-rule, this will help to streamline the system:

$$\dfrac{\varphi}{\forall x\,\varphi[x/y]}\,\forall I$$

where $y$ does not occur free in a hypothesis of the derivation of $\varphi$, and $x$ is free for $y$ in $\varphi$. The old version of $\forall I$ is clearly a special case of the new rule. We will use the familiar notations, e.g. [derivation figure]. Note that with the new rule we get a shorter derivation for [derivation figure], namely [derivation figure].

The adoption of the new rule is not necessary, but rather convenient. We will first look at predicate calculus with $\wedge, \to, \bot, \forall$.

We list the possible conversions:

$$\dfrac{\dfrac{\begin{matrix}\mathcal{D}_1 \\ \varphi_1\end{matrix} \quad \begin{matrix}\mathcal{D}_2 \\ \varphi_2\end{matrix}}{\varphi_1 \wedge \varphi_2}\,\wedge I}{\varphi_i}\,\wedge E \qquad \text{is converted to} \qquad \begin{matrix}\mathcal{D}_i \\ \varphi_i\end{matrix}$$

$$\dfrac{\dfrac{\begin{matrix}[\psi] \\ \mathcal{D}_1 \\ \varphi\end{matrix}}{\psi \to \varphi}\,{\to}I \qquad \begin{matrix}\mathcal{D}_2 \\ \psi\end{matrix}}{\varphi}\,{\to}E \qquad \text{is converted to} \qquad \begin{matrix}\mathcal{D}_2 \\ \psi \\ \mathcal{D}_1 \\ \varphi\end{matrix}$$

$$\dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \varphi\end{matrix}}{\forall x\,\varphi[x/y]}\,\forall I}{\varphi[t/y]}\,\forall E \qquad \text{is converted to} \qquad \begin{matrix}\mathcal{D}[t/y] \\ \varphi[t/y]\end{matrix}$$

It is not immediately clear that this conversion is a legitimate operation on derivations, e.g. consider the elimination of the lower cut which converts [derivation figure].

The thoughtless substitution of $v$ for $z$ in $\mathcal{D}$ is questionable because $v$ is not free for $z$ in the third line and we see that in the resulting derivation $\forall I$ violates the condition on the proper variable. In order to avoid confusion of the above kind, we have to look a bit closer at the way we handle our variables in derivations. There is, of course, the obvious distinction between free and bound variables, but even the free variables do not all play the same role. Some of them are "the variable" involved in a $\forall I$. We call these occurrences proper variables and we extend the name to all occurrences that are "related" to them. The notion "related" is the transitive closure of the relation that two occurrences of the same variable have if one occurs in a conclusion and the other in a premise of a rule in "related" formula occurrences. It is simplest to define "related" as the reflexive, symmetric, transitive closure of the "direct relative" relation which is given by checking all derivation rules, e.g. in

$$\dfrac{\varphi(x) \wedge \psi(x, y)}{\psi(x, y)}\,\wedge E$$

the top occurrence and bottom occurrence of $\psi(x, y)$ are directly related, and so are the corresponding occurrences of $x$ and $y$. Similarly the $\varphi$ at the top and the one at the bottom in [derivation figure with cancelled hypothesis $[\varphi]$]. The details are left to the reader.

Dangerous clashes of variables can always be avoided, it takes just a routine renaming of variables. Since these syntactic matters present notorious pitfalls, we will exercise some care. Recall that we have shown earlier that bound variables may be renamed while retaining logical equivalence. We will use this expedient trick also in derivations.

Lemma 6.1.3. In a derivation the bound variables can be renamed so that no variable occurs both free and bound.

Proof. By induction on $\mathcal{D}$. Actually it is better to do some 'induction loading', in particular to prove that the bound variables can be chosen outside a given set of variables (including the free variables under consideration). The proof is simple, and hence left to the reader. □

Note that the formulation of the lemma is rather cryptic, we mean of course that the resulting configuration is again a derivation. It is also expedient to rename some of the free variables in a derivation, in particular we want to keep the proper and the non-proper free variables separated.

Lemma 6.1.4. In a derivation the free variables may be renamed, so that unrelated proper variables are distinct and each one is used exactly once in its inference rule. Moreover, no variable occurs as a proper and a non-proper variable.

Proof. Induction on $\mathcal{D}$. Choose always a fresh variable for a proper variable. Note that the renaming of the proper variables does not influence the hypotheses and the conclusion. □

In practice it may be necessary to keep renaming variables in order to satisfy the results of the above lemmas. From now on we assume that our derivations satisfy the above condition, i.e.
(i) bound and free variables are distinct,
(ii) proper and non-proper variables are distinct and each proper variable is used in precisely one $\forall I$.

Lemma 6.1.5. The conversions for $\to, \wedge, \forall$ yield derivations.

Proof. The only difficult case is the $\forall$-conversion. But according to our variables-condition $\mathcal{D}[t/y]$ is a derivation when $\mathcal{D}$ is one, for the variables in $t$ do not act as proper variables in $\mathcal{D}$. □

Remark. There is an alternative practice for formulating the rules of logic, which is handy indeed for proof theoretical purposes: make a typographical distinction between bound and free variables (a distinction in the alphabet). Free variables are called parameters in that notation. We have seen that the same effect can be obtained by the syntactical transformations described above. It is then necessary, of course, to formulate the $\forall$-introduction in the liberal form!

6.2 Normalisation for Classical Logic

Notation. $\mathcal{D} >_1 \mathcal{D}'$ stands for "$\mathcal{D}$ is converted to $\mathcal{D}'$". $\mathcal{D} > \mathcal{D}'$ stands for "there is a finite sequence of conversions $\mathcal{D} = \mathcal{D}_0 >_1 \mathcal{D}_1 >_1 \ldots >_1 \mathcal{D}_{n-1} = \mathcal{D}'$", and $\mathcal{D} \ge \mathcal{D}'$ stands for $\mathcal{D} > \mathcal{D}'$ or $\mathcal{D} = \mathcal{D}'$ ($\mathcal{D}$ reduces to $\mathcal{D}'$).

Definition 6.2.1. A string of conversions is called a reduction sequence. A derivation $\mathcal{D}$ is called an irreducible derivation if there is no $\mathcal{D}'$ such that $\mathcal{D} >_1 \mathcal{D}'$.

The basic question is of course 'does every sequence of conversions terminate in finitely many steps?', or equivalently 'is $>$ well-founded?' The answer turns out to be 'yes', but we will first look at a simpler question: 'does every derivation reduce to an irreducible derivation?'

Definition 6.2.2. If there is no $\mathcal{D}_1'$ such that $\mathcal{D}_1 >_1 \mathcal{D}_1'$ (i.e. if $\mathcal{D}_1$ does not contain cuts), then we call $\mathcal{D}_1$ a normal derivation, or we say that $\mathcal{D}_1$ is in normal form, and if $\mathcal{D} \ge \mathcal{D}'$ where $\mathcal{D}'$ is normal, then we say that $\mathcal{D}$ normalises to $\mathcal{D}'$. We say that $>$ has the strong normalisation property if $>$ is well-founded, i.e. there are no infinite reduction sequences, and the weak normalisation property if every derivation normalises.

Popularly speaking strong normalisation tells you that no matter how you choose your conversions, you will ultimately find a normal form; weak normalisation tells you that if you choose your conversions in a particular way, you will find a normal form.

Before getting down to the normalisation proofs, we remark that the $\bot$-rule can be restricted to instances where the conclusion is atomic. This is achieved by lowering the rank of the conclusion step by step.

Example.

$$\dfrac{\begin{matrix}\mathcal{D} \\ \bot\end{matrix}}{\varphi \wedge \psi} \quad \text{is replaced by} \quad \dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \bot\end{matrix}}{\varphi} \quad \dfrac{\begin{matrix}\mathcal{D} \\ \bot\end{matrix}}{\psi}}{\varphi \wedge \psi}\,\wedge I \qquad\qquad \dfrac{\begin{matrix}\mathcal{D} \\ \bot\end{matrix}}{\varphi \to \psi} \quad \text{is replaced by} \quad \dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \bot\end{matrix}}{\psi}}{\varphi \to \psi}\,{\to}I \qquad \text{etc.}$$

(Note that in the right hand derivation some hypothesis may be cancelled, this is, however, not necessary; if we want to get a derivation from the same hypotheses, then it is wiser not to cancel the $\varphi$ at that particular $\to I$.)

A similar fact holds for RAA: it suffices to apply RAA to atomic instances. The proof is again a matter of reducing the complexity of the relevant formula. [Derivation figures: applications of RAA replaced by applications of RAA to formulas of lower rank]

Some definitions are in order now:

Definition 6.2.3. (i) A maximal cut formula is a cut formula with maximal rank.
(ii) $d = \max\{r(\varphi) \mid \varphi \text{ cut formula in } \mathcal{D}\}$ (observe that $\max \emptyset = 0$), $n$ = number of maximal cut formulas and $cr(\mathcal{D}) = (d, n)$, the cut rank of $\mathcal{D}$.

If $\mathcal{D}$ has no cuts, put $cr(\mathcal{D}) = (0, 0)$. We will systematically lower the cut rank of a derivation until all cuts have been eliminated. The ordering on cut ranks is lexicographic:

$(d, n) < (d', n') := d < d' \vee (d = d' \wedge n < n')$.

Fact 6.2.4. $<$ is a well-ordering (actually $\omega \cdot \omega$) and hence has no infinite descending sequences.

Observe that in the $\wedge, \to, \bot, \forall$-language the reductions are fairly simple, i.e. parts of derivations are replaced by proper parts (forgetting for a moment about the terms): things get smaller!

Lemma 6.2.6. If $cr(\mathcal{D}) > (0, 0)$, then there is a $\mathcal{D}'$ with $\mathcal{D} >_1 \mathcal{D}'$ and $cr(\mathcal{D}') < cr(\mathcal{D})$.

Proof. Select a maximal cut formula in $\mathcal{D}$ such that all cuts above it have lower rank.
Apply the appropriate reduction to this maximal cut, then the part of the derivation $\mathcal{D}$ ending in the conclusion $\sigma$ of the cut is replaced, by Lemma 6.2.5, by a (sub-)derivation in which all cut formulas have lower rank. If the maximal cut formula was the only one, then $d$ is lowered by 1, otherwise $n$ is lowered by 1 and $d$ remains unchanged. In both cases $cr(\mathcal{D})$ gets smaller. Note that in the first case $n$ may become much larger, but that does not matter in the lexicographic order. □

Observe that the elimination of a cut (here!) is a local affair, i.e. it only affects the part of the derivation tree above the conclusion of the cut.

Lemma 6.2.5. Let $\mathcal{D}$ be a derivation with a cut at the bottom, let this cut have rank $n$ while all other cuts have rank $< n$, then the conversion of $\mathcal{D}$ at this lowest cut yields a derivation with only cuts of rank $< n$.

Proof. Consider all the possible cuts at the bottom and check the ranks of the cuts after the conversion.

(i) $\to$-cut.

$$\mathcal{D} = \dfrac{\dfrac{\begin{matrix}[\psi] \\ \mathcal{D}_1 \\ \varphi\end{matrix}}{\psi \to \varphi} \qquad \begin{matrix}\mathcal{D}_2 \\ \psi\end{matrix}}{\varphi} \qquad \text{Then} \qquad \mathcal{D}' = \begin{matrix}\mathcal{D}_2 \\ \psi \\ \mathcal{D}_1 \\ \varphi\end{matrix}$$

Observe that nothing happened in $\mathcal{D}_1$ and $\mathcal{D}_2$, so all the cuts in $\mathcal{D}'$ have rank $< n$.

(ii) $\forall$-cut.

$$\mathcal{D} = \dfrac{\dfrac{\begin{matrix}\mathcal{D}_1 \\ \varphi\end{matrix}}{\forall x\,\varphi[x/y]}}{\varphi[t/y]} \qquad \text{Then} \qquad \mathcal{D}' = \begin{matrix}\mathcal{D}_1[t/y] \\ \varphi[t/y]\end{matrix}$$

The substitution of a term does not affect the cut-rank of a derivation, so in $\mathcal{D}'$ all cuts have rank $< n$.

(iii) $\wedge$-cut. Similar. □

Theorem 6.2.7. All derivations normalise.

Proof. By Lemma 6.2.6 the cut rank can be lowered to $(0, 0)$ in a finite number of steps, hence the last derivation in the reduction sequence has no more cuts. □

Normal derivations have a number of convenient properties, which can be read off from their structure. In order to formulate these properties and the structure, we introduce some more terminology.

Definition 6.2.8. (i) A path in a derivation is a sequence of formulas $\varphi_0, \ldots, \varphi_n$, such that $\varphi_0$ is a hypothesis, $\varphi_n$ is the conclusion and $\varphi_i$ is a premise immediately above $\varphi_{i+1}$ $(0 \le i \le n - 1)$.
(ii) A track is an initial part of a path which stops at the first minor premise or at the conclusion. In other words, a track can only pass through the major premises of elimination rules.

Example. [derivation figure] The underlying tree is provided with number labels: [tree figure] and the tracks are $(6, 4, 3, 2, 1)$, $(9, 7)$ and $(8, 5)$.

Fact 6.2.9. In a normal derivation no introduction rule (application) can precede an elimination rule (application) in a track.

Proof. Suppose an introduction rule precedes an elimination rule in a track, then there is a last introduction rule that precedes the first elimination rule. Because the derivation is normal, one cannot immediately precede the other. So there has to be a rule in between, which must be the $\bot$-rule or the RAA, but that clearly is impossible, since $\bot$ cannot be the conclusion of an introduction rule. □

Fact 6.2.10. A track in a normal derivation is divided into (at most) three parts: an elimination part, followed by a $\bot$-part, followed by an introduction part. Each of the parts may be empty.

Proof. By Fact 6.2.9 we know that if the first rule is an elimination, then all eliminations come first. Look at the last elimination, it results in the conclusion of $\mathcal{D}$, or it results in $\bot$, in which case the $\bot$-rule or RAA may be applied, or it is followed by an introduction. In the last case only introductions can follow. If we applied the $\bot$- or RAA-rule, then an atom appears, which can only be the premise of an introduction rule (or the conclusion of $\mathcal{D}$). □

Fact 6.2.11. Let $\mathcal{D}$ be a normal derivation. Then $\mathcal{D}$ has at least one maximal track, ending in the conclusion.

The underlying tree of a normal derivation looks like [tree figure]. The picture suggests that the tracks are classified as to "how far" they are from the maximal track. We formalise this in the notion of order.

Definition 6.2.12. Let $\mathcal{D}$ be a normal derivation.
$o(t_m) = 0$ for a maximal track $t_m$,
$o(t) = o(t') + 1$ if the end formula of track $t$ is a minor premise belonging to a major premise in $t'$.

The orders of the various tracks are indicated in the picture.

Fact 6.2.13 (Subformula Property). In a normal derivation $\mathcal{D}$ of $\Gamma \vdash \varphi$, each formula which is not the hypothesis of a RAA-application is a subformula of $\varphi$ or of a hypothesis in $\Gamma$.

Proof. Consider a formula $\psi$ in $\mathcal{D}$, if it occurs in the elimination part of its track $t$, then it evidently is a subformula of the hypothesis at the top of $t$. If not, then it is a subformula of the end formula $\psi_1$ of $t$. Hence $\psi_1$ is a subformula of a formula $\psi_2$ of a track $t_1$ with $o(t_1) < o(t)$. Repeating the argument we find that $\psi$ is a subformula of a hypothesis or of the conclusion. □

So far we considered all hypotheses, but we can do better. If $\varphi$ is a subformula of a cancelled hypothesis, it must be a subformula of the resulting implicational formula in case of an $\to I$ application, or of the resulting formula in case of an RAA-application, or (and this is the only exception) it is the cancelled hypothesis of the RAA-application.

One can draw some immediate corollaries from our results so far.

Corollary 6.2.14. Predicate logic is consistent.

Proof. Suppose $\vdash \bot$, then there is a normal derivation ending in $\bot$ with all hypotheses cancelled. There is a track through the conclusion; in this track there are no introduction rules, so the top (hypothesis) is not cancelled. Contradiction. □

Note that 6.2.14 does not come as a surprise, we already knew that predicate logic is consistent on the basis of the Soundness Theorem. The nice point of the above proof is that it uses only syntactical arguments.

Corollary 6.2.15. Predicate logic is conservative over propositional logic.

Proof. Let $\mathcal{D}$ be a normal derivation of $\Gamma \vdash \varphi$, where $\Gamma$ and $\varphi$ contain no quantifiers, then by the subformula property $\mathcal{D}$ contains only quantifier-free formulas, hence $\mathcal{D}$ is a derivation in propositional logic. □

6.3 Normalisation for Intuitionistic Logic

When we consider the full language, including $\vee$ and $\exists$, some of the notions introduced above have to be reconsidered. We briefly mention them:

- in the $\exists E$

$$\dfrac{\exists x\,\varphi(x) \qquad \begin{matrix}[\varphi(u)] \\ \mathcal{D} \\ \sigma\end{matrix}}{\sigma}\,\exists E$$

  $u$ is called the proper variable.
- the lemmas on bound variables, proper variables and free variables remain correct.
- cuts and cut formulas are more complicated, they will be dealt with below.

As before we assume that our derivations satisfy the conditions on free and bound variables and on proper variables.

Intuitionistic logic adds certain complications to the technique developed above. We can still define all conversions:

$\vee$-conversion

$$\dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \varphi_i\end{matrix}}{\varphi_1 \vee \varphi_2}\,\vee I \qquad \begin{matrix}[\varphi_1] \\ \mathcal{D}_1 \\ \sigma\end{matrix} \qquad \begin{matrix}[\varphi_2] \\ \mathcal{D}_2 \\ \sigma\end{matrix}}{\sigma}\,\vee E \qquad \text{converts to} \qquad \begin{matrix}\mathcal{D} \\ \varphi_i \\ \mathcal{D}_i \\ \sigma\end{matrix}$$

$\exists$-conversion

$$\dfrac{\dfrac{\begin{matrix}\mathcal{D} \\ \varphi(t)\end{matrix}}{\exists x\,\varphi(x)}\,\exists I \qquad \begin{matrix}[\varphi(y)] \\ \mathcal{D}_1 \\ \sigma\end{matrix}}{\sigma}\,\exists E \qquad \text{converts to} \qquad \begin{matrix}\mathcal{D} \\ \varphi(t) \\ \mathcal{D}_1[t/y] \\ \sigma\end{matrix}$$

Lemma 6.3.1. For any derivation $\mathcal{D}'$ of $\sigma$ from $\varphi(y)$, with $y$ not free in $\sigma$ and $t$ free for $y$ in $\varphi(y)$, $\mathcal{D}'[t/y]$, with hypothesis $\varphi(t)$ and conclusion $\sigma$, is also a derivation.

Proof. Induction on $\mathcal{D}'$. □

It becomes somewhat harder to define tracks; recall that tracks were introduced in order to formalise something like "essential successor". In

$$\dfrac{\varphi \to \psi \qquad \varphi}{\psi}$$

we did not consider $\psi$ to be an "essential successor" of $\varphi$ (the minor premise) since $\psi$ has nothing to do with $\varphi$. In $\vee E$ and $\exists E$ the cancelled hypotheses have something to do with the major premise, so we deviate from the geometric idea of going down in the tree and we make a track that ends in $\varphi \vee \psi$ continue both through (the cancelled) $\varphi$ and $\psi$, similarly a track that gets to $\exists x\varphi(x)$ continues through (the cancelled) $\varphi(y)$. The old clauses are still observed, except that tracks are not allowed to start at hypotheses cancelled by $\vee E$ or $\exists E$. Moreover, a track (naturally) ends in a major premise of $\vee E$ or $\exists E$ if no hypotheses are cancelled in these rule applications.

Example. [derivation figure] In tree form: [tree figure]
Normalisation 6 .3 Normalisation for Intuitionistic Logic 203 Example. The derivation contains the following tracks: ( 2 , 4 , 9 , 7 , 5 , 3 ,I ), ( 2,4,10,8,6,3,1). T here are still more problems to be faced in the intuitionistic case: (i) There may be superfluous applications of V E a nd 3E in the sense that "nothing is cancelled. In each track there is an A-introduction and two steps later an Aelimination, but we are not in a position to apply a reduction. We would still not be willing to accept this derivation a s 'normal', if only because nothing is left of the subformula property: cp A cp is neither a subformula of its predecessor in the track, nor of its predecessor. The problem is caused by the repetitions that may occur because of V Ea nd 3 E , e.g. one may get a string of occurrences of the same formula: v v1 1.e. in 3x(p(x) ff no hypotheses cp(y) a re cancelled in Dl. We add extra conversions to get rid of those elimination rule applications: 2) a ff v 2 cpV11, converts t o D i ff 0 if cp a nd $ a re not cancelled in resp. 'Dl, D2. Clearly the formulas that would have t o interact in a reduction may be too far apart. The solution is to change the order of the rule applications, we call this a permutation conversion. Our example is converted by 'pulling' the A E upwards: 3 ~ 4 ~0 ) ff converts t o ff if p (y) is not cancelled in Dl. (ii) An introduction may be followed by an elimination in a track without giving rise t o a conversion. Now we can apply the A-conversion: 204 6. Normalisation 6.3 Normalisation for Intuitionistic Logic 205 In view of the extra complications we have to extend our notion of c ut. We just have to consider the extra connectives: Definition 6 .3.2. A s tring of occurrences of a formula g in a track which starts with the result of an introduction and ends with an elimination is called a c ut segment. A maximal cut segment is one with a cut formula of maximal rank. 
We have seen that the elimination at the bottom of the cut segment can be permuted upwards.

Example. [Figure: a derivation with major premises $\exists x\,\varphi_1(x)$ and $\exists y\,\varphi_2(y)$ and cut formula $\psi \to \sigma$, converted in two permutation steps.] Now we can eliminate the cut formula $\psi \to \sigma$.

So a cut segment may be eliminated by applying a series of permutation conversions followed by a "connective-conversion". As in the smaller language, we can restrict our attention to applications of the $\bot$-rule for atomic instances. We just have to consider the extra connectives:

$$\dfrac{\bot}{\varphi \lor \psi} \quad\text{can be replaced by}\quad \dfrac{\dfrac{\bot}{\varphi}}{\varphi \lor \psi}$$

We will show that in intuitionistic logic derivations can be normalised. Define the cut rank as before, but now for cut segments:

Definition 6.3.3. (i) The rank of a cut segment is the rank of its formula.
(ii) $d = \max\{r(\varphi) \mid \varphi \text{ cut formula in } \mathcal{D}\}$, $n$ = number of maximal cut segments, $cr(\mathcal{D}) = (d, n)$ with the same lexicographical ordering.

Lemma 6.3.4. If $\mathcal{D}$ is a derivation ending with a cut segment of maximal rank such that all cut segments distinct from this segment have a smaller rank, then a number of permutation conversions and a conversion reduce $\mathcal{D}$ to a derivation with smaller cut rank.

Proof. (i) Carry out the permutation conversions on the maximal segment, so that an elimination immediately follows an introduction. E.g. [Figure: a segment of occurrences of $\varphi \land \psi$, introduced by $\land I$ from $\varphi$ and $\psi$ and ending in an $\land E$ with conclusion $\varphi$, is converted so that the $\land E$ immediately follows the $\land I$.] Observe that the cut rank is not raised. We now apply the "connective" conversion to the remaining cut. The result is a derivation with a lower $d$. $\square$

Lemma 6.3.5. If $cr(\mathcal{D}) > (0, 0)$, then there is a $\mathcal{D}'$ such that $\mathcal{D} > \mathcal{D}'$ and $cr(\mathcal{D}') < cr(\mathcal{D})$.

Proof. Let $s$ be a maximal segment such that in the subderivation $\hat{\mathcal{D}}$ ending with $s$ no other maximal segments occur. Apply the reduction steps indicated in Lemma 6.3.4; then $\mathcal{D}$ is replaced by $\mathcal{D}'$, and either $d$ is not lowered but $n$ is lowered, or $d$ is lowered. In both cases $cr(\mathcal{D}') < cr(\mathcal{D})$. $\square$
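The measure $cr(\mathcal{D}) = (d, n)$ and its lexicographical ordering can be sketched in a few lines. The encoding of formulas as nested tuples and the function names are our own assumptions, and the rank function is assumed to be the usual one (atoms and $\bot$ have rank 0; a connective or quantifier adds 1 to the maximum rank of its parts). Python compares tuples lexicographically, which matches the ordering used in Definition 6.3.3.

```python
def rank(f):
    """Rank of a formula given as a nested tuple, e.g. ("and", p, q)."""
    tag = f[0]
    if tag in ("atom", "bot"):
        return 0
    if tag in ("and", "or", "imp"):
        return max(rank(f[1]), rank(f[2])) + 1
    if tag in ("forall", "exists"):   # ("forall", "x", body)
        return rank(f[2]) + 1
    raise ValueError("unknown formula: %r" % (f,))


def cut_rank(cut_formulas):
    """cr(D) = (d, n), computed from the formulas of the cut segments of D."""
    if not cut_formulas:
        return (0, 0)                 # a normal derivation has no cut segments
    ranks = [rank(f) for f in cut_formulas]
    d = max(ranks)
    return (d, ranks.count(d))        # n counts the segments of maximal rank


# Hypothetical formulas: p ∧ q has rank 1, (p ∧ q) → r has rank 2.
p, q, r = ("atom", "p"), ("atom", "q"), ("atom", "r")
pq = ("and", p, q)
imp = ("imp", pq, r)

before = cut_rank([imp, imp, pq])     # two maximal segments
step1 = cut_rank([imp, pq, pq, pq])   # one maximal segment removed, new smaller cuts
step2 = cut_rank([pq, pq, pq])        # last maximal segment removed
decreasing = step2 < step1 < before
```

The example mirrors the two cases in the proof of Lemma 6.3.5: in the first step $d$ stays the same and $n$ drops, in the second $d$ drops, even though the number of cut segments of lower rank has grown.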
Theorem 6.3.6. Each intuitionistic derivation normalises.

Proof. Apply Lemma 6.3.5. $\square$

Observe that this time the derivation may grow in size during the reductions, e.g. when a derivation is reduced by a permutation conversion. In general, parts of derivations may be duplicated.

The structure theorem for normal derivations holds for intuitionistic logic as well; note that we have to use the extended notion of track and that segments may occur.

Fact 6.3.7. (i) In a normal derivation, no application of an introduction rule can precede an application of an elimination rule.
(ii) A track in a normal derivation is divided into (at most) three parts: an elimination part, followed by a $\bot$ part, followed by an introduction part. These parts consist of segments, the last formulas of which are respectively the major premise of an elimination rule, of the falsum rule, or of an introduction rule (or the conclusion).
(iii) In a normal derivation the conclusion is in at least one maximal track.

Theorem 6.3.9 (Subformula Property). In a normal derivation of $\Gamma \vdash \varphi$, each formula is a subformula of a hypothesis in $\Gamma$, or of $\varphi$.

Proof. Left to the reader. $\square$

Definition 6.3.10. The relation "$\varphi$ is a strictly positive subformula occurrence of $\psi$" is inductively defined by:
(1) $\varphi$ is a strictly positive subformula occurrence of $\varphi$,
(2) $\psi$ is a strictly positive subformula occurrence of $\varphi \land \psi$, $\psi \land \varphi$, $\varphi \lor \psi$, $\psi \lor \varphi$, $\varphi \to \psi$,
(3) $\psi$ is a strictly positive subformula occurrence of $\forall x\,\psi$, $\exists x\,\psi$.

Note that here we consider occurrences; as a rule this will be tacitly understood. We will also say, for short, that $\varphi$ is strictly positive in $\psi$, or that $\varphi$ occurs strictly positive in $\psi$. The extension to connectives and terms is obvious, e.g. "$\lor$ is strictly positive in $\psi$".

Lemma 6.3.11. (i) The immediate successor of the major premise of an elimination rule is strictly positive in this premise (for $\to E$, $\land E$, $\forall E$ this actually is the conclusion).
(ii) A strictly positive part of a strictly positive part of $\varphi$ is a strictly positive part of $\varphi$.

Proof. Immediate. $\square$

We now give some applications of the Normal Form Theorem.

Theorem 6.3.12. Let $\Gamma \vdash \varphi \lor \psi$, where $\Gamma$ does not contain $\lor$ in strictly positive subformulas; then $\Gamma \vdash \varphi$ or $\Gamma \vdash \psi$.

Proof. Consider a normal derivation $\mathcal{D}$ of $\varphi \lor \psi$ and a maximal track $t$. If the first occurrence $\varphi \lor \psi$ of its segment belongs to the elimination part of $t$, then $\varphi \lor \psi$ is a strictly positive part of the hypothesis in $t$, which has not been cancelled. Contradiction. Hence $\varphi \lor \psi$ belongs to the introduction part of $t$, and thus $\mathcal{D}$ contains a subderivation of $\varphi$ or of $\psi$. $\mathcal{D}$ looks like [Figure: a subderivation of $\varphi$, followed by $\lor I$ and then $k$ elimination steps with conclusion $\varphi \lor \psi$.] The last $k$ steps are $\exists E$ or $\lor E$. If any of them were an $\lor$-elimination, then the disjunction would be in the elimination part of a track, and hence a $\lor$ would occur strictly positive in some hypothesis of $\Gamma$. Contradiction. Hence all the eliminations are $\exists E$. Replace the derivation now by: [Figure: the same derivation without the $\lor I$, so that the $k$ applications of $\exists E$ have conclusion $\varphi$.] In this derivation exactly the same hypotheses have been cancelled, so $\Gamma \vdash \varphi$. $\square$

Consider a language without function symbols (i.e. all terms are variables or constants).

Theorem 6.3.13. If $\Gamma \vdash \exists x\,\varphi(x)$, where $\Gamma$ does not contain an existential formula as a strictly positive part, then $\Gamma \vdash \varphi(t_1) \lor \ldots \lor \varphi(t_n)$, where the terms $t_1, \ldots, t_n$ occur in the hypotheses or in the conclusion.

Proof. Consider an end segment of a normal derivation $\mathcal{D}$ of $\exists x\,\varphi(x)$ from $\Gamma$. End segments run through minor premises of $\lor E$ and $\exists E$. In this case an end segment cannot result from $\exists E$, since then some $\exists u\,\varphi(u)$ would occur strictly positive in $\Gamma$. Hence the segment runs through minor premises of $\lor E$'s. $\exists x\,\varphi(x)$ at the beginning of an end segment results from an introduction (else it would occur strictly positive in $\Gamma$), say from $\varphi(t_i)$. It could also result from a $\bot$ rule, but then we could conclude a suitable instance of $\varphi(x)$. We now replace the parts of $\mathcal{D}$ yielding the tops of the end segments by parts yielding disjunctions:

$$\dfrac{\alpha_1 \lor \beta_1 \qquad \dfrac{\begin{matrix}[\alpha_1]\\ \mathcal{D}_1\\ \varphi(t_1)\end{matrix}}{\varphi(t_1) \lor \varphi(t_2)} \qquad \dfrac{\begin{matrix}[\beta_1]\\ \mathcal{D}_2\\ \varphi(t_2)\end{matrix}}{\varphi(t_1) \lor \varphi(t_2)}}{\varphi(t_1) \lor \varphi(t_2)}$$

So $\Gamma \vdash \bigvee \varphi(t_i)$. Since the derivation was normal, the various $t_i$'s are subterms of $\Gamma$ or $\exists x\,\varphi(x)$. $\square$

Corollary 6.3.14. If in addition $\lor$ does not occur strictly positive in $\Gamma$, then $\Gamma \vdash \varphi(t)$ for a suitable $t$.

Corollary 6.3.15. If the language does not contain constants, then we get $\Gamma \vdash \forall x\,\varphi(x)$.

We have obtained here constructive proofs of the Disjunction and Existence Properties, which had already been proved by classical means in Ch. 5.

Exercises

1. Show that there is no formula $\varphi$ with atoms $p$ and $q$ without $\lor$ so that $\vdash \varphi \leftrightarrow p \lor q$ (hence $\lor$ is not definable from the remaining connectives).
2. If $\varphi$ does not contain $\to$ then $\nvdash_i \varphi$. Use this to show that $p \to q$ is not definable by the remaining connectives.
3. Let $\land$ not occur in $\varphi$; then $\varphi \vdash p$ and $\varphi \vdash q$ (where $p$ and $q$ are distinct atoms) $\Rightarrow \varphi \vdash \bot$.
4. Eliminate the cut segment $(\sigma \lor \tau)$ from [Figure: a derivation containing the cut segment.]
5. Show that a prenex formula $(Q_1 x_1) \ldots (Q_n x_n)\varphi$ is derivable if and only if a suitable quantifier-free formula, obtained from $\varphi$, is derivable.

Additional Remarks: Strong Normalization and Church-Rosser

As we already mentioned, there is a stronger result for natural deduction: every reduction sequence terminates (i.e. $<_1$ is well-founded). For proofs see Girard 1987 and Girard et al. 1989. Indeed, one can also show for $>$ the so-called Church-Rosser property (or confluence property): if $\mathcal{D} \geq \mathcal{D}_1$, $\mathcal{D} \geq \mathcal{D}_2$ then there is a $\mathcal{D}_3$ such that $\mathcal{D}_1 \geq \mathcal{D}_3$ and $\mathcal{D}_2 \geq \mathcal{D}_3$. As a consequence each $\mathcal{D}$ has a unique normal form. One easily shows, however, that a given $\varphi$ may have more than one normal derivation.

Bibliography

The following books are recommended for further reading:

J. Barwise (ed.). Handbook of Mathematical Logic. North-Holland Publ. Co., Amsterdam 1977.
E. Börger. Computability, Complexity, Logic. North-Holland Publ. Co., Amsterdam 1989.
C.C. Chang, H.J. Keisler. Model Theory. North-Holland Publ. Co., Amsterdam 1990.
D. van Dalen. Intuitionistic Logic. In: Gabbay, D. and F. Guenthner (eds.) Handbook of Philosophical Logic. III. Reidel Publ. Co., Dordrecht 1986, 225-340.
M. Davis. Computability and Unsolvability. McGraw Hill, New York 1958.
J.Y. Girard. Proof Theory and Logical Complexity. I. Bibliopolis, Napoli 1987.
J.Y. Girard, Y. Lafont, P. Taylor. Proofs and Types. Cambridge University Press, Cambridge 1989.
S.C. Kleene. Introduction to meta-mathematics. North-Holland Publ. Co., Amsterdam 1952.
J.R. Shoenfield. Mathematical Logic. Addison and Wesley, Reading, Mass. 1967.
J.R. Shoenfield. Recursion Theory. Lecture Notes in Logic 1. Springer, Berlin 1993.
C. Smoryński. Logical Number Theory I. Springer, Berlin 1991.
K.D. Stroyan, W.A.J. Luxemburg. Introduction to the theory of infinitesimals. Academic Press, New York 1976.
A.S. Troelstra, D. van Dalen. Constructivism in Mathematics I, II. North-Holland Publ. Co., Amsterdam 1988.

Index

$BV$, 62
$FV$, 62
$I_1, \ldots, I_4$, 79
$L(\mathfrak{A})$, 66
$\mathrm{Mod}$, 111
$RI_1, \ldots$
, $RI_4$, 98
$SENT$, 62
$TERM_c$, 62
$\mathrm{Th}$, 111
$\kappa$-categorical, 125
$\top$, 38
$\mathrm{st}(a)$, 123
0-ary, 58
abelian group, 83
absurdum, 7
algebraically closed fields, 115, 125
apartness relation, 177
arbitrary object, 90
associativity, 20
atom, 59
axiom of extensionality, 153
axiom schema, 80
axiomatisability, 114
axioms, 104
BHK-interpretation, 156
bi-implication, 7
binary relation, 57
bisimulation, 180
Boolean Algebra, 20
bound variable, 62
Brouwer, 156
Brouwer-Heyting-Kolmogorov-interpretation, 156
canonical model, 108
Cantor space, 28
Cauchy sequence, 183
change of bound variables, 74
characteristic function, 140
Church-Rosser property, 210
closed formula, 62
commutativity, 20
compactness theorem, 47, 111
complete, 45, 47, 54
complete theory, 125
completeness theorem, 45, 53, 103, 149, 173
comprehension schema, 147
conclusion, 29, 35, 190
conjunction, 7, 14
conjunctive normal form, 25
connective, 7
conservative, 53, 179, 180, 200
conservative extension, 104
consistency, 41
consistent, 41, 200
constants, 56
contraposition, 27
converge, 124
conversion, 191, 193, 200
Craig, 48
cut formula, 190
cut rank, 196, 205
cut segment, 204
De Morgan's laws, 20, 71
decidability, 45, 126, 185
decidable, 185, 188
decidable theory, 109
Dedekind cut, 154
definable Skolem functions, 188
definable subsets, 122
definition by recursion, 61
dense order, 129
densely ordered, 83
densely ordered sets, 125
derivability, 35, 39
derivation, 31, 34
diagram, 120
directed graph, 87
disjunction, 7, 15
disjunction property, 175, 207, 209
disjunctive normal form, 25
distributivity, 20
divisible torsion-free abelian groups, 125
division ring, 85
DNS, 168
double negation law, 21
double negation shift, 168
double negation translation, 183
downward Skolem-Löwenheim theorem, 112, 124
dual plane, 89
dual Skolem form, 143
duality mapping, 26
duality principle, 84, 101
edge, 86
elementarily embeddable, 120
elementarily equivalent, 119
elementary extension, 119
elementary logic, 56
elementary substructure, 119
elimination rule, 29
equality relation, 57, 173
equivalence, 7, 16
existence property, 175, 208, 209
existential quantifier, 58
existential sentence, 132
expansion, 111
explicit definition, 139
extended language, 65
extension, 104
extension by definition, 140
falsum, 7, 17
fan, 128
field, 85
filtration, 185
finite axiomatisability, 114
finite, 123
finite models, 113
first-order logic, 56
forcing, 166
FORM, 59, 108
formation sequence, 9
formula, 59, 146
free for, 64
full model, 146
functional completeness, 27
functionally complete, 24
functions, 56
Gödel translation, 162
Glivenko's theorem, 164
graph, 86
group, 83
Henkin extension, 104
Henkin theory, 104
Herbrand model, 108
Herbrand universe, 108
Herbrand's theorem, 143
Heyting, 156
homomorphism, 119
hypothesis, 30
idempotency, 20
identity, 79, 98, 173
identity axioms, 79
identity relation, 57
identity rules, 98
implication, 7, 16
inconsistent, 41
independence of premise principle, 168
independent, 46
induction axiom, 152
induction principle, 8
induction schema, 86
infinite, 123
infinitesimals, 123
interpolant, 48
Interpolation theorem, 48
interpretation, 68
introduction rule, 29
irreducible, 194
isomorphism, 119
König's lemma, 127
Kolmogorov, 156
Kripke model, 165
Kripke semantics, 164
Kronecker, 156
language of plane projective geometry, 83
language of a similarity type, 58
language of arithmetic, 85
language of graphs, 86
language of groups, 83
language of identity, 81
language of partial order, 82
language of rings with unity, 84
Lefschetz' principle, 126
Leibniz-identity, 151
Lindenbaum, 105
linear order, 177
linearly ordered set, 83
Łoś-Tarski, 135
major premise, 190
material implication, 6
maximal cut formula, 196
maximally consistent, 43, 105
meta-language, 8
meta-variable, 8
minimum, 82
minor premise, 190
model, 69
model complete, 131
model existence lemma, 103, 172
model of second-order logic, 149
modified Kripke model, 174
monadic predicate calculus, 89
multigraph, 87
natural deduction, 29, 90, 147
negation, 7, 15
negative formula, 164
negative subformula, 77
non-archimedean order, 121
non-standard models, 113
non-standard models of arithmetic, 121
non-standard real numbers, 123
normal derivation, 194
normal form, 194
normalises to, 194
occurrence, 12
open formula, 62
order of a track, 199
ordering sets, 116
overspill lemma, 122
parameters, 119, 193
partially ordered set, 82
path, 127, 197
Peano structures, 85
Peirce's law, 27
permutation conversion, 203
poset, 82
positive diagram, 120
positive subformula, 77
premise, 29
prenex (normal) form, 76
prenex formula, 188
preservation under substructures, 135
prime model, 132
prime theory, 170
principal model, 149
principle of induction, 35
principle of mathematical induction, 86
principle of the excluded third, 29
projective plane, 84
proof by contradiction, 30
proof-interpretation, 156
PROP, 7
proper variable, 192
proposition, 7
proposition symbol, 7
quantifier elimination, 129
quantifiers, 55
RAA, 30
rank, 12
recursion, 11
reduces to, 194
reduct, 111
reductio ad absurdum rule, 30
reduction sequence, 194
relations, 56
relativisation, 77
relativised quantifiers, 77
Rieger-Nishimura lattice, 188
rigid designators, 166
ring, 85
satisfaction relation, 68
satisfiable, 69, 143
scope, 59
second-order structure, 146
semantics, 66
Sheffer stroke, 23, 27
similarity type, 57, 58
simultaneous substitution, 63
size, 39
Skolem axiom, 137
Skolem constant, 137
Skolem expansion, 137
Skolem form, 141
Skolem function, 137
Skolem hull, 142
soundness, 39, 92, 169
soundness theorem, 169
standard model, 86
standard model of arithmetic, 121
standard numbers, 86
strictly positive subformula, 207
strong normalisation, 210
strong normalisation property, 194
structure, 56
subformula, 9
subformula property, 199, 207
submodel, 119
substitution, 18, 39, 62
substitution instance, 65
substitution operator, 62
substitution theorem, 18, 39, 74, 161
substructure, 119
Tarski, 133
tautology, 18
TERM, 59, 108
term, 59
term model, 108
theory, 47, 104
torsion-free abelian groups, 115
totally ordered set, 83
track, 197
tree, 11
truth, 39
truth table, 15
truth values, 14
turnstile, 35
two-valued logic, 5
type, 57
unary relation, 57
unique normal form, 210
universal closure, 68
universal quantifier, 58
universal sentence, 132
universe, 57
upward Skolem-Löwenheim theorem, 112, 124
valid, 146
valuation, 17
variable binding operations, 61
variables, 55
Vaught's theorem, 126
vertex, 86
verum, 17
weak normalisation property, 194
well-ordering, 117
Zorn's lemma, 43

This note was uploaded on 10/28/2010 for the course ENG 106 taught by Professor Steve during the Fall '10 term at Purdue.
