# Foundations of cryptography

Foundations of Cryptography (Fragments of a Book)

Oded Goldreich, Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel.

February 23, 1995

© 1995 O. Goldreich. All rights reserved.

## Preface

to Dana

Why fragments? Several years ago, Shafi Goldwasser and I decided to write together a book titled "Foundations of Cryptography". In a first burst of energy, I wrote most of the material appearing in these fragments, but since then very little progress has been made. The chances that we will complete our original plan within a year or two seem quite slim. In fact, we even fail to commit ourselves to a date on which we will resume work on this project.

What is in these fragments? These fragments contain a first draft of three major chapters and an introduction chapter. The three chapters are the chapters on computational difficulty (or one-way functions), pseudorandom generators and zero-knowledge. These chapters are quite complete, with the exception that the zero-knowledge chapter misses the planned section on non-interactive zero-knowledge. However, none of these chapters has been carefully proofread, and I expect them to be full of various mistakes ranging from spelling and grammatical mistakes to minor technical inaccuracies. I hope and believe that there are no fatal mistakes, but I cannot guarantee this either.

A major thing which is missing: An updated list of references is indeed missing. Instead I enclose an old annotated list of references (compiled mostly in February 1989).

Author's Note: Text appearing in italics within indented paragraphs, such as this one, is not part of the book, but rather part of the later comments added to its fragments...
Author's Note: The original preface should have started here:

Revolutionary developments which took place in the previous decade have transformed cryptography from a semi-scientific discipline to a respectable field in theoretical Computer Science. In particular, concepts such as computational indistinguishability, pseudorandomness and zero-knowledge interactive proofs were introduced, and classical notions such as secure encryption and unforgeable signatures were placed on sound grounds.

This book attempts to present the basic concepts, definitions and results in cryptography. The emphasis is placed on the clarification of fundamental concepts and their introduction in a way independent of the particularities of some popular number theoretic examples. These particular examples played a central role in the development of the field and still offer the most practical implementations of all cryptographic primitives, but this does not mean that the presentation has to be linked to them.

### Using this book

Author's Note: Giving a course based on the material which appears in these fragments is indeed possible, but kind of strange since the basic tasks of encrypting and signing are not covered.

Chapters, sections, subsections, and subsubsections denoted by an asterisk (*) were intended for advanced reading. Historical notes and suggestions for further reading are provided at the end of each chapter.

Author's Note: However, a corresponding list of references is not provided. Instead, the reader may try to trace the papers by using the enclosed annotated list of references (dating to 1989).

### Acknowledgements

.... very little do we have and inclose which we can call our own in the deep sense of the word. We all have to accept and learn, either from our predecessors or from our contemporaries. Even the greatest genius would not have achieved much if he had wished to extract everything from inside himself. But there are many good people, who do not understand this, and spend half their lives wondering in darkness with their dreams of originality. I have known artists who were proud of not having followed any teacher and of owing everything only to their own genius. Such fools!

[Goethe, Conversations with Eckermann, 17.2.1832]

First of all, I would like to thank three remarkable people who had a tremendous influence on my professional development. Shimon Even introduced me to theoretical computer science and closely guided my first steps. Silvio Micali and Shafi Goldwasser led my way in the evolving foundations of cryptography and shared with me their constant efforts of further developing these foundations.

I have collaborated with many researchers, yet I feel that my collaboration with Benny Chor and Avi Wigderson had a fundamental impact on my career and hence my development. I would like to thank them both for their indispensable contribution to our joint research, and for the excitement and pleasure I had when collaborating with them.

Leonid Levin does deserve special thanks as well. I had many interesting discussions with Lenia over the years, and sometimes it took me too long to realize how helpful these discussions were.

Clearly, continuing at this pace will waste too much of the publisher's money. Hence, I confine myself to listing some of the people who have contributed significantly to my understanding of the field. These include Laszlo Babai, Mihir Bellare, Michael Ben-Or, Manuel Blum, Ran Canetti (who is an expert in Wine and Opera), Cynthia Dwork, Uri Feige, Mike Fischer, Lance Fortnow, Johan Håstad (who is a special friend), Russell Impagliazzo, Joe Kilian, Hugo Krawczyk (who still suffers from having been my student), Mike Luby (and his goat), Moni Naor, Noam Nisan, Rafail Ostrovsky, Erez Petrank, Michael Rabin, Charlie Rackoff, Steven Rudich, Ron Rivest, Claus Schnorr, Mike Sipser, Adi Shamir, Andy Yao, and Moti Yung.
Author's Note: I've probably forgotten a few names and will get myself in deep trouble for it. Wouldn't it be simpler and safer just to acknowledge that such a task is infeasible?

In addition, I would like to acknowledge helpful exchange of ideas with Ishai Ben-Aroya, Richard Chang, Ivan Damgård, Amir Herzberg, Eyal Kushilevitz (& sons), Nati Linial, Yishay Mansour, Yair Oren, Phil Rogaway, Ronen Vainish, R. Venkatesan, Yacob Yacobi, and David Zuckerman.

Author's Note: Written in Tel-Aviv, mainly between June 1991 and November 1992.

## Contents

- 1 Introduction
  - 1.1 Cryptography - Main Topics
    - 1.1.1 Encryption Schemes
    - 1.1.2 Pseudorandom Generators
    - 1.1.3 Digital Signatures
    - 1.1.4 Fault-Tolerant Protocols and Zero-Knowledge Proofs
  - 1.2 Some Background from Probability Theory
    - 1.2.1 Notational Conventions
    - 1.2.2 Three Inequalities
  - 1.3 The Computational Model
    - 1.3.1 P, NP, and NP-completeness
    - 1.3.2 Probabilistic Polynomial-Time
    - 1.3.3 Non-Uniform Polynomial-Time
    - 1.3.4 Intractability Assumptions
    - 1.3.5 Oracle Machines
  - 1.4 Motivation to the Formal Treatment
    - 1.4.1 The Need to Formalize Intuition
    - 1.4.2 The Practical Consequences of the Formal Treatment
    - 1.4.3 The Tendency to be Conservative
- 2 Computational Difficulty
  - 2.1 One-Way Functions: Motivation
  - 2.2 One-Way Functions: Definitions
    - 2.2.1 Strong One-Way Functions
    - 2.2.2 Weak One-Way Functions
    - 2.2.3 Two Useful Length Conventions
    - 2.2.4 Candidates for One-Way Functions
    - 2.2.5 Non-Uniformly One-Way Functions
  - 2.3 Weak One-Way Functions Imply Strong Ones
  - 2.4 One-Way Functions: Variations
    - 2.4.1 * Universal One-Way Function
    - 2.4.2 One-Way Functions as Collections
    - 2.4.3 Examples of One-way Collections (RSA, Factoring, DLP)
    - 2.4.4 Trapdoor one-way permutations
    - 2.4.5 * Clawfree Functions
    - 2.4.6 On Proposing Candidates
  - 2.5 Hard-Core Predicates
    - 2.5.1 Definition
    - 2.5.2 Hard-Core Predicates for any One-Way Function
    - 2.5.3 * Hard-Core Functions
  - 2.6 * Efficient Amplification of One-way Functions
  - 2.7 Miscellaneous
    - 2.7.1 Historical Notes
    - 2.7.2 Suggestion for Further Reading
    - 2.7.3 Open Problems
    - 2.7.4 Exercises
- 3 Pseudorandom Generators
  - 3.1 Motivating Discussion
    - 3.1.1 Computational Approaches to Randomness
    - 3.1.2 A Rigorous Approach to Pseudorandom Generators
  - 3.2 Computational Indistinguishability
    - 3.2.1 Definition
    - 3.2.2 Relation to Statistical Closeness
    - 3.2.3 Indistinguishability by Repeated Experiments
    - 3.2.4 Pseudorandom Ensembles
  - 3.3 Definitions of Pseudorandom Generators
    - 3.3.1 * A General Definition of Pseudorandom Generators
    - 3.3.2 Standard Definition of Pseudorandom Generators
    - 3.3.3 Increasing the Expansion Factor of Pseudorandom Generators
    - 3.3.4 The Significance of Pseudorandom Generators
    - 3.3.5 A Necessary Condition for the Existence of Pseudorandom Generators
  - 3.4 Constructions based on One-Way Permutations
    - 3.4.1 Construction based on a Single Permutation
    - 3.4.2 Construction based on Collections of Permutations
    - 3.4.3 Practical Constructions
  - 3.5 * Construction based on One-Way Functions
    - 3.5.1 Using 1-1 One-Way Functions
    - 3.5.2 Using Regular One-Way Functions
    - 3.5.3 Going beyond Regular One-Way Functions
  - 3.6 Pseudorandom Functions
    - 3.6.1 Definitions
    - 3.6.2 Construction
  - 3.7 * Pseudorandom Permutations
    - 3.7.1 Definitions
    - 3.7.2 Construction
  - 3.8 Miscellaneous
    - 3.8.1 Historical Notes
    - 3.8.2 Suggestion for Further Reading
    - 3.8.3 Open Problems
    - 3.8.4 Exercises
- 4 Encryption Schemes
- 5 Digital Signatures and Message Authentication
- 6 Zero-Knowledge Proof Systems
  - 6.1 Zero-Knowledge Proofs: Motivation
    - 6.1.1 The Notion of a Proof
    - 6.1.2 Gaining Knowledge
  - 6.2 Interactive Proof Systems
    - 6.2.1 Definition
    - 6.2.2 An Example (Graph Non-Isomorphism in IP)
    - 6.2.3 Augmentation to the Model
  - 6.3 Zero-Knowledge Proofs: Definitions
    - 6.3.1 Perfect and Computational Zero-Knowledge
    - 6.3.2 An Example (Graph Isomorphism in PZK)
    - 6.3.3 Zero-Knowledge w.r.t. Auxiliary Inputs
    - 6.3.4 Sequential Composition of Zero-Knowledge Proofs
  - 6.4 Zero-Knowledge Proofs for NP
    - 6.4.1 Commitment Schemes
    - 6.4.2 Zero-Knowledge proof of Graph Coloring
    - 6.4.3 The General Result and Some Applications
    - 6.4.4 Efficiency Considerations
  - 6.5 * Negative Results
    - 6.5.1 Implausibility of an Unconditional "NP in ZK" Result
    - 6.5.2 Implausibility of Perfect Zero-Knowledge proofs for all of NP
    - 6.5.3 Zero-Knowledge and Parallel Composition
  - 6.6 * Witness Indistinguishability and Hiding
    - 6.6.1 Definitions
    - 6.6.2 Parallel Composition
    - 6.6.3 Constructions
    - 6.6.4 Applications
  - 6.7 * Proofs of Knowledge
    - 6.7.1 Definition
    - 6.7.2 Observations
    - 6.7.3 Applications
    - 6.7.4 Proofs of Identity (Identification schemes)
  - 6.8 * Computationally-Sound Proofs (Arguments)
    - 6.8.1 Definition
    - 6.8.2 Perfect Commitment Schemes
    - 6.8.3 Perfect Zero-Knowledge Arguments for NP
    - 6.8.4 Zero-Knowledge Arguments of Polylogarithmic Efficiency
  - 6.9 * Constant Round Zero-Knowledge Proofs
    - 6.9.1 Using commitment schemes with perfect secrecy
    - 6.9.2 Bounding the power of cheating provers
  - 6.10 * Non-Interactive Zero-Knowledge Proofs
    - 6.10.1 Definition
    - 6.10.2 Construction
  - 6.11 * Multi-Prover Zero-Knowledge Proofs
    - 6.11.1 Definitions
    - 6.11.2 Two-Senders Commitment Schemes
    - 6.11.3 Perfect Zero-Knowledge for NP
    - 6.11.4 Applications
  - 6.12 Miscellaneous
    - 6.12.1 Historical Notes
    - 6.12.2 Suggestion for Further Reading
    - 6.12.3 Open Problems
    - 6.12.4 Exercises
- 7 Cryptographic Protocols
- 8 * New Frontiers
- 9 The Effect of Cryptography on Complexity Theory
- 10 * Related Topics
- A Annotated List of References (compiled Feb. 1989)
  - A.1 General
  - A.2 Hard Computational Problems
  - A.3 Encryption
  - A.4 Pseudorandomness
  - A.5 Signatures and Commitment Schemes
  - A.6 Interactive Proofs, Zero-Knowledge and Protocols
  - A.7 Additional Topics
  - A.8 Historical Background

## Chapter 1: Introduction

In this chapter we briefly discuss the goals of cryptography. In particular, we discuss the problems of secure encryption, digital signatures, and fault-tolerant protocols. These problems lead to the notions of pseudorandom generators and zero-knowledge proofs, which are discussed as well.

Our approach to cryptography is based on computational complexity. Hence, this introductory chapter also contains a section presenting the computational models used throughout the book. Likewise, the current chapter contains a section presenting some elementary background from probability theory, which is used extensively in the sequel.

### 1.1 Cryptography - Main Topics

Traditionally, cryptography has been associated with the problem of designing and analysing encryption schemes (i.e., schemes which provide secret communication over insecure communication media). However, nowadays, problems such as constructing unforgeable digital signatures and designing fault-tolerant protocols are also considered as falling in the domain of cryptography. Furthermore, it turns out that notions such as "pseudorandom generators" and "zero-knowledge proofs" are closely related to the above problems, and hence must be treated as well in a book on cryptography.
In this section we briefly discuss the above-mentioned terms.

#### 1.1.1 Encryption Schemes

The problem of providing secret communication over insecure media is the most basic problem of cryptography. The setting of this problem consists of two parties communicating through a channel which is possibly tapped by an adversary. The parties wish to exchange information with each other, but keep the "wiretapper" as ignorant as possible regarding the contents of this information. Loosely speaking, an encryption scheme is a protocol allowing these parties to communicate secretly with each other. Typically, the encryption scheme consists of a pair of algorithms. One algorithm, called encryption, is applied by the sender (i.e., the party sending a message), while the other algorithm, called decryption, is applied by the receiver. Hence, in order to send a message, the sender first applies the encryption algorithm to the message, and sends the result, called the ciphertext, over the channel. Upon receiving a ciphertext, the other party (i.e., the receiver) applies the decryption algorithm to it, and retrieves the original message (called the plaintext).

In order for the above scheme to provide secret communication, the communicating parties (at least the receiver) must know something which is not known to the wiretapper. (Otherwise, the wiretapper can decrypt the ciphertext exactly as done by the receiver.) This extra knowledge may take the form of the decryption algorithm itself, or some parameters and/or auxiliary inputs used by the decryption algorithm. We call this extra knowledge the decryption key. Note that, without loss of generality, we may assume that the decryption algorithm is known to the wiretapper and that the decryption algorithm needs two inputs: a ciphertext and a decryption key. We stress that the existence of a secret key, not known to the wiretapper, is merely a necessary condition for secret communication.
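The pair of algorithms described above can be illustrated with a small sketch. It assumes the simplest possible instantiation, XOR with a uniformly random key as long as the message (the one-time pad), in which the same key serves for encryption and decryption. The function names `keygen`, `encrypt` and `decrypt` are illustrative choices, not notation from the text.

```python
import secrets


def keygen(length: int) -> bytes:
    """Sample a uniformly random key of `length` bytes (illustrative helper)."""
    return secrets.token_bytes(length)


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encryption algorithm: XOR each plaintext byte with the matching key byte."""
    if len(key) != len(plaintext):
        raise ValueError("this sketch needs a key exactly as long as the message")
    return bytes(k ^ p for k, p in zip(key, plaintext))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decryption algorithm: XOR is its own inverse, so the same operation is reused."""
    return encrypt(key, ciphertext)


message = b"attack at dawn"
key = keygen(len(message))           # known to sender and receiver only
ciphertext = encrypt(key, message)   # what the wiretapper may observe
recovered = decrypt(key, ciphertext)
```

Both algorithms are public here; only the key is secret, which matches the remark that the decryption algorithm may be assumed known to the wiretapper.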
Evaluating the "security" of an encryption scheme is a very tricky business. A preliminary task is to understand what "security" is (i.e., to properly define what is meant by this intuitive term). Two approaches to defining security are known.

The first ("classic") approach is information theoretic. It is concerned with the "information" about the plaintext which is "present" in the ciphertext. Loosely speaking, if the ciphertext contains information about the plaintext then the encryption scheme is considered insecure. It has been shown that such a high (i.e., "perfect") level of security can be achieved only if the key in use is at least as long as the total length of the messages sent via the encryption scheme. The fact that the key has to be at least as long as the information exchanged using it is indeed a drastic limitation on the applicability of such encryption schemes. In particular, it is impractical to use such keys in case huge amounts of information need to be secretly communicated (as in computer networks).

The second ("modern") approach, followed in the current book, is based on computational complexity. This approach is based on the observation that it does not matter whether the ciphertext contains information about the plaintext, but rather whether this information can be efficiently extracted. In other words, instead of asking whether it is possible for the wiretapper to extract specific information, we ask whether it is feasible for the wiretapper to extract this information. It turns out that the new (i.e., "computational complexity") approach offers security even if the key is much shorter than the total length of the messages sent via the encryption scheme. For example, one may use "pseudorandom generators" (see below) which expand short keys into much longer "pseudo-keys", so that the latter are as secure as "real keys" of comparable length.

In addition, the computational complexity approach allows the introduction of concepts and primitives which cannot exist under the information theoretic approach. A typical example is the concept of public-key encryption schemes. Note that in the above discussion we concentrated on the decryption algorithm and its key. It can be shown that the encryption algorithm must get, in addition to the message, an auxiliary input which depends on the decryption key. This auxiliary input is called the encryption key. Traditional encryption schemes, and in particular all the encryption schemes used in the millennia until the 1980's, operate with an encryption key equal to the decryption key. Hence, the wiretapper in these schemes must be ignorant of the encryption key, and consequently the key distribution problem arises (i.e., how can two parties wishing to communicate over an insecure channel agree on a secret encryption/decryption key). (The traditional solution is to exchange the key through an alternative channel which is secure, though "more expensive to use", for example by a convoy.)

The computational complexity approach allows the introduction of encryption schemes in which the encryption key may be given to the wiretapper without compromising the security of the scheme. Clearly, the decryption key in such schemes is different from the encryption key, and furthermore it is infeasible to compute it from the encryption key. Such encryption schemes, called public-key schemes, have the advantage of trivially resolving the key distribution problem, since the encryption key can be publicized.

In the chapter devoted to encryption schemes, we discuss private-key and public-key encryption schemes. Much attention is placed on defining the security of encryption schemes. Finally, constructions of secure encryption schemes based on various intractability assumptions are presented. Some of the constructions presented are based on pseudorandom generators, which are discussed in a prior chapter.
Other constructions use specific one-way functions such as the RSA function and/or squaring modulo a composite number.

#### 1.1.2 Pseudorandom Generators

It turns out that pseudorandom generators play a central role in the construction of encryption schemes (and related schemes). In particular, pseudorandom generators are the key to the construction of private-key encryption schemes, and this observation is often used in practice (usually implicitly).

Although the term "pseudorandom generators" is commonly used in practice, both in the context of cryptography and in the much wider context of probabilistic procedures, it is important to realize that this term is seldom given a precise meaning. We believe that using a term without knowing what it means is dangerous in general, and in particular in a business as delicate as cryptography. Hence, a precise treatment of pseudorandom generators is central to cryptography.

Loosely speaking, a pseudorandom generator is a deterministic algorithm expanding short random seeds into much longer bit sequences which appear to be "random" (although they are not). In other words, although the output of a pseudorandom generator is not really random, it is infeasible to tell the difference. It turns out that pseudorandomness and computational difficulty are linked in an even more fundamental manner, as pseudorandom generators can be constructed based on various intractability assumptions. Furthermore, the main result in the area asserts that pseudorandom generators exist if and only if one-way functions exist.

The chapter devoted to pseudorandom generators starts with a treatment of the concept of computational indistinguishability. Pseudorandom generators are defined next, and constructed using special types of one-way functions (defined in a prior chapter). Pseudorandom functions are defined and constructed as well.
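The interface of a pseudorandom generator (a deterministic algorithm stretching a short seed into a long output) can be sketched as follows. The sketch hashes the seed together with a counter using SHA-256; this is a heuristic stand-in chosen for brevity, not one of the constructions treated in the book, and its pseudorandomness is an unproven assumption. The name `expand` is illustrative.

```python
import hashlib


def expand(seed: bytes, out_len: int) -> bytes:
    """Deterministically stretch `seed` into `out_len` bytes.

    Assumption: SHA-256 applied to seed || counter behaves like a good
    generator. This is a heuristic, not a proven construction.
    """
    blocks = []
    counter = 0
    while 32 * len(blocks) < out_len:  # each SHA-256 digest contributes 32 bytes
        blocks.append(hashlib.sha256(seed + counter.to_bytes(8, "big")).digest())
        counter += 1
    return b"".join(blocks)[:out_len]


# A fixed all-zero seed keeps the demo reproducible; a real seed must be random.
seed = bytes(16)
stream = expand(seed, 1000)
```

Note that the output is fully determined by the seed: running `expand` twice on the same seed gives the same 1000 bytes, which is exactly why such output is not "really random".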
#### 1.1.3 Digital Signatures

A problem which did not exist in the "pre-computerized" world is that of a "digital signature". The need to discuss "digital signatures" has arisen with the introduction of computer communication in business environments in which parties need to commit themselves to proposals and/or declarations they make. Discussions of "unforgeable signatures" did take place also in previous centuries, but the objects of discussion were handwritten signatures (and not digital ones), and the discussion was not perceived as related to "cryptography".

Relations between encryption and signature methods became possible with the "digitalization" of both, and the introduction of the computational complexity approach to security. Loosely speaking, a scheme for unforgeable signatures requires that

- each user can efficiently generate his own signature on documents of his choice;
- each user can efficiently verify whether a given string is a signature of another (specific) user on a specific document; but
- nobody can efficiently produce signatures of other users to documents they did not sign.

We stress that the formulation of unforgeable digital signatures also provides a clear statement of the essential ingredients of handwritten signatures. The ingredients are each person's ability to sign for himself, a universally agreed verification procedure, and the belief (or assertion) that it is infeasible (or at least hard) to forge signatures in a manner that passes the verification procedure. Clearly, it is hard to state to what extent handwritten signatures meet these requirements. In contrast, our discussion of digital signatures will supply precise statements concerning the extent to which digital signatures meet the above requirements. Furthermore, unforgeable digital signature schemes can be constructed using the same computational assumptions as used in the construction of encryption schemes.
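The three requirements above (signing, public verification, unforgeability) can be made concrete with a small sketch in the style of Lamport's one-time signatures. This scheme is not described in this section and is used here only as an illustration: each key pair may sign a single document, and unforgeability rests on the assumption that SHA-256 is hard to invert and collision resistant. All names (`keygen`, `sign`, `verify`, `digest_bits`) are illustrative.

```python
import hashlib
import secrets

N = 256  # number of message-digest bits that get signed


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Secret key: two random strings per digest bit. Public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(N)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk


def digest_bits(message: bytes):
    """The N bits of SHA-256(message), least significant bit first."""
    d = int.from_bytes(H(message), "big")
    return [(d >> i) & 1 for i in range(N)]


def sign(sk, message: bytes):
    """Reveal, for each digest bit, the secret string selected by that bit."""
    return [sk[i][bit] for i, bit in enumerate(digest_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    """Anyone holding the public key can check the revealed strings."""
    if len(signature) != N:
        return False
    bits = digest_bits(message)
    return all(H(s) == pk[i][bit] for i, (s, bit) in enumerate(zip(signature, bits)))


sk, pk = keygen()
doc = b"I agree to the proposal"
sig = sign(sk, doc)
```

Verification needs only the public key, so any user can check a signature, while producing one requires the secret strings. Signing a second document with the same key would reveal more of the secret key, which is why this sketch is strictly one-time.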
In the chapter devoted to signature schemes, much attention is placed on defining the security (i.e., unforgeability) of these schemes. Next, constructions of unforgeable signature schemes based on various intractability assumptions are presented. In addition, we treat the related problem of message authentication.

##### Message authentication

Message authentication is a task related to the setting considered for encryption schemes, i.e., communication over an insecure channel. This time, we consider an active adversary which is monitoring the channel and may alter the messages sent on it. The parties communicating through this insecure channel wish to authenticate the messages they send so that their counterpart can tell an original message (sent by the sender) from a modified one (i.e., modified by the adversary). Loosely speaking, a scheme for message authentication requires that

- each of the communicating parties can efficiently generate an authentication tag to any message of his choice;
- each of the communicating parties can efficiently verify whether a given string is an authentication tag of a given message; but
- no external adversary (i.e., a party other than the communicating parties) can efficiently produce authentication tags to messages not sent by the communicating parties.

In some sense "message authentication" is similar to digital signatures. The difference between the two is that in the setting of message authentication the adversary is not required to be able to verify the validity of authentication tags produced by the legitimate users, whereas in the setting of signature schemes the adversary is required to be able to verify the validity of signatures produced by other users. Hence, digital signatures provide a solution to the message authentication problem. On the other hand, message authentication schemes do not necessarily constitute a digital signature scheme.
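The message authentication requirements above can be sketched with Python's standard `hmac` module. HMAC-SHA256 is a standard construction swapped in here for illustration; it is not a scheme taken from the text, and the helper names `make_tag` and `check_tag` are illustrative. The two communicating parties are assumed to share `shared_key` in advance.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Generate an authentication tag for `message` under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, candidate: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), candidate)


shared_key = secrets.token_bytes(32)  # known to both legitimate parties only
msg = b"transfer 10 units to Bob"
tag = make_tag(shared_key, msg)
```

Verification here requires the secret key itself, so an outsider cannot check tags. This is the difference from signatures noted above: a signature must be verifiable by users who cannot produce it.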
##### Signatures widen the scope of cryptography

Considering the problem of digital signatures as belonging to cryptography widens the scope of this area from the specific "secret communication problem" to a variety of problems concerned with limiting the "gain" obtained by "dishonest" behaviour of parties (that are either internal or external to the system). Specifically:

- In the "secret communication problem" (solved by use of encryption schemes) one wishes to reduce as much as possible the information that a potential wiretapper may extract from the communication between two (legitimate) users. In this case, the legitimate system consists of the two communicating parties, and the wiretapper is considered as an external ("dishonest") party.
- In the "message authentication problem" one aims at prohibiting an (external) wiretapper from modifying the communication between two (legitimate) users.
- In the "signature problem" one aims at supplying all users of a system with a way of making self-binding statements so that other users may not make statements that bind somebody else. In this case, the legitimate system consists of the set of all users, and a potential forger is considered as an internal yet dishonest user.

Hence, in the wide sense, cryptography is concerned with any problem in which one wishes to limit the effect of dishonest users. A general treatment of such problems is captured by the treatment of "fault-tolerant" (or cryptographic) protocols.

#### 1.1.4 Fault-Tolerant Protocols and Zero-Knowledge Proofs

A discussion of signature schemes naturally leads to a discussion of cryptographic protocols, since it is natural to ask under what circumstances a party should send his signature to another party. In particular, problems like mutual simultaneous commitment (e.g., contract signing) arise naturally.
Another type of problem, motivated by the use of computer communication in the business environment, consists of the "secure implementation" of protocols (e.g., implementing secret and incorruptible voting).

Simultaneity problems

A typical example of a simultaneity problem is the problem of simultaneous exchange of secrets, of which contract signing is a special case. The setting in a simultaneous exchange of secrets consists of two parties, each holding a "secret". The goal is to execute a protocol so that if both parties follow it correctly then at termination each holds its counterpart's secret, and in any case (even if one party "cheats") the first party "holds" the second party's secret if and only if the second party "holds" the first party's secret. Simultaneous exchange of secrets can be achieved only when assuming the existence of third parties which are trusted to some extent.

Simultaneous exchange of secrets can be easily achieved using the active participation of a trusted third party. Each party sends its secret to the trusted party (using a secure channel), who, upon receiving both secrets, sends both of them to both parties. There are two problems with this solution:

1. The solution requires active participation of an "external" party in all cases (i.e., also in case both parties are honest). We note that other solutions requiring milder forms of participation (of external parties) do exist, yet further discussion is postponed to the chapter devoted to cryptographic protocols.
2. The solution requires the existence of a totally trusted entity. In some applications such an entity does not exist. Nevertheless, in the sequel we discuss the problem of implementing a trusted third party by a set of users with an honest majority (even if the identity of the honest users is not known).
Secure implementation of protocols and trusted parties

A different type of protocol problem concerns the secure implementation of protocols. To be more specific, we discuss the problem of evaluating a function of local inputs, each held by a different user. An illustrative and motivating example is voting, in which the function is majority and the local input held by user A is a single bit representing the vote of user A (e.g., "Pro" or "Con"). We say that a protocol implements a secure evaluation of a specific function if it satisfies

- privacy: No party "gains information" on the input of other parties, beyond what is deduced from the value of the function; and
- robustness: No party can "influence" the value of the function, beyond the influence obtained by selecting its own input.

It is sometimes required that the above conditions hold with respect to "small" (e.g., minority) coalitions of parties (instead of single parties).

Clearly, if one of the users is known to be totally trusted then there exists a simple solution to the problem of secure evaluation of any function. Each user just sends its input to the trusted party (using a secure channel), who, upon receiving all inputs, computes the function, sends the outcome to all users, and erases all intermediate computations (including the inputs received) from its memory. Certainly, it is unrealistic to assume that a party can be trusted to such an extent (e.g., that it voluntarily erases what it has "learnt"). Nevertheless, we have seen that the problem of implementing secure function evaluation reduces to the problem of implementing a trusted party. It turns out that a trusted party can be implemented by a set of users with an honest majority (even if the identity of the honest users is not known). This is indeed a major result in the area.
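The trusted-party solution for the voting example can be sketched as follows. This only models the idealized setting described above (a single, fully trusted party evaluating majority); the function name and the dictionary representation of the local inputs are hypothetical, and neither the secure channels nor the erasure of the received inputs is modeled.

```python
from typing import Dict


def trusted_party_majority(votes: Dict[str, int]) -> int:
    """Idealized trusted party: collect one bit per user, output only the majority.

    Returns 1 ("Pro") if strictly more than half of the users voted 1,
    and 0 ("Con") otherwise. Ties are resolved as 0 (an arbitrary choice
    made for this sketch).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    for user, bit in votes.items():
        if bit not in (0, 1):
            raise ValueError(f"vote of {user!r} must be 0 or 1")
    ones = sum(votes.values())
    # Only this single bit leaves the trusted party; individual votes do not.
    return 1 if 2 * ones > len(votes) else 0


if __name__ == "__main__":
    print(trusted_party_majority({"A": 1, "B": 0, "C": 1}))
```

In this idealized model each user affects the outcome only through its own entry in `votes` (robustness) and learns only the returned bit (privacy). The difficulty discussed in the text is obtaining the same guarantees without such a party.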
Zero-knowledge as a paradigm

A major tool in the construction of cryptographic protocols is the concept of zero-knowledge proof systems, and the fact that zero-knowledge proof systems exist for all languages in NP (provided that one-way functions exist). Loosely speaking, zero-knowledge proofs yield nothing but the validity of the assertion. Zero-knowledge proofs provide a tool for "forcing" parties to follow a given protocol properly.

To illustrate the role of zero-knowledge proofs, consider a setting in which a party, upon receiving an encrypted message, should answer with the least significant bit of the message. Clearly, if the party just sends the (least significant) bit (of the message) then there is no way to guarantee that it did not cheat. The party may prove that it did not cheat by revealing the entire message as well as its decryption key, but this would yield information beyond what has been required. A much better idea is to let the party augment the bit it sends by a zero-knowledge proof that this bit is indeed the least significant bit of the message. We stress that the above statement is of the "NP-type" (since the proof specified above can be efficiently verified), and therefore the existence of zero-knowledge proofs for NP-statements implies that the above statement can be proven without revealing anything beyond its validity.

1.2 Some Background from Probability Theory

Probability plays a central role in cryptography. In particular, probability is essential in order to allow a discussion of information or lack of information (i.e., secrecy). We assume that the reader is familiar with the basic notions of probability theory. In this section, we merely present the probabilistic notations that are used throughout the book, and three useful probabilistic inequalities.

1.2.1 Notational Conventions

Throughout the entire book we will refer only to discrete probability distributions.
Traditionally, a random variable is defined as a function from the sample space into the reals (or integers). In this book we use the term random variable also when referring to functions mapping the sample space into the set of binary strings. For example, we may say that $X$ is a random variable assigned values in the set of all strings so that $\mathrm{Prob}(X = 00) = \frac{1}{3}$ and $\mathrm{Prob}(X = 111) = \frac{2}{3}$. This is indeed a non-standard convention, but a useful one. Also, we will refer directly to the random variables without specifying the probability space on which they are defined. In most cases the probability space consists of all strings of a particular length.

How to read probabilistic statements

All our probabilistic statements refer to functions of random variables which are defined beforehand. Typically, we may write $\mathrm{Prob}(f(X) = 1)$, where $X$ is a random variable defined beforehand (and $f$ is a function). An important convention is that all occurrences of the same symbol in a probabilistic statement refer to the same (unique) random variable. Hence, if $E(\cdot,\cdot)$ is an expression depending on two variables and $X$ is a random variable then $\mathrm{Prob}(E(X,X))$ denotes the probability that $E(x,x)$ holds when $x$ is chosen with probability $\mathrm{Prob}(X = x)$. Namely,

$$\mathrm{Prob}(E(X,X)) = \sum_{x} \mathrm{Prob}(X = x) \cdot \mathrm{val}(E(x,x))$$

where $\mathrm{val}(E(x,x))$ equals 1 if $E(x,x)$ holds and equals 0 otherwise. For example, for every random variable $X$, we have $\mathrm{Prob}(X = X) = 1$. We stress that if one wishes to discuss the probability that $E(x,y)$ holds when $x$ and $y$ are chosen independently with identical probability distribution then one needs to define two independent random variables, each with the same probability distribution. Hence, if $X$ and $Y$ are two independent random variables then $\mathrm{Prob}(E(X,Y))$ denotes the probability that $E(x,y)$ holds when the pair $(x,y)$ is chosen with probability $\mathrm{Prob}(X = x) \cdot \mathrm{Prob}(Y = y)$.
Namely,

$$\mathrm{Prob}(E(X,Y)) = \sum_{x,y} \mathrm{Prob}(X = x) \cdot \mathrm{Prob}(Y = y) \cdot \mathrm{val}(E(x,y))$$

For example, for every two independent random variables, $X$ and $Y$, we have $\mathrm{Prob}(X = Y) = 1$ only if both $X$ and $Y$ are trivial (i.e., assign the entire probability mass to a single string).

Typical random variables

Throughout the entire book, $U_n$ denotes a random variable uniformly distributed over the set of strings of length $n$. Namely, $\mathrm{Prob}(U_n = \alpha)$ equals $2^{-n}$ if $\alpha \in \{0,1\}^n$ and equals 0 otherwise. In addition, we will occasionally use random variables (arbitrarily) distributed over $\{0,1\}^n$ or $\{0,1\}^{l(n)}$, for some function $l : \mathbb{N} \to \mathbb{N}$. Such random variables are typically denoted by $X_n$, $Y_n$, $Z_n$, etc. We stress that in some cases $X_n$ is distributed over $\{0,1\}^n$ whereas in others it is distributed over $\{0,1\}^{l(n)}$, for some function $l(\cdot)$, typically a polynomial. Another type of random variable, the output of a randomized algorithm on a fixed input, is discussed in the next section.

1.2.2 Three Inequalities

The following probabilistic inequalities will be very useful in the course of the book. All inequalities refer to random variables which are assigned real values. The most basic inequality is the Markov Inequality, which asserts that, for random variables assigned values in some interval, some relation must exist between the deviation of a value from the expectation of the random variable and the probability that the random variable is assigned this value. Specifically,

Markov Inequality: Let $X$ be a non-negative random variable and $v$ a positive real number. Then

$$\mathrm{Prob}(X \ge v) \le \frac{\mathrm{Exp}(X)}{v}$$

Equivalently, $\mathrm{Prob}(X \ge r \cdot \mathrm{Exp}(X)) \le \frac{1}{r}$.

Proof:

$$\mathrm{Exp}(X) = \sum_{x} \mathrm{Prob}(X = x) \cdot x \ge \sum_{x < v} \mathrm{Prob}(X = x) \cdot 0 + \sum_{x \ge v} \mathrm{Prob}(X = x) \cdot v = \mathrm{Prob}(X \ge v) \cdot v$$

The claim follows.

The Markov inequality is typically used in cases where one knows very little about the distribution of the random variable. It suffices to know its expectation and at least one bound on the range of its values.

Exercise 1:

1. Let $X$ be a random variable such that $\mathrm{Exp}(X) = \mu$ and $X \le 2\mu$. Give an upper bound on $\mathrm{Prob}(X < \frac{\mu}{2})$.
2. Let $0 < \epsilon, \delta < 1$, and $Y$ be a random variable ranging in the interval $[0,1]$ such that $\mathrm{Exp}(Y) = \delta + \epsilon$. Give a lower bound on $\mathrm{Prob}(Y \ge \delta + \frac{\epsilon}{2})$.

Using Markov's inequality, one gets a "possibly stronger" bound for the deviation of a random variable from its expectation. This bound, called Chebyshev's inequality, is useful provided one has additional knowledge concerning the random variable (specifically a good upper bound on its variance).

Chebyshev's Inequality: Let $X$ be a random variable, and $\delta > 0$. Then

$$\mathrm{Prob}(|X - \mathrm{Exp}(X)| > \delta) \le \frac{\mathrm{Var}(X)}{\delta^2}$$

Proof: We define a random variable $Y \stackrel{\mathrm{def}}{=} (X - \mathrm{Exp}(X))^2$, and apply the Markov inequality. We get

$$\mathrm{Prob}(|X - \mathrm{Exp}(X)| > \delta) = \mathrm{Prob}\left((X - \mathrm{Exp}(X))^2 > \delta^2\right) \le \frac{\mathrm{Exp}\left((X - \mathrm{Exp}(X))^2\right)}{\delta^2}$$

and the claim follows.

Chebyshev's inequality is particularly useful in the analysis of the error probability of approximation via repeated sampling. It suffices to assume that the samples are picked in a pairwise independent manner.

Corollary (Pairwise Independent Sampling): Let $X_1, X_2, \ldots, X_n$ be pairwise independent random variables with identical expectation, denoted $\mu$, and identical variance, denoted $\sigma^2$. Then, for every $\epsilon > 0$,

$$\mathrm{Prob}\left(\left|\frac{\sum_{i=1}^{n} X_i}{n} - \mu\right| > \epsilon\right) \le \frac{\sigma^2}{\epsilon^2 n}$$

The $X_i$'s are pairwise independent if for every $i \ne j$ and all $a, b$, it holds that $\mathrm{Prob}(X_i = a \wedge X_j = b)$ equals $\mathrm{Prob}(X_i = a) \cdot \mathrm{Prob}(X_j = b)$.

Proof: Define the random variables $\overline{X}_i \stackrel{\mathrm{def}}{=} X_i - \mathrm{Exp}(X_i)$. Note that the $\overline{X}_i$'s are pairwise independent, and each has zero expectation. Applying Chebyshev's inequality to the random variable defined by the sum $\sum_{i=1}^{n} \frac{X_i}{n}$, and using the linearity of the expectation operator, we get

$$\mathrm{Prob}\left(\left|\sum_{i=1}^{n} \frac{X_i}{n} - \mu\right| > \epsilon\right) \le \frac{\mathrm{Var}\left(\sum_{i=1}^{n} \frac{X_i}{n}\right)}{\epsilon^2} = \frac{\mathrm{Exp}\left(\left(\sum_{i=1}^{n} \overline{X}_i\right)^2\right)}{\epsilon^2 \cdot n^2}$$

Now (again using the linearity of Exp)

$$\mathrm{Exp}\left(\left(\sum_{i=1}^{n} \overline{X}_i\right)^2\right) = \sum_{i=1}^{n} \mathrm{Exp}\left(\overline{X}_i^2\right) + \sum_{1 \le i \ne j \le n} \mathrm{Exp}\left(\overline{X}_i \overline{X}_j\right)$$

By the pairwise independence of the $\overline{X}_i$'s, we get $\mathrm{Exp}(\overline{X}_i \overline{X}_j) = \mathrm{Exp}(\overline{X}_i) \cdot \mathrm{Exp}(\overline{X}_j)$, and using $\mathrm{Exp}(\overline{X}_i) = 0$, we get

$$\mathrm{Exp}\left(\left(\sum_{i=1}^{n} \overline{X}_i\right)^2\right) = n \cdot \sigma^2$$

The corollary follows.

Using pairwise independent sampling, the error probability in the approximation decreases linearly with the number of sample points. Using totally independent sampling points, the error probability in the approximation can be shown to decrease exponentially with the number of sample points. (The random variables $X_1, X_2, \ldots, X_n$ are said to be totally independent if for every sequence $a_1, a_2, \ldots, a_n$ it holds that $\mathrm{Prob}(\wedge_{i=1}^{n} X_i = a_i)$ equals $\prod_{i=1}^{n} \mathrm{Prob}(X_i = a_i)$.) The bounds quoted below are (weakenings of) a special case of the Martingale Tail Inequality which suffices for our purposes. The first bound, commonly referred to as the Chernoff Bound, concerns 0-1 random variables (i.e., random variables which are assigned as values either 0 or 1).

Chernoff Bound: Let $p \le \frac{1}{2}$, and $X_1, X_2, \ldots, X_n$ be independent 0-1 random variables so that $\mathrm{Prob}(X_i = 1) = p$, for each $i$. Then for all $\epsilon$, $0 < \epsilon \le p(1-p)$, we have

$$\mathrm{Prob}\left(\left|\frac{\sum_{i=1}^{n} X_i}{n} - p\right| > \epsilon\right) < 2 \cdot e^{-\frac{\epsilon^2}{2p(1-p)} \cdot n}$$

We will usually apply the bound with a constant $p \approx \frac{1}{2}$. In this case, $n$ independent samples give an approximation which deviates by $\epsilon$ from the expectation with probability $\delta$ which is exponentially decreasing with $\epsilon^2 n$. Such an approximation is called an $(\epsilon, \delta)$-approximation, and can be achieved using $n = O(\epsilon^{-2} \cdot \log(1/\delta))$ sample points. It is important to remember that the sufficient number of sample points is polynomially related to $\epsilon^{-1}$ and logarithmically related to $\delta^{-1}$. So using $\mathrm{poly}(n)$ many samples the error probability (i.e., $\delta$) can be made negligible (as a function in $n$), but the accuracy of the estimation (i.e., $\epsilon$) can only be bounded above by any fixed polynomial fraction (and cannot be made negligible).
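To get a feeling for the numbers, the following sketch solves the Chernoff Bound stated above for $n$: requiring $2 e^{-\epsilon^2 n / (2p(1-p))} \le \delta$ gives $n \ge 2p(1-p)\ln(2/\delta)/\epsilon^2$. The function names are hypothetical, and the small simulation at the end (with an arbitrarily chosen seed) is only an illustration of repeated independent sampling, not part of the text.

```python
import math
import random


def _check_range(eps: float, p: float) -> None:
    # The bound is stated in the text only for p <= 1/2 and 0 < eps <= p(1-p).
    if not (0 < p <= 0.5 and 0 < eps <= p * (1 - p)):
        raise ValueError("need 0 < p <= 1/2 and 0 < eps <= p(1-p)")


def chernoff_bound(n: int, eps: float, p: float = 0.5) -> float:
    """Right-hand side of the bound: 2 * exp(-eps^2 * n / (2 p (1 - p)))."""
    _check_range(eps, p)
    return 2.0 * math.exp(-(eps ** 2) * n / (2.0 * p * (1.0 - p)))


def samples_needed(eps: float, delta: float, p: float = 0.5) -> int:
    """Number of samples making the stated bound at most delta."""
    _check_range(eps, p)
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.ceil(2.0 * p * (1.0 - p) * math.log(2.0 / delta) / eps ** 2)


def estimate_fraction(n: int, p: float = 0.5, seed: int = 0) -> float:
    """Average of n independent 0-1 samples, each equal to 1 with probability p."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if rng.random() < p) / n


if __name__ == "__main__":
    n = samples_needed(eps=0.1, delta=0.01)
    print(n, chernoff_bound(n, 0.1), estimate_fraction(n))
```

For $p = \frac{1}{2}$, $\epsilon = 0.1$ and $\delta = 0.01$ this yields $n = 265$. Halving $\epsilon$ roughly quadruples $n$, whereas squaring $\delta$ only about doubles it, which is the polynomial versus logarithmic dependence noted above.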
A more general bound, useful in the approximation of the expectation of a general random variable (not necessarily 0-1), is given below.

Hoeffding Inequality: Let $X_1, X_2, \ldots, X_n$ be $n$ independent random variables with identical probability distribution, each ranging over the (real) interval $[a,b]$, and let $\mu$ denote the expected value of each of these variables. Then, for every $\epsilon > 0$,

$$\mathrm{Prob}\left(\left|\frac{\sum_{i=1}^{n} X_i}{n} - \mu\right| > \epsilon\right) < 2 \cdot e^{-\frac{2\epsilon^2}{(b-a)^2} \cdot n}$$

The Hoeffding Inequality is useful in estimating the average value of a function defined over a large set of values. It can be applied provided we can efficiently sample the set and have a bound on the possible values (of the function).

Exercise 2: Let $f : \{0,1\}^* \to [0,1]$ be a polynomial-time computable function, and let $F(n)$ denote the average value of $f$ over $\{0,1\}^n$. Namely,

$$F(n) \stackrel{\mathrm{def}}{=} \frac{\sum_{x \in \{0,1\}^n} f(x)}{2^n}$$

Let $p(\cdot)$ be a polynomial. Present a probabilistic polynomial-time algorithm that on input $1^n$ outputs an estimate to $F(n)$, denoted $A(n)$, such that

$$\mathrm{Prob}\left(|F(n) - A(n)| > \frac{1}{p(n)}\right) < 2^{-n}$$

Guidance: The algorithm selects at random polynomially many (how many?) sample points $s_i \in \{0,1\}^n$. These points are selected independently and with uniform probability distribution (why?). The algorithm outputs the average value taken over this sample. Analyze the performance of the algorithm using the Hoeffding Inequality (hint: define random variables $X_i = f(s_i)$).

1.3 The Computational Model

Our approach to cryptography is heavily based on computational complexity. Thus, some background on computational complexity is required for our discussion of cryptography. In this section, we briefly recall the definitions of the complexity classes P, NP, BPP, non-uniform P (i.e., P/poly), and the concept of oracle machines. In addition, we discuss the type of intractability assumptions used throughout the rest of the book.

1.3.1 P, NP, and NP-completeness

A conservative approach to computing devices associates efficient computations with the complexity class P.
Jumping ahead, we note that the approach taken in this book is a more liberal one in that it allows the computing devices to use coin tosses.

Definition 1.1: P is the class of languages which can be recognized by a (deterministic) polynomial-time machine (algorithm). A language $L$ is recognizable in polynomial-time if there exists a (deterministic) Turing machine $M$ and a polynomial $p(\cdot)$ such that

- on input a string $x$, machine $M$ halts after at most $p(|x|)$ steps; and
- $M(x) = 1$ if and only if $x \in L$.

Likewise, the complexity class NP is associated with computational problems having solutions that, once given, can be efficiently tested for validity. It is customary to define NP as the class of languages which can be recognized by a non-deterministic polynomial-time machine. A more fundamental interpretation of NP is given by the following equivalent definition.

Definition 1.2: A language $L$ is in NP if there exists a Boolean relation $R_L \subseteq \{0,1\}^* \times \{0,1\}^*$ and a polynomial $p(\cdot)$ such that $R_L$ can be recognized in (deterministic) polynomial-time, and $x \in L$ if and only if there exists a $y$ such that $|y| \le p(|x|)$ and $(x,y) \in R_L$. Such a $y$ is called a witness for membership of $x \in L$.

Thus, NP consists of the set of languages for which there exist short proofs of membership that can be efficiently verified. It is widely believed that $\mathcal{P} \ne \mathcal{NP}$, and settling this conjecture is certainly the most intriguing open problem in Theoretical Computer Science. If indeed $\mathcal{P} \ne \mathcal{NP}$ then there exists a language $L \in \mathcal{NP}$ so that every algorithm recognizing $L$ has super-polynomial running-time in the worst-case. Certainly, all NP-complete languages (see definition below) will have super-polynomial time complexity in the worst-case.

Definition 1.3: A language is NP-complete if it is in NP and every language in NP is polynomially-reducible to it. A language $L$ is polynomially-reducible to a language $L'$ if there exists a polynomial-time computable function $f$ so that $x \in L$ if and only if $f(x) \in L'$.
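Definition 1.2 can be made concrete with the satisfiability problem for propositional formulae: the instance $x$ is a formula and the witness $y$ is a satisfying truth assignment. The sketch below decides whether a pair belongs to the relation $R_L$ in time linear in the size of the formula. The clause encoding (lists of nonzero integers, a negative integer denoting a negated variable) and the function name are conventions assumed for this example, not taken from the text.

```python
from typing import Dict, List


def is_valid_witness(cnf: List[List[int]], assignment: Dict[int, bool]) -> bool:
    """Return True iff `assignment` satisfies every clause of the CNF formula.

    Each clause is a list of nonzero integers: k stands for variable k and
    -k for its negation. The running time is linear in the formula size,
    so the relation is recognizable in polynomial time.
    """
    for clause in cnf:
        satisfied = False
        for literal in clause:
            value = assignment.get(abs(literal))
            if value is None:
                return False  # the witness leaves a needed variable unassigned
            if value == (literal > 0):
                satisfied = True
                break
        if not satisfied:
            return False
    return True


if __name__ == "__main__":
    # (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
    formula = [[1, -2], [2, 3], [-1, -3]]
    print(is_valid_witness(formula, {1: True, 2: True, 3: False}))
    print(is_valid_witness(formula, {1: True, 2: False, 3: True}))
```

Checking a given witness is easy; the difficulty associated with NP lies in finding one, since there are $2^k$ candidate assignments for a formula over $k$ variables.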
Among the languages known to be NP-complete are Satisfiability (of propositional formulae) and Graph Colorability.

1.3.2 Probabilistic Polynomial-Time

The basic thesis underlying our discussion is the association of "efficient" computations with probabilistic polynomial-time computations. Namely, we will consider as efficient only randomized algorithms (i.e., probabilistic Turing machines) whose running time is bounded by a polynomial in the length of the input. Such algorithms (machines) can be viewed in two equivalent ways.

One way of viewing randomized algorithms is to allow the algorithm to make random moves ("toss coins"). Formally this can be modeled by a Turing machine in which the transition function maps pairs of the form $(\langle\text{state}\rangle, \langle\text{symbol}\rangle)$ to two possible triples of the form $(\langle\text{state}\rangle, \langle\text{symbol}\rangle, \langle\text{direction}\rangle)$. The next step of such a machine is determined by a random choice of one of these triples. Namely, to make a step, the machine chooses at random (with probability one half for each possibility) either the first triple or the second one, and then acts accordingly. These random choices are called the internal coin tosses of the machine. The output of a probabilistic machine, $M$, on input $x$ is not a string but rather a random variable assuming strings as possible values. This random variable, denoted $M(x)$, is induced by the internal coin tosses of $M$. By $\mathrm{Prob}(M(x) = y)$ we mean the probability that machine $M$ on input $x$ outputs $y$. The probability space is that of all possible outcomes for the internal coin tosses taken with uniform probability distribution. The last sentence is slightly more problematic than it seems. The simple case is when, on input $x$, machine $M$ always makes the same number of internal coin tosses (independent of their outcome).
Since we only consider polynomial-time machines, we may assume, without loss of generality, that the number of coin tosses made by $M$ on input $x$ is independent of their outcome, and is denoted by $t_M(x)$. We denote by $M_r(x)$ the output of $M$ on input $x$ when $r$ is the outcome of its internal coin tosses. Then, $\mathrm{Prob}(M(x) = y)$ is merely the fraction of $r \in \{0,1\}^{t_M(x)}$ for which $M_r(x) = y$. Namely,

$$\mathrm{Prob}(M(x) = y) = \frac{\left|\{r \in \{0,1\}^{t_M(x)} : M_r(x) = y\}\right|}{2^{t_M(x)}}$$

The second way of looking at randomized algorithms is to view the outcome of the internal coin tosses of the machine as an auxiliary input. Namely, we consider deterministic machines with two inputs. The first input plays the role of the "real input" (i.e., $x$) of the first approach, while the second input plays the role of a possible outcome for a sequence of internal coin tosses. Thus, the notation $M(x,r)$ corresponds to the notation $M_r(x)$ used above. In the second approach one considers the probability distribution of $M(x,r)$, for any fixed $x$ and a uniformly chosen $r \in \{0,1\}^{t_M(x)}$. Pictorially, here the coin tosses are not "internal" but rather supplied to the machine by an "external" coin tossing device.

Before continuing, let me remark that one should not confuse the fictitious model of "non-deterministic" machines with the model of probabilistic machines. The first is an unrealistic model which is useful for talking about search problems the solutions to which can be efficiently verified (e.g., the definition of NP), while the second is a realistic model of computation.

In the sequel, unless otherwise stated, a probabilistic polynomial-time Turing machine means a probabilistic machine that always (i.e., independently of the outcome of its internal coin tosses) halts after a polynomial (in the length of the input) number of steps. It follows that the number of coin tosses of a probabilistic polynomial-time machine $M$ is bounded by a polynomial, denoted $T_M$, in its input length.
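The second view lends itself to a direct computation: treat the machine as a deterministic function of the pair $(x, r)$ and count the coin sequences $r$ that lead to a given output. The toy machine below (which outputs the inner product modulo 2 of $x$ and $r$, and therefore uses $|x|$ coin tosses) and the function names are assumptions made purely for illustration. The enumeration takes time exponential in the number of coin tosses, so it is feasible only for very short inputs.

```python
from fractions import Fraction
from itertools import product


def machine(x: str, r: str) -> int:
    """Deterministic two-input machine M(x, r): inner product of x and r mod 2."""
    return sum(int(a) & int(b) for a, b in zip(x, r)) % 2


def output_probability(x: str, y: int) -> Fraction:
    """Prob(M(x) = y): the fraction of coin sequences r with M(x, r) = y."""
    t = len(x)  # number of coin tosses used by this toy machine on input x
    hits = sum(
        1
        for bits in product("01", repeat=t)
        if machine(x, "".join(bits)) == y
    )
    return Fraction(hits, 2 ** t)


if __name__ == "__main__":
    print(output_probability("101", 1))
    print(output_probability("000", 1))
```

For every input containing at least one 1 the output bit is uniformly distributed, so `output_probability("101", 1)` equals 1/2, while on the all-zero input the machine outputs 0 for every $r$.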
Finally, without loss of generality, we assume that on input $x$ the machine always makes $T_M(|x|)$ coin tosses.

Thesis: Efficient computations correspond to computations that can be carried out by probabilistic polynomial-time Turing machines.

A complexity class capturing these computations is the class, denoted BPP, of languages recognizable (with high probability) by probabilistic polynomial-time machines. The probability refers to the event "the machine makes the correct verdict on string $x$".

Definition 1.4 (Bounded-Probability Polynomial-time, BPP): BPP is the class of languages which can be recognized by a probabilistic polynomial-time machine (i.e., randomized algorithm). We say that $L$ is recognized by the probabilistic polynomial-time machine $M$ if

- for every $x \in L$ it holds that $\mathrm{Prob}(M(x) = 1) \ge \frac{2}{3}$; and
- for every $x \notin L$ it holds that $\mathrm{Prob}(M(x) = 0) \ge \frac{2}{3}$.

The phrase "bounded-probability" indicates that the success probability is bounded away from $\frac{1}{2}$. In fact, substituting in Definition 1.4 the constant $\frac{2}{3}$ by any other constant greater than $\frac{1}{2}$ does not change the class defined. More generally:

Exercise 1: Prove that Definition 1.4 is robust under the substitution of $\frac{2}{3}$ by $\frac{1}{2} + \frac{1}{p(|x|)}$, for every polynomial $p(\cdot)$. Namely, that $L \in \mathcal{BPP}$ if there exists a polynomial $p(\cdot)$ and a probabilistic polynomial-time machine, $M$, such that

- for every $x \in L$ it holds that $\mathrm{Prob}(M(x) = 1) \ge \frac{1}{2} + \frac{1}{p(|x|)}$; and
- for every $x \notin L$ it holds that $\mathrm{Prob}(M(x) = 0) \ge \frac{1}{2} + \frac{1}{p(|x|)}$.

Guidance: Given a probabilistic polynomial-time machine $M$ satisfying the above condition, construct a probabilistic polynomial-time machine $M'$ as follows. On input $x$, machine $M'$ runs $O(p(|x|)^2)$ many copies of $M$, on the same input $x$, and rules by majority. Use Chebyshev's inequality (see Sec. 1.2) to show that $M'$ is correct with probability $> \frac{2}{3}$.

Exercise 2: Prove that Definition 1.4 is robust under the substitution of $\frac{2}{3}$ by $1 - 2^{-|x|}$.
Guidance: Similar to Exercise 1, except that you have to use a stronger probabilistic inequality (namely the Chernoff bound; see Sec. 1.2).

We conclude that languages in BPP can be recognized by probabilistic polynomial-time machines with a negligible error probability. We call negligible any function which decreases faster than one over any polynomial. Namely,

Definition 1.5 (negligible): We call a function $\mu : \mathbb{N} \to \mathbb{R}$ negligible if for every polynomial $p(\cdot)$ there exists an $N$ such that for all $n > N$

$$\mu(n) < \frac{1}{p(n)}$$

For example, the functions $2^{-\sqrt{n}}$ and $n^{-\log_2 n}$ are negligible (as functions in $n$). Negligible functions stay this way when multiplied by any fixed polynomial. Namely, for every negligible function $\mu$ and any polynomial $p$, the function $\mu'(n) \stackrel{\mathrm{def}}{=} p(n) \cdot \mu(n)$ is negligible. It follows that an event which occurs with negligible probability is highly unlikely to occur even if we repeat the experiment polynomially many times.

Convention: In Definition 1.5 we used the phrase "there exists an $N$ such that for all $n > N$". In the future we will use the shorter and less tedious phrase "for all sufficiently large $n$". This makes one quantifier (i.e., the $\exists N$) implicit, and is particularly beneficial in statements that contain several (more essential) quantifiers.

1.3.3 Non-Uniform Polynomial-Time

A stronger model of efficient computation is that of non-uniform polynomial-time. This model will be used only in the negative way; namely, for saying that even such machines cannot do something.

A non-uniform polynomial-time "machine" is a pair $(M, \overline{a})$, where $M$ is a two-input polynomial-time machine and $\overline{a} = a_1, a_2, \ldots$ is an infinite sequence such that $|a_n| = \mathrm{poly}(n)$. For every $x$, we consider the computation of machine $M$ on the input pair $(x, a_{|x|})$. Intuitively, $a_n$ may be thought of as an extra "advice" supplied from the "outside" (together with the input $x \in \{0,1\}^n$). We stress that machine $M$ gets the same advice (i.e., $a_n$) on all inputs of the same length (i.e., $n$).
Intuitively, the advice $a_n$ may be useful in some cases (i.e., for some computations on inputs of length $n$), but it is unlikely to encode enough information to be useful for all $2^n$ possible inputs.

Another way of looking at non-uniform polynomial-time "machines" is to consider an infinite sequence of machines, $M_1, M_2, \ldots$ so that both the length of the description of $M_n$ and its running time on inputs of length $n$ are bounded by polynomials in $n$ (fixed for the entire sequence). Machine $M_n$ is used only on inputs of length $n$.

Note the correspondence between the two ways of looking at non-uniform polynomial-time. The pair $(M, (a_1, a_2, \ldots))$ (of the first definition) gives rise to an infinite sequence of machines $M_{a_1}, M_{a_2}, \ldots$, where $M_{a_{|x|}}(x) \stackrel{\mathrm{def}}{=} M(x, a_{|x|})$. On the other hand, a sequence $M_1, M_2, \ldots$ (as in the second definition) gives rise to the pair $(U, (\langle M_1 \rangle, \langle M_2 \rangle, \ldots))$, where $U$ is the universal Turing machine and $\langle M_n \rangle$ is the description of machine $M_n$ (i.e., $U(x, \langle M_{|x|} \rangle) = M_{|x|}(x)$).

In the first sentence of the current subsection, non-uniform polynomial-time has been referred to as a stronger model than probabilistic polynomial-time. This statement is valid in many contexts (e.g., language recognition as in Theorem 1 below). In particular it will be valid in all contexts we discuss in this book. So we have the following informal "meta-theorem":

Meta-Theorem: Whatever can be achieved by probabilistic polynomial-time machines can be achieved by non-uniform polynomial-time "machines".

The meta-theorem is clearly wrong if one thinks of the task of tossing coins... So the meta-theorem should not be understood literally. It is merely an indication of real theorems that can be proven in reasonable cases. Let's consider the context of language recognition.

Definition 1.6: The complexity class non-uniform polynomial-time (denoted P/poly) is the class of languages $L$ which can be recognized by a non-uniform (sequence) polynomial-time "machine".
Namely, $L \in \mathcal{P}/\mathrm{poly}$ if there exists an infinite sequence of machines $M_1, M_2, \ldots$ satisfying

1. There exists a polynomial $p(\cdot)$ such that, for every $n$, the description of machine $M_n$ has length bounded above by $p(n)$.
2. There exists a polynomial $q(\cdot)$ such that, for every $n$, the running time of machine $M_n$ on each input of length $n$ is bounded above by $q(n)$.
3. For every $n$ and every $x \in \{0,1\}^n$, machine $M_n$ accepts $x$ if and only if $x \in L$.

Note that the non-uniformity is implicit in the lack of a requirement concerning the construction of the machines in the sequence. It is only required that these machines exist. In contrast, if one augments Definition 1.6 by requiring the existence of a polynomial-time algorithm that on input $1^n$ ($n$ presented in unary) outputs the description of $M_n$ then one gets a cumbersome way of defining P. On the other hand, it is obvious that $\mathcal{P} \subseteq \mathcal{P}/\mathrm{poly}$ (in fact strict containment can be proven by considering non-recursive unary languages). Furthermore,

Theorem 1: $\mathcal{BPP} \subseteq \mathcal{P}/\mathrm{poly}$.

Proof: Let $M$ be a probabilistic machine recognizing $L \in \mathcal{BPP}$. Let $\chi_L(x) \stackrel{\mathrm{def}}{=} 1$ if $x \in L$ and $\chi_L(x) = 0$ otherwise. Then, for every $x \in \{0,1\}^*$,

$$\mathrm{Prob}(M(x) = \chi_L(x)) \ge \frac{2}{3}$$

Assume, without loss of generality, that on each input of length $n$, machine $M$ uses the same number, $m = \mathrm{poly}(n)$, of coin tosses. Let $x \in \{0,1\}^n$. Clearly, we can find for each $x \in \{0,1\}^n$ a sequence of coin tosses $r \in \{0,1\}^m$ such that $M_r(x) = \chi_L(x)$ (in fact most sequences $r$ have this property). But can one sequence $r \in \{0,1\}^m$ fit all $x \in \{0,1\}^n$? Probably not (provide an example!). Nevertheless, we can find a sequence $r \in \{0,1\}^m$ which fits $\frac{2}{3}$ of all the $x$'s of length $n$. This is done by a counting argument (which asserts that if $\frac{2}{3}$ of the $r$'s are good for each $x$ then there is an $r$ which is good for at least $\frac{2}{3}$ of the $x$'s). However, this does not give us an $r$ which is good for all $x \in \{0,1\}^n$.

To get such an $r$ we have to apply the above argument on a machine $M'$ with exponentially vanishing error probability. Such a machine is guaranteed by Exercise 2. Namely, for every $x \in \{0,1\}^*$,

$$\mathrm{Prob}(M'(x) = \chi_L(x)) > 1 - 2^{-|x|}$$

Applying the argument now (with $m$ denoting the number of coin tosses of $M'$) we conclude that there exists an $r \in \{0,1\}^m$, denoted $r_n$, which is good for more than a $1 - 2^{-n}$ fraction of the $x \in \{0,1\}^n$. Since there are only $2^n$ inputs of length $n$, fewer than one of them can be bad, and it follows that $r_n$ is good for all the $2^n$ inputs of length $n$. Machine $M'$ (viewed as a deterministic two-input machine) together with the infinite sequence $r_1, r_2, \ldots$ constructed as above, demonstrates that $L$ is in P/poly.

Finally, let me mention a more convenient way of viewing non-uniform polynomial-time. This is via (non-uniform) families of polynomial-size Boolean circuits. A Boolean circuit is a directed acyclic graph with internal nodes marked by elements in $\{\wedge, \vee, \neg\}$. Nodes with no ingoing edges are called input nodes, and nodes with no outgoing edges are called output nodes. A node marked $\neg$ may have only one child. Computation in the circuit begins with placing input bits on the input nodes (one bit per node) and proceeds as follows. If the children of a node (of indegree $d$) marked $\wedge$ have values $v_1, v_2, \ldots, v_d$ then the node gets the value $\wedge_{i=1}^{d} v_i$. Similarly for nodes marked $\vee$ and $\neg$. The output of the circuit is read from its output nodes. The size of a circuit is the number of its edges. A polynomial-size circuit family is an infinite sequence of Boolean circuits, $C_1, C_2, \ldots$ such that, for every $n$, the circuit $C_n$ has $n$ input nodes and size $p(n)$, where $p(\cdot)$ is a polynomial (fixed for the entire family).

Clearly, the computation of a Turing machine $M$ on inputs of length $n$ can be simulated by a single circuit (with $n$ input nodes) having size $O((|\langle M \rangle| + n + t(n))^2)$, where $t(n)$ is a bound on the running time of $M$ on inputs of length $n$.
Thus, a non-uniform sequence of polynomial-time machines can be simulated by a non-uniform family of polynomial-size circuits. The converse is also true, as machines with polynomial description length can incorporate polynomial-size circuits and simulate their computations in polynomial-time. A nice feature of the circuit formulation is that there is no need to state the polynomiality requirement twice (once for size and once for time) as in the first formulation.

1.3.4 Intractability Assumptions

We will consider as intractable those tasks which cannot be performed by probabilistic polynomial-time machines. However, the adversarial tasks in which we will be interested (e.g., "breaking an encryption scheme", "forging signatures", etc.) can be performed by non-deterministic polynomial-time machines (since the solutions, once found, can be easily tested for validity). Thus, the computational approach to cryptography (and in particular most of the material in this book) is interesting only if NP is not contained in BPP (which certainly implies $\mathcal{P} \ne \mathcal{NP}$). We use the phrase "not interesting" (rather than "not valid") since all our statements will be of the form "if ⟨intractability assumption⟩ then ⟨useful consequence⟩". The statement remains valid even if $\mathcal{P} = \mathcal{NP}$ (or if just the ⟨intractability assumption⟩, which is never weaker than $\mathcal{P} \ne \mathcal{NP}$, is wrong), but in such a case the implication is of little interest (since everything is implied by a fallacy).

In most places where we state that "if ⟨intractability assumption⟩ then ⟨useful consequence⟩" it will be the case that ⟨useful consequence⟩ either implies ⟨intractability assumption⟩ or some weaker form of it, which in turn implies $\mathcal{NP} - \mathcal{BPP} \ne \emptyset$. Thus, in light of the current state of knowledge in complexity theory, one cannot hope to assert ⟨useful consequence⟩ without any intractability assumption.
In a few cases an assumption concerning the limitations of probabilistic polynomial-time machines (e.g., $\mathcal{BPP}$ does not contain $\mathcal{NP}$) will not suffice, and we will use instead an assumption concerning the limitations of non-uniform polynomial-time machines. Such an assumption is of course stronger. But the consequences in such a case will also be stronger, as they too will be phrased in terms of non-uniform complexity. However, since all our proofs are obtained by reductions, an implication stated in terms of probabilistic polynomial-time is stronger (than one stated in terms of non-uniform polynomial-time), and will be preferred unless it is either not known or too complicated. This is the case since a probabilistic polynomial-time reduction (proving the implication in its probabilistic formalization) always implies a non-uniform polynomial-time reduction (proving the statement in its non-uniform formalization), but the converse is not always true. (The current paragraph may be better understood in the future after seeing some concrete examples.)

Finally, we mention that intractability assumptions concerning worst-case complexity (e.g., $\mathcal{P} \neq \mathcal{NP}$) will not suffice, because we will not be satisfied with their corresponding consequences. Cryptographic schemes which are guaranteed only to be hard to break in the worst case are useless. A cryptographic scheme must be unbreakable in "most cases" (i.e., in the "typical case"), which implies that it is hard to break on the average. It follows that, since we are not able to prove that "worst-case intractability" implies an analogous "intractability for the average case" (such a result would be considered a breakthrough in complexity theory), our intractability assumption must concern average-case complexity.

1.3.5 Oracle Machines

The original utility of oracle machines in complexity theory is to capture notions of reducibility. In this book we use oracle machines for a different purpose altogether.
We use an oracle machine to model an adversary which may use a cryptosystem in the course of its attempt to break it.

Definition 1.7 A (deterministic/probabilistic) oracle machine is a (deterministic/probabilistic) Turing machine with an additional tape, called the oracle tape, and two special states, called oracle invocation and oracle appeared. The computation of the deterministic oracle machine $M$ on input $x$ and access to the oracle $f : \{0,1\}^* \mapsto \{0,1\}^*$ is defined by the successive configuration relation. For configurations with state different from "oracle invocation" the next configuration is defined as usual. Let $\gamma$ be a configuration in which the state is "oracle invocation" and the contents of the oracle tape is $q$. Then the configuration following $\gamma$ is identical to $\gamma$, except that the state is "oracle appeared" and the contents of the oracle tape is $f(q)$. The string $q$ is called $M$'s query and $f(q)$ is called the oracle reply. The computation of a probabilistic oracle machine is defined analogously.

We stress that the running time of an oracle machine is the number of steps made during its computation, and that the oracle's reply to each query is obtained in a single step.

1.4 Motivation to the Formal Treatment

It is indeed unfortunate that our formal treatment of the field of cryptography requires justification. Nevertheless, we prefer to address this (unjustified) requirement rather than ignore it. In the rest of this section we address three related issues:

1. the mere need for a formal treatment of the field
2. the practical meaning and/or consequences of the formal treatment
3. the "conservative" tendencies of the treatment.

Parts of this section may become clearer after reading any of Chapters 3 through 7.
1.4.1 The Need to Formalize Intuition

An abstract justification

We believe that one of the roles of science is to formulate our intuition about reality so that this intuition can be carefully examined, and consequently either be justified as sound or be rejected as false. Notably, there are many cases in which our initial intuition turns out to be correct, as well as many cases in which our initial intuition turns out to be wrong. The more we understand the discipline, the better our intuition becomes.

At this stage in history it would be very presumptuous to claim that we have good intuition about the nature of efficient computation. In particular, we do not even know the answer to a basic question such as whether $\mathcal{P}$ is strictly contained in $\mathcal{NP}$, let alone understand what makes one computational problem hard while a seemingly related computational problem is easy. Consequently, we should be extremely careful when making assertions about what can or cannot be efficiently computed. Unfortunately, making assertions about what can or cannot be efficiently computed is exactly what cryptography is all about. Not to mention that many of the problems of cryptography have a much more cumbersome and delicate description than what is usually standard in complexity theory. Hence, not only is there a need to formalize "intuition" in general, but the need is particularly acute in a field as sensitive as cryptography.

A concrete justification

Cryptography, as a discipline, is well-motivated. Consequently, cryptographic issues are being discussed by many researchers, engineers, and students. Unfortunately, most of these discussions are carried out without a precise definition of their subject matter. Instead it is implicitly assumed that the basic concepts of cryptography (e.g., secure encryption) are self-evident (since they are so intuitive), and that there is no need to present adequate definitions.
The fallacy of this assumption is demonstrated by the abundance of papers (not to mention private discussions) which derive and/or jump to wrong conclusions concerning security. In most cases these wrong conclusions can be traced back to implicit misconceptions regarding security, which could not have escaped the eyes of the authors had they been made explicit. We avoid listing all these cases here for several obvious reasons. Nevertheless, we mention one well-known example.

Around 1979, Ron Rivest claimed that no signature scheme that is "proven secure assuming the intractability of factoring" can resist a "chosen message attack". His argument was based on an implicit (and unjustified) assumption concerning the nature of a "proof of security (which assumes the intractability of factoring)". Consequently, for several years it was believed that one has to choose between having a signature scheme "proven to be unforgeable under the intractability of factoring" and having a signature scheme which resists a "chosen message attack". However, in 1984 Goldwasser, Micali and Rivest (himself) pointed out the fallacy on which Rivest's argument (of 1979) was based, and furthermore presented signature schemes which resist a "chosen message attack", under general assumptions. In particular, the intractability of factoring suffices for proving that there exists a signature scheme which resists "forgery", even under a "chosen message attack".

To summarize, the basic concepts of cryptography are indeed very intuitive, yet they are neither self-evident nor well-understood. Hence, we do not yet understand these issues well enough to be able to discuss them correctly without using precise definitions.

1.4.2 The Practical Consequences of the Formal Treatment

As customary in complexity theory, our treatment is presented in terms of asymptotic analysis of algorithms. This makes the statement of the results somewhat less cumbersome, but is not essential to the underlying ideas.
Hence, the results, although stated in an "abstract manner", lend themselves to concrete interpretations. To clarify the above statement we consider a generic example.

A typical result presented in this book relates two computational problems. The first problem is a simple computational problem which is assumed to be intractable (e.g., intractability of factoring), whereas the second problem consists of "breaking" a specific implementation of a useful cryptographic primitive (e.g., a specific encryption scheme). The abstract statement may assert that if integer factoring cannot be performed in polynomial-time then the encryption scheme is secure in the sense that it cannot be "broken" in polynomial-time. Typically, the statement is proven by a fixed polynomial-time reduction of integer factorization to the problem of breaking the encryption scheme. Hence, by working out the constants one can derive a statement of the following type: if factoring integers of X (say 300) decimal digits is infeasible in practice then the encryption scheme is secure in practice provided one uses a key of length Y (say 500) decimal digits. Actually, the statement will have to be more cumbersome so that it includes also the computing power of the real machines. Namely, if factoring integers of 300 decimal digits cannot be done using 1000 years of a Cray then the encryption scheme cannot be broken in 10 years by a Cray, provided one uses a key of length 500 decimal digits. We stress that the relation between the four parameters mentioned above can be derived from the reduction (used to prove the abstract statement). For most results these reductions yield a reasonable relation between the various parameters.
Consequently, all cryptographic primitives considered in this book (i.e., public and private-key encryption, signatures, zero-knowledge, pseudorandom generators, fault-tolerant protocols) can be implemented in practice based on reasonable intractability assumptions (such as the infeasibility of factoring 500 digit integers).

In a few cases, the reductions currently known do not yield practical consequences, since the "security parameter" (e.g., key length) in the derived cryptographic primitive has to be too large. In all these cases, the "impracticality" of the result is explicitly stated, and the reader is encouraged to try to provide a more efficient reduction that would have practical consequences. Hence, we do not consider these few cases as indicating a deficiency in our approach, but rather as important open problems.

1.4.3 The Tendency to be Conservative

When reaching the chapters in which cryptographic primitives are defined (specifically in Chapters 3 through 7), the reader may notice that we are unrealistically "conservative" in our definitions of security. In other words, we are unrealistically liberal in our definition of insecurity. Technically speaking, this tendency raises no problems since our primitives which are secure in a very strong sense are certainly secure also in the (more restricted) reasonable sense. Furthermore, we are able to implement such (strongly secure) primitives using reasonable intractability assumptions, and in most cases one can show that such assumptions are necessary even for much weaker (and in fact less than minimal) notions of security. Yet the reader may wonder why we choose to present definitions which seem stronger than what is required in practice.

The reason for our tendency to be conservative, when defining security, is that it is extremely difficult to capture exactly what is required in practice.
Furthermore, a certain level of security may be required in one application, whereas another level is required in a different application. It seems impossible to cover whatever can be required in all applications without taking our conservative approach. In the sequel we shall see how one can define security in a way covering all possible practical applications.

Chapter 2

Computational Difficulty

In this chapter we present several variants of the definition of one-way functions. In particular, we define strong and weak one-way functions. We prove that the existence of weak one-way functions implies the existence of strong ones. The proof provides a simple example of a case where a computational statement is much harder to prove than its "information theoretic analogue". Next, we define hard-core predicates, and prove that every one-way function "has" a hard-core predicate.

2.1 One-Way Functions: Motivation

As stated in the introduction chapter, modern cryptography is based on a gap between the efficient algorithms guaranteed for the legitimate user and the computational infeasibility of retrieving protected information for an adversary. To illustrate this, we concentrate on the cryptographic task of secure data communication, namely encryption schemes.

In secure encryption schemes, the legitimate user should be able to easily decipher the messages using some private information available to him, yet an adversary (not having this private information) should not be able to decrypt the ciphertext efficiently (i.e., in probabilistic polynomial-time). On the other hand, a non-deterministic machine can quickly decrypt the ciphertext (e.g., by guessing the private information). Hence, the existence of secure encryption schemes implies that there are tasks (e.g., "breaking" encryption schemes) that can be performed by non-deterministic polynomial-time machines, yet cannot be performed by deterministic (or even randomized) polynomial-time machines.
In other words, a necessary condition for the existence of secure encryption schemes is that $\mathcal{NP}$ is not contained in $\mathcal{BPP}$ (and thus $\mathcal{P} \neq \mathcal{NP}$).

Although $\mathcal{P} \neq \mathcal{NP}$ is a necessary condition it is not a sufficient one. $\mathcal{P} \neq \mathcal{NP}$ implies that the encryption scheme is hard to break in the worst case. It does not rule out the possibility that the encryption scheme is easy to break almost always. Indeed, one can construct "encryption schemes" for which the breaking problem is NP-complete, and yet there exists an efficient breaking algorithm that succeeds 99% of the time. Hence, worst-case hardness is a poor measure of security. Security requires hardness in most cases or at least "average-case hardness". A necessary condition for the existence of secure encryption schemes is thus the existence of languages in $\mathcal{NP}$ which are hard on the average. It is not known whether $\mathcal{P} \neq \mathcal{NP}$ implies the existence of languages in $\mathcal{NP}$ which are hard on the average.

The mere existence of problems (in $\mathcal{NP}$) which are hard on the average does not suffice either. In order to be able to use such hard-on-the-average problems we must be able to generate hard instances together with auxiliary information which enables these instances to be solved fast. Otherwise, these hard instances will be hard also for the legitimate users, and consequently the legitimate users gain no computational advantage over the adversary. Hence, the existence of secure encryption schemes implies the existence of an efficient way (i.e., a probabilistic polynomial-time algorithm) of generating instances with corresponding auxiliary input so that

1. it is easy to solve these instances given the auxiliary input, and
2. it is hard on the average to solve these instances (when not given the auxiliary input).

The above requirement is captured by the definition of one-way functions presented in the next section. For further details see Exercise 1.
2.2 One-Way Functions: Definitions

In this section, we present several definitions of one-way functions. The first version, hereafter referred to as strong one-way function (or just one-way function), is the most popular one. We also present weak one-way functions, non-uniformly one-way functions, and plausible candidates for such functions.

2.2.1 Strong One-Way Functions

Loosely speaking, a one-way function is a function which is easy to compute but hard to invert. The first condition is quite clear: saying that a function $f$ is easy to compute means that there exists a polynomial-time algorithm that on input $x$ outputs $f(x)$. The second condition requires more elaboration. Saying that a function $f$ is hard to invert means that every probabilistic polynomial-time algorithm trying, on input $y$, to find an inverse of $y$ under $f$, may succeed only with negligible (in $|y|$) probability. A sequence $\{s_n\}_{n \in \mathbb{N}}$ is called negligible in $n$ if for every polynomial $p(\cdot)$ and all sufficiently large $n$'s it holds that $s_n < \frac{1}{p(n)}$. Further discussion follows the definition.

Definition 2.1 (strong one-way functions): A function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called (strongly) one-way if the following two conditions hold:

1. easy to compute: There exists a (deterministic) polynomial-time algorithm, $A$, so that on input $x$ algorithm $A$ outputs $f(x)$ (i.e., $A(x) = f(x)$).

2. hard to invert: For every probabilistic polynomial-time algorithm, $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(f(U_n), 1^n) \in f^{-1}(f(U_n))\right) < \frac{1}{p(n)}$$

Recall that $U_n$ denotes a random variable uniformly distributed over $\{0,1\}^n$. Hence, the probability in the second condition is taken over all the possible values assigned to $U_n$ and all possible internal coin tosses of $A'$, with uniform probability distribution.

In addition to an input in the range of $f$, the inverting algorithm is also given the desired length of the output (in unary notation).
The main reason for this convention is to rule out the possibility that a function is considered one-way merely because the inverting algorithm does not have enough time to print the output. Consider for example the function $f_{\rm len}$ defined by $f_{\rm len}(x) = y$ where $y$ is the binary representation of the length of $x$ (i.e., $f_{\rm len}(x) = |x|$). Since $|f_{\rm len}(x)| = \log_2 |x|$, no algorithm can invert $f_{\rm len}(x)$ in time polynomial in $|f_{\rm len}(x)|$, yet there exists an obvious algorithm which inverts $f_{\rm len}(x)$ in time polynomial in $|x|$. In general, the auxiliary input $1^{|x|}$, provided in conjunction with the input $f(x)$, allows the inverting algorithm to run in time polynomial in the total length of the input and the desired output. Note that in the special case of length preserving functions $f$ (i.e., $|f(x)| = |x|$ for all $x$'s), the auxiliary input is redundant.

Hardness to invert is interpreted as an upper bound on the success probability of efficient inverting algorithms. The probability is measured with respect to both the random choices of the inverting algorithm and the distribution of the (main) input to this algorithm (i.e., $f(x)$). The input distribution to the inverting algorithm is obtained by applying $f$ to a uniformly selected $x \in \{0,1\}^n$. If $f$ induces a permutation on $\{0,1\}^n$ then the input to the inverting algorithm is uniformly distributed over $\{0,1\}^n$. However, in the general case where $f$ is not necessarily a one-to-one function, the input distribution to the inverting algorithm may differ substantially from the uniform one. In any case, it is required that the success probability, defined over the above probability space, is negligible (as a function of the length of $x$), where negligible means being bounded above by all functions of the form $\frac{1}{\mathrm{poly}(n)}$.

To further clarify the condition placed on the success probability, we consider the following examples. Consider an algorithm $A_1$ that on input $(y, 1^n)$ randomly selects and outputs a string of length $n$.
In case $f$ is a 1-1 function, we have

$$\mathrm{Prob}\left(A_1(f(U_n), 1^n) \in f^{-1}(f(U_n))\right) = \frac{1}{2^n}$$

since for every $x$ the probability that $A_1(f(x), 1^{|x|})$ equals $x$ is exactly $2^{-n}$. Hence, the success probability of $A_1$ on any 1-1 function is negligible. On the other hand, for every function $f$, the success probability of $A_1$ on input $f(U_n)$ is never zero (specifically it is at least $2^{-n}$). In case $f$ is constant over strings of the same length (e.g., $f(x) = 0^{|x|}$), we have

$$\mathrm{Prob}\left(A_1(f(U_n), 1^n) \in f^{-1}(f(U_n))\right) = 1$$

since every $x \in \{0,1\}^n$ is a preimage of $0^n$ under $f$. It follows that a one-way function cannot be constant on strings of the same length. Another trivial algorithm, denoted $A_2$, is one that computes a function which is constant on all inputs of the same length (e.g., $A_2(y, 1^n) = 1^n$). For every function $f$ we have

$$\mathrm{Prob}\left(A_2(f(U_n), 1^n) \in f^{-1}(f(U_n))\right) \geq \frac{1}{2^n}$$

(with equality in case $f(1^n)$ has a single preimage under $f$). Hence, the success probability of $A_2$ on any 1-1 function is negligible. On the other hand, if $\mathrm{Prob}(f(U_n) = f(1^n))$ is non-negligible then so is the success probability of algorithm $A_2$.

A few words concerning the notion of negligible probability are in order. The above definition and discussion considers the success probability of an algorithm to be negligible if, as a function of the input length, the success probability is bounded above by every polynomial fraction. It follows that repeating the algorithm polynomially (in the input length) many times yields a new algorithm that also has a negligible success probability. In other words, events which occur with negligible (in $n$) probability remain negligible even if the experiment is repeated polynomially (in $n$) many times. Hence, defining negligible success as "occurring with probability smaller than any polynomial fraction" is naturally coupled with defining feasible as "computed within polynomial time".
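The success probabilities of the two trivial inverters $A_1$ and $A_2$ discussed above can be checked by exhaustive enumeration for small $n$. The following is a minimal sketch under stated assumptions: the toy functions `identity`, `all_zeros` and `drop_last_bit`, as well as the helper names, are hypothetical and chosen only for illustration, and the auxiliary input $1^n$ is represented implicitly by the parameter `n`.

```python
from fractions import Fraction
from itertools import product


def strings(n):
    """All binary strings of length n."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def success_a1(f, n):
    """Exact success probability of A1, which outputs a uniformly chosen
    n-bit string, over both x drawn from U_n and A1's own random guess."""
    xs = strings(n)
    hits = sum(1 for x in xs for guess in xs if f(guess) == f(x))
    return Fraction(hits, len(xs) ** 2)


def success_a2(f, n):
    """Exact success probability of A2, which always outputs 1^n."""
    xs = strings(n)
    target = f("1" * n)
    return Fraction(sum(1 for x in xs if f(x) == target), len(xs))


def identity(x):
    """A 1-1 toy function."""
    return x


def all_zeros(x):
    """A toy function that is constant on strings of the same length."""
    return "0" * len(x)


def drop_last_bit(x):
    """A 2-to-1 toy function: the last bit of x is overwritten by 0."""
    return x[:-1] + "0"
```

With $n = 4$, both inverters succeed on `identity` with probability exactly $2^{-4} = 1/16$ and on `all_zeros` with probability 1, matching the calculations in the text. On the 2-to-1 function `drop_last_bit` each succeeds with probability $1/8$, which is strictly above the $2^{-n}$ lower bound.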
A "strong negation" of the notion of a negligible fraction/probability is the notion of a non-negligible fraction/probability. We say that a function $\nu$ is non-negligible if there exists a polynomial $p(\cdot)$ such that for all sufficiently large $n$'s it holds that $\nu(n) > \frac{1}{p(n)}$. Note that functions may be neither negligible nor non-negligible.

2.2.2 Weak One-Way Functions

One-way functions as defined above are one-way in a very strong sense. Namely, any efficient inverting algorithm has negligible success in inverting them. A much weaker definition, presented below, only requires that every efficient inverting algorithm fails with some non-negligible probability.

Definition 2.2 (weak one-way functions): A function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called weakly one-way if the following two conditions hold:

1. easy to compute: as in the definition of strong one-way function.

2. slightly-hard to invert: There exists a polynomial $p(\cdot)$ such that for every probabilistic polynomial-time algorithm, $A'$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(f(U_n), 1^n) \notin f^{-1}(f(U_n))\right) > \frac{1}{p(n)}$$

2.2.3 Two Useful Length Conventions

In the sequel it will be convenient to use the following two conventions regarding the length of the preimages and images of a one-way function. In the current subsection we justify the use of these conventions.

One-way functions defined only for some lengths

In many cases it is more convenient to consider one-way functions whose domain is only a part of the set of all strings. In particular, this facilitates the introduction of some structure in the domain of the function. A particularly important case, used throughout the rest of this section, is that of functions with domain $\bigcup_{n \in \mathbb{N}} \{0,1\}^{p(n)}$, where $p(\cdot)$ is some polynomial.

Let $I \subseteq \mathbb{N}$, and denote by $s_I(n)$ the successor of $n$ with respect to $I$; namely, $s_I(n)$ is the smallest integer that is both greater than $n$ and in the set $I$ (i.e., $s_I(n) \stackrel{\rm def}{=} \min\{i \in I : i > n\}$).
A set $I \subseteq \mathbb{N}$ is called polynomial-time enumerable if there exists an algorithm that on input $n$, halts within $\mathrm{poly}(n)$ steps and outputs $s_I(n)$. Let $I$ be a polynomial-time enumerable set and $f$ be a function with domain $\bigcup_{n \in I} \{0,1\}^n$. We call $f$ strongly (resp. weakly) one-way on lengths in $I$ if $f$ is polynomial-time computable and is hard to invert over $n$'s in $I$. Such one-way functions can be easily modified into functions with the set of all strings as domain, while preserving one-wayness and some other properties of the original function. In particular, for any function $f$ with domain $\bigcup_{n \in I} \{0,1\}^n$, we can construct a function $g : \{0,1\}^* \mapsto \{0,1\}^*$ by letting

$$g(x) \stackrel{\rm def}{=} f(x')$$

where $x'$ is the longest prefix of $x$ with length in $I$. (In case the function $f$ is length preserving, see definition below, we can preserve this property by modifying the construction so that $g(x) \stackrel{\rm def}{=} f(x')x''$ where $x = x'x''$, and $x'$ is the longest prefix of $x$ with length in $I$. The following proposition remains valid also in this case, with a minor modification in the proof.)

Proposition 2.3: Let $I$ be a polynomial-time enumerable set, and $f$ be strongly (resp. weakly) one-way on lengths in $I$. Then $g$ (constructed above) is strongly (resp. weakly) one-way (in the ordinary sense).

Although the validity of the above proposition is very appealing, we urge the reader not to skip the following proof. The proof, which is indeed quite simple, uses for the first time in this book an argument that is used extensively in the sequel. The argument used to prove the "hardness to invert" property of the function $g$ proceeds by assuming, towards a contradiction, that $g$ can be efficiently inverted with unallowable success probability. A contradiction is derived by deducing that $f$ can be efficiently inverted with unallowable success probability. In other words, inverting $f$ is "reduced" to inverting $g$.
The term "reduction" is used here in a non-standard sense, which preserves the success probability of the algorithms. This kind of argument is called a reducibility argument.

Proof: We first prove that $g$ can be computed in polynomial-time. To this end we use the fact that $I$ is a polynomial-time enumerable set. It follows that on input $x$ one can find in polynomial-time the largest $m \leq |x|$ that satisfies $m \in I$. Computing $g(x)$ amounts to finding this $m$, and applying the function $f$ to the $m$-bit prefix of $x$.

We next prove that $g$ maintains the "hardness to invert" property of $f$. For the sake of concreteness we present here only the proof for the case that $f$ is strongly one-way. The proof for the case that $f$ is weakly one-way is analogous.

The proof proceeds by contradiction. We assume, contrary to the claim (of the proposition), that there exists an efficient algorithm that inverts $g$ with success probability that is not negligible. We use this inverting algorithm (for $g$) to construct an efficient algorithm that inverts $f$ with success probability that is not negligible, hence deriving a contradiction (to the hypothesis of the proposition). In other words, we show that inverting $f$ (with unallowable success probability) is efficiently reducible to inverting $g$ (with unallowable success probability), and hence conclude that the latter is not feasible. The reduction is based on the observation that inverting $g$ on images of arbitrary length yields inverting $g$ also on images of length in $I$, and that on such lengths $g$ coincides with $f$. Details follow.

Given an algorithm, $B'$, for inverting $g$ we construct an algorithm, $A'$, for inverting $f$ so that $A'$ has complexity and success probability related to that of $B'$. Algorithm $A'$ uses algorithm $B'$ as a subroutine and proceeds as follows. On input $y$ and $1^n$ (supposedly $y$ is in the range of $f(U_n)$ and $n \in I$) algorithm $A'$ first computes $s_I(n)$ and sets $k \stackrel{\rm def}{=} s_I(n) - n - 1$.
For every $0 \leq i \leq k$, algorithm $A'$ initiates algorithm $B'$, on input $(y, 1^{n+i})$, obtaining $z_i \leftarrow B'(y, 1^{n+i})$, and checks if $g(z_i) = y$. In case one of the $z_i$'s satisfies the above condition, algorithm $A'$ outputs the $n$-bit long prefix of $z_i$. This prefix is in the preimage of $y$ under $f$ (since $g(x'x'') = f(x')$ for all $x' \in \{0,1\}^n$ and $|x''| \leq k$). Clearly, if $B'$ is a probabilistic polynomial-time algorithm then so is $A'$. We now analyze the success probability of $A'$ (showing that if $B'$ inverts $g$ with unallowable success probability then $A'$ inverts $f$ with unallowable success probability).

Suppose now, contrary to our claim, that $g$ is not strongly one-way, and let $B'$ be an algorithm demonstrating this contradiction hypothesis. Namely, there exists a polynomial $p(\cdot)$ so that for infinitely many $m$'s the probability that $B'$ inverts $g$ on $g(U_m)$ is at least $\frac{1}{p(m)}$. Let us denote the set of these $m$'s by $M$. Define a function $\ell_I : \mathbb{N} \mapsto I$ so that $\ell_I(m)$ is the largest lower bound of $m$ in $I$ (i.e., $\ell_I(m) \stackrel{\rm def}{=} \max\{i \in I : i \leq m\}$). Clearly, $m \leq s_I(\ell_I(m)) - 1$ for every $m$. The following two claims relate the success probability of algorithm $A'$ with that of algorithm $B'$.

Claim 2.3.1: Let $m$ be an integer and $n = \ell_I(m)$. Then

$$\mathrm{Prob}\left(A'(f(U_n), 1^n) \in f^{-1}(f(U_n))\right) \geq \mathrm{Prob}\left(B'(g(U_m), 1^m) \in g^{-1}(g(U_m))\right)$$

(Namely, the success probability of algorithm $A'$ on $f(U_{\ell_I(m)})$ is bounded below by the success probability of algorithm $B'$ on $g(U_m)$.)

Proof: By construction of $A'$, on input $(f(x'), 1^n)$, where $x' \in \{0,1\}^n$, algorithm $A'$ obtains the value $B'(f(x'), 1^t)$, for every $n \leq t \leq s_I(n) - 1$. In particular, since $m \leq s_I(\ell_I(m)) - 1 = s_I(n) - 1$, it follows that algorithm $A'$ obtains the value $B'(f(x'), 1^m)$. By definition of $g$, for all $x'' \in \{0,1\}^{m-n}$, it holds that $f(x') = g(x'x'')$. The claim follows. $\Box$

Claim 2.3.2: There exists a polynomial $q(\cdot)$ such that $m < q(\ell_I(m))$, for all $m$'s. Hence, the set $S \stackrel{\rm def}{=} \{\ell_I(m) : m \in M\}$ is infinite.
Proof: Using the polynomial-time enumerability of $I$, we get $s_I(n) < \mathrm{poly}(n)$, for every $n$. Therefore, for every $m$, we have $m < s_I(\ell_I(m)) < \mathrm{poly}(\ell_I(m))$. Furthermore, $S$ must be infinite, since otherwise, for $n$ upper-bounding $S$, we get $m < q(n)$ for every $m \in M$, contradicting the fact that $M$ is infinite. $\Box$

Using Claims 2.3.1 and 2.3.2, it follows that, for every $n = \ell_I(m) \in S$, the probability that $A'$ inverts $f$ on $f(U_n)$ is at least $\frac{1}{p(m)} > \frac{1}{p(q(n))} = \frac{1}{\mathrm{poly}(n)}$. It follows that $f$ is not strongly one-way, in contradiction to the proposition's hypothesis.

Length-regular and length-preserving one-way functions

A second useful convention is to assume that the function, $f$, we consider is length regular in the sense that, for every $x, y \in \{0,1\}^*$, if $|x| = |y|$ then $|f(x)| = |f(y)|$. We point out that the transformation presented above preserves length regularity. A special case of length regularity, preserved by the modified transformation presented above, is that of length preserving functions.

Definition 2.4 (length preserving functions): A function $f$ is length preserving if for every $x \in \{0,1\}^*$ it holds that $|f(x)| = |x|$.

Given a strongly (resp. weakly) one-way function $f$, we can construct a strongly (resp. weakly) one-way function $h$ which is length preserving, as follows. Let $p$ be a polynomial bounding the length expansion of $f$ (i.e., $|f(x)| \leq p(|x|)$). Such a polynomial must exist since $f$ is polynomial-time computable. We first construct a length regular function $g$ by defining

$$g(x) \stackrel{\rm def}{=} f(x)10^{p(|x|) - |f(x)|}$$

(We use a padding of the form $10^*$ in order to facilitate the parsing of $g(x)$ into $f(x)$ and the "leftover" padding.) Next, we define $h$ only on strings of length $p(n) + 1$, for $n \in \mathbb{N}$, by letting

$$h(x'x'') \stackrel{\rm def}{=} g(x'), \quad \mbox{where } |x'x''| = p(|x'|) + 1$$

Clearly, $h$ is length preserving.

Proposition 2.5: If $f$ is a strongly (resp. weakly) one-way function then so are $g$ and $h$ (constructed above).

Proof Sketch: It is quite easy to see that both $g$ and $h$ are polynomial-time computable.
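The padding construction of $g$ and $h$ can be sketched as follows. This is only an illustration under stated assumptions: the function `toy_f` and the bound `toy_p` are hypothetical stand-ins (the toy function is of course not one-way), the helper names are not from the text, and the search for $n$ assumes $p(n) \geq n$.

```python
def pad_to_regular(f, p, x):
    """g(x) = f(x) 1 0^(p(|x|) - |f(x)|); its length is always p(|x|) + 1."""
    y = f(x)
    slack = p(len(x)) - len(y)
    if slack < 0:
        raise ValueError("p does not bound the length of f(x)")
    return y + "1" + "0" * slack


def unpad(z):
    """Parse g(x) back into f(x): drop the trailing zeros and the marker 1."""
    return z[: z.rindex("1")]


def length_preserving(f, p, x):
    """h(x'x'') = g(x'), defined only when |x'x''| = p(|x'|) + 1."""
    # Assumes p(n) >= n, so the right n is at most len(x).
    n = next((n for n in range(len(x) + 1) if p(n) + 1 == len(x)), None)
    if n is None:
        raise ValueError("h is undefined on inputs of this length")
    return pad_to_regular(f, p, x[:n])


def toy_f(x):
    """Hypothetical stand-in for f: output length varies between n and 2n."""
    return x + x.rstrip("0")


def toy_p(n):
    """A polynomial bounding the output length of toy_f."""
    return 2 * n
```

For example, on the 5-bit input `"10100"` we have $n = 2$ and $x' = $ `"10"`, so `length_preserving(toy_f, toy_p, "10100")` returns the 5-bit string `"10110"`, from which `unpad` recovers `toy_f("10")`.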
Using "reducibility arguments" analogous to the one used in the previous proof, we can establish the hardness-to-invert of both $g$ and $h$. For example, given an algorithm $B'$ for inverting $g$, we construct an algorithm $A'$ for inverting $f$ as follows. On input $y$ and $1^n$ (supposedly $y$ is in the range of $f(U_n)$), algorithm $A'$ halts with output $B'(y10^{p(n)-|y|}, 1^{p(n)+1})$.

The reader can easily verify that if $f$ is length preserving then it is redundant to provide the inverting algorithm with the auxiliary input $1^{|x|}$ (in addition to $f(x)$). The same holds if $f$ is length regular and does not shrink its input by more than a polynomial factor (i.e., there exists a polynomial $p(\cdot)$ such that $p(|f(x)|) \geq |x|$ for all $x$). In the sequel, we will only deal with one-way functions that are length regular and do not shrink their input by more than a polynomial factor. Furthermore, we will mostly deal with length preserving functions. Hence, in these cases, we assume, without loss of generality, that the inverting algorithm is only given $f(x)$ as input.

Functions which are length preserving are not necessarily 1-1. Furthermore, the assumption that 1-1 one-way functions exist seems stronger than the assumption that arbitrary (and hence length preserving) one-way functions exist. For further discussion see Section 2.4.

2.2.4 Candidates for One-Way Functions

Following are several candidates for one-way functions. Clearly, it is not known whether these functions are indeed one-way. This is only a conjecture supported by extensive research which has so far failed to produce an efficient inverting algorithm (having non-negligible success probability).

Integer factorization

In spite of the extensive research directed towards the construction of efficient (integer) factoring algorithms, the best algorithms known for factoring an integer $N$ run in time $L(P) \stackrel{\rm def}{=} 2^{O(\sqrt{\log P \log\log P})}$, where $P$ is the second biggest prime factor of $N$.
Hence it is reasonable to believe that the function $f_{\rm mult}$, which partitions its input string into two parts and returns the (binary representation of the) integer resulting by multiplying (the integers represented by) these parts, is one-way. Namely, let

$$f_{\rm mult}(x, y) = x \cdot y$$

where $|x| = |y|$ and $x \cdot y$ denotes (the string representing) the integer resulting by multiplying the integers (represented by the strings) $x$ and $y$. Clearly, $f_{\rm mult}$ can be computed in polynomial-time. Assuming the intractability of factoring and using the "density of primes" theorem (which guarantees that at least $\frac{N}{\log_2 N}$ of the integers smaller than $N$ are primes) it follows that $f_{\rm mult}$ is at least weakly one-way. Using a more sophisticated argument, one can show that $f_{\rm mult}$ is strongly one-way. Other popular functions (e.g., the RSA) related to integer factorization are discussed in Subsection 2.4.3.

Decoding of random linear codes

One of the most outstanding open problems in the area of error correcting codes is that of presenting efficient decoding algorithms for random linear codes. Of particular interest are random linear codes with constant information rate which can correct a constant fraction of errors. An $(n, k, d)$-linear-code is a $k$-by-$n$ binary matrix in which the vector sum (mod 2) of any non-empty subset of rows results in a vector with at least $d$ one-entries. (A $k$-bit long message is encoded by multiplying it with the $k$-by-$n$ matrix, and the resulting $n$-bit long vector has a unique preimage even when flipping up to $\frac{d}{2}$ of its entries.) The Gilbert-Varshamov Bound for linear codes guarantees the existence of such a code, provided that $\frac{k}{n} < 1 - H_2(\frac{d}{n})$, where $H_2(p) \stackrel{\rm def}{=} -p \log_2 p - (1-p)\log_2(1-p)$ if $p < \frac{1}{2}$ and $H_2(p) \stackrel{\rm def}{=} 1$ otherwise (i.e., $H_2(\cdot)$ is a modification of the binary entropy function). Similarly, if for some $\epsilon > 0$ it holds that $\frac{k}{n} < 1 - H_2(\frac{(1+\epsilon)d}{n})$ then almost all $k$-by-$n$ binary matrices constitute $(n, k, d)$-linear-codes.
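The two rate conditions above are easy to evaluate numerically. The following is a minimal sketch, assuming floating-point arithmetic is accurate enough for illustration; the function names `h2` and `gv_condition` are hypothetical, and the value $H_2(0) = 0$ is the usual limiting convention rather than something stated in the text.

```python
import math


def h2(p):
    """Modified binary entropy H_2 from the text: the usual binary entropy
    for p < 1/2, and 1 otherwise. H_2(0) = 0 is a limiting convention."""
    if p >= 0.5:
        return 1.0
    if p <= 0.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def gv_condition(n, k, d, eps=0.0):
    """Check k/n < 1 - H_2((1 + eps) * d / n).

    With eps = 0 this is the existence condition of the Gilbert-Varshamov
    bound; with eps > 0 it is the condition under which almost all
    k-by-n binary matrices are (n, k, d)-linear-codes.
    """
    return k / n < 1 - h2((1 + eps) * d / n)
```

For instance, `gv_condition(100, 10, 25)` holds, since $H_2(1/4) \approx 0.811$ leaves room for rate $0.1$, while `gv_condition(100, 30, 25)` does not.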
Consider three constants $\kappa, \delta, \epsilon > 0$ satisfying $\kappa < 1 - H_2((1+\epsilon)\delta)$. The function $f_{\rm code}$, hereafter defined, seems a plausible candidate for a one-way function.

$$f_{\rm code}(C, x, i) \stackrel{\rm def}{=} (C, xC + e(i))$$

where $C$ is a $\kappa n$-by-$n$ binary matrix, $x$ is a $\kappa n$-dimensional binary vector, $i$ is the index of an $n$-dimensional binary vector having at most $\frac{\delta n}{2}$ one-entries (the string itself is denoted $e(i)$), and the arithmetic is in the $n$-dimensional binary vector space. Clearly, $f_{\rm code}$ is polynomial-time computable. An efficient algorithm for inverting $f_{\rm code}$ would yield an efficient algorithm for decoding a non-negligible fraction of the linear codes (an earthshaking result in coding theory).

The subset sum problem

Consider the function $f_{\rm ss}$ defined as follows.

$$f_{\rm ss}(x_1, \ldots, x_n, I) = \Big(x_1, \ldots, x_n, \sum_{i \in I} x_i\Big)$$

where $|x_1| = \cdots = |x_n| = n$, and $I \subseteq \{1, 2, \ldots, n\}$. Clearly, $f_{\rm ss}$ is polynomial-time computable. The fact that the subset-sum problem is NP-complete cannot serve as evidence to the one-wayness of $f_{\rm ss}$. On the other hand, the fact that the subset-sum problem is easy for special cases (such as having "hidden structure" and/or "low density") cannot serve as evidence for the weakness of this proposal. The conjecture that $f_{\rm ss}$ is one-way is based on the failure of known algorithms to handle random "high density" instances (i.e., instances in which the length of the elements is not greater than their number). Yet, one has to admit that the evidence in favour of this candidate is much weaker than the evidence in favour of the two previous ones.

2.2.5 Non-Uniformly One-Way Functions

In the above two definitions of one-way functions the inverting algorithm is probabilistic polynomial-time. Stronger versions of both definitions require that the functions cannot be inverted even by non-uniform families of polynomial-size circuits. We stress that the "easy to compute" condition is still stated in terms of uniform algorithms.
For example, following is a non-uniform version of the definition of strong one-way functions.

Definition 2.6 (non-uniformly strong one-way functions): A function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called non-uniformly one-way if the following two conditions hold

1. easy to compute: There exists a (deterministic) polynomial-time algorithm, $A$, so that on input $x$ algorithm $A$ outputs $f(x)$ (i.e., $A(x) = f(x)$).

2. hard to invert: For every (even non-uniform) family of polynomial-size circuits, $\{C_n\}_{n \in \mathbb{N}}$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(C_n(f(U_n)) \in f^{-1}f(U_n)\right) < \frac{1}{p(n)}$$

The probability in the second condition is taken only over all the possible values of $U_n$. Note that it is redundant to give $1^n$ as an auxiliary input to $C_n$.

It can be shown that if $f$ is non-uniformly one-way then it is one-way (i.e., in the uniform sense). The proof follows by converting any (uniform) probabilistic polynomial-time inverting algorithm into a non-uniform family of polynomial-size circuits, without decreasing the success probability. Details follow. Let $A'$ be a probabilistic polynomial-time (inverting) algorithm. Let $r_n$ denote a sequence of coin tosses for $A'$ maximizing the success probability of $A'$. Namely, $r_n$ satisfies $\mathrm{Prob}(A'_{r_n}(f(U_n)) \in f^{-1}f(U_n)) \ge \mathrm{Prob}(A'(f(U_n)) \in f^{-1}f(U_n))$, where the first probability is taken only over all possible values of $U_n$ and the second probability is also over all possible coin tosses for $A'$. (Recall that $A'_r(y)$ denotes the output of algorithm $A'$ on input $y$ and internal coin tosses $r$.) The desired circuit $C_n$ incorporates the code of algorithm $A'$ and the sequence $r_n$ (which is of length polynomial in $n$).

It is possible that one-way functions exist (in the uniform sense) and yet there are no non-uniformly one-way functions. However, such a possibility is considered not very plausible.
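The choice of $r_n$ in the argument above is an averaging step: some fixed coin sequence must do at least as well as a random one. The sketch below illustrates this by exhaustive search on a toy example. All names (`fix_best_coins`, `toy_f`, `guessing_inverter`) are hypothetical, the toy function is of course not one-way, and the search over all coin sequences takes exponential time, which is exactly why $r_n$ is given to the circuit as non-uniform advice rather than computed.

```python
from itertools import product


def fix_best_coins(f, inverter, n, coin_len):
    """Find a coin sequence r maximizing the fraction of n-bit inputs x
    for which inverter(f(x), r) is a preimage of f(x).

    Returns (best_r, best_rate, average_rate); best_rate >= average_rate
    always holds, which is the averaging argument used in the text.
    """
    inputs = list(product((0, 1), repeat=n))
    coin_strings = list(product((0, 1), repeat=coin_len))

    def success_rate(r):
        hits = sum(1 for x in inputs if f(inverter(f(x), r)) == f(x))
        return hits / len(inputs)

    rates = {r: success_rate(r) for r in coin_strings}
    best = max(rates, key=rates.get)
    average = sum(rates.values()) / len(coin_strings)
    return best, rates[best], average


def toy_f(x):
    # Toy stand-in for f: the AND of two bits (an assumption for illustration).
    return (x[0] & x[1],)


def guessing_inverter(y, r):
    # A deliberately naive randomized "inverter": it ignores y and outputs its coins.
    return r


best, best_rate, average = fix_best_coins(toy_f, guessing_inverter, n=2, coin_len=2)
```

In this toy run a random coin sequence succeeds with probability 0.625 on average, while the best fixed sequence succeeds with probability 0.75.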
2.3 Weak One-Way Functions Imply Strong Ones

We first remark that not every weak one-way function is necessarily a strong one. Consider for example a one-way function $f$ (which without loss of generality is length preserving). Modify $f$ into a function $g$ so that $g(x, p) = (f(x), p)$ if $p$ starts with $\log_2 |x|$ zeros and $g(x, p) = (x, p)$ otherwise, where (in both cases) $|x| = |p|$. We claim that $g$ is a weak one-way function but not a strong one. Clearly, $g$ cannot be a strong one-way function (since for all but a $\frac{1}{n}$ fraction of the strings of length $2n$ the function $g$ coincides with the identity function). To prove that $g$ is weakly one-way we use a "reducibility argument". Details follow.

Proposition 2.7: Let $f$ be a one-way function (even in the weak sense). Then $g$, constructed above, is a weakly one-way function.

Proof: Given a probabilistic polynomial-time algorithm, $B'$, for inverting $g$, we construct a probabilistic polynomial-time algorithm $A'$ which inverts $f$ with "related" success probability. Following is the description of algorithm $A'$. On input $y$, algorithm $A'$ sets $n \stackrel{\rm def}{=} |y|$ and $l \stackrel{\rm def}{=} \log_2 n$, selects $p'$ uniformly in $\{0,1\}^{n-l}$, computes $z \stackrel{\rm def}{=} B'(y, 0^l p')$, and halts with output the $n$-bit prefix of $z$. Let $S_{2n}$ denote the set of all $2n$-bit long strings which start with $\log_2 n$ zeros (i.e., $S_{2n} \stackrel{\rm def}{=} \{0^{\log_2 n}\alpha : \alpha \in \{0,1\}^{2n - \log_2 n}\}$). Then, by construction of $A'$ and $g$, we have

$$\begin{aligned}
\mathrm{Prob}\left(A'(f(U_n)) \in f^{-1}f(U_n)\right)
&\ge \mathrm{Prob}\left(B'(f(U_n), 0^l U_{n-l}) \in (f^{-1}f(U_n), 0^l U_{n-l})\right) \\
&= \mathrm{Prob}\left(B'(g(U_{2n})) \in g^{-1}g(U_{2n}) \mid U_{2n} \in S_{2n}\right) \\
&\ge \frac{\mathrm{Prob}\left(B'(g(U_{2n})) \in g^{-1}g(U_{2n})\right) - \mathrm{Prob}\left(U_{2n} \notin S_{2n}\right)}{\mathrm{Prob}\left(U_{2n} \in S_{2n}\right)} \\
&= n \cdot \left(\mathrm{Prob}\left(B'(g(U_{2n})) \in g^{-1}g(U_{2n})\right) - \left(1 - \frac{1}{n}\right)\right) \\
&= 1 - n \cdot \left(1 - \mathrm{Prob}\left(B'(g(U_{2n})) \in g^{-1}g(U_{2n})\right)\right)
\end{aligned}$$

(For the second inequality, we used $\mathrm{Prob}(A \mid B) = \frac{\mathrm{Prob}(A \cap B)}{\mathrm{Prob}(B)}$ and $\mathrm{Prob}(A \cap B) \ge \mathrm{Prob}(A) - \mathrm{Prob}(\neg B)$.) It should not come as a surprise that the above expression is meaningful only in case $\mathrm{Prob}(B'(g(U_{2n})) \in g^{-1}g(U_{2n})) > 1 - \frac{1}{n}$.
It follows that, for every polynomial $p(\cdot)$ and every integer $n$, if $B'$ inverts $g$ on $g(U_{2n})$ with probability greater than $1 - \frac{1}{p(2n)}$ then $A'$ inverts $f$ on $f(U_n)$ with probability greater than $1 - \frac{n}{p(2n)}$. Hence, if $g$ is not weakly one-way (i.e., for every polynomial $p(\cdot)$ there exist infinitely many $m$'s such that $g$ can be inverted on $g(U_m)$ with probability at least $1 - 1/p(m)$) then also $f$ is not weakly one-way (i.e., for every polynomial $q(\cdot)$ there exist infinitely many $n$'s such that $f$ can be inverted on $f(U_n)$ with probability at least $1 - 1/q(n)$). This contradicts our hypothesis (that $f$ is one-way).

We have just shown that, unless no one-way functions exist, there exist weak one-way functions which are not strong ones. Fortunately, we can rule out the possibility that all one-way functions are only weak ones. In particular, the existence of weak one-way functions implies the existence of strong ones.

Theorem 2.8: Weak one-way functions exist if and only if strong one-way functions exist.

We strongly recommend that the reader not skip the following proof, since we believe that the proof is very instructive for the rest of the book. In particular, the proof demonstrates that amplification of computational difficulty is much more involved than amplification of an analogous probabilistic event.

Proof: Let $f$ be a weak one-way function, and let $p$ be the polynomial guaranteed by the definition of a weak one-way function. Namely, every probabilistic polynomial-time algorithm fails to invert $f$ on $f(U_n)$ with probability at least $\frac{1}{p(n)}$. We assume, for simplicity, that $f$ is length preserving (i.e., $|f(x)| = |x|$ for all $x$'s). This assumption, which is not really essential, is justified by Proposition 2.5. We define a function $g$ as follows

$$g(x_1, \ldots, x_{t(n)}) \stackrel{\rm def}{=} f(x_1), \ldots, f(x_{t(n)})$$

where $|x_1| = \cdots = |x_{t(n)}| = n$ and $t(n) \stackrel{\rm def}{=} n \cdot p(n)$. Namely, the $n^2 p(n)$-bit long input of $g$ is partitioned into $t(n)$ blocks each of length $n$, and $f$ is applied to each block.
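The block-wise construction of $g$ just defined can be sketched in a few lines. This is only an illustration under stated assumptions: bit strings are represented as Python `str` values over `"0"` and `"1"`, the names `direct_product` and `toy_f` are hypothetical, `toy_f` (string reversal) is a length-preserving placeholder that is certainly not one-way, and the value used for $p(n)$ is an arbitrary choice.

```python
def direct_product(f, n, t):
    """Return g with g(x_1 ... x_t) = f(x_1) ... f(x_t), where each x_i
    is an n-bit block and f is assumed to be length preserving."""
    def g(x):
        if len(x) != n * t:
            raise ValueError("input must consist of t blocks of n bits each")
        blocks = [x[i * n:(i + 1) * n] for i in range(t)]
        return "".join(f(block) for block in blocks)
    return g


def toy_f(block):
    # Placeholder for the weak one-way function f (NOT one-way; illustration only).
    return block[::-1]


n = 2
p_of_n = 2                      # hypothetical value of the polynomial p at n
g = direct_product(toy_f, n, n * p_of_n)   # t(n) = n * p(n) blocks
```

Here `g("01101100")` applies `toy_f` to the blocks `01`, `10`, `11`, `00` and returns `"10011100"`.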
Clearly, $g$ can be computed in polynomial-time (by an algorithm which breaks the input into blocks and applies $f$ to each block). Furthermore, it is easy to see that inverting $g$ on $g(x_1, \ldots, x_{t(n)})$ requires finding a preimage to each $f(x_i)$. One may be tempted to deduce that it is also clear that $g$ is a strongly one-way function. A naive argument assumes implicitly (with no justification) that the inverting algorithm works separately on each $f(x_i)$. If this were indeed the case then the probability that an inverting algorithm successfully inverts all $f(x_i)$'s is at most $(1 - \frac{1}{p(n)})^{n \cdot p(n)} < 2^{-n}$ (which is negligible also as a function of $n^2 p(n)$). However, the assumption that an algorithm trying to invert $g$ works independently on each $f(x_i)$ cannot be justified. Hence, a more complex argument is required.

Following is an outline of our proof. The proof that $g$ is strongly one-way proceeds by a contradiction argument. We assume on the contrary that $g$ is not strongly one-way; namely, we assume that there exists a polynomial-time algorithm that inverts $g$ with probability which is not negligible. We derive a contradiction by presenting a polynomial-time algorithm which, for infinitely many $n$'s, inverts $f$ on $f(U_n)$ with probability greater than $1 - \frac{1}{p(n)}$ (in contradiction to our hypothesis). The inverting algorithm for $f$ uses the inverting algorithm for $g$ as a subroutine (without assuming anything about the manner in which the latter algorithm operates). Details follow.

Suppose that $g$ is not strongly one-way. By definition, it follows that there exists a probabilistic polynomial-time algorithm $B'$ and a polynomial $q(\cdot)$ so that for infinitely many $m$'s

$$\mathrm{Prob}\left(B'(g(U_m)) \in g^{-1}g(U_m)\right) > \frac{1}{q(m)}$$

Let us denote by $M'$ the infinite set of integers for which the above holds. Let $N'$ denote the infinite set of $n$'s for which $n^2 p(n) \in M'$ (note that all $m$'s considered are of the form $n^2 p(n)$, for some integer $n$).

We now present a probabilistic polynomial-time algorithm, $A'$, for inverting $f$. On input $y$ (supposedly in the range of $f$) algorithm $A'$ proceeds by applying the following probabilistic procedure, denoted $I$, on input $y$ for $a(|y|)$ times, where $a(\cdot)$ is a polynomial that depends on the polynomials $p$ and $q$ (specifically, we set $a(n) \stackrel{\rm def}{=} 2n^2 p(n) \cdot q(n^2 p(n))$).

Procedure I
Input: $y$ (denote $n \stackrel{\rm def}{=} |y|$).
For $i = 1$ to $t(n)$ do begin
  1. Select uniformly and independently a sequence of strings $x_1, \ldots, x_{t(n)} \in \{0,1\}^n$.
  2. Compute $(z_1, \ldots, z_{t(n)}) \leftarrow B'(f(x_1), \ldots, f(x_{i-1}), y, f(x_{i+1}), \ldots, f(x_{t(n)}))$. (Note that $y$ is placed in the $i$th position instead of $f(x_i)$.)
  3. If $f(z_i) = y$ then halt and output $z_i$. (This is considered a success.)
end

We now present a lower bound on the success probability of algorithm $A'$. To this end we define a set $S_n$, which contains all $n$-bit strings on which the procedure $I$ succeeds with non-negligible probability (specifically greater than $\frac{n}{a(n)}$). (The probability is taken only over the coin tosses of algorithm $A'$.) Namely,

$$S_n \stackrel{\rm def}{=} \left\{ x : \mathrm{Prob}\left(I(f(x)) \in f^{-1}f(x)\right) > \frac{n}{a(n)} \right\}$$

In the next two claims we shall show that $S_n$ contains all but a $\frac{1}{2p(n)}$ fraction of the strings of length $n \in N'$, and that for each string $x \in S_n$ the algorithm $A'$ inverts $f$ on $f(x)$ with probability exponentially close to 1. It will follow that $A'$ inverts $f$ on $f(U_n)$, for $n \in N'$, with probability greater than $1 - \frac{1}{p(n)}$, in contradiction to our hypothesis.

Claim 2.8.1: For every $x \in S_n$

$$\mathrm{Prob}\left(A'(f(x)) \in f^{-1}f(x)\right) > 1 - \frac{1}{2^n}$$

Proof: By definition of the set $S_n$, the procedure $I$ inverts $f(x)$ with probability at least $\frac{n}{a(n)}$. Algorithm $A'$ merely repeats $I$ for $a(n)$ times, and hence

$$\mathrm{Prob}\left(A'(f(x)) \notin f^{-1}f(x)\right) < \left(1 - \frac{n}{a(n)}\right)^{a(n)} < \frac{1}{2^n}$$

The claim follows. □

Claim 2.8.2: For every $n \in N'$,

$$|S_n| > \left(1 - \frac{1}{2p(n)}\right) \cdot 2^n$$

Proof: We assume, to the contrary, that $|S_n| \le (1 - \frac{1}{2p(n)}) \cdot 2^n$. We shall reach a contradiction to our hypothesis concerning the success probability of $B'$. Recall that by this hypothesis

$$s(n) \stackrel{\rm def}{=} \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)})\right) > \frac{1}{q(n^2 p(n))}$$

Let $U_n^{(1)}, \ldots, U_n^{(n \cdot p(n))}$ denote the $n$-bit long blocks in the random variable $U_{n^2 p(n)}$ (i.e., these $U_n^{(i)}$'s are independent random variables each uniformly distributed in $\{0,1\}^n$). Clearly, $s(n)$ is the sum of $s_1(n)$ and $s_2(n)$ defined by

$$s_1(n) \stackrel{\rm def}{=} \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)}) \wedge \exists i \text{ s.t. } U_n^{(i)} \notin S_n\right)$$

and

$$s_2(n) \stackrel{\rm def}{=} \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)}) \wedge \forall i : U_n^{(i)} \in S_n\right)$$

(Use $\mathrm{Prob}(A) = \mathrm{Prob}(A \wedge B) + \mathrm{Prob}(A \wedge \neg B)$.) We derive a contradiction to the lower bound on $s(n)$ by presenting upper bounds for both $s_1(n)$ and $s_2(n)$ (which sum up to less).

First, we present an upper bound on $s_1(n)$. By the construction of algorithm $I$ it follows that, for every $x \in \{0,1\}^n$ and every $1 \le i \le n \cdot p(n)$, the probability that $I$ inverts $f$ on $f(x)$ in the $i$th iteration equals the probability that $B'$ inverts $g$ on $g(U_{n^2 p(n)})$ when $U_n^{(i)} = x$. It follows that, for every $x \in \{0,1\}^n$ and every $1 \le i \le n \cdot p(n)$,

$$\mathrm{Prob}\left(I(f(x)) \in f^{-1}f(x)\right) \ge \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)}) \mid U_n^{(i)} = x\right)$$

Using trivial probabilistic inequalities (such as $\mathrm{Prob}(\exists i\, A_i) \le \sum_i \mathrm{Prob}(A_i)$ and $\mathrm{Prob}(A \wedge B) \le \mathrm{Prob}(A \mid B)$), it follows that

$$\begin{aligned}
s_1(n) &\le \sum_{i=1}^{n \cdot p(n)} \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)}) \wedge U_n^{(i)} \notin S_n\right) \\
&\le \sum_{i=1}^{n \cdot p(n)} \mathrm{Prob}\left(B'(g(U_{n^2 p(n)})) \in g^{-1}g(U_{n^2 p(n)}) \mid U_n^{(i)} \notin S_n\right) \\
&\le \sum_{i=1}^{n \cdot p(n)} \mathrm{Prob}\left(I(f(U_n)) \in f^{-1}f(U_n) \mid U_n \notin S_n\right) \\
&\le n \cdot p(n) \cdot \frac{n}{a(n)}
\end{aligned}$$

(The last inequality uses the definition of $S_n$.)

We now present an upper bound on $s_2(n)$. Recall that by the contradiction hypothesis, $|S_n| \le (1 - \frac{1}{2p(n)}) \cdot 2^n$. It follows that

$$s_2(n) \le \mathrm{Prob}\left(\forall i : U_n^{(i)} \in S_n\right) \le \left(1 - \frac{1}{2p(n)}\right)^{n \cdot p(n)} < \frac{1}{2^{n/2}} < \frac{n^2 p(n)}{a(n)}$$

(the last inequality holds for all sufficiently large $n$, since $a(\cdot)$ is a polynomial). Hence, on one hand $s_1(n) + s_2(n) < \frac{2n^2 p(n)}{a(n)} = \frac{1}{q(n^2 p(n))}$ (equality by definition of $a(n)$). Yet, on the other hand $s_1(n) + s_2(n) = s(n) > \frac{1}{q(n^2 p(n))}$. Contradiction is reached and the claim follows. □

Combining Claims 2.8.1 and 2.8.2, it follows that the probabilistic polynomial-time algorithm, $A'$, inverts $f$ on $f(U_n)$, for $n \in N'$, with probability greater than $1 - \frac{1}{p(n)}$, in contradiction to our hypothesis (that $f$ cannot be efficiently inverted with such success probability). The theorem follows.

Let us summarize the structure of the proof of Theorem 2.8. Given a weak one-way function $f$, we first constructed a polynomial-time computable function $g$. This was done with the intention of later proving that $g$ is strongly one-way. To prove that $g$ is strongly one-way we used a "reducibility argument". The argument transforms efficient algorithms which supposedly contradict the strong one-wayness of $g$ into efficient algorithms which contradict the hypothesis that $f$ is weakly one-way. Hence $g$ must be strongly one-way. We stress that our algorithmic transformation, which is in fact a randomized Cook reduction, makes no implicit or explicit assumptions about the structure of the prospective algorithms for inverting $g$. Such assumptions, as the "natural" assumption that the inverter of $g$ works independently on each block, cannot be justified (at least not at the current state of understanding of the nature of efficient computations).

Theorem 2.8 has a natural information theoretic (or "probabilistic") analogue which asserts that repeating an experiment, which has a non-negligible success probability, sufficiently many times yields success with very high probability. The reader is probably convinced at this stage that the proof of Theorem 2.8 is much more complex than the proof of the information theoretic analogue.
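For comparison, the information theoretic analogue admits a one-line calculation. As a sketch, assume $t$ independent trials, each succeeding with probability at least $\epsilon$ (the symbols $t$ and $\epsilon$ are introduced here only for illustration):

$$\mathrm{Prob}(\text{all } t \text{ trials fail}) \le (1 - \epsilon)^t \le e^{-\epsilon t}$$

Setting $\epsilon = \frac{1}{p(n)}$ and $t = n \cdot p(n)$ gives a bound of $e^{-n} < 2^{-n}$, which is the bound the naive argument would give for $g$ if the inverter were known to treat the blocks independently.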
In the information theoretic context the repeated events are independent by definition, whereas in our computational context no such independence can be guaranteed. Another indication of the difference between the two settings follows. In the information theoretic setting the probability that none of the events occur decreases exponentially in the number of repetitions. However, in the computational setting we can only reach negligible bounds on the inverting probabilities of polynomial-time algorithms. Furthermore, it may be the case that $g$ constructed in the proof of Theorem 2.8 can be efficiently inverted on $g(U_{n^2 p(n)})$ with success probability which is subexponentially decreasing (e.g., with probability $2^{-\log_2^3 m}$), whereas the analogous information theoretic experiment fails with probability at most $2^{-n}$.

By Theorem 2.8, whenever assuming the existence of one-way functions, there is no need to specify whether we refer to weak or strong ones. Thus, as far as the mere existence of one-way functions goes, the notions of weak and strong one-way functions are equivalent. However, as far as efficiency considerations are concerned the two notions are not really equivalent, since the above transformation of weak one-way functions into strong ones is not practical. An alternative transformation which is much more efficient does exist for the case of one-way permutations and other specific classes of one-way functions. Further details are presented in Section 2.6.

2.4 One-Way Functions: Variations

In this section, we discuss several issues concerning one-way functions. In the first subsection, we present a function that is (strongly) one-way, provided that one-way functions exist. The construction of this function is of strictly abstract interest. In contrast, the issues discussed in the other subsections are of practical importance.
First, we present a formulation which is better suited for describing many natural candidates for one-way functions, and use it in order to describe popular candidates for one-way functions. Next, we use this formulation to present one-way functions with additional properties; specifically, (one-way) trapdoor permutations and clawfree functions. We remark that these additional properties are used in several constructions (e.g., trapdoor permutations are used in the construction of public-key encryption schemes whereas clawfree permutations are used in the construction of collision-free hashing). We conclude this section with remarks addressing the "art" of proposing candidates for one-way functions.

2.4.1 * Universal One-Way Function

Using the result of the previous section and the notion of a universal machine it is possible to prove the existence of a universal one-way function.

Proposition 2.9: There exists a polynomial-time computable function which is (strongly) one-way if and only if one-way functions exist.

Proof: A key observation is that there exist one-way functions if and only if there exist one-way functions which can be evaluated by a quadratic time algorithm. (The choice of the specific time bound is immaterial; what is important is that such a specific time bound exists.) This statement is proven using a padding argument. Details follow.

Let $f$ be an arbitrary one-way function, and let $p(\cdot)$ be a polynomial bounding the time complexity of an algorithm for computing $f$. Define $g(x'x'') \stackrel{\rm def}{=} f(x')x''$, where $|x'x''| = p(|x'|)$. An algorithm computing $g$ first parses the input into $x'$ and $x''$ so that $|x'x''| = p(|x'|)$, and then applies $f$ on $x'$. The parsing and the other overhead operations can be implemented in quadratic time (in $|x'x''|$), whereas computing $f(x')$ is done within time $p(|x'|) = |x'x''|$ (which is linear in the input length). Hence, $g$ can be computed (by a Turing machine) in quadratic time.
The reader can verify that $g$ is one-way using a "reducibility argument" analogous to the one used in the proof of Proposition 2.5.

We now present a (universal one-way) function, denoted $f_{\rm uni}$.

$$f_{\rm uni}(\mathrm{desc}(M), x) \stackrel{\rm def}{=} (\mathrm{desc}(M), M(x))$$

where $\mathrm{desc}(M)$ is a description of Turing machine $M$, and $M(x)$ is defined as the output of $M$ on input $x$ if $M$ runs at most quadratic time on $x$, and as $x$ otherwise. Clearly, $f_{\rm uni}$ can be computed in polynomial-time by a universal machine which uses a step counter. To show that $f_{\rm uni}$ is one-way we use a "reducibility argument". By the above observation, we know that there exists a one-way function $g$ which is computed in quadratic time. Let $M_g$ be the quadratic time machine computing $g$. Clearly, an (efficient) algorithm inverting $f_{\rm uni}$ on inputs of the form $f_{\rm uni}(\mathrm{desc}(M_g), U_n)$, with probability $\epsilon(n)$, can be easily modified into an (efficient) algorithm inverting $g$ on inputs of the form $g(U_n)$, with probability $\epsilon(n)$. Since a uniformly chosen string of length $|\mathrm{desc}(M_g)| + n$ has prefix $\mathrm{desc}(M_g)$ with probability $2^{-|\mathrm{desc}(M_g)|}$, it follows that an algorithm inverting $f_{\rm uni}$ with probability at least $1 - \epsilon(n)$, on strings of length $|\mathrm{desc}(M_g)| + n$, yields an algorithm inverting $g$ with probability at least $1 - 2^{|\mathrm{desc}(M_g)|} \cdot \epsilon(n)$ on strings of length $n$. Hence, if $f_{\rm uni}$ is not weakly one-way then also $g$ cannot be weakly one-way. Using Theorem 2.8, the proposition follows.

The observation that it suffices to consider one-way functions which can be evaluated within a specific time bound is crucial to the construction of $f_{\rm uni}$. The reason is that it is not possible to construct a polynomial-time machine which is universal for the class of polynomial-time machines (i.e., a polynomial-time machine that can "simulate" all polynomial-time machines). It is however possible to construct, for every polynomial $p(\cdot)$, a polynomial-time machine that is universal for the class of machines with running-time bounded by $p(\cdot)$.
The impracticality of the suggestion to use $f_{\rm uni}$ as a one-way function stems from the fact that $f_{\rm uni}$ is likely to be hard to invert only on huge input lengths.

2.4.2 One-Way Functions as Collections

The formulation of one-way functions used so far is suitable for an abstract discussion. However, for describing many natural candidates for one-way functions, the following formulation (although more cumbersome) is more adequate. Instead of viewing one-way functions as functions operating on an infinite domain (i.e., $\{0,1\}^*$), we consider infinite collections of functions each operating on a finite domain. The functions in the collection share a single evaluating algorithm that, given as input a succinct representation of a function and an element in its domain, returns the value of the specified function at the given point. The formulation of a collection of functions is also useful for the presentation of trapdoor permutations and clawfree functions (see the next two subsections). We start with the following definition.

Definition 2.10 (collection of functions): A collection of functions consists of an infinite set of indices, denoted $I$, a finite set $D_i$, for each $i \in I$, and a function $f_i$ defined over $D_i$.

We will only be interested in collections of functions that can be applied. As hinted above, a necessary condition for applying a collection of functions is the existence of an efficient function-evaluating algorithm (denoted $F$) that, on input $i \in I$ and $x$, returns $f_i(x)$. Yet, this condition by itself does not suffice. We need to be able to (randomly) select an index, specifying a function over a sufficiently large domain, as well as to be able to (randomly) select an element of the domain (when given the domain's index).
The sampling property of the index set is captured by an efficient algorithm (denoted $I$) that on input an integer $n$ (presented in unary) randomly selects a $\mathrm{poly}(n)$-bit long index, specifying a function and its associated domain. (As usual, unary presentation is used to enhance the standard association of efficient algorithms with those running in time polynomial in the length of their input.) The sampling property of the domains is captured by an efficient algorithm (denoted $D$) that on input an index $i$ randomly selects an element in $D_i$. The one-way property of the collection is captured by requiring that every efficient algorithm, when given an index of a function and an element in its range, fails to invert the function, except with negligible probability. The probability is taken over the distribution induced by the sampling algorithms $I$ and $D$.

Definition 2.11 (collection of one-way functions): A collection of functions, $\{f_i : D_i \mapsto \{0,1\}^*\}_{i \in I}$, is called strongly (resp., weakly) one-way if there exist three probabilistic polynomial-time algorithms, $I$, $D$ and $F$, so that the following two conditions hold

1. easy to sample and compute: The output distribution of algorithm $I$, on input $1^n$, is a random variable assigned values in the set $I \cap \{0,1\}^n$. The output distribution of algorithm $D$, on input $i \in I$, is a random variable assigned values in $D_i$. On input $i \in I$ and $x \in D_i$, algorithm $F$ always outputs $f_i(x)$.

2. hard to invert (version for strongly one-way): For every probabilistic polynomial-time algorithm, $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(f_{I_n}(X_n), I_n) \in f_{I_n}^{-1} f_{I_n}(X_n)\right) < \frac{1}{p(n)}$$

where $I_n$ is a random variable describing the output distribution of algorithm $I$ on input $1^n$, and $X_n$ is a random variable describing the output of algorithm $D$ on input (random variable) $I_n$.

(The version for weakly one-way collections is analogous.)

We may relate to a collection of one-way functions by indicating the corresponding triplet of algorithms.
Hence, we may say that a triplet of probabilistic polynomial-time algorithms, $(I, D, F)$, constitutes a collection of one-way functions if there exists a collection of functions for which these algorithms satisfy the above two conditions.

We stress that the output of algorithm $I$, on input $1^n$, is not necessarily distributed uniformly over $I \cap \{0,1\}^n$. Furthermore, it is not even required that $I(1^n)$ is not entirely concentrated on one single string. Likewise, the output of algorithm $D$, on input $i$, is not necessarily distributed uniformly over $D_i$. Yet, the hardness-to-invert condition implies that $D(i)$ cannot be mainly concentrated on polynomially many (in $|i|$) strings. We stress that the collection is hard to invert with respect to the distribution induced by the algorithms $I$ and $D$ (in addition to depending as usual on the mapping induced by the function itself).

Clearly, a collection of one-way functions can be represented as a one-way function and vice versa (see Exercise 12), yet each formulation has its advantages. In the sequel we use the formulation of a collection of one-way functions in order to present popular candidates of one-way functions.

To allow less cumbersome presentation of natural candidates of one-way collections (of functions), we relax Definition 2.11 in two ways. First, we allow the index sampling algorithm to output, on input $1^n$, indices of length $p(n)$, where $p(\cdot)$ is some polynomial. Secondly, we allow all algorithms to fail with negligible probability. Most importantly, we allow the index sampler $I$ to output strings not in $I$ as long as the probability that $I(1^n) \notin I \cap \{0,1\}^{p(n)}$ is a negligible function in $n$. (The same relaxations can be made when discussing trapdoor permutations and clawfree functions.)
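The probability in the hard-to-invert condition of Definition 2.11 is taken over an experiment that can be written down directly. The sketch below estimates it empirically for a triplet of samplers. Every name here (`inversion_rate`, `toy_I`, `toy_D`, `toy_F`, `brute_force`, `constant_guess`) is hypothetical, the explicit `rng` argument stands in for the algorithms' internal coin tosses, and the toy collection is trivially invertible; it only exercises the interface.

```python
import random


def inversion_rate(I, D, F, A, n, trials=200, seed=0):
    """Estimate Prob[A(f_i(x), i) is a preimage of f_i(x)], where
    i <- I(1^n) and x <- D(i), by repeating the experiment `trials` times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        i = I(n, rng)            # sample an index
        x = D(i, rng)            # sample a domain element
        y = F(i, x)              # evaluate f_i(x)
        guess = A(y, i)          # the inverter sees only (y, i)
        if guess is not None and F(i, guess) == y:
            hits += 1
    return hits / trials


# A toy collection (NOT one-way): f_i(x) = 3x + 1 mod i over D_i = {0, ..., i-1}.
def toy_I(n, rng):
    return 2 ** n


def toy_D(i, rng):
    return rng.randrange(i)


def toy_F(i, x):
    return (3 * x + 1) % i


def brute_force(y, i):
    # Exhaustive search over the (tiny) domain.
    for x in range(i):
        if toy_F(i, x) == y:
            return x
    return None


def constant_guess(y, i):
    # An inverter that ignores its input.
    return 0
```

On this toy collection `inversion_rate(toy_I, toy_D, toy_F, brute_force, n=6)` equals 1.0, as exhaustive search always finds a preimage in a domain of 64 elements. A one-way collection is one for which every efficient `A` achieves only a negligible rate as `n` grows.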
2.4.3 Examples of One-way Collections (RSA, Factoring, DLP)

In this subsection we present several popular collections of one-way functions, based on computational number theory (e.g., RSA and Discrete Exponentiation). In the exposition which follows, we assume some knowledge of elementary number theory and some familiarity with simple number theoretic algorithms. Further discussion of the relevant number theoretic material is presented in Appendix [missing(app-cnt)].

The RSA function

The RSA collection of functions has an index set consisting of pairs $(N, e)$, where $N$ is a product of two $(\frac{1}{2}\log_2 N)$-bit primes, denoted $P$ and $Q$, and $e$ is an integer smaller than $N$ and relatively prime to $(P-1) \cdot (Q-1)$. The function of index $(N, e)$ has domain $\{1, \ldots, N\}$ and maps the domain element $x$ to $x^e \bmod N$. Using the fact that $e$ is relatively prime to $(P-1) \cdot (Q-1)$, it can be shown that the function is in fact a permutation over its domain. Hence, the RSA collection is a collection of permutations.

We first substantiate the fact that the RSA collection satisfies the first condition of the definition of a one-way collection (i.e., that it is easy to sample and compute). To this end, we present the triplet of algorithms $(I_{\rm RSA}, D_{\rm RSA}, F_{\rm RSA})$.

On input $1^n$, algorithm $I_{\rm RSA}$ selects uniformly two primes, $P$ and $Q$, such that $2^{n-1} \le P < Q < 2^n$, and an integer $e$ such that $e$ is relatively prime to $(P-1) \cdot (Q-1)$. Algorithm $I_{\rm RSA}$ terminates with output $(N, e)$, where $N = P \cdot Q$. For an efficient implementation of $I_{\rm RSA}$, we need a probabilistic polynomial-time algorithm for generating uniformly distributed primes. Such an algorithm does exist. However, it is more efficient to generate two primes by selecting two integers uniformly in the interval $[2^{n-1}, 2^n - 1]$ and checking via a fast randomized primality test whether these are indeed primes (this way we get, with exponentially small probability, an output which is not of the desired form). For more details concerning the uniform generation of primes see Appendix [missing(app-cnt)].

As for algorithm $D_{\rm RSA}$, on input $(N, e)$, it selects (almost) uniformly an element in the set $D_{N,e} \stackrel{\rm def}{=} \{1, \ldots, N\}$. The output of $F_{\rm RSA}$, on input $((N, e), x)$, is

$$\mathrm{RSA}_{N,e}(x) \stackrel{\rm def}{=} x^e \bmod N$$

It is not known whether factoring $N$ can be reduced to inverting $\mathrm{RSA}_{N,e}$, and in fact this is a well-known open problem. We remark that the best algorithms known for inverting $\mathrm{RSA}_{N,e}$ proceed by (explicitly or implicitly) factoring $N$. In any case it is widely believed that the RSA collection is hard to invert.

In the above description $D_{N,e}$ corresponds to the additive group mod $N$ (and hence contains $N$ elements). Alternatively, the domain $D_{N,e}$ can be restricted to the elements of the multiplicative group modulo $N$ (and hence contain $(P-1) \cdot (Q-1) \approx N - 2\sqrt{N} \approx N$ elements). A modified domain sampler may work by selecting an element in $\{1, \ldots, N\}$ and discarding the unlikely cases in which the selected element is not relatively prime to $N$. The function $\mathrm{RSA}_{N,e}$ defined above induces a permutation on the multiplicative group modulo $N$. The resulting collection is as hard to invert as the original one. (A proof of this statement is left as an exercise to the reader.) The question which formulation to prefer seems to be a matter of personal taste.

The Rabin function

The Rabin collection of functions is defined analogously to the RSA collection, except that the function is squaring modulo $N$ (instead of raising to the power $e$ mod $N$). Namely,

$$\mathrm{Rabin}_N(x) \stackrel{\rm def}{=} x^2 \bmod N$$

This function, however, does not induce a permutation on the multiplicative group modulo $N$, but is rather a 4-to-1 mapping on the multiplicative group modulo $N$.

It can be shown that extracting square roots modulo $N$ is computationally equivalent to factoring $N$ (i.e., the two tasks are reducible to one another via probabilistic polynomial-time reductions). For details see Exercise 15.
Hence, squaring modulo a composite is a collection of one-way functions if and only if factoring is intractable. We remind the reader that it is generally believed that integer factorization is intractable.

The Factoring Permutations

For a special subclass of the integers, known by the name of Blum Integers, the function $Rabin_N(\cdot)$ defined above induces a permutation on the quadratic residues modulo $N$. We say that $r$ is a quadratic residue mod $N$ if there exists an integer $x$ such that $r \equiv x^2 \pmod N$. We denote by $Q_N$ the set of quadratic residues in the multiplicative group mod $N$. For purposes of this paragraph, we say that $N$ is a Blum Integer if it is the product of two primes, each congruent to 3 mod 4. It can be shown that when $N$ is a Blum integer, each element in $Q_N$ has a unique square root which is also in $Q_N$, and it follows that in this case the function $Rabin_N(\cdot)$ induces a permutation over $Q_N$. This leads to the introduction of the following collection, $SQR \stackrel{\rm def}{=} (I_{BI}, D_{QR}, F_{SQR})$, of permutations. On input $1^n$, algorithm $I_{BI}$ selects uniformly two primes, $P$ and $Q$, such that $2^{n-1} \le P < Q < 2^n$ and $P \equiv Q \equiv 3 \pmod 4$, and outputs $N = P\cdot Q$. It is assumed that the density of such primes is non-negligible and thus that this step can be efficiently implemented. On input $N$, algorithm $D_{QR}$ uniformly selects an element of $Q_N$, by uniformly selecting an element of the multiplicative group modulo $N$ and squaring it mod $N$. Algorithm $F_{SQR}$ is defined exactly as in the Rabin collection. The resulting collection is one-way, provided that factoring is intractable also for the set of Blum integers (defined above).

Discrete Logarithms

Another computational number theoretic problem which is widely believed to be intractable is that of extracting discrete logarithms in a finite field (and in particular in a field of prime cardinality).
The DLP collection of functions, borrowing its name (and one-wayness) from the Discrete Logarithm Problem, is defined by the triplet of algorithms $(I_{DLP}, D_{DLP}, F_{DLP})$. On input $1^n$, algorithm $I_{DLP}$ selects uniformly a prime, $P$, such that $2^{n-1} \le P < 2^n$, and a primitive element $G$ in the multiplicative group modulo $P$ (i.e., a generator of this cyclic group), and terminates with output $(P,G)$. There exists a probabilistic polynomial-time algorithm for uniformly generating primes together with the prime factorization of $P-1$, where $P$ is the prime generated (see Appendix [missing(app-cnt)]). Alternatively, one may uniformly generate a prime $P$ of the form $2Q+1$, where $Q$ is also a prime. (In the latter case, however, one has to assume the intractability of DLP with respect to such primes. We remark that such primes are commonly believed to be the hardest for DLP.) Using the factorization of $P-1$ one can find a primitive element by selecting an element of the group at random and checking whether it has order $P-1$ (by raising it to powers which non-trivially divide $P-1$). Algorithm $D_{DLP}$, on input $(P,G)$, selects uniformly a residue modulo $P-1$. Algorithm $F_{DLP}$, on input $((P,G),x)$, halts outputting

$$DLP_{P,G}(x) \stackrel{\rm def}{=} G^x \bmod P$$

Hence, inverting $DLP_{P,G}$ amounts to extracting the discrete logarithm (to base $G$) modulo $P$. For every $(P,G)$ of the above form, the function $DLP_{P,G}$ induces a 1-1 and onto mapping from the additive group mod $P-1$ to the multiplicative group mod $P$. Hence, $DLP_{P,G}$ induces a permutation on the set $\{1,...,P-1\}$.

Exponentiation in other groups is also a reasonable candidate for a one-way function, provided that the discrete logarithm problem for the group is believed to be hard. For example, it is believed that the logarithm problem is hard in the group of points on an Elliptic curve.
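
The three algorithms of the DLP collection can be sketched in a few lines. What follows is only an illustration under stated assumptions: it uses the variant with primes of the form $2Q+1$, a Miller-Rabin test as the "fast randomized primality test", and toy parameter sizes. The names `is_probable_prime`, `index_sampler`, `domain_sampler` and `dlp_eval` are hypothetical and do not come from the text.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin randomized primality test (errs with tiny probability)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def index_sampler(n):
    """Sketch of I_DLP for primes P = 2Q + 1 with 2^(n-1) <= P < 2^n (assumes n >= 5)."""
    while True:
        p = random.randrange(2 ** (n - 1), 2 ** n) | 1
        if is_probable_prime(p) and is_probable_prime((p - 1) // 2):
            break
    while True:
        g = random.randrange(2, p - 1)
        # P - 1 = 2Q, so G has order P - 1 iff G^2 != 1 and G^Q != 1 (mod P).
        if pow(g, 2, p) != 1 and pow(g, (p - 1) // 2, p) != 1:
            return p, g

def domain_sampler(index):
    """Sketch of D_DLP: a uniform residue mod P - 1, represented in {1, ..., P-1}."""
    p, _ = index
    return random.randrange(1, p)

def dlp_eval(index, x):
    """Sketch of F_DLP: G^x mod P."""
    p, g = index
    return pow(g, x, p)
```

For toy sizes one can confirm by enumeration that `dlp_eval` permutes $\{1,...,P-1\}$; for realistic sizes such enumeration is of course infeasible.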
Author's Note: fill-in more details

2.4.4 Trapdoor one-way permutations

The formulation of collections of one-way functions is convenient as a starting point for the definition of trapdoor permutations. Loosely speaking, these are collections of one-way permutations, $\{f_i\}$, with the extra property that $f_i$ is efficiently inverted once given as auxiliary input a "trapdoor" for the index $i$. The trapdoor of index $i$, denoted by $t(i)$, cannot be efficiently computed from $i$, yet one can efficiently generate corresponding pairs $(i, t(i))$.

Definition 2.12 (collection of trapdoor permutations): Let $I$ be a probabilistic algorithm, and let $I_1(1^n)$ (resp. $I_2(1^n)$) denote the first (resp. second) half of the output of $I(1^n)$. A triple of algorithms, $(I,D,F)$, is called a collection of strong (resp. weak) trapdoor permutations if the following two conditions hold

1. the algorithms induce a collection of one-way permutations: The triple $(I_1,D,F)$ constitutes a collection of strong (resp. weak) one-way permutations.

2. easy to invert with trapdoor: There exists a (deterministic) polynomial-time algorithm, denoted $F^{-1}$, so that for every $(i,t)$ in the range of $I$ and for every $x \in D_i$, it holds that $F^{-1}(t, F(i,x)) = x$.

A useful relaxation of the above conditions is to require that they are satisfied with overwhelmingly high probability. Namely, the index generating algorithm, $I$, is allowed to output, with negligible probability, pairs $(i,t)$ for which either $f_i$ is not a permutation or $F^{-1}(t, F(i,x)) = x$ does not hold for all $x \in D_i$.

The RSA (or factoring) Trapdoor

The RSA collection presented above can be easily modified to have the trapdoor property. To this end algorithm $I_{RSA}$ should be modified so that it outputs both the index $(N,e)$ and the trapdoor $(N,d)$, where $d$ is the multiplicative inverse of $e$ modulo $(P-1)\cdot(Q-1)$ (note that $e$ has such an inverse since it has been chosen to be relatively prime to $(P-1)\cdot(Q-1)$).
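
The modified index generator can be illustrated with a minimal sketch. It assumes that the two primes are supplied by the caller (prime sampling is omitted), that $e$ is drawn below $(P-1)\cdot(Q-1)$, and that inversion is exponentiation by $d$ as in textbook RSA. The names `rsa_index_with_trapdoor` and `rsa_eval`, as well as the toy primes 1009 and 1013, are hypothetical choices made for the example.

```python
import math
import random

def rsa_index_with_trapdoor(p, q):
    """Given distinct primes p and q, return the index (N, e) and the trapdoor (N, d)."""
    n, phi = p * q, (p - 1) * (q - 1)
    while True:
        e = random.randrange(3, phi)
        if math.gcd(e, phi) == 1:      # e must be relatively prime to (P-1)(Q-1)
            break
    d = pow(e, -1, phi)                # multiplicative inverse of e (Python 3.8+)
    return (n, e), (n, d)

def rsa_eval(key, x):
    """x^e mod N for key = (N, e); the same routine inverts when given (N, d)."""
    n, exponent = key
    return pow(x, exponent, n)

# Toy parameters only; real moduli are far larger.
index, trapdoor = rsa_index_with_trapdoor(1009, 1013)
y = rsa_eval(index, 123456)
recovered = rsa_eval(trapdoor, y)      # recovers the preimage of y
```

Passing the trapdoor to the same evaluation routine keeps the sketch short; a practical implementation would also have to sample the primes and handle padding, which is outside the scope of this illustration.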
The inverting algorithm $F^{-1}_{RSA}$ is identical to the algorithm $F_{RSA}$ (i.e., $F^{-1}_{RSA}((N,d),y) = y^d \bmod N$). The reader can easily verify that $F_{RSA}((N,d), F_{RSA}((N,e),x)) = x^{ed} \bmod N$ indeed equals $x$ for every $x$ in the multiplicative group modulo $N$. In fact, one can show that $x^{ed} \equiv x \pmod N$ for every $x$ (even in case $x$ is not relatively prime to $N$).

We remark that the Rabin collection presented above can be easily modified in an analogous manner, so that one can efficiently compute all 4 square roots of a given quadratic residue (mod $N$). The square roots mod $N$ can be computed by extracting a square root modulo each of the prime factors of $N$ and combining the results using the Chinese Remainder Theorem. Efficient algorithms for extracting square roots modulo a given prime are known. Furthermore, in case the prime, $P$, is congruent to 3 mod 4, the square roots of $x$ mod $P$ can be computed by raising $x$ to the power $\frac{P+1}{4}$ (while reducing the intermediate results mod $P$). Furthermore, in case $N$ is a Blum integer, the collection SQR, presented above, forms a collection of trapdoor permutations (provided of course that factoring is hard).

2.4.5 * Clawfree Functions

Loosely speaking, a clawfree collection consists of a set of pairs of functions which are easy to evaluate, both have the same range, and yet it is infeasible to find a range element together with preimages of it under each of these functions.

Definition 2.13 (clawfree collection): A collection of pairs of functions consists of an infinite set of indices, denoted $\overline{I}$, two finite sets $D_i^0$ and $D_i^1$, for each $i \in \overline{I}$, and two functions $f_i^0$ and $f_i^1$ defined over $D_i^0$ and $D_i^1$, respectively. Such a collection is called clawfree if there exist three probabilistic polynomial-time algorithms, $I$, $D$ and $F$, so that the following conditions hold

1. easy to sample and compute: The random variable $I(1^n)$ is assigned values in the set $\overline{I} \cap \{0,1\}^n$. For each $i \in \overline{I}$ and $\sigma \in \{0,1\}$, the random variable $D(\sigma,i)$ is distributed over $D_i^\sigma$ and $F(\sigma,i,x) = f_i^\sigma(x)$.
2. identical range distribution: For every $i$ in the index set $\overline{I}$, the random variables $f_i^0(D(0,i))$ and $f_i^1(D(1,i))$ are identically distributed.

3. hard to form claws: A pair $(x,y)$ satisfying $f_i^0(x) = f_i^1(y)$ is called a claw for index $i$. Let $C_i$ denote the set of claws for index $i$. It is required that for every probabilistic polynomial-time algorithm, $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(I_n) \in C_{I_n}\right) < \frac{1}{p(n)}$$

where $I_n$ is a random variable describing the output distribution of algorithm $I$ on input $1^n$.

The first requirement in Definition 2.13 is analogous to what appears in Definition 2.11. The other two requirements (in Definition 2.13) are somewhat conflicting. On one hand, it is required that claws do exist (to say the least), whereas on the other hand it is required that claws cannot be efficiently found. Clearly, a clawfree collection of functions yields a collection of strong one-way functions (see Exercise 16). A special case of interest is when both domains are identical (i.e., $D_i \stackrel{\rm def}{=} D_i^0 = D_i^1$), the random variable $D(\sigma,i)$ is uniformly distributed over $D_i$, and the functions, $f_i^0$ and $f_i^1$, are permutations over $D_i$. Such a collection is called a collection of (clawfree) permutations.

Again, a useful relaxation of the conditions of Definition 2.13 is obtained by allowing the algorithms (i.e., $I$, $D$ and $F$) to fail with negligible probability.

An additional property that a (clawfree) collection may (or may not) have is an efficiently recognizable index set (i.e., a probabilistic polynomial-time algorithm for determining whether a given string is in $\overline{I}$). This property is useful in some applications of clawfree collections (hence this discussion). Efficient recognition of the index set may be important since the function-evaluating algorithm $F$ may induce functions also in case its second input (which is supposedly an index) is not in $\overline{I}$.
In this case it is no longer guaranteed that the induced pair of functions has identical range distribution. In some applications (e.g., see section 6.8), dishonest parties may choose, on purpose, an illegal index and try to capitalize on the induced functions having different range distributions.

The DLP Clawfree Collection

We now turn to show that clawfree collections do exist under specific reasonable intractability assumptions. We start by presenting such a collection under the assumption that the Discrete Logarithm Problem (DLP) for fields of prime cardinality is intractable.

Following is the description of a collection of clawfree permutations (based on the above assumption). The index set consists of triples, $(P,G,Z)$, where $P$ is a prime, $G$ is a primitive element mod $P$, and $Z$ is an element in the field (of residues mod $P$). The index sampling algorithm selects $P$ and $G$ as in the DLP collection presented in Subsection 2.4.3, and $Z$ is selected uniformly among the residues mod $P$. The domain of both functions with index $(P,G,Z)$ is identical, and equals the set $\{1,...,P-1\}$, and the domain sampling algorithm selects uniformly from this set. As for the functions themselves, we set

$$f^\sigma_{P,G,Z}(x) \stackrel{\rm def}{=} Z^\sigma \cdot G^x \bmod P$$

The reader can easily verify that both functions are permutations over $\{1,...,P-1\}$. Also, the ability to form a claw for the index $(P,G,Z)$ yields the ability to find the discrete logarithm of $Z$ mod $P$ to base $G$ (since $G^x \equiv Z\cdot G^y \pmod P$ yields $G^{x-y} \equiv Z \pmod P$). Hence, the ability to form claws for a non-negligible fraction of the index set translates to a contradiction to the DLP intractability assumption.

The above collection does not have the additional property of having an efficiently recognizable index set, since it is not known how to efficiently recognize primitive elements modulo a prime. This can be amended by making a slightly stronger assumption concerning the intractability of DLP.
Specifically, we assume that DLP is intractable even if one is given the factorization of the size of the multiplicative group (i.e., the factorization of $P-1$) as additional input. Such an assumption allows us to add the factorization of $P-1$ to the description of the index. This makes the index set efficiently recognizable (since one can first test $P$ for primality, as usual, and next test whether $G$ is a primitive element by raising it to powers of the form $(P-1)/Q$ where $Q$ is a prime factor of $P-1$). If DLP is hard also for primes of the form $2Q+1$, where $Q$ is also a prime, life is even easier. To test whether $G$ is a primitive element mod $P$ one just computes $G^2 \pmod P$ and $G^{(P-1)/2} \pmod P$, and checks whether either of them equals 1 (in which case $G$ is not primitive).

The Factoring Clawfree Collection

We now show that a clawfree collection (of functions) does exist under the assumption that integer factorization is infeasible for integers which are the product of two primes each congruent to 3 mod 4. Such composite numbers, hereafter referred to as Blum integers, have the property that the Jacobi symbol of $-1$ (relative to them) is 1 and half of the square roots of each quadratic residue, in the corresponding multiplicative group (modulo this composite), have Jacobi symbol 1 (see Appendix [missing(app-cnt)]).

The index set of the collection consists of all Blum integers which are composed of two primes of equal length. The index selecting algorithm, on input $1^n$, uniformly selects such an integer, by uniformly selecting two ($n$-bit) primes each congruent to 3 mod 4, and outputting their product, denoted $N$. Let $J_N^{+1}$ (respectively, $J_N^{-1}$) denote the set of residues in the multiplicative group modulo $N$ with Jacobi Symbol $+1$ (resp., $-1$). The functions of index $N$, denoted $f_N^0$ and $f_N^1$, consist both of squaring modulo $N$, but their corresponding domains are disjoint. The domain of function $f_N^\sigma$ equals the set $J_N^{(-1)^\sigma}$.
The domain sampling algorithm, denoted $D$, uniformly selects an element of the corresponding domain as follows. Specifically, on input $(\sigma,N)$ algorithm $D$ uniformly selects polynomially many residues mod $N$, and outputs the first residue with Jacobi Symbol $(-1)^\sigma$.

The reader can easily verify that both $f_N^0(D(0,N))$ and $f_N^1(D(1,N))$ are uniformly distributed over the set of quadratic residues mod $N$. The difficulty of forming claws follows from the fact that a claw yields two residues, $x \in J_N^{+1}$ and $y \in J_N^{-1}$, such that $x^2 \equiv y^2 \pmod N$. Since $-1 \in J_N^{+1}$, it follows that $x \not\equiv \pm y \pmod N$, and the gcd of $x \pm y$ and $N$ yields a factorization of $N$.

The above collection does not have the additional property of having an efficiently recognizable index set, since it is not even known how to efficiently distinguish products of two primes from products of more than two primes.

2.4.6 On Proposing Candidates

Although we do believe that one-way functions exist, their mere existence does not suffice for practical applications. Typically, an application which is based on one-way functions requires the specification of a concrete (candidate one-way) function. As explained above, the observation concerning the existence of a universal one-way function is of little practical significance. Hence, the problem of proposing reasonable candidates for one-way functions is of great practical importance. Everyone understands that such a reasonable candidate (for a one-way function) should have a very efficient algorithm for evaluating the function. (In case the "function" is presented as a collection of one-way functions, the domain sampler and the function-evaluation algorithm in particular should be very efficient.) However, people seem less careful in seriously considering the difficulty of inverting the candidates that they propose.
We stress that the candidate has to be difficult to invert on "the average" and not only in the worst case, and that "the average" is taken with respect to the instance-distribution determined by the candidate function. Furthermore, "hardness on the average" (unlike worst case analysis) is extremely sensitive to the instance-distribution. Hence, one has to be extremely careful in deducing average-case complexity with respect to one distribution from the average-case complexity with respect to another distribution. The short history of the field contains several cases in which this point has been ignored and consequently bad suggestions have been made.

Consider for example the following suggestion to base one-way functions on the conjectured difficulty of the Graph Isomorphism problem. Let $f_{GI}(G,\pi) = (G, \pi G)$, where $G$ is an undirected graph, $\pi$ is a permutation on its vertex set, and $\pi G$ denotes the graph resulting by renaming the vertices of $G$ using $\pi$ (i.e., $(\pi(u),\pi(v))$ is an edge in $\pi G$ iff $(u,v)$ is an edge in $G$). Although it is indeed believed that Graph Isomorphism cannot be solved in polynomial-time, it is easy to see that $f_{GI}$ is easy to invert on most instances (e.g., use vertex degree statistics to determine the isomorphism).

2.5 Hard-Core Predicates

Loosely speaking, saying that a function $f$ is one-way means that given $y$ it is infeasible to find a preimage of $y$ under $f$. This does not mean that it is infeasible to find out partial information about the preimage of $y$ under $f$. Specifically it may be easy to retrieve half of the bits of the preimage (e.g., given a one-way function $f$ consider the function $g$ defined by $g(x,r) \stackrel{\rm def}{=} (f(x),r)$, for every $|x| = |r|$). The fact that one-way functions do not necessarily hide partial information about their preimage limits their "direct applicability" to tasks such as secure encryption.
Fortunately, assuming the existence of one-way functions, it is possible to construct one-way functions which hide specific partial information about their preimage (which is easy to compute from the preimage itself). This partial information can be considered as a "hard core" of the difficulty of inverting $f$.

2.5.1 Definition

A polynomial-time predicate $b$ is called a hard-core of a function $f$ if every efficient algorithm, given $f(x)$, can guess $b(x)$ only with success probability which is negligibly better than half.

Definition 2.14 (hard-core predicate): A polynomial-time computable predicate $b : \{0,1\}^* \mapsto \{0,1\}$ is called a hard-core of a function $f$ if for every probabilistic polynomial-time algorithm $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(f(U_n)) = b(U_n)\right) < \frac{1}{2} + \frac{1}{p(n)}$$

It follows that if $b$ is a hard-core predicate (for any function) then $b(U_n)$ should be almost unbiased (i.e., $|\mathrm{Prob}(b(U_n)=0) - \mathrm{Prob}(b(U_n)=1)|$ must be a negligible function in $n$). As $b$ itself is polynomial-time computable the failure of efficient algorithms to approximate $b(x)$ from $f(x)$ (with success probability significantly more than half) must be due to either an information loss of $f$ (i.e., $f$ not being one-to-one) or to the difficulty of inverting $f$. For example, the predicate $b(\sigma\alpha) = \sigma$ is a hard-core of the function $f(\sigma\alpha) \stackrel{\rm def}{=} 0\alpha$, where $\sigma \in \{0,1\}$ and $\alpha \in \{0,1\}^*$. Hence, in this case the fact that $b$ is a hard-core of the function $f$ is due to the fact that $f$ loses information (specifically the first bit $\sigma$). On the other hand, in case $f$ loses no information (i.e., $f$ is one-to-one) hard-cores for $f$ exist only if $f$ is one-way (see Exercise 19). Finally, we note that for every $b$ and $f$, there exist obvious algorithms which guess $b(U_n)$ from $f(U_n)$ with success probability at least half (e.g., either an algorithm $A_1$ that regardless of its input answers with a uniformly chosen bit, or, in case $b$ is not biased towards 0, the constant algorithm $A_2(x) \stackrel{\rm def}{=} 1$).

Simple hard-core predicates are known for the RSA, Rabin, and DLP collections (presented in Subsection 2.4.3), provided that the corresponding collections are one-way. Specifically, the least significant bit is a hard-core for the RSA collection, provided that the RSA collection is one-way. Namely, assuming that the RSA collection is one-way, it is infeasible to guess (with success probability significantly greater than half) the least significant bit of $x$ from $RSA_{N,e}(x) = x^e \bmod N$. Likewise, assuming that the DLP collection is one-way, it is infeasible to guess whether $x < \frac{P}{2}$ when given $DLP_{P,G}(x) = G^x \bmod P$. In the next subsection we present a general result of this kind.

2.5.2 Hard-Core Predicates for any One-Way Function

Actually, the title is inaccurate, as we are going to present hard-core predicates only for (strong) one-way functions of special form. However, every (strong) one-way function can be easily transformed into a function of the required form, with no substantial loss in either "security" or "efficiency".

Theorem 2.15 Let $f$ be an arbitrary strong one-way function, and let $g$ be defined by $g(x,r) \stackrel{\rm def}{=} (f(x),r)$, where $|x| = |r|$. Let $b(x,r)$ denote the inner-product mod 2 of the binary vectors $x$ and $r$. Then the predicate $b$ is a hard-core of the function $g$.

In other words, the theorem states that if $f$ is strongly one-way then it is infeasible to guess the exclusive-or of a random subset of the bits of $x$ when given $f(x)$ and the subset itself. We stress that the theorem requires that $f$ is strongly one-way and that the conclusion is false if $f$ is only weakly one-way (see Exercise 19). We point out that $g$ maintains properties of $f$ such as being length-preserving and being one-to-one. Furthermore, an analogous statement holds for collections of one-way functions with/without trapdoor etc.

Proof: The proof uses a "reducibility argument". This time inverting the function $f$ is reduced to predicting $b(x,r)$ from $(f(x),r)$. Hence, we assume (for contradiction) the existence of an efficient algorithm predicting the inner-product with advantage which is not negligible, and derive an algorithm that inverts $f$ with related (i.e., not negligible) success probability. This contradicts the hypothesis that $f$ is a one-way function.

Let $G$ be a (probabilistic polynomial-time) algorithm that on input $f(x)$ and $r$ tries to predict the inner-product (mod 2) of $x$ and $r$. Denote by $\varepsilon_G(n)$ the (overall) advantage of algorithm $G$ in predicting $b(x,r)$ from $f(x)$ and $r$, where $x$ and $r$ are uniformly chosen in $\{0,1\}^n$. Namely,

$$\varepsilon_G(n) \stackrel{\rm def}{=} \mathrm{Prob}\left(G(f(X_n),R_n) = b(X_n,R_n)\right) - \frac{1}{2}$$

where here and in the sequel $X_n$ and $R_n$ denote two independent random variables, each uniformly distributed over $\{0,1\}^n$. Assuming, to the contrary, that $b$ is not a hard-core of $g$ means that there exists an efficient algorithm $G$, a polynomial $p(\cdot)$ and an infinite set $N$ so that for every $n \in N$ it holds that $\varepsilon_G(n) > \frac{1}{p(n)}$. We restrict our attention to this algorithm $G$ and to $n$'s in this set $N$. In the sequel we shorthand $\varepsilon_G$ by $\varepsilon$.

Our first observation is that, on at least an $\frac{\varepsilon(n)}{2}$ fraction of the $x$'s of length $n$, algorithm $G$ has an $\frac{\varepsilon(n)}{2}$ advantage in predicting $b(x,R_n)$ from $f(x)$ and $R_n$. Namely,

Claim 2.15.1: there exists a set $S_n \subseteq \{0,1\}^n$ of cardinality at least $\frac{\varepsilon(n)}{2}\cdot 2^n$ such that for every $x \in S_n$, it holds that

$$s(x) \stackrel{\rm def}{=} \mathrm{Prob}\left(G(f(x),R_n) = b(x,R_n)\right) \ge \frac{1}{2} + \frac{\varepsilon(n)}{2}$$

This time the probability is taken over all possible values of $R_n$ and all internal coin tosses of algorithm $G$, whereas $x$ is fixed.

Proof: The observation follows by an averaging argument. Namely, write $\mathrm{Exp}(s(X_n)) = \frac{1}{2} + \varepsilon(n)$, and apply Markov Inequality. □

In the sequel we restrict our attention to $x$'s in $S_n$. We will show an efficient algorithm that on every input $y$, with $y = f(x)$ and $x \in S_n$, finds $x$ with very high probability. Contradiction to the (strong) one-wayness of $f$ will follow by noting that $\mathrm{Prob}(U_n \in S_n) \ge \frac{\varepsilon(n)}{2}$.

The next three paragraphs consist of a motivating discussion. The inverting algorithm, which uses algorithm $G$ as a subroutine, will be formally described and analyzed later.

A motivating discussion

Consider a fixed $x \in S_n$. By definition $s(x) \ge \frac{1}{2} + \frac{\varepsilon(n)}{2} > \frac{1}{2} + \frac{1}{2p(n)}$. Suppose, for a moment, that $s(x) > \frac{3}{4} + \frac{1}{2p(n)}$. Of course there is no reason to believe that this is the case, we are just doing a mental experiment. In this case (i.e., of $s(x) > \frac{3}{4} + \frac{1}{\mathrm{poly}(|x|)}$) retrieving $x$ from $f(x)$ is quite easy. To retrieve the $i$th bit of $x$, denoted $x_i$, we randomly select $r \in \{0,1\}^n$, and compute $G(f(x),r)$ and $G(f(x),r\oplus e^i)$, where $e^i$ is an $n$-dimensional binary vector with 1 in the $i$th component and 0 in all the others, and $v \oplus u$ denotes the addition mod 2 of the binary vectors $v$ and $u$. Clearly, if both $G(f(x),r) = b(x,r)$ and $G(f(x),r\oplus e^i) = b(x,r\oplus e^i)$, then

$$G(f(x),r) \oplus G(f(x),r\oplus e^i) = b(x,r) \oplus b(x,r\oplus e^i) = b(x,e^i) = x_i$$

since $b(x,r) \oplus b(x,s) \equiv \sum_{i=1}^n x_i r_i + \sum_{i=1}^n x_i s_i \equiv \sum_{i=1}^n x_i (r_i + s_i) \equiv b(x,r\oplus s) \pmod 2$. The probability that both equalities hold (i.e., both $G(f(x),r) = b(x,r)$ and $G(f(x),r\oplus e^i) = b(x,r\oplus e^i)$) is at least $1 - 2\cdot\left(\frac{1}{4} - \frac{1}{\mathrm{poly}(|x|)}\right) > \frac{1}{2} + \frac{1}{\mathrm{poly}(|x|)}$. Hence, repeating the above procedure sufficiently many times and ruling by majority we retrieve $x_i$ with very high probability. Similarly, we can retrieve all the bits of $x$, and hence invert $f$ on $f(x)$. However, the entire analysis was conducted under the (unjustifiable) assumption that $s(x) > \frac{3}{4} + \frac{1}{2p(|x|)}$, whereas we only know that $s(x) > \frac{1}{2} + \frac{1}{2p(|x|)}$.

The problem with the above procedure is that it doubles the original error probability of algorithm $G$ on inputs of the form $(f(x),\cdot)$. Under the unrealistic assumption that $G$'s error on such inputs is significantly smaller than $\frac{1}{4}$, the "error-doubling" phenomenon raises no problems.
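
The mental experiment can be simulated on a toy scale. The sketch below is illustrative only: bit vectors are packed into Python integers, and the predictor is a hypothetical stand-in for $G(f(x),\cdot)$ that "cheats" by knowing $x$ and errs on a fixed fraction of the $r$'s, so that its error rate can be set below $\frac{1}{4}$. The names `make_noisy_predictor`, `recover_bit` and `recover` are assumptions made for the example.

```python
import random

def inner_product(x, r):
    """b(x, r): inner product mod 2 of two bit vectors packed into ints."""
    return bin(x & r).count("1") % 2

def make_noisy_predictor(x, n, error_rate, rng):
    """Hypothetical stand-in for G(f(x), .): wrong on a fixed error_rate fraction of r's."""
    bad = set(rng.sample(range(2 ** n), int(error_rate * 2 ** n)))
    return lambda r: inner_product(x, r) ^ (r in bad)

def recover_bit(predict, n, i, trials, rng):
    """Majority vote over predict(r) XOR predict(r XOR e^i) for random r."""
    votes = 0
    for _ in range(trials):
        r = rng.randrange(2 ** n)
        votes += predict(r) ^ predict(r ^ (1 << i))
    return int(2 * votes > trials)

def recover(predict, n, trials, rng):
    """Retrieve all n bits of x, one majority vote per bit position."""
    return sum(recover_bit(predict, n, i, trials, rng) << i for i in range(n))
```

With an error rate of 0.1 each vote is wrong with probability at most 0.2, so a few hundred trials per bit suffice in this toy setting; once the error rate reaches $\frac{1}{4}$ the same bound gives no guarantee at all.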
However, in general (and even in the special case where $G$'s error is exactly $\frac{1}{4}$) the above procedure is unlikely to invert $f$. Note that the error probability of $G$ cannot be decreased by repeating $G$ several times (e.g., $G$ may always answer correctly on three quarters of the inputs, and always err on the remaining quarter). What is required is an alternative way of using the algorithm $G$, a way which does not double the original error probability of $G$. The key idea is to generate the $r$'s in a way which requires applying algorithm $G$ only once per each $r$ (and $i$), instead of twice. Specifically, we use algorithm $G$ to obtain a "guess" for $b(x,r\oplus e^i)$ and obtain $b(x,r)$ in a different way. The good news is that the error probability is no longer doubled, since we only need to use $G$ to get a "guess" of $b(x,r\oplus e^i)$. The bad news is that we still need to know $b(x,r)$, and it is not clear how we can know $b(x,r)$ without applying $G$. The answer is that we can guess $b(x,r)$ by ourselves. This is fine if we only need to guess $b(x,r)$ for one $r$ (or logarithmically in $|x|$ many $r$'s), but the problem is that we need to know (and hence guess) $b(x,r)$ for polynomially many $r$'s. An obvious way of guessing these $b(x,r)$'s yields an exponentially vanishing success probability. The solution is to generate these polynomially many $r$'s so that, on one hand they are "sufficiently random" whereas on the other hand we can guess all the $b(x,r)$'s with non-negligible success probability. Specifically, generating the $r$'s in a particular pairwise independent manner will satisfy both (seemingly contradictory) requirements. We stress that in case we are successful (in our guesses for the $b(x,r)$'s), we can retrieve $x$ with high probability. Hence, we retrieve $x$ with non-negligible probability.

A word about the way in which the pairwise independent $r$'s are generated (and the corresponding $b(x,r)$'s are guessed) is in place.
To generate $m = \mathrm{poly}(n)$ many $r$'s, we uniformly (and independently) select $l \stackrel{\rm def}{=} \log_2(m+1)$ strings in $\{0,1\}^n$. Let us denote these strings by $s^1,...,s^l$. We then guess $b(x,s^1)$ through $b(x,s^l)$. Let us denote these guesses, which are uniformly (and independently) chosen in $\{0,1\}$, by $\sigma^1$ through $\sigma^l$. Hence, the probability that all our guesses for the $b(x,s^i)$'s are correct is $2^{-l} = \frac{1}{\mathrm{poly}(n)}$. The different $r$'s correspond to the different non-empty subsets of $\{1,2,...,l\}$. We compute $r^J \stackrel{\rm def}{=} \oplus_{j\in J}\, s^j$. The reader can easily verify that the $r^J$'s are pairwise independent and each is uniformly distributed in $\{0,1\}^n$. The key observation is that

$$b(x,r^J) = b(x,\oplus_{j\in J}\, s^j) = \oplus_{j\in J}\, b(x,s^j)$$

Hence, our guess for the $b(x,r^J)$'s is $\oplus_{j\in J}\, \sigma^j$, and with non-negligible probability all our guesses are correct.

Back to the formal argument

Following is a formal description of the inverting algorithm, denoted $A$. We assume, for simplicity, that $f$ is length preserving (yet this assumption is not essential). On input $y$ (supposedly in the range of $f$), algorithm $A$ sets $n \stackrel{\rm def}{=} |y|$, and $l \stackrel{\rm def}{=} \lceil \log_2(2n\cdot p(n)^2 + 1) \rceil$, where $p(\cdot)$ is the polynomial guaranteed above (i.e., $\varepsilon(n) > \frac{1}{p(n)}$ for the infinitely many $n$'s in $N$). Algorithm $A$ uniformly and independently selects $s^1,...,s^l \in \{0,1\}^n$, and $\sigma^1,...,\sigma^l \in \{0,1\}$. It then computes, for every non-empty set $J \subseteq \{1,2,...,l\}$, a string $r^J \leftarrow \oplus_{j\in J}\, s^j$ and a bit $\rho^J \leftarrow \oplus_{j\in J}\, \sigma^j$. For every $i \in \{1,...,n\}$ and every non-empty $J \subseteq \{1,...,l\}$, algorithm $A$ computes $z_i^J \leftarrow \rho^J \oplus G(y,r^J\oplus e^i)$. Finally, algorithm $A$ sets $z_i$ to be the majority of the $z_i^J$ values, and outputs $z = z_1 \cdots z_n$. (Remark: in an alternative implementation of the ideas, the inverting algorithm, denoted $A'$, tries all possible values for $\sigma^1,...,\sigma^l$, and outputs only one of the resulting strings $z$, with an obvious preference for a string $z$ satisfying $f(z) = y$.)

Following is a detailed analysis of the success probability of algorithm $A$ on inputs of the form $f(x)$, for $x \in S_n$, where $n \in N$.
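We start by showing that, in case the $\sigma^j$'s are correct, then with constant probability, $z_i = x_i$ for all $i \in \{1,...,n\}$. This is proven by bounding from below the probability that the majority of the $z_i^J$'s equals $x_i$.

Claim 2.15.2: For every $x \in S_n$ and every $1 \le i \le n$,

$$\mathrm{Prob}\left( \left|\{J : b(x,r^J) \oplus G(f(x),r^J\oplus e^i) = x_i\}\right| > \frac{1}{2}\cdot(2^l - 1) \right) > 1 - \frac{1}{2n}$$

where $r^J \stackrel{\rm def}{=} \oplus_{j\in J}\, s^j$ and the $s^j$'s are independently and uniformly chosen in $\{0,1\}^n$.

Proof: For every $J$, define a 0-1 random variable $\zeta^J$, so that $\zeta^J$ equals 1 if and only if $b(x,r^J) \oplus G(f(x),r^J\oplus e^i) = x_i$. The reader can easily verify that each $r^J$ is uniformly distributed in $\{0,1\}^n$. It follows that each $\zeta^J$ equals 1 with probability $s(x)$, which by $x \in S_n$, is at least $\frac{1}{2} + \frac{1}{2p(n)}$. We show that the $\zeta^J$'s are pairwise independent by showing that the $r^J$'s are pairwise independent. For every $J \ne K$ we have, without loss of generality, $j \in J$ and $k \in K - J$. Hence, for every $\alpha,\beta \in \{0,1\}^n$, we have

$$\mathrm{Prob}\left(r^K = \beta \mid r^J = \alpha\right) = \mathrm{Prob}\left(s^k = \beta \mid s^j = \alpha\right) = \mathrm{Prob}\left(s^k = \beta\right) = \mathrm{Prob}\left(r^K = \beta\right)$$

and pairwise independence of the $r^J$'s follows. Let $m \stackrel{\rm def}{=} 2^l - 1$. Using Chebyshev's Inequality, we get

$$\mathrm{Prob}\left(\sum_J \zeta^J \le \frac{1}{2}\cdot m\right) \le \mathrm{Prob}\left(\left|\sum_J \zeta^J - \left(\frac{1}{2} + \frac{1}{2p(n)}\right)\cdot m\right| \ge \frac{1}{2p(n)}\cdot m\right) < \frac{\mathrm{Var}(\zeta^{\{1\}})}{\left(\frac{1}{2p(n)}\right)^2\cdot(2n\cdot p(n)^2)} < \frac{\frac{1}{4}}{\left(\frac{1}{2p(n)}\right)^2\cdot(2n\cdot p(n)^2)} = \frac{1}{2n}$$

The claim now follows. □

Recall that if $\sigma^j = b(x,s^j)$, for all $j$'s, then $\rho^J = b(x,r^J)$ for all non-empty $J$'s. In this case, by Claim 2.15.2 and a union bound over the $n$ bit positions, $z$ output by algorithm $A$ equals $x$ with probability at least half. However, the first event happens with probability $2^{-l} \approx \frac{1}{2n\cdot p(n)^2}$ independently of the events analyzed in Claim 2.15.2. Hence, in case $x \in S_n$, algorithm $A$ inverts $f$ on $f(x)$ with probability at least $\frac{1}{2}\cdot 2^{-l} \approx \frac{1}{4n\cdot p(|x|)^2}$ (whereas the modified algorithm, $A'$, succeeds with probability at least $\frac{1}{2}$). Recalling that $|S_n| > \frac{1}{2p(n)}\cdot 2^n$, we conclude that, for every $n \in N$, algorithm $A$ inverts $f$ on $f(U_n)$ with probability at least roughly $\frac{1}{2p(n)}\cdot\frac{1}{4n\cdot p(n)^2} = \frac{1}{8n\cdot p(n)^3}$, which is not negligible.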
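
The inverting algorithm analyzed above, together with the variant $A'$, can be sketched as follows. This is a toy illustration under stated assumptions, not the text's formal algorithm: bit vectors are packed into integers, the parameter $l$ is passed in rather than derived from $p(\cdot)$, and `G` is any callable taking $(y,r)$ and returning a bit. The names `subset_xors`, `decode`, `algorithm_A` and `algorithm_A_prime` are hypothetical.

```python
import itertools
import random

def inner_product(x, r):
    """b(x, r): inner product mod 2 of two bit vectors packed into ints."""
    return bin(x & r).count("1") % 2

def subset_xors(values):
    """XOR of values[j] over j in J, for every non-empty J (indexed by bitmask)."""
    out = [0] * (2 ** len(values))
    for mask in range(1, len(out)):
        low = (mask & -mask).bit_length() - 1      # index of the lowest set bit
        out[mask] = out[mask & (mask - 1)] ^ values[low]
    return out[1:]

def decode(G, y, n, s, sigma):
    """Majority decoding for fixed strings s^j and guesses sigma^j."""
    r = subset_xors(s)          # the strings r^J
    rho = subset_xors(sigma)    # the induced guesses rho^J for b(x, r^J)
    z = 0
    for i in range(n):
        # z_i^J = rho^J XOR G(y, r^J XOR e^i); G is applied once per (J, i).
        votes = sum(rho_j ^ G(y, r_j ^ (1 << i)) for r_j, rho_j in zip(r, rho))
        if 2 * votes > len(r):
            z |= 1 << i
    return z

def algorithm_A(G, y, n, l, rng):
    """One run of A: random s^1..s^l and random guesses sigma^1..sigma^l."""
    s = [rng.randrange(2 ** n) for _ in range(l)]
    sigma = [rng.randrange(2) for _ in range(l)]
    return decode(G, y, n, s, sigma)

def algorithm_A_prime(G, f, y, n, l, rng):
    """Variant A': try all 2^l guess vectors, return a z with f(z) = y if found."""
    s = [rng.randrange(2 ** n) for _ in range(l)]
    for sigma in itertools.product((0, 1), repeat=l):
        z = decode(G, y, n, s, sigma)
        if f(z) == y:
            return z
    return None
```

In a toy experiment one may take a predictor that errs on roughly 30 percent of the $r$'s, an error rate above $\frac{1}{4}$ for which the error-doubling procedure of the motivating discussion gives no guarantee, and observe that `algorithm_A_prime` still recovers $x$ within a few runs.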
Noting that A is polynomial-time (i.e., it merely invokes G for 2n p(n)2 = poly(n) times in addition to making a polynomial amount of other computations), a contradiction, to our hypothesis that f is strongly one-way, follows. 2.5.3 * Hard-Core Functions We have just seen that every one-way function can be easily modi ed to have a hard-core predicate. In other words, the result establishes one bit of information about the preimage which is hard to approximate from the value of the function. A stronger result may say that several bits of information about the preimage are hard to approximate. For example, we may want to say that a speci c pair of bits is hard to approximate, in the sense that 1 it is infeasible to guess this pair with probability signi cantly larger than 4 . In general, a polynomial-time function, h, is called a hard-core of a function f if no e cient algorithm can distinguish (f (x) h(x)) from (f (x) r), where r is a random string of length jh(x)j. For further discussion of the notion of e cient distinguishability the reader is referred to Section 3.2. We assume for simplicity that h is length regular (see below). De nition 2.16 (hard-core function): Let h : f0 1g 7! f0 1g be a polynomial-time computable function, satisfying jh(x)j = jh(y )j for all jxj = jy j, and let l(n) def jh(1n )j. The = function h : f0 1g 7! f0 1g is called a hard-core of a function f if for every probabilistic polynomial-time algorithm D0 , every polynomial p( ), and all su ciently large n's jProb ;D0(f (Xn) h(Xn))=1 ; Prob D0(f (Xn) Rl(n))=1 j < p(1n) where Xn and Rl(n) are two independent random variables the rst uniformly distributed over f0 1gn, and the second uniformly distributed over f0 1gl(n), Theorem 2.17 Let f be an arbitrary strong one-way function, and let g2 be de ned by g2(x s) def (f (x) s), where jsj = 2jxj. Let c > 0 be a constant, and l(n) def dc log2 ne. 
Let $b_i(x,s)$ denote the inner-product mod 2 of the binary vectors $x$ and $(s_{i+1},\dots,s_{i+n})$, where $s = (s_1,\dots,s_{2n})$. Then the function $h(x,s) \stackrel{\rm def}{=} b_1(x,s) \cdots b_{l(|x|)}(x,s)$ is a hard-core of the function $g_2$.

The proof of the theorem follows by combining a proposition concerning the structure of the specific function $h$ with a general lemma concerning hard-core functions. Loosely speaking, the proposition "reduces" the problem of approximating $b(x,r)$ given $g(x,r)$ to the problem of approximating the exclusive-or of any non-empty set of the bits of $h(x,s)$ given $g_2(x,s)$, where $b$ and $g$ are the hard-core and the one-way function presented in the previous subsection. Since we know that the predicate $b(x,r)$ cannot be approximated from $g(x,r)$, we conclude that no exclusive-or of the bits of $h(x,s)$ can be approximated from $g_2(x,s)$. The general lemma states that, for every "logarithmically shrinking" function $h'$ (i.e., $h'$ satisfying $|h'(x)| = O(\log |x|)$), the function $h'$ is a hard-core of a function $f'$ if and only if the exclusive-or of any non-empty subset of the bits of $h'$ cannot be approximated from the value of $f'$.

Proposition 2.18 Let $f$, $g_2$ and the $b_i$'s be as above. Let $I(n) \subseteq \{1,2,\dots,l(n)\}$, $n \in \mathbb{N}$, be an arbitrary sequence of non-empty subsets, and let $b_{I(|x|)}(x,s) \stackrel{\rm def}{=} \oplus_{i\in I(|x|)}\, b_i(x,s)$. Then, for every probabilistic polynomial-time algorithm $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(g_2(U_{3n})) = b_{I(n)}(U_{3n})\right) < \frac{1}{2} + \frac{1}{p(n)}$$

Proof: The proof is by a "reducibility" argument. It is shown that the problem of approximating $b(X_n, R_n)$ given $(f(X_n), R_n)$ is reducible to the problem of approximating $b_{I(n)}(X_n, S_{2n})$ given $(f(X_n), S_{2n})$, where $X_n$, $R_n$ and $S_{2n}$ are independent random variables and the last is uniformly distributed over $\{0,1\}^{2n}$. The underlying observation is that, for every $|s| = 2\cdot|x|$,

$$b_I(x,s) = \oplus_{i\in I}\, b_i(x,s) = b\left(x,\ \oplus_{i\in I}\, \mathrm{sub}_i(s)\right)$$

where $\mathrm{sub}_i(s_1,\dots,s_{2n}) \stackrel{\rm def}{=} (s_{i+1},\dots,s_{i+n})$.
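The identity $b_I(x,s) = b(x, \oplus_{i\in I}\mathrm{sub}_i(s))$ is just linearity of the inner product mod 2, and it can be confirmed by brute force on short strings. The sketch below is an illustration only: the function names are assumptions made for this example, and bit strings are represented as Python tuples with $s_1$ stored at index 0.

```python
from itertools import product, combinations


def b(x, r):
    """Inner product mod 2 of two equal-length bit tuples."""
    return sum(xi & ri for xi, ri in zip(x, r)) % 2


def sub(i, s, n):
    """sub_i(s) = (s_{i+1}, ..., s_{i+n}); s_1 lives at index 0."""
    return s[i:i + n]


def b_I(I, x, s):
    """XOR over i in I of b_i(x, s), where b_i(x, s) = b(x, sub_i(s))."""
    n = len(x)
    return sum(b(x, sub(i, s, n)) for i in I) % 2


def xor_of_subs(I, s, n):
    """Bitwise XOR of sub_i(s) over all i in I."""
    out = (0,) * n
    for i in I:
        out = tuple(a ^ c for a, c in zip(out, sub(i, s, n)))
    return out


def identity_holds(n, l):
    """Check b_I(x, s) == b(x, XOR_i sub_i(s)) for all x, s and non-empty I."""
    for x in product([0, 1], repeat=n):
        for s in product([0, 1], repeat=2 * n):
            for size in range(1, l + 1):
                for I in combinations(range(1, l + 1), size):
                    if b_I(I, x, s) != b(x, xor_of_subs(I, s, n)):
                        return False
    return True
```

For instance, `identity_holds(3, 2)` runs over all $x \in \{0,1\}^3$, all $s \in \{0,1\}^6$ and all non-empty $I \subseteq \{1,2\}$.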
Furthermore, the reader can verify that for every non-empty $I \subseteq \{1,\dots,n\}$, the random variable $\oplus_{i\in I}\, \mathrm{sub}_i(S_{2n})$ is uniformly distributed over $\{0,1\}^n$, and that given a string $r \in \{0,1\}^n$ and such a set $I$ one can efficiently select a string uniformly in the set $\{s : \oplus_{i\in I}\, \mathrm{sub}_i(s) = r\}$. (Verification of both claims is left as an exercise.)

Now, assume to the contrary that there exists an efficient algorithm $A'$, a polynomial $p(\cdot)$, and an infinite sequence of sets (i.e., $I(n)$'s) and $n$'s so that

$$\mathrm{Prob}\left(A'(g_2(U_{3n})) = b_{I(n)}(U_{3n})\right) \ge \frac{1}{2} + \frac{1}{p(n)}$$

We first observe that for $n$'s satisfying the above inequality we can find in probabilistic polynomial time (in $n$) a set $I$ satisfying

$$\mathrm{Prob}\left(A'(g_2(U_{3n})) = b_I(U_{3n})\right) \ge \frac{1}{2} + \frac{1}{2p(n)}$$

(i.e., by going over all possible $I$'s and experimenting with algorithm $A'$ on each of them). Of course we may be wrong here, but the error probability can be made exponentially small.

We now present an algorithm for approximating $b(x,r)$ from $y \stackrel{\rm def}{=} f(x)$ and $r$. On input $y$ and $r$, the algorithm first finds a set $I$ as described above (this stage depends only on $|x|$, which equals $|r|$). Once $I$ is found, the algorithm uniformly selects a string $s$ so that $\oplus_{i\in I}\, \mathrm{sub}_i(s) = r$, and returns $A'(y,s)$. Evaluation of the success probability of this algorithm is left as an exercise.

Lemma 2.19 (Computational XOR Lemma): Let $f$ and $h$ be arbitrary length regular functions, and let $l(n) \stackrel{\rm def}{=} |h(1^n)|$. Let $D$ be an algorithm. Denote

$$p \stackrel{\rm def}{=} \mathrm{Prob}\left(D(f(X_n), h(X_n)) = 1\right) \quad\text{and}\quad q \stackrel{\rm def}{=} \mathrm{Prob}\left(D(f(X_n), R_{l(n)}) = 1\right)$$

where $X_n$ and $R_l$ are as above. Let $G$ be an algorithm that on input $y$, $S$ (and $l(n)$), selects $r$ uniformly in $\{0,1\}^{l(n)}$, and outputs $D(y,r) \oplus 1 \oplus (\oplus_{i\in S}\, r_i)$, where $r = r_1 \cdots r_l$ and $r_i \in \{0,1\}$. Then,

$$\mathrm{Prob}\left(G(f(X_n), I_l, l(n)) = \oplus_{i\in I_l}\,(h_i(X_n))\right) = \frac{1}{2} + \frac{p - q}{2^{l(n)} - 1}$$

where $I_l$ is a randomly chosen non-empty subset of $\{1,\dots,l(n)\}$ and $h_i(x)$ denotes the $i$th bit of $h(x)$.
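Since Lemma 2.19 states an exact equality, it can be checked with exact rational arithmetic on a toy instance. The following is a minimal sketch under stated assumptions: the functions `toy_f`, `toy_h` and the two distinguishers are invented for illustration (they have no cryptographic meaning), $D$ is taken to be deterministic, and subsets are indexed from 0.

```python
from fractions import Fraction
from itertools import product, combinations


def xor_bits(r, S):
    """XOR of the bits of r at the positions in S."""
    return sum(r[i] for i in S) % 2


def check_xor_lemma(n, l, f, h, D):
    """Return (p, q, success probability of G), all computed exactly."""
    xs = list(product([0, 1], repeat=n))
    rs = list(product([0, 1], repeat=l))
    subsets = [S for k in range(1, l + 1)
               for S in combinations(range(l), k)]
    p = Fraction(sum(D(f(x), h(x)) for x in xs), len(xs))
    q = Fraction(sum(D(f(x), r) for x in xs for r in rs),
                 len(xs) * len(rs))
    hits = 0
    for x in xs:
        for S in subsets:          # uniformly chosen non-empty subset
            for r in rs:           # G's uniformly chosen string r
                guess = D(f(x), r) ^ 1 ^ xor_bits(r, S)
                hits += (guess == xor_bits(h(x), S))
    success = Fraction(hits, len(xs) * len(subsets) * len(rs))
    return p, q, success


# Hypothetical toy instance: n = 3, l = 2.
def toy_f(x):
    return x                       # identity, for illustration only


def toy_h(x):
    return (x[0] ^ x[1], x[2])


def perfect_D(y, r):
    """Outputs 1 exactly when r equals h(x); possible here since f is trivial."""
    return int(r == (y[0] ^ y[1], y[2]))


def weak_D(y, r):
    """An arbitrary, poor distinguisher."""
    return r[0] & y[0]
```

With `perfect_D` one gets $p = 1$ and $q = \frac{1}{4}$, so the lemma predicts success probability $\frac{1}{2} + \frac{3/4}{3} = \frac{3}{4}$. The equality also holds for `weak_D`, as it must for any $D$.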
It follows that, for logarithmically shrinking $h$'s, the existence of an efficient algorithm that distinguishes (with a gap which is not negligible in $n$) the random variables $(f(X_n), h(X_n))$ and $(f(X_n), R_{l(n)})$ implies the existence of an efficient algorithm that approximates the exclusive-or of a random non-empty subset of the bits of $h(X_n)$ from the value of $f(X_n)$ with an advantage that is not negligible. On the other hand, it is clear that any efficient algorithm which approximates an exclusive-or of a non-empty subset of the bits of $h$ from the value of $f$ can be easily modified to distinguish $(f(X_n), h(X_n))$ from $(f(X_n), R_{l(n)})$. Hence, for logarithmically shrinking $h$'s, the function $h$ is a hard-core of a function $f$ if and only if the exclusive-or of any non-empty subset of the bits of $h$ cannot be approximated from the value of $f$.

Proof: All that is required is to evaluate the success probability of algorithm $G$. We start by fixing an $x \in \{0,1\}^n$ and evaluating $\mathrm{Prob}(G(f(x), I_l, l) = \oplus_{i\in I_l}\,(h_i(x)))$, where $I_l$ is a uniformly chosen non-empty subset of $\{1,\dots,l\}$ and $l \stackrel{\rm def}{=} l(n)$. Let $B$ denote the set of all non-empty subsets of $\{1,\dots,l\}$. Define, for every $S \in B$, a relation $\equiv_S$ so that $y \equiv_S z$ if and only if $\oplus_{i\in S}\, y_i = \oplus_{i\in S}\, z_i$, where $y = y_1 \cdots y_l$ and $z = z_1 \cdots z_l$. By the definition of $G$, it follows that on input $(f(x), S, l)$ and random choice $r \in \{0,1\}^l$, algorithm $G$ outputs $\oplus_{i\in S}\,(h_i(x))$ if and only if either "$D(f(x),r) = 1$ and $r \equiv_S h(x)$" or "$D(f(x),r) = 0$ and $r \not\equiv_S h(x)$".
By elementary manipulations, we get

$$\begin{aligned}
s(x) &\stackrel{\rm def}{=} \mathrm{Prob}\left(G(f(x), I_l, l) = \oplus_{i\in I_l}\,(h_i(x))\right) \\
&= \sum_{S\in B} \frac{1}{|B|}\, \mathrm{Prob}\left(G(f(x), S, l) = \oplus_{i\in S}\,(h_i(x))\right) \\
&= \sum_{S\in B} \frac{1}{2|B|} \left( \mathrm{Prob}(D(f(x),R_l)=1 \mid R_l \equiv_S h(x)) + \mathrm{Prob}(D(f(x),R_l)=0 \mid R_l \not\equiv_S h(x)) \right) \\
&= \frac{1}{2} + \frac{1}{2|B|} \sum_{S\in B} \left( \mathrm{Prob}(D(f(x),R_l)=1 \mid R_l \equiv_S h(x)) - \mathrm{Prob}(D(f(x),R_l)=1 \mid R_l \not\equiv_S h(x)) \right) \\
&= \frac{1}{2} + \frac{1}{2|B|}\cdot\frac{1}{2^{l-1}} \left( \sum_{S\in B}\ \sum_{r \equiv_S h(x)} \mathrm{Prob}(D(f(x),r)=1) - \sum_{S\in B}\ \sum_{r \not\equiv_S h(x)} \mathrm{Prob}(D(f(x),r)=1) \right) \\
&= \frac{1}{2} + \frac{1}{2^l\, |B|} \left( \sum_{r}\ \sum_{S\in E(r,h(x))} \mathrm{Prob}(D(f(x),r)=1) - \sum_{r}\ \sum_{S\in N(r,h(x))} \mathrm{Prob}(D(f(x),r)=1) \right)
\end{aligned}$$

where $E(r,z) \stackrel{\rm def}{=} \{S \in B : r \equiv_S z\}$ and $N(r,z) \stackrel{\rm def}{=} \{S \in B : r \not\equiv_S z\}$. Observe that for every $r \ne z$ it holds that $|N(r,z)| = 2^{l-1}$ (and $|E(r,z)| = 2^{l-1} - 1$). On the other hand, $E(z,z) = B$ (and $N(z,z) = \emptyset$). Hence, we get

$$\begin{aligned}
s(x) &= \frac{1}{2} + \frac{1}{2^l\, |B|} \sum_{r \ne h(x)} \left( (2^{l-1} - 1)\cdot \mathrm{Prob}(D(f(x),r)=1) - 2^{l-1}\cdot \mathrm{Prob}(D(f(x),r)=1) \right) \\
&\qquad + \frac{1}{2^l\, |B|}\cdot |B| \cdot \mathrm{Prob}(D(f(x),h(x))=1) \\
&= \frac{1}{2} + \frac{1}{|B|} \left( \mathrm{Prob}(D(f(x),h(x))=1) - \mathrm{Prob}(D(f(x),R_l)=1) \right)
\end{aligned}$$

Thus

$$\mathrm{Exp}(s(X_n)) = \frac{1}{2} + \frac{1}{|B|} \left( \mathrm{Prob}(D(f(X_n),h(X_n))=1) - \mathrm{Prob}(D(f(X_n),R_l)=1) \right)$$

and the lemma follows.

2.6 * Efficient Amplification of One-way Functions

The amplification of weak one-way functions into strong ones, presented in Theorem 2.8, has no practical value. Recall that this amplification transforms a function $f$ which is hard to invert on a non-negligible fraction (i.e., $\frac{1}{p(n)}$) of the strings of length $n$ into a function $g$ which is hard to invert on all but a negligible fraction of the strings of length $n^2 p(n)$. Specifically, it is shown that an algorithm running in time $T(n)$ which inverts $g$ on an $\epsilon(n)$ fraction of the strings of length $n^2 p(n)$ yields an algorithm running in time $\mathrm{poly}(p(n), n, \frac{1}{\epsilon(n)}) \cdot T(n)$ which inverts $f$ on a $1 - \frac{1}{p(n)}$ fraction of the strings of length $n$.
Hence, if $f$ is "hard to invert in practice on a $\frac{1}{1000}$ fraction of the strings of length 100" then all we can say is that $g$ is "hard to invert in practice on a $\frac{999}{1000}$ fraction of the strings of length 1,000,000". In contrast, an efficient amplification of one-way functions, as given below, should relate the difficulty of inverting the (weak one-way) function $f$ on strings of length $n$ to the difficulty of inverting the (strong one-way) function $g$ on the strings of length $O(n)$ (rather than relating it to the difficulty of inverting the function $g$ on the strings of length $\mathrm{poly}(n)$). The following definition is natural for a general discussion of amplification of one-way functions.

Definition 2.20 (quantitative one-wayness): Let $T : \mathbb{N} \mapsto \mathbb{N}$ and $\epsilon : \mathbb{N} \mapsto \mathbb{R}$ be polynomial-time computable functions. A polynomial-time computable function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called $\epsilon(\cdot)$-one-way with respect to time $T(\cdot)$ if for every algorithm, $A'$, with running-time bounded by $T(\cdot)$ and all sufficiently large $n$'s

$$\mathrm{Prob}\left(A'(f(U_n)) \notin f^{-1}f(U_n)\right) > \epsilon(n)$$

Using this terminology we review what we know already about amplification of one-way functions. A function $f$ is weakly one-way if there exists a polynomial $p(\cdot)$ so that $f$ is $\frac{1}{p(\cdot)}$-one-way with respect to polynomial time. A function $f$ is strongly one-way if, for every polynomial $p(\cdot)$, $f$ is $(1 - \frac{1}{p(\cdot)})$-one-way with respect to polynomial time. The amplification result of Theorem 2.8 can be generalized and restated as follows. If there exists a polynomial-time computable function $f$ which is $\frac{1}{\mathrm{poly}(\cdot)}$-one-way with respect to time $T(\cdot)$ then there exists a polynomial-time computable function $g$ which is $(1 - \frac{1}{\mathrm{poly}(\cdot)})$-one-way with respect to time $T'(\cdot)$, where $T'(\mathrm{poly}(n)) = T(n)$ (in other words, $T'(n) = T(n^{\epsilon})$ for some $\epsilon > 0$). In contrast, an efficient amplification of one-way functions, as given below, should state that the above holds with respect to $T'(O(n)) = T(n)$ (in other words, $T'(n) = T(\epsilon \cdot n)$ for some $\epsilon > 0$).
Such a result can be obtained for regular one-way functions. A function $f$ is called regular if there exists a polynomial-time computable function $m : \mathbb{N} \mapsto \mathbb{N}$ and a polynomial $p(\cdot)$ so that, for every $y$ in the range of $f$, the number of preimages (of length $n$) of $y$ under $f$ is between $\frac{m(n)}{p(n)}$ and $m(n)\cdot p(n)$. In this book we only review the result for one-way permutations (i.e., length preserving 1-1 functions).

Theorem 2.21 (Efficient amplification of one-way permutations): Let $p(\cdot)$ be a polynomial and $T : \mathbb{N} \mapsto \mathbb{N}$ be a polynomial-time computable function. Suppose that $f$ is a polynomial-time computable permutation which is $\frac{1}{p(\cdot)}$-one-way with respect to time $T(\cdot)$. Then, there exists a polynomial-time computable permutation $F$ so that, for every polynomial-time computable function $\epsilon : \mathbb{N} \mapsto [0,1]$, the function $F$ is $(1 - \epsilon(\cdot))$-one-way with respect to time $T'(\cdot)$, where $T'(O(n)) \stackrel{\rm def}{=} \frac{\epsilon(n)^2}{\mathrm{poly}(n)} \cdot T(n)$.

The constants, in the O-notation and in the poly-notation, depend on the polynomial $p(\cdot)$.

The key to the amplification of a one-way permutation $f$ is to apply $f$ on many different arguments. In the proof of Theorem 2.8, $f$ is applied to unrelated arguments (which are disjoint parts of the input). This makes the proof relatively easy, but also makes the construction very inefficient. Instead, in the construction presented in the proof of the current theorem, we apply the one-way permutation $f$ on related arguments. The first idea which comes to mind is to apply $f$ iteratively many times, each time on the value resulting from the previous application. This will not help if easy instances for the inverting algorithm keep being mapped, by $f$, to themselves. We cannot just hope that this will not happen. The idea is to use randomization between successive applications. It is important that we use only a small amount of randomization, since the "randomization" will be encoded into the argument of the constructed function.
The randomization, between successive applications of $f$, takes the form of a random step on an expander graph. Hence a few words about these graphs and random walks on them are in place.

A graph $G = (V,E)$ is called an $(n,d,c)$-expander if it has $n$ vertices (i.e., $|V| = n$), every vertex in $V$ has degree $d$ (i.e., $G$ is $d$-regular), and $G$ has the following expansion property (with expansion factor $c > 0$): for every subset $S \subset V$, if $|S| \le \frac{n}{2}$ then $|N(S)| \ge c \cdot |S|$, where $N(S)$ denotes the vertices in $V - S$ which have a neighbour in $S$ (i.e., $N(S) \stackrel{\rm def}{=} \{u \in V - S : \exists v \in S \text{ s.t. } (u,v) \in E\}$). By explicitly constructed expanders we mean a family of graphs $\{G_n\}_{n\in\mathbb{N}}$ so that $G_n$ is a $(2^{2n}, d, c)$ expander ($d$ and $c$ are the same for all graphs in the family) having a polynomial-time algorithm that on input a description of a vertex in an expander outputs its adjacency list (vertices in $G_n$ are represented by binary strings of length $2n$). Such expander families do exist.

By a random walk on a graph we mean the sequence of vertices visited by starting at a uniformly chosen vertex and randomly selecting at each step one of the neighbouring vertices of the current vertex, with uniform probability distribution. The expanding property implies (via a non-trivial proof) that the vertices along random walks on an expander have surprisingly strong "random properties". In particular, for every $l$, the probability that vertices along an $O(l)$-step long random walk hit a subset, $S$, is approximately the same as the probability that at least one of $l$ independently chosen vertices hits $S$.

We remind the reader that we are interested in successively applying the permutation $f$, while interleaving randomization steps between successive applications. Hence, before applying permutation $f$ to the result of the previous application, we take one random step on an expander. Namely, we associate the domain of the given one-way permutation with the vertex set of the expander.
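To make the expansion property concrete, the quantity $\min_S |N(S)|/|S|$ over non-empty $S$ with $|S| \le n/2$ can be computed by brute force on a very small graph. The sketch below is illustrative only: the adjacency-list representation, the function names and the cycle example are assumptions made here. A cycle is a poor expander (its factor shrinks as the cycle grows), and the exhaustive search is exponential, so this says nothing about explicit constructions.

```python
from fractions import Fraction
from itertools import combinations


def neighbourhood(adj, S):
    """N(S): vertices outside S that have a neighbour in S."""
    S = set(S)
    return {u for v in S for u in adj[v]} - S


def expansion_factor(adj):
    """Largest c with |N(S)| >= c*|S| for all non-empty S, |S| <= n/2."""
    n = len(adj)
    best = None
    for size in range(1, n // 2 + 1):
        for S in combinations(list(adj), size):
            ratio = Fraction(len(neighbourhood(adj, S)), size)
            if best is None or ratio < best:
                best = ratio
    return best


def is_expander(adj, d, c):
    """Check d-regularity and expansion factor at least c."""
    regular = all(len(nbrs) == d for nbrs in adj.values())
    return regular and expansion_factor(adj) >= c


def cycle(n):
    """Adjacency lists of the n-cycle (a 2-regular graph)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
```

For the 6-cycle the worst set is three consecutive vertices, which has only two outside neighbours, so the expansion factor is $\frac{2}{3}$.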
Our construction alternately applies the given one-way permutation, $f$, and randomly moves from the vertex just reached to one of its neighbours. A key observation is that the composition of an expander with any permutation on its vertices yields an expander (with the same expansion properties). Combining the properties of random walks on expanders and a "reducibility" argument, the construction is shown to amplify the one-wayness of the given permutation in an efficient manner.

Construction 2.22 Let $\{G_n\}_{n\in\mathbb{N}}$ be a family of $d$-regular graphs, so that $G_n$ has vertex set $\{0,1\}^n$ and self-loops at every vertex. Consider a labeling of the edges incident to each vertex (using the labels $1,2,\dots,d$). Define $g_l(x)$ to be the vertex reachable from vertex $x$ by following the edge labeled $l$. Let $f : \{0,1\}^* \mapsto \{0,1\}^*$ be a 1-1 length preserving function. For every $k \ge 0$, $x \in \{0,1\}^n$, and $\sigma_1, \sigma_2, \dots, \sigma_k \in \{1,2,\dots,d\}$, define

$$F(x, \sigma_1 \sigma_2 \cdots \sigma_k) = \sigma_1,\ F(g_{\sigma_1}(f(x)), \sigma_2, \dots, \sigma_k)$$

(with $F(x, \lambda) = x$). For every $k : \mathbb{N} \mapsto \mathbb{N}$, define $F_{k(\cdot)}(\alpha) \stackrel{\rm def}{=} F(x, \sigma_1, \dots, \sigma_t)$, where $t = k(|x|)$ and $\sigma_i \in \{1,2,\dots,d\}$.

Proposition 2.23 Let $\{G_n\}$, $f$, $k : \mathbb{N} \mapsto \mathbb{N}$, and $F_{k(\cdot)}$ be as in Construction 2.22 (above), and suppose that $\{G_n\}_{n\in\mathbb{N}}$ is an explicitly constructed family of $d$-regular expander graphs, and $f$ is polynomial-time computable. Suppose that $\alpha : \mathbb{N} \mapsto \mathbb{R}$ and $T : \mathbb{N} \mapsto \mathbb{N}$ are polynomial-time computable, and $f$ is $\alpha(\cdot)$-one-way with respect to time $T : \mathbb{N} \to \mathbb{N}$. Then, for every polynomial-time computable $\varepsilon : \mathbb{N} \mapsto \mathbb{R}$, the function $F_{k(\cdot)}$ is polynomial-time computable as well as $(1 - \varepsilon(\cdot))\beta(\cdot)$-one-way with respect to time $T' : \mathbb{N} \to \mathbb{N}$, where $\beta(n) \stackrel{\rm def}{=} (1 - (1 - \alpha(n))^{k(n)/2})$ and $T'(n + k(n)\log_2 d) \stackrel{\rm def}{=} \frac{\varepsilon(n)^2 \beta(n)}{k(n)\cdot n} \cdot T(n)$.

Theorem 2.21 follows by applying the proposition $\delta + 1$ times, where $\delta$ is the degree of the polynomial $p(\cdot)$ (specified in the hypothesis that $f$ is $\frac{1}{p(\cdot)}$-one-way). In all applications of the proposition we use $k(n) \stackrel{\rm def}{=} 3n$.
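The recursion in Construction 2.22 can be written out directly. The following is a minimal sketch under loudly stated assumptions: `toy_f` is an arbitrary permutation of $\{0,\dots,7\}$ with no one-wayness whatsoever, and `toy_g` walks on an 8-cycle with self-loops, which is 3-regular but is not taken from an expander family. Only the shape of $F$ follows the text; the output is represented as the tuple $(\sigma_1,\dots,\sigma_k, v)$ with $v$ the final vertex.

```python
from itertools import product


def make_F(f, g):
    """Build F from a permutation f and an edge-following map g(label, x)."""
    def F(x, sigmas):
        if not sigmas:                      # F(x, lambda) = x
            return (x,)
        first, rest = sigmas[0], tuple(sigmas[1:])
        # F(x, s1 s2 ... sk) = s1, F(g_{s1}(f(x)), s2, ..., sk)
        return (first,) + F(g(first, f(x)), rest)
    return F


N_BITS = 3
SIZE = 2 ** N_BITS                          # vertices are 0 .. 7


def toy_f(x):
    """Hypothetical stand-in permutation (gcd(3, 8) = 1); not one-way."""
    return (3 * x + 1) % SIZE


def toy_g(label, x):
    """Edge labels on a cycle with self-loops: 1 = stay, 2 = +1, 3 = -1."""
    if label == 1:
        return x
    if label == 2:
        return (x + 1) % SIZE
    if label == 3:
        return (x - 1) % SIZE
    raise ValueError("label must be 1, 2 or 3")


F = make_F(toy_f, toy_g)
```

For example, `F(0, (2, 3))` applies `toy_f` to 0 (giving 1), follows edge 2 (giving 2), applies `toy_f` again (giving 7) and follows edge 3 (giving 6), so the result is `(2, 3, 6)`. Because the labels are copied to the output and each step is a permutation of the vertices, $F$ is 1-1 on inputs with a fixed number of labels.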
In the first $\delta$ applications we use any $\varepsilon(n) < \frac{1}{7}$. The function resulting from the $i$th application of the proposition, for $i \le \delta$, is $\frac{1}{2n^{\delta - i}}$-one-way. In particular, after $\delta$ applications, the resulting function is $\frac{1}{2}$-one-way. (It seems that the notion of $\frac{1}{2}$-one-wayness is worthy of special attention, and deserves a name such as mostly one-way.) In the last (i.e., $\delta + 1$st) application we use $\varepsilon(n) = \epsilon(n)$. The function resulting from the last (i.e., $\delta + 1$st) application of the proposition satisfies the statement of Theorem 2.21.

The proposition itself is proven as follows. First, we use the fact that $f$ is a permutation to show that the graph $G_f = (V, E_f)$, obtained from $G = (V,E)$ by letting $E_f \stackrel{\rm def}{=} \{(u, f(v)) : (u,v) \in E\}$, has the same expansion property as the graph $G$. Next, we use the known relation between the expansion constant of a graph and the ratio of the two largest eigenvalues of its adjacency matrix to prove that with appropriate choice of the family $\{G_n\}$ we can have this ratio bounded below by $\frac{1}{\sqrt{2}}$. Finally, we combine the following two Lemmata.

Lemma 2.24 (Random Walk Lemma): Let $G$ be a $d$-regular graph having a normalized (by factor $\frac{1}{d}$) adjacency matrix for which the ratio of the first and second eigenvalues is smaller than $\frac{1}{\sqrt{2}}$. Let $\epsilon \le 1/2$ and $S$ be a subset of measure $\epsilon$ of the expander's nodes. Then a random walk of length $2k$ on the expander hits $S$ with probability at least $1 - (1 - \epsilon)^k$.

The proof of the Random Walk Lemma regards probability distributions over the expander vertex-set as linear combinations of the eigenvectors of the adjacency matrix. It can be shown that the largest eigenvalue is 1, and the eigenvector associated to it is the uniform distribution. Going step by step, we bound from above the probability mass assigned to random walks which do not pass through the set $S$.
At each step, the component of the current distribution which is in the direction of the first eigenvector loses a factor $\epsilon$ of its weight (this represents the fraction of the paths which enter $S$ in the current step). The problem is that we cannot make a similar statement with respect to the other components. Yet, using the bound on the second eigenvalue, it can be shown that in each step these components are "pushed" towards the direction of the first eigenvector. The details, being of little relevance to the topic of the book, are omitted.

Lemma 2.25 (Reducibility Lemma): Let $\alpha, \beta : \mathbb{N} \mapsto [0,1]$, and $G_{f,n}$ be a $d$-regular graph on $2^n$ vertices satisfying the following random path property: for every measure $\alpha(n)$ subset, $S$, of $G_{f,n}$'s nodes, at least a fraction $\beta(n + k(n)\log_2 d)$ of the paths of length $k(n)$ passes through a node in $S$ (typically $\beta(n + k(n)\log_2 d) > \alpha(n)$). Suppose that $f$ is $(\alpha(\cdot) + \exp(\cdot))$-one-way with respect to time $T(\cdot)$. Then, for every polynomial-time computable $\varepsilon : \mathbb{N} \mapsto \mathbb{R}$, the function $F_{k(\cdot)}$, defined above, is $(1 - \varepsilon(\cdot))\beta(\cdot)$-one-way with respect to time $T' : \mathbb{N} \to \mathbb{N}$, where $\beta(n + k(n)\log_2 d) \stackrel{\rm def}{=} (1 - (1 - \alpha(n))^{k(n)/2})$ and $T'(n + k(n)\log_2 d) \stackrel{\rm def}{=} \frac{\varepsilon(n)^2 \beta(n)}{k(n)\cdot n} \cdot T(n)$.

Proof Sketch: The proof is by a "reducibility argument". Assume for contradiction that $F_{k(\cdot)}$, defined as above, can be inverted by an algorithm $A$ in time $T'(\cdot)$ with probability at least $1 - (1 - \varepsilon(m))\beta(m)$ on inputs of length $m \stackrel{\rm def}{=} n + k(n)\log_2 d$. Amplify $A$ to invert $F_{k(\cdot)}$ with overwhelming probability on a $1 - \beta(m)$ fraction of the inputs of length $m$ (originally $A$ inverts each such point with probability $> \varepsilon(m)$, as we can ignore inputs inverted with probability smaller than $\varepsilon(m)$). Note that inputs to $A$ correspond to $k(n)$-long paths on the graph $G_n$. Consider the set, denoted $B_n$, of paths $(x,p)$ such that $A$ inverts $F_{k(n)}(x,p)$ with overwhelming probability.

In the sequel, we use the shorthands $k \stackrel{\rm def}{=} k(n)$, $m \stackrel{\rm def}{=} n + k\log_2 d$, $\varepsilon \stackrel{\rm def}{=} \varepsilon(m)$, $\beta \stackrel{\rm def}{=} \beta(m)$, $\alpha \stackrel{\rm def}{=} \alpha(n)$, and $B \stackrel{\rm def}{=} B_n$. Let $P_v$ be the set of all $k$-long paths which pass through $v$, and $B_v$ be the subset of $B$ containing paths which pass through $v$ (i.e., $B_v = B \cap P_v$). Define $v$ as good if $|B_v|/|P_v| \ge \varepsilon\beta/k$ (and bad otherwise). Intuitively, a vertex $v$ is called good if at least an $\varepsilon\beta/k$ fraction of the paths going through $v$ can be inverted by $A$. Let $B' = B - \cup_{v\ \mathrm{bad}} B_v$; namely, $B'$ contains all "invertible" paths which pass solely through good nodes. Clearly,

Claim 2.25.1: The measure of $B'$ in the set of all paths is greater than $1 - \beta$.

Proof: Denote by $\mu(S)$ the measure of the set $S$ in the set of all paths. Then

$$\begin{aligned}
\mu(B') &= \mu(B) - \mu(\cup_{v\ \mathrm{bad}} B_v) \\
&\ge 1 - (1 - \varepsilon)\beta - \sum_{v\ \mathrm{bad}} \mu(B_v) \\
&> 1 - \beta + \varepsilon\beta - \sum_{v\ \mathrm{bad}} (\varepsilon\beta/k)\cdot\mu(P_v) \\
&> 1 - \beta
\end{aligned}$$

□

Using the random path property, we have

Claim 2.25.2: The measure of good nodes is at least $1 - \alpha$.

Proof: Otherwise, let $S$ be the set of bad nodes. If $S$ has measure $\alpha$ then, by the random path property, it follows that the fraction of paths which pass through vertices of $S$ is at least $\beta$. Hence, $B'$, which cannot contain such paths, can contain only a $1 - \beta$ fraction of all paths, in contradiction to Claim 2.25.1. □

The following algorithm for inverting $f$ is quite natural. The algorithm uses as subroutine an algorithm, denoted $A$, for inverting $F_{k(\cdot)}$. Inverting $f$ on $y$ is done by placing $y$ on a random point along a randomly selected path $p$, taking a walk from $y$ according to the suffix of $p$, and asking $A$ for the preimage of the resulting pair under $F_k$.

Algorithm for inverting $f$: On input $y$, repeat $kn$ times:
1. Select randomly $i \in \{1,2,\dots,k\}$, and $\sigma_1, \sigma_2, \dots, \sigma_k \in \{1,2,\dots,d\}$.
2. Compute $y' = F(g_{\sigma_i}(y), \sigma_{i+1} \cdots \sigma_k)$.
3. Invoke $A$ to get $x' \leftarrow A(\sigma_1 \sigma_2 \cdots \sigma_k, y')$.
4. Compute $x = F(x', \sigma_1 \cdots \sigma_{i-1})$.
5. If $f(x) = y$ then halt and output $x$.

Analysis of the inverting algorithm (for a good $x$): Since $x$ is good, a random path going through it (selected above) corresponds to an "invertible path" with probability at least $\varepsilon\beta/k$. If such a path is selected then we obtain the inverse of $f(x)$ with overwhelming probability. The algorithm for inverting $f$ repeats the process sufficiently many times to guarantee overwhelming probability of selecting an "invertible path".

By Claim 2.25.2, the good $x$'s constitute a $1 - \alpha$ fraction of all $n$-bit strings. Hence, the existence of an algorithm inverting $F_{k(\cdot)}$, in time $T'(\cdot)$ with probability at least $1 - (1 - \varepsilon(\cdot))\beta(\cdot)$, implies the existence of an algorithm inverting $f$, in time $T(\cdot)$ with probability at least $1 - \alpha(\cdot) - \exp(\cdot)$. This constitutes a contradiction to the hypothesis of the lemma, and hence the lemma follows.

2.7 Miscellaneous

2.7.1 Historical Notes

The notion of a one-way function originates from the paper of Diffie and Hellman [DH76]. Weak one-way functions were introduced by Yao [Y82]. The RSA function was introduced by Rivest, Shamir and Adleman [RSA78], whereas squaring modulo a composite was introduced and studied by Rabin [R79]. The suggestion for basing one-way functions on the believed intractability of decoding random linear codes is taken from [BMT78,GKL88], and the suggestion to base one-way functions on the subset sum problem is taken from [IN89].

The equivalence of existence of weak and strong one-way functions is implicit in Yao's work [Y82]. The existence of universal one-way functions is stated in Levin's work [L85]. The efficient amplification of one-way functions, presented in Section 2.6, is taken from Goldreich et al. [GILVZ], which in turn uses ideas originating in [AKS].

Author's Note: GILVZ = Goldreich, Impagliazzo, Levin, Venkatesan and Zuckerman (FOCS90); AKS = Ajtai, Komlós and Szemerédi (STOC87).

The concept of hard-core predicates originates from the work of Blum and Micali [BM82].
That work also proves that a particular predicate constitutes a hard-core for the "DLP function" (i.e., exponentiation in a finite field), provided that this function is one-way. Consequently, Yao proved that the existence of one-way functions implies the existence of hard-core predicates [Y82]. However, Yao's construction, which is analogous to the construction used for the proof of Theorem 2.8, is of little practical value. The fact that the inner-product mod 2 is a hard-core for any one-way function (of the form $g(x,r) = (f(x), r)$) was proven by Goldreich and Levin [GL89]. The proof presented in this book, which follows ideas originating in [ACGS84], is due to Charles Rackoff.

Hard-core predicates and functions for specific collections of permutations were suggested in [BM82,LW,K88,ACGS84,VV84]. Specifically, Kaliski [K88], extending ideas of [BM82,LW], proves that the intractability of various discrete logarithm problems yields hard-core functions for the related exponentiation permutations. Alexi et al. [ACGS84], building on work by Ben-Or et al. [BCS83], prove that the intractability of factoring yields hard-core functions for permutations induced by squaring modulo a composite number.

2.7.2 Suggestion for Further Reading

Our exposition of the RSA and Rabin functions is quite sparse in details. In particular, the computational problems of generating uniformly distributed "certified primes" and of "primality checking" deserve much more attention. A probabilistic polynomial-time algorithm for generating uniformly distributed primes together with corresponding certificates of primality has been presented by Bach [BachPhd]. The certificate produced by this algorithm for a prime $P$ consists of the prime factorization of $P - 1$, together with certificates for primality of these factors. This recursive form of certificates for primality originates in Pratt's proof that the set of primes is in NP (cf. [vP]). However, the above procedure is not very practical.
Instead, when using the RSA (or Rabin) function in practice, one is likely to prefer an algorithm that generates integers at random and checks them for primality using fast primality checkers such as the algorithms presented in [SSprime,Rprime]. One should note, however, that these algorithms do not produce certificates for primality, and that with some (small) probability they may assert that a composite number is a prime. Probabilistic polynomial-time algorithms (yet not practical ones) that, given a prime, produce a certificate for primality, are presented in [GKprime,AHprime].

Author's Note: SSprime = Solovay and Strassen, Rprime = Rabin, GKprime = Goldwasser and Kilian, AHprime = Adleman and Huang.

The subset sum problem is known to be easy in two special cases. One case is the case in which the input sequence is constructed based on a simple "hidden sequence". For example, Merkle and Hellman [MH78] suggested constructing an instance of the subset-sum problem based on a "hidden super increasing sequence" as follows. Let $s_1, \dots, s_n, M \stackrel{\rm def}{=} s_{n+1}$ be a sequence satisfying $s_i > \sum_{j=1}^{i-1} s_j$, for every $i$, and let $w$ be relatively prime to $M$. Such a sequence is called super increasing. The instance consists of $(x_1, \dots, x_n)$ and $\sum_{i\in I} x_i$, for $I \subseteq \{1,\dots,n\}$, where $x_i \stackrel{\rm def}{=} w \cdot s_i \bmod M$. It can be shown that knowledge of both $w$ and $M$ allows easy solution of the subset sum problem for the above instance. The hope was that, when $w$ and $M$ are not given, solving the subset-sum problem is hard even for instances generated based on a super increasing sequence (and this would lead to a trapdoor one-way function). However, the hope did not materialize. Shamir presented an efficient algorithm for solving the subset-sum problem for instances with a hidden super increasing sequence [S82]. Another case for which the subset sum problem is known to be easy is the case of low density instances.
In these instances the length of the elements in binary representation is considerably larger than the number of elements (i.e., $|x_1| = \cdots = |x_n| = (1 + \epsilon)n$ for some constant $\epsilon > 0$). For further details consult the original work of Lagarias and Odlyzko [LO85] and the later survey of Brickell and Odlyzko [BO88].

For further details on hard-core functions for the RSA and Rabin functions the reader is directed to Alexi et al. [ACGS84]. For further details on hard-core functions for the "DLP function" the reader is directed to Kaliski's work [K88].

The theory of average-case complexity, initiated by Levin [L84], is somewhat related to the notion of one-way functions. For a survey of this theory we refer the reader to [BCGL]. Loosely speaking, the difference is that in our context it is required that the (efficient) "generator" of hard (on-the-average) instances can easily solve them himself, whereas in Levin's work the instances are hard (on-the-average) to solve even for the "generator". However, the notion of average-case reducibility introduced by Levin is relevant also in our context.

Author's Note: BCGL = Ben-David, Chor, Goldreich and Luby (JCSS, April 1992).

Readers interested in further details about the best algorithms known for the factoring problem are directed to Pomerance's survey [P82]. Further details on the best algorithms known for the discrete logarithm problem (DLP) can be found in Odlyzko's survey [O84]. In addition, the reader is referred to Bach and Shallit's book on computational number theory [BS92book]. Further details about expander graphs, and random walks on them, can be found in the book of Alon and Spencer [AS91book].

Author's Note: Updated versions of the surveys by Pomerance and Odlyzko do exist.

2.7.3 Open Problems

The efficient amplification of one-way functions, originating in [GILVZ], is only known to work for special types of functions (e.g., regular ones).
We believe that presenting (and proving) an efficient amplification of arbitrary one-way functions is a very important open problem. It may also be instrumental for more efficient constructions of pseudorandom generators based on arbitrary one-way functions (see Section 3.5).

An open problem of more practical importance is to try to present hard-core functions with larger range for the RSA and Rabin functions. Specifically, assuming that squaring mod $N$ is one-way, is the function which returns the first half of $x$ a hard-core of squaring mod $N$? Some support for a positive answer is provided by the work of Shamir and Schrift [SS90]. A positive answer would allow the construction of extremely efficient pseudorandom generators and public-key encryption schemes based on the conjectured intractability of the factoring problem.

2.7.4 Exercises

Exercise 1: Closing the gap between the motivating discussion and the definition of one-way functions: We say that a function $h : \{0,1\}^* \mapsto \{0,1\}^*$ is hard on the average but easy with auxiliary input if there exists a probabilistic polynomial-time algorithm, $G$, such that
1. There exists a polynomial-time algorithm, $A$, such that $A(x,y) = h(x)$ for every $(x,y)$ in the range of $G$ (i.e., for every $(x,y)$ so that $(x,y)$ is a possible output of $G(1^n)$ for some input $1^n$).
2. For every probabilistic polynomial-time algorithm, $A'$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s, $\mathrm{Prob}(A'(X_n) = h(X_n)) < \frac{1}{p(n)}$, where $(X_n, Y_n) \stackrel{\rm def}{=} G(1^n)$ is a random variable assigned the output of $G$.
Prove that if there exist "hard on the average but easy with auxiliary input" functions then one-way functions exist.

Exercise 2: One-way functions and the P vs. NP question (part 1): Prove that the existence of one-way functions implies $P \ne NP$. (Guidelines: for every function $f$ define $L_f \in NP$ so that if $L_f \in P$ then there exists a polynomial-time algorithm for inverting $f$.)

Exercise 3: One-way functions and the P vs.
NP question (part 2): Assuming that $P \ne NP$, construct a function $f$ so that the following three claims hold:
1. $f$ is polynomial-time computable;
2. there is no polynomial-time algorithm that always inverts $f$ (i.e., successfully inverts $f$ on every $y$ in the range of $f$); and
3. $f$ is not (even weakly) one-way. Furthermore, there exists a polynomial-time algorithm which inverts $f$ with exponentially small failure probability, where the probability space is (again) of all possible choices of input (i.e., $f(x)$) and internal coin tosses for the algorithm.
(Guidelines: consider the function $f_{\rm sat}$ defined so that $f_{\rm sat}(\phi, \tau) = (\phi, 1)$ if $\tau$ is a satisfying assignment to the propositional formula $\phi$, and $f_{\rm sat}(\phi, \tau) = (\phi, 0)$ otherwise. Modify this function so that it is easy to invert on most instances, yet inverting $f_{\rm sat}$ is reducible to inverting its modification.)

Exercise 4: Let $f$ be a strongly one-way function. Prove that for every probabilistic polynomial-time algorithm $A$, and for every polynomial $p(\cdot)$, the set

$$B_{A,p} \stackrel{\rm def}{=} \left\{x : \mathrm{Prob}\left(A(f(x)) \in f^{-1}f(x)\right) \ge \frac{1}{p(|x|)}\right\}$$

has negligible density in the set of all strings (i.e., for every polynomial $q(\cdot)$ and all sufficiently large $n$ it holds that $\frac{|B_{A,p} \cap \{0,1\}^n|}{2^n} < \frac{1}{q(n)}$).

Exercise 5: Another definition of non-uniformly one-way functions: Consider the definition resulting from Definition 2.6 by allowing the circuits to be probabilistic (i.e., have an auxiliary input which is uniformly selected). Prove that the resulting new definition is equivalent to the original one.

Exercise 6: Let $f_{\rm mult}$ be as defined in Section 2.2. Assume that every integer factoring algorithm has, on input $N$, running time $L(P) \stackrel{\rm def}{=} 2^{\Omega(\sqrt{\log P \log\log P})}$, where $P$ is the second biggest prime factor of $N$. Prove that $f_{\rm mult}$ is strongly one-way. (Guideline: using results on density of smooth numbers, show that the density of integers $N$ with second biggest prime smaller than $L(N)$ is smaller than $\frac{1}{L(N)}$.)

Exercise 7: Define $f_{\rm add} : \{0,1\}^* \mapsto \{0,1\}^*$ so that $f_{\rm add}(xy) = \mathrm{prime}(x) + \mathrm{prime}(y)$, where $|x| = |y|$ and $\mathrm{prime}(z)$ is the smallest prime which is larger than $z$. Prove that $f_{\rm add}$ is not a one-way function. (Guideline: don't try to capitalize on the possibility that $\mathrm{prime}(N)$ is too large, e.g., larger than $N + \mathrm{poly}(\log N)$. It is unlikely that such a result, in number theory, can be proven. Furthermore, it is generally believed that there exists a constant $c$ such that, for all integers $N \ge 2$, it holds that $\mathrm{prime}(N) < N + \log_2^c N$. Hence, it is likely that $f_{\rm add}$ is polynomial-time computable.)

Exercise 8: Prove that one-way functions cannot have a polynomial-size range. Namely, prove that if $f$ is (even weakly) one-way then for every polynomial $p(\cdot)$ and all sufficiently large $n$'s it holds that $|\{f(x) : x \in \{0,1\}^n\}| > p(n)$.

Exercise 9: Prove that one-way functions cannot have polynomially bounded cycles. Namely, for every function $f$ define $\mathrm{cyc}_f(x)$ to be the smallest positive integer $i$ such that applying $f$ for $i$ times on $x$ yields $x$. Prove that if $f$ is (even weakly) one-way then for every polynomial $p(\cdot)$ and all sufficiently large $n$'s it holds that $\mathrm{Exp}(\mathrm{cyc}_f(U_n)) > p(n)$, where $U_n$ is a random variable uniformly distributed over $\{0,1\}^n$.

Exercise 10: on the improbability of strengthening Theorem 2.8 (part 1): Suppose that the definition of weak one-way function is further weakened so that it is required that every algorithm fails to invert the function with negligible probability. Demonstrate the difficulty of extending the proof of Theorem 2.8 to this case. (Hint: suppose that there exists an algorithm that, if run with time bound $t(n)$, inverts the function with probability $1/t(n)$.)

Exercise 11: on the improbability of strengthening Theorem 2.8 (part 2) (due to S. Rudich): Suppose that the definition of a strong one-way function is further strengthened so that it is required that every algorithm fails to invert the function with some specified negligible probability (e.g., $2^{-\sqrt{n}}$).
Demonstrate the difficulty of extending the proof of Theorem 2.8 to this case. (Guideline: suppose that we construct the strong one-way function $g$ as in the original proof. Note that you can prove that any algorithm that works separately on each block of the function $g$ can invert it only with exponentially low probability. However, there may be an inverting algorithm, $A$, that inverts the function $g$ with probability $\epsilon$. Show that any inverting algorithm for the weakly one-way function $f$ that uses algorithm $A$ as a black-box "must" invoke it at least $\frac{1}{\epsilon}$ times.)

Exercise 12: collections of one-way functions and one-way functions: Represent a collection of one-way functions, $(I, D, F)$, as a single one-way function. Given a one-way function $f$, represent it as a collection of one-way functions. (Remark: the second direction is quite trivial.)

Exercise 13: a convention for collections of one-way functions: Show that without loss of generality, algorithms $I$ and $D$ of a collection (of one-way functions) can be modified so that each of them uses a number of coins which exactly equals the input length. (Guideline: "apply padding" first on $1^n$, next on the coin tosses and output of $I$, and finally to the coin tosses of $D$.)

Exercise 14: justification for a convention concerning one-way collections: Show that giving the index of the function to the inverting algorithm is essential for a meaningful definition of a collection of one-way functions. (Guideline: consider a collection $\{f_i : \{0,1\}^{|i|} \mapsto \{0,1\}^{|i|}\}$ where $f_i(x) = x \oplus i$.)

Exercise 15: Rabin's collection and factoring: Show that the Rabin collection is one-way if and only if factoring integers which are the product of two primes of equal binary expansion is intractable in a strong sense (i.e., every efficient algorithm succeeds with negligible probability). (Guideline: For one direction use the Chinese Remainder Theorem and an efficient algorithm for extracting square roots modulo a prime.
For the other direction observe that an algorithm for extracting square roots modulo a composite $N$ can be used to get two integers $x$ and $y$ such that $x^2 \equiv y^2 \bmod N$ and yet $x \not\equiv \pm y \bmod N$. Also, note that such a pair, $(x,y)$, yields a split of $N$ (i.e., two integers $a, b \neq 1$ such that $N = a \cdot b$).)

Exercise 16: clawfree collections imply one-way functions: Let $(I, D, F)$ be a clawfree collection of functions (see Subsection 2.4.5). Prove that, for every $\sigma \in \{0,1\}$, the triplet $(I, D, F_\sigma)$, where $F_\sigma(i, x) \stackrel{\rm def}{=} F(\sigma, i, x)$, is a collection of strong one-way functions. Repeat the exercise when replacing the word "functions" by "permutations".

Exercise 17: more on the inadequacy of graph isomorphism as a basis for one-way functions: Consider another suggestion to base one-way functions on the conjectured difficulty of the Graph Isomorphism problem. This time we present a collection of functions, defined by the algorithmic triplet $(I_{\rm GI}, D_{\rm GI}, F_{\rm GI})$. On input $1^n$, algorithm $I_{\rm GI}$ selects uniformly a $d(n)$-regular graph on $n$ vertices (i.e., each of the $n$ vertices in the graph has degree $d(n)$). On input a graph on $n$ vertices, algorithm $D_{\rm GI}$ randomly selects a permutation in the symmetric group of $n$ elements (i.e., the set of permutations of $n$ elements). On input an ($n$-vertex) graph $G$ and an ($n$-element) permutation $\pi$, algorithm $F_{\rm GI}$ returns $f_G(\pi) \stackrel{\rm def}{=} \pi G$.

1. Present a polynomial-time implementation of $I_{\rm GI}$.
2. In light of the known algorithms for the Graph Isomorphism problem, which values of $d(n)$ should be definitely avoided?
3. Using a known algorithm, prove that the above collection does not have a one-way property, no matter which function $d(\cdot)$ one uses.

(A search into the relevant literature is indeed required for items (2) and (3).)

Exercise 18: Assuming the existence of one-way functions, prove that there exists a one-way function $f$ so that no single bit of the preimage constitutes a hard-core predicate.
(Guideline: given a one-way function $f$ construct a function $g$ so that $g(x, I, J) \stackrel{\rm def}{=} (f(x_{I \cap J}), x_{\bar{I} \cup \bar{J}}, I, J)$, where $I, J \subseteq \{1, 2, \ldots, |x|\}$, and $x_S$ denotes the string resulting by taking only the bits of $x$ with positions in the set $S$ (i.e., $x_{i_1, \ldots, i_s} \stackrel{\rm def}{=} x_{i_1} \cdots x_{i_s}$, where $x = x_1 \cdots x_{|x|}$).)

Exercise 19: hard-core predicate for a 1-1 function implies that the function is one-way: Let $f$ be a 1-1 function (you may assume for simplicity that it is length preserving) and let $b$ be a hard-core for $f$.

1. Prove that if $f$ is polynomial-time computable then it is strongly one-way.
2. Prove that (regardless of whether $f$ is polynomial-time computable or not) $f$ must be weakly one-way. Furthermore, for every $\epsilon > \frac{1}{2}$, the function $f$ cannot be inverted on an $\epsilon$ fraction of the instances.

Exercise 20: In continuation to the proof of Theorem 2.15, we present guidelines for a more efficient inverting algorithm. In the sequel it will be more convenient to use arithmetic of reals instead of that of Booleans. Hence, we denote $b'(x,r) = (-1)^{b(r,x)}$ and $G'(y,r) = (-1)^{G(y,r)}$.

1. Prove that for every $x$ it holds that $\mathrm{Exp}(b'(x,r) \cdot G'(f(x), r + e^i)) = s'(x) \cdot (-1)^{x_i}$, where $s'(x) \stackrel{\rm def}{=} 2 \cdot (s(x) - \frac{1}{2})$.
2. Let $v$ be an $l$-dimensional Boolean vector, and let $R$ be a uniformly chosen $l$-by-$n$ Boolean matrix. Prove that for every $v \neq u \in \{0,1\}^l$ it holds that $vR$ and $uR$ are pairwise independent and uniformly distributed in $\{0,1\}^n$.
3. Prove that $b'(x, vR) = b'(xR^T, v)$, for every $x \in \{0,1\}^n$ and $v \in \{0,1\}^l$.
4. Prove that, with probability at least $\frac{1}{2}$, there exists $\sigma \in \{0,1\}^l$ so that for every $1 \leq i \leq n$ the sign of $\sum_{v \in \{0,1\}^l} b'(\sigma, v) \, G'(f(x), vR + e^i)$ equals the sign of $(-1)^{x_i}$. (Hint: $\sigma \stackrel{\rm def}{=} xR^T$.)
5. Let $B$ be a $2^l$-by-$2^l$ matrix with the $(\sigma, v)$-entry being $b'(\sigma, v)$, and let $g^i$ be a $2^l$-dimensional vector with the $v$-th entry equal to $G'(f(x), vR + e^i)$. The inverting algorithm computes $z^i \leftarrow B g^i$, for all $i$'s, and forms a matrix $Z$ in which the columns are the $z^i$'s.
The output is a row that when applying $f$ to it yields $f(x)$. Evaluate the success probability of the algorithm. Using the special structure of matrix $B$, show that the product $B g^i$ can be computed in time $l \cdot 2^l$. Hint: $B$ is the Sylvester matrix, which can be written recursively as
$$S_k = \begin{pmatrix} S_{k-1} & S_{k-1} \\ S_{k-1} & \overline{S_{k-1}} \end{pmatrix}$$
where $S_0 = +1$ and $\overline{M}$ means flipping the $+1$ entries of $M$ to $-1$ and vice versa.

Chapter 3

Pseudorandom Generators

In this chapter we discuss pseudorandom generators. Loosely speaking, these are efficient deterministic programs which expand short randomly selected seeds into much longer "pseudorandom" bit sequences. Pseudorandom sequences are defined as computationally indistinguishable from truly random sequences by efficient algorithms. Hence, the notion of computational indistinguishability (i.e., indistinguishability by efficient procedures) plays a pivotal role in our discussion of pseudorandomness. Furthermore, the notion of computational indistinguishability plays a key role also in subsequent chapters, and in particular in the discussion of secure encryption, zero-knowledge proofs, and cryptographic protocols.

In addition to definitions of pseudorandom distributions, pseudorandom generators, and pseudorandom functions, the current chapter contains constructions of pseudorandom generators (and pseudorandom functions) based on various types of one-way functions. In particular, very simple and efficient pseudorandom generators are constructed based on the existence of one-way permutations.

3.1 Motivating Discussion

The nature of randomness has attracted the attention of many people and in particular of scientists in various fields. We believe that the notion of computation, and in particular of efficient computation, provides a good basis for understanding the nature of randomness.
3.1.1 Computational Approaches to Randomness

One computational approach to randomness was initiated by Solomonoff and Kolmogorov in the early 1960's (and rediscovered by Chaitin in the early 1970's). This approach is "ontological" in nature. Loosely speaking, a string, $s$, is considered Kolmogorov-random if its length (i.e., $|s|$) equals the length of the shortest program producing $s$. This shortest program may be considered the "simplest" "explanation" of the phenomenon described by the string $s$. Hence, the string, $s$, is considered Kolmogorov-random if it does not possess a simple explanation (i.e., an explanation which is substantially shorter than $|s|$). We stress that one cannot determine whether a given string is Kolmogorov-random or not (and more generally Kolmogorov-complexity is a function that cannot be computed). Furthermore, this approach seems to have no application to the issue of "pseudorandom generators".

An alternative computational approach to randomness is presented in the rest of this chapter. In contrast to the approach of Kolmogorov, the new approach is behavioristic in nature. Instead of considering the "explanation" of a phenomenon, we consider the phenomenon's effect on the environment. Loosely speaking, a string is considered pseudorandom if no efficient observer can distinguish it from a uniformly chosen string of the same length. The underlying postulate is that objects that cannot be told apart by efficient procedures are considered equivalent, although they may be very different in nature (e.g., have fundamentally different (Kolmogorov) complexity). Furthermore, the new approach naturally leads to the concept of a pseudorandom generator, which is a fundamental concept with lots of practical applications (and in particular in the area of cryptography).
3.1.2 A Rigorous Approach to Pseudorandom Generators

The approach to pseudorandom generators presented in this book stands in contrast to the heuristic approach which is still common in discussions concerning "pseudorandom generators" which are being used in real computers. The heuristic approach considers "pseudorandom generators" as programs which produce bit sequences "passing" several specific statistical tests. The choice of statistical tests, to which these programs are subjected, is quite arbitrary and lacks a systematic foundation. Furthermore, it is possible to construct efficient statistical tests which foil the "pseudorandom generators" commonly used in practice (and in particular distinguish their output from a uniformly chosen string of equal length). Consequently, before using a "pseudorandom generator" in a new application (which requires "random" sequences), extensive tests have to be conducted in order to detect whether the behaviour of the application when using the "pseudorandom generator" preserves its behaviour when using a "true source of randomness". Any modification of the application requires a new comparison of the "pseudorandom generator" against the "random source", since the non-randomness of the "pseudorandom generator" may badly affect the modified application (although it did not affect the original application). Furthermore, using such a "pseudorandom generator" for "cryptographic purposes" is highly risky, since the adversary may try to exploit the known weaknesses of the "pseudorandom generator".

In contrast, the concept of pseudorandom generators presented below is a robust one. By definition these pseudorandom generators produce sequences which look random to any efficient observer. It follows that the output of a pseudorandom generator may be used instead of "random sequences" in any efficient application requiring such (i.e., "random") sequences.
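As a toy illustration of an efficient statistical test that foils a heuristic generator, the following sketch assumes a linear congruential generator modulo $2^{32}$ with an odd multiplier and an odd increment. The constants and the names `lcg_outputs` and `parity_test` are illustrative assumptions, not taken from the text. With these parities the lowest bit of the state flips at every step, so a test that checks for strict alternation accepts every output of the generator, while it accepts 64 truly random values only with probability $2^{-63}$.

```python
import secrets

MODULUS = 2 ** 32
A, C = 1664525, 1013904223  # illustrative constants (assumption); both are odd


def lcg_outputs(seed, count):
    """Return `count` successive states of the map x -> (A*x + C) mod 2^32."""
    x, out = seed % MODULUS, []
    for _ in range(count):
        x = (A * x + C) % MODULUS
        out.append(x)
    return out


def parity_test(values):
    """Output 1 iff the lowest bits of consecutive values strictly alternate."""
    return 1 if all((a ^ b) & 1 for a, b in zip(values, values[1:])) else 0


if __name__ == "__main__":
    pseudo = lcg_outputs(seed=secrets.randbits(32), count=64)
    truly = [secrets.randbits(32) for _ in range(64)]
    # The test accepts the LCG for every seed, because A and C are odd.
    print("LCG accepted:", parity_test(pseudo))
    # On uniform values the test accepts only with probability 2^-63.
    print("uniform accepted:", parity_test(truly))
```

The test runs in time linear in the number of samples, which is the sense in which it is "efficient"; it says nothing about generators that satisfy the definitions given later in this chapter.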
3.2 Computational Indistinguishability

The concept of efficient computation leads naturally to a new kind of equivalence between objects. Objects are considered to be computationally equivalent if they cannot be told apart by any efficient procedure. Considering indistinguishable objects as equivalent is one of the basic paradigms of both science and real-life situations. Hence, we believe that the notion of computational indistinguishability is fundamental.

Formulating the notion of computational indistinguishability is done, as standard in computational complexity, by considering objects as infinite sequences of strings. Hence, the sequences, $\{x_n\}_{n \in \mathbb{N}}$ and $\{y_n\}_{n \in \mathbb{N}}$, are said to be computationally indistinguishable if no efficient procedure can tell them apart. In other words, no efficient algorithm, $D$, can accept infinitely many $x_n$'s while rejecting their $y$-counterparts (i.e., for every efficient algorithm $D$ and all sufficiently large $n$'s it holds that $D$ accepts $x_n$ iff $D$ accepts $y_n$). Objects which are computationally indistinguishable in the above sense may be considered equivalent as far as any practical purpose is concerned (since practical purposes are captured by efficient algorithms and those cannot distinguish these objects).

The above discussion is naturally extended to the probabilistic setting. Furthermore, as we shall see, this extension yields very useful consequences. Loosely speaking, two distributions are called computationally indistinguishable if no efficient algorithm can tell them apart. Given an efficient algorithm, $D$, we consider the probability that $D$ accepts (e.g., outputs 1 on input) a string taken from the first distribution. Likewise, we consider the probability that $D$ accepts a string taken from the second distribution. If these two probabilities are close, we say that $D$ does not distinguish the two distributions.
Again, the formulation of this discussion is with respect to two infinite sequences of distributions (rather than with respect to two fixed distributions). Such sequences are called probability ensembles.

3.2.1 Definition

Definition 3.1 (ensembles): Let $I$ be a countable index set. An ensemble indexed by $I$ is a sequence of random variables indexed by $I$. Namely, $X = \{X_i\}_{i \in I}$, where the $X_i$'s are random variables, is an ensemble indexed by $I$.

We will use either $\mathbb{N}$ or a subset of $\{0,1\}^*$ as the index set. Typically, in our applications, an ensemble of the form $X = \{X_n\}_{n \in \mathbb{N}}$ has each $X_n$ ranging over strings of length $n$, whereas an ensemble of the form $X = \{X_w\}_{w \in \{0,1\}^*}$ will have each $X_w$ ranging over strings of length $|w|$. In the rest of this chapter, we will deal with ensembles indexed by $\mathbb{N}$, whereas in other chapters (e.g., in the definition of secure encryption and zero-knowledge) we will deal with ensembles indexed by strings. To avoid confusion, we present variants of the definition of computational indistinguishability for each of these two cases. The two formulations can be unified if one associates the natural numbers with their unary representation (i.e., associates $\mathbb{N}$ and $\{1^n : n \in \mathbb{N}\}$).

Definition 3.2 (polynomial-time indistinguishability):

1. variant for ensembles indexed by $\mathbb{N}$: Two ensembles, $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$ and $Y \stackrel{\rm def}{=} \{Y_n\}_{n \in \mathbb{N}}$, are indistinguishable in polynomial-time if for every probabilistic polynomial-time algorithm, $D$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s
$$|\mathrm{Prob}(D(X_n, 1^n) = 1) - \mathrm{Prob}(D(Y_n, 1^n) = 1)| < \frac{1}{p(n)}$$
2. variant for ensembles indexed by a set of strings $S$: Two ensembles, $X \stackrel{\rm def}{=} \{X_w\}_{w \in S}$ and $Y \stackrel{\rm def}{=} \{Y_w\}_{w \in S}$, are indistinguishable in polynomial-time if for every probabilistic polynomial-time algorithm, $D$, every polynomial $p(\cdot)$, and all sufficiently long $w$'s
$$|\mathrm{Prob}(D(X_w, w) = 1) - \mathrm{Prob}(D(Y_w, w) = 1)| < \frac{1}{p(|w|)}$$

The probabilities in the above definition are taken over the corresponding random variables $X_i$ (or $Y_i$) and the internal coin tosses of algorithm $D$ (which is allowed to be a probabilistic algorithm). The second variant of the above definition will play a key role in subsequent chapters, and further discussion of it is postponed to these places. In the rest of this chapter we refer only to the first variant of the above definition. The string $1^n$ is given as auxiliary input to algorithm $D$ in order to make the first variant consistent with the second one, and in order to make it more intuitive. However, in typical cases, where the length of $X_n$ (resp. $Y_n$) and $n$ are polynomially related (i.e., $|X_n| < \mathrm{poly}(n)$ and $n < \mathrm{poly}(|X_n|)$) and can be computed one from the other in $\mathrm{poly}(n)$-time, giving $1^n$ as auxiliary input is redundant.

The following mental experiment may be instructive. For each $\alpha \in \{0,1\}^*$, consider the probability, hereafter denoted $d(\alpha)$, that algorithm $D$ outputs 1 on input $\alpha$. Consider the expectation of $d$ taken over each of the two ensembles. Namely, let $d_1(n) = \mathrm{Exp}(d(X_n))$ and $d_2(n) = \mathrm{Exp}(d(Y_n))$. Then, $X$ and $Y$ are said to be indistinguishable by $D$ if the difference (function) $\Delta(n) \stackrel{\rm def}{=} |d_1(n) - d_2(n)|$ is negligible in $n$.

A few examples may help to further clarify the definition. Consider an algorithm, $D_1$, which, obliviously of the input, flips a 0-1 coin and outputs its outcome. Clearly, on every input, algorithm $D_1$ outputs 1 with probability exactly one half, and hence does not distinguish any pair of ensembles. Next, consider an algorithm, $D_2$, which outputs 1 if and only if the input string contains more zeros than ones.
Since $D_2$ can be implemented in polynomial-time, it follows that if $X$ and $Y$ are polynomial-time indistinguishable then the difference $|\mathrm{Prob}(\omega(X_n) < \frac{n}{2}) - \mathrm{Prob}(\omega(Y_n) < \frac{n}{2})|$ is negligible (in $n$), where $\omega(\alpha)$ denotes the number of 1's in the string $\alpha$. Similarly, polynomial-time indistinguishable ensembles must exhibit the same "profile" (up to negligible error) with respect to any "string statistics" which can be computed in polynomial-time. However, it is not required that polynomial-time indistinguishable ensembles have similar "profiles" with respect to quantities which cannot be computed in polynomial-time (e.g., Kolmogorov Complexity or the function presented right after Proposition 3.3).

3.2.2 Relation to Statistical Closeness

Computational indistinguishability is a refinement of a traditional notion from probability theory. We call two ensembles $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$ and $Y \stackrel{\rm def}{=} \{Y_n\}_{n \in \mathbb{N}}$ statistically close if their statistical difference is negligible, where the statistical difference (also known as variation distance) of $X$ and $Y$ is defined as the function
$$\Delta(n) \stackrel{\rm def}{=} \sum_{\alpha} |\mathrm{Prob}(X_n = \alpha) - \mathrm{Prob}(Y_n = \alpha)|$$
Clearly, if the ensembles $X$ and $Y$ are statistically close then they are also polynomial-time indistinguishable (see Exercise 5). The converse, however, is not true. In particular:

Proposition 3.3 There exists an ensemble $X = \{X_n\}_{n \in \mathbb{N}}$ so that $X$ is not statistically close to the uniform ensemble, $U \stackrel{\rm def}{=} \{U_n\}_{n \in \mathbb{N}}$, yet $X$ and $U$ are polynomial-time indistinguishable. Furthermore, $X_n$ assigns all its probability mass to at most $2^{n/2}$ strings (of length $n$).

Recall that $U_n$ is uniformly distributed over strings of length $n$. Although $X$ and $U$ are polynomial-time indistinguishable, one can define a function $f : \{0,1\}^* \mapsto \{0,1\}$ so that $f$ has average 1 over $X$ while having average almost 0 over $U$ (e.g., $f(x) = 1$ if and only if $x$ is in the range of $X$). Hence, $X$ and $U$ have different "profiles" with respect to the function $f$, yet $f$ is (necessarily) impossible to compute in polynomial-time.
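The two quantities discussed above can be computed exactly on a small example. The sketch below is only an illustration under stated assumptions: the distributions are toy ones over 4-bit strings, the helper names (`uniform_over`, `statistical_difference`, `acceptance_probability`) are hypothetical, and the sparse distribution is an arbitrary choice that merely mimics the "mass on at most $2^{n/2}$ strings" condition (it is certainly not computationally indistinguishable from uniform). The statistical difference follows the definition above, without a factor of one half.

```python
from fractions import Fraction
from itertools import product


def uniform_over(strings):
    """Uniform distribution over `strings`, as a dict {string: probability}."""
    strings = list(strings)
    return {s: Fraction(1, len(strings)) for s in strings}


def statistical_difference(p, q):
    """Sum over all strings a of |Prob(X = a) - Prob(Y = a)| (no 1/2 factor)."""
    support = set(p) | set(q)
    return sum(abs(p.get(a, 0) - q.get(a, 0)) for a in support)


def d2(a):
    """The test D2 from the text: output 1 iff `a` has more zeros than ones."""
    return 1 if a.count("0") > a.count("1") else 0


def acceptance_probability(test, dist):
    """Exact probability that the deterministic `test` outputs 1 on a sample."""
    return sum(pr for a, pr in dist.items() if test(a) == 1)


n = 4
all_strings = ["".join(bits) for bits in product("01", repeat=n)]
u_n = uniform_over(all_strings)                    # toy stand-in for U_n
x_n = uniform_over(all_strings[: 2 ** (n // 2)])   # mass on only 2^(n/2) strings

diff = statistical_difference(x_n, u_n)
gap = abs(acceptance_probability(d2, x_n) - acceptance_probability(d2, u_n))
print("statistical difference:", diff)
print("distinguishing gap of D2:", gap)
```

For any single test the distinguishing gap cannot exceed the statistical difference, which is why statistical closeness implies polynomial-time indistinguishability; the proposition says that the reverse implication fails.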
Proof: We claim that, for all sufficiently large $n$, there exists a random variable $X_n$, distributed over some set of at most $2^{n/2}$ strings (each of length $n$), so that for every circuit, $C_n$, of size (i.e., number of gates) $2^{n/8}$ it holds that
$$|\mathrm{Prob}(C_n(U_n) = 1) - \mathrm{Prob}(C_n(X_n) = 1)| < 2^{-n/8}$$
The proposition follows from this claim, since polynomial-time distinguishers (even probabilistic ones; see Exercise 6) yield polynomial-size circuits with at least as big a distinguishing gap.

The claim is proven using a probabilistic argument (i.e., a counting argument). Let $C_n$ be some fixed circuit with $n$ inputs, and let $p_n \stackrel{\rm def}{=} \mathrm{Prob}(C_n(U_n) = 1)$. We select, independently and uniformly, $2^{n/2}$ strings, denoted $s_1, \ldots, s_{2^{n/2}}$, in $\{0,1\}^n$. Define random variables $\zeta_i$'s so that $\zeta_i = C_n(s_i)$ (these random variables depend on the random choices of the corresponding $s_i$'s). Using the Chernoff Bound, we get that
$$\mathrm{Prob}\left( \left| p_n - \frac{1}{2^{n/2}} \sum_{i=1}^{2^{n/2}} \zeta_i \right| \geq 2^{-n/8} \right) \leq 2e^{-2 \cdot 2^{n/2} \cdot 2^{-n/4}} < 2^{-2^{n/4}}$$
Since there are at most $2^{2^{n/4}}$ different circuits of size (number of gates) $2^{n/8}$, it follows that there exists a sequence $s_1, \ldots, s_{2^{n/2}} \in \{0,1\}^n$, so that for every circuit $C_n$ of size $2^{n/8}$ it holds that
$$\left| \mathrm{Prob}(C_n(U_n) = 1) - \frac{1}{2^{n/2}} \sum_{i=1}^{2^{n/2}} C_n(s_i) \right| < 2^{-n/8}$$
Letting $X_n$ equal $s_i$ with probability $2^{-n/2}$, for every $1 \leq i \leq 2^{n/2}$, the claim follows.

3.2.3 Indistinguishability by Repeated Experiments

By Definition 3.2, two ensembles are considered computationally indistinguishable if no efficient procedure can tell them apart based on a single sample. We shall now show that "efficiently constructible" computationally indistinguishable ensembles cannot be (efficiently) distinguished even by examining several samples. We start by presenting definitions of "indistinguishability by sampling" and "efficiently constructible ensembles".
Definition 3.4 (indistinguishability by sampling): Two ensembles, $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$ and $Y \stackrel{\rm def}{=} \{Y_n\}_{n \in \mathbb{N}}$, are indistinguishable by polynomial-time sampling if for every probabilistic polynomial-time algorithm, $D$, every two polynomials $m(\cdot)$ and $p(\cdot)$, and all sufficiently large $n$'s
$$\left| \mathrm{Prob}\left(D(X_n^{(1)}, \ldots, X_n^{(m(n))}) = 1\right) - \mathrm{Prob}\left(D(Y_n^{(1)}, \ldots, Y_n^{(m(n))}) = 1\right) \right| < \frac{1}{p(n)}$$
where $X_n^{(1)}$ through $X_n^{(m)}$ and $Y_n^{(1)}$ through $Y_n^{(m)}$ are independent random variables with each $X_n^{(i)}$ identical to $X_n$ and each $Y_n^{(i)}$ identical to $Y_n$.

Definition 3.5 (efficiently constructible ensembles): An ensemble, $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$, is said to be polynomial-time constructible if there exists a probabilistic polynomial-time algorithm $S$ so that for every $n$, the random variables $S(1^n)$ and $X_n$ are identically distributed.

Theorem 3.6 Let $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$ and $Y \stackrel{\rm def}{=} \{Y_n\}_{n \in \mathbb{N}}$ be two polynomial-time constructible ensembles, and suppose that $X$ and $Y$ are indistinguishable in polynomial-time. Then $X$ and $Y$ are indistinguishable by polynomial-time sampling.

An alternative formulation of Theorem 3.6 proceeds as follows. For every ensemble $Z \stackrel{\rm def}{=} \{Z_n\}_{n \in \mathbb{N}}$ and every polynomial $m(\cdot)$ define the $m(\cdot)$-product of $Z$ as the ensemble $\{(Z_n^{(1)}, \ldots, Z_n^{(m(n))})\}_{n \in \mathbb{N}}$, where the $Z_n^{(i)}$'s are independent copies of $Z_n$. Theorem 3.6 asserts that if the ensembles $X$ and $Y$ are polynomial-time indistinguishable, and each is polynomial-time constructible, then, for every polynomial $m(\cdot)$, the $m(\cdot)$-product of $X$ and the $m(\cdot)$-product of $Y$ are polynomial-time indistinguishable.

The information theoretic analogue of the above theorem is quite obvious: if two ensembles are statistically close then their polynomial-products must also be statistically close (since the statistical difference between the $m$-products of two distributions is bounded by $m$ times the distance between the individual distributions). Adapting the proof to the computational setting requires, as usual, a "reducibility argument".
This argument uses, for the first time in this book, the hybrid technique. The hybrid technique plays a central role in demonstrating the computational indistinguishability of complex ensembles, constructed based on simpler (computationally indistinguishable) ensembles. Subsequent applications of the hybrid technique will involve more technicalities. Hence, the reader is urged not to skip the following proof.

Proof: The proof is by a "reducibility argument". We show that the existence of an efficient algorithm that distinguishes the ensembles $X$ and $Y$ using several samples implies the existence of an efficient algorithm that distinguishes the ensembles $X$ and $Y$ using a single sample. The implication is proven using the following argument, which will be later called a "hybrid argument".

Suppose, towards the contradiction, that there is a probabilistic polynomial-time algorithm $D$, and polynomials $m(\cdot)$ and $p(\cdot)$, so that for infinitely many $n$'s it holds that
$$\Delta(n) \stackrel{\rm def}{=} \left| \mathrm{Prob}\left(D(X_n^{(1)}, \ldots, X_n^{(m)}) = 1\right) - \mathrm{Prob}\left(D(Y_n^{(1)}, \ldots, Y_n^{(m)}) = 1\right) \right| > \frac{1}{p(n)}$$
where $m \stackrel{\rm def}{=} m(n)$, and the $X_n^{(i)}$'s and $Y_n^{(i)}$'s are as in Definition 3.4. In the sequel, we will derive a contradiction by presenting a probabilistic polynomial-time algorithm, $D'$, that distinguishes the ensembles $X$ and $Y$ (in the sense of Definition 3.2).

For every $k$, $0 \leq k \leq m$, we define the hybrid random variable $H_n^k$ as an ($m$-long) sequence consisting of $k$ independent copies of $X_n$ and $m - k$ independent copies of $Y_n$. Namely,
$$H_n^k \stackrel{\rm def}{=} (X_n^{(1)}, \ldots, X_n^{(k)}, Y_n^{(k+1)}, \ldots, Y_n^{(m)})$$
where $X_n^{(1)}$ through $X_n^{(k)}$ and $Y_n^{(k+1)}$ through $Y_n^{(m)}$ are independent random variables with each $X_n^{(i)}$ identical to $X_n$ and each $Y_n^{(i)}$ identical to $Y_n$. Clearly, $H_n^m = X_n^{(1)}, \ldots, X_n^{(m)}$, whereas $H_n^0 = Y_n^{(1)}, \ldots, Y_n^{(m)}$.

By our hypothesis, algorithm $D$ can distinguish the extreme hybrids (i.e., $H_n^0$ and $H_n^m$).
As the total number of hybrids is polynomial in $n$, a non-negligible gap between (the "accepting" probability of $D$ on) the extreme hybrids translates into a non-negligible gap between (the "accepting" probability of $D$ on) a pair of neighbouring hybrids. It follows that $D$, although not "designed to work on general hybrids", can distinguish a pair of neighbouring hybrids. The punch-line is that algorithm $D$ can be easily modified into an algorithm $D'$ which distinguishes $X$ and $Y$. Details follow.

We construct an algorithm $D'$ which uses algorithm $D$ as a subroutine. On input $\alpha$ (supposedly in the range of either $X_n$ or $Y_n$), algorithm $D'$ proceeds as follows. Algorithm $D'$ first selects $k$ uniformly in the set $\{0, 1, \ldots, m-1\}$. Using the efficient sampling algorithm for the ensemble $X$, algorithm $D'$ generates $k$ independent samples of $X_n$. These samples are denoted $x^1, \ldots, x^k$. Likewise, using the efficient sampling algorithm for the ensemble $Y$, algorithm $D'$ generates $m - k - 1$ independent samples of $Y_n$, denoted $y^{k+2}, \ldots, y^m$. Finally, algorithm $D'$ invokes algorithm $D$ and halts with output $D(x^1, \ldots, x^k, \alpha, y^{k+2}, \ldots, y^m)$.

Clearly, $D'$ can be implemented in probabilistic polynomial-time. It is also easy to verify the following claims.

Claim 3.6.1:
$$\mathrm{Prob}(D'(X_n) = 1) = \frac{1}{m} \sum_{k=0}^{m-1} \mathrm{Prob}(D(H_n^{k+1}) = 1)$$
and
$$\mathrm{Prob}(D'(Y_n) = 1) = \frac{1}{m} \sum_{k=0}^{m-1} \mathrm{Prob}(D(H_n^{k}) = 1)$$

Proof: By construction of algorithm $D'$, we have
$$D'(\alpha) = D(X_n^{(1)}, \ldots, X_n^{(k)}, \alpha, Y_n^{(k+2)}, \ldots, Y_n^{(m)})$$
where $k$ is uniformly distributed in $\{0, 1, \ldots, m-1\}$. Using the definition of the hybrids $H_n^k$, the claim follows. $\Box$

Claim 3.6.2:
$$|\mathrm{Prob}(D'(X_n) = 1) - \mathrm{Prob}(D'(Y_n) = 1)| = \frac{\Delta(n)}{m(n)}$$

Proof: Using Claim 3.6.1 for the first equality, we get
$$|\mathrm{Prob}(D'(X_n) = 1) - \mathrm{Prob}(D'(Y_n) = 1)| = \frac{1}{m} \left| \sum_{k=0}^{m-1} \left( \mathrm{Prob}(D(H_n^{k+1}) = 1) - \mathrm{Prob}(D(H_n^{k}) = 1) \right) \right|$$
$$= \frac{1}{m} \left| \mathrm{Prob}(D(H_n^{m}) = 1) - \mathrm{Prob}(D(H_n^{0}) = 1) \right| = \frac{\Delta(n)}{m}$$
The last equality follows by observing that $H_n^m = X_n^{(1)}, \ldots, X_n^{(m)}$ and $H_n^0 = Y_n^{(1)}, \ldots, Y_n^{(m)}$, and using the definition of $\Delta(n)$. $\Box$
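The construction of $D'$ and the telescoping identity of Claim 3.6.2 can be checked on a toy instance. The following sketch rests on assumptions that are not part of the text: $X_n$ and $Y_n$ are replaced by single biased bits, $m = 3$, and $D$ is a majority vote; the names `d_prime`, `prob_D_on_hybrid` and `prob_d_prime` are hypothetical. The function `d_prime` mirrors the randomized algorithm, while the other two compute acceptance probabilities exactly by enumeration.

```python
from fractions import Fraction
from itertools import product
import random

M = 3                                          # number of samples m given to D
PX = {1: Fraction(3, 4), 0: Fraction(1, 4)}    # toy X_n: a biased bit (assumption)
PY = {1: Fraction(1, 4), 0: Fraction(3, 4)}    # toy Y_n: oppositely biased bit


def D(samples):
    """Toy multi-sample distinguisher: accept iff most samples equal 1."""
    return 1 if 2 * sum(samples) > len(samples) else 0


def d_prime(alpha, rng):
    """Single-sample distinguisher D' built from D as in the proof."""
    k = rng.randrange(M)                                   # k uniform in {0,...,m-1}
    xs = [int(rng.random() < float(PX[1])) for _ in range(k)]          # k samples of X_n
    ys = [int(rng.random() < float(PY[1])) for _ in range(M - k - 1)]  # m-k-1 samples of Y_n
    return D(xs + [alpha] + ys)


def prob_D_on_hybrid(k):
    """Exact Prob(D(H^k) = 1): the first k coordinates follow X_n, the rest Y_n."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=M):
        weight = Fraction(1)
        for i, b in enumerate(bits):
            weight *= PX[b] if i < k else PY[b]
        total += weight * D(bits)
    return total


def prob_d_prime(dist):
    """Exact Prob(D'(Z) = 1) for Z distributed as `dist`, averaged over k."""
    total = Fraction(0)
    for k in range(M):
        for bits in product((0, 1), repeat=M):
            weight = Fraction(1, M)
            for i, b in enumerate(bits):
                if i < k:
                    weight *= PX[b]
                elif i == k:
                    weight *= dist[b]      # position k+1 holds the input alpha
                else:
                    weight *= PY[b]
            total += weight * D(bits)
    return total


delta = abs(prob_D_on_hybrid(M) - prob_D_on_hybrid(0))   # gap on extreme hybrids
gap = abs(prob_d_prime(PX) - prob_d_prime(PY))           # gap of D' on one sample
print(delta, gap, gap == delta / M)
```

In this toy setting the single-sample gap equals the multi-sample gap divided by $m$, as Claim 3.6.2 states; the sketch checks the arithmetic of the reduction only and does not model polynomial-time bounds.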
Since by our hypothesis $\Delta(n) > \frac{1}{p(n)}$, for infinitely many $n$'s, it follows that the probabilistic polynomial-time algorithm $D'$ distinguishes $X$ and $Y$ (with gap exceeding $\frac{1}{m(n) \cdot p(n)}$, which is the reciprocal of a polynomial) in contradiction to the hypothesis of the theorem. Hence, the theorem follows.

It is worthwhile to give some thought to the hybrid technique (used for the first time in the above proof). The hybrid technique constitutes a special type of a "reducibility argument" in which the computational indistinguishability of complex ensembles is proven using the computational indistinguishability of basic ensembles. The actual reduction is in the other direction: efficiently distinguishing the basic ensembles is reduced to efficiently distinguishing the complex ensembles, and hybrid distributions are used in the reduction in an essential way. The following properties of the construction of the hybrids play an important role in the argument:

1. Extreme hybrids coincide with the complex ensembles: this property is essential since what we want to prove (i.e., indistinguishability of the complex ensembles) relates to the complex ensembles.

2. Neighbouring hybrids are easily related to the basic ensembles: this property is essential since what we know (i.e., indistinguishability of the basic ensembles) relates to the basic ensembles. We need to be able to translate our knowledge (specifically computational indistinguishability) of the basic ensembles to knowledge (specifically computational indistinguishability) of any pair of neighbouring hybrids. Typically, it is required to efficiently transform strings in the range of a basic distribution into strings in the range of a hybrid, so that the transformation maps the first basic distribution to one hybrid and the second basic distribution to the neighbouring hybrid. (In the proof of Theorem 3.6, the hypothesis that both $X$ and $Y$ are polynomial-time constructible is instrumental for such an efficient transformation.)

3. The number of hybrids is small (i.e., polynomial): this property is essential in order to deduce the computational indistinguishability of extreme hybrids from the computational indistinguishability of neighbouring hybrids.

We remark that, in the course of a hybrid argument, a distinguishing algorithm referring to the complex ensembles is being analyzed and even executed on arbitrary hybrids. The reader may be annoyed by the fact that the algorithm "was not designed to work on such hybrids" (but rather only on the extreme hybrids). However, "an algorithm is an algorithm" and once it exists we can apply it to any input of our choice and analyze its performance on arbitrary input distributions.

3.2.4 Pseudorandom Ensembles

A special, yet important, case of computationally indistinguishable ensembles is the case in which one of the ensembles is uniform. Ensembles which are computationally indistinguishable from a uniform ensemble are called pseudorandom. Recall that $U_m$ denotes a random variable uniformly distributed over the set of strings of length $m$. The ensemble $\{U_n\}_{n \in \mathbb{N}}$ is called the standard uniform ensemble. Yet, it will be convenient to call uniform also ensembles of the form $\{U_{l(n)}\}_{n \in \mathbb{N}}$, where $l$ is a function on natural numbers.

Definition 3.7 (pseudorandom ensembles): Let $U \stackrel{\rm def}{=} \{U_{l(n)}\}_{n \in \mathbb{N}}$ be a uniform ensemble, and $X \stackrel{\rm def}{=} \{X_n\}_{n \in \mathbb{N}}$ be an ensemble. The ensemble $X$ is called pseudorandom if $X$ and $U$ are indistinguishable in polynomial-time.

We stress that $|X_n|$ is not necessarily $n$ (whereas $|U_m| = m$). In fact, with high probability $|X_n|$ equals $l(n)$. In the above definition, as in the rest of this book, pseudorandomness is a shorthand for "pseudorandomness with respect to polynomial-time".
3.3 Definitions of Pseudorandom Generators

Pseudorandom ensembles, defined above, can be used instead of uniform ensembles in any efficient application without noticeable degradation in performance (otherwise the efficient application can be transformed into an efficient distinguisher of the supposedly-pseudorandom ensemble from the uniform one). Such a replacement is useful only if we can generate pseudorandom ensembles at a cheaper cost than required to generate a uniform ensemble. The cost of generating an ensemble has several aspects. Standard cost considerations are reflected by the time and space complexities. However, in the context of randomized algorithms, and in particular in the context of generating probability ensembles, a major cost consideration is the quantity and quality of the randomness source used by the algorithm. In particular, in many applications (and especially in cryptography), it is desirable to generate pseudorandom ensembles using as little randomness as possible. This leads to the definition of a pseudorandom generator.

3.3.1 * A General Definition of Pseudorandom Generators

Definition 3.8 (pseudorandom generator): A pseudorandom generator is a deterministic polynomial-time algorithm, $G$, satisfying the following two conditions:

1. expansion: for every $s \in \{0,1\}^*$ it holds that $|G(s)| > |s|$.

2. pseudorandomness: the ensemble $\{G(U_n)\}_{n\in\mathbb{N}}$ is pseudorandom.

The input, $s$, to the generator is called its seed. It is required that a pseudorandom generator $G$ always outputs a string longer than its seed, and that $G$'s output, on a uniformly chosen seed, is pseudorandom. In other words, the output of a pseudorandom generator, on a uniformly chosen seed, must be polynomial-time indistinguishable from uniform, although it cannot be uniform (or even statistically close to uniform).
To justify the last statement, consider a uniform ensemble $\{U_{l(n)}\}_{n\in\mathbb{N}}$ that is polynomial-time indistinguishable from the ensemble $\{G(U_n)\}_{n\in\mathbb{N}}$ (such a uniform ensemble must exist by the pseudorandomness property of $G$). We first claim that $l(n) > n$, since otherwise an algorithm that on input $1^n$ and a string $\alpha$ outputs 1 if and only if $|\alpha| > n$ will distinguish $G(U_n)$ from $U_{l(n)}$ (as $|G(U_n)| > n$ by the expansion property of $G$). It follows that $l(n) \ge n+1$. We next bound from below the statistical difference between $G(U_n)$ and $U_{l(n)}$, as follows

$$\sum_x \left|{\rm Prob}(U_{l(n)}=x) - {\rm Prob}(G(U_n)=x)\right| \;\ge\; \sum_{x \notin \{G(s)\,:\,s\in\{0,1\}^n\}} \left|{\rm Prob}(U_{l(n)}=x) - {\rm Prob}(G(U_n)=x)\right| \;\ge\; (2^{l(n)} - 2^n)\cdot 2^{-l(n)} \;\ge\; \frac{1}{2}$$

(Each string $x$ outside the image of $G$ on $n$-bit seeds contributes $2^{-l(n)}$ to the sum, and there are at least $2^{l(n)} - 2^n$ such strings.)

It can be shown, see Exercise 8, that all the probability mass of $G(U_n)$, except for a negligible (in $n$) amount, is concentrated on strings of the same length and that this length equals $l(n)$, where $\{G(U_n)\}_{n\in\mathbb{N}}$ is polynomial-time indistinguishable from $\{U_{l(n)}\}_{n\in\mathbb{N}}$. For simplicity, we consider in the sequel only pseudorandom generators $G$ satisfying $|G(x)| = l(|x|)$ for all $x$'s.

3.3.2 Standard Definition of Pseudorandom Generators

Definition 3.9 (pseudorandom generator - standard definition): A pseudorandom generator is a deterministic polynomial-time algorithm, $G$, satisfying the following two conditions:

1. expansion: there exists a function $l : \mathbb{N} \mapsto \mathbb{N}$ so that $l(n) > n$ for all $n \in \mathbb{N}$, and $|G(s)| = l(|s|)$ for all $s \in \{0,1\}^*$. The function $l$ is called the expansion factor of $G$.

2. pseudorandomness (as above): the ensemble $\{G(U_n)\}_{n\in\mathbb{N}}$ is pseudorandom.

Again, we call the input to the generator a seed. The expansion condition requires that the algorithm $G$ maps $n$-bit long seeds into $l(n)$-bit long strings, with $l(n) > n$.
The pseudorandomness condition requires that the output distribution, induced by applying algorithm $G$ to a uniformly chosen seed, is polynomial-time indistinguishable from uniform (although it is not statistically close to uniform - see justification in previous subsection).

The above definition says little about the expansion factor $l : \mathbb{N} \mapsto \mathbb{N}$. We merely know that for every $n$ it holds that $l(n) \ge n+1$, that $l(n) \le {\rm poly}(n)$, and that $l(n)$ can be computed in time polynomial in $n$. Clearly, a pseudorandom generator with expansion factor $l(n) = n+1$ is of little value in practice, since it offers no significant saving in coin tosses. Fortunately, as shown in the subsequent subsection, even pseudorandom generators with such small expansion factor can be used to construct pseudorandom generators with any polynomial expansion factor. Hence, for every two expansion factors, $l_1 : \mathbb{N} \mapsto \mathbb{N}$ and $l_2 : \mathbb{N} \mapsto \mathbb{N}$, that can be computed in ${\rm poly}(n)$-time, there exists a pseudorandom generator with expansion factor $l_1$ if and only if there exists a pseudorandom generator with expansion factor $l_2$. This statement is proven by using a pseudorandom generator with expansion factor $l_1(n) \stackrel{\rm def}{=} n+1$ to construct, for every polynomial $p(\cdot)$, a pseudorandom generator with expansion factor $p(n)$. Note that a pseudorandom generator with expansion factor $l_1(n) \stackrel{\rm def}{=} n+1$ can be derived from any pseudorandom generator (even from one in the general sense of Definition 3.8).

3.3.3 Increasing the Expansion Factor of Pseudorandom Generators

Given a pseudorandom generator, $G_1$, with expansion factor $l_1(n) = n+1$, we construct a pseudorandom generator $G$ with polynomial expansion factor, as follows.

Construction 3.10 Let $G_1$ be a deterministic polynomial-time algorithm mapping strings of length $n$ into strings of length $n+1$, and let $p(\cdot)$ be a polynomial. Define $G(s) = \sigma_1 \cdots \sigma_{p(|s|)}$, where $s_0 \stackrel{\rm def}{=} s$, the bit $\sigma_i$ is the first bit of $G_1(s_{i-1})$, and $s_i$ is the $|s|$-bit long suffix of $G_1(s_{i-1})$, for every $1 \le i \le p(|s|)$.
(i.e., $\sigma_i s_i = G_1(s_{i-1})$)

Hence, on input $s$, algorithm $G$ applies $G_1$ for $p(|s|)$ times, each time on a new seed. Applying $G_1$ to the current seed yields a new seed (for the next iteration) and one extra bit (which is being output immediately). The seed in the first iteration is $s$ itself. The seed in the $i$th iteration is the $|s|$-long suffix of the string obtained from $G_1$ in the previous iteration. Algorithm $G$ outputs the concatenation of the "extra bits" obtained in the $p(|s|)$ iterations. Clearly, $G$ is polynomial-time computable and expands inputs of length $n$ into output strings of length $p(n)$.

Theorem 3.11 Let $G_1$, $p(\cdot)$, and $G$ be as in Construction 3.10 (above). Then, if $G_1$ is a pseudorandom generator then so is $G$.

Intuitively, the pseudorandomness of $G$ follows from that of $G_1$ by replacing each application of $G_1$ by a random process which on input $s$ outputs $\sigma s$, where $\sigma$ is uniformly chosen in $\{0,1\}$. Loosely speaking, the indistinguishability of a single application of the random process from a single application of $G_1$ implies that polynomially many applications of the random process are indistinguishable from polynomially many applications of $G_1$. The actual proof uses the hybrid technique.

Proof: The proof is by a "reducibility argument". Suppose, to the contrary, that $G$ is not a pseudorandom generator. It follows that the ensembles $\{G(U_n)\}_{n\in\mathbb{N}}$ and $\{U_{p(n)}\}_{n\in\mathbb{N}}$ are not polynomial-time indistinguishable. We will show that it follows that the ensembles $\{G_1(U_n)\}_{n\in\mathbb{N}}$ and $\{U_{n+1}\}_{n\in\mathbb{N}}$ are not polynomial-time indistinguishable, in contradiction to the hypothesis that $G_1$ is a pseudorandom generator with expansion factor $l_1(n) = n+1$. The implication is proven using the hybrid technique.

For every $k$, $0 \le k \le p(n)$, we define a hybrid $H^k_{p(n)}$ as follows. First we define, for every $k$, a function $g_n^k : \{0,1\}^n \mapsto$
$\{0,1\}^k$ by letting $g_n^0(x) \stackrel{\rm def}{=} \lambda$ (the empty string) and $g_n^{k+1}(x) = \sigma g_n^k(y)$, where $\sigma$ is the first bit of $G_1(x)$ and $y$ is the $n$-bit long suffix of $G_1(x)$ (i.e., $\sigma y = G_1(x)$). Namely, for every $k \le p(|x|)$, the string $g_n^k(x)$ equals the $k$-bit long prefix of $G(x)$. Define the random variable $H^k_{p(n)}$ resulting by concatenating a uniformly chosen $k$-bit long string and the random variable $g^{p(n)-k}(U_n)$. Namely

$$H^k_{p(n)} \stackrel{\rm def}{=} U_k^{(1)}\, g^{p(n)-k}(U_n^{(2)})$$

where $U_k^{(1)}$ and $U_n^{(2)}$ are independent random variables (the first uniformly distributed over $\{0,1\}^k$ and the second uniformly distributed over $\{0,1\}^n$). Intuitively, the hybrid $H^k_{p(n)}$ consists of the $k$-bit long prefix of $U_{p(n)}$ and the $(p(n)-k)$-bit long suffix of $G(X_n)$, where $X_n$ is obtained from $U_n$ by applying $G_1$ for $k$ times, each time to the $n$-bit long suffix of the previous result. However, the latter way of looking at the hybrids is less convenient for our purposes.

At this point it is clear that $H^0_{p(n)}$ equals $G(U_n)$, whereas $H^{p(n)}_{p(n)}$ equals $U_{p(n)}$. It follows that if an algorithm $D$ can distinguish the extreme hybrids then $D$ can also distinguish two neighbouring hybrids, since the total number of hybrids is polynomial in $n$ and a non-negligible gap between the extreme hybrids translates into a non-negligible gap between some neighbouring hybrids. The punch-line is that, using the structure of neighbouring hybrids, algorithm $D$ can be easily modified to distinguish the ensembles $\{G_1(U_n)\}_{n\in\mathbb{N}}$ and $\{U_{n+1}\}_{n\in\mathbb{N}}$. Details follow.

The core of the argument is the way in which the distinguishability of neighbouring hybrids relates to the distinguishability of $G_1(U_n)$ from $U_{n+1}$. As stated, this relation stems from the structure of neighbouring hybrids. Let us, thus, take a closer look at the hybrids $H^k_{p(n)}$ and $H^{k+1}_{p(n)}$, for some $0 \le k \le p(n)-1$. To this end, define a function $f^m : \{0,1\}^{n+1} \mapsto \{0,1\}^m$ by letting $f^0(z) \stackrel{\rm def}{=} \lambda$ and $f^{m+1}(z) \stackrel{\rm def}{=} \sigma g^m(y)$, where $z = \sigma y$ with $\sigma \in \{0,1\}$.

Claim 3.11.1:
1. $H^k_{p(n)} = U_k^{(1)} f^{p(n)-k}(X_{n+1})$, where $X_{n+1} = G_1(U_n^{(2)})$.

2. $H^{k+1}_{p(n)} = U_k^{(1)} f^{p(n)-k}(Y_{n+1})$, where $Y_{n+1} = U_{n+1}^{(3)}$.

Proof:

1. By definition of the functions $g^m$ and $f^m$, we have $g^m(x) = f^m(G_1(x))$. Using the definition of the hybrid $H^k_{p(n)}$, it follows that

$$H^k_{p(n)} = U_k^{(1)} g^{p(n)-k}(U_n^{(2)}) = U_k^{(1)} f^{p(n)-k}(G_1(U_n^{(2)}))$$

2. On the other hand, by definition $f^{m+1}(\sigma y) = \sigma g^m(y)$, and using the definition of the hybrid $H^{k+1}_{p(n)}$, we get

$$H^{k+1}_{p(n)} = U_{k+1}^{(1)} g^{p(n)-k-1}(U_n^{(2)}) = U_k^{(1)} f^{p(n)-k}(U_{n+1}^{(3)})$$

□

Hence distinguishing $G_1(U_n)$ from $U_{n+1}$ is reduced to distinguishing the neighbouring hybrids (i.e. $H^k_{p(n)}$ and $H^{k+1}_{p(n)}$), by applying $f^{p(n)-k}$ to the input, padding the outcome (in front) by a uniformly chosen string of length $k$, and applying the hybrid-distinguisher to the resulting string. Further details follow.

We assume, to the contrary of the theorem, that $G$ is not a pseudorandom generator. Suppose that $D$ is a probabilistic polynomial-time algorithm so that for some polynomial $q(\cdot)$ and for infinitely many $n$'s it holds that

$$\delta(n) \stackrel{\rm def}{=} \left|{\rm Prob}(D(G(U_n))=1) - {\rm Prob}(D(U_{p(n)})=1)\right| > \frac{1}{q(n)}$$

We derive a contradiction by constructing a probabilistic polynomial-time algorithm, $D'$, that distinguishes $G_1(U_n)$ from $U_{n+1}$.

Algorithm $D'$ uses algorithm $D$ as a subroutine. On input $\alpha \in \{0,1\}^{n+1}$, algorithm $D'$ operates as follows. First, $D'$ selects an integer $k$ uniformly in the set $\{0,1,\ldots,p(n)-1\}$, next $D'$ selects $\beta$ uniformly in $\{0,1\}^k$, and finally $D'$ halts with output $D(\beta f^{p(n)-k}(\alpha))$, where $f^{p(n)-k}$ is as defined above.

Clearly, $D'$ can be implemented in probabilistic polynomial-time (in particular $f^{p(n)-k}$ is computed by applying $G_1$ polynomially many times). It is left to analyze the performance of $D'$ on each of the distributions $G_1(U_n)$ and $U_{n+1}$.
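The objects used so far in the proof (Construction 3.10, the functions $g^k$ and $f^m$, and algorithm $D'$) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an implementation of a secure generator: `G1` is a hypothetical stand-in that truncates a SHA-256 digest to $n+1$ bits (so it needs $n+1 \le 256$) and is not claimed to be pseudorandom, bit strings are represented as Python strings over `'0'`/`'1'`, and all function names are introduced here for illustration only.

```python
import hashlib
import secrets


def G1(s):
    """Hypothetical stand-in for G_1: maps an n-bit string to an (n+1)-bit string.

    Assumption: truncated SHA-256, chosen only so that the sketch runs;
    it is NOT claimed to be a pseudorandom generator. Requires len(s) + 1 <= 256.
    """
    digest = hashlib.sha256(s.encode()).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[: len(s) + 1]


def g(k, x):
    """g^k(x): iterate G1 k times, emitting the first bit of each result."""
    out = []
    for _ in range(k):
        z = G1(x)
        out.append(z[0])  # sigma: the extra bit, output immediately
        x = z[1:]         # the n-bit suffix becomes the next seed
    return "".join(out)


def G(s, p):
    """Construction 3.10: G(s) = sigma_1 ... sigma_{p(|s|)}."""
    return g(p(len(s)), s)


def f(m, z):
    """f^m(z) for z = sigma y: f^0(z) is empty, f^{m+1}(z) = sigma g^m(y)."""
    if m == 0:
        return ""
    return z[0] + g(m - 1, z[1:])


def D_prime(D, alpha, p):
    """Algorithm D' on input alpha of length n+1, using D as a subroutine."""
    n = len(alpha) - 1
    k = secrets.randbelow(p(n))                             # k uniform in {0,...,p(n)-1}
    beta = "".join(secrets.choice("01") for _ in range(k))  # beta uniform in {0,1}^k
    return D(beta + f(p(n) - k, alpha))
```

The identity $g^m(x) = f^m(G_1(x))$ used in Claim 3.11.1 holds for this sketch by construction, whatever function is plugged in as `G1`. Note also that `D` is always invoked on a string of length exactly $p(n)$, as the hybrids require.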
Claim 3.11.2:

$${\rm Prob}(D'(G_1(U_n))=1) = \frac{1}{p(n)} \sum_{k=0}^{p(n)-1} {\rm Prob}(D(H^k_{p(n)})=1)$$

and

$${\rm Prob}(D'(U_{n+1})=1) = \frac{1}{p(n)} \sum_{k=0}^{p(n)-1} {\rm Prob}(D(H^{k+1}_{p(n)})=1)$$

Proof: By construction of $D'$ we get, for every $\alpha \in \{0,1\}^{n+1}$,

$${\rm Prob}(D'(\alpha)=1) = \frac{1}{p(n)} \sum_{k=0}^{p(n)-1} {\rm Prob}(D(U_k f^{p(n)-k}(\alpha))=1)$$

Using Claim 3.11.1, our claim follows. □

Let $d^k(n)$ denote the probability that $D$ outputs 1 on input taken from the hybrid $H^k_{p(n)}$ (i.e., $d^k(n) \stackrel{\rm def}{=} {\rm Prob}(D(H^k_{p(n)})=1)$). Recall that $H^0_{p(n)}$ equals $G(U_n)$, whereas $H^{p(n)}_{p(n)}$ equals $U_{p(n)}$. Hence, $d^0(n) = {\rm Prob}(D(G(U_n))=1)$, $d^{p(n)}(n) = {\rm Prob}(D(U_{p(n)})=1)$, and $\delta(n) = |d^0(n) - d^{p(n)}(n)|$. Combining these facts with Claim 3.11.2, we get,

$$\left|{\rm Prob}(D'(G_1(U_n))=1) - {\rm Prob}(D'(U_{n+1})=1)\right| = \frac{1}{p(n)} \left|\sum_{k=0}^{p(n)-1} \left(d^k(n) - d^{k+1}(n)\right)\right| = \frac{|d^0(n) - d^{p(n)}(n)|}{p(n)} = \frac{\delta(n)}{p(n)}$$

Recall that by our (contradiction) hypothesis $\delta(n) > \frac{1}{q(n)}$, for infinitely many $n$'s. Contradiction to the pseudorandomness of $G_1$ follows.

3.3.4 The Significance of Pseudorandom Generators

Pseudorandom generators have the remarkable property of being efficient "amplifiers/expanders of randomness". Using very little randomness (in form of a randomly chosen seed) they produce very long sequences which look random with respect to any efficient observer. Hence, the output of a pseudorandom generator may be used instead of "random sequences" in any efficient application requiring such (i.e., "random") sequences. The reason being that such an application may be viewed as a distinguisher. In other words, if some efficient algorithm suffers noticeable degradation in performance when replacing the random sequences it uses by pseudorandom ones, then this algorithm can be easily modified into a distinguisher contradicting the pseudorandomness of the latter sequences.

The generality of the notion of a pseudorandom generator is of great importance in practice.
Once you are guaranteed that an algorithm is a pseudorandom generator you can use it in every efficient application requiring "random sequences" without testing the performance of the generator in the specific new application.

The benefits of pseudorandom generators to cryptography are innumerable (and only the most important ones will be presented in the subsequent chapters). The reason that pseudorandom generators are so useful in cryptography is that the implementation of all cryptographic tasks requires a lot of "high quality randomness". Thus, producing, exchanging and sharing large amounts of "high quality random bits" at low cost is of primary importance. Pseudorandom generators allow one to produce (resp., exchange and/or share) ${\rm poly}(n)$ pseudorandom bits at the cost of producing (resp., exchanging and/or sharing) only $n$ random bits!

A key property of pseudorandom sequences, that is used to justify the use of such sequences in cryptography, is the unpredictability of the sequence. Loosely speaking, a sequence is unpredictable if no efficient algorithm, given a prefix of the sequence, can guess its next bit with an advantage over one half that is not negligible.

Definition 3.12 (unpredictability): An ensemble $\{X_n\}_{n\in\mathbb{N}}$ is called unpredictable in polynomial-time if for every probabilistic polynomial-time algorithm $A$ and every polynomial $p(\cdot)$ and for all sufficiently large $n$'s

$${\rm Prob}\left(A(1^n, X_n) = {\rm next}_A(1^n, X_n)\right) < \frac{1}{2} + \frac{1}{p(n)}$$

where ${\rm next}_A(x)$ returns the $i+1$st bit of $x$ if $A$ on input $(1^n, x)$ reads only $i < |x|$ of the bits of $x$, and returns a uniformly chosen bit otherwise (i.e. in case $A$ read the entire string $x$).

Clearly, pseudorandom ensembles are unpredictable in polynomial-time (see Exercise 14). It turns out that the converse holds as well. Namely, only pseudorandom ensembles are unpredictable in polynomial-time (see Exercise 15).
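The next-bit test of Definition 3.12 can be illustrated with a small Python sketch that computes a predictor's success probability exactly, by enumerating all outcomes. The names and the toy source are assumptions made for illustration only: `doubling_generator` (which outputs its seed twice) is a deliberately bad expanding function, and `predictor` is one fixed guessing strategy that reads the first $n$ bits and guesses that bit $n+1$ repeats bit 1.

```python
from fractions import Fraction
from itertools import product


def doubling_generator(s):
    """Toy expanding map s -> ss. It expands, but its output is easy to predict."""
    return s + s


def predictor(x, n):
    """Read only the first n bits of x; guess that bit n+1 equals bit 1.

    Returns (number_of_bits_read, guessed_next_bit).
    """
    prefix = x[:n]
    return n, prefix[0]


def success_probability(strings, n):
    """Exact probability that the predictor guesses the next bit,
    for a string drawn uniformly from the given list."""
    hits = 0
    for x in strings:
        i, guess = predictor(x, n)
        hits += guess == x[i]  # x[i] is the (i+1)-st bit (0-based indexing)
    return Fraction(hits, len(strings))


def all_strings(length):
    return ["".join(bits) for bits in product("01", repeat=length)]


if __name__ == "__main__":
    n = 3
    outputs = [doubling_generator(s) for s in all_strings(n)]
    print("doubling source:", success_probability(outputs, n))
    print("uniform source: ", success_probability(all_strings(2 * n), n))
```

On the doubling source the predictor is always right, so that ensemble is predictable; on uniformly distributed strings the same strategy has no advantage over one half. The definition, of course, quantifies over all polynomial-time predictors, which a single-strategy check like this cannot capture.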
3.3.5 A Necessary Condition for the Existence of Pseudorandom Generators

Up to this point we have avoided the question of whether pseudorandom generators exist at all. Before saying anything positive, we remark that a necessary condition for the existence of pseudorandom generators is the existence of one-way functions. Jumping ahead, we wish to reveal that this necessary condition is also sufficient: hence, pseudorandom generators exist if and only if one-way functions exist. At this point we only prove that the existence of pseudorandom generators implies the existence of one-way functions. Namely,

Proposition 3.13 Let $G$ be a pseudorandom generator with expansion factor $l(n) = 2n$. Then the function $f : \{0,1\}^* \mapsto \{0,1\}^*$ defined by letting $f(x,y) \stackrel{\rm def}{=} G(x)$, for every $|x| = |y|$, is a strongly one-way function.

Proof: Clearly, $f$ is polynomial-time computable. It is left to show that each probabilistic polynomial-time algorithm inverts $f$ with only negligible probability. We use a "reducibility argument". Suppose, on the contrary, that $A$ is a probabilistic polynomial-time algorithm which for infinitely many $n$'s inverts $f$ on $f(U_{2n})$ with success probability at least $\frac{1}{{\rm poly}(n)}$. We will construct a probabilistic polynomial-time algorithm, $D$, that distinguishes $U_{2n}$ and $G(U_n)$ on these $n$'s and reach a contradiction.

The distinguisher $D$ uses the inverting algorithm $A$ as a subroutine. On input $\alpha \in \{0,1\}^*$, algorithm $D$ uses $A$ in order to try to get a preimage of $\alpha$ under $f$. Algorithm $D$ then checks whether the string it obtained from $A$ is indeed a preimage and halts outputting 1 in case it is (otherwise it outputs 0). Namely, algorithm $D$ computes $\beta \leftarrow A(\alpha)$, and outputs 1 if $f(\beta) = \alpha$ and 0 otherwise.

By our hypothesis, for some polynomial $p(\cdot)$ and infinitely many $n$'s,

$${\rm Prob}\left(f(A(f(U_{2n}))) = f(U_{2n})\right) > \frac{1}{p(n)}$$

By $f$'s construction the random variable $f(U_{2n})$ equals $G(U_n)$, and therefore ${\rm Prob}(D(G(U_n))=1) > \frac{1}{p(n)}$. On the other hand, by $f$'s construction at most $2^n$ different $2n$-bit long strings have a preimage under $f$. Hence, ${\rm Prob}(f(A(U_{2n})) = U_{2n}) \le 2^{-n}$. It follows that for infinitely many $n$'s

$$\left|{\rm Prob}(D(G(U_n))=1) - {\rm Prob}(D(U_{2n})=1)\right| > \frac{1}{p(n)} - \frac{1}{2^n} > \frac{1}{2p(n)}$$

which contradicts the pseudorandomness of $G$.

3.4 Constructions based on One-Way Permutations

In this section we present constructions of pseudorandom generators based on one-way permutations. The first construction has a more abstract flavour, as it uses a single length preserving 1-1 one-way function (i.e., a single one-way permutation). The second construction utilizes the same underlying ideas to present practical pseudorandom generators based on collections of one-way permutations.

3.4.1 Construction based on a Single Permutation

By Theorem 3.11 (see Subsection 3.3.3), it suffices to present a pseudorandom generator expanding $n$-bit long seeds into $(n+1)$-bit long strings. Assuming that one-way permutations (i.e., 1-1 length preserving functions) exist, such pseudorandom generators can be constructed easily. We remind the reader that the existence of one-way permutations implies the existence of one-way permutations with corresponding hard-core predicates. Thus, it suffices to prove the following

Theorem 3.14 Let $f$ be a length-preserving 1-1 (strongly one-way) function, and let $b$ be a hard-core predicate for $f$. Then the algorithm $G$, defined by $G(s) \stackrel{\rm def}{=} f(s)b(s)$, is a pseudorandom generator.

Intuitively, the ensemble $\{f(U_n)b(U_n)\}_{n\in\mathbb{N}}$ is pseudorandom since otherwise $b(U_n)$ can be efficiently predicted from $f(U_n)$. The proof merely formalizes this intuition.

Proof: We use a "reducibility argument". Suppose, on the contrary, that there exists an efficient algorithm $D$ which distinguishes $G(U_n)$ from $U_{n+1}$. Recalling that $G(U_n) = f(U_n)b(U_n)$ and using the fact that $f$ induces a permutation on $\{0,1\}^n$, we deduce that algorithm $D$ distinguishes $f(U_n)b(U_n)$ from $f(U_n)U_1$. It follows that $D$ distinguishes $f(U_n)b(U_n)$ from $f(U_n)\bar{b}(U_n)$, where $\bar{b}(x)$ is the complement bit of $b(x)$ (i.e., $\bar{b}(x) \stackrel{\rm def}{=} \{0,1\} - b(x)$). Hence, algorithm $D$ provides a good indication of $b(U_n)$ from $f(U_n)$, and can be easily modified into an algorithm guessing $b(U_n)$ from $f(U_n)$, in contradiction to the hypothesis that $b$ is a hard-core predicate of $f$. Details follow.

We assume, on the contrary, that there exists a probabilistic polynomial-time algorithm $D$ and a polynomial $p(\cdot)$ so that for infinitely many $n$'s

$$\left|{\rm Prob}(D(G(U_n))=1) - {\rm Prob}(D(U_{n+1})=1)\right| > \frac{1}{p(n)}$$

Assume, without loss of generality, that for infinitely many $n$'s it holds that

$$\delta(n) \stackrel{\rm def}{=} \left({\rm Prob}(D(G(U_n))=1) - {\rm Prob}(D(U_{n+1})=1)\right) > \frac{1}{p(n)}$$

We construct a probabilistic polynomial-time algorithm, $A$, for predicting $b(x)$ from $f(x)$. Algorithm $A$ uses the algorithm $D$ as a subroutine. On input $y$ (equals $f(x)$ for some $x$), algorithm $A$ proceeds as follows. First, $A$ selects uniformly $\sigma \in \{0,1\}$. Next, $A$ applies $D$ to $y\sigma$. Algorithm $A$ halts outputting $\sigma$ if $D(y\sigma) = 1$ and outputs the complement of $\sigma$, denoted $\bar{\sigma}$, otherwise.

Clearly, $A$ works in polynomial-time. It is left to evaluate the success probability of algorithm $A$. We evaluate the success probability of $A$ by considering two complementary events. The event we consider is whether or not "on input $x$ algorithm $A$ selects $\sigma$ so that $\sigma = b(x)$".

Claim 3.14.1:

$${\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma = b(U_n)) = {\rm Prob}(D(f(U_n)b(U_n))=1)$$

$${\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma \ne b(U_n)) = 1 - {\rm Prob}(D(f(U_n)\bar{b}(U_n))=1)$$

where $\bar{b}(x) = \{0,1\} - b(x)$.

Proof: By construction of $A$,

$${\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma = b(U_n)) = {\rm Prob}(D(f(U_n)\sigma)=1 \mid \sigma = b(U_n)) = {\rm Prob}(D(f(U_n)b(U_n))=1 \mid \sigma = b(U_n)) = {\rm Prob}(D(f(U_n)b(U_n))=1)$$

where the last equality follows since $D$'s behavior is independent of the value of $\sigma$.
Likewise,

$${\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma \ne b(U_n)) = {\rm Prob}(D(f(U_n)\sigma)=0 \mid \sigma = \bar{b}(U_n)) = {\rm Prob}(D(f(U_n)\bar{b}(U_n))=0 \mid \sigma = \bar{b}(U_n)) = 1 - {\rm Prob}(D(f(U_n)\bar{b}(U_n))=1)$$

The claim follows. □

Claim 3.14.2:

$${\rm Prob}(D(f(U_n)b(U_n))=1) = {\rm Prob}(D(G(U_n))=1)$$

$${\rm Prob}(D(f(U_n)\bar{b}(U_n))=1) = 2\cdot{\rm Prob}(D(U_{n+1})=1) - {\rm Prob}(D(f(U_n)b(U_n))=1)$$

Proof: By definition of $G$, we have $G(U_n) = f(U_n)b(U_n)$, and the first claim follows. To justify the second claim, we use the fact that $f$ is a permutation over $\{0,1\}^n$, and hence $f(U_n)$ is uniformly distributed over $\{0,1\}^n$. It follows that $U_{n+1}$ can be written as $f(U_n)U_1$. We get

$${\rm Prob}(D(U_{n+1})=1) = \frac{{\rm Prob}(D(f(U_n)b(U_n))=1) + {\rm Prob}(D(f(U_n)\bar{b}(U_n))=1)}{2}$$

and the claim follows. □

Combining Claims 3.14.1 and 3.14.2, we get

$${\rm Prob}(A(f(U_n)) = b(U_n)) = {\rm Prob}(\sigma = b(U_n)) \cdot {\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma = b(U_n)) + {\rm Prob}(\sigma \ne b(U_n)) \cdot {\rm Prob}(A(f(U_n)) = b(U_n) \mid \sigma \ne b(U_n))$$

$$= \frac{1}{2}\cdot\left({\rm Prob}(D(f(U_n)b(U_n))=1) + 1 - {\rm Prob}(D(f(U_n)\bar{b}(U_n))=1)\right)$$

$$= \frac{1}{2} + \left({\rm Prob}(D(G(U_n))=1) - {\rm Prob}(D(U_{n+1})=1)\right) = \frac{1}{2} + \delta(n)$$

Since $\delta(n) > \frac{1}{p(n)}$ for infinitely many $n$'s, we derive a contradiction and the theorem follows.

3.4.2 Construction based on Collections of Permutations

We now combine the underlying ideas of Construction 3.10 (of Subsection 3.3.3) and Theorem 3.14 (above) to present a construction of pseudorandom generators based on collections of one-way permutations. Let $(I, D, F)$ be a triplet of algorithms defining a collection of one-way permutations (see Section 2.4.2). Recall that $I(1^n, r)$ denotes the output of algorithm $I$ on input $1^n$ and coin tosses $r$. Likewise, $D(i, s)$ denotes the output of algorithm $D$ on input $i$ and coin tosses $s$. The reader may assume, for simplicity, that $|r| = |s| = n$. Actually, this assumption can be justified in general - see Exercise 13. However, in many applications it is more natural to assume that $|r| = |s| = q(n)$ for some fixed polynomial $q(\cdot)$. We remind the reader that Theorem 2.15 applies also to collections of one-way permutations.
Construction 3.15 Let $(I, D, F)$ be a triplet of algorithms defining a strong collection of one-way permutations, and let $B$ be a hard-core predicate for this collection. Let $p(\cdot)$ be an arbitrary polynomial. Define $G(r, s) = \sigma_1 \cdots \sigma_{p(n)}$, where $i \stackrel{\rm def}{=} I(1^n, r)$, $s_0 \stackrel{\rm def}{=} D(i, s)$, and for every $1 \le j \le p(|s|)$ it holds that $\sigma_j = B(s_{j-1})$ and $s_j = f_i(s_{j-1})$.

On seed $(r, s)$, algorithm $G$ first uses $r$ to determine a permutation $f_i$ over $D_i$ (i.e., $i \leftarrow I(1^n, r)$). Secondly, algorithm $G$ uses $s$ to determine a "starting point", $s_0$, in $D_i$. For simplicity, let us shorthand $f_i$ by $f$. The essential part of algorithm $G$ is the repeated application of the function $f$ to the starting point $s_0$ and the extraction of a hard-core predicate for each resulting element. Namely, algorithm $G$ computes a sequence of elements $s_1, \ldots, s_{p(n)}$, where $s_j = f(s_{j-1})$ for every $j$ (i.e., $s_j = f^{(j)}(s_0)$, where $f^{(j)}$ denotes $j$ successive applications of the function $f$). Finally, algorithm $G$ outputs the string $\sigma_1 \cdots \sigma_{p(n)}$, where $\sigma_j = B(s_{j-1})$. Note that $\sigma_j$ is easily computed from $s_{j-1}$ but is "hard to approximate" from $s_j = f(s_{j-1})$. The pseudorandomness property of algorithm $G$ depends on the fact that $G$ does not output the intermediate $s_j$'s. (In the sequel, we will see that outputting the last element, namely $s_{p(n)}$, does not hurt the pseudorandomness property.) The expansion property of algorithm $G$ depends on the choice of the polynomial $p(\cdot)$. Namely, the polynomial $p(\cdot)$ should be larger than the polynomial $2q(\cdot)$ (where $2q(n)$ equals the total length of $r$ and $s$ corresponding to $I(1^n)$).

Theorem 3.16 Let $(I, D, F)$, $B$, $p(\cdot)$, and $G$ be as in Construction 3.15 (above), so that $p(n) > 2q(n)$ for all $n$'s. Suppose that for every $i$ in the range of algorithm $I$, the random variable $D(i)$ is uniformly distributed over the set $D_i$. Then $G$ is a pseudorandom generator.

Theorem 3.16 is an immediate corollary of the following proposition.

Proposition 3.17 Let $n$ and $t$ be integers.
For every $i$ in the range of $I(1^n)$ and every $x$ in $D_i$, define $G_{i,t}(x) = \sigma_1 \cdots \sigma_t$, where $s_0 \stackrel{\rm def}{=} x$, $s_j = f_i^{(j)}(x)$ ($f^{(j)}$ denotes $j$ successive applications of the function $f$) and $\sigma_j = B(s_{j-1})$, for every $1 \le j \le t$. Let $(I, D, F)$ and $B$ be as in Theorem 3.16 (above), $I_n$ be a random variable representing $I(1^n)$, and $X_n = D(I_n)$ be a random variable depending on $I_n$. Then, for every polynomial $p(\cdot)$, the ensembles $\{(I_n, G_{I_n,p(n)}(X_n), f_{I_n}^{(p(n))}(X_n))\}_{n\in\mathbb{N}}$ and $\{(I_n, U_{p(n)}, f_{I_n}^{(p(n))}(X_n))\}_{n\in\mathbb{N}}$ are polynomial-time indistinguishable.

Hence, the distinguishing algorithm gets, in addition to the $p(n)$-bit long sequence to be examined, also the index $i$ chosen by $G$ (in the first step of $G$'s computation) and the last $s_j$ (i.e., $s_{p(n)}$) computed by $G$. Even with this extra information it is infeasible to distinguish $G_{I_n,p(n)}(X_n) = G(1^n U_{2q(n)})$ from $U_{p(n)}$.

Proof Outline: The proof follows the proofs of Theorems 3.11 and 3.14 (of Subsection 3.3.3 and the current subsection, resp.). First, the statement is proven for $p(n) = 1$ (for all $n$'s). This part is very similar to the proof of Theorem 3.14. Secondly, observe that the random variable $X_n$ has distribution identical to the random variable $f_{I_n}(X_n)$, even conditioned on $I_n = i$ (for every $i$). Finally, assuming the validity of the case $p(\cdot) = 1$, the statement is proven for every polynomial $p(\cdot)$. This part is analogous to the proof of Theorem 3.11: one has to construct hybrids so that the $k$th hybrid starts with an element $i$ in the support of $I_n$, followed by $k$ random bits, and ends with $G_{i,p(n)-k}(X_n)$ and $f_i^{(p(n)-k)}(X_n)$, where $X_n = D(i)$. The reader should be able to complete the argument.

Proposition 3.17 and Theorem 3.16 remain valid even if one relaxes the condition concerning the distribution of $D(i)$, and only requires that $D(i)$ is statistically close (as a function in $|i|$) to the uniform distribution over $D_i$.
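The iteration underlying Construction 3.15 and the function $G_{i,t}$ can be sketched in Python as below. The sketch is generic in the permutation `f` and the predicate `B`. As a toy instance it assumes squaring modulo $N = 7 \cdot 11$ acting on the quadratic residues modulo $N$, with the least significant bit as the predicate; these parameters, and all names in the code, are illustrative assumptions and are far too small to offer any security.

```python
from math import gcd


def G_it(f, B, x, t):
    """Return (sigma_1 ... sigma_t, s_t), where s_0 = x and, for 1 <= j <= t,
    sigma_j = B(s_{j-1}) and s_j = f(s_{j-1})."""
    bits = []
    s = x
    for _ in range(t):
        bits.append(B(s))  # extract the predicate bit of the current element
        s = f(s)           # move to the next element; s itself is not output
    return bits, s


# Toy instance (assumption, for illustration only): N is a product of two
# primes that are both congruent to 3 modulo 4.
N = 7 * 11


def square_mod_N(s):
    """The candidate permutation: squaring modulo N."""
    return (s * s) % N


def lsb(s):
    """The candidate predicate: the least significant bit."""
    return s & 1


# The domain: quadratic residues modulo N that are coprime to N.
QR = sorted({(x * x) % N for x in range(1, N) if gcd(x, N) == 1})

if __name__ == "__main__":
    start = QR[1]
    bits, last = G_it(square_mod_N, lsb, start, 8)
    print("start:", start, "bits:", bits, "last element:", last)
```

In this toy instance squaring maps the set `QR` onto itself, which is the permutation property the construction relies on. The returned pair corresponds to the two components $G_{i,t}(x)$ and $f_i^{(t)}(x)$ that appear in Proposition 3.17.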
3.4.3 Practical Constructions

As an immediate application of Construction 3.15, we derive pseudorandom generators based on either of the following assumptions:

The Intractability of the Discrete Logarithm Problem: The generator is based on the fact that it is hard to predict, given a prime $P$, a primitive element $G$ in the multiplicative group mod $P$, and an element $Y$ of the group, whether there exists $0 \le x \le P/2$ so that $Y \equiv G^x \bmod P$. In other words, this bit constitutes a hard-core for the DLP collection (of Subsection 2.4.3).

The difficulty of inverting RSA: The generator is based on the fact that the least significant bit constitutes a hard-core for the RSA collection.

The Intractability of Factoring Blum Integers: The generator is based on the fact that the least significant bit constitutes a hard-core for the Rabin collection, when viewed as a collection of permutations over the quadratic residues of Blum integers (see Subsection 2.4.3).

We elaborate on the last example since it offers the most efficient implementation and yet is secure under a widely believed intractability assumption. The generator uses its seed in order to generate a composite number, $N$, which is the product of two relatively large primes.

*** PROVIDE DETAILS ABOVE. MORE EFFICIENT HEURISTIC BELOW...

3.5 * Construction based on One-Way Functions

It is known that one-way functions exist if and only if pseudorandom generators exist. However, the known construction which transforms arbitrary one-way functions into pseudorandom generators is impractical. Furthermore, the proof that this construction indeed yields pseudorandom generators is very complex and unsuitable for a book of the current nature. Instead, we confine ourselves to presenting some of the ideas underlying this construction.

3.5.1 Using 1-1 One-Way Functions

Recall that if $f$ is a 1-1 length-preserving one-way function and $b$ is a corresponding hard-core predicate then $G(s) \stackrel{\rm def}{=} f(s)b(s)$ constitutes a pseudorandom generator.
Let us relax the condition imposed on $f$ and assume that $f$ is a 1-1 one-way function (but is not necessarily length preserving). Without loss of generality, we may assume that there exists a polynomial $p(\cdot)$ so that $|f(x)| = p(|x|)$ for all $x$'s. In case $f$ is not length preserving, it follows that $p(n) > n$. At first glance, one may think that we only benefit in such a case since $f$ by itself has an expanding property. The impression is misleading since the expanded strings may not "look random". In particular, it may be the case that the first bit of $f(x)$ is zero for all $x$'s. More generally, $f(U_n)$ may be easy to distinguish from $U_{p(n)}$ (otherwise $f$ itself constitutes a pseudorandom generator). Hence, in the general case, we need to get rid of the expansion property of $f$ since it is not accompanied by a "pseudorandom" property. In general, we need to shrink $f(U_n)$ back to length $n$ so that the shrunk result induces uniform distribution. The question is how to efficiently carry on this process (i.e., of shrinking $f(x)$ back to length $|x|$, so that the shrunk $f(U_n)$ induces a uniform distribution on $\{0,1\}^n$).

Suppose that there exists an efficiently computable function $h$ so that $f_h(x) \stackrel{\rm def}{=} h(f(x))$ is length preserving and 1-1. In such a case we can let $G(s) \stackrel{\rm def}{=} h(f(s))b(s)$, where $b$ is a hard-core predicate for $f$, and get a pseudorandom generator. The pseudorandomness of $G$ follows from the observation that if $b$ is a hard-core for $f$ it is also a hard-core for $f_h$ (since an algorithm guessing $b(x)$ from $h(f(x))$ can be easily modified so that it guesses $b(x)$ from $f(x)$, by applying $h$ first). The problem is that we "know nothing about the structure" of $f$ and hence are not guaranteed that $h$ as above does exist. An important observation is that a uniformly selected hashing function will have approximately the desired properties.
Hence, hashing functions play a central role in the construction, and consequently we need to discuss these functions first.

Hashing Functions

The following terminology relating to hashing functions is merely an ad-hoc terminology (which is not a standard one). Let $S_n^m$ be a set of strings representing functions mapping $n$-bit strings to $m$-bit strings. In the sequel we freely associate the strings in $S_n^m$ with the functions that they represent. Let $H_n^m$ be a random variable uniformly distributed over the set $S_n^m$. We call $S_n^m$ a hashing family if it satisfies the following three conditions:

1. $S_n^m$ is a pairwise independent family of mappings: for every $x \ne y \in \{0,1\}^n$, the random variables $H_n^m(x)$ and $H_n^m(y)$ are independent and uniformly distributed in $\{0,1\}^m$.

2. $S_n^m$ has succinct representation: $S_n^m = \{0,1\}^{{\rm poly}(n,m)}$.

3. $S_n^m$ can be efficiently evaluated: there exists a polynomial-time algorithm that, on input a representation of a function, $h$ (in $S_n^m$), and a string $x \in \{0,1\}^n$, returns $h(x)$.

A widely used hashing family is the set of affine transformations mapping $n$-dimensional binary vectors to $m$-dimensional ones (i.e., transformations effected by multiplying the $n$-dimensional vector by an $n$-by-$m$ binary matrix and adding an $m$-dimensional vector to the result). A hashing family with more succinct representation is obtained by considering only the transformations effected by Toeplitz matrices (i.e., matrices which are invariant along the diagonals). For further details see Exercise 16.

Following is a lemma, concerning hashing functions, that is central to our application (as well as to many applications of hashing functions in complexity theory). Loosely speaking, the lemma asserts that most $h$'s in a hashing family have $h(X_n)$ distributed almost uniformly, provided $X_n$ does not assign too much probability mass to any single string.
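Before the formal statement, the affine hashing family mentioned above can be made concrete with a short Python sketch that checks the pairwise-independence condition exhaustively for tiny parameters. The representation is an assumption chosen for brevity: each row of the binary matrix is packed into an integer, strings are $n$-bit integers, and the function names are hypothetical. The check enumerates the whole family, so it is feasible only for very small $n$ and $m$.

```python
from collections import Counter
from itertools import product


def affine_hash(rows, b, x):
    """Evaluate h(x) = A x + b over GF(2).

    rows: tuple of m integers, row i of A packed as an n-bit integer
    b:    m-bit integer (the added vector)
    x:    n-bit integer (the input vector)
    Returns an m-bit integer.
    """
    out = 0
    for i, row in enumerate(rows):
        bit = bin(row & x).count("1") & 1  # inner product of row i and x mod 2
        out |= bit << i
    return out ^ b


def family(n, m):
    """Enumerate all affine maps from n-bit to m-bit strings as (rows, b)."""
    for rows in product(range(2 ** n), repeat=m):
        for b in range(2 ** m):
            yield rows, b


def is_pairwise_independent(n, m):
    """Check that for every x != y the pair (h(x), h(y)) is uniform over
    all pairs of m-bit strings when h is drawn uniformly from the family."""
    fam = list(family(n, m))
    target = len(fam) // (2 ** m * 2 ** m)
    for x in range(2 ** n):
        for y in range(x + 1, 2 ** n):
            counts = Counter(
                (affine_hash(rows, b, x), affine_hash(rows, b, y))
                for rows, b in fam
            )
            if len(counts) != 4 ** m or any(c != target for c in counts.values()):
                return False
    return True


if __name__ == "__main__":
    print(is_pairwise_independent(3, 2))
```

The added vector $b$ matters here: with purely linear maps the all-zero input would always hash to the all-zero output, so the first condition would fail for pairs involving that input.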
Lemma 3.18 Let $m < n$ be integers, $S_n^m$ be a hashing family, and $b$ and $\delta$ be two reals so that $b \le n$ and $\delta \ge 2^{-\frac{b-m}{2}}$. Suppose that $X_n$ is a random variable distributed over $\{0,1\}^n$ so that for every $x$ it holds that ${\rm Prob}(X_n = x) \le 2^{-b}$. Then, for every $\alpha \in \{0,1\}^m$, and for all but a $2^{-(b-m)}\delta^{-2}$ fraction of the $h$'s in $S_n^m$, it holds that

$${\rm Prob}(h(X_n) = \alpha) \in (1 \pm \delta) \cdot 2^{-m}$$

A function $h$ not satisfying ${\rm Prob}(h(X_n) = \alpha) \in (1 \pm \delta) \cdot 2^{-m}$ is called bad (for $\alpha$ and the random variable $X_n$). Averaging on all $h$'s we have ${\rm Prob}(h(X_n) = \alpha)$ equal $2^{-m}$. Hence the lemma bounds the fraction of $h$'s which deviate from the average value. Typically we shall use $\delta \stackrel{\rm def}{=} 2^{-\frac{b-m}{3}} \ll 1$ (making the deviation from average equal the fraction of bad $h$'s). Another useful choice is $\delta > 1$ (which yields an even smaller fraction of bad $h$'s, yet badness has only a "lower bound interpretation", i.e. ${\rm Prob}(h(X_n) = \alpha) > (1+\delta) \cdot 2^{-m}$).

Proof: Fix an arbitrary random variable $X_n$, satisfying the conditions of the lemma, and an arbitrary $\alpha \in \{0,1\}^m$. Denote $w_x \stackrel{\rm def}{=} {\rm Prob}(X_n = x)$. For every $h$ we have

$${\rm Prob}(h(X_n) = \alpha) = \sum_x w_x \zeta_x(h)$$

where $\zeta_x(h)$ equals 1 if $h(x) = \alpha$ and 0 otherwise. Hence, we are interested in the probability, taken over all possible choices of $h$, that $|2^{-m} - \sum_x w_x \zeta_x(h)| > \delta 2^{-m}$. Looking at the $\zeta_x$'s as random variables defined over the random variable $H_n^m$, it is left to show that

$${\rm Prob}\left( \left| 2^{-m} - \sum_x w_x \zeta_x \right| > \delta \cdot 2^{-m} \right) < \frac{2^{-(b-m)}}{\delta^2}$$

This is proven by applying Chebyshev's Inequality, using the fact that the $\zeta_x$'s are pairwise independent, and that $\zeta_x$ equals 1 with probability $2^{-m}$ (and 0 otherwise). (We also take advantage of the fact that $w_x \le 2^{-b}$.) Namely,

$${\rm Prob}\left( \left| 2^{-m} - \sum_x w_x \zeta_x \right| > \delta \cdot 2^{-m} \right) \le \frac{{\rm Var}(\sum_x w_x \zeta_x)}{(\delta \cdot 2^{-m})^2} < \frac{\sum_x 2^{-m} w_x^2}{\delta^2 \cdot 2^{-2m}} \le \frac{2^{-m} \cdot 2^{-b}}{\delta^2 \cdot 2^{-2m}}$$

The lemma follows.

Constructing "Almost" Pseudorandom Generators

Using any 1-1 one-way function and any hashing family, we can take a major step towards constructing a pseudorandom generator.

Construction 3.19 Let $f : \{0,1\}^* \mapsto \{0,1\}^*$ be a function satisfying $|f(x)| = p(|x|)$ for some polynomial $p(\cdot)$ and all $x$'s. For any integer function $l : \mathbb{N} \mapsto \mathbb{N}$, let $g : \{0,1\}^* \mapsto \{0,1\}^*$ be a function satisfying $|g(x)| = l(|x|)+1$, and $S_{p(n)}^{n-l(n)}$ be a hashing family. For every $x \in \{0,1\}^n$ and $h \in S_{p(n)}^{n-l(n)}$, define

$$G(x,h) \stackrel{\rm def}{=} (h(f(x)), h, g(x))$$

Clearly, $|G(x,h)| = (|x| - l(|x|)) + |h| + (l(|x|)+1) = |x| + |h| + 1$.

Proposition 3.20 Let $f$, $l$, $g$ and $G$ be as above. Suppose that $f$ is 1-to-1 and $g$ is a hard-core function of $f$. Then, for every probabilistic polynomial-time algorithm $A$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$|{\rm Prob}(A(G(U_n,U_k))=1) - {\rm Prob}(A(U_{n+k+1})=1)| < 2^{-\frac{l(n)}{3}} + \frac{1}{p(n)}$$

where $k$ is the length of the representation of the hashing functions in $S_{p(n)}^{n-l(n)}$.

The proposition can be extended to the case in which the function $f$ is polynomial-to-1 (instead of 1-to-1). Specifically, let $f$ satisfy $|f^{-1}f(x)| < q(|x|)$, for some polynomial $q(\cdot)$ and all sufficiently long $x$'s. The modified proposition asserts that for every probabilistic polynomial-time algorithm $A$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s

$$|{\rm Prob}(A(G(U_n,U_k))=1) - {\rm Prob}(A(U_{n+k+1})=1)| < 2^{-\frac{l(n) - \log_2 q(n)}{3}} + \frac{1}{p(n)}$$

where $k$ is as above.

In particular, the above proposition holds for functions $l(\cdot)$ of the form $l(n) \stackrel{\rm def}{=} c \log_2 n$, where $c > 0$ is a constant. For such functions $l$, every one-way function (can be easily modified into a function which) has a hard-core $g$ as required in the proposition's hypothesis (see Subsection 2.5.3). Hence, we get very close to constructing a pseudorandom generator.

Proof Sketch: We first note that

$$G(U_n, U_k) = (H_{p(n)}^{n-l(n)}(f(U_n)),\ H_{p(n)}^{n-l(n)},\ g(U_n))$$
$$U_{n+k+1} = (U_{n-l(n)},\ H_{p(n)}^{n-l(n)},\ U_{l(n)+1})$$

We consider the hybrid $(H_{p(n)}^{n-l(n)}(f(U_n)),\ H_{p(n)}^{n-l(n)},\ U_{l(n)+1})$. The proposition is a direct consequence of the following two claims.
Claim 3.20.1: The ensembles

$$\{(H_{p(n)}^{n-l(n)}(f(U_n)),\ H_{p(n)}^{n-l(n)},\ g(U_n))\}_{n \in \mathbb{N}}$$

and

$$\{(H_{p(n)}^{n-l(n)}(f(U_n)),\ H_{p(n)}^{n-l(n)},\ U_{l(n)+1})\}_{n \in \mathbb{N}}$$

are polynomial-time indistinguishable.

Proof Idea: Use a "reducibility argument". If the claim does not hold then a contradiction to the hypothesis that $g$ is a hard-core of $f$ is derived. $\Box$

Claim 3.20.2: The statistical difference between the random variables

$$(H_{p(n)}^{n-l(n)}(f(U_n)),\ H_{p(n)}^{n-l(n)},\ U_{l(n)+1})$$

and

$$(U_{n-l(n)},\ H_{p(n)}^{n-l(n)},\ U_{l(n)+1})$$

is bounded by $2^{-l(n)/3}$.

Proof Idea: Use the hypothesis that $S_{p(n)}^{n-l(n)}$ is a hashing family, and apply Lemma 3.18. $\Box$

Since the statistical difference is a bound on the ability of algorithms to distinguish, the proposition follows.

Applying Proposition 3.20

Once the proposition is proven we consider the possibilities of applying it in order to construct pseudorandom generators. We stress that applying Proposition 3.20, with length function $l(\cdot)$, requires having a hard-core function $g$ for $f$ with $|g(x)| = l(|x|) + 1$. By Theorem 2.17 (of Subsection 2.5.3) such hard-core functions exist practically for all one-way functions, provided that $l(\cdot)$ is logarithmic (actually, Theorem 2.17 asserts that such hard-cores exist for a modification of any one-way function which preserves its 1-1 property). Hence, combining Theorem 2.17 and Proposition 3.20, and using a logarithmic length function, we get very close to constructing a pseudorandom generator. In particular, for every polynomial $p(\cdot)$, using $l(n) \stackrel{\rm def}{=} 3 \log_2 p(n)$, we can construct a deterministic polynomial-time algorithm expanding $n$-bit long seeds into $(n+1)$-bit long strings so that no polynomial-time algorithm can distinguish the output strings from uniformly chosen ones, with probability greater than $\frac{1}{p(n)}$ (except for finitely many $n$'s). Yet, this does not imply that the output is pseudorandom (i.e., that the distinguishing gap is smaller than any polynomial fraction).
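As a worked instance of this parameter choice, the bound of Proposition 3.20 can be evaluated directly. The symbol $p'$ below is notation introduced here (an assumption, not the text's) for the polynomial quantified inside the proposition, to keep it apart from the fixed polynomial $p$ that determines $l$.

```latex
l(n) = 3\log_2 p(n)
\;\Longrightarrow\;
2^{-\frac{l(n)}{3}} = 2^{-\log_2 p(n)} = \frac{1}{p(n)},
\qquad\text{hence}\qquad
\bigl|\mathrm{Prob}(A(G(U_n,U_k))=1) - \mathrm{Prob}(A(U_{n+k+1})=1)\bigr|
< \frac{1}{p(n)} + \frac{1}{p'(n)} .
```

Since $p'$ may be taken arbitrarily large, the gap is governed by the $\frac{1}{p(n)}$ term. That term is tied to the fixed polynomial $p$ used to define $l$, which is why the bound does not fall below every polynomial fraction.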
A final trick is needed (since we cannot use $l(\cdot)$ bigger than any logarithmic function). In the sequel we present two alternative ways for obtaining a pseudorandom generator from the above construction.

The first alternative is to use Construction 3.10 (of Subsection 3.10) in order to increase the expansion factor of the above algorithms. In particular, for every integer $k$, we construct a deterministic polynomial-time algorithm expanding $n$-bit long seeds into $n^3$-bit long strings so that no polynomial-time algorithm can distinguish the output strings from uniformly chosen ones, with probability greater than $\frac{1}{n^k}$ (except for finitely many $n$'s). Denote these algorithms by $G_1, G_2, \ldots$, and construct a pseudorandom generator $G$ by letting

$$G(s) \stackrel{\rm def}{=} G_1(s_1) \oplus G_2(s_2) \oplus \cdots \oplus G_{k(|s|)}(s_{k(|s|)})$$

where $\oplus$ denotes bit-by-bit exclusive-or of strings, $s_1 s_2 \cdots s_{k(|s|)} = s$, $|s_i| = \frac{|s|}{k(|s|)} \pm 1$, and $k(n) \stackrel{\rm def}{=} \sqrt[3]{n}$. Clearly, $|G(s)| \approx \left(\frac{|s|}{k(|s|)}\right)^3 = |s|^2$, since the exclusive-or has the length of each of the strings $G_i(s_i)$. The pseudorandomness of $G$ follows by a "reducibility argument". (The choice of the function $k$ is rather arbitrary, and any unbounded function $k(\cdot)$ satisfying $k(n) < n^{2/3}$ will do.)

The second alternative is to apply Construction 3.19 to the function $\bar{f}$ defined by

$$\bar{f}(x_1, \ldots, x_n) \stackrel{\rm def}{=} f(x_1) \cdots f(x_n)$$

where $|x_1| = \cdots = |x_n| = n$. The benefit in applying Construction 3.19 to the function $\bar{f}$ is that we can use $l(n^2) \stackrel{\rm def}{=} n - 1$, and hence Proposition 3.20 yields that $G$ is a pseudorandom generator. All that is left is to show that $\bar{f}$ has a hard-core function which maps $n^2$-bit strings into $n$-bit strings. Assuming that $b$ is a hard-core predicate of the function $f$, we can construct such a hard-core function for $\bar{f}$. Specifically,

Construction 3.21 Let $f : \{0,1\}^* \mapsto \{0,1\}^*$ and $b : \{0,1\}^* \mapsto \{0,1\}$. Define

$$\bar{f}(x_1, \ldots, x_n) \stackrel{\rm def}{=} f(x_1) \cdots f(x_n)$$
$$\bar{g}(x_1, \ldots, x_n) \stackrel{\rm def}{=} b(x_1) \cdots b(x_n)$$

where $|x_1| = \cdots = |x_n| = n$.

Proposition 3.22 Let $f$ and $b$ be as above. If $b$ is a hard-core predicate of $f$ then $\bar{g}$ is a hard-core function of $\bar{f}$.
Proof Idea: Use the hybrid technique. The $i$th hybrid is

$$\left(\bar{f}(U_n^{(1)}, \ldots, U_n^{(n)}),\ b(U_n^{(1)}), \ldots, b(U_n^{(i)}),\ U_1^{(i+1)}, \ldots, U_1^{(n)}\right)$$

Use a reducibility argument (as in Theorem 3.14 of Subsection 3.4.1) to convert a distinguishing algorithm into one predicting $b$ from $f$.

Using either of the above alternatives, we get

Theorem 3.23 If there exist 1-1 one-way functions then pseudorandom generators exist as well.

The entire argument can be extended to the case in which the function $f$ is polynomial-to-1 (instead of 1-to-1). Specifically, let $f$ satisfy $|f^{-1}f(x)| < q(|x|)$, for some polynomial $q(\cdot)$ and all sufficiently long $x$'s. Then if $f$ is one-way then (either of the above alternatives yields that) pseudorandom generators exist. Proving the statement using the first alternative is quite straightforward given the discussion following Proposition 3.20. In proving the statement using the second alternative apply Construction 3.19 to the function $\bar{f}$ with $l(n^2) \stackrel{\rm def}{=} n \cdot (1 + \log_2 q(n)) - 1$. This requires showing that $\bar{f}$ has a hard-core function which maps $n^2$-bit strings into $n(1+\log_2 q(n))$-bit strings. Assuming that $g$ is a hard-core function of the function $f$, with $|g(x)| = 1 + \log_2 q(|x|)$, we can construct such a hard-core function for $\bar{f}$. Specifically,

$$\bar{g}(x_1, \ldots, x_n) \stackrel{\rm def}{=} g(x_1) \cdots g(x_n)$$

where $|x_1| = \cdots = |x_n| = n$.

3.5.2 Using Regular One-Way Functions

The validity of Proposition 3.20 relies heavily on the fact that if $f$ is 1-1 then $f(U_n)$ maintains the "entropy" of $U_n$ in a strong sense (i.e., ${\rm Prob}(f(U_n) = \alpha) \le 2^{-n}$ for every $\alpha$). In this case, it was possible to shrink $f(U_n)$ and get almost uniform distribution over $\{0,1\}^{n-l(n)}$. As stressed above, the condition may be relaxed to requiring that $f$ is polynomial-to-1 (instead of 1-to-1). In such a case only logarithmic loss of "entropy" occurs, and such a loss can be compensated by an appropriate increase in the range of the hard-core function.
We stress that hard-core functions of logarithmic length (i.e., satisfying $|g(x)| = O(\log |x|)$) can be constructed for any one-way function. However, in general, the function $f$ may not be polynomial-to-1 and in particular it can map exponentially many preimages to the same range element. If this is the case then applying $f$ to $U_n$ yields a great loss in "entropy", which cannot be compensated using the above methods. For example, if $f(x,y) \stackrel{\rm def}{=} f'(x)0^{|y|}$, for $|x| = |y|$, then ${\rm Prob}(f(U_n) = \alpha) \ge 2^{-\frac{|\alpha|}{2}}$ for some $\alpha$'s. In this case, achieving uniform distribution from $f(U_n)$ requires shrinking it to length $n/2$. In general, we cannot compensate for these lost bits since $f$ may not have a hard-core with such huge range (i.e., a hard-core $g$ satisfying $|g(\alpha)| = \frac{|\alpha|}{2}$). Hence, in this case, the above methods fail for constructing an algorithm that expands its input into a longer output. A new idea is needed, and indeed presented below.

The idea is that, in case $f$ maps different preimages into the same image $y$, we can augment $y$ by the index of the preimage, in the set $f^{-1}(y)$, without damaging the hardness-to-invert of $f$. Namely, we define $F(x) \stackrel{\rm def}{=} f(x) \cdot {\rm idx}_f(x)$, where ${\rm idx}_f(x)$ denotes the index (say by lexicographic order) of $x$ in the set $\{x' : f(x') = f(x)\}$. We claim that inverting $F$ is not substantially easier than inverting $f$. This claim can be proven by a "reducibility argument". Given an algorithm for inverting $F$ we can invert $f$ as follows. On input $y$ (supposedly in the range of $f(U_n)$), we first select $m$ uniformly in $\{1, \ldots, n\}$, next select $i$ uniformly in $\{1, \ldots, 2^m\}$, and finally try to invert $F$ on $(y, i)$. When analyzing this algorithm, consider the case $m = \lceil \log_2 |f^{-1}(y)| \rceil$.

The function $F$ suggested above does preserve the hardness-to-invert of $f$. The problem is that it does not preserve the easy-to-compute property of $f$.
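To make the difficulty concrete, the indexing can be sketched for a toy function, where the only generic way to compute the index is exhaustive search over all strings of the same length. This is an illustration under stated assumptions: `toy_f`, `idx_f` and `F` are hypothetical names, and `toy_f` mirrors the shape $f'(x)0^{|y|}$ with $f'$ the identity, so it is of course not one-way.

```python
from itertools import product


def toy_f(x):
    """Toy many-to-one map: keep the first half of x, zero the rest.

    Mirrors f(x, y) = f'(x) 0^{|y|} with f' the identity (NOT one-way);
    it only serves to illustrate the indexing.
    """
    half = len(x) // 2
    return x[:half] + "0" * (len(x) - half)


def idx_f(f, x):
    """1-based lexicographic index of x among the same-length preimages of f(x).

    Exhaustive search: up to 2^|x| evaluations of f.
    """
    target = f(x)
    index = 0
    for bits in product("01", repeat=len(x)):  # lexicographic order
        candidate = "".join(bits)
        if f(candidate) == target:
            index += 1
            if candidate == x:
                return index
    raise ValueError("unreachable: x is always a preimage of f(x)")


def F(f, x):
    """The augmented function F(x) = (f(x), idx_f(x)), which is 1-1."""
    return f(x), idx_f(f, x)


print(F(toy_f, "1011"))
```

The search in `idx_f` takes time exponential in $|x|$, which is exactly the loss of the easy-to-compute property noted above.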
In particular, for general $f$ it is not clear how to compute ${\rm idx}_f(x)$ (i.e., the best we can say is that this task can be performed in polynomial space). Again, hashing functions come to the rescue. Suppose, for example, that $f$ is $2^m$-to-1 on strings of length $n$. Then, we can set ${\rm idx}_f(x) = (H_n^m, H_n^m(x))$, obtaining "probabilistic indexing" of the set of preimages. We stress that applying the above trick requires having a good estimate for the size of the set of preimages (of a given image). That is, given $x$ it should be easy to compute $|f^{-1}f(x)|$. A simple case where such an estimate can be handy is the case of regular functions.

Definition 3.24 (Regular functions): A function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called regular if there exists an integer function $m : \mathbb{N} \mapsto \mathbb{N}$ so that for all sufficiently long $x \in \{0,1\}^*$ it holds

$$|\{y : f(x) = f(y) \wedge |x| = |y|\}| = 2^{m(|x|)}$$

For sake of simplicity (of notation), we assume in the sequel that if $f(x) = f(y)$ then $|x| = |y|$. For simplicity, the reader may further assume that there exists an algorithm that on input $n$ computes $m(n)$ in ${\rm poly}(n)$-time. As we shall see, in the end of this subsection, one can do without this assumption.

Construction 3.25 Let $f : \{0,1\}^* \mapsto \{0,1\}^*$ be a regular function with $m(|x|) = \log_2 |f^{-1}f(x)|$ for some integer function $m(\cdot)$. Let $l : \mathbb{N} \mapsto \mathbb{N}$ be an integer function, and $S_n^{m(n)-l(n)}$ be a hashing family. For every $x \in \{0,1\}^n$ and $h \in S_n^{m(n)-l(n)}$, define

$$F(x,h) \stackrel{\rm def}{=} (f(x), h(x), h)$$

If $f$ can be computed in polynomial-time and $m(n)$ can be computed from $n$ in ${\rm poly}(n)$-time, then $F$ can be computed in polynomial-time. We now show that if $f$ is a regular one-way function, then $F$ is "hard to invert". Furthermore, if $l(\cdot)$ is logarithmic then $F$ is "almost 1-1".

Proposition 3.26 Let $f$, $m$, $l$ and $F$ be as above. Suppose that there exists an algorithm that on input $n$ computes $m(n)$ in ${\rm poly}(n)$-time. Then,

1. $F$ is "almost" 1-1:
$${\rm Prob}\left( |F^{-1}F(U_n, H_n^{m(n)-l(n)})| > 2^{l(n)+1} \right) < 2^{-\frac{l(n)}{2}}$$
(Recall that $H_n^k$ denotes a random variable uniformly distributed over $S_n^k$.)
2. $F$ "preserves" the one-wayness of $f$: If $f$ is strongly (resp. weakly) one-way then so is $F$.

Proof Sketch: Part (1) is proven by applying Lemma 3.18, using the hypothesis that $S_n^{m(n)-l(n)}$ is a hashing family. Part (2) is proven using a "reducibility argument". Assuming, to the contradiction, that there exists an efficient algorithm $A$ that inverts $F$ with unallowable success probability, we construct an efficient algorithm $A'$ that inverts $f$ with unallowable success probability (reaching contradiction). For sake of concreteness, we consider the case in which $f$ is strongly one-way, and assume to the contradiction that algorithm $A$ inverts $F$ on $F(U_n, H_n^{m(n)-l(n)})$ with success probability $\epsilon(n)$, so that $\epsilon(n) > \frac{1}{{\rm poly}(n)}$ for infinitely many $n$'s. Following is a description of $A'$.

On input $y$ (supposedly in the range of $f(U_n)$), algorithm $A'$ repeats the following experiment for ${\rm poly}(\frac{n}{\epsilon(n)})$ many times. Algorithm $A'$ selects uniformly $h \in S_n^{m(n)-l(n)}$ and $\alpha \in \{0,1\}^{m(n)-l(n)}$, and initiates $A$ on input $(y, \alpha, h)$. Algorithm $A'$ sets $x$ to be the $n$-bit long prefix of $A(y, \alpha, h)$, and outputs $x$ if $y = f(x)$. Otherwise, algorithm $A'$ continues to the next experiment.

Clearly, algorithm $A'$ runs in polynomial-time, provided that $\epsilon(n) > \frac{1}{{\rm poly}(n)}$. We now evaluate the success probability of $A'$. For every possible input, $y$, to algorithm $A'$, we consider a random variable $X_n$ uniformly distributed in $f^{-1}(y)$. Let $\delta(y)$ denote the success probability of algorithm $A$ on input $(y, H_n^k(X_n), H_n^k)$, where $n \stackrel{\rm def}{=} |y|$ and $k \stackrel{\rm def}{=} m(n) - l(n)$. Clearly, ${\rm Exp}(\delta(f(U_n))) = \epsilon(n)$, and ${\rm Prob}(\delta(f(U_n)) > \frac{\epsilon(n)}{2}) > \frac{\epsilon(n)}{2}$ follows. We fix an arbitrary $y \in \{0,1\}^n$ so that $\delta(y) > \frac{\epsilon(n)}{2}$. We prove the following technical claim.

Claim 3.26.1: Let $n$, $k$ and $X_n$ be as above. Suppose that $B$ is a set of pairs, and

$$\delta \stackrel{\rm def}{=} {\rm Prob}((H_n^k(X_n), H_n^k) \in B)$$

Then,

$${\rm Prob}((U_k, H_n^k) \in B) > \frac{\delta^4}{2^8 \cdot k}$$

Using this claim, it follows that the probability that $A'$ inverts $f$ on $y$ in a single iteration is at least $\left(\frac{\delta(y)}{4}\right)^4 \cdot \frac{1}{k}$. We reach a contradiction (to the one-wayness of $f$), and the proposition follows. All that is left is to prove Claim 3.26.1. The proof, given below, is rather technical.

We stress that the fact that $m(n)$ can be computed from $n$ does not play an essential role in the reducibility argument (as it is possible to try all possible values of $m(n)$).

Claim 3.26.1 is of interest for its own sake. However, its proof provides no significant insights and may be skipped without significant damage (especially by readers that are more interested in cryptography than in "probabilistic analysis").

Proof of Claim 3.26.1: We first use Lemma 3.18 to show that only a "tiny" fraction of the hashing functions in $S_n^k$ can map "large" probability mass into "small" subsets. Once this is done, the claim is proven by dismissing those few bad functions and relating the two probabilities, appearing in the statement of the claim, conditioned on the function not being bad. Details follow.

We begin by bounding the fraction of the hashing functions that map "large" probability mass into "small" subsets. We say that a function $h \in S_n^k$ is $(T, \Delta)$-expanding if there exists a set $R \subset \{0,1\}^k$ of cardinality $\Delta \cdot 2^k$ so that ${\rm Prob}(h(X_n) \in R) \ge (T+1) \cdot \Delta$. In other words, $h$ maps to some set of density $\Delta$ a probability mass $T+1$ times the density of the set. Our first goal is to prove that at most $\frac{\delta}{4}$ of the $h$'s are $(\frac{32k}{\delta^2}, \frac{\delta^3}{64k})$-expanding. In other words, only $\frac{\delta}{4}$ of the functions map to some set of density $\frac{\delta^3}{64k}$ a probability mass of more than $\frac{\delta}{2}$.

We start with a related question. We say that $\alpha \in \{0,1\}^k$ is $t$-overweighted by the function $h$ if ${\rm Prob}(h(X_n) = \alpha) \ge (t+1) \cdot 2^{-k}$. A function $h \in S_n^k$ is called $(t, \Delta)$-overweighting if there exists a set $R \subset \{0,1\}^k$ of cardinality $\Delta 2^k$ so that each $\alpha \in R$ is $t$-overweighted by $h$. (Clearly, if $h$ is $(t, \Delta)$-overweighting then it is also $(t, \Delta)$-expanding, but the converse is not necessarily true.) We first show that at most a $\frac{1}{t^2 \Delta}$ fraction of the $h$'s are $(t, \Delta)$-overweighting. The proof is given in the rest of this paragraph. Recall that ${\rm Prob}(X_n = x) \le 2^{-k}$, for every $x$. Using Lemma 3.18, it follows that each $\alpha \in \{0,1\}^k$ is $t$-overweighted by at most a $t^{-2}$ fraction of the $h$'s. Assuming, to the contradiction, that more than a $\frac{1}{t^2 \Delta}$ fraction of the $h$'s are $(t, \Delta)$-overweighting, we construct a bipartite graph by connecting each of these $h$'s with the $\alpha$'s that it $t$-overweights. Contradiction follows by observing that there exists an $\alpha$ which is connected to more than

$$\frac{\frac{1}{t^2 \Delta} \cdot |S_n^k| \cdot \Delta 2^k}{2^k} = \frac{1}{t^2} \cdot |S_n^k|$$

of the $h$'s.

We now relate the expansion and overweighting properties. Specifically, if $h$ is $(T, \Delta)$-expanding then there exists an integer $i \in \{1, \ldots, k\}$ so that $h$ is $(T \cdot 2^{i-1}, \frac{\Delta}{k \cdot 2^i})$-overweighting. Hence, at most a

$$\sum_{i=1}^{k} \frac{1}{(T \cdot 2^{i-1})^2 \cdot \frac{\Delta}{k \cdot 2^i}} < \frac{4k}{T^2 \cdot \Delta}$$

fraction of the $h$'s can be $(T, \Delta)$-expanding. It follows that at most $\frac{\delta}{4}$ of the $h$'s are $(\frac{32k}{\delta^2}, \frac{\delta^3}{64k})$-expanding.

We call $h$ honest if it is not $(\frac{32k}{\delta^2}, \frac{\delta^3}{64k})$-expanding. Hence, if $h$ is honest and ${\rm Prob}(h(X_n) \in R) \ge \frac{\delta}{2}$ then $R$ has density at least $\frac{\delta^3}{64k}$. Concentrating on the honest $h$'s, we now evaluate the probability that $(\alpha, h)$ hits $B$, when $\alpha$ is uniformly chosen. We call $h$ good if ${\rm Prob}((h(X_n), h) \in B) \ge \frac{\delta}{2}$. Clearly, the probability that $H_n^k$ is good is at least $\frac{\delta}{2}$, and the probability that $H_n^k$ is both good and honest is at least $\frac{\delta}{4}$. Denote by $G$ the set of these $h$'s (i.e., $h$'s which are both good and honest). Clearly, for every $h \in G$ we have ${\rm Prob}((h(X_n), h) \in B) \ge \frac{\delta}{2}$ (since $h$ is good) and ${\rm Prob}((U_k, h) \in B) \ge \frac{\delta^3}{64k}$ (since $h$ is honest). Using ${\rm Prob}(H_n^k \in G) \ge \frac{\delta}{4}$, the claim follows. $\Box$

Applying Proposition 3.26

It is possible to apply Construction 3.19 to the function resulting from Construction 3.25, and the statement of Proposition 3.20 still holds with minor modifications.
Specifically, Construction 3.19 is applied with $l(\cdot)$ twice the function (i.e., the $l(\cdot)$) used in Construction 3.25, and the bound in Proposition 3.20 is $3 \cdot 2^{-\frac{l(n)}{6}}$ (instead of $2^{-\frac{l(n)}{3}}$). The argument leading to Theorem 3.23 remains valid as well. Furthermore, we may even waive the requirement that $m(n)$ can be computed (since we can construct functions $F_m$ for every possible value of $m(n)$). Finally, we note that the entire argument holds even if the definition of regular functions is relaxed as follows.

Definition 3.27 (Regular functions - revised definition): A function $f : \{0,1\}^* \mapsto \{0,1\}^*$ is called regular if there exists an integer function $m' : \mathbb{N} \mapsto \mathbb{N}$ and a polynomial $q(\cdot)$ so that for all sufficiently long $x \in \{0,1\}^*$ it holds

$$2^{m'(|x|)} \le |\{y : f(x) = f(y)\}| \le q(|x|) \cdot 2^{m'(|x|)}$$

When using these (relaxed) regular functions in Construction 3.25, set $m(n) \stackrel{\rm def}{=} m'(n)$. The resulting function $F$ will have a slightly weaker "almost" 1-1 property. Namely,

$${\rm Prob}\left( |F^{-1}F(U_n, H_n^{m(n)-l(n)})| > q(n) \cdot 2^{l(n)+1} \right) < 2^{-\frac{l(n)}{2}}$$

The application of Construction 3.19 will be modified accordingly. We get

Theorem 3.28 If there exist regular one-way functions then pseudorandom generators exist as well.

3.5.3 Going beyond Regular One-Way Functions

The proof of Proposition 3.26 relies heavily on the fact that the one-way function $f$ is regular (at least in the weak sense). Alternatively, Construction 3.25 needs to be modified so that different hashing families are associated to different $x \in \{0,1\}^n$. Furthermore, the argument leading to Theorem 3.23 cannot be repeated unless it is easy to compute the cardinality of the set $f^{-1}(f(x))$ given $x$. Note that this time we cannot construct functions $F_m$ for every possible value of $\lceil \log_2 |f^{-1}(y)| \rceil$ since none of the functions may satisfy the statement of Proposition 3.26. Again, a new idea is needed.
A key observation is that although the value of $\log_2 |f^{-1}(f(x))|$ may vary for different $x \in \{0,1\}^n$, the value $m(n) \stackrel{\rm def}{=} {\rm Exp}(\log_2 |f^{-1}(f(U_n))|)$ is unique. Furthermore, the function $\bar{f}$ defined by

$$\bar{f}(x_1, \ldots, x_{n^2}) \stackrel{\rm def}{=} f(x_1), \ldots, f(x_{n^2})$$

where $|x_1| = \cdots = |x_{n^2}| = n$, has the property that all but a negligible fraction of the domain reside in preimage sets with logarithm of cardinality not deviating too much from the expected value. Specifically, let $\bar{m}(n^3) \stackrel{\rm def}{=} {\rm Exp}(\log_2 |\bar{f}^{-1}(\bar{f}(U_{n^3}))|)$. Clearly, $\bar{m}(n^3) = n^2 \cdot m(n)$. Using the Chernoff Bound, we get

$${\rm Prob}\left( \left| \bar{m}(n^3) - \log_2 |\bar{f}^{-1}(\bar{f}(U_{n^3}))| \right| > n^2 \right) < 2^{-n}$$

Suppose we apply Construction 3.25 to $\bar{f}$ setting $l(n^3) \stackrel{\rm def}{=} n^2$. Denote the resulting function by $\bar{F}$. Suppose we apply Construction 3.19 to $\bar{F}$ setting this time $l(n^3) \stackrel{\rm def}{=} 2n^2 - 1$. Using the ideas presented in the proofs of Propositions 3.20 and 3.26, one can show that if the $n^3$-bit to $(l(n^3)+1)$-bit function used in Construction 3.19 is a hard-core of $\bar{F}$ then the resulting algorithm constitutes a pseudorandom generator. Yet, we are left with the problem of constructing a huge hard-core function, $\bar{G}$, for the function $\bar{F}$. Specifically, $|\bar{G}(x)|$ has to equal $2|x|^{\frac{2}{3}}$, for all $x$'s. A natural idea is to define $\bar{G}$ analogously to the way $\bar{g}$ is defined in Construction 3.21. Unfortunately, we do not know how to prove the validity of this construction (when applied to $\bar{F}$), and a much more complicated construction is required. This construction does use all the above ideas in conjunction with additional ideas not presented here. The proof of validity is even more complex, and is not suitable for a book of the current nature. We thus conclude this section by merely stating the result obtained.

Theorem 3.29 If there exist one-way functions then pseudorandom generators exist as well.

3.6 Pseudorandom Functions

Pseudorandom generators make it possible to generate, exchange and share a large number of pseudorandom values at the cost of a much smaller number of random bits.
Specifically, ${\rm poly}(n)$ pseudorandom bits can be generated, exchanged and shared at the cost of $n$ (uniformly chosen) bits. Since any efficient application uses only a polynomial number of random values, providing access to polynomially many pseudorandom entries seems sufficient. However, the above conclusion is too hasty, since it assumes implicitly that these entries (i.e., the addresses to be accessed) are fixed beforehand. In some natural applications, one may need to access addresses which are determined "dynamically" by the application. For example, one may want to assign random values to (${\rm poly}(n)$ many) $n$-bit long strings, produced throughout the application, so that these values can be retrieved at a later time. Using pseudorandom generators the above task can be achieved at the cost of generating $n$ random bits and storing ${\rm poly}(n)$ many values. The challenge, met in the sequel, is to achieve the above task at the cost of generating and storing only $n$ random bits. The key to the solution is the notion of pseudorandom functions. In this section we define pseudorandom functions and show how to efficiently implement them. The implementation uses as a building block any pseudorandom generator.

3.6.1 Definitions

Loosely speaking, pseudorandom functions are functions which cannot be distinguished from truly random functions by any efficient procedure which can get the value of the function at arguments of its choice. Hence, the distinguishing procedure may query the function being examined at various points, depending possibly on previous answers obtained, and yet cannot tell whether the answers were supplied by a function taken from the pseudorandom ensemble (of functions) or from the uniform ensemble (of functions). Hence, to formalize the notion of pseudorandom functions we need to consider ensembles of functions. For sake of concreteness we consider in the sequel ensembles of length preserving functions. Extensions are discussed in Exercise 21.
Definition 3.30 (function ensembles): A function ensemble is a sequence $F = \{F_n\}_{n \in \mathbb{N}}$ of random variables, so that the random variable $F_n$ assumes values in the set of functions mapping $n$-bit long strings to $n$-bit long strings. The uniform function ensemble, denoted $H = \{H_n\}_{n \in \mathbb{N}}$, has $H_n$ uniformly distributed over the set of functions mapping $n$-bit long strings to $n$-bit long strings.

To formalize the notion of pseudorandom functions we use (probabilistic polynomial-time) oracle machines. We stress that our use of the term oracle machine is almost identical to the standard one. One deviation is that the oracle machines we consider have a length preserving function as oracle rather than a Boolean function (as is standard in most cases in the literature). Furthermore, we assume that on input $1^n$ the oracle machine only makes queries of length $n$. These conventions are not really essential (they merely simplify the exposition a little).

Definition 3.31 (pseudorandom function ensembles): A function ensemble, $F = \{F_n\}_{n \in \mathbb{N}}$, is called pseudorandom if for every probabilistic polynomial-time oracle machine $M$, every polynomial $p(\cdot)$ and all sufficiently large $n$'s

$$|{\rm Prob}(M^{F_n}(1^n)=1) - {\rm Prob}(M^{H_n}(1^n)=1)| < \frac{1}{p(n)}$$

where $H = \{H_n\}_{n \in \mathbb{N}}$ is the uniform function ensemble.

Using techniques similar to those presented in the proof of Proposition 3.3 (of Subsection 3.2.2), one can demonstrate the existence of pseudorandom function ensembles which are not statistically close to the uniform one. However, to be of practical use, we need to require that the pseudorandom functions can be efficiently computed.

Definition 3.32 (efficiently computable function ensembles): A function ensemble, $F = \{F_n\}_{n \in \mathbb{N}}$, is called efficiently computable if the following two conditions hold

1. (efficient indexing): There exists a probabilistic polynomial time algorithm, $I$, and a mapping from strings to functions, $\phi$, so that $\phi(I(1^n))$ and $F_n$ are identically distributed. We denote by $f_i$ the $\{0,1\}^n \mapsto \{0,1\}^n$ function assigned to $i$ (i.e., $f_i \stackrel{\rm def}{=} \phi(i)$).
2. (efficient evaluation): There exists a probabilistic polynomial time algorithm, $V$, so that $V(i,x) = f_i(x)$.

In particular, functions in an efficiently computable function ensemble have relatively succinct representation (i.e., of polynomial rather than exponential length). It follows that efficiently computable function ensembles may have only exponentially many functions (out of the double-exponentially many possible functions).

Another point worthy of stressing is that pseudorandom functions may (if being efficiently computable) be efficiently evaluated at given points, provided that the function description is given as well. However, if the function (or its description) is not known (and it is only known that it is chosen from the pseudorandom ensemble) then the value of the function at a point cannot be approximated (even in a very liberal sense and) even if the values of the function at other points are also given.

In the rest of this book we consider only efficiently computable pseudorandom functions. Hence, in the sequel we sometimes shorthand such ensembles by calling them pseudorandom functions.

3.6.2 Construction

Using any pseudorandom generator, we construct an (efficiently computable) pseudorandom function (ensemble).

Construction 3.33 Let $G$ be a deterministic algorithm expanding inputs of length $n$ into strings of length $2n$. We denote by $G_0(s)$ the $|s|$-bit long prefix of $G(s)$, and by $G_1(s)$ the $|s|$-bit long suffix of $G(s)$ (i.e., $G(s) = G_0(s)G_1(s)$). For every $s \in \{0,1\}^n$, we define a function $f_s : \{0,1\}^n \mapsto \{0,1\}^n$ so that for every $\sigma_1, \ldots, \sigma_n \in \{0,1\}$

$$f_s(\sigma_1 \sigma_2 \cdots \sigma_n) \stackrel{\rm def}{=} G_{\sigma_n}(\cdots(G_{\sigma_2}(G_{\sigma_1}(s)))\cdots)$$

Let $F_n$ be a random variable defined by uniformly selecting $s \in \{0,1\}^n$ and setting $F_n = f_s$. Finally, let $F = \{F_n\}_{n \in \mathbb{N}}$ be our function ensemble.

Pictorially, the function $f_s$ is defined by $n$-step walks down a full binary tree of depth $n$ having labels on the vertices.
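The tree walk of Construction 3.33 can be sketched in a few lines. This is only an illustration under stated assumptions: bit strings are represented as Python strings of '0'/'1', and the function `G` below is a hypothetical stand-in built from SHA-256 so that the sketch runs. It is not a proven pseudorandom generator, and the names `G` and `f_s` are chosen here to mirror the text.

```python
import hashlib


def G(s):
    """Toy length-doubling map on bit strings.

    Stand-in only: SHA-256 in counter mode is used so the sketch runs;
    it is NOT a proven pseudorandom generator.
    """
    n = len(s)
    out = ""
    counter = 0
    while len(out) < 2 * n:
        digest = hashlib.sha256(f"{counter}:{s}".encode()).digest()
        out += "".join(f"{byte:08b}" for byte in digest)
        counter += 1
    return out[: 2 * n]


def f_s(s, x):
    """Walk down the tree: the root is labelled s, and bit sigma_i of x
    selects G_0 (prefix half) or G_1 (suffix half) of the current label."""
    assert len(s) == len(x)
    n = len(s)
    label = s
    for bit in x:
        expanded = G(label)
        label = expanded[:n] if bit == "0" else expanded[n:]
    return label


seed = "10110010"
print(f_s(seed, "01100111"))
```

Evaluating `f_s` costs $n$ invocations of `G`, one per level of the tree, and only the labels along a single root-to-leaf path are ever computed.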
The root of the tree, hereafter referred to as the level 0 vertex of the tree, is labelled by the string $s$. If an internal node is labelled $r$ then its left child is labelled $G_0(r)$ whereas its right child is labelled $G_1(r)$. The value of $f_s(x)$ is the string residing in the leaf reachable from the root by a path corresponding to string $x$, when the root is labelled by $s$. The random variable $F_n$ is assigned labelled trees corresponding to all possible $2^n$ labellings of the root, with uniform probability distribution.

A function, operating on $n$-bit strings, in the ensemble constructed above can be specified by $n$ bits. Hence, selecting, exchanging and storing such a function can be implemented at the cost of selecting, exchanging and storing a single $n$-bit string.

Theorem 3.34 Let $G$ and $F$ be as in Construction 3.33, and suppose that $G$ is a pseudorandom generator. Then $F$ is an efficiently computable ensemble of pseudorandom functions.

Proof: Clearly, the ensemble $F$ is efficiently computable. To prove that $F$ is pseudorandom we use the hybrid technique. The $k$th hybrid will be assigned functions which result by uniformly selecting labels for the vertices of the $k$th (highest) level of the tree and computing the labels of lower levels as in Construction 3.33. The 0-hybrid will correspond to the random variable $F_n$ (since a uniformly chosen label is assigned to the root), whereas the $n$-hybrid will correspond to the uniform random variable $H_n$ (since a uniformly chosen label is assigned to each leaf). It will be shown that an efficient oracle machine distinguishing neighbouring hybrids can be transformed into an algorithm that distinguishes polynomially many samples of $G(U_n)$ from polynomially many samples of $U_{2n}$. Using Theorem 3.6 (of Subsection 3.2.3), we derive a contradiction to the hypothesis (that $G$ is a pseudorandom generator). Details follow.

For every $k$, $0 \le k \le n$, we define a hybrid distribution $H_n^k$ (assigned as values functions $f : \{0,1\}^n \mapsto \{0,1\}^n$) as follows. For every $s_1, s_2, \ldots, s_{2^k} \in \{0,1\}^n$, we define a function $f_{s_1,\ldots,s_{2^k}} : \{0,1\}^n \mapsto \{0,1\}^n$ so that

$$f_{s_1,\ldots,s_{2^k}}(\sigma_1 \sigma_2 \cdots \sigma_n) \stackrel{\rm def}{=} G_{\sigma_n}(\cdots(G_{\sigma_{k+2}}(G_{\sigma_{k+1}}(s_{{\rm idx}(\sigma_k \cdots \sigma_1)})))\cdots)$$

where ${\rm idx}(\alpha)$ is the index of $\alpha$ in the standard lexicographic order of strings of length $|\alpha|$. (In the sequel we take the liberty of associating the integer ${\rm idx}(\alpha)$ with the string $\alpha$.) Namely, $f_{s_{0^k},\ldots,s_{1^k}}(x)$ is computed by first using the $k$-bit long prefix of $x$ to determine one of the $s_j$'s, and next using the $(n-k)$-bit long suffix of $x$ to determine which of the functions $G_0$ and $G_1$ to apply at each remaining stage. The random variable $H_n^k$ is uniformly distributed over the above $(2^n)^{2^k}$ possible functions. Namely,

$$H_n^k \stackrel{\rm def}{=} f_{U_n^{(1)},\ldots,U_n^{(2^k)}}$$

where the $U_n^{(j)}$'s are independent random variables each uniformly distributed over $\{0,1\}^n$.

At this point it is clear that $H_n^0$ is identical to $F_n$, whereas $H_n^n$ is identical to $H_n$. Again, as usual in the hybrid technique, ability to distinguish the extreme hybrids yields ability to distinguish a pair of neighbouring hybrids. This ability is further transformed (as sketched above) so that a contradiction to the pseudorandomness of $G$ is reached. Further details follow.

We assume, in contradiction to the theorem, that the function ensemble $F$ is not pseudorandom. It follows that there exists a probabilistic polynomial-time oracle machine, $M$, and a polynomial $p(\cdot)$ so that for infinitely many $n$'s

$$\Delta(n) \stackrel{\rm def}{=} |{\rm Prob}(M^{F_n}(1^n)=1) - {\rm Prob}(M^{H_n}(1^n)=1)| > \frac{1}{p(n)}$$

Let $t(\cdot)$ be a polynomial bounding the running time of $M(1^n)$ (such a polynomial exists since $M$ is polynomial-time). It follows that, on input $1^n$, the oracle machine $M$ makes at most $t(n)$ queries (since the number of queries is clearly bounded by the running time). Using the machine $M$, we construct an algorithm $D$ that distinguishes the $t(\cdot)$-product of the ensemble $\{G(U_n)\}_{n \in \mathbb{N}}$ from the $t(\cdot)$-product of the ensemble $\{U_{2n}\}_{n \in \mathbb{N}}$ as follows.
On input $\alpha_1,\ldots,\alpha_t \in \{0,1\}^{2n}$ (with $t = t(n)$), algorithm $D$ proceeds as follows. First, $D$ selects uniformly $k \in \{0,1,\ldots,n-1\}$. This random choice, hereafter called the checkpoint, is the only random choice made by $D$ itself. Next, algorithm $D$ invokes the oracle machine $M$ (on input $1^n$) and answers $M$'s queries as follows. The first query of machine $M$, denoted $q_1$, is answered by

$$G_{\sigma_n}(\cdots(G_{\sigma_{k+2}}(P_{\sigma_{k+1}}(\alpha_1)))\cdots)$$

where $q_1 = \sigma_1\cdots\sigma_n$, and $P_0(\alpha)$ denotes the $n$-bit prefix of $\alpha$ (and $P_1(\alpha)$ denotes the $n$-bit suffix of $\alpha$). In addition, algorithm $D$ records this query (i.e., $q_1$). Subsequent queries are answered by first checking if their $k$-bit long prefix equals the $k$-bit long prefix of a previous query. In case the $k$-bit long prefix of the current query, denoted $q_i$, is different from the $k$-bit long prefixes of all previous queries, we associate this prefix with a new input string (i.e., $\alpha_i$). Namely, we answer query $q_i$ by

$$G_{\sigma_n}(\cdots(G_{\sigma_{k+2}}(P_{\sigma_{k+1}}(\alpha_i)))\cdots)$$

where $q_i = \sigma_1\cdots\sigma_n$. In addition, algorithm $D$ records the current query (i.e., $q_i$). The other possibility is that the $k$-bit long prefix of the $i$th query equals the $k$-bit long prefix of some previous query. Let $j$ be the smallest integer so that the $k$-bit long prefix of the $i$th query equals the $k$-bit long prefix of the $j$th query (by hypothesis $j < i$). Then, we record the current query (i.e., $q_i$) but answer it using the string associated with query $q_j$. Namely, we answer query $q_i$ by

$$G_{\sigma_n}(\cdots(G_{\sigma_{k+2}}(P_{\sigma_{k+1}}(\alpha_j)))\cdots)$$

where $q_i = \sigma_1\cdots\sigma_n$. Finally, when machine $M$ halts, algorithm $D$ halts as well and outputs the same output as $M$.

Pictorially, algorithm $D$ answers the first query by first placing the two halves of $\alpha_1$ in the corresponding children of the tree-vertex reached by following the path from the root corresponding to $\sigma_1\cdots\sigma_k$. The labels of all vertices in the subtree corresponding to $\sigma_1\cdots\sigma_k$ are determined by the labels of these two children (as in the construction of $F$).
Subsequent queries are answered by following the corresponding paths from the root. In case the path does not pass through a $(k+1)$-level vertex which has already a label, we assign this vertex and its sibling a new string (taken from the input). For sake of simplicity, in case the path of the $i$th query requires a new string we use the $i$th input string (rather than the first input string not used so far). In case the path of a new query passes through a $(k+1)$-level vertex which has been labelled already, we use this label to compute the labels of subsequent vertices along this path (and in particular the label of the leaf). We stress that the algorithm does not necessarily compute the labels of all vertices in a subtree corresponding to $\sigma_1\cdots\sigma_k$ (although these labels are determined by the label of the vertex corresponding to $\sigma_1\cdots\sigma_k$), but rather computes only the labels of vertices along the paths corresponding to the queries.

Clearly, algorithm $D$ can be implemented in polynomial-time. It is left to evaluate its performance. The key observation is that when the inputs are taken from the $t(n)$-product of $G(U_n)$ and algorithm $D$ chooses $k$ as the checkpoint then $M$ behaves exactly as on the $k$th hybrid. Likewise, when the inputs are taken from the $t(n)$-product of $U_{2n}$ and algorithm $D$ chooses $k$ as the checkpoint then $M$ behaves exactly as on the $(k+1)$st hybrid. Namely,

Claim 3.34.1: Let $n$ be an integer and $t \stackrel{\text{def}}{=} t(n)$. Let $K$ be a random variable describing the random choice of checkpoint by algorithm $D$ (on input a $t$-long sequence of $2n$-bit long strings). Then for every $k \in \{0,1,\ldots,n-1\}$

$$\begin{aligned}
\mathrm{Prob}\left(D(G(U_n^{(1)}),\ldots,G(U_n^{(t)}))=1 \mid K=k\right) &= \mathrm{Prob}\left(M^{H_n^{k}}(1^n)=1\right)\\
\mathrm{Prob}\left(D(U_{2n}^{(1)},\ldots,U_{2n}^{(t)})=1 \mid K=k\right) &= \mathrm{Prob}\left(M^{H_n^{k+1}}(1^n)=1\right)
\end{aligned}$$

where the $U_n^{(i)}$'s and $U_{2n}^{(j)}$'s are independent random variables uniformly distributed over $\{0,1\}^n$ and $\{0,1\}^{2n}$, respectively.

The above claim is quite obvious, yet a rigorous proof is more complex than one realizes at first glance.
The reason being that $M$'s queries may depend on previous answers it gets, and hence the correspondence between the inputs of $D$ and possible values assigned to the hybrids is less obvious than it seems. To illustrate the difficulty consider an $n$-bit string which is selected by a pair of interactive processes, which proceed in $n$ iterations. At each iteration the first party chooses a new location, based on the entire history of the interaction, and the second process sets the value of this bit by flipping an unbiased coin. It is intuitively clear that the resulting string is uniformly distributed, and the same holds if the second party sets the value of the chosen locations using the outcome of a coin flipped beforehand. In our setting the situation is slightly more involved. The process of determining the string is terminated after $k < n$ iterations and statements are made of the partially determined string. Consequently, the situation is slightly confusing and we feel that a detailed argument is required.

Proof of Claim 3.34.1: We start by sketching a proof of the claim for the extremely simple case in which $M$'s queries are the first $t$ strings (of length $n$) in lexicographic order. Let us further assume, for simplicity, that on input $\alpha_1,\ldots,\alpha_t$, algorithm $D$ happens to choose checkpoint $k$ so that $t = 2^{k+1}$. In this case the oracle machine $M$ is invoked on input $1^n$ and access to the function $f_{s_1,\ldots,s_{2^{k+1}}}$, where $s_{2j-1+\sigma} = P_\sigma(\alpha_j)$ for every $j \le 2^k$ and $\sigma \in \{0,1\}$. Thus, if the inputs to $D$ are uniformly selected in $\{0,1\}^{2n}$ then $M$ is invoked with access to the $(k+1)$st hybrid random variable (since in this case the $s_j$'s are independent and uniformly distributed in $\{0,1\}^n$). On the other hand, if the inputs to $D$ are distributed as $G(U_n)$ then $M$ is invoked with access to the $k$th hybrid random variable (since in this case $f_{s_1,\ldots,s_{2^{k+1}}} = f_{r_1,\ldots,r_{2^k}}$ where the $r_j$'s are seeds corresponding to the $\alpha_j$'s).
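The identity used in the simple case (replacing each level-$k$ label $r_j$ by the two halves of $G(r_j)$ at level $k+1$ leaves the function unchanged) can be checked on a toy instance. The following is a minimal sketch under stated assumptions: the length-doubling map `G` is built from SHA-256 purely as a heuristic placeholder (it is not claimed to be a pseudorandom generator), labels are 16-byte strings, the query length is decoupled from the label length, and the indexing is 0-based with the first query bit most significant. The function names are illustrative and do not come from the text.

```python
import hashlib
from itertools import product

def G(label: bytes) -> bytes:
    """Length-doubling placeholder for G (assumption: SHA-256 based, labels <= 32 bytes)."""
    half0 = hashlib.sha256(b"\x00" + label).digest()[:len(label)]
    half1 = hashlib.sha256(b"\x01" + label).digest()[:len(label)]
    return half0 + half1

def P(bit: int, alpha: bytes) -> bytes:
    """P_0 = prefix half, P_1 = suffix half of a double-length string."""
    half = len(alpha) // 2
    return alpha[:half] if bit == 0 else alpha[half:]

def G_bit(bit: int, label: bytes) -> bytes:
    """G_0(label) or G_1(label)."""
    return P(bit, G(label))

def hybrid(labels, k, x_bits):
    """Evaluate the function given by 2**k level-k labels at query x_bits.

    The first k bits of the query select a label; the remaining bits
    walk down the tree by applying G_0 or G_1.
    """
    assert len(labels) == 2 ** k
    idx = 0
    for b in x_bits[:k]:
        idx = 2 * idx + b
    label = labels[idx]
    for b in x_bits[k:]:
        label = G_bit(b, label)
    return label

def f(seed, x_bits):
    """The function f_s of the construction: the 0-hybrid with root label s."""
    return hybrid([seed], 0, x_bits)

# Level-1 labels r_0, r_1 and the level-2 labels derived from them via G.
r = [bytes([j]) * 16 for j in range(2)]
s = [P(sigma, G(rj)) for rj in r for sigma in (0, 1)]
for x in product((0, 1), repeat=4):
    assert hybrid(s, 2, x) == hybrid(r, 1, x)
```

When the level-$(k+1)$ labels are instead chosen independently and uniformly, the same `hybrid` routine evaluates a sample of the $(k+1)$st hybrid, which is the other case considered in the claim.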
For the general case we consider an alternative way of defining the random variable $H_n^m$, for every $0 \le m \le n$. This alternative way is somewhat similar to the way in which $D$ answers the queries of the oracle machine $M$. (We use the symbol $m$ instead of $k$ since $m$ does not necessarily equal the checkpoint, denoted $k$, chosen by algorithm $D$.) This way of defining $H_n^m$ consists of the interleaving of two random processes, which together first select at random a function $g : \{0,1\}^m \mapsto \{0,1\}^n$, that is later used to determine a function $f : \{0,1\}^n \mapsto \{0,1\}^n$. The first random process, denoted $\rho$, is an arbitrary process ("given to us from the outside"), which specifies points in the domain of $g$. (The process $\rho$ corresponds to the queries of $M$, whereas the second process corresponds to the way $D$ answers these queries.) The second process, denoted $\psi$, assigns uniformly selected $n$-bit long strings to every new point specified by $\rho$, thus defining the value of $g$ on this point. We stress that in case $\rho$ specifies an old point (i.e., a point for which $g$ is already defined) then the second process does nothing (i.e., the value of $g$ at this point is left unchanged). The process $\rho$ may depend on the history of the two processes, and in particular on the values chosen for the previous points. When $\rho$ terminates the second process (i.e., $\psi$) selects random values for the remaining undefined points (in case such exist). We stress that the second process (i.e., $\psi$) is fixed for all possible choices of a ("first") process $\rho$. The rest of this paragraph gives a detailed description of the interleaving of the two random processes (and may be skipped). We consider a randomized process $\rho$ mapping sequences of $n$-bit strings (representing the history) to single $m$-bit strings. We stress that $\rho$ is not necessarily memoryless (and hence may "remember" its previous random choices).
Namely, for every fixed sequence $v_1,\ldots,v_i \in \{0,1\}^n$, the random variable $\rho(v_1,\ldots,v_i)$ is (arbitrarily) distributed over $\{0,1\}^m \cup \{\perp\}$ where $\perp$ is a special symbol denoting termination. A "random" function $g : \{0,1\}^m \mapsto \{0,1\}^n$ is defined by iterating the process $\rho$ with the random process $\psi$ defined below. Process $\psi$ starts with $g$ which is undefined on every point in its domain. At the $i$th iteration $\psi$ lets $p_i \stackrel{\text{def}}{=} \rho(v_1,\ldots,v_{i-1})$ and, assuming $p_i \ne \perp$, sets $v_i \stackrel{\text{def}}{=} v_j$ if $p_i = p_j$ for some $j < i$ and lets $v_i$ be uniformly distributed in $\{0,1\}^n$ otherwise. In the latter case (i.e., $p_i$ is new and hence $g$ is not yet defined on $p_i$), $\psi$ sets $g(p_i) \stackrel{\text{def}}{=} v_i$ (in fact $g(p_i) = g(p_j) = v_j = v_i$ also in case $p_i = p_j$ for some $j < i$). When $\rho$ terminates, i.e., $\rho(v_1,\ldots,v_T) = \perp$ for some $T$, $\psi$ completes the function $g$ (if necessary) by choosing independently and uniformly in $\{0,1\}^n$ values for the points at which $g$ is undefined yet. (Alternatively, we may augment the process $\rho$ so that it terminates only after specifying all possible $m$-bit strings.)

Once a function $g$ is totally defined, we define a function $f^g : \{0,1\}^n \mapsto \{0,1\}^n$ by

$$f^g(\sigma_1\sigma_2\cdots\sigma_n) \stackrel{\text{def}}{=} G_{\sigma_n}(\cdots(G_{\sigma_{m+2}}(G_{\sigma_{m+1}}(g(\sigma_m\cdots\sigma_1))))\cdots)$$

The reader can easily verify that $f^g$ equals $f_{g(0^m),\ldots,g(1^m)}$ (as defined in the hybrid construction above). Also, one can easily verify that the above random process (i.e., the interleaving of $\psi$ with any $\rho$) yields a function $g$ that is uniformly distributed over the set of all possible functions mapping $m$-bit strings to $n$-bit strings. It follows that the above described random process yields a result (i.e., a function) that is distributed identically to the random variable $H_n^m$.

Suppose now that the checkpoint chosen by $D$ equals $k$ and that $D$'s inputs are independently and uniformly selected in $\{0,1\}^{2n}$. In this case the way in which $D$ answers $M$'s queries can be viewed as placing independently and uniformly selected $n$-bit strings as the labels of the $(k+1)$-level vertices. It follows that the way in which $D$ answers $M$'s queries corresponds to the above described process with $m = k+1$ (with $M$ playing the role of $\rho$ and $D$ playing the role of $\psi$). Hence, in this case $M$ is invoked with access to the $(k+1)$st hybrid random variable.

Suppose, on the other hand, that the checkpoint chosen by $D$ equals $k$ and that $D$'s inputs are independently selected so that each is distributed identically to $G(U_n)$. In this case the way in which $D$ answers $M$'s queries can be viewed as placing independently and uniformly selected $n$-bit strings as the labels of the $k$-level vertices. It follows that the way in which $D$ answers $M$'s queries corresponds to the above described process with $m = k$. Hence, in this case $M$ is invoked with access to the $k$th hybrid random variable. $\Box$

Using Claim 3.34.1, it follows that

$$\left|\mathrm{Prob}\left(D(G(U_n^{(1)}),\ldots,G(U_n^{(t)}))=1\right) - \mathrm{Prob}\left(D(U_{2n}^{(1)},\ldots,U_{2n}^{(t)})=1\right)\right| = \frac{\Delta(n)}{n}$$

which, by the contradiction hypothesis, is greater than $\frac{1}{n\cdot p(n)}$, for infinitely many $n$'s. Using Theorem 3.6, we derive a contradiction to the hypothesis (of the current theorem) that $G$ is a pseudorandom generator, and the current theorem follows.

3.7 * Pseudorandom Permutations

In this section we present definitions and constructions for pseudorandom permutations. Clearly, pseudorandom permutations (over huge domains) can be used instead of pseudorandom functions in any efficient application, yet pseudorandom permutations offer the extra advantage of having unique preimages. This extra advantage may be useful sometimes, but not always (e.g., it is not used in the rest of this book). The construction of pseudorandom permutations uses pseudorandom functions as a building block, in a manner identical to the high level structure of the DES.
Hence, the proof presented in this section can be viewed as supporting the DES's methodology of converting "randomly looking" functions into "randomly looking" permutations. (The fact that in the DES this methodology is applied to functions which are not "randomly looking" is not of our concern here.)

3.7.1 Definitions

We start with the definition of pseudorandom permutations. Loosely speaking, a pseudorandom ensemble of permutations is defined analogously to a pseudorandom ensemble of functions. Namely,

Definition 3.35 (permutation ensembles): A permutation ensemble is a sequence $P = \{P_n\}_{n\in\mathbb{N}}$ of random variables, so that the random variable $P_n$ assumes values in the set of permutations mapping $n$-bit long strings to $n$-bit long strings. The uniform permutation ensemble, denoted $K = \{K_n\}_{n\in\mathbb{N}}$, has $K_n$ uniformly distributed over the set of permutations mapping $n$-bit long strings to $n$-bit long strings.

Every permutation ensemble is a function ensemble. Hence, the definition of an efficiently computable permutation ensemble is obvious (i.e., it is derived from the definition of an efficiently computable function ensemble). Pseudorandom permutations are defined as computationally indistinguishable from the uniform permutation ensemble.

Definition 3.36 (pseudorandom permutation ensembles): A permutation ensemble, $P = \{P_n\}_{n\in\mathbb{N}}$, is called pseudorandom if for every probabilistic polynomial-time oracle machine $M$, every polynomial $p(\cdot)$ and all sufficiently large $n$'s

$$\left|\mathrm{Prob}(M^{P_n}(1^n)=1) - \mathrm{Prob}(M^{K_n}(1^n)=1)\right| < \frac{1}{p(n)}$$

where $K = \{K_n\}_{n\in\mathbb{N}}$ is the uniform permutation ensemble.

The fact that $P$ is a pseudorandom permutation ensemble rather than just being a pseudorandom function ensemble cannot be detected in poly($n$)-time by an observer given oracle access to $P_n$. This fact stems from the observation that the uniform permutation ensemble is polynomial-time indistinguishable from the uniform function ensemble.
Namely,

Proposition 3.37 The uniform permutation ensemble (i.e., $K = \{K_n\}_{n\in\mathbb{N}}$) constitutes a pseudorandom function ensemble.

Proof Sketch: The probability that an oracle machine detects a collision in the oracle-function, when given access to $H_n$, is bounded by $t^2 \cdot 2^{-n}$, where $t$ denotes the number of queries made by the machine. Conditioned on not finding such a collision, the answers of $H_n$ are indistinguishable from those of $K_n$. Finally, using the fact that a polynomial-time machine can ask at most polynomially many queries, the proposition follows.

Hence, using pseudorandom permutations instead of pseudorandom functions has reasons beyond the question of whether a computationally restricted observer can detect the difference. Typically, the reason is that one wants to be guaranteed of the uniqueness of preimages. A natural strengthening of this requirement is to require that, given the description of the permutation, the (unique) preimage can be efficiently found.

Definition 3.38 (efficiently computable and invertible permutation ensembles): A permutation ensemble, $P = \{P_n\}_{n\in\mathbb{N}}$, is called efficiently computable and invertible if the following three conditions hold

1. (efficient indexing): There exists a probabilistic polynomial time algorithm, $I$, and a mapping from strings to permutations, $\phi$, so that $\phi(I(1^n))$ and $P_n$ are identically distributed.
2. (efficient evaluation): There exists a probabilistic polynomial time algorithm, $V$, so that $V(i,x) = f_i(x)$, where (as before) $f_i \stackrel{\text{def}}{=} \phi(i)$.
3. (efficient inversion): There exists a probabilistic polynomial time algorithm, $N$, so that $N(i,x) = f_i^{-1}(x)$ (i.e., $f_i(N(i,x)) = x$).

Items (1) and (2) are guaranteed by the definition of an efficiently computable permutation ensemble. The additional requirement is stated in item (3).
In some settings it makes sense to augment also the definition of a pseudorandom ensemble by requiring that the ensemble cannot be distinguished from the uniform one even when the observer gets access to two oracles: one for the permutation and the other for its inverse.

Definition 3.39 (strong pseudorandom permutations): A permutation ensemble, $P = \{P_n\}_{n\in\mathbb{N}}$, is called strongly pseudorandom if for every probabilistic polynomial-time oracle machine $M$, every polynomial $p(\cdot)$ and all sufficiently large $n$'s

$$\left|\mathrm{Prob}(M^{P_n,P_n^{-1}}(1^n)=1) - \mathrm{Prob}(M^{K_n,K_n^{-1}}(1^n)=1)\right| < \frac{1}{p(n)}$$

where $M^{f,g}$ can ask queries to both of its oracles (e.g., query $(1,q)$ is answered by $f(q)$, whereas query $(2,q)$ is answered by $g(q)$).

3.7.2 Construction

The construction of pseudorandom permutations uses pseudorandom functions as a building block, in a manner identical to the high level structure of the DES. Namely,

Construction 3.40 Let $f : \{0,1\}^n \mapsto \{0,1\}^n$. For every $x,y \in \{0,1\}^n$, we define

$$\mathrm{DES}_f(x,y) \stackrel{\text{def}}{=} (y,\ x \oplus f(y))$$

where $x \oplus y$ denotes the bit-by-bit exclusive-or of the binary strings $x$ and $y$. Likewise, for $f_1,\ldots,f_t : \{0,1\}^n \mapsto \{0,1\}^n$, we define

$$\mathrm{DES}_{f_t,\ldots,f_1}(x,y) \stackrel{\text{def}}{=} \mathrm{DES}_{f_t,\ldots,f_2}(\mathrm{DES}_{f_1}(x,y))$$

For every function ensemble $F = \{F_n\}_{n\in\mathbb{N}}$, and every function $t : \mathbb{N} \mapsto \mathbb{N}$, we define the function ensemble $\{\mathrm{DES}^{t(n)}_{F_n}\}_{n\in\mathbb{N}}$ by letting $\mathrm{DES}^{t(n)}_{F_n} \stackrel{\text{def}}{=} \mathrm{DES}_{F_n^{(t)},\ldots,F_n^{(1)}}$, where $t = t(n)$ and the $F_n^{(i)}$'s are independent copies of the random variable $F_n$.

Theorem 3.41 Let $F_n$, $t(\cdot)$, and $\mathrm{DES}^{t(n)}_{F_n}$ be as in Construction 3.40 (above). Then, for every polynomial-time computable function $t(\cdot)$, the ensemble $\{\mathrm{DES}^{t(n)}_{F_n}\}_{n\in\mathbb{N}}$ is an efficiently computable and invertible permutation ensemble. Furthermore, if $F = \{F_n\}_{n\in\mathbb{N}}$ is a pseudorandom function ensemble then the ensemble $\{\mathrm{DES}^{3}_{F_n}\}_{n\in\mathbb{N}}$ is pseudorandom, and the ensemble $\{\mathrm{DES}^{4}_{F_n}\}_{n\in\mathbb{N}}$ is strongly pseudorandom.

Clearly, the ensemble $\{\mathrm{DES}^{t(n)}_{F_n}\}_{n\in\mathbb{N}}$ is efficiently computable.
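Construction 3.40 can be sketched in a few lines. The sketch below is illustrative only: strings are represented as equal-length `bytes`, the round functions are passed as a list `[f_1, ..., f_t]` in order of application, and `keyed_function` is a hypothetical SHA-256 based stand-in for one sample of $F_n$ (it is not claimed to be a pseudorandom function). The inverse routine relies only on the fact that a single round $(x,y)\mapsto(y, x\oplus f(y))$ can be undone by one more evaluation of $f$, whether or not $f$ itself is invertible.

```python
import hashlib

def xor(a: bytes, b: bytes) -> bytes:
    """Bit-by-bit exclusive-or of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))

def des_round(f, x: bytes, y: bytes):
    """One round: DES_f(x, y) = (y, x XOR f(y))."""
    return y, xor(x, f(y))

def des_round_inverse(f, u: bytes, v: bytes):
    """Undo one round: if (u, v) = (y, x XOR f(y)) then x = v XOR f(u), y = u."""
    return xor(v, f(u)), u

def des(fs, x: bytes, y: bytes):
    """DES_{f_t,...,f_1}(x, y), with fs = [f_1, ..., f_t] applied in that order."""
    for f in fs:
        x, y = des_round(f, x, y)
    return x, y

def des_inverse(fs, u: bytes, v: bytes):
    """Inverse of des(fs, ., .): undo the rounds in reverse order."""
    for f in reversed(fs):
        u, v = des_round_inverse(f, u, v)
    return u, v

def keyed_function(key: bytes, n_bytes: int):
    """Hypothetical stand-in for a sample of F_n (assumption: n_bytes <= 32)."""
    def f(z: bytes) -> bytes:
        return hashlib.sha256(key + z).digest()[:n_bytes]
    return f

# Three rounds with independent keys, as in the ensemble DES^3 of Theorem 3.41.
fs = [keyed_function(bytes([i]) * 16, 8) for i in range(3)]
x, y = b"12345678", b"abcdefgh"
assert des_inverse(fs, *des(fs, x, y)) == (x, y)
```

Note that `des` is a permutation of pairs even when a round function is constant, which is why the construction turns arbitrary functions into permutations.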
The fact that it is a permutation ensemble, and furthermore one with an efficient inverting algorithm, follows from the observation that for every $x,y \in \{0,1\}^n$

$$\begin{aligned}
\mathrm{DES}_{\mathrm{zero},f,\mathrm{zero}}(\mathrm{DES}_f(x,y)) &= \mathrm{DES}_{\mathrm{zero},f,\mathrm{zero}}(y,\ x\oplus f(y))\\
&= \mathrm{DES}_{\mathrm{zero},f}(x\oplus f(y),\ y)\\
&= \mathrm{DES}_{\mathrm{zero}}(y,\ (x\oplus f(y))\oplus f(y))\\
&= (x,y)
\end{aligned}$$

where $\mathrm{zero}(z) \stackrel{\text{def}}{=} 0^{|z|}$ for all $z \in \{0,1\}^n$. (The two applications of $\mathrm{DES}_{\mathrm{zero}}$ merely swap the two halves; by the definition in Construction 3.40 the rightmost subscript is applied first.)

To prove the pseudorandomness of $\{\mathrm{DES}^3_{F_n}\}_{n\in\mathbb{N}}$ (resp., strong pseudorandomness of $\{\mathrm{DES}^4_{F_n}\}_{n\in\mathbb{N}}$) it suffices to prove the pseudorandomness of $\{\mathrm{DES}^3_{H_n}\}_{n\in\mathbb{N}}$ (resp., strong pseudorandomness of $\{\mathrm{DES}^4_{H_n}\}_{n\in\mathbb{N}}$). The reason being that if, say, $\{\mathrm{DES}^3_{H_n}\}_{n\in\mathbb{N}}$ is pseudorandom while $\{\mathrm{DES}^3_{F_n}\}_{n\in\mathbb{N}}$ is not, then one can derive a contradiction to the pseudorandomness of the function ensemble $F$ (i.e., a hybrid argument is used to bridge between the three copies of $H_n$ and the three copies of $F_n$). Hence, Theorem 3.41 follows from

Proposition 3.42 $\{\mathrm{DES}^3_{H_n}\}_{n\in\mathbb{N}}$ is pseudorandom, whereas $\{\mathrm{DES}^4_{H_n}\}_{n\in\mathbb{N}}$ is strongly pseudorandom.

Proof Sketch: We start by proving that $\{\mathrm{DES}^3_{H_n}\}_{n\in\mathbb{N}}$ is pseudorandom. Let $P_{2n} \stackrel{\text{def}}{=} \mathrm{DES}^3_{H_n}$, and $K_{2n}$ be the random variable uniformly distributed over all possible permutations acting on $\{0,1\}^{2n}$. We prove that for every oracle machine, $M$, that, on input $1^n$, asks at most $m$ queries, it holds that

$$\left|\mathrm{Prob}(M^{P_{2n}}(1^n)=1) - \mathrm{Prob}(M^{K_{2n}}(1^n)=1)\right| \le \frac{2m^2}{2^n}$$

Let $q_i = (L_i^0, R_i^0)$, with $|L_i^0| = |R_i^0| = n$, denote the random variable representing the $i$th query of $M$ when given access to oracle $P_{2n}$. Recall that $P_{2n} = \mathrm{DES}_{H_n^{(3)},H_n^{(2)},H_n^{(1)}}$, where the $H_n^{(j)}$'s are three independent random variables each uniformly distributed over the functions acting on $\{0,1\}^n$. Let $R_i^{k+1} \stackrel{\text{def}}{=} L_i^k \oplus H_n^{(k+1)}(R_i^k)$ and $L_i^{k+1} \stackrel{\text{def}}{=} R_i^k$, for $k = 0,1,2$. We assume, without loss of generality, that $M$ never asks the same query twice.
We define a random variable $\zeta_m$ representing the event "there exist $i < j \le m$ and $k \in \{1,2\}$ so that $R_i^k = R_j^k$" (namely, "on input $1^n$ and access to oracle $P_{2n}$, two of the $m$ first queries of $M$ satisfy the relation $R_i^k = R_j^k$"). Using induction on $m$, the reader can prove concurrently the following two claims (see guidelines below).

Claim 3.42.1: Given $\neg\zeta_m$, we have the $R_i^3$'s uniformly distributed over $\{0,1\}^n$ and the $L_i^3$'s uniformly distributed over the $n$-bit strings not assigned to previous $L_j^3$'s. Namely, for every $\alpha_1,\ldots,\alpha_m \in \{0,1\}^n$

$$\mathrm{Prob}\left(\wedge_{i=1}^{m} (R_i^3 = \alpha_i) \mid \neg\zeta_m\right) = \left(\frac{1}{2^n}\right)^{m}$$

whereas, for every distinct $\beta_1,\ldots,\beta_m \in \{0,1\}^n$

$$\mathrm{Prob}\left(\wedge_{i=1}^{m} (L_i^3 = \beta_i) \mid \neg\zeta_m\right) = \prod_{i=1}^{m} \frac{1}{2^n - i + 1}$$

Claim 3.42.2:

$$\mathrm{Prob}\left(\zeta_{m+1} \mid \neg\zeta_m\right) \le \frac{2m}{2^n}$$

Proof Idea: The proof of Claim 3.42.1 follows by observing that the $R_i^3$'s are determined by applying the random function $H_n^{(3)}$ to different arguments (i.e., the $R_i^2$'s), whereas the $L_i^3 = R_i^2$'s are determined by applying the random function $H_n^{(2)}$ to different arguments (i.e., the $R_i^1$'s) and conditioning that the $R_i^2$'s are different. The proof of Claim 3.42.2 follows by considering the probability that $R_{m+1}^k = R_i^k$, for some $i \le m$ and $k \in \{1,2\}$. Say that $R_i^0 = R_{m+1}^0$; then certainly (by recalling $q_i \ne q_{m+1}$) we have

$$R_i^1 = L_i^0 \oplus H_n^{(1)}(R_i^0) = L_i^0 \oplus H_n^{(1)}(R_{m+1}^0) \ne L_{m+1}^0 \oplus H_n^{(1)}(R_{m+1}^0) = R_{m+1}^1$$

On the other hand, say that $R_i^0 \ne R_{m+1}^0$; then

$$\mathrm{Prob}\left(R_i^1 = R_{m+1}^1\right) = \mathrm{Prob}\left(H_n^{(1)}(R_i^0) \oplus H_n^{(1)}(R_{m+1}^0) = L_i^0 \oplus L_{m+1}^0\right) = 2^{-n}$$

Furthermore, if $R_i^1 \ne R_{m+1}^1$ then

$$\mathrm{Prob}\left(R_i^2 = R_{m+1}^2\right) = \mathrm{Prob}\left(H_n^{(2)}(R_i^1) \oplus H_n^{(2)}(R_{m+1}^1) = R_i^0 \oplus R_{m+1}^0\right) = 2^{-n}$$

Hence, both claims follow. $\Box$

Combining the above claims, we conclude that $\mathrm{Prob}(\zeta_m) < \frac{m^2}{2^n}$, and furthermore, given that $\zeta_m$ is false, the answers of $P_{2n}$ have left half uniformly chosen among all $n$-bit strings not appearing as left halves in previous answers, whereas the right half is uniformly distributed among all $n$-bit strings. On the other hand, the answers of $K_{2n}$ are uniformly distributed among all $2n$-bit strings not appearing as previous answers. Hence, the statistical difference between the distribution of answers in the two cases (i.e., answers by $P_{2n}$ or by $K_{2n}$) is bounded by $\frac{2m^2}{2^n}$. The first part of the proposition follows.

The proof that $\{\mathrm{DES}^4_{H_n}\}_{n\in\mathbb{N}}$ is strongly pseudorandom is more complex, yet uses essentially the same ideas. In particular, the event corresponding to $\zeta_m$ is the disjunction of four types of events. Events of the first type are of the form $R_i^k = R_j^k$ for $k \in \{2,3\}$, where $q_i = (L_i^0,R_i^0)$ and $q_j = (L_j^0,R_j^0)$ are queries of the forward direction. Similarly, events of the second type are of the form $R_i^k = R_j^k$ for $k \in \{2,1\}$, where $q_i = (L_i^4,R_i^4)$ and $q_j = (L_j^4,R_j^4)$ are queries of the backward direction. Events of the third type are of the form $R_i^k = R_j^k$ for $k \in \{2,3\}$, where $q_i = (L_i^0,R_i^0)$ is of the forward direction, $q_j = (L_j^4,R_j^4)$ is of the backward direction, and $j < i$. Similarly, events of the fourth type are of the form $R_i^k = R_j^k$ for $k \in \{2,1\}$, where $q_i = (L_i^4,R_i^4)$ is of the backward direction, $q_j = (L_j^0,R_j^0)$ is of the forward direction, and $j < i$. As before, one bounds the probability of event $\zeta_m$, and bounds the statistical distance between answers by $K_{2n}$ and answers by $\mathrm{DES}^4_{H_n}$ given that $\zeta_m$ is false.

3.8 Miscellaneous

3.8.1 Historical Notes

The notion of computationally indistinguishable ensembles was first presented by Goldwasser and Micali (in the context of encryption schemes) [GM82]. In the general setting, the notion first appears in Yao's work, which is also the origin of the definition of pseudorandomness [Y82]. Yao also observed that pseudorandom ensembles can be very far from uniform, yet our proof of Proposition 3.3 is taken from [GK89a].

Pseudorandom generators were introduced by Blum and Micali [BM82], who defined such generators as producing sequences which are unpredictable.
Blum and Micali proved that such pseudorandom generators do exist assuming the intractability of the discrete logarithm problem. Furthermore, they presented a general paradigm for constructing pseudorandom generators, which has been used explicitly or implicitly in all subsequent developments. Other suggestions for pseudorandom generators were made soon after by Goldwasser et al. [GMT82] and Blum et al. [BBS82]. Consequently, Yao proved that the existence of any one-way permutation implies the existence of pseudorandom generators [Y82]. Yao was the first to characterize pseudorandom generators as producing sequences which are computationally indistinguishable from uniform. He also proved that this characterization of pseudorandom generators is equivalent to the characterization of Blum and Micali [BM82].

Generalizations of Yao's result, that one-way permutations imply pseudorandom generators, were proven by Levin [L85] and by Goldreich et al. [GKL88], culminating with the result of Håstad et al. [H90,ILL89] which asserts that pseudorandom generators exist if and only if one-way functions exist. The constructions presented in Section 3.5 follow the ideas of [GKL88] and [ILL89]. These constructions make extensive use of universal$_2$ hashing functions, which were introduced by Carter and Wegman [CW] and first used in complexity theory by Sipser [S82].

Pseudorandom functions were introduced and investigated by Goldreich et al. [GGM84]. In particular, the construction of pseudorandom functions based on pseudorandom generators is taken from [GGM84]. Pseudorandom permutations were defined and constructed by Luby and Rackoff [LR86], and our presentation follows their work.

Author's Note: Pseudorandom functions have many applications to cryptography, some of which were to be presented in other chapters of the book (e.g., on signatures and encryption). As these chapters were not written, the reader is referred to [GGM84b] and [G87b,O89].

The hybrid method originates from the work of Goldwasser and Micali [GM82]. The terminology is due to Leonid Levin.

3.8.2 Suggestion for Further Reading

Section 3.5 falls short of presenting the construction of Håstad et al. [HILL], not to mention proving its validity. Unfortunately, the proof of this fundamental theorem, asserting that pseudorandom generators exist if one-way functions exist, is too complicated to fit in a book of the current nature. The interested reader is thus referred to the original paper of Håstad et al. [HILL] (which combines the results in [H90,ILL89]) and to Luby's book [L94book].

Simple pseudorandom generators based on specific intractability assumptions are presented in [BM82,BBS82,ACGS84,VV84,K88]. In particular, [ACGS84] presents pseudorandom generators based on the intractability of factoring, whereas [K88] presents pseudorandom generators based on the intractability of discrete logarithm problems. In both cases, the major step is the construction of hard-core predicates for the corresponding collections of one-way permutations.

Proposition 3.3 presents a pair of ensembles which are computationally indistinguishable although they are statistically far apart. One of the two ensembles is not constructible in polynomial-time. Goldreich showed that a pair of polynomial-time constructible ensembles having the above property (i.e., being both computationally indistinguishable and having a non-negligible statistical difference) exists if and only if one-way functions exist [G90ipl].

Author's Note: G90ipl has appeared in IPL, Vol. 34, pp. 277-281.

Readers interested in Kolmogorov complexity are referred to [WHAT?]

3.8.3 Open Problems

Although Håstad et al. [HILL] showed how to construct pseudorandom generators given any one-way function, their construction is not practical.
The reason being that the "quality" of the generator on seeds of length $n$ is related to the hardness of inverting the given function on inputs of length $< \sqrt[4]{n}$. We believe that presenting an efficient transformation of arbitrary one-way functions to pseudorandom generators is one of the most important open problems of the area.

An open problem of more practical importance is to try to present even more efficient pseudorandom generators based on the intractability of specific computational problems like integer factorization. For further details see Subsection 2.7.3.

3.8.4 Exercises

Exercise 1: computational indistinguishability is preserved by efficient algorithms: Let $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ be two ensembles that are polynomial-time indistinguishable, and let $A$ be a probabilistic polynomial-time algorithm. Prove that the ensembles $\{A(X_n)\}_{n\in\mathbb{N}}$ and $\{A(Y_n)\}_{n\in\mathbb{N}}$ are polynomial-time indistinguishable.

Exercise 2: statistical closeness is preserved by any function: Let $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ be two ensembles that are statistically close, and let $f : \{0,1\}^* \mapsto \{0,1\}^*$ be a function. Prove that the ensembles $\{f(X_n)\}_{n\in\mathbb{N}}$ and $\{f(Y_n)\}_{n\in\mathbb{N}}$ are statistically close.

Exercise 3: Prove that for every $L \in BPP$ and every pair of polynomial-time indistinguishable ensembles, $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$, it holds that the function

$$\Delta_L(n) \stackrel{\text{def}}{=} \left|\mathrm{Prob}(X_n \in L) - \mathrm{Prob}(Y_n \in L)\right|$$

is negligible in $n$. It is tempting to think that the converse holds as well, but we don't know if it does; note that $\{X_n\}$ and $\{Y_n\}$ may be distinguished by a probabilistic algorithm, but not by a deterministic one. In such a case, which language should we define? For example, suppose that $A$ is a probabilistic polynomial-time algorithm and let $L \stackrel{\text{def}}{=} \{x : \mathrm{Prob}(A(x)=1) \ge \frac{1}{2}\}$; then $L$ is not necessarily in $BPP$.

Exercise 4: An equivalent formulation of statistical closeness: In the non-computational setting both the above and its converse are true and can be easily proven.
Namely, prove that two ensembles, $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$, are statistically close if and only if for every set $S \subseteq \{0,1\}^*$,
$$\Delta_S(n) \stackrel{\rm def}{=} |\mathrm{Prob}(X_n \in S) - \mathrm{Prob}(Y_n \in S)|$$
is negligible in $n$.

Exercise 5: statistical closeness implies computational indistinguishability: Prove that if two ensembles are statistically close then they are polynomial-time indistinguishable. (Guideline: use the result of the previous exercise, and define for every function $f : \{0,1\}^* \to \{0,1\}$ a set $S_f \stackrel{\rm def}{=} \{x : f(x)=1\}$.)

Exercise 6: computational indistinguishability by circuits - probabilism versus determinism: Let $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ be two ensembles, and $C \stackrel{\rm def}{=} \{C_n\}_{n\in\mathbb{N}}$ be a family of probabilistic polynomial-size circuits. Prove that there exists a family of (deterministic) polynomial-size circuits, $D \stackrel{\rm def}{=} \{D_n\}_{n\in\mathbb{N}}$, so that for every $n$
$$\Delta_D(n) \geq \Delta_C(n)$$
where
$$\Delta_D(n) \stackrel{\rm def}{=} |\mathrm{Prob}(D_n(X_n)=1) - \mathrm{Prob}(D_n(Y_n)=1)|$$
$$\Delta_C(n) \stackrel{\rm def}{=} |\mathrm{Prob}(C_n(X_n)=1) - \mathrm{Prob}(C_n(Y_n)=1)|$$

Exercise 7: computational indistinguishability by circuits - single sample versus several samples: We say that the ensembles $X = \{X_n\}_{n\in\mathbb{N}}$ and $Y = \{Y_n\}_{n\in\mathbb{N}}$ are indistinguishable by polynomial-size circuits if for every family, $\{C_n\}_{n\in\mathbb{N}}$, of (deterministic) polynomial-size circuits, for every polynomial $p(\cdot)$ and all sufficiently large $n$'s
$$|\mathrm{Prob}(C_n(X_n)=1) - \mathrm{Prob}(C_n(Y_n)=1)| < \frac{1}{p(n)}$$
Prove that $X$ and $Y$ are indistinguishable by polynomial-size circuits if and only if their $m(\cdot)$-products are indistinguishable by polynomial-size circuits, for every polynomial $m(\cdot)$. (Guideline: $X$ and $Y$ need not be polynomial-time constructible! Yet, a "good choice" of $x^1,\ldots,x^k$ and $y^{k+2},\ldots,y^m$ may be "hard-wired" into the circuit.)

Exercise 8: On the general definition of a pseudorandom generator: Let $G$ be a pseudorandom generator (by Definition 3.8), and let $\{U_{l(n)}\}_{n\in\mathbb{N}}$ be polynomial-time indistinguishable from $\{G(U_n)\}_{n\in\mathbb{N}}$. Prove that the probability that $G(U_n)$ has length not equal to $l(n)$ is negligible (in $n$). (Guideline: Consider an algorithm that for some polynomial $p(\cdot)$ proceeds as follows. On input $1^n$ and a string $\alpha$ to be tested, the algorithm first samples $G(U_n)$ for $p(n)$ times and records the length of the shortest string found. Next the algorithm outputs 1 if and only if $\alpha$ is longer than the length recorded.)

Exercise 9: Consider a modification of Construction 3.10, where $s_i\sigma_i = G_1(s_{i-1})$ is used instead of $\sigma_i s_i = G_1(s_{i-1})$. Provide a simple proof that the resulting algorithm is also pseudorandom. (Guideline: don't modify the proof of Theorem 3.11, but rather modify $G_1$ itself.)

Exercise 10: Let $G$ be a pseudorandom generator, and $h$ be a polynomial-time computable permutation (over strings of the same length). Prove that $G'$ and $G''$ defined by $G'(s) \stackrel{\rm def}{=} h(G(s))$ and $G''(s) \stackrel{\rm def}{=} G(h(s))$ are both pseudorandom generators.

Exercise 11: Let $G$ be a pseudorandom generator, and $h$ be a permutation (over strings of the same length) that is not necessarily polynomial-time computable.
1. Is $G'$ defined by $G'(s) \stackrel{\rm def}{=} h(G(s))$ necessarily a pseudorandom generator?
2. Is $G''$ defined by $G''(s) \stackrel{\rm def}{=} G(h(s))$ necessarily a pseudorandom generator?
(Guideline: you may assume that there exist one-way permutations.)

Exercise 12: Alternative construction of pseudorandom generators with large expansion factor: Let $G_1$ be a pseudorandom generator with expansion factor $l(n) = n + 1$, and let $p(\cdot)$ be a polynomial. Define $G(s)$ to be the result of applying $G_1$ iteratively $p(|s|)$ times on $s$ (i.e., $G(s) \stackrel{\rm def}{=} G_1^{p(|s|)}(s)$ where $G_1^0(s) \stackrel{\rm def}{=} s$ and $G_1^{i+1}(s) \stackrel{\rm def}{=} G_1(G_1^i(s))$). Prove that $G$ is a pseudorandom generator. What are the advantages of using Construction 3.10?

Exercise 13: Sequential Pseudorandom Generator: An oracle machine is called a sequential observer if its queries constitute a prefix of the natural numbers. Namely, on input $1^n$, the sequential observer makes queries $1, 2, 3, \ldots$. Consider the following two experiments with a sequential observer having input $1^n$:
1. The observer's queries are answered by independent flips of an unbiased coin.
2. The observer's queries are answered as follows. First a random seed, $s$, of length $n$ is uniformly chosen. The $i$th query is answered by the rightmost (i.e., the $i$th) bit of $g_n^i(s)$, where $g_n^i$ is defined as in the proof of Theorem 3.11.
Prove that a probabilistic polynomial-time observer cannot distinguish the two experiments, provided that $G$ used in the construction is a pseudorandom generator. Namely, the difference between the probability that the observer outputs 1 in the first experiment and the probability that the observer outputs 1 in the second experiment is a negligible function (in $n$).

Exercise 14: pseudorandomness implies unpredictability: Prove that all pseudorandom ensembles are unpredictable (in polynomial-time). (Guideline: Given an efficient predictor show how to construct an efficient distinguisher of the pseudorandom ensemble from the uniform one.)

Exercise 15: unpredictability implies pseudorandomness: Let $X = \{X_n\}_{n\in\mathbb{N}}$ be an ensemble such that there exists a function $l : \mathbb{N} \to \mathbb{N}$ so that $X_n$ ranges over strings of length $l(n)$, and $l(n)$ can be computed in time poly($n$). Prove that if $X$ is unpredictable (in polynomial-time) then it is pseudorandom. (Guideline: Given an efficient distinguisher, $D$, of $X$ from the uniform ensemble $\{U_{l(n)}\}_{n\in\mathbb{N}}$ show how to construct an efficient predictor. The predictor randomly selects $k \in \{0,\ldots,l(n)-1\}$, reads only the first $k$ bits of the input, and applies $D$ to the string obtained by augmenting the $k$-bit long prefix of the input with $l(n)-k$ uniformly chosen bits. If $D$ answers 1 then the predictor outputs the first of these random bits, else the predictor outputs the complementary value. Use a hybrid technique to evaluate the performance of the predictor. Extra hint: an argument analogous to that of the proof of Theorem 3.14 has to be used as well.)
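The predictor described in the guideline of Exercise 15 can be sketched in a few lines. The following is only an illustration under stated assumptions, not a solution of the exercise: the names `predict_next_bit`, `exact_success_probability` and `all_equal`, the toy ensemble (uniform over the two constant strings of length 3) and the toy distinguisher are all hypothetical choices made here. Instead of sampling, the sketch enumerates every choice of $k$ and of the padding bits, so the success probability is computed exactly.

```python
from fractions import Fraction
from itertools import product


def predict_next_bit(distinguisher, prefix, padding):
    """Guess the bit that follows `prefix`, as in the guideline.

    `padding` holds the uniformly chosen bits that extend the prefix to full
    length; its first bit is the candidate value for the next input bit.
    """
    verdict = distinguisher(tuple(prefix) + tuple(padding))
    return padding[0] if verdict == 1 else 1 - padding[0]


def exact_success_probability(distinguisher, support, length):
    """Probability that the guess equals the next bit, averaged over a uniform
    k in {0, ..., length-1}, a uniform sample from `support`, and the padding."""
    total = Fraction(0)
    for k in range(length):
        for sample in support:
            for padding in product((0, 1), repeat=length - k):
                guess = predict_next_bit(distinguisher, sample[:k], padding)
                if guess == sample[k]:
                    total += Fraction(1, length * len(support) * 2 ** (length - k))
    return total


def all_equal(bits):
    """Toy distinguisher (hypothetical): output 1 iff all bits agree."""
    return 1 if len(set(bits)) == 1 else 0


# Toy ensemble (hypothetical): uniform over the two constant strings of length 3.
support = [(0, 0, 0), (1, 1, 1)]
print(exact_success_probability(all_equal, support, 3))  # 3/4
```

In this toy instance the distinguisher accepts the ensemble with probability 1 and the uniform distribution with probability 1/4, and the predictor succeeds with probability 3/4, noticeably above 1/2. Relating the two quantities in general is the hybrid argument that the exercise asks for.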
Exercise 16: Construction of Hashing Families:
1. Consider the set $S_n^m$ of functions mapping $n$-bit long strings into $m$-bit strings as follows. A function $h$ in $S_n^m$ is represented by an $n$-by-$m$ binary matrix $A$, and an $m$-dimensional binary vector $b$. The $n$-dimensional binary vector $x$ is mapped by the function $h$ to the $m$-dimensional binary vector resulting by multiplying $x$ by $A$ and adding the vector $b$ to the resulting vector (i.e., $h(x) = xA + b$). Prove that $S_n^m$ so defined constitutes a hashing family (as defined in Section 3.5).
2. Repeat the above item when the $n$-by-$m$ matrices are restricted to be Toeplitz matrices. An $n$-by-$m$ Toeplitz matrix, $T = \{T_{i,j}\}$, satisfies $T_{i,j} = T_{i+1,j+1}$ for all $i, j$. Note that binary $n$-by-$m$ Toeplitz matrices can be represented by strings of length $n + m - 1$, whereas representing arbitrary $n$-by-$m$ binary matrices requires strings of length $n \cdot m$.

Exercise 17: Another Hashing Lemma: Let $m$, $n$, $S_n^m$, $b$, $X_n$ and $\epsilon$ be as in Lemma 3.18. Prove that, for every set $S \subseteq \{0,1\}^m$, and for all but a $2^{-(b-m+\log_2|S|)} \cdot \epsilon^{-2}$ fraction of the $h$'s in $S_n^m$, it holds that
$$\mathrm{Prob}(h(X_n) \in S) \in (1 \pm \epsilon) \cdot \frac{|S|}{2^m}$$
(Guideline: follow the proof of Lemma 3.18, defining $\zeta_x(h) = 1$ if $h(x) \in S$ and 0 otherwise.)

Exercise 18: Yet another Hashing Lemma: Let $m$, $n$, and $S_n^m$ be as above. Let $B \subseteq \{0,1\}^n$ and $S \subseteq \{0,1\}^m$ be sets, and let $b \stackrel{\rm def}{=} \log_2|B|$ and $s \stackrel{\rm def}{=} \log_2|S|$. Prove that, for all but a $\frac{2^m}{|B| \cdot |S|} \cdot \epsilon^{-2}$ fraction of the $h$'s in $S_n^m$, it holds that
$$|\{x \in B : h(x) \in S\}| \in (1 \pm \epsilon) \cdot \frac{|B| \cdot |S|}{2^m}$$
(Guideline: Define a random variable $X_n$ that is uniformly distributed over $B$.)

Exercise 19: Failure of an alternative construction of pseudorandom functions: Consider a construction of a function ensemble where the functions in $F_n$ are defined as follows. For every $s \in \{0,1\}^n$, the function $f_s$ is defined so that
$$f_s(x) \stackrel{\rm def}{=} G_{\sigma_n}(\cdots(G_{\sigma_2}(G_{\sigma_1}(x)))\cdots)$$
where $s = \sigma_1 \cdots \sigma_n$, and $G_\sigma$ is as in Construction 3.33. Namely, the roles of $x$ and $s$ in Construction 3.33 are switched (i.e., the root is labelled $x$ and the value of $f_s$ on $x$ is obtained by following the path corresponding to the index $s$). Prove that the resulting function ensemble is not necessarily pseudorandom (even if $G$ is a pseudorandom generator). (Guideline: Show, first, that if pseudorandom generators exist then there exists a pseudorandom generator $G$ satisfying $G(0^n) = 0^{2n}$.)

Exercise 20: Pseudorandom Generators with Direct Access: A direct access pseudorandom generator is a deterministic polynomial-time algorithm, $G$, for which no probabilistic polynomial-time oracle machine can distinguish the following two cases:
1. New queries of the oracle machine are answered by independent flips of an unbiased coin. (Repeating the same query yields the same answer.)
2. First, a random "seed", $s$, of length $n$ is uniformly chosen. Next, each query, $q$, is answered by $G(s, q)$.
The bit $G(s, i)$ may be thought of as the $i$th bit in a bit sequence corresponding to the seed $s$, where $i$ is represented in binary. Prove that the existence of (regular) pseudorandom generators implies the existence of pseudorandom generators with direct access.
Note that modifying the current definition, so that only unary queries are allowed, yields an alternative definition of a sequential pseudorandom generator (presented in Exercise 13 above). Evaluate the advantage of direct access pseudorandom generators over sequential pseudorandom generators in settings requiring direct access only to bits of a polynomially long pseudorandom sequence.

Exercise 21: other types of pseudorandom functions: Define pseudorandom predicate ensembles so that the random variable $F_n$ ranges over arbitrary Boolean predicates (i.e., functions in the range of $F_n$ are defined on all strings and have the form $f : \{0,1\}^* \to \{0,1\}$). Assuming the existence of pseudorandom generators, construct efficiently computable ensembles of pseudorandom Boolean functions. Do the same for ensembles of functions in which each function in the range of $F_n$ operates on the set of all strings (i.e., has the form $f : \{0,1\}^* \to \{0,1\}^*$). (Guideline: Use a modification of Construction 3.33 in which the building block is a pseudorandom generator expanding strings of length $n$ into strings of length $3n$.)

Exercise 22: An alternative definition of pseudorandom functions: For sake of simplicity this exercise is stated in terms of ensembles of Boolean functions as presented in the previous exercise. We say that a Boolean function ensemble, $F = \{F_n\}_{n\in\mathbb{N}}$, is unpredictable if for every probabilistic polynomial-time oracle machine, $M$, for every polynomial $p(\cdot)$ and for all sufficiently large $n$'s
$$\mathrm{Prob}(\mathrm{corr}_{F_n}(M^{F_n}(1^n))) < \frac{1}{2} + \frac{1}{p(n)}$$
where $M^{F_n}$ assumes values of the form $(x, \sigma) \in \{0,1\}^{n+1}$ so that $x$ is not a query appearing in the computation $M^{F_n}$, and $\mathrm{corr}_f(x, \sigma)$ is defined as the predicate "$f(x) = \sigma$". Intuitively, after getting the value of $f$ on points of its choice, the machine $M$ outputs a new point and tries to guess the value of $f$ on this point. Assuming that $F = \{F_n\}_{n\in\mathbb{N}}$ is efficiently computable, prove that $F$ is pseudorandom if and only if $F$ is unpredictable. (Guideline: A pseudorandom function ensemble is unpredictable since the uniform function ensemble is unpredictable. For the other direction use ideas analogous to those used in Exercise 14.)

Exercise 23: Another alternative definition of pseudorandom functions: Repeat the above exercise when modifying the definition of unpredictability so that the oracle machine gets $x \in \{0,1\}^n$ as input and, after querying the function $f$ on other points of its choice, the machine outputs a guess for $f(x)$.
Namely, we require that for every probabilistic polynomial-time oracle machine, $M$, that does not query the oracle on its own input, for every polynomial $p(\cdot)$, and for all sufficiently large $n$'s
$$\mathrm{Prob}(M^{F_n}(U_n) = F_n(U_n)) < \frac{1}{2} + \frac{1}{p(n)}$$

Exercise 24: Let $F_n$ and $\mathrm{DES}^t_{F_n}$ be as in Construction 3.40. Prove that, regardless of the choice of the ensemble $F = \{F_n\}_{n\in\mathbb{N}}$, the ensemble $\mathrm{DES}^2_{F_n}$ is not pseudorandom. Similarly, prove that the ensemble $\mathrm{DES}^3_{F_n}$ is not strongly pseudorandom. (Guideline: Start by showing that the ensemble $\mathrm{DES}^1_{F_n}$ is not pseudorandom.)

Chapter 4

Encryption Schemes

In this chapter we discuss the well-known notions of private-key and public-key encryption schemes. More importantly, we define what is meant by saying that such schemes are secure. We then turn to some basic constructions. We show that the widely used construction of a "stream cipher" yields a secure (private-key) encryption, provided that the "key sequence" is generated using a pseudorandom generator. Public-key encryption schemes are constructed based on any trapdoor one-way permutation. Finally, we discuss dynamic notions of security such as robustness against chosen ciphertext attacks and non-malleability.

%Plan
\input{enc-set}%% The basic setting: private-key, public-key,...
\input{enc-sec}%% Definitions of Security (semantic/indistinguishable)
\input{enc-eqv}%% Equivalence of the two definitions
\input{enc-prg}%% Private-Key schemes based on Pseudorandom Generators
\input{enc-pk}%%% Constructions of Public-Key Encryption Schemes
\input{enc-str}%% Stronger notions of security (chosen msg, malleable)
\input{enc-misc}% As usual: History, Reading, Open, Exercises

Chapter 5

Digital Signatures and Message Authentication

The difference between message authentication and digital signatures is analogous to the difference between private-key and public-key encryption schemes.
In this chapter we define both types of schemes and the security problem associated with them. We then present several constructions. We show how to construct message authentication schemes using pseudorandom functions, and how to construct signature schemes using one-way permutations (which do not necessarily have a trapdoor).

%Plan
\input{sg-def}%%% Definitions of Unforgeable Signatures
%................ and Message Authentication
\input{sg-aut}%%% Construction of Message Authentication
\input{sg-con1}%% Construction of Signatures by [NY]
%................ tools: one-time signature, aut-trees, one-way hashing
\input{sg-hash}%% * Collision-free hashing: def, construct by clawfree,
%................ applications (sign., etc.)
\input{sg-con2}%% * Alternative Construction of Signatures [EGM]
\input{sg-misc}%% As usual: History, Reading, Open, Exercises

Chapter 6

Zero-Knowledge Proof Systems

In this chapter we discuss zero-knowledge proof systems. Loosely speaking, such proof systems have the remarkable property of being convincing and yielding nothing (beyond the validity of the assertion). The main result presented is a method to generate zero-knowledge proof systems for every language in $\mathcal{NP}$. This method can be implemented using any bit commitment scheme, which in turn can be implemented using any pseudorandom generator. In addition, we discuss more refined aspects of the concept of zero-knowledge and their effect on the applicability of this concept.

6.1 Zero-Knowledge Proofs: Motivation

An archetypical "cryptographic" problem consists of providing mutually distrustful parties with a means of "exchanging" (predetermined) "pieces of information". The setting consists of several parties, each wishing to obtain some predetermined partial information concerning the secrets of the other parties. Yet each party wishes to reveal as little information as possible about its own secret.
To clarify the issue, let us consider a specific example. Suppose that all users in a system keep backups of their entire file system, encrypted using their public-key encryption, in a publicly accessible storage medium. Suppose that at some point, one user, called Alice, wishes to reveal to another user, called Bob, the cleartext of one of her files (which appears in one of her backups). A trivial "solution" is for Alice just to send the (cleartext) file to Bob. The problem with this "solution" is that Bob has no way of verifying that Alice really sent him a file from her public backup, rather than just sending him an arbitrary file. Alice can simply prove that she sends the correct file by revealing to Bob the private key of her encryption scheme. However, doing so will reveal to Bob the contents of all her files, which is certainly something that Alice does not want to happen. The question is whether Alice can convince Bob that she indeed revealed the correct file without yielding any additional "knowledge".

An analogous question can be phrased formally as follows. Let $f$ be a one-way permutation, and $b$ a hard-core predicate with respect to $f$. Suppose that one party, $A$, has a string $x$, whereas another party, denoted $B$, only has $f(x)$. Furthermore, suppose that $A$ wishes to reveal $b(x)$ to party $B$, without yielding any further information. The trivial "solution" is to let $A$ send $b(x)$ to $B$, but, as explained above, $B$ will have no way of verifying whether $A$ has really sent the correct bit (and not its complement). Party $A$ can indeed prove that it sends the correct bit (i.e., $b(x)$) by sending $x$ as well, but revealing $x$ to $B$ is much more than what $A$ had originally in mind. Again, the question is whether $A$ can convince $B$ that it indeed revealed the correct bit (i.e., $b(x)$) without yielding any additional "knowledge".

In general, the question is whether it is possible to prove a statement without yielding anything beyond its validity.
Such proofs, whenever they exist, are called zero-knowledge, and play a central role (as we shall see in the subsequent chapter) in the construction of "cryptographic" protocols.

Loosely speaking, zero-knowledge proofs are proofs that yield nothing (i.e., "no knowledge") beyond the validity of the assertion. In the rest of this introductory section, we discuss the notion of a "proof" and a possible meaning of the phrase "yield nothing (i.e., no knowledge) beyond something".

6.1.1 The Notion of a Proof

We discuss the notion of a proof with the intention of uncovering some of its underlying aspects.

A Proof as a fixed sequence or as an interactive process

Traditionally in mathematics, a "proof" is a fixed sequence consisting of statements which are either self-evident or are derived from previous statements via self-evident rules. Actually, it is more accurate to replace the phrase "self-evident" by the phrase "commonly agreed". In fact, in the formal study of proofs (i.e., logic), the commonly agreed statements are called axioms, whereas the commonly agreed rules are referred to as derivation rules. We wish to stress two properties of mathematical proofs:
1. proofs are viewed as fixed objects;
2. proofs are considered at least as fundamental as their consequence (i.e., the theorem).

However, in other areas of human activity, the notion of a "proof" has a much wider interpretation. In particular, a proof is not a fixed object but rather a process by which the validity of an assertion is established. For example, the cross-examination of a witness in court is considered a proof in law, and failure to answer a rival's claim is considered a proof in philosophical, political and sometimes even technical discussions. In addition, in real-life situations, proofs are considered secondary (in importance) to their consequence.
To summarize, in "canonical" mathematics proofs have a static nature (e.g., they are "written"), whereas in real-life situations proofs have a dynamic nature (i.e., they are established via an interaction). The dynamic interpretation of the notion of a proof is more adequate to our setting, in which proofs are used as tools (i.e., subprotocols) inside "cryptographic" protocols. Furthermore, the dynamic interpretation (at least in a weak sense) is essential to the non-triviality of the notion of a zero-knowledge proof.

Prover and Verifier

The notion of a prover is implicit in all discussions of proofs, be it in mathematics or in real-life situations. Instead, the emphasis is placed on the verification process, or in other words on (the role of) the verifier. Both in mathematics and in real-life situations, proofs are defined in terms of the verification procedure. Typically, the verification procedure is considered to be relatively simple, and the burden is placed on the party/person supplying the proof (i.e., the prover).

The asymmetry between the complexity of the verification and the theorem-proving tasks is captured by the complexity class $\mathcal{NP}$, which can be viewed as a class of proof systems. Each language $L \in \mathcal{NP}$ has an efficient verification procedure for proofs of statements of the form "$x \in L$". Recall that each $L \in \mathcal{NP}$ is characterized by a polynomial-time recognizable relation $R_L$ so that
$$L = \{x : \exists y \mbox{ s.t. } (x,y) \in R_L\}$$
and $(x,y) \in R_L$ only if $|y| \leq \mathrm{poly}(|x|)$. Hence, the verification procedure for membership claims of the form "$x \in L$" consists of applying the (polynomial-time) algorithm for recognizing $R_L$ to the claim (encoded by) $x$ and a prospective proof denoted $y$. Thus, any $y$ satisfying $(x,y) \in R_L$ is considered a proof of membership of $x$ in $L$, and correct statements (i.e., $x \in L$), and only they, have proofs in this proof system. Note that the verification procedure is "easy" (i.e., polynomial-time), whereas coming up with proofs may be "difficult".
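To make the asymmetry concrete, here is a minimal sketch of a verification procedure for one particular relation $R_L$. The choice of $L$ as the set of Hamiltonian graphs, the graph encoding (a vertex count plus an edge list) and the function name `verify_hamiltonian_cycle` are assumptions made for this illustration. The claim $x$ is the graph and the prospective proof $y$ is an ordering of its vertices; checking $y$ takes time polynomial in the size of the graph, whereas no efficient way of finding such a $y$ is known.

```python
def verify_hamiltonian_cycle(num_vertices, edges, witness):
    """Recognize the relation R_L for L = Hamiltonian graphs (illustrative choice).

    The graph is undirected with vertices 0, ..., num_vertices-1. The witness is
    accepted iff it lists every vertex exactly once and every pair of cyclically
    consecutive vertices is joined by an edge.
    """
    if num_vertices < 3:
        return False  # a simple cycle needs at least three vertices
    if sorted(witness) != list(range(num_vertices)):
        return False  # not a permutation of the vertex set
    edge_set = {frozenset(edge) for edge in edges}
    return all(
        frozenset((witness[i], witness[(i + 1) % num_vertices])) in edge_set
        for i in range(num_vertices)
    )


# A 4-cycle: the ordering 0, 1, 2, 3 is a valid proof, the ordering 0, 2, 1, 3 is not.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_hamiltonian_cycle(4, square, [0, 1, 2, 3]))  # True
print(verify_hamiltonian_cycle(4, square, [0, 2, 1, 3]))  # False
```

The verifier never searches for a cycle; it only checks the one it is handed, which is exactly the division of labour between verifier and prover described above.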
It is worthwhile to stress the distrustful attitude towards the prover in any proof system. If the verifier trusts the prover then no proof is needed. Hence, whenever discussing a proof system one considers a setting in which the verifier does not trust the prover, and furthermore is skeptical of anything the prover says.

Completeness and Validity

Two fundamental properties of a proof system (i.e., a verification procedure) are its validity and completeness. The validity property asserts that the verification procedure cannot be "tricked" into accepting false statements. In other words, validity captures the verifier's ability to protect itself from being convinced of false statements (no matter what the prover does in order to fool it). On the other hand, completeness captures the ability of some prover to convince the verifier of true statements (belonging to some predetermined set of true statements). Note that both properties are essential to the very notion of a proof system.

We remark here that not every set of true statements has a "reasonable" proof system in which each of these statements can be proven (while no false statement can be "proven"). This fundamental fact is given a precise meaning in results such as Gödel's Incompleteness Theorem and Turing's proof of the unsolvability of the Halting Problem. We stress that in this chapter we confine ourselves to the class of sets that do have "efficient proof systems". In fact, Section 6.2 is devoted to discussing and formulating the concept of "efficient proof systems". Jumping ahead, we hint that the efficiency of a proof system will be associated with the efficiency of its verification procedure.

6.1.2 Gaining Knowledge

Recall that we have motivated zero-knowledge proofs as proofs by which the verifier gains "no knowledge" (beyond the validity of the assertion). The reader may rightfully wonder what is knowledge and what is a gain of knowledge.
When discussing zero-knowledge proofs, we avoid the first question (which is quite complex), and treat the second question directly. Namely, without presenting a definition of knowledge, we present a generic case in which it is certainly justified to say that no knowledge is gained. Fortunately, this "conservative" approach seems to suffice as far as cryptography is concerned.

To motivate the definition of zero-knowledge consider a conversation between two parties, Alice and Bob. Assume first that this conversation is unidirectional, specifically Alice only talks and Bob only listens. Clearly, we can say that Alice gains no knowledge from the conversation. On the other hand, Bob may or may not gain knowledge from the conversation (depending on what Alice says). For example, if all that Alice says is 1 + 1 = 2 then clearly Bob gains no knowledge from the conversation since he knows this fact himself. If, on the other hand, Alice tells Bob a proof of Fermat's Theorem then certainly he gained knowledge from the conversation.

To give a better flavour of the definition, we now consider a conversation between Alice and Bob in which Bob asks Alice questions about a large graph (that is known to both of them). Consider first the case in which Bob asks Alice whether the graph is Eulerian or not. Clearly, we say that Bob gains no knowledge from Alice's answer, since he could have determined the answer easily by himself (e.g., by using Euler's Theorem which asserts that a graph is Eulerian if and only if all its vertices have even degree). On the other hand, if Bob asks Alice whether the graph is Hamiltonian or not, and Alice (somehow) answers this question then we cannot say that Bob gained no knowledge (since we do not know of an efficient procedure by which Bob can determine the answer by himself, and assuming $\mathcal{P} \neq \mathcal{NP}$ no such efficient procedure exists).
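The computation that Bob can carry out by himself in the Eulerian case is short enough to write down. The sketch below is an illustration only: the function name `is_eulerian` and the edge-list encoding are assumptions made here, and the connectivity check is an addition of this sketch, since the degree criterion quoted above is usually stated for connected graphs.

```python
def is_eulerian(num_vertices, edges):
    """Decide whether an undirected graph has a closed walk using every edge once.

    Applies the even-degree criterion and, as an extra assumption of this
    sketch, requires all vertices that touch an edge to be connected.
    """
    degree = [0] * num_vertices
    neighbours = [[] for _ in range(num_vertices)]
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbours[u].append(v)
        neighbours[v].append(u)
    if any(d % 2 == 1 for d in degree):
        return False  # some vertex has odd degree

    # Depth-first search restricted to vertices of positive degree.
    active = [v for v in range(num_vertices) if degree[v] > 0]
    if not active:
        return True  # no edges at all
    seen = {active[0]}
    stack = [active[0]]
    while stack:
        u = stack.pop()
        for w in neighbours[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(active)


print(is_eulerian(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True
print(is_eulerian(3, [(0, 1), (1, 2)]))                  # False
```

The running time is linear in the size of the graph, which is why an answer to the Eulerian question conveys nothing that Bob could not have computed from the graph alone.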
Hence, we say that Bob gained knowledge from the interaction if his computational ability, concerning the publicly known graph, has increased (i.e., if after the interaction he can easily compute something that he could not have efficiently computed before the interaction). On the other hand, if whatever Bob can efficiently compute about the graph after interacting with Alice, he can also efficiently compute by himself (from the graph) then we say that Bob gained no knowledge from the interaction. Hence, Bob gains knowledge only if he receives the result of a computation which is infeasible for Bob. The question of how Alice could conduct this infeasible computation (e.g., answer Bob's question of whether the graph is Hamiltonian) has been ignored so far. Jumping ahead, we remark that Alice may be a mere abstraction or may be in possession of additional hints that enable her to efficiently conduct computations that are otherwise infeasible (and in particular are infeasible for Bob, who does not have these hints). (Yet, these hints are not necessarily "information" in the information theoretic sense, as they may be determined by the common input, but not efficiently computed from it.)

Knowledge vs. information

We wish to stress that knowledge (as discussed above) is very different from information (in the sense of information theory).

Knowledge is related to computational difficulty, whereas information is not. In the above examples, there was a difference between the knowledge revealed in case Alice answers questions of the form "is the graph Eulerian" and the case she answers questions of the form "is the graph Hamiltonian". From an information theoretic point of view there is no difference between the two cases (i.e., in both Bob gets no information).

Knowledge relates mainly to publicly known objects, whereas information relates mainly to objects on which only partial information is publicly known.
Consider the case in which Alice answers each question by flipping an unbiased coin and telling Bob the outcome. From an information theoretic point of view, Bob gets from Alice information concerning an event. However, we say that Bob gains no knowledge from Alice, since he can toss coins by himself.

6.2 Interactive Proof Systems

In this section we introduce the notion of an interactive proof system, and present a non-trivial example of such a system (specifically, for claims of the form "the following two graphs are not isomorphic"). The presentation is directed towards the introduction of zero-knowledge interactive proofs. Interactive proof systems are interesting for their own sake, and have important complexity theoretic applications, which are discussed in Chapter 8.

6.2.1 Definition

The definition of an interactive proof system refers explicitly to the two computational tasks related to a proof system: "producing" a proof and verifying the validity of a proof. These tasks are performed by two different parties, called the prover and the verifier, which interact with one another. The interaction may be very simple and in particular unidirectional (i.e., the prover sends a text, called the proof, to the verifier). In general the interaction may be more complex, and may take the form of the verifier interrogating the prover.

Interaction

Interaction between two parties is defined in the natural manner. The only point worth noting is that the interaction is parameterized by a common input (given to both parties). In the context of interactive proof systems, the common input represents the statement to be proven. We first define the notion of an interactive machine, and next the notion of interaction between two such machines. The reader may skip to the next part of this subsection (titled "Conventions regarding interactive machines") with little loss (if at all).
Definition 6.1 (an interactive machine):

An interactive Turing machine (ITM) is a (deterministic) multi-tape Turing machine. The tapes consist of a read-only input-tape, a read-only random-tape, a read-and-write work-tape, a write-only output-tape, a pair of communication-tapes, and a read-and-write switch-tape consisting of a single cell initiated to contents 0. One communication-tape is read-only and the other is write-only.

Each ITM is associated with a single bit $\sigma \in \{0,1\}$, called its identity. An ITM is said to be active, in a configuration, if the contents of its switch-tape equals the machine's identity. Otherwise the machine is said to be idle. While the machine is idle, its state, the location of its heads on the various tapes, and the contents of its writeable tapes are not modified.

The contents of the input-tape is called input, the contents of the random-tape is called random-input, and the contents of the output-tape at termination is called output. The contents written on the write-only communication-tape during a (time) period in which the machine is active is called the message sent at this period. Likewise, the contents read from the read-only communication-tape during an active period is called the message received (at that period). (Without loss of generality the machine movements on both communication-tapes are only in one direction, say left to right.)

The above definition, taken by itself, seems quite nonintuitive. In particular, one may say that once being idle the machine never becomes active again. One may also wonder what is the point of distinguishing the read-only communication-tape from the input-tape (and respectively distinguishing the write-only communication-tape from the output-tape). The point is that we are never going to consider a single interactive machine, but rather a pair of machines combined together so that some of their tapes coincide.
Intuitively, the messages sent by an interactive machine are received by a second machine which shares its communication-tapes (so that the read-only communication-tape of one machine coincides with the write-only tape of the other machine). The active machine may become idle by changing the contents of the shared switch-tape, and by doing so the other machine (having opposite identity) becomes active. The computation of such a pair of machines consists of the machines alternately sending messages to one another, based on their initial (common) input, their (distinct) random-inputs, and the messages each machine has received so far.

Definition 6.2 (joint computation of two ITMs):

Two interactive machines are said to be linked if they have opposite identities, their input-tapes coincide, their switch-tapes coincide, and the read-only communication-tape of one machine coincides with the write-only communication-tape of the other machine, and vice versa. We stress that the other tapes of both machines (i.e., the random-tape, the work-tape, and the output-tape) are distinct.

The joint computation of a linked pair of ITMs, on a common input $x$, is a sequence of pairs. Each pair consists of the local configuration of each of the machines. In each such pair of local configurations, one machine (not necessarily the same one) is active while the other machine is idle.

If one machine halts while the switch-tape still holds its identity then we say that both machines have halted.

At this point, the reader may object to the above definition, saying that the individual machines are deprived of individual local inputs (and observing that they are given individual and unshared random-tapes). This restriction is removed in Subsection 6.2.3, and in fact removing it is quite important (at least as far as practical purposes are concerned).
Yet, for a first presentation of interactive proofs, as well as for demonstrating the power of this concept, we prefer the above simpler definition. The convention of individual random-tapes is however essential to the power of interactive proofs (see Exercise 4).

Conventions regarding interactive machines

Typically, we consider executions when the contents of the random-tape of each machine is uniformly and independently chosen (among all infinite bit sequences). The convention of having an infinite sequence of internal coin tosses should not bother the reader since during a finite computation only a finite prefix is read (and matters). The contents of each of these random-tapes can be viewed as internal coin tosses of the corresponding machine (as in the definition of ordinary probabilistic machines, presented in Chapter 1). Hence, interactive machines are in fact probabilistic.

Notation: Let $A$ and $B$ be a linked pair of ITMs, and suppose that all possible interactions of $A$ and $B$ on each common input terminate in a finite number of steps. We denote by $\langle A,B\rangle(x)$ the random variable representing the (local) output of $B$ when interacting with machine $A$ on common input $x$, when the random-input to each machine is uniformly and independently chosen.

Another important convention is to consider the time-complexity of an interactive machine as a function of its input only.

Definition 6.3 (the complexity of an interactive machine): We say that an interactive machine $A$ has time complexity $t : \mathbb{N} \to \mathbb{N}$ if for every interactive machine $B$ and every string $x$, it holds that when interacting with machine $B$, on common input $x$, machine $A$ always (i.e., regardless of the contents of its random-tape and $B$'s random-tape) halts within $t(|x|)$ steps.

We stress that the time complexity, so defined, is independent of the contents of the messages that machine $A$ receives. In other words, it is an upper bound which holds for all possible incoming messages.
In particular, an interactive machine with time complexity $t(\cdot)$ reads, on input $x$, only a prefix of total length $t(|x|)$ of the messages sent to it.

Proof systems

In general, proof systems are defined in terms of the verification procedure (which may be viewed as one entity called the verifier). A "proof" of a specific claim is always considered as coming from the outside (which can be viewed as another entity called the prover). The verification procedure itself does not generate "proofs", but merely verifies their validity. Interactive proof systems are intended to capture whatever can be efficiently verified via interaction with the outside. In general, the interaction with the outside may be very complex and may consist of many message exchanges, as long as the total time spent by the verifier is polynomial.

In light of the association of efficient procedures with probabilistic polynomial-time algorithms, it is natural to consider probabilistic polynomial-time verifiers. Furthermore, the verifier's verdict of whether to accept or reject the claim is probabilistic, and a bounded error probability is allowed. (The error can of course be decreased to be negligible by repeating the verification procedure sufficiently many times.) Loosely speaking, we require that the prover can convince the verifier of the validity of valid statements, while nobody can fool the verifier into believing false statements. In fact, it is only required that the verifier accepts valid statements with "high" probability, whereas the probability that it accepts a false statement is "small" (regardless of the machine with which the verifier interacts). In the following definition, the verifier's output is interpreted as its decision on whether to accept or reject the common input. Output 1 is interpreted as 'accept', whereas output 0 is interpreted as 'reject'.
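The parenthetical remark on error reduction can be illustrated by a small exact calculation. The sketch below rests on simplifying assumptions made here: the verification procedure is run an odd number of times, each run errs independently with probability exactly 1/3 (the constant that appears in Definition 6.4 below), and the final verdict is the majority of the individual verdicts. The function name `majority_error` is hypothetical.

```python
from fractions import Fraction
from math import comb


def majority_error(per_run_error, runs):
    """Exact probability that a majority of `runs` independent runs err.

    `runs` is assumed to be odd, so that ties cannot occur.
    """
    p = Fraction(per_run_error)
    wrong_needed = runs // 2 + 1
    return sum(
        comb(runs, j) * p ** j * (1 - p) ** (runs - j)
        for j in range(wrong_needed, runs + 1)
    )


for runs in (1, 3, 5):
    print(runs, majority_error(Fraction(1, 3), runs))
# 1 1/3
# 3 7/27
# 5 17/81
```

The error shrinks with every additional pair of runs and, by standard tail bounds, does so exponentially fast in the number of runs; with 101 runs it is already below 1/100 under the assumptions above.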
Definition 6.4 (interactive proof system): A pair of interactive machines, $(P, V)$, is called an interactive proof system for a language $L$ if machine $V$ is polynomial-time and the following two conditions hold:

Completeness: For every $x \in L$,
$$\mathrm{Prob}\left(\langle P, V \rangle(x) = 1\right) \ge \frac{2}{3}$$

Soundness: For every $x \notin L$ and every interactive machine $B$,
$$\mathrm{Prob}\left(\langle B, V \rangle(x) = 1\right) \le \frac{1}{3}$$

Some remarks are in order. We first stress that the soundness condition refers to all potential "provers" whereas the completeness condition refers only to the prescribed prover $P$. Secondly, the verifier is required to be (probabilistic) polynomial-time, while no resource bounds are placed on the computing power of the prover (in either completeness or soundness conditions!). Thirdly, as in the case of $BPP$, the error probability in the above definition can be made exponentially small by repeating the interaction (polynomially) many times (see below).

Every language in $NP$ has an interactive proof system. Specifically, let $L \in NP$ and let $R_L$ be a witness relation associated with the language $L$ (i.e., $R_L$ is recognizable in polynomial-time and $L$ equals the set $\{x : \exists y \text{ s.t. } |y| = \mathrm{poly}(|x|) \wedge (x, y) \in R_L\}$). Then, an interactive proof for the language $L$ consists of a prover that on input $x \in L$ sends a witness $y$ (as above), and a verifier that upon receiving $y$ (on common input $x$) outputs 1 if $|y| = \mathrm{poly}(|x|)$ and $(x, y) \in R_L$ (and 0 otherwise). Clearly, when interacting with the prescribed prover, this verifier will always accept inputs in the language. On the other hand, no matter what a cheating "prover" does, this verifier will never accept inputs not in the language. We point out that in this proof system both parties are deterministic (i.e., make no use of their random-tape). It is easy to see that only languages in $NP$ have interactive proof systems in which both parties are deterministic (see Exercise 2).
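The deterministic proof system for $NP$ described above can be made concrete with a toy language. The sketch below is an illustration only: it takes $L$ to be the set of composite integers, with the witness relation "$y$ is a non-trivial divisor of $x$"; this choice of language and the names `witness_relation`, `verifier` and `honest_prover` are assumptions made here, not part of the text.

```python
def witness_relation(x: int, y: int) -> bool:
    """R_L for the toy language of composites: y is a non-trivial divisor of x."""
    return 1 < y < x and x % y == 0


def verifier(x: int, message: object) -> int:
    """Deterministic verifier: output 1 iff the single message is a valid witness."""
    if not isinstance(message, int):
        return 0
    # Length check |y| <= |x|, standing in for the poly(|x|) bound on witnesses.
    if message.bit_length() > x.bit_length():
        return 0
    return 1 if witness_relation(x, message) else 0


def honest_prover(x: int):
    """Computationally unbounded prover: finds a witness by exhaustive search."""
    for y in range(2, x):
        if x % y == 0:
            return y
    return None  # no witness exists, i.e. x is not in the language


if __name__ == "__main__":
    print(verifier(91, honest_prover(91)))  # composite input, prescribed prover
    print(verifier(97, honest_prover(97)))  # prime input, no witness to send
```

The prover here runs in time exponential in $|x|$, which is allowed since no resource bound is placed on it; the verifier's work is a single divisibility test.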
In other words, $NP$ can be viewed as a class of interactive proof systems in which the interaction is unidirectional (i.e., from the prover to the verifier) and the verifier is deterministic (and never errs). In general interactive proofs, both restrictions are waived: the interaction is bidirectional and the verifier is probabilistic (and may err with some small probability). Both bidirectional interaction and randomization seem essential to the power of interactive proof systems (see further discussion in Chapter 8).

Definition 6.5 (the class $IP$): The class $IP$ consists of all languages having interactive proof systems.

By the above discussion $NP \subseteq IP$. Since languages in $BPP$ can be viewed as having a verifier (that decides on membership without any interaction), it follows that $BPP \cup NP \subseteq IP$. We remind the reader that it is not known whether $BPP \subseteq NP$.

We stress that the definition of the class $IP$ remains invariant if one replaces the (constant) bounds in the completeness and soundness conditions by two functions $c, s : \mathbb{N} \to \mathbb{R}$ satisfying $c(n) < 1 - 2^{-\mathrm{poly}(n)}$, $s(n) > 2^{-\mathrm{poly}(n)}$, and $c(n) > s(n) + \frac{1}{\mathrm{poly}(n)}$. Namely,

Definition 6.6 (generalized interactive proof): Let $c, s : \mathbb{N} \to \mathbb{R}$ be functions satisfying $c(n) > s(n) + \frac{1}{p(n)}$, for some polynomial $p(\cdot)$. An interactive pair $(P, V)$ is called a (generalized) interactive proof system for the language $L$, with completeness bound $c(\cdot)$ and soundness bound $s(\cdot)$, if

(modified) completeness: For every $x \in L$,
$$\mathrm{Prob}\left(\langle P, V \rangle(x) = 1\right) \ge c(|x|)$$

(modified) soundness: For every $x \notin L$ and every interactive machine $B$,
$$\mathrm{Prob}\left(\langle B, V \rangle(x) = 1\right) \le s(|x|)$$

The function $g(\cdot)$, where $g(n) \stackrel{\mathrm{def}}{=} c(n) - s(n)$, is called the acceptance gap of $(P, V)$, and the function $e(\cdot)$, where $e(n) \stackrel{\mathrm{def}}{=} \max\{1 - c(n), s(n)\}$, is called the error probability of $(P, V)$.

Proposition 6.7 The following three conditions are equivalent
1. $L \in IP$.
Namely, there exists an interactive proof system, with completeness bound $\frac{2}{3}$ and soundness bound $\frac{1}{3}$, for the language $L$.
2. $L$ has very strong interactive proof systems: For every polynomial $p(\cdot)$, there exists an interactive proof system for the language $L$, with error probability bounded above by $2^{-p(\cdot)}$.
3. $L$ has a very weak interactive proof: There exists a polynomial $p(\cdot)$, and a generalized interactive proof system for the language $L$, with acceptance gap bounded below by $1/p(\cdot)$. Furthermore, completeness and soundness bounds for this system, namely the values $c(n)$ and $s(n)$, can be computed in time polynomial in $n$.

Clearly either of the first two items implies the third one (including the requirement for efficiently computable bounds). The ability to efficiently compute completeness and soundness bounds is used in proving the opposite (non-trivial) direction. The proof is left as an exercise (i.e., Exercise 1).

6.2.2 An Example (Graph Non-Isomorphism in IP)

All examples of interactive proof systems presented so far were degenerate (e.g., the interaction, if at all, was unidirectional). We now present an example of a non-degenerate interactive proof system. Furthermore, we present an interactive proof system for a language not known to be in $BPP \cup NP$. Specifically, the language is the set of pairs of non-isomorphic graphs, denoted $GNI$. Two graphs, $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, are called isomorphic if there exists a 1-1 and onto mapping, $\pi$, from the vertex set $V_1$ to the vertex set $V_2$ so that $(u, v) \in E_1$ if and only if $(\pi(u), \pi(v)) \in E_2$. The mapping $\pi$, if existing, is called an isomorphism between the graphs.

Construction 6.8 (Interactive proof system for Graph Non-Isomorphism):

Common Input: A pair of two graphs, $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. Suppose, without loss of generality, that $V_1 = \{1, 2, ..., |V_1|\}$, and similarly for $V_2$.

Verifier's first Step (V1): The verifier selects at random one of the two input graphs, and sends to the prover a random isomorphic copy of this graph.
Namely, the verifier selects uniformly $\sigma \in \{1, 2\}$, and a random permutation $\pi$ from the set of permutations over the vertex set $V_\sigma$. The verifier constructs a graph with vertex set $V_\sigma$ and edge set
$$F \stackrel{\mathrm{def}}{=} \{(\pi(u), \pi(v)) : (u, v) \in E_\sigma\}$$
and sends $(V_\sigma, F)$ to the prover.

Motivating Remark: If the input graphs are non-isomorphic, as the prover claims, then the prover should be able to distinguish (not necessarily by an efficient algorithm) isomorphic copies of one graph from isomorphic copies of the other graph. However, if the input graphs are isomorphic then a random isomorphic copy of one graph is distributed identically to a random isomorphic copy of the other graph.

Prover's first Step (P1): Upon receiving a graph, $G' = (V', E')$, from the verifier, the prover finds a $\tau \in \{1, 2\}$ so that the graph $G'$ is isomorphic to the input graph $G_\tau$. (If both $\tau = 1, 2$ satisfy the condition then $\tau$ is selected arbitrarily. In case no $\tau \in \{1, 2\}$ satisfies the condition, $\tau$ is set to 0.) The prover sends $\tau$ to the verifier.

Verifier's second Step (V2): If the message, $\tau$, received from the prover equals $\sigma$ (chosen in Step V1) then the verifier outputs 1 (i.e., accepts the common input). Otherwise the verifier outputs 0 (i.e., rejects the common input).

The verifier program presented above is easily implemented in probabilistic polynomial-time. We do not know of a probabilistic polynomial-time implementation of the prover's program, but this is not required. We now show that the above pair of interactive machines constitutes an interactive proof system (in the general sense) for the language $GNI$ (Graph Non-Isomorphism).

Proposition 6.9 The language $GNI$ is in the class $IP$. Furthermore, the programs specified in Construction 6.8 constitute a generalized interactive proof system for $GNI$. Namely,
1. If $G_1$ and $G_2$ are not isomorphic (i.e., $(G_1, G_2) \in GNI$) then the verifier always accepts (when interacting with the prover).
2.
If $G_1$ and $G_2$ are isomorphic (i.e., $(G_1, G_2) \notin GNI$) then, no matter with what machine the verifier interacts, it rejects the input with probability at least $\frac{1}{2}$.

proof: Clearly, if $G_1$ and $G_2$ are not isomorphic then no graph can be isomorphic to both $G_1$ and $G_2$. It follows that there exists a unique $\tau$ such that the graph $G'$ (received by the prover in Step P1) is isomorphic to the input graph $G_\tau$. Hence, $\tau$ found by the prover in Step (P1) always equals $\sigma$ chosen in Step (V1). Part (1) follows.

On the other hand, if $G_1$ and $G_2$ are isomorphic then the graph $G'$ is isomorphic to both input graphs. Furthermore, we will show that in this case the graph $G'$ yields no information about $\sigma$, and consequently no machine can (on input $G_1$, $G_2$ and $G'$) set $\tau$ so that it equals $\sigma$ with probability greater than $\frac{1}{2}$. Details follow.

Let $\pi$ be a permutation on the vertex set of a graph $G = (V, E)$. Then, we denote by $\pi(G)$ the graph with vertex set $V$ and edge set $\{(\pi(u), \pi(v)) : (u, v) \in E\}$. Let $\xi$ be a random variable uniformly distributed over $\{1, 2\}$, and $\Pi$ be a random variable uniformly distributed over the permutations of the set $V$. We stress that these two random variables are independent. We are interested in the distribution of the random variable $\Pi(G_\xi)$. We are going to show that, although $\Pi(G_\xi)$ is determined by the random variables $\Pi$ and $\xi$, the random variables $\xi$ and $\Pi(G_\xi)$ are statistically independent. In fact we show

Claim 6.9.1: If the graphs $G_1$ and $G_2$ are isomorphic then for every graph $G'$ it holds that
$$\mathrm{Prob}\left(\xi = 1 \mid \Pi(G_\xi) = G'\right) = \mathrm{Prob}\left(\xi = 2 \mid \Pi(G_\xi) = G'\right) = \frac{1}{2}$$

proof: We first claim that the sets $S_1 \stackrel{\mathrm{def}}{=} \{\pi : \pi(G_1) = G'\}$ and $S_2 \stackrel{\mathrm{def}}{=} \{\pi : \pi(G_2) = G'\}$ are of equal cardinality. This follows from the observation that there is a 1-1 and onto correspondence between the set $S_1$ and the set $S_2$ (the correspondence is given by the isomorphism between the graphs $G_1$ and $G_2$).
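The cardinality claim can be checked by brute force on a small example. The sketch below is only an illustration: it enumerates all permutations of a 4-vertex set and counts those mapping each input graph onto a fixed copy $G'$. Vertices are numbered from 0 rather than from 1, edges are undirected, and the names `relabel` and `mapping_set` as well as the example graphs are assumptions made here.

```python
from itertools import permutations


def relabel(perm, edges):
    """Return pi(G) as a set of undirected edges, where vertex u becomes perm[u]."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)


def mapping_set(n, edges, target_edges):
    """All permutations pi of {0, ..., n-1} with pi(G) = G' (the set S in the proof)."""
    target = frozenset(frozenset(e) for e in target_edges)
    return [p for p in permutations(range(n)) if relabel(p, edges) == target]


if __name__ == "__main__":
    g1 = [(0, 1), (1, 2), (2, 3)]       # path 0-1-2-3
    g2 = [(2, 0), (0, 3), (3, 1)]       # path 2-0-3-1, isomorphic to g1
    g_prime = [(0, 2), (2, 1), (1, 3)]  # path 0-2-1-3, a copy of both
    print(len(mapping_set(4, g1, g_prime)), len(mapping_set(4, g2, g_prime)))
```

Exhaustive enumeration takes time $n!$ and is meant only for tiny graphs; it plays the role of the unbounded prover, not of the verifier.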
Hence,
$$\mathrm{Prob}\left(\Pi(G_\xi) = G' \mid \xi = 1\right) = \mathrm{Prob}\left(\Pi(G_1) = G'\right) = \mathrm{Prob}\left(\Pi \in S_1\right) = \mathrm{Prob}\left(\Pi \in S_2\right) = \mathrm{Prob}\left(\Pi(G_\xi) = G' \mid \xi = 2\right)$$
Using Bayes Rule, the claim follows. $\Box$

Using Claim 6.9.1, it follows that for every pair, $(G_1, G_2)$, of isomorphic graphs and for every randomized process, $R$, (possibly depending on this pair) it holds that
$$\mathrm{Prob}\left(R(\Pi(G_\xi)) = \xi\right) = \sum_{G'} \mathrm{Prob}\left(\Pi(G_\xi) = G'\right) \cdot \mathrm{Prob}\left(R(G') = \xi \mid \Pi(G_\xi) = G'\right)$$
$$= \sum_{G'} \mathrm{Prob}\left(\Pi(G_\xi) = G'\right) \cdot \sum_{b \in \{1, 2\}} \mathrm{Prob}\left(R(G') = b\right) \cdot \mathrm{Prob}\left(b = \xi \mid \Pi(G_\xi) = G'\right)$$
$$= \sum_{G'} \mathrm{Prob}\left(\Pi(G_\xi) = G'\right) \cdot \mathrm{Prob}\left(R(G') \in \{1, 2\}\right) \cdot \frac{1}{2}$$
$$\le \frac{1}{2}$$
with equality in case $R$ always outputs an element in the set $\{1, 2\}$. Part (2) of the proposition follows.

Remarks concerning Construction 6.8

In the proof system of Construction 6.8, the verifier always accepts inputs in the language (i.e., the error probability in these cases equals zero). All interactive proof systems we shall consider will share this property. In fact it can be shown that every interactive proof system can be transformed into an interactive proof system (for the same language) in which the verifier always accepts inputs in the language. On the other hand, as shown in Exercise 5, only languages in $NP$ have interactive proof systems in which the verifier always rejects inputs not in the language.

The fact that $GNI \in IP$, whereas it is not known whether $GNI \in NP$, is an indication of the power of interaction and randomness in the context of theorem proving. A much stronger indication is provided by the fact that every language in $PSPACE$ has an interactive proof system (in fact $IP$ equals $PSPACE$). For further discussion see Chapter 8.

6.2.3 Augmentation to the Model

For purposes that will become clearer in the sequel we augment the basic definition of an interactive proof system by allowing each of the parties to have a private input (in addition to the common input). Loosely speaking, these inputs are used to capture additional information available to each of the parties.
Specifically, when using interactive proof systems as subprotocols inside larger protocols, the private inputs are associated with the local configurations of the machines before entering the subprotocol. In particular, the private input of the prover may contain information which enables an efficient implementation of the prover's task.

Definition 6.10 (interactive proof systems - revisited):

An interactive machine is defined as in Definition 6.1, except that the machine has an additional read-only tape called the auxiliary-input-tape. The contents of this tape is called the auxiliary input.

The complexity of such an interactive machine is still measured as a function of the (common) input. Namely, the interactive machine $A$ has time complexity $t : \mathbb{N} \to \mathbb{N}$ if for every interactive machine $B$ and every string $x$, it holds that when interacting with machine $B$, on common input $x$, machine $A$ always (i.e., regardless of the contents of its random-tape and its auxiliary-input-tape as well as the contents of $B$'s tapes) halts within $t(|x|)$ steps.

We denote by $\langle A(y), B(z) \rangle(x)$ the random variable representing the (local) output of $B$ when interacting with machine $A$ on common input $x$, when the random-input to each machine is uniformly and independently chosen, and $A$ (resp., $B$) has auxiliary input $y$ (resp., $z$).

A pair of interactive machines, $(P, V)$, is called an interactive proof system for a language $L$ if machine $V$ is polynomial-time and the following two conditions hold

- Completeness: For every $x \in L$, there exists a string $y$ such that for every $z \in \{0, 1\}^*$
$$\mathrm{Prob}\left(\langle P(y), V(z) \rangle(x) = 1\right) \ge \frac{2}{3}$$

- Soundness: For every $x \notin L$, every interactive machine $B$, and every $y, z \in \{0, 1\}^*$
$$\mathrm{Prob}\left(\langle B(y), V(z) \rangle(x) = 1\right) \le \frac{1}{3}$$

We stress that when saying that an interactive machine is polynomial-time, we mean that its running-time is polynomial in the length of the common input.
Consequently, it is not guaranteed that such a machine has enough time to read its entire auxiliary input.

6.3 Zero-Knowledge Proofs: Definitions

In this section we introduce the notion of a zero-knowledge interactive proof system, and present a non-trivial example of such a system (specifically for claims of the form "the following two graphs are isomorphic").

6.3.1 Perfect and Computational Zero-Knowledge

Loosely speaking, we say that an interactive proof system, $(P, V)$, for a language $L$ is zero-knowledge if whatever can be efficiently computed after interacting with $P$ on input $x \in L$, can also be efficiently computed from $x$ (without any interaction). We stress that the above holds with respect to any efficient way of interacting with $P$, not necessarily the way defined by the verifier program $V$. Actually, zero-knowledge is a property of the prescribed prover $P$. It captures $P$'s robustness against attempts to gain knowledge by interacting with it. A straightforward way of capturing the informal discussion follows.

Let $(P, V)$ be an interactive proof system for some language $L$. We say that $(P, V)$, actually $P$, is perfect zero-knowledge if for every probabilistic polynomial-time interactive machine $V^*$ there exists an (ordinary) probabilistic polynomial-time algorithm $M^*$ so that for every $x \in L$ the following two random variables are identically distributed

- $\langle P, V^* \rangle(x)$ (i.e., the output of the interactive machine $V^*$ after interacting with the interactive machine $P$ on common input $x$)
- $M^*(x)$ (i.e., the output of machine $M^*$ on input $x$).

Machine $M^*$ is called a simulator for the interaction of $V^*$ with $P$.

We stress that we require that for every $V^*$ interacting with $P$, not merely for $V$, there exists a ("perfect") simulator $M^*$. This simulator, although not having access to the interactive machine $P$, is able to simulate the interaction of $V^*$ with $P$.
This fact is taken as evidence for the claim that $V^*$ did not gain any knowledge from $P$ (since the same output could have been generated without any access to $P$).

Note that every language in $BPP$ has a perfect zero-knowledge proof system in which the prover does nothing (and the verifier checks by itself whether to accept the common input or not). To demonstrate the zero-knowledge property of this "dummy prover", one may present for every verifier $V^*$ a simulator $M^*$ which is essentially identical to $V^*$ (except that the communication tapes of $V^*$ are considered as ordinary work tapes of $M^*$).

Unfortunately, the above formulation of perfect zero-knowledge is slightly too strict to be useful. We relax the formulation by allowing the simulator to fail, with bounded probability, to produce an interaction.

Definition 6.11 (perfect zero-knowledge): Let $(P, V)$ be an interactive proof system for some language $L$. We say that $(P, V)$ is perfect zero-knowledge if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ so that for every $x \in L$ the following two conditions hold:
1. With probability at most $\frac{1}{2}$, on input $x$, machine $M^*$ outputs a special symbol denoted $\bot$ (i.e., $\mathrm{Prob}(M^*(x) = \bot) \le \frac{1}{2}$).
2. Let $m^*(x)$ be a random variable describing the distribution of $M^*(x)$ conditioned on $M^*(x) \ne \bot$ (i.e., $\mathrm{Prob}(m^*(x) = \alpha) = \mathrm{Prob}(M^*(x) = \alpha \mid M^*(x) \ne \bot)$, for every $\alpha \in \{0, 1\}^*$). Then the following random variables are identically distributed
- $\langle P, V^* \rangle(x)$ (i.e., the output of the interactive machine $V^*$ after interacting with the interactive machine $P$ on common input $x$)
- $m^*(x)$ (i.e., the output of machine $M^*$ on input $x$, conditioned on not being $\bot$)

Machine $M^*$ is called a perfect simulator for the interaction of $V^*$ with $P$.

Condition 1 (above) can be replaced by a stronger condition requiring that $M^*$ outputs the special symbol (i.e., $\bot$) only with negligible probability. For example, one can require that on input $x$ machine $M^*$ outputs $\bot$
with probability bounded above by $2^{-p(|x|)}$, for any polynomial $p(\cdot)$; see Exercise 6. Consequently, the statistical difference between the random variables $\langle P, V^* \rangle(x)$ and $M^*(x)$ can be made negligible (in $|x|$); see Exercise 7. Hence, whatever the verifier efficiently computes after interacting with the prover, can be efficiently computed (up to an overwhelmingly small error) by the simulator (and hence by the verifier himself).

Following the spirit of Chapters 3 and 4, we observe that for practical purposes there is no need to be able to "perfectly simulate" the output of $V^*$ after interacting with $P$. Instead, it suffices to generate a probability distribution which is computationally indistinguishable from the output of $V^*$ after interacting with $P$. The relaxation is consistent with our original requirement that "whatever can be efficiently computed after interacting with $P$ on input $x \in L$, can also be efficiently computed from $x$ (without any interaction)". The reason is that we consider computationally indistinguishable ensembles as being the same.

Before presenting the relaxed definition of general zero-knowledge, we recall the definition of computationally indistinguishable ensembles. Here we consider ensembles indexed by strings from a language, $L$. We say that the ensembles $\{R_x\}_{x \in L}$ and $\{S_x\}_{x \in L}$ are computationally indistinguishable if for every probabilistic polynomial-time algorithm, $D$, for every polynomial $p(\cdot)$ and all sufficiently long $x \in L$ it holds that
$$\left|\mathrm{Prob}(D(x, R_x) = 1) - \mathrm{Prob}(D(x, S_x) = 1)\right| < \frac{1}{p(|x|)}$$

Definition 6.12 (computational zero-knowledge): Let $(P, V)$ be an interactive proof system for some language $L$.
We say that $(P, V)$ is computational zero-knowledge (or just zero-knowledge) if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ so that the following two ensembles are computationally indistinguishable

- $\{\langle P, V^* \rangle(x)\}_{x \in L}$ (i.e., the output of the interactive machine $V^*$ after interacting with the interactive machine $P$ on common input $x$)
- $\{M^*(x)\}_{x \in L}$ (i.e., the output of machine $M^*$ on input $x$).

Machine $M^*$ is called a simulator for the interaction of $V^*$ with $P$.

The reader can easily verify (see Exercise 9) that allowing the simulator to output the symbol $\bot$ (with probability bounded above by, say, $\frac{1}{2}$) and considering the conditional output distribution (as done in Definition 6.11), does not add to the power of Definition 6.12.

We stress that both definitions of zero-knowledge apply to interactive proof systems in the general sense (i.e., having any non-negligible gap in the acceptance probabilities for inputs inside and outside the language). In fact, the definitions of zero-knowledge apply to any pair of interactive machines (actually to each interactive machine). Namely, we may say that the interactive machine $A$ is zero-knowledge on $L$ if whatever can be efficiently computed after interacting with $A$ on common input $x \in L$, can also be efficiently computed from $x$ itself.

An alternative formulation of zero-knowledge

An alternative formulation of zero-knowledge considers the verifier's view of the interaction with the prover, rather than only the output of the verifier after such an interaction. By the "verifier's view of the interaction" we mean the entire sequence of the local configurations of the verifier during an interaction (execution) with the prover.
Clearly, it suffices to consider only the contents of the random-tape of the verifier and the sequence of messages that the verifier has received from the prover during the execution (since the entire sequence of local configurations as well as the final output are determined by these objects).

Definition 6.13 (zero-knowledge - alternative formulation): Let $(P, V)$, $L$ and $V^*$ be as in Definition 6.12. We denote by $\mathrm{view}^P_{V^*}(x)$ a random variable describing the contents of the random-tape of $V^*$ and the messages $V^*$ receives from $P$ during a joint computation on common input $x$. We say that $(P, V)$ is zero-knowledge if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ so that the ensembles $\{\mathrm{view}^P_{V^*}(x)\}_{x \in L}$ and $\{M^*(x)\}_{x \in L}$ are computationally indistinguishable.

A few remarks are in order. Definition 6.13 is obtained from Definition 6.12 by replacing $\langle P, V^* \rangle(x)$ by $\mathrm{view}^P_{V^*}(x)$. The simulator $M^*$ used in Definition 6.13 is related, but not equal, to the simulator used in Definition 6.12 (yet, this fact is not reflected in the text of these definitions). Clearly, the output $\langle P, V^* \rangle(x)$ can be computed in (deterministic) polynomial-time from $\mathrm{view}^P_{V^*}(x)$, for every $V^*$. Although the opposite direction is not always true, Definition 6.13 is equivalent to Definition 6.12 (see Exercise 10). The latter fact justifies the use of Definition 6.13, which is more convenient to work with, although it seems less natural than Definition 6.12. An alternative formulation of perfect zero-knowledge is straightforward, and clearly it is equivalent to Definition 6.11.

* Complexity classes based on Zero-Knowledge

Definition 6.14 (class of languages having zero-knowledge proofs): We denote by $ZK$ (also $CZK$) the class of languages having (computational) zero-knowledge interactive proof systems. Likewise, $PZK$ denotes the class of languages having perfect zero-knowledge interactive proof systems.

Clearly, $BPP \subseteq PZK \subseteq CZK \subseteq IP$. We believe that the first two inclusions are strict.
Assuming the existence of (non-uniformly) one-way functions, the last inclusion is an equality (i.e., $CZK = IP$). See Proposition 6.24 and Theorems 3.29 and 6.30.

* Expected polynomial-time simulators

The formulation of perfect zero-knowledge presented in Definition 6.11 is different from the standard definition used in the literature. The standard definition requires that the simulator always outputs a legal transcript (which has to be distributed identically to the real interaction), yet it allows the simulator to run in expected polynomial-time rather than in strictly polynomial-time. We stress that the expectation is taken over the coin tosses of the simulator (whereas the input to the simulator is fixed).

Definition 6.15 (perfect zero-knowledge - liberal formulation): We say that $(P, V)$ is perfect zero-knowledge in the liberal sense if for every probabilistic polynomial-time interactive machine $V^*$ there exists an expected polynomial-time algorithm $M^*$ so that for every $x \in L$ the random variables $\langle P, V^* \rangle(x)$ and $M^*(x)$ are identically distributed.

We stress that by probabilistic polynomial-time we mean a strict bound on the running time in all possible executions, whereas by expected polynomial-time we allow non-polynomial-time executions but require that the running-time is "polynomial on the average". Clearly, Definition 6.11 implies Definition 6.15 (see Exercise 8). Interestingly, there exist interactive proofs which are perfect zero-knowledge with respect to the liberal definition but not known to be perfect zero-knowledge with respect to Definition 6.11. We prefer to adopt Definition 6.11, rather than Definition 6.15, because we wanted to avoid the notion of expected polynomial-time, which is much more subtle than one realizes at first glance.
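The implication from Definition 6.11 to Definition 6.15 noted above can be sketched as follows. This is one natural route, offered here as an assumption about how Exercise 8 may be approached rather than as the intended solution. Write $M^*$ for the simulator of Definition 6.11, $t(\cdot)$ for a polynomial bounding its running time, and let $M'$ (a name introduced here) invoke $M^*(x)$ repeatedly, with fresh coins each time, until the output differs from $\bot$.

```latex
\begin{align*}
p &\stackrel{\mathrm{def}}{=} \mathrm{Prob}\left(M^*(x) = \bot\right) \le \tfrac{1}{2} \\[4pt]
\mathrm{Prob}\left(M'(x) = \alpha\right)
  &= \sum_{i \ge 0} p^{\,i} \cdot \mathrm{Prob}\left(M^*(x) = \alpha\right)
   = \frac{\mathrm{Prob}\left(M^*(x) = \alpha\right)}{1 - p}
   = \mathrm{Prob}\left(m^*(x) = \alpha\right)
   \qquad (\alpha \ne \bot) \\[4pt]
\mathrm{E}\left[\text{number of invocations of } M^*\right]
  &= \sum_{i \ge 1} i \cdot p^{\,i-1} (1 - p)
   = \frac{1}{1 - p} \le 2 \\[4pt]
\mathrm{E}\left[\text{running time of } M'(x)\right]
  &\le 2 \cdot t(|x|) + (\text{bookkeeping overhead})
\end{align*}
```

Under these assumptions $M'$ never outputs $\bot$, its output is distributed as $m^*(x)$ and hence as $\langle P, V^* \rangle(x)$, and its expected running time is polynomial, although a single execution may run arbitrarily long.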
A parenthetical remark concerning the notion of average polynomial-time: The naive interpretation of expected polynomial-time is having average running-time that is bounded by a polynomial in the input length. This definition of expected polynomial-time is unsatisfactory since it is not closed under reductions and is (too) machine dependent. Both aggravating phenomena follow from the fact that a function may have an average (say over $\{0, 1\}^n$) that is bounded by a polynomial (in $n$) and yet squaring the function yields a function whose average is not bounded by a polynomial (in $n$). Hence, a better interpretation of expected polynomial-time is having running-time that is bounded by a polynomial in a function which has average linear growth rate.

Furthermore, the correspondence between average polynomial-time and efficient computations is more controversial than the more standard association of strict polynomial-time with efficient computations.

An analogous discussion applies also to computational zero-knowledge. More specifically, Definition 6.12 requires that the simulator works in polynomial-time, whereas a more liberal notion allows it to work in expected polynomial-time.

For the sake of elegance, it is customary to modify the definitions allowing expected polynomial-time simulators, by requiring that such simulators exist also for the interaction of expected polynomial-time verifiers with the prover.

6.3.2 An Example (Graph Isomorphism in PZK)

As mentioned above, every language in $BPP$ has a trivial (i.e., degenerate) zero-knowledge proof system. We now present an example of a non-degenerate zero-knowledge proof system. Furthermore, we present a zero-knowledge proof system for a language not known to be in $BPP$. Specifically, the language is the set of pairs of isomorphic graphs, denoted $GI$ (see definition in Section 6.2).
Construction 6.16 (Perfect Zero-Knowledge proof for Graph Isomorphism):

Common Input: A pair of two graphs, $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. Let $\phi$ be an isomorphism between the input graphs, namely $\phi$ is a 1-1 and onto mapping of the vertex set $V_1$ to the vertex set $V_2$ so that $(u, v) \in E_1$ if and only if $(\phi(u), \phi(v)) \in E_2$.

Prover's first Step (P1): The prover selects a random isomorphic copy of $G_2$, and sends it to the verifier. Namely, the prover selects at random, with uniform probability distribution, a permutation $\pi$ from the set of permutations over the vertex set $V_2$, and constructs a graph with vertex set $V_2$ and edge set
$$F \stackrel{\mathrm{def}}{=} \{(\pi(u), \pi(v)) : (u, v) \in E_2\}$$
The prover sends $(V_2, F)$ to the verifier.

Motivating Remark: If the input graphs are isomorphic, as the prover claims, then the graph sent in step P1 is isomorphic to both input graphs. However, if the input graphs are not isomorphic then no graph can be isomorphic to both of them.

Verifier's first Step (V1): Upon receiving a graph, $G' = (V', E')$, from the prover, the verifier asks the prover to show an isomorphism between $G'$ and one of the input graphs, chosen at random by the verifier. Namely, the verifier uniformly selects $\sigma \in \{1, 2\}$, and sends it to the prover (who is supposed to answer with an isomorphism between $G_\sigma$ and $G'$).

Prover's second Step (P2): If the message, $\sigma$, received from the verifier equals 2 then the prover sends $\pi$ to the verifier. Otherwise (i.e., $\sigma \ne 2$), the prover sends $\pi \circ \phi$ (i.e., the composition of $\pi$ on $\phi$, defined as $\pi \circ \phi(v) \stackrel{\mathrm{def}}{=} \pi(\phi(v))$) to the verifier. (Remark: the prover treats any $\sigma \ne 2$ as $\sigma = 1$.)

Verifier's second Step (V2): If the message, denoted $\psi$, received from the prover is an isomorphism between $G_\sigma$ and $G'$ then the verifier outputs 1, otherwise it outputs 0.

Let us denote the prover's program by $P_{GI}$.

The verifier program presented above is easily implemented in probabilistic polynomial-time.
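One round of Construction 6.16 can be sketched in code. The following is a minimal illustration under stated assumptions, not a reference implementation: vertices are numbered from 0, edges are undirected, permutations are tuples with `perm[v]` the image of `v`, the prover holds the isomorphism $\phi$ (as an auxiliary input, say), and all names (`HonestProver`, `verifier_decision`, `run_round`, `relabel`, `compose`) are hypothetical.

```python
import random


def relabel(perm, edges):
    """Apply a vertex relabelling to an undirected edge set."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)


def compose(outer, inner):
    """Return the permutation v -> outer[inner[v]]."""
    return tuple(outer[inner[v]] for v in range(len(inner)))


class HonestProver:
    """Prescribed prover, assumed to hold an isomorphism phi from G1 to G2."""

    def __init__(self, n, g2, phi, rng):
        self.n, self.g2, self.phi, self.rng = n, g2, phi, rng
        self.pi = None

    def step_p1(self):
        # (P1) send a random isomorphic copy of G2
        pi = list(range(self.n))
        self.rng.shuffle(pi)
        self.pi = tuple(pi)
        return relabel(self.pi, self.g2)

    def step_p2(self, sigma):
        # (P2) answer with pi if sigma = 2, and with pi o phi otherwise
        return self.pi if sigma == 2 else compose(self.pi, self.phi)


def verifier_decision(n, g1, g2, g_prime, sigma, psi):
    """(V2) output 1 iff psi is a permutation mapping G_sigma onto G'."""
    if sorted(psi) != list(range(n)):
        return 0
    chosen = g1 if sigma == 1 else g2
    return 1 if relabel(psi, chosen) == g_prime else 0


def run_round(n, g1, g2, phi, rng):
    prover = HonestProver(n, g2, phi, rng)
    g_prime = prover.step_p1()
    sigma = rng.choice((1, 2))  # (V1) the verifier's random challenge
    psi = prover.step_p2(sigma)
    return verifier_decision(n, g1, g2, g_prime, sigma, psi)


if __name__ == "__main__":
    g1 = [(0, 1), (1, 2), (2, 3)]
    phi = (2, 0, 3, 1)                 # isomorphism from g1 to g2
    g2 = [(2, 0), (0, 3), (3, 1)]
    print(run_round(4, g1, g2, phi, random.Random(0)))
```

Python's `random` module is used only to keep the sketch short; it is not a cryptographically secure source of coin tosses.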
In case the prover is given an isomorphism between the input graphs as auxiliary input, also the prover's program can be implemented in probabilistic polynomial-time. We now show that the above pair of interactive machines constitutes a zero-knowledge interactive proof system (in the general sense) for the language $GI$ (Graph Isomorphism).

Proposition 6.17 The language $GI$ has a perfect zero-knowledge interactive proof system. Furthermore, the programs specified in Construction 6.16 satisfy
1. If $G_1$ and $G_2$ are isomorphic (i.e., $(G_1, G_2) \in GI$) then the verifier always accepts (when interacting with the prover).
2. If $G_1$ and $G_2$ are not isomorphic (i.e., $(G_1, G_2) \notin GI$) then, no matter with what machine the verifier interacts, it rejects the input with probability at least $\frac{1}{2}$.
3. The above prover (i.e., $P_{GI}$) is perfect zero-knowledge. Namely, for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ outputting $\bot$ with probability at most $\frac{1}{2}$ so that for every $x \stackrel{\mathrm{def}}{=} (G_1, G_2) \in GI$ the following two random variables are identically distributed
- $\mathrm{view}^{P_{GI}}_{V^*}(x)$ (i.e., the view of $V^*$ after interacting with $P_{GI}$, on common input $x$)
- $m^*(x)$ (i.e., the output of machine $M^*$, on input $x$, conditioned on not being $\bot$).

A zero-knowledge interactive proof system for $GI$ with error probability $2^{-k}$ (only in the soundness condition) can be derived by executing the above protocol, sequentially, $k$ times. We stress that in each repetition of the above protocol, both (the prescribed) prover and verifier use coin tosses which are independent of the coins used in the other repetitions of the protocol. For further discussion see Section 6.3.4. We remark that $k$ parallel executions will decrease the error in the soundness condition to $2^{-k}$ as well, but the resulting interactive proof is not known to be zero-knowledge in case $k$ grows faster than logarithmic in the input length.
In fact, we believe that such an interactive proof is not zero-knowledge. For further discussion see Section 6.5.

We stress that it is not known whether $GI \in BPP$. Hence, Proposition 6.17 asserts the existence of perfect zero-knowledge proofs for languages not known to be in $BPP$.

proof: We first show that the above programs indeed constitute a (general) interactive proof system for $GI$. Clearly, if the input graphs, $G_1$ and $G_2$, are isomorphic then the graph $G'$ constructed in step (P1) is isomorphic to both of them. Hence, if each party follows its prescribed program then the verifier always accepts (i.e., outputs 1). Part (1) follows. On the other hand, if $G_1$ and $G_2$ are not isomorphic then no graph can be isomorphic to both $G_1$ and $G_2$. It follows that no matter how the (possibly cheating) prover constructs $G'$ there exists $\sigma \in \{1, 2\}$ so that $G'$ and $G_\sigma$ are not isomorphic. Hence, when the verifier follows its program, the verifier rejects (i.e., outputs 0) with probability at least $\frac{1}{2}$. Part (2) follows.

It remains to show that $P_{GI}$ is indeed perfect zero-knowledge on $GI$. This is indeed the difficult part of the entire proof. It is easy to simulate the output of the verifier specified in Construction 6.16 (since its output is identically 1 on inputs in the language $GI$). It is also not hard to simulate the output of a verifier which follows the program specified in Construction 6.16, except that at termination it outputs the entire transcript of its interaction with $P_{GI}$ (see Exercise 11). The difficult part is to simulate the output of an efficient verifier which deviates arbitrarily from the specified program.

We will use here the alternative formulation of (perfect) zero-knowledge, and show how to simulate $V^*$'s view of the interaction with $P_{GI}$, for every probabilistic polynomial-time interactive machine $V^*$.
As mentioned above, it is not hard to simulate the verifier's view of the interaction with $P_{GI}$ in case the verifier follows the specified program. However, we need to simulate the view of the verifier in the general case (in which it uses an arbitrary polynomial-time interactive program). Following is an overview of our simulation (i.e., of our construction of a simulator, $M^*$, for each $V^*$).

The simulator $M^*$ incorporates the code of the interactive program $V^*$. On input $(G_1,G_2)$, the simulator $M^*$ first selects at random one of the input graphs (i.e., either $G_1$ or $G_2$) and generates a random isomorphic copy, denoted $G''$, of this input graph. In doing so, the simulator behaves differently from $P_{GI}$, but the graph generated (i.e., $G''$) is distributed identically to the message sent in step (P1) of the interactive proof. Say that the simulator has generated $G''$ by randomly permuting $G_1$. Then, if $V^*$ asks to see the isomorphism between $G_1$ and $G''$, the simulator can indeed answer correctly, and in doing so it completes a simulation of the verifier's view of the interaction with $P_{GI}$. However, if $V^*$ asks to see the isomorphism between $G_2$ and $G''$, then the simulator (which, unlike $P_{GI}$, does not "know" $\phi$) has no way to answer correctly, and we let it halt with output $\bot$. We stress that the simulator "has no way of knowing" whether $V^*$ will ask to see an isomorphism to $G_1$ or $G_2$. The point is that the simulator can try one of the possibilities at random, and if it is lucky (which happens with probability exactly $\frac{1}{2}$) then it can output a distribution which is identical to the view of $V^*$ when interacting with $P_{GI}$ (on common input $(G_1,G_2)$). A detailed description of the simulator follows.

Simulator $M^*$. On input $x \stackrel{\rm def}{=} (G_1,G_2)$, simulator $M^*$ proceeds as follows:

1. Setting the random-tape of $V^*$: Let $q(\cdot)$ denote a polynomial bounding the running-time of $V^*$. The simulator $M^*$ starts by uniformly selecting a string $r \in \{0,1\}^{q(|x|)}$, to be used as the contents of the random-tape of $V^*$.

2. Simulating the prover's first step (P1): The simulator $M^*$ selects at random, with uniform probability distribution, a "bit" $\tau \in \{1,2\}$ and a permutation $\psi$ from the set of permutations over the vertex set $V_\tau$. It then constructs a graph with vertex set $V_\tau$ and edge set
   $$F \stackrel{\rm def}{=} \{(\psi(u),\psi(v)) : (u,v) \in E_\tau\}.$$
   Set $G'' \stackrel{\rm def}{=} (V_\tau, F)$.

3. Simulating the verifier's first step (V1): The simulator $M^*$ initiates an execution of $V^*$ by placing $x$ on $V^*$'s common-input-tape, placing $r$ (selected in step (1) above) on $V^*$'s random-tape, and placing $G''$ (constructed in step (2) above) on $V^*$'s incoming message-tape. After executing a polynomial number of steps of $V^*$, the simulator can read the outgoing message of $V^*$, denoted $\sigma$. To simplify the rest of the description, we normalize $\sigma$ by setting $\sigma = 1$ if $\sigma \neq 2$ (and leave $\sigma$ unchanged if $\sigma = 2$).

4. Simulating the prover's second step (P2): If $\sigma = \tau$ then the simulator halts with output $(x, r, G'', \psi)$.

5. Failure of the simulation: Otherwise (i.e., $\sigma \neq \tau$), the simulator halts with output $\bot$.

Using the hypothesis that $V^*$ is polynomial-time, it follows that so is the simulator $M^*$. It is left to show that $M^*$ outputs $\bot$ with probability at most $\frac{1}{2}$, and that, conditioned on not outputting $\bot$, the simulator's output is distributed as the verifier's view in a "real interaction with $P_{GI}$". The following claim is the key to the proof of both statements.

Claim 6.17.1: Suppose that the graphs $G_1$ and $G_2$ are isomorphic. Let $\xi$ be a random variable uniformly distributed in $\{1,2\}$, and $\Pi(G)$ be a random variable (independent of $\xi$) describing the graph obtained from the graph $G$ by randomly relabelling its nodes (cf. Claim 6.9.1). Then, for every graph $G''$, it holds that
$$\mathrm{Prob}\left(\xi = 1 \mid \Pi(G_\xi) = G''\right) = \mathrm{Prob}\left(\xi = 2 \mid \Pi(G_\xi) = G''\right).$$
Claim 6.17.1 is identical to Claim 6.9.1 (used to demonstrate that Construction 6.8 constitutes an interactive proof for $GNI$).
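The five simulator steps above can be sketched in code. The following is a minimal illustration rather than the book's construction verbatim: the graph encoding (a frozenset of two-element frozensets over vertices 0..n-1), the function names `permute_graph` and `simulate`, the fixed tape length `q`, and the modelling of $V^*$ as a Python callable `verifier(x, r, graph)` are all assumptions made for the example. `None` stands for the failure symbol.

```python
import random

def permute_graph(edges, psi):
    """Relabel an undirected graph: edge {u, v} becomes {psi[u], psi[v]}."""
    return frozenset(frozenset((psi[u], psi[v])) for u, v in edges)

def simulate(g1, g2, n, verifier, q=16, rng=random):
    """One run of the simulator; returns a view (x, r, G'', psi) or None."""
    x = (g1, g2)
    # Step 1: choose the contents of the verifier's random tape.
    r = tuple(rng.randrange(2) for _ in range(q))
    # Step 2: pick tau in {1, 2} and a random permutation psi, build G''.
    tau = rng.choice((1, 2))
    psi = list(range(n))
    rng.shuffle(psi)
    g_pp = permute_graph(x[tau - 1], psi)
    # Step 3: run the (possibly cheating) verifier and normalize its message.
    sigma = verifier(x, r, g_pp)
    sigma = 2 if sigma == 2 else 1
    # Step 4: lucky guess, so psi is a valid answer to the challenge.
    if sigma == tau:
        return (x, r, g_pp, tuple(psi))
    # Step 5: failure of the simulation.
    return None
```

When the two input graphs are isomorphic, the challenge computed by the verifier depends only on `(x, r, g_pp)`, whose distribution does not depend on `tau`, so a run should fail about half of the time; repeating `simulate` until it returns a view gives the conditioned output considered in the text.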
As in the rest of the proof of Proposition 6.9, it follows that any random process with output in $\{1,2\}$, given $\Pi(G_\xi)$, outputs $\xi$ with probability exactly $\frac{1}{2}$. Hence, given $G''$ (constructed by the simulator in step (2)), the verifier's program yields (normalized) $\sigma$ so that $\sigma \neq \tau$ with probability exactly $\frac{1}{2}$. We conclude that the simulator outputs $\bot$ with probability $\frac{1}{2}$. It remains to prove that, conditioned on not outputting $\bot$, the simulator's output is identical to "$V^*$'s view of real interactions". Namely,

Claim 6.17.2: Let $x = (G_1,G_2) \in GI$. Then, for every string $r$, graph $H$, and permutation $\psi$, it holds that
$$\mathrm{Prob}\left(\mathrm{view}^{P_{GI}}_{V^*}(x) = (x,r,H,\psi)\right) = \mathrm{Prob}\left(M^*(x) = (x,r,H,\psi) \mid M^*(x) \neq \bot\right).$$

proof: Let $m^*(x)$ describe $M^*(x)$ conditioned on its not being $\bot$. We first observe that both $m^*(x)$ and $\mathrm{view}^{P_{GI}}_{V^*}(x)$ are distributed over quadruples of the form $(x,r,\cdot,\cdot)$, with uniformly distributed $r \in \{0,1\}^{q(|x|)}$. Let $\nu(x,r)$ be a random variable describing the last two elements of $\mathrm{view}^{P_{GI}}_{V^*}(x)$ conditioned on the second element being equal to $r$. Similarly, let $\mu(x,r)$ describe the last two elements of $m^*(x)$ (conditioned on the second element being equal to $r$). Clearly, it suffices to show that $\nu(x,r)$ and $\mu(x,r)$ are identically distributed, for every $x$ and $r$. Observe that once $r$ is fixed, the message sent by $V^*$ on common input $x$, random-tape $r$, and incoming message $H$, is uniquely defined. Let us denote this message by $v^*(x,r,H)$. We show that both $\nu(x,r)$ and $\mu(x,r)$ are uniformly distributed over the set
$$C_{x,r} \stackrel{\rm def}{=} \left\{ (H,\psi) : H = \psi\left(G_{v^*(x,r,H)}\right) \right\}$$
where $\psi(G)$ denotes the graph obtained from $G$ by relabelling the vertices using the permutation $\psi$ (i.e., if $G = (V,E)$ then $\psi(G) = (V,F)$ so that $(u,v) \in E$ iff $(\psi(u),\psi(v)) \in F$). The proof of this statement is rather tedious and unrelated to the subjects of this book (and hence can be skipped with no damage).
The proof is slightly non-trivial because it relates (at least implicitly) to the automorphism group of the graph $G_2$ (i.e., the set of permutations $\pi$ for which $\pi(G_2)$ is identical, not just isomorphic, to $G_2$). For simplicity, consider first the special case in which the automorphism group of $G_2$ consists of merely the identity permutation (i.e., $G_2 = \pi(G_2)$ if and only if $\pi$ is the identity permutation). In this case, $(H,\psi) \in C_{x,r}$ if and only if $H$ is isomorphic to (both $G_1$ and) $G_2$ and $\psi$ is the isomorphism between $H$ and $G_{v^*(x,r,H)}$. Hence, $C_{x,r}$ contains exactly $|V_2|!$ pairs, each containing a different graph $H$ as the first element. In the general case, $(H,\psi) \in C_{x,r}$ if and only if $H$ is isomorphic to (both $G_1$ and) $G_2$ and $\psi$ is an isomorphism between $H$ and $G_{v^*(x,r,H)}$. We stress that $v^*(x,r,H)$ is the same in all pairs containing $H$. Let $\mathrm{aut}(G_2)$ denote the size of the automorphism group of $G_2$. Then, each $H$ (isomorphic to $G_2$) appears in exactly $\mathrm{aut}(G_2)$ pairs of $C_{x,r}$, and each such pair contains a different isomorphism between $H$ and $G_{v^*(x,r,H)}$.

We first consider the random variable $\mu(x,r)$ (describing the suffix of $m^*(x)$). Recall that $\mu(x,r)$ is defined by the following two-step random process. In the first step, one selects uniformly a pair $(\tau,\psi)$, over the set of pairs $\{1,2\}$-times-permutation, and sets $H = \psi(G_\tau)$. In the second step, one outputs (i.e., sets $\mu(x,r)$ to) $(\psi(G_\tau),\psi)$ if $v^*(x,r,H) = \tau$ (and ignores the $(\tau,\psi)$ pair otherwise). Hence, each graph $H$ (isomorphic to $G_2$) is generated, at the first step, by exactly $\mathrm{aut}(G_2)$ different $(1,\cdot)$-pairs (i.e., the pairs $(1,\psi)$ satisfying $H = \psi(G_1)$), and by exactly $\mathrm{aut}(G_2)$ different $(2,\cdot)$-pairs (i.e., the pairs $(2,\psi)$ satisfying $H = \psi(G_2)$). All these $2 \cdot \mathrm{aut}(G_2)$ pairs yield the same graph $H$, and hence lead to the same value of $v^*(x,r,H)$. It follows that out of the $2 \cdot \mathrm{aut}(G_2)$ pairs, $(\tau,\psi)$, yielding the graph $H = \psi(G_\tau)$, only the pairs satisfying $\tau = v^*(x,r,H)$ lead to an output.
Hence, for each $H$ (which is isomorphic to $G_2$), the probability that $\mu(x,r) = (H,\cdot)$ equals $\mathrm{aut}(G_2)/(|V_2|!)$. Furthermore, for each $H$ (which is isomorphic to $G_2$),
$$\mathrm{Prob}\left(\mu(x,r) = (H,\psi)\right) = \begin{cases} \frac{1}{|V_2|!} & \text{if } H = \psi\left(G_{v^*(x,r,H)}\right) \\ 0 & \text{otherwise} \end{cases}$$
Hence $\mu(x,r)$ is uniformly distributed over $C_{x,r}$.

We now consider the random variable $\nu(x,r)$ (describing the suffix of the verifier's view in a "real interaction" with the prover). Recall that $\nu(x,r)$ is defined by selecting uniformly a permutation $\pi$ (over the set $V_2$), and setting $\nu(x,r) = (\pi(G_2),\pi)$ if $v^*(x,r,\pi(G_2)) = 2$ and $\nu(x,r) = (\pi(G_2),\pi \circ \phi)$ otherwise, where $\phi$ is the isomorphism between $G_1$ and $G_2$. Clearly, for each $H$ (which is isomorphic to $G_2$), the probability that $\nu(x,r) = (H,\cdot)$ equals $\mathrm{aut}(G_2)/(|V_2|!)$. Furthermore, for each $H$ (which is isomorphic to $G_2$),
$$\mathrm{Prob}\left(\nu(x,r) = (H,\psi)\right) = \begin{cases} \frac{1}{|V_2|!} & \text{if } \psi = \pi \circ \phi^{2 - v^*(x,r,H)} \\ 0 & \text{otherwise} \end{cases}$$
Observing that $H = \psi\left(G_{v^*(x,r,H)}\right)$ if and only if $\psi = \pi \circ \phi^{2 - v^*(x,r,H)}$, we conclude that $\mu(x,r)$ and $\nu(x,r)$ are identically distributed. The claim follows. □

This completes the proof of Part (3) of the proposition.

6.3.3 Zero-Knowledge w.r.t. Auxiliary Inputs

The definitions of zero-knowledge presented above fall short of what is required in practical applications, and consequently a minor modification should be used. We recall that these definitions guarantee that whatever can be efficiently computed after interaction with the prover on any common input, can be efficiently computed from the input itself. However, in typical applications (e.g., when an interactive proof is used as a sub-protocol inside a bigger protocol) the verifier interacting with the prover, on common input $x$, may have some additional a-priori information, encoded by a string $z$, which may assist it in its attempts to "extract knowledge" from the prover. This danger may become even more acute in the likely case in which $z$ is related to $x$.
(For example, consider the protocol of Construction 6.16 and the case where the verifier has a-priori information concerning an isomorphism between the input graphs.) What is typically required is that whatever can be efficiently computed from $x$ and $z$ after interaction with the prover on any common input $x$, can be efficiently computed from $x$ and $z$ (without any interaction with the prover). This requirement is formulated below using the augmented notion of interactive proofs presented in Definition 6.10.

Definition 6.18 (zero-knowledge, revisited): Let $(P,V)$ be an interactive proof for a language $L$ (as in Definition 6.10). Denote by $P_L(x)$ the set of strings $y$ satisfying the completeness condition with respect to $x \in L$ (i.e., for every $z \in \{0,1\}^*$, $\mathrm{Prob}(\langle P(y),V(z)\rangle(x) = 1) \geq \frac{2}{3}$). We say that $(P,V)$ is zero-knowledge with respect to auxiliary input (auxiliary input zero-knowledge) if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic algorithm $M^*$, running in time polynomial in the length of its first input, so that the following two ensembles are computationally indistinguishable (when the distinguishing gap is considered as a function of $|x|$):

- $\{\langle P(y),V^*(z)\rangle(x)\}_{x \in L,\ y \in P_L(x),\ z \in \{0,1\}^*}$
- $\{M^*(x,z)\}_{x \in L,\ z \in \{0,1\}^*}$

Namely, for every probabilistic algorithm, $D$, with running-time polynomial in the length of the first input, every polynomial $p(\cdot)$, and all sufficiently long $x \in L$, all $y \in P_L(x)$ and $z \in \{0,1\}^*$, it holds that
$$\left|\mathrm{Prob}\left(D(x,z,\langle P(y),V^*(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(D(x,z,M^*(x,z)) = 1\right)\right| < \frac{1}{p(|x|)}$$

In the above definition $y$ represents a-priori information to the prover, whereas $z$ represents a-priori information to the verifier. Both $y$ and $z$ may depend on the common input $x$. We stress that the local inputs (i.e., $y$ and $z$) may not be known, even in part, to the counterpart. We also stress that the auxiliary input $z$ is also given to the distinguishing algorithm (which may be thought of as an extension of the verifier).
Recall that by Definition 6.10, saying that the interactive machine $V^*$ is probabilistic polynomial-time means that its running-time is bounded by a polynomial in the length of the common input. Hence, the verifier program, the simulator, and the distinguishing algorithm all run in time polynomial in the length of $x$ (and not in time polynomial in the total length of all their inputs). This convention is essential in many respects. For example, having allowed even one of these machines to run in time proportional to the length of the auxiliary input would have collapsed computational zero-knowledge to perfect zero-knowledge (e.g., by considering verifiers which run in time polynomial in the common-input yet have huge auxiliary inputs of length exponential in the common-input).

Definition 6.18 refers to computational zero-knowledge. A formulation of perfect zero-knowledge with respect to auxiliary input is straightforward. We remark that the perfect zero-knowledge proof for Graph Isomorphism, presented in Construction 6.16, is in fact perfect zero-knowledge with respect to auxiliary input. This fact follows easily by a minor augmentation to the simulator constructed in the proof of Proposition 6.17 (i.e., when invoking the verifier, the simulator should provide it with the auxiliary input which is given to the simulator). In general, a demonstration of zero-knowledge can be extended to yield zero-knowledge with respect to auxiliary input, provided that the simulator used in the original demonstration works by invoking the verifier's program as a black box. All simulators presented in this book have this property.

* Implicit non-uniformity in Definition 6.18

The non-uniform nature of Definition 6.18 is captured by the fact that the distinguisher gets an auxiliary input.
It is true that this auxiliary input is also given to both the verifier program and the simulator; however, if it is sufficiently long then only the distinguisher can make any use of its suffix. It follows that the simulator guaranteed in Definition 6.18 produces output that is indistinguishable from the real interactions also by non-uniform polynomial-size machines. Namely, for every (even non-uniform) polynomial-size circuit family, $\{C_n\}_{n \in \mathbb{N}}$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s, all $x \in L \cap \{0,1\}^n$, all $y \in P_L(x)$ and $z \in \{0,1\}^*$,
$$\left|\mathrm{Prob}\left(C_n(x,z,\langle P(y),V^*(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(C_n(x,z,M^*(x,z)) = 1\right)\right| < \frac{1}{p(|x|)}$$

Following is a sketch of the proof. We assume, to the contrary, that there exists a polynomial-size circuit family, $\{C_n\}_{n \in \mathbb{N}}$, such that for infinitely many $n$'s there exist triples $(x,y,z)$ for which $C_n$ has a non-negligible distinguishing gap. We derive a contradiction by incorporating the description of $C_n$ together with the auxiliary input $z$ into a longer auxiliary input, denoted $z'$. This is done in a way that leaves both $V^*$ and $M^*$ without sufficient time to reach the description of $C_n$. For example, let $q(\cdot)$ be a polynomial bounding the running-time of both $V^*$ and $M^*$, as well as the size of $C_n$. Then, we let $z'$ be the string which results by padding $z$ with blanks to a total length of $q(n)$ and appending the description of the circuit $C_n$ at its end (i.e., if $|z| > q(n)$ then the first $q(n)$ symbols of $z'$ are a prefix of $z$). Clearly, $M^*(x,z') = M^*(x,z)$ and $\langle P(y),V^*(z')\rangle(x) = \langle P(y),V^*(z)\rangle(x)$. On the other hand, by using a circuit evaluating algorithm, we get an algorithm $D$ such that $D(x,z',\alpha) = C_n(x,z,\alpha)$ for every $\alpha$, and a contradiction follows.

6.3.4 Sequential Composition of Zero-Knowledge Proofs

An intuitive requirement that a definition of zero-knowledge proofs must satisfy is that zero-knowledge proofs are closed under sequential composition. Namely, if one executes one zero-knowledge proof after another then the composed execution must be zero-knowledge.
The same should remain valid even if one executes polynomially many proofs one after the other. Indeed, as we will shortly see, the revised definition of zero-knowledge (i.e., Definition 6.18) satisfies this requirement. Interestingly, zero-knowledge proofs as defined in Definition 6.12 are not closed under sequential composition, and this fact is another indication of the necessity of augmenting this definition (as done in Definition 6.18).

In addition to its conceptual importance, the Sequential Composition Lemma is an important tool in the design of zero-knowledge proof systems. Typically, these proof systems consist of many repetitions of an atomic zero-knowledge proof. Loosely speaking, the atomic proof provides some (but not much) statistical evidence for the validity of the claim. By repeating the atomic proof sufficiently many times, the confidence in the validity of the claim is increased. More precisely, the atomic proof offers a gap between the accepting probability of strings in the language and strings outside the language. For example, in Construction 6.16 pairs of isomorphic graphs (i.e., inputs in $GI$) are accepted with probability 1, whereas pairs of non-isomorphic graphs (i.e., inputs not in $GI$) are accepted with probability at most $\frac{1}{2}$. By repeating the atomic proof, the gap between the two probabilities is further increased. For example, repeating the proof of Construction 6.16 $k$ times yields a new interactive proof in which inputs in $GI$ are still accepted with probability 1, whereas inputs not in $GI$ are accepted with probability at most $\frac{1}{2^k}$. The Sequential Composition Lemma guarantees that if the atomic proof system is zero-knowledge then so is the proof system resulting from repeating the atomic proof polynomially many times.

Before we state the Sequential Composition Lemma, we remind the reader that the zero-knowledge property of an interactive proof is actually a property of the prover.
Also, the prover is required to be zero-knowledge only on inputs in the language. Finally, we stress that when talking about zero-knowledge with respect to auxiliary input we refer to all possible auxiliary inputs for the verifier.

Lemma 6.19 (Sequential Composition Lemma): Let $P$ be an interactive machine (i.e., a prover) which is zero-knowledge with respect to auxiliary input on some language $L$. Suppose that the last message sent by $P$, on input $x$, bears a special "end of proof" symbol. Let $Q(\cdot)$ be a polynomial, and let $P_Q$ be an interactive machine that, on common input $x$, proceeds in $Q(|x|)$ phases, each of them consisting of running $P$ on common input $x$. (We stress that in case $P$ is probabilistic, the interactive machine $P_Q$ uses independent coin tosses for each of the $Q(|x|)$ phases.) Then $P_Q$ is zero-knowledge (with respect to auxiliary input) on $L$. Furthermore, if $P$ is perfect zero-knowledge (with respect to auxiliary input) then so is $P_Q$.

The convention concerning "end of proof" is introduced for technical purposes (and is redundant in all known provers, for which the number of messages sent is easily computed from the length of the common input). Clearly, every machine $P$ can be easily modified so that its last message bears an appropriate symbol (as assumed above), and doing so preserves the zero-knowledge properties of $P$ (as well as the completeness and soundness conditions).

The Lemma remains valid also if one allows auxiliary input to the prover. The extension is straightforward. The lemma ignores other aspects of repeating an interactive proof several times; specifically, the effect on the gap between the accepting probability of inputs inside and outside of the language. This aspect of repetition is discussed in the previous section (see also Exercise 1).

Proof: Let $V^*$ be an arbitrary probabilistic polynomial-time interactive machine interacting with the composed prover $P_Q$.
Our task is to construct a (polynomial-time) simulator, $M^*$, which simulates the real interactions of $V^*$ with $P_Q$. Following is a very high level description of the simulation. The key idea is to simulate the real interaction on common input $x$ in $Q(|x|)$ phases, corresponding to the phases of the operation of $P_Q$. Each phase of the operation of $P_Q$ is simulated using the simulator guaranteed for the atomic prover $P$. The information accumulated by the verifier in each phase is passed to the next phase using the auxiliary input.

The first step in carrying out the above plan is to partition the execution of an arbitrary interactive machine $V^*$ into phases. The partition may not exist in the code of the program $V^*$, and yet it can be imposed on the executions of this program. This is done using the phase structure of the prescribed prover $P_Q$, which is induced by the "end of proof" symbols. Hence, we claim that no matter how $V^*$ operates, the interaction of $V^*$ with $P_Q$ on common input $x$ can be captured by $Q(|x|)$ successive interactions of a related machine, denoted $V^{**}$, with $P$. Namely,

Claim 6.19.1: There exists a probabilistic polynomial-time $V^{**}$ so that for every common input $x$ and auxiliary input $z$ it holds that
$$\langle P_Q,V^*(z)\rangle(x) = Z^{(Q(|x|))}$$
where
$$Z^{(0)} \stackrel{\rm def}{=} z \quad\text{and}\quad Z^{(i+1)} \stackrel{\rm def}{=} \langle P,V^{**}(Z^{(i)})\rangle(x)$$
Namely, $Z^{(Q(|x|))}$ is a random variable describing the output of $V^{**}$ after $Q(|x|)$ successive interactions with $P$, on common input $x$, where the auxiliary input of $V^{**}$ in the $(i+1)$st interaction equals the output of $V^{**}$ after the $i$th interaction (i.e., $Z^{(i)}$).

proof: Consider an interaction of $V^*(z)$ with $P_Q$, on common input $x$. Machine $V^*$ can be slightly modified so that it starts its execution by reading the common-input, the random-input and the auxiliary-input into special regions in its work-tape, and never accesses the above read-only tapes again.
Likewise, $V^*$ is modified so that it starts each active period by reading the current incoming message from the communication-tape to a special region in the work-tape (and never accesses the incoming message-tape again during this period). Actually, the above description should be modified so that $V^*$ copies only a polynomially long (in the common input) prefix of each of these tapes, the polynomial being the one bounding the running time of $V^*$.

Considering the contents of the work-tape of $V^*$ at the end of each of the $Q(|x|)$ phases (of interactions with $P_Q$) naturally leads us to the construction of $V^{**}$. Namely, on common input $x$ and auxiliary input $z'$, machine $V^{**}$ starts by copying $z'$ into the work-tape of $V^*$. Next, machine $V^{**}$ simulates a single phase of the interaction of $V^*$ with $P_Q$ (on input $x$), starting with the above contents of the work-tape of $V^*$ (instead of starting with an empty work-tape). The invoked machine $V^*$ regards the communication-tapes of machine $V^{**}$ as its own communication-tapes. Finally, $V^{**}$ terminates by outputting the current contents of the work-tape of $V^*$.

Actually, the above description should be slightly modified to deal differently with the first phase in the interaction with $P_Q$. Specifically, $V^{**}$ copies $z'$ into the work-tape of $V^*$ only if $z'$ encodes a contents of the work-tape of $V^*$ (we assume, w.l.o.g., that the contents of the work-tape of $V^*$ is encoded differently from the encoding of an auxiliary input for $V^*$). In case $z'$ encodes an auxiliary input to $V^*$, machine $V^{**}$ invokes $V^*$ on an empty work-tape, and $V^*$ regards the readable tapes of $V^{**}$ (i.e., the common-input-tape, the random-input-tape and the auxiliary-input-tape) as its own. Observe that $Z^{(1)} \stackrel{\rm def}{=} \langle P,V^{**}(z)\rangle(x)$ describes the contents of the work-tape of $V^*$ after one phase, and $Z^{(i)} \stackrel{\rm def}{=} \langle P,V^{**}(Z^{(i-1)})\rangle(x)$ describes the contents of the work-tape of $V^*$ after $i$ phases. The claim follows.
□

Since $V^{**}$ is a polynomial-time interactive machine (with auxiliary input) interacting with $P$, it follows by the lemma's hypothesis that there exists a probabilistic machine which simulates these interactions in time polynomial in the length of the first input. Let $M^{**}$ denote this simulator. We may assume, without loss of generality, that with overwhelmingly high probability $M^{**}$ halts with output (as we can increase the probability of output by successive applications of $M^{**}$). Furthermore, for the sake of simplicity, we assume in the rest of this proof that $M^{**}$ always halts with output. Namely, for every probabilistic polynomial-time (in $x$) algorithm $D$, every polynomial $p(\cdot)$, all sufficiently long $x \in L$ and all $z \in \{0,1\}^*$, we have
$$\left|\mathrm{Prob}\left(D(x,z,\langle P,V^{**}(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(D(x,z,M^{**}(x,z)) = 1\right)\right| < \frac{1}{p(|x|)}$$

We are now ready to present the construction of a simulator, $M^*$, that simulates the "real" output of $V^*$ after interaction with $P_Q$. Machine $M^*$ uses the above guaranteed simulator $M^{**}$. On input $(x,z)$, machine $M^*$ sets $z^{(0)} = z$ and proceeds in $Q(|x|)$ phases. In the $i$th phase, machine $M^*$ computes $z^{(i)}$ by running machine $M^{**}$ on input $(x,z^{(i-1)})$. After $Q(|x|)$ phases are completed, machine $M^*$ stops, outputting $z^{(Q(|x|))}$.

Clearly, machine $M^*$, constructed above, runs in time polynomial in its first input. (For non-constant $Q(\cdot)$ it is crucial here that the running-time of $M^{**}$ is polynomial in the length of the first input, rather than being polynomial in the length of both inputs.) It is left to show that machine $M^*$ indeed produces output which is polynomially indistinguishable from the output of $V^*$ (after interacting with $P_Q$). Namely,

Claim 6.19.2: For every probabilistic algorithm $D$, with running-time polynomial in its first input, every polynomial $p(\cdot)$, all sufficiently long $x \in L$ and all $z \in \{0,1\}^*$, we have
$$\left|\mathrm{Prob}\left(D(x,z,\langle P_Q,V^*(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(D(x,z,M^*(x,z)) = 1\right)\right| < \frac{1}{p(|x|)}$$

proof sketch: We use a hybrid argument.
In particular, we define the following $Q(|x|)+1$ hybrids. The $i$th hybrid, $0 \leq i \leq Q(|x|)$, corresponds to the following random process. We first let $V^{**}$ interact with $P$ for $i$ phases, starting with common input $x$ and auxiliary input $z$, and denote by $Z^{(i)}$ the output of $V^{**}$ after the $i$th phase. We next repeatedly iterate $M^{**}$ for the remaining $Q(|x|)-i$ phases. In both cases, we use the output of the previous phase as auxiliary input to the new phase. Formally, the hybrid $H^{(i)}$ is defined as follows.
$$H^{(i)}(x,z) \stackrel{\rm def}{=} M^{**}_{Q(|x|)-i}(x,Z^{(i)})$$
where
$$Z^{(0)} \stackrel{\rm def}{=} z \quad\text{and}\quad Z^{(j+1)} \stackrel{\rm def}{=} \langle P,V^{**}(Z^{(j)})\rangle(x)$$
$$M^{**}_0(x,z') \stackrel{\rm def}{=} z' \quad\text{and}\quad M^{**}_j(x,z') \stackrel{\rm def}{=} M^{**}_{j-1}(x,M^{**}(x,z'))$$

Using Claim 6.19.1, the $Q(|x|)$th hybrid (i.e., $H^{(Q(|x|))}(x,z)$) equals $\langle P_Q,V^*(z)\rangle(x)$. On the other hand, recalling the construction of $M^*$, we see that the zero hybrid (i.e., $H^{(0)}(x,z)$) equals $M^*(x,z)$. Hence, all that is required to complete the proof is to show that every two adjacent hybrids are polynomially indistinguishable (as this would imply that the extreme hybrids, $H^{(Q(|x|))}$ and $H^{(0)}$, are indistinguishable too). To this end, we rewrite the $i$th and $(i-1)$st hybrids as follows.
$$H^{(i)}(x,z) = M^{**}_{Q(|x|)-i}(x,\langle P,V^{**}(Z^{(i-1)})\rangle(x))$$
$$H^{(i-1)}(x,z) = M^{**}_{Q(|x|)-i}(x,M^{**}(x,Z^{(i-1)}))$$
where $Z^{(i-1)}$ is as defined above (in the definition of the hybrids).

Using an averaging argument, it follows that if an algorithm, $D$, distinguishes the hybrids $H^{(i)}(x,z)$ and $H^{(i-1)}(x,z)$ then there exists a $z'$ so that algorithm $D$ distinguishes the random variables $M^{**}_{Q(|x|)-i}(x,\langle P,V^{**}(z')\rangle(x))$ and $M^{**}_{Q(|x|)-i}(x,M^{**}(x,z'))$ at least as well. Incorporating algorithm $M^{**}$ into $D$, we get a new algorithm $D'$, with running time polynomially related to the former algorithms, which distinguishes the random variables $(x,z',\langle P,V^{**}(z')\rangle(x))$ and $(x,z',M^{**}(x,z'))$ at least as well. (Further details are presented below.) Contradiction (to the hypothesis that $M^{**}$ simulates $(P,V^{**})$) follows. □

The lemma follows.
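The composed simulator and the hybrids used in the proof can be sketched as follows. This is only an illustration of the bookkeeping, under stated assumptions: `real_phase(x, z)` stands in for one interaction $\langle P, V^{**}(z)\rangle(x)$, `atomic_sim(x, z)` stands in for the atomic simulator, `num_phases` stands in for the polynomial $Q$, and the function names are hypothetical.

```python
def compose_simulator(atomic_sim, num_phases):
    """Build the composed simulator: run the atomic simulator Q(|x|) times,
    passing each phase's output to the next phase as auxiliary input."""
    def composed(x, z):
        for _ in range(num_phases(len(x))):
            z = atomic_sim(x, z)
        return z
    return composed

def hybrid(i, real_phase, atomic_sim, num_phases):
    """The i-th hybrid: i real phases followed by Q(|x|) - i simulated phases."""
    def run(x, z):
        q = num_phases(len(x))
        for _ in range(i):
            z = real_phase(x, z)      # phases 1..i use the real interaction
        for _ in range(q - i):
            z = atomic_sim(x, z)      # remaining phases use the simulator
        return z
    return run
```

With these stand-ins, `hybrid(0, ...)` coincides with `compose_simulator(...)` and `hybrid(Q, ...)` consists of real phases only, while hybrids `i - 1` and `i` differ only in how phase `i` is produced; this single-phase difference is what the averaging argument exploits.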
Further details concerning the proof of Claim 6.19.2: The proof of Claim 6.19.2 is rather sketchy. The main thing which is missing is the details concerning the way in which an algorithm contradicting the hypothesis that $M^{**}$ is a simulator for $(P,V^{**})$ is derived from an algorithm contradicting the statement of Claim 6.19.2. These details are presented below, and the reader is encouraged not to skip them.

Let us start with the non-problematic part. We assume, to the contrary, that there exists a probabilistic polynomial-time algorithm, $D$, and a polynomial $p(\cdot)$, so that for infinitely many $x \in L$ there exists $z \in \{0,1\}^*$ such that
$$\left|\mathrm{Prob}\left(D(x,z,\langle P_Q,V^*(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(D(x,z,M^*(x,z)) = 1\right)\right| > \frac{1}{p(|x|)}$$
It follows that for every such $x$ and $z$, there exists an $i \in \{1,\ldots,Q(|x|)\}$ such that
$$\left|\mathrm{Prob}\left(D(x,z,H^{(i)}(x,z)) = 1\right) - \mathrm{Prob}\left(D(x,z,H^{(i-1)}(x,z)) = 1\right)\right| > \frac{1}{Q(|x|) \cdot p(|x|)}$$
Denote $\epsilon(n) \stackrel{\rm def}{=} 1/(Q(n) \cdot p(n))$. Combining the definition of the $i$th and $(i-1)$st hybrids with an averaging argument, it follows that for each such $x$, $z$ and $i$, there exists a $z'$, in the support of $Z^{(i-1)}$ (defined as above), such that
$$\left|\mathrm{Prob}\left(D(x,z',M^{**}_{Q(|x|)-i}(x,\langle P,V^{**}(z')\rangle(x))) = 1\right) - \mathrm{Prob}\left(D(x,z',M^{**}_{Q(|x|)-i}(x,M^{**}(x,z'))) = 1\right)\right| > \epsilon(|x|)$$
This almost leads to the desired contradiction. Namely, the random variables $(x,z',\langle P,V^{**}(z')\rangle(x))$ and $(x,z',M^{**}(x,z'))$ can be distinguished using algorithms $D$ and $M^{**}$, provided we "know" $i$. The problem is resolved using the fact, pointed out at the end of Subsection 6.3.3, that the output of $M^{**}$ is indistinguishable from the interactions of $V^{**}$ with the prover even with respect to non-uniform polynomial-size circuits. Details follow.

We construct a polynomial-size circuit family, denoted $\{C_n\}$, which distinguishes $(x,z',\langle P,V^{**}(z'')\rangle(x))$ and $(x,z',M^{**}(x,z''))$, for the above-mentioned $(x,z')$ pairs.
On input $x$ (supposedly in $L \cap \{0,1\}^n$) and $\alpha$ (supposedly in either $(x,z',\langle P,V^{**}(z'')\rangle(x))$ or $(x,z',M^{**}(x,z''))$), the circuit $C_n$, incorporating (the above-mentioned) $i$, uses algorithm $M^{**}$ to compute $\beta = M^{**}_{Q(|x|)-i}(x,\alpha)$. Next $C_n$, using algorithm $D$, computes $\sigma = D((x,z'),\beta)$ and halts outputting $\sigma$. Contradiction (to the hypothesis that $M^{**}$ is a simulator for $(P,V^{**})$) follows. □

And what about parallel composition?

Unfortunately, we cannot prove that zero-knowledge (even with respect to auxiliary input) is preserved under parallel composition. Furthermore, there exist zero-knowledge proofs that when played twice in parallel do yield knowledge (to a "cheating verifier"). For further details see Subsection 6.5.

The fact that zero-knowledge is not preserved under parallel composition of protocols is indeed bad news. One may even think that this fact is a conceptually annoying phenomenon. We disagree with this feeling. Our feeling is that the behaviour of protocols and "games" under parallel composition is, in general (i.e., not only in the context of zero-knowledge), a much more complex issue than the behaviour under sequential composition. Furthermore, the only advantage of parallel composition over sequential composition is in efficiency. Hence, we don't consider the non-closure under parallel composition to be a conceptual weakness of the formulation of zero-knowledge. Yet, the "non-closure" of zero-knowledge motivates the search for either weaker or stronger notions which are preserved under parallel composition. For further details, the reader is referred to Sections 6.9 and 6.6.

6.4 Zero-Knowledge Proofs for NP

This section presents the main thrust of the entire chapter; namely, a method for constructing zero-knowledge proofs for every language in $NP$. The importance of this method stems from its generality, which is the key to its many applications.
Specifically, we observe that almost all statements one wishes to prove in practice can be encoded as claims concerning membership in languages in $NP$.

The method for constructing zero-knowledge proofs for NP-languages makes essential use of the concept of bit commitment. Hence, we start with a presentation of this concept.

6.4.1 Commitment Schemes

Commitment schemes are a basic ingredient in many cryptographic protocols. They are used to enable a party to commit itself to a value while keeping it secret. At a later stage the commitment is "opened", and it is guaranteed that the "opening" can yield only a single value determined in the committing phase. Commitment schemes are the digital analogue of nontransparent sealed envelopes. By putting a note in such an envelope a party commits itself to the contents of the note while keeping it secret.

Definition

Loosely speaking, a commitment scheme is an efficient two-phase two-party protocol through which one party, called the sender, can commit itself to a value so that the following two conflicting requirements are satisfied.

1. Secrecy: At the end of the first phase, the other party, called the receiver, does not gain any knowledge of the sender's value. This requirement has to be satisfied even if the receiver tries to cheat.

2. Unambiguity: Given the transcript of the interaction in the first phase, there exists at most one value which the receiver may later (i.e., in the second phase) accept as a legal "opening" of the commitment. This requirement has to be satisfied even if the sender tries to cheat.

In addition, one should require that the protocol is viable in the sense that if both parties follow it then, at the end of the second phase, the receiver gets the value committed to by the sender. The first phase is called the commit phase, and the second phase is called the reveal phase.
We are requiring that the commit phase yield no knowledge (at least not of the sender's value) to the receiver, whereas the commit phase does "commit" the sender to a unique value (in the sense that in the reveal phase the receiver may accept only this value). We stress that the protocol is efficient in the sense that the predetermined programs of both parties can be implemented in probabilistic polynomial-time. Without loss of generality, the reveal phase may consist of merely letting the sender send, to the receiver, the original value and the sequence of random coin tosses that it has used during the commit phase. The receiver will accept the value if and only if the supplied information matches its transcript of the interaction in the commit phase. The latter convention leads to the following definition (which refers explicitly only to the commit phase).

Definition 6.20 (bit commitment scheme): A bit commitment scheme is a pair of probabilistic polynomial-time interactive machines, denoted $(S, R)$ (for sender and receiver), satisfying:

Input Specification: The common input is an integer $n$ presented in unary (serving as the security parameter). The private input to the sender is a bit $v$.

Secrecy: The receiver (even when deviating arbitrarily from the protocol) cannot distinguish a commitment to 0 from a commitment to 1. Namely, for every probabilistic polynomial-time machine $R^*$ interacting with $S$, the random variables describing the output of $R^*$ in the two cases, namely $\langle S(0), R^*\rangle(1^n)$ and $\langle S(1), R^*\rangle(1^n)$, are polynomially-indistinguishable.

Unambiguity:

Preliminaries

- A receiver's view of an interaction with the sender, denoted $(r, m)$, consists of the random coins used by the receiver ($r$) and the sequence of messages received from the sender ($m$).

- Let $\sigma \in \{0,1\}$. We say that a receiver's view (of such an interaction), $(r, m)$, is a possible $\sigma$-commitment if there exists a string $s$ such that $m$ describes the messages received by $R$ when $R$ uses local coins $r$ and interacts with machine $S$ which uses local coins $s$ and has input $(\sigma, 1^n)$.
(Using the notation of Definition 6.13, the condition may be expressed as $m = \mathrm{view}^{S(\sigma,1^n,s)}_{R(1^n,r)}$.)

- We say that the receiver's view $(r, m)$, consisting of the receiver's coins $r$ and the messages $m$ received from the sender, is ambiguous if it is both a possible 0-commitment and a possible 1-commitment.

The unambiguity requirement asserts that, for all but a negligible fraction of the coin tosses of the receiver, there exists no sequence of messages (from the sender) which together with these coin tosses forms an ambiguous receiver view. Namely, for all but a negligible fraction of the $r \in \{0,1\}^{\mathrm{poly}(n)}$ there is no $m$ such that $(r, m)$ is ambiguous.

The secrecy requirement (above) is analogous to the definition of indistinguishability of encryptions (i.e., Definition [missing(enc-indist.def)]). An equivalent formulation analogous to semantic security (i.e., Definition [missing(enc-semant.def)]) can be presented, but is less useful in typical applications of commitment schemes. In any case, the secrecy requirement is a computational one. On the other hand, the unambiguity requirement has an information theoretic flavour (i.e., it does not refer to computational powers). A dual definition, requiring information theoretic secrecy and computational infeasibility of creating ambiguities, is presented in Subsection 6.8.2.

The secrecy requirement refers explicitly to the situation at the end of the commit phase. On the other hand, we stress that the unambiguity requirement implicitly assumes that the reveal phase takes the following form:

1. the sender sends to the receiver its initial private input, $v$, and the random coins, $s$, it has used in the commit phase;

2. the receiver verifies that $v$ and $s$ (together with the coins ($r$) used by $R$ in the commit phase) indeed yield the messages that $R$ has received in the commit phase. Verification is done in polynomial-time (by running the programs $S$ and $R$).
Note that the viability requirement (i.e., asserting that if both parties follow the protocol then, at the end of the reveal phase, the receiver gets $v$) is implicitly satisfied by the above convention.

Construction based on any one-way permutation

Some public-key encryption schemes can be used as commitment schemes. This can be done by having the sender generate a pair of keys and use the public-key together with the encryption of a value as its commitment to the value. In order to satisfy the unambiguity requirement, the underlying public-key scheme needs to satisfy additional requirements (e.g., the set of legitimate public-keys should be efficiently recognizable). In any case, public-key encryption schemes have additional properties not required of commitment schemes and their existence seems to require stronger intractability assumptions. An alternative construction, presented below, uses any one-way permutation. Specifically, we use a one-way permutation, denoted $f$, and a hard-core predicate for it, denoted $b$ (see Section 2.5).

Construction 6.21 (simple bit commitment): Let $f : \{0,1\}^* \to \{0,1\}^*$ be a function, and $b : \{0,1\}^* \to \{0,1\}$ be a predicate.

1. commit phase: To commit to value $v \in \{0,1\}$ (using security parameter $n$), the sender uniformly selects $s \in \{0,1\}^n$ and sends the pair $(f(s), b(s) \oplus v)$ to the receiver.

2. reveal phase: In the reveal phase, the sender reveals the string $s$ used in the commit phase. The receiver accepts the value $v$ if $f(s) = \alpha$ and $b(s) \oplus v = \sigma$, where $(\alpha, \sigma)$ is the receiver's view of the commit phase.

Proposition 6.22 Let $f : \{0,1\}^* \to \{0,1\}^*$ be a length preserving 1-1 one-way function, and $b : \{0,1\}^* \to \{0,1\}$ be a hard-core predicate of $f$. Then, the protocol presented in Construction 6.21 constitutes a bit commitment scheme.

Proof: The secrecy requirement follows directly from the fact that $b$ is a hard-core of $f$. The unambiguity requirement follows from the 1-1 property of $f$.
In fact, there exists no ambiguous receiver view. Namely, for each receiver view $(\alpha, \sigma)$, there is a unique $s \in \{0,1\}^{|\alpha|}$ so that $f(s) = \alpha$ and hence a unique $v \in \{0,1\}$ so that $b(s) \oplus v = \sigma$.

Construction based on any one-way function

We now present a construction of a bit commitment scheme which is based on the weakest assumption possible: the existence of one-way functions. Proving that the assumption is indeed minimal is left as an exercise (i.e., Exercise 12). On the other hand, by the results in Chapter 3 (specifically, Theorems 3.11 and 3.29), the existence of one-way functions implies the existence of pseudorandom generators expanding $n$-bit strings into $3n$-bit strings. We will use such a pseudorandom generator in the construction presented below.

We start by motivating the construction. Let $G$ be a pseudorandom generator satisfying $|G(s)| = 3 \cdot |s|$. Assume that $G$ has the property that the sets $\{G(s) : s \in \{0,1\}^n\}$ and $\{G(s) \oplus 1^{3n} : s \in \{0,1\}^n\}$ are disjoint, where $\alpha \oplus \beta$ denotes the bit-by-bit exclusive-or of the strings $\alpha$ and $\beta$. Then, the sender may commit itself to the bit $v$ by uniformly selecting $s \in \{0,1\}^n$ and sending the message $G(s) \oplus v^{3n}$ ($v^k$ denotes the all-$v$'s $k$-bit long string). Unfortunately, the above assumption cannot be justified, in general, and a slightly more complex variant is required. The key observation is that for most strings $\alpha \in \{0,1\}^{3n}$ the sets $\{G(s) : s \in \{0,1\}^n\}$ and $\{G(s) \oplus \alpha : s \in \{0,1\}^n\}$ are disjoint. Such a string $\alpha$ is called good. This observation suggests the following protocol. The receiver uniformly selects $\alpha \in \{0,1\}^{3n}$, hoping that it is good, and the sender commits to the bit $v$ by uniformly selecting $s \in \{0,1\}^n$ and sending the message $G(s)$ if $v = 0$ and $G(s) \oplus \alpha$ otherwise.

Construction 6.23 (bit commitment under general assumptions): Let $G : \{0,1\}^* \to \{0,1\}^*$ be a function so that $|G(s)| = 3 \cdot |s|$ for all $s \in \{0,1\}^*$.

1.
commit phase: To receive a commitment to a bit (using security parameter $n$), the receiver uniformly selects $r \in \{0,1\}^{3n}$ and sends it to the sender. Upon receiving the message $r$ (from the receiver), the sender commits to value $v \in \{0,1\}$ by uniformly selecting $s \in \{0,1\}^n$ and sending $G(s)$ if $v = 0$ and $G(s) \oplus r$ otherwise.

2. reveal phase: In the reveal phase, the sender reveals the string $s$ used in the commit phase. The receiver accepts the value 0 if $G(s) = \alpha$ and the value 1 if $G(s) \oplus r = \alpha$, where $(r, \alpha)$ is the receiver's view of the commit phase.

Proposition 6.24 If $G$ is a pseudorandom generator, then the protocol presented in Construction 6.23 constitutes a bit commitment scheme.

Proof: The secrecy requirement follows from the fact that $G$ is a pseudorandom generator. Specifically, let $U_k$ denote the random variable uniformly distributed on strings of length $k$. Then for every $r \in \{0,1\}^{3n}$, the random variables $U_{3n}$ and $U_{3n} \oplus r$ are identically distributed. Hence, if it is feasible to find an $r \in \{0,1\}^{3n}$ such that $G(U_n)$ and $G(U_n) \oplus r$ are computationally distinguishable then either $U_{3n}$ and $G(U_n)$ are computationally distinguishable or $U_{3n} \oplus r$ and $G(U_n) \oplus r$ are computationally distinguishable. In either case contradiction to the pseudorandomness of $G$ follows.

We now turn to the unambiguity requirement. Following the motivating discussion, we call $\alpha \in \{0,1\}^{3n}$ good if the sets $\{G(s) : s \in \{0,1\}^n\}$ and $\{G(s) \oplus \alpha : s \in \{0,1\}^n\}$ are disjoint. We say that $\alpha \in \{0,1\}^{3n}$ yields a collision between the seeds $s_1$ and $s_2$ if $G(s_1) = G(s_2) \oplus \alpha$. Clearly, $\alpha$ is good if it does not yield a collision between any pair of seeds. On the other hand, there is a unique string $\alpha$ which yields a collision between a given pair of seeds (i.e., $\alpha = G(s_1) \oplus G(s_2)$). Since there are $2^{2n}$ possible pairs of seeds, at most $2^{2n}$ strings yield collisions between seeds and all the other $3n$-bit long strings are good. It follows that with probability at least $1 - 2^{2n-3n} = 1 - 2^{-n}$ the receiver selects a good string. The unambiguity requirement follows.
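As an illustration only, the two phases of Construction 6.23 can be sketched in Python as follows. Everything named here is an assumption made for the sketch: SHAKE-256 serves as a heuristic stand-in for the length-tripling generator $G$ (it is not the generator obtained from a one-way function in Chapter 3), lengths are counted in bytes rather than bits, and the function names `receiver_message`, `commit` and `reveal` are hypothetical.

```python
import hashlib
import secrets


def G(seed: bytes) -> bytes:
    """Stand-in length-tripling generator (assumption: SHAKE-256 as a heuristic PRG)."""
    return hashlib.shake_256(seed).digest(3 * len(seed))


def xor(a: bytes, b: bytes) -> bytes:
    """Bit-by-bit exclusive-or of two equal-length strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def receiver_message(n_bytes: int) -> bytes:
    """Commit phase, receiver: select r uniformly among strings of length 3n."""
    return secrets.token_bytes(3 * n_bytes)


def commit(v: int, r: bytes):
    """Commit phase, sender: return (alpha, s) with alpha = G(s) or G(s) xor r."""
    assert v in (0, 1) and len(r) % 3 == 0
    s = secrets.token_bytes(len(r) // 3)
    alpha = G(s) if v == 0 else xor(G(s), r)
    return alpha, s


def reveal(r: bytes, alpha: bytes, s: bytes):
    """Reveal phase, receiver: return the accepted bit, or None if s opens nothing."""
    if 3 * len(s) != len(r):
        return None
    if G(s) == alpha:
        return 0
    if xor(G(s), r) == alpha:
        return 1
    return None
```

A typical run has the receiver call `receiver_message(16)`, the sender call `commit(v, r)` and transmit only `alpha`, and the receiver later call `reveal(r, alpha, s)` once `s` is disclosed. The sketch does not model a cheating party; it only mirrors the message flow of the construction.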
Extensions

The definition and the constructions of bit commitment schemes are easily extended to general commitment schemes enabling the sender to commit to a string rather than to a single bit. When defining the secrecy of such schemes the reader is advised to consult Definition [missing(enc-indist.def)]. For the purposes of the rest of this section we need a commitment scheme by which one can commit to a ternary value. Extending the definition and the constructions to deal with this case is even more straightforward.

In the rest of this section we will need commitment schemes with a seemingly stronger secrecy requirement than defined above. Specifically, instead of requiring secrecy with respect to all polynomial-time machines, we will require secrecy with respect to all (not necessarily uniform) families of polynomial-size circuits. Assuming the existence of non-uniformly one-way functions (see Definition 2.6 in Section 2.2), commitment schemes with non-uniform secrecy can be constructed, following the same constructions used in the uniform case.

6.4.2 Zero-Knowledge proof of Graph Coloring

Presenting a zero-knowledge proof system for one $NP$-complete language implies the existence of a zero-knowledge proof system for every language in $NP$. This intuitively appealing statement does require a proof, which we postpone to a later stage. In the current subsection we present a zero-knowledge proof system for one $NP$-complete language, specifically Graph 3-Colorability. This choice is indeed arbitrary.

The language Graph 3-Coloring, denoted $G3C$, consists of all simple graphs (i.e., no parallel edges or self-loops) that can be vertex-colored using 3 colors so that no two adjacent vertices are given the same color. Formally, a graph $G = (V, E)$ is 3-colorable if there exists a mapping $\phi : V \to \{1,2,3\}$ so that $\phi(u) \neq \phi(v)$ for every $(u, v) \in E$.
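To make the formal definition concrete, the following Python sketch checks whether a given mapping is a proper 3-coloring and, for very small graphs only, searches for one exhaustively. The graph representation (a vertex list plus a list of edge pairs) and the function names are assumptions chosen for illustration; the exhaustive search takes time exponential in $|V|$, as one would expect for an $NP$-complete language.

```python
from itertools import product


def is_proper_3_coloring(vertices, edges, phi):
    """True iff phi maps every vertex into {1, 2, 3} and no edge is monochromatic."""
    if any(phi.get(v) not in (1, 2, 3) for v in vertices):
        return False
    return all(phi[u] != phi[v] for u, v in edges)


def find_3_coloring(vertices, edges):
    """Try all 3^|V| mappings; return a proper 3-coloring or None if none exists."""
    vertices = list(vertices)
    for colors in product((1, 2, 3), repeat=len(vertices)):
        phi = dict(zip(vertices, colors))
        if is_proper_3_coloring(vertices, edges, phi):
            return phi
    return None
```

For example, a triangle admits a proper 3-coloring, whereas the complete graph on four vertices does not, so `find_3_coloring` returns `None` for it.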
Motivating discussion

The idea underlying the zero-knowledge proof system for $G3C$ is to break the proof of the claim that a graph is 3-colorable into polynomially many pieces arranged in templates so that each template by itself yields no knowledge and yet all the templates put together guarantee the validity of the main claim. Suppose that the prover generates such pieces of information, places each of them in a separate sealed and nontransparent envelope, and allows the verifier to open and inspect the pieces participating in one of the templates. Then certainly the verifier gains no knowledge in the process, yet his confidence in the validity of the claim (that the graph is 3-colorable) increases.

A concrete implementation of this abstract scheme follows. To prove that the graph $G = (V, E)$ is 3-colorable, the prover generates a random 3-coloring of the graph, denoted $\phi$ (actually a random relabelling of a fixed coloring will do). The color of each single vertex constitutes a piece of information concerning the 3-coloring. The set of templates corresponds to the set of edges (i.e., each pair $(\phi(u), \phi(v))$, $(u, v) \in E$, constitutes a template to the claim that $G$ is 3-colorable). Each single template (being merely a random pair of distinct elements in $\{1,2,3\}$) yields no knowledge. However, if all the templates are OK then the graph must be 3-colorable. Consequently, graphs which are not 3-colorable must contain at least one bad template and hence are rejected with non-negligible probability. Following is an abstract description of the resulting zero-knowledge interactive proof system for $G3C$.

Common Input: A simple graph $G = (V, E)$.

Prover's first step: Let $\psi$ be a 3-coloring of $G$. The prover selects a random permutation, $\pi$, over $\{1,2,3\}$, and sets $\phi(v) \stackrel{\mathrm{def}}{=} \pi(\psi(v))$, for each $v \in V$. Hence, the prover forms a random relabelling of the 3-coloring $\psi$.
The prover sends the verifier a sequence of $|V|$ locked and nontransparent boxes so that the $v$-th box contains the value $\phi(v)$.

Verifier's first step: The verifier uniformly selects an edge $(u, v) \in E$, and sends it to the prover.

Motivating Remark: The verifier asks to inspect the colors of vertices $u$ and $v$.

Prover's second step: The prover sends to the verifier the keys to boxes $u$ and $v$.

Verifier's second step: The verifier opens boxes $u$ and $v$, and accepts if and only if they contain two different elements in $\{1,2,3\}$.

Clearly, if the input graph is 3-colorable then the prover can cause the verifier to accept always. On the other hand, if the input graph is not 3-colorable then any contents placed in the boxes must be invalid on at least one edge, and consequently the verifier will reject with probability at least $1/|E|$. Hence, the above protocol exhibits a non-negligible gap in the accepting probabilities between the case of inputs in $G3C$ and inputs not in $G3C$. The zero-knowledge property follows easily, in this abstract setting, since one can simulate the real interaction by placing a random pair of different colors in the boxes indicated by the verifier. We stress that this simple argument will not be possible in the digital implementation since the boxes are not totally unaffected by their contents (but are rather affected, yet in an indistinguishable manner). Finally, we remark that the confidence in the validity of the claim (that the input graph is 3-colorable) may be increased by sequentially applying the above proof sufficiently many times. (In fact, if the boxes are perfect as assumed above then one can also use parallel repetitions.)

The interactive proof

We now turn to the digital implementation of the above abstract protocol. In this implementation the boxes are implemented by a commitment scheme. Namely, for each box we invoke an independent execution of the commitment scheme.
This will enable us to execute the reveal phase in only some of the commitments, a property that is crucial to our scheme. For simplicity of exposition, we use the simple commitment scheme presented in Construction 6.21 (or, more generally, any one-way interaction commitment scheme). We denote by $C_s(\sigma)$ the commitment of the sender, using coins $s$, to the (ternary) value $\sigma$.

Construction 6.25 (A zero-knowledge proof for Graph 3-Coloring):

Common Input: A simple (3-colorable) graph $G = (V, E)$. Let $n \stackrel{\mathrm{def}}{=} |V|$ and $V = \{1, \ldots, n\}$.

Auxiliary Input to the Prover: A 3-coloring of $G$, denoted $\psi$.

Prover's first step (P1): The prover selects a random permutation, $\pi$, over $\{1,2,3\}$, and sets $\phi(v) \stackrel{\mathrm{def}}{=} \pi(\psi(v))$, for each $v \in V$. The prover uses the commitment scheme to commit itself to the color of each of the vertices. Namely, the prover uniformly and independently selects $s_1, \ldots, s_n \in \{0,1\}^n$, computes $c_i = C_{s_i}(\phi(i))$, for each $i \in V$, and sends $c_1, \ldots, c_n$ to the verifier.

Verifier's first step (V1): The verifier uniformly selects an edge $(u, v) \in E$, and sends it to the prover.

Motivating Remark: The verifier asks to inspect the colors of vertices $u$ and $v$.

Prover's second step (P2): Without loss of generality, we may assume that the message received from the verifier is an edge, denoted $(u, v)$. (Otherwise, the prover sets $(u, v)$ to be some predetermined edge of $G$.) The prover uses the reveal phase of the commitment scheme in order to reveal the colors of vertices $u$ and $v$ to the verifier. Namely, the prover sends $(s_u, \phi(u))$ and $(s_v, \phi(v))$ to the verifier.

Verifier's second step (V2): The verifier checks whether the values corresponding to commitments $u$ and $v$ were revealed correctly and whether these values are different. Namely, upon receiving $(s, \sigma)$ and $(s', \tau)$, the verifier checks whether $c_u = C_s(\sigma)$, $c_v = C_{s'}(\tau)$, and $\sigma \neq \tau$ (and both in $\{1,2,3\}$). If all conditions hold then the verifier accepts. Otherwise it rejects.

Let us denote the above prover's program by $P_{G3C}$.
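The message flow of one round of Construction 6.25 can be sketched in Python as below. This is a hedged illustration, not the construction itself: the function `commit` is an assumed stand-in that hashes the coins together with the value (a hash-based commitment is at best computationally binding, unlike the unambiguity guaranteed by Construction 6.21), the handling of a malformed verifier message in step (P2) is omitted, and all function names are hypothetical.

```python
import hashlib
import random
import secrets

COLORS = (1, 2, 3)


def commit(s: bytes, value: int) -> bytes:
    """Stand-in for C_s(value): SHA-256 of coins and value (an assumption)."""
    return hashlib.sha256(s + bytes([value])).digest()


def prover_p1(vertices, psi, rng):
    """(P1): relabel psi by a random permutation pi and commit to every color."""
    shuffled = list(COLORS)
    rng.shuffle(shuffled)
    pi = dict(zip(COLORS, shuffled))
    phi = {v: pi[psi[v]] for v in vertices}
    coins = {v: secrets.token_bytes(16) for v in vertices}
    commitments = {v: commit(coins[v], phi[v]) for v in vertices}
    return phi, coins, commitments


def verifier_v1(edges, rng):
    """(V1): select an edge uniformly at random."""
    return rng.choice(edges)


def prover_p2(edge, phi, coins):
    """(P2): open the two commitments at the endpoints of the requested edge."""
    u, v = edge
    return (coins[u], phi[u]), (coins[v], phi[v])


def verifier_v2(edge, commitments, opening_u, opening_v):
    """(V2): accept iff both openings are correct and reveal distinct colors."""
    u, v = edge
    (s, sigma), (s_prime, tau) = opening_u, opening_v
    return (sigma in COLORS and tau in COLORS and sigma != tau
            and commitments[u] == commit(s, sigma)
            and commitments[v] == commit(s_prime, tau))


def run_round(vertices, edges, psi, rng):
    """One honest execution of steps (P1), (V1), (P2), (V2)."""
    phi, coins, commitments = prover_p1(vertices, psi, rng)
    edge = verifier_v1(edges, rng)
    opening_u, opening_v = prover_p2(edge, phi, coins)
    return verifier_v2(edge, commitments, opening_u, opening_v)
```

With a proper 3-coloring `psi`, every call to `run_round` accepts. If `psi` gives both endpoints of some edge the same color, the check in `verifier_v2` fails whenever that edge is requested, which corresponds to the rejection probability of at least $1/|E|$ in a single round.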
We stress that both the programs of the verifier and of the prover can be implemented in probabilistic polynomial-time. In the case of the prover's program this property is made possible by the use of the auxiliary input to the prover. As we will shortly see, the above protocol constitutes a weak interactive proof for $G3C$. As usual, the confidence can be increased (i.e., the error probability can be decreased) by sufficiently many successive applications. However, the mere existence of an interactive proof for $G3C$ is obvious (since $G3C \in NP$). The punch-line is that the above protocol is zero-knowledge (also with respect to auxiliary input). Using the Sequential Composition Lemma (Lemma 6.19), it follows that also polynomially many sequential applications of this protocol preserve the zero-knowledge property.

Proposition 6.26 Suppose that the commitment scheme used in Construction 6.25 satisfies the (non-uniform) secrecy and the unambiguity requirements. Then Construction 6.25 constitutes an auxiliary input zero-knowledge (generalized) interactive proof for $G3C$.

For further discussion of Construction 6.25 see remarks at the end of the current subsection.

Proof of Proposition 6.26

We first prove that Construction 6.25 constitutes a weak interactive proof for $G3C$. Assume first that the input graph is indeed 3-colorable. Then if the prover follows the program in the construction then the verifier will always accept (i.e., accept with probability 1). On the other hand, if the input graph is not 3-colorable then, no matter what the prover does, the $n$ commitments sent in Step (P1) cannot "correspond" to a 3-coloring of the graph (since such a coloring does not exist). We stress that the unique correspondence of commitments to values is guaranteed by the unambiguity property of the commitment scheme. It follows that there must exist an edge $(u, v) \in E$ so that $c_u$ and $c_v$, sent in step (P1), are not commitments to two different elements of $\{1,2,3\}$.
Hence, no matter how the prover behaves, the verifier will reject with probability at least $1/|E|$. Hence there is a non-negligible (in the input length) gap between the accepting probabilities in case the input is in $G3C$ and in case it is not.

We now turn to show that $P_{G3C}$, the prover in Construction 6.25, is indeed zero-knowledge for $G3C$. The claim is proven without reference to auxiliary input (to the verifier), yet extending the argument to auxiliary input zero-knowledge is straightforward. Again, we will use the alternative formulation of zero-knowledge (i.e., Definition 6.13), and show how to simulate $V^*$'s view of the interaction with $P_{G3C}$, for every probabilistic polynomial-time interactive machine $V^*$. As in the case of the Graph Isomorphism proof system (i.e., Construction 6.16) it is quite easy to simulate the verifier's view of the interaction with $P_{G3C}$, provided that the verifier follows the specified program. However, we need to simulate the view of the verifier in the general case (in which it uses an arbitrary polynomial-time interactive program). Following is an overview of our simulation (i.e., of our construction of a simulator, $M^*$, for an arbitrary $V^*$).

The simulator $M^*$ incorporates the code of the interactive program $V^*$. On input a graph $G = (V, E)$, the simulator $M^*$ (not having access to a 3-coloring of $G$) first uniformly and independently selects $n$ values $e_1, \ldots, e_n \in \{1,2,3\}$, and constructs a commitment to each of them. These $e_i$'s constitute a "pseudo-coloring" of the graph, in which the end-points of each edge are colored differently with probability $\frac{2}{3}$. In doing so, the simulator behaves very differently from $P_{G3C}$, but nevertheless the sequence of commitments so generated is computationally indistinguishable from the sequence of commitments to a valid 3-coloring sent by $P_{G3C}$ in step (P1).
If $V^*$, when given the commitments generated by the simulator, asks to inspect an edge $(u, v)$ so that $e_u \neq e_v$ then the simulator can indeed answer correctly, and doing so it completes a simulation of the verifier's view of the interaction with $P_{G3C}$. However, if $V^*$ asks to inspect an edge $(u, v)$ so that $e_u = e_v$ then the simulator has no way to answer correctly, and we let it halt with output $\bot$. We stress that we don't assume that the simulator a-priori "knows" which edge the verifier $V^*$ will ask to inspect. The validity of the simulator stems from a different source. If the verifier's request were oblivious of the prover's commitment then with probability $\frac{2}{3}$ the verifier would have asked to inspect an edge which is properly colored. Using the secrecy property of the commitment scheme it follows that the verifier's request is "almost oblivious" of the values in the commitments. The zero-knowledge claim follows (yet, with some effort). Further details follow. We start with a detailed description of the simulator.

Simulator $M^*$. On input a graph $G = (V, E)$, the simulator $M^*$ proceeds as follows:

1. Setting the random tape of $V^*$: Let $q(\cdot)$ denote a polynomial bounding the running-time of $V^*$. The simulator $M^*$ starts by uniformly selecting a string $r \in \{0,1\}^{q(|x|)}$ (where $x$ denotes the common input, here the graph $G$), to be used as the contents of the local random tape of $V^*$.

2. Simulating the prover's first step (P1): The simulator $M^*$ uniformly and independently selects $n$ values $e_1, \ldots, e_n \in \{1,2,3\}$ and $n$ random strings $s_1, \ldots, s_n \in \{0,1\}^n$ to be used for committing to these values. The simulator computes, for each $i \in V$, a commitment $d_i = C_{s_i}(e_i)$.

3. Simulating the verifier's first step (V1): The simulator $M^*$ initiates an execution of $V^*$ by placing $G$ on $V^*$'s "common input tape", placing $r$ (selected in step (1) above) on $V^*$'s "local random tape", and placing the sequence $(d_1, \ldots, d_n)$ (constructed in step (2) above) on $V^*$'s "incoming message tape".
After executing a polynomial number of steps of $V^*$, the simulator can read the outgoing message of $V^*$, denoted $m$. Again, we assume without loss of generality that $m \in E$ and let $(u, v) = m$. (Actually $m \notin E$ is treated as in step (P2) in $P_{G3C}$; namely, $(u, v)$ is set to be some predetermined edge of $G$.)

4. Simulating the prover's second step (P2): If $e_u \neq e_v$ then the simulator halts with output $(G, r, (d_1, \ldots, d_n), (s_u, e_u, s_v, e_v))$.

5. Failure of the simulation: Otherwise (i.e., $e_u = e_v$), the simulator halts with output $\bot$.

Using the hypothesis that $V^*$ is polynomial-time, it follows that so is the simulator $M^*$. It is left to show that $M^*$ outputs $\bot$ with probability at most $\frac{1}{2}$, and that, conditioned on not outputting $\bot$, the simulator's output is computationally indistinguishable from the verifier's view in a "real interaction with $P_{G3C}$". The proposition will follow by running the above simulator $n$ times and outputting the first output different from $\bot$. We now turn to prove the above two claims.

Claim 6.26.1: For every sufficiently large graph, $G = (V, E)$, the probability that $M^*(G) = \bot$ is bounded above by $\frac{1}{2}$.

proof: As above, $n$ will denote the cardinality of the vertex set of $G$. Let us denote by $p_{u,v}(G, r, (e_1, \ldots, e_n))$ the probability, taken over all the choices of the $s_1, \ldots, s_n \in \{0,1\}^n$, that $V^*$, on input $G$, random coins $r$, and prover message $(C_{s_1}(e_1), \ldots, C_{s_n}(e_n))$, replies with the message $(u, v)$. We assume, for simplicity, that $V^*$ always answers with an edge of $G$ (since otherwise its message is anyhow treated as if it were an edge of $G$). We first claim that for every sufficiently large graph, $G = (V, E)$, every $r \in \{0,1\}^{q(n)}$, every edge $(u, v) \in E$, and every two sequences $\alpha, \beta \in \{1,2,3\}^n$, it holds that

$$|p_{u,v}(G, r, \alpha) - p_{u,v}(G, r, \beta)| \leq \frac{1}{2|E|}$$

Actually, we can prove the following.
Request Obliviousness Subclaim: For every polynomial $p(\cdot)$, every sufficiently large graph, $G = (V, E)$, every $r \in \{0,1\}^{q(n)}$, every edge $(u, v) \in E$, and every two sequences $\alpha, \beta \in \{1,2,3\}^n$, it holds that

$$|p_{u,v}(G, r, \alpha) - p_{u,v}(G, r, \beta)| \leq \frac{1}{p(n)}$$

The Request Obliviousness Subclaim is proven using the non-uniform secrecy of the commitment scheme. The reader should be able to fill in the details of such a proof at this stage. Nevertheless, a proof of the subclaim follows.

Proof of the Request Obliviousness Subclaim: Assume on the contrary that there exists a polynomial $p(\cdot)$, and an infinite sequence of integers such that for each integer $n$ (in the sequence) there exists an $n$-vertex graph, $G_n = (V_n, E_n)$, a string $r_n \in \{0,1\}^{q(n)}$, an edge $(u_n, v_n) \in E_n$, and two sequences $\alpha_n, \beta_n \in \{1,2,3\}^n$ so that

$$|p_{u_n,v_n}(G_n, r_n, \alpha_n) - p_{u_n,v_n}(G_n, r_n, \beta_n)| > \frac{1}{p(n)}$$

We construct a circuit family, $\{A_n\}$, by letting $A_n$ incorporate the interactive machine $V^*$, the graph $G_n$, and $r_n, u_n, v_n, \alpha_n, \beta_n$, all being as in the contradiction hypothesis. On input $y$ (supposedly a commitment to either $\alpha_n$ or $\beta_n$), circuit $A_n$ runs $V^*$ (on input $G_n$, coins $r_n$ and prover's message $y$), and outputs 1 if and only if $V^*$ replies with $(u_n, v_n)$. Clearly, $\{A_n\}$ is a (non-uniform) family of polynomial-size circuits. The key observation is that $A_n$ distinguishes commitments to $\alpha_n$ from commitments to $\beta_n$, since for every sequence $\gamma \in \{1,2,3\}^n$

$$\mathrm{Prob}(A_n(C_{U_{n^2}}(\gamma)) = 1) = p_{u_n,v_n}(G_n, r_n, \gamma)$$

where $U_k$ denotes, as usual, a random variable uniformly distributed over $\{0,1\}^k$. Contradiction to the (non-uniform) secrecy of the commitment scheme follows by a standard hybrid argument (which relates the indistinguishability of sequences to the indistinguishability of single commitments).

Returning to the proof of Claim 6.26.1, we now use the above subclaim to upper bound the probability that the simulator outputs $\bot$. The intuition is simple.
Since the requests of $V^*$ are almost oblivious of the values to which the simulator has committed itself, it is unlikely that $V^*$ will request to inspect an illegally colored edge more often than if it had made the request without looking at the commitment. A formal (but straightforward) analysis follows.

Let $M^*_r(G)$ denote the output of machine $M^*$ on input $G$, conditioned on the event that it chooses the string $r$ in step (1). We remind the reader that $M^*_r(G) = \bot$ only in case the verifier, on input $G$, random tape $r$, and a commitment to some pseudo-coloring $(e_1, \ldots, e_n)$, asks to inspect an edge $(u, v)$ which is illegally colored (i.e., $e_u = e_v$). Let $E_{(e_1, \ldots, e_n)}$ denote the set of edges $(u, v) \in E$ that are illegally colored (i.e., satisfy $e_u = e_v$) with respect to $(e_1, \ldots, e_n)$. Then, fixing an arbitrary $r$ and considering all possible choices of $e = (e_1, \ldots, e_n) \in \{1,2,3\}^n$,

$$\mathrm{Prob}(M^*_r(G) = \bot) = \sum_{e \in \{1,2,3\}^n} \frac{1}{3^n} \sum_{(u,v) \in E_e} p_{u,v}(G, r, e)$$

(Recall that $p_{u,v}(G, r, e)$ denotes the probability that the verifier asks to inspect $(u, v)$ when given a sequence of random commitments to the values $e$.) Define $B_{u,v}$ to be the set of $n$-tuples $(e_1, \ldots, e_n) \in \{1,2,3\}^n$ satisfying $e_u = e_v$. Clearly, $|B_{u,v}| = 3^{n-1}$. By straightforward calculation we get

$$\mathrm{Prob}(M^*_r(G) = \bot) = \frac{1}{3^n} \sum_{(u,v) \in E} \sum_{e \in B_{u,v}} p_{u,v}(G, r, e)$$
$$\leq \frac{1}{3^n} \sum_{(u,v) \in E} |B_{u,v}| \cdot \left( p_{u,v}(G, r, (1, \ldots, 1)) + \frac{1}{2|E|} \right)$$
$$= \frac{1}{6} + \frac{1}{3} \sum_{(u,v) \in E} p_{u,v}(G, r, (1, \ldots, 1))$$
$$= \frac{1}{6} + \frac{1}{3} = \frac{1}{2}$$

The claim follows. $\Box$

For simplicity, we assume in the sequel that on common input $G \in G3C$, the prover gets the lexicographically first 3-coloring of $G$ as auxiliary input. This enables us to omit the auxiliary input to $P_{G3C}$ (which is now implicit in the common input) from the notation. The argument is easily extended to the general case where $P_{G3C}$ gets an arbitrary 3-coloring of $G$ as auxiliary input.

Claim 6.26.2: The ensemble consisting of the output of $M^*$ on input $G = (V, E) \in G3C$, conditioned on it not being $\bot$, is computationally indistinguishable from the ensemble
$\{\mathrm{view}^{P_{G3C}}_{V^*}(G)\}_{G \in G3C}$. Namely, for every probabilistic polynomial-time algorithm $A$, every polynomial $p(\cdot)$, and all sufficiently large graphs $G = (V, E)$,

$$\left| \mathrm{Prob}(A(M^*(G)) = 1 \mid M^*(G) \neq \bot) - \mathrm{Prob}(A(\mathrm{view}^{P_{G3C}}_{V^*}(G)) = 1) \right| < \frac{1}{p(|V|)}$$

We stress that these ensembles are very different (i.e., the statistical distance between them is very close to the maximum possible), and yet they are computationally indistinguishable. Actually, we can prove that these ensembles are indistinguishable also by (non-uniform) families of polynomial-size circuits. At first glance it seems that Claim 6.26.2 follows easily from the secrecy property of the commitment scheme. Indeed, Claim 6.26.2 is proven using the secrecy property of the commitment scheme, yet the proof is more complex than one anticipates (at first glance). The difficulty lies in the fact that the above ensembles consist not only of commitments to values, but also of an opening of some of the values. Furthermore, the choice of which commitments are to be opened depends on the entire sequence of commitments.

proof: Given a graph $G = (V, E)$, we define for each edge $(u, v) \in E$ two random variables describing, respectively, the output of $M^*$ and the view of $V^*$ in a real interaction, in case the verifier asked to inspect the edge $(u, v)$. Specifically,

- $\mu_{u,v}(G)$ describes $M^*(G)$ conditioned on $M^*(G)$ containing the "reveal information" for vertices $u$ and $v$.

- $\nu_{u,v}(G)$ describes $\mathrm{view}^{P_{G3C}}_{V^*}(G)$ conditioned on $\mathrm{view}^{P_{G3C}}_{V^*}(G)$ containing the "reveal information" for vertices $u$ and $v$.

Let $p_{u,v}(G)$ denote the probability that $M^*(G)$ contains "reveal information" for vertices $u$ and $v$, conditioned on $M^*(G) \neq \bot$. Similarly, let $q_{u,v}(G)$ denote the probability that $\mathrm{view}^{P_{G3C}}_{V^*}(G)$ contains "reveal information" for vertices $u$ and $v$.

Assume, contrary to the claim, that the ensembles mentioned in the claim are computationally distinguishable. Then one of the following cases must occur.
Case 1: There is a noticeable difference between the probabilistic profile of the requests of $V^*$ when interacting with $P_{G3C}$ and the requests of $V^*$ when invoked by $M^*$. Formally, there exists a polynomial $p(\cdot)$ and an infinite sequence of integers such that for each integer $n$ (in the sequence) there exists an $n$-vertex graph $G_n = (V_n, E_n)$, and an edge $(u_n, v_n) \in E_n$, so that

$$|p_{u_n,v_n}(G_n) - q_{u_n,v_n}(G_n)| > \frac{1}{p(n)}$$

Case 2: An algorithm distinguishing the above ensembles does so also conditioned on $V^*$ asking for a particular edge. Furthermore, this request occurs with noticeable probability which is about the same in both ensembles. Formally, there exists a probabilistic polynomial-time algorithm $A$, a polynomial $p(\cdot)$ and an infinite sequence of integers such that for each integer $n$ (in the sequence) there exists an $n$-vertex graph $G_n = (V_n, E_n)$, and an edge $(u_n, v_n) \in E_n$, so that the following conditions hold:

- $q_{u_n,v_n}(G_n) > \frac{1}{p(n)}$

- $|p_{u_n,v_n}(G_n) - q_{u_n,v_n}(G_n)| < \frac{1}{3 \cdot p(n)^2}$

- $|\mathrm{Prob}(A(\mu_{u_n,v_n}(G_n)) = 1) - \mathrm{Prob}(A(\nu_{u_n,v_n}(G_n)) = 1)| > \frac{1}{p(n)}$

Case 1 can be immediately discarded since it leads easily to contradiction (to the non-uniform secrecy of the commitment scheme). The idea is to use the Request Obliviousness Subclaim appearing in the proof of Claim 6.26.1. Details are omitted.

We are thus left with Case 2. We are now going to show that Case 2 also leads to contradiction. To this end we will construct a circuit family that will distinguish commitments to different sequences of values. Interestingly, neither of these sequences will equal the sequence of commitments generated by either the prover or the simulator. Following is an overview of the construction. The $n$-th circuit gets a sequence of $3n$ commitments and produces from it a sequence of $n$ commitments (part of which is a subsequence of the input).
When the input sequence to the circuit is taken from one distribution, the circuit generates a subsequence corresponding to the sequence of commitments generated by the prover. Likewise, when the input sequence (to the circuit) is taken from the other distribution, the circuit generates a subsequence corresponding to the sequence of commitments generated by the simulator. We stress that the circuit does so without knowing from which distribution the input is taken. After generating an $n$-long sequence, the circuit feeds it to $V^*$, and depending on $V^*$'s behaviour the circuit may feed part of the sequence to algorithm $A$ (mentioned in Case 2). Following is a detailed description of the circuit family.

Let us denote by $\psi_n$ the (lexicographically first) 3-coloring of $G_n$ used by the prover. We construct a circuit family, denoted $\{A_n\}$, by letting $A_n$ incorporate the interactive machine $V^*$, the "distinguishing" algorithm $A$, the graph $G_n$, the 3-coloring $\psi_n$, and the edge $(u_n,v_n)$, all being those guaranteed in Case 2. The input to circuit $A_n$ will be a sequence of commitments to $3n$ values, each in $\{1,2,3\}$. The circuit will distinguish commitments to a uniformly chosen $3n$-long sequence from commitments to the fixed sequence $1^n 2^n 3^n$ (i.e., the sequence consisting of $n$ 1-values, followed by $n$ 2-values, followed by $n$ 3-values). Following is a description of the operation of $A_n$.

On input $y = (y_1,\ldots,y_{3n})$ (where each $y_i$ is supposedly a commitment to an element of $\{1,2,3\}$), the circuit $A_n$ proceeds as follows.

$A_n$ first selects uniformly a permutation $\pi$ over $\{1,2,3\}$, and computes $\phi(i) = \pi(\psi_n(i))$, for each $i \in V_n$.

For each $i \in V_n - \{u_n,v_n\}$, the circuit sets $c_i = y_{\phi(i) \cdot n - n + i}$ (i.e., $c_i = y_i$ if $\phi(i)=1$, $c_i = y_{n+i}$ if $\phi(i)=2$, and $c_i = y_{2n+i}$ if $\phi(i)=3$). Note that each $y_j$ is used at most once, and $2n+2$ of the $y_j$'s are not used at all.

The circuit uniformly selects $s_{u_n}, s_{v_n} \in \{0,1\}^n$, and sets $c_{u_n} = C_{s_{u_n}}(\phi(u_n))$ and $c_{v_n} = C_{s_{v_n}}(\phi(v_n))$.
The circuit initiates an execution of $V^*$ by placing $G_n$ on $V^*$'s "common input tape", placing a uniformly selected $r \in \{0,1\}^{q(n)}$ on $V^*$'s "local random tape", and placing the sequence $(c_1,\ldots,c_n)$ (constructed above) on $V^*$'s "incoming message tape". The circuit reads the outgoing message of $V^*$, denoted $m$.

If $m \neq (u_n,v_n)$ then the circuit outputs 1.

Otherwise (i.e., $m = (u_n,v_n)$), the circuit invokes algorithm $A$ and outputs

\[ A\big(G_n,\, r,\, (c_1,\ldots,c_n),\, (s_{u_n}, \phi(u_n), s_{v_n}, \phi(v_n))\big) \]

Clearly the size of $A_n$ is polynomial in $n$. We now evaluate the distinguishing ability of $A_n$. Let us first consider the probability that circuit $A_n$ outputs 1 on input a random commitment to the sequence $1^n 2^n 3^n$. The reader can easily verify that the sequence $(c_1,\ldots,c_n)$ constructed by circuit $A_n$ is distributed identically to the sequence sent by the prover in step (P1). Hence, letting $C(\gamma)$ denote a random commitment to a sequence $\gamma \in \{1,2,3\}^*$, we get

\[ \mathrm{Prob}\big(A_n(C(1^n 2^n 3^n))=1\big) = (1 - q_{u_n,v_n}(G_n)) + q_{u_n,v_n}(G_n) \cdot \mathrm{Prob}\big(A(\nu_{u_n,v_n}(G_n))=1\big) \]

On the other hand, we consider the probability that circuit $A_n$ outputs 1 on input a random commitment to a uniformly chosen $3n$-long sequence over $\{1,2,3\}$. The reader can easily verify that the sequence $(c_1,\ldots,c_n)$ constructed by circuit $A_n$ is distributed identically to the sequence $(d_1,\ldots,d_n)$ generated by the simulator in step (2), conditioned on $d_{u_n} \neq d_{v_n}$. Letting $T_{3n}$ denote a random variable uniformly distributed over $\{1,2,3\}^{3n}$, we get

\[ \mathrm{Prob}\big(A_n(C(T_{3n}))=1\big) = (1 - p_{u_n,v_n}(G_n)) + p_{u_n,v_n}(G_n) \cdot \mathrm{Prob}\big(A(\mu_{u_n,v_n}(G_n))=1\big) \]

Using the conditions of Case 2, and omitting $G_n$ from the notation, we get

\[ \left| \mathrm{Prob}\big(A_n(C(1^n 2^n 3^n))=1\big) - \mathrm{Prob}\big(A_n(C(T_{3n}))=1\big) \right| \]
\[ \geq q_{u_n,v_n} \cdot \left| \mathrm{Prob}\big(A(\nu_{u_n,v_n})=1\big) - \mathrm{Prob}\big(A(\mu_{u_n,v_n})=1\big) \right| - 2 \cdot |p_{u_n,v_n} - q_{u_n,v_n}| \]
\[ > \frac{1}{p(n)} \cdot \frac{1}{p(n)} - 2 \cdot \frac{1}{3 \cdot p(n)^2} = \frac{1}{3 \cdot p(n)^2} \]

Hence, the circuit family $\{A_n\}$ distinguishes commitments to $\{1^n 2^n 3^n\}$ from commitments to $\{T_{3n}\}$.
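The first inequality in the chain above can be checked by a short rearrangement. The following is only a sketch; the shorthand $a$, $b$, $p$, $q$ is introduced here for illustration and is not part of the original notation. Here $a$ is the probability that $A$ outputs 1 on the real view conditioned on the edge $(u_n,v_n)$ being requested, $b$ is the corresponding probability for the simulator's output, and $p$, $q$ are the request probabilities $p_{u_n,v_n}$, $q_{u_n,v_n}$.

```latex
% Shorthand for this sketch only: a, b are the two conditional acceptance
% probabilities of A; p, q are the two request probabilities.
\begin{align*}
\big| [(1-q) + q\,a] - [(1-p) + p\,b] \big|
  &= \big| q\,(a-b) + (p-q)\,(1-b) \big| \\
  &\geq q\,|a-b| - |p-q|\,(1-b) \\
  &\geq q\,|a-b| - 2\,|p-q| .
\end{align*}
```

The last step uses $0 \leq 1-b \leq 1$; the factor 2 is therefore not tight, but it suffices for the bound $\frac{1}{3 \cdot p(n)^2}$.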
Combining an averaging argument with a hybrid argument, we conclude that there exists a polynomial-size circuit family which distinguishes commitments. This contradicts the non-uniform secrecy of the commitment scheme. Having reached contradiction in both cases, Claim 6.26.2 follows. $\Box$

Combining Claims 6.26.1 and 6.26.2, the zero-knowledge property of $P_{G3C}$ follows. This completes the proof of the proposition.

Concluding remarks

Construction 6.25 has been presented using a unidirectional commitment scheme. A fundamental property of such schemes is that their secrecy is preserved also in case (polynomially) many instances are invoked simultaneously. The proof of Proposition 6.26 indeed took advantage of this property. We remark that Construction 6.23 also possesses this simultaneous secrecy property, and hence the proof of Proposition 6.26 can be carried out also if the commitment scheme in use is the one of Construction 6.23 (see Exercise 14). We recall that this latter construction constitutes a commitment scheme if and only if such schemes exist at all (since Construction 6.23 is based on any one-way function and the existence of one-way functions is implied by the existence of commitment schemes).

Proposition 6.26 assumes the existence of a nonuniformly secure commitment scheme. The proof of the proposition makes essential use of the nonuniform security by incorporating instances on which the zero-knowledge property fails into circuits which contradict the security hypothesis. We stress that the sequence of "bad" instances is not necessarily constructible by efficient (uniform) machines. Put in other words, the zero-knowledge requirement has some nonuniform flavour. A uniform analogue of zero-knowledge would require only that it is infeasible to find instances on which a verifier gains knowledge (and not that such instances do not exist at all). Using a uniformly secure commitment scheme, Construction 6.25 can be shown to be uniformly zero-knowledge.
By itself, Construction 6.25 has little practical value, since it offers a very moderate acceptance gap (between inputs inside and outside of the language). Yet, repeating the protocol, on common input $G=(V,E)$, for $k \cdot |E|$ times (and letting the verifier accept only if all iterations are accepting) yields an interactive proof for G3C with error probability bounded by $e^{-k}$, where $e \approx 2.718$ is the natural logarithm base. Namely, on common input $G \in G3C$ the verifier always accepts, whereas on common input $G \notin G3C$ the verifier accepts with probability bounded above by $e^{-k}$ (no matter what the prover does). We stress that, by virtue of the Sequential Composition Lemma (Lemma 6.19), if these iterations are performed sequentially then the resulting (strong) interactive proof is zero-knowledge as well. Setting $k$ to be any super-logarithmic function of $|G|$ (e.g., $k = |G|$), the error probability of the resulting interactive proof is negligible. We remark that it is unlikely that one can prove an analogous statement with respect to the interactive proof which results by performing these iterations in parallel. See Section 6.5.

An important property of Construction 6.25 is that the prescribed prover (i.e., $P_{G3C}$) can be implemented in probabilistic polynomial-time, provided that it is given as auxiliary input a 3-coloring of the common input graph. As we shall see, this property is essential to the applications of Construction 6.25 to the design of cryptographic protocols.

As admitted in the beginning of the current subsection, the choice of G3C as a bootstrapping NP-complete language is totally arbitrary. It is quite easy to design analogous zero-knowledge proofs for other popular NP-complete languages. Such constructions will use the same underlying ideas as those presented in the motivating discussion.
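The error bound for the repeated protocol discussed above can be checked numerically. The following is a minimal sketch, assuming (as the text states) that a single iteration on a graph outside G3C is accepted with probability at most $1 - 1/|E|$; the function name `soundness_error_bound` is hypothetical and chosen only for illustration.

```python
import math


def soundness_error_bound(num_edges: int, k: int) -> float:
    """Bound on the probability that all k*|E| independent iterations accept
    a graph outside G3C, assuming each single iteration accepts with
    probability at most 1 - 1/|E| (illustrative assumption from the text)."""
    if num_edges < 1 or k < 0:
        raise ValueError("need |E| >= 1 and k >= 0")
    return (1.0 - 1.0 / num_edges) ** (k * num_edges)


if __name__ == "__main__":
    # Compare the bound with e^{-k} for a few illustrative parameters.
    for num_edges in (3, 10, 1000):
        for k in (1, 5, 20):
            bound = soundness_error_bound(num_edges, k)
            print(num_edges, k, bound, math.exp(-k), bound <= math.exp(-k))
```

The comparison holds because $1 - x \leq e^{-x}$ for every real $x$, so $(1 - 1/|E|)^{k|E|} \leq e^{-k}$.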
6.4.3 The General Result and Some Applications

The theoretical and practical importance of a zero-knowledge proof for Graph 3-Coloring (e.g., Construction 6.25) follows from the fact that it can be applied to prove, in zero-knowledge, any statement having a short proof that can be efficiently verified. More precisely, a zero-knowledge proof system for a specific NP-complete language (e.g., Construction 6.25) can be used to present zero-knowledge proof systems for every language in NP.

Before presenting zero-knowledge proof systems for every language in NP, let us recall some conventions and facts concerning NP. We first recall that every language $L \in NP$ is characterized by a binary relation $R$ satisfying the following properties:

1. There exists a polynomial $p(\cdot)$ such that for every $(x,y) \in R$ it holds that $|y| \leq p(|x|)$.
2. There exists a polynomial-time algorithm for deciding membership in $R$.
3. $L = \{x : \exists w \mbox{ s.t. } (x,w) \in R\}$.

Actually, each language in NP can be characterized by infinitely many such relations. Yet, for each $L \in NP$ we fix and consider one characterizing relation, denoted $R_L$. Secondly, since G3C is NP-complete, we know that $L$ is polynomial-time reducible (i.e., Karp-reducible) to G3C. Namely, there exists a polynomial-time computable function, $f$, such that $x \in L$ if and only if $f(x) \in G3C$. Thirdly, we observe that the standard reduction of $L$ to G3C, denoted $f_L$, has the following additional property:

There exists a polynomial-time computable function, denoted $g_L$, such that for every $(x,w) \in R_L$ it holds that $g_L(w)$ is a 3-coloring of $f_L(x)$.

We stress that the above additional property is not required by the standard definition of a Karp-reduction. Yet, it can be easily verified that the standard reduction $f_L$ (i.e., the composition of the generic reduction of $L$ to SAT, the standard reduction of SAT to 3SAT, and the standard reduction of 3SAT to G3C) does have such a corresponding $g_L$. (See Exercise 16.)
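As an illustration of a characterizing relation, consider G3C itself: the witness is a coloring, its length is linear in the size of the graph, and membership in the relation is decidable in polynomial time. The following is a minimal sketch under assumptions of our own: graphs are given as edge lists over integer vertex names, colors are the integers 1, 2, 3, and the function name `is_proper_3_coloring` is hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def is_proper_3_coloring(edges: List[Edge], coloring: Dict[int, int]) -> bool:
    """Decide whether (graph, coloring) belongs to the relation for G3C.

    Only endpoints of edges are inspected, so isolated vertices need not
    appear in `coloring`. Runs in time linear in the number of edges.
    """
    for u, v in edges:
        cu, cv = coloring.get(u), coloring.get(v)
        # Every endpoint must carry one of the three colors.
        if cu not in (1, 2, 3) or cv not in (1, 2, 3):
            return False
        # Adjacent vertices must be colored differently.
        if cu == cv:
            return False
    return True


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(is_proper_3_coloring(triangle, {0: 1, 1: 2, 2: 3}))
    print(is_proper_3_coloring(triangle, {0: 1, 1: 1, 2: 3}))
```

In the terminology above, the output of $g_L(w)$ is exactly an object that such a check accepts on the graph $f_L(x)$.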
Using these conventions, we are ready to "reduce" the construction of zero-knowledge proofs for NP to a zero-knowledge proof system for G3C.

Construction 6.27 (A zero-knowledge proof for a language $L \in NP$):

Common Input: A string $x$ (supposedly in $L$).

Auxiliary Input to the Prover: A witness, $w$, for the membership of $x \in L$ (i.e., a string $w$ such that $(x,w) \in R_L$).

Local pre-computation: Each party computes $G \stackrel{\rm def}{=} f_L(x)$. The prover computes $\psi \stackrel{\rm def}{=} g_L(w)$.

Invoking a zero-knowledge proof for G3C: The parties invoke a zero-knowledge proof on common input $G$. The prover enters this proof with auxiliary input $\psi$.

Proposition 6.28 Suppose that the subprotocol used in the last step of Construction 6.27 is indeed an auxiliary input zero-knowledge proof for G3C. Then Construction 6.27 constitutes an auxiliary input zero-knowledge proof for $L$.

Proof: The fact that Construction 6.27 constitutes an interactive proof for $L$ is immediate from the validity of the reduction (and the fact that it uses an interactive proof for G3C). At first glance it seems that the zero-knowledge property of Construction 6.27 follows just as immediately. There is however a minor issue that one should not ignore. The verifier in the zero-knowledge proof for G3C, invoked in Construction 6.27, possesses not only the common input graph $G$ but also the original common input $x$ which reduces to $G$. This extra information might have helped this verifier to extract knowledge in the G3C interactive proof, if it were not the case that this proof system is zero-knowledge also with respect to auxiliary input. Details follow.

Suppose we need to simulate the interaction of a machine $V^*$ with the prover, on common input $x$. Without loss of generality we may assume that machine $V^*$ invokes an interactive machine $V^{**}$ which interacts with the prover of the G3C interactive proof, on common input $G = f_L(x)$ and having auxiliary input $x$.
Using the hypothesis that the G3C interactive proof is auxiliary input zero-knowledge, it follows that there exists a simulator $M^{**}$ that on input $(G,x)$ simulates the interaction of $V^{**}$ with the G3C-prover (on common input $G$ and verifier's auxiliary input $x$). Hence, the simulator for Construction 6.27, denoted $M^*$, operates as follows. On input $x$, the simulator $M^*$ computes $G \stackrel{\rm def}{=} f_L(x)$ and outputs $M^{**}(G,x)$. The proposition follows.

We remark that an alternative way of resolving the minor difficulty addressed above is to observe that the function $f_L$ (i.e., the one induced by the standard reductions) can be inverted in polynomial-time (see Exercise 17). In any case, we immediately get

Theorem 6.29 Suppose that there exists a commitment scheme satisfying the (nonuniform) secrecy and the unambiguity requirements. Then every language in NP has an auxiliary input zero-knowledge proof system. Furthermore, the prescribed prover in this system can be implemented in probabilistic polynomial-time, provided it gets the corresponding NP witness as auxiliary input.

We remind the reader that the condition of the theorem is satisfied if (and only if) there exist (non-uniformly) one-way functions. See Theorem 3.29 (asserting that one-way functions imply pseudorandom generators), Proposition 6.24 (asserting that pseudorandom generators imply commitment schemes), and Exercise 12 (asserting that commitment schemes imply one-way functions).

An Example: Proving properties of secrets

A typical application of Theorem 6.29 is to enable one party to prove some property of its secrets without revealing the secrets. For concreteness, consider a party, denoted $S$, sending encrypted messages (over a public channel) to various parties, denoted $R_1,\ldots,R_t$, and wishing to prove to some other party, denoted $V$, that all the corresponding plaintext messages are identical.
Further suppose that the messages are sent to the receivers (i.e., the $R_i$'s) using a secure public-key encryption scheme, and let $E_i(\cdot)$ denote the (probabilistic) encryption employed when sending a message to $R_i$. Namely, to send message $M_i$ to $R_i$, the sender uniformly chooses $r_i \in \{0,1\}^n$, computes the encryption $E_i(r_i, M_i)$, and transmits it over the public channel. In order to prove that $C_1 = E_1(r_1, M)$ and $C_2 = E_2(r_2, M)$ both encrypt the same message it suffices to reveal $r_1$, $r_2$ and $M$. However, doing so reveals the message $M$ to the verifier. Instead, one can prove in zero-knowledge that there exist $r_1$, $r_2$ and $M$ such that $C_1 = E_1(r_1, M)$ and $C_2 = E_2(r_2, M)$. The existence of such a zero-knowledge proof follows from Theorem 6.29 and the fact that the statement to be proven is of NP-type. Formally, we define a language

\[ L \stackrel{\rm def}{=} \{ (C_1, C_2) : \exists r_1, r_2, M \mbox{ s.t. } C_1 = E_1(r_1, M) \mbox{ and } C_2 = E_2(r_2, M) \} \]

Clearly, the language $L$ is in NP, and hence Theorem 6.29 can be applied. Additional examples are presented in Exercise 18.

Zero-Knowledge for any language in IP

Interestingly, the result of Theorem 6.29 can be extended "to the maximum" in the sense that under the same conditions every language having an interactive proof system also has a zero-knowledge proof system. Namely,

Theorem 6.30 Suppose that there exists a commitment scheme satisfying the (nonuniform) secrecy and unambiguity requirements. Then every language in IP has a zero-knowledge proof system.

We believe that this extension does not have much practical significance. Theorem 6.30 is proven by first converting the interactive proof for $L$ into one in which the verifier uses only "public coins" (i.e., an Arthur-Merlin proof); see Chapter 8. Next, the verifier's coin tosses are forced to be almost unbiased by using a coin tossing protocol (see section ****???).
Finally, the prover's replies are sent using a commitment scheme. At the end of the interaction the prover proves in zero-knowledge that the original verifier would have accepted the hidden transcript (this is an NP-statement).

6.4.4 Efficiency Considerations

When presenting zero-knowledge proof systems for every language in NP, we made no attempt to present the most efficient construction possible. Our main concern was to present a proof which is as simple to explain as possible. However, once we know that zero-knowledge proofs for NP exist, it is natural to ask how efficient they can be.

In order to establish common grounds for comparing zero-knowledge proofs, we have to specify a desired measure of error probability (for these proofs). An instructive choice, used in the sequel, is to consider the complexity of zero-knowledge proofs with error probability $2^{-k}$, where $k$ is a parameter that may depend on the length of the common input. Another issue to bear in mind when comparing zero-knowledge proofs is under what assumptions (if any) they are valid. Throughout this entire subsection we stick to the assumption used so far (i.e., the existence of one-way functions).

Standard efficiency measures

Natural and standard efficiency measures to consider are

The communication complexity of the proof. The most important communication measure is the round complexity (i.e., the number of message exchanges). The total number of bits exchanged in the interaction is also an important consideration.

The computational complexity of the proof. Specifically, the number of elementary steps taken by each of the parties.

Communication complexity seems more important than computational complexity, as long as the trade-off between them is "reasonable".

To demonstrate these measures we consider the zero-knowledge proof for G3C presented in Construction 6.25.
Recall that this proof system has a very moderate acceptance gap, specifically $1/|E|$, on common input graph $G=(V,E)$. So Construction 6.25 has to be applied sequentially $k \cdot |E|$ times in order to result in a zero-knowledge proof with error probability $e^{-k}$, where $e \approx 2.718$ is the natural logarithm base. Hence, the round complexity of the resulting zero-knowledge proof is $O(k \cdot |E|)$, the bit complexity is $O(k \cdot |E| \cdot |V|^2)$, and the computational complexity is $O(k \cdot |E| \cdot \mathrm{poly}(|V|))$, where the polynomial $\mathrm{poly}(\cdot)$ depends on the commitment scheme in use.

Much more efficient zero-knowledge proof systems may be custom-made for specific languages in NP. Furthermore, even if one adopts the approach of reducing the construction of zero-knowledge proof systems for NP languages to the construction of a zero-knowledge proof system for a single NP-complete language, efficiency improvements can be achieved. For example, using Exercise 15, one can present zero-knowledge proofs for the Hamiltonian Circuit Problem (again with error $2^{-k}$) having round complexity $O(k)$, bit complexity $O(k \cdot |V|^{2+\epsilon})$, and computational complexity $O(k \cdot |V|^{2+O(\epsilon)})$, where $\epsilon > 0$ is a constant depending on the desired security of the commitment scheme (in Construction 6.25 and in Exercise 15 we chose $\epsilon = 1$). Note that complexities depending on the instance size are affected by reductions among problems, and hence a fair comparison is obtained by considering the complexities for the generic problem (i.e., Bounded Halting).

The round complexity of a protocol is a very important efficiency consideration and it is desirable to reduce it as much as possible. In particular, it is desirable to have zero-knowledge proofs with a constant number of rounds and negligible error probability. This goal is pursued in Section 6.9.

Knowledge Tightness: a particular efficiency measure

The above efficiency measures are general in the sense that they are applicable to any protocol (regardless of whether it is zero-knowledge or not).
A particular measure of efficiency applicable to zero-knowledge protocols is their knowledge tightness. Intuitively, knowledge tightness is a refinement of zero-knowledge which is aimed at measuring the "actual security" of the proof system. Namely, how much harder does the verifier need to work, when not interacting with the prover, in order to compute something which it can compute after interacting with the prover. Thus, knowledge tightness is the ratio between the (expected) running-time of the simulator and the running-time of the verifier in the real interaction simulated by the simulator. Note that the simulators presented so far, as well as all known simulators, operate by repeated random trials and hence an instructive measure of tightness should consider their expected running-time (assuming they never err, i.e., never output the special $\bot$ symbol) rather than the worst case.

Definition 6.31 (knowledge tightness): Let $t : \mathbb{N} \mapsto \mathbb{N}$ be a function. We say that a zero-knowledge proof for language $L$ has knowledge tightness $t(\cdot)$ if there exists a polynomial $p(\cdot)$ such that for every probabilistic polynomial-time verifier $V^*$ there exists a simulator $M^*$ (as in Definition 6.12) such that for all sufficiently long $x \in L$ we have

\[ \frac{\mathrm{Time}_{M^*}(x) - p(|x|)}{\mathrm{Time}_{V^*}(x)} \leq t(|x|) \]

where $\mathrm{Time}_{M^*}(x)$ denotes the expected running-time of $M^*$ on input $x$, and $\mathrm{Time}_{V^*}(x)$ denotes the running time of $V^*$ on common input $x$.

We assume a model of computation allowing one machine to invoke another machine at the cost of merely the running-time of the latter machine. The purpose of the polynomial $p(\cdot)$, in the above definition, is to take care of generic overhead created by the simulation (this is important in case the verifier $V^*$ is extremely fast). We remark that the definition of zero-knowledge does not guarantee that the knowledge tightness is polynomial.
Yet, all known zero-knowledge proofs, and more generally all zero-knowledge properties demonstrated using a single simulator with black-box access to $V^*$, have polynomial knowledge tightness. In particular, Construction 6.16 has knowledge tightness 2, whereas Construction 6.25 has knowledge tightness $3/2$. We believe that knowledge tightness is a very important efficiency consideration and that it is desirable to have it be a constant.

6.5 * Negative Results

In this section we review some negative results concerning zero-knowledge. These results can be viewed as evidence for the belief that some of the shortcomings of the results and constructions presented in previous sections are unavoidable. Most importantly, Theorem 6.29 asserts the existence of (computational) zero-knowledge proof systems for NP, assuming that one-way functions exist. Two natural questions arise

1. An unconditional result: Can one prove the existence of (computational) zero-knowledge proof systems for NP, without making any assumptions?

2. Perfect zero-knowledge: Can one present perfect zero-knowledge proof systems for NP, even under some reasonable assumptions?

The answer to both questions seems to be negative.

Another important question concerning zero-knowledge proofs is their preservation under parallel composition. We show that, in general, zero-knowledge is not preserved under parallel composition (i.e., there exists a pair of zero-knowledge protocols that when executed in parallel leak knowledge in a strong sense). Furthermore, we consider some natural proof systems, obtained via parallel composition of zero-knowledge proofs, and indicate that it is unlikely that the resulting composed proofs can be proven to be zero-knowledge.

6.5.1 Implausibility of an Unconditional "NP in ZK" Result

Recall that Theorem 6.30 asserts the existence of zero-knowledge proofs for any language in IP, provided that nonuniform one-way functions exist.
In this subsection we consider the question of whether this sufficient condition is also necessary. The results known to date seem to provide some (yet weak) indication in this direction. Specifically, the existence of zero-knowledge proof systems for languages outside of BPP implies very weak forms of one-wayness. Also, the existence of zero-knowledge proof systems for languages which are hard to approximate, in some average case sense, implies the existence of one-way functions (but not of nonuniformly one-way functions). In the rest of this subsection we provide precise statements of the above results.

(1) $BPP \subset CZK$ implies weak forms of one-wayness

Definition 6.32 (collection of functions with one-way instances): A collection of functions, $\{f_i : D_i \mapsto \{0,1\}^*\}_{i \in I}$, is said to have one-way instances if there exist three probabilistic polynomial-time algorithms, $I$, $D$ and $F$, so that the following two conditions hold

1. easy to sample and compute: as in Definition 2.11.

2. some functions are hard to invert: For every probabilistic polynomial-time algorithm $A'$, every polynomial $p(\cdot)$, and infinitely many $i$'s

\[ \mathrm{Prob}\big(A'(f_i(X_n), i) \in f_i^{-1} f_i(X_n)\big) < \frac{1}{p(n)} \]

where $X_n$ is a random variable describing the output of algorithm $D$ on input $i$.

Actually, since the hardness condition does not refer to the distribution induced by $I$, we may assume, without loss of generality, that $I = \{0,1\}^*$ and algorithm $I$ uniformly selects a string (of length equal to the length of its input). Recall that collections of one-way functions (as defined in Definition 2.11) require that all but a negligible measure of the functions $f_i$ be hard to invert (where the probability measure is induced by algorithm $I$).

Theorem 6.33 If there exist zero-knowledge proofs for languages outside of BPP then there exist collections of functions with one-way instances.

We remark that the mere assumption that $BPP \subset IP$ is not known to imply any form of one-wayness.
The existence of a language in NP which is not in BPP implies the existence of a function which is easy to compute but hard to invert in the worst-case (see Section 2.1). The latter consequence seems to be a much weaker form of one-wayness.

(2) zero-knowledge proofs for "hard" languages yield one-way functions

Our notion of hard languages is the following

Definition 6.34 We say that a language $L$ is hard to approximate if there exists a probabilistic polynomial-time algorithm $S$ such that for every probabilistic polynomial-time algorithm $A$, every polynomial $p(\cdot)$, and infinitely many $n$'s

\[ \mathrm{Prob}\big(A(X_n) = \chi_L(X_n)\big) < \frac{1}{2} + \frac{1}{p(n)} \]

where $X_n \stackrel{\rm def}{=} S(1^n)$, and $\chi_L$ is the characteristic function of the language $L$ (i.e., $\chi_L(x) = 1$ if $x \in L$ and $\chi_L(x) = 0$ otherwise).

Theorem 6.35 If there exist zero-knowledge proofs for languages that are hard to approximate then there exist one-way functions.

We remark that the mere existence of languages that are hard to approximate (even in a stronger sense by which the approximator must fail on all sufficiently large $n$'s) is not known to imply the existence of one-way functions (see Section 2.1).

6.5.2 Implausibility of Perfect Zero-Knowledge proofs for all of NP

A theorem bounding the class of languages possessing perfect zero-knowledge proof systems follows. We start with some background (for more details see Section [missing(eff-ip.sec)]). By AM we denote the class of languages having an interactive proof which proceeds as follows. First the verifier sends a random string to the prover, next the prover answers with some string, and finally the verifier decides whether to accept or reject based on a deterministic computation (depending on the common input and the above two strings). The class AM seems to be a randomized counterpart of NP, and it is believed that coNP is not contained in AM. Additional support for this belief is given by the fact that $coNP \subseteq AM$ implies the collapse of the Polynomial-Time Hierarchy.
In any case it is known that

Theorem 6.36 The class of languages possessing perfect zero-knowledge proof systems is contained in the class coAM. (In fact, these languages are also in AM.)

The theorem remains valid under several relaxations of perfect zero-knowledge (e.g., allowing the simulator to run in expected polynomial-time, etc.). Hence, if some NP-complete language has a perfect zero-knowledge proof system then $coNP \subseteq AM$, which is unlikely.

We stress that Theorem 6.36 does not apply to perfect zero-knowledge arguments, defined and discussed in Section 6.8. Hence, there is no conflict between Theorem 6.36 and the fact that, under some reasonable complexity assumptions, perfect zero-knowledge arguments do exist for every language in NP.

6.5.3 Zero-Knowledge and Parallel Composition

We discuss two negative results of very different conceptual standing. The first result asserts the failure of the general "Parallel Composition Conjecture", but says nothing about specific natural candidates. The second result refers to a class of interactive proofs, which contains several interesting and natural examples, and asserts that the members of this class cannot be proven zero-knowledge using a general paradigm (known by the name "black box simulation"). We mention that it is hard to conceive an alternative way of demonstrating the zero-knowledge property of protocols (rather than by following this paradigm).

(1) Failure of the Parallel Composition Conjecture

For some time, after zero-knowledge proofs were first introduced, several researchers insisted that the following must be true

Parallel Composition Conjecture: Let $P_1$ and $P_2$ be two zero-knowledge provers. Then the prover resulting by running both of them in parallel is also zero-knowledge.

Some researchers even considered the failure to prove the Parallel Composition Conjecture as a sign of incompetence. However, the Parallel Composition Conjecture is just wrong.
Proposition 6.37 There exist two provers, $P_1$ and $P_2$, such that each is zero-knowledge, and yet the prover resulting by running both of them in parallel yields knowledge (e.g., a cheating verifier may extract from this prover a solution to a problem that is not solvable in polynomial-time). Furthermore, the above holds even if the zero-knowledge property of each of the $P_i$'s can be demonstrated using a simulator which uses the verifier as a black-box (see below).

We remark that these provers can be incorporated into a single prover that randomly selects which of the two programs to execute. Alternatively, the choice may be determined by the verifier.

Proof idea: Consider a prover, denoted $P_1$, that sends "knowledge" to the verifier if and only if the verifier can answer some randomly chosen hard question (we stress that the question is chosen by $P_1$). Answers to the hard questions look pseudorandom, yet $P_1$ (which is not computationally bounded) can verify their correctness. Now, consider a second prover, denoted $P_2$, that answers these hard questions. Each of these provers (by itself) is zero-knowledge: $P_1$ is zero-knowledge since it is unlikely that any probabilistic polynomial-time verifier can answer its questions, whereas $P_2$ is zero-knowledge since its answers can be simulated by random strings. Yet, once played in parallel, a cheating verifier can answer the question of $P_1$ by sending it to $P_2$, and using this answer gain knowledge from $P_1$. To turn this idea into a proof we need to implement a hard problem with the above properties.

The above proposition refutes the Parallel Composition Conjecture by means of exponential time provers. Assuming the existence of one-way functions, the Parallel Composition Conjecture can be refuted also for probabilistic polynomial-time provers (with auxiliary inputs). For example, consider the following two provers $P_1$ and $P_2$, which make use of proofs of knowledge (see Section 6.7).
Let $C$ be a bit commitment scheme (which we know to exist provided that one-way functions exist). On common-input $C(1^n, \sigma)$, where $\sigma \in \{0,1\}$, prover $P_1$ proves to the verifier, in zero-knowledge, that it knows $\sigma$. (To this end the prover is given as auxiliary input the coins used in the commitment.) On input $C(1^n, \sigma)$, prover $P_2$ asks the verifier to prove that it knows $\sigma$, and if $P_2$ is convinced then it sends $\sigma$ to the verifier. This verifier employs the same system of proofs of knowledge used by the program $P_1$. Clearly, each prover is zero-knowledge and yet their parallel composition is not.

Similarly, using stronger intractability assumptions, one can refute the Parallel Composition Conjecture also with respect to perfect zero-knowledge (rather than with respect to computational zero-knowledge).

(2) Problems with "natural" candidates

By definition, to show that a prover is zero-knowledge one has to present, for each prospective verifier $V^*$, a corresponding simulator $M^*$ (which simulates the interaction of $V^*$ with the prover). However, all known demonstrations of zero-knowledge proceed by presenting one "universal" simulator which uses any prospective verifier $V^*$ as a black-box. In fact, these demonstrations use as black-box (or oracle) the "next message" function determined by the verifier program (i.e., $V^*$), its auxiliary-input and its random-input. (This property of the simulators is implicit in our constructions of the simulators in previous sections.) We remark that it is hard to conceive an alternative way of demonstrating the zero-knowledge property.

Definition 6.38 (black-box zero-knowledge):

next message function: Let $B$ be an interactive Turing machine, and $x$, $z$, $r$ be strings representing a common-input, auxiliary-input, and random-input, respectively.
Consider the function $B_{x,z,r}(\cdot)$ describing the messages sent by machine B, such that $B_{x,z,r}(\overline{m})$ denotes the message sent by B on common-input x, auxiliary-input z, random-input r, and sequence of incoming messages $\overline{m}$. For simplicity, we assume that the output of B appears as its last message.

black-box simulator: We say that a probabilistic polynomial-time oracle machine M is a black-box simulator for the prover P and the language L if for every polynomial-time interactive machine B, every probabilistic polynomial-time oracle machine D, every polynomial $p(\cdot)$, all sufficiently large $x \in L$, and every $z, r \in \{0,1\}^*$:

$$\left| \mathrm{Prob}\left(D^{B_{x,z,r}}(\langle P, B_r(z)\rangle(x)) = 1\right) - \mathrm{Prob}\left(D^{B_{x,z,r}}(M^{B_{x,z,r}}(x)) = 1\right) \right| < \frac{1}{p(|x|)}$$

where $B_r(z)$ denotes the interaction of machine B with auxiliary-input z and random-input r.

We say that P is black-box zero-knowledge if it has a black-box simulator.

Essentially, the definition says that a black-box simulator mimics the interaction of prover P with any polynomial-time verifier B, relative to any auxiliary-input (i.e., z) that B may get and any random-input (i.e., r) that B may choose. The simulator does so (efficiently), merely by using oracle calls to $B_{x,z,r}$ (which specifies the next message that B sends on input x, auxiliary-input z, and random-input r). The simulation is indistinguishable from the true interaction, even if the distinguishing algorithm (i.e., D) is given access to the oracle $B_{x,z,r}$. An equivalent formulation is presented in Exercise 23. Clearly, if P is black-box zero-knowledge then it is zero-knowledge with respect to auxiliary input (and has polynomially bounded knowledge tightness; see Definition 6.31).

Theorem 6.39 Suppose that (P, V) is an interactive proof system, with negligible error probability, for the language L. Further suppose that (P, V) has the following properties:

constant round: There exists an integer k such that for every $x \in L$, on input x the prover P sends at most k messages.
public coins: The messages sent by the verifier V are predetermined consecutive segments of its random tape.

black-box zero-knowledge: The prover P has a black-box simulator (over the language L).

Then $L \in BPP$.

We remark that both Construction 6.16 (zero-knowledge proof for Graph Isomorphism) and Construction 6.25 (zero-knowledge proof for Graph Coloring) are constant round, use public coins and are black-box zero-knowledge (for the corresponding languages). However, they do not have negligible error probability. Yet, repeating each of these constructions polynomially many times in parallel yields an interactive proof, with negligible error probability, for the corresponding language. Clearly the resulting proof systems are constant round and use public coins. Hence, unless the corresponding languages are in BPP, these parallelized proof systems are not black-box zero-knowledge.

Theorem 6.39 is sometimes interpreted as pointing to an inherent limitation of interactive proofs with public coins (also known as Arthur-Merlin games; see Section [missing(eff-ip.sec)]). Such proofs cannot be both round-efficient (i.e., have a constant number of rounds and negligible error) and black-box zero-knowledge (unless they are trivially so, i.e., the language is in BPP). In other words, when constructing round-efficient zero-knowledge proof systems (for languages not in BPP), one is advised to use "private coins" (i.e., to let the verifier send messages that depend upon, but do not reveal, its coin tosses).

6.6 * Witness Indistinguishability and Hiding

In light of the non-closure of zero-knowledge under parallel composition (see Subsection 6.5.3), alternative "privacy" criteria that are preserved under parallel composition are of practical and theoretical importance. Two notions, called witness indistinguishability and witness hiding, which refer to the "privacy" of interactive proof systems (of languages in NP), are presented in this section.
Both notions seem weaker than zero-knowledge, yet they suffice for some specific applications.

6.6.1 Definitions

In this section we confine ourselves to languages in NP. Recall that a witness relation for a language $L \in NP$ is a binary relation $R_L$ that is polynomially bounded (i.e., $(x,y) \in R_L$ implies $|y| \le \mathrm{poly}(|x|)$), polynomial-time recognizable, and characterizes L by

$$L = \{x : \exists y \text{ s.t. } (x,y) \in R_L\}$$

Witness indistinguishability

Loosely speaking, an interactive proof for a language $L \in NP$ is witness independent (resp., witness indistinguishable) if the verifier's view of the interaction with the prover is statistically independent (resp., "computationally independent") of the auxiliary input of the prover. Actually, we will relax the requirement so that it applies only to the case in which the auxiliary input constitutes an NP-witness to the common input. Namely, let $R_L$ be the witness relation of the language L and suppose that $x \in L$; then we consider only auxiliary inputs in $R_L(x) \stackrel{\mathrm{def}}{=} \{y : (x,y) \in R_L\}$. By saying that the view is computationally independent of the witness we mean that for every two choices of auxiliary inputs the resulting views are computationally indistinguishable. In the actual definition we combine notations and conventions from Definitions 6.13 and 6.18.

Definition 6.40 (witness indistinguishability / independence): Let (P, V), $L \in NP$ and $V^*$ be as in Definition 6.18, and let $R_L$ be a fixed witness relation for the language L. We denote by $\mathrm{view}^{P(y)}_{V^*(z)}(x)$ a random variable describing the contents of the random-tape of $V^*$ and the messages $V^*$ receives from P during a joint computation on common input x, when P has auxiliary input y and $V^*$ has auxiliary input z.
We say that (P, V) is witness indistinguishable for $R_L$ if for every probabilistic polynomial-time interactive machine $V^*$, and every two sequences $W^1 = \{w^1_x\}_{x \in L}$ and $W^2 = \{w^2_x\}_{x \in L}$, so that $w^1_x, w^2_x \in R_L(x)$, the following two ensembles are computationally indistinguishable:

$$\left\{x, \mathrm{view}^{P(w^1_x)}_{V^*(z)}(x)\right\}_{x \in L,\, z \in \{0,1\}^*}$$

$$\left\{x, \mathrm{view}^{P(w^2_x)}_{V^*(z)}(x)\right\}_{x \in L,\, z \in \{0,1\}^*}$$

Namely, for every probabilistic polynomial-time algorithm D, every polynomial $p(\cdot)$, all sufficiently long $x \in L$, and all $z \in \{0,1\}^*$, it holds that

$$\left| \mathrm{Prob}\left(D(x, \mathrm{view}^{P(w^1_x)}_{V^*(z)}(x)) = 1\right) - \mathrm{Prob}\left(D(x, \mathrm{view}^{P(w^2_x)}_{V^*(z)}(x)) = 1\right) \right| < \frac{1}{p(|x|)}$$

We say that (P, V) is witness independent if the above ensembles are identically distributed. Namely, for every $x \in L$, every $w^1_x, w^2_x \in R_L(x)$ and $z \in \{0,1\}^*$, the random variables $\mathrm{view}^{P(w^1_x)}_{V^*(z)}(x)$ and $\mathrm{view}^{P(w^2_x)}_{V^*(z)}(x)$ are identically distributed.

A few remarks are in order. First, one may observe that any proof system in which the prover ignores its auxiliary-input is trivially witness independent. In particular, exponential-time provers may, without loss of generality, ignore their auxiliary-input (without any decrease in the probability that they convince the verifier). Yet, probabilistic polynomial-time provers cannot afford to ignore their auxiliary input (since otherwise they become useless). Hence, for probabilistic polynomial-time provers for languages outside BPP, witness indistinguishability is non-trivial. Secondly, one can easily show that any zero-knowledge proof system for a language in NP is witness indistinguishable (since the view corresponding to each witness can be approximated by the same simulator). Likewise, perfect zero-knowledge proofs are witness independent. Finally, it is relatively easy to see that witness indistinguishability and witness independence are preserved under sequential composition. In the next subsection we show that they are also preserved under parallel composition.
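The second remark above (zero-knowledge implies witness indistinguishability) can be spelled out with a triangle inequality. The following is only a sketch under stated assumptions: $M^*$ denotes the simulator that auxiliary-input zero-knowledge provides for $V^*$, the notation $M^*(x,z)$ for its output is ours, and we assume the zero-knowledge guarantee holds for distinguishers that are also given x and is applied with the polynomial $2p(\cdot)$. The key point is that $M^*$ is not given the prover's witness, so the same simulated distribution approximates both views.

```latex
\begin{align*}
&\left| \mathrm{Prob}\left(D(x, \mathrm{view}^{P(w^1_x)}_{V^*(z)}(x)) = 1\right)
      - \mathrm{Prob}\left(D(x, \mathrm{view}^{P(w^2_x)}_{V^*(z)}(x)) = 1\right) \right| \\
&\qquad \le \sum_{b \in \{1,2\}}
   \left| \mathrm{Prob}\left(D(x, \mathrm{view}^{P(w^b_x)}_{V^*(z)}(x)) = 1\right)
        - \mathrm{Prob}\left(D(x, M^*(x,z)) = 1\right) \right| \\
&\qquad < 2 \cdot \frac{1}{2p(|x|)} \;=\; \frac{1}{p(|x|)}
\end{align*}
```

In the perfect zero-knowledge case both terms of the sum are zero, which gives witness independence.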
Witness hiding

We now turn to the notion of witness hiding. Intuitively, a proof system for a language in NP is witness hiding if after interacting with the prover it is still infeasible for the verifier to find an NP witness for the common input. Clearly, such a requirement can hold only if it is infeasible to find witnesses from scratch. Since each NP language has instances for which witness finding is easy, we must consider the task of witness finding for specially selected hard instances. This leads to the following definitions.

Definition 6.41 (distribution of hard instances): Let $L \in NP$ and $R_L$ be a witness relation for L. Let $X \stackrel{\mathrm{def}}{=} \{X_n\}_{n \in \mathbb{N}}$ be a probability ensemble so that $X_n$ assigns non-zero probability mass only to strings in $L \cap \{0,1\}^n$. We say that X is hard for $R_L$ if for every probabilistic polynomial-time (witness finding) algorithm F, every polynomial $p(\cdot)$, all sufficiently large n's and all $z \in \{0,1\}^{\mathrm{poly}(n)}$

$$\mathrm{Prob}\left(F(X_n, z) \in R_L(X_n)\right) < \frac{1}{p(n)}$$

Definition 6.42 (witness hiding): Let (P, V), $L \in NP$, and $R_L$ be as in the above definitions. Let $X = \{X_n\}_{n \in \mathbb{N}}$ be a hard instance ensemble for $R_L$. We say that (P, V) is witness hiding for the relation $R_L$ under the instance ensemble X if for every probabilistic polynomial-time machine $V^*$, every polynomial $p(\cdot)$, all sufficiently large n's, and all $z \in \{0,1\}^*$

$$\mathrm{Prob}\left(\langle P(Y_n), V^*(z)\rangle(X_n) \in R_L(X_n)\right) < \frac{1}{p(n)}$$

where $Y_n$ is arbitrarily distributed over $R_L(X_n)$.

We say that (P, V) is universal witness hiding for the relation $R_L$ if the proof system (P, V) is witness hiding for $R_L$ under every ensemble of hard instances, for $R_L$, that is efficiently constructible (see Definition 3.5).

We remark that the relation between the two privacy criteria (i.e., witness indistinguishability and witness hiding) is not obvious. Yet, zero-knowledge proofs (for NP) are also (universal) witness hiding (for any corresponding witness relation).
We remark that witness indistinguishability and witness hiding, similarly to zero-knowledge, are properties of the prover (and more generally of any interactive machine).

6.6.2 Parallel Composition

In contrast to zero-knowledge proof systems, witness indistinguishable proofs offer some robustness under parallel composition. Specifically, parallel composition of witness indistinguishable proof systems results in a witness indistinguishable system, provided that the original prover is probabilistic polynomial-time.

Lemma 6.43 (Parallel Composition Lemma): Let $L \in NP$ and $R_L$ be as in Definition 6.40, and suppose that P is probabilistic polynomial-time, and (P, V) is witness indistinguishable (resp., witness independent) for $R_L$. Let $Q(\cdot)$ be a polynomial, and $P_Q$ denote a program that on common-input $x_1, ..., x_{Q(n)} \in \{0,1\}^n$ and auxiliary-input $w_1, ..., w_{Q(n)} \in \{0,1\}^*$, invokes P in parallel $Q(n)$ times, so that in the i-th copy P is invoked on common-input $x_i$ and auxiliary-input $w_i$. Then, $P_Q$ is witness indistinguishable (resp., witness independent) for

$$R_L^Q \stackrel{\mathrm{def}}{=} \{(\overline{x}, \overline{w}) : \forall i \; (x_i, w_i) \in R_L\}$$

where $\overline{x} = (x_1, ..., x_m)$ and $\overline{w} = (w_1, ..., w_m)$, so that $m = Q(n)$ and $|x_i| = n$ for each i.

Proof Sketch: Both the computational and information theoretic versions follow by a hybrid argument. We concentrate on the computational version. To avoid cumbersome notation we consider a generic n for which the claim of the lemma fails. (Under the contradiction hypothesis there must be infinitely many such n's, and a precise argument will actually handle all these n's together.) Namely, suppose that by using a verifier program $V_Q^*$, it is feasible to distinguish the witnesses $\overline{w}^1 = (w_1^1, ..., w_m^1)$ and $\overline{w}^2 = (w_1^2, ..., w_m^2)$, used by $P_Q$, in an interaction on common-input $\overline{x} \in L^m$. Then, for some i, the program $V_Q^*$ distinguishes also the hybrid witnesses $\overline{h}^{(i)} = (w_1^1, ..., w_i^1, w_{i+1}^2, ..., w_m^2)$ and $\overline{h}^{(i+1)} = (w_1^1, ..., w_{i+1}^1, w_{i+2}^2, ..., w_m^2)$. Rewrite $\overline{h}^{(i)} = (w_1, ..., w_i, w_{i+1}^2, w_{i+2}, ..., w_m)$ and $\overline{h}^{(i+1)} = (w_1, ..., w_i, w_{i+1}^1, w_{i+2}, ..., w_m)$, where $w_j = w_j^1$ for $j \le i$ and $w_j = w_j^2$ for $j \ge i+2$, so that the two hybrids differ only in position $i+1$. We derive a contradiction by constructing a verifier $V^*$ that distinguishes (the witnesses used by P in) interactions with the original prover P. Details follow.

The program $V^*$ incorporates the programs P and $V_Q^*$ and proceeds by interacting with the prover P in parallel to simulating $m-1$ other interactions with P. The real interaction with P is viewed as the $(i+1)$-st copy in an interaction of $V_Q^*$, whereas the simulated interactions are associated with the other copies. Specifically, in addition to the common-input x, machine $V^*$ gets the appropriate i and the sequences $\overline{x}$, $\overline{h}^{(i)}$ and $\overline{h}^{(i+1)}$ as part of its auxiliary input. For each $j \ne i+1$, machine $V^*$ will use $x_j$ as common-input and $w_j$ as the auxiliary-input to the j-th copy of P. Machine $V^*$ invokes $V_Q^*$ on common input $\overline{x}$ and provides it with an interface to a virtual interaction with $P_Q$. The $(i+1)$-st component of a message $\overline{\alpha} = (\alpha_1, ..., \alpha_m)$ sent by $V_Q^*$ is forwarded to the prover P, and all other components are kept for the simulation of the other copies. When P answers with a message $\beta$, machine $V^*$ computes the answers of the other copies of P (by feeding the program P with the corresponding auxiliary-input and the corresponding sequence of incoming messages). It follows that $V^*$ can distinguish the case in which P uses the witness $w_{i+1}^1$ from the case in which P uses $w_{i+1}^2$.

6.6.3 Constructions

In this subsection we present constructions of witness indistinguishable and witness hiding proof systems.

Constructions of witness indistinguishable proofs

Using the Parallel Composition Lemma and the observation that zero-knowledge proofs are witness indistinguishable, we derive the following:

Theorem 6.44 Assuming the existence of (nonuniformly) one-way functions, every language in NP has a constant-round witness indistinguishable proof system with negligible error probability.
In fact, the error probability can be made exponentially small.

We remark that no such result is known for zero-knowledge proof systems. Namely, the known proof systems for NP are either not constant-round (e.g., Construction 6.27), or have non-negligible error probability (e.g., Construction 6.25), or require stronger intractability assumptions (see Subsection 6.9.1), or are only computationally sound (see Subsection 6.9.2). Similarly, we can derive a constant-round witness independent proof system, with exponentially small error probability, for Graph Isomorphism. (Again, no analogous result is known for perfect zero-knowledge proofs.)

Constructions of witness hiding proofs

Witness indistinguishable proof systems are not necessarily witness hiding. For example, any language with unique witnesses has a proof system which yields the unique witness, and yet is trivially witness independent. On the other hand, for some relations, witness indistinguishability implies witness hiding. For example:

Proposition 6.45 Let $\{(f_i^0, f_i^1) : i \in I\}$ be a collection of (nonuniform) clawfree functions, and let

$$R \stackrel{\mathrm{def}}{=} \{(x, w) : w = (\sigma, r) \wedge x = (i, x') \wedge x' = f_i^\sigma(r)\}$$

Then if a machine P is witness indistinguishable for R then it is also witness hiding for R under the distribution generated by setting $i = I(1^n)$ and $x' = f_i^0(D(0, i))$, where I and D are as in Definition 2.13.

By a collection of nonuniform clawfree functions we mean that even nonuniform families of circuits $\{C_n\}$ fail to form claws on input distribution $I(1^n)$, except with negligible probability. We remark that the above proposition does not relate to the purpose of interacting with P (e.g., whether P is proving membership in a language, knowledge of a witness, and so on).

The proposition is proven by contradiction. Details follow. Suppose that an interactive machine $V^*$ finds witnesses after interacting with P.
By the witness indistinguishability of P it follows that $V^*$ performs as well regardless of whether the witness is of the form $(0, \cdot)$ or $(1, \cdot)$. Combining the programs $V^*$ and P with algorithm D we derive a claw-forming algorithm (and hence a contradiction). Specifically, the claw-forming algorithm, on input $i \in I$, uniformly selects $\sigma \in \{0,1\}$, randomly generates $r = D(\sigma, i)$, computes $x = (i, f_i^\sigma(r))$, and simulates an interaction of $V^*$ with P on common-input x and auxiliary-input $(\sigma, r)$ to P. If machine $V^*$ outputs a witness $w \in R(x)$ then, with probability approximately $\frac{1}{2}$, we have $w = (1-\sigma, r')$ and a claw is formed (since $f_i^\sigma(r) = f_i^{1-\sigma}(r')$). □

Furthermore, every NP relation can be "slightly modified" so that, for the modified relation, witness indistinguishability implies witness hiding. Given a relation R, the modified relation, denoted $R_2$, is defined by

$$R_2 \stackrel{\mathrm{def}}{=} \{((x_1, x_2), w) : |x_1| = |x_2| \wedge \exists i \text{ s.t. } (x_i, w) \in R\}$$

Namely, w is a witness under $R_2$ for the instance $(x_1, x_2)$ if and only if w is a witness under R for either $x_1$ or $x_2$.

Proposition 6.46 Let R and $R_2$ be as above. If a machine P is witness indistinguishable for $R_2$ then it is also witness hiding for $R_2$ under every distribution of hard instances induced (see below) by an efficient algorithm that randomly selects pairs in R.

Let S be a probabilistic polynomial-time algorithm that on input $1^n$ outputs $(x, w) \in R$ so that $|x| = n$. Let $X_n$ denote the distribution induced on the first element in the output of $S(1^n)$. The proposition asserts that if P is witness indistinguishable and $\{X_n\}_{n \in \mathbb{N}}$ is an ensemble of hard instances for R, then P is witness hiding under the ensemble $\{\overline{X}_n\}_{n \in \mathbb{N}}$, where $\overline{X}_n$ consists of two independent copies of $X_n$. This assertion is proven by contradiction. Suppose that an interactive machine $V^*$ finds witnesses after interacting with P.
By the witness indistinguishability of P it follows that $V^*$ performs as well regardless of whether the witness w for $(x_1, x_2)$ satisfies $(x_1, w) \in R$ or $(x_2, w) \in R$. Combining the programs $V^*$ and P with algorithm S we derive an algorithm, denoted F, that finds witnesses for R (under the distribution $X_n$). On input $x \in L$, algorithm F generates at random $(x', w') = S(1^{|x|})$ and sets $\overline{x} = (x, x')$ with probability $\frac{1}{2}$ and $\overline{x} = (x', x)$ otherwise. Algorithm F simulates an interaction of $V^*$ with P on common-input $\overline{x}$ and auxiliary input $w'$ to P, and when $V^*$ outputs a witness w, algorithm F checks whether $(x, w) \in R$. The reader can easily verify that algorithm F performs well under the instance ensemble $\{X_n\}$, hence contradicting the hypothesis that $X_n$ is hard for R. □

6.6.4 Applications

Applications for the notions presented in this section are scattered in various places in the book. In particular, witness-indistinguishable proof systems are used in the construction of constant-round arguments for NP (see Subsection 6.9.2), witness independent proof systems are used in the zero-knowledge proof for Graph Non-Isomorphism (see Section 6.7), and witness hiding proof systems are used for the efficient identification scheme based on factoring (in Section 6.7).

6.7 * Proofs of Knowledge

This section addresses the concept of "proofs of knowledge". Loosely speaking, these are proofs in which the prover asserts "knowledge" of some object (e.g., a 3-coloring of a graph) and not merely its existence (e.g., the existence of a 3-coloring of the graph, which in turn implies that the graph is in the language G3C). But what is meant by saying that a machine knows something? Indeed the main thrust of this section is in addressing this question. Before doing so we point out that "proofs of knowledge", and in particular zero-knowledge "proofs of knowledge", have many applications to the design of cryptographic schemes and cryptographic protocols.
Some of these applications are discussed in a special subsection. Of special interest is the application to identification schemes, which is discussed in a separate subsection.

6.7.1 Definition

We start with a motivating discussion. What does it mean to say that a machine knows something? Any standard dictionary suggests several meanings for the verb know, and most meanings are phrased with reference to "awareness". We, however, must look for a behavioristic interpretation of the verb. Indeed, it is reasonable to link knowledge with the ability to do something, be it at the least the ability to write down whatever one knows. Hence, we will say that a machine knows a string $\alpha$ if it can output the string $\alpha$. This seems like total nonsense. A machine has a well defined output: either the output equals $\alpha$ or it does not. So what can be meant by saying that a machine can do something? Loosely speaking, it means that the machine can be modified so that it does whatever is claimed. More precisely, it means that there exists an efficient machine which, using the original machine as an oracle, outputs whatever is claimed.

So much for defining the "knowledge of machines". Yet, whatever a machine knows or does not know is "its own business". What can be of interest to the outside is the question of what can be deduced about the knowledge of a machine after interacting with it. Hence, we are interested in proofs of knowledge (rather than in mere knowledge).

For the sake of simplicity let us consider a concrete question: how can a machine prove that it knows a 3-coloring of a graph? An obvious way is just to send the 3-coloring to the verifier. Yet, we claim that applying Construction 6.25 (i.e., the zero-knowledge proof system for G3C) sufficiently many times results in an alternative way of proving knowledge of a 3-coloring of the graph.
Loosely speaking, we say that an interactive machine, V, constitutes a verifier for knowledge of 3-coloring if the probability that the verifier is convinced by a machine P to accept the graph G is inversely proportional to the difficulty of extracting a 3-coloring of G when using machine P as a "black box". Namely, the extraction of the 3-coloring is done by an oracle machine, called an extractor, that is given access to a function specifying the messages sent by P (in response to particular messages that P receives). The (expected) running time of the extractor, on input G and access to an oracle specifying P's messages, is inversely related (by a factor polynomial in $|G|$) to the probability that P convinces V to accept G. In case P always convinces V to accept G, the extractor runs in expected polynomial time. The same holds in case P convinces V to accept with non-negligible probability. We stress that the latter special cases do not suffice for a satisfactory definition.

Preliminaries

Let $R \subseteq \{0,1\}^* \times \{0,1\}^*$ be a binary relation. Then $R(x) \stackrel{\mathrm{def}}{=} \{s : (x, s) \in R\}$ and $L_R \stackrel{\mathrm{def}}{=} \{x : \exists s \text{ s.t. } (x, s) \in R\}$. If $(x, s) \in R$ then we call s a solution for x. We say that R is polynomially bounded if there exists a polynomial p such that $|s| \le p(|x|)$ for all $(x, s) \in R$. We say that R is an NP relation if R is polynomially bounded and, in addition, there exists a polynomial-time algorithm for deciding membership in R (which implies $L_R \in NP$). In the sequel, we confine ourselves to polynomially bounded relations.

We wish to be able to consider in a uniform manner all potential provers, without making distinctions based on their running time, internal structure, etc. Yet, we observe that these interactive machines can be given an auxiliary-input which enables them to "know" and to prove more. Likewise, they may be lucky enough to select a random-input which enables more than another.
Hence, statements concerning the knowledge of the prover refer not only to the prover's program but also to the specific auxiliary and random inputs it has. Hence, we fix an interactive machine and all inputs (i.e., the common-input, the auxiliary-input, and the random-input) to this machine, and consider both the corresponding accepting probability (of the verifier) and the usage of this (prover+inputs) template as an oracle to a "knowledge extractor". This motivates the following definition.

Definition 6.47 (message specification function): Denote by $P_{x,y,r}(\overline{m})$ the message sent by machine P on common-input x, auxiliary-input y, and random input r, after receiving messages $\overline{m}$. The function $P_{x,y,r}$ is called the message specification function of machine P with common-input x, auxiliary-input y, and random input r.

An oracle machine with access to the function $P_{x,y,r}$ will represent the knowledge of machine P on common-input x, auxiliary-input y, and random input r. This oracle machine, called the knowledge extractor, will try to find a solution to x (i.e., an $s \in R(x)$). The running time of the extractor is inversely related to the corresponding accepting probability (of the verifier).

Knowledge verifiers

Now that all the machinery is ready, we present the definition of a system for proofs of knowledge. Actually, the definition presented below is a generalization (to be motivated by the subsequent applications). At first reading, the reader may set the function $\kappa$ to be identically zero.

Definition 6.48 (System of proofs of knowledge): Let R be a binary relation, and $\kappa : \mathbb{N} \to [0,1]$. We say that an interactive function V is a knowledge verifier for the relation R with knowledge error $\kappa$ if the following two conditions hold.

Non-triviality: There exists an interactive machine P so that for every $(x, y) \in R$ all possible interactions of V with P on common-input x and auxiliary-input y are accepting.

Validity (with error $\kappa$): There exists a polynomial $q(\cdot)$ and a probabilistic oracle machine K such that for every interactive function P, every $x \in L_R$ and every $y, r \in \{0,1\}^*$, machine K satisfies the following condition:

Denote by $p(x)$ the probability that the interactive machine V accepts, on input x, when interacting with the prover specified by $P_{x,y,r}$. Then if $p(x) > \kappa(|x|)$ then, on input x and access to oracle $P_{x,y,r}$, machine K outputs a solution $s \in R(x)$ within an expected number of steps bounded by

$$\frac{q(|x|)}{p(x) - \kappa(|x|)}$$

The oracle machine K is called a universal knowledge extractor.

When $\kappa(\cdot)$ is identically zero, we just say that V is a knowledge verifier for the relation R. An interactive pair (P, V) so that V is a knowledge verifier for a relation R and P is a machine satisfying the non-triviality condition (with respect to V and R) is called a system for proofs of knowledge for the relation R.

6.7.2 Observations

The zero-knowledge proof systems for Graph Isomorphism (i.e., Construction 6.16) and for Graph 3-Coloring (i.e., Construction 6.25) are in fact proofs of knowledge (with some knowledge error) for the corresponding languages. Specifically, Construction 6.16 is a proof of knowledge of an isomorphism with knowledge error $\frac{1}{2}$, whereas Construction 6.25 is a proof of knowledge of a 3-coloring with knowledge error $1 - \frac{1}{|E|}$ (on common input $G = (V, E)$). By iterating each construction sufficiently many times we can get the knowledge error to be exponentially small. (The proofs of all these claims are left as an exercise.) In fact, we get a proof of knowledge with zero error, since

Proposition 6.49 Let R be an NP relation, and $q(\cdot)$ be a polynomial such that $(x, y) \in R$ implies $|y| \le q(|x|)$. Suppose that (P, V) is a system for proofs of knowledge, for the relation R, with knowledge error $\kappa(n) \stackrel{\mathrm{def}}{=} 2^{-q(n)}$. Then (P, V) is a system for proofs of knowledge for the relation R (with zero knowledge error).
Proof Sketch: Given a knowledge extractor, K, substantiating the hypothesis, we construct a new knowledge extractor which runs K in parallel to conducting an exhaustive search for a solution. Let $p(x)$ be as in Definition 6.48. To evaluate the performance of the new extractor consider two cases.

Case 1: $p(x) \ge 2 \cdot \kappa(|x|)$. In this case, we use the fact

$$\frac{1}{p(x) - \kappa(|x|)} \le \frac{2}{p(x)}$$

Case 2: $p(x) \le 2 \cdot \kappa(|x|)$. In this case, we use the fact that exhaustive search for a solution boils down to $2^{q(|x|)}$ trials, whereas $\frac{1}{p(x)} \ge \frac{1}{2} \cdot 2^{q(|x|)}$.

It follows that:

Theorem 6.50 Assuming the existence of (nonuniformly) one-way functions, every NP relation has a zero-knowledge system for proofs of knowledge.

6.7.3 Applications

We briefly review some of the applications for (zero-knowledge) proofs of knowledge. Typically, (zero-knowledge) proofs of knowledge are used for "mutual disclosure" of the same information. Suppose that Alice and Bob both claim that they know something (e.g., a 3-coloring of a common input) but are each doubtful of the other person's claim. Employing a zero-knowledge proof of knowledge in both directions is indeed a (conceptually) simple solution to the problem of convincing each other of their knowledge.

Non-oblivious commitment schemes

When using a commitment scheme the receiver is guaranteed that after the commit phase the sender is committed to at most one value (in the sense that it can later "reveal" only this value). Yet, the receiver is not guaranteed that the sender "knows" to what value it is committed. Such a guarantee may be useful in many settings, and can be obtained by using a proof of knowledge. For more details see Subsection 6.9.2.

Chosen message attacks

An obvious way of protecting against chosen message attacks on a (public-key) encryption scheme is to augment the ciphertext by a zero-knowledge proof of knowledge of the cleartext.
(For a definition and alternative constructions of such schemes see Section [missing(enc-strong.sec)].) However, one should note that the resulting encryption scheme employs bidirectional communication between the sender and the receiver (of the encrypted message). It seems that the use of non-interactive zero-knowledge proofs of knowledge would yield unidirectional (public-key) encryption schemes. Such claims have been made, yet no proof has ever appeared (and we refrain from expressing an opinion on the issue). Non-interactive zero-knowledge proofs are discussed in Section 6.10.

A zero-knowledge proof system for GNI

The interactive proof of Graph Non-Isomorphism (GNI), presented in Construction 6.8, is not zero-knowledge (unless $GNI \in BPP$). A cheating verifier may construct a graph H and learn whether it is isomorphic to the first input graph by sending H as a query to the prover. A more appealing refutation can be presented to the claim that Construction 6.8 is auxiliary-input zero-knowledge (e.g., the verifier can check whether its auxiliary-input is isomorphic to one of the common-input graphs). We observe, however, that Construction 6.8 "would have been zero-knowledge" if the verifier always knew the answer to its queries (as is the case for the honest verifier). The idea then is to have the verifier prove to the prover that he (i.e., the verifier) knows the answer to the query (i.e., an isomorphism to the appropriate input graph), and the prover answers the query only if it is convinced of this claim. Certainly, the verifier's proof of knowledge should not yield the answer (otherwise the prover can use this information in order to cheat, thus foiling the soundness requirement). If the verifier's proof of knowledge is zero-knowledge then certainly it does not yield the answer. In fact, it suffices that the verifier's proof of knowledge is witness-independent (see Section 6.6).
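The leak described above can be illustrated on tiny graphs. The following is a minimal sketch, not a reproduction of Construction 6.8: the prover's reply rule (1 or 2 for the matching input graph, 0 for neither), the function names, and the example graphs are all assumptions made for illustration, and the brute-force isomorphism test merely stands in for the computationally unbounded prover.

```python
from itertools import permutations

def normalize(edges):
    """Represent an undirected graph as a frozenset of sorted vertex pairs."""
    return frozenset(tuple(sorted(e)) for e in edges)

def isomorphic(n, g, h):
    """Brute-force isomorphism test on vertices 0..n-1 (feasible only for tiny n).
    Stands in for the unbounded prover; this is an illustrative assumption."""
    g, h = normalize(g), normalize(h)
    if len(g) != len(h):
        return False
    return any(
        normalize((perm[u], perm[v]) for u, v in g) == h
        for perm in permutations(range(n))
    )

def gni_prover_answer(n, g1, g2, query):
    """Hypothetical reply rule: 1 if query is isomorphic to g1,
    2 if it is isomorphic to g2, and 0 otherwise."""
    if isomorphic(n, g1, query):
        return 1
    if isomorphic(n, g2, query):
        return 2
    return 0

# Common input: two non-isomorphic graphs on 4 vertices (a path and a star).
G1 = [(0, 1), (1, 2), (2, 3)]
G2 = [(0, 1), (0, 2), (0, 3)]

# A cheating verifier sends a graph H of its own choosing (here a relabelled
# path) instead of a random isomorphic copy of G1 or G2 that it generated itself.
H = [(2, 0), (0, 3), (3, 1)]
leaked = gni_prover_answer(4, G1, G2, H)  # tells the verifier whether H matches G1
```

The honest verifier already knows which input graph its query came from, so the reply teaches it nothing new; the cheating verifier learns something it could not necessarily compute by itself, which is what the added proof of knowledge is meant to prevent.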
6.7.4 Proofs of Identity (Identification schemes)

Identification schemes are useful in large distributed systems in which the users are not acquainted with one another. A typical, everyday example is the consumer-retailer situation. In computer systems, a typical example is electronic mail (in communication networks containing sites allowing too loose local super-user access). In between, in terms of technological sophistication, is the Automatic Teller Machine (ATM) system. In these distributed systems, one wishes to allow users to authenticate themselves to other users. This goal is achieved by identification schemes, defined below. In the sequel, we shall also see that identification schemes are intimately related to proofs of knowledge. We just hint that a person's identity can be linked to his ability to do something, and in particular to his ability to prove knowledge of some sort.

Definition

Loosely speaking, an identification scheme consists of a public file containing records for each user and an identification protocol. Each record consists of the name (or identity) of a user and auxiliary identification information to be used when invoking the identification protocol (as discussed below). The public file is established and maintained by a trusted party which vouches for the authenticity of the records (i.e., that each record has been submitted by the user whose name is specified in it). All users have read access to the public file at all times. Alternatively, the trusted party can supply each user with a signed copy of its public record. Suppose now that Alice wishes to prove to Bob that it is indeed she who is communicating with him. To this end, Alice invokes the identification protocol with the (public file) record corresponding to her name as a parameter. Bob verifies that the parameter in use indeed matches Alice's public record and proceeds to execute his role in the protocol.
It is required that Alice will always be able to convince Bob (that she is indeed Alice), whereas nobody else can fool Bob into believing that she/he is Alice. Furthermore, Carol should not be able to impersonate Alice even after receiving polynomially many proofs of identity from Alice. Clearly, if the identification information is to be of any use, then Alice must keep secret the random coins she has used to generate her record. Furthermore, Alice must use these stored coins during the execution of the identification protocol, but this must be done in a way which does not allow her counterparts to later impersonate her.

Conventions: In the following definition we adopt the formalism and notations of interactive machines with auxiliary input (presented in Definition 6.10). We recall that when $M$ is an interactive machine, we denote by $M(y)$ the machine which results by fixing $y$ to be the auxiliary input of machine $M$. In the following definition $n$ is the security parameter, and we assume, with little loss of generality, that the names (i.e., identities) of the users are encoded by strings of length $n$. If $A$ is a probabilistic algorithm and $x, r \in \{0,1\}^*$, then $A_r(x)$ denotes the output of algorithm $A$ on input $x$ and random coins $r$.

Remark: On first reading, the reader may ignore algorithm $A$ and the random variable $T_n$ in the security condition. Doing so, however, yields a weaker condition that is typically unsatisfactory.
Definition 6.51 (identification scheme): An identification scheme consists of a pair, $(I, \Pi)$, where $I$ is a probabilistic polynomial-time algorithm and $\Pi = (P, V)$ is a pair of probabilistic polynomial-time interactive machines satisfying the following conditions:

- Viability: For every $n \in \mathbb{N}$, every $\alpha \in \{0,1\}^n$, and every $s \in \{0,1\}^{\mathrm{poly}(n)}$,
  $$\mathrm{Prob}\left(\langle P(s), V \rangle(\alpha, I_s(\alpha)) = 1\right) = 1$$
- Security: For every pair of probabilistic polynomial-time interactive machines, $A$ and $B$, every polynomial $p(\cdot)$, all sufficiently large $n \in \mathbb{N}$, every $\alpha \in \{0,1\}^n$, and every $z$,
  $$\mathrm{Prob}\left(\langle B(z, T_n), V \rangle(\alpha, I_{S_n}(\alpha)) = 1\right) < \frac{1}{p(n)}$$
  where $S_n$ is a random variable uniformly distributed over $\{0,1\}^{\mathrm{poly}(n)}$, and $T_n$ is a random variable describing the output of $A(z)$ after interacting with $P(S_n)$, on common input $(\alpha, I_{S_n}(\alpha))$, for polynomially many times.

Algorithm $I$ is called the information generating algorithm, and the pair $(P, V)$ is called the identification protocol.

Hence, to use the identification scheme a user, say Alice, whose identity is encoded by the string $\alpha$, should first uniformly select a secret string $s$, compute $i \stackrel{\rm def}{=} I_s(\alpha)$, ask the trusted party to place the record $(\alpha, i)$ in the public file, and store the string $s$ in a safe place. The viability condition asserts that Alice can convince Bob of her identity by executing the identification protocol: Alice invokes the program $P$ using the stored string $s$ as auxiliary input, and Bob uses the program $V$ and makes sure that the common input is the public record containing $\alpha$ (which is in the public file). Ignoring, for a moment, algorithm $A$ and the random variable $T_n$, the security condition yields that it is infeasible for a party to impersonate Alice if all this party has is the public record of Alice and some unrelated auxiliary input. However, such a security condition may not suffice in many applications, since a user wishing to impersonate Alice may first ask her to prove her identity to him/her.
The (full) security condition asserts that even if Alice has proven her identity to Carol many times in the past, still it is infeasible for Carol to impersonate Alice. We stress that Carol cannot impersonate Alice to Bob provided that she cannot interact concurrently with both. In case this condition does not hold then nothing is guaranteed (and indeed Carol can easily cheat by referring Bob's questions to Alice and answering as Alice does).

Identification schemes and proofs of knowledge

A natural way of establishing a person's identity is to ask him/her to supply a proof of knowledge of a fact that this person is supposed to know. Let us consider a specific (and in fact quite generic) example.

Construction 6.52 (identification scheme based on a one-way function): Let $f$ be a function. On input an identity $\alpha \in \{0,1\}^n$, the information generating algorithm uniformly selects a string $s \in \{0,1\}^n$ and outputs $f(s)$. (The pair $(\alpha, f(s))$ is the public record for the user with name $\alpha$.) The identification protocol consists of a proof of knowledge of the inverse of the second element in the public record. Namely, in order to prove its identity, user $\alpha$ proves that he knows a string $s$ so that $f(s) = r$, where $(\alpha, r)$ is a record in the public file. (The proof of knowledge in use is allowed to have negligible knowledge error.)

Proposition 6.53 If $f$ is a one-way function and the proof of knowledge in use is zero-knowledge then Construction 6.52 constitutes an identification scheme.

Hence, identification schemes exist if one-way functions exist. More efficient identification schemes can be constructed based on specific intractability assumptions. For example, assuming the intractability of factoring, the so-called Fiat-Shamir identification scheme, which is actually a proof of knowledge of a square root, follows.

Construction 6.54 (the Fiat-Shamir identification scheme): On input an identity $\alpha \in \{0,1\}^n$, the information generating algorithm uniformly selects a composite number $N$, which is the product of two $n$-bit long primes, a residue $s$ mod $N$, and outputs the pair $(N, s^2 \bmod N)$. (The pair $(\alpha, (N, s^2 \bmod N))$ is the public record for user $\alpha$.) The identification protocol consists of a proof of knowledge of the corresponding modular square root. Namely, in order to prove its identity, user $\alpha$ proves that he knows a square root of $r \stackrel{\rm def}{=} s^2 \bmod N$, where $(\alpha, (N, r))$ is a record in the public file. (Again, negligible knowledge error is allowed.)

The proof of knowledge of a square root is analogous to the proof system for Graph Isomorphism presented in Construction 6.16. Namely, in order to prove knowledge of a square root of $r \equiv s^2 \pmod N$, the prover repeats the following steps sufficiently many times:

Construction 6.55 (atomic proof of knowledge of square root):

- The prover randomly selects a residue, $q$, modulo $N$ and sends $t \stackrel{\rm def}{=} q^2 \bmod N$ to the verifier.
- The verifier uniformly selects $\sigma \in \{0,1\}$ and sends it to the prover.
  Motivation: in case $\sigma = 0$ the verifier asks for a square root of $t$ mod $N$, whereas in case $\sigma = 1$ the verifier asks for a square root of $t \cdot r$ mod $N$. In the sequel we assume, without loss of generality, that $\sigma \in \{0,1\}$.
- The prover replies with $p \stackrel{\rm def}{=} q \cdot s^{\sigma} \bmod N$.
- The verifier accepts (this time) if and only if the messages $t$ and $p$ sent by the prover satisfy $p^2 \equiv t \cdot r^{\sigma} \pmod N$.

When Construction 6.55 is repeated $k$ times, either sequentially or in parallel, the resulting protocol constitutes a proof of knowledge of a modular square root with knowledge error $2^{-k}$. In case these repetitions are conducted sequentially, the resulting protocol is zero-knowledge. Yet, for use in Construction 6.54 it suffices that the proof of knowledge is witness-hiding, and fortunately even polynomially many parallel executions can be shown to be witness-hiding (see Section 6.6).
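The atomic round of Construction 6.55 can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the function names (`keygen`, `prover_commit`, `identify`, and so on) are hypothetical, the primes passed to `keygen` are tiny and offer no security, and the sketch runs the honest prover and verifier in one process rather than over a network. The variable names `q`, `t`, `sigma`, `p`, `r`, `s` and `N` follow the construction.

```python
import secrets
from math import gcd

def keygen(prime_a, prime_b):
    """Toy information-generating step: returns (N, s, r) with r = s^2 mod N."""
    N = prime_a * prime_b
    while True:
        s = secrets.randbelow(N - 2) + 2      # candidate secret in [2, N-1]
        if gcd(s, N) == 1:                    # keep s invertible mod N
            return N, s, pow(s, 2, N)

def prover_commit(N):
    """Prover picks a random residue q and sends t = q^2 mod N."""
    q = secrets.randbelow(N - 1) + 1          # q in [1, N-1]
    return q, pow(q, 2, N)

def verifier_challenge():
    """Verifier picks sigma uniformly in {0, 1}."""
    return secrets.randbelow(2)

def prover_respond(N, s, q, sigma):
    """Prover replies with p = q * s^sigma mod N."""
    return (q * pow(s, sigma, N)) % N

def verifier_check(N, r, t, sigma, p):
    """Verifier accepts this round iff p^2 = t * r^sigma (mod N)."""
    return pow(p, 2, N) == (t * pow(r, sigma, N)) % N

def identify(N, s, r, k):
    """Run k sequential atomic rounds; accept only if every round accepts."""
    for _ in range(k):
        q, t = prover_commit(N)
        sigma = verifier_challenge()
        p = prover_respond(N, s, q, sigma)
        if not verifier_check(N, r, t, sigma, p):
            return False
    return True
```

For instance, after `N, s, r = keygen(1009, 1013)`, the call `identify(N, s, r, 20)` returns `True` for the honest prover, since $p^2 = q^2 s^{2\sigma} \equiv t \cdot r^{\sigma} \pmod N$ in every round. A party holding a value whose square differs from $r$ passes a round essentially only when $\sigma = 0$, which matches the $2^{-k}$ knowledge error stated above.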
Hence the resulting identification scheme has constant round complexity. We remark that for identification purposes it suffices to perform Construction 6.55 superlogarithmically many times. Furthermore, fewer repetitions are also of value: when applying Construction 6.55 $k = O(\log n)$ times, and using the resulting protocol in Construction 6.54, we get a scheme (for identification) in which impersonation can occur with probability at most $2^{-k}$.

Identification schemes and proofs of ability

As hinted above, a proof of knowledge of a string (i.e., the ability to output the string) is a special case of a proof of ability to do something. It turns out that identification schemes can be based also on the more general concept of proofs of ability. We avoid defining this concept, and confine ourselves to two "natural" examples of using a proof of ability as a basis for identification.

It is an everyday practice to identify people by their ability to produce their signature. This practice can be carried into the digital setting. Specifically, the public record of Alice consists of her name and the verification key corresponding to her secret signing key in a predetermined signature scheme. The identification protocol consists of Alice signing a random message chosen by the verifier.

A second popular means of identification consists of identifying people by their ability to answer personal questions correctly. A digital analogue of this practice follows. To this end we use pseudorandom functions (see Section 3.6) and zero-knowledge proofs (of membership in a language). The public record of Alice consists of her name and a "commitment" to a randomly selected pseudorandom function (e.g., either via a string-commitment to the index of the function or via a pair consisting of a random domain element and the value of the function at this point).
The identification protocol consists of Alice returning the value of the function at a random location chosen by the verifier, and supplying a zero-knowledge proof that the value returned indeed matches the function appearing in the public record. We remark that the digital implementation offers more security than the everyday practice. In the everyday setting the verifier is given the list of all possible question and answer pairs and is trusted not to try to impersonate the user. Here we replaced the possession of the correct answers by a zero-knowledge proof that the answer is correct.

6.8 * Computationally-Sound Proofs (Arguments)

In this section we consider a relaxation of the notion of an interactive proof system. Specifically, we relax the soundness condition of interactive proof systems. Instead of requiring that it is impossible to fool the verifier into accepting a false statement (with probability greater than some bound), we only require that it is infeasible to do so. We call such protocols computationally sound proof systems (or arguments). The advantage of computationally sound proof systems is that perfect zero-knowledge computationally sound proof systems can be constructed, under some reasonable complexity assumptions, for all languages in $\mathcal{NP}$. We remark that perfect zero-knowledge proof systems are unlikely to exist for all languages in $\mathcal{NP}$ (see Section 6.5). We recall that computational zero-knowledge proof systems do exist for all languages in $\mathcal{NP}$, provided that one-way functions exist. Hence, the above quoted positive results exhibit some kind of a trade-off between the soundness and zero-knowledge properties of the zero-knowledge protocols of $\mathcal{NP}$. We remark, however, that this is not a real trade-off, since the perfect zero-knowledge computationally sound proofs for $\mathcal{NP}$ are constructed under stronger complexity theoretic assumptions than the ones used for the computational zero-knowledge proofs.
It is indeed an interesting research project to try to construct perfect zero-knowledge computationally sound proofs for $\mathcal{NP}$ under weaker assumptions (and in particular assuming only the existence of one-way functions).

We remark that it seems that computationally-sound proof systems can be much more efficient than ordinary proof systems. Specifically, under some plausible complexity assumptions, extremely efficient computationally-sound proof systems (i.e., requiring only poly-logarithmic communication and randomness) exist for any language in $\mathcal{NP}$. An analogous result cannot hold for ordinary proof systems, unless $\mathcal{NP}$ is contained in deterministic quasi-polynomial time (i.e., $\mathcal{NP} \subseteq \mathrm{Dtime}(2^{\mathrm{polylog}})$).

6.8.1 Definition

The definition of computationally sound proof systems follows naturally from the above discussion. The only issue to consider is that merely replacing the soundness condition of Definition 6.4 by the following computational soundness condition leads to an unnatural definition, since the computational power of the prover in the completeness condition (in Definition 6.4) is not restricted.

Computational Soundness: For every polynomial-time interactive machine $B$, and for all sufficiently long $x \notin L$,
$$\mathrm{Prob}\left(\langle B, V \rangle(x) = 1\right) \le \frac{1}{3}$$

Hence, it is natural to restrict the prover in both (completeness and soundness) conditions to be an efficient one. It is crucial to interpret efficient as being probabilistic polynomial-time given auxiliary input (otherwise only languages in $\mathcal{BPP}$ will have such proof systems). Hence, our starting point is Definition 6.10 (rather than Definition 6.4).

Definition 6.56 (computationally sound proof system) (arguments): A pair of interactive machines, $(P, V)$, is called a computationally sound proof system for a language $L$ if both machines are polynomial-time (with auxiliary inputs) and the following two conditions hold:

- Completeness: For every $x \in L$ there exists a string $y$ such that for every string $z$,
  $$\mathrm{Prob}\left(\langle P(y), V(z) \rangle(x) = 1\right) \ge \frac{2}{3}$$
- Computational Soundness: For every polynomial-time interactive machine $B$, and for all sufficiently long $x \notin L$ and every $y$ and $z$,
  $$\mathrm{Prob}\left(\langle B(y), V(z) \rangle(x) = 1\right) \le \frac{1}{3}$$

As usual, the error probability in the completeness condition can be reduced (from $\frac{1}{3}$) down to $2^{-\mathrm{poly}(|x|)}$, by repeating the protocol sufficiently many times. The same is not true, in general, with respect to the error probability in the computational soundness condition (see Exercise 21). All one can show is that the error probability can be reduced to be negligible (i.e., smaller than $1/p(\cdot)$, for every polynomial $p(\cdot)$). Specifically, by repeating a computationally sound proof sufficiently many times (i.e., superlogarithmically many times) we get a new verifier $V'$ for which the following holds: for every polynomial $p(\cdot)$, every polynomial-time interactive machine $B$, and for all sufficiently long $x \notin L$ and every $y$ and $z$,
$$\mathrm{Prob}\left(\langle B(y), V'(z) \rangle(x) = 1\right) \le \frac{1}{p(|x|)}$$
See Exercise 20.

6.8.2 Perfect Commitment Schemes

The thrust of the current section is in a method for constructing perfect zero-knowledge arguments for every language in $\mathcal{NP}$. This method makes essential use of the concept of commitment schemes with a perfect (or "information theoretic") secrecy property. Hence, we start with an exposition of "perfect" commitment schemes. We remark that such schemes may be useful also in other settings (e.g., in settings in which the receiver of the commitment is computationally unbounded, see for example Section 6.9).

The difference between commitment schemes (as defined in Subsection 6.4.1) and perfect commitment schemes (defined below) consists of a switching in scope of the secrecy and unambiguity requirements. In commitment schemes (see Definition 6.20), the secrecy requirement is computational (i.e., refers only to probabilistic polynomial-time adversaries), whereas the unambiguity requirement is information theoretic (and makes no reference to the computational power of the adversary).
On the other hand, in perfect commitment schemes (see definition below), the secrecy requirement is information theoretic, whereas the unambiguity requirement is computational (i.e., refers only to probabilistic polynomial-time adversaries). Hence, in some sense calling one of these schemes "perfect" is somewhat unfair to the other (yet, we do so in order to avoid cumbersome terms such as a "perfectly-secret/computationally-nonambiguous commitment scheme"). We remark that it is impossible to have a commitment scheme in which both the secrecy and unambiguity requirements are information theoretic (see Exercise 22).

Definition

Loosely speaking, a perfect commitment scheme is an efficient two-phase two-party protocol through which the sender can commit itself to a value so that the following two conflicting requirements are satisfied.

1. Secrecy: At the end of the commit phase the receiver does not gain any information about the sender's value.
2. Unambiguity: It is infeasible for the sender to interact with the receiver so that the commit phase is successfully terminated and yet later it is feasible for the sender to perform the reveal phase in two different ways, leading the receiver to accept (as legal "openings") two different values.

Using analogous conventions to the ones used in Subsection 6.4.1, we make the following definition.

Definition 6.57 (perfect bit commitment scheme): A perfect bit commitment scheme is a pair of probabilistic polynomial-time interactive machines, denoted $(S, R)$ (for sender and receiver), satisfying:

- Input Specification: The common input is an integer $n$ presented in unary (serving as the security parameter). The private input to the sender is a bit $v$.
- Secrecy: For every probabilistic (not necessarily polynomial-time) machine $R^*$ interacting with $S$, the random variables describing the output of $R^*$ in the two cases, namely $\langle S(0), R^* \rangle(1^n)$ and $\langle S(1), R^* \rangle(1^n)$, are statistically close.
- Unambiguity: Preliminaries. For simplicity $v \in \{0,1\}$ and $n \in \mathbb{N}$ are implicit in all notations. Fix any probabilistic polynomial-time algorithm $F^*$.
  - As in Definition 6.20, a receiver's view of an interaction with the sender, denoted $(r, m)$, consists of the random coins used by the receiver ($r$) and the sequence of messages received from the sender ($m$). A sender's view of the same interaction, denoted $(s, \tilde{m})$, consists of the random coins used by the sender ($s$) and the sequence of messages received from the receiver ($\tilde{m}$). A joint view of the interaction is a pair consisting of corresponding receiver and sender views of the same interaction.
  - Let $\sigma \in \{0,1\}$. We say that a joint view (of an interaction), $t \stackrel{\rm def}{=} ((r, m), (s, \tilde{m}))$, has a feasible $\sigma$-opening (with respect to $F^*$) if on input $(t, \sigma)$, algorithm $F^*$ outputs (say, with probability $> 1/2$) a string $s'$ such that $m$ describes the messages received by $R$ when $R$ uses local coins $r$ and interacts with machine $S$ which uses local coins $s'$ and input $(\sigma, 1^n)$. (Remark: We stress that $s'$ may, but need not, equal $s$. The output of algorithm $F^*$ has to satisfy a relation which depends only on the receiver's view part of the input; the sender's view is supplied to algorithm $F^*$ as additional help.)
  - We say that a joint view is ambiguous (with respect to $F^*$) if it has both a feasible 0-opening and a feasible 1-opening (w.r.t. $F^*$).

  The unambiguity requirement asserts that, for all but a negligible fraction of the coin tosses of the receiver, it is infeasible for the sender to interact with the receiver so that the resulting joint view is ambiguous with respect to some probabilistic polynomial-time algorithm $F^*$. Namely, for every probabilistic polynomial-time interactive machine $S^*$, probabilistic polynomial-time algorithm $F^*$, polynomial $p(\cdot)$, and all sufficiently large $n$, the probability that the joint view of the interaction between $R$ and $S^*$, on common input $1^n$, is ambiguous with respect to $F^*$, is at most $1/p(n)$.
In the formulation of the unambiguity requirement, $S^*$ describes the (cheating) sender strategy in the commit phase, whereas $F^*$ describes its strategy in the reveal phase. Hence, it is justified (and in fact necessary) to pass the sender's view of the interaction (between $S^*$ and $R$) to algorithm $F^*$. The unambiguity requirement asserts that any efficient strategy $S^*$ will fail to produce a joint view of interaction which can later be (efficiently) opened in two different ways supporting two different values. As usual, events occurring with negligible probability are ignored.

As in Definition 6.20, the secrecy requirement refers explicitly to the situation at the end of the commit phase, whereas the unambiguity requirement implicitly assumes that the reveal phase takes the following form:

1. the sender sends to the receiver its initial private input, $v$, and the random coins, $s$, it has used in the commit phase;
2. the receiver verifies that $v$ and $s$ (together with the coins ($r$) used by $R$ in the commit phase) indeed yield the messages that $R$ has received in the commit phase. Verification is done in polynomial time (by running the programs $S$ and $R$).

Construction based on one-way permutations

Perfect commitment schemes can be constructed using any one-way permutation. The known scheme, however, involves a linear (in the security parameter) number of rounds. Hence, it can be used for the purposes of the current section, but not for the construction in Section 6.9.

Construction 6.58 (perfect bit commitment): Let $f$ be a permutation, and $b(x, y)$ denote the inner-product mod 2 of $x$ and $y$ (i.e., $b(x, y) = \sum_{i=1}^{n} x_i y_i \bmod 2$).

1. commit phase (using security parameter $n$): The receiver randomly selects $n-1$ linearly independent vectors $r_1, \ldots, r_{n-1} \in \{0,1\}^n$. The sender uniformly selects $s \in \{0,1\}^n$ and computes $y = f(s)$. (So far no message is exchanged between the parties.) The parties proceed in $n-1$ rounds.
In the $i$th round ($i = 1, \ldots, n-1$), the receiver sends $r_i$ to the sender, which replies by computing and sending $c_i \stackrel{\rm def}{=} b(y, r_i)$. At this point there are exactly two solutions to the equations $b(y, r_i) = c_i$, $1 \le i \le n-1$. Define $j = 0$ if $y$ is the lexicographically first solution (among the two), and $j = 1$ otherwise. To commit to a value $v \in \{0,1\}$, the sender sends $c_n \stackrel{\rm def}{=} j \oplus v$ to the receiver.

2. reveal phase: In the reveal phase, the sender reveals the string $s$ used in the commit phase. The receiver accepts the value $v$ if $f(s) = y$, $b(y, r_i) = c_i$ for all $1 \le i \le n-1$, and $y$ is the lexicographically first solution to these $n-1$ equations iff $c_n = v$.

Proposition 6.59 Suppose that $f$ is a one-way permutation. Then, the protocol presented in Construction 6.58 constitutes a perfect bit commitment scheme.

It is quite easy to see that Construction 6.58 satisfies the secrecy condition. The proof that the unambiguity requirement is satisfied is quite complex and is omitted for space considerations.

Construction based on clawfree collections

Perfect commitment schemes (of constant number of rounds) can be constructed using a strong intractability assumption; specifically, the existence of clawfree collections (see Subsection 2.4.5). This assumption implies the existence of one-way functions, but it is not known whether the converse is true. Nevertheless, clawfree collections can be constructed under widely believed assumptions such as the intractability of factoring and DLP. Actually, the construction of perfect commitment schemes, presented below, uses a clawfree collection with an additional property; specifically, it is assumed that the set of indices of the collection (i.e., the range of algorithm $I$) can be efficiently recognized (i.e., is in $\mathcal{BPP}$). We remark that such collections do exist under the assumption that DLP is intractable (see Subsection 2.4.5).

Construction 6.60 (perfect bit commitment): Let $(I, D, F)$ be a triplet of efficient algorithms.

1. commit phase: To receive a commitment to a bit (using security parameter $n$), the receiver randomly generates $i = I(1^n)$ and sends it to the sender. To commit to value $v \in \{0,1\}$ (upon receiving the message $i$ from the receiver), the sender checks if indeed $i$ is in the range of $I(1^n)$, and if so the sender randomly generates $s = D(i)$, computes $c = F(v, i, s)$, and sends $c$ to the receiver. (In case $i$ is not in the range of $I(1^n)$ the sender aborts the protocol, announcing that the receiver is cheating.)
2. reveal phase: In the reveal phase, the sender reveals the string $s$ used in the commit phase. The receiver accepts the value $v$ if $F(v, i, s) = c$, where $(i, c)$ is the receiver's (partial) view of the commit phase.

Proposition 6.61 Let $(I, D, F)$ be a clawfree collection with a probabilistic polynomial-time recognizable set of indices (i.e., range of algorithm $I$). Then, the protocol presented in Construction 6.60 constitutes a perfect bit commitment scheme.

Proof: The secrecy requirement follows directly from Property (2) of a clawfree collection (combined with the test $i \in I(1^n)$ conducted by the sender). The unambiguity requirement follows from Property (3) of a clawfree collection, using a standard reducibility argument.

We remark that the Factoring Clawfree Collection, presented in Subsection 2.4.5, can be used to construct a perfect commitment scheme although this collection is not known to have an efficiently recognizable index set. Hence, perfect commitment schemes exist also under the assumption that factoring Blum integers is intractable. Loosely speaking, this is done by letting the receiver prove to the sender (in zero-knowledge) that the selected index, $N$, satisfies the secrecy requirement. What is actually being proven is that half of the square roots, of each quadratic residue mod $N$, have Jacobi symbol 1 (relative to $N$). A zero-knowledge proof system of this claim does exist (without assuming anything).
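To make the shape of Construction 6.60 concrete, here is a minimal Python sketch using one commonly described discrete-logarithm instantiation, in which $F(v, i, s) = z^v g^s \bmod P$. Everything specific in it is an assumption for illustration and is not taken from this section or from Subsection 2.4.5: the toy group parameters `P`, `Q`, `G`, the function names, and the choice to keep the group fixed so that the index sent by the receiver is just the element `z`. The parameters are far too small to offer any security.

```python
import secrets

# Toy, hypothetical parameters: P = 2*Q + 1 with P and Q prime,
# and G = 4 generates the subgroup of order Q in Z_P^*.
P, Q, G = 1019, 509, 4

def index_gen():
    """Receiver's step (role of I): pick a random non-trivial subgroup element z."""
    w = secrets.randbelow(Q - 1) + 1          # exponent in [1, Q-1]
    return pow(G, w, P)

def index_is_valid(z):
    """Sender's check that z is a legitimate index (element of order Q)."""
    return 1 < z < P and pow(z, Q, P) == 1

def commit(v, z):
    """Sender's commit phase: sample s (role of D) and output c = z^v * G^s mod P."""
    if v not in (0, 1):
        raise ValueError("v must be a bit")
    if not index_is_valid(z):
        raise ValueError("invalid index: the receiver is cheating")
    s = secrets.randbelow(Q)
    c = (pow(z, v, P) * pow(G, s, P)) % P
    return c, s

def reveal_ok(v, z, s, c):
    """Receiver's reveal-phase check: accept v iff F(v, z, s) equals c."""
    return v in (0, 1) and c == (pow(z, v, P) * pow(G, s, P)) % P
```

In this sketch, for a valid index the value `c` is a uniformly distributed subgroup element whichever bit is committed, which mirrors the secrecy argument. Two openings of the same `c` to different bits would give $g^{s_0} \equiv z \cdot g^{s_1}$, that is, the discrete logarithm of `z`, which mirrors the claw-based unambiguity argument.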
We remark that the idea used for the Factoring Clawfree Collection can be described as replacing the requirement that the index set is efficiently recognizable by a zero-knowledge proof that a string is indeed a legitimate index.

Commitment Schemes with a posteriori secrecy

We conclude the discussion of perfect commitment schemes by introducing a relaxation of the secrecy requirement. The resulting scheme cannot be used for the purposes of the current section, yet it is useful in different settings. The advantage of the relaxation is that it allows one to construct commitment schemes using any clawfree collection, thus waiving the additional requirement that the index set is efficiently recognizable.

Loosely speaking, we relax the secrecy requirement of perfect commitment schemes by requiring that it only holds whenever the receiver follows its prescribed program (denoted $R$). This seems strange since we don't really want to assume that the real receiver follows the prescribed program (but rather allow it to behave arbitrarily). The point is that a real receiver may disclose the coin tosses used by it in the commit phase at a later stage, say even after the reveal phase, and by doing so a posteriori prove that (at least in some weak sense) it was following the prescribed program. Actually, the receiver only proves that it behaved in a manner which is consistent with its program.

Definition 6.62 (commitment scheme with perfect a posteriori secrecy): A bit commitment scheme with perfect a posteriori secrecy is defined as in Definition 6.57, except that the secrecy requirement is replaced by the following a posteriori secrecy requirement: For every string $r \in \{0,1\}^{\mathrm{poly}(n)}$ it holds that $\langle S(0), R_r \rangle(1^n)$ and $\langle S(1), R_r \rangle(1^n)$ are statistically close, where $R_r$ denotes the execution of the interactive machine $R$ when using internal coin tosses $r$.

Proposition 6.63 Let $(I, D, F)$ be a clawfree collection.
Consider a modification of Construction 6.60, in which the sender's check, of whether $i$ is in the range of $I(1^n)$, is omitted (from the commit phase). Then the resulting protocol constitutes a bit commitment scheme with perfect a posteriori secrecy.

In contrast to Proposition 6.61, here the clawfree collection may not have an efficiently recognizable index set. Hence, the sender's check has to be omitted. Yet, the receiver can later prove that the message sent by it during the commit phase (i.e., $i$) is indeed a valid index by disclosing the random coins it has used in order to generate $i$ (using algorithm $I$).

Proof: The a posteriori secrecy requirement follows directly from Property (2) of a clawfree collection (combined with the assumption that $i$ is indeed a valid index). The unambiguity requirement follows as in Proposition 6.61.

A typical application of a commitment scheme with perfect a posteriori secrecy is presented in Section 6.9. In that setting the commitment scheme is used inside an interactive proof with the verifier playing the role of the sender (and the prover playing the role of the receiver). If the verifier a posteriori learns that the prover has been cheating then the verifier rejects the input. Hence, no damage is caused, in this case, by the fact that the secrecy of the verifier's commitments might have been breached.

Nonuniform computational unambiguity

Actually, for the applications to proof/argument systems, both the one below and the one in Section 6.9, we need commitment schemes with perfect secrecy and nonuniform computational unambiguity. (The reasons for this need are analogous to the case of the
We stress that all the constructions of perfect commitment schemes possess the nonuniform computational unambiguity, provided that the underlying clawfree collections foil also nonuniform polynomial-size claw-forming circuits. In order to prevent the terms from becoming too cumbersome we omit the phrase "nonuniform" when referring to the perfect commitment schemes in the description of the two applications.

6.8.3 Perfect Zero-Knowledge Arguments for NP

Having perfect commitment schemes at our disposal, we can construct perfect zero-knowledge arguments for $\mathcal{NP}$, by modifying the construction of (computational) zero-knowledge proofs (for $\mathcal{NP}$) in a totally syntactic manner. We recall that in these proof systems (e.g., Construction 6.25 for Graph 3-Colorability) the prover uses a commitment scheme in order to commit itself to many values, part of which it later reveals upon the verifier's request. All that is needed is to replace the commitment scheme used by the prover by a perfect commitment scheme. We claim that the resulting protocol is a perfect zero-knowledge argument (computationally sound proof) for the original language. For the sake of concreteness we prove

Proposition 6.64 Consider a modification of Construction 6.25 so that the commitment scheme used by the prover is replaced by a perfect commitment scheme. Then the resulting protocol is a perfect zero-knowledge weak argument for Graph 3-Colorability.

By a weak argument we mean a protocol in which the gap between the completeness and the computational soundness condition is non-negligible. In our case the verifier always accepts inputs in $G3C$, whereas no efficient prover can fool him into accepting graphs $G = (V, E)$ not in $G3C$ with probability greater than $1 - \frac{1}{2|E|}$. We remind the reader that by polynomially many repetitions the error probability can be made negligible.

Proof Sketch: We start by proving that the resulting protocol is perfect zero-knowledge for $G3C$.
We use the same simulator as in the proof of Proposition 6.26. However, this time analyzing the properties of the simulator is much easier since the commitments are distributed independently of the committed values, and consequently the verifier acts in total oblivion of the values. It follows that the simulator outputs a transcript with probability exactly $\frac{2}{3}$, and for similar reasons this transcript is distributed identically to the real interaction. The perfect zero-knowledge property follows.

The completeness condition is obvious, as in the proof of Proposition 6.26. It is left to prove that the protocol satisfies the computational soundness requirement. This is indeed the more subtle part of the current proof (in contrast to the proof of Proposition 6.26, in which proving soundness is quite easy). We use a reducibility argument to show that a prover's ability to cheat with too high probability on inputs not in $G3C$ translates to an algorithm contradicting the unambiguity of the commitment scheme. Details follow.

We assume, to the contrary, that there exists a (polynomial-time) cheating prover $P^*$, and an infinite sequence of integers, so that for each integer $n$ there exist a graph $G_n = (V_n, E_n) \notin G3C$ and a string $y_n$ so that $P^*(y_n)$ leads the verifier to accept $G_n$ with probability $> 1 - \frac{1}{2|E_n|}$. Let $k \stackrel{\rm def}{=} |V_n|$. Let $c_1, \ldots, c_k$ be the sequence of commitments (to the vertices' colors) sent by the prover in step (P1). Recall that in the next step, the verifier sends a uniformly chosen edge (of $E_n$) and the prover must answer by revealing different colors for its endpoints, otherwise the verifier rejects. A straightforward calculation shows that, since $G_n$ is not 3-colorable, there must exist a vertex for which the prover is able to reveal at least two different colors. Hence, we can construct a polynomial-size circuit, incorporating $P^*$, $G_n$ and $y_n$, that violates the (nonuniform) unambiguity condition.
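The calculation referred to above can be sketched informally as follows. This is only an outline: it treats the prover's openings as deterministic once step (P1) is fixed, it ignores negligible error terms, and the symbol $\gamma$ is introduced here purely for illustration. Let $\gamma$ denote the probability, over step (P1), that each commitment $c_1, \ldots, c_k$ is opened by the prover to a single color regardless of which edge the verifier asks about. In that event the openings define a coloring of $G_n$; since $G_n$ is not 3-colorable, some edge is not properly colored, so the verifier rejects with probability at least $1/|E_n|$.

```latex
\mathrm{Prob}(\text{verifier accepts})
  \;\le\; \gamma \cdot \left(1 - \frac{1}{|E_n|}\right) + (1 - \gamma)
  \;=\; 1 - \frac{\gamma}{|E_n|}
\qquad\text{and}\qquad
1 - \frac{\gamma}{|E_n|} \;>\; 1 - \frac{1}{2|E_n|}
  \;\Longrightarrow\; \gamma < \frac{1}{2}.
```

Under these simplifications, with probability greater than $\frac{1}{2}$ over step (P1) there is a vertex whose commitment the prover opens to two different colors, and this is the event exploited by the reduction to the unambiguity condition.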
This contradicts the hypothesis of the proposition, and completes the proof.

Combining Propositions 6.59 and 6.64, we get

Corollary 6.65 If non-uniformly one-way permutations exist then every language in NP has a perfect zero-knowledge argument.

Concluding Remarks

Propositions 6.26 and 6.64 exhibit a kind of trade-off between the strength of the soundness and zero-knowledge properties. The protocol of Proposition 6.26 offers computational zero-knowledge and "perfect" soundness, whereas the protocol of Proposition 6.64 offers perfect zero-knowledge and computational soundness. However, one should note that the two results are not obtained under the same assumptions. The conclusion of Proposition 6.26 is valid as long as any one-way functions exist, whereas the conclusion of Proposition 6.64 requires a (probably much) stronger assumption. Yet, one may ask which of the two protocols we should prefer, assuming that they are both valid. The answer depends on the setting (i.e., application) in which the protocol is to be used. In particular, one should consider the following issues:

- The relative importance attributed to soundness and zero-knowledge in the specific application. In case of a clear priority to one of the two properties, a choice should be made accordingly.

- The computational resources of the various users in the application. One of the users may be known to be in possession of much more substantial computing resources, and it may be reasonable to require that he/she should not be able to cheat, not even in an information-theoretic sense.

- The soundness requirement refers only to the duration of the execution, whereas in many applications zero-knowledge may be of concern also for a long time afterwards. If this is the case then perfect zero-knowledge arguments do offer a clear advantage (over zero-knowledge proofs).
6.8.4 Zero-Knowledge Arguments of Polylogarithmic Efficiency

A dramatic improvement in the efficiency of zero-knowledge arguments for NP can be obtained by combining ideas from Chapter [missing(sign.sec)] and a result described in Section [missing(eff-pcp.sec)]. In particular, assuming the existence of very strong collision-free hashing functions, one can construct a computationally-sound (zero-knowledge) proof, for any language in NP, which uses only a polylogarithmic amount of communication and randomness. The interesting point in the above statement is the mere existence of such extremely efficient arguments, let alone their zero-knowledge property. Hence, we confine ourselves to describing the ideas involved in constructing such arguments, and do not address the issue of making them zero-knowledge.

By Theorem [missing(np-pcp.thm)], every NP language, $L$, can be reduced to 3SAT so that non-members of $L$ are mapped into 3CNF formulae for which every truth assignment satisfies at most a $1-\epsilon$ fraction of the clauses, where $\epsilon > 0$ is a universal constant. Let us denote this reduction by $f$. Now, in order to prove that $x \in L$ it suffices to prove that the formula $f(x)$ is satisfiable. This can be done by supplying a satisfying assignment for $f(x)$. The interesting point is that the verifier need not check that all clauses of $f(x)$ are satisfied by the given assignment. Instead, it may uniformly select only polylogarithmically many clauses and check that the assignment satisfies all of them. If $x \in L$ (and the prover supplies a satisfying assignment to $f(x)$) then the verifier will always accept. Yet, if $x \notin L$ then no assignment satisfies more than a $1-\epsilon$ fraction of the clauses, and consequently a uniformly chosen clause is not satisfied with probability at least $\epsilon$. Hence, checking superlogarithmically many clauses will do.

The above paragraph explains why the randomness complexity is polylogarithmic, but it does not explain why the same holds for the communication complexity.
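The clause-sampling bound above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the value chosen for the gap constant $\epsilon$ are hypothetical, and the coin count assumes that the number of clauses is a power of two, so that a clause index can be sampled without rejection.

```python
import math


def soundness_error(eps: float, t: int) -> float:
    """Upper bound on the probability that t independent, uniformly chosen
    clauses are all satisfied when at most a (1 - eps) fraction of the
    clauses is satisfied by the assignment."""
    return (1.0 - eps) ** t


def samples_for_error(eps: float, target: float) -> int:
    """Smallest t (up to floating-point rounding) with (1 - eps)**t <= target."""
    return math.ceil(math.log(target) / math.log(1.0 - eps))


def random_bits(num_clauses: int, t: int) -> int:
    """Coins needed to pick t clause indices uniformly from num_clauses
    (assumes num_clauses is a power of two, so no rejection sampling)."""
    return t * (num_clauses - 1).bit_length()


eps = 0.1  # hypothetical gap constant, for illustration only
t = samples_for_error(eps, 2.0 ** -40)
print(t, soundness_error(eps, t), random_bits(2 ** 20, t))
```

For a fixed $\epsilon$ the number of samples grows only linearly in the logarithm of the inverse target error, and each sample costs a logarithmic number of coins, which is where the polylogarithmic randomness complexity comes from.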
To this end we need an additional idea. The idea is to use a special commitment scheme which allows committing to a string of length $n$ so that the commit phase takes polylogarithmic communication and individual bits of this string can be revealed (and verified correct) at polylogarithmic communication cost. For constructing such a commitment scheme we use a collision-free hashing function. The function maps strings of some length to strings of half the length so that it is "hard" to find two strings which are mapped by the function to the same image.

Let $n$ denote the length of the input string to which the sender wishes to commit itself, and let $k$ be a parameter (which is later set to be polylogarithmic in $n$). Denote by $H$ a collision-free hashing function mapping strings of length $2k$ into strings of length $k$. The sender partitions its input string into $m \stackrel{\rm def}{=} n/k$ consecutive blocks, each of length $k$. Next, the sender constructs a binary tree of depth $\log_2 m$, placing the $m$ blocks in the corresponding leaves of the tree. In each internal node, the sender places the hash value obtained by applying the function $H$ to the contents of the children of this node. The only message sent in the commit phase is the contents of the root (sent by the sender to the receiver). By doing so, unless the sender can form collisions under $H$, the sender has "committed" itself to some $n$-bit long string. When the receiver wishes to get the value of a specific bit in the string, the sender reveals to the receiver the contents of both children of each node along the path from the root to the corresponding leaf. The receiver checks that the values supplied for each node (along the path) match the value obtained by applying $H$ to the values supplied for the two children.
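The tree-based commitment just described can be sketched in a few lines. This is an illustration only, under stated assumptions: truncated SHA-256 is a hypothetical stand-in for the collision-free function $H$, the parameter $k$ is measured in bytes rather than bits, $m$ is taken to be a power of two, and all function names are invented for the example.

```python
import hashlib

K = 16  # block length in bytes; plays the role of k (an illustrative choice)


def H(data: bytes) -> bytes:
    """Hypothetical stand-in for H: maps 2K bytes to K bytes (truncated SHA-256)."""
    if len(data) != 2 * K:
        raise ValueError("H expects exactly 2K bytes")
    return hashlib.sha256(data).digest()[:K]


def build_tree(blocks):
    """Return all tree levels: levels[0] holds the m leaves, levels[-1] the root."""
    m = len(blocks)
    if m == 0 or m & (m - 1) or any(len(b) != K for b in blocks):
        raise ValueError("need a power-of-two number of K-byte blocks")
    levels = [list(blocks)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def commit(levels):
    """Commit phase: the only message sent is the root."""
    return levels[-1][0]


def reveal(levels, index):
    """Reveal phase: both children of every node on the root-to-leaf path, root first."""
    path = []
    for height in range(len(levels) - 2, -1, -1):
        parent = index >> (height + 1)
        path.append((levels[height][2 * parent], levels[height][2 * parent + 1]))
    return path


def verify(root, index, path):
    """Receiver's check: return the revealed block, or None if a hash check fails."""
    expected = root
    for step, (left, right) in enumerate(path):
        if H(left + right) != expected:
            return None
        height = len(path) - 1 - step
        expected = right if (index >> height) & 1 else left
    return expected


blocks = [bytes([i]) * K for i in range(8)]  # m = 8 blocks of K bytes each
levels = build_tree(blocks)
root = commit(levels)
opened = verify(root, 5, reveal(levels, 5))
print(opened == blocks[5])
```

In this sketch a reveal transmits $\log_2 m$ pairs of $k$-length strings, so opening one block costs $2k \log_2 m$ units of communication, while the commit phase costs only $k$.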
The protocol for arguing that $x \in L$ consists of the prover committing itself to a satisfying assignment for $f(x)$, using the above scheme, and the verifier checking individual clauses by asking the prover to reveal the values assigned to the variables in these clauses. The protocol can be shown to be computationally-sound provided that it is infeasible to find a pair of distinct strings $\alpha, \beta \in \{0,1\}^{2k}$ so that $H(\alpha) = H(\beta)$. Specifically, we need to assume that forming collisions under $H$ is not possible in subexponential time; namely, that for some $\delta > 0$, forming collisions with probability greater than $2^{-k^{\delta}}$ must take at least $2^{k^{\delta}}$ time. In such a case, we set $k = (\log n)^{1+\frac{1}{\delta}}$ and get a computationally-sound proof of communication complexity $O(\frac{\log n}{o(1)} \cdot \log m \cdot k) = \mathrm{polylog}(n)$. (Weaker lower bounds for the collision-forming task may still yield meaningful results by an appropriate setting of the parameter $k$.) We stress that collisions can always be formed in time $2^{2k}$ and hence the entire approach fails if the prover is not computationally bounded (and consequently we cannot get (perfectly-sound) proof systems this way). Furthermore, by a simulation argument one may show that only languages in $\mathrm{Dtime}(2^{\mathrm{polylog}})$ have proof systems with polylogarithmic communication and randomness complexity.

6.9 * Constant Round Zero-Knowledge Proofs

In this section we consider the problem of constructing constant-round zero-knowledge proof systems with negligible error probability for all languages in NP. To make the rest of the discussion less cumbersome we define a proof system to be round-efficient if it is both constant-round and with negligible error probability.

We present two approaches to the construction of round-efficient zero-knowledge proofs for NP.

1. Basing the construction of round-efficient zero-knowledge proof systems on commitment schemes with perfect secrecy (see Subsection 6.8.2).

2. Constructing (round-efficient zero-knowledge) computationally-sound proof systems (see Section 6.8) instead of (round-efficient zero-knowledge) proof systems.

The advantage of the second approach is that round-efficient zero-knowledge computationally-sound proof systems for NP can be constructed using any one-way function, whereas it is not known whether round-efficient zero-knowledge proof systems for NP can be constructed under the same general assumption. In particular, we only know how to construct perfect commitment schemes by using much stronger assumptions (e.g., the existence of claw-free permutations).

Both approaches have one fundamental idea in common. We start with an abstract exposition of this common idea. Recall that the basic zero-knowledge proof for Graph 3-Colorability, presented in Construction 6.25, consists of a constant number of rounds. However, this proof system has a non-negligible error probability (in fact the error probability is very close to 1). In Section 6.4, it was suggested to reduce the error probability to a negligible one by sequentially applying the proof system sufficiently many times. The problem is that this yields a proof system with a non-constant number of rounds. A natural suggestion is to perform the repetitions of the basic proof in parallel, instead of sequentially. The problem with this "solution" is that it is not known whether the resulting proof system is zero-knowledge. Furthermore, it is known that it is not possible to present, as done in the proof of Proposition 6.26, a single simulator which uses every possible verifier as a black box (see Section 6.5). The source of trouble is that, when playing many versions of Construction 6.25 in parallel, a cheating verifier may select the edge to be inspected (i.e., step (V1)) in each version depending on the commitments sent in all versions (i.e., in step (P1)). Such behaviour of the verifier defeats a simulator analogous to the one presented in the proof of Proposition 6.26.
The way to overcome this difficulty is to "switch" the order of steps (P1) and (V1). But switching the order of these steps enables the prover to cheat (by sending commitments in which only the "query" edges are colored correctly). Hence, a more refined approach is required. The verifier starts by committing itself to one edge-query for each version (of Construction 6.25), then the prover commits itself to the coloring in each version, and only then the verifier reveals its queries and the rest of the proof proceeds as before. The commitment scheme used by the verifier should prevent the prover from predicting the sequence of edges committed to by the verifier. This is the point where the two approaches differ.

1. The first approach utilizes for this purpose a commitment scheme with perfect secrecy. The problem with this approach is that such schemes are known to exist only under stronger assumptions than merely the existence of one-way functions. Yet, such schemes do exist under assumptions such as the intractability of factoring integers of special form or the intractability of the discrete logarithm problem.

2. The second approach bounds the computational resources of prospective cheating provers. Consequently, it suffices to utilize, "against" these provers (as commitment receivers), commitment schemes with computational security. We remark that this approach utilizes (for the commitments done by the prover) a commitment scheme with an extra property. Yet, such schemes can be constructed using any one-way function.

We remark that both approaches lead to protocols that are zero-knowledge in a liberal sense (i.e., using expected polynomial-time simulators).

6.9.1 Using commitment schemes with perfect secrecy

For the sake of clarity, let us start by presenting a detailed description of the constant-round interactive proof (for Graph 3-Colorability (i.e., G3C)) sketched above.
This interactive proof employs two different commitment schemes. The first scheme is the simple commitment scheme (with "computational" secrecy) presented in Construction 6.21. We denote by $C_s(\sigma)$ the commitment of the sender, using coins $s$, to the (ternary) value $\sigma$. The second commitment scheme is a commitment scheme with perfect secrecy (see Section 6.8.2). For simplicity, we assume that this scheme has a commit phase in which the receiver sends one message to the sender, which then replies with a single message (e.g., the schemes presented in Section 6.8.2). Let us denote by $P_{m,s}(\alpha)$ the commitment of the sender to string $\alpha$, upon receiving message $m$ (from the receiver) and when using coins $s$.

Construction 6.66 (A round-efficient zero-knowledge proof for G3C):

Common Input: A simple (3-colorable) graph $G = (V,E)$. Let $n \stackrel{\rm def}{=} |V|$, $t \stackrel{\rm def}{=} n \cdot |E|$ and $V = \{1,...,n\}$.

Auxiliary Input to the Prover: A 3-coloring of $G$, denoted $\psi$.

Prover's preliminary step (P0): The prover invokes the commit phase of the perfect commitment scheme, which results in sending to the verifier a message $m$.

Verifier's preliminary step (V0): The verifier uniformly and independently selects a sequence of $t$ edges, $\overline{E} \stackrel{\rm def}{=} ((u_1,v_1),...,(u_t,v_t)) \in E^t$, and sends the prover a random commitment to these edges. Namely, the verifier uniformly selects $s \in \{0,1\}^n$ and sends $P_{m,s}(\overline{E})$ to the prover.

Motivating Remark: At this point the verifier is committed to a sequence of $t$ edges. This commitment is of perfect secrecy.

Prover's step (P1): The prover uniformly and independently selects $t$ permutations, $\pi_1,...,\pi_t$, over $\{1,2,3\}$, and sets $\phi_j(v) \stackrel{\rm def}{=} \pi_j(\psi(v))$, for each $v \in V$ and $1 \le j \le t$. The prover uses the computational commitment scheme to commit itself to the colors of each of the vertices according to each 3-coloring.
Namely, the prover uniformly and independently selects $s_{1,1},...,s_{n,t} \in \{0,1\}^n$, computes $c_{i,j} = C_{s_{i,j}}(\phi_j(i))$, for each $i \in V$ and $1 \le j \le t$, and sends $c_{1,1},...,c_{n,t}$ to the verifier.

Verifier's step (V1): The verifier reveals the sequence $\overline{E} = ((u_1,v_1),...,(u_t,v_t))$ to the prover. Namely, the verifier sends $(s,\overline{E})$ to the prover.

Motivating Remark: At this point the entire commitment of the verifier is revealed. The verifier now expects to receive, for each $j$, the colors assigned by the $j$-th coloring to vertices $u_j$ and $v_j$ (the endpoints of the $j$-th edge in $\overline{E}$).

Prover's step (P2): The prover checks that the message just received from the verifier is indeed a valid revealing of the commitment made by the verifier at step (V0). Otherwise the prover halts immediately. Let us denote the sequence of $t$ edges, just revealed, by $(u_1,v_1),...,(u_t,v_t)$. The prover uses the reveal phase of the computational commitment scheme in order to reveal, for each $j$, the $j$-th coloring of vertices $u_j$ and $v_j$ to the verifier. Namely, the prover sends to the verifier the sequence of quadruples

$$(s_{u_1,1}, \phi_1(u_1), s_{v_1,1}, \phi_1(v_1)), ..., (s_{u_t,t}, \phi_t(u_t), s_{v_t,t}, \phi_t(v_t)).$$

Verifier's step (V2): The verifier checks whether, for each $j$, the values in the $j$-th quadruple constitute a correct revealing of the commitments $c_{u_j,j}$ and $c_{v_j,j}$, and whether the corresponding values are different. Namely, upon receiving $(s_1,\sigma_1,s'_1,\tau_1)$ through $(s_t,\sigma_t,s'_t,\tau_t)$, the verifier checks whether for each $j$, it holds that $c_{u_j,j} = C_{s_j}(\sigma_j)$, $c_{v_j,j} = C_{s'_j}(\tau_j)$, and $\sigma_j \neq \tau_j$ (and both are in $\{1,2,3\}$). If all conditions hold then the verifier accepts. Otherwise it rejects.

We first assert that Construction 6.66 is indeed an interactive proof for G3C. Clearly, the verifier always accepts a common input in G3C. Suppose that the common input graph, $G = (V,E)$, is not in G3C. Clearly, each of the "committed colorings" sent by the prover in step (P1) contains at least one illegally-colored edge.
Using the perfect secrecy of the commitments sent by the verifier in step (V0), we deduce that at step (P1) the prover has "no idea" which edges the verifier asks to see (i.e., as far as the information available to the prover is concerned, each possibility is equally likely). Hence, although the prover sends the "coloring commitment" after receiving the "edge commitment", the probability that all the "committed edges" have legally "committed coloring" is at most (recall that $t = n \cdot |E|$)

$$\left(1 - \frac{1}{|E|}\right)^{t} \approx e^{-n} < 2^{-n}.$$

We now turn to show that Construction 6.66 is indeed zero-knowledge (in the liberal sense allowing expected polynomial-time simulators). For every probabilistic (expected) polynomial-time interactive machine, $V^*$, we introduce an expected polynomial-time simulator, denoted $M^*$. The simulator starts by selecting and fixing a random tape, $r$, for $V^*$. Given the input graph $G$ and the random tape $r$, the commitment message of the verifier $V^*$ is determined. Hence, $M^*$ invokes $V^*$, on input $G$ and random tape $r$, and gets the corresponding commitment message, denoted CM. The simulator proceeds in two steps.

S1) Extracting the query edges: $M^*$ generates a sequence of $n \cdot t$ random commitments to dummy values (e.g., all values equal 1), and feeds it to $V^*$. In case $V^*$ replies by revealing correctly a sequence of $t$ edges, denoted $(u_1,v_1),...,(u_t,v_t)$, the simulator records these edges and proceeds to the next step. In case the reply of $V^*$ is not a valid revealing of the commitment message CM, the simulator halts outputting the current view of $V^*$ (e.g., $G$, $r$ and the commitments to dummy values).

S2) Generating an interaction that satisfies the query edges (oversimplified exposition): Let $(u_1,v_1),...,(u_t,v_t)$ denote the sequence of edges recorded in step (S1).
$M^*$ generates a sequence of $n \cdot t$ commitments, $c_{1,1},...,c_{n,t}$, so that for each $j = 1,...,t$, it holds that $c_{u_j,j}$ and $c_{v_j,j}$ are random commitments to two different random values in $\{1,2,3\}$ and all the other $c_{i,j}$'s are random commitments to dummy values (e.g., all values equal 1). The underlying values are called pseudo-colorings. The simulator feeds this sequence of commitments to $V^*$. If $V^*$ replies by revealing correctly the (above recorded) sequence of edges, then $M^*$ can complete the simulation of a "real" interaction of $V^*$ (by revealing the colors of the endpoints of these recorded edges). Otherwise, the entire step is repeated (until success occurs).

In the rest of the description we ignore the possibility that, when invoked in steps (S1) and (S2), the verifier reveals two different edge commitments. Loosely speaking, this practice is justified by the fact that during expected polynomial-time computations such an event can occur only with negligible probability (since otherwise it contradicts the computational unambiguity of the commitment scheme used by the verifier).

To illustrate the behaviour of the simulator, assume that the program $V^*$ always reveals correctly the commitment done in step (V0). In such a case, the simulator will find out the query edges in step (S1), and using them in step (S2) it will simulate the interaction of $V^*$ with the real prover. Using ideas as in Section 6.4 one can show that the simulation is computationally indistinguishable from the real interaction. Note that in this case, step (S2) of the simulator is performed only once.

Consider now a more complex case in which, on each possible sequence of internal coin tosses $r$, program $V^*$ correctly reveals the commitment done in step (V0) only with probability $\frac{1}{3}$. The probability in this statement is taken over all possible commitments generated to the dummy values (in the simulator step (S1)). We first observe that the probability that $V^*$ correctly reveals the commitment done in step (V0), after receiving a random commitment to a sequence of pseudo-colorings (generated by the simulator in step (S2)), is approximately $\frac{1}{3}$. (Otherwise, we derive a contradiction to the computational secrecy of the commitment scheme used by the prover.) Hence, the simulator reaches step (S2) with probability $\frac{1}{3}$, and each execution of step (S2) is completed successfully with probability $p \approx \frac{1}{3}$. It follows that the expected number of times that step (S2) is invoked when running the simulator is $\frac{1}{3} \cdot \frac{1}{p} \approx 1$.

Let us now consider the general case. Let $q(G,r)$ denote the probability that, on input graph $G$ and random tape $r$, after receiving random commitments to dummy values (generated in step (S1)), program $V^*$ correctly reveals the commitment done in step (V0). Likewise, we denote by $p(G,r)$ the probability that (on input graph $G$ and random tape $r$), after receiving a random commitment to a sequence of pseudo-colorings (generated by the simulator in step (S2)), program $V^*$ correctly reveals the commitment done in step (V0). As before, the difference between $q(G,r)$ and $p(G,r)$ is negligible (in terms of the size of the graph $G$), otherwise one derives a contradiction to the computational secrecy of the prover's commitment scheme. We conclude that the simulator reaches step (S2) with probability $q \stackrel{\rm def}{=} q(G,r)$, and each execution of step (S2) is completed successfully with probability $p \stackrel{\rm def}{=} p(G,r)$. It follows that the expected number of times that step (S2) is invoked when running the simulator is $q \cdot \frac{1}{p}$. Here is the bad news: we cannot guarantee that $\frac{q}{p}$ is approximately 1 or even bounded by a polynomial in the input size (e.g., let $p = 2^{-n}$ and $q = 2^{-n/2}$; then the difference between them is negligible and yet $\frac{q}{p}$ is not bounded by $\mathrm{poly}(n)$). This is why the above description of the simulator is oversimplified and a modification is indeed required.
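The problematic example can be spelled out as a short worked calculation. The only assumption added here is that successive executions of step (S2) are independent trials, so that the number of executions until the first success is geometrically distributed with mean $1/p$.

```latex
\[
\mathbb{E}\bigl[\#\text{executions of (S2)}\bigr]
  \;=\; \underbrace{q}_{\text{(S2) is reached}} \cdot
        \underbrace{\frac{1}{p}}_{\text{trials until success}} ,
\qquad
p = 2^{-n},\;\; q = 2^{-n/2}
\;\Longrightarrow\;
|q - p| < 2^{-n/2},
\quad
\frac{q}{p} = 2^{n/2} .
\]
```

Thus the gap $|q-p|$ is negligible in $n$, while the expected number of executions, $2^{n/2}$, exceeds every polynomial in $n$.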
We make the simulator expected polynomial-time by modifying step (S2) as follows. We add an intermediate step (S1.5), to be performed only if the simulator did not halt in step (S1). The purpose of step (S1.5) is to provide a good estimate of $q(G,r)$. The estimate is computed by repeating step (S1) until a fixed (polynomial in $|G|$) number of correct $V^*$-reveals are encountered (i.e., the estimate will be the ratio of the number of successes divided by the number of trials). By fixing a sufficiently large polynomial, we can guarantee that with overwhelmingly high probability (i.e., $1 - 2^{-\mathrm{poly}(|G|)}$) the estimate is within a constant factor of $q(G,r)$. It is easily verified that the estimate can be computed within expected time $\mathrm{poly}(|G|)/q(G,r)$. Step (S2) of the simulator is modified by adding a bound on the number of times it is performed, and if none of these executions yields a correct $V^*$-reveal then the simulator outputs a special empty interaction. Specifically, step (S2) will be performed at most $\mathrm{poly}(|G|)/\tilde{q}$ times, where $\tilde{q}$ is the estimate of $q(G,r)$ computed in step (S1.5). It follows that the modified simulator has expected running time bounded by $q(G,r) \cdot \frac{\mathrm{poly}(|G|)}{q(G,r)} = \mathrm{poly}(|G|)$.

It is left to analyze the output distribution of the modified simulator. We confine ourselves to reducing this analysis to the analysis of the output of the original simulator, by bounding the probability that the modified simulator outputs a special empty interaction. This probability is bounded by

$$\Delta(G,r) \stackrel{\rm def}{=} q(G,r) - q(G,r) \cdot \left(1 - (1 - p(G,r))^{\mathrm{poly}(|G|)/q(G,r)}\right) = q(G,r) \cdot (1 - p(G,r))^{\mathrm{poly}(|G|)/q(G,r)}.$$

We claim that $\Delta(G,r)$ is a negligible function of $|G|$. Assume, to the contrary, that there exist a polynomial $P(\cdot)$, an infinite sequence of graphs $\{G_n\}$, and an infinite sequence of random tapes $\{r_n\}$, such that $\Delta(G_n,r_n) > 1/P(n)$. It follows that for each such $n$ we have $q(G_n,r_n) > 1/P(n)$. We consider two cases.
Case 1: For infinitely many $n$'s, it holds that $p(G_n,r_n) \ge q(G_n,r_n)/2$. In such a case we get for these $n$'s

$$\Delta(G_n,r_n) \le (1 - p(G_n,r_n))^{\mathrm{poly}(|G_n|)/q(G_n,r_n)} \le \left(1 - \frac{q(G_n,r_n)}{2}\right)^{\mathrm{poly}(|G_n|)/q(G_n,r_n)} < 2^{-\mathrm{poly}(|G_n|)/2},$$

which contradicts our hypothesis that $\Delta(G_n,r_n) > 1/\mathrm{poly}(n)$.

Case 2: For infinitely many $n$'s, it holds that $p(G_n,r_n) < q(G_n,r_n)/2$. It follows that for these $n$'s we have $|q(G_n,r_n) - p(G_n,r_n)| > \frac{q(G_n,r_n)}{2} > \frac{1}{2P(n)}$, which contradicts the computational secrecy of the commitment scheme (used by the prover).

Hence, a contradiction follows in both cases.

We remark that one can modify Construction 6.66 so that weaker forms of perfect commitment schemes can be used. We refer specifically to commitment schemes with perfect a posteriori secrecy (see Subsection 6.8.2). In such schemes the secrecy is only established a posteriori, by the receiver disclosing the coin tosses it has used in the commit phase. In our case, the prover plays the role of the receiver, and the verifier plays the role of the sender. It suffices to establish the secrecy property a posteriori, since in case secrecy is not established the verifier may reject. In such a case no harm has been caused, since the secrecy of the perfect commitment scheme is used only to establish the soundness of the interactive proof.

6.9.2 Bounding the power of cheating provers

Construction 6.66 can be modified to yield a zero-knowledge computationally sound proof, under the (more general) assumption that one-way functions exist. In the modified protocol, we let the verifier use a commitment scheme with computational secrecy, instead of the commitment scheme with perfect secrecy used in Construction 6.66. (Hence, both users commit to their messages using commitment schemes with computational secrecy.) Furthermore, the commitment scheme used by the prover must have the extra property that it is infeasible to construct a commitment without "knowing" to what value it commits.
Such a commitment scheme is called non-oblivious. We start by defining and constructing non-oblivious commitment schemes.

Non-oblivious commitment schemes

The non-obliviousness of a commitment scheme is intimately related to the definition of proof of knowledge (see Section 6.7).

Definition 6.67 (non-oblivious commitment schemes): Let $(S,R)$ be a commitment scheme as in Definition 6.20. We say that the commitment scheme is non-oblivious if the prescribed receiver, $R$, constitutes a knowledge-verifier, that is always convinced by $S$, for the relation

$$\left\{ ((1^n, r, \overline{m}), (\sigma, s)) : \overline{m} = \mathrm{view}^{S(\sigma,1^n,s)}_{R(1^n,r)} \right\}$$

where, as in Definition 6.20, $\mathrm{view}^{S(\sigma,1^n,s)}_{R(1^n,r)}$ denotes the messages received by the interactive machine $R$ on input $1^n$ and local coins $r$, when interacting with machine $S$ (that has input $(\sigma,1^n)$ and uses coins $s$).

It follows that the receiver's prescribed program, $R$, may accept or reject at the end of the commit phase, and that this decision is supposed to reflect the sender's ability to later come up with a legal opening of the commitment (i.e., successfully complete the reveal phase). We stress that non-obliviousness relates mainly to cheating senders, since the prescribed sender has no difficulty in later successfully completing the reveal phase (and in fact during the commit phase $S$ always convinces the receiver of this ability). Hence, any sender program (not merely the prescribed $S$) can be modified so that at the end of the commit phase it (locally) outputs information enabling the reveal phase (i.e., $\sigma$ and $s$). The modified sender runs in expected time that is inversely proportional to the probability that the commit phase is completed successfully.

We remark that in an ordinary commitment scheme, at the end of the commit phase, the receiver does not necessarily "know" whether the sender can later successfully conduct the reveal phase.
For example, a cheating sender in Construction 6.21 can (undetectedly) perform the commit phase without the ability to later successfully perform the reveal phase (e.g., the sender may just send a uniformly chosen string). It is only guaranteed that if the sender follows the prescribed program then the sender can always succeed in the reveal phase. Furthermore, with respect to the scheme presented in Construction 6.23, a cheating sender can (undetectedly) perform the commit phase in a way that generates a receiver view which does not have any corresponding legal opening (and hence the reveal phase is doomed to fail). See Exercise 13. Nevertheless

Theorem 6.68 If one-way functions exist then there exist non-oblivious commitment schemes with a constant number of communication rounds.

We recall that (ordinary) commitment schemes can be constructed assuming the existence of one-way functions (see Proposition 6.24 and Theorem 3.29). Consider the relation corresponding to such a scheme. Using zero-knowledge proofs of knowledge (see Section 6.7) for the above relation, we get a non-oblivious commitment scheme. (We remark that such proofs do exist under the same assumptions.) However, the resulting commitment scheme has an unbounded number of rounds (due to the round complexity of the zero-knowledge proof). We seem to have reached a vicious circle, yet there is a way out. We can use constant-round witness indistinguishable proofs (see Section 6.6), instead of the zero-knowledge proofs. The resulting commitment scheme has the additional property that when applied (polynomially) many times in parallel the secrecy property holds simultaneously in all copies. This fact follows from the Parallel Composition Lemma for witness indistinguishable proofs (see Section 6.6). The simultaneous secrecy of many copies is crucial to the following application.
Modifying Construction 6.66

We recall that we are referring to a modification of Construction 6.66 in which the verifier uses a commitment scheme (with computational secrecy), instead of the commitment scheme with perfect secrecy used in Construction 6.66. In addition, the commitment scheme used by the prover is non-oblivious. We conclude this section by remarking on how to adapt the argument of the first approach (i.e., of Subsection 6.9.1) to suit our current needs.

We start with the claim that the modified protocol is a computationally-sound proof for G3C. Verifying that the modified protocol satisfies the completeness condition is easy, as usual. We remark that the modified protocol does not satisfy the (usual) soundness condition (e.g., a "prover" of exponential computing power can break the verifier's commitment and generate pseudo-colorings that will later fool the verifier into accepting). Nevertheless, we can show that the modified protocol does satisfy the computational soundness (of Definition 6.56). Namely, we show that for every polynomial $p(\cdot)$, every polynomial-time interactive machine $B$, all sufficiently large graphs $G \notin \mathrm{G3C}$, and every $y$ and $z$,

$$\mathrm{Prob}\left(\langle B(y), V_{G3C}(z)\rangle(G) = 1\right) \le \frac{1}{p(|G|)}$$

where $V_{G3C}$ is the verifier program in the modified protocol. Using the information-theoretic unambiguity of the commitment scheme employed by the prover, we can talk of a unique color assignment which is induced by the prover's commitments. Using the fact that this commitment scheme is non-oblivious, it follows that the prover can be modified so that, in step (P1), it outputs the values to which it commits itself at this step. We can now use the computational secrecy of the verifier's commitment scheme to show that the color assignment generated by the prover is almost independent of the verifier's commitment.
Hence, the probability that the prover can fool the veri er to accept an input not in the language is non-negligibly greater than what it would have been if the veri er asked random queries after the prover makes its (color) commitments. The computational soundness of the (modi ed) protocol follows. We remark that we do not know whether the protocol is computationally sound in case the prover uses a commitment scheme that is not guaranteed to be non-oblivious. Showing that the (modi ed) protocol is zero-knowledge is even easier than it was in the rst approach (i.e., in Subsection 6.9.1). The reason being that when demonstrating zero-knowledge of such protocols we use the secrecy of the prover's commitment scheme and the unambiguity of the veri er's commitment scheme. Hence, only these properties of the commitment schemes are relevant to the zero-knowledge property of the protocols. Yet, the current (modi ed) protocol uses commitment schemes with relevant properties which are not weaker than the ones of the corresponding commitment schemes used in Construction 6.66. Speci cally, the prover's commitment scheme in the modi ed protocol possess computationally secrecy just like the prover's commitment scheme in Construction 6.66. We stress that this commitment, like the simpler commitment used for the prover in Construction 6.66, has the simultaneous secrecy (of many copies) property. Furthermore, the veri er's commitment scheme in the modi ed protocol possess \information theoretic" unambiguity, whereas the veri er's commitment scheme in Construction 6.66 is merely computationally unambiguous. 6.10 * Non-Interactive Zero-Knowledge Proofs Author's Note: Indeed, this section is missing 6.10.1 De nition 6.10.2 Construction 6.11 * Multi-Prover Zero-Knowledge Proofs In this section we consider an extension of the notion of an interactive proof system. Specifically, we consider the interaction of a veri er with several (say, two) provers. The provers 238 CHAPTER 6. 
may share an a-priori selected strategy, but it is assumed that they cannot interact with each other during the time period in which they interact with the verifier. Intuitively, the provers can coordinate their strategies prior to, but not during, their interrogation by the verifier.

The notion of a multi-prover interactive proof plays a fundamental role in complexity theory. This aspect is not addressed here (but rather postponed to Section [missing(eff-pcp.sec)]). In the current section we merely address the zero-knowledge aspects of multi-prover interactive proofs. Most importantly, the multi-prover model enables the construction of (perfect) zero-knowledge proof systems for NP, independent of any complexity theoretic (or other) assumptions. Furthermore, these proof systems can be extremely efficient. Specifically, the on-line computations of all parties can be performed in polylogarithmic time (on a RAM).

6.11.1 Definitions

For the sake of simplicity we consider the two-prover model. We remark that more provers do not offer any essential advantages (and specifically, none that interest us in this section). Loosely speaking, a two-prover interactive proof system is a three-party protocol, where two parties are provers and the additional party is a verifier. The only interaction allowed in this model is between the verifier and each of the provers. In particular, a prover does not "know" the contents of the messages sent by the verifier to the other prover. The provers do, however, share a random input tape, which is (as in the one-prover case) "beyond the reach" of the verifier. The two-prover setting is a special case of the two-partner model described below.

The two-partner model

The two-partner model consists of two partners interacting with a third party, called solitary. The two partners can agree on their strategies beforehand, and in particular agree on a common uniformly chosen string.
Yet, once the interaction with the solitary begins, the partners can no longer exchange information. The following definition of such an interaction extends Definitions 6.1 and 6.2.

Definition 6.69 (two-partner model): The two-partner model consists of three interactive machines, two are called partners and the third is called solitary, which are linked and interact as hereby specified.

- The input-tapes of all three parties coincide, and their contents is called the common input.
- The random-tapes of the two partners coincide, and are called the partners' random-tape. (The solitary has a separate random-tape.)
- The solitary has two pairs of communication-tapes and two switch-tapes instead of a single pair of communication-tapes and a single switch-tape (as in Definition 6.1).
- Both partners have the same identity and the solitary has an opposite identity (see Definitions 6.1 and 6.2).
- The first (resp., second) switch-tape of the solitary coincides with the switch-tape of the first (resp., second) partner, the first (resp., second) read-only communication-tape of the solitary coincides with the write-only communication-tape of the first (resp., second) partner, and vice versa.
- The joint computation of the three parties, on a common input $x$, is a sequence of triplets. Each triplet consists of the local configuration of each of the three machines. The behaviour of each partner-solitary pair is as in the definition of the joint computation of a pair of interactive machines.

Notation: We denote by $\langle P_1, P_2, S\rangle(x)$ the output of the solitary $S$ after interacting with the partners $P_1$ and $P_2$, on common input $x$.

Two-prover interactive proofs

A two-prover interactive proof system is now defined analogously to the one-prover case (see Definitions 6.4 and 6.6).
Definition 6.70 (two-prover interactive proof system): A triplet of interactive machines, $(P_1, P_2, V)$, in the two-partner model is called a proof system for a language $L$ if the machine $V$ (called verifier) is probabilistic polynomial-time and the following two conditions hold.

Completeness: For every $x \in L$,

$$\mathrm{Prob}\left(\langle P_1, P_2, V\rangle(x) = 1\right) \ge \frac{2}{3}$$

Soundness: For every $x \notin L$ and every pair of partners $(B_1, B_2)$,

$$\mathrm{Prob}\left(\langle B_1, B_2, V\rangle(x) = 1\right) \le \frac{1}{3}$$

As usual, the error probability (in both the completeness and the soundness condition) can be reduced from $\frac{1}{3}$ down to $2^{-\mathrm{poly}(|x|)}$, by sequentially repeating the protocol sufficiently many times. We stress that error reduction via parallel repetitions is not known to work in general.

The notion of zero-knowledge (for multi-prover systems) remains exactly as in the one-prover case. Actually, the definition of perfect zero-knowledge may even be made more strict by requiring that the simulator never fails (i.e., never outputs the special symbol $\bot$). Namely,

Definition 6.71: We say that a (two-prover) proof system $(P_1, P_2, V)$ for a language $L$ is perfect zero-knowledge if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ such that for every $x \in L$ the random variables $\langle P_1, P_2, V^*\rangle(x)$ and $M^*(x)$ are identically distributed.

Extension to the auxiliary-input (zero-knowledge) model is straightforward.

6.11.2 Two-Senders Commitment Schemes

The thrust of the current section is a method for constructing perfect zero-knowledge two-prover proof systems for every language in NP. This method makes essential use of a commitment scheme for two senders and one receiver that possesses "information theoretic" secrecy and unambiguity properties. We stress that it is impossible to simultaneously achieve "information theoretic" secrecy and unambiguity properties in the single-sender model.
A Definition

Loosely speaking, a two-sender commitment scheme is an efficient two-phase protocol for the two-partner model, through which the partners, called senders, can commit themselves to a value so that the following two conflicting requirements are satisfied.

1. Secrecy: At the end of the commit phase the solitary, called receiver, does not gain any information about the senders' value.

2. Unambiguity: Suppose that the commit phase is successfully terminated. Then if later the senders can perform the reveal phase so that the receiver accepts the value 0 with probability $p$, then they cannot perform the reveal phase so that the receiver accepts the value 1 with probability substantially bigger than $1 - p$. (Due to the secrecy requirement and the fact that the senders are computationally unbounded, for every $p$, the senders can always conduct the commit phase so that they can later reveal the value 0 with probability $p$ and the value 1 with probability $1 - p$.)

Instead of presenting a general definition, we restrict our attention to the special case of two-sender commitment schemes in which only the first sender (and the receiver) takes part in the commit phase, whereas only the second sender takes part in the reveal phase. Furthermore, we assume, without loss of generality, that in the reveal phase the (second) sender sends the contents of the joint random-tape (used by the first sender in the commit phase) to the receiver.

Definition 6.72 (two-sender bit commitment): A two-sender bit commitment scheme is a triplet of probabilistic polynomial-time interactive machines, denoted $(S_1, S_2, R)$, for the two-partner model satisfying:

Input Specification: The common input is an integer $n$ presented in unary, called the security parameter. The two partners, called the senders, have an auxiliary private input $v \in \{0,1\}$.

Secrecy: The 0-commitment and the 1-commitment are identically distributed.
Namely, for every probabilistic (not necessarily polynomial-time) machine $R^*$ interacting with the first sender (i.e., $S_1$), the random variables $\langle S_1(0), R^*\rangle(1^n)$ and $\langle S_1(1), R^*\rangle(1^n)$ are identically distributed.

Unambiguity: Preliminaries. For simplicity, $v \in \{0,1\}$ and $n \in \mathbb{N}$ are implicit in all notations.

- As in Definition 6.20, a receiver's view of an interaction with the (first) sender, denoted $(r, m)$, consists of the random coins used by the receiver, denoted $r$, and the sequence of messages received from the (first) sender, denoted $m$.
- Let $\sigma \in \{0,1\}$. We say that the string $s$ is a possible $\sigma$-opening of the receiver's view $(r, m)$ if $m$ describes the messages received by $R$ when $R$ uses local coins $r$ and interacts with machine $S_1$ which uses local coins $s$ and input $(\sigma, 1^n)$.
- Let $S_1^*$ be an arbitrary program for the first sender. Let $p$ be a real, and $\sigma \in \{0,1\}$. We say that $p$ is an upper bound on the probability of a $\sigma$-opening of the receiver's view of the interaction with $S_1^*$ if for every random variable $X$, which is statistically independent of the receiver's coin tosses, the probability that $X$ is a possible $\sigma$-opening of the receiver's view of an interaction with $S_1^*$ is at most $p$. (The probability is taken over the coin tosses of the receiver, the strategy $S_1^*$ and the random variable $X$.)
- Let $S_1^*$ be as above, and, for each $\sigma \in \{0,1\}$, let $p_\sigma$ be an upper bound on the probability of a $\sigma$-opening of the interaction with $S_1^*$. We say that the receiver's view of the interaction with $S_1^*$ is unambiguous if $p_0 + p_1 \le 1 + 2^{-n}$.

The unambiguity requirement asserts that, for every program for the first sender, $S_1^*$, the receiver's interaction with $S_1^*$ is unambiguous.

In the formulation of the unambiguity requirement, the random variables $X$ represent possible strategies of the second sender. These strategies may depend on the random input that is shared by the two senders, but are independent of the receiver's random coins (since information on these coins, if at all, is only sent to the first sender).
Actually, the highest possible value of $p_0 + p_1$ is attainable by deterministic strategies for both senders. Thus, it suffices to consider an arbitrary deterministic strategy $S_1^*$ for the first sender and a fixed $\sigma$-opening, denoted $s^\sigma$, for the second sender (for each $\sigma \in \{0,1\}$). In this case, the probability is taken only over the receiver's coin tosses and we can strengthen the unambiguity condition as follows:

(strong unambiguity condition) for every deterministic strategy $S_1^*$, and every pair of strings $(s^0, s^1)$, the probability that for both $\sigma = 0, 1$ the string $s^\sigma$ is a $\sigma$-opening of the receiver's view of the interaction with $S_1^*$ is bounded above by $2^{-n}$.

In general, in case the senders employ randomized strategies, they determine for each possible coin-tossing of the receiver a pair of probabilities corresponding to their success in a 0-opening and a 1-opening. The unambiguity condition asserts that the average of these pairs, taken over all possible coin tosses of the receiver, is a pair which sums up to at most $1 + 2^{-n}$. Intuitively, this means that the senders cannot do more harm than deciding at random (possibly based also on the receiver's message to the first sender) whether to commit to 0 or to 1. Both the secrecy and the unambiguity requirements are information theoretic (in the sense that no computational restrictions are placed on the adversarial strategies).

We stress that we have implicitly assumed that the reveal phase takes the following form:

1. the second sender sends to the receiver the initial private input, $v$, and the random coins, $s$, used by the first sender in the commit phase;

2. the receiver verifies that $v$ and $s$ (together with the private coins $r$ used by $R$ in the commit phase) indeed yield the messages that $R$ has received in the commit phase. Verification is done in polynomial-time (by running the programs $S_1$ and $R$).
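Returning to the strong unambiguity condition stated above, a short calculation indicates why, for deterministic strategies, it implies the unambiguity requirement of Definition 6.72. The event names $A_0$ and $A_1$ are introduced here only for this sketch: $A_\sigma$ denotes the event, over the receiver's coins $r$, that the fixed string $s^\sigma$ is a possible $\sigma$-opening of the receiver's view of the interaction with $S_1^*$. Taking $p_\sigma = \Pr_r[A_\sigma]$ and using the strong condition $\Pr_r[A_0 \cap A_1] \le 2^{-n}$:

```latex
\begin{align*}
p_0 + p_1 &= \Pr_r[A_0] + \Pr_r[A_1] \\
          &= \Pr_r[A_0 \cup A_1] + \Pr_r[A_0 \cap A_1] \\
          &\le 1 + 2^{-n}.
\end{align*}
```

The second line is inclusion-exclusion; the last line bounds the union by 1 and the intersection by the strong condition.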
A Construction

By the above conventions, it suffices to explicitly describe the commit phase (in which only the first sender takes part).

Construction 6.73 (two-sender bit commitment):

Preliminaries: Let $\pi_0, \pi_1$ denote two permutations over $\{0,1,2\}$ so that $\pi_0$ is the identity permutation and $\pi_1$ is a permutation consisting of a single transposition, say $(1\ 2)$. Namely, $\pi_1(1) = 2$, $\pi_1(2) = 1$ and $\pi_1(0) = 0$.

Common input: the security parameter $n$ (in unary).

A convention: Suppose that the contents of the senders' random-tape encodes a uniformly selected $s = s_1 \cdots s_n \in \{0,1,2\}^n$.

Commit Phase:

1. The receiver uniformly selects $r = r_1 \cdots r_n \in \{0,1\}^n$ and sends $r$ to the first sender.

2. To commit to a bit $\sigma$, the first sender computes $c_i \stackrel{\rm def}{=} \pi_{r_i}(s_i) + \sigma \bmod 3$, for each $i$, and sends $c_1 \cdots c_n$ to the receiver.

We remark that the second sender could have opened the commitment either way if he had known $r$ (sent by the receiver to the first sender). The point is that the second sender does not "know" $r$, and this fact drastically limits its ability to cheat.

Proposition 6.74: Construction 6.73 constitutes a two-sender bit commitment scheme.

Proof: The secrecy property follows by observing that for every choice of $r \in \{0,1\}^n$, the message sent by the first sender is uniformly distributed over $\{0,1,2\}^n$.

The unambiguity property is proven by contradiction. As a motivation, we first consider the execution of the above protocol when $n$ equals 1 and show that it is impossible for the two senders to be always able to open the commitments both ways. Consider two messages, $(0, s^0)$ and $(1, s^1)$, sent by the second sender in the reveal phase so that $s^0$ is a possible 0-opening and $s^1$ is a possible 1-opening, both with respect to the receiver's view. We stress that these messages are sent obliviously of the random coins of the receiver, and hence must match all possible receiver's views (or else the opening does not always succeed).
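Before the argument continues, the commit phase of Construction 6.73 and the $n = 1$ setup just described can be made concrete. The following is a minimal Python sketch, not part of the original text: the names `PI`, `commit`, `receiver_accepts`, `open_as` and `n1_double_openings` are illustrative choices, and the use of the `secrets` module for coin tosses is an assumption. It shows the commit step, the reveal-phase check, an opening computed with knowledge of $r$ (which the second sender lacks), and an exhaustive search for pairs $(s^0, s^1)$ that would open both ways for both values of $r$ when $n = 1$.

```python
import itertools
import secrets

# pi_0 is the identity on {0,1,2}; pi_1 swaps 1 and 2 and fixes 0.
PI = ((0, 1, 2), (0, 2, 1))


def receiver_challenge(n):
    """Commit step 1: the receiver picks r uniformly in {0,1}^n."""
    return [secrets.randbelow(2) for _ in range(n)]


def shared_tape(n):
    """The senders' common random tape: s uniform in {0,1,2}^n."""
    return [secrets.randbelow(3) for _ in range(n)]


def commit(sigma, s, r):
    """Commit step 2: c_i = pi_{r_i}(s_i) + sigma mod 3, for each i."""
    return [(PI[ri][si] + sigma) % 3 for ri, si in zip(r, s)]


def receiver_accepts(sigma, s, r, c):
    """Reveal phase: the receiver recomputes the commit message from (sigma, s)."""
    return commit(sigma, s, r) == c


def open_as(sigma, r, c):
    """An opening to sigma computed WITH knowledge of r.

    Both permutations are involutions, so each is its own inverse and
    s_i = pi_{r_i}(c_i - sigma mod 3) solves the commit equation.
    """
    return [PI[ri][(ci - sigma) % 3] for ri, ci in zip(r, c)]


def n1_double_openings():
    """For n = 1: all (s0, s1) with pi_r(s0) = pi_r(s1) + 1 (mod 3) for both r."""
    return [
        (s0, s1)
        for s0, s1 in itertools.product(range(3), repeat=2)
        if all(PI[r][s0] == (PI[r][s1] + 1) % 3 for r in (0, 1))
    ]
```

In this sketch `open_as` succeeds for either bit only because it is given `r`; `n1_double_openings()` searches for openings fixed in advance, independently of `r`, which is the situation of the second sender.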
It follows that for each $r \in \{0,1\}$, both $\pi_r(s^0)$ and $\pi_r(s^1) + 1 \bmod 3$ must fit the message received by the receiver (in the commit phase) in response to message $r$ sent by it. Hence, $\pi_r(s^0) \equiv \pi_r(s^1) + 1 \pmod 3$ holds, for each $r \in \{0,1\}$. Contradiction follows since no two $s^0, s^1 \in \{0,1,2\}$ can satisfy both $\pi_0(s^0) \equiv \pi_0(s^1) + 1 \pmod 3$ and $\pi_1(s^0) \equiv \pi_1(s^1) + 1 \pmod 3$. (The reason is that the first equality implies $s^0 \equiv s^1 + 1 \pmod 3$, which combined with the second equality yields $\pi_1(s^1 + 1 \bmod 3) \equiv \pi_1(s^1) + 1 \pmod 3$, whereas for every $s \in \{0,1,2\}$ it holds that $\pi_1(s + 1 \bmod 3) \not\equiv \pi_1(s) + 1 \pmod 3$.)

We now turn to the actual proof of the unambiguity property. We first observe that if there exists a program $S_1^*$ so that the receiver's interaction with $S_1^*$ is ambiguous, then there exists also such a deterministic program. Actually, the program is merely a function, denoted $f$, mapping $n$-bit long strings into sequences in $\{0,1,2\}^n$. Likewise, the (0-opening and 1-opening) strategies for the second sender can be assumed, without loss of generality, to be deterministic. Consequently, both strategies consist of constant sequences, denoted $s^0$ and $s^1$, and both can be assumed (with no loss of generality) to be in $\{0,1,2\}^n$. For each $\sigma \in \{0,1\}$, let $p_\sigma$ denote the probability that the sequence $s^\sigma$ is a possible $\sigma$-opening of the receiver's view $(U_n, f(U_n))$, where $U_n$ denotes a random variable uniformly distributed over $\{0,1\}^n$. The contradiction hypothesis implies that $p_0 + p_1 > 1 + 2^{-n}$. Put in other words, $|R^0| + |R^1| \ge 2^n + 2$, where $R^\sigma$ denotes the set of all strings $r \in \{0,1\}^n$ for which the sequence $s^\sigma$ is a possible $\sigma$-opening of the receiver's view $(r, f(r))$. Namely,

$$R^\sigma = \{ r : (\forall i)\ f_i(r) \equiv \pi_{r_i}(s^\sigma_i) + \sigma \pmod 3 \}$$

where $r = r_1 \cdots r_n$, $s^\sigma = s^\sigma_1 \cdots s^\sigma_n$, and $f(r) = f_1(r) \cdots f_n(r)$. We are going to refute the contradiction hypothesis by showing that the intersection of the sets $R^0$ and $R^1$ cannot contain more than a single element. (Since $|R^0 \cup R^1| \le 2^n$, this yields $|R^0| + |R^1| \le 2^n + 1$.)

Claim 6.74.1: Let $R^0$ and $R^1$ be as defined above.
Then $|R^0 \cap R^1| \le 1$.

proof: Suppose, on the contrary, that $\alpha, \beta \in R^0 \cap R^1$ (and $\alpha \ne \beta$). Then, there exists an $i$ such that $\alpha_i \ne \beta_i$, and without loss of generality $\alpha_i = 0$ (and $\beta_i = 1$). By the definition of $R^\sigma$ it follows that

$$f_i(\alpha) \equiv \pi_0(s^0_i) \pmod 3$$
$$f_i(\beta) \equiv \pi_1(s^0_i) \pmod 3$$
$$f_i(\alpha) \equiv \pi_0(s^1_i) + 1 \pmod 3$$
$$f_i(\beta) \equiv \pi_1(s^1_i) + 1 \pmod 3$$

Contradiction follows as in the motivating discussion. $\Box$

This completes the proof of the proposition.

We remark that Claim 6.74.1 actually yields the strong unambiguity condition (presented in the discussion following Definition 6.72). More importantly, we remark that the proof extends easily to the case in which many instances of the protocol are executed in parallel; namely, the parallel protocol constitutes a two-sender multi-value (i.e., string) commitment scheme.

Author's Note: The last remark should be elaborated significantly. In addition, it should be stressed that the claim holds also when the second sender is asked to reveal only some of the commitments, as long as this request is independent of the coin tosses used by the receiver during the commit phase.

6.11.3 Perfect Zero-Knowledge for NP

Two-prover perfect zero-knowledge proof systems for any language in NP follow easily by modifying Construction 6.25. The modification consists of replacing the bit commitment scheme, used in Construction 6.25, by the two-sender bit commitment scheme of Construction 6.73. Specifically, the modified proof system for Graph Coloring proceeds as follows.

Two-prover atomic proof of Graph Coloring

The first prover uses the prover's random tape to determine a permutation of the coloring. In order to commit to each of the resulting colors, the first prover invokes (the commit phase of) a two-sender bit commitment, setting the security parameter to be the number of vertices in the graph. (The first prover plays the role of the first sender whereas the verifier plays the role of the receiver.)
The verifier uniformly selects an edge and sends it to the second prover.

The second prover reveals the colors of the endpoints of the required edge, by sending the portions of the prover's random-tape used in the corresponding instances of the commit phase.

We now remark on the properties of the above protocol. As usual, one can see that the provers can always convince the verifier of valid claims (i.e., the completeness condition holds). Using the unambiguity property of the two-sender commitment scheme we can think of the first prover as selecting at random, with arbitrary probability distribution, a color assignment to the vertices of the graph. We stress that this claim holds although many instances of the commit protocol are performed concurrently (see remark above). If the graph is not 3-colorable then each of the possible color assignments chosen by the first prover is illegal, and a weak soundness property follows. Yet, by executing the above protocol polynomially many times, even in parallel, we derive a protocol satisfying the soundness requirement. We stress that the fact that parallelism is effective here (as a means for decreasing the error probability) follows from the unambiguity property of the two-sender commitment scheme and not from a general "parallel composition lemma" (which is not valid in the two-prover setting).

Author's Note: The last sentence refers to a false claim by which the error probability of a protocol in which a basic protocol is repeated $t$ times in parallel is at most $p^t$, where $p$ is the error probability of the basic protocol. Interestingly, Ran Raz has recently proven a general "parallel composition lemma" of slightly weaker form: the error probability indeed decreases exponentially in $t$ (but the base is indeed bigger than $p$).

We now turn to the zero-knowledge aspects of the above protocol. It turns out that this part is much easier to handle than in all previous cases we have seen.
In the construction of the simulator we take advantage of the fact that it is playing the role of both provers, and hence the unambiguity of the commitment scheme does not apply. Specifically, the simulator, playing the role of both senders, can easily open each commitment any way it wants. (Here we take advantage of the specific structure of the commitment scheme of Construction 6.73.) Details follow.

Simulation of the atomic proof of Graph Coloring

The simulator generates random "commitments to nothing". Namely, the simulator invokes the verifier and answers its messages by uniformly chosen strings.

Upon receiving the query edge $(u, v)$ from the verifier, the simulator uniformly selects two different colors, one for $u$ and one for $v$, and opens the corresponding commitments so as to reveal these values. The simulator has no difficulty in doing so since, unlike the second prover, it knows the messages sent by the verifier in the commit phase. (Given the receiver's view, $(r_1 \cdots r_n, c_1 \cdots c_n)$, of the commit phase, a 0-opening is computed by setting $s_i = \pi_{r_i}^{-1}(c_i)$ whereas a 1-opening is computed by setting $s_i = \pi_{r_i}^{-1}(c_i - 1)$, for all $i$.)

We now remark that the entire argument extends trivially to the case in which polynomially many instances of the protocol are performed concurrently.

Efficiency improvement

A dramatic improvement in the efficiency of two-prover (perfect) zero-knowledge proofs for NP can be obtained by using the techniques described in Section [missing(eff-pcp.sec)]. In particular, such a proof system with constant error probability can be implemented in probabilistic polynomial-time, so that the number of bits exchanged in the interaction is logarithmic. Furthermore, the verifier is only required to use logarithmically many coin tosses. The error can be reduced to $2^{-k}$ by repeating the protocol sequentially $k$ times. In particular, negligible error probability is achieved in polylogarithmic communication complexity. We stress again that error reduction via parallel repetitions is not known to work in general, and in particular is not known to work in this specific case.

Author's Note: Again, the last statement is out of date and recent results do allow one to reduce the error probability without increasing the number of rounds.

6.11.4 Applications

Multi-prover interactive proofs are useful only in settings in which the "proving entity" can be separated and its parts kept ignorant of one another during the proving process. In such cases we get perfect zero-knowledge proofs without having to rely on complexity theoretic assumptions. In other words, general widely believed mathematical assumptions are replaced by physical assumptions concerning the specific setting.

A natural application is to the problem of identification, and specifically the identification of a user at some station. In Section 6.7 we discuss how to reduce identification to a zero-knowledge proof of knowledge (for some NP relation). The idea is to supply each user with two smart-cards, implementing the two provers in a two-prover zero-knowledge proof of knowledge. These two smart-cards have to be inserted in two different slots of the station, and this guarantees that the smart-cards cannot communicate with one another. The station will play the role of the verifier in the zero-knowledge proof of knowledge. This way the station is protected against impersonation, whereas the users are protected against pirate stations which may try to extract knowledge from the smart-cards (so as to enable impersonation by their agents).

6.12 Miscellaneous

6.12.1 Historical Notes

Interactive proof systems were introduced by Goldwasser, Micali and Rackoff [GMR85]. (Earlier versions of this paper date to early 1983. Yet, the paper, being rejected three times from major conferences, first appeared in public only in 1985, concurrently with the paper of Babai [B85].)
A restricted form of interactive proofs, known by the name Arthur-Merlin Games, was introduced by Babai [B85]. (The restricted form turned out to be equivalent in power; see Section [missing(eff-ip.sec)].) The interactive proof for Graph Non-Isomorphism is due to Goldreich, Micali and Wigderson [GMW86].

The concept of zero-knowledge was introduced by Goldwasser, Micali and Rackoff, in the same paper quoted above [GMR85]. Their paper also contained a perfect zero-knowledge proof for Quadratic Non-Residuosity. The perfect zero-knowledge proof system for Graph Isomorphism is due to Goldreich, Micali and Wigderson [GMW86]. The latter paper is also the source of the zero-knowledge proof systems for all languages in NP, using any (nonuniformly) one-way function. (Brassard and Crépeau later constructed alternative zero-knowledge proof systems for NP, using a stronger intractability assumption, specifically the intractability of the Quadratic Residuosity Problem.)

The cryptographic applications of zero-knowledge proofs were the very motivation for their presentation in [GMR85]. Zero-knowledge proofs were applied to solve cryptographic problems in [FMRW85] and [CF85]. However, many more applications became possible once it was shown how to construct zero-knowledge proof systems for every language in NP. In particular, general methodologies for the construction of cryptographic protocols have appeared in [GMW86, GMW87].

Credits for the advanced sections

The results providing upper bounds on the complexity of languages with perfect zero-knowledge proofs (i.e., Theorem 6.36) are from Fortnow [For87] and Aiello and Håstad [AH87]. The results indicating that one-way functions are necessary for non-trivial zero-knowledge are from Ostrovsky and Wigderson [OWistcs93]. The negative results concerning parallel composition of zero-knowledge proof systems (i.e., Proposition 6.37 and Theorem 6.39) are from [GKr89b].
The notions of witness indistinguishability and witness hiding were introduced and developed by Feige and Shamir [FSwitness].

Author's Note: FSwitness has appeared in STOC90.

The concept of proofs of knowledge originates from the paper of Goldwasser, Micali and Rackoff [GMR85]. First attempts to provide a definition of this concept appear in Feige, Fiat and Shamir [FFS87] and Tompa and Woll [TW87]. However, the definitions provided in both [FFS87, TW87] are not satisfactory. The issue of defining proofs of knowledge has been extensively investigated by Bellare and Goldreich [BGknow], and we follow their suggestions. The application of zero-knowledge proofs of knowledge to identification schemes was discovered by Feige, Fiat and Shamir [FFS87].

Computationally sound proof systems (i.e., arguments) were introduced by Brassard, Chaum, and Crépeau [BCC87]. Their paper also presents perfect zero-knowledge arguments for NP based on the intractability of factoring. Naor et al. [NOVY92] showed how to construct perfect zero-knowledge arguments for NP based on any one-way permutation, and Construction 6.58 is taken from their paper. The polylogarithmic-communication argument system for NP (of Subsection 6.8.4) is due to Kilian [K92].

Author's Note: NOVY92 has appeared in Crypto92, and K92 in STOC92.

Author's Note: Micali's model of CS-proofs was intended for the missing chapter on complexity theory.

The round-efficient zero-knowledge proof system for NP, based on any claw-free collection, is taken from Goldreich and Kahan [GKa89]. The round-efficient zero-knowledge arguments for NP, based on any one-way function, use ideas of Feige and Shamir [FSconst] (yet, their original construction is different).

Author's Note: NIZK credits: BFM and others

Multi-prover interactive proofs were introduced by Ben-Or, Goldwasser, Kilian and Wigderson [BGKW88]. Their paper also presents a perfect zero-knowledge two-prover proof system for NP.
The perfect zero-knowledge two-prover proof for NP, presented in Section 6.11, follows their ideas but explicitly states the properties of the two-sender commitment scheme in use. Consequently, we observe that this proof system can be applied in parallel to decrease the error probability to a negligible one.

Author's Note: This observation escaped Feige, Lapidot and Shamir.

6.12.2 Suggestion for Further Reading

For further details on interactive proof systems see Section [missing(eff-ip.sec)].

A uniform-complexity treatment of zero-knowledge was given by Goldreich [Guniform]. In particular, it is shown how to use (uniformly) one-way functions to construct interactive proof systems for NP so that it is infeasible to find instances on which the prover leaks knowledge.

Zero-knowledge proof systems for any language in IP, based on (nonuniformly) one-way functions, were constructed by Impagliazzo and Yung [IY87] (yet, their paper contains no details). An alternative construction is presented by Ben-Or et al. [Betal88].

Further reading related to the advanced sections

Additional negative results concerning zero-knowledge proofs of restricted types appear in Goldreich and Oren [GO87]. The interested reader is also directed to Boppana, Håstad and Zachos [BHZ87] for a proof that if every language in coNP has a constant-round interactive proof system then the Polynomial-Time Hierarchy collapses to its second level.

Round-efficient perfect zero-knowledge arguments for NP, based on the intractability of the Discrete Logarithm Problem, appear in a paper by Brassard, Crépeau and Yung [BCY]. A round-efficient perfect zero-knowledge proof system for Graph Isomorphism appears in a paper by Bellare, Micali and Ostrovsky [BMO89].

Author's Note: NIZK suggestions

An extremely efficient perfect zero-knowledge two-prover proof system for NP appears in a paper by Dwork et al. [DFKNS].
Specifically, only logarithmic randomness and communication complexities are required to get a constant error probability. This result uses the characterization of NP in terms of low complexity multi-prover interactive proof systems, which is further discussed in Section [missing(eff-pcp.sec)].

The paper by Goldwasser, Micali and Rackoff [GMR85] also contains a suggestion for a general measure of "knowledge" revealed by a prover, of which zero-knowledge is merely a special case. For further details see Goldreich and Petrank [GPkc].

Author's Note: GPkc has appeared in FOCS91. See also a recent work by Goldreich, Ostrovsky and Petrank in STOC94.

Author's Note: The discussion of knowledge complexity is better fit into the missing chapter on complexity.

6.12.3 Open Problems

Our formulation of zero-knowledge (e.g., perfect zero-knowledge as defined in Definition 6.11) is different from the standard definition used in the literature (e.g., Definition 6.15). The standard definition refers to expected polynomial-time machines rather than to strictly (probabilistic) polynomial-time machines. Clearly, Definition 6.11 implies Definition 6.15 (see Exercise 8), but it is open whether the converse holds.

Author's Note: Base NIZK and arguments on (more) general assumptions.

6.12.4 Exercises

Exercise 1: decreasing the error probability in interactive proofs: Prove Proposition 6.7.
(Guideline: Execute the weaker interactive proof sufficiently many times, using independently chosen coin tosses for each execution, and rule by an appropriate threshold. Observe that the bounds on completeness and soundness need to be efficiently computable. Be careful when demonstrating the soundness of the resulting verifier. The statement remains valid regardless of whether these repetitions are executed sequentially or "in parallel", yet demonstrating that the soundness condition is satisfied is much easier in the first case.)
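For intuition only (this is not a proof of Proposition 6.7), the effect of ruling by a threshold can be tabulated numerically. The sketch below is an illustration under stated assumptions: the function names `binomial_tail_at_least` and `amplified_errors` are hypothetical, the basic proof is assumed to have completeness bound 2/3 and soundness bound 1/3, the number of repetitions $k$ is taken to be odd, and the threshold is a simple majority. Each execution is modeled as an independent coin with the stated acceptance probability; for sequential repetitions against a cheating prover this can be argued to give an upper bound, since each execution accepts with probability at most the soundness bound regardless of the history.

```python
from math import comb


def binomial_tail_at_least(k, p, t):
    """Pr[Binomial(k, p) >= t], computed exactly from the definition."""
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(t, k + 1))


def amplified_errors(k, completeness=2 / 3, soundness=1 / 3):
    """Error bounds after k executions, accepting iff a strict majority accept.

    Returns (completeness_error, soundness_error) under the independence
    model described in the lead-in; the default bounds are assumptions.
    """
    threshold = k // 2 + 1
    completeness_error = 1 - binomial_tail_at_least(k, completeness, threshold)
    soundness_error = binomial_tail_at_least(k, soundness, threshold)
    return completeness_error, soundness_error
```

For example, `amplified_errors(1)` reproduces the original error of 1/3 on both sides, and increasing `k` drives both values toward zero at an exponential rate, consistent with the Hoeffding bound $e^{-k/18}$ for a gap of 1/6 around the threshold.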
Exercise 2: the role of randomization in interactive proofs - part 1: Prove that if $L$ has an interactive proof system in which the verifier is deterministic then $L \in$ NP.
(Guideline: Note that if the verifier is deterministic then the entire interaction between the prover and the verifier is determined by the prover. Hence, a modified prover can just precompute the interaction and send it to the modified verifier as the only message. The modified verifier checks that the interaction is consistent with the messages that the original verifier would have sent.)

Exercise 3: the role of randomization in interactive proofs - part 2: Prove that if $L$ has an interactive proof system then it has one in which the prover is deterministic. Furthermore, prove that for every (probabilistic) interactive machine $V$ there exists a deterministic interactive machine $P$ so that for every $x$ the probability $\mathrm{Prob}(\langle P, V\rangle(x) = 1)$ equals the supremum of $\mathrm{Prob}(\langle B, V\rangle(x) = 1)$ taken over all interactive machines $B$.
(Guideline: for each possible prefix of interaction, the prover can determine a message which maximizes the accepting probability of the verifier $V$.)

Exercise 4: the role of randomization in interactive proofs - part 3: Consider a modification, to the definition of an interactive machine, in which the random-tapes of the prover and verifier coincide (i.e., intuitively, both use the same sequence of coin tosses which is known to both of them). Prove that every language having such a modified interactive proof system has also an interactive proof system (of the original kind) in which the prover sends a single message.

Exercise 5: the role of error in interactive proofs: Prove that if $L$ has an interactive proof system in which the verifier never (not even with negligible probability) accepts a string not in the language $L$ then $L \in$ NP.
(Guideline: Define a relation $R_L$ such that $(x, y) \in R_L$ if $y$ is a full transcript of an interaction leading the verifier to accept the input $x$.
We stress that $y$ contains the verifier's coin tosses and all the messages received from the prover.)

Exercise 6: error in perfect zero-knowledge simulators - part 1: Consider modifications of Definition 6.11 in which condition 1 is replaced by requiring, for some function $q(\cdot)$, that $\mathrm{Prob}(M(x) = \bot) < q(|x|)$. Assume that $q(\cdot)$ is polynomial-time computable. Show that if for some polynomials, $p_1(\cdot)$ and $p_2(\cdot)$, and all sufficiently large $n$'s, $q(n) > 1/p_1(n)$ and $q(n) < 1 - 2^{-p_2(n)}$ then the modified definition is equivalent to the original one. Justify the bounds placed on the function $q(\cdot)$. (Guideline: the idea is to repeatedly execute the simulator sufficiently many times.)

Exercise 7: error in perfect zero-knowledge simulators - part 2: Consider the following alternative to Definition 6.11, by which we say that $(P, V)$ is perfect zero-knowledge if for every probabilistic polynomial-time interactive machine $V^*$ there exists a probabilistic polynomial-time algorithm $M^*$ so that the following two ensembles are statistically close (i.e., their statistical difference is negligible as a function of $|x|$):

$\{\langle P, V^* \rangle(x)\}_{x \in L}$

$\{M^*(x)\}_{x \in L}$

Prove that Definition 6.11 implies the new definition.

Exercise 8: (E) error in perfect zero-knowledge simulators - part 3: Prove that Definition 6.11 implies Definition 6.15.

Exercise 9: error in computational zero-knowledge simulators: Consider an alternative to Definition 6.12, by which the simulator is allowed to output the symbol $\bot$ (with probability bounded above by, say, $\frac{1}{2}$) and its output distribution is considered conditioned on its not being $\bot$ (as done in Definition 6.11). Prove that this alternative definition is equivalent to the original one (i.e., to Definition 6.12).

Exercise 10: alternative formulation of zero-knowledge - simulating the interaction: Prove the equivalence of Definitions 6.12 and 6.13.
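The guideline of Exercise 6 (repeatedly execute the simulator) can be illustrated with a small Python sketch. It is not the text's construction: `toy_simulator` is a hypothetical stand-in that outputs `None` (playing the role of the failure symbol) with probability `q`, and the sketch assumes that a single run succeeds with probability at least an inverse polynomial, so that polynomially many repetitions suffice. All names are illustrative.

```python
import random
from fractions import Fraction

BOT = None  # stands for the failure symbol output by the simulator

def amplify(simulator, x, k, rng):
    """Run `simulator` up to k times with fresh randomness and return the
    first output different from BOT; return BOT only if all k runs fail."""
    for _ in range(k):
        out = simulator(x, rng)
        if out is not BOT:
            return out
    return BOT

def failure_after(q: Fraction, k: int) -> Fraction:
    """Exact failure probability of the k-fold repetition when a single
    run fails with probability q, independently across runs."""
    return q ** k

def repetitions_needed(q: Fraction, target: Fraction) -> int:
    """Smallest k with q**k <= target, for 0 < q < 1 and target > 0."""
    k, fail = 1, q
    while fail > target:
        k, fail = k + 1, fail * q
    return k

def toy_simulator(x, rng, q=0.9):
    """Hypothetical simulator: fails with probability q, otherwise outputs
    a uniformly random bit (a placeholder for a simulated view)."""
    if rng.random() < q:
        return BOT
    return rng.randrange(2)

if __name__ == "__main__":
    # Runs needed to push a failure probability of 9/10 down to at most 1/2.
    print(repetitions_needed(Fraction(9, 10), Fraction(1, 2)))
    print(amplify(toy_simulator, "x", 50, random.Random(0)))
```

Because the runs are independent and the first non-failing output is returned, the output distribution conditioned on not failing is the same as for a single run; only the failure probability changes, from q to q^k.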
Exercise 11: Present a simple probabilistic polynomial-time algorithm which simulates the view of the interaction of the verifier described in Construction 6.16 with the prover defined there. The simulator, on input $x \in GI$, should have output which is distributed identically to $\mathrm{view}^{P_{GI}}_{V_{GI}}(x)$.

Exercise 12: Prove that the existence of bit commitment schemes implies the existence of one-way functions. (Guideline: following the notations of Definition 6.20, consider the mapping of $(v, s, r)$ to the receiver's view $(r, m)$. Observe that by the unambiguity requirement range elements are very unlikely to have inverses with both possible values of $v$. The mapping is polynomial-time computable and an algorithm that inverts it, even with success probability that is not negligible, can be used to contradict the secrecy requirement.)

Exercise 13: Considering the commitment scheme of Construction 6.23, suggest a cheating sender that induces a receiver-view (of the commit phase) being both (a) indistinguishable from the receiver-view in interactions with the prescribed sender, and (b) with very high probability, neither a possible 0-commitment nor a possible 1-commitment. (Hint: the sender just replies with a uniformly chosen string.)

Exercise 14: using Construction 6.23 as a commitment scheme in Construction 6.25: Prove that when the commitment scheme of Construction 6.23 is used in the G3C protocol then the resulting scheme remains zero-knowledge. Consider the modifications required to prove Claim 6.26.2.

Exercise 15: more efficient zero-knowledge proofs for NP: Following is an outline for a constant-round zero-knowledge proof for the Hamiltonian Circuit Problem (HCP), with acceptance gap $\frac{1}{2}$ (between inputs inside and outside of the language).
Common Input: a graph $G = (V, E)$.

Auxiliary Input (to the prover): a permutation $\psi$, over $V$, representing the order of vertices along a Hamiltonian Circuit.

Prover's first step: Generates a random isomorphic copy of $G$, denoted $G' = (V, E')$. (Let $\pi$ denote the permutation between $G$ and $G'$.) For each pair $(i, j) \in V^2$, the prover sets $e_{i,j} = 1$ if $(i, j) \in E'$ and $e_{i,j} = 0$ otherwise. The prover computes a random commitment to each $e_{i,j}$. Namely, it uniformly chooses $s_{i,j} \in \{0,1\}^n$ and computes $c_{i,j} = C_{s_{i,j}}(e_{i,j})$. The prover sends all the $c_{i,j}$'s to the verifier.

Verifier's first step: Uniformly selects $\sigma \in \{0,1\}$ and sends it to the prover.

Prover's second step: Let $\sigma$ be the message received from the verifier. If $\sigma = 1$ then the prover reveals all the $|V|^2$ commitments to the verifier (by revealing all $s_{i,j}$'s), and also sends along the permutation $\pi$. If $\sigma = 0$ then the prover reveals only $|V|$ commitments to the verifier, specifically those corresponding to the Hamiltonian circuit in $G'$ (i.e., the prover sends $s_{\pi(\psi(1)),\pi(\psi(2))}$, $s_{\pi(\psi(2)),\pi(\psi(3))}$, ..., $s_{\pi(\psi(n-1)),\pi(\psi(n))}$, $s_{\pi(\psi(n)),\pi(\psi(1))}$).

Complete the description of the above interactive proof, evaluate its acceptance probabilities, and provide a sketch of the proof of the zero-knowledge property (i.e., describe the simulator). If you are really serious, provide a full proof of the zero-knowledge property.

Exercise 16: strong reductions: Let $L_1$ and $L_2$ be two languages in NP, and let $R_1$ and $R_2$ be binary relations characterizing $L_1$ and $L_2$, respectively. We say that the relation $R_1$ is Levin-reducible to the relation $R_2$ if there exist two polynomial-time computable functions $f$ and $g$ such that the following two conditions hold.

1. standard requirement: $x \in L_1$ if and only if $f(x) \in L_2$.
2. additional requirement: For every $(x, w) \in R_1$, it holds that $(f(x), g(w)) \in R_2$.
We call the above reduction after Levin, who upon discovering, independently of Cook and Karp, the existence of NP-complete problems, made a stronger definition of a reduction which implies the above. Prove the following statements:

1. Let $L \in NP$ and let $R_L$ be the generic relation characterizing $L$ (i.e., fix a non-deterministic machine $M_L$ and let $(x, w) \in R_L$ if $w$ is an accepting computation of $M_L$ on input $x$). Let $R_{SAT}$ be the standard relation characterizing SAT (i.e., $(x, w) \in R_{SAT}$ if $w$ is a truth assignment satisfying the CNF formula $x$). Prove that $R_L$ is Levin-reducible to $R_{SAT}$.
2. Let $R_{SAT}$ be as above, and let $R_{3SAT}$ be defined analogously for 3SAT. Prove that $R_{SAT}$ is Levin-reducible to $R_{3SAT}$.
3. Let $R_{3SAT}$ be as above, and let $R_{G3C}$ be the standard relation characterizing G3C (i.e., $(x, w) \in R_{G3C}$ if $w$ is a 3-coloring of the graph $x$). Prove that $R_{3SAT}$ is Levin-reducible to $R_{G3C}$.
4. Levin-reductions are transitive.

Exercise 17: Prove the existence of a Karp-reduction of $L$ to SAT that, when considered as a function, can be inverted in polynomial-time. Same for the reduction of SAT to 3SAT and the reduction of 3SAT to G3C. (In fact, the standard Karp-reductions have this property.)

Exercise 18: applications of Theorem 6.29: Assuming the existence of non-uniformly one-way functions, present solutions to the following cryptographic problems:

1. Suppose that party $R$ received over a public channel a message encrypted using its own public-key encryption. Suppose that the message consists of two parts and party $R$ wishes to reveal to everybody the first part of the message but not the second. Further suppose that the other parties want a proof that $R$ indeed revealed the correct contents of the first part of its message.
2. Suppose that party $S$ wishes to send party $R$ a signature to a publicly known document so that only $R$ gets the signature but everybody else can verify that such a signature was indeed sent by $S$.
(We assume that all parties share a public channel.)
3. Suppose that party $S$ wishes to send party $R$ a commitment to a partially specified statement so that $R$ remains oblivious of the unspecified part. For example, $S$ may wish to commit itself to some standard offer while keeping the amount offered secret.

Exercise 19: on knowledge tightness: Evaluate the knowledge tightness of Construction 6.25, when applied logarithmically many times in parallel.

Exercise 20: error reduction in computationally sound proofs - part 1: Given a computationally sound proof (with error probability $\frac{1}{3}$) for a language $L$, construct a computationally sound proof with negligible error probability (for $L$).

Exercise 21: error reduction in computationally sound proofs - part 2: Construct a computationally sound proof that has negligible error probability (i.e., smaller than $1/p(|x|)$ for every polynomial $p(\cdot)$ and sufficiently long inputs $x$) but when repeated sequentially $|x|$ times has error probability greater than $2^{-|x|}$. We refer to the error probability in the (computational) soundness condition.

Exercise 22: commitment schemes - an impossibility result: Prove that there exists no two-party protocol which simultaneously satisfies the perfect secrecy requirement of Definition 6.57 and the (information theoretic) unambiguity requirement of Definition 6.20.

Exercise 23: alternative formulation of black-box zero-knowledge: We say that a probabilistic polynomial-time oracle machine $M$ is a black-box simulator for the prover $P$ and the language $L$ if for every (not necessarily uniform) polynomial-size circuit family $\{B_n\}_{n \in \mathbb{N}}$, the ensembles $\{\langle P, B_{|x|} \rangle(x)\}_{x \in L}$ and $\{M^{B_{|x|}}(x)\}_{x \in L}$ are indistinguishable by (non-uniform) polynomial-size circuits. Namely, for every polynomial-size circuit family $\{D_n\}_{n \in \mathbb{N}}$, every polynomial $p(\cdot)$, all sufficiently large $n$ and $x \in \{0,1\}^n \cap L$,

$|\mathrm{Prob}(D_n(\langle P, B_n \rangle(x)) = 1) - \mathrm{Prob}(D_n(M^{B_n}(x)) = 1)| < \frac{1}{p(n)}$

Prove that the current formulation is equivalent to the one presented in Definition 6.38.
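To make the quantity bounded in Exercise 23 concrete, the following Python sketch computes the distinguishing gap between two finite distributions exactly. Everything in it is a toy assumption made for illustration: the distributions are given explicitly as dictionaries, and the distinguisher is an arbitrary Python predicate rather than a polynomial-size circuit. The exercise requires the gap to be below $1/p(n)$ for every circuit family; the sketch only evaluates it for one fixed distinguisher.

```python
from fractions import Fraction

def accept_prob(distinguisher, dist):
    """Probability that `distinguisher` outputs 1 on a sample from `dist`,
    where `dist` maps each outcome to its exact probability."""
    return sum((p for outcome, p in dist.items() if distinguisher(outcome) == 1),
               Fraction(0))

def gap(distinguisher, dist_x, dist_y):
    """Distinguishing gap |Prob(D(X)=1) - Prob(D(Y)=1)|."""
    return abs(accept_prob(distinguisher, dist_x) - accept_prob(distinguisher, dist_y))

def statistical_difference(dist_x, dist_y):
    """Half the L1 distance between the two distributions.
    No distinguisher can achieve a gap larger than this value."""
    support = set(dist_x) | set(dist_y)
    total = sum((abs(dist_x.get(a, Fraction(0)) - dist_y.get(a, Fraction(0)))
                 for a in support), Fraction(0))
    return total / 2

if __name__ == "__main__":
    uniform = {b: Fraction(1, 4) for b in ("00", "01", "10", "11")}
    biased = {"00": Fraction(1, 2), "01": Fraction(1, 4),
              "10": Fraction(1, 8), "11": Fraction(1, 8)}
    first_bit_is_zero = lambda s: 1 if s[0] == "0" else 0
    print(gap(first_bit_is_zero, uniform, biased))
    print(statistical_difference(uniform, biased))
```

Exact rational arithmetic is used so that gaps can be compared without rounding issues; for distributions defined by sampling algorithms one would have to estimate the two acceptance probabilities instead.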
Exercise 24: Prove that the protocol presented in Construction 6.25 is indeed a black-box zero-knowledge proof system for G3C. (Guideline: use the formulation presented above.)

Chapter 7 Cryptographic Protocols

Author's Note: This chapter is a serious obstacle to any future attempt of completing this book.

%Plan
\input{pt-motiv}%  Motivation (Examples: voting, OT)
\input{pt-def}%%%  Definition (of a protocol problem)
%................  (2 and more parties, w/without ``fairness'')
\input{pt-two}%%%  Construction of two-party protocols
\input{pt-many}%%  Construction of multi-party protocols
%................  in the private-channel model.
%................  Adapt to the ``computational model'' (no private channels)
\input{pt-misc}%%  As usual: History, Reading, Open, Exercises

Chapter 8 * New Frontiers

Where is the area going? That's always hard to predict, but following are some recent and not so recent developments.

%Plan
\input{fr-eff}%%%  more stress on efficiency (from a theory perspective!)
\input{fr-sys}%%%  "System Problems" (key-mgmt, replay, etc.)
\input{fr-dyn}%%%  Dynamic adversaries (in multi-party protocols)
\input{fr-incr}%%  Incremental Cryptography [BGG]
\input{fr-traf}%%  Traffic Analysis [RS]
\input{fr-soft}%%  Software Protection [G,O] (that's not really new...)

Chapter 9 The Effect of Cryptography on Complexity Theory

Cryptography had a fundamental effect on the development of complexity theory. Notions such as computational indistinguishability, pseudorandomness (in the sense discussed in previous chapters), interactive proofs and random self-reducibility were first introduced and developed with a cryptographic motivation. However, these notions turned out to influence the development of complexity theory as well, and were further developed within this broader theory.
In this chapter we survey some of these developments which have their roots in cryptography and yet provide results which are no longer (directly) relevant to cryptography.

%Plan
\input{eff-rand}%  Deterministic Simulation of Randomized Complexity Classes
%................  (simulations of random-AC0, BPP and RL)
\input{eff-ip}%%%  The power of Interactive Proofs (coNP subset IP=PSPACE)
\input{eff-pcp}%%  PCP and its applications to hardness of approximation
\input{eff-rsr}%%  Random Self-Reducibility (DLP/QR, Permanent)
\input{eff-lear}%  Learning
\input{eff-misc}%  (as usual)

Chapter 10 * Related Topics

In this chapter we survey several unrelated topics which are related to cryptography in some way. For example, a natural problem which arises in light of the excessive use of randomness is how to extract almost perfect randomness from sources of weak randomness.

%Plan
\input{tp-sour}%%  Weak sources of randomness
\input{tp-byz}%%%  Byzantine Agreement
\input{tp-check}%  Program Checking and Statistical Tests
\input{tp-misc}%%  As usual: History, Reading, Open, Exercises

Appendix A Annotated List of References (compiled Feb. 1989)

Author's Note: The following list of annotated references was compiled by me more than five years ago. The list was intended to serve as an appendix to class notes for my course on "Foundations of Cryptography" given at the Technion in the Spring of 1989. Thus, a few pointers to lectures given in the course appear in the list.

Author's Note: By the way, copies of the above-mentioned class notes, written mostly by graduate students attending my course, can be requested from the publication officer of the Computer Science Department of the Technion, Haifa, Israel. Although I have a very poor opinion of these notes, I was surprised to learn that they have been used by several people.
The only thing that I can say in favour of these notes is that they cover my entire (one-semester) course on "Foundations of Cryptography"; in particular, they contain material on encryption and signatures (which is most missing in the current fragments).

Preface

The list of references is partitioned into two parts: Main References and Suggestions for Further Reading. The Main References consist of the papers that I have extensively used during the course. Other papers which I mentioned briefly may be found in the list of Suggestions for Further Reading. This second list also contains papers, reporting further developments, which I have not mentioned at all.

Clearly, my suggestions for further reading do not exhaust all interesting works done in the area. Some good works were omitted on purpose (usually when totally superseded by others) and some were omitted by mistake. Also, no consistent policy was implemented in deciding which version of the work to cite. In most cases I used the reference which I had available on line (as updating all references would have taken too much time).

PART I : Main References

[BM88] Bellare, M., and S. Micali, "How to Sign Given any Trapdoor Function", Proc. 20th STOC, 1988.
Simplifies the construction used in [GMR84], using a weaker condition (i.e. the existence of trapdoor one-way permutations).
Readability: reasonable.

[BM82] Blum, M., and Micali, S., "How to Generate Cryptographically Strong Sequences of Pseudo-Random Bits", SIAM Jour. on Computing, Vol. 13, 1984, pp. 850-864. First version in FOCS 1982.
Presents a general method of constructing pseudorandom generators, and the first example of using it. Characterizes such generators as passing all (polynomial-time) prediction tests. Presents the notion of a "hard-core" predicate and the first proof of the existence of such a predicate based on the existence of a particular one-way function (i.e.
Discrete Logarithm Problem).
Readability: confusing in some places, but usually fine.

[GL89] Goldreich, O., and L.A. Levin, "A Hard-Core Predicate to any One-Way Function", 21st STOC, 1989, pp. 25-32.
Shows that any "padded" one-way function $f(x, p) = f_0(x)\,p$ has a simple hard-core bit, the inner-product mod-2 of $x$ and $p$.
Readability: STOC version is very elegant and laconic (Levin wrote it). These notes present a more detailed but cumbersome version.

[GMW86] Goldreich, O., S. Micali, and A. Wigderson, "Proofs that Yield Nothing But their Validity and a Methodology of Cryptographic Protocol Design", Proc. of 27th Symp. on Foundation of Computer Science, 1986, pp. 174-187. A full version appears as TR-544, Computer Science Dept., Technion, Haifa, Israel.
Demonstrates the generality and the wide applicability of zero-knowledge proofs. In particular, using any bit commitment scheme, it is shown how to construct a zero-knowledge proof for any language in NP. Perfect zero-knowledge proofs are presented for Graph Isomorphism and its complement.
Readability: the full version is very detailed, sometimes to a point of exhausting the reader. A more elegant proof of the main result is sketched in [G89a].

[GMW87] Goldreich, O., S. Micali, and A. Wigderson, "How to Play any Mental Game", 19th STOC, 1987. A more reasonable version is available from me.
Deals with the problem of cryptographic protocols in its full generality, showing how to automatically generate fault-tolerant protocols for computing any function (using any trapdoor one-way permutation).
Readability: STOC version is too hand-waving. These notes constitute a better source of information.

[GM82] Goldwasser, S., and S. Micali, "Probabilistic Encryption", JCSS, Vol. 28, No. 2, 1984, pp. 270-299. Previous version in STOC 1982.
Introduces the concept of polynomially indistinguishable probability distributions.
Presents notions of secure encryption, demonstrating the inadequacy of previous intuitions. Presents a general method for constructing such encryption schemes, and a first application of it. First use of the "hybrid" method.
Readability: Nice introduction. The technical part is somewhat messy.

[GMR85] Goldwasser, S., S. Micali, and C. Rackoff, "The Knowledge Complexity of Interactive Proof Systems", SIAM J. on Comput., Vol. 18, No. 1, 1989, pp. 186-208. Previous version in STOC 1985.
Introduces the concepts of an interactive proof and a zero-knowledge proof. Presents the first (non-trivial) example of a zero-knowledge proof. First application of zero-knowledge to the design of cryptographic protocols.
Readability: good.

[GMR84] Goldwasser, S., S. Micali, and R.L. Rivest, "A Digital Signature Scheme Secure Against Adaptive Chosen Message Attacks", SIAM J. on Comput., Vol. 17, No. 2, 1988, pp. 281-308. Previous version in FOCS 1984.
Surveys and investigates definitions of unforgeable signatures. Presents the first signature scheme which is unforgeable in a very strong sense even under a chosen message attack.
Readability: excellent as an introduction to the problem. Don't read the construction, but rather refer to [BM88].

[Y82] Yao, A.C., "Theory and Applications of Trapdoor Functions", Proc. of the 23rd IEEE Symp. on Foundation of Computer Science, 1982, pp. 80-91.
Presents a general definition of polynomially indistinguishable probability distributions. Characterizes pseudorandom generators as passing all (polynomial-time) statistical tests. (This formulation is equivalent to passing all polynomial-time prediction tests.) Given any one-way permutation constructs a pseudorandom generator.
Readability: Most interesting statements are not stated explicitly. Furthermore, contains no proofs.

PART II : Suggestions for Further Reading

My suggestions for further reading are grouped under the following categories:

1. General: Papers which deal with, or relate, several of the following categories.
2. Hard Computational Problems: Pointers to literature on seemingly hard computational problems (e.g. integer factorization) and to works relating different hardness criteria.
3. Encryption: Papers dealing with secure encryption schemes (in the strong sense defined in lectures 5B and 6).
4. Pseudorandomness: Papers dealing with the construction of pseudorandom generators, pseudorandom functions and permutations and their applications to cryptography and complexity theory.
5. Signatures and Commitment Schemes: Papers dealing with unforgeable signature schemes (as defined in lecture 10) and secure commitment schemes (mentioned in lecture 13).
6. Interactive Proofs, Zero-Knowledge and Protocols: In addition to papers with apparent relevance to cryptography this list also contains papers investigating the complexity theoretic aspects of interactive proofs and zero-knowledge.
7. Additional Topics: Pointers to works on software protection, computation with an untrusted oracle, protection against abuse of cryptographic systems, Byzantine Agreement, sources of randomness, and "cryptanalysis".
8. Historical Background: The current approach to Cryptography did not emerge "out of the blue". It originates in works that were not referenced in the previous categories (which include only material conforming with the definitions and concepts presented in the course). This category lists some of these pioneering works.

A.1 General

Much of the current research in cryptography focuses on reducing the existence of complex cryptographic primitives (such as the existence of unforgeable signature schemes) to simple complexity assumptions (such as the existence of one-way functions). A first work investigating the limitations of these reductions is [IR89], where a "gap" between tasks implying secret key exchange and tasks reducible to the existence of one-way functions is shown.
The gap is in the sense that a reduction of the first task to the second would imply $P \neq NP$.

Many of the more complex results in cryptography (e.g. the existence of zero-knowledge interactive proofs for all languages in NP) are stated and proved in terms of non-uniform complexity. As demonstrated throughout the course, this simplifies both the statements and their proofs. An attempt to treat secure encryption and zero-knowledge in uniform complexity measures is reported in [G89a]. In fact, the lectures on secure encryption are based on [G89a].

references

[G89a] Goldreich, O., "A Uniform-Complexity Treatment of Encryption and Zero-Knowledge", TR-568, Computer Science Dept., Technion, Haifa, Israel, 1989.
[IR89] Impagliazzo, R., and S. Rudich, "Limits on the Provable Consequences of One-Way Permutations", 21st STOC, pp. 44-61, 1989.

A.2 Hard Computational Problems

2.1. Candidates for One-Way functions

Hard computational problems are the basis of cryptography. The existence of adequately hard problems (see lecture 2) is not known. The most popular candidates are from computational number theory: integer factorization (see [P82] for a survey of the best algorithms known), discrete logarithms in finite fields (see [O84] for a survey of the best algorithms known), and the logarithm problem for "Elliptic groups" (cf. [M85]). Additional suggestions are the decoding problem for random linear codes (see [GKL88] and [BMT78]) and high density subset-sum ("knapsack") problems (see [CR88, IN89]). Note that low density subset-sum problems are usually easy (see survey [BO88]).

Much of the early-1980s research in cryptography used the intractability assumption of the Quadratic Residuosity Problem (introduced in [GM82]).
The nice structure of the problem was relied upon in constructions such as [LMR83], but in many cases further research led to getting rid of the need to rely on the special structure (and to using weaker intractability assumptions).

Attempts to base cryptography on computationally hard combinatorial problems have been less popular. Graph Isomorphism is very appealing (as it has a nice structure, like the Quadratic Residuosity Problem), but such a suggestion should not be taken seriously unless one specifies an easily samplable instance distribution for which the problem seems hard.

For details on candidates whose conjectured hardness was refuted see category 7.6.

2.2. Generic Hard Problems

The universal one-way function presented in lecture 3 originates from [L85]. The same ideas were used in [G88a] and [AABFH88], but the context there is of "average case complexity" (originated in [L84] and surveyed in [G88a]). In this context "hard" means intractable on infinitely many instance lengths, rather than intractable on all but finitely many instance lengths. Such problems are less useful in cryptography.

2.3. Hard-Core Predicates

As pointed out in lecture 4, hard-core predicates are a useful tool in cryptography. Such predicates are known to exist for exponentiation modulo a prime [BM82], (more generally) for "repeated addition" in any Abelian group [K88] and for the RSA and Rabin (squaring mod N) functions [ACGS84]. Recall that the general result of [GL89] (see lectures 4-5A) guarantees the existence of hard-core predicates for any "padded" function.

references

[AABFH88] Abadi, M., E. Allender, A. Broder, J. Feigenbaum, and L. Hemachandra, "On Generating Hard, Solved Instances of Computational Problem", Crypto88 proceedings.
[ACGS84] W. Alexi, B. Chor, O. Goldreich and C.P. Schnorr, "RSA and Rabin Functions: Certain Parts Are As Hard As the Whole", SIAM Jour. on Computing, Vol. 17, 1988, pp. 194-209. A preliminary version appeared in Proc. 25th FOCS, 1984, pp. 449-457.
[BM82] see main references.
[BMT78] Berlekamp, E.R., R.J. McEliece, and H.C.A. van Tilborg, "On the Inherent Intractability of Certain Coding Problems", IEEE Trans. on Inform. Theory, 1978.
[BO88] Brickell, E.F., and A.M. Odlyzko, "Cryptanalysis: A Survey of Recent Results", Proceedings of the IEEE, Vol. 76, pp. 578-593, 1988.
[CR88] Chor, B., and R.L. Rivest, "A Knapsack Type Public-Key Cryptosystem Based on Arithmetic in Finite Fields", IEEE Trans. on Inf. Th., Vol. 34, pp. 901-909, 1988.
[G88a] Goldreich, O., "Towards a Theory of Average Case Complexity (a survey)", TR-531, Computer Science Dept., Technion, Haifa, Israel, 1988.
[GKL88] see category 4.
[GL89] see main references.
[GM82] see main references.
[IN89] Impagliazzo, R., and M. Naor, "Efficient Cryptographic Schemes Provable as Secure as Subset Sum", manuscript, 1989.
[K88] B.S. Kaliski, Jr., "Elliptic Curves and Cryptography: A Pseudorandom Bit Generator and Other Tools", Ph.D. Thesis, LCS, MIT, 1988.
[L84] Levin, L.A., "Average Case Complete Problems", SIAM Jour. of Computing, 1986, Vol. 15, pp. 285-286. Extended abstract in 16th STOC, 1984.
[L85] see category 4.
[LW] D.L. Long and A. Wigderson, "How Discreet is Discrete Log?", Proc. 15th STOC, 1983, pp. 413-420. A better version ?
[LMR83] see category 6.
[M85] Miller, V.S., "Use of Elliptic Curves in Cryptography", Crypto85 - Proceedings, Lecture Notes in Computer Science, Vol. 218, Springer Verlag, 1985, pp. 417-426.
[O84] Odlyzko, A.M., "Discrete Logarithms in Finite Fields and their Cryptographic Significance", Eurocrypt84 proceedings, Springer-Verlag, Lecture Notes in Computer Science, Vol. 209, pp. 224-314, 1985. manuscript.
[P82] Pomerance, C., "Analysis and Comparison of some Integer Factorization Algorithms", Computational Methods in Number Theory: Part I, H.W. Lenstra Jr. and R. Tijdeman eds., Math. Center Amsterdam, 1982, pp. 89-139.
A.3 Encryption

The efficient construction of a secure public-key encryption scheme, presented in lecture 8, originates from [BG84]. The security of this scheme is based on the intractability assumption of factoring, while its efficiency is comparable with that of the RSA. More generally, the scheme can be based on any trapdoor one-way permutation.

Non-uniform versions of the two definitions of security (presented in lecture 6) were shown equivalent in [MRS88]. These versions were also shown equivalent to a third definition appearing in [Y82].

The robustness of encryption schemes against active adversaries was addressed in [GMT82]. Folklore states that secret communication can be achieved over a channel controlled by an active adversary by use of bi-directional communication: for every message transmission, the communicating parties exchange new authenticated cryptographic keys (i.e. the receiver transmits a new authenticated encryption-key that is used only for the current message). Note that this prevents a chosen message attack on the currently used instance of the encryption scheme. Note also that this suggestion does not constitute a public-key encryption scheme, but rather a secure means of private bi-directional communication. It was claimed that "non-interactive zero-knowledge proofs of knowledge" yield the construction of public-key encryption secure against chosen ciphertext attack [BFM88], but no proof of this claim has appeared.

references

[BFM88] see category 6.
[BG84] Blum, M., and S. Goldwasser, "An Efficient Probabilistic Public-Key Encryption Scheme which hides all partial information", Advances in Cryptology: Proc. of Crypto 84, ed. B Blakely, Springer Verlag Lecture Notes in Computer Science, vol. 196, pp. 289-302.
[GMT82] Goldwasser, S., S. Micali, and P. Tong, "Why and How to Establish a Private Code in a Public Network", 23rd FOCS, 1982, pp. 134-144.
[MRS88] Micali, S., C. Rackoff, and B.
Sloan, "The Notion of Security for Probabilistic Cryptosystems", SIAM Jour. of Computing, 1988, Vol. 17, pp. 412-426.
[Y82] see main references.

A.4 Pseudorandomness

I have partitioned the works in this category into two subcategories: works with immediate cryptographic relevance versus works which have a more abstract (say complexity theoretic) orientation. A survey on Pseudorandomness is contained in [G88b].

4.1. Cryptographically oriented works

The theory of pseudorandomness was extended to deal with functions and permutations. Definitions of pseudorandom functions and permutations are presented in [GGM84] and [LR86]. Pseudorandom generators were used to construct pseudorandom functions [GGM84], and these were used to construct pseudorandom permutations [LR86]. Cryptographic applications are discussed in [GGM84b, LR86].

In lecture 9, we proved that the existence of one-way permutations implies the existence of pseudorandom generators. Recently, it has been shown that pseudorandom generators exist if and only if one-way functions exist [ILL89, H89]. The construction of pseudorandom generators presented in these works is very complex and inefficient, thus the quest for an efficient construction of pseudorandom generators based on any one-way function is not over yet. A previous construction by [GKL88] might turn out useful in this quest.

A very efficient pseudorandom generator based on the intractability of factoring integers arises from the works [BBS82, ACGS84, VV84]. The generator was suggested in [BBS82] (where it was proved secure assuming intractability of the Quadratic Residuosity Problem), and proven secure assuming intractability of factoring in [VV84] (by adapting the techniques in [ACGS84]).

4.2. Complexity oriented works

The existence of a pseudorandom generator implies the existence of a pair of statistically different efficiently constructible probability ensembles which are computationally indistinguishable.
This sufficient condition turns out to be also a necessary one [G89b].

The difference between the output distribution of a pseudorandom generator and more commonly considered distributions is demonstrated in [L88]. The "commonly considered" distributions (e.g. all distributions having a polynomial-time computable distribution function) are shown to be homogenous while a pseudorandom generator gives rise to distributions which are not homogenous. Homogenous distributions are defined as distributions which allow good average approximation of all polynomial-time invariant characteristics of a string from its Kolmogorov complexity.

The use of pseudorandom generators for deterministic simulation of probabilistic complexity classes was first suggested in [Y82]. A unified approach, leading to better simulations, can be found in [NW88]. Other results concerning the "efficient" generation of sequences which "look random" to machines of various complexity classes can be found in [RT85, BNS89, Ni89].

The existence of sparse and evasive pseudorandom distributions is investigated in [GKr89a]. A sparse distribution (unlike a distribution statistically close to the uniform one) ranges over a negligible fraction of the strings. Evasiveness is the infeasibility of hitting an element in the distribution's support. Applications of some results to zero-knowledge are presented in [GKr89b].

references

[ACGS84] see category 2.
[BNS89] Babai, L., N. Nisan, and M. Szegedy, "Multi-party Protocols and Logspace-Hard Pseudorandom Sequences", 21st STOC, pp. 1-11, 1989.
[BBS82] L. Blum, M. Blum and M. Shub, "A Simple Secure Unpredictable Pseudo-Random Number Generator", SIAM Jour. on Computing, Vol. 15, 1986, pp. 364-383. Preliminary version in Crypto82.
[BM82] see main references.
[G88b] Goldreich, O., "Randomness, Interactive Proofs, and Zero-Knowledge - A Survey", The Universal Turing Machine - A Half-Century Survey, R.
Herken ed., Oxford Science Publications, pp. 377-406, 1988.
[G89b] Goldreich, O., "A Note on Computational Indistinguishability", TR-89-051, ICSI, Berkeley, USA, (1989).
[GGM84] Goldreich, O., S. Goldwasser, and S. Micali, "How to Construct Random Functions", Jour. of ACM, Vol. 33, No. 4, 1986, pp. 792-807. Extended abstract in FOCS84.
[GGM84b] Goldreich, O., S. Goldwasser, and S. Micali, "On the Cryptographic Applications of Random Functions", Crypto84, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 196, pp. 276-288, 1985.
[GKr89a] Goldreich, O., and H. Krawczyk, "Sparse Pseudorandom Distributions", Crypto89 proceedings, to appear.
[GKr89b] see category 6.
[GKL88] Goldreich, O., H. Krawczyk, and M. Luby, "On the Existence of Pseudorandom Generators", 29th FOCS, 1988.
[GM82] see main references.
[H89] Hastad, J., "Pseudo-Random Generators with Uniform Assumptions", preprint, 1989.
[ILL89] Impagliazzo, R., L.A. Levin, and M. Luby, "Pseudorandom Generation from One-Way Functions", 21st STOC, pp. 12-24, 1989.
[L85] L.A. Levin, "One-Way Functions and Pseudorandom Generators", Combinatorica, Vol. 7, No. 4, 1987, pp. 357-363. A preliminary version appeared in Proc. 17th STOC, 1985, pp. 363-365.
[L88] L.A. Levin, "Homogeneous Measures and Polynomial Time Invariants", 29th FOCS, pp. 36-41, 1988.
[LR86] M. Luby and C. Rackoff, "How to Construct Pseudorandom Permutations From Pseudorandom Functions", SIAM Jour. on Computing, Vol. 17, 1988, pp. 373-386. Extended abstract in FOCS86.
[NW88] Nisan, N., and A. Wigderson, "Hardness vs. Randomness", Proc. 29th FOCS, pp. 2-11, 1988.
[Ni89] Nisan, N., "Pseudorandom Generators for Bounded Space Machines", private communication, 1989.
[RT85] Reif, J.H., and J.D. Tygar, "Efficient Parallel Pseudo-Random Number Generation", Crypto85, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 218, pp. 433-446, 1985.
[Y82] see main references.
[VV84] Vazirani, U.V., and V.V.
Vazirani, "Efficient and Secure Pseudo-Random Number Generation", 25th FOCS, pp. 458-463, 1984.

A.5 Signatures and Commitment Schemes

Recent works reduce the existence of these important primitives to assumptions weaker than ever conjectured.

5.1. Unforgeable Signature Schemes

Unforgeable signature schemes can be constructed assuming the existence of one-way permutations [NY89]. The core of this work is a method for constructing "cryptographically strong" hashing functions. Further improvements and techniques are reported in [G86, EGM89]: in [G86] a technique for making schemes such as [GMR84, BM88, NY89] "memoryless" is presented; in [EGM89] the concept of "on-line/off-line" signature schemes is introduced, together with methods for constructing such schemes.

5.2. Secure Commitment Schemes

Secure commitment schemes can be constructed assuming the existence of pseudorandom generators [N89]. In fact, the second scheme presented in lecture 13 originates from this paper.

references

[BM88] see main references.
[EGM89] Even, S., O. Goldreich, and S. Micali, "On-Line/Off-Line Digital Signature Schemes", Crypto89 proceedings, to appear.
[G86] Goldreich, O., "Two Remarks concerning the Goldwasser-Micali-Rivest Signature Scheme", Crypto86, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 263, pp. 104-110, 1987.
[GMR84] see main references.
[N89] M. Naor, "Bit Commitment Using Pseudorandomness", IBM research report. Also to appear in Crypto89 proceedings, 1989.
[NY89] M. Naor and M. Yung, "Universal One-Way Hash Functions and their Cryptographic Applications", 21st STOC, pp. 33-43, 1989.

A.6 Interactive Proofs, Zero-Knowledge and Protocols

This category is subdivided into three parts. The first contains mainly cryptographically oriented works on zero-knowledge, the second contains more complexity oriented works on interactive proofs and zero-knowledge.
The third subcategory lists works on the design of cryptographic protocols. Surveys on Interactive Proof Systems and Zero-Knowledge Proofs can be found in [G88b, Gw89].

6.1. Cryptographically oriented works on Zero-Knowledge

An important measure for the "practicality" of a zero-knowledge proof system is its knowledge tightness. Intuitively, tightness is (the supremum taken over all probabilistic polynomial-time verifiers of) the ratio between the time it takes the simulator to simulate an interaction with the prover and the complexity of the corresponding verifier [G87a]. The definition of zero-knowledge only guarantees that the knowledge-tightness can be bounded by any function growing faster than every polynomial. However, the definition does not guarantee that the knowledge-tightness can be bounded above by a particular polynomial. It is easy to see that the knowledge-tightness of the proof system for Graph Isomorphism (presented in lecture 12) is 2, while the tightness of the proof system for Graph Colouring (lecture 13) is m (i.e., the number of edges). I believe that the knowledge-tightness of a protocol is an important aspect to be considered, and that it is very desirable to have tightness be a constant. Furthermore, using the notion of knowledge-tightness one can introduce more refined notions of zero-knowledge and in particular the notion of constant-tightness zero-knowledge. Such refined notions may be applied in a non-trivial manner also to languages in P.

Two standard efficiency measures associated with interactive proof systems are the computational complexity of the proof system (i.e., number of steps taken by either or both parties) and the communication complexity of the proof system (here one may consider the number of rounds, and/or the total number of bits exchanged).
Of special importance to practice is the question whether the (honest) prover's program can be probabilistic polynomial-time when an auxiliary input is given (as in the case of the proof system, presented in lecture 13, for Graph Colourability). An additional measure, the importance of which has been realized only recently, is the number of strings to which the commitment scheme is applied individually (see [KMO89]).

The zero-knowledge proof system for graph colourability presented in lecture 13 is not the most practical one known. Proof systems with constant knowledge-tightness, probabilistic polynomial-time provers and a number of iterations which is merely super-logarithmic exist for all languages in NP (assuming, of course, the existence of secure commitment) [IY87]. This proof system can be modified to yield a zero-knowledge proof with f(n) iterations, for every unbounded function f. Using stronger intractability assumptions (e.g. the existence of claw-free one-way permutations), constant-round zero-knowledge proof systems can be presented for every language in NP [GKa89].

Perfect zero-knowledge arguments^1 were introduced in [BC86a, BCC88] and shown to exist for all languages in NP, assuming the intractability of factoring integers. The difference between arguments and interactive proofs is that in an argument the soundness condition is restricted to probabilistic polynomial-time machines (with auxiliary input). Hence, it is infeasible (not impossible) to fool the verifier into accepting (with non-negligible probability) an input not in the language. Assuming the existence of any commitment scheme, it is shown that any language in NP has a constant-round zero-knowledge argument [FS88].

^1 The term "argument" first appeared in [BCY89]. The authors of [BCC88] create an enormous amount of confusion by insisting on referring to arguments by the term interactive proofs. For example, the result of [For87] does not hold for perfect zero-knowledge arguments. Be careful not to confuse arguments with interactive proofs in which the completeness condition is satisfied by a probabilistic polynomial-time prover (with auxiliary input).

The limitations of zero-knowledge proof systems and the techniques to demonstrate their existence are investigated in [GO87, GKr89b]. In particular, zero-knowledge proofs with deterministic verifier (resp. prover) exist only for languages in RP (resp. BPP); constant-round proofs of the AM-type (cf. [B85]) can be demonstrated zero-knowledge by an oblivious simulation only if the language is in BPP. Thus, the "parallel versions" of the interactive proofs (presented in [GMW86]) for Graph Isomorphism and every L ∈ NP are unlikely to be demonstrated zero-knowledge. However, modified versions of these interactive proofs yield constant-round zero-knowledge proofs (see [GKa89] for NP and [BMO89] for Graph Isomorphism). These interactive proofs are, of course, not of the AM-type.

The concept of a "proof of knowledge" was introduced and informally defined in [GMR85]. Precise formalizations following this sketch have appeared in [BCC88, FFS87, TW87]. This concept is quite useful in the design of cryptographic protocols and zero-knowledge proof systems. In fact, it has been used implicitly in [GMR85, GMW87, CR87] and explicitly in [FFS87, TW87]. However, I am not too happy with the current formalizations and intend to present a new formalization.

"Non-interactive" zero-knowledge proofs are known to exist assuming the existence of trapdoor one-way permutations [KMO89]. These are two-phase protocols. The first phase is a preprocessing which uses bi-directional communication. In the second phase, zero-knowledge proofs can be produced via one-directional communication from the prover to the verifier. The number of statements proven in the second phase is a polynomial in the complexity of the first phase (this polynomial is arbitrarily fixed after the first phase is completed).

Historical remark: Using a stronger intractability assumption (i.e. the intractability of the Quadratic Residuosity Problem), [BC86b] showed that every language in NP has a zero-knowledge interactive proof system. This result has been obtained independently of (but subsequently to) [GMW86].

6.2. Complexity oriented works on Interactive Proofs and Zero-Knowledge

The definition of interactive proof systems, presented in lecture 12, originates from [GMR85]. A special case, in which the verifier sends the outcome of all its coin tosses to the prover, was suggested in [B85] and termed Arthur-Merlin (AM) games. AM games are easier to analyze, while general interactive proof systems are easier to design. Fortunately, the two formalizations coincide in a strong sense: for every polynomial Q, the classes IP(Q(n)) and AM(Q(n)) are equal [GS86], where IP(Q(n)) denotes the class of languages having a Q(n)-round interactive proof system. It is also known that for every k ≥ 1 and every polynomial Q, the classes AM(Q(n)) and AM(k·Q(n)) coincide [BaMo88]. A stronger result does not "relativize" (i.e. there exists an oracle A such that for every polynomial Q and every unbounded function g the class AM(Q(n))^A is strictly contained in AM(g(n)·Q(n))^A) [AGH86].

Author's Note: However, in light of the results of [LFKN, S] (see FOCS90), this means even less than ever. See also Chang et al. (JCSS, Vol. 49, No. 1).

Author's Note: This list was compiled before the fundamental results of Lund, Fortnow, Karloff and Nisan [LFKN] and Shamir [S] were known. By these results every language in PSPACE has an interactive proof system. Since IP ⊆ PSPACE [folklore], the two classes coincide.
Every language L ∈ IP(Q(n)) has a Q(n)-round interactive proof system in which the verifier accepts every x ∈ L with probability 1, but only languages in NP have interactive proof systems in which the verifier never accepts x ∉ L [GMS87]. Further developments appear in [BMO89].

The class AM(2) is unlikely to contain coNP, as this will imply the collapse of the polynomial-time hierarchy [BHZ87]. It is also known that for a random oracle A, AM(2) = NP^A [NW88].

The complexity of languages having zero-knowledge proof systems seems to depend on whether these systems are perfect or only computational zero-knowledge. On one hand, it is known that perfect (even almost-perfect) zero-knowledge proof systems exist only for languages inside AM(2) ∩ coAM(2) [For87, AH87]. On the other hand, assuming the existence of commitment schemes (the very assumption used to show "NP in ZK") every language in IP has a computational zero-knowledge proof system [IY87] (for a detailed proof see [Betal88]). Returning to perfect zero-knowledge proof systems, it is worthwhile mentioning that such systems are known for several computational problems which are considered hard (e.g. Quadratic Residuosity Problem [GMR85], Graph Isomorphism [GMW86], membership in a subgroup [TW87], and a problem computationally equivalent to Discrete Logarithm [GKu88]).

The concept of the knowledge complexity of a language was introduced in [GMR85], but the particular formalization suggested there is somewhat ad-hoc and unnatural.^2 The knowledge complexity of a language is the minimum number of bits released by an interactive proof system for the language. Namely, a language L ∈ IP has knowledge complexity k(·) if there exists an interactive proof for L such that the interaction of the prover on x ∈ L can be simulated by a probabilistic polynomial-time oracle machine on input x and up to k(|x|) Boolean queries (to an oracle of "its choice").
More details will appear in a forthcoming paper of mine.

An attempt to get rid of the intractability assumption used in the "NP in ZK" result of [GMW86] led [BGKW88] to suggest and investigate a model of multi-prover interactive proof systems. It was shown that two "isolated" provers can prove statements in NP in a perfect zero-knowledge manner. A different multi-prover model, in which one unknown prover is honest while the rest may interact and cheat arbitrarily, was suggested and investigated in [FST88]. This model is equivalent to computation with a "noisy oracle".

6.3. On the Design of Cryptographic Protocols

The primary motivation for the concept of zero-knowledge proof systems has been their potential use in the design of cryptographic protocols. Early examples of such use can be found in [GMR85, FMRW85, CF85]. The general results in [GMW86] allowed the presentation of automatic generators of two-party and multi-party cryptographic protocols (see [Y86]^3 and [GMW87], respectively). Further improvements are reported in [GHY87, GV87, IY87].

Two important tools in the construction of cryptographic protocols are Oblivious Transfer and Verifiable Secret Sharing. Oblivious Transfer, introduced in [R81], was further investigated in [EGL82, FMRW85, BCR86, Cre87, CK88, Kil88]. Verifiable Secret Sharing, introduced in [CGMA85], was further investigated in [GMW86, Bh86a, Fel87]. Other useful techniques appear in [Bh86b, CR87].

An elegant model for investigations of multi-party cryptographic protocols was suggested in [BGW88]. This model consists of processors connected in pairs via private channels. The bad processors have infinite computing resources (and so using computationally hard problems is useless). Hence, computational complexity restrictions and assumptions are substituted by assumptions about the communication model. An automatic generator of protocols for this model, tolerating up to 1/3 malicious processors, has been presented in [BGW88, CCD88].
Augmenting the model by a broadcast channel, tolerance can be improved to 1/2 [BR89]. (The augmentation is necessary, as there are tasks which cannot be performed if a third of the processors are malicious (e.g. Byzantine Agreement).) Beyond the 1/2 bound, only functions of special type (i.e. the exclusive-or of locally computed functions) can be privately computed [CKu89].

^2 In particular, according to that formalization a prover revealing with probability 1/2 a Hamiltonian circuit in the input graph yields one bit of knowledge.
^3 It should be stressed that [Y86] improves over [Y82b]. The earlier paper presented two-party cryptographic protocols allowing semi-honest parties to compute privately functions ranging over "small" (i.e. polynomially bounded) domains.

references

[AGH86] Aiello, W., S. Goldwasser, and J. Hastad, "On the Power of Interaction", Proc. 27th FOCS, pp. 368-379, 1986.
[AH87] Aiello, W., and J. Hastad, "Perfect Zero-Knowledge Languages can be Recognized in Two Rounds", Proc. 28th FOCS, pp. 439-448, 1987.
[AGY85] Alon, N., Z. Galil, and M. Yung, "A Fully Polynomial Simultaneous Broadcast in the Presence of Faults", unpublished manuscript, 1985.
[B85] Babai, L., "Trading Group Theory for Randomness", Proc. 17th STOC, 1985, pp. 421-429.
[BKL] Babai, L., W.M. Kantor, and E.M. Luks, "Computational Complexity and Classification of Finite Simple Groups", Proc. 24th FOCS, pp. 162-171, 1983.
[BaMo88] Babai, L., and S. Moran, "Arthur-Merlin Games: A Randomized Proof System, and a Hierarchy of Complexity Classes", JCSS, Vol. 36, No. 2, pp. 254-276, 1988.
[BMO89] Bellare, M., S. Micali, and R. Ostrovsky, "On Parallelizing Zero-Knowledge Proofs and Perfect Completeness Zero-Knowledge", manuscript, April 1989.
[Bh86a] Benaloh, (Cohen), J.D., "Secret Sharing Homomorphisms: keeping shares of a secret secret", Crypto86, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 263, pp. 251-260, 1987.
[Bh86b] Benaloh, (Cohen), J.D., "Cryptographic Capsules: A Disjunctive Primitive for Interactive Protocols", Crypto86, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 263, pp. 213-222, 1987.
[Betal88] Ben-Or, M., O. Goldreich, S. Goldwasser, J. Hastad, J. Kilian, S. Micali, and P. Rogaway, "Every Thing Provable is provable in ZK", to appear in the proceedings of Crypto88, 1988.
[BGW88] Ben-Or, M., S. Goldwasser, and A. Wigderson, "Completeness Theorems for Non-Cryptographic Fault-Tolerant Distributed Computation", 20th STOC, pp. 1-10, 1988.
[BGKW88] Ben-Or, M., S. Goldwasser, J. Kilian, and A. Wigderson, "Multi-Prover Interactive Proofs: How to Remove Intractability", 20th STOC, pp. 113-131, 1988.
[BR89] Ben-Or, M., and T. Rabin, "Verifiable Secret Sharing and Multiparty Protocols with Honest Majority", 21st STOC, pp. 73-85, 1989.
[Bk] Blakley, G.R., "Safeguarding Cryptographic Keys", Proc. of National Computer Conf., Vol. 48, AFIPS Press, 1979, pp. 313-317.
[BFM88] Blum, M., P. Feldman, and S. Micali, "Non-Interactive Zero-Knowledge and its Applications", 20th STOC, pp. 103-112, 1988.
[BHZ87] Boppana, R., J. Hastad, and S. Zachos, "Does Co-NP Have Short Interactive Proofs?", IPL, 25, May 1987, pp. 127-132.
[BCC88] Brassard, G., D. Chaum, and C. Crepeau, "Minimum Disclosure Proofs of Knowledge", JCSS, Vol. 37, No. 2, Oct. 1988, pp. 156-189.
[BC86a] Brassard, G., and C. Crepeau, "Non-Transitive Transfer of Confidence: A Perfect Zero-Knowledge Interactive Protocol for SAT and Beyond", Proc. 27th FOCS, pp. 188-195, 1986.
[BC86b] Brassard, G., and C. Crepeau, "Zero-Knowledge Simulation of Boolean Circuits", Advances in Cryptology - Crypto86 (proceedings), A.M. Odlyzko (ed.), Springer-Verlag, Lecture Notes in Computer Science, vol. 263, pp. 223-233, 1987.
[BCR86] Brassard, G., C. Crepeau, and J.M. Robert, "Information Theoretic Reductions Among Disclosure Problems", Proc. 27th FOCS, pp.
168-173, 1986.
[BCY89] Brassard, G., C. Crepeau, and M. Yung, "Everything in NP can be argued in perfect zero-knowledge in a bounded number of rounds", Proc. of the 16th ICALP, July 1989.
[CCD88] Chaum, D., C. Crepeau, and I. Damgard, "Multi-party Unconditionally Secure Protocols", 20th STOC, pp. 11-19, 1988.
[Cha] Chaum, D., "Demonstrating that a Public Predicate can be Satisfied Without Revealing Any Information About How", Advances in Cryptology - Crypto86 (proceedings), A.M. Odlyzko (ed.), Springer-Verlag, Lecture Notes in Computer Science, vol. 263, pp. 195-199, 1987.
[CGMA85] Chor, B., S. Goldwasser, S. Micali, and B. Awerbuch, "Verifiable Secret Sharing and Achieving Simultaneity in the Presence of Faults", Proc. 26th FOCS, 1985, pp. 383-395.
[CKu89] Chor, B., and E. Kushilevitz, "A Zero-One Law for Boolean Privacy", 21st STOC, pp. 62-72, 1989.
[CR87] Chor, B., and M.O. Rabin, "Achieving Independence in Logarithmic Number of Rounds", 6th PODC, pp. 260-268, 1987.
[CGG] Chor, B., O. Goldreich, and S. Goldwasser, "The Bit Security of Modular Squaring given Partial Factorization of the Modulus", Advances in Cryptology - Crypto85 (proceedings), H.C. Williams (ed.), Springer-Verlag, Lecture Notes in Computer Science, vol. 218, 1986, pp. 448-457.
[CF85] Cohen, J.D., and M.J. Fischer, "A Robust and Verifiable Cryptographically Secure Election Scheme", Proc. 26th FOCS, pp. 372-382, 1985.
[Cre87] Crepeau, C., "Equivalence between two Flavours of Oblivious Transfer", Crypto87 proceedings, Lecture Notes in Computer Science, Vol. 293, Springer-Verlag, 1987, pp. 350-354.
[CK88] Crepeau, C., and J. Kilian, "Weakening Security Assumptions and Oblivious Transfer", Crypto88 proceedings.
[EGL82] see category 8.
[Fel87] Feldman, P., "A Practical Scheme for Verifiable Secret Sharing", Proc. 28th FOCS, pp. 427-438, 1987.
[FFS87] Feige, U., A. Fiat, and A. Shamir, "Zero-Knowledge Proofs of Identity", Proc. of 19th STOC, pp. 210-217, 1987.
[FST88] Feige, U., A. Shamir, and M. Tennenholtz, "The Noisy Oracle Problem", Crypto88 proceedings.
[FS88] Feige, U., and A. Shamir, "Zero-Knowledge Proofs of Knowledge in Two Rounds", manuscript, Nov. 1988.
[FMRW85] Fischer, M., S. Micali, C. Rackoff, and D.K. Wittenberg, "An Oblivious Transfer Protocol Equivalent to Factoring", unpublished manuscript, 1986. Preliminary versions were presented in EuroCrypt84 (1984), and in the NSF Workshop on Mathematical Theory of Security, Endicott House (1985).
[For87] Fortnow, L., "The Complexity of Perfect Zero-Knowledge", Proc. of 19th STOC, pp. 204-209, 1987.
[GHY85] Galil, Z., S. Haber, and M. Yung, "A Private Interactive Test of a Boolean Predicate and Minimum-Knowledge Public-Key Cryptosystems", Proc. 26th FOCS, 1985, pp. 360-371.
[GHY87] Galil, Z., S. Haber, and M. Yung, "Cryptographic Computation: Secure Fault-Tolerant Protocols and the Public-Key Model", Crypto87, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 293, pp. 135-155, 1987.
[G87a] Goldreich, O., "Zero-Knowledge and the Design of Secure Protocols (an exposition)", TR-480, Computer Science Dept., Technion, Haifa, Israel, 1987.
[G88b] see category 4.
[GKu88] Goldreich, O., and E. Kushilevitz, "A Perfect Zero-Knowledge Proof for a Decision Problem Equivalent to Discrete Logarithm", Crypto88, proceedings.
[GKa89] Goldreich, O., and A. Kahan, "Using Claw-Free Permutations to Construct Zero-Knowledge Proofs for NP", in preparation, 1989.
[GKr89b] Goldreich, O., and H. Krawczyk, "On Sequential and Parallel Composition of Zero-Knowledge Protocols", preprint, 1989.
[GV87] Goldreich, O., and R. Vainish, "How to Solve any Protocol Problem - an Efficiency Improvement", Crypto87, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 293, pp. 73-86, 1987.
[GMS87] Goldreich, O., Y. Mansour, and M.
Sipser, "Interactive Proof Systems: Provers that Never Fail and Random Selection", 28th FOCS, pp. 449-461, 1987.
[GMW86] see main references.
[GMW87] see main references.
[GO87] Goldreich, O., and Y. Oren, "On the Cunning Power of Cheating Verifiers: Some Observations about Zero-Knowledge Proofs", in preparation. Preliminary version, by Y. Oren, in FOCS87.
[Gw89] Goldwasser, S., "Interactive Proof Systems", Proc. of Symposia in Applied Mathematics, AMS, Vol. 38, 1989.
[GMR85] see main references.
[GS86] Goldwasser, S., and M. Sipser, "Private Coins vs. Public Coins in Interactive Proof Systems", Proc. 18th STOC, 1986, pp. 59-68.
[IY87] Impagliazzo, R., and M. Yung, "Direct Minimum-Knowledge Computations", Advances in Cryptology - Crypto87 (proceedings), C. Pomerance (ed.), Springer-Verlag, Lecture Notes in Computer Science, vol. 293, 1987, pp. 40-51.
[Kil88] Kilian, J., "Founding Cryptography on Oblivious Transfer", 20th STOC, pp. 20-31, 1988.
[LMR83] Luby, M., S. Micali, and C. Rackoff, 24th FOCS, 1983.
[KMO89] Kilian, J., S. Micali, and R. Ostrovsky, "Simple Non-Interactive Zero-Knowledge Proofs", 30th FOCS, to appear, 1989.
[NW88] see category 4.
[R81] see category 8.
[TW87] Tompa, M., and H. Woll, "Random Self-Reducibility and Zero-Knowledge Interactive Proofs of Possession of Information", Proc. 28th FOCS, pp. 472-482, 1987.
[Y82b] Yao, A.C., "Protocols for Secure Computations", 23rd FOCS, 1982, pp. 160-164.
[Y86] Yao, A.C., "How to Generate and Exchange Secrets", Proc. 27th FOCS, pp. 162-167, 1986.

A.7 Additional Topics

This category provides pointers to topics which I did not address so far. These topics include additional cryptographic problems (e.g. software protection, computation with an untrusted oracle, and protection against "abuse of cryptographic systems"), lower level primitives (e.g. Byzantine Agreement and sources of randomness) and "cryptanalysis".

7.1. Software Protection

A theoretical framework for discussing software protection is suggested in [G87b]. Recently, the solution in [G87b] has been dramatically improved [O89].

7.2. Computation with an Untrusted Oracle

Computation with an untrusted oracle raises two problems: the oracle may fail the computation by providing wrong answers, and/or the oracle can gain information on the input of the machine which uses it. The first problem can be identified with recent research on "program checking" initiated in [BK89]. Note that the definition of "program checking" is more refined than the one of an interactive proof (in particular it does not trivialize polynomial-time computations and does not allow infinitely powerful provers) and thus is more suitable for the investigation. The results in [BK89, BLR89] are mainly encouraging as they provide many positive examples of computations which can be sped up (and yet confirmed) using an oracle. A formalization of the second problem, presented in [AFK87], seems to have reached a dead-end with the negative results of [AFK87]. Other formalizations appear in [BF89] and [BLR89].

7.3. Protection Against Abuse of Cryptographic Systems

How can a third party prevent the abuse of a two-party cryptographic protocol executed through a channel he controls? As an example consider an attempt of one party to pass information to his counterpart by using a signature scheme. This old problem (sometimes referred to as the prisoners' problem or the subliminal channel) is formalized and solved, using active intervention of the third party, in [D88].

7.4. Byzantine Agreement

In lectures 14-15 we have assumed the existence of a broadcast channel accessible by all processors. In case such a channel does not exist in the network (i.e., in case we are using a point-to-point network), such a channel can be implemented using Byzantine Agreement.
Using private channels, randomized Byzantine Agreement protocols with expected O(1) rounds can be implemented [FM88]. This work builds on [R83]. Additional insight can be gained from the pioneering works of [Be83, Br85], and from the survey of [CD89].

7.5. Sources of Randomness

A subject related to cryptography is the use of weak sources of randomness in applications requiring perfect coins. Models of weak sources are presented and investigated in [B84, SV84, CG85, Cetal85, LLS87]. Further developments are reported in [V85, VV85, V87].

7.6. Cryptanalysis

In all the famous examples of successful cryptanalysis of a proposed cryptographic scheme, the success revealed an explicit or implicit assumption made by the designers of the cryptosystem. This should serve as experimental support to the thesis underlying the course that assumptions have to be made explicitly.

Knapsack cryptosystems, first suggested in [MH78], were the target of many attacks. The first dramatic success was the breaking of the original [MH78] scheme, using the existence of a trapdoor super-increasing sequence [S82]. An alternative attack applicable against low density knapsack (subset sum) problems was suggested in [LO85]. For more details see [BO88]. It seems that the designers conjectured that subset sum problems with a trapdoor (resp. with low density) are as hard as random high density subset sum problems. It seems that this conjecture is false.

Linear congruential number generators and their generalizations were another target of many attacks. Although these generators are known to pass many statistical tests [K69], they do not pass all polynomial-time statistical tests [Boy82]. Generalizations to polynomial congruential recurrences and linear generators which output only part of the bits of the numbers produced can be found in [Kr88] and [S87], respectively. The fact that a proposed scheme passes some tests or attacks does not mean that it will pass all efficient tests.
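To make the remark on linear congruential generators concrete, here is a minimal sketch in Python of the simplest possible predictor. It is not the method of any of the works cited above, which treat much more general settings. It assumes that the modulus m is known and prime, that full outputs are observed, and that the generator has the form x -> (a*x + b) mod m; the function names and the parameter values in the usage note are hypothetical and chosen only for illustration.

```python
def lcg_stream(m, a, b, seed, count):
    """Return `count` successive outputs of x -> (a*x + b) mod m, starting from `seed`."""
    outputs = []
    x = seed
    for _ in range(count):
        x = (a * x + b) % m
        outputs.append(x)
    return outputs


def recover_lcg(m, x0, x1, x2):
    """Recover (a, b) from three consecutive outputs, ASSUMING the modulus m is known.

    From x1 = a*x0 + b and x2 = a*x1 + b (mod m) we get
    x2 - x1 = a * (x1 - x0) (mod m), so a is determined whenever
    x1 - x0 is invertible mod m (always true for prime m and x1 != x0).
    """
    d = (x1 - x0) % m
    try:
        d_inv = pow(d, -1, m)  # modular inverse; requires Python 3.8 or later
    except ValueError:
        raise ValueError("x1 - x0 is not invertible mod m; use another window of outputs")
    a = ((x2 - x1) * d_inv) % m
    b = (x1 - a * x0) % m
    return a, b


def predict_next(m, x0, x1, x2):
    """Predict the output that follows x2, using the recovered parameters."""
    a, b = recover_lcg(m, x0, x1, x2)
    return (a * x2 + b) % m
```

For instance, with the illustrative values m = 2147483647, a = 16807, b = 12345 and seed 42, calling predict_next on the first three outputs of lcg_stream returns the fourth. A generator can therefore pass many classical statistical tests and still fail a simple efficient one: the test "does the next output match the prediction?".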
Another famous cryptographic system which triggered interesting algorithmic research is the [OSS84] signature scheme. This scheme was based on the conjecture, later refuted in [Pol84], that it is hard to solve a modular quadratic equation in two variables. Other variants (e.g. [OSS84b, OS85]) were broken as well (in [EAKMM85, BD85], resp.). Proving that one cannot find the trapdoor information used by the legal signer does not mean that one cannot forge signatures.^4

^4 To further stress this point, consider a signature scheme "based on composites" where the signature of a message m relative to the public-key N is 2^m mod N. The infeasibility of retrieving the trapdoor (i.e. the factorization of N) is a poor guarantee for security.

references

[AFK87] Abadi, M., J. Feigenbaum, and J. Kilian, "On Hiding Information from an Oracle", 19th STOC, pp. 195-203, 1987.
[BF89] Beaver, D., and J. Feigenbaum, "Encrypted Queries to Multiple Oracles", manuscript, 1989.
[B84] Blum, M., "Independent Unbiased Coin Flips from a Correlated Biased Source: a Finite State Markov Chain", 25th Symp. on Foundation of Computer Science, pp. 425-433, 1984.
[Be83] Ben-Or, M., "Another Advantage of Free Choice: Completely Asynchronous Agreement Protocols", 2nd PODC, pp. 27-30, 1983.
[BK89] Blum, M., and S. Kannan, "Designing Programs that Check their Work", 21st STOC, pp. 86-97, 1989.
[BLR89] Blum, M., M. Luby, and R. Rubinfeld, in preparation.
[Boy82] Boyar, J.B., "Inferring Sequences Produced by Pseudo-Random Number Generators", JACM, Vol. 36, No. 1, pp. 129-141, 1989. Early version in FOCS82 (under previous name: Plumstead).
[Br85] Bracha, G., "An O(log n) Expected Rounds Randomized Byzantine Generals Protocol", JACM, Vol. 34, No. 4, pp. 910-920, 1987. Extended abstract in STOC85.
[BD85] Brickell, E.F., and J.M.
DeLaurentis, "An Attack on a Signature Scheme Proposed by Okamoto and Shiraishi", Crypto85, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 218, pp. 28-32, 1985.
[LO85] Lagarias, J.C., and A.M. Odlyzko, "Solving Low-Density Subset Sum Problems", JACM, Vol. 32, (1985), pp. 229-246. Preliminary version in 24th FOCS, pp. 1-10, 1983.
[BO88] see category 2.
[CD89] Chor, B., and C. Dwork, "Randomization in Byzantine Agreement", Advances in Computing Research, Vol. 5, S. Micali, ed., JAI Press, in press.
[Cetal85] Chor, B., J. Friedman, O. Goldreich, J. Hastad, S. Rudich, and R. Smolensky, "The Bit Extraction Problem or t-Resilient Functions", 26th FOCS, pp. 396-407, 1985.
[CG85] Chor, B., and O. Goldreich, "Unbiased Bits from Sources of Weak Randomness and Probabilistic Communication Complexity", 26th Symp. on Foundation of Computer Science, pp. 427-443, 1985.
[D88] Desmedt, Y., "Abuses in Cryptography and How to Fight Them", Crypto88 proceedings, to appear.
[EAKMM85] Estes, D., L. Adleman, K. Kompella, K. McCurley, and G. Miller, "Breaking the Ong-Schnorr-Shamir Signature Scheme for Quadratic Number Fields", Crypto85, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 218, pp. 3-13, 1985.
[FM88] Feldman, P., and S. Micali, "Optimal Algorithms for Byzantine Agreement", 20th STOC, pp. 148-161, 1988.
[FHKLS] Frieze, A.M., J. Hastad, R. Kannan, J.C. Lagarias, and A. Shamir, "Reconstructing Truncated Integer Variables Satisfying Linear Congruences", SIAM J. Comput., Vol. 17, No. 2, pp. 262-280, 1988. Combines early papers from FOCS84 and STOC85 (by Frieze, Kannan and Lagarias, and Hastad and Shamir, resp.).
[G87b] Goldreich, O., "Towards a Theory of Software Protection and Simulation by Oblivious RAMs", 19th STOC, pp. 182-194, 1987.
[K69] Knuth, D.E., The Art of Computer Programming, Vol. 2, Addison-Wesley, Reading, Mass., 1969.
[Kr88] Krawczyk, H., "How to Predict Congruential Generators", TR-533, Computer Science Dept., Technion, Haifa, Israel, 1988. To appear in J.
of Algorithms.
[LR88] Lagarias, J.C., and J. Reeds, "Unique Extrapolation of Polynomial Recurrences", SIAM J. Comput., Vol. 17, No. 2, pp. 342-362, 1988.
[LLS87] Lichtenstein, D., N. Linial, and M. Saks, "Imperfect Random Sources and Discrete Control Processes", 19th STOC, pp. 169-177, 1987.
[MH78] see category 8.
[OS85] Okamoto, T., and A. Shiraishi, "A Fast Signature Scheme Based on Quadratic Inequalities", Proc. of 1985 Symp. on Security and Privacy, April 1985, Oakland, Cal.
[OSS84] Ong, H., C.P. Schnorr, and A. Shamir, "An Efficient Signature Scheme Based on Quadratic Equations", 16th STOC, pp. 208-216, 1984.
[OSS84b] Ong, H., C.P. Schnorr, and A. Shamir, "Efficient Signature Schemes Based on Polynomial Equations", Crypto84, proceedings, Springer-Verlag, Lecture Notes in Computer Science, vol. 196, pp. 37-46, 1985.
[O89] Ostrovsky, R., "An Efficient Software Protection Scheme", in preparation.
[Pol84] Pollard, J.M., "Solution of x^2 + ky^2 ≡ m (mod n), with Application to Digital Signatures", preprint, 1984.
[R83] Rabin, M.O., "Randomized Byzantine Agreement", 24th FOCS, pp. 403-409, 1983.
[SV84] Santha, M., and U.V. Vazirani, "Generating Quasi-Random Sequences from Slightly-Random Sources", 25th Symp. on Foundations of Computer Science, pp. 434-440, 1984.
[S82] Shamir, A., "A Polynomial-Time Algorithm for Breaking the Merkle-Hellman Cryptosystem", 23rd FOCS, pp. 145-152, 1982.
[S87] Stern, J., "Secret Linear Congruential Generators are not Cryptographically Secure", 28th FOCS, pp. 421-426, 1987.
[V85] Vazirani, U.V., "Towards a Strong Communication Complexity Theory or Generating Quasi-Random Sequences from Two Communicating Slightly-Random Sources", Proc. 17th ACM Symp. on Theory of Computing, 1985, pp. 366-378.
[V87] Vazirani, U.V., "Efficiency Considerations in Using Semi-random Sources", Proc. 19th ACM Symp. on Theory of Computing, 1987, pp. 160-168.
[VV85] Vazirani, U.V., and V.V.
Vazirani, "Random Polynomial Time is equal to Slightly-Random Polynomial Time", 26th Symp. on Foundations of Computer Science, pp. 417-428, 1985.

APPENDIX A. ANNOTATED LIST OF REFERENCES (COMPILED FEB. 1989)

A.8 Historical Background

An inspection of the references listed above reveals that all these works were initiated in the 80's and began to appear in the literature in 1982 (e.g., [GM82]). However, previous work had tremendous influence on these works of the 80's. The influence took the form of setting intuitive goals, providing basic techniques, and suggesting potential solutions which served as a basis for constructive criticism (leading to robust approaches).

8.1. Classic Cryptography

Answering the fundamental question of classic cryptography in a gloomy way (i.e., it is impossible to design a code that cannot be broken), Shannon suggested a modification to the question [S49]. Rather than asking whether it is possible to break the code, one should ask whether it is feasible to break it. A code should be considered good if it cannot be broken by investing work which is in reasonable proportion to the work required of the legitimate parties using the code.

8.2. New Directions in Cryptography

Prospects of commercial application were the trigger for the beginning of civil investigations of encryption schemes. The DES, designed in the early 70's, adopted the new paradigm: it is clearly possible but supposedly infeasible to break it. Following the challenge of constructing and analyzing new encryption schemes came new questions, such as how to exchange keys over an insecure channel [M78]. New concepts were invented: digital signatures [R77, DH76], public-key cryptosystems and one-way functions [DH76]. First implementations of these concepts were suggested in [MH78, RSA78, R79].
Cryptography was explicitly related to complexity theory in [Br79, EY80, Lem79]: it was understood that problems related to breaking a cryptographic scheme cannot be NP-complete and that NP-hardness is poor evidence for cryptographic security. Techniques such as "n-out-of-2n verification" [R77] and secret sharing [S79] were introduced (and indeed were used extensively in subsequent research).

8.3. At the Dawn of a New Era

Early investigations of cryptographic protocols revealed the inadequacy of imprecise notions of security and the subtleties involved in designing cryptographic protocols. In particular, problems such as coin tossing over the telephone [B82a], exchange of secrets, and oblivious transfer were formulated [R81, B82b] (cf. [EGL82]). Doubts concerning the security of the "mental poker" protocol of [SRA79] led to the current notion of secure encryption [GM82] and to concepts such as computational indistinguishability. Doubts concerning the Oblivious Transfer protocol of [R81] led to the concept of zero-knowledge [GMR85] (early versions date to March 1982). An alternative approach to the security of cryptographic protocols was suggested in [DY81] (see also [DEK82]), but it turned out that it is much too difficult to test whether a protocol is secure [EG83]. Fortunately, tools for constructing secure protocols do exist (see [Y86, GMW87])!

references

[B82a] Blum, M., "Coin Flipping by Phone", IEEE Spring COMPCON, pp. 133-137, February 1982. See also SIGACT News, Vol. 15, No. 1, 1983.
[B82b] Blum, M., "How to Exchange Secret Keys", Memo. No. UCB/ERL M81/90. ACM Trans. Comput. Sys., Vol. 1, pp. 175-193, 1983.
[Br79] Brassard, G., "A Note on the Complexity of Cryptography", IEEE Trans. on Inform. Th., Vol. 25, pp. 232-233, 1979.
[DH76] Diffie, W., and M.E. Hellman, "New Directions in Cryptography", IEEE Trans. on Inform. Theory, IT-22 (Nov. 1976), pp. 644-654.
[DEK82] Dolev, D., S. Even, and R.
Karp, "On the Security of Ping-Pong Protocols", Advances in Cryptology: Proceedings of Crypto82, Plenum Press, pp. 177-186, 1983.
[DY81] Dolev, D., and A.C. Yao, "On the Security of Public-Key Protocols", IEEE Trans. on Inform. Theory, Vol. 30, No. 2, pp. 198-208, 1983. Early version in FOCS81.
[EGL82] Even, S., O. Goldreich, and A. Lempel, "A Randomized Protocol for Signing Contracts", CACM, Vol. 28, No. 6, 1985, pp. 637-647. Extended abstract in Crypto82.
[EG83] Even, S., and O. Goldreich, "On the Security of Multi-party Ping-Pong Protocols", 24th FOCS, pp. 34-39, 1983.
[EY80] Even, S., and Y. Yacobi, "Cryptography and NP-Completeness", 7th ICALP proceedings, Lecture Notes in Computer Science, Vol. 85, Springer-Verlag, pp. 195-207, 1980. See also later version by Even, Selman, and Yacobi (titled: "The Complexity of Promise Problems with Applications to Public-Key Cryptography") in Inform. and Control, Vol. 61, pp. 159-173, 1984.
[GMW87] see main references.
[GM82] see main references.
[GMR85] see main references.
[Lem79] Lempel, A., "Cryptography in Transition", Computing Surveys, Dec. 1979.
[M78] Merkle, R.C., "Secure Communication over Insecure Channels", CACM, Vol. 21, No. 4, pp. 294-299, 1978.
[MH78] Merkle, R.C., and M.E. Hellman, "Hiding Information and Signatures in Trapdoor Knapsacks", IEEE Trans. Inform. Theory, Vol. 24, pp. 525-530, 1978.
[R77] Rabin, M.O., "Digitalized Signatures", Foundations of Secure Computation, Academic Press, R.A. DeMillo et al., eds., 1977.
[R79] Rabin, M.O., "Digitalized Signatures and Public Key Functions as Intractable as Factoring", MIT/LCS/TR-212, 1979.
[R81] Rabin, M.O., "How to Exchange Secrets by Oblivious Transfer", unpublished manuscript, 1981.
[RSA78] Rivest, R., A. Shamir, and L. Adleman, "A Method for Obtaining Digital Signatures and Public Key Cryptosystems", Comm. ACM, Vol. 21, Feb. 1978, pp. 120-126.
[S79] Shamir, A., "How to Share a Secret", CACM, Vol. 22, 1979, pp.
612-613.
[S83] Shamir, A., "On the Generation of Cryptographically Strong Pseudorandom Sequences", ACM Transactions on Computer Systems, Vol. 1, No. 1, February 1983, pp. 38-44.
[SRA79] Shamir, A., R.L. Rivest, and L. Adleman, "Mental Poker", MIT/LCS report TM-125, 1979.
[S49] Shannon, C.E., "Communication Theory of Secrecy Systems", Bell Sys. Tech. J., 28, pp. 656-715, 1949.
[Y86] see category 6.
...