A_Survey_of_Formal_Software_Development_Methods.pdf - See...

This preview shows page 1 out of 31 pages.

Unformatted text preview: See discussions, stats, and author profiles for this publication at: A Survey of Formal Software Development Methods Article · April 1995 Source: CiteSeer CITATIONS READS 24 295 1 author: Donald Sannella The University of Edinburgh 141 PUBLICATIONS 3,201 CITATIONS SEE PROFILE All content following this page was uploaded by Donald Sannella on 26 June 2014. The user has requested enhancement of the downloaded file. A survey of formal software development methods1 Donald Sannella Department of Arti cial Intelligence and Department of Computer Science University of Edinburgh July 1988 1 Introduction 1.1 Scope This paper is a survey of the current state of the art of research on methods for formal software development. The scope of this paper is necessarily restricted so as to avoid discussion of a great many approaches at a very super cial level. First, although some of the ideas discussed below could be (and have been) applied to hardware development as well as to software development, this topic will not be treated here. Second, the special problems involved in the development of concurrent systems will not be discussed here although again many of the approaches mentioned below could be applied in this context. Third, no attempt is made to treat programming methodologies such as Jackson's method and program development systems such as the MIT Programmer's Apprentice which are not formally based. Finally, this survey does not claim to be fully exhaustive although an attempt has been made to cover most of the main approaches. Many of the technical details of the di erent approaches discussed have been glossed over or simpli ed for the purposes of this presentation; full details may be found in the cited references. 1.2 Software development: from requirements to program This section presents a general picture of the process by which a software system may be developed by formal methods from a speci cation of the requirements the system must ful ll. This overall picture will be useful in discussing the wide variety of formal program development approaches available, as the di erent approaches attack di erent aspects of the problem. Let SP 0 be a speci cation of the requirements which the software system is expected to ful ll, expressed in some formal speci cation language SL. (The process, sometimes known as requirements engineering, by which such a precise formal speci cation is obtained starting from the informal and often vague requirements of the customer will not be discussed here although it is acknowledged that this problem is by no means a trivial one.) This speci cation constrains the input/output behaviour of the system in some fashion. It may This paper was written under contract to GEC Research as a part of the Alvey-sponsored ISF (Integrated Systems Factory) Study. 1 1 also place constraints on the time and space resources available, although most of the approaches to be discussed are unable to deal with constraints of this kind. The ultimate objective is a program P written in some given programming language PL which satis es the requirements in SP 0. The usual way to proceed is to construct P by whatever means are available, making informal reference to SP 0 in the process, and then verify in some way that P does indeed satisfy SP 0. The only practical veri cation method available at present is to test P , checking that in certain selected cases the input/output relation it computes satis es the constraints imposed by SP 0. This has the obvious disadvantage that (except for trivial programs) correctness of P is never guaranteed by this process, even if the correct output is produced in all test cases. On the other hand, methods for automatically generating test cases which are likely to expose problems are available, see for example [BCFG 86]. An alternative to testing is a formal proof that the program P is correct with respect to the speci cation SP 0. However, after two decades of work on program veri cation it now seems to be more or less widely accepted that this will probably never be feasible for programs of realistic size. At the very least, initial hopes for a system capable of automatically generating proofs of program correctness are regarded as unrealizable. Most recent work in this area has focused on methods for developing programs from speci cations in such a way that the resulting program is guaranteed to be correct by construction. The main idea is to develop P from SP 0 via a series of small re nement steps, inspired by the programming discipline of stepwise re nement. Each re nement step captures a single design decision, for instance a choice between several algorithms which implement the same function or between several ways of eciently representing a given data type. This yields the following diagram: SP 0 > SP 1 > SP 2 >   > P If each individual re nement step (SP 0 > SP 1, SP 1 > SP 2 and so on) can be proved correct, then P itself is guaranteed to be correct. Each of these proofs is orders of magnitude easier than a proof that P itself is correct since each re nement step is small. In principle it would be possible to combine all the individual correctness proofs to yield a proof of the correctness of P with respect to SP 0, but in practice this would never be necessary. When this approach is used to develop large and complex programs, the individual speci cations SP n become large and unwieldy. This is particularly true as n increases. As a consequence the proofs of correctness of re nement steps become dicult, even though the creative leap involved in a single step remains the same. The solution to this problem adopted by some formal program development approaches is to allow speci cations to be decomposed into smaller units during the development process. These smaller speci cations may then be re ned independently of one another. A simple development involving 2 only two decompositions and six re nement steps would then give the following diagram: 8 >> SP 1 > SP 2 > P >>  < 8 >< SP 2 > P SP 0 > > >> SP 1 > > : SP > P : 2 0 0 00 00 0 Here  and are intended to denote arbitrary speci cation-building operations, and P , P and P are program modules. The program P  (P P ) is guaranteed to be correct with respect to the speci cation SP 0 provided each of the individual re nement steps can be proved correct. This assumes that  and can be used for combining program modules as well as speci cations, or at least that operations corresponding to  and exist on the level of program modules. It is important to note that the neat and tidy diagrams above are not intended to suggest that the formal development of realistic programs proceeds without backtracking, mistakes and iteration. Formal program development approaches do not claim to remove the possibility of unwise design decisions. But once a program is obtained by means of some sequence of re nement steps then a diagram like the one above which omits all the blind alleys may be drawn. Then, provided all the required correctness proofs have been carried out, correctness of the resulting program is guaranteed. Most approaches do not require the correctness proofs to be carried out when the individual re nement steps are rst proposed; the proof obligations can be recorded for later, to be discharged after it becomes clear if the re nement step will lead to (a module of) the nal program. There are a number of questions which the general view of formal program development presented above leaves completely open, and which any particular approach which ts into this mould must answer. These include:  What is the speci cation language SL?  What speci cation-building operations such as  and above are available in SL, if any?  What is the programming language PL?  What is the relationship between PL and SL?  What does \re nement" mean and under what circumstances is a re nement step correct?  How is the transition between SL and PL made?  What is the relationship between re nement and decomposition?  Does re nement of programs P > P make sense? What is the relation between this and re nement of speci cations SP > SP ? 0 00 0 0 0 3 00  Does the re nement process itself have the status of a formal object subject to analysis and manipulation (see [SS 83])? It is very important to the mathematical soundness of the particular program development approach under consideration that some of these questions (at least the rst seven) be given very precise answers. For example, whatever speci cation language SL is used, it is important that it be given a complete formal semantics. Without such a semantics, speci cations have no precise meaning and so no formal proofs can be undertaken. The same holds for the programming language PL. The notion of re nement must be given a precise mathematical de nition and it must be shown that re nements preserve behaviour in an appropriate sense. There are other questions which are important for determining the useability of an approach:2  What methods are available for proving the correctness of re nement steps?  Where do re nement steps come from?  What tools are available for assisting with which aspects of the program development process?  What level of sophistication is required by the user of such tools?  Which aspects of the program development process as sketched above can be fully automated?  Does the approach require speci cations to be \complete" in any sense? Does it provide a way of checking completeness of a speci cation or identifying areas of incompleteness?  Does the approach provide ways of deriving programs which are optimal (or at least adequate) with respect to some performance measure?  Does the approach provide a formal way of comparing pros and cons of di erent implementations which meet a speci cation?  What are the complexity properties of the approach; for example, what happens to the size of proofs of correctness as speci cations grow in size? Again, it is essential for soundness that any methods which are established for proving correctness of re nement steps etc. are shown to be sound with respect to the formal de nition of re nement. Even once all these questions have been given precise answers, there are additional conditions which the answers must satisfy. For example, whatever de nition of re nement 2 Thanks to Aaron Sloman for pointing out the importance of the last four of these. 4 is adopted, it is essential that re nement steps be composable as discussed at the beginning of section 3. It turns out that this in turn imposes constraints on the semantics of speci cation-building operations. Although these are important questions, it is not the purpose of this paper to answer all of them for all the approaches which will be mentioned below. Some of the answers may be found in the cited papers. Sometimes an approach has no clear answer to one or more of the most central questions; in this case I try to point this out. 1.3 Structure of this paper The rest of this paper surveys di erent approaches to formal program development in the light of the discussion above. Section 2 surveys languages for writing speci cations and programs. Section 3 examines the notions of re nement adopted in di erent approaches. Finally, section 4 discusses the issue of where individual re nement steps come from. 2 Speci cation and programming languages This section survey languages which are used in di erent formal program development approaches to write speci cations and programs. As suggested by the gradual evolution from high-level speci cation to program described above, in many approaches the distinction between speci cations and programs is blurred or even non-existent. Some approaches described below adopt a single so-called wide spectrum language which can be used to write high-level speci cations, ecient programs, and everything which arises in the transition from the former to the latter during the program development process. In these intermediate steps it is natural for speci cation constructs to be mixed freely with programming constructs because of the way that high-level speci cations are gradually re ned to programs. This also avoids various problems which arise when separate speci cation and programming languages are used: there is no essential di erence between re nement of programs and re nement of speci cations; the same modularization constructs can be used to structure speci cations as well as programs; there is no sudden leap from one notation to another but rather a gradual transition from high-level speci cation to ecient program. For the purposes of this paper, \high-level speci cations" are descriptions which give details of what is required. This is contrasted with \programs" which suggest how the desired result is to be computed (which amounts to an algorithm of some kind). The word \speci cation" shall be used to refer to any description of the input/output behaviour of a system, whether algorithmic or not; thus a program is a speci cation which is executable. This is consistent with the terminology used in wide spectrum languages where a program is a speci cation which happens to use only the executable subset of the language. In some approaches it is argued that the initial high-level speci cation of requirements must be executable. This is thought to be necessary in order to ensure that this formal speci cation accurately re ects the customer's intentions; with an executable speci cation this can be ascertained by testing. We agree that it is necessary to start from a 5 speci cation of requirements which is correct with respect to our intentions, and that bug-free formal speci cations are dicult to construct. But requiring the initial speci cation to be executable means that a major part (the most dicult part, in our view) of the program development process is not formalized. For one thing, in constructing an executable speci cation it is necessary to make many decisions which could be left open in a non-executable speci cation. This means that a whole range of perfectly acceptable alternative implementations is unnecessarily eliminated from consideration from the very beginning. Writing speci cations in an executable speci cation language is just a form of programming, and developing ecient programs from such speci cations is just program optimization. Aiming for an initial speci cation which is as abstract and non-algorithmic as possible often leads to useful simpli cations and generalizations which would not be discovered otherwise. Such a speci cation can be \tested" and shown to accurately re ect our intentions using a theorem prover; instead of evaluating an expression e and checking that the resulting value is v as expected, we try to prove that e = v is impled by our speci cation. A number of papers discussing these issues and others can be found in [GMc 86]; unfortunately most of the papers in this collection were originally published in the period 1977-1982 and so recent developments are not discussed. 2.1 VDM VDM (the Vienna Development Method) is a method for rigorous (not formal) program development. The objective is to produce programs by a process similar to the one sketched in section 1.2 where the individual re nement steps are shown to be correct using arguments which are formalizable rather than formal, thus approximating the level of rigour used in mathematics. This is supposed to yield most of the advantages of formal program development by ensuring that sloppiness is avoided without the foundational and notational overhead of full formality. VDM is presented in [Jon 80] and [BJ 82]; a short introduction to the approach is in [Jon 86]. VDM is the most widely accepted approach to systematic program development available to date. VDM uses a model-oriented approach to describing data types. Models are built using functions, relations and sets. A simple example from [Jon 80] is the following speci cation of dates: Date :: YEAR : Nat MONTH : fJan,Feb,: : : ,Decg DAY : f1 : 31g This models dates as triples, but does not require that dates be represented as triples in the nal program. Not all of the values of type Date are valid; the legal ones are characterized by the following data type invariant : 6 inv-date(hy; m; di) =def (m 2 fJan,Mar,May,Jul,Aug,Oct,Decg ) 1  d  31) ^ (m 2 fApr,Jun,Sep,Novg ) 1  d  30) ^ (m = Feb ^: is-leap-year(y) ) 1  d  28) ^ (m = Feb ^ is-leap-year(y) ) 1  d  29) A problem with model-oriented speci cations is that it is easy to overspecify a system, eliminating certain implementations from consideration from the beginning. In VDM a precise notion of overspeci cation has been studied. A model is called biased if it is not possible to de ne an equality test on the data values in the model in terms of the operators de ned. Intuitively, a biased model contains unnecessary redundancy. An unbiased model is viewed as suciently abstract to be the initial high-level speci cation of a system. More concrete models are introduced during the process of re nement (section 3.1). Pre- and post-conditions are used to specify procedures, which may have side e ects. For example, a procedure called TIMEWARP which resets the date (part of the global state) to a point 100 years in the past provided it is not October may be speci ed as follows: TIMEWARP states: DATE pre-TIMEWARP(hy; m; di) =def m 6= Oct post-TIMEWARP(hy; m; di; r) =def r = hy ? 100; m; di In the post-condition, r is the state which is produced by the procedure TIMEWARP. It is necessary to show that TIMEWARP preserves the invariant associated with Date to ensure that TIMEWARP cannot create an invalid date when given a valid date. Decomposition during the re nement process is a matter of breaking down procedures into individual statements which can themselves be speci ed using pre- and post-conditions. When this process has been carried out to completion the result is a program. The ESPRIT project RAISE [BDMP 85] is attempting to provide VDM with a formal foundation and support tools. A similar aim is being pursued by the MetaSoft project at the Polish Academy of Sciences [Bli 87]. 2.2 Z Z is a speci cation language based on the principle that programs and data can be described using set theory just as all of mathematics can be built on a set-theoretic basis. Thus, Z is no more than a formal notation for ordinary naive set theory. The rst version of Z [ASM 79] used a rather clumsy and verbose notation but the current version [Spi 85,87] adopts a more concise and elegant notation based on the idea of a scheme which Rgeneralizes the sort of thing behind mathematical notations like fx j x  7g, x:x + 1, 4x3dx, 8x:p(x), 9x:p(x), all of which involve bound variables of some kind. Data types are modelled in Z using set-theoretic constructions, just as mathematical \data types" like natural numbers, real numbers, ordered pairs and sequences are de ned 7 in set-theoretic terms in mathematics. For example, in specifying a display-oriented text editor [Suf 82] a document is described as an ordered pair consisting of the text before the cursor and the text after the cursor: DOC seq [CH ]  seq [CH ] (CH is the set of characters which may appear in documents.) Two DOC -transforming functions may then be speci ed as follows: back : DOC ! DOC ins : CH ! DOC ! DOC dom back = fl; r j l 6= hig (8(l; r) : DOC ; ch : CH ) back (l  hch i; r) = (l; hch i  r); ins ch(l; r) = (l  hch i; r); (In this speci cation, hi is the empty sequence; hch i is the sequence containing the single character ch; and  is the append function on sequences.) The function back (move one character backwards) is partial; it is applicable only to documents having some text before the cursor. This restriction on its domain is given by the rst axiom in the above speci cation. The set of documents whose cursor is positioned at the beginning of a word can be des...
View Full Document

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture

  • Left Quote Icon

    Student Picture