Testing

    In ordinary computational practice by hand or by desk machines, it is the custom to check every step of the computation and, when an error is found, to localize it by a backward process starting from the first point where the error is noted.
    Norbert Wiener, Cybernetics

Testing and debugging are often spoken as a single phrase but they are not the same thing. To over-simplify, debugging is what you do when you know that a program is broken. Testing is a determined, systematic attempt to break a program that you think is working.

Edsger Dijkstra made the famous observation that testing can demonstrate the presence of bugs, but not their absence. His hope is that programs can be made correct by construction, so that there are no errors and thus no need for testing. Though this is a fine goal, it is not yet realistic for substantial programs. So in this chapter we'll focus on how to test to find errors rapidly, efficiently, and effectively.

Thinking about potential problems as you code is a good start. Systematic testing, from easy tests to elaborate ones, helps ensure that programs begin life working correctly and remain correct as they grow. Automation helps to eliminate manual processes and encourages extensive testing. And there are plenty of tricks of the trade that programmers have learned from experience.

One way to write bug-free code is to generate it by a program. If some programming task is understood so well that writing the code seems mechanical, then it should be mechanized. A common case occurs when a program can be generated from a specification in some specialized language. For example, we compile high-level languages into assembly code; we use regular expressions to specify patterns of text; we use notations like SUM(A1:A50) to represent operations over a range of cells in a spreadsheet.
In such cases, if the generator or translator is correct and if the specification is correct, the resulting program will be correct too. We will cover this rich topic in more detail in Chapter 9; in this chapter we will talk briefly about ways to create tests from compact specifications.

6.1 Test as You Write the Code

The earlier a problem is found, the better. If you think systematically about what you are writing as you write it, you can verify simple properties of the program as it is being constructed, with the result that your code will have gone through one round of testing before it is even compiled. Certain kinds of bugs never come to life.

Test code at its boundaries. One technique is boundary condition testing: as each small piece of code is written (a loop or a conditional statement, for example), check right then that the condition branches the right way or that the loop goes through the proper number of times. This process is called boundary condition testing because you are probing at the natural boundaries within the program and data, such as nonexistent or empty input, a single input item, an exactly full array, and so on. The idea is that most bugs occur at boundaries. If a piece of code is going to fail, it will likely fail at a boundary. Conversely, if it works at its boundaries, it's likely to work elsewhere too.

This fragment, modeled on fgets, reads characters until it finds a newline or fills a buffer:

    ?   int i;
    ?   char s[MAX];
    ?
    ?   for (i = 0; (s[i] = getchar()) != '\n' && i < MAX-1; ++i)
    ?       ;
    ?   s[--i] = '\0';

Imagine that you have just written this loop. Now simulate it mentally as it reads a line. The first boundary to test is the simplest: an empty line.
If you start with a line that contains only a single newline, it's easy to see that the loop stops on the first iteration with i set to zero, so the last line decrements i to -1 and thus writes a null byte into s[-1], which is before the beginning of the array. Boundary condition testing finds the error.

If we rewrite the loop to use the conventional idiom for filling an array with input characters, it looks like this:

    ?   for (i = 0; i < MAX-1; i++)
    ?       if ((s[i] = getchar()) == '\n')
    ?           break;
    ?   s[i] = '\0';

Repeating the original boundary test, it's easy to verify that a line with just a newline is handled correctly: i is zero, the first input character breaks out of the loop, and '\0' is stored in s[0]. Similar checking for inputs of one and two characters followed by a newline gives us confidence that the loop works near that boundary.

There are other boundary conditions to check, though. If the input contains a long line or no newlines, that is protected by the check that i stays less than MAX-1. But what if the input is empty, so the first call to getchar returns EOF? We must check for that:

    ?   for (i = 0; i < MAX-1; i++)
    ?       if ((s[i] = getchar()) == '\n' || s[i] == EOF)
    ?           break;
    ?   s[i] = '\0';

Boundary condition testing can catch lots of bugs, but not all of them. We will return to this example in Chapter 8, where we will show that it still has a portability bug.

The next step is to check input at the other boundary, where the array is nearly full, exactly full, and over-full, particularly if the newline arrives at the same time. We won't write out the details here, but it's a good exercise. Thinking about the boundaries raises the question of what to do when the buffer fills before a '\n' occurs; this gap in the specification should be resolved early, and testing boundaries helps to identify it.

Boundary condition checking is effective for finding off-by-one errors.
With practice, it becomes second nature, and many trivial bugs are eliminated before they ever happen.

Test pre- and post-conditions. Another way to head off problems is to verify that expected or necessary properties hold before (pre-condition) and after (post-condition) some piece of code executes. Making sure that input values are within range is a common example of testing a pre-condition. This function for computing the average of n elements in an array has a problem if n is less than or equal to zero:

    ?   double avg(double a[], int n)
    ?   {
    ?       int i;
    ?       double sum;
    ?
    ?       sum = 0.0;
    ?       for (i = 0; i < n; i++)
    ?           sum += a[i];
    ?       return sum / n;
    ?   }

What should avg do if n is zero? An array with no elements is a meaningful concept although its average value is not. Should avg let the system catch the division by zero? Abort? Complain? Quietly return some innocuous value? What if n is negative, which is nonsensical but not impossible? As suggested in Chapter 4, our preference would probably be to return 0 as the average if n is less than or equal to zero:

    return n <= 0 ? 0.0 : sum/n;

but there's no single right answer. The one guaranteed wrong answer is to ignore the problem. An article in the November, 1998 Scientific American describes an incident aboard the USS Yorktown, a guided-missile cruiser. A crew member mistakenly entered a zero for a data value, which resulted in a division by zero, an error that cascaded and eventually shut down the ship's propulsion system. The Yorktown was dead in the water for a couple of hours because a program didn't check for valid input.

Use assertions. C and C++ provide an assertion facility in <assert.h> that encourages adding pre- and post-condition tests. Since a failed assertion aborts the program, these are usually reserved for situations where a failure is really unexpected and there's no way to recover.
We might augment the code above with an assertion before the loop:

    assert(n > 0);

If the assertion is violated, it will cause the program to abort with a standard message:

    Assertion failed: n > 0, file avgtest.c, line 7
    Abort(crash)

Assertions are particularly helpful for validating properties of interfaces because they draw attention to inconsistencies between caller and callee and may even indicate who's at fault. If the assertion that n is greater than zero fails when the function is called, it points the finger at the caller rather than at avg itself as the source of trouble. If an interface changes but we forget to fix some routine that depends on it, an assertion may catch the mistake before it causes real trouble.

Program defensively. A useful technique is to add code to handle "can't happen" cases, situations where it is not logically possible for something to happen but (because of some failure elsewhere) it might anyway. Adding a test for zero or negative array lengths to avg was one example. As another example, a program processing grades might expect that there would be no negative or huge values but should check anyway:

    if (grade < 0 || grade > 100)   /* can't happen */
        letter = '?';
    else if (grade >= 90)
        letter = 'A';
    else
        ...

This is an example of defensive programming: making sure that a program protects itself against incorrect use or illegal data. Null pointers, out of range subscripts, division by zero, and other errors can be detected early and warned about or deflected. Defensive programming (no pun intended) might well have caught the zero-divide problem on the Yorktown.

Check error returns. One often-overlooked defense is to check the error returns from library functions and system calls. Return values from input routines such as fread and fscanf should always be checked for errors, as should any file open call such as fopen. If a read or open fails, computation cannot proceed correctly.

Checking the return code from output functions like fprintf or fwrite will catch the error that results from trying to write a file when there is no space left on the disk. It may be sufficient to check the return value from fclose, which returns EOF if any error occurred during any operation, and zero otherwise.

    fp = fopen(outfile, "w");
    while (...)     /* write output to outfile */
        fprintf(fp, ...);
    if (fclose(fp) == EOF) {    /* any errors? */
        /* some output error occurred */
    }

Output errors can be serious. If the file being written is the new version of a precious file, this check will save you from removing the old file if the new one was not written successfully.

The effort of testing as you go is minimal and pays off handsomely. Thinking about testing as you write a program will lead to better code, because that's when you know best what the code should do. If instead you wait until something breaks, you will probably have forgotten how the code works. Working under pressure, you will need to figure it out again, which takes time, and the fixes will be less thorough and more fragile because your refreshed understanding is likely to be incomplete.

Exercise 6-1. Check out these examples at their boundaries, then fix them as necessary according to the principles of style in Chapter 1 and the advice in this chapter.

(a) This is supposed to compute factorials:

    ?   int factorial(int n)
    ?   {
    ?       int fac;
    ?       fac = 1;
    ?       while (n--)
    ?           fac *= n;
    ?       return fac;
    ?   }

(b) This is supposed to print the characters of a string one per line:

    ?   i = 0;
    ?   do {
    ?       putchar(s[i++]);
    ?       putchar('\n');
    ?   } while (s[i] != '\0');

(c) This is meant to copy a string from source to destination:

    ?   void strcpy(char *dest, char *src)
    ?   {
    ?       int i;
    ?
    ?       for (i = 0; src[i] != '\0'; i++)
    ?           dest[i] = src[i];
    ?   }

(d) Another string copy, which attempts to copy n characters from s to t:

    ?   void strncpy(char *t, char *s, int n)
    ?   {
    ?       while (n > 0 && *s != '\0') {
    ?           *t = *s;
    ?           t++;
    ?           s++;
    ?           n--;
    ?       }
    ?   }

(e) A numerical comparison:

    ?   if (i > j)
    ?       printf("%d is greater than %d.\n", i, j);
    ?   else
    ?       printf("%d is smaller than %d.\n", i, j);

(f) A character class test:

    ?   if (c >= 'A' && c <= 'Z') {
    ?       if (c <= 'L')
    ?           cout << "first half of alphabet";
    ?       else
    ?           cout << "second half of alphabet";
    ?   }

Exercise 6-2. As we are writing this book in late 1998, the Year 2000 problem looms as perhaps the biggest boundary condition problem ever.

(a) What dates would you use to check whether a system is likely to work in the year 2000? Supposing that tests are expensive to perform, in what order would you do your tests after trying January 1, 2000 itself?

(b) How would you test the standard function ctime, which returns a string representation of the date in this form:

    Fri Dec 31 23:58:27 EST 1999\n\0

Suppose your program calls ctime. How would you write your code to defend against a flawed implementation?

(c) Describe how you would test a calendar program that prints output like this:

        January 2000
     S  M Tu  W Th  F  S
                       1
     2  3  4  5  6  7  8
     9 10 11 12 13 14 15
    16 17 18 19 20 21 22
    23 24 25 26 27 28 29
    30 31

(d) What other time boundaries can you think of in systems that you use, and how would you test to see whether they are handled correctly?

6.2 Systematic Testing

It's important to test a program systematically so you know at each step what you are testing and what results you expect. You need to be orderly so you don't overlook anything, and you must keep records so you know how much you have done.

Test incrementally. Testing should go hand in hand with program construction. A "big bang" where one writes the whole program, then tests it all at once, is much harder and more time-consuming than an incremental approach. Write part of a program, test it, add some more code, test that, and so on. If you have two packages that have been written and tested independently, test that they work together when you finally connect them.
For instance, when we were testing the CSV programs in Chapter 4, the first step was to write just enough code to read the input; this let us validate input processing. The next step was to split input lines at commas. Once these parts were working, we moved on to fields with quotes, and then gradually worked up to testing everything.

Test simple parts first. The incremental approach also applies to how you test features. Tests should focus first on the simplest and most commonly executed features of a program; only when those are working properly should you move on. This way, at each stage, you expose more to testing and build confidence that basic mechanisms are working correctly. Easy tests find the easy bugs. Each test does the minimum to ferret out the next potential problem. Although each bug is harder to trigger than its predecessor, it is not necessarily harder to fix.

In this section, we'll talk about ways to choose effective tests and in what order to apply them; in the next two sections, we'll talk about how to mechanize the process so that it can be carried out efficiently. The first step, at least for small programs or individual functions, is an extension of the boundary condition testing that we described in the previous section: systematic testing of small cases.

Suppose we have a function that performs binary search in an array of integers.
We would begin with these tests, arranged in order of increasing complexity:

    • search an array with no elements
    • search an array with one element and a trial value that is
        - less than the single element in the array
        - equal to the single element
        - greater than the single element
    • search an array with two elements and trial values that
        - check all five possible positions
    • check behavior with duplicate elements in the array and trial values
        - less than the value in the array
        - equal to the value
        - greater than the value
    • search an array with three elements as with two elements
    • search an array with four elements as with two and three

If the function gets past this unscathed, it's likely to be in good shape, but it could still be tested further.

This set of tests is small enough to perform by hand, but it is better to create a test scaffold to mechanize the process. The following driver program is about as simple as we can manage. It reads input lines that contain a key to search for and an array size; it creates an array of that size containing values 1, 3, 5, ...; and it searches the array for the key.

    #include <stdio.h>

    /* bintest main: scaffold for testing binsearch */
    int main(void)
    {
        int i, key, nelem, arr[1000];

        while (scanf("%d %d", &key, &nelem) != EOF) {
            for (i = 0; i < nelem; i++)
                arr[i] = 2*i + 1;
            printf("%d\n", binsearch(key, arr, nelem));
        }
        return 0;
    }

This is simpleminded but it shows that a useful test scaffold need not be big, and it is easily extended to perform more of these tests and require less manual intervention.

Know what output to expect. For all tests, it's necessary to know what the right answer is; if you don't, you're wasting your time. This might seem obvious, since for many programs it's easy to tell whether the program is working. For example, either a copy of a file is a copy or it isn't. The output from a sort is sorted or it isn't; it must also be a permutation of the original input.

Most programs are more difficult to characterize: compilers (does the output properly translate the input?), numerical algorithms (is the answer within error tolerance?), graphics (are the pixels in the right places?), and so on. For these, it's especially important to validate the output by comparing it with known values.

    • To test a compiler, compile and run the test files. The test programs should in turn generate output, and their results should be compared to known ones.
    • To test a numerical program, generate test cases that explore the edges of the algorithm, trivial cases as well as hard ones. Where possible, write code that verifies that output properties are sane. For example, the output of a numerical integrator can be tested for continuity, and for agreement with closed-form solutions.
    • To test a graphics program, it's not enough to see if it can draw a box; instead read the box back from the screen and check that its edges are exactly where they should be.

If the program has an inverse, check that its application recovers the input. Encryption and decryption are inverses, so if you encrypt something and can't decrypt it, something is wrong. Similarly, lossless compression and expansion algorithms should be inverses. Programs that bundle files together should extract them unchanged. Sometimes there are multiple methods for inversion; check all combinations.

Verify conservation properties. Many programs preserve some property of their inputs. Tools like wc (count lines, words, and characters) and sum (compute a checksum) can verify that outputs are of the same size, have the same number of words, contain the same bytes in some order, and the like. Other programs compare files for identity (cmp) or report differences (diff). These programs or similar ones are readily available for most environments, and are well worth acquiring.

A byte-frequency program can be used to check for conservation of data and also to spot anomalies like non-text characters in supposedly text-only files; here's a version that we call freq:

    #include <stdio.h>
    #include <ctype.h>
    #include <limits.h>

    unsigned long count[UCHAR_MAX+1];

    /* freq main: display byte frequency counts */
    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF)
            count[c]++;
        for (c = 0; c <= UCHAR_MAX; c++)
            if (count[c] != 0)
                printf("%.2x %c %lu\n",
                    c, isprint(c) ? c : '-', count[c]);
        return 0;
    }

Conservation properties can be verified within a program, too. A function that counts the elements in a data structure provides a trivial consistency check. A hash table should have the property that every element inserted into it can be retrieved. This condition is easy to check with a function that dumps the contents of the table into a file or an array. At any time, the number of insertions into a data structure minus the number of deletions must equal the number of elements contained, a condition that is easy to verify.

Compare independent implementations. Independent implementations of a library or program should produce the same answers. For example, two compilers should produce programs that behave the same way on the same machine, at least in most situations. Sometimes an answer can be computed in two different ways, or you might be able to write a trivial version of a program to use as a slow but independent comparison. If two unrelated programs get the same answers, there is a good chance that they are correct; if they get different answers, at least one is wrong.

One of the authors once worked with another person on a compiler for a new machine. The work of debugging the code generated by the compiler was split: one person wrote the software that encoded instructions for the target machine, and the other wrote the disassembler for the debugger.
This meant that any error of interpretation or implementation of the instruction set was unlikely to be duplicated between the two components. When the compiler miscoded an instruction, the disassembler was sure to notice. All the early output of the compiler was run through the disassembler and verified against the compiler's own debugging printouts. This strategy worked very well in practice, instantly catching mistakes in both pieces. The only difficult, protracted debugging occurred when both people interpreted an ambiguous phrase in the architecture description in the same incorrect way.

Measure test coverage. One goal of testing is to make sure that every statement of a program has been executed sometime during the sequence of tests; testing cannot be considered complete unless every line of the program has been exercised by at least one test. Complete coverage is often quite difficult to achieve. Even leaving aside "can't happen" statements, it is hard to use normal inputs to force a program to go through particular statements.

There are commercial tools for measuring coverage. Profilers, often included as part of compiler suites, provide a way to compute a statement frequency count for each program statement that indicates the coverage achieved by specific tests.

We tested the Markov program of Chapter 3 with a combination of these techniques. The last section of this chapter describes those tests in detail.

Exercise 6-3. Describe how you would test freq.

Exercise 6-4. Design and implement a version of freq that measures the frequencies of other types of data values, such as 32-bit integers or floating-point numbers. Can you make one version of the program handle a variety of types elegantly?
This note was uploaded on 04/04/2008 for the course EECS 215 taught by Professor Phillips during the Winter '08 term at University of Michigan.