MIPS . . . OR . . . OOPS?
During the 1970s and 1980s, competition was fierce between two leading computer makers: IBM and Digital Equipment Corporation (DEC). Although
DEC did not manufacture huge mainframe systems, its largest systems were suitable for t
FIGURE 10.2 Disk Track Seeks Using the First-Come, First-Served Disk
FIGURE 10.3 Disk Arm Motion for the Shortest Seek Time First
Empirical studies have shown that over 50% of disk accesses are sequential in
nature, and that prefetching increases performance by 40%, on average.
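The mechanics of sequential prefetching can be sketched with a toy cache model. The sketch below is our own illustration, not from the text: the cache capacity, block numbering, and the access trace are invented for demonstration, and real hardware prefetchers are far more sophisticated.

```python
# Toy cache that prefetches the next sequential block on every access.
# Capacity, block numbers, and the trace are illustrative assumptions.

class PrefetchingCache:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.blocks = []          # most recently used at the end
        self.hits = 0
        self.misses = 0

    def _install(self, block):
        if block in self.blocks:
            self.blocks.remove(block)      # refresh LRU position
        elif len(self.blocks) >= self.capacity:
            self.blocks.pop(0)             # evict the least recently used block
        self.blocks.append(block)

    def access(self, block):
        if block in self.blocks:
            self.hits += 1
        else:
            self.misses += 1
        self._install(block)
        self._install(block + 1)           # sequential prefetch: bring in block n+1

# A purely sequential trace benefits enormously: after the first (compulsory)
# miss, every block has already been prefetched by the previous access.
cache = PrefetchingCache()
for block in range(20):
    cache.access(block)
```

On a random trace, those same prefetched blocks would displace useful data without ever being referenced, which is exactly the cache pollution the text describes.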
The downside of prefetching is the phenomenon of cache pollution. Cache
pollution occurs when the cache is
considerations in disk performance are those that present us with opportunities for
tuning and adjustment, tasks that should be a routine part of system operations.
Disk arm motion is the greatest consumer of service time within most disk
FIGURE 10.1 Disk Queue Time Plotted Against Utilization Percentage
particular disk track), and transfer rate (the rate at which the read/write head
mined by the speed of the disk and the rate at which requests arrive in the service
queue. Stated mathematically:
Utilization = Request Arrival Rate / Disk Service Rate
where the arrival rate is given in requests per second, and the disk service rate is
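The utilization formula given above is simple enough to check numerically. In the sketch below, the arrival and service rates are made-up example values, not figures from the text:

```python
# Sketch of the disk utilization formula. Example rates are assumptions.

def disk_utilization(arrival_rate, service_rate):
    """Utilization = request arrival rate / disk service rate,
    both expressed in requests per second."""
    return arrival_rate / service_rate

# 40 requests/second arriving at a disk that can service 50 requests/second:
u = disk_utilization(40, 50)   # 0.8, i.e., the disk is 80% utilized
```

As Figure 10.1 suggests, queue time grows sharply as this ratio approaches 1, which is why utilization is worth monitoring as part of routine tuning.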
Eliminate all unnecessary branches.
Use iteration instead of recursion when possible.
Build conditional statements (e.g., if, switch, case) with the most probable cases first.
Declare variables in a structure in order of size with the largest ones first.
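Two of the tips above lend themselves to a short, hedged illustration. The functions below are our own examples (the character-frequency assumption in `classify` is illustrative, not from the text); the structure-ordering tip is specific to languages like C and is not shown here:

```python
# Illustrating two optimization tips: prefer iteration over recursion,
# and test the most probable case first in a conditional.

def factorial_recursive(n):
    # Each call pushes a stack frame; deep inputs risk stack overflow.
    return 1 if n <= 1 else n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # Same result, constant stack depth, and generally easier to optimize.
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def classify(ch):
    # In typical text most characters are letters, so that case is tested
    # first; the later, rarer cases pay the cost of extra comparisons.
    if ch.isalpha():          # most probable case first
        return "letter"
    elif ch.isspace():
        return "space"
    else:
        return "other"
```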
architecture to write superior programs. We have provided a sidebar containing a
list of things to keep in mind while you are optimizing your program code. We
invite you to ponder the ways in which each of these tips takes various system
components into a
In fixed prediction, when the assumption is that the branch is always taken,
preparation is also made for an incorrect prediction. State information is saved
before the speculative processing begins. If the guess is correct, this saved information is deleted.
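The save-and-restore discipline described above can be modeled in a few lines. This is a toy software sketch of the idea, not a description of actual hardware; the register names and the speculative work performed are invented:

```python
# Toy model of checkpointing before speculative execution.
# Register names and the speculative work are illustrative assumptions.

def execute_branch(registers, predicted_taken, actually_taken, speculative_work):
    checkpoint = dict(registers)       # save state before speculating
    speculative_work(registers)        # execute down the predicted path
    if predicted_taken == actually_taken:
        return registers               # prediction correct: discard checkpoint
    return checkpoint                  # mispredicted: roll back to saved state

double_r1 = lambda r: r.update(r2=r["r1"] * 2)

# Misprediction: the speculative update to r2 is squashed.
result_wrong = execute_branch({"r1": 5, "r2": 0},
                              predicted_taken=True, actually_taken=False,
                              speculative_work=double_r1)

# Correct prediction: the speculative update is kept.
result_right = execute_branch({"r1": 5, "r2": 0},
                              predicted_taken=True, actually_taken=True,
                              speculative_work=double_r1)
```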
FIGURE 10.4 Disk Arm Motion for the SCAN Disk Scheduling Algorithm
46, 52, 62, 75, 35, 28, 21, 19, 6.
The disk arm passes over track 99 between reading tracks 75 and 35, and then
travels to track zero after reading track 6.
continually sweep over all 100 disk tracks. But, in fact, the lowest required track
is 6 and the highest is 75. Thus, if the disk arm changes direction only when the
highest- and lowest-numbered tracks are read, the arm will traverse only 69
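The arm-travel arithmetic above is easy to verify in code. In the sketch below, the service order comes from the text, but the starting head position (track 40) is our assumption, chosen only so that the first sweep moves upward:

```python
# Arm travel when the arm reverses at the highest and lowest requested
# tracks (the LOOK variant of SCAN). Starting track 40 is an assumption.

def look_travel(start, requests):
    highest, lowest = max(requests), min(requests)
    # Sweep up from the start to the highest request, then down to the lowest.
    return (highest - start) + (highest - lowest)

requests = [46, 52, 62, 75, 35, 28, 21, 19, 6]
span = max(requests) - min(requests)       # 75 - 6 = 69 tracks between extremes
total = look_travel(40, requests)          # 35 up + 69 down = 104 tracks
```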
24. A certain microprocessor requires either 2, 3, 4, 8, or 12 machine cycles to perform
various operations. Twenty-five percent of its instructions require 2 machine cycles,
20% require 3 machine cycles, 17.5% require 4 machine cycles, 12.5% require 8
to find identical (or comparable) systems that have results posted on both sites. Discuss your findings.
We mentioned that a large volume of data is gathered during system probe traces. To
give you some idea of the actual
B to C, and A to C) using the arithmetic and geometric means. Are there any surprises? Explain.
6. What are the limitations of synthetic benchmarks such as Whetstone and Dhrystone?
Do you think that the concept of a synthetic benchmark could be extended to overcome these limitations? Explain your answer.
7. What would you say to a vendor that tells
REVIEW OF ESSENTIAL TERMS AND CONCEPTS
1. Explain what is meant when we say that a program or system is memory bound.
What other types of bindings have we discussed?
2. What does Amdahl's Law tell us about performance optimization?
3. Which of the means is
There is no shortage of information concerning the performance of I/O systems.
Hennessy and Patterson's book (1996) explores this topic in great detail. An excellent
(though dated) investigation of disk scheduling policies can be found in Oney (1975).
In the context of overall computer design, one of the most respected treatments of
computer performance is presented in Hennessy and Patterson (1996). Their book
integrates performance considerations throughout its exposition of all facets
This chapter has presented the two aspects of computer performance: performance assessment and performance optimization. You should come away from
this chapter knowing the key measurements of computer performance and how to
PROGRAM OPTIMIZATION TIPS
Give the compiler as much information as possible about what you are doing.
Use constants and local variables where possible. If your language permits
them, define prototypes and declare static functions. Use arrays instead of
tion type. This information can then be used to achieve a better instruction balance. The idea is to attempt to write your loops with the best mix of instructions
for a given architecture (e.g., loads, stores, integer operations, floating-point operations).
Users of the SPEC benchmarks pay an administrative fee for the suite's source
code and instructions for its installation and compilation. Manufacturers are encouraged (but not required) to submit a report that includes the results of the benchmarks
this metric, these systems would be utterly worthless! Nevertheless, MFLOPS,
like MIPS, is a popular metric with marketing people because it sounds like a
hard value and represents a simple and intuitive concept.
Despite their shortcomings, clock speed, M
SPEC CINT2000 Benchmark Kernels
Compresses a TIFF (Tagged Image File Format) file, a Web
server log, binary program code, "random" data, and a tar
Moler, and Pete Stewart of the Argonne National Laboratory developed Linpack
in 1984 to measure the performance of supercomputers. It was originally written
in FORTRAN 77 and has subsequently been rewritten in C and Java. Although it
has some serious shor
SPEC (Standard Performance Evaluation Corporation) was founded in 1988
by a consortium of computer manufacturers in cooperation with the Electronic Engineering Times. SPEC's main objective is to establish equitable and realistic methods
for computer perfor
The VAX 11/780 was a commercial success. The system was so popular that it
became the standard 1 MIPS system. For many years, the VAX 11/780 was the reference system for numerous benchmarks. The results of these benchmarks could be
extrapolated to infer a
results for Software B running Test Y. We show that Test X is almost equivalent
to Test Y. From this, we conclude that Software A is faster. And by how much
are Tests X and Y not equivalent? Is it possible that Test X is contrived to make
Software A look
Then take the twelfth root of this product:
(4.48 × 10^29)^(1/12) ≈ 296
Thus, the CINT metric for this system is (a fairly impressive) 296. If this result
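The twelfth-root arithmetic is straightforward to verify. The product of the twelve ratios comes from the text; the computation below simply confirms that its twelfth root (the geometric mean) is approximately 296:

```python
# Checking the geometric-mean arithmetic from the text.

product_of_ratios = 4.48e29            # product of the 12 runtime ratios
cint = product_of_ratios ** (1 / 12)   # twelfth root, approximately 296
```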
were obtained when running benchmarks compiled with standard (conservative)
compiler settings, it would be r