...and floating-point multiplication and addition were considered important operations in the design of the Pentium III, even though a significant amount of hardware is required to achieve the low latencies and high degree of pipelining shown. Division, on the other hand, is relatively infrequent and difficult to implement with short latency or issue time, so it is relatively slow.

5.7.3 A Closer Look at Processor Operation
As a tool for analyzing the performance of a machine-level program executing on a modern processor, we have developed a more detailed textual notation to describe the operations generated by the instruction decoder, as well as a graphical notation to show the processing of operations by the functional units. Neither of these notations exactly represents the implementation of a specific, real-life processor. They are simply methods to help understand how a processor can take advantage of parallelism and branch prediction in executing a program.

Translating Instructions into Operations
We present our notation using combine4 (Figure 5.10), our fastest code up to this point, as an example. We focus just on the computation performed by the loop, since this is the dominating factor in performance for large vectors. We consider the ca...
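Figure 5.10 is not reproduced in this excerpt. As a rough sketch of the kind of loop under discussion, the code below assumes that combine4 accumulates its result in a local variable and writes it to the destination once, after the loop. The vector type and the names vec_rec, vec_length, get_vec_start, data_t, IDENT, and OPER are stand-ins chosen for illustration, and integer multiplication is assumed as the combining operation.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative assumptions: element type, identity value, and operation. */
typedef int data_t;
#define IDENT 1
#define OPER  *

/* Hypothetical minimal vector record: a length plus a pointer to the data. */
typedef struct {
    long    len;
    data_t *data;
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Combine all elements with OPER, accumulating in the local variable x. */
void combine4(vec_ptr v, data_t *dest)
{
    assert(v != NULL && dest != NULL);
    long length = vec_length(v);
    data_t *data = get_vec_start(v);
    data_t x = IDENT;

    /* The loop body is one load and one OPER per element. */
    for (long i = 0; i < length; i++) {
        x = x OPER data[i];
    }
    *dest = x;   /* single write to memory, after the loop */
}
```

In this sketch each iteration depends on the previous one only through x, so the latency of the combining operation is what limits how quickly successive iterations can complete.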
This note was uploaded on 09/02/2010 for the course ELECTRICAL 360 taught by Professor Schultz during the Spring '10 term at BYU.
