chap04

Computer Architecture, Fifth Edition: A Quantitative Approach (The Morgan Kaufmann Series in Computer Architecture and Design)


4 Advanced Pipelining and Instruction-Level Parallelism

“Who’s first?”
“America.”
“Who’s second?”
“Sir, there is no second.”

Dialog between two observers of the sailing race later named “The America’s Cup” and run every few years. This quote was the inspiration for John Cocke’s naming of the IBM research processor as “America.” This processor was the precursor to the RS/6000 series and the first superscalar microprocessor.

4.1 Instruction-Level Parallelism: Concepts and Challenges
4.2 Overcoming Data Hazards with Dynamic Scheduling
4.3 Reducing Branch Penalties with Dynamic Hardware Prediction
4.4 Taking Advantage of More ILP with Multiple Issue
4.5 Compiler Support for Exploiting ILP
4.6 Hardware Support for Extracting More Parallelism
4.7 Studies of ILP
4.8 Putting It All Together: The PowerPC 620
4.9 Fallacies and Pitfalls
4.10 Concluding Remarks
4.11 Historical Perspective and References
Exercises

In the last chapter we saw how pipelining can overlap the execution of instructions when they are independent of one another. This potential overlap among instructions is called instruction-level parallelism (ILP) since the instructions can be evaluated in parallel. In this chapter, we look at a wide range of techniques for extending the pipelining ideas by increasing the amount of parallelism exploited among instructions. We start by looking at techniques that reduce the impact of data and control hazards and then turn to the topic of increasing the ability of the processor to exploit parallelism. We discuss the compiler technology used to increase the ILP and examine the results of a study of available ILP. The Putting It All Together section covers the PowerPC 620, which supports most of the advanced pipelining techniques described in this chapter.
4.1 Instruction-Level Parallelism: Concepts and Challenges

In this section, we discuss features of both programs and processors that limit the amount of parallelism that can be exploited among instructions. We conclude the section by looking at simple compiler techniques for enhancing the exploitation of pipeline parallelism.

The CPI of a pipelined machine is the sum of the base CPI and all contributions from stalls:

Pipeline CPI = Ideal pipeline CPI + Structural stalls + RAW stalls + WAR stalls + WAW stalls + Control stalls

The ideal pipeline CPI is a measure of the maximum performance attainable by the implementation. By reducing each of the terms on the right-hand side, we minimize the overall pipeline CPI and thus increase the instruction throughput per clock cycle. While the focus of the last chapter was on reducing the RAW stalls and the control stalls, in this chapter we will see that the techniques we introduce to further reduce the RAW and control stalls, as well as to reduce the ideal CPI, can increase the importance of dealing with structural, WAR, and WAW stalls. The equation above allows us to characterize the various techniques we examine in this...
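As a rough illustration of the statement that pipeline CPI is the base (ideal) CPI plus all stall contributions, the sketch below adds up per-instruction stall cycles by category. The function name pipeline_cpi, the dictionary keys, and every number are hypothetical values chosen for illustration; none of them are measurements from the text.

```python
def pipeline_cpi(ideal_cpi, stall_cycles_per_instr):
    """Return the ideal CPI plus the sum of all stall contributions.

    stall_cycles_per_instr maps a stall category to its average number
    of stall cycles per instruction (hypothetical inputs).
    """
    return ideal_cpi + sum(stall_cycles_per_instr.values())


# Hypothetical stall profile: average stall cycles per instruction.
stalls = {
    "structural": 0.125,
    "raw": 0.25,
    "war": 0.0,
    "waw": 0.0,
    "control": 0.125,
}

cpi = pipeline_cpi(1.0, stalls)   # ideal CPI of 1.0 is an assumption
ipc = 1.0 / cpi                   # instruction throughput per clock cycle

# Removing one term (here, RAW stalls) lowers the overall CPI.
no_raw = dict(stalls, raw=0.0)
cpi_no_raw = pipeline_cpi(1.0, no_raw)

print(f"CPI = {cpi}, IPC = {ipc:.3f}, CPI without RAW stalls = {cpi_no_raw}")
```

Treating each stall category as a separate term makes it easy to see which term a given technique targets, and how shrinking one term raises the relative weight of the others.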

This document was uploaded on 02/09/2012.
