
Computer Architecture, Fifth Edition: A Quantitative Approach (The Morgan Kaufmann Series in Computer Architecture and Design)

3 Instruction-Level Parallelism and its Dynamic Exploitation

“Who’s first?”
“America.”
“Who’s second?”
“Sir, there is no second.”

Dialog between two observers of the sailing race later named “The America’s Cup” and run every few years. This quote was the inspiration for John Cocke’s naming of the IBM research processor as “America.” This processor was the precursor to the RS/6000 series and the first superscalar microprocessor.
3.1 Instruction-Level Parallelism: Concepts and Challenges
3.2 Overcoming Data Hazards with Dynamic Scheduling
3.3 Dynamic Scheduling: Examples and the Algorithm
3.4 Reducing Branch Costs with Dynamic Hardware Prediction
3.5 High Performance Instruction Delivery
3.6 Taking Advantage of More ILP with Multiple Issue
3.7 Hardware-Based Speculation
3.8 Studies of the Limitations of ILP
3.9 Limitations on ILP for Realizable Processors
3.10 Putting It All Together: The P6 Microarchitecture
3.11 Another View: Thread Level Parallelism
3.12 Crosscutting Issues: Using an ILP Datapath to Exploit TLP
3.13 Fallacies and Pitfalls
3.14 Concluding Remarks
3.15 Historical Perspective and References
Exercises

3.1 Instruction-Level Parallelism: Concepts and Challenges

All processors since about 1985, including those in the embedded space, use pipelining to overlap the execution of instructions and improve performance. This potential overlap among instructions is called instruction-level parallelism (ILP), since the instructions can be evaluated in parallel. In this chapter and the next, we look at a wide range of techniques for extending the pipelining ideas by increasing the amount of parallelism exploited among instructions.

This chapter is at a considerably more advanced level than the material in Appendix A. If you are not familiar with the ideas in Appendix A, you should review that appendix before venturing into this chapter.

We start this chapter by looking at the limitations imposed by data and control hazards and then turn to the topic of increasing the ability of the processor to exploit parallelism. Section 3.1 introduces a large number of concepts, which we build on throughout these two chapters. While some of the more basic material in this chapter could be understood without all of the ideas in Section 3.1, this basic material is important to later sections of this chapter as well as to Chapter 4.

There are two largely separable approaches to exploiting ILP. This chapter covers techniques that are largely dynamic and depend on the hardware to locate the parallelism. The next chapter focuses on techniques that are static and rely much more on software. In practice, this partitioning between dynamic and static, and between hardware-intensive and software-intensive, is not clean, and techniques from one camp are often used by the other. Nonetheless, for exposition purposes we have separated the two approaches and tried to indicate where an approach is transferable.