AppG - G.1 G.2 G.3 G.4 G.5 G.6 G.7 Introduction: Exploiting...

G.1 Introduction: Exploiting Instruction-Level Parallelism Statically
G.2 Detecting and Enhancing Loop-Level Parallelism
G.3 Scheduling and Structuring Code for Parallelism
G.4 Hardware Support for Exposing Parallelism: Predicated Instructions
G.5 Hardware Support for Compiler Speculation
G.6 The Intel IA-64 Architecture and Itanium Processor
G.7 Concluding Remarks
G Hardware and Software for VLIW and EPIC

The EPIC approach is based on the application of massive resources. These resources include more load-store, computational, and branch units, as well as larger, lower-latency caches than would be required for a superscalar processor. Thus, IA-64 gambles that, in the future, power will not be the critical limitation, and that massive resources, along with the machinery to exploit them, will not penalize performance with their adverse effect on clock speed, path length, or CPI factors.

M. Hopkins [2000], in a commentary on the EPIC approach and the IA-64 architecture
In this appendix we discuss compiler technology for increasing the amount of parallelism that we can exploit in a program, as well as hardware support for these compiler techniques. The next section defines when a loop is parallel, how a dependence can prevent a loop from being parallel, and techniques for eliminating some types of dependences. The following section discusses scheduling code to improve parallelism. These two sections serve as an introduction to these techniques. We do not attempt to explain the details of ILP-oriented compiler techniques, since that would take hundreds of pages, rather than the 20 we have allotted. Instead, we view this material as providing general background that will enable the reader to have a basic understanding of the compiler techniques used to exploit ILP in modern computers.

Hardware support for these compiler techniques can greatly increase their effectiveness, and Sections G.4 and G.5 explore such support. The IA-64 represents the culmination of the compiler and hardware ideas for exploiting parallelism statically and includes support for many of the concepts proposed by researchers during more than a decade of research into compiler-based instruction-level parallelism. Section G.6 describes the Intel IA-64 architecture and its second-generation implementation, Itanium 2, and analyzes their performance.

The core concepts that we exploit in statically based techniques (finding parallelism, reducing control and data dependences, and using speculation) are the same ones we saw exploited in Chapter 2 using dynamic techniques. The key difference is that the techniques in this appendix are applied at compile time by the compiler, rather than at run time by the hardware.
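
The idea that a dependence can prevent a loop from being parallel can be sketched with a small experiment. The Python below is only an illustration under stated assumptions: the function names (`run_loop`, `independent_body`, `carried_body`, `order_insensitive`) are hypothetical, and comparing forward against reverse iteration order is a crude stand-in for the dependence analysis a compiler would perform, not the method this appendix develops.

```python
def run_loop(order, body, a, b):
    """Apply body(i, a, b) for each index i, in the given order."""
    for i in order:
        body(i, a, b)
    return a


def independent_body(i, a, b):
    # Iteration i reads and writes only element i,
    # so no iteration depends on another.
    a[i] = a[i] + b[i]


def carried_body(i, a, b):
    # Iteration i reads a[i - 1], which iteration i - 1 wrote:
    # a dependence carried from one iteration to the next.
    a[i] = a[i - 1] + b[i]


def order_insensitive(body, a, b, start):
    """Return True if forward and reverse iteration orders give the same result.

    Illustrative check only: agreement on two orders does not prove
    that a loop is parallel, but disagreement shows that it is not.
    """
    n = len(a)
    forward = run_loop(range(start, n), body, list(a), b)
    backward = run_loop(range(n - 1, start - 1, -1), body, list(a), b)
    return forward == backward


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
print(order_insensitive(independent_body, a, b, 0))  # True
print(order_insensitive(carried_body, a, b, 1))      # False
```

In the second loop, reordering the iterations changes the answer because each iteration consumes a value produced by the previous one; a compiler must either preserve that order or transform the loop to remove the dependence.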
The advantages of compile-time techniques are primarily two: they do not burden run-time execution with any inefficiency, and they can take into account a wider range of the program than a run-time approach might be able to incorporate. As an example of the