Lecture_06_Limits on ILP

CA Lecture06 - ILP-limits ([email protected])

5008: Computer Architecture
Chapter 3 – Limits on Instruction-Level Parallelism

Some Facts …
• Interest in multiple-issue arose from wanting to improve performance without affecting the uniprocessor programming model
• Taking advantage of ILP is conceptually simple, but the design problems are amazingly complex in practice
• Conservative in ideas, just faster clocks and bigger structures
• Processors of the last 5 years (Pentium 4, IBM Power 5, AMD Opteron) have the same basic structure and similar sustained issue rates (3 to 4 instructions per clock) as the 1st dynamically scheduled, multiple-issue processors announced in 1995
  – Clocks 10 to 20X faster, caches 4 to 8X bigger, 2 to 4X as many renaming registers, and 2X as many load-store units ⇒ performance 8 to 16X
• Peak vs. delivered performance gap is increasing

Outline
• Review
• Limits to ILP (another perspective)
• Thread Level Parallelism
• Multithreading
• Simultaneous Multithreading
• Power 4 vs. Power 5
• Head to Head: VLIW vs. Superscalar vs. SMT
• Commentary
• Conclusion

Limits to ILP
• Conflicting studies of the amount of ILP, depending on:
  – Benchmarks (vectorized Fortran FP vs. integer C programs)
  – Hardware sophistication
  – Compiler sophistication
• How much ILP is available using existing mechanisms with increasing HW budgets?
• Do we need to invent new HW/SW mechanisms to keep on the processor performance curve?
  – Intel MMX, SSE (Streaming SIMD Extensions): 64 bit ints
  – Intel SSE2: 128 bit, including 2 64-bit Fl. Pt. per clock
  – Motorola AltiVec: 128 bit ints and FPs
  – SuperSPARC multimedia ops, etc.
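
The register widths in the last list translate directly into how many data elements one SIMD instruction can process. A minimal sketch of that arithmetic follows; the function name `lanes` is hypothetical and only illustrates the division, it is not part of any vendor API.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """Return how many elements of `element_bits` fit in one SIMD register.

    Hypothetical helper for illustration only.
    """
    if element_bits <= 0 or register_bits % element_bits != 0:
        raise ValueError("element width must evenly divide register width")
    return register_bits // element_bits


# A 128-bit register holding 64-bit floating-point values, as in the SSE2 bullet.
print(lanes(128, 64))  # 2
```

The same division applies to any register and element width pairing, for example `lanes(128, 32)` for 32-bit elements in a 128-bit register.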

Overcoming Limits
• Advances in compiler technology + significantly new and different hardware techniques may be able to overcome limitations assumed in studies
• However, such advances, when coupled with realistic hardware, are unlikely to overcome these limits in the near future

Limits to ILP
Initial HW Model here; MIPS compilers.
Assumptions for ideal/perfect machine to start:
1. Register renaming – infinite virtual registers ⇒ all register WAW & WAR hazards are avoided
2. Branch prediction – perfect; no mispredictions
3. Jump prediction – all jumps perfectly predicted (returns, case statements)
   2 & 3 ⇒ no control dependencies; perfect speculation & an unbounded buffer of instructions available
4....
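
Under the assumptions listed so far (infinite renaming, perfect branch and jump prediction), only true read-after-write dependences constrain when an instruction can issue. The sketch below estimates the resulting ILP for a small trace. It is an illustration, not the model used in the studies: it additionally assumes unit latency for every instruction, unlimited issue width, and no memory dependences, and the names `ideal_ilp` and `trace` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One instruction: (destination register, source registers). Hypothetical format.
Instr = Tuple[str, Sequence[str]]


def ideal_ilp(trace: List[Instr]) -> float:
    """Instructions per cycle when only RAW dependences limit issue.

    Assumes unit latency and unlimited issue width (illustrative assumptions).
    WAW and WAR hazards are ignored, which models infinite register renaming.
    """
    produced_in: Dict[str, int] = {}  # register -> cycle of its latest producer
    depth = 0  # length of the longest RAW dependence chain, in cycles
    for dest, srcs in trace:
        # An instruction issues one cycle after its latest-arriving operand.
        cycle = 1 + max((produced_in.get(s, 0) for s in srcs), default=0)
        produced_in[dest] = cycle
        depth = max(depth, cycle)
    return len(trace) / depth if depth else 0.0


trace = [
    ("r1", ["r2", "r3"]),  # cycle 1
    ("r4", ["r1"]),        # cycle 2: RAW on r1
    ("r1", ["r5"]),        # cycle 1: WAW/WAR on r1 removed by renaming
    ("r6", ["r1"]),        # cycle 2: RAW on the renamed r1
]
print(ideal_ilp(trace))  # 2.0
```

Without renaming, the third instruction would have to wait for the first two, so the four instructions would form a single chain; with renaming the trace splits into two independent chains of length two.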

This note was uploaded on 08/23/2009 for the course IEE 5513 taught by Professor Cwliu during the Spring '09 term at National Chiao Tung University.
