08_scheduling-1up

CIS 501 Computer Architecture
Unit 8: Static and Dynamic Scheduling
CIS 501 (Martin): Scheduling
Slides originally developed by Drew Hilton, Amir Roth and Milo Martin at University of Pennsylvania

This Unit: Static & Dynamic Scheduling
• Pipelining and superscalar review
• Code scheduling
  • To reduce pipeline stalls
  • To increase ILP (insn level parallelism)
• Two approaches
  • Static scheduling by the compiler
  • Dynamic scheduling by the hardware
[Figure: system layers: App, System software, CPU, Mem, I/O]
Readings
• Textbook (MA:FSPTCM)
  • Sections 3.3.1 – 3.3.4 (but not “Sidebar:”)
  • Sections 5.0-5.2, 5.3.3, 5.4, 5.5
• Paper
  • “Memory Dependence Prediction using Store Sets” by Chrysos & Emer

Pipelining Review
• Increases clock frequency by staging instruction execution
• “Scalar” pipelines have a best-case CPI of 1
• Challenges:
  • Data and control dependencies further worsen CPI
  • Data: even with full bypassing, load-to-use stalls remain
  • Control: use branch prediction to mitigate penalty
• Big win, done by all processors today
• How many stages (depth)?
  • Five stages is a pretty good minimum
  • Intel Pentium II/III: 12 stages
  • Intel Pentium 4: 22+ stages
  • Intel Core 2: 14 stages
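To make the effect of dependencies on CPI concrete, here is a minimal sketch of the usual "base CPI plus stall cycles per instruction" arithmetic. The function name `effective_cpi` and every workload number below are assumptions chosen for illustration; none of them come from the slides.

```python
def effective_cpi(base_cpi, stall_sources):
    """Base CPI plus the average stall cycles each instruction incurs.

    stall_sources is a list of (events_per_instruction, penalty_cycles).
    """
    return base_cpi + sum(freq * penalty for freq, penalty in stall_sources)


# Hypothetical workload mix (illustrative numbers only):
# 25% of instructions are loads, half of them are followed immediately
# by a dependent instruction, costing one load-to-use stall cycle each.
load_use = (0.25 * 0.5, 1)
# 20% of instructions are branches, 5% of those are mispredicted,
# with an assumed two-cycle misprediction penalty.
mispredict = (0.20 * 0.05, 2)

cpi = effective_cpi(1.0, [load_use, mispredict])
print(f"effective CPI: {cpi:.3f}")
```

Under these assumed numbers the load-to-use stalls dominate, which is why reordering code around loads is a natural first target for scheduling.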
Pipeline Diagram
• Use compiler scheduling to reduce load-use stall frequency
• “d*” is data dependency, “s*” is structural hazard, “p*” is propagation hazard (only n instructions per stage)

                    1   2   3   4   5   6   7   8   9
add $3←$2,$1        F   D   X   M   W
lw $4←4($3)             F   D   X   M   W
addi $6←$4,1                F   D   d*  X   M   W
sub $8←$3,$1                    F   p*  D   X   M   W

                    1   2   3   4   5   6   7   8   9
add $3←$2,$1        F   D   X   M   W
lw $4←4($3)             F   D   X   M   W
sub $8←$3,$1                F   D   X   M   W
addi $6←$4,1                    F   D   X   M   W
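The two diagrams above can be checked with a small timing model. This is a sketch under stated assumptions, not the course's simulator: it models an in-order scalar F-D-X-M-W pipeline with full bypassing, where an ALU result can feed the next cycle's X stage and a load result can feed X only two cycles after the load's own X stage. The function name `total_cycles` and the tuple encoding of instructions are hypothetical.

```python
def total_cycles(program):
    """Cycle in which the last instruction completes writeback (W).

    Each instruction is (opcode, dest_register, source_registers).
    """
    if not program:
        return 0
    producer = {}  # register -> (X cycle of its latest writer, writer is a load?)
    prev_x = 2     # first instruction: F in cycle 1, D in cycle 2, X in cycle 3
    for opcode, dest, sources in program:
        x = prev_x + 1  # in-order scalar: at most one instruction enters X per cycle
        for reg in sources:
            if reg in producer:
                producer_x, is_load = producer[reg]
                # Load data is bypassed after M, ALU data after X.
                x = max(x, producer_x + (2 if is_load else 1))
        producer[dest] = (x, opcode == "lw")
        prev_x = x
    return prev_x + 2  # M and W follow the last X stage


original = [
    ("add", "$3", ["$2", "$1"]),
    ("lw", "$4", ["$3"]),
    ("addi", "$6", ["$4"]),   # uses the load result immediately: one stall
    ("sub", "$8", ["$3", "$1"]),
]
# Compiler-scheduled order: the independent sub fills the load-use slot.
scheduled = [original[0], original[1], original[3], original[2]]

print(total_cycles(original))   # 9
print(total_cycles(scheduled))  # 8
```

Moving the independent `sub` between the load and its consumer removes both the `d*` and the resulting `p*` stall, so the same four instructions finish one cycle earlier.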

Superscalar Pipeline Review
• Execute two or more instructions per cycle
• Challenges:
  • wide fetch (branch prediction harder, misprediction more costly)
  • wide decode (stall logic)
  • wide execute (more ALUs)
  • wide bypassing (more possible bypassing paths)
  • Finding enough independent instructions (and filling delay slots)
• How many instructions per cycle max (width)?
  • Really simple, low-power cores are still single-issue (most ARMs)
  • Even some low-power cores are dual-issue (ARM A8, Intel Atom)
  • Most desktop/laptop chips are three-issue or four-issue (Core i7)
  • A few 5 or 6-issue chips have been built (IBM Power4, Itanium II)
Superscalar Pipeline Diagrams - Ideal

scalar              1   2   3   4   5   6   7   8   9   10  11  12
lw 0(r1)→r2         F   D   X   M   W
lw 4(r1)→r3             F   D   X   M   W
lw 8(r1)→r4                 F   D   X   M   W
add r14,r15→r6                  F   D   X   M   W
add r12,r13→r7                      F   D   X   M   W
add r17,r16→r8                          F   D   X   M   W
lw 0(r18)→r9                                F   D   X   M   W

2-way superscalar   1   2   3   4   5   6   7   8   9   10  11  12
lw 0(r1)→r2         F   D   X   M   W
lw 4(r1)→r3         F   D   X   M   W
lw 8(r1)→r4             F   D   X   M   W
add r14,r15→r6          F   D   X   M   W
add r12,r13→r7              F   D   X   M   W
add r17,r16→r8              F   D   X   M   W
lw 0(r18)→r9                    F   D   X   M   W
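The ideal diagrams follow a simple counting rule: the first group of instructions needs one cycle per pipeline stage, and each further group adds one cycle. The sketch below assumes no stalls of any kind and a five-stage pipeline; the name `ideal_cycles` is hypothetical.

```python
import math


def ideal_cycles(num_insns, width, depth=5):
    """Cycles to finish num_insns (> 0) on a stall-free pipeline.

    width is the number of instructions issued per cycle,
    depth is the number of pipeline stages (F, D, X, M, W = 5).
    """
    groups = math.ceil(num_insns / width)  # issue groups entering fetch
    return depth + groups - 1


for width in (1, 2):
    cycles = ideal_cycles(7, width)
    print(f"width {width}: {cycles} cycles, IPC = {7 / cycles:.2f}")

# width 1 gives 11 cycles and width 2 gives 8 cycles,
# matching the seven-instruction diagrams above.
```

Doubling the width does not halve the cycle count here because the pipeline fill time (depth minus one cycles) is paid regardless of width; the gap narrows only for longer instruction sequences.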

Superscalar Pipeline Diagrams - Realistic

scalar              1   2   3   4   5   6   7   8   9   10  11  12
lw 0(r1)→r2         F   D   X   M   W
lw 4(r1)→r3

This note was uploaded on 10/19/2011 for the course CIS 501 taught by Professor Martin during the Fall '10 term at UPenn.
