08_scheduling

08_scheduling - CIS 501 Computer Architecture Unit 8:...


CIS 501 Computer Architecture
Unit 8: Static and Dynamic Scheduling

CIS 501 (Martin): Scheduling
Slides originally developed by Drew Hilton, Amir Roth and Milo Martin at University of Pennsylvania

This Unit: Static & Dynamic Scheduling
  Code scheduling
    To reduce pipeline stalls
    To increase ILP (insn level parallelism)
  Static scheduling by the compiler
    Approach & limitations
  Dynamic scheduling in hardware
    Register renaming
    Instruction selection
    Handling memory operations
  [Figure: system layer diagram with App, System software, CPU, Mem, I/O]

Readings
  Textbook (MA:FSPTCM)
    Sections 3.3.1-3.3.4 (but not Sidebar:)
    Sections 5.0-5.2, 5.3.3, 5.4, 5.5
  Paper to read for in-class discussion:
    "The MIPS R10000 Superscalar Microprocessor" by Kenneth Yeager
  Paper for group discussion and questions:
    "Memory Dependence Prediction using Store Sets" by Chrysos & Emer

Code Scheduling & Limitations

Code Scheduling
  Scheduling: act of finding independent instructions
    Static: done at compile time by the compiler (software)
    Dynamic: done at runtime by the processor (hardware)
  Why schedule code?
    Scalar pipelines: fill in load-to-use delay slots to improve CPI
    Superscalar: place independent instructions together
      As above, load-to-use delay slots
      Allow multiple-issue decode logic to let them execute at the same time

Compiler Scheduling
  Compiler can schedule (move) instructions to reduce stalls
    Basic pipeline scheduling: eliminate back-to-back load-use pairs
    Example code sequence: a = b + c; d = f - e;
      sp = stack pointer, sp+0 is "a", sp+4 is "b", etc.

  Before                          After
  ld  r2,4(sp)                    ld  r2,4(sp)
  ld  r3,8(sp)                    ld  r3,8(sp)
  add r3,r2,r1   //stall          ld  r5,16(sp)
  st  r1,0(sp)                    add r3,r2,r1   //no stall
  ld  r5,16(sp)                   ld  r6,20(sp)
  ld  r6,20(sp)                   st  r1,0(sp)
  sub r5,r6,r4   //stall          sub r5,r6,r4   //no stall
  st  r4,12(sp)                   st  r4,12(sp)

Compiler Scheduling Requires
  Large scheduling scope
    Independent instruction to put between load-use pairs
    + Original example: large scope, two independent computations
    - This example: small scope, one computation
  One way to create larger scheduling scopes?
    Loop unrolling

  Before                          After
  ld  r2,4(sp)                    ld  r2,4(sp)
  ld  r3,8(sp)                    ld  r3,8(sp)
  add r3,r2,r1   //stall          add r3,r2,r1   //stall
  st  r1,0(sp)                    st  r1,0(sp)

Scheduling Scope Limited by Branches
  r1 and r2 are inputs

  loop:
    jz  r1, not_found
    ld  [r1+0] -> r3
    sub r2, r3 -> r4
    jz  r4, found
    ld  [r1+4] -> r1
    jmp loop

  Legal to move load up past branch?...
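
One way to explore the question about moving the load above the branch is to model the loop in Python. This is a rough sketch, not taken from the slides: the dict-based memory, the MemoryFault exception, the example addresses, and the function names search and search_hoisted are all assumptions made for illustration. It treats address 0 as an unmapped null pointer and shows that a load hoisted above the "jz r1, not_found" check can fault where the original order does not.

```python
class MemoryFault(Exception):
    """Raised when a load touches an unmapped address (hypothetical model)."""


def load(mem, addr):
    # Assumed memory model: a dict from address to word; anything else faults.
    if addr not in mem:
        raise MemoryFault(f"load from unmapped address {addr}")
    return mem[addr]


def search(mem, r1, r2):
    """Original instruction order from the slide."""
    while True:
        if r1 == 0:                 # jz r1, not_found
            return "not_found"
        r3 = load(mem, r1 + 0)      # ld [r1+0] -> r3
        r4 = r2 - r3                # sub r2, r3 -> r4
        if r4 == 0:                 # jz r4, found
            return "found"
        r1 = load(mem, r1 + 4)      # ld [r1+4] -> r1


def search_hoisted(mem, r1, r2):
    """Same loop with the first load moved above the null check."""
    while True:
        r3 = load(mem, r1 + 0)      # hoisted load runs even when r1 == 0
        if r1 == 0:
            return "not_found"
        r4 = r2 - r3
        if r4 == 0:
            return "found"
        r1 = load(mem, r1 + 4)


# Two list nodes at made-up addresses: value at +0, next pointer at +4.
MEM = {100: 7, 104: 200, 200: 9, 204: 0}

print(search(MEM, 100, 9))          # found
print(search(MEM, 100, 5))          # not_found
try:
    search_hoisted(MEM, 100, 5)
except MemoryFault as fault:
    print("hoisted version faulted:", fault)
```

In this model the hoisted version behaves identically as long as the key is found, and differs only on the path where r1 becomes 0, which is why a compiler cannot move the load without some guarantee that it is safe.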
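
The Before/After sequences in the Compiler Scheduling example (a = b + c; d = f - e) can also be checked mechanically. The sketch below is an illustration under stated assumptions, not the course's tooling: it assumes a scalar in-order pipeline in which an instruction stalls one cycle only when it reads a register written by the immediately preceding load, and the Insn tuple and count_load_use_stalls name are hypothetical. Operands follow the slide's convention of listing the destination register last.

```python
from collections import namedtuple

# Hypothetical instruction record: opcode, source registers, destination.
Insn = namedtuple("Insn", ["op", "srcs", "dst"])


def count_load_use_stalls(program):
    """Count back-to-back load-use pairs (assumed one-cycle load-to-use delay)."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "ld" and prev.dst in cur.srcs:
            stalls += 1
    return stalls


# Building blocks for the slide's example.
LD_B = Insn("ld", ("sp",), "r2")          # ld r2,4(sp)
LD_C = Insn("ld", ("sp",), "r3")          # ld r3,8(sp)
ADD = Insn("add", ("r3", "r2"), "r1")     # add r3,r2,r1
ST_A = Insn("st", ("r1", "sp"), None)     # st r1,0(sp)
LD_E = Insn("ld", ("sp",), "r5")          # ld r5,16(sp)
LD_F = Insn("ld", ("sp",), "r6")          # ld r6,20(sp)
SUB = Insn("sub", ("r5", "r6"), "r4")     # sub r5,r6,r4
ST_D = Insn("st", ("r4", "sp"), None)     # st r4,12(sp)

BEFORE = [LD_B, LD_C, ADD, ST_A, LD_E, LD_F, SUB, ST_D]
AFTER = [LD_B, LD_C, LD_E, ADD, LD_F, ST_A, SUB, ST_D]

print(count_load_use_stalls(BEFORE))  # 2
print(count_load_use_stalls(AFTER))   # 0
```

Both lists contain the same eight instructions; only the order changes, which is the point of the example: the scheduler needs two independent computations in scope to have something useful to place between each load and its use.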