cse431-chapterSohi

CSE 431 Computer Architecture, Fall 2008
A "Simple" SS Processor
Mary Jane Irwin ( www.cse.psu.edu/~mji )
CSE431 Sample SS Processor.1 Irwin, PSU, 2008
[Adapted from Sohi, Instruction Issue Logic for High-Performance, Interruptable, Multiple Functional Unit, Pipelined Computers, IEEE Transactions on Computers, Vol 39, No 3, 1990.]

Review: Extracting More Performance
- To achieve high performance, need both machine parallelism and instruction level parallelism (ILP) by
  - Superpipelining
  - Static multiple-issue (VLIW)
  - Dynamic multiple-issue (superscalar (SS))
- An SS processor's instruction issue and execution policies impact the available ILP
  - In-order fetch, issue, and commit and out-of-order execution
    - Pipelining creates true dependencies (read before write, i.e. RAW hazards)
    - Out-of-order execution creates antidependencies (write before read, i.e. WAR hazards) and output dependencies (write before write, i.e. WAW hazards)
    - In-order commit allows speculation (to increase ILP) and is required to implement precise interrupts
- Register renaming can solve these storage dependencies (the antidependencies and output dependencies)
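The three dependency types above, and the effect of register renaming on them, can be illustrated with a small sketch. Everything here is hypothetical and not taken from the slides: the `Instr` record, the `classify` and `rename` helpers, and the `p0`, `p1`, ... physical register names are assumptions chosen only for illustration. The renaming scheme shown (a fresh register for every write) is the simplest possible one and ignores that a real machine has a finite pool of physical registers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: dst <- op(srcs). Not from the slides."""
    dst: str
    srcs: tuple


def classify(first, second):
    """Return the dependencies of `second` on the earlier instruction `first`."""
    deps = set()
    if first.dst in second.srcs:
        deps.add("true (RAW)")      # second reads what first writes
    if second.dst in first.srcs:
        deps.add("anti (WAR)")      # second overwrites what first reads
    if second.dst == first.dst:
        deps.add("output (WAW)")    # both write the same register
    return deps


def rename(program):
    """Assumed scheme: every write gets a fresh register p0, p1, ...;
    every read uses the most recent mapping of its source register."""
    mapping = {}
    renamed = []
    for n, ins in enumerate(program):
        srcs = tuple(mapping.get(s, s) for s in ins.srcs)
        new_dst = f"p{n}"
        mapping[ins.dst] = new_dst
        renamed.append(Instr(new_dst, srcs))
    return renamed


program = [
    Instr("r1", ("r2", "r3")),  # i0: r1 <- r2 op r3
    Instr("r2", ("r1", "r4")),  # i1: reads r1 (RAW on i0), overwrites r2 (WAR on i0)
    Instr("r1", ("r5", "r6")),  # i2: overwrites r1 (WAW on i0, WAR on i1)
]
renamed = rename(program)

for a, b in [(0, 1), (0, 2), (1, 2)]:
    print(f"i{a}->i{b}: before={sorted(classify(program[a], program[b]))} "
          f"after={sorted(classify(renamed[a], renamed[b]))}")
```

In this toy program only the true dependency between i0 and i1 survives renaming, which is the sense in which renaming removes storage dependencies while leaving real data flow intact.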
Review: SS Pipeline Stage Functions
- FETCH (In Order): Fetch multiple instructions. Simulator function: ruu_fetch()
- DECODE & DISPATCH (In Order): Decode and issue instructions. Simulator function: ruu_dispatch()
- EXECUTE (Out of Order): Wait for source operands to be Ready and FU free, schedule Result Bus and execute instruction. Simulator functions: ruu_issue(), lsq_refresh()
- WRITE BACK RESULT: Copy Result Bus data to matching waiting sources. Simulator function: ruu_writeback()
- COMMIT (In Order): Write dst contents to RegFile or Data Memory. Simulator function: ruu_commit()

Speedup Measurements
- The speedup of the SS processor is
    $\text{speedup} = s_n = \dfrac{\#\ \text{scalar cycles}}{\#\ \text{superscalar cycles}}$
  - Assumes the processors have the same IC & CC
- To compute average speedup performance can use
  - Geometric mean: $\mathrm{GM} = \sqrt[n]{\prod_{i=1}^{n} s_i}$
  - Harmonic mean: $\mathrm{HM} = n \Big/ \left( \sum_{i=1}^{n} 1/s_i \right)$
    - assigns a larger weighting to the programs with the smallest speedup
  - EX: two programs with same scalar cycles, with a SS speedup of 2 for program1 and 25 for program2
    - $\mathrm{GM} = \sqrt{2 \times 25} = 7.1$
    - $\mathrm{HM} = 2 / (0.5 + 0.04) = 2 / 0.54 = 3.7$
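The two averages in the example can be checked numerically. The following is a minimal sketch; the function names `speedup`, `geometric_mean`, and `harmonic_mean` are illustrative choices, not names from the slides or from any simulator.

```python
import math


def speedup(scalar_cycles, superscalar_cycles):
    """Per-program speedup, assuming equal instruction count and cycle time."""
    return scalar_cycles / superscalar_cycles


def geometric_mean(speedups):
    """n-th root of the product of the speedups."""
    return math.prod(speedups) ** (1.0 / len(speedups))


def harmonic_mean(speedups):
    """n divided by the sum of reciprocal speedups."""
    return len(speedups) / sum(1.0 / s for s in speedups)


example = [2.0, 25.0]  # program1 and program2 from the slide
gm = geometric_mean(example)
hm = harmonic_mean(example)
print(f"GM = {gm:.1f}, HM = {hm:.1f}")  # GM = 7.1, HM = 3.7
```

The harmonic mean stays close to the smaller speedup because each program contributes its reciprocal, so the slow program dominates the sum.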
Maximum (Theoretical) SS Speedups
- The highest speedup that can be achieved with "ideal" machine parallelism (ignoring structural hazards, data dependencies, and control dependencies)
  - HM of 5.4 is the highest average speedup for these benchmarks that can be achieved even with ideal machine parallelism!
[Figure: bar chart of Speedup (axis from 0 to 12) per benchmark; legible labels include 5diff, ccom, doduc, gnuchess, linpack, simple. From Johnson, 1992]

Baseline Superscalar MIPS Processor Model
[Figure: block diagram of the pipeline stages Fetch, Decode & Issue (aka Dispatch), Execute, Writeback, Commit. Blocks shown: PC, I$, BTB, BHT, Integer RegFile, FP RegFile, RUU (Register Update Unit, with RUU_Head and RUU_Tail), LSQ (Load/Store Queue), D$, function units IALU, IALU, IMULT, FPALU, and the Result Bus]
- The RUU can alternatively be done as a Reorder Buffer together with FU Reservation Stations
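The RUU_Head and RUU_Tail labels in the diagram suggest a circular queue: entries are allocated at the tail in program order and retired from the head in program order, while completion in between may happen out of order. The sketch below is a toy model under that assumption only. The `RUU` class, its method names, and its entry fields are hypothetical and do not reproduce the simulator's ruu_dispatch(), ruu_writeback(), or ruu_commit() code.

```python
class RUU:
    """Toy Register Update Unit modeled as a circular queue (assumed layout)."""

    def __init__(self, size):
        self.entries = [None] * size
        self.head = 0    # oldest in-flight instruction, next to commit
        self.tail = 0    # next free slot, filled at dispatch
        self.count = 0   # number of occupied entries

    def dispatch(self, name):
        """Allocate an entry at the tail in program order; return its index."""
        if self.count == len(self.entries):
            raise RuntimeError("RUU full: dispatch must stall")
        idx = self.tail
        self.entries[idx] = {"name": name, "done": False}
        self.tail = (self.tail + 1) % len(self.entries)
        self.count += 1
        return idx

    def writeback(self, idx):
        """Mark an entry complete; may be called in any order."""
        self.entries[idx]["done"] = True

    def commit(self):
        """Retire completed entries from the head only, in program order."""
        retired = []
        while self.count and self.entries[self.head]["done"]:
            retired.append(self.entries[self.head]["name"])
            self.entries[self.head] = None
            self.head = (self.head + 1) % len(self.entries)
            self.count -= 1
        return retired


ruu = RUU(4)
i0, i1, i2 = (ruu.dispatch(n) for n in ("i0", "i1", "i2"))
ruu.writeback(i2)        # youngest instruction finishes first
first = ruu.commit()     # nothing retires: i0 at the head is not done
ruu.writeback(i0)
second = ruu.commit()    # only i0 retires: i1 now blocks the head
ruu.writeback(i1)
third = ruu.commit()     # i1 and i2 retire together, in order
print(first, second, third)
```

Retiring only from the head is what keeps commit in program order even though writeback is not, which is the property the earlier review slide ties to precise interrupts.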
Additional RegFile Fields
- Each register in the general purpose RegFile has two