MIT6_004s09_lec24

MIT OpenCourseWare http://ocw.mit.edu For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms . 6.004 Computation Structures Spring 2009
L24 – Parallel Processing (6.004 Spring 2009, 5/7/09)

Parallel Processing
modified 5/4/09 10:08

The Home Stretch

TODAY 5/7: Lab 8 (LAST!) due
Friday 5/8 section: LAST QUIZ (#5)!
Tu 5/12: Wrapup (LAST!) Lecture!
Wednesday 5/13: NO SECTION MEETINGS!
- Optional DESIGN PROJECT due
- ALL (late) Assignments due
- Immense Satisfaction/Rejoicing/Relief/Celebration/Wild Partying.

Taking a step back

Static Code:

    loop: LD(n, r1)
          CMPLT(r31, r1, r2)
          BF(r2, done)
          LD(r, r3)
          LD(n, r1)
          MUL(r1, r3, r3)
          ST(r3, r)
          LD(n, r1)
          SUBC(r1, 1, r1)
          ST(r1, n)
          BR(loop)
    done:

Dynamic Execution Path, aka "Thread of Execution"
Path Length = number of instructions along path

We have been building machines to execute one thread (quickly)

[Figure labels: Beta Processor, Memory, Thread]

Execution Time = (Path Length x Clocks-per-Instruction) / Clocks-per-second
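As a rough illustration of how the static code, the dynamic path length, and the execution-time relation fit together, here is a minimal Python sketch. It assumes that the slide's loop computes r = r * n while counting n down to zero, and that r starts at 1. The function names, the CPI of 1.0, and the 100 MHz clock are hypothetical values chosen for the example and are not taken from the lecture.

```python
STATIC_CODE_LENGTH = 11  # instructions between the labels loop: and done:


def dynamic_path_length(n, r=1):
    """Count the instructions executed along the dynamic path of the loop.

    Mirrors the slide's code one group of instructions at a time.
    Assumption: r is initialised to 1 before the loop is entered.
    Returns (instructions_executed, final_r).
    """
    executed = 0
    while True:
        executed += 3          # LD(n, r1); CMPLT(r31, r1, r2); BF(r2, done)
        if not (0 < n):        # BF is taken when the test 0 < n is false
            break
        r = r * n              # LD(r, r3); LD(n, r1); MUL(r1, r3, r3); ST(r3, r)
        executed += 4
        n = n - 1              # LD(n, r1); SUBC(r1, 1, r1); ST(r1, n)
        executed += 3
        executed += 1          # BR(loop)
    return executed, r


def execution_time(path_length, cpi, clocks_per_second):
    """Execution Time = (Path Length x Clocks-per-Instruction) / Clocks-per-second."""
    return path_length * cpi / clocks_per_second


if __name__ == "__main__":
    length, result = dynamic_path_length(5)
    # Hypothetical machine: CPI = 1.0 at 100 MHz.
    seconds = execution_time(length, cpi=1.0, clocks_per_second=100e6)
    print(length, result, seconds)
```

The static code is 11 instructions long regardless of the input, while the dynamic path grows with n: each pass through the loop body executes 11 instructions, and the final pass that exits executes 3.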
Can we make CPI < 1?

...Implies we can complete more than one instruction each clock cycle!

Two Places to Find Parallelism
- Instruction Level (ILP): fetch and issue groups of independent instructions within a thread of execution
- Thread Level (TLP): simultaneously execute multiple execution streams

Instruction-Level Parallelism

Sequential Code
This is okay, but smarter coding does better in this example!

    loop: LD(n, r1)
          CMPLT(r31, r1, r2)
          BF(r2, done)
          LD(r, r3)
          LD(n, r1)
          MUL(r1, r3, r3)
          ST(r3, r)
          LD(n, r4)
          SUBC(r4, 1, r4)
          ST(r4, n)
          BR(loop)
    done:

What if I tried to do multiple iterations at once?

"Safe" Parallel Code (instructions on the same row are issued together)

    loop: LD(n, r1)
          CMPLT(r31, r1, r2)
          BF(r2, done)
          LD(r, r3)
          LD(n, r1)          LD(n, r4)
          MUL(r1, r3, r3)    SUBC(r4, 1, r4)
          ST(r3, r)          ST(r4, n)
          BR(loop)
    done:

Superscalar Parallelism
- Popular now, but the limits are near (8-issue)
- Multiple instruction dispatch
- Speculative execution

SIMD Processing (Single Instruction Multiple Data)

Each datapath has its own local data (Register File)
All datapaths execute the same instruction
Conditional branching is difficult... (What if only one CPU has R1 = 0?)
Conditional operations are common in SIMD machines:
    if (flag1) Rc = Ra <op> Rb
Global ANDing or ORing of flag registers is used for high-level control

[Figure labels: Reg File, ALU, PC, +1 or Branch]
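The SIMD conditional operation and the global OR of flags described above can be sketched in Python, with each list element standing in for one datapath's register. This is only an illustration under stated assumptions: the function names `simd_conditional_op` and `countdown_all_lanes` are hypothetical, the countdown loop is an invented example workload, and register values are assumed to be non-negative.

```python
def simd_conditional_op(flags, ra, rb, rc, op):
    """One SIMD step: every lane sees the same instruction,
    'if (flag) Rc = Ra <op> Rb'. Lanes whose flag is clear keep their old Rc."""
    return [op(a, b) if f else c for f, a, b, c in zip(flags, ra, rb, rc)]


def countdown_all_lanes(r1):
    """Decrement each lane's R1 until it reaches 0 (assumes values >= 0).

    No lane can branch on its own, so the shared loop keeps running while
    the global OR of the per-lane flags is true. Lanes that are already
    at 0 are masked off by their flag instead of branching away.
    Returns (final_r1, number_of_shared_steps).
    """
    steps = 0
    flags = [x != 0 for x in r1]
    while any(flags):                      # global OR of the flag registers
        ones = [1] * len(r1)
        r1 = simd_conditional_op(flags, r1, ones, r1, lambda a, b: a - b)
        flags = [x != 0 for x in r1]       # each lane recomputes its own flag
        steps += 1
    return r1, steps


if __name__ == "__main__":
    print(countdown_all_lanes([3, 0, 1]))
```

The shared loop runs as many steps as the slowest lane needs, which is one way to see why divergent control flow is costly on a SIMD machine: lanes that finish early sit idle behind their flags.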

This note was uploaded on 11/07/2011 for the course COMPUTER S 6.004 taught by Professor Staff during the Spring '09 term at MIT.
