CSE502_lec15 - MTreviewF09Outline+SampleMT+Answers_9pg

CSE502 Lecture 15 - Tue 3Nov09
Review: MidTerm Thu 5Nov09 - Outline of Major Topics

Computing system: performance, speedup, performance/cost
Origins and benefits of scalar instruction pipelines and caches
Pipeline Hazards
    Structural: need more HW to wait less
    Data Dependence: RAW (WAR & WAW resolvable)
        ByPass HW lessens RAW delays even if in-order Xeq
        Dynamic HW support out-of-order Xeq & speculation
    Control: where fetch after jumps or branches
Instruction Level Parallelism (ILP) => pipelines
    Hazards limit fast pipelining
    Ways to lessen impact of hazards
        More memory and register ports
        Branch/Jump prediction
        Out-of-order execution
        Speculative execution
Code rescheduling (moving)
    Static by compiler SW
        Loop Unrolling (Software pipelining)
    Dynamic by Tomasulo-style HW
        Advantages of Tomasulo
        Disadvantages of Tomasulo
Superscalar (2-8 instructions per cycle) vs VLIW
    Advantages and disadvantages of each
    Which better? Why?
    Fast CPUs use both how? Why?
Thread-level support
    Cray/Tera MTA-1 128 threads * 256 CPUs, NO caches
    Multi-CPU (2-8 cores) on a chip instead of speculation
        Less power per core from lower voltage, less logic
Vector Processing
    Loop unrolling
    Loop execution time
        Without chaining
        With chaining

Print Your Name _LW ANSWERS_ + IDCard Number _____________ + Student Number ____-___-______
MidTerm Tues 12 Mar 2002 12:50-14:05 CSE 533 Prof. Wittie page 1
**OPEN BOOK** **OPEN NOTES** **CLOSED FRIENDS**

You have 75 minutes for 3 questions with 8 parts on 3 pages = 100 points. Read all questions before you start. Some easy questions may be later. Show your math work for partial credit. Note any special assumptions that you make for any question. Raise your hand for help. Write your answers on fronts and backs of this exam, and hand it in directly to me.

PLEASE PUT YOUR SUNY ID WHERE I MAY SEE IT DURING THE EXAM.

1) (40 pts) Your NETalker company is designing a new wireless ZeBox, that will run your patented netsearch code 95% of its working time.
You must make a critical hardware choice affecting both the cost and speed of ZeBox. The old Box costs $36 and takes an average of 40 seconds per search. A new ZeBox that takes 40 seconds per search costs only $10. Four optional hardware improvements to ZeBox are feasible: W, X, Y, Z. Each speeds up only parts of netsearch execution, as shown in the figure.

For an average run of netsearch divided into 20 equal time periods:
    only W can speed up 3 periods;
    W or X (but not both together) can speed up the next 3 periods;
    only X can speed up 2 periods;
    Y or X can speed up 5 periods;
    only Y can speed up 3 periods;
    Y or Z can speed up 4 periods.

Each improvement has a different speedup factor for its sections of the run and an additional cost. W is 15 times faster for $15 more. X is 5 times faster for $5 more. Y is 10 times faster for $10 more. Z is 20 times faster for $20 more. If two improvements are possible during a time period, only the faster one has any effect.

The question is what WXYZ combination gives the best performance-to-total_cost ratio. For example, if ZeBox has all four optional improvements...
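The arithmetic behind question 1 can be sketched in Python as a rough check, not as the exam's official answer. Because the question text is cut off, the sketch makes several assumptions of its own: performance is taken as searches per second (1 / seconds per search), total cost as the $10 ZeBox plus the chosen options, and the 95% working-time figure and the old $36 Box are ignored. The names SEGMENTS, OPTIONS, search_time, total_cost, and rank are hypothetical helpers, not from the exam.

```python
from itertools import combinations

BASE_COST = 10.0   # dollars: new ZeBox with no options
BASE_TIME = 40.0   # seconds per search with no options
PERIODS = 20       # the run is divided into 20 equal periods

# (number of periods, options able to speed up those periods)
SEGMENTS = [(3, "W"), (3, "WX"), (2, "X"), (5, "XY"), (3, "Y"), (4, "YZ")]

# option -> (speedup factor, added cost in dollars)
OPTIONS = {"W": (15, 15.0), "X": (5, 5.0), "Y": (10, 10.0), "Z": (20, 20.0)}


def search_time(chosen):
    """Seconds per search when the options in `chosen` are installed."""
    period = BASE_TIME / PERIODS
    total = 0.0
    for count, usable in SEGMENTS:
        # Only the fastest installed option helps; otherwise no speedup.
        best = max((OPTIONS[o][0] for o in usable if o in chosen), default=1)
        total += count * period / best
    return total


def total_cost(chosen):
    """Base ZeBox cost plus the cost of each installed option."""
    return BASE_COST + sum(OPTIONS[o][1] for o in chosen)


def rank():
    """All 16 option subsets, best assumed performance/cost ratio first."""
    rows = []
    for r in range(len(OPTIONS) + 1):
        for combo in combinations("WXYZ", r):
            chosen = "".join(combo)
            t, c = search_time(chosen), total_cost(chosen)
            rows.append((1.0 / (t * c), chosen, t, c))
    rows.sort(reverse=True)
    return rows


if __name__ == "__main__":
    for ratio, chosen, t, c in rank():
        print(f"{chosen or '(none)':6s} time={t:6.2f}s cost=${c:5.2f} ratio={ratio:.5f}")
```

Under these assumptions, installing all four options gives 3.6 seconds per search at a total cost of $60; running the script lists every combination so the ratios can be compared by hand.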

This note was uploaded on 11/06/2010 for the course CSE 502 taught by Professor Wittie, L. during the Spring '08 term at SUNY Stony Brook.
