L08 - 18-447 Lecture 8 Performance-how to summarize compare...

CMU 18-447 S’08 L8-1 © 2008 J. C. Hoe

18-447 Lecture 8: Performance -- how to summarize & compare

James C. Hoe
Dept of ECE, CMU
February 11, 2008

Announcements:
- Midterm 2/18 in class, Lectures 1~8
- Lab 2 due this week (5% bonus for Tuesday)
- HW 2 due Friday at noon, HH-A304
- Read P&H Appendix C for next Lecture

Handouts: Practice Midterm and Solutions

Latency vs. Throughput

Latency (a time measure)
- time between start and finish of a single task
- most applicable in interactive applications

Throughput (a rate measure)
- number of tasks finished in a given unit of time
- most applicable in batch applications

Throughput is not always 1/latency when concurrency is involved (think bus vs. F1 race car)
- improving latency does not necessarily improve throughput
- improving throughput does not necessarily improve latency

Not completely distinct when different granularities are considered
- increasing the throughput of component processing shortens the latency of the overall task
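The bus vs. F1 race car comparison can be made concrete with a small sketch. All numbers here (speeds, capacities, distance) and the `Vehicle` class are made up for illustration and are not from the lecture; throughput is simplified to passengers delivered per hour on a one-way trip.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    name: str
    speed_kmh: float   # hypothetical cruising speed
    capacity: int      # passengers carried per trip

    def latency_h(self, distance_km: float) -> float:
        # Latency: time for one passenger to get from start to finish.
        return distance_km / self.speed_kmh

    def throughput_per_h(self, distance_km: float) -> float:
        # Throughput: passengers delivered per hour of travel
        # (one-way trip only, a deliberate simplification).
        return self.capacity / self.latency_h(distance_km)


DISTANCE_KM = 100.0  # assumed trip length
bus = Vehicle("bus", speed_kmh=60.0, capacity=40)
f1 = Vehicle("F1 car", speed_kmh=300.0, capacity=1)

for v in (bus, f1):
    print(f"{v.name}: latency {v.latency_h(DISTANCE_KM):.2f} h, "
          f"throughput {v.throughput_per_h(DISTANCE_KM):.1f} passengers/h")
```

Under these assumptions the race car wins on latency while the bus wins on throughput, so neither measure can be derived from the other once tasks are carried concurrently.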
It is all about time

Performance = 1 / Time
- shorter latency => higher performance
- higher throughput (job/time) => higher performance

UNIX “time” command
- user CPU time: time spent running your code
- system CPU time: time spent running other code on behalf of your code
- elapsed time: wall-clock time
- elapsed time - user CPU time - system CPU time = time running other code unrelated to your code

1. Be precise about what you measured when reporting
2. Rule of thumb: measure and report wall-clock time on an unloaded system

IPC, MIPS and GHz

The metrics you are most likely to see in marketing are IPC (instructions per cycle), MIPS (million instructions per second) and GHz.

How are they incomplete?

Iron Law on Performance
  wall clock time = (time/cyc) (cyc/inst) (inst/program)
  where time/cyc corresponds to 1/GHz, cyc/inst to 1/IPC, and the product (time/cyc) (cyc/inst) to 1/MIPS
- MIPS and IPC are averages => which instructions matter?
- GHz can be boosted artificially by design (lower the other 2 terms), e.g., 1.4GHz P4 vs. 1.0GHz P3
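A minimal sketch of the Iron Law arithmetic follows. The two machines, their CPI values, and the instruction count are hypothetical numbers chosen to show how a higher clock rate can be cancelled by a worse cyc/inst; they are not measured P4 or P3 figures.

```python
def wall_clock_seconds(inst_per_program: float,
                       cycles_per_inst: float,
                       clock_hz: float) -> float:
    # Iron Law: time = (time/cyc) * (cyc/inst) * (inst/program),
    # with time/cyc = 1 / clock_hz.
    return (1.0 / clock_hz) * cycles_per_inst * inst_per_program


def mips(cycles_per_inst: float, clock_hz: float) -> float:
    # Instructions per second = clock_hz / CPI; MIPS scales that by 1e6.
    return clock_hz / cycles_per_inst / 1e6


INSTRUCTIONS = 1e9  # assumed: same program, same instruction count

# Hypothetical machine A: faster clock, but more cycles per instruction.
time_a = wall_clock_seconds(INSTRUCTIONS, cycles_per_inst=1.4, clock_hz=1.4e9)
# Hypothetical machine B: slower clock, fewer cycles per instruction.
time_b = wall_clock_seconds(INSTRUCTIONS, cycles_per_inst=1.0, clock_hz=1.0e9)

print(f"A: {time_a:.3f} s at {mips(1.4, 1.4e9):.0f} MIPS")
print(f"B: {time_b:.3f} s at {mips(1.0, 1.0e9):.0f} MIPS")
```

With these made-up values both machines finish in about the same time, which is why GHz alone says little without the other two terms.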
Pseudo FLOPS

The scientific computing community often uses pseudo FLOPS as a performance metric:
  pseudo FLOPS = (nominal # of floating point operations) / (program runtime)
- e.g., an FFT of size N has nominally 5N log2(N) FP operations

Is this a good, fair metric to compare machine + algorithm combinations?
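The pseudo FLOPS definition above can be sketched as follows. The function names and the runtimes are hypothetical; only the nominal count 5N log2(N) comes from the slide.

```python
import math


def nominal_fft_flops(n: int) -> float:
    # Nominal operation count for an FFT of size n: 5 * n * log2(n).
    return 5 * n * math.log2(n)


def pseudo_flops(n: int, runtime_s: float) -> float:
    # Pseudo FLOPS = nominal operation count / measured program runtime.
    return nominal_fft_flops(n) / runtime_s


N = 1024
# Hypothetical runtimes for two machine + algorithm combinations.
fast = pseudo_flops(N, runtime_s=1e-4)
slow = pseudo_flops(N, runtime_s=2e-4)

print(f"nominal ops: {nominal_fft_flops(N):.0f}")
print(f"fast: {fast:.3e} pseudo FLOPS, slow: {slow:.3e} pseudo FLOPS")
```

Because the nominal count is fixed for a given N, ranking by pseudo FLOPS is the same as ranking by runtime, regardless of how many floating point operations each implementation actually executes.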