24 Optimization - CMSC 216 Introduction to Computer...

CMSC 216 Introduction to Computer Systems
Lecture 24: Optimization
Jan Plane & Pete Keleher
{jplane, keleher}@cs.umd.edu

Administrivia
Read Sections 2.2-2.4 and 7.6-7.13 of Bryant and O’Hallaron
Final Exam: Thursday, May 12, 4:00-6:00 pm, ARM 0126
OPTIMIZING PROGRAM PERFORMANCE
Chapter 5, Bryant & O'Hallaron

How processors spend their time
Not all instructions take the same amount of time; some are more expensive
Processors have caches to keep copies of recently accessed memory locations in fast storage
– a recently accessed memory location is more likely to be accessed again soon
– each cache item (line) stores multiple adjacent data items; the amount stored is called the line size
– the same instruction may take different amounts of time on different executions: an access that misses the cache can be ten to a hundred times slower than one that hits
– efficiency is maximized when the same cache items are used multiple times
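The point about reusing cache items can be illustrated with a small sketch. The two hypothetical functions below (the names and the flat row-major layout are assumptions for illustration, not from the slides) compute the same sum, but the first walks memory sequentially, so it uses every item in a cache line before moving on, while the second jumps `cols` elements between accesses. How much slower the second is depends on the machine, the cache line size, and the matrix dimensions.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols matrix stored in row-major order, visiting elements
 * in the order they sit in memory. Consecutive accesses tend to fall in
 * the same cache line. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Same result, but the inner loop strides by cols elements, so for a
 * large matrix each access may touch a different cache line. */
long sum_column_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

Both functions return the same value for any input; only the access pattern differs, which is why timing them on a matrix much larger than the cache is a common way to observe the effect.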
Understanding modern processors
It's helpful to know a bit about what a compiler can and can't do, as well as what takes time on the hardware
Pipelining
– parts of multiple instructions can execute simultaneously, such as decoding one instruction while loading the next one from memory
Branch prediction
– the processor guesses which way a branch will go, which allows the pipeline to stay full
Superscalar processors can execute two or more instructions at once
Some floating point operations (e.g., division) can take longer than integer (or other f.p.) operations
– perhaps 5 to 10 times longer for the same operation

Issues in conducting measurements
Number of runs: a single run of a program is not sufficient to time its performance
– many things go on in a computer, such as operating system functions and other programs running
– multiple runs provide increased accuracy
– take the mean of the K fastest runs
Workload: what data is the program given for measurement runs, and does it look like a "typical" use of the program?
– many algorithms might look good if the measured workload is too small
– for small n, O(n^2) algorithms are similar to O(n)
– many algorithms might also look good if the measured workload is too large (not representative of the typical usage pattern)
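The "mean of the K fastest runs" idea can be sketched as follows. The helper names `time_once` and `mean_of_k_fastest` are hypothetical, and using the standard `clock()` function is an assumption; it reports processor time at a fairly coarse resolution, so a real measurement might use a finer-grained timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* qsort comparator: ascending order of doubles. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort the n recorded times in place and return the mean of the k
 * smallest. Discarding the slow runs filters out interference from
 * the operating system and other programs. */
double mean_of_k_fastest(double *times, size_t n, size_t k)
{
    assert(times != NULL && k > 0 && k <= n);
    qsort(times, n, sizeof times[0], compare_doubles);
    double sum = 0.0;
    for (size_t i = 0; i < k; i++)
        sum += times[i];
    return sum / (double)k;
}

/* Time a single call of work() in seconds of processor time. */
double time_once(void (*work)(void))
{
    clock_t start = clock();
    work();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would fill an array by invoking `time_once` on the code under test n times, then pass that array to `mean_of_k_fastest`; the choice of n and K is left to the experimenter.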
Sources of performance problems
I/O operations that are too small
– reading one or a few characters at a time
Debugging printf calls were left in the program
Poor basic algorithms
– O(n^2) algorithms were used in cases where n was large, or for frequently performed operations
Algorithms were used that compose poorly
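One common way an accidental O(n^2) algorithm arises is a linear-time call hidden inside a loop condition. The sketch below is an illustration chosen for this note, not code from the lecture, and the function names are hypothetical: both versions convert a string to lower case, but the first calls `strlen` on every iteration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Quadratic: strlen() scans the whole string each time the loop
 * condition is checked, unless the compiler manages to hoist the call. */
void lower_quadratic(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < strlen(s); i++)
        s[i] = (char)tolower((unsigned char)s[i]);
}

/* Linear: the length is computed once, before the loop. */
void lower_linear(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t i = 0; i < len; i++)
        s[i] = (char)tolower((unsigned char)s[i]);
}
```

For short strings the two are hard to tell apart in a timing run, which ties back to the earlier point that a workload that is too small can hide a poor algorithm.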