...Multi-Threaded Applications,” in Int’l. Symp. on Performance Analysis of Systems and Software (ISPASS), 2012.

...number of cores for blackscholes, facesim (both PARSEC) and cholesky (SPLASH-2).

...parallel programming has become inevitable for mainstream computing. One of the key needs for efficient programming is to have the appropriate tools to analyze parallel performance. In particular, a software developer needs analysis tools to identify the performance scaling bottlenecks, not only on current hardware but also on future hardware with many more cores than are available today; likewise, computer architects need analysis tools to understand the behavioral characteristics of existing and future workloads to design and optimize future hardware.

Speedup curves, which report speedup as a function of the number of cores, as exemplified in Figure 1, are often used to understand the scaling behavior of an application. Although a speedup curve gives a high-level view of application scaling behavior, it does not provide any insight with respect to why an application does or does not scale. There are many possible causes for poor scaling behavior, such as synchronization, as well as interference in both shared on-chip resources (e.g., last-level cache) and off-chip resources (e.g., main memory). Unfortunately, a speedup curve provides no clue whatsoever why an application exhibits poor scaling behavior.

In this paper, we propose the speedup stack, which is a novel representation that provides insight into an application’s scaling behavior on multi-core hardware. The height...

Part A (5 points)
Which, if any, of the three applications exhibit strong scalability? Explain your answer.

Part B (5 points)
What is the approximate efficiency of facesim and cholesky at 16 threads? Please show enough work to make clear how you arrived at your answer.

Part C (5 points)
This problem continues to refer to the graph on the previous page.
To be weakly scalable, a program must consist of two parts:
• A part that is inherently sequential and can never be parallelized
• A part that is parallelizable and will exhibit linear speedup in the number of threads
Assume that facesim is weakly scalab...
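
As a rough illustration of the quantities these parts ask about, the sketch below computes parallel efficiency (speedup divided by thread count) and evaluates the two-part model from Part C, in which a fixed sequential fraction is combined with a perfectly parallel remainder. The function names and the speedup value of 8 at 16 threads are hypothetical placeholders chosen for the example; they are not read from the graph and are not answers to any part.

```python
def efficiency(speedup, threads):
    """Parallel efficiency: achieved speedup as a fraction of the ideal (linear) speedup."""
    return speedup / threads


def model_speedup(serial_fraction, threads):
    """Speedup predicted by the two-part model: a sequential part that never
    speeds up plus a parallel part that speeds up linearly with thread count."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


def serial_fraction_from_speedup(speedup, threads):
    """Invert the two-part model: the sequential fraction implied by one
    measured (speedup, threads) point. Requires threads > 1."""
    return (threads / speedup - 1.0) / (threads - 1.0)


if __name__ == "__main__":
    # Hypothetical measurement, NOT taken from the figure: speedup 8 at 16 threads.
    hypothetical_speedup = 8.0
    threads = 16

    print("efficiency:", efficiency(hypothetical_speedup, threads))

    s = serial_fraction_from_speedup(hypothetical_speedup, threads)
    print("implied sequential fraction:", s)

    # Under the same model, extrapolate to a larger (also hypothetical) thread count.
    print("predicted speedup at 32 threads:", model_speedup(s, 32))
```

To use it with the actual figure, one would read the speedup of an application at a given thread count off the curve and pass that value in place of the placeholder.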
- Fall '13
- Computer Architecture