INTRODUCTION TO PARALLEL PROCESSING
Figure: Data (a list of 0s and 1s), its prefix sums, and the ranks of the 1s.
A priority circuit has a list of 0s and 1s as its inputs and picks the first (highest-priority) 1
in the list. The function of a p
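The selection rule described above can be sketched with a prefix sums computation. This is only an illustrative sketch: the function name `priority_circuit` and the output encoding (a list of the same length with every position zeroed except the first 1) are assumptions, not details taken from the text.

```python
from itertools import accumulate


def priority_circuit(bits):
    """Keep only the first (highest-priority) 1 of `bits`; zero out the rest."""
    # prefix[i] = number of 1s among bits[0..i]
    prefix = list(accumulate(bits))
    # The first 1 is the only position holding a 1 whose prefix sum equals 1.
    return [1 if b == 1 and s == 1 else 0 for b, s in zip(bits, prefix)]


print(priority_circuit([0, 0, 1, 0, 1, 1]))  # [0, 0, 1, 0, 0, 0]
```

If the input contains no 1 at all, the sketch returns a list of all 0s.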
Figure 1.1. The exponential growth of microprocessor performance, known as Moore's law,
shown over the past two decades.
The speed-of-light argument suggests that once the above limit has been reached, the
only path to
PARALLEL ALGORITHM COMPLEXITY
1. Showing that, in the worst case, solution of the problem requires data to travel
a certain distance or that a certain volume of data must pass through a limited-bandwidth interface. An example of the first method i
O(p log p) work required for sorting p elements on a single processor. The analysis of this
algorithm with regard to speed-up, efficiency, and so forth is left as an exercise. Faster and
more efficient PRAM sorting
Figure 8.3. Systolic data structure for minimum, maximum, and median finding.
8.3. PARALLEL PREFIX COMPUTATION
Parallel prefix computation was defined in Section 2.1, with several algorithms provided
subsequently in
MORE SHARED-MEMORY ALGORITHMS
6.1. SEQUENTIAL RANK-BASED SELECTION
Rank-based selection is the problem of finding a (the) kth smallest element in a sequence
S = x_0, x_1, ..., x_{n-1} whose elements belong to a linear order. Median, maximum, and
m
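As a concrete illustration of the problem statement, here is a minimal sequential sketch of rank-based selection. It uses randomized partitioning (quickselect), which is one common approach and not necessarily the algorithm developed in this section. The function name `select` and the convention that k is 1-based (k = 1 gives the minimum) are assumptions.

```python
import random


def select(seq, k):
    """Return a kth smallest element of `seq`, with k counted from 1."""
    if not 1 <= k <= len(seq):
        raise ValueError("k must satisfy 1 <= k <= len(seq)")
    items = list(seq)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k <= len(lower):
            # The answer lies among the elements smaller than the pivot.
            items = lower
        elif k <= len(lower) + len(equal):
            # The pivot itself has rank k.
            return pivot
        else:
            # Discard the smaller and equal elements and adjust the rank.
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


data = [6, 4, 5, 6, 7, 1, 5, 3, 8, 2]
print(select(data, 1), select(data, 5), select(data, len(data)))  # 1 5 8
```

With this convention, the minimum is `select(data, 1)` and the maximum is `select(data, len(data))`.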
SORTING AND SELECTION NETWORKS
7.5. OTHER CLASSES OF SORTING NETWORKS
A class of sorting networks that possess the same asymptotic Θ(log^2 n) delay and
Θ(n log^2 n) cost as Batcher sorting networks, but that offer some advantages, are the periodic
balanc
in Section 6.4. Using your analysis, justify the choice of |S|/p random samples in the first
algorithm step.
6.7. Parallel radixsort algorithm
a. Extend the parallel radixsort algorithm given in Section 6.4 to th
INTRODUCTION TO PARALLELISM
W(8) = 22
T(8) = 7
E(8) = 15/(8 × 7) = 27%
S(8) = 15/7 = 2.14
R(8) = 22/15 = 1.47
Q(8) = 0.39
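The figures above can be checked with a short calculation. The sketch below assumes T(1) = W(1) = 15, as implied by the numerators and denominators shown, and the usual definitions S(p) = T(1)/T(p), E(p) = T(1)/(p T(p)), and R(p) = W(p)/W(1). The quality formula Q(p) = T(1)^3 / (p T(p)^2 W(p)) does not appear in this excerpt and is assumed here; it does reproduce the stated 0.39. The function name `effectiveness` is hypothetical.

```python
def effectiveness(t1, w1, p, tp, wp):
    """Return speed-up, efficiency, redundancy, and quality for p processors.

    t1, w1: running time and work on one processor (assumed equal to 15 here)
    tp, wp: running time and work on p processors
    """
    speedup = t1 / tp
    efficiency = t1 / (p * tp)
    redundancy = wp / w1
    # Assumed definition of quality: Q(p) = T(1)^3 / (p * T(p)^2 * W(p))
    quality = t1 ** 3 / (p * tp ** 2 * wp)
    return speedup, efficiency, redundancy, quality


s, e, r, q = effectiveness(t1=15, w1=15, p=8, tp=7, wp=22)
print(f"S(8) = {s:.2f}, E(8) = {e:.0%}, R(8) = {r:.2f}, Q(8) = {q:.2f}")
# S(8) = 2.14, E(8) = 27%, R(8) = 1.47, Q(8) = 0.39
```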
The efficiency in this latter case is even lower, primarily because the interprocessor transfers
constitute overhead rather than
MODELS OF PARALLEL PROCESSING
A more precise model, particularly if the circuit is to be implemented on a dense VLSI
chip, would include the effect of wires, in terms of both the chip area they consume (cost)
and the signal propagation delay between an
4.1. DEVELOPMENT OF EARLY MODELS
Associative processing (AP) was perhaps the earliest form of parallel processing.
Associative or content-addressable memories (AMs, CAMs), which allow me