lecture_14 - Lecture 14 Multiprocessor and Cluster...

Lecture 14 Multiprocessor and Cluster
Announcement
Final Exam: 6/16
Chapters 6, 7, 8, 9
Exercises:
6.4, 6.22, 6.33, 6.36
7.11, 7.14, 7.22, 7.45
8.1, 8.2, 8.28, 8.29
9.2, 9.5, 9.6
Parallel Computers
Definition: “A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems fast.” (Almasi and Gottlieb, Highly Parallel Computing, 1989)
Questions about parallel computers:
How do they communicate?
» Shared address space: UMA vs. NUMA
» Message passing: clusters
How do they coordinate?
» Synchronization
What type of interconnection?
» Bus & network
How many processors?
Does it translate into performance?
Communication Models
Single Address Space: load/store
[Figure: processes P0, P1, P2, ..., Pn each have a private portion of the address space and a shared portion that maps to common physical addresses; "store x" and "load x" operate on the shared portion.]
UMA: Uniform Memory Access
Memory: centralized with uniform access time (“uma”) and bus interconnect
Examples: SPARCCenter, Challenge, SystemPro
[Figure: four processors, each with caches, connected to main memory and the I/O system.]
Scalability is an issue!!!
NUMA: Nonuniform Memory Access
Memory: distributed with nonuniform access time (“numa”) and scalable interconnect (distributed memory)
Examples: T3D, Exemplar, Paragon, CM-5
[Figure: nodes, each a processor + cache with its own memory and I/O, attached to an interconnection network.]
Network Topology
Processor-memory switch
Fully-connected
2D torus
Ring
Cube
Communication Models
Multiple address spaces: Message Passing
[Figure: process P executes "Send x, Q, t" from its local process address space; process Q executes "Recv y, P, t" into its local process address space; the send and the receive are matched.]
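The matching of "Send x, Q, t" with "Recv y, P, t" can be sketched in Python. This is only an illustration, assuming that a receive matches a message by sender and tag and that threads stand in for the two processes; the Mailbox class and the send/recv helpers are hypothetical names, not part of the lecture or of any real message-passing library.

```python
import threading

class Mailbox:
    """Per-process mailbox; a receive matches on (source, tag)."""
    def __init__(self):
        self._messages = []
        self._cond = threading.Condition()

    def deliver(self, source, tag, data):
        with self._cond:
            self._messages.append((source, tag, data))
            self._cond.notify_all()

    def take(self, source, tag):
        with self._cond:
            while True:
                for i, (s, t, d) in enumerate(self._messages):
                    if s == source and t == tag:   # the "match" in the figure
                        del self._messages[i]
                        return d
                self._cond.wait()                  # block until a new message arrives

mailboxes = {"P": Mailbox(), "Q": Mailbox()}

def send(data, sender, dest, tag):
    mailboxes[dest].deliver(sender, tag, data)

def recv(receiver, source, tag):
    return mailboxes[receiver].take(source, tag)

received = {}

def process_p():
    x = 42                                   # x lives only in P's address space
    send(x, sender="P", dest="Q", tag="t")   # Send x, Q, t

def process_q():
    received["y"] = recv(receiver="Q", source="P", tag="t")  # Recv y, P, t

q = threading.Thread(target=process_q)
p = threading.Thread(target=process_p)
q.start()
p.start()
p.join()
q.join()
print(received["y"])
```

The value of x reaches Q only as a copy carried by the message; neither process reads the other's variables directly.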
Program Example – Single-Address Space
Sum 100,000 numbers on 100 processors (load & store)
First Step: each processor (Pn) sums its subset of numbers
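The slide's own code is not preserved in this copy. A minimal Python sketch of the first step follows, assuming threads stand in for the 100 processors and ordinary Python lists stand in for the shared address space; the names A, partial, and first_step and the input data are illustrative assumptions.

```python
import threading

N = 100_000          # numbers to sum (from the slide)
P = 100              # processors (from the slide), modeled here as threads
CHUNK = N // P       # 1,000 numbers per processor

A = list(range(N))   # hypothetical input data, visible to every processor
partial = [0] * P    # shared array: one partial-sum slot per processor

def first_step(pn):
    """Processor pn sums its own contiguous slice of A into partial[pn]."""
    total = 0
    for i in range(CHUNK * pn, CHUNK * (pn + 1)):
        total += A[i]        # load from the shared address space
    partial[pn] = total      # store to the shared address space

threads = [threading.Thread(target=first_step, args=(pn,)) for pn in range(P)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each processor writes only its own slot of partial, so this step needs no synchronization beyond waiting for all processors to finish.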
Program Example – Single-Address Space
Second Step: Add partial sums via divide-and-conquer
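One way to sketch the divide-and-conquer step is a halving reduction: in each round, half of the remaining processors add a partner's partial sum to their own, with a barrier between rounds. Whether this matches the slide's code is an assumption, since that code is not preserved here. Threads again stand in for processors, and the partial sums are filled with placeholder values.

```python
import threading

P = 100
partial = list(range(P))     # placeholder partial sums (assumed already computed)
expected = sum(partial)      # kept only to check the result
barrier = threading.Barrier(P)

def second_step(pn):
    half = P
    while half > 1:
        barrier.wait()                       # previous round's stores are complete
        if half % 2 != 0 and pn == 0:
            partial[0] += partial[half - 1]  # odd count: P0 picks up the extra element
        half //= 2
        if pn < half:
            partial[pn] += partial[pn + half]

threads = [threading.Thread(target=second_step, args=(pn,)) for pn in range(P)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(partial[0] == expected)
```

Every thread calls barrier.wait() the same number of times, even when it has nothing left to add, so that no processor reads a slot before its owner has finished writing it. The final sum ends up in partial[0].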
Parallel Program – Message Passing
Sum 100,000 numbers on 100 processors (send & receive)
First Step: each processor (Pn) sums its subset of numbers
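The slide's code is not preserved in this copy. The sketch below simulates the first step in Python, assuming each processor first receives its 1,000 numbers as a message and then sums its private copy. Threads and queue.Queue mailboxes stand in for processors and the network; send, receive, node, and local_sums are hypothetical names, and local_sums exists only so the demo can report results.

```python
import queue
import threading

N = 100_000
P = 100
CHUNK = N // P

inbox = [queue.Queue() for _ in range(P)]   # one mailbox per processor
local_sums = queue.Queue()                  # reporting channel for this demo only

def send(dest, message):
    inbox[dest].put(message)

def receive(me):
    return inbox[me].get()                  # blocks until a message arrives

def node(pn):
    numbers = receive(pn)                   # private copy of this node's subset
    total = 0
    for x in numbers:
        total += x                          # purely local computation
    local_sums.put((pn, total))

workers = [threading.Thread(target=node, args=(pn,)) for pn in range(P)]
for w in workers:
    w.start()
for pn in range(P):                         # distribute the data as messages
    send(pn, list(range(CHUNK * pn, CHUNK * (pn + 1))))
for w in workers:
    w.join()

sums = dict(local_sums.get() for _ in range(P))
```

Unlike the single-address-space version, no node indexes into a common array; all data a node touches arrived in a message.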
Parallel Program – Message Passing
Second Step: Add partial sums via divide-and-conquer
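A possible sketch of the message-passing reduction: in each round the upper half of the active processors send their sums to the lower half, which receive and add them. The scheme is an assumption, since the slide's code is not preserved here. Threads and queue.Queue mailboxes simulate processors and the network, and each node starts with a placeholder partial sum.

```python
import queue
import threading

P = 100
inbox = [queue.Queue() for _ in range(P)]   # one mailbox per processor
final = []                                  # filled in by node 0 only

def send(dest, value):
    inbox[dest].put(value)

def receive(me):
    return inbox[me].get()                  # blocks until a message arrives

def node(pn, total):
    limit = P                               # processors still holding a live sum
    half = P
    while half > 1:
        half = (half + 1) // 2              # split point, rounded up for odd limits
        if half <= pn < limit:
            send(pn - half, total)          # upper half sends its sum down
        if pn < limit // 2:
            total += receive(pn)            # lower half receives and adds
        limit = half
    if pn == 0:
        final.append(total)

# Placeholder partial sums: node pn starts with the value pn.
workers = [threading.Thread(target=node, args=(pn, pn)) for pn in range(P)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(final[0])
```

The blocking receive doubles as synchronization, so no separate barrier is needed. A node that sits out a round may deliver its next message early, but each receiver still gets exactly one message per round it takes part in, and addition does not depend on arrival order.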
Challenge of Parallel Processing
Communication Overhead
Programmers must know a good deal about the hardware
Available Parallelism in applications
Amdahl’s Law (FracX: fraction of the original execution time to be sped up)
Speedup = 1 / [FracX/SpeedupX + (1 - FracX)]
What fraction of the original computation can be sequential to get an 80X speedup from 100 processors?
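The question can be worked by solving Amdahl's Law for FracX. The Python sketch below assumes the parallel part speeds up by exactly 100X on 100 processors; the function names are illustrative.

```python
def amdahl_speedup(frac_x, speedup_x):
    """Overall speedup when a fraction frac_x is sped up by speedup_x."""
    return 1.0 / (frac_x / speedup_x + (1.0 - frac_x))

def required_fraction(target_speedup, speedup_x):
    """Solve target = 1 / (f/s + 1 - f) for f.

    Rearranged: 1/target = 1 - f * (1 - 1/s), so
    f = (1 - 1/target) / (1 - 1/s).
    """
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / speedup_x)

frac_x = required_fraction(80, 100)   # fraction that must run in parallel
sequential = 1.0 - frac_x             # fraction allowed to stay sequential

print(f"parallel fraction:   {frac_x:.4f}")
print(f"sequential fraction: {sequential * 100:.2f}%")
```

Under that assumption FracX is 0.9875 / 0.99, about 0.9975, so only about 0.25% of the original computation can be sequential.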