chapter5-m2--ziavras

Centralized memory multiprocessor


1. Centralized Memory Multiprocessor
• Fewer than a few dozen processor chips (and fewer than 100 cores) in 2006
• Small enough to share a single, centralized memory
2. Physically Distributed-Memory Multiprocessor
• Larger number of chips and cores than 1.
• BW demands ⇒ memory distributed among processors

Centralized vs. Distributed Memory
[Figure: two organizations shown side by side along a "Scale" axis. Centralized Memory: processors P1 ... Pn, each with a cache ($), reach a single shared memory through an interconnection network. Distributed Memory: processors P1 ... Pn, each with its own cache ($) and local memory, are connected by an interconnection network. I/O is not shown.]

Centralized Memory Multiprocessor
• Also called symmetric multiprocessor (SMP) because the single main memory has a symmetric relationship to all processors
• Large caches ⇒ a single memory can satisfy the memory demands of a small number of processors
• Can scale to a few dozen processors by using a switch and many memory banks
• Although scaling beyond that is technically conceivable, it becomes less attractive as the number of processors sharing the centralized memory increases

Distributed Memory Multiprocessor
• Pro: Cost-effective way to scale memory bandwidth, if most accesses are to local memory
• Pro: Reduces latency of local memory accesses
• Con: Communicating data between processors is more complex
• Con: Must change software to take advantage of increased memory BW

2 Models for Communication and Memory Architecture
1. Communication occurs by explicitly passing messages among the processors: message-passing multiprocessors
2. Communication occurs through a shared address space (via loads and stores): shared-memory multiprocessors, either
• UMA (Uniform Memory Access time) for shared address, centralized memory MP
• NUMA (Non-Uniform Memory Access time multiprocessor) for shared address, distributed memory MP
• In the past, there was confusion over whether “sharing” means sharing physical memory (Symmetric MP) or sharing address space (DSM: distributed shared memory)

Challenges of Parallel Processing
• First challenge is the % of the program that is inherently sequential (Amdahl’s Law)
• Suppose 80X speedup from 100 processors. What fraction of the original program can be sequential?
a. 10% b. 5% c. 1% d. <1%

Amdahl’s Law Answers

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - \text{Fraction}_{\text{parallel}}) + \dfrac{\text{Fraction}_{\text{parallel}}}{\text{Speedup}_{\text{parallel}}}}$$

$$80 = \frac{1}{(1 - \text{Fraction}_{\text{parallel}}) + \dfrac{\text{Fraction}_{\text{parallel}}}{100}}$$

$$80 \times \left( (1 - \text{Fraction}_{\text{parallel}}) + \frac{\text{Fraction}_{\text{parallel}}}{100} \right) = 1$$

...
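
The Amdahl’s Law question above (80X speedup from 100 processors) can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `amdahl_speedup` and `required_parallel_fraction` are hypothetical helpers introduced for illustration. It assumes the parallel portion speeds up perfectly by the number of processors, as the slide's formula does.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Overall speedup when `parallel_fraction` of the work is spread
    evenly over `processors` and the rest stays sequential."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


def required_parallel_fraction(target_speedup: float, processors: int) -> float:
    """Invert Amdahl's Law: the parallel fraction needed for a target speedup.

    Starting from S = 1 / ((1 - F) + F / N) and solving for F gives
    F = (1 - 1/S) / (1 - 1/N).
    """
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / processors)


if __name__ == "__main__":
    # Values taken from the slide: 80X speedup on 100 processors.
    f = required_parallel_fraction(80, 100)
    print(f"parallel fraction   = {f:.4%}")
    print(f"sequential fraction = {1 - f:.4%}")
    print(f"check: speedup      = {amdahl_speedup(f, 100):.2f}")
```

Under these assumptions the sequential fraction comes out at roughly 0.25%, which corresponds to option d (<1%).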