Lecture 14 Multicores, Multiprocessors, and Clusters

Chapter 7 Multicores, Multiprocessors, and Clusters
Introduction (§7.1)
- Goal: connecting multiple computers to get higher performance
  - Multiprocessors
  - Scalability, availability, power efficiency
- Job-level (process-level) parallelism
  - High throughput for independent jobs
- Parallel processing program
  - Single program run on multiple processors
- Multicore microprocessors
  - Chips with multiple processors (cores)
Hardware and Software
- Hardware
  - Serial: e.g., Pentium 4
  - Parallel: e.g., quad-core Xeon e5345
- Software
  - Sequential: e.g., matrix multiplication
  - Concurrent: e.g., operating system
- Sequential/concurrent software can run on serial/parallel hardware
- Challenge: making effective use of parallel hardware
What We've Already Covered
- §2.11: Parallelism and Instructions
  - Synchronization
- §3.6: Parallelism and Computer Arithmetic
  - Associativity
- §4.10: Parallelism and Advanced Instruction-Level Parallelism
- §5.8: Parallelism and Memory Hierarchies
  - Cache coherence
- §6.9: Parallelism and I/O
  - Redundant Arrays of Inexpensive Disks
Parallel Programming (§7.2, The Difficulty of Creating Parallel Processing Programs)
- Parallel software is the problem
- Need a significant performance improvement
  - Otherwise, just use a faster uniprocessor, since it's easier!
- Difficulties (illustrated in the sketch below)
  - Partitioning
  - Coordination
  - Communication overhead
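To make the three difficulties concrete, here is a minimal C sketch, not part of the original deck: a parallel array sum using POSIX threads. The slice bounds show partitioning, the join-and-reduce loop shows coordination, and the per-thread result slot shows communication. The names (sum_range, partial, NTHREADS) are illustrative, not from the slides.

/* Parallel array sum with pthreads: a toy illustration of
 * partitioning, coordination, and communication overhead. */
#include <pthread.h>
#include <stdio.h>

#define N        1000000
#define NTHREADS 4            /* assumes N is divisible by NTHREADS */

static double a[N];
static double partial[NTHREADS];  /* one slot per thread: no locking needed */

struct range { int lo, hi, id; };

static void *sum_range(void *arg)
{
    struct range *r = arg;
    double s = 0.0;
    for (int i = r->lo; i < r->hi; i++)  /* partitioning: this thread's slice */
        s += a[i];
    partial[r->id] = s;                  /* communication: publish the result */
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct range r[NTHREADS];

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    for (int t = 0; t < NTHREADS; t++) { /* split N elements into NTHREADS chunks */
        r[t] = (struct range){ t * (N / NTHREADS), (t + 1) * (N / NTHREADS), t };
        pthread_create(&tid[t], NULL, sum_range, &r[t]);
    }

    double total = 0.0;
    for (int t = 0; t < NTHREADS; t++) { /* coordination: wait, then reduce */
        pthread_join(tid[t], NULL);
        total += partial[t];
    }
    printf("sum = %f\n", total);         /* expect 1000000.0 */
    return 0;
}

Compile with cc -pthread. Even this toy version pays coordination costs (thread creation and joining) that a sequential loop avoids, which is exactly why a significant speedup is needed to justify the effort.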
Amdahl's Law
- Sequential part can limit speedup
- Example: 100 processors, 90× speedup?
  - T_new = T_parallelizable/100 + T_sequential
  - Speedup = 1 / ((1 - F_parallelizable) + F_parallelizable/100) = 90
  - Solving: F_parallelizable = 0.999
- Need the sequential part to be 0.1% of the original time
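The formula is easy to check numerically. A small sketch, assuming nothing beyond the slide's equation; amdahl() and required_fraction() are hypothetical helper names:

#include <stdio.h>

/* Amdahl's law: speedup with n processors when a fraction f of the
 * original execution time is parallelizable. */
static double amdahl(double f, int n)
{
    return 1.0 / ((1.0 - f) + f / n);
}

/* The slide's equation solved for f, given a target speedup s:
 * f = (1 - 1/s) / (1 - 1/n). */
static double required_fraction(double s, int n)
{
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / n);
}

int main(void)
{
    /* The slide's example: 100 processors, 90x target speedup. */
    double f = required_fraction(90.0, 100);
    printf("required F_parallelizable = %.4f\n", f);   /* ~0.999 */
    printf("check: speedup = %.1f\n", amdahl(f, 100)); /* 90.0 */
    return 0;
}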
Scaling Example
- Workload: sum of 10 scalars, and a 10 × 10 matrix sum
  - Speed up from 10 to 100 processors
- Single processor: Time = (10 + 100) × t_add
- 10 processors
  - Time = 10 × t_add + 100/10 × t_add = 20 × t_add
  - Speedup = 110/20 = 5.5 (55% of potential)
- 100 processors
  - Time = 10 × t_add + 100/100 × t_add = 11 × t_add
  - Speedup = 110/11 = 10 (10% of potential)
- Assumes the load can be balanced across processors
Scaling Example (cont)
- What if the matrix size is 100 × 100?
- Single processor: Time = (10 + 10000) × t_add
- 10 processors
  - Time = 10 × t_add + 10000/10 × t_add = 1010 × t_add
  - Speedup = 10010/1010 = 9.9 (99% of potential)
- 100 processors
  - Time = 10 × t_add + 10000/100 × t_add = 110 × t_add
  - Speedup = 10010/110 = 91 (91% of potential)
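Both slides evaluate the same model, Time(p) = 10 × t_add + (dim × dim)/p × t_add. A short C sketch reproducing the numbers; time_units() is an illustrative name, and perfect load balance is assumed, as on the slides:

#include <stdio.h>

/* Execution time in units of t_add: the 10 scalar additions stay
 * sequential; the dim x dim matrix sum is split across p processors. */
static double time_units(int dim, int p)
{
    return 10.0 + (double)(dim * dim) / p;
}

int main(void)
{
    int dims[]  = { 10, 100 };
    int procs[] = { 10, 100 };

    for (int d = 0; d < 2; d++) {
        double t1 = time_units(dims[d], 1);  /* single-processor time */
        for (int i = 0; i < 2; i++) {
            int p = procs[i];
            double s = t1 / time_units(dims[d], p);
            printf("%3dx%-3d matrix, %3d procs: speedup = %5.1f (%3.0f%% of potential)\n",
                   dims[d], dims[d], p, s, 100.0 * s / p);
        }
    }
    return 0;
}

The output matches the slides and shows their point: growing the problem from 10 × 10 to 100 × 100 shrinks the relative weight of the sequential part, raising 100-processor efficiency from 10% to 91% of potential.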