chapter5-m2--ziavras



… + Fraction_parallel / 100) = 1
79 = 80 × Fraction_parallel − 0.8 × Fraction_parallel
Fraction_parallel = 79 / 79.2 = 99.75%

Challenges of Parallel Processing
• Second challenge is long latency to remote memory
• Suppose a 32-CPU MP, 2 GHz, 200 ns remote memory, all local accesses hit in the memory hierarchy, and base CPI (all references hit in the cache) is 0.5. (Remote access = 200 ns / 0.5 ns per cycle = 400 clock cycles.)
• What is the performance impact if 0.2% of instructions per processor involve a remote access?
  a. 1.5X
  b. 2.0X
  c. 2.5X

CPI Equation
• CPI = Base CPI + Remote request rate × Remote request cost
• CPI = 0.5 + 0.2% × 400 = 0.5 + 0.8 = 1.3
• An MP with all local references is 1.3 / 0.5 = 2.6 times faster than one in which 0.2% of instructions involve a remote access (about 2.5X, answer c)

Challenges of Parallel Processing
1. Application parallelism ⇒ primarily via new algorithms that have better parallel performance
2. Long remote latency impact ⇒ both by architect and by the programmer*
• For example, reduce frequency of remote accesses either by
  – Caching shared data (HW)
  – Restructuring the data layout to make more accesses local (SW)
• Today's lecture on HW to help latency via caches
* Also: runtime system, compiler

Symmetric Shared-Memory Architectures
• From multiple boards on a shared bus to multiple processors inside a single chip
• Caches hold both:
  – Private data, used by a single processor
  – Shared data, used by multiple processors
• Caching shared data reduces:
  1. latency to shared data
  2. memory bandwidth for shared data
  3. interconnect bandwidth
  ⇒ cache coherence problem (drawback)

Example Cache Coherence Problem
[Figure: processors P1, P2 and P3, each with a private cache ($), share a bus with memory and I/O devices. Memory initially holds u:5. Events are numbered 1 to 5: copies of u:5 are cached, u = 7 is written at event 3, and later reads are marked u = ?.]
  – Processors see different values for u after event 3
  – With write-back caches, the value written back to memory depends on the happenstance of which cache flushes or writes back its value, and when
    » Processes accessing main memory may see a very stale value
  – Unacceptable for programming, and it's frequent!

Intuitive Memory Model
[Figure: memory hierarchy P, L1 (100:67), L2 (100:35), Memory, Disk...]
• Reading from an address should return the last value written to that address
  – Easy in uniprocessors, except for I/O
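The arithmetic on the CPI Equation slide, and the Fraction_parallel result before it, can be checked with a short Python sketch. The function names are hypothetical, and the Amdahl's law inputs (a target speedup of 80 on 100 processors) are an assumption inferred from the coefficients 80 and 0.8 in the visible equations, since the start of that slide is cut off.

```python
def remote_access_cycles(clock_ghz: float, remote_latency_ns: float) -> float:
    """Remote access cost in clock cycles: latency divided by cycle time."""
    cycle_time_ns = 1.0 / clock_ghz          # 2 GHz -> 0.5 ns per cycle
    return remote_latency_ns / cycle_time_ns


def effective_cpi(base_cpi: float, remote_rate: float, remote_cost: float) -> float:
    """CPI = Base CPI + Remote request rate x Remote request cost."""
    return base_cpi + remote_rate * remote_cost


def parallel_fraction(target_speedup: float, processors: int) -> float:
    """Solve target = 1 / ((1 - f) + f / n) for the parallel fraction f."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / processors)


base_cpi = 0.5
remote_cost = remote_access_cycles(clock_ghz=2.0, remote_latency_ns=200.0)
cpi = effective_cpi(base_cpi, remote_rate=0.002, remote_cost=remote_cost)
slowdown = cpi / base_cpi

# Assumed inputs for the truncated Amdahl's law slide: speedup 80, 100 CPUs.
f_parallel = parallel_fraction(target_speedup=80, processors=100)

print(f"remote access cost: {remote_cost:.0f} cycles")
print(f"effective CPI:      {cpi:.2f}")
print(f"slowdown vs local:  {slowdown:.2f}x")
print(f"parallel fraction:  {f_parallel:.4%}")
```

The ratio comes out at 2.6, so 2.5X is the nearest of the three offered answers rather than the exact value.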
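The stale-value behaviour described under Example Cache Coherence Problem can be reproduced with a toy model of private write-back caches that have no coherence protocol. This is only an illustrative sketch: the class `WriteBackCache` is hypothetical, and the event order used (P1 reads u, P3 reads u, P3 writes u = 7, P1 reads u, P2 reads u) is an assumption about the figure, which did not survive extraction.

```python
class WriteBackCache:
    """A private write-back cache with no coherence protocol (deliberately broken)."""

    def __init__(self, memory: dict):
        self.memory = memory   # shared backing store
        self.lines = {}        # address -> locally cached value
        self.dirty = set()     # addresses written locally but not yet written back

    def read(self, addr):
        if addr not in self.lines:              # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                 # hit: the copy may be stale

    def write(self, addr, value):
        self.lines[addr] = value                # only the local copy changes
        self.dirty.add(addr)

    def flush(self):
        for addr in self.dirty:                 # write dirty lines back to memory
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


memory = {"u": 5}
p1, p2, p3 = (WriteBackCache(memory) for _ in range(3))

seen = {}
seen["event 1: P1 reads u"] = p1.read("u")   # miss, loads 5
seen["event 2: P3 reads u"] = p3.read("u")   # miss, loads 5
p3.write("u", 7)                             # event 3: only P3's cache holds 7
seen["event 4: P1 reads u"] = p1.read("u")   # hit on P1's stale copy
seen["event 5: P2 reads u"] = p2.read("u")   # miss, memory still holds the old value

for event, value in seen.items():
    print(f"{event} -> {value}")
print("P3 now reads u ->", p3.read("u"))

p3.flush()                                   # memory is updated only at this point
print("memory after P3 flushes ->", memory["u"])
```

After event 3, P3 sees 7 while P1 and P2 still see 5, and memory catches up only when P3 happens to flush. A hardware coherence protocol exists to rule out exactly this outcome.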

This document was uploaded on 02/09/2014.
