chapter5-m2--ziavras



Migration and Replication key to performance of shared data
• Migration - data can be moved to a local cache and used there in a transparent fashion
  – Reduces both latency to access shared data that is allocated remotely and bandwidth demand on the shared memory
• Replication - for shared data being simultaneously read, since caches make a copy of data in local cache
  – Reduces both latency of access and contention for read shared data

2 Classes of Cache Coherence Protocols
1. Directory based - Sharing status of a block of physical memory is kept in just one location, the directory
2. Snooping - Every cache with a copy of data also has a copy of sharing status of block, but no centralized state is kept
  • All caches are accessible via some broadcast medium (a bus or switch)
  • All cache controllers monitor or snoop on the medium to determine whether or not they have a copy of a block that is requested on a bus or switch access

Snoopy Cache-Coherence Protocols
[Figure: processors P1 ... Pn, each with a cache ($) whose lines hold State, Address, Data; the caches, memory (Mem), and I/O devices share a bus. Labels: bus snoop, cache-memory transaction.]
• Cache Controller “snoops” all transactions on the shared medium (bus or switch)
  – relevant transaction if for a block it contains
  – take action to ensure coherence
    » invalidate, update, or supply value
  – depends on state of the block and the protocol
• Either get exclusive access before write via write invalidate or update all copies on write

Example: Write-thru Invalidate
[Figure: processors P1, P2, P3, each with a cache ($), on a bus shared with memory and I/O devices. Steps 1-5: memory starts with u:5, two caches hold copies u:5, one processor writes u=7 (memory becomes u=7), and two later reads of u are marked u=?.]
• Must invalidate before step 3
• Write update uses more broadcast medium BW ⇒ all recent MPUs use write invalidate

Architectural Building Blocks
• Cache block state transition diagram
  – FSM specifying how disposition of block changes
    » invalid, valid, dirty
• Broadcast Medium Transactions (e.g., bus)
  – Fundamental system design abstraction
  – Logically single set of wires connect several devices
  – Protocol: arbitration, command+address, data
  ⇒ Every device observes every transaction
• Broadcast medium enforces serialization of read or write accesses ⇒ Write serialization
  – 1st processor to get medium invalidates others’ copies; it cannot complete write until it obtains bus
  – All coherence schemes require serializing accesses to same cache block
• Also need to find up-to-date copy of cache block ...
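
The write-through invalidate example can be sketched as a small simulation. This is a minimal illustration, not the slides' own code: the `Bus` and `Cache` classes, the `invalidate` flag, and `run_example` are hypothetical names, and the access order (P1 reads u, P3 reads u, P3 writes u=7, then P1 and P2 read u) is an assumption chosen to match the values shown in the figure. The cache also updates its own copy on a write, which is a further assumption.

```python
class Bus:
    """Broadcast medium: one transaction at a time, observed by every cache."""

    def __init__(self, memory, invalidate=True):
        self.memory = memory          # dict: address -> value
        self.invalidate = invalidate  # False shows what goes wrong without snooping
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, addr):
        # With write-through, memory always holds the up-to-date value.
        return self.memory[addr]

    def write(self, writer, addr, value):
        # Every other cache snoops the write and drops its copy of the block.
        if self.invalidate:
            for cache in self.caches:
                if cache is not writer:
                    cache.snoop_write(addr)
        self.memory[addr] = value     # write-through to memory


class Cache:
    """Per-processor cache; a block is valid if and only if it is in `lines`."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:    # miss: fetch the block over the bus
            self.lines[addr] = self.bus.read(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.write(self, addr, value)  # the write completes only via the bus
        self.lines[addr] = value           # assumption: writer keeps a valid copy

    def snoop_write(self, addr):
        self.lines.pop(addr, None)    # invalidate the local copy, if any


def run_example(invalidate=True):
    """Replay the assumed access order; return (values read, final memory value)."""
    memory = {"u": 5}
    bus = Bus(memory, invalidate)
    p1, p2, p3 = Cache(bus), Cache(bus), Cache(bus)
    reads = [p1.read("u"), p3.read("u")]    # both caches load u:5
    p3.write("u", 7)                        # P3 writes u=7
    reads += [p1.read("u"), p2.read("u")]   # the two reads marked u=?
    return reads, memory["u"]


if __name__ == "__main__":
    print(run_example(invalidate=True))
    print(run_example(invalidate=False))
```

With invalidation enabled, P1's copy is dropped when P3 writes, so P1's next read misses and fetches the new value from memory. With invalidation disabled, P1 hits on its stale copy of u, which is the incoherence the protocol is meant to prevent.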

This document was uploaded on 02/09/2014.
