16 - Coherence

CS 4290/6290 Multi-core and Cache Coherence

• Avoids cache thrashing
  – Each processor has its own cache(s)
• No contention for processor resources
  – Each processor has its own RS, ROB, etc.
• But…

CS 4290/6290 – Spring 2009 – Prof. Milos Prvulovic

• Shared memory easy with no caches
  – P1 writes, P2 can read
  – Only one copy of data exists (in memory)
• Caches store their own copies of the data
  – Those copies can easily get inconsistent
  – Classic example: adding to a sum
    • P1 loads allSum, adds its mySum, stores new allSum
    • P1’s cache now has dirty data, but memory not updated
    • P2 loads allSum from memory, adds its mySum, stores allSum
    • P2’s cache also has dirty data
    • Eventually P1 and P2’s cached data will go to memory
    • Regardless of write-back order, the final value ends up wrong

• A memory system is coherent if
  – A read R from address X on processor P1 returns the value written by the most recent write W to X on P1 if no other processor has written to X between W and R.
  – If P1 writes to X and P2 reads X after a sufficient time, and there are no other writes to X in between, P2’s read returns the value written by P1’s write.
  – Writes to the same location are serialized: two writes to location X are seen in the same order by all processors.

• Property 1 preserves program order
  – If no sharing, each processor acts like a uniprocessor
• Property 2 says that any write to an address must eventually be seen by all processors
  – If P1 writes to X and P2 keeps reading X, P2 must eventually see the new value
• Property 3 preserves causality
  – Let X start at 0. P1 sets X to 1. P2 waits until X is 1 then sets X to 2. P3 eventually sees that X becomes 2.
  – If different processors could see writes in different order, P3 might think final value of X is 1.

• Hardware schemes
  – Shared Caches
    • Trivially enforces coherence
    • Not scalable (L1 cache quickly becomes a bottleneck)
  – Snooping
    • Needs a broadcast network (like a bus) to enforce coherence
    • Each cache that has a block tracks its sharing state on its own
  – Directory
    • Can enforce coherence even with a point-to-point network
    • A block has just one place where its full sharing state is kept

• ...
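The "adding to a sum" example from the slides can be traced with a small simulation. This is only a sketch under stated assumptions: the `WriteBackCache` class, the `run` helper, and the mySum values 10 and 20 are hypothetical names and numbers chosen for illustration, and the model is a toy private write-back cache with no coherence protocol, not real hardware.

```python
class WriteBackCache:
    """Toy private write-back cache with no coherence protocol (illustrative)."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory: address -> value
        self.lines = {}        # this cache's private copies
        self.dirty = set()     # addresses modified but not yet written back

    def load(self, addr):
        # On a miss, fetch from memory; other caches are never consulted.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, value):
        # Write-back policy: update only the private copy and mark it dirty.
        self.lines[addr] = value
        self.dirty.add(addr)

    def write_back(self):
        # Flush dirty copies to memory, overwriting whatever is there.
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


def run(write_back_order):
    """Replay the allSum scenario and return the final value in memory."""
    memory = {"allSum": 0}
    caches = {"P1": WriteBackCache(memory), "P2": WriteBackCache(memory)}
    my_sum = {"P1": 10, "P2": 20}  # assumed partial sums

    # P1, then P2: load allSum, add own mySum, store the new allSum.
    for proc in ("P1", "P2"):
        cache = caches[proc]
        cache.store("allSum", cache.load("allSum") + my_sum[proc])

    # Eventually both dirty copies go to memory, in the given order.
    for proc in write_back_order:
        caches[proc].write_back()
    return memory["allSum"]


print(run(("P1", "P2")))  # P2's copy is written last
print(run(("P2", "P1")))  # P1's copy is written last
```

With these assumed values the correct total is 30, but memory ends up holding only the partial sum of whichever processor writes back last, so both orders give a wrong result, as the slide states.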