### lecture-8-4

Course: ECE 7995, Fall 2009
School: Wayne State University

ECE7995: Caching and Prefetching Techniques in Computer Systems

Lecture 8: Buffer Cache in Main Memory (IV)

#### Quantifying Locality with LRU Stack

- Blocks are ordered by recency in the LRU stack.
- Blocks enter from the stack top and leave from its bottom.

[Figure: LRU stack with a reference stream (". . . 3 4 4 5 3 2 9 8 1 4 5 3"); labels: Recency = 1, Recency = 2.]

#### Inter-Reference Recency (IRR)

The number of other distinct blocks accessed between two consecutive references to the block.

[Figure: LRU stack with a reference stream (". . . 3 4 5 3 2 9 8"); labels: IRR = 2, Recency = 0, Recency = 2.]

#### Locality Strength

LRU:

- Good for "absolutely" strong locality
- Bad for relatively weak locality

[Figure: MULTI2 trace; y-axis: IRR (Re-use Distance in Blocks); x-axis: Virtual Time (Reference Stream); the cache size is marked on the plot.]

#### LRU's Inability with Weak Locality: Memory Scanning

Memory scanning (one-time access):

- Infinite IRR, weak locality.
- Such blocks should not be cached at all.
- They are not replaced in a timely manner under LRU (they stay cached until their recency is larger than the cache size).

#### LRU's Inability with Weak Locality: Loop-like Accesses

Loop-like accesses (repeated accesses with a fixed interval):

- The IRR is the same as the interval.
- If the interval is larger than the cache size, there are no hits.
- The blocks to be accessed soonest can unfortunately be replaced.

#### LRU's Inability with Weak Locality: Accesses with Distinct Frequencies

- The recencies of frequently accessed blocks become large because of references to infrequently accessed blocks.
- Frequently accessed blocks could unfortunately be replaced.

#### Looking for Blocks with Strong Locality

[Figure: MULTI2 trace; y-axis: IRR (Re-use Distance in Blocks); x-axis: Virtual Time (Reference Stream); the cache size is marked on the plot; annotation: "Cover 1000 Blocks with Strongest Locality".]

#### Challenges

- Address the limitations of LRU fundamentally.
- Retain the low overhead and adaptability merits of LRU.
  - Simplicity: affordable implementation
  - Adaptability: responsive to access pattern changes

#### Principle of the LIRS Replacement

If a block's IRR is high, its next IRR is likely to be high again.
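The loop-like case from the LRU slides above can be checked with a few lines of Python. This is only a minimal sketch: the function name `lru_hits`, the trace, and the cache sizes are made up for illustration and do not come from the lecture.

```python
from collections import OrderedDict

def lru_hits(trace, cache_size):
    """Count hits of a plain LRU cache over a reference trace."""
    cache = OrderedDict()              # last item = most recently used block
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)   # refresh recency on a hit
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = None
    return hits

# Hypothetical loop: 5 distinct blocks referenced in order, 10 times over.
loop = list(range(5)) * 10

print(lru_hits(loop, 4))   # loop interval larger than the cache
print(lru_hits(loop, 5))   # loop fits in the cache
```

With a cache of 4 blocks, every block is evicted just before it is needed again, so the loop never hits. With 5 blocks, only the first pass misses.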
We select the blocks with high IRRs for replacement.

LIRS: Low IRR Set replacement algorithm. We keep the set of blocks with low IRRs in cache.

#### Requirements on Low IRR Block Set (LIRS)

- The set size should be the cache size.
- The set consists of the blocks with the strongest locality (with the lowest IRRs).
- The set is dynamically kept up to date.

#### Low IRR Block Set

- Low IRR (LIR) blocks and High IRR (HIR) blocks.
- The LIR block set has size Llirs; the HIR block set takes the remaining Lhirs of the physical cache.
- Cache size: L = Llirs + Lhirs.

[Figure: the LIR block set and the HIR block set mapped onto the physical cache.]

#### An Example for LIRS

Llirs = 2, Lhirs = 1

[Table: accesses (X) of blocks A to E over virtual times 1 to 10, with each block's recency (R) and IRR at time 10.]

| Block | R | IRR |
|-------|---|-----|
| A     | 1 | 1   |
| B     | 3 | 1   |
| C     | 4 | inf |
| D     | 2 | 3   |
| E     | 0 | inf |

LIR block set = {A, B}, HIR block set = {C, D, E}

#### Mapping to Cache Block Sets

- LIR block set (Llirs = 2): A, B
- HIR block set (Lhirs = 1): C, D, E
- Physical cache (resident blocks): A, B, E

#### Which Block is Replaced? Replace HIR Blocks

D is referenced at time 10. The resident HIR block (E) is replaced!

#### How LIR Set is Updated? Recency of LIR Block Used

[Table: the same access table at time 10, highlighting the recencies of the LIR blocks.]

#### After D is Referenced at Time 10

E is replaced, D enters LIR set.

#### If Reference is to C at Time 10

E is replaced, C cannot enter LIR set.

#### The LIRS References with Weak Locality: Memory Scanning

Memory scanning (one-time access):

- Infinite IRR.
- Not included in the LIR block set; replaced timely.

#### The LIRS References with Weak Locality: Loop-like Accesses

- The IRRs of all blocks are the same.
- Once a block becomes an LIR block, it can keep its status.
- Any cached block can contribute a hit in one loop of accesses.
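The recency and IRR values in the LIRS example above can be recomputed from an LRU stack. The sketch below makes one loud assumption: the trace `A D B C B A D A E` for virtual times 1 to 9 was chosen because it reproduces the R and IRR columns of the example, since the X marks themselves are not readable in the slide text. The function name `recency_and_irr` is likewise made up for illustration.

```python
import math

def recency_and_irr(trace):
    """Return {block: (recency, irr)} after replaying the whole trace."""
    stack = []   # LRU stack, most recently used block first
    irr = {}
    for block in trace:
        if block in stack:
            # The blocks above `block` are exactly the other distinct blocks
            # referenced since its previous access, so its depth is its IRR.
            irr[block] = stack.index(block)
            stack.remove(block)
        else:
            irr[block] = math.inf     # first reference: no previous access
        stack.insert(0, block)
    return {b: (stack.index(b), irr[b]) for b in stack}

# Assumed trace for virtual times 1..9 (see the note above).
stats = recency_and_irr(list("ADBCBADAE"))

# LIR set: the Llirs = 2 blocks with the lowest IRRs.
lir = sorted(sorted(stats, key=lambda b: stats[b][1])[:2])
# Rmax: the largest recency among the LIR blocks.
rmax = max(stats[b][0] for b in lir)

print(stats)
print(lir, rmax)
# A reference at time 10 gives the block a new IRR equal to its current recency.
print("D enters LIR set:", stats["D"][0] < rmax)
print("C enters LIR set:", stats["C"][0] < rmax)
```

Under this trace, D's new IRR would be 2, below Rmax = 3, while C's would be 4. That agrees with the two outcomes stated on the slides.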
#### The LIRS References with Weak Locality: Accesses with Distinct Frequencies

- Frequently accessed blocks have smaller IRRs than infrequently accessed blocks.
- Frequently accessed blocks are LIR blocks; they are always cached and get hits.

#### Making LIRS O(1) Efficient

- HIR IRR (new IRR of the HIR block) versus Rmax (maximum recency of the LIR blocks).
- This efficiency is achieved by our LIRS stack.
- LRU stack + the LIR block with Rmax recency at its bottom ==> LIRS stack.

#### Differences between LRU and LIRS Stacks

- The stack size of LRU is decided by the cache size and is fixed; the stack size of LIRS is decided by Rmax and varies.
- The LRU stack holds only resident blocks; the LIRS stack holds any blocks whose recencies are no more than Rmax.
- The LRU stack does not distinguish "hot" and "cold" blocks in it; the LIRS stack distinguishes LIR and HIR blocks in it, and dynamically maintains their statuses.

[Figure: LRU stack (5 3 2 1 6; cache size L = 5) next to LIRS stack (5 3 2 1 6 9 4 8; Llir = 3, Lhir = 2); legend: resident block, LIR block, HIR block.]

#### How does LIRS Stack Help?

- HIR IRR (new IRR of the HIR block) versus Rmax (maximum recency of the LIR blocks).
- Blocks in the LIRS stack ==> IRR < Rmax
- Other blocks ==> IRR > Rmax

#### LIRS Operations

- Initialization: all the referenced blocks are given an LIR status until the LIR block set is full.
- We place resident HIR blocks in Stack Q.

[Figure: LIRS stack S (5 3 2 1 6 9 4 8) and resident HIR stack Q (5 3); cache size L = 5, Llir = 3, Lhir = 2; legend: resident in cache, LIR block, HIR block.]

#### Access an LIR Block (a Hit)

[Figure: three frames of LIRS stack S and resident HIR stack Q; reference streams shown: "... 5 9 7 5 3 8 4" and "... 5 9 7 5 3 8"; cache size L = 5, Llir = 3, Lhir = 2.]

#### Access a Resident HIR Block (a Hit)
[Figure: four frames of LIRS stack S and resident HIR stack Q while resident HIR blocks are accessed; reference streams shown: "... 5 9 7 5 3" and "... 5 9 7 5"; cache size L = 5, Llir = 3, Lhir = 2; legend: resident in cache, LIR block, HIR block.]

#### Access a Non-Resident HIR Block (a Miss)

[Figure: three frames of LIRS stack S and resident HIR stack Q while non-resident HIR blocks are accessed; reference streams shown: "... 5 9 7", "... 5 9" and "... 5"; cache size L = 5, Llir = 3, Lhir = 2.]

#### Workload Traces

postgres is a trace of join queries among four relations in a relational database system; sprite is from the Sprite network f...
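Returning to the LIRS operations illustrated earlier (hits on LIR blocks, hits on resident HIR blocks, misses on non-resident HIR blocks), the following is a minimal sketch of how stack S and stack Q might be maintained. The slides show these steps only as figures, so the rules here follow the commonly described LIRS procedure (stack pruning, promotion of an HIR block found in S, demotion of the bottom LIR block) and should be read as an assumption. The class name `LIRSCache`, its methods, and the seeded starting state (LIR blocks 1, 4, 8) are illustrative guesses based on the figure sequence.

```python
from collections import OrderedDict

LIR, HIR = "LIR", "HIR"

class LIRSCache:
    """Toy LIRS cache: S is the LIRS stack, Q holds the resident HIR blocks."""

    def __init__(self, llir, lhir):
        assert llir >= 1 and lhir >= 1     # sketch assumes both sets are non-empty
        self.llir, self.lhir = llir, lhir
        self.S = OrderedDict()   # block -> status; first item = bottom, last = top
        self.Q = OrderedDict()   # resident HIR blocks; first item is evicted next
        self.lir_count = 0

    def _prune(self):
        # Remove HIR entries from the bottom of S until an LIR block is at the bottom.
        while self.S and next(iter(self.S.values())) != LIR:
            self.S.popitem(last=False)

    def _promote(self, block):
        # `block` (an HIR block still in S) becomes LIR and moves to the top of S;
        # the LIR block at the bottom of S becomes a resident HIR block at the end of Q.
        self.S[block] = LIR
        self.S.move_to_end(block)
        victim, _ = self.S.popitem(last=False)
        self.Q[victim] = None
        self._prune()

    def access(self, block):
        """Reference `block`; return True on a hit and False on a miss."""
        status = self.S.get(block)
        if status == LIR:                  # hit on an LIR block
            self.S.move_to_end(block)
            self._prune()
            return True
        if block in self.Q:                # hit on a resident HIR block
            del self.Q[block]
            if status == HIR:              # still in S, so its new IRR is below Rmax
                self._promote(block)
            else:                          # not in S: stays HIR, re-enters S and Q
                self.S[block] = HIR
                self.Q[block] = None
            return True
        if self.lir_count < self.llir:     # initialization: fill the LIR set first
            self.S[block] = LIR
            self.lir_count += 1
            return False
        if len(self.Q) >= self.lhir:       # make room: evict the front of Q
            self.Q.popitem(last=False)     # it stays in S (if there) as non-resident
        if status == HIR:                  # non-resident HIR block still in S
            self._promote(block)
        else:
            self.S[block] = HIR
            self.Q[block] = None
        return False

# Seed the state assumed from the figures: S = 5 3 2 1 6 9 4 8 (top to bottom),
# Q = 5 3, with blocks 1, 4 and 8 taken to be the LIR blocks.
cache = LIRSCache(llir=3, lhir=2)
cache.S = OrderedDict([(8, LIR), (4, LIR), (9, HIR), (6, HIR),
                       (1, LIR), (2, HIR), (3, HIR), (5, HIR)])
cache.Q = OrderedDict.fromkeys([3, 5])
cache.lir_count = 3

hits = [cache.access(b) for b in (4, 8, 3, 5, 7, 9, 5)]
print(hits)                          # [True, True, True, True, False, False, False]
print(list(cache.S), list(cache.Q))  # S bottom to top: [8, 3, 7, 9, 5]; Q: [9, 4]
```

Both stacks are `OrderedDict`s, so moving a block to the top, popping the bottom, and membership tests are constant-time. The pruning loop is amortized, because each entry is pruned at most once per insertion into S.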

