y are reused
Issues
when to run the cleaner?
how many segments to clean at a time?
*which segments to clean?
*how to re-write the live blocks?
Segment cleaning
Old segments contain
• live data
• “dead data” belonging to files that were deleted or overwritten
Segment cleaning involves writing out the live data
Segment summary block identifies each piece of information
in the segment (for data blocks, the inode to which each belongs)
Segment cleaning (cont’d)
Segment cleaning process involves
1. reading a number of segments into memory
2. identifying the live data
3. writing them back to a smaller number of clean segments
Key issue is where to write these live data
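The three cleaning steps above can be sketched in a few lines of Python. This is only an illustrative model, not the file system's actual code: the `Block` class, its `inode` and `live` fields, and the `clean` function are hypothetical names, and a real cleaner would determine liveness from the segment summary block and the inodes rather than from a stored flag.

```python
from dataclasses import dataclass


@dataclass
class Block:
    inode: int   # owning file (hypothetical field)
    live: bool   # False if the file was deleted or the block was overwritten


def clean(segments, blocks_per_segment):
    """Compact the live blocks of `segments` into as few segments as possible."""
    # Steps 1 and 2: read the segments and keep only the live blocks.
    live = [b for seg in segments for b in seg if b.live]
    # Step 3: write the live blocks back, filling one clean segment at a time.
    return [live[i:i + blocks_per_segment]
            for i in range(0, len(live), blocks_per_segment)]


old = [
    [Block(1, True), Block(1, False), Block(2, False), Block(3, True)],
    [Block(4, False), Block(4, False), Block(5, True), Block(5, False)],
    [Block(6, True), Block(7, False), Block(7, False), Block(8, False)],
]
new = clean(old, blocks_per_segment=4)
print(len(old), "segments cleaned into", len(new))
```

In this toy input the three old segments hold four live blocks in total, so they fit into a single new segment and the other two become free.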
Write cost
u = utilization (fraction of live data)
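The formula relating write cost to u did not survive in this text. As a hedged reconstruction, assuming the slides follow the write-cost definition from the original Sprite LFS paper (Rosenblum and Ousterhout): cleaning a segment of utilization u costs 1 to read it and u to write its live data back, which frees 1 - u of a segment for new data. Total traffic per unit of new data is therefore (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u), with a write cost of 1 when u = 0, since an empty segment need not be read at all. The function name `write_cost` is an assumption for illustration.

```python
def write_cost(u):
    """Write cost when the cleaned segments have utilization u (0 <= u < 1)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must satisfy 0 <= u < 1")
    if u == 0:
        return 1.0              # empty segments are reused without being read
    return 2.0 / (1.0 - u)      # (read 1 + copy u + new data 1-u) / (1-u)


for u in (0.0, 0.2, 0.5, 0.8, 0.9):
    print(f"u = {u:.1f}  write cost = {write_cost(u):.1f}")
```

The cost grows slowly at low utilization and steeply as u approaches 1: about 2.5 at u = 0.2, 4 at u = 0.5, and 20 at u = 0.9.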
Segment Cleaning Policies: which
Greedy policy: always cleans the least-utilized segments
Cost-benefit policy: selects segments with the highest benefit-to-cost ratio
• older data: more stable
• newer data: more likely to be modified or deleted, so cleaning it wastes time
• cost of cleaning a segment: 1 to read, u to copy
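The two policies can be compared with a small sketch. It assumes the benefit-to-cost ratio used in the original Sprite LFS paper, (1 - u) * age / (1 + u): the benefit is the free space gained (1 - u) weighted by the age of the data, and the cost is 1 to read the segment plus u to copy its live data. The dictionary layout, the segment names, and the age units are hypothetical.

```python
def greedy_pick(segments):
    """Greedy policy: clean the least-utilized segment."""
    return min(segments, key=lambda s: s["u"])


def benefit_to_cost(u, age):
    """Free space gained times age, divided by the cost (1 to read, u to copy)."""
    return (1.0 - u) * age / (1.0 + u)


def cost_benefit_pick(segments):
    """Cost-benefit policy: clean the segment with the highest ratio."""
    return max(segments, key=lambda s: benefit_to_cost(s["u"], s["age"]))


segments = [
    {"name": "hot", "u": 0.50, "age": 1},     # recently written, still changing
    {"name": "cold", "u": 0.75, "age": 20},   # old, stable data
]
print("greedy picks:      ", greedy_pick(segments)["name"])
print("cost-benefit picks:", cost_benefit_pick(segments)["name"])
```

Here the greedy policy picks the hot segment because it has the lower utilization, while the cost-benefit policy picks the cold one: its free space is likely to stay free, whereas the hot segment will probably lose more live data on its own if left alone.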
Copying live blocks: where
• sorts the blocks by the time they were last modified
• groups blocks of similar age together into new segments
Age of a block is a good predictor of its survival
Supports cost-benefit policy
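The age sort described above amounts to ordering the live blocks by modification time and cutting the ordered list into segment-sized runs. The sketch below is a minimal illustration; the `(name, last_modified)` tuples and the `group_by_age` function are hypothetical, and times are arbitrary integers.

```python
def group_by_age(live_blocks, blocks_per_segment):
    """Sort live blocks by last-modified time, then pack them into segments."""
    ordered = sorted(live_blocks, key=lambda block: block[1])  # oldest first
    return [ordered[i:i + blocks_per_segment]
            for i in range(0, len(ordered), blocks_per_segment)]


# (block name, last-modified time): a mix of old and recently written blocks
blocks = [("a", 100), ("b", 5), ("c", 98), ("d", 7), ("e", 99), ("f", 3)]
for segment in group_by_age(blocks, blocks_per_segment=3):
    print([name for name, _ in segment])
```

The old blocks (f, b, d) end up together in one segment and the recent ones (c, e, a) in another, so a segment tends to be either mostly stable or mostly short-lived, which is the separation the cost-benefit policy relies on.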
Simulation results
Consider two file access patterns
90% of the accesses involve 10% of the files
10% of the accesses involve 90% of the files
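A skewed access pattern of this kind is easy to generate for a simulation. The sketch below is an assumption about how one might do it, not the simulator behind these results: `pick_file`, its parameters, and the convention that the lowest-numbered files are the frequently accessed ones are all hypothetical.

```python
import random


def pick_file(rng, n_files, hot_fraction=0.1, hot_access=0.9):
    """Return a file index; the first hot_fraction of files get hot_access of the accesses."""
    n_hot = max(1, int(n_files * hot_fraction))
    if rng.random() < hot_access:
        return rng.randrange(n_hot)           # one of the frequently accessed files
    return rng.randrange(n_hot, n_files)      # one of the remaining files


rng = random.Random(0)                         # fixed seed for repeatability
accesses = [pick_file(rng, 1000) for _ in range(100_000)]
hot_share = sum(f < 100 for f in accesses) / len(accesses)
print(f"share of accesses going to 10% of the files: {hot_share:.3f}")
```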
Greedy policy
Write cost is very sensitive to disk utilization
• higher disk utilizations result in more frequent segment cleanings
• the cleaner will also have to clean segments that contain more live data