Block-level RAID is dead

Raja Appuswamy, David C. van Moolenbroek, Andrew S. Tanenbaum
Vrije Universiteit, Amsterdam
{raja, dcvmoole, ast}@cs.vu.nl

Abstract

The common storage stack as found in most operating systems has remained unchanged for several decades. In this stack, the RAID layer operates under the file system layer, at the block abstraction level. We argue that this arrangement of layers has fatal flaws. In this paper, we highlight its main problems, and present a new storage stack arrangement that solves these problems.

1 Introduction

The concept of RAID [13] is a landmark in the history of storage systems. Taking advantage of the traditional block interface used by file systems, RAID algorithms were integrated at the block level, thus retaining perfect backward compatibility with existing installations. As installations became larger, administrative tools like volume managers [20] followed suit. These tools broke the “one file system per disk” bond and made it possible to resize file systems on the fly. Volumes also served as a convenient point for policy assignment (choosing RAID levels, for instance) and quota enforcement. Together, we refer to RAID and volume management solutions as the RAID layer.

The compatibility-driven integration of the RAID layer at the block level has remained unchanged despite significant changes in the storage hardware landscape. We believe that it is time to retire block-level RAID.

In this paper, we highlight several significant problems associated with the traditional block-level RAID implementation (Section 2). We briefly discuss proposed solutions and explain why they do not solve all the problems (Section 3). We then present Loris, a clean-slate design of the storage stack (Section 4), and highlight how it solves all the problems by design (Section 5).

2 Problems with block-level RAID

In this section, we will provide an in-depth look at some of the problems that plague block-level RAID implementations.

2.1 Silent data corruption

Modern disk drives exhibit a range of partial failures [14, 6], like lost, misdirected, and torn writes. In all these cases, the drive reports back a success, resulting in data being silently corrupted.

Various checksumming techniques have been developed to detect data corruption [16] and they offer varying levels of reliability. One technique that is capable of detecting all the aforementioned sources of corruption involves storing the checksum of a block with its parent (the inode, for instance). This has been referred to as parental checksumming. Since such a technique involves knowledge of block relationships, it can only be employed by file systems.

However, parental checksumming loses its benefit when used in combination with block-level RAID. This is because file system-initiated reads undergo verification, whereas RAID-initiated reads (a subtractive read to recompute parity, for instance) are completely unverified. As a result, RAID can propagate data corruption, leading to data loss [12]. For instance, if a corrupt
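
The interaction between parental checksumming and an unverified RAID-initiated read can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions, not the scenario or the implementation from the paper: the `Stack` class, its method names, the single three-block stripe with XOR parity, and the 4-byte blocks are all hypothetical. It assumes a block is silently corrupted on disk, is then fully overwritten by the file system (so the file system never reads and verifies the old contents), and that the RAID layer updates parity subtractively from the old, corrupt block.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum that the parent (e.g. an inode) stores for a child block."""
    return hashlib.sha256(block).hexdigest()


def xor(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks, as used for RAID parity."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


class Stack:
    """Hypothetical toy file system on top of a one-stripe parity array."""

    DATA = ("d0", "d1", "d2")

    def __init__(self, blocks):
        # RAID layer: raw data blocks plus one parity block.
        self.disk = dict(blocks)
        self.disk["p"] = xor(*(blocks[n] for n in self.DATA))
        # File-system layer: the parent holds a checksum per child block.
        self.parent = {n: checksum(blocks[n]) for n in self.DATA}

    def fs_read(self, name):
        """File-system read: verified against the parent's checksum."""
        data = self.disk[name]
        if checksum(data) != self.parent[name]:
            raise IOError(f"checksum mismatch on {name}")
        return data

    def fs_overwrite(self, name, new_data):
        """Full-block overwrite: the file system never reads the old data."""
        self.parent[name] = checksum(new_data)
        self._raid_small_write(name, new_data)

    def _raid_small_write(self, name, new_data):
        """Subtractive parity update: the old block is read unverified."""
        old_data = self.disk[name]
        self.disk["p"] = xor(self.disk["p"], old_data, new_data)
        self.disk[name] = new_data

    def raid_reconstruct(self, lost):
        """Rebuild one data block from parity and the surviving blocks."""
        survivors = [self.disk[n] for n in self.DATA if n != lost]
        return xor(self.disk["p"], *survivors)


def run_scenario():
    stack = Stack({"d0": b"AAAA", "d1": b"BBBB", "d2": b"CCCC"})
    stack.disk["d0"] = b"AXAA"  # silent corruption; the drive reported success

    try:  # a file-system read of d0 would detect the corruption
        stack.fs_read("d0")
        detected = False
    except IOError:
        detected = True

    stack.fs_overwrite("d0", b"DDDD")  # RAID folds the corrupt block into parity
    rebuilt = stack.raid_reconstruct("d1")  # later, the disk holding d1 fails
    return {
        "fs_detected_corruption": detected,
        "d1_recovered_intact": rebuilt == b"BBBB",
        "d1_passes_checksum": checksum(rebuilt) == stack.parent["d1"],
    }


print(run_scenario())
```

In this sketch the parent's checksum does flag the rebuilt `d1` as bad, but the correct contents can no longer be recovered, because the parity was computed from a block that no layer verified. A block that was never itself corrupted is lost.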