Disk and Tape Storage Cost Models

Richard L. Moore, Jim D'Aoust, Robert H. McDonald and David Minor; San Diego Supercomputer Center, University of California San Diego; La Jolla, CA, USA

Abstract

The current and projected costs of storage are a critical issue as organizations face explosive growth in data. While the cost of purchasing storage hardware is readily available from vendors, there is little published literature describing the total cost of providing storage from an operational perspective. This paper presents current estimates of both disk and tape storage costs based on operational experience at the San Diego Supercomputer Center, which operates a large-scale storage infrastructure. These costs include not only the storage hardware itself, but also the costs of supporting servers and related infrastructure, hardware maintenance, software licenses, floor space, utilities and labor. A brief discussion of projected cost trends in both disk and tape is provided, as well as a comparison to current web-based commercial storage services.

Background and Objectives

Virtually all organizations face explosive growth in their storage requirements, including exponentially growing volumes of data held over increasing retention periods. Researchers at UC Berkeley estimated that 5 exabytes of data were produced in 2003, while IDC recently estimated that 161 exabytes of digital information were produced in 2006 and projected nearly 1 zettabyte of new data in 2010. It is therefore critical for organizations to develop metrics that capture the real cost of data storage. The San Diego Supercomputer Center (SDSC) has operated a large-scale 24/7 production data center for more than 20 years. This experience provides a comprehensive understanding of the operational costs required to provide a long-term sustainable storage infrastructure.
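The cited figures imply a striking growth rate. As a quick illustration, the per-year growth factor and doubling time implied by the 2003 and 2006 estimates can be computed directly; the assumption of smooth exponential growth between the two point estimates is ours, not the cited studies'.

```python
import math

# Figures cited in the text: 5 exabytes produced in 2003,
# 161 exabytes in 2006. Everything else is derived arithmetic.
EB_2003, EB_2006 = 5.0, 161.0
years = 2006 - 2003

# Per-year growth multiplier implied by the two estimates.
growth_factor = (EB_2006 / EB_2003) ** (1 / years)

# Time for data production to double at that rate, in years.
doubling_years = math.log(2) / math.log(growth_factor)

print(round(growth_factor, 2))        # ~3.18x per year
print(round(doubling_years * 12, 1))  # ~7.2 months per doubling
```

At this pace, annual data production more than triples each year, which is why the retention-cost question the paper addresses becomes urgent.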
SDSC currently operates more than 2,500 terabytes (TB) of disk storage from several different vendors, including fibre-channel, SATA and MAID (Massive Array of Idle Disks) disk systems. The data volume stored in SDSC's tape-based archival system has grown exponentially with a remarkably consistent doubling rate of ~15 months, and now exceeds 5 petabytes (PB); the current capacity is 25 PB without compression. The data infrastructure at SDSC is provided for a wide variety of applications and users, including simulation output from the national supercomputing research community, experimental and sensor data from the scientific community, and digital library collections from the Library of Congress, the National Archives and Records Administration, and others. As SDSC's storage infrastructure grows in size and evolves to support a broader set of communities and services, it is critical to develop comprehensive cost models for current and future sustainable storage....
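The figures above (a ~15-month doubling rate, more than 5 PB stored, 25 PB uncompressed capacity) are enough to sketch a capacity-planning projection. The following is a minimal sketch assuming growth continues at exactly the reported doubling rate from a 5 PB starting point; it is illustrative only, not SDSC's planning model.

```python
import math

# Figures taken from the text; steady exponential growth is assumed.
DOUBLING_MONTHS = 15.0   # reported doubling rate of the tape archive
CURRENT_PB = 5.0         # volume stored now (text says it "exceeds 5 PB")
CAPACITY_PB = 25.0       # current capacity without compression

def volume_after(months, start_pb=CURRENT_PB):
    """Projected stored volume after `months` of steady doubling."""
    return start_pb * 2 ** (months / DOUBLING_MONTHS)

def months_to_reach(target_pb, start_pb=CURRENT_PB):
    """Months until the archive grows from `start_pb` to `target_pb`."""
    return DOUBLING_MONTHS * math.log2(target_pb / start_pb)

print(round(volume_after(15), 1))              # one doubling: 10.0 PB
print(round(months_to_reach(CAPACITY_PB), 1))  # ~34.8 months to fill 25 PB
```

Under these assumptions the 25 PB capacity is exhausted in under three years, which motivates the paper's interest in projecting both disk and tape cost trends rather than only snapshot prices.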