leung - USENIX Association 7th USENIX Conference on File...


USENIX Association, 7th USENIX Conference on File and Storage Technologies

Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems

Andrew W. Leung ⋆, Minglong Shao †, Timothy Bisson †, Shankar Pasupathy †, Ethan L. Miller ⋆
⋆ University of California, Santa Cruz    † NetApp
{aleung, elm}@cs.ucsc.edu    {minglong, tbisson, shankarp}@netapp.com

Abstract

The scale of today’s storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system that is specially designed for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files.

Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties. Flexible index control is provided by an index partitioning mechanism that leverages namespace locality. Signature files are used to significantly reduce a query’s search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and “back-in-time” search of metadata.

An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can quickly be updated and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10× faster than existing approaches.

1 Introduction

The rapidly growing amount of data in today’s storage systems makes finding and managing files extremely difficult. Storage users and administrators need to efficiently answer questions about the properties of the files being stored in order to properly manage this increasingly large sea of data. Metadata search, which involves indexing file metadata such as inode fields and extended attributes, can help answer many of these questions [26]. Metadata search allows point, range, top-k, and aggregation search over file properties, facilitating complex, ad hoc queries about the files being stored. For example, it can help an administrator answer “which files can be moved to second-tier storage?” or “which application’s and user’s files are consuming the most space?”. Metadata search can also help a user find his or her ten most recently accessed presentations or largest virtual machine images. Efficiently answering these questions can greatly improve how users and administrators manage files in large-scale storage systems....
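The four query types named in the introduction (point, range, top-k, and aggregation) can be illustrated with a small sketch. This is not Spyglass's index or API: it is a plain linear scan over a handful of in-memory records, and the `FileMeta` record, its field names, the sample files, and the helper functions are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record; field names are assumptions."""
    path: str
    owner: str
    ext: str
    size: int   # bytes
    atime: int  # last access time, seconds since the epoch


# Made-up sample records, not taken from the paper's traces.
FILES = [
    FileMeta("/home/ann/talk.ppt", "ann", "ppt", 4_000_000, 1_700_000_300),
    FileMeta("/home/ann/old.ppt", "ann", "ppt", 2_000_000, 1_600_000_000),
    FileMeta("/home/bob/vm1.vmdk", "bob", "vmdk", 9_000_000_000, 1_650_000_000),
    FileMeta("/home/bob/notes.txt", "bob", "txt", 12_000, 1_700_000_100),
]


def point(files, **equals):
    """Point search: every named field must equal the given value."""
    return [f for f in files
            if all(getattr(f, k) == v for k, v in equals.items())]


def range_search(files, field, lo, hi):
    """Range search: keep records with lo <= field <= hi."""
    return [f for f in files if lo <= getattr(f, field) <= hi]


def top_k(files, field, k):
    """Top-k search: the k records with the largest value of `field`."""
    return heapq.nlargest(k, files, key=lambda f: getattr(f, field))


def total_size_by(files, field):
    """Aggregation: total bytes consumed, grouped by `field`."""
    totals = defaultdict(int)
    for f in files:
        totals[getattr(f, field)] += f.size
    return dict(totals)
```

For instance, `top_k(point(FILES, ext="ppt"), "atime", 10)` corresponds to the "ten most recently accessed presentations" question, and `total_size_by(FILES, "owner")` to "which user's files are consuming the most space?". Every call here scans all records, which is the cost that a purpose-built metadata index aims to avoid at scale.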

This note was uploaded on 11/12/2011 for the course CE 726 taught by Professor Staf during the Spring '11 term at SUNY Buffalo.
