
USENIX Association, 7th USENIX Conference on File and Storage Technologies, page 153

Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems

Andrew W. Leung, Minglong Shao, Timothy Bisson, Shankar Pasupathy, Ethan L. Miller
University of California, Santa Cruz; NetApp
{aleung, elm}@cs.ucsc.edu, {minglong, tbisson, shankarp}@netapp.com

Abstract

The scale of today’s storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system that is specially designed for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files.

Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties. Flexible index control is provided by an index partitioning mechanism that leverages namespace locality. Signature files are used to significantly reduce a query’s search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and “back-in-time” search of metadata.

An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can quickly be updated and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10× faster than existing approaches.

1 Introduction

The rapidly growing amounts of data in today’s storage systems make finding and managing files extremely difficult. Storage users and administrators need to efficiently answer questions about the properties of the files being stored in order to properly manage this increasingly large sea of data. Metadata search, which involves indexing file metadata such as inode fields and extended attributes, can help answer many of these questions [26]. Metadata search allows point, range, top-k, and aggregation search over file properties, facilitating complex, ad hoc queries about the files being stored. For example, it can help an administrator answer “which files can be moved to second tier storage?” or “which application’s and user’s files are consuming the most space?” Metadata search can also help a user find his or her ten most recently accessed presentations or largest virtual machine images. Efficiently answering these questions can greatly improve how users and administrators manage files in large-scale storage systems.
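To make the four query types named above concrete, the following is a minimal sketch in Python. It is not Spyglass’s implementation: it scans a small in-memory list by brute force, whereas the paper describes a partitioned index. The record type FileMeta, its fields, the sample data, and all function names are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record; field names are illustrative only."""
    path: str
    owner: str
    ext: str
    size: int   # bytes
    atime: int  # last access time, seconds since the epoch


# Made-up sample data, not taken from the paper's traces.
FILES = [
    FileMeta("/home/alice/talk.ppt", "alice", "ppt", 4_000_000, 1_700_000_300),
    FileMeta("/home/alice/old.ppt", "alice", "ppt", 2_000_000, 1_600_000_000),
    FileMeta("/home/bob/vm1.vmdk", "bob", "vmdk", 8_000_000_000, 1_700_000_100),
    FileMeta("/home/bob/notes.txt", "bob", "txt", 1_200, 1_650_000_000),
    FileMeta("/proj/build/out.o", "alice", "o", 350_000, 1_500_000_000),
]


def point_query(files, owner):
    """Point search: files whose owner equals a single value."""
    return [f for f in files if f.owner == owner]


def range_query(files, atime_before):
    """Range search: files last accessed before a cutoff,
    e.g. candidates for moving to second tier storage."""
    return [f for f in files if f.atime < atime_before]


def top_k_recent(files, owner, ext, k):
    """Top-k search: the k most recently accessed files of one type for one user."""
    matches = (f for f in files if f.owner == owner and f.ext == ext)
    return heapq.nlargest(k, matches, key=lambda f: f.atime)


def space_by_owner(files):
    """Aggregation search: total bytes consumed per owner."""
    totals = defaultdict(int)
    for f in files:
        totals[f.owner] += f.size
    return dict(totals)


if __name__ == "__main__":
    print([f.path for f in point_query(FILES, "bob")])
    print([f.path for f in range_query(FILES, 1_620_000_000)])
    print([f.path for f in top_k_recent(FILES, "alice", "ppt", 1)])
    print(space_by_owner(FILES))
```

Every function here touches every record, so the cost grows linearly with the number of files. That is the cost a purpose-built metadata index is meant to avoid at scale.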