
One-Pass Wavelet Synopses for Maximum-Error Metrics*

Panagiotis Karras, Nikos Mamoulis
Department of Computer Science
University of Hong Kong
Pokfulam Road, Hong Kong
{pkarras,nikos}@cs.hku.hk

Abstract

We study the problem of computing wavelet-based synopses for massive data sets in static and streaming environments. A compact representation of a data set is obtained after a thresholding process is applied to the coefficients of its wavelet decomposition. Existing polynomial-time thresholding schemes that minimize maximum-error metrics are disadvantaged by impracticable time and space complexities and are not applicable in a data stream context. This is a cardinal issue, as the problem at hand in its most practically interesting form involves the time-efficient approximation of huge amounts of data, potentially in a streaming environment. In this paper we fill this gap by developing efficient and practicable wavelet thresholding algorithms for maximum-error metrics, for both a static and a streaming case. Our algorithms achieve near-optimal accuracy and superior runtime performance, as our experiments show, under frugal space requirements in both contexts.

1 Introduction

Several database applications require the reduction of vast amounts of data into a more manageable size. Such data reduction is useful in situations where exactness is not valued as highly as speed. For example, in order to evaluate a query execution plan, it is imperative to estimate the selectivity of the query components efficiently, while it is not necessary to get precise knowledge about it. In Decision Support Systems (DSS) applications, a user is not primarily interested in the exact (expensive to retrieve) answer to a query, but in a fairly accurate estimation of it, such that it would reveal the basic features of the examined body of data. Moreover, the need for quick and reliable data approximation is prominent in situations where massive data arrives in a stream; in such settings the approximation also needs to be extracted in a single pass over the data.

Wavelet decomposition [1] provides a very effective data reduction tool, with applications in data mining [12], selectivity estimation [13], and approximate and aggregate query processing of massive relational tables [16, 4] and data streams [7]. In simple terms, a wavelet synopsis is extracted by applying the wavelet decomposition to an input collection (considered as a sequence of values) and then

* Work sponsored by grants HKU 7380/02E and HKU 7149/03E from Hong Kong RGC.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005
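As a rough illustration of the decompose-then-threshold pipeline that the abstract and introduction refer to, the sketch below computes an unnormalized Haar wavelet decomposition, keeps a fixed budget of coefficients, and measures the maximum absolute error of the reconstruction. It assumes the Haar basis and an input whose length is a power of two. The retention rule shown is the conventional one (keep the coefficients of largest normalized magnitude), which targets overall squared error; it is not the maximum-error thresholding algorithm this paper develops. All function names are hypothetical and chosen for illustration only.

```python
import math


def haar_decompose(data):
    """Unnormalized Haar transform: overall average, then details coarsest-first."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in data]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose by expanding one resolution level at a time."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(current)]
        expanded = []
        for avg, diff in zip(current, level):
            expanded.extend((avg + diff, avg - diff))
        pos += len(current)
        current = expanded
    return current


def threshold_largest(coeffs, budget):
    """Zero all but the `budget` coefficients of largest normalized magnitude.

    Conventional squared-error-oriented rule, used here as a stand-in only.
    """
    n = len(coeffs)

    def significance(i):
        # Coefficient i influences n / 2^level data values.
        level = 0 if i == 0 else i.bit_length() - 1
        return abs(coeffs[i]) * math.sqrt(n >> level)

    keep = set(sorted(range(n), key=significance, reverse=True)[:budget])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def max_abs_error(data, approx):
    """Maximum absolute error between the data and its approximation."""
    return max(abs(d - a) for d, a in zip(data, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(data)
synopsis = threshold_largest(coeffs, budget=4)
print(coeffs)
print(max_abs_error(data, haar_reconstruct(synopsis)))
```

The synopsis here is simply the coefficient list with the discarded entries zeroed. A rule that minimizes maximum error would generally select a different set of coefficients for the same budget, which is the problem the paper addresses.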