
Space efficiency in Synopsis construction algorithms

Sudipto Guha*
Department of Computer and Information Sciences
University of Pennsylvania, Philadelphia PA 19104

* Supported in part by an Alfred P. Sloan Research Fellowship and by an NSF Award CCF-0430376. Email: [email protected]

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005

Abstract

Histograms and Wavelet synopses have been found to be useful in query optimization, approximate query answering and mining. Over the last few years several good synopsis algorithms have been proposed. These have mostly focused on the running time of the synopsis constructions, optimum or approximate, vis-a-vis their quality. However, the space complexity of synopsis construction algorithms has not been investigated as thoroughly. Many of the optimum synopsis construction algorithms (as well as a few of the approximate ones) are expensive in space. In this paper, we propose a general technique that reduces space complexity. We show that the notion of “working space” proposed in these contexts is redundant. We believe that our algorithm also generalizes to a broader range of dynamic programs beyond synopsis construction. Our modifications can be easily adapted to existing algorithms. We demonstrate the performance benefits through experiments on real-life and synthetic data.

1 Introduction

Wavelet and Histogram representations are important data analysis tools and have been used in image analysis and signal processing for a long time. Most applications of these techniques consider representing the input in terms of the broader characteristics of the data, referred to as a synopsis or signature. These synopses or signatures, typically constructed to minimize some desired error criterion, are used subsequently in a variety of ways. A few of the highlights include applications in OLAP/DSS systems by Haas et al. [18], in approximate query answering by Amsaleg et al. [2] and Acharya et al. [1], and more recently in mining time series by Chakraborty et al. [5].

Histograms were one of the earliest synopses used in the context of database query optimization [29, 25]. Since the introduction of serial histograms by Ioannidis [19], this area has been the focus of a significant body of research, e.g., [20, 28, 21, 11, 16] among many others. Matias, Vitter and Wang [24] gave one of the first proposals for using Wavelet-based synopses, and over the last few years this topic has also received significant attention from different groups of researchers [4, 12, 9, 8, 26].

Histograms and Wavelets are not the only synopsis structures; quantiles and samples have been used widely as well. We will not be able to cover
