p13-cormode - Sketching Streams Through the Net:...

Sketching Streams Through the Net: Distributed Approximate Query Tracking

Graham Cormode
Bell Labs, Lucent Technologies
cormode@bell-labs.com

Minos Garofalakis*
Intel Research, Berkeley
minos@acm.org

* Work done while at Bell Labs, Lucent Technologies.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005

Abstract

Emerging large-scale monitoring applications require continuous tracking of complex data-analysis queries over collections of physically-distributed streams. Effective solutions have to be simultaneously space/time efficient (at each remote monitor site), communication efficient (across the underlying communication network), and provide continuous, guaranteed-quality approximate query answers. In this paper, we propose novel algorithmic solutions for the problem of continuously tracking a broad class of complex aggregate queries in such a distributed-streams setting. Our tracking schemes maintain approximate query answers with provable error guarantees, while simultaneously optimizing the storage space and processing time at each remote site, and the communication cost across the network. They rely on tracking general-purpose randomized sketch summaries of local streams at remote sites along with concise prediction models of local site behavior in order to produce highly communication- and space/time-efficient solutions. The result is a powerful approximate query tracking framework that readily incorporates several complex analysis queries (including distributed join and multi-join aggregates, and approximate wavelet representations), thus giving the first known low-overhead tracking solution for such queries in the distributed-streams model.

1 Introduction

Traditional data-management applications typically require database support for a variety of one-shot queries, including lookups, sophisticated slice-and-dice operations, data mining tasks, and so on. One-shot means the data processing is essentially done once, in response to the posed query. This has led to a very successful industry of database engines optimized for supporting complex, one-shot SQL queries over large amounts of data. Recent years, however, have witnessed the emergence of a new class of large-scale event monitoring applications that pose novel data-management challenges. In one class of applications, monitoring a large-scale system is a crucial aspect of system operation and maintenance. As an example, consider the Network Operations Center (NOC) for the IP-backbone net-