Lecture13-Nov16-05 - Sensor Data Management In Sensor...


Sensor Data Management In Sensor Networks

Towards Sensor Database Systems [Bonnet+ 2001]

I. Introduction
– This paper defines a model for sensor databases
– Stored data are represented as relations, while sensor data are represented as time series; each long-running query formulated over a sensor database defines a persistent view, which is maintained during a given time interval
– The design and implementation of the COUGAR sensor database system are also described
– Applications monitor the world by querying and analyzing sensor data
– Examples of monitoring applications include
  o supervising items in a factory warehouse,
  o gathering information in a disaster area,
  o organizing vehicle traffic in a large city
– These applications involve a combination of stored data (a list of sensors and their related attributes, such as their location) and sensor data
– Such combinations are called sensor databases
– This paper focuses on sensor query processing: the design, algorithms, and implementations used to run queries over sensor databases
– A sensor query is defined as a query expressed over a sensor database

Factory Warehouse Example
– Each item in a factory warehouse has a stick-on temperature sensor attached to it; other sensors are attached to walls and embedded in floors and ceilings
– Each sensor provides two signal-processing functions:
  1. getTemperature() returns the measured temperature at regular intervals
  2. detectAlarmTemperature(threshold) returns the temperature whenever it crosses a certain threshold
– Each sensor is able to communicate this data and/or to store it locally
– The sensor database stores the identifiers of all sensors in the warehouse along with their locations, and is connected to the sensor network
– The sensor database is used to make sure that items do not overheat
– Typical queries that are run continuously may include:
  o Query 1: “Return repeatedly the abnormal temperatures measured”
  o Query 2: “Every minute, return the temperature measured on the third floor”
  o Query 3: “Generate a notification whenever two sensors within 5 yards of each other simultaneously measure an abnormal temperature”
  o Query 4: “Every five minutes retrieve the maximum temperature measured over the last five minutes”
  o Query 5: “Return the average temperature measured on each floor over the last 10 minutes”
– These example queries have the following characteristics:
  o Monitoring queries are long running
  o The desired result of a query is typically a series of notifications of system activity (periodic or triggered by special situations)
  o Queries need to correlate data produced simultaneously by different sensors
  o Queries need to aggregate sensor data over time windows
  o Most queries contain some condition restricting the set of sensors that are involved (usually geographical conditions)
– Queries are formulated regardless of the physical structure or organization of the sensor network, since the actual structure and population of a sensor network may vary over the lifespan of a query
– There are similarities with relational database query processing: most applications combine sensor data with stored data
– Sensor data differs from traditional relational data: it is not stored in a database server, and it varies over time
– There are two approaches for processing sensor queries:
  o Warehousing approach:
    - represents the current state of the art
    - processing of sensor queries and access to the sensor network are separated; the sensor network is used by a data collection mechanism
    - suited for answering predefined queries over historical data
    - proceeds in two steps: (i) data is extracted from the sensor network in a predefined manner and stored in a database located on a unique front-end server; (ii) query processing takes place on the centralized database
  o Distributed approach:
    - this approach is the focus of this paper
    - the query workload determines the data to be extracted from sensors
    - provides flexibility (different queries extract different data from the sensor network) and efficiency (only relevant data are extracted from the sensor network)
    - allows the sensor database system to leverage the computing resources on the sensor nodes: a sensor query can be evaluated at the front-end server, in the sensor network, at the sensors, or at some combination of the three
– A sensor database system should deal with sensor and communication failures; it should treat sensor data as measurements with an associated uncertainty, not as facts; and it should establish and run a distributed query execution plan without assuming global knowledge
– The paper makes the following contributions:
  o Builds on the results of [Seshadri+ 1995] to define a data model and long-running query semantics for sensor databases. A sensor database mixes stored data and sensor data. Stored data are represented as relations while sensor data are represented as time series. Each long-running query defines a persistent view that is maintained during a given time interval.
  o Describes the design and implementation of the Cornell COUGAR sensor database system, which extends the Cornell PREDATOR object-relational database system. In COUGAR, each type of sensor is modeled as a new Abstract Data Type (ADT). Signal-processing functions are modeled as ADT functions that return sensor data. Long-running queries are formulated in SQL. To support the evaluation of long-running queries, the query execution engine is extended with a new mechanism for the execution of sensor ADT functions

II. A Model for Sensor Database Systems
– Builds on existing work by [Seshadri+ 1995] to define a data model for sensor data and an algebra of operators to formulate sensor queries

II.A. Sensor Data
– A sensor database involves stored data and sensor data
– Stored data include the set of sensors participating in the sensor database, along with characteristics of the sensors (e.g., their location) or characteristics of the physical environment
– These stored data are represented as relations
– The question: how to represent sensor data?
– Sensor data are generated by signal-processing functions, and the representation chosen for sensor data should make it easy to formulate sensor queries (data collection, correlation in time, and aggregates over time windows)
– Time is essential: signal-processing functions may return output repeatedly over time, and each output has a time-stamp
– In addition, monitoring queries introduce constraints on the sensor data time-stamps; e.g., Query 3 in Example 1 (the factory warehouse example) assumes that the abnormal temperatures are detected either simultaneously or within a certain time interval. Queries 4 and 5, on the other hand, aggregate over time windows and reference time explicitly
– Sensor data is represented as time series
– The representation of sensor time series is based on the sequence model introduced by [Seshadri+ 1995]
– A sequence is defined as a 3-tuple comprising
  o a set of records R
  o a countable totally ordered domain O (the ordering domain; its elements are referred to as positions)
  o an ordering of R by O (defined as a relation between O and R, such that every record in R is associated with some position in O)
– Sequence operators are n-ary mappings on sequences; they operate on a given number of input sequences and produce a unique output sequence
– All sequence operators can be composed
– Sequence operators include: select, project, compose (natural join on the position), and aggregates over a set of positions
– Sensor data as a time series is represented with the following properties:
  1. The set of records corresponds to the outputs of a signal-processing function over time
  2. The ordering domain is a discrete time scale, i.e., a set of time quanta where each time quantum corresponds to a position. Natural numbers are used as the time-series ordering domain. Each natural number represents the number of time units elapsed between a given origin and any (discrete) point in time. It is assumed that clocks are synchronized and thus all sensors share the same time scale
  3. All outputs of the signal-processing function generated during a time quantum are associated with the same position p. If a sensor does not generate events during the time quantum associated with a position, the Null record is associated with that position
  4. Whenever a signal-processing function produces an output, the base sequence is updated at the position corresponding to the production time. Updates to sensor time series occur in increasing position order
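The four time-series properties above can be illustrated with a minimal Python sketch. The class and function names (SensorTimeSeries, update, at, compose) are hypothetical and chosen only for illustration; they are not taken from [Bonnet+ 2001] or [Seshadri+ 1995]. Positions are natural numbers, a position with no output yields None (standing in for the Null record), and updates are rejected unless they arrive in increasing position order.

```python
from typing import Dict, List, Optional, Tuple


class SensorTimeSeries:
    """Hypothetical sketch: records ordered by natural-number positions."""

    def __init__(self) -> None:
        self._records: Dict[int, List[float]] = {}
        self._last_position = -1  # highest position updated so far

    def update(self, position: int, value: float) -> None:
        """Store an output at its production-time position."""
        if position < self._last_position:
            raise ValueError("updates must occur in increasing position order")
        # All outputs of one time quantum share the same position.
        self._records.setdefault(position, []).append(value)
        self._last_position = position

    def at(self, position: int) -> Optional[List[float]]:
        """Outputs for a time quantum, or None (the Null record)."""
        return self._records.get(position)

    def positions(self) -> List[int]:
        return sorted(self._records)


def compose(left: SensorTimeSeries,
            right: SensorTimeSeries) -> Dict[int, List[Tuple[float, float]]]:
    """Natural join on position: pair outputs that share a position."""
    shared = set(left.positions()) & set(right.positions())
    return {
        pos: [(a, b) for a in left.at(pos) for b in right.at(pos)]
        for pos in sorted(shared)
    }


s1, s2 = SensorTimeSeries(), SensorTimeSeries()
s1.update(0, 20.0)
s1.update(2, 105.0)
s2.update(2, 101.0)
s2.update(3, 19.5)
joined = compose(s1, s2)  # only position 2 appears in both series
```

The compose step is what a query such as Query 3 relies on: two alarm readings are correlated only when they fall in the same time quantum.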
II.B. Sensor Queries
– A sensor database involves stored data and sensor data, i.e., relations and sequences
– A sensor query is defined as an acyclic graph of relational and sequence operators
– The inputs of a relational operator are either base relations or the output of another relational operator; the inputs of a sequence operator are either base sequences or the output of another sequence operator. In other words, relations are manipulated using relational operators and sequences are manipulated using sequence operators
– There are three exceptions to this rule; three operators allow combining relations and sequences:
  o the relational projection operator can take a sequence as input and project out the position attribute to obtain a relation
  o a cross product operator can take as input a relation and a sequence to produce a sequence
  o an aggregate operator can take a sequence as input and a grouping list that does not include the position attribute
– Sensor queries are long running
– Each sensor query is associated with a time interval of the form [O, O + T], where O is the time at which it is submitted and T is the number of time quanta during which it runs
– During the life of a long-running query, relations and sensor sequences may be updated
– An update to a relation R can be an insertion, a deletion, or a modification of a record in R, whereas an update to a sensor sequence S is the insertion of a new record associated with a position greater than or equal to all the undefined positions in S
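Because updates to a sensor sequence only ever arrive at increasing positions, a window aggregate such as the one in Query 4 can be kept up to date without re-reading the whole sequence. The following Python sketch illustrates that idea under stated assumptions: the class name WindowedMax, the fixed non-overlapping windows, and the close() call that ends the query are all hypothetical choices for illustration, not the mechanism described in the paper.

```python
from typing import List, Optional, Tuple


class WindowedMax:
    """Hypothetical sketch: maximum per fixed window, maintained incrementally."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window  # window length in time quanta
        self._last_position = -1
        self._current_window: Optional[int] = None
        self._current_max: Optional[float] = None
        self.results: List[Tuple[int, float]] = []  # (window index, maximum)

    def insert(self, position: int, value: float) -> None:
        """Consume one new record; only the running maximum is kept."""
        if position < self._last_position:
            raise ValueError("updates must occur in increasing position order")
        self._last_position = position
        window_index = position // self.window
        if window_index != self._current_window:
            self._flush()  # the previous window can no longer change
            self._current_window = window_index
            self._current_max = value
        else:
            self._current_max = max(self._current_max, value)

    def _flush(self) -> None:
        if self._current_window is not None:
            self.results.append((self._current_window, self._current_max))

    def close(self) -> None:
        """End of the query's time interval: emit the last open window."""
        self._flush()
        self._current_window = None
        self._current_max = None


view = WindowedMax(window=5)
for pos, temp in [(0, 20.0), (3, 25.0), (4, 22.0), (5, 30.0), (9, 28.0), (12, 19.0)]:
    view.insert(pos, temp)
view.close()
```

The increasing-order requirement is what makes flushing safe: once a record for a later window arrives, no further record can land in an earlier one.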
– A sensor query defines a view that is persistent during its associated time interval; this persistent view is maintained to reflect the updates that are repeatedly performed on sensor time series
– [Jagadish+ 1995] showed that persistent views over relations and sequences can be maintained incrementally without accessing the complete sequences
– Informally, persistent views can be maintained incrementally if updates occur in increasing position order and if the algebra used to compose queries does not allow sequences to be combined using any relational operators
– Both conditions hold in the definition of a sensor database used in this paper

III. The COUGAR Sensor Database System
– The initial version of the COUGAR system is examined with respect to the following aspects:
  1. User representation:
     o How are sensors and signal-processing functions modeled in the database schema?
     o How are queries formulated?
  2. Internal representation:
     o How is sensor data represented within the database components that perform query processing?
     o How are sensor queries evaluated to provide the semantics of long-running queries?
III.A. User Representation
– In COUGAR, signal-processing functions are represented as Abstract Data Type (ADT) functions, and one Sensor ADT is defined for all sensors of the same type (e.g., temperature sensors, seismic sensors)
– The public interface of a Sensor ADT corresponds to the specific signal-processing functions supported by a type of sensor, whereas an ADT object in the database corresponds to a physical sensor in the real world
– Sensor queries are formulated in SQL with small modifications to the language
– The ‘FROM’ clause of a sensor query includes a relation whose schema contains a sensor ADT attribute, while the expressions over sensor ADTs are included in either the ‘SELECT’ or the ‘WHERE’ clause of a sensor query
– The queries introduced earlier are formulated in COUGAR as follows:
  o The simplified schema of the sensor database contains one relation R(loc point, floor int, s sensorNode), where loc is a point ADT that stores the coordinates of the sensor, floor is the floor of the warehouse where the sensor is located, and sensorNode is a Sensor ADT that supports the methods getTemp() and detectAlarmTemp(threshold), where threshold is the temperature above which abnormal temperatures are returned
  o Both ADT functions return temperature represented as a float
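The schema above can be mirrored in a short Python sketch, offered only as an illustration of how a relation with a Sensor ADT attribute fits together. Everything here is an assumption for the sake of the example: the Python names (Point, SensorNode, Row, get_temp, detect_alarm_temp), the read callback that stands in for sampling real hardware, and the sample rows. COUGAR itself is not implemented this way.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Point:
    """Stand-in for the point ADT holding sensor coordinates."""
    x: float
    y: float


class SensorNode:
    """Stand-in for the Sensor ADT; one object per physical sensor."""

    def __init__(self, sensor_id: int, read: Callable[[], float]) -> None:
        self.sensor_id = sensor_id
        self._read = read  # hypothetical hook that samples the hardware

    def get_temp(self) -> float:
        """Counterpart of getTemp(): the measured temperature."""
        return self._read()

    def detect_alarm_temp(self, threshold: float) -> Optional[float]:
        """Counterpart of detectAlarmTemp(threshold): None if not abnormal."""
        temp = self._read()
        return temp if temp > threshold else None


@dataclass
class Row:
    """One tuple of R(loc point, floor int, s sensorNode)."""
    loc: Point
    floor: int
    s: SensorNode


# Made-up sample data: two sensors with constant readings.
R: List[Row] = [
    Row(Point(0.0, 0.0), 3, SensorNode(1, lambda: 72.5)),
    Row(Point(4.0, 3.0), 1, SensorNode(2, lambda: 104.0)),
]

# One evaluation round of "temperature on the third floor".
third_floor = [row.s.get_temp() for row in R if row.floor == 3]
# One evaluation round of "abnormal temperatures" with threshold 100.
alarms = [t for row in R if (t := row.s.detect_alarm_temp(100)) is not None]
```

A long-running query would repeat such an evaluation round over time; the sketch shows a single round only.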
– Query 1: “Return repeatedly the abnormal temperatures measured by all sensors”
    SELECT R.s.detectAlarmTemp(100)
    FROM R
    WHERE $every();
  The expression $every() is introduced as a syntactic construct to indicate that the query is long-running
– Query 2: “Every minute, return the temperature measured by all sensors on the third floor”
    SELECT R.s.getTemp()
    FROM R
    WHERE R.floor = 3 AND $every(60);
  The expression $every() takes as argument the time in seconds between successive outputs of the sensor ADT functions in the query
– Query 3: “Generate a notification whenever two sensors within 5 yards of each other measure simultaneously an abnormal temperature”
    SELECT R1.s.detectAlarmTemp(100), R2.s.detectAlarmTemp(100)
    FROM R R1, R R2
    WHERE $SQRT($SQR(R1.loc.x - R2.loc.x) + $SQR(R1.loc.y - R2.loc.y)) < 5
      AND R1.s > R2.s AND $every();
  This formulation assumes that the system incorporates an equality condition on the time at which the temperatures are obtained from both sensors.
– Queries 4 and 5 cannot be expressed in the initial version of COUGAR, since aggregates over time windows are not supported
– The time interval associated with long-running queries in COUGAR is the interval between the instant the query is submitted and the instant the query is explicitly stopped
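The WHERE clause of Query 3 can be traced with a small Python sketch of a single evaluation instant. The function name nearby_alarm_pairs, its parameters, and the sample data are hypothetical. The sketch also assumes that the condition R1.s > R2.s serves to report each pair of sensors once and to exclude a sensor being paired with itself; the slides do not state this, so treat it as an interpretation.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def nearby_alarm_pairs(
    locations: Dict[int, Tuple[float, float]],
    readings: Dict[int, float],
    threshold: float = 100.0,
    max_distance: float = 5.0,
) -> List[Tuple[int, int]]:
    """Pairs (a, b) with a > b that both alarm now and lie within max_distance."""
    # Sensors whose reading at this instant is abnormal.
    alarming = sorted(sid for sid, temp in readings.items() if temp > threshold)
    pairs = []
    # combinations() yields (b, a) with b < a, so each pair appears once.
    for b, a in combinations(alarming, 2):
        (ax, ay), (bx, by) = locations[a], locations[b]
        if math.hypot(ax - bx, ay - by) < max_distance:
            pairs.append((a, b))
    return pairs


# Made-up sample data: sensor id -> (x, y) in yards, and readings at one instant.
locations = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (1.0, 1.0)}
readings = {1: 101.0, 2: 120.0, 3: 105.0}
pairs = nearby_alarm_pairs(locations, readings)
```

Sensors 1 and 2 are exactly 5 yards apart, so the strict inequality in the query excludes that pair.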
III.B. Internal Representation
– Query processing takes place on a database front-end, while signal-processing functions are executed on the sensor nodes involved in the query
– The query execution engine on the database front-end includes a mechanism for interacting with remote sensors; a query execution engine on each sensor executes signal-processing functions and sends data back to the front-end
– In COUGAR, it is assumed that there are no modifications to the stored data during the execution of a long-running query; strict two-phase locking on the database front-end enforces this assumption
– The initial version of COUGAR does not treat a long-running query as a persistent view; instead, the system computes the incremental results that could be used to maintain such a view. These incremental results are obtained by evaluating sensor ADT functions repeatedly and by combining the outputs they produce over time with stored data
– The execution of Sensor ADT functions is essential for sensor query execution

Advantages:
– The distributed approach is efficient, since only the relevant data are extracted from the WSN under consideration, reducing the communication and processing overhead
– The representation of the processing functions as ADTs provides controlled access to encapsulated data through a well-defined set of functions
– The authors chose not to reinvent the wheel: sensor queries are formulated in SQL with little modification to the language
– The use of Virtual Relations introduces more flexibility

Disadvantages:
– Since the authors propose a sensor DB system, they should state how the system adheres to the ACID properties of a DB
– The protocol assumes that the sensed data is all stored in the sensor node; this may introduce a space constraint in the sensor node
– The sensor data is time variant, and after a certain time the data would be outdated
– The authors do not suggest any rule for how long the data should be held in the node

Suggestions/Improvements/Future Work:
– Handle multiple copies of the same kind of data available from nearby resources
– Provide an adaptive query processing mechanism
– Handle mobile nodes and sinks

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002]

Introduction
– The paper discusses the challenges associated with implementing the five basic database aggregates (COUNT, MIN, MAX, SUM, and AVERAGE)
– The network aggregation approach discussed in this paper is driven by a general-purpose, SQL-style interface that can execute queries over any kind of sensor data regardless of the application
– There are two benefits of this approach over the traditional network solution, which is generally application dependent:
  o Computation can be optimized by defining the language that users use to express aggregates
  o Since the same aggregation language can be applied to all data types, the burden on programmers is substantially less: they can issue declarative, SQL-style queries rather than implementing custom networking protocols to extract the needed data from the network
– The paper presents a variety of techniques to improve the reliability and performance of the proposed solution
– In addition, it shows how grouped aggregates can be efficiently computed and offers a comparison to related systems and database projects
– Two properties of radio communication need to be pointed out:
  o Radio is a broadcast medium: any sensor within hearing distance can hear any message, irrespective of whether or not it is the intended recipient
  o Radio links are typically symmetric: if sensor a can hear sensor b, it is assumed that sensor b can also hear sensor a; however, this may not hold true in some cases

Background
– Messages in the current generation of TinyOS are a fixed size preprogrammed into sensors
– Each message type has a message id that distinguishes it from other types of messages, and each sensor has a unique sensor id that distinguishes it from other sensors
– All messages specify their recipient (or broadcast), allowing sensors to ignore messages not intended for them; non-broadcast messages must still be received by all sensors within range, and unintended recipients drop messages not addressed to them
– The technique adopted is to build a routing tree to route sensor data
– One sensor, typically the one that interfaces the querying user to the rest of the network, is chosen to be the...