Unsupervised Activity Discovery and Characterization for Sensor-Rich Environments

A Thesis
Presented to
The Academic Faculty
by
Raffay Hamid

In Partial Fulfillment of the Requirements for the Degree
Master of Science

College of Computing
Georgia Institute of Technology
December 2005

Approved by:
Dr. Aaron Bobick, Advisor, College of Computing, Georgia Institute of Technology
Dr. Irfan Essa, College of Computing, Georgia Institute of Technology
Dr. Charles Isbell, College of Computing, Georgia Institute of Technology
Date Approved

To Ammi, Abbu, and Ayesha.

ACKNOWLEDGEMENTS

Many thanks indeed to my advisor, Aaron Bobick, for all his insight and inspiration over the course of this thesis. I cannot thank Amos Johnson enough, who helped me immensely at each step of this work. Thanks to Samir Batta, Siddhartha Maddi and Graham Coleman for working so hard towards making this project a success. Many thanks to Irfan Essa, Charles Isbell and Jim Rehg for all their time and helpful suggestions. I want to thank Vivek Kwatra, Gabrial Brostow and Drew Steedly, whose advice always stood me in good stead. Thanks to all my colleagues at the Computational Perception Lab and the Mobile Robot Lab for inspiring me to think about my research problem from different perspectives. Without the love and support of my family, I never could have made it through this process. Thank you all so very much.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
SUMMARY
I INTRODUCTION
    1.1 Motivation
    1.2 Problem Statement
    1.3 Solution Approach & Contributions
    1.4 Related Work
    1.5 Thesis outline
II PREVIOUS WORK
III REPRESENTING ACTIVITIES AS BAGS OF EVENT N-GRAMS
    3.1 Vector Space Model - VSM
    3.2 Activities as Histograms of Event n-Grams
    3.3 Discussion
IV UNSUPERVISED ACTIVITY-CLASS DISCOVERY
    4.1 Activity Similarity Metric
    4.2 Activity-Class Discovery
        4.2.1 Activity-Class as Maximal Clique
        4.2.2 Maximal Cliques using Dominant Sets
        4.2.3 Dominant Sets Using Replicator Dynamics
V ACTIVITY CLASS CHARACTERIZATION
    5.1 Typical Class Member
    5.2 Discovering Event Motifs
        5.2.1 A Definition of Motif
        5.2.2 Objective Function Optimization
VI ACTIVITY CLASSIFICATION, ANOMALY DETECTION AND EXPLANATION
    6.1 Activity Classification and Anomaly Detection
    6.2 Anomaly Explanation
        6.2.1 Activity-Class Modeling
        6.2.2 Explanatory Features
VII EXPERIMENTS AND RESULTS
    7.1 Loading Dock Scenario
    7.2 House Scenario
    7.3 Discovered Activity Classes
        7.3.1 Loading Dock Scenario
        7.3.2 House Scenario
    7.4 Discovered Event Motifs
        7.4.1 Loading Dock Scenario
        7.4.2 House Scenario
        7.4.3 Subjective Assessment of Evaluation
    7.5 Discussion regarding Discovered Classes and Motifs
    7.6 Detected Anomalies
        7.6.1 Learning Threshold for Anomalies using ROC
        7.6.2 Analysis of Detected Anomalies
        7.6.3 User Study For Detected Anomalies
        7.6.4 Noise Sensitivity
        7.6.5 Automatic Event Detection
    7.7 Anomalous Activity Explanation
VIII CONCLUSIONS AND FUTURE WORK
REFERENCES

LIST OF TABLES

1 The average detection rate of the system in the face of noise.

LIST OF FIGURES

1 A Person pushes a Cart carrying Packages into the Back Door of a Delivery Vehicle.
2 VSM representation of two sequences S1 and S2 which have the same event-content but different event order. As can be seen, the VSM representation does not capture such order differences.
3 Transformation of an example activity from sequence of discrete events to histogram of event n-grams. Here the value of n is shown to be equal to 3. V is event vocabulary, S is event sequence, and T is sequence of overlapping n-grams. Step-d shows the non-zero n-gram counts of V.
4 Five simulated activity sequences are shown to illustrate the different concepts introduced in Section 6.2.2. 1 has low value of Pc, its entropy Hc is low and therefore its predictability is high. 4 has medium Pc, its entropy Hc is also low and its predictability is high. Finally 8 has high Pc, but its entropy Hc is high which makes its predictability low. 1 could be useful in explaining the extraneous features in an anomalous activity, while 4 could be useful in explaining the features that were deficient in an anomaly.
5 A schematic diagram of the camera setup at the loading dock area with overlapping fields of view (FOV). The FOV of camera 1 is shown in blue while that of camera 2 is in red. The overlapping area of the dock is shown in purple.
6 A schematic diagram of the strain-gage setup in the house scenario. The red dots represent the positions of the strain gages.
7 Each row represents the similarity of a particular activity with the entire activity training set. White implies identical similarity while black represents complete dissimilarity. The activities ordered after the red cross line in the clustered similarity matrix were dissimilar enough from all other activities as to not be included in any non-trivial maximal clique.
8 Visualization of similarity matrices before and after class discovery for the House Environment.
9 Visualization of the structural differences between the discovered activity-classes. Thick lines with brighter shades of red indicate higher frequency.
10 ROC obtained by varying the anomaly threshold over a range of values.
11 Anomalous Activities - (a) shows a delivery vehicle leaving the loading dock with its back door still open. (b) shows an unusual number of people unloading a delivery vehicle. (c) shows a person cleaning the loading dock floor.
12 Performance Analysis - Each graph shows system-performance under synthetically generated noise using different generative noise models. The X-axis represents the noise interval where the amount of noise is inversely proportional to the noise interval. The Y-axis represents the percentage of regular test activities that remain regular members of the original sub-classes in the face of noise. The horizontal line in all these graphs shows the classification performance using automatically detected events as described in Section 7.6.5.
13 Anomaly Explanation - explanations generated by the system for the three anomalies in Figure 11.

SUMMARY

This thesis presents an unsupervised method for discovering and analyzing the different kinds of activities in an active environment. Drawing from natural language processing, a novel representation of activities as bags of event n-grams is introduced, in which the global structural information of activities is analyzed using their local event statistics.
It is demonstrated how maximal cliques in an undirected edge-weighted graph of activities can be used, in an unsupervised manner, to discover the different activity-classes. Building on work done in computer networks and bio-informatics, it is shown how to characterize these discovered activity-classes from a holistic as well as a by-parts viewpoint. A definition of anomalous activities is formulated, along with a way to detect them based on the difference of an activity instance from each of the discovered activity-classes. Finally, an information-theoretic method to explain the detected anomalies in a human-interpretable form is presented. Results over extensive data-sets, collected from multiple active environments, are presented to show the competence and generalizability of the proposed framework.

CHAPTER I

INTRODUCTION

then purged with euphrasy and rue the visual nerve, for he had much to see
- John Milton (Paradise Lost: Book XI)

1.1 Motivation

As I look around myself, I see a group of fellow graduate students having a team meeting around a desk near mine. Some amongst them passionately move their hands as they emphasize a point, while others write equations on a white-board - the rest nod their heads in a "yay" or "nay" manner, either agreeing or disagreeing with the point being discussed. Interestingly, this simple observation raises a fundamental question: how do we actually understand these everyday activities in such an efficient and effortless manner? This highly elliptical question has numerous facets along which it needs to be examined - including everything from the biological aspects of low-level image formation on the retina, to the cognitive aspects of understanding streams of visual information. In this treatise, I try to elucidate some of the computational A.I. aspects of this problem of everyday activity analysis.
Understanding how we make sense of everyday activities has many ramifications, including the design of systems that can be used for automatic surveillance or for supporting users in ubiquitous environments.

1.2 Problem Statement

While the general theme of this treatise is activity analysis in sensor-rich environments, more specifically this work focuses on the following problems:

- How can we represent activities transpiring in a certain environment without having substantive prior knowledge about the activity-structure in that domain?
- How can we discover the different kinds of these activities in an unsupervised manner?
- How can we characterize these discovered activity-classes so that we can analyze them at different levels of detail?
- What defines a usual or an unusual member of an activity-class?
- How can we explain the unusual activity-instances in a human-interpretable form?

1.3 Solution Approach & Contributions

The approach to solving the key questions enumerated above and the main contributions of this dissertation are:

- Novel Activity Representation: A novel representation of activities is presented as bags of discrete event n-grams - a perspective different from the previously used grammar-driven approaches. This treatment of activities, motivated by some recent developments in natural language processing, allows one to analyze the global structural information of activities by simply considering their local event statistics.
- Activity Similarity: Based on this activity representation, the notion of similarity between two activities is formalized, taking into account their core structural and event-frequency based differences.
- Unsupervised Activity-Class Discovery: The problem of unsupervised activity-class discovery is posed as a graph-theoretic problem, showing how finding maximal cliques in edge-weighted activity-graphs can be used to this end.
- Activity-Class Characterization: Building on some of the previous work in the field of computer networks, the problem of activity-class characterization at a holistic scale is formalized as finding the typical content and structure of an activity-class.
- Unsupervised Event-Motif Discovery: Inspired by some of the recent work in the field of bio-informatics, predictably recurrent event subsequences (Event-Motifs) are extracted using variable-memory Markov chains.
- Anomalous Activity Detection: An incremental method is proposed for classifying a new activity-instance and detecting whether it is a regular or an anomalous member of its membership class.
- Anomaly Explanation: An information-theoretic method is introduced that explains how an anomalous activity is different from regular activities in a human-interpretable form. Such explanations can be useful for large-scale vision-based surveillance systems.

1.4 Related Work

The problem of activity analysis has been taken on in numerous fields, including everything from human-computer interaction to computational perception, from ubiquitous computing to machine learning. Interestingly, some researchers have also tried to use these analyses for generation of new motion models, in turn used for rendering different movements and behaviors on the console.

One of the pioneering pieces of work in computational context analysis was [7], which analyzed the different activities of children in a sensor-rich environment. Researchers in the field of ubiquitous computing have done interesting work in creating context-aware applications, including everything from mobile tour guides [1] to working with more recent paradigms of programming-by-demonstration [2].

In computer vision, most of the initial interest in activity understanding was focused on model-based activity recognition [9] [35], where a set of target activities is modeled using some representation, followed by the learning of the model parameters given some training data.
Different types of Dynamic Bayesian Networks have been extensively used for modeling various activities [39] [32]. Along these lines, some researchers have also made use of context information to improve recognition performance [23]. At the same time, these models of human behavior and motion have been used to synthesize new types of motions in the field of computer animation [8].

In the past, various approaches for activity representation have been fundamentally grammar-driven (see e.g. [16], [22]). In this work I propose to treat activities as bags of event n-grams, which allows the extraction of the global structural information of an activity by simply considering its event statistics at a local scale. This treatment of activities, motivated by some recent developments in natural language processing [30], lets one get away from actually scripting every single way in which an activity can be performed, and can be used for learning the different kinds of activity structures in an unsupervised manner.

Although the idea of discovering activity-classes has been previously explored in such fields as network intrusion detection [20], it has only recently been applied to everyday activities. My approach towards this problem is novel in a few key aspects. Unlike [15], which requires a priori expert knowledge to model the activity-classes in an environment, I propose to discover this information in an unsupervised fashion. Since event-monograms, as used in [40] and [36], do not capture the temporal information of an activity, I use higher order event n-grams to capture this information more efficiently.

Numerous solutions to the problem of discovering important recurrent motifs in sequential data have been proposed (see e.g. [25] and [11] and the references therein). The work in [38] and [28] presents techniques for learning variable-memory Markov chains from training data in an unsupervised manner.
The variable-memory elements in these Markov chains can be thought of as motifs that have good predictive power of the future events. However, these techniques presume the availability of pre-classified data. Moreover, they do not filter out the motifs that are common in multiple classes. Here, I modify the work done in [38] to handle data from multiple classes, finding motifs that are maximally mutually exclusive amongst activity-classes. This forms a nice continuum between activity-class discovery and characterization. Moreover, instead of sequentially finding individual motifs and masking them out from the sequences as proposed in [5], my scheme simultaneously finds all the motifs in the data in one pass. This allows one to find partially overlapping motifs.

Most of the previous attempts to tackle the problem of anomaly detection have focused on model-based anomaly recognition. These methods pre-define a particular type of activity as being anomalous, model it in some way, and then detect whether a new activity-instance is anomalous [15] [19]. While such an approach could prove to be useful for cases where the variance between different anomalous instances is not significantly large [21], for any reasonably unconstrained situation, anomalies are hard to completely define a priori. Since this is particularly true for everyday activities, I define an anomaly as "something different from regular", with the hope of being able to model something regular more efficiently. Interestingly enough, there are studies done in Cognitive Science which show evidence that humans also learn about anomalies by considering the "distance" of a new piece of information from the mental model of the class to which they believe the new information belongs [10] [12].

I formalize the problem of discovering activity classes as searching for edge-weighted maximal cliques in the graph of K activity-instances.
Indeed, in the past, some authors have argued that a maximal clique is the strictest definition of a cluster [4]. Finding maximal cliques in an edge-weighted undirected graph is a classic graph-theoretic problem. Because combinatorially searching for maximal cliques is computationally hard, numerous approximations to the solution of this problem have been proposed (see [27] and the references within). For my purposes, I adopt the recently proposed approximate approach of iteratively finding dominant sets of maximally similar nodes in a graph (equivalent to finding maximal cliques) [26]. Besides providing an efficient approximation to finding maximal cliques, the framework of dominant sets naturally provides a principled measure of the cohesiveness of a class as well as a measure of node participation in its membership class. This measure of class participation may be used for an instance-based representation of a clique [18].

It is argued that when facing a new piece of information, humans first classify it into an existing class [31] [29], and then compare it to the previous class members to understand how it varies in relation to the general characteristics of the membership class. Using this hypothesis as my motivation, I represent an activity class by a set of mutually disjunctive sub-classes, and then detect a new activity as a regular or an anomalous member of its membership sub-class.

Some of the previous work done in bio-informatics on finding motifs presumes the availability of pre-classified data [6]. Moreover, these approaches do not filter out the motifs that are common in multiple classes. My proposed scheme discovers activity classes in an unsupervised manner, and finds patterns that are maximally mutually exclusive amongst activity-classes. Besides activity-class characterization, these motifs can be used for finding interesting sub-sequences that warrant closer analysis.
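
To make the dominant-set idea mentioned in this section more concrete, the following is a minimal sketch and not the implementation used in this thesis. It assumes the standard discrete-time replicator dynamics applied to a symmetric, non-negative similarity matrix with a zero diagonal. The function name `dominant_set`, the membership cutoff, the iteration limit, and the toy similarity values are all hypothetical choices made only for illustration.

```python
from typing import List, Tuple


def dominant_set(similarity: List[List[float]],
                 max_iterations: int = 1000,
                 tolerance: float = 1e-9,
                 cutoff: float = 1e-4) -> Tuple[List[int], List[float]]:
    """Approximate one dominant set with discrete-time replicator dynamics.

    Returns the indices whose final weight exceeds `cutoff` (an assumed
    threshold) together with the full weight vector.
    """
    n = len(similarity)
    weights = [1.0 / n] * n  # start from the centre of the simplex
    for _ in range(max_iterations):
        # payoff of each node against the current weight distribution
        payoff = [sum(similarity[i][j] * weights[j] for j in range(n))
                  for i in range(n)]
        average = sum(weights[i] * payoff[i] for i in range(n))
        if average == 0.0:
            break
        updated = [weights[i] * payoff[i] / average for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(updated, weights))
        weights = updated
        if change < tolerance:
            break
    members = [i for i in range(n) if weights[i] > cutoff]
    return members, weights


# Hypothetical similarities: nodes 0-2 are mutually similar (0.9),
# nodes 3-4 are similar to each other (0.8), and the two groups are
# only weakly similar (0.1).
toy = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.0],
]
members, weights = dominant_set(toy)
print(members)
```

On this toy matrix the iteration concentrates the weight on the three tightly connected nodes, and the final weights can be read as a measure of each node's participation. Removing the extracted nodes and repeating the procedure on the remainder would correspond to the iterative scheme referred to in the text.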

1.5 Thesis outline

Chapter 2 reviews previous work. In Chapter 3, I explain a novel activity representation which does not require prior knowledge of the activity-structure. Using this representation, I explain a method of unsupervised activity-class discovery in Chapter 4, followed by a discussion of the different ways to characterize the discovered activity-classes in Chapter 5. In Chapter 6, I explain how these characterizations can be used for activity classification, unusual activity detection and explanation. The empirical analysis of my proposal for everyday activity analysis is presented in Chapter 7, followed by Chapter 8, which discusses the shortcomings and conclusions of this treatise.

CHAPTER II

PREVIOUS WORK

The problem of activity analysis has been taken on in numerous fields, including everything from human-computer interaction to computational perception, from ubiquitous computing to machine learning. Interestingly, some researchers have also tried to use these analyses for generation of new motion models, in turn used for rendering different movements and behaviors on the console.

One of the pioneering pieces of work in computational context analysis was [7], which analyzed the different activities of children in a sensor-rich environment. Researchers in the field of ubiquitous computing have done interesting work in creating context-aware applications, including everything from mobile tour guides [1] to working with more recent paradigms of programming-by-demonstration [2].

In computer vision, most of the initial interest in activity understanding was focused on model-based activity recognition [9] [35], where a set of target activities is modeled using some representation, followed by the learning of the model parameters given some training data. Different types of Dynamic Bayesian Networks have been extensively used for modeling various activities [39] [32].
Along these lines, some researchers have also made use of context information to improve recognition performance [23]. At the same time, these models of human behavior and motion have been used to synthesize new types of motions in the field of computer animation [8].

In the past, various approaches for activity representation have been fundamentally grammar-driven (see e.g. [16], [22]). In this work I propose to treat activities as bags of event n-grams, which allows the extraction of the global structural information of an activity by simply considering its event statistics at a local scale. This treatment of activities, motivated by some recent developments in natural language processing [30], lets one get away from actually scripting every single way in which an activity can be performed, and can be used for learning the different kinds of activity structures in an unsupervised manner.

Although the idea of discovering activity-classes has been previously explored in such fields as network intrusion detection [20], it has only recently been applied to everyday activities. My approach towards this problem is novel in a few key aspects. Unlike [15], which requires a priori expert knowledge to model the activity-classes in an environment, I propose to discover this information in an unsupervised fashion. Since event-monograms, as used in [40] and [36], do not capture the temporal information of an activity, I use higher order event n-grams to capture this information more efficiently.

Numerous solutions to the problem of discovering important recurrent motifs in sequential data have been proposed (see e.g. [25] and [11] and the references therein). The work in [38] and [28] presents techniques for learning variable-memory Markov chains from training data in an unsupervised manner. The variable-memory elements in these Markov chains can be thought of as motifs that have good predictive power of the future events.
However, these techniques presume the availability of pre-classified data. Moreover, they do not filter out the motifs that are common in multiple classes. Here, I modify the work done in [38] to handle data from multiple classes, finding motifs that are maximally mutually exclusive amongst activity-classes. This forms a nice continuum between activity-class discovery and characterization. Moreover, instead of sequentially finding individual motifs and masking them out from the sequences as proposed in [5], my scheme simultaneously finds all the motifs in the data in one pass. This allows one to find partially overlapping motifs.

Most of the previous attempts to tackle the problem of anomaly detection have focused on model-based anomaly recognition. These methods pre-define a particular type of activity as being anomalous, model it in some way, and then detect whether a new activity-instance is anomalous [15] [19]. While such an approach could prove to be useful for cases where the variance between different anomalous instances is not significantly large [21], for any reasonably unconstrained situation, anomalies are hard to completely define a priori. Since this is particularly true for everyday activities, I define an anomaly as "something different from regular", with the hope of being able to model something regular more efficiently. Interestingly enough, there are studies done in Cognitive Science which show evidence that humans also learn about anomalies by considering the "distance" of a new piece of information from the mental model of the class to which they believe the new information belongs [10] [12].

I formalize the problem of discovering activity classes as searching for edge-weighted maximal cliques in the graph of K activity-instances. Indeed, in the past, some authors have argued that a maximal clique is the strictest definition of a cluster [4]. Finding maximal cliques in an edge-weighted undirected graph is a classic graph-theoretic problem.
Because combinatorially searching for maximal cliques is computationally hard, numerous approximations to the solution of this problem have been proposed (see [27] and the references within). For my purposes, I adopt the recently proposed approximate approach of iteratively finding dominant sets of maximally similar nodes in a graph (equivalent to finding maximal cliques) [26]. Besides providing an efficient approximation to finding maximal cliques, the framework of dominant sets naturally provides a principled measure of the cohesiveness of a class as well as a measure of node participation in its membership class. This measure of class participation may be used for an instance-based representation of a clique [18].

It is argued that when facing a new piece of information, humans first classify it into an existing class [31] [29], and then compare it to the previous class members to understand how it varies in relation to the general characteristics of the membership class. Using this hypothesis as my motivation, I represent an activity class by a set of mutually disjunctive sub-classes, and then detect a new activity as a regular or an anomalous member of its membership sub-class.

Some of the previous work done in bio-informatics on finding motifs presumes the availability of pre-classified data [6]. Moreover, these approaches do not filter out the motifs that are common in multiple classes. My proposed scheme discovers activity classes in an unsupervised manner, and finds patterns that are maximally mutually exclusive amongst activity-classes. Besides activity-class characterization, these motifs can be used for finding interesting sub-sequences that warrant closer analysis.

CHAPTER III

REPRESENTING ACTIVITIES AS BAGS OF EVENT N-GRAMS

Consider an active setting such as a loading dock with delivery vehicles, people, packages etc. Such active environments consist of animate and inanimate objects interacting with each other.
The interaction of these objects in a particular manner constitutes an event, while a sequence of such events constitutes an activity.

In the past, various approaches for activity representation have been fundamentally grammar-driven [16]. These representations have proven to be useful in cases where one has a priori information about the structure of an activity and can actually construct a model whose parameters can later be learned given some data. While such a presumption can be safely made in constrained situations, it is far from being true in large-scale uncontrolled settings. Observing this constraint, researchers have extensively used ergodic hidden Markov models for activity representation [22]. However, most of the work in that direction has been done in a supervised learning framework where the availability of classes along with labeled training data is presumed.

3.1 Vector Space Model - VSM

Looking at an activity as a sequence of discrete events, two important quantities emerge:

- Content - the events that span the activity, and
- Order - the arrangement of the set of events.

This treatment of an activity is similar to the representation of a document as a set of words - also known as the Vector Space Model (VSM) [30], in which a document is represented as a vector of its word-counts, in the space of possible words.

[Figure 1. A Person pushes a Cart carrying Packages into the Back Door of a Delivery Vehicle. Key frame of a representative event; labeled objects: Back Door of DV, Cart (Full), Person.]

To use such a scheme, we must define a set of possible events (event vocabulary) that could take place in the situation under consideration. As this representation is designed to be manipulated by a perceptual system, the events must be chosen such that they are detectable from low-level perceptual primitives. A particular interaction of these perceptual primitives constitutes an event.
A key-frame of a representative event from one of the active environments explored in this work (Loading Dock) is shown in Figure 1.

While VSM captures the content of a sequence in an efficient way, it ignores its order. To stress this point, let us assume we are given a set of 10 events:

$$E = \{1, 2, \ldots, 10\} \tag{1}$$

Consider two sequences with exactly the same content but different event order, given as:

$$S_1 = \{1, 2, 3, 8, 9, 10\} \tag{2}$$

and

$$S_2 = \{8, 9, 10, 1, 2, 3\} \tag{3}$$

The VSM representations for both $S_1$ and $S_2$ are given in Figure 2. As can be seen, while VSM captures the content of the sequences competently, it loses their order information. Because the word content in documents often implies causal structure, this is usually not a significant problem for documents. Activities, however, are generally not fully defined by their event-content alone; rather, they have preferred or typical event-orderings. Therefore a model for capturing the order of events is needed.

Figure 2. VSM representation of two sequences $S_1$ and $S_2$ which have the same event-content but different event-order (two panels of event-monogram counts over event-monogram bins 1 to 10, one for $S_1 = \{1,2,3,8,9,10\}$ and one for $S_2 = \{8,9,10,1,2,3\}$). As can be seen, the VSM representation does not capture such order differences.

3.2 Activities as Histograms of Event n-Grams

To this end, we consider histograms of higher-order event n-grams (see Figure 3), where we represent an activity by a sparse vector of counts of overlapping event n-grams in a high-dimensional space of possible event n-grams. Our proposed scheme would capture the activity-structure for domains with substantive structural coherence. It is evident that higher values of n would capture the temporal order information of events more rigidly, and would entail a more discriminative representation.
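As a minimal sketch of this representation (not the thesis implementation), the following Python computes a sparse histogram of overlapping event n-grams. The function name `ngram_histogram` and the use of `collections.Counter` as the sparse vector are illustrative assumptions; the example sequences are the ones used in this chapter.

```python
from collections import Counter


def ngram_histogram(events, n=3):
    """Return sparse counts of the overlapping n-grams in an event sequence."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))


# Example sequence S from Figure 3, over the vocabulary V = {1, 2, 3}.
S = [2, 1, 2, 3, 2, 1, 2]
hist = ngram_histogram(S, n=3)
# hist[(2, 1, 2)] is 2; the other three tri-grams each occur once.

# S1 and S2 from Equations 2 and 3: same content, different order.
S1 = [1, 2, 3, 8, 9, 10]
S2 = [8, 9, 10, 1, 2, 3]
same_content = ngram_histogram(S1, n=1) == ngram_histogram(S2, n=1)   # True
same_trigrams = ngram_histogram(S1, n=3) == ngram_histogram(S2, n=3)  # False
```

Under these assumptions, the two sequences with identical content share one event-monogram histogram but have different tri-gram histograms, which is the order sensitivity this section argues for.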
3.3 Discussion

While the proposed representation of treating activities as bags of event n-grams captures both the content and the order of events, it does pose the problem of dealing with sparse, very high-dimensional data. For instance, if we defined K events that could take place in a situation, and we considered n-grams with n = 3, our activity would live in a space of order $O(K^3)$. For even moderate values of K, learning and estimation in such a space can be infeasible.

Figure 3. Transformation of an example activity from a sequence of discrete events to a histogram of event n-grams. Here the value of n is shown to be equal to 3. V is the event vocabulary, S is the event sequence, and T is the sequence of overlapping n-grams. Step-d shows the non-zero n-gram counts of V. (Illustrated values: Step-a, event vocabulary V = {1, 2, 3}; Step-b, example event sequence S = {2,1,2,3,2,1,2}; Step-c, event 3-gram sequence T = {2-1-2, 1-2-3, 2-3-2, 3-2-1, 2-1-2}; Step-d, histogram of event 3-grams over the bins 2-1-2, 1-2-3, 2-3-2, 3-2-1.)

One could partially address the curse of dimensionality using any of the many dimensionality reduction techniques available. Such an approach is very similar to the Latent Semantic Analysis [37] of documents in the Information Retrieval community, where documents are mapped to a lower-dimensional subspace (using Principal Component Analysis [34]), and document similarity is computed based on the inner product between vectors in the reduced subspace. However, PCA would lose some information (the least significant in terms of the L2 norm) in the process of reducing dimensionality. Again, since the word content in documents often implies causal structure, this loss of information is usually not a significant problem for documents. Activities, on the other hand, are generally characterized by specific event orderings, so the loss of information in the case of activities can prove to be more serious.
Another potential solution to the problem of dimensional complexity is to consider only those dimensions which are of significant importance, where dimension importance is estimated using inductive learning techniques on training data, as proposed in some of the work in the Network Intrusion Detection community [20]. Unfortunately, unlike the problem of network intrusion detection, where large sets of network activity data are usually available, we do not have such large data sets, and therefore using inductive learning techniques for our problem would not provide a good estimate of dimension importance.

In this work, a solution to this problem is proposed by constructing a similarity metric that encompasses all the non-zero components of the activity vector, hence giving equal importance to all the dimensions with values greater than zero, and no importance to dimensions with values equal to zero. Since the activity vector is highly sparse, this allows us to reduce the dimensionality dramatically without losing any information, which in turn allows us to use higher values of n for our analysis (see Chapter 4 for more details).

Another interesting question regarding the proposed representation is what value of n is optimal for activities in a certain domain. For instance, in environments with a high degree of structure, the activities would follow a certain order more strictly, and one could use smaller values of n to capture this order information. On the other hand, higher values of n would be needed to capture the structural information for domains where activities have a more stochastic element to them. We propose a potential solution to this problem of finding the optimal value of n, by optimizing over the lengths of predictably recurrent event subsequences (Event Motifs) using variable-memory Markov chains (see Chapter 5 for more details).
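To make the sparsity argument of this discussion concrete, here is a rough worked count. It takes K = 61 from the event vocabulary used for the Loading Dock experiments (Chapter 7) and n = 3; the activity length L = 20 is purely an illustrative assumption, not a figure from the data.

$$K^{n} = 61^{3} = 226{,}981, \qquad L - n + 1 = 20 - 3 + 1 = 18$$

Under that assumption, an activity has at most 18 non-zero entries among 226,981 possible tri-gram dimensions, which is why a similarity defined only over non-zero components remains tractable.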
CHAPTER IV

UNSUPERVISED ACTIVITY-CLASS DISCOVERY

This chapter begins by presenting a novel sequence comparison metric. The proposed view of the similarity between a pair of sequences consists of two factors: the core structural differences, and differences based on the frequency of occurrence of event n-grams. Having established a notion of similarity between a pair of activities, the problem of activity-class discovery is posed as a graph-theoretic problem of finding maximal cliques in edge-weighted activity graphs, where the weights between activity-nodes are proportional to the similarity between the corresponding activity sequences.

4.1 Activity Similarity Metric

Sequence comparison is a well-studied problem and has numerous applications in fields such as text retrieval, bio-informatics etc. [13]. Our view of the similarity between a pair of sequences consists of two factors: the core structural differences, and differences based on the frequency of occurrence of event n-grams. The core structural differences relate to the distinct n-grams that occurred in either one of the sequences in a sequence-pair, but not in both. We believe that for such differences, the number of these mutually exclusive n-grams is of fundamental interest. On the other hand, if a particular n-gram occurs in both sequences, the only discrimination that can be drawn between the sequence pair is based purely on the frequency of occurrence of that n-gram.

Let A and B denote two sequences of events, and let their corresponding histograms of n-grams be denoted by $H_A$ and $H_B$. Let Y and Z be the sets of indices of n-grams with counts greater than zero in $H_A$ and $H_B$ respectively. Let $\alpha_i$ denote the different n-grams, and let $f(\alpha_i \mid H_A)$ and $f(\alpha_i \mid H_B)$ denote the counts of $\alpha_i$ in sequences A and B respectively.
We define the similarity between two event sequences as:

$$\mathrm{sim}(A, B) = 1 - \kappa \sum_{i \in Y, Z} \frac{|f(\alpha_i \mid H_A) - f(\alpha_i \mid H_B)|}{f(\alpha_i \mid H_A) + f(\alpha_i \mid H_B)} \tag{4}$$

where $\kappa = 1/(\|Y\| + \|Z\|)$ is the normalizing factor, and $\|\cdot\|$ computes the cardinality of a set. While our proposed similarity metric (1) satisfies the identity of indiscernibles, (2) is commutative, and (3) is positive semi-definite, it does not follow the Cauchy-Schwarz inequality, making it a divergence rather than a true distance metric.

4.2 Activity-Class Discovery

It is argued that when facing a new piece of information, humans first classify it into an existing class [29] [31], and then compare it to the previous class members to understand how it varies in relation to the general characteristics of the membership class. Using this hypothesis as our motivation, we represent an activity space by a set of mutually disjunctive classes, and then detect a new activity as a regular or an anomalous member of its membership class.

4.2.1 Activity-Class as Maximal Clique

This work asserts that the activity-instances occurring in an environment do not span the activity-space uniformly. Rather, there exist disjunctive activity-sets with high internal similarity and low similarity across sets. This assertion is backed by the assumption that the detected events, constituting activities in an environment, encode the underlying structure of activities [29]. Starting with a set of K activity-instances, let us consider this activity-set as an undirected edge-weighted graph with K nodes, each node representing the histogram of n-grams of one of the K activity-instances. The weight of an edge is the similarity between a pair of nodes as defined in Section 4.1. We can now formalize the problem of discovering activity-classes as searching for edge-weighted maximal cliques (see footnote 1) in the graph of K activity-instances [4].
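The similarity of Equation 4, which supplies the edge weights of this graph, can be sketched as follows. This is an illustrative reading rather than the thesis code: the function name `similarity`, the dictionary representation of histograms, the interpretation of the sum as running over n-grams present in either sequence, and the handling of two empty histograms are all assumptions.

```python
def similarity(hist_a, hist_b):
    """Similarity of Equation 4 between two sparse n-gram histograms (dicts)."""
    y = {g for g, count in hist_a.items() if count > 0}
    z = {g for g, count in hist_b.items() if count > 0}
    if not y and not z:
        return 1.0  # assumption: two empty histograms are treated as identical
    kappa = 1.0 / (len(y) + len(z))
    total = 0.0
    for g in y | z:  # assumption: sum over n-grams present in either sequence
        fa, fb = hist_a.get(g, 0), hist_b.get(g, 0)
        total += abs(fa - fb) / (fa + fb)
    return 1.0 - kappa * total


# Hypothetical toy histograms keyed by tri-gram.
h_a = {(1, 2, 3): 2, (2, 3, 1): 1}
h_b = {(1, 2, 3): 1, (3, 1, 2): 1}
s_ab = similarity(h_a, h_b)  # 1 - (1/4) * (1/3 + 1 + 1) = 5/12
```

With this reading, identical histograms score 1 and histograms with no n-gram in common score 0, since every mutually exclusive n-gram contributes a full unit to the sum.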
Along these lines, we find a maximal clique in the graph, remove that set of nodes from the graph, and repeat this process iteratively with the remaining set of nodes, until there remain no non-trivial maximal cliques in the graph. The leftover nodes after the removal of maximal cliques are dissimilar from most of the regular nodes, and are hence anomalous (for details on anomaly detection, please see Section 6.1).

Footnote 1. Recall that a subset of nodes of a graph is a clique if all its nodes are mutually adjacent; a maximal clique is one that is not contained in any larger clique, whereas a maximum clique has the largest cardinality.

4.2.2 Maximal Cliques using Dominant Sets

Finding maximal cliques in an edge-weighted undirected graph is a classic graph-theoretic problem. Because combinatorially searching for maximal cliques is computationally hard, numerous approximations to the solution of this problem have been proposed [27]. For our purposes, we adopt the approximate approach of iteratively finding dominant sets of maximally similar nodes in a graph (equivalent to finding maximal cliques) as proposed in [26]. Besides providing an efficient approximation to finding maximal cliques, the framework of dominant sets naturally provides a principled measure of the cohesiveness of a class as well as a measure of node participation in its membership class. We now give an overview of dominant sets, showing how they can be used for our problem.

Let the data to be clustered be represented by an undirected edge-weighted graph with no self-loops $G = (V, E, \vartheta)$, where $V = \{1, 2, \ldots, K\}$ is the vertex set, $E \subseteq V \times V$ is the edge set, and $\vartheta : E \to \mathbb{R}^{+}$ is the positive weight function. The weights on the edges of the graph are represented by a corresponding $K \times K$ symmetric similarity matrix $A = (a_{ij})$ defined as:

$$a_{ij} = \begin{cases} \mathrm{sim}(i, j) & \text{if } (i, j) \in E \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

where sim is computed using our proposed notion of similarity as described in Section 4.1.
To quantify the cohesiveness of a node in a cluster, let us define its "average weighted degree". Let $S \subseteq V$ be a non-empty subset of vertices and $i \in S$, such that

$$\mathrm{awdeg}_S(i) = \frac{1}{\|S\|} \sum_{j \in S} a_{ij} \tag{6}$$

Moreover, for $j \notin S$, we define $\phi_S$ as:

$$\phi_S(i, j) = a_{ij} - \mathrm{awdeg}_S(i) \tag{7}$$

Intuitively, $\phi_S(i, j)$ measures the similarity between nodes j and i, with respect to the average similarity between node i and its neighbors in S. Note that $\phi_S(i, j)$ can be either positive or negative.

Now let us consider how weights are assigned to individual nodes (see footnote 2). Let $S \subseteq V$ be a non-empty subset of vertices and $i \in S$. The weight of i w.r.t. S is given as:

$$w_S(i) = \begin{cases} 1 & \text{if } \|S\| = 1 \\ \sum_{j \in S \setminus \{i\}} \phi_{S \setminus \{i\}}(j, i)\, w_{S \setminus \{i\}}(j) & \text{otherwise} \end{cases} \tag{8}$$

Moreover, the total weight of S is defined to be:

$$W(S) = \sum_{i \in S} w_S(i) \tag{9}$$

Intuitively, $w_S(i)$ gives a measure of the overall similarity between vertex i and the vertices of $S \setminus \{i\}$ with respect to the overall similarity among the vertices in $S \setminus \{i\}$. We are now in a position to define dominant sets. A non-empty subset of vertices $S \subseteq V$ such that $W(T) > 0$ for any non-empty $T \subseteq S$ is said to be dominant if:

1. $w_S(i) > 0$, $\forall i \in S$, i.e. internal homogeneity
2. $w_{S \cup \{i\}}(i) < 0$, $\forall i \notin S$, i.e. external inhomogeneity.

Effectively, we can state that a dominant set in an edge-weighted graph is equivalent to a cluster of vertices in that graph.

Footnote 2. Note that here the term weight is being used to describe both the edge-weights and the node-weights. However, these two are different quantities.

4.2.3 Dominant Sets Using Replicator Dynamics

We now turn our attention to finding a dominant set in an edge-weighted graph with adjacency matrix A. For this purpose, consider the following quadratic program, which is a generalization of the Motzkin-Straus program [24]:

$$\begin{aligned} \text{maximize} \quad & f(x) = \tfrac{1}{2}\, x^{T} A x \\ \text{subject to} \quad & x \in \Delta \end{aligned} \tag{10}$$

where

$$\Delta = \Big\{ x \in \mathbb{R}^{n} : \sum_{i=1}^{n} x_i = 1 \text{ and } x_i \ge 0,\ \forall i \Big\} \tag{11}$$

is the standard simplex in $\mathbb{R}^{n}$.
If S is a dominant subset of vertices, then its weighted characteristic vector $x^{S}$, defined as:

$$x^{S}_{i} = \begin{cases} \dfrac{w_S(i)}{W(S)} & \text{if } i \in S \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

is a strict local maximizer of $f$ in $\Delta$. Conversely, if $x^{*}$ is a strict local maximizer of $f$ in $\Delta$, then its support $\sigma = \sigma(x^{*}) = \{i \in V : x^{*}_i \neq 0\}$ is a dominant set. By virtue of the above result, we can find a dominant set by first localizing a solution of Equation 10 with an appropriate continuous optimization technique, and then picking up the support set of the solution found. The clustering algorithm we use consists of iteratively finding a dominant set in the graph by solving Equation 10 and finding its support, then removing the support from the graph, until all the vertices have been clustered.

Because solving Equation 8 combinatorially is infeasible, we use a continuous optimization technique proposed in [26], which applies replicator dynamics. Let $W = (w_{ij})$ be a non-negative real-valued $n \times n$ matrix. The discrete-time version of the replicator equation can be given as [24]:

$$x_i(t + 1) = x_i(t)\, \frac{(W x(t))_i}{x(t)^{T} W x(t)} \tag{13}$$

According to the fundamental theorem of natural selection [14], if $W = W^{T}$, then the function $F(x) = x^{T} W x$ is strictly increasing along any non-constant trajectory of the replicator dynamics of Equation 13. In other words, $\forall t > 0$, $F(x(t+1)) > F(x(t))$. Finally, let $W = A$, the adjacency matrix; then the replicator system, starting from an arbitrary initial state, will eventually converge to a maximizer of the function given in Equation 10. This will correspond to a dominant set in the graph and hence to a cluster of nodes.

CHAPTER V

ACTIVITY CLASS CHARACTERIZATION

Having clustered a given set of activities, we now look for a way to represent their characteristics in some general, tractable and concise form.
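The clustering referred to here, the iterative dominant-set extraction of Section 4.2.3, can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the thesis implementation: the function names, the barycenter starting point, the convergence tolerance `tol`, the support cut-off `support_eps`, and the `min_size` stopping rule (which leaves unclustered nodes as candidate anomalies) are all hypothetical choices.

```python
def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Iterate Equation 13 with W = A, starting from the simplex barycenter."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:
            break  # degenerate case: no positive similarity left
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        change = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if change < tol:
            break
    return x


def dominant_set_clusters(A, support_eps=1e-4, min_size=2):
    """Peel off dominant sets one at a time; return (clusters, leftover nodes)."""
    remaining = list(range(len(A)))
    clusters = []
    while len(remaining) >= min_size:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        x = replicator_dynamics(sub)
        # The support of the converged vector approximates a dominant set.
        support = [remaining[k] for k, xk in enumerate(x) if xk > support_eps]
        if len(support) < min_size:
            break
        clusters.append(support)
        remaining = [v for v in remaining if v not in support]
    return clusters, remaining


# Hypothetical similarity matrix: nodes 0-2 and nodes 3-4 form two tight
# groups, and node 5 is weakly similar to everything.
A = [
    [0.00, 0.90, 0.90, 0.10, 0.10, 0.05],
    [0.90, 0.00, 0.90, 0.10, 0.10, 0.05],
    [0.90, 0.90, 0.00, 0.10, 0.10, 0.05],
    [0.10, 0.10, 0.10, 0.00, 0.80, 0.05],
    [0.10, 0.10, 0.10, 0.80, 0.00, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.00],
]
clusters, leftover = dominant_set_clusters(A)
```

On this toy matrix the sketch should recover the two groups in turn and leave node 5 unclustered, mirroring the idea that leftover nodes are treated as anomalous.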
A class model of this kind is required for the classification of a new activity instance, and for the comparison of the new activity-class member with the general characteristics of the membership class in order to analyze its normality. There are many ways to approach the problem of creating a representative model of a class. One of these is the generative approach, which presumes a stochastic process that creates class instances, where the objective is to learn the particular distribution which dictates this underlying process. However, for our problem, since the parametric form of the underlying distribution is unknown, this direction cannot be adopted. Even if we approximate the actual distribution through some known parametric form, the large dimensionality of the activity space and the small number of available activity samples make learning such a distribution without over-fitting infeasible. Let us therefore resort to an instance-based approach for activity-class characterization. In this regard, two related approaches are investigated here:

Typical Class Member: We formulate the problem as that of finding the node which is the "best representative" of the rest of the cluster nodes - essentially converting the problem of learning into one of search. From now on we will call the best representative member of a class the "Typical Member".

Event Motifs: We formalize the problem as finding predictably recurrent activity subsequences using variable-memory Markov chains. These subsequences are generally called Event Motifs and are maximally mutually exclusive amongst activity-classes.

The characterization of an activity-class using its Typical Member allows one to represent the general characteristics of the class at a holistic scale, which in turn comes in handy for the overall explanation of an unusual class-member.
The Event Motif characterization lets one take a more granular look at the structure of an activity-class, which in turn allows one to analyze a new class member at a more local scale. In the following, a detailed explanation of each of these characterizations is provided.

5.1 Typical Class Member

The question of typicality is closely related to the idea of how similar a node is to the other members of the cluster. There are many ways in which this idea of the similarity of a node with respect to other nodes could be exploited to find the typical node. The classic graph-theoretic literature provides a potential answer to this problem in terms of finding the "centroid" of the cluster, i.e. finding the node which minimizes the maximum distance (inverse of similarity) between the rest of the nodes and itself (also known as the Min-Max algorithm). While this method is theoretically sound, it is sensitive to noisy clusters and would work well only in cases where the clusters are well-behaved.

Another method proposed in graph theory for such a problem relates to finding the in-degree of every node of the cluster, and labeling the node with maximum in-degree as the typical node. One could consider the number of nodes maximally close to a particular node as its in-degree, transforming our undirected graph into a directed one. Stated otherwise, the idea is to consider as the typical member of the cluster that node to which most nodes are maximally similar. Indeed, the approach of labeling a node as typical or not based on its in-degree usually works very well. It still, however, retains some major problems in terms of being completely agnostic about the more global structure of the cluster. More specifically, due to the maximization operation needed to transform our undirected graph into a directed one, we are forced to take a very local view of our landscape, which could lead to problems.
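The two graph-theoretic baselines just described can be sketched as follows, assuming a symmetric similarity matrix with values in [0, 1]. The function names are hypothetical, and the distance is taken as 1 minus similarity purely for illustration; any decreasing transform of similarity selects the same Min-Max node.

```python
def minmax_center(A):
    """Node that minimizes its maximum distance to the other cluster nodes."""
    n = len(A)

    def worst_distance(i):
        # Assumption: distance = 1 - similarity.
        return max(1.0 - A[i][j] for j in range(n) if j != i)

    return min(range(n), key=worst_distance)


def max_indegree_node(A):
    """Node that the largest number of other nodes are maximally similar to."""
    n = len(A)
    indegree = [0] * n
    for i in range(n):
        # Each node points at its single most similar neighbor
        # (ties are broken by the lowest node index).
        nearest = max((j for j in range(n) if j != i), key=lambda j: A[i][j])
        indegree[nearest] += 1
    return max(range(n), key=lambda p: indegree[p]), indegree


# Hypothetical cluster: node 0 is close to every other node, while
# nodes 1-3 are only moderately similar to each other.
A = [
    [0.0, 0.9, 0.9, 0.9],
    [0.9, 0.0, 0.3, 0.3],
    [0.9, 0.3, 0.0, 0.3],
    [0.9, 0.3, 0.3, 0.0],
]
center = minmax_center(A)
typical, indegree = max_indegree_node(A)
```

Both baselines pick node 0 here. They can disagree on noisier clusters, which is the motivation for the more global weighting scheme that follows in this section.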
The idea of finding the best representative member of a cluster has been studied in other fields such as Computer Networks, where the problem is very similar to finding the web-page which best represents a collection of web-pages (see e.g. [18]). Along the lines of [18], we propose the idea of Typical nodes (called "Authoritative Sources" in [18]) and "Similar to Typical (STT)" nodes (called "hubs" in [18]). Typical and STT nodes exhibit a mutually reinforcing relationship - a good STT node is one which is closer to a Typical node, while a Typical node is one closer to more STT nodes. Like [18], we associate a non-negative Typicality weight $x_p$ and a non-negative STT weight $y_p$ with each node in the cluster, where p denotes the index of nodes in the cluster. Naturally, if p is closer to many nodes with large x values, it should receive a large y value. On the other hand, if p is closer to nodes with large y values, it should receive a large x value. We define two coupled processes to update the weights $x_p$ and $y_p$ iteratively, i.e.

$$x_p \leftarrow \sum_{q : (q, p) \in E} y_q \tag{14}$$

and

$$y_p \leftarrow \sum_{q : (q, p) \in E} x_q \tag{15}$$

As we iterate the above two equations k times, in the limit $k \to \infty$, $x_p$ and $y_p$ converge to $x^{*}$ and $y^{*}$. The node which has the largest component in the converged vector $x^{*}$ corresponds to the node which has the greatest Typical weight and hence is the best representative of the nodes of the cluster. $x^{*}$ can be computed from the eigen-analysis of the matrix $A^{T}A$, where A is the symmetric similarity matrix of all the nodes of the cluster. Essentially, $x^{*}$ is the principal eigenvector (the one with the greatest corresponding eigenvalue) of $A^{T}A$, the largest component of which corresponds to the Typical Node of the cluster (for the proof, please refer to [18]).

5.2 Discovering Event Motifs

Let us now turn our attention towards finding interesting recurrent event-motifs in these discovered classes.
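Returning briefly to the Typical-node computation of Section 5.1, the coupled updates of Equations 14 and 15 can be sketched as a power iteration. This sketch assumes a similarity-weighted form of the updates (x becomes A-transpose times y, y becomes A times x) with Euclidean normalization after each step; the function name `typical_member` and the fixed iteration count are hypothetical.

```python
import math


def typical_member(A, iters=100):
    """Return (index of the Typical node, converged Typicality weights x)."""
    n = len(A)
    x = [1.0] * n
    y = [1.0] * n
    for _ in range(iters):
        # Weighted analogue of Equation 14: x_p <- sum_q a_qp * y_q
        x = [sum(A[q][p] * y[q] for q in range(n)) for p in range(n)]
        # Weighted analogue of Equation 15: y_p <- sum_q a_pq * x_q
        y = [sum(A[p][q] * x[q] for q in range(n)) for p in range(n)]
        # Normalize so the weights stay bounded (guard against all-zero input).
        norm_x = math.sqrt(sum(v * v for v in x)) or 1.0
        norm_y = math.sqrt(sum(v * v for v in y)) or 1.0
        x = [v / norm_x for v in x]
        y = [v / norm_y for v in y]
    return max(range(n), key=lambda p: x[p]), x


# Hypothetical cluster in which node 0 is highly similar to all other nodes.
A = [
    [0.0, 0.9, 0.9, 0.9],
    [0.9, 0.0, 0.3, 0.3],
    [0.9, 0.3, 0.0, 0.3],
    [0.9, 0.3, 0.3, 0.0],
]
typical, x = typical_member(A)
```

For a symmetric A this iteration approximates the principal eigenvector of the product of A-transpose and A mentioned in Section 5.1; an eigen-decomposition routine from a numerical library could be used instead.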
Some of the previous work in bio-informatics on finding motifs presumes the availability of pre-classified data [6]. Moreover, these approaches do not filter out the motifs that are common to multiple classes. The scheme proposed here discovers activity classes in an unsupervised manner, and finds patterns that are maximally mutually exclusive amongst activity-classes.

5.2.1 A Definition of Motif

From the perspective of activity discovery and recognition, we are interested in frequently occurring event-sequences that are useful in predicting future events, and can therefore be used for activity-class characterization. Following [38], we assume that a class of activity-sequences can be modeled as a variable-memory Markov chain (VMMC). We define an event-motif for an activity-class as one of the variable-memory elements of its VMMC. We cast the problem of finding the optimal length of the memory element of a VMMC as a function optimization problem and propose our objective function in the following.

Let Y be the set of events, A be the set of activity-instances, and C be the set of discovered activity-classes. Let us define a function $U(a)$ that maps an activity $a \in A$ to its membership class $c \in C$. Let us define the set of activities belonging to a particular class $c \in C$ as $A_c = \{a \in A : U(a) = c\}$. For $a = (y_1, y_2, \ldots, y_n) \in A$, where $y_1, y_2, \ldots, y_n \in Y$, let $p(c \mid a)$ denote the probability that activity a belongs to class c. Then,

$$p(c \mid a) = \frac{p(a \mid c)\, p(c)}{p(a)} \propto \prod_{i=1}^{n} p(y_i \mid y_{i-1}, y_{i-2}, \ldots, y_1, c) \tag{16}$$

where we have assumed that all activities and classes are equally likely. We approximate Equation 16 by a VMMC, $M_c$, to get:

$$\prod_{i=1}^{n} p(y_i \mid y_{i-1}, y_{i-2}, \ldots, y_1, c) = \prod_{i=1}^{n} p(y_i \mid y_{i-1}, y_{i-2}, \ldots, y_{i-m_i}, c) \tag{17}$$

where $m_i \le i - 1$, $\forall i$. For any $1 \le i \le n$, the sequence $(y_{i-1}, y_{i-2}, \ldots, y_{i-m_i})$ is called the context of $y_i$ in $M_c$ [38], denoted by $S_{M_c}(y_i)$.
We want to find the sub-sequences which can efficiently characterize a particular class, while having minimal representation in other classes. We therefore define our objective function as:

$$Q(M_c \mid A_c) = \gamma - \lambda \tag{18}$$

where

$$\gamma = \sum_{a \in A_c} p(c \mid a) \tag{19}$$

and

$$\lambda = \sum_{c' \in C \setminus \{c\}} \sum_{a \in A_{c'}} p(c \mid a) \tag{20}$$

Intuitively, $\gamma$ represents how well a set of event-motifs can characterize a class in terms of correctly classifying the activities belonging to that class. On the other hand, $\lambda$ denotes to what extent a set of motifs of a class represents activities belonging to other classes. It is clear that maximizing $\gamma$ while minimizing $\lambda$ would result in the optimization of $Q(M_c \mid A_c)$. Note that our motif finding algorithm leverages the availability of the discovered activity-classes to find the maximally mutually exclusive motifs. This shows the usefulness of our activity discovery framework as a pre-step to the motif finding scheme.

5.2.2 Objective Function Optimization

We now explain how we optimize our proposed objective function. The authors of [38] describe a technique to compare different VMMC models that balances the predictive power of a model with its complexity. Let s be a context in $M_c$, where $s = y_{n-1}, y_{n-2}, \ldots, y_1$ and $y_{n-1}, y_{n-2}, \ldots, y_1 \in Y$. Let us define the suffix of s as $\mathrm{suffix}(s) = y_{n-1}, y_{n-2}, \ldots, y_2$. For each $y \in Y$, let $N_{A'}(s, y)$ be the number of occurrences of event y in the activity-sequences contained in $A' \subseteq A$ where s precedes y, and let $N_{A'}(s)$ be the number of occurrences of s in the activity-sequences in $A'$. We define the function $\Delta_{A'}(s)$ as

$$\Delta_{A'}(s) = \sum_{y \in Y} N_{A'}(s, y) \log \frac{\hat{p}(y \mid s)}{\hat{p}(y \mid \mathrm{suffix}(s))} \tag{21}$$

where $\hat{p}(y \mid s) = N_{A'}(s, y) / N_{A'}(s)$ is the maximum likelihood estimator of $p(y \mid s)$. Intuitively, $\Delta_{A'}(s)$ represents the number of bits that would be saved if the events following s in $A'$ were encoded using s as a context, versus having suffix(s) as a context. In other words, it represents how much better the model could predict the events following s by including the last event in s as part of the context of these events.
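The quantity of Equation 21 can be sketched directly from counts, as below. Several choices here are assumptions rather than details given in the text: contexts are stored in chronological order (oldest event first), so the suffix drops the oldest event; the count of a context is taken as the number of its occurrences that are followed by some event, which makes the estimates sum to one; and the logarithm is base 2 so that the result is in bits. The function names are hypothetical.

```python
import math
from collections import Counter


def follower_counts(sequences, context):
    """Count the events that immediately follow `context` (a tuple)."""
    m = len(context)
    counts = Counter()
    for seq in sequences:
        for t in range(m, len(seq)):
            if tuple(seq[t - m:t]) == context:
                counts[seq[t]] += 1
    return counts


def bits_saved(sequences, context):
    """Equation 21: bits saved by predicting with `context` instead of its suffix."""
    full = follower_counts(sequences, context)
    short = follower_counts(sequences, context[1:])  # drop the oldest event
    n_full, n_short = sum(full.values()), sum(short.values())
    # Every occurrence of the context is also an occurrence of its suffix,
    # so short[y] > 0 whenever full[y] > 0.
    return sum(
        count * math.log2((count / n_full) / (short[y] / n_short))
        for y, count in full.items()
    )


# Hypothetical activity: after (1, 2) the next event is always 3, whereas
# after 2 alone it is 3 twice and 5 once.
activities = [[1, 2, 3, 1, 2, 3, 4, 2, 5]]
gain = bits_saved(activities, (1, 2))  # 2 * log2(1 / (2/3)) = 2 * log2(1.5)
```

A context that predicts no better than its suffix yields a gain of zero, which is why such contexts would not survive a pruning criterion built on this quantity.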
We now define the function $\Psi_c(s)$ (the bit gain of s) as

$$\Psi_c(s) = \Delta_{A_c}(s) - \sum_{c' \in C \setminus \{c\}} \Delta_{A_{c'}}(s) \tag{22}$$

Note that higher values of $\Delta_{A_c}(s)$ imply a greater probability that an activity in $A_c$ is assigned to c, given that s is used as a motif. In particular, the higher the value of $\Delta_{A_c}(s)$, the higher will be the value of $\gamma$. Similarly, the higher the value of $\sum_{c' \in C \setminus \{c\}} \Delta_{A_{c'}}(s)$, the higher the value of $\lambda$. We include a sequence s as a context in the model $M_c$ iff

$$\Psi_c(s) > K \log(\ell) \tag{23}$$

where $\ell$ is the total length of all the activities in A, while K is a user-defined parameter. The term $K \log(\ell)$ represents the added complexity of the model $M_c$ incurred by using s, as opposed to suffix(s), as a context; suffix(s) is shorter in length and occurs at least as often as s. The higher the value of K, the more parsimonious the model will be. Equation 23 selects sequences that both appear regularly and have good classification and predictive power, and hence can be thought of as event-motifs.

Work done in [28] shows how the motifs in a VMMC can be compactly represented as a tree. Work done in [3] presents a linear-time algorithm that constructs such a tree by first constructing a data structure called a Suffix Tree to represent all sub-sequences in the training data A, and then pruning this tree to leave only the sequences representing motifs in the VMMC for some activity-class. We follow this general approach, using Equation 23 as our pruning criterion.

CHAPTER VI

ACTIVITY CLASSIFICATION, ANOMALY DETECTION AND EXPLANATION

As mentioned in Section 4.2, it is argued that when facing a new piece of information, humans first classify it into an existing class [29] [31], and then compare it to the previous class members to understand how it varies in relation to the general characteristics of the membership class. Using this hypothesis as our motivation, we represent an activity space by a set of mutually disjunctive classes, and then detect a new activity as a regular or an anomalous member of its membership class.
Unlike [40], we do not wish to re-analyze the entire data set for every new activity instance. Therefore, we present an incremental approach to classification and detection for a new activity instance.

6.1 Activity Classification and Anomaly Detection

Given $\|C\|$ discovered activity-classes, we are now interested in finding whether a new activity instance $\tau$ is regular or anomalous. Each member j of an activity-class c has some weight $w_c(j)$ that indicates the participation of j in c. We compute the similarity between a new activity-instance $\tau$ and the previous members of each sub-class by defining a function $A_c(\tau)$ as:

$$A_c(\tau) = \sum_{j \in c} \mathrm{sim}(\tau, j)\, w_c(j) \tag{24}$$

Here $w_c(j)$ is the same as defined in Equation 8. $A_c$ represents the average weighted similarity between the new activity-instance $\tau$ and any one of the discovered sub-classes c. The selected membership sub-class $c^{*}$ can be found as

$$c^{*} = \arg\max_{c} A_c(\tau) \tag{25}$$

Once the membership decision for a new test activity has been made, we focus our attention on deciding whether the new class member is regular or anomalous. Intuitively speaking, we want to decide the normality of a new instance based on its closeness to the previous members of its membership sub-class. This is done with respect to the average closeness between all the previous members of its membership sub-class. Let us define a function $\Gamma(\tau)$ as:

$$\Gamma(\tau) = \sum_{j \in c^{*}} \phi_{c^{*}}(j, \tau)\, w_{c^{*}}(j) \tag{26}$$

where $\phi$ is defined by Equation 7. We define a new sub-class member as regular if $\Gamma(\tau)$ is greater than a particular threshold. The threshold on $\Gamma(\tau)$ is learned by mapping all the anomalous activity instances detected in the training activity-set to their closest sub-class (using Equations 24 and 25), and computing the value of $\Gamma$ for both regular and anomalous activity instances. We can now observe the variation in false acceptance rate (FAR) and true positives (HITS) as a function of the threshold. This gives a "Receiver Operating Characteristic" (ROC) curve.
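A minimal sketch of Equations 24 to 26 follows, assuming the similarities between the new instance and the class members, the member weights, and each class's similarity matrix have already been computed. The function names, the dictionary layout, the use of normalized member weights, and the threshold value 0.0 are hypothetical; in the text the threshold is chosen from the ROC analysis.

```python
def membership_scores(sims, weights):
    """A_c(tau) of Equation 24 for every class c.

    sims[c][j] is sim(tau, j) and weights[c][j] is w_c(j) for member j of c.
    """
    return {c: sum(s * w for s, w in zip(sims[c], weights[c])) for c in weights}


def anomaly_score(sims_c, weights_c, A_c):
    """Gamma(tau) of Equation 26 for the selected class.

    A_c is the symmetric similarity matrix among the members of the class;
    phi_c(j, tau) = sim(j, tau) - awdeg_c(j), following Equations 6 and 7.
    """
    n = len(A_c)
    awdeg = [sum(row) / n for row in A_c]
    return sum((sims_c[j] - awdeg[j]) * weights_c[j] for j in range(n))


# Hypothetical toy data: class "a" has three members, class "b" has two.
A = {
    "a": [[0.0, 0.9, 0.9], [0.9, 0.0, 0.9], [0.9, 0.9, 0.0]],
    "b": [[0.0, 0.8], [0.8, 0.0]],
}
weights = {"a": [1 / 3, 1 / 3, 1 / 3], "b": [0.5, 0.5]}  # assumed normalized
sims = {"a": [0.8, 0.8, 0.8], "b": [0.1, 0.1]}  # sim(tau, j) for a new tau

scores = membership_scores(sims, weights)
best = max(scores, key=scores.get)  # Equation 25
gamma = anomaly_score(sims[best], weights[best], A[best])
THRESHOLD = 0.0  # hypothetical value, not taken from the thesis
is_regular = gamma > THRESHOLD
```

In this toy case the new instance is closer to the members of class "a" than they are, on average, to each other, so its score is positive and it would be labeled regular under the assumed threshold.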
The area under the ROC curve is indicative of the confidence in our detection metric $\Gamma(\tau)$ [17]. Based on our tolerance for HITS and FAR we can now choose an appropriate threshold.

6.2 Anomaly Explanation

We now address the question of characterizing the anomalous members. We first review (as explained in Chapter 5) the characterization of a model for the regular members of a sub-class against which its anomalous members could be compared [31]. We then find the most informative features of our space in terms of discriminability between the regular and the anomalous sub-class members.

6.2.1 Activity-Class Modeling

As mentioned in Section 5.1, because of the huge dimensionality of our feature space and the availability of meager (and sparse) training data, we resort to the idea of activity-class representation using class prototype(s) (the exemplar view [33]) to model the regular members of an activity-class. We formulate this problem as finding the member that is the "most representative" of the rest of the sub-class members. Finding the best representative member of a cluster in terms of its similarity to other cluster members has been studied in other fields. For instance, [18] finds the most authoritative nodes in a cluster by iteratively assigning authority weights to each member node. An advantage of using the dominant sets framework for discovering the constituent sub-class structure of an activity class is that it naturally provides a principled measure of a node's representativeness of its membership activity-class, defined by $w_S(i)$ in Equation 8. We propose using the member node of an activity-class with maximum weight $w_S(i)$ as the representative model of the sub-class. This most representative node is used to explain the anomalous members of the activity-class.

6.2.2 Explanatory Features

We now focus on the problem of finding the features that can be used to explain an anomalous activity in a maximally informative manner.
We are interested in features of a sub-class with minimum entropy and substantive frequency of occurrence. The entropy of a tri-gram indicates the variation in its observed frequency, which in turn indicates the confidence in the prediction of its frequency. The frequency of occurrence of a tri-gram suggests its participation in a sub-class. We want to analyze the extraneous and the pertinent features in an activity that made it anomalous with respect to the most explanatory features of the regular members of the membership activity-class. We now construct our approach mathematically (a figurative illustration is given in Figure 4).

Let $\alpha_i$ denote a particular tri-gram i for an activity, and c denote any of the $\|C\|$ discovered sub-classes. If R denotes the Typical Member of c as described in Section 5.1, and $\tau$ denotes a new anomalous sub-class member, then we can define the difference between their counts for $\alpha_i$ as:

$$D(\alpha_i) = f_R(\alpha_i) - f_\tau(\alpha_i) \tag{27}$$

where $f(\alpha_i)$ denotes the count of a tri-gram $\alpha_i$. Let us define the distribution of the probability of occurrence of $\alpha_i$ in c as:

$$P_c(\alpha_i) = \frac{\sum_{k \in c} f_k(\alpha_i)}{\sum_{i=1}^{M} \sum_{k \in c} f_k(\alpha_i)} \tag{28}$$

where M represents all the non-zero tri-grams in all the members of sub-class c. Let us define the multiset $\chi_c^i$ as:

$$\chi_c^i = \{ f_k(\alpha_i) \mid k \in c \} \tag{29}$$

We can now define the probability $Q_c(x)$ of occurrence of a particular member $x \in \chi_c^i$ for $\alpha_i$ in c as:

$$Q_c(x) = \kappa \sum_{j \in c} \begin{cases} 1 & \text{if } f_j(\alpha_i) = x \\ 0 & \text{otherwise} \end{cases} \tag{30}$$

where $\kappa$ is the normalization factor. Let us define Shannon's Entropy of a tri-gram $\alpha_i$ for a sub-class c by $H_c(\alpha_i)$ as:

$$H_c(\alpha_i) = -\sum_{x \in \chi_c^i} Q_c(x) \ln(Q_c(x)) \tag{31}$$

We can now define the notion of predictability, $PRD_c(\alpha_i)$, of the values of tri-gram $\alpha_i$ of cluster c as:

$$PRD_c(\alpha_i) = 1 - \frac{H_c(\alpha_i)}{\sum_{i=1}^{M} H_c(\alpha_i)} \tag{32}$$

It is evident from this definition that an $\alpha_i$ with high entropy $H_c(\alpha_i)$ would have high variability, and therefore low predictability.
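Equations 28 to 32 can be sketched as below. The toy counts are hypothetical, chosen in the spirit of Figure 4 (one tri-gram that never occurs, one with nearly constant counts, one with widely varying counts); the function names, the row-per-member layout, and the fallback used when every entropy is zero are assumptions.

```python
import math
from collections import Counter


def occurrence_probability(counts):
    """P_c of Equation 28; counts[k][i] is the count of tri-gram i in member k."""
    totals = [sum(column) for column in zip(*counts)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def count_entropy(values):
    """H_c of Equation 31 (natural log) for one tri-gram's multiset of counts."""
    n = len(values)
    return -sum((m / n) * math.log(m / n) for m in Counter(values).values())


def predictability(counts):
    """PRD_c of Equation 32 for every tri-gram."""
    entropies = [count_entropy(column) for column in zip(*counts)]
    total = sum(entropies)
    if total == 0:
        return [1.0] * len(entropies)  # assumption: all counts are constant
    return [1.0 - h / total for h in entropies]


# Hypothetical class of five members (rows) and three tri-grams (columns).
counts = [
    [0, 3, 5],
    [0, 4, 1],
    [0, 4, 13],
    [0, 4, 7],
    [0, 4, 17],
]
p = occurrence_probability(counts)
prd = predictability(counts)
```

Here the first tri-gram is perfectly predictable but never present, the second is moderately frequent and highly predictable, and the third is the most frequent but the least predictable, which is the trade-off the explainability measures are meant to weigh.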
We define the explainability of a tri-gram $\alpha_i \in c$ that was frequently and consistently present in the regular sub-cluster as:

$$\xi^P_c(\alpha_i) = PRD_c(\alpha_i) \, P_c(\alpha_i) \qquad (33)$$

Intuitively, $\xi^P_c$ indicates how instrumental an $\alpha_i$ is in representing a sub-class $c$. Similarly, we can define the explainability of $\alpha_i \in c$ in terms of how consistently it was absent in representing $c$:

$$\xi^A_c(\alpha_i) = PRD_c(\alpha_i) \, (P_c^{\max} - P_c(\alpha_i)) \qquad (34)$$

where $P_c^{\max}$ is the maximum probability of occurrence of any $\alpha_i$ in $c$. The first term in both Equations 33 and 34 indicates how consistent $\alpha_i$ is in its frequency over the different members of a cluster. The second term in Equations 33 and 34 dictates how representative and non-representative, respectively, $\alpha_i$ is for $c$.

[Figure 4: Illustration of Most Explanatory Features. Counts of tri-grams 1 to 10 across five activity sequences, with example count multisets {0,0,0,0,0}, {3,4,4,4,4}, and {5,1,13,7,17}.]

Figure 4. Five simulated activity sequences are shown to illustrate the different concepts introduced in 6.2.2. $\alpha_1$ has a low value of $P_c$; its entropy $H_c$ is low and therefore its predictability is high. $\alpha_4$ has medium $P_c$; its entropy $H_c$ is also low and its predictability is high. Finally, $\alpha_8$ has high $P_c$, but its entropy $H_c$ is high, which makes its predictability low. $\alpha_1$ could be useful in explaining the extraneous features in an anomalous activity, while $\alpha_4$ could be useful in explaining the features that were deficient in an anomaly.

Given an anomalous member $\tau$ of a sub-class, we can now find the features that were frequently and consistently present in the regular members of the sub-class, but were deficient in the anomaly $\tau$. To this end, we define the function $DEFICIENT(\tau)$ as:

$$DEFICIENT(\tau) = \arg\max_{\alpha_i} \left[ \xi^P_c(\alpha_i) \, D_c(\alpha_i) \right] \qquad (35)$$

Similarly, we can find the most explanatory features that were consistently absent in the regular members of the membership sub-class but were extraneous in the anomaly. We define the function $EXTRANEOUS(\tau)$ as:

$$EXTRANEOUS(\tau) = \arg\min_{\alpha_i} \left[ \xi^A_c(\alpha_i) \, D_c(\alpha_i) \right] \qquad (36)$$

We can now explain anomalies based on these features, which were (1) Deficient in an anomaly but frequently and consistently Present in the regular members, or (2) Extraneous in the anomaly but consistently Absent from the regular members of the activity-class.

CHAPTER VII

EXPERIMENTS AND RESULTS

To test the competence of the proposed framework, we performed experiments on extensive data-sets collected from two active environments. For both experimental setups, the value of n for the n-grams was set to 3 (tri-grams). This choice encodes the past, present, and future information of an event (essentially following a second-order Markov assumption).

7.1 Loading Dock Scenario

We collected video data at the Loading Dock area of a retail bookstore. To visually span the area of activities in the loading dock, we installed two cameras with partially overlapping fields of view. A schematic diagram with sample views from the two cameras is shown in Figure 5. Daily activities from 9 a.m. to 5 p.m., 5 days a week, for over one month were recorded. Based on our observations of the activities taking place in that environment, an event vocabulary of 61 events was constructed. Every activity has a known starting event, i.e., Delivery Vehicle Enters the Loading Dock, and a known ending event, i.e., Delivery Vehicle Leaves the Loading Dock. We used 150 of the collected instances of activities, which were manually annotated using our defined event vocabulary of 61 events. Each of these 61 events is constituted by the interaction of some low-level, perceptually distinguishable primitives.
For the Loading Dock environment, we used 10 such primitives: Person, Cart, Delivery Vehicle (D.V.), Left Door of D.V., Right Door of D.V., Back Door of D.V., Package, Doorbell, Front Door of Building, and Side Door of Building.

Figure 5. A schematic diagram of the camera setup at the loading dock area with overlapping fields of view (FOV). The FOV of camera 1 is shown in blue while that of camera 2 is in red. The overlapping area of the dock is shown in purple.

7.2 House Scenario

To test our proposed algorithms on the activities in a house environment, we deployed 16 strain gages at different locations in a house, each with a unique identification code. These transducers registered the time when the resident of the house walked over them. The data was collected daily for almost 5 months (151 days; each day is considered an individual activity). Whenever the person passed near a transducer at a particular location, it was considered the occurrence of a unique event. Thus our event vocabulary in this environment consists of 16 events. Figure 6 shows a schematic top-view of this environment.

7.3 Discovered Activity Classes

7.3.1 Loading Dock Scenario

Of the 150 training activities, we found 7 classes (maximal cliques), with 106 activities belonging to one of the discovered classes, while 44 activities were different enough not to be included in any non-trivial maximal clique. The visual representation of the similarity matrices of the original 150 activities and of the activities arranged in 7 clusters is shown in Figure 7. These discovered activity-classes were then provided to our motif-finding framework, which discovered multiple motifs of various lengths, ranked by their respective bit-gains (class-characterization ability).
Analysis of the discovered classes reveals a strong structural similarity amongst the class members, while the discovered motifs show the ability to characterize the membership class efficiently.

Figure 6. A schematic diagram of the strain-gage setup in the house scenario. The red dots represent the positions of the strain gages.

[Figure 7: Visualization of Discovered Activity Classes in Loading Dock Environment. Panels: Un-Clustered Similarity Matrix, Activity Clusters, Clustered Similarity Matrix.]

Figure 7. Each row represents the similarity of a particular activity with the entire activity training set. White implies identical similarity while black represents complete dissimilarity. The activities ordered after the red cross line in the clustered similarity matrix were dissimilar enough from all other activities not to be included in any non-trivial maximal clique.

A brief description of the discovered activity-classes is given below:

Sub-Class 1: UPS delivery-vehicles that picked up multiple packages using hand carts.
Sub-Class 2: Pickup trucks (mostly FedEx) and vans that dropped off a few packages without needing a hand cart.
Sub-Class 3: Delivery trucks that dropped off multiple packages, using hand carts, that required multiple people.
Sub-Class 4: A mixture of car, van, and truck delivery vehicles that dropped off one or two packages without needing a hand cart.
Sub-Class 5: Delivery-vehicles that picked up and dropped off multiple packages using a motorized hand cart and multiple people.
Sub-Class 6: Van delivery-vehicles that dropped off one or two packages without needing a hand cart.
Sub-Class 7: Delivery trucks that dropped off multiple packages using hand carts.
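The clustered similarity matrix of Figure 7 is, in effect, the original matrix with its rows and columns permuted so that members of the same class are adjacent and the unclustered activities come last. A minimal sketch of that reordering is given here; the function names and the convention of marking unclustered activities with `None` are assumptions made for illustration and are not taken from the thesis.

```python
def cluster_order(labels):
    """Return activity indices grouped by class label, with unclustered
    activities (label None) placed after all clustered ones."""
    clustered = sorted(
        (i for i, label in enumerate(labels) if label is not None),
        key=lambda i: labels[i],  # stable sort keeps original order within a class
    )
    unclustered = [i for i, label in enumerate(labels) if label is None]
    return clustered + unclustered

def reorder(similarity, order):
    """Apply the same permutation to the rows and columns of a square matrix."""
    return [[similarity[i][j] for j in order] for i in order]

# Hypothetical example: five activities, two classes, one unclustered activity.
labels = [1, None, 0, 1, 0]
order = cluster_order(labels)
similarity = [[10 * i + j for j in range(5)] for i in range(5)]  # placeholder values
clustered_similarity = reorder(similarity, order)
```

Applying one permutation to both axes keeps the matrix symmetric whenever the input is symmetric, so each class shows up as a block on the diagonal.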
7.3.2 House Scenario

Of the 151 activities captured over a little more than 5 months, we found 5 activity-classes (maximal cliques), with 131 activities belonging to one of the discovered classes, and 20 activities being dissimilar enough not to be a part of any non-trivial maximal clique (see Figure 8). A brief description of the discovered activity-classes is given below:

Sub-Class 1: Activities lasting for the entire length of the day, where the person's trajectory spans the entire house space. Most of the time was spent in the area around the Kitchen and the Dining Table.
Sub-Class 2: The person moves from the Kitchen to the Stairway more often. Furthermore, as opposed to cluster 1, the person does not go from the Office to the Sun Room area.
Sub-Class 3: The person spends more time in the areas of the Den and the Living Room. Moreover, she visits the Sun Room more often.
Sub-Class 4: The person spends most of the day in the Kitchen and the Dining Room. The duration for which she stays in the house is small for this sub-class.
Sub-Class 5: The person moves from the Dining Room to the Sun Room more often. The duration for which she stays in the house is significantly smaller than in any other sub-class.

[Figure 8: Visualization of Discovered Activity Classes in House Environment. Panels: Un-Clustered Similarity Matrix, Activity Clusters, Clustered Similarity Matrix.]

Figure 8. Visualization of similarity matrices before and after class discovery for the House Environment.

Figure 9. Visualization of the structural differences between the discovered activity-classes. Thick lines with brighter shades of red indicate higher frequency.

To better illustrate the structural differences in the discovered activity-classes, a visualization of the normalized frequency-counts of the person's trajectory between different locations is shown in Figure 9.
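As a rough illustration of the normalized frequency-counts behind Figure 9, the sketch below counts consecutive location pairs in one day's event sequence and divides by the total number of transitions. The thesis does not state how the counts were normalized, so dividing by the total is an assumption, as are the function name and the example sequence (the location names are borrowed from Figure 6).

```python
from collections import Counter

def transition_frequencies(sensor_sequence):
    """Normalized counts of consecutive (from, to) location pairs in one
    activity. Assumption: normalize by the total number of transitions."""
    pairs = Counter(zip(sensor_sequence, sensor_sequence[1:]))
    total = sum(pairs.values())
    if total == 0:  # fewer than two events, so no transitions
        return {}
    return {pair: count / total for pair, count in pairs.items()}

# Hypothetical one-day sequence of visited locations.
day = ["Kitchen", "Den", "Kitchen", "Den", "Office"]
freqs = transition_frequencies(day)
```

Averaging such dictionaries over the members of one activity-class would give a per-class transition profile that could be drawn as weighted lines between locations.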
7.4 Discovered Event Motifs

7.4.1 Loading Dock Scenario

The highest bit-gain event-motifs found for the 7 discovered activity-classes in the Loading Dock domain are given below:

Sub-Class 1: Person places package into back door of delivery vehicle - Person enters into side door of building - Person is empty handed - Person exits from side door of building - Person is full handed - Person places package into back door of delivery vehicle.
Sub-Class 2: Cart is full - Person opens front door of building - Person pushes cart into front door of building - Cart is full - Person closes front door of building - Person opens front door of building - Person exits from front door of building - Person is empty handed - Person closes front door of building.
Sub-Class 3: DV drives in forward into LDA - Person opens left door of DV - Person exits from left door of DV - Person is empty handed - Person closes the left door of delivery vehicle.
Sub-Class 4: Person opens back door of DV - Person removes package from back door of DV - Person removes package from back door of DV - Person removes package from back door of DV - Person removes package from back door of DV - Person removes package from back door of DV.
Sub-Class 5: Person closes front door of building - Person removes package from cart - Person places package into back door of DV - Person removes package from cart - Person places package into back door of DV - Person removes package from cart - Person places package into back door of DV.
Sub-Class 6: Person removes cart from back door of DV - Person removes package from back door of DV - Person places package into cart - Person places package into cart - Person removes package from back door of DV - Person places package into cart - Person removes package from back door of DV - Person places package into cart.
Sub-Class 7: Person closes back door of DV - Person opens left door of DV - Person enters into left door of DV - Person is empty handed - Person closes left door of DV.

7.4.2 House Scenario

The highest bit-gain event-motifs found for the 5 discovered activity-classes in the House scenario are given below:

Sub-Class 1: Alarm - Kitchen entrance - Fridge - Sink - Garage door (inside).
Sub-Class 2: Stairway - Fridge - Sink - Cupboard - Sink.
Sub-Class 3: Stairway - Dining Table - Den - Living-room door - Sunroom - Living-room door - Den.
Sub-Class 4: Den - Living-room door - Den - Kitchen entrance - Stairway.
Sub-Class 5: Fridge - Dining Table - Kitchen entrance - Fridge - Sink.

7.4.3 Subjective Assessment of Evaluation

The method defined above would, by construction, find activity-classes and the characterizing event-motifs. This raises the question of how valid the discovered activity-classes and the characterizing event-motifs are. Our final goal is to design a system that is able to discover and characterize human-interpretable activity-classes. With this in mind, we performed a limited user test involving 7 participants to subjectively assess the performance of our system. For each participant, 2 of the 7 discovered activity-classes were selected from the Loading Dock environment. Each participant was shown 6 example activities, 3 from each of the 2 selected activity-classes. The participants were then shown 6 motifs, 3 for each of the 2 classes, and were asked to associate each motif with the class that it best belonged to. Their answers agreed with our system 83% of the time, i.e., on average a participant agreed with our system on 5 out of 6 motifs. The probability of agreement on 5 out of 6 motifs by random guessing¹ is only 0.093.

¹ According to the binomial probability function, the chance of randomly agreeing on 5 out of 6 motifs is $\binom{6}{5} (0.5)^1 (0.5)^5$.
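The footnote's binomial calculation can be checked numerically. The short sketch below evaluates the probability of agreeing on exactly 5 of 6 motifs when each motif is assigned to one of the two classes by an independent fair guess, as the footnote assumes. The helper name `binomial_pmf` is illustrative, and the "at least 5" variant is an addition that does not appear in the original text; it is included only for comparison.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 5 of 6 motifs matched by chance: C(6,5) * 0.5^5 * 0.5^1 = 6/64
p_exactly_five = binomial_pmf(5, 6, 0.5)

# Not in the original text: chance of matching 5 or more of 6 motifs (7/64)
p_at_least_five = p_exactly_five + binomial_pmf(6, 6, 0.5)
```

The exact value for 5 of 6 is 6/64 = 0.09375, which the text reports as 0.093.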
7.5 Discussion regarding Discovered Classes and Motifs

The discovered activity-classes, both for the Loading Dock and the House data-sets, are subjectively semantically coherent and divide their respective activity space discriminatively. The fundamental differences between the various classes in the Loading Dock environment are dictated by whether the activities were deliveries or pick-ups, how many people were involved in the activity, how many packages were moved, and what type of delivery vehicle was used. For the House environment, these differences consist of how long a person stays in the house and what time of the year it is.

Figures 7 and 8 show that the activities performed in the Loading Dock environment are structurally better defined than those performed in the House environment. This is because our vocabulary for the Loading Dock environment consists of semantically meaningful events, which can encode the underlying activity structure efficiently. For the House environment, the events are simply the locations where a person went, and are not particularly designed to encode the underlying structure of the activities.

The discovered motifs of membership classes efficiently characterize these classes. Note that the discovered motifs for activity-classes where package delivery occurred have events like Person Places Package In The Back Door Of Delivery Vehicle and Person Pushes Cart In The Front Door of Building - Cart is Full. On the other hand, event-motifs for activity-classes where package pick-up occurred have events such as Person Removes Package From Back-Door Of Delivery Vehicle and Person Places Package Into Cart. Similarly, the motifs for the House environment capture the position where the person spends most of her time and the order in which she visits the different places in the house.

7.6 Detected Anomalies

We performed experimental analysis on the activities from the Loading Dock scenario.
As mentioned in 7.3.1, of the 150 training activities, we found 7 classes (maximal cliques), with 106 activities belonging to one of the discovered classes, while 44 activities were different enough not to be included in any non-trivial maximal clique. We now give a detailed explanation of how, using these initially detected anomalous activities, we can learn a threshold for detecting new anomalous activity-class members, how valid these detected anomalies are from a human viewpoint, and finally, what explanations we obtained for the detected anomalous activities based on the selected key features of the activity-classes.

7.6.1 Learning Threshold for Anomalies using ROC

Using the 7 discovered activity-classes and the anomalous activities, the anomalous activities were first classified into one of the 7 activity-classes using Equations 24 and 25. Based on these activity-class labels, the detection metric defined in Equation 26 was computed for all 150 activities. The ROC that was obtained is shown in Figure 10. The area under the obtained ROC was 0.94, which indicates a confidence of 94% in the proposed detection metric [17].

[Figure 10: ROC For Decision Threshold. True Positives (HITS) versus False Acceptance Rate (FAR); area under the ROC = 0.9403.]

Figure 10. ROC obtained by varying the decision threshold over a range of values.

7.6.2 Analysis of Detected Anomalies

Analyzing the detected anomalous activities reveals that essentially two kinds of activities are being detected: (1) ones that are truly alarming, where someone must be notified, and (2) those that are simply unusual delivery activities with respect to the other regular activities. Key-frames for three of the truly alarming anomalous activities are shown in Figure 11. Figure 11-a shows a truck driving out without closing its back door.
Not shown in the key-frame is the sequence of events where a member of the loading-dock personnel runs after the delivery vehicle to tell the driver of his mistake. Figure 11-b shows a delivery activity where a relatively excessive number of people unload the delivery vehicle. Usually only one or two people unload a delivery vehicle; however, as can be seen from Figure 11-b, in this case there...
