Techniques for mining periodic user behavior in WLAN traces
Group 3
Kes Peart, Mukundh Mohan, Upanita Goswami, Srikanth Subramanian, Natarajan Chockalingam

Analysis of WLAN Traces
Popularity of APs in MANETs
Paper: Periodic properties of user mobility and access-point popularity
By: Minkyong Kim and David Kotz

What will be presented
• Trace acquisition
• The DFT
• AP periodicity
  ○ Using one-month traces
  ○ Using one-year traces
• Access point clustering
• Significance of findings
• Summary

Trace Acquisition
• Traces are from the Dartmouth campus.
• Collected over spring 2003 and winter 2004.
• APs with fewer than 50 users/hr were removed. Such APs are considered inactive and hence irrelevant to the study.
• Only unique users at each AP were considered.

The DFT (Discrete Fourier Transform)
Why the DFT was used:
• It transforms traces to the frequency domain, exposing periodic behavior.
• It is easy to get back to the time domain by taking the inverse DFT.

AP Popularity
[Figure: distribution of users at one access point]

AP Periodicity
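As a rough illustration of how the DFT can expose an AP's primary period, here is a minimal sketch. It assumes the trace has already been reduced to one user count per hour; the function name `primary_period_hours` and the synthetic trace are hypothetical and are not taken from the paper.

```python
import numpy as np

def primary_period_hours(hourly_users):
    """Return the period (in hours) of the strongest non-DC DFT component."""
    x = np.asarray(hourly_users, dtype=float)
    x = x - x.mean()                         # remove the zero-frequency (DC) term
    spectrum = np.abs(np.fft.rfft(x))        # magnitude of the one-sided DFT
    freqs = np.fft.rfftfreq(len(x), d=1.0)   # frequencies in cycles per hour
    k = 1 + np.argmax(spectrum[1:])          # strongest bin, skipping index 0
    return 1.0 / freqs[k]

# Synthetic four-week hourly trace with a daily cycle (made-up data).
hours = np.arange(24 * 28)
trace = 60 + 40 * np.sin(2 * np.pi * hours / 24)
period = primary_period_hours(trace)

# The inverse DFT recovers the original time-domain trace.
recovered = np.fft.irfft(np.fft.rfft(trace), n=len(trace))
```

A secondary period could be read off the same spectrum by taking the next-strongest bin, although how the authors ranked periods is not stated on the slides.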
(based on four-week trace)
• 85% of APs had their primary period at one day
• 25% of APs had their primary period at one

AP Periodicity (based on one-year trace)
• 25% of APs had their period at one day
• 38% had their period at one week

Clustering of APs
Why cluster APs?
• Clustering aims at discovering “natural” classes in data.
• Discovery of structure in data can lead to a new understanding of the data.

Clustering (contd)
Clustering was done using AutoClass.
AutoClass takes three parameters:
• The AP’s primary period
• The AP’s secondary period
• The maximum number of users the AP serviced per hour

Significance of Findings
Create a mathematical equation based on the periodic behavior of access points

Summary
The following was presented:
• Information on traces
• The DFT
• Periodicity of APs
• AP classification
• Interpretation of findings

Thank You

Guanling Chen, Heng Huang, and Minkyong Kim

Introduction
• Offline data-mining techniques are used to study periodic patterns in a campus-wide environment.
• Profiling clients’ movements leads to:
  ○ Location prediction
  ○ Anomaly detection
  ○ Mobility modeling
• Challenges addressed in this paper:
  ○ Removing noise from data
  ○ Interpretation of discovered patterns

Data preparation and the ping pong effect
• Only clients who were active for more than 30 days were considered.
• Diameter is used as a client’s mobility measure.
• The ping pong effect changes a client’s association from one AP to another when there is no physical movement.
• A heuristic approach is used to prevent the ping pong effect:
  ○ Grouping the APs that the client often associates with [switching back and forth]
  ○ Selecting the most prominent AP from this group for the particular user
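The two steps of the heuristic can be sketched roughly as follows. This is only an illustration under stated assumptions: the switch-count threshold `min_switches`, the record layout, and the use of total association time to pick the most prominent AP are all hypothetical choices, not details given on the slides.

```python
from collections import Counter, defaultdict

def significant_aps(records, min_switches=3):
    """Map each AP to the most prominent AP of its ping-pong group.

    records: list of (ap_name, seconds_associated) tuples in time order.
    min_switches: assumed threshold for "often" switching between two APs.
    """
    # Count switches between each unordered pair of distinct APs.
    switches = Counter()
    for (a, _), (b, _) in zip(records, records[1:]):
        if a != b:
            switches[frozenset((a, b))] += 1

    # Union-find: merge APs that the client bounces between often.
    parent = {ap: ap for ap, _ in records}

    def find(ap):
        while parent[ap] != ap:
            parent[ap] = parent[parent[ap]]
            ap = parent[ap]
        return ap

    for pair, count in switches.items():
        if count >= min_switches:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    # Total association time per AP (assumed measure of prominence).
    time_at = Counter()
    for ap, secs in records:
        time_at[ap] += secs

    groups = defaultdict(list)
    for ap in parent:
        groups[find(ap)].append(ap)

    # Every AP in a group is represented by the group's most-used AP.
    mapping = {}
    for members in groups.values():
        best = max(members, key=lambda ap: time_at[ap])
        for ap in members:
            mapping[ap] = best
    return mapping

# Made-up records: the client bounces between AP1 and AP2, then moves to AP7.
records = [("AP1", 600), ("AP2", 30), ("AP1", 900), ("AP2", 20),
           ("AP1", 700), ("AP7", 3600)]
mapping = significant_aps(records)
```

Rewriting the association sequence through `mapping` collapses the AP1/AP2 bouncing into a single location while leaving the genuine move to AP7 intact.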
• This identifies the clients who have meaningful movement patterns and reduces noise.
• Each group has a “significant AP”, defined as the AP with which the client was associated most of the time.
• The client therefore spent most of its time at the chosen significant AP, and this AP can be used to predict the client’s location.

Patterns and Periodicities
• The association sequence is the set of APs a client has associated with.
• A periodic pattern is defined as a sequence of length ‘k’ which occurs within the association sequence with a period ‘p’.
• An error bound ‘δ’ was also considered when defining periodic patterns.
• Exact pattern matching was used, based on the length parameter.
• The timeline was divided into on segments and off segments.
• A pattern was considered periodic only if it occurred at least M*0.75 times, where M is the maximum number of occurrences in a given sequence.

Analysis of frequent patterns
• The home location of a client is defined as its most frequently visited AP or building.
• A client visits its home building more often than its home AP, since the home building may contain more than one of its frequently visited APs.
• Most of the home APs belonged to academic and residential buildings.
• One major observation is that the popularity of home APs and home buildings does not follow a power law.

[Figure: sharing of home locations by clients]

• The fraction of home buildings being shared is higher than that of home APs.
• It was also observed that the home AP was located within the home building.
• The top movements were observed to be between residential buildings and academic buildings such as libraries.
• It was also observed that 13% of the movements are below 50 meters. It was mentioned earlier that APs within 50 m lead to the ping pong effect.
• The popularity distribution of top movements did follow a power law, unlike the popularity of home locations. This means that fewer clients shared the top movements.

Analysis of periodic patterns
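The periodicity criterion from the "Patterns and Periodicities" slide (period p, error bound δ, at least 0.75 of M occurrences) might be checked along these lines. This is a sketch under assumptions: it takes M to be the number of period slots between the first occurrence and the end of the trace, and it anchors the slots at the first occurrence. The slides do not spell out either choice, and the function name `is_periodic` is hypothetical.

```python
def is_periodic(times, period, delta, trace_length, ratio=0.75):
    """Check whether a pattern's occurrences repeat with the given period.

    times: sorted start times of the pattern's occurrences (e.g. in hours).
    period, delta: candidate period p and error bound δ, in the same unit.
    trace_length: total length of the observed sequence, in the same unit.
    """
    if not times:
        return False
    start = times[0]
    # Assumed meaning of M: period slots from the first occurrence to trace end.
    max_occurrences = int((trace_length - start) // period) + 1
    hits = 0
    for j in range(max_occurrences):
        slot = start + j * period
        # A slot counts as hit if some occurrence lies within δ of it.
        if any(abs(t - slot) <= delta for t in times):
            hits += 1
    return hits >= ratio * max_occurrences

# Made-up one-week example (hours): a roughly daily visit with one day missed.
daily = is_periodic([9, 33.5, 57, 81, 129, 153.2], period=24, delta=1,
                    trace_length=168)
# Irregular visits do not meet the 0.75 threshold.
irregular = is_periodic([9, 40, 100], period=24, delta=1, trace_length=168)
```

In the first example six of the seven daily slots are hit, which clears the 0.75 threshold despite the missed day; in the second only one slot is hit.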
• A client visited more than one location periodically.
• APs have more periodically visiting clients than buildings.
• APs in academic buildings, such as classrooms, had the largest number of periodically visiting clients.
• The periods were mostly in terms of days, and weekly patterns were more prominent.
• Certain anomalies, such as a period of 12.5 days, were also found.
• Temporal relationships between clients were not found.

Conclusion
• Most of the mobility is student driven.
• Parameters that reproduce patterns in mobility models can be described.
• Future work includes the study of distribution patterns for individual clients.
• The use of AP coordinates to approximate a client’s location is a limitation.

Outline..
• In what other ways have researchers quantified periodic user behavior, and what were their results?
• An agenda-based mobility model that does not use WLAN traces to study user behavior but uses the NHTS database to model user mobility.
• Some initial results on periodicity observed in Sept ’07 WLAN traces from UF, based on class schedules.

On Modeling User Associations in Wireless LAN Traces on University Campuses
Wei-jen Hsu and Ahmed Helmy

Introduction
• This work is the most exhaustive trace-based study in the literature known to date.
• Traces from four different organizations (MIT, USC, Dartmouth and UCSD) were analyzed.
• The purpose of studying such a wide range of traces is to realistically model user behavior and usage on campus.
• The motivation behind this work is manifold:
  ○ Better management and capacity-planning decisions on campus based on usage patterns.
  ○ Developing realistic models to support the design and evaluation of wireless routing protocols.
  ○ A basic understanding of how users behave is essential to deploying new wireless technologies.

The Four Classes of User-Behavior and metrics..
Individual User Behavior          Metric to capture the behavior
Activeness of users               Online Time Fraction
Macro-level mobility              Coverage
Micro-level mobility              Total Hand-Off Count
Repetitive association patterns   Location Similarity Index and Network Similarity Index

Focus on the repetitive behavior of users
• Location Similarity Index: the fraction of all snapshot pairs separated by a fixed time gap in which the user is associated with the same access point.
• Network Similarity Index: the average of the Location Similarity Index over all users for a given time gap.
• What does this metric signify?

Results and Conclusion
“Users have the strongest tendency to show DAILY and WEEKLY repetitive association pattern” !!!

An Agenda Based Mobility Model
Qunwei Zheng, Xiaoyan Hong, Jun Liu

Introduction
• The main idea of this paper is to incorporate people’s social activities in building a mobility model.
• It is based on the assumption that people’s movements are most likely the explicit and implicit results of their activity agenda.
• The authors designed a framework called the “Agenda Driven Mobility Model” and analyzed their model using a Hidden Markov Chain.
• An “agenda” provides a certain amount of predictability of nodes’ whereabouts and can be used to assist in routing.
• To realistically model the mobility of users, the authors use data from the National Household Travel Survey (NHTS).
• A sample record: “A student, on a specific weekday, took a trip; the purpose of the trip was to go to school; she used her personal vehicle to travel a distance of four miles, which took her 10 min; she stayed at school for six hours”.
• The framework has three main components: a personal agenda, a geographic map, and a motion generator.

Agenda Based Framework
• Agenda: defines a person’s activities based on his social role. It includes the “what”, “when” and “where” elements of the activities.
  e.g.: (time1: location1, Activity 1)
• Geographic Locations (Maps): contains location information of possible activities and road information that connects all locations.
• Motion Generator: with input from the agenda and map, the motion generator produces a node’s movement along the path between destinations, including moves, turns and pauses.

Map Generation..
• They define roads first and place buildings (addresses) second.
• People move along these roads and stay at those addresses for the activities listed in their agendas.
• For their work, they generated synthetic maps.
• The type of address, say library or restaurant, helps them calculate the dwell time of the activities at those locations.

Agenda Generation..
• At initialization, each node creates an agenda which covers a full day of activities.
• The next agenda item is picked according to the NHTS activity distribution.
• Its location is picked randomly from the many address types.
• Time = duration of current activity + longest possible travel time from the current address to the next address.

Motion Generator..
• It takes an agenda and a map as input and chooses a motion path for a node to move towards the next activity location according to the agenda.
• This path is the shortest-distance path between the current and next activity locations.
• In a real situation, a mobile node may not take the shortest path!

Results and conclusion
• Social roles and agenda activities tend to cause geographic concentrations, which impact routing performance significantly. Using an agenda-based model is expected to provide realistic results.

Motivation for our Project
• The main drawback of the agenda-based model is that it is based on NHTS data, which is just a simple survey covering demographic characteristics of households, people, vehicles, and daily and long-distance travel for all purposes and by all transportation modes.
• Do the statistics derived from such data realistically capture people’s repetitive periodic behavior?
• Not really…
• In our project, we have the same idea of user “schedules”, which is to some extent analogous to an “agenda”. But here we consider those social activities in a user’s schedule which are “repetitive” on a daily and weekly basis, like “going to a class”, “going to the library”, etc.
• Constructing a model based on repetitive user association/mobility patterns taken from traces will be a more realistic approach to evaluating the network performance of routing protocols.

Initial results..
[Figure: number of students (Y axis) by week number (X axis), for Monday, Wednesday and Friday]
The periodicity in classes can be used in constructing a class profile matrix.

Questions and Suggestions?

THANK YOU
This note was uploaded on 05/27/2011 for the course CIS 4930 taught by Professor Staff during the Spring '08 term at University of Florida.