weijen_research_overview

Analysis of Large-scale Wireless Network Traces and Its Impact on User Modeling and Protocol Design
Wei-jen Hsu
Advised by Dr. Ahmed Helmy
05/27/11

Slide 2: Emerging Wireless Communication
• Opportunities
• Challenges
  – Dynamic network structure
  – Decentralized service paradigm
  – Tight coupling between the devices and individuals

Slide 3: Problem Statement
• To understand user behavioral patterns in mobile networks from empirical, large-scale data sets
  – Individual mobility characteristics
  – Pair-wise similarity
  – Global encounter pattern
• To incorporate the findings in modeling and protocol design
  – User mobility model
  – Classify behavioral groups; profile-cast
  – Efficient broadcast

Slide 4: The TRACE framework
• Trace
• Represent (as a matrix with entries x_{1,1} ... x_{t,n})
• Analyze
• Characterize
• Employ (apply)

Slide 5: Outline
[Diagram: Traces, then Observation, then Application, ordered from microscopic behavior to macroscopic behavior]
• Observation: individual user mobility; user groups in the population; encounter patterns in the network
• Application: mobility model; profile-cast protocol; SmallWorld-based message dissemination
• Status labels in the diagram: Detailed behavior analysis; Complete case; Future Work

Slide 6: Trace Sets
• In this work we mainly use WLAN traces
  – Mostly from university campuses or corporate networks (4 universities, 1 corporate network)
  – The largest data sets about wireless network users available to date (# users / lengths)
  – No bias: not “special-purpose”, data from all users in the network
• For comparison we also look at a vehicular movement trace and a human encounter trace
(TRACE stage: Trace)

Slide 7: Trace Sets
• Available information from WLAN traces
  – MAC addresses of the devices as identifiers
  – Location/Time of users (our main focus)

Node: e0_12_29_fc_ba_8c
Association start time    Location_ID            Duration
2197745                   172.16.8.244_11009     4433
2230200                   172.16.8.244_11009     13320
2257917                   172.16.8.244_11009     643
2285119                   172.16.8.244_11009     1017
2297134                   172.16.8.244_11009     7153
2304287                   172.16.8.244_11023     6744
(TRACE stage: Trace)

Slide 8: Case study I – individual mobility
[Outline diagram repeated]

Slide 9: Goal
• To understand the mobility/usage pattern of individual wireless network users
• To observe how environments, user types, and trace-collection techniques impact the observations
• To propose a realistic mobility model based on empirical observations
  – That is mathematically tractable
  – That matches multiple scenarios

Slide 10: Mobility Models
• Mobility models are of crucial importance for the evaluation of wireless mobile network protocols
• Requirements for mobility models
  – Realism (detailed behavior from traces)
  – Parameterized, tunable behavior
  – Mathematical tractability

Slide 11: Metrics for Mobility Models
• How often are the nodes present?
  – Percentage of “online” time
• What kind of preference do users show in space?
  – The percentage of time spent at the most frequently visited locations
• What kind of repetition do users show in time?
  – The probability of re-appearance
(TRACE stage: Represent)

Slide 12: Mobility Characteristics from WLANs
• Simple existing models are very different from the characteristics in WLAN
[Figures: on/off activity pattern, plotted as Prob.(online time fraction > x); skewed location preference; periodic re-appearance]
(TRACE stage: Characterize)

Slide 13: Time-variant Community (TVC) Model [Spy06, Hsu07]
• Skewed location visiting preferences
  – Create “communities” to be the preferred area of movement
• Periodical reappearance
  – Create structure in time – Periods
  – Repetitive structure
[Figure: repetitive time period structure (TP1, TP2, TP3, TP1, TP2, TP3 along the time axis); each time period has its own set of communities; annotations 75% and 25%]
(TRACE stage: Employ (apply))

Slide 14: Theoretical Tractability
• For the TVC model, we can derive
  – Nodal spatial distribution – the demographic profile of the mobility model
  – Average node degree – important for cluster maintenance and geographic routing
  – Hitting time / Meeting time – important for routing performance analysis
• With low error when the communication range is small compared to the community sizes (communication disk < 25% of community)

Slide 15: Theoretical Tractability
[Figures, simulation vs. theory for several model configurations: spatial distribution, Prob(node appears at (X,Y)); average node degree vs. communication range (K); hitting time (s) vs. communication range (K); meeting time (s) vs. communication range (K)]

Slide 16: Using the TVC Model – Reproducing Mobility Characteristics
• (STEP1) Identify the popular locations; assign communities
• (STEP2) Assign parameters to the communities according to stats
• (STEP3) Add user on-off patterns (e.g., in WLAN, users are usually off when moving)

Slide 17: Using the TVC Model – Reproducing Mobility Characteristics
• WLAN trace (example: MIT trace)
[Figure: skewed location visiting preference; average fraction of online time associated with the AP vs. APs sorted by total amount of time associated with them; curves: Model-simplified, Model-complex, MIT]
[Figure: periodic re-appearance; Prob.(node re-appears at the same AP after the time gap) vs. time gap (days); curves: Model-simplified, Model-complex, MIT]
*Similar matches achieved for USC and Dartmouth traces.
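The skewed location-visiting preference curve compared above can be computed directly from association records like the sample shown on the Trace Sets slide. The following is a minimal sketch, assuming each record is a (start_time, location_id, duration) tuple; the function name `location_preference` is hypothetical and not taken from the original work.

```python
from collections import defaultdict


def location_preference(records):
    """Return (location_id, fraction_of_online_time) pairs, most visited first.

    `records` is an iterable of (start_time, location_id, duration) tuples,
    an assumed layout that mirrors the sample trace on the Trace Sets slide.
    """
    time_at = defaultdict(int)
    for _start, location, duration in records:
        time_at[location] += duration

    total = sum(time_at.values())
    if total == 0:
        return []  # user was never online

    ranked = sorted(time_at.items(), key=lambda item: item[1], reverse=True)
    return [(location, t / total) for location, t in ranked]


# Sample records for node e0_12_29_fc_ba_8c, copied from the trace excerpt.
records = [
    (2197745, "172.16.8.244_11009", 4433),
    (2230200, "172.16.8.244_11009", 13320),
    (2257917, "172.16.8.244_11009", 643),
    (2285119, "172.16.8.244_11009", 1017),
    (2297134, "172.16.8.244_11009", 7153),
    (2304287, "172.16.8.244_11023", 6744),
]

for location, fraction in location_preference(records):
    print(f"{location}: {fraction:.3f}")
```

For this excerpt the node spends roughly 80% of its online time at one location and 20% at the other, which is the kind of skew the TVC communities are meant to reproduce.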
Slide 18: Using the TVC Model – Reproducing Mobility Characteristics
• Vehicular trace (Cab-spotting)
[Figure: average fraction of online time associated with the location vs. locations sorted by total amount of time associated with them; curves: Model, Vehicle-trace]
[Figure: Prob.(node re-appears at the same location after the time gap) vs. time gap (days); curves: Model, Vehicle-trace]

Slide 19: Using the TVC Model – Reproducing Mobility Characteristics
• Human encounter trace at a conference
[Figures: Prob(Inter-meeting time > X) vs. inter-meeting time (s), with a 6 hours annotation; Prob(Meeting duration > X) vs. meeting duration (s); curves: Model, Cambridge-INFOCOM-trace]
[Timeline sketch: encounter duration is the time while A encounters B; inter-meeting time is the gap between encounters]

Slide 20: Summary (Case Study I)
• We observe some omnipresent mobility characteristics from WLANs
• These characteristics are not captured by existing synthetic mobility models (hence those models are not realistic)
• We propose the Time-variant Community (TVC) model, which is realistic, theoretically tractable, and flexible

Slide 21: Case study II – Groups in WLAN
[Outline diagram repeated]

Slide 22: Goal
• Identify similar users (in terms of long-run mobility preferences) from the diverse WLAN user population
  – Understand the constituents of the population
  – Identify potential groups for group-aware service
• In this work we classify users based on their long-run mobility trends (or location-visiting preferences)
  – We consider the semester-long USC trace (spring 2006, 94 days) and the quarter-long Dartmouth trace (spring 2004, 61 days)

Slide 23: Representation of User Association Patterns
• We choose to represent the summary of user association in each day by a single vector
  – Example day: Office, 10AM–12PM; Library, 3PM–4PM; Class, 6PM–8PM
  – Association vector: (library, office, class) = (0.2, 0.4, 0.4)
• Summarize the long-run mobility in an “association matrix”

  X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ x_{2,1} & \ddots & & \vdots \\ \vdots & & x_{i,j} & \vdots \\ x_{t,1} & \cdots & \cdots & x_{t,n} \end{pmatrix}

  – Each row represents an association vector for a time slot
  – An entry x_{i,j} represents the percentage of online time during time slot i at location j
  – Each column represents the popularity of a location across time
(TRACE stage: Represent)

Slide 24: Eigen-behavior
• Eigen-behaviors: the vectors that describe the maximum remaining power in the association matrix (obtained through Singular Value Decomposition)

  u_1 = \arg\max_{\|u\|=1} \| X u \|, \qquad u_k = \arg\max_{\|u\|=1} \Big\| \Big( X - \sum_{i=1}^{k-1} X u_i u_i' \Big) u \Big\| \quad \forall k \ge 2

  with quantifiable importance

  w_k = \frac{\sigma_k^2}{\sum_{i=1}^{\mathrm{Rank}(X)} \sigma_i^2}

• Eigen-behavior distance calculates similarity of users by weighted inner products of eigen-behaviors.

  \mathrm{Sim}(U, V) = \sum_{\forall i,j} w_i \, w_j \, | u_i \cdot v_j |

• Benefits: reduced computation and noise

Slide 25: Identify Similar User
• With the distance between users U and V defined as 1 − Sim(U,V), we use hierarchical clustering to find similar user groups.
[Figures (USC, Dartmouth): CDF of distance between users, intra-group vs. inter-group, for the eigen-behavior distance and for AMVD]
*AMVD = Average Minimum Vector Distance

Slide 26: Validation of User Groups
• Significance of the groups – users in the same group are indeed much more similar to each other than users in randomly formed groups (0.93 vs. 0.46 for USC, 0.91 vs. 0.42 for Dartmouth)
• Uniqueness of the groups – the most important group eigen-behavior is important for its own group but not for other groups

Significance score of top eigen-behavior for    USC      Dartmouth
Its own group                                   0.779    0.727
Other groups                                    0.005    0.004

Slide 27: User Groups in WLAN - Observations
• Skewed group size distribution – the largest 10 groups account for more than 30% of the population on campus. Power-law distributed group sizes.
• Most groups can be described by a list of locations with a clear ordering of importance
• We also observe groups visiting multiple locations with similar importance – taking the most important location for each user is not sufficient
[Figure: group size vs. user group size rank (log-log); fitted lines: Dartmouth 540 * x^-0.67, USC 500 * x^-0.75]
(TRACE stage: Characterize)

Slide 28: Enough of words! Let’s see how it works

Slide 29: Summary (Case Study II)
• We use SVD to obtain eigen-behaviors of individual users.
• We use the eigen-behavior distances and hierarchical clustering to classify WLAN users into similar groups.
• This finding is useful for mobility modeling (identifying group sizes and their frequently visited locations), network management, abnormality detection, and group-aware protocols (e.g., profile-cast, our future work)

Slide 30: Case study III – Encounter Pattern
[Outline diagram repeated]

Slide 31: Encounter Events
• Derived from simultaneous associations to the same locations
• How many other nodes does a node encounter?
  – On average, only 2%~7% of the population
[Figure: Prob.(unique encounter fraction > x)]

Slide 32: Encounter-Relationship (ER) graph
• Draw a link to connect a pair of nodes if they ever encounter each other
• Most node pairs are connected in the ER graph
• The ER graphs show SmallWorld graph characteristics
  – High clustering coefficient
  – Low average path length
(TRACE stages: Represent, Characterize)

Slide 33: Future Work – Profile-cast
[Outline diagram repeated]

Slide 34: Goal
• To send messages to a group of nodes within the general population
  – The group is defined by the intrinsic behavior patterns of the nodes (CISE students, library visitors, moviegoers)
  – The sender does not know the network identities (addresses) of the destinations
• Different from multi-cast: no join/leave, no group maintenance

Slide 35: Profile-cast Use Cases
• Mobility profile-cast (current work)
  – Targeting people who move in a particular pattern (lost-and-found, context-aware announcement)
  – Relies on the “similarity metric” between users
• Mobility-independent profile-cast (future work)
  – Targeting people with certain characteristics independent of mobility (classical music lovers)
  – Relies on the “Small World” encounter pattern

Slide 36: Mobility Profile-cast (intergroup)
[Figure: scoped message spread in the mobility space; sender S encounters nodes N and must decide whether to forward toward destinations D]

Slide 37: Inter-group profile-cast Operation
1. Profiling
• Profiling user mobility
  – The mobility of a node is represented by an association matrix
  – Singular value decomposition provides a summary of the matrix (a few eigen-behavior vectors are sufficient; e.g., for 99% of users at most 7 vectors describe 90% of the power in the association matrices for 94 days)
[Figure: association matrix with entries x_{1,1} ... x_{t,n} reduced to a few summary vectors; each row represents an association vector for a time slot; an entry x_{i,j} represents the percentage of online time during time slot i at location j]

Slide 38: Inter-group profile-cast Operation
1. Profiling
2. Forwarding decision
• Determining user similarity
  – Nodes exchange their eigen-behaviors and the corresponding weights at encounter
  – Similarity of user mobility is evaluated by weighted inner products of eigen-behaviors

    \mathrm{Sim}(U, V) = \sum_{\forall i,j} w_i \, w_j \, | u_i \cdot v_j |

  – Message forwarded if Sim(U,V) is higher than a threshold (recall that the goal is to deliver messages to nodes with similar profile)

Slide 39: Evaluation
• Based on the USC WLAN trace for realistic user mobility (2006 spring, 94 days, 5000 users)
• We use hierarchical clustering to identify 200 distinct groups based on mobility profile.
• We pick groups with 5 or more members and randomly pick 20% of the members in these groups as senders

Slide 40: Evaluation
• Spanning the spectrum of grouping knowledge
  – Complete user grouping info: Centralized protocol (highly efficient, but not practical)
  – Inferred user grouping info: Similarity-based protocol
  – No user grouping info: Epidemic and Random Tx. (simple, not optimized)

Slide 41: Evaluation - Result
[Figure: success rate, delay, and overhead for Flooding; Centralized; Similarity 0.7, 0.6, 0.5; RTx m=1, 3, 6, 9 with TTL=inf.; RTx m=inf. with TTL=1, 5. Value labels 3.13, 2.56, 2.06, 1.80, 2.11, 1.47 appear on the chart; their assignment to bars is not recoverable.]
• Centralized: Excellent success rate with only 3% overhead.
• Similarity-based: (1) 61% success rate at low overhead, 92% success rate at 45% overhead (2) A flexible success rate – overhead tradeoff
• RTx with infinite TTL: Much more overhead under similar success rate
• Short RTx with many copies: Good success rate/overhead, but delay is still long

Slide 42: Mobility Profile-cast (intra-group)
[Figure: the Goal compared with four strategies from sender S: Flooding, Similarity, Single long random walk, Multiple short random walks]

Slide 43: Mobility Profile-cast (inter-group)
• Sending to a mobility profile specified by the sender
  – Gradient ascent followed by local flooding (in the mobility space)
  – The current message holder holds on to the message until it encounters a node with higher similarity to the target
  – When the message reaches a point close enough to the target, local flooding is triggered

Slide 44: Mobility Profile-cast (inter-group)
[Figure: the Goal compared with strategies from sender S toward T.P.: Flooding, Flooding_sim, Gradient ascent, Single long random walk, Multiple short random walks]
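The two forwarding rules described in the profile-cast slides (forward when Sim(U,V) exceeds a threshold, and, under gradient ascent, hand the message to an encountered node that is more similar to the target) can be sketched with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the 90% power cut-off simply echoes the figure quoted on the profiling slide, and the absolute value on the inner product is assumed because singular vectors are only defined up to sign.

```python
import numpy as np


def eigen_behaviors(association_matrix, power=0.9):
    """Return (vectors, weights) capturing at least `power` of the matrix power.

    Rows of the matrix are per-time-slot association vectors, columns are
    locations. Weights follow w_k = sigma_k^2 / sum_i sigma_i^2.
    """
    x = np.asarray(association_matrix, dtype=float)
    _, sigma, vt = np.linalg.svd(x, full_matrices=False)
    energy = sigma ** 2
    if energy.sum() == 0:  # user never online: no eigen-behaviors
        return np.zeros((0, x.shape[1])), np.zeros(0)
    weights = energy / energy.sum()
    # Smallest k whose cumulative weight reaches the requested power.
    k = min(int(np.searchsorted(np.cumsum(weights), power)) + 1, len(weights))
    return vt[:k], weights[:k]


def similarity(profile_u, profile_v):
    """Sim(U, V) = sum over i, j of w_i * w_j * |u_i . v_j|."""
    vecs_u, w_u = profile_u
    vecs_v, w_v = profile_v
    return float(w_u @ np.abs(vecs_u @ vecs_v.T) @ w_v)


def should_forward(sim, threshold):
    """Threshold rule: forward only to sufficiently similar nodes."""
    return sim > threshold


def should_hand_over(holder_sim_to_target, peer_sim_to_target):
    """Gradient-ascent rule: pass the message to a peer closer to the target."""
    return peer_sim_to_target > holder_sim_to_target


# Three toy users over three days and three locations (illustrative data).
profile_a = eigen_behaviors([[1.0, 0.0, 0.0]] * 3)
profile_b = eigen_behaviors([[0.5, 0.5, 0.0]] * 3)
profile_c = eigen_behaviors([[0.0, 0.0, 1.0]] * 3)

print(should_forward(similarity(profile_a, profile_b), threshold=0.5))
print(should_forward(similarity(profile_a, profile_c), threshold=0.5))
```

Only the truncated vectors and their weights need to be exchanged at an encounter, which is what keeps the per-encounter cost small compared with exchanging full association matrices.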
Slide 45: Mobility Profile-cast (inter-group)
[Figure: delivery ratio, delay, and overhead for Flooding; Flooding to similar nodes only; Gradient Ascent; RW TTL=200; RW TTL=500; mRW TTL=500 thread=3; mRW TTL=10 thread=150]

Slide 46: Performance Comparison
[Figure: success rate, large groups, for Flooding, Flooding_sim, Gradient ascent, Few long RW, Many short RW; bars grouped by similarity range: sim<0.0001, 0.0001<=sim<0.001, 0.001<=sim<0.01, 0.01<=sim<0.1, 0.1<=sim]
• Gradient ascent helps to overcome the difficult case – when the source is far from T.P.
• Few long RW is better when S is far from T.P., but many short RW is better when S is close to T.P.

Slide 47: Performance Comparison
[Figure: delay, large groups, for the same protocols and similarity ranges]
• Gradient ascent helps to overcome the difficult case – when the source is far from T.P.
• Few long RW is better when S is far from T.P., but many short RW is better when S is close to T.P.
• Gradient ascent has some extra delay compared with flooding

Slide 48: Future Work
• Mobility independent profile-cast
  – The target group is not necessarily “close” in the mobility space
  – The encounter pattern provides a network in which most nodes are reachable
  – We don’t want to flood
  – How to leverage the Small World encounter pattern to reach the “neighborhood” of most nodes efficiently?

Slide 49: Mobility Independent Profile-cast
[Figure: the Goal compared with strategies from sender S: Flooding, SmallWorld-based, Single long random walk, Multiple short random walks]

Slide 50: Future Work
– One-copy-per-clique in the “mobility space”
– We expect this to work because similarity in mobility leads to frequent encounters
[Figures: layered interest space, mobility space, and physical space with a forwarding decision at S; encounter ratio vs. user pair similarity]

Slide 51: Final Notes
• Downside of trace-based work
  – Unable to access the ground truth
  – Using the current user data to speculate about the future
  – We have plenty of data sets for the normal scenarios, but should they be the focus?
• Concerns
  – Privacy
  – What’s a real-world application that can’t be done with today’s Internet?

Slide 52: Potential Directions
• Combining the trace with small-scale experiments
• Build new testbeds and prototype new services
• Come up with a specific scenario to provide solutions for
• Challenge the established knowledge

Slide 53: Thank You!! Any Questions?
http://nile.cise.ufl.edu/~weijenhs/
wjhsu@ufl.edu

Slide 54: Mobility Observations

Slide 55: Dimensions of User Association
[Figure: timeline of one user's association sessions across AP1, AP2, and AP3 from 12AM to 6AM the next day, marking online, offline, and handoff events]
– Activeness of a user
– Macro-level mobility (How widely does a user move?)
– Repetitive association pattern

Slide 56: User Activeness (Overall)
[Figure: Prob.(online time fraction > x)]
Overall, offline time is not negligible!! (On average a user is online between 14% and 36% of the time, except in the Dart-04 trace.)

Slide 57: User Activeness (Change of time)
[Figure: Prob.(online time fraction > x)]
Activeness of users in the same environment can change significantly across time

Slide 58: Macro-level mobility (Overall)
[Figure: Prob.(coverage > x)]
Overall, users visit a relatively small fraction of APs (on average, less than 5%)

Slide 59: Macro-level mobility (Change of time)
[Figure: Prob.(coverage > x)]
Although activeness of users changed significantly w.r.t. time, macro-level mobility remains similar

Slide 60: Macro-level mobility
[Figure: fraction of online time associated with the AP]
Users spend most of their online time with a few “favorite APs”: more than 65% of time with 1 AP, 95% of time with 5 APs.

Slide 61: Repetitive Association Pattern
[Figure: Prob(node re-appears at the same AP after the time gap); a second axis label reads “Network similarity index”]
Daily/weekly patterns are visible in some traces

Slide 62: Applications of the TVC Model

Slide 63: Using the TVC Model – Performance Prediction
• Geographic routing
  – How many nodes are needed to get the same performance, given different mobility patterns?
  – Geographic routing success rate heavily depends on the average node degree
[Figure: geographic routing success rate vs. nodal communication range; curves: Model1_200nodes, Model3_760nodes]

Slide 64: Using the TVC Model – Performance Prediction
• Simple message spreading (epidemic routing)
  – How fast does a message spread in a MANET?
  – Using the SI model and the “meeting probability” we use in the derivation of the meeting time
[Figure: nodes that received the message vs. simulation time; curves: Theory (SI model), Simulation]

Slide 65: Details of the Traces

Slide 66: Why WLAN traces?
• Such traces provide a solid foundation to understand wireless network users today, not to predict the future
  – It is hard to predict how technology evolves
  – But intrinsic human behavior sometimes does not change that much

Slide 67: Trace Sets
• Data sets used in the dissertation
  – University campus WLANs/generic users: Dartmouth, USC, UF (to be added later)
  – University campus WLAN/PDA users: UCSD, Dartmouth
  – Corporate WLAN/generic users: MIT/IBM
  – Vehicular trace (GPS positions of taxis): Cab-spotting
  – Human encounter trace at a conference: Cambridge-INFOCOM
Many traces available at archive http://nile.cise.ufl.edu/MobiLib/ or http://crawdad.cs.dartmouth.edu/

Slide 68: Trace Sets
• (Refer to page 38)
  – Thousands of users, hundreds of unique locations
  – We analyze month/semester-long traces
  – Comparing various trace collection methods and their influence on the observations

Slide 69: Representation of User Association Patterns
Example day: (library, 1:30PM-2:30PM), (office, 10AM-12PM), (class, 6PM-8PM)
Association vector: (library, office, class) = (0.2, 0.4, 0.4)
• We choose to represent the summary of user association in each day by a single vector.
• For a given day d, the user association vector is defined as an n-element vector a = {a_j : the percentage of online time user i spends at AP_j on day d}.
  – The elements of a vector sum to 1.
  – Use the zero vector for off-line users.
• The elements in the vectors quantify the relative importance (or attraction) of the AP to the user.
(TRACE stage: Represent)

Slide 70: TVC-backup

Slide 71: Theory Derivation – Hitting Time
• Hitting time – the time for a node to move into the communication range of a randomly chosen target coordinate, starting from the stationary distribution
[Figure: node trajectory reaching the target (hit)]

Slide 72: Theory Derivation – Hitting Time
1. Weighted average conditioned on the relative location of the ‘target’

   HT = \sum_{\forall\, comm_i} HT(\text{target in } comm_i) \cdot \Pr(\text{target in } comm_i)

2. Calculate the unit-time hitting probability for each scenario

   P_{ht} = P_{move} \cdot \frac{2 \, (\text{Comm\_range}) \, (\text{Avg\_speed})}{(\text{Community\_edge})^2}

3. Calculate the hitting probability for the whole time period
4. Calculate the conditional hitting time

Slide 73: User Grouping

Slide 74: The Association Matrix
• Summarize the long-run mobility preferences of individual nodes

  X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ x_{2,1} & \ddots & & \vdots \\ \vdots & & x_{i,j} & \vdots \\ x_{t,1} & \cdots & \cdots & x_{t,n} \end{pmatrix}

  – Each row represents an association vector for a time slot
  – An entry x_{i,j} represents the percentage of online time during time slot i at location j
  – Each column represents the popularity of a location across time
• We need to capture the ‘distance’ between the association matrices of different users

Slide 75: Naïve Distance Metric
• The “average minimum distance”
  – For each association vector (row) of user i, find the closest vector of user j and take the average of |a_jd’ − a_id| over all days d
  – Intuition: for every daily association vector of i, if there is a similar association vector for j, then (i,j) have similar behavior
  – Drawback: expensive to calculate; includes noise.

Slide 76: Summary of Association Matrix
• Association matrices have multi-modal row vectors, but low dimensionality
• Summary vector Y that captures the most variation in the row vectors X_i

  SIG(Y) = \frac{\sum_{i=1}^{d} | X_i \cdot Y |}{\sum_{i=1}^{d} | X_i |}

• Singular Value Decomposition (SVD) provides the desired property
[Figures (USC, Dartmouth): ratio of users vs. reconstruction rank (k), one curve per percentage of power captured (50, 60, 70, 80, 90)]

Slide 77: Benefit of the Eigen-behavior Distance
• Computation/storage efficient
  – AMVD: O(N^2 d^2)
  – Eigen-behavior distance: O(Nd + cN) [two exponents of 2 in this expression were displaced in extraction; their positions are not recoverable]
• Lower communication overhead when nodes exchange their behavior summary
• Noise reduction
[Figure: full association matrices compared with a few summary vectors]

Slide 78: Validation of the User Groups
• Significance of the groups – users in the same group are indeed much more similar to each other than users in randomly formed groups
[Figures (USC, Dartmouth): percentage of power captured (joint association matrix for a cluster) vs. percentage of power captured (random matrix)]

Slide 79: Validation of the User Groups
• The existence of distinct behavioral groups (~hundreds) – each group has a unique group eigen-behavior
• Uniqueness of the groups – the top-1 group eigen-behavior is important for its own group but not for other groups

Significance score of top eigen-behavior for    USC      Dartmouth
Its own group                                   0.779    0.727
Other groups                                    0.005    0.004

Slide 80: Encounter Study

Slide 81: Indicator for Inter-node Relationship
• Nodes have a closer relationship to each other if they visit similar APs during overlapped time periods
• From WLAN traces, we find “encounters” to measure inter-node relationship

Slide 82: Encounter distribution
• How many other nodes does a node encounter?
  – On average, only 2%~7% of the population
• How many total encounter events does a node have?
  – Total encounter count for WLAN users follows a BiPareto distribution
[Figures: Prob.(unique encounter fraction > x); Prob.(total encounter events > x)]

Slide 83: Information Diffusion via Encounter
• Information is carried by nodes physically, and disseminated to others via encounters.
[Figure: nodes A, B, and C meeting pairwise as time evolves]
• We try to understand how information propagates under current encounter patterns.
  (1) Broadcast traffic with all nodes participating in information diffusion.
  (2) Adequate storage and bandwidth.
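Information diffusion under the two assumptions above (every node relays, and every encounter is long enough to transfer the message) can be replayed over a list of encounter events. The sketch below is a minimal illustration; the (time, node_a, node_b) record layout and the function names are assumptions made for this example, and ties between encounters at the same timestamp are resolved by sort order.

```python
def diffuse(encounters, source, start_time=0):
    """Replay encounters in time order; return {node: time message received}.

    `encounters` is an iterable of (time, node_a, node_b) tuples (assumed
    layout). Every node that holds the message passes it on at each encounter.
    """
    received = {source: start_time}
    for t, a, b in sorted(encounters):
        if t < start_time:
            continue  # encounter happened before the message existed
        if a in received and b not in received:
            received[b] = t
        elif b in received and a not in received:
            received[a] = t
    return received


def unreachable_ratio(received, population):
    """Fraction of the population that never received the message."""
    return 1 - len(set(received) & set(population)) / len(population)


# Toy encounter list: D meets C before C has the message, so D is never reached.
encounters = [
    (1, "A", "B"),
    (2, "C", "D"),
    (3, "B", "C"),
    (5, "A", "E"),
]
population = {"A", "B", "C", "D", "E"}

received = diffuse(encounters, source="A")
print(sorted(received.items()))
print(round(unreachable_ratio(received, population), 3))
```

The example shows why encounter order matters: a link in the encounter graph only helps if it occurs after one endpoint already carries the message.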
Slide 84: Information Diffusion (Ideal)
• Is information diffusion likely to succeed?
[Figure: unreachable ratio over time]
In less than 10 days, a message can reach 90% of the user population

Slide 85: Information Diffusion (Selfish User)
• Is the information diffusion robust enough?
  – What if some nodes are non-cooperative?
[Figure: unreachable ratio (USC)]

Slide 86: Information Diffusion (long encounter)
  – What if short encounters are not quickly discovered or are inadequate to exchange the messages?
[Figure: trace duration = 15 days]

Slide 87: DTN Routing Decisions
• DTN routing protocols are decentralized
  – Each node relies on local information to make forwarding decisions
  – The decisions have direct impact on performance
    • Delivery probability
    • Overhead (transmission and storage)
    • Delay
[Figure: sender S encounters nodes N and must decide whether to forward toward destination D]

Slide 88: Evaluation
• We compare protocol performance based on the following metrics
  – Success rate (% of intended receivers that receive the messages)
  – Overhead (total transmission count)
  – Delay

Slide 89: More graphs for Info Diffusion

Slide 90: Delay with Selfish Nodes

Slide 91: Delay without Short Encounters
This note was uploaded on 05/27/2011 for the course CIS 4930 taught by Professor Staff during the Spring '08 term at University of Florida.
