COMP170 Discrete Mathematical Tools for Computer Science
Discrete Math for Computer Science, K. Bogart, C. Stein and R.L. Drysdale, Section 5.1, pp. 213-221
Intro to Probability
Version 2.0: Last updated, May 13, 2007
Slides © 2005 by M. J. Golin and G. Trippen

Introduction to Probability
• Why Study Probability?
• Complementary Probabilities
• Probability Spaces and Distributions
• Probability and Hashing
• The Uniform Probability Distribution

Why Study Probability?
In Computer Science we often deal with random events. Some involve randomness imposed from the outside, e.g., networking, where requests from computers on the network enter the network at "random" times. Some involve randomness that we introduce ourselves, e.g., hashing, which is a technique often used to store information compactly in a computer for quick retrieval later.
Studying the performance of computer systems in the presence of these types of randomness requires understanding randomness, which is the study of probability.

Hashing
Imagine a company with one hundred employees. There is not enough room in the main office to give each one a mailbox. So, instead, the company has one mailbox for each letter of the alphabet. When a letter arrives, it gets put into the box corresponding to the first letter of the recipient's surname. This is an example of a Hash Function.
Hashing is a very common programming tool that permits concise storage of data with quick lookups. The general idea is that we have a set of records that need to be stored. Each record is addressed using its key, e.g., a name or ID number. The records are stored in a table. Each table location, called a bucket or slot, holds a list of records. We are also given a hash function h(x). A record with key key is stored in the bucket with index h(key).

Hashing
[Figure: Hash Table T with m = 8 buckets/slots, indexed 0 through m - 1 = 7. Keys are integers. Our Hash Function: h(x) = x mod m. Data (with keys): 4, 7, 10, 13, 15. Keys 7 and 15 both map to bucket 7: collision!]
When searching for a record you might have to look at every record in the appropriate bucket, so a good hash function spreads keys evenly among the buckets.

Given: a table with 100 buckets and 50 keys.
• Is it possible that all 50 keys are assigned to the same bucket (the bad case)? Using a good hash function, you'd never see this in a million years.
• Actually, you also wouldn't see all the keys hash into different locations (the best case).
How can we calculate the likelihood of such events? → Study of Probability

In Probability Theory we need to define three related but different concepts:
• The underlying Sample Space and the Elements (Outcomes) in the sample space
• An Event in the sample space
• The Weight of an element in the sample space. The weights give a Probability Distribution (Measure).

Andrei Nikolaevich Kolmogorov...
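
The hashing slides above can be made concrete with a short Python sketch. It places the example keys 4, 7, 10, 13, 15 into m = 8 buckets using h(x) = x mod m, then computes the two likelihoods asked about for 50 keys and 100 buckets. The probability part rests on an assumption that the preview does not state: each key lands in each bucket independently and with equal probability. The function names (`build_table`, `prob_all_same_bucket`, `prob_all_distinct`) are illustrative and do not come from the slides.

```python
from collections import defaultdict
from fractions import Fraction
from math import perm


def build_table(keys, m):
    """Place each integer key in bucket h(key) = key mod m."""
    table = defaultdict(list)
    for key in keys:
        table[key % m].append(key)
    return dict(table)


def prob_all_same_bucket(n_keys, n_buckets):
    """P(all keys share one bucket), ASSUMING independent uniform hashing."""
    # Choose the shared bucket (n_buckets ways); each key lands in it
    # with probability 1/n_buckets.
    return n_buckets * Fraction(1, n_buckets) ** n_keys


def prob_all_distinct(n_keys, n_buckets):
    """P(no two keys share a bucket), ASSUMING independent uniform hashing."""
    # Ordered selections of distinct buckets, divided by all assignments.
    return Fraction(perm(n_buckets, n_keys), n_buckets ** n_keys)


# Example from the slides: keys 7 and 15 collide in bucket 7.
table = build_table([4, 7, 10, 13, 15], m=8)
print(table)

# 50 keys, 100 buckets: both extremes are very unlikely.
print(float(prob_all_same_bucket(50, 100)))  # 1e-98
print(float(prob_all_distinct(50, 100)))     # about 3e-07
```

Exact fractions are used so that the tiny "all in one bucket" probability is not lost to floating-point rounding before it is printed. Under the uniform assumption, both the bad case and the best case turn out to be rare, which matches the informal claims on the slide.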
 Spring '10
 M.J.Golin
 Computer Science, Probability theory, hash function, Probability space
