PDB_Stat_100_Lecture_06

STA 100 Lecture 6
Paul Baines
Department of Statistics, University of California, Davis
January 14th, 2011

Admin for the Day

- Homework 2 posted, due Wednesday, Jan 19th, in class.
- NO CLASS MONDAY – MLK DAY!
- Comments: In the math and statistics world, log always means natural log, i.e., log base e, or 'ln' as you may have seen previously. E.g., log(animals$brain) will give the (natural) log brain weights. (ln doesn't exist in R.)

References for Today: Rosner, Ch 3.7-3.8 (7th Ed.)
References for Wednesday: Rosner, Ch 4.1-4.9 (7th Ed.)
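On the note about logarithms above, here is a minimal sketch of how the log functions behave in R. The vector `weights` holds made-up values chosen only for illustration; it is not part of the lecture's data.

```r
# Made-up values, for illustration only
weights <- c(1, 10, 100)

nat  <- log(weights)             # natural log (base e) is the default
b10  <- log10(weights)           # base-10 log has its own function
alt  <- log(weights, base = 10)  # or pass the base explicitly
back <- exp(nat)                 # exp() undoes the natural log

nat
b10
```

If you ever need a base other than e, the `base` argument of `log` is probably the simplest route.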

Some R Tips/Tricks

You need two things to complete the R portion on Hwk 2: sampling and subsetting.

- Subsetting Data:
  Point-and-click via: Data > Active Dataset > Subset active dataset
  Choose a logical (T/F) subset expression. For example:
      animal == "Cow"
  Give your subset a different name!
  Note the double ==. It tests whether two things are equal; a single = does not.
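The same kind of logical subset expression can also be typed at the console using R's `subset` function. The sketch below is only an illustration: the small `animals` data frame is a hypothetical stand-in built from a few rows printed in these slides, not the full course dataset, and the name `cows` is made up.

```r
# Hypothetical stand-in for the course 'animals' data (a few rows only)
animals <- data.frame(
  animal = c("Cow", "Grey wolf", "Goat", "Jaguar"),
  body   = c(465, 36.33, 27.66, 100),
  brain  = c(423, 119.5, 115, 157),
  stringsAsFactors = FALSE
)

# Keep only the rows where the logical expression is TRUE.
# Note the double == and the new name for the subset.
cows <- subset(animals, animal == "Cow")
cows
```

Saving the result under a new name (here `cows`) leaves the original data frame untouched.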

Subsetting Data by Hand

- Manually Subsetting Data:

Simple examples:

    > animals[1,]        # show the first row
    > animals[1:5,]      # show the first 5 rows
    > animals[,"brain"]  # show the "brain" column

A more sophisticated example:

    > # Pick only those animals whose body size
    > # is bigger than the median body size
    > x.med <- median(animals$body)
    > subby <- animals[animals$body > x.med, ]
    > subby
           animal    body brain
    2         Cow   465.0 423.0
    6 Dipliodocus 11700.0  50.0
    ...etc...

Random Samples in R

- Sampling in R: You can sample from a list of values in R using the sample function:

    > x <- c(1:10) # x = (1,2,3,4,5,6,7,8,9,10)
    > # Randomly pick 4 values from x
    > first.time <- sample(x,4)
    > second.time <- sample(x,4) # do it again...
    > first.time
    [1] 3 6 1 4
    > second.time
    [1] 1 2 3 8
    > animals[first.time,]
               animal     body brain
    3       Grey wolf    36.33 119.5
    6     Dipliodocus 11700.00  50.0
    1 Mountain beaver     1.35 465.0
    4            Goat    27.66 115.0
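Because `sample` gives different values on every call, it can help to fix the random seed when you want to reproduce a draw. The following is a small sketch; the seed value 100 and the names `draw1`, `draw2`, and `with.rep` are arbitrary choices for illustration.

```r
x <- 1:10

# Setting the same seed before each call reproduces the same draw
set.seed(100)
draw1 <- sample(x, 4)
set.seed(100)
draw2 <- sample(x, 4)
identical(draw1, draw2)

# By default sample() draws without replacement, so no value repeats.
# Use replace = TRUE if repeated values should be allowed.
with.rep <- sample(x, 4, replace = TRUE)
```

Without a call to `set.seed`, each run of `sample(x, 4)` will generally return a different set of values.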

Putting it together

- We can now randomly pick which rows to keep, and we know how to pick out a subset of rows. Together:

    > # Randomly pick 4 of the 27 animals
    > animals[sample(1:27,4),]
                 animal   body brain
    17    Rhesus monkey    6.8   179
    23           Jaguar  100.0   157
    2               Cow  465.0   423
    15 African elephant 6654.0  5712

You will get a different answer each time!
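One possible variation on the row-sampling idea above is to ask R for the number of rows with `nrow` instead of typing it in, so the same line works for a data frame of any size. This is a sketch under assumptions: the small `animals` data frame is a hypothetical stand-in built from a few rows printed in these slides, and `n.keep`, `rows`, and `picked` are made-up names.

```r
# Hypothetical stand-in for the course 'animals' data (a few rows only)
animals <- data.frame(
  animal = c("Cow", "Grey wolf", "Goat", "Jaguar", "Rhesus monkey"),
  body   = c(465, 36.33, 27.66, 100, 6.8),
  brain  = c(423, 119.5, 115, 157, 179),
  stringsAsFactors = FALSE
)

n.keep <- 3                             # how many animals to pick
rows   <- sample(nrow(animals), n.keep) # random row numbers from 1..nrow
picked <- animals[rows, ]               # keep just those rows
picked
```

As before, the rows picked will change from run to run unless a seed is set first.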

SRS vs. Cluster Sampling

Recall:

- SRS (simple random sample): Everyone has the same chance of being picked.

