...n, we not only want to better understand a distribution, but we want to compare the distribution for subgroups or to compare against another population or standard
•  How do you think the expected grade distribution might vary with gender?

Two Qualitative variables

[Figure: mosaic plot "Stat 2 Survey"; axes: sex (Female, Male) and grade (A, B, C, DF)]

    mosaicplot(table(video$sex, video$grade), main = "Stat 2 Survey")

How to read a Mosaic plot
There are 91 students in the survey. Think of them as spread out evenly in the box.

New Plot: Mosaic
Put all the females on one side of the box. There are 38.

New Plot: Mosaic
Rearrange the females so that those who expect the same grade are together in the box. 8 of the 38 expect a C.

Mosaic plot

[Figure: mosaic plot "Stat 2 Survey"; axes: sex (Female, Male) and grade (A, B, C, DF)]

•  Smaller fraction of females expect an A in comparison to males
•  None of the males expect a C

Case: East Bay Housing Market

    load(url("http:// StatData/SFHousing.rda"))

Warning: It’s BIG

San Francisco Chronicle listings

Data
•  Record: house sold in a particular time period
•  Over 200,000 houses
•  Subset to a dozen cities in the East Bay, about 25,000 houses

Variables:
•  City
•  County
•  Price
•  # bedrooms
•  Lot square footage
•  and 10 more

Relationship between city and sale price
Data types:
•  City - factor
•  Sale price - numeric

Examine a subset of the cities

    someCities = c("Albany", "Berkeley", "El Cerrito", "Emeryville",
                   "Piedmont", "Richmond", "Lafayette", "Walnut Creek",
                   "Kensington", "Alameda", "Orinda", "Moraga")
    shousing =
      housing[housing$city %in% someCities & housing$price < 2000000, ]
    dim(shousing)
    [1] 20415 15

Boxplots

    boxplot(shousing$price ~ shousing$city, las = 2)

Cities ordered by median price

Relationship between price per square foot and total square foot
Both are quantitative

    ppsf = shousing$price/shousing$bsqft
    plot(ppsf ~ shousing$bsqft)

WHAT’s Wrong with this plot?

Scatter plot

    plot(ppsf ~ shousing$bsqft,                 # plot y against x
         pch = 19,                              # change plotting character to solid circle
         cex = 0.2,                             # shrink plotting character to 20%
         subset = shousing$city == "Berkeley",  # plot a subset of records
         main = "Berkeley",                     # title of plot
         xlab = "Area (ft^2)",                  # label for x axis
         ylab = "Price/ft^2")                   # label for y axis

Relationships between more than 2 variables
•  Qualitative information can be conveyed in plots through color, plotting symbol, juxtaposed panels
•  The following plot uses information from 4 variables: city, number of bedrooms, lot size (sq ft), and price per square ft

What do you see?

[Figure: plot with points labeled by city (Berkeley, Piedmont) and by number of bedrooms (legend begins "1 bedrooms") ...]
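
Encoding qualitative variables through color and plotting symbol can be sketched roughly as follows. This is a minimal illustration on made-up data, not the code behind the slide's plot: the data frame `toy`, its values, and the names `cityCols`, `brPch`, `ptCol`, and `ptPch` are all hypothetical.

```r
# Hypothetical toy data standing in for the housing records (values are made up)
toy <- data.frame(
  city  = factor(c("Berkeley", "Berkeley", "Piedmont",
                   "Piedmont", "Berkeley", "Piedmont")),
  br    = c(1, 2, 3, 2, 3, 1),                     # number of bedrooms
  lsqft = c(2500, 4000, 6000, 5000, 4500, 3000),   # lot size (sq ft)
  ppsf  = c(520, 480, 610, 590, 450, 640)          # price per square ft
)

# One color per city, one plotting symbol per bedroom count
cityCols <- c(Berkeley = "blue", Piedmont = "red")
brPch    <- c(15, 17, 19)                          # symbols for 1, 2, 3 bedrooms

ptCol <- cityCols[as.character(toy$city)]          # look up color by city name
ptPch <- brPch[toy$br]                             # look up symbol by bedroom count

# Four variables in one plot: x, y, color (city), symbol (bedrooms)
plot(toy$lsqft, toy$ppsf,
     col = ptCol, pch = ptPch,
     xlab = "Lot size (sq ft)", ylab = "Price/ft^2")
legend("topright", legend = names(cityCols), col = cityCols, pch = 19)
legend("bottomright", legend = paste(1:3, "bedrooms"), pch = brPch)
```

Looking colors and symbols up in small named vectors keeps the mapping from category to appearance in one place, so the same vectors can be reused for the legends.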
