

...n, we not only want to better understand a distribution, but we want to compare the distribution for subgroups or to compare against another population or standard

•  How do you think the expected grade distribution might vary with gender?

Two Qualitative variables

[Figure: mosaic plot titled "Stat 2 Survey"; sex (Male, Female) against grade (A, B, C, DF)]

    mosaicplot(table(video$sex, video$grade), main = "Stat 2 Survey")

How to read a Mosaic plot

There are 91 students in the survey. Think of them as spread out evenly in the box.

New Plot: Mosaic

Put all the females on one side of the box. There are 38.

New Plot: Mosaic

Rearrange the females so that those who expect the same grade are together in the box. 8 of the 38 expect a C.

Mosaic plot

[Figure: mosaic plot titled "Stat 2 Survey"; sex (Male, Female) against grade (A, B, C, DF)]

A smaller fraction of females expect an A in comparison to males.
None of the males expect a C.

Case: East Bay Housing Market

    load(url("http://www.stanford.edu/~vcs/StatData/SFHousing.rda"))

Warning: It’s BIG

San Francisco Chronicle listings

Data
•  Record: house sold in a particular time period
•  Over 200,000 houses
•  Subset to a dozen cities in the East Bay – about 25,000 houses

Variables:
•  City
•  County
•  Price
•  # bedrooms
•  Lot square footage
•  and 10 more

Relationship between city and sale price

Data types:
City - factor
Sale price - numeric

Examine a subset of the cities

    someCities = c("Albany", "Berkeley", "El Cerrito", "Emeryville",
                   "Piedmont", "Richmond", "Lafayette", "Walnut Creek",
                   "Kensington", "Alameda", "Orinda", "Moraga")
    shousing =
      housing[housing$city %in% someCities & housing$price < 2000000,]
    dim(shousing)
    [1] 20415 15

Boxplots

    boxplot(shousing$price ~ shousing$city, las = 2)

Cities ordered by median price

Relationship between price per square foot and total square foot

Both are quantitative

    ppsf = shousing$price/shousing$bsqft
    plot(ppsf ~ shousing$bsqft)

WHAT’s Wrong with this plot?

Scatter plot

    plot(ppsf ~ shousing$bsqft,                 # plot y against x
         pch=19,                                # change plotting character to solid circle
         cex = 0.2,                             # shrink plotting character to 20%
         subset = shousing$city =="Berkeley",   # plot a subset of records
         main="Berkeley",                       # title of plot
         xlab="Area (ft^2)",                    # label for x axis
         ylab = "Price/ft^2")                   # label for y axis

Relationships between more than 2 variables
•  Qualitative information can be conveyed in plots through color, plotting symbol, juxtaposed panels
•  The following plot uses information from 4 variables: city, number of bedrooms, lot size (sq ft), and price per square ft

What do you see?

[Figure: panels for Berkeley and Piedmont with points keyed by number of bedrooms; the rest of the figure is cut off in this preview ...]
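One way to combine color and juxtaposed panels in base R graphics is sketched below. This is not the code behind the slide's figure, which the preview does not show. The data frame toy is made up so that the sketch runs without the housing file, and the column names br (bedrooms) and lsqft (lot size) are assumptions about how the real data might be named.

```r
# Synthetic stand-in for the housing data; every value here is invented.
set.seed(1)
n <- 200
toy <- data.frame(
  city  = factor(sample(c("Berkeley", "Piedmont"), n, replace = TRUE)),
  br    = sample(1:4, n, replace = TRUE),   # number of bedrooms (assumed name)
  lsqft = round(runif(n, 2000, 10000)),     # lot size in sq ft (assumed name)
  bsqft = round(runif(n, 600, 3500))        # building size in sq ft
)
toy$price <- round(toy$bsqft * runif(n, 250, 600))
toy$ppsf  <- toy$price / toy$bsqft          # price per square foot

# One color per bedroom count: qualitative information carried by color.
brColors <- c("black", "red", "blue", "darkgreen")

# One panel per city, with shared axis limits so the panels compare directly.
op <- par(mfrow = c(1, nlevels(toy$city)))
for (ct in levels(toy$city)) {
  d <- toy[toy$city == ct, ]
  plot(d$lsqft, d$ppsf,
       col = brColors[d$br], pch = 19, cex = 0.5,
       xlim = range(toy$lsqft), ylim = range(toy$ppsf),
       main = ct, xlab = "Lot size (ft^2)", ylab = "Price/ft^2")
  legend("topright", legend = paste(1:4, "bedrooms"),
         col = brColors, pch = 19, cex = 0.7)
}
par(op)  # restore the previous panel layout
```

Sharing xlim and ylim across panels is a deliberate choice here: with independent axes, the eye compares positions that do not correspond to the same values.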

This document was uploaded on 02/16/2014 for the course STATISTICS 3026 at Columbia.
