Stat 104 Section 2, Spring 2011

Contact info
1. Josh Zagorsky, < [email protected] >
2. Office Hours: Thursdays from 2pm-3pm in Science Center 600, or email for other times.

Notes from p-set #1
1. Quartiles: there are different naming conventions, but if a datapoint falls at the 30th percentile, you can say it's in the "2nd quartile."
2. Log transforms: don't worry about this yet (we'll cover it later), but log transforms make things more left-skewed, because they compress large values and pull in a long right tail.

New concepts
1. Sample standard deviation: $s = \sqrt{\text{variance}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
2. Sample covariance of $x$ and $y$, denoted $s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$. Also, $Cov(X, X) = Var(X)$.
3. Correlation $= \frac{\text{Covariance}}{\sqrt{\text{Variance}_X \times \text{Variance}_Y}}$
4. Probability: $0 \le P(A) \le 1$. The event that $A$ doesn't happen is called the complement of $A$, denoted $\bar{A}$ in lecture and $A^c$ in the textbook. Its probability is $P(A^c) = 1 - P(A)$.
5. Chebyshev's Rule: For any set of data, regardless of the distribution, the proportion of the
data that lies within $k$ standard deviations of the mean is at least $1 - \frac{1}{k^2}$ (lecture 4). For example, with $k = 2$, at least 75% of the data lies within two standard deviations of the mean.
6. Linear transformations of a dataset: take a dataset $X$, multiply each point by $b$ and add $a$ to it, and you have the dataset $a + bX$. Then $Var(a + bX) = b^2 Var(X)$ and $Mean(a + bX) = a + b \times Mean(X)$.
7. Regression: Fit a line to the data that minimizes the sum of squared errors. For stocks, the slope of this line is the $\beta$.

Stata commands
1. Make a two-stock, 50-50 portfolio: generate portf = 0.5*stock1 + 0.5*stock2
2. Scatterplot with fitted line: twoway (scatter y_variable x_variable) (lfit y_variable x_variable)
3. Regression: regress dependent_variable independent_variable1, followed by any number of additional independent variables
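
The commands above can be tried end to end on a small dataset. The following is only a sketch: the five rows of returns are made up for illustration (they are not from the p-set), and the variable name shifted is a hypothetical choice. The names stock1, stock2, and portf follow the portfolio command in the list above.

```stata
* Toy data, invented for illustration only
clear
input stock1 stock2
 0.02  0.01
-0.01  0.03
 0.04 -0.02
 0.00  0.01
 0.03  0.02
end

* Sample means and standard deviations (Stata divides by n - 1)
summarize stock1 stock2

* Sample covariance matrix, then the correlation matrix
correlate stock1 stock2, covariance
correlate stock1 stock2

* Linear transformation a + bX with a = 3 and b = 2:
* the mean becomes 3 + 2*mean, and the variance is multiplied by 2^2 = 4
generate shifted = 3 + 2*stock1
summarize stock1 shifted, detail

* 50-50 portfolio, scatterplot with fitted line, and regression
generate portf = 0.5*stock1 + 0.5*stock2
twoway (scatter stock2 stock1) (lfit stock2 stock1)
regress stock2 stock1
```

Comparing the Variance rows in the detailed summary of stock1 and shifted is a quick way to check the $b^2$ rule, and the coefficient on stock1 in the regress output is the slope of the fitted line drawn by lfit.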

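
A small hand computation may help tie the standard deviation, covariance, and correlation formulas together. The three data points below are made up for illustration and do not come from lecture or the p-set.

```latex
% Illustrative data (assumed): x = (1, 2, 3), y = (2, 4, 9), so n = 3
\begin{align*}
\bar{x} &= 2, \qquad \bar{y} = 5 \\
s_x^2 &= \frac{(-1)^2 + 0^2 + 1^2}{3 - 1} = 1, \qquad s_x = 1 \\
s_y^2 &= \frac{(-3)^2 + (-1)^2 + 4^2}{3 - 1} = 13, \qquad s_y = \sqrt{13} \approx 3.61 \\
s_{xy} &= \frac{(-1)(-3) + (0)(-1) + (1)(4)}{3 - 1} = 3.5 \\
\text{Correlation} &= \frac{s_{xy}}{\sqrt{s_x^2 \, s_y^2}} = \frac{3.5}{\sqrt{13}} \approx 0.97
\end{align*}
```

The correlation is close to 1 because $y$ rises whenever $x$ does, and it can never exceed 1 in absolute value.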