
# R Commands.docx - STA 371G R Commands


• Create a list of numbers (vector)
  o `Name_of_list <- c(#1, #2, #3, …)`
• Calculating and naming sample statistics
  o `sample_mean <- mean(Name_of_list)`
  o `sample_variance <- var(Name_of_list)`
  o `sample_standard_deviation <- sd(Name_of_list)`
• Calculate a sample statistic of only the first 5 numbers of a list
  o `Sample_mean_5 <- mean(Name_of_list[1:5])`
• Calculate 95% confidence interval
  o `Avg_price_ci_95 <- t.test(Name_of_list, conf.level=0.95)`
• Make a histogram of a variable from a dataset
  o `hist(datasetname$variablename, main='', xlab='Title of the variable', col='color')`
• Simple regression model
  o `model <- lm(y_variable ~ x_variable)`
  o `summary(model)`
• Predict/extrapolate a value from a linear model
  o `predict.lm(model_name, list(x_variable=#, x_variable=#))`
• Create a confidence interval for a regression model (a range that we are 95% sure contains the true slope and intercept)
• Create a confidence and prediction interval for a specific X value (confidence = what is the average response at that X value; prediction = what is the exact Y value for that X value)
• Make a scatterplot of 2 variables from a dataset
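
The commands listed above can be tried end to end on a small dataset. The sketch below is illustrative only: the data frame `houses` and its columns `sqft` and `price` are made up and do not come from the course materials. This preview lists the regression confidence interval and the confidence/prediction interval steps without showing a command, so the use of base R's `confint()` and the `interval` argument of `predict()` is an assumption about what the course intends, not a quotation from it.

```r
# Hypothetical data, invented purely for illustration (price in $1000s).
houses <- data.frame(
  sqft  = c(1100, 1400, 1600, 1850, 2100, 2400, 2750, 3000),
  price = c(199, 245, 270, 312, 340, 395, 430, 480)
)

# Sample statistics of a vector
sample_mean <- mean(houses$price)
sample_variance <- var(houses$price)
sample_standard_deviation <- sd(houses$price)

# Same statistic, first 5 values only
sample_mean_5 <- mean(houses$price[1:5])

# 95% confidence interval for the population mean
avg_price_ci_95 <- t.test(houses$price, conf.level = 0.95)
avg_price_ci_95$conf.int

# Simple regression of price on sqft
model <- lm(price ~ sqft, data = houses)
summary(model)

# Point prediction at a chosen X value (2000 is an arbitrary example value)
new_x <- data.frame(sqft = 2000)
point_pred <- predict(model, new_x)

# Assumed command: 95% confidence intervals for the intercept and slope
coef_ci <- confint(model, level = 0.95)

# Assumed commands: interval for the average response vs. for a single new Y
conf_int <- predict(model, new_x, interval = "confidence", level = 0.95)
pred_int <- predict(model, new_x, interval = "prediction", level = 0.95)

# Scatterplot of the two variables
plot(houses$sqft, houses$price, pch = 16, xlab = "sqft", ylab = "price")
```

The prediction interval comes out wider than the confidence interval at the same X value, because it accounts for the scatter of individual observations around the line as well as the uncertainty in the line itself.
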
    • `plot(stock_market_returns$W5000, resid(model), pch=16, col='green', xlab='W5000', ylab='Residuals')`
  o Errors are normally distributed
    • Look at a histogram of the residuals and check for approximate normality: `hist(resid(model), col='darkred', xlab='Residuals', main='')`
    • Look at a Q-Q plot of the residuals and check for a straight line: `qqnorm(resid(model), main='')`
  o Variance of Y is the same for any value of X (homoscedasticity)
    • Look at the residual plot to make sure there is roughly equal spread all the way across
• Create a multiple regression model
  o `model <- lm(Y ~ X1 + X2 + X3 …, data=dataset_name)`
  o `summary(model)`
• Summary with rounded decimal places
  o `round(summary(model)$coefficients, 3)`
• Testing if the whole multiple regression model is significant
  o Use the P-value of the overall model
  o To see how good the predictions are:
    • Check the histogram of the residuals for normality: `hist(model$residuals, col='green', main='', xlab='Residuals', ylab='Frequency')`
    • Check the mean of the residuals, which should be very close to 0: `mean(model$residuals)`
    • Find the SD of the residuals: `sd(model$residuals)`
    • You can now describe the distribution of the residuals by its mean and SD, to see how much of the data (Y) falls within a given number of standard deviations
    • These statistics can also be obtained directly from the regression model: `summary(model)$sigma` gives the residual standard error, which also appears in the general summary output
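
The multiple regression checks above can be run together in one short script. This is a minimal sketch that assumes R's built-in `mtcars` dataset as a stand-in, since the course's own dataset is not shown in this preview; the choice of `mpg` as the response and `wt` and `hp` as predictors is arbitrary. Extracting the overall P-value with `pf()` is one common approach and is not taken from the source, which only says to read it off the summary.

```r
# mtcars ships with base R; it stands in for the course dataset (an assumption).
model <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table rounded to 3 decimal places
coef_table <- round(summary(model)$coefficients, 3)

# Overall model P-value: summary() prints it but stores only the F statistic,
# so it is recomputed here from the F statistic and its degrees of freedom.
f <- summary(model)$fstatistic
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Residual diagnostics
hist(model$residuals, col = "green", main = "",
     xlab = "Residuals", ylab = "Frequency")   # roughly bell-shaped?
qqnorm(resid(model), main = "")                # roughly a straight line?
plot(mtcars$wt, resid(model), pch = 16,
     xlab = "wt", ylab = "Residuals")          # roughly equal spread across X?

# Mean and SD of the residuals
resid_mean <- mean(model$residuals)
resid_sd <- sd(model$residuals)

# Residual standard error taken directly from the model summary
resid_se <- summary(model)$sigma

# Share of residuals within 1 and 2 SDs of zero
within_1sd <- mean(abs(model$residuals) <= resid_sd)
within_2sd <- mean(abs(model$residuals) <= 2 * resid_sd)
```

Note that `sd(model$residuals)` and `summary(model)$sigma` are close but not identical: `sd()` divides by n - 1, while the residual standard error divides by the residual degrees of freedom (n minus the number of estimated coefficients), so the latter is slightly larger.
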


• Fall '11
• Damien