Example:
For a certain statistics course, data for students' scores on Exam 1 are used to try to
predict students' scores on Exam 2. Using the output below, answer the following
questions:
1. Describe the form, strength, and direction of the relationship be
HOMEWORK 11
11.10
(a) The normality assumption states that only the response variable needs to be normal.
(b) The R-square represents the explanatory power.
(c) The explanatory variables do not each need to be statistically significant. But the
tota
Two-Way Tables in SAS
options nodate pageno=1;
goptions colors=(none);
title 'Age versus College Program - class example';
data one;
input age $ program $ count;
datalines;
18below 2full 36
18below 2part 98
18below 4full 75
18below 4part 37
18to21 2full 1
Getting Started in SAS
Some general instructions regarding homeworks:
Do not use alternate programs to make any of your graphs; for now, just make them by hand.
In Homework 2 we will start using SAS!
We will use text files for all our datasets, i.e. ges
Section 11.1
Multiple Linear Regression (MLR)
Topics
MLR
Extension of SLR
Statistical model
Estimation of the parameters and interpretation
R-square with MLR
ANOVA table
F-Test and t-tests
A continuation of Chapter 10
Most things are conceptually simi
Section 7.1
Inference for the mean of a population
Change: Population standard deviation (σ)
is now unknown
The t distribution
One-sample t confidence interval
One-sample t test
Matched pairs t procedures
Robustness of t procedures
The t distribution:
Sections 2.1-2.2
Looking at Data-Relationships
Data with two or more variables:
Response vs Explanatory variables
Scatterplots
Correlation
Regression line
Association between a pair of
variables
Association: Some values of one variable tend to
occur mo
Section 5.2
Sampling Distribution for Counts and
Proportions
Preview
Population distribution vs. sampling
distribution
Binomial distributions for sample counts
Finding binomial probabilities: tables
Binomial mean and standard deviation
Sample proporti
Section 1.3
The Normal Distributions
Topics
Density curves
Normal distributions
The 68-95-99.7 rule
The standard normal distribution
Normal distribution calculations
Standardizing observations
Normal quantile plots
Density curves
Density curve
Imagi
Section 10.1
Simple Linear Regression
A continuation of Chapter 2
Statistical model for linear regression
Data for simple linear regression
Estimation of the parameters
Confidence intervals and significance tests
Confidence intervals for mean response
Correlation Example
options ls=72;
title1 'Gesell Correlation Example';
data gesell;
infile 'C:\gesell.txt';
input name $ age score;
yrs = age/12; /* creating new variable "yrs" which
converts age in months to age in years */
run;
symbol value = circle;
p
ANOVA in SAS
Example: In a study of workplace safety, workers were asked to rate the safety of their work environment, and
a composite score called the Safety Climate Index (SCI) was calculated. The workers were classified
according to their job category
7.24 (a).
7.141
(a).
The distribution of three-bedroom house prices is somewhat right-skewed, while that of four-bedroom house prices is almost normal. The price of a four-bedroom house tends to be higher
than that of a three-bedroom house. The spread of t
HOMEWORK 9
2.18
(a) The scatterplot shows no clear form. The association is slightly positive but weak,
and there is an outlier.
(b) The outlier is O'Doul's. It has the lowest alcohol percentage and the fewest calories. These
features are suitab
9.24
(a). There were 27802 students in total. The marginal total for females is 14915; for males it is
12887. The marginal total for people who said no is 14279; for people who said yes it is
13523.
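As a quick arithmetic check on part (a), each pair of marginal totals should add up to the grand total:

```latex
\begin{align*}
14915 + 12887 &= 27802 \quad \text{(female + male)} \\
14279 + 13523 &= 27802 \quad \text{(no + yes)}
\end{align*}
```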
(b). The percent of people who said no is 51.36%, and the
Plot Examples
options ls=72;
goptions colors=(none);
title1 'Botulism Data';
/* data on fatal food poisoning: outcome is died (d) or survived (s);
age and incubation period are also recorded */
data picnic;
input outcome $
datalines;
d 29 13 s 39 46
s 17 20 d 38 18
d
Test Proportions in SAS
options nodate pageno=1;
goptions colors=(none);
title 'Restaurant Workers Is work stressful?';
/* SAS uses alphabetical ordering to determine a success for a
binomial, so in this case N comes before Y, so a No is a success in
the
HOMEWORK 12
12.10
(a) DFG=7-1=6
DFE=5*7-7=28
.025<P-value<.05
(b) DFG=5-1=4
DFE=11*5-5=50
.05<P-value<.10
(c) DFG=6-1=5
DFE=34-5=29
.010<P-value<.025
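The degrees of freedom above follow the usual one-way ANOVA formulas. As a sketch, writing I for the number of groups and N for the total number of observations (these symbols are assumed here and do not appear in the answers):

```latex
\begin{align*}
\mathrm{DFG} &= I - 1 \\
\mathrm{DFE} &= N - I \\
\mathrm{DFT} &= N - 1 = \mathrm{DFG} + \mathrm{DFE}
\end{align*}
```

For (a), I = 7 and N = 5*7 = 35 give DFG = 6 and DFE = 28. Part (c) is consistent with these formulas if the 34 there is read as DFT, which is an assumption: DFE = DFT - DFG = 34 - 5 = 29.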
12.12
(a) S1=37, S2=28, S3=42
2*28=56>42
It is reasonable to use the assumption of equal standard deviatio
HOMEWORK 10
10.6
(a) The slope describes the change in y for a change in x.
(b) y = b0 + b1x is the sample regression line. The population regression line is y = β0 + β1x.
(c) The mean response varies with x. Thus, the 95% CI for the mean response depends on
x.
1
Regression in SAS (1)
options ls=72;
title1 'Gesell Regression Example';
/* the dataset gesell2 has an added
observation Mary with missing score */
data gesell2;
infile 'C:\gesell2.txt';
input name $ age score;
run;
proc print data=gesell2;
run;
Gesell Re
Section 12.1
One-Way Analysis of Variance (ANOVA)
Inference for One-Way ANOVA
Comparing means for several groups
Format of data
An analogy: two-sample t statistic
ANOVA hypotheses and model
Understanding two types of variation
Estimates of population