Experimental Design and Statistical Analysis for Engineers
CIVE 657

Fall 2014
Another Example: Semi-Log Transformation
We have data on PCB concentrations in fish tissue from the Great Lakes
First reported in high concentrations in the early 1970s
Phased out in the late 70s
Started to decline in the 80s
Level of decline slowed in t
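A minimal sketch of a semi-log fit, assuming the decline is roughly exponential so that log(concentration) is linear in time. The data below are simulated for illustration only; they are not the Great Lakes PCB measurements, and the names year, conc and fit are hypothetical.

```r
# Synthetic data only: exponential decline with multiplicative noise
set.seed(1)
year <- 0:19                                   # years since start of record
conc <- 20 * exp(-0.15 * year) * exp(rnorm(length(year), sd = 0.1))

# Semi-log model: regress log(concentration) on time
fit <- lm(log(conc) ~ year)
coef(fit)                                      # slope estimates the decay rate

# Back-transform to the original scale
start_level <- exp(coef(fit)[["(Intercept)"]]) # estimated level at year 0
fitted_conc <- exp(fitted(fit))                # fitted curve on original scale
```

The slope is read as a constant proportional change per year, which is the main reason for fitting on the log scale.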
9/26/2013
CIVE 657
Distribution Fitting
We are interested in learning about a population
We can't measure the entire population
We sample from the present population: take a small number of independent repetitions
We are interested in the best model tha
Model Selection in R
We will work again with the data from Problem 6.9, Grocery Retailer. Recall that we
formed a data table named Grocery consisting of the variables Hours, Cases,
Costs, and Holiday. We ran a full linear model which we named Retailer
inv
9/26/2013
Some special values in R
NaN: is generated when you do 0/0 or Inf/Inf
Inf: is generated when you do 1/0
-Inf: is generated when you do -5/0
NA: represents Not Available; used for missing data
The is.na() function can be used to check for missing
v
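The special values can be checked directly at the console. This is a short sketch; the vector x is a hypothetical example with one missing entry.

```r
nan1 <- 0 / 0          # NaN
nan2 <- Inf / Inf      # NaN
pinf <- 1 / 0          # Inf
ninf <- -5 / 0         # -Inf (negative numerator)

x <- c(2.5, NA, 4.1)   # hypothetical data with one missing value
is.na(x)               # FALSE TRUE FALSE
is.na(nan1)            # TRUE: is.na() also flags NaN
mean(x)                # NA, because the missing value propagates
mean(x, na.rm = TRUE)  # 3.3, missing value dropped first
```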
10/30/2013
Some Final Notes
CIVE 657
Part II
Bartlett test for two-way ANOVA in R has a BUG!
bartlett.test(x$yield~interaction(x$fert,x$watr))
bartlett.test(x$yield~x$fert*x$watr)
Use the first not the sec
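A sketch of the interaction() form on data that ships with R. The built-in warpbreaks data set (breaks, wool, tension) is an assumption standing in for the slide's x (yield, fert, watr), which is not available here.

```r
# Each wool x tension combination becomes one group (cell)
cells <- interaction(warpbreaks$wool, warpbreaks$tension)
n_cells <- nlevels(cells)            # 2 wools x 3 tensions = 6 cells

# Bartlett test of equal variances across all cells
res <- bartlett.test(breaks ~ interaction(wool, tension), data = warpbreaks)
res$parameter                        # degrees of freedom = number of cells - 1
res$p.value
```

Checking that the degrees of freedom equal the number of cells minus one is a quick way to confirm that every factor combination was used as a group.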
Chapter 5 Goodness of Fit Tests
5 GOODNESS OF FIT TESTS
Objectives
After studying this chapter you should
be able to calculate expected frequencies for a variety of
probability models;
be able to use the χ² distribution to test if a set of observations
fit
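A small worked example of expected frequencies and the χ² statistic, assuming a fair-die model. The observed counts are hypothetical numbers chosen for illustration; they do not come from the chapter.

```r
# Hypothetical counts from 120 rolls of a die (illustration only)
observed <- c(18, 23, 16, 21, 22, 20)
p_model  <- rep(1 / 6, 6)               # probabilities under the fair-die model
expected <- sum(observed) * p_model     # 20 expected per face

# Chi-square statistic and p-value computed by hand
chi2    <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1
p_value <- pchisq(chi2, df, lower.tail = FALSE)

# Built-in equivalent
res <- chisq.test(observed, p = p_model)
```

Here chi2 works out to 34/20 = 1.7 on 5 degrees of freedom, and chisq.test() reports the same statistic.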
Statistics 512: Applied Linear Models
Topic 3
Topic Overview
This topic will cover
thinking in terms of matrices
regression on multiple predictor variables
case study: CS majors
Text Example (NKNW 241)
Chapter 5: Linear Regression in Matrix Form
The SLR M
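A sketch of simple linear regression in matrix form, assuming the usual least-squares solution b = (X'X)^(-1) X'y. The built-in cars data set is used only as a stand-in; it is not the text's example.

```r
# Design matrix: a column of ones (intercept) plus the predictor
X <- cbind(1, cars$speed)
y <- cars$dist

# Solve the normal equations (X'X) b = X'y
b <- solve(t(X) %*% X, t(X) %*% y)

# Same estimates from lm()
b_lm <- coef(lm(dist ~ speed, data = cars))
```

Using solve(A, B) rather than forming the inverse explicitly gives the same answer with less numerical error.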
Data Type
Distribution
Binomial
Discrete
Poisson
Uniform
Exponential
Description
* Repeatable event
* Only 2 possible outputs
Example:
A game consisting of flipping a coin (P=0.5) 5 times (n=5), where you win
an amount depending on how many faces you get (k=)*
Repea
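The coin-flip example (n=5, P=0.5) can be tabulated with dbinom(). This is a sketch; the names n, p, k and probs are chosen here for illustration.

```r
n <- 5                                   # number of flips
p <- 0.5                                 # probability of a face on each flip
k <- 0:n                                 # possible numbers of faces

probs <- dbinom(k, size = n, prob = p)   # P(K = k) for k = 0..5
names(probs) <- k
probs                                    # 1/32 5/32 10/32 10/32 5/32 1/32
sum(probs)                               # 1
n * p                                    # expected number of faces: 2.5
```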
A Step on the Road
To enter small data set:
Combine:
c()
c(1,2,3) , c("a","b","c")
Colon :
a:b
1:5
series of successive numbers from 1 to 5
Sequence:
seq(a,b, by=c) or seq(a,b, length=c)
seq(1,10,by=0.5)
increment of 0.5 between each two successive numbers
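The three ways of entering a small data set, run side by side. The variable names are hypothetical.

```r
v_num <- c(1, 2, 3)                   # combine numbers: 1 2 3
v_chr <- c("a", "b", "c")             # combine character strings (quoted)
v_col <- 1:5                          # colon operator: 1 2 3 4 5
v_by  <- seq(1, 10, by = 0.5)         # 1.0 1.5 2.0 ... 10.0 (19 values)
v_len <- seq(1, 10, length.out = 4)   # 4 equally spaced values: 1 4 7 10
```

With by= you fix the step and R works out how many values result; with length.out= you fix the count and R works out the step.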
A Step on the Road The Ordeal
Correlation:
The correlation analysis indicates whether two variables vary together or not; direction and
magnitude of the association of the two variables will be visualized and quantified.
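A sketch of quantifying the association, using the built-in cars data set as a stand-in for the variables A and B.

```r
# Pearson correlation: the sign gives direction, the size gives strength
r <- cor(cars$speed, cars$dist)
r                                  # about 0.81: strong positive association

# Test of the null hypothesis that the true correlation is zero
ct <- cor.test(cars$speed, cars$dist)
ct$estimate
ct$p.value
```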
Visualization:
plot(A~ B, data=, p
A Step on the Road The Forgiveness
5 steps to fit a distribution to an unknown set of data saved in a vector named x:
1 Step backward:
Going over previous studies and reviewing literature dealing with similar type of data could
be very helpful to limit y
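The remaining steps are cut off in this preview. As one possible illustration of the fitting stage, the sketch below assumes maximum-likelihood fitting with fitdistr() from the MASS package and a comparison of candidates by AIC. The vector x is simulated here, and the choice of candidate distributions is an assumption.

```r
library(MASS)                              # provides fitdistr()

set.seed(42)
x <- rgamma(200, shape = 2, rate = 0.5)    # synthetic stand-in for the data vector x

# Fit two candidate distributions by maximum likelihood
fit_gamma <- fitdistr(x, "gamma")
fit_lnorm <- fitdistr(x, "lognormal")

fit_gamma$estimate                         # estimated shape and rate
aic_gamma <- AIC(fit_gamma)                # lower AIC suggests a better candidate
aic_lnorm <- AIC(fit_lnorm)
```

A histogram of x with the fitted density overlaid is a useful visual check alongside the AIC values.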
HW 4 CIVE 657
Linear Regressions
1. Read in the data car_data.csv from Moodle.
2. Conduct a pairs plot of the data. Explain briefly what you see.
3. Regress Price as a function of Mileage. Plot Price as a function of mileage and add your
regression line.
car_dat <- read.csv("C:/Users/Ibrahim/Documents/AUB/STAT/2013Fall/HW/HW4/car_data.csv", header=TRUE)
car_dat <- car_dat[,1:12]
windows()
pairs(car_dat)
#What you should be able to see is:
#1) Price tends to drop with increased mileage
#2) Prices seem to differ b
HW 3 Simple Linear Regression
CIVE 657
Data on the mortality and the hardness of the drinking water were collected from 61 cities in
England and Wales. The data are in the file water.csv on Moodle. The file includes data on:
The annual mortality per 100,0
rm(list=ls())
water <- read.csv("C:/Users/Ibrahim/Documents/AUB/STAT/Linear regression/water.csv")
summary(water) #Looks good. No weird numbers in mortality and hardness
windows()
par(mfrow=c(1,2))
hist(water$mortality,xlab="mortality",main="mortality frequency
#adjust the working directory
setwd("C:/Users/messenger net/Desktop/Experimental Design/Homeworks/HW3")
#Read in the data
water=read.table("water.csv", header=T, sep=",")
#Let us look at our data
head(water)
#Summary of the data
summary(water)
#scatter pl
CIVE 657
HW 2: Due Monday October 21 2013
Problem 1
Conductivity measurements (mho/cm) were taken at four different locations in the aerated lagoon
of a pulp and paper mill. The lagoon is supposed to be mixed by aerators so the contents are
expected to be
CIVE 657
HW 1: Due Friday October 4 2013
Problem 1:
A manufacturer of computer mice obtains the tracking balls from a new supplier. As part of the
quality control program, the manufacturer routinely sends a sample of 50 balls from each shipment for
analys
#Problem 1
#The probability of having fewer than 2 defective balls is
pbinom(1,size=50,prob=0.03)
#OR
dbinom(0,size=50,prob=0.03)+dbinom(1,size=50,prob=0.03)
#mean and variance of the number of defective balls
#Mean=n*p
50*0.03
#Var=n*p*(1-p)
50*0.03*(1-0.03)
X <- seq(0,50, by=1)
Y<
10/30/2013
Moving from Correlation to
Regression
Correlation Direction & Strength
We want to go further
Statistical Modeling
Statistical modeling focuses on finding a quantitative
description of how the mean of the variable of interest
varies as a func
10/21/2013
CIVE 657
Final Project
Need to include the following:
Why do you care about this problem and why you
think it is worth studying
Describe the data you have or will get
Describe the data collection protocol used or that
will be used
Who, wher
9/11/2013
Our Analysis Strategy
Explore the data using plots and summary statistics
(EDA techniques)
Construct a useful model/experiment
Process involves a lot of trial and error (sometimes)
All models are wrong; some are useful
Assess the model
Recogn
10/14/2013
Two-Way ANOVA
The processes we study will often involve more than one factor
Studying more than one factor at a time leads to greater efficiency
Less time and fewer experiments for a given power
If we are studying the effects of 2 factors use
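A sketch of a two-factor analysis with interaction. The built-in warpbreaks data set (factors wool and tension, response breaks) is an assumption used for illustration; it is not the course data.

```r
# Two-way ANOVA: both main effects plus their interaction
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
tab <- summary(fit)[[1]]   # ANOVA table as a data frame
tab                        # rows: wool, tension, wool:tension, Residuals
tab$Df                     # 1, 2, 2, 48
```

The single experiment gives a test for each factor and for their interaction, which is where the efficiency gain over two separate one-factor studies comes from.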
10/8/2013
CIVE 657
Summary of Last Week
The goal of statistical inference is to provide insight on
population distribution parameters based on the limited
observations we have
We only observe a sample of the population
t-test: inference about population
9/30/2013
Tests for the Normality
Assumption
A lot of statistical tests (e.g. t-test) require that our
data are normally distributed
Find the third and fourth standardized moments
Use the Goodness of Fit tests
Chi-square
Anderson-Darling
KolmogorovSmirno
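A sketch of two of these checks in base R: the third and fourth standardized moments computed by hand, and a Kolmogorov-Smirnov test. The sample x is simulated, so the numbers are illustrative only.

```r
set.seed(7)
x <- rnorm(100, mean = 10, sd = 2)     # synthetic sample

# Standardize using the population-style (divide by n) standard deviation
z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
skew <- mean(z^3)    # third standardized moment, near 0 for normal data
kurt <- mean(z^4)    # fourth standardized moment, near 3 for normal data

# Kolmogorov-Smirnov test against a normal distribution.
# Note: estimating the mean and sd from the same data makes the
# p-value only approximate.
ks <- ks.test(x, "pnorm", mean(x), sd(x))
ks$p.value
```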
10/9/2013
The ANOVA F-test
The ANOVA F-statistic compares variation due to specific sources
(levels of the factor) with variation among individuals who should be
similar (individuals in the same sample).
F=
variation among sample means
variation among ind
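The ratio can be computed by hand and compared with R's ANOVA table. This is a sketch using the built-in PlantGrowth data set (weight by group), which is an assumption for illustration.

```r
y <- PlantGrowth$weight
g <- PlantGrowth$group
k <- nlevels(g)                    # number of groups
N <- length(y)                     # total number of observations

grp_mean <- tapply(y, g, mean)     # sample mean of each group
grp_n    <- tapply(y, g, length)   # sample size of each group

# Variation among sample means (between groups)
ms_between <- sum(grp_n * (grp_mean - mean(y))^2) / (k - 1)
# Variation among individuals in the same sample (within groups)
ms_within  <- sum((y - ave(y, g))^2) / (N - k)

F_stat <- ms_between / ms_within
F_lm   <- anova(lm(weight ~ group, data = PlantGrowth))[["F value"]][1]
```

F_stat and F_lm agree, which shows that the ANOVA table is computing this same ratio of mean squares.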
9/6/2013
CIVE 657
Experimental Design and Statistical
Analysis for Engineers
Class Format & Computer Details
Monday classes will be held here
On Wednesdays we will start here and then
move to the Construction Management
Computer Lab to work on R
All ana
10/10/2013
Equal Variances (F-test)
Why care? Under what applications?
Test the assumption of equal variances that was made in the t-test
When comparing means, the validity (and power) of the
test becomes questionable if the variances are very
unequal
Interest in a
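A sketch of the F-test for equal variances with var.test(). The built-in ToothGrowth data set (len by supp, two independent groups) is an assumption used for illustration.

```r
# F-test of the ratio of two variances
vt <- var.test(len ~ supp, data = ToothGrowth)
vt$statistic                       # F = ratio of the two sample variances
vt$p.value

# The same ratio computed by hand
v <- tapply(ToothGrowth$len, ToothGrowth$supp, var)
ratio <- unname(v[1] / v[2])       # first group over second group
```

If the ratio is far from 1 and the p-value is small, a t-test that does not assume equal variances (the default in t.test()) is the safer choice.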