1. State the null and alternative hypotheses for the following statements.
a. We want to test whether the average grade point average in American colleges is 2.0
(out of 4.0) or not.
b. We want to test if college students take less th
IF STATEMENTS AND LOOPS
if, ifelse, for, while, repeat, break
The if() and ifelse() statements allow us to perform different operations depending
on whether a condition is true.
Usage of if:
if(condition) commands when TRUE
if(condition) commands when
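A minimal sketch of how the two differ: if() acts on a single condition, while ifelse() works element by element on a vector. The variable names and values are assumptions chosen for illustration, not taken from the notes.

```r
# Illustrative value (assumed, not from the notes)
x <- 7

# if(): one logical condition decides which branch runs
if (x > 5) {
  size <- "large"
} else {
  size <- "small"
}

# ifelse(): vectorized, returns one result per element of v
v <- c(2, 9, 4, 6)
labels <- ifelse(v > 5, "large", "small")

print(size)
print(labels)
```

Passing a vector of length greater than one as the condition of if() is an error in recent versions of R, which is the usual reason to reach for ifelse().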
Q1) a) Generate two sequences of length 50 from the uniform distribution.
b) Plot x versus y, trying different options for the type argument of the plot function.
Q1) Consider the computer repair data (Rec6data.txt). In this study a random sample of service
call records for a computer repair operation were examined and the length of each call (in
minutes) and the number of
who : lists variables currently in the workspace
whos : lists variables currently in the workspace with their size
clear : clears the workspace; all variables are removed
clear x y z : clears only variables x, y and z
DATA FRAMES and LISTS
Vectors included in the data frame must be of the same length. Data frames allow
you to put vectors of different types together. E.g. see d2 below.
o Enter the following commands in R
> x = 1:4
> y = 2:4
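As a hedged sketch of the same-length rule: the vectors x and y above have lengths 4 and 3, so combining them directly fails, while same-length vectors of different types combine without trouble. The name d_example and its columns are hypothetical and are not the d2 mentioned in the notes.

```r
x <- 1:4
y <- 2:4

# Lengths 4 and 3 cannot be combined; capture the error message instead of stopping
result <- tryCatch({
  data.frame(x, y)
  "ok"
}, error = function(e) conditionMessage(e))
print(result)

# Same-length vectors of different types (hypothetical columns)
d_example <- data.frame(
  id     = 1:3,
  name   = c("a", "b", "c"),
  passed = c(TRUE, FALSE, TRUE)
)
str(d_example)
```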
Q1) Use for loops for the following questions:
a) Create a matrix of size 6 by 6, where the entries are equal to the summation of the row and
column number, i.e.
8 9 10 11 12
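One possible sketch for part a), assuming the matrix is filled with nested for loops; the names n and m are illustrative.

```r
n <- 6
m <- matrix(0, nrow = n, ncol = n)  # initialize before filling

for (i in 1:n) {
  for (j in 1:n) {
    m[i, j] <- i + j  # entry = row number + column number
  }
}

print(m)
```

The same matrix can be obtained without loops via outer(1:6, 1:6, "+"), which is a convenient way to check the loop version.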
Q1) Consider the data below.
a) Create a vector that only includes names.
b) Create a matrix consisting of age, hgt and wgt.
c) By u
1. First construct the given vector and matrix.
a= [1 3 -1 5 9 3 4];
b=[2 5 -4;1 7 8;9 3 6];
Then run each of the following commands on a or b, as appropriate, and examine the results:
range, std, var, diff, sum, cumsum, prod, cumprod, cov, corrcoef.
1. Suppose the recovery time for patients taking a new drug is measured (in days). A
placebo group is also used to account for the placebo effect. The data are as follows;
determine whether the drug is effective or not.
7. DESCRIPTIVE STATISTICS
Previously we talked about graphical descriptive statistics.
Now we turn to the most commonly used numerical descriptive statistics:
o Measures of central tendency
o Measures of dispersion
Measures of central tendency
Mean, mode, median
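A short sketch of the three measures in R, using an assumed sample. R's built-in mode() reports an object's storage type rather than the most frequent value, so stat_mode below is a hypothetical helper written for this illustration.

```r
# Assumed sample, for illustration only
x <- c(2, 3, 3, 5, 7, 3, 8)

x_mean   <- mean(x)
x_median <- median(x)

# Hypothetical helper: most frequent value(s) in a numeric vector
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
x_mode <- stat_mode(x)

print(c(mean = x_mean, median = x_median, mode = x_mode))
```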
10. ONE WAY ANALYSIS OF VARIANCE
Example: Designing a new package for a certain product (from 7.1 in Ilk (2011) Marketing
o A food company. They want to have a new package designed for a certain product
they produce (say a chocolate bar) to help increa
1. Write a user-defined function of repmat.
2. Calcium is the most abundant mineral in the human body and has several important
Q1) Consider the Poisson log-likelihood function, which is given by
$\log L(\lambda) = \sum_{i=1}^{n} y_i \ln\lambda - n\lambda - \sum_{i=1}^{n} \ln y_i!$
Write an R function to compute the log-likelihood function. (You can ignore the constant part).
Q2) Consider the Normal distribution with two pa
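For Q1, a minimal sketch under stated assumptions: the rate parameter is called lambda, the data vector is called y, and the constant term (the sum of the log-factorials) is dropped as the question permits. The function name pois_loglik and the sample data are hypothetical.

```r
# Poisson log-likelihood without the constant term -sum(log(y!))
pois_loglik <- function(lambda, y) {
  sum(y) * log(lambda) - length(y) * lambda
}

# Hypothetical counts, for illustration only
y <- c(2, 0, 3, 1, 4)

# Sanity check: adding the constant back should match dpois()
const <- -sum(lfactorial(y))
full  <- sum(dpois(y, lambda = 2, log = TRUE))
print(all.equal(pois_loglik(2, y) + const, full))
```

Dropping the constant does not move the maximizer, so the function is still largest at lambda = mean(y).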
Q1) An outbreak of Salmonella-related illness was attributed to ice cream produced at a certain
factory. Scientists measured the level of Salmonella in 9 randomly sampled batches of ice cream.
The levels (in MPN/
STAT 291 STATISTICAL COMPUTING I
Statistical computing and computational statistics are two areas of statistics which use
computational, graphical, and numerical methods to solve statistical problems. Statistical
computing usually involves algorithms, r
1. Calculate the following;
a) e^3, ln(e^3), log10(e^3), log10(105)
b) solve 3^x = 17
a) > exp(3)
b) > log(17)/log(3)
2. a) Read the Stereo
2.7 MORE ON MATRICES
Initialization of a matrix is not necessary except in two cases:
1. Large matrices: When you are dealing with very large matrices, it is more efficient to
initialize them. The initialization reserves for the matrix a b
PART II: STATISTICAL ANALYSIS USING MATLAB
Not free. Licensed software.
Originally written by Dr. Cleve Moler in the late 1970s to provide easy access to matrix
First version was written to be used in courses in matrix theory, line
8. HYPOTHESIS TESTING
Example-1: makes use of the gas data set in Matlab
o Data: Gas prices in January and in February collected from 20
randomly selected gas stations in the same city.
o Matlab code to perform various different analyses on this data s
5. DESCRIPTIVE STATISTICS
o Measures of central tendency
o mean, median, mode, geomean, harmmean, trimmean
Geometric mean: $\left(\prod_{i=1}^{n} x_i\right)^{1/n}$, for positive $x_i$
Example: Suppose you have an investment which earns 10% in the first year, 60% in
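As a hedged illustration of the geometric mean formula, here is a sketch in R rather than MATLAB (where geomean plays this role). The function name geo_mean and the input values are assumptions; they are not the investment figures from the example.

```r
# Hypothetical helper: geometric mean of positive values.
# Working on the log scale avoids overflow in the product for long vectors.
geo_mean <- function(x) {
  stopifnot(all(x > 0))
  exp(mean(log(x)))
}

x <- c(2, 8)                       # assumed values
g <- geo_mean(x)
direct <- prod(x)^(1 / length(x))  # definition applied directly

print(c(geo_mean = g, direct = direct))
```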
Q1) Remember that the three basic functions for constructing vectors are c(.) (combine);
seq(from, to, by) (sequence); and rep(x, times) (repeat). Use the command line to create
a) a sequence from 0 to 9 and
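A brief sketch of the three constructors. Only the sequence from 0 to 9 comes from the exercise; the other values are assumed for illustration.

```r
a <- seq(from = 0, to = 9, by = 1)  # sequence 0, 1, ..., 9
b <- c(1, 5, 10)                    # combine individual values (assumed)
r <- rep(c(1, 2), times = 3)        # repeat the pair three times (assumed)

print(a)
print(b)
print(r)
```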
Example 7: One way Analysis of Variance
o Aim: 3 new methods for teaching tennis serve; A, B, C. C being the currently
most applied method.
Are the effects of these
methods equivalent on making
* From Wikipedia: An ace occurs when a served ball l
1. The equation of a straight line is y=mx+c where m and c are constants. Compute the y
coordinates of a line with slope m=0.5 and the intercept c= -2 at the following
coordinates: x=0, 1.5, 3, 4, 5, 7, 9, 10.
2. The equation in Q.1. is a mathe
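For exercise 1, a minimal sketch in R that relies on vectorized arithmetic, so no loop is needed. The intercept is stored as c0, an assumed name, to avoid masking R's c() function.

```r
m  <- 0.5   # slope
c0 <- -2    # intercept (named c0 so that c() stays available)
x  <- c(0, 1.5, 3, 4, 5, 7, 9, 10)

y <- m * x + c0   # evaluated element by element
print(data.frame(x = x, y = y))
```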
More problems in R
1. Calculate the sum of squares $\sum_{i=1}^{n} i^2$
by using R programming. Then show that the result is the same
as n(n+1)(2n+1)/6. (Hint: you can use built-in functions such as sum.)
2. Write the following data set to a text file and save it. The
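For exercise 1, a sketch assuming the sum in question is the sum of the first n squares, which is what the closed form n(n+1)(2n+1)/6 corresponds to. The value n = 10 is an assumption; the exercise's own upper limit is not shown.

```r
n <- 10  # assumed upper limit

s      <- sum((1:n)^2)                   # direct summation
closed <- n * (n + 1) * (2 * n + 1) / 6  # closed-form expression

print(c(sum = s, closed_form = closed))
print(s == closed)
```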
2. Examining and Transforming Data
Copyright 2014 by John Fox
o To motivate the inspection and exploration of data as a necessary
preliminary to statistical modeling.
o To rev
5. Dummy-Variable Regression
and Analysis of Variance
o One of the limitations of multiple-regression analysis is that it
Lecture 3: Multiple Regression
Prof. Sharyn O'Halloran
Sustainable Development U9611
Basics of Multiple Regression
Review Strategies for Data Analysis
Testing and Interpreting Interactions in Regression In a Nutshell
The principles given here always apply when interpreting the coefficients in a
multiple regression analysis containing interactions. However, given these principles,
the meaning of the coef