HUDM 6026
Computational Statistics
Week 3 Logistic Regression
Reviewing R Syntax
Run lines 105 to 108 and then review lines 147 to 153 from the R
syntax file from last class.
# pick out two predictors for multiple reg example
X <- state.x77[,c("Population
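The line above is cut off in this excerpt, so the full predictor list is not recoverable. As a minimal sketch of this kind of selection, the code below picks two columns from the built-in state.x77 matrix and fits a multiple regression. The second predictor ("Income") and the outcome ("Life Exp") are assumptions chosen for illustration only, not the choices made in the original syntax file.

```r
# Hypothetical choice of two predictors; only "Population" appears in the excerpt
X <- state.x77[, c("Population", "Income")]

# Hypothetical outcome variable, chosen for illustration
y <- state.x77[, "Life Exp"]

# Multiple regression of the outcome on both predictors
fit <- lm(y ~ X)
coef(fit)
```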

Week 5 Monte Carlo Simulation Studies
R Task
The function below called sumsq calculates the sum of squared deviations
from the sample mean of a vector.
sumsq <- function(vec)
{
mn <- mean(vec)
d <- vec - mn
d2 <- d^2
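The function is cut off in this excerpt. Based on the description (the sum of squared deviations from the sample mean), one way it could plausibly be completed is sketched below. The final sum() line and the closing brace are assumptions, as is the small test vector x.

```r
sumsq <- function(vec) {
  mn <- mean(vec)   # sample mean
  d <- vec - mn     # deviations from the mean
  d2 <- d^2         # squared deviations
  sum(d2)           # assumed final step: add up the squared deviations
}

# Illustrative input: mean is 5, deviations are -3, -1, 1, 3
x <- c(2, 4, 6, 8)
sumsq(x)

# Cross-check: the sample variance times (n - 1) should give the same value
all.equal(sumsq(x), var(x) * (length(x) - 1))
```

Because the sample variance divides this same quantity by n - 1, comparing against var() is a quick sanity check on the function.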

Data Mining
W4240 Sections 001, 003/004
Lauren A. Hannah
Columbia University, Department of Statistics
October 9, 2014
Outline
Administration
Classification: Why and When
Naive Bayes Classification
The Naive assumption: what and why
The Dangers of Naivete

Week 1: Review of Statistics Concepts
Before we can begin to learn about regression, we need to review a few important and
fundamental statistical results that will be used throughout the course.
Sample vs population
Repeated samples and properties of e

Week 3: Multiple Variable Regression
I. Models, Estimation, and Interpretation
A. Two explanatory variables
Now begin with a larger model:
In the two variable case, this is no longer a regression line but is instead a regression
plane. [See handout].
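To make the regression plane concrete, here is a minimal sketch on simulated data. The variable names (x1, x2, y) and the true coefficients are assumptions chosen for illustration; they do not come from the handout.

```r
set.seed(1)
n <- 100

# Two simulated explanatory variables
x1 <- rnorm(n)
x2 <- rnorm(n)

# Outcome generated from a plane (intercept 1, slopes 2 and -0.5) plus small noise
y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.1)

# Fit the two-variable model; the three coefficients define the fitted plane
fit <- lm(y ~ x1 + x2)
coef(fit)

# Height of the fitted plane at two points: (x1, x2) = (0, 0) and (1, 1)
predict(fit, newdata = data.frame(x1 = c(0, 1), x2 = c(0, 1)))
```

Each slope gives the change in the fitted value for a one-unit change in that variable while the other is held fixed, which is what it means to move along the plane in one direction.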
Reca

Week 4: Dealing with Categorical Variables
I. Dealing with qualitative and categorical variables
Before we begin today, let's review some basics on variable types:
Nominal: allow for qualitative classification in which ordering doesn't make
sense. For exam

Week 9: Model Building
More on Causal Inference
In the previous lecture, I highlighted that regression can be used to answer causal
questions. To do so, control variables need to be included, and the fit of the model is
determined by the extent to which t

HUDM 5126: Review
Remember:
You must have either DATA or ASSUMPTIONS.
Remember:
All models are wrong, but some are useful. (re: George Box)
In order to better understand relationships between two (or more) variables in a
population, we will need to use in