Prof. Green Stat 102 The Statistical Underpinnings of Experimentation
Randomization and Sampling Variability R.A. Fisher saw that random assignment not only makes for unbiased inference, it also allows one to make useful statements about the degr
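A minimal sketch of the kind of statement random assignment permits, assuming a tiny hypothetical experiment: the six outcomes and the treated units below are invented for illustration and do not come from the lecture. Under the sharp null hypothesis of no treatment effect, every way of assigning three of the six units to treatment was equally likely, so the observed difference in means can be compared with the full set of differences that randomization could have produced.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcomes for six units (invented for illustration only).
outcomes = [12, 15, 9, 20, 18, 11]
# Hypothetical record of which units (by index) were assigned to treatment.
treated = {1, 3, 4}

def diff_in_means(treated_idx, y):
    """Treated-group mean minus control-group mean."""
    t = [y[i] for i in range(len(y)) if i in treated_idx]
    c = [y[i] for i in range(len(y)) if i not in treated_idx]
    return mean(t) - mean(c)

observed = diff_in_means(treated, outcomes)

# Enumerate every possible assignment of 3 units out of 6 to treatment.
all_diffs = [
    diff_in_means(set(assignment), outcomes)
    for assignment in combinations(range(len(outcomes)), 3)
]

# One-sided p-value: share of assignments with a difference at least as large.
p_value = sum(d >= observed for d in all_diffs) / len(all_diffs)
print(observed, len(all_diffs), p_value)
```

With real data the same enumeration applies, though for larger experiments one would typically sample assignments at random rather than list them all.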
Toward Statistical Inference
Thanks to Joe Chang in the Statistics Department for some lecture content. Inference: use information about a sample to draw an inference about the population. Example : A CNN/Gallup/USA Today poll of 1000 people reveal tha
Probability in Practice
(The trick is knowing when to apply which probability rule!) Suggestions : Read the Textbook. I like it. Do problems in book. They have solutions. Make a picture. This helps to clarify sample spaces. Do some more problems.
Ex
X
Outcome
Random Variables
Cartoon Guide Chapter 4 Highly Recommended!
A Random Variable is :
Y
HH 2
The numerical outcome of a random experiment A function defined on the Sample Space S (the set of all possible outcomes). This function assigns
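The definition above can be sketched in code. This assumes the two-coin-toss example that the "HH 2" fragment seems to point to, with the random variable taken to be the number of heads; that reading is an assumption, not something the excerpt states in full.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S for two coin tosses: HH, HT, TH, TT (assumed example).
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

def number_of_heads(outcome):
    """A random variable: a function assigning a number to each outcome in S."""
    return outcome.count("H")

# With equally likely outcomes, the distribution follows by counting.
counts = Counter(number_of_heads(s) for s in sample_space)
pmf = {value: Fraction(n, len(sample_space)) for value, n in counts.items()}

for outcome in sample_space:
    print(outcome, number_of_heads(outcome))
print(pmf)
```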
Midterm 1 Thursday, September 29
In class One Hour (but you'll have until 2:25) Covers through Probability (chapter 15 in book) Closed note, closed book Any formulas needed will be provided Bring a calculator (doesn't need to be fancy) Bring somethin
The Central Limit Theorem Revisited
Two more views
Here again is the Central Limit Theorem :
Central Limit Theorem
If X1, X2, . . . Xn are a sample of n independent and identically distributed trials from any distribution with mean and standard dev
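A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the exponential distribution, the sample size, the number of repetitions, and the seed are all arbitrary choices made here, not taken from the lecture. The standard result is that the sample mean is approximately normal with the population mean as its center and the population standard deviation divided by the square root of n as its spread.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(102)  # arbitrary seed so the run is repeatable

n = 50       # size of each sample (assumed)
reps = 2000  # number of repeated samples (assumed)

# Exponential(1) is strongly skewed, with mean 1 and standard deviation 1.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = mean(sample_means)       # should be close to the population mean, 1
spread = stdev(sample_means)      # should be close to 1 / sqrt(n)
predicted_spread = 1.0 / sqrt(n)

print(center, spread, predicted_spread)
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying draws are skewed.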
Hypothesis Testing
Like proof by contradiction : (think back to geometry) Example : if x and y are two even numbers, prove that x+y is also even. Proof (by contradiction). Assume x+y is an odd number. Then x+y = 2c+1 for some integer c. If x and y ar
Two-sample tests of means in MINITAB : use Stat Basic Statistics Two-sample-t. Data can be entered in separate columns, or in a single column with a variable indicating treatment group (the subscript column). Click options to set confidence/alpha l
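For readers without MINITAB, the calculation behind a two-sample t-test can be sketched by hand. The function below uses the unpooled (Welch) form of the statistic and its approximate degrees of freedom; whether that matches the options selected in the MINITAB dialog is an assumption, and the two data columns are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Uses the unpooled standard error and the Welch-Satterthwaite
    approximation for the degrees of freedom.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical data "entered in separate columns" (invented for illustration).
treatment = [5.1, 4.9, 6.2, 5.8, 6.0]
control = [4.2, 4.8, 4.5, 5.0, 4.1]

t_stat, dof = welch_t(treatment, control)
print(t_stat, dof)
```

The t statistic is then compared with a t distribution on `dof` degrees of freedom to obtain a p-value or confidence interval.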
Inference for Two-Way Tables :
The finer things in life
(Agresti, Categorical Data Analysis is a good additional resource for this topic)
Example : Music and Wine. A marketing study examined the effect of playing different kinds of music on the numb
Announcements
Midterm Thursday, 10/20 in OML202
Closed note, closed book. Any complicated formulas needed will be provided. Bring a calculator
Next week
Go to sections for remainder of the term EXCEPTION : Pre-Med section meets with me and 103 sec
Announcements :
Homework 1 due today before you leave. Remember to sign up for STAT 100-0a on the classes server AND for whichever section you are in! TA office hours and locations are listed online Open session with JDRS every Monday, 2:30-3:30 in O
Data Relationships
Today: describing relationship of two quantitative variables :
Scatterplots Correlation Regression
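The two numerical summaries in this outline, correlation and regression, can be sketched in a few lines. The formulas are the standard ones for Pearson's r and the least-squares line; the function names and the data are hypothetical, chosen here for illustration.

```python
from math import sqrt
from statistics import mean

def correlation(x, y):
    """Pearson correlation coefficient r between two quantitative variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope

# Hypothetical data (invented for illustration).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

print(correlation(hours, score))
print(least_squares(hours, score))
```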
Visualizing Relationships : Scatterplots
Plot two variables simultaneously Put one variable on horizontal axis, other variable o
Prof. Green Stat 102 From T-Tests to Regression to Multiple Regression The T-test provides an instructive way to get acquainted with the idea of "degrees of freedom." In contrast to the Z-test, where numbers like 1.65 and 1.96 have magical qualities
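The numbers 1.65 and 1.96 are not arbitrary: they are (rounded) quantiles of the standard normal distribution, corresponding to one-sided and two-sided tests at the 5% level. A short check using Python's standard library, offered here only as an illustration:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Critical value leaving 5% in the upper tail (one-sided test at the 5% level).
one_sided = standard_normal.inv_cdf(0.95)

# Critical value leaving 2.5% in each tail (two-sided test at the 5% level).
two_sided = standard_normal.inv_cdf(0.975)

print(round(one_sided, 3), round(two_sided, 3))
```

The corresponding t critical values depend on the degrees of freedom and are larger than these for small samples.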
Green / Statistics The Mechanics of Multiple Regression One of the most important concepts in statistics is the idea of "controlling" for a variable. This lecture is designed to give you a feel for what "controls" are and how they are implemented in
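One way to get a feel for "controlling" is the partialling-out calculation sketched below. This is a standard regression result, but presenting it here is a choice made for illustration and is not necessarily how the lecture proceeds. The idea: remove from both the outcome y and the predictor x whatever a straight line in the control variable z can explain, then regress the leftover part of y on the leftover part of x. The variable names and data are hypothetical.

```python
from statistics import mean

def simple_ols(x, y):
    """Return (intercept, slope) from a least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope

def residuals(x, y):
    """Part of y left over after a least-squares regression on x."""
    intercept, slope = simple_ols(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def controlled_slope(x, z, y):
    """Slope of y on x after partialling z out of both variables."""
    x_res = residuals(z, x)
    y_res = residuals(z, y)
    return simple_ols(x_res, y_res)[1]

# Hypothetical data built so that y = 1 + 2*x + 3*z exactly.
z = [1, 2, 3, 4, 5, 6]
x = [2, 1, 4, 3, 6, 8]
y = [1 + 2 * a + 3 * b for a, b in zip(x, z)]

print(simple_ols(x, y)[1])        # slope ignoring z
print(controlled_slope(x, z, y))  # slope controlling for z
```

Because x and z move together in this made-up data, the slope that ignores z absorbs part of z's effect, while the controlled slope recovers the coefficient on x used to build y.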
Green / Statistics Regression: Tricks of the Trade Data for this example come from Green, Strolovitch, and Wong (1998): "Defended Neighborhoods, Integration, and Racially Motivated Crime" (American Journal of Sociology). The dependent variable is the
Statistics Professor Green Perils of Multiple Regression Least-squares regression rests on several assumptions about the causal process by which the data were generated. Becoming an intelligent consumer of statistical information requires one to unde
Green / Stats Causal Order One of the basic principles of statistics, or philosophy of science for that matter, is that numbers themselves cannot adjudicate questions of causality. Just because variables X and Y are strongly correlated does not mean
Green / Stats Forecasting and Prediction
To this point in the course, we have focused primarily on parameter estimation rather than prediction. In this lecture, I want to discuss prediction and forecasting. Forecasting is usually understood to be a
Green / Stats The Importance of "Importance" This lecture aims to distinguish among various meanings of the term "importance" as applied to regression results. I will use James E. Campbell's regression analysis of presidential election outcomes (see c
Welcome to STAT 101a-106a Introduction to Statistics
Syllabus Overview On classes server under web page and syllabus. Updated periodically. About The Sections . . . About Me . . . Why I'm here . . . Your Questions . . .
Things to Do Today
Fill out M
Announcements
Turn in sheet for desired course section today. Assignments made by next class. See the web for my office hours and the TAs' office hours. USE THEM! Homework 1 assigned today (available online in materials folder on classes server). DUE NEXT TUESDA
Prof. Green Intro Stats Regression with Experimental Data
Regression ranks among the most useful tools in statistics. Use #1: Predicting outcomes. Often a mindless, theory-free activity. Can be fun and/or profitable. Example: predicting sales based