Chapter 12
Sampling Design

How do we gather data?
• Surveys
• Opinion polls
• Interviews
• Studies
– Observational
– Retrospective (past)
– Prospective (future)
• Experiments

Population
• the entire group of individuals that we want information about

Census
• a complete count of the population

How good is a census?
Do frog fairy tale . . .
The answer is 83!

Why would we not use a census all the time?
1) Not accurate
Look at the U.S. census – it has a huge amount of error in it; plus it takes a long time to compile the data, making the data obsolete by the time we get it!
2) Very expensive
Since taking a census of any population takes time, censuses are VERY costly to do!
3) Perhaps impossible
Suppose you wanted to know the average weight of the whitetail deer population in Texas – would it be feasible to do a census?
4) If using destructive sampling, you would destroy the population
• Breaking strength of soda bottles
• Lifetime of flashlight batteries
• Safety ratings for cars

Sample
• A part of the population that we actually examine in order to gather information
• Use sample to generalize to population

Sampling design
• refers to the method used to choose the sample from the population

Sampling frame
• a list of every individual in the population

Jelly Blubber Activity
• Select 5 Jelly blubbers that you think are representative of the population of blubbers in regards to length.
• Find the mean length of your sample

Simple Random Sample (SRS)
• consists of n individuals from the population chosen in such a way that
– every individual has an equal chance of being selected
– every set of n individuals has an equal chance of being selected

Suppose we were to take an SRS of 100 BHS students – put each student’s name in a hat. Then randomly select 100 names from the hat. Each student has the same chance to be selected!

Not only does each student have the same chance to be selected – but every possible group of 100 students has the same chance to be selected! Therefore, it has to be possible for all 100 students to be seniors in order for it to be an SRS!
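
The names-in-a-hat procedure can be sketched in Python. This is only an illustration under assumptions: the roster of 2000 numbered students and the names `roster` and `draw_srs` are made up for the sketch. `random.sample` draws without replacement, so every student, and every possible group of 100 students, has the same chance of being chosen.

```python
import random

def draw_srs(population, n, seed=None):
    """Return a simple random sample of n individuals, drawn without replacement."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return rng.sample(population, n)

# Hypothetical sampling frame: student ID numbers 1..2000 (assumed for illustration)
roster = list(range(1, 2001))
srs = draw_srs(roster, 100, seed=1)
print(len(srs), len(set(srs)))  # 100 students, no one chosen twice
```

Nothing in the draw looks at grade level, so a sample made up entirely of seniors is unlikely but possible, which is exactly what an SRS requires.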

Stratified random sample
• population is divided into homogeneous groups called strata
• SRSs are pulled from each stratum

Homogeneous groups are groups that are alike based upon some characteristic of the group members.

Suppose we were to take a stratified random sample of 100 BHS students. Since students are already divided by grade level, grade level can be our strata. Then randomly select 25 seniors, randomly select 25 juniors, randomly select 25 sophomores, and randomly select 25 freshmen.
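
A stratified draw of 25 students per grade level might look like the sketch below. The stratum sizes (500 per grade), the student labels, and the function name `stratified_sample` are assumptions made for illustration. The key point is that a separate SRS is taken inside each stratum.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw an SRS of n_per_stratum individuals from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical frame: 500 made-up student labels per grade level (sizes assumed)
grades = ["freshmen", "sophomores", "juniors", "seniors"]
strata = {g: [f"{g}-{i}" for i in range(1, 501)] for g in grades}

sample = stratified_sample(strata, 25, seed=1)
total = sum(len(chosen) for chosen in sample.values())
print(total)  # 4 strata x 25 students = 100
```

Unlike an SRS, this design can never produce a sample of 100 seniors: each grade is guaranteed exactly 25 places.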

Systematic random sample
• select sample by following a systematic approach
• randomly select where to begin

Suppose we want to do a systematic random sample of BHS students – number a list of students. (There are approximately 2000 students – if we want a sample of 100, 2000/100 = 20.) Select a number between 1 and 20 at random. That student will be the first student chosen, then choose every 20th student from there.
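
The 2000/100 = 20 arithmetic and the "every 20th student" rule can be sketched as follows. The numbered roster and the function name `systematic_sample` are hypothetical; only the population size, sample size, and interval come from the example.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first k individuals, then take every k-th one."""
    rng = random.Random(seed)
    k = len(frame) // n            # sampling interval, e.g. 2000 // 100 = 20
    start = rng.randint(1, k)      # random position between 1 and k, inclusive
    return [frame[i - 1] for i in range(start, len(frame) + 1, k)][:n]

# Hypothetical numbered list of students 1..2000
roster = list(range(1, 2001))
chosen = systematic_sample(roster, 100, seed=1)
gaps = {b - a for a, b in zip(chosen, chosen[1:])}
print(len(chosen), gaps)  # 100 students, always 20 apart
```

Only the starting point is random; once it is fixed, the rest of the sample is determined.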

Cluster Sample
• based upon location
• randomly pick a location & sample all there

Suppose we want to do a cluster sample of BHS students. One way to do this would be to randomly select 10 classrooms during 2nd period. Sample all students in those rooms!
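
The classroom example could be sketched like this. The number of classrooms (80), the class size (25), and the name `cluster_sample` are assumptions for illustration. The randomness applies to the rooms, and everyone in a selected room is sampled.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every individual in them."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)
    return {room: clusters[room] for room in picked}

# Hypothetical school: 80 second-period classrooms with 25 students each (assumed)
classrooms = {
    f"room-{r}": [f"room-{r}-student-{s}" for s in range(1, 26)]
    for r in range(1, 81)
}

sample = cluster_sample(classrooms, 10, seed=1)
students = [s for room in sample.values() for s in room]
print(len(sample), len(students))  # 10 rooms, 250 students
```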

For the Jelly Blubber colony:
µ = 19.41

Multistage sample
• select successively smaller groups within the population in stages
• SRS used at each stage

To use a multistage approach to sampling BHS students, we could first divide 2nd period classes by level (AP, Honors, Regular, etc.) and randomly select 4 second period classes from each group. Then we could randomly select 5 students from each of those classes. The selection process is done in stages!

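
A two-stage version of the BHS example might be sketched as below. The three levels, ten classes per level, 25 students per class, and the name `multistage_sample` are assumptions chosen for illustration. Only "4 classes per level, then 5 students per class" comes from the example.

```python
import random

def multistage_sample(classes_by_level, classes_per_level, students_per_class, seed=None):
    """Stage 1: SRS of classes within each level. Stage 2: SRS of students within each class."""
    rng = random.Random(seed)
    sample = []
    for level, classes in classes_by_level.items():
        for class_name in rng.sample(sorted(classes), classes_per_level):
            sample.extend(rng.sample(classes[class_name], students_per_class))
    return sample

# Hypothetical school: 3 levels, 10 classes per level, 25 students per class (assumed)
levels = ["AP", "Honors", "Regular"]
classes_by_level = {
    level: {
        f"{level}-class-{c}": [f"{level}-class-{c}-student-{s}" for s in range(1, 26)]
        for c in range(1, 11)
    }
    for level in levels
}

sample = multistage_sample(classes_by_level, 4, 5, seed=1)
print(len(sample))  # 3 levels x 4 classes x 5 students = 60
```
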
SRS
•Advantages
– Unbiased
– Easy
•Disadvantages
– Large variance
– May not be representative
– Must have sampling frame (list of population)

Stratified
•Advantages
– More precise unbiased estimator than SRS
– Less variability
– Cost reduced if strata already exist
•Disadvantages
– Difficult to do if you must divide stratum
– Formulas for SD & confidence intervals are more complicated
– Need sampling frame

Systematic Random Sample
•Advantages
– Unbiased
– Don’t need sampling frame
– Ensure that the sample is spread across population
– More efficient, cheaper, etc.
•Disadvantages
– Large variance
– Can be confounded by trend or cycle
– Formulas are complicated

Cluster Samples
•Advantages
– Unbiased
– Cost is reduced
– Sampling frame may not be available (not needed)
•Disadvantages
– Clusters may not be representative of population
– Formulas are complicated

Identify the sampling design
1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.
Stratified random sample
2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.
Cluster sampling
3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill it out before they leave.
Systematic random sampling

Random digit table
• each entry is equally likely to be any of the 10 digits
• digits are independent of each other

The following is part of the random digit table found on page A117 of your textbook:
Row
1  4 5 1 8 0 5 3 3 7 1
2  4 2 5 5 8 0 4 5 7 0
3  8 9 9 3 4 3 5 0 6 3

Numbers can be read across. Numbers can be read vertically. Numbers can be read diagonally.

Suppose your population consisted of these 20 people:
1) Aidan     6) Fred      11) Kathy     16) Paul
2) Bob       7) Gloria    12) Lori      17) Shawnie
3) Chico     8) Hannah    13) Matthew   18) Tracy
4) Doug      9) Israel    14) Nan       19) Uncle Sam
5) Edward    10) Jung     15) Opus      20) Vernon

Use the following random digits to select a sample of five from these people.
Row
1  4 5 1 8 0 5 1 3 7 1
2  0 1 5 5 8 0 1 5 7 0
3  8 9 9 3 4 3 5 0 6 3

We will need to use double digit random numbers, ignoring any number greater than 20. Start with Row 1 and read across. Stop when five people are selected.
Reading in pairs: 45 (ignore), 18, 05, 13, 71 (ignore), 01, 55 (ignore), 80 (ignore), 15.
So my sample would consist of: Aidan, Edward, Matthew, Opus, and Tracy

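
The table-reading procedure can be checked with a short Python sketch. It assumes the three rows read 4518051371, 0155801570, and 8993435063, and that a label drawn a second time would be skipped (the example never hits a repeat, so that rule is an assumption). The function name `select_from_digits` is made up for illustration.

```python
def select_from_digits(rows, names, n):
    """Read two-digit labels across the rows; skip labels outside 1..len(names) and repeats."""
    digits = "".join(rows)
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= len(names) and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break  # stop as soon as n people are selected
    return [names[label - 1] for label in chosen]

names = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria", "Hannah",
         "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan", "Opus", "Paul",
         "Shawnie", "Tracy", "Uncle Sam", "Vernon"]
rows = ["4518051371", "0155801570", "8993435063"]  # digits as assumed above

picked = select_from_digits(rows, names, 5)
print(picked)  # in order of selection: Tracy, Edward, Matthew, Aidan, Opus
```
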
Bias
• A systematic error in measuring the estimate
• favors certain outcomes

Anything that causes the data to be wrong! It might be attributed to the researchers, the respondent, or to the sampling method!

Sources of Bias
• things that can cause bias in your sample
• cannot do anything with bad data

Voluntary response
•People choose to respond
•Usually only people with very strong opinions respond

An example would be the surveys in magazines that ask readers to mail in the survey. Other examples are call-in shows, American Idol, etc.

Remember, the respondent selects themselves to participate in the survey!

Remember – the way to determine voluntary response is: Self-selection!!

Convenience sampling
•Ask people who are easy to ask
•Produces biased results

An example would be stopping friendly-looking people in the mall to survey. Another example is the surveys left on tables at restaurants – a convenient method!

The data obtained by a convenience sample will be biased – however this method is often used for surveys & results reported in newspapers and magazines!

Undercoverage
•some groups of population are left out of the sampling process

Suppose you take a sample by randomly selecting names from the phone book – some groups will not have the opportunity of being selected!
– People with unlisted phone numbers – usually high-income families
– People without phone numbers – usually low-income families
– People with ONLY cell phones – usually young adults

Nonresponse
•occurs when an individual chosen for the sample can’t be contacted or refuses to cooperate
•telephone surveys 70% nonresponse

Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse!

One way to help with the problem of nonresponse is to make follow-up contact with the people who are not home when you first contact them.

People are chosen by the researchers, BUT refuse to participate. NOT self-selected! This is often confused with voluntary response!

Response bias
•occurs when the behavior of respondent or interviewer causes bias in the sample
•wrong answers

Suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample – would we get honest answers?

Response bias occurs when for some reason (interviewer’s or respondent’s fault) you get incorrect answers.

Wording of the Questions
•wording can influence the answers that are given
•connotation of words
•use of “big” words or technical words

Questions must be worded as neutrally as possible to avoid influencing the response.

The level of vocabulary should be appropriate for the population you are surveying
– if surveying Podunk, TX, then you should avoid complex vocabulary.
– if surveying doctors, then use more complex, technical wording.

Response bias refers to anything in the survey design that influences the responses.

Source of Bias?
1) Before the presidential election of 1936 (FDR against Republican Alf Landon), the magazine Literary Digest predicted that Landon would win the election in a 3-to-2 victory, based on a survey of 2.8 million people. George Gallup surveyed only 50,000 people and predicted that Roosevelt would win. The Digest’s survey came from magazine subscribers, car owners, telephone directories, etc.
Undercoverage – since the Digest’s survey comes from car owners, etc., the people selected were mostly from high-income families and thus mostly Republican! (other answers are possible)
2) Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at SMU. You collect register receipts for students as they leave the bookstore during lunch one day.
Convenience sampling – easy way to collect data
or
Undercoverage – students who buy books from online bookstores are not included.
3) To find the average value of a home in Plano, one averages the price of homes that are listed for sale with a realtor.
Undercoverage – leaves out homes that are not for sale or homes that are listed with different realtors. (other answers are possible)
This note was uploaded on 12/09/2011 for the course STATS 221 taught by Professor Nielson during the Fall '10 term at BYU.