CHAPTER 1 ∇ STATISTICS
Chapter Preview
The purpose of Chapter 1 is to present:
1. an initial image of statistics that includes both the key role statistics has in the technical
aspects of life as well as its everyday applicability,
2. its basic vocabulary and definitions,
3. basic ideas and concerns about the processes used to obtain sample data.
Section 1, at the beginning of each chapter, primes the student for the concepts that will be
presented in the chapter. It is our hope that the students, after completing study of the chapter,
will see more clearly the meaning of the statistics reported and will recognize this change in the
clarity of their interpretation.
The exercises from this first section (some found at the end of the chapter, others available
online) are also excellent introductions to statistics and how they can be presented. The
differences, similarities, sources of information, sample sizes (if given), and so on should be
pointed out to the students.
This discussion, in addition to the learning of basic statistical terminology (as presented in
Chapter 1), prepares the student to find and question everyday statistical presentations. Bring in
newspapers and/or magazines and have groups of students find a form of statistics, whether it is a
graph, percentage, average, and so on. This raises their awareness and reinforces the concepts.
Encourage students to bring in their own examples for possible extra credit projects.

SECTION 1.1 EXERCISES
e. How often do you eat fruit?
Internet visitors at the Postyour.info website.
1/63 = 0.01587, 11/63 = 0.1746, 16/63 = 0.25397
No. Only people who visited the site and wanted to answer the question did so.
1.2 a. Answers will vary.
b. It does not appear, based on the list of averages given, that Java professionals work a 40-hour week.
c. Only if long work hours are desirable.
1.3 a. Americans
b. length of time before a Wi-Fi user gets antsy and needs to check their messages
c. 47% of those surveyed say they get antsy within one hour about checking their email, etc.
1.4 a. Answers will vary.
b. 5% of those asked replied that they made their bed weekly.
1.5 Answers will vary.

There are many new definitions introduced in this section. Include examples that the students can
relate to, such as the average age of a student at your college. Carry this through each of the
definitions. Students also have difficulty with the difference between variable and data.
Exercises 1.26 and 1.59 are good for classroom discussion. The instructor will need to determine
whether or not the students understand the basic concepts.
2 Chapter 1: Statistics
The graphics in this section give information about the sample (the number of people surveyed). Be
watchful of articles that do not give any of this information. Not knowing the sample or survey
size raises a question of credibility.
1.6 Answers will vary.
Descriptive Statistics - refers to the techniques and methods for organizing and summarizing the
information obtained from the sample.
Inferential Statistics - refers to the techniques of interpreting and generalizing about the population based
on the information obtained from the sample.
1.7 a. inferential
b. descriptive
1.8 a. descriptive
b. inferential
1.9 a.
e. married women, ages 25-50, who have 2 or more children
how often moms say they have a date night with their spouse
18% of those surveyed say they have a date night every 4-6 months.
(0.18)(1170) = 210.6, or about 211
1.10 a.
f. American adults
spring-cleaning chore he or she would prefer to hire someone to do
777(0.47) = 365.19, or about 365
actual percentage for the whole population could be from 3.52% lower to 3.52% higher than
the percentage reported
between 8.48% and 15.52%
a.
f. USA teens
what everyday invention they thought would be obsolete in 5 years
501(0.21) = 105.21, or about 105
actual percentage could be 4.3% lower or 4.3% higher than quoted
between 16.7% and 25.3%
1.11 A variable is the characteristic of interest (ex. height), where data is a value for the variable (ex. 5'5"). A
variable varies (heights vary), that is, heights can take on different values. Data (singular) such as 5'5"
(one person's height) is constant; it does not change in value for a specific subject.
An attribute variable can take on any qualitative or "numerical" qualitative information (ex. kinds of fruit,
types of music, religious preference, model year - most answers are in words, although model year would
have "numerical" answers such as "2003"). An attribute variable can be nominal (description or name) or
ordinal (ordered position or rank; first, second…).
A numerical variable can take on any quantitative information. This includes any count-type and
measurable-type data (ex. number of children in a family, amount of time, age, height, area, volume,
miles per gallon). A numerical variable can be discrete or continuous. The domain of a discrete variable
has gaps between the possible values; there are numerical values that cannot occur. Theoretically, the
domain of a continuous variable has no gaps since all numerical values are possible. Do not be confused
by data that has been rounded because of the scale being used or for convenience.
1.12 a.
d.
1.13 a. 45% (100% - 55%)
b. The percentages are from different groups.
1.14 a. Yes, if the rate increases from 4% to 6%, that is a 50% increase in the rate:
(6-4)/4 = 2/4 = 0.50 = 50%.
As a percent alone, the 50% is meaningless; it does not give the actual size of the numbers.
b. The phrase “50% jump” works much more effectively at getting people’s attention than does
“2% increase.”
all American consumers or all American taxpayers
tax return consumers
‘action’ upon receipt of tax refund
pay down debt, longest bar

Population - the collection of all individuals, objects, or scores whose properties are under consideration.
Parameter - a number calculated from the population of values.
Sample - that part of the population from which the data values or information is obtained.
Statistic - a number calculated from the sample values.
NOTE: Parameters are calculated from populations; both begin with p.
Statistics are calculated from samples; both begin with s.
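As an illustration of the four definitions above, the following Python sketch treats a short list of ages as the entire population and compares the parameter with statistics computed from two samples. The ages, the sample size of 4, and the seeds are hypothetical values chosen only for this example; they do not come from the text.

```python
import random
from statistics import mean

# Hypothetical population: ages of all 12 students in a small class.
population = [18, 19, 19, 20, 20, 21, 21, 22, 23, 25, 30, 38]

# Parameter: a number calculated from the entire population.
mu = mean(population)

def sample_mean(seed, n=4):
    """Statistic: a number calculated from a random sample of size n."""
    rng = random.Random(seed)           # seeded so the example is repeatable
    sample = rng.sample(population, n)  # draw n population members without replacement
    return sample, mean(sample)

sample_a, xbar_a = sample_mean(seed=1)
sample_b, xbar_b = sample_mean(seed=2)

print("parameter (population mean):", mu)
print("statistic from sample A:", sample_a, xbar_a)
print("statistic from sample B:", sample_b, xbar_b)
```

The parameter has one fixed value for this population, while the statistic depends on which sample happens to be drawn.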
e. The population is all US adults.
A sample is the 1200 randomly selected adults.
The variable is “allergy status” for each adult.
The statistic is the 33.2% based on the sampled adults.
The parameter is the percent of all US adults who have an allergy, in this case, 36%.
1.16 Parameters give the information for the entire population and have one specific value. Statistics
come from samples, which can vary in size and method of data collection, and therefore give
different measurements for each different sample.
1.17 A football jersey number is a categorical variable. It is attribute information that can identify
something about the position played by a player [for example, numbers in the 60s and 70s are for
linemen, who are not eligible to catch passes; other number groups have similar restrictions],
but does not give any measurable information about that player.
1.18 a. Attribute possibilities: marital status, ZIP code, gender, highest level of education
b. Numerical possibilities: annual income, age, distance to store, amount spent
1.19 a. Nominal possibilities: marital status, gender, ZIP code
b. Ordinal possibilities: highest level of education, ranking of department preferences, rating for
first impression of store
1.20 a. Score can only be whole numbers (scores are counted).
b. Number of minutes (a length of time) can be any value (time is measured); its accuracy
depends on the precision with which it is measured.
1.21 a. Severity of side effects from a particular medicine for a patient
b. attribute (ordinal)
1.22 a. level of danger with respect to cell phone use
b. attribute (ordinal)
1.23 a. weight of books and supplies per student
b. numerical (continuous)
c. 10 lbs, 5.67 lbs, 15.2 lbs...
1.24 a.
g. all 2009 pickup trucks listed on MPGoMatic.com
165, sample = 6 trucks
Manufacturer, Model, Drive, Transmission
Engine size, Engine Size Displacement, City MPG, Hwy MPG
Discrete: Engine Size
Continuous: Engine Size Displacement, City MPG, Hwy MPG
1.25 a.
nominal
1.26 a. The population contains all objects of interest, while the sample contains only those actually
b. convenience, availability, practicality

SECTION 1.2 EXERCISES
1.27 Group 2, the football players, because their weights cover a wider range of values, probably 175
to 300+, while the cheerleaders probably all weigh between 110 and 150.
1.28 The variability. Less variation would be more desirable.
1.29 By using a standard weight or measure in conjunction with money, prices between competing
product brands can be more easily compared, irrespective of purchase quantity. There is a great
deal of variability in container sizes between brands and even within brands of the same product.
Problems associated with this variability are simplified by showing the standard unit price in
addition to the cash register amount at the point of sale.
1.30 Yes. Four cups of 6.5 ounces and one cup of 4 ounces average 6.0 ounces: (4(6.5) + 4)/5 = 30/5 = 6.0.
1.31 Answers will vary, but there is no way to differentiate between the students if everybody attains
the same grade. If all students received 100%, then the test was too easy. If all students received
0%, then the test was too hard. If the scores range from 40% to 95%, you can distinguish
among the students’ knowledge about the subject.

Demonstrate the different types of sample designs or data collection methods with examples.
Emphasize the definition of a random sample. Show how to use the Random Number Table
(Appendix B, Table 1) to find a random sample. Point out to the students that there is further
background information on random numbers in the Introductory Concepts of Appendix A.

SECTION 1.3 EXERCISES
A convenience sample or volunteer sample, as indicated by their very names, can often result in biased
Data collection can be accomplished with experiments (the environment is controlled) or observational
studies (environment is not controlled). Surveys fall under observational studies.
Sample designs can be categorized as judgment samples (believed to be typical) or probability samples
(certain chance of being selected is given to each data value in the population).
The random sample (each data value has the same chance) is the most common probability sample.
Methods (simply defined) to obtain a random sample include:
1. Simple Random Sample using Random Number Table – (see Introductory Concepts in Appendix A)
2. Systematic – every kth element is chosen
3. Stratified – fixed number of elements from each stratum (group)
4. Proportional (Quota) – number of elements from each stratum is determined by its size
5. Cluster – fixed number or all elements from certain strata.
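The sampling methods listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the frame of 100 numbered items, the three strata, the sample sizes, and all function names are hypothetical, and Python's pseudorandom generator stands in for the Random Number Table.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every element of the frame has the same chance of being selected."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)
    return frame[start::k]

def proportional_allocation(strata_sizes, total_n):
    """Subsample size per stratum, proportional to the stratum's share of the population.
    Rounding may make the total differ slightly from total_n in general."""
    population_size = sum(strata_sizes.values())
    return {name: round(total_n * size / population_size)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample of the allotted size inside each stratum."""
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole groups at random and keep all elements of the chosen groups."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)               # seeded so the example is repeatable
frame = list(range(1, 101))          # hypothetical frame: items numbered 1..100

srs = simple_random_sample(frame, 4, rng)
systematic = systematic_sample(frame, 25, rng)   # every 25th item gives 4 items

# Hypothetical groups of unequal size (50, 30, and 20 elements).
strata = {"A": list(range(1, 51)),
          "B": list(range(51, 81)),
          "C": list(range(81, 101))}
sizes = {name: len(members) for name, members in strata.items()}
allocation = proportional_allocation(sizes, 10)
proportional = stratified_sample(strata, allocation, rng)
clusters_chosen = cluster_sample(strata, 2, rng)

print(srs, systematic, allocation, sep="\n")
```

A fixed-size stratified sample uses the same `stratified_sample` function with an allocation that gives every stratum the same count.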
1.32 a. Volunteer
b. Yes, only those that subscribe to USA Today and have strong opinions on the subject will
respond.
1.33 a. Volunteer
b. Yes, only those who subscribe to USA Today and take the time to answer are included.
1.34 Answers will vary, but Landers’ survey was a volunteer survey; therefore there is a bias – mostly,
only those with strong opinions will respond.
1.35 convenience sampling
1.36 Judgment sampling - the distributor selected stores in areas he believed would be receptive.
1.37 Answers will vary: Includes only people who have Internet access; volunteer sample, therefore
includes mostly those with strong opinions
1.38 a. (1,1), (1,2), (1,3), (1,4)
(2,1), (2,2), (2,3), (2,4)
(3,1), (3,2), (3,3), (3,4)
(4,1), (4,2), (4,3), (4,4)
b. (1,1,1), (1,1,2), (1,1,3)
(1,2,1), (1,2,2), (1,2,3)
(1,3,1), (1,3,2), (1,3,3)
(2,1,1), (2,1,2), (2,1,3)
(2,2,1), (2,2,2), (2,2,3)
(2,3,1), (2,3,2), (2,3,3)
(3,1,1), (3,1,2), (3,1,3)
(3,2,1), (3,2,2), (3,2,3)
(3,3,1), (3,3,2), (3,3,3)
1.39 a. The set (list) from which a sample is actually drawn
b. A computer list of the full-time students
c. Random Number Table; student numbered 1288 was selected for the sample.
1.40 A simple random sample would be very difficult to obtain from an extremely large or spread-out
population.
1.41 Statistical methods presume the use of random samples.
1.42 No. In fact, if Sheila isn’t careful, dozens of identical numbers could be included in the sample.
The table of random numbers may include several identical 4-digit numbers. To avoid this
problem, Sheila’s supervisor should have given her a list of 500 unique random numbers selected
from the table. Then he should have instructed her not to select the same random number twice.
Even after taking this precaution, the telephone company may not be assigning numbers to
residents on a random basis, and it may also be impossible for Sheila to find a match. Finally, the
population of telephone users may not be representative of the population being studied.
1.43 Randomly select an integer between 1 and 25 (100/x = 100/4 = 25). Locate the first item by this
integer value. Select every 25th data value thereafter until the sample is complete.
1.44 a. U.S. Senate
b. U.S. House of Representatives
1.45 A proportional sample would work best since the area is already divided into 35 (different size)
listening areas. The size of the listening area determines the size of the subsample. The total for
all subsamples would be 2500.
1.46 Election day polls are taken from voting precincts selected because they are believed to be
representative of all voters. Each precinct is a cluster and not all precincts are sampled.
1.47 Only people with telephones and listed phone numbers will be considered, possibly eliminating
those with only cell phones.
1.48 Not all adults are registered voters.
1.49 a. Fluorescent bulbs use up to 75% less energy than incandescent light bulbs; the average life
of compact fluorescent bulbs is up to 10 times as long as incandescent light bulbs.
d. Yes, so one would know which bulb is best
e. part (d)
f. collect data on the amount of energy used in each type of bulb for a certain amount of time
g. collect data on the lifetimes of a sample of each bulb

If Minitab, Excel or the TI-83/84 Plus is going to be utilized throughout the course, be sure to
emphasize the command forms on the introductory Tech Card at the back of the text. This card
should serve as a well-utilized reference.

SECTION 1.4 EXERCISES
1.50 Several large comprehensive computer programs (called statistical packages) have been
developed that perform many of the computations and tests you will study in this text. In order to
have the statistical package perform the computations, you simply enter the data into the
computer and the computer does the rest on command, quickly and easily.
1.51 Draw graphs, print charts, calculate statistics
1.52 a. Calculators only do the calculations they are directed to perform. The results are only as
good as the operator is with precise entries.
b. The computer is a remarkable calculating machine, but it is incapable of judging whether or
not the data it is working with can be used to assess the truth. Computers cannot determine
whether or not a study has been conducted properly, or whether the appropriate methodology
is being used. The power and speed of the computer in performing calculations needed to
analyze data often tempt researchers to perform calculations that would never have been
performed without careful planning and consideration.

ONLINE PROBLEMS
1.53 a. all individuals who have hypertension and use prescription drugs to control it (a very large
b. the 5,000 people in the study
c. the proportion of the population for which the drug is effective
d. the proportion of the sample for which the drug is effective, 80%
e. No, but it is estimated to be approximately 80%.
1.54 a.
e. average cost of textbooks for the semester per student for all students
all students enrolled for this semester
the cost of textbooks for the semester for one student
the 100 students
the average cost of textbooks for the semester per student for the 100 students; add all 100
values and divide the total by 100.
1.55 a.
d. all assembled parts from the assembly line
the parts checked
attribute, attribute (it identifies the assembler), numerical
1.56 a.
d. all students currently enrolled at the college
the 10 students selected
discrete, continuous (cost rounded to nearest cent), nominal
1.57 a.
d. The population being studied is composed of all people suffering from seasonal allergies.
The sample is the 679 people given the dose.
The characteristics of interest are relief status and side effects.
The data being collected are qualitative (relief status and side effects).
1.58 a.
numerical
1.59 a. A statistic and a parameter are both numerical values calculated to describe some
characteristic of the data; the statistic summarizes the data in the sample, while the
parameter summarizes the data of the entire population.
b. convenience, availability, practicality

SECTION 1.3
1.60 probability samples

Chapter Review
1.61 Each student's answers will be different. A few possibilities are:
a. color of hair, major, gender, marital status
b. number of courses taken, number of credit hours, number of jobs, height, weight, distance
from hometown to college, cost of textbooks
1.62 a. The proportion of registered voters who will vote for the candidate
b. 23% (35/150)
c. No. If the sample is representative of the population, the candidate will not win.
1.63 a. T = 3 is a data value - a value from one person.
b. What is the average number of times per week the people in the sample went shopping?
c. What is the average number of times per week that people (all people) go shopping?
1.64 a. C = 4 is a data value – a value from one mother.
b. What percent of the mothers in the sample complimented their child one or more times
c. What percent of all mothers complimented their child one or more times yesterday?
1.65 a. credit card holders
b. type/name of credit card, number of months past due on payment, debt amount
c. name – attribute, number of months, debt amount - numerical
1.66 a. U.S. adults
b. library card status, age, gender, ethnic group, state residency, political party affiliation, annual
c. status, gender, ethnic, state, political – attribute; age, annual use - numerical
1.67 qualitative, ordinal; responses were descriptive and could be ranked
1.68 a.
d.
1.69 a. observational study
b. percent or proportion of sunglass use
c. proportion of sample that wore sunglasses, 4 out of 10 adults
1.70 a. The number of homeruns hit increases throughout the first several years, then, with an
occasional dip (maybe injury) the homerun production remains at about the same level until
the late years. In Ruth’s case, he was not an everyday player for some of the years. In Barry
Bonds’s case, he seems to defy normal aging as his production kept increasing through all
years except for the one-year spike.
b. Aaron, he hit between 25 and 47 homeruns for all but the first and last few years.
c. A case can be made for each: Ruth has more seasons with 50 or more, Aaron has more
total, Bonds has 73 one year, the only one more than 60.
d. The 73 is quite different than all of the rest.
e. If he is not pitching, Ruth would be the choice at age 21. Bonds would be the choice at age
35.
1.71 Each will have different examples.
1.72 Each will have different examples.
all adults with chronic low back pain
stratified
1.73 Each will have different examples.
1.74 Each will have different e...