Introduction to Statistics
(Data Management)

Statistics is the science of collecting, organizing and interpreting data.

- Data is collected using different types of sampling.
- The data is organized into tables and then presented in a graph.
- Different types of graphs can be used depending on the type of data that is being presented.

Statistics: The Analysis Stage

Once data has been collected and represented, it must be analyzed and interpreted. Consider the following as a guide for the analysis stage:

1. Correlation: Is there a relation? Is there a positive or negative correlation?
2. Correlation with variables: Define the relation in words.
3. Linear or Non-Linear: Is the data best represented by a line of best fit or a curve of best fit?
4. Is (0, 0) in the data? Is the dependent variable zero when the independent variable is zero?
5. Discrete or Continuous: Does the data between the given set have any meaning?

Statistics: The Analysis Stage

1. Correlation
- Is there a relation?
- Is there a positive or negative correlation?

Positive correlation: as the independent variable (x) increases, the dependent variable (y) increases.
Negative correlation: as the independent variable (x) increases, the dependent variable (y) decreases.

2. Correlation with variables
- Define the relation in words.
Example: "As a bird's length increases, its wingspread increases."

3. Linear or Non-Linear
- Is the data best represented by a line of best fit or a curve of best fit?

Line of best fit
- models a linear relation
- passes through or close to as many points as possible
- any points not on the line should be distributed evenly above and below it
- used for interpolation and extrapolation

Curve of best fit
- models a nonlinear relation
- uses all the features of the line of best fit

4. Is (0, 0) in the data?
- Is the dependent variable zero when the independent variable is zero?

5. Discrete or Continuous?
- Does the data between the given set have any meaning?

NO: the relation can be represented by points only.
Example: Banquet Rental Cost

YES: draw a line.
Example: Driving to Toronto

Extrapolation vs. Interpolation

Interpolation
- to estimate a value between the known values

Extrapolation
- to estimate a value beyond the known values

Scatterplots & Line of Best Fit
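The difference between interpolation and extrapolation can be sketched in a few lines of Python. This is only an illustration: the function name `classify_estimate` and the x-values are hypothetical and do not come from the notes. It assumes the usual convention that an estimate is an interpolation when the x-value lies within the range of the known x-values, and an extrapolation otherwise.

```python
def classify_estimate(x, known_xs):
    """Classify an estimate at x as interpolation or extrapolation.

    An x-value inside the range of the known x-values (endpoints included)
    counts as interpolation; anything outside that range is extrapolation.
    """
    lowest, highest = min(known_xs), max(known_xs)
    if lowest <= x <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical x-values, for illustration only.
known_xs = [10, 20, 30, 40, 50]

print(classify_estimate(25, known_xs))  # 25 is between 10 and 50
print(classify_estimate(60, known_xs))  # 60 is beyond the largest known value
```

Extrapolated estimates are generally less reliable, because nothing in the data confirms that the trend continues outside the measured range.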
Example 1.
a) Draw a scatterplot relating length and wingspread of birds.

Lengths and Wingspreads of Birds
Length (cm):      24   62  126   75   50  170  125   26  125   30
Wingspread (cm):  27   75  150  170  100  250  300   50  350  100

b) Now analyze the data using the five key points relating to
the analysis stage.
(i) Correlation: Is there a relation? Is there a positive or negative correlation?
(ii) Correlation with variables: Define the relation in words.
(iii) Linear or Non-Linear: Is the data best represented by a straight line or a curve?
(iv) Is (0, 0) in the data? Is the dependent variable zero when the independent variable is zero?
(v) Discrete or Continuous: Does the data between the given set have any meaning?

[Graph: Lengths and Wingspreads of Birds. Wingspread (cm) plotted against Length (cm).]

c) Use the line of best fit to determine the wingspread for a bird that is 100 cm long. Are you interpolating or extrapolating? Explain.
d) Use the line of best fit to determine the length of a bird that has a wingspread of 330 cm. Are you interpolating or extrapolating? Explain.

Analysis of Data

1. Correlation: Is there a relation?
Is there a positive or negative correlation?
There is a positive correlation because as the independent variable increases, the dependent variable also increases.

2. Correlation with variables: Define the relation in words.
As the length of the bird increases, the wingspread also increases.

3. Linear or Non-Linear? Is the data best represented by a straight line or a curve?
The data is best represented by a straight line, i.e. a line of best fit.

4. Is (0, 0) in the data? Is the dependent variable zero when the independent variable is zero?
Yes, (0, 0) is in the data, i.e. for zero length, the wingspread would be zero.

5. Discrete or Continuous: Does the data between the given set have any meaning?
The data between the given set has meaning, therefore the graph is continuous.
...
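The analysis of the bird data can also be checked numerically. The sketch below is one possible approach, not the method used in the notes: it assumes the ten (length, wingspread) pairs from the table in Example 1(a), and it assumes an ordinary least-squares fit as the line of best fit, whereas the notes may intend a line drawn by eye. All function and variable names are hypothetical.

```python
import math

# Data copied from the table in Example 1(a); both lists are in cm.
lengths = [24, 62, 126, 75, 50, 170, 125, 26, 125, 30]
wingspreads = [27, 75, 150, 170, 100, 250, 300, 50, 350, 100]


def mean(values):
    return sum(values) / len(values)


def deviation_sums(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squared and cross deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def correlation(xs, ys):
    """Pearson correlation coefficient; its sign gives the direction."""
    sxx, syy, sxy = deviation_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    sxx, _, sxy = deviation_sums(xs, ys)
    slope = sxy / sxx
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


r = correlation(lengths, wingspreads)
slope, intercept = least_squares_line(lengths, wingspreads)

# Part (c): estimate the wingspread of a bird that is 100 cm long.
wingspread_at_100 = slope * 100 + intercept
# True when 100 cm lies between the shortest and longest birds in the table.
inside_length_range = min(lengths) <= 100 <= max(lengths)

# Part (d): solve wingspread = slope * length + intercept for the length.
length_at_330 = (330 - intercept) / slope

print(f"correlation coefficient r = {r:.2f}")
print(f"line of best fit: wingspread = {slope:.2f} * length + {intercept:.2f}")
print(f"(c) estimated wingspread at 100 cm: {wingspread_at_100:.0f} cm")
print(f"    100 cm within measured lengths: {inside_length_range}")
print(f"(d) estimated length at 330 cm wingspread: {length_at_330:.0f} cm")
```

A line drawn by eye will give slightly different estimates from the least-squares line, so answers read off a hand-drawn graph need not match these numbers exactly.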
 Spring '11
 Dr.EmirJunver
 Statistics
