number of attributes Notes
Each instance is described by a fixed predefined set of
features, its attributes
But: number of attributes may vary in practice
Possible solution: irrelevant value flag
Related problem: existence of an attribute may depend on
va
STAT 1112
Major Assignment
Student Name: _
Student Number: _
Bianca Frantini, a Humber College graduate who always had only perfect marks in statistics, was hired
by the famous Healthy Life medical insurance company. Bianca is assigned to conduct statisti
Introduction to Statistics
RSMT 1500 (Quantitative Research Methods)
Based on: Elementary Statistics:
Picturing the World (6th edition)
Prepared by:
Mona BayaniKeyvani
What is Statistics?
Statistics is the science of collecting,
organizing, presenting,
Linear Regression and Correlation
Correlation
Correlation: a relationship between two variables.
The data can be
Binomial Probability
Binomial Probability
How many outcomes are there in the following experiments?
Getting the right a
Normal Probability Distribution
Properties of a Normal Distribution
Continuous random variable
Has an infinite n
Probability
Probability
Probability is the chance or likelihood of an event
happening.
Probability is frequently expr
Hypothesis Testing:
Two Samples
Two Sample Hypothesis Test
Compares two parameters from two
populations.
Independe
STAT 1112 Quiz
1. In a particular country, household income is not normally distributed, with a mean of $12000 per
year, and a standard deviation of $8000. If you take a sample of 48 households in this country,
what is the probability that the sample mean
Descriptive Statistics,
Measures of Central Tendency
Measures of Central Tendency
A measure of central tendency is a valu
Descriptive Statistics,
Frequency Distribution
Descriptive Statistics
Presenting data in a frequency
distributi
Data Mining Tasks Notes
Data Mining Tasks (Styles of learning):
Classification learning:
predicting a discrete class
Association learning:
detecting associations between features
Clustering:
grouping similar instances into clusters
Numeric prediction:
pre
Clustering Notes
Examples: customer grouping
Finding groups of items that are similar
Clustering is unsupervised
The class of an example is not known
Success often measured subjectively
Numeric prediction
Classification learning, but class is numeric
Lear
MEASURES OF VARIANCE Notes
Min and max
Range
Standard deviation (SD): square root of the variance
Variance: $V = \sum_i (x_i - \bar{x})^2 / (n - 1)$
Interquartile range (IQR): Q3 - Q1, i.e. 75th percentile minus 25th percentile
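The measures listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values are made up, the helper name sample_variance is hypothetical, and the quartiles come from the standard library's default method, which may differ slightly from the convention used in a given textbook.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Made-up sample, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(data) - min(data)   # max minus min
variance = sample_variance(data)      # V
sd = math.sqrt(variance)              # SD is the square root of V

# Quartile conventions differ between textbooks and software;
# statistics.quantiles uses its default ("exclusive") method here.
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(value_range, variance, sd, iqr)
```

Dividing by n - 1 rather than n gives the sample variance, which matches the formula in the notes.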
What measures are to be used for sample description?
If distribution is NORMAL
Mean
Variance (or
Ratio quantities Notes
Ratio quantities are ones for which the measurement
scheme defines a zero point
Example: attribute distance
Ratio quantities are treated as real numbers
Distance between an object and itself is zero
All mathematical operations are a
Introduction to biostatistics Lecture Notes
1. Basics
2. Variable types
3. Descriptive statistics:
   Categorical data
   Numerical data
4. Inferential statistics:
   Confidence intervals
   Hypothesis testing
STATISTICS can mean 2 things:
- the numbers we get when we
Populations and Samples Notes
Population: Collection of all possible entities of interest
Described by Parameters
Sample: Subset of collection
Described by Statistics
Statistical Inference
Art and science of using samples to make conclusions about
populat
Transforming ordinal to Boolean Notes
Simple transformation allows
ordinal attribute with n values
to be coded using n-1 boolean attributes
Example: attribute temperature
Better than coding it as a nominal attribute
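A minimal sketch of this transformation, assuming a temperature attribute with the ordered values cool < mild < hot (the level names and the function name ordinal_to_boolean are illustrative, not taken from the notes). Each of the n-1 boolean attributes answers the question "is the value above level k?":

```python
def ordinal_to_boolean(value, ordered_levels):
    """Encode an ordinal value as n-1 booleans, one per threshold.

    The k-th boolean is True when the value ranks above ordered_levels[k].
    """
    rank = ordered_levels.index(value)
    return [rank > k for k in range(len(ordered_levels) - 1)]


# Assumed ordering, for illustration only.
levels = ["cool", "mild", "hot"]

encoded = {level: ordinal_to_boolean(level, levels) for level in levels}
for level, bits in encoded.items():
    print(level, bits)
```

Unlike a nominal coding, this keeps the order: a higher temperature never has fewer True flags than a lower one.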
Metadata
Information about the data that
Generating a flat file Notes
Process of flattening a file is called denormalization
Several relations are joined together to make one
Possible with any finite set of finite relations
Problematic: relationships without pre-specified number of
objects
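As a rough illustration of denormalization, the sketch below joins two small relations into one flat list of rows. The relations, their field names, and the helper flatten are all hypothetical. It also hints at the problem noted above: a customer with several orders appears in several flat rows, because the number of related objects is not fixed in advance.

```python
def flatten(parent_rows, child_rows, key):
    """Join each child row with its parent row on a shared key."""
    parents_by_key = {row[key]: row for row in parent_rows}
    return [
        {**parents_by_key[child[key]], **child}
        for child in child_rows
        if child[key] in parents_by_key
    ]


# Hypothetical relations, for illustration only.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "tea"},
    {"order_id": 11, "customer_id": 1, "item": "milk"},
    {"order_id": 12, "customer_id": 2, "item": "bread"},
]

flat = flatten(customers, orders, "customer_id")
for row in flat:
    print(row)
```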
Examp
Hypothesis testing Notes
H0: $\mu_1 = \mu_2$; $p_1 = p_2$ (RR = 1, OR = 1, difference = 0)
HA: $\mu_1 \neq \mu_2$; $p_1 \neq p_2$ (two-sided or one-sided)
Significance level (agreed 0.05).
Test for P value (t-test, $\chi^2$, etc.).
The P value is the probability of getting the difference (association) if
the null hypo
Description of numerical data Notes
Arranging data
Frequencies (relative and cumulative), graphical
presentation
Measures of central tendency and variance
Assessing normality
Grouping
Sorting data
Groups (5-17 groups) according to the researcher's criteria.
To asses
Categorical Variables Notes
Review: Categorical Variables place individuals into one of
several groups or categories.
The values of a categorical variable are labels for the
different categories.
The distribution of a categorical variable lists the count
Conditional Distribution Notes
Marginal distributions tell us nothing about the relationship
between two variables.
A Conditional Distribution of a variable describes the
values of that variable among individuals who have a
specific value of another varia
Data Scales Notes
Data are generally classified into four types:
1. Nominal: categorical data
2. Ordinal: shows ranks; intervals may vary
3. Interval: intervals are constant; arbitrary 0
4. Ratio: numeric data with a real 0 value
Ordinal, Interval and Ratio
STAT 1112-YD
Abigail Natnat
Winter 2016
N01100811
Assignment # 1
1.
a.) BAR GRAPH
The Bar Graph of the 2015 Annual Report of the Procter & Gamble Company
[Bar graph: Net Sales of 2015. Vertical axis from 0 to 30,000; bar values shown: 26,157; 29,831; 16,001; 9,725; 14,900; 5,773.]