Unimodal and Multi-modal Data; Empirical Formula for Mean, Median, and Mode


  • Unimodal
      • Empirical formula: $\text{mean} - \text{mode} \approx 3 \times (\text{mean} - \text{median})$
  • Multi-modal
      • Bimodal
      • Trimodal
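The empirical formula can be rearranged to give a rough estimate of the mode from the mean and median: mode ≈ mean - 3 × (mean - median). The sketch below only illustrates that arithmetic; the function name `estimate_mode` and the sample values are made up for this example, and the relation is an approximation that is normally applied to moderately skewed unimodal data.

```python
from statistics import mean, median


def estimate_mode(values):
    """Estimate the mode from: mean - mode ≈ 3 * (mean - median).

    Hypothetical helper for illustration; the result is only an
    approximation and assumes roughly unimodal data.
    """
    m = mean(values)
    med = median(values)
    return m - 3 * (m - med)


# Illustrative, made-up values (the observed mode is 2).
data = [1, 2, 2, 3, 4, 5, 8]
print("estimated mode:", estimate_mode(data))
```

For perfectly symmetric data the mean and median coincide, so the estimate equals both of them.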

Data Mining: Exploratory Data Analysis

Symmetric vs. Skewed Data

Data in most real applications are not symmetric. They may instead be either positively skewed, where the mode occurs at a value that is smaller than the median, or negatively skewed, where the mode occurs at a value greater than the median.

[Figure: symmetric data, positively skewed data, negatively skewed data]
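The mode-versus-median rule for skew direction can be sketched in a few lines. This is a minimal illustration under the assumption that the data have a single clear mode; `skew_direction` is a hypothetical name, and the sample lists are made up.

```python
from statistics import median, mode


def skew_direction(values):
    """Classify skew by comparing the mode with the median.

    Assumes a single mode; with several equally frequent values,
    statistics.mode (Python 3.8+) returns the first one encountered.
    """
    md = mode(values)
    med = median(values)
    if md < med:
        return "positive"
    if md > med:
        return "negative"
    return "symmetric"


print(skew_direction([1, 1, 1, 2, 3, 4, 10]))    # mode below median
print(skew_direction([1, 7, 8, 9, 10, 10, 10]))  # mode above median
```

A result of "symmetric" only means that mode and median coincide under this rule; it does not prove that the full distribution is symmetric.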

Properties of Normal Distribution Curve

[Figure: normal distribution curve, annotated "represent data dispersion, spread" and "represent central tendency"]

Measures of Data Distribution: Variance and Standard Deviation

  • Variance and standard deviation (sample: s, population: σ)
  • Variance (algebraic, scalable computation):

    $$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n}x_i^2-\frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2\right]$$

    $$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 = \frac{1}{N}\sum_{i=1}^{N}x_i^2-\mu^2$$

  • Q: Can you compute it incrementally and efficiently?
  • Standard deviation s (or σ) is the square root of the variance s² (or σ²)
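One way to answer the question about incremental computation: the second form of each variance formula depends only on the count, the sum of the values, and the sum of their squares, all of which can be updated one value at a time or merged across data partitions. The sketch below assumes that approach; the class name `RunningVariance` and the sample values are made up for illustration. With floating-point data this form can lose precision when the mean is large relative to the spread, and Welford's algorithm is a common, more stable alternative.

```python
import math


class RunningVariance:
    """Keeps n, sum(x), and sum(x^2) so variance needs no second pass."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def add(self, x):
        """Fold one new observation into the running totals."""
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def merge(self, other):
        """Combine totals from another partition of the data."""
        self.n += other.n
        self.sum_x += other.sum_x
        self.sum_x2 += other.sum_x2

    def sample_variance(self):
        if self.n < 2:
            raise ValueError("need at least two observations")
        return (self.sum_x2 - self.sum_x ** 2 / self.n) / (self.n - 1)

    def population_variance(self):
        if self.n < 1:
            raise ValueError("need at least one observation")
        mu = self.sum_x / self.n
        return self.sum_x2 / self.n - mu * mu


rv = RunningVariance()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:  # illustrative values
    rv.add(value)
print(rv.sample_variance(), math.sqrt(rv.population_variance()))
```

Because only three totals are stored, two partial results can be combined with `merge` without revisiting the raw data.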

Graphic Displays of Basic Statistical Descriptions

  • Boxplot: graphic display of the five-number summary.
  • Histogram: the x-axis represents values, the y-axis represents frequencies.
  • Quantile plot: each value $x_i$ is paired with $f_i$, indicating that approximately $100 f_i\%$ of the data are $\le x_i$.
  • Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another.
  • Scatter plot: each pair of values is treated as a pair of coordinates and plotted as a point in the plane.
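To make the quantile plot concrete, the sketch below pairs each sorted value with a fraction f_i. It assumes the common convention f_i = (i - 0.5) / N for the i-th smallest of N values, which the slide does not state explicitly; `quantile_pairs` is a hypothetical name and the data are made up.

```python
def quantile_pairs(values):
    """Return (f_i, x_i) pairs for a quantile plot.

    Assumes f_i = (i - 0.5) / N for the i-th smallest value; other
    plotting-position conventions exist.
    """
    xs = sorted(values)
    n = len(xs)
    return [((i - 0.5) / n, x) for i, x in enumerate(xs, start=1)]


for f, x in quantile_pairs([30, 10, 20, 40]):
    print(f"{f:.3f}  {x}")
```

Plotting f_i on the x-axis against x_i on the y-axis gives the quantile plot; plotting the quantiles of two data sets against each other gives a q-q plot.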

Boxplot

Boxplot: graphic display of the five-number summary.

[Figure: boxplot]

Measuring the Dispersion of Data: Quartiles & Boxplots

  • Quartiles: Q1 (25th percentile), Q3 (75th percentile)
  • Inter-quartile range: IQR = Q3 - Q1
  • Five-number summary: min, Q1, median, Q3, max
  • Boxplot: data is represented with a box
      • Q1, Q3, IQR: the ends of the box are at the first and third quartiles, i.e., the height of the box is the IQR
      • Median (Q2) is marked by a line within the box
      • Whiskers: two lines outside the box, extended to the minimum and maximum
      • Outliers: points beyond a specified outlier threshold, plotted individually
  • Outlier: usually, a value more than 1.5 × IQR above Q3 or below Q1
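The five-number summary and the 1.5 × IQR outlier rule can be sketched with Python's standard library. Quartile conventions differ between tools; this example assumes the "inclusive" method of `statistics.quantiles` (Python 3.8+), and the function names and data are made up for illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    xs = sorted(values)
    q1, q2, q3 = quantiles(xs, n=4, method="inclusive")
    return xs[0], q1, q2, q3, xs[-1]


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # illustrative values
print(five_number_summary(data))
print(iqr_outliers(data))
```

In a boxplot that plots outliers individually, the whiskers would typically stop at the most extreme values inside the two fences rather than at the raw minimum and maximum.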

Visualization of Data Dispersion: 3-D Boxplots

[Figure: 3-D boxplots]

Histogram

Histogram: the x-axis represents values, the y-axis represents frequencies.

[Figure: histogram]

Histogram Analysis

  • Histogram: graphical display of tabulated frequencies, shown as bars
  • Differences between histograms and bar charts:
      • Histograms show the distribution of a variable, while bar charts compare variables
      • Histograms plot binned quantitative data, while bar charts plot categorical data
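The "binned quantitative data" behind a histogram can be illustrated by counting values per bin. The sketch below assumes equal-width bins with the maximum value placed in the last bin; `histogram_counts` is a hypothetical helper and the data are made up.

```python
def histogram_counts(values, num_bins):
    """Count values in equal-width bins spanning [min, max].

    Each bin is half-open [lo, hi) except the last, which also
    includes the maximum value.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + k * width for k in range(num_bins + 1)]
    return edges, counts


edges, counts = histogram_counts([1, 2, 2, 3, 5, 8, 9, 10], num_bins=3)
for left, right, c in zip(edges, edges[1:], counts):
    print(f"[{left:g}, {right:g}): {'#' * c}")
```

A bar chart, by contrast, would count occurrences per category label and involves no binning step.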
  • Winter '18
  • nour
