JORDAN UNIVERSITY OF SCIENCE AND TECHNOLOGY
Information Technology Faculty
Computer Science Department
CS 729 Advanced Data Mining

Box-plots

Done by: Mohammed K. Ali Shatnawi 20093171003
Presented to: Dr. Hasan Najadat

Box-plots history

In 1977, John Tukey published an efficient method for displaying a five-number data summary. The graph is called a boxplot (also known as a box and whisker plot) and summarizes the following statistical measures:

• median
• upper and lower quartiles
• minimum and maximum data values

Box-plot in depth

The following figure shows a box-plot graph, followed by its interpretation:
Figure 1: Box-plot graph

The box itself contains the middle 50% of the data. The upper edge (hinge) of the box indicates the 75th percentile of the data set, and the lower hinge indicates the 25th percentile. The range of the middle two quartiles, that is, the distance between the two hinges, is known as the inter-quartile range.
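As a rough illustration of the quantities the box encodes, the following Python sketch computes a five-number summary and the inter-quartile range. It assumes Tukey-style hinges (the median of each half of the sorted data, with the overall median shared by both halves when the count is odd); other quartile conventions give slightly different values. The function name five_number_summary and the sample data are hypothetical and used only for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower hinge, median, upper hinge, maximum).

    Assumption: hinges are computed Tukey-style, as the medians of the
    lower and upper halves of the sorted data.
    """
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    n = len(data)
    half = (n + 1) // 2  # size of each half; the median is shared when n is odd
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    return data[0], lower_hinge, median(data), upper_hinge, data[-1]


# Hypothetical sample data, not taken from the report.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
minimum, q1, med, q3, maximum = five_number_summary(sample)
iqr = q3 - q1  # height of the box: upper hinge minus lower hinge

print(minimum, q1, med, q3, maximum)  # 2 4 7 9 12
print(iqr)                            # 5
```

For this sample the box would span 4 to 9, with the median line at 7.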
• The line in the box indicates the median value of the data.
• If the median line within the box is not equidistant from the hinges, then the data is skewed.
• The ends of the vertical lines, or "whiskers", indicate the minimum and maximum data values unless outliers are present, in which case each whisker extends at most 1.5 times the inter-quartile range beyond its hinge.
• The points outside the ends of the whiskers are outliers or suspected outliers.

Useful software supporting box-plots

• MATLAB, for engineering and science. The function is found under the Statistical Visualization/Distribution plot package, is called boxplot(…), and has 4 overloaded versions.
• StatSoft also provides software for box plots and other types of graphs, called STATISTICA, and it is free for download.
• Microsoft Office Excel application....
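Returning to the whisker and outlier rule described in the interpretation of the figure, the following Python sketch shows one way to locate the whisker ends and flag suspected outliers. It assumes Tukey-style hinges and the usual convention that a whisker stops at the most extreme data value lying within 1.5 times the inter-quartile range of its hinge. The function name whiskers_and_outliers, the parameter k, and the sample data are hypothetical; this is not the implementation used by any of the packages listed above.

```python
from statistics import median


def whiskers_and_outliers(values, k=1.5):
    """Return (lower whisker end, upper whisker end, list of outliers).

    Assumption: hinges are the medians of the lower and upper halves of
    the sorted data, and k is the whisker multiplier (1.5 by convention).
    """
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    n = len(data)
    half = (n + 1) // 2  # the median is shared by both halves when n is odd
    q1 = median(data[:half])
    q3 = median(data[n - half:])
    iqr = q3 - q1

    # Fences: the furthest a whisker is allowed to reach from each hinge.
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr

    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]

    # Each whisker ends at the most extreme value that is still inside the fences.
    return inside[0], inside[-1], outliers


# Hypothetical sample data with one unusually large value.
sample_with_outlier = [2, 4, 4, 5, 7, 8, 9, 11, 30]
low, high, suspected = whiskers_and_outliers(sample_with_outlier)
print(low, high, suspected)  # 2 11 [30]
```

Here the hinges are 4 and 9, so the fences fall at -3.5 and 16.5; the value 30 lies outside them and would be drawn as a separate point beyond the upper whisker.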
