The Cartoon Guide to Statistics. New York: Harper Perennial, p. 212, 1993.
Weisstein, Eric W. "Chernoff Face." From MathWorld, a Wolfram Web Resource. mathworld.wolfram.com/ChernoffFace.html

Data Mining Exploratory Data Analysis
Stick Figure
A census data figure showing age, income, gender, education, etc.
A 5-piece stick figure (1 body and 4 limbs with different angles/lengths)
Hierarchical Visualization Techniques
Visualization of the data using a hierarchical partitioning into subspaces
Methods: Dimensional Stacking, Worlds-within-Worlds, Tree-Map, Cone Trees, InfoCube

Visualizing Complex Data and Relations: Tag Cloud & Tree-Map
Tag cloud: visualizes user-generated tags. The importance of a tag is represented by font size/color. Popularly used to visualize word/phrase distributions.
Figure: KDD 2013 research paper title tag cloud
Tree-map: displays hierarchical data as a set of nested rectangles.
Figure: Newsmap, Google News stories in 2005
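The font-size encoding used by a tag cloud can be sketched in a few lines. This is only an illustrative sketch: the function name tag_font_sizes, the linear scaling rule, and the 10 to 36 point range are assumptions chosen for the example, not something the slides specify.

```python
from collections import Counter


def tag_font_sizes(tags, min_pt=10, max_pt=36):
    """Map each tag to a font size that grows linearly with its frequency.

    The linear rule and the point range are illustrative assumptions.
    """
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, count in counts.items():
        # If every tag is equally frequent, fall back to the smallest size.
        weight = (count - lo) / span if span else 0.0
        sizes[tag] = round(min_pt + weight * (max_pt - min_pt))
    return sizes


sizes = tag_font_sizes(["graph", "mining", "graph", "cluster", "graph", "mining"])
```

The most frequent tag receives the largest size and the least frequent the smallest; real tag-cloud tools often use logarithmic scaling instead, so that a few very common tags do not dwarf the rest.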
Visualizing Complex Data and Relations: Social Networks
Visualizing non-numerical data: social and information networks
Figures: a typical network structure; a social network; organizing information networks

Measuring Data Similarity and Dissimilarity
Data Matrix and Dissimilarity Matrix
Proximity Measures for Nominal Attributes
Proximity Measures for Binary Attributes
Dissimilarity of Numeric Data: Minkowski Distance
Proximity Measures for Ordinal Attributes
Dissimilarity for Attributes of Mixed Types
Cosine Similarity
Measuring Data Similarity and Dissimilarity
A cluster is a collection of data objects such that the objects within a cluster are similar to one another and dissimilar to the objects in other clusters.
Outlier analysis also employs clustering-based techniques to identify potential outliers as objects that are highly dissimilar to others.

Similarity, Dissimilarity, and Proximity
Similarity measure (or similarity function)
  A real-valued function that quantifies the similarity between two objects
  Measures how alike two data objects are: the higher the value, the more alike
  Often falls in the range [0, 1]: 0 means no similarity; 1 means completely similar
Dissimilarity (or distance) measure
  A numerical measure of how different two data objects are
  In some sense, the inverse of similarity: the lower the value, the more alike
  Minimum dissimilarity is often 0 (i.e., completely similar)
  Range is [0, 1] or [0, ∞), depending on the definition
Proximity usually refers to either similarity or dissimilarity
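A small sketch can make the contrast between the two kinds of measure concrete. It uses Euclidean distance as the dissimilarity (range [0, ∞)) and converts it to a similarity in (0, 1] with s = 1 / (1 + d). Both the function names and this particular conversion are assumptions made for illustration; the slides do not prescribe a conversion, and other choices are possible.

```python
import math


def euclidean_distance(x, y):
    """Dissimilarity in [0, inf): 0 means the two objects are identical."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_to_similarity(d):
    """One possible (assumed) conversion of a distance into a (0, 1] similarity."""
    return 1.0 / (1.0 + d)


d = euclidean_distance((0, 0), (3, 4))   # distance between two 2-D objects
s = distance_to_similarity(d)            # higher value means more alike
```

Identical objects have distance 0 and therefore similarity 1, and the similarity falls toward 0 as the distance grows, which matches the "inverse" relationship described above.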
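Data Matrix and Dissimilarity Matrix
Data matrix
  A data matrix of n data points with p dimensions: an n × p matrix with one row per data point and one column per attribute
Dissimilarity (distance) matrix
  An n × n matrix over the same n data points that registers only the distance d(i, j) between points i and j (typically a metric)
  d(i, i) = 0, and when the measure is symmetric, d(i, j) = d(j, i), so only the lower triangle needs to be stored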
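The relationship between the two matrices can be sketched as follows. The example assumes numeric attributes and Euclidean distance (via math.dist, available in Python 3.8 and later); the function name dissimilarity_matrix and the sample data are hypothetical, and any other distance function could be substituted.

```python
import math


def dissimilarity_matrix(data):
    """Build the n x n matrix of pairwise distances d(i, j) from an n x p data matrix.

    Euclidean distance is an assumption made for this sketch.
    """
    n = len(data)
    matrix = [[0.0] * n for _ in range(n)]   # diagonal stays 0: d(i, i) = 0
    for i in range(n):
        for j in range(i):                   # compute the lower triangle only
            d = math.dist(data[i], data[j])
            matrix[i][j] = d
            matrix[j][i] = d                 # mirror, since d(i, j) = d(j, i)
    return matrix


# Data matrix: 3 data points (rows) with 2 dimensions (columns).
data = [(0, 0), (3, 4), (6, 8)]
dist = dissimilarity_matrix(data)
```

Only n(n - 1) / 2 distances are actually computed; the rest follow from symmetry and the zero diagonal.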

