Measured on a scale of equal-sized units
  Values have order
  e.g., temperature in °C or °F, calendar dates
  Can be positive, 0, or negative
  Allow us to compare and quantify the difference between values
Ratio-scaled
  Inherent zero-point
  We can speak of a value as being a multiple (or ratio) of another value (10 K is twice as high as 5 K)
  e.g., temperature in Kelvin, length, counts, monetary quantities
  The values are ordered
  We can compute the difference between values, as well as the mean, median, and mode
Q1: Is student ID nominal, ordinal, or interval-scaled data?
Q2: What about eye color? Or color in the color spectrum of physics?
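The interval/ratio distinction can be checked numerically. The sketch below is illustrative only: the two temperatures and the helper name celsius_to_kelvin are assumptions made for this example, not part of the slides. It shows that differences agree on both scales, while a ratio is only meaningful on the scale with an inherent zero-point.

```python
# Illustrative values only (assumed for this example).
def celsius_to_kelvin(c):
    """Convert a Celsius reading to Kelvin (hypothetical helper)."""
    return c + 273.15

t1_c, t2_c = 5.0, 10.0                      # interval-scaled (Celsius)
t1_k = celsius_to_kelvin(t1_c)              # ratio-scaled (Kelvin)
t2_k = celsius_to_kelvin(t2_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = t2_c - t1_c
diff_k = t2_k - t1_k

# Ratios depend on where zero sits, so only the Kelvin ratio is meaningful.
ratio_c = t2_c / t1_c   # 2.0, yet 10 °C is not "twice as hot" as 5 °C
ratio_k = t2_k / t1_k   # roughly 1.018: the physically meaningful ratio

print(diff_c, round(diff_k, 6), ratio_c, round(ratio_k, 3))
```

The Celsius ratio changes if the same temperatures are re-expressed in Fahrenheit, which is one way to see that it carries no physical meaning.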

Data Mining Data
Discrete vs. Continuous Attributes
Classification algorithms often describe attributes as being discrete or continuous.
Discrete Attribute
  Has only a finite or countably infinite set of values
  e.g., zip codes, profession, or the set of words in a collection of documents
  Sometimes represented as integer variables
  Note: Binary attributes are a special case of discrete attributes
Continuous Attribute
  Has real numbers as attribute values
  e.g., temperature, height, or weight
  In practice, real values can only be measured and represented using a finite number of digits
  Continuous attributes are typically represented as floating-point variables
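A minimal sketch of how the two attribute kinds tend to be handled in code. The records, field layout, and variable names below are made up for illustration. Discrete values are counted and mapped to integer codes; continuous values are stored as floats, which hold only a finite number of digits.

```python
from collections import Counter

# Hypothetical records: (zip_code, profession, height_in_metres)
records = [
    ("15213", "engineer", 1.72),
    ("15213", "teacher", 1.65),
    ("94305", "engineer", 1.80),
]

# Discrete attribute: values come from a finite set, so counting is natural.
profession_counts = Counter(r[1] for r in records)

# Discrete values are sometimes represented as integer codes.
profession_codes = {name: i for i, name in enumerate(sorted(profession_counts))}
encoded = [profession_codes[r[1]] for r in records]

# Continuous attribute: real-valued, stored as floating-point numbers.
heights = [r[2] for r in records]
mean_height = sum(heights) / len(heights)

# Floats carry a finite number of digits, so exact equality can fail.
sum_is_exact = (0.1 + 0.2 == 0.3)

print(profession_counts, encoded, round(mean_height, 3), sum_is_exact)
```

Note that the integer codes are labels only; the ordering they imply (engineer before teacher) is an artifact of sorting, not a property of the attribute.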
Important Characteristics of Data Sets
There are three general characteristics of data sets: dimensionality, sparsity, and resolution.
Data Dimensionality
The dimensionality of a data set is the number of features/attributes that the objects in the data set possess.
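As a small illustration of this definition, the sketch below treats a data set as a list of rows, one row per object and one column per attribute. The sample values and the function name dimensionality are assumptions made for the example.

```python
def dimensionality(rows):
    """Return the number of attributes shared by every object in rows.

    Hypothetical helper: assumes each object is a fixed-length sequence.
    """
    if not rows:
        raise ValueError("empty data set has no defined dimensionality")
    n_attributes = len(rows[0])
    if any(len(row) != n_attributes for row in rows):
        raise ValueError("objects do not all have the same attributes")
    return n_attributes

# Made-up data set: 3 objects, each described by 4 numeric attributes.
data = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.3, 3.3, 6.0, 2.5],
]

n_objects = len(data)
n_dims = dimensionality(data)
print(n_objects, n_dims)
```

The number of objects (3) and the dimensionality (4) are independent: adding more rows leaves the dimensionality unchanged.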

