University of Florida CISE department, Gator Engineering
Data Issues
Dr. Sanjay Ranka
Professor, Computer and Information Science and Engineering
University of Florida, Gainesville

University of Florida CISE department Gator Engineering Data Mining Sanjay Ranka Spring 2011

What Is a Data Set?
• Attributes (describe objects)
  – Variable, field, characteristic, feature or observation
• Objects (have attributes)
  – Record, point, case, sample, entity or item
• Data Set
  – Collection of objects

Type of an Attribute
• The type of an attribute depends on which of the following properties it possesses:
  – Distinctness: =, ≠
  – Order: <, >
  – Addition: +, -
  – Multiplication: *, /
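
The four properties can be read as cumulative levels: a nominal attribute supports only distinctness, an ordinal attribute adds order, an interval attribute adds addition, and a ratio attribute adds multiplication. A minimal Python sketch of that reading follows; the names `PROPERTIES`, `TYPES` and `attribute_type` are hypothetical helpers introduced here for illustration, not part of the lecture.

```python
# Hypothetical helper: maps the set of meaningful properties to an attribute type.
# Assumes the properties are cumulative (each level includes all earlier ones).
PROPERTIES = ("distinctness", "order", "addition", "multiplication")
TYPES = ("nominal", "ordinal", "interval", "ratio")


def attribute_type(properties):
    """Return the attribute type implied by a set of meaningful properties."""
    props = set(properties)
    unknown = props - set(PROPERTIES)
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")

    # Count how many properties are present, in order, before the first gap.
    level = 0
    for name in PROPERTIES:
        if name not in props:
            break
        level += 1

    # Reject empty sets and sets with gaps, e.g. {"order"} without "distinctness".
    if level == 0 or level != len(props):
        raise ValueError("properties must form a prefix of " + ", ".join(PROPERTIES))
    return TYPES[level - 1]


zip_code_type = attribute_type({"distinctness"})
grade_type = attribute_type({"distinctness", "order"})
mass_type = attribute_type(PROPERTIES)
```

Under these assumptions a zip code comes out as nominal, a grade as ordinal, and a mass as ratio.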

Types of Attributes
• Nominal
  – Description: Each value represents a label. (Typical comparisons between two values are limited to “equal” or “not equal”)
  – Examples: Flower color, gender, zip code
  – Operations: Mode, entropy, contingency correlation, χ² test
• Ordinal
  – Description: The values can be ordered. (Typical comparisons between two values are “equal” or “greater” or “less”)
  – Examples: Hardness of minerals, {good, better, best}, grades, street numbers, rank, age
  – Operations: Median, percentiles, rank correlation, run tests, sign tests
• Interval
  – Description: The differences between values are meaningful, i.e., a unit of measurement exists. (+, -)
  – Examples: Calendar dates, temperature in Celsius or Fahrenheit
  – Operations: Mean, standard deviation, Pearson's correlation, t and F tests
• Ratio
  – Description: Differences and ratios are meaningful. (*, /)
  – Examples: Monetary quantities, counts, age, mass, length, electrical current
  – Operations: Geometric mean, harmonic mean, percent variation
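
As a rough illustration of matching a summary statistic to an attribute type, the sketch below uses Python's standard `statistics` module. All data values are made up for the example, and the numeric coding of grades (1 = good, 2 = better, 3 = best) is an assumption. The last part shows why temperature in Celsius or Fahrenheit is interval rather than ratio: differences stay equal after a change of unit, but ratios do not.

```python
import statistics

# Illustrative values only (not taken from the lecture).
flower_colors = ["red", "blue", "red", "white"]   # nominal
grades_ranked = [1, 3, 2, 2, 3]                   # ordinal: 1 = good, 2 = better, 3 = best
temps_celsius = [10.0, 20.0, 30.0]                # interval
masses_kg = [1.0, 4.0, 16.0]                      # ratio

most_common_color = statistics.mode(flower_colors)     # needs only equal / not equal
middle_grade = statistics.median_low(grades_ranked)    # needs an ordering
mean_temp = statistics.mean(temps_celsius)             # needs meaningful differences
geo_mean_mass = statistics.geometric_mean(masses_kg)   # needs meaningful ratios


def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


temps_fahrenheit = [celsius_to_fahrenheit(c) for c in temps_celsius]

# Equal differences in Celsius remain equal differences in Fahrenheit...
diffs_equal = (temps_fahrenheit[2] - temps_fahrenheit[1]
               == temps_fahrenheit[1] - temps_fahrenheit[0])

# ...but the ratio between two readings depends on the unit chosen.
ratio_celsius = temps_celsius[1] / temps_celsius[0]
ratio_fahrenheit = temps_fahrenheit[1] / temps_fahrenheit[0]
```

Here `ratio_celsius` and `ratio_fahrenheit` differ, so a statement such as "twice as hot" has no unit-independent meaning for these scales.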
Transformations for different types
• Nominal
  – Transformation: Any permutation of values
  – Comments: If all employee ID numbers were reassigned, would it make any difference?
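
One way to check the employee ID question numerically is sketched below. The ID values and the `relabel` mapping are hypothetical. Statistics that are valid for nominal data, such as the frequency profile and the entropy, are unchanged by a permutation of the labels, while an arithmetic quantity such as the sum changes, which suggests it carried no meaning in the first place.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical employee IDs and a permutation that reassigns every ID.
employee_ids = [101, 102, 101, 103, 101, 102]
relabel = {101: 103, 102: 101, 103: 102}
reassigned_ids = [relabel[v] for v in employee_ids]

# Nominal statistics depend only on how often each label occurs.
same_frequencies = (sorted(Counter(employee_ids).values())
                    == sorted(Counter(reassigned_ids).values()))
same_entropy = math.isclose(entropy(employee_ids), entropy(reassigned_ids))

# Arithmetic on the labels is not preserved by the permutation.
sum_changed = sum(employee_ids) != sum(reassigned_ids)
```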

