
September 16, 2009

Data Mining: Concepts and Techniques
Chapter 2

Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign
Simon Fraser University

© 2008 Jiawei Han, Micheline Kamber, and Jian Pei. All rights reserved.

Chapter 2: Getting to Know Your Data

  • Data Objects and Attribute Types
  • Basic Statistical Descriptions of Data
  • Data Visualization
  • Measuring Data Similarity and Dissimilarity
  • Summary

Types of Data Sets

  • Record
      • Relational records
      • Data matrix, e.g., numerical matrix, crosstabs
      • Document data: text documents, term-frequency vector
      • Transaction data
  • Graph and network
      • World Wide Web
      • Social or information networks
      • Molecular structures
  • Ordered
      • Video data: sequence of images
      • Temporal data: time-series
      • Sequential data: transaction sequences
      • Genetic sequence data
  • Spatial, image and multimedia
      • Spatial data: maps
      • Image data
      • Video data

Example: term-frequency vectors for three documents

              team  coach  play  ball  score  game  win  lost  timeout  season
  Document 1     3      0     5     0      2     6    0     2        0       2
  Document 2     0      0     7     0      2     1    0     0        3       0
  Document 3     0      1     0     0      1     2    2     0        3       0

Example: transaction data

  TID  Items
  1    Bread, Coke, Milk
  2    Beer, Bread
  3    Beer, Coke, Diaper, Milk
  4    Beer, Bread, Diaper, Milk
  5    Coke, Diaper, Milk

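As a minimal sketch of two of the record types listed above, the following Python turns a short text into a term-frequency vector and encodes the five example transactions as a binary item matrix. The sample sentence, the shortened vocabulary, and the helper names `term_frequency_vector` and `to_binary_matrix` are illustrative assumptions, not part of the slides.

```python
from collections import Counter


def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in `text` (hypothetical helper)."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def to_binary_matrix(transactions):
    """Encode each transaction as a row of 0/1 flags, one column per distinct item."""
    items = sorted({item for basket in transactions.values() for item in basket})
    rows = {
        tid: [int(item in basket) for item in items]
        for tid, basket in transactions.items()
    }
    return items, rows


# Made-up vocabulary and sentence, for illustration only.
vocabulary = ["team", "coach", "play", "ball", "score", "game"]
doc_vector = term_frequency_vector("team play team game play game game", vocabulary)

# The transaction table from the slide, keyed by TID.
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}
items, binary_rows = to_binary_matrix(transactions)
```

Both results are plain lists of numbers, which is why document data and transaction data can be treated as rows of a data matrix once a column order (the vocabulary or the item list) is fixed.
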
Important Characteristics of Structured Data

  • Dimensionality
      • Curse of dimensionality
  • Sparsity
      • Only presence counts
  • Resolution
      • Patterns depend on the scale
  • Distribution
      • Centrality and dispersion

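Three of these characteristics can be measured directly on a small data set. The sketch below is one possible reading, assuming that sparsity is taken as the fraction of zero entries and that centrality and dispersion are summarized by the mean, median, and population standard deviation. The matrix, the value list, and the `sparsity` helper are made up for illustration.

```python
import statistics


def sparsity(matrix):
    """Return the fraction of entries in `matrix` that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


# Hypothetical binary data matrix: 3 objects, 5 attributes.
matrix = [
    [0, 1, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]
dimensionality = len(matrix[0])   # number of attributes per object
zero_fraction = sparsity(matrix)  # 9 zeros out of 15 entries

# Hypothetical numeric attribute values.
values = [2, 4, 4, 5, 10]
center_mean = statistics.mean(values)
center_median = statistics.median(values)
spread = statistics.pstdev(values)  # population standard deviation
```
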
Data Objects

  • Data sets are made up of data objects.
  • A data object represents an entity.
  • Examples:
      • Sales database: customers, store items, sales
      • Medical database: patients, treatments
      • University database: students, professors, courses
  • Also called samples, examples, instances, data points, objects, tuples.
  • Data objects are described by attributes.
  • Database rows -> data objects; columns -> attributes.

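The row/column correspondence can be sketched in a few lines of Python. The `Student` class, its fields, and the two sample records are hypothetical, chosen only to mirror the university database example above.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:
    """One data object (a row); each field is an attribute (a column)."""
    student_id: str
    name: str
    gpa: float


# A tiny data set: each element is one data object. Values are made up.
students = [
    Student("S001", "Ada", 3.9),
    Student("S002", "Lin", 3.4),
]

# Attributes are shared by every object in the data set.
attribute_names = [field.name for field in fields(Student)]

# Reading one attribute across all objects gives a column.
name_column = [student.name for student in students]
```
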
Attributes

  • Attribute (or dimension, feature, variable): a data field representing a characteristic or feature of a data object.
      • E.g., customer_ID, name, address
  • Types:
      • Nominal
      • Binary
      • Numeric: quantitative
          • Interval-scaled
          • Ratio-scaled

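One way to make the type list concrete is to tag each attribute of a schema with its type. In the sketch below, `customer_ID`, `name`, and `address` come from the slide; the attributes `is_member`, `temperature_celsius`, and `purchase_amount`, the `AttributeType` enum, and the helper functions are hypothetical. The temperature comparison illustrates the usual distinction between the two numeric types: differences are meaningful on an interval scale, while ratios need a true zero point.

```python
from enum import Enum


class AttributeType(Enum):
    NOMINAL = "nominal"
    BINARY = "binary"
    INTERVAL_SCALED = "interval-scaled"
    RATIO_SCALED = "ratio-scaled"


NUMERIC_TYPES = {AttributeType.INTERVAL_SCALED, AttributeType.RATIO_SCALED}


def is_numeric(attribute_type):
    """Numeric (quantitative) attributes are interval-scaled or ratio-scaled."""
    return attribute_type in NUMERIC_TYPES


# First three names are from the slide; the rest are hypothetical.
schema = {
    "customer_ID": AttributeType.NOMINAL,
    "name": AttributeType.NOMINAL,
    "address": AttributeType.NOMINAL,
    "is_member": AttributeType.BINARY,
    "temperature_celsius": AttributeType.INTERVAL_SCALED,
    "purchase_amount": AttributeType.RATIO_SCALED,
}
numeric_attributes = [name for name, kind in schema.items() if is_numeric(kind)]


def celsius_to_kelvin(celsius):
    return celsius + 273.15


warm, cool = 20.0, 10.0
# The difference is the same in Celsius and Kelvin (up to rounding error).
difference_celsius = warm - cool
difference_kelvin = celsius_to_kelvin(warm) - celsius_to_kelvin(cool)
# The ratio depends on where zero sits, so it is not meaningful for Celsius.
ratio_celsius = warm / cool
ratio_kelvin = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)
```
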
Attribute Types

  • Nominal: categories, states, or “names of things”
      • Hair_color = {black, brown, blond, red, auburn, grey, white}
      • marital status, occupation, ID numbers, zip codes
  • Binary

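Because nominal values are only names, the operations that make sense on them are equality tests and counting. The following sketch, with made-up observations and zip codes, counts hair colors to find the most frequent one and stores zip codes as strings so that they are treated as labels and never as quantities.

```python
from collections import Counter

# The value set from the slide.
HAIR_COLORS = {"black", "brown", "blond", "red", "auburn", "grey", "white"}

# Made-up observations, for illustration only.
observed = ["brown", "black", "brown", "red", "brown", "black"]
all_valid = all(color in HAIR_COLORS for color in observed)

# Counting is meaningful for nominal data; averaging is not.
counts = Counter(observed)
mode_color, mode_count = counts.most_common(1)[0]

# Zip codes look numeric but are nominal: keep them as strings so
# leading zeros survive and no arithmetic is applied by accident.
zip_codes = ["02139", "61801", "02139"]
distinct_zips = sorted(set(zip_codes))
```
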