

CS480 Principles of Data Management, Spring 2013 (2/19/13)
Sangmi Lee Pallickara, CS480, Spring 2012

Examples
•  Data Cleaning is a set of tasks
   –  ...y conversions
   –  Constraint verification
   –  De-duplication
   –  Data fusion
   –  And many others

•  Fraudulent bank customer
   Email: john@doe.com, john@doe.com, john@doe.com

Duplicates
•  Intra-source duplicates
   –  Data about a single entity is entered multiple times into the same database
•  Inter-source duplicates
   –  Data with different representations
   –  When integrating multiple data sources
•  Methods to detect them are the same
•  Treatments of duplicates are different

Intra-source duplicates
•  Poor online duplicate detection methods HELLO H...
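
The slides describe duplicates as data about a single entity entered multiple times into the same database, as with the repeated john@doe.com email. A minimal sketch of one way to flag such duplicates follows, assuming records are dictionaries with an `email` field. The field names, the sample rows, and the `normalize_email` and `find_duplicates` helpers are hypothetical and do not come from the slides.

```python
from collections import defaultdict


def normalize_email(email):
    # Assumption: trimming whitespace and lowercasing is enough to compare emails.
    return email.strip().lower()


def find_duplicates(records, key="email"):
    """Group records by normalized key; return only groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_email(record[key])].append(record)
    return {k: v for k, v in groups.items() if len(v) > 1}


# Hypothetical sample rows; only the email value john@doe.com appears in the slides.
customers = [
    {"id": 1, "email": "john@doe.com"},
    {"id": 2, "email": "John@Doe.com "},
    {"id": 3, "email": "john@doe.com"},
    {"id": 4, "email": "jane@example.com"},
]

duplicates = find_duplicates(customers)
for email, rows in duplicates.items():
    print(email, [r["id"] for r in rows])
```

This sketch only matches records whose keys are identical after simple normalization. Records whose representations differ beyond case and whitespace would need a similarity measure, which it does not attempt.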

This note was uploaded on 02/11/2014 for the course CS 480 taught by Professor Staff during the Spring '08 term at Colorado State.
