# DescriptiveStatistics_Chapter1 - EGN 3443 Probability and...


EGN 3443: Probability and Statistics for Engineers
Shikhar Acharya
Velocity, Variety, and Volume of Data
Data Science
- The discovery of new information, in the form of patterns or rules, from vast amounts of data.
- The process of finding interesting structure in data.
- The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data.
Examples…
- 30% of Amazon's total sales are generated by its recommendation engine ("you may also like").
- Insurance firms can now monitor their customers' driving styles and offer rates based on competence (or recklessness) rather than age and gender.
- Google targets advertisements based on search history.
- Political campaigns collect and analyze voter information to target potential voters who are more likely to vote for them.
- One vacation resort drastically cut labor costs by syncing its scheduling process with information from the National Weather Service.
- Hospitals analyze medical data and patient records to predict which patients are likely to be readmitted within a few months of discharge. The hospital can then intervene in hopes of preventing another costly hospital stay.
- Credit card companies use big data for fraud detection.
One retailer (Walmart?) discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. The manager of a grocery chain could use this newly discovered information in various ways to increase revenue. For example, the chain could move the beer display closer to the diaper display, and it could make sure beer and diapers were sold at full price on Thursdays.
Data Scientist
- Data scientists can find answers to important questions from many sources of unstructured information.
- As companies rush to capitalize on the potential of big data, the largest constraint many face is the scarcity of this special talent.

[Figure: Skill Set of a Data Scientist (source: Drew Conway)]
Statistics: Two Processes
- Describing sets of data
- Drawing conclusions (making inferences, estimates, decisions, predictions, etc. based on sampled data)
Descriptive Statistics
- Involves: collecting data, presenting data, summarizing data
- Purpose: describe data
- Example summary: X̄ = 30, S² = 100

[Figure: chart of dollar values ($) by quarter, Q1 to Q4]
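
To make "summarizing data" concrete, here is a minimal sketch that computes a sample mean, sample variance, and quartiles with Python's standard `statistics` module. The five data values are made up for illustration and are not taken from the slides; they do not reproduce the slide's summary values.

```python
import statistics

# Hypothetical sample of five measurements (illustrative values only).
sample = [20, 25, 30, 35, 40]

# Sample mean: the sum of the values divided by the sample size.
mean = statistics.mean(sample)

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = statistics.variance(sample)

# Sample standard deviation: the square root of the sample variance.
std_dev = statistics.stdev(sample)

# Quartile cut points; the middle one (q2) is the median.
q1, q2, q3 = statistics.quantiles(sample, n=4)

print("mean:", mean)
print("variance:", variance)
print("standard deviation:", std_dev)
print("quartile cut points:", q1, q2, q3)
```

`statistics.variance` uses the n - 1 denominator, which is the usual convention for a sample; `statistics.pvariance` would be the choice if the data were the entire population.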
Inferential Statistics
- Involves: estimation, hypothesis testing
- Purpose: make decisions about population characteristics
Population: a well-defined collection of objects that we want to investigate.
Sample: a subset of the population.
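
The distinction between a population and a sample can be sketched in a few lines of Python. Everything here is an assumption for illustration: the population is simulated (1,000 hypothetical part lengths in millimetres), and the sample size of 50 is arbitrary. The sample mean then serves as a point estimate of the population mean.

```python
import random
import statistics

# Fixed seed so the run is repeatable.
rng = random.Random(42)

# Hypothetical population: 1,000 simulated part lengths in mm.
population = [rng.gauss(50, 5) for _ in range(1000)]

# Sample: a subset of the population, drawn at random without replacement.
sample = rng.sample(population, 50)

# In practice the population mean is usually unknown; it is available here
# only because the population was simulated.
population_mean = statistics.mean(population)

# The sample mean is a point estimate of the population mean.
sample_mean = statistics.mean(sample)

print("population mean:", round(population_mean, 2))
print("sample mean:", round(sample_mean, 2))
```

In a real study only the sample is observed, which is why inference from the sample to the population is needed at all.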
Obtaining Data
- Published source
- Observational study
- Survey
- Designed experiment