
# Documents - Chapter 1 Reading Exercise Chapter 2 Data...


Chapter 1: Reading Exercise

Chapter 2: Data Analysis (Variables and Describing/Summarizing Data Distributions)

- Data is information about n subjects.
- Analysis means understanding the information in a way that is useful for decision making. Sometimes the analysis has a well-defined objective.
- Information is organized in the format of subjects and variables.
  - A set of subjects makes up a sample (or population), and the attributes or characteristics of these subjects are the variables.

Note that a population is the collection of all subjects related to the data of interest, whereas a sample includes only some subjects of the population. For example, the Census includes all US citizens and legal immigrants (about 310 million), but the CPS, the Current Population Survey, includes only a sample of the current population (fewer than a million).

There are 2 types of variables: qualitative (categorical) and quantitative (numerical).

A qualitative variable takes non-numeric values (categories) as its observations. Ex: Major, Gender, Ethnicity, Student status, Buy/Sell status of a stock. No algebra can be done on a qualitative variable, except counting. There are two types of qualitative variables:

1. Nominal (categories have no order/ranking). Ex: Major, Gender, Race…
2. Ordinal (categories have an order/ranking). Ex: Student status (freshman, sophomore …), Earned education (HS, Bachelor, Master, …)

A quantitative variable takes numbers as values. Ex: GPA, SAT scores, # of mistakes, monthly unemployment rate, stock prices.

When small data (small n and few variables) is presented, it may not be difficult to understand the information content. However, when large data is presented, how do we understand or explain the information content?

- Always eyeball it top to bottom and left to right! Does it help? Maybe not!

What is information? In stats/probability, it may be understood as a distribution.
The distribution of a variable breaks down the information content into:

- the different categories/values this variable takes
- the counts/frequencies/relative frequencies of these values

Note:

1. A distribution helps in understanding the pattern of variation in the information.
2. Relative frequencies are objective probabilities.
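As a rough illustration of this definition, the Python sketch below computes the distribution of a qualitative variable: the categories it takes, their counts, and their relative frequencies. The `majors` list and the `distribution` helper are made up for illustration; they are not part of the course data.

```python
from collections import Counter


def distribution(values):
    """Return {category: (count, relative frequency)} for a list of observations."""
    counts = Counter(values)
    n = len(values)
    return {category: (count, count / n) for category, count in counts.items()}


# Hypothetical sample of n = 8 students' majors (illustrative data only).
majors = ["Econ", "Math", "Econ", "Bio", "Econ", "Math", "Bio", "Econ"]

dist = distribution(majors)
for category, (count, rel_freq) in sorted(dist.items()):
    print(f"{category:5s} count={count} relative frequency={rel_freq:.3f}")
```

The relative frequencies sum to 1, which is what allows them to be read as probabilities.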

Section 2.1 (Not part of syllabus)

How do we break down information in a qualitative variable?

- By graphs and tables. These tools must show categories and counts.

Example: Top 40 (by 2008 salary) CEOs

| CEO | Company | Industry | Degree | Salary (\$millions) | Age |
|---|---|---|---|---|---|
| Hugh Grant | Monsanto | Mfg | Bachelor's | 35.78 | 50 |
| Matthew K Rose | Burlington Santa Fe | Service | Bachelor's | 36.52 | 49 |
| Robert J Stevens | Lockheed Martin | Mfg | Masters | 36.56 | 56 |
| Paul J Evanson | Allegheny Energy | Energy | Law | 37.29 | 66 |
| Richard C Adkerson | Freeport Copper | Mfg | MBA | 38.66 | 61 |
| James L Dolan | Cablevision | Service | Bachelor's | 38.81 | 53 |
| Brian L Roberts | Comcast | Service | Bachelor's | 38.98 | 48 |
| Michael D Watford | Ultra Petroleum | Energy | MBA | 40.64 | 54 |
| Howard Solomon | Forest Labs | Pharm | Law | 40.89 | 80 |
| J Willard Marriott Jr | Marriott International | Service | Bachelor's | 44.09 | 76 |
| Miles D White | Abbott Laboratories | Pharm | MBA | 44.76 | 53 |
| John H Hammergren | McKesson | Service | MBA | 44.91 | 49 |
| Albert L Lord | SLM | Finance | Bachelor's | 45.99 | 62 |
| Sol J Barer | Celgene | Pharm | PhD | 46.07 | |
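One way to produce such a table and graph for the Industry variable is sketched below. It is only a sketch: it uses the 14 rows listed above (the rest of the Top 40 list is not shown), and the text bar chart is just one possible presentation, not necessarily the one used in the course.

```python
from collections import Counter

# Industry column of the 14 CEOs listed above, in the order they appear.
# The remaining rows of the Top 40 list are not available here.
industries = [
    "Mfg", "Service", "Mfg", "Energy", "Mfg", "Service", "Service",
    "Energy", "Pharm", "Service", "Pharm", "Service", "Finance", "Pharm",
]

counts = Counter(industries)

# Frequency table with a simple text bar chart: category, count, one '#' per CEO.
for industry, count in counts.most_common():
    print(f"{industry:8s} {count:2d} {'#' * count}")
```

Both the table and the bars show the two things the tools must show: the categories and their counts.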