6. Back to the Histogram.
Is this a good summary? No. Is there any tip-off in the histogram that this data is a time series? Nothing. Can mix up dates (Jan
Connect dots: Lots of variation! No time structure! If reordered -- same. Bunch of noise. No sequential dependence. In the future -- same. This graph's future is determined.
Where is the evidence that the stock went up? More positives than negatives? Histogram of the relative changes.
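A minimal sketch of the relative changes above, assuming a short made-up price series (the numbers and the helper names `relative_changes`, `rebuild_prices`, and `bin_counts` are hypothetical, not from the course data). It counts positives vs. negatives, shows that the binned counts do not depend on order, and checks that the prices can be rebuilt from the starting price and the returns.

```python
import math

# Hypothetical closing prices, made up for illustration only.
prices = [100.0, 102.0, 99.96, 101.9592, 104.0]

def relative_changes(series):
    """Return (p[t] - p[t-1]) / p[t-1] for each consecutive pair."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def rebuild_prices(start, returns):
    """Rebuild the price series from the starting price and the returns."""
    out = [start]
    for r in returns:
        out.append(out[-1] * (1.0 + r))
    return out

def bin_counts(values, width=0.01):
    """Count values per bin of the given width, as a histogram would."""
    counts = {}
    for v in values:
        key = math.floor(v / width)
        counts[key] = counts.get(key, 0) + 1
    return counts

returns = relative_changes(prices)
rebuilt = rebuild_prices(prices[0], returns)

# Evidence the stock went up: more positive changes than negative ones.
positives = sum(1 for r in returns if r > 0)
negatives = sum(1 for r in returns if r < 0)

# A histogram ignores order: reversing the returns gives the same counts.
same_histogram = bin_counts(returns) == bin_counts(list(reversed(returns)))
```

The order of the returns matters for rebuilding the path of prices, but not for the histogram.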
Transformed data: price return data. Price is recoverable from the returns. Histogram is very normal. How do we describe this dataset? Two numbers (mean and standard deviation).

Case study: Executive Compensation (page 142)
Open CEO COMP 03. Clear row states. Go over columns -- 26! Lots of info about these gentlemen. Total compensation sometimes equal to salary, sometimes not. Why?
Why look at this data? Maybe someone will be a CEO one day. Or maybe you're a historian and want to analyze it. Maybe you want to see how these compare to regular workers.
In this format -- useless! Need to summarize. Describe mean/median. Different. Lots of white space: bad. Empirical rule: bad. Need lots of summary statistics to describe it. Very complex dataset!
What if there are just a couple of outliers, like the Yankees?
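A small sketch of why mean and median disagree here and why the empirical rule fails, assuming ten made-up compensation values in $K (hypothetical, not taken from CEO COMP 03). The helper `share_within` is also an illustration name, not a JMP function.

```python
import statistics

# Hypothetical total compensation in $K; one extreme value plays the outlier.
comp = [400, 550, 700, 800, 950, 1100, 1300, 1800, 2500, 40000]

mean = statistics.mean(comp)
median = statistics.median(comp)
sd = statistics.stdev(comp)  # sample standard deviation

def share_within(values, center, spread, k):
    """Fraction of values within k spreads of the center."""
    lo, hi = center - k * spread, center + k * spread
    return sum(1 for v in values if lo <= v <= hi) / len(values)

# The empirical rule would predict about 68% within one standard deviation.
within_1sd = share_within(comp, mean, sd, 1)

# Drop the single largest value and summarize again.
trimmed = sorted(comp)[:-1]
mean_without = statistics.mean(trimmed)
median_without = statistics.median(trimmed)
```

In this made-up sample the mean is 5010 and the median is 1025, and 90% of the values fall within one standard deviation of the mean. Removing the largest value moves the mean to about 1122 while the median only moves to 950.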
Using the Lasso tool, we can select the outliers, then Rows->Exclude, Rows->Hide, Distributions->Script->Redo Analysis. Didn't help much. You can say -- take out the lasso again! Still the same. Takes a lot of description.
How do I compare between sectors? e.g. financial sector vs telecommunication sector? Histograms skewed. Means skewed by high values. Medians -- lose information, like Yankees.
Salaries change by percentage amounts. You hear salary increase of 3%, not $1000. Need to transform multiplicative increase into additive increase.
Logarithms! Stretch the main part and squash the tail.

10K   100K   1000K   10,000K
4     5      6       7         (log10; tracks the number of figures)

He makes a 6-figure salary.
Histogram of LogTotalCompensation:
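Before reading that histogram, a minimal sketch of the log transform behind it, assuming base-10 logs of the dollar amounts. The four levels are the ones listed in the notes. The two salaries passed to `log_step` and the function name itself are hypothetical.

```python
import math

# Compensation levels from the notes, in dollars: 10K, 100K, 1000K, 10,000K.
levels = [10_000, 100_000, 1_000_000, 10_000_000]
logs = [math.log10(x) for x in levels]  # approximately 4, 5, 6, 7

def log_step(salary, pct):
    """Change in log10(salary) after a raise of pct percent."""
    return math.log10(salary * (1 + pct / 100)) - math.log10(salary)

# Hypothetical salaries: a 3% raise is a very different dollar amount
# for each, but the same additive step on the log scale.
small = log_step(50_000, 3)
large = log_step(5_000_000, 3)
```

Each multiplication by 10 adds 1 on the log scale, and a 3% raise adds log10(1.03), about 0.0128, whatever the starting salary.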