
# allreg.pdf - Statistical Regression Analysis 2 Chapter 1...



Statistical Regression Analysis July 27, 2016
Chapter 1

Probability Distributions, Estimation, and Testing

1.1 Introduction

Here we introduce probability distributions and basic estimation/testing methods. Random variables are outcomes of an experiment or data-generating process, where the outcome is not known in advance, although the set of possible outcomes is. Random variables can be discrete or continuous. Discrete random variables can take on only a finite or countably infinite set of possible outcomes. Continuous random variables can take on values along a continuum. In many cases, variables of one type may be treated as or reported as the other type.

In general, we will use upper-case letters (such as $Y$) to represent random variables, and lower-case letters (such as $y$) to represent specific outcomes. Not all books (particularly applied statistics books) follow this convention.

1.1.1 Discrete Random Variables/Probability Distributions

In many applications, the result of the data-generating process is the count of a number of events of some sort. In some cases, a certain number of trials are conducted, and the outcome of each trial is observed as a "Success" or "Failure" (binary outcomes). In these cases, the number of trials ending in Success is observed. Alternatively, a series of trials may be conducted until a pre-selected number of Successes are observed. In other settings, the number of events of interest may be counted in a fixed amount of time or space, without actually breaking the domain into a set of distinct trials.

For discrete random variables, we will use $p(y)$ to represent the probability that the random variable $Y$ takes on the value $y$. We require that all such probabilities be bounded between 0 and 1 (inclusive), and that they sum to 1:

$$P\{Y = y\} = p(y) \qquad 0 \le p(y) \le 1 \qquad \sum_{y} p(y) = 1$$
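As a small illustration of the two requirements on $p(y)$, the sketch below builds the distribution of the number of Successes in a fixed number of independent trials and checks it. The function names `binomial_pmf` and `is_valid_pmf`, the choice of 3 trials, and the success probability 1/2 are assumptions made for this example only; they do not come from the text. Exact fractions are used so the "sum to 1" check is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {y: p(y)} for the number of Successes in n independent trials,
    each ending in Success with probability p (illustrative helper)."""
    return {y: comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)}


def is_valid_pmf(pmf):
    """Check that every p(y) lies in [0, 1] and that the p(y) sum to 1."""
    in_range = all(0 <= prob <= 1 for prob in pmf.values())
    return in_range and sum(pmf.values()) == 1


# Hypothetical setting: 3 trials, success probability 1/2.
pmf = binomial_pmf(3, Fraction(1, 2))

for y, prob in pmf.items():
    print(f"p({y}) = {prob}")
print("valid pmf:", is_valid_pmf(pmf))
```

Passing a set of probabilities that does not sum to 1, such as `{0: Fraction(1, 2), 1: Fraction(1, 4)}`, makes `is_valid_pmf` return `False`.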
The cumulative distribution function is the probability that a random variable takes on a value less than or equal to a specific value $y^*$. It is a non-decreasing function that begins at 0 and increases to 1, and we will denote it as $F(y^*)$. For discrete random variables it is a step function, taking a step at each point where $p(y) > 0$:

$$F(y^*) = P(Y \le y^*) = \sum_{y \le y^*} p(y)$$

The mean or Expected Value ($\mu$) of a random variable is its long-run average if the experiment were conducted repeatedly ad infinitum. The variance ($\sigma^2$) is the average squared difference between the random variable and its mean, measuring the dispersion within the distribution. The standard deviation ($\sigma$) is the positive square root of the variance, and is in the same units as the data.

$$\mu_Y = E\{Y\} = \sum_{y} y \, p(y)$$

$$\sigma_Y^2 = V\{Y\} = E\left\{ (Y - \mu_Y)^2 \right\} = \sum_{y} (y - \mu_Y)^2 \, p(y)$$

$$\sigma_Y = +\sqrt{\sigma_Y^2}$$

Note that for any function $g(Y)$ of $Y$, the expected value and variance of the function are computed as follows:

$$E\{g(Y)\} = \sum_{y} g(y) \, p(y) = \mu_{g(Y)}$$

$$V\{g(Y)\} = E\left\{ \left( g(Y) - \mu_{g(Y)} \right)^2 \right\} = \sum_{y} \left( g(y) - \mu_{g(Y)} \right)^2 p(y)$$

For any constants a and b
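
To make the cumulative distribution function, mean, variance, and standard deviation formulas above concrete, here is a minimal Python sketch that evaluates them for one small distribution. The probabilities in `pmf` (the number of Successes in 3 trials with success probability 1/2) and the helper names `cdf`, `expected_value`, and `variance` are assumptions chosen for illustration, not part of the text. The optional argument `g` stands for the function $g(Y)$ in the last two formulas and defaults to the identity.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical pmf: number of Successes in 3 trials, success probability 1/2.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def cdf(pmf, y_star):
    """F(y*) = P(Y <= y*): sum p(y) over all outcomes y <= y*."""
    return sum(prob for y, prob in pmf.items() if y <= y_star)


def expected_value(pmf, g=lambda y: y):
    """E{g(Y)} = sum of g(y) p(y); with the default g this is the mean of Y."""
    return sum(g(y) * prob for y, prob in pmf.items())


def variance(pmf, g=lambda y: y):
    """V{g(Y)} = sum of (g(y) - mu_g)^2 p(y), where mu_g = E{g(Y)}."""
    mu_g = expected_value(pmf, g)
    return sum((g(y) - mu_g) ** 2 * prob for y, prob in pmf.items())


mu = expected_value(pmf)   # mean of Y
var = variance(pmf)        # variance of Y
sd = sqrt(var)             # standard deviation, in the same units as Y

print("F(1) =", cdf(pmf, 1))
print("mean =", mu, " variance =", var, " sd =", round(sd, 4))
print("E{Y^2} =", expected_value(pmf, lambda y: y**2))
```

For this particular `pmf`, the mean works out to 3/2 and the variance to 3/4, and `cdf` steps upward only at the outcomes 0, 1, 2, and 3, where $p(y) > 0$.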
