
ECF230-FPD-1-2018-2.ppt - ECF 230: Introduction to...


ECF 230: Introduction to Econometrics
Chapter 1: Introduction
By: Damodar N. Gujarati
Ms Mwafulirwa Jane
School of Business, University of Lusaka, 2018

WHAT IS ECONOMETRICS?
• Econometrics literally means “economic measurement”. The scope of econometrics is much broader, however, as can be seen from the following definitions:
• “Consists of the application of mathematical statistics to economic data to lend empirical support to the models constructed by mathematical economics and to obtain numerical results” (Gerhard Tintner 1968).
• “Econometrics may be defined as the social science in which the tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic phenomena” (Goldberger 1964).
• “Econometrics is concerned with the empirical determination of economic laws” (Theil 1971).

METHODOLOGY OF ECONOMETRICS
• Broadly speaking, traditional econometric methodology proceeds along the following lines:
1. Statement of theory or hypothesis
2. Specification of the mathematical model of the theory
3. Specification of the statistical, or econometric, model
4. Collecting the data
5. Estimation of the parameters of the econometric model
6. Hypothesis testing
7. Forecasting or prediction
8. Using the model for control or policy purposes
• To illustrate the preceding steps, let us consider the well-known Keynesian theory of consumption.

1. Statement of Theory or Hypothesis
• Keynes states that, on average, consumers increase their consumption as their income increases, but not by as much as the increase in their income. That is, the marginal propensity to consume (MPC) satisfies 0 < MPC < 1.

2. Specification of the Mathematical Model of Consumption (single-equation model)
$Y = \alpha + \beta X, \quad 0 < \beta < 1$ (I.3.1)
Y = consumption expenditure (dependent variable)
X = income (independent, or explanatory, variable)
α = the intercept (constant)
β = the slope coefficient (gradient)
• The slope coefficient β measures the marginal propensity to consume.
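As a minimal sketch of the mathematical model (I.3.1), the snippet below evaluates the deterministic consumption function and recovers the MPC as the change in consumption per extra unit of income. The function name `consumption` and the values α = 100 and β = 0.8 are hypothetical, chosen only for illustration; they do not come from the course data.

```python
def consumption(income, alpha, beta):
    """Deterministic Keynesian consumption function Y = alpha + beta * X."""
    if not 0 < beta < 1:
        # The theory requires 0 < MPC < 1.
        raise ValueError("beta (the MPC) must lie strictly between 0 and 1")
    return alpha + beta * income

# Hypothetical parameter values, for illustration only.
alpha, beta = 100.0, 0.8

c_before = consumption(1000.0, alpha, beta)
c_after = consumption(1001.0, alpha, beta)

# The MPC is the change in consumption when income rises by one unit.
mpc = c_after - c_before
print(f"Consumption at X=1000: {c_before:.2f}")
print(f"Change in consumption for one extra unit of income: {mpc:.2f}")
```

Because the model is a straight line, the one-unit change in consumption equals β at every income level.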
• The specific form of this mathematical model is linear.

THE MEANING OF THE TERM LINEAR
• Linearity in the Variables
• The first meaning of linearity is that the conditional expectation of Y is a linear function of $X_i$; the regression curve in this case is a straight line. By contrast,
• $E(Y \mid X_i) = \beta_1 + \beta_2 X_i^2$ is not a linear function of $X_i$, because X appears with a power of 2.
• Linearity in the Parameters
• The second interpretation of linearity is that the conditional expectation of Y, $E(Y \mid X_i)$, is a linear function of the parameters, the β’s; it may or may not be linear in the variable X.
• $E(Y \mid X_i) = \beta_1 + \beta_2 X_i^2$ is a linear (in the parameters) regression model. All the models shown in Figure 2.3 are thus linear regression models, that is, models linear in the parameters.
• Now consider the model:
• $E(Y \mid X_i) = \beta_1 + \beta_2^2 X_i$
• The preceding model is an example of a nonlinear (in the parameters) regression model.
• From now on, the term “linear” regression will always mean a regression that is linear in the parameters, the β’s (that is, the parameters are raised to the first power only).
• Geometrically, the MPC is the slope of the consumption line: it shows by how much consumption expenditure changes when income increases by one unit.

3. Specification of the Econometric Model of Consumption
• The relationships between economic variables are generally inexact. In addition to income, other variables affect consumption expenditure. For example, size of family, individual preferences, environment, family religion, etc., are likely to exert some influence on consumption.
• To allow for the inexact relationships between economic variables, (I.3.1) is modified as follows:
$Y = \alpha + \beta X + u$ (I.3.2)
where u, known as the disturbance, or error, term, is a random (stochastic) variable that has well-defined probabilistic properties. The disturbance term u may well represent all those factors that affect consumption but are not taken into account explicitly.
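The econometric model (I.3.2) can be illustrated with a small simulation: each observed Y is the deterministic part α + βX plus a random draw u. This is only a sketch. The helper `simulate_consumption`, the parameter values, and the choice of a normal distribution with mean zero for u are all assumptions made for illustration; the slides say only that u has well-defined probabilistic properties.

```python
import random

def simulate_consumption(incomes, alpha, beta, sigma, seed=0):
    """Draw (X, Y, u) triples from Y = alpha + beta * X + u.

    u is drawn from a normal distribution with mean 0 and standard
    deviation sigma (an illustrative assumption, not from the slides).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rows = []
    for x in incomes:
        u = rng.gauss(0.0, sigma)
        rows.append((x, alpha + beta * x + u, u))
    return rows

# Hypothetical incomes and parameters.
incomes = [1000.0 + 10.0 * i for i in range(1000)]
rows = simulate_consumption(incomes, alpha=100.0, beta=0.8, sigma=10.0)

# The disturbances scatter observations around the line but average out.
mean_u = sum(u for _, _, u in rows) / len(rows)
# Each Y differs from the deterministic part exactly by its own u.
max_gap = max(abs(y - (100.0 + 0.8 * x) - u) for x, y, u in rows)

print(f"Average disturbance over {len(rows)} draws: {mean_u:.3f}")
```

Individual observations lie off the line α + βX, yet the average disturbance is close to zero, which is the sense in which u captures unsystematic influences.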
• Therefore, the econometric model comprises the mathematical model plus the random component.

THE SIGNIFICANCE OF THE STOCHASTIC DISTURBANCE TERM
• The disturbance term $u_i$ is a surrogate for all those variables that are omitted from the model but that collectively affect Y. Why don’t we introduce them into the model explicitly? The reasons are many:
• 1. Vagueness of theory: The theory, if any, determining the behavior of Y may be, and often is, incomplete. We might be ignorant or unsure about the other variables affecting Y.
• 2. Unavailability of data: We may lack quantitative information about these variables; e.g., information on family wealth is generally not available.
• 3. Core variables versus peripheral variables: Assume that besides income $X_1$, the number of children per family $X_2$, sex $X_3$, religion $X_4$, education $X_5$, and geographical region $X_6$ also affect consumption expenditure. The joint influence of all or some of these variables may be so small that it does not pay to introduce them into the model explicitly. One hopes that their combined effect can be treated as a random variable $u_i$.
• 4. Intrinsic randomness in human behavior: Even if we succeed in introducing all the relevant variables into the model, there is bound to be some “intrinsic” randomness in individual Y’s that cannot be explained no matter how hard we try. The disturbances, the u’s, may very well reflect this intrinsic randomness.
• 5. Poor proxy variables: For example, Friedman regards permanent consumption ($Y^p$) as a function of permanent income ($X^p$). Since these variables are not directly observable, in practice we use proxy variables, such as current consumption (Y) and current income (X). This introduces errors of measurement, and u may then also represent those errors.
• 6. Principle of parsimony: We would like to keep our regression model as simple as possible.
If we can explain the behavior of Y “substantially” with two or three explanatory variables, and if our theory is not strong enough to suggest what other variables might be included, why introduce more variables? Let $u_i$ represent all other variables.
• 7. Wrong functional form: Often we do not know the form of the functional relationship between the regressand (dependent variable) and the regressors. Is consumption expenditure a linear (in variable) function of income or a nonlinear (in variable) function? If it is the former,
• $Y_i = \beta_1 + \beta_2 X_i + u_i$ is the proper functional relationship between Y and X, but if it is the latter,
• $Y_i = \beta_1 + \beta_2 X_i + \beta_3 X_i^2 + u_i$ may be the correct functional form.
• In two-variable models the functional form of the relationship can often be judged from the scattergram. But in a multiple regression model it is not easy to determine the appropriate functional form, because we cannot visualize scattergrams in multiple dimensions.

4. Obtaining Data
• To obtain the numerical values of $\beta_1$ and $\beta_2$, we need data. Look at Table I.1, which relates to personal consumption expenditure (PCE) and gross domestic product (GDP). The data are in “real” terms and are plotted in Figure I.3.

Basic Data Types Used in Econometrics
There are three types of data that econometricians might use for analysis:
1. Time series data
• A time series is a set of observations on the values that a variable takes at different times. Such data may be collected at regular time intervals, such as daily (e.g., stock prices, weather reports), weekly (e.g., money supply figures), monthly (e.g., the unemployment rate, the Consumer Price Index (CPI)), quarterly (e.g., GDP), and annually (e.g., government budgets).
2. Cross-sectional data
• Cross-sectional data are data on one or more variables collected at the same point in time, such as the census of population conducted by the Central Statistics Organisation (CSO) every 10 years (the latest being in 2010), the basic food basket by JCTR, and opinion polls by MUVI TV and the Post Newspaper.
3. Panel data, a combination of 1 and 2
• A panel uses both time series and cross-sectional data elements. The table below is an example of panel data.

Hypothetical Egg Production in Zambia: 2010-2011
Y1 = eggs produced in 2010 (millions)
Y2 = eggs produced in 2011 (millions)
X1 = price per tray (ZMK) in 2010
X2 = price per tray (ZMK) in 2011

Province      Y1    Y2    X1     X2
LUSAKA        100   150   20     22
COPPERBELT    200   280   19     21
CENTRAL       60    62    18     20
NORTHERN      50    53    18.5   19.5
SOUTHERN      44    48    18     21
MUCHINGA      10    28    19     20
N.WESTERN     65    90    19.5   20.5
WESTERN       30    35    20     21
EASTERN       55    65    18     22
LUAPULA       60    78    19     21

• For each year we have 10 cross-sectional observations (the provinces), and for each province we have two time series observations on output and prices of eggs, a total of 20 (combined) observations.
• Note that a panel is a special case of pooled data. Pooled data combine cross sections and time series; the cross sections may differ from period to period, whereas a panel uses the same cross-sectional units over the whole period considered.

5. Estimation of the Econometric Model
• Regression analysis is the main tool used to obtain the estimates. Using this technique and the data given in Table I.1, we obtain the following estimates of α and β, namely, −1208 and 0.67. Thus, the estimated consumption function is:
$\hat{Y}_i = -1208 + 0.67 X_i$ (I.3.3)
• The slope coefficient (i.e., the MPC) was about 0.67: an increase in real income of 1 dollar led, on average, to an increase of about 67 cents in real consumption.
• The estimated equation can be shown on a graph as a regression line.

6. Hypothesis Testing
• Hypothesis testing is to find out whether the estimates obtained in Eq. (I.3.3) are in accord with the expectations of the theory that is being tested.
• Keynes expected the MPC to be positive. Thus we could state our hypotheses as
$H_0: \beta \le 0$
$H_1: \beta > 0$
• Based on the estimation done in the previous step, the MPC is 0.67. Hypothesis testing entails determining whether 0.67 is statistically greater than zero, even though it is obviously positive at face value.
• Alternatively, the hypotheses of interest could be
$H_0: \beta \ge 1$
$H_1: \beta < 1$
which is also in line with Keynes’s theory of consumption. In this case, we would want to establish whether 0.67 is statistically less than 1.
• Such confirmation or refutation of economic theories on the basis of sample evidence is based on a branch of statistical theory known as statistical inference (hypothesis testing).

7. Forecasting or Prediction
• To illustrate, suppose we want to predict the mean consumption expenditure for 2017. If the GDP value for 2017 was 7269.8 billion dollars, predicted consumption would be:
$\hat{Y}_{2017} = -1208 + 0.67(7269.8) = 3662.766$ (I.3.4)
• The actual value of consumption expenditure reported in 2017 was 4913.5 billion dollars. The estimated model (I.3.3) thus under-predicted actual consumption expenditure by about 1250.73 billion dollars. We could say the forecast error is about 1250.73 billion dollars, which is represented by the random, or stochastic, term.
• Suppose that, as a result of a proposed policy change, investment expenditure increases. What will be the effect on the economy? As macroeconomic theory shows, the change in income following a dollar’s worth of change in investment expenditure is given by the income multiplier M, which is defined as:
$M = \dfrac{1}{1 - \text{MPC}}$ (I.3.5)
• With MPC = 0.67, the multiplier is about M = 3.03. That is, an increase (decrease) of a dollar in investment will eventually lead to more than a threefold increase (decrease) in income; note that it takes time for the multiplier to work. The critical value in this computation is MPC.
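The forecasting and multiplier arithmetic above can be checked with a few lines of Python. The numbers are the ones quoted in the slides; the variable names are illustrative.

```python
# Estimated consumption function (I.3.3): Y_hat = -1208 + 0.67 * X
alpha_hat = -1208.0
beta_hat = 0.67  # estimated MPC

gdp_2017 = 7269.8            # billion dollars, as given in the slides
actual_consumption = 4913.5  # billion dollars, as given in the slides

# Forecast (I.3.4) and forecast error (actual minus predicted).
forecast = alpha_hat + beta_hat * gdp_2017
forecast_error = actual_consumption - forecast

# Income multiplier (I.3.5): M = 1 / (1 - MPC).
multiplier = 1.0 / (1.0 - beta_hat)

print(f"Forecast consumption: {forecast:.3f}")
print(f"Forecast error:       {forecast_error:.3f}")
print(f"Income multiplier:    {multiplier:.2f}")
```

A positive forecast error means the model under-predicted: actual consumption was higher than the fitted line implies.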
Thus, a quantitative estimate of MPC provides valuable information for policy purposes. Knowing MPC, one can predict the future course of income and consumption expenditure.

8. Use of the Model for Control or Policy Purposes
• Suppose we have the estimated consumption function given in (I.3.3). Suppose further that the government believes that consumer expenditure of about 4900 (billion dollars) will keep the unemployment rate at its current level of about 4.2%. What level of income will guarantee the target amount of consumption expenditure?
• If the regression results given in (I.3.3) seem reasonable, simple arithmetic will show that:
$4900 = -1208 + 0.67X$ (I.3.6)
which gives X = (4900 + 1208)/0.67 = 9116.42, approximately.
• That is, an income level of about 9116.42 billion dollars, given an MPC of about 0.67, will produce an expenditure of about 4900 billion dollars. Thus the government needs to ensure an income level of about 9116.42 billion dollars if it is to keep consumption at 4900 billion dollars for the given MPC.
• As these calculations suggest, an estimated model may be used for control, or policy, purposes. By an appropriate mix of fiscal and monetary policy, the government can manipulate the control variable X to produce the desired level of the target variable Y.

Summary
• Figure I.4 summarizes the anatomy of classical econometric modeling.

A Note on the Measurement Scales of Variables
The variables that we will generally encounter fall into the following four broad categories:
1. Nominal Scale
• Nominal scales assign numbers as labels to identify objects or classes of objects. The assigned numbers carry no additional meaning except as identifiers. For example, the use of ID codes A, N, and P to represent aggressive, normal, and passive drivers is a nominal scale variable. Note that the order has no meaning here, and the difference between identifiers is meaningless.
In practice it is often useful to assign numbers instead of letters to represent nominal scale variables, but the numbers should not be treated as ordinal, interval, or ratio scale variables. Likewise, variables such as gender (male, female) and marital status (married, unmarried, divorced, separated) simply denote categories.
2. Ordinal Scale
• Something measured on an “ordinal” scale does have an evaluative connotation: one value is greater, larger, or better than the other. For example, product A is preferred over product B, and therefore A receives a value of 1 and B receives a value of 2.
• Another example might be rating your job satisfaction on a scale from 1 to 10, with 10 representing complete satisfaction. With ordinal scales, we only know that 2 is better than 1 or 10 is better than 9; we do not know by how much. The distance between 1 and 2 may be shorter than the distance between 9 and 10.
• Other examples are grading systems (A, B, C grades) or income class (upper, middle, lower). For these variables the ordering exists, but the distances between the categories cannot be quantified.
• Students of economics will recall the indifference curves between two goods: each higher indifference curve indicates a higher level of utility, but one cannot quantify by how much one indifference curve is higher than another.
3. Interval Scale
• Interval scales build upon ordinal scale variables. In an interval scale, numbers are assigned to objects such that the differences (but not the ratios) between the numbers can be meaningfully interpreted. Temperature (in Celsius or Fahrenheit) is an interval scale variable, since the difference between measurements is the same anywhere along the scale and is consistent across measurements. Ratios of interval scale variables have limited meaning because interval scales have no absolute zero. The temperature scale in kelvin, in contrast, is a ratio scale because its zero value is absolute zero, i.e., nothing can be measured at a temperature lower than 0 kelvin.
• Time is an example of a variable measured on the interval scale: the distance between 1 and 2 is equal to the distance between 9 and 10. Temperature in Celsius or Fahrenheit is another good example: there is exactly the same difference between 100 degrees and 90 degrees as there is between 42 degrees and 32 degrees.
4. Ratio Scale
• Ratio scales have all the attributes of interval scale variables and one additional attribute: they include an absolute “zero” point. For example, traffic density (measured in vehicles per kilometer) is a ratio scale variable. The density of a link is defined as zero when there are no vehicles in the link. Other ratio scale variables include the number of vehicles in a queue, the height of a person, distance traveled, accident rate, etc.
• Temperature measured in kelvin is an example: no value is possible below 0 kelvin, which is absolute zero. Weight is another example: 0 lbs. is a meaningful absence of weight. Your bank account balance is another: although you can have a negative or positive balance, an account balance of 0 has a definite and non-arbitrary meaning.

REGRESSION ANALYSIS: A HYPOTHETICAL EXAMPLE
• Regression analysis is largely concerned with estimating and/or predicting the (population) mean value of the dependent variable (regressand) on the basis of the known or fixed values of the explanatory variable(s) (regressors).
• Look at Table 2.1, which refers to a total population of 60 families and their weekly income (X) and weekly consumption expenditure (Y). The 60 families are divided into 10 income groups.
• There is considerable variation in weekly consumption expenditure in each income group. But the general picture is that, despite the variability of weekly consumption expenditure within each income bracket, weekly consumption expenditure increases on average as income increases.
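The idea of a conditional mean for each income group can be sketched as follows. Table 2.1 itself is not reproduced in this text, so the three income groups and their expenditure figures below are hypothetical stand-ins used only to show the computation.

```python
# Hypothetical data: weekly income group -> weekly consumption
# expenditures of the families in that group. These numbers are
# illustrative only; they are NOT a reproduction of Table 2.1.
families = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
}

# Conditional mean E(Y | X = x): average expenditure within each group.
conditional_means = {
    x: sum(ys) / len(ys) for x, ys in sorted(families.items())
}

# Check that average consumption rises with income across the groups.
means_in_order = list(conditional_means.values())
is_increasing = all(
    a < b for a, b in zip(means_in_order, means_in_order[1:])
)

for x, m in conditional_means.items():
    print(f"E(Y | X = {x}) = {m:.1f}")
print("Conditional means increase with income:", is_increasing)
```

Within each group the expenditures vary, but the group averages rise with income, which is the pattern the regression line summarizes.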
• (I.3.2) is an example of a linear regression model, i.e., it hypothesizes that Y is linearly related to X, but that the relationship between the two is not exact; it is subject to individual variation. The econometric model (I.3.2) can be depicted as shown in Figure I.2.
• The dark circled points in Figure 2.1 show the conditional mean values of Y for a given X value. If we join these conditional mean values, we obtain what is known as the population regression line (PRL), or more generally, the population regression curve. More simply, it is the regression of Y on X. The adjective “population” comes from the fact that we are dealing in this example with the entire population of 60 families. Of course, in reality a population may have many more families.

THE CONCEPT OF POPULATION REGRESSION FUNCTION (PRF)
• From the preceding discussion and Figures 2.1 and 2.2, it is clear that each conditional mean $E(Y \mid X_i)$ is a function of $X_i$. Symbolically,
$E(Y \mid X_i) = f(X_i)$ (2.2.1)
• Equation (2.2.1) is known as the conditional expectation function (CEF) or population regression function (PRF), or population regression (PR) for short.
• The functional form of the PRF is an empirical question. For example, we may assume that the PRF $E(Y \mid X_i)$ is a linear function of $X_i$, say, of the type
$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$ (2.2.2)

STOCHASTIC SPECIFICATION OF PRF
• We can express the deviation of an individual $Y_i$ around its expected value as follows:
$u_i = Y_i - E(Y \mid X_i)$, or equivalently $Y_i = E(Y \mid X_i) + u_i$ (2.4.1)
• Technically, $u_i$ is known as the stochastic disturbance or stochastic error term.
• How do we interpret (2.4.1)? It expresses $Y_i$ as the sum of two components:
– (1) $E(Y \mid X_i)$, which is known as the systematic, or deterministic, component;
– (2) $u_i$, which is the random, or nonsystematic, component.
• Now if we take the expected value of (2.4.1) on both sides, conditional on $X_i$, we obtain
$E(Y_i \mid X_i) = E[E(Y \mid X_i)] + E(u_i \mid X_i) = E(Y \mid X_i) + E(u_i \mid X_i)$ (2.4.4)
• Here we use the fact that the expected value of a constant is that constant itself: once $X_i$ is fixed, $E(Y \mid X_i)$ is a constant.
• Since $E(Y_i \mid X_i)$ is the same thing as $E(Y \mid X_i)$, Eq. (2.4.4) implies that
$E(u_i \mid X_i) = 0$ (2.4.5)
• Thus, the assumption that the regression line passes through the conditional means of Y implies that the conditional mean values of $u_i$ (conditional upon the given X’s) are zero.
• It is clear that
$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$ (2.2.2)
and
$Y_i = \beta_1 + \beta_2 X_i + u_i$ (2.4.2)
are equivalent forms if $E(u_i \mid X_i) = 0$.
• We can develop the concept of the sample regression function (SRF) to represent the sample regression line. The sample counterpart of (2.2.2) may be written as
$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$ (2.6.1)
• where $\hat{Y}$ is read as “Y-hat” or “Y-cap”
• $\hat{Y}_i$ = estimator of $E(Y \mid X_i)$
• $\hat{\beta}_1$ = estimator of $\beta_1$
• $\hat{\beta}_2$ = estimator of $\beta_2$
• Note that an estimator, also known as a (sample) statistic, is simply a rule or formula or method that tells how to estimate th...
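To make the sample regression function (2.6.1) concrete, the sketch below computes estimates of $\beta_1$ and $\beta_2$ from a small sample. It uses ordinary least squares, which is one common estimator; the text above does not name the estimation rule, so treat that choice as an assumption. The function name `fit_srf` and the five-observation sample are likewise hypothetical.

```python
def fit_srf(xs, ys):
    """Return (b1_hat, b2_hat) for Y_hat = b1_hat + b2_hat * X.

    Uses the ordinary least squares formulas (an assumed estimator):
    b2_hat = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)
    b1_hat = y_bar - b2_hat * x_bar
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b2_hat = s_xy / s_xx
    b1_hat = y_bar - b2_hat * x_bar
    return b1_hat, b2_hat

# Hypothetical sample: weekly income (X) and consumption (Y).
incomes = [80, 100, 120, 140, 160]
consumption = [70, 65, 90, 95, 110]

b1_hat, b2_hat = fit_srf(incomes, consumption)

# Fitted values and residuals (sample counterparts of the u_i).
fitted = [b1_hat + b2_hat * x for x in incomes]
residuals = [y - f for y, f in zip(consumption, fitted)]

print(f"SRF: Y_hat = {b1_hat:.2f} + {b2_hat:.2f} X")
print(f"Sum of residuals: {sum(residuals):.6f}")
```

With an intercept in the model, the least squares residuals sum to zero, which mirrors the population condition $E(u_i \mid X_i) = 0$ in (2.4.5).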