Forecasting the Variability of Stock Index Returns with Stochastic Volatility Models and Implied Vol


TI 2000-104/4
Tinbergen Institute Discussion Paper

Forecasting the Variability of Stock Index Returns with Stochastic Volatility Models and Implied Volatility

Eugenie Hol and Siem Jan Koopman
Department of Accounting and Finance, University of Birmingham
Department of Econometrics, Free University Amsterdam
November 13, 2000

Corresponding author: Siem Jan Koopman, Department of Econometrics, Free University, De Boelelaan 1105, NL-1081 HV Amsterdam. Email s.j.koopman@econ.vu.nl.

Tinbergen Institute
The Tinbergen Institute is the institute for economic research of the Erasmus Universiteit Rotterdam, Universiteit van Amsterdam and Vrije Universiteit Amsterdam.
Tinbergen Institute Amsterdam, Keizersgracht 482, 1017 EG Amsterdam, The Netherlands. Tel.: +31.(0)20.5513500, Fax: +31.(0)20.5513555
Tinbergen Institute Rotterdam, Burg. Oudlaan 50, 3062 PA Rotterdam, The Netherlands. Tel.: +31.(0)10.4088900, Fax: +31.(0)10.4089031
Most TI discussion papers can be downloaded at http://www.tinbergen.nl

Abstract

In this paper we compare the predictive ability of Stochastic Volatility (SV) models to that of volatility forecasts implied by option prices. We develop an SV model with implied volatility as an exogenous variable in the variance equation, which facilitates the use of statistical tests for nested models; we refer to this model as the SVX model. The SVX model is then extended to a volatility model with a persistence adjustment term, and this we call the SVX+ model. This class of SV models can be estimated by quasi maximum likelihood methods, but the main emphasis will be on methods for exact maximum likelihood using Monte Carlo importance sampling. The performance of the models is evaluated, both within sample and out-of-sample, for daily returns on the Standard & Poor's 100 index. Similar studies have been undertaken with GARCH models, where findings were initially mixed but recent research has indicated that implied volatility provides superior forecasts. We find that implied volatility outperforms historical returns in-sample but that the latter contain incremental information in the form of stochastic shocks incorporated in the SVX models. The out-of-sample volatility forecasts are evaluated against daily squared returns and intradaily squared returns for forecasting horizons ranging from 1 to 10 days. For the daily squared returns we obtain mixed results, but when we use intradaily squared returns as a measure of realised volatility we find that the SVX+ model produces the most accurate out-of-sample volatility forecasts and that the model that only utilises implied volatility performs the worst, as its volatility forecasts are upwardly biased.

KEYWORDS: Forecasting, Implied volatility, Monte Carlo likelihood method, Stochastic volatility, Stock indices.

1 Introduction

Forecasts of financial market volatility play a crucial role in financial decision making and the need for accurate forecasts is apparent in a number of areas, such as option pricing, hedging strategies, portfolio allocation and Value-at-Risk calculations. Unfortunately, it is notoriously difficult to predict volatility accurately, and the problem is exacerbated by the fact that realised volatility has to be approximated as it is inherently unobservable. Due to its critical role the topic of volatility forecasting has however received much attention and the resulting literature is considerable. One of the main sources of volatility forecasts is the class of historical parametric volatility models such as Generalised Autoregressive Conditional Heteroscedasticity (GARCH) and Stochastic Volatility (SV) models. The parameters in these models are estimated with historical data and subsequently used to construct out-of-sample volatility forecasts.
The high degree of intertemporal volatility persistence observed by these models suggests that the variability of stock index returns is highly predictable and that past observations contain valuable information for the prediction of future volatility. Studies comparing the forecasting abilities of the various volatility models have been undertaken for a number of stock indices and the general consensus appears to be that those models that attribute more weight to recent observations outperform others [1]. Little effort has however been made to compare ex-ante volatility forecasts produced by GARCH models with those of SV models [2].

An alternative information source for volatility prediction is found in implied volatility, which is calculated from option prices in combination with a certain option pricing model. Early empirical studies by Latané and Rendleman (1976), Chiras and Manaster (1978) and Beckers (1981) have indicated that implied volatility, when compared with historical standard deviations, can be regarded as a good predictor of future volatility. Implied volatility is also often referred to as the market's volatility forecast and is forward looking, as opposed to historically based methods which are by definition backward looking. Provided that the option market is efficient and that the option pricing model has been correctly specified, the information content of implied volatility should therefore subsume that of all other variables in the information set.

The question whether the most accurate volatility forecasts are produced by implied volatility, rather than by the historically based volatility models, was first addressed by Day and Lewis (1992), who developed a GARCH model with embedded implied volatility. Contrary to theory, their results indicated that GARCH models provided better volatility forecasts than implied volatility but that the latter might contain additional information, as the best forecasts were obtained using both sources of information.
Canina and Figlewski (1993) even found "little or no correlation at all between implied volatility and subsequent realized volatility" and favoured a simple historical volatility measure. Findings in the early nineties were therefore mixed and the assumed comprehensive information content of implied volatility appeared questionable, as Lamoureux and Lastrapes (1993) were also unable to reject the hypothesis that predictions based on GARCH models contained incremental information about future volatility. Recent studies by Christensen and Prabhala (1998), Fleming (1998) and Blair, Poon and Taylor (2000) are however much more supportive, as all present evidence that the most accurate volatility forecasts for returns on the Standard & Poor's 100 stock index are based on implied volatility. Moreover, their research strongly suggests that historical data contain little or no incremental information about future volatility [3].

Thus far the issue of comparative forecasting ability has however not been studied in the context of SV models. In recent years this class of volatility model has received considerable attention in the literature and it can now be regarded as a competitive alternative to GARCH models, even though its empirical application has been limited. In this paper we examine the predictive ability of the SV model and compare its volatility forecasts with those of implied volatility. For this purpose we introduce an SV model which incorporates implied volatility as an exogenous variable in the variance equation. This model, which we will refer to as the Stochastic Volatility with eXogenous variables (SVX) model, allows us to perform statistical tests for nested models. We evaluate the predictive performance for daily returns on the Standard & Poor's 100 index and as a measure of implied volatility we use the VIX index of the Chicago Board Options Exchange (CBOE).
In addition, we compare the ex-ante forecasting ability of the different methods over a five year evaluation period for forecasting horizons ranging from 1 to 10 trading days. As measures of realised volatility we consider both daily squared returns and intradaily squared returns. The SV class of models considered in this paper is estimated using exact maximum likelihood methods which are based on Monte Carlo simulation techniques such as importance sampling and antithetics. More accurate estimates of the likelihood function are obtained when the number of simulations is increased. Therefore the estimates can be as accurate as desired at the cost of computer time.

The remainder of this paper is structured as follows. In the next section we discuss the various model specifications, while in section 3 we present the relevant estimation methods. The data and in-sample estimation results are presented in section 4. In section 5 we give details of our forecasting methodology and the out-of-sample forecasting results are presented in section 6. In the final section we conclude and provide a summary.

[1] See e.g. Akgiray (1989), Dimson and Marsh (1990) and Walsh and Tsou (1998) for an overview.
[2] An exception is Heynen (1995), who examined a variety of international stock indices and found that SV models produced more accurate volatility forecasts than GARCH models.
[3] It has been suggested, most notably by Blair et al. (2000), that the earlier findings are due to measurement errors in the calculated implied volatility measure.

2 Model Specifications

Generalised Autoregressive Conditional Heteroscedasticity (GARCH) models have thus far been the most frequently applied class of time-varying volatility model. Since its introduction by Engle (1982) and subsequent generalisation by Bollerslev (1986) this model has been extended in numerous ways, which usually involved alternative formulations for the volatility process [4].
Although the Stochastic Volatility (SV) model has been recognised as a viable alternative to the GARCH model, the latter is still the standard in empirical applications [5]. This is mainly due to the problems which arise as a consequence of the intractability of the likelihood function of the SV model, which prohibits its direct evaluation. However, in recent years considerable progress has been made in this area, which not only encourages further empirical research but also enables the development of various extensions of the SV model. One of the possible extensions involves the inclusion of exogenous variables in the variance equation, which we will discuss in this paper; the resulting model we refer to as the SVX model.

Volatility models are usually defined by their first two moments, the mean and the variance equation. The general notation for the mean equation of time-varying volatility models is given by

$$ y_t = \mu_t + \sigma_t \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0,1), \qquad t = 1, \ldots, T, \tag{1} $$

where $y_t$ denotes the return series of interest and $\mu_t$ its conditional mean [6]. The disturbance term $\varepsilon_t$ is assumed to be identically and independently distributed with zero mean and unit variance. In addition, the assumption of normality is added. A common notation for the variance equation of the SV class of volatility models is given by

$$ \sigma_t^2 = \sigma^{*2} \exp(h_t), \tag{2} $$

and it is therefore defined as the product of a positive scaling factor $\sigma^{*2}$ and the exponential of the stochastic process $h_t$. For the standard SV model this process is specified as

$$ h_t = \phi h_{t-1} + \sigma_\eta \eta_t, \qquad \eta_t \sim \mathrm{NID}(0,1), \tag{3} $$

where the degree of volatility persistence is measured by the parameter $\phi$, which is restricted to a positive value smaller than one in order to ensure the stationarity of the volatility process, so $0 < \phi < 1$. Further, it is assumed that the disturbance term $\eta_t$ is mutually uncorrelated with the error term $\varepsilon_t$ in the mean equation (1), both contemporaneously and at all lags.

The SV model with embedded implied volatility is labelled the SVX model and we could specify its stochastic process as

$$ h_t = \phi h_{t-1} + \gamma x_t + \sigma_\eta \eta_t, \qquad \eta_t \sim \mathrm{NID}(0,1), \tag{4} $$

where $x_t$ denotes the contemporaneous implied volatility measure in logarithmic squared form, so $x_t = \ln \sigma_{IV,t}^2$. The value for $\phi$ in the SVX model is restricted to be less than one in absolute value, i.e. $-1 < \phi < 1$. The problem with this specification is that it includes an entire lag structure of the implied volatility measure, which becomes apparent when we rewrite the volatility process in logarithmic terms as

$$ \ln \sigma_t^2 = \ln \sigma^{*2} + h_t = \ln \sigma^{*2} + \phi h_{t-1} + \gamma x_t + \sigma_\eta \eta_t = (1-\phi) \ln \sigma^{*2} + \phi \ln \sigma_{t-1}^2 + \gamma x_t + \sigma_\eta \eta_t, $$

and if we repeatedly substitute for the lagged volatility process we observe that

$$ \ln \sigma_t^2 = \ln \sigma^{*2} + \gamma x_t + \gamma \sum_{i=1}^{t-1} \phi^i x_{t-i} + \sigma_\eta \sum_{i=0}^{t-1} \phi^i \eta_{t-i}. $$

In comparison, the equivalent notation for the SV model, with $h_t$ as defined in equation (3), can be written as

$$ \ln \sigma_t^2 = \ln \sigma^{*2} + \sigma_\eta \sum_{i=0}^{t-1} \phi^i \eta_{t-i}. $$

Inclusion of these multiple lagged implied volatility measures leads to a downwardly biased value for $\gamma$ when $\phi$ is positive. For a negative $\phi$ parameter the estimate for $\gamma$ will be asymptotically upwardly biased, as $\sum_{i=1}^{\infty} \phi^i < 0$ for $-1 < \phi < 0$. Obviously, the size of this bias depends on the estimated value for the persistence parameter $\phi$. If $\phi$ is close to zero and insignificant, i.e. if all volatility information is impounded in the implied volatility measure, $\gamma$ will only be marginally biased.

[4] For surveys on GARCH models we refer to Bollerslev, Chou and Kroner (1992), Bera and Higgins (1993), Bollerslev, Engle and Nelson (1994) and Diebold and Lopez (1995).
[5] SV models are reviewed in, for example, Taylor (1994), Ghysels, Harvey and Renault (1996) and Shephard (1996).
[6] For SV models the conditional mean is usually assumed to be equal to zero or is modelled prior to estimation of the volatility process. Simultaneous estimation of the mean and variance equation has been undertaken in, for example, Koopman and Hol Uspensky (2000).
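The lag structure implied by the SVX recursion (4) can be checked numerically. The sketch below is illustrative only: it assumes a zero starting value for the log-volatility process, folds the scaled disturbance into a single `shocks` series, and uses hypothetical function names. It compares the recursion with its unrolled geometric-sum form and evaluates the infinite sum of lag weights whose sign drives the direction of the bias in the implied volatility coefficient.

```python
def svx_signal(x, phi, gamma, shocks):
    """Run h_t = phi*h_{t-1} + gamma*x_t + shock_t, starting from h_0 = 0 (assumed)."""
    h, prev = [], 0.0
    for x_t, e_t in zip(x, shocks):
        prev = phi * prev + gamma * x_t + e_t
        h.append(prev)
    return h


def svx_signal_unrolled(x, phi, gamma, shocks):
    """Same process written as geometric sums over all past x and shocks."""
    out = []
    for t in range(len(x)):
        out.append(sum(phi ** i * (gamma * x[t - i] + shocks[t - i])
                       for i in range(t + 1)))
    return out


def lag_weight_sum(phi):
    """Sum of phi**i over i >= 1, valid for |phi| < 1."""
    return phi / (1.0 - phi)
```

With a negative persistence parameter, `lag_weight_sum` is negative, so the lagged implied volatility terms enter with a negative total weight, which is the case in which the text expects the contemporaneous coefficient to be biased upwards.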
For the GARCH class of models the issue of a multiple lagged implied volatility structure was pointed out by Amin and Ng (1997), who suggested a persistence adjustment term. A similar structure can be implemented for the SVX model by defining $h_t$ as follows

$$ h_t = \phi h_{t-1} + \gamma (1 - \phi L) x_t + \sigma_\eta \eta_t, \qquad \eta_t \sim \mathrm{NID}(0,1), \tag{5} $$

which by recursive substitution of the logarithmic variance equation leads to

$$ \ln \sigma_t^2 = (1-\phi) \ln \sigma^{*2} + \phi \ln \sigma_{t-1}^2 + \gamma (1 - \phi L) x_t + \sigma_\eta \eta_t = \ln \sigma^{*2} + \gamma x_t + \sigma_\eta \sum_{i=0}^{t-1} \phi^i \eta_{t-i}, $$

and therefore omits the implied volatility lag structure. By defining $h_t$ as in equation (5) we therefore obtain an alternative SVX model, which we will refer to hereafter as the SVX model with persistence adjustment, or the SVX+ model.

Finally, we also consider a deterministic volatility model that only utilises implied volatility as a source of volatility information. This model we obtain by imposing the restrictions $\phi = 0$ and $\sigma_\eta^2 = 0$ on equation (4) or (5), and therefore

$$ h_t = \gamma x_t, \tag{6} $$

with $\ln \sigma_t^2 = \ln \sigma^{*2} + \gamma x_t$. This last of our four models we term the VX model, as the volatility process does not have a separate error term and is solely determined by exogenous variables.

3 Model Estimation

In this section we show how the parameters of the SVX class of models can be estimated by simulated maximum likelihood using importance sampling. Further, we show how to compute the conditional mean and variance of the volatility process $h_t$. First we show that a quasi-maximum likelihood method can also be used.

3.1 Quasi-maximum likelihood of SV and SVX models

Consider model (1), (2) and (3), which we can transform by taking logs of the squared $y_t$'s, that is $y_t^* = \ln y_t^2$, to obtain the model

$$ y_t^* = \omega + h_t + u_t, \qquad \omega = \ln \sigma^{*2}, \qquad u_t \sim \ln \chi_1^2, $$

with $h_t$ given by (3) for $t = 1, \ldots, T$. We take $\mu_t$ as zero, so that the conditional mean of the $y_t$ process remains unmodelled. The resulting model is within the class of linear state space models; for an introduction to state space models we refer to Harvey (1993, Chapter 4). Note that the disturbance term of the model for $y_t^*$ is non-Gaussian. Nevertheless, the Kalman filter and the associated smoothing algorithm produce the minimum mean square linear estimator for $h_t$; see Harvey (1993, section 4.3). By assuming that the disturbance $u_t$ is normally distributed with mean and variance set equal to the mean and variance of a $\ln \chi_1^2$ variable, we obtain so-called quasi maximum likelihood estimates of the unknown parameters $\omega$, $\phi$ and $\sigma_\eta$ when maximising the Gaussian likelihood function with respect to these parameters. This estimation procedure is proposed by Harvey, Ruiz and Shephard (1994) and is implemented in the computer package STAMP of Koopman et al. (2000).

The inclusion of explanatory variables in the log-volatility process does not complicate matters further. The log-squared transformation does not affect the log-volatility processes of the SVX models as defined in equations (4) and (5). Therefore the Kalman filter can still be applied to the resulting linear model. However, the regression coefficient $\gamma$ needs to be estimated additionally.

In the following we will develop exact maximum likelihood methods for the estimation of the parameters of SV and SVX models. Quasi-maximum likelihood methods will not be used in the empirical studies of sections 4 and 6 since we prefer to use exact likelihood methods.

3.2 Exact maximum likelihood of SVX models using importance sampling

Let $y = (y_1, \ldots, y_T)'$ and $\theta = (\theta_1, \ldots, \theta_T)'$, where observation $y_t$ is modelled as (1) and its log-volatility is given by

$$ \sigma_t^2 = \exp(\theta_t), \qquad \theta_t = \omega + h_t, $$

with signal $h_t$ modelled as (3), (4) or (5), for $t = 1, \ldots, T$. Note that $\sigma^{*2} = \exp(\omega)$. Further we shall collect the parameters which are not included in the state vector below in the parameter vector $\psi$. The SVX model (1), (2) and (4) in state space form is given by

$$ p(y \mid \theta) = \prod_{t=1}^{T} N(0, \sigma_t^2), $$

with $\mu_t = 0$ and $\sigma_t^2 = \exp(\theta_t) = \exp(\omega + h_t) = \sigma^{*2} \exp(h_t)$. The state vector collects the components of the log-volatility and is given by $\alpha_t = (\omega, \gamma, h_t)'$.
The so-called transition equation for the state vector is given by

$$ \alpha_{t+1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & x_t & \phi \end{bmatrix} \alpha_t + \begin{pmatrix} 0 \\ 0 \\ \sigma_\eta \end{pmatrix} \eta_t, $$

where the disturbances $\eta_t$ are distributed as $\mathrm{NID}(0,1)$. The initial state vector $\alpha_1$ is given by

$$ \alpha_1 \sim N \left\{ \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{bmatrix} \kappa & 0 & 0 \\ 0 & \kappa & 0 \\ 0 & 0 & \sigma_\eta^2 / (1 - \phi^2) \end{bmatrix} \right\} $$

for some arbitrary large value for $\kappa$. Finally, the parameter vector is given by

$$ \psi = \begin{pmatrix} \phi \\ \sigma_\eta \end{pmatrix}. $$

This completes the model specification in state space form. It follows that

$$ p(y \mid \theta) = \prod_{t=1}^{T} N(0, \exp \theta_t), \qquad \theta_t = (1 \;\; 0 \;\; 1)\, \alpha_t. $$

This representation of SV and SVX models can be regarded as a nonlinear state space model. The aim now is to estimate the parameter vector $\psi$ by exact maximum likelihood. This requires the evaluation of the loglikelihood function. A convenient expression for the loglikelihood is developed below. In section 3.2.2 we provide some computational details required for estimation. The elements of the state vector $\alpha_t$, which include the regression coefficients $\omega$ and $\gamma$ and the stochastic log-volatility process $h_t$, are estimated using signal extraction methods which are briefly discussed in section 3.2.3.

3.2.1 Loglikelihood evaluation

The construction of the exact likelihood for the SV model using the Monte Carlo likelihood approach of Shephard and Pitt (1997) and Durbin and Koopman (1997) can be modified for the SVX model. The nonlinear relation between the log-volatility $h_t$ and the observation equation is not altered in the SVX case; only the specification for $h_t$ is different. Similar considerations are discussed by Chib, Nardari and Shephard (1998) in a Bayesian context. The same modification can be used for the SVX+ model since we merely replace the explanatory variable $x_t$ by $(1 - \phi L) x_t$.

The loglikelihood function for the SVX model can be computed via the Monte Carlo technique of importance sampling. The likelihood function can be expressed as

$$ L(\psi) = p(y \mid \psi) = \int p(y, \theta \mid \psi)\, d\theta = \int p(y \mid \theta)\, p(\theta \mid \psi)\, d\theta. \tag{7} $$

An efficient way of evaluating the likelihood is by using importance sampling; see Ripley (1987, Chapter 5).
We require a simulation device to sample from an importance density $\tilde{p}(\theta \mid y, \psi)$ which we prefer to be as close as possible to the true density $p(\theta \mid y, \psi)$. An obvious choice for the importance density is the conditional Gaussian density, since in this case it is relatively straightforward to sample from $\tilde{p}(\theta \mid y, \psi) = g(\theta \mid y, \psi)$. An approximating Gaussian model for the SVX model is developed in the appendix. The simulation smoother of de Jong and Shephard (1995) is used to sample from the approximating Gaussian model $g(\theta \mid y, \psi)$. The likelihood function (7) can be obtained by writing

$$ L(\psi) = \int p(y \mid \theta)\, \frac{p(\theta \mid \psi)}{g(\theta \mid y, \psi)}\, g(\theta \mid y, \psi)\, d\theta = \tilde{E} \left\{ p(y \mid \theta)\, \frac{p(\theta \mid \psi)}{g(\theta \mid y, \psi)} \right\}, \tag{8} $$

where $\tilde{E}$ denotes expectation with respect to the importance density $g(\theta \mid y, \psi)$. Expression (8) can be simplified following a suggestion of Durbin and Koopman (1997). The likelihood function of the approximating Gaussian model is given by

$$ L_g(\psi) = g(y \mid \psi) = \frac{g(y, \theta \mid \psi)}{g(\theta \mid y, \psi)} = \frac{g(y \mid \theta)\, p(\theta \mid \psi)}{g(\theta \mid y, \psi)}, \tag{9} $$

and it follows that

$$ \frac{p(\theta \mid \psi)}{g(\theta \mid y, \psi)} = \frac{L_g(\psi)}{g(y \mid \theta)}. $$

This ratio also appears in (8) and substitution leads to

$$ L(\psi) = L_g(\psi)\, \tilde{E} \left\{ \frac{p(y \mid \theta)}{g(y \mid \theta)} \right\}, \tag{10} $$

which is the convenient expression we will use in our calculations. The likelihood function of the approximating Gaussian model $L_g(\psi)$ can be calculated via the Kalman filter. The conditional density functions $p(y \mid \theta)$ and $g(y \mid \theta)$ are easily computed given values for $\theta$ and $\psi$ using (26). It follows that the likelihood function of the SVX model is equivalent to the likelihood function of an approximating Gaussian model, multiplied by a correction term. Only this correction term needs to be evaluated via simulation. An obvious estimator for the likelihood of the SVX model is

$$ \hat{L}(\psi) = L_g(\psi)\, \bar{w}, \tag{11} $$

where

$$ \bar{w} = \frac{1}{M} \sum_{i=1}^{M} w_i, \qquad w_i = \frac{p(y \mid \theta^i)}{g(y \mid \theta^i)}, \tag{12} $$

and $\theta^i$ denotes a draw from the importance density $g(\theta \mid y, \psi)$. The accuracy of this estimator solely depends on $M$, that is the number of simulation samples. In practice, we usually work with the log of the likelihood function to manage the magnitude of density values.
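The mechanics of an importance sampling likelihood estimate can be illustrated on a toy problem. The sketch below is not the SVX estimator: it assumes a scalar model with $y \mid \theta \sim N(\theta, 1)$ and $\theta \sim N(0, 1)$, so that the exact marginal likelihood is the $N(0, 2)$ density at $y$, and it uses the generic weight $p(y \mid \theta)\, p(\theta) / g(\theta)$ rather than the simplified ratio in (10). All function names are hypothetical. Weights are averaged on the log scale, in the spirit of working with the log of the likelihood to keep density values manageable.

```python
import math
import random


def log_normal_pdf(z, mean, var):
    """Log density of N(mean, var) evaluated at z."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (z - mean) ** 2 / var)


def log_mean_exp(log_w):
    """Stable log of the average of exp(log_w)."""
    m = max(log_w)
    return m + math.log(sum(math.exp(v - m) for v in log_w) / len(log_w))


def is_loglik(y, g_mean, g_var, n_draws=200, seed=0):
    """Importance sampling estimate of log p(y) for the toy model.

    Draws theta from the Gaussian importance density N(g_mean, g_var)
    and averages the weights p(y|theta) p(theta) / g(theta).
    """
    rng = random.Random(seed)
    log_w = []
    for _ in range(n_draws):
        theta = rng.gauss(g_mean, math.sqrt(g_var))
        log_w.append(log_normal_pdf(y, theta, 1.0)
                     + log_normal_pdf(theta, 0.0, 1.0)
                     - log_normal_pdf(theta, g_mean, g_var))
    return log_mean_exp(log_w)
```

When the importance density equals the exact conditional density of the signal, here $N(y/2, 1/2)$, every weight is identical and the estimate is exact for any number of draws. This is why an importance density close to the true conditional density is preferred: the closer it is, the smaller the simulation variance.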
The log transformation of $\hat{L}(\psi)$ introduces bias for which we can correct up to order $O(M^{-3/2})$; see Shephard and Pitt (1997) and Durbin and Koopman (1997). We obtain

$$ \ln \hat{L}(\psi) = \ln L_g(\psi) + \ln \bar{w} + \frac{s_w^2}{2 M \bar{w}^2}, \tag{13} $$

with $s_w^2 = (M-1)^{-1} \sum_{i=1}^{M} (w_i - \bar{w})^2$, where $\bar{w} = M^{-1} \sum_{i=1}^{M} w_i$ is the sample mean of the weights $w_i$.

3.2.2 Computational details

Given a particular vector for $\psi = (\phi, \sigma_\eta)'$, we evaluate the loglikelihood function (13), for which we use the approximating model (25) to generate simulation samples. To obtain a maximum likelihood estimate for $\psi$, which we denote by $\hat{\psi}$, the loglikelihood is numerically maximised with respect to $\psi$ in a similar fashion as for Gaussian models; see Harvey (1989) and Koopman et al. (2000). The repeated evaluation of the loglikelihood for different $\psi$'s during the search for $\hat{\psi}$ will be based on the same set of random numbers used for simulation.

Although the approximating model is effective for simulation, we may wish to decrease the simulation variance further using standard simulation techniques based on antithetics and control variables; see Durbin and Koopman (1997). In our computations we have employed the standard antithetic variable as given by $\tilde{\theta}^i = 2 \hat{\theta} - \theta^i$, where $\theta^i$ is a draw from the importance density $g(\theta \mid y, \psi)$ and where $\hat{\theta} = \tilde{E}(\theta)$ can be obtained using the Kalman filter and smoother. Since $\tilde{\theta}^i - \hat{\theta} = -(\theta^i - \hat{\theta})$ and $\theta^i$ is normally distributed, the two vectors $\theta^i$ and $\tilde{\theta}^i$ are equi-probable.

The number of simulation samples $M$ is set prior to the estimation procedure. The choice of $M$ can be determined by computing the error variance due to simulation; see Durbin and Koopman (1997). It is shown by Sandmann and Koopman (1998) that $M$ can be relatively small in the context of SV models. Therefore, in this study we have set $M$ equal to 100 times two antithetic variables, that is $M = 200$.

3.2.3 Signal extraction

The Monte Carlo importance sampling techniques, which we have used for likelihood evaluation, can also be employed to compute the conditional mean and variance of the unobserved signal $\theta_t$.
The same approximating Gaussian model can be used for this purpose. The details are given by Durbin and Koopman (2000). The conditional mean and variance of the signal $\theta_t$ are given by

$$ E(\theta_t \mid y) = \bar{\theta}_t^{(1)}, \qquad \mathrm{Var}(\theta_t \mid y) = \bar{\theta}_t^{(2)} - \left[ \bar{\theta}_t^{(1)} \right]^2, $$

where we compute $\bar{\theta}_t^{(1)}$ and $\bar{\theta}_t^{(2)}$ by

$$ \bar{\theta}_t^{(1)} = \frac{1}{M \bar{w}} \sum_{i=1}^{M} w_i\, \theta_t^i, \qquad \bar{\theta}_t^{(2)} = \frac{1}{M \bar{w}} \sum_{i=1}^{M} w_i \left( \theta_t^i \right)^2, $$

with $w_i$ as defined in (12), $\bar{w}$ their sample mean, and $\theta_t^i$ the $t$th element of $\theta^i$, which is the $i$th draw from the importance density $g(\theta \mid y, \psi)$. This device can be generalised to obtain the elements of the conditional mean and variance of the state vector $\alpha_t$. In practice, the unknown parameter vector $\psi$ is replaced by its Monte Carlo maximum likelihood estimate $\hat{\psi}$. The uncertainty related to the estimate $\hat{\psi}$ can also be taken into account by similar Monte Carlo simulation techniques; see Durbin and Koopman (2000). An alternative approach to signal extraction for SV and SVX models is provided by Bayesian Markov chain Monte Carlo techniques; see, for example, Shephard and Pitt (1997) and Kim, Shephard and Chib (1998).

3.2.4 Numerical implementation of estimation procedure

The simulated Monte Carlo estimation procedure is implemented using the object-oriented matrix programming language Ox 2.2 of Doornik (1998), using the library SsfPack 2.2 of Koopman, Shephard and Doornik (1999). The relevant programs, including the one used for the empirical studies in this paper, can be downloaded from the Internet at www.econ.vu.nl/koopman/sv/ and the program documentation can be consulted on-line. The programs can be used in a more general context: they can be modified for other Monte Carlo studies and be applied to other data-sets.

3.3 Exact maximum likelihood of VX model

The VX model can be regarded as a simple regression model with heteroscedastic disturbances, that is $\sigma_t^2 = \exp(\omega + \gamma x_t)$. The model can be estimated by maximum likelihood using standard methods; see, for example, Johnston and DiNardo (1997, Chapter 6).
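A minimal sketch of the Gaussian log-likelihood implied by the VX variance specification is given below, assuming zero-mean returns and a log-variance that is linear in the implied volatility regressor. The names `const` (the intercept of the log-variance) and `gamma`, and the function names themselves, are illustrative choices rather than the authors' code. The gradient is included because closed-form derivatives are what make gradient-based maximisation convenient for this model.

```python
import math


def vx_loglik(y, x, const, gamma):
    """Log-likelihood of y_t ~ N(0, exp(const + gamma * x_t))."""
    ll = 0.0
    for y_t, x_t in zip(y, x):
        log_var = const + gamma * x_t
        ll += -0.5 * (math.log(2.0 * math.pi) + log_var
                      + y_t * y_t * math.exp(-log_var))
    return ll


def vx_score(y, x, const, gamma):
    """Analytic gradient of vx_loglik with respect to (const, gamma)."""
    d_const = 0.0
    d_gamma = 0.0
    for y_t, x_t in zip(y, x):
        log_var = const + gamma * x_t
        # Standardised squared return minus its expectation under the model.
        resid = y_t * y_t * math.exp(-log_var) - 1.0
        d_const += 0.5 * resid
        d_gamma += 0.5 * resid * x_t
    return d_const, d_gamma
```

At the maximum both score components are zero, which means the standardised squared returns average to one and are uncorrelated with the regressor.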
The first and second derivatives of the loglikelihood with respect to the parameters $\omega$ and $\gamma$ can be obtained analytically for the VX model, and this allows straightforward application of Newton's method for numerically maximising the likelihood function.

4 Data Description and Empirical In-Sample Results

4.1 Data

The data we selected are the Standard & Poor's 100 stock index for the period 2 January 1986 to 31 December 1999, obtained from Datastream. After adjusting the series for holidays, our sample consists of 3532 daily observations. The continuously compounded returns on the stock index are expressed in percentage terms and are therefore given by $R_t = 100 (\ln P_t - \ln P_{t-1})$, where $P_t$ denotes the closing price of the Standard & Poor's 100 index at time $t$. The accompanying implied volatility index we use is the Chicago Board Options Exchange Market Volatility Index (VIX), which was extracted from the CBOE on-line database [7]. The VIX is calculated as the weighted average of implied volatilities of eight near-the-money, nearby and second nearby call and put options on the Standard & Poor's 100 index and represents the implied volatility of a hypothetical at-the-money OEX option with twenty-two trading days to expiry [8]. We use the daily closing level of the VIX index and from the annualised VIX, which is expressed in terms of standard deviations, we calculate the daily VIX variance at time $t$ as $\sigma_{IV,t}^2 = VIX_t^2 / 252$. The main attraction of the VIX index is that it mitigates many of the problems which lead to biased implied volatility values.
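The two data transformations described above are simple enough to state in code. The sketch below uses hypothetical function names and assumes that prices are supplied as a plain list of closing levels and that the VIX is quoted as an annualised standard deviation in percent.

```python
import math


def pct_log_returns(prices):
    """Continuously compounded returns in percent: 100 * (ln P_t - ln P_{t-1})."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]


def daily_vix_variance(vix_annualised_pct):
    """Daily implied variance from an annualised VIX level, using 252 trading days."""
    return vix_annualised_pct ** 2 / 252.0
```

A price series of length n yields n - 1 returns, which is consistent with 3532 daily index observations producing 3531 returns.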
Figure 1: Daily (i) returns and (ii) squared returns (truncated at 100) on the Standard & Poor's 100 index and (iii) the VIX index between 02/01/86 and 31/12/99

In figure 1 we graph the daily return and the VIX series, together with the squared return series, which can be regarded as an approximation of realised volatility [9]. In addition we show the histogram of the daily returns on the Standard & Poor's 100 index in the top right corner, which shows that this series is negatively skewed and exhibits leptokurtosis. As our full sample includes observations relating to the October 1987 stock market crash, which might have a distorting influence on the estimation results, we also consider a subsample that starts on 1 January 1988. Summary statistics for both samples are given in table 1. The effects of the large outliers in the full sample are illustrated by the very high values for the skewness and excess kurtosis coefficients. When we compare the variances of the two return series we observe a value of 1.211 for the full sample against 0.905 for the shorter sample, which represent annual standard deviation values of 17.5% and 15.1%, respectively. The decrease in volatility is also reflected in the VIX series, which however has much larger mean values than the squared return series. When these VIX values are translated into annual standard deviations these amount to 21.6% for the full sample and 15.8% for the post crash period, which indicates that the implied volatility measure tends to overestimate actual volatility.

[7] www.cboe.com/tools/historical/vix.htm
[8] The construction of the VIX index is described in detail by Whaley (1993) and by Fleming, Ostdiek and Whaley (1995), who regard it as a useful proxy for expected stock market volatility.
[9] Note that the graph of the squared return series is truncated at a value of 100, which only affects the observation of 19 October 1987 that has a value of 561.214.

Table 1: Summary statistics of daily returns and squared returns on the S&P 100 Index and the VIX index from 02/01/86 to 31/12/99 and from 04/01/88 to 31/12/99

| Statistic | 1986–1999 (T = 3531): S&P100 $R_t$ | S&P100 $R_t^2$ | VIX $\sigma_{IV,t}^2$ | 1988–1999 (T = 3027): S&P100 $R_t$ | S&P100 $R_t^2$ | VIX $\sigma_{IV,t}^2$ |
|---|---|---|---|---|---|---|
| Mean | 0.058 | 1.215 | 1.858 | 0.063 | 0.909 | 1.639 |
| Variance | 1.211 | 99.934 | 7.850 | 0.905 | 6.576 | 1.307 |
| Skewness | -3.102 | 50.294 | 18.553 | -0.471 | 12.791 | 2.226 |
| Excess Kurtosis | 66.847 | 2794.13 | 479.565 | 6.143 | 236.497 | 7.750 |
| Maximum | 8.539 | 561.214 | 89.521 | 5.606 | 58.438 | 9.668 |
| Minimum | -23.690 | 0.000 | 0.324 | -7.644 | 0.000 | 0.324 |
| $\hat{\rho}_1$ | -0.021 | 0.163 | 0.789 | -0.019 | 0.195 | 0.960 |
| $\hat{\rho}_2$ | -0.053 | 0.138 | 0.623 | -0.019 | 0.087 | 0.933 |
| $\hat{\rho}_3$ | -0.039 | 0.071 | 0.622 | -0.048 | 0.051 | 0.914 |
| $\hat{\rho}_4$ | -0.042 | 0.026 | 0.598 | -0.019 | 0.114 | 0.899 |
| $\hat{\rho}_5$ | 0.024 | 0.128 | 0.568 | -0.034 | 0.139 | 0.875 |
| Q(12) | 29.644 | 253.45 | 10760 | 34.147 | 341.68 | 27065 |

$\hat{\rho}_\ell$ is the sample autocorrelation coefficient at lag $\ell$ with asymptotic standard error $1/\sqrt{T}$ and $Q(\ell)$ is the Box-Ljung portmanteau statistic based on $\ell$ squared autocorrelations. The critical value at the 1% significance level for the Q(12) statistic is 26.22.

The graphs in figure 1 show that the VIX and the squared return series follow a very similar pattern although the VIX series is much smoother, i.e. less volatile, which is confirmed by their respective variance statistics in table 1. Further, the degree of autocorrelation is much higher for the VIX series than for the squared return series, especially for the 1988–1999 sample, where autocorrelation coefficients for the VIX series are comparable with the persistence parameter values generally found for SV and GARCH models. The Q(12) statistics for the return series indicate that the null hypothesis of zero values for the first twelve autocorrelation coefficients has to be rejected at the 1% level for both samples.
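The annualised standard deviations quoted for the return variances and the full-sample VIX mean can be reproduced from Table 1, assuming the same 252-trading-day convention that is used to convert the VIX to a daily variance. The sketch below uses hypothetical function names; it also computes the ratio of a sample autocorrelation to its asymptotic standard error $1/\sqrt{T}$ from the table notes, which gives a quick check on individual autocorrelation coefficients.

```python
import math

TRADING_DAYS = 252  # assumed annualisation factor, matching the VIX conversion


def annualised_std(daily_variance):
    """Annualised standard deviation (in percent) from a daily variance of percent returns."""
    return math.sqrt(TRADING_DAYS * daily_variance)


def autocorr_z(rho_hat, n_obs):
    """Sample autocorrelation divided by its asymptotic standard error 1/sqrt(T)."""
    return rho_hat * math.sqrt(n_obs)
```

For example, `annualised_std(1.211)` and `annualised_std(0.905)` round to 17.5 and 15.1, and `annualised_std(1.858)` rounds to 21.6, in line with the figures in the text.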
The first order autocorrelation coefficients are however not significantly different from zero, even though this is frequently observed for stock index series [10]. We therefore leave the conditional mean $\mu_t$ unmodelled, i.e. we assume that in equation (1) $\mu_t = 0$.

[10] See, e.g., Campbell, Lo and MacKinlay (1997, Chapter 2).

4.2 Empirical in-sample results

In this section we present the results obtained with the four models introduced in section 2, which are the SV, the SVX, the SVX+ and the VX model. The general mean and variance equations are defined in equations (1), with $y_t = R_t$ and $\mu_t = 0$, and (2), respectively. The log-volatility processes for these models are given by

SV Model: $h_t = \phi h_{t-1} + \sigma_\eta \eta_t$
SVX Model: $h_t = \phi h_{t-1} + \gamma x_t + \sigma_\eta \eta_t$
SVX+ Model: $h_t = \phi h_{t-1} + \gamma (1 - \phi L) x_t + \sigma_\eta \eta_t$
VX Model: $h_t = \gamma x_t$

The SV model can be obtained by imposing the parameter restriction $\gamma = 0$ in the $h_t$ definition of either the SVX or the SVX+ model, and when we impose $\phi = \sigma_\eta = 0$ for the SVX or the SVX+ model we obtain the VX model. For these combinations of models we can therefore apply statistical tests for nested models. In table 2 we present the parameter estimates and results for the tests of the presumed comprehensive informational content of the VIX index relative to the SV model for daily returns on the Standard & Poor's 100 index over the periods 1986–1999 and 1988–1999.

For the SV model we find that the estimated coefficients for the persistence parameter $\phi$ are close to unity and statistically significant for both samples. The fact that the shorter 1988–1999 sample does not contain the large outlier of the longer sample is reflected not only in the size of the $\phi$ estimate, but also in the estimated values for the scaling parameter $\sigma^{*2}$ and the variance of $\sigma_\eta \eta_t$, as these are both larger for the more volatile 1986–1999 sample.
Of further interest are the statistics for $\varepsilon_t$, the error term in the mean equation, which indicate that the assumption of zero serial correlation is not violated by the SV model and that the SV model is capable of absorbing the excess kurtosis found in the underlying series. The normality statistics are considerably worse for the deterministic VX model, which is mainly attributable to the fact that this model does not have an error term in the variance equation. The values for $\gamma$ in this model are significantly larger than unity for both samples and, as they are combined with relatively low estimates for $\sigma^{*2}$, this means that the volatility process of the VX model exhibits more movement than the VIX index while at the same time it results in lower variances than the implied volatility measure. This reflects our observation in section 4.1 that the VIX index tends to overstate the volatility process while at the same time it underestimates its degree of variation.$^{11}$ On the basis of the maximum likelihood statistics we would have to favour the VX model over the SV model, although the former clearly violates the model assumptions with regard to $\varepsilon_t$.

The best maximum likelihood statistics for both periods are obtained with the SVX and the SVX+ model, which combine the two sources of volatility information. We find that the $\gamma$ parameters in these models are always statistically significant, which confirms the earlier findings in the GARCH literature that implied volatility contains crucial information about the volatility process. The estimates for the $\phi$ parameters in the SVX models are negative and statistically significant, implying that the $\gamma$ parameter is upwardly biased. When we include the persistence adjustment term in the SVX+ model, $\phi$ is still found to be negative but no longer significant, and the values for $\gamma$ move towards unity and are close to being statistically equal to one. The values for $\gamma$ are now smaller than those of the VX model.
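The likelihood-based comparisons reported in table 2 rest on two simple quantities: the likelihood ratio statistic, twice the difference between the unrestricted and restricted log-likelihoods, and the AIC, calculated as -2(ln L) + 2p. A minimal sketch follows, using the 1986–1999 log-likelihood values from table 2. The parameter counts (3 for SV, 4 for SVX, 2 for VX) are inferred from the model definitions rather than quoted from the paper, and the function names are illustrative.

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 ln L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params

def lr_stat(log_lik_restricted, log_lik_unrestricted):
    """Likelihood ratio statistic for a restricted model nested in an unrestricted one."""
    return 2.0 * (log_lik_unrestricted - log_lik_restricted)

# Log-likelihood values for the 1986-1999 sample as reported in table 2
LOGLIK = {"SV": -4633.02, "SVX": -4481.46, "VX": -4530.95}

# 1% critical values of the chi-squared distribution with 1 and 2 degrees of freedom
CHI2_1PCT = {1: 6.635, 2: 9.210}

if __name__ == "__main__":
    lr_sv = lr_stat(LOGLIK["SV"], LOGLIK["SVX"])   # SV against SVX, one restriction
    lr_vx = lr_stat(LOGLIK["VX"], LOGLIK["SVX"])   # VX against SVX, two restrictions
    print(round(lr_sv, 2), lr_sv > CHI2_1PCT[1])
    print(round(lr_vx, 2), lr_vx > CHI2_1PCT[2])
    print(round(aic(LOGLIK["SV"], 3), 2))
```

Both statistics exceed their 1% critical values by a wide margin, so each restricted model is rejected in favour of the SVX specification on these figures.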
The extra movement in the volatility process is compensated by $\eta_t$, which has highly significant values

11 See Fleming (1998), who also observes that implied volatility, on average, exceeds realised volatility and is therefore upwardly biased.

Table 2: Estimation results

Period T SV 0:726 0 923 1986–1999 3531 1988–1999 3027 SV 0:688 0 921 Model 0:426 0 456 SVX 0:425 0 455 SVX+ 0:499 0 527 VX 0:421 : SVX SVX+ 0:420 VX 0:483 2 0 571 : 0 398 0 397 0 473 0 514 0 393 0 984 0 959 : 0 233 0 966 0 991 1 224 0:975 : : 1 395 1 009 1 154 1 085 1 123 0 387 : : ;0:373 ;0:035 ;0:247 ;0:213 : : 0 012 : : ;0:008 : : : : : : : : 0:982 : 0 036 : : 1:310 : 0 398 : 0 228 : : 1:081 0:020 : 1:156 ;0:402 ;0:031 ;0:420 1 213 ;0:229 0 452 0 392 : ;0:136 1:314 1 416 0 451 : 0 457 : 0 184 0 511 : : : 0 987 : 0 195 1:072 0:274 0 384 : 1 156 : : : 1 070 : 0 211 1:150 0:290 -3696.53 0 399 1 229 : : 2 0 022 : 0 051 0 210 0:033 : : : : 0:285 -4484.44 -4530.95 0:301 12 -4633.02 303.12 297.16 9272.04 19.633 18.969 8970.92 20.588 21.127 8976.88 20.241 22.796 98.98 93.02 9065.90 20.693 325.822 -4481.46 ln L LR( LR( LR( LR( AIC Q(12) N = 0)a = 0)b = 2 = 0)a = 2 = 0)b -3813.83 234.60 230.02 7633.66 19.348 18.364 -3698.82 -3717.78 7401.06 21.856 24.461 7405.64 21.213 22.925 42.50 37.92 7439.56 22.776 80.302

Parameter estimates are reported together with the asymptotic 95% confidence intervals, which are asymmetric for $\sigma^{*2}$, $\phi$ and $\sigma^2_\eta$. LR($\gamma = 0$) and LR($\phi = \sigma^2_\eta = 0$) are the likelihood ratio statistics for the hypotheses $\gamma = 0$ and $\phi = \sigma^2_\eta = 0$, respectively, as measured against (a) the SVX and (b) the SVX+ model. AIC is the Akaike Information Criterion, which is calculated as $-2(\ln L) + 2p$. Q($\ell$) is the Box-Ljung portmanteau statistic for the estimated observation errors, which is asymptotically $\chi^2$ distributed with $\ell - p$ degrees of freedom, where $p$ is the total number of estimated parameters. N is the $\chi^2$ normality test statistic with 2 degrees of freedom.
for its variance $\sigma^2_\eta$, where the increase in $\sigma^2_\eta$ relative to that of the SV model is due to the fact that the level of the $h_t$ process is higher. The likelihood ratio statistics then indicate that we can always reject the joint hypothesis that both $\phi$ and $\sigma^2_\eta$ are equal to zero. The volatility process is therefore best described by the SVX+ model, rather than the SVX model with its higher maximum likelihood values, as the latter model overestimates the value for $\gamma$ and therefore provides a biased estimate of the volatility process. The SVX+ model is also to be preferred to the VX model, which appears to be mainly attributable to the omission of the stochastic component $\eta_t$ in the deterministic VX model. Our overall conclusion is therefore that the in-sample volatility process is best described by a volatility model which includes not only implied volatility but also a stochastic element, as shocks to the volatility process are not sufficiently captured by the implied volatility measure alone.

5 Volatility Forecasting Methodology

We develop forecasts based on the rolling window principle where we initially estimate the parameters over the period 2 January 1986 to 31 December 1994. This sample therefore spans a period of 9 years and consists of 2270 observations. This leaves an evaluation period of 1261 observations covering five years of data, i.e. from 3 January 1995 to 31 December 1999. Having calculated the volatility forecasts based on the parameters of this initial sample we roll it forward by one trading day, thus keeping the sample size constant at 2270 observations. We also construct volatility forecasts for 2, 3, 4, 5 and 10 day horizons. We obtain non-overlapping forecasts because we roll the estimation sample forward by N observations, where N denotes the forecasting horizon in terms of trading days.
This means that we re-estimate the model parameters 1261 times for the one day ahead forecasts and 630, 420, 315, 252 and 126 times for the 2, 3, 4, 5 and 10 day forecasts, respectively.

5.1 Stochastic Volatility model forecasts

The one step ahead volatility forecast for the SV model, as defined in equations (1), (2) and (3), is calculated as

$$E(\sigma^2_{T+1|T}) = \exp(\ln \hat\sigma^{*2} + h_{T+1|T} + 0.5\, p_{T+1|T}) \qquad (14)$$

and the N step ahead volatility forecast is defined as

$$E(\sigma^2_{T+1,T+N|T}) = \sum_{j=1}^{N} \exp(\ln \hat\sigma^{*2} + h_{T+j|T} + 0.5\, p_{T+j|T})$$
$$= \hat\sigma^{*2} \exp(h_{T+1|T} + 0.5\, p_{T+1|T}) + \hat\sigma^{*2} \sum_{j=2}^{N} \exp\left[ \hat\phi^{\,j-1} h_{T+1|T} + 0.5 \left( \hat\phi^{\,2(j-1)} p_{T+1|T} + \sum_{i=0}^{j-2} \hat\phi^{\,2i} \hat\sigma^2_\eta \right) \right] \qquad (15)$$

where $\hat\sigma^{*2}$, $\hat\phi$ and $\hat\sigma^2_\eta$ are the maximum likelihood estimates of $\sigma^{*2}$, $\phi$ and $\sigma^2_\eta$, respectively. The estimator of $h_{T+1}$ using all observations available at time T is denoted by $h_{T+1|T}$ with variance $p_{T+1|T}$; both are computed by the simulation methods discussed in section 3, specifically those of section 3.2.3. The multi-step forecasts are defined as a summation of the N individual forecasts conditional on the information available at time T. As N increases, the individual forecast $E(\sigma^2_{T+N|T})$ will converge to a constant which we call the individual long-term volatility forecast and which is identical to the unconditional variance as given by

$$\hat\sigma^{*2} \exp\left( 0.5\, \frac{\hat\sigma^2_\eta}{1 - \hat\phi^2} \right).$$

It is evident from equation (15) that the rate at which the individual forecasts move towards this value is determined by the size of $\hat\phi$: the smaller the volatility persistence estimate, the faster the individual forecasts converge to the individual long-term volatility forecast value. In empirical applications for daily stock returns the volatility persistence estimates are invariably found to be close to unity, which means that for shorter forecasting horizons individual forecasts are almost solely determined by the size of the short-term volatility, denoted by $\hat\sigma^{*2} \exp(h_{T+1|T} + 0.5\, p_{T+1|T})$.
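The multi-step SV forecast and its long-run limit can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the one step ahead quantities h1 and p1 are taken as given inputs (in the paper they come from the simulation methods of section 3), the parameter values in the example are hypothetical, and the variance of the j step ahead log-volatility is taken to be phi^(2(j-1)) p1 plus the accumulated shock variance over the intervening j-1 steps.

```python
import numpy as np

def sv_forecast(h1, p1, sigma_star2, phi, sigma_eta2, n_steps):
    """Sum of the individual variance forecasts for horizons 1, ..., n_steps."""
    total = 0.0
    for j in range(1, n_steps + 1):
        h_j = phi ** (j - 1) * h1
        shock_var = sigma_eta2 * sum(phi ** (2 * i) for i in range(j - 1))
        p_j = phi ** (2 * (j - 1)) * p1 + shock_var
        total += sigma_star2 * np.exp(h_j + 0.5 * p_j)
    return float(total)

def long_term_forecast(sigma_star2, phi, sigma_eta2):
    """Limit of the individual forecast, i.e. the unconditional variance."""
    return float(sigma_star2 * np.exp(0.5 * sigma_eta2 / (1.0 - phi ** 2)))

if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the convergence behaviour
    args = dict(h1=0.8, p1=0.1, sigma_star2=0.7, phi=0.9, sigma_eta2=0.03)
    for n in (1, 5, 10, 400):
        individual = sv_forecast(n_steps=n, **args) - sv_forecast(n_steps=n - 1, **args)
        print(n, round(individual, 4))
    print(round(long_term_forecast(0.7, 0.9, 0.03), 4))
```

Raising phi towards one in this sketch slows the decay of the individual forecasts towards the long-term value, which is the behaviour described in the text for daily stock returns.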
5.2 SVX+ model forecasts

The one step ahead forecasts for the SVX+ model, as defined in equations (1), (2) and (5), are obtained in a similar manner as for the SV model in equation (14) and using the same methods. However, for the SVX+ model one period ahead volatility forecast we require $x_{T+1}$ and $x_T$ in order to calculate the values for $h_{T+1|T}$ and $p_{T+1|T}$. As $x_{T+1}$ is not known at time T we choose to replace it by $x_T$, the last available implied volatility measure in the information set, but only for the purpose of calculating $h_{T+1|T}$ and $p_{T+1|T}$. The same problem occurs for volatility forecasts further into the future and therefore we choose to define the N step ahead volatility forecasts of the SVX+ model as an N multiple of the one step ahead volatility forecast, so we define these as

$$E(\sigma^2_{T+1,T+N|T}) = N \hat\sigma^{*2} \exp(h_{T+1|T} + 0.5\, p_{T+1|T}). \qquad (16)$$

5.3 VX model forecasts

The implied volatility forecasts are based on the VX model which we defined in equations (1), (2) and (6), so $h_t = \gamma x_t$. The one step ahead VX volatility forecast we calculate as

$$E(\sigma^2_{T+1|T}) = \hat\sigma^{*2} \exp(\hat\gamma x_T + 0.5\, \hat\sigma^2_h) \qquad (17)$$

where $\hat\sigma^{*2}$ and $\hat\gamma$ are the maximum likelihood estimates of $\sigma^{*2}$ and $\gamma$. The sample prediction error variance is denoted by $\hat\sigma^2_h$. Again we replace $x_{T+1}$ with $x_T$ in the one period ahead forecasting equation as the former is not yet known. The N step ahead forecasts are then given by

$$E(\sigma^2_{T+1,T+N|T}) = N \hat\sigma^{*2} \exp(\hat\gamma x_T + 0.5\, \hat\sigma^2_h) \qquad (18)$$

and these are therefore also defined as a multiple of the one day ahead VX volatility forecast.

5.4 Measuring predictive forecasting ability

To evaluate the accuracy of variance forecasts they have to be compared with realised volatility, which cannot be observed.
It is common practice in the literature to define the actual or realised variance as the squared observed returns, which for one day ahead volatility is equal to

$$R^2_{T+1} = \sigma^2_{T+1}\, \varepsilon^2_{T+1}. \qquad (19)$$

However, the squared error $\varepsilon^2_{T+1}$ will vary widely, which implies that only a relatively small part of $R^2_{T+1}$ is attributable to $\sigma^2_{T+1}$.

An alternative approach which addresses this problem has been suggested; see, for example, Andersen and Bollerslev (1998), Andersen, Bollerslev, Diebold and Labys (1999), Barndorff-Nielsen and Shephard (2001) and Andersen, Bollerslev, Diebold and Ebens (2000). In these studies intradaily return data are used to approximate ex-post volatility more accurately. Following these studies we define intradaily squared returns as the sum of the squared five minute returns between 9.30 a.m. and 4.00 p.m. EST during the relevant trading day and to this we then add the squared overnight return, so

$$\tilde\sigma^2_{T+1} = \sum_{k=1}^{m} \left[ 100 (\ln P_{T+1(k+1/m)} - \ln P_{T+1(k/m)}) \right]^2 + \left[ 100 (\ln P_{T+2(1/m)} - \ln P_{T+1(m/m)}) \right]^2 \qquad (20)$$

where $P_{T+1(1/m)}$ denotes the first price of the Standard & Poor's Index on day T + 1 at 9.30 a.m., with m representing the number of observations during day T + 1, which on a full trading day amounts to 79 observations. The price of the Standard & Poor's Index at 9.30 a.m. on the subsequent trading day is then denoted by $P_{T+2(1/m)}$.
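The two realised volatility measures can be computed directly from price data. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the intraday sum runs over consecutive pairs of the m five minute prices of the day, so that m prices yield m - 1 intraday returns, with the overnight return taken from the last price of the day to the first price of the next trading day.

```python
import numpy as np

def daily_squared_return(close_today, close_prev):
    """Squared daily percentage log return."""
    return float((100.0 * (np.log(close_today) - np.log(close_prev))) ** 2)

def intraday_realised_variance(prices_today, next_open):
    """Sum of squared five minute percentage log returns plus the squared overnight return."""
    log_p = np.log(np.asarray(prices_today, dtype=float))
    intraday = 100.0 * np.diff(log_p)                  # returns within the trading day
    overnight = 100.0 * (np.log(next_open) - log_p[-1])
    return float(np.sum(intraday ** 2) + overnight ** 2)

if __name__ == "__main__":
    # Hypothetical prices: one 1% intraday move followed by a 2% overnight move
    prices = [100.0, 100.0 * np.exp(0.01)]
    print(intraday_realised_variance(prices, prices[-1] * np.exp(0.02)))
```

Multiple-day values follow by summing the daily figures over the forecasting interval.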
The multiple-day values of the daily squared returns and the intradaily squared returns are obtained by summing the realised volatility measures of equations (19) and (20) over the relevant forecasting interval, so

$$R^2_{T+1,T+N} = \sum_{i=1}^{N} R^2_{T+i} \quad \text{with} \quad R^2_{T+i} = \left[ 100 (\ln P_{T+i} - \ln P_{T+i-1}) \right]^2 \qquad (21)$$

where $P_{T+i}$ denotes the closing price of the Standard & Poor's Index at time T + i, and

$$\tilde\sigma^2_{T+1,T+N} = \sum_{i=1}^{N} \tilde\sigma^2_{T+i} \qquad (22)$$

with

$$\tilde\sigma^2_{T+i} = \sum_{k=1}^{m} \left[ 100 (\ln P_{T+i(k+1/m)} - \ln P_{T+i(k/m)}) \right]^2 + \left[ 100 (\ln P_{T+i+1(1/m)} - \ln P_{T+i(m/m)}) \right]^2.$$

In order to assess the predictive accuracy of the volatility forecasts we compare the goodness-of-fit $R^2$ statistics, which are calculated from the regressions

$$R^2_{T+1,T+N} = a + b\, E(\sigma^2_{T+1,T+N|T}) + \text{error} \qquad (23)$$

and

$$\tilde\sigma^2_{T+1,T+N} = a + b\, E(\sigma^2_{T+1,T+N|T}) + \text{error} \qquad (24)$$

for the squared and the intradaily squared returns, respectively. If the relevant volatility forecast is unbiased, then a = 0 and b = 1. In addition to the regression based evaluation method we also report on a number of error statistics, which are the mean squared error (MSE), the median squared error (MedSE) and the mean absolute error (MAE), as these criteria are also frequently applied in the volatility forecasting literature.

6 Out-of-Sample Results

In this section we report on the out-of-sample forecasting results of the SV, the SVX+ and the VX model over the evaluation period 1995 to 1999. Before we discuss these results in section 6.2 we will examine the relationships between the various parameters in the SV model.

6.1 The parameter estimates of the SV model

For the one day ahead SV volatility forecasts we had to estimate the SV model 1261 times, which resulted in an equal number of estimates for all the parameters in the model. This now allows us to examine the dynamics of the SV model as the forecasting sample rolls forward by one observation at a time. For this purpose we plot in figure 2 the estimates of the persistence parameter $\phi$, the error variance of the volatility process $\sigma^2_\eta$, and the scaling parameter $\sigma^{*2}$. These estimates are based on the previous 9 years of data and the sample variance of each of these data series is plotted in the same figure alongside those of the SV parameter estimates.

Figure 2: Parameter estimates of the SV model and forecasting sample variance based on the previous 9 years of data. Panels: SV Model: persistence parameter; Forecasting sample variance; SV Model: error variance; SV Model: scaling parameter (horizontal axes 1995 to 1999).

The graph for the $\hat\phi$ parameter shows that volatility is highly persistent for all 1261 samples and when we compare this time series with that of $\hat\sigma^2_\eta$ we observe that these series move in opposite directions during the entire period. The negative relationship between these two parameter estimates indicates that large (unexpected) shocks to the volatility process have a downward effect on the $\phi$ estimate and that volatility persistence is higher when these shocks are more moderate in size, i.e. when values for $\hat\sigma^2_\eta$ are smaller. For the forecasting sample variance, depicted in the top right corner of figure 2, we see a sharp decrease after approximately two years when the observations relating to the 1987 stock market crash drop out of the forecasting sample. This decrease in volatility is also reflected in $\hat\sigma^2_\eta$ and in the estimated value of the scaling parameter $\sigma^{*2}$, which displays a positive correlation with the variance series of the forecasting samples over the full period. The relationship between the scaling parameter and the other two SV parameters differs however across the sample, which is to be expected as they measure different aspects of the volatility process: the estimate for $\sigma^{*2}$ reflects the level of volatility, $\hat\phi$ measures the degree of volatility persistence and the value for $\hat\sigma^2_\eta$ indicates the amount of variation in the volatility process.
This means that during times of persistent high volatility we will observe high values for $\hat\sigma^{*2}$ and $\hat\phi$ but low values for $\hat\sigma^2_\eta$, as there is relatively little movement in the volatility process itself. However, the estimated value for the scaling parameter will still be large when a high level of volatility is due to a few outliers, but the variation in the volatility process as measured by $\hat\sigma^2_\eta$ is then going to be higher and will be accompanied by a smaller value for the volatility persistence estimate $\hat\phi$.

6.2 Empirical out-of-sample forecasting results

The empirical out-of-sample forecasting results based on the methodology described in section 5 are presented in tables 3 and 4. In table 3 we evaluate the volatility forecasts obtained with the SV, SVX+ and VX model against the squared returns over the full five year evaluation period. The goodness-of-fit $R^2$ statistics for the SVX+ and the VX model forecasts are higher than those of the SV model. However, the hypothesis that a = 0 or b = 1 is least violated for the SV model volatility forecasts, indicating that the SV model forecasts exhibit the smallest degree of bias. When we evaluate the accuracy of the volatility forecasts on the basis of error statistics a different picture emerges. Although the SVX+ and VX model still perform very well when we consider the MSE error statistic, we observe that the VX model fares much worse in terms of the MedSE and the MAE statistics. For these error statistics the VX model is not only outperformed by the SVX+ model but also frequently by the SV model. The results of this evaluation are therefore mixed: the SV model produces volatility forecasts that have the smallest degree of bias, the SVX+ and VX model have very similar goodness-of-fit and MSE values, and the SVX+ model produces the most accurate out-of-sample volatility forecasts in terms of the MedSE and MAE error statistics.
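The evaluation measures behind tables 3 and 4 can be sketched as follows: an OLS regression of realised volatility on the forecast, giving a, b and the goodness-of-fit R2, together with the three error statistics. This is a minimal illustration with a hypothetical function name and toy inputs; it does not compute the t-statistics reported in the tables.

```python
import numpy as np

def evaluate_forecasts(realised, forecast):
    """Regress realised on forecast and compute MSE, MedSE and MAE of the forecast errors."""
    y = np.asarray(realised, dtype=float)
    f = np.asarray(forecast, dtype=float)
    X = np.column_stack([np.ones_like(f), f])         # intercept a and slope b
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    err = y - f
    return {
        "a": float(coef[0]),
        "b": float(coef[1]),
        "R2": float(r2),
        "MSE": float(np.mean(err ** 2)),
        "MedSE": float(np.median(err ** 2)),
        "MAE": float(np.mean(np.abs(err))),
    }

if __name__ == "__main__":
    forecast = [1.0, 2.0, 3.0, 4.0]                   # toy forecasts
    realised = [2.5, 4.5, 6.5, 8.5]                   # toy realised values, 0.5 + 2 * forecast
    print(evaluate_forecasts(realised, forecast))
```

In this toy case the regression fits perfectly (R2 of one) even though the forecasts are clearly biased (a = 0.5, b = 2), which illustrates why the goodness-of-fit statistic is read alongside the bias tests and the error statistics.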
In table 4 we present the forecasting evaluation results with realised returns defined in terms of five minute squared returns. As this high frequency data series does not start until 6 January 1997, our evaluation period consists of 3 years of data, resulting in forecasting samples of 754, 377, 251, 188, 150 and 75 observations for N = 1, 2, 3, 4, 5 and 10, respectively. We observe that the values for the $R^2$ statistic increase considerably when we define realised volatility in terms of five minute squared returns and this degree of increase conforms to earlier findings in the high frequency return literature. The highest values for the goodness-of-fit statistic are almost always those of the VX model; its volatility forecasts are however severely upwardly biased, as the hypothesis that b = 1 has to be rejected at very low significance levels for values of b smaller than unity. The SVX+ model, which has comparable values for $R^2$, produces volatility forecasts that are far less biased. The error statistics also favour this model, as the error statistics of the SVX+ model forecasts are, with the exception of two statistics for N = 10, always the lowest of the three models. In addition, we observe that the VX model is almost consistently the worst performing forecasting model in terms of error statistics. We therefore conclude that the worst performing forecasting model is the VX model even though it has the highest $R^2$ statistics. The most accurate volatility forecasts are obtained with the SVX+ model, which has goodness-of-fit values similar to those of the VX model but combines this with the best error statistics and forecasts that are less biased.

In figure 3 we graph the one period ahead volatility forecasts of the SV, the SVX+ and the VX model together with the two measures of realised volatility, where the scale of the daily squared return series differs from those of the other four series.
However, it is obvious from the graphs that all five series follow a very similar pattern and we can clearly distinguish two periods of increased volatility which occur towards the end of 1997 and 1998. The SVX+ and VX forecasting series are near perfectly correlated but forecasts of the VX model are on average 17% higher than those of the SVX+ model, which is favourable for the VX model when volatility is very high but leads to overestimation during times of relative tranquility. In terms of sample moments the SVX+ model produces forecasts that are very much like those of the SV model even though its correlation coefficient with the SV forecasting series is lower at 83%. Although difficult to discern from figure 3, all forecasting series, on average, overestimate realised volatility both in terms of daily squared and intradaily squared returns; this problem is most severe for the VX volatility forecasts.

Table 3: Out-of-sample forecasting results evaluated against daily squared returns for the (i) SV model, (ii) SVX+ model and the (iii) VX model based on the 1986–1999 sample and for the evaluation period 3 January 1995 to 31 December 1999

Forecasting Model    N = 1      N = 2      N = 3      N = 4      N = 5      N = 10

SV Model
a                    0.0365     0.2881     0.6118     0.8098     1.3221     2.4698
                    (0.268)    (0.908)    (1.207)    (1.161)    (1.410)    (1.025)
b                    1.1356     1.0221     0.9601     0.9671     0.9024     0.9519
                    (1.157)    (0.161)    (0.272)    (0.215)    (0.589)    (0.210)
R2                   0.0694     0.0817     0.0929     0.1131     0.1061     0.1222
MSE                  8.1406     21.462     35.785     49.878     70.705     190.76
MedSE                0.3517     0.9406     1.4160     2.0022     2.5819     7.3689
MAE                  1.1171     1.8056     2.2633     2.8325     3.1066     5.4334

SVX+ Model
a                   -0.1675    -0.0345     0.3600     0.3726     1.2079     2.9023
                    (1.313)    (0.119)    (0.787)    (0.609)    (1.392)    (1.464)
b                    1.3959     1.2330     1.0758     1.1219     0.9451     0.8992
                    (3.517)    (1.871)    (0.582)    (0.921)    (0.362)    (0.578)
R2                   0.1130     0.1350     0.1404     0.1869     0.1347     0.1766
MSE                  7.8423     20.378     34.023     46.074     68.578     179.39
MedSE                0.2991     0.7753     1.2804     1.6861     2.2644     6.5155
MAE                  1.0682     1.6943     2.1192     2.5732     3.0286     5.2690

VX Model
a                   -0.1357     0.0651     0.5011     0.6776     1.4608     3.3400
                    (1.100)    (0.232)    (1.130)    (1.147)    (1.729)    (1.720)
b                    1.1222     0.9702     0.8447     0.8526     0.7326     0.7031
                    (1.416)    (0.307)    (1.524)    (1.440)    (2.241)    (2.134)
R2                   0.1185     0.1369     0.1411     0.1815     0.1309     0.1708
MSE                  7.6897     20.073     33.829     45.895     69.284     182.69
MedSE                0.4436     1.0658     1.7170     2.1253     2.8655     8.2397
MAE                  1.1430     1.8008     2.2585     2.7704     3.1914     5.7354

Parameter estimates and goodness-of-fit R2 statistics for the OLS regressions as defined in equation (23) with t-statistics in parentheses testing for the null hypotheses a = 0 and b = 1. The highest values for R2 and the lowest error statistic values are underlined.

Table 4: Out-of-sample forecasting results evaluated against intradaily squared returns for the (i) SV model, (ii) SVX+ model and the (iii) VX model based on the 1988–1999 sample and for the evaluation period 6 January 1997 to 31 December 1999

Forecasting Model    N = 1      N = 2      N = 3      N = 4      N = 5      N = 10

SV Model
a                    0.0585     0.2670     0.3966     0.6137     0.9790     3.0864
                    (0.807)    (1.485)    (1.275)    (1.210)    (1.402)    (1.628)
b                    0.9240     0.8712     0.8714     0.8757     0.8334     0.7428
                    (1.509)    (2.033)    (1.761)    (1.330)    (1.640)    (1.836)
R2                   0.3091     0.3350     0.3639     0.3207     0.3126     0.2781
MSE                  0.9803     2.8944     5.7255     9.6701     14.951     50.127
MedSE                0.1549     0.4704     1.0704     1.5317     2.6939     5.2282
MAE                  0.5944     1.0395     1.4697     1.8761     2.3689     4.1301

SVX+ Model
a                   -0.1130    -0.0036     0.3572     0.8304     1.0334     3.2963
                    (1.661)    (0.022)    (1.231)    (1.978)    (1.772)    (2.153)
b                    1.0472     0.9595     0.8579     0.7999     0.8021     0.6934
                    (1.011)    (0.703)    (2.168)    (2.790)    (2.475)    (2.944)
R2                   0.4006     0.4258     0.4079     0.4007     0.4046     0.3780
MSE                  0.8508     2.4842     5.3923     8.8366     13.289     46.585
MedSE                0.1369     0.4193     0.8846     1.1947     2.1332     8.1839
MAE                  0.5548     0.9981     1.4528     1.8032     2.2530     4.3358

VX Model
a                   -0.0496     0.1182     0.5301     1.0793     1.2621     3.6432
                    (0.781)    (0.754)    (1.957)    (2.684)    (2.272)    (2.462)
b                    0.8204     0.7507     0.6686     0.6182     0.6303     0.5503
                    (5.097)    (5.691)    (6.716)    (6.860)    (5.985)    (5.469)
R2                   0.4189     0.4392     0.4244     0.3988     0.4130     0.3802
MSE                  0.9557     3.0408     7.0630     12.206     18.034     69.094
MedSE                0.2613     0.6934     1.8086     3.0596     3.6090     16.905
MAE                  0.6681     1.2213     1.8188     2.2921     2.8458     5.9025

Parameter estimates and goodness-of-fit R2 statistics for the OLS regressions as defined in equation (24) with t-statistics in parentheses testing for the null hypotheses a = 0 and b = 1. The highest values for R2 and the lowest error statistic values are underlined.

Figure 3: Daily squared and intradaily squared returns together with the one day ahead volatility forecasts of the (i) SV, (ii) SVX+ and (iii) VX model for the Standard & Poor's 100 index over the period 03/01/95 to 31/12/99 based on a 9 year rolling window sample. Panels: Daily squared returns; Intradaily squared returns; (i); (ii); (iii) (horizontal axes 1995 to 1999).

7 Summary and Conclusions

In this paper we examine the predictive ability of Stochastic Volatility (SV) models and compare their forecasts with the volatility forecasts implied by option prices for daily returns on the Standard & Poor's 100 Stock Index. For this purpose we extend the SV model to a volatility model which allows for the inclusion of implied volatility as an exogenous variable in the variance equation. As the resulting SVX model includes an entire lag structure of the implied volatility measure, we extend the SVX model further to adjust for this with a persistence adjustment term and thus obtain a second model which we call the SVX+ model. In addition we define a volatility model which only utilises implied volatility and refer to it as the VX model. We have estimated the SV, SVX and SVX+ models successfully by exact maximum likelihood using Monte Carlo importance sampling methods.
Our in-sample results indicate that historical returns are outperformed by implied volatility but that the former contain additional information about the volatility process in the form of stochastic shocks that are incorporated in the variance equation of the SVX type models. Our results do not contradict earlier findings in the GARCH literature, where recent research has indicated that implied volatility provides the most accurate volatility forecasts, as the GARCH class of models is by definition deterministic and does not allow for a stochastic element in the variance equation.

The out-of-sample volatility forecasts are constructed for forecasting horizons ranging from 1 to 10 trading days and we approximate realised volatility as daily squared returns and intradaily squared returns following research by, for example, Andersen and Bollerslev (1998). The relative forecasting accuracy of the various volatility models is evaluated using both regression based evaluation methods and error statistics. We obtain mixed results when we define realised volatility in terms of daily squared returns, but when we use intradaily squared returns for our forecasting evaluation we find that the most accurate ex-ante volatility forecasts are obtained with the SVX+ model. Although the $R^2$ statistics are the highest for the VX model, we find on closer examination that this model produces forecasts that are severely upwardly biased and therefore conclude that this model is outperformed not only by the SVX+ but also by the SV model.

Appendix: approximating model used for simulation

Consider the single density component $p_t = N(0, \sigma^2_t)$ with $\sigma^2_t = \exp(\theta_t)$ where $\theta_t$ is given in section 3.2.
Here we develop the approximating model based on a linear Gaussian model with mean $E(y_t) = \theta_t + c_t$ and variance $V(y_t) = H_t$, that is

$$y_t = \theta_t + u_t, \qquad u_t \sim N(c_t, H_t), \qquad t = 1, \ldots, T \qquad (25)$$

where $c_t$ and $H_t$ are determined in such a way that the mean and variance of $y_t$ implied by the approximating model (25) and by the true model (1) and (4) are as close as possible.$^{12}$ We achieve this by equalising the first and second derivatives of $p(y|\theta)$ and $g(y|\theta)$ with respect to $\theta$ at $\hat\theta = \tilde{E}(\theta) = \int \theta\, g(\theta|y)\, d\theta$. Note that $p(\cdot)$ refers to a density for the true model and $g(\cdot)$ refers to a density for the approximating Gaussian model. Further, it follows that $\hat\theta$ can simply be obtained via the Kalman filter and smoother applied to the approximating model (25). The conditional densities are given by

$$p(y|\theta) = \prod_{t=1}^{T} p_t, \qquad g(y|\theta) = \prod_{t=1}^{T} g_t \qquad (26)$$

with $p_t = N(0, \sigma^2_t) = p(y_t|\theta_t)$ and $g_t = N(c_t + \theta_t, H_t) = g(y_t|\theta_t)$, which in logs are

$$\ln p_t = -0.5 \left[ \ln 2\pi + \theta_t + \exp(-\theta_t) y_t^2 \right]$$
$$\ln g_t = -0.5 \left[ \ln 2\pi + \ln H_t + H_t^{-1} (y_t - c_t - \theta_t)^2 \right].$$

Differentiating both log-densities twice with respect to $\theta_t$ gives

$$\dot p_t = -0.5 \left[ 1 - \exp(-\theta_t) y_t^2 \right], \qquad \ddot p_t = -0.5 \exp(-\theta_t) y_t^2,$$
$$\dot g_t = H_t^{-1} (y_t - c_t - \theta_t), \qquad \ddot g_t = -H_t^{-1}.$$

Equalising the first and second derivatives, that is $\dot p_t = \dot g_t$ and $\ddot p_t = \ddot g_t$ for $t = 1, \ldots, T$, leads to

$$c_t = y_t - \theta_t + 0.5 H_t - 1, \qquad H_t = 2 \exp(\theta_t) / y_t^2.$$

For given values of $\theta_t = \tilde\theta_t$, the resulting model for $\tilde y_t = y_t - c_t$ is equivalent to

$$\tilde y_t = \theta_t + \tilde u_t, \qquad \tilde u_t \sim N(0, \tilde H_t), \qquad t = 1, \ldots, T$$

with

$$\tilde y_t = \tilde\theta_t - 0.5 \tilde H_t + 1, \qquad \tilde H_t = 2 \exp(\tilde\theta_t) / y_t^2.$$

It should be noted that $\tilde H_t > 0$ for any value of $\tilde\theta_t$. We cannot solve out for $\tilde y_t$ and $\tilde H_t$ at $\hat\theta_t = \tilde{E}(\theta_t)$ because $\tilde{E}$ refers to expectation with respect to the approximating model, which depends on $\hat\theta_t$. However, such a complicated but linear system of equations is usually solved iteratively by starting with a trial value $\theta_t = \tilde\theta_t$. Computing $\tilde y_t$ and $\tilde H_t$ and applying the Kalman filter smoother to model (25) leads to a smoothed estimate for $\theta_t$ which can be used as a new trial value for $\theta_t$. Recomputing $\tilde y_t$ and $\tilde H_t$ based
on this new trial value leads to an iterative procedure which converges to $\hat\theta_t$. Note that the first and second derivatives of the true and approximating densities are equal at $\theta_t = \hat\theta_t$. More details are given by Durbin and Koopman (1997). It is worth mentioning that $\hat\theta_t$ is equal to the mode of $p(\theta_t|y)$, which can be of interest.

12 Note that the true model implies a nonlinear relationship between $y_t$ and $\theta_t$; the approximating (linear) model is effectively a second-order Taylor expansion of the true model around $\hat\theta_t$. Further, the multivariate Gaussian density $g(\theta|y)$ can be regarded as a Laplace approximation to the true density $p(\theta|y)$; see Shephard and Pitt (1997).

References

Akgiray, V. (1989), Conditional Heteroscedasticity in Time Series of Stock Returns: Evidence and Forecasts, Journal of Business 62, 55-80.

Amin, K.I. and V.K. Ng (1997), Inferring Future Volatility from the Information in Implied Volatility in Eurodollar Options: A New Approach, Review of Financial Studies 2, 333-367.

Andersen, T.G. and T. Bollerslev (1998), Answering the Skeptics: Yes, Standard Volatility Models do Provide Accurate Forecasts, International Economic Review 39, 885-905.

Andersen, T.G., T. Bollerslev, F.X. Diebold and H. Ebens (2000), The Distribution of Stock Return Volatility, Working Paper, Northwestern University, Duke University, New York University and Johns Hopkins University.

Andersen, T.G., T. Bollerslev, F.X. Diebold and P. Labys (1999), The Distribution of Exchange Rate Volatility, Working Paper, Northwestern University, Duke University, New York University and University of Pennsylvania, Revised version of NBER Working Paper No. 6961 (1998).

Barndorff-Nielsen, O.E. and N. Shephard (2001), Non-Gaussian OU Based Models and Some of Their Uses in Financial Economics (with discussion), Journal of the Royal Statistical Society, Series B, forthcoming.

Beckers, S. (1981), Standard Deviations Implied in Options Prices as Predictors of Future Stock Price Variability, Journal of Banking and Finance 5, 363-381.

Bera, A.K. and M.L. Higgins (1993), ARCH Models: Properties, Estimation and Testing, Journal of Economic Surveys 7, 305-366.

Blair, B., S. Poon and S.J. Taylor (2000), Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High Frequency Returns, Working Paper, Lancaster University.

Bollerslev, T. (1986), Generalized Autoregressive Conditional Heteroskedasticity, Journal of Econometrics 31, 307-327.

Bollerslev, T., R.Y. Chou and K.F. Kroner (1992), ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence, Journal of Econometrics 52, 5-59.

Bollerslev, T., R.F. Engle and D.B. Nelson (1994), ARCH Models, in: Handbook of Econometrics, Vol. 4, eds. R.F. Engle and D.L. McFadden, Elsevier Science, Amsterdam, 2959-3038.

Campbell, J.Y., A.W. Lo and A.C. MacKinlay (1997), The Econometrics of Financial Markets, New Jersey: Princeton University Press.

Canina, L. and S. Figlewski (1993), The Informational Content of Implied Volatility, Review of Financial Studies 6, 659-693.

Chib, S., F. Nardari and N. Shephard (1998), Markov Chain Monte Carlo Methods for Generalized Stochastic Volatility Models, Discussion Paper: Nuffield College, Oxford.

Chiras, D.P. and S. Manaster (1978), The Information Content of Options Prices and a Test of Market Efficiency, Journal of Financial Economics 6, 213-234.

Christensen, B.J. and N.R. Prabhala (1998), The Relation between Implied and Realized Volatility, Journal of Financial Economics 50, 125-150.

Day, T.E. and C.M. Lewis (1992), Stock Market Volatility and the Information Content of Stock Index Options, Journal of Econometrics 52, 267-287.

de Jong, P. and N. Shephard (1995), The Simulation Smoother for Time Series Models, Biometrika 82, 339-350.

Diebold, F.X. and J.A. Lopez (1995), Modeling Volatility Dynamics, in: Macroeconometrics: Developments, Tensions and Prospects, eds. K. Hoover, Kluwer Academic Publishers, Amsterdam, 427-472.

Dimson, E. and P. Marsh (1990), Volatility Forecasting without Data-Snooping, Journal of Banking and Finance 14, 399-421.

Doornik, J.A. (1998), Object-Oriented Matrix Programming using Ox 2.0, Timberlake Consultants Press, London. www.nuff.ox.ac.uk/Users/Doornik/

Durbin, J. and S.J. Koopman (1997), Monte Carlo Maximum Likelihood Estimation for Non-Gaussian State Space Models, Biometrika 84, 669-84.

Durbin, J. and S.J. Koopman (2000), Time Series Analysis of Non-Gaussian Observations based on State Space Models from both Classical and Bayesian Perspectives (with discussion), Journal of the Royal Statistical Society, Series B, 62, 3-56.

Engle, R.F. (1982), Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation, Econometrica 50, 987-1006.

Fleming, J., B. Ostdiek and R.E. Whaley (1995), Predicting Stock Market Volatility: A New Measure, Journal of Futures Markets 15, 265-302.

Fleming, J. (1998), The Quality of Market Volatility Forecast Implied by S&P 100 Index Option Prices, Journal of Empirical Finance 5, 317-345.

Ghysels, E., A.C. Harvey and E. Renault (1996), Stochastic Volatility, in: Handbook of Statistics, Vol. 14, Statistical Methods in Finance, eds. G.S. Maddala and C.R. Rao, North-Holland, Amsterdam, 128-198.

Harvey, A.C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, Cambridge.

Harvey, A.C. (1993), Time Series Models, 2nd edition, Harvester Wheatsheaf, Hemel Hempstead.

Harvey, A.C., E. Ruiz and N. Shephard (1994), Multivariate Stochastic Variance Models, Review of Economic Studies 61, 247-264.

Heynen, R.C. (1995), Essays on Derivatives Pricing Theory, PhD Dissertation, Erasmus University Rotterdam, Thesis Publishers, Amsterdam.

Johnston, J. and J. DiNardo (1997), Econometric Methods, 4th edition, McGraw-Hill, International editions.

Kim, S., N. Shephard and S. Chib (1998), Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models, Review of Economic Studies 65, 361-393.

Koopman, S.J., A.C. Harvey, J.A. Doornik and N. Shephard (2000), STAMP 6.0 Structural Time Series Analyser, Modeller and Predictor, Timberlake Consultants, London. www.stamp-software.com

Koopman, S.J. and E. Hol Uspensky (2000), The Stochastic Volatility in Mean Model: Empirical Evidence from International Stock Markets, Discussion Paper, Tinbergen Institute, The Netherlands. www.tinbergen.nl

Koopman, S.J., N. Shephard and J. Doornik (1999), Statistical Algorithms for Models in State Space using SsfPack 2.2, Econometrics Journal 2, 113-166. www.ssfpack.com

Lamoureux, C.G. and W.D. Lastrapes (1993), Forecasting Stock-Return Variance: Toward an Understanding of Stochastic Implied Volatility, Review of Financial Studies 6, 293-326.

Latane, H.A. and R.J. Rendleman (1976), Standard Deviations of Stock Price Ratios Implied in Option Prices, Journal of Finance 31, 369-381.

Sandmann, G. and S.J. Koopman (1998), Estimation of Stochastic Volatility Models via Monte Carlo Maximum Likelihood, Journal of Econometrics 87, 271-301.

Shephard, N.G. (1996), Statistical Aspects of ARCH and Stochastic Volatility, in: Time Series Models in Econometrics, Finance and Other Fields, Monographs on Statistics and Applied Probability 65, eds. D.R. Cox, D.V. Hinkley and O.E. Barndorff-Nielsen, Chapman and Hall, London, 1-67.

Shephard, N. and M.K. Pitt (1997), Likelihood Analysis of Non-Gaussian Measurement Time Series, Biometrika 84, 653-667.

Taylor, S.J. (1994), Modeling Stochastic Volatility: A Review and Comparative Study, Mathematical Finance 4, 183-204.

Walsh, D.M. and G.Y. Tsou (1998), Forecasting Index Volatility: Sampling Interval and Non-Trading Effects, Applied Financial Economics 8, 477-485.

Whaley, R.E. (1993), Derivatives on Market Volatility: Hedging Tools Long Overdue, Journal of Derivatives 1, 71-84.