Abstract

This study uses non-linear models to forecast the volatility of weekly returns on three equity indices: the Dow Jones Industrial index, the FTSE 100 index, and the Nikkei 225 index. The sample covers a twenty-year period. In-sample and out-of-sample volatility forecasts are evaluated with standard symmetric loss functions in order to identify the model that best forecasts volatility. Using the mean error (ME), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), the study finds that the EGARCH model outperforms the ARCH and GARCH models in forecasting volatility.

Keywords: Equity market, Volatility, ARCH, GARCH, EGARCH.

Received: 17 September 2018 / Revised: 22 October 2018 / Accepted: 27 November 2018 / Published: 20 December 2018

Contribution/ Originality

This study is among the first to find that the EGARCH model outperforms the ARCH and GARCH models in forecasting volatility using a combination of Japan, UK and US data.

1. INTRODUCTION

Modelling and forecasting volatility has attracted the attention of economists, financial experts, researchers, financial advisors and economic policy makers over the past three decades (Boguth et al., 2011; Constantinides et al., 2013; Bollerslev et al., 2016; Cipollini et al., 2017; Bollerslev et al., 2018). Volatility forecasting plays a vital role in the Black and Scholes (BS) option pricing theory, where it is a key input for pricing options in the financial market. Four of the parameters of the BS model are observable; volatility is the only one that is not. This leaves derivative traders and financial dealers with the difficulty of estimating or predicting this parameter accurately. Regulators, practitioners and academics have embraced Value at Risk, and many view it as a vital component of current best practice in risk management. One of the most common parametric approaches to Value at Risk requires calculating the volatility of a return series.

In this work, we apply a number of models, ranging from linear models to the more sophisticated non-linear models, to the weekly volatility of three equity indices: the Dow Jones Industrial index, the FTSE 100 index and the Nikkei 225 index. The mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE) and the mean absolute error (MAE) are employed later in this work to evaluate how well the various models forecast volatility.

Why Study Volatility?

Volatility plays a crucial role in financial markets, so it is important to understand the concept; this is why we model volatility using non-linear models in this work. Volatility quantifies risk and thus plays a major role in modern finance. Black and Scholes (1973) related volatility to option prices through their option pricing formula, which shows the relationship between an option’s price and several other factors, including the volatility of the underlying asset’s price.

Ye-Hsiang (1993) argued that in order to derive an option’s price from the Black and Scholes formula, the option is replicated by a portfolio consisting of the underlying asset and a risk-free bond. Different expectations of volatility result in different option prices from the formula. This allows for an arbitrage opportunity if the option’s market price differs from the initial cost of the replicating portfolio. An option trader is thus able to make a profit by making superior forecasts of future volatility. With such a large portion of the financial markets dependent on the behaviour of volatility, there are obvious benefits to understanding volatility better. Improved forecasts would allow traders to price their options more accurately. Using more frequent data provides better estimates of volatility.

Day and Lewis (1992) showed that volatility can be predicted using ARCH models and that volatility is entirely captured by implied volatility within the Black and Scholes model. The reason given is that autoregressive models are still not entirely exploited by the market. However, Kuwahara and Marsh (1992) argued that the conditional volatility derived from GARCH and EGARCH models enables researchers to obtain option values which are very close to those observed in the market. This suggests that conditional volatility cannot be used as an additional source of information, since it is already reflected in the implied values. According to Michael Minnich, vice president of Capital Market Risk Advisors, Value at Risk is a very important component of risk management and also an important part of volatility forecasting.

Furthermore, modelling and forecasting volatility has been a subject of interest to many academics, portfolio analysts and those involved in risk management. Interest in this area of research has grown as a result of the current economic crisis affecting most countries of the world. The indices of most countries are experiencing a severe downturn, and derivative traders, academics and portfolio managers are therefore more concerned about this problem. The purpose of this paper is to forecast volatility using non-linear models and to apply standard symmetric loss functions to evaluate which of these models forecasts volatility better.

The rest of the paper is structured as follows: section 2 presents a concise review of the literature on volatility. Section 3 discusses the methodology, while sections 4 and 5 present the empirical results and the conclusion respectively.

2. LITERATURE REVIEW

Over the last two decades, there has been an increasing interest in modelling the volatility of stock market returns (Akgiray, 1989; Dimson and Marsh, 1990; Pagan and Schwert, 1990; Boguth et al., 2011; Constantinides et al., 2013; Bollerslev et al., 2016; Cipollini et al., 2017; Bollerslev et al., 2018). This is basically due to the highly volatile movements of prices of stock returns in the financial market. This has led researchers into investigating the level and stationarity of volatility over time (Day and Lewis, 1992; Tse and Tung, 1992; Figlewski et al., 1993). Most research has been directed towards examining the accuracy of this forecast. Both linear and nonlinear models have been applied by different researchers and each of them came out with different conclusions as regards the accuracy of volatility forecast (Cao and Tsay, 1992; Heynen and Kat, 1994; Brailsford and Faff, 1996; Figlewski, 1997).

Following the introduction of the ARCH family of models by Engle (1982); Bollerslev (1986) and Nelson (1991), the literature surrounding them has grown rapidly. Many researchers have come up with different views as to which of these models forecasts volatility better. The models range from the naive (linear) models to the more sophisticated (non-linear) models. Their findings are, however, by no means consistent, as the end results differ even when the same indices and sample period are considered. This can be attributed to the manner in which the models were evaluated and the evaluation criteria employed. The literature shows how the different research findings differ and where they are consistent with one another; none of the findings gave exact results, but the studies were able to identify which of the models forecast volatility better (see for example; (Angelidis et al., 2003; Louis and Guan, 2004; Balaban and Bayar, 2005; Mats and Viman, 2005; Palmquist and Viman, 2005; Abdul and Shabbir, 2008)).

3. METHODOLOGY

3.1. Data set and Sample Description.

This work uses the daily closing prices of three stock market indices from 3rd April 1989 to 7th April 2009. The investigated indices are the FTSE 100 index from the UK, the S&P 500 index from the USA and the Nikkei 225 index from Japan. These indices have a continuous sequence of around 5034 observations, excluding non-trading days such as weekends, public holidays and other exchange closure days. Data are sourced from Datastream. The FTSE 100 index comprises 100 large firms registered in the UK, while the S&P 500 and the Nikkei 225 comprise 500 and 225 top rated companies, respectively.
The entire sample for each index is divided into two subsample periods, each representing a 10 year period. The first subsample period runs from 4th April 1989 to 2nd April 1999, with 523 trading weeks, and the second from 6th April 1999 to 7th April 2009, with 524 trading weeks. The daily returns are calculated for each of the subsample periods and for each of the three indices mentioned earlier, using the formula:

$$R_t = \ln\left(\frac{P_t}{P_{t-1}}\right)$$

where $R_t$ denotes the daily index return, $P_t$ denotes the closing price of the index at time t and $P_{t-1}$ refers to the closing index price at time t-1. The ln represents the natural logarithm of the price relative. The daily returns calculated are then divided by the number of trading days in a week, with holidays excluded from the calculation, to obtain the weekly return of the various series.
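The return calculation described above can be sketched in a few lines. This is a minimal illustration only: the prices are hypothetical, the function name `log_returns` is ours, and the weekly figure is shown both as the sum of the daily log returns and as their average over the trading days of the week, since the aggregation step is described only briefly in the text.

```python
import math

def log_returns(prices):
    """Continuously compounded returns R_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical closing prices (previous Friday plus five trading days);
# these are NOT values from the data set used in the paper.
closes = [100.0, 101.0, 99.5, 100.2, 102.0, 101.1]
daily = log_returns(closes)

# Log returns are additive, so the daily returns sum to the log return
# over the whole week.
weekly_sum = sum(daily)

# Average daily return within the week (sum divided by trading days).
weekly_avg = weekly_sum / len(daily)
```

Because log returns are additive, the weekly sum equals `math.log(closes[-1] / closes[0])`, which gives a quick consistency check on the data preparation.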

3.2. ARCH Methodology

3.2.1. Testing for ARCH Effect

ARCH (autoregressive conditional heteroskedasticity) models are designed to model and forecast the conditional variance. An indication of ARCH is that the residuals are uncorrelated while the squared residuals show autocorrelation. The latter is tested again when we consider the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the squared residuals.

Testing for ARCH effects requires a test of the null hypothesis of conditional homoskedasticity against the alternative of conditional heteroskedasticity. This is done to establish whether there is heteroskedasticity left in the model being estimated. We therefore apply the LM test (Lagrange multiplier test) to the residual series of the model to test for ARCH effects. According to Brooks (2002) the test is one of a joint hypothesis that all q lags of the squared residuals have coefficient values that are not significantly different from zero. If the value of the test statistic is greater than the critical value from the χ² distribution, we reject the null hypothesis of no ARCH effects and conclude that ARCH effects are present. The number of lags to use depends on the preferences of the researcher.
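The LM test described above can be sketched as follows: regress the squared residuals on a constant and q of their own lags, and compare T·R² with the χ²(q) critical value. This is an illustrative pure-Python version, not the EViews routine used in the study; the helper names `solve` and `arch_lm`, the simulated ARCH(1) residuals and their parameters are all assumptions made for the example.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def arch_lm(resid, q):
    """LM statistic T * R^2 from regressing e_t^2 on a constant and q lags of e_t^2."""
    e2 = [e * e for e in resid]
    y = e2[q:]
    X = [[1.0] + [e2[t - j] for j in range(1, q + 1)] for t in range(q, len(e2))]
    k = q + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)                      # OLS via the normal equations
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in X]
    ybar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - ybar) ** 2 for yi in y)
    return len(y) * (1.0 - rss / tss)

# Simulated ARCH(1) residuals with illustrative parameters (not estimates from the paper).
random.seed(0)
e, prev = [], 0.0
for _ in range(2000):
    sigma2 = 0.2 + 0.5 * prev ** 2
    prev = random.gauss(0.0, 1.0) * sigma2 ** 0.5
    e.append(prev)

lm = arch_lm(e, q=5)
CHI2_5_CRIT_5PCT = 11.07   # 5% critical value of the chi-square distribution, 5 d.f.
reject_no_arch = lm > CHI2_5_CRIT_5PCT
```

For series that were generated with ARCH effects, as here, the statistic should lie far above the critical value, so the null of no ARCH effects is rejected.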

3.2.2. ARCH (p) Model Specification

ARCH

The lags examined in the ARCH test of the section above are used to determine the order of the ARCH model to be adopted. We used the PACF of the squared residuals to determine the order of the ARCH (p) model. An AR (p) model can be used as the mean equation for the ARCH-type model. Other researchers also claim that the PACF of the squared residuals can be useful for such applications.

The mean equation is represented as;

GARCH

Following Lamoureux and Lastrapes (1990); Walsh and Tsou (1998) a simple GARCH (1,1) is employed, and thus there is no need to specify any higher order.

EGARCH

The same principle is applied here as above, following Engle and Ng (1993) and Brooks (2002), and thus a simple EGARCH (1,1) is employed.
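For reference, the variance equations of the three model families can be written as below. These are the standard textbook forms and are given here as an assumption, since the specification is not written out in the text; the symbols $\alpha_0$, $\alpha_i$, $\beta_1$, $\omega$, $\alpha$, $\gamma$, $\beta$ and the standardised shock $z_t$ are notation chosen for this sketch. The EGARCH form shown absorbs the constant $E|z_t|$ into $\omega$, which is one common parameterization among several.

```latex
\begin{align}
\text{ARCH}(p):\quad & \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 \\
\text{GARCH}(1,1):\quad & \sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \\
\text{EGARCH}(1,1):\quad & \ln \sigma_t^2 = \omega + \alpha \lvert z_{t-1} \rvert + \gamma z_{t-1} + \beta \ln \sigma_{t-1}^2,
\qquad z_t = \varepsilon_t / \sigma_t
\end{align}
```

In the first two equations the variance is modelled in levels, so the coefficients must be non-negative for $\sigma_t^2$ to stay positive; in the third the logarithm is modelled, so $\sigma_t^2$ is positive for any coefficient values, and a non-zero $\gamma$ lets positive and negative shocks have different effects.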

3.2.3. ARCH-Type Model Estimation.

The R-square and the F-statistic are evaluated to assess their importance and significance in the ARCH model. In this case, we consider the mean equation only rather than the model as a whole. The R-square is most of the time meaningless and contributes nothing to the estimation process. It is sometimes negative, and such a value is irrelevant. This happens when the residual sum of squares is greater than the total sum of squares. The R-square and the F-statistic are meaningful for OLS models but not for ARCH models.

3.2.4. Diagnostic Check

ARCH

The stationarity condition for ARCH models is checked here through the non-negativity constraints and the finite unconditional variance. The alpha coefficients are summed to ensure that they sum to less than one, which implies that the stationarity condition of the ARCH (p) model is not breached.

GARCH

We also checked the non-negativity constraints, as these determine the stationarity of the GARCH model. The restrictions are satisfied most of the time, but they are checked for the model specified in the section above.
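The checks described for the ARCH and GARCH models amount to two conditions: non-negative variance coefficients and a coefficient sum below one. A minimal sketch follows, with hypothetical coefficient values and function names of our own choosing; it assumes the usual parameterization in which the intercept is alpha0, the ARCH terms are alpha_i and the GARCH term is beta1.

```python
def arch_is_admissible(alpha0, alphas):
    """Non-negativity (alpha0 > 0, alpha_i >= 0) and finite variance (sum of alphas < 1)."""
    nonneg = alpha0 > 0 and all(a >= 0 for a in alphas)
    return nonneg and sum(alphas) < 1.0

def garch11_is_admissible(alpha0, alpha1, beta1):
    """Non-negativity and covariance stationarity (alpha1 + beta1 < 1) for GARCH(1,1)."""
    nonneg = alpha0 > 0 and alpha1 >= 0 and beta1 >= 0
    return nonneg and (alpha1 + beta1) < 1.0

def unconditional_variance(alpha0, persistence):
    """Long-run variance alpha0 / (1 - persistence); finite only if persistence < 1."""
    return alpha0 / (1.0 - persistence)

# Hypothetical coefficient values, for illustration only.
ok_arch = arch_is_admissible(1e-4, [0.15, 0.10, 0.05])      # sum of alphas = 0.30
bad_arch = arch_is_admissible(1e-4, [0.15, -0.02, 0.05])    # a negative alpha
ok_garch = garch11_is_admissible(2e-6, 0.08, 0.91)          # persistence = 0.99
bad_garch = garch11_is_admissible(2e-6, 0.15, 0.90)         # persistence = 1.05
long_run = unconditional_variance(2e-6, 0.08 + 0.91)
```

A persistence just below one passes the check but implies a long-run variance that is large relative to the intercept, which is why near-unit sums deserve a second look.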

Serial Correlation in Standardised Residuals

We checked whether the serial correlation present in the autoregressive models has been removed and is not present in the ARCH-type models. This was tested using the ARCH LM test.

3.3. Forecasting

The study focuses on out-of-sample weekly volatility forecasting, and each sample is split into two subsamples. This is consistent with Figlewski et al. (1993), who studied the weekly returns of various return series using linear and non-linear models. Their results showed that the EGARCH model outperformed the other models evaluated. There are basically two types of forecasting, static and dynamic. Dynamic forecasting sets subsequent innovations to zero, while static forecasting extends the forward recursion through the end of the estimation sample, allowing for a series of one-step-ahead forecasts of both the structural model and the innovations. For simplicity, we look at dynamic forecasts generated in EViews.

3.3.1. Dynamic Forecasting

According to Tsay (2005) the ARCH model, the one step ahead forecast

3.3.2. Out of Sample forecast

Forecasting is a very important aspect of finance, as it helps researchers and derivative traders to determine the risk in portfolio management. In-sample forecasting is based on parameters estimated using all the data in the sample, and it implicitly assumes that the parameter estimates are stable across time. In practice, time variation is a critical issue in forecasting (Poon, 2005).

The in-sample forecasting was done using the first 261 weeks to forecast the volatility of the model. For all the indices, the first subsample was forecast using the sample period from 1st April 1994 to 1st April 1999, representing the first 261 weeks.

For the second sub sample the forecasting is done using the sample period from 6th of April 2004 to 7th of April 2009, representing the first 261 weeks. The result of this will be discussed in section four.

3.3.3. Forecast Evaluation Criteria.

Various evaluation criteria are used to determine how well one model performs relative to another. The mean error (ME), mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE) are the error measures employed here to check performance. A forecast error with a mean near zero and a small variance indicates the preferred model. The model that best forecasts volatility is therefore selected using the criteria discussed above. The model with the lowest value of the mean error (ME) is taken to be the best forecast of volatility, and the same applies to the mean square error (MSE), the root mean square error (RMSE) and the mean absolute error (MAE). (See Poon (2005))
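The loss functions named above can be computed as in the sketch below. It assumes the forecast error is defined as forecast minus actual (the sign convention matters only for ME) and also includes MAPE, which is used elsewhere in the paper; the input series and the function name `loss_functions` are hypothetical.

```python
import math

def loss_functions(actual, forecast):
    """Symmetric loss functions for a volatility forecast (error = forecast - actual)."""
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    me = sum(errors) / n                                   # mean error
    mse = sum(e * e for e in errors) / n                   # mean square error
    mae = sum(abs(e) for e in errors) / n                  # mean absolute error
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"ME": me, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}

# Hypothetical actual and forecast volatilities, for illustration only.
actual = [1.0, 2.0, 4.0]
forecast = [2.0, 2.0, 2.0]
scores = loss_functions(actual, forecast)
```

Positive and negative errors cancel in ME, so an ME near zero is compatible with large individual errors; the squared and absolute measures do not have this property, which is why they are read alongside it.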

Mean Error (ME)

4. EMPIRICAL RESULTS

4.1. Data Series Statistics.

Table-4.1.1. Descriptive Statistics of return series.

	DJ-IND		FTSE 100		NIKKEI 225
Sample:	1989-1999	1999-2009	1989-1999	1999-2009	1989-1999	1999-2009
Size	2517	2515	2524	2531	2469	2461
Mean	0.000576	-0.0001	0.000441	-0.00019	-0.00029	-0.00025
Median	0.000693	0.000312	0.00053	0.000227	-0.00023	5.84E-06
SD	0.008922	0.013067	0.008925	0.013402	0.014946	0.016201
Skewness	-0.5851	0.025056	0.069316	-0.1265	0.343934	-0.3191
Kurtosis	9.839083	10.74865	5.302952	9.220919	7.213919	9.553287
JB-Test	5048.935	6292.118	559.7818	4087.958	1875.439	4445.475
Probability	0	0	0	0	0	0

Source: Eviews’ Result Output

From table 4.1.1, the means in the first subsample are close to one another, with the exception of the Nikkei 225, whose mean is negative. In the second subsample, the means of all the indices are negative, with those of the FTSE 100 and the Nikkei 225 relatively close. In the first subsample, the standard deviations of the Dow Jones Industrial and the FTSE 100 are quite close, but that of the Nikkei 225 is markedly different. The same applies to the second subsample, although here the standard deviation of the Nikkei 225 is closer to those of the other indices.

The skewness of the Dow Jones Industrial in the first subsample is negative, suggesting a longer left tail and a higher peak in the middle. According to Poon (2005) the implication of this is that for a large part of the time, financial asset returns fluctuate in a range smaller than that of a normal distribution. On the other hand, the FTSE 100 and the Nikkei 225 are positively skewed, suggesting a longer right tail and a higher peak in the middle. In the second subsample, the skewness of the Dow Jones Industrial is positive, while that of the FTSE 100 and the Nikkei 225 is negative. The kurtosis of all the samples is positive and in excess of 3, suggesting fatter tails than the normal distribution and sensitivity to outliers.

4.1.1. Jarque-Bera Normality Test.

This tests for the normality of the asset returns of the various indices. The low p-values and high values of the Jarque-Bera statistic lead us to reject the null hypothesis of normality for all the samples. The p-values are lower than the 5% level of significance, which suggests that the returns of these financial assets are not normally distributed.
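As a worked check, the Jarque-Bera statistic can be recomputed from the sample size, skewness and kurtosis reported in Table 4.1.1, using the standard formula JB = n/6 · (S² + (K − 3)²/4). The function name is ours, and small differences from the tabulated values are to be expected because the table reports rounded skewness and kurtosis.

```python
def jarque_bera(n, skewness, kurtosis):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4); chi-square with 2 d.f. under normality."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# Inputs taken from Table 4.1.1 (first subsample, 1989-1999).
jb_dow = jarque_bera(2517, -0.5851, 9.839083)     # table reports 5048.935
jb_ftse = jarque_bera(2524, 0.069316, 5.302952)   # table reports 559.7818

CHI2_2_CRIT_5PCT = 5.991   # 5% critical value of the chi-square distribution, 2 d.f.
reject_normality = jb_dow > CHI2_2_CRIT_5PCT and jb_ftse > CHI2_2_CRIT_5PCT
```

Both statistics are dominated by the excess-kurtosis term, which is consistent with the fat tails noted in the descriptive statistics.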

4.1.2. Stationarity Test (ADF Test)

Augmented Dickey Fuller Test

Table-4.1.2. Augmented Dickey Fuller Test for all the Return Series

	DJ-Industrial		FTSE 100		NIKKEI 225
Lag	1989-1999	1999-2009	1989-1999	1999-2009	1989-1999	1999-2009
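0	-3.84951	-4.15147	-3.58007	-4.33075	-4.07404	-4.2307
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)
1	2.032284	2.23598	1.830214	2.440101	2.225083	2.353054
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)
2	1.346313	1.422396	1.198992	1.659917	1.439777	1.607558
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)
3	0.788079	0.836214	0.681636	0.953854	0.845688	0.993646
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)
4	0.37119	0.410869	0.342955	0.520806	0.412657	0.526422
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)
5	0.112196	0.12047	0.112003	0.203948	0.146547	0.194116
	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)	(0.00000)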

Notes: values in parenthesis are p-values lower than the 1% level of significance.

The return series were tested for stationarity using 36 lags, because weekly returns are employed. According to Brooks et al. (2005) if the data are monthly, use 12 lags, if the data are quarterly, use 4 lags and so on. On this basis we used 36 lags for the weekly data, but to keep the presentation simple only 5 lags are shown in the table above. From table 4.1.2, since the ADF test statistics have p-values lower than the 1% level of significance, we reject the null hypothesis of a unit root and conclude that the return series are stationary.

4.2. AR (m) Model Building

Model Identification

We used the Akaike and the Schwartz information criteria to determine the order of the model to employ. The rejection of the null hypothesis of non stationarity makes it reasonable to proceed to the model specification.

Table-4.2.1. Akaike and Schwartz Bayesian information criteria of the first subsamples.

	Dow Jones Indus		Ftse 100		Nikkei 225
	1989-1999		1989-1999		1989-1999
Lag	AIC	SBIC	AIC	SBIC	AIC	SBIC
0	-6.600140*	-6.597823*	-6.599418	-6.597107	-5.568396	-5.566042
1	-6.599403	-6.594769	-6.605291	-6.600667*	-5.567375	-5.562665
2	-6.598964	-6.59201	-6.604955	-6.598017	-5.574370*	-5.567304*
3	-6.59912	-6.589845	-6.605395	-6.596141	-5.57383	-5.564405
4	-6.598522	-6.586924	-6.604755	-6.593184	-5.572675	-5.560891
5	-6.597365	-6.583443	-6.605677	-6.591787	-5.571529	-5.557382
6	-6.596421	-6.580172	-6.605632	-6.589421	-5.571612	-5.555102
7	-6.598028	-6.579452	-6.60745	-6.588918	-5.570536	-5.551661
8	-6.597738	-6.576834	-6.607472*	-6.586616	-5.57054	-5.549299

The * represents AIC or SBIC with the lowest value.

From table 4.2.1, for the Dow Jones Industrial index in the first subsample, both the Akaike and the Schwartz criteria suggest AR (0), since this lag has the lowest AIC and SBIC. The PACF for the Dow Jones Industrial index suggests AR (1) while the information criteria suggest AR (0); Akgiray (1989) chose AR (1) for his sample despite the fact that the AIC and SBIC suggested AR (0).

We therefore chose the order of our autoregressive model based on the information criterion with the lowest value. In cases where the AIC suggests too many lags, say AR (8), as for the Dow Jones Industrial index in the second subsample of table 4.2.2, we instead chose the AR (2) suggested by the SBIC. The motivation is that we are trying to keep our model as parsimonious as possible; in addition, in some studies the SBIC has been shown to suggest a better order for AR models than the AIC. See (Brooks et al., 2005).

Table-4.2.2. Akaike and Schwartz Bayesian information criteria of the Second subsample.

	Dow Jones Indus		Ftse 100		Nikkei 225
	1999-2009		1999-2009		1999-2009
Lag	AIC	SBIC	AIC	SBIC	AIC	SBIC
0	-5.83698	-5.83467	-5.78647	-5.78416	-5.407093*	-5.404732*
1	-5.84163	-5.83699	-5.78985	-5.78523	-5.40699	-5.40226
2	-5.84667	-5.839713*	-5.79296	-5.78604	-5.40639	-5.39931
3	-5.84857	-5.83929	-5.80091	-5.79168	-5.4063	-5.39685
4	-5.84744	-5.83583	-5.8054	-5.79386	-5.40525	-5.39343
5	-5.84865	-5.83472	-5.80762	-5.79376	-5.40519	-5.39101
6	-5.84755	-5.83129	-5.81087	-5.794693*	-5.40469	-5.38814
7	-5.84821	-5.82963	-5.81048	-5.79199	-5.40495	-5.38602
8	-5.848694*	-5.82778	-5.811791*	-5.79098	-5.40381	-5.38251

The * represents AIC or SBIC with the lowest value.
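The lag-selection rule applied to these tables can be sketched as below, using the Dow Jones Industrial columns of Table 4.2.2. The helper `best_lag` and the tie-breaking rule (take the SBIC lag when it is the smaller of the two) are our reading of the parsimony argument, not a procedure spelled out in the paper.

```python
# AIC and SBIC for the Dow Jones Industrial index, 1999-2009 (Table 4.2.2), lags 0 to 8.
aic = [-5.83698, -5.84163, -5.84667, -5.84857, -5.84744,
       -5.84865, -5.84755, -5.84821, -5.848694]
sbic = [-5.83467, -5.83699, -5.839713, -5.83929, -5.83583,
        -5.83472, -5.83129, -5.82963, -5.82778]

def best_lag(values):
    """Index (lag) of the smallest information-criterion value."""
    return min(range(len(values)), key=values.__getitem__)

aic_lag = best_lag(aic)
sbic_lag = best_lag(sbic)

# Assumed parsimony rule: when the criteria disagree, keep the shorter SBIC lag.
chosen_lag = sbic_lag if sbic_lag < aic_lag else aic_lag
```

The SBIC penalises additional parameters more heavily than the AIC, so it never selects a longer lag than the AIC on the same set of fitted models, which is why it serves as the parsimonious choice here.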

4.2.1. AR (m) Model Estimation

Table-4.2.3. AR (m) model- OLS regression coefficient estimates

Notes: figures in parenthesis are the p-values.

4.2.2. Diagnostic Checks

From table 4.2.3, the significant p-values and the high F-statistics of most of the models show that the equations are well specified. There are, however, cases where the F-statistic is not statistically significant. Insignificant F-statistics are observed for the Dow Jones Industrial index in the first subsample and for the Nikkei 225 index in the second subsample. This does not matter much, as most of the other coefficients are statistically significant.

The R-square of most of the subsamples is less than 1%, except for two cases where it rises above 1% to about 3%. It is 1.1% in the second subsample of the Dow Jones Industrial index and 3% in the second subsample of the FTSE 100 index. The R-squares tend to increase in the second subsamples of the various indices, and this might be a result of the more volatile periods (the September 11 attack of 2001 and the recent credit crunch of the last three years).

Durbin-Watson

The Durbin-Watson statistics for almost all the models are close to 2, so we fail to reject the null hypothesis of no serial correlation in the residuals. This suggests that there is little evidence of serial correlation in the models (See Brooks et al. (2005)).

Table-4.3.1. ARCH (m) Model regression coefficient parameters.

Figures in parenthesis represent the p-values.

The only exception to this is the second subsample of the Nikkei 225 index, which shows a Durbin-Watson statistic of 2, suggesting no autocorrelation of the residuals in the model. This should be taken with a pinch of salt, however, as the Durbin-Watson statistic is not a perfect measure of the presence of autocorrelation in the residuals.
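To make the Durbin-Watson reading concrete, the statistic can be computed directly from a residual series. The sketch below uses two artificial residual series of our own construction; neither comes from the models estimated in the paper.

```python
def durbin_watson(resid):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2, roughly 2 * (1 - first-order autocorrelation)."""
    num = sum((b - a) ** 2 for a, b in zip(resid, resid[1:]))
    den = sum(e * e for e in resid)
    return num / den

# Artificial residuals, for illustration only.
alternating = [1.0, -1.0] * 50          # strong negative first-order autocorrelation
persistent = [1.0] * 50 + [-1.0] * 50   # strong positive first-order autocorrelation

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
```

Values near 4 indicate negative autocorrelation, values near 0 indicate positive autocorrelation, and values near 2 indicate none. The statistic only looks at the first lag, which is one reason it is an imperfect measure.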

4.3. ARCH Effect

The residuals of the AR (m) models in the above section were tested for ARCH effects using the ARCH LM test. The test statistics for all the return series were significant at the 1% level, indicating the presence of heteroskedasticity in the residuals of the return series. This means that the autoregressive models cannot adequately model the returns of the series, and that any results based on them will be biased and will produce false estimates. We therefore resort to ARCH-type models in order to take care of the heteroskedasticity, subject to checking that no heteroskedasticity remains in the fitted models.

From table 4.3.1, significant coefficients of the variance equation can be seen in almost all the return series, with significance at both the 1% and 5% levels. This is indicated by the small p-values shown in parenthesis. The only exception is the large p-value observed for the first coefficient in the second subsample of the Nikkei 225 index. The insignificance does not matter much, as the majority of the other coefficients are significant with low p-values. The negative value of the R-square makes it impossible to estimate the F-statistic for the entire sample. The R-square is negative for the entire sample; this is explained in section 3.2.3. The log likelihood is relatively large for the entire sample, suggesting that the models employed fit the data well.

According to Brooks et al. (2005) the more parameters there are in the conditional variance equation, the more likely it is that one or more of them will have negative estimated values. This is observed in the second subsample of the Nikkei 225 index, where the first coefficient of the model is negative as a result of ARCH (5) being estimated. But this is not the case in all circumstances: no negative coefficients arose even when we estimated ARCH (8) for the second subsamples of the Dow Jones Industrial index and the FTSE 100 index.

Table-4.3.2. GARCH (1,1) Model Estimation.

Notes: Figures in parenthesis represent the p-values.

The sum of the coefficients on the lagged squared error and the lagged conditional variance is very close to one for all samples of the return series. For the Dow Jones Industrial index it is approximately 0.99101 and 0.99491 for the first and second subsamples respectively. The same applies to the rest of the return series, where the sums of the coefficients are all very close to unity. This implies that shocks to the conditional variance will be highly persistent. See (Brooks et al., 2005).

Table-4.3.3. EGARCH (1,1) Model Estimation

Dow Jones Indus			Ftse 100			Nikkei 225
Stat/coef	1989-1999	1999-2009	Stat/coef	1989-1999	1999-2009	Stat/coef	1989-1999	1999-2009
C	0.000468	-0.000015	C	0.000399	-0.000154	C	-0.000167	-0.000106
	(0.0032)	(0.9281)		(0.0099)	(0.3761)		(0.4367)	(0.6637)
Variance Equation			Variance Equation			Variance Equation
C (2)	-0.33473	-0.21708	C (2)	-0.16749	-0.220353	C (2)	-0.29774	-0.361448
	(0.0000)	(0.0000)		(0.0000)	(0.0000)		(0.0000)	(0.0000)
C (3)	0.112528	0.099973	C (3)	0.089652	0.104704	C (3)	0.15967	0.173927
	(0.0000)	(0.0000)		(0.0000)	(0.0000)		(0.0000)	(0.0000)
C (4)	-0.006795	-0.114367	C (4)	-0.04264	-0.120989	C (4)	-0.102	-0.084147
	(0.0001)	(0.0000)		(0.0000)	(0.0000)		(0.0000)	(0.0000)
C (5)	0.9737	0.984802	C (5)	0.98984	0.98495	C (5)	-0.97987	0.973512
	(0.0000)	(0.0000)		(0.0000)	(0.0000)		(0.0000)	(0.0000)
R-square	-0.000148	-0.000042	R-square	-0.000023	-0.000006	R-square	-0.000064	-0.000079
DW	1.9597	2.1505	DW	1.83213	2.1278	DW	2.0159	2.06244

Figures in parenthesis represent the p-values.

The constants in the mean equation in table 4.3.3 are not statistically significant, except for that of the Dow Jones Industrial index in the first subsample, which is significant at the 5% level. All the parameters of the variance equation, however, were found to be statistically significant at all conventional levels for all the samples. The R-square is again negative, as in the other ARCH-type models; the reason for this has been explained in section 3.2.3. This model has an important advantage over the GARCH model presented in table 4.3.2 above: no non-negativity restrictions need to be placed on its parameters, because the dependent variable of the variance equation is the log of the conditional variance, so the negative coefficients found in table 4.3.3 still imply a positive variance.
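The point about non-negativity can be illustrated with one step of an EGARCH (1,1) recursion. The sketch below assumes the log-variance form ln σ²_t = ω + α|z| + γz + β ln σ²_{t−1} and assumes that C(2), C(3), C(4) and C(5) in Table 4.3.3 correspond to ω, α, γ and β respectively; neither the equation nor the mapping is stated in the text, and the starting volatility is hypothetical.

```python
import math

def egarch11_logvar(prev_logvar, prev_eps, omega, alpha, gamma, beta):
    """One step of ln(sigma_t^2) = omega + alpha*|z| + gamma*z + beta*ln(sigma_{t-1}^2)."""
    z = prev_eps / math.exp(0.5 * prev_logvar)   # standardised shock
    return omega + alpha * abs(z) + gamma * z + beta * prev_logvar

# Dow Jones 1989-1999 estimates from Table 4.3.3, under the ASSUMED mapping
# C(2)=omega, C(3)=alpha, C(4)=gamma, C(5)=beta.
omega, alpha, gamma, beta = -0.33473, 0.112528, -0.006795, 0.9737

sigma_prev = 0.02                        # hypothetical previous conditional SD
logvar_prev = math.log(sigma_prev ** 2)
shock = 2.0 * sigma_prev                 # a two-standard-deviation move

# Exponentiating the log variance always returns a positive variance,
# even though omega and gamma are negative.
var_after_gain = math.exp(egarch11_logvar(logvar_prev, +shock, omega, alpha, gamma, beta))
var_after_loss = math.exp(egarch11_logvar(logvar_prev, -shock, omega, alpha, gamma, beta))
```

Under this assumed mapping the negative γ means that a fall in the index raises next-period variance by more than a rise of the same size, which is the asymmetry that the symmetric ARCH and GARCH models cannot capture.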

4.4. Diagnostic Checks

The ARCH models in the various subsamples were checked to confirm that they do not breach the stationarity condition explained in section 3.2.4. For the ARCH model, table 4.5.1 shows that the stationarity conditions are fulfilled, as the sum of the alpha coefficients is close to unity. Satisfying this condition indicates that the models in table 4.5.1 are correctly specified. Table 4.5.1 shows estimates of the GARCH model, and the stationarity condition is satisfied here too, as the sum of the coefficients of the variance equation is very close to unity, indicating that the simple GARCH (1,1) model is adequately specified. For the EGARCH model the stationarity condition is satisfied without further checks, as no restrictions are placed on the model.

Serial Correlation of Standardised Residuals

The ARCH LM test was employed after estimating the ARCH models to check that the heteroskedasticity present in the autoregressive models does not remain in the ARCH models estimated for the various return series. The ARCH LM test for the various indices in tables 4.5.1 and 4.5.1 gives p-values that are very high and in most cases close to 100%. We therefore fail to reject the null hypothesis of homoskedasticity and conclude that there is no heteroskedasticity left in the models. This suggests that the ARCH models are adequately specified and that there is no remaining serial correlation in the squared residuals.

4.4.1. Dynamic Forecasting

Fig 4.5.1A to fig 4.5.1R (at the appendix) illustrates the in sample actual standard deviation of the first and second sub samples of the three indices under investigation. This represents 521 weeks with week1 to week 261 in sample conditional standard deviation values, followed by week 262 to week 521 dynamic out of sample forecast. ARCH type models in most cases seem to better describe samples in which volatility clustering are more transparent. In all the graphs presented above, the ARCH type models tend to follow the in sample actual standard deviation quite closely than any other models in most cases.

For the Dow Jones Industrial index, the ARCH model seems to follow the actual in sample standard deviation quite closely in the first sub sample than those of the GARCH and EGARCH in fig 4.5.1 A to fig 4.5.1C. The second model that best forecast volatility from the visual investigation is the EGARCH model in fig 4.5.1C as they tend to follow the actual in sample standard deviation quite closely. For the second sub sample, the EGARCH model follows the actual in sample better than the GARCH and ARCH model and hence tends to forecast volatility better from fig 4.5.1D to fig 4.5.1F.

Fig 4.5.1G to fig 4.5.1I at the appendix represents the first sub sample of FTSE 100. The ARCH model in this case, forecasts volatility better than the GARCH and EGARCH model, as they tend to follow the actual in sample standard deviation with high peaks. The GARCH model seems to follow the actual in sample quite closely after the ARCH model. Again for the second sub sample the ARCH model seems to follow the actual in sample better than the GARCH and EGARCH model. This is illustrated in fig 4.5.1J to fig 4.5.1L where the GARCH model seems to be second best in terms of volatility forecasting since they follow the actual in sample standard deviation quite closely.

The Nikkei 225 index shows higher peaks than those of the Dow Jones Industrial and the FTSE 100 indices. The ARCH model again seems to follow the actual in sample standard deviation quite closely than the other models in fig 4.5.1M to fig 4.5.1O. The second sub sample in fig 4.5.1P to fig 4.5.1R has peaks that are lower than those of the first sub sample explained above. The EGARCH model follows the actual in sample standard deviation quite closely than the ARCH and GARCH models, followed by the ARCH and then the GARCH model.

On visual inspection, the ARCH model seems to forecast volatility better than the GARCH and EGARCH models in most cases. This impression is not conclusive, however: which model forecasts volatility best is determined by the standard symmetric loss functions.

4.4.2. Out of Sample Forecast

In fig 4.6.1A to fig 4.6.1R in the appendix, week 262 to week 523 represent the out of sample forecasting period of the various return series. The out of sample forecasts of the three indices clearly converge to their long term unconditional values, a pattern common to all the forecasts of the indices under study.
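The convergence described above is the behaviour one would expect from a dynamic multi-period forecast of a stationary GARCH(1,1) model, where each further step pulls the variance forecast towards omega / (1 - alpha - beta). The following is a minimal sketch in Python, not the EViews procedure used in the study. The function name and the parameter values (omega, alpha, beta and the starting variance) are hypothetical and chosen only for illustration.

```python
def garch11_variance_path(omega, alpha, beta, next_variance, horizon):
    """Return dynamic variance forecasts for steps 1..horizon of a GARCH(1,1).

    Assumes the standard recursion for h >= 2:
        E[sigma^2_{t+h}] = omega + (alpha + beta) * E[sigma^2_{t+h-1}]
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    forecasts = [next_variance]
    for _ in range(horizon - 1):
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


# Hypothetical parameters, not estimates from the paper.
omega, alpha, beta = 0.00002, 0.10, 0.85
long_run_variance = omega / (1 - alpha - beta)

# Start from a variance four times the long-run level and forecast 261 weeks ahead.
path = garch11_variance_path(omega, alpha, beta, next_variance=0.0016, horizon=261)
gap_at_end = abs(path[-1] - long_run_variance)
```

With these illustrative values the gap between the forecast and the long-run variance shrinks by a factor of 0.95 each week, so by the end of a 261-week horizon the forecast is practically flat at the unconditional level.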

Both the in sample and the out of sample forecasts were evaluated with standard symmetric loss functions to compare the performance of the competing models. The forecasts are dynamic, that is, they are made multiple periods ahead. The in sample forecast period is made up of the first 261 weeks with 1249 observations, and the out of sample forecast period is made up of the second 261 weeks with 1264 observations. Weekly returns of the various indices are used throughout the analysis, which is consistent with Balaban and Bayar (2005). The four standard symmetric loss functions are the mean error (ME), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The model with the lowest forecast error is regarded as the model that best forecasts volatility.
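As an illustration of the four loss functions named above, the sketch below computes them for a pair of actual and forecast volatility series. It is a hedged example rather than the study's own computation: the function name, the sign convention (forecast minus actual) and the reporting of MAPE as a fraction rather than multiplied by 100 are assumptions, and the input numbers are made up.

```python
import math


def forecast_errors(actual, forecast):
    """Compute ME, RMSE, MAE and MAPE for two equally long series.

    Assumptions: error = forecast - actual, and MAPE is returned as a
    fraction of the actual value (every actual value must be non-zero).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Made-up weekly standard deviations, for illustration only.
actual_sd = [0.010, 0.020, 0.040]
forecast_sd = [0.012, 0.018, 0.036]
metrics = forecast_errors(actual_sd, forecast_sd)
```

Because positive and negative errors offset each other in the mean error, a model can show a small ME while still having a large RMSE or MAE, which is one reason to read the four criteria together.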

Tables 4.7.1A to 4.7.1C in the appendix report the in sample forecast error statistics of the non-linear models for the entire sample (first and second sub samples). Using the four evaluation criteria, the EGARCH model dominates the ARCH and GARCH models by registering the lowest errors across the three indices. The GARCH model records the lowest errors in some cases, but only in the first sub samples of the Dow Jones Industrial index and the FTSE 100, which is not enough to outperform the EGARCH model. The EGARCH model has the lowest values and is thus regarded as the model that best forecasts volatility in sample. This result is consistent with Najand (2002); Pagan and Schwert (1990) and Figlewski et al. (1993), who all conclude that the EGARCH model outperforms the other models in their studies.
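The selection rule can be made concrete with a short sketch that picks, for each criterion, the model with the smallest error. The figures are copied from Table-4.7.1C in the appendix (Nikkei 225, first sub sample, in sample forecast). The helper name and the choice to compare errors by absolute value are assumptions made for illustration.

```python
CRITERIA = ("ME", "RMSE", "MAE", "MAPE")

# Error statistics from Table-4.7.1C, Nikkei 225 first sub sample (in sample).
nikkei_first_in_sample = {
    "ARCH": (0.0034542, 0.0075368, 0.0060961, 0.3774864),
    "GARCH": (0.0050495, 0.009189, 0.0075196, 0.4141537),
    "EGARCH": (0.0015794, 0.0069898, 0.005361, 0.3770982),
}


def best_model_per_criterion(table):
    """Return the model with the smallest absolute error for each criterion."""
    best = {}
    for column, criterion in enumerate(CRITERIA):
        # Absolute values are compared so that a signed mean error is
        # judged by its magnitude (an assumption of this sketch).
        best[criterion] = min(table, key=lambda model: abs(table[model][column]))
    return best


winners = best_model_per_criterion(nikkei_first_in_sample)
```

For this sub sample the EGARCH model is selected on all four criteria. The same helper can be applied to the other panels in the appendix, where the ranking is less uniform.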

For the out of sample forecast (Tables 4.7.1D to 4.7.1F in the appendix), the EGARCH model again outperforms the ARCH and GARCH models. Its dominance is most visible in the Nikkei 225 index, where it outperforms the ARCH and GARCH models over the entire sample (in sample and out of sample), registering the lowest errors among the models considered. The EGARCH model also outperforms the ARCH and GARCH models in the second sub sample of the FTSE 100 index. It seems to perform particularly well in periods of high volatility, which occur mostly in the second sub samples of the various indices. Two major events in the second sub sample led to periods of high and low volatility: the September 11 terrorist attack in 2001 and the ongoing financial crisis that started in late 2007. These results are again similar to those of Figlewski et al. (1993); Pagan and Schwert (1990) and Najand (2002), who all found that the EGARCH model forecasts volatility better than the ARCH and GARCH models.

5. SUMMARY AND CONCLUSION

The purpose of this paper is to evaluate the forecasting ability of three non-linear models: the ARCH, GARCH and EGARCH models. We also examined the autoregressive model, but for simplicity it was not used to forecast the volatility of the return series under investigation. Our analysis shows that the return series do not follow a normal distribution: they exhibit fat tails and peaks higher than those of a normal distribution. Forecast performance was evaluated both in sample and out of sample using the mean error (ME), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), with the out of sample evaluation applied to multi-step-ahead weekly volatility forecasts.

Among the three non-linear models under investigation, the ARCH model exhibits relatively poor forecast performance both in sample and out of sample, followed by the GARCH model. The EGARCH model was found to outperform the other models in forecasting volatility. Its superiority is clearest in return series such as the Nikkei 225, where the EGARCH model outperforms the other models both in sample and out of sample.

Funding: This study received no specific financial support.

Competing Interests: The authors declare that they have no competing interests.

Contributors/Acknowledgement: All authors contributed equally to the conception and design of the study.

REFERENCES

Abdul, R. and A. Shabbir, 2008. Predicting stock returns volatility: An evaluation of linear vs. Nonlinear methods. International Research Journal of Finance and Economics, 20: 141-150.

Akgiray, V., 1989. Conditional heteroscedasticity in time series of stock returns: Evidence and forecasts. The Journal of Business, 62(1): 55-80. Available at: https://doi.org/10.1086/296451.

Angelidis, T., A. Benos and S. Degiannakis, 2003. The use of GARCH models in VaR estimation. Discussion Paper, University of Piraeus.

Balaban, E. and A. Bayar, 2005. Stock returns and volatility: Empirical evidence from fourteen countries. Applied Economics Letters, 12(10): 603-611. Available at: https://doi.org/10.1080/13504850500120607.

Black, F. and M. Scholes, 1973. The pricing of options and corporate liabilities. Journal of Political Economy, 81(3): 637-654. Available at: https://doi.org/10.1086/260062.

Boguth, O., M. Carlson, A. Fisher and M. Simutin, 2011. Conditional risk and performance evaluation: Volatility timing, overconditioning, and new estimates of momentum alphas. Journal of Financial Economics, 102(2): 363-389. Available at: https://doi.org/10.1016/j.jfineco.2011.06.002.

Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3): 307-327. Available at: https://doi.org/10.1016/0304-4076(86)90063-1.

Bollerslev, T., B. Hood, J. Huss and L.H. Pedersen, 2016. Risk everywhere: Modeling and managing volatility. Working Paper.

Bollerslev, T., B. Hood, J. Huss and L.H. Pedersen, 2018. Risk everywhere: Modeling and managing volatility. The Review of Financial Studies, 31(7): 2729-2773. Available at: https://doi.org/10.1093/rfs/hhy041.

Brailsford, T.J. and R.W. Faff, 1996. An evaluation of volatility forecasting techniques. Journal of Banking & Finance, 20(3): 419-438. Available at: https://doi.org/10.1016/0378-4266(95)00015-1.

Brooks, C., 2002. Introductory econometrics for finance. Cambridge University Press.

Brooks, C., A. Clare, J.W. Dalle Molle and G. Persand, 2005. A comparison of extreme value theory approaches for determining value at risk. Journal of Empirical Finance, 12(2): 339-352. Available at: https://doi.org/10.1016/j.jempfin.2004.01.004.

Cao, C.Q. and R.S. Tsay, 1992. Nonlinear time series analysis of stock volatilities. Journal of Applied Econometrics, 7(S1): S165-S185. Available at: https://doi.org/10.1002/jae.3950070512.

Cipollini, F., G.M. Gallo and E. Otranto, 2017. On heteroskedasticity and regimes in volatility forecasting. Working Paper.

Constantinides, G.M., J.C. Jackwerth and A. Savov, 2013. The puzzle of index option returns. Review of Asset Pricing Studies, 3(2): 229-257. Available at: https://doi.org/10.1093/rapstu/rat004.

Day, T.E. and C.M. Lewis, 1992. Stock market volatility and the information content of stock index options. Journal of Econometrics, 52(1-2): 267-287. Available at: https://doi.org/10.1016/0304-4076(92)90073-z.

Dimson, E. and P. Marsh, 1990. Volatility forecasting without data-snooping. Journal of Banking & Finance, 14(2-3): 399-421. Available at: https://doi.org/10.1016/0378-4266(90)90056-8.

Engle, R.F., 1982. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, 50(4): 987-1007. Available at: https://doi.org/10.2307/1912773.

Engle, R.F. and V.K. Ng, 1993. Measuring and testing the impact of news on volatility. The Journal of Finance, 48(5): 1749-1778. Available at: https://doi.org/10.2307/2329066.

Figlewski, S., 1997. Forecasting volatility. Financial Markets, Institutions & Instruments, 6(1): 1-88.

Figlewski, S., R. Cumby and J. Hasbrouck, 1993. Forecasting volatilities and correlations with EGARCH models. The Journal of Derivatives, 1(2): 51-63. Available at: https://doi.org/10.3905/jod.1993.407877.

Heynen, R.C. and H.M. Kat, 1994. Volatility prediction: A comparison of the stochastic volatility, GARCH (1, 1) and EGARCH (1, 1) models. The Journal of Derivatives, 2(2): 50-65. Available at: https://doi.org/10.3905/jod.1994.407912.

Kuwahara, H. and T.A. Marsh, 1992. The pricing of Japanese equity warrants. Management Science, 38(11): 1610-1641. Available at: https://doi.org/10.1287/mnsc.38.11.1610.

Lamoureux, C.G. and W.D. Lastrapes, 1990. Heteroskedasticity in stock return data: Volume versus GARCH effects. The Journal of Finance, 45(1): 221-229. Available at: https://doi.org/10.2307/2328817.

Louis, H. and W. Guan, 2004. Forecasting volatility. Available from https://ssrn.com/abstract=165528 or http://dx.doi.org/10.2139/ssrn.165528.

Mats, P. and B. Viman, 2005. Forecasting volatility in the Swedish stock market.

Najand, M., 2002. Forecasting stock index futures price volatility: Linear vs. Nonlinear models. Financial Review, 37(1): 93-104. Available at: https://doi.org/10.1111/1540-6288.00006.

Nelson, D.B., 1991. Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59(2): 347-370. Available at: https://doi.org/10.2307/2938260.

Pagan, A.R. and G.W. Schwert, 1990. Alternative models for conditional stock volatility. Journal of Econometrics, 45(1-2): 267-290. Available at: https://doi.org/10.1016/0304-4076(90)90101-x.

Palmquist, M. and B. Viman, 2005. Forecasting volatility in the Swedish stock market; volatility modeling of the OMX-index using four different models. Available from www.stat.umu.se/kursweb/vt05/stac05mom3/?download=MatsBjorn.pdf?

Poon, S.H., 2005. A practical guide to forecasting financial market volatility. Chichester: John Wiley and Sons, Ltd.

Tsay, R.S., 2005. Analysis of financial time series. 2nd Edn.: John Wiley & Sons.

Tse, Y.K. and S.H. Tung, 1992. Forecasting volatility in the Singapore stock market. Asia Pacific Journal of Management, 9(1): 1-13.

Walsh, D.M. and G.Y.-G. Tsou, 1998. Forecasting index volatility: Sampling interval and non-trading effects. Applied Financial Economics, 8(5): 477-485. Available at: https://doi.org/10.1080/096031098332772.

Ye-Hsiang, L., 1993. Forecasting daily volatility of foreign exchange markets: A comparison of the ARCH model and a new model using high frequency data. An Unpublished Msc Project, Submitted to the Sloan School of Management, MIT, USA. pp: 1-33.

Appendix

4.6. Forecasting

Fig-4.6.1A. A comparison of ARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1B. A comparison of GARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1C. A comparison of EGARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1D. A comparison of ARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1E. A comparison of GARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1F. A comparison of EGARCH model conditional Standard deviation of Dow Jones Industrial index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1G. A comparison of ARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1H. A comparison of GARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1I. A comparison of EGARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1J. A comparison of ARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1K. A comparison of GARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1L. A comparison of EGARCH model conditional Standard deviation of FTSE 100 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1M. A comparison of ARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1N. A comparison of GARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1O. A comparison of EGARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1989-1999) first subsample.

Source: Eviews’ Result Output

Fig-4.6.1P. A comparison of ARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1Q. A comparison of GARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

Fig-4.6.1R. A comparison of EGARCH model conditional Standard deviation of Nikkei 225 index returns versus Actual Standard deviation (1999-2009) second subsample.

Source: Eviews’ Result Output

4.7. Forecast Evaluation

Table-4.7.1A. In Sample Forecast Evaluation Values.

DJ-INDUS First Sub Sample (In sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.001011	0.004471	0.003202	0.367176
GARCH	0.009676	0.004448	0.003155	0.114863
EGARCH	0.001078	0.004466	0.00319	0.125018
DJ-INDUS Second Sub Sample (In Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.002107	0.007661	0.005507	0.422728
GARCH	0.002053	0.007457	0.005271	0.403524
EGARCH	0.000627	0.007179	0.004691	0.424726

Notes: Values in bold represent the lowest values of the errors.

Table-4.7.1B. In Sample Forecast Evaluation values.

FTSE 100 First Sub Sample (In sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.002985	0.003958	0.003691	0.4172
GARCH	0.000819	0.003959	0.002977	0.33751
EGARCH	0.000833	0.004703	0.002976	0.340345
FTSE 100 Second Sub Sample (In Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.003664	0.008407	0.00635	0.409347
GARCH	0.003998	0.008393	0.006423	0.409437
EGARCH	0.004284	0.006943	0.004574	0.416044

Notes: Values in bold represent the lowest values of the errors.

Table-4.7.1C. In Sample Forecast Evaluation values.

NIKKEI 225 First Sub Sample (In sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.0034542	0.0075368	0.0060961	0.3774864
GARCH	0.0050495	0.009189	0.0075196	0.4141537
EGARCH	0.0015794	0.0069898	0.005361	0.3770982
NIKKEI 225 Second Sub Sample (In Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.0017021	0.0091229	0.0061949	0.3983955
GARCH	0.0026092	0.0093056	0.0065624	0.3901843
EGARCH	0.0011903	0.0090092	0.0059819	0.4015017

Notes: Values in bold represent the lowest values of the errors.

Table-4.7.1D. Out of Sample Forecast Evaluation Values.

DJ-INDUS First Sub Sample (Out of Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.0006518	0.0052163	0.0036346	0.4107027
GARCH	0.000951	0.0051906	0.0036393	0.1088028
EGARCH	0.0011579	0.0052677	0.0038005	0.1265369
DJ-INDUS Second Sub Sample (Out of Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.0034416	0.0098577	0.0075563	0.5581286
GARCH	0.0032699	0.0095816	0.0071748	0.5307678
EGARCH	0.0006485	0.0092275	0.0061063	0.5697578

Notes: Values in bold represent the lowest values of the errors.

Table-4.7.1E. Out of Sample Forecast Evaluation values.

FTSE 100 First Sub Sample (Out of sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.004955	0.005991	0.005215	0.584485
GARCH	0.000861	0.004675	0.003659	0.408897
EGARCH	0.000891	0.0047	0.000369	0.411908
FTSE 100 Second Sub Sample (Out of Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.006719	0.011146	0.009643	0.568854
GARCH	0.007385	0.011089	0.009738	0.554997
EGARCH	0.003394	0.008902	0.006125	0.584641

Notes: Values in bold represent the lowest values of the errors.

Table-4.7.1F. Out of Sample Forecast Evaluation values.

NIKKEI 225 First Sub Sample (Out of sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.005102	0.009032	0.007778	0.441118
GARCH	0.008373	0.011083	0.009987	0.478706
EGARCH	0.001912	0.00767	0.006108	0.423331
NIKKEI 225 Second Sub Sample (Out of Sample Forecast)
	Mean Error	Root Mean Square Error	Mean Absolute Error	Mean Absolute Percentage Error
ARCH	0.002391	0.011508	0.007933	0.496587
GARCH	0.00413	0.011874	0.008792	0.493609
EGARCH	0.001331	0.011336	0.007444	0.499782

Notes: Values in bold represent the lowest values of the errors.

Views and opinions expressed in this article are the views and opinions of the author(s), Financial Risk and Management Reviews shall not be responsible or answerable for any loss, damage or liability etc. caused in relation to/arising out of the use of the content.
