
Abstract

Traditional frequentist quantile regression makes minimal assumptions and accommodates non-normal errors, provided the response variable (y) is continuous; the same requirement holds in the Bayesian framework. However, inference for these models is challenging when y is not continuous, particularly when the response is ordinal. This paper presents Bayesian quantile estimation for ordinal data using the latent variable inferential framework. Estimation was carried out by Markov chain Monte Carlo simulation with a Gibbs sampler in which the cut-points were set ahead of time and remained fixed throughout the analysis. The method was applied to a mental health study of university undergraduate students, and the results illustrate the practical utility of Bayesian ordinal quantile models. We investigated the mental health state of undergraduate students at different points in the distribution of their ages. Our findings show that the age of the students has a significant effect on their mental health. At the 25th, 50th and 75th quantiles age had a negative effect on mental health, while at the 95th quantile the effect was positive. The study suggests that older undergraduate students are better equipped mentally to withstand the stress of higher learning in the university.

Keywords: Quantile regression, Gibbs sampler, Posterior distribution, Latent variable, Bayesian ordinal quantile, Regression.

JEL Classification: C11 & C40.

Received: 27 October 2020 / Revised: 12 November 2020 / Accepted: 30 November 2020 / Published: 14 December 2020

Contribution/ Originality

The paper's primary contribution is to apply Bayesian ordinal quantile regression to mental health analysis. The study used the Gibbs sampler with fixed cut-points. It offers insight into the effect of age on the mental health of undergraduate students at different points of the age distribution.


1. INTRODUCTION

Traditional frequentist quantile regression, as proposed by Koenker (2004), makes minimal assumptions and accommodates continuous response variables with non-normal errors. The Bayesian quantile framework also assumes that the response variable is continuous. However, inference for these models is challenging when the response variable is not continuous, particularly for ordinal data. Ordinal models arise when the response variable is discrete and inherently ordered or ranked, so that the values assigned to outcomes have an ordinal meaning but no cardinal interpretation. For example, in a survey regarding the performance of the economy, responses may be recorded as follows: 1 for ‘bad’, 2 for ‘average’ and 3 for ‘good’. The responses in such a case have ordinal meaning but no cardinal interpretation, so one cannot say that a value of 2 is twice as good as a value of 1 (Rahman, 2016). The ordinal ranking of the responses differentiates these data from unordered choice outcomes. Quantile regression allows us to uncover interesting structures in the tails of the distribution, such as heavy tails or skewness, that would otherwise be masked in standard regression and distort inference.

2. BAYESIAN QUANTILE REGRESSION

Given a linear model:

Koenker and Machado (1999) were the first to show that likelihood-based inference using independently distributed asymmetric Laplace densities (ALD) is directly related to the minimization problem in Equation 6. Yu and Zhang (2005) proposed a three-parameter ALD with a skewness parameter that can be used directly to model the quantile of interest.
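The connection between check-loss minimization and the asymmetric Laplace likelihood can be illustrated numerically. The following is a minimal sketch, assuming the standard check function rho_p(u) = u(p - I(u < 0)) and a unit-scale ALD whose log-density is log(p(1 - p)) - rho_p(u). The function names and the simulated data are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def check_loss(u, p):
    """Check (pinball) loss rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (1.0 if u < 0 else 0.0))

def ald_loglik(y, m, p):
    """Log-likelihood of location m under a unit-scale ALD (assumed form)."""
    return sum(math.log(p * (1 - p)) - check_loss(yi - m, p) for yi in y)

rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(501)]  # simulated data, illustration only
p = 0.25

# The check loss is piecewise linear in m, so its minimum is attained at a
# data point; searching over the sorted observations is therefore sufficient.
candidates = sorted(y)
m_loss = min(candidates, key=lambda m: sum(check_loss(yi - m, p) for yi in y))
m_lik = max(candidates, key=lambda m: ald_loglik(y, m, p))

print(m_loss, m_lik)  # both searches select the same location
```

Because the ALD log-likelihood is a constant minus the summed check loss, the maximizer of one is the minimizer of the other, and both coincide with the sample quantile at level p.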

3. BAYESIAN ORDINAL QUANTILE REGRESSION

The standard approach to regression with ordinal response variables is the ordinal probit model. Fitting an ordinal probit model captures only the mean of the conditional distribution of the continuous latent variable underlying each response, whereas quantile regression studies the full conditional distribution of such outcomes without assuming Gaussianity. The Bayesian method of estimating quantile regression stems from the fact that maximizing the likelihood when the error follows an AL distribution is equivalent to minimizing the quantile objective function in Equation 6. Bayesian implementation of quantile regression begins by forming a likelihood based on the AL distribution; the posterior distribution is then proportional to the product of the likelihood and the prior distribution of the parameters, and it can be represented as follows:
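Under the standard unit-scale AL likelihood, this proportionality can be sketched as below. The symbols p (quantile level), rho_p (check function) and pi(beta) (prior) are notational assumptions rather than the paper's own.

```latex
\pi(\beta \mid y) \;\propto\; L(y \mid \beta)\,\pi(\beta),
\qquad
L(y \mid \beta) = p^{n}(1-p)^{n}
\exp\!\Big\{-\sum_{i=1}^{n}\rho_{p}\big(y_i - x_i^{\top}\beta\big)\Big\},
\qquad
\rho_{p}(u) = u\,\big(p - I(u<0)\big).
```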

Hideo and Genya (2012) showed that Gibbs sampling can be used for Bayesian quantile regression provided the AL distribution is represented as a mixture of normal and exponential distributions. A partially collapsed Gibbs sampler is also available (Reed & Yu, 2009). For ordinal variables, however, there is no widely accepted quantile regression method. Ordinal variables are especially common in medical contexts, where many health outcomes are expressed as ordered categories rather than as strictly numerical measures, as in the case considered in this paper, where the mental health status of undergraduate students is assessed in relation to their ages.
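The normal-exponential mixture can be checked by simulation. The sketch below assumes the commonly used parameterization e = theta*w + tau*sqrt(w)*u, with w ~ Exp(1), u ~ N(0, 1), theta = (1 - 2p)/(p(1 - p)) and tau^2 = 2/(p(1 - p)), for a unit-scale AL error at quantile level p. The function name and the chosen p are illustrative.

```python
import math
import random

def draw_al_mixture(p, rng):
    """One draw of a unit-scale AL(p) error via the normal-exponential mixture."""
    theta = (1 - 2 * p) / (p * (1 - p))
    tau = math.sqrt(2 / (p * (1 - p)))
    w = rng.expovariate(1.0)   # exponential mixing variable, w ~ Exp(1)
    u = rng.gauss(0.0, 1.0)    # standard normal component
    return theta * w + tau * math.sqrt(w) * u

rng = random.Random(1)
p = 0.75
draws = [draw_al_mixture(p, rng) for _ in range(200_000)]

# For an AL(p) error the p-th quantile is zero, so about a share p of the
# draws should fall at or below zero.
share_below_zero = sum(e <= 0 for e in draws) / len(draws)
print(share_below_zero)
```

Conditional on w, the error is normal, which is what makes the conditional posteriors in the Gibbs sampler tractable.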

The Bayesian quantile regression methods discussed so far assume that the response variable y is continuous. Because y is continuous and its residuals are modeled directly as asymmetric Laplace distributed (ALD) variables, this direct approach is meaningful; when the response variable y is ordinal rather than continuous, the direct approach becomes meaningless. To handle the ordinal case, Rahim and Haithem (2017) introduced a continuous latent variable zi corresponding to each yi. The variable zi is unobserved and is related to the observed response yi, which has J categories or outcomes.
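The link between the latent variable and the observed category can be sketched as a lookup against the cut-points. The example assumes the usual convention that yi = j when zi lies in (gamma_{j-1}, gamma_j], with gamma_0 = -infinity and gamma_J = +infinity. The cut-point values and the function name are hypothetical, since the source does not report the values it fixed.

```python
from bisect import bisect_left

def to_category(z, cuts):
    """Map a latent value z to a category in 1..J, where J = len(cuts) + 1.

    Category j is returned when cuts[j-2] < z <= cuts[j-1], with the outer
    bounds taken as minus and plus infinity.
    """
    return bisect_left(cuts, z) + 1

cuts = [0.0, 2.0]  # hypothetical fixed cut-points for J = 3 categories
latent = [-1.3, 0.0, 0.4, 2.0, 3.7]
print([to_category(z, cuts) for z in latent])  # [1, 1, 2, 2, 3]
```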

4. METHODOLOGY

Our data are ordinal data on the mental health state of university students. Information obtained from the students was classified by age and by their answers to questions on their mental wellbeing. Their mental health state was ordered into 3 categories: stable, mildly-unstable and unstable. In this paper we consider the ordered responses yi (mental health state) and the corresponding covariate xi (age), for i = 1, . . . , n. Using the latent variable inferential framework of Albert and Chib (1993), we employed the Gibbs sampler, relying on the representation of the asymmetric Laplace distribution of the residual as a mixture of normal and exponential distributions.
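One step of such a sampler, the data-augmentation draw of the latent variable given its observed category, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: conditional on the exponential mixing variable w, zi is taken to be normal with mean b0 + b1*xi + theta*w and standard deviation tau*sqrt(w), truncated to the interval set by the fixed cut-points. The cut-points, coefficients and the value of w are hypothetical.

```python
import math
import random
from statistics import NormalDist

def draw_latent(y, mean, sd, cuts, rng):
    """Draw z ~ N(mean, sd^2) restricted to the interval implied by category y."""
    bounds = [-math.inf, *cuts, math.inf]
    lo, hi = bounds[y - 1], bounds[y]
    dist = NormalDist(mean, sd)
    # Inverse-CDF sampling: uniform between the CDF values at the two bounds.
    u = rng.uniform(dist.cdf(lo), dist.cdf(hi))
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf away from exactly 0 or 1
    return dist.inv_cdf(u)

rng = random.Random(2)
cuts = [0.0, 2.0]                    # hypothetical fixed cut-points, J = 3
p = 0.5                              # quantile level
theta = (1 - 2 * p) / (p * (1 - p))  # mixture constants for AL(p)
tau = math.sqrt(2 / (p * (1 - p)))
b0, b1 = -1.0, 0.05                  # hypothetical current coefficient values
x, y, w = 21, 2, 0.8                 # age, observed category, current mixing draw

z = draw_latent(y, b0 + b1 * x + theta * w, tau * math.sqrt(w), cuts, rng)
print(z)  # lies between the two cut-points because y = 2
```

In a full sampler this draw would alternate with updates of the regression coefficients and of the mixing variables.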

5. RESULTS

Our data have a sample size of 707 undergraduate students. The numbers of observations in each category are 138 (19.52%) for the stable state, 533 (75.39%) for the mildly-unstable state and 36 (5.09%) for the unstable state. Using these data, the application studied the effect of age on the mental health state of undergraduate students. In the analysis we considered the 25th, 50th, 75th and 95th quantiles. The ages of the students range from 15 to 42. We classified ages 15-19 years as falling within the 25th quantile, 20-21 years within the 50th quantile, 22-23 years within the 75th quantile and ages above 23 within the 95th quantile.
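The category shares can be recomputed directly from the reported counts. The short sketch below uses only the three counts given above; the category labels follow the definitions in the Methodology section.

```python
counts = {"stable": 138, "mildly-unstable": 533, "unstable": 36}
n = sum(counts.values())

# Percentage share of each category, rounded to two decimals.
shares = {state: round(100 * c / n, 2) for state, c in counts.items()}
print(n, shares)
```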

The posterior estimates of the Bayesian ordinal quantile models, together with the inefficiency values, are shown in Table 1.

Table-1. Posterior mean, posterior standard deviation and inefficiency values.

                                      Estimates
Quantiles       Parameters            Intercept    Age
25th quantile   Mean                  -1.4284      -0.7301
                Standard Deviation     0.8282       0.0388
                Inefficiency           1.1156       1.1667
50th quantile   Mean                  -1.0017      -0.1995
                Standard Deviation     0.1894       0.0637
                Inefficiency           1.4307       1.2034
75th quantile   Mean                  -0.2227      -0.0036
                Standard Deviation     0.0347       0.0466
                Inefficiency           1.3620       1.3075
90th quantile   Mean                   0.0591       0.3261
                Standard Deviation     0.1662       0.2124
                Inefficiency           1.1567       1.2864
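For readers unfamiliar with the inefficiency values in Table 1, the sketch below shows one common way such a factor is computed from MCMC draws: 1 plus twice the sum of the sample autocorrelations up to a cut-off lag. The source does not state which estimator or lag it used, so the definition, the cut-off of 50 and the simulated chains are assumptions for illustration only.

```python
import random

def inefficiency_factor(chain, max_lag=50):
    """1 + 2 * (sum of sample autocorrelations at lags 1..max_lag)."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [c - mean for c in chain]
    var = sum(c * c for c in centered) / n
    total = 0.0
    for k in range(1, max_lag + 1):
        cov = sum(centered[t] * centered[t + k] for t in range(n - k)) / n
        total += cov / var
    return 1.0 + 2.0 * total

rng = random.Random(3)

# Independent draws: the factor should be close to 1.
iid_chain = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

# Autocorrelated AR(1) draws with coefficient 0.5: the factor should be near 3.
ar_chain = [0.0]
for _ in range(19_999):
    ar_chain.append(0.5 * ar_chain[-1] + rng.gauss(0.0, 1.0))

if_iid = inefficiency_factor(iid_chain)
if_ar = inefficiency_factor(ar_chain)
print(if_iid, if_ar)
```

Values close to 1, such as those in Table 1, indicate that the draws behave almost like an independent sample.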

Age has a negative effect on the probability of a stable mental health state at the 25th, 50th and 75th quantiles. This indicates that a significant number of students between the ages of 15 and 23 are not in a stable mental health state. At the 90th quantile the effect is positive, which indicates that a significant number of students aged 23 and above are in a stable mental health state.

Table-2. Deviance Information Criterion (DIC) for all Quantiles.

Quantiles        Deviance Information Criterion (DIC)
25th Quantile    1004.16
50th Quantile     839.93
75th Quantile     789.25
90th Quantile     987.53

A model selection criterion, the deviance information criterion (DIC) (Celeux, Forbes, Robert, & Titterington, 2006; Spiegelhalter, Best, Carlin, & van der Linde, 2002), was used to choose the quantile that is most consistent with the data. DIC was computed for the 25th, 50th, 75th and 90th quantile models, and the values were 1004.16, 839.93, 789.25 and 987.53, respectively, as shown in Table 2. Hence, among the models considered, the 75th quantile model, which has the smallest DIC, provides the best fit.
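The selection rule is simply the smallest DIC among the fitted models. A minimal sketch using the values reported in Table 2 (the dictionary keys are illustrative labels):

```python
# DIC values as reported in Table 2, keyed by quantile model.
dic = {"25th": 1004.16, "50th": 839.93, "75th": 789.25, "90th": 987.53}

# Smaller DIC indicates a better trade-off between fit and complexity.
best = min(dic, key=dic.get)
print(best, dic[best])  # 75th 789.25
```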

6. CONCLUSION

The paper considers the Bayesian analysis of quantile regression models for univariate ordinal data. The method exploits the latent variable inferential framework of Albert and Chib (1993) and capitalizes on the normal–exponential mixture representation of the AL distribution. Estimation utilizes Gibbs sampling with fixed cut-points.

Funding: This study received no specific financial support.  

Competing Interests: The author declares that there are no conflicts of interests regarding the publication of this paper.

REFERENCES

Albert, J., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422), 669–679. Available at: https://doi.org/10.1080/01621459.1993.10476321.

Celeux, G., Forbes, F., Robert, C. P., & Titterington, D. M. (2006). Deviance information criteria for missing data models. Bayesian Analysis, 1(4), 651–674. Available at: https://doi.org/10.1214/06-ba122.

Grabski, I. N., Vito, R. D., & Engelhardt, B. E. (2019). Bayesian ordinal quantile regression with a partially collapsed Gibbs sampler. Paper presented at the Joint Statistical Meeting online Program, July 27th - Aug 1st 2019; Colorado USA. Cited as: arXiv:1911.07099 [stat.ME].

Hideo, K., & Genya, K. (2012). Gibbs sampling methods for Bayesian quantile regression. Journal of Statistical Computation and Simulation, 81(11), 1565–1578. Available at: https://doi.org/10.1080/00949655.2010.496117.

Koenker, R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91(1), 74-89.

Koenker, R. W., & Machado, J. A. F. (1999). Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association, 94(448), 1296-1310. Available at: https://doi.org/10.1080/01621459.1999.10473882.

Rahim, A., & Haithem, T. M. A. (2017). Bayesian quantile regression for ordinal longitudinal data. Journal of Applied Statistics, 45(5), 815-828. Available at: https://doi.org/10.1080/02664763.2017.1315059.

Rahman, M. A. (2016). Bayesian quantile regression for ordinal models. Bayesian Analysis, 11(1), 1-24. Available at: https://doi.org/10.1214/15-ba939.

Reed, C., & Yu, K. (2009). A partially collapsed Gibbs sampler for Bayesian quantile regression. Computing and Mathematics Working Papers, Brunel University. Retrieved from: http://bura.brunel.ac.uk/handle/2438/3593.

Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society – Series B Statistical Methodology, 64(4), 583–639.

Yu, K., & Zhang, J. (2005). A three parameter asymmetric Laplace distribution and its extensions. Communications in Statistics – Theory and Methods, 34(9-10), 1867–1879. Available at: https://doi.org/10.1080/03610920500199018.

Views and opinions expressed in this article are the views and opinions of the author(s), Quarterly Journal of Econometrics Research shall not be responsible or answerable for any loss, damage or liability etc. caused in relation to/arising out of the use of the content.