SEPARATION AND MULTICOLLINEARITY IN POLYTOMOUS QUADRATIC LOGISTIC REGRESSION MODEL
Keywords:
Polytomous Logistic Regression, Quadratic Logistic Regression, Principal Components Analysis, Polytomous Response
Abstract
The logistic regression model describes the relationship between a categorical dependent variable and a set of explanatory variables, which may be continuous or discrete. Most of the literature on logistic regression considers only the classical model, with linear discriminant functions. In some situations, however, quadratic discriminant functions are useful and classify better. The quadratic logistic regression model requires the estimation of a large number of unknown parameters, which leads to computational difficulties when there are many explanatory variables. Furthermore, if the groups are completely separated by the explanatory variables, the maximum likelihood estimators of the unknown parameters do not exist. For problems with continuous independent variables, this paper proposes using a set of principal components of the explanatory variables to reduce both the dimension of the problem and the computational cost of parameter estimation in polytomous quadratic logistic regression, without loss of accuracy. Examples on datasets taken from the literature show that the quadratic logistic regression model with principal components is feasible and generally achieves higher correct classification rates than the classical logistic regression model with linear discriminant functions.
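
To give a rough sense of why the number of parameters becomes a problem, the sketch below counts the free parameters of a polytomous logistic model with linear and with full quadratic discriminant functions. It assumes a reference-category parametrization (one discriminant function per non-reference group) and a quadratic function made of an intercept, all linear terms, and all squares and pairwise products. The function names and the figures used (20 variables, 5 retained components, 3 groups) are illustrative assumptions, not values taken from the paper.

```python
def linear_param_count(n_features: int, n_groups: int) -> int:
    """Parameters with linear discriminant functions (assumed parametrization)."""
    # One intercept plus one coefficient per variable, for each non-reference group.
    return (n_groups - 1) * (1 + n_features)


def quadratic_param_count(n_features: int, n_groups: int) -> int:
    """Parameters with full quadratic discriminant functions (assumed parametrization)."""
    intercept = 1
    linear = n_features
    # Squares and pairwise products: p(p + 1) / 2 distinct second-order terms.
    second_order = n_features * (n_features + 1) // 2
    return (n_groups - 1) * (intercept + linear + second_order)


if __name__ == "__main__":
    n_groups = 3  # hypothetical number of response groups
    for p in (20, 5):  # hypothetical: 20 original variables vs. 5 principal components
        print(p, linear_param_count(p, n_groups), quadratic_param_count(p, n_groups))
```

Under these assumptions, replacing 20 original variables with 5 principal components reduces the quadratic model from 462 to 42 parameters, which is the kind of saving the dimension reduction is meant to provide.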


