
# Multiple regression model on categorical factors

Journal: Bulletin of Prydniprovs'ka State Academy of Civil Engineering and Architecture (Vol.2019, No. 6)

Publication Date:

Authors : ;

Page : 85-89

Keywords : multiple regression; categorical factor; categorical levels; explained variable; explanatory variables; quantitative assessment;

### Abstract

Purpose. To obtain a quantitative assessment of the influence of categorical factors on the explained variable. To predict the value of a numerical variable from the values of other numerical variables, regression analysis of a statistical model built on observational data is used. In many situations, however, categorical variables must also be included in the model. Method. Categorical factors are included in the model as dummy variables whose values, 1 or 0, correspond to the presence or absence of a certain categorical level of the explanatory variable. Such an approach, however, is justified only if the categorical variable has just two levels: the presence or absence of a certain quality. The dummy variable then equals 1 in the first case and 0 in the second. If there are several categorical levels, it is proposed to introduce into the multiple regression model one dummy variable per level, each taking the value 1 when the categorical explanatory variable takes the corresponding level and 0 for all its other values (levels). In addition, the model makes it possible to account for the interaction between categorical levels of different non-numerical explanatory variables and for the effect of that interaction when quantifying their influence on the explained variable. Results. A model of multiple regression on categorical factors is proposed that gives a quantitative assessment of the impact on the explained variable of not only numerical but also categorical independent variables. The model also takes into account that in the absence of any influence from the explanatory variables, i.e. when they are all 0, the regression should also be 0. Accordingly, the so-called “shift” of the regression line is absent (and, moreover, cannot be negative, as it can be for a mathematical line). Scientific novelty. The proposed model extends and generalizes regression analysis of statistical models to the case of categorical factors. Practical relevance. The model of multiple regression on categorical factors makes it possible to solve the following problems: selecting categorical levels of non-numeric parameters when designing systems; choosing a strategy in financial activities and management.
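
The coding scheme described in the abstract (one 0/1 dummy variable per categorical level, and no constant "shift" term) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `one_hot` and `fit_no_intercept`, the level labels, and the toy data are all assumptions made for illustration, and ordinary least squares via NumPy is assumed as the fitting method.

```python
import numpy as np

def one_hot(levels, categories):
    """Return an (n, k) 0/1 matrix: column j is 1 where the factor takes categories[j]."""
    levels = np.asarray(levels)
    categories = np.asarray(categories)
    return (levels[:, None] == categories[None, :]).astype(float)

def fit_no_intercept(x_num, levels, categories, y):
    """Least-squares fit of y on [x_num, dummies] with no constant column."""
    X = np.column_stack([np.asarray(x_num, dtype=float), one_hot(levels, categories)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

# Hypothetical toy data: one numerical variable and one three-level factor.
categories = ["A", "B", "C"]
x_num = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
levels = ["A", "B", "C", "A", "B", "C"]

# Assumed "true" effects used only to generate the data: slope 2, level effects 1, 3, 5.
true_effect = {"A": 1.0, "B": 3.0, "C": 5.0}
y = [2.0 * x + true_effect[lv] for x, lv in zip(x_num, levels)]

# coef[0] is the slope on x_num; coef[1:] are the effects of levels A, B, C.
coef = fit_no_intercept(x_num, levels, categories, y)

# With every explanatory variable equal to 0, the fitted regression is 0.
zero_prediction = float(np.zeros(len(coef)) @ coef)

print(np.round(coef, 6))
print(zero_prediction)
```

Because the design matrix has no constant column, each dummy coefficient is read directly as the contribution of its level, and a row of all zeros yields a prediction of 0. Adding the full set of dummies for a second factor would make the columns linearly dependent, so interactions between factors would need a different parameterization than this sketch shows.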