A Study on Influence of Predictor Multicollinearity on Performance of the Stepwise Regression Prediction Equation

Funds: Supported by the National Natural Science Foundation of China under Grant Nos. 40675023 and 41065002, and the Key Natural Science Foundation of Guangxi Province under Grant No. 0832019Z.


Abstract: The prediction accuracy of the traditional stepwise regression prediction equation (SRPE) is affected by multicollinearity among its predictors. This paper introduces condition number analysis into prediction modeling to minimize that multicollinearity. In condition number prediction modeling, the condition number is used to select, from the possible combinations of a set of candidate predictors (variables), the combination with the lowest multicollinearity; the selected combination is then used to construct the condition number regression prediction equation (CNRPE). The new approach is applied to typhoon track prediction, a difficult task among meteorological disaster predictions. Six pairs of typhoon track latitude/longitude SRPEs and CNRPEs for July, August, and September are built with the traditional and the new modeling approaches, respectively, using the same large set of modeling samples. The comparative analysis indicates that, given the same candidate predictors (variables) and predictands (dependent variables), the new models fit the historical samples of South China Sea (SCS) typhoon tracks slightly less accurately than the traditional models but predict the independent samples markedly better: the averaged prediction error of the new models for July, August, and September is 153.9 km, which is 75.3 km smaller than that of the traditional models (a reduction of 33%). This is because the new modeling approach effectively minimizes multicollinearity through computation and analysis of the condition number. It is further shown that when F = 1.0, 2.0, and 3.0, the average prediction errors of the traditional SRPEs are clearly larger than those of the CNRPEs. Moreover, extremely large and unreasonable prediction errors occur at some individual points of the typhoon tracks predicted by the SRPEs, owing to multicollinearity in the combination of predictors.
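The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the condition number is computed or how many predictors are kept, so the sketch assumes the 2-norm condition number of the column-standardized predictor matrix, a fixed subset size `k`, and an exhaustive search over subsets. The function names (`condition_number`, `select_predictors`, `fit_least_squares`) and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def condition_number(X):
    """2-norm condition number of the column-standardized matrix X.

    Assumption: columns are centered and scaled to unit variance first, so the
    result reflects collinearity rather than differences in units.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    singular_values = np.linalg.svd(Z, compute_uv=False)  # sorted descending
    return singular_values[0] / singular_values[-1]


def select_predictors(X, k):
    """Return (indices, condition number) of the k-column subset of X with the
    smallest condition number, found by exhaustive search."""
    best = min(
        combinations(range(X.shape[1]), k),
        key=lambda idx: condition_number(X[:, list(idx)]),
    )
    return best, condition_number(X[:, list(best)])


def fit_least_squares(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


# Synthetic data (not from the paper): x2 is nearly a copy of x0.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
x2 = x0 + 0.01 * rng.normal(size=200)
X = np.column_stack([x0, x1, x2])
y = 2.0 * x0 - 1.0 * x1 + 0.1 * rng.normal(size=200)

chosen, kappa = select_predictors(X, k=2)
coef = fit_least_squares(X[:, list(chosen)], y)
print("chosen predictors:", chosen, "condition number:", round(kappa, 2))
print("coefficients:", coef)
```

In this toy setup the nearly collinear pair (x0, x2) has a very large condition number, so the search avoids it. Exhaustive search grows combinatorially with the number of candidates, so a real application with many candidate predictors would likely need a restricted search.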


Author affiliation: Guangxi Research Institute of Meteorological Disasters, Nanning 530022
