Intercomparison of Partial Least Square Regression-based Feature Selection Techniques for Snow Cover Area Models
Snow cover area in a river basin affects many meteorological and environmental parameters. With advances in remote sensing technology, snow cover area can now be measured on a regular basis for scientific purposes. In this study, the monthly average snow cover area of the Baranduz river basin in West Azerbaijan, Iran, was modelled with artificial neural network (ANN) and support vector machine (SVM) models. The snow cover area was extracted from MODIS 8-day maximum snow extent products for 2000 to 2019. In addition, 20 meteorological parameters were collected from the Bibakran and Babarud ground hydrometeorological stations, and 20 parameters were collected from satellite-based data from NASA LaRC projects. After Box-Cox transformation analysis, partial least squares regression (PLSR)-based filter and wrapper feature selection methods were used to select the modelling subsets: the LW, RC, SR, VIP, SMC, MRMR and JT filter methods and the GA, MCUVE and REP wrapper methods. As the thresholds of the feature selection methods were increased, the number of selected parameters in the subsets decreased, and beyond a certain threshold value the number of parameters remained fixed at 10 variables. The selected subsets were evaluated using multicollinearity indexes and the performance of the ANN and SVM models. 80% of the data were used for training the models and 20% for testing them. The accuracy of all models was high and acceptable, but some subsets had a serious multicollinearity issue. Nevertheless, although the correlation between parameters was very high, the PLSR-based feature selection methods were very successful in removing a large amount of multicollinearity from the selected subsets. The ANN and SVM models also showed very high performance in modelling the monthly snow cover area.
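To make the thresholding step concrete, the following is a minimal sketch of one of the filter methods named above, VIP (variable importance in projection), computed from a plain NIPALS PLS1 fit. It is not the authors' implementation. The function name `pls1_vip`, the synthetic data (sized loosely as 240 monthly samples and 40 candidate parameters), the number of components, and the threshold values are all assumptions chosen for illustration only.

```python
import numpy as np


def pls1_vip(X, y, n_components):
    """Fit a plain NIPALS PLS1 model and return one VIP score per column of X.

    Hypothetical helper for illustration; not taken from the study.
    """
    # Autoscale the predictors and centre the response (assumed preprocessing).
    Xk = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yk = y - y.mean()
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # unit-norm weight vectors
    ss = np.zeros(n_components)               # y variance explained per component

    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)      # weight vector, normalised to unit length
        t = Xk @ w                  # score vector
        tt = t @ t
        p = Xk.T @ t / tt           # X loading
        q = (yk @ t) / tt           # y loading (scalar for a single response)
        Xk = Xk - np.outer(t, p)    # deflate X
        yk = yk - q * t             # deflate y
        W[:, a] = w
        ss[a] = q**2 * tt

    # VIP_j = sqrt(p * sum_a(ss_a * w_ja^2) / sum_a(ss_a))
    return np.sqrt(n_features * (W**2 @ ss) / ss.sum())


# Synthetic stand-in data (NOT the study's data): only columns 0 and 1 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(240, 40))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=240)

vip = pls1_vip(X, y, n_components=3)

# Keep the variables whose VIP exceeds a threshold (1.0 is a common rule of thumb).
selected = np.flatnonzero(vip > 1.0)

# Raising the threshold can only shrink (or keep) the selected subset.
thresholds = [0.5, 1.0, 1.5]
counts = [int((vip > thr).sum()) for thr in thresholds]
print("selected columns:", selected)
print("subset size per threshold:", dict(zip(thresholds, counts)))
```

Because the weight vectors have unit length, the squared VIP scores sum to the number of predictors, so a threshold of 1.0 corresponds to "above average importance". Other filters in the list (for example SR or SMC) rank variables by different statistics, but the thresholding logic follows the same pattern.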