Developing a novel hybrid model for the estimation of surface 8-h ozone (O<sub>3</sub>) across the remote Tibetan Plateau during 2005–2018
Abstract. We developed a two-stage model, the random forest-generalized additive model (RF-GAM), based on satellite data, meteorological factors, and other geographical covariates to predict the surface 8-h O3 concentration across the remote Tibetan Plateau. The 10-fold cross-validation results suggested that RF-GAM performed best, with the highest R2 value (0.76) and the lowest root mean square error (RMSE) (14.41 μg/m3) compared with seven other machine learning models. The predictive performance of the RF-GAM model showed a significant seasonal discrepancy, with the highest R2 value observed in summer (0.74), followed by winter (0.69) and autumn (0.67), and the lowest in spring (0.64). Additionally, ground-observed O3 data collected from open websites and not used in model training were applied to test the transferability of the novel model, and confirmed that the model was robust in predicting the surface 8-h O3 concentration during other periods (R2 = 0.67, RMSE = 25.68 μg/m3). RF-GAM was then used to predict the daily 8-h O3 level over the Tibetan Plateau during 2005–2018 for the first time. The estimated O3 concentration displayed a slow increase from 64.74 ± 8.30 μg/m3 to 66.45 ± 8.67 μg/m3 from 2005 through 2015, whereas it decreased from the peak to 65.87 ± 8.52 μg/m3 during 2015–2018. In addition, the estimated 8-h O3 concentrations exhibited notable spatial variation, with the highest values in some cities of the North Tibetan Plateau such as Huangnan (73.48 ± 4.53 μg/m3) and Hainan (72.24 ± 5.34 μg/m3), followed by cities in the central region including Lhasa (65.99 ± 7.24 μg/m3) and Shigatse (65.15 ± 6.14 μg/m3), and the lowest in some cities of the Southeast Tibetan Plateau such as Aba (55.17 ± 12.77 μg/m3). Based on the 8-h O3 critical value (100 μg/m3) set by the World Health Organization (WHO), we further estimated the annual mean number of nonattainment days over the Tibetan Plateau during this period.
Most cities on the Tibetan Plateau enjoyed excellent air quality, while several cities (e.g., Huangnan, Haidong, and Guoluo) still experienced more than 40 nonattainment days each year and warrant more attention to alleviate local O3 pollution. The results shown herein confirm that the novel hybrid model improves the prediction accuracy and can be applied to assess potential health risks, particularly in remote regions with sparse monitoring sites.
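As an illustration of the quantities reported above, the following minimal Python sketch computes RMSE, R2, and the annual mean number of nonattainment days from daily 8-h O3 values. It is not the authors' code. The function names are hypothetical, R2 is assumed to be the coefficient of determination (1 minus the ratio of residual to total sum of squares), and a day is assumed to count as nonattainment only when the 8-h O3 value strictly exceeds 100 μg/m3.

```python
import math
from collections import defaultdict

# WHO 8-h O3 critical value quoted in the abstract, in μg/m3.
WHO_8H_O3_LIMIT = 100.0


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def annual_mean_nonattainment_days(daily, limit=WHO_8H_O3_LIMIT):
    """Mean number of exceedance days per year.

    `daily` is an iterable of (year, o3) pairs for a single location.
    Assumption: a day is nonattainment only if o3 is strictly above `limit`.
    """
    counts = defaultdict(int)
    for year, o3 in daily:
        # Adding a bool keeps years with zero exceedances in the average.
        counts[year] += o3 > limit
    return sum(counts.values()) / len(counts)
```

For example, `annual_mean_nonattainment_days([(2005, 90.0), (2005, 101.0), (2006, 120.0), (2006, 130.0)])` averages one exceedance day in 2005 and two in 2006.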