Application of random forest regression to the calculation of gas-phase chemistry within the GEOS-Chem chemistry model v10

2019 ◽  
Vol 12 (3) ◽  
pp. 1209-1225 ◽  
Author(s):  
Christoph A. Keller ◽  
Mat J. Evans

Abstract. Atmospheric chemistry models are a central tool to study the impact of chemical constituents on the environment, vegetation and human health. These models are numerically intense, and previous attempts to reduce the numerical cost of chemistry solvers have not delivered transformative change. We show here the potential of a machine learning (in this case random forest regression) replacement for the gas-phase chemistry in atmospheric chemistry transport models. Our training data consist of 1 month (July 2013) of output of chemical conditions together with the model physical state, produced from the GEOS-Chem chemistry model v10. From this data set we train random forest regression models to predict the concentration of each transported species after the integrator, based on the physical and chemical conditions before the integrator. The choice of prediction type has a strong impact on the skill of the regression model. We find best results from predicting the change in concentration for long-lived species and the absolute concentration for short-lived species. We also find improvements from a simple implementation of chemical families (NOx = NO + NO2). We then implement the trained random forest predictors back into GEOS-Chem to replace the numerical integrator. The machine-learning-driven GEOS-Chem model compares well to the standard simulation. For ozone (O3), errors from using the random forests (compared to the reference simulation) grow slowly and after 5 days the normalized mean bias (NMB), root mean square error (RMSE) and R2 are 4.2 %, 35 % and 0.9, respectively; after 30 days the errors increase to 13 %, 67 % and 0.75, respectively. The biases become largest in remote areas such as the tropical Pacific where errors in the chemistry can accumulate with little balancing influence from emissions or deposition. Over polluted regions the model error is less than 10 % and has significant fidelity in following the time series of the full model. 
Modelled NOx shows similar features, with the most significant errors occurring in remote locations far from recent emissions. For other species such as inorganic bromine species and short-lived nitrogen species, errors become large, with NMB, RMSE and R2 reaching >2100 %, >400 % and <0.1, respectively. This proof-of-concept implementation takes 1.8 times more time than the direct integration of the differential equations, but optimization and software engineering should allow substantial increases in speed. We discuss potential improvements in the implementation, some of its advantages from both a software and hardware perspective, its limitations, and its applicability to operational air quality activities.
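The three skill scores quoted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the RMSE is assumed to be normalized by the mean of the reference values (the abstract reports it in percent but does not state the normalization), and R2 is assumed to be the squared Pearson correlation.

```python
import math


def normalized_mean_bias(pred, ref):
    """NMB: summed (prediction - reference) divided by summed reference."""
    return sum(p - r for p, r in zip(pred, ref)) / sum(ref)


def normalized_rmse(pred, ref):
    """RMSE divided by the mean reference value (assumed normalization)."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    return math.sqrt(mse) / (sum(ref) / len(ref))


def pearson_r2(pred, ref):
    """Squared Pearson correlation between prediction and reference."""
    n = len(ref)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov * cov / (var_p * var_r)
```

Here `ref` would hold concentrations from the reference simulation and `pred` those from the random forest run; multiplying NMB and the normalized RMSE by 100 gives percentages.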

2018 ◽  
Author(s):  
Christoph A. Keller ◽  
Mat J. Evans

Abstract. Atmospheric chemistry models are a central tool to study the impact of chemical constituents on the environment, vegetation and human health. These models are numerically intense, and previous attempts to reduce the numerical cost of chemistry solvers have not delivered transformative change. We show here the potential of a machine learning (in this case random forest regression) replacement for the gas-phase chemistry in atmospheric chemistry models. Our training data consist of one month (July 2013) of output of chemical conditions together with the model physical state, produced from the GEOS-Chem chemistry model (v10). From this data set we train random forest regression models to predict the concentration of each transported species after the integrator, based on the physical and chemical conditions before the integrator. The choice of prediction type has a strong impact on the skill of the regression model. We find best results from predicting the change in concentration for long-lived species and the absolute concentration for short-lived species. We also find improvements from a simple implementation of chemical families (NOx = NO + NO2). We then implement the trained random forest predictors back into GEOS-Chem to replace the numerical integrator. The machine-learning-driven GEOS-Chem model compares well to the standard simulation. For O3, errors from using the random forests grow slowly and after 5 days the normalised mean bias (NMB), root mean square error (RMSE) and R2 are 4.2 %, 35 % and 0.9, respectively; after 30 days the errors increase to 13 %, 67 % and 0.75, respectively. The biases become largest in remote areas such as the tropical Pacific where errors in the chemistry can accumulate with little balancing influence from emissions or deposition. Over polluted regions the model error is less than 10 % and has significant fidelity in following the time series of the full model. 
Modelled NOx shows similar features, with the most significant errors occurring in remote locations far from recent emissions. For other species such as inorganic bromine species and short-lived nitrogen species, errors become large, with NMB, RMSE and R2 reaching >2100 %, >400 % and <0.1, respectively. This proof-of-concept implementation is 85 % slower than the direct integration of the differential equations but optimisation and software engineering would allow substantial increases in speed. We discuss potential improvements in the implementation, some of its advantages from both a software and hardware perspective, its limitations and its applicability to operational air quality activities.
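The choice of prediction type described in the abstract (change in concentration for long-lived species, absolute concentration for short-lived species) could look roughly like this. It is a sketch under stated assumptions: the function names and the `long_lived` flag are hypothetical, and clamping negative results to zero is an added safeguard that the abstract does not mention.

```python
def make_target(before, after, long_lived):
    """Regression target for one species in one grid cell.

    Long-lived species: the change over the chemistry step.
    Short-lived species: the concentration after the step.
    """
    return after - before if long_lived else after


def apply_prediction(before, predicted, long_lived):
    """Convert a predicted target back into a post-step concentration."""
    conc = before + predicted if long_lived else predicted
    # Assumption: clamp at zero so a poor prediction cannot yield a
    # negative concentration; the abstract does not describe this.
    return max(conc, 0.0)
```

During training `make_target` would be applied to the saved before/after integrator output; at run time `apply_prediction` would take the place of the numerical integrator's result for that species.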


2008 ◽  
Vol 9 (1) ◽  
pp. 515 ◽  
Author(s):  
Allison Gehrke ◽  
Shaojun Sun ◽  
Lukasz Kurgan ◽  
Natalie Ahn ◽  
Katheryn Resing ◽  
...  

2020 ◽  
Author(s):  
Basit Khan ◽  
Sabine Banzhaf ◽  
Edward C. Chan ◽  
Renate Forkel ◽  
Farah Kanani-Sühring ◽  
...  

Abstract. In this article we describe the implementation of an online-coupled gas-phase chemistry model in the turbulence-resolving PALM model system 6.0. The new chemistry model is part of the PALM-4U components (read: PALM for you; PALM for urban applications), which are designed for application of the PALM model in the urban environment (Maronga et al., 2020). The latest version of the Kinetic PreProcessor (KPP, 2.2.3) has been utilised for the numerical integration of gas-phase chemical reactions. A number of tropospheric gas-phase chemistry mechanisms of different complexity have been implemented, ranging from the photostationary state to more complex mechanisms such as CBM4, which includes the major pollutants O3, NO, NO2 and CO, a simplified VOC chemistry and a small number of products. Further mechanisms can also be easily added by the user. In this work, we provide a detailed description of the chemistry model, its structure and its various features, input requirements, application and limitations. A case study is presented to demonstrate the application of the new chemistry model in the urban environment. The computation domain of the case study comprises part of Berlin, Germany, covering an area of 6.71 × 6.71 km with a horizontal resolution of 10 m. We used the "PARAMETERIZED" emission mode of the chemistry model, which only considers emissions from traffic sources. Three chemical mechanisms of varying complexity and one no-reaction (passive) case have been applied, and results are compared with observations from two permanent air quality stations in Berlin that fall within the computation domain. The results show the importance of online photochemistry and dispersion of air pollutants in the urban boundary layer. The simulated NOx and O3 species show reasonable agreement with observations. The agreement is better during midday and poorest during the evening transition hours and at night. 
CBM4 and SMOG mechanisms show better agreement with observations than the steady state PHSTAT mechanism.


2021 ◽  
Vol 8 (3) ◽  
pp. 209-221
Author(s):  
Li-Li Wei ◽  
Yue-Shuai Pan ◽  
Yan Zhang ◽  
Kai Chen ◽  
Hao-Yu Wang ◽  
...  

Abstract. Objective: To study the application of a machine learning algorithm for predicting gestational diabetes mellitus (GDM) in early pregnancy. Methods: This study identified indicators related to GDM through a literature review and expert discussion. Pregnant women who had attended medical institutions for an antenatal examination from November 2017 to August 2018 were selected for analysis, and the collected indicators were retrospectively analyzed. Based on Python, the indicators were classified and modeled using a random forest regression algorithm, and the performance of the prediction model was analyzed. Results: We obtained 4806 analyzable samples from 1625 pregnant women. Among these, 3265 samples with all 67 indicators were used to establish data set F1; 4806 samples with 38 identical indicators were used to establish data set F2. Each of F1 and F2 was used for training the random forest algorithm. The overall predictive accuracy of the F1 model was 93.10%, area under the receiver operating characteristic curve (AUC) was 0.66, and the predictive accuracy of GDM-positive cases was 37.10%. The corresponding values for the F2 model were 88.70%, 0.87, and 79.44%. The results thus showed that the F2 prediction model performed better than the F1 model. To explore the impact of sacrificial indicators on GDM prediction, the F3 data set was established using 3265 samples (F1) with 38 indicators (F2). After training, the overall predictive accuracy of the F3 model was 91.60%, AUC was 0.58, and the predictive accuracy of positive cases was 15.85%. Conclusions: In this study, a model for predicting GDM with several input variables (e.g., physical examination, past history, personal history, family history, and laboratory indicators) was established using a random forest regression algorithm. The trained prediction model exhibited a good performance and is valuable as a reference for predicting GDM in women at an early stage of pregnancy. 
In addition, there are certain requirements for the proportions of negative and positive cases in sample data sets when the random forest algorithm is applied to the early prediction of GDM.
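The gap between overall accuracy and accuracy on positive cases reported above is what an imbalanced data set tends to produce. The sketch below illustrates this with confusion-matrix counts. The function names and the example counts are hypothetical, and "predictive accuracy of positive cases" is assumed here to mean sensitivity (recall), which the abstract does not state explicitly.

```python
def overall_accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def positive_case_accuracy(tp, fn):
    """Fraction of truly positive samples that were predicted positive.

    Assumption: this is sensitivity (recall); the abstract does not
    define its positive-case accuracy precisely.
    """
    return tp / (tp + fn)


# Hypothetical counts: 20 positive and 180 negative samples.
example = {"tp": 10, "fp": 10, "tn": 170, "fn": 10}
```

With these illustrative counts the overall accuracy is 0.9 while only half of the positive cases are found, which is the pattern the abstract's remark on the proportions of negative and positive cases points to.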


2002 ◽  
Vol 2 (3) ◽  
pp. 215-226 ◽  
Author(s):  
N. Lahoutifard ◽  
M. Ammann ◽  
L. Gutzwiller ◽  
B. Ervens ◽  
Ch. George

Abstract. The impact of multiphase reactions involving nitrogen dioxide (NO2) and aromatic compounds was simulated in this study. A mechanism (CAPRAM 2.4, MODAC Mechanism) was applied for the aqueous phase reactions, whereas RACM was applied for the gas phase chemistry. Liquid droplets were considered as monodispersed with a mean radius of 0.1 µm and a liquid content (LC) of 50 µg m−3. The multiphase mechanism has been further extended to the chemistry of aromatics, i.e. reactions involving benzene, toluene, xylene, phenol and cresol have been added. In addition, reaction of NO2 with dissociated hydroxyl substituted aromatic compounds has also been implemented. These reactions proceed through charge exchange leading to nitrite ions and therefore to nitrous acid formation. The strength of this source was explored under urban polluted conditions. It was shown that it may increase gas phase HONO levels under some conditions and that the extent of this effect is strongly pH dependent. Especially under moderately acidic conditions (i.e. pH above 4) this source may represent more than 75% of the total HONO/NO2− production rate, but this contribution drops to near zero in acidic droplets (such as those often encountered in urban environments).


2016 ◽  
Author(s):  
Daniel Cariolle ◽  
Philippe Moinat ◽  
Hubert Teyssèdre ◽  
Luc Giraud ◽  
Béatrice Josse ◽  
...  

Abstract. This article reports on the development and tests of the Adaptative Semi-Implicit Scheme (ASIS) solver for the simulation of atmospheric chemistry. To solve the Ordinary Differential Equation systems associated with the time evolution of the species concentrations, ASIS adopts a one-step linearized implicit scheme with specific treatments of the Jacobian of the chemical fluxes. It conserves mass and has a time stepping module to control the accuracy of the numerical solution. In idealized box model simulations ASIS gives results similar to the higher-order implicit schemes derived from the Rosenbrock and Gear methods. When implemented in the MOCAGE CTM and the LMD Mars GCM the ASIS solver performs well and reveals weaknesses and limitations of the original semi-implicit solvers used by these two models. ASIS can be easily adapted to various chemical schemes and further developments are foreseen to increase its computational efficiency, and to include the computation of the concentrations of the species in aqueous phase in addition to gas phase chemistry.
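For orientation, a generic one-step linearized implicit update solves (I - dt*J) dc = dt*f(c) and sets c_new = c + dc, where f is the chemical tendency and J its Jacobian. The sketch below shows that generic step for a two-species toy system. It is not the ASIS solver: the specific Jacobian treatments and the time step control mentioned in the abstract are omitted, and the rate constants and function names are hypothetical.

```python
K_FWD = 1.0   # hypothetical rate constant for A -> B
K_BACK = 1.0  # hypothetical rate constant for B -> A


def rates(conc):
    """Chemical tendencies f(c) for the toy system A <-> B."""
    a, b = conc
    return (-K_FWD * a + K_BACK * b, K_FWD * a - K_BACK * b)


def jacobian(conc):
    """Jacobian of the tendencies (constant for this linear system)."""
    return ((-K_FWD, K_BACK), (K_FWD, -K_BACK))


def linearized_implicit_step(conc, dt):
    """Advance two species by one step of (I - dt*J) dc = dt*f(c)."""
    f0, f1 = rates(conc)
    (j00, j01), (j10, j11) = jacobian(conc)
    # Matrix A = I - dt*J and right-hand side b = dt*f.
    a00, a01 = 1.0 - dt * j00, -dt * j01
    a10, a11 = -dt * j10, 1.0 - dt * j11
    b0, b1 = dt * f0, dt * f1
    # Solve the 2x2 system with Cramer's rule.
    det = a00 * a11 - a01 * a10
    d0 = (b0 * a11 - a01 * b1) / det
    d1 = (a00 * b1 - a10 * b0) / det
    return (conc[0] + d0, conc[1] + d1)
```

Because the columns of J sum to zero in this toy system, the step conserves the total A + B, and it remains stable for very large `dt`, relaxing towards the steady state instead of oscillating.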


2011 ◽  
Vol 11 (8) ◽  
pp. 3653-3671 ◽  
Author(s):  
S. Morin ◽  
R. Sander ◽  
J. Savarino

Abstract. The isotope anomaly (Δ17O) of secondary atmospheric species such as nitrate (NO3−) or hydrogen peroxide (H2O2) has potential to provide useful constraints on their formation pathways. Indeed, the Δ17O of their precursors (NOx, HOx etc.) differs and depends on their interactions with ozone, which is the main source of non-zero Δ17O in the atmosphere. Interpreting variations of Δ17O in secondary species requires an in-depth understanding of the Δ17O of their precursors taking into account non-linear chemical regimes operating under various environmental settings. This article reviews and illustrates a series of basic concepts relevant to the propagation of the Δ17O of ozone to other reactive or secondary atmospheric species within a photochemical box model. We present results from numerical simulations carried out using the atmospheric chemistry box model CAABA/MECCA to explicitly compute the diurnal variations of the isotope anomaly of short-lived species such as NOx and HOx. Using a simplified but realistic tropospheric gas-phase chemistry mechanism, Δ17O was propagated from ozone to other species (NO, NO2, OH, HO2, RO2, NO3, N2O5, HONO, HNO3, HNO4, H2O2) according to the mass-balance equations, through the implementation of various sets of hypotheses pertaining to the transfer of Δ17O during chemical reactions. The model results confirm that diurnal variations in Δ17O of NOx predicted by the photochemical steady-state relationship during the day match those from the explicit treatment, but not at night. Indeed, the Δ17O of NOx is "frozen" at night as soon as the photolytical lifetime of NOx drops below ca. 10 min. We introduce and quantify the diurnally-integrated isotopic signature (DIIS) of sources of atmospheric nitrate and H2O2, which is of particular relevance to larger-scale simulations of Δ17O where high computational costs cannot be afforded.
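The mass-balance propagation mentioned in the abstract can be written down in a generic form. This is a sketch under assumptions rather than the paper's exact formulation: it uses the common linear definition of the anomaly with a coefficient of 0.52, and the symbols P_i (production rate through channel i), L_j (loss rate through channel j) and Δ17O_i(X) (anomaly transferred to X by channel i) are introduced here for illustration.

```latex
% Linear definition of the isotope anomaly (assumed form)
\Delta^{17}\mathrm{O} = \delta^{17}\mathrm{O} - 0.52\,\delta^{18}\mathrm{O}

% Mass balance for a species X with production channels i and loss channels j
\frac{d}{dt}\Big([X]\,\Delta^{17}\mathrm{O}(X)\Big)
  = \sum_i P_i\,\Delta^{17}\mathrm{O}_i(X)
  - \Big(\sum_j L_j\Big)\,\Delta^{17}\mathrm{O}(X)

% Photochemical steady state (\sum_i P_i = \sum_j L_j, left-hand side zero)
\Delta^{17}\mathrm{O}(X)
  = \frac{\sum_i P_i\,\Delta^{17}\mathrm{O}_i(X)}{\sum_i P_i}
```

The steady-state line is the production-weighted mean of the channel anomalies; it holds only while production and loss balance on a time scale short compared with changes in the rates.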


2020 ◽  
Vol 12 (5) ◽  
pp. 41-51
Author(s):  
Shaimaa Mahmoud ◽  
Mahmoud Hussein ◽  
Arabi Keshk

Opinion mining in social network data is considered one of the most important research areas because a large number of users interact with different topics on these networks. This paper discusses the problem of predicting future product ratings from users' comments. Researchers have approached this problem using machine learning algorithms (e.g. Logistic Regression, Random Forest Regression, Support Vector Regression, Simple Linear Regression, Multiple Linear Regression, Polynomial Regression and Decision Tree). However, the accuracy of these techniques still needs to be improved. In this study, we introduce an approach for predicting future product ratings using LR, RFR, and SVR. Our data set consists of tweets and their ratings from 1 to 5. The main goal of our approach is to improve prediction accuracy over existing techniques. SVR predicts future product ratings with a Mean Squared Error (MSE) of 0.4122, the Linear Regression model with an MSE of 0.4986, and Random Forest Regression with an MSE of 0.4770. This is better than the accuracy of existing approaches.


2021 ◽  
Author(s):  
Li Zhang ◽  
Georg Grell ◽  
Stuart McKeen ◽  
Ravan Ahmadov ◽  
Karl Froyd ◽  
...  

Abstract. The global Flow-following finite-volume Icosahedral Model (FIM), which was developed in the Global Systems Laboratory of NOAA/ESRL, has been coupled inline with aerosol and gas-phase chemistry schemes of different complexity using the chemistry and aerosol packages from WRF-Chem v3.7; the coupled model is named FIM-Chem v1. The three chemistry schemes include 1) the simple aerosol modules from the Goddard Chemistry Aerosol Radiation and Transport model, which include only simplified sulfur chemistry, bulk aerosols, and sectional dust and sea salt modules (GOCART); 2) the photochemical gas-phase mechanism RACM coupled to GOCART to determine the impact of more realistic gas-phase chemistry on the GOCART aerosol simulations (RACM_GOCART); and 3) a further sophistication within the aerosol modules by replacing GOCART with a modal aerosol scheme that includes secondary organic aerosols (SOA) based on the VBS approach (RACM_SOA_VBS). FIM-Chem is able to simulate aerosol, gas-phase chemical species and SOA at various spatial resolutions with different levels of complexity and quantify the impact of aerosol on numerical weather predictions (NWP). We compare the results of the RACM_GOCART scheme with those of the GOCART scheme, which uses the default climatological model fields for OH, H2O2, and NO3. We find significant reductions of sulfate that are on the order of 40 % to 80 % over the eastern US and are up to 40 % near the Beijing region over China when using the RACM_GOCART scheme. We also evaluate the model performance by comparing with the Atmospheric Tomography Mission (ATom-1) aircraft measurements in summer 2016. FIM-Chem shows good performance in capturing the aerosol and gas-phase tracers. The model-predicted vertical profiles of biomass burning plumes and dust plumes off western Africa are also reproduced reasonably well.

