A Global Flood Risk Modeling Framework Built With Climate Models and Machine Learning

2021 ◽  
Vol 13 (4) ◽  
Author(s):  
David A. Carozza ◽  
Mathieu Boudreault
2021 ◽  
pp. 002224372110329
Author(s):  
Nicolas Padilla ◽  
Eva Ascarza

The success of Customer Relationship Management (CRM) programs ultimately depends on the firm's ability to identify and leverage differences across customers, a very difficult task when firms attempt to manage new customers, for whom only the first purchase has been observed. For those customers, the lack of repeated observations poses a structural challenge to inferring unobserved differences across them. This is what we call the “cold start” problem of CRM, whereby companies have difficulties leveraging existing data when they attempt to make inferences about customers at the beginning of their relationship. We propose a solution to the cold start problem by developing a probabilistic machine learning modeling framework that leverages the information collected at the moment of acquisition. The main aspect of the model is that it flexibly captures latent dimensions that govern the behaviors observed at acquisition as well as future propensities to buy and to respond to marketing actions using deep exponential families. The model can be integrated with a variety of demand specifications and is flexible enough to capture a wide range of heterogeneity structures. We validate our approach in a retail context and empirically demonstrate the model's ability to identify high-value customers as well as those most sensitive to marketing actions, right after their first purchase.


One Earth ◽  
2021 ◽  
Vol 4 (9) ◽  
pp. 1310-1321
Author(s):  
David Lallemant ◽  
Perrine Hamel ◽  
Mariano Balbi ◽  
Tian Ning Lim ◽  
Rafael Schmitt ◽  
...  

Author(s):  
SOURABH SHRIVASTAVA ◽  
RAM AVTAR ◽  
PRASANTA KUMAR BAL

Coarse-horizontal-resolution global climate models (GCMs) are limited by large biases over mountainous regions. Outputs from a single model or a simple multi-model ensemble (SMME) are also associated with large biases. To improve the prediction of extreme rainfall events, this study uses an alternative modeling approach in which five machine learning (ML) algorithms reduce model biases and thereby improve the skill of North American Multi-Model Ensemble (NMME) GCMs for Indian summer monsoon rainfall from 1982 to 2009. Random forest (RF), AdaBoost (Ada), gradient boosting (Grad), bagging (Bag), and extra trees (Extra) regression models are used, and the results from each model are compared against observations. In the SMME, a wet bias of 20 mm/day and an RMSE of up to 15 mm/day are found over the Himalayan region. However, all the ML models can bring the mean bias down to [Formula: see text][Formula: see text]mm/day and the RMSE down to 2 mm/day. The interannual variability in the ML outputs is closer to observations than that of the SMME. Also, a higher correlation, from 0.5 to 0.8, is found in all ML models than in the SMME. Moreover, RF and Grad perform best of the five ML models, showing a high correlation over the Himalayan region. In conclusion, by taking full advantage of different models, the proposed ML-based multi-model ensemble method is shown to be accurate and effective.


2019 ◽  
Author(s):  
Evan Greene ◽  
Greg Finak ◽  
Leonard A. D’Amico ◽  
Nina Bhardwaj ◽  
Candice D. Church ◽  
...  

Abstract. High-dimensional single-cell cytometry is routinely used to characterize patient responses to cancer immunotherapy and other treatments. This has produced a wealth of datasets ripe for exploration but whose biological and technical heterogeneity make them difficult to analyze with current tools. We introduce a new interpretable machine learning method for single-cell mass and flow cytometry studies, FAUST, that robustly performs unbiased cell population discovery and annotation. FAUST processes data on a per-sample basis and returns biologically interpretable cell phenotypes that can be compared across studies, making it well-suited for the analysis and integration of complex datasets. We demonstrate how FAUST can be used for candidate biomarker discovery and validation by applying it to a flow cytometry dataset from a Merkel cell carcinoma anti-PD-1 trial and discover new CD4+ and CD8+ effector-memory T cell correlates of outcome co-expressing PD-1, HLA-DR, and CD28. We then use FAUST to validate these correlates in an independent CyTOF dataset from a published metastatic melanoma trial. Importantly, existing state-of-the-art computational discovery approaches as well as prior manual analysis did not detect these or any other statistically significant T cell sub-populations associated with anti-PD-1 treatment in either dataset. We further validate our methodology by using FAUST to replicate the discovery of a previously reported myeloid correlate in a different published melanoma trial, and validate the correlate by identifying it de novo in two additional independent trials. FAUST’s phenotypic annotations can be used to perform cross-study data integration in the presence of heterogeneous data and diverse immunophenotyping staining panels, enabling hypothesis-driven inference about cell sub-population abundance through a multivariate modeling framework we call Phenotypic and Functional Differential Abundance (PFDA). We demonstrate this approach on data from myeloid and T cell panels across multiple trials. Together, these results establish FAUST as a powerful and versatile new approach for unbiased discovery in single-cell cytometry.


2015 ◽  
Vol 8 (7) ◽  
pp. 5419-5435 ◽  
Author(s):  
W. Paja ◽  
M. Wrzesień ◽  
R. Niemiec ◽  
W. R. Rudnicki

Abstract. Climate models are extremely complex pieces of software. They reflect the best available knowledge of the physical components of the climate; nevertheless, they contain several parameters that are too weakly constrained by observations and can potentially lead to a simulation crash. Recently, a study by Lucas et al. (2013) showed that machine learning methods can be used to predict which combinations of parameters can lead to a simulation crash, and hence which processes described by these parameters need refined analysis. In the current study we reanalyse the dataset used in that research using a different methodology. We confirm the main conclusion of the original study concerning the suitability of machine learning for predicting crashes. We show that only three of the eight parameters indicated in the original study as relevant for crash prediction are indeed strongly relevant, three others are relevant but redundant, and two are not relevant at all. We also show that the variance due to the split of data between training and validation sets has a large influence on both the accuracy of predictions and the relative importance of variables; hence only a cross-validated approach can deliver robust estimates of performance and variable relevance.
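
The point about split-induced variance can be illustrated with a small, self-contained sketch. It does not use the climate-model crash dataset or the authors' methodology: the synthetic one-dimensional data, the toy threshold classifier, and every function name below are assumptions made only for illustration. It compares how much the accuracy estimate moves across repeated single train/validation splits with how much it moves across repeated 5-fold cross-validation runs on the same data.

```python
import random
import statistics

def make_data(n, rng):
    """Synthetic 1-D, two-class data (hypothetical): class 1 is shifted by +1.0."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(float(label), 1.0), label))
    return data

def fit_threshold(train):
    """Toy classifier: decision threshold halfway between the two class means."""
    m0 = statistics.mean(x for x, y in train if y == 0)
    m1 = statistics.mean(x for x, y in train if y == 1)
    return (m0 + m1) / 2.0

def accuracy(threshold, test):
    """Fraction of test points whose side of the threshold matches the label."""
    hits = sum((x > threshold) == bool(y) for x, y in test)
    return hits / len(test)

def single_split_accuracy(data, rng, test_fraction=0.2):
    """Accuracy estimate from one random train/validation split."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    return accuracy(fit_threshold(train), test)

def k_fold_accuracy(data, rng, k=5):
    """Accuracy estimate averaged over k folds of one random partition."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(fit_threshold(train), folds[i]))
    return statistics.mean(scores)

rng = random.Random(0)
data = make_data(200, rng)
single = [single_split_accuracy(data, rng) for _ in range(30)]
cv = [k_fold_accuracy(data, rng) for _ in range(30)]
print("single split: spread of accuracy =", round(statistics.pstdev(single), 3))
print("5-fold CV:    spread of accuracy =", round(statistics.pstdev(cv), 3))
```

In this toy setting the single-split estimates depend strongly on which 40 points end up in the validation set, while the cross-validated estimates use every point once per run and vary much less. The same reasoning applies to variable-importance scores, which this sketch does not compute.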


Water ◽  
2021 ◽  
Vol 13 (22) ◽  
pp. 3294
Author(s):  
Chentao He ◽  
Jiangfeng Wei ◽  
Yuanyuan Song ◽  
Jing-Jia Luo

The middle and lower reaches of the Yangtze River valley (YRV), which are among the most densely populated regions in China, are subject to frequent flooding. In this study, the predictor importance analysis model was used to sort and select predictors, and five methods (multiple linear regression (MLR), decision tree (DT), random forest (RF), backpropagation neural network (BPNN), and convolutional neural network (CNN)) were used to predict the interannual variation of summer precipitation over the middle and lower reaches of the YRV. Predictions from eight climate models were used for comparison. Of the five tested methods, RF demonstrated the best predictive skill. Starting the RF prediction in December, when its prediction skill was highest, the 70-year correlation coefficient from cross validation of average predictions was 0.473. Using the same five predictors in December 2019, the RF model successfully predicted the YRV wet anomaly in summer 2020, although with a weaker amplitude. It was found that the enhanced warm pool area in the Indian Ocean was the most important causal factor. The BPNN and CNN methods demonstrated the poorest performance. The RF, DT, and climate models all showed higher prediction skill when the predictions started in winter than when they started in early spring, and the RF, DT, and MLR methods all showed better prediction skill than the numerical climate models. Lack of training data was a factor that limited the performance of the machine learning methods. Future studies should use deep learning methods to take full advantage of the potential of ocean, land, sea ice, and other factors for more accurate climate predictions.


Energies ◽  
2021 ◽  
Vol 14 (20) ◽  
pp. 6852
Author(s):  
Grant Buster ◽  
Paul Siratovich ◽  
Nicole Taverna ◽  
Michael Rossol ◽  
Jon Weers ◽  
...  

Geothermal power plants are excellent resources for providing low carbon electricity generation with high reliability. However, many geothermal power plants could realize significant improvements in operational efficiency from the application of improved modeling software. Increased integration of digital twins into geothermal operations will not only enable engineers to better understand the complex interplay of components in larger systems but will also enable enhanced exploration of the operational space with the recent advances in artificial intelligence (AI) and machine learning (ML) tools. Such innovations in geothermal operational analysis have been deterred by several challenges, most notably, the challenge in applying idealized thermodynamic models to imperfect as-built systems with constant degradation of nominal performance. This paper presents GOOML: a new framework for Geothermal Operational Optimization with Machine Learning. By taking a hybrid data-driven thermodynamics approach, GOOML is able to accurately model the real-world performance characteristics of as-built geothermal systems. Further, GOOML can be readily integrated into the larger AI and ML ecosystem for true state-of-the-art optimization. This modeling framework has already been applied to several geothermal power plants and has provided reasonably accurate results in all cases. Therefore, we expect that the GOOML framework can be applied to any geothermal power plant around the world.


2020 ◽  
Vol 13 (5) ◽  
pp. 2355-2377
Author(s):  
Vijay S. Mahadevan ◽  
Iulian Grindeanu ◽  
Robert Jacob ◽  
Jason Sarich

Abstract. One of the fundamental factors contributing to the spatiotemporal inaccuracy in climate modeling is the mapping of solution field data between different discretizations and numerical grids used in the coupled component models. The typical climate computational workflow involves evaluation and serialization of the remapping weights during the preprocessing step, which are then consumed by the coupled driver infrastructure during simulation to compute field projections. Tools like Earth System Modeling Framework (ESMF) (Hill et al., 2004) and TempestRemap (Ullrich et al., 2013) offer capability to generate conservative remapping weights, while the Model Coupling Toolkit (MCT) (Larson et al., 2001) that is utilized in many production climate models exposes functionality to make use of the operators to solve the coupled problem. However, such multistep processes present several hurdles in terms of the scientific workflow and impede research productivity. In order to overcome these limitations, we present a fully integrated infrastructure based on the Mesh Oriented datABase (MOAB) (Tautges et al., 2004; Mahadevan et al., 2015) library, which allows for a complete description of the numerical grids and solution data used in each submodel. Through a scalable advancing-front intersection algorithm, the supermesh of the source and target grids is computed, which is then used to assemble the high-order, conservative, and monotonicity-preserving remapping weights between discretization specifications. The Fortran-compatible interfaces in MOAB are utilized to directly link the submodels in the Energy Exascale Earth System Model (E3SM) to enable online remapping strategies in order to simplify the coupled workflow process. We demonstrate the superior computational efficiency of the remapping algorithms in comparison with other state-of-the-science tools and present strong scaling results on large-scale machines for computing remapping weights between the spectral element atmosphere and finite volume discretizations on the polygonal ocean grids.
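
To make the idea of conservative remapping weights concrete, here is a deliberately simplified sketch. It is not MOAB, TempestRemap, or ESMF code and it is not the advancing-front algorithm described above: it assumes one-dimensional grids covering the same interval, uses first-order (piecewise-constant) weights rather than high-order ones, and all function names are hypothetical. The interval overlaps play the role of the supermesh cells, and each weight is the overlap length divided by the target cell length.

```python
def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))

def remap_weights(src_edges, tgt_edges):
    """First-order conservative weights: W[i][j] = |tgt_i ∩ src_j| / |tgt_i|."""
    weights = []
    for i in range(len(tgt_edges) - 1):
        t0, t1 = tgt_edges[i], tgt_edges[i + 1]
        row = [overlap(t0, t1, src_edges[j], src_edges[j + 1]) / (t1 - t0)
               for j in range(len(src_edges) - 1)]
        weights.append(row)
    return weights

def apply_weights(weights, src_values):
    """Project cell-averaged source values onto the target grid."""
    return [sum(w * v for w, v in zip(row, src_values)) for row in weights]

def integral(edges, values):
    """Integral of a piecewise-constant field: sum of value times cell length."""
    return sum(v * (edges[k + 1] - edges[k]) for k, v in enumerate(values))

# Illustrative grids on [0, 1]; the values are arbitrary cell averages.
src_edges = [0.0, 0.25, 0.5, 0.75, 1.0]
tgt_edges = [0.0, 0.4, 0.7, 1.0]
src_values = [1.0, 3.0, 2.0, 5.0]

W = remap_weights(src_edges, tgt_edges)
tgt_values = apply_weights(W, src_values)

print("source integral:", integral(src_edges, src_values))
print("target integral:", integral(tgt_edges, tgt_values))
```

Because every row of weights is non-negative and sums to one, each target value is a weighted average of source values, so this first-order scheme cannot create new extrema. Because the overlaps tile each source cell exactly, the integral of the field is the same on both grids up to rounding. Weights computed once in this way can be stored and reused at every coupling step, which is the offline workflow the abstract contrasts with online remapping.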

