prediction uncertainty
Recently Published Documents


Total documents: 261 (five years: 54)
H-index: 30 (five years: 5)

Modelling ◽  
2021 ◽  
Vol 2 (4) ◽  
pp. 753-775
Author(s):  
John C. Chrispell ◽  
Eleanor W. Jenkins ◽  
Kathleen R. Kavanagh ◽  
Matthew D. Parno

Multiple factors, many of them environmental, coalesce to inform agricultural decisions. Farm planning is often done months in advance. These decisions have to be made with the information available at the time, including current trends, historical data, or predictions of what future weather patterns may be. The effort described in this work is geared towards a flexible mathematical and software framework for simulating the impact of meteorological variability on future crop yield. Our framework is data driven and can easily be applied to any location with suitable historical observations. This will enable the site-specific studies that are needed for rigorous risk assessments and climate adaptation planning. The framework combines a physics-based model of crop yield with stochastic process models for meteorological inputs. Combined with techniques from uncertainty quantification, global sensitivity analysis, and machine learning, this hybrid statistical–physical framework makes it possible to study the potential impacts of meteorological uncertainty on future agricultural yields and to identify the environmental variables that contribute the most to prediction uncertainty. To highlight the utility of our general approach, we studied the predicted yields of multiple crops in multiple scenarios constructed from historical data. Using global sensitivity analysis, we then identified the key environmental factors contributing to uncertainty in these scenarios’ predictions.
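As an illustration of the global sensitivity analysis step mentioned in this abstract, the following is a minimal sketch of a Monte Carlo estimate of first-order Sobol indices. It is not the authors' framework: the `toy_yield` model, its three inputs, and their uniform distributions are hypothetical stand-ins chosen so that the exact indices (0.8, 0.2, 0.0) are known.

```python
import random

def first_order_sobol(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices for inputs drawn i.i.d. from U(0, 1)."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B, each n_samples x n_inputs.
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]

    # Output mean and variance, estimated from the pooled A and B evaluations.
    pooled = y_a + y_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / len(pooled)

    indices = []
    for i in range(n_inputs):
        acc = 0.0
        for row_a, row_b, ya, yb in zip(a, b, y_a, y_b):
            # Row of A with only column i replaced by the value from B.
            mixed = row_a[:i] + [row_b[i]] + row_a[i + 1:]
            # Centering yb reduces estimator noise without changing its expectation.
            acc += (yb - mean) * (model(mixed) - ya)
        indices.append(acc / n_samples / variance)
    return indices

def toy_yield(x):
    """Hypothetical linear yield response; the third input has no effect."""
    temperature, rainfall, radiation = x
    return 4.0 * temperature + 2.0 * rainfall + 0.0 * radiation

sobol = first_order_sobol(toy_yield, n_inputs=3)
```

Each index approximates the share of output variance explained by one input alone, so ranking the entries of `sobol` ranks the inputs by their contribution to prediction uncertainty.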


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Florian Huber ◽  
Sven van der Burg ◽  
Justin J. J. van der Hooft ◽  
Lars Ridder

Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity, but this approach is strongly limited by a generally poor correlation between the two metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model’s prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. Furthermore, the prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. We also demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.
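The Monte-Carlo Dropout idea mentioned in this abstract can be sketched in a few lines: dropout stays active at prediction time, the same input is passed through the network many times, and the spread of the outputs serves as an uncertainty estimate. The tiny network below, its weights (`W_HIDDEN`, `W_OUT`), and the drop rate are hypothetical and unrelated to the actual MS2DeepScore architecture.

```python
import math
import random

# Hypothetical fixed weights of a tiny one-hidden-layer regressor (illustration only).
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4], [0.7, 0.6]]
W_OUT = [0.6, -0.4, 0.9, 0.3]

def forward(x, drop_rate, rng):
    """One stochastic forward pass with dropout kept active at inference."""
    keep = 1.0 - drop_rate
    output = 0.0
    for weights, w_out in zip(W_HIDDEN, W_OUT):
        hidden = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU unit
        # Inverted dropout: keep the unit with probability `keep`, rescale survivors.
        if rng.random() < keep:
            output += w_out * hidden / keep
    return output

def mc_dropout_predict(x, n_passes=200, drop_rate=0.2, seed=0):
    """Return the mean and standard deviation over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, drop_rate, rng) for _ in range(n_passes)]
    mean = sum(samples) / n_passes
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n_passes)
    return mean, std

mean, std = mc_dropout_predict([1.0, 2.0])
```

Predictions whose `std` falls below a chosen threshold could then be kept as the more reliable subset, which mirrors the filtering step described in the abstract.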


2021 ◽  
Vol 11 (20) ◽  
pp. 9397
Author(s):  
Yonghwan Jeong

This paper presents an algorithm for passing uncontrolled intersections that integrates stochastic model-predictive control with prediction uncertainty estimation for autonomous vehicles. The algorithm uses information from sensors mounted on the autonomous vehicle and from high-definition intersection maps. It consists of two modules: target state prediction and a motion planner. The target state prediction module predicts the future behavior of intersection-approaching vehicles based on human driving data. A recursive covariance estimator estimates the prediction uncertainty for each approaching vehicle. The desired driving mode is determined based on uncontrolled intersection theory. The estimated prediction uncertainty is used to define the probability distribution of the stochastic model-predictive controller, so that it can cope with the time-varying uncertainty characteristics of the perception algorithm. The constrained stochastic model-predictive controller, which is based on safety indexes, determines the desired longitudinal acceleration. The proposed robust intersection-passing algorithm was evaluated in Monte Carlo computer simulations with a sensor model. The simulation results showed that the proposed algorithm guarantees the minimum safety constraints and improves ride comfort at uncontrolled intersections by estimating the uncertainty of sensors and prediction.
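The abstract does not give the form of its recursive covariance estimator. One common form, shown here purely as an assumption, is an exponentially weighted update that blends the previous covariance with the outer product of the newest zero-mean prediction error. The forgetting factor and the error values are hypothetical.

```python
def update_covariance(cov, error, forgetting=0.95):
    """Blend the previous covariance with the outer product of the newest error.

    Assumes zero-mean prediction errors; `forgetting` close to 1 means slow adaptation.
    """
    n = len(error)
    return [
        [forgetting * cov[i][j] + (1.0 - forgetting) * error[i] * error[j]
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical (position, speed) prediction errors for one approaching vehicle.
errors = [(0.4, -0.1), (0.6, 0.2), (-0.5, 0.1), (0.3, -0.2)]
cov = [[0.0, 0.0], [0.0, 0.0]]
for e in errors:
    cov = update_covariance(cov, e)
```

Because each update needs only the previous estimate and the latest error, the covariance can track time-varying uncertainty without storing the error history.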


2021 ◽  
Author(s):  
Yingnan Gao ◽  
Martin Wu

Background: The 16S rRNA gene has been widely used in microbial diversity studies to determine community composition and structure. 16S rRNA gene copy number (16S GCN) varies among microbial species, and this variation introduces biases into the relative cell abundance estimated using 16S rRNA read counts. To correct the biases, methods (e.g., PICRUST2) have been developed to predict 16S GCN. 16S GCN predictions come with inherent uncertainty, which is often ignored in downstream analyses. However, a recent study suggests that the uncertainty can be so great that copy number correction is not justified in practice. Despite its significant implications for 16S rRNA based microbial diversity studies, the uncertainty associated with 16S GCN predictions has not been well characterized, and its impact on microbial diversity studies needs to be investigated. Results: Here we develop RasperGade16S, a novel method and software to better model and capture the inherent uncertainty in 16S rRNA GCN prediction. RasperGade16S implements a maximum likelihood framework for a pulsed evolution model and explicitly accounts for intraspecific GCN variation and heterogeneous GCN evolution rates among species. Using cross validation, we show that our method provides robust confidence estimates for the GCN predictions and outperforms PICRUST2 in both precision and recall. We have predicted GCN for 592605 OTUs in the SILVA database and tested 113842 bacterial communities that represent an exhaustive and diverse list of engineered and natural environments. We found that the prediction uncertainty is small enough for 99% of the communities that 16S GCN correction should improve their compositional and functional profiles estimated using 16S rRNA reads. On the other hand, we found that GCN variation has limited impact on beta-diversity analyses such as PCoA, PERMANOVA and random forest test.
Conclusion: We have developed a method to accurately account for uncertainty in 16S rRNA GCN predictions and the downstream analyses. For almost all 16S rRNA surveyed bacterial communities, correction of 16S GCN should improve the results when estimating their compositional and functional profiles. However, such correction is not necessary for beta-diversity analyses.
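The copy number correction discussed in this abstract amounts to dividing each taxon's read count by its gene copy number and renormalizing. The sketch below shows only that arithmetic. It does not reproduce RasperGade16S or its uncertainty model, and the taxon names and values are hypothetical.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Convert 16S read counts into relative cell abundances."""
    # Reads per taxon divided by its copy number approximates the cell count.
    cells = {taxon: reads / copy_numbers[taxon] for taxon, reads in read_counts.items()}
    total = sum(cells.values())
    return {taxon: value / total for taxon, value in cells.items()}

# Hypothetical community: taxon_A carries 7 copies of the gene, taxon_B carries 1.
reads = {"taxon_A": 700, "taxon_B": 300}
gcn = {"taxon_A": 7, "taxon_B": 1}
abundance = correct_for_copy_number(reads, gcn)
```

In this example taxon_A accounts for 70% of the reads but only 25% of the cells after correction, which is the kind of bias the abstract describes.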


Author(s):  
Xiaohui Qi ◽  
Hao Wang ◽  
Jian Chu ◽  
Kiefer Chiam

This study evaluated the performance of various autocorrelation function (ACF) models in predicting the geological interface using a well-known conditional random field method. Prediction accuracies and uncertainties were compared between a flexible Matérn model and two classical ACF models: the Gaussian model and the single exponential model. The rockhead data of Bukit Timah granite from boreholes at two sites in Singapore, as well as simulated data, were used for the comparisons. The results showed that a classical model produces a reasonable prediction uncertainty only when its smoothness coefficient is consistent with that of the geological data. Otherwise, the classical models may produce prediction errors much larger than those of the Matérn model. On the other hand, the prediction accuracy of the Matérn model is affected by the spacing of the boreholes. When the borehole spacing is relatively small (< 0.4 × scale of fluctuation), the Matérn model can reasonably quantify the prediction uncertainty. However, when the borehole spacing is large, the prediction by the Matérn model becomes less accurate than the prediction using the classical models with the right value of the smoothness coefficient, because the estimation error of the smoothness coefficient is large.
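To make the relationship between the three ACF models concrete, the sketch below evaluates the standard closed forms of the Matérn correlation for half-integer smoothness values next to the single exponential and Gaussian models. It uses a generic length scale rather than the scale of fluctuation used in the study, so treat the parameterization as an assumption.

```python
import math

def single_exponential(lag, length):
    """Single exponential ACF (rough, non-differentiable fields)."""
    return math.exp(-lag / length)

def gaussian(lag, length):
    """Gaussian (squared exponential) ACF (very smooth fields)."""
    return math.exp(-lag ** 2 / (2.0 * length ** 2))

def matern(lag, length, nu):
    """Matern ACF for the half-integer smoothness values with simple closed forms."""
    r = lag / length
    if nu == 0.5:
        return math.exp(-r)
    if nu == 1.5:
        s = math.sqrt(3.0) * r
        return (1.0 + s) * math.exp(-s)
    if nu == 2.5:
        s = math.sqrt(5.0) * r
        return (1.0 + s + s ** 2 / 3.0) * math.exp(-s)
    raise ValueError("closed form implemented only for nu in {0.5, 1.5, 2.5}")

# Correlation at a short lag for increasing smoothness.
short_lag = [matern(0.1, 1.0, nu) for nu in (0.5, 1.5, 2.5)] + [gaussian(0.1, 1.0)]
```

With smoothness 0.5 the Matérn model coincides with the single exponential model, and as the smoothness grows it approaches the Gaussian model, which is why a single Matérn family can cover both classical cases.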


Forecasting ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 501-516
Author(s):  
Feifei Yang ◽  
Diego Cerrai ◽  
Emmanouil N. Anagnostou

Weather-related power outages affect millions of utility customers every year. Predicting storm outages with lead times of up to five days could help utilities to allocate crews and resources and devise cost-effective restoration plans that meet the strict time and efficiency requirements imposed by regulatory authorities. In this study, we construct a numerical experiment to evaluate how weather parameter uncertainty, based on weather forecasts with one to five days of lead time, propagates into outage prediction error. We apply a machine-learning-based outage prediction model to storm-caused outage events that occurred between 2016 and 2019 in the northeastern United States. The model predictions, fed by weather analysis and other environmental parameters including land cover, tree canopy, vegetation characteristics, and utility infrastructure variables, exhibited a mean absolute percentage error of 38%, a Nash–Sutcliffe efficiency of 0.54, and a normalized centered root mean square error of 68%. Our numerical experiment demonstrated that uncertainties in the precipitation and wind-gust variables play a significant role in the outage prediction uncertainty, while sustained wind and temperature parameters play a less important role. We showed that, while the overall weather forecast uncertainty increases gradually with lead time, the corresponding outage prediction uncertainty exhibited a lower dependence on lead times up to three days and a stepwise increase at the four- and five-day lead times.
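The three error metrics quoted in this abstract can be computed as sketched below. The definitions of the mean absolute percentage error and the Nash–Sutcliffe efficiency are standard. Normalizing the centered root mean square error by the mean of the observations is an assumption, since the abstract does not state which normalization was used. The sample values are hypothetical.

```python
import math

def mape(obs, pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(p - o) / abs(o) for o, p in zip(obs, pred)) / len(obs)

def nash_sutcliffe(obs, pred):
    """1 minus the ratio of residual variance to the variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def normalized_centered_rmse(obs, pred):
    """RMSE after removing each series' mean, in percent of the mean observation.

    Normalizing by the observed mean is an assumption, not taken from the abstract.
    """
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    centered = sum(((p - mean_pred) - (o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 100.0 * math.sqrt(centered / len(obs)) / mean_obs

# Hypothetical outage counts for three storms.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 440.0]
scores = (mape(observed, predicted),
          nash_sutcliffe(observed, predicted),
          normalized_centered_rmse(observed, predicted))
```

The centered metric ignores a constant bias between predictions and observations, so it isolates errors in the pattern of variation from errors in the overall level.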


2021 ◽  
Vol 150 (1) ◽  
pp. 215-224
Author(s):  
Ying Zhang ◽  
Qiulong Yang ◽  
Kunde Yang

Author(s):  
Huaqiang Wen ◽  
Yang Su ◽  
Zihao Wang ◽  
Saimeng Jin ◽  
Jingzheng Ren ◽  
...  

Quantitative structure-property relationship (QSPR) studies based on deep neural networks (DNN) are receiving increasing attention due to their excellent performance. A systematic methodology coupling multiple machine learning technologies is proposed to solve vital problems in DNN-based QSPRs, including the applicability domain and prediction uncertainty. Key features are rapidly extracted from plentiful but chaotic descriptors by principal component analysis (PCA) and kernel PCA. Then, a detailed applicability domain (AD) is defined by the K-means algorithm to avoid unreliable predictions and to discover its potential impact on uncertainty. Moreover, prediction uncertainty is analyzed with a dropout-embedded DNN through thousands of independent tests to assess the reliability of predictions. The prediction of flashpoint temperature is employed as a case study, demonstrating that the model accuracy is remarkably improved compared with the reference model. More importantly, the proposed methodology overcomes difficulties in analyzing the uncertainty of DNN-based QSPRs and presents an AD correlated with the uncertainty.


2021 ◽  
Author(s):  
Florian Huber ◽  
Sven van der Burg ◽  
Justin J.J. van der Hooft ◽  
Lars Ridder

Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are considered characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity but this approach is strongly limited by a generally poor correlation between both metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of >100,000 mass spectra of about 15,000 unique known compounds, MS2DeepScore learns to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model's prediction uncertainty. On 3,600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly-reliable structural matches and predicts Tanimoto scores with a root mean squared error of about 0.15. The prediction uncertainty estimate can be used to select a subset of predictions with a root mean squared error of about 0.1. We demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. Added to the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity metrics have great potential for a range of metabolomics data processing pipelines.

