Bayesian Update with Importance Sampling: Required Sample Size

Daniel Sanz-Alonso; Zijian Wang

doi:10.3390/e23010022

Trigonometric Inference Providing Learning in Deep Neural Networks

Applied Sciences ◽

10.3390/app11156704 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6704

Author(s):

Jingyong Cai ◽

Masashi Takemoto ◽

Yuming Qiu ◽

Hironori Nakajo

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Activation Function ◽

Trigonometric Approximation ◽

Model Parameters ◽

Training Algorithms ◽

Activation Functions ◽

Classical Training ◽

Sum Formula

Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous discoveries have revealed the superiority when activation functions, such as the sigmoid, are calculated by shift-and-add operations, although they fail to remove multiplications in training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product to sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple add-and-shift operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.

Effects of Training Set Size on Supervised Machine-Learning Land-Cover Classification of Large-Area High-Resolution Remotely Sensed Data

Remote Sensing ◽

10.3390/rs13030368 ◽

2021 ◽

Vol 13 (3) ◽

pp. 368

Author(s):

Christopher A. Ramezan ◽

Timothy A. Warner ◽

Aaron E. Maxwell ◽

Bradley S. Price

Keyword(s):

Machine Learning ◽

Sample Size ◽

Remotely Sensed ◽

Training Data ◽

Supervised Machine Learning ◽

Sample Sizes ◽

Remotely Sensed Data ◽

Large Area ◽

Training Set ◽

Set Size

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU, and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU however required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically less than RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, and minimal variations in overall accuracy between very large and small sample sets, as well as relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.

Analysis and Prediction of COVID-19 Using SIR, SEIQR, and Machine Learning Models: Australia, Italy, and UK Cases

Information ◽

10.3390/info12030109 ◽

2021 ◽

Vol 12 (3) ◽

pp. 109 ◽

Cited By ~ 1

Author(s):

Iman Rahimi ◽

Amir H. Gandomi ◽

Panagiotis G. Asteris ◽

Fang Chen

Keyword(s):

Machine Learning ◽

Logistic Function ◽

Prediction Performance ◽

Machine Learning Algorithms ◽

Model Parameters ◽

The Novel ◽

Chinese City ◽

Limited Memory ◽

Increasing Trend ◽

Novel Coronavirus

The novel coronavirus disease, also known as COVID-19, is a disease outbreak that was first identified in Wuhan, a Central Chinese city. In this report, a short analysis focusing on Australia, Italy, and UK is conducted. The analysis includes confirmed and recovered cases and deaths, the growth rate in Australia compared with that in Italy and UK, and the trend of the disease in different Australian regions. Mathematical approaches based on susceptible, infected, and recovered (SIR) cases and susceptible, exposed, infected, quarantined, and recovered (SEIQR) cases models are proposed to predict epidemiology in the above-mentioned countries. Since the performance of the classic forms of SIR and SEIQR depends on parameter settings, some optimization algorithms, namely Broyden–Fletcher–Goldfarb–Shanno (BFGS), conjugate gradients (CG), limited memory bound constrained BFGS (L-BFGS-B), and Nelder–Mead, are proposed to optimize the parameters and the predictive capabilities of the SIR and SEIQR models. The results of the optimized SIR and SEIQR models were compared with those of two well-known machine learning algorithms, i.e., the Prophet algorithm and logistic function. The results demonstrate the different behaviors of these algorithms in different countries as well as the better performance of the improved SIR and SEIQR models. Moreover, the Prophet algorithm was found to provide better prediction performance than the logistic function, as well as better prediction performance for Italy and UK cases than for Australian cases. Therefore, it seems that the Prophet algorithm is suitable for data with an increasing trend in the context of a pandemic. Optimization of SIR and SEIQR model parameters yielded a significant improvement in the prediction accuracy of the models. Despite the availability of several algorithms for trend predictions in this pandemic, there is no single algorithm that would be optimal for all cases.

Importance Sampling and Necessary Sample Size: An Information Theory Approach

SIAM/ASA Journal on Uncertainty Quantification ◽

10.1137/16m1093549 ◽

2018 ◽

Vol 6 (2) ◽

pp. 867-879 ◽

Cited By ~ 3

Author(s):

Daniel Sanz-Alonso

Keyword(s):

Information Theory ◽

Sample Size ◽

Importance Sampling ◽

Theory Approach

Conditioning Model Ensembles to Various Observed Data (Field and Regional Level) by Applying Machine-Learning-Augmented Workflows to a Mature Field with 70 Years of Production History

SPE Reservoir Evaluation & Engineering ◽

10.2118/205188-pa ◽

2021 ◽

pp. 1-18

Author(s):

Gisela Vanegas ◽

John Nejedlik ◽

Pascale Neff ◽

Torsten Clemens

Keyword(s):

Machine Learning ◽

Oil Recovery ◽

Numerical Models ◽

Operating Conditions ◽

Model Parameters ◽

Large Set ◽

Model Parameter ◽

Production History ◽

Hydrocarbon Fields ◽

Parameter Distributions

Summary Forecasting production from hydrocarbon fields is challenging because of the large number of uncertain model parameters and the multitude of observed data that are measured. The large number of model parameters leads to uncertainty in the production forecast from hydrocarbon fields. Changing operating conditions [e.g., implementation of improved oil recovery or enhanced oil recovery (EOR)] results in model parameters becoming sensitive in the forecast that were not sensitive during the production history. Hence, simulation approaches need to be able to address uncertainty in model parameters as well as conditioning numerical models to a multitude of different observed data. Sampling from distributions of various geological and dynamic parameters allows for the generation of an ensemble of numerical models that could be falsified using principal-component analysis (PCA) for different observed data. If the numerical models are not falsified, machine-learning (ML) approaches can be used to generate a large set of parameter combinations that can be conditioned to the different observed data. The data conditioning is followed by a final step ensuring that parameter interactions are covered. The methodology was applied to a sandstone oil reservoir with more than 70 years of production history containing dozens of wells. The resulting ensemble of numerical models is conditioned to all observed data. Furthermore, the resulting posterior-model parameter distributions are only modified from the prior-model parameter distributions if the observed data are informative for the model parameters. Hence, changes in operating conditions can be forecast under uncertainty, which is essential if nonsensitive parameters in the history are sensitive in the forecast.

Prediction of Threshold Sand Rates from Acoustic Monitors Using Artificial Intelligence

10.2118/208162-ms ◽

2021 ◽

Author(s):

Ronald E. Vieira ◽

Bohan Xu ◽

Asad Nadeem ◽

Ahmed Nadeem ◽

Siamack A. Shirazi

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Noise Level ◽

Background Noise ◽

Pipe Wall ◽

Flow Patterns ◽

Operating Conditions ◽

Flow Conditions ◽

Background Noise Level ◽

Input Parameters

Abstract Solids production from oil and gas wells can cause excessive damage resulting in safety hazards and expensive repairs. To prevent the problems associated with sand influx, ultrasonic devices can be used to provide a warning when sand is being produced in pipelines. One of the most used methods for sand detection is utilizing commercially available acoustic sand monitors that clamp to the outside of pipe wall and measures the acoustic energy generated by sand grain impacts on the inner side of a pipe wall. Although the transducer used by acoustic monitors is especially sensitive to acoustic emissions due to particle impact, it also reacts to flow induced noise as well (background noise). The acoustic monitor output does not exceed the background noise level until a sufficient sand rate is entrained in the flow that causes a signal output that is higher than the background noise level. This sand rate is referred to as the threshold sand rate or TSR. A significant amount of data has been compiled over the years for TSR at the Tulsa University Sand Management Projects (TUSMP) for various flow conditions with stainless steel pipe material. However, to use this data to develop a model for different flow patterns, fluid properties, pipe, and sand sizes is challenging. The purpose of this work is to develop an artificial intelligence (AI) methodology using machine learning (ML) models to determine TSR for a broad range of operating conditions. More than 250 cases from previous literature as well as ongoing research have been used to train and test the ML models. The data utilized in this work has been generated mostly in a large-scale multiphase flow loop for sand sizes ranging from 25 to 300 μm varying sand concentrations and pipe diameters from 25.4 mm to 101.6 mm ID in vertical and horizontal directions downstream of elbows. The ML algorithms including elastic net, random forest, support vector machine and gradient boosting, are optimized using nested cross-validation and the model performance is evaluated by R-squared score. The machine learning models were used to predict TSR for various velocity combinations under different flow patterns with sand. The sensitivity to changes of input parameters on predicted TSR was also investigated. The method for TSR prediction based on ML algorithms trained on lab data is also validated on actual field conditions available in the literature. The AI method results reveal a good training performance and prediction for a variety of flow conditions and pipe sizes not tested before. This work provides a framework describing a novel methodology with an expanded database to utilize Artificial Intelligence to correlate the TSR with the most common production input parameters.

Dynamic Detection of Delayed Cerebral Ischemia Using Machine Learning

10.1101/2020.04.15.20067041 ◽

2020 ◽

Author(s):

Murad Megjhani ◽

Kalijah Terilli ◽

Ayham Alkhachroum ◽

David J. Roh ◽

Sachin Agarwal ◽

...

Keyword(s):

Machine Learning ◽

Cerebral Ischemia ◽

Characteristic Curve ◽

Delayed Cerebral Ischemia ◽

Risk Scores ◽

Support Vector ◽

Model Parameters ◽

Learning Approaches ◽

Physiologic Data ◽

Over Time

AbstractObjectiveTo develop a machine learning based tool, using routine vital signs, to assess delayed cerebral ischemia (DCI) risk over time.MethodsIn this retrospective analysis, physiologic data for 540 consecutive acute subarachnoid hemorrhage patients were collected and annotated as part of a prospective observational cohort study between May 2006 and December 2014. Patients were excluded if (i) no physiologic data was available, (ii) they expired prior to the DCI onset window (< post bleed day 3) or (iii) early angiographic vasospasm was detected on admitting angiogram. DCI was prospectively labeled by consensus of treating physicians. Occurrence of DCI was classified using various machine learning approaches including logistic regression, random forest, support vector machine (linear and kernel), and an ensemble classifier, trained on vitals and subject characteristic features. Hourly risk scores were generated as the posterior probability at time t. We performed five-fold nested cross validation to tune the model parameters and to report the accuracy. All classifiers were evaluated for good discrimination using the area under the receiver operating characteristic curve (AU-ROC) and confusion matrices.ResultsOf 310 patients included in our final analysis, 101 (32.6%) patients developed DCI. We achieved maximal classification of 0.81 [0.75-0.82] AU-ROC. We also predicted 74.7 % of all DCI events 12 hours before typical clinical detection with a ratio of 3 true alerts for every 2 false alerts.ConclusionA data-driven machine learning based detection tool offered hourly assessments of DCI risk and incorporated new physiologic information over time.

Resampled dimensional reduction for feature representation in machine learning

10.21203/rs.3.pex-1636/v1 ◽

2021 ◽

Author(s):

Herdiantri Sufriyana ◽

Yu Wei Wu ◽

Emily Chia-Yu Su

Keyword(s):

Machine Learning ◽

Parameter Estimation ◽

Prediction Model ◽

Sample Size ◽

Dimensional Reduction ◽

Latent Variables ◽

Feature Representation ◽

Estimated Parameters ◽

Representation Technique ◽

Selection Of

Abstract We aimed to provide a resampling protocol for dimensional reduction resulting a few latent variables. The applicability focuses on but not limited for developing a machine learning prediction model in order to improve the number of sample size in relative to the number of candidate predictors. By this feature representation technique, one can improve generalization by preventing latent variables to overfit data used to conduct the dimensional reduction. However, this technique may warrant more computational capacity and time to conduct the procedure. The key stages consisted of derivation of latent variables from multiple resampling subsets, parameter estimation of latent variables in population, and selection of latent variables transformed by the estimated parameters.

PERFORMANCE COMPARISON OF MACHINE LEARNING ALGORITHMS FOR PREDICTIVE MAINTENANCE

Informatyka Automatyka Pomiary w Gospodarce i Ochronie Środowiska ◽

10.35784/iapgos.1834 ◽

2020 ◽

Vol 10 (3) ◽

pp. 32-35

Author(s):

Jakub Gęca

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Performance Comparison ◽

Machine Learning Algorithms ◽

Predictive Maintenance ◽

Model Parameters ◽

Data Set ◽

Reduction Techniques ◽

Machine Reliability ◽

Dimensionality Reduction Techniques

The consequences of failures and unscheduled maintenance are the reasons why engineers have been trying to increase the reliability of industrial equipment for years. In modern solutions, predictive maintenance is a frequently used method. It allows to forecast failures and alert about their possibility. This paper presents a summary of the machine learning algorithms that can be used in predictive maintenance and comparison of their performance. The analysis was made on the basis of data set from Microsoft Azure AI Gallery. The paper presents a comprehensive approach to the issue including feature engineering, preprocessing, dimensionality reduction techniques, as well as tuning of model parameters in order to obtain the highest possible performance. The conducted research allowed to conclude that in the analysed case , the best algorithm achieved 99.92% accuracy out of over 122 thousand test data records. In conclusion, predictive maintenance based on machine learning represents the future of machine reliability in industry.

An OGI model for personalized estimation of glucose and insulin concentration in plasma

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2021420 ◽

2021 ◽

Vol 18 (6) ◽

pp. 8499-8523

Author(s):

Weijie Wang ◽

◽

Shaoping Wang ◽

Yixuan Geng ◽

Yajing Qiao ◽

...

Keyword(s):

Kalman Filtering ◽

Gaussian Noise ◽

Particle Filtering ◽

In Silico ◽

Insulin Concentration ◽

Model Parameters ◽

Bayesian Filtering ◽

Extended Kalman Filtering ◽

External Disturbances ◽

Non Gaussian

<abstract><p>Plasma glucose concentration (PGC) and plasma insulin concentration (PIC) are two essential metrics for diabetic regulation, but difficult to be measured directly. Often, PGC and PIC are estimated from continuous glucose monitoring and insulin delivery data. Nevertheless, the inter-individual variability and external disturbance (e.g. carbohydrate intake) bring challenges for accurate estimations. This study is to estimate PGC and PIC adaptively by identifying personalized parameters and external disturbances. An observable glucose-insulin (OGI) dynamic model is established to describe insulin absorption, glucose regulation, and glucose transport. The model parameters and disturbances can be extended to observable state variables and be identified dynamically by Bayesian filtering estimators. Two basic Gaussian noise based Bayesian filtering estimators, extended Kalman filtering (EKF) and unscented Kalman filtering (UKF), are implemented. Recognizing the prevalence of non-Gaussian noise, in this study, two new filtering estimators: particle filtering with Gaussian noise (PFG), and particle filtering with mixed non-Gaussian noise (PFM) are designed and implemented. The proposed OGI model in conjunction with the estimators is evaluated using the data from 30 in-silico subjects and 10 human participants. For in-silico subjects, the OGI with PFM estimator has the ability to estimate PIC and PGC adaptively, achieving RMSE of PIC $ 9.49\pm3.81 $ mU/L, and PGC $ 0.89\pm0.19 $ mmol/L. For human, the OGI with PFM has the promise to identify disturbances ($ 95.46\%\pm0.65\% $ accurate rate of meal identification). OGI model provides a way to fully personalize the parameters and external disturbances in real time, and has potential clinical utility for artificial pancreas.</p></abstract>