Coupled online learning as a way to tackle instabilities  and biases in neural network parameterizations: general algorithms and Lorenz 96 case study (v1.0)

Abstract. Over the last couple of years, machine learning parameterizations have emerged as a potential way to improve the representation of subgrid processes in Earth system models (ESMs). So far, all studies were based on the same three-step approach: first a training dataset was created from a high-resolution simulation, then a machine learning algorithm was fitted to this dataset, before the trained algorithm was implemented in the ESM. The resulting online simulations were frequently plagued by instabilities and biases. Here, coupled online learning is proposed as a way to combat these issues. Coupled learning can be seen as a second training stage in which the pretrained machine learning parameterization, specifically a neural network, is run in parallel with a high-resolution simulation. The high-resolution simulation is kept in sync with the neural network-driven ESM through constant nudging. This enables the neural network to learn from the tendencies that the high-resolution simulation would produce if it experienced the states the neural network creates. The concept is illustrated using the Lorenz 96 model, where coupled learning is able to recover the “true” parameterizations. Further, detailed algorithms for the implementation of coupled learning in 3D cloud-resolving models and the super parameterization framework are presented. Finally, outstanding challenges and issues not resolved by this approach are discussed.

Download Full-text

Coupled online learning as a way to tackle instabilities and biases in neural network parameterizations: general algorithms and Lorenz96 case study (v1.0)

10.5194/gmd-2019-319 ◽

2020 ◽

Author(s):

Stephan Rasp

Keyword(s):

Neural Network ◽

Machine Learning ◽

Online Learning ◽

High Resolution ◽

Machine Learning Algorithms ◽

Training Dataset ◽

The Neural Network ◽

Training Stage ◽

Lorenz 96 ◽

Resolution Simulation

Abstract. Over the last couple of years, machine learning parameterizations have emerged as a potential way to improve the representation of sub-grid processes in Earth System Models (ESMs). So far, all studies were based on the same three-step approach: first a training dataset was created from a high-resolution simulation, then a machine learning algorithms was fitted to this dataset, before the trained algorithms was implemented in the ESM. The resulting online simulations were frequently plagued by instabilities and biases. Here, coupled online learning is proposed as a way to combat these issues. Coupled learning can be seen as a second training stage in which the pretrained machine learning parameterization, specifically a neural network, is run in parallel with a high-resolution simulation. The high-resolution simulation is kept in sync with the neural network-driven ESM through constant nudging. This enables the neural network to learn from the tendencies that the high-resolution simulation would produce if it experienced the states the neural network creates. The concept is illustrated using the Lorenz 96 model, where coupled learning is able to recover the "true" parameterizations. Further, detailed algorithms for the implementation of coupled learning in 3D cloud-resolving models and the super parameterization framework are presented. Finally, outstanding challenges and issues not resolved by this approach are discussed.

Download Full-text

Predicting Alert Source Device using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1526.079920 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1-10

Keyword(s):

Neural Network ◽

Machine Learning ◽

Model Building ◽

Learning Algorithm ◽

Learning Algorithms ◽

Research Work ◽

Machine Learning Algorithms ◽

Training Dataset ◽

Imbalanced Dataset ◽

Daunting Task

In a large distributed virtualized environment, predicting the alerting source from its text seems to be daunting task. This paper explores the option of using machine learning algorithm to solve this problem. Unfortunately, our training dataset is highly imbalanced. Where 96% of alerting data is reported by 24% of alerting sources. This is the expected dataset in any live distributed virtualized environment, where new version of device will have relatively less alert compared to older devices. Any classification effort with such imbalanced dataset present different set of challenges compared to binary classification. This type of skewed data distribution makes conventional machine learning less effective, especially while predicting the minority device type alerts. Our challenge is to build a robust model which can cope with this imbalanced dataset and achieves relative high level of prediction accuracy. This research work stared with traditional regression and classification algorithms using bag of words model. Then word2vec and doc2vec models are used to represent the words in vector formats, which preserve the sematic meaning of the sentence. With this alerting text with similar message will have same vector form representation. This vectorized alerting text is used with Logistic Regression for model building. This yields better accuracy, but the model is relatively complex and demand more computational resources. Finally, simple neural network is used for this multi-class text classification problem domain by using keras and tensorflow libraries. A simple two layered neural network yielded 99 % accuracy, even though our training dataset was not balanced. This paper goes through the qualitative evaluation of the different machine learning algorithms and their respective result. Finally, two layered deep learning algorithms is selected as final solution, since it takes relatively less resource and time with better accuracy values.

Download Full-text

SIGNAL RECOGNITION SYSTEM MACHINE LEARNING ALGORITHM

Applied Mathematics and Fundamental Informatics ◽

10.25206/2311-4908-2020-7-1-53-59 ◽

2020 ◽

pp. 053-059

Author(s):

D. E. SCHUMACHER ◽

Keyword(s):

Neural Network ◽

Machine Learning ◽

Software Defined Radio ◽

Learning Algorithm ◽

Recognition System ◽

Digital Information ◽

Signal Recognition ◽

Radio System ◽

The Neural Network ◽

Radio Signals

The article provides a comparative analysis of processing algorithms for digital information methods of machine learning. A research system for recognizing radio signals based on an artificial neural network as part of a software-defined radio system (SDR). The neural network was designed in the MatLab program, simulation signals were generated, and an algorithm for training the neural network was developed.

Download Full-text

MultiResU-Net: Neural Network for Salt Bodies Delineation and QC Manual Interpretation

10.4043/31169-ms ◽

2021 ◽

Author(s):

Yesser HajNasser

Keyword(s):

Neural Network ◽

Machine Learning ◽

Ground Truth ◽

Complex Salt ◽

Training Dataset ◽

Systematic Bias ◽

Spatial Features ◽

The Neural Network ◽

Seismic Image ◽

Machine Learning Approach

Abstract Accurate delineation of salt bodies is essential for the characterization of hydrocarbon accumulation and seal efficiency in offshore reservoirs. The interpretation of these subsurface features is heavily dependent on visual picking. This in turn could introduce systematic bias into the task of salt body interpretation. In this study, we introduce a novel machine learning approach of a deep neural network to mimic an experienced geophysical interpreter's intellect in interpreting salt bodies. Here, the benefits of using machine learning are demonstrated by implementing the MultiResU-Net network. The network is an improved form of the classic U-Net. It presents two key architectural improvements. First, it replaces the simple convolutional layers with inception-like blocks with varying kernel sizes to reconcile the spatial features learned from different seismic image contexts. Second, it incorporates residual convolutional layers along the skip connections between the downsampling and the upsampling paths. This aims at compensating for the disparity between the lower-level features coming from the early stages of the downsampling path and the much higher-level features coming from the upsampling path. From the primary results using the TGS Salt Identification Challenge dataset, the MultiResU-Net outperformed the classic U-Net in identifying salt bodies and showed good agreement with the ground truth. Additionally, in the case of complex salt body geometries, the MultiResU-Net predictions exhibited some intriguing differences with the ground truth interpretation. Although the network validation accuracy is about 95%, some of these occasional discrepancies between the neural network predictions and the ground truth highlighted the subjectivity of the manual interpretation. Consequently, this raises the need to incorporate these neural networks that are prone to random perturbations to QC manual geophysical interpretation. To bridge the gap between the human interpretation and the machine learning predictions, we propose a closed-loop-machine-learning workflow that aims at optimizing the training dataset by incorporating both the consistency of the neural network and the intellect of an experienced geophysical interpreter.

Download Full-text

Discretization and machine learning approximation of BSDEs with a constraint on the Gains-process

Monte Carlo Methods and Applications ◽

10.1515/mcma-2020-2080 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Idris Kharroubi ◽

Thomas Lim ◽

Xavier Warin

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Differential Equations ◽

Numerical Experiments ◽

Optimization Problem ◽

Learning Approach ◽

The Neural Network ◽

Machine Learning Approach ◽

Mesh Grid

AbstractWe study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh grid converges to zero. We then focus on the approximation of the discretely constrained BSDE. For that we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We end by numerical experiments.

Download Full-text

Deep Learning Classification of Canine Behavior Using a Single Collar-Mounted Accelerometer: Real-World Validation

Animals ◽

10.3390/ani11061549 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1549

Author(s):

Robert D. Chambers ◽

Nathanael C. Yoder ◽

Aletha B. Carson ◽

Christian Junge ◽

David E. Allen ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Real World ◽

Learning Algorithm ◽

Drinking Behavior ◽

True Positive Rate ◽

Training Dataset ◽

Activity Levels ◽

Accelerometer Data ◽

Activity Monitors

Collar-mounted canine activity monitors can use accelerometer data to estimate dog activity levels, step counts, and distance traveled. With recent advances in machine learning and embedded computing, much more nuanced and accurate behavior classification has become possible, giving these affordable consumer devices the potential to improve the efficiency and effectiveness of pet healthcare. Here, we describe a novel deep learning algorithm that classifies dog behavior at sub-second resolution using commercial pet activity monitors. We built machine learning training databases from more than 5000 videos of more than 2500 dogs and ran the algorithms in production on more than 11 million days of device data. We then surveyed project participants representing 10,550 dogs, which provided 163,110 event responses to validate real-world detection of eating and drinking behavior. The resultant algorithm displayed a sensitivity and specificity for detecting drinking behavior (0.949 and 0.999, respectively) and eating behavior (0.988, 0.983). We also demonstrated detection of licking (0.772, 0.990), petting (0.305, 0.991), rubbing (0.729, 0.996), scratching (0.870, 0.997), and sniffing (0.610, 0.968). We show that the devices’ position on the collar had no measurable impact on performance. In production, users reported a true positive rate of 95.3% for eating (among 1514 users), and of 94.9% for drinking (among 1491 users). The study demonstrates the accurate detection of important health-related canine behaviors using a collar-mounted accelerometer. We trained and validated our algorithms on a large and realistic training dataset, and we assessed and confirmed accuracy in production via user validation.

Download Full-text

Study on Influence of Range of Data in Concrete Compressive Strength with Respect to the Accuracy of Machine Learning with Linear Regression

Applied Sciences ◽

10.3390/app11093866 ◽

2021 ◽

Vol 11 (9) ◽

pp. 3866

Author(s):

Jun-Ryeol Park ◽

Hye-Jin Lee ◽

Keun-Hyeok Yang ◽

Jung-Keun Kook ◽

Sanghee Kim

Keyword(s):

Machine Learning ◽

Compressive Strength ◽

Linear Regression ◽

Coarse Aggregate ◽

Learning Algorithm ◽

Linear Regression Analysis ◽

Aggregate Size ◽

Training Dataset ◽

Machine Learning Algorithm ◽

Concrete Compressive Strength

This study aims to predict the compressive strength of concrete using a machine-learning algorithm with linear regression analysis and to evaluate its accuracy. The open-source software library TensorFlow was used to develop the machine-learning algorithm. In the machine-earning algorithm, a total of seven variables were set: water, cement, fly ash, blast furnace slag, sand, coarse aggregate, and coarse aggregate size. A total of 4297 concrete mixtures with measured compressive strengths were employed to train and testing the machine-learning algorithm. Of these, 70% were used for training, and 30% were utilized for verification. For verification, the research was conducted by classifying the mixtures into three cases: the case where the machine-learning algorithm was trained using all the data (Case-1), the case where the machine-learning algorithm was trained while maintaining the same number of training dataset for each strength range (Case-2), and the case where the machine-learning algorithm was trained after making the subcase of each strength range (Case-3). The results indicated that the error percentages of Case-1 and Case-2 did not differ significantly. The error percentage of Case-3 was far smaller than those of Case-1 and Case-2. Therefore, it was concluded that the range of training dataset of the concrete compressive strength is as important as the amount of training dataset for accurately predicting the concrete compressive strength using the machine-learning algorithm.

Download Full-text

Unraveling the deep learning gearbox in optical coherence tomography image segmentation towards explainable artificial intelligence

Communications Biology ◽

10.1038/s42003-021-01697-y ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Peter M. Maloca ◽

Philipp L. Müller ◽

Aaron Y. Lee ◽

Adnan Tufail ◽

Konstantinos Balaskas ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Optical Coherence Tomography ◽

Image Segmentation ◽

Convolutional Neural Network ◽

Learning Algorithm ◽

Ground Truth ◽

Optical Coherence Tomography Image ◽

Optical Coherence ◽

Tomography Image

AbstractMachine learning has greatly facilitated the analysis of medical data, while the internal operations usually remain intransparent. To better comprehend these opaque procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among graders and the machine learning algorithm, as well as a smart data visualization (‘neural recording’). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly minor to 2.02% among human graders. The ambiguity in ground truth had noteworthy impact on machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions dependent on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.

Download Full-text

End-to-End Autonomous Driving Through Dueling Double Deep Q-Network

Automotive Innovation ◽

10.1007/s42154-021-00151-3 ◽

2021 ◽

Author(s):

Baiyu Peng ◽

Qi Sun ◽

Shengbo Eben Li ◽

Dongsuk Kum ◽

Yuming Yin ◽

...

Keyword(s):

Neural Network ◽

State Space ◽

Learning Algorithm ◽

Rapid Development ◽

Autonomous Driving ◽

Saliency Map ◽

Hierarchical Architecture ◽

Link Type ◽

The Neural Network ◽

End To End

AbstractRecent years have seen the rapid development of autonomous driving systems, which are typically designed in a hierarchical architecture or an end-to-end architecture. The hierarchical architecture is always complicated and hard to design, while the end-to-end architecture is more promising due to its simple structure. This paper puts forward an end-to-end autonomous driving method through a deep reinforcement learning algorithm Dueling Double Deep Q-Network, making it possible for the vehicle to learn end-to-end driving by itself. This paper firstly proposes an architecture for the end-to-end lane-keeping task. Unlike the traditional image-only state space, the presented state space is composed of both camera images and vehicle motion information. Then corresponding dueling neural network structure is introduced, which reduces the variance and improves sampling efficiency. Thirdly, the proposed method is applied to The Open Racing Car Simulator (TORCS) to demonstrate its great performance, where it surpasses human drivers. Finally, the saliency map of the neural network is visualized, which indicates the trained network drives by observing the lane lines. A video for the presented work is available online, https://youtu.be/76ciJmIHMD8 or https://v.youku.com/v_show/id_XNDM4ODc0MTM4NA==.html.

Download Full-text

On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn’t

Genes ◽

10.3390/genes12040527 ◽

2021 ◽

Vol 12 (4) ◽

pp. 527

Author(s):

Eran Elhaik ◽

Dan Graur

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

A Priori ◽

Neutral Theory ◽

Dominant Mode ◽

Supervised Machine Learning ◽

Training Dataset ◽

Selective Sweeps ◽

Two Factors ◽

Negative Controls

In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled “Soft sweeps are the dominant mode of adaptation in the human genome” (Schrider and Kern, Mol. Biol. Evolut. 2017, 34(8), 1863–1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, Mol. Biol. Evolut. 2018, 35(6), 1366–1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern’s paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known a priori to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.

Download Full-text