Multi-Fidelity Physics-Constrained Neural Network and Its Application in Materials Modeling

Abstract Training machine learning tools such as neural networks requires the availability of sizable data, which can be difficult for engineering and scientific applications where experiments or simulations are expensive. In this work, a novel multi-fidelity physics-constrained neural network is proposed to reduce the required amount of training data, where physical knowledge is applied to constrain neural networks, and multi-fidelity networks are constructed to improve training efficiency. A low-cost low-fidelity physics-constrained neural network is used as the baseline model, whereas a limited amount of data from a high-fidelity simulation is used to train a second neural network to predict the difference between the two models. The proposed framework is demonstrated with two-dimensional heat transfer and phase transition problems, which are fundamental in materials modeling. Physics is described by partial differential equations. With the same set of training data, the prediction error of physics-constrained neural network can be one order of magnitude lower than that of a classical artificial neural network without physical constraints. The accuracy of the prediction is comparable to those from direct numerical solutions of equations.

Download Full-text

Methods for maintenance of neural networks in continual learning scenarios

10.36227/techrxiv.14565078.v1 ◽

2021 ◽

Author(s):

Bhasker Sri Harsha Suri ◽

Manish Srivastava ◽

Kalidas Yeturu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Production Systems ◽

Image Data ◽

Training Data ◽

Learning Loss ◽

Unseen Data ◽

Scoring Schemes ◽

The Difference ◽

Continual Learning

Neural networks suffer from catastrophic forgetting problem when deployed in a continual learning scenario where new batches of data arrive over time; however they are of different distributions from the previous data used for training the neural network. For assessing the performance of a model in a continual learning scenario, two aspects are important (i) to compute the difference in data distribution between a new and old batch of data and (ii) to understand the retention and learning behavior of deployed neural networks. Current techniques indicate the novelty of a new data batch by comparing its statistical properties with that of the old batch in the input space. However, it is still an open area of research to consider the perspective of a deployed neural network’s ability to generalize on the unseen data samples. In this work, we report a dataset distance measuring technique that indicates the novelty of a new batch of data while considering the deployed neural network’s perspective. We propose the construction of perspective histograms which are a vector representation of the data batches based on the correctness and confidence in the prediction of the deployed model. We have successfully tested the hypothesis empirically on image data coming MNIST Digits, MNIST Fashion, CIFAR10, for its ability to detect data perturbations of type rotation, Gaussian blur, and translation. Upon new data, given a model and its training data, we have proposed and evaluated four new scoring schemes, retention score (R), learning score (L), Oscore and SP-score for studying how much the model can retain its performance on past data, how much it can learn new data, the combined expression for the magnitude of retention and learning and stability-plasticity characteristics respectively. The scoring schemes have been evaluated MNIST Digits and MNIST Fashion data sets on different types of neural network architectures based on the number of parameters, activation functions, and learning loss functions, and an instance of a typical analysis report is presented. Machine learning model maintenance is a reality in production systems in the industry, and we hope our proposed methodology offers a solution to the need of the day in this aspect.

Download Full-text

Methods for maintenance of neural networks in continual learning scenarios

10.36227/techrxiv.14565078 ◽

2021 ◽

Author(s):

Bhasker Sri Harsha Suri ◽

Manish Srivastava ◽

Kalidas Yeturu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Production Systems ◽

Image Data ◽

Training Data ◽

Learning Loss ◽

Unseen Data ◽

Scoring Schemes ◽

The Difference ◽

Continual Learning

Neural networks suffer from catastrophic forgetting problem when deployed in a continual learning scenario where new batches of data arrive over time; however they are of different distributions from the previous data used for training the neural network. For assessing the performance of a model in a continual learning scenario, two aspects are important (i) to compute the difference in data distribution between a new and old batch of data and (ii) to understand the retention and learning behavior of deployed neural networks. Current techniques indicate the novelty of a new data batch by comparing its statistical properties with that of the old batch in the input space. However, it is still an open area of research to consider the perspective of a deployed neural network’s ability to generalize on the unseen data samples. In this work, we report a dataset distance measuring technique that indicates the novelty of a new batch of data while considering the deployed neural network’s perspective. We propose the construction of perspective histograms which are a vector representation of the data batches based on the correctness and confidence in the prediction of the deployed model. We have successfully tested the hypothesis empirically on image data coming MNIST Digits, MNIST Fashion, CIFAR10, for its ability to detect data perturbations of type rotation, Gaussian blur, and translation. Upon new data, given a model and its training data, we have proposed and evaluated four new scoring schemes, retention score (R), learning score (L), Oscore and SP-score for studying how much the model can retain its performance on past data, how much it can learn new data, the combined expression for the magnitude of retention and learning and stability-plasticity characteristics respectively. The scoring schemes have been evaluated MNIST Digits and MNIST Fashion data sets on different types of neural network architectures based on the number of parameters, activation functions, and learning loss functions, and an instance of a typical analysis report is presented. Machine learning model maintenance is a reality in production systems in the industry, and we hope our proposed methodology offers a solution to the need of the day in this aspect.

Download Full-text

Operational Determination of the Activated Sludge Process Using Neural Networks

Water Science & Technology ◽

10.2166/wst.1992.0762 ◽

1992 ◽

Vol 26 (9-11) ◽

pp. 2461-2464 ◽

Cited By ~ 2

Author(s):

R. D. Tyagi ◽

Y. G. Du

Keyword(s):

Neural Network ◽

Neural Networks ◽

Steady State ◽

Activated Sludge ◽

Feedforward Neural Network ◽

Training Data ◽

Activated Sludge Process

A steady-statemathematical model of an activated sludgeprocess with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.

Download Full-text

Experiments with Neural Networks in the Identification and Control of a Magnetic Levitation System Using a Low-Cost Platform

Applied Sciences ◽

10.3390/app11062535 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2535

Author(s):

Bruno E. Silva ◽

Ramiro S. Barbosa

Keyword(s):

Neural Networks ◽

Neural Control ◽

Magnetic Levitation ◽

Low Cost ◽

Training Data ◽

Neural Controllers ◽

Levitation System ◽

Internal Model Controller ◽

Magnetic Levitation System ◽

And Performance

In this article, we designed and implemented neural controllers to control a nonlinear and unstable magnetic levitation system composed of an electromagnet and a magnetic disk. The objective was to evaluate the implementation and performance of neural control algorithms in a low-cost hardware. In a first phase, we designed two classical controllers with the objective to provide the training data for the neural controllers. After, we identified several neural models of the levitation system using Nonlinear AutoRegressive eXogenous (NARX)-type neural networks that were used to emulate the forward dynamics of the system. Finally, we designed and implemented three neural control structures: the inverse controller, the internal model controller, and the model reference controller for the control of the levitation system. The neural controllers were tested on a low-cost Arduino control platform through MATLAB/Simulink. The experimental results proved the good performance of the neural controllers.

Download Full-text

CrossGR

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3448100 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-23

Author(s):

Xinyi Li ◽

Liqiong Chang ◽

Fangfang Song ◽

Ju Wang ◽

Xiaojiang Chen ◽

...

Keyword(s):

Gesture Recognition ◽

Low Cost ◽

User Involvement ◽

Recognition System ◽

Training Data ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Target User ◽

Order Of Magnitude ◽

Training Examples

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?". This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge (such as who is currently performing a gesture) of the target user. Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data to build an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate many synthetic training data from a small set of real-world examples collected from a small number of users. Such a strategy allows CrossGR to minimize the user involvement and the associated cost in collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to perform gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). We demonstrate that CrossGR delivers comparable recognition accuracy, but uses an order of magnitude less training samples collected from the end-users when compared to state-of-the-art recognition systems.

Download Full-text

Identification of Granule Growth Regimes in High Shear Wet Granulation Processes Using a Physics-Constrained Neural Network

Processes ◽

10.3390/pr9050737 ◽

2021 ◽

Vol 9 (5) ◽

pp. 737

Author(s):

Chaitanya Sampat ◽

Rohit Ramachandran

Keyword(s):

Neural Network ◽

Neural Networks ◽

Real Time ◽

Data Driven ◽

Wet Granulation ◽

Manufacturing Processes ◽

High Shear ◽

Process Data ◽

Physical Constraints ◽

Granule Growth

The digitization of manufacturing processes has led to an increase in the availability of process data, which has enabled the use of data-driven models to predict the outcomes of these manufacturing processes. Data-driven models are instantaneous in simulate and can provide real-time predictions but lack any governing physics within their framework. When process data deviates from original conditions, the predictions from these models may not agree with physical boundaries. In such cases, the use of first-principle-based models to predict process outcomes have proven to be effective but computationally inefficient and cannot be solved in real time. Thus, there remains a need to develop efficient data-driven models with a physical understanding about the process. In this work, we have demonstrate the addition of physics-based boundary conditions constraints to a neural network to improve its predictability for granule density and granule size distribution (GSD) for a high shear granulation process. The physics-constrained neural network (PCNN) was better at predicting granule growth regimes when compared to other neural networks with no physical constraints. When input data that violated physics-based boundaries was provided, the PCNN identified these points more accurately compared to other non-physics constrained neural networks, with an error of <1%. A sensitivity analysis of the PCNN to the input variables was also performed to understand individual effects on the final outputs.

Download Full-text

Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset

Applied Sciences ◽

10.3390/app10062104 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2104

Author(s):

Michał Tomaszewski ◽

Paweł Michalski ◽

Jakub Osuchowski

Keyword(s):

Neural Network ◽

Neural Networks ◽

Object Detection ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Detection Efficiency ◽

Training Data ◽

Training Dataset ◽

Training Set ◽

Convolutional Network

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article presents comparisons of results from detecting the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented papier is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. The decision of which network will generate the best result for such a limited training set is not a trivial task. Conducted research suggests that the deep neural networks will achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model gained a worse AP result; however, it can be noted that the relationship between the number of input samples and the obtained results has a significantly lower influence than in the case of other CNN models, which, in the authors’ assessment, is a desired feature in the case of a limited training set.

Download Full-text

KLASIFIKASI BATIK RIAU DENGAN MENGGUNAKAN CONVOLUTIONAL NEURAL NETWORKS (CNN)

Jurnal Ilmu Komputer ◽

10.33060/jik/2020/vol9.iss1.144 ◽

2020 ◽

Vol 9 (1) ◽

pp. 7-10

Author(s):

Hendry Fonda

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

18Th Century ◽

The Public ◽

Artificial Neural ◽

The Difference ◽

Fully Connected

ABSTRACT Riau batik is known since the 18th century and is used by royal kings. Riau Batik is made by using a stamp that is mixed with coloring and then printed on fabric. The fabric used is usually silk. As its development, comparing Javanese batik with riau batik Riau is very slowly accepted by the public. Convolutional Neural Networks (CNN) is a combination of artificial neural networks and deeplearning methods. CNN consists of one or more convolutional layers, often with a subsampling layer followed by one or more fully connected layers as a standard neural network. In the process, CNN will conduct training and testing of Riau batik so that a collection of batik models that have been classified based on the characteristics that exist in Riau batik can be determined so that images are Riau batik and non-Riau batik. Classification using CNN produces Riau batik and not Riau batik with an accuracy of 65%. Accuracy of 65% is due to basically many of the same motifs between batik and other batik with the difference lies in the color of the absorption in the batik riau. Kata kunci: Batik; Batik Riau; CNN; Image; Deep Learning ABSTRAK Batik Riau dikenal sejak abad ke 18 dan digunakan oleh bangsawan raja. Batik Riau dibuat dengan menggunakan cap yang dicampur dengan pewarna kemudian dicetak di kain. Kain yang digunakan biasanya sutra. Seiring perkembangannya, dibandingkan batik Jawa maka batik Riau sangat lambat diterima oleh masyarakat. Convolutional Neural Networks (CNN) merupakan kombinasi dari jaringan syaraf tiruan dan metode deeplearning. CNN terdiri dari satu atau lebih lapisan konvolutional, seringnya dengan suatu lapisan subsampling yang diikuti oleh satu atau lebih lapisan yang terhubung penuh sebagai standar jaringan syaraf. Dalam prosesnya CNN akan melakukan training dan testing terhadap batik Riau sehingga didapat kumpulan model batik yang telah terklasi fikasi berdasarkan ciri khas yang ada pada batik Riau sehingga dapat ditentukan gambar (image) yang merupakan batik Riau dan yang bukan merupakan batik Riau. Klasifikasi menggunakan CNN menghasilkan batik riau dan bukan batik riau dengan akurasi 65%. Akurasi 65% disebabkan pada dasarnya banyak motif yang sama antara batik riau dengan batik lainnya dengan perbedaan terletak pada warna cerap pada batik riau. Kata kunci: Batik; Batik Riau; CNN; Image; Deep Learning

Download Full-text

Binary and Multiclass Text Classification by Means of Separable Convolutional Neural Network

Inventions ◽

10.3390/inventions6040070 ◽

2021 ◽

Vol 6 (4) ◽

pp. 70

Author(s):

Elena Solovyeva ◽

Ali Abdullah

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recurrent Neural Networks ◽

Low Cost ◽

Computational Cost ◽

High Accuracy ◽

Activation Functions ◽

Fully Connected ◽

Fully Connected Networks

In this paper, the structure of a separable convolutional neural network that consists of an embedding layer, separable convolutional layers, convolutional layer and global average pooling is represented for binary and multiclass text classifications. The advantage of the proposed structure is the absence of multiple fully connected layers, which is used to increase the classification accuracy but raises the computational cost. The combination of low-cost separable convolutional layers and a convolutional layer is proposed to gain high accuracy and, simultaneously, to reduce the complexity of neural classifiers. Advantages are demonstrated at binary and multiclass classifications of written texts by means of the proposed networks under the sigmoid and Softmax activation functions in convolutional layer. At binary and multiclass classifications, the accuracy obtained by separable convolutional neural networks is higher in comparison with some investigated types of recurrent neural networks and fully connected networks.

Download Full-text