Multi-Fidelity Physics-Constrained Neural Network and Its Application in Materials Modeling

2019 ◽  
Vol 141 (12) ◽  
Author(s):  
Dehao Liu ◽  
Yan Wang

Abstract Training machine learning tools such as neural networks require the availability of sizable data, which can be difficult for engineering and scientific applications where experiments or simulations are expensive. In this work, a novel multi-fidelity physics-constrained neural network is proposed to reduce the required amount of training data, where physical knowledge is applied to constrain neural networks, and multi-fidelity networks are constructed to improve training efficiency. A low-cost low-fidelity physics-constrained neural network is used as the baseline model, whereas a limited amount of data from a high-fidelity physics-constrained neural network is used to train a second neural network to predict the difference between the two models. The proposed framework is demonstrated with two-dimensional heat transfer, phase transition, and dendritic growth problems, which are fundamental in materials modeling. Physics is described by partial differential equations. With the same set of training data, the prediction error of physics-constrained neural network can be one order of magnitude lower than that of the classical artificial neural network without physical constraints. The accuracy of the prediction is comparable to those from direct numerical solutions of equations.

Author(s):  
Dehao Liu ◽  
Yan Wang

Abstract Training machine learning tools such as neural networks requires the availability of sizable data, which can be difficult for engineering and scientific applications where experiments or simulations are expensive. In this work, a novel multi-fidelity physics-constrained neural network is proposed to reduce the required amount of training data, where physical knowledge is applied to constrain neural networks, and multi-fidelity networks are constructed to improve training efficiency. A low-cost low-fidelity physics-constrained neural network is used as the baseline model, whereas a limited amount of data from a high-fidelity simulation is used to train a second neural network to predict the difference between the two models. The proposed framework is demonstrated with two-dimensional heat transfer and phase transition problems, which are fundamental in materials modeling. Physics is described by partial differential equations. With the same set of training data, the prediction error of physics-constrained neural network can be one order of magnitude lower than that of a classical artificial neural network without physical constraints. The accuracy of the prediction is comparable to those from direct numerical solutions of equations.


2021 ◽  
Author(s):  
Bhasker Sri Harsha Suri ◽  
Manish Srivastava ◽  
Kalidas Yeturu

Neural networks suffer from catastrophic forgetting problem when deployed in a continual learning scenario where new batches of data arrive over time; however they are of different distributions from the previous data used for training the neural network. For assessing the performance of a model in a continual learning scenario, two aspects are important (i) to compute the difference in data distribution between a new and old batch of data and (ii) to understand the retention and learning behavior of deployed neural networks. Current techniques indicate the novelty of a new data batch by comparing its statistical properties with that of the old batch in the input space. However, it is still an open area of research to consider the perspective of a deployed neural network’s ability to generalize on the unseen data samples. In this work, we report a dataset distance measuring technique that indicates the novelty of a new batch of data while considering the deployed neural network’s perspective. We propose the construction of perspective histograms which are a vector representation of the data batches based on the correctness and confidence in the prediction of the deployed model. We have successfully tested the hypothesis empirically on image data coming MNIST Digits, MNIST Fashion, CIFAR10, for its ability to detect data perturbations of type rotation, Gaussian blur, and translation. Upon new data, given a model and its training data, we have proposed and evaluated four new scoring schemes, retention score (R), learning score (L), Oscore and SP-score for studying how much the model can retain its performance on past data, how much it can learn new data, the combined expression for the magnitude of retention and learning and stability-plasticity characteristics respectively. The scoring schemes have been evaluated MNIST Digits and MNIST Fashion data sets on different types of neural network architectures based on the number of parameters, activation functions, and learning loss functions, and an instance of a typical analysis report is presented. Machine learning model maintenance is a reality in production systems in the industry, and we hope our proposed methodology offers a solution to the need of the day in this aspect.


2021 ◽  
Author(s):  
Bhasker Sri Harsha Suri ◽  
Manish Srivastava ◽  
Kalidas Yeturu

Neural networks suffer from catastrophic forgetting problem when deployed in a continual learning scenario where new batches of data arrive over time; however they are of different distributions from the previous data used for training the neural network. For assessing the performance of a model in a continual learning scenario, two aspects are important (i) to compute the difference in data distribution between a new and old batch of data and (ii) to understand the retention and learning behavior of deployed neural networks. Current techniques indicate the novelty of a new data batch by comparing its statistical properties with that of the old batch in the input space. However, it is still an open area of research to consider the perspective of a deployed neural network’s ability to generalize on the unseen data samples. In this work, we report a dataset distance measuring technique that indicates the novelty of a new batch of data while considering the deployed neural network’s perspective. We propose the construction of perspective histograms which are a vector representation of the data batches based on the correctness and confidence in the prediction of the deployed model. We have successfully tested the hypothesis empirically on image data coming MNIST Digits, MNIST Fashion, CIFAR10, for its ability to detect data perturbations of type rotation, Gaussian blur, and translation. Upon new data, given a model and its training data, we have proposed and evaluated four new scoring schemes, retention score (R), learning score (L), Oscore and SP-score for studying how much the model can retain its performance on past data, how much it can learn new data, the combined expression for the magnitude of retention and learning and stability-plasticity characteristics respectively. The scoring schemes have been evaluated MNIST Digits and MNIST Fashion data sets on different types of neural network architectures based on the number of parameters, activation functions, and learning loss functions, and an instance of a typical analysis report is presented. Machine learning model maintenance is a reality in production systems in the industry, and we hope our proposed methodology offers a solution to the need of the day in this aspect.


1992 ◽  
Vol 26 (9-11) ◽  
pp. 2461-2464 ◽  
Author(s):  
R. D. Tyagi ◽  
Y. G. Du

A steady-statemathematical model of an activated sludgeprocess with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.


2021 ◽  
Vol 11 (6) ◽  
pp. 2535
Author(s):  
Bruno E. Silva ◽  
Ramiro S. Barbosa

In this article, we designed and implemented neural controllers to control a nonlinear and unstable magnetic levitation system composed of an electromagnet and a magnetic disk. The objective was to evaluate the implementation and performance of neural control algorithms in a low-cost hardware. In a first phase, we designed two classical controllers with the objective to provide the training data for the neural controllers. After, we identified several neural models of the levitation system using Nonlinear AutoRegressive eXogenous (NARX)-type neural networks that were used to emulate the forward dynamics of the system. Finally, we designed and implemented three neural control structures: the inverse controller, the internal model controller, and the model reference controller for the control of the levitation system. The neural controllers were tested on a low-cost Arduino control platform through MATLAB/Simulink. The experimental results proved the good performance of the neural controllers.


Author(s):  
Xinyi Li ◽  
Liqiong Chang ◽  
Fangfang Song ◽  
Ju Wang ◽  
Xiaojiang Chen ◽  
...  

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?". This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge (such as who is currently performing a gesture) of the target user. Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data to build an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate many synthetic training data from a small set of real-world examples collected from a small number of users. Such a strategy allows CrossGR to minimize the user involvement and the associated cost in collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to perform gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). We demonstrate that CrossGR delivers comparable recognition accuracy, but uses an order of magnitude less training samples collected from the end-users when compared to state-of-the-art recognition systems.


Processes ◽  
2021 ◽  
Vol 9 (5) ◽  
pp. 737
Author(s):  
Chaitanya Sampat ◽  
Rohit Ramachandran

The digitization of manufacturing processes has led to an increase in the availability of process data, which has enabled the use of data-driven models to predict the outcomes of these manufacturing processes. Data-driven models are instantaneous in simulate and can provide real-time predictions but lack any governing physics within their framework. When process data deviates from original conditions, the predictions from these models may not agree with physical boundaries. In such cases, the use of first-principle-based models to predict process outcomes have proven to be effective but computationally inefficient and cannot be solved in real time. Thus, there remains a need to develop efficient data-driven models with a physical understanding about the process. In this work, we have demonstrate the addition of physics-based boundary conditions constraints to a neural network to improve its predictability for granule density and granule size distribution (GSD) for a high shear granulation process. The physics-constrained neural network (PCNN) was better at predicting granule growth regimes when compared to other neural networks with no physical constraints. When input data that violated physics-based boundaries was provided, the PCNN identified these points more accurately compared to other non-physics constrained neural networks, with an error of <1%. A sensitivity analysis of the PCNN to the input variables was also performed to understand individual effects on the final outputs.


2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article presents comparisons of results from detecting the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented papier is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. The decision of which network will generate the best result for such a limited training set is not a trivial task. Conducted research suggests that the deep neural networks will achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model gained a worse AP result; however, it can be noted that the relationship between the number of input samples and the obtained results has a significantly lower influence than in the case of other CNN models, which, in the authors’ assessment, is a desired feature in the case of a limited training set.


2020 ◽  
Vol 9 (1) ◽  
pp. 7-10
Author(s):  
Hendry Fonda

ABSTRACT Riau batik is known since the 18th century and is used by royal kings. Riau Batik is made by using a stamp that is mixed with coloring and then printed on fabric. The fabric used is usually silk. As its development, comparing Javanese  batik with riau batik Riau is very slowly accepted by the public. Convolutional Neural Networks (CNN) is a combination of artificial neural networks and deeplearning methods. CNN consists of one or more convolutional layers, often with a subsampling layer followed by one or more fully connected layers as a standard neural network. In the process, CNN will conduct training and testing of Riau batik so that a collection of batik models that have been classified based on the characteristics that exist in Riau batik can be determined so that images are Riau batik and non-Riau batik. Classification using CNN produces Riau batik and not Riau batik with an accuracy of 65%. Accuracy of 65% is due to basically many of the same motifs between batik and other batik with the difference lies in the color of the absorption in the batik riau. Kata kunci: Batik; Batik Riau; CNN; Image; Deep Learning   ABSTRAK   Batik Riau dikenal sejak abad ke 18 dan digunakan oleh bangsawan raja. Batik Riau dibuat dengan menggunakan cap yang dicampur dengan pewarna kemudian dicetak di kain. Kain yang digunakan biasanya sutra. Seiring perkembangannya, dibandingkan batik Jawa maka batik Riau sangat lambat diterima oleh masyarakat. Convolutional Neural Networks (CNN) merupakan kombinasi dari jaringan syaraf tiruan dan metode deeplearning. CNN terdiri dari satu atau lebih lapisan konvolutional, seringnya dengan suatu lapisan subsampling yang diikuti oleh satu atau lebih lapisan yang terhubung penuh sebagai standar jaringan syaraf. Dalam prosesnya CNN akan melakukan training dan testing terhadap batik Riau sehingga didapat kumpulan model batik yang telah terklasi    fikasi berdasarkan ciri khas yang ada pada batik Riau sehingga dapat ditentukan gambar (image) yang merupakan batik Riau dan yang bukan merupakan batik Riau. Klasifikasi menggunakan CNN menghasilkan batik riau dan bukan batik riau dengan akurasi 65%. Akurasi 65% disebabkan pada dasarnya banyak motif yang sama antara batik riau dengan batik lainnya dengan perbedaan terletak pada warna cerap pada batik riau. Kata kunci: Batik; Batik Riau; CNN; Image; Deep Learning


Inventions ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 70
Author(s):  
Elena Solovyeva ◽  
Ali Abdullah

In this paper, the structure of a separable convolutional neural network that consists of an embedding layer, separable convolutional layers, convolutional layer and global average pooling is represented for binary and multiclass text classifications. The advantage of the proposed structure is the absence of multiple fully connected layers, which is used to increase the classification accuracy but raises the computational cost. The combination of low-cost separable convolutional layers and a convolutional layer is proposed to gain high accuracy and, simultaneously, to reduce the complexity of neural classifiers. Advantages are demonstrated at binary and multiclass classifications of written texts by means of the proposed networks under the sigmoid and Softmax activation functions in convolutional layer. At binary and multiclass classifications, the accuracy obtained by separable convolutional neural networks is higher in comparison with some investigated types of recurrent neural networks and fully connected networks.


Sign in / Sign up

Export Citation Format

Share Document