Semisupervised Learning for Seismic Monitoring Applications

Abstract The impressive performance that deep neural networks demonstrate on a range of seismic monitoring tasks depends largely on the availability of event catalogs that have been manually curated over many years or decades. However, the quality, duration, and availability of seismic event catalogs vary significantly across the range of monitoring operations, regions, and objectives. Semisupervised learning (SSL) enables learning from both labeled and unlabeled data and provides a framework to leverage the abundance of unreviewed seismic data for training deep neural networks on a variety of target tasks. We apply two SSL algorithms (mean-teacher and virtual adversarial training) as well as a novel hybrid technique (exponential average adversarial training) to seismic event classification to examine how unlabeled data with SSL can enhance model performance. In general, we find that SSL can perform as well as supervised learning with fewer labels. We also observe in some scenarios that almost half of the benefits of SSL are the result of the meaningful regularization enforced through SSL techniques and may not be attributable to unlabeled data directly. Lastly, the benefits from unlabeled data scale with the difficulty of the predictive task when we evaluate the use of unlabeled data to characterize sources in new geographic regions. In geographic areas where supervised model performance is low, SSL significantly increases the accuracy of source-type classification using unlabeled data.
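As a rough illustration of the mean-teacher idea mentioned above: such methods typically keep a "teacher" model whose weights are an exponential moving average of the "student" weights. The sketch below shows only that averaging step on plain lists of floats. The function name `ema_update` and the `decay` value are assumptions for illustration and are not taken from the paper.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student weights.

    Each new teacher weight is decay * old_teacher + (1 - decay) * student.
    Both arguments are flat lists of floats (a hypothetical weight layout).
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy weights: the teacher drifts one tenth of the way toward the student.
teacher = [0.0, 1.0]
student = [1.0, 0.0]
teacher = ema_update(teacher, student, decay=0.9)
```

A larger `decay` makes the teacher change more slowly, which is the usual reason its predictions are treated as a steadier target for the student.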


Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

Abstract The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
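To make the residual-learning idea concrete, here is a minimal sketch of a single fully connected layer with a skip connection, where the output is the input plus a learned correction. It is a generic residual layer on plain Python lists, not the IRNet architecture itself. The helper names (`dense`, `relu`, `residual_layer`) are hypothetical.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(values, weights, bias):
    """Fully connected layer: one output per row of the weight matrix."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_layer(values, weights, bias):
    """Return values + relu(dense(values)); needs a square weight matrix."""
    correction = relu(dense(values, weights, bias))
    return [v + c for v, c in zip(values, correction)]


# With all-zero parameters the layer passes its input through unchanged.
x = [1.0, -2.0]
y = residual_layer(x, [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

Because the input is added back unchanged, the signal (and its gradient) has a direct path through the layer, which is the usual argument for why skip connections ease the vanishing gradient problem in deep stacks.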


Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks

Symmetry ◽

10.3390/sym13030428 ◽

2021 ◽

Vol 13 (3) ◽

pp. 428

Author(s):

Hyun Kwon ◽

Jun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Diversity Training ◽

Original Data ◽

Training Method ◽

Learning Framework ◽

Adversarial Examples ◽

Adversarial Training ◽

Adversarial Attack ◽

Accuracy Rates

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance regarding image and voice recognition, as well as pattern analysis and intrusion detection, they exhibit inferior performance towards adversarial examples. Introducing a small amount of noise to the original data can produce adversarial examples that are misclassified by deep neural networks, even though humans still perceive them as normal. In this paper, a robust diversity adversarial training method against adversarial attacks was demonstrated. In this approach, the target model is more robust to unknown adversarial examples, as it is trained on various adversarial samples. During the experiment, TensorFlow was employed as our deep learning framework, while MNIST and Fashion-MNIST were used as experimental datasets. Results revealed that the diversity training method lowered the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original data of MNIST and Fashion-MNIST, respectively.
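The way a small perturbation can flip a prediction can be sketched with a toy linear scorer. The example below uses a fast-gradient-sign style step, which is a swapped-in, well-known technique and not necessarily one of the attacks used in the paper. The function name `fgsm_linear`, the loss `-y * (w . x)`, and all numbers are assumptions for illustration.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def score(x, w):
    """Linear score w . x; a positive score means class +1."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fgsm_linear(x, w, y, eps):
    """Shift each input by eps in the direction that raises the loss -y * (w . x).

    y is the true label (+1 or -1); the gradient of the loss w.r.t. x is -y * w.
    """
    grad = [-y * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


x = [1.0, 1.0]
w = [2.0, -1.0]
x_adv = fgsm_linear(x, w, y=1, eps=0.5)
```

Here the clean input scores 1.0 (class +1), while the perturbed input scores -0.5, so the toy classifier changes its decision. Adversarial training would add such perturbed inputs, with their correct labels, to the training set.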


Solving inverse problems in stochastic models using deep neural networks and adversarial training

Computer Methods in Applied Mechanics and Engineering ◽

10.1016/j.cma.2021.113976 ◽

2021 ◽

Vol 384 ◽

pp. 113976

Author(s):

Kailai Xu ◽

Eric Darve

Keyword(s):

Neural Networks ◽

Inverse Problems ◽

Stochastic Models ◽

Deep Neural Networks ◽

Adversarial Training


Deep Neural Networks Constrained by Decision Rules

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33012496 ◽

2019 ◽

Vol 33 ◽

pp. 2496-2505

Author(s):

Yuzuru Okajima ◽

Kunihiko Sadamasa

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Predictive Accuracy ◽

Decision Rules ◽

Hybrid Technique ◽

Complex Data ◽

Rule Based ◽

Prior Probabilities ◽

The Neural Network ◽

Latent Representations

Deep neural networks achieve high predictive accuracy by learning latent representations of complex data. However, the reasoning behind their decisions is difficult for humans to understand. On the other hand, rule-based approaches are able to justify the decisions by showing the decision rules leading to them, but they have relatively low accuracy. To improve the interpretability of neural networks, several techniques provide post-hoc explanations of decisions made by neural networks, but they cannot guarantee that the decisions are always explained in a simple form like decision rules because their explanations are generated after the decisions are made by neural networks. In this paper, to balance the accuracy of neural networks and the interpretability of decision rules, we propose a hybrid technique called rule-constrained networks, namely, neural networks that make decisions by selecting decision rules from a given ruleset. Because the networks are forced to make decisions based on decision rules, it is guaranteed that every decision is supported by a decision rule. Furthermore, we propose a technique to jointly optimize the neural network and the ruleset from which the network selects rules. The log likelihood of correct classifications is maximized under a model with hyperparameters for the ruleset size and the prior probabilities of rules being selected. This feature makes it possible to limit the ruleset size or prioritize human-made rules over automatically acquired rules for promoting the interpretability of the output. Experiments on datasets of time-series and sentiment classification showed rule-constrained networks achieved accuracy as high as that achieved by original neural networks and significantly higher than that achieved by existing rule-based models, while presenting decision rules supporting the decisions.


BEAN: Interpretable and Efficient Learning With Biologically-Enhanced Artificial Neuronal Assembly Regularization

Frontiers in Neurorobotics ◽

10.3389/fnbot.2021.567482 ◽

2021 ◽

Vol 15 ◽

Author(s):

Yuyang Gao ◽

Giorgio A. Ascoli ◽

Liang Zhao

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Model Performance ◽

Semantic Content ◽

Population Coding ◽

Crucial Issue ◽

Synaptic Interactions ◽

Training Samples ◽

Efficient Learning ◽

Neuronal Assembly

Deep neural networks (DNNs) are known for extracting useful information from large amounts of data. However, the representations learned in DNNs are typically hard to interpret, especially in dense layers. One crucial issue of the classical DNN model such as multilayer perceptron (MLP) is that neurons in the same layer of DNNs are conditionally independent of each other, which makes co-training and emergence of higher modularity difficult. In contrast to DNNs, biological neurons in mammalian brains display substantial dependency patterns. Specifically, biological neural networks encode representations by so-called neuronal assemblies: groups of neurons interconnected by strong synaptic interactions and sharing joint semantic content. The resulting population coding is essential for human cognitive and mnemonic processes. Here, we propose a novel Biologically Enhanced Artificial Neuronal assembly (BEAN) regularization to model neuronal correlations and dependencies, inspired by cell assembly theory from neuroscience. Experimental results show that BEAN enables the formation of interpretable neuronal functional clusters and consequently promotes a sparse, memory/computation-efficient network without loss of model performance. Moreover, our few-shot learning experiments demonstrate that BEAN could also enhance the generalizability of the model when training samples are extremely limited.


Improving the Robustness of Deep Neural Networks via Adversarial Training with Triplet Loss

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/403 ◽

2019 ◽

Author(s):

Pengcheng Li ◽

Jinfeng Yi ◽

Bowen Zhou ◽

Lijun Zhang

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Current Model ◽

Empirical Studies ◽

Metric Learning ◽

Distance Metric Learning ◽

Distance Metric ◽

Classification Boundary ◽

Adversarial Training ◽

Triplet Loss

Recent studies have highlighted that deep neural networks (DNNs) are vulnerable to adversarial examples. In this paper, we improve the robustness of DNNs by utilizing techniques of Distance Metric Learning. Specifically, we incorporate Triplet Loss, one of the most popular Distance Metric Learning methods, into the framework of adversarial training. Our proposed algorithm, Adversarial Training with Triplet Loss (AT2L), substitutes the adversarial example against the current model for the anchor of triplet loss to effectively smooth the classification boundary. Furthermore, we propose an ensemble version of AT2L, which aggregates different attack methods and model structures for better defense effects. Our empirical studies verify that the proposed approach can significantly improve the robustness of DNNs without sacrificing accuracy. Finally, we demonstrate that our specially designed triplet loss can also be used as a regularization term to enhance other defense methods.
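For readers unfamiliar with triplet loss, the standard form can be sketched as follows: the loss is zero once the anchor is closer to the positive than to the negative by at least a margin. This is the generic formulation with squared Euclidean distance, not the specially designed AT2L loss. In AT2L, per the abstract, the anchor would be an adversarial example. The function names and the default margin are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Negative is far away, so the margin is already satisfied and the loss is zero.
easy = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])

# Positive and negative are equally far, so the loss equals the margin.
hard = triplet_loss([0.0, 0.0], [1.0, 0.0], [1.0, 0.0])
```

Minimizing this quantity pulls same-class embeddings together and pushes other-class embeddings apart, which is the sense in which it can smooth a classification boundary.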


Privacy Issues Regarding the Application of DNNs to Activity-Recognition using Wearables and Its Countermeasures by Use of Adversarial Training

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/268 ◽

2017 ◽

Cited By ~ 5

Author(s):

Yusuke Iwasawa ◽

Kotaro Nakayama ◽

Ikuko Yairi ◽

Yutaka Matsuo

Keyword(s):

Neural Networks ◽

Activity Recognition ◽

Information Disclosure ◽

Deep Neural Networks ◽

Recognition Performance ◽

Empirical Validation ◽

Privacy Concerns ◽

Proposed Model ◽

Significant Performance ◽

Adversarial Training

Deep neural networks have been successfully applied to activity recognition with wearables in terms of recognition performance. However, the black-box nature of neural networks could lead to privacy concerns. Namely, it is generally hard to anticipate what neural networks learn from data, and so they may unintentionally learn features that highly discriminate user information, which increases the risk of information disclosure. In this study, we analyzed the features learned by conventional deep neural networks when applied to data of wearables to confirm this phenomenon. Based on the results of our analysis, we propose the use of an adversarial training framework to suppress the risk of sensitive/unintended information disclosure. Our proposed model considers both an adversarial user classifier and a regular activity-classifier during training, which allows the model to learn representations that help the classifier to distinguish the activities but which, at the same time, prevents it from accessing user-discriminative information. This paper provides an empirical validation of the privacy issue and efficacy of the proposed method using three activity recognition tasks based on data of wearables. The empirical validation shows that our proposed method suppresses the concerns without any significant performance degradation, compared to conventional deep nets on all three tasks.


SEMI-SUPERVISED SEMANTIC SEGMENTATION NETWORK VIA LEARNING CONSISTENCY FOR REMOTE SENSING LAND-COVER CLASSIFICATION

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-2-2020-609-2020 ◽

2020 ◽

Vol V-2-2020 ◽

pp. 609-615

Author(s):

B. Zhang ◽

Y. Zhang ◽

Y. Li ◽

Y. Wan ◽

F. Wen

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Land Cover ◽

Deep Neural Networks ◽

Semantic Segmentation ◽

Land Cover Classification ◽

Unlabeled Data ◽

Weighted Sum ◽

Affine Transform

Abstract. Current popular deep neural networks for semantic segmentation are almost all supervised and rely heavily on a large amount of labeled data. However, obtaining a large amount of pixel-level labeled data is time-consuming and laborious. In the remote sensing area, this problem is more urgent. To alleviate this problem, we propose a novel semantic segmentation neural network (S4Net) based on semi-supervised learning by using unlabeled data. Our model can learn from unlabeled data by consistency regularization, which enforces the consistency of output under different random transforms and perturbations, such as random affine transform. Thus, the network is trained by the weighted sum of a supervised loss from labeled data and a consistency regularization loss from unlabeled data. The experiments we conducted on the DeepGlobe land cover classification challenge dataset verified that our network can make use of unlabeled data to obtain precise results of semantic segmentation and achieve competitive performance when compared to other methods.
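The weighted-sum objective described above can be sketched for a single labeled example and a single unlabeled example. The choice of cross-entropy for the supervised term, mean squared error for the consistency term, and the name and default of the `weight` parameter are assumptions for illustration. The abstract does not specify them.

```python
import math


def cross_entropy(probs, label):
    """Supervised loss: negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def mean_squared_error(p, q):
    """Consistency loss between predictions for two views of the same input."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)


def semi_supervised_loss(labeled_probs, label, view_a, view_b, weight=1.0):
    """Supervised loss plus weight times the consistency loss."""
    supervised = cross_entropy(labeled_probs, label)
    consistency = mean_squared_error(view_a, view_b)
    return supervised + weight * consistency


# view_a and view_b stand for predictions on one unlabeled input
# under two different random transforms.
total = semi_supervised_loss([0.5, 0.5], 0, [1.0, 0.0], [0.0, 1.0], weight=2.0)
```

In this toy call the supervised term is log 2 and the consistency term is 1.0, so the total is log 2 + 2. Note that the consistency term needs no label at all, which is how the unlabeled data enters training.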


Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/458 ◽

2021 ◽

Author(s):

Xiao Zang ◽

Yi Xie ◽

Jie Chen ◽

Bo Yuan

Keyword(s):

Neural Networks ◽

Success Rate ◽

Deep Neural Networks ◽

Model Performance ◽

Graph Model ◽

Structured Data ◽

Graph Structure ◽

Learning Models ◽

Security Threat ◽

Anchor Nodes

Deep neural networks, while generalizing well, are known to be sensitive to small adversarial perturbations. This phenomenon poses a severe security threat and calls for in-depth investigation of the robustness of deep learning models. With the emergence of neural networks for graph structured data, similar investigations are urged to understand their robustness. It has been found that adversarially perturbing the graph structure and/or node features may result in a significant degradation of the model performance. In this work, we show from a different angle that such fragility similarly occurs if the graph contains a few bad-actor nodes, which compromise a trained graph neural network through flipping the connections to any targeted victim. Worse, the bad actors found for one graph model severely compromise other models as well. We call the bad actors "anchor nodes" and propose an algorithm, named GUA, to identify them. Thorough empirical investigations suggest an interesting finding that the anchor nodes often belong to the same class; and they also corroborate the intuitive trade-off between the number of anchor nodes and the attack success rate. For the dataset Cora, which contains 2708 nodes, as few as six anchor nodes will result in an attack success rate higher than 80% for GCN and three other models.


A Study of Deep Neural Networks Transfer Learning For Fault Diagnosis Applications

Annual Conference of the PHM Society ◽

10.36001/phmconf.2021.v13i1.2996 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Michael Franco-Garcia ◽

Alex Benasutti ◽

Larry Pearlstein ◽

Mohammed Alabsi

Keyword(s):

Neural Networks ◽

Fault Diagnosis ◽

Transfer Learning ◽

Deep Neural Networks ◽

Model Performance ◽

Poor Performance ◽

Operating Conditions ◽

Learning Performance ◽

Vibration Data ◽

Model Training

Intelligent fault diagnosis utilizing deep learning algorithms has been widely investigated recently. Although previous results demonstrated excellent performance, features learned by Deep Neural Networks (DNN) are part of a large black box. Consequently, lack of understanding of underlying physical meanings embedded within the features can lead to poor performance when applied to different but related datasets, i.e., transfer learning applications. This study will investigate the transfer learning performance of a Convolutional Neural Network (CNN) considering 4 different operating conditions. Utilizing the Case Western Reserve University (CWRU) bearing dataset, the CNN will be trained to classify 12 classes. Each class represents a unique fault scenario with varying severity, e.g., an inner race fault of 0.007” or 0.014” diameter. Initially, zero load data will be utilized for model training and the model will be tuned until testing accuracy above 99% is obtained. The model performance will be evaluated by feeding vibration data collected when the load is varied to 1, 2 and 3 HP. Initial results indicated that the classification accuracy will degrade substantially. Hence, this paper will visualize convolution kernels in time and frequency domains and will investigate the influence of changing loads on fault characteristics, network classification mechanism and activation strength.
