Interpolation Consistency Training for Semi-supervised Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/504 ◽

2019 ◽

Cited By ~ 39

Author(s):

Vikas Verma ◽

Alex Lamb ◽

Juho Kannala ◽

Yoshua Bengio ◽

David Lopez-Paz

Keyword(s):

Neural Network ◽

Neural Networks ◽

Supervised Learning ◽

Deep Neural Networks ◽

State Of The Art ◽

Data Distribution ◽

Network Architectures ◽

Low Density ◽

Decision Boundary ◽

Classification Problems

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training deep neural networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets.
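
The consistency idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the squared-error penalty, the optional `teacher` argument, and the two-class `toy_model` are assumptions made here for the example, and the mixing coefficient `lam` is passed in by hand rather than sampled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b, element by element."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def ict_consistency_loss(model, u1, u2, lam, teacher=None):
    """Mean squared gap between the prediction at the interpolated input
    and the interpolation of the predictions at the two unlabeled inputs.

    `teacher` (assumption) supplies the targets; it defaults to `model`.
    """
    teacher = teacher or model
    target = mix(teacher(u1), teacher(u2), lam)
    pred = model(mix(u1, u2, lam))
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def toy_model(x):
    """Hypothetical two-class classifier, used only for illustration."""
    return softmax([x[0] - x[1], x[1] - x[0]])


if __name__ == "__main__":
    print(ict_consistency_loss(toy_model, [2.0, 0.0], [0.0, 0.0], lam=0.5))
```

A model that is linear in its input incurs zero penalty under this sketch, while a nonlinear classifier such as `toy_model` is penalised wherever its prediction at the mixed point departs from the mixed predictions.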


Modular Dynamic Neural Network: A Continual Learning Architecture

Applied Sciences ◽

10.3390/app112412078 ◽

2021 ◽

Vol 11 (24) ◽

pp. 12078

Author(s):

Daniel Turner ◽

Pedro J. S. Cardoso ◽

João M. F. Rodrigues

Keyword(s):

Neural Network ◽

Neural Networks ◽

Feature Extraction ◽

Deep Neural Networks ◽

State Of The Art ◽

Simple Task ◽

Dynamic Neural Network ◽

Main Components ◽

Over Time ◽

Continual Learning

Learning to recognize a new object after having learned to recognize other objects may be a simple task for a human, but not for machines. The current go-to approaches for teaching a machine to recognize a set of objects are based on deep neural networks (DNNs). Intuitively, then, the solution for teaching a machine new objects on the fly should also be a DNN. The problem is that the trained DNN weights used to classify the initial set of objects are extremely fragile: any change to those weights can severely damage the capacity to perform the initial recognitions. This phenomenon is known as catastrophic forgetting (CF). This paper presents a new DNN-based continual learning (CL) architecture that can deal with CF, the modular dynamic neural network (MDNN). The architecture consists of two main components: (a) a ResNet50-based feature extraction component as the backbone; and (b) a modular dynamic classification component, which consists of multiple sub-networks and progressively builds itself up in a tree-like structure that rearranges itself as it learns over time, in such a way that each sub-network can function independently. The main contribution of the paper is a new architecture built around this modular dynamic training feature. The modular structure allows new classes to be added while altering only specific sub-networks, so that previously known classes are not forgotten. Tests on the CORe50 dataset showed results above the state of the art for CL architectures.


A Comparison of Deep Learning Methods for Timbre Analysis in Polyphonic Automatic Music Transcription

Electronics ◽

10.3390/electronics10070810 ◽

2021 ◽

Vol 10 (7) ◽

pp. 810

Author(s):

Carlos Hernandez-Olivan ◽

Ignacio Zay Pinilla ◽

Carlos Hernandez-Lopez ◽

Jose R. Beltran

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

High Impact ◽

Critical Problem ◽

Music Transcription ◽

Automatic Music Transcription ◽

Music Information ◽

Method Show

Automatic music transcription (AMT) is a critical problem in the field of music information retrieval (MIR). When AMT is approached with deep neural networks, the variety of timbres across instruments is an issue that has not yet been studied in depth. The goal of this work is to address AMT by first analyzing how timbre affects monophonic transcription, using an approach based on the CREPE neural network, and then improving the results by performing polyphonic transcription of different timbres with a second approach based on the Deep Salience model, which performs polyphonic transcription based on the Constant-Q Transform. The results of the first method show that the timbre and envelope of the onsets have a high impact on the AMT results, and the second method shows that the developed model is less dependent on the strength of the onsets than other state-of-the-art models that deal with AMT on piano sounds, such as Google Magenta Onsets and Frames (OaF). Our polyphonic transcription model for non-piano instruments outperforms the state-of-the-art model, for example on bass instruments, with an F-score of 0.9516 versus 0.7102. In our latest experiment we also show how adding an onset detector to our model can improve on the results given in this work.


ThriftyNets: Convolutional Neural Networks with Tiny Parameter Budget

IoT ◽

10.3390/iot2020012 ◽

2021 ◽

Vol 2 (2) ◽

pp. 222-235

Author(s):

Guillaume Coiffier ◽

Ghouthi Boukli Hacene ◽

Vincent Gripon

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Convolutional Neural Network ◽

Spatial Resolution ◽

Network Architecture ◽

Deep Neural Networks ◽

State Of The Art ◽

Feature Maps ◽

Neural Network Architecture

Deep neural networks are state of the art in a large number of machine learning challenges. However, to reach the best performance they require a huge pool of parameters. Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In addition, normalization, non-linearities, downsampling and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
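
The recursive use of a single convolutional layer can be illustrated with a toy 1-D sketch. It is not the ThriftyNet architecture itself: the function names are hypothetical, the kernel is assumed to have odd length, and normalization and downsampling are left out so that only the weight sharing and the shortcut remain visible.

```python
def conv1d_same(signal, kernel):
    """1-D convolution with zero padding; output length equals input length.
    Assumes an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def recursive_forward(signal, kernel, iterations):
    """Apply the SAME kernel `iterations` times with a shortcut connection.

    The number of trainable parameters stays at len(kernel) regardless of
    how many times the layer is reused; only the computation grows.
    """
    hidden = list(signal)
    for _ in range(iterations):
        update = relu(conv1d_same(hidden, kernel))
        hidden = [h + u for h, u in zip(hidden, update)]  # shortcut
    return hidden


if __name__ == "__main__":
    shared_kernel = [0.25, 0.5, 0.25]  # three parameters in total
    print(recursive_forward([1.0, 0.0, 2.0, 0.0], shared_kernel, iterations=4))
```

Increasing `iterations` deepens the computation without adding parameters, which matches the trade-off the abstract reports: a small parameter budget at the cost of more computation.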


An optical neural network using less than 1 photon per multiplication

Nature Communications ◽

10.1038/s41467-021-27774-8 ◽

2022 ◽

Vol 13 (1) ◽

Author(s):

Tianyu Wang ◽

Shi-Yuan Ma ◽

Logan G. Wright ◽

Tatsuhiro Onodera ◽

Brian C. Richard ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Fundamental Principle ◽

Energy Costs ◽

Network Architectures ◽

Optical Neural Networks ◽

Optical Neural Network ◽

Handwritten Digit

Deep learning has become a widespread tool in both science and industry. However, continued progress is hampered by the rapid growth in energy costs of ever-larger deep neural networks. Optical neural networks provide a potential means to solve the energy-cost problem faced by deep learning. Here, we experimentally demonstrate an optical neural network based on optical dot products that achieves 99% accuracy on handwritten-digit classification using ~3.1 detected photons per weight multiplication and ~90% accuracy using ~0.66 photons (~2.5 × 10⁻¹⁹ J of optical energy) per weight multiplication. The fundamental principle enabling our sub-photon-per-multiplication demonstration (noise reduction from the accumulation of scalar multiplications in dot-product sums) is applicable to many different optical-neural-network architectures. Our work shows that optical neural networks can achieve accurate results using extremely low optical energies.
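
One way to see why accumulation reduces noise is an idealized shot-noise argument. The following is a sketch under assumptions not stated in the abstract: detection is taken to be limited by Poisson photon statistics, each of the N multiplications in a dot product contributes n detected photons on average, and only the summed signal is read out.

```latex
y = \sum_{i=1}^{N} w_i x_i, \qquad
K = N n, \qquad
\sigma_K = \sqrt{K}, \qquad
\mathrm{SNR} = \frac{K}{\sigma_K} = \sqrt{N n}
```

Under these assumptions the signal-to-noise ratio depends on the total photon count K rather than on the photons per multiplication. For a hypothetical vector length of N = 784 and n = 0.66, this gives an SNR of roughly 23, even though each individual multiplication uses less than one photon.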


Interaffection of Multiple Datasets with Neural Networks in Speech Emotion Recognition

10.5753/eniac.2020.12141 ◽

2020 ◽

Author(s):

Ronnypetson Da Silva ◽

Valter M. Filho ◽

Mario Souza

Keyword(s):

Neural Network ◽

Neural Networks ◽

Emotion Recognition ◽

Deep Neural Networks ◽

Speech Emotion Recognition ◽

Network Architectures ◽

Shared Representations ◽

Multiple Datasets ◽

Neural Network Architectures

Many works that apply deep neural networks (DNNs) to speech emotion recognition (SER) use single datasets, or train and evaluate the models separately when using multiple datasets. Those datasets are constructed with specific guidelines, and the subjective nature of the labels for SER makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations for different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from others in different combinations of datasets and popular neural network architectures. We show that the longstanding belief that more data results in more general models does not always hold for SER, as different dataset and meta-parameter combinations yield the best result for each of the analysed datasets.


Towards a high robust neural network via feature matching

International Journal of Multimedia Information Retrieval ◽

10.1007/s13735-021-00219-0 ◽

2021 ◽

Author(s):

Jian Li ◽

Yanming Guo ◽

Songyang Lao ◽

Yulun Wu ◽

Liang Bai ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

Feature Matching ◽

Feature Vector ◽

State Of The Art ◽

Model Performance ◽

Image Features ◽

Classification Systems ◽

Adversarial Attack

Image classification systems have been found vulnerable to adversarial attacks, which are imperceptible to humans but can easily fool deep neural networks. Recent research indicates that regularizing the network by introducing randomness can greatly improve the model's robustness against adversarial attacks, but the randomness module normally involves complex calculations and numerous additional parameters, and seriously affects the model's performance on clean data. In this paper, we propose a feature matching module to regularize the network. Specifically, our model learns a feature vector for each category and imposes additional restrictions on image features. The similarity between image features and category features is then used as the basis for classification. Our method does not introduce any network parameters beyond those of the undefended model and can be easily integrated into any neural network. Experiments on the CIFAR10 and SVHN datasets show that our proposed module can effectively improve accuracy on both clean and perturbed data in comparison with state-of-the-art defense methods, and that it outperforms the L2P method by 6.3% and 24% on clean and perturbed data, respectively, using the ResNet-V2(18) architecture.
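
The classification rule described here, which picks the category whose learned feature vector is most similar to the image feature, might look like the following sketch. The abstract does not name the similarity measure, so cosine similarity is an assumption, and the function names and example vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_by_feature_matching(image_feature, category_features):
    """Return (index of best-matching category, list of similarity scores).

    `category_features` holds one feature vector per category; in the
    paper these are learned, here they are supplied by the caller.
    """
    scores = [cosine_similarity(image_feature, c) for c in category_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    categories = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical category vectors
    label, scores = classify_by_feature_matching([0.9, 0.1], categories)
    print(label, scores)
```

Because the decision depends only on similarities to the category vectors, this step replaces the usual final classification layer rather than adding parameters on top of it, which fits the abstract's statement about parameter count.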


RTHN: A RNN-Transformer Hierarchical Network for Emotion Cause Extraction

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/734 ◽

2019 ◽

Cited By ~ 3

Author(s):

Rui Xia ◽

Mengran Zhang ◽

Zixiang Ding

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Relative Position ◽

Deep Neural Networks ◽

State Of The Art ◽

Hierarchical Network ◽

Classification Problems ◽

Rule Based ◽

Machine Learning Methods ◽

Word Level

The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most previous work treated ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on the Transformer to learn the correlation between multiple clauses in a document. We furthermore propose ways to encode the relative position and global predication information into the Transformer, which capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the state-of-the-art F1 score from 72.69% to 76.77%.


Tri-net for Semi-Supervised Deep Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/278 ◽

2018 ◽

Cited By ~ 11

Author(s):

Dong-Dong Chen ◽

Wei Wang ◽

Wei Gao ◽

Zhi-Hua Zhou

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Error Rate ◽

Deep Neural Network ◽

Deep Neural Networks ◽

State Of The Art ◽

Fine Tuning ◽

Learning Methods ◽

Model Initialization

Deep neural networks have achieved great success in various real applications, but they require a large amount of labeled data for training. In this paper, we propose tri-net, a deep neural network which is able to use massive unlabeled data to help learning with limited labeled data. We consider model initialization, diversity augmentation and pseudo-label editing simultaneously. In our work, we utilize output smearing to initialize modules, use fine-tuning on labeled data to augment diversity, and eliminate unstable pseudo-labels to alleviate the influence of suspicious pseudo-labeled data. Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. In particular, it achieves an 8.30% error rate on CIFAR-10 using only 4000 labeled examples.


HELLO: improved neural network architectures and methodologies for small variant calling

BMC Bioinformatics ◽

10.1186/s12859-021-04311-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Anand Ramachandran ◽

Steven S. Lumetta ◽

Eric W. Klee ◽

Deming Chen

Keyword(s):

Neural Network ◽

Neural Networks ◽

Image Recognition ◽

Deep Neural Network ◽

Deep Neural Networks ◽

Method Development ◽

Variant Calling ◽

Network Architectures ◽

Sequencing Data ◽

Neural Network Architectures

Background: Modern Next-Generation and Third-Generation Sequencing methods such as the Illumina and PacBio Circular Consensus Sequencing platforms provide accurate sequencing data. Parallel developments in deep learning have enabled the application of deep neural networks to variant calling, surpassing the accuracy of classical approaches in many settings. DeepVariant, arguably the most popular among such methods, transforms the problem of variant calling into one of image recognition, where a deep neural network analyzes sequencing data that is formatted as images, achieving high accuracy. In this paper, we explore an alternative approach to designing deep neural networks for variant calling, where we use meticulously designed deep neural network architectures and customized variant inference functions that account for the underlying nature of sequencing data instead of converting the problem to one of image recognition. Results: Results from 27 whole-genome variant calling experiments spanning Illumina, PacBio and hybrid Illumina-PacBio settings suggest that our method allows vastly smaller deep neural networks to outperform the Inception-v3 architecture used in DeepVariant for indel and substitution-type variant calls. For example, our method reduces the number of indel call errors by up to 18%, 55% and 65% for Illumina, PacBio and hybrid Illumina-PacBio variant calling respectively, compared to a similarly trained DeepVariant pipeline. In these cases, our models are between 7 and 14 times smaller. Conclusions: We believe that the improved accuracy and problem-specific customization of our models will enable more accurate pipelines and further method development in the field. HELLO is available at https://github.com/anands-repo/hello


Evaluation of deep learning approaches based on convolutional neural networks for corrosion detection

Structural Health Monitoring ◽

10.1177/1475921717737051 ◽

2017 ◽

Vol 17 (5) ◽

pp. 1110-1128 ◽

Cited By ~ 55

Author(s):

Deegan J Atha ◽

Mohammad R Jahanshahi

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Inspection Time ◽

Computational Time ◽

Network Architectures ◽

Corrosion Detection ◽

Neural Network Architectures

Corrosion is a major defect in structural systems that has a significant economic impact and can pose safety risks if left untended. Currently, an inspector visually assesses the condition of a structure to identify corrosion. This approach is time-consuming, tedious, and subjective. Robotic systems, such as unmanned aerial vehicles, paired with computer vision algorithms have the potential to perform autonomous damage detection that can significantly decrease inspection time and lead to more frequent and objective inspections. This study evaluates the use of convolutional neural networks for corrosion detection. A convolutional neural network learns the appropriate classification features that in traditional algorithms were hand-engineered. Eliminating the dependence on prior knowledge and human effort in designing features is a major advantage of convolutional neural networks. This article presents different convolutional neural network-based approaches for corrosion assessment on metallic surfaces. The effects of different color spaces, sliding window sizes, and convolutional neural network architectures are discussed. To this end, the performance of two pretrained state-of-the-art convolutional neural network architectures as well as two proposed convolutional neural network architectures is evaluated, and it is shown that convolutional neural networks outperform state-of-the-art vision-based corrosion detection approaches that are developed based on texture and color analysis using a simple multilayered perceptron network. Furthermore, it is shown that one of the proposed convolutional neural networks significantly improves the computational time compared with state-of-the-art pretrained convolutional neural networks while maintaining comparable performance for corrosion detection.
