Hybrid evolutionary network architecture search (HyENAS) for convolution class of deep neural networks with applications

2021 ◽  
Author(s):  
Soniya ◽  
Lotika Singh ◽  
Sandeep Paul


2016 ◽
Vol 807 ◽  
pp. 155-166 ◽  
Author(s):  
Julia Ling ◽  
Andrew Kurzawski ◽  
Jeremy Templeton

There exists significant demand for improved Reynolds-averaged Navier–Stokes (RANS) turbulence models that are informed by and can represent a richer set of turbulence physics. This paper presents a method of using deep neural networks to learn a model for the Reynolds stress anisotropy tensor from high-fidelity simulation data. A novel neural network architecture is proposed which uses a multiplicative layer with an invariant tensor basis to embed Galilean invariance into the predicted anisotropy tensor. It is demonstrated that this neural network architecture provides improved prediction accuracy compared with a generic neural network architecture that does not embed this invariance property. The Reynolds stress anisotropy predictions of this invariant neural network are propagated through to the velocity field for two test cases. For both test cases, significant improvement versus baseline RANS linear eddy viscosity and nonlinear eddy viscosity models is demonstrated.
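The key idea is that the network predicts scalar coefficients for a Galilean-invariant tensor basis and combines them in a multiplicative layer. Below is a minimal sketch of such a tensor-basis layer in PyTorch; the number of invariants (5) and basis tensors (10) follows the usual general effective-viscosity expansion, while the hidden-layer sizes and the use of flattened 3x3 tensors are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a tensor-basis network with a multiplicative output layer,
# assuming 5 scalar invariants and 10 basis tensors (flattened to 9 components).
# Layer sizes are illustrative, not the published architecture.
import torch
import torch.nn as nn

class TensorBasisNN(nn.Module):
    def __init__(self, n_invariants=5, n_basis=10, hidden=64):
        super().__init__()
        # MLP maps Galilean-invariant scalars to one coefficient per basis tensor.
        self.coeff_net = nn.Sequential(
            nn.Linear(n_invariants, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_basis),
        )

    def forward(self, invariants, basis_tensors):
        # invariants: (batch, 5); basis_tensors: (batch, 10, 9) flattened 3x3 tensors.
        g = self.coeff_net(invariants)                    # (batch, 10)
        # Multiplicative layer: anisotropy b = sum_n g_n * T^(n)
        b = torch.einsum('bn,bnc->bc', g, basis_tensors)  # (batch, 9)
        return b

# Example usage with random data
model = TensorBasisNN()
inv = torch.randn(4, 5)
T = torch.randn(4, 10, 9)
print(model(inv, T).shape)  # torch.Size([4, 9])
```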


Author(s):  
Roy Assaf ◽  
Anika Schumann

We demonstrate that CNN deep neural networks can not only be used for making predictions based on multivariate time series data, but also for explaining these predictions. This is important for a number of applications where predictions are the basis for decisions and actions. Hence, confidence in the prediction result is crucial. We design a two-stage convolutional neural network architecture which uses particular kernel sizes. This allows us to utilise gradient-based techniques for generating saliency maps for both the time dimension and the features. These are then used for explaining which features during which time interval are responsible for a given prediction, as well as during which time intervals the joint contribution of all features was most important for that prediction. We demonstrate our approach for predicting the average energy production of photovoltaic power plants and for explaining these predictions.
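As a rough illustration of the gradient-based saliency idea (not the authors' two-stage architecture), the sketch below backpropagates the prediction of a small 1D-convolutional regressor to its multivariate time-series input; the absolute input gradients can then be aggregated over the feature axis or the time axis to see which time intervals and which features drive the prediction. The network sizes, feature count, and sequence length are assumptions made for the example.

```python
# A minimal sketch of gradient-based saliency over time and feature dimensions
# for a 1D-convolutional regressor on multivariate time series.
import torch
import torch.nn as nn

class SimpleTSNet(nn.Module):
    def __init__(self, n_features=8, n_channels=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, n_channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(n_channels, n_channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(n_channels, 1)

    def forward(self, x):                 # x: (batch, features, time)
        return self.head(self.conv(x).squeeze(-1))

model = SimpleTSNet()
x = torch.randn(1, 8, 96, requires_grad=True)   # e.g. one day of 15-minute readings
y = model(x)
y.sum().backward()
saliency = x.grad.abs().squeeze(0)        # (features, time) attribution map
time_saliency = saliency.sum(dim=0)       # which time intervals mattered most
feature_saliency = saliency.sum(dim=1)    # which features mattered most
```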


Author(s):  
Yunpeng Chen ◽  
Xiaojie Jin ◽  
Bingyi Kang ◽  
Jiashi Feng ◽  
Shuicheng Yan

The residual unit and its variations are widely used in building very deep neural networks for alleviating optimization difficulty. In this work, we revisit the standard residual function as well as its several successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances the parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency -- it achieves comparable classification performance with ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.
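A simplified way to picture the parameter sharing is a stack of bottleneck residual units that all reuse one "core" 3x3 convolution while keeping unit-specific 1x1 projections. The PyTorch sketch below illustrates only this sharing idea; the actual CRU is derived from Block Term Decomposition and differs in its factor structure and dimensions.

```python
# A simplified sketch of sharing a core convolution across residual units,
# in the spirit of collective factorization; not the actual BTD-based CRU.
import torch
import torch.nn as nn

class SharedCoreResidualStack(nn.Module):
    def __init__(self, channels=64, bottleneck=32, n_units=4):
        super().__init__()
        # One 3x3 "core" convolution shared by every residual unit in the stack.
        self.shared_core = nn.Conv2d(bottleneck, bottleneck, 3, padding=1, bias=False)
        # Unit-specific 1x1 projections in and out of the shared core.
        self.proj_in = nn.ModuleList(
            [nn.Conv2d(channels, bottleneck, 1, bias=False) for _ in range(n_units)])
        self.proj_out = nn.ModuleList(
            [nn.Conv2d(bottleneck, channels, 1, bias=False) for _ in range(n_units)])
        self.act = nn.ReLU()

    def forward(self, x):
        for p_in, p_out in zip(self.proj_in, self.proj_out):
            residual = p_out(self.act(self.shared_core(self.act(p_in(x)))))
            x = self.act(x + residual)
        return x

stack = SharedCoreResidualStack()
print(stack(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```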


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a 'smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.
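The sketch below is a toy illustration of the general principle of replacing a dense layer by a much smaller randomized sketch; it uses a generic randomized range-finder on a synthetic low-rank weight matrix rather than the paper's specific tensor-sketching construction, and all dimensions are made up for the example.

```python
# Toy illustration: compress a (synthetic, low-rank) fully connected layer with a
# randomized range-finder sketch. Generic randomized low-rank approximation,
# not the paper's tensor-sketching scheme.
import numpy as np

rng = np.random.default_rng(0)
m, n, r, k = 512, 1024, 32, 64        # output dim, input dim, true rank, sketch size

# Synthetic weight matrix with low effective rank.
W = (rng.standard_normal((m, r)) @ rng.standard_normal((r, n))) / np.sqrt(n)

Omega = rng.standard_normal((n, k))    # random test matrix
Q, _ = np.linalg.qr(W @ Omega)         # orthonormal basis for the sketched range, m x k
W_small = Q.T @ W                      # k x n factor; store (Q, W_small) instead of W

x = rng.standard_normal(n)
y_exact = W @ x                        # uses m*n parameters
y_approx = Q @ (W_small @ x)           # uses (m+n)*k parameters when k << min(m, n)

rel_err = np.linalg.norm(y_exact - y_approx) / np.linalg.norm(y_exact)
print(f"relative error: {rel_err:.2e}, parameter ratio: {(m+n)*k/(m*n):.2%}")
```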


Molecules ◽  
2020 ◽  
Vol 25 (19) ◽  
pp. 4480
Author(s):  
Ilya I. Kurochkin ◽  
Ilya N. Kurochkin ◽  
Olga Yu. Kolosova ◽  
Vladimir I. Lozinsky

Macroporous poly(vinyl alcohol) cryogels (PVACGs) are physical gels formed via cryogenic processing of polymer solutions. The properties of PVACGs depend on many factors: the characteristics and concentration of the PVA, the absence or presence of foreign solutes, and the freezing-thawing conditions. These factors also affect the macroporous morphology of PVACGs, their total porosity, pore size and size distribution, etc. In this respect, there is a need for a scientifically grounded classification of the morphological features inherent in various PVACGs. In this study, PVA cryogels were prepared at different temperatures from initial polymer solutions containing chaotropic or kosmotropic additives. After the completion of gelation, the rigidity and heat endurance of the resultant PVACGs were evaluated, and their macroporous structure was investigated using optical microscopy. The images obtained were processed mathematically, and deep neural networks were used to classify them on the basis of separate training and test sets. The results of this classification for the specific deep neural network architecture are presented, and the morphometric parameters of the macroporous structure are discussed. It was found that deep neural networks allow us to reliably classify the type of additive, or its absence, when using a combined dataset.


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep neural networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40k parameters in total, 74.3% on CIFAR-100 with less than 600k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15M parameters. However, the proposed method typically requires more computations than existing counterparts.
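A minimal sketch of the recursion is shown below: a single shared convolution is applied repeatedly, wrapped with per-iteration normalization, a ReLU non-linearity, a shortcut connection, and periodic pooling for downsampling. The channel count, number of iterations, and pooling schedule are illustrative assumptions, not the published ThriftyNet configuration.

```python
# Minimal sketch of the recursive-convolution idea: one convolution reused at
# every iteration, plus normalization, non-linearity, shortcuts and downsampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRecursiveNet(nn.Module):
    def __init__(self, channels=128, n_iters=12, n_classes=10):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, 3, padding=1, bias=False)
        # The only "real" convolution: reused at every iteration.
        self.shared_conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.norms = nn.ModuleList([nn.BatchNorm2d(channels) for _ in range(n_iters)])
        self.n_iters = n_iters
        self.head = nn.Linear(channels, n_classes)

    def forward(self, x):
        h = self.embed(x)
        for t in range(self.n_iters):
            h = h + F.relu(self.norms[t](self.shared_conv(h)))   # shortcut
            if t % 4 == 3:                                        # periodic downsampling
                h = F.max_pool2d(h, 2)
        return self.head(h.mean(dim=(2, 3)))

model = TinyRecursiveNet()
print(model(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 10])
```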


2020 ◽  
Vol 10 (21) ◽  
pp. 7433
Author(s):  
Michal Varga ◽  
Ján Jadlovský ◽  
Slávka Jadlovská

In this paper, we propose a methodology for the generative enhancement of existing 3D image classifiers. This methodology is based on combining the advantages of both non-generative classifiers and generative modeling. Its purpose is to streamline the synthesis of novel deep neural networks by embedding existing compatible classifiers into a generative network architecture. A demonstration of this process and an evaluation of its effectiveness are performed using a 3D convolutional classifier and its generative equivalent, a 3D conditional generative adversarial network classifier. The results of the experiments show that the generative classifier delivers higher performance, gaining a relative classification accuracy improvement of 7.43%. An increase in accuracy is also observed when comparing it to a plain convolutional classifier that was trained on a dataset augmented with samples created by the trained generator. This suggests that a desirable knowledge-sharing mechanism exists within the hybrid discriminator-classifier network.
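One way to realize such an embedding is an AC-GAN-style discriminator that reuses an existing classifier trunk and attaches both an adversarial (real/fake) head and a class-prediction head. The sketch below shows this wiring with a small 2D backbone for brevity; the paper works with 3D convolutional classifiers, and the backbone, feature dimension, and class count here are assumptions.

```python
# Rough sketch of embedding an existing classifier backbone into a GAN
# discriminator with an auxiliary classification head (AC-GAN-style).
import torch
import torch.nn as nn

class DiscriminatorClassifier(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, n_classes: int):
        super().__init__()
        self.backbone = backbone                        # existing compatible classifier trunk
        self.adv_head = nn.Linear(feat_dim, 1)          # real / fake score
        self.cls_head = nn.Linear(feat_dim, n_classes)  # class prediction

    def forward(self, x):
        f = self.backbone(x)
        return self.adv_head(f), self.cls_head(f)

# Hypothetical small 2D backbone standing in for the original classifier trunk.
backbone = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 8 * 8, 128), nn.ReLU())
model = DiscriminatorClassifier(backbone, feat_dim=128, n_classes=10)
adv, cls = model(torch.randn(4, 1, 32, 32))
print(adv.shape, cls.shape)   # torch.Size([4, 1]) torch.Size([4, 10])
```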


2020 ◽  
Vol 12 (7) ◽  
pp. 117
Author(s):  
Salvatore Graziani ◽  
Maria Gabriella Xibilia

The introduction of new topologies and training procedures to deep neural networks has sparked renewed interest in the field of neural computation. The use of deep structures has significantly improved the state of the art in many applications, such as computer vision, speech and text processing, medical applications, and the Internet of Things (IoT). The probability of a successful outcome from a neural network is linked to the selection of an appropriate network architecture and training algorithm. Accordingly, much of the recent research on neural networks is devoted to the study and proposal of novel architectures, including solutions tailored to specific problems. The papers of this Special Issue make significant contributions to the above-mentioned fields by merging theoretical aspects and relevant applications. Twelve papers are collected in the issue, addressing many relevant aspects of the topic.


Electronics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 17
Author(s):  
Soha A. Nossier ◽  
Julie Wall ◽  
Mansour Moniri ◽  
Cornelius Glackin ◽  
Nigel Cannings

Recent speech enhancement research has shown that deep learning techniques are very effective in removing background noise. Many deep neural networks are being proposed, showing promising results for improving overall speech perception. The Deep Multilayer Perceptron, Convolutional Neural Networks, and the Denoising Autoencoder are well-established architectures for speech enhancement; however, choosing between different deep learning models has been mainly empirical. Consequently, a comparative analysis is needed between these three architecture types in order to show the factors affecting their performance. In this paper, this analysis is presented by comparing seven deep learning models that belong to these three categories. The comparison includes evaluating the performance in terms of the overall quality of the output speech using five objective evaluation metrics and a subjective evaluation with 23 listeners; the ability to deal with challenging noise conditions; generalization ability; complexity; and processing time. Further analysis is then provided using two different approaches. The first approach investigates how the performance is affected by changing network hyperparameters and the structure of the data, including the Lombard effect, while the second approach interprets the results by visualizing the spectrogram of the output layer of all the investigated models, and the spectrograms of the hidden layers of the convolutional neural network architecture. Finally, a general evaluation of supervised deep learning-based speech enhancement is performed using SWOC analysis, to discuss the technique's Strengths, Weaknesses, Opportunities, and Challenges. The results of this paper contribute to the understanding of how different deep neural networks perform the speech enhancement task, highlight the strengths and weaknesses of each architecture, and provide recommendations for achieving better performance. This work facilitates the development of better deep neural networks for speech enhancement in the future.


2020 ◽  
Vol 10 (14) ◽  
pp. 4911
Author(s):  
Jin-Yeol Kwak ◽  
Yong-Joo Chung

We propose using derivative features for sound event detection based on deep neural networks. As input to the networks, we used the log-mel-filterbank and its first and second derivative features for each frame of the audio signal. Two deep neural networks were used to evaluate the effectiveness of these derivative features. Specifically, a convolutional recurrent neural network (CRNN) was constructed by combining a convolutional neural network and a recurrent neural network (RNN), followed by a feed-forward neural network (FNN) acting as a classification layer. In addition, a mean-teacher model based on an attention CRNN was used. Both models had an average pooling layer at the output so that weakly labeled and unlabeled audio data may be used during model training. Under the various training conditions, depending on the neural network architecture and training set, the use of derivative features resulted in a consistent performance improvement. Experiments on audio data from the Detection and Classification of Acoustic Scenes and Events 2018 and 2019 challenges indicated that a maximum relative improvement of 16.9% was obtained in terms of the F-score.
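For reference, the derivative (delta) features can be obtained from the log-mel-filterbank by differentiating along the time axis; the sketch below uses a simple centered finite difference on random placeholder data (librosa.feature.delta is a common regression-based alternative). The mel-band and frame counts are arbitrary choices for the example.

```python
# Small sketch: append first- and second-order derivative (delta) features to a
# log-mel-filterbank spectrogram using a centered finite difference.
import numpy as np

def delta(feat: np.ndarray) -> np.ndarray:
    """First-order time derivative of a (n_mels, n_frames) feature matrix."""
    padded = np.pad(feat, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, 2:] - padded[:, :-2]) / 2.0

log_mel = np.random.randn(64, 500)          # placeholder for a real log-mel spectrogram
d1 = delta(log_mel)                         # first derivative
d2 = delta(d1)                              # second derivative
features = np.stack([log_mel, d1, d2])      # (3, 64, 500): channels fed to the CRNN
print(features.shape)
```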

