scholarly journals When the state of the art is ahead of the state of understanding: Unintuitive properties of deep neural networks

Author(s):  
Joan Serrà

Deep learning is an undeniably hot topic, not only within both academia and industry, but also among society and the media. The reasons for the advent of its popularity are manifold: unprecedented availability of data and computing power, some innovative methodologies, minor but significant technical tricks, etc. However, interestingly, the current success and practice of deep learning seems to be uncorrelated with its theoretical, more formal understanding. And with that, deep learning’s state-of-the-art presents a number of unintuitive properties or situations. In this note, I highlight some of these unintuitive properties, trying to show relevant recent work, and expose the need to get insight into them, either by formal or more empirical means.

Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibit relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods in solving the considered problems.


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1365
Author(s):  
Bogdan Muşat ◽  
Răzvan Andonie

Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can bring an insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, this aggregation operation (known as superization) is accompanied by a decrease of spatial entropy: signs are aggregated into supersign. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes which take place between successive layers of the network. In our experiments, we visualize the superization process and show how the obtained knowledge can be used to explain the neural decision model. In addition, we attempt to optimize the architecture of the neural model employing a semiotic greedy technique. To the extent of our knowledge, this is the first application of computational semiotics in the analysis and interpretation of deep neural networks.


Author(s):  
Aydin Ayanzadeh ◽  
Sahand Vahidnia

In this paper, we leverage state of the art models on Imagenet data-sets. We use the pre-trained model and learned weighs to extract the feature from the Dog breeds identification data-set. Afterwards, we applied fine-tuning and dataaugmentation to increase the performance of our test accuracy in classification of dog breeds datasets. The performance of the proposed approaches are compared with the state of the art models of Image-Net datasets such as ResNet-50, DenseNet-121, DenseNet-169 and GoogleNet. we achieved 89.66% , 85.37% 84.01% and 82.08% test accuracy respectively which shows thesuperior performance of proposed method to the previous works on Stanford dog breeds datasets.


2021 ◽  
Vol 3 (4) ◽  
pp. 966-989
Author(s):  
Vanessa Buhrmester ◽  
David Münch ◽  
Michael Arens

Deep Learning is a state-of-the-art technique to make inference on extensive or complex data. As a black box model due to their multilayer nonlinear structure, Deep Neural Networks are often criticized as being non-transparent and their predictions not traceable by humans. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. By basing decision-making algorithms on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to a lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview about the taxonomy of related studies and compare several survey papers that deal with explainability in general. We work out the drawbacks and gaps and summarize further research ideas.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hongwei Luo ◽  
Yijie Shen ◽  
Feng Lin ◽  
Guoai Xu

Speaker verification system has gained great popularity in recent years, especially with the development of deep neural networks and Internet of Things. However, the security of speaker verification system based on deep neural networks has not been well investigated. In this paper, we propose an attack to spoof the state-of-the-art speaker verification system based on generalized end-to-end (GE2E) loss function for misclassifying illegal users into the authentic user. Specifically, we design a novel loss function to deploy a generator for generating effective adversarial examples with slight perturbation and then spoof the system with these adversarial examples to achieve our goals. The success rate of our attack can reach 82% when cosine similarity is adopted to deploy the deep-learning-based speaker verification system. Beyond that, our experiments also reported the signal-to-noise ratio at 76 dB, which proves that our attack has higher imperceptibility than previous works. In summary, the results show that our attack not only can spoof the state-of-the-art neural-network-based speaker verification system but also more importantly has the ability to hide from human hearing or machine discrimination.


2020 ◽  
Author(s):  
Alisson Hayasi da Costa ◽  
Renato Augusto C. dos Santos ◽  
Ricardo Cerri

AbstractPIWI-Interacting RNAs (piRNAs) form an important class of non-coding RNAs that play a key role in the genome integrity through the silencing of transposable elements. However, despite their importance and the large application of deep learning in computational biology for classification tasks, there are few studies of deep learning and neural networks for piRNAs prediction. Therefore, this paper presents an investigation on deep feedforward networks models for classification of transposon-derived piRNAs. We analyze and compare the results of the neural networks in different hyperparameters choices, such as number of layers, activation functions and optimizers, clarifying the advantages and disadvantages of each configuration. From this analysis, we propose a model for human piRNAs classification and compare our method with the state-of-the-art deep neural network for piRNA prediction in the literature and also traditional machine learning algorithms, such as Support Vector Machines and Random Forests, showing that our model has achieved a great performance with an F-measure value of 0.872, outperforming the state-of-the-art method in the literature.


Author(s):  
Xiao Ling ◽  
Sameer Singh ◽  
Daniel S. Weld

Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference. But despite numerous papers there is surprisingly little understanding of the state of the art in EL. We attack this confusion by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. We conduct an extensive evaluation on nine data sets, comparing Vinculum with two state-of-the-art systems, and elucidate key aspects of the system that include mention extraction, candidate generation, entity type prediction, entity coreference, and coherence.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Wanheng Liu ◽  
Ling Yin ◽  
Cong Wang ◽  
Fulin Liu ◽  
Zhiyu Ni

In this paper, a novel medical knowledge graph in Chinese approach applied in smart healthcare based on IoT and WoT is presented, using deep neural networks combined with self-attention to generate medical knowledge graph to make it more convenient for performing disease diagnosis and providing treatment advisement. Although great success has been made in the medical knowledge graph in recent studies, the issue of comprehensive medical knowledge graph in Chinese appropriate for telemedicine or mobile devices have been ignored. In our study, it is a working theory which is based on semantic mobile computing and deep learning. When several experiments have been carried out, it is demonstrated that it has better performance in generating various types of medical knowledge graph in Chinese, which is similar to that of the state-of-the-art. Also, it works well in the accuracy and comprehensive, which is much higher and highly consisted with the predictions of the theoretical model. It proves to be inspiring and encouraging that our work involving studies of medical knowledge graph in Chinese, which can stimulate the smart healthcare development.


Author(s):  
Dong-Dong Chen ◽  
Wei Wang ◽  
Wei Gao ◽  
Zhi-Hua Zhou

Deep neural networks have witnessed great successes in various real applications, but it requires a large number of labeled data for training. In this paper, we propose tri-net, a deep neural network which is able to use massive unlabeled data to help learning with limited labeled data. We consider model initialization, diversity augmentation and pseudo-label editing simultaneously. In our work, we utilize output smearing to initialize modules, use fine-tuning on labeled data to augment diversity and eliminate unstable pseudo-labels to alleviate the influence of suspicious pseudo-labeled data. Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. In particular, it achieves 8.30% error rate on CIFAR-10 by using only 4000 labeled examples.


2017 ◽  
Vol 37 (4-5) ◽  
pp. 513-542 ◽  
Author(s):  
Sen Wang ◽  
Ronald Clark ◽  
Hongkai Wen ◽  
Niki Trigoni

This paper studies visual odometry (VO) from the perspective of deep learning. After tremendous efforts in the robotics and computer vision communities over the past few decades, state-of-the-art VO algorithms have demonstrated incredible performance. However, since the VO problem is typically formulated as a pure geometric problem, one of the key features still missing from current VO systems is the capability to automatically gain knowledge and improve performance through learning. In this paper, we investigate whether deep neural networks can be effective and beneficial to the VO problem. An end-to-end, sequence-to-sequence probabilistic visual odometry (ESP-VO) framework is proposed for the monocular VO based on deep recurrent convolutional neural networks. It is trained and deployed in an end-to-end manner, that is, directly inferring poses and uncertainties from a sequence of raw images (video) without adopting any modules from the conventional VO pipeline. It can not only automatically learn effective feature representation encapsulating geometric information through convolutional neural networks, but also implicitly model sequential dynamics and relation for VO using deep recurrent neural networks. Uncertainty is also derived along with the VO estimation without introducing much extra computation. Extensive experiments on several datasets representing driving, flying and walking scenarios show competitive performance of the proposed ESP-VO to the state-of-the-art methods, demonstrating a promising potential of the deep learning technique for VO and verifying that it can be a viable complement to current VO systems.


Sign in / Sign up

Export Citation Format

Share Document