Literature Review of Deep Network Compression

Ali Alqahtani; Xianghua Xie; Mark W. Jones

doi:10.3390/informatics8040077

Literature Review of Deep Network Compression

Informatics ◽

10.3390/informatics8040077 ◽

2021 ◽

Vol 8 (4) ◽

pp. 77

Author(s):

Ali Alqahtani ◽

Xianghua Xie ◽

Mark W. Jones

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Low Rank ◽

Vast Number ◽

Deep Networks ◽

Factorization Methods ◽

Network Compression ◽

Rank Factorization ◽

Pruning Methods

Deep networks often possess a vast number of parameters, and their significant redundancy in parameterization has become a widely-recognized property. This presents significant challenges and restricts many deep learning applications, making the focus on reducing the complexity of models while maintaining their powerful performance. In this paper, we present an overview of popular methods and review recent works on compressing and accelerating deep neural networks. We consider not only pruning methods but also quantization methods, and low-rank factorization methods. This review also intends to clarify these major concepts, and highlights their characteristics, advantages, and shortcomings.

Download Full-text

Syntactic Structure from Deep Learning

Annual Review of Linguistics ◽

10.1146/annurev-linguistics-032020-051035 ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Tal Linzen ◽

Marco Baroni

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Machine Translation ◽

Online Publication ◽

Deep Neural Networks ◽

Syntactic Structure ◽

Annual Review ◽

Publication Date ◽

Grammatical Knowledge ◽

Deep Networks

Modern deep neural networks achieve impressive performance in engineering applications that require extensive linguistic skills, such as machine translation. This success has sparked interest in probing whether these models are inducing human-like grammatical knowledge from the raw data they are exposed to and, consequently, whether they can shed new light on long-standing debates concerning the innate structure necessary for language acquisition. In this article, we survey representative studies of the syntactic abilities of deep networks and discuss the broader implications that this work has for theoretical linguistics. Expected final online publication date for the Annual Review of Linguistics, Volume 7 is January 14, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

Download Full-text

The HSIC Bottleneck: Deep Learning without Back-Propagation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5950 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5085-5092 ◽

Cited By ~ 1

Author(s):

Wan-Duo Kurt Ma ◽

J. P. Lewis ◽

W. Bastiaan Kleijn

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Back Propagation ◽

Single Layer ◽

Cross Entropy ◽

Entropy Loss ◽

Deep Networks ◽

Independence Criterion

We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks. The HSIC bottleneck is an alternative to the conventional cross-entropy loss and backpropagation that has a number of distinct advantages. It mitigates exploding and vanishing gradients, resulting in the ability to learn very deep networks without skip connections. There is no requirement for symmetric feedback or update locking. We find that the HSIC bottleneck provides performance on MNIST/FashionMNIST/CIFAR10 classification comparable to backpropagation with a cross-entropy target, even when the system is not encouraged to make the output resemble the classification labels. Appending a single layer trained with SGD (without backpropagation) to reformat the information further improves performance.

Download Full-text

Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks

10.21437/interspeech.2018-1417 ◽

2018 ◽

Cited By ~ 39

Author(s):

Daniel Povey ◽

Gaofeng Cheng ◽

Yiming Wang ◽

Ke Li ◽

Hainan Xu ◽

...

Keyword(s):

Neural Networks ◽

Matrix Factorization ◽

Deep Neural Networks ◽

Low Rank ◽

Rank Matrix ◽

Low Rank Matrix

Download Full-text

Automatic Detection of Arrhythmia Based on Multi-Resolution Representation of ECG Signal

Sensors ◽

10.3390/s20061579 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1579

Author(s):

Dongqi Wang ◽

Qinghua Meng ◽

Dongming Chen ◽

Hupo Zhang ◽

Lisheng Xu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Channel Model ◽

Expert Knowledge ◽

Automatic Detection ◽

Data Representation ◽

Learning Technology ◽

Arrhythmia Detection ◽

Automatic Feature Extraction

Automatic detection of arrhythmia is of great significance for early prevention and diagnosis of cardiovascular disease. Traditional feature engineering methods based on expert knowledge lack multidimensional and multi-view information abstraction and data representation ability, so the traditional research on pattern recognition of arrhythmia detection cannot achieve satisfactory results. Recently, with the increase of deep learning technology, automatic feature extraction of ECG data based on deep neural networks has been widely discussed. In order to utilize the complementary strength between different schemes, in this paper, we propose an arrhythmia detection method based on the multi-resolution representation (MRR) of ECG signals. This method utilizes four different up to date deep neural networks as four channel models for ECG vector representations learning. The deep learning based representations, together with hand-crafted features of ECG, forms the MRR, which is the input of the downstream classification strategy. The experimental results of big ECG dataset multi-label classification confirm that the F1 score of the proposed method is 0.9238, which is 1.31%, 0.62%, 1.18% and 0.6% higher than that of each channel model. From the perspective of architecture, this proposed method is highly scalable and can be employed as an example for arrhythmia recognition.

Download Full-text

Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

AbstractThe application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.

Download Full-text

Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽

10.3390/a14020039 ◽

2021 ◽

Vol 14 (2) ◽

pp. 39

Author(s):

Carlos Lassance ◽

Vincent Gripon ◽

Antonio Ortega

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Objective Function ◽

Learning Process ◽

Deep Neural Networks ◽

State Of The Art ◽

The Core ◽

Learning Tasks ◽

Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibit relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods in solving the considered problems.

Download Full-text

Off-the-shelf deep learning is not enough, and requires parsimony, Bayesianity, and causality

npj Computational Materials ◽

10.1038/s41524-020-00487-0 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Rama K. Vasudevan ◽

Maxim Ziatdinov ◽

Lukas Vlcek ◽

Sergei V. Kalinin

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Deep Learning ◽

Bayesian Methods ◽

Deep Neural Networks ◽

Applied Research ◽

Modern Science ◽

Generative Models ◽

Knowledge Development ◽

Physical Constraints

AbstractDeep neural networks (‘deep learning’) have emerged as a technology of choice to tackle problems in speech recognition, computer vision, finance, etc. However, adoption of deep learning in physical domains brings substantial challenges stemming from the correlative nature of deep learning methods compared to the causal, hypothesis driven nature of modern science. We argue that the broad adoption of Bayesian methods incorporating prior knowledge, development of solutions with incorporated physical constraints and parsimonious structural descriptors and generative models, and ultimately adoption of causal models, offers a path forward for fundamental and applied research.

Download Full-text

Diversity oriented Deep Reinforcement Learning for targeted molecule generation

Journal of Cheminformatics ◽

10.1186/s13321-021-00498-z ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Tiago Pereira ◽

Maryam Abbasi ◽

Bernardete Ribeiro ◽

Joel P. Arrais

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Reinforcement Learning ◽

Deep Neural Networks ◽

Chemical Space ◽

Biological Properties ◽

Training Process ◽

Training Strategy ◽

Inhibitory Power ◽

Exploratory Strategy

AbstractIn this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules employing SMILES strings notation, and the Predictor which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoken properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process that seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model that will remain fixed and a copy of the previous one that will be updated during the training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules, with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized coefficient of partition and also high inhibitory power against the Adenosine $$A_{2A}$$ A 2 A and $$\kappa$$ κ opioid receptors. The results reveal that the model can effectively adjust the newly generated molecules towards the wanted direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.

Download Full-text

Multi-Mineral Segmentation of SEM Images Using Deep Learning Techniques

10.2118/206526-ms ◽

2021 ◽

Author(s):

Vladislav Vasilevich Alekseev ◽

Denis Mihaylovich Orlov ◽

Dmitry Anatolevich Koroteev

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Pore Space ◽

Rock Physics ◽

Sem Images ◽

Digital Rock Physics ◽

Learning Techniques ◽

Digital Core ◽

Non Destructive

Abstract The approaches of building and methods of using the digital core are currently developing rapidly. The use of these methods makes it possible to obtain petrophysical information by non-destructive methods quickly. Digital rock physics includes two main stages: constructing models and modeling various physical processes on the obtained models. Our work proposes using deep learning methods for mineral and pore space segmentation instead of classical methods such as threshold image processing. Deep neural networks have long been able to show their advantages in many areas of computer vision. This paper proposes and tests methods that help identify different minerals in images from a scanning electron microscope. We used images of rocks of the Achimov formation, which are arkoses, as samples. We tested various deep neural networks such as LinkNet, U-Net, ResUNet, and pix2pix and identified those that performed best in segmentation.

Download Full-text

SỬ DỤNG DEEP NEURAL NETWORKS PHÁT HIỆN GAI ĐỘNG KINH TRONG BẢN GHI EEG

Journal of Military Science and Technology ◽

10.54939/1859-1043.j.mst.70.2020.77-84 ◽

2020 ◽

pp. 77-84

Author(s):

Xuyến

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Artificial Neural

Deep Neural Networks là một thuật toán dạy cho máy học, là phương pháp nâng cao của mạng nơ-ron nhân tạo (Artificial Neural Networks) nhiều tầng để học biểu diễn mô hình đối tượng. Bài báo trình bày phương pháp để phát hiện spike tự động, giải quyết bài toán cho các bác sỹ khi phân tích dữ liệu khổng lồ được thu thập từ bản ghi điện não để xác định một khu vực của não gây ra chứng động kinh. Hàng triệu mẫu được phân tích thủ công đã được đào tạo lại để tìm các gai liêp tiếp phát ra từ vùng não bị ảnh hưởng. Để đánh giá phương pháp đề xuất, tác giả đã xây dựng hệ thống trong đó sử dụng một số mô hình deep learning đưa vào thử nghiệm hỗ trợ các bác sỹ khám phát hiện và chẩn đoán sớm bệnh.

Download Full-text