A novel layerwise pruning method for model reduction of fully connected deep neural networks

Author(s):  
Lukas Mauch ◽  
Bin Yang
2018 ◽  
Vol 5 (5) ◽  
pp. 3263-3269 ◽  
Author(s):  
Haonan Guo ◽  
Shenghong Li ◽  
Bin Li ◽  
Yinghua Ma ◽  
Xudie Ren

Electronics ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 78 ◽  
Author(s):  
Zidi Qin ◽  
Di Zhu ◽  
Xingwei Zhu ◽  
Xuan Chen ◽  
Yinghuan Shi ◽  
...  

As a key ingredient of deep neural networks (DNNs), fully-connected (FC) layers are widely used in various artificial intelligence applications. However, there are many parameters in FC layers, so the efficient process of FC layers is restricted by memory bandwidth. In this paper, we propose a compression approach combining block-circulant matrix-based weight representation and power-of-two quantization. Applying block-circulant matrices in FC layers can reduce the storage complexity from O ( k 2 ) to O ( k ) . By quantizing the weights into integer powers of two, the multiplications in the reference can be replaced by shift and add operations. The memory usages of models for MNIST, CIFAR-10 and ImageNet can be compressed by 171 × , 2731 × and 128 × with minimal accuracy loss, respectively. A configurable parallel hardware architecture is then proposed for processing the compressed FC layers efficiently. Without multipliers, a block matrix-vector multiplication module (B-MV) is used as the computing kernel. The architecture is flexible to support FC layers of various compression ratios with small footprint. Simultaneously, the memory access can be significantly reduced by using the configurable architecture. Measurement results show that the accelerator has a processing power of 409.6 GOPS, and achieves 5.3 TOPS/W energy efficiency at 800 MHz.


2021 ◽  
Vol 37 (2) ◽  
pp. 123-143
Author(s):  
Tuan Minh Luu ◽  
Huong Thanh Le ◽  
Tan Minh Hoang

Deep neural networks have been applied successfully to extractive text summarization tasks with the accompany of large training datasets. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system’s summary. In this paper, we propose an extractive summarization system basing on a Convolutional Neural Network and a Fully Connected network for sentence selection. The pretrained BERT multilingual model is used to generate embeddings vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences from the output summary are eliminated by the Maximal Marginal Relevance method. Our system is evaluated with both English and Vietnamese languages using CNN and Baomoi datasets, respectively. Experimental results show that our system achieves better results comparing to existing works using the same dataset. It confirms that our approach can be effectively applied to summarize both English and Vietnamese languages.


Author(s):  
Shuqin Gu ◽  
Yuexian Hou ◽  
Lipeng Zhang ◽  
Yazhou Zhang

Although Deep Neural Networks (DNNs) have achieved excellent performance in many tasks, improving the generalization capacity of DNNs still remains a challenge. In this work, we propose a novel regularizer named Ensemble-based Decorrelation Method (EDM), which is motivated by the idea of the ensemble learning to improve generalization capacity of DNNs. EDM can be applied to hidden layers in fully connected neural networks or convolutional neural networks. We treat each hidden layer as an ensemble of several base learners through dividing all the hidden units into several non-overlap groups, and each group will be viewed as a base learner. EDM encourages DNNs to learn more diverse representations by minimizing the covariance between all base learners during the training step. Experimental results on MNIST and CIFAR datasets demonstrate that EDM can effectively reduce the overfitting and improve the generalization capacity of DNNs  


2020 ◽  
Vol 8 (5) ◽  
pp. 3292-3296

Android is susceptible to malware attacks due to its open architecture, large user base and access to its code. Mobile or android malware attacks are increasing from last year. These are common threats for every internet-accessible device. From Researchers Point of view 50% increase in cyber-attacks targeting Android Mobile phones since last year. Malware attackers increasingly turning their attention to attacking smartphones with credential-theft, surveillance, and malicious advertising. Security investigation in the android mobile system has relied on analysis for malware or threat detection using binary samples or system calls with behavior profile for malicious applications is generated and then analyzed. The resulting report is then used to detect android application malware or threats using manual features. To dispose of malicious applications in the mobile device, we propose an Android malware detection system using deep learning techniques which gives security for mobile or android. FNN(Fully-connected FeedForward Deep Neural Networks) and AutoEncoder algorithm from deep learning provide Extensive experiments on a real-world dataset that reaches to an accuracy of 95 %. These papers explain Deep learning FNN(Fully-connected FeedForward Deep Neural Networks) and AutoEncoder approach for android malware detection.


2020 ◽  
Author(s):  
Nitin Chandrachoodan ◽  
Basava Naga Girish Koneru ◽  
Vinita Vasudevan

<div>Deep Neural Networks (DNNs) are increasingly being used in a variety of applications. However, DNNs have huge computational and memory requirements. One way to reduce these requirements is to sparsify DNNs by using smoothed LASSO (Least Absolute Shrinkage and Selection Operator) functions. In this paper, we show that for the same maximum error with respect to the LASSO function, the sparsity values obtained using various smoothed LASSO functions are similar. We also propose a layer-wise DNN pruning algorithm, where the layers are pruned based on their individual allocated accuracy loss budget determined by estimates of the reduction in number of multiply-accumulate operations (in convolutional layers) and weights (in fully connected layers). Further, the structured LASSO variants in both convolutional and fully connected layers are explored within the smoothed LASSO framework and the tradeoffs involved are discussed. The efficacy of proposed algorithm in enhancing the sparsity within the allowed degradation in DNN accuracy and results obtained on structured LASSO variants are shown on MNIST, SVHN, CIFAR-10, and Imagenette datasets.</div>


2020 ◽  
Vol 1 ◽  
pp. 6
Author(s):  
Henning Petzka ◽  
Martin Trimmel ◽  
Cristian Sminchisescu

Symmetries in neural networks allow different weight configurations leading to the same network function. For odd activation functions, the set of transformations mapping between such configurations have been studied extensively, but less is known for neural networks with ReLU activation functions. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerated situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only part of the degenerated cases. Finally, we present a non-degenerate situation for deep neural networks leading to new transformations leaving the network function intact.


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: Given a target network architecture can we design a `smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever increasing storage and memory requirements of these networks pose a problem in resource constrained environments.In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.


Sign in / Sign up

Export Citation Format

Share Document