A dual number abstraction for static analysis of Clarke Jacobians

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-30
Author(s):  
Jacob Laurel ◽  
Rem Yang ◽  
Gagandeep Singh ◽  
Sasa Misailovic

We present a novel abstraction for bounding the Clarke Jacobian of a Lipschitz continuous, but not necessarily differentiable function over a local input region. To do so, we leverage a novel abstract domain built upon dual numbers, adapted to soundly over-approximate all first derivatives needed to compute the Clarke Jacobian. We formally prove that our novel forward-mode dual interval evaluation produces a sound, interval domain-based over-approximation of the true Clarke Jacobian for a given input region. Due to the generality of our formalism, we can compute and analyze interval Clarke Jacobians for a broader class of functions than previous works supported – specifically, arbitrary compositions of neural networks with Lipschitz, but non-differentiable perturbations. We implement our technique in a tool called DeepJ and evaluate it on multiple deep neural networks and non-differentiable input perturbations to showcase both the generality and scalability of our analysis. Concretely, we can obtain interval Clarke Jacobians to analyze Lipschitz robustness and local optimization landscapes of both fully-connected and convolutional neural networks for rotational, contrast variation, and haze perturbations, as well as their compositions.
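To make the idea of a forward-mode dual interval evaluation concrete, here is a minimal Python sketch for scalar functions. It is not DeepJ and does not reproduce the paper's abstract domain: the class names `Interval` and `DualInterval` and the function `relu` are hypothetical, and floating-point rounding is ignored, so the sketch is not sound in the formal sense the abstract claims. Each value carries an interval for the function value and an interval for its derivative. At the ReLU kink the slope is taken to be the whole interval [0, 1], which is the Clarke generalized derivative of ReLU at 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        # Interval product: the bounds are among the four endpoint products.
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))


@dataclass(frozen=True)
class DualInterval:
    val: Interval  # bounds on the function value over the input region
    der: Interval  # bounds on the (generalized) derivative over the region

    def __add__(self, other):
        return DualInterval(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule, evaluated in interval arithmetic.
        return DualInterval(self.val * other.val,
                            self.val * other.der + self.der * other.val)


def relu(x):
    if x.val.lo > 0:   # strictly positive region: identity
        return x
    if x.val.hi < 0:   # strictly negative region: constant zero
        return DualInterval(Interval(0.0, 0.0), Interval(0.0, 0.0))
    # Region touches the kink at 0: the slope may be anything in [0, 1].
    slope = Interval(0.0, 1.0)
    return DualInterval(Interval(max(x.val.lo, 0.0), max(x.val.hi, 0.0)),
                        slope * x.der)


# f(x) = relu(x) * x on the input region [-1, 2], seeded with dx/dx = 1.
x = DualInterval(Interval(-1.0, 2.0), Interval(1.0, 1.0))
y = relu(x) * x
print(y.val, y.der)
```

On this region the true derivative of f ranges over [0, 4], and the computed derivative interval contains that range with some slack, which is the kind of over-approximation an interval domain is expected to give.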

Electronics ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 78 ◽  
Author(s):  
Zidi Qin ◽  
Di Zhu ◽  
Xingwei Zhu ◽  
Xuan Chen ◽  
Yinghuan Shi ◽  
...  

As a key ingredient of deep neural networks (DNNs), fully-connected (FC) layers are widely used in various artificial intelligence applications. However, FC layers contain many parameters, so processing them efficiently is limited by memory bandwidth. In this paper, we propose a compression approach combining block-circulant matrix-based weight representation and power-of-two quantization. Applying block-circulant matrices in FC layers can reduce the storage complexity from $O(k^2)$ to $O(k)$. By quantizing the weights into integer powers of two, the multiplications during inference can be replaced by shift and add operations. The memory usage of models for MNIST, CIFAR-10 and ImageNet can be compressed by 171×, 2731× and 128×, respectively, with minimal accuracy loss. A configurable parallel hardware architecture is then proposed for processing the compressed FC layers efficiently. Without multipliers, a block matrix-vector multiplication module (B-MV) is used as the computing kernel. The architecture is flexible enough to support FC layers of various compression ratios with a small footprint. At the same time, memory access can be significantly reduced by using the configurable architecture. Measurement results show that the accelerator has a processing power of 409.6 GOPS and achieves 5.3 TOPS/W energy efficiency at 800 MHz.
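The two compression ideas can be illustrated with a small Python sketch. It assumes a single k × k circulant block defined by one length-k vector (the paper uses block-circulant matrices and a hardware kernel, neither of which is modeled here), integer fixed-point activations, and round-to-nearest-exponent quantization. The function names `quantize_pow2`, `shift_mul` and `circulant_matvec` are hypothetical.

```python
import math


def quantize_pow2(w):
    """Return (sign, exponent) such that w is approximated by sign * 2**exponent."""
    if w == 0:
        return (0, 0)
    return (1 if w > 0 else -1, round(math.log2(abs(w))))


def shift_mul(x, sign, exp):
    """Multiply the integer x by sign * 2**exp using only shifts and negation."""
    if sign == 0:
        return 0
    # A right shift floors the result, as fixed-point hardware typically would.
    shifted = x << exp if exp >= 0 else x >> -exp
    return shifted if sign > 0 else -shifted


def circulant_matvec(c_quant, x):
    """y = C x, where C[i][j] = c[(i - j) mod k]; only the k entries of c are stored."""
    k = len(c_quant)
    y = []
    for i in range(k):
        acc = 0
        for j in range(k):
            sign, exp = c_quant[(i - j) % k]
            acc += shift_mul(x[j], sign, exp)
        y.append(acc)
    return y


weights = [1.0, 2.1, -0.5, 3.9]          # defining vector of one 4 x 4 block
c_quant = [quantize_pow2(w) for w in weights]
x = [4, 8, 12, 16]                       # integer activations
print(c_quant)
print(circulant_matvec(c_quant, x))
```

Storing only the defining vector is what gives the O(k) storage per block, and the inner loop contains no multiplications, only shifts and additions.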


2021 ◽  
Vol 37 (2) ◽  
pp. 123-143
Author(s):  
Tuan Minh Luu ◽  
Huong Thanh Le ◽  
Tan Minh Hoang

Deep neural networks have been applied successfully to extractive text summarization tasks when accompanied by large training datasets. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system’s summary. In this paper, we propose an extractive summarization system based on a Convolutional Neural Network and a Fully Connected network for sentence selection. The pretrained BERT multilingual model is used to generate embedding vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences are eliminated from the output summary by the Maximal Marginal Relevance method. Our system is evaluated on both English and Vietnamese using the CNN and Baomoi datasets, respectively. Experimental results show that our system achieves better results than existing works using the same datasets. This confirms that our approach can be effectively applied to summarize both English and Vietnamese text.
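The redundancy-removal step can be sketched as a generic Maximal Marginal Relevance selection loop. The abstract does not say how the authors score relevance or similarity, so this sketch assumes both are given as plain numbers; the function name `mmr_select` and the trade-off parameter `lam` are hypothetical.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedily pick up to k items, trading relevance against redundancy.

    relevance[i]     : relevance score of sentence i
    similarity[i][j] : similarity between sentences i and j
    lam              : weight on relevance (1.0 ignores redundancy)
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Sentences 0 and 1 are near-duplicates; sentence 2 is less relevant but novel.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(mmr_select(relevance, similarity, k=2, lam=0.5))
```

With `lam=0.5` the loop picks sentence 0 first and then prefers the novel sentence 2 over the near-duplicate sentence 1.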


Author(s):  
Shuqin Gu ◽  
Yuexian Hou ◽  
Lipeng Zhang ◽  
Yazhou Zhang

Although Deep Neural Networks (DNNs) have achieved excellent performance in many tasks, improving the generalization capacity of DNNs remains a challenge. In this work, we propose a novel regularizer named Ensemble-based Decorrelation Method (EDM), which is motivated by the idea of ensemble learning, to improve the generalization capacity of DNNs. EDM can be applied to hidden layers in fully connected neural networks or convolutional neural networks. We treat each hidden layer as an ensemble of several base learners by dividing all the hidden units into several non-overlapping groups, each of which is viewed as a base learner. EDM encourages DNNs to learn more diverse representations by minimizing the covariance between all base learners during training. Experimental results on the MNIST and CIFAR datasets demonstrate that EDM can effectively reduce overfitting and improve the generalization capacity of DNNs.


2020 ◽  
Vol 8 (5) ◽  
pp. 3292-3296

Android is susceptible to malware attacks because of its open architecture, large user base, and accessible code. Malware attacks on Android and other mobile platforms have increased since last year, and they are a common threat to every internet-accessible device. Researchers report a 50% increase in cyber-attacks targeting Android phones since last year. Attackers are increasingly turning their attention to smartphones, using credential theft, surveillance, and malicious advertising. Security analysis of Android has relied on malware or threat detection from binary samples or system calls, in which a behavior profile of a malicious application is generated and then analyzed. The resulting report is then used to detect Android malware or threats using manually engineered features. To remove malicious applications from mobile devices, we propose an Android malware detection system based on deep learning techniques. Using an FNN (fully-connected feedforward deep neural network) and an AutoEncoder, extensive experiments on a real-world dataset reach an accuracy of 95%. This paper explains the FNN and AutoEncoder approach to Android malware detection.


2020 ◽  
Author(s):  
Nitin Chandrachoodan ◽  
Basava Naga Girish Koneru ◽  
Vinita Vasudevan

Deep Neural Networks (DNNs) are increasingly being used in a variety of applications. However, DNNs have huge computational and memory requirements. One way to reduce these requirements is to sparsify DNNs by using smoothed LASSO (Least Absolute Shrinkage and Selection Operator) functions. In this paper, we show that for the same maximum error with respect to the LASSO function, the sparsity values obtained using various smoothed LASSO functions are similar. We also propose a layer-wise DNN pruning algorithm, where the layers are pruned based on their individually allocated accuracy loss budget, determined by estimates of the reduction in the number of multiply-accumulate operations (in convolutional layers) and weights (in fully connected layers). Further, the structured LASSO variants in both convolutional and fully connected layers are explored within the smoothed LASSO framework, and the tradeoffs involved are discussed. The efficacy of the proposed algorithm in enhancing sparsity within the allowed degradation in DNN accuracy, and the results obtained on structured LASSO variants, are shown on the MNIST, SVHN, CIFAR-10, and Imagenette datasets.


2020 ◽  
Vol 1 ◽  
pp. 6
Author(s):  
Henning Petzka ◽  
Martin Trimmel ◽  
Cristian Sminchisescu

Symmetries in neural networks allow different weight configurations to produce the same network function. For odd activation functions, the set of transformations mapping between such configurations has been studied extensively, but less is known for neural networks with ReLU activation functions. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerate situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only part of the degenerate cases. Finally, we present a non-degenerate situation for deep neural networks leading to new transformations that leave the network function intact.
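The abstract does not name the two well-known transformations. The sketch below assumes they are the ones usually cited for ReLU networks: permuting hidden units, and rescaling a hidden unit's incoming weights and bias by a positive factor while dividing its outgoing weights by the same factor. It checks numerically that both leave a small two-layer network's output unchanged. All function names and the example weights are hypothetical.

```python
import math


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, b1, W2, b2, x):
    """Two-layer network: W2 * relu(W1 * x + b1) + b2."""
    hidden = [max(0.0, a + b) for a, b in zip(matvec(W1, x), b1)]
    return [a + b for a, b in zip(matvec(W2, hidden), b2)]


def rescale_unit(W1, b1, W2, unit, lam):
    """Scale one hidden unit by lam > 0 going in and by 1/lam going out."""
    if lam <= 0:
        raise ValueError("ReLU is only positively homogeneous: lam must be > 0")
    W1s = [[w * lam for w in row] if i == unit else list(row)
           for i, row in enumerate(W1)]
    b1s = [b * lam if i == unit else b for i, b in enumerate(b1)]
    W2s = [[w / lam if j == unit else w for j, w in enumerate(row)]
           for row in W2]
    return W1s, b1s, W2s


def permute_units(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1, entries of b1, columns of W2."""
    return ([W1[p] for p in perm],
            [b1[p] for p in perm],
            [[row[p] for p in perm] for row in W2])


def same_output(a, b):
    return all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(a, b))


W1 = [[1.0, -2.0], [0.5, 1.5], [-1.0, 0.25]]
b1 = [0.5, -1.0, 0.0]
W2 = [[2.0, -1.0, 0.5]]
b2 = [0.25]
x = [1.0, 2.0]

base = forward(W1, b1, W2, b2, x)
scaled = forward(*rescale_unit(W1, b1, W2, unit=1, lam=4.0), b2, x)
permuted = forward(*permute_units(W1, b1, W2, perm=[2, 0, 1]), b2, x)
print(same_output(base, scaled), same_output(base, permuted))
```

The rescaling works because relu(lam * z) = lam * relu(z) for lam > 0; a negative factor would break the identity, which is why the helper rejects it.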


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a 'smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly and has a classification accuracy that is comparable to the original network.


Entropy ◽  
2020 ◽  
Vol 22 (2) ◽  
pp. 204
Author(s):  
Matteo Zambra ◽  
Amos Maritan ◽  
Alberto Testolin

Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called network motifs. In this article, we show that network science tools can also be successfully applied to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Simulations show that the final network topology is shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.


2020 ◽  
Vol 34 (10) ◽  
pp. 13791-13792
Author(s):  
Liangzhu Ge ◽  
Yuexian Hou ◽  
Yaju Jiang ◽  
Shuai Yao ◽  
Chao Yang

Despite their widespread applications, deep neural networks often tend to overfit the training data. Here, we propose a measure called VECA (Variance of Eigenvalues of Covariance matrix of Activation matrix) and demonstrate that VECA is a good predictor of networks' generalization performance during the training process. Experiments performed on fully-connected networks and convolutional neural networks trained on benchmark image datasets show a strong correlation between test loss and VECA, which suggests that we can calculate VECA to estimate generalization performance without sacrificing training data for use as a validation set.
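A quantity of this kind can be sketched in plain Python without an eigenvalue solver. The abstract gives only the name of the measure, so the details here are assumptions: the activation matrix has one row per sample and one column per unit, the covariance uses the 1/(m − 1) sample estimate, and the variance is the population variance of the eigenvalues. The function names are hypothetical. For a symmetric n × n matrix C, the eigenvalues sum to tr(C) and their squares sum to tr(C²), so the variance follows from the two traces.

```python
def covariance(acts):
    """Sample covariance (n x n) of an m x n activation matrix, with m >= 2."""
    m, n = len(acts), len(acts[0])
    means = [sum(row[j] for row in acts) / m for j in range(n)]
    centered = [[row[j] - means[j] for j in range(n)] for row in acts]
    return [[sum(r[a] * r[b] for r in centered) / (m - 1) for b in range(n)]
            for a in range(n)]


def eigenvalue_variance(C):
    """Population variance of the eigenvalues of a symmetric matrix C.

    Uses mean(eig) = tr(C) / n and mean(eig**2) = tr(C @ C) / n.
    """
    n = len(C)
    trace = sum(C[i][i] for i in range(n))
    trace_sq = sum(C[i][j] * C[j][i] for i in range(n) for j in range(n))
    return trace_sq / n - (trace / n) ** 2


def veca_sketch(acts):
    return eigenvalue_variance(covariance(acts))


# Four samples of a three-unit layer (made-up numbers for illustration).
activations = [
    [0.0, 1.0, 2.0],
    [1.0, 0.5, 2.5],
    [2.0, 0.0, 1.0],
    [3.0, 1.5, 0.5],
]
print(veca_sketch(activations))
```

The trace identity avoids computing individual eigenvalues, which keeps the sketch dependency-free; a numerical library's symmetric eigensolver would serve equally well.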

