deep neural networks
Recently Published Documents


TOTAL DOCUMENTS

7014
(FIVE YEARS 6344)

H-INDEX

95
(FIVE YEARS 67)

2022 ◽  
Vol 54 (8) ◽  
pp. 1-36
Author(s):  
Xingwei Zhang ◽  
Xiaolong Zheng ◽  
Wenji Mao

Deep neural networks (DNNs) have been verified to be easily attacked by well-designed adversarial perturbations. Image objects with small perturbations that are imperceptible to human eyes can induce DNN-based image class classifiers towards making erroneous predictions with high probability. Adversarial perturbations can also fool real-world machine learning systems and transfer between different architectures and datasets. Recently, defense methods against adversarial perturbations have become a hot topic and attracted much attention. A large number of works have been put forward to defend against adversarial perturbations, enhancing DNN robustness against potential attacks, or interpreting the origin of adversarial perturbations. In this article, we provide a comprehensive survey on classical and state-of-the-art defense methods by illuminating their main concepts, in-depth algorithms, and fundamental hypotheses regarding the origin of adversarial perturbations. In addition, we further discuss potential directions of this domain for future researchers.


2022 ◽  
Vol 15 (3) ◽  
pp. 1-31
Author(s):  
Shulin Zeng ◽  
Guohao Dai ◽  
Hanbo Sun ◽  
Jun Liu ◽  
Shiyao Li ◽  
...  

INFerence-as-a-Service (INFaaS) has become a primary workload in the cloud. However, existing FPGA-based Deep Neural Network (DNN) accelerators are mainly optimized for the fastest speed of a single task, while the multi-tenancy of INFaaS has not been explored yet. As the demand for INFaaS keeps growing, simply increasing the number of FPGA-based DNN accelerators is not cost-effective, while merely sharing these single-task optimized DNN accelerators in a time-division multiplexing way could lead to poor isolation and high-performance loss for INFaaS. On the other hand, current cloud-based DNN accelerators have excessive compilation overhead, especially when scaling out to multi-FPGA systems for multi-tenant sharing, leading to unacceptable compilation costs for both offline deployment and online reconfiguration. Therefore, it is far from providing efficient and flexible FPGA virtualization for public and private cloud scenarios. Aiming to solve these problems, we propose a unified virtualization framework for general-purpose deep neural networks in the cloud, enabling multi-tenant sharing for both the Convolution Neural Network (CNN), and the Recurrent Neural Network (RNN) accelerators on a single FPGA. The isolation is enabled by introducing a two-level instruction dispatch module and a multi-core based hardware resources pool. Such designs provide isolated and runtime-programmable hardware resources, which further leads to performance isolation for multi-tenant sharing. On the other hand, to overcome the heavy re-compilation overheads, a tiling-based instruction frame package design and a two-stage static-dynamic compilation, are proposed. Only the lightweight runtime information is re-compiled with ∼1 ms overhead, thus guaranteeing the private cloud’s performance. Finally, the extensive experimental results show that the proposed virtualized solutions achieve up to 3.12× and 6.18× higher throughput in the private cloud compared with the static CNN and RNN baseline designs, respectively.


2022 ◽  
Vol 18 (2) ◽  
pp. 1-25
Author(s):  
Saransh Gupta ◽  
Mohsen Imani ◽  
Joonseop Sim ◽  
Andrew Huang ◽  
Fan Wu ◽  
...  

Stochastic computing (SC) reduces the complexity of computation by representing numbers with long streams of independent bits. However, increasing performance in SC comes with either an increase in area or a loss in accuracy. Processing in memory (PIM) computes data in-place while having high memory density and supporting bit-parallel operations with low energy consumption. In this article, we propose COSMO, an architecture for co mputing with s tochastic numbers in me mo ry, which enables SC in memory. The proposed architecture is general and can be used for a wide range of applications. It is a highly dense and parallel architecture that supports most SC encodings and operations in memory. It maximizes the performance and energy efficiency of SC by introducing several innovations: (i) in-memory parallel stochastic number generation, (ii) efficient implication-based logic in memory, (iii) novel memory bit line segmenting, (iv) a new memory-compatible SC addition operation, and (v) enabling flexible block allocation. To show the generality and efficiency of our stochastic architecture, we implement image processing, deep neural networks (DNNs), and hyperdimensional (HD) computing on the proposed hardware. Our evaluations show that running DNN inference on COSMO is 141× faster and 80× more energy efficient as compared to GPU.


2022 ◽  
Vol 18 (2) ◽  
pp. 1-22
Author(s):  
Gokul Krishnan ◽  
Sumit K. Mandal ◽  
Chaitali Chakrabarti ◽  
Jae-Sun Seo ◽  
Umit Y. Ogras ◽  
...  

With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions—one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6 × improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.


2022 ◽  
Vol 18 (1) ◽  
pp. 1-27
Author(s):  
Ran Xu ◽  
Rakesh Kumar ◽  
Pengcheng Wang ◽  
Peter Bai ◽  
Ganga Meghanath ◽  
...  

Videos take a lot of time to transport over the network, hence running analytics on the live video on embedded or mobile devices has become an important system driver. Considering such devices, e.g., surveillance cameras or AR/VR gadgets, are resource constrained, although there has been significant work in creating lightweight deep neural networks (DNNs) for such clients, none of these can adapt to changing runtime conditions, e.g., changes in resource availability on the device, the content characteristics, or requirements from the user. In this article, we introduce ApproxNet, a video object classification system for embedded or mobile clients. It enables novel dynamic approximation techniques to achieve desired inference latency and accuracy trade-off under changing runtime conditions. It achieves this by enabling two approximation knobs within a single DNN model rather than creating and maintaining an ensemble of models, e.g., MCDNN [MobiSys-16]. We show that ApproxNet can adapt seamlessly at runtime to these changes, provides low and stable latency for the image and video frame classification problems, and shows the improvement in accuracy and latency over ResNet [CVPR-16], MCDNN [MobiSys-16], MobileNets [Google-17], NestDNN [MobiCom-18], and MSDNet [ICLR-18].


2022 ◽  
Vol 134 ◽  
pp. 104080
Author(s):  
Ran Wang ◽  
Vahid Asghari ◽  
Clara Man Cheung ◽  
Shu-Chien Hsu ◽  
Chia-Jung Lee

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-30
Author(s):  
Jacob Laurel ◽  
Rem Yang ◽  
Gagandeep Singh ◽  
Sasa Misailovic

We present a novel abstraction for bounding the Clarke Jacobian of a Lipschitz continuous, but not necessarily differentiable function over a local input region. To do so, we leverage a novel abstract domain built upon dual numbers, adapted to soundly over-approximate all first derivatives needed to compute the Clarke Jacobian. We formally prove that our novel forward-mode dual interval evaluation produces a sound, interval domain-based over-approximation of the true Clarke Jacobian for a given input region. Due to the generality of our formalism, we can compute and analyze interval Clarke Jacobians for a broader class of functions than previous works supported – specifically, arbitrary compositions of neural networks with Lipschitz, but non-differentiable perturbations. We implement our technique in a tool called DeepJ and evaluate it on multiple deep neural networks and non-differentiable input perturbations to showcase both the generality and scalability of our analysis. Concretely, we can obtain interval Clarke Jacobians to analyze Lipschitz robustness and local optimization landscapes of both fully-connected and convolutional neural networks for rotational, contrast variation, and haze perturbations, as well as their compositions.


Sign in / Sign up

Export Citation Format

Share Document