A Uniform Architecture Design for Accelerating 2D and 3D CNNs on FPGAs

Electronics ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 65 ◽  
Author(s):  
Zhiqiang Liu ◽  
Paul Chow ◽  
Jinwei Xu ◽  
Jingfei Jiang ◽  
Yong Dou ◽  
...  

Three-dimensional convolutional neural networks (3D CNNs) have gained popularity in many complicated computer vision applications. Many customized FPGA-based accelerators have been proposed for 2D CNNs, while very few target 3D CNNs. 3D CNNs are far more computationally intensive, and the additional dimension further expands the design space, making it a big challenge to accelerate 3D CNNs on FPGAs. Motivated by the finding that the computation patterns of 2D and 3D CNNs are very similar, we propose a uniform architecture design for accelerating both 2D and 3D CNNs in this paper. The uniform architecture is based on the idea of mapping convolutions to matrix multiplications. A customized mapping module is developed to generate the feature matrix tilings with no need to store the entire enlarged feature matrix on-chip or off-chip, a splitting strategy is adopted to reconstruct a convolutional layer to adapt to the on-chip memory capacity, and a 2D multiply-and-accumulate (MAC) array is adopted to compute matrix multiplications efficiently. For demonstration, we implement an accelerator prototype with a high-level synthesis (HLS) methodology on a Xilinx VC709 board and test the accelerator on three typical CNN models: AlexNet, VGG16, and C3D. Experimental results show that the accelerator achieves state-of-the-art throughput performance on both 2D and 3D CNNs, with much better energy efficiency than the CPU and GPU.
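
The idea of mapping convolutions to matrix multiplications can be illustrated with a minimal NumPy sketch. It is not the paper's hardware design: the function name `conv_as_matmul` is hypothetical, stride 1 and no padding are assumed, and, unlike the accelerator described above, the sketch materialises the whole enlarged feature matrix in memory instead of generating tiles on the fly. It only shows that the same code path serves 2D and 3D inputs.

```python
import numpy as np

def conv_as_matmul(x, w):
    """Stride-1, unpadded convolution (cross-correlation) as one matrix multiply.

    x: input feature maps, shape (C, *spatial) with 2 or 3 spatial dims
    w: weights, shape (M, C, *kernel)
    returns: output feature maps, shape (M, *out_spatial)
    """
    kernel = w.shape[2:]
    nd = len(kernel)
    # Every kernel-sized window of x: shape (C, *out_spatial, *kernel).
    windows = np.lib.stride_tricks.sliding_window_view(
        x, kernel, axis=tuple(range(1, x.ndim))
    )
    out_spatial = windows.shape[1:1 + nd]
    # Reorder to (*out_spatial, C, *kernel) so each window flattens to one row.
    windows = np.moveaxis(windows, 0, nd)
    # Enlarged feature matrix: one row per output position.
    feat = windows.reshape(int(np.prod(out_spatial)), -1)
    # Weight matrix: one row per output channel, flattened in (C, *kernel) order.
    wmat = w.reshape(w.shape[0], -1)
    return (wmat @ feat.T).reshape(w.shape[0], *out_spatial)
```

Calling it with a `(C, H, W)` input and `(M, C, kh, kw)` weights gives a 2D convolution; a `(C, D, H, W)` input with `(M, C, kd, kh, kw)` weights gives a 3D one, with only the shapes changing.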

VLSI Design ◽  
2012 ◽  
Vol 2012 ◽  
pp. 1-15 ◽  
Author(s):  
B. Bala Tripura Sundari

The high integration density in today's VLSI chips offers enormous computing power to be utilized by the design of parallel computing hardware. The implementation of computationally intensive algorithms, represented by n-dimensional (n-D) nested loop algorithms, onto a parallel array architecture is termed mapping. The methodologies adopted for mapping these algorithms onto parallel hardware often use heuristic search, which requires a lot of computational effort to obtain near-optimal solutions. We propose a new mapping procedure wherein a lower-dimensional subspace (of the n-D problem space) of the inner loops is identified, in which lies the computational expression that generates the output or outputs of the n-D problem. The processing element (PE) array is assigned to the identified subspace and is reused by assigning it to successive subspaces in consecutive clock cycles/periods (CPs) to complete the computational tasks of the n-D problem. This procedure is used to develop our proposed modified heuristic search to arrive at an optimal design, and complexity comparisons are given. The MATLAB results of the new search and the design space trade-off analysis using the high-level synthesis tool are presented for two typical computationally intensive nested loop algorithms: the 6D FSBM and the 4D edge detection, alternatively known as the 2D filtering algorithm.
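
To make the nested-loop view concrete, the 2D filtering algorithm can be written as a loop nest with four indices. The sketch below is illustrative only; the function name `filter2d` and the unpadded, stride-1 formulation are assumptions, not details from the paper. The two inner loops form a lower-dimensional subspace that produces one output value, which is the kind of subspace a PE array could be assigned to before moving on to the next output position.

```python
def filter2d(image, kernel):
    """2D filtering as a 4-deep loop nest over (i, j, p, q)."""
    height, width = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * (width - kw + 1) for _ in range(height - kh + 1)]
    # Outer loops: one iteration per output pixel.
    for i in range(height - kh + 1):
        for j in range(width - kw + 1):
            acc = 0
            # Inner loops: the subspace that computes a single output.
            for p in range(kh):
                for q in range(kw):
                    acc += image[i + p][j + q] * kernel[p][q]
            out[i][j] = acc
    return out
```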


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Hidetoshi Urakubo ◽  
Torsten Bullmann ◽  
Yoshiyuki Kubota ◽  
Shigeyuki Oba ◽  
Shin Ishii

Recently, there has been rapid expansion in the field of micro-connectomics, which targets the three-dimensional (3D) reconstruction of neuronal networks from stacks of two-dimensional (2D) electron microscopy (EM) images. The spatial scale of the 3D reconstruction increases rapidly owing to deep convolutional neural networks (CNNs) that enable automated image segmentation. Several research teams have developed their own software pipelines for CNN-based segmentation. However, the complexity of such pipelines makes their use difficult even for computer experts and impossible for non-experts. In this study, we developed a new software program, called UNI-EM, for 2D and 3D CNN-based segmentation. UNI-EM is a software collection for CNN-based EM image segmentation, including ground truth generation, training, inference, postprocessing, proofreading, and visualization. UNI-EM incorporates a set of 2D CNNs, i.e., U-Net, ResNet, HighwayNet, and DenseNet. We further wrapped flood-filling networks (FFNs) as a representative 3D CNN-based neuron segmentation algorithm. The 2D- and 3D-CNNs are known to demonstrate state-of-the-art level segmentation performance. We then provided two example workflows: mitochondria segmentation using a 2D CNN and neuron segmentation using FFNs. By following these example workflows, users can benefit from CNN-based segmentation without possessing knowledge of Python programming or CNN frameworks.


Electronics ◽  
2019 ◽  
Vol 8 (7) ◽  
pp. 803 ◽  
Author(s):  
Deguang Wang ◽  
Junzhong Shen ◽  
Mei Wen ◽  
Chunyuan Zhang

Three-dimensional (3D) deconvolution is widely used in many computer vision applications. However, most previous works have only focused on accelerating two-dimensional (2D) deconvolutional neural networks (DCNNs) on Field-Programmable Gate Arrays (FPGAs), while the acceleration of 3D DCNNs has not been studied in depth, as they have higher computational complexity and sparsity than 2D DCNNs. In this paper, we focus on the acceleration of both 2D and 3D sparse DCNNs on FPGAs by proposing efficient schemes for mapping 2D and 3D sparse DCNNs onto a uniform architecture. Firstly, a pruning method is used to prune unimportant network connections and increase the sparsity of weights. After pruning, the number of parameters of the DCNNs is reduced significantly without accuracy loss. Secondly, the remaining non-zero weights are encoded in coordinate (COO) format, reducing the memory demands of parameters. Finally, to demonstrate the effectiveness of our work, we implement our accelerator design on the Xilinx VC709 evaluation platform for four real-life 2D and 3D DCNNs. After the first two steps, the storage required by the DCNNs is reduced by up to 3.9×. Results show that our method on the accelerator outperforms our prior work by 2.5× to 3.6× in latency.
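
The first two steps (pruning, then COO encoding) can be sketched in NumPy as follows. The abstract does not say which pruning criterion is used, so magnitude-based pruning is assumed here purely for illustration, and the names `prune_by_magnitude`, `to_coo`, and `from_coo` are hypothetical.

```python
import numpy as np

def prune_by_magnitude(w, sparsity):
    """Zero the smallest-magnitude weights (assumed criterion).

    About `sparsity` * w.size weights are zeroed; ties at the threshold
    are zeroed as well.
    """
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the pruning threshold.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def to_coo(w):
    """Encode non-zero entries as (coordinates, values, dense shape)."""
    mask = w != 0
    coords = np.argwhere(mask)  # shape (nnz, ndim), row-major order
    values = w[mask]            # same row-major order as coords
    return coords, values, w.shape

def from_coo(coords, values, shape):
    """Rebuild the dense tensor from its COO encoding."""
    w = np.zeros(shape, dtype=values.dtype)
    w[tuple(coords.T)] = values
    return w
```

COO stores one coordinate tuple per surviving weight, so it only saves memory when the tensor is sparse enough that coordinates plus values take less space than the dense array.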


Geophysics ◽  
2021 ◽  
pp. 1-45
Author(s):  
Ronghua Peng ◽  
Bo Han ◽  
Yajun Liu ◽  
Xiangyun Hu

Forward modeling is vital for three-dimensional (3D) inversion and interpretation of electromagnetic (EM) data in anisotropic media, which is one of the major challenges in the field of EM geophysics. However, there are few freely available 3D codes that are capable of modeling EM responses in fully anisotropic media. Besides, most of the existing 3D EM codes are written in low-level languages such as C and Fortran, making them difficult to read, maintain, and extend. Taking advantage of recent progress in computer technology and numerical methods, we have developed an open-source package for forward modeling of frequency-domain EM fields in a fully 3D anisotropic earth (EM3DANI) using the Julia language, a relatively young, high-level programming language with a focus on high performance. Based on a mimetic finite-volume (MFV) discretization of the governing equations, the modeling algorithm is expressed in an abstract form in terms of matrices/vectors and thus can be easily implemented in any high-level language commonly used for numerical computing. Existing libraries written in low-level languages can be easily integrated into Julia code without the so-called two-language problem; we have therefore exploited several mature third-party packages to handle the computationally intensive parts of the forward modeling, which guarantees high stability and efficiency. We have elaborated the structure of the package, paying special attention to code usability, readability, and extendability, while striving to retain versatility and high performance. The effectiveness of the code is demonstrated through two 1D synthetic examples for magnetotellurics (MT) and controlled-source electromagnetics (CSEM) problems, respectively. High accuracy and efficiency can be achieved for both 1D examples. We further present a 3D example mimicking a marine CSEM survey scenario for hydrocarbon exploration. The simulation results indicate that the effect of the anisotropy on forward responses is significant, and can be comparable to that of the target reservoir.


2017 ◽  
Vol 2017 ◽  
pp. 1-12
Author(s):  
Andreas G. Savva ◽  
Theocharis Theocharides ◽  
Chrysostomos Nicopoulos

This work presents a design exploration framework for developing a high-level Artificial Neural Network (ANN) for fault detection in hardware systems. ANNs can be used for fault detection since they have excellent characteristics such as generalization capability, robustness, and fault tolerance. Designing an ANN for fault detection involves a number of parameters. In this work, those parameters are presented and analyzed based on simulations. Moreover, to evaluate the developed ANN, a case study scenario based on Networks on Chip is used for detection of interrouter link faults. Simulation results with various synthetic traffic models show that the proposed work can detect up to 96–99% of interrouter link faults with a delay of less than 60 cycles. In addition, the size of the ANN is kept relatively small, so it can be implemented in hardware easily. Synthesis results indicate an estimated power consumption of 0.0523 mW per neuron for the implemented ANN when computing a complete cycle.


VLSI Design ◽  
2012 ◽  
Vol 2012 ◽  
pp. 1-13 ◽  
Author(s):  
Roberta Piscitelli ◽  
Andy D. Pimentel

This paper presents a framework for high-level power estimation of multiprocessor systems-on-chip (MPSoC) architectures on FPGA. The technique is based on abstract execution profiles, called event signatures, and it operates at a higher level of abstraction than, for example, commonly used instruction-set simulator (ISS)-based power estimation methods and should thus be capable of achieving good evaluation performance. As a consequence, the technique can be very useful in the context of early system-level design space exploration. We integrated the power estimation technique in a system-level MPSoC synthesis framework. Subsequently, using this framework, we designed a range of different candidate architectures which contain different numbers of MicroBlaze processors and compared our power estimation results to those from real measurements on a Virtex-6 FPGA board.


Processes ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 447
Author(s):  
Johannes Möller ◽  
Ralf Pörtner

Techniques to provide in vitro tissue culture have undergone significant changes during the last decades, and current applications involve interactions of cells and organoids, three-dimensional cell co-cultures, and organ/body-on-chip tools. Computer-aided and mathematical model-based methods are required for efficient and knowledge-driven characterization, optimization, and routine manufacturing of tissue culture systems. As an alternative to purely experimental-driven research, the usage of comprehensive mathematical models as a virtual in silico representation of the tissue culture, namely a digital twin (DT), can be advantageous. Digital twins include mechanistic descriptions of the biological system in the form of diverse mathematical models, which describe the interaction between tissue culture techniques and cell growth, metabolism, and the quality of the tissue. In this review, current concepts, expectations, and the state of the art of digital twins for tissue culture concepts will be highlighted. In general, DTs can be applied along the full process chain and along the product life cycle. Due to the complexity, the focus of this review will be especially on the design, characterization, and operation of the tissue culture techniques.


2013 ◽  
Vol 13 (Supplement-1) ◽  
pp. 7-14
Author(s):  
S. Gavliakova ◽  
J. Plevkova ◽  
J. Jakus ◽  
I. Poliacek

Methods that have been applied so far to study the central neuronal circuits regulating cough and respiratory reflexes rely on in vivo and ex vivo recording, microinjection, and lesion methods. Based on the available data, it is clear that this network is complicated, multilevel, and holarchical, and that it undergoes reconfiguration under afferent inputs. For many students and researchers, it is difficult to form a virtual spatial image of these cooperating neuronal populations. The project aimed to create a graphical three-dimensional computer model of the brainstem, using the MATLAB environment and matrix algebra to visualize neuron localization within the brainstem. Relevant data for the model were taken from recent and earlier research papers published in the particular areas. This model may help scientists to visualize groups of neurons and to find targets for microinjection or lesion studies together with stereotaxic positioning. The model is upgradeable and highly flexible for future use in research and teaching applications in the MATLAB environment. MATLAB is a high-level language and interactive environment that enables you to perform computationally intensive tasks faster than with traditional programming languages.

