Photonic Integrated Reconfigurable Linear Processors as Neural Network Accelerators

2021 ◽  
Vol 11 (13) ◽  
pp. 6232
Author(s):  
Lorenzo De Marinis ◽  
Marco Cococcioni ◽  
Odile Liboiron-Ladouceur ◽  
Giampiero Contestabile ◽  
Piero Castoldi ◽  
...  

Reconfigurable linear optical processors can be used to perform linear transformations and are instrumental in effectively computing the matrix–vector multiplications required in each neural network layer. In this paper, we characterize and compare two thermally tuned photonic integrated processors, realized in silicon-on-insulator and silicon nitride platforms, suited for extracting feature maps in convolutional neural networks. The reduction in bit resolution when crossing the processor is mainly due to optical losses, and lies in the range of 2.3–3.3 bits for the silicon-on-insulator chip and 1.3–2.4 bits for the silicon nitride chip. However, the lower extinction ratio of the Mach–Zehnder elements in the latter platform limits that processor's expressivity (i.e., its capacity to implement any transformation) to 75%, compared to 97% for the former. Finally, the silicon-on-insulator processor outperforms the silicon nitride one in terms of footprint and energy efficiency.
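The core operation here is a programmable matrix–vector product. As a minimal numerical sketch of how a mesh-based linear processor can realize an arbitrary weight matrix, the snippet below factors the matrix into two unitary stages (which Mach–Zehnder meshes can implement) and a diagonal attenuation stage via the singular value decomposition; it models neither fabricated chip, and all names are illustrative.

```python
import numpy as np

# M = U @ diag(s) @ Vh: the two unitaries map onto MZI meshes, the
# singular values onto the attenuation stage between them.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))     # target neural-network layer weights
U, s, Vh = np.linalg.svd(M)

x = rng.standard_normal(4)          # input vector, encoded on optical amplitudes
y_optical = U @ (s * (Vh @ x))      # stage-by-stage propagation through the mesh
y_direct = M @ x                    # electronic reference

assert np.allclose(y_optical, y_direct)
```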

Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4115 ◽  
Author(s):  
Yuxia Li ◽  
Bo Peng ◽  
Lei He ◽  
Kunlong Fan ◽  
Zhenxu Li ◽  
...  

Roads are vital components of infrastructure, and their extraction has become a topic of significant interest in the field of remote sensing. Because deep learning has become a popular method in image processing and information extraction, researchers have paid increasing attention to extracting roads with neural networks. This article proposes improvements to neural networks for extracting roads from Unmanned Aerial Vehicle (UAV) remote sensing images. D-LinkNet was first considered for its high performance; however, the huge scale of the network reduced computational efficiency. Focusing on the low computational efficiency of the popular D-LinkNet, this article makes the following improvements: (1) replace the initial block with a stem block; (2) rebuild the entire network based on ResNet units with a new structure, yielding an improved neural network, D-LinkNetPlus; (3) add a 1 × 1 convolution layer before DBlock to reduce the number of input feature maps, which reduces parameters and improves computational efficiency, and another 1 × 1 convolution layer after DBlock to recover the required number of output channels, yielding a further improved network, B-D-LinkNetPlus. Comparisons were performed between the networks, and verification was carried out on the Massachusetts Roads Dataset. The results show that the improved neural networks help reduce network size and improve the precision of road extraction.
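Improvement (3) is the classic bottleneck trick. The PyTorch sketch below shows the idea under illustrative channel counts; `BottleneckedDBlock` and its inner dilated block are stand-ins, not the paper's actual modules.

```python
import torch
import torch.nn as nn

class BottleneckedDBlock(nn.Module):
    """A 1x1 conv reduces the channel count before a dilation block,
    and another 1x1 conv restores it afterwards, cutting parameters
    and compute inside the block."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1)
        self.dblock = nn.Sequential(  # placeholder dilated convolutions
            nn.Conv2d(reduced, reduced, 3, padding=1, dilation=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduced, reduced, 3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
        )
        self.restore = nn.Conv2d(reduced, channels, kernel_size=1)

    def forward(self, x):
        return self.restore(self.dblock(self.reduce(x)))

x = torch.randn(1, 512, 32, 32)
print(BottleneckedDBlock(512, 128)(x).shape)  # torch.Size([1, 512, 32, 32])
```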


Author(s):  
Jörg Bornschein

An FPGA-based coprocessor has been implemented which simulates the dynamics of a large recurrent neural network composed of binary neurons. The design has been used for unsupervised learning of receptive fields. Since the number of neurons to be simulated (>10^4) exceeds the available FPGA logic capacity for direct implementation, a set of streaming processors has been designed. Given the state and activity vectors of the neurons at time t and a sparse connectivity matrix, these streaming processors calculate the state and activity vectors for time t + 1. The operation implemented by the streaming processors can be understood as a generalized form of a sparse matrix–vector product (SpMxV). The largest dataset, the sparse connectivity matrix, is stored and processed in a compressed format to better utilize the available memory bandwidth.
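In software terms, one time step of the simulation is a sparse matrix–vector product followed by a binary threshold. The sketch below assumes a CSR-style compressed connectivity matrix; the coprocessor's actual on-chip format may differ.

```python
import numpy as np

def step(indptr, indices, weights, activity, threshold=1.0):
    """One time step: generalized SpMxV over the compressed connectivity
    matrix, then a binary threshold, giving the activity vector at t + 1."""
    n = len(indptr) - 1
    drive = np.zeros(n)
    for i in range(n):                       # one "stream" per output neuron
        row = slice(indptr[i], indptr[i + 1])
        drive[i] = weights[row] @ activity[indices[row]]
    return (drive > threshold).astype(np.uint8)

# Tiny 3-neuron example in CSR layout
indptr  = np.array([0, 2, 3, 4])
indices = np.array([0, 2, 1, 0])
weights = np.array([0.8, 0.9, 1.5, 2.0])
print(step(indptr, indices, weights, np.array([1, 1, 1], dtype=np.uint8)))
```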


2021 ◽  
Author(s):  
Xianmeng Zhao ◽  
Haibin Lv ◽  
Cheng Chen ◽  
Shenjie Tang ◽  
Xiaoping Liu ◽  
...  

Implementing artificial neural networks on integrated platforms has generated significant interest in recent years. Several architectures for on-chip optical networks with basic functionalities have been successfully demonstrated, for example, optical spiking neurosynaptic networks, photonic convolution accelerators, and nanophotonic/electronic hybrid deep neural networks. In this work, we propose a layered coherent silicon-on-insulator diffractive optical neural network whose inter-layer phase delays can be actively tuned. By forming a closed loop with control electronics, we further demonstrate that our fabricated on-chip neural network can be trained in situ and consequently reconfigured to perform various tasks, including full-adder operation and vowel recognition, while achieving almost the same accuracy as networks trained on conventional computers. Our results show that the proposed optical neural network could pave the way for future optical artificial intelligence hardware.
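As a loose software analogue (not a model of the fabricated device), the sketch below propagates a coherent field through layers of fixed diffraction, here random unitaries, interleaved with tunable per-channel phase delays, which are the quantities a control loop would adjust during in situ training.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(n):
    # Stand-in for a fixed diffractive region between layers
    q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return q

n, layers = 8, 3
diffraction = [random_unitary(n) for _ in range(layers)]
phases = [rng.uniform(0, 2 * np.pi, n) for _ in range(layers)]  # actively tuned

def forward(field):
    for D, phi in zip(diffraction, phases):
        field = np.exp(1j * phi) * (D @ field)   # diffraction, then phase delay
    return np.abs(field) ** 2                    # detected output intensities

print(forward(np.ones(n, dtype=complex) / np.sqrt(n)))
```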


Author(s):  
Kai-Uwe Demasius ◽  
Aron Kirschen ◽  
Stuart Parkin

Data-intensive computing operations, such as training neural networks, are essential for applications in artificial intelligence but are energy intensive. One solution is to develop specialized hardware onto which neural networks can be directly mapped, and arrays of memristive devices can, for example, be trained to enable parallel multiply–accumulate operations. Here we show that memcapacitive devices that exploit the principle of charge shielding can offer a highly energy-efficient approach for implementing parallel multiply–accumulate operations. We fabricate a crossbar array of 156 microscale memcapacitor devices and use it to train a neural network that could distinguish the letters ‘M’, ‘P’ and ‘I’. Modelling these arrays suggests that this approach could offer an energy efficiency of 29,600 tera-operations per second per watt, while ensuring high precision (6–8 bits). Simulations also show that the devices could potentially be scaled down to a lateral size of around 45 nm.
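Functionally, a crossbar multiply–accumulate is a vector–matrix product computed in one shot: inputs drive the rows and each column accumulates its weighted contributions. The sketch below captures only this functional view, with a 6-bit quantizer standing in for the reported 6–8 bit precision; no device physics is modelled, and the 12 × 13 shape merely echoes the 156-device count.

```python
import numpy as np

def quantize(w, bits=6):
    # Map weights in [-1, 1] onto 2**bits - 1 discrete analog levels
    levels = 2 ** bits - 1
    return np.round((w + 1) / 2 * levels) / levels * 2 - 1

rows, cols = 12, 13                     # cf. the 156-device array
rng = np.random.default_rng(2)
W = quantize(rng.uniform(-1, 1, (rows, cols)))
x = rng.uniform(0, 1, rows)             # input drive on the rows
column_outputs = x @ W                  # all columns accumulate in parallel
print(column_outputs.shape)             # (13,)
```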


2021 ◽  
Vol 279 ◽  
pp. 01017
Author(s):  
Evgeniy Ivliev ◽  
Pavel Obukhov ◽  
Viktor Ivliev ◽  
Denis Medvedev ◽  
Viktor Martynov

The article is devoted to the development and analysis of methods for identifying dynamic objects. A neural network with the SSD MobileNetV2 architecture has been developed to solve the problem of detecting baggage tags and barcodes. Several approaches are considered for the problem of recognizing alphanumeric information: Tesseract, SSD InceptionV2, OpenCV, and a convolutional neural network. The efficiency of the methods was verified on real images. It was concluded that electricity consumption can be reduced by 49.43%.
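For the Tesseract branch of such a comparison, a typical pipeline binarizes the detected tag region before recognition. The sketch below uses the pytesseract wrapper and OpenCV; the file path and preprocessing are illustrative only, and a local Tesseract installation is assumed.

```python
import cv2
import pytesseract

img = cv2.imread("baggage_tag.png")               # hypothetical detector crop
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu thresholding to separate dark print from the tag background
_, binarized = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
text = pytesseract.image_to_string(binarized, config="--psm 7")  # single text line
print(text.strip())
```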


2017 ◽  
Vol 115 (2) ◽  
pp. 254-259 ◽  
Author(s):  
Daniël M. Pelt ◽  
James A. Sethian

Deep convolutional neural networks have been successfully applied to many image-processing problems in recent works. Popular network architectures often add additional operations and connections to the standard architecture to enable training deeper networks. To achieve accurate results in practice, a large number of trainable parameters are often required. Here, we introduce a network architecture based on using dilated convolutions to capture features at different image scales and densely connecting all feature maps with each other. The resulting architecture is able to achieve accurate results with relatively few parameters and consists of a single set of operations, making it easier to implement, train, and apply in practice, and automatically adapts to different problems. We compare results of the proposed network architecture with popular existing architectures for several segmentation problems, showing that the proposed architecture is able to achieve accurate results with fewer parameters, with a reduced risk of overfitting the training data.
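The two ingredients, dilated convolutions and dense connectivity, compose as follows: each layer applies one dilated 3 × 3 convolution to the concatenation of all previous feature maps, with the dilation cycling across layers. The PyTorch sketch below uses illustrative depth and width, not the paper's settings.

```python
import torch
import torch.nn as nn

class MixedScaleDense(nn.Module):
    """Every layer is one dilated 3x3 convolution over the concatenation
    of ALL previous feature maps; dilations cycle to mix image scales."""
    def __init__(self, in_channels, depth=8, width=1, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.convs = nn.ModuleList()
        c = in_channels
        for i in range(depth):
            d = dilations[i % len(dilations)]
            self.convs.append(nn.Conv2d(c, width, 3, padding=d, dilation=d))
            c += width
        self.final = nn.Conv2d(c, 1, kernel_size=1)   # e.g. binary segmentation

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        return self.final(torch.cat(feats, dim=1))

print(MixedScaleDense(1)(torch.randn(1, 1, 64, 64)).shape)  # (1, 1, 64, 64)
```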


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of machine learning challenges. However, to reach the best performance, they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its fullest, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
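The recursion is the whole trick: one convolution's weights are reused at every step, so added depth costs no extra convolutional parameters. A hedged PyTorch sketch with illustrative hyperparameters, not ThriftyNet's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThriftyBlockSketch(nn.Module):
    """A single convolution applied recursively, with per-iteration
    normalization, a residual shortcut, and periodic downsampling."""
    def __init__(self, channels=64, iterations=12, pool_every=4, classes=10):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # the ONE conv
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(iterations))
        self.pool_every = pool_every
        self.head = nn.Linear(channels, classes)

    def forward(self, x):
        for t, norm in enumerate(self.norms):
            x = x + torch.relu(norm(self.conv(x)))     # recursion + shortcut
            if (t + 1) % self.pool_every == 0:
                x = F.max_pool2d(x, 2)                  # downsampling
        return self.head(x.mean(dim=(2, 3)))            # global average pooling

x = torch.randn(2, 64, 32, 32)   # assumes an embedding to 64 channels upstream
print(ThriftyBlockSketch()(x).shape)    # torch.Size([2, 10])
```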


Author(s):  
Paweł Tarasiuk ◽  
Piotr S. Szczepaniak

This paper presents a novel method for improving the invariance of convolutional neural networks (CNNs) to selected geometric transformations in order to obtain more efficient image classifiers. A common strategy employed to achieve this aim is to train the network using data augmentation. Such a method alone, however, increases the complexity of the neural network model, as any change in the rotation or size of the input image results in the activation of different CNN feature maps. This problem can be resolved by the proposed novel convolutional neural network models with geometric transformations embedded into the network architecture. The evaluation of the proposed CNN model is performed on the image classification task with the use of diverse, representative data sets. The CNN models with embedded geometric transformations are compared to those without the transformations, using different data augmentation setups. As the compared approaches use the same amount of memory to store the parameters, the improved classification score means that the proposed architecture is the more effective one.
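One common way to embed a geometric transformation into the architecture, sketched below, is to run a shared convolution over transformed copies of the input and pool over the transformation group; invariance then costs no additional parameters. This illustrates the general idea only, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class RotationInvariantConv(nn.Module):
    """Apply one SHARED convolution to 90-degree-rotated copies of the
    input, rotate the responses back into alignment, and max-pool over
    orientations, so the output is invariant to those rotations."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, x):
        responses = [self.conv(torch.rot90(x, k, dims=(2, 3))) for k in range(4)]
        aligned = [torch.rot90(r, -k, dims=(2, 3)) for k, r in enumerate(responses)]
        return torch.stack(aligned).max(dim=0).values   # orientation pooling

x = torch.randn(1, 3, 28, 28)
print(RotationInvariantConv(3, 16)(x).shape)   # torch.Size([1, 16, 28, 28])
```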


2021 ◽  
Vol 24 (68) ◽  
pp. 21-32
Author(s):  
Yaming Cao ◽  
Zhen Yang ◽  
Chen Gao

Convolutional neural networks (CNNs) have shown strong learning capabilities in computer vision tasks such as classification and detection. Especially with the introduction of excellent detection models such as YOLO (V1, V2 and V3) and Faster R-CNN, CNNs have greatly improved detection efficiency and accuracy. However, due to the special viewing angle, small target sizes, few features, and complicated backgrounds, CNNs that perform well on ground-perspective datasets fail to reach good detection accuracy on remote sensing image datasets. To this end, based on the YOLO V3 model, we used feature maps of different depths as detection outputs to explore the reasons for the poor detection rate of deep neural networks on small targets in remote sensing images. We also analyzed the effect of network depth on small target detection, and found that excessively deep semantic information contributes little to small target detection. Finally, verification on the VEDAI dataset shows that fusing shallow feature maps, which carry precise location information, with deep feature maps, which carry rich semantics, can effectively improve the accuracy of small target detection in remote sensing images.
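The described fusion upsamples a deep, semantically rich map to the resolution of a shallow map that preserves precise locations, then concatenates the two before detection. The PyTorch sketch below uses illustrative channel counts and spatial sizes, not YOLO V3's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowDeepFusion(nn.Module):
    """Project the deep map with a 1x1 conv, upsample it to the shallow
    map's resolution, concatenate, and fuse with a 3x3 conv."""
    def __init__(self, shallow_ch=64, deep_ch=256, fused_ch=128):
        super().__init__()
        self.lateral = nn.Conv2d(deep_ch, shallow_ch, kernel_size=1)
        self.fuse = nn.Conv2d(2 * shallow_ch, fused_ch, 3, padding=1)

    def forward(self, shallow, deep):
        up = F.interpolate(self.lateral(deep), size=shallow.shape[2:], mode="nearest")
        return torch.relu(self.fuse(torch.cat([shallow, up], dim=1)))

shallow = torch.randn(1, 64, 52, 52)   # precise locations
deep = torch.randn(1, 256, 13, 13)     # rich semantics
print(ShallowDeepFusion()(shallow, deep).shape)  # torch.Size([1, 128, 52, 52])
```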


2021 ◽  
Vol 15 ◽  
Author(s):  
Yiyang Li ◽  
T. Patrick Xiao ◽  
Christopher H. Bennett ◽  
Erik Isele ◽  
Armantas Melianas ◽  
...  

In-memory computing based on non-volatile resistive memory can significantly improve the energy efficiency of artificial neural networks. However, accurate in situ training has been challenging due to the nonlinear and stochastic switching of the resistive memory elements. One promising analog memory is the electrochemical random-access memory (ECRAM), also known as the redox transistor. Its low write currents and linear switching properties across hundreds of analog states enable accurate and massively parallel updates of a full crossbar array, which yield rapid and energy-efficient training. While simulations predict that ECRAM-based neural networks can achieve high training accuracy at significantly higher energy efficiency than digital implementations, these predictions have not been experimentally realized. In this work, we train a 3 × 3 array of ECRAM devices that learns to discriminate several elementary logic gates (AND, OR, NAND). We record the evolution of the network's synaptic weights during parallel in situ (on-line) training with outer-product updates. Owing to the linear and reproducible switching characteristics of the devices, our crossbar simulations not only accurately predict the number of epochs to convergence but also quantitatively capture the evolution of the weights in individual devices. This first implementation of in situ parallel training, together with the strong agreement with simulation results, is a significant step toward scaling ECRAM into larger crossbar arrays for artificial neural network accelerators, which could enable orders-of-magnitude improvements in the energy efficiency of deep neural networks.
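The parallel update at the heart of this scheme is a rank-one (outer-product) weight change applied to the whole array at once. The sketch below trains an idealized 3 × 3 array, two logic inputs plus a bias against AND/OR/NAND outputs, with outer-product updates; real ECRAM devices add noise and conductance limits that this omits.

```python
import numpy as np

# Idealized 3 x 3 crossbar: rows = two logic inputs + bias,
# columns = AND, OR, NAND outputs.
rng = np.random.default_rng(3)
W = rng.uniform(-0.1, 0.1, (3, 3))                 # analog weight array

X = np.array([[a, b, 1.0] for a in (0, 1) for b in (0, 1)])
T = np.array([[a & b, a | b, 1 - (a & b)] for a, b, _ in X.astype(int)])

lr = 0.5
for epoch in range(1000):
    y = 1 / (1 + np.exp(-(X @ W)))                 # parallel column read-out
    W += lr * X.T @ (T - y)                        # sum of per-sample outer products

print(np.round(1 / (1 + np.exp(-(X @ W)))))        # recovers T for all three gates
```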

