Nano-Crossbar Weighted Memristor-Based Convolution Neural Network Architecture for High-Performance Artificial Intelligence Applications

2021 ◽  
Vol 21 (3) ◽  
pp. 1833-1844
Author(s):  
Kyojin Kim ◽  
Kamran Eshraghian ◽  
Hyunsoo Kang ◽  
Kyoungrok Cho

Nano memristor crossbar arrays, which can represent analog signals in smaller silicon areas, are widely used to implement the node weights of neural networks. Crossbar arrays provide high computational efficiency because they perform addition and multiplication simultaneously at each cross-point. In this study, we propose a new memristor crossbar array architecture that places multiple weighted nano memristors at each cross-point. Because the proposed architecture can represent multiple integer-valued weights, it improves the precision of the weight coefficients compared with existing memristor-based neural networks. This study presents a Radix-11 nano memristor crossbar array with weighted memristors and validates the operation of circuits that use these arrays through circuit-level simulation. The proposed Radix-11 approach can represent eleven integer-valued weights. In addition, this study presents a neural network designed with the proposed Radix-11 weights as an example of a high-performance AI application. The neural network, designed on a TensorFlow platform, implements a speech-keyword detection algorithm that recognizes 35 Korean words with an inference accuracy of 95.45%, only about 2% below the 97.53% accuracy of the real-valued weight case.
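To make the weighting scheme concrete, the following is a minimal sketch, in plain NumPy, of how real-valued weights might be quantized to eleven integer levels and how a crossbar column sums currents at its cross-points. The level range of -5 to +5, the unit conductance `g_unit`, and the differential positive/negative column mapping are illustrative assumptions, not the paper's circuit parameters.

```python
# Hedged sketch: map real weights to eleven integer levels (-5..+5) and
# emulate the analog dot product a crossbar column performs via
# Kirchhoff's current law. Parameters are illustrative assumptions.
import numpy as np

def quantize_radix11(w, w_max=None):
    """Quantize real-valued weights to the 11 integers in [-5, 5]."""
    w_max = w_max or np.max(np.abs(w))
    levels = np.round(w / w_max * 5.0)           # -5 .. +5
    return np.clip(levels, -5, 5).astype(int)

def crossbar_column_current(v_in, int_weights, g_unit=1e-6):
    """One cross-point per weight: column current I = sum_i V_i * G_i.
    Signed weights are assumed to be realized as a positive/negative
    conductance pair (differential column readout)."""
    g_pos = np.where(int_weights > 0, int_weights, 0) * g_unit
    g_neg = np.where(int_weights < 0, -int_weights, 0) * g_unit
    return v_in @ g_pos - v_in @ g_neg           # net column current

w = np.random.randn(8)                           # one neuron's weights
q = quantize_radix11(w)
v = np.random.rand(8)                            # input voltages
print(q, crossbar_column_current(v, q))
```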

Author(s):  
Dolly Sapra ◽  
Andy D. Pimentel

The automated architecture search methodology for neural networks is known as Neural Architecture Search (NAS). In recent years, Convolutional Neural Networks (CNNs) designed through NAS methodologies have achieved very high performance in several fields, for instance image classification and natural language processing. Our work is in the same domain of NAS, where we traverse the search space of neural network architectures with the help of an evolutionary algorithm augmented with a novel approach of piecemeal-training. In contrast to previously published NAS techniques, in which training with the given data is treated as an isolated task used to estimate the performance of neural networks, our work demonstrates that a neural network architecture and the related weights can be jointly learned by combining concepts of the traditional training process and evolutionary architecture search in a single algorithm. The consolidation is realized by breaking the conventional training technique into smaller slices and collating them within an integrated evolutionary architecture search algorithm. Constraints on the architecture search space are imposed by limiting its various parameters to a specified range of values, consequently regulating the neural network's size and memory requirements. We validate this concept on two vastly different datasets: the CIFAR-10 dataset in the domain of image classification, and the PAMAP2 dataset in the Human Activity Recognition (HAR) domain. Starting from randomly initialized and untrained CNNs, the algorithm discovers models with competent architectures, which after complete training reach an accuracy of 92.5% for CIFAR-10 and 94.36% for PAMAP2. We further extend the algorithm to include an additional conflicting search objective: the number of parameters of the neural network. Our multi-objective algorithm produces a Pareto optimal set of neural networks by optimizing the search for both accuracy and parameter count, emphasizing the versatility of our approach.
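The interleaving of training slices with evolutionary selection can be illustrated with a toy example. The sketch below assumes a trivial search space (hidden widths of a small MLP on synthetic data) and a simple keep-the-fittest-half selection rule, neither of which is the paper's actual setup; it shows how each generation trains every candidate for only a short slice of steps, so surviving models carry their partially trained weights into the next generation.

```python
# Hedged sketch of piecemeal training inside an evolutionary loop.
# Search space, mutation rule, and data are illustrative assumptions.
import random
import torch
import torch.nn as nn

X, y = torch.randn(512, 16), torch.randint(0, 4, (512,))

def build(width):
    return nn.Sequential(nn.Linear(16, width), nn.ReLU(), nn.Linear(width, 4))

def train_slice(model, steps=20):           # one "piecemeal" training slice
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X), y)
        loss.backward(); opt.step()
    return loss.item()

pop = [(w, build(w)) for w in (8, 16, 32, 64)]
for gen in range(5):
    scored = sorted(((train_slice(m), w, m) for w, m in pop),
                    key=lambda t: t[0])
    survivors = scored[:2]                  # fittest half keeps its weights
    pop = [(w, m) for _, w, m in survivors]
    for _, w, m in survivors:               # mutants start untrained
        w2 = max(4, w + random.choice((-8, 8)))
        pop.append((w2, build(w2)))
    print(f"gen {gen}: best slice-loss {scored[0][0]:.3f}")
```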


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1929
Author(s):  
Jiacang Ho ◽  
Dae-Ki Kang

Deep neural networks have achieved high performance in image classification, image generation, voice recognition, natural language processing, etc.; however, they still confront several open challenges, such as the incremental learning problem, overfitting, hyperparameter optimization, and lack of flexibility and multitasking. In this paper, we focus on the incremental learning problem, which concerns machine learning methodologies that continuously train an existing model with additional knowledge. To the best of our knowledge, the simplest and most direct solution to this challenge is to retrain the entire neural network after adding the new labels to the output layer. Alternatively, transfer learning can be applied, but only if the domain of the new labels is related to the domain of the labels on which the neural network has already been trained. In this paper, we propose a novel network architecture, the Brick Assembly Network (BAN), which allows a trained network to assemble (or dismantle) a new label to (or from) a trained neural network without retraining the entire network. In BAN, we train each label individually with a sub-network (i.e., a simple neural network) and then assemble the converged sub-networks, each trained for a single label, into a full neural network. For each label trained in a sub-network of BAN, we introduce a new loss function that minimizes the loss of the network with data from only one class. Applying one loss function per class label is unique and differs from standard neural network architectures (e.g., AlexNet, ResNet, InceptionV3, etc.), which use the values of a loss function computed over multiple labels to minimize the error of the network. The difference between the loss functions of previous approaches and ours is that we compute the loss values from the node values of the penultimate layer (which we name the characteristic layer) instead of the output layer, where loss values are computed between true and predicted labels. From experimental results on several benchmark datasets, we show that BAN has a strong capability for adding (and removing) a new label to (or from) a trained network compared with a standard neural network and previous work.
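A minimal sketch of the brick idea follows: one small sub-network per class, trained only on that class's data with a loss computed at the characteristic (penultimate) layer, then assembled by comparing characteristic-layer distances. The MSE-to-a-fixed-anchor loss and the distance-based assembly rule are illustrative stand-ins, not the paper's exact formulation.

```python
# Hedged sketch of Brick Assembly: per-class sub-networks with a
# one-class loss at the characteristic layer. Loss and assembly rule
# are illustrative stand-ins for the paper's formulation.
import torch
import torch.nn as nn

def make_subnet():
    return nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 8))

def train_one_label(x_label, target, steps=200):
    net = make_subnet()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        feats = net(x_label)                      # characteristic layer
        loss = ((feats - target) ** 2).mean()     # one-class loss
        loss.backward(); opt.step()
    return net

target = torch.ones(8)                            # shared anchor vector
subnets = {c: train_one_label(torch.randn(64, 20) + c, target)
           for c in range(3)}                     # one "brick" per label

def predict(x):                                   # assembled network
    dists = {c: ((net(x) - target) ** 2).mean(dim=1)
             for c, net in subnets.items()}
    stacked = torch.stack([dists[c] for c in sorted(dists)], dim=1)
    return stacked.argmin(dim=1)                  # closest brick wins

print(predict(torch.randn(5, 20) + 1))
```

Removing a label then amounts to dropping its brick from `subnets`, with no retraining of the others.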


1993 ◽  
Vol 02 (01) ◽  
pp. 117-132 ◽  
Author(s):  
STAMATIS VASSILIADIS ◽  
GERALD G. PECHANEK ◽  
JOSÉ G. DELGADO-FRIAS

This paper proposes a novel digital neural network architecture referred to as the Sequential PIpelined Neuroemulator, or Neurocomputer (SPIN). The SPIN processor emulates neural networks with high performance and minimum hardware by sequentially processing each neuron of the modeled, completely connected network with a single pipelined physical neuron structure. In addition to describing SPIN, performance equations are estimated for the ring systolic, recurrent systolic array, and neuromimetic neurocomputer architectures, three previously reported schemes for the emulation of neural networks, and a comparison with the SPIN architecture is reported.
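A first-order software model of the SPIN idea might look like the sketch below: a single physical neuron datapath sequentially evaluates each modeled neuron of a completely connected network, one multiply-accumulate per cycle, so a full network update costs roughly n² cycles. The cycle model and activation function are simplifying assumptions, not the paper's performance equations.

```python
# Hedged sketch of SPIN-style emulation: one pipelined physical neuron
# processes each modeled neuron in sequence. Cycle counting is a
# simplified first-order model, not the paper's exact equations.
import numpy as np

def spin_emulate(state, weights, cycles_per_mac=1):
    n = len(state)
    new_state, total_cycles = np.empty(n), 0
    for j in range(n):                    # one modeled neuron at a time
        acc = 0.0
        for i in range(n):                # pipelined MAC stream
            acc += weights[j, i] * state[i]
            total_cycles += cycles_per_mac
        new_state[j] = np.tanh(acc)       # sigmoid-like activation
    return new_state, total_cycles        # ~n*n cycles per full update

rng = np.random.default_rng(0)
s, W = rng.standard_normal(6), rng.standard_normal((6, 6))
s2, cyc = spin_emulate(s, W)
print(s2, cyc)
```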


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV
The problem of applying neural networks to calculate the ratings used by banks when deciding whether to grant loans to borrowers is considered. The task is to determine the borrower's rating function from a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, its general form must be known in advance; the task then reduces to calculating the parameters that enter the expression for the rating function. In contrast, when neural networks are used, there is no need to specify a general form for the rating function. Instead, a particular neural network architecture is chosen and its parameters are calculated from the statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters, and the absence of a universal algorithm for determining the optimal neural network architecture. As an example of using neural networks to determine a borrower's rating, a model system is considered in which the borrower's rating is given by a known non-analytical rating function. A neural network with two inner layers, containing three and two neurons respectively and using a sigmoid activation function, is used for the modeling. It is shown that the neural network restores the borrower's rating function with quite acceptable accuracy.
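As a sketch of the model system described above, the following fits a network with two inner layers of three and two sigmoid neurons to samples of a synthetic, non-analytical rating function. The three-feature input and the training setup are illustrative assumptions.

```python
# Hedged sketch: a 3-2 sigmoid network fitted to a stand-in
# non-analytical rating function. Setup details are assumptions.
import torch
import torch.nn as nn

def true_rating(x):                        # stand-in rating function
    return (x[:, 0] > 0).float() * torch.sigmoid(x[:, 1]) + 0.1 * x[:, 2].abs()

X = torch.randn(1000, 3)
y = true_rating(X).unsqueeze(1)

net = nn.Sequential(nn.Linear(3, 3), nn.Sigmoid(),   # first inner layer
                    nn.Linear(3, 2), nn.Sigmoid(),   # second inner layer
                    nn.Linear(2, 1))                 # rating output
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y)
    loss.backward(); opt.step()
print(f"final MSE: {loss.item():.4f}")
```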


2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels: point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
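The graph construction can be sketched as follows: sampled stroke points become nodes, consecutive points on the same stroke become edges encoding stroke structure, and nearest-neighbour links stand in for the edges a dynamic branch would use. The sampling and the choice of k are illustrative, not SketchGNN's exact preprocessing.

```python
# Hedged sketch of a stroke-to-graph conversion: nodes are sampled
# points, static edges follow stroke order, k-NN links approximate the
# dynamic branch. Details are illustrative assumptions.
import numpy as np

strokes = [np.cumsum(np.random.randn(10, 2), axis=0),   # two fake strokes
           np.cumsum(np.random.randn(7, 2), axis=0)]

nodes = np.vstack(strokes)                                # (N, 2) coords
static_edges, offset = [], 0
for s in strokes:                                         # stroke structure
    static_edges += [(offset + i, offset + i + 1) for i in range(len(s) - 1)]
    offset += len(s)

def knn_edges(pts, k=3):                                  # "dynamic" links
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return [(i, j) for i in range(len(pts)) for j in np.argsort(d[i])[:k]]

print(len(nodes), len(static_edges), len(knn_edges(nodes)))
```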


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposes a computerized process for classifying skin disease using a deep learning pipeline based on MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model achieves good accuracy while remaining light enough to run on computationally constrained devices, and the LSTM maintains stateful information for precise predictions. A grey-level co-occurrence matrix is used to assess the progression of diseased growth. The performance is compared against other state-of-the-art models, including Fine-Tuned Neural Networks (FTNN), a Convolutional Neural Network (CNN), the Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture extended with minor changes. On the HAM10000 dataset, the proposed method outperforms these methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, resulting in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action; it helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.
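A plausible shape for such a pipeline, sketched in TensorFlow/Keras under assumed input size, LSTM width, and a seven-class HAM10000 head, is to let MobileNet V2 produce a spatial feature map whose rows are then read as a sequence by an LSTM:

```python
# Hedged sketch of a MobileNetV2 + LSTM pipeline. Input size, LSTM
# width, and the 7-class head are illustrative assumptions, not the
# paper's exact configuration.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)

inputs = tf.keras.Input(shape=(224, 224, 3))
fmap = base(inputs)                                   # (7, 7, 1280) map
seq = tf.keras.layers.Reshape((7, 7 * 1280))(fmap)    # rows as a sequence
x = tf.keras.layers.LSTM(128)(seq)                    # stateful summary
outputs = tf.keras.layers.Dense(7, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```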


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1614
Author(s):  
Jonghun Jeong ◽  
Jong Sung Park ◽  
Hoeseok Yang

Recently, the need to run high-performance neural networks (NNs) has been increasing, even in resource-constrained embedded systems such as wearable devices. However, due to the high computational and memory requirements of NN applications, it is typically infeasible to execute them on a single device. Instead, it has been proposed to run a single NN application cooperatively across multiple devices, a so-called distributed neural network, in which the workload of one large NN application is distributed over multiple tiny devices. While this approach effectively alleviates the computational burden, existing distributed NN techniques, such as MoDNN, still suffer from heavy communication traffic between devices and vulnerability to communication failures. To eliminate this large communication overhead, a knowledge-distillation-based distributed NN called Network of Neural Networks (NoNN) was proposed, which partitions the filters in the final convolutional layer of the original NN into multiple independent subsets and derives a smaller NN from each subset. However, NoNN also has limitations: the partitioning result may be unbalanced, and it considerably compromises the correlation between filters in the original NN, which may cause unacceptable accuracy degradation in case of communication failure. In this paper, to overcome these issues, we propose to enhance the partitioning strategy of NoNN in two aspects. First, we increase the redundancy of the filters used to derive the multiple smaller NNs by means of averaging, improving the immunity of the distributed NN to communication failure. Second, we propose a novel partitioning technique, modified from Eigenvector-based partitioning, that preserves the correlation between filters as much as possible while keeping the number of filters distributed to each device consistent. Through extensive experiments with the CIFAR-100 (Canadian Institute For Advanced Research-100) dataset, we observe that the proposed approach maintains high inference accuracy (over 70% on average, a 1.53× improvement over the state-of-the-art approach) even when half of the eight devices in a distributed NN fail to deliver their partial inference results.
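The balanced, correlation-preserving split can be illustrated with a spectral-partitioning sketch: embed the filters with the Fiedler vector of a correlation-graph Laplacian, then cut the resulting ordering into equally sized groups so every device receives the same number of filters. The random correlation matrix and the equal-chunk rule are illustrative assumptions; the paper's modified Eigenvector-based partitioning is not reproduced exactly.

```python
# Hedged sketch of eigenvector-based balanced filter partitioning.
# The correlation matrix is random here; in practice it would be
# estimated from final-conv-layer activations.
import numpy as np

def balanced_spectral_partition(corr, n_parts):
    w = np.abs(corr)                       # affinity between filters
    np.fill_diagonal(w, 0.0)
    lap = np.diag(w.sum(1)) - w            # graph Laplacian
    vals, vecs = np.linalg.eigh(lap)
    fiedler = vecs[:, 1]                   # 2nd-smallest eigenvector
    order = np.argsort(fiedler)            # correlated filters stay adjacent
    return np.array_split(order, n_parts)  # equal-sized groups per device

corr = np.corrcoef(np.random.randn(32, 200))   # 32 filters, fake stats
parts = balanced_spectral_partition(corr, n_parts=8)
print([p.tolist() for p in parts])
```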


2016 ◽  
Vol 807 ◽  
pp. 155-166 ◽  
Author(s):  
Julia Ling ◽  
Andrew Kurzawski ◽  
Jeremy Templeton

There exists significant demand for improved Reynolds-averaged Navier–Stokes (RANS) turbulence models that are informed by and can represent a richer set of turbulence physics. This paper presents a method of using deep neural networks to learn a model for the Reynolds stress anisotropy tensor from high-fidelity simulation data. A novel neural network architecture is proposed which uses a multiplicative layer with an invariant tensor basis to embed Galilean invariance into the predicted anisotropy tensor. It is demonstrated that this neural network architecture provides improved prediction accuracy compared with a generic neural network architecture that does not embed this invariance property. The Reynolds stress anisotropy predictions of this invariant neural network are propagated through to the velocity field for two test cases. For both test cases, significant improvement versus baseline RANS linear eddy viscosity and nonlinear eddy viscosity models is demonstrated.
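The multiplicative, invariance-embedding layer can be sketched as follows: a generic MLP maps the scalar invariants to coefficients g_n, which multiply a fixed set of basis tensors, so the predicted anisotropy b = Σ_n g_n T^(n) inherits the invariance of the basis. Five invariants and ten basis tensors follow the usual tensor-basis setup; the layer sizes are illustrative.

```python
# Hedged sketch of a tensor-basis network: an MLP predicts coefficients
# g_n that multiply precomputed basis tensors (the multiplicative
# layer). Widths and depths are illustrative assumptions.
import torch
import torch.nn as nn

class TBNN(nn.Module):
    def __init__(self, n_invariants=5, n_basis=10, width=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_invariants, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, n_basis))       # coefficients g_1..g_10

    def forward(self, invariants, basis):
        # invariants: (B, 5); basis: (B, 10, 3, 3) precomputed tensors
        g = self.mlp(invariants)             # (B, 10)
        return torch.einsum("bn,bnij->bij", g, basis)  # multiplicative layer

model = TBNN()
lam = torch.randn(4, 5)                      # scalar invariants
T = torch.randn(4, 10, 3, 3)                 # invariant tensor basis
print(model(lam, T).shape)                   # (4, 3, 3) anisotropy tensors
```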


2021 ◽  
pp. 2103376 ◽  
Author(s):  
Sifan Li ◽  
Mei‐Er Pam ◽  
Yesheng Li ◽  
Li Chen ◽  
Yu‐Chieh Chien ◽  
...  

2021 ◽  
Vol 11 (15) ◽  
pp. 6845
Author(s):  
Abu Sayeed ◽  
Jungpil Shin ◽  
Md. Al Mehedi Hasan ◽  
Azmain Yakin Srizon ◽  
Md. Mehedi Hasan

As Bengali is the seventh most-spoken language and fifth most-spoken native language in the world, the domain of Bengali handwritten character recognition has fascinated researchers for decades. Although other popular languages, e.g., English, Chinese, Hindi, and Spanish, have received many contributions in the area of handwritten character recognition, Bengali has received few noteworthy contributions in this domain because of the complex curvatures and similar writing fashions of Bengali characters. Previous studies were conducted using different approaches based on traditional learning and deep learning. In this research, we propose a novel low-cost convolutional neural network architecture for the recognition of Bengali characters with only 2.24 to 2.43 million parameters, depending on the number of output classes. We considered 8 different formations of the CMATERdb datasets, based on previous studies, for the training phase. Experimental analysis shows that our proposed system outperforms previous works by a noteworthy margin on all 8 datasets. Moreover, we tested our trained models on other available Bengali character datasets, such as the Ekush, BanglaLekha, and NumtaDB datasets, where our proposed architecture achieved 96–99% overall accuracy as well. We believe our contributions will be beneficial for developing an automated high-performance recognition tool for Bengali handwritten characters.
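For a sense of scale, the sketch below shows an illustrative compact CNN whose parameter count moves through the quoted 2.24 to 2.43 million range as the number of output classes grows (about 2.24M at 50 classes, about 2.43M at 231). The layer layout is a reconstruction for illustration, not the paper's exact architecture.

```python
# Hedged sketch: a compact CNN whose parameter count depends on the
# class count, landing in the 2.24M-2.43M range quoted above. The
# layout is an illustrative assumption, not the paper's model.
import torch
import torch.nn as nn

def compact_cnn(num_classes):
    return nn.Sequential(
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),                        # 32x32 input -> 4x4x128
        nn.Linear(4 * 4 * 128, 1024), nn.ReLU(),
        nn.Linear(1024, num_classes))        # class-dependent head

for n in (50, 231):                          # e.g., basic vs compound chars
    model = compact_cnn(n)
    params = sum(p.numel() for p in model.parameters())
    print(f"{n} classes: {params / 1e6:.2f}M parameters")
```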

