A RRAM based Max-Pooling Scheme for Convolutional Neural Network

Author(s):  
Yaotian Ling ◽  
Zongwei Wang ◽  
Yunfan Yang ◽  
Zhizhen Yu ◽  
Qilin Zheng ◽  
...  
2021 ◽  
Vol 10 (1) ◽  
pp. 383-389
Author(s):  
Wahyudi Setiawan ◽  
Moh. Imam Utoyo ◽  
Riries Rulaningtyas

A convolutional neural network (CNN) is a supervised deep learning method. Established architectures, including AlexNet, VGG16, VGG19, ResNet50, ResNet101, GoogLeNet, Inception-V3, Inception-ResNet-V2, and SqueezeNet, have 25 to 825 layers. This study aims to simplify the layers of CNN architectures and increase accuracy for fundus patch classification. Fundus patches are classified into two categories: normal and neovascularization. The data used for classification come from MESSIDOR and Retina Image Bank and comprise 2,080 patches. The results show a best accuracy of 93.17% for the original data and 99.33% for the augmented data using a 31-layer CNN. It consists of an input layer, 7 convolutional layers, 7 batch normalization layers, 7 rectified linear unit layers, 6 max-pooling layers, a fully connected layer, a softmax layer, and an output layer.
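The layer breakdown in the abstract above can be checked with a short tally. This is only an arithmetic sketch: the dictionary keys are illustrative labels, not names taken from the authors' implementation.

```python
# Layer counts as listed in the abstract above. The keys are illustrative
# labels chosen for this sketch, not identifiers from the authors' code.
layer_counts = {
    "input": 1,
    "convolution": 7,
    "batch_normalization": 7,
    "relu": 7,
    "max_pooling": 6,
    "fully_connected": 1,
    "softmax": 1,
    "output": 1,
}

# Summing the per-type counts should reproduce the reported depth.
total_layers = sum(layer_counts.values())
print(total_layers)  # 31
```

The sum matches the 31 layers reported for the simplified network.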


2021 ◽  
Vol 11 (4) ◽  
pp. 1505
Author(s):  
Keisuke Manabe ◽  
Yusuke Asami ◽  
Tomonari Yamada ◽  
Hiroyuki Sugimori

Background and purpose. This study evaluated a modified specialized convolutional neural network (CNN) to improve the classification accuracy of medical images. Materials and Methods. We defined computed tomography (CT) images as belonging to one of the following 10 classes: head, neck, chest, abdomen, and pelvis with and without contrast media, with 10,000 images per class. We modified a CNN based on AlexNet with an input size of 512 × 512. We resized the filters of the convolution and max-pooling layers. Using these modified CNNs, various models were created and evaluated. The improved CNN was also evaluated on classifying the presence or absence of the pancreas in the CT images. We compared the overall accuracy, which was calculated from images not used for training, to that of ResNet. Results. The overall accuracies of the most improved CNN and ResNet in the 10 classes were 94.8% and 89.3%, respectively. The filter sizes of the improved CNN for the convolution layers were (13, 13), (7, 7), (5, 5), (5, 5), and (5, 5) in order from the first layer, and that of max-pooling was (7, 7). The calculation times of the most improved CNN and ResNet were 56 and 120 min, respectively. Regarding the classification of the pancreas, the overall accuracies of the most improved CNN and ResNet were 75.75% and 58.25%, respectively. The calculation times of the most improved CNN and ResNet were 36 and 55 min, respectively. Conclusion. By optimizing the filter sizes of the convolution and max-pooling layers for 512 × 512 images, we quickly obtained a highly accurate medical image classification model. This improved CNN can be useful for classifying lesions and anatomies for related diagnostic aid applications.
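To see how filter size affects feature-map size, the standard output-size formula for convolution and pooling layers can be sketched as below. The 512 input size and the 13 and 7 kernel sizes come from the abstract above; the strides and the zero padding are assumptions made purely for illustration, since the abstract does not state them.

```python
def output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis for a convolution or pooling layer."""
    if n + 2 * padding < kernel:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * padding - kernel) // stride + 1


# Kernel sizes are from the abstract; stride 4, stride 2, and zero padding
# are ASSUMED values for this sketch, not reported settings.
first_conv = output_size(512, kernel=13, stride=4)
pooled = output_size(first_conv, kernel=7, stride=2)

print(first_conv, pooled)  # 125 60
```

Under these assumed strides, a larger kernel trims more of the border per layer, which is one reason filter size and input resolution are usually tuned together.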


2021 ◽  
pp. 28-34
Author(s):  
Andryi V. Manokhin ◽  
Natalia A. Rybachok

The article highlights aspects of using deep machine learning to recognize accents of the English language. Software has been developed to estimate, as a percentage, how closely an audio recording matches each of the 8 most common English accents. A convolutional neural network consisting of 2 convolutional layers, 1 max-pooling layer, and 2 dense layers was trained for 2 epochs on a set of 5,516 audio recordings taken from the English Multi-speaker Corpus for Voice Cloning resource. A prediction accuracy of 89.07% was achieved on test data comprising 11 thousand MFCC matrices with a dimension of 50×87.
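One common way to turn a classifier's raw outputs into per-class percentages is a softmax. The abstract does not say which method the software uses, so the sketch below is an assumption, and the eight scores are hypothetical values chosen for illustration.

```python
import math


def softmax_percentages(scores):
    """Convert raw class scores into percentages that sum to 100."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [100.0 * e / total for e in exps]


# Eight HYPOTHETICAL raw scores, one per accent class.
scores = [2.0, 0.5, 0.1, -1.0, 0.0, 1.2, -0.3, 0.7]
percentages = softmax_percentages(scores)

# The class with the highest raw score also has the highest percentage.
best_class = percentages.index(max(percentages))
print(best_class)  # 0
```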


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3101
Author(s):  
Jaihyuk Choi ◽  
Sungjae Lee ◽  
Youngdoo Son ◽  
Soo Youn Kim

This paper presents an always-on Complementary Metal Oxide Semiconductor (CMOS) image sensor (CIS) using an analog convolutional neural network for image classification in mobile applications. To reduce the power consumption as well as the overall processing time, we propose analog convolution circuits for computing convolution, max-pooling, and correlated double sampling operations without operational transconductance amplifiers. In addition, we used the voltage-mode MAX circuit for max pooling in the analog domain. After the analog convolution processing, the image data were reduced by 99.58% and were converted to digital with a 4-bit single-slope analog-to-digital converter. After the conversion, images were classified by the fully connected processor, which is traditionally performed in the digital domain. The measurement results show that we achieved an 89.33% image classification accuracy. The prototype CIS was fabricated in a 0.11 μm 1-poly 4-metal CIS process with a standard 4T-active pixel sensor. The image resolution was 160 × 120, and the total power consumption of the proposed CIS was 1.12 mW with a 3.3 V supply voltage and a maximum frame rate of 120.
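For reference, the max-pooling operation that the sensor performs in the analog domain can be written in software as follows. This is a minimal digital sketch, not a model of the voltage-mode MAX circuit; the 4 × 4 input and the window size of 2 are hypothetical. The final lines work out how many values would remain per frame, assuming the reported 99.58% reduction applies to the count of pixel values in a 160 × 120 image.

```python
def max_pool_2d(image, size):
    """Non-overlapping max pooling over a 2D list of numbers.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    rows = len(image) // size
    cols = len(image[0]) // size
    return [
        [
            max(
                image[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# HYPOTHETICAL 4 x 4 input, pooled with a 2 x 2 window.
image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 2, 2],
]
pooled = max_pool_2d(image, 2)
print(pooled)  # [[4, 5], [6, 7]]

# Values left per frame if 99.58% of a 160 x 120 image is removed
# (ASSUMES the percentage refers to the number of pixel values).
pixels = 160 * 120
remaining = pixels * (1 - 99.58 / 100)
print(round(remaining, 2))  # 80.64
```

Under that assumption, roughly 80 values per frame would reach the analog-to-digital converter.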

