A RRAM based Max-Pooling Scheme for Convolutional Neural Network

A convolutional neural network (CNN) is a supervised deep learning method. Existing architectures, including AlexNet, VGG16, VGG19, ResNet50, ResNet101, GoogLeNet, Inception-V3, Inception-ResNet-V2, and SqueezeNet, have 25 to 825 layers. This study aims to simplify the layers of CNN architectures and increase accuracy for fundus patch classification. Fundus patches are classified into two categories: normal and neovascularization. The data used for classification come from MESSIDOR and Retina Image Bank and comprise 2,080 patches. Results show a best accuracy of 93.17% for the original data and 99.33% for the augmented data using a 31-layer CNN. The network consists of an input layer, 7 convolutional layers, 7 batch normalization layers, 7 rectified linear unit layers, 6 max-pooling layers, a fully connected layer, a softmax layer, and an output layer.
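
As a quick consistency check on the layer breakdown in this abstract, the per-type counts can be summed and compared with the stated total of 31. The dictionary keys below are illustrative labels, not names taken from the paper.

```python
# Layer counts as listed in the abstract; the key names are illustrative labels.
layer_counts = {
    "input": 1,
    "convolution": 7,
    "batch_normalization": 7,
    "relu": 7,
    "max_pooling": 6,
    "fully_connected": 1,
    "softmax": 1,
    "output": 1,
}

# Sum the per-type counts to get the total depth of the described network.
total_layers = sum(layer_counts.values())
print(total_layers)
```

The sum matches the 31 layers reported in the abstract.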

Max-Pooling Convolutional Neural Network for Chinese Digital Gesture Recognition

Advances in Intelligent Systems and Computing - Information Technology and Intelligent Transportation Systems ◽

10.1007/978-3-319-38771-0_8 ◽

2016 ◽

pp. 79-89 ◽

Cited By ~ 1

Author(s):

Zhao Qian ◽

Li Yawei ◽

Zhu Mengyu ◽

Yang Yuliang ◽

Xiao Ling ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Gesture Recognition ◽

Max Pooling

Improvement in the Convolutional Neural Network for Computed Tomography Images

Applied Sciences ◽

10.3390/app11041505 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1505

Author(s):

Keisuke Manabe ◽

Yusuke Asami ◽

Tomonari Yamada ◽

Hiroyuki Sugimori

Keyword(s):

Neural Network ◽

Computed Tomography ◽

Convolutional Neural Network ◽

Ct Images ◽

Classification Model ◽

Max Pooling ◽

Computed Tomography Images ◽

Input Size ◽

Filter Size

Background and purpose. This study evaluated a modified specialized convolutional neural network (CNN) to improve the accuracy of medical images. Materials and Methods. We defined computed tomography (CT) images as belonging to one of the following 10 classes: head, neck, chest, abdomen, and pelvis with and without contrast media, with 10,000 images per class. We modified the CNN based on the AlexNet with an input size of 512 × 512. We resized the filter sizes of the convolution layer and max pooling. Using these modified CNNs, various models were created and evaluated. The improved CNN was evaluated to classify the presence or absence of the pancreas in the CT images. We compared the overall accuracy, which was calculated from images not used for training, to that of the ResNet. Results. The overall accuracies of the most improved CNN and ResNet in the 10 classes were 94.8% and 89.3%, respectively. The filter sizes of the improved CNN for the convolution layer were (13, 13), (7, 7), (5, 5), (5, 5), and (5, 5) in order from the first layer, and that of max-pooling was (7, 7). The calculation times of the most improved CNN and ResNet were 56 and 120 min, respectively. Regarding the classification of the pancreas, the overall accuracies of the most improved CNN and ResNet were 75.75% and 58.25%, respectively. The calculation times of the most improved CNN and ResNet were 36 and 55 min, respectively. Conclusion. By optimizing the filter size of the convolution layer and max-pooling of 512 × 512 images, we quickly obtained a highly accurate medical image classification model. This improved CNN can be useful for classifying lesions and anatomies for related diagnostic aid applications.
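
The abstract reports a (7, 7) max-pooling filter but does not state the stride or padding. The following is a minimal sketch of 2D max pooling in plain Python, assuming no padding and a stride chosen by the caller; `max_pool2d` and `pooled_size` are hypothetical helper names, and this is not the authors' implementation.

```python
def max_pool2d(image, window, stride):
    """Slide a window x window filter over a 2D list and keep each maximum.

    Assumes no padding; positions where the window would overhang are skipped.
    """
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - window + 1, stride):
        pooled_row = []
        for c in range(0, cols - window + 1, stride):
            patch = [image[r + i][c + j]
                     for i in range(window) for j in range(window)]
            pooled_row.append(max(patch))
        pooled.append(pooled_row)
    return pooled


def pooled_size(n, window, stride):
    """Output length along one axis for unpadded pooling."""
    return (n - window) // stride + 1


# Small worked example with a 2 x 2 window and stride 2.
example = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
small_result = max_pool2d(example, window=2, stride=2)

# Hypothetical case: a 7 x 7 window with stride 7 applied to a 512-wide map.
# In the paper the pooling acts on convolution outputs, not on the raw input.
width_after_pool = pooled_size(512, window=7, stride=7)
```

A larger pooling window shrinks the feature map faster, which is one plausible reason the filter size affects both accuracy and calculation time.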

English Accent Recognition Using Deep Machine Learning

Control Systems and Computers ◽

10.15407/csc.2021.04.028 ◽

2021 ◽

pp. 28-34

Author(s):

Andryi V. Manokhin ◽

Natalia A. Rybachok

Keyword(s):

Neural Network ◽

Machine Learning ◽

Convolutional Neural Network ◽

Test Data ◽

English Language ◽

Forecasting Accuracy ◽

Max Pooling ◽

Audio Recordings ◽

Accent Recognition

The article highlights aspects of using deep machine learning to recognize English accents. Software has been developed to estimate, as a percentage, how close an audio recording is to each of the 8 most common English accents. A convolutional neural network consisting of 2 convolutional layers, 1 max-pooling layer, and 2 dense layers was trained for 2 epochs on a set of 5,516 audio recordings taken from the English Multi-speaker Corpus for Voice Cloning resource. A forecasting accuracy of 89.07% was achieved on test data represented by 11 thousand MFCC matrices with a dimension of 50×87.

A Comparison between Average and Max-Pooling in Convolutional Neural Network for Scoliosis Classification

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2020/9791.42020 ◽

2020 ◽

Vol 9 (1.4) ◽

pp. 689-696

Author(s):

Nurbaity Sabri

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Max Pooling

Area and Energy Efficient 2D Max-Pooling For Convolutional Neural Network Hardware Accelerator

IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society ◽

10.1109/iecon43393.2020.9254452 ◽

2020 ◽

Author(s):

Bin Zhao ◽

Yi Sheng Chong ◽

Anh Tuan Do

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Energy Efficient ◽

Hardware Accelerator ◽

Max Pooling ◽

Neural Network Hardware

Design of an Always-On Image Sensor Using an Analog Lightweight Convolutional Neural Network

Sensors ◽

10.3390/s20113101 ◽

2020 ◽

Vol 20 (11) ◽

pp. 3101

Author(s):

Jaihyuk Choi ◽

Sungjae Lee ◽

Youngdoo Son ◽

Soo Youn Kim

Keyword(s):

Neural Network ◽

Power Consumption ◽

Convolutional Neural Network ◽

Image Classification ◽

Image Sensor ◽

Image Resolution ◽

Oxide Semiconductor ◽

Total Power ◽

Max Pooling ◽

Total Power Consumption

This paper presents an always-on Complementary Metal Oxide Semiconductor (CMOS) image sensor (CIS) using an analog convolutional neural network for image classification in mobile applications. To reduce the power consumption as well as the overall processing time, we propose analog convolution circuits for computing convolution, max-pooling, and correlated double sampling operations without operational transconductance amplifiers. In addition, we used the voltage-mode MAX circuit for max pooling in the analog domain. After the analog convolution processing, the image data were reduced by 99.58% and were converted to digital with a 4-bit single-slope analog-to-digital converter. After the conversion, images were classified by the fully connected processor, which is traditionally performed in the digital domain. The measurement results show that we achieved an 89.33% image classification accuracy. The prototype CIS was fabricated in a 0.11 μm 1-poly 4-metal CIS process with a standard 4T-active pixel sensor. The image resolution was 160 × 120, and the total power consumption of the proposed CIS was 1.12 mW with a 3.3 V supply voltage and a maximum frame rate of 120.
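
To give a sense of scale for the reported 99.58% data reduction, the arithmetic can be sketched as follows. This assumes the reduction is measured in number of values relative to the 160 × 120 pixel array, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Figures taken from the abstract.
width, height = 160, 120
reduction = 0.9958  # 99.58% of the image data removed by analog processing

# Assumption: the reduction is counted in values per frame.
pixels_per_frame = width * height
values_remaining = pixels_per_frame * (1 - reduction)


def reduction_percent(values_kept, total_values):
    """Percentage of data removed when only values_kept of total_values remain."""
    return round(100 * (1 - values_kept / total_values), 2)


print(pixels_per_frame)
print(round(values_remaining))
print(reduction_percent(80, pixels_per_frame))
```

Under this assumption, roughly 80 values per frame would reach the analog-to-digital converter, which is consistent with the reported figure but is not confirmed by the abstract.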
