scholarly journals Mixup of Feature Maps in a Hidden Layer for Training of Convolutional Neural Network

Author(s):  
Hideki Oki ◽  
Takio Kurita
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Bambang Tutuko ◽  
Siti Nurmaini ◽  
Alexander Edo Tondas ◽  
Muhammad Naufal Rachmatullah ◽  
Annisa Darmawahyuni ◽  
...  

Abstract Background Generalization model capacity of deep learning (DL) approach for atrial fibrillation (AF) detection remains lacking. It can be seen from previous researches, the DL model formation used only a single frequency sampling of the specific device. Besides, each electrocardiogram (ECG) acquisition dataset produces a different length and sampling frequency to ensure sufficient precision of the R–R intervals to determine the heart rate variability (HRV). An accurate HRV is the gold standard for predicting the AF condition; therefore, a current challenge is to determine whether a DL approach can be used to analyze raw ECG data in a broad range of devices. This paper demonstrates powerful results for end-to-end implementation of AF detection based on a convolutional neural network (AFibNet). The method used a single learning system without considering the variety of signal lengths and frequency samplings. For implementation, the AFibNet is processed with a computational cloud-based DL approach. This study utilized a one-dimension convolutional neural networks (1D-CNNs) model for 11,842 subjects. It was trained and validated with 8232 records based on three datasets and tested with 3610 records based on eight datasets. The predicted results, when compared with the diagnosis results indicated by human practitioners, showed a 99.80% accuracy, sensitivity, and specificity. Result Meanwhile, when tested using unseen data, the AF detection reaches 98.94% accuracy, 98.97% sensitivity, and 98.97% specificity at a sample period of 0.02 seconds using the DL Cloud System. To improve the confidence of the AFibNet model, it also validated with 18 arrhythmias condition defined as Non-AF-class. Thus, the data is increased from 11,842 to 26,349 instances for three-class, i.e., Normal sinus (N), AF and Non-AF. The result found 96.36% accuracy, 93.65% sensitivity, and 96.92% specificity. Conclusion These findings demonstrate that the proposed approach can use unknown data to derive feature maps and reliably detect the AF periods. We have found that our cloud-DL system is suitable for practical deployment


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 949
Author(s):  
Jiangyi Wang ◽  
Min Liu ◽  
Xinwu Zeng ◽  
Xiaoqiang Hua

Convolutional neural networks have powerful performances in many visual tasks because of their hierarchical structures and powerful feature extraction capabilities. SPD (symmetric positive definition) matrix is paid attention to in visual classification, because it has excellent ability to learn proper statistical representation and distinguish samples with different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted from convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for the SPD matrix is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. Based on this method, signal detection has become a binary classification problem of signals in samples. In order to prove the availability and superiority of this method, simulated and semi-physical simulated data sets are used. The results show that, under low SCR (signal-to-clutter ratio), compared with the spectral signal detection method based on the deep neural network, this method can obtain a gain of 0.5–2 dB on simulated data sets and semi-physical simulated data sets.


2018 ◽  
Vol 4 (9) ◽  
pp. 107 ◽  
Author(s):  
Mohib Ullah ◽  
Ahmed Mohammed ◽  
Faouzi Alaya Cheikh

Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently from each other and then combined in a sequential way. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we proposed a spatio-temporal convolutional neural network named PedNet which exploits temporal information for spatial segmentation. The backbone of the PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames and the output is a binary mask of the segmented regions in the middle frame. Irrespective of classical deep models where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end and the segmentation is achieved without the need of any pre- or post-processing. The main characteristic of PedNet is its unique design where it performs segmentation on a frame-by-frame basis but it uses the temporal information from the previous and the future frame for segmenting the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we used long-skip connections from the encoder to decoder network and concatenate the output of low-level layers with the higher level layers. This approach helps to get segmentation map with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined the information optimally to segment the middle frame. We evaluated our approach on eight challenging datasets where humans are involved in different activities with severe articulation (football, road crossing, surveillance). The most common CamVid dataset which is used for calculating the performance of the segmentation algorithm is evaluated against seven state-of-the-art methods. The performance is shown on precision/recall, F 1 , F 2 , and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods with substantial improvement in terms of all the performance metrics.


Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 102
Author(s):  
Michele Lo Giudice ◽  
Giuseppe Varone ◽  
Cosimo Ieracitano ◽  
Nadia Mammone ◽  
Giovanbattista Gaspare Tripodi ◽  
...  

The differential diagnosis of epileptic seizures (ES) and psychogenic non-epileptic seizures (PNES) may be difficult, due to the lack of distinctive clinical features. The interictal electroencephalographic (EEG) signal may also be normal in patients with ES. Innovative diagnostic tools that exploit non-linear EEG analysis and deep learning (DL) could provide important support to physicians for clinical diagnosis. In this work, 18 patients with new-onset ES (12 males, 6 females) and 18 patients with video-recorded PNES (2 males, 16 females) with normal interictal EEG at visual inspection were enrolled. None of them was taking psychotropic drugs. A convolutional neural network (CNN) scheme using DL classification was designed to classify the two categories of subjects (ES vs. PNES). The proposed architecture performs an EEG time-frequency transformation and a classification step with a CNN. The CNN was able to classify the EEG recordings of subjects with ES vs. subjects with PNES with 94.4% accuracy. CNN provided high performance in the assigned binary classification when compared to standard learning algorithms (multi-layer perceptron, support vector machine, linear discriminant analysis and quadratic discriminant analysis). In order to interpret how the CNN achieved this performance, information theoretical analysis was carried out. Specifically, the permutation entropy (PE) of the feature maps was evaluated and compared in the two classes. The achieved results, although preliminary, encourage the use of these innovative techniques to support neurologists in early diagnoses.


Feed-forward neural networks can be trained based on a gradient-descent based backpropagation algorithm. But, these algorithms require more computation time. Extreme Learning Machines (ELM’s) are time-efficient, and they are less complicated than the conventional gradient-based algorithm. In previous years, an SRAM based convolutional neural network using a receptive – field Approach was proposed. This neural network was used as an encoder for the ELM algorithm and was implemented on FPGA. But, this neural network used an inaccurate 3-stage pipelined parallel adder. Hence, this neural network generates imprecise stimuli to the hidden layer neurons. This paper presents an implementation of precise convolutional neural network for encoding in the ELM algorithm based on the receptive - field approach at the hardware level. In the third stage of the pipelined parallel adder, instead of approximating the output by using one 2-input 15-bit adder, one 4-input 14-bit adder is used. Also, an additional weighted pixel array block is used. This weighted pixel array improves the accuracy of generating 128 weighted pixels. This neural network was simulated using ModelSim-Altera 10.1d and synthesized using Quartus II 13.0 sp1. This neural network is implemented on Cyclone V FPGA and used for pattern recognition applications. Although this design consumes slightly more hardware resources, this design is more accurate compared to previously existing encoders


2019 ◽  
Vol 13 ◽  
pp. 302-309
Author(s):  
Jakub Basiakowski

The following paper presents the results of research on the impact of machine learning in the construction of a voice-controlled interface. Two different models were used for the analysys: a feedforward neural network containing one hidden layer and a more complicated convolutional neural network. What is more, a comparison of the applied models was presented. This comparison was performed in terms of quality and the course of training.


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7468
Author(s):  
Yui-Kai Weng ◽  
Shih-Hsu Huang ◽  
Hsu-Yu Kao

In a CNN (convolutional neural network) accelerator, to reduce memory traffic and power consumption, there is a need to exploit the sparsity of activation values. Therefore, some research efforts have been paid to skip ineffectual computations (i.e., multiplications by zero). Different from previous works, in this paper, we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on the two observations, we propose a block-based compression approach, which utilizes both the sparsity and the similarity of activation values to further reduce the data volume. Moreover, we also design an encoder, a decoder and an indexing module to support the proposed approach. The encoder is used to translate output activations into the proposed block-based compression format, while both the decoder and the indexing module are used to align nonzero values for effectual computations. Compared with previous works, benchmark data consistently show that the proposed approach can greatly reduce both memory traffic and power consumption.


2021 ◽  
pp. 1-13
Author(s):  
R. Bhuvaneswari ◽  
S. Ganesh Vaidyanathan

Diabetic Retinopathy (DR) is one of the most common diabetic diseases that affect the retina’s blood vessels. Too much of the glucose level in blood leads to blockage of blood vessels in the retina, weakening and damaging the retina. Automatic classification of diabetic retinopathy is a challenging task in medical research. This work proposes a Mixture of Ensemble Classifiers (MEC) to classify and grade diabetic retinopathy images using hierarchical features. We use an ensemble of classifiers such as support vector machine, random forest, and Adaboost classifiers that use the hierarchical feature maps obtained at every pooling layer of a convolutional neural network (CNN) for training. The feature maps are generated by applying the filters to the output of the previous layer. Lastly, we predict the class label or the grade for the given test diabetic retinopathy image by considering the class labels of all the ensembled classifiers. We have tested our approaches on the E-ophtha dataset for the classification task and the Messidor dataset for the grading task. We achieved an accuracy of 95.8% and 96.2% for the E-ophtha and Messidor datasets, respectively. A comparison among prominent convolutional neural network architectures and the proposed approach is provided.


Sign in / Sign up

Export Citation Format

Share Document