Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

2022 ◽  
Author(s):  
Claudio Filipi Gonçalves dos Santos ◽  
João Paulo Papa

Several image processing tasks, such as image classification and object detection, have been significantly improved using Convolutional Neural Networks (CNN). Many architectures, such as ResNet and EfficientNet, have achieved outstanding results on at least one dataset at the time of their creation. A critical factor in training concerns the network’s regularization, which prevents the structure from overfitting. This work analyzes several regularization methods developed in the last few years, showing significant improvements for different CNN models. The works are classified into three main areas: the first one is called “data augmentation”, where all the techniques focus on performing changes in the input data. The second, named “internal changes”, covers procedures that modify the feature maps generated by the neural network or its kernels. The last one, called “label”, concerns transforming the labels of a given input. This work presents two main differences compared to other available surveys about regularization: (i) the first concerns the papers gathered in the manuscript, which are no older than five years, and (ii) the second distinction is about reproducibility, i.e., all works referred to here have their code available in public repositories or have been directly implemented in some framework, such as TensorFlow or Torch.
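
As a concrete illustration of the three families discussed above, the following minimal PyTorch sketch (not taken from the survey; all layer sizes and hyperparameter values are illustrative) combines one regularizer from each category: data augmentation on the inputs, dropout as an internal change, and label smoothing as a label transformation.

```python
# Minimal sketch of the survey's three regularization families (illustrative only).
import torch.nn as nn
from torchvision import transforms

# (i) "data augmentation": perturb the input images.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
])

# (ii) "internal changes": drop activations inside the network (feature-map level).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Dropout2d(p=0.2),                 # regularizes the feature maps
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

# (iii) "label": soften the one-hot targets via label smoothing.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```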

Author(s):  
Paweł Tarasiuk ◽  
Piotr S. Szczepaniak

This paper presents a novel method for improving the invariance of convolutional neural networks (CNNs) to selected geometric transformations in order to obtain more efficient image classifiers. A common strategy employed to achieve this aim is to train the network using data augmentation. Such a method alone, however, increases the complexity of the neural network model, as any change in the rotation or size of the input image results in the activation of different CNN feature maps. This problem can be resolved by the proposed novel convolutional neural network models with geometric transformations embedded into the network architecture. The evaluation of the proposed CNN model is performed on the image classification task with the use of diverse representative data sets. The CNN models with embedded geometric transformations are compared to those without the transformations, using different data augmentation setups. As the compared approaches use the same amount of memory to store the parameters, the improved classification score means that the proposed architecture is more effective.
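
One hedged way to picture a geometric transformation embedded into the network architecture, as a sketch under assumptions rather than the authors' implementation, is to apply a shared backbone to rotated copies of the input and pool over orientations, so the parameter count stays the same as for the plain backbone:

```python
# Sketch: rotation handled inside the model by sharing weights across rotated copies.
import torch
import torch.nn as nn

class RotationInvariantCNN(nn.Module):
    def __init__(self, backbone: nn.Module, head: nn.Module):
        super().__init__()
        self.backbone = backbone          # shared weights for all rotations
        self.head = head

    def forward(self, x):
        feats = []
        for k in range(4):                # 0, 90, 180, 270 degrees
            feats.append(self.backbone(torch.rot90(x, k, dims=(2, 3))))
        pooled = torch.stack(feats, dim=0).max(dim=0).values   # pool over orientations
        return self.head(pooled)

backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
model = RotationInvariantCNN(backbone, head=nn.Linear(16, 10))
logits = model(torch.randn(8, 3, 32, 32))  # same parameter count as backbone + head
```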


Author(s):  
Sarah Badr AlSumairi ◽  
Mohamed Maher Ben Ismail

Pneumonia is an infectious disease of the lungs. About one third to one half of pneumonia cases are caused by bacteria. Early diagnosis is a critical factor for a successful treatment process. Typically, the disease is diagnosed by a radiologist using chest X-ray images; in fact, chest X-rays are currently the best available method for diagnosing pneumonia. However, the recognition of pneumonia symptoms is a challenging task that relies on the availability of expert radiologists. Such “human” diagnosis can be inaccurate and subjective due to a lack of image clarity and erroneous decisions. Moreover, the error rate can increase further if the physician is asked to analyze tens of X-rays within a short period of time. Therefore, Computer-Aided Diagnosis (CAD) systems were introduced to support and assist physicians and make their efforts more productive. In this paper, we investigate, design, implement, and assess customized Convolutional Neural Networks to address the image-based pneumonia classification problem. Namely, ResNet-50 and DenseNet-161 models were adapted to design customized deep network architectures and improve the overall pneumonia classification accuracy. Moreover, data augmentation was applied to standard datasets to assess the proposed models. In addition, standard performance measures were used to validate and evaluate the proposed system.
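
A minimal sketch, assuming PyTorch/torchvision, of how a pretrained ResNet-50 can be customized for binary (normal vs. pneumonia) chest X-ray classification with data augmentation; the paper's exact architecture and augmentation choices may differ.

```python
# Sketch: transfer learning with a ResNet-50 backbone for pneumonia classification.
import torch.nn as nn
from torchvision import models, transforms

train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),        # illustrative augmentation settings
    transforms.ToTensor(),
])

model = models.resnet50(weights="IMAGENET1K_V1")   # ImageNet-pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)      # normal vs. pneumonia head
```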


2017 ◽  
Vol 10 (27) ◽  
pp. 1329-1342 ◽  
Author(s):  
Javier O. Pinzon Arenas ◽  
Robinson Jimenez Moreno ◽  
Paula C. Useche Murillo

This paper presents the implementation of a Region-based Convolutional Neural Network focused on the recognition and localization of hand gestures, in this case two types of gestures, open and closed hand, in order to recognize such gestures against dynamic backgrounds. The neural network is trained and validated, achieving a 99.4% validation accuracy in gesture recognition and a 25% average accuracy in RoI localization. The network is then tested in real time, where its operation is verified through the times taken for recognition, its behavior on trained and untrained gestures, and its performance against complex backgrounds.
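
For readers who want a starting point, the following hedged sketch configures torchvision's Faster R-CNN, a stand-in region-based detector rather than the paper's exact network, for two gesture classes plus background:

```python
# Sketch: region-based CNN set up for open-hand / closed-hand detection.
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 3   # open hand, closed hand, background
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
# Training proceeds on (image, {"boxes", "labels"}) pairs; at inference the
# model returns boxes, labels, and scores for each detected gesture.
```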


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249404
Author(s):  
Jeongtae Son ◽  
Dongsup Kim

Prediction of protein-ligand interactions is a critical step during the initial phase of drug discovery. We propose a novel deep-learning-based prediction model built on a graph convolutional neural network, named GraphBAR, for protein-ligand binding affinity. Graph convolutional neural networks reduce the computational time and resources that are normally required by traditional convolutional neural network models. In this technique, the structure of a protein-ligand complex is represented as a graph of multiple adjacency matrices, whose entries are determined by distances, and a feature matrix that describes the molecular properties of the atoms. We evaluated the predictive power of GraphBAR for protein-ligand binding affinities using PDBbind datasets and demonstrated the efficiency of the graph convolution. Given the computational efficiency of graph convolutional neural networks, we also performed data augmentation to improve the model performance. We found that data augmentation with docking simulation data could improve the prediction accuracy, although the improvement was not significant. The high prediction performance and speed of GraphBAR suggest that such networks can serve as valuable tools in drug discovery.
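
A minimal PyTorch sketch of a graph convolution over several distance-based adjacency matrices, in the spirit of the representation described above; the layer sizes and number of adjacency matrices are illustrative assumptions, not GraphBAR's actual configuration.

```python
# Sketch: graph convolution with multiple distance-binned adjacency matrices.
import torch
import torch.nn as nn

class MultiAdjGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, n_adj):
        super().__init__()
        # one weight matrix per adjacency (distance bin)
        self.weights = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(n_adj)]
        )

    def forward(self, adjs, feats):
        # adjs: (n_adj, N, N) adjacency matrices; feats: (N, F) atom features
        out = sum(a @ w(feats) for a, w in zip(adjs, self.weights))
        return torch.relu(out)

layer = MultiAdjGraphConv(in_dim=34, out_dim=64, n_adj=3)
h = layer(torch.rand(3, 100, 100), torch.rand(100, 34))   # toy complex
```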


Geophysics ◽  
2021 ◽  
pp. 1-77
Author(s):  
Hanchen Wang ◽  
Tariq Alkhalifah

The sheer size of time-lapse data often requires significant event detection and source location efforts, especially in areas like shale gas exploration regions where a large number of micro-seismic events are recorded. In many cases, real-time monitoring and locating of these events are essential to production decisions. Conventional methods face considerable drawbacks: traveltime-based methods require traveltime picking of often noisy data, while migration and waveform inversion methods require expensive wavefield solutions and event detection. Both tasks require some human intervention, which becomes a major problem when many sources need to be located, as is common in micro-seismic monitoring. Machine learning has recently been used to identify micro-seismic events or locate their sources once they are identified and picked. We propose a novel artificial neural network framework to directly map seismic data, without any event picking or detection, to their potential source locations. We train two convolutional neural networks on labeled synthetic acoustic data containing simulated micro-seismic events to fulfill these requirements. One convolutional neural network, which has a global average pooling layer to reduce the computational cost while maintaining high performance, classifies the number of events in the data. The other network predicts the source locations and other source features, such as the source peak frequencies and amplitudes. To reduce the size of the input data to the network, we correlate the recorded traces with a central reference trace so that the network can focus on the curvature of the input data near the zero-lag region. We train the networks to handle single-event, multi-event, and no-event segments extracted from the data. Tests on a simple vertically varying model and a more realistic Otway field model demonstrate the approach's versatility and potential.
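
The correlation-based input reduction mentioned above can be sketched as follows; this is a hedged illustration assuming NumPy/SciPy, and the window length and trace shapes are arbitrary choices.

```python
# Sketch: correlate each trace with a central reference trace and keep the
# samples near zero lag, shrinking the input fed to the networks.
import numpy as np
from scipy.signal import correlate

def correlate_with_reference(traces, half_window=100):
    # traces: (n_receivers, n_samples) seismic section
    ref = traces[traces.shape[0] // 2]            # central reference trace
    n = traces.shape[1]
    out = []
    for tr in traces:
        xc = correlate(tr, ref, mode="full")      # length 2*n - 1, zero lag at n - 1
        out.append(xc[n - 1 - half_window : n + half_window])
    return np.stack(out)                          # (n_receivers, 2*half_window + 1)

section = correlate_with_reference(np.random.randn(64, 2000))
```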


In this paper, we identify the cry signals of infants in the 0-6 month age segment and the reasons behind those cries. Detection of baby cry signals is essential for the pre-processing of various applications involving cry analysis for baby caregivers, such as emotion detection, since cry signals hold information about the baby's well-being and can be understood to an extent by experienced parents and experts. We train and validate a neural network architecture for baby cry detection and also test the network with fastai. The trained neural network provides a model that can predict the reason behind the cry sound. Only cry sounds are recognized, and the user is alerted automatically. A web application was created that detects and responds to different states, including hunger, tiredness, discomfort, and belly pain.
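
A minimal sketch, assuming librosa and PyTorch rather than the authors' fastai setup, of the kind of pipeline described above: convert a cry recording to a mel-spectrogram and classify the reason behind the cry. The class list mirrors the abstract; layer sizes and the sample rate are illustrative.

```python
# Sketch: mel-spectrogram front end plus a small CNN for cry-reason classification.
import librosa
import torch
import torch.nn as nn

CLASSES = ["hunger", "tired", "discomfort", "bellypain"]

def cry_spectrogram(path):
    y, sr = librosa.load(path, sr=16000)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
    return torch.tensor(librosa.power_to_db(mel), dtype=torch.float32)[None, None]

classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, len(CLASSES)),
)
# probs = classifier(cry_spectrogram("cry.wav")).softmax(dim=1)
```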


In this research study, image identification is performed with the help of an advanced CNN (Convolutional Neural Network) using the TensorFlow framework. Python is used as the main programming language because TensorFlow is a Python library. The input data focuses on plant categories, identified with the help of their leaves. A CNN is the best approach for training and testing on this data because it produces promising and continuously improving results on automated plant identification. The results are compared in terms of accuracy and time: with the advanced CNN, accuracy is above 95%, while other approaches remain below 90% and take much more time.
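
A hedged Keras/TensorFlow sketch of the kind of CNN classifier described above for leaf-based plant identification; the layer sizes, input resolution, and number of plant classes are assumptions for illustration.

```python
# Sketch: small Keras CNN for leaf-image classification.
import tensorflow as tf

num_classes = 10   # number of plant species (illustrative)
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```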


2021 ◽  
Vol 2086 (1) ◽  
pp. 012148
Author(s):  
P A Khorin ◽  
A P Dzyuba ◽  
P G Serafimovich ◽  
S N Khonina

Recognition of the types of aberrations corresponding to individual Zernike functions was carried out from the intensity pattern of the point spread function (PSF) outside the focal plane using convolutional neural networks. The PSF intensity patterns outside the focal plane are more informative than those in the focal plane, even for small magnitudes of aberrations. The mean prediction errors of the neural network for each type of aberration were obtained for a set of 8 Zernike functions from a dataset of 2000 images of out-of-focus PSFs. As a result of training, for the considered types of aberrations, the obtained averaged absolute errors do not exceed 0.0053, which corresponds to an almost threefold decrease in the error compared with the same result for focal PSFs.
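
A minimal sketch, assuming PyTorch, of the regression setup described above: a small CNN maps an out-of-focus PSF intensity image to the magnitudes of 8 Zernike aberrations and is trained with a mean-absolute-error loss. The architecture and image size are illustrative, not the authors'.

```python
# Sketch: CNN regression from a PSF intensity image to 8 Zernike magnitudes.
import torch
import torch.nn as nn

n_zernike = 8
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, n_zernike),          # one magnitude per Zernike mode
)
loss_fn = nn.L1Loss()                  # mean absolute error, as reported above
psf = torch.rand(16, 1, 64, 64)        # batch of out-of-focus PSF images
pred = model(psf)
```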


Author(s):  
Md Gouse Pasha

Road accidents are increasing, and a growing number of cases are caused by driver drowsiness. To reduce these situations, we worked on a system that could lower the numbers by catching drowsiness early. Detecting a drowsy driver behind the steering wheel and warning them in time could reduce road accidents. In this case, drowsiness is detected using a camera: based on the captured images, the neural network determines whether the driver is awake or tired. A Convolutional Neural Network (CNN) is used, where each frame is examined separately and the average over the last 20 frames, corresponding to about one second, is evaluated against the training and test data. We analyse image segmentation methods and construct a model based on convolutional neural networks. Using a detailed database of more than 2000 image fragments, we train and analyse the segmentation network to extract the emotional state of the driver from the images.
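
The 20-frame averaging step can be sketched as follows; this is a hedged illustration, and the decision threshold and per-frame scoring function are assumptions.

```python
# Sketch: smooth per-frame CNN drowsiness scores over the last 20 frames.
from collections import deque

class DrowsinessSmoother:
    def __init__(self, window=20, threshold=0.5):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def update(self, frame_score: float) -> bool:
        # frame_score: CNN probability that the driver looks drowsy in this frame
        self.scores.append(frame_score)
        avg = sum(self.scores) / len(self.scores)
        return avg > self.threshold      # True -> raise the warning

smoother = DrowsinessSmoother()
# for frame in video: alarm = smoother.update(cnn_drowsy_probability(frame))
```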


2019 ◽  
pp. 47-52
Author(s):  
R. Yu. Belorutsky ◽  
S. V. Zhitnik

The problem of recognizing human speech, in the form of the digits from one to ten recorded with a dictaphone, is considered. The method of recognizing the spectrogram of the sound signal by means of convolutional neural networks is used. Algorithms for preliminary processing of the input data, network training, and word recognition are implemented. The recognition accuracy for different numbers of convolution layers is estimated; the appropriate number is determined and the structure of the neural network is proposed. A comparison of recognition accuracy is carried out when the input data for the network is either the spectrogram or the first two formants. The recognition algorithm is tested on male and female voices with different pronunciation durations.
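
A minimal sketch, assuming SciPy, of the spectrogram preprocessing used as network input; the sampling rate and window parameters are illustrative assumptions.

```python
# Sketch: log-magnitude spectrogram of one recorded word as CNN input.
import numpy as np
from scipy.signal import spectrogram

def word_spectrogram(signal, fs=16000):
    # spectrogram of one recorded digit ("one" .. "ten")
    f, t, sxx = spectrogram(signal, fs=fs, nperseg=256, noverlap=128)
    return np.log(sxx + 1e-10)          # (freq_bins, time_frames) array

spec = word_spectrogram(np.random.randn(16000))   # one second of audio
```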

