Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

2021 ◽  
Vol 118 (43) ◽  
pp. e2103091118
Author(s):  
Cong Fang ◽  
Hangfeng He ◽  
Qi Long ◽  
Weijie J. Su

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
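
To make the geometry mentioned above concrete, here is a minimal sketch (not the authors' code) that builds a simplex equiangular tight frame for K classes with the standard construction sqrt(K/(K-1)) * (I - 11^T/K) and checks its two defining properties: unit-norm vectors and pairwise inner products equal to -1/(K-1). The helper names `simplex_etf` and `dot`, and the choice K = 4, are assumptions made for illustration only.

```python
import math


def simplex_etf(num_classes):
    """Return K vectors in R^K that form a simplex equiangular tight frame.

    Uses the standard construction sqrt(K / (K - 1)) * (I - (1/K) * ones);
    the matrix is symmetric, so its rows and columns coincide.
    """
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Illustrative choice of K; any K >= 2 works.
K = 4
vectors = simplex_etf(K)

# Gram matrix: 1 on the diagonal, -1/(K-1) everywhere else.
gram = [[dot(u, v) for v in vectors] for u in vectors]
for row in gram:
    print([round(entry, 4) for entry in row])
```

The equal norms and equal pairwise angles are what "equiangular" refers to; the abstract's claim is that, in the class-balanced case, the solutions of the Layer-Peeled Model take this form.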

2020 ◽  
Author(s):  
Albahli Saleh ◽  
Ali Alkhalifah

BACKGROUND To diagnose cardiothoracic diseases, a radiologist examines a chest x-ray (CXR). As more people are affected, doctors are becoming scarce, especially in developing countries. However, with the advent of image-processing tools, the task of diagnosing these diseases has seen great progress, and many researchers have studied how neural networks can mitigate the problems associated with medical images. OBJECTIVE Previous works used state-of-the-art techniques and obtained effective results on one or two cardiothoracic diseases, but these approaches could lead to misclassification. In our work, we adopted GANs to synthesize chest radiographs (CXRs) and augment the training set for multiple cardiothoracic diseases, so that chest diseases in different classes can be diagnosed efficiently, as shown in Figure 1. Our major contributions are classifying various cardiothoracic diseases to detect a specific chest disease from a CXR; using GANs to overcome the shortage of data in small training datasets; addressing the problem of imbalanced data; and implementing an optimal deep neural network architecture with different hyperparameters to obtain the best accuracy. METHODS For this research, we do not build a model from scratch because of computational constraints, as training from scratch requires very high-end computers. Rather, we use a convolutional neural network (CNN), a class of deep neural networks, to propose a generative adversarial network (GAN)-based model that generates synthetic training data, because the amount of available data is limited. We use pre-trained models, that is, models that were trained on a large benchmark dataset to solve a problem similar to the one we want to solve. For example, the ResNet-152 model we used was initially trained on the ImageNet dataset.
RESULTS After successful training and validation of the models we developed, ResNet-152 with image augmentation proved to be the best model for the automatic detection of cardiothoracic disease. However, one of the main problems in radiographic deep-learning research is the scarcity of datasets, which are a key component of all deep-learning models because these models require a lot of data for training. For this reason, some of our models used image augmentation to increase the number of images without duplication. As more data are collected in the field of chest radiology, the models could be retrained to improve their accuracy, since deep-learning models improve with more data. CONCLUSIONS This research employs computer vision and medical image analysis to develop an automated model that has clinical potential for early detection of the disease. Using deep-learning models, the research aims to evaluate the effectiveness and accuracy of different convolutional neural network models in the automatic diagnosis of cardiothoracic diseases from x-ray images, compared with diagnosis by experts in the medical community.


2021 ◽  
pp. 385-399
Author(s):  
Wilson Guasti Junior ◽  
Isaac P. Santos

In this work we explore the use of deep learning models based on deep feedforward neural networks to solve ordinary and partial differential equations. We illustrate the methodology by solving a variety of initial and boundary value problems. The numerical results, obtained with different feedforward neural network structures, activation functions, and minimization methods, were compared with each other and with the exact solutions. The neural network was implemented in Python with the TensorFlow library.
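
As a rough illustration of how a feedforward network can be set up to solve a differential equation, the sketch below builds the loss for the initial value problem y' = -y, y(0) = 1. It assumes a trial-solution form y_t(x) = 1 + x * N(x), which satisfies the initial condition by construction; the abstract does not say which formulation the authors use, so this form, the test equation, and the names `network`, `trial_solution`, and `residual_loss` are all assumptions. The sketch uses plain Python with hand-written derivatives instead of TensorFlow, and it shows only the loss, not the minimization step.

```python
import math


def network(x, params):
    """One-hidden-layer tanh network N(x) and its derivative dN/dx.

    params is a list of (w, b, v) triples: input weight, bias, output weight.
    """
    value, slope = 0.0, 0.0
    for w, b, v in params:
        t = math.tanh(w * x + b)
        value += v * t
        slope += v * w * (1.0 - t * t)  # d/dx tanh(wx + b) = w * (1 - tanh^2)
    return value, slope


def trial_solution(x, params):
    """Return y_t(x) = 1 + x * N(x) and its derivative.

    y_t(0) = 1 holds for any parameters, so the initial condition is built in.
    """
    n, dn = network(x, params)
    return 1.0 + x * n, n + x * dn


def residual_loss(params, points):
    """Mean squared residual of y' + y = 0 over the collocation points."""
    total = 0.0
    for x in points:
        y, dy = trial_solution(x, params)
        total += (dy + y) ** 2
    return total / len(points)


# Collocation points on [0, 1] and an arbitrary, untrained parameter set.
points = [i / 10 for i in range(11)]
params = [(1.0, 0.0, 0.3), (0.5, 0.1, -0.2)]
print(residual_loss(params, points))
```

A minimization method (for example gradient descent or a quasi-Newton method) would then adjust `params` to drive this loss toward zero; the resulting y_t can be compared with the exact solution exp(-x).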


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1365
Author(s):  
Bogdan Muşat ◽  
Răzvan Andonie

Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can give insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, the aggregation of signs into supersigns (known as superization) is accompanied by a decrease of spatial entropy. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes that take place between successive layers of the network. In our experiments, we visualize the superization process and show how the obtained knowledge can be used to explain the neural decision model. In addition, we attempt to optimize the architecture of the neural model by employing a semiotic greedy technique. To the best of our knowledge, this is the first application of computational semiotics in the analysis and interpretation of deep neural networks.
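
The idea that a more concentrated saliency map carries lower entropy can be illustrated with a small sketch. Note the substitution: the code below computes the plain Shannon entropy of a normalized map, which ignores where the salient pixels are located, so it is only a simplified stand-in for the spatial entropy used in the paper. The function name `shannon_entropy` and both toy maps are hypothetical.

```python
import math


def shannon_entropy(saliency_map):
    """Shannon entropy (in bits) of a non-negative 2D map normalized to sum 1.

    Simplified stand-in: unlike a spatial entropy, this measure does not
    depend on the spatial arrangement of the values.
    """
    flat = [value for row in saliency_map for value in row]
    total = sum(flat)
    if total <= 0:
        raise ValueError("saliency map must contain positive mass")
    probs = [value / total for value in flat if value > 0]
    return -sum(p * math.log2(p) for p in probs)


# A diffuse map (saliency spread evenly over 16 cells) and a
# concentrated map (all saliency in a single cell).
diffuse = [[1.0] * 4 for _ in range(4)]
concentrated = [[0.0] * 4 for _ in range(4)]
concentrated[1][2] = 1.0

print(shannon_entropy(diffuse))
print(shannon_entropy(concentrated))
```

The diffuse map has the maximum entropy for 16 cells, while the concentrated map has none, which mirrors the direction of change the abstract associates with superization.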


2021 ◽  
Vol 6 (5) ◽  
pp. 10-15
Author(s):  
Ela Bhattacharya ◽  
D. Bhattacharya

COVID-19 has emerged as the latest worrisome pandemic, with its outbreak reported in Wuhan, China. The infection spreads through human contact and, as a result, has caused massive numbers of infections across 200 countries around the world. Artificial intelligence has contributed to managing the COVID-19 pandemic in various ways within a short span of time. The deep neural networks explored in this paper have contributed to the detection of COVID-19 from imaging sources. This paper investigates the datasets, pre-processing, segmentation, feature extraction, classification, and test results, which can be useful for discovering future directions in the automatic diagnosis of the disease using artificial intelligence-based frameworks.


Author(s):  
Qin Song ◽  
Yu-Jun Zheng ◽  
Jun Yang

Morbidity prediction can be useful in improving the effectiveness and efficiency of medical services, but accurate morbidity prediction is often difficult because of the complex relationships between diseases and their influencing factors. This study investigates the effects of food contamination on gastrointestinal-disease morbidities using eight different machine-learning models, including multiple linear regression, a shallow neural network, and three deep neural networks and their improved versions trained by an evolutionary algorithm. Experiments on datasets from ten cities/counties in central China demonstrate that deep neural networks achieve significantly higher accuracy than classical linear-regression and shallow neural-network models, and that the deep denoising autoencoder model with evolutionary learning exhibits the best prediction performance. The results also indicate that prediction accuracies on acute gastrointestinal diseases are generally higher than those on other diseases, but the models have difficulty predicting the morbidities of gastrointestinal tumors. This study demonstrates that evolutionary deep-learning models can be used to accurately predict the morbidities of most gastrointestinal diseases from food contamination, and that this approach can be extended to the morbidity prediction of many other diseases.


Author(s):  
V. N. Gridin ◽  
I. A. Evdokimov ◽  
B. R. Salem ◽  
V. I. Solodovnikov

The key stages, implementation features, and operating principles of neural networks, including deep neural networks, are analyzed. The problems of choosing the number of hidden elements, selecting the internal topology, and setting parameters are considered. It is shown that, during training and validation, it is possible to control the capacity of a neural network and evaluate the qualitative characteristics of the constructed model. The automation of the construction process and the optimization of hyperparameters of neural network structures are considered, depending on the user's tasks and the available source data. A number of approaches based on probabilistic programming, evolutionary algorithms, and recurrent neural networks are presented.


2020 ◽  
Vol 17 (8) ◽  
pp. 3478-3483
Author(s):  
V. Sravan Chowdary ◽  
G. Penchala Sai Teja ◽  
D. Mounesh ◽  
G. Manideep ◽  
C. T. Manimegalai

Road injuries have been a serious problem in society for some time. Ignoring sign boards while driving has become a major cause of road accidents. We therefore propose an approach that addresses this issue by detecting and recognizing sign boards. Several deep learning models for object detection already exist, built on different algorithms such as R-CNN, Faster R-CNN, and SPP-net. We chose YOLOv3, which improves the speed and precision of object detection. The algorithm increases accuracy by using residual units, skip connections, and up-sampling. It runs on a framework named Darknet, which is intended specifically for creating and training the neural network of the YOLO algorithm. We used this algorithm to detect sign boards.


2021 ◽  
Vol 25 (3) ◽  
pp. 31-35
Author(s):  
Piotr Więcek ◽  
Dominik Sankowski

The article presents a new algorithm for increasing the resolution of thermal images. For this purpose, a residual network was integrated with the Kernel-Sharing Atrous Convolution (KSAC) image sub-sampling module. The algorithm's complexity and execution time were reduced significantly while high accuracy was maintained. The neural network has been implemented in the PyTorch environment. Results of the proposed method are presented for thermal images of sizes 32 × 24, 160 × 120 and 640 × 480 and for scales up to 6.
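
The kernel-sharing idea can be sketched in one dimension: a single set of kernel weights is applied at several atrous (dilation) rates, and the branch outputs are combined. The code below is a toy illustration in plain Python, not the article's PyTorch implementation; the rates (1, 2, 4), the combination of branches by summation, and the function names `dilated_conv1d` and `kernel_sharing_atrous` are assumptions.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1D atrous convolution (cross-correlation) with zero padding.

    Assumes an odd-length kernel; taps are spaced `rate` samples apart.
    """
    half = len(kernel) // 2
    output = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[j]
        output.append(acc)
    return output


def kernel_sharing_atrous(signal, kernel, rates=(1, 2, 4)):
    """Apply the SAME kernel at every dilation rate and sum the branches."""
    branches = [dilated_conv1d(signal, kernel, rate) for rate in rates]
    return [sum(values) for values in zip(*branches)]


# A unit impulse makes the receptive field of each branch visible.
impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shared_kernel = [1.0, 2.0, 3.0]
print(kernel_sharing_atrous(impulse, shared_kernel))
```

Because every branch reuses `shared_kernel`, the number of learnable weights does not grow with the number of dilation rates, which is consistent with the reduction in complexity the abstract reports.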


2021 ◽  
Author(s):  
Ghassan Mohammed Halawani

The main purpose of this project is to modify a convolutional neural network for image classification, based on a deep-learning framework. Transfer learning is applied through the MATLAB interface to AlexNet to retrain the parameters of its last two fully connected layers on a new dataset, in order to classify thousands of images. First, the general architecture common to most neural networks and its benefits are presented. The mathematical models and the role of each part of the neural network are explained in detail. Second, different neural networks are studied in terms of architecture, application, and working method to highlight the strengths and weaknesses of each neural network. The final part conducts a detailed study of one of the most powerful deep-learning networks for image classification, the convolutional neural network, and of how it can be modified to suit different classification tasks by using the transfer learning technique in MATLAB.

