Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
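The simplex equiangular tight frame mentioned in the abstract can be made concrete with a small numerical check. The sketch below is not taken from the paper: it builds the standard simplex ETF for K classes using the textbook construction sqrt(K/(K-1)) (I - 11^T/K), and the names `simplex_etf`, `K`, and `M` are illustrative assumptions. It verifies the two defining properties: all class vectors have equal norm, and every pair has cosine -1/(K-1).

```python
import numpy as np

def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a K x K matrix whose columns form a simplex ETF (hypothetical helper)."""
    K = num_classes
    # Standard construction: centre the identity, then rescale to unit-norm columns.
    return np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

K = 4  # number of classes, chosen only for illustration
M = simplex_etf(K)

gram = M.T @ M                           # pairwise inner products of class vectors
norms = np.sqrt(np.diag(gram))           # length of each class vector
cosines = gram / np.outer(norms, norms)  # pairwise cosines
off_diag = cosines[~np.eye(K, dtype=bool)]

print("norms:", norms)
print("pairwise cosines:", off_diag)
```

For balanced classes, the abstract states that solutions of the Layer-Peeled Model take this geometry; the code only illustrates the geometry itself, not the optimization program.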

Identification of Thoracic Diseases by Exploiting Deep Neural Networks (Preprint)

10.2196/preprints.23644 ◽

2020 ◽

Author(s):

Albahli Saleh ◽

Ali Alkhalifah

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Medical Image Analysis ◽

Medical Community ◽

Learning Models ◽

X Ray ◽

Chest Disease

BACKGROUND To diagnose cardiothoracic diseases, a radiologist examines a chest x-ray (CXR). As more people are affected, doctors are becoming scarce, especially in developing countries. However, with the advent of image processing tools, the diagnosis of these diseases has seen great progress. Many researchers have investigated how the problems associated with medical images can be mitigated by using neural networks.

OBJECTIVE Previous works used state-of-the-art techniques and obtained effective results on one or two cardiothoracic diseases, but could lead to misclassification. In our work, we adopted generative adversarial networks (GANs) to synthesize CXRs and augment the training set on multiple cardiothoracic diseases, so as to diagnose chest diseases in different classes efficiently, as shown in Figure 1. Our major contributions are: classifying various cardiothoracic diseases to detect a specific chest disease from a CXR; using GANs to overcome the shortage of training data; addressing the problem of imbalanced data; and implementing an optimal deep neural network architecture with different hyperparameters to obtain the model with the best accuracy.

METHODS We do not build a model from scratch because of computational constraints, since doing so requires very high-end computers. Rather, we use a convolutional neural network (CNN), a class of deep neural networks, to propose a GAN-based model that generates synthetic training data, since the amount of data is limited. We use pre-trained models, that is, models trained on a large benchmark dataset to solve a problem similar to the one we want to solve. For example, the ResNet-152 model we used was initially trained on the ImageNet dataset.

RESULTS After training and validating the models we developed, ResNet-152 with image augmentation proved to be the best model for the automatic detection of cardiothoracic disease. However, one of the main problems in radiographic deep learning research is the scarcity of datasets, a key component of all deep learning models because they require a lot of data for training. For this reason, some of our models used image augmentation to increase the number of images without duplication. As more data are collected in the field of chest radiology, the models could be retrained to improve their accuracy, since deep learning models improve with more data.

CONCLUSIONS This research employs computer vision and medical image analysis to develop an automated model with clinical potential for early detection of the disease. Using deep learning models, the research aims to evaluate the effectiveness and accuracy of different convolutional neural network models in the automatic diagnosis of cardiothoracic diseases from x-ray images, compared with diagnosis by experts in the medical community.

Solving Differential Equations Using Feedforward Neural Networks

10.1007/978-3-030-86973-1_27 ◽

2021 ◽

pp. 385-399

Author(s):

Wilson Guasti Junior ◽

Isaac P. Santos

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Differential Equations ◽

Feedforward Neural Networks ◽

Activation Functions ◽

Learning Models ◽

Python Language ◽

The Neural Network ◽

Minimization Methods

In this work we explore the use of deep learning models based on deep feedforward neural networks to solve ordinary and partial differential equations. The methodology is illustrated by solving a variety of initial and boundary value problems. The numerical results, obtained with different feedforward neural network structures, activation functions, and minimization methods, were compared with each other and with the exact solutions. The neural network was implemented in Python with the TensorFlow library.
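To give a feel for how a feedforward network can represent the solution of a differential equation, here is a minimal sketch for the initial value problem y' = -y, y(0) = 1 on [0, 1]. It is not the authors' TensorFlow implementation, and several choices are assumptions made for brevity: the trial form y(x) = 1 + x·N(x), which satisfies the initial condition by construction; a single hidden tanh layer with fixed random weights `a` and `b`; and a linear least-squares fit of the output weights `w` in place of the iterative minimization methods the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 20                      # hidden units (illustrative choice)
a = rng.normal(size=n_hidden)      # fixed random input weights (assumption)
b = rng.normal(size=n_hidden)      # fixed random biases (assumption)

def hidden(x):
    """Hidden activations tanh(a*x + b) and their derivatives with respect to x."""
    phi = np.tanh(np.outer(x, a) + b)
    dphi = (1.0 - phi**2) * a
    return phi, dphi

def fit(x_col):
    """Choose output weights w so the ODE residual vanishes at the collocation points."""
    phi, dphi = hidden(x_col)
    x = x_col[:, None]
    # With y = 1 + x*N(x): y' + y = N + x*N' + 1 + x*N, which is linear in w.
    A = phi + x * dphi + x * phi
    rhs = -np.ones_like(x_col)
    w, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return w

def solution(x, w):
    """Evaluate the trial solution y(x) = 1 + x * N(x)."""
    phi, _ = hidden(x)
    return 1.0 + x * (phi @ w)

x_col = np.linspace(0.0, 1.0, 50)      # collocation points
w = fit(x_col)

x_test = np.linspace(0.0, 1.0, 200)
max_err = np.max(np.abs(solution(x_test, w) - np.exp(-x_test)))
print("max abs error vs exp(-x):", max_err)
```

Comparing against the exact solution exp(-x) mirrors the kind of comparison the abstract describes. Training all weights by gradient-based minimization would replace the `fit` step.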

Semiotic Aggregation in Deep Learning

Entropy ◽

10.3390/e22121365 ◽

2020 ◽

Vol 22 (12) ◽

pp. 1365

Author(s):

Bogdan Muşat ◽

Răzvan Andonie

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Decision Model ◽

Deep Neural Networks ◽

Neural Model ◽

Network Layers ◽

Saliency Maps ◽

Spatial Entropy ◽

Insight Into

Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can provide insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, the aggregation operation (known as superization) is accompanied by a decrease in spatial entropy: signs are aggregated into supersigns. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes that take place between successive layers of the network. In our experiments, we visualize the superization process and show how the knowledge obtained can be used to explain the neural decision model. In addition, we attempt to optimize the architecture of the neural model using a semiotic greedy technique. To the best of our knowledge, this is the first application of computational semiotics to the analysis and interpretation of deep neural networks.
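The idea that aggregation lowers the entropy of a saliency map can be illustrated with a toy calculation. The sketch below uses plain Shannon entropy of the normalized map as a stand-in. This is an assumption made for simplicity and is not the spatial entropy measure used in the paper, since it ignores where the salient pixels are located. The function name `shannon_entropy_bits` and the two 8 × 8 example maps are hypothetical.

```python
import numpy as np

def shannon_entropy_bits(saliency: np.ndarray) -> float:
    """Shannon entropy (in bits) of a non-negative map normalized to sum to 1."""
    p = saliency.astype(float).ravel()
    p = p / p.sum()
    p = p[p > 0]                     # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())

# A diffuse map: every pixel is equally salient.
diffuse = np.ones((8, 8))

# A concentrated map: all saliency sits on a single pixel.
focused = np.zeros((8, 8))
focused[3, 4] = 1.0

h_diffuse = shannon_entropy_bits(diffuse)
h_focused = shannon_entropy_bits(focused)
print("diffuse map entropy (bits):", h_diffuse)
print("focused map entropy (bits):", h_focused)
```

The diffuse map attains the maximum entropy for 64 pixels, while the concentrated map has none, which is the direction of change the abstract associates with superization.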

NNV: The Neural Network Verification Tool for Deep Neural Networks and Learning-Enabled Cyber-Physical Systems

Computer Aided Verification - Lecture Notes in Computer Science ◽

10.1007/978-3-030-53288-8_1 ◽

2020 ◽

pp. 3-17 ◽

Cited By ~ 5

Author(s):

Hoang-Dung Tran ◽

Xiaodong Yang ◽

Diego Manzanas Lopez ◽

Patrick Musau ◽

Luan Viet Nguyen ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

Cyber Physical Systems ◽

Physical Systems ◽

Verification Tool ◽

The Neural Network ◽

Network Verification

A Review of Recent Deep Learning Models in COVID-19 Diagnosis

European Journal of Engineering and Technology Research ◽

10.24018/ejers.2021.6.5.2485 ◽

2021 ◽

Vol 6 (5) ◽

pp. 10-15

Author(s):

Ela Bhattacharya ◽

D. Bhattacharya

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Test Results ◽

Learning Models ◽

Future Directions ◽

Human Contact ◽

The World ◽

Short Span

COVID-19 has emerged as the latest worrisome pandemic, with its outbreak reported to have begun in Wuhan, China. The infection spreads through human contact and, as a result, has caused massive numbers of infections across 200 countries around the world. Artificial intelligence has contributed to managing the COVID-19 pandemic in various ways within a short span of time. The deep neural networks explored in this paper have contributed to the detection of COVID-19 from imaging sources. This paper investigates the datasets, pre-processing, segmentation, feature extraction, classification, and test results, which can be useful for discovering future directions in the automatic diagnosis of the disease using artificial intelligence-based frameworks.

Effects of Food Contamination on Gastrointestinal Morbidity: Comparison of Different Machine-Learning Methods

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph16050838 ◽

2019 ◽

Vol 16 (5) ◽

pp. 838 ◽

Cited By ~ 4

Author(s):

Qin Song ◽

Yu-Jun Zheng ◽

Jun Yang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Linear Regression ◽

Deep Neural Networks ◽

Gastrointestinal Disease ◽

Food Contamination ◽

Gastrointestinal Diseases ◽

Central China ◽

Learning Models

Morbidity prediction can be useful in improving the effectiveness and efficiency of medical services, but accurate morbidity prediction is often difficult because of the complex relationships between diseases and their influencing factors. This study investigates the effects of food contamination on gastrointestinal-disease morbidities using eight different machine-learning models, including multiple linear regression, a shallow neural network, and three deep neural networks and their improved versions trained by an evolutionary algorithm. Experiments on datasets from ten cities/counties in central China demonstrate that deep neural networks achieve significantly higher accuracy than classical linear-regression and shallow neural-network models, and that the deep denoising autoencoder model with evolutionary learning exhibits the best prediction performance. The results also indicate that prediction accuracies on acute gastrointestinal diseases are generally higher than those on other diseases, but that the models have difficulty predicting the morbidities of gastrointestinal tumors. This study demonstrates that evolutionary deep-learning models can be used to accurately predict the morbidities of most gastrointestinal diseases from food contamination, and that this approach can be extended to the morbidity prediction of many other diseases.

OPTIMIZATION PROCESS ANALYSIS FOR HYPERPARAMETERS OF NEURAL NETWORK DATA PROCESSING STRUCTURES

Vestnik komp'iuternykh i informatsionnykh tekhnologii ◽

10.14489/vkit.2020.10.pp.003-010 ◽

2020 ◽

pp. 3-10

Author(s):

V. N. Gridin ◽

I. A. Evdokimov ◽

B. R. Salem ◽

V. I. Solodovnikov

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Neural Networks ◽

Process Analysis ◽

Validation Process ◽

The Neural Network ◽

Source Data ◽

The Neural Networks ◽

Qualitative Characteristics ◽

Setting Parameters

The key stages, implementation features, and functioning principles of neural networks, including deep neural networks, are analyzed. The problems of choosing the number of hidden elements, selecting the internal topology, and setting parameters are considered. It is shown that, during training and validation, it is possible to control the capacity of a neural network and evaluate the qualitative characteristics of the constructed model. The automation of the construction process and the optimization of hyperparameters for neural network structures are considered in relation to the user's tasks and the available source data. A number of approaches based on probabilistic programming, evolutionary algorithms, and recurrent neural networks are presented.

Sign Board Recognition Based on Convolutional Neural Network Using Yolo-3

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9214 ◽

2020 ◽

Vol 17 (8) ◽

pp. 3478-3483

Author(s):

V. Sravan Chowdary ◽

G. Penchala Sai Teja ◽

D. Mounesh ◽

G. Manideep ◽

C. T. Manimegalai

Keyword(s):

Neural Network ◽

Deep Learning ◽

Object Detection ◽

Convolutional Neural Network ◽

Road Accidents ◽

Learning Models ◽

The Neural Network

Road injuries have been a serious problem in society for some time. Ignoring sign boards while traveling on roads has become a major cause of road accidents. We therefore propose an approach that addresses this issue by detecting and recognizing sign boards. There are currently several deep learning models for object detection based on different algorithms, such as RCNN, Faster RCNN, and SPP-net. We chose Yolo-3, which improves the speed and precision of object detection. The algorithm increases accuracy by using residual units, skip connections, and up-sampling. It runs on a framework named Darknet, which is designed specifically to build the neural network for training the Yolo algorithm. We used this algorithm to detect sign boards thoroughly.

Increasing of Thermal Images Resolution Using Deep Learning Neural Networks

Pomiary Automatyka Robotyka ◽

10.14313/par_241/31 ◽

2021 ◽

Vol 25 (3) ◽

pp. 31-35

Author(s):

Piotr Więcek ◽

Dominik Sankowski

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Execution Time ◽

High Accuracy ◽

New Method ◽

Residual Network ◽

Thermal Images ◽

The Neural Network

The article presents a new algorithm for increasing the resolution of thermal images. For this purpose, the residual network was integrated with the Kernel-Sharing Atrous Convolution (KSAC) image sub-sampling module. The algorithm's complexity was significantly reduced and its execution time shortened while high accuracy was maintained. The neural network was implemented in the PyTorch environment. Results of the proposed method are presented for thermal images of sizes 32 × 24, 160 × 120, and 640 × 480 at scales up to 6.

Convolutional neural network for image classification based on transfer learning technique

10.32920/ryerson.14663658 ◽

2021 ◽

Author(s):

Ghassan Mohammed Halawani

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Image Classification ◽

Transfer Learning ◽

Learning Networks ◽

The Neural Network ◽

Learning Technique ◽

Common Architecture

The main purpose of this project is to modify a convolutional neural network for image classification, based on a deep-learning framework. A transfer learning technique is applied through the MATLAB interface to Alex-Net to train and modify the parameters in the last two fully connected layers of Alex-Net with a new dataset, in order to classify thousands of images. First, the common architecture of most neural networks and their benefits are presented. The mathematical models and the role of each part of the neural network are explained in detail. Second, different neural networks are studied in terms of architecture, application, and working method to highlight the strengths and weaknesses of each neural network. The final part presents a detailed study of one of the most powerful deep-learning networks for image classification, the convolutional neural network, and of how it can be modified to suit different classification tasks by using a transfer learning technique in MATLAB.
