Learning Sparse Neural Networks for Better Generalization

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/735 ◽

2020 ◽

Author(s):

Shiwei Liu

Keyword(s):

Neural Networks ◽

Test Data ◽

Deep Neural Networks ◽

Generalization Bounds ◽

Better Than

Deep neural networks perform well on test data when they are highly overparameterized, but overparameterization also makes them costly to train and deploy. As a leading approach to this problem, sparse neural networks have been widely used to significantly reduce network size, making networks more efficient during training and deployment without compromising performance. Recently, sparse neural networks, either compressed from a pre-trained model or obtained by training from scratch, have been observed to generalize as well as or even better than their dense counterparts. However, conventional techniques for finding well-fitted sparse sub-networks are expensive, and the mechanisms underlying this phenomenon are far from clear. To tackle these problems, this Ph.D. research aims to study the generalization of sparse neural networks and to propose more efficient approaches that can yield sparse neural networks with generalization bounds.
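
The abstract does not say how the sparse networks are obtained. As a minimal sketch of one common way to compress a pre-trained model, the snippet below applies magnitude pruning to a flat list of weights. Magnitude pruning is swapped in here for illustration and is not claimed to be the author's method; the function name `magnitude_prune` and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending absolute value; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


# Hypothetical weights, pruned to 50% sparsity.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
sparse_weights = magnitude_prune(weights, 0.5)
```

In practice the pruned network is usually fine-tuned afterwards, which is one source of the extra cost the abstract mentions.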


Deep Neural Networks with Multistate Activation Functions

Computational Intelligence and Neuroscience ◽

10.1155/2015/721367 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Chenghao Cai ◽

Yanyan Xu ◽

Dengfeng Ke ◽

Kaile Su

Keyword(s):

Neural Networks ◽

Gradient Descent ◽

Deep Neural Networks ◽

Error Rates ◽

Stochastic Gradient Descent ◽

Activation Functions ◽

Classification Problems ◽

Training Set ◽

Relative Improvement ◽

Better Than

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions that are capable of representing more than two states, and include the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to solve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.
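
For readers unfamiliar with the term, a relative improvement in an error rate is the reduction expressed as a fraction of the baseline. The sketch below shows the arithmetic; the baseline and new error rates are hypothetical values chosen only to illustrate the formula and are not taken from the paper.

```python
def relative_improvement(baseline_error, new_error):
    """Relative reduction in error, as a percentage of the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error * 100.0


# Hypothetical error rates (in percent), not the paper's figures.
improvement = relative_improvement(20.0, 18.88)
```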


Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically Based Features

Molecules ◽

10.3390/molecules26051285 ◽

2021 ◽

Vol 26 (5) ◽

pp. 1285

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems and leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data of human, chimp, and rat effects by chemicals have been used to build machine-learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics and by their interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available on github.
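
The abstract reports results as Matthews correlation coefficient (MCC) values. As a reminder of what the metric measures, here is a minimal sketch of the standard MCC formula for a binary confusion matrix. The counts in the example are hypothetical and unrelated to the paper's data.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
score = mcc(tp=40, tn=45, fp=5, fn=10)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred on unbalanced sets such as the one mentioned above.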


Convolutional Neural Networks for Classifying Melanoma Images

10.1101/2020.05.22.110973 ◽

2020 ◽

Author(s):

Abhinav Sagar ◽

J Dheeba

Keyword(s):

Neural Networks ◽

Skin Cancer ◽

Transfer Learning ◽

Convolutional Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Cancer Classification ◽

Previous State ◽

Better Than

In this work, we address the problem of skin cancer classification using convolutional neural networks. Many cancer cases are initially misdiagnosed as something else, leading to severe consequences including the death of the patient. There are also cases in which patients have some other problem and doctors suspect skin cancer, which leads to unnecessary time and money spent on further diagnosis. We address both problems using deep neural networks and a transfer learning architecture. We used publicly available ISIC databases for both training and testing our model. Our work achieves an accuracy of 0.935, precision of 0.94, recall of 0.77, F1 score of 0.85 and ROC-AUC of 0.861, which is better than previous state-of-the-art approaches.
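
The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below does that arithmetic; the function name `f1_score` is just an illustrative helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.94, 0.77)
```

Rounded to two decimals this gives 0.85, consistent with the figure in the abstract.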


Deep Neural Networks Offer Morphologic Classification and Diagnosis of Bacterial Vaginosis

Journal of Clinical Microbiology ◽

10.1128/jcm.02236-20 ◽

2020 ◽

Author(s):

Zhongxiao Wang ◽

Lei Zhang ◽

Min Zhao ◽

Ying Wang ◽

Huihui Bai ◽

...

Keyword(s):

Neural Networks ◽

Bacterial Vaginosis ◽

Deep Neural Networks ◽

Independent Test ◽

Golden Standard ◽

Test Sets ◽

Fully Connected ◽

Deep Learning Model ◽

Bacterial Morphotypes ◽

Better Than

Background: Bacterial vaginosis (BV) is caused by the excessive and imbalanced growth of bacteria in the vagina, affecting 30-50% of women at some point in their lives. Gram stain followed by Nugent scoring (NS) based on bacterial morphotypes under the microscope has been considered the gold standard for BV diagnosis, but it is often labor-intensive and time-consuming, and its results vary from person to person. Methods: We developed and optimized a convolutional neural network (CNN) model and evaluated its ability to automatically identify and classify three categories of Nugent scores from microscope images. The CNN model was first established with a panel of microscopic images with Nugent scores determined by experts. The model was trained by minimizing the cross-entropy loss function and optimized using a momentum optimizer. Separate test sets of images collected from three hospitals were evaluated by the CNN models. Results: The CNN model consisted of 25 convolutional layers, 2 pooling layers, and a fully connected layer. The model obtained 82.4% sensitivity and 96.6% specificity on the 5,815 validation images when altered vaginal flora and BV were considered positive samples, which was better than top-level technologists and obstetricians in China. The model generalized well: it obtained 75.1% accuracy for three categories of Nugent scores on the independent test set of 1,082 images, which was 6.6% higher than the average of three technologists who hold a bachelor's degree in medicine and are eligible to make diagnostic decisions. When three technologists ran one specimen in triplicate, the precision of three categories of Nugent scores was 54.0%. A total of 103 samples diagnosed by two technologists on different days showed a repeatability of 90.3%. Conclusion: The CNN model outperformed human healthcare practitioners on accuracy and stability for diagnosis of three categories of Nugent scores. The deep learning model may offer translational applications in automating diagnosis of bacterial vaginosis with proper supporting hardware.
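
The abstract mentions that the model was optimized with a momentum optimizer but gives no further detail. As a rough sketch of the general idea, the snippet below applies classical momentum updates to a one-dimensional quadratic loss. The learning rate, momentum coefficient, and toy loss are assumptions made for illustration and are not the paper's settings.

```python
def momentum_step(w, velocity, grad, lr=0.1, beta=0.9):
    """One classical momentum update: the velocity accumulates past gradients."""
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity


# Toy loss f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3.
w, velocity = 0.0, 0.0
for _ in range(300):
    grad = 2.0 * (w - 3.0)
    w, velocity = momentum_step(w, velocity, grad)
```

After the loop `w` sits very close to 3; in a real CNN the same update is applied to every weight, with the gradient of the cross-entropy loss in place of the toy gradient.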


Time Series Features Extraction and Forecast from Multi-feature Stocks with Hybrid Deep Neural Networks.

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200625220302 ◽

2020 ◽

Vol 13 ◽

Author(s):

Ye Xu ◽

Xun Yuan

Keyword(s):

Neural Networks ◽

Time Series ◽

Hybrid Model ◽

Deep Neural Networks ◽

Arima Model ◽

Forecast Accuracy ◽

Forecast Model ◽

Closing Price ◽

Massive Information ◽

Better Than

Background: Forecasting time series stock data is important in finance-related work. Stock data usually have multiple features, such as opening price and closing price. Traditional forecasting methods, however, are mainly applied to a single feature (the closing price) or to a few features, such as four or five. The massive information hidden in multi-feature data is not thoroughly discovered and used. Objective: To find a method that makes use of all the information in multi-feature data and yields a forecast model. Method: LSTM-based models are introduced in this paper. For comparison, three models are used: a single LSTM model, a hybrid LSTM-CNN model, and a traditional ARIMA model. Results: Experiments with the different models are performed on stock data with 50 and 230 features, respectively. The results show that the MSE of the single LSTM model is 2.4% lower than that of the ARIMA model, and the MSE of the LSTM-CNN model is 12.57% lower than that of the single LSTM model on the 50-feature data. On the 230-feature data, the LSTM-CNN model is found to improve forecast accuracy by 23.41%. Conclusion: In this paper, we use three different models (ARIMA, single LSTM, and the LSTM-CNN hybrid) to forecast the rise and fall of multi-feature stock data. The single LSTM model is better than the traditional ARIMA model on average, and the LSTM-CNN hybrid model is better than the single LSTM model on 50-feature stock data. Moreover, we use the LSTM-CNN model to perform experiments on stock data with 50 and 230 features, respectively, and find that the same model performs better on the 230-feature data than on the 50-feature data. Our work shows that the LSTM-CNN hybrid model is better than the other models and that experiments on stock data with more features can yield better outcomes. We will do more work on hybrid models next.


Sparse Data Recommendation by Fusing Continuous Imputation Denoising Autoencoder and Neural Matrix Factorization

Applied Sciences ◽

10.3390/app9010054 ◽

2018 ◽

Vol 9 (1) ◽

pp. 54 ◽

Cited By ~ 1

Author(s):

Xinyue Wan ◽

Bofeng Zhang ◽

Guobing Zou ◽

Furong Chang

Keyword(s):

Neural Networks ◽

Matrix Factorization ◽

Deep Neural Networks ◽

Original Data ◽

Classification Problems ◽

Denoising Autoencoder ◽

Data Sparsity ◽

The Public ◽

Factorization Model ◽

Better Than

In recent years, although deep neural networks have yielded immense success in solving various recognition and classification problems, the exploration of deep neural networks in recommender systems has received relatively less attention. Meanwhile, the inherent sparsity of data is still a challenging problem for deep neural networks. In this paper, firstly, we propose a new CIDAE (Continuous Imputation Denoising Autoencoder) model based on the Denoising Autoencoder to alleviate the problem of data sparsity. CIDAE performs regular continuous imputation on the missing parts of the original data and trains the imputed data as the desired output. Then, we optimize the existing advanced NeuMF (Neural Matrix Factorization) model, which combines matrix factorization and a multi-layer perceptron. By optimizing the training process of NeuMF, we improve the accuracy and robustness of NeuMF. Finally, this paper fuses CIDAE and optimized NeuMF with reference to the idea of ensemble learning. We name the fused model the I-NMF (Imputation-Neural Matrix Factorization) model. I-NMF can not only alleviate the problem of data sparsity, but also fully exploit the ability of deep neural networks to learn potential features. Our experimental results prove that I-NMF performs better than the state-of-the-art methods for the public MovieLens datasets.


Application of Deep Learning Methods to Forecasting Changes in Short-Term Currency Trends

Scientific Bulletin of Mukachevo State University Series “Economics” ◽

10.52566/msu-econ.7(2).2020.75-86 ◽

2020 ◽

Vol 7 (2) ◽

pp. 75-86

Author(s):

Vasily D. Derbentsev ◽

Vitalii S. Bezkorovainyi ◽

Iryna V. Luniak

Keyword(s):

Neural Networks ◽

Time Series ◽

Deep Learning ◽

Deep Neural Networks ◽

Short Term Memory ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Better Than

This study investigates forecasting changes in short-term currency trends using deep learning models, which is relevant to the scientific community as well as to traders and investors. The purpose of this study is to build a model for forecasting the direction of change in the prices of currency quotes based on deep neural networks. The developed architecture is based on the gated recurrent unit (GRU), a modification of the “Long Short-Term Memory” model that is simpler in terms of the number of parameters and training time. Forecasts of the dynamics of quotations for the euro/dollar currency pair and for the most capitalised cryptocurrency pair, Bitcoin/dollar, were computed using daily, four-hour and hourly datasets. The results of binary classification (forecasting the direction of trend change) on daily and hourly quotations were generally better than those of time series models or neural network models of other architectures (in particular, multilayer perceptron or “Long Short-Term Memory” models). According to the study results, classification accuracy was highest for the model of daily quotations: about 72% for euro/dollar and about 69% for Bitcoin/dollar. For four-hour and hourly time series, classification accuracy decreased, which can be explained both by the increased impact of “market noise” and by probable overfitting. Computer simulation demonstrated that the models predict a rising trend better than a declining one. The study confirmed the prospects for applying deep learning models to short-term forecasting of time series of currency quotes. The developed models proved effective for both fiat currencies and cryptocurrencies. The proposed system of models based on deep neural networks can be used as a basis for developing an automated trading system in the foreign exchange market.
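
The recurrent architecture described in this abstract, a simplified variant of LSTM with fewer parameters, reads like a gated recurrent unit (GRU). Assuming that reading, the sketch below shows a single GRU step for scalar inputs and states. The parameter names and values are hypothetical, and the sign convention for the update gate varies between references.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU step; `p` maps hypothetical parameter names to floats."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much of the old state is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z) * h + z * h_cand


# Hypothetical parameters (all zero) to make the arithmetic easy to follow.
params = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
h_next = gru_step(x=0.3, h=1.0, p=params)
```

A GRU has two gates and no separate cell state, whereas an LSTM has three gates plus a cell state, which is where the saving in parameters and training time comes from.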


Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features

10.20944/preprints202102.0318.v3 ◽

2021 ◽

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems and leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data of human, chimp, and rat effects by chemicals have been used to build machine learning classifiers and regressors and evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically-relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics and by their interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML


Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features

10.20944/preprints202102.0318.v2 ◽

2021 ◽

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems and leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data of human, chimp, and rat effects by chemicals have been used to build machine learning classifiers and regressors and evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically-relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics and by their interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML


Theoretical Investigation of Generalization Bounds for Adversarial Learning of Deep Neural Networks

Journal of Statistical Theory and Practice ◽

10.1007/s42519-021-00171-6 ◽

2021 ◽

Vol 15 (2) ◽

Author(s):

Qingyi Gao ◽

Xiao Wang

Keyword(s):

Neural Networks ◽

Theoretical Investigation ◽

Deep Neural Networks ◽

Adversarial Learning ◽

Generalization Bounds
