Deep ConvNet: Non-Random Weight Initialization for Repeatable Determinism, Examined with FSGM

Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4772
Author(s):  
Richard N. M. Rudd-Orthner ◽  
Lyudmila Mihaylova

This paper presents a repeatable and deterministic non-random weight initialization method for the convolutional layers of neural networks, examined with the Fast Gradient Sign Method (FGSM). The FGSM approach is used as a technique to measure the effect of the initialization under controlled distortions in transfer learning, varying the numerical similarity of the datasets. The focus is on convolutional layers, with earlier learning induced through the use of striped forms for image classification. The method provides higher accuracy in the first epoch, with improvements of 3–5% in a well-known benchmark model and of ~10% on a color image dataset (MTARSI2) using a dissimilar model architecture. The proposed method is robust with respect to the limits of optimization approaches such as Glorot/Xavier and He initialization. Arguably, the approach falls within a new category of weight initialization methods: a number-sequence substitution for random numbers, without a tether to the dataset. When examined under the FGSM approach with transfer learning, the proposed method, when used with higher distortions (numerically dissimilar datasets), is less compromised against the original cross-validation dataset, at ~31% accuracy instead of ~9%. This indicates higher retention of the original fitting in transfer learning.
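As an illustrative aside, the FGSM step used as the measurement technique in this abstract is compact: the input is nudged one epsilon-step along the sign of the loss gradient. A minimal sketch with a toy logistic-regression model (the paper attacks convolutional networks; the model and numbers here are invented purely for illustration):

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon):
    """Fast Gradient Sign Method: move each input element one
    epsilon-step in the direction that increases the loss."""
    return x + epsilon * np.sign(grad)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy logistic-regression "model" (illustrative stand-in).
w = np.array([0.5, -1.0, 2.0])   # fixed weights
x = np.array([1.0, 2.0, 0.5])    # clean input
y = 1.0                          # true label

p = sigmoid(w @ x)
grad_x = (p - y) * w             # d(cross-entropy loss)/dx

x_adv = fgsm_perturb(x, grad_x, epsilon=0.1)

# The adversarial input raises the loss relative to the clean input.
loss = lambda x_: -np.log(sigmoid(w @ x_))
assert loss(x_adv) > loss(x)
```

The same one-step construction applies to a ConvNet, with the gradient obtained by backpropagation to the input pixels.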

Author(s):  
Richard Niall Mark Rudd-Orthner ◽  
Lyudmila Mihaylova

This paper presents a non-random weight initialization method for the convolutional layers of neural networks, examined with the Fast Gradient Sign Method (FGSM) attack. The paper's focus is on convolutional layers, which are the layers responsible for better-than-human performance in image categorization. The proposed method induces earlier learning through the use of striped forms and, as such, requires less unlearning than the existing random-number speckled methods, consistent with the intuitions of Hubel and Wiesel. It provides higher accuracy in a single epoch, with improvements of 3–5% in a well-known benchmark model; the first epoch is the most relevant, as it is the epoch immediately after initialization. The proposed method is also repeatable and deterministic, a desirable quality for safety-critical applications of image classification within sensors, and it is robust with respect to the Glorot/Xavier and He initialization limits. The proposed non-random initialization was examined under adversarial perturbation attack through the FGSM approach with transfer learning, as a technique to measure the effect on transfer learning under controlled distortions, and the proposed method is found to be less compromised on the original validation dataset when the transferred datasets are more highly distorted.
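One way to read the striped-forms idea is as a deterministic pattern that replaces the random draw while staying inside the usual initialization limits. The sketch below is our own illustrative guess at such a scheme; the particular stripe pattern and scaling are assumptions, not the authors' published formula:

```python
import numpy as np

def striped_kernel(rows, cols, fan_in):
    """Deterministic, repeatable 'striped' kernel (hypothetical sketch):
    the sign alternates per row and the magnitude ramps per column,
    scaled so every weight stays within the He limit sqrt(2 / fan_in)."""
    limit = np.sqrt(2.0 / fan_in)
    r = np.arange(rows).reshape(-1, 1)
    c = np.arange(cols).reshape(1, -1)
    pattern = ((-1.0) ** r) * (c + 1.0) / cols   # magnitudes in (0, 1]
    return limit * pattern

k = striped_kernel(3, 3, fan_in=9)
# Repeatable: two calls give identical weights, unlike a random draw.
assert np.array_equal(k, striped_kernel(3, 3, fan_in=9))
# Bounded: no weight exceeds the He-initialization limit.
assert np.max(np.abs(k)) <= np.sqrt(2.0 / 9) + 1e-12
```

The two assertions capture the abstract's two claims: repeatable determinism, and compatibility with the Glorot/Xavier and He limits.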


Author(s):  
Caixia Sun ◽  
Lian Zou ◽  
Cien Fan ◽  
Yu Shi ◽  
Yifeng Liu

Deep neural networks are vulnerable to adversarial examples, which can fool models by adding carefully designed perturbations. An intriguing phenomenon is that adversarial examples often exhibit transferability, making black-box attacks effective in real-world applications. However, the adversarial examples generated by existing methods typically overfit the structure and feature representation of the source model, resulting in a low success rate in the black-box setting. To address this issue, we propose the multi-scale feature attack to boost attack transferability, which adjusts the internal feature-space representation of the adversarial image to move it farther from the internal representation of the original image. We show that a low-level layer and a high-level layer of the source model can be selected to conduct the perturbations, so that the crafted adversarial examples diverge from the original images not just in class but also in their feature-space representations. To further improve the transferability of adversarial examples, we apply a reverse cross-entropy loss to reduce overfitting further and show that it is effective for attacking adversarially trained models with strong defensive ability. Extensive experiments show that the proposed methods consistently outperform the iterative fast gradient sign method (IFGSM) and the momentum iterative fast gradient sign method (MIFGSM) under the challenging black-box setting.
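The core of the multi-scale feature attack can be sketched as gradient ascent on a loss that pushes both a low-level and a high-level feature map away from the clean image's representations. The two-layer tanh network, its sizes, and the step count below are invented for illustration, and the gradient is taken numerically to keep the sketch dependency-free:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))    # stands in for a low-level layer
W2 = rng.standard_normal((8, 8))    # stands in for a high-level layer

def features(x):
    f1 = np.tanh(x @ W1)            # low-level feature map
    f2 = np.tanh(f1 @ W2)           # high-level feature map
    return f1, f2

def feature_loss(x, f1_ref, f2_ref):
    # Push BOTH scales of features away from the clean representations.
    f1, f2 = features(x)
    return np.sum((f1 - f1_ref) ** 2) + np.sum((f2 - f2_ref) ** 2)

def num_grad(x, f1_ref, f2_ref, h=1e-5):
    # Central-difference gradient, avoiding hand-derived backprop here.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (feature_loss(x + e, f1_ref, f2_ref)
                - feature_loss(x - e, f1_ref, f2_ref)) / (2 * h)
    return g

x = rng.standard_normal(4)                 # "clean image"
f1_ref, f2_ref = features(x)
x_adv = x + 0.01 * rng.standard_normal(4)  # tiny starting perturbation
for _ in range(5):                         # IFGSM-style sign steps
    g = num_grad(x_adv, f1_ref, f2_ref)
    x_adv = x_adv + 0.05 * np.sign(g)

# The adversarial input has drifted away in feature space.
assert feature_loss(x_adv, f1_ref, f2_ref) > feature_loss(x, f1_ref, f2_ref)
```

In the paper's setting the loss would be evaluated on chosen layers of a real source ConvNet, with the reverse cross-entropy term added on top.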


2019 ◽  
Vol 9 (24) ◽  
pp. 5340 ◽  
Author(s):  
Małgorzata Plechawska-Wójcik ◽  
Mikhail Tokovarov ◽  
Monika Kaczorowska ◽  
Dariusz Zapała

Evaluation of cognitive workload finds application in many areas, from educational program assessment through professional driver health examination to monitoring the mental state of people carrying out jobs of high responsibility, such as pilots or air traffic dispatchers. Estimation of multilevel cognitive workload is usually realized in a subject-dependent way, while the present research focuses on developing a procedure for subject-independent evaluation of cognitive workload level. The aim of the paper is to estimate cognitive workload level with a subject-independent approach, applying classical machine learning methods combined with feature selection techniques. Data acquisition was based on registering the EEG signal of a person performing arithmetical tasks divided into six intervals of advancement. The analysis included the stages of preprocessing, feature extraction, and feature selection, while the final step covered multiclass classification performed with several models. The results show high maximal accuracies: ~91% for both the validation dataset and the cross-validation approach with the kNN model.
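Subject-independent evaluation means holding out all trials of one subject and training only on the others (leave-one-subject-out). A minimal sketch with a hand-rolled kNN classifier; the synthetic features below merely stand in for the EEG features used in the paper, and all sizes are invented:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Minimal k-nearest-neighbour classifier (Euclidean distance,
    majority vote over the k closest training samples)."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        nearest = y_train[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])
    return np.array(preds)

# Synthetic stand-in data: 4 subjects x 10 trials, 2 workload levels.
rng = np.random.default_rng(1)
subjects = np.repeat(np.arange(4), 10)
y = np.tile(np.repeat([0, 1], 5), 4)
X = y[:, None] * 2.0 + rng.standard_normal((40, 3)) * 0.3

# Leave-one-subject-out: the test subject never appears in training.
accs = []
for s in np.unique(subjects):
    test = subjects == s
    pred = knn_predict(X[~test], y[~test], X[test], k=3)
    accs.append(np.mean(pred == y[test]))
print(np.mean(accs))
```

Averaging accuracy over held-out subjects, rather than over pooled trials, is what makes the estimate subject-independent.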


2016 ◽  
Vol 7 (4) ◽  
Author(s):  
Mochammad Yusa ◽  
Ema Utami ◽  
Emha T. Luthfi

Abstract. Readmission is associated with quality measures for patients in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, make the calculation of quality of care complicated. Classification techniques from data mining can solve this problem. In this paper, three classifiers, i.e. Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes with various parameter settings, are evaluated using the 10-fold cross-validation technique. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and the Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and the Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE. Keywords: k-NN, naive bayes, diabetes, readmission
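The three performance measures named above have precise definitions; a minimal sketch of the Kappa statistic and a 10-fold split (the helper names and toy labels are illustrative, not the paper's code):

```python
import numpy as np

def cohen_kappa(y_true, y_pred):
    """Kappa statistic: observed agreement corrected for the agreement
    expected by chance from the two label distributions."""
    classes = np.unique(np.concatenate([y_true, y_pred]))
    po = np.mean(y_true == y_pred)                       # observed
    pe = sum(np.mean(y_true == c) * np.mean(y_pred == c)  # chance
             for c in classes)
    return (po - pe) / (1.0 - pe)

def ten_fold_indices(n, seed=0):
    """Shuffle the record indices and split them into 10 folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, 10)

# Toy labels to exercise the three measures.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 1, 1, 1, 0])
acc = np.mean(y_true == y_pred)             # Accuracy
mae = np.mean(np.abs(y_true - y_pred))      # MAE (0/1 labels)
kappa = cohen_kappa(y_true, y_pred)
folds = ten_fold_indices(100)               # 10 disjoint test folds
```

With 0/1 labels, accuracy and MAE are complementary (acc = 1 - MAE), which is why the abstract can rank classifiers differently on Kappa than on MAE only when label distributions differ.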


2021 ◽  
Author(s):  
Thi Lan Anh Dinh ◽  
Filipe Aires

Abstract. The use of statistical models to study the impact of weather on crop yield continues to grow. Unfortunately, this type of application is characterised by datasets with a very limited number of samples (typically one sample per year). In general, statistical inference uses three datasets: the training dataset to optimise the model parameters, the validation dataset to select the best model, and the testing dataset to evaluate the model's generalisation ability. Splitting the overall database into three datasets is impossible in crop yield modelling. The leave-one-out cross-validation method, or simply leave-one-out (LOO), was introduced to facilitate statistical modelling when the database is limited. However, the model choice is then made using the testing dataset, which can be misleading by favouring unnecessarily complex models. The nested cross-validation approach was introduced in machine learning to avoid this problem by truly using three datasets, especially for problems with limited databases. In this study, we propose one particular implementation of nested cross-validation, called the leave-two-out method (LTO), to choose the best model with an optimal model complexity (using the validation dataset) and to estimate the true model quality (using the testing dataset). Two applications are considered: Robusta coffee in Cu M'gar (Dak Lak, Vietnam) and grain maize over 96 French departments. In both cases, LOO is misleading in choosing too-complex models; LTO indicates that simpler models actually perform better when a reliable generalisation test is considered. The simple models obtained using the LTO approach have reasonable yield-anomaly forecasting skill for both study crops. The LTO approach can also be used in seasonal forecasting applications. We suggest that the LTO method should become a standard procedure for statistical crop modelling.
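The leave-two-out idea can be sketched with nested loops: the outer held-out sample estimates generalisation, while an inner held-out sample selects model complexity, so the test point never influences the model choice. A minimal polynomial-regression sketch of that idea (our reading, not the authors' exact implementation):

```python
import numpy as np

def lto_select(x, y, degrees):
    """Leave-two-out: outer loop holds out one testing sample; inner
    loop holds out a second, validation sample used to pick the model
    complexity (polynomial degree here, as a stand-in)."""
    test_errors = []
    for i in range(len(x)):                      # outer: testing sample
        xo, yo = np.delete(x, i), np.delete(y, i)
        val_err = {d: 0.0 for d in degrees}
        for j in range(len(xo)):                 # inner: validation sample
            xt, yt = np.delete(xo, j), np.delete(yo, j)
            for d in degrees:
                c = np.polyfit(xt, yt, d)
                val_err[d] += (np.polyval(c, xo[j]) - yo[j]) ** 2
        best = min(val_err, key=val_err.get)     # chosen without test point
        c = np.polyfit(xo, yo, best)
        test_errors.append((np.polyval(c, x[i]) - y[i]) ** 2)
    return np.mean(test_errors)

x = np.arange(12.0)
y = 2.0 * x + 1.0            # noiseless linear "yield" data
print(lto_select(x, y, degrees=[1, 2, 3]))  # near zero for linear data
```

Plain LOO would instead pick the degree that minimises error on the same held-out points used to report quality, which is the optimistic bias the paper warns about.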


2020 ◽  
Author(s):  
Binyin Li ◽  
Miao Zhang ◽  
Joost Riphagen ◽  
Kathryn Morrison Yochim ◽  
Biao Li ◽  
...  

Abstract

Background: Structural neuroimaging has been applied to the identification of individuals with Alzheimer's disease (AD) and mild cognitive impairment (MCI). However, these methods are greatly impacted by age, limiting their utility for the detection of preclinical pathology. Therefore, careful consideration of age effects in the modeling of AD degenerative patterns could provide more sensitive detection of the earliest stages of brain disease.

Methods: We built linear models for age based on multiple combined structural features (cortical thickness, subcortical structural volumes, ratio of gray- to white-matter signal intensity, white-matter signal abnormalities, total intracranial volume) in 272 healthy adults across a wide age range (D1: age 36-108). These models were then used to create a new support vector machine (SVM) training model with 10-fold cross-validation in 136 AD and 268 control participants (D2), based on deviations from the expected age effects found in the initial sample. Subsequent validation assessed the accuracy of the SVM model in correctly classifying AD patients in a new dataset (D3). Finally, we applied the classifier to individuals with MCI to evaluate prediction of early impairment and longitudinal cognitive change.

Results: Optimal cross-validation accuracy was 93.07% in D2, compared to 91.83% without the age detrending derived from D1. In the validation dataset (D3), the classifier obtained an accuracy of 84.85% (56/66), a sensitivity of 85.36% (35/41), and a specificity of 84% (21/25). In the MCI dataset, we observed significantly greater longitudinal cognitive decline in MCI participants who were classified as more 'AD-like' (MCI-AD), and this effect was pronounced in individuals with late MCI. The top five contributing features were the volumes of the left hippocampus, right hippocampus, and left amygdala, and the thickness of the left and right medial temporal and parahippocampal gyri.

Conclusions: Linear detrending for age in the SVM for combined structural features resulted in good performance for classification of AD and generalization to MCI prediction. Such procedures should be employed in future work.
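The detrending step described in the Methods can be sketched as regressing each structural feature on age in healthy adults and feeding the residuals (deviations from the expected age effect) to the classifier. The feature, slope, and sample below are synthetic stand-ins, not the study's data:

```python
import numpy as np

def detrend_by_age(ages, feature, fit_mask):
    """Fit feature = slope * age + intercept on the healthy reference
    sample (fit_mask), then return residuals for everyone: the
    deviation of each person from the expected age effect."""
    slope, intercept = np.polyfit(ages[fit_mask], feature[fit_mask], 1)
    return feature - (slope * ages + intercept)

rng = np.random.default_rng(2)
ages = rng.uniform(40, 100, 200)
# Hypothetical hippocampal-volume-like feature that shrinks with age.
healthy = ages * -0.02 + 8.0 + rng.normal(0.0, 0.05, 200)

resid = detrend_by_age(ages, healthy, np.ones(200, dtype=bool))

# Before detrending the feature tracks age; after, it no longer does,
# so a downstream SVM sees disease-related deviation rather than age.
assert abs(np.corrcoef(ages, healthy)[0, 1]) > 0.9
assert abs(np.corrcoef(ages, resid)[0, 1]) < 0.01
```

In the study this residualisation is done per feature on D1 before the SVM is trained on D2, which is why the reported accuracy gain is attributed to removing the age confound.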

