Simple Convolutional-Based Models: Are They Learning the Task or the Data?

2021 ◽  
pp. 1-17
Author(s):  
Luis Sa-Couto ◽  
Andreas Wichert

Abstract Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and removes the focus from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability, a hidden overfitting. More concretely, a supervised, backpropagation-based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both trained models and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants, like the What-Where model, go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. Knowing that, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets for it: MNIST and ETL-1. To make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with expectations.
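
As a rough illustration of the cross-data-set protocol described above, the sketch below trains a small Keras CNN on MNIST and then evaluates it, without retraining, on a second digit set. The `load_etl1()` helper and the resizing of ETL-1 digits to the MNIST format are assumptions for illustration only; neither the architecture nor the hyperparameters are the authors' exact setup.

```python
import tensorflow as tf

# Train a small CNN on MNIST, then test it unchanged on a second digit data set.
(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
x_tr, x_te = x_tr[..., None] / 255.0, x_te[..., None] / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_tr, y_tr, epochs=5, batch_size=128, verbose=2)

print("Same data set:", model.evaluate(x_te, y_te, verbose=0))

# Hypothetical helper: ETL-1 digits preprocessed to the same 28x28 grayscale format.
# x_etl, y_etl = load_etl1()
# print("Other data set, no retraining:", model.evaluate(x_etl, y_etl, verbose=0))
```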

Author(s):  
Paolo Massimo Buscema ◽  
William J Tastle

Data sets collected independently using the same variables can be compared using a new artificial neural network called the Artificial Neural Network What If Theory (AWIT). Given a data set that is deemed the standard reference for some object, e.g., a flower, industry, disease, or galaxy, other data sets can be compared against it to identify their proximity to the standard. Thus, data that might not lend themselves well to traditional methods of analysis could reveal new perspectives or views of the data and, potentially, new perceptions of novel and innovative solutions. This method comes out of the field of artificial intelligence, particularly artificial neural networks, and utilizes both machine learning and pattern recognition to deliver an innovative analysis.


2020 ◽  
Vol 34 (3) ◽  
pp. 683
Author(s):  
Elena Badal-Valero ◽  
Belén García-Cárceles

This paper explores the possibilities offered by statistical tools based on artificial neural networks for pattern recognition in expert work on money-laundering detection. The data were provided by the Spanish Police Department and come from a case on which the department is currently working. Account information is provided in which some accounting entries are identified as fraud, so this information can be used to train a classification model. After briefly describing the methodology and fitting strategy, the analysis presents a model with promising predictive capacity, even on a strongly unbalanced training data set. Applying a balancing technique (SMOTE) to the training data improves the results markedly, which indicates the viability of such models as a planning tool for police experts, providing a way to reduce the use of expensive investigative resources.
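
A hedged sketch of how SMOTE re-balancing might be combined with a small neural-network classifier. The synthetic data below stands in for the (unavailable) police accounting entries, and the use of scikit-learn's MLPClassifier, the feature count, and the class ratio are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

# Synthetic stand-in for the accounting entries: 20 features, roughly 4% fraud.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
y = (X[:, 0] + X[:, 1] > 2.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample only the training split so the test set keeps its real imbalance.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=300, random_state=0)
clf.fit(X_bal, y_bal)
print(classification_report(y_te, clf.predict(X_te), digits=3))
```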


2020 ◽  
Vol 10 (1) ◽  
pp. 55-60
Author(s):  
Owais Mujtaba Khanday ◽  
Samad Dadvandipour

Deep Neural Networks (DNNs) have in the past few years revolutionized computer vision, providing the best results on a large number of problems such as image classification, pattern recognition, and speech recognition. One of the essential deep learning models for image classification is the convolutional neural network. These networks integrate varying numbers of features, or so-called filters, in a multi-layer fashion within convolutional layers. They use convolutional and pooling layers for feature abstraction and have neurons arranged in three dimensions: height, width, and depth. Filters of three different sizes were used: 3×3, 5×5 and 7×7. The accuracy on the training data decreased from 100% to 97.8% as the filter size increased, and the accuracy on the test data set also decreased: 98.7% for 3×3, 98.5% for 5×5, and 97.8% for 7×7. The loss on the training and test data over 10 epochs increased drastically, from 3.4% to 27.6% and from 12.5% to 23.02%, respectively. Thus filters of smaller dimensions yield less loss than those of larger dimensions. However, using a smaller filter size comes at the cost of computational complexity, which is crucial for larger data sets.
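
The comparison reported above can be reproduced in outline with a small Keras model whose convolutional kernel size is a parameter. This is only a sketch of the experimental setup, trained on MNIST as an example; the exact architecture, data set, and hyperparameters of the study are not reproduced.

```python
import tensorflow as tf

def build_cnn(filter_size):
    """Small CNN whose convolutional kernel size (3, 5 or 7) is configurable."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, filter_size, activation="relu",
                               input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, filter_size, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

(x_tr, y_tr), (x_te, y_te) = tf.keras.datasets.mnist.load_data()
x_tr, x_te = x_tr[..., None] / 255.0, x_te[..., None] / 255.0

for k in (3, 5, 7):
    model = build_cnn(k)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_tr, y_tr, epochs=10, batch_size=128, verbose=0)
    loss, acc = model.evaluate(x_te, y_te, verbose=0)
    print(f"{k}x{k} filters: test accuracy {acc:.4f}, test loss {loss:.4f}")
```

Larger kernels have more weights per filter and see coarser detail per layer, which is consistent with the reported drop in accuracy as the filter size grows.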


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 11
Author(s):  
Domonkos Haffner ◽  
Ferenc Izsák

The localization of multiple scattering objects is performed using scattered waves, with neural networks, an up-to-date approach, used to estimate the corresponding locations. In the scattering phenomenon under investigation, we assume known incident plane waves, fully reflecting balls with known diameters, and measurement data of the scattered wave on one fixed segment. The training data are constructed using the simulation package μ-diff in Matlab. The structure of the neural networks, which are widely used for similar purposes, is further developed: a complex locally connected layer is the main component of the proposed setup. Together with appropriate preprocessing of the training data set, this keeps the number of parameters at a relatively low level. As a result, using a relatively large training data set, the unknown locations of the objects can be estimated effectively.
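
The key architectural ingredient is a locally connected layer, i.e. a convolution-like layer whose weights are not shared across positions along the measurement segment. Below is a minimal, generic PyTorch sketch of such a layer; it is real-valued and one-dimensional for simplicity, so it illustrates the idea rather than the authors' exact complex-valued configuration, and the channel and length values in the example are arbitrary.

```python
import torch
import torch.nn as nn

class LocallyConnected1d(nn.Module):
    """Convolution-like layer without weight sharing: every output position owns
    its own kernel, so the response may vary along the measurement segment
    while connections stay local and the parameter count stays moderate."""

    def __init__(self, in_channels, out_channels, in_length, kernel_size):
        super().__init__()
        self.kernel_size = kernel_size
        out_length = in_length - kernel_size + 1
        self.weight = nn.Parameter(
            0.01 * torch.randn(out_length, out_channels, in_channels * kernel_size))
        self.bias = nn.Parameter(torch.zeros(out_length, out_channels))

    def forward(self, x):                                 # x: (batch, C_in, length)
        patches = x.unfold(2, self.kernel_size, 1)        # (B, C_in, L_out, K)
        patches = patches.permute(0, 2, 1, 3).flatten(2)  # (B, L_out, C_in*K)
        out = torch.einsum("blf,lof->blo", patches, self.weight) + self.bias
        return out.permute(0, 2, 1)                       # (B, C_out, L_out)

# Example: scattered-field samples on one segment, 2 channels (real/imaginary parts).
layer = LocallyConnected1d(in_channels=2, out_channels=8, in_length=64, kernel_size=5)
print(layer(torch.randn(4, 2, 64)).shape)  # torch.Size([4, 8, 60])
```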


2020 ◽  
Vol 6 ◽  
Author(s):  
Jaime de Miguel Rodríguez ◽  
Maria Eugenia Villafañe ◽  
Luka Piškorec ◽  
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
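
A hedged sketch of the interpolation step described above: once a VAE has been trained, hybrid objects can be decoded from points lying on a line between the latent codes of two samples. The toy encoder, decoder, latent size, and feature size below are placeholders standing in for the trained connectivity-map model, which is not reproduced here.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the trained VAE (the real model uses the connectivity-map
# representation of wireframe geometries and is not reproduced here).
latent_dim, feature_dim = 16, 128
encoder = nn.Sequential(nn.Linear(feature_dim, 64), nn.ReLU(),
                        nn.Linear(64, 2 * latent_dim))   # outputs mean and log-variance
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, feature_dim))

def interpolate(sample_a, sample_b, steps=8):
    """Decode geometries along a straight line between two latent means."""
    with torch.no_grad():
        mu_a = encoder(sample_a)[:latent_dim]   # first half of the output = mean
        mu_b = encoder(sample_b)[:latent_dim]
        return [decoder((1 - t) * mu_a + t * mu_b)
                for t in torch.linspace(0.0, 1.0, steps)]

hybrids = interpolate(torch.randn(feature_dim), torch.randn(feature_dim))
print(len(hybrids), hybrids[0].shape)  # 8 interpolated "wireframe" vectors
```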


2021 ◽  
Vol 11 (15) ◽  
pp. 6723
Author(s):  
Ariana Raluca Hategan ◽  
Romulus Puscas ◽  
Gabriela Cristea ◽  
Adriana Dehelean ◽  
Francois Guyon ◽  
...  

The present work aims to test the potential of Artificial Neural Networks (ANNs) for food authentication. For this purpose, honey was chosen as the working matrix. The samples originated from two countries, Romania (50) and France (53), and covered the floral origins acacia, linden, honeydew, colza, galium verum, coriander, sunflower, thyme, raspberry, lavender and chestnut. The ANNs were built on the isotope and elemental content of the investigated honey samples. This approach led to a prediction model for geographical recognition with an accuracy of 96%. Alongside this work, distinct models were developed and tested with the aim of identifying the most suitable configurations for this application. In this regard, improvements were performed continuously; the most important consisted in overcoming the unwanted phenomenon of over-fitting observed for the training data set. This was achieved by identifying appropriate values for the number of iterations over the training data and for the size and number of the hidden layers, and by introducing a dropout layer into the configuration of the neural structure. In conclusion, ANNs can be successfully applied in food authenticity control, but with a degree of caution with respect to "over-optimizing" the correct classification percentage on the training sample set, which can lead to an over-fitted model.
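
The over-fitting counter-measures mentioned above (limiting the number of training iterations, sizing the hidden layers, adding a dropout layer) can be sketched with a small Keras classifier. The layer sizes, dropout rate, feature count, and early-stopping settings below are illustrative assumptions, not the configuration reported in the paper, and the honey data set itself is not reproduced.

```python
import tensorflow as tf

n_features, n_classes = 30, 2   # e.g. isotope/elemental descriptors, two countries

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.3),                 # dropout layer against over-fitting
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping caps the number of passes over the training data, so training-set
# accuracy is not "over-optimized" at the expense of generalization.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                              restore_best_weights=True)
# model.fit(X_train, y_train, validation_split=0.2, epochs=500,
#           callbacks=[early_stop])   # X_train/y_train: the honey feature table
```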


2021 ◽  
Author(s):  
Louise Bloch ◽  
Christoph M. Friedrich

Abstract Background: The prediction of whether subjects with Mild Cognitive Impairment (MCI) will prospectively develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable to improve early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained for a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. The validation was executed for an independent ADNI test data set and for the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training, and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: On the independent ADNI test data set, the XGBoost models that excluded 116 of the 467 subjects from the training data set based on their Logistic Regression (LR) data Shapley values outperformed the models trained on the entire training data set, which reached a mean classification accuracy of 58.54 %, by 14.13 % (8.27 percentage points). The XGBoost models trained on the entire training data set reached a mean accuracy of 60.35 % for the AIBL data set. An improvement of 24.86 % (15.00 percentage points) could be reached for the XGBoost models if the 72 subjects with the smallest RF data Shapley values were excluded from the training data set. Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data were associated with the number of ApoEϵ4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
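
A rough sketch of the final two stages of such a workflow, XGBoost model training followed by per-subject SHAP explanations. Synthetic data stands in for the volumetric MRI features, TreeExplainer is used here instead of Kernel SHAP for brevity, and the data-Shapley-based subject selection step is not reproduced; all sizes and parameters are assumptions for illustration.

```python
import numpy as np
import shap
import xgboost
from sklearn.model_selection import train_test_split

# Synthetic stand-in for volumetric MRI features of MCI subjects (stable vs. converter).
rng = np.random.default_rng(0)
X = rng.normal(size=(467, 25))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=467) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = xgboost.XGBClassifier(n_estimators=200, max_depth=3)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))

# Per-subject feature attributions for clinically interpretable explanations.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
print(np.shape(shap_values))   # one attribution per subject and feature
```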


2020 ◽  
Vol 34 (04) ◽  
pp. 5620-5627 ◽  
Author(s):  
Murat Sensoy ◽  
Lance Kaplan ◽  
Federico Cerutti ◽  
Maryam Saleki

Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions. Some recent approaches quantify classification uncertainty directly by training the model to output high uncertainty for data samples close to class boundaries or from outside the training distribution. These approaches use an auxiliary data set during training to represent out-of-distribution samples. However, selecting or creating such an auxiliary data set is non-trivial, especially for high-dimensional data such as images. In this work, we develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty to distinguish decision-boundary and out-of-distribution regions of the feature space. To this end, variational autoencoders and generative adversarial networks are incorporated to automatically generate out-of-distribution exemplars for training. Through extensive analysis, we demonstrate that the proposed approach provides better uncertainty estimates for in-distribution, out-of-distribution, and adversarial samples on well-known data sets than state-of-the-art approaches, including recent Bayesian approaches for neural networks and anomaly detection methods.
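
The general idea can be caricatured as a two-term objective: standard cross-entropy on in-distribution samples plus a term pushing the predictive distribution toward uniform (maximum uncertainty) on generated out-of-distribution exemplars. The PyTorch sketch below uses random noise in place of VAE/GAN-generated exemplars and a plain softmax classifier, so it is a simplification of the idea rather than the authors' model; the network shape and penalty weight are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F
from torch import nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))  # 5 classes
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(x_in, y_in, x_ood):
    """Cross-entropy on in-distribution data plus a uniformity penalty on
    out-of-distribution exemplars (here random noise instead of VAE/GAN samples)."""
    ce = F.cross_entropy(model(x_in), y_in)
    # Cross-entropy to the uniform distribution is smallest when the OOD
    # prediction is maximally uncertain.
    log_p_ood = F.log_softmax(model(x_ood), dim=1)
    uniform_penalty = -log_p_ood.mean()
    loss = ce + 0.5 * uniform_penalty
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

x_in, y_in = torch.randn(32, 20), torch.randint(0, 5, (32,))
x_ood = torch.randn(32, 20) * 3.0   # stand-in for generated OOD exemplars
print(training_step(x_in, y_in, x_ood))
```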

