Convolutional Neural Networks and Impact of Filter Sizes on Image Classification

2020 ◽  
Vol 10 (1) ◽  
pp. 55-60
Author(s):  
Owais Mujtaba Khanday ◽  
Samad Dadvandipour

Deep Neural Networks (DNNs) have revolutionized computer vision in the past few years, providing the best results on a large number of problems such as image classification, pattern recognition, and speech recognition. One of the essential deep learning models used for image classification is the convolutional neural network. These networks can integrate a different number of features, or so-called filters, in a multi-layer fashion through convolutional layers. These models use convolutional and pooling layers for feature abstraction and have neurons arranged in three dimensions: height, width, and depth. Filters of three different sizes were used: 3×3, 5×5, and 7×7. The accuracy on the training data decreased from 100% to 97.8% as the filter size increased, and the accuracy on the test data set also decreased: 98.7% for 3×3, 98.5% for 5×5, and 97.8% for 7×7. The loss on the training data and test data per 10 epochs increased drastically, from 3.4% to 27.6% and from 12.5% to 23.02%, respectively. Thus, filters with smaller dimensions give less loss than those with larger dimensions. However, using the smaller filter size comes at the cost of computational complexity, which is crucial in the case of larger data sets.
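To make the filter-size comparison concrete, the following minimal sketch computes the output size, parameter count, and multiply-accumulate count of a single convolutional layer for 3×3, 5×5, and 7×7 filters. The 28×28 single-channel input, the 32 filters, stride 1, and zero padding are illustrative assumptions; the abstract does not state the network configuration, and the function name `conv_layer_stats` is hypothetical.

```python
def conv_layer_stats(input_size, kernel_size, in_channels=1, out_channels=32,
                     stride=1, padding=0):
    """Return (output side length, trainable parameters, multiply-accumulates).

    All default values are illustrative assumptions, not the paper's setup.
    """
    # Standard output-size formula for a square convolution.
    out_size = (input_size + 2 * padding - kernel_size) // stride + 1
    # Each filter has kernel_size * kernel_size * in_channels weights plus one bias.
    params = (kernel_size * kernel_size * in_channels + 1) * out_channels
    # One multiply-accumulate per weight (bias excluded) per output position.
    macs = (out_size * out_size * kernel_size * kernel_size
            * in_channels * out_channels)
    return out_size, params, macs


if __name__ == "__main__":
    for k in (3, 5, 7):
        out_size, params, macs = conv_layer_stats(input_size=28, kernel_size=k)
        print(f"{k}x{k}: output {out_size}x{out_size}, "
              f"{params} parameters, {macs} multiply-accumulates")
```

These are per-layer counts only; the overall cost of a network also depends on its depth and on the layers that follow, which the abstract does not describe.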

2021 ◽  
Author(s):  
Louise Bloch ◽  
Christoph M. Friedrich

Abstract Background: Predicting whether subjects with Mild Cognitive Impairment (MCI) will go on to develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable to improve early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained on a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. The validation was executed on an independent ADNI test data set and on the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training, and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: On the independent ADNI test data set, the XGBoost models trained on the entire training data set reached a mean classification accuracy of 58.54 %. The XGBoost models that excluded 116 of the 467 subjects from the training data set based on their Logistic Regression (LR) data Shapley values outperformed them by 14.13 % (8.27 percentage points). For the AIBL data set, the XGBoost models trained on the entire training data set reached a mean accuracy of 60.35 %. An improvement of 24.86 % (15.00 percentage points) was reached for the XGBoost models when the 72 subjects with the smallest RF data Shapley values were excluded from the training data set.
Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data was associated with the number of ApoEϵ4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
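The results above report each gain both as a relative improvement and in percentage points. A minimal sketch of how the two figures relate, using only the accuracies quoted in the abstract (the function names are hypothetical):

```python
def relative_improvement(baseline_accuracy, gain_in_points):
    """Relative improvement in percent, given a gain in percentage points."""
    return 100.0 * gain_in_points / baseline_accuracy


def improved_accuracy(baseline_accuracy, gain_in_points):
    """Accuracy after adding a gain expressed in percentage points."""
    return baseline_accuracy + gain_in_points


if __name__ == "__main__":
    # (data set, baseline accuracy in %, gain in percentage points), from the abstract.
    reported = [("ADNI test set", 58.54, 8.27), ("AIBL", 60.35, 15.00)]
    for name, baseline, gain in reported:
        rel = relative_improvement(baseline, gain)
        print(f"{name}: {baseline:.2f} % + {gain:.2f} points, "
              f"relative improvement {rel:.2f} %")
```

The relative figure divides the gain by the baseline accuracy, which is why it is larger than the gain in percentage points.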


2021 ◽  
pp. 1-17
Author(s):  
Luis Sa-Couto ◽  
Andreas Wichert

Abstract Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and shifts the focus away from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability: a hidden overfitting. More concretely, a supervised, backpropagation-based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both trained models and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants, like the What-Where model, go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. Knowing that, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets for it: MNIST and ETL-1. To make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with expectation.


2020 ◽  
Vol 12 (11) ◽  
pp. 1743
Author(s):  
Artur M. Gafurov ◽  
Oleg P. Yermolayev

The transition from manual (visual) interpretation to fully automated gully detection is an important task for the quantitative assessment of modern gully erosion, especially for large mapping areas. Existing approaches to semi-automated gully detection rely either on object-oriented selection from multispectral images or on gully selection with a probabilistic model obtained from digital elevation models (DEMs). These approaches cannot be used to assess gully erosion in the part of European Russia most affected by it, owing to the lack of a national large-scale DEM and the limited resolution of open-source multispectral satellite images. We propose and develop an approach that uses convolutional neural networks for automated gully detection on RGB syntheses of publicly available ultra-high-resolution satellite images, for a test region in the east of the Russian Plain with intensive basin erosion. The Keras library and the U-Net architecture of convolutional neural networks were used for training. Preliminary results of applying the trained gully erosion convolutional neural network (GECNN) suggest that the algorithm performs well in detecting active gullies and differentiates gullies well from other linear forms of slope erosion (rills and balkas), but so far makes errors in detecting complex gully systems. In addition, GECNN fails to identify a gully in 10% of cases, and in another 10% of cases it identifies a gully where there is none. To solve these problems, the neural network must be trained further on an enlarged training data set.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Haibin Chang ◽  
Ying Cui

Images are increasingly used across industries, so retrieving useful images from a large collection has become an urgent priority. Convolutional neural networks (CNNs) have achieved good results in certain image classification tasks, but problems remain, such as poor classification ability, low accuracy, and slow convergence. This article studies an image classification algorithm (ICA) based on multilabel learning with an improved convolutional neural network and presents some ideas for improving such algorithms. The proposed method covers the image classification process, the convolutional network algorithm, and the multilabel learning algorithm. The results show that the average maximum classification accuracy of the improved CNN is 90.63%, indicating better performance and helping to improve the efficiency of image classification. The improved CNN structure reaches its highest accuracy of 91.47% on the CIFAR-10 data set, which is much higher than that of the traditional CNN algorithm.


2020 ◽  
Vol 2 (2) ◽  
pp. 23
Author(s):  
Lei Wang

As an important research achievement in the field of brain-like computing, the deep convolutional neural network has been widely used in many fields, such as computer vision, natural language processing, information retrieval, speech recognition, and semantic understanding. It has set off a wave of neural network research in industry and academia and promoted the development of artificial intelligence. At present, the deep convolutional neural network mainly simulates the complex hierarchical cognitive laws of the human brain by increasing the number of network layers, using larger training data sets, and improving the network structure or training algorithm of existing neural networks, so as to narrow the gap with the visual system of the human brain and enable the machine to acquire the capability of "abstract concepts". Deep convolutional neural networks have achieved great success in many computer vision tasks, such as image classification, target detection, face recognition, and pedestrian recognition. This paper first reviews the development history of convolutional neural networks and then analyzes the working principle of the deep convolutional neural network in detail. It next introduces representative achievements of convolutional neural networks from two aspects and shows, through examples, how various technical methods improve image classification accuracy. From the aspect of adding network layers, the structures of classical convolutional neural networks such as AlexNet, ZF-Net, VGG, GoogLeNet, and ResNet are discussed and analyzed. From the aspect of increasing the size of the data set, the difficulties of manually adding labeled samples and the effect of data augmentation on improving network performance are introduced. This paper focuses on the latest research progress of convolutional neural networks in image classification and face recognition.
Finally, the problems and challenges to be solved in future brain-like intelligence research based on deep convolutional neural networks are proposed.


2020 ◽  
Vol 12 (20) ◽  
pp. 3358
Author(s):  
Vasileios Syrris ◽  
Ondrej Pesek ◽  
Pierre Soille

Automatic supervised classification with complex modelling such as deep neural networks requires the availability of representative training data sets. While there exists a plethora of data sets that can be used for this purpose, they are usually very heterogeneous and not interoperable. In this context, the present work has a twofold objective: (i) to describe procedures of open-source training data management, integration, and data retrieval, and (ii) to demonstrate the practical use of varying source training data for remote sensing image classification. For the former, we propose SatImNet, a collection of open training data, structured and harmonized according to specific rules. For the latter, two modelling approaches based on convolutional neural networks have been designed and configured to deal with satellite image classification and segmentation.


2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Sreerupa Das ◽  
Christopher D Hollander ◽  
Suraiya Suliman

Convolutional Neural Networks (CNNs) have become the recent tool of choice for many visual detection tasks, including object classification, localization, detection, and segmentation. CNNs are specialized neural networks composed of many layers and specifically designed to analyze grid-like data, e.g. images. One of the key features of a CNN is its ability to automatically detect important features within an image (e.g. edges, patterns, shapes); prior to CNNs, these features had to be manually engineered by subject matter experts. Inspired by the significant success that CNNs have achieved in computer vision, we examine a specific CNN architecture, U-Net, suited to the task of visual defect detection. We identify and discuss situations for the use of this architecture in the specific context of external defect detection on aircraft and experimentally evaluate its performance across a dataset of common visual defects. Training convolutional networks on an image analysis task requires a large image (training) data set. We address this problem by using synthetically generated images from computer models of jets with varying angles and perspectives, with and without induced faults in the generated images. This paper presents the initial results of using CNNs, specifically U-Net, to detect aerial vehicle surface defects of three categories. We further demonstrate that CNNs trained on synthetic images can then be used to detect faults in real images of jets with visual damage. The results obtained in this research indicate that our approach has been quite effective in detecting surface anomalies in our tests.


10.29007/9c5j ◽  
2019 ◽  
Author(s):  
Allison Rossetto ◽  
Wenjin Zhou

Wavelet pooling methods can improve the classification accuracy of Convolutional Neural Networks (CNNs). Combining wavelet pooling with the Nesterov-accelerated Adam (NAdam) gradient calculation method can also improve the accuracy of the CNN. We have implemented wavelet pooling with NAdam in this work using both a Haar wavelet (WavPool-NH) and a Shannon wavelet (WavPool-NS). The WavPool-NH and WavPool-NS methods are the most accurate of the methods we considered for the MNIST and LIDC-IDRI lung tumor data-sets. The WavPool-NH and WavPool-NS implementations have an accuracy of 95.92% and 95.52%, respectively, on the LIDC-IDRI data-set. This is an improvement on the 92.93% accuracy obtained on this data-set with the max pooling method. The WavPool methods also avoid overfitting, which is a concern with max pooling. We also found that WavPool performed fairly well on the CIFAR-10 data-set; however, overfitting was an issue with all the methods we considered. Wavelet pooling, especially when combined with an adaptive gradient and wavelets chosen specifically for the data, has the potential to outperform current methods.
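As a rough illustration of the difference between the two pooling styles, the sketch below compares 2×2 max pooling with a one-level Haar approximation subband on a small array. This is a simplified stand-in and not the WavPool-NH implementation: the abstract does not give the number of decomposition levels or other details, and the function names are hypothetical.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 over a list-of-lists image."""
    rows, cols = len(image), len(image[0])
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


def haar_approx_pool_2x2(image):
    """One-level orthonormal Haar approximation (LL) subband.

    Each 2x2 block [[a, b], [c, d]] maps to (a + b + c + d) / 2, so every
    pixel in the block contributes instead of only the largest one.
    """
    rows, cols = len(image), len(image[0])
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


if __name__ == "__main__":
    # Hypothetical 4x4 feature map used only for illustration.
    feature_map = [[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]]
    print(max_pool_2x2(feature_map))
    print(haar_approx_pool_2x2(feature_map))
```

Both functions halve each spatial dimension; the Haar variant keeps a scaled sum of the block rather than discarding three of the four values.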


2021 ◽  
pp. 002203452110357
Author(s):  
T. Chen ◽  
P.D. Marsh ◽  
N.N. Al-Hebshi

An intuitive, clinically relevant index of microbial dysbiosis as a summary statistic of subgingival microbiome profiles is needed. Here, we describe a subgingival microbial dysbiosis index (SMDI) based on machine learning analysis of published periodontitis/health 16S microbiome data. The raw sequencing data, split into training and test sets, were quality filtered, taxonomically assigned to the species level, and centered log-ratio transformed. The training data set was subjected to random forest analysis to identify discriminating species (DS) between periodontitis and health. DS lists, compiled using various “Gini” importance score cutoffs, were used to compute the SMDI for samples in the training and test data sets as the mean centered log-ratio abundance of periodontitis-associated species minus that of health-associated ones. Diagnostic accuracy was assessed with receiver operating characteristic analysis. An SMDI based on 49 DS provided the highest accuracy, with areas under the curve of 0.96 and 0.92 in the training and test data sets, respectively, and ranged from −6 (most normobiotic) to 5 (most dysbiotic), with a value around zero discriminating most of the periodontitis and healthy samples. The top periodontitis-associated DS were Treponema denticola, Mogibacterium timidum, Fretibacterium spp., and Tannerella forsythia, while Actinomyces naeslundii and Streptococcus sanguinis were the top health-associated DS. The index was highly reproducible by hypervariable region. Applying the index to additional test data sets in which nitrate had been used to modulate the microbiome demonstrated that nitrate has dysbiosis-lowering properties in vitro and in vivo. Finally, 3 genera (Treponema, Fretibacterium, and Actinomyces) were identified that could be used for calculation of a simplified SMDI with comparable accuracy.
In conclusion, we have developed a nonbiased, reproducible, and easy-to-interpret index that can be used to identify patients/sites at risk of periodontitis, to assess the microbial response to treatment, and, importantly, as a quantitative tool in microbiome modulation studies.
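The index definition above (mean centered log-ratio abundance of periodontitis-associated species minus that of health-associated ones) can be sketched as follows. The species lists are a small subset named in the abstract, while the read counts, the pseudocount of 0.5, and the function names are assumptions made purely for illustration; the published index uses 49 discriminating species.

```python
import math

# Subsets of the discriminating species named in the abstract (illustrative only).
PERIO_SPECIES = ["Treponema denticola", "Tannerella forsythia"]
HEALTH_SPECIES = ["Actinomyces naeslundii", "Streptococcus sanguinis"]


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of a {species: count} mapping.

    The pseudocount (an assumption here) avoids taking log(0) for absent species.
    """
    logs = {s: math.log(c + pseudocount) for s, c in counts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {s: v - mean_log for s, v in logs.items()}


def dysbiosis_index(counts, perio_species=PERIO_SPECIES,
                    health_species=HEALTH_SPECIES):
    """Mean CLR of periodontitis-associated species minus mean CLR of health-associated ones."""
    clr = clr_transform(counts)
    perio = [clr[s] for s in perio_species if s in clr]
    health = [clr[s] for s in health_species if s in clr]
    if not perio or not health:
        raise ValueError("sample lacks species from one of the two lists")
    return sum(perio) / len(perio) - sum(health) / len(health)


if __name__ == "__main__":
    # Hypothetical read counts for one sample.
    sample = {"Treponema denticola": 400, "Tannerella forsythia": 300,
              "Actinomyces naeslundii": 10, "Streptococcus sanguinis": 5}
    print(f"index = {dysbiosis_index(sample):.2f}")
```

With this sign convention, positive values point toward dysbiosis and negative values toward health, in line with the range described in the abstract.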


2019 ◽  
Vol 488 (4) ◽  
pp. 5232-5250 ◽  
Author(s):  
Alexander Chaushev ◽  
Liam Raynard ◽  
Michael R Goad ◽  
Philipp Eigmüller ◽  
David J Armstrong ◽  
...  

ABSTRACT Vetting of exoplanet candidates in transit surveys is a manual process, which suffers from a large number of false positives and a lack of consistency. Previous work has shown that convolutional neural networks (CNN) provide an efficient solution to these problems. Here, we apply a CNN to classify planet candidates from the Next Generation Transit Survey (NGTS). For training data sets we compare both real data with injected planetary transits and fully simulated data, as well as how their different compositions affect network performance. We show that fewer hand labelled light curves can be utilized, while still achieving competitive results. With our best model, we achieve an area under the curve (AUC) score of $(95.6 \pm 0.2)$ per cent and an accuracy of $(88.5 \pm 0.3)$ per cent on our unseen test data, as well as $(76.5 \pm 0.4)$ per cent and $(74.6 \pm 1.1)$ per cent in comparison to our existing manual classifications. The neural network recovers 13 out of 14 confirmed planets observed by NGTS, with high probability. We use simulated data to show that the overall network performance is resilient to mislabelling of the training data set, a problem that might arise due to unidentified, low signal-to-noise transits. Using a CNN, the time required for vetting can be reduced by half, while still recovering the vast majority of manually flagged candidates. In addition, we identify many new candidates with high probabilities which were not flagged by human vetters.

