Comparison of Different Image Data Augmentation Approaches

Author(s):  
Loris Nanni ◽  
Michelangelo Paci ◽  
Sheryl Brahnam ◽  
Alessandra Lumini

Convolutional Neural Networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points that provide additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the Discrete Wavelet Transform and the other on the Constant-Q Gabor transform. Pretrained ResNet50 networks are fine-tuned on each augmentation method. Combinations of these networks are evaluated and compared across three benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art performance across all three data sets. This result shows that varying data augmentation is a feasible way to build an ensemble of classifiers for image classification (code available at https://github.com/LorisNanni).

2021 ◽  
Vol 7 (12) ◽  
pp. 254
Author(s):  
Loris Nanni ◽  
Michelangelo Paci ◽  
Sheryl Brahnam ◽  
Alessandra Lumini

Convolutional neural networks (CNNs) have gained prominence in the research literature on image classification over the last decade. One shortcoming of CNNs, however, is their lack of generalizability and tendency to overfit when presented with small training sets. Augmentation directly confronts this problem by generating new data points that provide additional information. In this paper, we investigate the performance of more than ten different sets of data augmentation methods, with two novel approaches proposed here: one based on the discrete wavelet transform and the other on the constant-Q Gabor transform. Pretrained ResNet50 networks are fine-tuned on each augmentation method. Combinations of these networks are evaluated and compared across four benchmark data sets of images representing diverse problems and collected by instruments that capture information at different scales: a virus data set, a bark data set, a portrait data set, and a LIGO glitches data set. Experiments demonstrate the superiority of this approach. The best ensemble proposed in this work achieves state-of-the-art (or comparable) performance across all four data sets. This result shows that varying data augmentation is a feasible way to build an ensemble of classifiers for image classification.


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5809
Author(s):  
Loris Nanni ◽  
Giovanni Minchio ◽  
Sheryl Brahnam ◽  
Davide Sarraggiotto ◽  
Alessandra Lumini

In this paper, we examine two strategies for boosting the performance of ensembles of Siamese networks (SNNs) for image classification using two loss functions (Triplet and Binary Cross Entropy) and two methods for building the dissimilarity spaces (FULLY and DEEPER). With FULLY, the distance between a pattern and a prototype is calculated by comparing two images using the fully connected layer of the Siamese network. With DEEPER, each pattern is described using a deeper layer combined with dimensionality reduction. The basic design of the SNNs takes advantage of supervised k-means clustering for building the dissimilarity spaces that train a set of support vector machines, which are then combined by sum rule for a final decision. The robustness and versatility of this approach are demonstrated on several cross-domain image data sets, including a portrait data set, two bioimage and two animal vocalization data sets. Results show that the strategies employed in this work to increase the performance of dissimilarity image classification using SNN are closing the gap with standalone CNNs. Moreover, when our best system is combined with an ensemble of CNNs, the resulting performance is superior to an ensemble of CNNs, demonstrating that our new strategy is extracting additional information.
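The sum rule mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each classifier outputs one score per class for every sample; the function name `sum_rule` and the score values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def sum_rule(score_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Fuse classifiers by summing per-class scores, then taking the argmax.

    score_sets[c][i][k] is the score classifier c assigns to sample i for class k.
    """
    n_samples = len(score_sets[0])
    n_classes = len(score_sets[0][0])
    decisions = []
    for i in range(n_samples):
        # Add up the scores that every classifier gives to each class.
        totals = [sum(scores[i][k] for scores in score_sets) for k in range(n_classes)]
        # The fused decision is the class with the largest summed score.
        decisions.append(max(range(n_classes), key=totals.__getitem__))
    return decisions


# Hypothetical scores from three classifiers on two samples and three classes.
svm_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
svm_b = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]
svm_c = [[0.5, 0.2, 0.3], [0.2, 0.2, 0.6]]
fused = sum_rule([svm_a, svm_b, svm_c])
print(fused)  # [0, 2]
```

In the second sample, one classifier alone would pick class 1, but the summed scores favor class 2, which is the kind of disagreement that fusion is meant to resolve.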


2021 ◽  
Vol 7 (2) ◽  
pp. 755-758
Author(s):  
Daniel Wulff ◽  
Mohamad Mehdi ◽  
Floris Ernst ◽  
Jannis Hagenah

Data augmentation is a common method to make deep learning accessible on limited data sets. However, classical image augmentation methods result in highly unrealistic images on ultrasound data. Another approach is to utilize learning-based augmentation methods, e.g. based on variational autoencoders or generative adversarial networks. However, a large amount of data is necessary to train these models, which is typically not available in scenarios where data augmentation is needed. One solution to this problem could be a transfer of augmentation models between different medical imaging data sets. In this work, we present a qualitative study of the cross-data-set generalization performance of different learning-based augmentation methods for ultrasound image data. We show that knowledge transfer is possible in ultrasound image augmentation and that the augmentation partially results in semantically meaningful transfers of structures, e.g. vessels, across domains.


2021 ◽  
Vol 11 (15) ◽  
pp. 6721
Author(s):  
Jinyeong Wang ◽  
Sanghwan Lee

Demand for machine vision is rising as smart factories adopt automated surface inspection to increase manufacturing productivity. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. As a result, many machine vision systems adopt CNNs for surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can be applied to grayscale industrial images, and we demonstrated outstanding performance in image classification and object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.


Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 624
Author(s):  
Stefan Rohrmanstorfer ◽  
Mikhail Komarov ◽  
Felix Mödritscher

With the ever-increasing amount of image data, it has become necessary to automatically search for and process information in these images. As fashion is captured in images, the fashion sector provides a natural foundation for services or applications built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on this analysis, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created and then significantly enlarged through image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and it was possible to incrementally improve the validation accuracy on the created dataset from an initial 69% to a final validation accuracy of 84%. More distinct apparel like trousers, shoes and hats were better classified than other upper body clothes.


2018 ◽  
Vol 11 (2) ◽  
pp. 53-67
Author(s):  
Ajay Kumar ◽  
Shishir Kumar

Several initial center selection algorithms have been proposed in the literature for numerical data, but the values of categorical data are unordered, so these methods are not applicable to a categorical data set. This article investigates the initial center selection process for categorical data and then presents a new support-based initial center selection algorithm. The proposed algorithm measures the weight of the unique data points of an attribute with the help of support and then integrates these weights along the rows to get the support of every row. Further, the data object with the largest support is chosen as an initial center, followed by finding other centers that are at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with the random initial center selection method, Cao's method, the Wu method and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
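The procedure outlined in this abstract can be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation. It assumes that the support of a value is its relative frequency within its attribute, that row support is the sum of those frequencies, and that distance is the Hamming distance, with ties broken by higher support. All function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Row = Tuple[str, ...]


def row_supports(data: Sequence[Row]) -> List[float]:
    """Sum, for each row, the relative frequency of each of its attribute values."""
    n = len(data)
    n_attrs = len(data[0])
    # One frequency table per attribute (column).
    counts = [Counter(row[j] for row in data) for j in range(n_attrs)]
    return [sum(counts[j][row[j]] / n for j in range(n_attrs)) for row in data]


def hamming(a: Row, b: Row) -> int:
    """Number of attributes on which two categorical rows differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def initial_centers(data: Sequence[Row], k: int) -> List[Row]:
    """Pick the highest-support row, then the rows farthest from that first center."""
    supports = row_supports(data)
    first = data[max(range(len(data)), key=supports.__getitem__)]
    centers = [first]
    # Rank rows by distance from the first center; break ties by higher support.
    ranked = sorted(
        range(len(data)),
        key=lambda i: (-hamming(data[i], first), -supports[i]),
    )
    for i in ranked:
        if len(centers) == k:
            break
        if data[i] not in centers:  # skip duplicates of already chosen centers
            centers.append(data[i])
    return centers


# Hypothetical toy data set with three categorical attributes.
data = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("red", "small", "round"),
]
centers = initial_centers(data, k=2)
print(centers)  # [('red', 'small', 'round'), ('blue', 'large', 'square')]
```

The most typical row becomes the first center, and the row that differs from it on every attribute becomes the second.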


2003 ◽  
Vol 21 (1) ◽  
pp. 123-135
Author(s):  
S. Vignudelli ◽  
P. Cipollini ◽  
F. Reseghetti ◽  
G. Fusco ◽  
G. P. Gasparini ◽  
...  

Abstract. From September 1999 to December 2000, eXpendable Bathy-Thermograph (XBT) profiles were collected along the Genova-Palermo shipping route in the framework of the Mediterranean Forecasting System Pilot Project (MFSPP). The route is virtually coincident with track 0044 of the TOPEX/Poseidon satellite altimeter, crossing the Ligurian and Tyrrhenian basins in an approximate N–S direction. This allows a direct comparison between XBT and altimetry, whose findings are presented in this paper. XBT sections reveal the presence of the major features of the regional circulation, namely the eastern boundary of the Ligurian gyre, the Bonifacio gyre and the Modified Atlantic Water inflow along the Sicily coast. Twenty-two comparisons of steric heights derived from the XBT data set with concurrent realizations of single-pass altimetric heights are made. The overall correlation is around 0.55 with an RMS difference of less than 3 cm. In the Tyrrhenian Sea the spectra are remarkably similar in shape, but in general the altimetric heights contain more energy. This difference is explained in terms of oceanographic signals, which are captured with a different intensity by the satellite altimeter and XBTs, as well as computational errors. On scales larger than 100 km, the data sets are also significantly coherent, with increasing coherence values at longer wavelengths. The XBTs were dropped every 18–20 km along the track: as a consequence, the spacing scale was unable to resolve adequately the internal radius of deformation (< 20 km). Furthermore, few XBT drops were carried out in the Ligurian Sea, due to the limited north-south extent of this basin, so the comparison is problematic there. On the contrary, the major features observed in the XBT data in the Tyrrhenian Sea are also detected by TOPEX/Poseidon. The manuscript is completed by a discussion on how to integrate the two data sets, in order to extract additional information. 
In particular, the results emphasize their complementarity in providing a dynamically complete description of the observed structures.
Key words. Oceanography: general (descriptive and regional oceanography); Oceanography: physical (sea level variations; instruments and techniques)


2020 ◽  
Author(s):  
Ying Bi ◽  
Bing Xue ◽  
Mengjie Zhang

© Springer International Publishing AG, part of Springer Nature 2018. Feature extraction is an essential process for image data dimensionality reduction and classification. However, feature extraction is very difficult and often requires human intervention. Genetic Programming (GP) can achieve automatic feature extraction and image classification, but the majority of existing methods extract low-level features from raw images without any image-related operations. Furthermore, the work on the combination of image-related operators/descriptors in GP for feature extraction and image classification is limited. This paper proposes a multi-layer GP approach (MLGP) to perform automatic high-level feature extraction and classification. A new program structure, a new function set including a number of image operators/descriptors and two region detectors, and a new terminal set are designed in this approach. The performance of the proposed method is examined on six different data sets of varying difficulty and compared with five GP-based methods and 42 traditional image classification methods. Experimental results show that the proposed method achieves performance better than or comparable to that of these baseline methods. Further analysis of the example programs evolved by the proposed MLGP method reveals the good interpretability of MLGP and gives insight into how this method can effectively extract high-level features for image classification.


2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix with a size of N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second best performance with support vector machine.
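To make the quadratic memory cost concrete, the following back-of-the-envelope sketch compares a dense N×N similarity matrix over all points with one over a sampled landmark subset. The point counts and the 8-byte entry size are assumptions chosen for illustration and are not figures from the article.

```python
def dense_similarity_bytes(n_points: int, bytes_per_entry: int = 8) -> int:
    """Memory needed to hold a dense n_points x n_points matrix."""
    return n_points * n_points * bytes_per_entry


# Hypothetical sizes: 100,000 pixels in total versus 2,000 sampled landmarks.
full = dense_similarity_bytes(100_000)
landmarks = dense_similarity_bytes(2_000)

print(full)       # 80000000000 bytes, i.e. 80 GB
print(landmarks)  # 32000000 bytes, i.e. 32 MB
```

Under these assumed sizes, restricting the analysis to landmarks cuts the matrix from tens of gigabytes to tens of megabytes.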

