Data Augmentation Method by Applying Color Perturbation of Inverse PSNR and Geometric Transformations for Object Recognition Based on Deep Learning

2020 ◽  
Vol 10 (11) ◽  
pp. 3755
Author(s):  
Eun Kyeong Kim ◽  
Hansoo Lee ◽  
Jin Yong Kim ◽  
Sungshin Kim

Deep learning is applied in various manufacturing domains. To train a deep learning network, we must collect a sufficient amount of training data. However, it is difficult to collect the image datasets required to train networks for object recognition, especially because the target items to be classified are generally absent from existing databases, and manual collection of images has practical limits. Therefore, to overcome the data deficiency present in many domains, including manufacturing, we propose a method of generating new training images via a sequence of steps: image pre-processing, background elimination, target extraction that preserves the object-size ratio of the original image, color perturbation constrained by a predefined similarity between the original and generated images, geometric transformations, and transfer learning. To assess the color perturbation and geometric transformations specifically, we compare and analyze experiments across color spaces and across geometric transformations. The experimental results show that the proposed method effectively augments the original data, correctly classifies similar items, and improves image classification accuracy. They also demonstrate that an effective data augmentation method is crucial when the amount of training data is small.
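
As a hedged illustration of the color-perturbation idea, the sketch below adds random per-channel color noise and accepts a candidate only if its PSNR with respect to the original stays above a similarity threshold, alongside simple label-preserving geometric transforms. The function names, the noise model, and the 30 dB threshold are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two images (dB)."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def color_perturb(img, sigma=8.0, min_psnr=30.0, rng=None, max_tries=20):
    """Add random per-channel color offsets, accepting only perturbations
    that keep the result within a PSNR similarity bound of the original."""
    rng = np.random.default_rng(rng)
    for _ in range(max_tries):
        noise = rng.normal(0.0, sigma, size=(1, 1, img.shape[2]))
        cand = np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)
        if psnr(img, cand) >= min_psnr:
            return cand
    return img.copy()  # fall back to the original if no candidate qualifies

def geometric_augment(img, rng=None):
    """Simple label-preserving geometric transforms: flips and 90-degree rotations."""
    rng = np.random.default_rng(rng)
    if rng.random() < 0.5:
        img = img[:, ::-1]          # horizontal flip
    k = rng.integers(0, 4)          # random multiple of 90 degrees
    return np.rot90(img, k).copy()
```

The PSNR gate is what ties the perturbation to a "predefined similarity" between original and generated images: candidates that drift too far are rejected.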

Author(s):  
Mario Lasseck

The detection and identification of individual species based on images or audio recordings has shown a significant performance increase over the last few years, thanks to recent advances in deep learning. Reliable automatic species recognition provides a promising tool for biodiversity monitoring, research and education. Image-based plant identification, for example, now comes close to the most advanced human expertise (Bonnet et al. 2018, Lasseck 2018a). Besides improved machine learning algorithms, neural network architectures, deep learning frameworks and computer hardware, a major reason for the gain in performance is the increasing abundance of biodiversity training data, either from observational networks and data providers like GBIF, Xeno-canto and iNaturalist, or from natural history museum collections like the Animal Sound Archive of the Museum für Naturkunde. However, in many cases this occurrence data is still insufficient for data-intensive deep learning approaches and is often unbalanced, with only a few examples for very rare species. To overcome these limitations, data augmentation can be used. This technique synthetically creates more training samples by applying various subtle random manipulations to the original data in a label-preserving way, without changing the content. In the talk, we will present augmentation methods for images and audio data. The positive effect on identification performance will be evaluated on different large-scale data sets from recent plant and bird identification (LifeCLEF 2017, 2018) and detection (DCASE 2018) challenges (Lasseck 2017, Lasseck 2018b, Lasseck 2018c).
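
Typical label-preserving audio augmentations of the kind described here can be sketched as follows; the specific operations (random circular time shift and additive noise at a target SNR) and the parameter values are common examples, not necessarily the ones used in the talk:

```python
import numpy as np

def time_shift(wave, max_frac=0.2, rng=None):
    """Circularly shift a 1-D waveform by a random fraction of its length."""
    rng = np.random.default_rng(rng)
    bound = int(len(wave) * max_frac)
    shift = rng.integers(-bound, bound + 1)
    return np.roll(wave, shift)

def add_noise(wave, snr_db=20.0, rng=None):
    """Mix in Gaussian noise at a target signal-to-noise ratio (dB)."""
    rng = np.random.default_rng(rng)
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise
```

Both manipulations are subtle enough to preserve the species label while multiplying the effective number of training samples.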


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1935
Author(s):  
Fanwen Wang ◽  
Hui Zhang ◽  
Fei Dai ◽  
Weibo Chen ◽  
Chengyan Wang ◽  
...  

Deep learning has demonstrated superior performance in image reconstruction compared to most conventional iterative algorithms. However, its effectiveness and generalization capability are highly dependent on the sample size and diversity of the training data. Deep learning-based reconstruction requires multi-coil raw k-space data, which are not collected by routine scans. On the other hand, large amounts of magnitude images are readily available in hospitals. Hence, we proposed the MAGnitude Images to Complex K-space (MAGIC-K) Net to generate multi-coil k-space data from existing magnitude images and a limited number of required raw k-space data to facilitate the reconstruction. Compared to basic data augmentation methods that apply global intensity and displacement transformations to the source images, the MAGIC-K Net can generate more realistic intensity variations and displacements from pairs of anatomical Digital Imaging and Communications in Medicine (DICOM) images. The reconstruction performance was validated in 30 healthy volunteers and 6 patients with different types of tumors. The experimental results demonstrated that high-resolution Diffusion Weighted Image (DWI) reconstruction benefited from the proposed augmentation method. The MAGIC-K Net enabled the deep learning network to reconstruct images with superior performance in both healthy volunteers and tumor patients, qualitatively and quantitatively.
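
The MAGIC-K Net itself is a learned generator, but the kind of data it targets can be illustrated with a toy, physics-style sketch that simulates smooth coil sensitivity maps and Fourier-transforms each coil image into k-space. The Gaussian sensitivity model, the coil placement at image corners, and the function name are assumptions for illustration only, not the paper's method:

```python
import numpy as np

def synth_multicoil_kspace(magnitude, n_coils=4):
    """Synthesize multi-coil k-space from a single magnitude image by
    simulating smooth coil sensitivity maps (one Gaussian profile per
    virtual coil, up to 4 coils here) and taking the 2-D FFT per coil."""
    h, w = magnitude.shape
    ys, xs = np.mgrid[0:h, 0:w]
    centers = [(0, 0), (0, w - 1), (h - 1, 0), (h - 1, w - 1)][:n_coils]
    kspace = []
    for cy, cx in centers:
        # Gaussian sensitivity profile centered near each virtual coil
        width = 0.6 * max(h, w)
        sens = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * width ** 2))
        coil_img = magnitude * sens
        kspace.append(np.fft.fftshift(np.fft.fft2(coil_img)))
    return np.stack(kspace)
```

The learned network replaces this hand-built forward model with realistic, data-driven intensity variations and displacements.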


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1052
Author(s):  
Leang Sim Nguon ◽  
Kangwon Seo ◽  
Jung-Hyun Lim ◽  
Tae-Jun Song ◽  
Sung-Hyun Cho ◽  
...  

Mucinous cystic neoplasms (MCN) and serous cystic neoplasms (SCN) account for a large portion of solitary pancreatic cystic neoplasms (PCN). In this study we implemented a convolutional neural network (CNN) model using ResNet50 to differentiate between MCN and SCN. The training data were collected retrospectively from 59 MCN and 49 SCN patients from two different hospitals. Data augmentation was used to enhance the size and quality of the training datasets. Fine-tuning training approaches were utilized by adopting the pre-trained model from transfer learning while training selected layers. Testing of the network was conducted by varying the endoscopic ultrasonography (EUS) image sizes and positions to evaluate the network performance for differentiation. The proposed network model achieved up to 82.75% accuracy and an area under the curve (AUC) of 0.88 (95% CI: 0.817–0.930). The performance of the implemented deep learning networks in decision-making using only EUS images is comparable to that of traditional manual decision-making using EUS images along with supporting clinical information. Gradient-weighted class activation mapping (Grad-CAM) confirmed that the network model learned the features from the cyst region accurately. This study proves the feasibility of diagnosing MCN and SCN using a deep learning network model. Further improvement using more datasets is needed.


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are being widely utilized for various missions in both civilian and military sectors. Many of these missions demand that UAVs acquire artificial intelligence about the environments they are navigating in. This perception can be realized by training a computing machine to classify objects in the environment. One of the well-known machine training approaches is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning comes at a high cost in terms of time and computational resources. Collecting big input data, pre-training processes such as labeling training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these setbacks, this study proposes mission-specific input data augmentation techniques and the design of a light-weight deep neural network architecture that is capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes of 10,000 different images in each class were used as input data, where 80% were for training the network and the remaining 20% were used for network validation. For the optimization of the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
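
A sequential (per-sample) gradient descent update of the sort described can be sketched with a simple logistic-regression classifier: each sample contributes its own incremental update, so redundant samples do not have to be re-summed in every full-batch gradient. The model, learning rate, and epoch count here are illustrative assumptions, not the paper's network:

```python
import numpy as np

def sequential_gd(X, y, lr=0.1, epochs=50, rng=None):
    """Train a logistic-regression classifier with per-sample (sequential)
    gradient descent updates in randomized order each epoch."""
    rng = np.random.default_rng(rng)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))  # sigmoid prediction
            g = p - y[i]                               # log-loss gradient factor
            w -= lr * g * X[i]
            b -= lr * g
    return w, b
```

Because updates happen sample by sample, a dataset containing many near-duplicate images converges after seeing only a fraction of the redundant copies.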


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Huu-Thanh Duong ◽  
Tram-Anh Nguyen-Thi

In the literature, machine learning-based studies of sentiment analysis usually use supervised learning, which requires pre-labeled datasets that are large enough in certain domains. Obviously, such datasets are tedious, expensive and time-consuming to build, and models trained on them are hard to apply to unseen data. This paper approaches semi-supervised learning for Vietnamese sentiment analysis, which has limited datasets. We summarize many preprocessing techniques performed to clean and normalize data, along with negation handling and intensification handling, to improve performance. Moreover, data augmentation techniques, which generate new data from the original data to enrich the training data without user intervention, are also presented. In experiments, we evaluated various aspects and obtained competitive results, which may motivate future propositions.
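
Label-preserving augmentations for text of the kind summarized here are often implemented as simple token-level operations; the random-swap and random-deletion sketch below is a common example of such techniques, not necessarily the exact operations used in the paper:

```python
import random

def random_swap(tokens, n=1, seed=None):
    """Swap two random token positions n times (usually label-preserving
    for sentiment, since the vocabulary is unchanged)."""
    rnd = random.Random(seed)
    out = list(tokens)
    for _ in range(n):
        i, j = rnd.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens, p=0.1, seed=None):
    """Delete each token independently with probability p, always
    keeping at least one token so the sample is never empty."""
    rnd = random.Random(seed)
    kept = [t for t in tokens if rnd.random() > p]
    return kept if kept else [rnd.choice(tokens)]
```

Applied to each labeled sentence, such operations multiply the training set without any user intervention, which matters most when labeled Vietnamese data is scarce.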


2016 ◽  
Vol 14 (03) ◽  
pp. 1642002 ◽  
Author(s):  
Bahar Akbal-Delibas ◽  
Roshanak Farhoodi ◽  
Marc Pomplun ◽  
Nurit Haspel

One of the major challenges for protein docking methods is to accurately discriminate native-like structures from false positives. Docking methods are often inaccurate, and the results have to be refined and re-ranked to obtain native-like complexes and remove outliers. In a previous work, we introduced AccuRefiner, a machine learning-based tool for refining protein–protein complexes. Given a docked complex, the refinement tool produces a small set of refined versions of the input complex, with lower root-mean-square deviation (RMSD) of atomic positions with respect to the native structure. The method employs a unique ranking tool that accurately predicts the RMSD of docked complexes with respect to the native structure. In this work, we use a deep learning network with a similar set of features and five layers. We show that a properly trained deep learning network can accurately predict the RMSD of a docked complex with a 1.40 Å error margin on average, by approximating the complex relationship between a wide set of scoring function terms and the RMSD of a docked structure. The network was trained on 35,000 unbound docking complexes generated by RosettaDock. We tested our method on 25 different putative docked complexes, also produced by RosettaDock, for five proteins that were not included in the training data. The results demonstrate that the high accuracy of the ranking tool enables AccuRefiner to consistently choose the refinement candidates with lower RMSD values compared to the coarsely docked input structures.
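
The quantity the network predicts, RMSD of atomic positions between a docked and a native structure, can be computed directly when the two structures are already superposed; a minimal sketch (superposition itself, e.g. the Kabsch algorithm, is omitted):

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally-sized sets of
    atomic coordinates (assumes the structures are already superposed)."""
    diff = np.asarray(coords_a, float) - np.asarray(coords_b, float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))
```

The ranking tool's task is the harder inverse problem: estimating this value from scoring-function terms alone, without access to the native coordinates.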


Author(s):  
Uzma Batool ◽  
Mohd Ibrahim Shapiai ◽  
Nordinah Ismail ◽  
Hilman Fauzi ◽  
Syahrizal Salleh

Silicon wafer defect data collected from fabrication facilities is intrinsically imbalanced because of the variable frequencies of defect types. Frequently occurring types will have more influence on the classification predictions if a model gets trained on such skewed data. A fair classifier for such imbalanced data requires a mechanism to deal with type imbalance in order to avoid biased results. This study proposes a convolutional neural network for wafer map defect classification, employing oversampling as an imbalance-addressing technique. To ensure equal participation of all classes in the classifier's training, data augmentation has been employed, generating more samples in the minor classes. The proposed deep learning method has been evaluated on a real wafer map defect dataset, and its classification results on the test set returned a 97.91% accuracy. The results were compared with another deep learning-based auto-encoder model, demonstrating that the proposed method is a potential approach for silicon wafer defect classification whose robustness warrants further investigation.
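
The oversampling step can be sketched as random duplication of minority-class samples until every class matches the majority count; in practice the duplicated wafer maps would additionally be augmented (e.g., rotated or flipped), which this minimal sketch omits:

```python
import numpy as np

def oversample(X, y, rng=None):
    """Randomly duplicate minority-class samples until every class
    matches the size of the largest class."""
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c in classes:
        members = np.where(y == c)[0]
        # draw (target - count) extra indices with replacement
        extra = rng.choice(members, size=target - len(members), replace=True)
        idx.extend(members)
        idx.extend(extra)
    idx = np.array(idx)
    return X[idx], y[idx]
```

After balancing, each defect type contributes equally to the gradient during training, which is the mechanism that prevents frequent types from dominating predictions.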


2020 ◽  
Vol 12 (7) ◽  
pp. 1092
Author(s):  
David Browne ◽  
Michael Giering ◽  
Steven Prestwich

Scene classification is an important aspect of image/video understanding and segmentation. However, remote-sensing scene classification is a challenging image recognition task, partly due to the limited training data, which causes deep-learning Convolutional Neural Networks (CNNs) to overfit. Another difficulty is that images often have very different scales and orientation (viewing angle). Yet another is that the resulting networks may be very large, again making them prone to overfitting and unsuitable for deployment on memory- and energy-limited devices. We propose an efficient deep-learning approach to tackle these problems. We use transfer learning to compensate for the lack of data, and data augmentation to tackle varying scale and orientation. To reduce network size, we use a novel unsupervised learning approach based on k-means clustering, applied to all parts of the network: most network reduction methods use computationally expensive supervised learning methods, and apply only to the convolutional or fully connected layers, but not both. In experiments, we set new standards in classification accuracy on four remote-sensing and two scene-recognition image datasets.
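
The core of k-means-based network reduction can be illustrated by clustering a weight tensor's values into k centroids and replacing each weight with its nearest centroid (1-D weight sharing). This sketch shows the idea on a single tensor only, as an assumption-laden simplification of the paper's approach, which applies clustering to all parts of the network:

```python
import numpy as np

def kmeans_quantize(weights, k=16, iters=20, rng=None):
    """Compress a weight tensor by 1-D k-means over its values,
    replacing each weight with its nearest centroid. Storing k centroids
    plus small per-weight indices shrinks the model footprint."""
    rng = np.random.default_rng(rng)
    flat = weights.ravel()
    centroids = rng.choice(flat, size=k, replace=False)  # init from data
    for _ in range(iters):
        # assign each weight to its nearest centroid, then recenter
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            mask = assign == j
            if mask.any():
                centroids[j] = flat[mask].mean()
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign].reshape(weights.shape), centroids
```

Because the quantized tensor contains at most k distinct values, it can be stored as compact indices into the centroid table, which is what makes the network small enough for memory- and energy-limited devices.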


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1892
Author(s):  
Simone Porcu ◽  
Alessandro Floris ◽  
Luigi Atzori

Most Facial Expression Recognition (FER) systems rely on machine learning approaches that require large databases for effective training. As these are not easily available, a good solution is to augment the databases with appropriate data augmentation (DA) techniques, which are typically based on either geometric transformations or oversampling augmentations (e.g., generative adversarial networks (GANs)). However, it is not always easy to understand which DA technique may be more convenient for FER systems, because most state-of-the-art experiments use different settings, making the impact of DA techniques incomparable across studies. To advance in this respect, in this paper we evaluate and compare the impact of well-established DA techniques on the emotion recognition accuracy of a FER system based on the well-known VGG16 convolutional neural network (CNN). In particular, we consider both geometric transformations and a GAN to increase the amount of training images. We performed cross-database evaluations: training with the "augmented" KDEF database and testing with two different databases (CK+ and ExpW). The best results were obtained by combining horizontal reflection, translation and GAN, bringing an accuracy increase of approximately 30%. This outperforms alternative approaches, except for one technique, which, however, relied on a considerably larger database.
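
The two geometric DA operations named here, horizontal reflection and translation, can be sketched directly on image arrays (the GAN component is a trained model and is out of scope for a short example); the fill value and API shape are illustrative choices:

```python
import numpy as np

def hflip(img):
    """Mirror the image horizontally (label-preserving for expressions)."""
    return img[:, ::-1].copy()

def translate(img, dx=0, dy=0, fill=0):
    """Shift the image by (dx, dy) pixels, padding vacated areas with fill."""
    out = np.full_like(img, fill)
    h, w = img.shape[:2]
    xs_dst = slice(max(dx, 0), min(w + dx, w))
    ys_dst = slice(max(dy, 0), min(h + dy, h))
    xs_src = slice(max(-dx, 0), min(w - dx, w))
    ys_src = slice(max(-dy, 0), min(h - dy, h))
    out[ys_dst, xs_dst] = img[ys_src, xs_src]
    return out
```

Both transforms leave the depicted expression intact, so each training face can yield several distinct augmented samples.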


2019 ◽  
Vol 11 (12) ◽  
pp. 1435 ◽  
Author(s):  
Shiran Song ◽  
Jianhua Liu ◽  
Heng Pu ◽  
Yuan Liu ◽  
Jingyan Luo

The efficient and accurate application of deep learning in the remote sensing field largely depends on the pre-processing technology of remote sensing images. In particular, image fusion is the essential way to achieve the complementarity of the panchromatic band and multispectral bands in high spatial resolution remote sensing images. In this paper, we pay attention not only to the visual effect of fused images, but also to the subsequent application effectiveness of information extraction and feature recognition based on fused images. Based on the WorldView-3 images of Tongzhou District of Beijing, we apply the fusion results to conduct experiments on object recognition of typical urban features based on deep learning. Furthermore, we perform a quantitative analysis of the existing pixel-based mainstream fusion methods of IHS (Intensity-Hue-Saturation), PCS (Principal Component Substitution), GS (Gram-Schmidt), ELS (Ehlers), HPF (High-Pass Filtering), and HCS (Hyperspherical Color Space) from the perspectives of spectrum, geometric features, and recognition accuracy. The results show that there are apparent differences in visual effect and quantitative index among the different fusion methods, and the PCS fusion method achieves the best overall effectiveness in deep learning-based object recognition of land cover (features).
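
Among the compared methods, the classic additive IHS scheme is simple enough to sketch: the multispectral intensity (mean of the bands) is replaced by the panchromatic band. This is a simplified stand-in assuming equal band weights; PCS would instead substitute the first principal component:

```python
import numpy as np

def ihs_fuse(ms, pan):
    """Additive IHS-style fusion: inject the panchromatic band by
    replacing the multispectral intensity (mean of bands) with pan."""
    ms = ms.astype(np.float64)
    intensity = ms.mean(axis=2, keepdims=True)
    return ms + (pan[..., None].astype(np.float64) - intensity)
```

By construction, the band-mean of the fused result equals the panchromatic band exactly, which transfers its high spatial detail while retaining the spectral differences between bands.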

