Deep Learning Case Study on Imbalanced Training Data for Automatic Bird Identification

A Case Study of the Augmentation and Evaluation of Training Data for Deep Learning

Journal of Data and Information Quality ◽

10.1145/3317573 ◽

2019 ◽

Vol 11 (4) ◽

pp. 1-22 ◽

Cited By ~ 1

Author(s):

Junhua Ding ◽

Xinchuan Li ◽

Xiaojun Kang ◽

Venkat N. Gudivada

Keyword(s):

Deep Learning ◽

Training Data ◽

Evaluation Of Training


Deep Learning on Construction Sites: A Case Study of Sparse Data Learning Techniques for Rebar Segmentation

Sensors ◽

10.3390/s21165428 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5428

Author(s):

Suzanna Cuypers ◽

Maarten Bassier ◽

Maarten Vergauwen

Keyword(s):

Deep Learning ◽

Image Interpretation ◽

Semantic Segmentation ◽

Training Model ◽

Training Data ◽

Construction Site ◽

Major Drawback ◽

Automate Monitoring ◽

Site Monitoring

With recent advancements in deep learning models for image interpretation, it has finally become possible to automate construction site monitoring processes that rely on remote sensing. However, the major drawback of these models is their dependency on large datasets of training images labeled at pixel level, which have to be produced manually by skilled personnel. To alleviate the need for training data, this study evaluates weakly- and semi-supervised semantic segmentation models for construction site imagery to efficiently automate monitoring tasks. As a case study, we compare fully-, weakly- and semi-supervised methods for the detection of rebar covers, which are useful for quality control. In the experiments, recent models, i.e., IRNet, DeepLabv3+ and the cross-consistency training model, are compared for their ability to segment rebar covers from construction site imagery with minimal manual input. The results show that weakly- and semi-supervised models can indeed approach the performance of fully-supervised models, with the majority of the target objects being properly found. Through this study, construction site stakeholders are provided with detailed information on how to leverage deep learning for efficient construction site monitoring and weigh preprocessing, training and testing efforts against each other in order to decide between fully-, weakly- and semi-supervised training.


A regularized ensemble framework of deep learning for cancer detection from multi-class, imbalanced training data

Pattern Recognition ◽

10.1016/j.patcog.2017.12.017 ◽

2018 ◽

Vol 77 ◽

pp. 160-172 ◽

Cited By ~ 57

Author(s):

Xiaohui Yuan ◽

Lijun Xie ◽

Mohamed Abouelenien

Keyword(s):

Deep Learning ◽

Cancer Detection ◽

Training Data ◽

Imbalanced Training Data


Automatic Mapping of Thermokarst Landforms from Remote Sensing Images Using Deep Learning: A Case Study in the Northeastern Tibetan Plateau

Remote Sensing ◽

10.3390/rs10122067 ◽

2018 ◽

Vol 10 (12) ◽

pp. 2067 ◽

Cited By ~ 7

Author(s):

Lingcao Huang ◽

Lin Liu ◽

Liming Jiang ◽

Tingjun Zhang

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

High Resolution ◽

Tibetan Plateau ◽

Training Data ◽

Fine Tuning ◽

Remote Sensing Images ◽

Northeastern Tibetan Plateau ◽

High Resolution Images

Thawing of ice-rich permafrost causes thermokarst landforms on the ground surface. Obtaining the distribution of thermokarst landforms is a prerequisite for understanding permafrost degradation and carbon exchange at local and regional scales. However, because of their diverse types and characteristics, it is challenging to map thermokarst landforms from remote sensing images. We conducted a case study towards automatically mapping a type of thermokarst landforms (i.e., thermo-erosion gullies) in a local area in the northeastern Tibetan Plateau from high-resolution images by the use of deep learning. In particular, we applied the DeepLab algorithm (based on Convolutional Neural Networks) to a 0.15-m-resolution Digital Orthophoto Map (created using aerial photographs taken by an Unmanned Aerial Vehicle). Here, we document the detailed processing flow with key steps including preparing training data, fine-tuning, inference, and post-processing. Validating against the field measurements and manual digitizing results, we obtained an F1 score of 0.74 (precision is 0.59 and recall is 1.0), showing that the proposed method can effectively map small and irregular thermokarst landforms. It is potentially viable to apply the designed method to mapping diverse thermokarst landforms in a larger area where high-resolution images and training data are available.
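
The reported F1 score can be checked from the precision and recall quoted in the abstract. The snippet below is a minimal sketch that assumes the standard harmonic-mean definition of F1; the function name `f1_score` is chosen here for illustration and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: precision 0.59, recall 1.0
f1 = f1_score(0.59, 1.0)
print(round(f1, 2))  # 0.74
```

With recall at 1.0, the F1 score is driven entirely by precision, so the 0.74 figure reflects false positives rather than missed gullies.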


Siamese Reconstruction Network: Accurate Image Reconstruction from Human Brain Activity by Learning to Compare

Applied Sciences ◽

10.3390/app9224749 ◽

2019 ◽

Vol 9 (22) ◽

pp. 4749

Author(s):

Lingyun Jiang ◽

Kai Qiao ◽

Linyuan Wang ◽

Chi Zhang ◽

Jian Chen ◽

...

Keyword(s):

Deep Learning ◽

Human Brain ◽

Brain Activity ◽

Feature Space ◽

Training Data ◽

Reconstruction Method ◽

Learning Method ◽

Training Samples ◽

Visual Reconstruction ◽

Relationship Of

Decoding human brain activities, especially reconstructing human visual stimuli via functional magnetic resonance imaging (fMRI), has gained increasing attention in recent years. However, the high dimensionality and small quantity of fMRI data impose restrictions on satisfactory reconstruction, especially for reconstruction methods based on deep learning, which require huge amounts of labelled samples. In contrast to deep learning methods, humans can recognize a new image because the human visual system is naturally capable of extracting features from any object and comparing them. Inspired by this visual mechanism, we introduced the mechanism of comparison into the deep learning method to realize better visual reconstruction by making full use of each sample and the relationship of the sample pair by learning to compare. In this way, we proposed a Siamese reconstruction network (SRN) method. By using the SRN, we improved upon previous results on two fMRI recording datasets, providing 72.5% accuracy on the digit dataset and 44.6% accuracy on the character dataset. Essentially, this approach can expand the training data from about n samples to 2n sample pairs, which takes full advantage of the limited quantity of training samples. The SRN learns to converge sample pairs of the same class and disperse sample pairs of different classes in feature space.
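
The idea of pulling same-class pairs together and pushing different-class pairs apart in feature space can be illustrated with a generic contrastive loss. This is a sketch of a common textbook formulation, not the loss used by the SRN authors, which the abstract does not specify; the function names and the `margin` parameter are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_class, margin=1.0):
    """Generic contrastive loss for one pair of feature vectors.

    Same-class pairs are penalised by their squared distance (pulled together).
    Different-class pairs are penalised only while they are closer than
    `margin` (pushed apart until the margin is reached).
    """
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A different-class pair that is too close incurs a loss...
print(contrastive_loss([0.0], [0.5], same_class=False))
# ...while one already beyond the margin does not.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False))
```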


Domain randomization-enhanced deep learning models for bird detection

Scientific Reports ◽

10.1038/s41598-020-80101-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Xin Mao ◽

Jun Kang Chow ◽

Pin Siang Tan ◽

Kuan-fu Liu ◽

Jimmy Wu ◽

...

Keyword(s):

Deep Learning ◽

Continuous Monitoring ◽

Bird Species ◽

Training Data ◽

Learning Models ◽

Fine Grained ◽

Bird Detection ◽

Relationship Of ◽

The Relationship

Automatic bird detection in ornithological analyses is limited by the accuracy of existing models, due to the lack of training data and the difficulties in extracting the fine-grained features required to distinguish bird species. Here we apply the domain randomization strategy to enhance the accuracy of the deep learning models in bird detection. Trained with virtual birds of sufficient variations in different environments, the model tends to focus on the fine-grained features of birds and achieves higher accuracies. Based on the 100 terabytes of 2-month continuous monitoring data of egrets, our results cover the findings using conventional manual observations, e.g., vertical stratification of egrets according to body size, and also open up opportunities of long-term bird surveys requiring intensive monitoring that is impractical using conventional methods, e.g., the weather influences on egrets, and the relationship of the migration schedules between the great egrets and little egrets.


U-Infuse: Democratization of Customizable Deep Learning for Object Detection

Sensors ◽

10.3390/s21082611 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2611

Author(s):

Andrew Shepley ◽

Greg Falzon ◽

Christopher Lawson ◽

Paul Meek ◽

Paul Kwan

Keyword(s):

Deep Learning ◽

Intellectual Property ◽

Object Detection ◽

Image Data ◽

Learning Technologies ◽

Training Data ◽

Learning Models ◽

Ecological Data ◽

Single Class ◽

Large Numbers

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. 
Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.


Applications of deep learning to decorated ceramic typology and classification: A case study using Tusayan White Ware from Northeast Arizona

Journal of Archaeological Science ◽

10.1016/j.jas.2021.105375 ◽

2021 ◽

Vol 130 ◽

pp. 105375

Author(s):

Leszek M. Pawlowicz ◽

Christian E. Downum

Keyword(s):

Deep Learning ◽

Ceramic Typology


Deep Learning-Based Differentiation between Mucinous Cystic Neoplasm and Serous Cystic Neoplasm in the Pancreas Using Endoscopic Ultrasonography

Diagnostics ◽

10.3390/diagnostics11061052 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1052

Author(s):

Leang Sim Nguon ◽

Kangwon Seo ◽

Jung-Hyun Lim ◽

Tae-Jun Song ◽

Sung-Hyun Cho ◽

...

Keyword(s):

Decision Making ◽

Deep Learning ◽

Network Model ◽

Endoscopic Ultrasonography ◽

Data Augmentation ◽

Clinical Information ◽

Training Data ◽

Fine Tuning ◽

Cystic Neoplasm ◽

Cystic Neoplasms

Mucinous cystic neoplasms (MCN) and serous cystic neoplasms (SCN) account for a large portion of solitary pancreatic cystic neoplasms (PCN). In this study we implemented a convolutional neural network (CNN) model using ResNet50 to differentiate between MCN and SCN. The training data were collected retrospectively from 59 MCN and 49 SCN patients from two different hospitals. Data augmentation was used to enhance the size and quality of training datasets. Fine-tuning training approaches were utilized by adopting the pre-trained model from transfer learning while training selected layers. Testing of the network was conducted by varying the endoscopic ultrasonography (EUS) image sizes and positions to evaluate the network performance for differentiation. The proposed network model achieved up to 82.75% accuracy and a 0.88 (95% CI: 0.817–0.930) area under curve (AUC) score. The performance of the implemented deep learning networks in decision-making using only EUS images is comparable to that of traditional manual decision-making using EUS images along with supporting clinical information. Gradient-weighted class activation mapping (Grad-CAM) confirmed that the network model learned the features from the cyst region accurately. This study proves the feasibility of diagnosing MCN and SCN using a deep learning network model. Further improvement using more datasets is needed.


Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition

Applied Sciences ◽

10.3390/app11062866 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2866

Author(s):

Damheo Lee ◽

Donghyun Kim ◽

Seung Yun ◽

Sanghun Kim

Keyword(s):

Speech Recognition ◽

Language Model ◽

Reduction Rate ◽

Code Switching ◽

Training Data ◽

Target Domain ◽

Phonetic Variation ◽

Language Model Adaptation ◽

Imbalanced Training Data ◽

Lm Adaptation

In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, the phonetic variations in English pronunciation spoken by Korean speakers should be considered. Thus, we tried to find a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted the CS sentences semantically similar to the target domain and then applied the language model (LM) adaptation to solve the biased modeling toward Korean due to the imbalanced training data. In this experiment, training data were AI Hub (1033 h) in Korean and Librispeech (960 h) in English. As a result, when compared to the baseline, the proposed method improved the error reduction rate (ERR) by up to 11.6% with phonetic variant modeling and by 17.3% when semantically similar sentences were applied to the LM adaptation. If we considered only English words, the word correction rate improved up to 24.2% compared to that of the baseline. The proposed method seems to be very effective in CS speech recognition.
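
The error reduction rate (ERR) quoted above is not defined in the abstract. The sketch below assumes the usual meaning, the relative reduction in error rate against the baseline, and uses hypothetical error rates chosen only so that the arithmetic lands on the reported 11.6%; they are not figures from the paper.

```python
def error_reduction_rate(baseline_error: float, proposed_error: float) -> float:
    """Relative error reduction in percent, assuming
    ERR = (baseline - proposed) / baseline * 100."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - proposed_error) / baseline_error * 100


# Hypothetical error rates (illustration only, not reported in the abstract)
baseline = 20.0   # baseline error rate in percent
proposed = 17.68  # error rate after phonetic variant modeling, in percent
print(round(error_reduction_rate(baseline, proposed), 1))  # 11.6
```

Under this definition an ERR of 11.6% is a relative improvement, so the absolute drop in error rate depends on how high the baseline error was.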
