Detection of Invasive Species in Wetlands: Practical DL with Heavily Imbalanced Data

Deep Learning (DL) has become popular due to its ease of use and accuracy, with Transfer Learning (TL) effectively reducing the number of images needed to solve environmental problems. However, this approach has some limitations, which we set out to explore. Our goal is to detect the presence of an invasive blueberry species in aerial images of wetlands. This is a key problem in ecosystem protection that is also challenging for DL because of the severe imbalance present in the data. Results for the ResNet50 network show a high classification accuracy while largely ignoring the blueberry class, rendering these results of limited practical interest for detecting that specific class. Moreover, by using loss function weighting and data augmentation, results more in line with the goals of our practical application can be obtained. Our experiments regarding TL show that ImageNet weights do not produce satisfactory results when only the final layer of the network is trained. Furthermore, only minor gains are obtained compared with random weights when the whole network is retrained. Finally, in a study of state-of-the-art DL architectures, the best results were obtained by the ResNeXt architecture, with a True Positive Rate of 93.75 and an accuracy of 98.11 for the Blueberry class, with ResNet50, DenseNet, and Wide ResNet obtaining close results.
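
The loss function weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes inverse-frequency class weights and a plain cross-entropy loss; the abstract does not say which weighting scheme the authors used, and the function names, class counts, and probabilities below are hypothetical.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_c) (an assumed scheme)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i] maps class -> probability."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy data: 95 background tiles and 5 blueberry tiles (hypothetical numbers).
labels = ["background"] * 95 + ["blueberry"] * 5
weights = class_weights(labels)

# A classifier that confidently predicts "background" for every tile.
probs = [{"background": 0.99, "blueberry": 0.01} for _ in labels]

unweighted = weighted_cross_entropy(probs, labels, {c: 1.0 for c in weights})
weighted = weighted_cross_entropy(probs, labels, weights)
print(weights)
print(unweighted, weighted)
```

With these toy numbers the weighted loss is much larger than the unweighted one, so a model that ignores the rare class is penalised far more heavily during training.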


Big data classification with optimization driven MapReduce framework

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-210062 ◽

2021 ◽

Vol 25 (2) ◽

pp. 173-183

Author(s):

Mujeeb Shaik Mohammed ◽

Praveen Sam Rachapudy ◽

Madhavi Kasa

Keyword(s):

Big Data ◽

Optimization Algorithm ◽

Moving Average ◽

Imbalanced Data ◽

Data Classification ◽

Bat Algorithm ◽

True Positive Rate ◽

Mapreduce Framework ◽

Positive Rate ◽

Big Data Classification

With technical advances, the amount of big data is increasing day by day, to the point that traditional software tools struggle to handle it. Additionally, the presence of imbalanced data in big data is a major concern for the research industry. To manage big data effectively and to deal with imbalanced data, this paper proposes a new optimization algorithm. Here, big data classification is performed using the MapReduce framework, wherein the map and reduce functions are based on the proposed optimization algorithm. The optimization algorithm is named the Exponential Bat algorithm (E-Bat), which integrates the Exponential Weighted Moving Average (EWMA) and the Bat Algorithm (BA). The map function selects the features, which are passed to the reducer module for classification using a Neural Network (NN). Thus, the classification of big data is performed using the proposed E-Bat algorithm-based MapReduce framework, and the experimentation is performed using four standard datasets: Breast cancer, Hepatitis, Pima Indian diabetes, and Heart disease. The experimental results show that the proposed method achieved a maximal accuracy of 0.8829 and a True Positive Rate (TPR) of 0.9090.
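
As a rough illustration of the EWMA component named in this abstract, the sketch below implements only the standard recursion s_t = alpha * x_t + (1 - alpha) * s_(t-1). How the authors combine it with the Bat Algorithm is not described here, so that step is not attempted; the function name, the `alpha` value, and the input series are assumptions.

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        # The first smoothed value is initialised with the first observation.
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1 - alpha) * prev)
    return smoothed


history = [10.0, 12.0, 11.0, 15.0]  # hypothetical series
print(ewma(history, alpha=0.5))  # → [10.0, 11.0, 11.0, 13.0]
```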


Audio-Based Aircraft Detection System for Safe RPAS BVLOS Operations

10.20944/preprints202010.0343.v1 ◽

2020 ◽

Author(s):

Jorge Mariscal-Harana ◽

Víctor Alarcón ◽

Fidel González ◽

Juan José Calvente ◽

Francisco Javier Pérez-Grau ◽

...

Keyword(s):

Real Time ◽

Data Augmentation ◽

Detection System ◽

Cost Effective ◽

True Positive Rate ◽

Computational Performance ◽

Combining Data ◽

Wide Range ◽

Positive Rate ◽

Aircraft Detection

For the Remotely Piloted Aircraft Systems (RPAS) market to continue its current growth rate, cost-effective "Detect and Avoid" systems which enable safe beyond visual line of sight (BVLOS) operations are critical. We propose an audio-based "Detect and Avoid" system, composed of microphones and an embedded computer, which performs real-time inferences using a sound event detection (SED) deep learning model. Two state-of-the-art SED models, YAMNet and VGGish, are fine-tuned using our aircraft sounds dataset and their performances are compared for a wide range of configurations. YAMNet, whose MobileNet architecture is designed for embedded applications, outperformed VGGish both in terms of aircraft detection and computational performance. YAMNet's optimal configuration, with > 70% true positive rate and precision, results from combining data augmentation and undersampling with the highest available inference frequency (i.e. 10 Hz). While our proposed "Detect and Avoid" system already allows the detection of small aircraft from sound in real time, a larger dataset, sensor fusion, or the use of cloud-based services for remote computations could further improve its performance.
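
The undersampling step mentioned in this abstract can be sketched as random undersampling of the majority class. This is a minimal sketch under assumptions: the abstract does not specify the procedure, and the clip names, class labels, and counts are hypothetical.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class has as many items as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Hypothetical dataset: 90 clips without aircraft, 10 clips with aircraft.
clips = [f"clip_{i}" for i in range(100)]
clip_labels = ["no_aircraft"] * 90 + ["aircraft"] * 10
balanced = undersample(clips, clip_labels)
print(len(balanced))  # → 20
```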


Building Damage Detection from Post-Event Aerial Imagery Using Single Shot Multibox Detector

Applied Sciences ◽

10.3390/app9061128 ◽

2019 ◽

Vol 9 (6) ◽

pp. 1128 ◽

Cited By ~ 12

Author(s):

Yundong Li ◽

Wei Hu ◽

Han Dong ◽

Xueyan Zhang

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Hurricane Sandy ◽

Training Data ◽

Aerial Images ◽

Detection Methods ◽

Single Shot ◽

Data Set ◽

Augmentation Strategies ◽

Post Disaster

Aerial cameras, satellite remote sensing, or unmanned aerial vehicles (UAVs) equipped with cameras can facilitate search and rescue tasks after disasters. The traditional manual interpretation of huge aerial images is inefficient and could be replaced by machine learning-based methods combined with image processing techniques. Given the development of machine learning, researchers find that convolutional neural networks can effectively extract features from images. Some target detection methods based on deep learning, such as the single-shot multibox detector (SSD) algorithm, can achieve better results than traditional methods. However, the impressive performance of machine learning-based methods depends on large numbers of labeled samples. Given the complexity of post-disaster scenarios, obtaining many samples in the aftermath of disasters is difficult. To address this issue, a damaged building assessment method using SSD with pretraining and data augmentation is proposed in the current study, which highlights the following aspects. (1) Objects can be detected and classified into undamaged buildings, damaged buildings, and ruins. (2) A convolutional auto-encoder (CAE) based on VGG16 is constructed and trained using unlabeled post-disaster images. As a transfer learning strategy, the weights of the SSD model are initialized using the weights of the CAE counterpart. (3) Data augmentation strategies, such as image mirroring, rotation, Gaussian blur, and Gaussian noise processing, are utilized to augment the training data set. As a case study, aerial images of Hurricane Sandy in 2012 were used to validate the proposed method's effectiveness. Experiments show that the pretraining strategy yields an improvement of 10% in overall accuracy compared with the SSD trained from scratch. These experiments also demonstrate that using data augmentation strategies can improve mAP and mF1 by 72% and 20%, respectively. Finally, the method is further verified on another dataset, from Hurricane Irma, and it is concluded that the proposed method is feasible.
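
Three of the augmentation strategies listed in this abstract (mirroring, rotation, and Gaussian noise) can be sketched on a tiny grayscale image stored as nested lists. This is an illustrative sketch only: Gaussian blur is omitted for brevity, and the function names, the 90-degree rotation angle, and the noise level `sigma` are assumptions rather than the authors' settings.

```python
import random


def mirror(img):
    """Flip the image horizontally."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def add_gaussian_noise(img, sigma=5.0, seed=0):
    """Add zero-mean Gaussian noise and clip the result to the 0..255 range."""
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


image = [[1, 2], [3, 4]]  # hypothetical 2x2 grayscale tile
print(mirror(image))    # → [[2, 1], [4, 3]]
print(rotate90(image))  # → [[3, 1], [4, 2]]
augmented = [image, mirror(image), rotate90(image), add_gaussian_noise(image)]
```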


PRATD: A Phased Remote Access Trojan Detection Method with Double-Sided Features

Electronics ◽

10.3390/electronics9111894 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1894

Author(s):

Chun Guo ◽

Zihua Song ◽

Yuan Ping ◽

Guowei Shen ◽

Yuhei Cui ◽

...

Keyword(s):

False Positive ◽

Detection Method ◽

False Positive Rate ◽

True Positive Rate ◽

Remote Access ◽

Detection Methods ◽

Security Threats ◽

True Positive ◽

Trojan Detection ◽

Positive Rate

The Remote Access Trojan (RAT) is one of the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based detection. To combine their strengths, this article proposes a phased RAT detection method that uses double-sided features (PRATD). In PRATD, both host-side and network-side features are combined to build detection models, which helps to distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs to improve the True Positive Rate (TPR). Experiments on the network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs. It achieves a TPR as high as 93.609% with a False Positive Rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests it is a competitive candidate for RAT detection.
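
For readers unfamiliar with the two metrics reported in this abstract, the sketch below computes TPR = TP / (TP + FN) and FPR = FP / (FP + TN) from binary labels. The labels are invented for illustration and are unrelated to the PRATD experiments.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# 1 = RAT, 0 = benign program (hypothetical ground truth and predictions).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(tpr, fpr)
```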


Ascertaining an efficient eligibility cut-off for extended Medicare items for eating disorders

Australasian Psychiatry ◽

10.1177/10398562211028632 ◽

2021 ◽

pp. 103985622110286

Author(s):

Tracey Wade ◽

Jamie-Lee Pennesi ◽

Yuan Zhou

Keyword(s):

Eating Disorders ◽

Eating Disorder ◽

False Positive Rate ◽

Area Under The Curve ◽

Rate Sensitivity ◽

True Positive Rate ◽

Eating Disorder Examination Questionnaire ◽

Eating Disorder Examination ◽

Positive Rate ◽

The Relationship

Objective: Currently, eligibility for expanded Medicare items for eating disorders (excluding anorexia nervosa) requires a score ⩾ 3 on the 22-item Eating Disorder Examination-Questionnaire (EDE-Q). We compared these EDE-Q “cases” with continuous scores on a validated 7-item version of the EDE-Q (EDE-Q7) to identify an EDE-Q7 cut-off commensurate with 3 on the EDE-Q. Methods: We utilised EDE-Q scores of female university students (N = 337) at risk of developing an eating disorder. We used a receiver operating characteristic (ROC) curve to assess the relationship between the true-positive rate (sensitivity) and the false-positive rate (1-specificity) of cases ⩾ 3. Results: The area under the curve showed outstanding discrimination of 0.94 (95% CI: .92–.97). We examined two specific cut-off points on the EDE-Q7, which included 100% and 87% of true cases, respectively. Conclusion: Given that the EDE-Q cut-off for Medicare is used in conjunction with other criteria, we suggest using the more permissive EDE-Q7 cut-off (⩾2.5) to replace use of the EDE-Q cut-off (⩾3) in eligibility assessments.
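
The trade-off between candidate cut-offs described in this abstract can be sketched by computing sensitivity and the false-positive rate at each threshold, which gives individual points on a ROC curve. The scores and case labels below are invented for illustration and are not the study data; the function name is hypothetical.

```python
def roc_points(scores, is_case, cutoffs):
    """For each cut-off, flag 'score >= cutoff' as positive and return
    (cutoff, sensitivity, false_positive_rate)."""
    pos = sum(1 for y in is_case if y)
    neg = len(is_case) - pos
    points = []
    for c in cutoffs:
        tp = sum(1 for s, y in zip(scores, is_case) if y and s >= c)
        fp = sum(1 for s, y in zip(scores, is_case) if not y and s >= c)
        points.append((c, tp / pos, fp / neg))
    return points


# Hypothetical short-form scores and reference "case" labels.
scores = [0.5, 1.0, 1.8, 2.7, 2.6, 3.1, 3.5, 4.2]
is_case = [False, False, False, False, True, True, True, True]
for cutoff, sensitivity, fpr in roc_points(scores, is_case, [2.5, 3.0]):
    print(cutoff, sensitivity, fpr)
```

In this toy example the lower cut-off captures every case at the cost of one false positive, while the higher cut-off misses one case but flags no non-cases.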


IDA-GAN: A Novel Imbalanced Data Augmentation GAN

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9411996 ◽

2021 ◽

Author(s):

Hao Yang ◽

Yun Zhou

Keyword(s):

Data Augmentation ◽

Imbalanced Data


Pretherapeutic Imaging for Axillary Staging in Breast Cancer: A Systematic Review and Meta-Analysis of Ultrasound, MRI and FDG PET

Journal of Clinical Medicine ◽

10.3390/jcm10071543 ◽

2021 ◽

Vol 10 (7) ◽

pp. 1543

Author(s):

Morwenn Le Boulc’h ◽

Julia Gilhodes ◽

Zara Steinmeyer ◽

Sébastien Molière ◽

Carole Mathelin

Keyword(s):

Systematic Review ◽

Sensitivity And Specificity ◽

Meta Analysis ◽

False Negative ◽

Significant Proportion ◽

False Negative Rate ◽

True Positive Rate ◽

Axillary Staging ◽

Positron Emission ◽

Positive Rate

Background: This systematic review aimed at comparing the performance of ultrasonography (US), magnetic resonance imaging (MRI), and fluorodeoxyglucose positron emission tomography (PET) for axillary staging, with a focus on micro- or macrometastases. Methods: A search for relevant studies published between January 2002 and March 2018 was conducted in the MEDLINE database. Study quality was assessed using the QUality Assessment of Diagnostic Accuracy Studies checklist. Sensitivity and specificity were meta-analyzed using a bivariate random effects approach. Results: Across 62 studies (n = 10,374 patients), sensitivity and specificity to detect metastatic ALN were, respectively, 51% (95% CI: 43–59%) and 100% (95% CI: 99–100%) for US, 83% (95% CI: 72–91%) and 85% (95% CI: 72–92%) for MRI, and 49% (95% CI: 39–59%) and 94% (95% CI: 91–96%) for PET. Interestingly, US detects a significant proportion of macrometastases (false negative rate was 0.28 (0.22, 0.34) for more than 2 metastatic ALN and 0.96 (0.86, 0.99) for micrometastases). In contrast, PET tends to detect a significant proportion of micrometastases (true positive rate = 0.41 (0.29, 0.54)). Data are not available for MRI. Conclusions: In comparison with MRI and fluorodeoxyglucose (FDG) PET, US is an effective technique for axillary triage, especially to detect high metastatic burden without upstaging the majority of micrometastases.


Generative Adversarial Networks to Improve the Robustness of Visual Defect Segmentation by Semantic Networks in Manufacturing Components

Applied Sciences ◽

10.3390/app11146368 ◽

2021 ◽

Vol 11 (14) ◽

pp. 6368

Author(s):

Fátima A. Saiz ◽

Garazi Alfaro ◽

Iñigo Barandiaran ◽

Manuel Graña

Keyword(s):

Ad Hoc ◽

Data Augmentation ◽

Semantic Network ◽

Semantic Networks ◽

Stereo Image ◽

Generative Adversarial Networks ◽

Specific Class ◽

Adversarial Networks ◽

Augmentation Techniques ◽

Image Acquisition System

This paper describes the application of Semantic Networks for the detection of defects in images of metallic manufactured components in a situation where the number of available samples of defects is small, which is rather common in real practical environments. In order to overcome this shortage of data, the common approach is to use conventional data augmentation techniques. We resort to Generative Adversarial Networks (GANs) that have shown the capability to generate highly convincing samples of a specific class as a result of a game between a discriminator and a generator module. Here, we apply the GANs to generate samples of images of metallic manufactured components with specific defects, in order to improve training of Semantic Networks (specifically DeepLabV3+ and Pyramid Attention Network (PAN) networks) carrying out the defect detection and segmentation. Our process carries out the generation of defect images using the StyleGAN2 with the DiffAugment method, followed by a conventional data augmentation over the entire enriched dataset, achieving a large balanced dataset that allows robust training of the Semantic Network. We demonstrate the approach on a private dataset generated for an industrial client, where images are captured by an ad-hoc photometric-stereo image acquisition system, and a public dataset, the Northeastern University surface defect database (NEU). The proposed approach achieves an improvement of 7% and 6% in an intersection over union (IoU) measure of detection performance on each dataset over the conventional data augmentation.
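
The intersection over union (IoU) measure used in this abstract is the number of pixels where prediction and ground truth are both positive, divided by the number of pixels where either is positive. A minimal sketch on binary masks follows; the masks are hypothetical, and treating two empty masks as a perfect match is an assumed convention.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
print(iou(pred, truth))  # → 0.5
```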


Markerless tracking of an entire honey bee colony

Nature Communications ◽

10.1038/s41467-021-21769-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Katarzyna Bozek ◽

Laetitia Hebert ◽

Yoann Portugal ◽

Greg J. Stephens

Keyword(s):

Honey Bee ◽

True Positive Rate ◽

Computational Method ◽

Visual Features ◽

Brood Cell ◽

Collective Behaviors ◽

Markerless Tracking ◽

Bee Colony ◽

Positive Rate ◽

Group Members

From cells in tissue, to bird flocks, to human crowds, living systems display a stunning variety of collective behaviors. Yet quantifying such phenomena first requires tracking a significant fraction of the group members in natural conditions, a substantial and ongoing challenge. We present a comprehensive, computational method for tracking an entire colony of the honey bee Apis mellifera using high-resolution video on a natural honeycomb background. We adapt a convolutional neural network (CNN) segmentation architecture to automatically identify bee and brood cell positions, body orientations and within-cell states. We achieve high accuracy (~10% body width error in position, ~10° error in orientation, and true positive rate > 90%) and demonstrate months-long monitoring of sociometric colony fluctuations. These fluctuations include ~24 h cycles in the counted detections, negative correlation between bee and brood, and nightly enhancement of bees inside comb cells. We combine detected positions with visual features of organism-centered images to track individuals over time and through challenging occluding events, recovering ~79% of bee trajectories from five observation hives over 5 min timespans. The trajectories reveal important individual behaviors, including waggle dances and crawling inside comb cells. Our results provide opportunities for the quantitative study of collective bee behavior and for advancing tracking techniques of crowded systems.


Classifying the Biological Status of Honeybee Workers Using Gas Sensors

Sensors ◽

10.3390/s21010166 ◽

2020 ◽

Vol 21 (1) ◽

pp. 166

Author(s):

Jakub T. Wilk ◽

Beata Bąk ◽

Piotr Artiemjew ◽

Jerzy Wilde ◽

Maciej Siuda

Keyword(s):

Gas Sensors ◽

Laboratory Tests ◽

Test Chamber ◽

Ambient Air ◽

True Positive Rate ◽

True Positive ◽

Positive Rate ◽

Parameters Of Accuracy ◽

Selection Of

Honeybee workers have a specific smell depending on the age of the workers and the biological status of the colony. Laboratory tests were carried out at the Department of Apiculture at UWM Olsztyn, using gas sensors installed in two twin prototype multi-sensor detectors. The study aimed to compare the responses of sensors to the odor of old worker bees (3–6 weeks old), young ones (0–1 days old), and those from long-term queenless colonies. From the experimental colonies, 10 samples of 100 workers were taken for each group and placed successively in the research chambers for the duration of the study. Old workers came from outer nest combs, young workers from brood hatching in an incubator, and laying worker bees from brood combs of long-term queenless colonies (with laying worker bees' eggs, humped brood, and drones). Each probe was measured for 10 min, and ambient air was then supplied for another 10 min to regenerate the sensors. The results were analyzed using 10 different classifiers. The research has shown that the devices can distinguish between the biological statuses of bees. The effectiveness of distinguishing between classes, measured by balanced accuracy and true positive rate (0.763 and 0.742, respectively, for the best classifier, euclidean.1nn), may be satisfactory in the context of practical beekeeping. Depending on the environment accompanying the tested objects (the type of insert in the test chamber), the introduction of other classifiers as well as baseline correction methods may be considered, and the selection of the appropriate classifier for the task may be of great importance for the effectiveness of the classification.
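
The sketch below illustrates the two ideas named in this abstract, on the assumption that "euclidean.1nn" denotes a 1-nearest-neighbour classifier with Euclidean distance and that balanced accuracy means the average of per-class recall. The two-dimensional "sensor readings" and class labels are invented for illustration only.

```python
import math


def predict_1nn(train_x, train_y, x):
    """Return the label of the training point closest to x (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[best]


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical two-sensor readings for three groups of workers.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.1)]
train_y = ["young", "young", "old", "old", "queenless"]
test_x = [(0.15, 0.15), (0.85, 0.85), (0.55, 0.15), (0.3, 0.1)]
test_y = ["young", "old", "queenless", "queenless"]

pred_y = [predict_1nn(train_x, train_y, x) for x in test_x]
print(pred_y)
print(balanced_accuracy(test_y, pred_y))
```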
