Comparison of Deep Learning Approaches for Protective Behaviour Detection Under Class Imbalance from MoCap and EMG data

Malware has grown sharply in recent years, posing a significant security threat to organizations and individuals. Despite sustained efforts by cybersecurity researchers to defend against these threats, malware developers keep finding new ways to evade defense techniques. Traditional static and dynamic analysis methods are ineffective at identifying new malware and incur high memory and time overhead. Typical machine learning approaches that train a classifier on handcrafted features are also not sufficiently robust against these evasive techniques and require additional effort for feature engineering. Recent malware detectors also show performance degradation caused by class imbalance in malware datasets. To address these challenges, this work adopts a visualization-based method in which malware binaries are rendered as two-dimensional images and classified by a deep learning model. We propose an efficient malware detection system based on deep learning. The system uses a reweighted class-balanced loss function in the final classification layer of a DenseNet model to handle imbalanced data and achieve significant performance improvements in malware classification. Comprehensive experiments on four benchmark malware datasets show that the proposed approach detects new malware samples with higher accuracy (98.23% for the Malimg dataset, 98.46% for the BIG 2015 dataset, 98.21% for the MaleVis dataset, and 89.48% for the unseen Malicia dataset) and lower false-positive rates than conventional malware mitigation techniques, while maintaining low computational time. The proposed solution is also reliable and effective against obfuscation attacks.
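The abstract does not state how the class-balanced loss is reweighted. As a rough sketch only, one common scheme weights each class by the inverse of its "effective number of samples", (1 - beta^n) / (1 - beta). The choice of this scheme, the `beta` value, the function names, and the class counts below are all assumptions for illustration, not details taken from the paper.

```python
import math

def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights proportional to 1 / effective number of samples.

    Assumed scheme: effective number E_n = (1 - beta**n) / (1 - beta).
    Weights are rescaled so they sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    scale = len(class_counts) / sum(raw)
    return [w * scale for w in raw]

def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])

# Hypothetical class sizes for an imbalanced three-family dataset.
counts = [2000, 200, 20]
weights = class_balanced_weights(counts)

# The same predicted probability costs more when the true class is rare.
loss_common = weighted_cross_entropy([0.6, 0.3, 0.1], 0, weights)
loss_rare = weighted_cross_entropy([0.1, 0.3, 0.6], 2, weights)
```

With these weights, a misclassified sample from the rarest class contributes more to the loss than one from the largest class, which is the general effect a reweighted loss aims for.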

Tapping the Potential of Earth Observation - Calving Front Detection in SAR Images using Deep Learning Techniques

10.5194/egusphere-egu21-11280 ◽

2021 ◽

Author(s):

Nora Gourmelon ◽

Thorsten Seehaus ◽

AmirAbbas Davari ◽

Matthias Braun ◽

Andreas Maier ◽

...

Keyword(s):

Deep Learning ◽

Hydrological Cycle ◽

Class Imbalance ◽

Earth Observation ◽

Learning Approaches ◽

Sar Image ◽

Sar Images ◽

Distance Map ◽

Learning Techniques ◽

Front Detection

The calving fronts of lake or marine terminating glaciers provide information about the state of glaciers. A change in their position can affect the flow of the entire glacier system, and the loss of ice mass as icebergs calve off and discharge into the ocean has a multi-scale impact on the global hydrological cycle. The calving fronts can be manually delineated in Synthetic Aperture Radar (SAR) images. However, this is a time-consuming, tedious and expensive task. As deep learning approaches have achieved tremendous success in various disciplines, such as medical image processing and computer vision, the project Tapping the Potential of Earth Observation (TAPE) is, among other things, dedicated to applying deep learning techniques to calving front detection. So far, all our experiments have employed U-Net based architectures, as the U-Net is state-of-the-art in semantic image segmentation. A major challenge of front detection is the class imbalance: the front has significantly fewer pixels than the remaining parts of the SAR image. Hence, we developed variants of the U-Net that specifically address this challenge, including an Attention U-Net, a probabilistic Bayesian U-Net, and a U-Net with a distance map-based binary cross-entropy (BCE) loss function and the Matthews correlation coefficient (MCC) as early stopping criterion. In future work, we plan to investigate multi-task learning and a segmentation of the SAR image into different classes (i.e. ocean, glacier and rocks) to enhance the quality and efficiency of the front detection.
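To illustrate why the Matthews correlation coefficient is a reasonable stopping criterion under this kind of pixel imbalance, the sketch below computes MCC from binary confusion counts using the standard formula. The function names and the toy pixel counts are hypothetical and not taken from the TAPE project.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = front pixel, 0 = background)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy image: 10 front pixels among 1000 (hypothetical numbers).
y_true = [1] * 10 + [0] * 990
all_background = [0] * 1000  # a degenerate model that never predicts the front

tp, tn, fp, fn = confusion_counts(y_true, all_background)
accuracy = (tp + tn) / len(y_true)   # high despite finding no front pixels
score = mcc(tp, tn, fp, fn)          # exposes the degenerate prediction
```

Here plain accuracy rewards the all-background prediction, while MCC does not, so a validation MCC that stops improving is a more informative early stopping signal for a rare front class.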

Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions

10.1101/2020.06.25.166884 ◽

2020 ◽

Author(s):

Vladimir Golkov ◽

Alexander Becker ◽

Daniel T. Plop ◽

Daniel Čuturilo ◽

Neda Davoudi ◽

...

Keyword(s):

Deep Learning ◽

Drug Discovery ◽

High Throughput Screening ◽

Class Imbalance ◽

Ground Truth ◽

Rapid Screening ◽

Cost Functions ◽

Learning Approaches ◽

Modern Drug ◽

Training Schemes

Abstract: Computer-aided drug discovery is an essential component of modern drug development. Therein, deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features. Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets. In this work we argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance, its ability to compromise over different decision thresholds, certain freedom to influence the relative weights in this compromise, fidelity to typical benchmarking measures, and equivalence to positive/unlabeled learning. We also propose new training schemes (coherent mini-batch arrangement, and usage of out-of-batch samples) for cost functions based on the ROC, as well as a cost function based on the logAUC metric that facilitates early enrichment (i.e. improves performance at high decision thresholds, as often desired when synthesizing predicted hit compounds). We demonstrate that these approaches outperform standard deep learning approaches on a series of PubChem high-throughput screening datasets that represent realistic and diverse drug discovery campaigns on major drug target families.
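The robustness of ROC-based objectives to class imbalance follows from the pairwise view of the area under the ROC curve: it is the probability that a random positive is scored above a random negative. The sketch below shows that view, plus a generic sigmoid-smoothed surrogate. The surrogate is an illustrative assumption, not the cost function or training scheme proposed by the authors, and all names and scores are hypothetical.

```python
import math

def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def soft_auc_loss(scores, labels):
    """Smooth stand-in for 1 - AUC: the step function becomes a sigmoid."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # sigmoid(n - p) is large when a negative outranks a positive.
    total = sum(1.0 / (1.0 + math.exp(p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))

# Hypothetical model scores: two active molecules, three inactive ones.
scores = [0.9, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 0, 0, 0]
auc = pairwise_auc(scores, labels)

# Repeating every negative ten times leaves the AUC unchanged.
auc_imbalanced = pairwise_auc(scores + [0.6, 0.3, 0.1] * 9, labels + [0] * 27)
```

Because both the numerator and the denominator scale with the number of negatives, adding more inactive molecules with the same score distribution does not move the metric.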

Genome-wide discovery of pre-miRNAs: comparison of recent approaches based on machine learning

Briefings in Bioinformatics ◽

10.1093/bib/bbaa184 ◽

2020 ◽

Cited By ~ 1

Author(s):

Leandro A Bugnon ◽

Cristian Yones ◽

Diego H Milone ◽

Georgina Stegmayer

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Homo Sapiens ◽

False Positive Rate ◽

Class Imbalance ◽

Mirna Precursor ◽

Learning Approaches ◽

Novel Mirna ◽

Genome Wide ◽

Positive Rate

Abstract. Motivation: The genome-wide discovery of microRNAs (miRNAs) involves identifying sequences having the highest chance of being a novel miRNA precursor (pre-miRNA), within all the possible sequences in a complete genome. The known pre-miRNAs are usually just a few in comparison to the millions of candidates that have to be analyzed. This is of particular interest in non-model species and recently sequenced genomes, where the challenge is to find potential pre-miRNAs only from the sequenced genome. The task is unfeasible without the help of computational methods, such as deep learning. However, it is still very difficult to find an accurate predictor, with a low false positive rate in this genome-wide context. Although there are many available tools, these have not been tested in realistic conditions, with sequences from whole genomes and the high class imbalance inherent to such data. Results: In this work, we review six recent methods for tackling this problem with machine learning. We compare the models in five genome-wide datasets: Arabidopsis thaliana, Caenorhabditis elegans, Anopheles gambiae, Drosophila melanogaster, Homo sapiens. The models have been designed for the pre-miRNAs prediction task, where there is a class of interest that is significantly underrepresented (the known pre-miRNAs) with respect to a very large number of unlabeled samples. It was found that for the smaller genomes and smaller imbalances, all methods perform in a similar way. However, for larger datasets such as the H. sapiens genome, it was found that deep learning approaches using raw information from the sequences reached the best scores, achieving low numbers of false positives. Availability: The source code to reproduce these results is in: http://sourceforge.net/projects/sourcesinc/files/gwmirna Additionally, the datasets are freely available in: https://sourceforge.net/projects/sourcesinc/files/mirdata

IUCNN - deep learning approaches to approximate species' extinction risk

10.1101/2021.06.17.448832 ◽

2021 ◽

Author(s):

Alexander Zizka ◽

Tobias Andermann ◽

Daniele Silvestro

Keyword(s):

Neural Network ◽

Deep Learning ◽

Large Scale ◽

Extinction Risk ◽

Class Imbalance ◽

Network Models ◽

Training Data ◽

Assessment Process ◽

Learning Approaches ◽

Neural Network Models

Aim: The global Red List (RL) from the International Union for the Conservation of Nature is the most comprehensive global quantification of extinction risk, and widely used in applied conservation as well as in biogeographic and ecological research. Yet, due to the time-consuming assessment process, the RL is biased taxonomically and geographically, which limits its application on large scales, in particular for understudied areas such as the tropics, or understudied taxa, such as most plants and invertebrates. Here we present IUCNN, an R-package implementing deep learning models to predict species RL status from publicly available geographic occurrence records (and other traits if available). Innovation: We implement a user-friendly workflow to train and validate neural network models, and subsequently use them to predict species RL status. IUCNN contains functions to address specific issues related to the RL framework, including a regression-based approach to account for the ordinal nature of RL categories and class imbalance in the training data, a Bayesian approach for improved uncertainty quantification, and a target accuracy threshold approach that limits predictions to only those species whose RL status can be predicted with high confidence. Most analyses can be run with few lines of code, without prior knowledge of neural network models. We demonstrate the use of IUCNN on an empirical dataset of ~14,000 orchid species, for which IUCNN models can predict extinction risk within minutes, while outperforming comparable methods. Main conclusions: IUCNN harnesses innovative methodology to estimate the RL status of large numbers of species. By providing estimates of the number and identity of threatened species in custom geographic or taxonomic datasets, IUCNN enables large-scale analyses on the extinction risk of species so far not well represented on the official RL.

Deep Learning Approaches for Whiteboard Image Quality Enhancement

Color and Imaging Conference ◽

10.2352/j.imagingsci.technol.2019.63.4.040404 ◽

2019 ◽

Vol 2019 (1) ◽

pp. 360-368

Author(s):

Mekides Assefa Abebe ◽

Jon Yngve Hardeberg

Keyword(s):

Deep Learning ◽

Image Quality ◽

Image Data ◽

Quality Enhancement ◽

Network Architectures ◽

Learning Approaches ◽

Data Set ◽

Image Quality Enhancement ◽

Processing Techniques ◽

White Balancing

Different whiteboard image degradations highly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, different researchers addressed the problem through different image enhancement techniques. Most of the state-of-the-art approaches applied common image processing techniques such as background foreground segmentation, text extraction, contrast and color enhancements and white balancing. However, such types of conventional enhancement methods are incapable of recovering severely degraded pen-stroke contents and produce artifacts in the presence of complex pen-stroke illustrations. In order to surmount such problems, the authors have proposed a deep learning based solution. They have contributed a new whiteboard image data set and adopted two deep convolutional neural network architectures for whiteboard image quality enhancement applications. Their different evaluations of the trained models demonstrated their superior performances over the conventional methods.

Assessment of the Risk Factors of MDD Recurrence Based on Deep Learning Approaches

SSRN Electronic Journal ◽

10.2139/ssrn.3411719 ◽

2019 ◽

Author(s):

Qian Wu ◽

Weiling Zhao ◽

Xiaobo Yang ◽

Hua Tan ◽

Lei You ◽

...

Keyword(s):

Risk Factors ◽

Deep Learning ◽

Learning Approaches

The Problem of Fraudulent Content on the Web: Deep Learning Approaches

SSRN Electronic Journal ◽

10.2139/ssrn.3575411 ◽

2020 ◽

Author(s):

Priyanka Meel ◽

Farhin Bano ◽

Dr. Dinesh K. Vishwakarma

Keyword(s):

Deep Learning ◽

Learning Approaches ◽

The Web

Deep convolutional neural networks for cardiovascular vulnerable plaque detection

MATEC Web of Conferences ◽

10.1051/matecconf/201927702024 ◽

2019 ◽

Vol 277 ◽

pp. 02024 ◽

Cited By ~ 1

Author(s):

Lincan Li ◽

Tong Jia ◽

Tianqi Meng ◽

Yizhe Liu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Vulnerable Plaque ◽

Recall Rate ◽

Superior Performance ◽

Learning Approaches ◽

Deep Convolutional Neural Networks ◽

Vulnerable Plaques ◽

Plaque Detection

In this paper, an accurate two-stage deep learning method is proposed to detect vulnerable plaques in cardiovascular ultrasonic images. First, a Fully Convolutional Neural Network (FCN) named U-Net is used to segment the original Intravascular Optical Coherence Tomography (IVOCT) cardiovascular images. We experiment with different threshold values to find the best threshold for removing noise and background from the original images. Second, a modified Faster R-CNN is adopted for precise detection. The modified Faster R-CNN uses six-scale anchors (12², 16², 32², 64², 128², 256²) instead of the conventional one-scale or three-scale approaches. We first present three problems in cardiovascular vulnerable plaque diagnosis, then demonstrate how our method solves them. The proposed method applies deep convolutional neural networks to the whole diagnostic procedure. Test results show that the Recall rate, Precision rate, IoU (Intersection-over-Union) rate and Total score are 0.94, 0.885, 0.913 and 0.913 respectively, higher than those of the first-place team in the CCCV2017 Cardiovascular OCT Vulnerable Plaque Detection Challenge. The AP of the designed Faster R-CNN is 83.4%, higher than conventional approaches that use one-scale or three-scale anchors. These results demonstrate the superior performance of our proposed method and the power of deep learning approaches in diagnosing cardiovascular vulnerable plaques.
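The two quantities named in this abstract, multi-scale anchors and Intersection-over-Union, can be sketched as follows. This assumes the anchor list denotes box areas of 12² through 256² pixels; the three aspect ratios and all function names are assumptions for illustration, since the abstract does not give them.

```python
import math

ANCHOR_SCALES = [12, 16, 32, 64, 128, 256]  # side lengths; areas are scale**2
ASPECT_RATIOS = [0.5, 1.0, 2.0]             # assumed height/width ratios

def make_anchors(scales=ANCHOR_SCALES, ratios=ASPECT_RATIOS):
    """Return (width, height) pairs, one per scale and aspect ratio.

    Each anchor keeps the area scale**2 while its shape follows the ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((w, h))
    return anchors

def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

anchors = make_anchors()
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```

Adding the small 12² and 16² scales gives the detector candidate boxes that can match small plaque regions, which larger default anchors would overlap only weakly.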

Deep learning systems detect dysplasia with human-like accuracy using histopathology and probe-based confocal laser endomicroscopy

Scientific Reports ◽

10.1038/s41598-021-84510-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Shan Guleria ◽

Tilak U. Shah ◽

J. Vincent Pulido ◽

Matthew Fasullo ◽

Lubaina Ehsan ◽

...

Keyword(s):

Deep Learning ◽

Diagnostic Accuracy ◽

High Sensitivity ◽

Confocal Laser Endomicroscopy ◽

Confocal Laser ◽

Learning Approaches ◽

Learning Models ◽

Whole Slide Image ◽

Slide Image ◽

Level Model

Abstract: Probe-based confocal laser endomicroscopy (pCLE) allows for real-time diagnosis of dysplasia and cancer in Barrett’s esophagus (BE) but is limited by low sensitivity. Even the gold standard of histopathology is hindered by poor agreement between pathologists. We deployed deep-learning-based image and video analysis in order to improve diagnostic accuracy of pCLE videos and biopsy images. Blinded experts categorized biopsies and pCLE videos as squamous, non-dysplastic BE, or dysplasia/cancer, and deep learning models were trained to classify the data into these three categories. Biopsy classification was conducted using two distinct approaches: a patch-level model and a whole-slide-image-level model. Gradient-weighted class activation maps (Grad-CAMs) were extracted from pCLE and biopsy models in order to determine tissue structures deemed relevant by the models. 1970 pCLE videos, 897,931 biopsy patches, and 387 whole-slide images were used to train, test, and validate the models. In pCLE analysis, models achieved a high sensitivity for dysplasia (71%) and an overall accuracy of 90% for all classes. For biopsies at the patch level, the model achieved a sensitivity of 72% for dysplasia and an overall accuracy of 90%. The whole-slide-image-level model achieved a sensitivity of 90% for dysplasia and 94% overall accuracy. Grad-CAMs for all models showed activation in medically relevant tissue regions. Our deep learning models achieved high diagnostic accuracy for both pCLE-based and histopathologic diagnosis of esophageal dysplasia and its precursors, similar to human accuracy in prior studies. These machine learning approaches may improve accuracy and efficiency of current screening protocols.
