CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks

Many structural variation (SV) detection methods have been proposed due to the popularization of next-generation sequencing (NGS). These SV calling methods use different SV-property-dependent features; however, they all suffer from poor accuracy when running on low coverage sequences. The union of results from these tools achieves fairly high sensitivity but still produces low accuracy on low coverage sequence data. That is, the calls from these methods contain many false positives. In this paper, we present CNNdel, an approach for calling deletions from paired-end reads. CNNdel gathers SV candidates reported by multiple tools and then extracts features from aligned BAM files at the positions of candidates. With labeled feature-expressed candidates as a training set, CNNdel trains convolutional neural networks (CNNs) to distinguish true unlabeled candidates from false ones. Results show that CNNdel works well with NGS reads from 26 low coverage genomes of the 1000 Genomes Project. The paper demonstrates that convolutional neural networks can automatically assign the priority of SV features and effectively reduce false positives.
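The candidate-gathering step can be illustrated with a minimal sketch, assuming deletion calls are plain (chromosome, start, end) tuples and that two calls count as the same candidate when their reciprocal overlap is at least 50%. The function names, the tuple layout, and the 0.5 threshold are assumptions for illustration; the abstract does not say how CNNdel merges calls.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates; a hypothetical layout.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a[0] != b[0]:
        return 0.0
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return min(inter / (a[2] - a[1]), inter / (b[2] - b[1]))


def union_candidates(call_sets: List[List[Interval]], min_ro: float = 0.5) -> List[Interval]:
    """Take the union of several callers' deletions, keeping one call per event."""
    merged: List[Interval] = []
    for calls in call_sets:
        for cand in calls:
            # Keep a candidate only if no already-kept call matches it.
            if not any(reciprocal_overlap(cand, kept) >= min_ro for kept in merged):
                merged.append(cand)
    return merged


caller_a = [("chr1", 1000, 2000)]
caller_b = [("chr1", 1050, 2050), ("chr1", 5000, 5600)]
candidates = union_candidates([caller_a, caller_b])
```

Here the two overlapping calls on chr1 collapse into one candidate and the call unique to the second caller is kept, which is how a union raises sensitivity while leaving false positives for a downstream classifier.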


PostSV: A Post–Processing Approach for Filtering Structural Variations

Bioinformatics and Biology Insights ◽

10.1177/1177932219892957 ◽

2020 ◽

Vol 14 ◽

pp. 117793221989295 ◽

Cited By ~ 1

Author(s):

Eman Alzaid ◽

Achraf El Allali

Keyword(s):

State Of The Art ◽

Post Processing ◽

Structural Variations ◽

Sequencing Technologies ◽

Structural Differences ◽

Overall Performance ◽

Next Generation Sequencing Ngs ◽

Low Coverage ◽

Ngs Data ◽

Generation Sequencing

Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the short reads produced by NGS, the discovery of structural variants (SVs) by state-of-the-art SV callers is not always accurate. To improve performance, multiple SV callers are often used to detect variants. However, most SV callers suffer from high false-positive rates, which diminish overall performance, especially in low-coverage genomes. In this article, we propose a post-processing, classification-based algorithm that can be used to filter structural variation predictions produced by SV callers. Novel features are defined from putative SV predictions using reads in the local regions around the breakpoints. Several classifiers are employed to classify the candidate predictions and remove false positives. We test our classifier models on simulated and real genomes and show that the proposed approach improves the performance of state-of-the-art algorithms.
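As a rough illustration of a feature computed from reads around breakpoints, the sketch below compares read depth inside a putative deletion with depth in its flanks. This depth-ratio feature, the function names, and the 100 bp flank are assumptions chosen for illustration; the abstract does not list the features PostSV actually defines.

```python
from typing import List, Tuple

Read = Tuple[int, int]  # (start, end) of an aligned read, half-open


def mean_depth(reads: List[Read], start: int, end: int) -> float:
    """Mean per-base depth over [start, end) computed from read intervals."""
    if end <= start:
        return 0.0
    covered = sum(max(0, min(r_end, end) - max(r_start, start)) for r_start, r_end in reads)
    return covered / (end - start)


def depth_ratio_feature(reads: List[Read], del_start: int, del_end: int, flank: int = 100) -> float:
    """Depth inside the candidate divided by the mean depth of both flanks."""
    inside = mean_depth(reads, del_start, del_end)
    left = mean_depth(reads, del_start - flank, del_start)
    right = mean_depth(reads, del_end, del_end + flank)
    outside = (left + right) / 2
    return inside / outside if outside > 0 else 0.0


# Reads pile up on both flanks but none cover the candidate at 150-250.
reads = [(0, 100), (50, 150), (250, 350), (300, 400)]
ratio_deleted = depth_ratio_feature(reads, 150, 250)
# Adding a read across the region raises the ratio towards a non-deleted value.
ratio_covered = depth_ratio_feature(reads + [(150, 250)], 150, 250)
```

A ratio near zero is consistent with a homozygous deletion, while a ratio near one suggests the candidate is a false positive; a classifier would combine several such features rather than threshold one.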


Robust PPG Peak Detection Using Dilated Convolutional Neural Networks

10.36227/techrxiv.16529310.v2 ◽

2021 ◽

Author(s):

Kianoosh Kazemi ◽

Juho Laitala ◽

Iman Azimi ◽

Pasi Liljeberg ◽

Amir M. Rahmani

Keyword(s):

Neural Networks ◽

Heart Rate ◽

Convolutional Neural Networks ◽

Signal To Noise Ratio ◽

Motion Artifact ◽

Detection Algorithm ◽

Peak Detection ◽

Detection Methods ◽

Method Performance ◽

Data Generator

Accurate peak determination from noise-corrupted photoplethysmogram (PPG) signals is the basis for further analysis of physiological quantities such as heart rate and heart rate variability. In the past decades, many methods have been proposed to provide reliable peak detection. These peak detection methods include rule-based algorithms, adaptive thresholds, and signal processing techniques. However, they are designed for noise-free PPG signals and are insufficient for PPG signals with low signal-to-noise ratio (SNR). This paper focuses on enhancing PPG noise-resiliency and proposes a robust peak detection algorithm for PPG signals corrupted by noise and motion artifacts. Our algorithm is based on Convolutional Neural Networks (CNN) with dilated convolutions. Using dilated convolutions provides a large receptive field, making our CNN model robust for time-series processing. In this study, we use a dataset collected from wearable devices in health monitoring under free-living conditions. In addition, a data generator is developed for producing noisy PPG data used for training the network. The method performance is compared against other state-of-the-art methods and tested in SNRs ranging from 0 to 45 dB. Our method obtains better accuracy at all SNRs, compared with the existing adaptive threshold and transform-based methods. The proposed method shows an overall precision, recall, and F1-score of 80%, 80%, and 80%, respectively, across all SNR ranges. However, these figures for the other methods are below 78%, 77%, and 77%, respectively. The proposed method proves to be accurate for detecting PPG peaks even in the presence of noise.
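The claim that dilation enlarges the receptive field can be checked with a small sketch. It assumes stride-1 layers that share one kernel size, for which the receptive field is 1 + Σ (k − 1)·d over the layers. The kernel size and dilation schedule below are illustrative assumptions, not the architecture reported by the authors.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 1-D convolutions with given dilations."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """'Valid' 1-D cross-correlation with gaps of `dilation` between kernel taps."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


# Four layers of width-3 kernels: plain versus exponentially dilated.
plain_rf = receptive_field(3, [1, 1, 1, 1])
dilated_rf = receptive_field(3, [1, 2, 4, 8])

# A width-2 kernel with dilation 2 adds samples that are two steps apart.
out = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
```

With the same number of layers and weights, the dilated stack sees 31 samples instead of 9, which is why dilation helps a network take in a whole pulse waveform without pooling.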


Efficient implementation of convolutional neural networks in the data processing of two-photon in vivo imaging

Bioinformatics ◽

10.1093/bioinformatics/btz055 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3208-3210 ◽

Cited By ~ 1

Author(s):

Yangzhen Wang ◽

Feng Su ◽

Shanshan Wang ◽

Chaojuan Yang ◽

Yonglu Tian ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Detection Methods ◽

Supplementary Information ◽

Imaging Data ◽

Two Photon ◽

Processing Power ◽

Fluctuation Method ◽

The Brain

Motivation: Functional imaging at single-neuron resolution offers a highly efficient tool for studying the functional connectomics in the brain. However, mainstream neuron-detection methods focus on either the morphologies or activities of neurons, which may lead to the extraction of incomplete information and which may heavily rely on the experience of the experimenters. Results: We developed a convolutional neural networks and fluctuation method-based toolbox (ImageCN) to increase the processing power of calcium imaging data. To evaluate the performance of ImageCN, nine different imaging datasets were recorded from awake mouse brains. ImageCN demonstrated superior neuron-detection performance when compared with other algorithms. Furthermore, ImageCN does not require sophisticated training for users. Availability and implementation: ImageCN is implemented in MATLAB. The source code and documentation are available at https://github.com/ZhangChenLab/ImageCN. Supplementary information: Supplementary data are available at Bioinformatics online.


Identification of Tomato Disease Types and Detection of Infected Areas Based on Deep Convolutional Neural Networks and Object Detection Techniques

Computational Intelligence and Neuroscience ◽

10.1155/2019/9142753 ◽

2019 ◽

Vol 2019 ◽

pp. 1-15 ◽

Cited By ~ 6

Author(s):

Qimei Wang ◽

Feng Qi ◽

Minghe Sun ◽

Jianhua Qu ◽

Jie Xue

Keyword(s):

Neural Networks ◽

Object Detection ◽

Convolutional Neural Networks ◽

Detection Methods ◽

Disease Detection ◽

Deep Convolutional Neural Networks ◽

Detection Techniques ◽

Tomato Diseases ◽

Validation Set ◽

Tomato Disease

This study develops tomato disease detection methods based on deep convolutional neural networks and object detection models. Two different models, Faster R-CNN and Mask R-CNN, are used in these methods, where Faster R-CNN is used to identify the types of tomato diseases and Mask R-CNN is used to detect and segment the locations and shapes of the infected areas. To select the model that best fits the tomato disease detection task, four different deep convolutional neural networks are combined with the two object detection models. Data are collected from the Internet and the dataset is divided into a training set, a validation set, and a test set used in the experiments. The experimental results show that the proposed models can accurately and quickly identify the eleven tomato disease types and segment the locations and shapes of the infected areas.


The Unreasonable Effectiveness of Convolutional Neural Networks in Population Genetic Inference

10.1101/336073 ◽

2018 ◽

Cited By ~ 3

Author(s):

Lex Flagel ◽

Yaniv Brandvain ◽

Daniel R. Schrider

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Population Genetic ◽

Sequence Data ◽

Input Sequence ◽

Evolutionary Model ◽

Sequence Alignments ◽

Likelihood Approach ◽

Population Genetic Inference ◽

Genetic Inference

Population-scale genomic datasets have given researchers incredible amounts of information from which to infer evolutionary histories. Concomitant with this flood of data, theoretical and methodological advances have sought to extract information from genomic sequences to infer demographic events such as population size changes and gene flow among closely related populations/species, construct recombination maps, and uncover loci underlying recent adaptation. To date, most methods make use of only one or a few summaries of the input sequences and therefore ignore potentially useful information encoded in the data. The most sophisticated of these approaches involve likelihood calculations, which require theoretical advances for each new problem, and often focus on a single aspect of the data (e.g. only allele frequency information) in the interest of mathematical and computational tractability. Directly interrogating the entirety of the input sequence data in a likelihood-free manner would thus offer a fruitful alternative. Here we accomplish this by representing DNA sequence alignments as images and using a class of deep learning methods called convolutional neural networks (CNNs) to make population genetic inferences from these images. We apply CNNs to a number of evolutionary questions and find that they frequently match or exceed the accuracy of current methods. Importantly, we show that CNNs perform accurate evolutionary model selection and parameter estimation, even on problems that have not received detailed theoretical treatments. Thus, when applied to population genetic alignments, CNNs are capable of outperforming expert-derived statistical methods, and offer a new path forward in cases where no likelihood approach exists.
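One way to picture an alignment as an image is a 0/1 matrix with one row per sequence and one column per variable site. The sketch below assumes equal-length aligned sequences and encodes each site relative to the first sequence; that reference choice and the function name are assumptions for illustration and may differ from the encoding used in the paper.

```python
from typing import List


def alignment_to_matrix(seqs: List[str]) -> List[List[int]]:
    """Encode variable sites as 0 (matches first sequence) or 1 (differs)."""
    ref = seqs[0]
    # Keep only columns where at least one sequence differs from the reference.
    variable_sites = [i for i in range(len(ref)) if any(s[i] != ref[i] for s in seqs)]
    return [[0 if s[i] == ref[i] else 1 for i in variable_sites] for s in seqs]


alignment = ["ACGT", "ACGA", "TCGT"]
image = alignment_to_matrix(alignment)
```

The resulting matrix can be passed to a CNN like a single-channel image, so the network sees every site and sequence at once rather than a hand-picked summary statistic.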


High Sensitivity and Specificity of Chromosomal Pertubations in Human Invasive Breast Cancer (BrCa) Associated with Circulating Nucleic Acids (CNA) Using Concatemers of Short Sequence DNA Tags in Next Generation Sequencing (NGS)

The FASEB Journal ◽

10.1096/fasebj.25.1_supplement.122.2 ◽

2011 ◽

Vol 25 (S1) ◽

Author(s):

William Marvin Mitchell ◽

Julia Beck ◽

Howard B Urnovitz ◽

Ekkehard Schuetz

Keyword(s):

Breast Cancer ◽

Next Generation Sequencing ◽

Nucleic Acids ◽

Sensitivity And Specificity ◽

Invasive Breast Cancer ◽

High Sensitivity ◽

Short Sequence ◽

Circulating Nucleic Acids ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing


Behavioral Malware Detection Using Deep Graph Convolutional Neural Networks

10.36227/techrxiv.10043099.v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Angelo Schranko de Oliveira ◽

Renato José Sassi

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Short Term Memory ◽

Malware Detection ◽

Detection Methods ◽

New Public ◽

Classification Tasks ◽

Long Short Term Memory ◽

Similar Area ◽

Source Of Information

Malware behavioral graphs provide a rich source of information that can be leveraged for detection and classification tasks. In this paper, we propose a novel behavioral malware detection method based on Deep Graph Convolutional Neural Networks (DGCNNs) to learn directly from API call sequences and their associated behavioral graphs. In order to train and evaluate the models, we created a new public domain dataset of more than 40,000 API call sequences resulting from the execution of malware and goodware instances in a sandboxed environment. Experimental results show that our models achieve similar Area Under the ROC Curve (AUC-ROC) and F1-Score to Long Short-Term Memory (LSTM) networks, widely used as the base architecture for behavioral malware detection methods, thus indicating that the models can effectively learn to distinguish between malicious and benign temporal patterns through convolution operations on graphs. To the best of our knowledge, this is the first paper that investigates the applicability of DGCNNs to behavioral malware detection using API call sequences.
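To make the link between an API call sequence and a behavioral graph concrete, the sketch below builds a directed graph whose weighted edges count how often one call immediately follows another. This construction, the function name, and the example call names are assumptions for illustration; the abstract does not describe how the authors derive their graphs.

```python
from collections import Counter
from typing import Dict, List, Tuple


def call_transition_graph(calls: List[str]) -> Dict[Tuple[str, str], int]:
    """Count directed edges between consecutive API calls in a trace."""
    return dict(Counter(zip(calls, calls[1:])))


# Hypothetical trace from a sandboxed run.
trace = ["CreateFile", "WriteFile", "WriteFile", "CloseHandle"]
graph = call_transition_graph(trace)
nodes = sorted({name for edge in graph for name in edge})
```

A graph convolutional network would then operate on the node set and the weighted adjacency defined by these edges, instead of reading the trace strictly left to right as an LSTM does.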


Ensemble of Deep Convolutional Neural Networks for Automatic Pavement Crack Detection and Measurement

Coatings ◽

10.3390/coatings10020152 ◽

2020 ◽

Vol 10 (2) ◽

pp. 152 ◽

Cited By ~ 6

Author(s):

Zhun Fan ◽

Chong Li ◽

Ying Chen ◽

Paola Di Mascio ◽

Xiaopeng Chen ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Crack Detection ◽

The Other ◽

Detection Methods ◽

Deep Convolutional Neural Networks ◽

Predicted Probability ◽

Low Efficiency ◽

The Individual ◽

Pavement Crack Detection

Automated pavement crack detection and measurement are important road issues. Agencies have to guarantee the improvement of road safety. Conventional crack detection and measurement algorithms can be extremely time-consuming and inefficient. Therefore, recently, innovative algorithms have received increased attention from researchers. In this paper, we propose an ensemble of convolutional neural networks (without a pooling layer) based on probability fusion for automated pavement crack detection and measurement. First, an ensemble of convolutional neural networks was employed to identify the structure of small cracks from raw images. Second, the outputs of the individual convolutional neural network models in the ensemble were averaged to produce the final crack probability value of each pixel, yielding a predicted probability map. Finally, the predicted morphological features of the cracks were measured by using the skeleton extraction algorithm. To validate the proposed method, some experiments were performed on two public crack databases (CFD and AigleRN) and the results of the different state-of-the-art methods were compared. To evaluate the efficiency of crack detection methods, three parameters were considered: precision (Pr), recall (Re) and F1 score (F1). For the two public databases of pavement images, the proposed method obtained the highest values of the three evaluation parameters: for the CFD database, Pr = 0.9552, Re = 0.9521 and F1 = 0.9533 (values up to 0.5175 higher than those obtained on the same database with the other methods); for the AigleRN database, Pr = 0.9302, Re = 0.9166 and F1 = 0.9238 (values up to 0.7313 higher than those obtained on the same database with the other methods). The experimental results show that the proposed method outperforms the other methods. For crack measurement, the crack length and width can be measured for different crack types (complex, common, thin, and intersecting cracks). The results show that the proposed algorithm can be effectively applied for crack measurement.
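The probability-fusion step and the three evaluation parameters can be sketched as follows, assuming each model outputs a per-pixel crack probability map as a nested list and that pixels are labelled as crack when the averaged probability exceeds 0.5. The threshold, the function names, and the toy maps are assumptions for illustration, not values from the paper.

```python
from typing import List, Tuple

Grid = List[List[float]]


def fuse_probabilities(prob_maps: List[Grid]) -> Grid:
    """Average the per-pixel probabilities predicted by each ensemble member."""
    n = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    return [[sum(m[r][c] for m in prob_maps) / n for c in range(cols)] for r in range(rows)]


def precision_recall_f1(pred: List[List[int]], truth: List[List[int]]) -> Tuple[float, float, float]:
    """Pixel-wise precision (Pr), recall (Re) and F1 for binary masks."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


model_a = [[0.9, 0.2], [0.4, 0.8]]
model_b = [[0.7, 0.4], [0.8, 0.6]]
fused = fuse_probabilities([model_a, model_b])
mask = [[1 if p > 0.5 else 0 for p in row] for row in fused]
pr, re, f1 = precision_recall_f1(mask, [[1, 0], [0, 1]])
```

Averaging lets one confident model outvote another on a pixel, as in the lower-left pixel here, which is labelled as crack even though one model scored it below 0.5.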


Pharmacogenomics in Children: Advantages and Challenges of Next Generation Sequencing Applications

International Journal of Pediatrics ◽

10.1155/2013/136524 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 4

Author(s):

O. M. Vanakker ◽

A. De Paepe

Keyword(s):

Next Generation Sequencing ◽

Sequence Data ◽

Next Generation ◽

Clinical Implementation ◽

Ethical Aspects ◽

Number Of Genes ◽

Working Groups ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing ◽

Single Reaction

Pharmacogenetics is considered a prime example of how personalized medicine can be put into practice today. However, genotyping to guide pharmacological treatment is relatively uncommon in routine clinical practice. Several reasons explain why pharmacogenetics is applied less than initially anticipated, including the contradictory results obtained for certain variants and the lack of guidelines for clinical implementation. However, more reproducible results are being generated, and efforts have been made to establish working groups focussing on evidence-based clinical guidelines. For another pharmacogenetic hurdle, the speed at which a pharmacogenetic profile for a certain drug can be obtained in an individual patient, there has been a revolution in molecular genetics through the introduction of next generation sequencing (NGS), making it possible to sequence a large number of genes, up to the complete genome, in a single reaction. Despite the enthusiasm over the tremendous increase in our sequencing capacities, several considerations need to be made regarding the quality and interpretation of the sequence data as well as ethical aspects of this technology. This paper will focus on the different NGS applications that may be useful for pharmacogenomics in children and the challenges that they bring.


Deep Convolutional Neural Networks-Based Automatic Breast Segmentation and Mass Detection in DCE-MRI

Computational and Mathematical Methods in Medicine ◽

10.1155/2020/2413706 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Han Jiao ◽

Xinhua Jiang ◽

Zhiyong Pang ◽

Xiaofeng Lin ◽

Yihua Huang ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Convolutional Neural Networks ◽

False Positives ◽

Mass Detection ◽

Deep Convolutional Neural Networks ◽

Region Segmentation ◽

Dce Mri ◽

Breast Segmentation

Breast segmentation and mass detection in medical images are important for diagnosis and treatment follow-up. Automation of these challenging tasks can assist radiologists by reducing the high manual workload of breast cancer analysis. In this paper, deep convolutional neural networks (DCNN) were employed for breast segmentation and mass detection in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). First, the region of the breasts was segmented from the remaining body parts by building a fully convolutional neural network based on U-Net++. Using deep learning to extract the target area can help to reduce interference from outside the breast. Second, a faster region-based convolutional neural network (Faster R-CNN) was used for mass detection on segmented breast images. The DCE-MRI dataset used in this study was obtained from 75 patients, and a 5-fold cross validation method was adopted. The statistical analysis of breast region segmentation was carried out by computing the Dice similarity coefficient (DSC), Jaccard coefficient, and segmentation sensitivity. For validation of breast mass detection, the sensitivity with the number of false positives per case was computed and analyzed. The Dice and Jaccard coefficients and the segmentation sensitivity value for breast region segmentation were 0.951, 0.908, and 0.948, respectively, which were better than those of the original U-Net algorithm, and the average sensitivity for mass detection reached 0.874 with 3.4 false positives per case.
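The three segmentation metrics named above follow directly from the overlap between a predicted mask and a reference mask. The sketch below assumes both masks are flattened to equal-length lists of 0/1 values; the function name and the toy masks are illustrative assumptions.

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Dice, Jaccard and sensitivity for two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    n_pred = sum(pred)
    n_truth = sum(truth)
    union = n_pred + n_truth - inter
    return {
        "dice": 2 * inter / (n_pred + n_truth) if n_pred + n_truth else 1.0,
        "jaccard": inter / union if union else 1.0,
        "sensitivity": inter / n_truth if n_truth else 1.0,
    }


metrics = segmentation_metrics([1, 1, 1, 0], [1, 1, 0, 1])
# For a single pair of masks, Dice and Jaccard are tied by D = 2J / (1 + J).
dice_from_jaccard = 2 * metrics["jaccard"] / (1 + metrics["jaccard"])
```

Because Dice is a monotone function of Jaccard for any one case, the two scores rank segmentations identically; reported averages over many cases need not satisfy the identity exactly.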
