An Overview of the Application of Deep Learning in Short Read Sequence Classification

Mapping Intimacies ◽

10.1101/2020.09.19.304782 ◽

2020 ◽

Author(s):

Kristaps Bebris ◽

Inese Polaka

Keyword(s):

Deep Learning ◽

Sequence Classification ◽

Sequencing Data ◽

Sequencing Technology ◽

Short Read ◽

Short Read Sequencing ◽

Current State ◽

Learning Techniques ◽

Short Read Sequence

AbstractAdvances in sequencing technology have led to an ever increasing amount of available short read sequencing data. This has, consequently, exacerbated the need for efficient and precise classification tools that can be used in the analysis of this data. As it stands, recent years have shown that massive leaps in performance can be achieved when it comes to approaches that are based in heuristics, and alongside these improvements there has been an ever increasing interest in applying deep learning techniques to revolutionize this classification task. We attempt to gather up these approaches and to evaluate their performance in a reproducible fashion to get a better perspective on the current state of deep learning based methods when it comes to the classification of short read sequencing data.

Download Full-text

An Overview of the Application of Deep Learning in Short-Read Sequence Classification

Information Technology and Management Science ◽

10.7250/itms-2020-0005 ◽

2020 ◽

Vol 23 ◽

pp. 35-40

Author(s):

Kristaps Bebris ◽

Inese Polaka

Keyword(s):

Deep Learning ◽

Sequence Classification ◽

Sequencing Data ◽

Sequencing Technology ◽

Short Read ◽

Short Read Sequencing ◽

Current State ◽

Learning Techniques ◽

Short Read Sequence

Advances in sequencing technology have led to an ever increasing amount of available short-read sequencing data. This has, consequently, exacerbated the need for efficient and precise classification tools that can be used in the analysis of these data. As it stands, recent years have shown that massive leaps in performance can be achieved when it comes to approaches that are based on heuristics, and apart from these improvements there has been an ever increasing interest in applying deep learning techniques to revolutionize this classification task. We attempt to study these approaches and to evaluate their performance in a reproducible fashion to get a better perspective on the current state of deep learning based methods when it comes to the classification of short-read sequencing data

Download Full-text

Rapid Mycobacterium tuberculosis spoligotyping from uncorrected long reads using Galru

10.1101/2020.05.31.126490 ◽

2020 ◽

Author(s):

Andrew J. Page ◽

Nabil-Fareed Alikhan ◽

Michael Strinden ◽

Thanh Le Viet ◽

Timofey Skvortsov

Keyword(s):

Mycobacterium Tuberculosis ◽

State Of The Art ◽

Sequence Data ◽

Human Pathogen ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing ◽

Long Reads ◽

Long Read

AbstractSpoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short read genome sequencing data; however, no methods exist for long read sequence data such as from Nanopore or PacBio. We present a novel software package Galru, which can rapidly detect the spoligotype of a Mycobacterium tuberculosis sample from as little as a single uncorrected long read. It allows for near real-time spoligotyping from long read data as it is being sequenced, giving rapid sample typing. We compare it to the existing state of the art software and find it performs identically to the results obtained from short read sequencing data. Galru is freely available from https://github.com/quadram-institute-bioscience/galru under the GPLv3 open source licence.

Download Full-text

Deep learning techniques for classification of electroencephalogram (EEG) motor imagery (MI) signals: a review

Neural Computing and Applications ◽

10.1007/s00521-021-06352-5 ◽

2021 ◽

Author(s):

Hamdi Altaheri ◽

Ghulam Muhammad ◽

Mansour Alsulaiman ◽

Syed Umar Amin ◽

Ghadir Ali Altuwaijri ◽

...

Keyword(s):

Deep Learning ◽

Motor Imagery ◽

Learning Techniques ◽

Electroencephalogram Eeg

Download Full-text

REscan: inferring repeat expansions and structural variation in paired-end short read sequencing data

Bioinformatics ◽

10.1093/bioinformatics/btaa753 ◽

2020 ◽

Author(s):

Russell Lewis McLaughlin

Keyword(s):

Structural Variation ◽

Sequence Data ◽

Neurological Diseases ◽

Repeat Expansion ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing ◽

Repeat Expansions ◽

Paired End Sequencing

Abstract Motivation Repeat expansions are an important class of genetic variation in neurological diseases. However, the identification of novel repeat expansions using conventional sequencing methods is a challenge due to their typical lengths relative to short sequence reads and difficulty in producing accurate and unique alignments for repetitive sequence. However, this latter property can be harnessed in paired-end sequencing data to infer the possible locations of repeat expansions and other structural variation. Results This article presents REscan, a command-line utility that infers repeat expansion loci from paired-end short read sequencing data by reporting the proportion of reads orientated towards a locus that do not have an adequately mapped mate. A high REscan statistic relative to a population of data suggests a repeat expansion locus for experimental follow-up. This approach is validated using genome sequence data for 259 cases of amyotrophic lateral sclerosis, of which 24 are positive for a large repeat expansion in C9orf72, showing that REscan statistics readily discriminate repeat expansion carriers from non-carriers. Availabilityand implementation C source code at https://github.com/rlmcl/rescan (GNU General Public Licence v3).

Download Full-text

Classification of Dermoscopy Skin Images with the Application of Deep Learning Techniques

Proceedings of the 5th Brazilian Technology Symposium - Smart Innovation, Systems and Technologies ◽

10.1007/978-3-030-57566-3_7 ◽

2020 ◽

pp. 73-81

Author(s):

Pablo David Minango Negrete ◽

Yuzo Iano ◽

Ana Carolina Borges Monteiro ◽

Reinaldo Padilha França ◽

Gabriel Gomes de Oliveira ◽

...

Keyword(s):

Deep Learning ◽

Learning Techniques

Download Full-text

High resolution copy number inference in cancer using short-molecule nanopore sequencing

10.1101/2020.12.28.424602 ◽

2020 ◽

Author(s):

Timour Baslan ◽

Sam Kovaka ◽

Fritz J. Sedlazeck ◽

Yanming Zhang ◽

Robert Wappel ◽

...

Keyword(s):

Copy Number ◽

Cost Effective ◽

Chromosome Analysis ◽

Ease Of Use ◽

Precision Oncology ◽

Nanopore Sequencing ◽

Dna Molecules ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing

ABSTRACTGenome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in terms of the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms. This represents a challenge for applications that require high read counts such as CNA inference. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that sequencing short DNA molecules reproducibly returns high read counts and allows high quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data shows that, compared to traditional approaches such as chromosome analysis/cytogenetics, short molecule nanopore sequencing returns more sensitive, accurate copy number information in a cost effective and expeditious manner, including for multiplex samples. Our results provide a framework for the sequencing of relatively short DNA molecules on nanopore devices with applications in research and medicine, that include but are not limited to, CNAs.

Download Full-text

Machine Learning and Deep Learning Techniques, Features and Obstacles in the Cataract Diagnosis

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4283.099320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 87-93

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Literature Review ◽

Congenital Cataract ◽

Low Cost ◽

Current Situation ◽

Learning Techniques ◽

Research Questions ◽

New Research

Cataract is a degenerative condition that, according to estimations, will rise globally. Even though there are various proposals about its diagnosis, there are remaining problems to be solved. This paper aims to identify the current situation of the recent investigations on cataract diagnosis using a framework to conduct the literature review with the intention of answering the following research questions: RQ1) Which are the existing methods for cataract diagnosis? RQ2) Which are the features considered for the diagnosis of cataracts? RQ3) Which is the existing classification when diagnosing cataracts? RQ4) And Which obstacles arise when diagnosing cataracts? Additionally, a cross-analysis of the results was made. The results showed that new research is required in: (1) the classification of “congenital cataract” and, (2) portable solutions, which are necessary to make cataract diagnoses easily and at a low cost.

Download Full-text

Stock Pattern Classification from Charts using Deep Learning Algorithms

Academic Perspective Procedia ◽

10.33793/acperpro.03.01.89 ◽

2020 ◽

Vol 3 (1) ◽

pp. 445-454

Author(s):

Celal Buğra Kaya ◽

Alperen Yılmaz ◽

Gizem Nur Uzun ◽

Zeynep Hilal Kilimci

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Pattern Classification ◽

Short Term Memory ◽

Stock Exchange ◽

Learning Techniques ◽

Proposed Model ◽

Istanbul Stock Exchange

Pattern classification is related with the automatic finding of regularities in dataset through the utilization of various learning techniques. Thus, the classification of the objects into a set of categories or classes is provided. This study is undertaken to evaluate deep learning methodologies to the classification of stock patterns. In order to classify patterns that are obtained from stock charts, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long-short term memory networks (LSTMs) are employed. To demonstrate the efficiency of proposed model in categorizing patterns, hand-crafted image dataset is constructed from stock charts in Istanbul Stock Exchange and NASDAQ Stock Exchange. Experimental results show that the usage of convolutional neural networks exhibits superior classification success in recognizing patterns compared to the other deep learning methodologies.

Download Full-text

A Transposon Story: From TE Content to TE Dynamic Invasion of Drosophila Genomes Using the Single-Molecule Sequencing Technology from Oxford Nanopore

Cells ◽

10.3390/cells9081776 ◽

2020 ◽

Vol 9 (8) ◽

pp. 1776

Author(s):

Mourdas Mohamed ◽

Nguyet Thi-Minh Dang ◽

Yuki Ogyama ◽

Nelly Burlet ◽

Bruno Mugat ◽

...

Keyword(s):

Single Molecule ◽

Wild Type ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

In The Wild ◽

Successive Generations ◽

Type Strains

Transposable elements (TEs) are the main components of genomes. However, due to their repetitive nature, they are very difficult to study using data obtained with short-read sequencing technologies. Here, we describe an efficient pipeline to accurately recover TE insertion (TEI) sites and sequences from long reads obtained by Oxford Nanopore Technology (ONT) sequencing. With this pipeline, we could precisely describe the landscapes of the most recent TEIs in wild-type strains of Drosophila melanogaster and Drosophila simulans. Their comparison suggests that this subset of TE sequences is more similar than previously thought in these two species. The chromosome assemblies obtained using this pipeline also allowed recovering piRNA cluster sequences, which was impossible using short-read sequencing. Finally, we used our pipeline to analyze ONT sequencing data from a D. melanogaster unstable line in which LTR transposition was derepressed for 73 successive generations. We could rely on single reads to identify new insertions with intact target site duplications. Moreover, the detailed analysis of TEIs in the wild-type strains and the unstable line did not support the trap model claiming that piRNA clusters are hotspots of TE insertions.

Download Full-text

An Empirical Study on the Classification of Chinese News Articles by Machine Learning and Deep Learning Techniques

2019 International Conference on Machine Learning and Cybernetics (ICMLC) ◽

10.1109/icmlc48188.2019.8949309 ◽

2019 ◽

Author(s):

Chuen-Min Huang ◽

Yi-Jun Jiang

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Empirical Study ◽

Learning Techniques

Download Full-text