Analysis of prospective microbiology research using third-generation sequencing technology

Abstract

Motivation: Third-generation sequencing technologies can sequence long reads that contain as many as 2 million base pairs. These long reads are used to construct an assembly (i.e. the subject’s genome), which is further used in downstream genome analysis. Unfortunately, third-generation sequencing technologies have high sequencing error rates, and a large proportion of base pairs in these long reads is incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly using information from alignments between reads and the assembly (i.e. read-to-assembly alignment information). However, current assembly polishing algorithms can polish an assembly only with reads from a specific sequencing technology, or can polish only a small assembly. Because of this technology dependency and assembly-size dependency, researchers must (i) run multiple polishing algorithms to use all available readsets and (ii) split a large genome into small chunks to polish it.

Results: We introduce Apollo, a universal assembly polishing algorithm that scales well to polish an assembly of any size (i.e. both large and small genomes) using reads from all sequencing technologies (i.e. second- and third-generation). Our goal is to provide a single algorithm that uses read sets from all available sequencing technologies to improve the accuracy of assembly polishing and that can polish large genomes. Apollo (i) models an assembly as a profile hidden Markov model (pHMM), (ii) uses read-to-assembly alignment to train the pHMM with the Forward–Backward algorithm and (iii) decodes the trained model with the Viterbi algorithm to produce a polished assembly. Our experiments with real readsets demonstrate that Apollo is the only algorithm that (i) uses reads from any sequencing technology within a single run and (ii) scales well to polish large assemblies without splitting the assembly into multiple parts.

Availability and implementation: Source code is available at https://github.com/CMU-SAFARI/Apollo.

Supplementary information: Supplementary data are available at Bioinformatics online.
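The decoding step named in the abstract (Viterbi on a trained hidden Markov model) can be illustrated with a minimal sketch. This is not Apollo's implementation: the two states, the observation symbols and every probability below are hypothetical toy values, and a real profile HMM has match, insertion and deletion states for each assembly position. The sketch only shows how Viterbi recovers the single most likely state path once the model parameters are fixed.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path (log-space Viterbi)."""

    def log(p):
        # Guard against log(0) for impossible events.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path score into s.
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: is each position a correct call or an error,
# given a coarse per-base quality symbol? All numbers are made up.
STATES = ("match", "error")
START = {"match": 0.6, "error": 0.4}
TRANS = {
    "match": {"match": 0.7, "error": 0.3},
    "error": {"match": 0.4, "error": 0.6},
}
EMIT = {
    "match": {"high": 0.5, "mid": 0.4, "low": 0.1},
    "error": {"high": 0.1, "mid": 0.3, "low": 0.6},
}

if __name__ == "__main__":
    print(viterbi(("high", "mid", "low"), STATES, START, TRANS, EMIT))
```

Working in log space avoids numerical underflow on long sequences, which matters once the observation sequence is as long as a sequencing read.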

The study of transcriptomes of symbiotic tissue of pea using the third-generation sequencing technology Oxford Nanopore

Abstract book of the 2nd International Scientific Conference "Plants and Microbes: the Future of Biotechnology" PLAMIC2020 ◽

10.28983/plamic2020.093 ◽

2020 ◽

Author(s):

E. S. Gribchenko

Keyword(s):

Nanopore Sequencing ◽

Third Generation ◽

Nitrogen Fixing ◽

Sequencing Technology ◽

The Third ◽

Third Generation Sequencing ◽

Oxford Nanopore ◽

Mycorrhizal Roots ◽

Gene Isoforms ◽

Generation Sequencing

The transcriptome profiles of pea cv. Frisson mycorrhizal roots and inoculated nitrogen-fixing nodules were investigated using Oxford Nanopore sequencing technology. A database of gene isoforms and their expression has been created.

Recovery of human gut microbiota genomes with third-generation sequencing

Cell Death and Disease ◽

10.1038/s41419-021-03829-y ◽

2021 ◽

Vol 12 (6) ◽

Author(s):

Yanfei Li ◽

Yueling Jin ◽

Jianming Zhang ◽

Haoying Pan ◽

Lan Wu ◽

...

Keyword(s):

Gut Microbiota ◽

Third Generation ◽

Sequencing Technology ◽

Bacterial Genomes ◽

Human Gut ◽

Human Gut Microbiota ◽

Third Generation Sequencing ◽

Large Numbers ◽

Generation Sequencing

Abstract: Human gut microbiota modulates normal physiological functions, such as maintenance of barrier homeostasis and modulation of metabolism, as well as various chronic diseases including type 2 diabetes and gastrointestinal cancer. Despite decades of research, the composition of the gut microbiota remains poorly understood. Here, we established an effective extraction method to obtain high-quality gut microbiota genomes and analyzed them with third-generation sequencing technology. We acquired a large quantity of data from each sample and assembled large numbers of reliable contigs. With this approach, we constructed tens of complete bacterial genomes, including several new bacterial species. We also identified a new conditional pathogen, Enterococcus tongjius, which is a member of Enterococci. This work provided a novel and reliable approach to recover gut microbiota genomes, facilitating the discovery of new bacterial species and furthering our understanding of the microbiome that underlies human health and diseases.

Recovery of Human Gut Microbiota Genomes Substantially With Third-generation Sequencing

10.21203/rs.3.rs-87441/v1 ◽

2020 ◽

Author(s):

Yanfei Li ◽

Yueling Jin ◽

Haoying Pan ◽

Jianming Zhang ◽

Lan Wu ◽

...

Keyword(s):

Gut Microbiota ◽

Genomic Dna ◽

Extraction Method ◽

Third Generation ◽

Sequencing Technology ◽

Human Gut ◽

Third Generation Sequencing ◽

Health And Disease ◽

Generation Sequencing

Abstract Background: Human gut microbiota modulates normal physiological functions, such as the maintenance of barrier homeostasis and the modulation of metabolism, and various chronic diseases including type 2 diabetes and gastrointestinal cancer. Despite decades of research, the composition of the gut microbiota remains unexplored and unidentified. Results: Here we established an effective extraction method to obtain high-quality gut microbiota genomic DNA and analyzed the samples with third-generation sequencing technology. We acquired a large amount of data from each sample and assembled many reliable contigs. We identified not only a large number of unknown genes but also several new bacterial subspecies or species. Conclusions: This work provides a novel and reliable framework to substantially recover gut microbiota genomes, facilitating the understanding of the roles of the microbiome in human health and disease.

BUILDING CATALOGUE OF LIFE: ULTRAHIGH THROUGHPUT DNA BARCODING USING THIRD GENERATION SEQUENCING

MOLECULAR PHYLOGENETICS ◽

10.30826/molphy2018-05 ◽

2018 ◽

Author(s):

P.D.N. HEBERT ◽

T.W.A. BRAUKMANN ◽

S.W.J. PROSSER ◽

S. RATNASINGHAM ◽

...

Keyword(s):

Dna Barcoding ◽

Third Generation ◽

Third Generation Sequencing ◽

Generation Sequencing

Faculty Opinions recommendation of Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.12538961.13766060 ◽

2011 ◽

Author(s):

Martin Maiden

Keyword(s):

Escherichia Coli ◽

Next Generation Sequencing ◽

Enterohemorrhagic Escherichia Coli ◽

Next Generation ◽

Sequencing Technology ◽

Next Generation Sequencing Technology ◽

Generation Sequencing Technology ◽

Genomic Characterization ◽

Generation Sequencing

Diagnostic Value of Next-Generation Sequencing Technology in Viral Infections of the Central Nervous System

SSRN Electronic Journal ◽

10.2139/ssrn.3218718 ◽

2018 ◽

Author(s):

Xiao-Wei Xing ◽

Jia-Tang Zhang ◽

Yu-Bao Ma ◽

Xiao-Yan Chen ◽

Lei-Wu Lei-Wu ◽

...

Keyword(s):

Central Nervous System ◽

Nervous System ◽

Next Generation Sequencing ◽

Viral Infections ◽

Diagnostic Value ◽

Sequencing Technology ◽

Next Generation Sequencing Technology ◽

Generation Sequencing Technology ◽

The Central Nervous System ◽

Generation Sequencing

IsoDetect: Detection of splice isoforms from third generation long reads based on short feature sequences

Current Bioinformatics ◽

10.2174/1574893615666200316101205 ◽

2020 ◽

Vol 15 ◽

Author(s):

Hongdong Li ◽

Wenjing Zhang ◽

Yuwen Luo ◽

Jianxin Wang

Keyword(s):

Sequence Similarity ◽

Detection Methods ◽

Sequence Information ◽

Third Generation ◽

Sequencing Data ◽

Splice Isoforms ◽

Third Generation Sequencing ◽

Long Reads ◽

Feature Sequence ◽

Generation Sequencing

Aims: Accurately detect isoforms from third generation sequencing data. Background: Transcriptome annotation is the basis for the analysis of gene expression and regulation. The transcriptome annotation of many organisms such as humans is far from complete, due partly to the challenge of identifying isoforms that are produced from the same gene through alternative splicing. Third generation sequencing (TGS) reads provide an unprecedented opportunity for detecting isoforms because their length exceeds that of most isoforms. One limitation of current TGS read-based isoform detection methods is that they are based exclusively on sequence reads, without incorporating the sequence information of known isoforms. Objective: Develop an efficient method for isoform detection. Method: Based on annotated isoforms, we propose a splice isoform detection method called IsoDetect. First, the sequence at each exon-exon junction is extracted from annotated isoforms as the “short feature sequence”, which is used to distinguish different splice isoforms. Second, we align these feature sequences to long reads and divide the long reads into groups that contain the same set of feature sequences, thereby avoiding pair-wise comparison among the large number of long reads. Third, clustering and consensus generation are carried out based on sequence similarity. For the long reads that do not contain any short feature sequence, clustering analysis based on sequence similarity is performed to identify isoforms. Result: Tested on two datasets from Calypte anna and zebra finch, IsoDetect showed higher speed and compelling accuracy compared with four existing methods. Conclusion: IsoDetect is a promising method for isoform detection. Other: This paper was accepted by the CBC2019 conference.
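The first two steps of the method described above (building short feature sequences at exon-exon junctions, then grouping long reads by the set of features they contain) can be sketched as follows. This is a hedged illustration rather than IsoDetect's code: the function names, the `flank` parameter and the toy sequences are hypothetical, and exact substring matching stands in for the alignment step that a real tool would need in order to tolerate sequencing errors.

```python
from collections import defaultdict


def junction_features(exons, flank=4):
    """Build one short feature per exon-exon junction.

    Each feature joins the last `flank` bases of an exon to the first
    `flank` bases of the next exon (`flank` is a hypothetical parameter).
    """
    return [exons[i][-flank:] + exons[i + 1][:flank] for i in range(len(exons) - 1)]


def group_reads_by_features(reads, features):
    """Group read IDs by the set of feature names found in each read.

    Exact substring search is a simplification; it is an assumption of
    this sketch, not the matching strategy of the published method.
    """
    groups = defaultdict(list)
    for read_id, seq in reads.items():
        present = frozenset(name for name, feat in features.items() if feat in seq)
        groups[present].append(read_id)
    return dict(groups)


if __name__ == "__main__":
    e1, e2, e3 = "AAAACCCC", "GGGGTTTT", "ACGTACGT"  # toy exons
    j12, j23 = junction_features([e1, e2, e3])  # isoform with all three exons
    (j13,) = junction_features([e1, e3])        # isoform that skips exon 2
    features = {"j12": j12, "j23": j23, "j13": j13}

    reads = {
        "r1": e1 + e2 + e3,
        "r2": e1 + e3,
        "r3": "CCCCGGGGTTTTACGT",
        "r4": "TTTTTTTT",  # contains no junction feature
    }
    for feature_set, read_ids in group_reads_by_features(reads, features).items():
        print(sorted(feature_set), read_ids)
```

Reads that share a feature set land in the same group with a single pass over the reads, which is how grouping avoids all-against-all read comparison; reads with an empty feature set (such as `r4` here) would go on to similarity-based clustering.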
