Whole-Genome Sequencing and Potassium-Solubilizing Mechanism of Bacillus aryabhattai SK1-7

This study analyzed the whole genome of Bacillus aryabhattai strain SK1-7 and explored its potassium solubilization characteristics and mechanism, with the aim of providing a theoretical basis for the utilization and improvement of insoluble potassium resources in soil. Genome information for Bacillus aryabhattai SK1-7 was obtained using Illumina NovaSeq second-generation sequencing and GridION Nanopore (ONT) third-generation sequencing. The contents of organic acids and polysaccharides in the fermentation broth of Bacillus aryabhattai SK1-7 were determined by high-performance liquid chromatography and the anthrone-sulfuric acid method, respectively, and the expression levels of the potassium solubilization-related genes ackA, epsB, gltA, mdh and ppc were compared by real-time fluorescence quantitative PCR under culture conditions with different potassium sources. The whole genome of the strain consisted of one complete chromosome and four plasmids. The chromosome and plasmids P1, P2, P3 and P4 were 5,188,391 bp, 136,204 bp, 124,862 bp, 67,200 bp and 12,374 bp in size, with GC contents of 38.2, 34.4, 33.6, 32.8 and 33.7%, respectively. Strain SK1-7 mainly secreted malic, formic, acetic and citric acids when cultured with an insoluble potassium source. The polysaccharide content produced with an insoluble potassium source was higher than that with a soluble potassium source. The expression levels of all five potassium solubilization-related genes were higher with the insoluble potassium source than with the soluble potassium source.
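The abstract reports sizes and GC contents per replicon but no genome-wide totals. As a rough illustration only, the sketch below derives the total genome size and a size-weighted average GC content from the reported figures. The function name `genome_summary` and the choice of a size-weighted mean are assumptions made here, not something the authors describe.

```python
# Replicon sizes (bp) and GC contents (%) as reported in the abstract.
REPLICONS = {
    "chromosome": (5_188_391, 38.2),
    "P1": (136_204, 34.4),
    "P2": (124_862, 33.6),
    "P3": (67_200, 32.8),
    "P4": (12_374, 33.7),
}


def genome_summary(replicons):
    """Return (total size in bp, size-weighted mean GC content in %).

    Hypothetical helper: the weighting scheme is an assumption, chosen
    because each replicon contributes bases in proportion to its length.
    """
    total_bp = sum(size for size, _ in replicons.values())
    weighted_gc = sum(size * gc for size, gc in replicons.values()) / total_bp
    return total_bp, weighted_gc


total_bp, weighted_gc = genome_summary(REPLICONS)
print(f"total size: {total_bp:,} bp")
print(f"size-weighted GC content: {weighted_gc:.2f}%")
```

Because the chromosome accounts for most of the sequence, the weighted value stays close to the chromosomal 38.2%.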


TOGGLe, a flexible framework for easily building complex workflows and performing robust large-scale NGS analyses

10.1101/245480; 2018; Cited By ~ 4

Author(s): Christine Tranchant-Dubreuil, Sébastien Ravel, Cécile Monat, Gautier Sarah, Abdoulaye Diallo, ...

Keyword(s): Population Structure, High Performance, Large Scale, Genotyping By Sequencing, Whole Genome, Third Generation, Third Generation Sequencing, Sequencing Method, Flexible Framework, Generation Sequencing

The advent of NGS has intensified the need for robust pipelines to perform high-performance automated analyses. The required software depends on the sequencing method used to produce the raw data (e.g. whole-genome sequencing, genotyping by sequencing, RNA-Seq) as well as on the kind of analysis to carry out (GWAS, population structure, differential expression). These tools have to be generic and scalable, and should meet biologists' needs. Here, we present the new version of TOGGLe (Toolbox for Generic NGS Analyses), a simple and highly flexible framework to easily and quickly generate pipelines for large-scale second- and third-generation sequencing analyses, including multi-sample and multi-threading support. TOGGLe is a workflow manager designed to be as effortless as possible for biologists to use, so that the focus can remain on the analyses. Pipelines are easily customizable, and supported analyses are reproducible and shareable. TOGGLe is designed as a generic, adaptable and rapidly evolving solution, and has been tested and used in large-scale projects on various organisms. It is freely available at http://toggle.southgreen.fr/ under the GNU GPLv3/CeCill-C licenses and can be deployed on HPC clusters as well as on local machines.


The optimal standard protocols for whole-genome sequencing of antibiotic-resistant pathogenic bacteria using third-generation sequencing platforms

Molecular & Cellular Toxicology; 10.1007/s13273-021-00157-2; 2021

Author(s): Tae-Min La, Ji-hoon Kim, Taesoo Kim, Hong-Jae Lee, Yoonsuk Lee, ...

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Pathogenic Bacteria, Whole Genome, Third Generation, Antibiotic Resistant, Third Generation Sequencing, Sequencing Platforms, Generation Sequencing


SMOOTH-seq: single-cell genome sequencing of human cells on a third-generation sequencing platform

Genome Biology; 10.1186/s13059-021-02406-y; 2021; Vol 22 (1)

Author(s): Xiaoying Fan, Cheng Yang, Wen Li, Xiuzhen Bai, Xin Zhou, ...

Keyword(s): Single Cell, Genome Sequencing, Single Molecule, Human Cancer, Whole Genome, Third Generation, Sequencing Platform, Human Cancer Cell Lines, Third Generation Sequencing, Generation Sequencing

There is no effective way to detect structural variations (SVs) and extra-chromosomal circular DNAs (ecDNAs) at the single-cell whole-genome level. Here, we develop a novel single-cell whole-genome sequencing (scWGS) method based on a third-generation sequencing platform, named SMOOTH-seq (single-molecule real-time sequencing of long fragments amplified through transposon insertion). We evaluate the method for detecting CNVs, SVs, and SNVs in human cancer cell lines and a colorectal cancer sample and show that SMOOTH-seq reliably and effectively detects SVs and ecDNAs in individual cells, but shows relatively limited accuracy in the detection of CNVs and SNVs. SMOOTH-seq opens a new chapter in scWGS, as it generates high-fidelity reads that are kilobases long.


The Landscapes of Full-Length Transcripts and Splice Isoforms as Well as Transposons Exonization in the Lepidopteran Model System, Bombyx mori

Frontiers in Genetics; 10.3389/fgene.2021.704162; 2021; Vol 12

Author(s): Zongrui Dai, Jianyu Ren, Xiaoling Tong, Hai Hu, Kunpeng Lu, ...

Keyword(s): Bombyx Mori, Developmental Stages, Model System, Splice Isoforms, Protein Coding, Functional Studies, Third Generation Sequencing, Non Coding RNA, Second Generation Sequencing, Generation Sequencing

The domesticated silkworm, Bombyx mori, is an important model system for the order Lepidoptera. A chromosome-level genome of Bombyx mori based on third-generation sequencing has been released. However, its transcripts were assembled mainly from short second-generation sequencing reads and expressed sequence tags, which cannot describe the transcript profile accurately. Here, we used PacBio Iso-Seq technology to investigate the transcripts from 45 developmental stages of Bombyx mori. We obtained 25,970 non-redundant high-quality consensus isoforms capturing ∼60% of previously reported RNAs and 15,431 (∼47%) novel transcripts, and identified 7,253 long non-coding RNAs (lncRNAs), a large proportion of which (∼56%) are novel. In addition, we found that exonization of transposable elements (TEs) accounts for 11,671 (∼45%) transcripts, including 5,980 protein-coding transcripts (∼32%) and 5,691 lncRNAs (∼79%). Overall, our results expand the set of known silkworm transcripts and have general implications for understanding the interaction between TEs and their host genes. This transcript resource will promote functional studies of genes and lncRNAs as well as TEs in the silkworm.


TRiCoLOR: tandem repeat profiling using whole-genome long-read sequencing data

GigaScience; 10.1093/gigascience/giaa101; 2020; Vol 9 (10); Cited By ~ 1

Author(s): Davide Bolognini, Alberto Magi, Vladimir Benes, Jan O Korbel, Tobias Rausch

Keyword(s): Tandem Repeat, Error Rates, Sequencing Error, Whole Genome Sequencing Data, Whole Genome, Third Generation, Sequencing Data, Sequencing Technologies, Third Generation Sequencing, Generation Sequencing

Background: Tandem repeat sequences are widespread in the human genome, and their expansions cause multiple repeat-mediated disorders. Genome-wide discovery approaches are needed to fully elucidate their roles in health and disease, but resolving tandem repeat variation accurately remains a challenging task. While traditional mapping-based approaches using short-read data have severe limitations in the size and type of tandem repeats they can resolve, recent third-generation sequencing technologies exhibit substantially higher sequencing error rates, which complicates repeat resolution. Results: We developed TRiCoLOR, a freely available tool for tandem repeat profiling using error-prone long reads from third-generation sequencing technologies. The method can identify repetitive regions in sequencing data without prior knowledge of their motifs or locations and can resolve repeat multiplicity and period size in a haplotype-specific manner. The tool includes methods to interactively visualize the identified repeats and to trace their Mendelian consistency in pedigrees. Conclusions: TRiCoLOR demonstrates excellent performance and improved sensitivity and specificity compared with alternative tools on synthetic data. For real human whole-genome sequencing data, TRiCoLOR achieves high validation rates, suggesting its suitability for identifying tandem repeat variation in personal genomes.
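To make the terms "period size" and "repeat multiplicity" concrete, here is a minimal sketch that finds the longest exact tandem repeat in a sequence by brute force. It is not TRiCoLOR's algorithm, which the abstract does not describe. The function `find_tandem_repeat` and its parameters are hypothetical, and exact matching is a simplifying assumption that would not cope with error-prone long reads.

```python
def find_tandem_repeat(seq, max_period=6, min_copies=3):
    """Return (start, motif, copies) for the longest exact tandem repeat.

    Brute-force illustration: tries every period up to max_period and
    every start position, counting back-to-back exact copies of the motif.
    Returns None if no repeat reaches min_copies.
    """
    best = None
    for period in range(1, max_period + 1):
        for start in range(len(seq) - period + 1):
            motif = seq[start:start + period]
            copies = 1
            pos = start + period
            # Extend while the next window is an exact copy of the motif.
            while seq[pos:pos + period] == motif:
                copies += 1
                pos += period
            span = copies * period
            # Strict comparison keeps the smallest period on ties.
            if copies >= min_copies and (
                best is None or span > best[2] * len(best[1])
            ):
                best = (start, motif, copies)
    return best


# A CAG motif (period size 3) repeated 5 times (multiplicity 5).
example = "ACGT" + "CAG" * 5 + "TTGA"
print(find_tandem_repeat(example))
```

On ties in repeat span the smaller period is kept, so a homopolymer run is reported with period 1 rather than as a dinucleotide repeat.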


Genome assembly of Vitis rotundifolia Michx. using third-generation sequencing (Oxford Nanopore Technologies)

Proceedings on Applied Botany, Genetics and Breeding; 10.30901/2227-8834-2021-2-63-71; 2021; Vol 182 (2); pp. 63-71

Author(s): M. M. Agakhanov, E. A. Grigoreva, E. K. Potokina, P. S. Ulianich, Y. V. Ukhatova

Keyword(s): Genome Sequence, Genome Assembly, De Novo, Whole Genome Sequence, Whole Genome, Third Generation, Vitis Rotundifolia, Third Generation Sequencing, Genome Sequence Assembly, Generation Sequencing

The immune North American grapevine species Vitis rotundifolia Michaux (subgen. Muscadinia Planch.) is regarded as a potential donor of disease resistance genes, withstanding such dangerous diseases of grapes as powdery and downy mildews. The cultivar ‘Dixie’ is the only representative of this species preserved ex situ in Russia: it is maintained by the N.I. Vavilov All-Russian Institute of Plant Genetic Resources (VIR) in the orchards of its branch, Krymsk Experiment Breeding Station. Third-generation sequencing on the MinION platform was performed to obtain information on the primary structure of the cultivar’s genomic DNA, employing also the results of Illumina sequencing available in databases. A detailed description of the technique with modifications at various stages is presented, as it was used for grapevine genome sequencing and whole-genome sequence assembly. The modified technique included the main stages of the original protocol recommended by the MinION producer: 1) DNA extraction; 2) preparation of libraries for sequencing; 3) MinION sequencing and bioinformatic data processing; 4) de novo whole-genome sequence assembly using only MinION data or hybrid assembly (MinION+Illumina data); and 5) functional annotation of the whole-genome assembly. Stage 4 included not only de novo sequencing, but also the analysis of the available bioinformatic data, thus minimizing errors and increasing precision during the assembly of the studied genome. The DNA isolated from the leaves of cv. ‘Dixie’ was sequenced using two MinION flow cells (R9.4.1).


Linking De Novo Assembly Results with Long DNA Reads Using the dnaasm-link Application

BioMed Research International; 10.1155/2019/7847064; 2019; Vol 2019; pp. 1-10

Author(s): Wiktor Kuśmirek, Wiktor Franus, Robert Nowak

Keyword(s): DNA Sequences, De Novo, Computation Time, Third Generation, Next Generation, Sequencing Data, Third Generation Sequencing, Combining Data, Second Generation Sequencing, Generation Sequencing

Third-generation sequencing techniques, which make it possible to obtain much longer DNA reads than next-generation sequencing technologies, are becoming increasingly popular. There are many possibilities for combining data from next-generation and third-generation sequencing. Herein, we present a new application called dnaasm-link for linking contigs, the result of de novo assembly of second-generation sequencing data, with long DNA reads. Our tool includes an integrated module to fill gaps with a suitable fragment of an appropriate long DNA read, which improves the consistency of the resulting DNA sequences. This feature is particularly important for complex DNA regions. Our implementation is found to outperform other state-of-the-art tools in terms of speed and memory requirements, which may enable its use for organisms with large genomes, something that is not possible with existing applications. The presented application has many advantages: (i) it significantly optimizes memory and reduces computation time; (ii) it fills gaps with an appropriate fragment of a specified long DNA read; (iii) it reduces the number of spanned and unspanned gaps in existing genome drafts. The application is freely available to all users under the GNU Library or Lesser General Public License version 3.0 (LGPLv3). The demo application, Docker image, and source code can be downloaded from the project homepage.


Evaluating approaches to find exon chains based on long reads

10.1101/066241; 2016

Author(s): Anna Kuosmanen, Veli Mäkinen

Keyword(s): Second Generation, Simulated Data, Error Rates, Third Generation, Sequencing Technologies, Third Generation Sequencing, Long Reads, Long Read, Second Generation Sequencing, Generation Sequencing

Motivation: Transcript prediction can be modelled as a graph problem where exons are modelled as nodes and reads spanning two or more exons are modelled as exon chains. PacBio third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. Results: We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity/precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy. Availability: The simulated data and in-house scripts used for this article are available at http://cs.helsinki.fi/u/aekuosma/exon_chain_evaluation_publish.tar.gz.
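The graph model described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not any of the approaches the authors evaluate. Exons are half-open `(start, end)` intervals, each read is given as a list of aligned blocks, a block is assigned to the exon it overlaps most, and the helper names `exon_chain` and `build_splicing_graph` are hypothetical.

```python
from collections import defaultdict


def exon_chain(read_blocks, exons, min_overlap=1):
    """Map a read's aligned blocks to a chain of exon indices.

    Each block is assigned to the exon with the largest overlap;
    blocks that overlap no exon by at least min_overlap are skipped.
    """
    chain = []
    for b_start, b_end in read_blocks:
        best, best_overlap = None, 0
        for idx, (e_start, e_end) in enumerate(exons):
            overlap = min(b_end, e_end) - max(b_start, e_start)
            if overlap >= min_overlap and overlap > best_overlap:
                best, best_overlap = idx, overlap
        # Avoid repeating an exon when two blocks land on the same one.
        if best is not None and (not chain or chain[-1] != best):
            chain.append(best)
    return chain


def build_splicing_graph(chains):
    """Count how many reads support each arc between consecutive exons."""
    arcs = defaultdict(int)
    for chain in chains:
        for u, v in zip(chain, chain[1:]):
            arcs[(u, v)] += 1
    return dict(arcs)


exons = [(100, 200), (300, 400), (500, 600)]  # nodes of the graph
reads = [
    [(120, 200), (300, 400), (500, 580)],  # spans all three exons
    [(150, 200), (500, 600)],              # skips the middle exon
]
chains = [exon_chain(blocks, exons) for blocks in reads]
graph = build_splicing_graph(chains)
print(chains)
print(graph)
```

A misaligned block near a splice site would be assigned to the wrong exon or to none, which is how alignment errors turn into spurious or missing arcs in this model.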


Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

G3: Genes|Genomes|Genetics; 10.1534/g3.114.014571; 2015; Vol 5 (4); pp. 559-567; Cited By ~ 7

Author(s): Nur Fatihah Mohd-Yusoff, Pradeep Ruperao, Nurain Emylia Tomoyoshi, David Edwards, Peter M. Gresshoff, ...

Keyword(s): Second Generation, Ethyl Methanesulfonate, Whole Genome, Sequencing Analysis, Second Generation Sequencing, Generation Sequencing


Third-generation sequencing and the future of genomics

10.1101/048603; 2016; Cited By ~ 42

Author(s): Hayan Lee, James Gurtowski, Shinjae Yoo, Maria Nattestad, Shoshana Marcus, ...

Keyword(s): Single Molecule, Structural Variation, De Novo, Third Generation, Base Pairs, Haplotype Phasing, Third Generation Sequencing, Second Generation Sequencing, High Quality Genome, Generation Sequencing

Third-generation long-range DNA sequencing and mapping technologies are creating a renaissance in high-quality genome sequencing. Unlike second-generation sequencing, which produces short reads a few hundred base pairs long, third-generation single-molecule technologies generate reads of over 10,000 bp or map molecules of over 100,000 bp. We analyze how increased read lengths can be used to address longstanding problems in de novo genome assembly, structural variation analysis and haplotype phasing.
