Whole-Genome Sequencing and Potassium-Solubilizing Mechanism of Bacillus aryabhattai SK1-7

2022 ◽  
Vol 12 ◽  
Author(s):  
Yifan Chen ◽  
Hui Yang ◽  
Zizhu Shen ◽  
Jianren Ye

This study analyzed the whole genome of Bacillus aryabhattai strain SK1-7 and explored its potassium solubilization characteristics and mechanism, to provide a theoretical basis for the utilization and improvement of insoluble potassium resources in soil. Genome information for Bacillus aryabhattai SK1-7 was obtained using Illumina NovaSeq second-generation sequencing and GridION Nanopore ONT third-generation sequencing technology. The contents of organic acids and polysaccharides in the fermentation broth of Bacillus aryabhattai SK1-7 were determined by high-performance liquid chromatography and the anthrone sulfuric acid method, and the expression levels of the potassium solubilization-related genes ackA, epsB, gltA, mdh and ppc were compared by real-time fluorescence quantitative PCR under different potassium source culture conditions. The whole genome of the strain consisted of one complete chromosome sequence and four plasmid sequences. The sequence sizes of the chromosome and plasmids P1, P2, P3 and P4 were 5,188,391 bp, 136,204 bp, 124,862 bp, 67,200 bp and 12,374 bp, respectively. Their GC contents were 38.2, 34.4, 33.6, 32.8 and 33.7%, respectively. Strain SK1-7 mainly secreted malic, formic, acetic and citric acids when cultured with an insoluble potassium source. The polysaccharide content produced with an insoluble potassium source was higher than that with a soluble potassium source. The expression levels of the five potassium solubilization-related genes were higher with the insoluble potassium source than with the soluble potassium source.
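The replicon sizes and GC contents reported above can be combined into a total genome size and a size-weighted overall GC content. The following is only an illustrative sketch: the dictionary layout and the function names `total_size` and `weighted_gc` are assumptions made for this example, and the overall GC figure is derived here rather than stated in the abstract.

```python
# Replicon sizes (bp) and GC contents (%) exactly as reported in the abstract.
REPLICONS = {
    "chromosome": (5_188_391, 38.2),
    "P1": (136_204, 34.4),
    "P2": (124_862, 33.6),
    "P3": (67_200, 32.8),
    "P4": (12_374, 33.7),
}


def total_size(replicons):
    """Sum the lengths of all replicons, in base pairs."""
    return sum(size for size, _gc in replicons.values())


def weighted_gc(replicons):
    """Overall GC percentage, weighting each replicon by its length."""
    gc_weighted_sum = sum(size * gc for size, gc in replicons.values())
    return gc_weighted_sum / total_size(replicons)


if __name__ == "__main__":
    print(f"total genome size: {total_size(REPLICONS):,} bp")
    print(f"size-weighted GC content: {weighted_gc(REPLICONS):.2f}%")
```

Because the chromosome accounts for most of the sequence, the weighted GC content stays close to the chromosomal value; the plasmids pull it down only slightly.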

2018 ◽  
Author(s):  
Christine Tranchant-Dubreuil ◽  
Sébastien Ravel ◽  
Cécile Monat ◽  
Gautier Sarah ◽  
Abdoulaye Diallo ◽  
...  

The advent of NGS has intensified the need for robust pipelines to perform high-performance automated analyses. The required software depends on the sequencing method used to produce the raw data (e.g. whole-genome sequencing, genotyping by sequencing, RNA-Seq) as well as on the kind of analysis to be carried out (GWAS, population structure, differential expression). These tools have to be generic and scalable, and should meet biologists' needs. Here, we present the new version of TOGGLe (Toolbox for Generic NGS Analyses), a simple and highly flexible framework to easily and quickly generate pipelines for large-scale second- and third-generation sequencing analyses, including multi-sample and multi-threading support. TOGGLe is a workflow manager designed to be as effortless as possible for biologists to use, so that the focus can remain on the analyses. Pipelines are easily customizable, and supported analyses are reproducible and shareable. TOGGLe is designed as a generic, adaptable and rapidly evolving solution, and has been tested and used in large-scale projects on various organisms. It is freely available at http://toggle.southgreen.fr/ under the GNU GPLv3/CeCill-C licenses and can be deployed on HPC clusters as well as on local machines.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xiaoying Fan ◽  
Cheng Yang ◽  
Wen Li ◽  
Xiuzhen Bai ◽  
Xin Zhou ◽  
...  

There is no effective way to detect structural variations (SVs) and extra-chromosomal circular DNAs (ecDNAs) at the single-cell whole-genome level. Here, we develop a novel third-generation sequencing platform-based single-cell whole-genome sequencing (scWGS) method named SMOOTH-seq (single-molecule real-time sequencing of long fragments amplified through transposon insertion). We evaluate the method for detecting CNVs, SVs, and SNVs in human cancer cell lines and a colorectal cancer sample and show that SMOOTH-seq reliably and effectively detects SVs and ecDNAs in individual cells, but shows relatively limited accuracy in the detection of CNVs and SNVs. SMOOTH-seq opens a new chapter in scWGS, as it generates high-fidelity reads that are kilobases long.


2021 ◽  
Vol 12 ◽  
Author(s):  
Zongrui Dai ◽  
Jianyu Ren ◽  
Xiaoling Tong ◽  
Hai Hu ◽  
Kunpeng Lu ◽  
...  

The domesticated silkworm, Bombyx mori, is an important model system for the order Lepidoptera. A chromosome-level genome of Bombyx mori based on third-generation sequencing has now been released. However, its transcripts were mainly assembled from second-generation short reads and expressed sequence tags, which cannot describe the transcript profile accurately. Here, we used PacBio Iso-Seq technology to investigate the transcripts from 45 developmental stages of Bombyx mori. We obtained 25,970 non-redundant high-quality consensus isoforms capturing ∼60% of previously reported RNAs, 15,431 (∼47%) novel transcripts, and identified 7,253 long non-coding RNAs (lncRNAs), a large proportion of which (∼56%) were novel. In addition, we found that exonization of transposable elements (TEs) accounts for 11,671 (∼45%) transcripts, including 5,980 protein-coding transcripts (∼32%) and 5,691 lncRNAs (∼79%). Overall, our results expand the set of known silkworm transcripts and have general implications for understanding the interaction between TEs and their host genes. This transcript resource will promote functional studies of genes and lncRNAs as well as TEs in the silkworm.


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Davide Bolognini ◽  
Alberto Magi ◽  
Vladimir Benes ◽  
Jan O Korbel ◽  
Tobias Rausch

Background: Tandem repeat sequences are widespread in the human genome, and their expansions cause multiple repeat-mediated disorders. Genome-wide discovery approaches are needed to fully elucidate their roles in health and disease, but resolving tandem repeat variation accurately remains a challenging task. While traditional mapping-based approaches using short-read data have severe limitations in the size and type of tandem repeats they can resolve, recent third-generation sequencing technologies exhibit substantially higher sequencing error rates, which complicates repeat resolution.
Results: We developed TRiCoLOR, a freely available tool for tandem repeat profiling using error-prone long reads from third-generation sequencing technologies. The method can identify repetitive regions in sequencing data without prior knowledge of their motifs or locations and resolve repeat multiplicity and period size in a haplotype-specific manner. The tool includes methods to interactively visualize the identified repeats and to trace their Mendelian consistency in pedigrees.
Conclusions: TRiCoLOR demonstrates excellent performance and improved sensitivity and specificity compared with alternative tools on synthetic data. For real human whole-genome sequencing data, TRiCoLOR achieves high validation rates, suggesting its suitability to identify tandem repeat variation in personal genomes.
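To make the terms "period size" and "repeat multiplicity" concrete, the sketch below computes both for an error-free repeat tract. It is not TRiCoLOR's method, which must cope with error-prone long reads; it uses exact string matching only, and the function names `smallest_period` and `describe_repeat` are hypothetical names chosen for this illustration.

```python
def smallest_period(seq):
    """Return the smallest p such that seq[i] == seq[i - p] for every i >= p."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    for p in range(1, n):
        if all(seq[i] == seq[i - p] for i in range(p, n)):
            return p
    # No shorter period exists, so the whole sequence is a single copy.
    return n


def describe_repeat(seq):
    """Return (motif, period size, multiplicity) for an exact tandem repeat."""
    period = smallest_period(seq)
    motif = seq[:period]
    multiplicity = len(seq) / period  # fractional if the last copy is partial
    return motif, period, multiplicity


if __name__ == "__main__":
    print(describe_repeat("CAGCAGCAGCAG"))
    print(describe_repeat("ATATATA"))
```

A single sequencing error inside the tract would break the exact-match test, which is one reason repeat profiling from noisy long reads needs more tolerant approaches than this one.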


2021 ◽  
Vol 182 (2) ◽  
pp. 63-71
Author(s):  
M. M. Agakhanov ◽  
E. A. Grigoreva ◽  
E. K. Potokina ◽  
P. S. Ulianich ◽  
Y. V. Ukhatova

The immune North American grapevine species Vitis rotundifolia Michaux (subgen. Muscadinia Planch.) is regarded as a potential donor of disease resistance genes, as it withstands such dangerous grape diseases as powdery and downy mildews. The cultivar ‘Dixie’ is the only representative of this species preserved ex situ in Russia: it is maintained by the N.I. Vavilov All-Russian Institute of Plant Genetic Resources (VIR) in the orchards of its branch, Krymsk Experiment Breeding Station. Third-generation sequencing on the MinION platform was performed to obtain information on the primary structure of the cultivar’s genomic DNA, also employing the results of Illumina sequencing available in databases. A detailed description of the technique, with modifications at various stages, is presented as it was used for grapevine genome sequencing and whole-genome sequence assembly. The modified technique included the main stages of the original protocol recommended by the MinION producer: 1) DNA extraction; 2) preparation of libraries for sequencing; 3) MinION sequencing and bioinformatic data processing; 4) de novo whole-genome sequence assembly using only MinION data or hybrid assembly (MinION+Illumina data); and 5) functional annotation of the whole-genome assembly. Stage 4 included not only de novo sequencing, but also the analysis of the available bioinformatic data, thus minimizing errors and increasing precision during the assembly of the studied genome. The DNA isolated from the leaves of cv. ‘Dixie’ was sequenced using two MinION flow cells (R9.4.1).


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Wiktor Kuśmirek ◽  
Wiktor Franus ◽  
Robert Nowak

Third-generation sequencing techniques, which make it possible to obtain much longer DNA reads than next-generation sequencing technologies, are becoming more and more popular. There are many possibilities for combining data from next-generation and third-generation sequencing. Herein, we present a new application called dnaasm-link for linking contigs, the result of de novo assembly of second-generation sequencing data, with long DNA reads. Our tool includes an integrated module to fill gaps with a suitable fragment of an appropriate long DNA read, which improves the consistency of the resulting DNA sequences. This feature is very important, in particular for complex DNA regions. Our implementation is found to outperform other state-of-the-art tools in terms of speed and memory requirements, which may enable its use for organisms with a large genome, something that is not possible with existing applications. The presented application has many advantages: (i) it significantly optimizes memory and reduces computation time; (ii) it fills gaps with an appropriate fragment of a specified long DNA read; (iii) it reduces the number of spanned and unspanned gaps in existing genome drafts. The application is freely available to all users under the GNU Library or Lesser General Public License version 3.0 (LGPLv3). The demo application, Docker image, and source code can be downloaded from the project homepage.


2016 ◽  
Author(s):  
Anna Kuosmanen ◽  
Veli Mäkinen

Motivation: Transcript prediction can be modelled as a graph problem where exons are modelled as nodes and reads spanning two or more exons are modelled as exon chains. PacBio third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions.
Results: We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity / precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy.
Availability: The simulated data and in-house scripts used for this article are available at http://cs.helsinki.fi/u/aekuosma/exon_chain_evaluation_publish.tar.gz.
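The graph model described in the abstract, with exons as nodes and spliced reads as exon chains, can be sketched roughly as follows. This is a toy illustration and not one of the approaches surveyed by the authors: the half-open `(start, end)` coordinates, the largest-overlap assignment rule, and the names `exon_chain` and `build_splicing_graph` are all assumptions made for this example.

```python
from collections import defaultdict


def exon_chain(blocks, exons):
    """Map each aligned block of a read to the exon it overlaps most.

    blocks and exons are lists of half-open (start, end) intervals.
    Returns the list of exon indices visited by the read, in order.
    """
    chain = []
    for b_start, b_end in blocks:
        best, best_overlap = None, 0
        for idx, (e_start, e_end) in enumerate(exons):
            overlap = min(b_end, e_end) - max(b_start, e_start)
            if overlap > best_overlap:
                best, best_overlap = idx, overlap
        # Skip blocks that hit no exon, and merge consecutive hits on one exon.
        if best is not None and (not chain or chain[-1] != best):
            chain.append(best)
    return chain


def build_splicing_graph(chains):
    """Count how many reads support each arc between consecutive exons."""
    arcs = defaultdict(int)
    for chain in chains:
        for u, v in zip(chain, chain[1:]):
            arcs[(u, v)] += 1
    return dict(arcs)


EXONS = [(100, 200), (300, 400), (500, 650)]
READS = [
    [(110, 200), (300, 400), (500, 600)],  # visits exons 0, 1 and 2
    [(150, 200), (500, 640)],              # skips exon 1
]
CHAINS = [exon_chain(read, EXONS) for read in READS]
GRAPH = build_splicing_graph(CHAINS)

if __name__ == "__main__":
    print(CHAINS)
    print(GRAPH)
```

A misaligned block near a splice site would be assigned to the wrong exon or to none at all, which in this model shows up as a spurious or missing arc, the failure mode the abstract describes.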


2015 ◽  
Vol 5 (4) ◽  
pp. 559-567 ◽  
Author(s):  
Nur Fatihah Mohd-Yusoff ◽  
Pradeep Ruperao ◽  
Nurain Emylia Tomoyoshi ◽  
David Edwards ◽  
Peter M. Gresshoff ◽  
...  

2016 ◽  
Author(s):  
Hayan Lee ◽  
James Gurtowski ◽  
Shinjae Yoo ◽  
Maria Nattestad ◽  
Shoshana Marcus ◽  
...  

Third-generation long-range DNA sequencing and mapping technologies are creating a renaissance in high-quality genome sequencing. Unlike second-generation sequencing, which produces short reads a few hundred base-pairs long, third-generation single-molecule technologies generate over 10,000 bp reads or map over 100,000 bp molecules. We analyze how increased read lengths can be used to address longstanding problems in de novo genome assembly, structural variation analysis and haplotype phasing.

