average read length
Recently Published Documents


Total documents: 23 (five years: 12)

H-index: 4 (five years: 2)

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Hui Zhang ◽  
Jingjing Jin ◽  
Guoyun Xu ◽  
Zefeng Li ◽  
Niu Zhai ◽  
...  

Abstract Background Cigar wrapper leaves are the most important raw material of cigars. Studying the genomic information of cigar tobacco is conducive to improving cigar quality through genetic breeding. However, no reference genome or genome-wide set of full-length transcripts has been reported for cigar tobacco. In particular, anion channels/transporters are of high interest for their potential application in regulating the chloride content of cigar tobacco grown on coastal lands, where Cl− accumulation is usually relatively high and unfavorable. Here, the PacBio platform and NGS technology were combined to generate a full-length transcriptome of cigar tobacco used for cigar wrappers. Results High-quality RNA isolated from the roots, leaves and stems of cigar tobacco was subjected to both PacBio sequencing and NGS. From PacBio, a total of 11,652,432 subreads (19 Gb) were generated, with an average read length of 1,608 bp. After corrections were performed in conjunction with the NGS reads, we ultimately identified 1,695,064 open reading frames (ORFs), including 21,486 full-length ORFs and 7,342 genes encoding transcription factors (TFs) from 55 TF families, together with 2,230 genes encoding long non-coding RNAs. Members of gene families related to anion channels/transporters, including members of the SLAC and CLC families, were identified and characterized. Conclusions The full-length transcriptome of cigar tobacco was obtained, annotated, and analyzed, providing a valuable genetic resource for future studies in cigar tobacco.
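
The average read length quoted above is the total number of sequenced bases divided by the number of reads. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names are hypothetical, the toy lengths are invented for illustration, and 1 Gb is assumed to mean 10^9 bases. Because the abstract rounds the yield to 19 Gb, the back-of-envelope estimate (about 1,630 bp) only approximates the reported 1,608 bp.

```python
def mean_read_length(read_lengths):
    """Arithmetic mean of a list of read lengths, in bp."""
    if not read_lengths:
        raise ValueError("no reads supplied")
    return sum(read_lengths) / len(read_lengths)


def mean_from_totals(total_bases, read_count):
    """Mean read length recovered from summary totals alone."""
    if read_count <= 0:
        raise ValueError("read_count must be positive")
    return total_bases / read_count


# Toy lengths (invented for illustration only).
toy_lengths = [500, 1500, 2000, 2400]
print(mean_read_length(toy_lengths))  # 1600.0

# Rounded totals from the abstract, assuming 1 Gb = 1e9 bases.
estimate = mean_from_totals(19e9, 11_652_432)
print(f"estimated mean read length: {estimate:.0f} bp")
```

The gap between the estimate and the reported value is what one would expect if the true yield were closer to 18.7 Gb before rounding.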


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Trent M. Prall ◽  
Emma K. Neumann ◽  
Julie A. Karl ◽  
Cecilia G. Shortreed ◽  
David A. Baker ◽  
...  

Abstract Background Oxford Nanopore Technologies’ instruments can sequence reads of great length. Long reads improve sequence assemblies by unambiguously spanning repetitive elements of the genome. Sequencing reads of significant length requires the preservation of long DNA template molecules through library preparation by pipetting reagents as slowly as possible to minimize shearing. This process is time-consuming and inconsistent at preserving read length as even small changes in volumetric flow rate can result in template shearing. Results We have designed SNAILS (Slow Nucleic Acid Instrument for Long Sequences), a 3D-printable instrument that automates slow pipetting of reagents used in long read library preparation for Oxford Nanopore sequencing. Across six sequencing libraries, SNAILS preserved more reads exceeding 100 kilobases in length and increased its libraries’ average read length over manual slow pipetting. Conclusions SNAILS is a low-cost, easily deployable solution for improving sequencing projects that require reads of significant length. By automating the slow pipetting of library preparation reagents, SNAILS increases the consistency and throughput of long read Nanopore sequencing.


Author(s):  
Dattatraya Hegde Radhika ◽  
Raghavendra Gunnaiah ◽  
Ashwini Lamani ◽  
Dadapeer Peerjade ◽  
Rudrappa Chandrashekhar Jagadeesh

Bacterial blight, caused by Xanthomonas citri pv. punicae (Xcp), is a devastating disease of pomegranate in India and Pakistan. Most xanthomonads use the type III secretion system to inject transcription activator-like effector (TALE) proteins into host cells. TALEs bind to the effector binding elements in the promoters of host susceptibility genes, triggering disease development. PacBio single-molecule real-time (SMRT) long-read sequencing technology was used to identify the TALE-encoding genes, which is otherwise not possible using next-generation short-read sequencers. A total of 1.74 Gb of raw data containing 368,980 subreads, with an average read length of 4,724 bp and a longest read length of 77,471 bp, was generated. Subreads were assembled into 15 scaffolds, generating a ~5.4 Mb (348x) genome. Xcp exhibited close lineage with X. citri pv. citri, with 98.78% average nucleotide identity. Of the 4,263 protein-coding genes, eleven non-TALE type III effectors and two TALE-encoding genes were identified.


Author(s):  
Yi Yan ◽  
Na Zhang ◽  
Chenglin Liu ◽  
Xinran Wu ◽  
Kai Liu ◽  
...  

Abstract As a polyphagous soil-dwelling predatory mite, Stratiolaelaps scimitus (Womersley) (Acari: Laelapidae), formerly known as Stratiolaelaps miles (Berlese), is native to the Northern Hemisphere and preys on soil invertebrates, including fungus gnats, springtails, thrips nymphs, nematodes, and other species of mites. Already mass-produced and commercialized in North America and Europe, S. scimitus has now been introduced in China as a biocontrol agent for field crops. Such introductions, however, can lead to unexpected genetic changes within populations of biological control agents, which might decrease the efficacy of pest management or increase the risks to local environments. To better understand the genetic basis of its biology and behavior, we sequenced and assembled the draft genome of S. scimitus using the PacBio Sequel II platform. We generated ∼150× (64.81 Gb) of PacBio long reads with an average read length of 12.60 kb. Reads longer than 5 kb were assembled into contigs, resulting in a final assembly of 158 contigs with an N50 length of 7.66 Mb that captured 93.1% of the BUSCO gene set (n = 1,066). We identified 16.39% (69.91 Mb) repetitive elements, 1,686 non-coding RNAs, and 13,305 protein-coding genes, which represented 95.8% BUSCO completeness. Combined analyses of gene family evolution and functional enrichment of gene ontology terms and pathways showed that a total of 135 families experienced significant expansions, which were mainly involved in digestion, detoxification, immunity and venom. Major expansions of the detoxification enzymes, i.e., P450s and carboxylesterases, suggest a possible genetic mechanism underlying polyphagy and ecological adaptations. Our high-quality genome assembly and annotation provide new insights into the evolutionary biology, soil ecology and biological control of predaceous mites.
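
The N50 length reported above is the contig length at which the contigs of that length or longer together contain at least half of the assembled bases. The sketch below shows the standard calculation on invented toy contig lengths. The function name `n50` is a hypothetical helper, and nothing here reproduces the authors' assembly statistics or tooling.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembly size.
    """
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length


# Toy contig lengths (invented); total = 300, half = 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```

In the toy case the two longest contigs (80 + 70) already cover half of the 300 assembled bases, so the N50 is 70.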


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Yusuke Oguchi ◽  
Hirofumi Shintaku ◽  
Sotaro Uemura

Abstract Single-cell transcriptome analysis has been revolutionized by DNA barcodes that index cDNA libraries, allowing highly multiplexed analyses to be performed. Furthermore, DNA barcodes are being leveraged for spatial transcriptomes. Although spatial resolution relies on methods used to decode DNA barcodes, achieving single-molecule decoding remains a challenge. Here, we developed an in-house sequencing system inspired by a single-molecule sequencing system, HeliScope, to spatially decode DNA barcode molecules at single-molecule resolution. We benchmarked our system with 30 types of DNA barcode molecules and obtained an average read length of ~20 nt with an error rate of less than 5% per nucleotide, which was sufficient to spatially identify them. Additionally, we spatially identified DNA barcode molecules bound to antibodies at single-molecule resolution. Leveraging this, we devised a method, termed “molecular foot printing”, showing potential for applying our system not only to spatial transcriptomics, but also to spatial proteomics.


2020 ◽  
Author(s):  
Trent M. Prall ◽  
Emma K. Neumann ◽  
Julie A. Karl ◽  
Cecilia G. Shortreed ◽  
David A. Baker ◽  
...  

Abstract Background Oxford Nanopore Technologies’ instruments can sequence reads of great length. Long reads improve sequence assemblies by unambiguously spanning repetitive elements of the genome. Sequencing reads of significant length requires the preservation of long DNA template molecules through library preparation by pipetting reagents as slowly as possible in order to minimize shearing. This process is time-consuming and inconsistent at preserving read length, as even small changes in volumetric flow rate can result in template shearing. Results We have designed SNAILS (Slow Nucleic Acid Instrument for Long Sequences), a 3D-printable instrument that automates slow pipetting of reagents used in long read library preparation for Oxford Nanopore sequencing. Across six sequencing libraries, SNAILS preserved more reads exceeding one hundred kilobases in length and increased the average read length of its libraries over manual slow pipetting. Conclusions SNAILS is a low-cost, easily deployable solution for improving sequencing projects that require reads of significant length. By automating the slow pipetting of library preparation reagents, SNAILS increases both the consistency and throughput of long read Nanopore sequencing.


2020 ◽  
Author(s):  
Wenxiong Zhou ◽  
Li Kang ◽  
Haifeng Duan ◽  
Shuo Qiao ◽  
Louis Tao ◽  
...  

Abstract An error-correction code (ECC) sequencing approach has recently been reported to effectively reduce sequencing errors by interrogating a DNA fragment with three orthogonal degenerate sequencing-by-synthesis (SBS) reactions. However, similar to other non-single-molecule SBS methods, the reaction will gradually lose its synchronization within a molecular colony in ECC sequencing. This phenomenon, called dephasing, causes sequencing error, and in ECC sequencing, induces distinctive dephasing patterns. To understand the characteristic dephasing patterns of the dual-base flowgram in ECC sequencing and to generate a correction algorithm, we built a virtual sequencer in silico. Starting from first principles and based on sequencing chemical reactions, we simulated ECC sequencing results, identified the key factors of dephasing in ECC sequencing chemistry, and designed an effective dephasing algorithm. The results show that our dephasing algorithm is applicable to sequencing signals with at least 500 cycles, or 1,000-bp average read length, with acceptably low error rate for further parity-checks and ECC deduction. Our virtual sequencer with our dephasing algorithm can further be extended to a dichromatic form of ECC sequencing, allowing for a potentially much more accurate sequencing approach.
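
The equivalence of 500 cycles and a 1,000-bp average read length implies roughly two bases read per cycle. A toy model of an ideal dual-base flowgram, with no dephasing, illustrates why that ratio is plausible for a random template. This is a sketch under stated assumptions and not the authors' virtual sequencer: the function name `dual_base_flowgram`, the AC/GT base pairing, and the random test sequence are illustrative choices.

```python
import random


def dual_base_flowgram(sequence, first_set="AC", second_set="GT"):
    """Ideal (dephasing-free) dual-base flowgram of a DNA sequence.

    Flows alternate between two complementary base sets. Each flow
    extends through every consecutive base belonging to the current set
    and records how many bases were incorporated.
    """
    sets = (set(first_set), set(second_set))
    unknown = set(sequence) - (sets[0] | sets[1])
    if unknown:
        raise ValueError(f"bases not covered by either set: {sorted(unknown)}")
    flows, pos, flow_index = [], 0, 0
    while pos < len(sequence):
        allowed = sets[flow_index % 2]
        run = 0
        while pos < len(sequence) and sequence[pos] in allowed:
            run += 1
            pos += 1
        flows.append(run)  # may be 0 if the template starts in the other set
        flow_index += 1
    return flows


print(dual_base_flowgram("ACGTTAC"))  # [2, 3, 2]

# Hypothetical random template: mean bases per flow should be close to 2.
rng = random.Random(0)
template = "".join(rng.choices("ACGT", k=10_000))
flows = dual_base_flowgram(template)
print(len(template) / len(flows))
```

In this idealized model each flow reads a run of bases from one set, and for a uniformly random template such runs average two bases, which matches the two-bases-per-cycle ratio quoted in the abstract.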


2020 ◽  
Author(s):  
Yuya Kiguchi ◽  
Suguru Nishijima ◽  
Naveen Kumar ◽  
Masahira Hattori ◽  
Wataru Suda

Abstract Background: The ecological and biological features of the indigenous phage community (virome) in the human gut microbiome are poorly understood, possibly due to many fragmented contigs and fewer complete genomes based on conventional short-read metagenomics. Long-read sequencing technologies have attracted attention as an alternative approach to reconstruct long and accurate contigs from microbial communities. However, the impact of long-read metagenomics on human gut virome analysis has not been well evaluated. Results: Here we present chimera-less PacBio long-read metagenomics of multiple displacement amplification (MDA)-treated human gut virome DNA. The method included the development of a novel bioinformatics tool, SACRA (Split Amplified Chimeric Read Algorithm), which efficiently detects and splits numerous chimeric reads in PacBio reads from the MDA-treated virome samples. SACRA treatment of PacBio reads from five samples markedly reduced the average chimera ratio from 72 to 1.5%, generating chimera-less PacBio reads with an average read-length of 1.8 kb. De novo assembly of the chimera-less long reads generated contigs with an average N50 length of 11.1 kb, whereas those of MiSeq short reads from the same samples were 0.7 kb, dramatically improving contig extension. Alignment of both contig sets generated 378 high-quality merged contigs (MCs) composed of the minimum scaffolds of 434 MiSeq and 637 PacBio contigs, respectively, and also identified numerous MiSeq short fragmented contigs ≤500 bp additionally aligned to MCs, which possibly originated from a small fraction of MiSeq chimeric reads. The alignment also revealed that fragmentations of the scaffolded MiSeq contigs were caused primarily by genomic complexity of the community, including local repeats, hypervariable regions, and highly conserved sequences in and between the phage genomes. 
We identified 142 complete and near-complete phage genomes including 108 novel genomes, varying from 5 to 185 kb in length, the majority of which were predicted to be Microviridae phages including several variants with homologous but distinct genomes, which were fragmented in MiSeq contigs. Conclusions: Long-read metagenomics coupled with SACRA provides an improved method to reconstruct accurate and extended phage genomes from MDA-treated virome samples of the human gut, and potentially from other environmental virome samples.


2020 ◽  
Vol 48 (7) ◽  
pp. 3734-3746 ◽  
Author(s):  
Stephan Werner ◽  
Lukas Schmidt ◽  
Virginie Marchand ◽  
Thomas Kemmer ◽  
Christoph Falschlunger ◽  
...  

Abstract Reverse transcription (RT) of RNA templates containing RNA modifications leads to synthesis of cDNA containing information on the modification in the form of misincorporation, arrest, or nucleotide skipping events. A compilation of such events from multiple cDNAs represents an RT-signature that is typical for a given modification, but, as we show here, depends also on the reverse transcriptase enzyme. A comparison of 13 different enzymes revealed a range of RT-signatures, with individual enzymes exhibiting average arrest rates between 20 and 75%, as well as average misincorporation rates between 30 and 75% in the read-through cDNA. Using RT-signatures from individual enzymes to train a random forest model as a machine learning regimen for prediction of modifications, we found strongly variegated success rates for the prediction of methylated purines, as exemplified with N1-methyladenosine (m1A). Among the 13 enzymes, a correlation was found between read length, misincorporation, and prediction success. Inversely, low average read length was correlated to high arrest rate and lower prediction success. The three most successful polymerases were then applied to the characterization of RT-signatures of other methylated purines. Guanosines featuring methyl groups on the Watson-Crick face were identified with high confidence, but discrimination between m1G and m22G was only partially successful. In summary, the results suggest that, given sufficient coverage and a set of specifically optimized reaction conditions for reverse transcription, all RNA modifications that impede Watson-Crick bonds can be distinguished by their RT-signature.


GigaScience ◽  
2020 ◽  
Vol 9 (1) ◽  
Author(s):  
Martin Pippel ◽  
David Jebb ◽  
Franziska Patzold ◽  
Sylke Winkler ◽  
Heiko Vogel ◽  
...  

Abstract Background Adapted to different ecological niches, moth species belonging to the Hyles genus exhibit a spectacular diversity of larval color patterns. These species diverged ∼7.5 million years ago, making this rather young genus an interesting system to study a wide range of questions including the process of speciation, ecological adaptation, and adaptive radiation. Results Here we present a high-quality genome assembly of the bat hawkmoth Hyles vespertilio, the first reference genome of a member of the Hyles genus. We generated 51× Pacific Biosciences long reads with an average read length of 8.9 kb. Pacific Biosciences reads longer than 4 kb were assembled into contigs, resulting in a 651.4-Mb assembly consisting of 530 contigs with an N50 value of 7.5 Mb. The circular mitochondrial contig has a length of 15,303 bp. The H. vespertilio genome is very repeat-rich and exhibits a higher repeat content (50.3%) than other Bombycoidea species such as Bombyx mori (45.7%) and Manduca sexta (27.5%). We developed a comprehensive gene annotation workflow to obtain consensus gene models from different evidence including gene projections, protein homology, transcriptome data, and ab initio predictions. The resulting gene annotation is highly complete with 94.5% of BUSCO genes being completely present, which is higher than the BUSCO completeness of the B. mori (92.2%) and M. sexta (90%) annotations. Conclusions Our gene annotation strategy has general applicability to other genomes, and the H. vespertilio genome provides a valuable molecular resource to study a range of questions in this genus, including phylogeny, incomplete lineage sorting, speciation, and hybridization. A genome browser displaying the genome, alignments, and annotations is available at https://genome-public.pks.mpg.de/cgi-bin/hgTracks?db=HLhylVes1.
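
Coverage, genome size, and average read length are linked by simple arithmetic: total sequenced bases equal coverage times genome size, and dividing by the mean read length gives an approximate read count. The sketch below applies this to the figures in the abstract, assuming the 651.4-Mb assembly size stands in for the genome size. The function names are hypothetical, and the outputs (about 33 Gb and roughly 3.7 million reads) are back-of-envelope estimates, not values reported by the authors.

```python
def sequenced_bases(coverage, genome_size_bp):
    """Total bases implied by a given fold coverage of a genome."""
    return coverage * genome_size_bp


def estimated_read_count(total_bases, mean_read_length_bp):
    """Approximate number of reads given total yield and mean length."""
    if mean_read_length_bp <= 0:
        raise ValueError("mean read length must be positive")
    return total_bases / mean_read_length_bp


# Figures from the abstract; assembly size is used as a proxy for genome size.
coverage = 51                 # fold coverage
genome_size_bp = 651.4e6      # 651.4 Mb assembly
mean_read_length_bp = 8.9e3   # 8.9 kb

bases = sequenced_bases(coverage, genome_size_bp)
reads = estimated_read_count(bases, mean_read_length_bp)
print(f"total yield: {bases / 1e9:.1f} Gb")
print(f"estimated reads: {reads / 1e6:.1f} million")
```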

