de novo sequence assembly
Recently Published Documents


TOTAL DOCUMENTS

29
(FIVE YEARS 6)

H-INDEX

10
(FIVE YEARS 0)

2021 ◽  
Author(s):  
Austin Reid Manny ◽  
Carrie Ann Hetzel ◽  
Arshan Mizani ◽  
Max Lee Nibert

Trichomonas vaginalis is the most common nonviral cause of sexually transmitted infections globally, with an estimated quarter of a billion people infected around the world. Infection by the protozoan parasite results in the clinical syndrome trichomoniasis, which manifests as an inflammatory syndrome with acute and chronic consequences. Half or more of these parasites are themselves infected with one or more dsRNA viruses which can exacerbate the inflammatory disease. Four distinct viruses have been found in T. vaginalis to date, Trichomonas vaginalis virus 1 through 4 (or TVVs). Despite the global prevalence of these viruses, few coding-complete genome sequences have been determined. We conducted viral sequence mining in publicly available transcriptomes across 60 RNA-seq datasets representing 13 distinct T. vaginalis isolates. We assembled sequences for 27 new trichomonasvirus strains across all known TVV species, with 17 of these assemblies representing coding-complete genomes. Using a strategy of de novo sequence assembly followed by taxonomic classification, we discovered a fifth species of TVV that we term Trichomonas vaginalis virus 5 (TVV5). Six strains of TVV5 were assembled, including two coding-complete genomes. These TVV5 sequences exhibit high sequence identity to each other, but low identity to any strains of TVV1-4. Phylogenetic analysis corroborates the species-level designation. These results substantially increase the number of coding-complete TVV genome sequences and demonstrate the utility of mining publicly available transcriptomes for the discovery of RNA viruses in a critical human pathogen.


2021 ◽  
Author(s):  
Keyi Geng ◽  
Lara Garcia Merino ◽  
Linda Wedemann ◽  
Aniek Martens ◽  
Malgorzata Sobota ◽  
...  

The CRISPR/Cas9 system is widely used to permanently delete genomic regions by inducing double-strand breaks via dual guide RNAs. However, consequences of Cas9 deletion events have not been fully investigated. To characterize Cas9-induced genotypic abnormalities in human cells, we utilized an innovative droplet-based target enrichment approach followed by long-read sequencing and coupled it to customized de novo sequence assembly. We here describe extensive genomic disruptions by Cas9, involving a genomic duplication and inversion of the target region as well as integrations of exogenous DNA at the double-strand break sites. Although these events altered the genomic composition of the on-target region, we found that the aberrant DNA fragments are still functional, marked by active histones and bound by RNA polymerase III. Our findings broaden the consequential spectrum of the Cas9 deletion system, reinforce the necessity of meticulous genomic validations and rationalize extra caution when interpreting results from a deletion event.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zachary Deng ◽  
Eric Delwart

Background: Metagenomics is the study of microbial genomes for pathogen detection and discovery in human clinical, animal, and environmental samples via next-generation sequencing (NGS). Metagenome de novo sequence assembly is a crucial analytical step in which longer contigs, ideally whole chromosomes or genomes, are formed from shorter NGS reads. However, the contigs generated by de novo assembly are often highly fragmented and rarely longer than a few kilobase pairs (kb). A time-consuming extension process is therefore routinely performed on the de novo assembled contigs. Results: To facilitate this process, we propose ContigExtender, a new tool for metagenome contig extension after de novo assembly. ContigExtender employs a novel recursive extending strategy that explores multiple extending paths to achieve highly accurate longer contigs. We demonstrate that ContigExtender outperforms existing tools in synthetic, animal, and human metagenomics datasets. Conclusions: A novel software tool, ContigExtender, has been developed to assist and enhance the performance of metagenome de novo assembly. ContigExtender effectively extends contigs from a variety of sources and can be incorporated into most viral metagenomics analysis pipelines for a wide variety of applications, including pathogen detection and viral discovery.
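
The idea of extending a contig by exploring several candidate paths can be illustrated with a toy example. The sketch below is not ContigExtender's algorithm or API; it is a minimal depth-first illustration under stated assumptions: exact suffix/prefix overlaps only, no sequencing errors, no reverse complements, and the longest result wins. The function name `extend_right` and the parameters `min_overlap` and `max_depth` are hypothetical.

```python
def extend_right(contig, reads, min_overlap=4, max_depth=50):
    """Toy sketch: return the longest 3' extension of `contig`.

    Every read whose prefix exactly matches the contig's suffix opens a
    separate extension path; paths are explored recursively and the
    longest resulting sequence is kept. Assumes error-free reads.
    """
    best = contig
    if max_depth == 0:
        return best
    for i, read in enumerate(reads):
        # Try the longest overlap first; the read must add at least one base.
        longest = min(len(read) - 1, len(contig))
        for k in range(longest, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                candidate = contig + read[k:]
                # Each read is used at most once per path, so recursion ends.
                remaining = reads[:i] + reads[i + 1:]
                result = extend_right(candidate, remaining,
                                      min_overlap, max_depth - 1)
                if len(result) > len(best):
                    best = result
                break  # only the longest overlap per read is followed
    return best


if __name__ == "__main__":
    reads = ["GTACGG", "CGGTTA", "TACAAA"]
    print(extend_right("ACGTAC", reads, min_overlap=3))
```

Here two reads overlap the seed contig, so two paths are explored; the path through `GTACGG` can be extended once more by `CGGTTA` and is therefore returned. A real tool would also need to handle mismatches, coverage, and both contig ends.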


PLoS ONE ◽  
2020 ◽  
Vol 15 (8) ◽  
pp. e0237455
Author(s):  
Laila Sara Arroyo Mühr ◽  
Camilla Lagheden ◽  
Sadaf Sakina Hassan ◽  
Sara Nordqvist Kleppe ◽  
Emilie Hultin ◽  
...  

2020 ◽  
Vol 48 (1) ◽  
pp. 189-199
Author(s):  
Shuang ZHAO ◽  
Chenshu WANG

Valeriana jatamansi Jones is used for medicinal purposes in China and is also an important substitute for European Valeriana officinalis. Its major active principles, generally called valepotriates, are iridoid compounds. To better understand the iridoid biosynthesis pathway in V. jatamansi, we generated transcriptome sequences from leaf and root tissues and performed de novo sequence assembly. A total of 183,524,060 transcripts and 61,876 unigenes for V. jatamansi were obtained from 13.28 Gb of clean reads. Of these, 56,641 unigenes were annotated using public databases, while 5,235 unigenes remained unannotated. MISA analysis identified 5,195 unigenes containing simple sequence repeats (SSRs). When examining the annotation of transcriptome contigs against the KEGG database, we identified 24 unigenes that could be classified into 24 enzyme categories associated with three metabolic pathways leading to iridoid biosynthesis: 6 genes of the MVA pathway, 9 genes of the MEP pathway and 9 genes of the iridoid pathway. We selected 9 genes encoding key enzymes in the iridoid pathway of V. jatamansi and examined the organ specificity of their expression using quantitative real-time PCR (qPCR). In conclusion, we generated a comprehensive transcriptome assembly representing the gene space of V. jatamansi, and the genomic dataset and analyses presented here lay the foundation for further research on this important medicinal plant.
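
As a rough illustration of what an SSR scan over assembled unigenes involves, the sketch below finds perfect tandem repeats with a regular expression. It is not MISA and does not reproduce MISA's settings: the function name `find_ssrs` is hypothetical, and the thresholds (repeat units of 1 to 6 bases, at least 5 copies regardless of unit length) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_repeats=5, max_unit=6):
    """Toy sketch: list perfect tandem repeats in a nucleotide sequence.

    Returns (start, unit, copies) tuples, with `start` 0-based.
    The thresholds are illustrative assumptions, not MISA defaults.
    """
    # Lazy unit match so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_ssrs("TTGACACACACACAGGT"):
        print(start, unit, copies)
```

Counting the unigenes for which such a scan returns at least one hit gives the kind of "unigenes containing SSRs" figure reported in the abstract, although dedicated tools apply unit-specific thresholds and also report compound repeats.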


2020 ◽  
Author(s):  
R.J.S Orr ◽  
M. M. Sannum ◽  
S. Boessenkool ◽  
E. Di Martino ◽  
D.P. Gordon ◽  
...  

Resolution of relationships at lower taxonomic levels is crucial for answering many evolutionary questions, and as such, sufficiently varied species representation is vital. This latter goal is not always achievable with relatively fresh samples. To alleviate the difficulties in procuring rarer taxa, historical specimens are increasingly being used to build molecular phylogenies with high-throughput sequencing. This effort, however, has mainly focused on large-bodied or well-studied groups, with small-bodied and understudied taxa given lower priority. Here, we present a pipeline that utilizes both historical and contemporary specimens to increase the resolution of phylogenetic relationships among understudied and small-bodied metazoans, namely cheilostome bryozoans. In this study, we pioneer sequencing of air-dried bryozoans, utilizing a recent library preparation method for low DNA input. We use the de novo mitogenome assembly from the target specimen itself as the reference for iterative mapping, and the comparison thereof. In doing so, we present mitochondrial and ribosomal RNA sequences of 43 cheilostomes representing 37 species, including 14 from historical samples ranging from 50 to 149 years old. The inferred phylogenetic relationships of these samples, analyzed together with publicly available sequence data, are shown in a statistically well-supported cheilostome tree of 65 taxa and 17 genes. Finally, the methodological success is emphasized by the circularization of a total of 27 mitogenomes, seven of them from historical cheilostome samples. Our study highlights the potential of utilizing DNA from micro-invertebrate specimens stored in natural history collections for resolving phylogenetic relationships between species.


2018 ◽  
Vol 20 (6) ◽  
pp. 1981-1996 ◽  
Author(s):  
Jeff Gauthier ◽  
Antony T Vincent ◽  
Steve J Charette ◽  
Nicolas Derome

It is easy for today’s students and researchers to believe that modern bioinformatics emerged recently to assist next-generation sequencing data analysis. However, the very beginnings of bioinformatics occurred more than 50 years ago, when desktop computers were still a hypothesis and DNA could not yet be sequenced. The foundations of bioinformatics were laid in the early 1960s with the application of computational methods to protein sequence analysis (notably, de novo sequence assembly, biological sequence databases and substitution models). Later on, DNA analysis also emerged due to parallel advances in (i) molecular biology methods, which allowed easier manipulation of DNA, as well as its sequencing, and (ii) computer science, which saw the rise of increasingly miniaturized and more powerful computers, as well as novel software better suited to handle bioinformatics tasks. From the 1990s through the 2000s, major improvements in sequencing technology, along with reduced costs, gave rise to an exponential increase in data. The arrival of ‘Big Data’ has laid out new challenges in terms of data mining and management, calling for more computer science expertise in the field. Coupled with an ever-increasing number of bioinformatics tools, biological Big Data had (and continues to have) profound implications for the predictive power and reproducibility of bioinformatics results. To overcome this issue, universities are now fully integrating this discipline into the curriculum of biology students. Recent subdisciplines such as synthetic biology, systems biology and whole-cell modeling have emerged from the ever-increasing complementarity between computer science and biology.


2018 ◽  
Author(s):  
Tayyaba Qamar-ul-Islam ◽  
M. Ahmed Khan ◽  
Rabia Faizan ◽  
Uzma Mahmood

Mango is a well-known fruit and the fifth most important subtropical/tropical fruit crop worldwide, with production centered in India and South-East Asia. Recently, there has been worldwide interest in mango genomics to produce tools for marker-assisted selection and trait association. No web-based, analyzed genomic resource is available specifically for mango. Hence, a complete mango genomic resource was required to improve research on and management of mango germplasm. In this project, we performed comparative transcriptome analysis of four mango cultivars, i.e. cv. Langra, cv. Zill, cv. Shelly and cv. Kent, from Pakistan, China, Israel, and Mexico respectively. The underlying data were obtained through de novo sequence assembly, which generated 30,953-85,036 unigenes from RNA-Seq datasets of the mango cultivars. The project aims to provide the scientific community and the general public with a mango genomic resource and to allow users to examine their data against our analyzed mango genome databases of the four cultivars (cv. Langra, cv. Zill, cv. Shelly and cv. Kent). The mango web genomic resource, MGdb, is based on a 3-tier architecture and was developed using Python, a flat-file database, and JavaScript. It contains information on the predicted genes of the whole genome, the unigenes annotated by homologous genes in other species, and GO (Gene Ontology) terms, which provide a glimpse of the traits in which the genes are involved. This web genomic resource can be of immense use in the assessment of research, the development of medicines and the understanding of genetics, and it provides a useful bioinformatics solution for the analysis of nucleotide sequence data. We report here the world’s first web-based genomic resource specifically for mango, intended for genetic improvement and management of the mango genome.


2017 ◽  
Author(s):  
Waqasuddin Khan ◽  
Safina Abdul Razzak ◽  
M. Kamran Azim

Mango is an economically important fruit crop of many tropical and subtropical countries. Recently, leaf and fruit transcriptomes of mango cultivars grown in different geographical regions have been characterized. Here, we present a comparative transcriptome analysis of four mango cultivars, i.e. cv. Langra, cv. Zill, cv. Shelly and cv. Kent, from Pakistan, China, Israel and Mexico respectively. De novo sequence assembly generated 30,953-85,036 unigenes from RNA-Seq datasets of the mango cultivars. KEGG pathway mapping of mango unigenes identified terpenoid, flavonoid and carotenoid biosynthetic pathways involved in flavor and color. The analysis revealed linalool as the major monoterpenoid found in all cultivars studied, whereas the monoterpene α-terpineol was found specifically in cv. Shelly. The diterpene gibberellin biosynthesis pathway was found in all cultivars, whereas the homoterpene synthase involved in biosynthesis of 4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT; an insect-induced diterpene) was found in cv. Kent. Among sesquiterpenes and triterpenes, the biosynthetic pathway of Germacrene-D, an antibacterial and anti-insecticidal metabolite, was found in cv. Zill and cv. Shelly. Two bioactive triterpenes, lupeol and β-amyrin, were found in cv. Langra and cv. Zill. Unigenes involved in the biosynthesis of the carotenoids β-carotene and lycopene were found in the cultivars studied. Many unigenes involved in flavonoid biosynthesis were also found. Comparative transcriptomics revealed naringenin (an anti-inflammatory and antioxidant metabolite) as the ‘central’ flavanone responsible for the biosynthesis of an array of flavonoids. The present study provides insights into the genetic resources responsible for the flavor and color of mango fruit.

