Building de novo reference genome assemblies of complex eukaryotic microorganisms from single nuclei

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Manfred Grabherr ◽  
Barbara Ellis ◽  
...  

Abstract: The advent of novel sequencing techniques has unraveled a tremendous diversity on Earth. Genomic data allow us to understand ecology and function of organisms that we would not otherwise know existed. However, major methodological challenges remain, in particular for multicellular organisms with large genomes. Arbuscular mycorrhizal (AM) fungi are important plant symbionts with cryptic and complex multicellular life cycles, thus representing a suitable model system for method development. Here, we report a novel method for large scale, unbiased nuclear sorting, sequencing, and de novo assembling of AM fungal genomes. After comparative analyses of three assembly workflows we discuss how sequence data from single nuclei can best be used for different downstream analyses such as phylogenomics and comparative genomics of single nuclei. Based on analysis of completeness, we conclude that comprehensive de novo genome assemblies can be produced from six to seven nuclei. The method is highly applicable for a broad range of taxa, and will greatly improve our ability to study multicellular eukaryotes with complex life cycles.

2021 ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Verena Esther Kutschera ◽  
Hanna Johannesson ◽  
...  

Morphological characters and nuclear ribosomal DNA (rDNA) phylogenies have so far been the basis of the current classifications of arbuscular mycorrhizal (AM) fungi. Improved understanding of the phylogeny and evolutionary history of AM fungi requires extensive ortholog sampling and analyses of genome and transcriptome data from a wide range of taxa. To circumvent the need for axenic culturing of AM fungi we gathered and combined genomic data from single nuclei to generate de novo genome assemblies covering seven families of AM fungi. Comparative analysis of the previously published Rhizophagus irregularis DAOM197198 assembly confirms that our novel workflow generates high-quality genome assemblies suitable for phylogenomic analysis. Predicted genes of our assemblies, together with published protein sequences of AM fungi and their sister clades, were used for phylogenomic analyses. Based on analyses of sets of orthologous genes, we highlight three alternative topologies among families of AM fungi. In the main topology, Glomerales is polyphyletic and Claroideoglomeraceae is the basal sister group to Glomeraceae and Diversisporales. Our results support family level classification from previous phylogenetic studies. New evolutionary relationships among families were highlighted with phylogenomic analysis using the hitherto most extensive taxon sampling for AM fungi.


2021 ◽  
Vol 2 ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Verena Esther Kutschera ◽  
Hanna Johannesson ◽  
...  

Morphological characters and nuclear ribosomal DNA (rDNA) phylogenies have so far been the basis of the current classifications of arbuscular mycorrhizal (AM) fungi. Improved understanding of the evolutionary history of AM fungi requires extensive ortholog sampling and analyses of genome and transcriptome data from a wide range of taxa. To circumvent the need for axenic culturing of AM fungi we gathered and combined genomic data from single nuclei to generate de novo genome assemblies covering seven families of AM fungi. We successfully sequenced the genomes of 15 AM fungal species for which genome data was not previously available. Comparative analysis of the previously published Rhizophagus irregularis DAOM197198 assembly confirms that our novel workflow generates genome assemblies suitable for phylogenomic analysis. Predicted genes of our assemblies, together with published protein sequences of AM fungi and their sister clades, were used for phylogenomic analyses. We evaluated the phylogenetic placement of Glomeromycota in relation to its sister phyla (Mucoromycota and Mortierellomycota), and found no support to reject a polytomy. Finally, we explored the phylogenetic relationships within Glomeromycota. Our results support family level classification from previous phylogenetic studies, and the polyphyly of the order Glomerales with Claroideoglomeraceae as the sister group to Glomeraceae and Diversisporales.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Andrea Crosino ◽  
Elisa Moscato ◽  
Marco Blangetti ◽  
Gennaro Carotenuto ◽  
Federica Spina ◽  
...  

Abstract: Short chain chitooligosaccharides (COs) are chitin derivative molecules involved in plant-fungus signaling during arbuscular mycorrhizal (AM) interactions. In host plants, COs activate a symbiotic signalling pathway that regulates AM-related gene expression. Furthermore, exogenous CO application was shown to promote AM establishment, with a major interest for agricultural applications of AM fungi as biofertilizers. Currently, the main source of commercial COs is from the shrimp processing industry, but purification costs and environmental concerns limit the convenience of this approach. In an attempt to find a low cost and low impact alternative, this work aimed to isolate, characterize and test the bioactivity of COs from selected strains of phylogenetically distant filamentous fungi: Pleurotus ostreatus, Cunninghamella bertholletiae and Trichoderma viride. Our optimized protocol successfully isolated short chain COs from lyophilized fungal biomass. Fungal COs were more acetylated and displayed a higher biological activity compared to shrimp-derived COs, a feature that, alongside low production costs, opens promising perspectives for the large scale use of COs in agriculture.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12129
Author(s):  
Paul E. Oluniyi ◽  
Fehintola Ajogbasile ◽  
Judith Oguzie ◽  
Jessica Uwanibe ◽  
Adeyemi Kayode ◽  
...  

Next generation sequencing (NGS)-based studies have vastly increased our understanding of viral diversity. Viral sequence data obtained from NGS experiments are a rich source of information: these data can be used to study viral epidemiology, evolution and transmission patterns, and can also inform drug and vaccine design. Viral genomes, however, represent a great challenge to bioinformatics due to their high mutation rates and the formation of quasispecies within the same infected host, bringing about the need to implement advanced bioinformatics tools to assemble consensus genomes well-representative of the viral population circulating in individual patients. Many tools have been developed to preprocess sequencing reads, carry out de novo or reference-assisted assembly of viral genomes and assess the quality of the genomes obtained. Most of these tools, however, exist as standalone workflows and usually require huge computational resources. Here we present VGEA (Viral Genomes Easily Analyzed), a Snakemake workflow for analyzing RNA viral genomes. VGEA enables users to map sequencing reads to the human genome to remove human contaminants, split bam files into forward and reverse reads, carry out de novo assembly of forward and reverse reads to generate contigs, pre-process reads for quality and contamination, map reads to a reference tailored to the sample using corrected contigs supplemented by the user’s choice of reference sequences and evaluate/compare genome assemblies. We designed a project with the aim of creating a flexible, easy-to-use and all-in-one pipeline from existing/stand-alone bioinformatics tools for viral genome analysis that can be deployed on a personal computer. 
VGEA was built on the Snakemake workflow management system and utilizes existing tools for each step: fastp (Chen et al., 2018) for read trimming and read-level quality control, BWA (Li & Durbin, 2009) for mapping sequencing reads to the human reference genome, SAMtools (Li et al., 2009) for extracting unmapped reads and also for splitting bam files into fastq files, IVA (Hunt et al., 2015) for de novo assembly to generate contigs, shiver (Wymant et al., 2018) to pre-process reads for quality and contamination, then map to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences, SeqKit (Shen et al., 2016) for cleaning shiver assembly for QUAST, QUAST (Gurevich et al., 2013) to evaluate/assess the quality of genome assemblies and MultiQC (Ewels et al., 2016) for aggregation of the results from fastp, BWA and QUAST. Our pipeline was successfully tested and validated with SARS-CoV-2 (n = 20), HIV-1 (n = 20) and Lassa Virus (n = 20) datasets all of which have been made publicly available. VGEA is freely available on GitHub at: https://github.com/pauloluniyi/VGEA under the GNU General Public License.
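
The order of tools described above can be summarized in a small Python sketch. This is only an illustrative outline of the steps as listed in the abstract, not the actual VGEA Snakefile; the `Step` type, the `PIPELINE` list and the `describe` helper are hypothetical names introduced here, and no command-line flags are implied.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """One stage of the workflow: the tool used and what it is used for."""
    tool: str
    purpose: str


# Tools and purposes as listed in the abstract; the list structure itself is
# an assumption made for illustration, not VGEA's real configuration.
PIPELINE = [
    Step("fastp", "read trimming and read-level quality control"),
    Step("BWA", "map reads to the human reference genome"),
    Step("SAMtools", "extract unmapped reads and split BAM into FASTQ"),
    Step("IVA", "de novo assembly into contigs"),
    Step("shiver", "pre-process reads and map to a sample-tailored reference"),
    Step("SeqKit", "clean the shiver assembly for QUAST"),
    Step("QUAST", "assess the quality of genome assemblies"),
    Step("MultiQC", "aggregate results from fastp, BWA and QUAST"),
]


def describe(pipeline):
    """Return one numbered, human-readable line per step."""
    return [f"{i}. {s.tool}: {s.purpose}" for i, s in enumerate(pipeline, 1)]


for line in describe(PIPELINE):
    print(line)
```

Keeping the steps in an ordered list mirrors how a workflow manager chains each tool's output into the next stage, although Snakemake itself resolves the order from file dependencies rather than from list position.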


1999 ◽  
Vol 65 (2) ◽  
pp. 718-723 ◽  
Author(s):  
C. Del Val ◽  
J. M. Barea ◽  
C. Azcón-Aguilar

ABSTRACT High concentrations of heavy metals have been shown to adversely affect the size, diversity, and activity of microbial populations in soil. The aim of this work was to determine how the diversity of arbuscular mycorrhizal (AM) fungi is affected by the addition of sewage-amended sludge containing heavy metals in a long-term experiment. Due to the reduced number of indigenous AM fungal (AMF) propagules in the experimental soils, several host plants with different life cycles were used to multiply indigenous fungi. Six AMF ecotypes were found in the experimental soils, showing consistent differences with regard to their tolerance to the presence of heavy metals. AMF ecotypes ranged from very sensitive to the presence of metals to relatively tolerant to high rates of heavy metals in soil. Total AMF spore numbers decreased with increasing amounts of heavy metals in the soil. However, species richness and diversity as measured by the Shannon-Wiener index increased in soils receiving intermediate rates of sludge contamination but decreased in soils receiving the highest rate of heavy-metal-contaminated sludge. Relative densities of most AMF species were also significantly influenced by soil treatments. Host plant species exerted a selective influence on AMF population size and diversity. We conclude based on the results of this study that size and diversity of AMF populations were modified in metal-polluted soils, even in those with metal concentrations that were below the upper limits accepted by the European Union for agricultural soils.
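
For readers unfamiliar with the Shannon-Wiener index mentioned above, the following is a minimal sketch of how it is usually computed, H' = -Σ p_i ln p_i, where p_i is the proportion of individuals belonging to taxon i. The function name `shannon_index` and the spore counts are hypothetical and purely illustrative; they are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one observed individual")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical spore counts per AMF ecotype (illustrative, not from the paper).
even = [10, 10, 10, 10]   # four equally abundant ecotypes
skewed = [37, 1, 1, 1]    # same richness, one dominant ecotype

print(round(shannon_index(even), 3))    # ln(4), i.e. 1.386
print(shannon_index(even) > shannon_index(skewed))  # True: evenness raises H'
```

The index rises with both the number of taxa and the evenness of their abundances, which is why richness and diversity can move together at intermediate contamination levels while total spore numbers fall.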


2019 ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Manfred Grabherr ◽  
Barbara Ellis ◽  
...  

Summary: A large proportion of Earth's biodiversity consists of organisms that cannot be cultured, have cryptic life cycles and/or live submerged within their substrates [1–4]. Genomic data are key to unraveling both their identity and function [5]. The development of metagenomic methods [6,7] and the advent of single cell sequencing [8–10] have revolutionized the study of life and function of cryptic organisms by eliminating the need for large and pure biological material, and allowing generation of genomic data from complex or limited environmental samples. Genome assemblies from metagenomic data have so far been restricted to organisms with small genomes, such as bacteria [11], archaea [12] and certain eukaryotes [13]. On the other hand, single cell technologies have allowed the targeting of unicellular organisms, attaining a better resolution than metagenomics [8,9,14–16]; moreover, they have allowed the genomic study of cells from complex organisms one cell at a time [17,18]. However, single cell genomics is not easily applied to multicellular organisms formed by consortia of diverse taxa, and the generation of specific workflows for sequencing and data analysis is needed to expand genomic research to the entire tree of life, including sponges [19], lichens [3,20], intracellular parasites [21,22], and plant endophytes [23,24]. Among the most important plant endophytes are the obligate mutualistic symbionts, arbuscular mycorrhizal (AM) fungi, which pose an additional challenge with their multinucleate coenocytic mycelia [25]. Here, the development of a novel single nuclei sequencing and assembly workflow is reported. This workflow allows, for the first time, the generation of reference genome assemblies from large scale, unbiased sorted, and sequenced AM fungal nuclei circumventing tedious, and often impossible, culturing efforts. 
This method opens infinite possibilities for studies of evolution and adaptation in these important plant symbionts and demonstrates that reference genomes can be generated from complex non-model organisms by isolating only a handful of their nuclei.


2021 ◽  
Vol 3 ◽  
Author(s):  
Sarah J. Sapsford ◽  
Trudy Paap ◽  
Giles E. St. J. Hardy ◽  
Treena I. Burgess

In forest ecosystems, habitat fragmentation negatively impacts stand structure and biodiversity; the resulting fragmented patches of forest have distinct, disturbed edge habitats that experience different environmental conditions than the interiors of the fragments. In southwest Western Australia, there is a large-scale decline of the keystone tree species Corymbia calophylla following fragmentation and land use change. These changes have altered stand structure and increased their susceptibility to an endemic fungal pathogen, Quambalaria coyrecup, which causes chronic canker disease especially along disturbed forest habitats. However, the impacts of fragmentation on belowground processes in this system are not well-understood. We examined the effects of fragmentation on abiotic soil properties and ectomycorrhizal (ECM) and arbuscular mycorrhizal (AM) fungal communities, and whether these belowground changes were drivers of disease incidence. We collected soil from 17 sites across the distribution range of C. calophylla. Soils were collected across a gradient from disturbed, diseased areas to undisturbed, disease-free areas. We analysed soil nutrients and grew C. calophylla plants as a bioassay host. Plants were harvested and roots collected after 6 months of growth. DNA was extracted from the roots, amplified using fungal specific primers and sequenced using Illumina MiSeq. Concentrations of key soil nutrients such as nitrogen, phosphorus and potassium were much higher along the disturbed, diseased edges in comparison to undisturbed areas. Disturbance altered the community composition of ECM and AM fungi; however, only ECM fungal communities had lower rarefied richness and diversity along the disturbed, diseased areas compared to undisturbed areas. Accounting for effects of disturbance, ECM fungal diversity and leaf litter depth were highly correlated with increased disease incidence in C. calophylla. 
In the face of global change, increased virulence of an endemic pathogen has emerged in this Mediterranean-type forest.


2016 ◽  
Author(s):  
Alan Medlar ◽  
Laura Laakso ◽  
Andreia Miraldo ◽  
Ari Löytynoja

Abstract: High-throughput RNA-seq data has become ubiquitous in the study of non-model organisms, but its use in comparative analysis remains a challenge. Without a reference genome for mapping, sequence data has to be de novo assembled, producing large numbers of short, highly redundant contigs. Preparing these assemblies for comparative analyses requires the removal of redundant isoforms, assignment of orthologs and converting fragmented transcripts into gene alignments. In this article we present Glutton, a novel tool to process transcriptome assemblies for downstream evolutionary analyses. Glutton takes as input a set of fragmented, possibly erroneous transcriptome assemblies. Utilising phylogeny-aware alignment and reference data from a closely related species, it reconstructs one transcript per gene, finds orthologous sequences and produces accurate multiple alignments of coding sequences. We present a comprehensive analysis of Glutton’s performance across a wide range of divergence times between study and reference species. We demonstrate the impact choice of assembler has on both the number of alignments and the correctness of ortholog assignment and show substantial improvements over heuristic methods, without sacrificing correctness. Finally, using inference of Darwinian selection as an example of downstream analysis, we show that Glutton-processed RNA-seq data give results comparable to those obtained from full length gene sequences even with distantly related reference species. Glutton is available from http://wasabiapp.org/software/glutton/ and is licensed under the GPLv3.


2017 ◽  
Author(s):  
Erik Garrison ◽  
Jouni Sirén ◽  
Adam M. Novak ◽  
Glenn Hickey ◽  
Jordan M. Eizenga ◽  
...  

Abstract: Reference genomes guide our interpretation of DNA sequence data. However, conventional linear references are fundamentally limited in that they represent only one version of each locus, whereas the population may contain multiple variants. When the reference represents an individual’s genome poorly, it can impact read mapping and introduce bias. Variation graphs are bidirected DNA sequence graphs that compactly represent genetic variation, including large scale structural variation such as inversions and duplications [1]. Equivalent structures are produced by de novo genome assemblers [2,3]. Here we present vg, a toolkit of computational methods for creating, manipulating, and utilizing these structures as references at the scale of the human genome. vg provides an efficient approach to mapping reads onto arbitrary variation graphs using generalized compressed suffix arrays [4], with improved accuracy over alignment to a linear reference, creating data structures to support downstream variant calling and genotyping. These capabilities make using variation graphs as reference structures for DNA sequencing practical at the scale of vertebrate genomes, or at the topological complexity of new species assemblies.
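
To make the idea of a sequence graph concrete, here is a minimal Python sketch of a graph that encodes a single-nucleotide variant. It is a deliberate simplification: it only models the forward strand, whereas the variation graphs described above are bidirected, and it does not use the vg toolkit or its file formats. The node ids, sequences and the `spell_paths` helper are hypothetical.

```python
# Each node carries a piece of sequence; edges list which nodes may follow.
# Nodes 2 and 3 are alternative alleles (T or C) at the same locus.
nodes = {1: "ACG", 2: "T", 3: "C", 4: "GGA"}
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}


def spell_paths(start, end):
    """Return every sequence spelled by a walk from start to end (depth-first)."""
    results = []
    stack = [(start, nodes[start])]
    while stack:
        node, seq = stack.pop()
        if node == end:
            results.append(seq)
            continue
        for nxt in edges[node]:
            stack.append((nxt, seq + nodes[nxt]))
    return sorted(results)


haplotypes = spell_paths(1, 4)
print(haplotypes)  # ['ACGCGGA', 'ACGTGGA']
```

A linear reference would store only one of the two spelled sequences; the graph stores the shared flanks once and both alleles side by side, which is the compactness the abstract refers to.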


2010 ◽  
Vol 9 (9) ◽  
pp. 1300-1310 ◽  
Author(s):  
Minou Nowrousian

ABSTRACT Over the past 5 years, large-scale sequencing has been revolutionized by the development of several so-called next-generation sequencing (NGS) technologies. These have drastically increased the number of bases obtained per sequencing run while at the same time decreasing the costs per base. Compared to Sanger sequencing, NGS technologies yield shorter read lengths; however, despite this drawback, they have greatly facilitated genome sequencing, first for prokaryotic genomes and within the last year also for eukaryotic ones. This advance was possible due to a concomitant development of software that allows the de novo assembly of draft genomes from large numbers of short reads. In addition, NGS can be used for metagenomics studies as well as for the detection of sequence variations within individual genomes, e.g., single-nucleotide polymorphisms (SNPs), insertions/deletions (indels), or structural variants. Furthermore, NGS technologies have quickly been adopted for other high-throughput studies that were previously performed mostly by hybridization-based methods like microarrays. This includes the use of NGS for transcriptomics (RNA-seq) or the genome-wide analysis of DNA/protein interactions (ChIP-seq). This review provides an overview of NGS technologies that are currently available and the bioinformatics analyses that are necessary to obtain information from the flood of sequencing data as well as applications of NGS to address biological questions in eukaryotic microorganisms.

