scholarly journals Reference Genome for the Highly Transformable Setaria viridis ME034V

2020 ◽  
Vol 10 (10) ◽  
pp. 3467-3478 ◽  
Author(s):  
Peter M. Thielen ◽  
Amanda L. Pendleton ◽  
Robert A. Player ◽  
Kenneth V. Bowden ◽  
Thomas J. Lawton ◽  
...  

Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis accession ME034V is exceptionally transformable, but the lack of a sequenced genome for this accession has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis accessions. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.

2020 ◽  
Author(s):  
Peter M. Thielen ◽  
Amanda L. Pendleton ◽  
Robert A. Player ◽  
Kenneth V. Bowden ◽  
Thomas J. Lawton ◽  
...  

ABSTRACTSetaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis cultivar ME034V is exceptionally transformable, but the lack of a sequenced genome for this cultivar has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50=41kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis cultivars. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Rashmi Jain ◽  
Jerry Jenkins ◽  
Shengqiang Shu ◽  
Mawsheng Chern ◽  
Joel A. Martin ◽  
...  

Abstract Background The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. Results Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. Conclusions The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.


2019 ◽  
Author(s):  
Raúl A. González-Pech ◽  
Timothy G. Stephens ◽  
Yibi Chen ◽  
Amin R. Mohamed ◽  
Yuanyuan Cheng ◽  
...  

AbstractSymbiodiniaceae are predominantly symbiotic dinoflagellates critical to corals and other reef organisms. Symbiodinium is a basal symbiodiniacean lineage and includes symbiotic and free-living taxa. However, the molecular mechanisms underpinning these distinct lifestyles remain little known. Here, we present high-quality de novo genome assemblies for the symbiotic Symbiodinium tridacnidorum CCMP2592 (genome size 1.3 Gbp) and the free-living Symbiodinium natans CCMP2548 (genome size 0.74 Gbp). These genomes display extensive sequence divergence, sharing only ~1.5% conserved regions (≥90% identity). We predicted 45,474 and 35,270 genes for S. tridacnidorum and S. natans, respectively; of the 58,541 homologous gene families, 28.5% are common to both genomes. We recovered a greater extent of gene duplication and higher abundance of repeats, transposable elements and pseudogenes in the genome of S. tridacnidorum than in that of S. natans. These findings demonstrate that genome structural rearrangements are pertinent to distinct lifestyles in Symbiodinium, and may contribute to the vast genetic diversity within the genus, and more broadly in Symbiodiniaceae. Moreover, the results from our whole-genome comparisons against a free-living outgroup support the notion that the symbiotic lifestyle is a derived trait in, and that the free-living lifestyle is ancestral to, Symbiodinium.


2021 ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Verena Esther Kutschera ◽  
Hanna Johannesson ◽  
...  

Morphological characters and nuclear ribosomal DNA (rDNA) phylogenies have so far been the basis of the current classifications of arbuscular mycorrhizal (AM) fungi. Improved understanding of the phylogeny and evolutionary history of AM fungi requires extensive ortholog sampling and analyses of genome and transcriptome data from a wide range of taxa. To circumvent the need for axenic culturing of AM fungi we gathered and combined genomic data from single nuclei to generate de novo genome assemblies covering seven families of AM fungi. Comparative analysis of the previously published Rhizophagus irregularis DAOM197198 assembly confirm that our novel workflow generates high-quality genome assemblies suitable for phylogenomic analysis. Predicted genes of our assemblies, together with published protein sequences of AM fungi and their sister clades, were used for phylogenomic analyses. Based on analyses of sets of orthologous genes, we highlight three alternative topologies among families of AM fungi. In the main topology, Glomerales is polyphyletic and Claroideoglomeraceae, is the basal sister group to Glomeraceae and Diversisporales. Our results support family level classification from previous phylogenetic studies. New evolutionary relationships among families where highlighted with phylogenomic analysis using the hitherto most extensive taxon sampling for AM fungi.


2020 ◽  
Vol 12 (3) ◽  
pp. 185-202
Author(s):  
Xia Han ◽  
Jindan Guo ◽  
Erli Pang ◽  
Hongtao Song ◽  
Kui Lin

Abstract How have genes evolved within a well-known genome phylogeny? Many protein-coding genes should have evolved as a whole at the gene level, and some should have evolved partly through fragments at the subgene level. To comprehensively explore such complex homologous relationships and better understand gene family evolution, here, with de novo-identified modules, the subgene units which could consecutively cover proteins within a set of closely related species, we applied a new phylogeny-based approach that considers evolutionary models with partial homology to classify all protein-coding genes in nine Drosophila genomes. Compared with two other popular methods for gene family construction, our approach improved practical gene family classifications with a more reasonable view of homology and provided a much more complete landscape of gene family evolution at the gene and subgene levels. In the case study, we found that most expanded gene families might have evolved mainly through module rearrangements rather than gene duplications and mainly generated single-module genes through partial gene duplication, suggesting that there might be pervasive subgene rearrangement in the evolution of protein-coding gene families. The use of a phylogeny-based approach with partial homology to classify and analyze protein-coding gene families may provide us with a more comprehensive landscape depicting how genes evolve within a well-known genome phylogeny.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Raúl A. González-Pech ◽  
Timothy G. Stephens ◽  
Yibi Chen ◽  
Amin R. Mohamed ◽  
Yuanyuan Cheng ◽  
...  

Abstract Background Dinoflagellates in the family Symbiodiniaceae are important photosynthetic symbionts in cnidarians (such as corals) and other coral reef organisms. Breakdown of the coral-dinoflagellate symbiosis due to environmental stress (i.e. coral bleaching) can lead to coral death and the potential collapse of reef ecosystems. However, evolution of Symbiodiniaceae genomes, and its implications for the coral, is little understood. Genome sequences of Symbiodiniaceae remain scarce due in part to their large genome sizes (1–5 Gbp) and idiosyncratic genome features. Results Here, we present de novo genome assemblies of seven members of the genus Symbiodinium, of which two are free-living, one is an opportunistic symbiont, and the remainder are mutualistic symbionts. Integrating other available data, we compare 15 dinoflagellate genomes revealing high sequence and structural divergence. Divergence among some Symbiodinium isolates is comparable to that among distinct genera of Symbiodiniaceae. We also recovered hundreds of gene families specific to each lineage, many of which encode unknown functions. An in-depth comparison between the genomes of the symbiotic Symbiodinium tridacnidorum (isolated from a coral) and the free-living Symbiodinium natans reveals a greater prevalence of transposable elements, genetic duplication, structural rearrangements, and pseudogenisation in the symbiotic species. Conclusions Our results underscore the potential impact of lifestyle on lineage-specific gene-function innovation, genome divergence, and the diversification of Symbiodinium and Symbiodiniaceae. The divergent features we report, and their putative causes, may also apply to other microbial eukaryotes that have undergone symbiotic phases in their evolutionary history.


2021 ◽  
Author(s):  
Lauren Coombe ◽  
Janet X Li ◽  
Theodora Lo ◽  
Johnathan Wong ◽  
Vladimir Nikolic ◽  
...  

Background Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads. Results LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which includes initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 2.0-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to state-of-the-art long-read scaffolder LRScaf in most tests, and consistently runs in under five hours using less than 23GB of RAM. Conclusions Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch.


2021 ◽  
Vol 2 ◽  
Author(s):  
Merce Montoliu-Nerin ◽  
Marisol Sánchez-García ◽  
Claudia Bergin ◽  
Verena Esther Kutschera ◽  
Hanna Johannesson ◽  
...  

Morphological characters and nuclear ribosomal DNA (rDNA) phylogenies have so far been the basis of the current classifications of arbuscular mycorrhizal (AM) fungi. Improved understanding of the evolutionary history of AM fungi requires extensive ortholog sampling and analyses of genome and transcriptome data from a wide range of taxa. To circumvent the need for axenic culturing of AM fungi we gathered and combined genomic data from single nuclei to generate de novo genome assemblies covering seven families of AM fungi. We successfully sequenced the genomes of 15 AM fungal species for which genome data was not previously available. Comparative analysis of the previously published Rhizophagus irregularis DAOM197198 assembly confirm that our novel workflow generates genome assemblies suitable for phylogenomic analysis. Predicted genes of our assemblies, together with published protein sequences of AM fungi and their sister clades, were used for phylogenomic analyses. We evaluated the phylogenetic placement of Glomeromycota in relation to its sister phyla (Mucoromycota and Mortierellomycota), and found no support to reject a polytomy. Finally, we explored the phylogenetic relationships within Glomeromycota. Our results support family level classification from previous phylogenetic studies, and the polyphyly of the order Glomerales with Claroideoglomeraceae as the sister group to Glomeraceae and Diversisporales.


2021 ◽  
Vol 118 (42) ◽  
pp. e2025711118
Author(s):  
Quanjun Hu ◽  
Yazhen Ma ◽  
Terezie Mandáková ◽  
Sheng Shi ◽  
Chunlin Chen ◽  
...  

Deserts exert strong selection pressures on plants, but the underlying genomic drivers of ecological adaptation and subsequent speciation remain largely unknown. Here, we generated de novo genome assemblies and conducted population genomic analyses of the psammophytic genus Pugionium (Brassicaceae). Our results indicated that this bispecific genus had undergone an allopolyploid event, and the two parental genomes were derived from two ancestral lineages with different chromosome numbers and structures. The postpolyploid expansion of gene families related to abiotic stress responses and lignin biosynthesis facilitated environmental adaptations of the genus to desert habitats. Population genomic analyses of both species further revealed their recent divergence with continuous gene flow, and the most divergent regions were found to be centered on three highly structurally reshuffled chromosomes. Genes under selection in these regions, which were mainly located in one of the two subgenomes, contributed greatly to the interspecific divergence in microhabitat adaptation.


2019 ◽  
Author(s):  
Thomas Hackl ◽  
Roman Martin ◽  
Karina Barenhoff ◽  
Sarah Duponchel ◽  
Dominik Heider ◽  
...  

AbstractThe heterotrophic stramenopile Cafeteria roenbergensis is a globally distributed marine bacterivorous protist. This unicellular flagellate is host to the giant DNA virus CroV and the virophage mavirus. We sequenced the genomes of four cultured C. roenbergensis strains and generated 23.53 Gb of Illumina MiSeq data (99-282 × coverage per strain) and 5.09 Gb of PacBio RSII data (13-54 × coverage). Using the Canu assembler and customized curation procedures, we obtained high-quality draft genome assemblies with a total length of 34-36 Mbp per strain and contig N50 lengths of 148 kbp to 464 kbp. The C. roenbergensis genome has a GC content of ~70%, a repeat content of ~28%, and is predicted to contain approximately 7857-8483 protein-coding genes based on a combination of de novo, homology-based and transcriptome-supported annotation. These first high-quality genome assemblies of a Bicosoecid fill an important gap in sequenced Stramenopile representatives and enable a more detailed evolutionary analysis of heterotrophic protists.


Sign in / Sign up

Export Citation Format

Share Document