The Gossypium anomalum genome as a resource for cotton improvement and evolutionary analysis of hybrid incompatibility

Author(s):  
Corrinne E Grover ◽  
Daojun Yuan ◽  
Mark A Arick ◽  
Emma R Miller ◽  
Guanjing Hu ◽  
...  

Abstract Cotton is an important crop that has been the beneficiary of multiple genome sequencing efforts, including diverse representatives of wild species for germplasm development. Gossypium anomalum is a wild African diploid species that harbors stress-resistance and fiber-related traits with potential application to modern breeding efforts. In addition, this species is a natural source of cytoplasmic male sterility and a resource for understanding hybrid lethality in the genus. Here, we report a high-quality de novo genome assembly for G. anomalum and characterize this genome relative to existing genome sequences in cotton. In addition, we use the synthetic allopolyploids 2(A2D1) and 2(A2D3) to discover regions in the G. anomalum genome potentially involved in hybrid lethality, a possibility enabled by introgression of regions homologous to the D3 (Gossypium davidsonii) lethality loci into the synthetic 2(A2D3) allopolyploid.

2021 ◽  
Author(s):  
Corrinne E. Grover ◽  
Daojun Yuan ◽  
Mark A. Arick ◽  
Emma R. Miller ◽  
Guanjing Hu ◽  
...  

Cotton is an important crop that has been the beneficiary of multiple genome sequencing efforts, including diverse representatives of wild species for germplasm development. Gossypium anomalum is a wild African diploid species that harbors stress-resistance and fiber-related traits with potential application to modern breeding efforts. In addition, this species is a natural source of cytoplasmic male sterility and a resource for understanding hybrid lethality in the genus. Here we report a high-quality de novo genome assembly for G. anomalum and characterize this genome relative to existing genome sequences in cotton. In addition, we use the synthetic allopolyploids 2(A2D1) and 2(A2D3) to discover regions in the G. anomalum genome potentially involved in hybrid lethality, a possibility enabled by introgression of regions homologous to the D3 (G. davidsonii) lethality loci into the synthetic 2(A2D3) allopolyploid.


GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Sarah B Kingan ◽  
Julie Urban ◽  
Christine C Lambert ◽  
Primo Baybayan ◽  
Anna K Childers ◽  
...  

Abstract Background: A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female spotted lanternfly (Lycorma delicatula) using a single Pacific Biosciences SMRT Cell. The spotted lanternfly is an invasive species recently discovered in the northeastern United States that threatens to damage economically important crop plants in the region. Results: The DNA from 1 individual was used to make 1 standard, size-selected library with an average DNA fragment size of ∼20 kb. The library was run on 1 Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing ∼36× coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Furthermore, it was possible to segregate more than half of the diploid genome into the 2 separate haplotypes. The assembly also recovered 2 microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. Conclusions: We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
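
The coverage figure quoted in the Results can be checked with simple arithmetic, and contig N50 is a standard summary of contiguity. The following is a minimal Python sketch, not code from the paper: the function names and the toy contig lengths are hypothetical, and only the 82 Gb and 2.3 Gb figures are taken from the abstract.

```python
def coverage(total_bases, genome_size):
    """Mean sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one contig length")


# Figures from the abstract: 82 Gb of unique-molecule reads, 2.3 Gb assembly.
depth = coverage(82e9, 2.3e9)
print(f"approximate coverage: {depth:.1f}x")

# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print("toy N50:", n50(toy_contigs))
```

With the abstract's numbers the depth rounds to about 36×, which agrees with the reported value.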


2014 ◽  
Vol 30 (19) ◽  
pp. 2709-2716 ◽  
Author(s):  
Sagar M. Utturkar ◽  
Dawn M. Klingeman ◽  
Miriam L. Land ◽  
Christopher W. Schadt ◽  
Mitchel J. Doktycz ◽  
...  

Author(s):  
Xinhai Ye ◽  
Yi Yang ◽  
Zhaoyang Tian ◽  
Le Xu ◽  
Kaili Yu ◽  
...  

Abstract Sequencing and assembling a genome from a single individual has several advantages, such as lower heterozygosity and easier sample preparation. However, the amount of genomic DNA in some small organisms might not meet the standard DNA input requirement of current sequencing pipelines. Although a few studies have sequenced a single small insect with about 100 ng of DNA as input, obtaining that amount of DNA from a single individual may still be challenging for many small organisms. Here, we use 20 ng of DNA as input and present a high-quality genome assembly for a single haploid male parasitoid wasp (Habrobracon hebetor) using Nanopore and Illumina sequencing. Because of the low DNA input, a whole genome amplification (WGA) method is used before sequencing. The assembled genome size is 131.6 Mb with a contig N50 of 1.63 Mb. A total of 99% of Benchmarking Universal Single-Copy Orthologs are detected, suggesting a high level of completeness of the genome assembly. Genome comparison between H. hebetor and its relative Bracon brevicornis shows a high level of genome synteny, indicating that the genome of H. hebetor is highly accurate and contiguous. Our study provides an example of de novo genome assembly from ultra-low input DNA, and will be useful for sequencing projects of small species and rare samples, haploid genomics, and population genetics of small species.


2018 ◽  
Author(s):  
Jolene T. Sutton ◽  
Martin Helmkampf ◽  
Cynthia C. Steiner ◽  
M. Renee Bellinger ◽  
Jonas Korlach ◽  
...  

Abstract Genome-level data can provide researchers with unprecedented precision to examine the causes and genetic consequences of population declines, and to apply these results to conservation management. Here we present a high-quality, long-read, de novo genome assembly for one of the world’s most endangered bird species, the Alala. As the only remaining native crow species in Hawaii, the Alala survived solely in a captive breeding program from 2002 until 2016, at which point a long-term reintroduction program was initiated. The high-quality genome assembly was generated to lay the foundation for both comparative genomics studies and the development of population-level genomic tools that will aid conservation and recovery efforts. We illustrate how the quality of this assembly places it amongst the very best avian genomes assembled to date, comparable to intensively studied model systems. We describe the genome architecture in terms of repetitive elements and runs of homozygosity, and we show that compared with more outbred species, the Alala genome is substantially more homozygous. We also provide annotations for a subset of immunity genes that are likely to be important for conservation applications, and we discuss how this genome is currently being used as a roadmap for downstream conservation applications.


Rice Science ◽  
2021 ◽  
Vol 28 (2) ◽  
pp. 109-113
Author(s):  
Li Fangping ◽  
Gao Yanhao ◽  
Wu Bingqi ◽  
Cai Qingpei ◽  
Zhan Pengling ◽  
...  

2016 ◽  
Author(s):  
Robert Vaser ◽  
Ivan Sović ◽  
Niranjan Nagarajan ◽  
Mile Šikić

The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error correction and consensus generation steps to obtain high-quality assemblies. We show that the error correction step can be omitted and high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial order alignment-based stand-alone consensus module called Racon. Based on tests with PacBio and Oxford Nanopore datasets, we show that Racon coupled with Miniasm enables consensus genomes with similar or better quality than state-of-the-art methods while being an order of magnitude faster. Racon is available open source under the MIT license at https://github.com/isovic/racon.git.


2021 ◽  
Author(s):  
Lauren Coombe ◽  
Janet X Li ◽  
Theodora Lo ◽  
Johnathan Wong ◽  
Vladimir Nikolic ◽  
...  

Background: Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads. Results: LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which include initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 2.0-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to state-of-the-art long-read scaffolder LRScaf in most tests, and consistently runs in under five hours using less than 23 GB of RAM. Conclusions: Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch.


2021 ◽  
Author(s):  
Andrea Minio ◽  
Noe Cochetel ◽  
Amanda M Vondras ◽  
Melanie Massonnet ◽  
Dario Cantu

De novo genome assembly is essential for genomic research. High-quality genomes assembled into phased pseudomolecules are challenging to produce and often contain assembly errors caused by repeats, heterozygosity, or the chosen assembly strategy. Although algorithms exist that produce partially phased assemblies, haploid draft assemblies that may lack biological information remain favored because they are easier to generate and use. We developed HaploSync, a suite of tools that produces fully phased, chromosome-scale diploid genome assemblies and performs extensive quality control to limit assembly artifacts. HaploSync uses a genetic map and/or the genome of a closely related species to guide the scaffolding of a diploid assembly into phased pseudomolecules for each chromosome. It compares alternative haplotypes to identify and correct misassemblies independent of a reference, fills assembly gaps with unplaced sequences, and resolves collapsed homozygous regions. In a series of plant, fungal, and animal kingdom case studies, we demonstrate that HaploSync increases the assembly contiguity of phased chromosomes, improves completeness by filling gaps, corrects scaffolding, and correctly phases highly heterozygous, complex regions.


2019 ◽  
Author(s):  
Thomas Hackl ◽  
Roman Martin ◽  
Karina Barenhoff ◽  
Sarah Duponchel ◽  
Dominik Heider ◽  
...  

Abstract The heterotrophic stramenopile Cafeteria roenbergensis is a globally distributed marine bacterivorous protist. This unicellular flagellate is host to the giant DNA virus CroV and the virophage mavirus. We sequenced the genomes of four cultured C. roenbergensis strains and generated 23.53 Gb of Illumina MiSeq data (99-282× coverage per strain) and 5.09 Gb of PacBio RSII data (13-54× coverage). Using the Canu assembler and customized curation procedures, we obtained high-quality draft genome assemblies with a total length of 34-36 Mbp per strain and contig N50 lengths of 148 kbp to 464 kbp. The C. roenbergensis genome has a GC content of ~70%, a repeat content of ~28%, and is predicted to contain approximately 7857-8483 protein-coding genes based on a combination of de novo, homology-based and transcriptome-supported annotation. These first high-quality genome assemblies of a Bicosoecid fill an important gap in sequenced Stramenopile representatives and enable a more detailed evolutionary analysis of heterotrophic protists.

