The Gossypium longicalyx genome as a resource for cotton breeding and evolution

2020 ◽  
Author(s):  
Corrinne E. Grover ◽  
Mengqiao Pan ◽  
Daojun Yuan ◽  
Mark A. Arick ◽  
Guanjing Hu ◽  
...  

Abstract Cotton is an important crop that has made significant gains in production over the last century. Emerging pests such as the reniform nematode have threatened cotton production. The rare African diploid species Gossypium longicalyx is a wild species that has been used as an important source of reniform nematode immunity. While mapping and breeding efforts have made some strides in transferring this immunity to the cultivated polyploid species, the complexities of interploidal transfer combined with substantial linkage drag have inhibited progress in this area. Moreover, this species shares its most recent common ancestor with the cultivated A-genome diploid cottons, thereby providing insight into the evolution of long, spinnable fiber. Here we report a newly generated de novo genome assembly of G. longicalyx. This high-quality genome leveraged a combination of PacBio long-read technology, Hi-C chromatin conformation capture, and BioNano optical mapping to achieve a chromosome level assembly. The utility of the G. longicalyx genome for understanding reniform immunity and fiber evolution is discussed.

2020 ◽  
Vol 10 (5) ◽  
pp. 1457-1467 ◽  
Author(s):  
Corrinne E. Grover ◽  
Mengqiao Pan ◽  
Daojun Yuan ◽  
Mark A. Arick ◽  
Guanjing Hu ◽  
...  

Cotton is an important crop that has made significant gains in production over the last century. Emerging pests such as the reniform nematode have threatened cotton production. The rare African diploid species Gossypium longicalyx is a wild species that has been used as an important source of reniform nematode immunity. While mapping and breeding efforts have made some strides in transferring this immunity to the cultivated polyploid species, the complexities of interploidal transfer combined with substantial linkage drag have inhibited progress in this area. Moreover, this species shares its most recent common ancestor with the cultivated A-genome diploid cottons, thereby providing insight into the evolution of long, spinnable fiber. Here we report a newly generated de novo genome assembly of G. longicalyx. This high-quality genome leveraged a combination of PacBio long-read technology, Hi-C chromatin conformation capture, and BioNano optical mapping to achieve a chromosome level assembly. The utility of the G. longicalyx genome for understanding reniform immunity and fiber evolution is discussed.


2005 ◽  
Vol 79 (3) ◽  
pp. 1595-1604 ◽  
Author(s):  
Leen Vijgen ◽  
Els Keyaerts ◽  
Elien Moës ◽  
Inge Thoelen ◽  
Elke Wollants ◽  
...  

ABSTRACT Coronaviruses are enveloped, positive-stranded RNA viruses with a genome of approximately 30 kb. Based on genetic similarities, coronaviruses are classified into three groups. Two group 2 coronaviruses, human coronavirus OC43 (HCoV-OC43) and bovine coronavirus (BCoV), show remarkable antigenic and genetic similarities. In this study, we report the first complete genome sequence (30,738 nucleotides) of the prototype HCoV-OC43 strain (ATCC VR759). Complete genome and open reading frame (ORF) analyses were performed in comparison to the BCoV genome. In the region between the spike and membrane protein genes, a 290-nucleotide deletion is present, corresponding to the absence of BCoV ORFs ns4.9 and ns4.8. Nucleotide and amino acid similarity percentages were determined for the major HCoV-OC43 ORFs and for those of other group 2 coronaviruses. The highest degree of similarity is demonstrated between HCoV-OC43 and BCoV in all ORFs with the exception of the E gene. Molecular clock analysis of the spike gene sequences of BCoV and HCoV-OC43 suggests a relatively recent zoonotic transmission event and dates their most recent common ancestor to around 1890. An evolutionary rate in the order of 4 × 10⁻⁴ nucleotide changes per site per year was estimated. This is the first animal-human zoonotic pair of coronaviruses that can be analyzed in order to gain insights into the processes of adaptation of a nonhuman coronavirus to a human host, which is important for understanding the interspecies transmission events that led to the origin of the severe acute respiratory syndrome outbreak.


2019 ◽  
Author(s):  
Xun Xu ◽  
Song Ge ◽  
Fu-min Zhang

Abstract Background: Reciprocal gene loss (RGL) of duplicate genes is an important genetic resource of reproductive isolation, which is essential for speciation. In the past decades, various RGL patterns have been revealed, but RGL process is still poorly understood. The RGL of the duplicate DOPPELGANGER1 (DPL1) and DOPPELGANGER2 (DPL2) gene can lead to BDM-type hybrid incompatibility between two rice subspecies. The evolutionary history of the duplicate genes, including their origin and mechanism of duplication as well as their evolutionary divergence after the duplication, remains unclear. In this study, we investigated the evolutionary history of the duplicate genes for gaining insights into the process of RGL. Results: We reconstructed phylogenetic relationships of DPL copies from all 15 diploid species representing six genome types of rice genus and then found that all the DPL copies from the latest diverged A- and B-genome gather into one monophyletic clade. Southern blot analysis also detected definitely two DPL copies only in A- and B-genome. High conserved collinearity can be observed between A- and B-genomic segments containing DPL1 and DPL2 respectively but not between DPL1 and DPL2 segments. Investigations of transposon elements indicated that DPL duplication is related to DNA transposons. Likelihood-based analyses with branch models showed a relaxation of selective constraint in DPL1 lineage but an enhancement in DPL2 lineage after DPL duplication. Sequence analysis also indicated that quite a few defective DPL1 can be found in 6 wild and cultivated species out of all 8 species of A-genome but only one defective DPL2 occurs in a cultivated rice subspecies. Conclusions: DPL duplication of rice originated in the recent common ancestor of A- and B-genome about 6.76 million years ago and the duplication was possibly caused by DNA transposons. The DPL1 is a redundant copy and has been in the process of pseudogenization, suggesting that artificial selection may play an important role in forming the RGL of DPLs between two rice subspecies during the domestication.


2019 ◽  
Author(s):  
Zhoutao Chen ◽  
Long Pham ◽  
Tsai-Chin Wu ◽  
Guoya Mo ◽  
Yu Xia ◽  
...  

Abstract Long-range sequencing information is required for haplotype phasing, de novo assembly and structural variation detection. Current long-read sequencing technologies can provide valuable long-range information but at a high cost with low accuracy and high DNA input requirement. We have developed a single-tube Transposase Enzyme Linked Long-read Sequencing (TELL-Seq™) technology, which enables a low-cost, high-accuracy and high-throughput short-read next generation sequencer to routinely generate over 100 Kb long-range sequencing information with as little as 0.1 ng input material. In a PCR tube, millions of clonally barcoded beads are used to uniquely barcode long DNA molecules in an open bulk reaction without dilution and compartmentation. The barcode linked reads are used to successfully assemble genomes ranging from microbes to human. These linked-reads also generate mega-base-long phased blocks and provide a cost-effective tool for detecting structural variants in a genome, which are important to identify compound heterozygosity in recessive Mendelian diseases and discover genetic drivers and diagnostic biomarkers in cancers.


2019 ◽  
Vol 9 (10) ◽  
pp. 3079-3085 ◽  
Author(s):  
Joshua A. Udall ◽  
Evan Long ◽  
Chris Hanson ◽  
Daojun Yuan ◽  
Thiruvarangan Ramaraj ◽  
...  

Cotton is an agriculturally important crop. Because of its importance, a genome sequence of a diploid cotton species (Gossypium raimondii, D-genome) was first assembled using Sanger sequencing data in 2012. Improvements to DNA sequencing technology have improved accuracy and correctness of assembled genome sequences. Here we report a new de novo genome assembly of G. raimondii and its close relative G. turneri. The two genomes were assembled to a chromosome level using PacBio long-read technology, HiC, and Bionano optical mapping. This report corrects some minor assembly errors found in the Sanger assembly of G. raimondii. We also compare the genome sequences of these two species for gene composition, repetitive element composition, and collinearity. Most of the identified structural rearrangements between these two species are due to intra-chromosomal inversions. More inversions were found in the G. turneri genome sequence than the G. raimondii genome sequence. These findings and updates to the D-genome sequence will improve accuracy and translation of genomics to cotton breeding and genetics.


2020 ◽  
Vol 37 (5) ◽  
pp. 1306-1316 ◽  
Author(s):  
Yoshiaki Yasumizu ◽  
Saori Sakaue ◽  
Takahiro Konuma ◽  
Ken Suzuki ◽  
Koichi Matsuda ◽  
...  

Abstract Elucidation of natural selection signatures and relationships with phenotype spectra is important to understand adaptive evolution of modern humans. Here, we conducted a genome-wide scan of selection signatures of the Japanese population by estimating locus-specific time to the most recent common ancestor using the ascertained sequentially Markovian coalescent (ASMC), from the biobank-based large-scale genome-wide association study data of 170,882 subjects. We identified 29 genetic loci with selection signatures satisfying the genome-wide significance. The signatures were most evident at the alcohol dehydrogenase (ADH) gene cluster locus at 4q23 (P_ASMC = 2.2 × 10⁻³⁶), followed by relatively strong selection at the FAM96A (15q22), MYOF (10q23), 13q21, GRIA2 (4q32), and ASAP2 (2p25) loci (P_ASMC < 1.0 × 10⁻¹⁰). The additional analysis interrogating extended haplotypes (integrated haplotype score) showed robust concordance of the detected signatures, contributing to fine-mapping of the genes, and provided allelic directional insights into selection pressure (e.g., positive selection for ADH1B-Arg48His and HLA-DPB1*04:01). The phenome-wide selection enrichment analysis with the trait-associated variants identified a variety of the modern human phenotypes involved in the adaptation of Japanese. We observed population-specific evidence of enrichment with the alcohol-related phenotypes, anthropometric and biochemical clinical measurements, and immune-related diseases, differently from the findings in Europeans using the UK Biobank resource. Our study demonstrated population-specific features of the selection signatures in Japanese, highlighting a value of the natural selection study using the nation-wide biobank-scale genome and phenotype data.


2013 ◽  
Vol 79 (22) ◽  
pp. 7006-7012 ◽  
Author(s):  
Nicholas C. Butzin ◽  
Michael A. Secinaro ◽  
Kristen S. Swithers ◽  
J. Peter Gogarten ◽  
Kenneth M. Noll

ABSTRACT We recently reported that the Thermotogales acquired the ability to synthesize vitamin B12 by acquisition of genes from two distantly related lineages, Archaea and Firmicutes (K. S. Swithers et al., Genome Biol. Evol. 4:730–739, 2012). Ancestral state reconstruction suggested that the cobinamide salvage gene cluster was present in the Thermotogales' most recent common ancestor. We also predicted that Thermotoga lettingae could not synthesize B12 de novo but could use the cobinamide salvage pathway to synthesize B12. In this study, these hypotheses were tested, and we found that Tt. lettingae did not synthesize B12 de novo but salvaged cobinamide. The growth rate of Tt. lettingae increased with the addition of B12 or cobinamide to its medium. It synthesized B12 when the medium was supplemented with cobinamide, and no B12 was detected in cells grown on cobinamide-deficient medium. Upstream of the cobinamide salvage genes is a putative B12 riboswitch. In other organisms, B12 riboswitches allow for higher transcriptional activity in the absence of B12. When Tt. lettingae was grown with no B12, the salvage genes were upregulated compared to cells grown with B12 or cobinamide. Another gene cluster with a putative B12 riboswitch upstream is the btuFCD ABC transporter, and it showed a transcription pattern similar to that of the cobinamide salvage genes. The BtuF proteins from species that can and cannot salvage cobinamides were shown in vitro to bind both B12 and cobinamide. These results suggest that Thermotogales species can use the BtuFCD transporter to import both B12 and cobinamide, even if they cannot salvage cobinamide.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1391
Author(s):  
Evan Biederstedt ◽  
Jeffrey C. Oliver ◽  
Nancy F. Hansen ◽  
Aarti Jajoo ◽  
Nathan Dunn ◽  
...  

Genome graphs are emerging as an important novel approach to the analysis of high-throughput human sequencing data. By explicitly representing genetic variants and alternative haplotypes in a mappable data structure, they can enable the improved analysis of structurally variable and hyperpolymorphic regions of the genome. In most existing approaches, graphs are constructed from variant call sets derived from short-read sequencing. As long-read sequencing becomes more cost-effective and enables de novo assembly for increasing numbers of whole genomes, a method for the direct construction of a genome graph from sets of assembled human genomes would be desirable. Such assembly-based genome graphs would encompass the wide spectrum of genetic variation accessible to long-read-based de novo assembly, including large structural variants and divergent haplotypes. Here we present NovoGraph, a method for the construction of a human genome graph directly from a set of de novo assemblies. NovoGraph constructs a genome-wide multiple sequence alignment of all input contigs and creates a graph by merging the input sequences at positions that are both homologous and sequence-identical. NovoGraph outputs resulting graphs in VCF format that can be loaded into third-party genome graph toolkits. To demonstrate NovoGraph, we construct a genome graph with 23,478,835 variant sites and 30,582,795 variant alleles from de novo assemblies of seven ethnically diverse human genomes (AK1, CHM1, CHM13, HG003, HG004, HX1, NA19240). Initial evaluations show that mapping against the constructed graph reduces the average mismatch rate of reads from sample NA12878 by approximately 0.2%, albeit at a slightly increased rate of reads that remain unmapped.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dengfeng Guan ◽  
Shane A. McCarthy ◽  
Zemin Ning ◽  
Guohua Wang ◽  
Yadong Wang ◽  
...  

Abstract Background Efficient and effective genome scaffolding tools are still in high demand for generating reference-quality assemblies. While long read data itself is unlikely to create a chromosome-scale assembly for most eukaryotic species, the inexpensive Hi-C sequencing technology, capable of capturing the chromosomal profile of a genome, is now widely used to complete the task. However, the existing Hi-C based scaffolding tools either require a priori chromosome number as input, or lack the ability to build highly continuous scaffolds. Results We design and develop a novel Hi-C based scaffolding tool, pin_hic, which takes advantage of contact information from Hi-C reads to construct a scaffolding graph iteratively based on N-best neighbors of contigs. Subsequent to scaffolding, it identifies potential misjoins and breaks them to keep the scaffolding accuracy. Through our tests on three long read based de novo assemblies from three different species, we demonstrate that pin_hic is more efficient than current standard state-of-art tools, and it can generate much more continuous scaffolds, while achieving a higher or comparable accuracy. Conclusions Pin_hic is an efficient Hi-C based scaffolding tool, which can be useful for building chromosome-scale assemblies. As many sequencing projects have been launched in the recent years, we believe pin_hic has potential to be applied in these projects and makes a meaningful contribution.


Pathogens ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 822
Author(s):  
Hee Jin Kwon ◽  
Zhao Chen ◽  
Peter Evans ◽  
Jianghong Meng ◽  
Yi Chen

Recently developed nanopore sequencing technologies offer a unique opportunity to rapidly close the genome and to identify complete sequences of mobile genetic elements (MGEs). In this study, 17 isolates of Listeria monocytogenes (Lm) epidemic clone II (ECII) from seven ready-to-eat meat or poultry processing facilities, not known to be associated with outbreaks, were shotgun sequenced, and among them, five isolates were further subjected to long-read sequencing. Additionally, 26 genomes of Lm ECII isolates associated with three listeriosis outbreaks in the U.S. and South Africa were obtained from the National Center for Biotechnology Information (NCBI) database and analyzed to evaluate if MGEs may be used as a high-resolution genetic marker for identifying and sourcing the origin of Lm. The analyses identified four comK prophages in 11 non-outbreak isolates from four facilities and three comK prophages in 20 isolates associated with two outbreaks that occurred in the U.S. In addition, three different plasmids were identified among 10 non-outbreak isolates and 14 outbreak isolates. Each comK prophage and plasmid was conserved among the isolates sharing it. Different prophages from different facilities or outbreaks had significant genetic variations, possibly due to horizontal gene transfer. Phylogenetic analysis showed that isolates from the same facility or the same outbreak always closely clustered. The time of most recent common ancestor of the Lm ECII isolates was estimated to be in March 1816 with the average nucleotide substitution rate of 3.1 × 10⁻⁷ substitutions per site per year. This study showed that complete MGE sequences provide a good signal to determine the genetic relatedness of Lm isolates, to identify persistence or repeated contamination that occurred within food processing environment, and to study the evolutionary history among closely related isolates.

