De Novo Assembly of the Northern Cardinal (Cardinalis cardinalis) Genome Reveals Candidate Regulatory Regions for Sexually Dichromatic Red Plumage Coloration

2020 ◽  
Author(s):  
Simon Yung Wa Sin ◽  
Lily Lu ◽  
Scott V. Edwards

Northern cardinals (Cardinalis cardinalis) are common, mid-sized passerines widely distributed in North America. As an iconic species with strong sexual dichromatism, the northern cardinal has been the focus of extensive ecological and evolutionary research, yet genomic studies investigating the evolution of genotype–phenotype association of plumage coloration and dichromatism are lacking. Here we present a new, highly contiguous assembly for C. cardinalis. We generated a 1.1 Gb assembly comprising 4,762 scaffolds, with a scaffold N50 of 3.6 Mb, a contig N50 of 114.4 kb and a longest scaffold of 19.7 Mb. We identified 93.5% complete and single-copy orthologs from an Aves dataset using BUSCO, demonstrating high completeness of the genome assembly. We annotated the genomic region comprising the CYP2J19 gene, which plays a pivotal role in red coloration in birds. Comparative analyses demonstrated non-exonic regions unique to the CYP2J19 gene in passerines and a long insertion upstream of the gene in C. cardinalis. Transcription factor binding motifs discovered in the unique insertion region in C. cardinalis suggest potential androgen-regulated mechanisms underlying sexual dichromatism. Pairwise Sequential Markovian Coalescent (PSMC) analysis of the genome reveals fluctuations in historical effective population size between 100,000 and 250,000 over the last 2 million years, with declines concordant with the beginning of the Pleistocene epoch and the Last Glacial Period. This draft genome of C. cardinalis provides an important resource for future studies of ecological, evolutionary, and functional genomics in cardinals and other birds.

2020 ◽  
Vol 10 (10) ◽  
pp. 3541-3548
Author(s):  
Simon Yung Wa Sin ◽  
Lily Lu ◽  
Scott V. Edwards

Northern cardinals (Cardinalis cardinalis) are common, mid-sized passerines widely distributed in North America. As an iconic species with strong sexual dichromatism, the northern cardinal has been the focus of extensive ecological and evolutionary research, yet genomic studies investigating the evolution of genotype–phenotype association of plumage coloration and dichromatism are lacking. Here we present a new, highly contiguous assembly for C. cardinalis. We generated a 1.1 Gb assembly comprising 4,762 scaffolds, with a scaffold N50 of 3.6 Mb, a contig N50 of 114.4 kb and a longest scaffold of 19.7 Mb. We identified 93.5% complete and single-copy orthologs from an Aves dataset using BUSCO, demonstrating high completeness of the genome assembly. We annotated the genomic region comprising the CYP2J19 gene, which plays a pivotal role in red coloration in birds. Comparative analyses demonstrated non-exonic regions unique to the CYP2J19 gene in passerines and a long insertion upstream of the gene in C. cardinalis. Transcription factor binding motifs discovered in the unique insertion region in C. cardinalis suggest potential androgen-regulated mechanisms underlying sexual dichromatism. Pairwise Sequential Markovian Coalescent (PSMC) analysis of the genome reveals fluctuations in historical effective population size between 100,000 and 250,000 over the last 2 million years, with declines concordant with the beginning of the Pleistocene epoch and the Last Glacial Period. This draft genome of C. cardinalis provides an important resource for future studies of ecological, evolutionary, and functional genomics in cardinals and other birds.
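
The scaffold and contig N50 values quoted in this and the following abstracts follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration; they are not the authors' pipeline or data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running sum first reaches
    half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for safety


# Toy scaffold lengths (made up, arbitrary units); total = 300, half = 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches half, so this prints 70
```

Because N50 depends only on the length distribution, a contig N50 and a scaffold N50 computed for the same assembly can differ widely, as they do in the figures reported above.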


2020 ◽  
Vol 10 (5) ◽  
pp. 1477-1484
Author(s):  
Kumar Saurabh Singh ◽  
David J. Hosken ◽  
Nina Wedell ◽  
Richard ffrench-Constant ◽  
Chris Bass ◽  
...  

Meadow brown butterflies (Maniola jurtina) on the Isles of Scilly represent an ideal model in which to dissect the links between genotype, phenotype and long-term patterns of selection in the wild, a largely unfulfilled but fundamental aim of modern biology. To meet this aim, a clear description of genotype is required. Here we present the draft genome sequence of M. jurtina to serve as a founding genetic resource for this species. Seven libraries were constructed using pooled DNA from five wild-caught spotted females and sequenced using Illumina, PacBio RSII and MinION technologies. A novel hybrid assembly approach was employed to generate a final assembly with an N50 of 214 kb (longest scaffold 2.9 Mb). The sequence assembly described here predicts a gene count of 36,294 and includes variants and gene duplicates from five genotypes. Core BUSCO (Benchmarking Universal Single-Copy Orthologs) gene sets of Arthropoda and Insecta recovered 90.5% and 88.7% complete and single-copy genes, respectively. Comparisons with 17 other lepidopteran species placed 86.5% of the assembled genes in orthogroups. Our results provide the first high-quality draft genome and annotation of the butterfly M. jurtina.


Genes ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 1297
Author(s):  
Surabhi Ranavat ◽  
Hannes Becher ◽  
Mark F. Newman ◽  
Vinita Gowda ◽  
Alex D. Twyford

Angiosperms possess various strategies to ensure reproductive success, such as stylar polymorphisms that encourage outcrossing. Here, we investigate the genetic basis of one such dimorphism that combines both temporal and spatial separation of sexual function, termed flexistyly. It is a floral strategy characterised by the presence of two morphs that differ in the timing of stylar movement. We performed a de novo assembly of the genome of Alpinia nigra using high-depth genomic sequencing. We then used Pool-seq to identify candidate regions for flexistyly based on allele frequency or coverage differences between pools of anaflexistylous and cataflexistylous morphs. The final genome assembly size was 2 Gb, and showed no evidence of recent polyploidy. The Pool-seq did not reveal large regions with high FST values, suggesting large structural chromosomal polymorphisms are unlikely to underlie differences between morphs. Similarly, no region had a 1:2 mapping depth ratio which would be indicative of hemizygosity. We propose that flexistyly is governed by a small genomic region that might be difficult to detect with Pool-seq, or a complex genomic region that proved difficult to assemble. Our genome will be a valuable resource for future studies of gingers, and provides the first steps towards characterising this complex floral phenotype.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9114 ◽  
Author(s):  
Jiawei Wang ◽  
Weizhen Liu ◽  
Dongzi Zhu ◽  
Xiang Zhou ◽  
Po Hong ◽  
...  

The sweet cherry (Prunus avium) is one of the most economically important fruit species in the world. However, limited genetic information is available for this species, which hinders breeding efforts at the molecular level. We describe a high-quality reference genome assembly and annotation of the diploid sweet cherry (2n = 2x = 16) cv. Tieton, generated using linked-read sequencing technology. We generated over 750 million clean reads, representing 112.63 Gb of raw sequencing data. The Supernova assembler produced a more highly ordered and continuous genome sequence than the current P. avium draft genome, with a contig N50 of 63.65 kb and a scaffold N50 of 2.48 Mb. The final scaffold assembly was 280.33 Mb in length, representing 82.12% of the estimated Tieton genome. Eight chromosome-scale pseudomolecules were constructed, accounting for 214 Mb of the final scaffold assembly. De novo, homology-based, and RNA-seq methods were used together to predict 30,975 protein-coding loci. In total, 98.39% of core eukaryotic genes and 97.43% of single copy orthologues were identified in the embryo plant, indicating the completeness of the assembly. Linked-read sequencing technology was effective in constructing a high-quality reference genome of the sweet cherry, which will benefit molecular breeding and cultivar identification in this species.


2020 ◽  
Vol 10 (10) ◽  
pp. 3489-3495
Author(s):  
Natascha van Lieshout ◽  
Ate van der Burgt ◽  
Michiel E. de Vries ◽  
Menno ter Maat ◽  
David Eickholt ◽  
...  

With the rapid expansion of the application of genomics and sequencing in plant breeding, there is a constant drive for better reference genomes. In potato (Solanum tuberosum), the third-largest food crop in the world, the related species S. phureja, designated “DM”, has been used as the most popular reference genome for the last 10 years. Here, we introduce the de novo sequenced genome of Solyntus as the next standard reference in potato genome studies. A true Solanum tuberosum made up of 116 contigs that is also highly homozygous, diploid, vigorous and self-compatible, Solyntus provides a more direct and contiguous reference than ever before available. It was constructed by sequencing with state-of-the-art long- and short-read technology and assembled with Canu. The 116 contigs were assembled into scaffolds to form each pseudochromosome, with 3 to 17 contigs per chromosome. This assembly contains 93.7% of the single-copy gene orthologs from the Solanaceae set and has an N50 of 63.7 Mbp. The genome and related files can be found at https://www.plantbreeding.wur.nl/Solyntus/. With the release of this research line and its draft genome we anticipate many exciting developments in (diploid) potato research.


2020 ◽  
Vol 12 (8) ◽  
pp. 1330-1336 ◽  
Author(s):  
Maulik Upadhyay ◽  
Andreas Hauser ◽  
Elisabeth Kunz ◽  
Stefan Krebs ◽  
Helmut Blum ◽  
...  

The snow sheep, Ovis nivicola, which is endemic to the mountain ranges of northeastern Siberia, is well adapted to the harsh, cold climatic conditions of its habitat. In this study, whole-genome sequencing, assembly, and gene annotation of a snow sheep were carried out using long reads from Nanopore sequencing technology. Additionally, RNA-seq reads from several tissues were generated to supplement gene prediction in the snow sheep genome. The assembled genome was ∼2.62 Gb in length and was represented by 7,157 scaffolds with an N50 of about 2 Mb. Repetitive sequences comprised 41% of the total genome. BUSCO analysis revealed that the snow sheep assembly contained full-length or partial fragments of 97% of mammalian universal single-copy orthologs (n = 4,104), illustrating the completeness of the assembly. In addition, a total of 20,045 protein-coding sequences were identified using a comprehensive gene prediction pipeline, of which 19,240 (∼96%) were annotated using protein databases. Moreover, homology-based searches and de novo identification detected 1,484 tRNAs; 243 rRNAs; 1,931 snRNAs; and 782 miRNAs in the snow sheep genome. To conclude, we generated the first de novo genome of the snow sheep using long reads; these data are expected to contribute significantly to our understanding of evolution and adaptation within the genus Ovis.


2019 ◽  
Vol 11 (12) ◽  
pp. 3445-3451 ◽  
Author(s):  
Jacqueline Heckenhauer ◽  
Paul B Frandsen ◽  
Deepak K Gupta ◽  
Juraj Paule ◽  
Stefan Prost ◽  
...  

Members of the speciose insect order Trichoptera (caddisflies) provide important ecosystem services, for example, nutrient cycling through the breakdown of organic matter. They are also of industrial interest due to their larval silk secretions. These form the basis for their diverse case-making behavior, which allows them to exploit a wide range of ecological niches. Only five genomes of this order have been published thus far, with variable quality in terms of contiguity and completeness. A low-cost sequencing strategy, that is, using a single Oxford Nanopore flow cell per individual along with Illumina sequence reads, was successfully used to generate high-quality genomes of two Trichoptera species, Plectrocnemia conspersa and Hydropsyche tenuis. Of the de novo assembly methods compared, assembly of low-coverage Nanopore reads (∼18×) and subsequent polishing with long reads followed by Illumina short reads (∼80–170× coverage) yielded the highest genome quality in terms of both contiguity and BUSCO completeness. The presented genomes are the shortest to date and extend our knowledge of genome size across caddisfly families. The genomic region that encodes light (L)-chain fibroin, a protein component of larval caddisfly silk, was identified and compared with existing L-fibroin gene clusters. The new genomic resources presented in this paper are among the highest-quality Trichoptera genomes and will increase knowledge of this important insect order by serving as the basis for phylogenomic and comparative genomic studies.


2020 ◽  
Author(s):  
Tatiana Arias ◽  
Diego Mauricio Riaño-Pachón ◽  
Verónica S. Di Stilio

The plant genus Thalictrum is a representative of the order Ranunculales (a sister lineage to all other Eudicots) with diverse floral morphologies, encompassing four sexual systems and two pollination modes. Previous studies suggest multiple transitions from insect to wind pollination within this genus, in association with polyploidy and unisexual flowers, but the underlying genes remain unknown. We generated a draft reference genome for Thalictrum thalictroides, a representative of a clade with ancestral floral traits (diploidy, hermaphroditism, and insect pollination) and a model for functional studies. To facilitate candidate gene discovery in flowers with different sexual and pollination systems, we also generated floral transcriptomes of T. thalictroides and of wind-pollinated, andromonoecious (staminate and hermaphroditic flowers on the same plant) T. hernandezii. The T. thalictroides draft genome assembly consisted of 44,860 contigs (N50 = 12,761 bp; total length 243 Mbp) and contained 84.5% of conserved embryophyte single-copy genes. Floral transcriptomes from Illumina sequencing and de novo assembly contained representatives of most eukaryotic core genes (approximately 80%), with most of their genes falling into common orthologous groups (orthogroups). Simple Sequence Repeat (SSR) motifs were also identified, which together with the single-copy genes constitute a resource for population-level or phylogenetic studies. Finally, to validate the utility of these resources, putative candidate genes were identified for the different floral morphologies using stepwise dataset comparisons. In conclusion, we present genomic and transcriptomic resources for Thalictrum, including the first genome of T. thalictroides and potential candidate genes for flowers with distinct sexual and pollination systems.


2019 ◽  
Author(s):  
Mengyang Xu ◽  
Xiaoshan Su ◽  
Mengqi Zhang ◽  
Ming Li ◽  
Xiaoyun Huang ◽  
...  

The long-spine porcupinefish, Diodon holocanthus (Diodontidae, Tetraodontiformes, Actinopterygii), also known as the freckled porcupinefish, is of great ecological and economic interest. However, its distinctive characteristics, including the inflation reaction, spiny skin and tetrodotoxin, have not been fully studied because a complete genome assembly has been lacking. In this study, the whole genome of a single individual was sequenced using single tube-Long Fragment Read co-barcode reads, generating 154.3 Gb of paired-end data (219.8× depth). Gaps were further filled using a small Oxford Nanopore MinION long-read dataset (11.4 Gb, 15.9× depth). Making full use of long-, medium- and short-range assembly information, the final assembly had a total length of 650.02 Mb, with contig and scaffold N50 sizes of 2.15 Mb and 8.13 Mb, respectively, despite high repetitive content. Benchmarking Universal Single-Copy Orthologs analysis, used to assess completeness, captured 95.7% (2,474) of core genes. In addition, 206.5 Mb (32.10%) of repetitive sequences were identified, and 20,840 protein-coding genes were annotated, of which 18,281 (87.72%) were assigned possible functions. This is the first de novo genome assembly of the porcupinefish, which will benefit downstream analyses of ontogeny, phylogeny, and evolution, and improve the exploration of its unique defensive mechanism.
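
The depth figures quoted in the porcupinefish abstract can be related to data volume by simple arithmetic: sequencing depth is total sequenced bases divided by genome size, so dividing the data volume by the reported depth gives the genome size implied by that pair of numbers. The sketch below applies this to the two datasets mentioned. The function name is hypothetical, and the abstract does not state which genome size estimate the authors used, so the results are back-of-the-envelope values rather than reported ones.

```python
def implied_genome_size_gb(total_bases_gb: float, depth: float) -> float:
    """Genome size (Gb) implied by a data volume and a reported depth.

    Hypothetical helper: rearranges depth = total bases / genome size.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return total_bases_gb / depth


# Figures taken from the abstract: (data volume in Gb, reported depth).
co_barcode = implied_genome_size_gb(154.3, 219.8)
nanopore = implied_genome_size_gb(11.4, 15.9)

# Both datasets imply a genome of roughly 0.70 to 0.72 Gb, somewhat larger
# than the 650.02 Mb of assembled sequence.
print(f"co-barcode reads: {co_barcode:.3f} Gb")
print(f"Nanopore reads:   {nanopore:.3f} Gb")
```

A gap of this kind between an implied genome size and the assembled length is common in draft assemblies, where repetitive regions are often collapsed or left unassembled.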


2019 ◽  
Author(s):  
Hannes Becher ◽  
Benjamin C. Jackson ◽  
Brian Charlesworth

Surveys of DNA sequence variation have shown that the level of genetic variability in a genomic region is often strongly positively correlated with its rate of crossing over (CO) [1–3]. This pattern is caused by selection acting on linked sites, which reduces genetic variability and can also cause the frequency distribution of segregating variants to contain more rare variants than expected without selection (skew). These effects of selection may involve the spread of beneficial mutations (selective sweeps, SSWs), the elimination of deleterious mutations (background selection, BGS) or both together, and are expected to be stronger with lower rates of crossing over [1–3]. However, in a recent study of human populations, the skew was reduced in the lowest CO regions compared with regions with somewhat higher CO rates [4]. A similar pattern is seen in the population genomic studies of Drosophila simulans described here. We propose an explanation for this paradoxical observation, and validate it using computer simulations. This explanation is based on the finding that partially recessive, linked deleterious mutations can increase rather than reduce neutral variability when the product of the effective population size (Ne) and the selection coefficient against homozygous carriers of mutations (s) is ≤ 1, i.e. there is associative overdominance (AOD) rather than BGS [5]. We show that AOD can operate in a genomic region with a low rate of CO, opening up a new perspective on how selection affects patterns of variability at linked sites.

