The evolution of widespread recombination suppression on the Dwarf Hamster (Phodopus) X chromosome

The mammalian X chromosome shows strong conservation among distantly related species, limiting insights into the distinct selective processes that have shaped sex chromosome evolution. We constructed a chromosome-scale de novo genome assembly for the Siberian dwarf hamster (Phodopus sungorus), a species reported to show extensive recombination suppression across an entire arm of the X chromosome. Combining a physical genome assembly based on shotgun and long-range proximity ligation sequencing with a dense genetic map, we detected widespread suppression of female recombination across ~65% of the Phodopus X chromosome. This region of suppressed recombination likely corresponds to the Xp arm, which has previously been shown to be highly heterochromatic. Using additional sequencing data from two closely-related species (P. campbelli and P. roborovskii), we show that recombination suppression on Xp appears to be independent of major structural rearrangements. The suppressed Xp arm was enriched for genes primarily expressed in the placenta and some transposable elements, but otherwise showed similar gene densities, expression patterns, and rates of molecular evolution when compared to the recombinant Xq arm. Phodopus Xp gene content and order was also broadly conserved relative to the more distantly related rat X chromosome. Collectively, these data suggest that widespread suppression of recombination has likely evolved through the transient induction of facultative heterochromatin on the Phodopus Xp arm without major changes in chromosome structure or genetic content. Thus, dramatic changes in the recombination landscape have so far had relatively subtle influences on overall patterns of X-linked molecular evolution.

Download Full-text

Chromosome-level assembly of Drosophila bifasciata reveals important karyotypic transition of the X chromosome

10.1101/847558 ◽

2019 ◽

Author(s):

Ryan Bracewell ◽

Anita Tran ◽

Kamalakar Chatla ◽

Doris Bachtrog

Keyword(s):

X Chromosome ◽

Genome Assembly ◽

De Novo ◽

Pericentromeric Region ◽

Species Group ◽

Chromosome 15 ◽

Protein Coding ◽

Protein Coding Genes ◽

Long Read ◽

Chromosome Level

ABSTRACTThe Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromere, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.

Download Full-text

Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise

10.1101/2019.12.19.882399 ◽

2019 ◽

Cited By ~ 5

Author(s):

Valentina Peona ◽

Mozes P.K. Blom ◽

Luohao Xu ◽

Reto Burri ◽

Shawn Sullivan ◽

...

Keyword(s):

Dark Matter ◽

Genome Assembly ◽

Sex Chromosome ◽

De Novo ◽

Model Organism ◽

Technology Choice ◽

High Quality ◽

Sequencing Technologies ◽

Downstream Analysis ◽

Genome Assemblies

AbstractGenome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies have opened up a whole new world of genomic biodiversity. Although these technologies generate high-quality genome assemblies, there are still genomic regions difficult to assemble, like repetitive elements and GC-rich regions (genomic “dark matter”). In this study, we compare the efficiency of currently used sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter starting from the same sample. By adopting different de-novo assembly strategies, we were able to compare each individual draft assembly to a curated multiplatform one and identify the nature of the previously missing dark matter with a particular focus on transposable elements, multi-copy MHC genes, and GC-rich regions. Thanks to this multiplatform approach, we demonstrate the feasibility of producing a high-quality chromosome-level assembly for a non-model organism (paradise crow) for which only suboptimal samples are available. Our approach was able to reconstruct complex chromosomes like the repeat-rich W sex chromosome and several GC-rich microchromosomes. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects around the completeness of both the coding and non-coding parts of the genomes.

Download Full-text

De Novo Genome Assembly of Next-Generation Sequencing Data

Compendium of Plant Genomes - The Brassica rapa Genome ◽

10.1007/978-3-662-47901-8_4 ◽

2015 ◽

pp. 41-51

Author(s):

Min Liu ◽

Dongyuan Liu ◽

Hongkun Zheng

Keyword(s):

Next Generation Sequencing ◽

Genome Assembly ◽

De Novo ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

De Novo Genome Assembly ◽

Generation Sequencing

Download Full-text

Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads

Nature Biotechnology ◽

10.1038/s41587-020-0719-5 ◽

2020 ◽

Author(s):

David Porubsky ◽

◽

Peter Ebert ◽

Peter A. Audano ◽

Mitchell R. Vollger ◽

...

Keyword(s):

Single Cell ◽

Genome Assembly ◽

De Novo ◽

Error Rates ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

De Novo Genome Assembly ◽

Parental Data ◽

Human Genome Assembly ◽

Long Read

AbstractHuman genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing1,2 with continuous long-read or high-fidelity3 sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms.

Download Full-text

De Novo Sequencing and Hybrid Assembly of the Biofuel Crop Jatropha curcas L.: Identification of Quantitative Trait Loci for Geminivirus Resistance

Genes ◽

10.3390/genes10010069 ◽

2019 ◽

Vol 10 (1) ◽

pp. 69 ◽

Cited By ~ 9

Author(s):

Nagesh Kancharla ◽

Saakshi Jalali ◽

J. Narasimham ◽

Vinod Nair ◽

Vijay Yepuri ◽

...

Keyword(s):

Ssr Markers ◽

Genome Assembly ◽

Jatropha Curcas ◽

Quantitative Trait ◽

De Novo ◽

Mapping Population ◽

Single Copy ◽

Sequencing Data ◽

De Novo Genome Assembly ◽

Sequencing Technologies

Jatropha curcas is an important perennial, drought tolerant plant that has been identified as a potential biodiesel crop. We report here the hybrid de novo genome assembly of J. curcas generated using Illumina and PacBio sequencing technologies, and identification of quantitative loci for Jatropha Mosaic Virus (JMV) resistance. In this study, we generated scaffolds of 265.7 Mbp in length, which correspond to 84.8% of the gene space, using Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. Additionally, 96.4% of predicted protein-coding genes were captured in RNA sequencing data, which reconfirms the accuracy of the assembled genome. The genome was utilized to identify 12,103 dinucleotide simple sequence repeat (SSR) markers, which were exploited in genetic diversity analysis to identify genetically distinct lines. A total of 207 polymorphic SSR markers were employed to construct a genetic linkage map for JMV resistance, using an interspecific F2 mapping population involving susceptible J. curcas and resistant Jatropha integerrima as parents. Quantitative trait locus (QTL) analysis led to the identification of three minor QTLs for JMV resistance, and the same has been validated in an alternate F2 mapping population. These validated QTLs were utilized in marker-assisted breeding for JMV resistance. Comparative genomics of oil-producing genes across selected oil producing species revealed 27 conserved genes and 2986 orthologous protein clusters in Jatropha. This reference genome assembly gives an insight into the understanding of the complex genetic structure of Jatropha, and serves as source for the development of agronomically improved virus-resistant and oil-producing lines.

Download Full-text

Chromosome-Level Assembly of Drosophila bifasciata Reveals Important Karyotypic Transition of the X Chromosome

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400922 ◽

2020 ◽

Vol 10 (3) ◽

pp. 891-897 ◽

Cited By ~ 3

Author(s):

Ryan Bracewell ◽

Anita Tran ◽

Kamalakar Chatla ◽

Doris Bachtrog

Keyword(s):

X Chromosome ◽

Genome Assembly ◽

De Novo ◽

Pericentromeric Region ◽

Species Group ◽

Chromosome 15 ◽

Protein Coding ◽

Protein Coding Genes ◽

Long Read ◽

Chromosome Level

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.

Download Full-text

NanoGalaxy: Nanopore long-read sequencing data analysis in Galaxy

GigaScience ◽

10.1093/gigascience/giaa105 ◽

2020 ◽

Vol 9 (10) ◽

Cited By ~ 1

Author(s):

Willem de Koning ◽

Milad Miladi ◽

Saskia Hiltemann ◽

Astrid Heikema ◽

John P Hays ◽

...

Keyword(s):

Genome Assembly ◽

Bioinformatics Analysis ◽

De Novo ◽

Sequence Data ◽

Ease Of Use ◽

Easy Access ◽

Complex Data ◽

Sequencing Data ◽

Long Read ◽

Sequencing Platforms

Abstract Background Long-read sequencing can be applied to generate very long contigs and even completely assembled genomes at relatively low cost and with minimal sample preparation. As a result, long-read sequencing platforms are becoming more popular. In this respect, the Oxford Nanopore Technologies–based long-read sequencing “nanopore" platform is becoming a widely used tool with a broad range of applications and end-users. However, the need to explore and manipulate the complex data generated by long-read sequencing platforms necessitates accompanying specialized bioinformatics platforms and tools to process the long-read data correctly. Importantly, such tools should additionally help democratize bioinformatics analysis by enabling easy access and ease-of-use solutions for researchers. Results The Galaxy platform provides a user-friendly interface to computational command line–based tools, handles the software dependencies, and provides refined workflows. The users do not have to possess programming experience or extended computer skills. The interface enables researchers to perform powerful bioinformatics analysis, including the assembly and analysis of short- or long-read sequence data. The newly developed “NanoGalaxy" is a Galaxy-based toolkit for analysing long-read sequencing data, which is suitable for diverse applications, including de novo genome assembly from genomic, metagenomic, and plasmid sequence reads. Conclusions A range of best-practice tools and workflows for long-read sequence genome assembly has been integrated into a NanoGalaxy platform to facilitate easy access and use of bioinformatics tools for researchers. NanoGalaxy is freely available at the European Galaxy server https://nanopore.usegalaxy.eu with supporting self-learning training material available at https://training.galaxyproject.org.

Download Full-text

A comparison of methodological approaches to the study of young sex chromosomes: A case study in Poecilia

10.1101/2021.11.29.470452 ◽

2021 ◽

Author(s):

Iulia Darolti ◽

Pedro Almeida ◽

Alison E Wright ◽

Judith E Mank

Keyword(s):

Sex Chromosomes ◽

Reference Genome ◽

Sex Chromosome ◽

De Novo ◽

Sequence Similarity ◽

Sequence Evolution ◽

Structural Variations ◽

Recombination Suppression ◽

Custom Made ◽

Methodological Approaches

Studies of sex chromosome systems at early stages of divergence are key to understanding the initial process and underlying causes of recombination suppression. However, identifying signatures of divergence in homomorphic sex chromosomes can be challenging due to high levels of sequence similarity between the X and the Y. Variations in methodological precision and underlying data can make all the difference between detecting subtle divergence patterns or missing them entirely. Recent efforts to test for X-Y sequence differentiation in the guppy have led to contradictory results. Here we apply different analytical methodologies to the same dataset to test for the accuracy of different approaches in identifying patterns of sex chromosome divergence in the guppy. Our comparative analysis reveals that the most substantial source of variation in the results of the different analyses lies in the reference genome used. Analyses using custom-made de novo genome assemblies for the focal species successfully recover a signal of divergence across different methodological approaches. By contrast, using the distantly related Xiphophorus reference genome results in variable patterns, due to both sequence evolution and structural variations on the sex chromosomes between the guppy and Xiphophorus. Changes in mapping and filtering parameters can additionally introduce noise and obscure the signal. Our results illustrate how analytical differences can alter perceived results and we highlight best practices for the study of nascent sex chromosomes.

Download Full-text

On the Origin and Evolution of Drosophila New Genes during Spermatogenesis

Genes ◽

10.3390/genes12111796 ◽

2021 ◽

Vol 12 (11) ◽

pp. 1796

Author(s):

Qianwei Su ◽

Huangyi He ◽

Qi Zhou

Keyword(s):

X Chromosome ◽

Sex Chromosome ◽

Expression Patterns ◽

Cell Level ◽

Origin And Evolution ◽

Functional Studies ◽

New Genes ◽

Meiotic Sex Chromosome Inactivation ◽

New Gene ◽

Cell Transcriptome

The origin of functional new genes is a basic biological process that has significant contribution to organismal diversity. Previous studies in both Drosophila and mammals showed that new genes tend to be expressed in testes and avoid the X chromosome, presumably because of meiotic sex chromosome inactivation (MSCI). Here, we analyze the published single-cell transcriptome data of Drosophila adult testis and find an enrichment of male germline mitotic genes, but an underrepresentation of meiotic genes on the X chromosome. This can be attributed to an excess of autosomal meiotic genes that were derived from their X-linked mitotic progenitors, which provides direct cell-level evidence for MSCI in Drosophila. We reveal that new genes, particularly those produced by retrotransposition, tend to exhibit an expression shift toward late spermatogenesis compared with their parental copies, probably due to the more intensive sperm competition or sexual conflict. Our results dissect the complex factors including age, the origination mechanisms and the chromosomal locations that influence the new gene origination and evolution in testes, and identify new gene cases that show divergent cell-level expression patterns from their progenitors for future functional studies.

Download Full-text

De Novo genome assembly for third generation sequencing data

Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2018 ◽

10.1117/12.2501543 ◽

2018 ◽

Author(s):

Robert M. Nowak ◽

Mateusz Forc ◽

Wiktor Kuśmirek

Keyword(s):

Genome Assembly ◽

De Novo ◽

Third Generation ◽

Sequencing Data ◽

De Novo Genome Assembly ◽

Third Generation Sequencing ◽

Generation Sequencing

Download Full-text