FindZX: an automated pipeline for detecting and visualising sex chromosomes using whole-genome sequencing data

Mapping Intimacies ◽

10.1101/2021.10.18.464774 ◽

2021 ◽

Author(s):

Hanna Sigeman ◽

Bella Sinclair ◽

Bengt Hansson

Keyword(s):

Sex Chromosomes ◽

Sex Chromosome ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

A Genome ◽

Genomic Studies ◽

Taxonomic Groups ◽

User Friendly ◽

Genomic Patterns

Sex chromosomes have evolved numerous times, as revealed by recent genomic studies. However, large gaps in our knowledge of sex chromosome diversity across the tree of life remain. Filling these gaps, through the study of novel species, is crucial for improved understanding of why and how sex chromosomes evolve. Characterization of sex chromosomes in already well-studied organisms is also important to avoid misinterpretations of population genomic patterns caused by undetected sex chromosome variation. Here we present findZX, an automated Snakemake-based computational pipeline for detecting and visualizing sex chromosomes through differences in genome coverage and heterozygosity between males and females. FindZX is user-friendly and scalable to suit different computational platforms and works with any number of male and female samples. An option to perform a genome coordinate lift-over to a reference genome of another species allows users to inspect sex-linked regions over larger contiguous chromosome regions, while also providing important between-species synteny information. To demonstrate its effectiveness, we applied findZX to publicly available genomic data from species belonging to widely different taxonomic groups (mammals, birds, reptiles, fish, and insects), with sex chromosome systems of different ages, sizes, and levels of differentiation. We also demonstrate that the lift-over method is robust over large phylogenetic distances (>80 million years of evolution).
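
The coverage-based signal described in this abstract can be illustrated with a small sketch. This is not findZX code: the function names, the pseudocount, and the 0.3 tolerance are hypothetical choices made for illustration, and the input is assumed to be per-sample coverage for one genome window, already normalised by each sample's genome-wide mean. Under those assumptions, a window with roughly twice the coverage in females suggests an X-like region, and one with roughly half suggests a Z-like region.

```python
import math
from statistics import mean


def log2_sex_coverage_ratio(female_cov, male_cov, pseudocount=1e-9):
    """Return log2(mean female coverage / mean male coverage) for one window.

    female_cov, male_cov: lists of normalised coverage values, one per sample.
    pseudocount: hypothetical small constant that avoids division by zero.
    """
    f = mean(female_cov)
    m = mean(male_cov)
    return math.log2((f + pseudocount) / (m + pseudocount))


def classify_window(log2_ratio, tolerance=0.3):
    """Label a window from its log2 ratio (tolerance is an illustrative value)."""
    if abs(log2_ratio) <= tolerance:
        return "autosomal-like"          # equal coverage in both sexes
    if abs(log2_ratio - 1.0) <= tolerance:
        return "X-like (female-biased)"  # two copies in females, one in males
    if abs(log2_ratio + 1.0) <= tolerance:
        return "Z-like (male-biased)"    # two copies in males, one in females
    return "unclassified"


if __name__ == "__main__":
    ratio = log2_sex_coverage_ratio([1.0, 1.0], [0.5, 0.5])
    print(classify_window(ratio))
```

A real analysis would also use the heterozygosity differences mentioned in the abstract; the sketch covers only the coverage half of the signal.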


The Diversity and Evolution of Sex Chromosomes in Frogs

Genes ◽

10.3390/genes12040483 ◽

2021 ◽

Vol 12 (4) ◽

pp. 483

Author(s):

Wen-Juan Ma ◽

Paris Veltsos

Keyword(s):

Sex Determination ◽

Sex Chromosomes ◽

Chromosome Evolution ◽

Sex Chromosome ◽

Comparative Genomic ◽

Ancestral State ◽

Heteromorphic Sex Chromosomes ◽

Genomic Studies ◽

Evolutionary Trajectories ◽

Heterogametic Sex

Frogs are ideal organisms for studying sex chromosome evolution because of their diversity in sex chromosome differentiation and sex-determination systems. We review 222 anuran frogs, spanning ~220 Myr of divergence, with characterized sex chromosomes, and discuss their evolution, phylogenetic distribution and transitions between homomorphic and heteromorphic states, as well as between sex-determination systems. Most (~75%) anurans have homomorphic sex chromosomes, with XY systems being three times more common than ZW systems. Most remaining anurans (~25%) have heteromorphic sex chromosomes, with XY and ZW systems almost equally represented. There are Y-autosome fusions in 11 species, and no W-/Z-/X-autosome fusions are known. The phylogeny represents at least 19 transitions between sex-determination systems and at least 16 cases of independent evolution of heteromorphic sex chromosomes from homomorphy, the likely ancestral state. Five lineages mostly have heteromorphic sex chromosomes, which might have evolved due to demographic and sexual selection attributes of those lineages. Males do not recombine over most of their genome, regardless of which is the heterogametic sex. Nevertheless, telomere-restricted recombination between ZW chromosomes has evolved at least once. More comparative genomic studies are needed to understand the evolutionary trajectories of sex chromosomes among frog lineages, especially in the ZW systems.


Development of a User-Friendly Pipeline for Mutational Analyses of HIV Using Ultra-Accurate Maximum-Depth Sequencing

Viruses ◽

10.3390/v13071338 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1338

Author(s):

Morgan E. Meissner ◽

Emily J. Julik ◽

Jonathan P. Badalamenti ◽

William G. Arndt ◽

Lauren J. Mills ◽

...

Keyword(s):

Error Rates ◽

Maximum Depth ◽

Sequencing Data ◽

Background Error ◽

High Background ◽

Immunodeficiency Virus ◽

User Friendly ◽

Viral Mutagenesis ◽

Hiv 1

Human immunodeficiency virus type 2 (HIV-2) accumulates fewer mutations during replication than HIV type 1 (HIV-1). Advanced studies of HIV-2 mutagenesis, however, have historically been confounded by high background error rates in traditional next-generation sequencing techniques. In this study, we describe the adaptation of the previously described maximum-depth sequencing (MDS) technique to studies of both HIV-1 and HIV-2 for the ultra-accurate characterization of viral mutagenesis. We also present the development of a user-friendly Galaxy workflow for the bioinformatic analyses of sequencing data generated using the MDS technique, designed to improve replicability and accessibility to molecular virologists. This adapted MDS technique and analysis pipeline were validated by comparisons with previously published analyses of the frequency and spectra of mutations in HIV-1 and HIV-2 and are readily expandable to studies of viral mutation across the genomes of both viruses. Using this novel sequencing pipeline, we observed that the background error rate was reduced 100-fold over standard Illumina error rates, and 10-fold over traditional unique molecular identifier (UMI)-based sequencing. This technical advancement will allow for the exploration of novel and previously unrecognized sources of viral mutagenesis in both HIV-1 and HIV-2, which will expand our understanding of retroviral diversity and evolution.


Batch effects in population genomic studies with low‐coverage whole genome sequencing data: causes, detection, and mitigation

Molecular Ecology Resources ◽

10.1111/1755-0998.13559 ◽

2021 ◽

Author(s):

Runyang Nicolas Lou ◽

Nina Overgaard Therkildsen

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Batch Effects ◽

Sequencing Data ◽

Population Genomic ◽

Genomic Studies ◽

Low Coverage


Characterization of Plastidial and Nuclear SSR Markers for Understanding Invasion Histories and Genetic Diversity of Schinus molle L.

Biology ◽

10.3390/biology7030043 ◽

2018 ◽

Vol 7 (3) ◽

pp. 43 ◽

Cited By ~ 2

Author(s):

Rafael Lemos ◽

Cristiane Matielo ◽

Dalvan Beise ◽

Vanessa da Rosa ◽

Deise Sarzi ◽

...

Keyword(s):

Genetic Diversity ◽

Invasive Plant ◽

Whole Genome Sequencing Data ◽

Natural Occurrence ◽

Sequencing Data ◽

Dispersal Capacity ◽

Schinus Molle ◽

High Dispersal ◽

History Of

Invasive plant species are expected to display high dispersal capacity but low levels of genetic diversity due to the founder effect occurring at each invasion episode. Understanding the history of invasions and the levels of genetic diversity of such species is an important task for planning management and monitoring strategies for these events. Peruvian Peppertree (Schinus molle L.) is a pioneer tree species native to South America which was introduced to North America, Europe and Africa, becoming a threat to these non-native habitats. In this study, we report the discovery and characterization of 17 plastidial (ptSSR) and seven nuclear (nSSR) markers for S. molle based on low-coverage whole-genome sequencing data acquired through next-generation sequencing. The markers were tested in 56 individuals from two natural populations sampled in the Brazilian Caatinga and Pampa biomes. All loci are moderately to highly polymorphic and proved suitable for genetic monitoring of new invasions, for understanding the history of old invasions, as well as for genetic studies of native populations in their natural occurrence range and of orchards established for commercial purposes.


Extreme heterogeneity in sex chromosome differentiation and dosage compensation in livebearers

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1905298116 ◽

2019 ◽

Vol 116 (38) ◽

pp. 19031-19036 ◽

Cited By ~ 20

Author(s):

Iulia Darolti ◽

Alison E. Wright ◽

Benjamin A. Sandkam ◽

Jake Morris ◽

Natasha I. Bloch ◽

...

Keyword(s):

Y Chromosome ◽

Dosage Compensation ◽

Sex Chromosomes ◽

Poecilia Reticulata ◽

Sex Chromosome ◽

Sequencing Data ◽

Chromosome Differentiation ◽

Y Chromosomes ◽

Shared Ancestry ◽

Suppressed Recombination

Once recombination is halted between the X and Y chromosomes, sex chromosomes begin to differentiate and transition to heteromorphism. While there is a remarkable variation across clades in the degree of sex chromosome divergence, far less is known about the variation in sex chromosome differentiation within clades. Here, we combined whole-genome and transcriptome sequencing data to characterize the structure and conservation of sex chromosome systems across Poeciliidae, the livebearing clade that includes guppies. We found that the Poecilia reticulata XY system is much older than previously thought, being shared not only with its sister species, Poecilia wingei, but also with Poecilia picta, which diverged roughly 20 million years ago. Despite the shared ancestry, we uncovered an extreme heterogeneity across these species in the proportion of the sex chromosome with suppressed recombination, and the degree of Y chromosome decay. The sex chromosomes in P. reticulata and P. wingei are largely homomorphic, with recombination in the former persisting over a substantial fraction. However, the sex chromosomes in P. picta are completely nonrecombining and strikingly heteromorphic. Remarkably, the profound degradation of the ancestral Y chromosome in P. picta is counterbalanced by the evolution of functional chromosome-wide dosage compensation in this species, which has not been previously observed in teleost fish. Our results offer important insight into the initial stages of sex chromosome evolution and dosage compensation.


Network analysis reveals differential metabolic functionality in antibiotic-resistant Pseudomonas aeruginosa

10.1101/303289 ◽

2018 ◽

Author(s):

Laura J. Dunphy ◽

Phillip Yen ◽

Jason A. Papin

Keyword(s):

Antibiotic Resistance ◽

Carbon Sources ◽

Growth Dynamics ◽

Metabolic Phenotype ◽

Metabolic Adaptation ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Drug Induced ◽

Antibiotic Resistant ◽

A Genome

Metabolic adaptations accompanying the development of antibiotic resistance in bacteria remain poorly understood. To interrogate this relationship, we profiled the growth of lab-evolved antibiotic-resistant lineages of the opportunistic pathogen Pseudomonas aeruginosa across 190 unique carbon sources. We semi-automatically calculated growth dynamics (maximum growth density, growth rate, and time to mid-exponential phase) of over 2,800 growth curves. These data revealed that the evolution of antibiotic resistance resulted in systems-level changes to growth dynamics and metabolic phenotype. Drug-resistant lineages predominantly displayed decreased growth relative to the ancestral lineage; however, resistant lineages occasionally displayed enhanced growth on certain carbon sources, indicating that adaption to drug can provide a growth advantage in certain environments. A genome-scale metabolic network reconstruction (GENRE) of P. aeruginosa strain UCBPP-PA14 was paired with whole-genome sequencing data of one of the drug-evolved lineages to predict genes contributing to observed changes in metabolism. Finally, we experimentally validated in silico predictions to identify genes mutated in resistant P. aeruginosa affecting loss of catabolic function. Our results build upon previous mechanistic knowledge of drug-induced metabolic adaptation and provide a framework for the identification of metabolic limitations in antibiotic-resistant pathogens. Robust drug-driven changes in bacterial metabolism have the potential to be exploited to select against antibiotic-resistant populations in chronic infections.
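
The three growth-dynamics quantities named in this abstract can be sketched as follows. The abstract does not say how the authors computed them, so the definitions here are assumptions: growth rate is taken as the steepest slope of log density between consecutive time points, and time to mid-exponential phase as the first time point at which density reaches half its maximum. The function name `growth_metrics` is hypothetical.

```python
import math


def growth_metrics(times, densities):
    """Return (max_density, growth_rate, time_to_mid) for one growth curve.

    times: strictly increasing measurement times.
    densities: strictly positive density readings, one per time point.
    The definitions of growth_rate and time_to_mid are illustrative assumptions.
    """
    if len(times) != len(densities) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    if any(d <= 0 for d in densities):
        raise ValueError("densities must be positive to take logarithms")

    max_density = max(densities)

    # Slope of ln(density) between each pair of consecutive points.
    points = list(zip(times, densities))
    slopes = [
        (math.log(d1) - math.log(d0)) / (t1 - t0)
        for (t0, d0), (t1, d1) in zip(points, points[1:])
    ]
    growth_rate = max(slopes)

    # First time point at or above half the maximum density.
    half_max = max_density / 2
    time_to_mid = next(t for t, d in points if d >= half_max)

    return max_density, growth_rate, time_to_mid


if __name__ == "__main__":
    print(growth_metrics([0, 1, 2, 3, 4], [0.1, 0.2, 0.4, 0.8, 0.8]))
```

For noisy plate-reader data, a smoothed or sliding-window fit would likely be more robust than the point-to-point slopes used here.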


Local Ancestry Prediction with PyLAE

10.1101/2020.11.13.380105 ◽

2020 ◽

Author(s):

Alexander Smetanin ◽

Nikita Moshkov ◽

Tatiana V. Tatarinova

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Computational Efficiency ◽

Source Code ◽

High Density ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Local Ancestry ◽

A Genome

Summary: We developed PyLAE - a new tool for determining local ancestry along a genome using whole-genome sequencing data or high-density genotyping experiments. PyLAE can process an arbitrarily large number of ancestral populations (with or without an informative prior). Since PyLAE does not involve estimation of many parameters, it can process thousands of genomes within a day. Computational efficiency, straightforward presentation of results, and ease of installation make PyLAE a useful tool to study admixed populations. Availability and implementation: The source code and installation manual are available at https://github.com/smetam/pylae.


Construction of the third generation Zea mays haplotype map

10.1101/026963 ◽

2015 ◽

Cited By ~ 21

Author(s):

Robert Bukowski ◽

Xiaosen Guo ◽

Yanli Lu ◽

Cheng Zou ◽

Bing He ◽

...

Keyword(s):

Zea Mays ◽

Whole Genome Sequencing Data ◽

Limiting Factor ◽

Computational Pipeline ◽

Research Groups ◽

Sequencing Data ◽

The Third ◽

The World ◽

Set Up

Background: Characterization of genetic variations in maize has been challenging, mainly due to deterioration of collinearity between individual genomes in the species. An international consortium of maize research groups combined resources to develop the maize haplotype version 3 (HapMap 3), built from whole genome sequencing data from 1,218 maize lines, covering pre-domestication and domesticated Zea mays varieties across the world. Results: A new computational pipeline was set up to process over 12 trillion bp of sequencing data, and a set of population genetics filters were applied to identify over 83 million variant sites. Conclusions: We identified polymorphisms in regions where collinearity is largely preserved in the maize species. However, the fact that the B73 genome used as the reference only represents a fraction of all haplotypes is still an important limiting factor.


Shiny-SoSV: A web-based performance calculator for somatic structural variant detection

10.1101/668723 ◽

2019 ◽

Author(s):

Tingting Gong ◽

Vanessa M Hayes ◽

Eva KF Chan

Keyword(s):

Study Design ◽

Additive Model ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Web Based ◽

Generalised Additive Model ◽

Structural Variant ◽

Variant Detection ◽

User Friendly ◽

The Impact

Somatic structural variants are an important contributor to cancer development and evolution. Accurate detection of these complex variants from whole genome sequencing data is influenced by a multitude of parameters. However, there are currently no tools for guiding study design nor are there applications that could predict the performance of somatic structural variant detection. To address this gap, we developed Shiny-SoSV, a user-friendly web-based calculator for determining the impact of common variables on the sensitivity and precision of somatic structural variant detection, including choice of variant detection tool, sequencing depth of coverage, variant allele fraction, and variant breakpoint resolution. Using simulation studies, we determined singular and combinatoric effects of these variables, modelled the results using a generalised additive model, allowing structural variant detection performance to be predicted for any combination of predictors. Shiny-SoSV provides an interactive and visual platform for users to easily compare individual and combined impact of different parameters. It predicts the performance of a proposed study design, on somatic structural variant detection, prior to the commencement of benchwork. Shiny-SoSV is freely available at https://hcpcg.shinyapps.io/Shiny-SoSV with accompanying user’s guide and example use-cases.
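
The two performance measures named in this abstract, sensitivity and precision, can be sketched for a simulated call set as below. This is an illustration only and not part of Shiny-SoSV: the tuple layout `(chrom, start, end)`, the function name, and the 100 bp breakpoint tolerance are hypothetical, chosen to show how breakpoint resolution feeds into both measures.

```python
def sensitivity_precision(truth, calls, tolerance=100):
    """Return (sensitivity, precision) for structural variant calls.

    truth, calls: lists of (chrom, start, end) tuples.
    tolerance: hypothetical maximum breakpoint offset, in bp, for a match.
    """
    def matches(a, b):
        # Same chromosome and both breakpoints within the tolerance.
        return (
            a[0] == b[0]
            and abs(a[1] - b[1]) <= tolerance
            and abs(a[2] - b[2]) <= tolerance
        )

    # True variants recovered by at least one call.
    detected = sum(1 for t in truth if any(matches(t, c) for c in calls))
    # Calls supported by at least one true variant.
    correct = sum(1 for c in calls if any(matches(t, c) for t in truth))

    sensitivity = detected / len(truth) if truth else 0.0
    precision = correct / len(calls) if calls else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    truth = [("chr1", 1000, 2000), ("chr2", 500, 900)]
    calls = [("chr1", 1010, 1990), ("chr3", 1, 2)]
    print(sensitivity_precision(truth, calls))
```

Tightening `tolerance` lowers both measures for the same call set, which is one way breakpoint resolution can affect reported performance.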


Genomic characterization of a pathogenic isolate of Saccharomyces cerevisiae reveals an extensive and dynamic landscape of structural variation

10.1101/2021.08.20.457152 ◽

2021 ◽

Author(s):

Lydia R. Heasley ◽

Juan Lucas Argueso

Keyword(s):

Saccharomyces Cerevisiae ◽

Structural Variation ◽

Genome Structure ◽

Genomic Variation ◽

Whole Genome Sequencing Data ◽

Structural Genomic ◽

Pathogenic Isolate ◽

A Genome ◽

Long Read

The budding yeast Saccharomyces cerevisiae has been extensively characterized for many decades and is a critical resource for the study of numerous facets of eukaryotic biology. Recently, the analysis of whole genome sequencing data from over 1000 natural isolates of S. cerevisiae has provided critical insights into the evolutionary landscape of this species by revealing a population structure comprised of numerous genomically diverse lineages. These survey-level analyses have been largely devoid of structural genomic information, mainly because short read sequencing is not suitable for detailed characterization of genomic architecture. Consequently, we still lack a complete perspective of the genomic variation that exists within the species. Single molecule long read sequencing technologies, such as Oxford Nanopore and PacBio, provide sequencing-based approaches with which to rigorously define the structure of a genome, and have empowered yeast geneticists to explore this poorly described realm of eukaryotic genomics. Here, we present the comprehensive genomic structural analysis of a pathogenic isolate of S. cerevisiae, YJM311. We used long read sequence analysis to construct a haplotype-phased, telomere-to-telomere length assembly of the YJM311 diploid genome and characterized the structural variations (SVs) therein. We discovered that the genome of YJM311 contains significant intragenomic structural variation, some of which imparts notable consequences to the genomic stability and developmental biology of the strain. Collectively, we outline a new methodology for creating accurate haplotype-phased genome assemblies and highlight how such genomic analyses can define the structural architectures of S. cerevisiae isolates. It is our hope that through continued structural characterization of S. cerevisiae genomes, such as we have reported here for YJM311, we will comprehensively advance our understanding of eukaryotic genome structure-function relationships, structural diversity, and evolution.
