Loss of inner kinetochore genes is associated with the transition to an unconventional point centromere in budding yeast

PeerJ ◽

10.7717/peerj.10085 ◽

2020 ◽

Vol 8 ◽

pp. e10085

Author(s):

Nagarjun Vijay

Keyword(s):

Gene Loss ◽

Yeast Species ◽

Budding Yeast ◽

High Quality ◽

Sequence Of Events ◽

Intergenic Regions ◽

Actual Sequence ◽

Or Gene ◽

Conserved Gene ◽

Genome Assemblies

Background: The genomic sequences of centromeres, as well as the set of proteins that recognize and interact with centromeres, are known to diverge quickly between lineages, potentially contributing to post-zygotic reproductive isolation. However, the actual sequence of events and processes involved in the divergence of the kinetochore machinery is not known. The patterns of gene loss that occur during evolution concomitant with phenotypic changes have been used to understand the timing and order of molecular changes. Methods: I screened the high-quality genomes of twenty budding yeast species for the presence of well-studied kinetochore genes. Based on the conserved gene order and complete genome assemblies, I identified gene loss events. Subsequently, I searched the intergenic regions to identify any un-annotated genes or gene remnants to obtain additional evidence of gene loss. Results: My analysis identified the loss of four genes (NKP1, NKP2, CENPL/IML3 and CENPN/CHL4) of the inner kinetochore constitutive centromere-associated network (CCAN, also known as the CTF19 complex in yeast) in both of the Naumovozyma species for which genome assemblies are available. Surprisingly, this collective loss of four genes of the CCAN/CTF19 complex coincides with the emergence of unconventional centromeres in N. castellii and N. dairenensis. My study suggests a tentative link between the emergence of unconventional point centromeres and the turnover of kinetochore genes in budding yeast.
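The screening step described in this abstract can be pictured as a presence/absence table that is queried for genes missing from every species of one clade but present everywhere else. The following is a minimal sketch under stated assumptions, not the author's pipeline: the function name, the outgroup labels and the control gene are hypothetical, and the table is toy data that only mirrors the four reported losses.

```python
def clade_specific_losses(presence, clade):
    """Return genes absent in every clade species and present in every other species.

    presence: dict mapping gene -> dict mapping species -> bool (True = gene found).
    clade: iterable of species names forming the clade of interest.
    """
    clade = set(clade)
    all_species = set().union(*(calls.keys() for calls in presence.values()))
    outside = all_species - clade
    lost = []
    for gene, calls in presence.items():
        absent_in_clade = all(not calls[s] for s in clade)
        present_outside = bool(outside) and all(calls[s] for s in outside)
        if absent_in_clade and present_outside:
            lost.append(gene)
    return sorted(lost)


# Toy table (hypothetical): "outgroup_1", "outgroup_2" and "CONTROL_GENE"
# are placeholders, not species or genes taken from the study.
species = ["N. castellii", "N. dairenensis", "outgroup_1", "outgroup_2"]
naumovozyma = ["N. castellii", "N. dairenensis"]
presence = {
    gene: {s: s not in naumovozyma for s in species}
    for gene in ["NKP1", "NKP2", "IML3", "CHL4"]
}
presence["CONTROL_GENE"] = {s: True for s in species}

lost_genes = clade_specific_losses(presence, naumovozyma)
print(lost_genes)
```

In practice each boolean would come from a homology search combined with a check of the conserved gene order and the intergenic regions, as the abstract describes; the sketch only covers the final tabulation.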

The Budding Yeast Msh4 Protein Functions in Chromosome Synapsis and the Regulation of Crossover Distribution

Genetics ◽

10.1093/genetics/158.3.1013 ◽

2001 ◽

Vol 158 (3) ◽

pp. 1013-1025 ◽

Cited By ~ 7

Author(s):

Janet E Novak ◽

Petra B Ross-Macdonald ◽

G Shirleen Roeder

Keyword(s):

Mismatch Repair ◽

Budding Yeast ◽

Null Mutation ◽

Crossing Over ◽

Wild Type ◽

Crossover Interference ◽

Meiotic Chromosomes ◽

Chromosome Synapsis ◽

Protein Functions ◽

Or Gene

Abstract: The budding yeast MSH4 gene encodes a MutS homolog produced specifically in meiotic cells. Msh4 is not required for meiotic mismatch repair or gene conversion, but it is required for wild-type levels of crossing over. Here, we show that a msh4 null mutation substantially decreases crossover interference. With respect to the defect in interference and the level of crossing over, msh4 is similar to the zip1 mutant, which lacks a structural component of the synaptonemal complex (SC). Furthermore, epistasis tests indicate that msh4 and zip1 affect the same subset of meiotic crossovers. In the msh4 mutant, SC formation is delayed compared to wild type, and full synapsis is achieved in only about half of all nuclei. The simultaneous defects in synapsis and interference observed in msh4 (and also zip1 and ndj1/tam1) suggest a role for the SC in mediating interference. The Msh4 protein localizes to discrete foci on meiotic chromosomes and colocalizes with Zip2, a protein involved in the initiation of chromosome synapsis. Both Zip2 and Zip1 are required for the normal localization of Msh4 to chromosomes, raising the possibility that the zip1 and zip2 defects in crossing over are indirect, resulting from the failure to localize Msh4 properly.

Hapo-G, haplotype-aware polishing of genome assemblies with accurate reads

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab034 ◽

2021 ◽

Vol 3 (2) ◽

Author(s):

Jean-Marc Aury ◽

Benjamin Istace

Keyword(s):

Single Molecule ◽

Direct Consequence ◽

High Quality ◽

Sequencing Errors ◽

Coding Regions ◽

Sequencing Technologies ◽

Long Reads ◽

Oxford Nanopore ◽

Long Read ◽

Genome Assemblies

Abstract: Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (on the order of kilobases to megabases) and then, using efficient algorithms, providing high-quality assemblies in terms of contiguity and completeness of repetitive regions. However, the error rate of long-read technologies is higher than that of short-read technologies. This has a direct consequence on the base quality of genome assemblies, particularly in coding regions where sequencing errors can disrupt the coding frame of genes. In the case of diploid genomes, the consensus of a given gene can be a mixture of the two haplotypes and can lead to premature stop codons. Several methods have been developed to polish genome assemblies using short reads; generally, they inspect nucleotides one by one and provide a correction for each nucleotide of the input assembly. As a result, these algorithms are not able to properly process diploid genomes and they typically switch from one haplotype to another. Herein we propose Hapo-G (Haplotype-Aware Polishing Of Genomes), a new algorithm capable of incorporating phasing information from high-quality reads (short or long reads) to polish genome assemblies, in particular assemblies of diploid and heterozygous genomes.
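The haplotype-switching problem mentioned in this abstract can be illustrated with a toy per-base majority vote. This is a sketch of the failure mode only, under the assumption that each position is corrected independently; it is not Hapo-G's algorithm, and the haplotype sequences, the pileup and the function name are hypothetical.

```python
from collections import Counter


def per_base_majority(pileup):
    """Pick the most frequent base at each position, ignoring which read it came from."""
    return "".join(Counter(column).most_common(1)[0][0] for column in pileup)


# Two hypothetical haplotypes that differ at the first and last positions.
hap_a = "AGGT"
hap_b = "CGGA"

# Bases observed at each position. Read depth from each haplotype varies,
# so haplotype A dominates position 0 and haplotype B dominates position 3.
pileup = ["AAC", "GGG", "GGG", "TAA"]

consensus = per_base_majority(pileup)
print(consensus)
```

Here the voted sequence combines the first allele of one haplotype with the last allele of the other, so it matches neither input haplotype. A phasing-aware method would instead need to keep track of which alleles occur together on the same reads.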

Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise

10.1101/2019.12.19.882399 ◽

2019 ◽

Cited By ~ 5

Author(s):

Valentina Peona ◽

Mozes P.K. Blom ◽

Luohao Xu ◽

Reto Burri ◽

Shawn Sullivan ◽

...

Keyword(s):

Dark Matter ◽

Genome Assembly ◽

Sex Chromosome ◽

De Novo ◽

Model Organism ◽

Technology Choice ◽

High Quality ◽

Sequencing Technologies ◽

Downstream Analysis ◽

Genome Assemblies

Abstract: Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies have opened up a whole new world of genomic biodiversity. Although these technologies generate high-quality genome assemblies, there are still genomic regions that are difficult to assemble, such as repetitive elements and GC-rich regions (genomic “dark matter”). In this study, we compare the efficiency of currently used sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter starting from the same sample. By adopting different de novo assembly strategies, we were able to compare each individual draft assembly to a curated multiplatform one and identify the nature of the previously missing dark matter, with a particular focus on transposable elements, multi-copy MHC genes, and GC-rich regions. Thanks to this multiplatform approach, we demonstrate the feasibility of producing a high-quality chromosome-level assembly for a non-model organism (paradise crow) for which only suboptimal samples are available. Our approach was able to reconstruct complex chromosomes like the repeat-rich W sex chromosome and several GC-rich microchromosomes. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects around the completeness of both the coding and non-coding parts of the genomes.

SIGAR: Inferring features of genome architecture and DNA rearrangements by split read mapping

10.1101/2020.05.05.079426 ◽

2020 ◽

Author(s):

Yi Feng ◽

Leslie Y. Beh ◽

Wei-Jen Chang ◽

Laura F. Landweber

Keyword(s):

Genome Assembly ◽

Repetitive Sequences ◽

Genome Architecture ◽

Dna Rearrangements ◽

High Quality ◽

Microbial Eukaryotes ◽

Ciliate Species ◽

Split Read ◽

High Level ◽

Genome Assemblies

Abstract: Ciliates are microbial eukaryotes with distinct somatic and germline genomes. Post-zygotic development involves extensive remodeling of the germline genome to form somatic chromosomes. Ciliates therefore offer a valuable model for studying the architecture and evolution of programmed genome rearrangements. Current studies usually focus on a few model species, where rearrangement features are annotated by aligning reference germline and somatic genomes. While many high-quality somatic genomes have been assembled, a high-quality germline genome assembly is difficult to obtain due to its smaller DNA content and abundance of repetitive sequences. To overcome these hurdles, we propose a new pipeline, SIGAR (Splitread Inference of Genome Architecture and Rearrangements), to infer germline genome architecture and rearrangement features without a germline genome assembly, requiring only short germline DNA sequencing reads. As a proof of principle, 93% of rearrangement junctions identified by SIGAR in the ciliate Oxytricha trifallax were validated by the existing germline assembly. We then applied SIGAR to six diverse ciliate species without germline genome assemblies, including Ichthyophthirius multifiliis, a fish pathogen. Despite the high level of somatic DNA contamination in each sample, SIGAR successfully inferred rearrangement junctions, short eliminated sequences and potential scrambled genes in each species. This pipeline enables pilot surveys or exploration of DNA rearrangements in species with limited DNA material access, thereby providing new insights into the evolution of chromosome rearrangements.

Microbial Genes in the Human Genome: Lateral Transfer or Gene Loss?

Science ◽

10.1126/science.1061036 ◽

2001 ◽

Vol 292 (5523) ◽

pp. 1903-1906 ◽

Cited By ~ 187

Author(s):

S. L. Salzberg

Keyword(s):

Human Genome ◽

Gene Loss ◽

Lateral Transfer ◽

Microbial Genes ◽

Or Gene

High-Quality Genome Assembly of Peronospora destructor, the Causal Agent of Onion Downy Mildew

Molecular Plant-Microbe Interactions ◽

10.1094/mpmi-10-19-0280-a ◽

2020 ◽

Vol 33 (5) ◽

pp. 718-720

Author(s):

Karthi Natesan ◽

Ji Yeon Park ◽

Cheol-Woo Kim ◽

Dong Suk Park ◽

Young-Seok Kwon ◽

...

Keyword(s):

Downy Mildew ◽

De Novo ◽

Gc Content ◽

Comparative Genomic ◽

High Quality ◽

Sequencing Platform ◽

Peronospora Destructor ◽

Genomic Studies ◽

Genome Assemblies ◽

High Quality Genome

Peronospora destructor is an obligate biotrophic oomycete that causes downy mildew on onion (Allium cepa). Onion is an important crop worldwide, but its production is affected by this pathogen. We sequenced the genome of P. destructor using the PacBio sequencing platform, and de novo assembly resulted in 74 contigs with a total contig size of 29.3 Mb and 48.48% GC content. Here, we report the first high-quality genome sequence of P. destructor and its comparison with the genome assemblies of other oomycetes. The genome is a very useful resource to serve as a reference for analysis of P. destructor isolates and for comparative genomic studies of the biotrophic oomycetes.

Closely related budding yeast species respond to different ecological signals for spore activation

Yeast ◽

10.1002/yea.3538 ◽

2020 ◽

Author(s):

Samuel Plante ◽

Christian R. Landry

Keyword(s):

Yeast Species ◽

Budding Yeast ◽

Spore Activation

Extracting novel hypotheses and findings from RNA-seq data

FEMS Yeast Research ◽

10.1093/femsyr/foaa007 ◽

2020 ◽

Vol 20 (2) ◽

Author(s):

Tyler Doughty ◽

Eduard Kerkhoven

Keyword(s):

Gene Expression ◽

Yeast Species ◽

Rna Seq ◽

High Quality ◽

New Techniques ◽

The Past ◽

Basic Biology ◽

Extract Information

Abstract: Over the past decade, improvements in technology and methods have enabled rapid and relatively inexpensive generation of high-quality RNA-seq datasets. These datasets have been used to characterize gene expression for several yeast species and have provided systems-level insights for basic biology, biotechnology and medicine. Herein, we discuss new techniques that have emerged and existing techniques that enable analysts to extract information from multifactorial yeast RNA-seq datasets. Ultimately, this minireview seeks to inspire readers to query datasets, whether previously published or freshly obtained, with creative and diverse methods to discover and support novel hypotheses.

A high-quality genome assembly from a single, field-collected spotted lanternfly (Lycorma delicatula) using the PacBio Sequel II system

GigaScience ◽

10.1093/gigascience/giz122 ◽

2019 ◽

Vol 8 (10) ◽

Cited By ~ 12

Author(s):

Sarah B Kingan ◽

Julie Urban ◽

Christine C Lambert ◽

Primo Baybayan ◽

Anna K Childers ◽

...

Keyword(s):

Invasive Species ◽

Genome Assembly ◽

De Novo ◽

Fragment Size ◽

High Quality ◽

De Novo Genome Assembly ◽

Lycorma Delicatula ◽

Long Read ◽

Genome Assemblies ◽

High Quality Genome

Background: A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female spotted lanternfly (Lycorma delicatula) using a single Pacific Biosciences SMRT Cell. The spotted lanternfly is an invasive species recently discovered in the northeastern United States that threatens to damage economically important crop plants in the region. Results: The DNA from 1 individual was used to make 1 standard, size-selected library with an average DNA fragment size of ∼20 kb. The library was run on 1 Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing ∼36× coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence-level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frameshift errors). Furthermore, it was possible to segregate more than half of the diploid genome into the 2 separate haplotypes. The assembly also recovered 2 microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. Conclusions: We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
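The coverage figure in this abstract follows from simple arithmetic: 82 Gb of unique sequence over a 2.3 Gb assembly is roughly 36×. The sketch below reproduces that calculation and shows one common way of computing a contig N50. The function names and the toy contig lengths are hypothetical, and the N50 definition used here (the length of the contig at which the running total, taken from longest to shortest, first reaches half the assembly size) is the usual convention rather than something stated in the abstract.

```python
def coverage(sequenced_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return sequenced_bases / genome_size


def n50(contig_lengths):
    """Length of the contig at which the cumulative sum, longest first,
    reaches at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Values from the abstract: 82 Gb of unique reads, 2.3 Gb assembly.
depth = coverage(82e9, 2.3e9)
print(round(depth))

# Toy contig lengths (hypothetical, not from the study).
toy_n50 = n50([10, 8, 6, 4, 2])
print(toy_n50)
```

For the toy contigs the total is 30, and the running sum reaches half of that at the second-longest contig, so the N50 is 8.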

EvolClust: automated inference of evolutionary conserved gene clusters in eukaryotes

Bioinformatics ◽

10.1093/bioinformatics/btz706 ◽

2019 ◽

Author(s):

Marina Marcet-Houben ◽

Toni Gabaldón

Keyword(s):

Secondary Metabolism ◽

Computational Prediction ◽

Gene Clusters ◽

Supplementary Information ◽

Supplementary Data ◽

Automated Inference ◽

Or Gene ◽

Conserved Gene ◽

Genome Comparisons

Motivation: The evolution and role of gene clusters in eukaryotes are poorly understood. Currently, most studies and computational prediction programs limit their focus to specific types of clusters, such as those involved in secondary metabolism. Results: We present EvolClust, a Python-based tool for the inference of evolutionarily conserved gene clusters from genome comparisons, independently of the function or gene composition of the cluster. EvolClust predicts conserved gene clusters from pairwise genome comparisons and infers families of related clusters from multiple (all versus all) genome comparisons. Availability and implementation: https://github.com/Gabaldonlab/EvolClust/. Supplementary information: Supplementary data are available at Bioinformatics online.