The Blepharisma stoltei macronuclear genome: towards the origins of whole genome reorganization

The germ-soma distinction is a defining feature of multicellular eukaryotes. Analogous to this, ciliates, a ubiquitous microbial eukaryote lineage, have morphologically and functionally distinct nuclei, but within single cells: the germline micronucleus (MIC) and somatic macronucleus (MAC). The origins and mechanisms of the MIC to MAC transformation, especially the extensive elimination of abundant internally eliminated sequences (IESs) and transposons during genome reorganization, are great biological mysteries. Blepharisma represents one of the two earliest diverging ciliate classes, and has unique, dual pathways of MAC development, making it ideal for investigating the functioning, origins and evolution of these processes. Here, we report the MAC genome assembly of Blepharisma stoltei strain ATCC 30299 (41 Mb), arranged as numerous alternative telomere-capped minichromosomes, tens to hundreds of kilobases long. The B. stoltei MAC genome encodes eight PiggyBac transposase homologs liberated from transposons. All are subject to purifying selection, but just one, the putative Blepharisma IES excisase, has a complete catalytic amino acid triad. Numerous genes encoding other domesticated transposases are present in B. stoltei, and often are comparably strongly upregulated in a similar timeframe to model ciliate genome reorganization homologs. Our phylogenetic investigations suggest the PiggyBac homologs may have been ancestral ciliate IES excisases. The B. stoltei MAC genome, together with the upcoming MIC genome, highlights the evolution and complex interplay between transposons, domesticated transposases, and genome reorganization in the context of germline-soma differentiation within single cells.

On the Identification of Clinically Relevant Bacterial Amino Acid Changes at the Whole Genome Level Using Auto-PSS-Genome

Interdisciplinary Sciences Computational Life Sciences, 10.1007/s12539-021-00439-2, 2021

Author(s): Hugo López-Fernández, Cristina P. Vieira, Pedro Ferreira, Paula Gouveia, Florentino Fdez-Riverola, ...

Keyword(s): Amino Acid, Whole Genome, Genome Level

Molecular characterization of clinical carbapenem-resistant Enterobacterales from Qatar

European Journal of Clinical Microbiology & Infectious Diseases, 10.1007/s10096-021-04185-7, 2021

Author(s): Fatma Ben Abid, Clement K. M. Tsui, Yohei Doi, Anand Deshmukh, Christi L. McElheny, ...

Keyword(s): Common Species, Clinical Samples, Whole Genome, E. coli, Carbapenem Resistant, Genes Encoding, Encoding Genes, Sequence Types, Common Sequence

Abstract: One hundred forty-nine carbapenem-resistant Enterobacterales from clinical samples obtained between April 2014 and November 2017 were subjected to whole genome sequencing and multi-locus sequence typing. Klebsiella pneumoniae (81, 54.4%) and Escherichia coli (38, 25.5%) were the most common species. Genes encoding metallo-β-lactamases were detected in 68 (45.8%) isolates, and OXA-48-like enzymes in 60 (40.3%). blaNDM-1 (45; 30.2%) and blaOXA-48 (29; 19.5%) were the most frequent. KPC-encoding genes were identified in 5 (3.6%) isolates. The most common sequence types were E. coli ST410 (8; 21.1%) and ST38 (7; 18.4%), and K. pneumoniae ST147 (13; 16%) and ST231 (7; 8.6%).

A study of transposable element-associated structural variations (TASVs) using a de novo-assembled Korean genome

Experimental & Molecular Medicine, 10.1038/s12276-021-00586-y, 2021

Author(s): Seyoung Mun, Songmi Kim, Wooseok Lee, Keunsoo Kang, Thomas J. Meyer, ...

Keyword(s): Genome Sequencing, Genome Assembly, De Novo, Personal Genome, Human Populations, Whole Genome, Structural Variations, Insert Size, Human Genomes, Next Generation Sequencing NGS

Abstract: Advances in next-generation sequencing (NGS) technology have made personal genome sequencing possible, and indeed, many individual human genomes have now been sequenced. Comparisons of these individual genomes have revealed substantial genomic differences between human populations as well as between individuals from closely related ethnic groups. Transposable elements (TEs) are known to be one of the major sources of these variations and act through various mechanisms, including de novo insertion, insertion-mediated deletion, and TE–TE recombination-mediated deletion. In this study, we carried out de novo whole-genome sequencing of one Korean individual (KPGP9) via multiple insert-size libraries. The de novo whole-genome assembly resulted in 31,305 scaffolds with a scaffold N50 size of 13.23 Mb. Furthermore, through computational data analysis and experimental verification, we revealed that 182 TE-associated structural variation (TASV) insertions and 89 TASV deletions contributed 64,232 bp in sequence gain and 82,772 bp in sequence loss, respectively, in the KPGP9 genome relative to the hg19 reference genome. We also verified structural differences associated with TASVs by comparative analysis with TASVs in recent genomes (AK1 and TCGA genomes) and reported their details. Here, we constructed a new Korean de novo whole-genome assembly and provide the first study, to our knowledge, focused on the identification of TASVs in an individual Korean genome. Our findings again highlight the role of TEs as a major driver of structural variations in individual human genomes.

A High-Quality Grapevine Downy Mildew Genome Assembly Reveals Rapidly Evolving and Lineage-Specific Putative Host Adaptation Genes

Genome Biology and Evolution, 10.1093/gbe/evz048, 2019, Vol 11 (3), pp. 954-969, Cited By ~ 12

Author(s): Yann Dussert, Isabelle D Mazet, Carole Couture, Jérôme Gouzy, Marie-Christine Piron, ...

Keyword(s): Downy Mildew, Genome Assembly, Population Genomics, Plasmopara Viticola, Plant Diseases, Host Specialization, Grapevine Downy Mildew, Downy Mildews, Genes Encoding, High Level

Abstract: Downy mildews are obligate biotrophic oomycete pathogens that cause devastating plant diseases on economically important crops. Plasmopara viticola is the causal agent of grapevine downy mildew, a major disease in vineyards worldwide. We sequenced the genome of Pl. viticola with PacBio long reads and obtained a new 92.94 Mb assembly with high contiguity (359 scaffolds for an N50 of 706.5 kb) due to a better resolution of repeat regions. This assembly presented a high level of gene completeness, recovering 1,592 genes encoding secreted proteins involved in plant–pathogen interactions. Plasmopara viticola had a two-speed genome architecture, with secreted protein-encoding genes preferentially located in gene-sparse, repeat-rich regions and evolving rapidly, as indicated by pairwise dN/dS values. We also used short reads to assemble the genome of Plasmopara muralis, a closely related species infecting grape ivy (Parthenocissus tricuspidata). The lineage-specific proteins identified by comparative genomics analysis included a large proportion of RxLR cytoplasmic effectors and, more generally, genes with high dN/dS values. We identified 270 candidate genes under positive selection, including several genes encoding transporters and components of the RNA machinery potentially involved in host specialization. Finally, the Pl. viticola genome assembly generated here will allow the development of robust population genomics approaches for investigating the mechanisms involved in adaptation to biotic and abiotic selective pressures in this species.

Chromosome-level genome assembly of Ophiorrhiza pumila reveals the evolution of camptothecin biosynthesis

Nature Communications, 10.1038/s41467-020-20508-2, 2021, Vol 12 (1)

Author(s): Amit Rai, Hideki Hirakawa, Ryo Nakabayashi, Shinji Kikuchi, Koki Hayashi, ...

Keyword(s): Genome Assembly, Whole Genome Duplication, Experimental Validation, Genome Duplication, Indole Alkaloids, Whole Genome, Comparative Genome Analysis, Plant Genomes, Genome Triplication, Chromosome Level

Abstract: Plant genomes remain highly fragmented and are often characterized by hundreds to thousands of assembly gaps. Here, we report a chromosome-level reference and phased genome assembly of Ophiorrhiza pumila, a camptothecin-producing medicinal plant, through an ordered multi-scaffolding and experimental validation approach. With 21 assembly gaps and a contig N50 of 18.49 Mb, the Ophiorrhiza genome is one of the most complete plant genomes assembled to date. We also report 273 nitrogen-containing metabolites, including diverse monoterpene indole alkaloids (MIAs). A comparative genomics approach identifies strictosidine biogenesis as the origin of MIA evolution. The emergence of strictosidine biosynthesis-catalyzing enzymes precedes downstream enzymes’ evolution post γ whole-genome triplication, which occurred approximately 110 Mya in O. pumila, and before the whole-genome duplication in Camptotheca acuminata identified here. Combining comparative genome analysis, multi-omics analysis, and metabolic gene-cluster analysis, we propose a working model for MIA evolution, and a pangenome for MIA biosynthesis, which will help in establishing a sustainable supply of camptothecin.

De novo whole-genome assembly in Chrysanthemum seticuspe, a model species of Chrysanthemums, and its application to genetic and gene discovery analysis

DNA Research, 10.1093/dnares/dsy048, 2019, Vol 26 (3), pp. 195-203, Cited By ~ 19

Author(s): Hideki Hirakawa, Katsuhiko Sumitomo, Tamotsu Hisamatsu, Soichiro Nagano, Kenta Shirasawa, ...

Keyword(s): Genome Assembly, De Novo, Gene Discovery, Whole Genome, Model Species

Whole-Genome Sequences of Influenza A(H1N1)pdm09 Virus Isolates from Kerala, India

Genome Announcements, 10.1128/genomea.00598-17, 2017, Vol 5 (28), Cited By ~ 1

Author(s): Sara Jones, Raji Prasad, Anjana S. Nair, Sanjai Dharmaseelan, Remya Usha, ...

Keyword(s): Amino Acid, Amino Acid Analysis, Influenza A, Whole Genome Sequence, Whole Genome, H1N1 Pandemic, Genome Sequences, Influenza A H1N1, Virus Isolates, New Mutations

Abstract: We report here the whole-genome sequences of six clinical isolates of influenza A(H1N1)pdm09, isolated from Kerala, India. Amino acid analysis of all gene segments from the A(H1N1)pdm09 isolates obtained in 2014 and 2015 identified several new mutations compared to the 2009 A(H1N1) pandemic strain.

Simultaneous deletion of the methylcytosine oxidases Tet1 and Tet3 increases transcriptome variability in early embryogenesis

Proceedings of the National Academy of Sciences, 10.1073/pnas.1510510112, 2015, Vol 112 (31), pp. E4236-E4245, Cited By ~ 48

Author(s): Jinsuk Kang, Matthias Lienhard, William A. Pastor, Ashu Chawla, Mark Novotny, ...

Keyword(s): Single Cells, Cholesterol Biosynthesis, Early Embryogenesis, DNA Demethylation, Cell Stage, Tet Proteins, Genes Encoding, Variable Gene, Intracellular Pathways, Tet Enzymes

Dioxygenases of the TET (Ten-Eleven Translocation) family produce oxidized methylcytosines, intermediates in DNA demethylation, as well as new epigenetic marks. Here we show data suggesting that TET proteins maintain the consistency of gene transcription. Embryos lacking Tet1 and Tet3 (Tet1/3 DKO) displayed a strong loss of 5-hydroxymethylcytosine (5hmC) and a concurrent increase in 5-methylcytosine (5mC) at the eight-cell stage. Single cells from eight-cell embryos and individual embryonic day 3.5 blastocysts showed unexpectedly variable gene expression compared with controls, and this variability correlated in blastocysts with variably increased 5mC/5hmC in gene bodies and repetitive elements. Despite the variability, genes encoding regulators of cholesterol biosynthesis were reproducibly down-regulated in Tet1/3 DKO blastocysts, resulting in a characteristic phenotype of holoprosencephaly in the few embryos that survived to later stages. Thus, TET enzymes and DNA cytosine modifications could directly or indirectly modulate transcriptional noise, resulting in the selective susceptibility of certain intracellular pathways to regulation by TET proteins.

Whole-Genome Sequence for Methicillin-Resistant Staphylococcus aureus Strain ATCC BAA-1680

Genome Announcements, 10.1128/genomea.00011-15, 2015, Vol 3 (2), Cited By ~ 5

Author(s): Luke T. Daum, Violet V. Bumah, Daniela S. Masson-Meyers, Manjeet Khubbar, John D. Rodriguez, ...

Keyword(s): Staphylococcus Aureus, Genome Sequence, Methicillin Resistant Staphylococcus Aureus, Whole Genome Sequence, Aureus Strain, Whole Genome, Strain ATCC, Methicillin Resistant

Advancing RNA Virus Discovery and Biology with Whole Genome Sequencing

10.21007/etd.cghs.2021.0551, 2021

Author(s): Mariah Taylor

Keyword(s): Genetic Variation, Amino Acid, Whole Genome Sequencing, Genome Sequencing, RNA Virus, Animal Health, Functional Domains, Whole Genome, Coding Region

Two RNA virus families that pose a threat to human and animal health are Hantaviridae and Coronaviridae. These RNA viruses, which originate in wildlife, continue, and will continue, to cause disease; hence, it is critical that scientific research define the mechanisms by which these viruses spill over and adapt to new hosts to become endemic. One gap in our ability to define these mechanisms is the lack of whole genome sequences for many of these viruses. To address this specific gap, I developed a versatile amplicon-based whole-genome sequencing (WGS) approach to identify viral genomes of hantaviruses and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) within reservoir and spillover hosts. In my research studies, I used the amplicon-based WGS approach to define the genetic plasticity of viral RNA within pathogenic and nonpathogenic hantavirus species. The standing genetic variation of Andes orthohantavirus and Prospect Hill orthohantavirus was mapped out and amino acid changes occurring outside of functional domains were identified within the nucleocapsid and glycoprotein. I observed several amino acid changes in functional domains of the RNA-dependent RNA polymerase, as well as single nucleotide polymorphisms (SNPs) within the 3’ non-coding region (NCR) of the S-segment. To identify whether virus adaptation would occur within the S- and L-segments, we attempted to adapt hantaviruses in vitro in a spillover host model through passaging experiments. In early passages we identified few mutations in the M-segment, with the majority being identified in the S-segment 3’ NCR and the L-segment. This work suggests that hantavirus adaptation occurs in the S- and L-segments, although the effect of these mutations on pathology is yet to be determined. While sequencing laboratory isolates is easily accomplished, sequencing low concentrations of virus within the reservoir is a formidable task.
I further translated our amplicon-based WGS approach into a pan-oligonucleotide amplicon-based WGS approach to sequence hantavirus vRNA and mRNA from reservoir and spillover hosts in Ukraine. This approach successfully identified a novel Puumala orthohantavirus (PUUV) strain in Ukraine, and using Bayesian phylogenetics we found this strain to be associated with the PUUV Latvian lineage. Early during the SARS-CoV-2 pandemic, I applied the knowledge gained in the hantavirus WGS efforts to the sequencing of SARS-CoV-2 from nasopharyngeal swabs collected in April 2020. The genetic diversity of 45 SARS-CoV-2 isolates was evaluated with the methods I developed. We identified D614G, a notable mutation known for increasing transmission, in over 90% of our isolates. Two major lineages, A and B, distinguish SARS-CoV-2 variants worldwide. While most of our isolates were found within lineage B, we also identified one isolate within lineage A. We performed in vitro work which confirmed that lineage A isolates replicate poorly in the trachea as compared to the nasal cavity. Five of these isolates presented a unique array of mutations and were assessed in the keratin 18 human angiotensin-converting enzyme 2 (K18-hACE2) mouse model for the immunological and pathogenic responses they induced. We identified a difference in pathogenesis between the A and B lineages, with emphysema being common amongst lineage A isolates. Additionally, we discovered a small cohort of likely SNPs that defined the late induction of eosinophils during infection. In summary, this work will further define the dynamics of genetic variation and plasticity within virus populations that cause disease outbreaks and will allow a deeper understanding of the virus-host relationship.
