Long read assemblies of geographically dispersed Plasmodium falciparum isolates reveal highly structured subtelomeres

2018 ◽  
Vol 3 ◽  
pp. 52 ◽  
Author(s):  
Thomas D. Otto ◽  
Ulrike Böhme ◽  
Mandy Sanders ◽  
Adam J. Reid ◽  
Ellen I. Bruske ◽  
...  

Background: Although thousands of clinical isolates of Plasmodium falciparum are being sequenced and analysed by short read technology, the data do not resolve the highly variable subtelomeric regions of the genomes that contain polymorphic gene families involved in immune evasion and pathogenesis. There is also no current standard definition of the boundaries of these variable subtelomeric regions. Methods: Using long-read sequence data (Pacific Biosciences SMRT technology), we assembled and annotated the genomes of 15 P. falciparum isolates, ten of which are newly cultured clinical isolates. We performed comparative analysis of the entire genome with particular emphasis on the subtelomeric regions and the internal var gene clusters. Results: The nearly complete sequence of these 15 isolates has enabled us to define a highly conserved core genome, to delineate the boundaries of the subtelomeric regions, and to compare these across isolates. We found highly structured variable regions in the genome. Some exported gene families purportedly involved in release of merozoites show copy number variation. As an example of ongoing genome evolution, we found a novel CLAG gene in six isolates. We also found a novel gene that was relatively enriched in the South East Asian isolates compared to those from Africa. Conclusions: These 15 manually curated new reference genome sequences with their nearly complete subtelomeric regions and fully assembled genes are an important new resource for the malaria research community. We report the overall conserved structure and pattern of important gene families and the more clearly defined subtelomeric regions.

2021 ◽  
Author(s):  
Mohammad Moniruzzaman ◽  
Frank Aylward

Chlamydomonas reinhardtii is an important eukaryotic alga that has been studied as a model organism for decades. Despite extensive history as a model system, phylogenetic and genetic characteristics of viruses infecting this alga have remained elusive. We analyzed high-throughput genome sequence data of numerous C. reinhardtii isolates, and in six strains we discovered endogenous genomes of giant viruses reaching over several hundred kilobases in length. In addition, we have also discovered the entire genome of a closely related giant virus that is endogenized within the genome of Chlamydomonas incerta, one of the closest sequenced phylogenetic relatives of C. reinhardtii. Endogenous giant viruses add hundreds of new gene families to the host strains, highlighting their contribution to the pangenome dynamics and inter-strain genomic variability of C. reinhardtii. Our findings suggest that endogenization of giant viruses can have profound implications in shaping the population dynamics and ecology of protists in the environment.


2021 ◽  
Author(s):  
Damilola R Oresegun ◽  
Peter Thorpe ◽  
Ernest Diez Benavente ◽  
Susana Campino ◽  
Muh Fauzi ◽  
...  

Plasmodium knowlesi, a malaria parasite of old-world macaque monkeys, is used extensively to model Plasmodium biology. Recently P. knowlesi was found in the human population of Southeast Asia, particularly Malaysia. P. knowlesi causes uncomplicated to severe and fatal malaria in the human host with features in common with the more prevalent and virulent malaria caused by Plasmodium falciparum. As such, P. knowlesi presents a unique opportunity to inform an experimental model for malaria with clinical data from same-species human infections. Experimental lines of P. knowlesi represent well characterised genetically static parasites and to maximise their utility as a backdrop for understanding malaria pathophysiology, genetically diverse contemporary clinical isolates, essentially wild-type, require comparable characterization. The Oxford Nanopore PCR-free long-read sequencing platform was used to sequence P. knowlesi parasites from archived clinical samples. The sequencing platform and assembly pipeline were designed to facilitate capturing data on important multiple gene families, including the P. knowlesi schizont-infected cell agglutination (SICA) var genes and the Knowlesi-Interspersed Repeats (KIR) genes. The SICAvar and KIR gene families code for antigenically variant proteins that have been difficult to resolve and characterise. Analyses presented here suggest that the family members have arisen through a process of gene duplication, selection pressure and variation. Highly evolving genes tend to be located proximal to genetic elements that drive change rather than regions that support core gene conservation. For example, the virulence-associated P. falciparum erythrocyte membrane protein (PfEMP1) gene family members are restricted to relatively unstable sub-telomeric regions. In contrast, the SICAvar and KIR genes are located throughout the genome but, as the study presented here shows, they occupy otherwise gene-sparse chromosomal locations. The novel methods presented here offer the malaria research community new tools to generate comprehensive genome sequence data from small clinical samples and renewed insight into these complex real-world parasites.


2015 ◽  
Author(s):  
Lucas Amenga-Etego ◽  
Ruiqi Li ◽  
John D. O’Brien

The advent of whole-genome sequencing has generated increased interest in modeling the structure of strain mixture within clinical infections of Plasmodium falciparum (Pf). The life cycle of the parasite implies that the mixture of multiple strains within an infected individual is related to the out-crossing rate across populations, making methods for measuring this process in situ central to understanding the genetic epidemiology of the disease. In this paper, we show how to estimate inbreeding coefficients using genomic data from Pf clinical samples, providing a simple metric for assessing within-sample mixture that connects to an extensive literature in population genetics and conservation ecology. Features of the P. falciparum genome mean that some standard methods for inbreeding coefficients and related F-statistics cannot be used directly. Here, we review an initial effort to estimate the inbreeding coefficient within clinical isolates of P. falciparum and provide several generalizations using both frequentist and Bayesian approaches. The Bayesian approach connects these estimates to the Balding-Nichols model, a mainstay within genetic epidemiology. We provide simulation results on the performance of the estimators and show their use on ~1500 samples from the PF3K data set. We also compare the results to output from a recent mixture model for within-sample strain mixture, showing that inbreeding coefficients provide a strong proxy for the results of these more complex models. We provide the methods described within an open-source R package pfmix.
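
As a rough illustration of the kind of quantity discussed in this abstract, the sketch below computes a textbook heterozygosity-ratio inbreeding coefficient, F = 1 - H_within / H_population, across biallelic sites. It is an assumption-laden example and not the estimator implemented in pfmix: the function names, the biallelic simplification, and the ratio-of-sums form are all chosen here for illustration only.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def inbreeding_coefficient(within_freqs: Sequence[float],
                           population_freqs: Sequence[float]) -> float:
    """Hypothetical helper: F = 1 - H_within / H_population, summed over sites.

    within_freqs     -- within-sample allele frequencies, one per site
    population_freqs -- population allele frequencies at the same sites
    """
    if len(within_freqs) != len(population_freqs):
        raise ValueError("frequency vectors must cover the same sites")
    h_within = sum(heterozygosity(p) for p in within_freqs)
    h_population = sum(heterozygosity(p) for p in population_freqs)
    if h_population == 0.0:
        raise ValueError("no polymorphic sites in the population")
    return 1.0 - h_within / h_population
```

Under these assumptions, a sample whose within-sample frequencies are all 0 or 1 (a single strain) gives F = 1, while a sample that mirrors the population frequencies gives F = 0.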


2021 ◽  
Vol 18 (1) ◽  
Author(s):  
Ahmed Al Qaffas ◽  
Salvatore Camiolo ◽  
Mai Vo ◽  
Alexis Aguiar ◽  
Amine Ourahmane ◽  
...  

The advent of whole genome sequencing has revealed that common laboratory strains of human cytomegalovirus (HCMV) have major genetic deficiencies resulting from serial passage in fibroblasts. In particular, tropism for epithelial and endothelial cells is lost due to mutations disrupting genes UL128, UL130, or UL131A, which encode subunits of a virion-associated pentameric complex (PC) important for viral entry into these cells but not for entry into fibroblasts. The endothelial cell-adapted strain TB40/E has a relatively intact genome and has emerged as a laboratory strain that closely resembles wild-type virus. However, several heterogeneous TB40/E stocks and cloned variants exist that display a range of sequence and tropism properties. Here, we report the use of PacBio sequencing to elucidate the genetic changes that occurred, both at the consensus level and within subpopulations, upon passaging a TB40/E stock on ARPE-19 epithelial cells. The long-read data also facilitated examination of the linkage between mutations. Consistent with inefficient ARPE-19 cell entry, at least 83% of viral genomes present before adaptation contained changes impacting PC subunits. In contrast, and consistent with the importance of the PC for entry into endothelial and epithelial cells, genomes after adaptation lacked these or additional mutations impacting PC subunits. The sequence data also revealed six single noncoding substitutions in the inverted repeat regions, single nonsynonymous substitutions in genes UL26, UL69, US28, and UL122, and a frameshift truncating gene UL141. Among the changes affecting protein-coding regions, only the one in UL122 was strongly selected. This change, resulting in a D390H substitution in the encoded protein IE2, has been previously implicated in rendering another viral protein, UL84, essential for viral replication in fibroblasts. This finding suggests that IE2, and perhaps its interactions with UL84, have important functions unique to HCMV replication in epithelial cells.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Xing Wang ◽  
Yi Zhang ◽  
Yufeng Zhang ◽  
Mingming Kang ◽  
Yuanbo Li ◽  
...  

Earthworms (Annelida: Crassiclitellata) are widely distributed around the world due to their ancient origination as well as adaptation and invasion after introduction into new habitats over the past few centuries. Herein, we report a 1.2 Gb complete genome assembly of the earthworm Amynthas corticis based on a strategy combining third-generation long-read sequencing and Hi-C mapping. A total of 29,256 protein-coding genes are annotated in this genome. Analysis of resequencing data indicates that this earthworm is a triploid species. Furthermore, gene family evolution analysis shows that comprehensive expansion of gene families in the Amynthas corticis genome has produced more defensive functions compared with other species in Annelida. Quantitative proteomic iTRAQ analysis shows that expression of 147 proteins changed in the body of Amynthas corticis and 16S rDNA sequencing shows that abundance of 28 microorganisms changed in the gut of Amynthas corticis when the earthworm was incubated with pathogenic Escherichia coli O157:H7. Our genome assembly provides abundant and valuable resources for the earthworm research community, serving as a first step toward uncovering the mysteries of this species, and may provide molecular level indicators of its powerful defensive functions, adaptation to complex environments and invasion ability.


2021 ◽  
Vol 12 (1) ◽  
pp. 123-137
Author(s):  
Carolina Sabença ◽  
Gilberto Igrejas ◽  
Patrícia Poeta ◽  
Frédéric Robin ◽  
Richard Bonnet ◽  
...  

Objectives. Epidemiological data concerning third-generation cephalosporin (3GC) resistance in wild fauna are scarce. The aim of this study was to characterize the resistance genes, their genetic context, and clonal relatedness in 17 Escherichia coli resistant to 3GC isolated from wild animals. Methods. The isolates were characterized by short-read whole genome sequencing, and long-read sequencing was used for the hybrid assembly of plasmid sequences. Results. The 3GC resistance gene most frequently identified in the isolates was the extended-spectrum β-lactamase (ESBL)-encoding gene blaCTX-M-1 (82.3%), followed by blaCTX-M-32 (5.9%), blaCTX-M-14 (5.9%), and blaSHV-12 (5.9%). E. coli isolates mainly belonged to sequence types (STs) rarely reported from humans. The single nucleotide polymorphism (SNP)-based typing showed that most E. coli genomes from wild animals (wild boars, birds of prey, and buzzards) formed clonal clusters (<5 SNPs), showing a clonal dissemination crossing species boundaries. The blaCTX-M-1-harboring IncI1-ST3 plasmid was the predominant ESBL-encoding plasmid (76.4%) in wild animal isolates. Plasmid comparison revealed a 110-kb self-transferable plasmid consisting of a conserved backbone and two variable regions involved in antimicrobial resistance and in interaction with recipient cells during conjugation. Conclusion. Our results highlighted the unexpected clonal dissemination of blaCTX-M-1-encoding clones and the complicity of IncI1-ST3 plasmid in the spread of blaCTX-M-1 within wild fauna.
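
The SNP-based clustering mentioned in this abstract can be pictured with a small sketch. The code below groups pre-aligned, equal-length sequences by single linkage whenever a pair differs at fewer than 5 positions. The abstract does not describe the authors' actual pipeline, so the single-linkage rule, the function names, and the toy input format are assumptions made purely for illustration.

```python
from itertools import combinations
from typing import Dict, List, Set


def snp_distance(a: str, b: str) -> int:
    """Count mismatching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def clonal_clusters(genomes: Dict[str, str], max_snps: int = 5) -> List[Set[str]]:
    """Hypothetical helper: single-linkage clusters of isolates < max_snps apart."""
    parent = {name: name for name in genomes}

    def find(x: str) -> str:
        # Walk up to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(genomes, 2):
        if snp_distance(genomes[a], genomes[b]) < max_snps:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, Set[str]] = {}
    for name in genomes:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())
```

With this rule, two isolates from different host species that differ by a single SNP fall into one cluster, which is the pattern the abstract describes as dissemination across species boundaries.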


Parasitology ◽  
2004 ◽  
Vol 128 (3) ◽  
pp. 245-251 ◽  
Author(s):  
L. PEIXOTO ◽  
V. FERNÁNDEZ ◽  
H. MUSTO

The usage of alternative synonymous codons in the completely sequenced, extremely A+T-rich parasite Plasmodium falciparum was studied. Confirming previous studies obtained with less than 3% of the total genes recently described, we found that A- and U-ending triplets predominate but translational selection increases the frequency of a subset of codons in highly expressed genes. However, some new results come from the analysis of the complete sequence. First, there is more variation in GC3 than previously described; second, the effect of natural selection acting at the level of translation has been analysed with real expression data at 4 different stages and third, we found that highly expressed proteins increment the frequency of energetically less expensive amino acids. The implications of these results are discussed.
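
GC3, the statistic referred to in this abstract, is the fraction of G or C bases at third codon positions. A minimal sketch of that calculation follows. The function name and the requirement that the input be an in-frame coding sequence are assumptions of this example, and the authors' exact procedure (for instance, how stop codons or non-degenerate codons were handled) is not stated in the abstract.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    third_positions = seq[2::3]  # every third base, starting at codon position 3
    if not third_positions:
        raise ValueError("empty sequence")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)
```

For an A+T-rich genome such as that of P. falciparum, values computed this way would be expected to sit well below 0.5 for most genes, in line with the predominance of A- and U-ending triplets reported above.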


1989 ◽  
Vol 9 (6) ◽  
pp. 2615-2626 ◽  
Author(s):  
E Hickey ◽  
S E Brandon ◽  
G Smale ◽  
D Lloyd ◽  
L A Weber

Vertebrate cells synthesize two forms of the 82- to 90-kilodalton heat shock protein that are encoded by distinct gene families. In HeLa cells, both proteins (hsp89 alpha and hsp89 beta) are abundant under normal growth conditions and are synthesized at increased rates in response to heat stress. Only the larger form, hsp89 alpha, is induced by the adenovirus E1A gene product (M. C. Simon, K. Kitchener, H. T. Kao, E. Hickey, L. Weber, R. Voellmy, N. Heintz, and J. R. Nevins, Mol. Cell. Biol. 7:2884-2890, 1987). We have isolated a human hsp89 alpha gene that shows complete sequence identity with heat- and E1A-inducible cDNA used as a hybridization probe. The 5'-flanking region contained overlapping and inverted consensus heat shock control elements that can confer heat-inducible expression on a beta-globin reporter gene. The gene contained 10 intervening sequences. The first intron was located adjacent to the translation start codon, an arrangement also found in the Drosophila hsp82 gene. The spliced mRNA sequence contained a single open reading frame encoding an 84,564-dalton polypeptide showing high homology with the hsp82 to hsp90 proteins of other organisms. The deduced hsp89 alpha protein sequence differed from the human hsp89 beta sequence reported elsewhere (N. F. Rebbe, J. Ware, R. M. Bertina, P. Modrich, and D. W. Stafford, Gene 53:235-245, 1987) in at least 99 out of the 732 amino acids. Transcription of the hsp89 alpha gene was induced by serum during normal cell growth, but expression did not appear to be restricted to a particular stage of the cell cycle. hsp89 alpha mRNA was considerably more stable than the mRNA encoding hsp70, which can account for the higher constitutive rate of hsp89 synthesis in unstressed cells.

