Apicomplexan-like parasites are polyphyletic and widely but selectively dependent on cryptic plastid organelles

eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Jan Janouškovec ◽  
Gita G Paskerova ◽  
Tatiana S Miroliubova ◽  
Kirill V Mikhailov ◽  
Thomas Birley ◽  
...  

The phylum Apicomplexa comprises human pathogens such as Plasmodium but is also an under-explored hotspot of evolutionary diversity central to understanding the origins of parasitism and non-photosynthetic plastids. We generated single-cell transcriptomes for all major apicomplexan groups lacking large-scale sequence data. Phylogenetic analysis reveals that apicomplexan-like parasites are polyphyletic and their similar morphologies emerged convergently at least three times. Gregarines and eugregarines are monophyletic, against most expectations, and rhytidocystids and Eleutheroschizon are sister lineages to medically important taxa. Although previously unrecognized, plastids in deep-branching apicomplexans are common, and they contain some of the most divergent and AT-rich genomes ever found. In eugregarines, however, plastids are either abnormally reduced or absent, thus increasing known plastid losses in eukaryotes from two to four. Environmental sequences of ten novel plastid lineages and structural innovations in plastid proteins confirm that plastids in apicomplexans and their relatives are widespread and share a common, photosynthetic origin.

Author(s):  
Brian L. Browning ◽  
Xiaowen Tian ◽  
Ying Zhou ◽  
Sharon R. Browning

2018 ◽  
Author(s):  
Standwell C. Nkhoma ◽  
Simon G. Trevino ◽  
Karla M. Gorena ◽  
Shalini Nair ◽  
Stanley Khoswe ◽  
...  

Malaria patients can carry one or more clonal lineages of the parasite, Plasmodium falciparum, but the composition of these infections cannot be directly inferred from bulk sequence data. Well-defined, complete haplotypes at single-cell resolution are ideal for describing within-host population structure and unambiguously determining parasite diversity, transmission dynamics and recent ancestry but have not been analyzed on a large scale. We generated 485 near-complete single-cell genome sequences isolated from fifteen P. falciparum patients from Chikhwawa, Malawi, an area of intense malaria transmission. Matched single-cell and bulk genomic analyses revealed patients harbored up to seventeen unique lineages. Estimation of parasite relatedness within patients suggests superinfection by repeated mosquito bites is rarer than co-transmission of parasites from a single mosquito. Our single-cell analysis indicates strong barriers to establishment of new infections in malaria-infected patients and allows high resolution dissection of intra-host variation in malaria parasites.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Liang Chen ◽  
Weinan Wang ◽  
Yuyao Zhai ◽  
Minghua Deng

Single-cell RNA sequencing (scRNA-seq) allows researchers to study cell heterogeneity at the cellular level. A crucial step in analyzing scRNA-seq data is to cluster cells into subpopulations to facilitate downstream analysis. However, frequent dropout events and the increasing size of scRNA-seq data make clustering such high-dimensional, sparse and massive transcriptional expression profiles challenging. Although some existing deep learning-based clustering algorithms for single cells combine dimensionality reduction with clustering, they either ignore the distance and affinity constraints between similar cells or make additional latent space assumptions, such as a Gaussian mixture distribution, and so fail to learn a cluster-friendly low-dimensional space. Therefore, in this paper, we use a denoising autoencoder to characterize scRNA-seq data and propose a soft self-training K-means algorithm to cluster the cell population in the learned latent space. The self-training procedure can effectively aggregate similar cells and pursue a more cluster-friendly latent space. Our method, called ‘scziDesk’, iteratively alternates between data compression, data reconstruction and soft clustering, and the results exhibit excellent compatibility and robustness in both simulated and real data. Moreover, our proposed method has perfect scalability with respect to the number of cells on large-scale datasets.
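The soft clustering step described in this abstract can be illustrated with a generic soft K-means loop. This is only a minimal sketch, not the scziDesk implementation: the abstract does not give the weighting formula, so the softmax over negative squared distances, the `alpha` sharpness parameter, the function names and the toy `latent` points (standing in for autoencoder output) are all assumptions made for illustration.

```python
import math

def soft_assign(points, centers, alpha=1.0):
    """Soft membership weights: softmax over negative squared distances (assumed form)."""
    weights = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        shift = min(d2)  # subtract the minimum for numerical stability
        scores = [math.exp(-alpha * (d - shift)) for d in d2]
        total = sum(scores)
        weights.append([s / total for s in scores])
    return weights

def update_centers(points, weights, centers):
    """Move each center to the weighted mean of all points."""
    dim = len(points[0])
    new_centers = []
    for k, old in enumerate(centers):
        wk = [w[k] for w in weights]
        total = sum(wk)
        if total == 0.0:  # no point has any weight on this cluster: keep it in place
            new_centers.append(list(old))
            continue
        new_centers.append(
            [sum(w * p[j] for w, p in zip(wk, points)) / total for j in range(dim)]
        )
    return new_centers

def soft_kmeans(points, centers, alpha=1.0, n_iter=20):
    """Alternate soft assignment and center updates for a fixed number of rounds."""
    for _ in range(n_iter):
        weights = soft_assign(points, centers, alpha)
        centers = update_centers(points, weights, centers)
    return centers, soft_assign(points, centers, alpha)

# Hypothetical 2-D latent coordinates for four cells (not real data).
latent = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
centers, weights = soft_kmeans(latent, [[0.0, 0.0], [5.0, 5.0]])
labels = [max(range(len(w)), key=w.__getitem__) for w in weights]
print(labels)
```

A larger `alpha` pushes the weights toward hard K-means assignments, while a smaller value lets every cell influence every center.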


2012 ◽  
Vol 93 (10) ◽  
pp. 2195-2203 ◽  
Author(s):  
Martha I. Nelson ◽  
Marie R. Gramer ◽  
Amy L. Vincent ◽  
Edward C. Holmes

To determine the extent to which influenza viruses jump between human and swine hosts, we undertook a large-scale phylogenetic analysis of pandemic A/H1N1/09 (H1N1pdm09) influenza virus genome sequence data. From this, we identified at least 49 human-to-swine transmission events that occurred globally during 2009–2011, thereby highlighting the ability of the H1N1pdm09 virus to transmit repeatedly from humans to swine, even following adaptive evolution in humans. Similarly, we identified at least 23 separate introductions of human seasonal (non-pandemic) H1 and H3 influenza viruses into swine globally since 1990. Overall, these results reveal the frequency with which swine are exposed to human influenza viruses, indicate that humans make a substantial contribution to the genetic diversity of influenza viruses in swine, and emphasize the need to improve biosecurity measures at the human–swine interface, including influenza vaccination of swine workers.


2010 ◽  
Vol 84 (24) ◽  
pp. 12628-12635 ◽  
Author(s):  
Amit Kapoor ◽  
Peter Simmonds ◽  
W. Ian Lipkin

Public databases of nucleotide sequences contain exponentially increasing amounts of sequence data from mammalian genomes. Through the use of large-scale bioinformatic screening for sequences homologous to exogenous mammalian viruses, we found several sequences related to human and animal parvoviruses (PVs) in the Parvovirus and Dependovirus genera within genomes of several mammals, including rats, wallabies, opossums, guinea pigs, hedgehogs, African elephants, and European rabbits. However, phylogenetic analysis of these endogenous parvovirus (EnPV) sequences demonstrated substantial genetic divergence from exogenous mammalian PVs characterized to date. Entire nonstructural and capsid gene sequences of a novel EnPV were amplified and genetically characterized from rat (Rattus norvegicus) genomic DNA. Rat EnPV sequences were most closely related to members of the genus Parvovirus, with >70% and 65% amino acid identities to nonstructural and capsid proteins of canine parvovirus, respectively. Integration of EnPV into chromosome 5 of rats was confirmed by PCR cloning and sequence analysis of the viral and chromosomal junctions. Using inverse PCR, we determined that the rat genome contains a single copy of rat EnPV. Considering mammalian phylogeny, we estimate that EnPV integrated into the rat genome less than 30 million years ago. Comparative phylogenetic analysis done using all known representative exogenous parvovirus (ExPV) and EnPV sequences showed two major genetic groups of EnPVs, one genetically more similar to genus Parvovirus and the other genetically more similar to the genus Dependovirus. The full extent of the genetic diversity of parvoviruses that have undergone endogenization during evolution of mammals and other vertebrates will be recognized only once complete genomic sequences from a wider range of classes, orders, and species of animals become available.


2021 ◽  
Author(s):  
Xuemei Liu ◽  
Wen Li ◽  
Guanda Huang ◽  
Tianlai Huang ◽  
Qingang Xiong ◽  
...  

Algorithms for constructing phylogenetic trees are fundamental to studying the evolution of viruses, bacteria, and other microbes. Established multiple alignment-based algorithms are inefficient for large-scale metagenomic sequence data because of their high requirement of inter-sequence correlation and high computational complexity. In this paper, we present SeqDistK, a novel tool for alignment-free phylogenetic analysis. SeqDistK computes the dissimilarity matrix for phylogenetic analysis, incorporating seven k-mer based dissimilarity measures, namely d2, d2S, d2star, Euclidean, Manhattan, CVTree, and Chebyshev. Based on these dissimilarities, SeqDistK constructs a phylogenetic tree using the Unweighted Pair Group Method with Arithmetic Mean algorithm. Using a gold standard dataset of 16S rRNA and its associated phylogenetic tree, we compared SeqDistK to Muscle, a multiple sequence aligner. SeqDistK was not only 38 times faster than Muscle but also more accurate. SeqDistK achieved the smallest symmetric difference between the inferred and ground truth trees, ranging from 13 to 18, while that of Muscle was 62. When the measures d2, d2star, d2S, or Euclidean were used with k-mer size k=5, SeqDistK consistently inferred a phylogenetic tree almost identical to the ground truth tree. We also performed clustering of 16S rRNA sequences using SeqDistK and found the clustering was highly consistent with known biological taxonomy. Among all the measures, d2S (k=5, M=2) showed the best accuracy as it correctly clustered and classified all sample sequences. In summary, SeqDistK is a novel, fast and accurate alignment-free tool for large-scale phylogenetic analysis. SeqDistK software is freely available at https://github.com/htczero/SeqDistK.
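To make the idea of a k-mer based dissimilarity matrix concrete, the following sketch computes three of the simpler measures named in the abstract (Euclidean, Manhattan, Chebyshev) on k-mer frequency vectors. It does not use or reproduce the SeqDistK code: the function names, the normalization to relative frequencies, the rule of skipping k-mers that contain non-ACGT characters, and the toy sequences are assumptions. The d2S, d2star and CVTree measures and the UPGMA tree-building step are not shown.

```python
import math
from collections import Counter
from itertools import product

def kmer_freqs(seq, k):
    """Relative frequencies of all 4**k DNA k-mers, in a fixed lexicographic order."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Assumption: k-mers containing ambiguous bases (e.g. N) are ignored.
    counts = Counter(m for m in kmers if set(m) <= set("ACGT"))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid k-mers of length %d" % k)
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def chebyshev(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

def dissimilarity_matrix(seqs, k, dist):
    """Symmetric matrix of pairwise dissimilarities, rows in the order of `seqs`."""
    vecs = [kmer_freqs(s, k) for s in seqs]
    return [[dist(u, v) for v in vecs] for u in vecs]

# Toy sequences for illustration only (not 16S rRNA data).
seqs = ["ACGTACGTAC", "ACGTACGTAC", "TTTTTTTTTT"]
m_manhattan = dissimilarity_matrix(seqs, 2, manhattan)
m_chebyshev = dissimilarity_matrix(seqs, 2, chebyshev)
m_euclidean = dissimilarity_matrix(seqs, 2, euclidean)
print(m_manhattan[0][2], m_chebyshev[0][2])
```

A matrix like this is the kind of input a distance-based tree method such as UPGMA consumes, and no alignment is computed at any point.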


2013 ◽  
Vol 29 (23) ◽  
pp. 3014-3019 ◽  
Author(s):  
H. Nordberg ◽  
K. Bhatia ◽  
K. Wang ◽  
Z. Wang

2001 ◽  
Vol 91 (7) ◽  
pp. 648-658 ◽  
Author(s):  
Stephen B. Goodwin ◽  
Larry D. Dunkle ◽  
Victoria L. Zismann

Most of the 3,000 named species in the genus Cercospora have no known sexual stage, although a Mycosphaerella teleomorph has been identified for a few. Mycosphaerella is an extremely large and important genus of plant pathogens, with more than 1,800 named species and at least 43 associated anamorph genera. The goal of this research was to perform a large-scale phylogenetic analysis to test hypotheses about the past evolutionary history of Cercospora and Mycosphaerella. Based on the phylogenetic analysis of internal transcribed spacer (ITS) sequence data (ITS1, 5.8S rRNA gene, ITS2), the genus Mycosphaerella is monophyletic. In contrast, many anamorph genera within Mycosphaerella were polyphyletic and were not useful for grouping species. One exception was Cercospora, which formed a highly supported monophyletic group. Most Cercospora species from cereal crops formed a subgroup within the main Cercospora cluster. Only species within the Cercospora cluster produced the toxin cercosporin, suggesting that the ability to produce this compound had a single evolutionary origin. Intraspecific variation for 25 taxa in the Mycosphaerella clade averaged 1.7 nucleotides (nts) in the ITS region. Thus, isolates with ITS sequences that differ by two or more nucleotides may be distinct species. ITS sequences of groups I and II of the gray leaf spot pathogen Cercospora zeae-maydis differed by 7 nts and clearly represent different species. There were 6.5 nt differences on average between the ITS sequences of the sorghum pathogen Cercospora sorghi and the maize pathogen Cercospora sorghi var. maydis, indicating that the latter is a separate species and not simply a variety of Cercospora sorghi. The large monophyletic Mycosphaerella cluster contained a number of anamorph genera with no known teleomorph associations. Therefore, the number of anamorph genera related to Mycosphaerella may be much larger than suspected previously.
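The rule of thumb in this abstract, that isolates whose ITS sequences differ by two or more nucleotides may be distinct species, can be sketched as a pairwise comparison of aligned sequences. This is an illustration under stated assumptions, not the authors' procedure: the sequences are assumed to be already aligned to equal length, positions with a gap or N in either sequence are skipped (the paper may treat indels differently), and the isolate names and sequences are hypothetical.

```python
from itertools import combinations

def count_differences(a, b):
    """Number of mismatched positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = set("-N")  # assumption: gaps and ambiguous bases are not counted
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )

def flag_candidate_species(aligned, threshold=2):
    """Pairs of isolates whose sequences differ by at least `threshold` nucleotides."""
    return [
        (n1, n2)
        for (n1, s1), (n2, s2) in combinations(aligned.items(), 2)
        if count_differences(s1, s2) >= threshold
    ]

# Hypothetical aligned fragments (not real ITS data).
aligned = {
    "iso1": "ACGTTGCA",
    "iso2": "ACGTTGCA",
    "iso3": "ACCTTGGA",
}
flagged = flag_candidate_species(aligned)
print(flagged)
```

The default `threshold=2` mirrors the "two or more nucleotides" criterion; a flagged pair is only a candidate for further study, in line with the abstract's wording that such isolates "may be" distinct species.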

