Inferring Strain Mixture within Clinical Plasmodium falciparum Isolates from Genomic Sequence Data

2016 ◽  
Vol 12 (6) ◽  
pp. e1004824 ◽  
Author(s):  
John D. O’Brien ◽  
Zamin Iqbal ◽  
Jason Wendler ◽  
Lucas Amenga-Etego
2017 ◽  
Vol 61 (12) ◽  
Author(s):  
Alfred Amambua-Ngwa ◽  
Joseph Okebe ◽  
Haddijatou Mbye ◽  
Sukai Ceesay ◽  
Fatima El-Fatouri ◽  
...  

ABSTRACT Antimalarial interventions have yielded a significant decline in malaria prevalence in The Gambia, where artemether-lumefantrine (AL) has been used as a first-line antimalarial for a decade. Clinical Plasmodium falciparum isolates collected from 2012 to 2015 were analyzed ex vivo for antimalarial susceptibility and genotyped for drug resistance markers (pfcrt K76T, pfmdr1 codons 86, 184, and 1246, and pfk13) and microsatellite variation. Additionally, allele frequencies of single nucleotide polymorphisms (SNPs) from other drug resistance-associated genes were compared from genomic sequence data sets from 2008 (n = 79) and 2014 (n = 168). No artemisinin resistance-associated pfk13 mutation was found, and only 4% of the isolates tested in 2015 showed significant growth after exposure to dihydroartemisinin. Conversely, the 50% inhibitory concentrations (IC50s) of amodiaquine and lumefantrine increased within this period. pfcrt 76T and pfmdr1 184F mutants remained at a prevalence above 80%. pfcrt 76T was positively associated with higher IC50s to chloroquine. pfmdr1 NYD increased in frequency between 2012 and 2015 due to lumefantrine selection. The TNYD (pfcrt 76T and pfmdr1 NYD wild-type haplotype) also increased in frequency following AL implementation in 2008. These results suggest selection for pfcrt and pfmdr1 genotypes that enable tolerance to lumefantrine. Increased tolerance to lumefantrine calls for sustained chemotherapeutic monitoring in The Gambia to minimize complete artemisinin combination therapy (ACT) failure in the future.


2015 ◽  
Author(s):  
Lucas Amenga-Etego ◽  
Ruiqi Li ◽  
John D. O’Brien

Abstract: The advent of whole-genome sequencing has generated increased interest in modeling the structure of strain mixture within clinical infections of Plasmodium falciparum (Pf). The life cycle of the parasite implies that the mixture of multiple strains within an infected individual is related to the out-crossing rate across populations, making methods for measuring this process in situ central to understanding the genetic epidemiology of the disease. In this paper, we show how to estimate inbreeding coefficients using genomic data from Pf clinical samples, providing a simple metric for assessing within-sample mixture that connects to an extensive literature in population genetics and conservation ecology. Features of the P. falciparum genome mean that some standard methods for inbreeding coefficients and related F-statistics cannot be used directly. Here, we review an initial effort to estimate the inbreeding coefficient within clinical isolates of P. falciparum and provide several generalizations using both frequentist and Bayesian approaches. The Bayesian approach connects these estimates to the Balding-Nichols model, a mainstay within genetic epidemiology. We provide simulation results on the performance of the estimators and show their use on ~1500 samples from the PF3K data set. We also compare the results to output from a recent mixture model for within-sample strain mixture, showing that inbreeding coefficients provide a strong proxy for the results of these more complex models. We provide the methods described within an open-source R package, pfmix.
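As a rough illustration of the kind of quantity the abstract describes, the sketch below computes a simple moment-style within-sample inbreeding coefficient, F = 1 - H_within / H_population, from per-SNP read counts and population allele frequencies. This is not the estimator from the paper or the pfmix package. The function names, the input layout, and the ratio-of-sums form are assumptions chosen only to make the idea concrete.

```python
from typing import Sequence, Tuple


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def within_sample_f(
    read_counts: Sequence[Tuple[int, int]],
    pop_freqs: Sequence[float],
) -> float:
    """Hypothetical moment-style estimate of a within-sample inbreeding coefficient.

    read_counts: (reference reads, alternate reads) per SNP for one sample.
    pop_freqs:   population alternate-allele frequency per SNP (assumed known).
    Returns 1 - sum(H_within) / sum(H_population) over SNPs with nonzero depth.
    """
    if len(read_counts) != len(pop_freqs):
        raise ValueError("read_counts and pop_freqs must have the same length")

    h_within = 0.0
    h_pop = 0.0
    for (ref, alt), p_pop in zip(read_counts, pop_freqs):
        depth = ref + alt
        if depth == 0:
            continue  # no reads at this SNP, so it carries no information
        p_within = alt / depth  # within-sample alternate-allele frequency
        h_within += heterozygosity(p_within)
        h_pop += heterozygosity(p_pop)

    if h_pop == 0.0:
        raise ValueError("population heterozygosity is zero; F is undefined")
    return 1.0 - h_within / h_pop
```

Under this simplification, a sample with only one allele observed at every SNP gives F = 1 (consistent with a single strain), while a sample whose within-sample heterozygosity matches the population value gives F = 0.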


2020 ◽  
Vol 15 ◽  
Author(s):  
Affan Alim ◽  
Abdul Rafay ◽  
Imran Naseem

Background: Proteins contribute significantly to every task of cellular life. Their functions encompass the building and repairing of tissues in human bodies and other organisms; hence they are the building blocks of bones, muscles, cartilage, skin, and blood. Similarly, antifreeze proteins (AFPs) are of prime significance for organisms that live in very cold areas. With the help of these proteins, cold-water organisms can survive at sub-zero temperatures and resist the water crystallization process that may rupture internal cells and tissues. AFPs have attracted attention and interest in the food industry and in cryopreservation. Objective: With the increasing availability of genomic sequence data for proteins, an automated and sophisticated tool for AFP recognition and identification is in dire need. The sequences and structures of AFPs are highly distinct; therefore, most of the proposed methods fail to show promising results on different structures. A consolidated method is proposed to produce competitive performance on highly distinct AFP structures. Methods: In this study, we propose to use the machine learning-based algorithms Principal Component Analysis (PCA) followed by Gradient Boosting (GB) for antifreeze protein identification. To analyze the performance and validate the proposed model, various combinations of two-segment compositions of amino acids and dipeptides are used. PCA, in particular, is proposed for dimension reduction while retaining the high-variance components of the data, and is followed by an ensemble method, gradient boosting, for modelling and classification. Results: The proposed method obtained superior performance on the PDB, Pfam and UniProt datasets as compared with the RAFP-Pred method. In experiment-3, by utilizing only 150 PCA components, a high accuracy of 89.63 was achieved, which is superior to the 87.41 obtained utilizing 300 significant features reported for the RAFP-Pred method. Experiment-2 is conducted using two different datasets, with non-AFPs from the PISCES server and AFPs from the Protein Data Bank. In experiment-2, our proposed method attained a high sensitivity of 79.16, which is 12.50 better than the state-of-the-art RAFP-Pred method. Conclusion: AFPs have a common function with distinct structures. Therefore, the development of a single model for different sequences often fails for AFPs. Robust results have been shown by our proposed model across diverse training and testing datasets. The proposed model outperformed the previous AFP prediction method, RAFP-Pred. Our model consists of PCA for dimension reduction followed by gradient boosting for classification. Owing to its simplicity, scalability and high performance, our model can be easily extended to analyzing proteomic and genomic datasets.
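The amino acid and dipeptide composition features mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code. The midpoint split used for the "two segments" is an assumption about what that phrase means, and all function names are hypothetical. The resulting vectors would then be passed to PCA and a gradient boosting classifier, which are not shown here.

```python
from itertools import product
from typing import List

# The 20 standard amino acids and all 400 ordered pairs of them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq: str) -> List[float]:
    """Fraction of each standard amino acid in seq (20 values)."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> List[float]:
    """Fraction of each overlapping standard dipeptide in seq (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]


def two_segment_features(seq: str) -> List[float]:
    """Concatenate both compositions for each half of seq (2 * 420 = 840 values).

    Splitting at the midpoint is an assumed reading of "two segments".
    """
    mid = len(seq) // 2
    features: List[float] = []
    for segment in (seq[:mid], seq[mid:]):
        features.extend(amino_acid_composition(segment))
        features.extend(dipeptide_composition(segment))
    return features
```

Each sequence maps to a fixed-length vector regardless of its length, which is what makes a downstream dimension-reduction step such as PCA applicable.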


Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1212
Author(s):  
J. Spencer Johnston ◽  
Carl E. Hjelmen

Next-generation sequencing provides a nearly complete genomic sequence for model and non-model species alike; however, this wealth of sequence data includes no road map [...]


2014 ◽  
Vol 64 (Pt_2) ◽  
pp. 316-324 ◽  
Author(s):  
Jongsik Chun ◽  
Fred A. Rainey

The polyphasic approach used today in the taxonomy and systematics of the Bacteria and Archaea includes the use of phenotypic, chemotaxonomic and genotypic data. The use of 16S rRNA gene sequence data has revolutionized our understanding of the microbial world and led to a rapid increase in the number of descriptions of novel taxa, especially at the species level. It has allowed in many cases for the demarcation of taxa into distinct species, but its limitations in a number of groups have resulted in the continued use of DNA–DNA hybridization. As technology has improved, next-generation sequencing (NGS) has provided a rapid and cost-effective approach to obtaining whole-genome sequences of microbial strains. Although some 12 000 bacterial or archaeal genome sequences are available for comparison, only 1725 of these are of actual type strains, limiting the use of genomic data in comparative taxonomic studies when there are nearly 11 000 type strains. Efforts to obtain complete genome sequences of all type strains are critical to the future of microbial systematics. The incorporation of genomics into the taxonomy and systematics of the Bacteria and Archaea coupled with computational advances will boost the credibility of taxonomy in the genomic era. This special issue of International Journal of Systematic and Evolutionary Microbiology contains both original research and review articles covering the use of genomic sequence data in microbial taxonomy and systematics. It includes contributions on specific taxa as well as outlines of approaches for incorporating genomics into new strain isolation to new taxon description workflows.


2013 ◽  
Vol 5 (12) ◽  
pp. 2410-2419 ◽  
Author(s):  
Adam D. Leaché ◽  
Rebecca B. Harris ◽  
Max E. Maliska ◽  
Charles W. Linkem

2021 ◽  
Vol 16 ◽  
Author(s):  
Jinghao Peng ◽  
Jiajie Peng ◽  
Haiyin Piao ◽  
Zhang Luo ◽  
Kelin Xia ◽  
...  

Background: The open and accessible regions of the chromosome are more likely to be bound by transcription factors, which are important for nuclear processes and biological functions. Studying changes in chromosome flexibility can help to discover and analyze disease markers and improve the efficiency of clinical diagnosis. Current methods for predicting chromosome flexibility based on Hi-C data include the flexibility-rigidity index (FRI) and the Gaussian network model (GNM). However, these methods require chromosome structure data from 3D biological experiments, which are time-consuming and expensive. Objective: Generally, the folding and curling of the DNA double helix have a great impact on chromosome flexibility and function. Motivated by the success of genomic sequence analysis in biomolecular function analysis, we aim to propose a method to predict chromosome flexibility based only on genomic sequence data. Method: We propose a new method (named "DeepCFP") that uses deep learning models to predict chromosome flexibility based only on genomic sequence features. The model has been tested in the GM12878 cell line. Results: The maximum accuracy of our model reached 91%. The performance of DeepCFP is close to that of FRI and GNM. Conclusion: DeepCFP can achieve high performance based only on genomic sequence.

