phydms: software for phylogenetic analyses informed by deep mutational scanning

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3657 ◽  
Author(s):  
Sarah K. Hilton ◽  
Michael B. Doud ◽  
Jesse D. Bloom

It has recently become possible to experimentally measure the effects of all amino-acid point mutations to proteins using deep mutational scanning. These experimental measurements can inform site-specific phylogenetic substitution models of gene evolution in nature. Here we describe software that efficiently performs analyses with such substitution models. This software, phydms, can be used to compare the results of deep mutational scanning experiments to the selection on genes in nature. Given a phylogenetic tree topology inferred with another program, phydms enables rigorous comparison of how well different experiments on the same gene capture actual natural selection. It also enables re-scaling of deep mutational scanning data to account for differences in the stringency of selection in the lab and nature. Finally, phydms can identify sites that are evolving differently in nature than expected from experiments in the lab. As data from deep mutational scanning experiments become increasingly widespread, phydms will facilitate quantitative comparison of the experimental results to the actual selection pressures shaping evolution in nature.

2017 ◽  
Author(s):  
Sarah K. Hilton ◽  
Michael B Doud ◽  
Jesse D Bloom

Abstract Background: The evolution of protein-coding genes can be quantitatively modeled using phylogenetic methods. Recently, it has been shown that high-throughput experimental measurements of mutational effects made via deep mutational scanning can inform site-specific phylogenetic substitution models of gene evolution. However, there is currently no software tailored for such analyses. Results: We describe software that efficiently performs phylogenetic analyses with substitution models informed by deep mutational scanning. This software, phydms, is ∼100-fold faster than existing programs that accommodate such substitution models. It can be used to compare the results of deep mutational scanning experiments to the selection on genes in nature. For instance, phydms enables rigorous comparison of how well different experiments on the same gene describe natural selection. It also enables the re-scaling of deep mutational scanning data to account for differences in the stringency of selection in the lab and nature. Finally, phydms can identify sites that are evolving differently in nature than expected from experiments in the lab. Conclusions: The phydms software makes it easy to use phylogenetic substitution models informed by deep mutational scanning experiments. As data from such experiments become increasingly widespread, phydms will facilitate quantitative comparison of the experimental results to the actual selection pressures shaping evolution in nature.
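The re-scaling mentioned in the abstract can be illustrated with a minimal sketch. It assumes the re-scaling takes the form of a single stringency exponent (called beta here) applied to the amino-acid preferences measured at one site, followed by renormalization. The function name, the three-letter alphabet, and the numbers are hypothetical and are not taken from the phydms API.

```python
def rescale_preferences(prefs, beta):
    """Raise one site's amino-acid preferences to the power beta and renormalize.

    Assumption: stringency is modeled as a single exponent.
    beta > 1 sharpens the preferences (stronger selection than in the lab);
    beta < 1 flattens them (weaker selection than in the lab).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    powered = {aa: p ** beta for aa, p in prefs.items()}
    total = sum(powered.values())
    return {aa: value / total for aa, value in powered.items()}


# Hypothetical preferences at one site (toy three-letter alphabet).
site = {"A": 0.6, "G": 0.3, "S": 0.1}

flat = rescale_preferences(site, 0.5)   # weaker selection than in the lab
sharp = rescale_preferences(site, 2.0)  # stronger selection than in the lab
```

Under this assumption, the favored amino acid gains weight when beta is above 1 and loses weight when beta is below 1, while the preferences at the site still sum to one.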


1990 ◽  
Vol 111 (6) ◽  
pp. 2537-2542 ◽  
Author(s):  
R Hinrichsen ◽  
E Wilson ◽  
T Lukas ◽  
T Craig ◽  
J Schultz ◽  
...  

The ability of microinjected calmodulin to temporarily restore an ion channel-mediated behavioral phenotype of a calmodulin mutant in Paramecium tetraurelia (cam1) is dependent on the amino acid side chain that is present at residue 101, even when there is extensive variation in the rest of the amino acid sequence. Analysis of conservation of serine-101 in calmodulin suggests that the ability of calmodulin to regulate this ion channel-associated cell function may be a biological role of calmodulin that is widely distributed phylogenetically. A series of mutant calmodulins that differ only at residue 101 were produced by in vitro site-specific mutagenesis and expression in Escherichia coli, purified to chemical homogeneity, and tested for their ability to temporarily restore a wild-type behavioral phenotype to cam1 (pantophobiacA1) Paramecium. Calmodulins with glycine-101 or tyrosine-101 had minimal activity; calmodulins with phenylalanine-101 or alanine-101 had no detectable activity. However, as a standard of comparison, all of the calmodulins were able to activate a calmodulin-regulated enzyme, myosin light chain kinase, that is sensitive to point mutations elsewhere in the calmodulin molecule. Overall, these results support the hypothesis that the structural features of calmodulin required for the transduction of calcium signals vary with the particular pathway that is being regulated and provide insight into why inherited mutations of calmodulin at residue 101 are nonlethal and selective in their phenotypic effects.


2007 ◽  
Vol 81 (10) ◽  
pp. 4981-4990 ◽  
Author(s):  
Mustafa Hasoksuz ◽  
Konstantin Alekseev ◽  
Anastasia Vlasova ◽  
Xinsheng Zhang ◽  
David Spiro ◽  
...  

ABSTRACT Coronaviruses (CoVs) possess large RNA genomes and exist as quasispecies, which increases the possibility of adaptive mutations and interspecies transmission. Recently, CoVs were recognized as important pathogens in captive wild ruminants. This is the first report of the isolation and detailed genetic, biologic, and antigenic characterization of a bovine-like CoV from a giraffe (Giraffa camelopardalis) in a wild-animal park in the United States. CoV particles were detected by immune electron microscopy in fecal samples from three giraffes with mild-to-severe diarrhea. From one of the three giraffe samples, a CoV (GiCoV-OH3) was isolated and successfully adapted to serial passage in human rectal tumor 18 cell cultures. Hemagglutination assays, receptor-destroying enzyme activity, hemagglutination inhibition, and fluorescence focus neutralization tests revealed close biological and antigenic relationships between the GiCoV-OH3 isolate and selected respiratory and enteric bovine CoV (BCoV) strains. When orally inoculated into a BCoV-seronegative gnotobiotic calf, GiCoV-OH3 caused severe diarrhea and virus shedding within 2 to 3 days. Sequence comparisons and phylogenetic analyses were performed to assess its genetic relatedness to other CoVs. Molecular characterization confirmed that the new isolate belongs to group 2a of the mammalian CoVs and revealed closer genetic relatedness between GiCoV-OH3 and the enteric BCoVs BCoV-ENT and BCoV-DB2, whereas BCoV-Mebus was more distantly related. Detailed sequence analysis of the GiCoV-OH3 spike gene demonstrated the presence of a deletion in the variable region of the S1 subunit (from amino acid 543 to amino acid 547), which is a region associated with pathogenicity and tissue tropism for other CoVs. 
The point mutations identified in the structural proteins (by comparing GiCoV-OH3, BCoV-ENT, BCoV-DB2, and BCoV-Mebus) were most conserved among GiCoV-OH3, BCoV-ENT, and BCoV-DB2, whereas most of the point mutations in the nonstructural proteins were unique to GiCoV-OH3. Our results confirm the existence of a bovine-like CoV transmissible to cattle from wild ruminants, namely, giraffes, but with certain genetic properties different from those of BCoVs.


Author(s):  
Bui Quang Minh ◽  
Cuong Cao Dang ◽  
Le Sy Vinh ◽  
Robert Lanfear

Abstract Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this paper, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein dataset consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration of rate mixture models among sites. We provide QMaker as a user-friendly function in the IQ-TREE software package (http://www.iqtree.org) supporting the use of multiple CPU cores so that biologists can easily estimate amino acid substitution models from their own protein alignments. We used QMaker to estimate new empirical general amino acid substitution models from the current Pfam database as well as five clade-specific models for mammals, birds, insects, yeasts, and plants. Our results show that the new models considerably improve the fit between model and data and in some cases influence the inference of phylogenetic tree topologies.


2021 ◽  
Author(s):  
Cuong Cao Dang ◽  
Bui Quang Minh ◽  
Hanon McShea ◽  
Joanna Masel ◽  
Jennifer Eleanor James ◽  
...  

Amino acid substitution models are a key component in phylogenetic analyses of protein sequences. All amino acid models available to date are time-reversible, an assumption designed for computational convenience but not for biological reality. Another significant downside to time-reversible models is that they do not allow inference of rooted trees without outgroups. In this paper, we introduce nQMaker, a maximum likelihood approach that extends the recently published QMaker method and allows the estimation of time non-reversible amino acid substitution models and rooted phylogenetic trees from a set of protein sequence alignments. We show that the non-reversible models estimated with nQMaker are a much better fit to empirical alignments than pre-existing reversible models, across a wide range of datasets including mammals, birds, plants, fungi, and other taxa, and that the improvements in model fit scale with the size of the dataset. Notably, for the recently published plant and bird trees, these non-reversible models correctly recovered the commonly known root placements with very high statistical support without the need to use an outgroup. We provide nQMaker as an easy-to-use feature in the IQ-TREE software (http://www.iqtree.org), allowing users to estimate non-reversible models and rooted phylogenies from their own protein datasets.
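The distinction between time-reversible and non-reversible models can be made concrete with a small sketch. A rate matrix Q with stationary frequencies π is time-reversible when π_i Q_ij = π_j Q_ji for every pair of states (detailed balance). The code below is an illustration on a toy three-state alphabet and is not part of QMaker, nQMaker, or IQ-TREE; all function names and numbers are hypothetical.

```python
def gtr_rate_matrix(exchangeabilities, freqs):
    """Build a reversible rate matrix with Q[i][j] = r[i][j] * freqs[j] for i != j.

    `exchangeabilities` must be symmetric; each diagonal entry is set so that
    the row sums to zero.
    """
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exchangeabilities[i][j] * freqs[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 here, so this sums the off-diagonals
    return q


def is_time_reversible(q, freqs, tol=1e-12):
    """Check detailed balance: freqs[i] * Q[i][j] == freqs[j] * Q[j][i] for all i < j."""
    n = len(freqs)
    return all(
        abs(freqs[i] * q[i][j] - freqs[j] * q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Toy reversible model: symmetric exchangeabilities plus state frequencies.
freqs = [0.5, 0.3, 0.2]
exch = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 3.0],
        [2.0, 3.0, 0.0]]
reversible_q = gtr_rate_matrix(exch, freqs)

# Toy non-reversible model: substitutions only flow 0 -> 1 -> 2 -> 0.
# Its stationary distribution is uniform, yet detailed balance fails.
cyclic_q = [[-1.0, 1.0, 0.0],
            [0.0, -1.0, 1.0],
            [1.0, 0.0, -1.0]]
uniform = [1.0 / 3.0] * 3

print(is_time_reversible(reversible_q, freqs))
print(is_time_reversible(cyclic_q, uniform))
```

In the cyclic example the process has a preferred direction of change, which is the kind of asymmetry a reversible model cannot represent and which carries information about the direction of time along a tree.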


mSphere ◽  
2017 ◽  
Vol 2 (3) ◽  
Author(s):  
Kentaro Tohma ◽  
Cara J. Lepore ◽  
Lauren A. Ford-Siltz ◽  
Gabriel I. Parra

ABSTRACT Norovirus is the leading cause of acute gastroenteritis worldwide. For over two decades, a single genotype (GII.4) has been responsible for most norovirus-associated cases. However, during the winter of 2014 to 2015, the GII.4 strains were displaced by a rarely detected genotype (GII.17) in several countries of the Asian continent. Moreover, during the winter of 2016 to 2017, the GII.2 strain reemerged as predominant in different countries worldwide. This reemerging GII.2 strain is a recombinant virus that presents a GII.P16 polymerase genotype. In this study, we investigated the evolutionary dynamics of GII.2 to determine the mechanism of this sudden emergence in the human population. The phylogenetic analyses indicated strong linear evolution of the VP1-encoding sequence, albeit with minor changes in the amino acid sequence over time. Without major genetic differences among the strains, a clustering based on the polymerase genotype was observed in the tree. This association did not affect the substitution rate of the VP1. Phylogenetic analyses of the polymerase region showed that reemerging GII.P16-GII.2 strains diverged into a new cluster, with a small number of amino acid substitutions detected on the surface of the associated polymerase. Thus, besides recombination or antigenic shift, point mutations in nonstructural proteins could also lead to novel properties with epidemic potential in different norovirus genotypes. IMPORTANCE Noroviruses are a major cause of gastroenteritis worldwide. Currently, there is no vaccine or specific antiviral available to treat norovirus disease. Multiple norovirus strains infect humans, but a single genotype (GII.4) has been regarded as the most important cause of viral gastroenteritis outbreaks worldwide. 
Its persistence and predominance have been explained by the continuous replacement of variants that present new antigenic properties on their capsid protein, thus evading the herd immunity acquired to the previous variants. Over the last three seasons, minor genotypes have displaced the GII.4 viruses as the predominant strains. One of these genotypes, GII.2, reemerged as predominant during 2016 to 2017. Here we show that factors such as minor changes in the polymerase may have driven the reemergence of GII.2 during the last season. A better understanding of norovirus diversity is important for the development of effective treatments against noroviruses.


1992 ◽  
Vol 176 (2) ◽  
pp. 449-457 ◽  
Author(s):  
N L Lill ◽  
M J Tevethia ◽  
W G Hendrickson ◽  
S S Tevethia

The 94-kD large tumor (T) antigen specified by simian virus 40 (SV40) is sufficient to induce cell transformation. T antigen contains four H-2Db-restricted cytotoxic T lymphocyte (CTL) recognition epitopes that are targets for CTL clones Y-1, Y-2, Y-3, and Y-5. These epitopes have been mapped to T antigen amino acids 207-215 (site I), 223-231 (sites II and III), and 489-497 (site V), respectively. Antigenic site loss variant cells that had lost one or more CTL recognition epitopes were previously selected by coculturing SV40-transformed H-2Db cells with the site-specific Db-restricted CTL clones. The genetic bases for T antigen CTL recognition epitope loss from the variant cells were identified by DNA amplification and direct sequencing of epitope-coding regions from variant cell DNAs. Cells selected for resistance to CTL clone Y-1 (K-1; K-1,4,5; K-3,1) carry deleted SV40 genomes lacking site I, II, and III coding sequences. Point mutations present within the site II/III coding region of Y-2-/Y-3-resistant cell lines specify the substitution of asparagine for lysine as T antigen amino acid 228 (K-2) or phenylalanine for tyrosine at position 230 (K-3). Point mutations identified within independently selected Y-5 resistant populations (K-5 and K-1,4,5) direct the substitution of isoleucine for asparagine at position 496 (K-5) or the substitution of phenylalanine for isoleucine at position 491 (K-1,4,5) of T antigen. Each substitution causes loss of the relevant CTL recognition epitope, apparently by compromising CTL T cell receptor recognition. These experiments identify specific amino acid changes within a transforming protein that facilitate transformed cell escape from site-specific CTL clones while allowing maintenance of cellular transformation. 
This experimental model system provides unique opportunities for studying mechanisms of transformed cell escape from active immunosurveillance in vivo, and for analysis of differential host immune responses to wild-type and mutant cell-transforming proteins.


2021 ◽  
Author(s):  
Bui Quang Minh ◽  
Cuong Cao Dang ◽  
Le Sy Vinh ◽  
Robert Lanfear

Abstract Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration of rate mixture models among sites. We provide QMaker as a user-friendly function in the IQ-TREE software package (http://www.iqtree.org) supporting the use of multiple CPU cores so that biologists can easily estimate amino acid substitution models from their own protein alignments. We used QMaker to estimate new empirical general amino acid substitution models from the current Pfam database as well as five clade-specific models for mammals, birds, insects, yeasts, and plants. Our results show that the new models considerably improve the fit between model and data and in some cases influence the inference of phylogenetic tree topologies. [Amino acid replacement matrices; amino acid substitution models; maximum likelihood estimation; phylogenetic inferences.]


2021 ◽  
Author(s):  
Trung Thien Tran ◽  
Anh Tuan Nguyen ◽  
Duc Trong Quach ◽  
Dao Thi-Hong Pham ◽  
Nga Minh Cao ◽  
...  

Abstract Background: Amoxicillin-resistant Helicobacter pylori (H. pylori) strains seem to have increased over time in Vietnam. This threatens the effectiveness of H. pylori eradication therapies with this antibiotic. This study aimed to investigate the prevalence of primary resistance of H. pylori to amoxicillin and to assess its association with pbp1A point mutations in Vietnamese patients. Materials and Methods: Naive patients who presented with dyspepsia undergoing upper gastrointestinal endoscopy were recruited. Rapid urease tests and PCR assays were used to diagnose H. pylori infection. Amoxicillin susceptibility was examined by E-tests. Molecular detection of the mutant pbp1A gene conferring amoxicillin resistance was carried out by real-time PCR followed by direct sequencing of the PCR products. Phylogenetic analyses were performed using the Tamura-Nei genetic distance model and the neighbour-joining tree building method. Results: There were 308 patients (46.1% men and 53.9% women, p = 0.190) with H. pylori infection. The mean age of the patients was 40.5 ± 11.4 years, ranging from 18 to 74 years old. The E-test was used to determine the susceptibility to amoxicillin (minimum inhibitory concentration (MIC) ≤ 0.125 µg/ml) in 101 isolates, among which the rate of primarily resistant strains to amoxicillin was 25.7%. Then, 270 sequences of pbp1A gene fragments were analysed. There were 77 amino acid substitution positions investigated, spanning amino acids 310–596, with the proportion varying from 0.4% to 100%. Seven amino acid changes were significantly different between amoxicillin-sensitive (AmoxS) and amoxicillin-resistant (AmoxR) samples, including Phe366 to Leu (p < 0.001), Ser414 to Arg (p < 0.001), Glu/Asn464−465 (p = 0.009), Val469 to Met (p = 0.021), Phe473 to Val (p < 0.001), Asp479 to Glu (p = 0.044), and Ser/Ala/Gly595−596 (p = 0.001). Phylogenetic analyses suggested that other molecular mechanisms might contribute to amoxicillin resistance in H. pylori in addition to the alterations in PBP1A. Conclusions: We reported the emergence of amoxicillin-resistant Helicobacter pylori strains in Vietnam and new mutations statistically associated with this antimicrobial resistance. Additional studies are necessary to identify the mechanisms contributing to this resistance in Vietnam.


2018 ◽  
Author(s):  
Hossam H Tayeb ◽  
Marina Stienecker ◽  
Anton Middelberg ◽  
Frank Sainsbury

Biosurfactants are surface-active molecules that can be produced by renewable, industrially scalable biologic processes. DAMP4, a designer biosurfactant, enables the modification of interfaces via genetic or chemical fusion to functional moieties. However, bioconjugation of addressable amines introduces heterogeneity that limits the precision of functionalization as well as the resolution of interfacial characterization. Here we designed DAMP4 variants with cysteine point mutations to allow for site-specific bioconjugation. The DAMP4 variants were shown to retain the structural stability and interfacial activity characteristic of the parent molecule, while permitting efficient and specific conjugation of polyethylene glycol (PEG). PEGylation results in a considerable reduction in the interfacial activity of both single and double mutants. Comparison of conjugates with one or two conjugation sites shows that both the number of conjugates and the mass of conjugated material affect the interfacial activity of DAMP4. As a result, the ability of DAMP4 variants with multiple PEG conjugates to impart colloidal stability on peptide-stabilized emulsions is reduced. We suggest that this is due to constraints on the structure of amphiphilic helices at the interface. Specific and efficient bioconjugation permits the exploration and investigation of the interfacial properties of designer protein biosurfactants with molecular precision. Our findings should therefore inform the design and modification of biosurfactants for their increasing use in industrial processes, and nutritional and pharmaceutical formulations.

