Identification of residue inversions in large phylogenies of duplicated proteins

2021 ◽  
Author(s):  
Stefano Pascarelli ◽  
Paola Laurino

Connecting protein sequence to function is becoming increasingly relevant since high-throughput sequencing studies accumulate large amounts of genomic data. Protein database annotation helps to bridge this gap; however, it is fundamental to understand the mechanisms underlying functional inheritance and divergence. If the homology relationship between proteins is known, can we determine whether the function diverged? In this work, we analyze different possibilities of protein sequence evolution after gene duplication and identify "residue inversions", i.e., sites where the relationship between the ancestry and the functional signal is decoupled. Residues in these sites play a role in functional divergence and could indicate a shift in protein function. We develop a method to recognize residue inversions in a phylogeny and test it on real and simulated datasets. In a dataset built from the Epidermal Growth Factor Receptor (EGFR) sequences found in 88 fish species, we identify 19 positions that went through inversion after gene duplication, mostly located at the ligand-binding extracellular domain.

2009 ◽  
Vol 37 (4) ◽  
pp. 783-786 ◽  
Author(s):  
Romain A. Studer ◽  
Marc Robinson-Rechavi

The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.


2002 ◽  
Vol 3 (5) ◽  
pp. 423-440 ◽  
Author(s):  
A. J. Pérez ◽  
A. Rodríguez ◽  
O. Trelles ◽  
G. Thode

A method for assigning functions to unknown sequences based on finding correlations between short signals and functional annotations in a protein database is presented. This approach is based on keyword (KW) and feature (FT) information stored in the SWISS-PROT database. The former refers to particular protein characteristics and the latter locates these characteristics at a specific sequence position. In this way, a certain keyword is only assigned to a sequence if sequence similarity is found in the position described by the FT field. Exhaustive tests performed over sequences with homologues (cluster set) and without homologues (singleton set) in the database show that assigning functions is much ’cleaner’ when information about domains (FT field) is used, than when only the keywords are used.
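The position-aware rule in this abstract (a keyword is assigned only when similarity is found at the position given by the FT field) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Feature` record, the `assign_keywords` helper, the pairing of one keyword with one feature range, and the toy coordinates are hypothetical. It does not reproduce the authors' implementation or the SWISS-PROT file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical record pairing a keyword with an annotated position."""
    keyword: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def assign_keywords(features, hit_start, hit_end):
    """Return keywords whose feature range lies entirely inside the
    region of sequence similarity [hit_start, hit_end]."""
    return sorted(
        {f.keyword for f in features if hit_start <= f.start and f.end <= hit_end}
    )


# Toy annotations on a database protein (illustrative values only).
features = [
    Feature("Zinc-finger", 10, 35),
    Feature("Transmembrane", 120, 140),
]

# The query is similar to the database protein only over residues 5-60,
# so only the keyword whose feature falls in that region is transferred.
assigned = assign_keywords(features, hit_start=5, hit_end=60)
print(assigned)
```

Requiring full containment of the feature in the similar region is one possible design choice; a looser overlap criterion would transfer more keywords at the cost of more spurious assignments.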


Plant Disease ◽  
2020 ◽  
Author(s):  
Yeonhwa Jo ◽  
Hoseong Choi ◽  
Jin Kyong Cho ◽  
Won Kyong Cho

Cherry virus F (CVF) is a tentative member of the genus Fabavirus in the family Secoviridae and has a genome consisting of two RNA segments (Koloniuk et al. 2018). To date, CVF has been documented only in sweet cherry (Prunus avium) in the Czech Republic (Koloniuk et al. 2018), Canada, and Greece. In May 2014, we collected leaf samples from four symptomatic (leaf spots and dapple fruits) and two asymptomatic Japanese plum cultivars (Sun and Gadam) grown in an orchard in Hoengseong, South Korea, to identify viruses and viroids infecting plum trees. Total RNA from individual plum trees was extracted using two commercial kits: Fruit-mate for RNA Purification Kit (Takara, Shiga, Japan) and RNeasy Plant Mini Kit (Qiagen, Hilden, Germany). We generated six mRNA libraries from the six different plum cultivars for RNA-sequencing using the TruSeq RNA Library Preparation Kit v2 (Illumina, CA, U.S.A.) as described previously (Jo et al. 2017). The mRNA libraries were paired-end (2 × 100 bp) sequenced with a HiSeq 2000 system (Macrogen, Seoul, Korea). The raw sequence reads were de novo assembled with Trinity v. 2.8.6 using default parameters (Haas et al. 2013). The assembled contigs were subjected to a BLASTX search against the non-redundant protein database in NCBI. Of the two asymptomatic cultivars, the transcriptome of plum cv. Gadam contained five contigs specific to CVF. Two and three contigs were specific to CVF RNA1 (2,571 reads, coverage 42.15%) and RNA2 (2,025 reads, coverage 53.04%), respectively. The size of these five contigs ranged from 241 to 5,986 bp. Contigs of 5,986 and 3,867 bp in length, referred to as CVF isolate Gadam RNA1 (GenBank MN896996) and RNA2 (GenBank MN896995), respectively, were subjected to a BLASTP search against NCBI’s non-redundant protein database. The results showed that the polyprotein sequences of RNA1 and RNA2 shared 95.3% and 93.11% amino acid identities with isolates SwC-H_1a from the Czech Republic (GenBank acc. no. AWB36326) and Stac-3B_c8 from Canada (AZZ10055), respectively. To confirm the infection of CVF in cv. Gadam, RT-PCR was conducted using CVF RNA1-specific primers designed based on the CVF reference genome sequences (MH998210 and MH998216), including 5’-CCACCAAATAGGCAAGAGGTCAC-3’ (position 3190–3212) and 5’-CACAATCACCATCAATGGTCTCTGC-3’ (position 3742–3766), and CVF RNA2-specific primers, including 5’-CTGCTTTATGATGCTAGACATCAAGATG-3’ (position 1015–1042) and 5’-ACAATAGGCATGCTCATCTCAACCTC-3’ (position 1594–1619). We amplified 577-bp RNA1-specific and 605-bp RNA2-specific amplicons, which were cloned and Sanger sequenced. Sequencing of the cloned amplicons for isolate Gadam RNA1 (GenBank MN896993) and RNA2 (GenBank MN896994) revealed 99.48% and 99.17% nucleotide identity to the RNA1 and RNA2 sequences determined by high-throughput sequencing, respectively. Additionally, we tested five plants for each of the six plum cultivars grown in the same orchard. The detection of CVF was carried out through PCR using the primers and protocol described above. Of the 30 trees, CVF was detected in three trees of cv. Gadam by both primer pairs. To our knowledge, this is the first report of CVF infecting Japanese plum and the first report of the virus in Korea. However, its prevalence in other Prunus species, including apricot, European plum, and peach, should be further elucidated.
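The reported amplicon sizes follow from the primer coordinates, which the short check below works through. It is a sketch that assumes the positions are 1-based, inclusive, and that both primers of a pair are numbered on the same reference segment. The helper `span_length` and the dictionary keys are names introduced here for illustration.

```python
def span_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


# Primer sequences and positions as quoted in the abstract.
primers = {
    "RNA1_forward": ("CCACCAAATAGGCAAGAGGTCAC", 3190, 3212),
    "RNA1_reverse": ("CACAATCACCATCAATGGTCTCTGC", 3742, 3766),
    "RNA2_forward": ("CTGCTTTATGATGCTAGACATCAAGATG", 1015, 1042),
    "RNA2_reverse": ("ACAATAGGCATGCTCATCTCAACCTC", 1594, 1619),
}

# Sanity check: does each primer's length equal the span of its positions?
length_matches = {
    name: len(seq) == span_length(start, end)
    for name, (seq, start, end) in primers.items()
}

# Amplicon size: first base of the forward primer to last base of the reverse primer.
rna1_amplicon = span_length(primers["RNA1_forward"][1], primers["RNA1_reverse"][2])
rna2_amplicon = span_length(primers["RNA2_forward"][1], primers["RNA2_reverse"][2])

print(length_matches)
print(rna1_amplicon, rna2_amplicon)
```

Under these assumptions the computed sizes agree with the 577-bp and 605-bp amplicons stated in the text.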


2020 ◽  
Author(s):  
Emily N. Junkins ◽  
Bradley S. Stevenson

Abstract: Molecular techniques continue to reveal a growing disparity between the immense diversity of microbial life and the small proportion that is in pure culture. The disparity, originally dubbed “the great plate count anomaly” by Staley and Konopka, has become even more vexing given our increased understanding of the importance of microbiomes to a host and the role of microorganisms in the vital biogeochemical functions of our biosphere. Searching for novel antimicrobial drug targets often focuses on screening a broad diversity of microorganisms. If diverse microorganisms are to be screened, they need to be cultivated. Recent innovative research has used molecular techniques to assess the efficacy of cultivation efforts, providing invaluable feedback to cultivation strategies for isolating targeted and/or novel microorganisms. Here, we aimed to determine the efficiency of cultivating representative microorganisms from a non-human, mammalian microbiome, identify those microorganisms, and determine the bioactivity of isolates. Molecular methods indicated that around 57% of the ASVs detected in the original inoculum were cultivated in our experiments, but nearly 53% of the total ASVs that were present in our cultivation experiments were not detected in the original inoculum. In light of our controls, our data suggest that when molecular tools were used to characterize our cultivation efforts, they provided a more complete, albeit more complex, understanding of which organisms were present compared to what was eventually cultivated. Lastly, about 3% of the isolates collected from our cultivation experiments showed inhibitory bioactivity against a multidrug-resistant pathogen panel, further highlighting the importance of informing and directing future cultivation efforts with molecular tools.

Importance: Cultivation is the definitive tool to understand a microorganism’s physiology, metabolism, and ecological role(s). Despite continuous efforts to hone this skill, researchers are still observing yet-to-be-cultivated organisms through high-throughput sequencing studies. Here, we use the very same tool that highlights biodiversity to assess cultivation efficiency. When applied to drug discovery, where screening a vast number of isolates for bioactive metabolites is common, cultivating redundant organisms is a hindrance. However, we observed that cultivating in combination with molecular tools can expand the observed diversity of an environment and its community, potentially increasing the number of microorganisms to be screened for natural products.
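The two percentages quoted in this abstract use different denominators, which a small set-based sketch makes explicit. The function `cultivation_summary`, its key names, and the toy ASV identifiers are hypothetical and are not the authors' data or pipeline. The sketch only shows how the share of inoculum ASVs recovered in culture and the share of cultivated ASVs absent from the inoculum could be computed.

```python
def cultivation_summary(inoculum_asvs, cultivated_asvs):
    """Compare ASV sets from the inoculum and from cultivation experiments."""
    inoculum = set(inoculum_asvs)
    cultivated = set(cultivated_asvs)
    shared = inoculum & cultivated
    return {
        # Share of inoculum ASVs that were also recovered in culture.
        "inoculum_recovered_pct": 100 * len(shared) / len(inoculum),
        # Share of cultivated ASVs that were never seen in the inoculum.
        "cultivated_novel_pct": 100 * len(cultivated - inoculum) / len(cultivated),
    }


# Toy identifiers for illustration only.
inoculum = ["asv1", "asv2", "asv3", "asv4"]
cultivated = ["asv1", "asv2", "asv3", "asv5", "asv6"]

summary = cultivation_summary(inoculum, cultivated)
print(summary)
```

Because the first figure is relative to the inoculum and the second to the cultivated set, the two values need not sum to 100%.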


Development ◽  
1999 ◽  
Vol 126 (14) ◽  
pp. 3205-3216 ◽  
Author(s):  
A. Ruiz i Altaba

Several lines of evidence implicate zinc finger proteins of the Gli family in the final steps of Hedgehog signaling in normal development and disease. C-terminally truncated mutant GLI3 proteins are also associated with human syndromes, but it is not clear whether these C-terminally truncated Gli proteins fulfil the same function as full-length ones. Here, structure-function analyses of Gli proteins have been performed using floor plate and neuronal induction assays in frog embryos, as well as induction of alkaline phosphatase (AP) in SHH-responsive mouse C3H10T1/2 (10T1/2) cells. These assays show that C-terminal sequences are required for positive inducing activity and cytoplasmic localization, whereas N-terminal sequences determine dominant negative function and nuclear localization. Analyses of nuclear targeted Gli1 and Gli2 proteins suggest that both activator and dominant negative proteins are modified forms. In embryos and COS cells, tagged Gli cDNAs yield C-terminally deleted forms similar to that of Ci. These results thus provide a molecular basis for the human Polydactyly type A and Pallister-Hall Syndrome phenotypes, derived from the deregulated production of C-terminally truncated GLI3 proteins. Analyses of full-length Gli function in 10T1/2 cells suggest that nuclear localization of activating forms is a regulated event and show that only Gli1 mimics SHH in inducing AP activity. Moreover, full-length Gli3 and all C-terminally truncated forms act antagonistically whereas Gli2 is inactive in this assay. In 10T1/2 cells, protein kinase A (PKA), a known inhibitor of Hh signaling, promotes Gli3 repressor formation and inhibits Gli1 function. Together, these findings suggest a context-dependent functional divergence of Gli protein function, in which a cell represses Gli3, activates Gli1/2, and prevents the formation of repressor Gli forms in order to respond to Shh. Interpretation of Hh signals by Gli proteins therefore appears to involve a fine balance of divergent functions within each and among different Gli proteins, the misregulation of which has profound biological consequences.


Cancers ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1864 ◽  
Author(s):  
Holly Tovey ◽  
Maggie Chon U. Cheang

The concept of precision medicine has been around for many years, and recent advances in high-throughput sequencing techniques are enabling it to become reality. Within the field of breast cancer, a number of signatures have been developed to molecularly sub-classify tumours. Notable examples recently approved by the National Institute for Health and Care Excellence in the UK to guide treatment decisions for oestrogen receptor (ER)+, human epidermal growth factor receptor 2 (HER2)- patients include the Prosigna® test, EndoPredict®, and Oncotype DX®. However, patients with triple negative breast cancer (TNBC) remain a population with unmet need. Accounting for 15–20% of patients, this population has a comparatively poor prognosis and as yet no targeted treatment options. Studies have shown that some patients with TNBC respond favourably to DNA damaging drugs (carboplatin) or agents which inhibit the DNA damage response (poly ADP ribose polymerase (PARP) inhibitors). Because TNBC is known to be a heterogeneous population, there is a need to identify further TNBC patients who may benefit from these treatments. A number of signatures have been identified based on association with treatment response or specific genetic features/pathways; however, many of these were not restricted to TNBC patients and are not yet common practice in the clinic.


2020 ◽  
Vol 230 (1) ◽  
pp. 37-37
Author(s):  
Aissette Baanannou ◽  
Sepand Rastegar ◽  
Amal Bouzid ◽  
Masanari Takamiya ◽  
Vanessa Gerber ◽  
...  
