Humanization of yeast genes with multiple human orthologs reveals principles of functional divergence between paralogs

2019 ◽  
Author(s):  
Jon M. Laurent ◽  
Riddhiman K. Garge ◽  
Ashley I. Teufel ◽  
Claus O. Wilke ◽  
Aashiq H. Kachroo ◽  
...  

Abstract: Despite over a billion years of evolutionary divergence, several thousand human genes possess clearly identifiable orthologs in yeast, and many have undergone lineage-specific duplications in one or both lineages. The ortholog conjecture postulates that orthologous genes between species retain ancestral functions despite divergence over vast timescales, but duplicated genes will be free to diverge in function. However, the retention of ancestral functions among co-orthologs between species and within gene families has been difficult to test experimentally at scale. In order to investigate how ancestral functions are retained or lost post-duplication, we systematically replaced hundreds of essential yeast genes with their human orthologs from gene families that have undergone lineage-specific duplications, including those with single duplications (one yeast gene to two human genes, 1:2) or higher-order expansions (1:>2) in the human lineage. We observe a variable pattern of replaceability across different ortholog classes, with an obvious trend towards differential replaceability inside gene families, rarely observing replaceability by all members of a family. We quantify the ability of various properties of the orthologs to predict replaceability, showing that in the case of 1:2 orthologs, replaceability is predicted largely by the divergence and tissue-specific expression of the human co-orthologs, i.e., the human proteins that are less diverged from their yeast counterpart and more ubiquitously expressed across human tissues more often replace their single yeast ortholog. These trends were consistent with in silico simulations demonstrating that when only one ortholog is replaceable, it tends to be the least diverged of the pair. Replaceability of yeast genes having more than two human co-orthologs was marked by retention of orthologous interactions in functional or protein networks as well as by more ancestral subcellular localization.
Overall, we performed >400 human gene replaceability assays revealing 56 new human-yeast complementation pairs, thus opening up avenues to further functionally characterize these human genes in a simplified organismal context.

2018 ◽  
Author(s):  
Jacob D. Washburn ◽  
Maria Katherine Mejia-Guerra ◽  
Guillaume Ramstein ◽  
Karl A. Kremling ◽  
Ravi Valluru ◽  
...  

Abstract: Deep learning methodologies have revolutionized prediction in many fields, and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two novel approaches that account for evolutionary relatedness in machine learning models: 1) gene-family guided splitting, and 2) ortholog contrasts. The first approach accounts for evolution by constraining the model's training and testing sets to include different gene families. The second uses evolutionarily informed comparisons between orthologous genes to both control for and leverage evolutionary divergence during the training process. The two approaches were explored and validated within the context of mRNA expression level prediction, and have prediction auROC values ranging from 0.72 to 0.94. Model weight inspections showed biologically interpretable patterns, resulting in the novel hypothesis that the 3’ UTR is more important for fine tuning mRNA abundance levels while the 5’ UTR is more important for large scale changes.


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4379 ◽  
Author(s):  
Dan Wang ◽  
Jietang Zhao ◽  
Bing Hu ◽  
Jiaqi Li ◽  
Yaqi Qin ◽  
...  

Sucrose phosphate synthase (SPS, EC 2.4.1.14) is a key enzyme that regulates sucrose biosynthesis in plants. SPS is encoded by different gene families which display differential expression patterns and functional divergence. Genome-wide identification and expression analyses of SPS gene families have been performed in Arabidopsis, rice, and sugarcane, but a comprehensive analysis of the SPS gene family in Litchi chinensis Sonn. has not yet been reported. In the current study, four SPS genes (LcSPS1, LcSPS2, LcSPS3, and LcSPS4) were isolated from litchi. The genomic organization analysis indicated that the four litchi SPS genes have very similar exon-intron structures. A phylogenetic tree showed that LcSPS1-4 were grouped into different SPS families (LcSPS1 and LcSPS2 in the A family, LcSPS3 in the B family, and LcSPS4 in the C family). LcSPS1 and LcSPS4 were strongly expressed in the flowers, while LcSPS3 was most highly expressed in mature leaves. RT-qPCR results showed that LcSPS genes were expressed differentially during aril development between cultivars with different hexose/sucrose ratios. A higher level of expression of LcSPS genes was detected in Wuheli, which accumulates more sucrose in the aril at maturity. The tissue- and developmental stage-specific expression of LcSPS1-4 genes uncovered in this study increases our understanding of the important roles played by these genes in litchi fruits.


PLoS Biology ◽  
2020 ◽  
Vol 18 (5) ◽  
pp. e3000627 ◽  
Author(s):  
Jon M. Laurent ◽  
Riddhiman K. Garge ◽  
Ashley I. Teufel ◽  
Claus O. Wilke ◽  
Aashiq H. Kachroo ◽  
...  

2019 ◽  
Vol 116 (12) ◽  
pp. 5542-5549 ◽  
Author(s):  
Jacob D. Washburn ◽  
Maria Katherine Mejia-Guerra ◽  
Guillaume Ramstein ◽  
Karl A. Kremling ◽  
Ravi Valluru ◽  
...  

Deep learning methodologies have revolutionized prediction in many fields and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two approaches that account for evolutionary relatedness in machine learning models: (i) gene-family–guided splitting and (ii) ortholog contrasts. The first approach accounts for evolution by constraining model training and testing sets to include different gene families. The second approach uses evolutionarily informed comparisons between orthologous genes to both control for and leverage evolutionary divergence during the training process. The two approaches were explored and validated within the context of mRNA expression level prediction and achieve area under the ROC curve (auROC) values ranging from 0.75 to 0.94. Model weight inspections showed biologically interpretable patterns, resulting in the hypothesis that the 3′ UTR is more important for fine-tuning mRNA abundance levels while the 5′ UTR is more important for large-scale changes.
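
The idea of gene-family guided splitting, keeping every member of a gene family on the same side of the train/test boundary, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `family_guided_split`, the `gene_to_family` mapping, and the greedy assignment of shuffled families to the test set are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def family_guided_split(gene_to_family, test_fraction=0.2, seed=0):
    """Split genes into (train, test) so that no gene family spans both sets.

    gene_to_family is a hypothetical dict mapping gene ID -> family ID.
    """
    # Group genes by their family label.
    families = defaultdict(list)
    for gene, family in gene_to_family.items():
        families[family].append(gene)

    # Shuffle whole families (not individual genes) reproducibly.
    family_ids = sorted(families)
    random.Random(seed).shuffle(family_ids)

    # Greedily fill the test set with whole families until it holds
    # roughly the requested fraction of genes; the rest go to training.
    target = test_fraction * len(gene_to_family)
    train, test = [], []
    for fam in family_ids:
        bucket = test if len(test) < target else train
        bucket.extend(families[fam])
    return train, test


# Toy example with made-up gene and family names.
toy = {"g1": "A", "g2": "A", "g3": "B", "g4": "C", "g5": "C", "g6": "D"}
train_genes, test_genes = family_guided_split(toy, test_fraction=0.3, seed=1)
```

Because families are assigned as units, related genes never sit on both sides of the split, at the cost of the test set size only approximating the requested fraction.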


2018 ◽  
Author(s):  
Dongxue Wang ◽  
Basak Eraslan ◽  
Thomas Wieland ◽  
Björn Hallström ◽  
Thomas Hopf ◽  
...  

Abstract: Genome-, transcriptome- and proteome-wide measurements provide valuable insights into how biological systems are regulated. However, even fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we have generated a systematic, quantitative and deep proteome and transcriptome abundance atlas from 29 paired healthy human tissues from the Human Protein Atlas Project and representing human genes by 17,615 transcripts and 13,664 proteins. The analysis revealed that few proteins show truly tissue-specific expression, that vast differences between mRNA and protein quantities within and across tissues exist and that the expression levels of proteins are often more stable across tissues than those of transcripts. In addition, only ~2% of all exome and ~7% of all mRNA variants could be confidently detected at the protein level showing that proteogenomics remains challenging, requires rigorous validation using synthetic peptides and needs more sophisticated computational methods. Many uses of this resource can be envisaged ranging from the study of gene/protein expression regulation to protein biomarker specificity evaluation to name a few.


2021 ◽  
Author(s):  
Daniel Patrick Higgins ◽  
Caroline M Weisman ◽  
Dominique S Lui ◽  
Frank A D'Agostino ◽  
Amy Karol Walker

Genome-wide measurement of mRNA or protein levels provides broad data sets for biological discovery. However, subsequent computational methods are essential for uncovering the functional implications of the data as well as intuitively visualizing the findings. Current computational tools are biased toward well-described pathways, limiting their utility for novel discovery. Recently, we developed an annotation and category enrichment tool for Caenorhabditis elegans genomic data, WormCat, that provides an intuitive visualization output. Unlike GO, which excludes genes with no annotation information, WormCat retains these genes as a special UNASSIGNED category. Here, we show that the UNASSIGNED gene category shows tissue-specific expression patterns and includes genes with biological functions. Poorly annotated genes have previously been considered to lack homologs in closely related species. Instead, we find that around 3% of the UNASSIGNED genes have poorly characterized human orthologs. These human orthologs are themselves poorly characterized. A recently developed method that incorporates lineage relationships (abSENSE) indicates that failure of BLAST to detect homology explains the apparent lineage specificity for many UNASSIGNED genes, suggesting that a larger subset could be related to human genes. WormCat provides an annotation strategy that allows association of UNASSIGNED genes with specific phenotypes and known pathways. Our analysis indicates that the UNASSIGNED gene category contains candidates that merit further functional study, which could yield insight into understudied areas of biology.


Genetics ◽  
2020 ◽  
Vol 215 (4) ◽  
pp. 1153-1169 ◽  
Author(s):  
Riddhiman K. Garge ◽  
Jon M. Laurent ◽  
Aashiq H. Kachroo ◽  
Edward M. Marcotte

Many gene families have been expanded by gene duplications along the human lineage, relative to ancestral opisthokonts, but the extent to which the duplicated genes function similarly is understudied. Here, we focused on structural cytoskeletal genes involved in critical cellular processes, including chromosome segregation, macromolecular transport, and cell shape maintenance. To determine functional redundancy and divergence of duplicated human genes, we systematically humanized the yeast actin, myosin, tubulin, and septin genes, testing ∼81% of human cytoskeletal genes across seven gene families for their ability to complement a growth defect induced by inactivation or deletion of the corresponding yeast ortholog. In five of seven families (all but α-tubulin and light myosin), we found at least one human gene capable of complementing loss of the yeast gene. Despite rescuing growth defects, we observed differential abilities of human genes to rescue cell morphology, meiosis, and mating defects. By comparing phenotypes of humanized strains with deletion phenotypes of their interaction partners, we identify instances of human genes in the actin and septin families capable of carrying out essential functions, but failing to fully complement the cytoskeletal roles of their yeast orthologs, thus leading to abnormal cell morphologies. Overall, we show that duplicated human cytoskeletal genes appear to have diverged such that only a few human genes within each family are capable of replacing the essential roles of their yeast orthologs. The resulting yeast strains with humanized cytoskeletal components now provide surrogate platforms to characterize human genes in simplified eukaryotic contexts.


Plants ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1465
Author(s):  
Ramon de Koning ◽  
Raphaël Kiekens ◽  
Mary Esther Muyoka Toili ◽  
Geert Angenon

Raffinose family oligosaccharides (RFO) play an important role in plants but are also considered to be antinutritional factors. A profound understanding of the galactinol and RFO biosynthetic gene families and the expression patterns of the individual genes is a prerequisite for the sustainable reduction of the RFO content in the seeds, without compromising normal plant development and functioning. In this paper, an overview of the annotation and genetic structure of all galactinol- and RFO biosynthesis genes is given for soybean and common bean. In common bean, three galactinol synthase genes, two raffinose synthase genes and one stachyose synthase gene were identified for the first time. To discover the expression patterns of these genes in different tissues, two expression atlases have been created through re-analysis of publicly available RNA-seq data. De novo expression analysis through an RNA-seq study during seed development of three varieties of common bean gave more insight into the expression patterns of these genes during seed development. The results of the expression analysis suggest that different classes of galactinol- and RFO synthase genes have tissue-specific expression patterns in soybean and common bean. With the obtained knowledge, important galactinol- and RFO synthase genes that specifically play a key role in the accumulation of RFOs in the seeds are identified. These candidate genes may play a pivotal role in reducing the RFO content in the seeds of important legumes, which could improve the nutritional quality of these beans and relieve the discomfort associated with their consumption.

