An Efficient Online Tool to Search Top-N Genes with Similar Biological Functions in Gene Ontology Database

Author(s):  
James Z. Wang ◽  
Zhidian Du ◽  
Philip S. Yu ◽  
Chin-Fu Chen
2004 ◽  
Vol 20 (18) ◽  
pp. 3442-3454 ◽  
Author(s):  
E. Shoop ◽  
P. Casaes ◽  
G. Onsongo ◽  
L. Lesnett ◽  
E. O. Petursdottir ◽  
...  

2017 ◽  
Author(s):  
Dat Duong ◽  
Wasi Uddin Ahmad ◽  
Eleazar Eskin ◽  
Kai-Wei Chang ◽  
Jingyi Jessica Li

Abstract: The Gene Ontology (GO) database contains GO terms that describe biological functions of genes. Previous methods for comparing GO terms have relied on the fact that GO terms are organized into a tree structure. In this paradigm, the locations of two GO terms in the tree dictate their similarity score. In this paper, we introduce two new solutions for this problem, by focusing instead on the definitions of the GO terms. We apply neural network based techniques from the natural language processing (NLP) domain. The first method does not rely on the GO tree, whereas the second indirectly depends on the GO tree. In our first approach, we compare two GO definitions by treating them as two unordered sets of words. The word similarity is estimated by a word embedding model that maps words into an N-dimensional space. In our second approach, we account for the word ordering within a sentence. We use a sentence encoder to embed GO definitions into vectors and estimate how likely one definition entails another. We validate our methods in two ways. In the first experiment, we test the model's ability to differentiate a true protein-protein network from a randomly generated network. In the second experiment, we test the model's ability to distinguish orthologs from randomly matched genes in human, mouse, and fly. In both experiments, a hybrid of the NLP and GO-tree based methods achieves the best classification accuracy. Availability: github.com/datduong/NLPMethods2CompareGOterms
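
The first approach in the abstract above, comparing two definitions as unordered sets of words scored through word embeddings, can be illustrated with a minimal sketch. The abstract does not say how word-level similarities are aggregated into a definition-level score, so the symmetric best-match average used here is an assumption, not the authors' method. The three-dimensional vectors in `TOY_EMBEDDINGS` and the function names are made up for illustration; a real setup would load vectors from a trained embedding model.

```python
import math

# Hypothetical toy embeddings (made-up values, 3 dimensions for readability).
TOY_EMBEDDINGS = {
    "protein": (0.9, 0.1, 0.0),
    "binding": (0.7, 0.3, 0.1),
    "kinase": (0.8, 0.2, 0.1),
    "transport": (0.1, 0.9, 0.2),
    "membrane": (0.2, 0.8, 0.3),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def directed_score(source_words, target_words, embeddings):
    """Average, over source words, of the best cosine match among target words."""
    src = [w for w in source_words if w in embeddings]
    tgt = [w for w in target_words if w in embeddings]
    if not src or not tgt:
        return 0.0  # no known words on one side: treat as no similarity
    best_matches = [
        max(cosine(embeddings[s], embeddings[t]) for t in tgt) for s in src
    ]
    return sum(best_matches) / len(src)


def definition_similarity(def_a, def_b, embeddings=TOY_EMBEDDINGS):
    """Symmetric best-match average over two definitions seen as word sets."""
    words_a = set(def_a.lower().split())
    words_b = set(def_b.lower().split())
    forward = directed_score(words_a, words_b, embeddings)
    backward = directed_score(words_b, words_a, embeddings)
    return 0.5 * (forward + backward)


close = definition_similarity("protein kinase", "protein binding")
far = definition_similarity("protein kinase", "membrane transport")
```

With these toy vectors, `close` comes out higher than `far`, because "kinase" and "binding" point in nearly the same direction while "transport" and "membrane" do not. Word order plays no role in this score, which is the limitation the abstract's second approach addresses with a sentence encoder.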


2012 ◽  
Vol 2012 ◽  
pp. 1-17 ◽  
Author(s):  
Gaston K. Mazandu ◽  
Nicola J. Mulder

The wide coverage and biological relevance of the Gene Ontology (GO), confirmed through its successful use in protein function prediction, have led to the growth in its popularity. In order to exploit the extent of biological knowledge that GO offers in describing genes or groups of genes, there is a need for an efficient, scalable similarity measure for GO terms and GO-annotated proteins. While several GO similarity measures exist, none adequately addresses all issues surrounding the design and usage of the ontology. We introduce a new metric for measuring the distance between two GO terms using the intrinsic topology of the GO-DAG, thus enabling the measurement of functional similarities between proteins based on their GO annotations. We assess the performance of this metric using a ROC analysis on human protein-protein interaction datasets and correlation coefficient analysis on the selected set of protein pairs from the CESSM online tool. This metric achieves good performance compared to the existing annotation-based GO measures. We used this new metric to assess functional similarity between orthologues, and show that it is effective at determining whether orthologues are annotated with similar functions and identifying cases where annotation is inconsistent between orthologues.


2020 ◽  
Vol 15 (4) ◽  
pp. 318-327
Author(s):  
Najmul Ikram ◽  
Muhammad Abdul Qadir ◽  
Muhammad Tanvir Afzal

Background: The rapidly growing protein and annotation databases necessitate the development of efficient tools to process this valuable information. Biologists frequently need to find proteins similar to a given protein, for which BLAST tools are commonly used. With the development of biomedical ontologies, e.g. Gene Ontology, methods were designed to measure function (semantic) similarity between two proteins. These methods work well on protein pairs, but are not suitable for protein query processing. Objective: Our aim is to facilitate searching of similar proteins in an acceptable time. Methods: A novel method SimExact for high speed searching of functionally similar proteins has been proposed. Results: The experiments of this study show that SimExact gives correct results required for protein searching. A fully functional prototype of an online tool (www.datafurnish.com/protsem.php) has been provided that generates a ranked list of the proteins similar to a query protein, with a response time of less than 20 seconds in our setup. SimExact was used to search for protein pairs having high disparity between function similarity and sequence similarity. Conclusion: SimExact makes such searches practical, which would not be possible in a reasonable time otherwise.
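
The abstract above does not describe how SimExact works internally, so the following is only a generic sketch of the task it addresses: returning a ranked list of the proteins most similar to a query protein. It substitutes a plain Jaccard similarity over GO annotation sets for the paper's measure and scans every protein, which is the slow baseline that a dedicated method would aim to beat. The protein identifiers, the placeholder GO labels, and the function names are all hypothetical.

```python
import heapq

# Hypothetical annotations: protein id -> set of placeholder GO term labels.
ANNOTATIONS = {
    "P1": {"GO:1", "GO:2", "GO:3"},
    "P2": {"GO:1", "GO:2"},
    "P3": {"GO:3", "GO:4"},
    "P4": {"GO:5"},
}


def jaccard(terms_a, terms_b):
    """Jaccard similarity of two term sets (a stand-in, not SimExact's measure)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def top_n_similar(query_id, annotations, n):
    """Return the n proteins most similar to the query as (id, score) pairs."""
    query_terms = annotations[query_id]
    scored = (
        (jaccard(query_terms, terms), protein_id)
        for protein_id, terms in annotations.items()
        if protein_id != query_id
    )
    # heapq.nlargest keeps only n candidates in memory while scanning.
    best = heapq.nlargest(n, scored)
    return [(protein_id, score) for score, protein_id in best]


ranked = top_n_similar("P1", ANNOTATIONS, 2)
```

Here `ranked` lists P2 first and P3 second. The scan costs one similarity computation per protein in the database, which suggests why pairwise measures alone are described as unsuitable for query processing at database scale.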


2019 ◽  
Vol 51 (10) ◽  
pp. 1429-1433 ◽  
Author(s):  
Paul D. Thomas ◽  
David P. Hill ◽  
Huaiyu Mi ◽  
David Osumi-Sutherland ◽  
Kimberly Van Auken ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Lili Li ◽  
Zhi Xie ◽  
Xiliang Qian ◽  
Tai Wang ◽  
Minmin Jiang ◽  
...  

CircRNAs have been reported to play essential roles in regulating immunity and inflammation, and may be an important regulatory factor in the development of vitiligo. However, the expression profile of circRNAs and their potential biological functions in vitiligo have not been reported so far. In our study, we found 64 dysregulated circRNAs and 14 dysregulated miRNAs in patients with vitiligo. Through correlation analysis, we obtained 12 dysregulated circRNAs and 5 dysregulated miRNAs, forming 48 relationships in the circRNA-miRNA-mRNA regulatory network. Gene Ontology analysis indicated that dysregulated circRNAs in vitiligo are closely related to the disorder of the metabolic pathway. KEGG pathway analysis showed that the dysregulated circRNAs were mainly enriched in biological processes such as ubiquitin mediated proteolysis, endocytosis and RNA degradation, and in the Jak-STAT signaling pathway. We therefore found that the circRNA-miRNA-mRNA regulatory network is involved in the regulation of numerous melanocyte functions, and that these dysregulated circRNAs may be closely related to melanocyte metabolism. Our study provides a theoretical basis for studying vitiligo pathogenesis from the perspective of the circRNA-miRNA-mRNA network.

