Latent Dirichlet Allocation based on Gibbs Sampling for gene function prediction

With the continuous accumulation of biological data, a growing number of machine learning algorithms have been applied to gene function prediction, a task of great significance for understanding the mechanisms of life. Recently, a multi-label supervised topic model, labeled latent Dirichlet allocation (LLDA), has been applied to gene function prediction and has produced more accurate and more explainable predictions than conventional methods. However, LLDA can only use a bag of amino acid words as its classification feature and does not support other features, such as hydrophobicity, which has a profound impact on gene function. To model gene function probabilistically with greater accuracy, we propose a multi-label supervised topic model conditioned on arbitrary features, named Dirichlet multinomial regression LLDA (DMR-LLDA), which introduces multiple types of features into the topic modeling process. Based on the DMR framework, DMR-LLDA places an exponential prior, constructed from weighted features, on the hyperparameters of the gene-topic distribution, so that the extra features are reflected in the function probability distribution. In five-fold cross-validation experiments on a yeast dataset, DMR-LLDA significantly outperforms the compared model. These experiments demonstrate the effectiveness and potential value of DMR-LLDA for predicting gene function.
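As a rough illustration of the exponential prior described in the abstract above, the sketch below computes a Dirichlet hyperparameter for each topic as the exponential of a weighted sum of gene features, then restricts it to the gene's observed labels in the LLDA manner. The function names, the feature vector, the weights, and the label mask are all hypothetical; the abstract does not give the exact parameterization, so this should be read as a minimal sketch under those assumptions rather than as the authors' implementation.

```python
import math

def dmr_alpha(features, weights):
    """Return one Dirichlet hyperparameter per topic.

    alpha[k] = exp(sum_f features[f] * weights[k][f]), i.e. the
    log-linear (exponential) form used by Dirichlet multinomial regression.
    """
    return [
        math.exp(sum(x * w for x, w in zip(features, topic_weights)))
        for topic_weights in weights
    ]

def labeled_topic_prior(features, weights, label_mask):
    """Prior mean of the gene-topic distribution, limited to allowed labels.

    Assumption: as in labeled LDA, topics that are not among the gene's
    labels receive zero prior mass.
    """
    alpha = dmr_alpha(features, weights)
    masked = [a if keep else 0.0 for a, keep in zip(alpha, label_mask)]
    total = sum(masked)
    if total == 0.0:
        raise ValueError("the gene must have at least one allowed label")
    return [a / total for a in masked]

# Hypothetical gene: a constant bias feature and a hydrophobicity score.
features = [1.0, 0.5]
# Hypothetical per-topic feature weights (three function labels).
weights = [[0.0, 0.0], [0.0, 2.0], [1.0, -1.0]]
# The gene is annotated with the first two labels only.
label_mask = [True, True, False]

prior_mean = labeled_topic_prior(features, weights, label_mask)
print(prior_mean)
```

In this toy setting the second label gets more prior mass than the first because its weight on the hydrophobicity feature is positive, which is the kind of feature effect the abstract attributes to the prior.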

Faculty Opinions recommendation of The art of gene function prediction.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1056759.508687 ◽

2006 ◽

Author(s):

Martin Noble

Keyword(s):

Gene Function ◽

Function Prediction ◽

Gene Function Prediction

Faculty Opinions recommendation of Network-Based Gene Function Prediction in Mouse and Other Model Vertebrates Using MouseNet Server.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727562216.793535246 ◽

2017 ◽

Author(s):

John Hancock

Keyword(s):

Gene Function ◽

Function Prediction ◽

Gene Function Prediction

Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression

PLoS ONE ◽

10.1371/journal.pone.0134668 ◽

2015 ◽

Vol 10 (8) ◽

pp. e0134668 ◽

Cited By ~ 12

Author(s):

Sonja Lehtinen ◽

Jon Lees ◽

Jürg Bähler ◽

John Shawe-Taylor ◽

Christine Orengo

Keyword(s):

Least Squares ◽

Partial Least Squares ◽

Gene Function ◽

Partial Least Squares Regression ◽

Function Prediction ◽

Least Squares Regression ◽

Gene Function Prediction ◽

Functional Association ◽

Kernel Partial Least Squares

Using similarity learning to improve network-based gene function prediction

2012 IEEE International Conference on Bioinformatics and Biomedicine ◽

10.1109/bibm.2012.6392663 ◽

2012 ◽

Cited By ~ 1

Author(s):

Ngo Phuong Nhung ◽

Tu Minh Phuong

Keyword(s):

Gene Function ◽

Function Prediction ◽

Similarity Learning ◽

Gene Function Prediction

GLDA: Parallel Gibbs Sampling for Latent Dirichlet Allocation on GPU

Communications in Computer and Information Science - Advanced Computer Architecture ◽

10.1007/978-981-10-2209-8_9 ◽

2016 ◽

pp. 97-107 ◽

Cited By ~ 2

Author(s):

Pei Xue ◽

Tao Li ◽

Kezhao Zhao ◽

Qiankun Dong ◽

Wenjing Ma

Keyword(s):

Gibbs Sampling ◽

Latent Dirichlet Allocation ◽

Dirichlet Allocation

Gene Function Prediction and Functional Network: The Role of Gene Ontology

Intelligent Systems Reference Library - Data Mining: Foundations and Intelligent Paradigms ◽

10.1007/978-3-642-23151-3_7 ◽

2012 ◽

pp. 123-162 ◽

Cited By ~ 1

Author(s):

Erliang Zeng ◽

Chris Ding ◽

Kalai Mathee ◽

Lisa Schneper ◽

Giri Narasimhan

Keyword(s):

Gene Ontology ◽

Gene Function ◽

Function Prediction ◽

Functional Network ◽

Gene Function Prediction

Mecanismos de alinhamento de preferências em governos multipartidários: controle de políticas públicas no presidencialismo brasileiro

Opinião Pública ◽

10.1590/1807-01912017232429 ◽

2017 ◽

Vol 23 (2) ◽

pp. 429-458

Author(s):

Victor Araújo

Keyword(s):

Gibbs Sampling ◽

Latent Dirichlet Allocation ◽

Dirichlet Allocation

Abstract. The formation of multiparty governments heightens the risk of information asymmetry between principals and agents, so that cabinet conflicts over policy are reflected in the behavior of parties in parliament. Several studies show that mutual control among the parties in the cabinet is a way to compensate for the loss of information inherent in delegation. While the literature tends to focus on the policy formulation stage, by analyzing the governments formed in Brazil between 1995 and 2014 I argue that there is a more diverse set of strategies that allow parties to scrutinize the policies implemented by their cabinet partners. Using network analysis and quantitative text analysis techniques (the Gibbs Sampling method, a Bayesian algorithm derived from Latent Dirichlet Allocation, LDA), I show that, when ministerial portfolios are distributed to actors with distinct policy preferences, parties intensify their use of Requests for Information (Requerimentos de Informação, RIC) to monitor the ministries and policies that interest them. The structure of intra-cabinet control networks varies with the salience of the ministries: the parties responsible for the portfolios with the largest budget allocations are the actors with the highest degree of centrality in the mutual monitoring networks.

A hierarchical multi-label classification method based on neural networks for gene function prediction

Biotechnology & Biotechnological Equipment ◽

10.1080/13102818.2018.1521302 ◽

2018 ◽

Vol 32 (6) ◽

pp. 1613-1621 ◽

Cited By ~ 4

Author(s):

Shou Feng ◽

Ping Fu ◽

Wenbin Zheng

Keyword(s):

Neural Networks ◽

Gene Function ◽

Function Prediction ◽

Classification Method ◽

Gene Function Prediction

Accurate and efficient gene function prediction using a multi-bacterial network

Bioinformatics ◽

10.1093/bioinformatics/btaa885 ◽

2020 ◽

Author(s):

Jeffrey N Law ◽

Shiv D Kale ◽

T M Murali

Keyword(s):

Gene Function ◽

Bacterial Species ◽

Heterogeneous Data ◽

Function Prediction ◽

Label Propagation ◽

Supplementary Information ◽

Gene Function Prediction ◽

Functional Annotations ◽

A Genome ◽

Multiple Species

Abstract. Motivation: Nearly 40% of the genes in sequenced genomes have no experimentally or computationally derived functional annotations. To fill this gap, we seek to develop methods for network-based gene function prediction that can integrate heterogeneous data for multiple species with experimentally based functional annotations and systematically transfer them to newly sequenced organisms on a genome-wide scale. However, the large sizes of such networks pose a challenge for the scalability of current methods. Results: We develop a label propagation algorithm called FastSinkSource. By formally bounding its rate of progress, we decrease the running time by a factor of 100 without sacrificing accuracy. We systematically evaluate many approaches to construct multi-species bacterial networks and apply FastSinkSource and other state-of-the-art methods to these networks. We find that the most accurate and efficient approach is to pre-compute annotation scores for species with experimental annotations, and then to transfer them to other organisms. In this manner, FastSinkSource runs in under 3 min for 200 bacterial species. Availability and implementation: An implementation of our framework and all data used in this research are available at https://github.com/Murali-group/multi-species-GOA-prediction. Supplementary information: Supplementary data are available at Bioinformatics online.
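To make the idea of label propagation on a gene network concrete, here is a generic sketch in which annotated genes keep fixed scores (1 for positive, 0 for negative) and every unannotated gene is repeatedly set to the weighted average of its neighbors' scores. This is not FastSinkSource itself: the abstract does not describe its update rule or its convergence bounds, and the function name, the toy graph, and the fixed iteration count are assumptions made purely for illustration.

```python
def propagate_labels(edges, fixed, num_iters=50):
    """Generic label propagation on a weighted, undirected graph.

    edges: dict mapping each node to a dict {neighbor: weight}.
    fixed: dict mapping annotated nodes to their fixed score (1.0 or 0.0).
    Unannotated nodes start at 0.0 and are updated synchronously.
    """
    scores = {node: fixed.get(node, 0.0) for node in edges}
    for _ in range(num_iters):
        updated = {}
        for node, neighbors in edges.items():
            if node in fixed:
                # Annotated genes keep their known score.
                updated[node] = fixed[node]
                continue
            total_weight = sum(neighbors.values())
            if total_weight == 0:
                updated[node] = 0.0
            else:
                # Weighted average of the neighbors' current scores.
                weighted_sum = sum(w * scores[m] for m, w in neighbors.items())
                updated[node] = weighted_sum / total_weight
        scores = updated
    return scores

# Hypothetical four-gene path network: a - b - c - d, unit edge weights.
edges = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
# Gene "a" is a known positive example, gene "d" a known negative.
fixed = {"a": 1.0, "d": 0.0}

scores = propagate_labels(edges, fixed)
print(scores)
```

On this path graph the scores of the unannotated genes fall off with distance from the positive example. A fixed iteration count is the simplest stopping rule; the abstract's reported speed-up comes from bounding the rate of progress, which this sketch does not attempt.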
