disease associations
Recently Published Documents

Epigenetic scores for the circulating proteome as tools for disease prediction

Protein biomarkers have been identified across many age-related morbidities. However, characterising epigenetic influences could further inform disease predictions. Here, we leverage epigenome-wide data to study links between the DNA methylation (DNAm) signatures of the circulating proteome and incident diseases. Using data from four cohorts, we trained and tested epigenetic scores (EpiScores) for 953 plasma proteins, identifying 109 scores that explained between 1% and 58% of the variance in protein levels after adjusting for known protein quantitative trait loci (pQTL) genetic effects. By projecting these EpiScores into an independent sample (Generation Scotland; n=9,537) and relating them to incident morbidities over a follow-up of 14 years, we uncovered 137 EpiScore-disease associations. These associations were largely independent of immune cell proportions, common lifestyle and health factors, and biological aging. Notably, our diabetes-associated EpiScores highlighted previous top biomarker associations from proteome-wide assessments of diabetes. These EpiScores for protein levels can therefore be a valuable resource for disease prediction and risk stratification.
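
As a rough illustration of what "projecting" a score into a new sample can mean, the sketch below assumes an EpiScore is a weighted linear combination of CpG methylation values; the abstract does not state the model form, so this is an assumption. The function names, CpG identifiers and weights are hypothetical, and the pQTL adjustment mentioned above is not modelled.

```python
def project_episcore(weights, methylation, intercept=0.0):
    """Project one EpiScore onto one individual's methylation profile.

    weights:     {CpG id: weight} learned in training cohorts (hypothetical here)
    methylation: {CpG id: methylation value} for one individual
    """
    # CpGs missing from the sample are skipped; imputing them would be
    # another reasonable convention.
    return intercept + sum(
        w * methylation[cpg] for cpg, w in weights.items() if cpg in methylation
    )


def variance_explained(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot for measured vs. predicted protein levels."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical CpG identifiers and weights, for illustration only.
weights = {"cpg_a": 0.5, "cpg_b": -0.25}
sample = {"cpg_a": 0.8, "cpg_b": 0.4, "cpg_c": 0.1}
score = project_episcore(weights, sample)  # 0.5 * 0.8 - 0.25 * 0.4
```

Because the score is just a weighted sum, it can be computed in any cohort that measures the same CpGs, without access to the protein measurements used for training.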

DDA-SKF: Predicting Drug–Disease Associations Using Similarity Kernel Fusion

Frontiers in Pharmacology ◽ 10.3389/fphar.2021.784171 ◽ 2022 ◽ Vol 12

Author(s): Chu-Qiao Gao ◽ Yuan-Ke Zhou ◽ Xiao-Hong Xin ◽ Hui Min ◽ Pu-Feng Du

Keyword(s): Computational Model ◽ State Of The Art ◽ Drug Repositioning ◽ Source Code ◽ Orphan Drugs ◽ Kernel Fusion ◽ Disease Associations ◽ Laplacian Regularized Least Squares ◽ Novel Drug ◽ Similarity Information

Drug repositioning provides a promising and efficient strategy for discovering potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, based on various similarities between drugs and between diseases. In this work, we propose a new computational model, DDA-SKF (drug–disease associations prediction using similarity kernel fusion), which predicts novel drug indications using similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrates multiple similarities of drugs and diseases. Its prediction performance is better than, or at least comparable to, that of all state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications, which allows it to predict new uses for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).
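
The combination of kernel fusion and LapRLS can be sketched in a few lines. This is not the authors' implementation: plain averaging stands in for SKF (which is a more elaborate fusion procedure), and the scores use one common closed form of LapRLS, F = W (W + λ L W)⁻¹ Y, where W is the fused drug similarity, L its normalized Laplacian and Y the known drug-disease matrix. Whether DDA-SKF uses exactly this form is an assumption, and all names and numbers are hypothetical.

```python
from math import sqrt


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def average_kernels(kernels):
    """Stand-in for SKF: element-wise mean of several similarity kernels."""
    n = len(kernels[0])
    return [[sum(k[i][j] for k in kernels) / len(kernels) for j in range(n)]
            for i in range(n)]


def normalized_laplacian(w):
    """L = I - D^(-1/2) W D^(-1/2), with D the diagonal matrix of row sums."""
    n = len(w)
    d = [sum(row) for row in w]
    return [[(1.0 if i == j else 0.0) - w[i][j] / sqrt(d[i] * d[j])
             for j in range(n)] for i in range(n)]


def laprls_scores(w, y, lam):
    """Closed-form LapRLS scores F = W (W + lam * L W)^-1 Y."""
    n = len(w)
    lw = matmul(normalized_laplacian(w), w)
    a = [[w[i][j] + lam * lw[i][j] for j in range(n)] for i in range(n)]
    return matmul(w, solve(a, y))


# Two hypothetical drug-drug similarity kernels for three drugs.
k1 = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.3], [0.2, 0.3, 1.0]]
k2 = [[1.0, 0.4, 0.0], [0.4, 1.0, 0.1], [0.0, 0.1, 1.0]]
fused = average_kernels([k1, k2])
# Known associations (drugs x diseases); the third drug has none.
known = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
scores = laprls_scores(fused, known, lam=0.5)
```

With `lam=0` the scores reproduce the known matrix exactly; a positive `lam` lets similar drugs share evidence, which is the mechanism by which a drug with no known indications can still receive scores.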

SARS-CoV-2 Point Mutation and Deletion Spectra, and Their Association with Different Disease Outcome

10.1101/2022.01.10.475768 ◽ 2022

Author(s): Brenda Martínez-González ◽ María Eugenia Soria ◽ Lucia Vazquez-Sirvent ◽ Cristina Ferrer-Orta ◽ Rebeca Lobo-Vega ◽ ...

Keyword(s): Severe Disease ◽ Consensus Sequence ◽ Three Dimensional ◽ Point Mutations ◽ Low Frequency ◽ Diversity Indices ◽ Mild Disease ◽ Coding Regions ◽ Disease Associations ◽ Mutant Spectrum

Mutant spectra of RNA viruses are important for understanding viral pathogenesis and the response to selective pressures. There is a need to characterize the complexity of mutant spectra in coronaviruses sampled from infected patients. In particular, the possible relationship between SARS-CoV-2 mutant spectrum complexity and disease associations has not been established. In the present study, we report an ultra-deep sequencing (UDS) analysis of the mutant spectrum of amplicons from the nsp12 (polymerase)- and spike (S)-coding regions of thirty nasopharyngeal isolates (diagnostic samples) of SARS-CoV-2 of the first COVID-19 pandemic wave (Madrid, Spain, April 2020), classified according to the severity of ensuing COVID-19. Low-frequency mutations and deletions, counted relative to the consensus sequence of the corresponding isolate, were overwhelmingly abundant. We show that the average number of different point mutations, the number of mutations per haplotype and several diversity indices were significantly higher in SARS-CoV-2 isolated from patients who developed mild disease than in those associated with moderate or severe disease (exitus). No such bias was observed with RNA deletions. The locations of amino acid substitutions in the three-dimensional structures of nsp12 (polymerase) and S suggest significant structural or functional effects. Thus, patients who develop mild symptoms may be a richer source of genetic variants of SARS-CoV-2 than patients with moderate or severe COVID-19.
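
To make "point mutations relative to the consensus" and "diversity indices" concrete, here is a minimal sketch. The abstract does not name the indices used, so Shannon entropy over haplotype frequencies is chosen here purely as a familiar example; the reads are toy sequences, and the function names are hypothetical.

```python
from collections import Counter
from math import log


def point_mutations(consensus, read):
    """Substitutions in an aligned read of equal length, as (position, ref, alt).

    Gap characters ('-') are treated as deletions and are not counted here.
    """
    return {(i, ref, alt)
            for i, (ref, alt) in enumerate(zip(consensus, read))
            if alt != ref and alt != "-"}


def shannon_entropy(haplotypes):
    """H = -sum(p_i * ln p_i) over the frequencies of distinct haplotypes."""
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return -sum((c / total) * log(c / total) for c in counts.values())


# Toy aligned reads from one hypothetical isolate.
consensus = "ACGT"
reads = ["ACGT", "ACGT", "ACTT", "AGGT"]

# Number of different point mutations found across the whole spectrum.
distinct = set().union(*(point_mutations(consensus, r) for r in reads))
entropy = shannon_entropy(reads)
```

A spectrum made of identical reads has zero entropy, while the toy spectrum above, with one dominant and two minority haplotypes, has an entropy of 1.5 ln 2.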

HLA-SPREAD: a natural language processing based resource for curating HLA association from PubMed abstracts

BMC Genomics ◽ 10.1186/s12864-021-08239-0 ◽ 2022 ◽ Vol 23 (1)

Author(s): Dhwani Dholakia ◽ Ankit Kalra ◽ Bishnu Raman Misir ◽ Uma Kanga ◽ Mitali Mukerji

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Semantic Analysis ◽ Skin Diseases ◽ Human Leukocyte ◽ Relevant Information ◽ Entity Recognition ◽ HLA Association ◽ Disease Associations

Extreme complexity in the Human Leukocyte Antigens (HLA) system and its nomenclature makes it difficult to interpret and integrate relevant information on HLA associations with diseases, Adverse Drug Reactions (ADRs) and transplantation. A PubMed search returns ~146,000 studies on HLA reported from diverse locations. Currently, the IPD-IMGT/HLA database (Robinson et al., Nucleic Acids Research 48:D948–D955, 2019) houses data on 28,320 HLA alleles. We developed an automated pipeline with a unified graphical user interface, HLA-SPREAD, that provides structured information on SNPs, Populations, REsources, ADRs and Diseases. Information on HLA was extracted from ~28 million PubMed abstracts using Natural Language Processing (NLP). Python scripts were used to mine and curate information on diseases, filter false positives and categorize the diseases into 24 tree-hierarchical groups, and Named Entity Recognition (NER) algorithms followed by semantic analysis were used to infer HLA association(s). This resource, covering 109 countries and 40 ethnic groups, provides interesting insights on markers associated with allelic/haplotypic association in autoimmune, cancer, viral and skin diseases, transplantation outcome and ADRs for hypersensitivity. Summary information on clinically relevant biomarkers related to HLA disease associations, with mapped susceptible/risk alleles, is readily retrievable from HLA-SPREAD. The resource is available at http://hla-spread.igib.res.in/. This resource is the first of its kind and can help uncover novel patterns in HLA gene-disease associations.
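
The extraction step can be pictured with a much simpler stand-in. The sketch below is not the HLA-SPREAD pipeline: it replaces NER with a regular expression for allele names written in the standard `HLA-GENE*field:field` style and pairs alleles with disease terms by sentence co-occurrence. The disease lexicon, the example text and the function names are hypothetical.

```python
import re

# Matches names such as HLA-B*27:05 or HLA-DRB1*15:01 (up to four fields).
HLA_ALLELE = re.compile(r"HLA-[A-Z]+\d*\*\d{2,3}(?::\d{2,3}){0,3}")


def extract_hla_alleles(text):
    """Return allele mentions in order of first appearance, without repeats."""
    seen = []
    for match in HLA_ALLELE.finditer(text):
        if match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


def cooccurrences(abstract, disease_terms):
    """Pair each allele with every disease term found in the same sentence."""
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        alleles = extract_hla_alleles(sentence)
        lowered = sentence.lower()
        for term in disease_terms:
            if term.lower() in lowered:
                pairs.extend((allele, term) for allele in alleles)
    return pairs


# Illustrative text and lexicon, not taken from the resource.
text = ("HLA-B*27:05 was strongly associated with ankylosing spondylitis. "
        "No association was seen for HLA-DRB1*15:01.")
pairs = cooccurrences(text, ["ankylosing spondylitis", "psoriasis"])
```

Co-occurrence alone cannot tell a reported association from a negated one within a single sentence, which is presumably where a semantic analysis step such as the one described above becomes necessary.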

USP14: Structure, Function, and Target Inhibition

Frontiers in Pharmacology ◽ 10.3389/fphar.2021.801328 ◽ 2022 ◽ Vol 12

Author(s): Feng Wang ◽ Shuo Ning ◽ Beiming Yu ◽ Yanfeng Wang

Keyword(s): Protein Degradation ◽ Conformational Changes ◽ Viral Infections ◽ Current Knowledge ◽ Selective Inhibition ◽ Dual Function ◽ Cellular Processes ◽ C Terminus ◽ Disease Associations ◽ Specific Protease

Ubiquitin-specific protease 14 (USP14), a deubiquitinating enzyme (DUB), is associated with proteasomes and exerts a dual function in regulating protein degradation. USP14 protects protein substrates from degradation by removing ubiquitin chains from proteasome-bound substrates, whereas it promotes protein degradation by activating the proteasome. Increasing evidence has shown that USP14 is involved in several canonical signaling pathways, correlating with cancer, neurodegenerative diseases, autophagy, immune responses, and viral infections. The activity of USP14 is tightly regulated to ensure its function in various cellular processes. Structural studies have demonstrated that free USP14 exists in an autoinhibited state with two surface loops, BL1 and BL2, partially hovering above and blocking the active site cleft that binds the C-terminus of ubiquitin. Hence, both proteasome-bound and phosphorylated forms of USP14 require the induction of conformational changes in the BL2 loop to activate its deubiquitinating function. Due to its intriguing roles in the stabilization of disease-causing proteins and oncology targets, USP14 has garnered widespread interest as a therapeutic target. In recent years, significant progress has been made in identifying inhibitors targeting USP14, despite the complexity and challenges in improving their selectivity and affinity for USP14. In particular, the crystal structures of USP14 complexed with IU1-series inhibitors revealed the underlying allosteric regulatory mechanism and enabled the further design of potent inhibitors. In this review, we summarize the current knowledge regarding the structure, regulation, pathophysiological function, and selective inhibition of USP14, including disease associations and inhibitor development.

An effective drug-disease associations prediction model based on graphic representation learning over multi-biomolecular network

BMC Bioinformatics ◽ 10.1186/s12859-021-04553-2 ◽ 2022 ◽ Vol 23 (1)

Author(s): Hanjing Jiang ◽ Yabing Huang

Keyword(s): High Performance ◽ Large Scale ◽ Representation Learning ◽ Biological Data ◽ Graph Representation ◽ Data Set ◽ Validation Experiment ◽ Biomolecular Network ◽ Disease Associations ◽ Drug Reposition

Background: Drug-disease associations (DDAs) can provide important information for exploring the potential efficacy of drugs. However, few DDAs have so far been verified experimentally. Previous evidence indicates that combining information sources is conducive to the discovery of new DDAs. How to integrate different biological data sources and identify the most effective drugs for a certain disease based on drug-disease coupled mechanisms is still a challenging problem. Results: In this paper, we propose a novel computational model for DDA prediction based on graph representation learning over a multi-biomolecular network (GRLMN). More specifically, we first constructed a large-scale molecular association network (MAN) by integrating the associations among drugs, diseases, proteins, miRNAs, and lncRNAs. Then, a graph embedding model was used to learn vector representations for all drugs and diseases in the MAN. Finally, the combined features were fed to a random forest (RF) model to predict new DDAs. The proposed model was evaluated on the SCMFDD-S data set using five-fold cross-validation. The experimental results showed that GRLMN achieved an area under the ROC curve (AUC) of 87.9% and outperformed all previous works in terms of both accuracy and AUC on the benchmark dataset. To further verify the high performance of GRLMN, we carried out case studies for two common diseases. In the ranking of drugs predicted to be related to certain diseases (such as kidney disease and fever), 15 of the top 20 drugs have been experimentally confirmed. Conclusions: The experimental results show that our model performs well in the prediction of DDAs. GRLMN is an effective prioritization tool for screening reliable DDAs for follow-up studies concerning their participation in drug reposition.
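
Two pieces of this pipeline are easy to show in isolation: building the "combined features" for a drug-disease pair and computing the AUC used for evaluation. The sketch assumes that combining means concatenating the two embedding vectors, which the abstract does not confirm; the graph embedding and random forest steps are not reproduced, and the vectors and scores are made up.

```python
def pair_features(drug_vec, disease_vec):
    """Combine a drug embedding and a disease embedding by concatenation."""
    return list(drug_vec) + list(disease_vec)


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical two-dimensional embeddings and classifier scores.
features = pair_features([0.1, 0.9], [0.4, 0.2])
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 positive-negative pairs are ordered correctly
```

The pairwise definition is quadratic in the number of examples, which is fine for a sketch; library implementations sort the scores once instead.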

Prediction of lncRNA-disease association based on a Laplace normalized random walk with restart algorithm on heterogeneous networks

BMC Bioinformatics ◽ 10.1186/s12859-021-04538-1 ◽ 2022 ◽ Vol 23 (1)

Author(s): Liugen Wang ◽ Min Shang ◽ Qi Dai ◽ Ping-an He

Keyword(s): Random Walk ◽ Heterogeneous Networks ◽ Global Network ◽ Random Walk With Restart ◽ lncRNA Gene ◽ Similarity Network ◽ Disease Similarity ◽ Disease Associations ◽ Universal Network ◽ Gene Similarity

Background: Increasing evidence shows that long non-coding RNAs (lncRNAs) play important roles in the development and progression of complex human diseases. Predicting human lncRNA-disease associations is therefore a challenging and urgent task in bioinformatics for research on complex human diseases. Results: In this work, we propose LRWRHLDA, a global and universal network-based computational framework. First, four isomorphic networks were constructed: an lncRNA similarity network, a disease similarity network, a gene similarity network and an miRNA similarity network. Then, six heterogeneous networks (known lncRNA-disease, lncRNA-gene, lncRNA-miRNA, disease-gene, disease-miRNA, and gene-miRNA association networks) were used to build a multi-layer network. Finally, the Laplace normalized random walk with restart algorithm was applied on this global network to predict the relationship between lncRNAs and diseases. Conclusions: Ten-fold cross-validation was used to evaluate the performance of LRWRHLDA. LRWRHLDA achieved an AUC of 0.98402, which is higher than that of the other compared methods. Furthermore, LRWRHLDA can predict lncRNAs related to isolated diseases (and diseases related to isolated lncRNAs). The results for colorectal cancer, lung adenocarcinoma, stomach cancer and breast cancer have been verified by other studies. The case studies indicate that our method is effective.
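
A random walk with restart on a single small graph conveys the core of the final step. The sketch assumes that "Laplace normalized" means the symmetric normalization D^(-1/2) A D^(-1/2) and iterates p ← (1 − r) M p + r p₀; the multi-layer network of the paper is not reproduced, and the function names, restart probability and toy graph are hypothetical.

```python
from math import sqrt


def laplace_normalize(adj):
    """Symmetric normalization M = D^(-1/2) A D^(-1/2) of an adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * M p + restart * p0 until the L1 change < tol."""
    n = len(adj)
    m = laplace_normalize(adj)
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(m[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3; node 0 plays the role of the query (seed) node.
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
relevance = random_walk_with_restart(path, seeds={0})
```

With this normalization the result is a relevance score rather than a probability distribution (it need not sum to one), but nodes closer to the seed still score higher, which is what a ranking of candidate associations relies on.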

gGATLDA: lncRNA-disease association prediction based on graph-level graph attention network

BMC Bioinformatics ◽ 10.1186/s12859-021-04548-z ◽ 2022 ◽ Vol 23 (1)

Author(s): Li Wang ◽ Cheng Zhong

Keyword(s): Characteristic Curve ◽ Experimental Results ◽ Computational Method ◽ Attention Network ◽ Feature Vectors ◽ Disease Similarity ◽ Potential Association ◽ Receiver Operation Characteristic ◽ Disease Pair ◽ Disease Associations

Background: Long non-coding RNAs (lncRNAs) are related to human diseases through their regulation of gene expression. Identifying lncRNA-disease associations (LDAs) will contribute to the diagnosis, treatment, and prognosis of diseases. However, identifying LDAs through biological experiments is time-consuming, costly and inefficient. Therefore, the development of efficient and high-accuracy computational methods for predicting LDAs is of great significance. Results: In this paper, we propose a novel computational method (gGATLDA) to predict LDAs based on a graph-level graph attention network. First, we extract the enclosing subgraph of each lncRNA-disease pair. Second, we construct feature vectors by integrating lncRNA similarity and disease similarity as node attributes in the subgraphs. Finally, we train a graph neural network (GNN) model by feeding it the subgraphs and feature vectors, and use the trained GNN model to predict potential lncRNA-disease association scores. The experimental results show that our method achieves a higher area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPR), accuracy and F1-score than the state-of-the-art methods in five-fold cross-validation. Case studies show that our method can effectively identify lncRNAs associated with breast cancer, gastric cancer, prostate cancer, and renal cancer. Conclusion: The experimental results indicate that our method is a useful approach for predicting potential LDAs.
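
The first step, extracting the enclosing subgraph of a pair, can be sketched with a breadth-first search. The definition used here (all nodes within k hops of either node of the pair, plus the edges among them) is the usual one in subgraph-based link prediction; that gGATLDA uses exactly this definition is an assumption, and the graph and node names are hypothetical.

```python
from collections import deque


def k_hop_nodes(graph, start, k):
    """Nodes reachable from `start` in at most k hops (breadth-first search)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return set(depth)


def enclosing_subgraph(graph, lncrna, disease, k=1):
    """Subgraph induced by the k-hop neighbourhoods of both nodes of a pair."""
    nodes = k_hop_nodes(graph, lncrna, k) | k_hop_nodes(graph, disease, k)
    return {n: sorted(nb for nb in graph.get(n, ()) if nb in nodes)
            for n in sorted(nodes)}


# Hypothetical known associations, stored as an undirected adjacency dict.
graph = {
    "lnc1": ["disA", "disB"], "lnc2": ["disA"], "lnc3": ["disC"],
    "disA": ["lnc1", "lnc2"], "disB": ["lnc1"], "disC": ["lnc3"],
}
sub = enclosing_subgraph(graph, "lnc2", "disB", k=1)
```

For the candidate pair (`lnc2`, `disB`) the one-hop subgraph contains `lnc1` and `disA`, the path that links the two, and leaves out the unrelated `lnc3`/`disC` component. Similarity-based node attributes would then be attached to these nodes before the subgraph is passed to a GNN.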

Prediction of lncRNA–Disease Associations via Closest Node Weight Graphs of the Spatial Neighborhood Based on the Edge Attention Graph Convolutional Network

Frontiers in Genetics ◽ 10.3389/fgene.2021.808962 ◽ 2022 ◽ Vol 12

Author(s): Jianwei Li ◽ Mengfan Kong ◽ Duanyang Wang ◽ Zhenwu Yang ◽ Xiaoke Hao

Keyword(s): Molecular Level ◽ Recognition Problem ◽ Great Success ◽ Convolutional Network ◽ Disease Associations ◽ AUC Value ◽ Non Coding RNAs ◽ Node Weight ◽ Human Complex ◽ Disease Characteristic

Accumulated evidence from biological clinical trials has shown that long non-coding RNAs (lncRNAs) are closely related to the occurrence and development of various complex human diseases. Research on lncRNA–disease relations will help to further understand the pathogenesis of complex human diseases at the molecular level, but only a small proportion of lncRNA–disease associations has been confirmed. Considering the high cost of biological experiments, exploring potential lncRNA–disease associations with computational approaches has become very urgent. In this study, a model based on the closest node weight graph of the spatial neighborhood (CNWGSN) and the edge attention graph convolutional network (EAGCN), LDA-EAGCN, was developed to uncover potential lncRNA–disease associations by integrating disease semantic similarity, lncRNA functional similarity, and known lncRNA–disease associations. Inspired by the great success of the EAGCN method on the chemical molecule property recognition problem, the prediction of lncRNA–disease associations can be regarded as a component recognition problem on lncRNA–disease characteristic graphs. The CNWGSN features of lncRNA–disease associations, combined with known lncRNA–disease associations, were used to train EAGCN, and correlation scores of the input data were predicted with EAGCN to judge whether the input lncRNAs would be associated with the input diseases. LDA-EAGCN achieved a reliable AUC value of 0.9853 in the ten-fold cross-over experiments, which was the highest among five state-of-the-art models. Furthermore, case studies of renal cancer, laryngeal carcinoma, and liver cancer were carried out, and most of the top-ranking lncRNA–disease associations have been confirmed by recently published experimental literature. LDA-EAGCN is thus an effective model for predicting potential lncRNA–disease associations. Its source code and experimental data are available at https://github.com/HGDKMF/LDA-EAGCN.
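
Assuming the "ten-fold cross-over experiments" follow the usual k-fold cross-validation protocol (an assumption, since the abstract does not describe the protocol), the data split can be sketched as below. The function name and the sample count are hypothetical; the actual split used by the authors may differ, for example in how negative pairs are sampled.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once with a fixed seed, then dealt into k folds;
    each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Ten folds over 25 hypothetical lncRNA-disease pairs.
splits = list(k_fold_indices(25, 10))
```

A model is trained on each `train` list and scored on the matching `test` list, and a metric such as AUC is then averaged (or pooled) over the ten rounds.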

Determine independent gut microbiota-diseases association by eliminating the effects of human lifestyle factors

BMC Microbiology ◽ 10.1186/s12866-021-02414-9 ◽ 2022 ◽ Vol 22 (1)

Author(s): Congmin Zhu ◽ Xin Wang ◽ Jianchu Li ◽ Rui Jiang ◽ Hui Chen ◽ ...

Keyword(s): Gut Microbiota ◽ Disease Risk ◽ Bacterial Overgrowth ◽ Small Intestinal Bacterial Overgrowth ◽ Classification Performance ◽ Considerable Proportion ◽ Case Control Studies ◽ Physiological Variables ◽ Building Models ◽ Disease Associations

The effects of lifestyle and physiological variables on human disease risk have been shown to be mediated by gut microbiota. Concordance between case-control studies for detecting disease-associated microbes has been low because of limited sample sizes and population-wide bias in lifestyle and physiological variables. To infer gut microbiota-disease associations accurately, we propose to build machine learning models that include both human variables and gut microbiota. When a model with both gut microbiota and human variables performs better than the model with human variables alone, an independent gut microbiota-disease association is confirmed. By building models on the American Gut Project dataset, we found that gut microbiota showed distinct association strengths with different diseases. Adding gut microbiota to human variables significantly enhanced the classification performance for IBD; independent associations were found between occurrence information of gut microbiota and irritable bowel syndrome, C. difficile infection, and unhealthy status; and adding gut microbiota did not improve model performance for diabetes, small intestinal bacterial overgrowth, lactose intolerance, or cardiovascular disease. Our results suggest that although gut microbiota has been reported to be associated with many diseases, a considerable proportion of these associations may be very weak. We propose a list of microbes as biomarkers to classify IBD and unhealthy status. Further functional investigation of these microbes will improve understanding of the molecular mechanisms of human diseases.
