Bias in miRNA enrichment analysis related to gene functional annotations

2021 ◽  
Author(s):  
Konstantinos Zagganas ◽  
Georgios K Georgakilas ◽  
Thanasis Vergoulis ◽  
Theodore Dalamagas

Motivation: miRNA functional enrichment is a type of analysis that is used to predict which biological functions may be affected by a group of miRNAs or to validate whether a list of dysregulated miRNAs is linked to a diseased state. The standard method for functional enrichment analysis uses the hypergeometric distribution to produce p-values that depict the strength of the association between a group of miRNAs and a biological function. However, in 2015, it was shown that this approach suffers from a bias related to miRNA targets produced by target prediction algorithms, and a new randomization test was proposed. Results: In this paper, we demonstrate the existence of another underlying bias which affects gene annotation data sets; additionally, we show that the statistical measure used for the established randomization test is not sensitive enough to account for it. For this reason, we propose the use of an alternative statistical measure, the "two-sided overlap", and we show that it is able to alleviate the aforementioned issue. Finally, we develop BUFET2, a miRNA enrichment analysis tool that leverages this measure (along with the old one); it is based on BUFET, a fast and scalable implementation of the established randomization test. Availability and Implementation: BUFET2 is written in C++ and is packaged with a Python wrapper script that facilitates experiment execution. Moreover, BUFET2 comes pre-packaged in a Linux Docker image published on Docker Hub, thus eliminating the need for code compilation. Finally, BUFET2 can also be executed through a publicly available REST API. All datasets used in the experiments throughout this paper are openly accessible on Zenodo (https://doi.org/10.5281/zenodo.5175819).
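As a rough illustration of the kind of randomization test the abstract refers to, the following Python sketch estimates an empirical p-value by comparing the observed overlap between a miRNA group's targets and an annotated gene set against the overlaps obtained for random miRNA groups of the same size. It is a minimal sketch under stated assumptions, not BUFET's or BUFET2's implementation: the function name `empirical_p_value`, its parameters, and the use of the plain overlap count as the statistic are all hypothetical, and the "two-sided overlap" measure is not reproduced here because the abstract does not define it.

```python
import random


def empirical_p_value(mirna_group, targets, annotated_genes, n_random=1000, seed=0):
    """Estimate an empirical enrichment p-value by randomization.

    mirna_group     -- iterable of miRNA names under study (hypothetical input)
    targets         -- dict mapping every known miRNA to a set of target genes
    annotated_genes -- set of genes annotated to one biological function
    """
    rng = random.Random(seed)
    all_mirnas = sorted(targets)  # a list, so random.sample accepts it
    group = list(mirna_group)

    def overlap(mirnas):
        # Union of the targets of all miRNAs in the group, intersected
        # with the genes annotated to the function of interest.
        genes = set()
        for mirna in mirnas:
            genes |= targets[mirna]
        return len(genes & annotated_genes)

    observed = overlap(group)
    at_least_as_large = 0
    for _ in range(n_random):
        random_group = rng.sample(all_mirnas, len(group))
        if overlap(random_group) >= observed:
            at_least_as_large += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_random + 1)
```

Drawing random miRNA groups, rather than random gene sets, is what lets a test of this kind take the properties of predicted miRNA targets into account; the choice of statistic inside `overlap` is the part the abstract proposes to change.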

2013 ◽  
Vol 32 (4) ◽  
pp. 195-204 ◽  
Author(s):  
Qiang Huang ◽  
Ling-Yun Wu ◽  
Yong Wang ◽  
Xiang-Sun Zhang

F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 709 ◽  
Author(s):  
Liis Kolberg ◽  
Uku Raudvere ◽  
Ivan Kuzmin ◽  
Jaak Vilo ◽  
Hedi Peterson

g:Profiler (https://biit.cs.ut.ee/gprofiler) is a widely used gene list functional profiling and namespace conversion toolset that has been contributing to reproducible biological data analysis since 2007. Here we introduce the accompanying R package, gprofiler2, developed to facilitate programmatic access to g:Profiler computations and databases via REST API. The gprofiler2 package provides easy-to-use functionality that enables researchers to incorporate functional enrichment analysis into automated analysis pipelines written in R. The package also implements interactive visualisation methods that help to interpret the enrichment results and to illustrate them for publications. In addition, gprofiler2 gives access to the versatile gene/protein identifier conversion functionality in g:Profiler, making it possible to map between hundreds of different identifier types or orthologous species. The gprofiler2 package is freely available at the CRAN repository.


Author(s):  
Peng Wang ◽  
Xin Li ◽  
Yue Gao ◽  
Qiuyan Guo ◽  
Shangwei Ning ◽  
...  

LnCeVar (http://www.bio-bigdata.net/LnCeVar/) is a comprehensive database that aims to provide genomic variations that disturb lncRNA-associated competing endogenous RNA (ceRNA) network regulation curated from the published literature and high-throughput data sets. LnCeVar curated 119 501 variation–ceRNA events from thousands of samples and cell lines, including: (i) more than 2000 experimentally supported circulating, drug-resistant and prognosis-related lncRNA biomarkers; (ii) 11 418 somatic mutation–ceRNA events from TCGA and COSMIC; (iii) 112 674 CNV–ceRNA events from TCGA; (iv) 67 066 SNP–ceRNA events from the 1000 Genomes Project. LnCeVar provides a user-friendly searching and browsing interface. In addition, as an important supplement to the database, several flexible tools have been developed to aid retrieval and analysis of the data. The LnCeVar–BLAST interface is a convenient way for users to search ceRNAs by sequences of interest. LnCeVar–Function is a tool for performing functional enrichment analysis. LnCeVar–Hallmark identifies dysregulated cancer hallmarks of variation–ceRNA events. LnCeVar–Survival performs COX regression analyses and produces survival curves for variation–ceRNA events. LnCeVar–Network identifies and creates a visualization of dysregulated variation–ceRNA networks. Collectively, LnCeVar will serve as an important resource for investigating the functions and mechanisms of personalized genomic variations that disturb ceRNA network regulation in human diseases.


Author(s):  
Ben Li ◽  
Bo Zhang ◽  
Qiong Wu ◽  
Xinming Chen ◽  
Xiang Cao ◽  
...  

Background: Peroxiredoxins (Prxs) comprise antioxidant factors that are widely found in prokaryotes and eukaryotes. Abnormal expression of Prxs is closely related to tumorigenesis. Methods: This study examined the prognostic value and expression of Prxs in lung cancer using the Human Protein Atlas (HPA), Gene Expression Profiling Interactive Analysis (GEPIA), UALCAN, Kaplan-Meier Plotter, cBioPortal and Functional Enrichment Analysis Tool (FunRich) databases. Results: We found that Prx1/2/3/4/5 were overexpressed in both lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD) relative to normal lung cells. However, the expression level of Prx6 was lower in LUAD and higher in LUSC than in normal lung cells. The levels of Prx3 and Prx6 were associated with pathological stage. Prognostic analysis showed that elevated Prx1 and Prx2 expression was correlated with low Overall Survival (OS), whereas high Prx5 and Prx6 expression predicted high OS. Conclusions: Our results revealed the levels of Prxs in lung cancer and their influence on the prognosis of lung carcinoma, contributing to the study of the role of Prxs in tumorigenesis.


Author(s):  
Saúl Lira-Albarrán ◽  
Xiaowei Liu ◽  
Seok Hee Lee ◽  
Paolo Rinaudo

Offspring generated by in vitro fertilization (IVF) are believed to be healthy but display a possible predisposition to chronic diseases, like hypertension and glucose intolerance. Since epigenetic changes are believed to underlie such a phenotype, this study aimed at describing global DNA methylation changes in the liver of adult mice generated by natural mating (FB group) or by IVF. Embryos were generated by IVF or natural mating. At 30 weeks of age, mice were sacrificed. The liver was removed, and global DNA methylation was assessed using whole-genome bisulfite sequencing (WGBS). Genomic Regions for Enrichment Analysis Tool (GREAT) and G:Profilerβ were used to identify differentially methylated regions (DMRs) and for functional enrichment analysis. Overrepresented gene ontology terms were summarized with REVIGO, while canonical pathways (CPs) were identified with Ingenuity® Pathway Analysis. Overall, 2692 DMRs (4.91%) were different between the groups. The majority of DMRs (84.92%) were hypomethylated in the IVF group. Surprisingly, only 0.16% of CpG islands were differentially methylated and only a few DMRs were located on known gene promoters (n = 283) or enhancers (n = 190). Notably, the long-interspersed element (LINE), short-interspersed element (SINE), and long terminal repeat (LTR1) transposable elements showed reduced methylation (P < 0.05) in IVF livers. Cellular metabolic process, hepatic fibrosis, and insulin receptor signaling were some of the principal biological processes and CPs modified by IVF. In summary, IVF modifies the DNA methylation signature in the adult liver, resulting in hypomethylation of genes involved in metabolism and gene transcription regulation. These findings may shed light on the mechanisms underlying the developmental origin of health and disease.


Genes ◽  
2018 ◽  
Vol 9 (12) ◽  
pp. 569 ◽  
Author(s):  
Eduardo Zúñiga-León ◽  
Ulises Carrasco-Navarro ◽  
Francisco Fierro

The increasing number of OMICs studies demands bioinformatic tools that aid in the analysis of large sets of genes or proteins to understand their roles in the cell and establish functional networks and pathways. In the last decade, over-representation or enrichment tools have played a successful role in the functional analysis of large gene/protein lists, which is evidenced by thousands of publications citing these tools. However, in most cases the results of these analyses are long lists of biological terms associated to proteins that are difficult to digest and interpret. Here we present NeVOmics, Network-based Visualization for Omics, a functional enrichment analysis tool that identifies statistically over-represented biological terms within a given gene/protein set. This tool provides a hypergeometric distribution test to calculate significantly enriched biological terms, and facilitates analysis on cluster distribution and relationship of proteins to processes and pathways. NeVOmics is adapted to use updated information from the two main annotation databases: Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG). NeVOmics compares favorably to other Gene Ontology and enrichment tools regarding coverage in the identification of biological terms. NeVOmics can also build different network-based graphical representations from the enrichment results, which makes it an integrative tool that greatly facilitates interpretation of results obtained by OMICs approaches. NeVOmics is freely accessible at https://github.com/bioinfproject/bioinfo/.
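For readers unfamiliar with the hypergeometric test mentioned above, the following Python sketch computes an over-representation p-value for a single biological term using only the standard library. It is an illustrative sketch, not NeVOmics code: the function name `hypergeom_enrichment_p` and its parameter names are assumptions introduced here, and a real analysis would also correct for testing many terms.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Probability of seeing at least n_overlap annotated genes by chance.

    n_background -- number of genes in the background (universe)
    n_annotated  -- background genes annotated to the term
    n_selected   -- size of the gene/protein list under study
    n_overlap    -- genes in the list that carry the annotation
    """
    max_overlap = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    # Upper tail of the hypergeometric distribution: sum P(X = k)
    # for k from the observed overlap up to the largest possible overlap.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 background genes, 3 annotated to the term,
# a list of 3 genes, all 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
```

A small p-value indicates that the term is carried by more genes in the list than random sampling from the background would be expected to produce.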


2021 ◽  
Author(s):  
Chu T Thu ◽  
Jonathan Y. Chung ◽  
Deepika Dhawan ◽  
Christopher A. Vaiana ◽  
Lara K. Mahal

MicroRNAs (miRNAs, miRs) finely tune protein expression and target networks of 100s-1000s of genes that control specific biological processes. They are critical regulators of glycosylation, one of the most diverse and abundant posttranslational modifications. In recent work, miRs have been shown to predict the biological functions of glycosylation enzymes, leading to the miRNA proxy hypothesis, which states that if a miR drives a specific biological phenotype, the targets of that miR will drive the same biological phenotype. Testing of this powerful hypothesis is hampered by our lack of knowledge about miR targets. Target prediction suffers from low accuracy and a high false prediction rate. Herein, we develop a high-throughput experimental platform to analyze miR:target interactions, miRFluR. We utilize this system to analyze the interactions of the entire human miRome with beta-3-glucosyltransferase (B3GLCT), a glycosylation enzyme whose loss underpins the congenital disorder Peters Plus Syndrome. Although this enzyme is predicted by multiple algorithms to be highly targeted by miRs, we identify only 27 miRs that downregulate B3GLCT, a >96% false positive rate for prediction. Functional enrichment analysis of these validated miRs predicts phenotypes associated with Peters Plus Syndrome, although B3GLCT is not in their known target network. Thus, biological phenotypes driven by B3GLCT may be driven by the target networks of miRs that regulate this enzyme, providing additional evidence for the miRNA Proxy Hypothesis.


2021 ◽  
Author(s):  
Dou-Dou Ding ◽  
Quan Zhou ◽  
Ze He ◽  
Hong-Xia He ◽  
Man-Zhen Zuo

Introduction: Epidemiological studies have found that the occurrence of endometrial cancer (EC) is closely related to metabolic diseases, and insulin resistance (IR) plays an important role in endometrial pathogenesis, but the specific pathogenesis is still unclear. The purpose of this study is to reveal the relationship between insulin resistance and endothelial cells by gene screening technology. Material and methods: We analyzed one endometrial carcinoma dataset (GSE106191) and one insulin-resistance dataset (GSE63992) from the Gene Expression Omnibus (GEO) database with the Venny online analysis tool and found a total of 148 different genes. Functional enrichment analysis of these genes using DAVID showed that they participated in transcription factor activity, signaling pathways and response to factors, etc. Then, using cytoHubba in Cytoscape, we obtained 25 hub genes. Results: The results showed that OGT, IGSF3, TRO, NEURL2 and PIK3C2B were significantly and closely related to survival time in EC. The percentage of gene changes in these five central genes ranged from 3% to 10% per gene, and the genes were also related to the infiltration of seven kinds of immune cells in endometrial carcinoma. Conclusion: The five key genes (OGT, IGSF3, PIK3C2B, TRO and NEURL2) are involved in immune infiltration in the progression of endometrial carcinoma, and they also show a certain mutation probability. This may underlie the pathogenesis linking insulin resistance and endometrial cancer.


2021 ◽  
Author(s):  
Mi Zhou ◽  
Ruru Guo ◽  
Yongfei Wang ◽  
Wanling Yang ◽  
Rongxiu Li ◽  
...  

Background: Systemic juvenile idiopathic arthritis (sJIA) is a severe autoinflammatory disorder whose molecular mechanism is still not clearly defined. To better understand the disease using scattered datasets from public domains, we performed a weighted gene co-expression network analysis (WGCNA) to identify key modules and hub genes underlying sJIA pathogenesis. Methods: Two gene expression datasets, GSE7753 and GSE13501, were used to construct WGCNA. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were applied to the entirety of genes and the hub genes in the sJIA modules. Cytoscape was used to screen and visualize the hub genes. We further compared the hub genes with the GWAS genes and used a consensus WGCNA analysis to prove that our conclusions are conservative and reproducible across multiple independent data sets. Results: A total of 5414 genes were obtained for WGCNA, from which highly correlated genes were divided into 17 modules. The red module demonstrated the highest correlation with the sJIA module (r = 0.8, p = 3e−29), while the green-yellow module was found to be closely related to the non-sJIA module (r = 0.62, p = 1e−14). Functional enrichment analysis demonstrated that the red module was largely enriched in activation of immune responses, infection, nucleosome and erythrocyte, while the green-yellow module was mostly enriched in immune responses and inflammation. Additionally, the hub genes in the red module were highly enriched in erythrocyte differentiation, including ALAS2, AHSP, TRIM10, TRIM58 and KLF1. The hub genes from the green-yellow module were mainly associated with immune responses, exemplified by genes such as KLRB1, KLRF1, CD160 and KIRs. Conclusion: We identified sJIA-related modules and several hub genes that might be associated with the development of sJIA. The two modules may help understand the mechanisms of sJIA and the hub genes may become biomarkers and therapeutic targets of sJIA in the future.

