hypergeometric test
Recently Published Documents


TOTAL DOCUMENTS

36
(FIVE YEARS 27)

H-INDEX

4
(FIVE YEARS 2)

Open Medicine ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. 135-150
Author(s):  
Li Li ◽  
Yundi Cao ◽  
YingRui Fan ◽  
Rong Li

Abstract Hepatocellular carcinoma (HCC) has a high incidence and poor prognosis and is the second most fatal cancer; certain HCC patients also show high heterogeneity. This study developed a prognostic model for predicting clinical outcomes of HCC. RNA and microRNA (miRNA) sequencing data of HCC were obtained from The Cancer Genome Atlas. RNA dysregulation between HCC tumors and adjacent normal liver tissues was examined by DESeq algorithms. Survival analysis was conducted to determine the basic prognostic indicators. We identified competing endogenous RNA (ceRNA) containing 15,364 pairs of mRNA–long noncoding RNA (lncRNA). An imbalanced ceRNA network comprising 8 miRNAs, 434 mRNAs, and 81 lncRNAs was developed using a hypergeometric test. Functional analysis showed that these RNAs were closely associated with biosynthesis. Notably, 53 mRNAs showed a significant prognostic correlation. Feature selection with the least absolute shrinkage and selection operator detected four characteristic genes (SAPCD2, DKC1, CHRNA5, and UROD), based on which a four-gene independent prognostic signature for HCC was constructed using Cox regression analysis. The four-gene signature could stratify samples in the training, test, and external validation sets (p < 0.01). The five-year survival area under the ROC curve (AUC) in the training and validation sets was greater than 0.74. The current prognostic gene model exhibited high stability and accuracy in predicting the overall survival (OS) of HCC patients.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Congfang Guo ◽  
Xiang Guo ◽  
Yudong Rong ◽  
Yirui Guo ◽  
Li Zhang

Background. Hepatocellular carcinoma (HCC) is high-mortality primary liver cancer and the most common malignant tumor in the world. This study is based on a hepatocellular carcinoma-related dysfunction module designed to explore the dysregulation of genes in liver cancer tissue. Methods. By downloading the relevant data on the GEO database, we performed a differential analysis of healthy liver tissue and liver cancer tissues as well as healthy liver tissue and hepatocellular carcinoma tissue and then obtained two sets of differential genes and combined them. We performed a cointerpretation analysis of these differential genes and constructed related functional disorder modules. A hypergeometric test was performed to calculate the potential regulatory effects of multiple factors on the module, and a series of ncRNA and TF regulators were identified. We obtained a total of 4479 differentially expressed genes in hepatocellular carcinoma, and these genes were clustered into ten hepatocellular carcinoma-related functional interpretation disorder modules. Results. Enrichment analysis revealed that these modular genes are mainly involved in signal transduction including cell cycle, TGF-beta signal transduction, and p53 signal transduction. Depending on the predictive analysis of multidimensional regulators, 323 ncRNAs and 52 TF-mediated hepatocellular carcinoma-related dysregulation modules were found to regulate disease progression. Conclusions. Based on a series of investigations, it was found that miR-30b-5p may participate in the peroxisome signal transduction by downregulating ABCD3-mediated module 1, thereby promoting the development and progression of hepatocellular carcinoma. Our research results not only provide a theoretical basis for biologists to study hepatocellular carcinoma further but also offer new methods and new ideas for the personalized care and treatment of hepatocellular carcinoma.


2021 ◽  
Vol 7 (4) ◽  
pp. 756-764
Author(s):  
Jianhua Liu ◽  
Liqing Zheng ◽  
Liang Cao ◽  
Changhong Zhang ◽  
Chen Li

Asthma is a complicated chronic airway inflammatory disease caused by the interaction of genetic susceptibility and environmental impact. Although biologists have explored the pathogenesis of asthma in various aspects, the exact molecular mechanism continues to be elusive. In this study, we conducted a modular study of asthma-related genes to explore their core pathogenic driving genes. Firstly, the expression profiles of normal, mild to moderate and severe asthma patients were analyzed to screen the differentially expressed genes. Secondly, differential genes of asthma were integrated, co-expressed and clustered into modules. Next, enrichment of GO function and KEGG pathway of module genes were analyzed. Finally, non-coding RNA (ncRNA) and transcription factors that regulate modules are predicted by hypergeometric test. In summary, we have obtained 14 co-expression modules, among which CDCA5, JUNB and other genes are significantly differentially expressed in asthmatic patients, and have an active regulatory role in dysfunction module, so they are recognized as asthma-driving genes. Enrichment results showed that module genes were significantly involved in cell growth, transcription factor activity, cellular response to drugs and the transport of various ions. In addition, they also radically regulate Wnt, TGF-beta, JAK-STAT and extracellular matrix signaling pathways. Finally, we identified significant regulatory dysfunction modules of ncRNA pivot (including miR-181a-5p and let-7d-5p) and TF pivot (including NFKB1, ESR1 and MYC). Overall, our work has uncovered a co-expression network involved in the regulation of core pathogenic genes of asthma. It helps to reveal the core dysfunction modules and potential regulatory factors of this disease, and to enhance our understanding of the molecular mechanisms of asthma-related diseases.


2021 ◽  
Author(s):  
Rui Fan ◽  
Qinghua Cui

Abstract Gene functional enrichment analysis is one of the most popular bioinformatics methods for annotating the pathways and function categories of a given gene list. Current algorithms for enrichment computation, such as Fisher’s exact test and the hypergeometric test, depend entirely on the category counts of the gene list and one gene set. As a result, all genes are treated equally, whatever they are. However, genes actually differ in their essentiality scores within a gene list and within a gene set. It is thus hypothesized that the essentiality scores could be important and should be considered in gene functional analysis. For this purpose, we propose WEAT (https://www.cuilab.cn/weat/), a weighted gene set enrichment algorithm and online tool that weights genes using essentiality scores. We confirmed the usefulness of WEAT using two case studies: the functional analysis of one aging-related gene list and one gene list involved in Lung Squamous Cell Carcinoma (LUSC). Finally, we believe that the WEAT method and tool could provide more possibilities for further exploring the functions of given gene lists.
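As a rough illustration of the count-based test this abstract refers to, the sketch below computes a one-sided hypergeometric enrichment p-value from four counts only. The function name and the toy numbers are assumptions made for illustration; nothing here is taken from WEAT itself.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """Upper-tail probability P(X >= overlap) for a hypergeometric X.

    population: number of genes in the background (N)
    annotated:  background genes that belong to the gene set (K)
    drawn:      size of the query gene list (n)
    overlap:    query genes that belong to the gene set (k)
    All four names are hypothetical labels chosen for this sketch.
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("counts must satisfy 0 <= K <= N and 0 <= n <= N")
    upper = min(annotated, drawn)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / comb(population, drawn)


# Toy example: 10 background genes, 4 in the set, 3 queried, 2 overlapping.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(p_value)
```

Only the four counts enter the calculation, so every gene contributes equally. This is the property the abstract contrasts with essentiality weighting. The same upper tail is what a one-sided Fisher's exact test reports for the corresponding 2x2 table.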


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Lijie Wang ◽  
Fengxia Yu

Abstract Background Acute myocardial infarction (AMI) is myocardial necrosis caused by acute coronary ischemia and hypoxia. It can be complicated by arrhythmia, shock, heart failure and other symptoms that can be life-threatening. A multi-regulator driven dysfunction module for AMI was constructed. It is intended to explore the pathogenesis and functional pathways regulation of acute myocardial infarction. Methods Combining differential expression analysis, co-expression analysis, and the functional enrichment analysis, a set of expression disorder modules related to AMI was obtained. Hypergeometric test was performed to calculate the potential regulatory effects of multiple factors on the module, identifying a range of non-coding RNA and transcription factors. Results A total of 4551 differentially expressed genes for AMI and seven co-expression modules were obtained. These modules are primarily involved in the metabolic processes of prostaglandin transport processes, regulating DNA recombination and AMPK signal transduction. Based on this set of functional modules, 3 of 24 transcription factors (TFs) including NFKB1, MECP2 and SIRT1, and 3 of 782 non-coding RNA including miR-519D-3P, TUG1 and miR-93-5p were obtained. These core regulators are thought to be involved in the progression of AMI disease. Through the AMPK signal transduction, the critical gene stearoyl-CoA desaturase (SCD) can lead to the occurrence and development of AMI. Conclusions In this study, a dysfunction module was used to explore the pathogenesis of multifactorial mediated AMI and provided new methods and ideas for subsequent research. It helps researchers to have a deeper understanding of its potential pathogenesis. The conclusion provides a theoretical basis for biologists to design further experiments related to AMI.


2021 ◽  
Vol 10 ◽  
Author(s):  
Ji’an Yang ◽  
Qian Yang

Glioblastoma multiforme is the most common primary intracranial malignancy, but its etiology and pathogenesis are still unclear. With the deepening of human genome research, the research of glioma subtype screening based on core molecules has become more in-depth. In the present study, we screened out differentially expressed genes (DEGs) by reanalyzing the glioblastoma multiforme (GBM) dataset GSE90598 from the Gene Expression Omnibus (GEO), and the GBM dataset TCGA-GBM and the low-grade glioma (LGG) dataset TCGA-LGG from The Cancer Genome Atlas (TCGA). A total of 150 intersecting DEGs were found, of which 48 were upregulated and 102 were downregulated. The DEGs from the GSE90598 dataset were analyzed using the overrepresentation method, and multiple enriched gene ontology (GO) function terms were significantly correlated with neural cell signal transduction. DEGs between GBM and LGG were analyzed by gene set enrichment analysis (GSEA), and the significantly enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were involved in synapse signaling and oxytocin signaling pathways. Then, a protein-protein interaction (PPI) network was constructed to assess the interaction of proteins encoded by the DEGs. MCODE identified 2 modules from the PPI network. The 11 genes with the highest degrees in module 1 were designated as core molecules, namely, GABRD, KCNC1, KCNA1, SYT1, CACNG3, OPALIN, CD163, HPCAL4, ANK3, KIF5A, and MS4A6A, which were mainly enriched in ionic signaling-related pathways. Survival analysis of the GSE83300 dataset verified the significant relationship between expression levels of the 11 core genes and survival. Finally, the core molecules of GBM and the DrugBank database were assessed by a hypergeometric test to identify 10 drugs, including tetrachlorodecaoxide, related to cancer and neuropsychiatric diseases. Further studies are required to explore the potential of these core genes in diagnosis, prognosis, and targeted therapy and to explain the relationship among ionic signaling-related pathways, neuropsychiatric diseases and neurological tumors.


2021 ◽  
Author(s):  
Lijie Wang ◽  
Fengxia Yu

Abstract Background: Acute myocardial infarction (AMI) is myocardial necrosis caused by acute coronary ischemia and hypoxia. It can be complicated by arrhythmia, shock, heart failure and other symptoms that can be life-threatening. We constructed a multi-regulator driven dysfunction module for AMI. It is intended to explore the pathogenesis and functional pathway regulation of acute myocardial infarction. Methods: Combining differential expression analysis, co-expression analysis, and functional enrichment analysis, we obtained a set of expression disorder modules related to AMI. A hypergeometric test was performed to calculate the potential regulatory effects of multiple factors on the module, identifying a range of non-coding RNAs and transcription factors. Results: We obtained 4551 differentially expressed genes for AMI and seven co-expression modules. These modules are primarily involved in the metabolic processes of prostaglandin transport, the regulation of DNA recombination and AMPK signal transduction. Based on this set of functional modules, we revealed 3 of 24 transcription factors (TFs), including NFKB1, MECP2 and SIRT1, and 3 of 782 non-coding RNAs, including miR-519D-3P, TUG1 and miR-93-5p. These core regulators are thought to be involved in the progression of AMI disease. Through AMPK signal transduction, the critical gene stearoyl-CoA desaturase (SCD) can lead to the occurrence and development of AMI. Conclusions: In this study, we used a dysfunction module to explore the pathogenesis of multifactorial mediated AMI and provided new methods and ideas for subsequent research. It helps researchers to have a deeper understanding of its potential pathogenesis. The conclusion provides a theoretical basis for biologists to design further experiments related to AMI. Trial registration: All analyses were based on previous studies; thus no ethical approval or patient consent was required. The dataset GSE48060 is from GEO and the accession number is PRJNA208840.


2020 ◽  
Author(s):  
Ahmed Arslan

Identifying genomic loci through genome-wide association studies (GWAS) does not by itself provide the understanding of disease pathogenesis needed to turn genetic findings into new therapeutics. Here we describe a new computational method (mMap) that reduces this gap by characterizing the functional and regulatory impact of allelic variation. The method incorporates precomputed annotations of 26 protein functional regions and eight regulatory regions and recovers SNPs that fall in these regions. After annotating SNPs with functional or regulatory data, the method links them to biological functions and pathways and predicts significantly disrupted biological regions, processes and pathways, controlling false discovery through a hypergeometric test. By doing so, the method reduces the data to a level that humans can interpret by prioritizing SNPs that have the potential to mediate a biological phenotype. The method is applicable to procedures that rely on understanding the biological causal role of mouse SNPs and is available online. In two example mMap applications, including whole-genome SNP data from 48 inbred mouse strains, we identify biological mechanisms by which SNPs can regulate pathways to govern phenotypes by targeting different coding and regulatory regions, even in closely related strains.
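The abstract does not say which false discovery procedure mMap applies to its hypergeometric p-values. Purely as an illustration, the sketch below assumes the common Benjamini-Hochberg adjustment applied to a list of per-pathway p-values. The function name and the example p-values are hypothetical and are not drawn from mMap.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is a generic FDR adjustment used for illustration,
    not a procedure confirmed by the mMap abstract.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, e.g. one hypergeometric test per pathway.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

A pathway would then be reported as significantly disrupted when its adjusted value falls below a chosen threshold such as 0.05.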


Genes ◽  
2020 ◽  
Vol 11 (10) ◽  
pp. 1231
Author(s):  
Pâmela A. Alexandre ◽  
Nicholas J. Hudson ◽  
Sigrid A. Lehnert ◽  
Marina R. S. Fortes ◽  
Marina Naval-Sánchez ◽  
...  

Genome-wide gene expression analyses are routinely used to gain a systems-level understanding of complex processes, including network connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we developed a computational pipeline to assign every gene's pairwise genome-wide co-expression distribution to one of 8 template distribution shapes varying between unimodal, bimodal, skewed, or symmetrical, representing different proportions of positive and negative correlations. We then used a hypergeometric test to determine if specific genes (regulators versus non-regulators) and properties (differentially expressed or not) are associated with a particular distribution shape. We applied our methodology to five publicly available RNA sequencing (RNA-seq) datasets from four organisms in different physiological conditions and tissues. Our results suggest that genes can be assigned consistently to pre-defined distribution shapes, regarding the enrichment of differential expression and regulatory genes, in situations involving contrasting phenotypes, time-series, or physiological baseline data. There is indeed a striking additional biological signal present in the genome-wide distribution of co-expression values which would be overlooked by currently adopted approaches. Our method can be applied to extract further information from transcriptomic data and help uncover the molecular mechanisms involved in the regulation of complex biological processes and phenotypes.


2020 ◽  
Author(s):  
Yue Ma ◽  
Jun Zhai

Abstract Background Polycystic ovary syndrome (PCOS) is a prevalent endocrine and metabolic disorder in women of childbearing age. Recent studies have shown that long non-coding RNA (lncRNA) plays a vital role in the development of PCOS. Competitive endogenous RNA (ceRNA) is a novel interaction mechanism in which lncRNA can interact with microRNAs (miRNAs) and indirectly interact with mRNAs through competing interactions. However, the mechanism of ceRNA regulated by lncRNA in PCOS is unclear. Results We constructed the global background network based on the assumed lncRNA-miRNA and miRNA-mRNA pairs, which were obtained from the lncRNASNP, miRTarBase and StarBase databases. Then we calculated differentially expressed genes of PCOS using the data of GSE95728. A PCOS-related lncRNA-mRNA network (PCLMN) was constructed by hypergeometric test, including 41 mRNA nodes, 41 lncRNA nodes and 203 edges. Topological analysis was performed to determine the crucial lncRNAs with the highest centroid. We further identified the subcellular localization, performed functional module analyses and identified putative transfer factors of the key lncRNAs. Functional enrichment analyses were performed by GO classification and KEGG pathway analyses. Finally, 3 key lncRNAs (LINC00667, H19, AC073172.1) and their ceRNA sub-networks, which were involved in the NF-kB signaling pathway and in inflammatory, apoptotic and immune-related processes, were identified as potential PCOS-related disease genes. Conclusions Based on the results above, we speculate that LINC00667, H19, AC073172.1 and their ceRNA sub-networks play a crucial role in PCOS. All these results can help us discover the molecular mechanism and offer new predictive biomarkers for PCOS.
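The abstract does not spell out how the hypergeometric test defines an lncRNA-mRNA edge. One common formulation, assumed here, tests whether an lncRNA and an mRNA share more targeting miRNAs than expected by chance. The sketch below follows that assumption; the function name, the miRNA identifiers and the set sizes are all hypothetical.

```python
from math import comb


def shared_mirna_p(lnc_mirnas, mrna_mirnas, all_mirnas):
    """P(X >= observed shared miRNAs) under a hypergeometric null.

    Assumed setup (not confirmed by the abstract): all_mirnas is the
    background, lnc_mirnas target the lncRNA, mrna_mirnas target the mRNA.
    """
    background = set(all_mirnas)
    lnc = set(lnc_mirnas) & background
    mrna = set(mrna_mirnas) & background
    n_total, n_lnc, n_mrna = len(background), len(lnc), len(mrna)
    shared = len(lnc & mrna)
    # Upper tail: probability of at least the observed number of shared miRNAs.
    tail = sum(
        comb(n_lnc, i) * comb(n_total - n_lnc, n_mrna - i)
        for i in range(shared, min(n_lnc, n_mrna) + 1)
    )
    return tail / comb(n_total, n_mrna)


# Hypothetical background of 20 miRNAs labelled m1..m20.
background = {f"m{i}" for i in range(1, 21)}
lnc_targets = {"m1", "m2", "m3", "m4", "m5"}
mrna_targets = {"m3", "m4", "m5", "m6", "m7"}
p_value = shared_mirna_p(lnc_targets, mrna_targets, background)
print(p_value)
```

Under this formulation, pairs whose p-value passes a chosen cutoff would be kept as candidate ceRNA edges, typically after also filtering on differential expression.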

