Improving the Prediction of Disease-associated Genes by Integrating Annotated Gene Sets

Author(s):  
Chao Deng ◽  
Cui-Xiang Lin ◽  
Hong-Dong Li
2020 ◽  
Author(s):  
Gryglewski Gregor ◽  
Murgaš Matej ◽  
Michenthaler Paul ◽  
Klöbl Manfred ◽  
Reed Murray Bruce ◽  
...  

Abstract: The parcellation of the cerebral cortex serves the investigation of the emergence of uniquely human brain functions and disorders. We employed hierarchical clustering based on comprehensive transcriptomic data of the human cortex in order to delineate areas with distinct gene expression profiles. These profiles were analyzed for the enrichment of gene sets associated with brain disorders by genome-wide association studies (GWAS) and expert curation. This suggested new roles of specific cortical areas in psychiatric, neurodegenerative, congenital and other neurological disorders while reproducing some well-established links for movement disorders and dementias. GWAS-derived gene sets for psychiatric disorders exhibited similar enrichment patterns in the posterior fusiform gyrus and inferior parietal lobule driven by pleiotropic genes. This implies that the effects of risk variants shared between neuropsychiatric disorders might converge in these areas. For several diseases, specific genes were highlighted, which may aid the discovery of novel disease mechanisms and urgently needed treatments.
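The abstract does not say which enrichment statistic was used. As a generic illustration only, one common way to test whether a disease gene set is over-represented among the genes characterising a cortical area is a one-sided hypergeometric test. The function name and the toy numbers below are assumptions, not values from the study.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    universe_size: number of genes considered in total
    set_size:      number of those genes that belong to the disease gene set
    hits_size:     number of genes selected for the area (e.g. area markers)
    overlap:       number of selected genes that are also in the gene set
    """
    max_overlap = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    # Sum the hypergeometric probabilities for every overlap at least as large.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes, 5 in the gene set, 4 area markers, 3 shared.
p_value = hypergeom_enrichment_p(universe_size=20, set_size=5, hits_size=4, overlap=3)
print(p_value)
```

In practice such p-values would be corrected for testing many gene sets and areas; that step is omitted here.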


2017 ◽  
Author(s):  
T Itzel ◽  
R Spang ◽  
T Maass ◽  
S Munker ◽  
HJ Schlitt ◽  
...  

2018 ◽  
Vol 21 (2) ◽  
pp. 74-83
Author(s):  
Tzu-Hung Hsiao ◽  
Yu-Chiao Chiu ◽  
Yu-Heng Chen ◽  
Yu-Ching Hsu ◽  
Hung-I Harry Chen ◽  
...  

Aim and Objective: The number of anticancer drugs currently available is limited, and some of them have low treatment response rates. Moreover, developing a new drug for cancer therapy is labor intensive and sometimes cost prohibitive. Therefore, “repositioning” of known cancer treatment compounds can shorten development time and potentially increase the response rate of cancer therapy. This study proposes a systems biology method for identifying new compound candidates for cancer treatment in two separate procedures. Materials and Methods: First, a “gene set–compound” network was constructed by conducting gene set enrichment analysis on the expression profile of responses to a compound. Second, survival analyses were applied to gene expression profiles derived from four breast cancer patient cohorts to identify gene sets that are associated with cancer survival. A “cancer–functional gene set–compound” network was constructed, and candidate anticancer compounds were identified. Using breast cancer as an example, 162 breast cancer survival-associated gene sets and 172 putative compounds were obtained. Results: We demonstrated how to utilize the clinical relevance of previous studies through gene sets and then connect it to candidate compounds by using gene expression data from the Connectivity Map. Specifically, we chose a gene set derived from a stem cell study to demonstrate its association with breast cancer prognosis and discussed six new compounds that can increase the expression of the gene set after treatment. Conclusion: Our method can effectively identify compounds with a potential to be “repositioned” for cancer treatment according to their active mechanisms and their association with patients’ survival time.
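The two procedures described above can be pictured as a join on gene sets: compounds are linked to the gene sets they perturb, and only gene sets that are also associated with survival are kept. The sketch below is a minimal illustration of that join. All compound and gene set names are hypothetical placeholders, and the enrichment and survival analyses themselves are assumed to have been run already.

```python
# Hypothetical output of the first procedure: gene sets enriched in the
# expression response to each compound.
compound_to_gene_sets = {
    "compound_A": {"stemness_up", "cell_cycle"},
    "compound_B": {"hypoxia"},
    "compound_C": {"cell_cycle", "immune_response"},
}

# Hypothetical output of the second procedure: gene sets associated with survival.
survival_gene_sets = {"stemness_up", "immune_response"}


def candidate_compounds(compound_to_gene_sets, survival_gene_sets):
    """Keep compounds that share at least one gene set with the survival list."""
    candidates = {}
    for compound, gene_sets in compound_to_gene_sets.items():
        shared = gene_sets & survival_gene_sets
        if shared:
            candidates[compound] = sorted(shared)
    return candidates


network = candidate_compounds(compound_to_gene_sets, survival_gene_sets)
print(network)
```

A real analysis would also track the direction of each effect (whether the compound raises or lowers a gene set, and whether high expression is favourable), which this sketch ignores.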


2020 ◽  
Vol 15 ◽  
Author(s):  
Chen-An Tsai ◽  
James J. Chen

Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on identification of differentially expressed gene sets in a given phenotype. Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways. Methods: We apply co-inertia analysis to cross-gene-set comparisons in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods. Result and Conclusion: We also combine between-gene-set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
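The CIA objective stated in the abstract can be written out for the simplest case. The formulation below is a sketch under assumptions that are not in the abstract: X (n × p) and Y (n × q) are column-centered expression matrices for two gene sets on the same n samples, samples are weighted uniformly, and identity metrics are used on the gene spaces. General CIA allows other row weights and column metrics, and the 1/n factor may be replaced by 1/(n − 1).

```latex
% Assumed notation: X is n x p, Y is n x q, both column-centered; u, v are unit-length axes.
\begin{aligned}
(u_1, v_1) &= \arg\max_{\|u\| = \|v\| = 1} \operatorname{cov}^2(Xu,\, Yv)
            = \arg\max_{\|u\| = \|v\| = 1} \left( \tfrac{1}{n}\, u^{\top} X^{\top} Y v \right)^{2}, \\
\tfrac{1}{n}\, X^{\top} Y &= U \Sigma V^{\top}
  \quad\Longrightarrow\quad
  u_k = U_{\cdot k},\;\; v_k = V_{\cdot k},\;\;
  \operatorname{cov}^2(X u_k,\, Y v_k) = \sigma_k^{2}, \\
\text{total co-inertia} &= \sum_k \sigma_k^{2}
  = \left\| \tfrac{1}{n}\, X^{\top} Y \right\|_F^{2}.
\end{aligned}
```

Under these assumptions the successive axes are the singular vectors of the cross-covariance matrix, and the squared singular values give the share of co-structure carried by each pair of axes.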


2021 ◽  
pp. 036354652110232
Author(s):  
Jessica M. Eager ◽  
William J. Warrender ◽  
Carly B. Deusenbery ◽  
Grant Jamgochian ◽  
Arjun Singh ◽  
...  

Background: Impaired healing after rotator cuff repair is a major concern, with retear rates as high as 94%. A method to predict whether patients are likely to experience poor surgical outcomes would change clinical practice. While various patient factors, such as age and tear size, have been linked to poor functional outcomes, it is currently very challenging to predict outcomes before surgery. Purpose: To evaluate gene expression differences in tissue collected during surgery between patients who ultimately went on to have good outcomes and those who experienced a retear, in an effort to determine if surgical outcomes can be predicted. Study Design: Case-control study; Level of evidence, 3. Methods: Rotator cuff tissue was collected at the time of surgery from 140 patients. Patients were tracked for a minimum of 6 months to identify those with good or poor outcomes, using clinical functional scores and follow-up magnetic resonance imaging to confirm failure to heal or retear. Gene expression differences between 8 patients with poor outcomes and 28 patients with good outcomes were assessed using a multiplex gene expression analysis via NanoString and a custom-curated panel of 145 genes related to various stages of rotator cuff healing. Results: Although significant differences in the expression of individual genes were not observed, gene set enrichment analysis highlighted major differences in gene sets. Patients who had poor healing outcomes showed greater expression of gene sets related to extracellular matrix production (P < .0001) and cellular biosynthetic pathways (P < .001), while patients who had good healing outcomes showed greater expression of genes associated with the proinflammatory (M1) macrophage phenotype (P < .05). Conclusion: These results suggest that a more proinflammatory, fibrotic environment before repair may play a role in poor healing outcome. With validation in a larger cohort, these results may ultimately lead to diagnostic methods to preoperatively predict those at risk for poor surgical outcomes.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Gulden Olgun ◽  
Afshan Nabi ◽  
Oznur Tastan

Background: While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close to each other on the genome are often regulated together, so genomic proximity can hint at a functional association. Results: We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and cancer-specific expression data for coding RNAs and ncRNAs. Additionally, users can supply custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast. Conclusions: NoRCE is a platform-independent, user-friendly, comprehensive R package that can be used to gain insight into the functional importance of a list of ncRNAs of any type. The tool lets users design their own analysis pipeline and run their preferred set of analyses. NoRCE is available in Bioconductor and at https://github.com/guldenolgun/NoRCE.
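The first step of a cis enrichment analysis, collecting the coding genes that lie near each input ncRNA, can be illustrated as follows. This is a language-neutral sketch in Python and is not NoRCE's API (NoRCE is an R package). The `Feature` record, the `genes_near` function, the 10 kb window, and all coordinates are assumptions made for illustration.

```python
from collections import namedtuple

# Minimal genomic interval record; field names are illustrative.
Feature = namedtuple("Feature", "name chrom start end")


def genes_near(ncrna, coding_genes, window=10_000):
    """Return names of coding genes on the same chromosome whose interval
    overlaps the ncRNA interval extended by `window` bases on each side."""
    lo = max(0, ncrna.start - window)
    hi = ncrna.end + window
    return [
        gene.name
        for gene in coding_genes
        if gene.chrom == ncrna.chrom and gene.start <= hi and gene.end >= lo
    ]


# Hypothetical coordinates.
ncrna = Feature("ncRNA_1", "chr1", 50_000, 51_000)
coding = [
    Feature("GENE_A", "chr1", 55_000, 58_000),  # inside the extended window
    Feature("GENE_B", "chr1", 90_000, 95_000),  # too far away
    Feature("GENE_C", "chr2", 50_500, 52_000),  # different chromosome
]

nearby = genes_near(ncrna, coding)
print(nearby)
```

The functional annotations of the genes collected this way would then be tested for enrichment. Restricting neighbours to the same TAD, as the abstract mentions, would add one more condition to the filter.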


2021 ◽  
Vol 22 (4) ◽  
pp. 2183
Author(s):  
Nurhani Mat Razali ◽  
Siti Norvahida Hisham ◽  
Ilakiya Sharanee Kumar ◽  
Rohit Nandan Shukla ◽  
Melvin Lee ◽  
...  

Proper management of agricultural disease is important to ensure sustainable food security. Staple food crops like rice, wheat, cereals, and other cash crops hold great export value for countries. Ensuring proper supply is critical; hence any biotic or abiotic factors contributing to the shortfall in yield of these crops should be alleviated. Rhizoctonia solani is a major biotic factor that results in yield losses in many agriculturally important crops. This paper focuses on genome informatics of our Malaysian Draft R. solani AG1-IA, and the comparative genomics (inter- and intra-AG) with four AGs including China AG1-IA (AG1-IA_KB317705.1), AG1-IB, AG3, and AG8. The genomic content of repeat elements, transposable elements (TEs), syntenic genomic blocks, functions of protein-coding genes, as well as core orthologous genic information that underlies R. solani’s pathogenicity strategy were investigated. Our analyses show that all studied AGs have low content and varying profiles of TEs. Class I TEs were dominant in all AGs, much like in other basidiomycete pathogens. All AGs show a predominance of Glycoside Hydrolase protein-coding gene assignments, suggesting their importance in infiltration and infection of the host. Our profiling also provides a basis for further investigation of the lack of correlation observed between the number of pathogenicity and enzyme-related genes and host range. Despite being grouped within the same AG as China AG1-IA, our Draft AG1-IA exhibits differences in terms of protein-coding gene proportions and classifications. This implies that strains from the same AG do not necessarily have to retain similar proportions and classification of TEs but must have the necessary arsenal to enable successful infiltration and colonization of the host. In a larger perspective, all the studied AGs essentially share core genes that are generally involved in adhesion, penetration, and host colonization. However, the different infiltration strategies will depend on the level of host resilience, as clearly exhibited by the gene sets encoding the processes of infiltration, infection, and protection from the host.


Author(s):  
Sarra E Jamieson ◽  
Michaela Fakiola ◽  
Dave Tang ◽  
Elizabeth Scaman ◽  
Genevieve Syn ◽  
...  

Background: Our goal was to identify genetic risk factors for severe otitis media (OM) in Aboriginal Australians. Methods: Illumina® Omni2.5 BeadChip and imputed data were compared between 21 children with severe OM (multiple episodes chronic suppurative OM and/or perforations or tympanic sclerosis) and 370 individuals without this phenotype, followed by FUnctional Mapping and Annotation (FUMA). Exome data filtered for common (ExAC_all ≥ 0.1) putative deleterious variants influencing protein coding (CADD-scaled scores ≥ 15) were used to compare 15 severe OM cases with 9 mild cases (single episode of acute OM recorded over ≥ 3 consecutive years). Rare (ExAC_all ≤ 0.01) such variants were filtered for those present only in severe OM. Enrichr was used to determine enrichment of genes contributing to pathways/processes relevant to OM. Results: FUMA analysis identified two plausible genetic risk loci for severe OM: NR3C1 (P_imputed_1000G = 3.62 × 10⁻⁶) encoding the glucocorticoid receptor, and NREP (P_imputed_1000G = 3.67 × 10⁻⁶) encoding neuronal regeneration related protein. Exome analysis showed: (i) association of severe OM with variants influencing protein coding (CADD-scaled ≥ 15) in a gene-set (GRXCR1, CDH23, LRP2, FAT4, ARSA, EYA4) enriched for Mammalian Phenotype Level 4 abnormal hair cell stereociliary bundle morphology and related phenotypes; (ii) rare variants influencing protein coding only seen in severe OM provided gene-sets enriched for “abnormal ear” (LMNA, CDH23, LRP2, MYO7A, FGFR1), integrin interactions, transforming growth factor signalling, and cell projection phenotypes including hair cell stereociliary bundles and cilium assembly. Conclusions: This study highlights interacting genes and pathways related to cilium structure and function that may contribute to extreme susceptibility to OM in Aboriginal Australian children.
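The two exome filters described in the Methods can be sketched as simple predicates. The thresholds (ExAC_all ≥ 0.1 for common, ExAC_all ≤ 0.01 for rare, CADD-scaled ≥ 15) come from the abstract. The record layout, field names, and toy variants are hypothetical, and the study's actual annotation pipeline is not described in enough detail to reproduce here.

```python
def is_putatively_deleterious(variant, cadd_min=15.0):
    """CADD-scaled score at or above the threshold given in the abstract."""
    return variant["cadd"] >= cadd_min


def common_deleterious(variants, freq_min=0.1):
    """Common (ExAC_all >= 0.1) putatively deleterious variants."""
    return [
        v for v in variants
        if v["exac_all"] >= freq_min and is_putatively_deleterious(v)
    ]


def rare_deleterious_severe_only(variants, freq_max=0.01):
    """Rare (ExAC_all <= 0.01) putatively deleterious variants that were
    observed in severe cases and in no mild case."""
    return [
        v for v in variants
        if v["exac_all"] <= freq_max
        and is_putatively_deleterious(v)
        and v["in_severe"]
        and not v["in_mild"]
    ]


# Hypothetical toy records; ids and values are placeholders.
variants = [
    {"id": "v1", "exac_all": 0.25, "cadd": 22.0, "in_severe": True, "in_mild": True},
    {"id": "v2", "exac_all": 0.005, "cadd": 18.0, "in_severe": True, "in_mild": False},
    {"id": "v3", "exac_all": 0.005, "cadd": 9.0, "in_severe": True, "in_mild": False},
    {"id": "v4", "exac_all": 0.002, "cadd": 30.0, "in_severe": True, "in_mild": True},
]

common_ids = [v["id"] for v in common_deleterious(variants)]
rare_ids = [v["id"] for v in rare_deleterious_severe_only(variants)]
print(common_ids, rare_ids)
```

The genes carrying the retained variants would then be passed to an enrichment tool such as Enrichr, as the abstract describes.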

