NExUS: Bayesian simultaneous network estimation across unequal sample sizes

2019 ◽  
Vol 36 (3) ◽  
pp. 798-804
Author(s):  
Priyam Das ◽  
Christine B Peterson ◽  
Kim-Anh Do ◽  
Rehan Akbani ◽  
Veerabhadran Baladandayuthapani

Abstract Motivation: Network-based analyses of high-throughput genomics data provide a holistic, systems-level understanding of various biological mechanisms for a common population. However, when estimating multiple networks across heterogeneous sub-populations, varying sample sizes pose a challenge in the estimation and inference, as network differences may be driven by differences in power. We are particularly interested in addressing this challenge in the context of proteomic networks for related cancers, as the number of subjects available for rare cancer (sub-)types is often limited. Results: We develop NExUS (Network Estimation across Unequal Sample sizes), a Bayesian method that enables joint learning of multiple networks while avoiding an artefactual relationship between sample size and network sparsity. We demonstrate through simulations that NExUS outperforms existing network estimation methods in this context, and apply it to learn network similarity and shared pathway activity for groups of cancers with related origins represented in The Cancer Genome Atlas (TCGA) proteomic data. Availability and implementation: The NExUS source code is freely available for download at https://github.com/priyamdas2/NExUS. Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Author(s):  
Bowen Liu ◽  
Xiaofei Yang ◽  
Tingjie Wang ◽  
Jiadong Lin ◽  
Yongyong Kang ◽  
...  

Abstract Motivation: Tumor purity is a fundamental property of each cancer sample and affects downstream investigations. Current tumor purity estimation methods either require a matched normal sample or report moderately high tumor purity even on normal samples. It is critical to develop a novel computational approach to estimate tumor purity with sufficient precision based on tumor-only samples. Results: In this study, we developed MEpurity, a beta mixture model-based algorithm, to estimate the tumor purity based on tumor-only Illumina Infinium 450k methylation microarray data. We applied MEpurity to both The Cancer Genome Atlas (TCGA) cancer data and cancer cell line data, demonstrating that MEpurity reports low tumor purity on normal samples and results on tumor samples comparable with those of other state-of-the-art methods. Availability and implementation: MEpurity is a C++ program available at https://github.com/xjtu-omics/MEpurity. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (21) ◽  
pp. 4469-4471 ◽  
Author(s):  
Kristoffer Vitting-Seerup ◽  
Albin Sandelin

Abstract Summary: Alternative splicing is an important mechanism involved in health and disease. Recent work highlights the importance of investigating genome-wide changes in splicing patterns and the subsequent functional consequences. Current computational methods only support such analysis on a gene-by-gene basis. Therefore, we extended the IsoformSwitchAnalyzeR R library to enable analysis of genome-wide changes in specific types of alternative splicing and predicted functional consequences of the resulting isoform switches. As a case study, we analyzed RNA-seq data from The Cancer Genome Atlas and found systematic changes in alternative splicing and the consequences of the associated isoform switches. Availability and implementation: Windows, Linux and Mac OS: http://bioconductor.org/packages/IsoformSwitchAnalyzeR. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (12) ◽  
pp. 3944-3946 ◽  
Author(s):  
Shanyu Chen ◽  
Xiaoyu He ◽  
Ruilin Li ◽  
Xiaohong Duan ◽  
Beifang Niu

Abstract Motivation: HotSpot3D is widely used software for identifying mutation hotspots on the 3D structures of proteins. To further assist users, we developed a new HotSpot3D web server to make this software more versatile, convenient and interactive. Results: The HotSpot3D web server performs data pre-processing, clustering, visualization and log-viewing in one stop. Users can interactively explore each cluster and easily re-visualize the mutational clusters within browsers. We also provide a database that allows users to search and visualize proximal mutations from 33 cancers in The Cancer Genome Atlas. Availability and implementation: http://niulab.scgrid.cn/HotSpot3D/. Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
pp. 1-16 ◽  
Author(s):  
Victor M. Villalobos ◽  
Yan C. Wang ◽  
Branimir I. Sikic

Purpose: The ovarian cancer data set from The Cancer Genome Atlas integrates genomic and proteomic data with clinical annotations based on chart abstractions. We aimed to develop an algorithm to create a matching, more accessible clinical data set cataloging time to treatment failure (TTF) of sequential lines of treatment in patients with serous ovarian cancers. Materials and Methods: The master data set of 587 patients with serous ovarian cancer was condensed into a more homogeneous and clinically relevant population comprising high-risk patients with both grade 3 cancers and stage III or IV disease, resulting in a subgroup of 450 patients. We quantified the TTF of different lines of therapy as well as different therapeutic combinations by extrapolating from the time of starting one therapy to the time of starting a subsequent therapy. Results: The overall survival (OS) of patients was highly related to platinum sensitivity status, with median OS times of 56.6, 27.0, and 11.6 months in patients who had platinum-sensitive, -resistant, or -refractory disease, respectively. In high-risk patients, the median TTFs were 14.8, 10.2, 5.7, and 4.1 months with the first, second, third, and fourth lines of chemotherapy, respectively. Patients with stable disease after first-line therapy had OS outcomes similar to those of patients with partial remissions (34.4 v 33.7 months, respectively). Conclusion: This new data set enhances the clinical annotation by providing exploitable chemotherapy benefit data that can now be paired with genomic and proteomic data within The Cancer Genome Atlas data. The major determinant of OS in this study was platinum sensitivity status. TTF decreased with each successive line of therapy. However, patients who achieved only stable disease with first-line therapy had OS similar to those with partial remission.
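The TTF rule described in the abstract above (the interval from the start of one therapy to the start of the next) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function name, the 30.44-day month, the decision to report no TTF for the final line of therapy, and the patient dates are all assumptions made for illustration.

```python
from datetime import date

# Assumption: convert days to months using the average month length.
DAYS_PER_MONTH = 30.44


def time_to_treatment_failure(line_starts):
    """Approximate TTF (in months) for every line of therapy except the last.

    TTF of line k is taken as the interval between the start of line k and
    the start of line k+1. The last line has no successor, so this sketch
    reports no TTF for it (an assumption; the abstract does not say how the
    final line is handled).
    """
    starts = sorted(line_starts)
    return [
        round((nxt - cur).days / DAYS_PER_MONTH, 1)
        for cur, nxt in zip(starts, starts[1:])
    ]


# Hypothetical patient with three lines of chemotherapy (made-up dates).
starts = [date(2005, 1, 10), date(2006, 3, 15), date(2006, 12, 1)]
ttf = time_to_treatment_failure(starts)
print(ttf)  # one value per line of therapy that was followed by another
```

For this hypothetical patient the second interval is shorter than the first, the same direction as the decreasing median TTFs reported in the abstract, although a single invented record carries no evidential weight.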


2019 ◽  
Vol 36 (1) ◽  
pp. 241-249 ◽  
Author(s):  
Rudolf Schill ◽  
Stefan Solbrig ◽  
Tilo Wettig ◽  
Rainer Spang

Abstract Motivation: Cancer progresses by accumulating genomic events, such as mutations and copy number alterations, whose chronological order is key to understanding the disease but difficult to observe. Instead, cancer progression models use co-occurrence patterns in cross-sectional data to infer epistatic interactions between events and thereby uncover their most likely order of occurrence. State-of-the-art progression models, however, are limited by mathematical tractability and only allow events to interact in directed acyclic graphs, to promote but not inhibit subsequent events, or to be mutually exclusive in distinct groups that cannot overlap. Results: Here we propose Mutual Hazard Networks (MHN), a new Machine Learning algorithm to infer cyclic progression models from cross-sectional data. MHN model events by their spontaneous rate of fixation and by multiplicative effects they exert on the rates of successive events. MHN compared favourably to acyclic models in cross-validated model fit on four datasets tested. In application to the glioblastoma dataset from The Cancer Genome Atlas, MHN proposed a novel interaction in line with consecutive biopsies: IDH1 mutations are early events that promote subsequent fixation of TP53 mutations. Availability and implementation: Implementation and data are available at https://github.com/RudiSchill/MHN. Supplementary information: Supplementary data are available at Bioinformatics online.
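The modelling idea in the abstract above (a spontaneous rate per event, scaled by multiplicative effects from events that have already occurred) can be sketched in a few lines of Python. This is only an illustration, not the code from the linked repository: the matrix layout (`theta[i][i]` as the baseline rate, `theta[i][j]` as the effect of event j on event i), the function name, and all numbers are assumptions.

```python
def event_rates(theta, state):
    """Rate of each event that has not yet occurred, given the current state.

    Assumed layout (hypothetical, for illustration):
      theta[i][i] : spontaneous (baseline) rate of event i
      theta[i][j] : multiplicative effect of event j on the rate of event i
      state[j]    : 1 if event j has already occurred, else 0
    """
    rates = {}
    for i, occurred in enumerate(state):
        if occurred:
            continue  # an event that has already fixed cannot occur again
        rate = theta[i][i]
        for j, present in enumerate(state):
            if present and j != i:
                rate *= theta[i][j]  # >1 promotes, <1 inhibits, 1 is neutral
        rates[i] = rate
    return rates


# Made-up parameters for two events: 0 = IDH1 mutation, 1 = TP53 mutation.
theta = [
    [0.5, 1.0],  # IDH1: baseline 0.5, unaffected by TP53
    [4.0, 0.2],  # TP53: baseline 0.2, promoted 4-fold once IDH1 has occurred
]

before = event_rates(theta, [0, 0])  # nothing has occurred yet
after = event_rates(theta, [1, 0])   # IDH1 has occurred
print(before, after)
```

Because each effect is a multiplier rather than an edge in an acyclic graph, two events can influence each other in both directions, and a multiplier below 1 expresses inhibition.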


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Vincent Carbonnier ◽  
Bernard Leroy ◽  
Shai Rosenberg ◽  
Thierry Soussi

Abstract The diagnosis of somatic and germline TP53 mutations in human tumors or in individuals prone to various types of cancer has now reached the clinic. To increase the accuracy of the prediction of TP53 variant pathogenicity, we gathered functional data from three independent large-scale saturation mutagenesis screening studies with experimental data for more than 10,000 TP53 variants performed in different settings (yeast or mammalian) and with different readouts (transcription, growth arrest or apoptosis). Correlation analysis and multidimensional scaling showed excellent agreement between all these variables. Furthermore, we found that some missense mutations localized in TP53 exons led to impaired TP53 splicing as shown by an analysis of the TP53 expression data from The Cancer Genome Atlas. With the increasing availability of genomic, transcriptomic and proteomic data, it is essential to employ both protein and RNA prediction to accurately define variant pathogenicity.


2019 ◽  
Vol 35 (17) ◽  
pp. 3140-3142 ◽  
Author(s):  
Marc Streit ◽  
Samuel Gratzl ◽  
Holger Stitz ◽  
Andreas Wernitznig ◽  
Thomas Zichner ◽  
...  

Abstract Summary: Ordino is a web-based analysis tool for cancer genomics that allows users to flexibly rank, filter and explore genes, cell lines and tissue samples based on pre-loaded data, including The Cancer Genome Atlas, the Cancer Cell Line Encyclopedia and manually uploaded information. Interactive tabular data visualization that facilitates the user-driven prioritization process forms a core component of Ordino. Detail views of selected items complement the exploration. Findings can be stored, shared and reproduced via the integrated session management. Availability and implementation: Ordino is publicly available at https://ordino.caleydoapp.org. The source code is released at https://github.com/Caleydo/ordino under the Mozilla Public License 2.0. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Bo Yang ◽  
Ting-Ting Xin ◽  
Shan-Min Pang ◽  
Meng Wang ◽  
Yi-Jie Wang

Abstract Motivation: Precise prediction of cancer subtypes is of significant importance in cancer diagnosis and treatment. Disease etiology is complicated and exists at different omics levels; hence, integrative analysis provides a very effective way to improve our understanding of cancer. Results: We propose a novel computational framework, named Deep Subspace Mutual Learning (DSML). DSML can simultaneously learn the subspace structures in each available omics data set and in the overall multi-omics data by adopting deep neural networks, which thereby facilitates subtype prediction via clustering on multi-level, single-level, and partial-level omics data. Extensive experiments are performed in five different cancers on three levels of omics data from The Cancer Genome Atlas. The experimental analysis demonstrates that DSML delivers comparable or even better results than many state-of-the-art integrative methods. Availability: An implementation and documentation of DSML are publicly available at https://github.com/polytechnicXTT/Deep-Subspace-Mutual-Learning.git. Supplementary information: Supplementary data are available at Bioinformatics online.


2017 ◽  
Vol 34 (6) ◽  
pp. 1024-1030 ◽  
Author(s):  
Jun Cheng ◽  
Xiaokui Mo ◽  
Xusheng Wang ◽  
Anil Parwani ◽  
Qianjin Feng ◽  
...  

Abstract Motivation: As a highly heterogeneous disease, the progression of a tumor is not only achieved by unlimited growth of the tumor cells, but also supported, stimulated, and nurtured by the microenvironment around it. However, traditional qualitative and/or semi-quantitative parameters obtained by a pathologist's visual examination have very limited capability to capture this interaction between the tumor and its microenvironment. With the advent of digital pathology, computerized image analysis may provide a better tumor characterization and give new insights into this problem. Results: We propose a novel bioimage informatics pipeline for automatically characterizing the topological organization of different cell patterns in the tumor microenvironment. We apply this pipeline to the only publicly available large histopathology image dataset for a cohort of 190 patients with papillary renal cell carcinoma obtained from The Cancer Genome Atlas project. Experimental results show that the proposed topological features can successfully stratify early- and middle-stage patients with distinct survival, and show superior performance to traditional clinical features and cellular morphological and intensity features. The proposed features not only provide new insights into the topological organizations of cancers, but also can be integrated with genomic data in future studies to develop new integrative biomarkers. Availability and implementation: https://github.com/chengjun583/KIRP-topological-features. Supplementary information: Supplementary data are available at Bioinformatics online.


2015 ◽  
Vol 32 (6) ◽  
pp. 952-954 ◽  
Author(s):  
Ying-Wooi Wan ◽  
Genevera I. Allen ◽  
Zhandong Liu

Abstract Motivation: Massive amounts of high-throughput genomics data profiled from tumor samples were made publicly available by The Cancer Genome Atlas (TCGA). Results: We have developed an open source software package, TCGA2STAT, to obtain the TCGA data, wrangle it, and pre-process it into a format ready for multivariate and integrated statistical analysis in the R environment. In a user-friendly format with one single function call, our package downloads and fully processes the desired TCGA data to be seamlessly integrated into a computational analysis pipeline. No further technical or biological knowledge is needed to utilize our software, thus making TCGA data easily accessible to data scientists without specific domain knowledge. Availability and implementation: TCGA2STAT is available from https://cran.r-project.org/web/packages/TCGA2STAT/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.

