Integrating hypertension phenotype and genotype with hybrid non-negative matrix factorization

Yuan Luo; Chengsheng Mao; Yiben Yang; Fei Wang; Faraz S Ahmad; Donna Arnett; Marguerite R Irvin; Sanjiv J Shah

doi:10.1093/bioinformatics/bty804

Integrating hypertension phenotype and genotype with hybrid non-negative matrix factorization

Bioinformatics ◽

10.1093/bioinformatics/bty804 ◽

2018 ◽

Vol 35 (8) ◽

pp. 1395-1403 ◽

Cited By ~ 3

Author(s):

Yuan Luo ◽

Chengsheng Mao ◽

Yiben Yang ◽

Fei Wang ◽

Faraz S Ahmad ◽

...

Keyword(s):

Matrix Factorization ◽

Cardiac Mechanics ◽

Approximation Problem ◽

Supplementary Information ◽

Joint Analysis ◽

Learning Approaches ◽

Patient Stratification ◽

Projected Gradient Method ◽

Genotype Information ◽

Non Negative Matrix Factorization

Abstract

Motivation: Hypertension is a heterogeneous syndrome in need of improved subtyping using phenotypic and genetic measurements, with the goal of identifying subtypes of patients who share similar pathophysiologic mechanisms and may respond more uniformly to targeted treatments. Existing machine learning approaches often face challenges in integrating phenotype and genotype information and in presenting an interpretable model to clinicians. We aim to provide informed patient stratification based on phenotype and genotype features.

Results: In this article, we present a hybrid non-negative matrix factorization (HNMF) method to integrate phenotype and genotype information for patient stratification. HNMF simultaneously approximates the phenotypic and genetic feature matrices using a loss function appropriate to each, and generates patient subtypes, phenotypic groups and genetic groups. Unlike previous methods, HNMF approximates the phenotypic matrix under Frobenius loss and the genetic matrix under Kullback-Leibler (KL) loss. We propose an alternating projected gradient method to solve the approximation problem. Simulation shows that HNMF converges quickly and accurately to the true factor matrices. On a real-world clinical dataset, we used the patient factor matrix as features and examined the association of these features with indices of cardiac mechanics. We compared HNMF with six different models using phenotype or genotype features alone, with or without NMF, or using joint NMF with only one type of loss. We also compared HNMF with three recently published methods for integrative clustering analysis: iClusterBayes, Bayesian joint analysis and JIVE. HNMF significantly outperforms all comparison models. HNMF also reveals intuitive phenotype–genotype interactions that characterize cardiac abnormalities.

Availability and implementation: Our code is publicly available on GitHub at https://github.com/yuanluo/hnmf.

Supplementary information: Supplementary data are available at Bioinformatics online.
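
The hybrid objective described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (their code is at the GitHub link above). The names `hybrid_loss`, `projected_step` and `hnmf_sketch`, the weight `lam`, the shared patient factor `W`, and the backtracking step-size rule are all assumptions made for illustration. The sketch fits a phenotype matrix `X1` under Frobenius loss and a genotype matrix `X2` under generalized KL loss, updating one factor at a time with a projected gradient step.

```python
import numpy as np

EPS = 1e-9  # keeps W @ H2 strictly positive inside the log and the division


def hybrid_loss(X1, X2, W, H1, H2, lam):
    """Frobenius loss on X1 plus lam times generalized KL loss on X2."""
    frob = 0.5 * np.sum((X1 - W @ H1) ** 2)
    R = W @ H2 + EPS
    kl = np.sum(X2 * np.log((X2 + EPS) / R) - X2 + R)
    return frob + lam * kl


def projected_step(loss_fn, M, G, step=1.0, shrink=0.5, max_tries=30):
    """One projected gradient step on block M, halving the step until the loss does not increase."""
    base = loss_fn(M)
    for _ in range(max_tries):
        cand = np.maximum(M - step * G, 0.0)  # project onto the non-negative orthant
        if loss_fn(cand) <= base:
            return cand
        step *= shrink
    return M  # no improving step found; keep the block unchanged


def hnmf_sketch(X1, X2, k, lam=1.0, n_iter=100, seed=0):
    """Alternate projected gradient steps over W (patients), H1 (phenotypes), H2 (genotypes)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X1.shape[0], k))
    H1 = rng.random((k, X1.shape[1]))
    H2 = rng.random((k, X2.shape[1]))
    history = [hybrid_loss(X1, X2, W, H1, H2, lam)]
    for _ in range(n_iter):
        # The shared patient factor receives gradients from both losses.
        Q = 1.0 - X2 / (W @ H2 + EPS)
        gW = (W @ H1 - X1) @ H1.T + lam * Q @ H2.T
        W = projected_step(lambda M: hybrid_loss(X1, X2, M, H1, H2, lam), W, gW)
        # The phenotype factor only sees the Frobenius term.
        gH1 = W.T @ (W @ H1 - X1)
        H1 = projected_step(lambda M: hybrid_loss(X1, X2, W, M, H2, lam), H1, gH1)
        # The genotype factor only sees the KL term.
        gH2 = lam * W.T @ (1.0 - X2 / (W @ H2 + EPS))
        H2 = projected_step(lambda M: hybrid_loss(X1, X2, W, H1, M, lam), H2, gH2)
        history.append(hybrid_loss(X1, X2, W, H1, H2, lam))
    return W, H1, H2, history
```

Calling `hnmf_sketch(X1, X2, k=3)` on two non-negative matrices with the same number of rows returns the three factors and the loss history. The rows of `W` can then serve as patient features, in the spirit of the analysis the abstract describes.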


A Block Coordinate Descent-Based Projected Gradient Algorithm for Orthogonal Non-Negative Matrix Factorization

Mathematics ◽

10.3390/math9050540 ◽

2021 ◽

Vol 9 (5) ◽

pp. 540

Author(s):

Soodabeh Asadi ◽

Janez Povh

Keyword(s):

Matrix Factorization ◽

Synthetic Data ◽

Coordinate Descent ◽

The Other ◽

Gradient Algorithm ◽

Block Coordinate Descent ◽

Projected Gradient Method ◽

Projected Gradient ◽

Factorization Problem ◽

Non Negative Matrix Factorization

This article uses the projected gradient (PG) method for a non-negative matrix factorization (NMF) problem in which one or both matrix factors must have orthonormal columns or rows. We penalize the orthonormality constraints and apply the PG method via a block coordinate descent approach: at each step, one matrix factor is fixed and the other is updated by moving along the steepest descent direction computed from the penalized objective function and then projecting onto the space of non-negative matrices. Our method is tested on two sets of synthetic data for various values of the penalty parameters. Its performance is compared to the well-known multiplicative update (MU) method from Ding (2006) and to a modified, globally convergent variant of the MU algorithm recently proposed by Mirzal (2014). We provide extensive numerical results coupled with appropriate visualizations, which demonstrate that our method is very competitive and usually outperforms the other two methods.
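
The block coordinate descent scheme in this abstract can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' algorithm. It assumes the orthonormality constraint is placed on the rows of `H` and enforced with a quadratic penalty weighted by `mu`. The function names and the backtracking step-size rule are likewise assumptions.

```python
import numpy as np


def onmf_loss(X, W, H, mu):
    """Reconstruction error plus a penalty on the deviation of H @ H.T from the identity."""
    fit = 0.5 * np.sum((X - W @ H) ** 2)
    penalty = 0.5 * mu * np.sum((H @ H.T - np.eye(H.shape[0])) ** 2)
    return fit + penalty


def projected_step(loss_fn, M, G, step=1.0, shrink=0.5, max_tries=30):
    """Move along the negative gradient, project onto M >= 0, and backtrack if the loss rises."""
    base = loss_fn(M)
    for _ in range(max_tries):
        cand = np.maximum(M - step * G, 0.0)
        if loss_fn(cand) <= base:
            return cand
        step *= shrink
    return M


def onmf_pg(X, k, mu=1.0, n_iter=100, seed=0):
    """Alternate between the two blocks: fix H and update W, then fix W and update H."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k))
    H = rng.random((k, X.shape[1]))
    eye = np.eye(k)
    history = [onmf_loss(X, W, H, mu)]
    for _ in range(n_iter):
        gW = (W @ H - X) @ H.T
        W = projected_step(lambda M: onmf_loss(X, M, H, mu), W, gW)
        # The gradient of the penalty 0.5 * mu * ||H H^T - I||_F^2 is 2 * mu * (H H^T - I) H.
        gH = W.T @ (W @ H - X) + 2.0 * mu * (H @ H.T - eye) @ H
        H = projected_step(lambda M: onmf_loss(X, W, M, mu), H, gH)
        history.append(onmf_loss(X, W, H, mu))
    return W, H, history
```

Larger values of `mu` push `H @ H.T` closer to the identity at the cost of reconstruction accuracy, which is one reason an experiment might sweep several penalty values.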


Highly Comprehensive Genomic Testing for CLL: WGS, One Key to CLL Patient Stratification

Blood ◽

10.1182/blood-2018-99-115935 ◽

2018 ◽

Vol 132 (Supplement 1) ◽

pp. 3115-3115

Author(s):

Kate E Ridout ◽

Pauline Robbe ◽

Doriane Cavalieri ◽

Jennifer Becq ◽

Miao He ◽

...

Keyword(s):

Board Of Directors ◽

Matrix Factorization ◽

Research Funding ◽

Patient Stratification ◽

Mutational Signatures ◽

Advisory Committees ◽

Coding Regions ◽

Genomic Complexity ◽

Stratification Method ◽

Non Negative Matrix Factorization

Abstract

Background: Chronic Lymphocytic Leukemia (CLL) is characterised by a highly heterogeneous natural history and treatment response. Indeed, 50% of immunoglobulin heavy chain variable region (IgHV) hypermutated patients have an excellent progression free survival (PFS) after chemoimmunotherapy. Conversely, 25% of FCR treated patients relapse within 24 months (high risk CLL). Recent studies have shown that complex karyotype, with or without TP53 disruption, predicts relapse after BCL2 therapy and BTK inhibitors. However, TP53 is the only marker for which routine testing is available. Overall, nearly 80% of patients relapsing after frontline FCR do not present a known poor risk genomic marker. Additional candidate genomic predictors of poor outcome have been reported in CLL, including mutations in coding regions of NOTCH1, SF3B1 and RPS15, non-coding regions of NOTCH1 and enhancer regions of PAX5, telomere length, IgHV status, and DNA Damage Repair (DDR) germline mutations including TP53 and ATM. Further, the role of mutational signatures and regions of kataegis also merits additional investigation in progressive CLL. Evaluating all candidate predictors requires complex, time-consuming, multi-modality testing outside the scope of routine clinical diagnostic practice; in isolation, however, each has low predictive value. Here, we show preliminary data on a novel patient stratification method based on whole genome sequencing (WGS) data incorporating multiple genomic features in a single test.

Patients and Methods: Tumor (peripheral blood) and germline (saliva) samples were collected from 321 patients from 6 UK trials via the Genomics England CLL pilot: ARCTIC (n=61), AdMIRe (n=64), CLL 210 (n=30), CLEAR (n=12), RIAltO (n=88) and FLAIR (n=66). We performed WGS on the HiSeqX (Illumina). After read alignment, we detected somatic variants using Strelka 2.4.7 for small variant detection (SNVs and InDels), Manta 0.28.0 for structural variant (SV) detection, and Canvas 1.3.1 for copy number variant (CNV) detection (Illumina). Non-coding regions were annotated with information from primary CLL, CLL cell lines and B-cell ENCODE databases. Mutational signatures and putative regions of kataegis were calculated based on Alexandrov et al. (Nature, 2013) and Lawrence et al. (Nature, 2013). Telomere lengths were assessed using Telomerecat. Data aggregation was performed using contingency tables combined with non-negative matrix factorization.

Results: Mean coverage was 94.2X for tumor and 28.5X for germline samples. We found a median of 9172 SNPs/sample after filtering and 2348 indels/sample across 321 patients. High risk CLL was enriched for genomic complexity and poor prognostic mutations. The most frequently mutated genes were SF3B1 (17%), TP53 (13%), NOTCH1 (12%), IGLL5 (12%), and ATM (11%). Analysis of non-coding regions using DNA methylation markers, ATAC-seq and Hi-C revealed potential candidate regions associated with early relapse. Using CNA and SV data, we identified interesting patterns of genomic complexity and structural variants, including a trend towards enrichment of del8p in Relapse/Refractory and FCR non-responders. Additionally, we investigated mutation signatures and kataegis across coding and non-coding regions of the genome. We correlated exonic regions of DDR genes in germline data with clinical outcomes and extended this to genes mutated in both tumor and germline data, termed germline-tumor double-hits. We examined the relationship between the Alexandrov hypermutation signature, IgHV status (determined by % homology to the reference genome) and PFS, and combined mutational density at the Ig locus with mutation signature, aiming to predict IgHV status. Finally, we produced a binary contingency matrix and used non-negative matrix factorization to cluster the samples. This method highlighted patient groups with shared genomic profiles.

Conclusion: We present preliminary data on a patient stratification method derived from WGS of 321 paired germline and CLL trial samples. Our predictive signature includes driver gene mutations, CNAs, IgHV status, genomic complexity, telomere length, overall mutation burden and genes with germline-tumor double-hits. Our comprehensive, NGS-based patient stratification attempts to predict patient outcome in a single sequencing run.

Disclosures: Becq: Illumina: Employment. He: Illumina: Employment. Ross: Illumina: Employment. Bentley: Illumina: Employment. Pettitt: Celgene: Research Funding; Gilead: Research Funding; Roche: Research Funding; GSK/Novartis: Research Funding; Napp: Research Funding; AstraZeneca: Research Funding; Chugai: Research Funding. Hillmen: Novartis: Research Funding; Gilead Sciences, Inc.: Honoraria, Research Funding; Alexion Pharmaceuticals, Inc: Consultancy, Honoraria; F. Hoffmann-La Roche Ltd: Research Funding; Celgene: Research Funding; Acerta: Membership on an entity's Board of Directors or advisory committees; Abbvie: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Pharmacyclics: Research Funding; Janssen: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding. Schuh: Giles, Roche, Janssen, AbbVie: Honoraria.
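
The final clustering step, NMF applied to a binary sample-by-feature matrix, can be illustrated with a small NumPy sketch. The abstract does not say which NMF algorithm or cluster-assignment rule was used. The sketch assumes Lee-Seung multiplicative updates under Frobenius loss and assigns each sample to its dominant component. The function name `nmf_cluster` and its parameters are hypothetical.

```python
import numpy as np


def nmf_cluster(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative samples-by-features matrix X as W @ H and cluster samples by W."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    # Rescale each component by the norm of its feature profile so components are comparable,
    # then label every sample with the component that contributes most to it.
    scale = np.linalg.norm(H, axis=1)
    labels = np.argmax(W * scale, axis=1)
    return labels, W, H
```

Each row of `X` would correspond to one patient and each column to one binary genomic feature (for example, the presence of a mutation in a given gene). Samples that share a label have similar feature profiles under the factorization.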


Mutational signature learning with supervised negative binomial non-negative matrix factorization

Bioinformatics ◽

10.1093/bioinformatics/btaa473 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i154-i160 ◽

Cited By ~ 1

Author(s):

Xinrui Lyu ◽

Jean Garret ◽

Gunnar Rätsch ◽

Kjong-Van Lehmann

Keyword(s):

Matrix Factorization ◽

Negative Binomial ◽

Extraction Methods ◽

Supplementary Information ◽

Cancer Type ◽

Mutational Signatures ◽

Signature Extraction ◽

Mutational Signature ◽

Mutational Processes ◽

Non Negative Matrix Factorization

Abstract

Motivation: Understanding the underlying mutational processes of cancer patients has been a long-standing goal in the community and promises to provide new insights that could improve cancer diagnoses and treatments. Mutational signatures are summaries of the mutational processes, and improving the derivation of mutational signatures can yield new discoveries previously obscured by technical and biological confounders. Results from existing mutational signature extraction methods depend on the size of the available patient cohort and focus solely on the analysis of mutation count data, without exploiting metadata.

Results: Here we present a supervised method that utilizes cancer type as metadata to extract more distinctive signatures. More specifically, we use a negative binomial non-negative matrix factorization and add a support vector machine loss. We show that mutational signatures extracted by our proposed method have a lower reconstruction error and are designed to be more predictive of cancer type than those generated by unsupervised methods. Unlike existing unsupervised signature extraction methods, this design reduces the need for elaborate post-processing strategies to recover most of the known signatures. Signatures extracted by a supervised model used in conjunction with cancer-type labels are also more robust, especially when using small and potentially cancer-type limited patient cohorts. Finally, we adapted our model such that molecular features can be utilized to derive a corresponding mutational signature. We used APOBEC expression and MUTYH mutation status to demonstrate the possibilities that arise from this ability. We conclude that our method, which exploits available metadata, improves the quality of mutational signatures and helps derive more interpretable representations.

Availability and implementation: https://github.com/ratschlab/SNBNMF-mutsig-public.

Supplementary information: Supplementary data are available at Bioinformatics online.
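
To make the idea of "negative binomial NMF plus a support vector machine loss" concrete, the sketch below evaluates one plausible form of such a combined objective in NumPy. The abstract does not give the exact formulation, so everything here is an assumption: the negative binomial divergence with a fixed dispersion `alpha`, the one-vs-rest hinge loss on the exposures, the weight `lam`, and all function names. It only computes the objective and does not optimize it.

```python
import numpy as np

EPS = 1e-12  # guards the logarithm when a count or a fitted mean is zero


def nb_divergence(V, V_hat, alpha):
    """Negative binomial divergence between counts V and fitted means V_hat (dispersion alpha)."""
    V = np.asarray(V, dtype=float)
    V_hat = np.asarray(V_hat, dtype=float)
    fit = V * np.log((V + EPS) / (V_hat + EPS))
    shrink = (alpha + V) * np.log((alpha + V) / (alpha + V_hat))
    return float(np.sum(fit - shrink))


def hinge_loss(exposures, Y, A, b):
    """Mean one-vs-rest hinge loss; Y holds +1/-1 labels with one column per class."""
    margins = Y * (exposures @ A + b)
    return float(np.mean(np.maximum(0.0, 1.0 - margins)))


def supervised_objective(V, W, H, Y, A, b, alpha=10.0, lam=1.0):
    """Reconstruction term on the count matrix plus lam times a classification term.

    V: mutation counts (channels x samples); W: signatures (channels x k);
    H: exposures (k x samples); A, b: linear classifier on the exposures.
    """
    return nb_divergence(V, W @ H, alpha) + lam * hinge_loss(H.T, Y, A, b)
```

The divergence is zero when the fitted means equal the counts and tends to the generalized KL divergence as `alpha` grows. The hinge term is zero only when every sample is classified with a margin of at least one.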


A Two-Phase Algorithm for Robust Symmetric Non-Negative Matrix Factorization

Symmetry ◽

10.3390/sym13091757 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1757

Author(s):

Bingjie Li ◽

Xi Shi ◽

Zhenyue Zhang

Keyword(s):

Gradient Method ◽

Matrix Factorization ◽

Initial Point ◽

Linear Structure ◽

Second Phase ◽

Projected Gradient Method ◽

Two Phase ◽

Negative Part ◽

Projected Gradient ◽

Non Negative Matrix Factorization

As a special class of non-negative matrix factorization, symmetric non-negative matrix factorization (SymNMF) has been widely used in the machine learning field to mine the hidden non-linear structure of data. Due to the non-negative constraint and non-convexity of SymNMF, the efficiency of existing methods is generally unsatisfactory. To tackle this issue, we propose a two-phase algorithm to solve the SymNMF problem efficiently. In the first phase, we drop the non-negative constraint of SymNMF and propose a new model with penalty terms, in order to control the negative component of the factor. Unlike previous methods, the factor sequence in this phase is not required to be non-negative, allowing fast unconstrained optimization algorithms, such as the conjugate gradient method, to be used. In the second phase, we revisit the SymNMF problem, taking the non-negative part of the solution in the first phase as the initial point. To achieve faster convergence, we propose an interpolation projected gradient (IPG) method for SymNMF, which is much more efficient than the classical projected gradient method. Our two-phase algorithm is easy to implement, with convergence guaranteed for both phases. Numerical experiments show that our algorithm performs better than others on synthetic data and unsupervised clustering tasks.
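
The two-phase idea can be sketched in NumPy under simplifying assumptions. Phase 1 here uses plain gradient descent with backtracking as a stand-in for the faster unconstrained solvers the abstract mentions (such as conjugate gradient). Phase 2 uses the classical projected gradient method rather than the authors' interpolation projected gradient (IPG) method, whose details are not given here. The penalty form, the weight `mu`, and all function names are assumptions.

```python
import numpy as np


def symnmf_loss(A, H):
    """SymNMF objective 0.25 * ||A - H H^T||_F^2 for a symmetric matrix A."""
    return 0.25 * np.sum((A - H @ H.T) ** 2)


def penalized_loss(A, H, mu):
    """Phase 1 objective: no sign constraint, but negative entries of H are penalized."""
    return symnmf_loss(A, H) + 0.5 * mu * np.sum(np.minimum(H, 0.0) ** 2)


def descend(loss_fn, grad_fn, H, n_iter, project=False):
    """Gradient descent with backtracking; optionally project onto H >= 0 after each step."""
    for _ in range(n_iter):
        G, base, step = grad_fn(H), loss_fn(H), 1.0
        for _try in range(40):
            cand = H - step * G
            if project:
                cand = np.maximum(cand, 0.0)
            if loss_fn(cand) <= base:
                H = cand
                break
            step *= 0.5
    return H


def two_phase_symnmf(A, k, mu=10.0, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((A.shape[0], k))  # phase 1 may start with negative entries

    def fit_grad(M):
        return (M @ M.T - A) @ M  # gradient of symnmf_loss for symmetric A

    def pen_grad(M):
        return fit_grad(M) + mu * np.minimum(M, 0.0)

    # Phase 1: unconstrained minimization of the penalized model.
    H = descend(lambda M: penalized_loss(A, M, mu), pen_grad, H, n_iter)
    # Phase 2: start from the non-negative part of the phase 1 solution.
    H_init = np.maximum(H, 0.0)
    H = descend(lambda M: symnmf_loss(A, M), fit_grad, H_init, n_iter, project=True)
    return H_init, H
```

Returning both `H_init` and the final `H` makes it easy to check how much the constrained second phase improves on the warm start produced by the first.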


Non-negative matrix factorization via projected gradient method for credit risk analysis

2013 6th International Conference on Information Management, Innovation Management and Industrial Engineering ◽

10.1109/iciii.2013.6703097 ◽

2013 ◽

Author(s):

Hua Chen ◽

Jinlin Ma ◽

Jiaying Liu ◽

Jingnan Wang

Keyword(s):

Risk Analysis ◽

Credit Risk ◽

Gradient Method ◽

Matrix Factorization ◽

Projected Gradient Method ◽

Projected Gradient ◽

Credit Risk Analysis ◽

Non Negative Matrix Factorization


Clustering Algorithm for Unsupervised Monaural Musical Sound Separation Based on Non-negative Matrix Factorization

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.e95.a.818 ◽

2012 ◽

Vol E95-A (4) ◽

pp. 818-823 ◽

Cited By ~ 2

Author(s):

Sang Ha PARK ◽

Seokjin LEE ◽

Koeng-Mo SUNG

Keyword(s):

Matrix Factorization ◽

Clustering Algorithm ◽

Musical Sound ◽

Sound Separation ◽

Non Negative Matrix Factorization


Faculty Opinions recommendation of Categorical dimensions of human odor descriptor space revealed by non-negative matrix factorization.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718116445.793486463 ◽

2013 ◽

Author(s):

Ramón Serrano

Keyword(s):

Matrix Factorization ◽

Descriptor Space ◽

Human Odor ◽

Non Negative Matrix Factorization


Adaptive background modeling via incremental non-negative matrix factorization

JOURNAL OF SHENZHEN UNIVERSITY SCIENCE AND ENGINEERING ◽

10.3724/sp.j.1249.2016.05511 ◽

2016 ◽

Vol 33 (5) ◽

pp. 511

Author(s):

Huaiqin Dong ◽

Binbin Pan ◽

Wensheng Chen ◽

Chen Xu

Keyword(s):

Matrix Factorization ◽

Background Modeling ◽

Non Negative Matrix Factorization


Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data

10.21437/interspeech.2016-571 ◽

2016 ◽

Cited By ~ 5

Author(s):

Colin Vaz ◽

Asterios Toutios ◽

Shrikanth S. Narayanan

Keyword(s):

Time Series ◽

Convex Hull ◽

Matrix Factorization ◽

Time Series Data ◽

Multivariate Time Series ◽

Temporal Patterns ◽

Series Data ◽

Non Negative Matrix Factorization


Non-Negative Matrix Factorization For Improving Passive Sonar Signal Detection

10.21528/cbic2011-17.5 ◽

2016 ◽

Author(s):

N. N. de Moura ◽

Igor Paladino ◽

J. M. de Seixas

Keyword(s):

Signal Detection ◽

Matrix Factorization ◽

Passive Sonar ◽

Non Negative Matrix Factorization
