matrix normal
Recently Published Documents


TOTAL DOCUMENTS: 40 (FIVE YEARS: 12)

H-INDEX: 8 (FIVE YEARS: 1)

Entropy, 2021, Vol. 23(10), pp. 1249
Author(s): Jinwon Heo, Jangsun Baek

With advances in technology, matrix data such as medical and industrial images have emerged in many practical fields. These data are typically high dimensional and difficult to cluster because of the intrinsic correlation structure among their rows and columns. Most approaches vectorize the matrix data and apply conventional clustering methods, and thus suffer from an extreme high-dimensionality problem as well as a loss of interpretability of the correlation structure among row/column variables. Recently, a regularized model was proposed for clustering matrix-valued data by imposing a sparsity structure on the mean signal of each cluster. We extend that approach by additionally regularizing the covariance to cope better with the curse of dimensionality for large images. A penalized matrix normal mixture model with lasso-type penalty terms on both the mean and covariance matrices is proposed, and an expectation-maximization (EM) algorithm is developed to estimate the parameters. The proposed method achieves both parsimonious modeling and a faithful representation of the conditional correlation structure. The estimators are consistent, and their limiting distributions are derived. We applied the proposed method to simulated data as well as real datasets and measured its clustering performance with clustering accuracy (ACC) and the adjusted Rand index (ARI). The experimental results show that the proposed method attained higher ACC and ARI than conventional methods.
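
To make the modeling idea concrete, here is a minimal sketch of one E-step of a matrix normal mixture together with a lasso-style soft-threshold on a cluster mean, using scipy.stats.matrix_normal for the component densities. This is an illustration under assumed names and a simplified penalty, not the authors' penalized estimator.

```python
# Illustrative E-step for a matrix normal mixture plus a lasso-style
# soft-threshold on a cluster mean (a simplified stand-in for the
# penalized M-step; not the paper's estimator).
import numpy as np
from scipy.stats import matrix_normal
from scipy.special import logsumexp

def e_step(X, pis, means, rowcovs, colcovs):
    """Posterior cluster probabilities for a stack of matrices X of shape (n, p, q)."""
    n, K = X.shape[0], len(pis)
    log_r = np.empty((n, K))
    for k in range(K):
        comp = matrix_normal(mean=means[k], rowcov=rowcovs[k], colcov=colcovs[k])
        log_r[:, k] = np.log(pis[k]) + comp.logpdf(X)
    log_r -= logsumexp(log_r, axis=1, keepdims=True)   # normalize over clusters
    return np.exp(log_r)

def soft_threshold(M, lam):
    """Lasso-type shrinkage applied entrywise to a mean matrix."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)

# Toy usage: 20 matrices of size 5x4, two clusters.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5, 4))
pis = np.array([0.5, 0.5])
means = [np.zeros((5, 4)), np.ones((5, 4))]
rowcovs, colcovs = [np.eye(5)] * 2, [np.eye(4)] * 2
resp = e_step(X, pis, means, rowcovs, colcovs)   # (20, 2) responsibilities
shrunk = soft_threshold(means[1], lam=0.1)       # sparsified mean estimate
```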


Author(s): Salvatore D. Tomarchio, Paul D. McNicholas, Antonio Punzo

Abstract: Finite mixtures of regressions with fixed covariates are a commonly used model-based clustering methodology for regression data. However, they assume assignment independence, i.e., that the allocation of data points to clusters is made independently of the distribution of the covariates. To take the latter into account, finite mixtures of regressions with random covariates, also known as cluster-weighted models (CWMs), have been proposed in the univariate and multivariate literature. In this paper, the CWM is extended to matrix data, e.g., data in which a set of variables is observed simultaneously at different time points or locations. Specifically, the cluster-specific marginal distribution of the covariates and the cluster-specific conditional distribution of the responses given the covariates are assumed to be matrix normal. Maximum likelihood parameter estimates are derived using an expectation-conditional maximization (ECM) algorithm. Parameter recovery, classification assessment, and the ability of the Bayesian information criterion to detect the underlying groups are investigated using simulated data. Finally, two real-data applications concerning educational indicators and the Italian non-life insurance market are presented.
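
As a rough illustration of the matrix-variate cluster-weighted decomposition described above, the sketch below evaluates a mixture of products of a cluster-specific conditional matrix normal for the response Y given the covariate matrix X and a cluster-specific matrix normal marginal for X. The linear regression term and all parameter names are assumptions made for the example, not the paper's notation or code.

```python
# Hedged sketch of a matrix-variate cluster-weighted model density:
#   p(X, Y) = sum_g pi_g * MN(Y | B0_g + B1_g X, U_g, V_g) * MN(X | Xi_g, Phi_g, Omega_g)
# The linear conditional mean is one simple choice, used here for illustration only.
import numpy as np
from scipy.stats import matrix_normal

def cwm_density(X, Y, clusters):
    """clusters: list of per-cluster parameter dicts; X is (q, r), Y is (p, r)."""
    total = 0.0
    for c in clusters:
        cond_mean = c["B0"] + c["B1"] @ X   # conditional mean of Y given X
        f_y_given_x = matrix_normal.pdf(Y, mean=cond_mean, rowcov=c["U"], colcov=c["V"])
        f_x = matrix_normal.pdf(X, mean=c["Xi"], rowcov=c["Phi"], colcov=c["Omega"])
        total += c["pi"] * f_y_given_x * f_x
    return total

# Toy usage with a single cluster.
rng = np.random.default_rng(1)
q, p, r = 3, 2, 4
X, Y = rng.standard_normal((q, r)), rng.standard_normal((p, r))
clusters = [{"pi": 1.0,
             "B0": np.zeros((p, r)), "B1": rng.standard_normal((p, q)),
             "U": np.eye(p), "V": np.eye(r),
             "Xi": np.zeros((q, r)), "Phi": np.eye(q), "Omega": np.eye(r)}]
print(cwm_density(X, Y, clusters))
```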


2021, pp. 1-10
Author(s): Wu Shoujiang

At present, the test data and training indicators collected from athletes during rehabilitation lack systematic screening and analysis, so a long-term longitudinal tracking and evaluation system cannot be established. To improve the practical effectiveness of sports rehabilitation, this paper introduces, in turn, the matrix normal mixture model and a fuzzy clustering algorithm based on K-L information-entropy regularization of that model. The expectation-maximization algorithm is used to estimate the model parameters. The paper also discusses the framework, key technologies, and core services of the development platform, and examines the technologies underlying its three-tier architecture. Based on the actual needs of sports rehabilitation training, the functions required for exercise detection and prescription formulation are designed, and the database structure of each subsystem is analyzed and designed. Finally, experiments are conducted to verify the performance of the constructed model. The results show that the model meets the expectations of its construction and can therefore be applied in practice.
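
As a sketch of the kind of membership update an entropy-regularized fuzzy clustering of matrix-valued data can use (one standard formulation, shown for illustration; the paper's exact algorithm may differ), the memberships are tempered posteriors of a matrix normal mixture:

```python
# Illustrative fuzzy membership update for matrix-valued data with matrix
# normal components and an entropy-style regularization parameter lam.
# Setting lam = 1 recovers the usual EM posterior probabilities. This is a
# generic formulation, not the paper's implementation.
import numpy as np
from scipy.stats import matrix_normal
from scipy.special import logsumexp

def fuzzy_memberships(X, pis, means, rowcovs, colcovs, lam=1.0):
    """X: (n, p, q) stack of matrices; returns an (n, K) membership matrix."""
    n, K = X.shape[0], len(pis)
    log_u = np.empty((n, K))
    for k in range(K):
        log_f = matrix_normal.logpdf(X, mean=means[k],
                                     rowcov=rowcovs[k], colcov=colcovs[k])
        log_u[:, k] = np.log(pis[k]) + log_f / lam     # tempered component evidence
    log_u -= logsumexp(log_u, axis=1, keepdims=True)   # rows sum to one
    return np.exp(log_u)
```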


2020
Author(s): Erica Ponzi, Magne Thoresen, Therese Haugdahl Nøst, Kajsa Møllersen

Abstract
Background: Cancer genomic studies often include data collected from several omics platforms. Each omics data source contributes to the understanding of the underlying biological process via source-specific ("individual") patterns of variability. At the same time, statistical associations and potential interactions among the different data sources can reveal signals from common biological processes that might not be identified by single-source analyses. These common patterns of variability are referred to as "shared" or "joint". To capture both contributions of variance, integrative dimension reduction techniques are needed. Integrated PCA is a model-based generalization of principal component analysis that separates shared and source-specific variance by iteratively estimating covariance structures from a matrix normal distribution. Angle-based JIVE is a matrix factorization method that decomposes joint and individual variation by permutation of row subspaces. We apply these techniques to identify joint and individual contributions of DNA methylation, miRNA, and mRNA expression collected from blood samples in a lung cancer case-control study nested within the Norwegian Women and Cancer (NOWAC) cohort study.
Results: We show that an integrative analysis that preserves both components of variation is more appropriate than analyses considering only individual or only joint components. Our results show that both joint and individual components contribute to better model predictions and facilitate the interpretation of the underlying biological processes.
Conclusions: Compared to a non-integrative analysis of the three omics sources, integrative models that simultaneously include joint and individual components yield better prediction of cancer status and of metastatic cancer at diagnosis.
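
As a rough numpy sketch of the joint-versus-individual separation idea, in the spirit of angle-based JIVE but with the data-driven rank selection (e.g., Wedin-bound thresholding) and any preprocessing omitted; the ranks below are arbitrary placeholders:

```python
# Simplified separation of shared ("joint") and source-specific ("individual")
# variation across data blocks that share the same samples. This illustrates
# the general construction only, not the authors' pipeline.
import numpy as np

def joint_individual(blocks, block_ranks, joint_rank):
    """blocks: list of column-centered (n_samples x p_k) arrays."""
    # Step 1: per-block sample-space bases from truncated SVDs.
    bases = []
    for X, r in zip(blocks, block_ranks):
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        bases.append(U[:, :r])
    # Step 2: shared directions = leading left singular vectors of the stacked bases.
    UJ, _, _ = np.linalg.svd(np.hstack(bases), full_matrices=False)
    J = UJ[:, :joint_rank]
    P_joint = J @ J.T                   # projector onto the joint subspace
    # Step 3: split each block into joint and individual parts.
    joints = [P_joint @ X for X in blocks]
    individuals = [X - Jx for X, Jx in zip(blocks, joints)]
    return joints, individuals

# Toy usage: three "omics" blocks driven by one shared pattern plus noise.
rng = np.random.default_rng(2)
shared = rng.standard_normal((50, 1))
blocks = [shared @ rng.standard_normal((1, p)) + 0.1 * rng.standard_normal((50, p))
          for p in (100, 80, 60)]
joints, individuals = joint_individual(blocks, block_ranks=[2, 2, 2], joint_rank=1)
```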


Author(s): Shibo Li, Wei Xing, Robert M. Kirby, Shandian Zhe

Gaussian process regression networks (GPRNs) are powerful Bayesian models for multi-output regression, but their inference is intractable. To address this issue, existing methods use a fully factorized structure (or a mixture of such structures) over all the outputs and latent functions for posterior approximation, which can miss the strong posterior dependencies among the latent variables and hurt inference quality. In addition, the updates of the variational parameters are inefficient and can be prohibitively expensive for a large number of outputs. To overcome these limitations, we propose a scalable variational inference algorithm for GPRNs that not only captures the rich posterior dependencies but is also much more efficient for massive numbers of outputs. We tensorize the output space and introduce tensor/matrix-normal variational posteriors to capture the posterior correlations and reduce the number of parameters. We jointly optimize all the parameters and exploit the inherent Kronecker product structure in the variational evidence lower bound to accelerate the computation. We demonstrate the advantages of our method in several real-world applications.
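
To illustrate why a matrix-normal variational posterior scales well, the sketch below parameterizes q(W) = MN(M, U, V) through row and column factors, so both sampling and the Gaussian KL term against a standard normal prior only touch the small row and column matrices (via Kronecker identities) rather than a full covariance over all vectorized weights. This is a generic parameterization sketch, not the paper's algorithm; all names are illustrative.

```python
# Matrix-normal variational posterior q(W) = MN(M, U, V) with U = A A^T and
# V = B B^T. Sampling uses W = M + A Z B^T; the KL to a standard normal prior
# over vec(W) uses tr(V kron U) = tr(U) tr(V) and
# log det(V kron U) = d2 * log det U + d1 * log det V.
import numpy as np

def sample_matrix_normal(M, A, B, rng):
    Z = rng.standard_normal(M.shape)
    return M + A @ Z @ B.T

def kl_matrix_normal_to_std_normal(M, A, B):
    d1, d2 = M.shape
    U, V = A @ A.T, B @ B.T
    trace_term = np.trace(U) * np.trace(V)
    logdet_term = d2 * np.linalg.slogdet(U)[1] + d1 * np.linalg.slogdet(V)[1]
    return 0.5 * (trace_term + np.sum(M ** 2) - d1 * d2 - logdet_term)

# Toy usage for a 4x3 latent weight matrix.
rng = np.random.default_rng(3)
d1, d2 = 4, 3
M = np.zeros((d1, d2))
A = 0.5 * np.eye(d1)      # row-covariance factor, U = 0.25 * I
B = np.eye(d2)            # column-covariance factor, V = I
W = sample_matrix_normal(M, A, B, rng)
print(W.shape, kl_matrix_normal_to_std_normal(M, A, B))
```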

