Cross-Domain Object Representation via Robust Low-Rank Correlation Analysis

Author(s):  
Xiangjun Shen ◽  
Jinghui Zhou ◽  
Zhongchen Ma ◽  
Bingkun Bao ◽  
Zhengjun Zha

Cross-domain data has become very popular recently since various viewpoints and different sensors tend to facilitate better data representation. In this article, we propose a novel cross-domain object representation algorithm (RLRCA) which not only explores the complexity of multiple relationships of variables by canonical correlation analysis (CCA) but also uses a low-rank model to decrease the effect of noisy data. To the best of our knowledge, this is the first attempt to smoothly integrate CCA and a low-rank model to uncover correlated components across different domains and to suppress the effect of noisy or corrupted data. In order to improve the flexibility of the algorithm to address various cross-domain object representation problems, two instantiation methods of RLRCA are proposed from feature and sample space, respectively. In this way, a better cross-domain object representation can be achieved through effectively learning the intrinsic CCA features and taking full advantage of cross-domain object alignment information while pursuing low-rank representations. Extensive experimental results on the CMU PIE, Office-Caltech, Pascal VOC 2007, and NUS-WIDE-Object datasets demonstrate that our designed models have superior performance over several state-of-the-art cross-domain low-rank methods in image clustering and classification tasks with various corruption levels.
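
For readers unfamiliar with the CCA building block mentioned above, the following is a minimal sketch of classical linear CCA only. It is not the RLRCA algorithm and contains no low-rank term. The function name `cca`, the ridge parameter `reg`, and the convention that rows are paired samples are assumptions made for illustration.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Classical linear CCA for two views whose rows are paired samples.

    Returns projection matrices Wx, Wy and the canonical correlations.
    `reg` is a small ridge term, an assumption added for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Whiten each view using the Cholesky factor of its covariance.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # T = Lx^{-1} Sxy Ly^{-T} is the cross-covariance of the whitened views.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the whitened directions back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U)
    Wy = np.linalg.solve(Ly.T, Vt.T)
    return Wx, Wy, s
```

Projecting with `X @ Wx` and `Y @ Wy` gives paired components whose correlations are the returned singular values, in decreasing order.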

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Wenyun Gao ◽  
Sheng Dai ◽  
Stanley Ebhohimhen Abhadiomhen ◽  
Wei He ◽  
Xinghui Yin

Correlation learning is a technique utilized to find a common representation in cross-domain and multiview datasets. However, most existing methods are not robust enough to handle noisy data. As such, the common representation matrix learned could be influenced easily by noisy samples inherent in different instances of the data. In this paper, we propose a novel correlation learning method based on a low-rank representation, which learns a common representation between two instances of data in a latent subspace. Specifically, we begin by learning a low-rank representation matrix and an orthogonal rotation matrix to handle the noisy samples in one instance of the data so that a second instance of the data can linearly reconstruct the low-rank representation. Our method then finds a similarity matrix that approximates the common low-rank representation matrix more closely, such that a rank constraint on the Laplacian matrix reveals the clustering structure explicitly without any spectral postprocessing. Extensive experimental results on the ORL, Yale, Coil-20, Caltech 101-20, and UCI digits datasets demonstrate that our method outperforms other state-of-the-art methods on six evaluation metrics.
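
The remark about a rank constraint on the Laplacian rests on a standard graph fact: for a nonnegative symmetric similarity matrix, the number of zero eigenvalues of the Laplacian equals the number of connected components. The sketch below illustrates only that fact, not the method proposed in the abstract. The helper names `laplacian` and `count_components` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def laplacian(S):
    """Unnormalized graph Laplacian L = D - S for a similarity matrix S."""
    S = (S + S.T) / 2.0  # symmetrize defensively
    return np.diag(S.sum(axis=1)) - S

def count_components(S, tol=1e-8):
    """Count the (near-)zero eigenvalues of L, i.e. the connected components."""
    eigvals = np.linalg.eigvalsh(laplacian(S))
    return int(np.sum(eigvals < tol))
```

If a similarity matrix over n samples is forced to have a Laplacian of rank n - c, the graph splits into exactly c components, so cluster labels can be read off directly.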


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Wenyun Gao ◽  
Xiaoyun Li ◽  
Sheng Dai ◽  
Xinghui Yin ◽  
Stanley Ebhohimhen Abhadiomhen

The low-rank representation (LRR) method has recently gained enormous popularity due to its robust approach in solving the subspace segmentation problem, particularly in cases involving corrupted data. In this paper, the recursive sample scaling low-rank representation (RSS-LRR) method is proposed. The advantage of RSS-LRR over traditional LRR is that a cosine scaling factor is further introduced, which imposes a penalty on each sample to better minimize the influence of noise and outliers. Specifically, the cosine scaling factor is a similarity measure learned to extract each sample’s relationship with the low-rank representation’s principal components in the feature space. In other words, the smaller the angle between an individual data sample and the low-rank representation’s principal components, the more likely it is that the data sample is clean. Thus, the proposed method can effectively obtain a good low-rank representation influenced mainly by clean data. Several experiments are performed with varying levels of corruption on ORL, CMU PIE, COIL20, COIL100, and LFW in order to evaluate RSS-LRR’s effectiveness over state-of-the-art low-rank methods. The experimental results show that RSS-LRR consistently performs better than the compared methods in image clustering and classification tasks.
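
The geometric intuition behind a cosine-based sample weight can be sketched as follows. This is an illustration under stated assumptions, not the learned scaling factor of RSS-LRR: it measures the angle between each sample and the span of the top-k left singular vectors of the uncentered data matrix. The function name `cosine_to_subspace` and the columns-as-samples layout are hypothetical choices.

```python
import numpy as np

def cosine_to_subspace(X, k):
    """Cosine of the angle between each column of X and a principal subspace.

    X has shape (features, samples). The subspace is spanned by the top-k
    left singular vectors of X (uncentered, an assumption of this sketch).
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Uk = U[:, :k]
    # Norm of the projection divided by the sample norm gives the cosine.
    proj_norm = np.linalg.norm(Uk.T @ X, axis=0)
    norms = np.linalg.norm(X, axis=0)
    return proj_norm / np.maximum(norms, 1e-12)
```

Samples with a cosine near 1 lie close to the dominant subspace, while values near 0 flag samples that are nearly orthogonal to it and could be down-weighted.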


2020 ◽  
Vol 15 ◽  
Author(s):  
Chen-An Tsai ◽  
James J. Chen

Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on identification of differentially expressed gene sets in a given phenotype. Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways. Methods: We apply co-inertia analysis to cross-gene-set comparisons in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods. Results and Conclusion: We also combine between-gene-set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
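
The CIA objective described above (maximizing squared covariance between projections) can be sketched with a singular value decomposition of the cross-covariance matrix. This is a simplified version that assumes uniform sample weights and no column weighting, so it is not the full GSCoA procedure. The function name `co_inertia_axes` and the rows-as-samples layout are assumptions.

```python
import numpy as np

def co_inertia_axes(X, Y):
    """Simplified co-inertia analysis of two tables sharing the same rows.

    Returns axes for X, axes for Y, the covariances along successive axis
    pairs, and the total co-inertia (sum of squared covariances).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    C = X.T @ Y / n  # cross-covariance with uniform weights 1/n (assumption)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    total_co_inertia = float(np.sum(s ** 2))
    return U, Vt.T, s, total_co_inertia
```

The covariance between the projections `X @ U[:, k]` and `Y @ V[:, k]` equals `s[k]`, so successive axis pairs capture decreasing shares of the total co-inertia.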


2017 ◽  
Author(s):  
Minchai Kim

Our research aims to elucidate the factors that influence the terminological implantation of a term by proposing a new typology of those factors with a method revealing how their mechanism causes terminological variation in French-language ICT. We accomplish this through an analysis of four Francophone communities: France, Quebec, Belgium, and Switzerland. After establishing a new typology, which encompasses the terminological, socio-terminological, psycho-terminological, and extra-terminological factors, we propose a hypothetical model of their mechanism by introducing three statistical concepts (dependent, independent, and moderator variables) to elucidate these factors’ relationships. We verify our model in two steps. First, for the analysis of terminological and socio-terminological factors, we examine the relations between each factor and terminological implantation of 256 French ICT terms. For this, we begin by coding the terms according to a criterion established for each factor. We then carry out a correlation analysis with Spearman’s rank correlation. Second, we analyse the psycho-terminological and extra-terminological factors with statistical tests on the answers to our questionnaire, which show significant differences between these four linguistic communities. Our analysis confirms a significant difference between the three European countries and Quebec in the mechanism of the terminological implantation factors and we conclude that the psycho-terminological and extra-terminological factors play a decisive role in this difference, which we identify as diatopic.


2021 ◽  
Author(s):  
Yanpeng Ma ◽  
Wenyao Wang ◽  
Longlong Liu ◽  
Yang Liu ◽  
Wei Bi

Background: This study aims to investigate the correlation of VEGF-B and FLT-1 co-expression with the prognosis of gastric cancer (GC). Materials & methods: Primary GC samples and adjacent tissues were obtained from 96 patients. Results: Both VEGF-B and FLT-1 were found to be upregulated in human GC tissue compared with adjacent tissues. Spearman’s rank correlation analysis indicated that VEGF-B and FLT-1 expression were correlated (r = 0.321, p = 0.0015). Patients with high VEGF-B and FLT-1 co-expression showed a poorer prognosis than patients with low co-expression (p = 0.0169). Conclusion: High co-expression of VEGF-B and FLT-1 in GC is associated with poor overall survival, and targeted therapy against the interaction between VEGF-B and FLT-1 is worth further detailed analysis.
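
Spearman's rank correlation, as used above, is the Pearson correlation computed on ranks, with tied values given their average rank. The following is a minimal pure-Python sketch. The function names are illustrative, the example inputs in the usage note are made up and are not data from the study, and no p-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])` evaluates to 8 / sqrt(95), roughly 0.82, because the tie in the second sequence receives the average rank 3.5.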


2016 ◽  
Vol 44 (1) ◽  
pp. 147-152
Author(s):  
Kaniz Fatema ◽  
Wan Maznah Wan Omar ◽  
Mansor Mat Isa ◽  
Md Omar Ahmad

The spatial and temporal distributions of zooplankton biomass in the Merbok estuary were studied. Zooplankton samples were collected monthly from January to December 2011 at six sampling stations along the river stretch by using a 0.13 m diameter plankton net (150 µm mesh size) in horizontal towing. Average zooplankton biomass ranged from 0.1143 to 1.8217 g dry wt m⁻³. The maximum and minimum zooplankton biomass were recorded in February and October 2011, respectively. The highest zooplankton biomass was found at Station 6 (downstream) and the lowest at Station 1 (upstream). Zooplankton biomass varied from upstream to downstream. The Kruskal-Wallis H test showed that the distribution of zooplankton biomass among the sampling months was significantly different (p < 0.05). Spearman’s rank correlation analysis revealed significant correlation among zooplankton biomass, chl a concentration and nutrients (p < 0.01).
Bangladesh J. Zool. 44(1): 147-152, 2016
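
The Kruskal-Wallis H statistic used for the month-to-month comparison can be computed from pooled ranks. The sketch below is a plain-Python illustration with the usual tie correction. The function name `kruskal_wallis_h` is hypothetical, the inputs in the usage note are toy numbers rather than the study's measurements, and the p-value (from a chi-square distribution with one fewer degrees of freedom than the number of groups) is left out.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    `groups` is a list of lists of observations, one list per group.
    Undefined (division by zero) if every pooled value is identical.
    """
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    order = sorted(range(n_total), key=lambda i: pooled[i])
    ranks = [0.0] * n_total
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        t = j - i + 1  # size of this group of tied values
        tie_term += t ** 3 - t
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1

    # Sum of squared rank sums, each divided by its group size.
    acc = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)

    h = 12.0 / (n_total * (n_total + 1)) * acc - 3.0 * (n_total + 1)
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    return h / correction
```

With three fully separated groups such as `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` the rank sums are 6, 15 and 24, which gives H = 7.2.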


Author(s):  
Xiao Wang ◽  
Ziwei Zhang ◽  
Jing Wang ◽  
Peng Cui ◽  
Shiqiang Yang

Trust prediction, which aims to predict the trust relations between users in a social network, is key to helping users discover reliable information. Many trust prediction methods are proposed based on the low-rank assumption of a trust network. However, one typical property of the trust network is that the trust relations follow the power-law distribution, i.e., a few users are trusted by many other users, while most tail users have few trustors. Due to these tail users, the fundamental low-rank assumption made by existing methods is seriously violated and becomes unrealistic. In this paper, we propose a simple yet effective method to address the problem of the violated low-rank assumption. Instead of discovering the low-rank component of the trust network alone, we simultaneously learn a sparse component of the trust network to describe the tail users. With both the learned low-rank and sparse components, the trust relations in the whole network can be better captured. Moreover, the transitive closure structure of the trust relations is also integrated into our model. We then derive an effective iterative algorithm to infer the parameters of our model, along with the proof of correctness. Extensive experimental results on real-world trust networks demonstrate the superior performance of our proposed method over state-of-the-art methods.
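
The idea of modelling a matrix as a low-rank part plus a sparse part can be illustrated with a generic alternating heuristic: a truncated SVD for the low-rank part and elementwise soft-thresholding for the sparse part. This is not the algorithm derived in the paper and it ignores the transitive closure structure. The function names and the parameters `r`, `lam`, and `n_iter` are assumptions.

```python
import numpy as np

def soft_threshold(A, lam):
    """Elementwise shrinkage toward zero by lam."""
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

def truncated_svd(A, r):
    """Best rank-r approximation of A in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def low_rank_plus_sparse(M, r, lam, n_iter=50):
    """Alternately fit M ~ L + S with rank(L) <= r and sparse S.

    Each step exactly minimizes 0.5 * ||M - L - S||_F^2 + lam * ||S||_1
    over one block, so the objective recorded in `history` never increases.
    """
    S = np.zeros_like(M, dtype=float)
    history = []
    for _ in range(n_iter):
        L = truncated_svd(M - S, r)
        S = soft_threshold(M - L, lam)
        obj = 0.5 * np.sum((M - L - S) ** 2) + lam * np.sum(np.abs(S))
        history.append(float(obj))
    return L, S, history
```

In a trust-network setting, `L` would play the role of the globally shared structure and `S` the entries that the low-rank part cannot explain. Choosing `r` and `lam` is problem dependent and is not addressed here.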


2019 ◽  
Vol 17 (04) ◽  
pp. 1950028
Author(s):  
Md. Ashad Alam ◽  
Osamu Komori ◽  
Hong-Wen Deng ◽  
Vince D. Calhoun ◽  
Yu-Ping Wang

The kernel canonical correlation analysis based U-statistic (KCCU) is used to detect nonlinear gene–gene co-associations. Estimating the variance of the KCCU, however, is computationally intensive. In addition, kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables the use of a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than KCCU. We investigate the proposed method using both synthesized and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate Schizophrenia genes for gene–gene co-associations. We select 768 genes with strong evidence for shedding light on gene–gene interaction networks for Schizophrenia. By performing gene ontology enrichment analysis, pathway analysis, gene–gene network and other studies, the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and demonstrate superior performance over several current approaches.

