Anchors Bring Ease: An Embarrassingly Simple Approach to Partial Multi-View Clustering

Author(s):  
Jun Guo ◽  
Jiahui Ye

Clustering on multi-view data has attracted much attention in the past decades. Most previous studies assume that each instance appears in all views, or that there is at least one view containing all instances. However, real-world data often suffer from missing instances in each view, leading to the research problem of partial multi-view clustering. To address this issue, this paper proposes a simple yet effective Anchor-based Partial Multi-view Clustering (APMC) method, which utilizes anchors to reconstruct instance-to-instance relationships for clustering. APMC is conceptually simple and easy to implement in practice; moreover, it has clear intuitions and non-trivial empirical guarantees. Specifically, APMC first integrates intra- and inter-view similarities through anchors. Then, spectral clustering is performed on the fused similarities to obtain a unified clustering result. Compared with existing partial multi-view clustering methods, APMC has three notable advantages: 1) it can capture more non-linear relations among instances with the help of kernel-based similarities; 2) it has a much lower time complexity by virtue of a non-iterative scheme; and 3) it can inherently handle data with negative entries and can be extended to more than two views. Finally, we extensively evaluate the proposed method on five benchmark datasets. Experimental results demonstrate the superiority of APMC over state-of-the-art approaches.
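
A minimal sketch of the anchor-based fusion idea described in this abstract: per-view, kernel-based similarities to a small set of anchors are used to reconstruct instance-to-instance similarities, which are averaged across views and fed to spectral clustering. The anchor selection (k-means centers), Gaussian kernel, and simple averaging are illustrative assumptions rather than the authors' exact APMC formulation.

```python
# Sketch only: anchors, kernel, and fusion rule are assumptions, not the exact APMC method.
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

def anchor_similarity(X, anchors, gamma=1.0):
    """Gaussian-kernel similarity between instances and a small set of anchors."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)                       # shape: (n_view_instances, n_anchors)

def apmc_like(views, indices, n_anchors=50, n_clusters=10, gamma=1.0, seed=0):
    """views[v]: features of the instances observed in view v; indices[v]: their global ids."""
    n = max(idx.max() for idx in indices) + 1
    fused, counts = np.zeros((n, n)), np.zeros((n, n))
    for X, idx in zip(views, indices):
        anchors = KMeans(n_anchors, random_state=seed).fit(X).cluster_centers_
        Z = anchor_similarity(X, anchors, gamma)
        S = Z @ Z.T                                  # instance-to-instance similarity via anchors
        fused[np.ix_(idx, idx)] += S
        counts[np.ix_(idx, idx)] += 1
    fused /= np.maximum(counts, 1)                   # average over views observing both instances
    sc = SpectralClustering(n_clusters, affinity='precomputed', random_state=seed)
    return sc.fit_predict(fused)
```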

2020 ◽  
Vol 34 (04) ◽  
pp. 5867-5874
Author(s):  
Gan Sun ◽  
Yang Cong ◽  
Qianqian Wang ◽  
Jun Li ◽  
Yun Fu

In the past decades, spectral clustering (SC) has become one of the most effective clustering algorithms. However, most previous studies focus on spectral clustering with a fixed task set and cannot incorporate a new spectral clustering task without access to previously learned tasks. In this paper, we aim to explore the problem of spectral clustering in a lifelong machine learning framework, i.e., Lifelong Spectral Clustering (L2SC). Its goal is to efficiently learn a model for a new spectral clustering task by selectively transferring previously accumulated experience from a knowledge library. Specifically, the knowledge library of L2SC contains two components: 1) an orthogonal basis library, capturing latent cluster centers among the clusters in each pair of tasks; and 2) a feature embedding library, embedding the feature manifold information shared among multiple related tasks. As a new spectral clustering task arrives, L2SC first transfers knowledge from both the basis library and the feature library to obtain an encoding matrix, and further redefines the library bases over time to maximize performance across all clustering tasks. Meanwhile, a general online update formulation is derived to alternately update the basis library and the feature library. Finally, empirical experiments on several real-world benchmark datasets demonstrate that our L2SC model can effectively improve clustering performance compared with other state-of-the-art spectral clustering algorithms.
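
A deliberately simplified sketch of the lifelong setup described above, assuming all tasks share the same feature dimension: a small library of latent cluster centers is reused to warm-start each new clustering task and is then refreshed. The encoding and update rules here are placeholders, not the paper's L2SC optimization.

```python
# Simplified lifelong-clustering sketch: the library, transfer, and update rules are assumptions.
import numpy as np
from sklearn.cluster import KMeans

class LifelongClusteringLibrary:
    def __init__(self, n_clusters):
        self.k = n_clusters
        self.centers = None                                   # library of latent cluster centers (k x d)

    def fit_new_task(self, X):
        if self.centers is None:
            km = KMeans(self.k, n_init=10).fit(X)             # first task: learn from scratch
        else:
            km = KMeans(self.k, init=self.centers, n_init=1).fit(X)  # transfer: warm-start from library
        # redefine the library over time as a running average of task-specific centers
        self.centers = (km.cluster_centers_ if self.centers is None
                        else 0.5 * (self.centers + km.cluster_centers_))
        return km.labels_
```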


2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis in this paper. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world datasets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
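
A minimal sketch of the pipeline described above: documents are represented as linearly weighted averages of pre-trained word vectors and fed to a small neural classifier. The specific weighting scheme and network architecture are illustrative assumptions, not the exact EWE model.

```python
# Sketch only: the weighting scheme and classifier are assumptions, not the exact EWE model.
import numpy as np
from sklearn.neural_network import MLPClassifier

def doc_vector(tokens, word_vecs, weights=None, dim=300):
    """Linearly weighted average of pre-trained word vectors for one document."""
    vecs, ws = [], []
    for t in tokens:
        if t in word_vecs:
            vecs.append(word_vecs[t])
            ws.append(1.0 if weights is None else weights.get(t, 1.0))
    if not vecs:
        return np.zeros(dim)
    return np.average(np.array(vecs), axis=0, weights=np.array(ws))

def train_sentiment(docs, labels, word_vecs, dim=300):
    """docs: list of token lists; labels: sentiment labels; word_vecs: dict of pre-trained vectors."""
    X = np.vstack([doc_vector(d, word_vecs, dim=dim) for d in docs])
    return MLPClassifier(hidden_layer_sizes=(128,), max_iter=300).fit(X, labels)
```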


2021 ◽  
Author(s):  
Faizan Ur Rahman ◽  
Soosan Beheshti

Transforming data to feature space using a kernel function can result in better expression of its features, resulting in better separability for some datasets. The parameters of the kernel function govern the structure of data in feature space and need to be optimized while simultaneously estimating the number of clusters in a dataset. The proposed method, denoted kernel k-Minimum Average Central Error (kernel k-MACE), estimates the number of clusters in a dataset while simultaneously clustering the dataset in feature space by finding the optimum value of the Gaussian kernel parameter σ_k. A cluster initialization technique is also proposed, based on an existing method for k-means clustering. Simulations show that for self-generated datasets with Gaussian clusters having 10%-50% overlap and for real benchmark datasets, the proposed method outperforms multiple state-of-the-art unsupervised clustering methods, including k-MACE, the clustering scheme that inspired kernel k-MACE.
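
A minimal sketch of the underlying idea: cluster in the feature space induced by a Gaussian kernel while sweeping the kernel width and the number of clusters. The selection criterion below (silhouette score on kernel-induced distances) is only an illustrative stand-in for the MACE error; the authors' kernel k-MACE criterion and initialization scheme are not reproduced here.

```python
# Sketch only: silhouette selection stands in for the MACE criterion; not the exact kernel k-MACE.
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import rbf_kernel

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Lloyd-style k-means in the feature space defined by kernel matrix K."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=K.shape[0])
    for _ in range(n_iter):
        D = np.empty((K.shape[0], k))
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                D[:, c] = np.inf
                continue
            # ||phi(x) - mu_c||^2 expressed with kernel entries only
            D[:, c] = np.diag(K) - 2 * K[:, idx].mean(axis=1) + K[np.ix_(idx, idx)].mean()
        new = D.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def estimate_k_and_sigma(X, k_range=range(2, 11), sigmas=(0.5, 1.0, 2.0)):
    best = None
    for sigma in sigmas:
        K = rbf_kernel(X, gamma=1.0 / (2 * sigma ** 2))
        diag = np.diag(K)
        D = np.sqrt(np.maximum(diag[:, None] + diag[None, :] - 2 * K, 0))  # feature-space distances
        for k in k_range:
            labels = kernel_kmeans(K, k)
            if len(np.unique(labels)) < 2:
                continue
            score = silhouette_score(D, labels, metric='precomputed')
            if best is None or score > best[0]:
                best = (score, k, sigma, labels)
    return best[1], best[2], best[3]
```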


2021 ◽  
Vol 15 ◽  
pp. 174830262110449
Author(s):  
Kai-Jun Hu ◽  
He-Feng Yin ◽  
Jun Sun

During the past decade, representation-based classification methods have received considerable attention in the pattern recognition community. The recently proposed non-negative representation-based classifier achieved superb recognition results in diverse pattern classification tasks. Unfortunately, the discriminative information of the training data is not fully exploited in the non-negative representation-based classifier, which undermines its classification performance in practical applications. To address this problem, we introduce a decorrelation regularizer into the formulation of the non-negative representation-based classifier and propose a discriminative non-negative representation-based classifier for pattern classification. The decorrelation regularizer reduces the correlation among the representation results of different classes, thus promoting competition among them. Experimental results on benchmark datasets validate the efficacy of the proposed discriminative non-negative representation-based classifier, which can outperform some state-of-the-art deep learning based methods. The source code of our proposed discriminative non-negative representation-based classifier is available at https://github.com/yinhefeng/DNRC.
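
A minimal sketch of the base non-negative representation classification rule the paper builds on: each test sample is coded non-negatively over the training samples and assigned to the class with the smallest class-wise reconstruction residual. The decorrelation regularizer proposed in the paper is not reproduced in this sketch.

```python
# Sketch only: base non-negative representation classifier; the decorrelation regularizer is omitted.
import numpy as np
from scipy.optimize import nnls

def nrc_predict(X_train, y_train, x_test):
    """X_train: (d, n) dictionary with training samples as columns; y_train: (n,) labels."""
    c, _ = nnls(X_train, x_test)                 # non-negative coding of the test sample
    residuals = {}
    for cls in np.unique(y_train):
        mask = (y_train == cls)
        recon = X_train[:, mask] @ c[mask]       # reconstruction using class-specific codes only
        residuals[cls] = np.linalg.norm(x_test - recon)
    return min(residuals, key=residuals.get)     # smallest class-wise residual wins
```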


Author(s):  
Lei Zhou ◽  
Xiao Bai ◽  
Dong Wang ◽  
Xianglong Liu ◽  
Jun Zhou ◽  
...  

Subspace clustering is a useful technique for many computer vision applications in which the intrinsic dimension of high-dimensional data is smaller than the ambient dimension. Traditional subspace clustering methods often rely on the self-expressiveness property, which has proven effective for linear subspace clustering. However, they perform unsatisfactorily on real data with complex nonlinear subspaces. More recently, deep autoencoder based subspace clustering methods have achieved success owing to the more powerful representations extracted by the autoencoder network. Unfortunately, these methods consider only the reconstruction of the original input data and can hardly guarantee that the latent representation is suitable for data distributed in subspaces, which inevitably limits their performance in practice. In this paper, we propose a novel deep subspace clustering method based on a latent distribution-preserving autoencoder, which introduces a distribution consistency loss to guide the learning of distribution-preserving latent representations and consequently provides a strong capacity for characterizing real-world data for subspace clustering. Experimental results on several public databases show that our method achieves significant improvement compared with state-of-the-art subspace clustering methods.
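
A compact PyTorch sketch of a self-expressive autoencoder whose training loss combines reconstruction, latent self-expressiveness, a sparsity penalty, and a distribution-matching term. The MMD-to-Gaussian penalty used here is only a placeholder for the paper's distribution consistency loss, whose exact form is not reproduced.

```python
# Sketch only: the MMD-to-Gaussian term stands in for the paper's distribution consistency loss.
import torch
import torch.nn as nn

class SelfExpressiveAE(nn.Module):
    def __init__(self, d_in, d_latent, n_samples):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(), nn.Linear(256, d_latent))
        self.dec = nn.Sequential(nn.Linear(d_latent, 256), nn.ReLU(), nn.Linear(256, d_in))
        self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))  # self-expression coefficients

    def forward(self, X):                        # X holds the full dataset (n_samples rows)
        Z = self.enc(X)
        Z_hat = self.C @ Z                       # each latent code expressed by the others
        return Z, Z_hat, self.dec(Z_hat)

def rbf_mmd(a, b, sigma=1.0):
    """Simple RBF-kernel MMD between two sets of latent codes."""
    k = lambda x, y: torch.exp(-torch.cdist(x, y) ** 2 / (2 * sigma ** 2))
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()

def loss_fn(model, X, lam1=1.0, lam2=0.1, lam3=0.1):
    Z, Z_hat, X_rec = model(X)
    return (torch.norm(X - X_rec) ** 2                  # reconstruction
            + lam1 * torch.norm(Z - Z_hat) ** 2         # self-expressiveness in latent space
            + lam2 * model.C.abs().sum()                # sparse coefficient matrix
            + lam3 * rbf_mmd(Z, torch.randn_like(Z)))   # placeholder distribution-matching term
```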


2021 ◽  
Vol 13 (5) ◽  
pp. 955
Author(s):  
Shukun Zhang ◽  
James M. Murphy

We propose a method for the unsupervised clustering of hyperspectral images (HSI) based on spatially regularized spectral clustering with ultrametric path distances. The proposed method efficiently combines data density and spectral-spatial geometry to distinguish between material classes in the data, without the need for training labels. The proposed method is efficient, with quasilinear scaling in the number of data points, and enjoys robust theoretical performance guarantees. Extensive experiments on synthetic and real HSI data demonstrate its strong performance compared to benchmark and state-of-the-art methods. Indeed, the proposed method not only achieves excellent labeling accuracy, but also efficiently estimates the number of clusters. Thus, unlike almost all existing hyperspectral clustering methods, the proposed algorithm is essentially parameter-free.
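
A minimal sketch of the path-distance idea: single-linkage agglomeration is equivalent to clustering under the ultrametric (minimax) path distance, and appending scaled pixel coordinates to the spectra is a crude stand-in for the paper's spectral-spatial regularization. This O(n^2) sketch does not have the quasilinear scaling, density weighting, or cluster-number estimation of the proposed method.

```python
# Sketch only: single-linkage on spectral-spatial features as a stand-in for the proposed method.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_hsi(cube, n_clusters, spatial_weight=0.1):
    """cube: (H, W, B) hyperspectral image; returns an (H, W) label map."""
    H, W, B = cube.shape
    spectra = cube.reshape(-1, B).astype(float)
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    feats = np.hstack([spectra, spatial_weight * coords])     # spectral-spatial features
    labels = AgglomerativeClustering(n_clusters=n_clusters,
                                     linkage='single').fit_predict(feats)  # ultrametric path distance
    return labels.reshape(H, W)
```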


2021 ◽  
Vol 6 (1) ◽  
pp. 1-5
Author(s):  
Zobeir Raisi ◽  
Mohamed A. Naiel ◽  
Paul Fieguth ◽  
Steven Wardell ◽  
John Zelek

The reported accuracy of recent state-of-the-art text detection methods, mostly deep learning approaches, is on the order of 80% to 90% on standard benchmark datasets. These methods have relaxed some of the restrictions on structured text and environments (i.e., they operate "in the wild") that are usually required for classical OCR to function properly. Even with this relaxation, there are still circumstances where these state-of-the-art methods fail. Several remaining challenges in wild images, such as in-plane rotation, illumination reflection, partial occlusion, complex font styles, and perspective distortion, cause existing methods to perform poorly. In order to evaluate current approaches in a formal way, we standardize the datasets and metrics for comparison, the lack of which had made comparison between these methods difficult in the past. We use three benchmark datasets for our evaluations: ICDAR13, ICDAR15, and COCO-Text V2.0. The objective of the paper is to quantify the current shortcomings and to identify the challenges for future text detection research.
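
A minimal sketch of the kind of standardized comparison described above: detections are matched to ground-truth boxes by IoU and summarized as precision, recall, and F-score. The 0.5 IoU threshold and greedy matching are common conventions assumed here; the paper's exact evaluation protocol may differ.

```python
# Sketch only: IoU threshold and greedy matching are assumptions, not the paper's exact protocol.
def iou(a, b):
    """Boxes as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def evaluate(detections, ground_truth, thr=0.5):
    """Greedy one-to-one matching of detections to ground-truth boxes by IoU."""
    matched_gt, tp = set(), 0
    for det in detections:
        best = max(range(len(ground_truth)),
                   key=lambda i: iou(det, ground_truth[i]), default=None)
        if best is not None and best not in matched_gt and iou(det, ground_truth[best]) >= thr:
            matched_gt.add(best)
            tp += 1
    precision = tp / max(len(detections), 1)
    recall = tp / max(len(ground_truth), 1)
    f_score = 2 * precision * recall / max(precision + recall, 1e-9)
    return precision, recall, f_score
```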


2021 ◽  
Vol 7 ◽  
pp. e450
Author(s):  
Wenna Huang ◽  
Yong Peng ◽  
Yuan Ge ◽  
Wanzeng Kong

Kmeans clustering and spectral clustering are two popular methods for grouping similar data points together according to their similarities. However, the performance of Kmeans clustering can be quite unstable due to the random initialization of the cluster centroids. Generally, spectral clustering methods employ a two-step strategy of spectral embedding and discretization postprocessing to obtain the cluster assignment, which can easily lead to a large deviation from the true discrete solution during postprocessing. In this paper, based on the connection between Kmeans clustering and spectral clustering, we propose a new Kmeans formulation, termed KMSR, that jointly performs spectral embedding and spectral rotation, an effective postprocessing approach for discretization. Further, instead of directly using the dot-product data similarity measure, we generalize KMSR by incorporating more advanced data similarity measures and call this generalized model KMSR-G. An efficient optimization method is derived to solve the KMSR (KMSR-G) objective, and its complexity and convergence are analyzed. We conduct experiments on extensive benchmark datasets to validate the performance of our proposed models, and the experimental results demonstrate that our models perform better than the related methods in most cases.
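
A minimal sketch of the spectral embedding plus spectral rotation discretization that KMSR builds on, in the spirit of classical multiclass spectral clustering: a rotation and a discrete indicator matrix are fit alternately so that the rotated embedding stays close to a cluster indicator. The joint KMSR objective and the KMSR-G similarity generalization are not reproduced here.

```python
# Sketch only: classical spectral rotation discretization, not the joint KMSR objective.
import numpy as np
from sklearn.manifold import spectral_embedding

def spectral_rotation(F, n_iter=50):
    """Alternately fit a rotation R and a discrete indicator Y so that Y ~ F R."""
    n, k = F.shape
    R = np.eye(k)
    for _ in range(n_iter):
        FR = F @ R
        Y = np.zeros((n, k))
        Y[np.arange(n), FR.argmax(1)] = 1           # discretize by row-wise argmax
        U, _, Vt = np.linalg.svd(F.T @ Y)           # Procrustes update of the rotation
        R = U @ Vt
    return Y.argmax(1)

def kmsr_like(affinity, n_clusters):
    F = spectral_embedding(affinity, n_components=n_clusters, norm_laplacian=True)
    F /= np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)   # row-normalize embedding
    return spectral_rotation(F)
```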


Fractals ◽  
2017 ◽  
Vol 25 (02) ◽  
pp. 1750025 ◽  
Author(s):  
ZAHID MAHMOOD ◽  
NAZEER MUHAMMAD ◽  
NARGIS BIBI ◽  
TAUSEEF ALI

Automatic Face Recognition (FR) presents a challenging task in the field of pattern recognition, and despite extensive research over the past several decades, it still remains an open research problem. This is primarily due to the variability in facial images, such as non-uniform illumination, low resolution, occlusion, and/or variation in pose. Due to its non-intrusive nature, FR is an attractive biometric modality and has gained a lot of attention in the biometric research community. Driven by the enormous number of potential application domains, many algorithms have been proposed for FR. This paper presents an overview of state-of-the-art FR algorithms, focusing on their performance on publicly available databases. We highlight the conditions of the image databases with regard to the recognition rate of each approach. This is useful both as a quick research overview and as a guide for practitioners choosing an algorithm for their specific FR application. To provide a comprehensive survey, the paper divides the FR algorithms into three categories: (1) intensity-based, (2) video-based, and (3) 3D-based FR algorithms. In each category, the most commonly used algorithms and their performance on standard face databases are reported, and a brief critical discussion is carried out.

