Tensor Decomposition for Multilayer Networks Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013371 ◽

2019 ◽

Vol 33 ◽

pp. 3371-3378 ◽

Cited By ~ 2

Author(s):

Zitai Chen ◽

Chuan Chen ◽

Zibin Zheng ◽

Yi Zhu

Keyword(s):

Clustering Algorithms ◽

Cluster Structure ◽

Real Life ◽

Nonlinear Least Squares ◽

Tensor Decomposition ◽

Underlying Structure ◽

Network Clustering ◽

Multilayer Networks ◽

Novel Approach ◽

Real World Datasets

Clustering on multilayer networks has been shown to be a promising approach to improving clustering accuracy. Many multilayer network clustering algorithms assume that all networks derive from a single latent clustering structure, and jointly learn the compatible and complementary information from the different networks to uncover one shared underlying structure. However, this assumption conflicts with many emerging real-life applications because of noisy or irrelevant networks. To address this issue, we propose Centroid-based Multilayer Network Clustering (CMNC), a novel approach that divides irrelevant relationships into different network groups and simultaneously uncovers the cluster structure in each group. The multilayer network is represented within a unified tensor framework that simultaneously captures multiple types of relationships between a set of entities. By imposing the rank-(Lr,Lr,1) block term decomposition with nonnegativity constraints, we obtain a clear interpretation of the multiple clustering results based on graph cut theory. Numerically, we transform this tensor decomposition problem into an unconstrained optimization problem, which can be solved efficiently under the nonlinear least squares (NLS) framework. Extensive experimental results on synthetic and real-world datasets show the effectiveness of our method and its robustness against noise and irrelevant data.
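As a rough illustration of the tensor representation mentioned in the abstract, the sketch below stacks one adjacency matrix per layer into a third-order array indexed as layer, node, node. It is not the CMNC method and performs no decomposition; the function name `build_adjacency_tensor`, the node names, and the edges are all hypothetical.

```python
# Toy illustration only: stack per-layer adjacency matrices into a
# third-order tensor T[layer][i][j]. Names and edges are made up.

def build_adjacency_tensor(nodes, layers):
    """Return T with T[k][i][j] = 1 if layer k has an undirected edge (i, j)."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    tensor = []
    for edges in layers:
        slice_k = [[0] * n for _ in range(n)]
        for u, v in edges:
            i, j = index[u], index[v]
            slice_k[i][j] = 1
            slice_k[j][i] = 1  # undirected: keep each slice symmetric
        tensor.append(slice_k)
    return tensor


nodes = ["a", "b", "c", "d"]
layers = [
    [("a", "b"), ("c", "d")],  # hypothetical layer 0
    [("a", "b"), ("b", "c")],  # hypothetical layer 1
]
T = build_adjacency_tensor(nodes, layers)
```

Each `T[k]` is one relationship type over the same set of entities, which is the shape a tensor decomposition would then operate on.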


Towards Expert-Inspired Automatic Criterion to Cut a Dendrogram for Real-Industrial Applications

10.3233/faia210140 ◽

2021 ◽

Author(s):

Shikha Suman ◽

Ashutosh Karna ◽

Karina Gibert

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithms ◽

Computational Cost ◽

Real Life ◽

Ground Truth ◽

Industrial Applications ◽

Underlying Structure ◽

Cluster Validity ◽

Cluster Validity Index ◽

Number Of Clusters

Hierarchical clustering is one of the preferred choices for understanding the underlying structure of a dataset and defining typologies, with multiple applications in real life. Among the existing clustering algorithms, the hierarchical family is one of the most popular, as it reveals the inner structure of the dataset and yields the number of clusters as an output, unlike popular methods such as k-means. The granularity of the final clustering can be adjusted to the goals of the analysis. The number of clusters in a hierarchical method is determined by analyzing the resulting dendrogram. Experts have criteria for visually inspecting the dendrogram and determining the number of clusters. Finding automatic criteria that imitate experts in this task is still an open problem. However, dependence on an expert to cut the tree is a limitation in real applications in fields such as Industry 4.0 and additive manufacturing. This paper analyses several cluster validity indexes in the context of determining a suitable number of clusters in hierarchical clustering. A new Cluster Validity Index (CVI) is proposed that captures the implicit criteria used by experts when analyzing dendrograms. The proposal has been applied to a range of datasets and validated against expert ground truth, outperforming the state of the art while also significantly reducing the computational cost.
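To make the idea of an automatic dendrogram cut concrete, here is a minimal sketch of a common baseline: cut where consecutive merge heights jump the most. This is not the CVI proposed in the paper; it assumes single linkage on one-dimensional points, and the function names and data are hypothetical.

```python
def single_linkage_heights(points):
    """Naive single-linkage agglomeration on 1-D points.

    Returns the merge heights in the order the merges happen.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return heights


def clusters_by_largest_gap(heights):
    """Pick the number of clusters from the largest jump in merge height."""
    n = len(heights) + 1  # number of original points
    if len(heights) < 2:
        return 1  # too few merges to compare; fall back to one cluster
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # Cutting between merge i and merge i + 1 keeps the first i + 1 merges.
    return n - (i + 1)


points = [10, 12, 11, 50, 53, 90, 91]  # hypothetical data, three visible groups
heights = single_linkage_heights(points)
k = clusters_by_largest_gap(heights)
```

The gap heuristic is only one of many possible criteria, and it can disagree with expert judgment on dendrograms that have several comparable jumps.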


Adaptive Initialization Method for K-Means Algorithm

Frontiers in Artificial Intelligence ◽

10.3389/frai.2021.740817 ◽

2021 ◽

Vol 4 ◽

Author(s):

Jie Yang ◽

Yu-Kai Wang ◽

Xin Yao ◽

Chin-Teng Lin

Keyword(s):

Time Complexity ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Real Life ◽

Superior Performance ◽

Local Optima ◽

Initial Cluster ◽

Higher Dimensional ◽

Real World Datasets ◽

Random Method

The K-means algorithm is a widely used clustering algorithm that offers simplicity and efficiency. However, the traditional K-means algorithm chooses the initial cluster centers at random, which makes the clustering results prone to local optima and degrades clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) that adapts to the characteristics of different datasets and obtains better clustering performance with stable results. For larger or higher-dimensional datasets, we additionally use random sampling in AIMK (named AIMK-RS) to reduce the time complexity. Twenty-two real-world datasets were used for performance comparisons. The experimental results show that AIMK and AIMK-RS outperform current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS reduces the time complexity to O(n). Moreover, we use AIMK to initialize K-medoids and spectral clustering, which also yields better performance. These results demonstrate the superior performance and good scalability of AIMK and AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve practical real-life problems.
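For context on what a non-random initialization looks like, the sketch below implements k-means++ seeding, a well-known alternative to uniform random initialization. It is not AIMK, whose procedure the abstract does not describe; the function name `kmeans_pp_seeds` and the sample points are hypothetical.

```python
import random


def kmeans_pp_seeds(points, k, seed=0):
    """k-means++ seeding: spread initial centers out across the data.

    points: list of equal-length numeric tuples, with at least k distinct points.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [
            min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
            for p in points
        ]
        # Next center: sampled with probability proportional to d2, so
        # points far from every existing center are favoured.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]  # hypothetical data
seeds = kmeans_pp_seeds(points, 3)
```

Already-chosen points have weight zero, so the seeds are distinct as long as the data contain at least `k` distinct points.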


A Novel Complex Networks Clustering Algorithm Based on the Core Influence of Nodes

The Scientific World JOURNAL ◽

10.1155/2014/801854 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Chao Tong ◽

Jianwei Niu ◽

Bin Dai ◽

Zhongyu Xie

Keyword(s):

Complex Networks ◽

Clustering Algorithm ◽

Cluster Formation ◽

Clustering Algorithms ◽

Cluster Structure ◽

Network Clustering ◽

Clustering Methods ◽

Positive Role ◽

The Core ◽

Final Cluster

In complex networks, cluster structure, identified by the heterogeneity of nodes, has become a common and important topological property. Network clustering methods are thus significant for the study of complex networks. Currently, many typical clustering algorithms have weaknesses such as inaccuracy and slow convergence. In this paper, we propose a clustering algorithm based on calculating the core influence of nodes. The clustering process simulates the process of cluster formation in sociology. The algorithm detects the nodes with core influence through their betweenness centrality and builds each cluster’s core structure using discriminant functions. Next, the algorithm obtains the final cluster structure by clustering the remaining nodes in the network with an optimization method. Experiments on different datasets show that the clustering accuracy of this algorithm is superior to that of a classical clustering algorithm (the Fast-Newman algorithm). It also clusters faster and plays a positive role in precisely revealing the real cluster structure of complex networks.


Decision Theory, an Unprecedented Validation Scheme for Rough-Fuzzy Clustering

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213016500032 ◽

2016 ◽

Vol 25 (02) ◽

pp. 1650003

Author(s):

S. Revathy ◽

B. Parvathavarthini ◽

S. Shiny Caroline

Keyword(s):

Decision Theory ◽

Fuzzy Clustering ◽

Risk Measure ◽

Clustering Algorithms ◽

Cluster Structure ◽

Optimal Number ◽

Cluster Validation ◽

Novel Approach ◽

Validity Measure ◽

Validation Scheme

Cluster validation is an essential technique in all cluster applications. Several validation methods measure the accuracy of cluster structure. Typical methods are geometric, where only distance and membership form the core of validation. Yao's decision theory is a novel approach to cluster validation, which involves loss calculations and a probability-based measure for determining cluster quality. Conventional rough set algorithms have utilized this validity measure. This paper extends decision theory to Rough-Fuzzy clustering as a new validation scheme, resolving loss and probability calculations to predict the risk measure in clustering techniques. Experiments on synthetic and UCI datasets show that the scheme deduces the optimal number of clusters and overcomes the drawbacks of traditional validation frameworks. The proposed index can also be applied to other clustering algorithms and is useful in business-oriented data mining.


Deep Adversarial Multi-view Clustering Network

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/409 ◽

2019 ◽

Cited By ~ 2

Author(s):

Zhaoyang Li ◽

Qianqian Wang ◽

Zhiqiang Tao ◽

Quanxue Gao ◽

Zhaohua Yang

Keyword(s):

Clustering Algorithms ◽

Cluster Structure ◽

Multiple Views ◽

Intrinsic Structure ◽

Common Structure ◽

Latent Space ◽

Adversarial Training ◽

The Common ◽

Latent Representations ◽

Real World Datasets

Multi-view clustering has attracted increasing attention in recent years by exploiting the common clustering structure across multiple views. Most existing multi-view clustering algorithms use shallow and linear embedding functions to learn the common structure of multi-view data. However, these methods cannot fully utilize the non-linear properties of multi-view data, which are important for revealing the complex cluster structure underlying such data. In this paper, we propose a novel multi-view clustering method, named Deep Adversarial Multi-view Clustering (DAMC) network, to learn the intrinsic structure embedded in multi-view data. Specifically, our model adopts deep auto-encoders to learn latent representations shared by multiple views, and meanwhile leverages adversarial training to further capture the data distribution and disentangle the latent space. Experimental results on several real-world datasets demonstrate that the proposed method outperforms state-of-the-art methods.


Deep Multiple Auto-Encoder-Based Multi-view Clustering

Data Science and Engineering ◽

10.1007/s41019-021-00159-z ◽

2021 ◽

Author(s):

Guowang Du ◽

Lihua Zhou ◽

Yudi Yang ◽

Kevin Lü ◽

Lizhen Wang

Keyword(s):

Clustering Algorithm ◽

Clustering Algorithms ◽

Representation Learning ◽

Underlying Structure ◽

Unified Framework ◽

Nonlinear Structure ◽

Heterogeneous Information ◽

Structure Information ◽

Real World Datasets ◽

Low Dimensional

Multi-view clustering (MVC), which aims to explore the underlying structure of data by leveraging heterogeneous information from different views, has attracted growing attention. Multi-view clustering algorithms based on different theories have been proposed and extended to various applications. However, most existing MVC algorithms are shallow models, which learn the structure information of multi-view data by mapping the data directly to a low-dimensional representation space, ignoring the nonlinear structure information hidden in each view; as a result, the performance of multi-view clustering is weakened to a certain extent. In this paper, we propose a deep multi-view clustering algorithm based on multiple auto-encoders, termed MVC-MAE, to cluster multi-view data. MVC-MAE adopts auto-encoders to capture the nonlinear structure information of each view in a layer-wise manner, and incorporates the local invariance within each view together with the consistent and complementary information between any two views. In addition, we integrate representation learning and clustering into a unified framework, so that the two tasks can be jointly optimized. Extensive experiments on six real-world datasets demonstrate the promising performance of our algorithm compared with 15 baseline algorithms in terms of two evaluation metrics.


New fractional approaches for n-polynomial P-convexity with applications in special function theory

Advances in Difference Equations ◽

10.1186/s13662-020-03000-5 ◽

2020 ◽

Vol 2020 (1) ◽

Cited By ~ 1

Author(s):

Shu-Bo Chen ◽

Saima Rashid ◽

Muhammad Aslam Noor ◽

Zakia Hammouch ◽

Yu-Ming Chu

Keyword(s):

Fractional Calculus ◽

Convex Functions ◽

Special Functions ◽

Real Life ◽

Fractional Integrals ◽

Digamma Function ◽

Novel Approach ◽

Inequality Theory ◽

New Strategy ◽

Significant Mechanism

Inequality theory provides a significant mechanism for managing symmetrical aspects in real-life circumstances. Integral inequalities combined with fractional calculus offer a strong means of handling continuous problems efficiently. This manuscript connects fractional calculus, special functions and convex functions. The authors develop a novel approach for investigating a new class of convex functions known as n-polynomial $\mathcal{P}$-convex functions. Using two identities for generalized fractional integrals, they provide several generalizations of the Hermite–Hadamard and Ostrowski type inequalities by employing Hölder and power-mean inequalities. With this new strategy, the concept of n-polynomial $\mathcal{P}$-convexity yields several other classes as particular cases: n-polynomial harmonically convex, n-polynomial convex, classical harmonically convex and classical convex functions. To investigate the efficiency and superiority of the suggested scheme with respect to fractional calculus, special functions and n-polynomial $\mathcal{P}$-convexity, two applications are presented, for the modified Bessel function and the $\mathfrak{q}$-digamma function. Finally, these outcomes can be used to evaluate the possible symmetric roles of the criterion that expresses the real phenomena of the problem.
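For reference, the classical Hermite–Hadamard inequality that such results build on can be stated as follows. This is the standard, non-fractional form for an ordinary convex function $f$ on $[a,b]$ with $a < b$; it is not the generalization derived in the paper, and the symbols $f$, $a$, $b$ are introduced here only for illustration.

```latex
% Classical Hermite-Hadamard inequality for a convex f : [a,b] -> R, a < b
f\!\left(\frac{a+b}{2}\right)
  \;\le\; \frac{1}{b-a}\int_{a}^{b} f(x)\,\mathrm{d}x
  \;\le\; \frac{f(a)+f(b)}{2}
```

The left side is the value at the midpoint, the middle is the mean value of $f$ over the interval, and the right side is the average of the endpoint values.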


Density Guarantee on Finding Multiple Subgraphs and Subtensors

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3446668 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-32

Author(s):

Quang-huy Duong ◽

Heri Ramampiaro ◽

Kjetil Nørvåg ◽

Thu-lan Dam

Keyword(s):

Lower Bound ◽

State Of The Art ◽

The State ◽

The Other ◽

Exact Methods ◽

Practical Solution ◽

Novel Approach ◽

Wide Range ◽

Real World Datasets ◽

Tensor Data

Dense subregion (subgraph & subtensor) detection is a well-studied area with a wide range of applications, and numerous efficient approaches and algorithms have been proposed. Approximation approaches are commonly used for detecting dense subregions because of the complexity of exact methods. Existing algorithms are generally efficient for dense subtensor and subgraph detection and perform well in many applications. However, most existing works rely on the state-of-the-art greedy 2-approximation algorithm, which provides solutions with only a loose theoretical density guarantee. The main drawback of most of these algorithms is that they can estimate only one subtensor, or subgraph, at a time, with a low guarantee on its density. While some methods can estimate multiple subtensors, they give a guarantee on the density with respect to the input tensor for the first estimated subtensor only. We address these drawbacks by providing both theoretical and practical solutions for estimating multiple dense subtensors in tensor data and by giving a higher lower bound on the density. In particular, we prove a higher lower bound on the density of the estimated subgraphs and subtensors. We also propose a novel approach to show that there are multiple dense subtensors with a guarantee on their density that is greater than the lower bound used in state-of-the-art algorithms. We evaluate our approach with extensive experiments on several real-world datasets, which demonstrate its efficiency and feasibility.
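The greedy 2-approximation mentioned in the abstract can be illustrated, for plain graphs, by the classical peeling procedure: repeatedly remove a minimum-degree vertex and keep the densest intermediate subgraph. The sketch below is that textbook baseline only, not the paper's multi-subtensor method. It assumes an undirected simple graph and measures density as edges divided by vertices; the function name and the example graph are hypothetical.

```python
def greedy_densest_subgraph(edges):
    """Greedy peeling for the densest subgraph (density = |E| / |V|).

    Returns (vertex set, density) of the best subgraph seen while peeling.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # assumption: ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return set(), 0.0
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    best_nodes, best_density = set(adj), num_edges / len(adj)
    while len(adj) > 1:
        # Remove a vertex of minimum degree (ties broken by sorted order).
        u = min(sorted(adj), key=lambda x: len(adj[x]))
        for v in adj[u]:
            adj[v].discard(u)
        num_edges -= len(adj[u])
        del adj[u]
        density = num_edges / len(adj)
        if density > best_density:
            best_nodes, best_density = set(adj), density
    return best_nodes, best_density


# Hypothetical graph: a 4-clique on vertices 0..3 with a short tail 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
nodes, density = greedy_densest_subgraph(edges)
```

This version rescans all vertices at every step for clarity; a practical implementation would keep vertices in degree buckets or a priority queue. It also returns a single subgraph, which is the limitation the abstract sets out to address.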


New Multi-View Classification Method with Uncertain Data

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3458282 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-23

Author(s):

Bo Liu ◽

Haowen Zhong ◽

Yanshan Xiao

Keyword(s):

Learning Strategy ◽

State Of The Art ◽

Uncertain Data ◽

Real Life ◽

Support Vector ◽

Classification Methods ◽

Complementary Information ◽

Novel Approach ◽

Svm Model ◽

Iterative Framework

Multi-view classification aims at designing a multi-view learning strategy to train a classifier from multi-view data, which are easily collected in practice. Most existing works on multi-view classification assume that the multi-view data are collected with precise information. However, in real-life applications the collected multi-view data are often uncertain because the collection process is corrupted by noise. To handle this case, this article proposes a novel approach, called uncertain multi-view learning with support vector machine (UMV-SVM), to cope with the problem of multi-view learning with uncertain data. The method first enforces agreement among all the views to seek complementary information in the multi-view data, and takes the uncertainty of the data into consideration by modeling the reachability area of the noise. It then uses an iterative framework to solve the proposed UMV-SVM model and obtain the multi-view classifier for prediction. Extensive experiments on real-life datasets show that the proposed UMV-SVM achieves better performance for uncertain multi-view classification than state-of-the-art multi-view classification methods.


Functional prediction of environmental variables using metabolic networks

Scientific Reports ◽

10.1038/s41598-021-91486-8 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Adèle Weber Zendrera ◽

Nataliya Sokolovska ◽

Hédi A. Soula

Keyword(s):

Machine Learning ◽

Growth Temperature ◽

Environmental Variables ◽

Metabolic Networks ◽

Machine Learning Techniques ◽

Underlying Structure ◽

Glutathione Biosynthesis ◽

Additional Information ◽

Cold Environments ◽

Novel Approach

In this manuscript, we propose a novel approach to assess relationships between environment and metabolic networks. We used a comprehensive dataset of more than 5000 prokaryotic species from which we derived the metabolic networks. We compute the scope from the reconstructed graphs, which is the set of all metabolites and reactions that can potentially be synthesized when provided with external metabolites. We show using machine learning techniques that the scope is an excellent predictor of taxonomic and environmental variables, namely growth temperature, oxygen tolerance, and habitat. In the literature, metabolites and pathways are rarely used to discriminate species. We make use of the scope's underlying structure (metabolites and pathways) to construct the predictive models, giving additional information on the important metabolic pathways needed to discriminate the species, which is often absent in other metabolic network properties. For example, in the particular case of growth temperature, glutathione biosynthesis pathways are specific to species growing in cold environments, whereas tungsten metabolism is specific to species in warm environments, as hinted at in the current literature. From a machine learning perspective, the scope is able to reduce the dimension of our data, and can thus be considered an interpretable graph embedding.
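The scope described above can be sketched as a fixed-point expansion: starting from the external (seed) metabolites, repeatedly fire every reaction whose substrates are all available and add its products. This is a minimal sketch of that idea, not the authors' pipeline. The function name `compute_scope`, the reaction names, and the toy network are hypothetical, and counting the seed metabolites as part of the scope is an assumption.

```python
def compute_scope(reactions, seeds):
    """Expand from seed metabolites until no new reaction can fire.

    reactions: dict mapping a reaction name to (substrates, products).
    Returns (reachable metabolites, reactions that fired).
    """
    metabolites = set(seeds)  # assumption: seeds count as part of the scope
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            # A reaction can fire once all of its substrates are available.
            if name not in fired and set(substrates) <= metabolites:
                fired.add(name)
                metabolites |= set(products)
                changed = True
    return metabolites, fired


# Hypothetical toy network.
reactions = {
    "r1": (["A"], ["B"]),
    "r2": (["B", "C"], ["D"]),
    "r3": (["E"], ["F"]),  # never fires: E is not reachable from the seeds
}
scope_metabolites, scope_reactions = compute_scope(reactions, {"A", "C"})
```

The resulting sets could then be encoded as binary feature vectors, one entry per metabolite or reaction, for use by a downstream classifier.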
