clustering problem Latest Research Papers

A proper cluster is usually defined as a maximally coherent group within a set of objects, measured by pairwise or more complicated similarities. In general hypergraphs, the clustering problem refers to the extraction of subhypergraphs with a higher internal density, for instance maximal cliques. Determining the clustering structure of hypergraphs is a significant problem in data mining. Many works on detecting clusters in graphs and uniform hypergraphs have been published in past decades. Recently, it has been shown that the maximum {1,2}-clique size in {1,2}-hypergraphs is related to the global maxima of a certain quadratic program based on the structure of the given nonuniform hypergraph. In this paper, we first extend this result to relate strict local maxima of this program to certain maximal cliques, including 2-cliques or {1,2}-cliques. We also explore the connection between edge-weighted clusters and strict local optima of a class of polynomials arising from nonuniform {1,2}-hypergraphs.
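For ordinary graphs, the classical result of this kind is the Motzkin-Straus theorem: the maximum of x^T A x over the probability simplex equals 1 - 1/ω(G), where ω(G) is the clique number. The sketch below checks that identity numerically on a small hypothetical graph. It does not reproduce the paper's quadratic program for {1,2}-hypergraphs, which the abstract does not state; the graph, function names, and sampling check are all illustrative assumptions.

```python
import random
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def clique_number(adj):
    """Brute-force clique number and one maximum clique (tiny graphs only)."""
    n = len(adj)
    for size in range(n, 0, -1):
        for nodes in combinations(range(n), size):
            if is_clique(adj, nodes):
                return size, nodes
    return 0, ()

def quadratic_form(adj, x):
    """x^T A x for the 0/1 adjacency matrix (each edge is counted twice)."""
    return sum(x[u] * x[v] for u in range(len(adj)) for v in adj[u])

# Hypothetical example graph: triangle 0-1-2 with a path 2-3-4 attached.
adj = [{1, 2}, {0, 2}, {0, 1, 3}, {2, 4}, {3}]
omega, clique = clique_number(adj)

# Uniform weights on a maximum clique attain the Motzkin-Straus bound.
x_star = [1.0 / omega if v in clique else 0.0 for v in range(len(adj))]
bound = 1.0 - 1.0 / omega
value = quadratic_form(adj, x_star)

# Random points on the simplex should never exceed the bound.
rng = random.Random(0)
best_random = 0.0
for _ in range(2000):
    w = [rng.random() for _ in adj]
    total = sum(w)
    best_random = max(best_random, quadratic_form(adj, [wi / total for wi in w]))
```

Here `value` matches `bound`, while `best_random` stays at or below it, which is the sense in which maxima of the program encode clique sizes.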

Approximation Algorithms for the Capacitated Min–Max Correlation Clustering Problem

Asia Pacific Journal of Operational Research ◽

10.1142/s0217595922400085 ◽

2022 ◽

Author(s):

Sai Ji ◽

Jun Li ◽

Zijun Wu ◽

Yicheng Xu

Keyword(s):

Integer Programming ◽

Approximation Algorithms ◽

Approximation Algorithm ◽

Gap Analysis ◽

The Other ◽

Correlation Clustering ◽

Clustering Problem ◽

Clustering Model ◽

Proposed Model ◽

Natural Variant

In this paper, we propose a capacitated min–max correlation clustering model, a natural variant of the min–max correlation clustering problem. As our main contribution, we present an integer programming formulation of the proposed model and an analysis of its integrality gap. Furthermore, we provide two approximation algorithms for the model: one is a bi-criteria approximation algorithm, and the other is based on an LP-rounding technique.
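To make the objective concrete, here is a minimal sketch of how a candidate solution could be scored. It assumes the usual min–max correlation clustering setting (a complete graph whose pairs are labeled similar or dissimilar, with the cost being the largest number of disagreements at any single vertex) and assumes the capacity is an upper bound on cluster size. The abstract does not spell out either detail, and the function names and example data are hypothetical.

```python
from collections import Counter

def max_disagreements(n, positive_pairs, labels):
    """Largest number of disagreeing pairs incident to any one vertex.

    A pair disagrees if it is similar ("+") but split across clusters,
    or dissimilar ("-") but placed in the same cluster.
    Pairs not listed in positive_pairs are treated as dissimilar.
    """
    positive = {frozenset(pair) for pair in positive_pairs}
    worst = 0
    for u in range(n):
        bad = 0
        for v in range(n):
            if u == v:
                continue
            same_cluster = labels[u] == labels[v]
            similar = frozenset((u, v)) in positive
            if same_cluster != similar:
                bad += 1
        worst = max(worst, bad)
    return worst

def respects_capacity(labels, capacity):
    """True if no cluster holds more than `capacity` vertices (assumed form)."""
    return all(size <= capacity for size in Counter(labels).values())

# Hypothetical instance: a similar triangle 0-1-2 plus a similar pair 2-3.
positive_pairs = [(0, 1), (1, 2), (0, 2), (2, 3)]
tight = [0, 0, 0, 1]   # needs a cluster of size 3
split = [0, 0, 1, 1]   # fits capacity 2 but creates more disagreements

cost_tight = max_disagreements(4, positive_pairs, tight)
cost_split = max_disagreements(4, positive_pairs, split)
```

With capacity 3 the first labeling is feasible and cheaper; with capacity 2 only the second is feasible, which illustrates how the capacity constraint can force a worse min–max cost.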

Max stable set problem to found the initial centroids in clustering problem

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v25.i1.pp569-579 ◽

2022 ◽

Vol 25 (1) ◽

pp. 569

Author(s):

Awatif Karim ◽

Chakir Loqman ◽

Youssef Hami ◽

Jaouad Boumhidi

Keyword(s):

Document Clustering ◽

Large Data ◽

Hopfield Network ◽

Large Data Sets ◽

Stable Set ◽

Data Sets ◽

Clustering Problem ◽

Text Document ◽

Stable Set Problem

In this paper, we propose a new approach to document clustering with the K-Means algorithm, which is sensitive to the random selection of the k cluster centroids in the initialization phase. To evaluate the quality of K-Means clustering, we propose to model the text document clustering problem as the max stable set problem (MSSP) and to use a continuous Hopfield network to solve the MSSP and obtain the initial centroids. The idea is inspired by the fact that MSSP and clustering share the same principle: MSSP consists of finding the largest set of pairwise non-adjacent nodes in a graph, and in clustering, all objects are divided into disjoint clusters. Simulation results demonstrate that the proposed K-Means improved by MSSP (KM_MSSP) is efficient on large data sets, runs considerably faster, and provides better clustering quality than other methods.
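The idea of seeding K-Means from a stable set can be sketched as follows. This is not the authors' method: a simple greedy maximal stable set is swapped in for the continuous Hopfield network, the graph is assumed to link any two points closer than a hypothetical `radius`, and the points are toy 2-D data rather than text documents.

```python
import math

def greedy_stable_set(points, radius):
    """Indices of a maximal stable set in the graph that links points
    closer than `radius` (greedy, so not necessarily a maximum set)."""
    chosen = []
    for i, p in enumerate(points):
        if all(math.dist(p, points[j]) >= radius for j in chosen):
            chosen.append(i)
    return chosen

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd iterations starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return centroids, clusters

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = greedy_stable_set(points, radius=5.0)
centroids, clusters = kmeans(points, [points[i] for i in seeds])
```

Because the seeds are pairwise non-adjacent, they are mutually far apart, so each group receives one initial centroid and the number of clusters follows from the size of the stable set.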

COMPARATIVE ANALYSIS OF THE K-STANDARDS ALGORITHM APPLICATION FOR THE CLUSTERING PROBLEM

СИСТЕМЫ УПРАВЛЕНИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ ◽

10.36622/vstu.2021.86.4.002 ◽

2021 ◽

pp. 10-14

Author(s):

Н.Л. Резова ◽

И.П. Рожнов ◽

А.А. Истомина

Keyword(s):

Comparative Analysis ◽

Clustering Problem ◽

Automatic Grouping

The article discusses the application of the k-standards algorithm to the clustering problem, using production batches of electrical radio products as an example. It draws conclusions about the quality of the k-standards algorithm and the expediency of using it to solve automatic product grouping problems.

An approach to finding a global optimum in constrained clustering tasks involving the assessments of several experts

Transaction Kola Science Centre ◽

10.37614/2307-5252.2021.5.12.007 ◽

2021 ◽

Vol 12 (5-2021) ◽

pp. 75-90

Author(s):

Alexander A. Zuenko ◽

Olga V. Fridman ◽

Olga N. Zuenko ◽

...

Keyword(s):

Constraint Satisfaction ◽

Constraint Satisfaction Problem ◽

A Priori ◽

Global Optimum ◽

Constrained Clustering ◽

Clustering Problem ◽

Expert Opinions ◽

Optimal Value ◽

The Subject ◽

Additional Constraints

An approach to solving the constrained clustering problem has been developed. It is based on aggregating the data obtained when several independent experts evaluate the characteristics of the clustered objects, and on analyzing alternative clusterings by constraint programming methods with original heuristics. The clustered objects are represented as multisets, which makes it possible to use appropriate methods for aggregating expert opinions. It is proposed to solve the constrained clustering problem as a constraint satisfaction problem. Particular attention is paid to reducing the number of constraints and simplifying them when the constraint satisfaction problem is formalized. Within the framework of the approach, we have created: a) a method for estimating the optimal value of the objective function by hierarchical clustering of multisets, taking into account a priori constraints of the subject domain, and b) a method for generating additional constraints on the desired solution in the form of “smart tables”, based on the obtained estimate. The approach allows us to find the best partition in problems of the class under consideration, which are characterized by high dimension.
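Treating constrained clustering as a constraint satisfaction problem can be illustrated with a small backtracking search. The sketch below is a generic illustration, not the authors' method. It assumes the a priori constraints take the common must-link / cannot-link form and does not model multisets, expert aggregation, or “smart tables”. All names and data are hypothetical.

```python
def constrained_partitions(n, k, must_link, cannot_link):
    """Yield every assignment of n objects to at most k clusters
    that satisfies all must-link and cannot-link constraints."""
    labels = [None] * n

    def consistent(i):
        # Only constraints that involve the newly assigned object i can break.
        for a, b in must_link:
            if i in (a, b) and None not in (labels[a], labels[b]) \
                    and labels[a] != labels[b]:
                return False
        for a, b in cannot_link:
            if i in (a, b) and labels[a] is not None \
                    and labels[a] == labels[b]:
                return False
        return True

    def assign(i, used):
        if i == n:
            yield tuple(labels)
            return
        # Symmetry breaking: object i may join an existing cluster
        # or open exactly one new cluster, never skip a label.
        for c in range(min(used + 1, k)):
            labels[i] = c
            if consistent(i):
                yield from assign(i + 1, max(used, c + 1))
        labels[i] = None

    yield from assign(0, 0)

# Hypothetical instance: objects 0 and 1 must share a cluster,
# objects 1 and 2 must not.
solutions = list(constrained_partitions(
    n=4, k=2, must_link=[(0, 1)], cannot_link=[(1, 2)]))
```

An optimizing variant would keep the best-scoring partition among those enumerated and prune branches whose partial cost already exceeds a known estimate of the optimum.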

A hybrid clustering algorithm based on improved GWO and KHM clustering

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211034 ◽

2021 ◽

pp. 1-14

Author(s):

Feng Xue ◽

Yongbo Liu ◽

Xiaochen Ma ◽

Bharat Pathak ◽

Peng Liang

Keyword(s):

Clustering Algorithm ◽

Test Functions ◽

Convergence Factor ◽

Step Size ◽

Local Optima ◽

Hybrid Clustering ◽

Clustering Problem ◽

Dynamic Weight ◽

The Stability ◽

Harmonic Means

To solve the problem that the K-means algorithm is sensitive to the initial clustering centers and easily falls into local optima, we propose a new hybrid clustering algorithm called the IGWOKHM algorithm. In this paper, we first propose an improved strategy based on a nonlinear convergence factor, an inertial step size, and a dynamic weight to improve the search ability of the traditional grey wolf optimization (GWO) algorithm. Then, the improved GWO (IGWO) algorithm and the K-harmonic means (KHM) algorithm are fused to solve the clustering problem. This fusion clustering algorithm is called IGWOKHM, and it combines the global search ability of IGWO with the local fast optimization ability of KHM to both solve the problem of the K-means algorithm’s sensitivity to the initial clustering centers and address the shortcomings of KHM. The experimental results on 8 test functions and 4 University of California Irvine (UCI) datasets show that the IGWO algorithm greatly improves the efficiency of the model while ensuring the stability of the algorithm. The fusion clustering algorithm can effectively overcome the inadequacies of the K-means algorithm and has a good global optimization ability.
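The K-harmonic means objective that such a hybrid would minimize can be written compactly. The sketch below uses the standard KHM performance function, the sum over points of k divided by the sum of inverse p-th powers of the distances to the k centers. It does not reproduce the paper's improved GWO, whose convergence factor, step size, and weights are not given in the abstract. The exponent `p`, the `eps` guard, and the data are illustrative assumptions.

```python
import math

def khm_objective(points, centers, p=3.5, eps=1e-12):
    """K-harmonic means performance function (lower is better).

    For each point, take the harmonic mean of its p-th power distances
    to all centers, then sum over points.
    """
    k = len(centers)
    total = 0.0
    for x in points:
        # eps guards against division by zero when a point sits on a center.
        inverse_sum = sum(1.0 / max(math.dist(x, c), eps) ** p for c in centers)
        total += k / inverse_sum
    return total

# Toy data: two groups; compare well-placed and badly placed centers.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_centers = [(0.0, 0.5), (10.0, 10.5)]
bad_centers = [(5.0, 5.0), (6.0, 6.0)]

good_score = khm_objective(points, good_centers)
bad_score = khm_objective(points, bad_centers)
```

Because every center contributes to every point's term, the objective is less sensitive to initialization than the K-means sum of squared errors. A population-based optimizer can use it directly as a fitness function over candidate center sets.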

Lithium-ion power battery grouping: A multisource data fusion based clustering approach and distributed deployment

Journal of Electrochemical Energy Conversion and Storage ◽

10.1115/1.4053307 ◽

2021 ◽

pp. 1-13

Author(s):

Yudong Wang ◽

Xiwei Bai ◽

Chengbao Liu ◽

Jie Tan

Keyword(s):

Data Fusion ◽

Lithium Ion ◽

Electrochemical Characteristics ◽

Static Characteristics ◽

Clustering Problem ◽

Power Battery ◽

Source Data ◽

Clustering Approach ◽

Effective Network ◽

Distributed Deployment

The consistency of lithium-ion power batteries significantly affects the life and safety of battery modules and packs. To improve consistency, battery grouping is employed: batteries with similar electrochemical characteristics are assembled into modules and packs. The grouping process therefore boils down to an unsupervised clustering problem. Currently used grouping approaches fall into two categories, those based on static characteristics and those based on dynamic characteristics. However, there are three problems. First, both categories underutilize multisource data. Second, grouping based on static characteristics fails over time. Third, grouping based on dynamic characteristics has high computational complexity. To solve these problems, we propose a distributed battery grouping approach based on multisource data fusion. The proposed approach designs an effective network structure for multisource data fusion and a self-supervised scheme for feature extraction from both static and dynamic multisource data. We apply our approach to real battery modules and test the state of health (SOH) after charging-discharging cycles. Experimental results indicate that the proposed scheme can increase the SOH of modules by 3.89% and reduce the inconsistency by 68.4%. Meanwhile, distributed deployment reduces the time cost by 87.9% compared with the centralized scheme.

An Algorithm Using DBSCAN to Solve the Velocity Dealiasing Problem

Advances in Meteorology ◽

10.1155/2021/9705412 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Wei Zhao ◽

Qinglan Li ◽

Kuifeng Jin

Keyword(s):

Doppler Radar ◽

Reference Data ◽

Spatial Clustering ◽

Radar Data ◽

Observation Data ◽

Two Dimensional ◽

Velocity Data ◽

Clustering Problem ◽

Radial Velocity Data

Velocity dealiasing is an essential task for correcting the radial velocity data collected by Doppler radar. To improve the accuracy of velocity dealiasing, traditional dealiasing algorithms usually set a series of empirical thresholds, combine three- or four-dimensional data, or introduce other observation data as a reference. In this study, we transform the velocity dealiasing problem into a clustering problem and solve this problem using the density-based spatial clustering of applications with noise (DBSCAN) method. This algorithm is verified with a case study involving radar data on the tropical cyclone Mangkhut in 2018. The results show that the accuracy of the proposed algorithm is close to that of the four-dimensional dealiasing (4DD) method proposed by James and Houze; yet, it only requires two-dimensional velocity data and eliminates the need for other reference data. The results of the case study also show that the 4DD algorithm filters out many observation gates close to the missing data or radar center, whereas the proposed algorithm tends to retain and correct these gates.
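For readers unfamiliar with DBSCAN, a minimal pure-Python version is sketched below. It is a generic illustration of the clustering step only: how the radar gates and velocities are mapped to points, and how clusters are then unfolded, is not described in the abstract, so the 2-D points and the `eps` and `min_pts` values here are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force range query; the point itself is included.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The density-based formulation needs no preset number of clusters and marks isolated points as noise, properties that suit spatially contiguous regions of similar velocity.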

Clustering: Finding Patterns in the Darkness

10.46723/ojml.v1i1.4 ◽

2021 ◽

pp. 1-28

Author(s):

Hector Menendez

Keyword(s):

Machine Learning ◽

Decision Making ◽

Industry 4.0 ◽

Expert Knowledge ◽

Large Datasets ◽

Decision Making Process ◽

Clustering Problem ◽

Multiple Strategies ◽

Data Similarity ◽

The World

Machine learning is changing the world and fuelling Industry 4.0. These statistical methods focus on identifying patterns in data to provide an intelligent response to specific requests. Although understanding data tends to require expert knowledge to supervise the decision-making process, some techniques need no supervision. These unsupervised techniques can work blindly, but they are based on data similarity. One of the most popular areas in this field is clustering. Clustering groups data so that the elements of each cluster are strongly similar while the clusters are distinct from one another. This field started with the K-means algorithm, one of the most popular algorithms in machine learning, with extensive applications. Currently, there are multiple strategies to deal with the clustering problem. This review introduces some of the classical algorithms, focusing significantly on algorithms based on evolutionary computation, and explains some current applications of clustering to large datasets.

Analysis of Economic Development Trend in Postepidemic Era Based on Improved Clustering Algorithm

Scientific Programming ◽

10.1155/2021/4467001 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Li Guo ◽

Kunlin Zhu ◽

Ruijun Duan

Keyword(s):

Economic Development ◽

Large Scale ◽

Clustering Algorithm ◽

Development Trend ◽

Data Sets ◽

Analysis Model ◽

Data Set ◽

Clustering Problem ◽

Large Scale Data ◽

Compressed Data

In order to explore the economic development trend in the postepidemic era, this paper improves the traditional clustering algorithm and constructs a postepidemic economic development trend analysis model based on intelligent algorithms. In order to solve the clustering problem of large-scale nonuniform density data sets, this paper proposes an adaptive nonuniform density clustering algorithm based on balanced iterative reduction and uses the algorithm to further cluster the compressed data sets. For large-scale data sets, the clustering results can accurately reflect the class characteristics of the data set as a whole. Moreover, the algorithm greatly improves the time efficiency of clustering. From the research results, we can see that the improved clustering algorithm has a certain effect on the analysis of economic development trends in the postepidemic era and can continue to play a role in subsequent economic analysis.
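Assuming that "balanced iterative reduction" refers to BIRCH (balanced iterative reducing and clustering using hierarchies), the data compression step rests on clustering features: triples (N, LS, SS) that summarize a group of points and can be merged by simple addition. The sketch below shows only that summary structure. The CF-tree and the paper's adaptive nonuniform-density step are not reproduced, and the class and method names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class ClusteringFeature:
    n: int               # number of points summarized
    linear_sum: tuple    # per-dimension sum of the points (LS)
    square_sum: float    # sum of squared norms of the points (SS)

    @classmethod
    def from_point(cls, point):
        return cls(1, tuple(point), sum(c * c for c in point))

    def merge(self, other):
        """Summaries are additive, so merging never revisits raw points."""
        return ClusteringFeature(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.linear_sum, other.linear_sum)),
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return tuple(s / self.n for s in self.linear_sum)

    def radius(self):
        """Root-mean-square distance of the summarized points to the centroid."""
        centroid_norm_sq = sum(c * c for c in self.centroid())
        return math.sqrt(max(self.square_sum / self.n - centroid_norm_sq, 0.0))

# Two hypothetical points compressed into a single summary.
merged = ClusteringFeature.from_point((0.0, 0.0)).merge(
    ClusteringFeature.from_point((2.0, 0.0)))
```

A later clustering pass can then operate on these compact summaries instead of the full data set, which is where the time savings on large-scale data would come from.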

Recently Published Documents

On Clustering Detection Based on a Quadratic Program in Hypergraphs

Approximation Algorithms for the Capacitated Min–Max Correlation Clustering Problem

Max stable set problem to found the initial centroids in clustering problem

COMPARATIVE ANALYSIS OF THE K-STANDARDS ALGORITHM APPLICATION FOR THE CLUSTERING PROBLEM

An approach to finding a global optimum in constrained clustering tasks involving the assessments of several experts

A hybrid clustering algorithm based on improved GWO and KHM clustering

Lithium-ion power battery grouping: A multisource data fusion based clustering approach and distributed deployment

An Algorithm Using DBSCAN to Solve the Velocity Dealiasing Problem

Clustering: Finding Patterns in the Darkness

Analysis of Economic Development Trend in Postepidemic Era Based on Improved Clustering Algorithm
