An Intelligent System for Multi-Label Classification Based on Particle Size and Shape Features using a Cascade Approach

2021 ◽  
Author(s):  
Hossein Izadi ◽  
Morteza Roostaei ◽  
Mohammad Soroush ◽  
Mohammad Mohammadtabar ◽  
Seyed Abolhassan Hosseini ◽  
...  

Abstract Intelligent systems are becoming increasingly popular in the petroleum industry. Particle Size Distribution (PSD) based on sieve size is a key signature of unconsolidated and weakly consolidated sandstone formations and is commonly the main parameter in sand control design. With extensive PSD measurement techniques available and a large number of measurements, especially for horizontal wells, it is necessary to classify the PSDs prior to further analysis for sand control design. However, PSD analysis alone is not enough for sand control design, and particle shapes need to be taken into account as well. A successful clustering algorithm for these purposes needs to be a cascade, multi-label, unsupervised and self-adaptive approach, since the particles can be assigned to more than one group and the number of clusters to be formed is not known in advance. In addition, because sieve size and shape features differ in nature, they should be used separately for clustering the particles. In the current study, a cascade approach is used for clustering the particles. In the first level of the cascade, an unsupervised and self-adaptive algorithm is introduced based on the sieve size features. The algorithm optimizes the number of clusters through a self-adaptive and incremental approach. The proposed clustering method uses a minimum similarity threshold (δ) as the only input parameter to start the clustering and tries to minimize the number of clusters during the clustering. In the second level of the cascade, the similarity between each particle in a cluster and the corresponding cluster center is measured, and those particles that do not satisfy δ in terms of shape similarity are moved out of the cluster. The novelty of the proposed method is threefold. The first is a particle clustering algorithm that works on the whole range of sizes and shape descriptors rather than focusing on certain points in the size graph (D-values). The second is the dynamic nature of the clustering, which tends to optimize the number of clusters during the clustering process. The third is the use of a cascade approach to involve both size and shape parameters in the clustering. The proposed method can be applied in the field for downhole monitoring and sand screen design.
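The two-level cascade described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state the similarity measure or the center update rule, so cosine similarity and a running-mean center are assumptions here, and the function names (`size_level`, `shape_level`) and the sample feature vectors are hypothetical. The sketch also assigns each particle to a single cluster and so does not cover the multi-label aspect.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def size_level(size_feats, delta):
    """Level 1: incremental clustering on size features.

    A sample joins the most similar existing cluster if the similarity
    reaches delta; otherwise it opens a new cluster.
    """
    clusters, centers = [], []
    for i, x in enumerate(size_feats):
        sims = [cosine(x, c) for c in centers]
        if sims and max(sims) >= delta:
            k = sims.index(max(sims))
            clusters[k].append(i)
            centers[k] = mean_vector([size_feats[j] for j in clusters[k]])
        else:
            clusters.append([i])
            centers.append(list(x))
    return clusters

def shape_level(clusters, shape_feats, delta):
    """Level 2: move out members whose shape similarity to the center is below delta."""
    kept, moved_out = [], []
    for members in clusters:
        center = mean_vector([shape_feats[j] for j in members])
        keep = [j for j in members if cosine(shape_feats[j], center) >= delta]
        moved_out.extend(j for j in members if j not in keep)
        kept.append(keep)
    return kept, moved_out

# Hypothetical data: mass fractions on three sieves, and two shape descriptors.
size_feats = [[0.10, 0.60, 0.30], [0.12, 0.58, 0.30], [0.70, 0.20, 0.10],
              [0.68, 0.22, 0.10], [0.11, 0.59, 0.30]]
shape_feats = [[0.90, 0.80], [0.88, 0.82], [0.50, 0.50],
               [0.52, 0.48], [0.20, 0.90]]

clusters = size_level(size_feats, delta=0.95)
kept, moved_out = shape_level(clusters, shape_feats, delta=0.95)
print(clusters, kept, moved_out)
```

In this toy data, sample 4 matches the first size cluster but has a different shape profile, so the second level moves it out. The result of the first level depends on the order in which samples are presented, which is a property of this simple incremental scheme.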

2021 ◽  
Vol 2 (2) ◽  
pp. 193-199
Author(s):  
Irwandi ◽  
Opim Salim Sitompul ◽  
Rahmat Widia Sembiring

The basic concept of the subtractive clustering algorithm is to choose the data point with the highest density (potential) in the variable space as a cluster center. The number and position of the cluster centers formed are influenced by the given radius (r) parameter value. If the radius is very small, potential data points around the cluster center are neglected. If the radius is too large, the contribution of all potential data points increases, thereby canceling the effect of cluster density. The number of cluster centers in the subtractive clustering algorithm is determined by an iterative process that finds the data points with the highest number of neighbors. This study uses the clustering partition as a parameter value to determine whether a data point (candidate cluster center) will be selected, in order to determine the effect of the radius (r) parameter value on the clustering generated by the subtractive clustering algorithm. Experiments on 4 datasets gave the following results. For dataset 1, the highest average fuzzy silhouette value is 0.9088, obtained with a radius (r) of 0.35 and 2 clusters. For dataset 2, the highest average fuzzy silhouette value is 0.6742, obtained with a radius (r) of 0.40 and 3 clusters. For dataset 3, the highest average fuzzy silhouette value is 0.7434, obtained with a radius (r) of 0.50 and 3 clusters. For dataset 4, the highest average fuzzy silhouette value is 0.6630, obtained with a radius (r) of 0.50 and 2 clusters. The subtractive clustering algorithm is widely applied in fields related to the needs of human life, including transportation, GIS, big data, electric voltage control, electrical energy demand, identifying areas of population density, and health applications such as breast cancer diagnosis.
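The role of the radius parameter can be seen in a minimal sketch of subtractive clustering. This is a simplified version, not the procedure evaluated in the study: it assumes the data are already scaled to the unit hypercube, uses a single stopping ratio instead of separate accept and reject tests, and does not compute the fuzzy silhouette. The names `subtractive_centers`, `squash` and `stop_ratio`, their default values, and the sample points are illustrative assumptions.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def subtractive_centers(points, radius, squash=1.5, stop_ratio=0.15):
    """Pick cluster centers by repeatedly taking the highest-potential point.

    radius controls how far a point's neighbors contribute to its potential;
    squash * radius controls how far a chosen center suppresses its neighbors.
    """
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (squash * radius) ** 2
    # Potential of each point: sum of Gaussian-like contributions from all points.
    potential = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        peak = max(potential)
        if peak < stop_ratio * first_peak:
            break  # remaining potential is too low to justify another center
        c = potential.index(peak)
        centers.append(points[c])
        # Subtract the influence of the new center from every point.
        potential = [pot - peak * math.exp(-beta * sq_dist(p, points[c]))
                     for p, pot in zip(points, potential)]
    return centers

# Hypothetical data in [0, 1]^2: two tight groups.
points = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.11, 0.11),
          (0.90, 0.90), (0.88, 0.90), (0.90, 0.88)]

centers = subtractive_centers(points, radius=0.35)
print(centers)
```

Calling `subtractive_centers` with different `radius` values on the same data is a quick way to observe how the number of centers changes, which is the effect the study measures with the fuzzy silhouette.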


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Baicheng Lyu ◽  
Wenhua Wu ◽  
Zhiqiang Hu

Abstract With the wide application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty of selecting indicators for judging the number of clusters. Also, small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, can automatically determine the number of clusters, is more sensitive to small clusters, and keeps the number of parameters to be adjusted to a minimum. On the basis of the robustness of the cluster number to noise, a denoising method suitable for BCALoD is proposed. A different cutoff distance and cutoff density are assigned to each data cluster, which results in improved clustering performance. The clustering ability of BCALoD is verified on randomly generated datasets and city light satellite images.
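The general idea of connecting points through local density can be sketched as follows. This is not BCALoD, whose details are not given in the abstract: the sketch uses a single global cutoff distance, counts neighbors as the density estimate, and links each point to its nearest denser neighbor. The function name `local_density_clusters` and the sample points are hypothetical.

```python
import math

def local_density_clusters(points, cutoff):
    """Link each point to its nearest denser neighbor within the cutoff.

    Points with no denser neighbor inside the cutoff start a new cluster,
    so the number of clusters is not specified in advance.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Process points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {idx: r for r, idx in enumerate(order)}
    labels = [-1] * n
    next_label = 0
    for i in order:
        denser = [j for j in range(n) if rank[j] < rank[i]]
        parent = min(denser, key=lambda j: dist[i][j]) if denser else None
        if parent is not None and dist[i][parent] < cutoff:
            labels[i] = labels[parent]  # inherit the label of the denser neighbor
        else:
            labels[i] = next_label      # no denser neighbor nearby: new cluster
            next_label += 1
    return labels, rho

# Hypothetical data: a group of four, a group of three, and one isolated point.
points = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.11, 0.11),
          (0.90, 0.90), (0.88, 0.90), (0.90, 0.88), (0.50, 0.50)]

labels, rho = local_density_clusters(points, cutoff=0.1)
print(labels, rho)
```

The isolated point keeps its own label rather than being absorbed into a larger group, which illustrates in a simple way why density-linking schemes can preserve small clusters.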


Author(s):  
R. R. Gharieb ◽  
G. Gendy ◽  
H. Selim

In this paper, the standard hard C-means (HCM) clustering approach to image segmentation is modified by incorporating weighted membership Kullback–Leibler (KL) divergence and local data information into the HCM objective function. The membership KL divergence, used for fuzzification, measures the proximity between each cluster membership function of a pixel and the locally-smoothed value of the membership in the pixel vicinity. The fuzzification weight is a function of the distances from the pixel to the cluster centers. The distance from a pixel to a cluster center is composed of the original pixel data distance plus a fraction of the distance generated from the locally-smoothed pixel data. It is shown that the obtained membership function of a pixel is proportional to the locally-smoothed membership function of this pixel multiplied by an exponentially distributed function of the negative pixel distance relative to the minimum distance provided by the nearest cluster center to the pixel. Because it incorporates the locally-smoothed membership and data information in addition to the relative distance, which is more tolerant to additive noise than the absolute distance, the proposed algorithm has a threefold noise-handling process. The presented algorithm, named local data and membership KL divergence based fuzzy C-means (LDMKLFCM), is tested on synthetic and real-world noisy images and its results are compared with those of several FCM-based clustering algorithms.
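The membership rule stated in this abstract can be illustrated for a single pixel with a short sketch. The exact formula is not given in the abstract, so this is one possible reading, not the LDMKLFCM update itself: the relative distance is taken as the ratio of each distance to the minimum distance, and the scale parameter `gamma`, the guard `eps`, the function name `kl_membership`, and the sample values are assumptions.

```python
import math

def kl_membership(distances, smoothed, gamma=1.0, eps=1e-12):
    """Membership of one pixel in each cluster (assumed form).

    distances: distance from the pixel to each cluster center
    smoothed:  locally-smoothed membership of the pixel in each cluster
    Each membership is proportional to the smoothed membership times
    exp(-d / (gamma * d_min)), then normalized to sum to 1.
    """
    d_min = min(distances)
    weights = [pi * math.exp(-d / (gamma * (d_min + eps)))
               for d, pi in zip(distances, smoothed)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical pixel: cluster 0 is nearer than cluster 1.
distances = [1.0, 4.0]

# Neighborhood with no preference between the two clusters.
u_flat = kl_membership(distances, smoothed=[0.5, 0.5])

# Neighborhood that mostly belongs to cluster 1.
u_smooth = kl_membership(distances, smoothed=[0.1, 0.9])

print(u_flat, u_smooth)
```

Comparing the two calls shows the intended effect of the smoothed term: when the neighborhood favors the farther cluster, the pixel's membership in the nearer cluster is pulled down, which is how neighborhood information can counteract a noisy pixel value.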

