Comparative Analysis of Clustering Techniques for Movie Recommendation

Movie recommendation is a highly ambiguous problem: a person might like a movie but dislike a very similar one. Current recommender systems focus on just a few parameters, such as director, cast, and genre. Many power-intensive methods, such as deep convolutional neural networks (CNNs), have been used, and these demand graphics processors that consume more energy. We try to accomplish the same task with less energy-intensive algorithms, namely clustering techniques. In this paper, we use clustering algorithms to create a more generalized list of similar movies, giving the user a wider variety of movies that he or she might like. We compare how the choice of parameters and the number of features affect the contents of the clusters. We also compare how different algorithms (K-means, hierarchical, BIRCH, and mean shift clustering) produce varied results, and we conclude which method suits which movie recommendation scenarios. Finally, we assess which algorithm clusters stray data points more efficiently and what advantages and disadvantages each algorithm offers.
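As a rough illustration of clustering-based recommendation, the sketch below runs plain k-means (Lloyd's algorithm) over hand-made movie feature vectors and returns the other members of a movie's cluster. The movie names, the three features, and the seeding with the first k points are all hypothetical choices made for this example; the abstract does not state the paper's actual features or parameters.

```python
from math import dist

# Hypothetical movies encoded as (action, romance, sci-fi) scores in [0, 1].
MOVIES = {
    "Movie A": (0.9, 0.1, 0.8),
    "Movie B": (0.8, 0.2, 0.9),
    "Movie C": (0.1, 0.9, 0.0),
    "Movie D": (0.2, 0.8, 0.1),
}


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


def similar_movies(title, movies=MOVIES, k=2):
    """Return the other movies that share a cluster with `title`."""
    names = list(movies)
    points = [movies[n] for n in names]
    # Seeding with the first k points keeps this sketch deterministic
    # (an assumption); real use would normally seed randomly or with k-means++.
    labels, _ = kmeans(points, points[:k])
    target = labels[names.index(title)]
    return [n for n, lab in zip(names, labels) if lab == target and n != title]


print(similar_movies("Movie A"))
```

Changing the features or the value of k changes which movies end up together, which is the kind of effect the comparison in this abstract is concerned with.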


Airborne LiDAR Remote Sensing for Individual Tree Forest Inventory Using Trunk Detection-Aided Mean Shift Clustering Techniques

Remote Sensing ◽ 10.3390/rs10071078 ◽ 2018 ◽ Vol 10 (7) ◽ pp. 1078 ◽ Cited By ~ 14

Author(s): Wei Chen ◽ Xingbo Hu ◽ Wen Chen ◽ Yifeng Hong ◽ Minhua Yang

Keyword(s): Remote Sensing ◽ Forest Inventory ◽ Mean Shift ◽ Airborne LiDAR ◽ Individual Tree ◽ Clustering Techniques ◽ Mean Shift Clustering


Lookahead selective sampling for incomplete data

International Journal of Applied Mathematics and Computer Science ◽ 10.1515/amcs-2016-0062 ◽ 2016 ◽ Vol 26 (4) ◽ pp. 871-884 ◽ Cited By ~ 1

Author(s): Loai Abdallah ◽ Ilan Shimshoni

Keyword(s): Incomplete Data ◽ Missing Values ◽ Clustering Algorithms ◽ Mean Shift ◽ Ensemble Clustering ◽ Selective Sampling ◽ Mean Shift Clustering ◽ Sampling Algorithms ◽ Instance Space ◽ Incomplete Datasets

Missing values in data are common in real-world applications, and several methods deal with this problem. In this paper we present lookahead selective sampling (LSS) algorithms for datasets with missing values. We developed two versions of selective sampling. The first integrates a distance function that can measure the similarity between pairs of incomplete points within the framework of the LSS algorithm. The second uses ensemble clustering to represent the data in a cluster matrix without missing values and then runs the LSS algorithm on the ensemble clustering instance space (LSS-EC). To construct the cluster matrix, we use the k-means and mean shift clustering algorithms, specially modified to deal with incomplete datasets. We tested our algorithms on six standard numerical datasets from different fields. On these datasets we simulated missing values and compared the performance of the LSS and LSS-EC algorithms for incomplete data with two other basic methods. Our experiments show that the suggested selective sampling algorithms outperform the other methods.
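The abstract does not specify the distance function used for incomplete points. As a stand-in only, the sketch below shows one common strategy, often called a partial distance: squared differences are summed over the coordinates observed in both points and rescaled to the full dimension. The function name and the use of NaN to mark missing values are assumptions made for this example, and this is not claimed to be the authors' measure.

```python
from math import isnan, nan, sqrt


def partial_distance(a, b):
    """Euclidean distance over jointly observed coordinates.

    Missing values are marked with float('nan') (an assumption of this sketch).
    The squared sum is rescaled by dimension / observed count so that points
    with many missing values are not judged artificially close.
    Returns nan when no coordinate is observed in both points.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    shared = [(x, y) for x, y in zip(a, b) if not (isnan(x) or isnan(y))]
    if not shared:
        return nan
    squared = sum((x - y) ** 2 for x, y in shared)
    return sqrt(squared * len(a) / len(shared))


# Complete points: reduces to the ordinary Euclidean distance.
print(partial_distance((0.0, 0.0), (3.0, 4.0)))
# Incomplete points: only coordinates 0 and 2 are observed in both.
print(partial_distance((0.0, nan, 0.0, nan), (3.0, 1.0, 4.0, nan)))
```

A distance of this kind can be plugged into any distance-based clustering or sampling routine, which is the role the abstract assigns to its own distance function inside LSS.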


Estimation Method of Line Loss Rate in Low Voltage Area Based on Mean Shift Clustering and BP Neural Network

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1754/1/012225 ◽ 2021 ◽ Vol 1754 (1) ◽ pp. 012225

Author(s): Huang Tan ◽ Yuan Li ◽ Liang Yu ◽ Jing Liu ◽ Linna Ni ◽ ...

Keyword(s): Neural Network ◽ BP Neural Network ◽ Loss Rate ◽ Low Voltage ◽ Mean Shift ◽ Estimation Method ◽ Mean Shift Clustering ◽ Line Loss


COMPARISON OF CLUSTER ANALYSIS ALGORITHMS IN OBJECT RECOGNITION

Collection of scientific works of the State University of Infrastructure and Technologies series Transport Systems and Technologies ◽ 10.32703/2617-9040-2020-36-12 ◽ 2020 ◽ pp. 112-120

Author(s): M. Botvin ◽ A. Gertsiy

Keyword(s): Image Processing ◽ Cluster Analysis ◽ Spatial Clustering ◽ Clustering Algorithms ◽ Mean Shift ◽ Comparative Modeling ◽ Data Sets ◽ Scale Parameters ◽ Mean Shift Clustering ◽ Synthetic Datasets

The article surveys graphic image processing based on clustering algorithms. It analyzes the prospects for applying cluster analysis algorithms in digital image processing, in particular for segmentation and compression of graphic images and for image recognition in the transport sector. Comparative modeling of the cluster analysis algorithms K-means, Mean-Shift, and DBSCAN (density-based spatial clustering of applications with noise) is carried out on various types of data. The simulation was performed on synthetic datasets in a Jupyter Notebook environment using the Scikit-learn library; four datasets were generated in this environment, and the clustering algorithms were applied to each. The results showed that the K-means algorithm can effectively describe relatively simple shapes. In contrast, mean shift does not require assumptions about the number of clusters or the shape of the distribution, but its performance depends on the choice of scale parameters. The DBSCAN algorithm can successfully detect more complex shapes, which highlights one of its strengths: clustering data of arbitrary shape. The disadvantages of the selected algorithms are also given, along with the types of images on which each works effectively and an estimate of computational speed.
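To make the remark about scale parameters concrete, here is a minimal flat-kernel mean shift written in plain Python rather than Scikit-learn, so that it runs without dependencies. The sample points, the bandwidth values, and the rule of merging modes closer than half a bandwidth are assumptions made for this sketch; it is not the article's experimental setup.

```python
from math import dist


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns one cluster label per input point."""
    modes = []
    for p in points:
        m = tuple(p)
        for _ in range(max_iter):
            # Window: all points within `bandwidth` of the current estimate.
            window = [q for q in points if dist(m, q) <= bandwidth]
            if not window:
                break  # guard against rounding at the window boundary
            # Shift the estimate to the mean of the window.
            new_m = tuple(sum(col) / len(window) for col in zip(*window))
            moved = dist(m, new_m)
            m = new_m
            if moved < tol:
                break
        modes.append(m)

    # Merge modes that converged to (nearly) the same location.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if dist(m, c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


# Hypothetical data: three tight pairs of points along a line.
POINTS = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (1.2, 0.0), (5.0, 0.0), (5.2, 0.0)]

for h in (0.5, 1.5, 6.0):
    labels = mean_shift(POINTS, bandwidth=h)
    print(h, len(set(labels)), labels)
```

The number of clusters is never passed in; it falls out of the bandwidth. On this toy data a small bandwidth separates all three pairs, a medium one fuses the two nearby pairs, and a very large one collapses everything into a single cluster.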


Clustering Techniques

Emerging Trends and Applications in Cognitive Computing - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-5225-5793-7.ch009 ◽ 2019 ◽ pp. 198-223 ◽ Cited By ~ 1

Author(s): Harendra Kumar

Keyword(s): Machine Learning ◽ Data Mining ◽ Image Analysis ◽ Clustering Algorithms ◽ Similarity Measures ◽ Main Task ◽ Clustering Techniques ◽ Data Points ◽ Exploratory Data ◽ Exploratory Data Mining

Clustering is the process of grouping a set of data points so that data points in the same group (called a cluster) are more similar to each other than to data points in other groups. Clustering is a main task of exploratory data mining and has been widely used in areas such as pattern recognition, image analysis, machine learning, bioinformatics, and information retrieval. Clusters are always identified by similarity measures, which include intensity, distance, and connectivity. Different similarity measures may be chosen depending on the application. The purpose of this chapter is to give an overview of many (certainly not all) clustering algorithms. The chapter covers valuable surveys, the types of clusters, and the methods used for constructing clusters.


Unsupervised Clustering of Neighborhood Associations and Image Segmentation Applications

Algorithms ◽ 10.3390/a13120309 ◽ 2020 ◽ Vol 13 (12) ◽ pp. 309

Author(s): Zhenggang Wang ◽ Xuantong Li ◽ Jin Jin ◽ Zhong Liu ◽ Wei Liu

Keyword(s): Remote Sensing ◽ Clustering Analysis ◽ Clustering Algorithms ◽ Optimal Solution ◽ Remote Sensing Image ◽ Neighborhood Density ◽ Correlation Clustering ◽ Density Correlation ◽ Advantages And Disadvantages ◽ Data Points

Clustering irregular shapes has always been a difficult problem in clustering analysis. In this paper, by analyzing the advantages and disadvantages of existing clustering analysis algorithms, we propose a new neighborhood density correlation clustering (NDCC) algorithm for quickly discovering arbitrarily shaped clusters. Because the density of the center region of any cluster sample dataset is greater than that of the edge region, the data points can be divided into core, edge, and noise data points, and the density correlation of the core data points in their neighborhood can then be used to form a cluster. Furthermore, by constructing an objective function and optimizing the parameters automatically, a locally optimal result that is close to the globally optimal solution can be obtained. This algorithm avoids the clustering errors caused by iso-density points between clusters. We compare this algorithm with five other clustering algorithms and verify it on two common remote sensing image datasets. The results show that it can cluster the same ground objects in remote sensing images into one class and distinguish different ground objects. NDCC is highly robust to irregularly scattered datasets and can solve the clustering problem for remote sensing images.
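The division of points into core, edge, and noise can be sketched with the familiar DBSCAN-style convention, used here as a stand-in because the abstract does not give NDCC's own criterion. The parameter names `eps` and `min_pts` and the sample points are assumptions made for this example.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each point 'core', 'edge', or 'noise' by neighborhood density.

    DBSCAN-style convention (not necessarily NDCC's rule):
      core  - at least `min_pts` points (itself included) lie within `eps`
      edge  - not core, but within `eps` of some core point
      noise - neither of the above
    """
    # Indices of all points within eps of each point (the point itself included).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps] for p in points
    ]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}

    roles = []
    for i, nb in enumerate(neighbors):
        if i in core:
            roles.append("core")
        elif any(j in core for j in nb):
            roles.append("edge")
        else:
            roles.append("noise")
    return roles


# Hypothetical data: a dense square, one point hanging off it, one far outlier.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
print(classify_points(POINTS, eps=1.0, min_pts=3))
```

Here the four corners of the square are core points, (2, 1) is an edge point because its only neighbor is the core point (1, 1), and (10, 10) is noise.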


PolSAR Image Segmentation by Mean Shift Clustering in the Tensor Space

ACTA AUTOMATICA SINICA ◽ 10.3724/sp.j.1004.2010.00798 ◽ 2010 ◽ Vol 36 (6) ◽ pp. 798-806 ◽ Cited By ~ 8

Author(s): Ying-Hua WANG ◽ Chong-Zhao HAN

Keyword(s): Image Segmentation ◽ Mean Shift ◽ Tensor Space ◽ Mean Shift Clustering


Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms

European Journal of Operational Research ◽ 10.1016/j.ejor.2005.03.039 ◽ 2006 ◽ Vol 174 (3) ◽ pp. 1742-1759 ◽ Cited By ~ 150

Author(s): Sueli A. Mingoti ◽ Joab O. Lima

Keyword(s): Neural Network ◽ Hierarchical Clustering ◽ Clustering Algorithms ◽ Fuzzy C Means ◽ SOM Neural Network


Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images

Mobile Networks and Applications ◽ 10.1007/s11036-020-01703-3 ◽ 2021 ◽ Vol 26 (1) ◽ pp. 200-215

Author(s): Muhammad Alam ◽ Jian-Feng Wang ◽ Cong Guangpei ◽ LV Yunrong ◽ Yuanfang Chen

Keyword(s): Neural Network ◽ Remote Sensing ◽ Neural Networks ◽ Image Processing ◽ Deep Learning ◽ Semantic Segmentation ◽ Natural Scene ◽ Remote Sensing Images ◽ Advantages And Disadvantages ◽ Target Segmentation

In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply convolutional neural networks (CNNs) to the semantic segmentation of remote sensing images. We improve the encoder-decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that these two models each have their own advantages and disadvantages in segmenting different objects. In addition, we propose an integrated algorithm that combines these two models. Experimental results show that the integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieves better segmentation than either model alone.


A novel bidirectional clustering algorithm based on local density

Scientific Reports ◽ 10.1038/s41598-021-93244-2 ◽ 2021 ◽ Vol 11 (1)

Author(s): Baicheng Lyu ◽ Wenhua Wu ◽ Zhiqiang Hu

Keyword(s): Clustering Algorithm ◽ Local Density ◽ Clustering Algorithms ◽ Cluster Number ◽ Denoising Method ◽ Number Of Clusters ◽ Data Points ◽ Cutoff Distance ◽ Large Clusters ◽ Small Clusters

With the wide application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty of selecting indicators for judging the number of clusters. Also, small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, can automatically determine the number of clusters, is more sensitive to small clusters, and can reduce the number of parameters to be tuned to a minimum. On the basis of the robustness of the cluster number to noise, a denoising method suitable for BCALoD is proposed. Different cutoff distances and cutoff densities are assigned to each data cluster, which improves clustering performance. The clustering ability of BCALoD is verified on randomly generated datasets and city light satellite images.
