Supervised Regression Clustering

Clustering techniques typically group similar instances based on their individual attributes, on the assumption that similar instances have similar attribute characteristics. In contrast, clustering instances that behave similarly with respect to a specific response is framed as a supervised learning problem: for instance, identifying which fashion products have similar sales behavior. Conventional clustering methods cannot handle this case because they do not consider any response and they treat all attributes as equally important. Clustering instances with respect to a response, however, leads to better data analytics. In this research, the authors introduce an approach to supervised clustering and show its advantages for both data analytics and prediction. To verify the feasibility and performance of the approach, the authors conducted several experiments on a real dataset from the apparel industry.

Review of single clustering methods

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v8.i3.pp221-227 ◽

2019 ◽

Vol 8 (3) ◽

pp. 221

Author(s):

Nurshazwani Muhamad Mahfuz ◽

Marina Yusoff ◽

Zakiah Ahmad

Keyword(s):

Data Analytics ◽

Review Paper ◽

State Of The Art ◽

Learning Method ◽

Clustering Methods ◽

Optimal Result ◽

Clustering Method ◽

Noise Data ◽

Research Findings ◽

Real World Problems

Clustering plays an important role as an unsupervised learning method in data analytics, assisting with many real-world problems such as image segmentation, object recognition, and information retrieval. Traditional clustering techniques often struggle because outliers and noisy data lead to non-optimal results. This paper reviews single clustering methods that have been applied in various domains, with the aim of identifying suitable applications and areas for improvement. Three categories of single clustering methods are suggested, which should help researchers understand the clustering aspects and determine the requirements for choosing a clustering method, based on the state of the art in previous research findings.

Two Novel Kernel-Based Semi-Supervised Clustering Methods by Seeding

2009 Chinese Conference on Pattern Recognition ◽

10.1109/ccpr.2009.5344157 ◽

2009 ◽

Cited By ~ 6

Author(s):

Lei Gu ◽

Fuchun Sun

Keyword(s):

Clustering Methods ◽

Supervised Clustering

Categorization of Data Clustering Techniques

Handbook of Research on Public Information Technology ◽

10.4018/978-1-59904-857-4.ch052 ◽

2008 ◽

pp. 568-577

Author(s):

Baoying Wang ◽

Imad Rahal ◽

Richard Leipold

Keyword(s):

Unsupervised Learning ◽

Supervised Learning ◽

Data Clustering ◽

Analysis Data ◽

Discovery Process ◽

Data Set ◽

Market Basket ◽

Clustering Techniques ◽

Data Points ◽

Class Labels

Data clustering is a discovery process that partitions a data set into groups (clusters) such that data points within the same group have high similarity while being very dissimilar to points in other groups (Han & Kamber, 2001). The ultimate goal of data clustering is to discover natural groupings in a set of patterns, points, or objects without prior knowledge of any class labels. In fact, in the machine-learning literature, data clustering is typically regarded as a form of unsupervised learning as opposed to supervised learning. In unsupervised learning or clustering, there is no training function as in supervised learning. There are many applications for data clustering including, but not limited to, pattern recognition, data analysis, data compression, image processing, understanding genomic data, and market-basket research.
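
As a rough illustration of partitioning a data set into groups without class labels, the sketch below uses k-means. This is an illustrative choice only; the abstract does not name a specific algorithm. The function name `kmeans`, the sample points, and the parameters `k`, `iterations`, and `seed` are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Partition points into k groups by alternating assignment and centroid updates."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical data: two well-separated groups, with no class labels supplied.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
```

The first three points end up sharing one label and the last three the other, which is the "high similarity within a group, low similarity across groups" property the definition describes.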

Predictive Modeling in Health Care Data Analytics: A Sustainable Supervised Learning Technique

Big Data Analytics and Intelligence: A Perspective for Health Care ◽

10.1108/978-1-83909-099-820201016 ◽

2020 ◽

pp. 263-280

Author(s):

Suryakanthi Tangirala

Keyword(s):

Health Care ◽

Supervised Learning ◽

Predictive Modeling ◽

Data Analytics ◽

Health Care Data ◽

Learning Technique

Semi-supervised clustering methods

Wiley Interdisciplinary Reviews Computational Statistics ◽

10.1002/wics.1270 ◽

2013 ◽

Vol 5 (5) ◽

pp. 349-361 ◽

Cited By ~ 47

Author(s):

Eric Bair

Keyword(s):

Clustering Methods ◽

Supervised Clustering

Spectrogram Classification Using Dissimilarity Space

Applied Sciences ◽

10.3390/app10124176 ◽

2020 ◽

Vol 10 (12) ◽

pp. 4176 ◽

Cited By ~ 1

Author(s):

Loris Nanni ◽

Andrea Rigo ◽

Alessandra Lumini ◽

Sheryl Brahnam

Keyword(s):

Ad Hoc ◽

Classification Problem ◽

Space Representation ◽

Support Vector ◽

Clustering Methods ◽

Audio Classification ◽

Classification Problems ◽

Clustering Techniques ◽

Vector Space Representation ◽

Better Than

In this work, we combine a Siamese neural network and different clustering techniques to generate a dissimilarity space that is then used to train an SVM for automated animal audio classification. The animal audio datasets used are (i) birds and (ii) cat sounds, which are freely available. We exploit different clustering methods to reduce the spectrograms in the dataset to a number of centroids that are used to generate the dissimilarity space through the Siamese network. Once computed, we use the dissimilarity space to generate a vector space representation of each pattern, which is then fed into a support vector machine (SVM) to classify a spectrogram by its dissimilarity vector. Our study shows that the proposed approach based on dissimilarity space performs well on both classification problems without ad-hoc optimization of the clustering methods. Moreover, results show that the fusion of CNN-based approaches applied to the animal audio classification problem works better than the stand-alone CNNs.
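
The dissimilarity-space representation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: Euclidean distance stands in for the dissimilarity learned by the Siamese network, the centroids are hard-coded rather than produced by clustering, and the names `dissimilarity_vector`, `centroids`, and `patterns` are hypothetical.

```python
import math


def dissimilarity_vector(pattern, centroids, dissimilarity=math.dist):
    """Represent a pattern by its dissimilarity to each centroid (prototype)."""
    return [dissimilarity(pattern, c) for c in centroids]


# Hypothetical 2-D feature vectors standing in for spectrograms.
centroids = [(0.0, 0.0), (3.0, 4.0)]
patterns = [(0.0, 0.0), (3.0, 0.0)]

# Each row is the vector-space representation that a classifier would consume.
dissimilarity_space = [dissimilarity_vector(p, centroids) for p in patterns]
```

Here `dissimilarity_space` is `[[0.0, 5.0], [3.0, 4.0]]`. In the pipeline the abstract describes, vectors of this kind would be the input to the SVM, and the `dissimilarity` argument would be replaced by the trained network's output.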

A Study on the Behavior of Clustering Techniques for Modeling Travel Time in Road-Based Mass Transit Systems

Proceedings ◽

10.3390/proceedings2019031018 ◽

2019 ◽

Vol 31 (1) ◽

pp. 18

Author(s):

Cristóbal ◽

Padrón ◽

Quesada-Arencibia ◽

Alayón ◽

Blasio ◽

...

Keyword(s):

Travel Time ◽

Behavior Pattern ◽

Large Data ◽

Mass Transit ◽

Clustering Methods ◽

Knowledge Modeling ◽

Activity Data ◽

Transit Systems ◽

Clustering Techniques ◽

Key Factor

In road-based mass transit systems, travel time is a key factor affecting quality of service. For this reason, understanding how travel time behaves is a relevant challenge. Clustering methods are interesting tools for knowledge modeling because they are unsupervised techniques that allow hidden behavior patterns to be found in large data sets. This contribution presents a study on the utility of different clustering techniques for obtaining behavior patterns of travel time. The study analyzed three clustering techniques (K-medoid, Diana, and Hclust), examining how two key factors of these techniques, the distance metric and the number of clusters, affect the results obtained. The study was conducted using transport activity data provided by a public transport operator.
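
To make the two factors concrete, the sketch below shows a simple alternating k-medoids procedure in which both the distance metric and the number of clusters are parameters. It is an assumption-laden illustration, not the study's code: the function names `k_medoids` and `manhattan`, the naive initialisation, and the travel-time values are all hypothetical, and the alternating update is a simplification rather than the full PAM algorithm.

```python
def manhattan(a, b):
    """Manhattan (L1) distance; any other metric with the same signature can be swapped in."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medoids(points, k, distance, iterations=10):
    """Alternating k-medoids: assign to nearest medoid, then re-pick each medoid."""
    medoids = list(points[:k])  # naive initialisation: first k points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, medoids[i]))
            clusters[nearest].append(p)
        # New medoid: the member with the smallest total distance to its cluster.
        new_medoids = [
            min(members, key=lambda m: sum(distance(m, q) for q in members))
            if members else medoids[i]
            for i, members in enumerate(clusters)
        ]
        if new_medoids == medoids:  # converged
            break
        medoids = new_medoids
    return medoids, clusters


# Hypothetical travel times in minutes, as 1-D points.
travel_times = [(10,), (11,), (12,), (30,), (31,), (33,)]
medoids, clusters = k_medoids(travel_times, k=2, distance=manhattan)
```

With these inputs the medoids are `(11,)` and `(31,)`, each an actual observation, which is what distinguishes medoid-based methods from centroid-based ones. Changing `k` or passing a different `distance` function changes the grouping, which is the sensitivity the study examines.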

A comprehensive review of fuzzy-based clustering techniques in wireless sensor networks

Sensor Review ◽

10.1108/sr-11-2016-0254 ◽

2017 ◽

Vol 37 (3) ◽

pp. 289-304 ◽

Cited By ~ 4

Author(s):

Manjeet Singh ◽

Surender Kumar Soni

Keyword(s):

Cluster Head ◽

Sensor Nodes ◽

Tabular Form ◽

Clustering Methods ◽

Content Type ◽

Clustering Techniques ◽

Variables Selection ◽

Comprehensive Survey ◽

Decision Making Processes ◽

Art Research

Purpose: This paper aims to discuss a comprehensive survey on fuzzy-based clustering techniques. The determination of an appropriate sensor node as a cluster head straightforwardly affects a network’s lifetime. Clustering often possesses some uncertainties in determining suitable sensor nodes as a cluster head. Owing to various variables, selection of a suitable node as a cluster head is a perplexing decision. Fuzzy logic is capable of handling uncertainties and improving decision-making processes even with insufficient information. Then, state-of-the-art research in the field of clustering techniques has been reviewed.

Design/methodology/approach: The literature is presented in a tabular form with merits and limitations of each technique. Furthermore, the various techniques are compared graphically and classified in a tabular form and the flowcharts of important algorithms are presented with pseudocodes.

Findings: This paper comprehends the importance and distinction of different fuzzy-based clustering methods which are further supportive in designing more efficient clustering protocols.

Originality/value: This paper fulfills the need of a review paper in the field of fuzzy-based clustering techniques because no other paper has reviewed all the fuzzy-based clustering techniques. Furthermore, none of them has presented literature in a tabular form or presented flowcharts with pseudocodes of important techniques.

A Semi-Supervised Framework for MMMs-Induced Fuzzy Co-Clustering with Virtual Samples

Advances in Fuzzy Systems ◽

10.1155/2016/5206048 ◽

2016 ◽

Vol 2016 ◽

pp. 1-8 ◽

Cited By ~ 1

Author(s):

Daiji Tanaka ◽

Katsuhiro Honda ◽

Seiki Ubukata ◽

Akira Notsu

Keyword(s):

Supervised Learning ◽

Structural Information ◽

Experimental Results ◽

Virtual Sample ◽

Supervised Clustering ◽

Virtual Samples ◽

Additional Costs ◽

Class Labels ◽

Classification Quality

Although the goal of clustering is to reveal structural information from unlabeled datasets, in cases with partial structural supervision, semi-supervised clustering is expected to improve partition quality. However, in many real applications, providing a sufficient number of supervised objects with class labels may incur additional costs. A virtual sample approach is a practical technique for improving classification quality in semi-supervised learning, in which additional virtual samples are generated from supervised objects. In this research, the virtual sample approach is adopted in semi-supervised fuzzy co-clustering, where the goal is to reveal object-item pairwise cluster structures from co-occurrence information among them. Several experimental results demonstrate the characteristics of the proposed approach.

Semi-supervised clustering techniques for categorization of text documents

10.32657/10356/65400 ◽

2015 ◽

Author(s):

Yang Yan

Keyword(s):

Text Documents ◽

Clustering Techniques ◽

Supervised Clustering
