Supervised Regression Clustering

2016 ◽  
Vol 3 (4) ◽  
pp. 21-40 ◽  
Author(s):  
Ali Fallah Tehrani ◽  
Diane Ahrens

Clustering techniques typically group similar instances based on their individual attributes, on the assumption that similar instances have similar attribute characteristics. By contrast, clustering similar instances with respect to a specific behavior is framed as supervised learning; an example is asking which fashion products behave similarly in terms of sales. Conventional clustering methods cannot handle this case, since they treat all attributes in the same manner: they do not consider any response, and they assume that all attributes carry the same importance. Clustering instances with respect to a response, however, leads to better data analytics. In this research, the authors introduce an approach for supervised clustering and show its advantages in terms of data analytics as well as prediction. To verify the feasibility and performance of this approach, the authors conducted several experiments on a real dataset from the apparel industry.

Author(s):  
Nurshazwani Muhamad Mahfuz ◽  
Marina Yusoff ◽  
Zakiah Ahmad

Clustering plays an important role as an unsupervised learning method in data analytics, assisting with many real-world problems such as image segmentation, object recognition, and information retrieval. Traditional clustering techniques often struggle because outliers and noisy data lead to non-optimal results. This paper reviews single clustering methods that have been applied in various domains. The aim is to identify potentially suitable applications and aspects of the methods that could be improved. Three categories of single clustering methods are suggested, which should help researchers understand the clustering aspects and determine the requirements for a clustering method for a given application, based on the state of the art in previous research findings.


Author(s):  
Baoying Wang ◽  
Imad Rahal ◽  
Richard Leipold

Data clustering is a discovery process that partitions a data set into groups (clusters) such that data points within the same group have high similarity while being very dissimilar to points in other groups (Han & Kamber, 2001). The ultimate goal of data clustering is to discover natural groupings in a set of patterns, points, or objects without prior knowledge of any class labels. In fact, in the machine-learning literature, data clustering is typically regarded as a form of unsupervised learning as opposed to supervised learning. In unsupervised learning or clustering, there is no training function as in supervised learning. There are many applications for data clustering including, but not limited to, pattern recognition, data analysis, data compression, image processing, understanding genomic data, and market-basket research.


2020 ◽  
Vol 10 (12) ◽  
pp. 4176 ◽  
Author(s):  
Loris Nanni ◽  
Andrea Rigo ◽  
Alessandra Lumini ◽  
Sheryl Brahnam

In this work, we combine a Siamese neural network and different clustering techniques to generate a dissimilarity space that is then used to train a support vector machine (SVM) for automated animal audio classification. The animal audio datasets used are (i) bird sounds and (ii) cat sounds, both of which are freely available. We exploit different clustering methods to reduce the spectrograms in the dataset to a number of centroids that are used to generate the dissimilarity space through the Siamese network. Once computed, we use the dissimilarity space to generate a vector space representation of each pattern, which is then fed into the SVM to classify a spectrogram by its dissimilarity vector. Our study shows that the proposed approach based on dissimilarity space performs well on both classification problems without ad hoc optimization of the clustering methods. Moreover, results show that the fusion of CNN-based approaches applied to the animal audio classification problem works better than the stand-alone CNNs.
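
The dissimilarity-space idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means stands in for the clustering step, Euclidean distance stands in for the dissimilarity learned by the Siamese network, and toy 2-D points stand in for spectrograms. All function names are hypothetical, and the final SVM step is omitted.

```python
import math
import random


def euclidean(a, b):
    """Stand-in dissimilarity (assumption: the paper learns this with a Siamese network)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means, used here only to reduce the dataset to k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: euclidean(p, centroids[j]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep the old one if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def dissimilarity_vector(pattern, centroids, dissimilarity=euclidean):
    """Represent a pattern by its dissimilarity to every centroid."""
    return [dissimilarity(pattern, c) for c in centroids]


# Toy patterns (hypothetical data): two well-separated groups.
patterns = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids = kmeans(patterns, k=2)
space = [dissimilarity_vector(p, centroids) for p in patterns]
```

Each row of `space` has one entry per centroid, so the number of clusters sets the dimensionality of the representation that a downstream classifier would see.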


Proceedings ◽  
2019 ◽  
Vol 31 (1) ◽  
pp. 18
Author(s):  
Cristóbal ◽  
Padrón ◽  
Quesada-Arencibia ◽  
Alayón ◽  
Blasio ◽  
...  

In road-based mass transit systems, travel time is a key factor affecting quality of service. For this reason, understanding how travel time behaves is a relevant challenge. Clustering methods are useful tools for knowledge modeling because they are unsupervised techniques that allow hidden behavior patterns to be found in large data sets. This contribution presents a study on the utility of different clustering techniques for obtaining behavior patterns of travel time. The study analyzed three clustering techniques (K-medoid, Diana, and Hclust) and examined how two key factors of these techniques, the distance metric and the number of clusters, affect the results obtained. The study was conducted using transport activity data provided by a public transport operator.
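
To make the two factors named in this abstract concrete, the following is a minimal sketch of a k-medoids loop in which both the distance metric and the number of clusters are parameters. It is a simplified alternating scheme written for illustration, not the study's code; the function names and the travel-time figures are hypothetical.

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids, metric):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: metric(p, points[medoids[j]]))
        for p in points
    ]


def k_medoids(points, k, metric, max_iter=50):
    """Simplified k-medoids: alternate assignment and medoid update until stable."""
    medoids = list(range(k))  # deterministic start: the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids, metric)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:
                new_medoids.append(medoids[j])  # keep the old medoid for an empty cluster
                continue
            # The new medoid is the member with the smallest total distance to its cluster.
            best = min(members, key=lambda i: sum(metric(points[i], points[m]) for m in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, metric)


# Hypothetical trips: travel time in minutes on two route segments.
trips = [(10, 12), (11, 13), (10, 11), (25, 30), (27, 29), (26, 31)]
results = {
    name: k_medoids(trips, k=2, metric=metric)
    for name, metric in (("manhattan", manhattan), ("euclidean", euclidean))
}
```

Comparing the entries of `results` across metrics, and repeating the call with other values of `k`, is one simple way to observe how these two choices change the resulting groups.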


Sensor Review ◽  
2017 ◽  
Vol 37 (3) ◽  
pp. 289-304 ◽  
Author(s):  
Manjeet Singh ◽  
Surender Kumar Soni

Purpose: This paper presents a comprehensive survey of fuzzy-based clustering techniques. The choice of an appropriate sensor node as cluster head directly affects a network’s lifetime. Clustering often involves uncertainty in determining which sensor nodes are suitable as cluster heads, and because many variables are involved, selecting a suitable node is a difficult decision. Fuzzy logic is capable of handling uncertainties and improving decision-making processes even with insufficient information. State-of-the-art research in the field of clustering techniques is then reviewed.
Design/methodology/approach: The literature is presented in tabular form with the merits and limitations of each technique. Furthermore, the various techniques are compared graphically and classified in tabular form, and flowcharts of important algorithms are presented with pseudocode.
Findings: This paper covers the importance and distinctions of different fuzzy-based clustering methods, which supports the design of more efficient clustering protocols.
Originality/value: This paper fulfills the need for a review paper in the field of fuzzy-based clustering techniques because no other paper has reviewed all the fuzzy-based clustering techniques. Furthermore, none of them has presented the literature in tabular form or presented flowcharts with pseudocode for important techniques.


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Daiji Tanaka ◽  
Katsuhiro Honda ◽  
Seiki Ubukata ◽  
Akira Notsu

Although the goal of clustering is to reveal structural information from unlabeled datasets, semi-supervised clustering is expected to improve partition quality in cases where partial structural supervision is available. However, in many real applications, providing a sufficient number of supervised objects with class labels may incur additional cost. The virtual sample approach is a practical technique for improving classification quality in semi-supervised learning, in which additional virtual samples are generated from supervised objects. In this research, the virtual sample approach is adopted in semi-supervised fuzzy co-clustering, where the goal is to reveal object-item pairwise cluster structures from co-occurrence information among them. Several experimental results demonstrate the characteristics of the proposed approach.

