A novel multi-label classification algorithm based on K-nearest neighbor and random walk

2020 ◽  
Vol 16 (3) ◽  
pp. 155014772091189 ◽  
Author(s):  
Zhen-Wu Wang ◽  
Si-Kai Wang ◽  
Ben-Ting Wan ◽  
William Wei Song

The multi-label classification problem occurs in many real-world tasks where an object is naturally associated with multiple labels, that is, concepts. The integration of the random walk approach into multi-label classification methods has attracted many researchers' attention. One challenge of random walk-based multi-label classification algorithms is constructing the random walk graph, which may lead to poor classification quality and high algorithmic complexity. In this article, we propose a novel multi-label classification algorithm based on the random walk graph and the K-nearest neighbor algorithm (named MLRWKNN). This method constructs the vertex set of the random walk graph from the K-nearest neighbor training samples of a given test instance and the edge set from the correlations among the labels of those training samples, thus considerably reducing time and space overhead. The proposed method improves the similarity measurement by differentiating and integrating discrete and continuous features, which reflects the relationships between instances more accurately. A label prediction method is devised to reduce the subjectivity of the traditional threshold method. Experimental results on four metrics demonstrate that the proposed method outperforms seven state-of-the-art multi-label classification algorithms used for comparison and achieves a significant improvement in multi-label classification.
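The construction described above can be illustrated with a short, hedged sketch (not the authors' MLRWKNN code): for one test instance, the vertices are its K nearest training samples, edge weights come from label-set overlap between those samples, and a random walk with restart scores the candidate labels. The function name, the Jaccard edge weighting, and the restart parameter are our own illustrative choices.

```python
# Illustrative sketch (not the authors' MLRWKNN): score labels for one test
# instance by random-walking over a graph built from its K nearest neighbors.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_random_walk_scores(X_train, Y_train, x_test, k=10, restart=0.15, iters=50):
    """X_train: (n, d) features; Y_train: (n, q) binary label matrix."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    idx = nn.kneighbors(x_test.reshape(1, -1), return_distance=False)[0]
    Yk = Y_train[idx]                      # labels of the K neighbors (k x q)

    # Edge weights: label-set overlap (Jaccard) between neighbor pairs.
    inter = Yk @ Yk.T
    union = Yk.sum(1, keepdims=True) + Yk.sum(1) - inter
    W = np.where(union > 0, inter / np.maximum(union, 1), 0.0)
    np.fill_diagonal(W, 0.0)

    # Row-normalized transition matrix; uniform restart vector.
    row = W.sum(1, keepdims=True)
    P = np.where(row > 0, W / np.maximum(row, 1e-12), 1.0 / k)
    p = np.full(k, 1.0 / k)
    for _ in range(iters):                 # random walk with restart
        p = (1 - restart) * p @ P + restart / k

    # Label scores: walk probability mass reaching neighbors carrying each label.
    return p @ Yk                          # shape (q,)
```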

Author(s):  
Jiahua Jin ◽  
Lu Lu

Hotel social media provides access to dissatisfied customers and their experiences with services. However, owing to the massive number of topics and posts in social media and the sparse distribution of complaint-related posts, manually identifying complaints is inefficient and time-consuming. In this study, we propose a supervised learning method comprising training-sample enlargement and classifier construction. We first identified reliable complaint and non-complaint samples from the unlabeled dataset by using a small set of labeled samples as training data. Combining the labeled samples and the enlarged samples, the support vector machine and k-nearest neighbor classification algorithms were then adopted to build binary classifiers during the classifier construction process. Experimental results indicate that the proposed method can identify complaints from social media efficiently, especially when the number of labeled training samples is small. This study provides an efficient approach for hotel companies to distinguish a particular kind of consumer complaint from the large amount of unrelated information in hotel social media.
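A rough sketch of this two-stage pipeline is given below, assuming TF-IDF features, a probabilistic SVM as the seed model, and a 0.9 confidence threshold for pseudo-labeling; these details are our assumptions, not the paper's exact settings.

```python
# Rough sketch of the two-stage idea: enlarge a small labeled set with
# confidently predicted unlabeled posts, then train SVM / k-NN classifiers.
import numpy as np
from scipy.sparse import vstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def enlarge_and_train(labeled_texts, labels, unlabeled_texts, threshold=0.9):
    vec = TfidfVectorizer(min_df=2)
    X_lab = vec.fit_transform(labeled_texts)
    X_unl = vec.transform(unlabeled_texts)
    labels = np.asarray(labels)

    # Step 1: training-sample enlargement -- keep unlabeled posts that the seed
    # classifier labels with high confidence (pseudo-labels).
    seed = SVC(kernel="linear", probability=True).fit(X_lab, labels)
    proba = seed.predict_proba(X_unl)
    keep = proba.max(axis=1) >= threshold
    X_all = vstack([X_lab, X_unl[keep]])
    y_all = np.concatenate([labels, seed.classes_[proba[keep].argmax(axis=1)]])

    # Step 2: classifier construction on the enlarged training set.
    svm = SVC(kernel="linear").fit(X_all, y_all)
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_all, y_all)
    return vec, svm, knn
```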


2021 ◽  
Vol 15 (4) ◽  
pp. 735-744
Author(s):  
Putri Sri Astuti ◽  
Memi Nor Hayati ◽  
Rito Goejantoro

Classification is the process of grouping objects that have the same characteristics into several categories. This study applies a combination of classification algorithms, namely Bootstrap Aggregating K-Nearest Neighbor, to credit scoring analysis. The aim is to classify the credit payment status of electronic goods and furniture at PT KB Finansia Multi Finance in 2020 and to determine the resulting level of accuracy. Credit payment status is grouped into two categories, namely smooth and not smooth. Seven independent variables are used to describe the characteristics of the debtor: age, number of dependents, length of stay, years of service, income, amount of payment, and payment period. The application of the classification algorithm to credit scoring analysis is expected to assist creditors in deciding whether to accept or reject credit applications from prospective debtors. The results show that the accuracy obtained by the Bootstrap Aggregating K-Nearest Neighbor algorithm with a 90:10 proportion, m=80%, C=73, and K=5 was the best, at 92.308%.
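As a hedged illustration of bagged k-NN with the reported settings (90:10 split, m=80%, K=5, and C=73 interpreted here as the number of bootstrap replicates), the sketch below uses synthetic data with seven features in place of the proprietary credit dataset.

```python
# Illustrative Bootstrap Aggregating k-NN (bagged k-NN); synthetic data stands
# in for the credit-scoring dataset, which is not publicly available.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=7, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.10, random_state=0)

bag_knn = BaggingClassifier(
    estimator=KNeighborsClassifier(n_neighbors=5),  # base_estimator= on older scikit-learn
    n_estimators=73,          # C: interpreted as the number of bootstrap replicates
    max_samples=0.8,          # m: fraction of training data drawn per replicate
    random_state=0,
).fit(X_tr, y_tr)

print("accuracy:", accuracy_score(y_te, bag_knn.predict(X_te)))
```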


2020 ◽  
Vol 10 (2) ◽  
pp. 152-158
Author(s):  
Iswanto ◽  
Yuliana Melita Pranoto ◽  
Reddy Alexandro Harianto

Even with a sophisticated application, traders often have trouble deciding whether to BUY or SELL in forex trading. This is because time series predictions often swing between high and low values, so a recommendation system is needed to overcome this problem. Applying a classification algorithm to a recommendation system that supports BUY-SELL decisions is one suitable way to address it. The K-Nearest Neighbor (K-NN) algorithm was chosen because K-NN can be used to build a recommendation system that classifies data based on the closest distance. This system is designed to assist traders in making BUY-SELL decisions based on predicted data. In ten trials using ARIMA predictions, the system's recommendations, compared against actual market prices, achieved an average profit that exceeded the target of 7% per week.
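A hedged sketch of how such a recommendation step might look: a K-NN classifier maps forecast-derived features to a BUY or SELL label. The features, labels, and sample values below are purely illustrative and not the paper's design.

```python
# Hedged sketch of the recommendation idea: a k-NN classifier maps forecast
# features (e.g., predicted high/low/close changes) to a BUY or SELL label.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Each row: [predicted_change_high, predicted_change_low, predicted_change_close]
X_hist = np.array([[ 0.004,  0.001,  0.003],
                   [ 0.002, -0.001,  0.001],
                   [-0.003, -0.005, -0.004],
                   [-0.001, -0.004, -0.002]])
y_hist = np.array(["BUY", "BUY", "SELL", "SELL"])   # outcomes of past trades

recommender = KNeighborsClassifier(n_neighbors=3).fit(X_hist, y_hist)

# Recommend an action for the latest ARIMA forecast (illustrative values).
latest_forecast = np.array([[0.003, 0.000, 0.002]])
print(recommender.predict(latest_forecast)[0])       # e.g. "BUY"
```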


2020 ◽  
Vol 19 ◽  

In the paper, some fuzzy classification algorithms based upon a nearest neighbor decision rule are considered in terms of the pattern recognition algorithms which are based on the computation of estimates (the so-called AEC model). It is shown that the fuzzy K nearest neighbor algorithm can be assigned to the AEC class. In turn, it is found that some standard AEC algorithms, which depend on a number of numerical parameters, can be used as fuzzy classification algorithms. Yet among them there exist algorithms extremal with respect to these parameters. Such algorithms provide maximum values of the associated performance measures.


2013 ◽  
Vol 706-708 ◽  
pp. 1928-1931
Author(s):  
Yu Ma ◽  
Yu Ling Gao ◽  
Shao Yun Song

The traditional k-Nearest Neighbor algorithm (KNN for short) is commonly used in spatial classification; however, this method suffers from slow searching. To avoid this disadvantage, this paper puts forward a new K-nearest neighbor spatial classification algorithm based on spatial predicates. The method searches for the set of objects that are similar to the test object in the spatial sense and uses spatial predicates to guide the search, which narrows the search range and reduces the running time of the KNN algorithm.
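The following sketch illustrates the general idea only (not the paper's algorithm): a cheap spatial predicate prunes the candidate set before the exact k-NN vote. The bounding-box predicate and radius are our own illustrative choices.

```python
# Sketch of the general idea: apply a spatial predicate first (here, a simple
# axis-aligned "within distance r" test) to shrink the candidate set, then run
# exact k-NN only on the survivors.
import numpy as np

def knn_with_spatial_predicate(points, labels, query, k=3, r=5.0):
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Spatial predicate: keep only objects inside the query's r-box
    # (a cheap coordinate test before any distance computation).
    mask = np.all(np.abs(points - query) <= r, axis=1)
    cand, cand_labels = points[mask], labels[mask]
    if len(cand) < k:                       # fall back to the full set if needed
        cand, cand_labels = points, labels
    d = np.linalg.norm(cand - query, axis=1)
    nearest = np.argsort(d)[:k]
    # Majority vote among the k nearest surviving objects.
    vals, counts = np.unique(cand_labels[nearest], return_counts=True)
    return vals[np.argmax(counts)]
```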


Author(s):  
Chetna Kaushal ◽  
Deepika Koundal

Big data refers to the huge sets of data that are very common these days due to the increase in internet utilities. Data generated from social media is a common example. This paper summarizes big data and the ways in which it has been utilized. Data mining is essentially a means of deriving indispensable knowledge from extensively vast fractions of data, which is quite challenging to interpret with conventional methods. The paper mainly focuses on issues related to clustering techniques in big data. For the classification of big data, the existing classification algorithms are concisely reviewed, and the k-nearest neighbor algorithm is then chosen among them and described along with an example.


Author(s):  
Veronica Ong ◽  
Derwin Suhartono

The growth of computer vision technology has aided society with various kinds of tasks. One of these tasks is recognizing text contained in an image, usually referred to as Optical Character Recognition (OCR). Many kinds of algorithms can be implemented in an OCR system; K-Nearest Neighbor is one such algorithm. This research aims to explain the process behind the OCR mechanism using the K-Nearest Neighbor algorithm, one of the most influential machine learning algorithms, and to find out how precise the algorithm is in an OCR program. To that end, a simple OCR program that classifies capital letters of the alphabet was built to produce and compare real results. The research yielded a maximum accuracy of 76.9% with 200 training samples per letter. Reasons are also given as to why the program reaches this level of accuracy.
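A minimal k-NN character classifier of the kind described can be sketched as follows; since the original capital-letter dataset is not available, scikit-learn's bundled digit images stand in for the training samples.

```python
# Minimal k-NN character classifier. The paper trains on capital letters with up
# to 200 samples per class; here the bundled 8x8 digit images are a stand-in.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

digits = load_digits()                               # 8x8 grayscale character images
X = digits.images.reshape(len(digits.images), -1)    # flatten pixels into feature vectors
X_tr, X_te, y_tr, y_te = train_test_split(X, digits.target, random_state=0)

ocr = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, ocr.predict(X_te)))
```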


The aim of this study is to predict a person's stress using machine learning classifiers. The system classifies a person's stress as either High or Low. Of the various classification algorithms available, nine were chosen for this study: the K-Nearest Neighbor classifier, Support Vector Machine with an RBF kernel, Decision Tree, Random Forest, Bagging classifier, AdaBoost, Voting classifier, Logistic Regression, and MLP classifier. The algorithms were applied to the same dataset, obtained from a GitHub repository labelled Stress classifier with AutoML. The accuracy of each algorithm was measured, and the classification algorithm with the best accuracy was determined. On comparison, the K-Nearest Neighbor algorithm had the best accuracy, with an accuracy rate of 79.3% for physiological stress prediction. While the other algorithms had varying accuracies, the K-Nearest Neighbor algorithm was the most consistent.
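A hedged sketch of this kind of comparison is shown below, with synthetic data standing in for the GitHub dataset; hyperparameters are scikit-learn defaults rather than the study's settings.

```python
# Compare the listed classifiers on one dataset, as in the study; synthetic
# data stands in for the "Stress classifier with AutoML" GitHub dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier, BaggingClassifier,
                              AdaBoostClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)

models = {
    "k-NN": KNeighborsClassifier(),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
models["Voting"] = VotingClassifier(
    estimators=[("knn", models["k-NN"]), ("lr", models["Logistic Regression"]),
                ("rf", models["Random Forest"])])

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()   # 5-fold mean accuracy
    print(f"{name:20s} {score:.3f}")
```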


2021 ◽  
Vol 13 (2) ◽  
pp. 76-83
Author(s):  
Ridho Ananda ◽  
Agi Prasetiadi

Classification is one of the data mining topics concerned with predicting the group to which an object belongs. The prediction can be performed using similarity measures, classification trees, or regression. Procrustes, on the other hand, refers to a technique for matching two configurations that has been applied to outlier detection. The results suggest that Procrustes has the potential to tackle the misclassification problem when outliers are treated as misclassified objects. Therefore, the Procrustes classification algorithm (PrCA) and the Procrustes nearest neighbor classification algorithm (PNNCA) are proposed in this paper. The results of these algorithms were compared with classical classification algorithms, namely k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), AdaBoost (AB), Random Forest (RF), Logistic Regression (LR), and Ridge Regression (RR). The data used were the iris, cancer, liver, seeds, and wine datasets. The minimum and maximum accuracy values obtained by PrCA were 0.610 and 0.925, while those of PNNCA were 0.610 and 0.963. PrCA was generally better than k-NN, SVM, and AB, while PNNCA was generally better than k-NN, SVM, AB, and RF. Based on these results, PrCA and PNNCA deserve to be proposed as a new approach to the classification process.
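As a hedged illustration (not the authors' PrCA/PNNCA implementation), the sketch below uses SciPy's Procrustes disparity as a matching criterion: a test configuration is assigned to the class whose reference configuration it matches best. The per-class reference configurations and the toy shapes are our own assumptions.

```python
# Illustrative use of Procrustes matching as a classification criterion: assign
# the test configuration to the class with the smallest Procrustes disparity.
import numpy as np
from scipy.spatial import procrustes

def procrustes_classify(test_config, class_configs):
    """test_config: (n, d) array; class_configs: dict label -> (n, d) array."""
    disparities = {label: procrustes(ref, test_config)[2]
                   for label, ref in class_configs.items()}
    return min(disparities, key=disparities.get)

# Toy example: two reference "shapes" and a noisy copy of the first one.
rng = np.random.default_rng(0)
square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
line = np.array([[0., 0.], [1., 1.], [2., 2.], [3., 3.]])
test = square + rng.normal(scale=0.05, size=square.shape)
print(procrustes_classify(test, {"square": square, "line": line}))  # "square"
```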

