Nearest Neighbour (NN) and k-Nearest Neighbour (kNN) Supervised Classification Algorithms

Traditional supervised classification algorithms require a large number of labelled examples to perform accurately. Semi-supervised classification algorithms attempt to overcome this major limitation by also using unlabelled examples. Unlabelled examples have also been used to improve nearest neighbour text classification in a method called bridging. In this paper, we propose the use of bridging in a semi-supervised setting. We introduce a new bridging algorithm that can be used as a base classifier in most semi-supervised approaches. We empirically show that the classification performance of two semi-supervised algorithms, self-learning and co-training, improves with the use of our new bridging algorithm in comparison to using the standard classifier, JRipper. We propose a similarity metric for short texts and also study the performance of self-learning with a number of instance selection heuristics.
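The self-learning loop mentioned in this abstract can be illustrated with a small sketch. It is not the paper's bridging algorithm: it assumes a plain 1-nearest-neighbour base classifier on numeric feature tuples with Euclidean distance, and a simple selection heuristic (take the unlabelled point closest to any labelled point). The function names `nearest_label` and `self_train` are hypothetical.

```python
import math

def nearest_label(x, labelled):
    """Return (label, distance) of the labelled example closest to x."""
    point, label = min(labelled, key=lambda pair: math.dist(x, pair[0]))
    return label, math.dist(x, point)

def self_train(labelled, unlabelled, rounds=10):
    """Self-learning sketch: each round, label one unlabelled point and
    move it into the labelled set used by the 1-NN base classifier."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(rounds):
        if not pool:
            break
        # Assumed selection heuristic: the pool point with the smallest
        # distance to its nearest labelled neighbour is treated as the
        # most confident prediction.
        scored = [(nearest_label(x, labelled), x) for x in pool]
        (label, _dist), chosen = min(scored, key=lambda s: s[0][1])
        labelled.append((chosen, label))
        pool.remove(chosen)
    return labelled
```

Other instance selection heuristics would only change how `chosen` is picked inside the loop.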


Comparative analysis of classification algorithms for chronic kidney disease diagnosis

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v8i4.1621 ◽

2019 ◽

Vol 8 (4) ◽

Author(s):

Zainuri Saringat ◽

Aida Mustapha ◽

R. D. Rohmat Saedudin ◽

Noor Azah Samsudin

Keyword(s):

Chronic Kidney Disease ◽

Kidney Disease ◽

Supervised Classification ◽

Bone Fractures ◽

Kidney Diseases ◽

Disease Diagnosis ◽

Classification Model ◽

Support Vector ◽

Classification Algorithms ◽

Nearest Neighbour

Chronic Kidney Disease (CKD) is one of the leading causes of death and is contributed to by other illnesses such as diabetes, hypertension, lupus, anemia or weak bones that lead to bone fractures. Early prediction of CKD is important in order to contain the disease. However, instead of predicting the severity of CKD, the objective of this paper is to predict the diagnosis of CKD based on the symptoms or attributes observed in a particular case, whether the stage is acute or chronic. To achieve this, a classification model is proposed to label the stage of severity for kidney disease patients. The experiments then investigated the performance of the proposed classification model based on eight supervised classification algorithms, which are ZeroR, Rule Induction, Support Vector Machine, Naïve Bayes, Decision Tree, Decision Stump, k-Nearest Neighbour, and Classification via Regression. The performance of all the classifiers is evaluated based on accuracy, precision, and recall. The results showed that the regression classifier performs best in the kidney diagnostic procedure.
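The three evaluation measures named in this abstract can be computed from true and predicted labels as in the sketch below. This is a generic illustration for a two-class problem, not the paper's evaluation code; the function name `binary_metrics` and the example labels are assumptions.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Illustrative labels only; they are not data from the study.
truth = ["chronic", "chronic", "chronic", "acute", "acute"]
predicted = ["chronic", "chronic", "acute", "chronic", "acute"]
print(binary_metrics(truth, predicted, positive="chronic"))
```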


Adapting Supervised Classification Algorithms to Arbitrary Weak Label Scenarios

Advances in Intelligent Data Analysis XVI - Lecture Notes in Computer Science ◽

10.1007/978-3-319-68765-0_21 ◽

2017 ◽

pp. 247-259 ◽

Cited By ~ 1

Author(s):

Miquel Perelló-Nieto ◽

Raúl Santos-Rodríguez ◽

Jesús Cid-Sueiro

Keyword(s):

Supervised Classification ◽

Classification Algorithms


Classification and Association Rule Mining Technique for Predicting Chronic Kidney Disease

Journal of Information & Knowledge Management ◽

10.1142/s0219649220400158 ◽

2020 ◽

Vol 19 (01) ◽

pp. 2040015

Author(s):

Ahmad Alaiad ◽

Hassan Najadat ◽

Belal Mohsen ◽

Khaled Balhaf

Keyword(s):

Chronic Kidney Disease ◽

Kidney Disease ◽

Association Rule ◽

Association Rule Mining ◽

Support Vector ◽

Classification Algorithms ◽

Nearest Neighbour ◽

Rule Mining ◽

Medical Field ◽

Efficient System

Background and objective: Chronic kidney disease (CKD) is one of the deadly diseases that can affect many vital organs in the human body, such as the heart, liver, and lungs. Many individuals might be at an early stage of kidney disease and not have any signs, which might lead to sudden death. Previous research showed that early prediction of CKD is very important in the medical field for physicians’ decision-making and patients’ health and life. To this end, constructing an efficient prediction system for CKD, which is the goal of this paper, often reduces medical errors and overall healthcare cost. Methods: Classification and association rule mining techniques were integrated and utilised to construct an efficient system for predicting and diagnosing CKD and its causes using Weka and SPSS as platform environments. In particular, five classification algorithms, namely, naive Bayes, decision tree, support vector machine, K-nearest neighbour, and JRip were used to achieve the research goal. In addition, the Apriori algorithm was used to discover strong relationship rules between attributes. The experiments were conducted on a real medical dataset collected from hospitals and patient monitoring systems. Results: The experiments achieved a high accuracy of 98.50% for the K-nearest neighbour (KNN) classifier and 96.00% when using the classifier based on association rules (JRip). Conclusions: We conclude by showing that applying an integrative approach that combines classification algorithms and association rule mining can significantly improve the classification accuracy and be more useful for CKD prediction. This research also has several theoretical and practical implications for the medical field and healthcare industry.


CAN A MACHINE LEARNING ALGORITHM IDENTIFY SARS-COV-2 VARIANTS BASED ON CONVENTIONAL rRT-PCR? PROOF OF CONCEPT

10.1101/2021.11.12.21266286 ◽

2021 ◽

Author(s):

Jorge Cabrera Alvargonzalez ◽

Ana Larranaga Janeiro ◽

Sonia Perez ◽

Javier Martinez Torres ◽

Lucia Martinez Lamas ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Supervised Classification ◽

Learning Algorithm ◽

Support Vector ◽

Classification Algorithms ◽

Machine Learning Algorithm ◽

Proof Of Concept ◽

The Past ◽

Number Of Cycles

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been and remains one of the major challenges humanity has faced thus far. Over the past few months, large amounts of information have been collected that are only now beginning to be assimilated. In the present work, the existence of residual information in the massive numbers of rRT-PCRs that tested positive out of the almost half a million tests that were performed during the pandemic is investigated. This residual information is believed to be highly related to a pattern in the number of cycles that are necessary to detect positive samples as such. Thus, a database of more than 20,000 positive samples was collected, and two supervised classification algorithms (a support vector machine and a neural network) were trained to temporally locate each sample based solely and exclusively on the number of cycles determined in the rRT-PCR of each individual. Finally, the results obtained from the classification show how the appearance of each wave is coincident with the surge of each of the variants present in the region of Galicia (Spain) during the development of the SARS-CoV-2 pandemic and clearly identified with the classification algorithm.


A Comparative Study of FCA-Based Supervised Classification Algorithms

Concept Lattices - Lecture Notes in Computer Science ◽

10.1007/978-3-540-24651-0_26 ◽

2004 ◽

pp. 313-320 ◽

Cited By ~ 12

Author(s):

Huaiyu Fu ◽

Huaiguo Fu ◽

Patrik Njiwoua ◽

Engelbert Mephu Nguifo

Keyword(s):

Comparative Study ◽

Supervised Classification ◽

Classification Algorithms


Semi-Supervised Learning

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch192 ◽

2011 ◽

pp. 1022-1027

Author(s):

Tobias Scheffer

Keyword(s):

Supervised Learning ◽

Supervised Classification ◽

Unlabeled Data ◽

Training Data ◽

Classification Algorithms ◽

Classification Problems

For many classification problems, unlabeled training data are inexpensive and readily available, whereas labeling training data imposes costs. Semi-supervised classification algorithms aim at utilizing information contained in unlabeled data in addition to the (few) labeled data.


Supervised Classification Algorithms in Machine Learning: A Survey and Review

Advances in Intelligent Systems and Computing - Emerging Technology in Modelling and Graphics ◽

10.1007/978-981-13-7403-6_11 ◽

2019 ◽

pp. 99-111 ◽

Cited By ~ 6

Author(s):

Pratap Chandra Sen ◽

Mahimarnab Hajra ◽

Mitadru Ghosh

Keyword(s):

Machine Learning ◽

Supervised Classification ◽

Classification Algorithms


Assessment of sedimentation process in flood water spreading system using IRS (P5) and supervised classification algorithms (case study: Dahandar plain, Minab city, south of Iran)

Remote Sensing Applications Society and Environment ◽

10.1016/j.rsase.2019.100269 ◽

2019 ◽

Vol 16 ◽

pp. 100269

Author(s):

Maryam Abbaszadeh ◽

Rasool Mahdavi ◽

Marzieh Rezai

Keyword(s):

Supervised Classification ◽

Classification Algorithms ◽

Flood Water ◽

Sedimentation Process ◽

Water Spreading


Automated Classification of Regional Meteorological Events in a Coastal Area Using In Situ Measurements

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-19-0120.1 ◽

2020 ◽

Vol 37 (4) ◽

pp. 723-739 ◽

Cited By ~ 2

Author(s):

Anton Sokolov ◽

Egor Dmitriev ◽

Cyril Gengembre ◽

Hervé Delbarre

Keyword(s):

Discriminant Analysis ◽

Coastal Area ◽

Supervised Classification ◽

Nearest Neighbors ◽

Support Vector ◽

Classification Algorithms ◽

Quadratic Discriminant Analysis ◽

Automated Classification ◽

Sea Breezes

The problem of classifying atmospheric meteorological events, such as sea breezes, fogs, and high winds, in coastal areas is considered. In situ wind, temperature, humidity, pressure, radiance, and turbulence meteorological measurements are used as predictors. Local atmospheric events of 2013–14 were analyzed and classified manually using data of the measurement campaign in the coastal area of the English Channel in Dunkirk, France. The results of that categorization allowed the training of a few supervised classification algorithms using the data of an ultrasonic anemometer as predictors. The comparison was carried out for the K-nearest-neighbors classifier, support vector machine, and two Bayesian classifiers: quadratic discriminant analysis and Parzen–Rozenblatt window. The analysis showed that the K-nearest-neighbors and quadratic discriminant analysis classifiers give the best classification accuracy (up to 80% correctly classified meteorological events). The latter classifier has higher calculation speed and is less sensitive to unbalanced data and the overtraining problem. The most informative atmospheric parameters for event recognition were identified for each algorithm. The results obtained showed that supervised classification algorithms contribute to the automation of processing and analyzing local meteorological measurements.
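For readers unfamiliar with the K-nearest-neighbors classifier compared in this abstract, a minimal sketch follows. It assumes numeric feature tuples, Euclidean distance, and an unweighted majority vote; the function name `knn_predict`, the value `k=3`, and the toy data are illustrative assumptions and do not come from the study.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training examples, where `train` is a list of (features, label) pairs."""
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _features, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy two-feature data, invented for illustration only.
train = [
    ((0.0, 0.0), "calm"), ((0.0, 1.0), "calm"), ((1.0, 0.0), "calm"),
    ((5.0, 5.0), "sea breeze"), ((5.0, 6.0), "sea breeze"),
]
print(knn_predict(train, (0.2, 0.2)))
```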
