Identification of White Blood Cells Using Machine Learning Classification Based on Feature Extraction

Anwar Siswanto Musliman; Abdul Fadlil; Anton Yudhana

doi:10.15575/join.v6i1.704

Jurnal Online Informatika ◽

10.15575/join.v6i1.704 ◽

2021 ◽

Vol 6 (1) ◽

pp. 63

Author(s):

Anwar Siswanto Musliman ◽

Abdul Fadlil ◽

Anton Yudhana

Keyword(s):

Feature Extraction ◽

Blood Cells ◽

Nearest Neighbor ◽

White Blood Cells ◽

Training Data ◽

Classification Model ◽

Automatic Identification ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Cell Image

White blood cells, which comprise eosinophils, basophils, neutrophils, lymphocytes, and monocytes, are one of the parameters used in diagnosing various diseases. Manual identification takes a long time and tends to be subjective, depending on the staff's experience, so automatic identification of white blood cells is expected to be faster and more accurate. White blood cells are identified by examining a stained blood smear (SADT) under a digital microscope to obtain a cell image. The cell image is segmented in the HSV (Hue, Saturation, Value) color space, and features are extracted with the Gray Level Co-occurrence Matrix (GLCM) method using the Angular Second Moment (ASM), Contrast, Entropy, and Inverse Difference Moment (IDM) features. The purpose of this study was to identify white blood cells by comparing the classification accuracy of the K-nearest neighbor (KNN), Naïve Bayes Classification (NBC), and Multilayer Perceptron (MLP) methods. Classification used 100 training images and 50 testing images of white blood cells. Tests on the KNN, NBC, and MLP methods yielded accuracies of 82%, 80%, and 94%, respectively. Therefore, MLP was chosen as the best classification model for identifying white blood cells.
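The four GLCM features named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `glcm` and `glcm_features`, the symmetric normalization, the single pixel offset, and the base-2 logarithm for entropy are all assumptions made for illustration, using the textbook definitions of the features.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) = (1, 0) pairs each pixel with its right-hand neighbor.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """ASM, contrast, entropy, and IDM of a normalized GLCM `p`."""
    asm = contrast = entropy = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            asm += v * v
            contrast += (i - j) ** 2 * v
            idm += v / (1 + (i - j) ** 2)
            if v > 0:
                entropy -= v * math.log2(v)  # log base is an assumption
    return {"asm": asm, "contrast": contrast, "entropy": entropy, "idm": idm}

# Toy 4x4 image with 4 gray levels (made-up data, not from the study).
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = glcm_features(glcm(toy, levels=4))
```

The resulting feature dictionary would then be passed, one vector per cell image, to whichever classifier is being compared.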

Klasifikasi Sel Darah Putih Berdasarkan Ciri Warna dan Bentuk dengan Metode K-Nearest Neighbor (K-NN)

IJEIS (Indonesian Journal of Electronics and Instrumentation Systems) ◽

10.22146/ijeis.15254 ◽

2016 ◽

Vol 6 (2) ◽

pp. 151 ◽

Cited By ~ 1

Author(s):

Mizan Nur Khasanah ◽

Agus Harjoko ◽

Ika Candradewi

Keyword(s):

Image Processing ◽

Blood Cells ◽

Nearest Neighbor ◽

White Blood Cells ◽

Processing Technique ◽

Image Processing Technique ◽

K Nearest Neighbor ◽

Traditional Procedure ◽

Early Diagnose

The traditional procedure for classifying blood cells uses a microscope in the hematology laboratory to obtain information about blood cell types. It has become a cornerstone of the hematology laboratory for diagnosing and monitoring hematologic disorders. However, the manual procedure, which involves a series of laboratory tests, can take a while. Therefore, this research can be helpful in the early stages of automatically classifying white blood cells in the medical field. To shorten the time required and to support early diagnosis, image processing techniques based on blood cell morphology can be used. This research aims to classify white blood cells based on cell morphology with the k-nearest neighbor (k-NN) method. The image processing pipeline used the Hough circle transform, thresholding, and feature extraction, followed by classification with k-NN. Testing used 100 images whose cell types were to be identified. The test results showed a segmentation accuracy of 78% and a classification accuracy of 64%.

Machine Learning Classification and Feature Extraction of Arrhythmic ECG Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b3548.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 6-12

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Nearest Neighbor ◽

Extraction Process ◽

Support Vector ◽

Ecg Signal ◽

Data Sets ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Artificial Neural Network Ann

An electrocardiogram (ECG) records the electrical activity of the heart over a period of time. Detailed information about the condition of the heart is obtained by analyzing the ECG signal. The wavelet transform and the fast Fourier transform are among the methods used to analyze cardiac disease. The paper surveys ECG signal analysis and related studies on arrhythmic and non-arrhythmic data. It discusses an efficient feature extraction process for the electrocardiogram in which the six best P-QRS-T fragments, selected by position and priority, are studied. The survey examines the outcome of the system using various machine learning classification algorithms for feature extraction and analysis of ECG signals. Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Artificial Neural Network (ANN) are the most important algorithms used for this purpose. Several publicly available data sets are used for arrhythmia analysis, and among them the MIT-BIH ECG-ID database is the most widely used. Drawbacks and limitations are also discussed, and from these the future challenges and concluding remarks are drawn.

Classification of Bird Based on Face Types Using Gray Level Co-Occurrence Matrix (GLCM) Feature Extraction Based on the k-Nearest Neighbor (K-NN) Algorithm

Journal of Applied Intelligent System ◽

10.33633/jais.v6i2.4627 ◽

2021 ◽

Vol 6 (2) ◽

pp. 111-119

Author(s):

Daurat Sinaga ◽

Feri Agustina ◽

Noor Ageng Setiyanto ◽

Suprayogi Suprayogi ◽

Cahaya Jatmoko

Keyword(s):

Feature Extraction ◽

Nearest Neighbor ◽

Correlation Energy ◽

Training Data ◽

Gray Level ◽

K Nearest Neighbor ◽

Testing Data ◽

Occurrence Matrix

Indonesia is one of the countries with a great wealth of fauna, and its various types of fauna are scattered throughout the country. One of these types is birds. Birds are often bred as pets because of their characteristic facial, voice, and body features. This study uses Gray Level Co-Occurrence Matrix (GLCM) feature extraction with the k-Nearest Neighbor (K-NN) algorithm. The data used in this study were 66 images, divided into 55 training data and 11 testing data. The feature values are obtained from GLCM feature extraction, namely contrast, correlation, energy, homogeneity, and entropy, and are then classified using the K-NN algorithm with Euclidean distance. The classification results show that the highest accuracy, 54.54%, was obtained at K = 1 and an angle of 0°.

Deep Fusion Feature Extraction for Caries Detection on Dental Panoramic Radiographs

Applied Sciences ◽

10.3390/app11052005 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2005

Author(s):

Toan Huy Bui ◽

Kazuhiko Hamamoto ◽

May Phu Paing

Keyword(s):

Feature Extraction ◽

Nearest Neighbor ◽

Classification Model ◽

Caries Detection ◽

Support Vector ◽

K Nearest Neighbor ◽

Previous State ◽

Wide Scale ◽

Optimal Fusion ◽

Fusion Feature

Caries is the most well-known disease and relates to the oral health of billions of people around the world. Despite the importance and necessity of a well-designed detection method, studies in caries detection are still limited and show restricted performance. In this paper, we proposed a computer-aided diagnosis (CAD) method to detect caries among normal patients using dental radiographs. The proposed method mainly consists of two processes: feature extraction and classification. In the feature extraction phase, the chosen 2D tooth image was used to extract deep activated features with a deep pre-trained model and geometric features with mathematical formulas. Both feature sets were then combined into a fusion feature so that each compensates for the other's shortcomings. The optimal fusion feature set was then fed into well-known classification models, namely support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), Naïve Bayes (NB), and random forest (RF), to determine the classification model that best fits the fusion feature set and yields the best result. The results show 91.70%, 90.43%, and 92.67% for accuracy, sensitivity, and specificity, respectively. The proposed method outperformed the previous state of the art and shows promising results, with none of the measured factors below 90%; therefore, the method is promising for dentists and capable of wide-scale implementation of caries detection in hospitals.
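For readers unfamiliar with the three reported metrics, the following sketch shows how accuracy, sensitivity, and specificity are conventionally computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: diseased cases detected / missed.
    tn/fp: healthy cases correctly cleared / wrongly flagged.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

# Made-up counts purely for illustration.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Reporting all three matters because accuracy alone can hide a model that rarely flags the diseased class.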

Analysis of Braycurtis, Canberra and Euclidean Distance in KNN Algorithm

SinkrOn ◽

10.33395/sinkron.v4i1.10207 ◽

2019 ◽

Vol 4 (1) ◽

pp. 74 ◽

Cited By ~ 1

Author(s):

Annisa Fadhillah Pulungan ◽

Muhammad Zarlis ◽

Saib Suwilo

Keyword(s):

Euclidean Distance ◽

Evaluation Method ◽

Nearest Neighbor ◽

Distance Matrix ◽

Training Data ◽

Distance Functions ◽

Classification Model ◽

K Nearest Neighbor ◽

Distance Method ◽

Canberra Distance

Classification is a technique used to build a classification model from a sample of training data. One of the most popular classification techniques is the K-Nearest Neighbor (KNN) algorithm. The KNN algorithm has important parameters that affect its performance: the value of K and the distance matrix. The distance between two points is determined by calculating the distance matrix before the KNN classification process. The purpose of this study was to analyze and compare the performance of KNN under different distance functions, namely Braycurtis distance, Canberra distance, and Euclidean distance, from an accuracy perspective. This study uses the Iris Dataset from the UCI Machine Learning Repository. The evaluation method used is 10-fold cross-validation. The results showed that the Braycurtis distance performed better than the Canberra and Euclidean distances at K=6, K=7, K=8, and K=10, with an accuracy of 96%.
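The three distance functions compared in this abstract have standard definitions, sketched below together with a plain majority-vote KNN that accepts any of them. This is an illustration rather than the study's implementation: the function names, the convention that a 0/0 Canberra term counts as 0, and the toy training points are assumptions.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def braycurtis(a, b):
    # Sum of absolute differences divided by sum of absolute sums.
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(abs(x + y) for x, y in zip(a, b))

def canberra(a, b):
    # Each coordinate contributes |x - y| / (|x| + |y|); a 0/0 term counts as 0.
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total

def knn_classify(train, query, k, metric):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: metric(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Made-up two-class data for illustration only.
train = [([1, 1], "A"), ([1, 2], "A"), ([2, 1], "A"),
         ([8, 8], "B"), ([9, 8], "B"), ([8, 9], "B")]
predictions = {m.__name__: knn_classify(train, [1.5, 1.5], 3, m)
               for m in (euclidean, braycurtis, canberra)}
```

Because the metric is passed in as a function, the same classifier can be re-run for each distance and each value of K, which is the kind of comparison the abstract describes.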

Implementation of Zoning and K-Nearest Neighbor in Character Recognition of Wrésastra Script

Lontar Komputer Jurnal Ilmiah Teknologi Informasi ◽

10.24843/lkjiti.2019.v10.i01.p02 ◽

2019 ◽

pp. 9 ◽

Cited By ~ 1

Author(s):

I Wayan Agus Surya Darma

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Extraction Process ◽

Training Data ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Technological Advances ◽

Different Types

Balinese script is an important carrier of Balinese culture that continues to develop along with technological advances. Balinese script consists of three types, (1) Wrésastra, (2) Swalalita, and (3) Modre, which have different types of characters. The Wrésastra and Swalalita scripts are grouped as the scripts used for writing in everyday life. In this research, the zoning method is implemented in the feature extraction process to produce features specific to Balinese script, which are then used in the classification process to recognize Balinese script characters. The zoning method divides the character image area into several regions to enrich the features of each character. The extracted features are stored as training data for the classification process. K-Nearest Neighbors is used to classify the features of each Balinese script character. In testing, the highest accuracy of Balinese script recognition, 97.5%, was obtained with K=3 and reference=10.
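A minimal sketch of zoning-based feature extraction follows. The abstract does not say what is measured inside each zone, so this example assumes the common choice of foreground pixel density on a binarized character image; the function name `zoning_features`, the grid size, and the toy image are hypothetical.

```python
def zoning_features(image, zone_rows, zone_cols):
    """Split a binary image into a grid and return one density per zone.

    `image` is a list of rows of 0/1 values (1 = ink). Zones are listed
    row by row, so the result has zone_rows * zone_cols entries.
    """
    rows, cols = len(image), len(image[0])
    features = []
    for zr in range(zone_rows):
        r0, r1 = zr * rows // zone_rows, (zr + 1) * rows // zone_rows
        for zc in range(zone_cols):
            c0, c1 = zc * cols // zone_cols, (zc + 1) * cols // zone_cols
            pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(pixels) / len(pixels))
    return features

# Toy 4x4 "character" (made-up data) split into a 2x2 grid of zones.
glyph = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
zone_vector = zoning_features(glyph, zone_rows=2, zone_cols=2)
```

The per-zone vector keeps coarse spatial layout, which is what lets a distance-based classifier such as K-Nearest Neighbors tell apart characters with similar overall ink coverage.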

Application Development of Student's Graduation Classification Model based on The First 2 Years Performance using K-Nearest Neighbor

10.31227/osf.io/ftwre ◽

2018 ◽

Author(s):

Purwono Prasetyawan ◽

Muhammad Faridz Abadi

Keyword(s):

Cross Validation ◽

Nearest Neighbor ◽

Educational Institution ◽

Training Data ◽

Classification Model ◽

K Nearest Neighbor ◽

Application Development ◽

K Value ◽

The Status ◽

Fold Cross Validation

A college keeps a lot of data, such as academic, administrative, and student biographical data, but the existing student data has not been fully utilized. Students are an important asset for an educational institution, so the rate at which students graduate on time needs attention. Differences in students' ability to complete their studies on time call for monitoring and evaluation so that new information or knowledge can be found to support decisions. The purpose of this study is to determine the relationship between the variables IP Semester 1, IP Semester 2, IP Semester 3, IP Semester 4, gender, and student status and the duration of study, using the k-nearest neighbor (KNN) algorithm. Classifying students' graduation with KNN based on these variables, evaluated with k-fold cross-validation, gave mean accuracies of 88% for K=1, 88.67% for K=3, 93.78% for K=5, 86% for K=7, 86.22% for K=9, 92.44% for K=11, 89.55% for K=13, 93.78% for K=15, 99.78% for K=17, and 100% for K=19. The 500 training data split by status into 188 and 312 students, with working students taking longer to complete their studies, and by gender into 290 men and 210 women, with women taking longer to finish college. The optimal K value was found using k-fold cross-validation: K=19, with 100% accuracy.
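Selecting K by k-fold cross-validation, as this abstract describes, can be sketched as follows. This is a simplified illustration and not the study's code: the function names, the interleaved fold assignment, the Euclidean distance, and the synthetic two-cluster data are all assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, k, folds):
    """Mean accuracy over `folds` held-out subsets of `data`."""
    correct = 0
    for f in range(folds):
        # Every folds-th record starting at f is held out for testing.
        test = data[f::folds]
        train = [item for idx, item in enumerate(data) if idx % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

def best_k(data, candidates, folds):
    """Return the first candidate K with the highest cross-validated accuracy."""
    scores = {k: cross_val_accuracy(data, k, folds) for k in candidates}
    return max(scores, key=scores.get), scores

# Synthetic, well-separated clusters (not the student data from the study).
data = ([((float(i), float(i % 3)), "A") for i in range(10)]
        + [((float(i + 20), float(i % 3)), "B") for i in range(10)])
chosen_k, scores = best_k(data, candidates=[1, 3, 5], folds=5)
```

On real records it is common to shuffle before splitting and to scale features such as grades and categorical codes so that no single variable dominates the distance.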

Handwritten Balinesse Character Recognition using K-Nearest Neighbor

10.31227/osf.io/z6m8u ◽

2018 ◽

Author(s):

I Wayan Agus Surya Darma

Keyword(s):

Feature Extraction ◽

Success Rate ◽

Character Recognition ◽

Nearest Neighbor ◽

Recognition System ◽

Extraction Process ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Character Feature

Balinese character recognition is a technique for recognizing the features or patterns of Balinese characters. These features are generated through a feature extraction process. This research uses handwritten Balinese characters, and its feature extraction process generates semantic and direction features of each character. Recognition uses the K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese characters. The features of the test character images are compared with reference features. With K=3 and reference=10, the recognition system achieved a success rate of 97.53%.

A Comparative Survey of Feature Extraction and Machine Learning Methods in Diverse Acoustic Environments

Sensors ◽

10.3390/s21041274 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1274

Author(s):

Daniel Bonet-Solà ◽

Rosa Ma Alsina-Pagès

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Best Practice ◽

Nearest Neighbor ◽

Gaussian Mixture ◽

Machine Learning Algorithms ◽

Multimedia Retrieval ◽

Natural Environments ◽

K Nearest Neighbor ◽

Acoustic Environments

Acoustic event detection and analysis has been widely developed in the last few years for its valuable application in monitoring elderly or dependant people, for surveillance issues, for multimedia retrieval, or even for biodiversity metrics in natural environments. For this purpose, sound source identification is a key issue in giving a smart technological answer to all the aforementioned applications. Diverse types of sounds and varied environments, together with a number of challenges in terms of application, widen the choice of candidate artificial intelligence algorithms. This paper presents a comparative study on combining several feature extraction algorithms (Mel Frequency Cepstrum Coefficients (MFCC), Gammatone Cepstrum Coefficients (GTCC), and Narrow Band (NB)) with a group of machine learning algorithms (k-Nearest Neighbor (kNN), Neural Networks (NN), and Gaussian Mixture Model (GMM)), tested over five different acoustic environments. This work has the goal of detailing a best practice method and evaluating the reliability of this general-purpose algorithm for all the classes. Preliminary results show that most of the combinations of feature extraction and machine learning present acceptable results in most of the described corpora. Nevertheless, there is a combination that outperforms the others: the use of GTCC together with kNN, and its results are further analyzed for all the corpora.

k-Nearest Neighbor Learning with Graph Neural Networks

Mathematics ◽

10.3390/math9080830 ◽

2021 ◽

Vol 9 (8) ◽

pp. 830

Author(s):

Seokho Kang

Keyword(s):

Neural Network ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Weighting Function ◽

High Sensitivity ◽

Training Data ◽

K Nearest Neighbor ◽

Main Challenge ◽

Benchmark Datasets ◽

Graph Neural Networks

k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.
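The sensitivity to hyperparameters mentioned in this abstract can be seen in a small sketch of conventional kNN with a pluggable distance and weighting function. It illustrates plain kNN only and does not implement kNNGNN; the function names and the one-dimensional toy data are assumptions.

```python
import math
from collections import defaultdict

def uniform(distance):
    """Every neighbor gets one vote regardless of distance."""
    return 1.0

def inverse_distance(distance, eps=1e-9):
    """Closer neighbors get larger votes; eps avoids division by zero."""
    return 1.0 / (distance + eps)

def weighted_knn(train, query, k, distance=math.dist, weight=uniform):
    """Predict the label with the largest total weight among k neighbors."""
    neighbors = sorted(((distance(x, query), y) for x, y in train),
                       key=lambda pair: pair[0])[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += weight(d)
    return max(votes, key=votes.get)

# Made-up data: one close "A" point and two farther "B" points.
train = [((0.0,), "A"), ((3.0,), "B"), ((4.0,), "B")]
query = (1.0,)
by_uniform = weighted_knn(train, query, k=3)
by_inverse = weighted_knn(train, query, k=3, weight=inverse_distance)
```

With the same data and the same k, the two weighting functions give different predictions here, which is the kind of sensitivity the proposed method aims to reduce by learning the rule instead of fixing it by hand.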
