KLASIFIKASI DOKUMEN TUGAS AKHIR (SKRIPSI) MENGGUNAKAN K-NEAREST NEIGHBOR [Classification of Undergraduate Thesis Documents Using K-Nearest Neighbor]

2019 ◽  
Vol 4 (1) ◽  
pp. 69
Author(s):  
Kitami Akromunnisa ◽  
Rahmat Hidayat

Scientific works by academics, such as theses, research reports, and practical work reports, are increasingly available in digital form. In general, however, this growth has not been matched by growth in the information or knowledge extracted from these electronic documents. This study aims to classify abstracts of informatics engineering theses using the K-Nearest Neighbor algorithm. The data comprise 50 Indonesian-language abstracts, 454 English abstracts, and 504 titles. Each data set is divided into training data and test data, and the test data are classified automatically by the trained classifier model. For the Indonesian abstracts, classification without stemming gave the higher accuracy: 100.0% at a 9:1 ratio, compared with 90.0% at 8:2, 80.0% at 7:3, 60.0% at 6:4, and 80.0% when the data were divided using K-fold cross-validation.
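To make the split ratios concrete, the following minimal sketch computes how many documents fall on each side of a split. It assumes the ratios are training:test and that they apply to the 50 Indonesian abstracts; the abstract does not say how the authors rounded or sampled, and `split_counts` is a hypothetical helper.

```python
def split_counts(n_items, train_parts, test_parts):
    """Return (n_train, n_test) for a train:test ratio.

    Assumption: the test share is rounded to the nearest whole document.
    """
    n_test = round(n_items * test_parts / (train_parts + test_parts))
    return n_items - n_test, n_test


# Ratios named in the abstract, applied to 50 documents (an assumption).
for train_parts, test_parts in [(9, 1), (8, 2), (7, 3), (6, 4)]:
    n_train, n_test = split_counts(50, train_parts, test_parts)
    print(f"{train_parts}:{test_parts} -> {n_train} training, {n_test} test")
```

Under these assumptions a 9:1 split leaves only 5 test documents, so each test document would account for 20 percentage points of accuracy.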

2019 ◽  
Vol 3 (2) ◽  
pp. 96-107
Author(s):  
Cahaya Jatmoko ◽  
Daurat Sinaga

In this study, batik is modeled using the GLCM method, which produces the features energy, contrast, correlation, homogeneity, and entropy. These features are then used as input for classifying the training and test data with the K-NN method, using Euclidean distance. The next classification uses the 5 features that provide information on energy, contrast, correlation, homogeneity, and entropy. The two classifications are compared to see which produces the best accuracy. The system was evaluated on the training and test data using the recognition rate. The study obtained a recognition rate of 66% with 50 test items and 100 training items.
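The five GLCM features named above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a single offset (distance 1 at 0 degrees), a non-symmetric matrix, and one common set of feature definitions (definitions of energy and homogeneity vary between sources). The 4x4 image and the function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy).

    The default offset (1, 0) pairs each pixel with its right-hand
    neighbor, i.e. the 0 degree direction at distance 1.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Compute five texture features from a normalized GLCM.

    Note: correlation divides by the standard deviations, so a
    constant image (zero variance) would raise ZeroDivisionError.
    """
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    std_i = math.sqrt(sum((i - mean_i) ** 2 * v for i, _, v in cells))
    std_j = math.sqrt(sum((j - mean_j) ** 2 * v for _, j, v in cells))
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "correlation": sum((i - mean_i) * (j - mean_j) * v
                           for i, j, v in cells) / (std_i * std_j),
    }


# Hypothetical 4x4 image with 4 gray levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(image, levels=4))
print(features)
```

The resulting five-value vector is what a K-NN classifier would compare between images using Euclidean distance.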


2020 ◽  
Vol 202 ◽  
pp. 16005
Author(s):  
Chashif Syadzali ◽  
Suryono Suryono ◽  
Jatmiko Endro Suseno

Classifying customer behavior can help companies conduct business intelligence analysis. Data mining techniques can classify customer behavior with the K-Nearest Neighbor algorithm based on the customer life cycle, which consists of prospect, responder, active, and former. The data used for classification include age, gender, number of donations, donation retention, and number of user visits. Of 2,114 records, the classification assigned 1.18% to active, 8.99% to prospect, 4.26% to responder, and 85.57% to former. Testing K from K = 1 to K = 20 gave the highest system accuracy, 94.3731%, at K = 4. The resulting classification of user behavior can support business intelligence analysis, helping companies set business strategy by identifying the optimal target market.
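The search over K described above can be sketched as a simple loop that scores each K on held-out data and keeps the best one. This is a minimal illustration, assuming Euclidean distance and majority voting; the two-feature toy records and the helper names (`knn_predict`, `accuracy`, `best_k`) are hypothetical and do not come from the study.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def best_k(train, test, k_values):
    """Score every k and return the best one (smallest k wins ties)."""
    scores = {k: accuracy(train, test, k) for k in k_values}
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


# Hypothetical records: (feature vector, life-cycle label).
train = [
    ((0.0, 0.0), "former"), ((0.0, 1.0), "former"), ((1.0, 0.0), "former"),
    ((5.0, 5.0), "active"), ((5.0, 6.0), "active"), ((6.0, 5.0), "active"),
]
test = [((0.5, 0.5), "former"), ((5.5, 5.5), "active")]

k, scores = best_k(train, test, range(1, 21))
print("best K:", k, "accuracy:", scores[k])
```

With real data, features such as age and number of visits sit on different scales, so normalizing them before computing distances is usually worth considering.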


2019 ◽  
Vol 2 (1) ◽  
pp. 57
Author(s):  
Irma Handayani

The vertebral column, as part of the backbone, plays an important role in the human body. Trauma to the vertebral column can impair the spinal cord's ability to carry messages between the brain and the body systems that control sensory and motor function. Disk hernia and spondylolisthesis are examples of vertebral column pathologies. Research on classifying pathologies or damage to the bones and joints of the skeletal system is rare, even though a classification system can serve radiologists as a second opinion and so improve their productivity and diagnostic consistency. This research used the Vertebral Column dataset from the UCI Machine Learning Repository, which has three classes (Disk Hernia, Spondylolisthesis, and Normal). It applied the K-NN algorithm to classify disk hernia and spondylolisthesis in the vertebral column. The data were then classified in a different but related task with two classes: “normal” and “abnormal”. The K-NN algorithm classifies data by optimizing the sample data used as reference training data, producing a vertebral column classification based on the learning process. The K-NN classifier achieved an accuracy of 83%. The average time needed to classify with the K-NN classifier was 0.000212303 seconds.


2018 ◽  
Vol 1 (2) ◽  
pp. 70-75
Author(s):  
Abdul Rozaq

Building materials are an important factor in building a house, and consumers or developers need to estimate the funds required to build one. To address this problem, a case-based reasoning (CBR) approach is used; this method reasons about or solves a problem by using existing cases as solutions to new problems. The system built in this study is a CBR system for determining house building material requirements. In the consultation process, a new case is entered and compared with old cases, and the similarity value is calculated using the nearest neighbor method. The first test, in which test data were entered and compared with each type of house, obtained an accuracy of 83.6%. The second test used K-fold cross-validation with K = 25 on 200 data items, dividing the data into 192 training items and 8 test items. With K-fold cross-validation, the CBR system achieved an accuracy of 85.71%.
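The fold sizes quoted above follow directly from the arithmetic: 200 items in 25 folds gives 8 test items per fold and 192 training items. A minimal sketch of the partitioning, assuming contiguous folds with no shuffling (the abstract does not say how the folds were drawn), with `k_fold_indices` as a hypothetical helper:

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one when n_items is not divisible by k.
    """
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


folds = list(k_fold_indices(200, 25))
train, test = folds[0]
print(len(folds), "folds;", len(train), "training items;", len(test), "test items")
```

Each item appears in exactly one test fold, so averaging the per-fold accuracies uses every item for testing once.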


2022 ◽  
Vol 8 (1) ◽  
pp. 50
Author(s):  
Rifki Indra Perwira ◽  
Bambang Yuwono ◽  
Risya Ines Putri Siswoyo ◽  
Febri Liantoni ◽  
Hidayatulah Himawan

State universities have libraries as facilities that support students’ education and learning, holding books, journals, and final assignments. An intelligent document classification system is needed to make searching easier for library visitors in higher education, as a form of service to students. The documents in the library are generally the results of research. Complaints about imbalanced text data and categories based on irrelevant document titles, and about words with ambiguous meanings when searching for documents, are the main reasons a classification system is needed. This research uses k-Nearest Neighbor (k-NN) to categorize documents by study interest, with information gain feature selection to handle unbalanced data and cosine similarity to measure the distance between test and training data. In tests with 276 training data, the best result with information gain feature selection, using 80% training data and 20% test data, was an accuracy of 87.5% at k=5. The highest accuracy, 92.9%, was achieved without information gain feature selection, with 90% training data and 10% test data and k=5, 7, and 9. This paper concludes that the system is more accurate without information gain feature selection than with it, because every word in the document title is considered to play an essential role in forming the classification.
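A minimal sketch of k-NN over document titles with cosine similarity is shown below. It assumes raw term counts as the vector representation and whitespace tokenization, and it omits information gain feature selection entirely; the paper's actual preprocessing and weighting are not given in the abstract. The titles, the labels, and the function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumption: lowercase, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def classify(title, training, k):
    """Majority label among the k training titles most similar to the query."""
    query = vectorize(title)
    ranked = sorted(
        training,
        key=lambda item: cosine_similarity(query, vectorize(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical titles with hypothetical study-interest labels.
training = [
    ("image classification using neural network", "AI"),
    ("neural network for face recognition", "AI"),
    ("expert system for disease diagnosis", "AI"),
    ("web based library information system", "IS"),
    ("inventory information system design", "IS"),
    ("mobile information system for sales", "IS"),
]

print(classify("library information system for students", training, k=3))
```

Because similarity grows with closeness, the neighbors are the highest-scoring titles; if a distance is needed, one minus the cosine similarity is a common choice.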


SinkrOn ◽  
2020 ◽  
Vol 4 (2) ◽  
pp. 34
Author(s):  
Moh. Arie Hasan ◽  
Arief Setya Budi

Pears are a fruit widely available in many regions, including western Europe, Asia, and Africa, and Indonesia is one of them. There are many types of pears in Indonesia. Pear types can be distinguished by color, size, and shape, but it is still difficult for ordinary people to tell them apart. This motivated research using image processing to classify three types of pears, namely abate, red, and william pears, to help identify the pear type. Classification is done by verifying the pear image against existing training data. The research method consists of preprocessing, image segmentation with morphological operations, and feature extraction with Principal Component Analysis (PCA). The classification algorithm used is K-Nearest Neighbor (KNN). Adequate training data further improves pear type classification. The final result of this study was 87.5%.


2021 ◽  
Vol 6 (2) ◽  
pp. 111-119
Author(s):  
Daurat Sinaga ◽  
Feri Agustina ◽  
Noor Ageng Setiyanto ◽  
Suprayogi Suprayogi ◽  
Cahaya Jatmoko

Indonesia is one of the countries with a rich diversity of fauna, spread throughout the country. One type of fauna found there is birds. Birds are often kept as pets because of their characteristic voices, faces, and body features. This study uses the Gray Level Co-Occurrence Matrix (GLCM) with the k-Nearest Neighbor (K-NN) algorithm. The data were 66 images, divided into 55 training images and 11 test images. The feature values are based on GLCM feature extraction, namely contrast, correlation, energy, homogeneity, and entropy, which are then processed using the k-Nearest Neighbor (K-NN) algorithm and Euclidean distance. Classification with K-NN gave the highest accuracy, 54.54%, at K = 1 and an angle of 0°.


Author(s):  
Sumarlin Sumarlin ◽  
Dewi Anggraini

Graduate data are an important part of determining the quality of private and public universities and are included in important assessments in the accreditation process. Graduate data from STIKOM Uyelindo Kupang grow every year and accumulate like neglected data because they are rarely used. To turn student data into information the university can use, the data must be processed, in this case as training data in a data mining study that predicts the graduation of STIKOM Uyelindo Kupang students. The method used is K-Nearest Neighbor, with RapidMiner software used to measure its accuracy on the graduate data. The criteria used were student name, gender, and cumulative achievement index (GPA) from semesters 1 to 6. The K-Nearest Neighbor algorithm can be used to produce predictions of student graduation. Its performance was measured with cross-validation, a confusion matrix, and ROC curves; this study used 5-fold cross-validation. On a dataset of 100 graduate records from STIKOM Uyelindo Kupang, accuracy reached 82%, and the classification is rated very good because its AUC value, 0.971, falls between 0.90 and 1.00. It can be concluded that the accuracy of the graduation model using the K-Nearest Neighbor (K-NN) algorithm is influenced by the number of data clusters. The highest accuracy and AUC value under 5-fold validation occurred with data cluster k = 4, with an accuracy of 90%.
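The AUC figure can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by comparing every positive/negative pair; it is an illustration only, since the abstract does not show how RapidMiner derived the value. The scores, the labels, and the `auc` helper are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; label 1 could mean "graduates on time".
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))
```

The pairwise form is quadratic in the number of records, which is fine for a 100-record dataset but would call for a rank-based computation on larger ones.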


2019 ◽  
Vol 1 (1) ◽  
pp. 14-19
Author(s):  
Febrian Wahyu Ramadhan ◽  
Husni Teja Sukmana ◽  
Lee Kyung Oh ◽  
Luh Kesuma Wardhani

Sentiment analysis is a method for reviewing products or services to determine opinions or feelings about them. Companies can use the results as evaluation material and as input for improving the products or services they provide. This study aims to measure public sentiment on the quality of Bank Mandiri services that have received ISO 20000-1, applying sentiment analysis with the K-NN algorithm based on ITSM criteria. The initial classification uses the lexicon method, which detects sentiment words; its results are used as labels on the training and test data. The K-NN classification takes into account the indexing of the training data and the weighting of the test data, with the value of k as the decision-making limit. Trial results from 10 scenarios show that K-NN sentiment classification reached 98% accuracy with 50 test data against 600 training data, with 24% positive, 22% negative, and 55% neutral sentiment and an f-measure of 95.83%. With 100 test data, accuracy was 79%, with 21% positive, 42% negative, and 38% neutral sentiment and an f-measure of 68.42%.
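The f-measure reported above is conventionally the harmonic mean of precision and recall. The sketch below computes it for a single class from confusion counts; the abstract does not say how the study averaged across the positive, negative, and neutral classes, so this is an assumption-laden illustration with hypothetical counts and a hypothetical helper name.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class from confusion counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for any quantity whose denominator would be zero.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one sentiment class.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because it ignores true negatives, the f-measure can differ noticeably from accuracy when one class dominates the test data.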



