Peningkatan Kinerja Metode SVM Menggunakan Metode KNN Imputasi dan K-Means-Smote untuk Klasifikasi Kelulusan Mahasiswa Universitas Bumigora (Improving the Performance of the SVM Method Using KNN Imputation and K-Means-SMOTE for Classifying Student Graduation at Universitas Bumigora)

2021 ◽  
Vol 8 (4) ◽  
pp. 713
Author(s):  
Hairani Hairani

One of the main problems at Bumigora University is the imbalance between the number of incoming students and the number of students who graduate on time, which may lower the university's accreditation assessment in the future. One of the assessment indicators in the accreditation process is the student graduation ratio. Student graduation data are stored in the campus database but have not been fully utilized. These data can reveal patterns of students who do or do not graduate on time, which helps minimize the number of students who drop out. In addition, decision makers can more easily set policies early to help students who are at risk of dropping out or of not graduating on time. The solution offered in this research is to use data mining techniques, and one of the data mining methods used is the SVM method. The purpose of this study is to improve the performance of the SVM method for classifying student graduation at Bumigora University using the KNN Imputation (KNNI) and K-Means-SMOTE methods. The research consists of several stages: collecting student graduation data; pre-processing, which includes handling missing values with KNNI and handling class imbalance with K-Means-SMOTE; and classification with the SVM method. The last stage is testing SVM performance in terms of accuracy, sensitivity, specificity, and F-measure. In the tests carried out, the integration of KNNI, K-Means-SMOTE, and SVM achieved an accuracy of 83.9%, a sensitivity of 81.3%, a specificity of 86.6%, and an F-measure of 83.5%. Using KNNI and K-Means-SMOTE improves the performance of the SVM method on accuracy, sensitivity, specificity, and F-measure.
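The four measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming the usual definitions and nonzero denominators; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, specificity and F-measure.

    tp, fn, fp, tn are counts from a binary confusion matrix.
    Assumes every denominator below is nonzero.
    """
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=8, fn=2, fp=1, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity next to accuracy matters for imbalanced data, because a classifier that ignores the minority class can still reach high accuracy.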

2020 ◽  
Vol 4 (1) ◽  
pp. 95-101 ◽  
Author(s):  
Edi Sutoyo ◽  
Ahmad Almaarif

The quality of students can be seen from their academic achievements, which are evidence of the effort they have made. Student academic achievement is evaluated at the end of each semester to determine the learning outcomes that have been achieved. A student who cannot meet the academic criteria required to continue their studies may fail to graduate on time or may even Drop Out (DO). The high number of students who do not graduate on time or who DO in higher education institutions can be reduced by detecting at-risk students in the early stages of their education, supported by policies that direct students toward completing their education. In addition, if the time a student needs to complete their studies can be predicted, students can be handled more effectively. Data mining is one technique that can be used to make such predictions. In this study, the Naive Bayes Classifier (NBC) algorithm is therefore used to predict student graduation at Telkom University. The dataset was obtained from the Information Systems Directorate (SISFO), Telkom University, and contains 4,000 instances. The results show that NBC was successfully implemented to predict student graduation, producing an accuracy of 73.725%, a precision of 0.742, a recall of 0.736, and an F-measure of 0.735.
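As an illustration of how a Naive Bayes classifier of this kind makes a prediction, here is a minimal sketch for categorical features with Laplace smoothing. It is not the authors' implementation; the function names, the two features, and the four training rows are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def train_nb(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values_seen = defaultdict(set)         # feature index -> distinct values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, y)][v] += 1
            values_seen[i].add(v)
    return class_counts, feature_counts, values_seen


def predict_nb(model, row):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, feature_counts, values_seen = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for y, n_y in class_counts.items():
        score = log(n_y / total)  # log prior
        for i, v in enumerate(row):
            count = feature_counts[(i, y)][v]
            score += log((count + 1) / (n_y + len(values_seen[i])))
        if score > best_score:
            best, best_score = y, score
    return best


# Hypothetical toy data: (GPA band, study load) -> graduation outcome.
rows = [("high", "full"), ("high", "full"), ("low", "part"), ("low", "full")]
labels = ["on_time", "on_time", "late", "late"]
model = train_nb(rows, labels)
print(predict_nb(model, ("high", "full")))
print(predict_nb(model, ("low", "part")))
```

Working in log space avoids underflow when many features are multiplied together, and the add-one smoothing keeps an unseen feature value from forcing a class probability to zero.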


2018 ◽  
Vol 8 (3) ◽  
pp. 120-125
Author(s):  
Ahmad Alaiad ◽  
Hassan Najadat ◽  
Nusaiba Al-Mnayyis ◽  
Ashwaq Khalil

Data envelopment analysis (DEA) has been widely used in many fields. Recently, it has been adopted by the healthcare sector to improve the efficiency and performance of healthcare organisations, thereby reducing overall costs and increasing productivity. In this paper, we present the results of applying the DEA model to Jordanian hospitals. The dataset consists of 28 hospitals, classified into two groups: efficient and non-efficient hospitals. We applied different association classification data mining techniques (JCBA, WeightedClassifier and J48) to generate strong rules using the Waikato Environment for Knowledge Analysis. We also used the open source DEA software and the MaxDEA software to work with the DEA model. The results showed that JCBA has the highest accuracy. However, the WeightedClassifier method generates the largest number of rules, while the JCBA method generates the fewest. The results have several implications for practice in the healthcare sector and for decision makers. Keywords: Component, DEA, DMU, output-oriented model, health care system.


2021 ◽  
Vol 8 (11) ◽  
pp. 325-331
Author(s):  
Eko Hariyanto ◽  
Sri Wahyuni ◽  
Supina Batubara

The main problem examined in this study is the large number of lost students, who harm universities, because monitoring them as a preventive measure is difficult. This research is therefore important so that higher education institutions can detect early (classify) students who may not complete their studies on time or who will drop out (DO). Higher education (PT) institutions, through related parties such as academic guidance lecturers, academic bureaus and others, can then take early preventive action by providing the best solution to the problems students face. This research aims to determine a training data model consisting of academic and non-academic factors (including information extracted from social media). The model is then used as a basis for classifying students as likely to "graduate on time", "graduate not on time", or "DO". The approach is quantitative, using text mining algorithms to extract knowledge/information from social media for use in the training data, and data mining algorithms to classify the potential completion of student studies. The mandatory output targeted in the first year is a publication in a Scopus Q4 international journal, and in the second year a publication in a Scopus Q3 international journal. Additional output targets for the first and second years are publications in international journals indexed by reputable indexers, teaching books with ISBNs, and copyrights. The technology readiness level (TKT) in this study reaches level 2, namely the formulation of technological concepts and applications for classifying the potential completion of student studies using data mining. Keywords: student lost, knowledge/information extraction, data classification, text mining, data mining.


2019 ◽  
Vol 8 (4) ◽  
pp. 4039-4042

Learning from unbalanced data has recently become a prominent problem in several applications, and because multi-label classification is an evolving data mining task, learning from unbalanced multi-label data is being examined. However, the available SMOTE-based algorithms use the same sampling rate for every instance of the minority class, which leads to sub-optimal performance. To deal with this problem, a new Particle Swarm Optimization based SMOTE (PSOSMOTE) algorithm is proposed. PSOSMOTE employs different sampling rates for different minority class instances and searches for the optimal combination of sampling rates for classifying unbalanced datasets. A Bayesian technique is then combined with Random Forest for multi-label classification (BARF-MLC) to address the inherent label dependencies among samples. It is considered alongside classifiers such as the ML-FOREST classifier, Predictive Clustering Trees (PCT), and Hierarchy of Multi Label Classifier (HOMER), using metrics including precision, recall, F-measure, accuracy, and error rate.
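The idea of giving each minority instance its own sampling rate can be sketched with plain SMOTE interpolation. In the sketch below the rates are simply passed in by hand; the particle swarm search that PSOSMOTE uses to choose them is not shown, and the function name, parameters, and data points are hypothetical. It assumes at least two minority points.

```python
import math
import random


def smote_per_instance(minority, rates, k=2, seed=0):
    """Generate synthetic minority points by SMOTE-style interpolation.

    minority: list of numeric tuples (the minority class instances).
    rates:    rates[i] is how many synthetic points to create from minority[i].
    k:        number of nearest minority neighbours to interpolate towards.
    """
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x, excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(rates[i]):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority points and hand-picked per-instance rates.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
rates = [1, 0, 2, 3]
new_points = smote_per_instance(minority, rates)
print(len(new_points))
```

Setting every entry of `rates` to the same value recovers the uniform sampling rate that the abstract identifies as the limitation of existing SMOTE-based algorithms.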


2020 ◽  
Vol 1 (2) ◽  
pp. 130
Author(s):  
Elma Tiana ◽  
Sri Wahyuni

Breast cancer, or carcinoma mammae, is an uncontrolled growth of cells in the milk-producing glands (lobules), the ducts that run from the lobules to the nipple, and the supporting breast tissue that surrounds the lobules, ducts, blood vessels, and lymph vessels, but it does not include the skin of the breast. The research begins with a preprocessing stage, in which imputation is applied to eliminate missing values. Feature selection is then performed to see which attributes have a major impact on the data. The last stage is classification with two methods, including Naïve Bayes. At the end of the study, the method that best classifies the recurrence data of breast cancer patients is identified.


2019 ◽  
Vol IV (IV) ◽  
pp. 146-156
Author(s):  
Dost Muhammad Khan ◽  
Tariq Aziz Rao ◽  
Faisal Shahzad

Data mining is a procedure for extracting the required information from unprocessed records by using certain methodologies and techniques. Data containing customer sentiment are of utmost importance for managers and decision-makers who intend to monitor progress, maintain the quality of their products or services, and observe the latest market trends for business support. Billions of customers use micro-blogging websites and social media to share their opinions about different topics on a daily basis. These platforms have therefore become a source of information, but identifying a particular feature of a product is still an issue because the information is retrieved from varied sources. We propose a framework for data acquisition, preprocessing, and feature extraction, and use three supervised machine-learning algorithms to classify customers' sentiments. The proposed framework was also tested to evaluate the system's performance. Our proposed methodology will be helpful for researchers, service providers, and decision-makers.


2014 ◽  
Vol 6 (1) ◽  
pp. 15-20 ◽  
Author(s):  
David Hartanto Kamagi ◽  
Seng Hansun

Graduation information is important for Universitas Multimedia Nusantara, which is engaged in education. Data on graduated students from each academic year are an important source of information for decision making by BAAK (Bureau of Academic and Student Administration). With this information and data mining, a prediction can be made of whether students who are still active will graduate early, on time, or late, or will drop out. The purpose of this study is to predict student graduation with the C4.5 algorithm, as a reference for BAAK policies and actions to reduce the number of students who graduate late or do not graduate. The research shows that the semester GPA (IPS) from semester one to semester six, gender, high school of origin, and number of credits can predict whether a student graduates early, on time, or late, or drops out, using data mining with the C4.5 algorithm. The semester six IPS is the most influential on the predicted graduation outcome. In the application test, the accuracy of the graduation prediction is 87.5%. Index Terms: Data mining, C4.5 algorithm, Universitas Multimedia Nusantara, prediction.
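C4.5 ranks candidate attributes by gain ratio, that is, information gain divided by the split information of the attribute. The following is a minimal sketch of that criterion for categorical attributes only; the function names and the toy attribute values are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the labels."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    # An attribute with a single value carries no split information
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a GPA band attribute against graduation outcome.
gpa_band = ["high", "high", "low", "low"]
outcome = ["on_time", "on_time", "late", "late"]
print(gain_ratio(gpa_band, outcome))
```

An attribute with a higher gain ratio is chosen nearer the root of the tree, which is one way an attribute such as the semester six IPS can emerge as the most influential.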


Diagnostics ◽  
2020 ◽  
Vol 10 (3) ◽  
pp. 162 ◽  
Author(s):  
Julieta G. Rodríguez-Ruiz ◽  
Carlos E. Galván-Tejada ◽  
Laura A. Zanella-Calzada ◽  
José M. Celaya-Padilla ◽  
Jorge I. Galván-Tejada ◽  
...  

Major depression has been increasing in the last few years, affecting around 7 percent of the world population, but current techniques for diagnosing it are outdated and inefficient. Over the last decade, motor activity data have been presented as a better way to diagnose, treat and monitor patients suffering from this illness, through the use of machine learning algorithms. Disturbances in the circadian rhythm of mental illness patients increase the effectiveness of the data mining process. In this paper, motor activity data from the night, the day and the full day are compared through a data mining process that uses the Random Forest classifier to identify depressive and non-depressive episodes. Data from the Depressjon dataset are split into three subsets, and 24 features in the time and frequency domains are extracted to select the best model for classifying depressive episodes. The results showed that the best dataset and model for classifying depressive episodes is the night motor activity data, with a sensitivity of 99.37% and a specificity of 99.91%.

