Model Prediksi Dropout Mahasiswa Menggunakan Teknik Data Mining [Student Dropout Prediction Model Using Data Mining Techniques]

Muchamad Taufiq Anwar; Lucky Heriyanto; Fadhla Fanini

doi:10.26877/jiu.v7i1.8023

Model Prediksi Dropout Mahasiswa Menggunakan Teknik Data Mining [Student Dropout Prediction Model Using Data Mining Techniques]

Jurnal Informatika Upgris ◽

10.26877/jiu.v7i1.8023 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Muchamad Taufiq Anwar ◽

Lucky Heriyanto ◽

Fadhla Fanini

Keyword(s):

Data Mining ◽

Knowledge Analysis

One of the problems at XYZ University is the high number of students who drop out (DO), so efforts are needed to minimize that number. This study aims to build a model that predicts whether a student will graduate or drop out. The data were taken from the academic records of students in the 2014-2019 cohorts. Preprocessing was done in Python, and modeling used the C4.5 / J48 algorithm in WEKA (Waikato Environment for Knowledge Analysis). The results show that the attributes that most strongly determine whether a student drops out or graduates are the Semester 1 and Semester 2 grade point averages (Indeks Prestasi), with model accuracy reaching 90.6%.
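
As a rough illustration of how a C4.5-style tree picks its most decisive attribute, the sketch below computes the gain ratio, the split criterion C4.5 uses, for two categorical attributes. The records, the attribute names (`gpa_sem1`, `scholarship`) and the labels are invented toy values, not data or results from the study, and WEKA's J48 handles numeric attributes and pruning that this sketch leaves out.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on one categorical `attribute`."""
    total = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; names and values are assumptions for illustration.
students = [
    {"gpa_sem1": "low", "scholarship": "no", "status": "dropout"},
    {"gpa_sem1": "low", "scholarship": "yes", "status": "dropout"},
    {"gpa_sem1": "high", "scholarship": "no", "status": "graduate"},
    {"gpa_sem1": "high", "scholarship": "yes", "status": "graduate"},
]

for name in ("gpa_sem1", "scholarship"):
    print(name, gain_ratio(students, name, "status"))
```

In this toy set the attribute with the higher gain ratio would become the root of the tree; real academic data would of course need discretization or a numeric split search first.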

Associative classification of the Jordanian hospitals efficiency based on DEA

Global Journal of Computer Sciences Theory and Research ◽

10.18844/gjcs.v8i3.4022 ◽

2018 ◽

Vol 8 (3) ◽

pp. 120-125

Author(s):

Ahmad Alaiad ◽

Hassan Najadat ◽

Nusaiba Al-Mnayyis ◽

Ashwaq Khalil

Keyword(s):

Data Mining ◽

Decision Makers ◽

Healthcare Sector ◽

Data Envelopment ◽

Associative Classification ◽

Dea Model ◽

Knowledge Analysis ◽

Minimum Number ◽

And Performance

Data envelopment analysis (DEA) has been widely used in many fields. Recently, it has been adopted by the healthcare sector to improve the efficiency and performance of healthcare organisations, thereby reducing overall costs and increasing productivity. In this paper, we present the results of applying the DEA model to Jordanian hospitals. The dataset consists of 28 hospitals classified into two groups: efficient and non-efficient. We applied several associative classification data mining techniques (JCBA, WeightedClassifier and J48) to generate strong rules using the Waikato Environment for Knowledge Analysis. We also used the open source DEA software and MaxDEA software to run the DEA model. The results showed that JCBA has the highest accuracy. However, the WeightedClassifier method generates the largest number of rules, while JCBA generates the fewest. The results have several implications for practice in the healthcare sector and for decision makers. Keywords: Component, DEA, DMU, output-oriented model, health care system.

Implementation of the K-Means Clustering Method in Data Grouping Sales In Asia Africa Dentures Dental

Journal Of Computer Networks, Architecture and High Performance Computing ◽

10.47709/cnapc.v2i2.431 ◽

2019 ◽

Vol 2 (2) ◽

pp. 286-291

Author(s):

Sonibe Halawa ◽

Rita Hamdani

Keyword(s):

Data Mining ◽

Data Storage ◽

Visual Basic ◽

Added Value ◽

Clustering Method ◽

Sales Data ◽

Knowledge Analysis ◽

Data Grouping

Data mining can be applied to extract added value from a data set in the form of knowledge that could not be found manually. Data mining offers several techniques, one of which is clustering. Clustering groups similar items; for example, it can group sales data to identify the most sought-after products. One company engaged in sales is Asia Africa Dental, a business that sells dentures. Asia Africa Dental serves consumer needs every day, but it does little to review the products it sells or to identify which products consumers need, and its data storage is ineffective. A system is therefore needed to support the company in making decisions quickly and accurately. In this study, the authors apply the K-Means clustering method. To facilitate the K-Means analysis, the authors use Weka (Waikato Environment for Knowledge Analysis). The results calculated in Weka are then incorporated into a Visual Basic .NET application.
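
To make the clustering step concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) on one-dimensional values. The sales figures are invented, the first `k` points are used as starting centroids purely to keep the run deterministic, and none of this reproduces the Weka configuration used by the authors.

```python
def k_means(points, k, iterations=10):
    """Lloyd's algorithm on 1-D values; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical units sold per product; not data from the study.
units_sold = [2, 3, 4, 20, 22, 25]
centroids, clusters = k_means(units_sold, k=2)
print(centroids)
print(clusters)
```

With real sales data the choice of `k` and of the starting centroids matters; tools such as Weka typically pick random seeds and let the user set the number of clusters.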

Analisis Pola Penyakit Hipertensi Menggunakan Algoritma C4.5 [Analysis of Hypertension Disease Patterns Using the C4.5 Algorithm]

InfoTekJar (Jurnal Nasional Informatika dan Teknologi Jaringan) ◽

10.30743/infotekjar.v3i2.944 ◽

2019 ◽

Vol 3 (2) ◽

pp. 116-123

Author(s):

Nurul Azwanti ◽

Erlin Elisa

Keyword(s):

Heart Failure ◽

Data Mining ◽

Coronary Heart Disease ◽

Comorbid Illness ◽

Analysis Software ◽

Impaired Cognitive Function ◽

Knowledge Analysis ◽

Number Of Patients ◽

C4.5 Algorithm ◽

Research Findings

The cause of hypertension is unknown in approximately 95% of cases, while the rest are caused by other diseases such as coronary heart disease, impaired kidney function, and impaired cognitive function or stroke. RSUD Embung Fatimah is an Indonesian hospital located on Batam Island, Riau Province. In 2015, the total number of inpatients reached 10,317. The large number of patients each year means that patient data keep growing. To address the handling of people with hypertension, the existing disease data need to be analyzed in order to predict, from the pattern of the disease, which patient illnesses must be treated. Data mining provides predictive models that can be used to find such patterns. One algorithm that can be used to build a decision tree is C4.5, a method for predictive classification. Using the C4.5 algorithm, the researchers classify the pattern of hypertension as a comorbid illness of heart failure, kidney failure, diabetes, stroke and hypoglycemia. In this study, the researchers used WEKA (Waikato Environment for Knowledge Analysis) software as the tool for testing in order to obtain the disease patterns of hypertension. The findings show that, in predicting hypertension as a comorbid illness, the most influential attribute is heart failure.

Estudio de la gestión del presupuesto en el ISMMM utilizando minería de datos [Study of budget management in ISMMM using data mining]

Ventana informatica ◽

10.30554/ventanainform.29.246.2013 ◽

2014 ◽

Author(s):

Yiezenia ROSARIO FERRER ◽

Yanelis GÉ GUILARTE

Keyword(s):

Data Mining ◽

Analysis Tool ◽

Standard Process ◽

Budget Planning ◽

Useful Knowledge ◽

Industry Standard ◽

Knowledge Analysis ◽

Hidden Knowledge ◽

Data Analysis Tool ◽

Using Data

Budget planning allows enterprises, governments and institutions to set priorities and to assess how well their objectives are being met. In higher education institutions, budget planning is based on their core processes. In recent years the capacity to generate and store data has grown rapidly, so the volume and variety of information have also increased. As a result, the available tools can no longer analyze this information and transform it into useful knowledge. This research aims to discover hidden knowledge in budget data stored in Excel files from 2004 to 2011, using the data mining techniques of linear regression, association and classification, to obtain models that improve the budget preparation process at ISMMM. Knowledge discovery followed the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology and used the data analysis tool WEKA (Waikato Environment for Knowledge Analysis). Keywords: budget management, CRISP-DM, data mining, WEKA.

Minería de datos para la toma de decisiones en la unidad de nivelación y admisión universitaria ecuatoriana [Data mining for decision-making in the Ecuadorian university leveling and admission unit]

Cumbres ◽

10.48190/cumbres.v4n2a5 ◽

2019 ◽

Vol 4 (2) ◽

pp. 55-67

Author(s):

María Isabel Uvidia Fassler ◽

Andrés Santiago Cisneros Barahona ◽

Pablo Martí Méndez Naranjo ◽

Henry Mauricio Villa Yánez

Keyword(s):

Data Mining ◽

Knowledge Analysis ◽

Decision Making

Data mining (DM) was applied in this research: through the selection of algorithms and the analysis of information, patterns were obtained that, once observed and examined, became knowledge for decision-making in the Leveling and Admission Unit (Unidad de Nivelación y Admisión) of the Escuela Superior Politécnica de Chimborazo (ESPOCH). The data have been generated at ESPOCH since 2012 in strict compliance with the regulations in force at the time, and this process produced a large amount of unprocessed information that had not contributed to decision-making. Applying data mining in the Waikato Environment for Knowledge Analysis (WEKA) made it possible to analyze predictive algorithms for classification (decision trees and neural networks) and for regression (linear regression and sequential minimal optimization). The aim of the research was to generate knowledge of application trends by area, gender and year, and ultimately to obtain predictions. Finally, an analysis of statistical parameters identified the best algorithms, those that ensured reliable information and generated knowledge for academic decision-making: Bayesian networks and sequential minimal optimization. Keywords: data mining, WEKA, Ecuadorian university.

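Caracterização de Cicatrizes de Queimadas nas Mesorregiões do Sertão e São Francisco Pernambucano a partir de dados do Sensor MODIS [Characterization of Burning Scars in the Sertão and São Francisco Pernambucano Mesoregions from MODIS Sensor Data]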

Revista Brasileira de Geografia Física ◽

10.26848/rbgf.v14.2.p881-996 ◽

2021 ◽

Vol 14 (2) ◽

pp. 881

Author(s):

José Rafael Ferreira de Gouveia ◽

Cristina Rodrigues Nascimento ◽

José Galdino de Oliveira Júnior ◽

Geber Barbosa de Albuquerque Moura ◽

Pabrício Marcos Oliveira Lopes

Keyword(s):

Remote Sensing ◽

Data Mining ◽

Hot Spots ◽

Natural Regeneration ◽

Vegetation Index ◽

Knowledge Analysis ◽

Vegetation Indexes ◽

Moderate Resolution ◽

Resolution Imaging ◽

Moderate Resolution Imaging Spectroradiometer

The Sertão and São Francisco Pernambucano mesoregions have a semi-arid climate, hot and dry with high temperatures and irregular rainfall, which can affect agricultural production. The predominant biome of the region is the Caatinga, which has suffered over the years from various anthropic actions, including burning events in addition to deforestation. The purpose of this article was to map, characterize and quantify the incidence of hot spots in these mesoregions, as well as the capacity for recovery and/or natural regeneration of the vegetation, through remote sensing and data mining techniques. Images from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor aboard the TERRA platform were used to analyze the state of the vegetation before, during and after burning. To assess the conditions necessary for natural regeneration of the vegetation cover, the data mining software Waikato Environment for Knowledge Analysis (WEKA) was used to cross the Normalized Difference Vegetation Index (NDVI) data with local precipitation. The results demonstrate an increase in the occurrence of hot spots over the analyzed period. There is a 91.76% correlation between NDVI during the burning event and NDVI 48 days after it. In addition, the NDVI parameters 30 and 48 days after burning showed a correlation coefficient of 83.96%. Therefore, remote sensing and data mining techniques made it possible to evaluate the relationships between NDVI and local precipitation that allow plant regeneration to occur. Keywords: remote sensing, vegetation indexes, hot spots, data mining.
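
The correlation figures quoted in the abstract are the kind of value a Pearson correlation coefficient produces. The sketch below shows that calculation under the assumption that Pearson's r is the measure meant, which the abstract does not state; the NDVI series are invented toy numbers, not the study's MODIS data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical NDVI values per burn scar; assumptions for illustration only.
ndvi_during_burn = [0.21, 0.25, 0.30, 0.28]
ndvi_48_days_after = [0.35, 0.40, 0.47, 0.42]

r = pearson(ndvi_during_burn, ndvi_48_days_after)
print(f"correlation: {r:.2%}")
```

A coefficient near 1 means the two series rise and fall together; it says nothing by itself about which variable drives the other.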

Members’ Behavior in Virtual Learning Community: A Study Using Data Mining Approach

Computer and Information Science ◽

10.5539/cis.v7n4p1 ◽

2014 ◽

Vol 7 (4) ◽

pp. 1

Author(s):

Xiaokang Li ◽

Yu Nie ◽

Min Chen ◽

Xiaoqing Liu ◽

Xiaolei Liu

Keyword(s):

Data Mining ◽

Learning Communities ◽

Learning Community ◽

Association Rule ◽

Rule Learning ◽

Virtual Learning ◽

Analysis Method ◽

Virtual Learning Community ◽

Knowledge Analysis ◽

Quantitative Analysis Method

Purpose: With the development of information technology, the online virtual learning community is becoming an important way for people to construct and share knowledge. Research on virtual learning communities is important not only for establishing and managing such communities but also for understanding the future development of online learning. However, current research on virtual learning communities is inadequate, and quantitative analysis methods in particular are rarely applied. This study uses the quantitative analysis methods of data mining to study members' behavior in online learning communities. Method: Discussion data (posts) from five online English virtual learning communities in China were sampled and collected. The data were processed according to a series of guidelines to obtain suitable data files, which were then opened and preprocessed in the Waikato Environment for Knowledge Analysis (WEKA). Next, WEKA's association rule learning module was used to mine the processed data, yielding a series of potential behavior rules in these communities. Some of the rules are listed in the article and their meaning analyzed. Findings: The results show that, in this setting, it is feasible to apply association rule learning to virtual learning communities. Value: The study provides approaches and lays the foundation for future related studies.
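
Association rule learning ranks candidate rules by support and confidence. The following sketch computes both for a single rule over a handful of invented member-behavior records; the behavior tags are hypothetical and do not come from the communities studied, and a full miner such as Apriori would also enumerate the candidate itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs at least once, otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical behavior tags per member; assumptions for illustration only.
members = [
    {"asks_question", "replies"},
    {"asks_question", "replies", "shares_link"},
    {"asks_question"},
    {"replies"},
]

rule_support = support(members, {"asks_question", "replies"})
rule_confidence = confidence(members, {"asks_question"}, {"replies"})
print(rule_support, rule_confidence)
```

A rule is usually kept only when both values clear user-chosen minimum thresholds.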

Perbandingan Algoritma K-Means dan EM untuk Clusterisasi Nilai Mahasiswa Berdasarkan Asal Sekolah [Comparison of the K-Means and EM Algorithms for Clustering Student Grades by School of Origin]

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.31 ◽

2015 ◽

Vol 1 (4) ◽

pp. 316 ◽

Cited By ~ 1

Author(s):

Mardiani Mardiani

Keyword(s):

Data Mining ◽

Decision Making ◽

Spatial Clustering ◽

Sql Server ◽

Cleaning Process ◽

Tabular Form ◽

Management Information ◽

Process Data ◽

Knowledge Analysis ◽

Analysis Of Results

Among the functionalities of data mining, clustering is used here to group students based on their grades. Clustering is performed with two existing algorithms, K-Means and EM (Expectation-Maximization). After the data are cleaned using SQL Server 2008, the data in tabular form are processed with WEKA (Waikato Environment for Knowledge Analysis) to obtain the results. The result of the study is clustered information on which schools have the potential to produce graduates with good grades. The grouping consists of three clusters: high, medium and low grades. The grouping is also based on location, which is referred to as spatial clustering. The grouped data are then analyzed. The information obtained can be used for decision-making in education by students and by the management of STMIK MDP. For STMIK MDP management, the information is useful for identifying which schools contribute the students with the highest grades.

Usage Analysis of Smartphone with Hierarchical Clustering

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1041.1091s19 ◽

2019 ◽

Vol 9 (1S) ◽

pp. 222-227

Keyword(s):

Data Mining ◽

Mobile Phone ◽

Open Source Software ◽

Clustering Algorithms ◽

Data Mining Algorithms ◽

Knowledge Analysis ◽

Functional Attributes ◽

Usage Analysis ◽

Mining Algorithms ◽

Analyze Data

Functional and non-functional attributes are important factors of a smartphone. This paper deals with how to group the characteristics of mobile phones using clustering algorithms and then reports the results obtained after classifying with the related algorithms. All functional and non-functional attributes of the smartphone are classified using recent data mining algorithms in WEKA (Waikato Environment for Knowledge Analysis), an open-source software package for analyzing data.

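Implementasi Algoritma C4.5 Mengetahui Penyebab Perceraian Dalam Pernikahan (Studi Kasus: Pengadilan Agama Medan Kelas I-A) [Implementation of the C4.5 Algorithm to Identify the Causes of Divorce in Marriage (Case Study: Medan Class I-A Religious Court)]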

JURIKOM (Jurnal Riset Komputer) ◽

10.30865/jurikom.v7i3.2133 ◽

2020 ◽

Vol 7 (3) ◽

pp. 365

Author(s):

Wulan Juni Andari ◽

Efori Buulolo

Keyword(s):

Machine Learning ◽

Data Mining ◽

Open Source ◽

Machine Learning Algorithms ◽

Structure Representation ◽

History Data ◽

Knowledge Analysis ◽

C4.5 Algorithm ◽

Search History ◽

One Act

Divorce is an act that is hated by Allah. It is nevertheless permissible if a husband and wife can no longer live together; when both parties have reached a deadlock in reconciling, the process ends with a divorce decision. Data mining is the automated analysis of large or complex amounts of data with the aim of finding important patterns or trends. In this study, data mining is used to classify the causes of divorce with the C4.5 algorithm: data are collected and classified using a tree-structured representation in which each internal node represents an attribute, each branch represents a value of that attribute, and each leaf represents a class. The tree is built from historical data so that the causes of divorce can be classified based on previous cases, yielding a set of interrelated rules. The tool used is Weka (the Waikato Environment for Knowledge Analysis), an open-source application with four interfaces: Explorer, Experimenter, KnowledgeFlow and Simple CLI. Its advantage is its large collection of data mining and machine learning algorithms, the results of which are used to make predictions on a data set.
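
The tree representation described above (attributes at internal nodes, attribute values on branches, classes at leaves) can be sketched as nested dictionaries. The tree below is hand-written with hypothetical attribute names and class labels; it is not a tree learned by C4.5 and does not reflect the court data.

```python
# Internal node: {"attribute": name, "branches": {value: subtree}}.
# Leaf: a plain string holding the predicted class.
# All names and labels below are hypothetical placeholders.
tree = {
    "attribute": "marriage_length",
    "branches": {
        "short": "cause_A",
        "long": {
            "attribute": "has_children",
            "branches": {"yes": "cause_B", "no": "cause_A"},
        },
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following the branch for each attribute value."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node


def extract_rules(node, conditions=()):
    """List every root-to-leaf path as a (conditions, predicted class) rule."""
    if not isinstance(node, dict):
        return [(conditions, node)]
    rules = []
    for value, child in node["branches"].items():
        rules.extend(extract_rules(child, conditions + ((node["attribute"], value),)))
    return rules


print(classify(tree, {"marriage_length": "long", "has_children": "yes"}))
for conditions, label in extract_rules(tree):
    print(conditions, "->", label)
```

Reading each root-to-leaf path as an if-then rule is how a decision tree yields the kind of rule set the abstract mentions.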
