Data Mining Approach for Educational Decision Support

Data mining techniques in the education sector have begun to evolve along with advances in technology and the growing amount of data that can be stored in educational database systems. One such database holds Bidikmisi scholarship data in Indonesia. The Bidikmisi data used in this study are classified using a classification data mining technique: random forest in combination with boosting and bagging algorithms. These algorithms are also combined with the SMOTE algorithm to handle class imbalance in the dataset. Based on the G-mean and AUC performance criteria, the algorithms combined with SMOTE tended to perform better. The classification accuracy of each method was more than 90%.
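The two performance criteria named in the abstract can be sketched as follows. This is a minimal illustration, assuming binary labels and the usual definitions (G-mean as the geometric mean of sensitivity and specificity, AUC as the probability that a random positive instance is scored above a random negative one). The function names and the toy labels are hypothetical and are not taken from the study.

```python
from math import sqrt

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumes both classes occur in y_true (otherwise a division by zero occurs).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sqrt(sensitivity * specificity)

def auc(y_true, scores, positive=1):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced example (hypothetical data): always predicting the majority
# class reaches 80% accuracy but a G-mean of 0.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
majority_g_mean = g_mean(y_true, always_negative)
```

This is why accuracy alone can be misleading on imbalanced data: a classifier that ignores the minority class still scores well on accuracy, while G-mean drops to zero.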


A Hybrid Data Mining Technique for Improving the Classification Accuracy of Microarray Data Set

International Journal of Information Engineering and Electronic Business ◽

10.5815/ijieeb.2012.02.07 ◽

2012 ◽

Vol 4 (2) ◽

pp. 43-50 ◽

Cited By ~ 10

Author(s):

Sujata Dash ◽

Bichitrananda Patra ◽

B.K. Tripathy

Keyword(s):

Data Mining ◽

Microarray Data ◽

Classification Accuracy ◽

Data Mining Technique ◽

Data Set ◽

Mining Technique ◽

Hybrid Data


Landslide susceptibility assessment in Lianhua County (China): A comparison between a random forest data mining technique and bivariate and multivariate statistical models

Geomorphology ◽

10.1016/j.geomorph.2016.02.012 ◽

2016 ◽

Vol 259 ◽

pp. 105-118 ◽

Cited By ~ 154

Author(s):

Haoyuan Hong ◽

Hamid Reza Pourghasemi ◽

Zohre Sadat Pourtaghi

Keyword(s):

Data Mining ◽

Random Forest ◽

Statistical Models ◽

Landslide Susceptibility ◽

Susceptibility Assessment ◽

Data Mining Technique ◽

Multivariate Statistical ◽

Landslide Susceptibility Assessment ◽

Mining Technique


A Note on Breiman's Random Forest Data Mining Technique and Conventional Cox Modeling of Survival Statistics: The Case of the Phantom “Induct” Covariate in the Ohio State University Kidney Transplant Database

Communications in Statistics - Theory and Methods ◽

10.1080/03610920601126431 ◽

2007 ◽

Vol 36 (10) ◽

pp. 1953-1964 ◽

Cited By ~ 3

Author(s):

Ronald P. Pelletier ◽

George T. Diderrich

Keyword(s):

Data Mining ◽

Random Forest ◽

Kidney Transplant ◽

Ohio State University ◽

State University ◽

Data Mining Technique ◽

Mining Technique ◽

The Ohio State University


Salary Estimator using Data Science

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtst061259 ◽

2020 ◽

Vol 6 (12) ◽

pp. 319-322

Author(s):

Winner Walecha and Dr. Bhoomi Gupta

Keyword(s):

Data Mining ◽

Random Forest ◽

Data Science ◽

The Internet ◽

Prediction System ◽

Lasso Regression ◽

Data Mining Technique ◽

Mining Technique ◽

Number Of Factors ◽

Using Data

This paper presents a salary prediction system that uses job listings from an employment website, in this case Glassdoor.com. A data mining technique is used to build a model: a number of job listings are scraped from the website and cleaned on the basis of several factors, including rival companies, revenue, and required skills, in order to predict the salary to expect when applying for a data science job. Linear regression, lasso regression, and random forest regressors are tuned using GridSearchCV to find the best model. The model can be further extended with a Flask API so that it can be deployed on the internet for public use.
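The idea behind a cross-validated grid search can be sketched in plain Python. This is not the paper's pipeline and does not call scikit-learn: as a stand-in, it tunes the penalty of a one-feature ridge regression (a swapped-in model chosen only because it has a closed form), and every function name and data value here is hypothetical.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Closed-form slope for y ~ w * x with an L2 penalty alpha (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def cv_mse(xs, ys, alpha, k=3):
    """Mean squared error of the ridge fit, averaged over k held-out folds."""
    n = len(xs)
    total = 0.0
    for i in range(k):
        test_idx = set(range(i, n, k))  # every k-th sample is held out
        train_x = [x for j, x in enumerate(xs) if j not in test_idx]
        train_y = [y for j, y in enumerate(ys) if j not in test_idx]
        w = fit_ridge_1d(train_x, train_y, alpha)
        total += sum((ys[j] - w * xs[j]) ** 2 for j in test_idx)
    return total / n

def grid_search(xs, ys, alphas, k=3):
    """Score every candidate alpha by cross-validation and keep the best one."""
    scores = {a: cv_mse(xs, ys, a, k) for a in alphas}
    best = min(scores, key=scores.get)
    return best, scores

# Toy, noise-free data (hypothetical): y is exactly 2 * x, so no penalty is best.
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * x for x in xs]
best_alpha, scores = grid_search(xs, ys, alphas=[0.0, 1.0, 10.0])
```

The same loop structure applies when the candidates are full model configurations rather than a single penalty value: each candidate is scored on held-out folds only, and the one with the best average score is kept.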


Predicting Diabetes Disease using Random Forest Tree (Rft) Data Mining Technique

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d1019.1284s519 ◽

2020 ◽

Vol 8 (4S5) ◽

pp. 46-48

Keyword(s):

Data Mining ◽

Random Forest ◽

Blood Sugar ◽

Energy Use ◽

Primary Source ◽

Organ Damage ◽

Forest Tree ◽

Data Mining Technique ◽

Mining Technique ◽

Random Forest Tree

Diabetes is a condition that occurs when blood glucose, also known as blood sugar, is too high. Blood sugar is the body's primary source of energy, and it comes from the food you eat. Insulin, a pancreatic hormone, helps glucose from food get into the cells to be used for energy. The term also appears in the name of an unrelated condition, "diabetes insipidus", which involves problems with fluid processing in the kidneys. Insulin is the key to the cell's ability to use glucose. Problems with the processing of insulin, or with how cells respond to insulin, can easily throw the body's carefully balanced glucose metabolism out of control [1]. When either of these problems occurs, diabetes emerges: blood sugar levels rise and crash, and the risk of organ damage increases. Early prediction of diabetes could enable proper treatment and protect people from avoidable illness. For this prediction we can apply data mining, which is widely used in healthcare organizations for decision making and disease detection. In this paper, data were collected from the UCI repository, and the data mining tool WEKA is used to predict diabetes. The dataset contains 768 instances, of which 500 tested negative and 268 tested positive. An experimental study is carried out using a classification technique, the Random Forest Tree (RFT) classifier, to predict diabetes. We tried different numbers of cross-validation folds and found that k = 8 gave the highest accuracy, 76.69%, compared with the other values tried.
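To make the cross-validation setup concrete, the sketch below splits 768 instances into k = 8 folds and computes the accuracy of always predicting the majority class (500 of 768 negative). It only illustrates the splitting scheme and is not WEKA's implementation; the function name and the seed are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]    # k disjoint, near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 768 instances and k = 8, as reported in the abstract.
splits = list(k_fold_indices(768, 8))
fold_sizes = [len(test) for _, test in splits]   # 96 test instances per fold

# Accuracy of always predicting "tested negative" (500 of 768), about 0.651.
majority_baseline = 500 / 768
```

Each instance is used for testing exactly once and for training in the other seven folds. The majority-class baseline of about 65.1% gives a reference point for reading the reported 76.69% accuracy.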


Estimating the soil water retention curve: Comparison of multiple nonlinear regression approach and random forest data mining technique

Computers and Electronics in Agriculture ◽

10.1016/j.compag.2020.105502 ◽

2020 ◽

Vol 174 ◽

pp. 105502

Author(s):

M. Rastgou ◽

H. Bayat ◽

M. Mansoorizadeh ◽

Andrew S. Gregory

Keyword(s):

Data Mining ◽

Random Forest ◽

Water Retention ◽

Water Retention Curve ◽

Soil Water Retention Curve ◽

Soil Water Retention ◽

Data Mining Technique ◽

Retention Curve ◽

Mining Technique ◽

Regression Approach


Boosted Regression (Boosting): An Introductory Tutorial and a Stata Plugin

The Stata Journal: Promoting communications on statistics and Stata ◽

10.1177/1536867x0500500304 ◽

2005 ◽

Vol 5 (3) ◽

pp. 330-354 ◽

Cited By ~ 92

Author(s):

Matthias Schonlau

Keyword(s):

Data Mining ◽

Logistic Regression ◽

Predictive Accuracy ◽

Stepwise Logistic Regression ◽

Data Mining Technique ◽

Test Dataset ◽

Mining Technique ◽

Considerable Success ◽

Boosting Algorithm ◽

Gaussian Regression

Boosting, or boosted regression, is a recent data-mining technique that has shown considerable success in predictive accuracy. This article gives an overview of boosting and introduces a new Stata command, boost, that implements the boosting algorithm described in Hastie, Tibshirani, and Friedman (2001, 322). The plugin is illustrated with a Gaussian and a logistic regression example. In the Gaussian regression example, the R2 value computed on a test dataset is R2 = 21.3% for linear regression and R2 = 93.8% for boosting. In the logistic regression example, stepwise logistic regression correctly classifies 54.1% of the observations in a test dataset versus 76.0% for boosted logistic regression. Currently, boost accommodates Gaussian (normal), logistic, and Poisson boosted regression. boost is implemented as a Windows C++ plugin.
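The test-set R2 figures quoted in the abstract can be read against the usual coefficient-of-determination formula, sketched below. This assumes the common definition, 1 minus the ratio of residual to total sum of squares; the article may compute it differently. The function name and the numbers are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot on held-out data."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)              # total variation
    return 1 - ss_res / ss_tot

# Toy test set (hypothetical values).
y_test = [1.0, 2.0, 3.0, 4.0]
close_fit = r_squared(y_test, [1.5, 2.0, 3.0, 3.5])    # small residuals
mean_only = r_squared(y_test, [2.5, 2.5, 2.5, 2.5])    # predicts the mean everywhere
```

A model that only predicts the test-set mean scores 0 under this definition, so a jump from 21.3% to 93.8% means far less unexplained variation on the test data.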


Data Mining Approach in Preterm Birth Prediction

Mapana Journal of Sciences ◽

10.12723/mjs.16.3 ◽

2010 ◽

Vol 9 (1) ◽

pp. 18-30 ◽

Cited By ~ 1

Author(s):

Jyothi Thomas ◽

G. Kulanthaivel

Keyword(s):

Machine Learning ◽

Data Mining ◽

Neural Networks ◽

Preterm Birth ◽

Field Data ◽

Risk Groups ◽

Data Mining Technique ◽

Mining Technique ◽

Data Mining Approach ◽

Health Field

Data mining refers to the process of discovering patterns in data, typically with the aid of powerful algorithms that automate part of the search. These methods come from disciplines such as statistics, machine learning, pattern recognition, neural networks, and databases. In particular, this paper shows how the problem of preterm birth prediction is approached by a data mining analyst with a background in machine learning. In the health field, data mining applications have been growing considerably, as they can be used to directly derive patterns that are relevant for forecasting different risk groups among patients. Data mining techniques such as clustering have not been used to predict preterm birth. Hence this paper attempts to identify patterns in a database of preterm birth patients using clustering.


Predict Students' Performance by Using J48 Algorithm

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset2073124 ◽

2020 ◽

pp. 578-582

Author(s):

Myo Thandar Tun ◽

Yin Yin Htay

Keyword(s):

Data Mining ◽

Classification Accuracy ◽

Critical Issue ◽

Academic Community ◽

Classification Algorithms ◽

Data Mining Technique ◽

Data Set ◽

Mining Technique ◽

Computer Studies ◽

Machine Learning Tool

A critical issue for the academic community in higher education is monitoring the progress of students' academic performance. Data mining techniques can be used for this purpose. The J48 algorithm is one of the best-known classification algorithms for generating decision trees. The data set used in this study comes from the University of Computer Studies (Mandalay). The Weka machine learning tool is applied to perform the classification. In this work, the classification accuracy of the results was computed. The J48 classification algorithm achieves an accuracy of 78.2%.
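J48 is Weka's implementation of the C4.5 decision tree algorithm, which chooses split attributes by gain ratio. The sketch below computes that criterion for a categorical attribute. It is an illustration only, not Weka's code, and the attribute values and pass/fail labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(feature_values, labels):
    """Information gain of splitting on the feature, divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)  # penalizes many-valued attributes
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical student records: one attribute separates the classes perfectly,
# the other carries no information about the outcome.
labels = ["pass", "pass", "fail", "fail"]
informative = gain_ratio(["high", "high", "low", "low"], labels)
uninformative = gain_ratio(["high", "low", "high", "low"], labels)
```

At each node the tree builder would pick the attribute with the highest gain ratio, here the first one, and recurse on the resulting subsets.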


Land-Subsidence Spatial Modeling Using the Random Forest Data-Mining Technique

Spatial Modeling in GIS and R for Earth and Environmental Sciences ◽

10.1016/b978-0-12-815226-3.00006-5 ◽

2019 ◽

pp. 147-159 ◽

Cited By ~ 4

Author(s):

Hamid Reza Pourghasemi ◽

Mohsen Mohseni Saravi

Keyword(s):

Data Mining ◽

Random Forest ◽

Land Subsidence ◽

Spatial Modeling ◽

Data Mining Technique ◽

Mining Technique
