ENHANCED APPROACH FOR SOIL CLASSIFICATION USING BOOSTED C5.0 DECISION TREE ALGORITHM

Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. Data mining tools can incorporate statistical models, machine learning methods such as neural networks or decision trees, and mathematical algorithms. As a result, data mining involves more than collecting and managing data: it also includes analysis and prediction. The main objective of data mining is to identify valid, novel, potentially useful, and understandable correlations and patterns in existing data. Finding and analyzing useful patterns in data is known by different names (e.g., knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing). The term data mining is mainly used by statisticians, database researchers, and the business community.

An Analysis of Factors Influencing Foreign Language Self-Efficacy Based on C5.0 Decision Tree Algorithm in Data Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.538.460 ◽

2014 ◽

Vol 538 ◽

pp. 460-464

Author(s):

Xue Li

Keyword(s):

Data Mining ◽

Decision Tree ◽

Foreign Language ◽

Language Learning ◽

Learning Strategies ◽

Self Efficacy ◽

Foreign Language Learning ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

C5.0 Decision Tree

Drawing on the inter-correlation and permeability among disciplines, the author attempts to apply information science to cognitive linguistics in order to provide a new perspective for the study of foreign languages. Data mining is used to analyze the correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners’ past achievement) and to explore the extent to which these factors affect self-efficacy in language learning. The study analyzes the data with the C5.0 decision tree algorithm in SPSS Clementine. The results are as follows. Increased anxiety is bound to weaken learners’ motivation over time, and it is evident that such learners have low self-efficacy. Employing strategies is very important in foreign language learning: neglecting learning strategies may result in unplanned learning with unsatisfactory achievement despite greater effort, and self-efficacy in foreign language learning may be weakened accordingly. Learners’ past achievement is a reference dimension in measuring self-efficacy, with a weaker influence.

Implementation of Data Mining to Predict The Feasibility of Blood Donors Using C4.5 Algorithm

Indonesian Journal of Artificial Intelligence and Data Mining ◽

10.24014/ijaidm.v1i1.4562 ◽

2018 ◽

Vol 1 (1) ◽

pp. 41

Author(s):

Anita Febriani ◽

Tiara Trimadya Rahmawati ◽

Eka Sabna

Keyword(s):

Data Mining ◽

Blood Transfusion ◽

Blood Donors ◽

Blood Donation ◽

Massive Data ◽

Data Sets ◽

Decision Tree Algorithm ◽

C4.5 Algorithm ◽

Blood Bag ◽

A Company

The Blood Transfusion Unit (UTD) of PMI Pekanbaru City is part of an organization that handles blood donation; every blood bag comes from members of the community who voluntarily come to PMI to donate blood for humanitarian purposes. UTD PMI Pekanbaru City has requirements that prospective donors must meet in order to donate blood. Data mining is a combination of several computer science disciplines, defined as the process of discovering new patterns in massive data sets. This study uses RapidMiner software and the C4.5 decision tree algorithm to determine the eligibility of blood donors based on age, weight, hemoglobin, and blood pressure. In the study, hemoglobin is the most decisive variable for donor eligibility. The resulting accuracy is 94.02%, which indicates that the model performs very well.
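
As a rough illustration of how a C4.5-style tree ranks candidate attributes, the sketch below computes the gain ratio (information gain divided by split information) for two attributes. The records, the discretized attribute values, and the function names are hypothetical and hand-made for this example; they are not the study's data, and RapidMiner's internal implementation may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    total = len(rows)
    base = entropy([r[target] for r in rows])

    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])

    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information penalizes attributes with many small partitions.
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())

    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical, hand-made records (NOT data from the study).
donors = [
    {"hemoglobin": "normal", "weight": "ok", "eligible": "yes"},
    {"hemoglobin": "normal", "weight": "low", "eligible": "yes"},
    {"hemoglobin": "normal", "weight": "ok", "eligible": "yes"},
    {"hemoglobin": "low", "weight": "ok", "eligible": "no"},
    {"hemoglobin": "low", "weight": "low", "eligible": "no"},
    {"hemoglobin": "normal", "weight": "low", "eligible": "no"},
]

for attr in ("hemoglobin", "weight"):
    print(attr, round(gain_ratio(donors, attr, "eligible"), 3))
```

On this toy data the attribute with the higher gain ratio would be chosen for the root split; a full tree builder would then repeat the same ranking on each resulting subset.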

Soil Quality Analysis and Crop Fertility Prediction

International Journal for Research in Engineering Application & Management ◽

10.35291/2454-9150.2020.0280 ◽

2020 ◽

pp. 189-193

Keyword(s):

Data Mining ◽

Decision Tree ◽

Large Data ◽

Soil Classification ◽

Quality Analysis ◽

Soil Parameters ◽

Data Sets ◽

Data Set ◽

Nutrient Analysis ◽

Tree Algorithms

Data mining is a technique used to retrieve information for the analysis and discovery of hidden trends in large data sets. It extends to numerous areas such as education, banking, marketing, retail, communications, and agriculture. Agriculture is the backbone of a country’s economy and an important source of livelihood. It depends primarily on weather, geology, soil, and biology. Agricultural mining is a technology that can contribute information for the growth of agriculture. The current study presents various data mining techniques and their role in soil fertility and nutrient analysis. The decision tree is a well-known data mining classification approach. C4.5 and ID3 (Iterative Dichotomiser 3) are two widely used decision tree algorithms for classification. C4.5, ID3, and the proposed classifier have been trained on the soil sample data set, taking into account the optimal soil parameters pH (potential of hydrogen), EC (electrical conductivity), and ESP (exchangeable sodium percentage). The model is evaluated using a collection of soil sample test results. Soil classification is the division of soils into classes or groups, each having similar characteristics and likely similar behavior. It makes it easy for the farmer to know the type of soil and to grow crops suited to that soil type.
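
Soil parameters such as pH, EC, and ESP are numeric, and decision tree learners in the C4.5 family typically handle a numeric attribute by searching for a binary threshold. The sketch below shows one simplified way to do that, testing the midpoints between consecutive sorted values and keeping the one with the highest information gain. The pH readings, the class labels "A" and "B", and the function names are made up for illustration and are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2  # simplification: use the midpoint as candidate
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical pH readings with made-up class labels (NOT data from the study).
ph = [5.5, 6.0, 6.5, 7.0, 8.5, 9.0]
soil_class = ["A", "A", "A", "A", "B", "B"]

threshold, gain = best_threshold(ph, soil_class)
print(threshold, round(gain, 3))
```

In a full tree builder the same search would be run for every numeric attribute at every node, and the attribute and threshold with the best score would define the split.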

A Survey on Preparing Data Sets for Data Mining Analysis using Horizontal Aggregations in SQL

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i4/0199 ◽

2017 ◽

Vol 7 (5) ◽

pp. 172-176

Author(s):

Prashant B. Rajole ◽

Keyword(s):

Data Mining ◽

Data Sets ◽

Data Mining Analysis

IMPLEMENTATION OF THE SIMPLE LINEAR REGRESSION ALGORITHM IN PREDICTING REGIONAL REVENUE (CASE STUDY: DINAS PENDAPATAN KAB. DELI SERDANG)

KOMIK (Konferensi Nasional Teknologi Informasi dan Komputer) ◽

10.30865/komik.v3i1.1602 ◽

2019 ◽

Vol 3 (1) ◽

Author(s):

Fransiskus Ginting ◽

Efori Buulolo ◽

Edward Robinson Siagian

Keyword(s):

Data Mining ◽

Linear Regression ◽

Visual Basic ◽

Large Contribution ◽

Simple Linear Regression ◽

Information Discovery ◽

Regional Income ◽

Local Revenue ◽

Local Taxes ◽

The Future

Data mining is the discovery of information by extracting patterns and trends from very large amounts of data; it supports data storage and future decision making. In determining patterns, classification techniques work by collecting records (the training set). Regional income is generally derived from local taxes and levies. Local taxes are one source of funding for the region, but on the national average they have not yet made a large contribution to local revenue. Regional revenue data can be used to forecast future regional revenue so that the forecasts match reality and the planned RAPBD (draft regional budget) can run smoothly. Simple Linear Regression (SLR) is a statistical method used in production to make predictions about the characteristics of quality and quantity; here it describes the processes associated with data processing for the acquisition of regional income. In the testing phase, Visual Basic .NET helps process valid regional revenue data. Keywords: Data Mining, Local Revenue, Simple Linear Regression Algorithm, Visual Basic .NET 2008
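
For readers unfamiliar with simple linear regression, the following is a minimal sketch of the ordinary least squares fit of y = a + b·x and a one-step forecast. It is written in Python rather than the Visual Basic .NET used in the paper, and the period numbers, revenue figures, and function name are invented for illustration only.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares estimates (intercept a, slope b) for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up yearly revenue figures (arbitrary units), NOT data from the paper.
years = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

intercept, slope = fit_simple_linear_regression(years, revenue)
forecast_year_6 = intercept + slope * 6
print(intercept, slope, forecast_year_6)
```

The toy series is exactly linear, so the fitted line passes through every point; real revenue data would leave residuals, and the forecast should then be read together with an error measure.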

PCA for heterogeneous data sets in a distributed data mining

Proceedings of the Fourth Annual ACM Bangalore Conference on - COMPUTE '11 ◽

10.1145/1980422.1980451 ◽

2011 ◽

Author(s):

E. Chandra ◽

P. Ajitha

Keyword(s):

Data Mining ◽

Heterogeneous Data ◽

Distributed Data Mining ◽

Data Sets ◽

Distributed Data

Understanding Power System Behavior through Mining Archived Operational Data

International Journal of Emerging Electric Power Systems ◽

10.2202/1553-779x.2211 ◽

2009 ◽

Vol 10 (1) ◽

Cited By ~ 1

Author(s):

Sarasij Das ◽

Nagendra Rao P S

Keyword(s):

Data Mining ◽

Power System ◽

System Level ◽

Data Sets ◽

Total System ◽

System Behavior ◽

Average System ◽

Recorded Data ◽

Operational Data ◽

Southern Regional

This paper is the outcome of an attempt to mine recorded power system operational data in order to gain new insight into practical power system behavior. Data mining, in general, is essentially finding new relations between data sets by analyzing well-known or recorded data. In this effort we make use of the recorded data of the Southern regional grid of India. Some interesting relations at the total system level between frequency, total MW/MVAr generation, and average system voltage have been obtained. The aim of this work is to highlight the potential of data mining for power system applications and also some of the concerns that need to be addressed to make such efforts more useful.

Outlier data Mining of large Data Sets relying on fast decomposition simulated annealing algorithm

10.1109/icris52159.2020.00170 ◽

2020 ◽

Author(s):

Wenjie Jia ◽

Zhihong He

Keyword(s):

Data Mining ◽

Simulated Annealing ◽

Simulated Annealing Algorithm ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Annealing Algorithm ◽

Outlier Data ◽

Fast Decomposition

Knowledge Discovery in Large Data Sets: A Primer for Data Mining Applications in Health Care

Health Informatics - Nursing Informatics ◽

10.1007/978-1-4757-3252-8_10 ◽

2000 ◽

pp. 139-148 ◽

Cited By ~ 2

Author(s):

Patricia A. Abbott

Keyword(s):

Data Mining ◽

Health Care ◽

Knowledge Discovery ◽

Large Data ◽

Large Data Sets ◽

Data Sets

Data Mining Based Intelligent System for Voting Behavior Analysis

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.284-287.3070 ◽

2013 ◽

Vol 284-287 ◽

pp. 3070-3073

Author(s):

Duen Kai Chen

Keyword(s):

Data Mining ◽

Behavior Analysis ◽

Voting Behavior ◽

Intelligent System ◽

Data Sets ◽

Tree Model ◽

Mining Technology ◽

Identification Rate ◽

Voter Identification ◽

Election Studies

In this study, we report an intelligent system for voting behavior analysis based on data mining technology. Previous literature shows an increasing number of studies that apply information technology to facilitate voting behavior analysis. We built a likely voter identification model using data mining technology; the classification algorithm constructs a decision tree model to distinguish voters from non-voters. The model is evaluated by its accuracy and by the number of attributes it uses to correctly identify likely voters. Our goal is to use only a small number of survey questions while maintaining the accuracy rates of other similar models. The model was built and tested on Taiwan’s Election and Democratization Study (TEDS) data sets. According to the experimental results, the proposed model can improve the likely voter identification rate, a finding consistent with previous studies based on the American National Election Studies.
