ENHANCED APPROACH FOR SOIL CLASSIFICATION USING BOOSTED C5.0 DECISION TREE ALGORITHM

2021 ◽  
Vol 12 (1) ◽  
pp. 11-21
Author(s):  
Senthil Kumar Seethapathy ◽  
C.Naveeth Babu

Data mining uses sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in very large data sets. Data mining tools can incorporate statistical models, machine learning methods such as neural networks or decision trees, and mathematical algorithms. As a result, data mining involves more than collecting and managing data: it also includes analysis and prediction. The main objective of data mining is to identify valid, novel, potentially useful, and understandable correlations and patterns in existing data. Finding and analyzing useful patterns in data is known by different names (e.g., knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing). The term data mining is used mainly by statisticians, database researchers, and the business community.

2014 ◽  
Vol 538 ◽  
pp. 460-464
Author(s):  
Xue Li

Drawing on the interrelation and mutual permeation of disciplines, the author applies information science to cognitive linguistics to provide a new perspective for the study of foreign languages. Using data mining, the paper analyzes the correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners’ past achievement) and explores the extent to which these factors affect self-efficacy in language learning. The analysis uses the C5.0 decision tree algorithm in SPSS Clementine. The results are as follows. Increased anxiety is bound to weaken learners’ motivation over time, and learners in that situation evidently have low self-efficacy. Employing strategies is very important in foreign language learning. Neglecting learning strategies may result in unplanned learning with unsatisfactory achievement despite greater effort, and self-efficacy in foreign language learning may be weakened accordingly. Learners’ past achievement is a reference dimension in measuring self-efficacy, with weaker influence.


Author(s):  
Anita Febriani ◽  
Tiara Trimadya Rahmawati ◽  
Eka Sabna

The Blood Transfusion Unit (UTD) of PMI Pekanbaru City is an agency that serves blood donation; every blood bag comes from members of the community who voluntarily come to PMI to donate blood for humanitarian purposes. UTD PMI Pekanbaru City has requirements that prospective donors must meet before they can donate blood. Data mining is a combination of several computer science disciplines, defined as the process of discovering new patterns in massive data sets. This study uses RapidMiner software and the C4.5 decision tree algorithm to determine the eligibility of blood donors based on age, weight, hemoglobin, and blood pressure. In the study, hemoglobin is the most decisive variable for donor eligibility. The resulting accuracy is 94.02%, which indicates that the model performs very well.


Data mining is a technique used to retrieve information for the analysis and discovery of hidden trends in large data sets. Data mining extends to numerous areas such as education, banking, marketing, retail, communications and agriculture. Agriculture is the backbone of a country’s economy and an important source of livelihood. It depends primarily on weather, geology, soil and biology. Agricultural mining is a technology that can contribute information for the growth of agriculture. The current study presents various data mining techniques and their role in soil fertility and nutrient analysis. The decision tree is a well-known data mining classification approach. C4.5 and Iterative Dichotomiser 3 (ID3) are two widely used decision tree algorithms for classification. The C4.5, ID3 and proposed classifiers were trained on the soil sample data set using the optimal soil parameters pH (power of hydrogen), EC (electrical conductivity) and ESP (exchangeable sodium percentage). The model is evaluated using a collection of soil sample test results. Soil classification is the division of soils into classes or groups, each having similar characteristics and likely similar behavior. Soil classification makes it easy for the farmer to know the type of soil and to cultivate crops suited to it.
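As a rough illustration of how the two named algorithms differ when choosing a split, the sketch below computes information gain (the ID3 criterion) and gain ratio (the C4.5 criterion) for a single numeric threshold. It is a minimal sketch, not the method of the study: the pH readings, the class labels "normal" and "alkaline", the thresholds, and the function names are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def split_scores(values, labels, threshold):
    """Return (information gain, gain ratio) for the binary split value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        # A split that leaves one side empty separates nothing.
        return 0.0, 0.0
    total = len(labels)
    remainder = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    gain = entropy(labels) - remainder  # criterion used by ID3
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(["left"] * len(left) + ["right"] * len(right))
    return gain, gain / split_info  # second value is the C4.5 gain ratio


# Hypothetical pH readings and made-up class labels, for illustration only.
ph = [5.5, 6.0, 6.5, 7.0, 8.0, 8.5, 9.0, 9.5]
soil_class = ["normal"] * 4 + ["alkaline"] * 4

clean_gain, clean_ratio = split_scores(ph, soil_class, threshold=7.5)
poor_gain, poor_ratio = split_scores(ph, soil_class, threshold=6.25)
```

On this toy data the threshold 7.5 separates the two classes completely, so it scores higher on both criteria than the threshold 6.25. A full tree learner would repeat this comparison over every attribute (for example pH, EC and ESP) and every candidate threshold at each node.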


Author(s):  
Fransiskus Ginting ◽  
Efori Buulolo ◽  
Edward Robinson Siagian

Data mining is the discovery of information by extracting patterns and trends from very large amounts of stored data to support future decision making. Classification techniques determine patterns from a collected set of records (the training set). Regional income is generally derived from local taxes and levies. Local taxes are one source of regional funding, but on the national average they have not yet made a large contribution to local revenue. Regional revenue data can be used to forecast future regional revenue so that the forecasts match reality and the planned RAPBD can run smoothly. Simple Linear Regression (SLR) is a statistical method used in production to make predictions about characteristics of quality and quantity; here it describes the processes associated with processing regional income data. In the testing phase, Visual Basic .NET helps process valid regional revenue data. Keywords: Data Mining, Local Revenue, Simple Linear Regression Algorithm, Visual Basic net 2008
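For readers unfamiliar with simple linear regression, the following minimal sketch fits a line y = a + b·x by ordinary least squares and uses it for a one-step forecast. The year indices, revenue figures, and function name are hypothetical and are not taken from the study, which reports a Visual Basic .NET implementation rather than Python.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the sum of cross-deviations divided by the sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: year index and revenue in arbitrary units.
years = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

intercept, slope = fit_simple_linear_regression(years, revenue)
forecast_year_6 = intercept + slope * 6
```

Because the made-up revenue series grows by exactly two units per year, the fitted slope is 2 and the forecast for year 6 continues that trend; real revenue data would scatter around the fitted line.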


Author(s):  
Sarasij Das ◽  
Nagendra Rao P S

This paper is the outcome of an attempt to mine recorded power system operational data in order to gain new insight into practical power system behavior. Data mining, in general, is essentially finding new relations between data sets by analyzing well-known or recorded data. In this effort we make use of the recorded data of the Southern regional grid of India. Some interesting relations at the total system level between frequency, total MW/MVAr generation, and average system voltage have been obtained. The aim of this work is to highlight the potential of data mining for power system applications and also some of the concerns that need to be addressed to make such efforts more useful.


2013 ◽  
Vol 284-287 ◽  
pp. 3070-3073
Author(s):  
Duen Kai Chen

In this study, we report an intelligent system for voting behavior analysis based on data mining technology. Previous literature shows an increasing number of studies that apply information technology to facilitate voting behavior analysis. We built a likely voter identification model using data mining technology; the classification algorithm constructs a decision tree model to identify voters and non-voters. The model is evaluated by its accuracy and by the number of attributes used to correctly identify likely voters. Our goal is to use only a small number of survey questions while maintaining the accuracy rates of other similar models. The model was built and tested on Taiwan’s Election and Democratization Study (TEDS) data sets. According to the experimental results, the proposed model can improve the likely voter identification rate, and this finding is consistent with previous studies based on the American National Election Studies.

