Diagnosis of Various Thyroid Ailments using Data Mining Classification Techniques

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195119 ◽

2019 ◽

pp. 131-136

Author(s):

Umar Sidiq ◽

Syed Mutahar Aaqib ◽

Rafi Ahmad Khan

Keyword(s):

Data Mining ◽

Decision Tree ◽

Research Work ◽

Support Vector ◽

Data Sets ◽

Data Mining Technique ◽

K Nearest Neighbors ◽

Data Set ◽

Classification Techniques ◽

Using Data

Classification is one of the most important supervised learning data mining techniques used to classify predefined data sets. It is mainly used in the healthcare sector for decision making, diagnosis systems, and giving better treatment to patients. In this work, the data set was taken from a recognized laboratory in Kashmir. The entire research work is carried out with ANACONDA3-5.2.0, an open source platform, under a Windows 10 environment. An experimental study is carried out using classification techniques such as k-nearest neighbors, support vector machine, decision tree, and Naïve Bayes. The decision tree obtained the highest accuracy, 98.89%, among the classification techniques.
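
As a rough illustration of one of the classifiers named in this abstract, the sketch below implements k-nearest neighbors with a simple accuracy check. It is not the authors' code: the function names, the two-feature points, and the "normal"/"abnormal" labels are all made up for illustration and do not come from the thyroid data set.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in distances[:k]]
    return Counter(top_labels).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Made-up two-feature points, NOT the paper's thyroid data.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]
test_X = [(1.1, 1.0), (4.0, 4.0)]
test_y = ["normal", "abnormal"]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
print(predictions, accuracy(test_y, predictions))
```

The other classifiers in the comparison would be evaluated the same way: fit on the training portion, predict the held-out portion, and compare accuracies.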

Evaluation of Data Mining Techniques and Its Fusion with IoT Enabled Smart Technologies for Effective Prediction of Available Parking Space

International journal of electrical and computer engineering systems ◽

10.32985/ijeces.12.4.2 ◽

2021 ◽

Vol 12 (4) ◽

pp. 187-197

Author(s):

Anchal Dahiya ◽

Pooja Mittal

Keyword(s):

Data Mining ◽

Decision Tree ◽

Support Vector ◽

Learning Approaches ◽

Parking Space ◽

Data Mining Technique ◽

Data Set ◽

Data Mining Techniques ◽

Hard Times ◽

Smart Technologies

After experiencing the hard times of the pandemic, we learned that a smart system that helps with automatic parking of vehicles could be of great help to society. This idea motivated the current work. IoT techniques are now the buzzword in almost every application domain, and they can also be used to predict free parking space in advance. The biggest challenge with IoT techniques is that they generate large volumes of data, which makes analysis difficult. If IoT techniques are fused with well-performing data mining techniques, more efficient predictions become possible. The main objective of our paper is therefore, first, to select the most appropriate data mining technique based on performance evaluation and, second, to predict available parking space in advance by fusing that technique with IoT techniques. Because of their busy schedules, drivers need information about free parking spaces in advance on their smartphones. With this information, drivers can park their vehicles in the exact location without wasting time, and can maintain social distancing in crowded areas too. Data mining techniques can play an important role in predicting available parking space by extracting only the relevant and important information from the given dataset. For this purpose, a comparative analysis of five data mining techniques (Support Vector Machine, K-Nearest approach, Decision Tree, Random Forest, and Ensemble learning approaches) is carried out on the PKLot data set using the Python language, with Anaconda (Spyder) as the supporting tool for computing the results. The main outcome of the paper is to identify the technique that gives the best results for predicting available space, and to show that fusing data mining techniques with IoT technologies improves the results. The evaluation parameters used to find the best technique are precision, recall, accuracy, and F1-score. The k-fold cross-validation method is used for the numerical calculation of the results. In the empirical results on the PKLot dataset, the decision tree performed best among all the techniques selected for analysis.
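
The k-fold cross-validation step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names, the six toy labels, and the majority-class baseline standing in for a real classifier are all assumptions.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(n_samples, k, score_fn):
    """Average score_fn(train_idx, test_idx), using each fold once as the test set."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(score_fn(train_idx, test_idx))
    return sum(scores) / len(scores)

# Made-up parking-slot labels, NOT taken from the PKLot data.
labels = ["free", "free", "free", "free", "free", "occupied"]

def majority_baseline_accuracy(train_idx, test_idx):
    """Hypothetical stand-in for a classifier: always predict the training majority."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)

mean_accuracy = cross_validate(len(labels), 3, majority_baseline_accuracy)
print(round(mean_accuracy, 3))
```

Any of the compared techniques could be plugged in through `score_fn`, so each one is scored on exactly the same folds.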

Research on Teaching Quality Evaluation Using Data Mining Technique

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.926-930.4582 ◽

2014 ◽

Vol 926-930 ◽

pp. 4582-4585

Author(s):

Ai Feng Li ◽

Ying Hu ◽

Wen Jing Zhao

Keyword(s):

Data Mining ◽

Decision Tree ◽

Association Rule ◽

Quality Evaluation ◽

Teaching Quality ◽

Data Mining Technique ◽

Higher Education System ◽

Potential Factors ◽

Rule Method ◽

Using Data

In this paper, we employ a data mining (DM) technique to analyze various potential factors that impact in-class teaching quality evaluation. Based on an effective dataset, we first use the association rule method to mine the relationship between the teacher’s attributes, such as title, degree, age, seniority, and load, and the in-class teaching quality evaluation results. Then, we construct a decision tree on the course’s attributes to reveal how attributes such as property, credit, week hour, and number of students impact the in-class teaching quality evaluation results. Our mined rules can provide effective guidance for talent development, teaching management, and talent input in the higher education system. Index Terms: data mining, decision tree, association rule, teaching quality evaluation
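
To make the association rule idea concrete, the sketch below computes the two standard rule measures, support and confidence, for a rule of the form "teacher attribute implies evaluation result". The four records and the item names are invented for illustration and are not from the paper's dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0

# Made-up teacher records, each encoded as a set of attribute=value items.
records = [
    {"title=professor", "rating=high"},
    {"title=professor", "rating=high"},
    {"title=lecturer", "rating=high"},
    {"title=lecturer", "rating=low"},
]

print(support(records, {"title=professor", "rating=high"}))
print(confidence(records, {"title=professor"}, {"rating=high"}))
print(confidence(records, {"title=lecturer"}, {"rating=high"}))
```

A rule is usually kept only when both measures exceed user-chosen thresholds, which is how a small set of rules is mined from many candidate attribute combinations.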

Student Performance Predictions Using Knowledge Discovery Database and Data Mining, DPU Students Records as Sample

Academic Journal of Nawroz University ◽

10.25007/ajnu.v10n3a875 ◽

2021 ◽

Vol 10 (3) ◽

pp. 121-127

Author(s):

Bareen Haval ◽

Karwan Jameel Abdulrahman ◽

Araz Rajab

Keyword(s):

Data Mining ◽

Decision Tree ◽

Student Performance ◽

Educational Data Mining ◽

Data Sets ◽

Decision Tree Classifier ◽

Data Mining Techniques ◽

Academic History ◽

Tree Classifier ◽

Using Data

This article presents the results of applying educational data mining techniques to the academic performance of students. Three classification models (Decision Tree, Random Forest, and Deep Learning) were developed to analyze data sets and predict student performance. The predictions of the three classifiers were calculated and compared. The students' academic history and data from the Office of the Registrar were used to train the models. Our analysis aims to evaluate student results using various variables, such as the student's grade. Data from 221 students with 9 different attributes were used. The results of this study are important: they provide a better understanding of student success assessments and stress the importance of data mining in education. The main purpose of this study is to show how student success can be forecast using data mining techniques in order to improve academic programs. The results indicate that the Decision Tree classifier outperforms the other two classifiers, achieving a total prediction accuracy of 97%.

Data Mining Techniques for Identification and Classification of Various Diseases in Plants

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1110.1292s19 ◽

2019 ◽

Vol 9 (2S) ◽

pp. 676-680

Keyword(s):

Neural Network ◽

Data Mining ◽

Nearest Neighbors ◽

Crop Productivity ◽

Vital Role ◽

Support Vector ◽

Data Sets ◽

K Nearest Neighbors ◽

Data Mining Techniques

Data mining is currently used in various applications and plays a vital role in the research community. This paper describes data mining techniques for the preprocessing and classification of various diseases in plants. Because different plants have different diseases, each has different data sets and different objectives for knowledge discovery. Data mining techniques applied to plants help in the segmentation and classification of diseased plants, avoid oral inspection, and help to increase crop productivity. This paper covers various classification techniques, such as K-Nearest Neighbors, Support Vector Machine, Principal Component Analysis, and Neural Network. Among these techniques, the neural network is effective for disease detection in plants.

Soil Data Analysis and Crop Yield Prediction in Data Mining using R – Programming

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8683.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 1857-1860

Keyword(s):

Data Mining ◽

Data Analysis ◽

Decision Tree ◽

Crop Yield ◽

Climatic Condition ◽

Research Work ◽

Yield Prediction ◽

Decision Tree Algorithm ◽

Data Set ◽

R Programming

Data mining is a good choice in an emerging research field: soil data analysis. Crop yield prediction is an important issue in selecting a crop. Previously, crop prediction relied on the farmer’s experience with a particular type of field and crop, based on factors such as soil type, climatic condition, season, weather, rainfall, and irrigation facilities. Data mining techniques are a better choice for predicting the crop. Soil analysis plays an important role in the agricultural field, and soil fertility prediction is one of the most important factors in agriculture. This research work predicts crop yield using a decision tree algorithm. The aim of this research is to assess the accuracy of, and to find, the crop yield using the decision tree: the C4.5 algorithm is used to predict crop yield using R programming, and also to find the range of magnesium in the collected soil data set. This prediction will be very useful for farmers in predicting the crop yield for cultivation.
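
C4.5 chooses which attribute to split on using the gain ratio, that is, information gain normalized by the entropy of the split itself. The sketch below shows that criterion only, in Python rather than the R used in the paper; the soil and yield values are invented and the function names are illustrative.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in label entropy after partitioning by the attribute values."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """Information gain divided by the split information (entropy of the attribute)."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info else 0.0

# Made-up soil records, NOT the paper's soil data set.
soil_type = ["clay", "clay", "sandy", "sandy"]
season = ["wet", "dry", "wet", "dry"]
crop_yield = ["high", "high", "low", "low"]

print(gain_ratio(soil_type, crop_yield), gain_ratio(season, crop_yield))
```

In this toy data the soil type separates the yield classes perfectly while the season carries no information, so a tree builder would split on soil type first.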

Prediction and Classification into Benign and Malignant using the Clinical Testing Features

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j7411.0891020 ◽

2020 ◽

Vol 9 (10) ◽

pp. 55-61

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Image Classification ◽

Naive Bayes ◽

Malignant Tumors ◽

Naïve Bayes ◽

Support Vector ◽

Natural Image ◽

Data Set ◽

Classification Techniques

Breast cancer is the most frequently diagnosed cancer among women and a major cause of increased mortality among women. Because manual diagnosis of this disease takes long hours and few systems are available, there is a need to develop an automatic diagnosis system for early detection of cancer. Advanced natural image classification techniques and Artificial Intelligence methods have largely been used for the breast-image classification task. Data mining techniques contribute a great deal to the development of such a system, and classification and data mining methods are an effective way to classify data. For the classification of benign and malignant tumors, we have used machine learning classification techniques, in which the machine learns from past data and can predict the category of new input. This study is a comparative study of models implemented using Support Vector Machine (SVM) and Naïve Bayes on the Breast Cancer Wisconsin (Original) Data Set. The efficiency of each algorithm is measured and compared in terms of accuracy, precision, sensitivity, specificity, error rate, and F1 score. Our experiments have shown that SVM is the best for predictive analysis, with an accuracy of 99.28%, compared with 98.56% for Naïve Bayes. It is inferred from this study that SVM is the well-suited algorithm for prediction.
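
The six evaluation measures listed in this abstract all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts are arbitrary illustrative numbers, not the paper's results, and the function name is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from confusion-matrix counts (malignant = positive class)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall or true positive rate
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),  # true negative rate
        "error_rate": (fp + fn) / total,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Arbitrary counts chosen for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting, because a missed malignant case and a false alarm carry very different costs.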

Using data mining methods to improve discharge coefficient prediction in Piano Key and Labyrinth weirs

Water Science & Technology Water Supply ◽

10.2166/ws.2021.304 ◽

2021 ◽

Author(s):

Mahdi Majedi-Asl ◽

Mehdi Foladipanah ◽

Venkat Arun ◽

Ravi Prakash Tripathi

Keyword(s):

Discharge Coefficient ◽

Gene Expression Programming ◽

Research Work ◽

Coefficient Of Determination ◽

Support Vector ◽

Effective Parameters ◽

Data Set ◽

Labyrinth Weir ◽

Dimensional Parameters ◽

Using Data

As a key parameter, the discharge coefficient (Cd) plays an important role in determining the passing capacity of weirs. In this research work, the support vector machine (SVM) and gene expression programming (GEP) algorithms were assessed for predicting Cd of the piano key weir (PKW), rectangular labyrinth weir (RLW), and trapezoidal labyrinth weir (TLW) using a gathered experimental data set. Using dimensional analysis, various combinations of hydraulic and geometric non-dimensional parameters were extracted to perform the simulation. The superior model for the SVM and the GEP predictor for PKW, RLW, and TLW included , and respectively. The results showed that both algorithms have potential for predicting the discharge coefficient, but the evaluation criteria (RMSE, R2, Cd(DDR)max) illustrated the superiority of the GEP performance over the SVM. The results of the sensitivity analysis determined that the most effective parameters for PKW, RLW, and TLW in predicting discharge coefficients are , , and Fr respectively.
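
Two of the criteria named in this abstract, RMSE and R2, can be computed as in the sketch below. It uses one common definition of the coefficient of determination (one minus the ratio of residual to total sum of squares); the paper may define it differently, and the Cd values here are invented.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (one common definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Made-up discharge coefficients, NOT the experimental data set.
cd_observed = [0.40, 0.50, 0.60, 0.70]
cd_model_a = [0.42, 0.49, 0.61, 0.68]
cd_model_b = [0.50, 0.50, 0.50, 0.50]

print(rmse(cd_observed, cd_model_a), r_squared(cd_observed, cd_model_a))
print(rmse(cd_observed, cd_model_b), r_squared(cd_observed, cd_model_b))
```

A lower RMSE together with an R2 closer to 1 is the usual basis for calling one predictor superior to another.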

A Student Performance Prediction Model Using Data Mining Technique

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.15.11214 ◽

2018 ◽

Vol 7 (2.15) ◽

pp. 61

Author(s):

Rohaila Abdul Razak ◽

Mazni Omar ◽

Mazida Ahmad

Keyword(s):

Data Mining ◽

Linear Regression ◽

Decision Tree ◽

Student Performance ◽

Prediction Accuracy ◽

Significant Variable ◽

Data Mining Technique ◽

Mining Technique ◽

Predicting Performance ◽

Using Data

Predicting performance is very important in education today. This paper describes the process of predicting student performance using a data mining technique. A total of 257 data records were taken from semester 6 students of KPTM across four (4) academic programs: Diploma in Computer System and Networking, Diploma in Information Technology, Diploma in Business Management, and Diploma in Accountancy. Knowledge Discovery in Database (KDD) was used as a guide for finding and extracting knowledge from the dataset. A decision tree and linear regression were used to analyze the dataset based on the selected variables. The variables used are Gender, Financing, SPM, GPASem1, GPASem2, GPASem3, GPASem4, and GPASem5, with CGPA as the dependent variable. The results indicate the significant variables that contribute most to the students’ performance. Based on the analysis, the decision tree shows that GPASem1 is strongly significant to the student's final-semester CGPA, with a prediction accuracy of 82%. The linear regression shows that the GPA for each semester is highly significant to the dependent variable, with 96.2% prediction accuracy. With this information, the management of KPTM can plan to ensure that students maintain good results and, at the same time, make strategic plans for those without good results.
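
The linear regression part of this study can be illustrated with an ordinary least squares fit of CGPA on a single semester GPA. This is a deliberate simplification: the paper uses several predictors, whereas the sketch uses one, and the GPA values are invented rather than drawn from the KPTM data.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x_new):
    """Evaluate the fitted line at a new predictor value."""
    return slope * x_new + intercept

# Made-up values: first-semester GPA and final CGPA for three students.
gpa_sem1 = [2.0, 3.0, 4.0]
cgpa = [2.5, 3.0, 3.5]

slope, intercept = fit_line(gpa_sem1, cgpa)
print(slope, intercept, predict(slope, intercept, 3.5))
```

With more than one predictor the same least squares principle applies, but the fit is normally obtained with a linear algebra routine rather than these closed-form sums.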

Improved Classification Techniques to Predict the Co-disease in Diabetic Mellitus Patients using Discretization and Apriori Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1434.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 730-733

Keyword(s):

Data Mining ◽

Association Rules ◽

Census Data ◽

Early Stage ◽

Research Work ◽

Numerical Data ◽

Medical Data ◽

Data Sets ◽

Apriori Algorithm ◽

Data Set

The demand for data mining is now unavoidable in the medical industry due to its various applications in predicting diseases at an early stage. Data mining methods make it easy to extract useful patterns and quick to recognize task-based outcomes. Classification models in data mining are useful for building classes for medical data sets so that future analysis is accurate. In addition, association rules are a promising technique for finding hidden patterns in a medical data set and have been successfully applied to market basket data, census data, and financial data. The Apriori algorithm, considered a classic algorithm, is useful for mining frequent item sets in a database containing a large number of transactions, and it also yields the relevant association rules. Association rules capture the relationships among items present in data sets. When the data set contains continuous attributes, the existing algorithms may not work; for this reason, discretization can be applied before association rule mining in order to find the relations between various patterns in the data set. In this paper, Discretized Apriori is used to predict the co-disease in people found to have diabetic syndrome, and the extracted rules are analyzed. In the discretization step, numerical data is discretized and fed to the Apriori algorithm to obtain better association rules for predicting the diseases.
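
The two-step idea described here, discretize continuous values and then mine frequent item sets with Apriori, might look like the following sketch. It is an assumption-laden illustration: the cut points are arbitrary and not clinical thresholds, "hypertension" is a hypothetical co-disease item, and the four readings are invented.

```python
from itertools import combinations

def discretize(value, cut_points, labels):
    """Return the label of the first bin whose upper cut point exceeds value."""
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]

def apriori(transactions, min_support):
    """Return {frozenset: support} for every item set meeting min_support."""
    n = len(transactions)
    current = [frozenset([i]) for i in {item for t in transactions for item in t}]
    frequent = {}
    while current:
        counts = {c: sum(c <= t for t in transactions) for c in current}
        survivors = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(survivors)
        size = len(next(iter(survivors))) + 1 if survivors else 0
        # Join step: merge surviving item sets that differ by exactly one item.
        candidates = {a | b for a, b in combinations(survivors, 2) if len(a | b) == size}
        # Prune step: keep a candidate only if all its smaller subsets are frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in survivors for s in combinations(c, size - 1))]
    return frequent

# Made-up (glucose reading, has hypothetical co-disease) pairs.
readings = [(150, True), (160, True), (90, False), (155, True)]
transactions = [
    {"glucose=" + discretize(g, [100, 140], ["low", "mid", "high"])}
    | ({"hypertension"} if sick else set())
    for g, sick in readings
]

for itemset, sup in apriori(transactions, min_support=0.5).items():
    print(sorted(itemset), sup)
```

Association rules are then read off the frequent item sets, for example by checking how often the co-disease item appears among transactions containing a given discretized range.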

Image Substance Extraction using Data Mining Clustering Method

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b6605.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 2735-2739

Keyword(s):

Data Mining ◽

Accurate Result ◽

Image Data ◽

Data Recovery ◽

Data Sets ◽

Text Data ◽

Data Set ◽

Normal Text ◽

Additional Burden ◽

Using Data

Data retrieval is one of the key challenges today, because the volume of data sets increases every year due to various factors. Information extraction from image data sets is far more complex than normal text data retrieval. An image data set consists of different attributes, and those attribute sets are normalized before data is extracted from the stored database. This places an additional burden on the user who wishes to extract information from these data sets. These key challenges invite more researchers into the field of image data mining. Today many data sets are in the form of images, which give more accurate results and more outputs. To extract any image data, the image attributes are properly trained for better results. The proposed work is based on grouping the data sets using image attributes. The entire process is divided into two major separate operations. Experiments were done against various data sets, and the verified outputs show that the proposed work gives more accurate results than the existing techniques.
