Ensemble Classification Model for Diabetes Prediction in Data Mining

2019 ◽  
Vol 8 (4) ◽  
pp. 1240-1243

Prediction analysis is an approach that estimates future outcomes from current information. Diabetes prediction applies it to a diabetes dataset, using the dataset's attributes to predict whether a patient has the disease. In previous work, an SVM classifier was applied to diabetes prediction. To improve prediction accuracy, this paper applies voting-based classification. The proposed model is implemented in Python and results are analyzed in terms of accuracy, execution time
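Voting-based classification, as named in this abstract, can be illustrated with hard majority voting over several base classifiers. The sketch below is illustrative only: the three threshold rules, the feature names (`glucose`, `bmi`, `age`) and the cut-off values are hypothetical stand-ins, not the classifiers or data used in the paper.

```python
from collections import Counter

# Hypothetical base classifiers: each maps a patient record to 1 (diabetic) or 0.
# Feature names and thresholds are illustrative assumptions, not from the paper.
def glucose_rule(patient):
    return 1 if patient["glucose"] >= 126 else 0

def bmi_rule(patient):
    return 1 if patient["bmi"] >= 30 else 0

def age_rule(patient):
    return 1 if patient["age"] >= 50 else 0

def majority_vote(classifiers, patient):
    """Return the label predicted by most base classifiers (hard voting)."""
    votes = [clf(patient) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

ensemble = [glucose_rule, bmi_rule, age_rule]
patient = {"glucose": 140, "bmi": 32.5, "age": 41}
prediction = majority_vote(ensemble, patient)
print(prediction)
```

An odd number of binary voters avoids ties; with trained models in place of the rules, the voting step stays the same.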

Credit card fraud is on the rise and is becoming more sophisticated over time. Fraudulent transactions are usually carried out with a stolen credit card. When the cardholder does not notice the loss of the card, the credit card company can face a large loss. Existing work has used voting-based methods to identify credit card fraud. The problem with voting-based methods is that they are more complex and more time-consuming. This research work proposes a hybrid approach based on KNN and Naive Bayes for detecting credit card fraud. KNN will be used as the base classifier and will return a predicted result, which will be provided as input to the Naive Bayes classifier to generate the final result. The proposed model will be compared with existing techniques, and the results will be analyzed in terms of recall, precision, accuracy and execution time.
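The evaluation measures named in this abstract (accuracy, precision and recall) can be computed from a confusion matrix, as in the minimal sketch below. The labels are made-up toy data and the function name `fraud_metrics` is hypothetical; fraud is assumed to be the positive class (1).

```python
def fraud_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall), treating fraud (1) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # genuine passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # genuine flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Toy labels (assumed data): 1 = fraudulent, 0 = genuine.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]
accuracy, precision, recall = fraud_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Because fraud datasets are usually heavily imbalanced, recall and precision tend to be more informative than accuracy alone.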


Author(s):  
Rapinder Kaur

As the world changes rapidly, lifestyles, perceptions and resources are being transformed. Advances in technology have also become a challenge as ideas and innovations multiply. One of the biggest things these advances have given rise to is “Big Data”, in which a massive amount of information is hidden. To process this data and uncover its insights, many techniques and algorithms have been developed, one of which is data mining. Data mining is the procedure of extracting useful knowledge and facts from raw data. Prediction analysis is a data mining approach that forecasts future outcomes using classification techniques. This research work addresses diabetes prediction using a classification approach. In the existing approach, an SVM classifier is applied for prediction analysis. To increase accuracy, a KNN classifier is applied instead. Both the proposed and existing methods are implemented in Python. The simulation results show that KNN increases accuracy and reduces execution time.
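A k-nearest-neighbours classifier of the kind this abstract refers to can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the (glucose, BMI) records are made up, `k=3` is an arbitrary choice, and the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy (glucose, BMI) records; values and labels are assumptions for illustration.
train_X = [(85, 22.0), (90, 24.5), (100, 26.0),
           (150, 33.0), (160, 35.5), (170, 31.0)]
train_y = [0, 0, 0, 1, 1, 1]  # 1 = diabetic, 0 = non-diabetic

high_risk = knn_predict(train_X, train_y, (155, 34.0))
low_risk = knn_predict(train_X, train_y, (92, 23.0))
print(high_risk, low_risk)
```

In practice the features would normally be scaled first, since Euclidean distance is otherwise dominated by the attribute with the largest range.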


2021 ◽  
Author(s):  
Mansour Esmaeilpour ◽  
Rasha Kashef

Abstract: The outbreak of coronavirus disease 2019 (COVID-19) has created a great challenge for healthcare systems worldwide. One of the most important aspects of this challenge is the management of COVID-19 patients needing acute and/or critical respiratory care. The main objective of applying data mining to COVID-19 datasets is to support data-oriented decision making that improves existing clinical practices and learning materials. Current data mining techniques offer patient data analysis for automated diagnosis of diseases; however, the results are neither very accurate nor reliable, especially with a dynamic virus such as COVID-19. In this paper, we propose a multi-stage diagnostic (MSD-Covid19) model to enhance the diagnosis of COVID-19 and to provide a sustainable automated system that improves healthcare systems and patient outcomes. The first stage selects a classification model using the full attribute set, with no attribute reduction. The classification algorithms tested include Deep Learning, Multilayer Perceptron, KNN, Bayesian Auto Regression, Logistic Model Trees (LMT), Hoeffding tree (VFDT), and the Fuzzy Unordered Rule Induction Algorithm. In the second stage, a rough set reduction algorithm based on genetic algorithms is employed, and finally the classification is optimized using the reduced attributes. The proposed model is evaluated on a global COVID-19 dataset. Experimental results demonstrate that the proposed MSD-Covid19 contributes substantially to increasing the diagnostic accuracy for COVID-19.


Author(s):  
Sushmita Gaonkar

A massive amount of data is generated across numerous fields such as education, medical science, defense and social media. Machine Learning (ML) and Data Mining (DM), regarded as subsets of artificial intelligence, are techniques that can identify hidden patterns automatically and improve through experience. One key application area is Educational Data Mining (EDM), which uses ML and statistics to extract information from large repositories of data associated with learning activities. Such learning management systems are mainly used to predict college grades. The proposed model is built to predict the future grade of colleges and universities based on the activities they currently carry out. Machine learning algorithms are found to be practical and effective, and are most valuable in circumstances where an individual does not have adequate knowledge. An ML algorithm investigates and analyzes the input data given to it, is trained on that data, infers a hypothesis, and predicts the future accordingly. The proposed model uses the Random Forest regression (RFR) algorithm, which will help colleges know their grades in advance; if these grades are lower than anticipated, they can improve them by enriching their current activities.


2019 ◽  
Vol 7 (3) ◽  
pp. 749-753
Author(s):  
Suhasini Vijaykumar ◽  
Manjiri Moghe

Author(s):  
Fransiskus Ginting ◽  
Efori Buulolo ◽  
Edward Robinson Siagian

Data mining is the discovery of information by extracting patterns and trends from very large amounts of data, supporting future decision making. To determine such patterns, classification techniques work on a collection of records (the training set). Regional income is generally derived from local taxes and levies. Local taxes are one source of regional funding, but on the national average they have not yet made a large contribution to local revenue. Regional revenue data can be used to forecast future regional revenue so that projections match reality and the planned RAPBD (draft regional budget) can run smoothly. Simple Linear Regression (SLR) is a statistical method used to make predictions about characteristics of quality and quantity; here it describes the processes associated with processing data on regional income. In the testing phase, an implementation in Visual Basic .NET helps process valid regional revenue data. Keywords: Data Mining, Local Revenue, Simple Linear Regression Algorithm, Visual Basic net 2008
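Simple linear regression fits a line y = a + b·x by least squares, with slope b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept a = ȳ − b·x̄. The sketch below shows that calculation in Python rather than the Visual Basic .NET used in the paper; the yearly revenue figures and the function name are hypothetical and chosen only so the arithmetic is easy to check.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Assumed toy data: year index vs. regional revenue (arbitrary units).
years = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

intercept, slope = fit_simple_linear_regression(years, revenue)
forecast_year_6 = intercept + slope * 6  # extrapolate one year ahead
print(intercept, slope, forecast_year_6)
```

Forecasts from such a model are only as good as the assumption that revenue keeps following a straight-line trend.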


2020 ◽  
Vol 23 (4) ◽  
pp. 274-284 ◽  
Author(s):  
Jingang Che ◽  
Lei Chen ◽  
Zi-Han Guo ◽  
Shuaiqun Wang ◽  
Aorigele

Background: Identification of drug-target interactions is essential in drug discovery. It helps predict unexpected therapeutic or adverse side effects of drugs. To date, several computational methods have been proposed to predict drug-target interactions because they are fast and low-cost compared with traditional wet experiments. Methods: In this study, we investigated this problem in a different way. According to KEGG, drugs were classified into several groups based on their target proteins. A multi-label classification model was presented to assign drugs to the correct target groups. To make full use of the known drug properties, five networks were constructed, each of which represented drug associations in one property. A powerful network embedding method, Mashup, was adopted to extract drug features from the above-mentioned networks, based on which several machine learning algorithms, including the RAndom k-labELsets (RAKEL) algorithm, the Label Powerset (LP) algorithm and Support Vector Machine (SVM), were used to build the classification model. Results and Conclusion: Tenfold cross-validation yielded an accuracy of 0.839, an exact match of 0.816 and a hamming loss of 0.037, indicating good performance of the model. The contribution of each network was also analyzed. Furthermore, the model built on multiple networks was found to be superior to the one built on a single network and to the classic model, indicating the superiority of the proposed model.
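Two of the multi-label measures reported here, exact match and hamming loss, can be illustrated with their commonly used definitions. The abstract does not spell out the formulas, so the definitions below are an assumption, and the label matrices are made-up toy data (rows are drugs, columns are target groups).

```python
def exact_match(y_true, y_pred):
    """Fraction of samples whose whole predicted label set is correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted wrongly."""
    total = sum(len(t) for t in y_true)
    wrong = sum(a != b
                for t, p in zip(y_true, y_pred)
                for a, b in zip(t, p))
    return wrong / total

# Assumed toy data: 4 drugs, 3 target groups, 1 = drug belongs to the group.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]

em = exact_match(y_true, y_pred)
hl = hamming_loss(y_true, y_pred)
print(em, hl)
```

Exact match is the stricter measure, since one wrong label fails the whole sample, whereas hamming loss counts each label slot separately; higher exact match and lower hamming loss both indicate better performance.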


2014 ◽  
Vol 912-914 ◽  
pp. 1710-1713
Author(s):  
Qing Zhang ◽  
Sui Huai Yu ◽  
Ming Jiu Yu

During the design process for future exploratory products, user requirements seem to be a key factor in achieving product availability. As a practical user modeling method, Persona can effectively mine potential needs data by analyzing users. This review mainly focuses on how to apply personas in the investigation of exploratory products to acquire useful information for product design. The method for establishing a persona and its operating rules are also discussed. A future mobile internet device concept is used as a case to demonstrate the persona method.

