Prediction of Diagnosing Chronic Kidney Disease using Machine Learning: Classification Algorithms

Chronic kidney disease is a serious health problem whose prevalence has been growing with changes in lifestyle, such as food habits, and in the environment. The biosciences have progressed considerably and have produced huge amounts of data from electronic health records. The primary aim of this paper is to classify chronic kidney disease cases using various classification techniques: Logistic Regression (LR), K-Nearest Neighbor (KNN) Classifier, Decision Tree Classifier, Random Forest Classifier, Support Vector Machine (SVM), and SGD Classifier. According to the health statistics of India, 63538 cases of chronic renal disorder have been registered. The average age of men and women susceptible to renal disorders lies within the range of 48 to 70 years.

Sentiment Analysis on Social Media Big Data With Multiple Tweet Words

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j9684.0881019 ◽

2019 ◽

Vol 8 (10) ◽

pp. 3429-3434 ◽

Cited By ~ 2

Keyword(s):

Machine Learning ◽

Social Media ◽

Big Data ◽

Sentiment Analysis ◽

Language Processing ◽

Sentiment Classification ◽

Support Vector ◽

Decision Tree Classifier ◽

Machine Learning Classification ◽

Tree Classifier

The main objective of this paper is to analyze reviews of e-commerce products in social media big data. The analysis gives online shopping customers helpful results about product quality, and gives businesses decision-making insight into the products that customers most like and buy. It covers all features or opinion words available in tweets, such as capitalized words, sequences of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunction words, and negation words. Existing work has considered only two or three features when performing sentiment analysis with Natural Language Processing (NLP) techniques. In this proposed work, familiar machine learning classification models, namely Multinomial Naïve Bayes, Support Vector Machine, Decision Tree Classifier, and Random Forest Classifier, are used for sentiment classification. The sentiment classification serves as a decision support system for both customers and businesses.
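The surface features listed in this abstract can be illustrated with a minimal sketch. The word lists, the function name `tweet_features`, and the chosen cues are assumptions made for illustration; the abstract does not specify the paper's lexicons or feature encoding.

```python
import re

# Hypothetical, tiny word lists for illustration only.
NEGATIONS = {"not", "no", "never", "cannot"}
INTENSIFIERS = {"very", "really", "so", "extremely"}


def tweet_features(text):
    """Count a few surface cues that often carry sentiment in tweets."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lowered = [t.lower() for t in tokens]
    return {
        # Fully upper-case words of more than one letter, e.g. "SOOOO"
        "capitalized_words": sum(1 for t in tokens if len(t) > 1 and t.isupper()),
        # Words with a letter repeated three or more times in a row
        "elongated_words": sum(1 for t in tokens if re.search(r"(.)\1{2,}", t)),
        "exclamations": text.count("!"),
        "negations": sum(1 for t in lowered if t in NEGATIONS or t.endswith("n't")),
        "intensifiers": sum(1 for t in lowered if t in INTENSIFIERS),
    }


features = tweet_features("This phone is SOOOO good, really!!! But the battery isn't great")
print(features)
```

Counts like these could be appended to a bag-of-words vector before training any of the classifiers named above; that combination is a common design choice, not a detail confirmed by the abstract.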

Diabetes Prediction Using Machine Learning Techniques

Journal of Intelligent Systems with Applications ◽

10.54856/jiswa.202112183 ◽

2021 ◽

pp. 150-152

Author(s):

Seyma Kiziltas Koc ◽

Mustafa Yeniad

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

High Performance ◽

Nearest Neighbor ◽

Classification Performance ◽

Machine Learning Techniques ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Machine Learning Classification

Technologies used in the healthcare industry are changing rapidly because technology is constantly evolving to improve people's lifestyles. For instance, different technological devices are used for the diagnosis and treatment of diseases. With developing technology, it has been shown that computer systems can diagnose disease. Machine learning algorithms are frequently used tools because of their high performance in the field of health as well as in many other fields. The aim of this study is to investigate different machine learning classification algorithms that can be used in the diagnosis of diabetes and to make comparative analyses according to the metrics in the literature. In the study, seven classification algorithms from the literature were used: Logistic Regression, K-Nearest Neighbor, Multilayer Perceptron, Random Forest, Decision Trees, Support Vector Machine, and Naive Bayes. First, the classification performance of the algorithms is compared. These comparisons are based on accuracy, sensitivity, precision, and F1-score. The results showed that the support vector machine algorithm had the highest accuracy, at 78.65%.
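The four comparison metrics named in this abstract follow directly from the counts of a binary confusion matrix. The sketch below uses made-up counts, not results from the study, and assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, sensitivity (recall), precision and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)          # share of actual positives that were found
    precision = tp / (tp + fp)            # share of predicted positives that were right
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=80)
print(metrics)
```

With these counts, accuracy is 120/150 = 0.8 while sensitivity is only 40/60, which shows why a comparison based on accuracy alone can hide missed positive cases.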

Recommender System for Term Deposit Likelihood Prediction using Cross-validated Neural Network

South Asian Journal of Social Studies and Economics ◽

10.9734/sajsse/2021/v11i330286 ◽

2021 ◽

pp. 21-28

Author(s):

Shawni Dutta ◽

Samir Kumar Bandyopadhyay

Keyword(s):

Neural Network ◽

Cross Validation ◽

Nearest Neighbor ◽

Automated System ◽

K Nearest Neighbor ◽

Decision Tree Classifier ◽

Proposed Model ◽

Tree Classifier ◽

Customer Perspective ◽

Fold Cross Validation

Term deposits can increase profit from both the bank's and the customer's perspective. This paper focuses on the likelihood that customers subscribe to a term deposit. Bank campaign efforts and customer details both influence the likelihood of subscription. The paper presents an automated system that predicts term deposit investment possibilities in advance. A neural network with stratified 10-fold cross-validation is proposed as the predictive model and is then compared with benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision Tree classifier (DT), and Multi-layer Perceptron classifier (MLP). The experimental study concluded that the proposed model gives better prediction results than the baseline models, with an accuracy of 88.32% and an MSE of 0.1168.
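Stratified k-fold cross-validation keeps the class ratio roughly equal in every fold, which matters when subscribers are a minority. The sketch below is a simplified illustration with hypothetical labels: it deals indices round-robin per class and does no shuffling, which a practical implementation normally would. It also shows that for hard 0/1 predictions the MSE equals one minus the accuracy. The reported figures fit that relation (1 - 0.8832 = 0.1168), but whether the paper computed MSE this way is an assumption.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, keeping class ratios roughly equal."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)   # deal indices round-robin per class
    return folds


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: 20 positive and 80 negative samples.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)   # every fold holds the same number of positives

# Hypothetical hard predictions: MSE and error rate coincide for 0/1 labels.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
print(accuracy(y_true, y_pred), mse(y_true, y_pred))
```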

FWHT-RF: A Novel Computational Approach to Predict Plant Protein-Protein Interactions via an Ensemble Learning Method

Scientific Programming ◽

10.1155/2021/1607946 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Jie Pan ◽

Li-Ping Li ◽

Chang-Qing Yu ◽

Zhu-Hong You ◽

Zhong-Hao Ren ◽

...

Keyword(s):

Protein Interactions ◽

Nearest Neighbor ◽

Protein Sequences ◽

Evolutionary Information ◽

Support Vector ◽

Protein Protein Interactions ◽

K Nearest Neighbor ◽

Novel Approach ◽

Knn Classifier ◽

Scoring Matrix

Protein-protein interactions (PPIs) in plants are crucial for understanding biological processes. Although high-throughput techniques have produced valuable information for identifying PPIs in plants, they are usually expensive, inefficient, and extremely time-consuming. Hence, there is an urgent need to develop novel computational methods to predict PPIs in plants. In this article, we propose a novel approach to predict PPIs in plants using only the information in protein sequences. Specifically, plant protein sequences are first converted to a position-specific scoring matrix (PSSM); then, the fast Walsh–Hadamard transform (FWHT) algorithm is used to extract feature vectors from the PSSM to obtain evolutionary information about plant proteins. Lastly, the rotation forest (RF) classifier is trained for prediction and produces a series of evaluation results. We named this approach FWHT-RF because FWHT and RF are used for feature extraction and classification, respectively. When FWHT-RF was applied to three plant PPI datasets, Maize, Rice, and Arabidopsis thaliana (Arabidopsis), the average accuracies using 5-fold cross-validation reached 95.20%, 94.42%, and 83.85%, respectively. To further evaluate the predictive power of FWHT-RF, we compared it with the state-of-the-art support vector machine (SVM) and K-nearest neighbor (KNN) classifiers in different aspects. The experimental results demonstrated that FWHT-RF can be a useful supplementary method to predict potential PPIs in plants.
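The fast Walsh–Hadamard transform at the core of the feature extraction step can be sketched in a few lines. This is the standard unnormalized butterfly on a plain vector; how the paper maps a variable-length PSSM onto fixed-length transform inputs is not stated in the abstract, so any padding or flattening step would be an assumption and is left out.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform; len(values) must be a power of two."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine blocks of size h pairwise: (a, b) -> (a + b, a - b)
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


signal = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(signal)
print(spectrum)
```

The transform needs only additions and subtractions and runs in O(n log n). Applying it twice returns the input scaled by n, which is a convenient sanity check.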

Classification of Cotton Leaf Diseases Using AlexNet and Machine Learning Models

Current Journal of Applied Science and Technology ◽

10.9734/cjast/2021/v40i3831588 ◽

2021 ◽

pp. 29-37

Author(s):

Premkumar Borugadda ◽

R. Lakshmi ◽

Surla Govindu

Keyword(s):

Machine Learning ◽

Precision Agriculture ◽

Performance Model ◽

Gradient Boosting ◽

Support Vector ◽

Cotton Leaf ◽

Decision Tree Classifier ◽

Machine Learning Classification ◽

Tree Classifier ◽

Fully Connected

Computer vision has been demonstrated as state-of-the-art technology in precision agriculture in recent years. In this paper, an AlexNet model was implemented to identify and classify cotton leaf diseases. The cotton dataset consists of 2275 images, of which 1952 images were used for training and 324 images were used for validation. The five convolutional layers of AlexNet are applied for feature extraction from raw data. The remaining three fully connected layers of AlexNet and machine learning classification algorithms such as AdaBoost Classifier (ABC), Decision Tree Classifier (DTC), Gradient Boosting Classifier (GBC), K Nearest Neighbor (KNN), Logistic Regression (LR), Random Forest Classifier (RFC), and Support Vector Classifier (SVC) are used for classification. The three fully connected layers of AlexNet provided the best-performing model, with a 94.92% F1 score and a training time of about 51 min.

Machine Learning Classification and Feature Extraction of Arrhythmic ECG Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b3548.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 6-12

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Nearest Neighbor ◽

Extraction Process ◽

Support Vector ◽

Ecg Signal ◽

Data Sets ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Artificial Neural Network Ann

An electrocardiogram (ECG) records the electrical activity of the heart over a period of time. Detailed information about the condition of the heart is obtained by analyzing the ECG signal. The wavelet transform and the fast Fourier transform are among the methods used to detect cardiac disease. The paper surveys ECG signal analysis and related studies on arrhythmic and non-arrhythmic data. We discuss an efficient feature extraction process for the electrocardiogram in which the six best P-QRS-T fragments are studied based on position and priority. The survey examines the outcome of the system when various machine learning classification algorithms are used for feature extraction and analysis of ECG signals. Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Artificial Neural Network (ANN) are the most important algorithms used for this purpose. Several publicly available data sets are used for arrhythmia analysis, and among them the MIT-BIH ECG-ID database is the most widely used. The drawbacks and limitations are also discussed, and future challenges and concluding remarks are drawn from them.

Chronic Kidney Disease Prediction Using Data Mining Algorithms

Handbook of Research on Disease Prediction Through Data Analytics and Machine Learning - Advances in Medical Diagnosis, Treatment, and Care ◽

10.4018/978-1-7998-2742-9.ch006 ◽

2021 ◽

pp. 92-111

Author(s):

Devesh Kumar Srivastava ◽

Pradeep Kumar Tiwari

Keyword(s):

Chronic Kidney Disease ◽

Kidney Disease ◽

Nearest Neighbor ◽

Early Stage ◽

Big Data Analytics ◽

K Nearest Neighbor ◽

Contemporary World ◽

Data Mining Algorithms ◽

Using Data ◽

Logistic Regression Algorithm

In today's world, it is important to know the odds of having a disease because of the changing living standards of the population across the continent. The disease on which the authors are working is chronic kidney disease (CKD). Once a person develops CKD, their working capability decreases, along with other adverse effects. New methodologies that help predict kidney disease at an early stage make it possible to combat diseases like CKD. Under big data analytics, data may be structured, unstructured, quasi-structured, or semi-structured. CKD is detected and predicted by applying classification models: support vector machine (SVM), K-nearest neighbor (KNN), and the logistic regression algorithm. These models help in predicting the likelihood of occurrence of the disease based on various features. The two algorithms KNN and SVM are compared to find the one that gives better accuracy. Further, a regression technique is used to detect the disease, on the basis of which the stages are classified using the GFR (glomerular filtration rate) formula.
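Staging by GFR can be illustrated as a simple threshold lookup. The abstract names neither the GFR estimating equation nor the staging scheme, so the sketch below assumes the widely used KDIGO GFR categories (G1 to G5) and takes an already computed GFR value as input; the function name `ckd_stage` is hypothetical.

```python
def ckd_stage(gfr):
    """Map a GFR value in mL/min/1.73 m^2 to a KDIGO GFR category (assumed scheme)."""
    if gfr < 0:
        raise ValueError("GFR cannot be negative")
    # Lower bounds of each category, checked from the healthiest downwards.
    thresholds = [(90, "G1"), (60, "G2"), (45, "G3a"), (30, "G3b"), (15, "G4")]
    for lower_bound, stage in thresholds:
        if gfr >= lower_bound:
            return stage
    return "G5"   # below 15: kidney failure


for value in (95, 59, 30, 14.9):
    print(value, ckd_stage(value))
```

Under KDIGO, categories G1 and G2 indicate CKD only when other markers of kidney damage are also present, so a classifier's disease prediction and the GFR category answer different questions.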

Chronic Kidney Disease Prediction Using Different Algorithms

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit20652 ◽

2020 ◽

pp. 06-13

Author(s):

Harsh Vardhan Singh

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Mean Squared Error ◽

Classification Algorithm ◽

Support Vector ◽

Machine Learning Classification ◽

Squared Error ◽

Health Damage ◽

Specific Symptoms

Chronic Kidney Disease (CKD) often shows no symptoms at all, or in some cases no disease-specific symptoms, so it is hard to predict, detect, and prevent, and this can lead to permanent health damage. Machine learning offers hope for this problem because it is well suited to prediction and analysis. The objective of this paper is to build a model for predicting Chronic Kidney Disease using various machine learning classification algorithms. Classification is a powerful machine learning technique that is commonly used for prediction. The classification algorithms considered include Logistic Regression, Support Vector Machine, Naïve Bayes, Random Forest Classifier, and KNN. This paper investigates which algorithm improves the accuracy of Chronic Kidney Disease prediction. A comparative analysis of accuracy and mean squared error is carried out to identify the best model.

Machine Learning-Based Prediction System For Chronic Kidney Disease Using Associative Classification Technique

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.36.25377 ◽

2018 ◽

Vol 7 (4.36) ◽

pp. 1161 ◽

Cited By ~ 1

Author(s):

Zixian Wang ◽

Jae Won Chung ◽

Xilin Jiang ◽

Yantong Cui ◽

Muning Wang ◽

...

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Nearest Neighbor ◽

Technological Development ◽

Machine Learning Techniques ◽

Detection Accuracy ◽

K Nearest Neighbor ◽

Training Time ◽

Huge Impact

Technological development, including machine learning, has a huge impact on health through an effective analysis of various chronic diseases for more accurate diagnosis and successful treatment. Kidney disease is a major chronic disease associated with aging, hypertension, and diabetes, affecting people 60 and over. Its major cause is the malfunctioning of the kidney in disposing of toxins from the blood. This study analyzes chronic kidney disease using machine learning techniques based on a chronic kidney disease (CKD) dataset from the UCI machine learning data warehouse. CKD is detected using the Apriori association technique for 400 instances of chronic kidney patients with 10-fold cross-validation testing, and the results are compared across a number of classification algorithms including ZeroR, OneR, naive Bayes, J48, and IBk (k-nearest-neighbor). The dataset is preprocessed by filling in missing values and normalizing the data. The most relevant features are selected from the dataset for improved accuracy and reduced training time. The results for selected features of the dataset indicate 99% detection accuracy for CKD based on Apriori. The identified technique is further tested using four patient data samples to predict their CKD status.
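Association-based classification with Apriori rests on two quantities: the support of an itemset and the confidence of a rule. The sketch below computes both on a toy set of records. The attribute names and values are invented for illustration and are not taken from the UCI dataset or from the paper's results.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical patient records encoded as sets of attribute=value items.
records = [
    {"hypertension=yes", "anemia=yes", "class=ckd"},
    {"hypertension=yes", "anemia=no", "class=ckd"},
    {"hypertension=no", "anemia=no", "class=notckd"},
    {"hypertension=yes", "anemia=no", "class=notckd"},
    {"hypertension=no", "anemia=no", "class=notckd"},
]

rule_support = support(records, {"hypertension=yes", "class=ckd"})
rule_confidence = confidence(records, {"hypertension=yes"}, {"class=ckd"})
print(rule_support, rule_confidence)
```

Apriori keeps only itemsets whose support exceeds a minimum threshold and then retains rules above a minimum confidence; rules whose consequent is a class label can then be used to classify new records.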

Advantage of Combining OBIA and Classifier Ensemble Method for Very High-Resolution Satellite Imagery Classification

Journal of Sensors ◽

10.1155/2020/8855509 ◽

2020 ◽

Vol 2020 ◽

pp. 1-15

Author(s):

Ruimei Han ◽

Pei Liu ◽

Guangyan Wang ◽

Hanwei Zhang ◽

Xilong Wu

Keyword(s):

High Resolution ◽

Random Forest ◽

Nearest Neighbor ◽

Machine Learning Algorithms ◽

Classifier Ensemble ◽

Support Vector ◽

Decision Tree Classifier ◽

Remotely Sensed Data ◽

Tree Classifier ◽

Very High

Accurate and timely collection of urban land use and land cover information is crucial for many aspects of urban development and environmental protection. Very high-resolution (VHR) remote sensing images have made it possible to detect and distinguish detailed information on the ground. However, the abundant texture information and limited spectral channels of VHR images increase intraclass variance and decrease interclass variance. Substantial studies on pixel-based classification algorithms have revealed limitations in land cover information extraction from VHR remote sensing imagery when conventional pixel-based classifiers are applied. To evaluate the advantages of classifier ensemble strategies and the object-based image analysis (OBIA) method for VHR satellite data classification in complex urban areas, we present an approach that integrates multiscale-segmentation OBIA with a mature classifier ensemble method named random forest. The framework was tested on Chinese GaoFen-1 (GF-1) and GF-2 VHR remotely sensed data over the central business district (CBD) of the Zhengzhou metropolitan area. The process flow of the proposed framework includes data fusion, multiscale image segmentation, optimal segmentation scale evaluation, multivariance texture feature extraction, random forest ensemble learning classifier construction, accuracy assessment, and time consumption. The advantages of the proposed framework were compared and discussed with several mature state-of-the-art machine learning algorithms such as the k-nearest neighbor (KNN), support vector machine (SVM), and decision tree classifier (DTC). Experimental results showed that the OA of the proposed method is up to 99.29% and 98.98% for the GF-1 dataset and GF-2 dataset, respectively. The OA is increased by 26.89%, 11.79%, 11.89%, and 4.26% compared with traditional machine learning algorithms such as the decision tree classifier (DTC), support vector machine (SVM), k-nearest neighbor (KNN), and random forest (RF) on the GF-1 dataset; the OA increased by 32.31%, 13.48%, 9.77%, and 7.72% for the GF-2 dataset. In terms of time consumption, by rough statistics, OBIA-RF spends 223.55 s, SVM spends 403.57 s, KNN spends 86.93 s, and DT spends 0.61 s on average over the GF-1 and GF-2 datasets. Taking both classification accuracy and running time into account, the proposed method has good generalization ability and robustness for complex urban surface classification with high-resolution remotely sensed data.
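OA here is read as overall accuracy, the usual meaning in remote sensing: the share of samples on the diagonal of a multi-class confusion matrix. The abstract does not expand the abbreviation, so that reading is an assumption. The matrix below is hypothetical and only illustrates the calculation.

```python
def overall_accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 3-class confusion matrix (rows: reference class, columns: predicted class).
confusion = [
    [50, 2, 1],
    [3, 45, 2],
    [0, 4, 43],
]

oa = overall_accuracy(confusion)
print(f"OA = {oa:.2%}")
```

Because OA pools all classes, it can stay high even when a rare land cover class is classified poorly, which is why per-class accuracies are often reported alongside it.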
