Classification of Diabetes using Random Forest with Feature Selection Algorithm

Diabetes has become a serious health problem, and serious precautions are needed to curb it. To do so, we first need to know how often it occurs. In this project we predict the level of occurrence of diabetes using Random Forest, a machine learning algorithm. Using patients' Electronic Health Records (EHR), we can build accurate models that predict the presence of diabetes.

CLASSIFICATION OF UAV POINT CLOUDS BY RANDOM FOREST MACHINE LEARNING ALGORITHM

Turkish Journal of Engineering ◽ 10.31127/tuje.669566 ◽ 2021

Author(s): Mustafa ZEYBEK

Keyword(s): Machine Learning ◽ Random Forest ◽ Learning Algorithm ◽ Point Clouds ◽ Machine Learning Algorithm

Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm

International Journal of Medical Informatics ◽ 10.1016/j.ijmedinf.2020.104105 ◽ 2020 ◽ Vol 137 ◽ pp. 104105 ◽ Cited By ~ 4

Author(s): Chengyin Ye ◽ Jinmei Li ◽ Shiying Hao ◽ Modi Liu ◽ Hua Jin ◽ ...

Keyword(s): Machine Learning ◽ Electronic Health Records ◽ Learning Algorithm ◽ Machine Learning Algorithm ◽ Health Records ◽ Electronic Health

A Machine Learning Algorithm for Identifying Atopic Dermatitis in Adults from Electronic Health Records

2017 IEEE International Conference on Healthcare Informatics (ICHI) ◽ 10.1109/ichi.2017.31 ◽ 2017 ◽ Cited By ~ 8

Author(s): Erin Gustafson ◽ Jennifer Pacheco ◽ Firas Wehbe ◽ Jonathan Silverberg ◽ William Thompson

Keyword(s): Machine Learning ◽ Atopic Dermatitis ◽ Electronic Health Records ◽ Learning Algorithm ◽ Machine Learning Algorithm ◽ Health Records ◽ Electronic Health

Inner parameters' optimization in the artificial neural network for the traffic data classification in radiofrequency applications: Classification of nonstationary data using the machine learning algorithm “random forest”

2018 Systems of Signals Generating and Processing in the Field of on Board Communications ◽ 10.1109/sosg.2018.8350599 ◽ 2018 ◽ Cited By ~ 1

Author(s): Kalashnikov Evgeniy Alexandrovich ◽ Kondybayeva Almagul Baurzhanovna ◽ Ositis Anastasia Petrovna

Keyword(s): Neural Network ◽ Machine Learning ◽ Artificial Neural Network ◽ Random Forest ◽ Learning Algorithm ◽ Parameters Optimization ◽ Machine Learning Algorithm ◽ Traffic Data ◽ Nonstationary Data

Classification of Phishing Email Using Random Forest Machine Learning Technique

Journal of Applied Mathematics ◽ 10.1155/2014/425731 ◽ 2014 ◽ Vol 2014 ◽ pp. 1-6 ◽ Cited By ~ 40

Author(s): Andronicus A. Akinyelu ◽ Aderemi O. Adewumi

Keyword(s): Machine Learning ◽ Random Forest ◽ Learning Algorithm ◽ False Negative ◽ Machine Learning Algorithm ◽ Detection Techniques ◽ Phishing Attacks ◽ Learning Technique ◽ Phishing Detection

Phishing is one of the major challenges faced by the world of e-commerce today. Because of phishing attacks, many companies and individuals have lost billions of dollars. In 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to increase, and more efficient phishing detection techniques are therefore required to curb the menace. This paper investigates and reports the use of the random forest machine learning algorithm in the classification of phishing attacks, with the major objective of developing an improved phishing email classifier with better prediction accuracy and fewer features. From a dataset consisting of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) was extracted and used by the machine learning algorithm, with a resulting classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.
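The accuracy, false negative rate, and false positive rate mentioned in this abstract can all be derived from a confusion matrix. The following is a minimal sketch, assuming phishing is treated as the positive class; the function name `classification_rates` and the counts used in the example are hypothetical and are not taken from the paper.

```python
def classification_rates(tp, fp, fn, tn):
    """Return accuracy, FP rate and FN rate from confusion-matrix counts.

    tp: phishing emails flagged as phishing (assumed positive class)
    fp: ham emails wrongly flagged as phishing
    fn: phishing emails wrongly passed as ham
    tn: ham emails passed as ham
    """
    if (fp + tn) == 0 or (fn + tp) == 0:
        raise ValueError("both classes must be present")
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Hypothetical counts for illustration only (100 phishing, 100 ham emails).
rates = classification_rates(tp=95, fp=2, fn=5, tn=98)
```

A low false negative rate matters most here, since a missed phishing email reaches the user, whereas a false positive only costs a misfiled legitimate message.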

Machine Learning Based Supervised Feature Selection Algorithm for Data Mining

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.j9483.0881019 ◽ 2019 ◽ Vol 8 (10) ◽ pp. 3396-3401 ◽ Cited By ~ 1

Keyword(s): Machine Learning ◽ Data Mining ◽ Feature Selection ◽ Learning Algorithm ◽ Modern World ◽ Feature Subset ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Minimum Number ◽ Preprocessing Technique

Data scientists focus on high-dimensional data to predict and reveal interesting patterns and the most useful information for the modern world. Feature selection is a preprocessing technique that improves the accuracy and efficiency of mining algorithms. Numerous feature selection algorithms exist, but most of them fail to give better mining results as the scale increases. This paper considers feature selection for supervised algorithms in data mining and gives an overview of existing machine learning algorithms for supervised feature selection. It introduces an enhanced supervised feature selection algorithm that selects the best feature subset by eliminating irrelevant features using distance correlation and redundant features using symmetric uncertainty. The experimental results show that the proposed algorithm provides better classification accuracy and selects a minimum number of features.
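Symmetric uncertainty, which this abstract uses to detect redundant features, is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below computes it for discrete features under that standard definition; it is an illustration only, not the paper's algorithm, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; treat as uninformative
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Two toy features: `b` duplicates `a`, while `c` is independent of `a`.
a = [0, 0, 1, 1]
b = [0, 0, 1, 1]
c = [0, 1, 0, 1]
su_redundant = symmetric_uncertainty(a, b)    # fully redundant pair
su_independent = symmetric_uncertainty(a, c)  # unrelated pair
```

A value near 1 suggests one of the two features is redundant given the other, while a value near 0 suggests they carry unrelated information. Continuous features would need to be discretized first.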

Validation of a machine learning algorithm for early severe sepsis prediction: a retrospective study predicting severe sepsis up to 48 h in advance using a diverse dataset from 461 US hospitals

BMC Medical Informatics and Decision Making ◽ 10.1186/s12911-020-01284-x ◽ 2020 ◽ Vol 20 (1)

Author(s): Hoyt Burdick ◽ Eduardo Pino ◽ Denise Gabel-Comeau ◽ Carol Gu ◽ Jonathan Roberts ◽ ...

Keyword(s): Machine Learning ◽ Severe Sepsis ◽ Electronic Health Records ◽ Patient Outcomes ◽ Learning Algorithm ◽ Validation Dataset ◽ Machine Learning Algorithm ◽ Health Records ◽ Testing Dataset ◽ Electronic Health

Background: Severe sepsis and septic shock are among the leading causes of death in the United States and sepsis remains one of the most expensive conditions to diagnose and treat. Accurate early diagnosis and treatment can reduce the risk of adverse patient outcomes, but the efficacy of traditional rule-based screening methods is limited. The purpose of this study was to develop and validate a machine learning algorithm (MLA) for severe sepsis prediction up to 48 h before onset using a diverse patient dataset. Methods: Retrospective analysis was performed on datasets composed of de-identified electronic health records collected between 2001 and 2017, including 510,497 inpatient and emergency encounters from 461 health centers collected between 2001 and 2015, and 20,647 inpatient and emergency encounters collected in 2017 from a community hospital. MLA performance was compared to commonly used disease severity scoring systems and was evaluated at 0, 4, 6, 12, 24, and 48 h prior to severe sepsis onset. Results: 270,438 patients were included in analysis. At time of onset, the MLA demonstrated an AUROC of 0.931 (95% CI 0.914, 0.948) and a diagnostic odds ratio (DOR) of 53.105 on a testing dataset, exceeding MEWS (0.725, P < .001; DOR 4.358), SOFA (0.716; P < .001; DOR 3.720), and SIRS (0.655; P < .001; DOR 3.290). For prediction 48 h prior to onset, the MLA achieved an AUROC of 0.827 (95% CI 0.806, 0.848) on a testing dataset. On an external validation dataset, the MLA achieved an AUROC of 0.948 (95% CI 0.942, 0.954) at the time of onset, and 0.752 at 48 h prior to onset. Conclusions: The MLA accurately predicts severe sepsis onset up to 48 h in advance using only readily available vital signs extracted from the existing patient electronic health records. Relevant implications for clinical practice include improved patient outcomes from early severe sepsis detection and treatment.
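The two headline metrics in this abstract can be computed from first principles. The sketch below assumes the standard definitions: AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties counted as one half), and DOR as (TP x TN) / (FP x FN). The function names and the toy inputs are hypothetical and do not reproduce the study's data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels must contain 1 (positive) and 0 (negative) entries.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN); undefined when FP or FN is zero."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when fp or fn is zero")
    return (tp * tn) / (fp * fn)


# Toy risk scores and outcomes for illustration only.
toy_auroc = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
toy_dor = diagnostic_odds_ratio(tp=90, fp=10, fn=10, tn=90)
```

The pairwise AUROC is quadratic in the number of cases, which is fine for a sketch; for large datasets a rank-based implementation from an established library would be the more practical choice.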

Recursive Feature Elimination with Ridge Regression (L2) Machine Learning Hybrid Feature Selection Algorithm for Diabetic Prediction using Random Forest Classifer.

10.21203/rs.3.rs-742641/v1 ◽ 2021

Author(s): K venkatachalam ◽ P Prabhu ◽ B saravana Balaji ◽ Mohamed Abouhawwash ◽ R Rajadevi

Keyword(s): Machine Learning ◽ Feature Selection ◽ Random Forest ◽ Ridge Regression ◽ Feature Selection Method ◽ Selection Method ◽ Recursive Feature Elimination ◽ Selection Algorithm ◽ Feature Selection Algorithm ◽ Data Set

In day-to-day life, cases of diabetes, in which the body is unable to metabolize glucose properly, are increasing. Correctly predicting which patients have diabetes is an important research area, and many researchers are proposing techniques to predict the disease through data mining and machine learning methods. In prediction, feature selection is one of the key concepts in preprocessing, so that only the features relevant to the disease are used for prediction, which improves prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce predictive models with high accuracy. In this proposed work, the wrapper-based feature selection method called Recursive Feature Elimination (RFE) is combined with Ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that overcomes the overfitting problem of the data set. Overfitting is a major problem in feature selection: the model does not fit new data well because the training data is small. Ridge regression is mainly used to overcome the overfitting problem. Once the features are selected using the proposed feature selection method, a random forest classifier is used to classify the data based on the selected features. The proposed work is evaluated on the PIDD data set, and the results are compared with existing algorithms to demonstrate the accuracy of the proposed algorithm. According to the results, the proposed algorithm predicts diabetes with higher accuracy than the other existing algorithms.
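The pipeline described in this abstract (RFE driven by an L2-penalized linear model, followed by a random forest on the selected features) can be sketched with scikit-learn. This is an illustration under stated assumptions rather than the authors' implementation: the data is a synthetic stand-in generated by `make_classification` instead of the PIDD data set, `RidgeClassifier` stands in for the ridge regression step, and the number of features to keep (4) and all hyperparameters are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular diabetes data set: 8 features, binary label.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Recursive Feature Elimination: repeatedly fit the ridge model and drop the
# feature with the smallest coefficient magnitude until 4 features remain.
selector = RFE(estimator=RidgeClassifier(alpha=1.0), n_features_to_select=4)
selector.fit(X_train, y_train)
selected = selector.support_  # boolean mask of the retained features

# Train the random forest on the selected features only.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(selector.transform(X_train), y_train)
accuracy = forest.score(selector.transform(X_test), y_test)
```

Fitting the selector on the training split only keeps the held-out accuracy from being influenced by the feature selection step.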

Machine learning algorithm to identifies fraud emails with feature selection

IOP Conference Series Materials Science and Engineering ◽ 10.1088/1757-899x/1088/1/012011 ◽ 2021 ◽ Vol 1088 (1) ◽ pp. 012011

Author(s): Anita Sindar Sinaga ◽ Musthafa Haris Munandar ◽ Arjon Samuel Sitio

Keyword(s): Machine Learning ◽ Feature Selection ◽ Learning Algorithm ◽ Machine Learning Algorithm

Multi-Class Assessment Based on Random Forests

Education Sciences ◽ 10.3390/educsci11030092 ◽ 2021 ◽ Vol 11 (3) ◽ pp. 92

Author(s): Mehdi Berriri ◽ Sofiane Djema ◽ Gaëtan Rey ◽ Christel Dartigues-Pallez

Keyword(s): Higher Education ◽ Machine Learning ◽ Random Forests ◽ Learning Algorithm ◽ Teaching Staff ◽ Machine Learning Algorithm ◽ Process Data ◽ Training Courses ◽ Education Courses

Today, many students are moving towards higher education courses that do not suit them and end up failing. The purpose of this study is to help provide counselors with better knowledge so that they can offer future students courses corresponding to their profile. The second objective is to allow the teaching staff to propose training courses adapted to students by anticipating their possible difficulties. This is possible thanks to a machine learning algorithm called Random Forest, allowing for the classification of the students depending on their results. We had to process data, generate models using our algorithm, and cross the results obtained to have a better final prediction. We tested our method on different use cases, from two classes to five classes. These sets of classes represent the different intervals with an average ranging from 0 to 20. Thus, an accuracy of 75% was achieved with a set of five classes and up to 85% for sets of two and three classes.
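The class sets described here map a student's average (0 to 20) onto two to five intervals. The abstract does not state the interval boundaries, so the sketch below assumes equal-width intervals purely for illustration; the function name `grade_class` and its parameters are hypothetical.

```python
def grade_class(average, n_classes, max_grade=20.0):
    """Map an average in [0, max_grade] to a class index 0..n_classes-1.

    Assumes equal-width intervals, which the source does not confirm.
    """
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    if not 0.0 <= average <= max_grade:
        raise ValueError("average is outside the grading scale")
    width = max_grade / n_classes
    # The top of the scale belongs to the last class, not a new one.
    return min(int(average // width), n_classes - 1)


# With two classes the split is at 10; with five classes, every 4 points.
two_class_label = grade_class(9.9, n_classes=2)
five_class_label = grade_class(7.5, n_classes=5)
```

Labels produced this way could then serve as the target for a multi-class classifier such as a random forest.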
