Prediction of heart disease using Apache Spark: analysing decision trees and the gradient boosting algorithm

Author(s): Saryu Chugh, K Arivu Selvan, RK Nadesh
2021, Vol 5 (1)
Author(s): Osman Mamun, Madison Wenzlick, Arun Sathanur, Jeffrey Hawk, Ram Devanathan

Abstract The Larson–Miller parameter (LMP) offers an efficient and fast scheme to estimate the creep rupture life of alloy materials for high-temperature applications; however, poor generalizability and dependence on the constant C often result in sub-optimal performance. In this work, we show that direct rupture-life parameterization, without intermediate LMP parameterization, using a gradient boosting algorithm can be used to train ML models for very accurate prediction of rupture life in a variety of alloys (Pearson correlation coefficient >0.9 for 9–12% Cr and >0.8 for austenitic stainless steels). In addition, the Shapley value was used to quantify feature importance, making the model interpretable by identifying the effect of various features on model performance. Finally, a variational autoencoder-based generative model, conditioned on the experimental dataset, was built to sample from the learnt joint distribution hypothetical synthetic candidate alloys not present in either the 9–12% Cr ferritic–martensitic alloy or the austenitic stainless steel dataset.
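For reference, the classical parameterization this abstract contrasts against is the Larson–Miller relation LMP = T(C + log10 t_r), with T the absolute temperature and t_r the rupture life. A minimal sketch, using an illustrative material constant C = 20 and example values that are not taken from the paper:

```python
import math

# Classical Larson–Miller relation: LMP = T * (C + log10(t_r)),
# with T the absolute temperature (K), t_r the rupture time (hours),
# and C a material constant (commonly taken near 20 for steels).
def lmp(temperature_k: float, rupture_hours: float, c: float = 20.0) -> float:
    """Larson–Miller parameter for a given temperature and rupture life."""
    return temperature_k * (c + math.log10(rupture_hours))

def rupture_life(temperature_k: float, lmp_value: float, c: float = 20.0) -> float:
    """Invert the relation to estimate rupture life (hours) at a temperature."""
    return 10 ** (lmp_value / temperature_k - c)

# Round-trip check: 10,000 h at 900 K maps to one LMP value and back.
p = lmp(900.0, 10_000.0)
print(round(rupture_life(900.0, p)))  # 10000
```

The dependence on the single constant C is exactly what limits the LMP route: a direct regression from composition and test conditions to rupture life, as in the paper, avoids fixing C at all.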


Author(s): Hai Tao, Maria Habib, Ibrahim Aljarah, Hossam Faris, Haitham Abdulmohsin Afan, ...

2021, Vol 0 (0)
Author(s): Colin Griesbach, Benjamin Säfken, Elisabeth Waldmann

Abstract Gradient boosting from the field of statistical learning is widely known as a powerful framework for estimation and selection of predictor effects in various regression models by adapting concepts from classification theory. Current boosting approaches also offer methods accounting for random effects and thus enable prediction of mixed models for longitudinal and clustered data. However, these approaches include several flaws resulting in unbalanced effect selection with falsely induced shrinkage and a low convergence rate on the one hand and biased estimates of the random effects on the other hand. We therefore propose a new boosting algorithm which explicitly accounts for the random structure by excluding it from the selection procedure, properly correcting the random effects estimates and in addition providing likelihood-based estimation of the random effects variance structure. The new algorithm offers an organic and unbiased fitting approach, which is shown via simulations and data examples.


2014, Vol 26 (4), pp. 781-817
Author(s): Ching-Pei Lee, Chih-Jen Lin

Linear rankSVM is one of the widely used methods for learning to rank. Although its performance may be inferior to nonlinear methods such as kernel rankSVM and gradient boosting decision trees, linear rankSVM is useful for quickly producing a baseline model. Furthermore, following its recent development for classification, linear rankSVM may give competitive performance for large and sparse data. A great deal of work has studied linear rankSVM, with the focus on computational efficiency when the number of preference pairs is large. In this letter, we systematically study existing works, discuss their advantages and disadvantages, and propose an efficient algorithm. We discuss different implementation issues and extensions with detailed experiments. Finally, we develop a robust linear rankSVM tool for public use.
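As a concrete reference point, here is a minimal sketch of the pairwise squared hinge loss that L2-loss linear rankSVM minimizes; the naive double loop over preference pairs is exactly the quadratic cost that efficient rankSVM algorithms avoid. The data and weight vector are illustrative, not from the letter:

```python
# Pairwise squared hinge loss of L2-loss linear rankSVM (sketch):
# for every preference pair (i, j) with relevance y_i > y_j,
# the model is penalized unless w.x_i exceeds w.x_j by a margin of 1.
def rank_svm_loss(w, xs, ys, reg=1.0):
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    loss = 0.5 * reg * dot(w, w)  # L2 regularization term
    for i in range(len(xs)):          # naive O(l^2) pair enumeration
        for j in range(len(xs)):
            if ys[i] > ys[j]:         # i should be ranked above j
                margin = dot(w, xs[i]) - dot(w, xs[j])
                loss += max(0.0, 1.0 - margin) ** 2
    return loss

# A weight vector that orders the points by their first feature with
# margins above 1 incurs no pair loss, only the regularization term.
xs = [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
ys = [3, 2, 1]
w = (2.0, 0.0)
print(rank_svm_loss(w, xs, ys))  # 2.0
```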


2018, Vol 7 (2.32), pp. 363
Author(s): N Rajesh, Maneesha T, Shaik Hafeez, Hari Krishna

Heart disease is one of the most common diseases. Because it is so prevalent, we selected attributes that relate well to heart disease in order to find a better prediction method, and we applied several algorithms for prediction. The Naive Bayes algorithm is analysed on a dataset of risk factors. We also used decision trees and combinations of algorithms to predict heart disease from these attributes. The results show that the Naive Bayes algorithm gives more accurate results when the dataset is small, while decision trees give more accurate results when the dataset is large.
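To make the Naive Bayes side of the comparison concrete, here is a minimal Bernoulli Naive Bayes sketch over binary risk-factor attributes; the feature names and toy rows are illustrative, not the study's dataset:

```python
from collections import Counter

# Tiny Bernoulli Naive Bayes over binary risk factors (illustrative data):
# each row is (smoker, high_bp, overweight); label 1 = heart disease.
def train_nb(rows, labels):
    """Return class priors and per-class, per-feature P(feature=1 | class)."""
    classes = Counter(labels)
    n_feats = len(rows[0])
    cond = {}
    for c in classes:
        members = [r for r, l in zip(rows, labels) if l == c]
        # Laplace smoothing avoids zero probabilities for unseen values.
        cond[c] = [(sum(r[f] for r in members) + 1) / (len(members) + 2)
                   for f in range(n_feats)]
    priors = {c: n / len(labels) for c, n in classes.items()}
    return priors, cond

def predict_nb(priors, cond, x):
    """Pick the class maximizing prior * product of feature likelihoods."""
    def score(c):
        p = priors[c]
        for f, v in enumerate(x):
            p *= cond[c][f] if v else 1 - cond[c][f]
        return p
    return max(priors, key=score)

rows = [(1, 1, 1), (1, 1, 0), (0, 1, 1), (0, 0, 0), (0, 0, 1), (1, 0, 0)]
labels = [1, 1, 1, 0, 0, 0]
priors, cond = train_nb(rows, labels)
print(predict_nb(priors, cond, (1, 1, 1)))  # 1
```

The per-feature independence assumption is what keeps Naive Bayes stable on small datasets, consistent with the study's observation; tree-based methods need more data to estimate their splits reliably.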


2021, Vol 9 (2), pp. 85-100
Author(s): Md Saikat Hosen, Ruhul Amin

In gradient boosting machines, the learning process successively fits fresh models to provide a more precise approximation of the response variable. The principal notion behind this algorithm is that each new base learner is constructed to be maximally correlated with the negative gradient of the loss function of the entire ensemble. The loss function can be chosen arbitrarily; however, for intuition, if the error function is the squared-error loss, the learning process amounts to sequential fitting of the errors. This study aims to delineate the significance of the gradient boosting algorithm in data management systems, dwelling in particular on its significance in text classification as well as the limitations of the model. The basic methodology and base-learning algorithm of gradient boosting, as originally formulated by Friedman, are presented, and may serve as an introduction to gradient boosting algorithms. Both the theoretical framework and the design choices are described and outlined. We examine all the basic stages of designing a particular model for one's experimental needs. Interpretation issues are addressed and presented as an essential part of the investigation. The capabilities of gradient boosting algorithms are examined on a set of real-world practical applications such as text classification.
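The sequential error-fitting view described above can be sketched directly: with squared-error loss, the negative gradient is just the residual, so each round fits a weak learner (here a one-split decision stump) to the current residuals and adds it with a shrinkage factor. A minimal illustration on toy one-dimensional data, not tied to any particular library:

```python
# Gradient boosting with squared-error loss reduces to sequential residual
# fitting: each round, a weak learner is fit to the residuals (= negative
# gradient) and added to the ensemble with a shrinkage factor (learning rate).
def fit_stump(xs, residuals):
    """Best single-threshold split minimizing squared error on residuals."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left) if left else 0.0
        rm = sum(right) / len(right) if right else 0.0
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, rounds=50, lr=0.3):
    base = sum(ys) / len(ys)            # initial constant model
    learners = []
    for _ in range(rounds):
        preds = [base + lr * sum(h(x) for h in learners) for x in xs]
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        learners.append(fit_stump(xs, residuals))
    return lambda x: base + lr * sum(h(x) for h in learners)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
model = gradient_boost(xs, ys)
```

After 50 rounds the ensemble reproduces the piecewise-constant target closely on the training points; the learning rate trades convergence speed for resistance to overfitting, which is one of the design choices the article discusses.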


2019, Vol 47 (W1), pp. W136-W141
Author(s): Emidio Capriotti, Ludovica Montanucci, Giuseppe Profiti, Ivan Rossi, Diana Giannuzzi, ...

Abstract As the amount of genomic variation data increases, tools able to score the functional impact of single-nucleotide variants become more and more necessary. While several prediction servers are available for interpreting the effects of variants in the human genome, only a few have been developed for other species, and none were specifically designed for species of veterinary interest such as the dog. Here, we present Fido-SNP, the first predictor able to discriminate between pathogenic and benign single-nucleotide variants in the dog genome. Fido-SNP is a binary classifier based on the gradient boosting algorithm. It is able to classify and score the impact of variants in both coding and non-coding regions, based on sequence features, within seconds. When validated on a previously unseen set of annotated variants from the OMIA database, Fido-SNP reaches 88% overall accuracy, 0.77 Matthews correlation coefficient and 0.91 area under the ROC curve.
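The three reported validation metrics can each be computed from first principles; the sketch below does so on illustrative labels and scores (not Fido-SNP output):

```python
import math

# Accuracy, Matthews correlation coefficient, and ROC AUC for a binary
# classifier, computed from scratch on toy labels/scores.
def accuracy(labels, preds):
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)

def mcc(labels, preds):
    tp = sum(l and p for l, p in zip(labels, preds))
    tn = sum(not l and not p for l, p in zip(labels, preds))
    fp = sum(not l and p for l, p in zip(labels, preds))
    fn = sum(l and not p for l, p in zip(labels, preds))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(labels, scores):
    # Probability that a random positive outranks a random negative,
    # counting ties as one half.
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
preds = [s >= 0.5 for s in scores]
print(round(accuracy(labels, preds), 2),
      round(mcc(labels, preds), 2),
      round(roc_auc(labels, scores), 2))  # 0.67 0.33 0.89
```

Note that accuracy and MCC depend on the 0.5 decision threshold, while AUC is threshold-free; reporting all three, as the abstract does, guards against a classifier that looks good on only one of them.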

