Predicting clinical outcomes in the Machine Learning era: The Piacenza score a purely data driven approach for mortality prediction in COVID-19 Pneumonia (Preprint)

2021 ◽  
Author(s):  
Geza Halasz ◽  
Michela Sperti ◽  
Matteo Villani ◽  
Umberto Michelucci ◽  
Piergiuseppe Agostoni ◽  
...  

BACKGROUND Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated sufficient discriminatory capacity. Machine-learning algorithms represent a novel approach for data-driven prediction of clinical outcomes, with advantages over statistical modelling. OBJECTIVE To develop the Piacenza score, a machine-learning-based score, to predict 30-day mortality in patients with COVID-19 pneumonia. METHODS The study comprised 852 patients with COVID-19 pneumonia admitted to the Guglielmo da Saliceto Hospital (Italy) from February to November 2020. The patients’ medical history, demographic and clinical data were collected in electronic health records. The overall patient dataset was randomly split into derivation and test cohorts. The score was obtained through the Naïve Bayes classifier and externally validated on 86 patients admitted to Centro Cardiologico Monzino (Italy) in February 2020. Using a forward-search algorithm, six features were identified: age; mean corpuscular haemoglobin concentration; PaO2/FiO2 ratio; temperature; previous stroke; gender. The Brier index was used to evaluate the ability of ML to stratify and predict observed outcomes. A user-friendly website (https://covid.7hc.tech.) was designed and developed to enable fast and easy use of the tool by the final user (i.e., the physician). To allow customization of the Piacenza score, we added a personalized version of the algorithm to the website, which enables an optimized computation of the mortality risk score for a single patient when some variables used by the Piacenza score are not available. In this case, the Naïve Bayes classifier is re-trained over the same derivation cohort but using a different set of patient characteristics. We also compared the Piacenza score with the 4C score and with a Naïve Bayes algorithm with 14 features chosen a priori.
RESULTS The Piacenza score showed an AUC of 0.78 (95% CI 0.74-0.84, Brier score 0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score 0.16) in the external validation cohort, showing accuracy comparable to that of the 4C score and of the Naïve Bayes model with a priori chosen features, which achieved an AUC of 0.78 (95% CI 0.73-0.83, Brier score 0.26) and 0.80 (95% CI 0.75-0.86, Brier score 0.17), respectively. CONCLUSIONS A personalized machine-learning-based score with purely data-driven feature selection is feasible and effective for predicting mortality in patients with COVID-19 pneumonia.
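
The abstract above reports discrimination as AUC and calibration as a Brier score. As a minimal sketch of how these two metrics are commonly computed (not the authors' code; the function names and the toy outcomes and risks are hypothetical), using only the Python standard library:

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study:
# 1 = died within 30 days, 0 = survived; risks are predicted probabilities.
outcomes = [1, 0, 1, 0, 0]
risks = [0.9, 0.2, 0.6, 0.7, 0.1]
print(brier_score(outcomes, risks), roc_auc(outcomes, risks))
```

A lower Brier score indicates better-calibrated probabilities, while a higher AUC indicates better ranking of high-risk over low-risk patients, so the two are usually read together.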

2021 ◽  
Author(s):  
Geza Halasz ◽  
Michela Sperti ◽  
Matteo Villani ◽  
Umberto Michelucci ◽  
Piergiuseppe Agostoni ◽  
...  

Background Several models have been developed to predict mortality in patients with COVID-19 pneumonia, but only a few have demonstrated sufficient discriminatory capacity. Machine-learning (ML) algorithms represent a novel approach for data-driven prediction of clinical outcomes, with advantages over statistical modelling. We developed the Piacenza score, an ML-based score, to predict 30-day mortality in patients with COVID-19 pneumonia. Methods A total of 852 patients (mean age 70 years, 70% male) were enrolled from February to November 2020. The dataset was randomly split into derivation and test cohorts. The Piacenza score was obtained through the Naive Bayes classifier and externally validated on 86 patients. Using a forward-search algorithm, the following six features were identified: age; mean corpuscular haemoglobin concentration; PaO2/FiO2 ratio; temperature; previous stroke; gender. If one or more of the features are not available for a patient, the model can be re-trained using only the provided features. We also compared the Piacenza score with the 4C score and with a Naive Bayes algorithm with 14 variables chosen a priori. Results The Piacenza score showed an AUC of 0.78 (95% CI 0.74-0.84, Brier score 0.19) in the internal validation cohort and 0.79 (95% CI 0.68-0.89, Brier score 0.16) in the external validation cohort, showing accuracy comparable to that of the 4C score and of the Naive Bayes model with a priori chosen features, which achieved an AUC of 0.78 (95% CI 0.73-0.83, Brier score 0.26) and 0.80 (95% CI 0.75-0.86, Brier score 0.17), respectively. Conclusion A personalized ML-based score with purely data-driven feature selection is feasible and effective for predicting mortality in patients with COVID-19 pneumonia.
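
The re-training idea described in the abstract above (fit the classifier again using only the features available for a given patient) can be sketched as follows. This is an illustration under stated assumptions, not the published model: it assumes Gaussian likelihoods for every feature, whereas binary features such as previous stroke or gender would need a categorical likelihood, and the feature names, function names and toy cohort are all hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, features):
    """Estimate class priors and per-feature mean/variance using only `features`."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {"features": list(features), "classes": {}}
    for label, members in grouped.items():
        stats = {}
        for f in features:
            values = [m[f] for m in members]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            stats[f] = (mean, var + 1e-9)  # small floor avoids division by zero
        model["classes"][label] = (len(members) / len(rows), stats)
    return model


def predict_proba(model, row):
    """Posterior probability of each class for one patient record."""
    log_post = {}
    for label, (prior, stats) in model["classes"].items():
        total = math.log(prior)
        for f in model["features"]:
            mean, var = stats[f]
            total += -0.5 * math.log(2 * math.pi * var) - (row[f] - mean) ** 2 / (2 * var)
        log_post[label] = total
    top = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {k: math.exp(v - top) for k, v in log_post.items()}
    norm = sum(unnorm.values())
    return {k: v / norm for k, v in unnorm.items()}


def predict_with_available(rows, labels, patient):
    """Retrain on the features this patient actually has, then predict."""
    available = [f for f, v in patient.items() if v is not None]
    model = fit_gaussian_nb(rows, labels, available)
    return predict_proba(model, patient)


# Hypothetical derivation cohort (values invented for illustration only).
cohort = [
    {"age": 82, "temperature": 38.5, "pf_ratio": 150},
    {"age": 79, "temperature": 38.1, "pf_ratio": 180},
    {"age": 88, "temperature": 37.9, "pf_ratio": 120},
    {"age": 45, "temperature": 37.0, "pf_ratio": 350},
    {"age": 52, "temperature": 37.4, "pf_ratio": 320},
    {"age": 60, "temperature": 36.8, "pf_ratio": 400},
]
deaths = [1, 1, 1, 0, 0, 0]  # 1 = died within 30 days

# Temperature is missing, so the model is refit on age and pf_ratio only.
new_patient = {"age": 85, "temperature": None, "pf_ratio": 140}
print(predict_with_available(cohort, deaths, new_patient))
```

Because Naive Bayes treats features as conditionally independent, dropping a feature only removes its factor from the product, which makes this kind of per-patient refit inexpensive.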


2020 ◽  
Vol 1 (2) ◽  
pp. 61-66
Author(s):  
Febri Astiko ◽  
Achmad Khodar

This study aims to design a machine learning model for sentiment analysis of Indosat Ooredoo service reviews on the social media platform Twitter, using the Naive Bayes algorithm to classify reviews as positive or negative. The sentiment analysis uses machine learning to learn patterns and a model that can be reused to predict new data.


Author(s):  
Mingtao Wu ◽  
Vir V. Phoha ◽  
Young B. Moon ◽  
Amith K. Belman

3D printing, or additive manufacturing, is a key technology for future manufacturing systems. However, 3D printing systems have unique vulnerabilities arising from the ability to affect the infill without affecting the exterior. In order to detect malicious infill defects in the 3D printing process, this paper proposes the following: 1) investigating malicious defects in the 3D printing process, 2) extracting features based on simulated 3D printing process images, and 3) conducting an image classification experiment with one group of non-defect infill images and another group of defect infill training images from the 3D printing process. The images are captured layer by layer from the top view of the software simulation preview. The data extracted from the images are input to two machine learning algorithms, the Naive Bayes Classifier and J48 Decision Trees. The results show that the Naive Bayes Classifier has an accuracy of 85.26% and J48 Decision Trees have an accuracy of 95.51% for classification.


With the growing volume of spam messages, there is increasing demand for effective spam detection methods. The growth of mobile phones and smartphones has led to a drastic increase in SMS spam messages. The advancement and ease of use of the mobile messaging channel have attracted hackers to carry out attacks through SMS messages. This leads to fraudulent use of other people's accounts and transactions, resulting in loss of service and profit to the owners. With this background, this paper focuses on predicting spam SMS messages. The SMS Spam Message Detection dataset from the Kaggle machine learning repository is used for the prediction analysis. The analysis of spam message detection is carried out in four steps. Firstly, the distribution of the target variable Spam Type in the dataset is identified and represented graphically. Secondly, the top word features for the spam and ham messages are extracted using Count Vectorizer and displayed as spam and ham word clouds. Thirdly, the count-vectorized features of the SMS Spam Message Detection dataset are fitted to various classifiers: KNN, Random Forest, Linear SVM, AdaBoost, Kernel SVM, Logistic Regression, Gaussian Naive Bayes, Decision Tree, Extra Tree, Gradient Boosting and Multinomial Naive Bayes. Finally, performance is analyzed using metrics such as accuracy, F-score, precision and recall. The implementation is done in Python in Anaconda Spyder Navigator. Experimental results show that the Multinomial Naive Bayes classifier achieved the most effective prediction, with a precision of 0.98, recall of 0.98, F-score of 0.98 and accuracy of 98.20%.
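
The pipeline described above (word counts fed to a Multinomial Naive Bayes classifier, then scored with precision, recall and F-score) can be sketched in plain Python. This is an illustrative stand-in, not the paper's implementation: the whitespace tokenizer replaces a real count vectorizer, the function names are hypothetical, and the four training messages are invented rather than taken from the Kaggle dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for a real count vectorizer."""
    return text.lower().split()


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Collect per-class word counts and class frequencies; `alpha` is Laplace smoothing."""
    word_counts = {label: Counter() for label in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return {"word_counts": word_counts, "class_counts": class_counts,
            "vocab": vocab, "alpha": alpha}


def classify(model, text):
    """Return the class with the highest log-posterior for `text`."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for label, counts in model["word_counts"].items():
        denom = sum(counts.values()) + model["alpha"] * vocab_size
        score = math.log(model["class_counts"][label] / total_docs)
        for word in tokenize(text):
            if word in model["vocab"]:  # words unseen in training are skipped
                score += math.log((counts[word] + model["alpha"]) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall and F1 for the `positive` class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy messages, invented for illustration.
train_texts = ["win a free prize now", "free cash claim now",
               "are we meeting for lunch", "see you at home tonight"]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_multinomial_nb(train_texts, train_labels)
print(classify(model, "claim your free prize"))
```

Laplace smoothing keeps a word that appears in only one class from driving the other class's probability to zero, which matters for short texts such as SMS messages.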


2022 ◽  
Vol 07 (01) ◽  
Author(s):  
Ramakrishna Hegde ◽  

The researcher explains the implementation process of determining scholarship eligibility for students using a supervised machine learning algorithm, namely the Naïve Bayes algorithm. In addition, the paper includes a brief description of the naïve Bayes classifier used by the author. It explains the significance of the training data set and the testing data set in machine learning techniques. Machine learning is now a widely used technique in the IT industry. It is a very effective tool for many fields such as education, IT and even business. In this paper, the researcher attempts to determine the scholarship result status of college students automatically using the naïve Bayes classifier algorithm, based on the student's academic performance, communication skills, grasping power, IHS, income, time management, regularity, etc. A scholarship gives a student strength and self-confidence. It also indirectly boosts student performance. Scholarships are usually provided by governments or government organizations. It is very important for students to recognize their own potential early in their academic career so that they can accelerate its growth; receiving recognition from an employer or organization helps students take this step. Students can apply for scholarships based on eligibility criteria (such as caste category, annual income, etc.). Scholarships are issued based on merit, student performance and career-specific criteria. Different scholarship schemes are offered to students based on distinct eligibility criteria. Using a naïve Bayes classifier, the researcher obtained a result with an accuracy of 96.7% and an error of 3.3%. The scholarship status of students was displayed as yes or no.


2015 ◽  
Vol 50 (4) ◽  
pp. 293-296 ◽  
Author(s):  
D Chaki ◽  
A Das ◽  
MI Zaber

The classification of heart disease patients is of great importance in cardiovascular disease diagnosis. Researchers have used numerous data mining techniques to aid health care professionals in the diagnosis of heart disease, and many algorithms have been proposed for this task in the past few years. In this paper, we have studied different supervised machine learning techniques for classification of heart disease data and have performed a procedural comparison of these. We have used the C4.5 decision tree classifier, a naïve Bayes classifier, and a Support Vector Machine (SVM) classifier over a large set of heart disease data. The data used in this study is the Cleveland Clinic Foundation Heart Disease Data Set available at the UCI Machine Learning Repository. We have found that SVM outperformed both the naïve Bayes and C4.5 classifiers, achieving the best accuracy by correctly classifying the highest number of instances. We have also found that the naïve Bayes classifier achieved competitive performance even though the assumption of normality of the data is strongly violated. Bangladesh J. Sci. Ind. Res. 50(4), 293-296, 2015


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Anunchai Assawamakin ◽  
Supakit Prueksaaroon ◽  
Supasak Kulawonganunchai ◽  
Philip James Shaw ◽  
Vara Varavithya ◽  
...  

Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features, of which the top-ranked are most likely to include the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNP array), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed two-step Bayes classification framework performed as well as and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time.
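
The pruning step described above (remove the bottom-ranked feature one at a time and re-check accuracy) can be sketched generically. The abstract does not state how the final set is chosen, so this sketch assumes the smallest set whose accuracy is at least as good as any larger set is kept; the function name, marker names and accuracy values are hypothetical, and the classifier is abstracted behind an `evaluate` callback rather than being a Hidden Naïve Bayes implementation.

```python
from typing import Callable, List, Sequence, Tuple


def prune_ranked_features(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Drop the lowest-ranked feature one at a time; keep the smallest best-scoring set.

    `ranked` is ordered from most to least informative. `evaluate` returns the
    classification accuracy obtained with a given feature list.
    """
    current = list(ranked)
    best_set, best_acc = list(current), evaluate(current)
    while len(current) > 1:
        current = current[:-1]  # remove the current bottom-ranked feature
        acc = evaluate(current)
        if acc >= best_acc:  # ties favour the smaller set (assumed rule)
            best_set, best_acc = list(current), acc
    return best_set, best_acc


# Hypothetical accuracies keyed by number of features kept; a real run would
# train and cross-validate a classifier inside `evaluate` instead.
toy_accuracy = {5: 0.90, 4: 0.92, 3: 0.92, 2: 0.85, 1: 0.70}
ranked_markers = ["m1", "m2", "m3", "m4", "m5"]
selected, accuracy = prune_ranked_features(
    ranked_markers, lambda feats: toy_accuracy[len(feats)]
)
print(selected, accuracy)
```

Keeping the classifier behind a callback means the same loop works with any model whose accuracy can be estimated on a feature subset.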


Author(s):  
Neli Kalcheva ◽  
Maya Todorova ◽  
Ginka Marinova ◽  
...  

The purpose of the publication is to analyse popular classification algorithms in machine learning. The following classifiers were studied: Naive Bayes Classifier, Decision Tree and AdaBoost Ensemble Algorithm. Their advantages and disadvantages are discussed. Research shows that there is no comprehensive universal method or algorithm for classification in machine learning. Each method or algorithm works well depending on the specifics of the task and the data used.

