2021 ◽  
Vol 24 ◽  
pp. 1-12
Author(s):  
Anna Aksenova

Subject-predicate agreement in Russian is much less trivial than it might seem at first glance. This paper deals with the case where the subject is realized by a combination of a noun and a quantifier. I analyze a set of examples with the words двое ('two'), трое ('three'), пара ('pair'), тройка ('trio'), десяток ('ten'), сотня ('hundred'), тысяча ('thousand'), миллион ('million') and миллиард ('billion'), where predicate number agreement varies. Using Random Forest, CIT and Logistic Regression algorithms, I show that collective (двое, трое) and non-collective (пара, тройка, десяток, сотня, тысяча, миллион, миллиард) quantifiers exhibit different patterns of agreement. The first group tends to trigger plural agreement more often, while singular agreement is more typical of the second. Moreover, the position of the quantifier phrase relative to the predicate can also influence the choice of number marker on the verb.


2021 ◽  
Vol 1208 (1) ◽  
pp. 012039
Author(s):  
Vedran Grgić ◽  
Denis Mušić ◽  
Elmir Babović

Abstract The paper analyzes the cardiovascular parameters of patients with heart disease. The aim of this study was to predict death in patients with cardiovascular disease based on 12 parameters, using Random Forest and Logistic Regression algorithms. Hyperparameters were tuned for both algorithms to determine the best settings. The most significant predictive factors were identified using the feature selection method of each algorithm. In a comparative analysis of the results, the highest accuracy, 90%, was obtained using the Random Forest algorithm.
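
As a rough illustration of the logistic regression side of such a comparison, the sketch below fits a two-feature model by plain gradient descent. The data are synthetic and invented for this example, and the function names `fit_logistic` and `predict_proba` are hypothetical; nothing here reproduces the paper's 12 parameters, tuning, or results.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic, standardized toy features (NOT data from the paper).
X = [[-1.2, -0.8], [-0.9, -1.1], [-0.3, -0.5], [-0.1, -0.4],
     [0.4, 0.6], [0.9, 0.7], [1.1, 1.3], [1.4, 0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(X, y)
accuracy = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi)
               for xi, yi in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

Accuracy measured on the training data, as here, is optimistic; a real comparison would report it on held-out patients.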


2018 ◽  
Author(s):  
Brian Hill ◽  
Robert Brown ◽  
Eilon Gabel ◽  
Christine Lee ◽  
Maxime Cannesson ◽  
...  

Abstract Background: Predicting preoperative in-hospital mortality using readily-available electronic medical record (EMR) data can aid clinicians in accurately and rapidly determining surgical risk. While previous work has shown that the American Society of Anesthesiologists (ASA) Physical Status Classification is a useful, though subjective, feature for predicting surgical outcomes, obtaining this classification requires a clinician to review the patient’s medical records. Our goal here is to create an improved risk score using electronic medical records and demonstrate its utility in predicting in-hospital mortality without requiring clinician-derived ASA scores. Methods: Data from 49,513 surgical patients were used to train logistic regression, random forest, and gradient boosted tree classifiers for predicting in-hospital mortality. The features used are readily available before surgery from EMR databases. A gradient boosted tree regression model was trained to impute the ASA Physical Status Classification, and this new, imputed score was included as an additional feature to preoperatively predict in-hospital post-surgical mortality. The preoperative risk prediction was then used as an input feature to a deep neural network (DNN), along with intraoperative features, to predict postoperative in-hospital mortality risk. Performance was measured using the area under the receiver operating characteristic (ROC) curve (AUC). Results: We found that the random forest classifier (AUC 0.921, 95% CI 0.908-0.934) outperforms logistic regression (AUC 0.871, 95% CI 0.841-0.900) and gradient boosted trees (AUC 0.897, 95% CI 0.881-0.912) in predicting in-hospital post-surgical mortality. Using logistic regression, the ASA Physical Status Classification score alone had an AUC of 0.865 (95% CI 0.848-0.882). Adding preoperative features to the ASA Physical Status Classification improved the random forest AUC to 0.929 (95% CI 0.915-0.943). Using only automatically obtained preoperative features with no clinician intervention, we found that the random forest model achieved an AUC of 0.921 (95% CI 0.908-0.934). Integrating the preoperative risk prediction into the DNN for postoperative risk prediction results in an AUC of 0.924 (95% CI 0.905-0.941), and with both a preoperative and postoperative risk score for each patient, we were able to show that the mortality risk changes over time. Conclusions: Features easily extracted from EMR data can be used to preoperatively predict the risk of in-hospital post-surgical mortality in a fully automated fashion, with accuracy comparable to models trained on features that require clinical expertise. This preoperative risk score can then be compared to the postoperative risk score to show that the risk changes, and therefore should be monitored longitudinally over time. Author summary: Rapid, preoperative identification of those patients at highest risk for medical complications is necessary to ensure that limited infrastructure and human resources are directed towards those most likely to benefit. Existing risk scores either lack specificity at the patient level, or utilize the American Society of Anesthesiologists (ASA) physical status classification, which requires a clinician to review the chart. In this manuscript we report on using machine-learning algorithms, specifically random forest, to create a fully automated score that predicts preoperative in-hospital mortality based solely on structured data available at the time of surgery. This score has a higher AUC than both the ASA physical status score and the Charlson comorbidity score. Additionally, we integrate this score with a previously published postoperative score to demonstrate the extent to which patient risk changes during the perioperative period.
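
The AUC used as the performance measure above can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen patient who survived. A minimal sketch of that pairwise definition follows; the labels and scores are invented for illustration, and the function name `roc_auc` is hypothetical rather than taken from the study.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Invented example: 1 = died in hospital, 0 = survived (NOT study data).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.05, 0.20, 0.30, 0.40, 0.70, 0.90]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 8 of 9 pairs ranked correctly -> 0.889
```

The pairwise loop is quadratic in the number of patients; for cohorts of the size reported here, a rank-based implementation from an established library would be the practical choice.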


2021 ◽  
Vol 2083 (3) ◽  
pp. 032059
Author(s):  
Qiang Chen ◽  
Meiling Deng

Abstract Regression algorithms are commonly used in machine learning. Drawing on encryption and privacy protection methods, this paper studies the regression algorithm, currently a key technology, together with the same encryption technology. It proposes a PPLAR-based algorithm. The correlation between data items is obtained using the logistic regression formula. The algorithm is distributed and parallelized on the Hadoop platform to improve the computing speed of the cluster while maintaining the algorithm's mean absolute error.


2020 ◽  
Author(s):  
Jun Ke ◽  
Yiwei Chen ◽  
Xiaoping Wang ◽  
Zhiyong Wu ◽  
Qiongyao Zhang ◽  
...  

Abstract Background: The purpose of this study is to identify the risk factors of in-hospital mortality in patients with acute coronary syndrome (ACS) and to evaluate the performance of traditional regression and machine learning prediction models. Methods: The data of ACS patients who entered the emergency department of Fujian Provincial Hospital from January 1, 2017 to March 31, 2020 for chest pain were retrospectively collected. The study used univariate and multivariate logistic regression analysis to identify risk factors for in-hospital mortality of ACS patients. Traditional regression and machine learning algorithms were used to develop predictive models, and sensitivity, specificity, and the receiver operating characteristic curve were used to evaluate the performance of each model. Results: A total of 7810 ACS patients were included in the study, and the in-hospital mortality rate was 1.75%. Multivariate logistic regression analysis found that age, calcium channel blockers, and levels of D-dimer, cardiac troponin I, N-terminal pro-B-type natriuretic peptide (NT-proBNP), lactate dehydrogenase (LDH), and high-density lipoprotein (HDL) cholesterol were independent predictors of in-hospital mortality. The areas under the receiver operating characteristic curve of the models developed by logistic regression, gradient boosting decision tree (GBDT), random forest, and support vector machine (SVM) for predicting the risk of in-hospital mortality were 0.963, 0.960, 0.963, and 0.959, respectively. Feature importance evaluation found that NT-proBNP, LDH, and HDL cholesterol were the top three variables contributing to the prediction performance of the GBDT and random forest models. Conclusions: Predictive models developed using logistic regression, GBDT, random forest, and SVM algorithms can be used to predict the risk of in-hospital death of ACS patients. Based on our findings, we recommend that clinicians focus on monitoring changes in NT-proBNP, LDH, and HDL cholesterol, as this may improve the clinical outcomes of ACS patients.


2021 ◽  
Vol 5 (1) ◽  
pp. 22
Author(s):  
Heena Tyagi ◽  
Emma Daulton ◽  
Ayman S. Bannaga ◽  
Ramesh P. Arasaradnam ◽  
James A. Covington

This study outlines the use of an electronic nose as a method for the detection of volatile organic compounds (VOCs) as biomarkers of bladder cancer. Here, an AlphaMOS FOX 4000 electronic nose was used for the analysis of urine samples from 15 bladder cancer patients and 41 non-cancerous patients. The FOX 4000 consists of 18 MOS sensors that were used to differentiate the two groups. The results obtained were analysed using MultiSens Analyzer and RStudio. The results showed a high separation, with sensitivity and specificity of 0.93 and 0.88, respectively, using a Sparse Logistic Regression and 0.93 and 0.76 using a Random Forest classifier. We conclude that the electronic nose shows potential for discriminating bladder cancer from non-cancer subjects using urine samples.
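
To make the reported sensitivity and specificity concrete, the sketch below computes both from confusion-matrix counts. The group sizes (15 and 41) come from the abstract, but the per-cell counts are an assumption: they are one set of whole numbers consistent with the rounded figures, not values reported by the authors, who may have derived their estimates differently (for example through cross-validation).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of cancer cases correctly flagged: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-cancer cases correctly cleared: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Group sizes from the abstract: 15 cancer, 41 non-cancer.
# The splits below are ASSUMED counts that match the rounded values.
sens = sensitivity(true_pos=14, false_neg=1)        # 14 / 15
spec_sparse_lr = specificity(true_neg=36, false_pos=5)   # 36 / 41
spec_forest = specificity(true_neg=31, false_pos=10)     # 31 / 41

print(round(sens, 2), round(spec_sparse_lr, 2), round(spec_forest, 2))
# prints: 0.93 0.88 0.76
```

With only 15 cancer cases, a single misclassified patient shifts sensitivity by about 0.07, which is worth keeping in mind when reading the headline figures.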


2021 ◽  
Author(s):  
Chris J. Kennedy ◽  
Dustin G. Mark ◽  
Jie Huang ◽  
Mark J. van der Laan ◽  
Alan E. Hubbard ◽  
...  

Background: Chest pain is the second leading reason for emergency department (ED) visits and is commonly identified as a leading driver of low-value health care. Accurate identification of patients at low risk of major adverse cardiac events (MACE) is important to improve resource allocation and reduce over-treatment. Objectives: We sought to assess machine learning (ML) methods and electronic health record (EHR) covariate collection for MACE prediction. We aimed to maximize the pool of low-risk patients that are accurately predicted to have less than 0.5% MACE risk and may be eligible for reduced testing. Population Studied: 116,764 adult patients presenting with chest pain in the ED and evaluated for potential acute coronary syndrome (ACS). 60-day MACE rate was 1.9%. Methods: We evaluated ML algorithms (lasso, splines, random forest, extreme gradient boosting, Bayesian additive regression trees) and SuperLearner stacked ensembling. We tuned ML hyperparameters through nested ensembling, and imputed missing values with generalized low-rank models (GLRM). We benchmarked performance to key biomarkers, validated clinical risk scores, decision trees, and logistic regression. We explained the models through variable importance ranking and accumulated local effect visualization. Results: The best discrimination (area under the precision-recall [PR-AUC] and receiver operating characteristic [ROC-AUC] curves) was provided by SuperLearner ensembling (0.148, 0.867), followed by random forest (0.146, 0.862). Logistic regression (0.120, 0.842) and decision trees (0.094, 0.805) exhibited worse discrimination, as did risk scores [HEART (0.064, 0.765), EDACS (0.046, 0.733)] and biomarkers [serum troponin level (0.064, 0.708), electrocardiography (0.047, 0.686)]. The ensemble's risk estimates were miscalibrated by 0.2 percentage points. The ensemble accurately identified 50% of patients to be below a 0.5% 60-day MACE risk threshold. 
The most important predictors were age, peak troponin, HEART score, EDACS score, and electrocardiogram. GLRM imputation achieved 90% reduction in root mean-squared error compared to median-mode imputation. Conclusion: Use of ML algorithms, combined with broad predictor sets, improved MACE risk prediction compared to simpler alternatives, while providing calibrated predictions and interpretability. Standard risk scores may neglect important health information available in other characteristics and combined in nuanced ways via ML.
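
The low-risk pool described above can be summarized by two numbers: the share of patients whose predicted risk falls below the 0.5% threshold, and the event rate actually observed in that group. The following sketch shows one way to compute them; the risks and outcomes are invented, and `low_risk_summary` is a hypothetical helper, not code from the study.

```python
def low_risk_summary(risks, outcomes, threshold=0.005):
    """Return (fraction of patients below threshold, observed event rate among them).

    risks: predicted probabilities of the event, one per patient.
    outcomes: 1 if the event occurred, else 0.
    """
    below = [o for r, o in zip(risks, outcomes) if r < threshold]
    fraction = len(below) / len(risks)
    observed_rate = sum(below) / len(below) if below else float("nan")
    return fraction, observed_rate


# Invented predicted 60-day risks and outcomes (NOT study data).
risks = [0.001, 0.002, 0.004, 0.003, 0.010, 0.020, 0.150, 0.300]
outcomes = [0, 0, 0, 0, 0, 0, 1, 1]

fraction, observed_rate = low_risk_summary(risks, outcomes)
print(f"{fraction:.0%} of patients below threshold, "
      f"observed event rate {observed_rate:.1%}")
```

Comparing the observed rate in the flagged group against the threshold is a simple check that the predictions are calibrated where it matters for reduced testing.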

