Development of nomograms to assess the risk of clinical outcome

2019 ◽  
Vol 21 (2) ◽  
pp. 114-121
Author(s):  
A A Korneenkov ◽  
S G Kuzmin ◽  
V B Dergachev ◽  
D N Borisov

A methodology is presented for developing nomograms to assess and stratify the risk of a clinical outcome, based on a created virtual data set and using the R software environment. The virtual data set included numerical and factor input variables (variable types correspond to the R software documentation) and an outcome. For quantitative variables, descriptive statistics were calculated at each level of the outcome variable; for factor variables, mosaic diagrams were constructed. A logistic regression model was used to describe the association of the input variables with the outcome. A bootstrap method was applied to validate the model and evaluate its performance. The calculated validity indicators showed an acceptable discriminatory ability of the predictive model. The statistical calibration demonstrated that the model's calibration curve was close to the ideal calibration curve. Based on the logistic regression coefficients, a nomogram was constructed and used to calculate the risk of a specific outcome for each subject (patient). It is shown that the presented technique makes it possible to stratify patients effectively by the risk of an adverse outcome and to adjust diagnosis and treatment tactics accordingly. A nomogram greatly simplifies risk assessment and can be used in paper form as a supplement to the patient examination protocol. The article contains R code with explanations.
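As a rough illustration of how logistic regression coefficients can be turned into nomogram points, the sketch below uses the common construction in which the predictor with the widest effect range defines a 0 to 100 point scale. It is not the article's R code. The intercept, coefficients, predictor names and ranges are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical fitted model, for illustration only (not taken from the article).
INTERCEPT = -4.0
COEFS = {"age": 0.05, "marker": 0.8}
RANGES = {"age": (20.0, 80.0), "marker": (0.0, 3.0)}


def reference_value(name):
    """Predictor value that contributes the least risk (scores 0 points)."""
    low, high = RANGES[name]
    return low if COEFS[name] >= 0 else high


def effect_span(name):
    """Change in the linear predictor across the predictor's full range."""
    low, high = RANGES[name]
    return abs(COEFS[name]) * (high - low)


# The predictor with the widest effect span is mapped onto 0-100 points.
MAX_SPAN = max(effect_span(name) for name in COEFS)


def points(name, value):
    """Nomogram points contributed by one predictor value."""
    return 100.0 * COEFS[name] * (value - reference_value(name)) / MAX_SPAN


def risk_from_points(total_points):
    """Convert total points back to a predicted probability."""
    baseline = INTERCEPT + sum(COEFS[n] * reference_value(n) for n in COEFS)
    linear_predictor = baseline + total_points * MAX_SPAN / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def predict_risk(patient):
    """Return (total points, predicted risk) for a dict of predictor values."""
    total = sum(points(name, patient[name]) for name in COEFS)
    return total, risk_from_points(total)


if __name__ == "__main__":
    total, risk = predict_risk({"age": 50.0, "marker": 1.5})
    print(f"total points = {total:.1f}, predicted risk = {risk:.3f}")
```

Reading the risk from total points gives the same value as evaluating the logistic model directly, which is what allows the nomogram to be used on paper.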

2018 ◽  
Vol 41 (1) ◽  
pp. 96-112 ◽  
Author(s):  
Evy Rombaut ◽  
Marie-Anne Guerry

Purpose This paper aims to question whether the available data in the human resources (HR) system could result in reliable turnover predictions without supplementary survey information. Design/methodology/approach A decision tree approach and a logistic regression model for analysing turnover were introduced. The methodology is illustrated on a real-life data set of a Belgian branch of a private company. The model performance is evaluated by the area under the ROC curve (AUC) measure. Findings It was concluded that data in the personnel system indeed lead to valuable predictions of turnover. Practical implications The presented approach brings determinants of voluntary turnover to the surface. The results yield useful information for HR departments. Where the logistic regression results in a turnover probability at the individual level, the decision tree makes it possible to ascertain employee groups that are at risk for turnover. With the data set-based approach, each company can, immediately, ascertain their own turnover risk. Originality/value The study of a data-driven approach for turnover investigation has not been done so far.
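The AUC measure mentioned above can be sketched as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `auc` and the example labels and scores below are illustrative assumptions, not data or code from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 outcomes (1 = event, e.g. the employee left).
    scores: predicted probabilities or scores, in the same order as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up example: four cases, two with the event.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a rank-based formulation would be preferable for large data sets.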


2019 ◽  
Vol 115 (3/4) ◽  
Author(s):  
Douw G. Breed ◽  
Tanja Verster

Segmentation of data for the purpose of enhancing predictive modelling is a well-established practice in the banking industry. Unsupervised and supervised approaches are the two main types of segmentation and examples of improved performance of predictive models exist for both approaches. However, both focus on a single aspect – either target separation or independent variable distribution – and combining them may deliver better results. This combination approach is called semi-supervised segmentation. Our objective was to explore four new semi-supervised segmentation techniques that may offer alternative strengths. We applied these techniques to six data sets from different domains, and compared the model performance achieved. The original semi-supervised segmentation technique was the best for two of the data sets (as measured by the improvement in validation set Gini), but others outperformed for the other four data sets. Significance: We propose four newly developed semi-supervised segmentation techniques that can be used as additional tools for segmenting data before fitting a logistic regression. In all comparisons, using semi-supervised segmentation before fitting a logistic regression improved the modelling performance (as measured by the Gini coefficient on the validation data set) compared to using unsegmented logistic regression.


Author(s):  
Hamid Ghorbani

While methods for detecting outliers are frequently applied by statisticians when analyzing univariate data, identifying outliers in multivariate data poses challenges that univariate data do not. In this paper, after a short review of some tools for univariate outlier detection, the Mahalanobis distance, a well-known multivariate statistical distance, and its ability to detect multivariate outliers are discussed. As an application, the univariate and multivariate outliers of a real data set are detected using the R software environment for statistical computing.
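A minimal sketch of Mahalanobis-distance outlier flagging for bivariate data is given below. It assumes the classical sample mean and covariance and a chi-square cutoff with 2 degrees of freedom. The paper works in R and may use different estimators or cutoffs, so the function names, the 0.975 probability level and the example data are assumptions.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Sample covariance matrix entries (denominator n - 1).
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] S^-1 [dx dy]^T with the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, prob=0.975):
    """Indices whose squared distance exceeds the chi-square(2 df) quantile."""
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - prob).
    cutoff = -2.0 * math.log(1.0 - prob)
    return [i for i, d2 in enumerate(mahalanobis_sq_2d(points)) if d2 > cutoff]


if __name__ == "__main__":
    # Made-up data: nine points close to the line y = x plus one off-pattern point.
    data = [(1, 1.0), (2, 2.1), (3, 2.9), (4, 4.2), (5, 4.8),
            (6, 6.1), (7, 7.0), (8, 7.9), (9, 9.1), (5, 9.0)]
    print(flag_outliers(data))
```

The last point is unremarkable on either axis alone but breaks the joint pattern, which is the situation the multivariate distance is meant to catch.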


2021 ◽  
Vol 9 (Suppl 1) ◽  
pp. e001290
Author(s):  
Jenine K Harris

Family medicine has traditionally prioritised patient care over research. However, recent recommendations to strengthen family medicine include calls to focus more on research including improving research methods used in the field. Binary logistic regression is one method frequently used in family medicine research to classify, explain or predict the values of some characteristic, behaviour or outcome. The binary logistic regression model relies on assumptions including independent observations, no perfect multicollinearity and linearity. The model produces ORs, which suggest increased, decreased or no change in odds of being in one category of the outcome with an increase in the value of the predictor. Model significance quantifies whether the model is better than the baseline value (ie, the percentage of people with the outcome) at explaining or predicting whether the observed cases in the data set have the outcome. One model fit measure is the count-R², which is the percentage of observations where the model correctly predicted the outcome variable value. Related to the count-R² are model sensitivity (the percentage of those with the outcome who were correctly predicted to have the outcome) and specificity (the percentage of those without the outcome who were correctly predicted to not have the outcome). Complete model reporting for binary logistic regression includes descriptive statistics, a statement on whether assumptions were checked and met, ORs and CIs for each predictor, overall model significance and overall model fit.
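The three fit measures defined above can be computed directly from observed and predicted outcomes, as in this short sketch. The function name and the example vectors are illustrative assumptions.

```python
def classification_summary(observed, predicted):
    """Count-R2, sensitivity and specificity for 0/1 observed and predicted outcomes."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # outcome present, predicted present
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # outcome present, predicted absent
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # outcome absent, predicted absent
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # outcome absent, predicted present

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the outcome")

    return {
        "count_r2": (tp + tn) / len(pairs),   # share of correct predictions
        "sensitivity": tp / (tp + fn),        # correct among those with the outcome
        "specificity": tn / (tn + fp),        # correct among those without the outcome
    }


if __name__ == "__main__":
    observed = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 0, 1, 0, 1]
    print(classification_summary(observed, predicted))
```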


Vehicle crashes occur because of numerous factors. They lead to loss of life and permanent disability, and they impose financial costs on both individuals and the nation. According to road accident statistics, a total of 464,910 road accidents are reported in India every year, claiming 147,913 lives and causing injuries to 470,975 persons. In this work, the UK data set sourced from Kaggle is used. For the study, 17 attributes and 35k records from the year 2015 are considered. The data set is imbalanced, so an over-sampling technique is used to balance it. Random Forest, Decision Tree, Logistic Regression, and Gradient Naïve Bayes algorithms are used to predict the severity of accidents. The models are evaluated with performance measures such as Accuracy, Precision, Recall, and F1-Score. On Accuracy, Precision, and F1-Score, Random Forest yielded the best result. On Recall, the best result was yielded by Random Forest for the Fatal class, Decision Tree for the Serious class, and Logistic Regression for the Slight class.
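The abstract does not say which over-sampling technique was used. As one simple possibility, the sketch below shows random over-sampling with replacement, which duplicates minority-class records until every class is as large as the biggest one. The function name, seed and example records are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size.

    This is plain random over-sampling with replacement, an assumption made for
    illustration; the original study may have used a different method.
    """
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


if __name__ == "__main__":
    # Made-up records with an imbalanced severity label.
    rows = [[1], [2], [3], [4], [5], [6]]
    labels = ["Slight", "Slight", "Slight", "Slight", "Serious", "Fatal"]
    _, balanced = random_oversample(rows, labels)
    print(Counter(balanced))
```

Over-sampling is normally applied to the training split only, so that duplicated records do not leak into the evaluation data.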


2019 ◽  
Vol 34 (Spring 2019) ◽  
pp. 157-173
Author(s):  
Kashif Siddique ◽  
Rubeena Zakar ◽  
Ra’ana Malik ◽  
Naveeda Farhat ◽  
Farah Deeba

The aim of this study is to find the association between Intimate Partner Violence (IPV) and contraceptive use among married women in Pakistan. The analysis was conducted using cross-sectional secondary data from all married women of reproductive age (15-49 years) who responded to the domestic violence module (N = 3687) of the 2012-13 Pakistan Demographic and Health Survey. The association between contraceptive use (outcome variable) and IPV was measured by calculating unadjusted and adjusted odds ratios with 95% confidence intervals using simple binary logistic regression and multivariable binary logistic regression. The results showed that, of the 3687 women, the majority, 2126 (57.7%), were using contraceptives in their marital relationship. In total, 1154 (31.3%) women experienced emotional IPV, 1045 (28.3%) experienced physical IPV and 1402 (38%) experienced both physical and emotional IPV. All types of IPV were significantly associated with contraceptive use: women who reported emotional IPV (AOR 1.44; 95% CI 1.23, 1.67), physical IPV (AOR 1.41; 95% CI 1.20, 1.65) and both emotional and physical IPV (AOR 1.49; 95% CI 1.24, 1.72) were more likely to use contraceptives. The study revealed that women living in violent relationships in Pakistan were more likely to use contraceptives. There is still a need for women's reproductive health services, and the government should take initiatives to promote family planning services, awareness and access to contraceptive method options for women, in order to reduce unintended or mistimed pregnancies that occur in violent relationships.
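For a single binary exposure, an unadjusted odds ratio with a Wald 95% confidence interval can be computed from a 2x2 table, as sketched below. With one binary predictor this matches the odds ratio from a simple logistic regression, but the adjusted odds ratios reported above require a multivariable model and cannot be reproduced this way. The counts in the example are hypothetical and are not taken from the survey.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    estimate, lower, upper = odds_ratio_ci(20, 10, 10, 20)
    print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```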


2019 ◽  
Vol 16 (2) ◽  
pp. 166-172 ◽  
Author(s):  
Linghui Deng ◽  
Changyi Wang ◽  
Shi Qiu ◽  
Haiyang Bian ◽  
Lu Wang ◽  
...  

Background: Hydration status significantly affects the clinical outcome of acute ischemic stroke (AIS) patients. Blood urea nitrogen-to-creatinine ratio (BUN/Cr) is a biomarker of hydration status. However, it is not known whether there is a relationship between BUN/Cr and three-month outcome as assessed by the modified Rankin Scale (mRS) score in AIS patients. Methods: AIS patients admitted to West China Hospital from 2012 to 2016 were prospectively and consecutively enrolled and baseline data were collected. Poor clinical outcome was defined as three-month mRS > 2. Univariate and multivariate logistic regression analyses were performed to determine the relationship between BUN/Cr and three-month outcome. Confounding factors were identified by univariate analysis. Stratified logistic regression analysis was performed to identify effect modifiers. Results: A total of 1738 patients were included in the study. BUN/Cr showed a positive correlation with the three-month outcome (OR 1.02, 95% CI 1.00-1.03, p=0.04). However, after adjusting for potential confounders, the correlation was no longer significant (p=0.95). An interaction between BUN/Cr and high-density lipoprotein (HDL) was discovered (p=0.03), with a significant correlation between BUN/Cr and three-month outcome in patients with higher HDL (OR 1.03, 95% CI 1.00-1.07, p=0.04). Conclusion: Elevated BUN/Cr is associated with poor three-month outcome in AIS patients with high HDL levels.


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work develops an improved and robust machine learning model for predicting myocardial infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study data set for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill in the missing values in the data set, then applies principal component analysis (PCA) to extract the optimal features and enhance the performance of the classifiers. After PCA, the reduced feature set is partitioned into training and testing sets: 70% of the data is used to train four widely used classifiers (support vector machine, k-nearest neighbor, logistic regression and decision tree), and the remaining 30% is used to evaluate the output of the models using the confusion matrix, classifier accuracy, precision, sensitivity, F1-score and AUC-ROC curve. Results: The output of the classifiers was evaluated using these performance measures. Logistic regression provided higher accuracy than the k-NN, SVM and decision tree classifiers, and PCA performed well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, around 70%, compared with the k-NN and decision tree algorithms. Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning based prediction models. It is capable of predicting the presence of acute myocardial infarction in humans using heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent heart disease.


2021 ◽  
Vol 12 ◽  
pp. 215013272110343
Author(s):  
Sewitemariam Desalegn Andarge ◽  
Abriham Sheferaw Areba ◽  
Robel Hussen Kabthymer ◽  
Miheret Tesfu Legesse ◽  
Girum Gebremeskel Kanno

Background Indoor air pollution from different fuel types has been linked with different adverse pregnancy outcomes. The study aimed to assess the link between indoor air pollution from different fuel types and anemia during pregnancy in Ethiopia. Method We used secondary data from the 2016 Ethiopian Demographic and Health Survey. The anemia status of the pregnant women was the dichotomous outcome variable, and the type of fuel used in the house was classified as high, medium, or low polluting. Logistic regression was employed to determine the association between the exposure and outcome variables. Adjusted odds ratios were calculated with 95% confidence intervals. Result The proportion of anemia among the low, medium, and high polluting fuel type users was 13.6%, 46%, and 40.9%, respectively. In the multivariable logistic regression analysis, the use of either kerosene or charcoal fuel types (AOR 4.6; 95% CI: 1.41-18.35) and being in the third trimester (AOR 1.72; 95% CI: 1.12-2.64) were significant factors associated with the anemia status of pregnant women in Ethiopia. Conclusion According to our findings, the use of either kerosene or charcoal was associated with anemia status during pregnancy in Ethiopia. An urgent intervention is needed to reduce the indoor air pollution that is associated with adverse pregnancy outcomes such as anemia.


2021 ◽  
Vol 11 (2) ◽  
pp. 20 ◽  
Author(s):  
Bright Opoku Ahinkorah ◽  
Richard Gyan Aboagye ◽  
Francis Arthur-Holmes ◽  
Abdul-Aziz Seidu ◽  
James Boadu Frimpong ◽  
...  

(1) Background: Psychological problems of adolescents have become a global health and safety concern. Empirical evidence has shown that adolescents experience diverse mental health conditions (e.g., anxiety, depression, and emotional disorders). However, anxiety-induced sleep disturbance among in-school adolescents has received less research attention, particularly in low- and middle-income countries. This study's central focus was to examine factors associated with anxiety-induced sleep disturbance among in-school adolescents in Ghana. (2) Methods: Analysis was performed using the 2012 Global School-based Health Survey (GSHS). A sample of 1342 in-school adolescents was included in the analysis. The outcome variable was anxiety-induced sleep disturbance reported during the past 12 months. Frequencies, percentages, chi-square, and multivariable logistic regression analyses were conducted. Results from the multivariable logistic regression analysis were presented as crude and adjusted odds ratios with 95% confidence intervals (CIs), with statistical significance declared at p < 0.05. (3) Results: Adolescents who went hungry were more likely to report anxiety-induced sleep disturbance compared to their counterparts who did not report hunger (aOR = 1.68, CI = 1.10, 2.57). The odds of anxiety-induced sleep disturbance were higher among adolescents who felt lonely compared to those who never felt lonely (aOR = 2.82, CI = 1.98, 4.01). Adolescents who had sustained an injury were more likely to have anxiety-induced sleep disturbance (aOR = 1.49, CI = 1.03, 2.14) compared to those who had no injury. Compared to adolescents who never had suicidal ideations, those who reported experiencing suicidal ideations had higher odds of anxiety-induced sleep disturbance (aOR = 1.68, CI = 1.05, 2.71). (4) Conclusions: Anxiety-induced sleep disturbance among in-school adolescents was significantly influenced by psychosocial determinants such as hunger, loneliness, injury, and suicidal ideation in this study. The findings can help design appropriate interventions through effective strategies (e.g., early school-based screening, cognitive-behavioral therapy, face-to-face counseling services) to reduce psychosocial problems among in-school adolescents in Ghana.

