logistic regression
Recently Published Documents





2023 ◽  
Vol 83 ◽  
S. Khwaja ◽  
S. I. Hussain ◽  
M. Zahid ◽  
Z. Aziz ◽  
A. Akram ◽  

Abstract This study determines the associations among serum lipid profiles, risk of cardiovascular disease, and persistent organic pollutants. Using gas chromatography, the intensity of toxic pollutant residues in serum samples of hypertensive patients was measured. The effects of the covariates, namely pesticides, age, systolic blood pressure, diastolic blood pressure, and lipid profile duration, were checked using a logistic regression model. Statistical computation was performed in SPSS 22.0. The P-values of the F-statistic for each lipid profile class are greater than 0.01 (1%); therefore, the null hypothesis cannot be rejected in any case. The estimated coefficients, their standard errors, Wald statistics, and odds ratios of the binary logistic regression model for the different lipid profile parameters indicate that as pesticides increase, the logit value of the lipid profile parameters changes by -0.46 to -0.246, except for LDL, which increases by 0.135. The study reports a significantly increased threat of cardiovascular disease with increased concentrations of toxic pollutants.
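To make the logit figures in this abstract concrete: a logistic regression coefficient is a change in log-odds, and exponentiating it gives an odds ratio. The sketch below is illustrative only. It assumes the quoted values (-0.46, -0.246, 0.135) are per-unit coefficients for pesticide concentration, which the abstract does not state explicitly, and the function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(coef: float) -> float:
    """Convert a logistic regression coefficient (log-odds) into an odds ratio."""
    return math.exp(coef)


# Values quoted in the abstract; reading them as per-unit coefficients
# is an assumption made for this illustration.
quoted = [("most negative", -0.46), ("least negative", -0.246), ("LDL", 0.135)]

for label, coef in quoted:
    print(f"{label}: coefficient {coef:+.3f} -> odds ratio {odds_ratio(coef):.3f}")
```

An odds ratio below 1 corresponds to a negative coefficient (lower odds per unit increase), and an odds ratio above 1 to a positive one.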

2023 ◽  
Vol 83 ◽  
R. Muzaffar ◽  
M. A. Khan ◽  
M. H. Mushtaq ◽  
M. Nasir ◽  
A. Khan ◽  

Abstract The present study was designed to evaluate the strength of association of raised plasma homocysteine concentration as a risk factor for coronary heart disease (CHD) independent of conventional risk factors. It was a case-control study conducted at the Punjab Institute of Cardiology, Lahore. A total of 210 subjects aged 25 to 60 years were recruited: 105 newly admitted CHD patients as cases and 105 age- and sex-matched healthy individuals with no history of CHD as controls. Fasting blood samples were obtained from cases and controls. Plasma homocysteine was analyzed by the fluorescence polarization immunoassay (FPIA) method on an automated immunoassay analyzer (Abbott IMX). Total cholesterol, triglyceride, and HDL cholesterol were analyzed using colorimetric kit methods. The concentration of LDL cholesterol was calculated using the Friedewald formula. The patients were also assessed for traditional risk factors such as age, sex, family history of CVD, hypertension, smoking, and physical activity, and were compared with control subjects. The collected data were entered in SPSS version 24 for analysis and interpretation. The mean ages of the control and case groups were 43.00 ± 8.42 years and 44.72 ± 8.59 years, respectively, with no statistically significant difference (p = 0.144). The mean plasma homocysteine was 22.33 ± 9.22 µmol/L in cases, whereas it was 12.59 ± 3.73 µmol/L in the control group. A highly significant difference was seen between the mean plasma homocysteine levels of cases and controls (p < 0.001). Simple logistic regression indicated a strong association of coronary heart disease with hyperhomocysteinemia (OR 7.45), which remained significantly associated with coronary heart disease in multivariate logistic regression (OR 7.10, 95% CI 3.12-12.83, p < 0.001).
The present study concludes that an elevated plasma homocysteine level is a risk factor for coronary heart disease independent of conventional risk factors and can be used as an indicator for predicting the future onset of CVD.
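The Friedewald formula mentioned in this abstract estimates LDL cholesterol from the other three lipid measurements as LDL = total cholesterol - HDL - triglycerides / 5. A minimal sketch follows, assuming all values are in mg/dL (the abstract does not give units) and using a hypothetical function name and made-up input values. The estimate is generally not considered reliable when triglycerides reach 400 mg/dL, so the sketch rejects such inputs.

```python
def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float) -> float:
    """Estimate LDL cholesterol in mg/dL as TC - HDL - TG / 5."""
    if triglycerides >= 400:
        # The approximation TG / 5 for VLDL cholesterol breaks down at high TG.
        raise ValueError("Friedewald estimate is unreliable for TG >= 400 mg/dL")
    return total_chol - hdl - triglycerides / 5.0


# Hypothetical lipid panel in mg/dL, not data from the study.
print(friedewald_ldl(total_chol=200.0, hdl=50.0, triglycerides=150.0))  # 120.0
```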

2022 ◽  
Vol 12 ◽  
Shaowu Lin ◽  
Yafei Wu ◽  
Ya Fang

Background: Depression is highly prevalent and is considered the most common psychiatric disorder in the home-based elderly, yet studies on forecasting depression risk in the elderly are still limited. To improve the accuracy of depression forecasting, machine learning (ML) approaches have been recommended in addition to more traditional regression approaches. Methods: A prospective study was conducted in home-based elderly Chinese, using baseline (2011) and follow-up (2013) data from the China Health and Retirement Longitudinal Study (CHARLS), a nationally representative cohort study. We compared four algorithms: three regression-based models (logistic regression, lasso, ridge) and one ML method (random forest). Model performance was assessed using repeated nested 10-fold cross-validation. As the main measure of predictive performance, we used the area under the receiver operating characteristic curve (AUC). Results: The mean AUCs of the four predictive models, logistic regression, lasso, ridge, and random forest, were 0.795, 0.794, 0.794, and 0.769, respectively. The main determinants were life satisfaction, self-reported memory, cognitive ability, ADL (activities of daily living) impairment, and CESD-10 score. Life satisfaction increased the odds of future depression by 128.6% (logistic), 13.8% (lasso), and 13.2% (ridge), and cognitive ability was the most important predictor in random forest. Conclusions: The three regression-based models and the one ML algorithm performed equally well in differentiating between a future depression case and a non-depression case in the home-based elderly. When choosing a model, however, other considerations, such as ease of operation, might in some instances lead to one model being prioritized over another.
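The AUC used here as the main performance measure can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition in plain Python. It is an illustration with made-up labels and scores, not the authors' evaluation code, and it does not reproduce the nested cross-validation they describe.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = depression at follow-up) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]
print(round(auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based computation is the usual choice for large samples.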

Diagnostics ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 212
Sunmin Park ◽  
Chaeyeon Kim ◽  
Xuangao Wu

Background: Insulin resistance is a common etiology of metabolic syndrome, but receiver operating characteristic (ROC) curve analysis shows a weak association in Koreans. Using a machine learning (ML) approach, we aimed to generate the best model for predicting insulin resistance in Korean adults aged > 40 of the Ansan/Ansung cohort. Methods: The demographic, anthropometric, biochemical, genetic, nutrient, and lifestyle variables of 8842 participants were included. The polygenetic risk scores (PRS) generated by a genome-wide association study were added to represent the genetic impact of insulin resistance. Participants were divided randomly into training (n = 7037) and test (n = 1769) sets. Potentially important features were selected from 99 features according to the highest area under the ROC curve (AUC), using seven different ML algorithms. The AUC target was ≥0.85 for the best prediction of insulin resistance with the lowest number of features. Results: The cutoff of insulin resistance defined with HOMA-IR was 2.31 using logistic regression before conducting ML. The XGBoost and logistic regression algorithms generated the highest AUC (0.86) of the prediction models using 99 features, while the random forest algorithm generated a model with an AUC of 0.82. These models showed high accuracy and k-fold values (>0.85). The prediction model containing 15 features had the highest AUC in the XGBoost and random forest algorithms. PRS was one of the 15 features. The final prediction models for insulin resistance were generated with the same nine features in the XGBoost (AUC = 0.86), random forest (AUC = 0.84), and artificial neural network (AUC = 0.86) algorithms. The models included fasting serum glucose, ALT, total bilirubin, and HDL concentrations, waist circumference, body fat, pulse, season of enrollment in the study, and gender.
Conclusion: Liver function, regular pulse checking, and seasonal variation, in addition to metabolic syndrome components, should be considered when predicting insulin resistance in Koreans aged over 40 years.
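For readers unfamiliar with the HOMA-IR index used to define insulin resistance in this abstract, the standard calculation is fasting glucose (mmol/L) multiplied by fasting insulin (µU/mL), divided by 22.5. The sketch below applies the 2.31 cutoff reported in the abstract. The function names, the input values, and the choice of a greater-than-or-equal comparison are assumptions for illustration; the abstract does not say how values exactly at the cutoff were handled.

```python
HOMA_IR_CUTOFF = 2.31  # cutoff reported in the abstract


def homa_ir(glucose_mmol_l: float, insulin_uu_ml: float) -> float:
    """HOMA-IR = fasting glucose (mmol/L) * fasting insulin (uU/mL) / 22.5."""
    return glucose_mmol_l * insulin_uu_ml / 22.5


def is_insulin_resistant(glucose_mmol_l: float, insulin_uu_ml: float) -> bool:
    """Label a participant using the cutoff; '>=' is an assumed convention."""
    return homa_ir(glucose_mmol_l, insulin_uu_ml) >= HOMA_IR_CUTOFF


# Hypothetical fasting values, not data from the cohort.
print(homa_ir(5.0, 9.0))                # 2.0
print(is_insulin_resistant(5.0, 9.0))   # False
print(is_insulin_resistant(5.0, 12.0))  # True
```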

Cancers ◽  
2022 ◽  
Vol 14 (2) ◽  
pp. 439
Anetta Sulewska ◽  
Jacek Niklinski ◽  
Radoslaw Charkiewicz ◽  
Piotr Karabowicz ◽  
Przemyslaw Biecek ◽  

LncRNAs have arisen as new players in the world of non-coding RNA. Disrupted expression of these molecules can be tightly linked to the onset, promotion, and progression of cancer. The present study estimated the usefulness of 14 lncRNAs (HAGLR, ADAMTS9-AS2, LINC00261, MCM3AP-AS1, TP53TG1, C14orf132, LINC00968, LINC00312, TP73-AS1, LOC344887, LINC00673, SOX2-OT, AFAP1-AS1, LOC730101) for early detection of non-small-cell lung cancer (NSCLC). Total RNA was isolated from paired fresh-frozen cancerous and noncancerous lung tissue from 92 NSCLC patients diagnosed with either adenocarcinoma (LUAD) or lung squamous cell carcinoma (LUSC). The expression level of the lncRNAs was evaluated by quantitative real-time PCR (qPCR). Based on Ct and delta Ct values, logistic regression and gradient boosting decision tree classifiers were built. The latter is a novel, advanced machine learning algorithm with great potential in medical science. The established predictive models showed that the set of 14 lncRNAs accurately discriminates cancerous from noncancerous lung tissues (AUC value of 0.98 ± 0.01) and NSCLC subtypes (AUC value of 0.84 ± 0.09), although the expression of a few molecules was not statistically significant (SOX2-OT, AFAP1-AS1, and LOC730101 for tumor vs. normal tissue; and TP53TG1, C14orf132, LINC00968, and LOC730101 for LUAD vs. LUSC). However, for subtype discrimination, the simplified logistic regression model based on four variables (delta Ct AFAP1-AS1, Ct SOX2-OT, Ct LINC00261, and delta Ct LINC00673) had even stronger diagnostic potential than the original one (AUC value of 0.88 ± 0.07). Our results demonstrate that the 14-lncRNA signature can be an auxiliary tool to endorse and complement the histological diagnosis of non-small-cell lung cancer.

Pier Paolo Mattogno ◽  
Valerio M. Caccavella ◽  
Martina Giordano ◽  
Quintino G. D'Alessandris ◽  
Sabrina Chiloiro ◽  

Abstract Purpose: Transsphenoidal surgery (TSS) for pituitary adenomas can be complicated by intraoperative cerebrospinal fluid (CSF) leakage (IOL). IOL significantly affects the course of surgery, predisposing to the development of postoperative CSF leakage, a major source of morbidity and mortality in the postoperative period. The authors trained and internally validated a Random Forest (RF) prediction model to preoperatively identify patients at high risk for IOL. A local interpretable model-agnostic explanations (LIME) algorithm was employed to elucidate the main drivers behind each machine learning (ML) model prediction. Methods: The data of 210 patients who underwent TSS were collected; first, risk factors for IOL were identified via conventional statistical methods (multivariable logistic regression). Then, the authors trained, optimized, and audited an RF prediction model. Results: IOL was reported in 45 patients (21.5%). The recursive feature selection algorithm identified the following variables as the most significant determinants of IOL: Knosp's grade, sellar Hardy's grade, suprasellar Hardy's grade, tumor diameter (on X, Y, and Z axes), intercarotid distance, and secreting status (nonfunctioning and growth hormone [GH] secreting). Leveraging the predictive values of these variables, the RF prediction model achieved an area under the curve (AUC) of 0.83 (95% confidence interval [CI]: 0.78; 0.86), significantly outperforming the multivariable logistic regression model (AUC = 0.63). Conclusion: An RF model that reliably identifies patients at risk for IOL was successfully trained and internally validated. ML-based prediction models can predict events that were previously judged nearly unpredictable; their deployment in clinical practice may result in improved patient care and reduced postoperative morbidity and healthcare costs.

2022 ◽  
Vol 22 (1) ◽  
Binyam Fekadu ◽  
Ismael Ali ◽  
Zergu Tafesse ◽  
Hailemariam Segni

Abstract Background: Essential newborn care (ENC) is a package of interventions that should be provided to every newborn baby, regardless of body size or place of delivery, immediately after birth and continued for at least the seven days that follow. Even though Ethiopia has endorsed the implementation of ENC, as in many other countries, implementation has been challenging. This study was conducted to measure the level of essential newborn care practice and identify health-facility-level attributes for consistent delivery of ENC services by health care providers. Methods: This study employed a retrospective cross-sectional design in 425 facilities. Descriptive statistics were computed and presented in tables. Binary logistic regression was employed to assess the statistical association between the outcome variable and the independent variables. All variables with p < 0.2 in the bivariate analysis were identified as candidate variables. Then, multiple logistic regression analysis was performed using the candidate variables to determine statistically significant predictors of the consistent delivery of ENC, adjusting for possible confounders. Results: A total of 273 facilities (64.2%) demonstrated consistent delivery of ENC. Five factors were included in the model: availability of essential obstetric drugs in delivery rooms, high community score card (CSC) performance, availability of maternity waiting homes, consistent partograph use, and availability of women-friendly delivery services. The strongest predictor of consistent delivery of essential newborn care (CD-ENC) was consistent partograph use, with an adjusted odds ratio of 2.66 (AOR = 2.66, 95% CI: 1.71, 4.13). Similarly, providing women-friendly services was strongly associated with an increased likelihood of exhibiting CD-ENC. Furthermore, facilities with essential obstetric drugs had 1.88 times higher odds (AOR = 1.88, 95% CI: 1.15, 3.08) of exhibiting consistent delivery of ENC.
Conclusion: The delivery of essential newborn care depends on both health provider and facility manager actions and on the availability of platforms to streamline relationships between clients and health facility management.
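As background for the odds ratios and 95% confidence intervals quoted in this abstract, the sketch below computes a crude (unadjusted) odds ratio with a Wald interval from a 2x2 table. This is a simplification: the study reports adjusted odds ratios from a multiple logistic regression, which this calculation does not reproduce. The counts and the function name are hypothetical.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: facilities with and without consistent partograph use,
# split by whether they showed consistent delivery of ENC.
estimate, lower, upper = crude_odds_ratio(a=40, b=20, c=20, d=40)
print(f"OR = {estimate:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

An interval that excludes 1, as in the abstract's reported intervals, is what is usually described as a statistically significant association at the 5% level.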

Gillian M. Maher ◽  
Ali S. Khashan ◽  
Fergus P. McCarthy

Abstract Purpose: To examine the association between mode of delivery (in particular caesarean section) and behavioural outcomes in offspring at six time-points between age 3 and 17 years. Methods: Similar to previous work examining the association between mode of delivery and behavioural outcomes in offspring at age 7, we used maternal-reported data from the Millennium Cohort Study. Data on mode of delivery were collected when children were 9 months old and categorised as spontaneous vaginal delivery, assisted vaginal delivery, induced vaginal delivery, emergency caesarean section, planned caesarean section, and caesarean section after induction of labor. Data on behavioural outcomes were collected at ages 3, 5, 7, 11, 14, and 17 years using the Strengths and Difficulties Questionnaire (SDQ). Crude and adjusted logistic regression examined the relationship between mode of delivery and behavioural difficulties, using validated SDQ cut-off points (total SDQ ≥ 17, emotional ≥ 5, conduct ≥ 4, hyperactivity ≥ 7, peer problems ≥ 4, and prosocial behaviour ≤ 4). Multilevel models with linear splines examined the association between mode of delivery and repeated measures of SDQ. Results: There were 18,213 singleton mother–child pairs included at baseline; 13,600 at age 3; 13,831 at age 5; 12,687 at age 7; 11,055 at age 11; 10,745 at age 14; and 8839 at age 17. Adjusted logistic regression suggested few associations between mode of delivery and behavioural outcomes at ages 3, 5, 11, 14, and 17 years using validated SDQ cut-off points. After correction for multiple testing, only the protective association between planned caesarean section and conduct difficulties at age 5 years (OR 0.63, 95% CI 0.46, 0.85) and the positive association between caesarean section after induction and emotional difficulties at age 11 years (OR 1.57, 95% CI 1.19, 2.07) remained statistically significant. Multilevel modelling suggested mean SDQ scores were similar in each mode of delivery group at each time point.
Conclusions: Results of this study indicate that mode of delivery is unlikely to have a major impact on behavioural outcomes.

2022 ◽  
Vol 11 (1) ◽  
pp. 325-337
Natalia Gil ◽  
Marcelo Albuquerque ◽  
Gabriela de

The article aims to develop a machine-learning algorithm that can predict students' graduation in the Industrial Engineering course at the Federal University of Amazonas based on their performance data. The methodology uses a dataset of 364 students admitted between 2007 and 2019, considering characteristics that can directly or indirectly affect each student's graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts, and course terminations. Data treatment involved manually removing several characteristics that did not add value to the output of the algorithm, resulting in a dataset of 2184 instances. Logistic regression, MLP, and XGBoost models were developed and compared; each predicts a binary output of graduation or non-graduation for each student, using 70% of the dataset for training and 30% for testing. It was thus possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy.
