PEDF, a pleiotropic WTC-LI biomarker: Machine learning biomarker identification and validation

2021 ◽  
Vol 17 (7) ◽  
pp. e1009144
Author(s):  
George Crowley ◽  
James Kim ◽  
Sophia Kwon ◽  
Rachel Lam ◽  
David J. Prezant ◽  
...  

Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, there remains unaddressed multicollinearity in our serum cytokines, chemokines, and high-throughput platform datasets used to phenotype WTC-disease. To address this concern, we used automated, machine-learning, high-dimensional data pruning, and validated the identified biomarkers. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1, %Pred < lower limit of normal (LLN); n = 100) and controls (n = 127), all of whom had their biomarkers assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics, and feature selection was then performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping parent cohort via binary logistic regression with 5-fold cross validation. Random forests of metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) identified that the top 5% of biomarkers important to class separation included pigment epithelium-derived factor (PEDF), macrophage derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Models validated via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1, %Pred) binary logistic regression had an AUCROC of 0.90 (0.84–0.96). Decreased PEDF and MIP-4, and increased Apo-AII, were associated with increased odds of WTC-LI. Increased GRO and MCP-1, together with decreased MDC, were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort.
One biomarker—PEDF, an antiangiogenic agent—is a novel, predictive biomarker of particulate-matter-related lung disease. Other biomarkers—GRO, MCP-1, MDC, MIP-4—reveal immune cell involvement in WTC-LI pathogenesis. Findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.
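The 5-fold cross validation mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how cases and controls might be spread evenly across five folds; it is not the authors' pipeline, and the function name `stratified_kfold`, the seed, and the label vector (built from the parent-cohort sizes purely for illustration) are assumptions.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        # Shuffle the indices of one class, then deal them out round-robin.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Hypothetical labels: 100 cases (1) and 127 controls (0).
labels = [1] * 100 + [0] * 127
splits = list(stratified_kfold(labels))
for train, test in splits:
    cases_in_test = sum(labels[i] for i in test)
    print(len(train), len(test), cases_in_test)
```

Each observation appears in exactly one test fold, so a model fitted on the remaining four folds is always scored on data it has not seen.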

2018 ◽  
Vol 9 ◽  
Author(s):  
Florian Hotzy ◽  
Anastasia Theodoridou ◽  
Paul Hoff ◽  
Andres R. Schneeberger ◽  
Erich Seifritz ◽  
...  

2018 ◽  
Vol 3 (1) ◽  
pp. 18 ◽  
Author(s):  
Alfensi Faruk ◽  
Endro Setyo Cahyono

Machine learning (ML) is a field that focuses on analyzing data with statistical tools and learning processes in order to gain more knowledge from the data. The objective of this research was to apply one of the ML techniques to the low birth weight (LBW) data in Indonesia. The research addresses two ML tasks: prediction and classification. The binary logistic regression model was first applied to the training and test data. Then, the random forest approach was also applied to the data set. The results showed that binary logistic regression performed well for prediction but poorly for classification. By contrast, the random forest approach performed very well for both prediction and classification on the LBW data set.


2021 ◽  
Vol 7 (2) ◽  
pp. 164-185
Author(s):  
Haydée Maria Correia da Batista ◽  
Andrea Borges Paim ◽  
Brenda Santos Siqueira ◽  
Nelson Francisco Favilla Ebecken ◽  
Ana Claudia Dias

According to data from the last National Health Survey (PNS), conducted in 2013 by the Brazilian Institute of Geography and Statistics (IBGE) in partnership with the Ministry of Health, 7.6% of people aged 18 and over had received a diagnosis of depression. Based on this survey, the purpose of this study was to identify factors that may be relevant to a possible diagnosis of depression, using machine learning techniques. The binary logistic regression model was chosen as the machine learning technique, with forward (progressive) and backward (regressive) methods for selecting variables and a model built by the researcher, generating seven different models. Model performance was evaluated by comparing metrics such as Cox-Snell R² and Nagelkerke R², which gave remarkably close results. Based on these models, 37 explanatory variables were selected and applied to a new logistic regression model. The results showed that some variables significantly increased the chance of a positive diagnosis of depression, while others indicated a reduction in the chance of this diagnosis.
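For readers unfamiliar with the two fit metrics named above, the following sketch computes them from the log-likelihoods of an intercept-only model and a fitted model, using the standard textbook definitions. The function names and the example log-likelihood values are hypothetical and are not taken from the study.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox-Snell pseudo-R2: 1 - (L_null / L_model) ** (2 / n), on the log scale."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2: Cox-Snell rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical log-likelihoods for a sample of 200 observations.
ll_null, ll_model, n = -130.0, -110.0, 200
print(cox_snell_r2(ll_null, ll_model, n))
print(nagelkerke_r2(ll_null, ll_model, n))
```

Because the Cox-Snell value cannot reach 1 for a binary outcome, the Nagelkerke rescaling is always at least as large, which is one reason the two are usually reported together.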


2020 ◽  
Author(s):  
Sarah Al Youha ◽  
Mohammad Alkhamis ◽  
Sulaiman Al Mazeedi ◽  
Mohannad Al-Haddad ◽  
Mohammad H. Jamal ◽  
...  

Background: Demographic and clinical features of COVID-19 patients are critical components in shaping their symptomatic status. However, the relationship between patients' symptomatic status and their features is typically complicated and nonlinear. Methods: We explored important features that drive the symptomatic status of COVID-19 patients and revealed their interactions with other relevant factors. We used an extensive multi-algorithm machine learning (ML) pipeline and 68 demographic and clinical features to fit a predictive model to 3,995 patients in the State of Kuwait between February and June 2020. Our ML pipeline comprised five algorithms: logistic regression (LR), random forest (RF), support vector machine (SVM), gradient boosting (GBM), and extreme gradient boosting (XGM). Results: SVM outperformed all other algorithms (AUC = 0.77 and accuracy = 70.01%), while logistic regression had the lowest predictive power (AUC = 0.65 and accuracy = 66.14%). Our ML model identified C-reactive protein, respiratory rate, transmission dynamics, and other demographics as the most important predictors of COVID-19 symptomatic patients. By contrast, only demographic features were important predictors for asymptomatic patients. However, our ML model further revealed that the non-linear relationships between impaired renal function, other clinical biomarkers, and demographic features were critical in shaping the risk of being a symptomatic patient. Conclusions: We demonstrated remarkable predictive performance of our ML model over traditional statistical methods in identifying important clinical and demographic features of symptomatic vs. asymptomatic patients. Further application of our ML pipeline in the COVID-19 case definition and in guiding pharmaceutical and non-pharmaceutical interventions will help reduce the public health and economic implications of this devastating virus on local and global scales.
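The two metrics used above to compare algorithms, AUC and accuracy, can be computed as in the sketch below. It uses the rank-based (Mann-Whitney) definition of AUC; the function names, the 0.5 threshold, and the toy scores are assumptions for illustration and have no connection to the study's data.

```python
def auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(y_true, scores, threshold=0.5):
    """Share of observations classified correctly when scores >= threshold mean 'positive'."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(y_true, scores))
    return correct / len(y_true)

# Toy example: 1 = symptomatic, 0 = asymptomatic (hypothetical values).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, scores), accuracy(y_true, scores))
```

AUC depends only on how the scores rank patients, while accuracy also depends on the chosen threshold, so the two metrics can order models differently.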


2017 ◽  
Vol 2 (2) ◽  

Background: Gestational diabetes mellitus is a condition that affects many pregnancies, and ethnicity appears to be a risk factor. Data indicate that approximately 18% of Tamil women are diagnosed with gestational diabetes mellitus. Today, approximately 50,000 Tamils live in Switzerland. To date, there is no official tool available in Switzerland that considers the eating and physical activity habits of this migrant Tamil population while offering a quick overview of gestational diabetes mellitus and standard dietetics management procedures. The NutriGeD project, led by Bern University of Applied Sciences in Switzerland, aimed at closing this gap. The aim of the present study was to evaluate the implementation potential of the tools developed in the NutriGeD project for dietetic counseling before their wide-scale launch in Swiss hospitals, clinics, and private practices. Method: An online survey was developed and distributed to 50 recruited healthcare professionals working in the German-speaking region of Switzerland from October to December 2016 (31% response rate). The transcultural tools were sent to participants together with the link to the online survey. The evaluation outcome was analysed using binary logistic regression and cross tabulation analysis with IBM SPSS version 24.0, 2016. Results: 94% (N=47) of respondents believed that the transcultural tools had good potential for implementation in hospitals and private practices in Switzerland. A binary logistic regression analysis revealed that the age of participants was correlated (42.1%) with recommending the implementation potential of the transcultural tool. Participants aged 34-54 years were the group most likely to recommend the implementation potential of the transcultural tool, and this was found to be statistically significant (p=0.05).
74% (34 out of 50) of the respondents clearly acknowledged the need for transcultural competence knowledge in healthcare practices. 80% (N=40) of the respondents agreed that the information presented in the counseling display folder was important and helpful, while 60% (N=30) agreed that the contents were clinically applicable. 90% (N=45) of participants recommended the availability of the evaluated transcultural tools in healthcare settings in Switzerland. Conclusion: The availability in healthcare practice of the evaluated transcultural tools was greatly encouraged by the Swiss healthcare practitioners participating in the survey. While they confirmed the need for these transcultural tools, they gave feedback on minor adjustments to finalize the tools before their official launch in practice. The developed materials will be made available for clinical visits in both hospitals and private practices in Switzerland. The Migmapp© transcultural tool can serve as a good approach in assisting healthcare professionals in all fields, especially professionals who practice in areas associated with diet-related diseases or disorders associated with populations at risk.


2019 ◽  
Vol 34 (Spring 2019) ◽  
pp. 157-173
Author(s):  
Kashif Siddique ◽  
Rubeena Zakar ◽  
Ra’ana Malik ◽  
Naveeda Farhat ◽  
Farah Deeba

The aim of this study is to find the association between Intimate Partner Violence (IPV) and contraceptive use among married women in Pakistan. The analysis was conducted using cross-sectional secondary data from ever-married women of reproductive age (15-49 years) who responded to the domestic violence module (N = 3687) of the 2012-13 Pakistan Demographic and Health Survey. The association between contraceptive use (outcome variable) and IPV was measured by calculating unadjusted odds ratios and adjusted odds ratios with 95% confidence intervals using simple binary logistic regression and multivariable binary logistic regression. The results showed that, of the 3687 women, a majority (2126; 57.7%) were using contraceptives in their marital relationship. In total, 1154 (31.3%) women experienced emotional IPV, 1045 (28.3%) women experienced physical IPV, and 1402 (38%) women experienced both physical and emotional IPV together. All types of IPV were significantly associated with contraceptive use: women who reported emotional IPV (AOR 1.44; 95% CI 1.23, 1.67), physical IPV (AOR 1.41; 95% CI 1.20, 1.65), and both emotional and physical IPV together (AOR 1.49; 95% CI 1.24, 1.72) were more likely to use contraceptives. The study revealed that women living in violent relationships were more likely to use contraceptives in Pakistan. There is still a need for women's reproductive health services, and the government should take initiatives to promote family planning services, awareness, and access to contraceptive method options for women to reduce unintended or mistimed pregnancies that occur in violent relationships.
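As a worked illustration of the unadjusted odds ratio and 95% confidence interval described above, the sketch below uses the standard Wald interval on the log scale for a 2x2 table. The function name and the cell counts are hypothetical, not figures from the survey; adjusted odds ratios would instead come from the coefficients of a multivariable logistic regression.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts: rows = IPV reported (yes/no), columns = contraceptive use (yes/no).
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to an association that is statistically significant at roughly the 5% level.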


2019 ◽  
Author(s):  
Oskar Flygare ◽  
Jesper Enander ◽  
Erik Andersson ◽  
Brjánn Ljótsson ◽  
Volen Z Ivanov ◽  
...  

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test if it is possible to reliably predict remission from BDD in a sample of 88 individuals that had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower in subsequent follow-ups (68%, 66% and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work aims to develop an improved and robust machine learning model for predicting myocardial infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine-learning-based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill in the missing values in the data set, then applies principal component analysis (PCA) to extract the optimal features and enhance the performance of the classifiers. After PCA, the reduced features are partitioned into a training dataset and a testing dataset: 70% of the data are given as input to four popular classifiers (support vector machine, k-nearest neighbor, logistic regression, and decision tree) to train them, and the remaining 30% are used to evaluate the output of the machine learning model using performance metrics such as the confusion matrix, classifier accuracy, precision, sensitivity, F1-score, and AUC-ROC curve. Results: The outputs of the classifiers were evaluated using these performance measures, and we observed that logistic regression provides higher accuracy than the k-NN, SVM, and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, around 70%, compared with the k-NN and decision tree algorithms.
Conclusion: From the analysis of the results, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine-learning-based prediction models, and that it is capable of predicting the presence of acute myocardial infarction in humans using heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent heart disease.
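Two of the preprocessing steps described in the Methods above, mean imputation and the 70/30 partition, can be sketched as follows. PCA and the four classifiers are deliberately left out; the function names, the seed, and the toy rows are assumptions and do not reproduce the authors' code or the Framingham data.

```python
import random

def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(rows)))
    return [rows[i] for i in idx[:cut]], [rows[i] for i in idx[cut:]]

# Toy rows with missing entries (hypothetical values).
rows = [[1.0, None], [3.0, 4.0], [None, 6.0]]
print(impute_column_means(rows))

train, test = train_test_split([[float(i)] for i in range(10)])
print(len(train), len(test))
```

The order here follows the abstract, which imputes before splitting. A common alternative is to compute the column means on the training part only, so that no information from the test rows leaks into training.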


Author(s):  
Jeremy Freese

This article presents a method and program for identifying poorly fitting observations for maximum-likelihood regression models for categorical dependent variables. After estimating a model, the program leastlikely will list the observations that have the lowest predicted probabilities of observing the value of the outcome category that was actually observed. For example, when run after estimating a binary logistic regression model, leastlikely will list the observations with a positive outcome that had the lowest predicted probabilities of a positive outcome and the observations with a negative outcome that had the lowest predicted probabilities of a negative outcome. These can be considered the observations in which the outcome is most surprising given the values of the independent variables and the parameter estimates and, like observations with large residuals in ordinary least squares regression, may warrant individual inspection. Use of the program is illustrated with examples using binary and ordered logistic regression.
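The idea behind leastlikely can be illustrated outside the original program. The sketch below is a Python analogue for the binary case, not the program's actual implementation: for each observed outcome it lists the observations whose observed outcome had the lowest predicted probability. The function name `least_likely`, its arguments, and the toy values are hypothetical.

```python
def least_likely(outcomes, p_positive, n=5):
    """For each outcome class, return the n (index, probability) pairs whose
    observed outcome had the lowest predicted probability."""
    scored = [
        (i, y, p if y == 1 else 1.0 - p)  # probability of what was actually observed
        for i, (y, p) in enumerate(zip(outcomes, p_positive))
    ]
    result = {}
    for cls in (0, 1):
        rows = [(i, prob) for i, y, prob in scored if y == cls]
        result[cls] = sorted(rows, key=lambda r: r[1])[:n]
    return result

# Toy data: observed outcomes and predicted probabilities of a positive outcome.
outcomes = [1, 1, 0, 0, 1]
p_positive = [0.9, 0.2, 0.1, 0.7, 0.5]
surprising = least_likely(outcomes, p_positive, n=1)
print(surprising)
```

In this toy data, observation 1 is the most surprising positive (predicted probability 0.2 of a positive outcome) and observation 3 the most surprising negative.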

