Bayesian Network as a Decision Tool for Predicting ALS Disease

2021 ◽  
Vol 11 (2) ◽  
pp. 150
Author(s):  
Hasan Aykut Karaboga ◽  
Aslihan Gunel ◽  
Senay Vural Korkut ◽  
Ibrahim Demir ◽  
Resit Celik

Clinical diagnosis of amyotrophic lateral sclerosis (ALS) is difficult in the early period, but blood tests are less time-consuming and lower-cost than other diagnostic methods. ALS researchers have used machine learning methods to predict the genetic architecture of the disease. In this study we take advantage of Bayesian networks and machine learning methods to identify ALS patients from blood plasma protein levels and independent personal features. According to the comparison results, Bayesian networks produced the best results, with an accuracy of 0.887, an area under the curve (AUC) of 0.970, and the best scores on the other comparison metrics. We confirmed that sex and age are effective variables for ALS. In addition, we found that the probability of onset involvement in ALS patients is very high. A person’s other chronic or neurological diseases are also associated with ALS. Finally, we confirmed that the Parkin level may also have an effect on ALS: while this protein occurs at very low levels in Parkinson’s patients, it is higher in ALS patients than in all control groups.
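The classification workflow described above can be sketched in a few lines. This is a minimal illustration only: Gaussian naive Bayes stands in for the study's Bayesian network, and the features (age, sex, a plasma protein level) and data are synthetic assumptions, not the study's dataset.

```python
# Hedged sketch: a simple Bayesian classifier (Gaussian naive Bayes) standing in
# for the paper's Bayesian network; all data below are synthetic.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
n = 400
# Hypothetical features: age, sex, and one plasma protein level (e.g. Parkin)
age = rng.normal(60, 10, n)
sex = rng.integers(0, 2, n)
parkin = rng.normal(1.0, 0.3, n)
# Synthetic label loosely tied to the features
logit = 0.05 * (age - 60) + 0.5 * sex + 2.0 * (parkin - 1.0)
y = (logit + rng.normal(0, 1, n) > 0).astype(int)
X = np.column_stack([age, sex, parkin])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"accuracy={acc:.3f}  AUC={auc:.3f}")
```

Accuracy and AUC computed this way on a held-out split are the comparison metrics the abstract reports.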

2020 ◽  
Author(s):  
Peer Nowack ◽  
Lev Konstantinovskiy ◽  
Hannah Gardiner ◽  
John Cant

Abstract. Air pollution is a key public health issue in urban areas worldwide. The development of low-cost air pollution sensors is consequently a major research priority. However, low-cost sensors often fail to attain sufficient measurement performance compared with state-of-the-art measurement stations, and they typically require calibration procedures in expensive laboratory settings. As a result, there has been much debate about calibration techniques that could make their performance more reliable, while also developing calibration procedures that can be carried out without access to advanced laboratories. One repeatedly proposed strategy is low-cost sensor calibration through co-location with public measurement stations. The idea is that, using a regression function, the low-cost sensor signals can be calibrated against the station's reference signal, so that the sensors can then be deployed separately with performance similar to that of the original stations. Here we test the idea of using machine learning algorithms for such regression tasks, using hourly averaged co-location data for nitrogen dioxide (NO2) and particulate matter of particle sizes smaller than 10 μm (PM10) at three different locations in the urban area of London, UK. Specifically, we compare the performance of Ridge regression, a linear statistical learning algorithm, to two non-linear algorithms in the form of Random Forest (RF) regression and Gaussian Process regression (GPR). We further benchmark the performance of all three machine learning methods against the more common Multiple Linear Regression (MLR). We obtain very good out-of-sample R2 scores (coefficient of determination) > 0.7, frequently exceeding 0.8, for the machine-learning-calibrated low-cost sensors. In contrast, the performance of MLR is more dependent on random variations in the sensor hardware and co-located signals, and it is also more sensitive to the length of the co-location period. 
We find that, subject to certain conditions, GPR is typically the best-performing method in our calibration setting, followed by Ridge regression and RF regression. However, we also highlight several key limitations of the machine learning methods, which will be crucial to consider in any co-location calibration. In particular, none of the methods is able to extrapolate to pollution levels well outside those encountered at the training stage. Ultimately, this is one of the key limiting factors when sensors are deployed away from the co-location site itself. Consequently, we find that the linear Ridge method, which best mitigates such extrapolation effects, typically performs as well as, or even better than, GPR after sensor relocation. Overall, our results highlight the potential of co-location methods paired with machine learning calibration techniques to reduce the costs of air pollution measurements, subject to careful consideration of the co-location training conditions, the choice of calibration variables, and the features of the calibration algorithm.
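The co-location calibration idea can be illustrated as a regression benchmark. This sketch uses invented synthetic data (a "reference" NO2 signal plus a drifting low-cost sensor channel with hypothetical temperature and humidity covariates), not the London co-location data, and compares the same four model families with out-of-sample R2.

```python
# Hedged sketch of co-location calibration: regress a reference signal on
# synthetic low-cost sensor channels; data and covariates are invented.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 500
temp = rng.normal(15, 5, n)       # hypothetical temperature covariate
rh = rng.uniform(30, 90, n)       # hypothetical relative humidity
no2_true = rng.gamma(4, 10, n)    # "reference station" NO2 signal
# Low-cost sensor responds to NO2 but drifts with temperature and humidity
sensor = no2_true + 0.8 * temp - 0.1 * rh + rng.normal(0, 5, n)

X = np.column_stack([sensor, temp, rh])
X_tr, X_te, y_tr, y_te = train_test_split(X, no2_true, test_size=0.3,
                                          random_state=1)
models = {
    "MLR": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "RF": RandomForestRegressor(n_estimators=100, random_state=1),
    "GPR": GaussianProcessRegressor(alpha=25.0),  # noise level is an assumption
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
for name, r2 in scores.items():
    print(f"{name}: R2 = {r2:.3f}")
```

With a purely linear synthetic relationship, the linear models already do well; the abstract's point is that relative rankings change with the co-location conditions and after relocation.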


2021 ◽  
Author(s):  
Anton Gryzlov ◽  
Liliya Mironova ◽  
Sergey Safonov ◽  
Muhammad Arsalan

Abstract Multiphase flow metering is an important tool for production monitoring and optimization. Although many technologies are available on the market, existing multiphase meters are only accurate to a certain extent and are generally expensive to purchase and maintain. Virtual flow metering (VFM) is a low-cost alternative to conventional production monitoring tools, which relies on mathematical modelling rather than hardware instrumentation. Supported by the availability of data from different sensors and production history, the development of virtual flow metering systems has become a focal point for many companies. This paper discusses the importance of flow modelling for virtual flow metering. In addition, the main data-driven algorithms are introduced for the analysis of several dynamic production data sets. Artificial Neural Networks (ANN), together with advanced machine learning methods such as GRU and XGBoost, have been considered as possible candidates for virtual flow metering. The obtained results indicate that the machine learning algorithms estimate oil, gas and water rates with acceptable accuracy. The feasibility of the data-driven virtual metering approach for continuous production monitoring has been demonstrated via a series of simulation-based cases. Among the algorithms used, the deep learning methods provided the most accurate results combined with a reasonable model training time.
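The data-driven VFM idea reduces to a regression from sensor readings to flow rates. This is an illustrative sketch only: gradient boosting stands in for the XGBoost/GRU models named in the abstract, and the "toy physics" generating the synthetic pressure, temperature and choke data is an assumption.

```python
# Hedged sketch of a data-driven virtual flow meter: predict an oil rate from
# synthetic pressure/temperature/choke readings (all values invented).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(2)
n = 600
p_wellhead = rng.uniform(50, 150, n)               # bar, hypothetical
p_downhole = p_wellhead + rng.uniform(80, 120, n)  # bar, hypothetical
temp = rng.uniform(60, 90, n)                      # deg C
choke = rng.uniform(10, 100, n)                    # % choke opening
# Toy physics: rate grows with drawdown and choke opening
oil_rate = 0.5 * (p_downhole - p_wellhead) * np.sqrt(choke) + rng.normal(0, 20, n)

X = np.column_stack([p_wellhead, p_downhole, temp, choke])
X_tr, X_te, y_tr, y_te = train_test_split(X, oil_rate, test_size=0.25,
                                          random_state=2)
vfm = GradientBoostingRegressor(random_state=2).fit(X_tr, y_tr)
mape = mean_absolute_percentage_error(y_te, vfm.predict(X_te))
print(f"test MAPE = {mape:.1%}")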


2020 ◽  
Vol 11 ◽  
Author(s):  
Pedro F. Da Costa ◽  
Jessica Dafflon ◽  
Walter H. L. Pinaya

As we age, our brain structure changes and our cognitive capabilities decline. Although brain aging is universal, rates of brain aging differ markedly, and this variability can be associated with the pathological mechanisms of psychiatric and neurological diseases. Predictive models have been applied to neuroimaging data to learn patterns associated with this variability and to develop a neuroimaging biomarker of brain condition. Aiming to stimulate the development of more accurate brain-age predictors, the Predictive Analytics Competition (PAC) 2019 provided a challenge that included a dataset of 2,640 participants. Here, we present our approach, which placed within the top 10 of the challenge. We developed an ensemble of shallow machine learning methods (e.g., Support Vector Regression and Decision Tree-based regressors) that combined voxel-based and surface-based morphometric data. We used normalized brain volume maps (i.e., gray matter, white matter, or both) and features of cortical regions and anatomical structures, such as cortical thickness, volume, and mean curvature. To fine-tune the hyperparameters of the machine learning methods, we combined genetic algorithms and grid search. Our ensemble achieved a mean absolute error of 3.7597 years in the competition, showing the potential that shallow methods still have for predicting brain-age.
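The core recipe, an averaged ensemble of shallow regressors with searched hyperparameters, can be sketched as follows. The features are random stand-ins for the morphometric data, the grid-search grids are illustrative assumptions, and plain grid search replaces the genetic-algorithm component.

```python
# Hedged sketch: averaged ensemble of SVR and a decision tree with
# grid-searched hyperparameters, evaluated by MAE (synthetic "brain-age" data).
import numpy as np
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(3)
n, d = 300, 10
X = rng.normal(size=(n, d))                      # stand-in morphometric features
age = 50 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(0, 3, n)
X_tr, X_te, y_tr, y_te = train_test_split(X, age, test_size=0.25, random_state=3)

svr = GridSearchCV(SVR(), {"C": [1, 10, 100]}, cv=3).fit(X_tr, y_tr)
tree = GridSearchCV(DecisionTreeRegressor(random_state=3),
                    {"max_depth": [3, 5, 7]}, cv=3).fit(X_tr, y_tr)

# Simple average of the two base predictors
pred = (svr.predict(X_te) + tree.predict(X_te)) / 2
mae = mean_absolute_error(y_te, pred)
print(f"ensemble MAE = {mae:.2f} years")
```

A weighted average (or stacking) over more base learners would bring this closer to the competition pipeline.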


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4254
Author(s):  
Maryam Abo-Tabik ◽  
Yael Benn ◽  
Nicholas Costen

Smoking cessation apps provide efficient, low-cost and accessible support to smokers who are trying to quit smoking. This article focuses on how up-to-date machine learning algorithms, combined with the improvement of mobile phone technology, can enhance our understanding of smoking behaviour and support the development of advanced smoking cessation apps. In particular, we focus on the pros and cons of existing approaches that have been used in the design of smoking cessation apps to date, highlighting the need to improve the performance of these apps by minimizing reliance on self-reporting of environmental conditions (e.g., location), craving status and/or smoking events as a method of data collection. Lastly, we propose that making use of more advanced machine learning methods while enabling the processing of information about the user’s circumstances in real time is likely to result in dramatic improvement in our understanding of smoking behaviour, while also increasing the effectiveness and ease-of-use of smoking cessation apps, by enabling the provision of timely, targeted and personalised intervention.


10.2196/14993 ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. e14993
Author(s):  
Hani Nabeel Mufti ◽  
Gregory Marshal Hirsch ◽  
Samina Raza Abidi ◽  
Syed Sibte Raza Abidi

Background Delirium is a temporary mental disorder that occasionally affects patients undergoing surgery, especially cardiac surgery. It is strongly associated with major adverse events, which in turn lead to increased cost and poor outcomes (eg, need for nursing home care due to cognitive impairment, stroke, and death). The ability to foresee patients at risk of delirium will guide the timely initiation of multimodal preventive interventions, which will aid in reducing the burden and negative consequences associated with delirium. Several studies have focused on the prediction of delirium. However, the number of studies in cardiac surgical patients that have used machine learning methods is very limited. Objective This study aimed to explore the application of several machine learning predictive models that can pre-emptively predict delirium in patients undergoing cardiac surgery and to compare their performance. Methods We investigated a number of machine learning methods to develop models that can predict delirium after cardiac surgery. A clinical dataset comprising over 5000 actual patients who underwent cardiac surgery in a single center was used to develop the models using logistic regression, artificial neural networks (ANN), support vector machines (SVM), Bayesian belief networks (BBN), naïve Bayes, random forest, and decision trees. Results Only 507 out of 5584 patients (11.4%) developed delirium. We addressed the underlying class imbalance using random undersampling in the training dataset. The final prediction performance was validated on a separate test dataset. Owing to the target class imbalance, several measures were used to evaluate each algorithm's performance for the delirium class on the test dataset. Of the selected algorithms, the SVM algorithm had the best F1 score for positive cases, kappa, and positive predictive value (40.2%, 29.3%, and 29.7%, respectively; P=.01, .03, and .02, respectively). 
The ANN had the best area under the receiver operating characteristic curve (78.2%; P=.03). The BBN had the best area under the precision-recall curve for detecting positive cases (30.4%; P=.03). Conclusions Although delirium is inherently complex, preventive measures to mitigate its negative effects can be applied proactively if patients at risk are prospectively identified. Our results highlight 2 important points: (1) addressing class imbalance in the training dataset will augment a machine learning model's performance in identifying patients likely to develop postoperative delirium, and (2) as the prediction of postoperative delirium is difficult because it is multifactorial and has complex pathophysiology, applying machine learning methods (complex or simple) may improve the prediction by revealing hidden patterns, which will lead to cost reduction through prevention of complications and will optimize patients' outcomes.
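The imbalance-handling step the abstract describes, random undersampling of the majority class on the training split only, can be sketched directly. The data, feature count and class prevalence below are synthetic assumptions, not the clinical dataset.

```python
# Hedged sketch: random undersampling of the majority class before fitting an
# SVM, with F1 for the positive class measured on an untouched test set.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(4)
n = 2000
X = rng.normal(size=(n, 5))
# Minority positive class, loosely tied to the first two features
p = 1 / (1 + np.exp(-(2 * X[:, 0] + X[:, 1] - 2.5)))
y = (rng.uniform(size=n) < p).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=4,
                                          stratify=y)
# Random undersampling: keep all positives, sample an equal number of negatives
pos = np.flatnonzero(y_tr == 1)
neg = rng.choice(np.flatnonzero(y_tr == 0), size=len(pos), replace=False)
keep = np.concatenate([pos, neg])
clf = SVC().fit(X_tr[keep], y_tr[keep])
f1 = f1_score(y_te, clf.predict(X_te))
print(f"F1 (positive class) = {f1:.3f}")
```

Undersampling only the training split keeps the test-set class prevalence realistic, which is why the abstract stresses validating on a separate test dataset.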



2019 ◽  
Vol 252 ◽  
pp. 03009 ◽  
Author(s):  
Tomasz Cieplak ◽  
Tomasz Rymarczyk ◽  
Robert Tomaszewski

This paper presents a concept for the design of an air quality monitoring system and describes a selection of data quality analysis methods. A high level of industrialisation increases the risk of natural disasters related to environmental pollution, such as air pollution by gases and clouds of dust (carbon monoxide, sulphur oxides, nitrogen oxides). That is why research related to monitoring this type of phenomenon is extremely important. Low-cost air quality sensors are increasingly used to monitor air parameters in urban areas. These sensors are used to obtain an image of the spatiotemporal variability in the concentration of air pollutants. Aside from their low price, which is important from the point of view of economic accessibility for society, low-cost sensors are prone to producing erroneous results compared with professional air quality monitors. The described study focuses on the analysis of outliers, which are particularly interesting for further analysis, as well as on modelling with machine learning methods for air quality assessment in the city of Lublin.
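One common way to screen low-cost sensor readings for outliers is a robust z-score based on the median and median absolute deviation. The cutoff, the synthetic PM10 series and the injected glitches below are illustrative assumptions, not the paper's specific method.

```python
# Hedged sketch of a simple outlier screen for low-cost sensor readings:
# flag values whose robust (median/MAD-based) z-score exceeds a cutoff.
import numpy as np

rng = np.random.default_rng(5)
pm10 = rng.gamma(5, 6, 500)                   # synthetic PM10 readings, ug/m3
pm10[[10, 200, 300]] = [400.0, 350.0, 500.0]  # injected sensor glitches

median = np.median(pm10)
mad = np.median(np.abs(pm10 - median))
robust_z = 0.6745 * (pm10 - median) / mad     # 0.6745 ~ Phi^-1(0.75)
outliers = np.flatnonzero(np.abs(robust_z) > 3.5)
print(f"flagged {outliers.size} of {pm10.size} readings:", outliers[:10])
```

Unlike a mean/standard-deviation screen, the median and MAD are themselves insensitive to the glitches, so a few extreme values cannot mask one another.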


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sam Andersson ◽  
Deepti R. Bathula ◽  
Stavros I. Iliadis ◽  
Martin Walter ◽  
Alkistis Skalkidou

Abstract Postpartum depression (PPD) is a detrimental health condition that affects 12% of new mothers. Despite its negative effects on mothers’ and children’s health, many women do not receive adequate care. Preventive interventions are cost-efficient among high-risk women, but our ability to identify these women is poor. We leveraged the power of clinical, demographic, and psychometric data to assess whether machine learning methods can make accurate predictions of postpartum depression. Data were obtained from a population-based prospective cohort study in Uppsala, Sweden, collected between 2009 and 2018 (BASIC study, n = 4313). Sub-analyses among women without previous depression were performed. The extremely randomized trees method provided robust performance with the highest accuracy and well-balanced sensitivity and specificity (accuracy 73%, sensitivity 72%, specificity 75%, positive predictive value 33%, negative predictive value 94%, area under the curve 81%). Among women without earlier mental health issues, the accuracy was 64%. The variables placing women at most risk for PPD were depression and anxiety during pregnancy, as well as variables related to resilience and personality. Future clinical models that could be implemented directly after delivery might consider including these variables in order to identify women at high risk for postpartum depression, facilitating individualized follow-up and improving cost-effectiveness.
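The extremely randomized trees method named above is available in scikit-learn as ExtraTreesClassifier. This sketch shows the evaluation pattern (accuracy, sensitivity, specificity from a confusion matrix) on synthetic data; the features are invented stand-ins for the clinical and psychometric variables.

```python
# Hedged sketch: extremely randomized trees on synthetic data, reporting
# accuracy, sensitivity and specificity as in the abstract's evaluation.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(6)
n = 1500
X = rng.normal(size=(n, 8))   # stand-ins for e.g. antenatal depression scores
y = ((X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 1, n)) > 1.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=6,
                                          stratify=y)
clf = ExtraTreesClassifier(n_estimators=200, random_state=6).fit(X_tr, y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
acc = (tp + tn) / (tp + tn + fp + fn)
sens = tp / (tp + fn)
spec = tn / (tn + fp)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Extra-trees differs from a random forest in that split thresholds are drawn at random, which adds variance reduction through extra randomization.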


2021 ◽  
Vol 14 (8) ◽  
pp. 5637-5655 ◽  
Author(s):  
Peer Nowack ◽  
Lev Konstantinovskiy ◽  
Hannah Gardiner ◽  
John Cant

Abstract. Low-cost air pollution sensors often fail to attain sufficient performance compared with state-of-the-art measurement stations, and they typically require expensive laboratory-based calibration procedures. A repeatedly proposed strategy to overcome these limitations is calibration through co-location with public measurement stations. Here we test the idea of using machine learning algorithms for such calibration tasks using hourly-averaged co-location data for nitrogen dioxide (NO2) and particulate matter of particle sizes smaller than 10 µm (PM10) at three different locations in the urban area of London, UK. We compare the performance of ridge regression, a linear statistical learning algorithm, to two non-linear algorithms in the form of random forest regression (RFR) and Gaussian process regression (GPR). We further benchmark the performance of all three machine learning methods relative to the more common multiple linear regression (MLR). We obtain very good out-of-sample R2 scores (coefficient of determination) >0.7, frequently exceeding 0.8, for the machine learning calibrated low-cost sensors. In contrast, the performance of MLR is more dependent on random variations in the sensor hardware and co-located signals, and it is also more sensitive to the length of the co-location period. We find that, subject to certain conditions, GPR is typically the best-performing method in our calibration setting, followed by ridge regression and RFR. We also highlight several key limitations of the machine learning methods, which will be crucial to consider in any co-location calibration. In particular, all methods are fundamentally limited in how well they can reproduce pollution levels that lie outside those encountered at training stage. We find, however, that the linear ridge regression outperforms the non-linear methods in extrapolation settings. GPR can allow for a small degree of extrapolation, whereas RFR can only predict values within the training range. 
This algorithm-dependent ability to extrapolate is one of the key limiting factors when the calibrated sensors are deployed away from the co-location site itself. Consequently, we find that ridge regression often performs as well as or even better than GPR after sensor relocation. Our results highlight the potential of co-location approaches paired with machine learning calibration techniques to reduce the costs of air pollution measurements, subject to careful consideration of the co-location training conditions, the choice of calibration variables and the features of the calibration algorithm.
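The extrapolation contrast drawn above, that tree ensembles cannot predict outside their training range while linear models can, is easy to demonstrate on toy one-dimensional data (invented for illustration, not the London sensor data).

```python
# Hedged sketch of the extrapolation limitation: a random forest's predictions
# are bounded by its training targets, while ridge extrapolates linearly.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
x_train = rng.uniform(0, 50, 300)             # training pollution range
y_train = 2.0 * x_train + rng.normal(0, 2, 300)

ridge = Ridge().fit(x_train[:, None], y_train)
rf = RandomForestRegressor(n_estimators=100, random_state=7).fit(
    x_train[:, None], y_train)

x_new = np.array([[100.0]])                   # well above the training range
ridge_pred = ridge.predict(x_new)[0]
rf_pred = rf.predict(x_new)[0]
print(f"ridge: {ridge_pred:.1f}  random forest: {rf_pred:.1f}  (true ~200)")
```

The forest's prediction is an average over leaf means, so it can never exceed the largest training target; ridge follows the fitted linear trend beyond the training range.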


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
X Y Zhao ◽  
J G Yang ◽  
T G Chen ◽  
J M Wang ◽  
X Li ◽  
...  

Abstract Background Prediction of in-hospital bleeding is critical for clinical decision making for acute myocardial infarction (AMI) patients undergoing percutaneous coronary intervention (PCI). Machine learning methods can automatically select the combination of important features and learn their underlying relationship with the outcome. Objective We aim to evaluate the predictive value of machine learning methods for predicting in-hospital bleeding in AMI patients. Methods We used data from the multicenter China Acute Myocardial Infarction (CAMI) registry. We randomly partitioned the cohort into a derivation set (75%) and a validation set (25%). Using data from the derivation set, we applied a state-of-the-art machine learning algorithm, XGBoost, to automatically select features from 106 candidate variables and train a risk prediction model to predict in-hospital bleeding (BARC 3, 5 definition). Results 16736 AMI patients who underwent PCI were consecutively included in the analysis, of whom 70 (0.42%) had in-hospital bleeding according to the BARC 3, 5 definition. Fifty-nine features were automatically selected from the candidate features and were used to construct the prediction model. The area under the curve (AUC) of the XGBoost model was 0.816 (95% CI: 0.745–0.887) on the validation set, while the AUC of the CRUSADE risk score was 0.723 (95% CI: 0.619–0.828). Relative contribution of the 12 most important features:

Feature            Relative importance
Direct bilirubin   0.078
Heart rate         0.077
CKMB               0.076
Creatinine         0.064
GPT                0.052
Age                0.048
SBP                0.036
TG                 0.035
Glucose            0.035
HCT                0.031
Total bilirubin    0.030
Neutrophil         0.030

(Figure: ROC curves of the XGBoost model and the CRUSADE score.) Conclusion The XGBoost model derived from the CAMI cohort accurately predicts in-hospital bleeding among Chinese AMI patients undergoing PCI. Acknowledgement/Funding The CAMS Innovation Fund for Medical Sciences (CIFMS) (2016-12M-1-009); the Twelfth Five-Year Planning Project of China (2011BAI11B02)
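The derive/validate-and-rank workflow above can be sketched briefly. This is an illustration under stated assumptions: scikit-learn's GradientBoostingClassifier stands in for XGBoost, the feature names are a hypothetical subset of the registry variables, and the synthetic rare outcome is invented.

```python
# Hedged sketch: gradient-boosted trees (stand-in for XGBoost) on a synthetic
# imbalanced outcome; rank features by importance and report validation AUC.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
names = ["bilirubin", "heart_rate", "ckmb", "creatinine", "age", "sbp"]
n = 4000
X = rng.normal(size=(n, len(names)))
# Rare outcome driven mostly by the first two hypothetical features
y = ((1.5 * X[:, 0] + X[:, 1] + rng.normal(0, 1, n)) > 4.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=8,
                                          stratify=y)
model = GradientBoostingClassifier(random_state=8).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
ranked = sorted(zip(names, model.feature_importances_), key=lambda t: -t[1])
print(f"validation AUC = {auc:.3f}")
for name, imp in ranked:
    print(f"{name:12s} {imp:.3f}")
```

Sorting `feature_importances_` this way yields a relative-contribution table of the kind reported in the Results.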

