Toward the accurate estimation of elliptical side orifice discharge coefficient applying two rigorous kernel-based data-intelligence paradigms

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Masoud Karbasi ◽  
Mehdi Jamei ◽  
Iman Ahmadianfar ◽  
Amin Asadi

Abstract: In the present study, two kernel-based data-intelligence paradigms, namely Gaussian Process Regression (GPR) and Kernel Extreme Learning Machine (KELM), along with Generalized Regression Neural Network (GRNN) and Response Surface Methodology (RSM) as the validated schemes, were employed to precisely estimate the elliptical side orifice discharge coefficient in rectangular channels. A total of 588 laboratory data points covering various geometric and hydraulic conditions were used to develop the models. The discharge coefficient was considered as a function of five dimensionless hydraulic and geometric variables. The results showed that the machine learning models used in this study performed well compared to the regression-based relationships. Comparison between machine learning models showed that the GPR (RMSE = 0.0081, R = 0.958, MAPE = 1.3242) and KELM (RMSE = 0.0082, R = 0.9564, MAPE = 1.3499) models provide higher accuracy. Based on the RSM model, a new practical equation was developed to predict the discharge coefficient. Also, the sensitivity analysis of the input parameters showed that the main channel width to orifice height ratio (B/b) has the most significant effect on the discharge coefficient. The leverage approach was applied to identify outlier data and the applicability domain.
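
The three scores quoted in this abstract (RMSE, R, MAPE) can be illustrated with a minimal sketch. It assumes R is the Pearson correlation coefficient and MAPE is expressed in percent, neither of which the abstract states. The sample values are invented for illustration and are not the study's data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (assumes no observed value is zero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

# Hypothetical discharge-coefficient values, made up for illustration only.
observed = [0.60, 0.62, 0.65, 0.63]
predicted = [0.61, 0.62, 0.64, 0.64]
print(rmse(observed, predicted), pearson_r(observed, predicted), mape(observed, predicted))
```

RMSE carries the units of the target, while R and MAPE are scale-free, which is presumably why the abstract reports all three together.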

2021 ◽  
Author(s):  
Bruno Barbosa Miranda de Paiva ◽  
Polianna Delfino Pereira ◽  
Claudio Moises Valiense de Andrade ◽  
Virginia Mara Reis Gomes ◽  
Maria Clara Pontello Barbosa Lima ◽  
...  

Objective: To provide a thorough comparative study of state-of-the-art machine learning methods and statistical methods for determining in-hospital mortality in COVID-19 patients using data available upon hospital admission; to study the reliability of the predictions of the most effective methods by correlating the probability of the outcome with the accuracy of the methods; and to investigate how explainable the predictions produced by the most effective methods are. Materials and Methods: De-identified data were obtained from COVID-19-positive patients in 36 participating hospitals, from March 1 to September 30, 2020. Demographic, comorbidity, clinical presentation and laboratory data were used as training data to develop COVID-19 mortality prediction models. Multiple machine learning and traditional statistical models were trained on this prediction task using a folded cross-validation procedure, from which we assessed performance and interpretability metrics. Results: Stacking of machine learning models improved over the previous state-of-the-art results by more than 26% in predicting the class of interest (death), achieving an AUROC of 87.1% and a macro-F1 of 73.9%. We also show that some machine learning models can be very interpretable and reliable, yielding more accurate predictions while providing a good explanation of why. Conclusion: The best results were obtained using the meta-learning ensemble model Stacking. State-of-the-art explainability techniques such as SHAP values can be used to draw useful insights into the patterns learned by machine learning algorithms. Machine learning models can be more explainable than traditional statistical models while also yielding highly reliable predictions. Key words: COVID-19; prognosis; prediction model; machine learning


Author(s):  
Harinarayan Sharma ◽  
Sonam Kumari ◽  
Aniket K. Dutt ◽  
Pawan Kumar ◽  
Mamookho E. Makhatha

Aim: To develop machine learning models for the performance of refrigerator and air-conditioning systems. Background: The Coefficient Of Performance (COP) of a Refrigerator and Air-Conditioning (RAC) system is a complex function of evaporative temperature and the concentration of nano-particles in the lubricant. In recent years, researchers have focused on experimental studies to improve the COP. Further, a few researchers have applied simulation techniques such as fuzzy systems, Artificial Neural Networks (ANN), simulated annealing, etc. to the Vapour Compression Refrigeration (VCR) cycle. There is a scarcity of modeling research on the performance of RAC systems. Objective: The study aims to develop machine learning predictive models for the performance of refrigerator and air-conditioning systems using experimental data. Methods: The experiment was performed on a VCR system to determine the COP. Three different lubricant concentrations (0.5, 1.0 and 1.5 g of nano-TiO2 particles added to 1 liter of Polyolester (POE) oil) were used. The experimentally calculated COP was used to train and test the machine learning models. Gaussian Process Regression (GPR) and Support Vector Regression (SVR) methods were applied to develop the models. Results: The experimental results reveal that the COP increases with increasing nano-particle concentration at a given temperature. The addition of 0.5 and 1.0 g of TiO2 to the POE oil shows a better rate of increment in the COP than the addition of 1.5 g of TiO2. GPR and SVR with the RBF kernel function are the most appropriate machine learning models for the nonlinear relationship between the output parameter (COP) and the input parameters (evaporative temperature and concentration of TiO2). Conclusion: The present study investigated machine learning approaches for the performance of RAC systems using experimental data sets. The experimental results show that R134a and the TiO2-POE nanolubricant work efficiently and that the coefficient of performance of the VCR system increases with the concentration of nano-particles. The performance of the developed models is compared using the coefficient of correlation and RMSE values. After comparison, it is concluded that the RBF-based GPR model is the best-fit machine learning model for predicting the COP among the models considered for this data set.
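
To make the idea of RBF-kernel Gaussian Process Regression concrete, the following is a minimal sketch of the GPR posterior mean. It assumes a constant prior mean, a single shared length scale, and inputs already scaled to comparable ranges. None of these choices, nor the numbers used, come from the study; the function names are hypothetical.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gpr_fit(X, y, noise=1e-6, length_scale=1.0):
    """Return (alpha, mean_y) where alpha = (K + noise*I)^-1 (y - mean_y)."""
    n = len(X)
    mean_y = sum(y) / n
    K = [[rbf_kernel(X[i], X[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_linear(K, [yi - mean_y for yi in y])
    return alpha, mean_y

def gpr_predict(X_train, alpha, mean_y, x_new, length_scale=1.0):
    """Posterior mean at x_new: mean_y + k(x_new, X_train) . alpha."""
    k_star = [rbf_kernel(x_new, xi, length_scale) for xi in X_train]
    return mean_y + sum(k * a for k, a in zip(k_star, alpha))

# Hypothetical scaled inputs: [evaporative temperature, TiO2 concentration].
X = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
y = [2.0, 2.4, 2.9, 2.3, 2.8, 3.3]  # made-up COP values, not measured data
alpha, mean_y = gpr_fit(X, y)
print(gpr_predict(X, alpha, mean_y, [0.75, 0.5]))
```

With a very small noise term the posterior mean nearly interpolates the training points, and far from the data it reverts to the prior mean. In practice the length scale and noise would be tuned, for example by maximizing the marginal likelihood.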


Author(s):  
Maicon Herverton Lino Ferreira da Silva Barros ◽  
Geovanne Oliveira Alves ◽  
Lubnnia Morais Florêncio Souza ◽  
Élisson da Silva Rocha ◽  
João Fausto Lorenzato de Oliveira ◽  
...  

Tuberculosis (TB) is an airborne infectious disease caused by organisms in the Mycobacterium tuberculosis (Mtb) complex. In many low- and middle-income countries, TB remains a major cause of morbidity and mortality. This work benchmarks machine learning models using a Brazilian health database of confirmed TB cases and deaths, named SINAN-TB. The goal is to predict the probability of death by TB, assisting TB prognosis and the decision-making process. The database originally had 130 features, many of which had missing data, contained incorrect notification dates or birth dates, or were not related to the clinical and laboratory data. These data are treated, and after the preprocessing step, a new database with 38 features and 24,015 records is generated, comprising 22,876 TB cases and 1,139 deaths by TB. We design two experiments to investigate how data imbalance affects the models' performance. Using the f1-macro metric, we verify that the best result is achieved with the imbalanced database and an ensemble model composed of gradient boosting (GB), random forest (RF) and multi-layer perceptron (MLP) models.
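
A short sketch can show why a macro-averaged F1 score is informative on a data set this imbalanced. It uses the class counts quoted above and a hypothetical trivial classifier that never predicts death. The classifier and the function names are illustrative assumptions, not part of the study.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score for one class from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)

# Class counts from the abstract: 22,876 TB cases (label 0) and 1,139 deaths (label 1).
y_true = [0] * 22876 + [1] * 1139
# Hypothetical trivial classifier that always predicts "no death".
y_pred = [0] * len(y_true)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred, labels=[0, 1]))
```

The trivial classifier reaches an accuracy of about 0.95 but a macro-F1 below 0.5, because the minority class contributes an F1 of zero and both classes are weighted equally.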


Cureus ◽  
2021 ◽  
Author(s):  
Mohsen Tabatabaie ◽  
Amir Hossein Sarrami ◽  
Mojtaba Didehdar ◽  
Baharak Tasorian ◽  
Omid Shafaat ◽  
...  

2020 ◽  
Author(s):  
William P.T.M. van Doorn ◽  
Floris Helmich ◽  
Paul M.E.L. van Dam ◽  
Leo H.J. Jacobs ◽  
Patricia M. Stassen ◽  
...  

Abstract. Introduction: Risk stratification of patients presenting to the emergency department (ED) is important for appropriate triage. Using machine learning technology, we can integrate laboratory data from a modern emergency department and present these in relation to clinically relevant endpoints for risk stratification. In this study, we developed and evaluated transparent machine learning models in four large hospitals in the Netherlands. Methods: Historical laboratory data (2013-2018) available within the first two hours after presentation to the ED of Maastricht University Medical Centre+ (Maastricht), Meander Medical Center (Amersfoort), and Zuyderland (locations Sittard and Heerlen) were used. We used the first five years of data to develop the model and the sixth year to evaluate model performance in each hospital separately. Performance was assessed using the area under the receiver-operating-characteristic curve (AUROC), Brier scores and calibration curves. The SHapley Additive exPlanations (SHAP) algorithm was used to obtain transparent machine learning models. Results: We included 266,327 patients with more than 7 million laboratory results available for analysis. Models possessed high diagnostic performance with AUROCs of 0.94 [0.94-0.95], 0.98 [0.97-0.98], 0.88 [0.87-0.89] and 0.90 [0.89-0.91] for Maastricht, Amersfoort, Sittard and Heerlen, respectively. Using the SHAP algorithm, we visualized patient characteristics and laboratory results that drive patient-specific RISKINDEX predictions. As an illustrative example, we applied our models in a triage system for risk stratification that categorized 94.7% of the patients as low risk with a corresponding NPV of ≥99%. Discussion: The developed machine learning models are transparent, with excellent diagnostic performance in predicting 31-day mortality in ED patients across four hospitals. Follow-up studies will assess whether implementation of these algorithms can improve clinically relevant endpoints.
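
The evaluation measures named in this abstract (AUROC, Brier score, and the NPV of a low-risk group) can be sketched in a few lines. The outcomes, predicted risks, and the 0.25 threshold below are invented for illustration. The abstract does not state the threshold the authors used.

```python
def auroc(y_true, scores):
    """Probability that a random positive case scores higher than a random negative one.

    Ties count as 0.5. Pairwise comparison is O(n^2), fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)

def npv(y_true, probs, threshold):
    """Share of patients below the risk threshold who did not have the event."""
    low_risk = [t for t, p in zip(y_true, probs) if p < threshold]
    return sum(1 for t in low_risk if t == 0) / len(low_risk)

# Hypothetical outcomes (1 = died within follow-up) and predicted risks.
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
risks = [0.02, 0.05, 0.10, 0.20, 0.30, 0.40, 0.70, 0.90]
print(auroc(outcomes, risks), brier_score(outcomes, risks), npv(outcomes, risks, 0.25))
```

AUROC measures ranking quality only, whereas the Brier score also penalizes poorly calibrated probabilities, which is one reason to report both alongside calibration curves.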


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e4894 ◽  
Author(s):  
Senlin Zhu ◽  
Emmanuel Karlo Nyarko ◽  
Marijana Hadzima-Nyarko

The bio-chemical and physical characteristics of a river are directly affected by water temperature, which thereby affects the overall health of aquatic ecosystems. Accurately estimating water temperature is a complex problem. Modelling of river water temperature is usually based on a suitable mathematical model and field measurements of various atmospheric factors. In this article, the air–water temperature relationship of the Missouri River is investigated by developing three different machine learning models (Artificial Neural Network (ANN), Gaussian Process Regression (GPR), and Bootstrap Aggregated Decision Trees (BA-DT)). Standard models (linear regression, non-linear regression, and stochastic models) are also developed and compared to the machine learning models. Among the three standard models, the stochastic model clearly outperforms the linear and non-linear models. All three machine learning models have comparable results and outperform the stochastic model, with GPR having slightly better results for stations No. 2 and 3, while BA-DT has slightly better results for station No. 1. The machine learning models are very effective tools for the prediction of daily river temperature.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Cedric Gangloff ◽  
Sonia Rafi ◽  
Guillaume Bouzillé ◽  
Louis Soulat ◽  
Marc Cuggia

Abstract: The reverse transcription-polymerase chain reaction (RT-PCR) assay is the accepted standard for coronavirus disease 2019 (COVID-19) diagnosis. Like any test, RT-PCR produces false negative results, which clinicians can rectify by comparing them with clinical, biological and imaging data. The combination of RT-PCR and chest-CT could improve diagnostic performance, but this would require considerable resources for rapid use in all patients with suspected COVID-19. The potential contribution of machine learning in this situation has not been fully evaluated. The objective of this study was to develop and evaluate machine learning models using routine clinical and laboratory data to improve the performance of RT-PCR and chest-CT for COVID-19 diagnosis among post-emergency hospitalized patients. All adults admitted to the ED for suspected COVID-19, and then hospitalized at Rennes academic hospital, France, between March 20, 2020 and May 5, 2020 were included in the study. Three model types were created: logistic regression, random forest, and neural network. Each model was trained to diagnose COVID-19 using different sets of variables. The area under the receiver operating characteristic curve (AUC) was the primary outcome used to evaluate model performance. A total of 536 patients were included in the study: 106 in the COVID group and 430 in the NOT-COVID group. The AUC values of chest-CT and RT-PCR increased from 0.778 to 0.892 and from 0.852 to 0.930, respectively, with the contribution of machine learning. After generalization, machine learning models will make it possible to increase chest-CT and RT-PCR performance for COVID-19 diagnosis.


Author(s):  
Ahmed Kawther Hussein

Ear recognition is an attractive research topic in the area of biometrics. It involves building machine learning models to verify the identities of humans using their ears. In this article, the performance of ear recognition using two features, local binary pattern and histogram of gradient, is explored on the well-known USTB dataset. The finding is that the two features perform similarly in terms of accuracy but differ in the number of false predictions. The achieved accuracy of the histogram of gradient based extreme learning machine was 99.86%, while for the local binary pattern based extreme learning machine it was 99.59%.


Author(s):  
An Dinh ◽  
Stacey Miertschin ◽  
Amber Young ◽  
Somya D. Mohanty

Abstract. Background: Diabetes and cardiovascular disease are two of the main causes of death in the United States. Identifying and predicting these diseases in patients is the first step towards stopping their progression. We evaluate the capabilities of machine learning models in detecting at-risk patients using survey data (and laboratory results), and identify key variables within the data contributing to these diseases among the patients. Methods: Our research explores data-driven approaches which utilize supervised machine learning models to identify patients with such diseases. Using the National Health and Nutrition Examination Survey (NHANES) dataset, we conduct an exhaustive search of all available feature variables within the data to develop models for cardiovascular, prediabetes, and diabetes detection. Using different time-frames and feature sets for the data (based on laboratory data), multiple machine learning models (logistic regression, support vector machines, random forest, and gradient boosting) were evaluated on their classification performance. The models were then combined to develop a weighted ensemble model, capable of leveraging the performance of the disparate models to improve detection accuracy. Information gain of tree-based models was used to identify the key variables within the patient data that contributed to the detection of at-risk patients in each of the disease classes by the data-learned models. Results: The developed ensemble model for cardiovascular disease (based on 131 variables) achieved an Area Under - Receiver Operating Characteristics (AU-ROC) score of 83.1% using no laboratory results, and 83.9% accuracy with laboratory results. In diabetes classification (based on 123 variables), the eXtreme Gradient Boost (XGBoost) model achieved an AU-ROC score of 86.2% (without laboratory data) and 95.7% (with laboratory data). For pre-diabetic patients, the ensemble model had the top AU-ROC score of 73.7% (without laboratory data), and for laboratory-based data XGBoost performed the best at 84.4%. The top five predictors in diabetes patients were 1) waist size, 2) age, 3) self-reported weight, 4) leg length, and 5) sodium intake. For cardiovascular diseases the models identified 1) age, 2) systolic blood pressure, 3) self-reported weight, 4) occurrence of chest pain, and 5) diastolic blood pressure as key contributors. Conclusion: We conclude that machine-learned models based on survey questionnaires can provide an automated identification mechanism for patients at risk of diabetes and cardiovascular diseases. We also identify key contributors to the prediction, which can be further explored for their implications on electronic health records.
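
One common way to build a weighted ensemble is to average the per-model predicted probabilities with a weight per model. The sketch below assumes that scheme. The abstract does not say how the authors combined or weighted their models, so the weights, probabilities, and function name here are hypothetical.

```python
def weighted_ensemble(prob_lists, weights):
    """Weighted average of per-model probabilities, one output per sample.

    prob_lists: one list of predicted probabilities per model, all the same length.
    weights: one non-negative weight per model; they are normalized internally.
    """
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]

# Hypothetical predicted probabilities for three patients from four models.
model_probs = [
    [0.20, 0.70, 0.55],  # logistic regression
    [0.30, 0.80, 0.45],  # support vector machine
    [0.10, 0.90, 0.60],  # random forest
    [0.15, 0.85, 0.65],  # gradient boosting
]
# Hypothetical weights, for example proportional to each model's validation AU-ROC.
model_weights = [0.80, 0.78, 0.83, 0.86]
print(weighted_ensemble(model_probs, model_weights))
```

Because the weights are normalized, the combined value for each patient always lies between the lowest and highest individual model probabilities.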

