Prediction of frictional braking noise based on brake dynamometer test and artificial intelligent algorithms

Based on brake noise dynamometer test data, combined with artificial intelligence algorithms, frictional braking noise is quantitatively analyzed and predicted in this study. To achieve this goal, a frictional braking noise prediction method is proposed, which consists of two main parts. First, based on the experimental data obtained from the brake noise dynamometer tests, and combined with the improved Long Short-Term Memory (LSTM) algorithm, the coefficients of friction (COFs) are predicted under various braking test conditions. Then, based on the predicted braking COFs and other selected critical braking parameters, the quantitative prediction of frictional braking noise is obtained by means of the optimized eXtreme Gradient Boosting (XGBoost) algorithm. Finally, the inherent features of the XGBoost algorithm are employed to qualitatively analyze the importance of the main factors affecting the frictional braking noise. The prediction algorithms of COFs and frictional braking noise are validated by the brake dynamometer test data, and the R2 (R square) scores of both the LSTM and XGBoost prediction algorithms are 0.9, which verifies the feasibility of both algorithms. The main contribution of this work is to predict the braking noise based on a large set of test data combined with the LSTM and XGBoost artificial intelligence algorithms, which can significantly save time in brake system development and braking performance testing, and is significant for the rapid prediction of braking frictional noise and fast NVH (noise, vibration, and harshness) optimal design of frictional braking systems.

Extreme gradient boosting regression model for soil thermal conductivity

Thermal Science ◽

10.2298/tsci200612001y ◽

2021 ◽

Vol 25 (Spec. issue 1) ◽

pp. 1-7

Author(s):

Ahmet Yurttakal

Keyword(s):

Thermal Conductivity ◽

Soil Moisture ◽

Regression Model ◽

Test Data ◽

Granular Structure ◽

Gradient Boosting ◽

Soil Thermal Conductivity ◽

Extreme Gradient Boosting ◽

Complicated Process ◽

Boosting Algorithm

Estimating the thermal conductivity of soil is an important step for many geothermal applications. However, it is a difficult and complicated process, since it involves a variety of factors that have significant effects on the thermal conductivity of soils, such as soil moisture and granular structure. In this study, regression was performed with the extreme gradient boosting algorithm to develop a model for estimating the thermal conductivity value. The performance of the model was measured on the unseen test data. As a result, the proposed algorithm reached 0.18 RMSE, 0.99 R2, and 3.18% MAE, which indicates that the algorithm is promising.
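
The three error measures named in this abstract can be sketched as follows. This is only an illustration of the standard definitions of RMSE, MAE and R2; the function name and the sample values are hypothetical and do not come from the paper, and the abstract's MAE is quoted as a percentage, which this sketch does not attempt to reproduce.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R2) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # undefined if all observed values are equal
    return rmse, mae, r2


# Made-up values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
rmse, mae, r2 = regression_metrics(observed, predicted)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```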

Machine learning analysis on American Gut Project microbiome data to identify subjects with cancer both with and without chemotherapy exposure.

Journal of Clinical Oncology ◽

10.1200/jco.2020.38.15_suppl.e14069 ◽

2020 ◽

Vol 38 (15_suppl) ◽

pp. e14069-e14069

Author(s):

Oguz Akbilgic ◽

Ibrahim Karabayir ◽

Hakan Gunturkun ◽

Joseph F Pierre ◽

Ashley C Rashe ◽

...

Keyword(s):

Machine Learning ◽

Cancer Patients ◽

Test Data ◽

Gut Microbiome ◽

Classification Model ◽

Outcome Variable ◽

Gradient Boosting ◽

Healthy Controls ◽

Extreme Gradient Boosting ◽

Significant Difference

e14069 Background: There is growing interest in the links between cancer and the gut microbiome. However, the effect of chemotherapy upon the gut microbiome remains unknown. We studied 1) whether machine learning can accurately classify subjects with cancer vs healthy controls and 2) whether this classification model is affected by chemotherapy exposure status. Methods: We used the American Gut Project data to build an extreme gradient boosting (XGBoost) model to distinguish between subjects with cancer vs healthy controls using data on simple demographics and published microbiome. We then further explored the selected features for cancer subjects based on chemotherapy exposure. Results: The cohort included 7,685 subjects consisting of 561 subjects with cancer, 52.5% female, 87.3% White, and average age of 44.7 (SD 17.7). The binary outcome variable represents cancer status. Among 561 subjects with cancer, 94 of them were treated with chemotherapy agents before sampling of microbiomes. As predictors, there were four demographic variables (sex, race, age, BMI) and 1,812 operational taxonomic units (OTUs) each found in at least 2 subjects via RNA sequencing. We randomly split data into 80% training and 20% hidden test. We then built an XGBoost model with 5-fold cross-validation using only training data, yielding an AUC (with 95% CI) of 0.79 (0.77, 0.80), and obtained almost the same AUC on the hidden test data. Based on feature importance analysis, we identified 12 most important features (Age, BMI and 12 OTUs; 4C0d-2, Brachyspirae, Methanosphaera, Geodermatophilaceae, Bifidobacteriaceae, Slackia, Staphylococcus, Acidaminoccus, Devosia, Proteus) and rebuilt a model using only these features and obtained AUC of 0.80 (0.77, 0.83) on the hidden test data. The average predicted probabilities for controls, cancer patients who were exposed to chemotherapy, and cancer patients who were not were 0.071 (0.070, 0.073), 0.125 (0.110, 0.140), 0.156 (0.148, 0.164), respectively. There was no statistically significant difference in the levels of these 12 OTUs between cancer subjects treated with and without chemotherapy. Conclusions: Machine learning achieved a moderately high accuracy identifying patients’ cancer status based on microbiome. Despite the literature on microbiome and chemotherapy interaction, the levels of 12 OTUs used in our model were not significantly different for cancer patients with or without chemotherapy exposure. Testing this model on other large population databases is needed for broader validation.
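
The AUC reported in this abstract can be read as the probability that a randomly chosen cancer subject receives a higher predicted score than a randomly chosen control. The sketch below computes it directly from that definition. The function name, labels and scores are hypothetical, and the pairwise loop is quadratic, so it is meant for illustration rather than for cohorts of the size described.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = cancer, 0 = control) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```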

Prediction of Sepsis in COVID-19 Using Laboratory Indicators

Frontiers in Cellular and Infection Microbiology ◽

10.3389/fcimb.2020.586054 ◽

2021 ◽

Vol 10 ◽

Author(s):

Guoxing Tang ◽

Ying Luo ◽

Feng Lu ◽

Wei Li ◽

Xiongcheng Liu ◽

...

Keyword(s):

Laboratory Test ◽

Test Data ◽

Early Warning ◽

Clinical Symptoms ◽

Characteristic Curve ◽

Health Concern ◽

Inflammatory Factors ◽

Gradient Boosting ◽

Coagulation Function ◽

Extreme Gradient Boosting

Background: The outbreak of coronavirus disease 2019 (COVID-19) has become a global public health concern. Many inpatients with COVID-19 have shown clinical symptoms related to sepsis, which will aggravate the deterioration of patients’ condition. We aim to diagnose viral sepsis caused by SARS-CoV-2 by analyzing laboratory test data of patients with COVID-19 and to establish an early predictive model for sepsis risk among patients with COVID-19. Methods: This study retrospectively investigated laboratory test data of 2,453 patients with COVID-19 from electronic health records. Extreme gradient boosting (XGBoost) was employed to build four models with different feature subsets of a total of 69 collected indicators. Meanwhile, the explainable SHapley Additive exPlanations (SHAP) method was adopted to interpret predictive results and to analyze the feature importance of risk factors. Findings: The model for classifying COVID-19 viral sepsis with seven coagulation function indicators achieved an area under the receiver operating characteristic curve (AUC) of 0.9213 (95% CI, 89.94–94.31%), sensitivity of 97.17% (95% CI, 94.97–98.46%), and specificity of 82.05% (95% CI, 77.24–86.06%). The model for identifying COVID-19 coagulation disorders with eight features provided an average of 3.68 ± 4.60 days in advance for early warning prediction with 0.9298 AUC (95% CI, 86.91–99.04%), 82.22% sensitivity (95% CI, 67.41–91.49%), and 84.00% specificity (95% CI, 63.08–94.75%). Interpretation: We found that an abnormality of the coagulation function was related to the occurrence of sepsis, and the other routine laboratory tests represented by inflammatory factors had a moderate predictive value for coagulopathy, which indicated that early warning of sepsis in COVID-19 patients could be achieved by our established model to improve the patient’s prognosis and to reduce mortality.
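
SHAP attributions build on Shapley values from cooperative game theory. As a rough illustration of that underlying idea only, the sketch below computes exact Shapley values for a tiny hand-written scoring function by averaging each feature's marginal contribution over all feature orderings. It is not the SHAP library's algorithm and not the study's model; the toy function, feature values and baseline are all hypothetical.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A feature that has not yet "joined" keeps its baseline value.
    Cost grows factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # feature i joins the coalition
            value = model(current)
            phi[i] += value - prev     # its marginal contribution in this order
            prev = value
    return [p / len(orders) for p in phi]


def toy_model(f):
    """Hypothetical risk score over three made-up indicators."""
    return 2.0 * f[0] + 1.0 * f[1] + 0.5 * f[0] * f[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the score at `x` and the score at the baseline, and the interaction term is split evenly between the two features involved.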

A prehospital diagnostic algorithm for strokes using machine learning: a prospective observational study

10.21203/rs.3.rs-492777/v1 ◽

2021 ◽

Author(s):

Yosuke Hayashi ◽

Tadanaga Shimada ◽

Noriyuki Hattori ◽

Takashi Shimazui ◽

Yoichi Yoshida ◽

...

Keyword(s):

Machine Learning ◽

Observational Study ◽

Test Data ◽

Predictive Value ◽

Receiver Operating Curve ◽

Gradient Boosting ◽

Support Vector ◽

Emergency Medical Service Personnel ◽

Diagnostic Algorithms ◽

Extreme Gradient Boosting

High precision is optimal in prehospital diagnostic algorithms for strokes and large vessel occlusions (LVOs). We hypothesized that prehospital diagnostic algorithms for strokes and their subcategories using machine learning could have high predictive value. Consecutive adult patients with suspected stroke as per emergency medical service personnel were enrolled in a prospective multicenter observational study in 12 hospitals in Japan. Five diagnostic algorithms using machine learning, including logistic regression, random forest, support vector machine, and eXtreme Gradient Boosting (XGBoost), were evaluated for stroke and subcategories including acute ischemic stroke (AIS) with/without LVO, intracranial hemorrhage (ICH), and subarachnoid hemorrhage (SAH). Of the 1446 patients in the analysis, 1156 (80%) were randomly included in the training cohort and 290 (20%) were included in the test cohort. The diagnostic algorithm for strokes using XGBoost had the highest diagnostic value (test data, area under the receiver operating curve [AUROC] 0.980, confidence interval [CI] 0.962–0.994). The diagnostic algorithms for the subcategories using XGBoost had a high predictive value (test data, AUROC [CI], AIS with LVO 0.898 [0.848–0.939], AIS without LVO 0.882 [0.836–0.923], ICH 0.866 [0.817–0.911], SAH 0.926 [0.874–0.971]). Prehospital diagnostic algorithms using machine learning had high predictive value for strokes and their subcategories.

Predicting Undesired Treatment Outcome in Mental Healthcare: Machine Learning Study (Preprint)

10.2196/preprints.17235 ◽

2019 ◽

Author(s):

Kasper Van Mens ◽

Joran Lokkerbol ◽

Richard Janssen ◽

Robert de Lange ◽

Bea Tiemens

Keyword(s):

Machine Learning ◽

Treatment Outcome ◽

Mental Health Treatment ◽

Mental Healthcare ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Trade Off ◽

Trade Offs ◽

Outcome Monitoring ◽

Extreme Gradient Boosting

BACKGROUND It remains a challenge to predict which treatment will work for which patient in mental healthcare. OBJECTIVE In this study we compare machine learning algorithms to predict during treatment which patients will not benefit from brief mental health treatment and present trade-offs that must be considered before an algorithm can be used in clinical practice. METHODS Using an anonymized dataset containing routine outcome monitoring data from a mental healthcare organization in the Netherlands (n = 2,655), we applied three machine learning algorithms to predict treatment outcome. The algorithms were internally validated with cross-validation on a training sample (n = 1,860) and externally validated on an unseen test sample (n = 795). RESULTS The performance of the three algorithms did not significantly differ on the test set. With a default classification cut-off at 0.5 predicted probability, the extreme gradient boosting algorithm showed the highest positive predictive value (ppv) of 0.71 (0.61 – 0.77) with a sensitivity of 0.35 (0.29 – 0.41) and area under the curve of 0.78. A trade-off can be made between ppv and sensitivity by choosing different cut-off probabilities. With a cut-off at 0.63, the ppv increased to 0.87 and the sensitivity dropped to 0.17. With a cut-off at 0.38, the ppv decreased to 0.61 and the sensitivity increased to 0.57. CONCLUSIONS Machine learning can be used to predict treatment outcomes based on routine monitoring data. This allows practitioners to choose their own trade-off between being selective and more certain versus inclusive and less certain.
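
The trade-off between ppv and sensitivity described in this abstract can be illustrated by sweeping the classification cut-off over a set of predicted probabilities. The sketch below reuses the three cut-offs quoted in the abstract, but the labels and probabilities are made up, so the printed values are not the study's results.

```python
def ppv_and_sensitivity(y_true, probs, cutoff):
    """Positive predictive value and sensitivity at a given probability cut-off."""
    preds = [1 if p >= cutoff else 0 for p in probs]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Hypothetical outcomes (1 = undesired outcome) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.55, 0.4, 0.6, 0.45, 0.3, 0.1]

results = {c: ppv_and_sensitivity(y_true, probs, c) for c in (0.38, 0.5, 0.63)}
for cutoff, (ppv, sens) in results.items():
    print(f"cut-off {cutoff:.2f}: ppv={ppv:.2f}  sensitivity={sens:.2f}")
```

On this toy data, raising the cut-off makes the classifier more selective: ppv rises while sensitivity falls, which is the same direction of trade-off the abstract reports.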

XGBoost and Network Analysis for Prediction of Proteins Affecting Insulin based on Protein Protein Interactions

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽

10.22219/kinetik.v5i4.1076 ◽

2020 ◽

pp. 253-262

Author(s):

Mohammad Hamim Zajuli Al Faroby ◽

Mohammad Isa Irawan ◽

Ni Nyoman Tri Puspaningsih

Keyword(s):

Protein Interactions ◽

Interaction Analysis ◽

Synthesis Process ◽

Gradient Boosting ◽

Protein Protein Interactions ◽

Central Function ◽

Extreme Gradient Boosting ◽

Main Protein ◽

The Right ◽

Roc Score

Protein-protein interaction (PPI) analysis can be used to identify proteins that have a supporting function for the main protein, especially in the synthesis process. Insulin is synthesized by proteins that share the same molecular function but cover different, mutually supportive roles. To identify this function, the translation of Gene Ontology (GO) gives certain characteristics to each protein. This study aims to predict proteins that interact with insulin using the centrality method as a feature extractor and extreme gradient boosting as a classification algorithm. Characterization using the centrality method produces features that describe the central function of each protein. Classification results are evaluated using precision, recall and ROC scores. Optimizing the model by finding the right parameters produces an accuracy of and a ROC score of . The prediction model produced by XGBoost performs above the average of other machine learning methods.
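
Using centrality as a feature extractor can be sketched as computing one score per protein from the interaction network. The abstract does not say which centrality measures were used, so the example below assumes plain degree centrality, and the protein identifiers and edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality per node: number of distinct neighbours / (n - 1)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue                      # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Hypothetical interaction network; "INS" and "P1".."P3" are placeholder names.
edges = [("INS", "P1"), ("INS", "P2"), ("INS", "P3"), ("P1", "P2")]
centrality = degree_centrality(edges)
for protein, score in sorted(centrality.items()):
    print(f"{protein}: {score:.3f}")
```

Each protein's score could then serve as one column of the feature table passed to a classifier.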

Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

Remote Sensing ◽

10.3390/rs13051021 ◽

2021 ◽

Vol 13 (5) ◽

pp. 1021

Author(s):

Hu Ding ◽

Jiaming Na ◽

Shangjing Jiang ◽

Jie Zhu ◽

Kai Liu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Loess Plateau ◽

Water Conservation ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

The Loess Plateau ◽

Object Based ◽

Extreme Gradient Boosting

Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. As a result of the importance of the contextual information for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies are widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three different commonly-used ML classifiers, namely, extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.
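
Of the three classifiers compared in this abstract, k-nearest neighbor is the simplest to write out, and overall accuracy is the share of objects labeled correctly. The sketch below shows both on made-up two-feature objects. The feature values, class names and the choice k = 3 are assumptions for illustration, not settings from the study.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training objects (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def overall_accuracy(y_true, y_pred):
    """Fraction of objects whose predicted class matches the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical per-object features (for example, two normalised terrain attributes).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["non-terrace"] * 3 + ["terrace"] * 3

test_X = [(0.05, 0.05), (1.0, 0.95), (0.8, 0.9), (0.4, 0.4)]
test_y = ["non-terrace", "terrace", "terrace", "terrace"]

predicted = [knn_predict(train_X, train_y, x) for x in test_X]
accuracy = overall_accuracy(test_y, predicted)
print(predicted, accuracy)
```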

Computational Intelligence-Based Model for Mortality Rate Prediction in COVID-19 Patients

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18126429 ◽

2021 ◽

Vol 18 (12) ◽

pp. 6429

Author(s):

Irfan Ullah Khan ◽

Nida Aslam ◽

Malak Aljabri ◽

Sumayh S. Aljameel ◽

Mariam Moataz Aly Kamaleldin ◽

...

Keyword(s):

Mortality Rate ◽

Computational Intelligence ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

Detection And Identification ◽

Proposed Model ◽

Extreme Gradient Boosting ◽

The World ◽

Detection And Diagnosis

The COVID-19 outbreak is currently one of the biggest challenges facing countries around the world. Millions of people have lost their lives due to COVID-19. Therefore, the accurate early detection and identification of severe COVID-19 cases can reduce the mortality rate and the likelihood of further complications. Machine Learning (ML) and Deep Learning (DL) models have been shown to be effective in the detection and diagnosis of several diseases, including COVID-19. This study used ML algorithms, such as Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbor (KNN), and a DL model (containing six layers with ReLU and an output layer with sigmoid activation) to predict the mortality rate in COVID-19 cases. Models were trained using confirmed COVID-19 patients from 146 countries. Comparative analysis was performed among ML and DL models using a reduced feature set. The best results were achieved using the proposed DL model, with an accuracy of 0.97. Experimental results reveal the significance of the proposed model over the baseline study in the literature with the reduced feature set.

A Machine Learning Method for Predicting Vegetation Indices in China

Remote Sensing ◽

10.3390/rs13061147 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1147

Author(s):

Xiangqian Li ◽

Wenping Yuan ◽

Wenjie Dong

Keyword(s):

Machine Learning ◽

Growing Season ◽

Crop Growth ◽

Spatiotemporal Distribution ◽

Coefficient Of Determination ◽

Gradient Boosting ◽

Severe Drought ◽

Vegetation Growth ◽

Extreme Gradient Boosting ◽

Boosting Method

To forecast the terrestrial carbon cycle and monitor food security, vegetation growth must be accurately predicted; however, current process-based ecosystem and crop-growth models are limited in their effectiveness. This study developed a machine learning model using the extreme gradient boosting method to predict vegetation growth throughout the growing season in China from 2001 to 2018. The model used satellite-derived vegetation data for the first month of each growing season, CO2 concentration, and several meteorological factors as data sources for the explanatory variables. Results showed that the model could reproduce the spatiotemporal distribution of vegetation growth as represented by the satellite-derived normalized difference vegetation index (NDVI). The predictive error for the growing season NDVI was less than 5% for more than 98% of vegetated areas in China; the model represented seasonal variations in NDVI well. The coefficient of determination (R2) between the monthly observed and predicted NDVI was 0.83, and more than 69% of vegetated areas had an R2 > 0.8. The effectiveness of the model was examined for a severe drought year (2009), and results showed that the model could reproduce the spatiotemporal distribution of NDVI even under extreme conditions. This model provides an alternative method for predicting vegetation growth and has great potential for monitoring vegetation dynamics and crop growth.

Machine Learning-Based Energy System Model for Tissue Paper Machines

Processes ◽

10.3390/pr9040655 ◽

2021 ◽

Vol 9 (4) ◽

pp. 655

Author(s):

Huanhuan Zhang ◽

Jigeng Li ◽

Mengna Hong

Keyword(s):

Neural Network ◽

Energy Consumption ◽

Gradient Boosting ◽

Paper Machine ◽

Steam Consumption ◽

Tissue Paper ◽

Paper Machines ◽

Energy Consumption Model ◽

Extreme Gradient Boosting ◽

Consumption Model

With the global energy crisis and environmental pollution intensifying, tissue papermaking enterprises urgently need to save energy. The energy consumption model is essential for the energy saving of tissue paper machines. The energy consumption of a tissue paper machine is very complicated, and establishing its energy consumption model from a mechanism model involves a large workload and considerable difficulty. Therefore, this article aims to build an empirical energy consumption model for tissue paper machines. The energy consumption of this model includes electricity consumption and steam consumption. Since the process parameters have a great influence on the energy consumption of tissue paper machines, this study uses three methods (linear regression, artificial neural network and extreme gradient boosting tree) to establish the relationship between process parameters and power consumption, and between process parameters and steam consumption. Then, the best power consumption model and the best steam consumption model are selected from the models established by linear regression, artificial neural network and the extreme gradient boosting tree. Further, they are combined into the energy consumption model of the tissue paper machine. Finally, the models established by the three methods are evaluated. The experimental results show that using the empirical model for tissue paper machine energy consumption modeling is feasible. The results also indicate that the power consumption model and steam consumption model established by the extreme gradient boosting tree are better than the models established by linear regression and artificial neural network. The mean absolute percentage error of the electricity consumption model and the steam consumption model built by the extreme gradient boosting tree is approximately 2.72 and 1.87, respectively. The root mean square errors of these two models are about 4.74 and 0.03, respectively. The extreme gradient boosting tree is thus an efficient method for modeling the energy consumption of tissue paper machines.
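
The model-selection step described in this abstract, picking the best of several candidate models for one target, can be sketched by scoring each candidate's predictions with the mean absolute percentage error and keeping the lowest. The model names, observed values and predictions below are hypothetical and only mirror the three method families named in the abstract.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; observed values must be non-zero."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def best_model(y_true, predictions):
    """Return (name of the model with the lowest MAPE, dict of all MAPE scores)."""
    scores = {name: mape(y_true, pred) for name, pred in predictions.items()}
    return min(scores, key=scores.get), scores


# Made-up power consumption readings and candidate predictions.
observed = [100.0, 200.0, 400.0]
predictions = {
    "linear_regression": [110.0, 180.0, 440.0],
    "neural_network": [105.0, 190.0, 420.0],
    "gradient_boosting": [102.0, 196.0, 408.0],
}
winner, scores = best_model(observed, predictions)
print(winner, scores)
```

Running the same selection separately for the power and steam targets, then pairing the two winners, would correspond to the combination step the abstract describes.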
