regression splines
Recently Published Documents


Total documents: 452 (five years: 114)

H-index: 47 (five years: 9)

2022 ◽  
Author(s):  
Alberto Celma ◽  
Richard Bade ◽  
Juan V. Sancho ◽  
Félix Hernández ◽  
Melissa Humpries ◽  
...  

Abstract: Ultra-high performance liquid chromatography coupled to ion mobility separation and high-resolution mass spectrometry has proven very valuable for screening emerging contaminants in the aquatic environment. However, when applying suspect or non-target approaches (i.e. when no reference standards are available), no retention time (RT) or collision cross section (CCS) values are available to facilitate identification. In silico prediction of RT and CCS can therefore be of great utility in reducing the number of candidates to investigate. In this work, Multivariate Adaptive Regression Splines (MARS) was evaluated for the prediction of both RT and CCS. MARS prediction models were developed and validated using a database of 477 protonated molecules, 169 deprotonated molecules and 249 sodium adducts. Multivariate and univariate models were evaluated, with the univariate models showing a better fit to the empirical data. The RT model (R² = 0.855) showed a deviation between predicted and empirical data of ± 2.32 min (95% confidence intervals). The deviation observed for CCS data of protonated molecules using the CCSH model (R² = 0.966) was ± 4.05% with 95% confidence intervals. The CCSH model was also tested for the prediction of deprotonated molecules, resulting in deviations below ± 5.86% for 95% of the cases. Finally, a third model was developed for sodium adducts (CCSNa, R² = 0.954), with deviations below ± 5.25% for 95% of the cases. The developed models have been incorporated into an open-access, user-friendly online platform, which is a great advantage for third-party research laboratories that need to predict both RT and CCS data.
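For readers unfamiliar with MARS, the following is a minimal sketch of the hinge-function basis that a univariate MARS model is built from. It is not the authors' model: the function names `hinge` and `mars_predict`, the knot at 3.0, and the coefficients are all hypothetical values chosen for illustration, and the knot search and pruning steps of the actual MARS procedure are omitted.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function used by MARS.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a univariate MARS-style model at x.

    terms is a list of (coefficient, knot, direction) tuples. The model is
    the intercept plus a weighted sum of hinge functions, i.e. a piecewise
    linear curve whose slope changes at each knot.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: slope 2 to the right of the knot at 3.0,
# slope 1 to the left of it (coefficient -1 on the mirrored hinge).
example_terms = [(2.0, 3.0, +1), (-1.0, 3.0, -1)]

for x in (1.0, 3.0, 5.0):
    print(x, mars_predict(x, 1.0, example_terms))
```

In a fitted model the knots and coefficients would be chosen from the data; here they are fixed by hand only to show how the pieces combine.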


2022 ◽  
Vol 2022 ◽  
pp. 1-9
Author(s):  
Eman H. Alkhammash ◽  
Abdelmonaim Fakhry Kamel ◽  
Saud M. Al-Fattah ◽  
Ahmed M. Elshewey

This paper presents optimized linear regression with multivariate adaptive regression splines (LR-MARS) for predicting crude oil demand in Saudi Arabia, based on the social spider optimization (SSO) algorithm. The SSO algorithm is applied to optimize LR-MARS performance by fine-tuning its hyperparameters. The proposed prediction model was trained and tested using historical oil data gathered from different sources. The results suggest that the demand for crude oil in Saudi Arabia will continue to increase during the forecast period (1980–2015). Several prediction accuracy metrics, including Mean Absolute Error (MAE), Median Absolute Error (MedAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), and the coefficient of determination (R²), were used to assess the predictive performance of the various models. Analysis of variance (ANOVA) was also applied to examine the predicted crude oil demand in Saudi Arabia and to compare the actual test data with the predicted results across the different models. The experimental results show that the optimized LR-MARS model performs better than the other models in predicting crude oil demand.
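The accuracy metrics named in this abstract have standard definitions, sketched below using only the Python standard library. The function name `regression_metrics` and the sample values are assumptions made for illustration; they are not taken from the paper or its data.

```python
import math
from statistics import median


def regression_metrics(y_true, y_pred):
    """Return MAE, MedAE, MSE, RMSE and R^2 for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    mae = sum(abs_errors) / n              # mean absolute error
    medae = median(abs_errors)             # median absolute error
    mse = sum(e * e for e in errors) / n   # mean square error
    rmse = math.sqrt(mse)                  # root mean square error

    # R^2 = 1 - SS_res / SS_tot, undefined when y_true is constant.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MedAE": medae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Illustrative values only (not data from the study).
print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
```

MedAE is less sensitive to a single large miss than MAE or RMSE, which is one reason the three are often reported together.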


Author(s):  
Shen Xing-xing ◽  
Cao Wei-wei ◽  
Li Kai

Abstract: In this study, multivariate adaptive regression splines (MARS) models of order two and three were developed for predicting the California bearing ratio (CBR) value of pond ash stabilized with lime and lime sludge. The models took five input variables (maximum dry density, optimum moisture content, lime percentage, lime sludge percentage, and curing period) and produced CBR as the output variable. MARS-O3 gave the best results, with R² of 0.9565 and 0.9312 and PI of 0.0709 and 0.1061 for the training and testing phases, respectively. For both models, the estimated CBR values in the training and testing stages show acceptable agreement with the experimental results, indicating that the proposed equations can predict CBR values with high accuracy. Comparison of the two equations showed that MARS-O3 performs better than MARS-O2. Based on the error curves, the MARS-O3 model yields the lowest error percentage in CBR prediction, providing more accurate predictions than the other model developed. MARS-O3 is therefore recommended as the proposed model.


2021 ◽  
Author(s):  
Georgios Baskozos ◽  
Andreas Themistocleous ◽  
Harry L Hebert ◽  
Mathilde Pascal ◽  
Jishi John ◽  
...  

Abstract: Background: To improve the treatment of painful Diabetic Peripheral Neuropathy (DPN) and associated co-morbidities, a better understanding of the pathophysiology and risk factors for painful DPN is required. Using harmonised cohorts (N = 1230) we have built models that classify painful versus painless DPN. Methods: The Random Forest, Adaptive Regression Splines and Naive Bayes machine learning models were trained for classifying painful/painless DPN. Their performance was estimated using cross-validation in large cross-sectional cohorts (N = 935). Models were externally validated in a large population-based cohort (N = 295) in the presence of missing values. Variables were ranked for importance using model-specific metrics, and marginal effects of predictors were aggregated and assessed at the global level. Model selection was carried out using the Matthews Correlation Coefficient (MCC), and model performance was quantified in the validation set using MCC, the area under the precision/recall curve (AUPRC) and accuracy. Results: Random Forest (MCC = 0.28, AUPRC = 0.76) and Adaptive Regression Splines (MCC = 0.29, AUPRC = 0.77) were the best performing models and showed the smallest reduction in performance between the training and validation datasets. EQ5D index, the 10-item personality dimensions, HbA1c, Depression and Anxiety t-scores, age and Body Mass Index were consistently amongst the most powerful predictors in classifying painful vs painless DPN. Conclusions: Machine learning models trained on large cross-sectional cohorts were able to accurately classify painful or painless DPN on an independent population-based dataset. Painful DPN is associated with more depression, anxiety and certain personality traits. It is also associated with poorer self-reported quality of life, younger age, poor glucose control and high Body Mass Index (BMI). The models showed good performance in realistic conditions in the presence of missing values and noisy datasets. These models can be used either in the clinical context to assist patient stratification based on the risk of painful DPN or to return broad risk categories based on user input. The models' performance and calibration suggest that in both cases they could potentially improve diagnosis and outcomes by changing modifiable factors like BMI and HbA1c control and by instituting earlier preventive or supportive measures such as psychological interventions.
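The Matthews Correlation Coefficient used here for model selection can be computed directly from a binary confusion matrix. The sketch below uses the standard formula; the function name `matthews_corrcoef_from_counts` and the counts in the usage line are hypothetical and do not come from the study. Returning 0.0 when the denominator is zero is a common convention, adopted here as an assumption.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only.
print(matthews_corrcoef_from_counts(tp=6, tn=3, fp=1, fn=2))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the painful and painless classes are imbalanced.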


2021 ◽  
Vol 2021 (1) ◽  
pp. 1044-1053
Author(s):  
Nuri Taufiq ◽  
Siti Mariyah

The method used to rank the socioeconomic status of households in the Unified Database (Basis Data Terpadu) is to predict household expenditure using Proxy Means Testing (PMT). In general, this method is a prediction model based on regression techniques. The statistical model chosen is forward-stepwise. In practice, it is assumed that the predictor variables used in PMT are linearly correlated with the expenditure variable. This study applies a machine learning approach as an alternative prediction method to the forward-stepwise model. Models were built using several machine learning algorithms, such as Multivariate Adaptive Regression Splines (MARS), K-Nearest Neighbors, Decision Tree, and Bagging. The modelling results show that the machine learning models produce a lower average inclusion error (IE) than average exclusion error (EE). The machine learning models are effective at reducing IE but are not yet sensitive enough to reduce EE. The average IE of the machine learning models is 0.21, whereas the average IE of the PMT model is 0.29.


2021 ◽  
Vol 12 (5) ◽  
pp. 1-27
Author(s):  
Georgios Koutroulis ◽  
Leo Botler ◽  
Belgin Mutlu ◽  
Konrad Diwold ◽  
Kay Römer ◽  
...  

Recovering causality, beyond mere correlations, from copious time series data has been an important contributing factor in numerous scientific fields. Most existing works assume linearity in the data, an assumption that may not hold in many real-world scenarios. Moreover, it is usually not sufficient to infer the causal relationships alone. Identifying the correct time delay between cause and effect is vital for further insight and effective policies in inter-disciplinary domains. To bridge this gap, we propose KOMPOS, a novel algorithmic framework that combines a powerful concept from causal discovery of additive noise models with graphical ones. We primarily build our structural causal model from multivariate adaptive regression splines with inherent additive local nonlinearities, which render the underlying causal structure more easily identifiable. In contrast to other methods, our approach is not restricted to Gaussian or non-Gaussian noise, owing to the non-parametric nature of the regression method. We conduct extensive experiments on both synthetic and real-world datasets, demonstrating the superiority of the proposed algorithm over existing causal discovery methods, especially for the challenging cases of autocorrelated and non-stationary time series.


Mathematics ◽  
2021 ◽  
Vol 9 (21) ◽  
pp. 2696
Author(s):  
Nawin Raj ◽  
Zahra Gharineiat

Mean sea level rise is a significant emerging risk from climate change. This research paper uses artificial intelligence models to assess and predict the trend in mean sea level around northern Australian coastlines. The study uses sea-level time series from four sites (Broome, Darwin, Cape Ferguson, Rosslyn Bay) to make the prediction. Multivariate adaptive regression splines (MARS) and artificial neural network (ANN) algorithms have been implemented to build the prediction model. Both models show high accuracy (R² > 0.98) and low error values (RMSE < 27%) overall. The ANN model showed slightly better performance than MARS over the selected sites. The ANN performance was further assessed for modelling storm surges associated with cyclones. The model reproduced the surge profile with maximum correlation coefficients of ~0.99 and minimum RMS errors of ~4 cm at selected validation sites. In addition, the ANN model predicted the maximum surge at Rosslyn Bay for cyclone Marcia to within 2 cm of the measured peak and the maximum surge at Broome for cyclone Narelle to within 7 cm of the measured peak. The results are comparable with a MARS model previously used in this region; however, the ANN shows better agreement with the measured peak and arrival time, although its predictions are slightly higher than the sea level observed at the tide gauge stations.

