root mean squared error
Recently Published Documents



2024 ◽  
Vol 84 ◽  
Author(s):  
A. Yousafzai ◽  
W. Manzoor ◽  
G. Raza ◽  
T. Mahmood ◽  
F. Rehman ◽  
...  

Abstract This study aimed to develop and evaluate data-driven models for predicting forest yield under different climate change scenarios in the Gallies forest division of district Abbottabad, Pakistan. Random Forest (RF) and Kernel Ridge Regression (KRR) models were developed and evaluated using yield data of two species (Blue pine and Silver fir) as the target variable and climate data (temperature, humidity, rainfall and wind speed) as predictor variables. The prediction accuracy of both models was assessed by means of root mean squared error (RMSE), mean absolute error (MAE), correlation coefficient (r), relative root mean squared error (RRMSE), Legates-McCabe's index (LM), Willmott's index (WI) and Nash-Sutcliffe efficiency (NSE). Overall, the RF model outperformed the KRR model, forecasting forest yield with higher accuracy. The study strongly recommends that the RF model be applied in other regions of the country for prediction of forest growth and yield, which may help in the management and future planning of forest productivity in Pakistan.
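The accuracy metrics named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration using the commonly cited textbook forms, not the authors' code. The abstract does not define RRMSE, so the version here (RMSE as a percentage of the mean observed value) is an assumption, and all function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean squared error, in the units of the observed variable."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rrmse(obs, pred):
    """Relative RMSE in percent (assumed here: RMSE / mean observed * 100)."""
    return 100.0 * rmse(obs, pred) / _mean(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mo = _mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def willmott_index(obs, pred):
    """Willmott's index of agreement (original squared-difference form)."""
    mo = _mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    potential = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / potential


def legates_mccabe(obs, pred):
    """Legates-McCabe index: like NSE, but with absolute instead of squared errors."""
    mo = _mean(obs)
    abs_err = sum(abs(o - p) for o, p in zip(obs, pred))
    return 1.0 - abs_err / sum(abs(o - mo) for o in obs)


# Made-up illustrative numbers, not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(rmse(observed, predicted), nse(observed, predicted))
```

RMSE and MAE are in the units of the target variable, while NSE, WI and LM are dimensionless and equal 1 for a perfect fit, which is one reason studies often report several of them together.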


2021 ◽  
pp. 1-21
Author(s):  
Elsa Arrua-Duarte ◽  
Marta Migoya-Borja ◽  
Igor Barahona ◽  
Lena C. Quilty ◽  
Sakina J. Rizvi ◽  
...  

Abstract Objective: The Dimensional Anhedonia Rating Scale (DARS) is a recently validated questionnaire for assessing anhedonia. In this work we aim to study the equivalence between the traditional paper-and-pencil format and the digital format of the DARS. Methods: Sixty-nine patients completed the DARS in both paper-based and digital versions. We assessed differences between formats (Wilcoxon test), validity of the scales (Kappa and Intraclass Correlation Coefficients), and reliability (Cronbach’s alpha and Guttman’s coefficient). We calculated the Comparative Fit Index and the Root Mean Squared Error associated with the proposed one-factor structure. Results: Total scores were higher for the paper-based format. Significant differences between the two formats were found for three items. The weighted Kappa coefficient was approximately 0.40 for most of the items. Internal consistency was greater than 0.94, and the Intraclass Correlation Coefficient was 0.95 for the digital version and 0.94 for the paper-and-pencil version (F = 16.7, p < 0.001). The Comparative Fit Index was 0.97 for the digital DARS and 0.97 for the paper-and-pencil DARS, and the Root Mean Squared Error was 0.11 for the digital DARS and 0.10 for the paper-and-pencil DARS. Conclusion: The digital DARS is consistent in many respects with the paper-and-pencil questionnaire, but equivalence with this format cannot be assumed without caution.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Florian Huber ◽  
Sven van der Burg ◽  
Justin J. J. van der Hooft ◽  
Lars Ridder

Abstract Mass spectrometry data is one of the key sources of information in many workflows in medicine and across the life sciences. Mass fragmentation spectra are generally considered to be characteristic signatures of the chemical compound they originate from, yet the chemical structure itself usually cannot be easily deduced from the spectrum. Often, spectral similarity measures are used as a proxy for structural similarity, but this approach is strongly limited by a generally poor correlation between the two metrics. Here, we propose MS2DeepScore: a novel Siamese neural network to predict the structural similarity between two chemical structures solely based on their MS/MS fragmentation spectra. Using a cleaned dataset of > 100,000 mass spectra of about 15,000 unique known compounds, we trained MS2DeepScore to predict structural similarity scores for spectrum pairs with high accuracy. In addition, sampling different model varieties through Monte-Carlo Dropout is used to further improve the predictions and assess the model’s prediction uncertainty. On 3600 spectra of 500 unseen compounds, MS2DeepScore is able to identify highly reliable structural matches and to predict Tanimoto scores for pairs of molecules based on their fragment spectra with a root mean squared error of about 0.15. The prediction uncertainty estimate can additionally be used to select a subset of predictions with a root mean squared error of about 0.1. Furthermore, we demonstrate that MS2DeepScore outperforms classical spectral similarity measures in retrieving chemically related compound pairs from large mass spectral datasets, thereby illustrating its potential for spectral library matching. Finally, MS2DeepScore can also be used to create chemically meaningful mass spectral embeddings that could be used to cluster large numbers of spectra. Together with the recently introduced unsupervised Spec2Vec metric, we believe that machine learning-supported mass spectral similarity measures have great potential for a range of metabolomics data processing pipelines.


Methodology ◽  
2021 ◽  
Vol 17 (3) ◽  
pp. 189-204
Author(s):  
Cailey E. Fitzgerald ◽  
Ryne Estabrook ◽  
Daniel P. Martin ◽  
Andreas M. Brandmaier ◽  
Timo von Oertzen

Missing data are ubiquitous in psychological research. They may come about as an unwanted result of coding or computer error, participants' non-response or absence, or missing values may be intentional, as in planned missing designs. We discuss the effects of missing data on χ²-based goodness-of-fit indices in Structural Equation Modeling (SEM), specifically on the Root Mean Squared Error of Approximation (RMSEA). We use simulations to show that naive implementations of the RMSEA have a downward bias in the presence of missing data and, thus, overestimate model goodness-of-fit. Unfortunately, many state-of-the-art software packages report the biased form of RMSEA. As a consequence, the scientific community may have been accepting a much larger fraction of models with non-acceptable model fit. We propose a bias-correction for the RMSEA based on information-theoretic considerations that take into account the expected misfit of a person with fully observed data. The corrected RMSEA is asymptotically independent of the proportion of missing data for misspecified models. Importantly, results of the corrected RMSEA computation are identical to naive RMSEA if there are no missing data.
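For orientation, the conventional complete-data RMSEA that the abstract above calls "naive" is usually written as follows. This is the standard textbook form, stated here as background and not the authors' bias-corrected estimator. The symbols are assumptions of this note: χ² is the model test statistic, df its degrees of freedom, and N the sample size. Some software divides by N rather than N − 1.

```latex
\mathrm{RMSEA} \;=\; \sqrt{\max\!\left( \frac{\chi^{2} - \mathit{df}}{\mathit{df}\,(N - 1)},\; 0 \right)}
```

Because χ² enters the numerator directly, anything that shrinks the test statistic also shrinks the index. This is consistent with the downward bias under missing data that the abstract reports.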


2021 ◽  
Vol 74 (3) ◽  
pp. 9675-9684
Author(s):  
Tatiana María Saldaña Villota ◽  
José Miguel Cotes Torres

This study presents a comparison of the usual statistical methods used for crop model assessment. A case study was conducted using a data set from observations of the total dry weight in a diploid potato crop, and six simulated data sets derived from the observations that aimed to predict the measured data. Statistical indices such as the coefficient of determination, the root mean squared error, the relative root mean squared error, mean error, index of agreement, modified index of agreement, revised index of agreement, modeling efficiency, and revised modeling efficiency were compared. The results showed that the coefficient of determination is not a useful statistical index for model evaluation. The root mean squared error together with the relative root mean squared error give a clear picture of how far the simulations deviate from the observations, both in the units of the variable and in percentage terms, and they leave no doubt when evaluating the quality of the simulations of a model.


2021 ◽  
pp. 202-208
Author(s):  
Daniel Theodorus ◽  
Sarjon Defit ◽  
Gunadi Widi Nurcahyo

Industry 4.0 is pushing many companies to transform to digital systems. Machine learning is one solution for data analysis, and data analysis is key to providing the best service (user experience) to customers. The site of this study is PT. Sentral Tukang Indonesia, which sells building materials and carpentry tools such as paint, plywood, aluminium, ceramics, and HPL. The large volume of available data makes it difficult for the company to recommend products to customers. Recommender systems have emerged as a solution for recommending products based on the interactions between a customer and other customers found in the sales history data. The aim of this study is to help the company recommend products so that it can increase sales, make it easier for customers to find the products they need, and improve service to customers. The data used are the sales history for one period (Q1 2021), customer data, and product data from PT. Sentral Tukang Indonesia. The sales history is split into 80% for the training dataset and 20% for the testing dataset. The item-based collaborative filtering method in this study uses the cosine similarity algorithm to compute the similarity between products. Scores are predicted with the weighted sum formula, and the error is computed with the root mean squared error formula. The results show the top 10 recommended products per customer. The products displayed are those with the highest scores for that customer. This study can serve as a reference for companies in recommending the products their customers need.
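The pipeline described in this abstract (cosine similarity between products, weighted-sum score prediction, RMSE on held-out data) can be sketched as below. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation. The interaction matrix layout (rows are customers, columns are products, 0 means no purchase), the tiny example data, and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def item_similarities(matrix):
    """Pairwise cosine similarity between the product columns of the matrix."""
    n_items = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_items)]
    return [[cosine_similarity(columns[i], columns[j]) for j in range(n_items)]
            for i in range(n_items)]


def predict_score(matrix, sims, customer, item):
    """Weighted sum: similarity-weighted average of the customer's known scores."""
    numerator = 0.0
    denominator = 0.0
    for other, score in enumerate(matrix[customer]):
        if other != item and score > 0:
            numerator += sims[item][other] * score
            denominator += abs(sims[item][other])
    return numerator / denominator if denominator else 0.0


def recommend(matrix, sims, customer, top_n=10):
    """Rank products the customer has not bought yet by predicted score."""
    candidates = [(item, predict_score(matrix, sims, customer, item))
                  for item, score in enumerate(matrix[customer]) if score == 0]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)[:top_n]


def rmse(pairs):
    """RMSE over (actual, predicted) pairs, e.g. from the 20% test split."""
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


# Hypothetical purchase counts: 2 customers x 3 products.
interactions = [
    [2, 2, 0],
    [1, 1, 3],
]
similarities = item_similarities(interactions)
print(recommend(interactions, similarities, customer=0))
```

In a setup like the one the abstract describes, the similarities would be computed on the training split only, and `rmse` would compare the predicted scores with the held-out test interactions.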


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zhiming Wang ◽  
Weimin Liu

Abstract Based on wind speed, direction and power data, an assessment method of wind energy potential using finite mixture statistical distributions is proposed. Considering the correlation and interaction between wind speed and direction, the angular-linear modeling approach is adopted to construct the joint probability density function of wind speed and direction. For modeling the distribution of wind power density and estimating model parameters of null or low wind speed and multimodal wind speed data, based on the expectation-maximization algorithm, a two-component three-parameter Weibull mixture distribution is chosen as the wind speed model, and von Mises mixture distributions with nine components and six components are selected as the models of wind direction and of the correlation circular variable between wind speed and direction, respectively. A comprehensive technique of model selection, which includes the Akaike information criterion, the Bayesian information criterion, the coefficient of determination R2 and root mean squared error, is used to select the optimal model among all candidate models. The proposed method is applied to averaged 10-min field monitoring wind data, compared with the other estimation methods, and judged by the values of R2 and root mean squared error, histogram plot and wind rose diagram. The results show that the proposed method is effective, that the area under study is not suitable for wide wind turbine applications, and that the estimated wind energy potential would be inaccurate without considering the influence of wind direction.
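The information criteria used for model selection in the abstract above have simple closed forms, sketched here in Python. This is a generic illustration, not the authors' procedure. The candidate names, log-likelihood values and parameter counts are made-up placeholders, and `select_best` is a hypothetical helper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_best(candidates, n_obs, criterion="bic"):
    """Return the name of the candidate with the lowest criterion value.

    candidates maps a model name to (log_likelihood, n_params).
    """
    def score(name):
        log_l, k = candidates[name]
        return bic(log_l, k, n_obs) if criterion == "bic" else aic(log_l, k)
    return min(candidates, key=score)


# Placeholder fits: (maximized log-likelihood, number of free parameters).
fits = {
    "one_component": (-120.0, 3),
    "two_component": (-100.0, 7),
}
print(select_best(fits, n_obs=1000, criterion="aic"))
```

Both criteria penalize extra parameters, with BIC penalizing them more heavily for larger samples, so they can disagree. That is one motivation for combining them with fit measures such as R2 and RMSE, as the abstract describes.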


2021 ◽  
Vol 1 (2) ◽  
pp. 772-785
Author(s):  
Dieta Putri Jarwanti ◽  
Ery Suhartanto ◽  
Jadfan Sidqi Fidari ◽  
...  

Rain gauge stations in Indonesia are still unevenly distributed, even though the rainfall data they produce are very important. A validation analysis against TRMM satellite data is therefore needed, because the satellite data cover wide areas, are available in near real time, and can be accessed quickly. This study aims to validate the satellite data against observational data in the Grindulu watershed (DAS Grindulu), whose data are considered complete and reliable. The satellite data can later be used as a fallback when observed rainfall data are erroneous or unavailable. The validation methods used are Root Mean Squared Error (RMSE), the relative error test (Kesalahan Relatif, KR), Nash-Sutcliffe Efficiency (NSE), and the correlation coefficient (R). The study uses two stages of calculation: validation of uncorrected data and validation of corrected data, where the corrected data are first calibrated. The best validation result for corrected TRMM data was obtained for the monthly period with a 9-year calibration span and a 1-year validation span, giving NSE = 0.929, R = 0.969, RMSE = 46.48, and KR = 8.9%. These results show that corrected TRMM data perform better than uncorrected TRMM data, because the NSE and R values are close to one and the RMSE and relative error are low. Overall, it can be concluded that TRMM data can be used as an alternative hydrological data source in the Grindulu watershed.


Author(s):  
Jeroen Schmidt ◽  
Elenna Dugundji ◽  
Bas Schotten

Effective parking policy is essential for cities to reduce the demand their road networks experience and to combat their carbon footprints. Existing research on applying machine learning to understand parking behavior assumes that cities have prohibitively expensive stationary parking sensors installed, while no research has yet attempted to use machine learning to impute parking behavior from mobile probe data of sparsely monitored areas. To this end, this paper shows that it is indeed feasible to impute parking pressure (occupation as a percentage). Gradient boosted trees were found to perform the best, with an R2 score of 0.20 and a root mean squared error (RMSE) of 0.087. This paper also found that three unique parking occupancy patterns exist in Amsterdam and that this information, in combination with neighborhood characteristics, has an impact on imputation under certain conditions.


2021 ◽  
Author(s):  
Zhiming Wang ◽  
Weimin Liu

Abstract Based on wind speed, direction and power data, an assessment method of wind energy potential using finite mixture statistical distributions is proposed. Considering the correlation and interaction between wind speed and direction, the angular-linear modeling approach is adopted to construct the joint probability density function of wind speed and direction. For modeling the distribution of wind power density and estimating model parameters, based on the expectation-maximization algorithm, a two-component three-parameter Weibull mixture distribution is chosen as the wind speed model, and von Mises mixture distributions with nine components and six components are selected as the wind direction and relation circular variable models, respectively. A comprehensive technique of model selection, which includes the Akaike information criterion, the Bayesian information criterion, the coefficient of determination R2 and root mean squared error, is used to select the optimal model among all candidate models. The proposed method is applied to averaged 10-minute field monitoring wind data, compared with the other estimation methods, and judged by the values of R2 and root mean squared error, histogram plot and wind rose diagram. The results show that the proposed method is effective, that the area under study is not suitable for wide wind turbine applications, and that the estimated wind energy potential would be inaccurate without considering the influence of wind direction.

