Comparison of statistical indices for the evaluation of crop models performance

This study compares the statistical methods commonly used for crop model assessment. A case study was conducted using a data set of observed total dry weight in a diploid potato crop, together with six simulated data sets that were derived from the observations and aimed to predict the measured data. The statistical indices compared were the coefficient of determination, the root mean squared error, the relative root mean squared error, the mean error, the index of agreement, the modified index of agreement, the revised index of agreement, the modeling efficiency, and the revised modeling efficiency. The results showed that the coefficient of determination is not a useful statistical index for model evaluation. The root mean squared error together with the relative root mean squared error indicate how far the simulations deviate, both in the units of the variable and in percentage terms, and they leave no doubt when evaluating the quality of a model's simulations.
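
As an illustration of why the coefficient of determination can mislead while RMSE and RRMSE do not, the sketch below computes four of the indices on made-up numbers. The function names, the sign convention for mean error (simulated minus observed), and the definition of RRMSE as RMSE divided by the observed mean are assumptions made for this example, not details taken from the paper.

```python
import math

def rmse(obs, sim):
    """Root mean squared error, in the units of the variable."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def rrmse(obs, sim):
    """RMSE as a percentage of the observed mean (assumed definition)."""
    return 100.0 * rmse(obs, sim) / (sum(obs) / len(obs))

def mean_error(obs, sim):
    """Average of (simulated - observed); positive means overestimation."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)

def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)

# Hypothetical data: the simulation overestimates every observation by 3 units.
obs = [2.0, 4.0, 6.0, 8.0]
sim = [o + 3.0 for o in obs]

r2 = r_squared(obs, sim)      # 1.0: a perfect score despite the constant bias
err = rmse(obs, sim)          # 3.0 units
rel = rrmse(obs, sim)         # 60.0 percent of the observed mean
bias = mean_error(obs, sim)   # 3.0 units
```

In this toy case the coefficient of determination reports a perfect fit, whereas RMSE, RRMSE, and mean error all expose the systematic overestimate.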

Calibration-Based Estimators using Different Distance Measures under Two Auxiliary Variables: A Comparative Study

Journal of Modern Applied Statistical Methods ◽

10.22237/jmasm/1619481600 ◽

2021 ◽

Vol 19 (1) ◽

pp. 2-20

Author(s):

Piyush Kant Rai ◽

Alka Singh ◽

Muhammad Qasim

Keyword(s):

Mean Squared Error ◽

Real Life ◽

Distance Functions ◽

Distance Measures ◽

Auxiliary Variables ◽

Data Set ◽

Life Data ◽

Squared Error ◽

Real Life Data ◽

Relative Root

This article introduces calibration estimators under different distance measures based on two auxiliary variables in stratified sampling. The theory of the calibration estimator is presented. The calibrated weights based on different distance functions are also derived. A simulation study has been carried out to judge the performance of the proposed estimators based on the minimum relative root mean squared error criterion. A real-life data set is also used to confirm the supremacy of the proposed method.

Implementation of a Demand-Side Management Solution for South Korea’s Demand Response Program

Applied Sciences ◽

10.3390/app10051751 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1751 ◽

Cited By ~ 1

Author(s):

Wonsuk Ko ◽

Hamsakutty Vettikalladi ◽

Seung-Ho Song ◽

Hyeong-Jin Choi

Keyword(s):

South Korea ◽

Real Time ◽

Demand Response ◽

Mean Squared Error ◽

Demand Side Management ◽

Demand Side ◽

Root Mean Squared Error ◽

Squared Error ◽

Demand Response Program ◽

Relative Root

In this paper, we show the development of a demand-side management solution (DSMS) for demand response (DR) aggregator and actual demand response operation cases in South Korea. To show an experience, Korea’s demand response market outline, functions of DSMS, real contracted capacity, and payment between consumer and load aggregator and DR operation cases are revealed. The DSMS computes the customer baseline load (CBL), relative root mean squared error (RRMSE), and payments of the customers in real time. The case of 10 MW contracted customers shows 108.03% delivery rate and a benefit of 854,900,394 KRW for two years. The results illustrate that an integrated demand-side management solution contributes by participating in a DR market and gives a benefit and satisfaction to the consumer.

DSSAT-CSM Soil Module: Modeling Topsoil Water Holding Capacity in the two Dry Savanna Zones of Kano State, Nigeria.

10.36265/njss.2019.290201 ◽

2019 ◽

pp. 1-6

Keyword(s):

Mean Squared Error ◽

Coefficient Of Determination ◽

Intrinsic Properties ◽

Squared Error ◽

Crop Simulation ◽

Index Of Agreement ◽

Sudan Savanna ◽

Water Holding ◽

Nigerian Savanna ◽

Savanna Soil

The objective of this study was to test the efficiency of the hydraulic pedotransfer functions (PTFs) employed in the Decision Support System for Agrotechnology Transfer – Crop Simulation Model (DSSAT-CSM) in modeling topsoil water holding capacity (WHC) in the Northern Guinea Savanna (NGS) and Sudan Savanna (SS) of Kano State in Nigeria. Coefficient of determination (R²), Root Mean Squared Error (RMSE), and Index of Agreement (d-index) were the three statistical methods used to test the fit between predicted and laboratory-observed WHC of disturbed, auger-sampled topsoil. Findings of the study established that the PTFs fitted in the algorithm of the DSSAT-CSM soil water submodule made a significant topsoil WHC estimation in NGS, with statistics R² = 0.352, RMSE = 0.03, and d-index = 0.71. However, the model did not estimate the WHC validly in Sudan Savanna, with insignificant statistics of R² = 0.031, RMSE = 0.10, and d-index = 0.44. The conclusion drawn was that DSSAT made fair and poor predictions of topsoil WHC in NGS and SS soils respectively, irrespective of texture and other intrinsic properties. Based on these findings, we recommend the development of local PTF alternatives to be used with DSSAT's algorithm for Nigerian Savanna soil.
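
For readers unfamiliar with the d-index, the following is a minimal sketch of the index of agreement in its original Willmott form, which is assumed here because the abstract does not say which variant was used. The WHC values are invented purely for illustration.

```python
def index_of_agreement(obs, pred):
    """Index of agreement d: 1 means perfect agreement, 0 means none.

    d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
    The denominator is zero only if every observed and predicted value
    equals the observed mean; that degenerate case is not handled here.
    """
    mean_obs = sum(obs) / len(obs)
    numerator = sum((p - o) ** 2 for o, p in zip(obs, pred))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - numerator / denominator

# Hypothetical water holding capacity values (volume fraction).
observed = [0.10, 0.15, 0.20, 0.25]
predicted = [0.12, 0.14, 0.22, 0.24]

d_model = index_of_agreement(observed, predicted)
d_perfect = index_of_agreement(observed, observed)  # identical series give 1.0
```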

Comparison of sampling methods for estimation of nearest-neighbor index values

Canadian Journal of Forest Research ◽

10.1139/cjfr-2016-0239 ◽

2017 ◽

Vol 47 (6) ◽

pp. 703-715 ◽

Cited By ~ 2

Author(s):

Francisco Mauro ◽

Zane Haxtema ◽

Hailemariam Temesgen

Keyword(s):

Mean Squared Error ◽

Relative Bias ◽

Random Selection ◽

Horizontal Distance ◽

Root Mean Squared Error ◽

Reference Tree ◽

Squared Error ◽

Relative Root ◽

Diameter Differentiation ◽

Selection Of

Neighborhood-based indices such as mingling index and diameter differentiation are a set of diversity measures that are based on the relationship between a reference tree and a certain number of nearest neighbors (i.e., trees to which it has the lowest horizontal distance). Using stem-mapped data from eight headwater sites, we compared the relative bias and relative root mean square error (relative to the true mean of each site) of several different methods of choosing reference trees for calculation of diameter differentiation ([Formula: see text]) and species mingling ([Formula: see text]) index. Indices were defined using two, three, and four neighbors and methods for selection of the reference tree were random selection of a tree in a fixed-radius plot (FI), random selection of a tree in a variable-radius plot (VA), azimuth selection method (AZ), and nearest tree selection (NT). In general, the relative bias was lower than ±2.5% for [Formula: see text] and lower than ±10% for [Formula: see text] regardless of the method. The FI method consistently had the lowest relative bias and relative root mean squared error. The NT and AZ methods were second in terms of relative root mean squared error for [Formula: see text] and [Formula: see text], respectively. Simplicity of these two methods might outweigh their slightly worse performance.

Use of reflectance spectroscopy to estimate the organic carbon and CaCO3 contents of soils

Agrokémia és Talajtan ◽

10.1556/agrokem.60.2012.2.5 ◽

2012 ◽

Vol 61 (2) ◽

pp. 277-290 ◽

Cited By ~ 1

Author(s):

Ádám Csorba ◽

Vince Láng ◽

László Fenyvesi ◽

Erika Michéli

Keyword(s):

Organic Carbon ◽

Least Squares ◽

Partial Least Squares ◽

Partial Least Squares Regression ◽

Mean Squared Error ◽

Reflectance Spectroscopy ◽

Least Squares Regression ◽

Root Mean Squared Error ◽

Squared Error

There is a growing demand for the development and application of technologies and methods that enable rapid, cost-effective, and environmentally friendly soil data collection and evaluation. Reflectance spectroscopy, which is based on reflectance measurements in the visible (VIS) and near-infrared (NIR) ranges (350–2500 nm) of the electromagnetic spectrum, meets these demands. Because the reflectance spectrum recorded from soils is very rich in information, and many soil constituents have a characteristic spectral "fingerprint" in the studied range, a large number of key soil parameters can be determined simultaneously from a single curve. In this paper we present the first steps of a methodological development, built on the foundations of reflectance spectroscopy, aimed at determining soil composition. We built and tested predictive models based on multivariate mathematical-statistical methods (partial least squares regression, PLSR) for estimating the organic carbon and CaCO3 contents of soils. Testing of the models showed that the procedure gave high R² values for both soil parameters [R²(organic carbon) = 0.815; R²(CaCO3) = 0.907]. The root mean squared error (RMSE), which indicates the accuracy of the estimate, can be considered moderate for both parameters [RMSE(organic carbon) = 0.467; RMSE(CaCO3) = 3.508], and could be improved considerably by standardizing the reflectance measurement protocols. Based on our investigations, we concluded that combining reflectance spectroscopy with multivariate chemometric procedures can yield a rapid and cost-effective method of data collection and evaluation.

Comparative study of the pencil-and-paper and digital formats of the Spanish DARS scale

Acta Neuropsychiatrica ◽

10.1017/neu.2021.45 ◽

2021 ◽

pp. 1-21

Author(s):

Elsa Arrua-Duarte ◽

Marta Migoya-Borja ◽

Igor Barahona ◽

Lena C. Quilty ◽

Sakina J. Rizvi ◽

...

Keyword(s):

Rating Scale ◽

Mean Squared Error ◽

Intraclass Correlation ◽

Test Validity ◽

Wilcoxon Test ◽

Digital Version ◽

Root Mean Squared Error ◽

Squared Error ◽

Digital Format ◽

Paper And Pencil

Abstract Objective: The Dimensional Anhedonia Rating Scale (DARS) is a novel, recently validated questionnaire for assessing anhedonia. In this work we aim to study the equivalence between the traditional paper-and-pencil and the digital format of the DARS. Methods: 69 patients completed the DARS in both paper-based and digital versions. We assessed differences between formats (Wilcoxon test), validity of the scales (Kappa and Intraclass Correlation Coefficients), and reliability (Cronbach’s alpha and Guttman’s coefficient). We calculated the Comparative Fit Index and the Root Mean Squared Error associated with the proposed one-factor structure. Results: Total scores were higher for the paper-based format. Significant differences between the two formats were found for three items. The weighted Kappa coefficient was approximately 0.40 for most of the items. Internal consistency was greater than 0.94, and the Intraclass Correlation Coefficient was 0.95 for the digital version and 0.94 for the paper-and-pencil version (F = 16.7, p < 0.001). The Comparative Fit Index was 0.97 for the digital DARS and 0.97 for the paper-and-pencil DARS, and the Root Mean Squared Error was 0.11 for the digital DARS and 0.10 for the paper-and-pencil DARS. Conclusion: The digital DARS is consistent in many respects with the paper-and-pencil questionnaire, but equivalence with this format cannot be assumed without caution.
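
The internal-consistency figure reported above is a Cronbach's alpha. A minimal sketch of the standard formula follows, using sample variances and a small invented score matrix; the function name and the data are hypothetical and do not come from the study.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per questionnaire item, each with one
    score per respondent. alpha = k/(k-1) * (1 - sum(item variances) /
    variance of respondents' total scores).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))

# Hypothetical scores: 3 items, 4 respondents, items move in lockstep.
scores = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(scores)  # items are perfectly consistent, so alpha is 1
```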

Prediksi Indeks Harga Saham Gabungan (IHSG) Menggunakan Algoritma Neural Network

Jurnal Edukasi dan Penelitian Informatika (JEPIN) ◽

10.26418/jp.v4i1.25384 ◽

2018 ◽

Vol 4 (1) ◽

pp. 24

Author(s):

Imam Halimi ◽

Wahyu Andhyka Kusuma

Keyword(s):

Neural Network ◽

Data Mining ◽

Linear Regression ◽

Mean Squared Error ◽

Composite Index ◽

T Test ◽

Sliding Windows ◽

Root Mean Squared Error ◽

Squared Error

Stock investment is a familiar activity, both to hear about and to engage in. There are various kinds of stocks in Indonesia, one of which is the Indeks Harga Saham Gabungan (IHSG), known in English as the Indonesia Composite Index, ICI, or IDX Composite. The IHSG is an important parameter to consider when investing, since it is a composite of stocks. This study aims to predict the movement of the IHSG with data mining techniques using a neural network algorithm, compared against a linear regression algorithm, which investors can use as a reference when investing. The results are Root Mean Squared Error (RMSE) values, along with an additional label holding the predicted figures, obtained after validation with sliding windows validation. The best results were 37.786 for the neural network test with windowing and 13.597 without windowing; for linear regression, the values were 35.026 with windowing and 12.657 without windowing. A T-Test showed that the difference between the neural network and linear regression was not significant; the T-Test value was the same with and without windowing, namely 1.000.
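
The abstract does not describe how the windowing was configured, so the following is only a generic sketch of how a price series can be turned into sliding-window training pairs. The function name, the window length, and the index values are all hypothetical.

```python
def sliding_windows(series, window):
    """Split a series into (inputs, target) pairs.

    Each pair uses `window` consecutive values as inputs and the value
    that immediately follows them as the prediction target.
    """
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]

# Hypothetical closing values of an index.
closes = [10, 11, 12, 13, 14]
pairs = sliding_windows(closes, window=3)
# pairs == [([10, 11, 12], 13), ([11, 12, 13], 14)]
```

Each pair can then be fed to any regressor, whether a neural network or a linear model, so both algorithms are evaluated on identical inputs.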

Automated Bale Mapping Using Machine Learning and Photogrammetry

Remote Sensing ◽

10.3390/rs13224675 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4675

Author(s):

William Yamada ◽

Wei Zhao ◽

Matthew Digman

Keyword(s):

Lower Cost ◽

Mean Squared Error ◽

Spatial Clustering ◽

Mean Average Precision ◽

Average Precision ◽

Data Set ◽

Map Projection ◽

Squared Error ◽

Aerial Vehicle ◽

The Impact

An automatic method of obtaining geographic coordinates of bales using monovision un-crewed aerial vehicle imagery was developed utilizing a data set of 300 images with a 20-megapixel resolution containing a total of 783 labeled bales of corn stover and soybean stubble. The relative performance of image processing with Otsu’s segmentation, you only look once version three (YOLOv3), and region-based convolutional neural networks was assessed. As a result, the best option in terms of accuracy and speed was determined to be YOLOv3, with 80% precision, 99% recall, 89% F1 score, 97% mean average precision, and a 0.38 s inference time. Next, the impact of using lower-cost cameras was evaluated by reducing image quality to one megapixel. The lower-resolution images resulted in decreased performance, with 79% precision, 97% recall, 88% F1 score, 96% mean average precision, and 0.40 s inference time. Finally, the output of the YOLOv3 trained model, density-based spatial clustering, photogrammetry, and map projection were utilized to predict the geocoordinates of the bales with a root mean squared error of 2.41 m.
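
The F1 score is the harmonic mean of precision and recall, so the reported figures can be roughly cross-checked. The sketch below assumes that standard definition and uses the rounded percentages from the abstract as inputs.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# Rounded values from the abstract (20-megapixel and 1-megapixel images).
f1_full_res = f1_score(0.80, 0.99)  # about 0.885
f1_low_res = f1_score(0.79, 0.97)   # about 0.871
```

Both results land about one percentage point below the reported 89% and 88%, a gap small enough to be explained by the precision and recall inputs having been rounded to whole percentages.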

Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study

JMIR Medical Informatics ◽

10.2196/27386 ◽

2021 ◽

Vol 9 (12) ◽

pp. e27386

Author(s):

Qingyu Chen ◽

Alex Rankine ◽

Yifan Peng ◽

Elaheh Aghaarabi ◽

Zhiyong Lu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Mean Squared Error ◽

Pearson Correlation ◽

Data Set ◽

Squared Error ◽

Real Time Applications ◽

Effectiveness And Efficiency ◽

Pearson Correlations

Background: Semantic textual similarity (STS) measures the degree of relatedness between sentence pairs. The Open Health Natural Language Processing (OHNLP) Consortium released an expertly annotated STS data set and called for the National Natural Language Processing Clinical Challenges. This work describes our entry, an ensemble model that leverages a range of deep learning (DL) models. Our team from the National Library of Medicine obtained a Pearson correlation of 0.8967 in an official test set during 2019 National Natural Language Processing Clinical Challenges/Open Health Natural Language Processing shared task and achieved a second rank. Objective: Although our models strongly correlate with manual annotations, annotator-level correlation was only moderate (weighted Cohen κ=0.60). We are cautious of the potential use of DL models in production systems and argue that it is more critical to evaluate the models in-depth, especially those with extremely high correlations. In this study, we benchmark the effectiveness and efficiency of top-ranked DL models. We quantify their robustness and inference times to validate their usefulness in real-time applications. Methods: We benchmarked five DL models, which are the top-ranked systems for STS tasks: Convolutional Neural Network, BioSentVec, BioBERT, BlueBERT, and ClinicalBERT. We evaluated a random forest model as an additional baseline. For each model, we repeated the experiment 10 times, using the official training and testing sets. We reported 95% CI of the Wilcoxon rank-sum test on the average Pearson correlation (official evaluation metric) and running time. We further evaluated Spearman correlation, R², and mean squared error as additional measures. Results: Using only the official training set, all models obtained highly effective results. BioSentVec and BioBERT achieved the highest average Pearson correlations (0.8497 and 0.8481, respectively). BioSentVec also had the highest results in 3 of 4 effectiveness measures, followed by BioBERT. However, their robustness to sentence pairs of different similarity levels varies significantly. A particular observation is that BERT models made the most errors (a mean squared error of over 2.5) on highly similar sentence pairs. They cannot capture highly similar sentence pairs effectively when they have different negation terms or word orders. In addition, time efficiency is dramatically different from the effectiveness results. On average, the BERT models were approximately 20 times and 50 times slower than the Convolutional Neural Network and BioSentVec models, respectively. This results in challenges for real-time applications. Conclusions: Despite the excitement of further improving Pearson correlations in this data set, our results highlight that evaluations of the effectiveness and efficiency of STS models are critical. In future, we suggest more evaluations on the generalization capability and user-level testing of the models. We call for community efforts to create more biomedical and clinical STS data sets from different perspectives to reflect the multifaceted notion of sentence-relatedness.

Comparison of different efficiency criteria for hydrological model assessment

Advances in Geosciences ◽

10.5194/adgeo-5-89-2005 ◽

2005 ◽

Vol 5 ◽

pp. 89-97 ◽

Cited By ~ 1307

Author(s):

P. Krause ◽

D. P. Boyle ◽

F. Bäse

Keyword(s):

Hydrological Model ◽

Objective Assessment ◽

Hydrologic Model ◽

Coefficient Of Determination ◽

Evaluation Procedure ◽

Efficiency Coefficient ◽

Model Assessment ◽

Different Types ◽

Index Of Agreement ◽

And Performance

Abstract. The evaluation of hydrologic model behaviour and performance is commonly made and reported through comparisons of simulated and observed variables. Frequently, comparisons are made between simulated and measured streamflow at the catchment outlet. In distributed hydrological modelling approaches, additional comparisons of simulated and observed measurements for multi-response validation may be integrated into the evaluation procedure to assess overall modelling performance. In both approaches, single and multi-response, efficiency criteria are commonly used by hydrologists to provide an objective assessment of the "closeness" of the simulated behaviour to the observed measurements. While there are a few efficiency criteria such as the Nash-Sutcliffe efficiency, coefficient of determination, and index of agreement that are frequently used in hydrologic modeling studies and reported in the literature, there are a large number of other efficiency criteria to choose from. The selection and use of specific efficiency criteria and the interpretation of the results can be a challenge for even the most experienced hydrologist since each criterion may place different emphasis on different types of simulated and observed behaviours. In this paper, the utility of several efficiency criteria is investigated in three examples using a simple observed streamflow hydrograph.
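
Of the efficiency criteria named above, the Nash-Sutcliffe efficiency is perhaps the easiest to illustrate. The sketch below assumes its usual definition (one minus the ratio of the squared simulation error to the variance of the observations around their mean) and uses an invented four-point hydrograph.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency.

    1 means a perfect simulation, 0 means the simulation is no better than
    predicting the observed mean everywhere, and negative values mean it
    is worse than that.
    """
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_spread

# Hypothetical streamflow observations.
flow_obs = [1.0, 3.0, 5.0, 7.0]
mean_flow = sum(flow_obs) / len(flow_obs)

nse_perfect = nash_sutcliffe(flow_obs, flow_obs)            # 1.0
nse_mean_only = nash_sutcliffe(flow_obs, [mean_flow] * 4)   # 0.0
nse_poor = nash_sutcliffe(flow_obs, [7.0, 5.0, 3.0, 1.0])   # negative
```

Because the errors are squared, large deviations at flow peaks dominate the score, which is one way a criterion can place different emphasis on different types of simulated and observed behaviour.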
