model evaluation
Recently Published Documents

TOTAL DOCUMENTS: 1614 (FIVE YEARS: 370)
H-INDEX: 72 (FIVE YEARS: 12)

2022 · Vol 50 (1) · pp. 030006052110676
Author(s): Xing Liu, Abai Xu, Jingwen Huang, Haiyan Shen, Yazhen Liu

Objective: To understand how to prevent deep vein thrombosis (DVT) after an innovative operation termed intracorporeal laparoscopic reconstruction of a detaenial sigmoid neobladder, we explored the factors that influence DVT following surgery, with the aim of constructing a model for predicting DVT occurrence. Methods: This retrospective study included 151 bladder cancer patients who underwent intracorporeal laparoscopic reconstruction of a detaenial sigmoid neobladder. Data describing general clinical characteristics and other common parameters were collected and analyzed. We then generated model evaluation curves and cross-validated the model. Results: Age and body mass index were risk factors for DVT, whereas postoperative use of hemostatic agents and postoperative passive muscle massage were significant protective factors. Model evaluation curves showed that the model had high accuracy and little bias, and cross-validation confirmed its accuracy. Conclusion: The prediction model constructed here was highly accurate and had little bias; it can therefore be used to predict the likelihood of developing DVT after surgery.
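The abstract does not name the model family or publish its data; below is a minimal sketch of how such a DVT prediction model might be fitted and cross-validated, assuming logistic regression on the four reported predictors and using synthetic placeholder data (all variable values are illustrative, not the study's):

```python
# Hypothetical sketch: the paper does not publish its data or model form.
# Assumes a logistic regression on the four predictors named in the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 151  # cohort size reported in the abstract

# Synthetic stand-ins for the real clinical variables.
X = np.column_stack([
    rng.normal(65, 10, n),   # age (years)
    rng.normal(24, 3, n),    # body mass index (kg/m^2)
    rng.integers(0, 2, n),   # postoperative hemostatic agents (0/1)
    rng.integers(0, 2, n),   # postoperative passive muscle massage (0/1)
])
y = rng.integers(0, 2, n)    # DVT occurrence (0/1), placeholder labels

model = LogisticRegression(max_iter=1000)
# Cross-validation analogous to the abstract's accuracy check.
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"5-fold AUC: {scores.mean():.2f} +/- {scores.std():.2f}")
```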


MAUSAM · 2021 · Vol 66 (3) · pp. 433-444
Author(s): Soma Sen Roy, Subhendu Brata Saha, Ananda Kumar Das, S. K. Roy Bhowmik, P. K. Kundu

2021
Author(s): María Andrea Cruz Blandón, Alejandrina Cristia, Okko Räsänen

Computational models of child language development can help us understand the cognitive underpinnings of the language learning process. One advantage of computational modeling is that it has the potential to address multiple aspects of language learning within a single learning architecture. If successful, such integrated models would help to pave the way for a more comprehensive and mechanistic understanding of language development. However, in order to develop more accurate, holistic, and hence impactful models of infant language learning, research on models also requires model evaluation practices that allow comparison of model behavior to empirical data from infants across a range of language capabilities. Moreover, there is a need for practices that can compare the developmental trajectories of infants to those of models as a function of language experience. The present study aims to take the first steps to address these needs. More specifically, we introduce the concept of comparing models with large-scale cumulative empirical data from infants, as quantified by meta-analyses conducted across a large number of individual behavioral studies. We start by formalizing the connection between measurable model and human behavior, and then present a basic conceptual framework for meta-analytic evaluation of computational models, together with basic guidelines intended as a starting point for later work in this direction. We exemplify the meta-analytic model evaluation approach with two modeling experiments on infant-directed speech preference and native/non-native vowel discrimination. We also discuss the advantages, challenges, and potential future directions of meta-analytic evaluation practices.
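As a rough illustration of the proposed comparison, not the authors' actual pipeline: a model's native/non-native vowel discrimination can be expressed as a standardized effect size and checked against a meta-analytic estimate from infant studies. All numbers below are hypothetical placeholders.

```python
# Hypothetical sketch of meta-analytic model evaluation: compare a model's
# effect size for native vs. non-native vowel discrimination against a
# meta-analytic estimate from infant studies. All numbers are placeholders.
import numpy as np

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Standardized mean difference using a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(1)
# Simulated model discrimination scores for native and non-native contrasts.
native = rng.normal(0.8, 0.2, 40)
non_native = rng.normal(0.6, 0.2, 40)

d_model = cohens_d(native, non_native)

# Hypothetical meta-analytic estimate and 95% confidence interval.
d_meta, ci_low, ci_high = 0.45, 0.30, 0.60

inside = ci_low <= d_model <= ci_high
print(f"model d = {d_model:.2f}; meta-analytic CI = [{ci_low}, {ci_high}]; consistent: {inside}")
```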


2021 · Vol 14 (12) · pp. 7545-7571
Author(s): Tom Gleeson, Thorsten Wagener, Petra Döll, Samuel C. Zipper, Charles West, ...

Abstract. Continental- to global-scale hydrologic and land surface models increasingly include representations of the groundwater system. Such large-scale models are essential for examining, communicating, and understanding the dynamic interactions between the Earth system above and below the land surface as well as the opportunities and limits of groundwater resources. We argue that both large-scale and regional-scale groundwater models have utility, strengths, and limitations, so continued modeling at both scales is essential and mutually beneficial. A crucial quest is how to evaluate the realism, capabilities, and performance of large-scale groundwater models given their modeling purpose of addressing large-scale science or sustainability questions as well as limitations in data availability and commensurability. Evaluation should identify if, when, or where large-scale models achieve their purpose or where opportunities for improvements exist so that such models better achieve their purpose. We suggest that reproducing the spatiotemporal details of regional-scale models and matching local data are not relevant goals. Instead, it is important to decide on reasonable model expectations regarding when a large-scale model is performing “well enough” in the context of its specific purpose. The decision of reasonable expectations is necessarily subjective even if the evaluation criteria are quantitative. Our objective is to provide recommendations for improving the evaluation of groundwater representation in continental- to global-scale models. We describe current modeling strategies and evaluation practices, and we subsequently discuss the value of three evaluation strategies: (1) comparing model outputs with available observations of groundwater levels or other state or flux variables (observation-based evaluation), (2) comparing several models with each other with or without reference to actual observations (model-based evaluation), and (3) comparing model behavior with expert expectations of hydrologic behaviors in particular regions or at particular times (expert-based evaluation). Based on evolving practices in model evaluation as well as innovations in observations, machine learning, and expert elicitation, we argue that combining observation-, model-, and expert-based model evaluation approaches, while accounting for commensurability issues, may significantly improve the realism of groundwater representation in large-scale models, thus advancing our ability for quantification, understanding, and prediction of crucial Earth science and sustainability problems. We encourage greater community-level communication and cooperation on this quest, including among global hydrology and land surface modelers, local to regional hydrogeologists, and hydrologists focused on model development and evaluation.
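As a concrete, minimal illustration of strategy (1), observation-based evaluation might reduce to a few summary statistics over paired simulated and observed groundwater levels. The data below are synthetic, and a real comparison would need to address the commensurability issues the authors raise (matched locations and times, scale mismatch between grid cells and wells):

```python
# Hypothetical sketch of observation-based evaluation (strategy 1):
# compare simulated and observed water-table depths at monitoring wells.
# Data are synthetic placeholders; real use requires commensurable units,
# matched locations/times, and attention to grid-cell vs. well scale mismatch.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
observed = rng.uniform(1.0, 50.0, 200)            # water-table depth (m) at wells
simulated = observed + rng.normal(0.0, 5.0, 200)  # synthetic model output

bias = np.mean(simulated - observed)              # systematic over/underestimation
rmse = np.sqrt(np.mean((simulated - observed) ** 2))
rho, _ = spearmanr(simulated, observed)           # rank agreement, robust to skew

print(f"bias = {bias:.2f} m, RMSE = {rmse:.2f} m, Spearman rho = {rho:.2f}")
```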


2021
Author(s): Johanna C. Clauser, Judith Maas, Ilona Mager, Frank R. Halfwerk, Jutta Arens

2021
Author(s): Ankit Patel

This doctoral thesis examines the multivariate nature of sporting performances, expressed as performance on context-specific tasks, to develop a novel framework for constructing sport-based rating systems, also referred to as scoring models. The intent of this framework is to produce reliable, robust, intuitive, and transparent ratings, regarded as meaningful, for the performances prevalent in the sport player and team evaluation environment. In this thesis, Bracewell’s (2003) definition of a rating as an elegant form of dimension reduction is extended: ratings are an elegant and excessive form of dimension reduction whereby a single numerical value provides an objective interpretation of performance.

The data, provided by numerous vendors, are a summary of the actions and performances completed by an individual during the evaluation period. A literature review of rating systems for measuring performance revealed a set of common methodologies, which were applied to produce a set of rating systems used as pilot studies to gather learnings about, and limitations of, the current literature.

By reviewing rating methodologies and developing rating systems, a set of limitations and commonalities in the current literature was identified and used to develop a novel framework for constructing sport-based rating systems that output measures of both team- and player-level performance. The proposed framework adopts a multi-objective ensembling strategy and implements five key commonalities present within many rating methodologies: 1) dimension reduction and feature selection techniques, 2) feature engineering tasks, 3) a multi-objective framework, 4) time-based variables, and 5) an ensembling procedure to produce an overall rating.

An ensemble approach is adopted because it is assumed that sporting performances are a function of the significant traits affecting performance; that is, performance = f(trait_1, …, trait_n). Moreover, the framework is a form of model stacking, in which information from multiple models is combined to generate a more informative model. Rating systems built using this approach provide a meaningful quantitative interpretation of performance during a specific time interval, known as the evaluation period.

The framework introduces a methodical approach for constructing rating systems within the sporting domain that produce meaningful ratings. Meaningful ratings must 1) yield good performance when data are drawn from a wide range of probability distributions, remaining largely unaffected by outliers, small departures from model assumptions, and small sample sizes (robust); 2) be accurate and produce highly informative predictions that are well calibrated and sharp (reliable); 3) be interpretable and easy to communicate (transparent); and 4) relate to real-world observable outcomes (intuitive).

The framework is developed to construct meaningful rating systems within the sporting industry for evaluating team and player performances. The approach was tested and validated by constructing both team- and individual player-based rating systems within the cricketing context. The ratings these systems produced were found to be meaningful: reliable, robust, transparent, and intuitive.
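A minimal sketch of the stacking idea described above: hypothetical trait-level sub-model scores are combined by a meta-model into a single rating, a concrete instance of performance = f(trait_1, …, trait_n). The thesis's actual sub-models, traits, and combiner are not specified here; all names and data below are illustrative.

```python
# Hypothetical sketch of rating-as-stacking: trait-level sub-models produce
# scores that a meta-model combines into one rating per player.
# All data are synthetic stand-ins, not the thesis's models or data.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n_players, n_traits = 100, 5

# Stand-ins for sub-model outputs (e.g. batting, bowling, fielding traits).
trait_scores = rng.normal(0, 1, (n_players, n_traits))
# Stand-in target: an observed match-level performance summary.
outcome = trait_scores @ rng.uniform(0.5, 1.5, n_traits) + rng.normal(0, 0.5, n_players)

# Meta-model: learn how to weight the traits into a single rating.
meta = Ridge(alpha=1.0).fit(trait_scores, outcome)
ratings = meta.predict(trait_scores)  # one number per player: the rating

print("top-rated player index:", int(np.argmax(ratings)))
```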
This rating framework is not restricted to evaluating player and team performances in cricket; it is applicable in any sporting code where a summary of multivariate data is needed to understand performance.

Common model evaluation metrics were found to be limited and to lack applicability when evaluating the effectiveness of meaningful ratings, so a novel evaluation metric was developed. The constructed metric applies distance- and magnitude-based metrics derived from the spherical scoring rule methodology. The distance- and magnitude-based spherical (DMS) metric applies an analytic hierarchy process to assess the effectiveness of meaningful sport-based ratings and accounts for forecasting difficulty on a time basis. The DMS performance metric quantifies elements of the decision-making process by 1) evaluating the distance between the ratings reported by the modeller and the actual outcome or the modeller’s ‘true’ beliefs, 2) providing an indication of “good” ratings, 3) accounting for the context and the forecasting difficulty to which the ratings are applied, and 4) capturing the introduction of any subjective human bias within sport-based rating systems. The DMS metric is shown to outperform conventional model evaluation metrics such as the log loss in specific sporting scenarios of varying difficulty.
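The DMS metric itself is described here only at a high level; as background, a minimal sketch of the spherical scoring rule it builds on, shown alongside the log loss for a hypothetical three-way match-outcome forecast:

```python
# Minimal sketch of the spherical scoring rule that DMS builds on, shown
# next to log loss for a three-way (win/draw/loss) forecast. The forecast
# probabilities below are hypothetical.
import numpy as np

def spherical_score(probs: np.ndarray, outcome: int) -> float:
    """S(p, i) = p_i / ||p||_2; higher is better, maximized by honest forecasts."""
    return probs[outcome] / np.linalg.norm(probs)

def log_loss(probs: np.ndarray, outcome: int) -> float:
    """Negative log-likelihood of the realized outcome; lower is better."""
    return -np.log(probs[outcome])

forecast = np.array([0.6, 0.25, 0.15])  # P(win), P(draw), P(loss)
outcome = 0                             # the win occurred

print(f"spherical: {spherical_score(forecast, outcome):.3f}")
print(f"log loss:  {log_loss(forecast, outcome):.3f}")
```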

