evaluation bias
Recently Published Documents


TOTAL DOCUMENTS

69
(FIVE YEARS 12)

H-INDEX

14
(FIVE YEARS 1)

Author(s):  
Ying-Peng Tang ◽  
Sheng-Jun Huang

To learn an effective model with fewer training examples, existing active learning methods typically assume that a target model is given and try to fit it by selecting the most informative examples. However, the best target model can rarely be determined a priori, so these methods may perform suboptimally even if the data are perfectly selected. To tackle this practical challenge, this paper proposes a novel framework of dual active learning (DUAL) that simultaneously performs model search and data selection. Specifically, an effective method with truncated importance sampling is proposed for Combined Algorithm Selection and Hyperparameter optimization (CASH), which mitigates the model evaluation bias on the labeled data. Further, we propose an active query strategy to label the most valuable examples. The strategy on the one hand favors discriminative data to help CASH search for the best model, and on the other hand prefers informative examples to accelerate the convergence of winner models. Extensive experiments are conducted on 12 OpenML datasets. The results demonstrate that the proposed method can effectively learn a superior model with fewer labeled examples.
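The idea of correcting evaluation bias with truncated importance sampling can be illustrated with a minimal sketch. It is not the DUAL implementation: the function name, the self-normalized form, and the `max_weight` cap are assumptions made for illustration. Actively queried examples are not a uniform sample of the pool, so each labeled example's loss is reweighted by the inverse of its selection probability, and the weights are capped to limit variance.

```python
import numpy as np

def truncated_iw_error(losses, query_probs, pool_size, max_weight=10.0):
    """Estimate the mean loss over the pool from actively selected examples.

    losses      : loss of the candidate model on each labeled example
    query_probs : probability with which each labeled example was selected
    pool_size   : number of examples in the pool (hypothetical parameter)
    max_weight  : truncation threshold for the weights (hypothetical default)
    """
    losses = np.asarray(losses, dtype=float)
    probs = np.asarray(query_probs, dtype=float)
    # Importance weight relative to uniform sampling from the pool.
    weights = 1.0 / (pool_size * probs)
    # Truncate large weights: this adds a little bias but reduces variance.
    weights = np.minimum(weights, max_weight)
    # Self-normalized weighted average of the losses.
    return float(np.sum(weights * losses) / np.sum(weights))
```

With uniform selection probabilities the estimate reduces to the plain mean of the losses; a rarely selected example would otherwise dominate the estimate, and the cap bounds its influence.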


2021 ◽  
Vol 12 (1) ◽  
pp. 19-35
Author(s):  
Mahito Okura ◽  
Motohiro Sakaki ◽  
Takuya Yoshizawa

Abstract This study examines whether the introduction of compulsory bicycle liability insurance is socially desirable when evaluation bias (the difference between the objective and subjective evaluations of liability amounts) exists. The main results of this study are summarized as follows. First, when there is no evaluation bias, the introduction of compulsory bicycle liability insurance is socially desirable when the interest rate is high, without any further condition; if the loading rate is higher than the interest rate, it is socially desirable when the loading rate is low, the maximum amount of liability is small, and the effort cost is high. Second, if there is an evaluation bias and the accident probability is uniformly distributed, more severe additional conditions are needed to derive the same results. The study concludes that evaluation bias hinders the realization of the situation in which the introduction of compulsory bicycle liability insurance is socially desirable.


Water ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 669
Author(s):  
Justin Hughes ◽  
Nick Potter ◽  
Lu Zhang ◽  
Robert Bridgart

Long-term droughts observed in southern Australia have changed relationships between annual rainfall and runoff and tested some of the assumptions implicit in rainfall–runoff models used in these areas. Predictive confidence across these periods is low when using the more commonly used rainfall–runoff models. Here we modified the GR4J model to better represent surface water–groundwater connection and its role in runoff generation. The modified model (GR7J) was tested in 137 catchments in south-east Australia. Models were calibrated during “wetter” periods, and simulation across drought periods was assessed against observations. GR7J performed better than GR4J in evaluation during drought periods, where bias was significantly lower, and showed improved fit across the flow duration curve, especially at low flows. The largest improvements in predictive performance were for catchments where there were larger changes in the annual rainfall–runoff relationship. The predictive performance of the GR7J model was more sensitive to the objective function used than that of GR4J. The use of an objective function that combined daily and annual error produced a better goodness of fit when measured against the 80, 50 and 20 percent exceedance flow quantiles and reduced evaluation bias, especially for the GR7J model.
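An objective function that combines daily and annual error can be sketched as follows. The abstract does not give the authors' formula, so the normalized squared-error terms, the `annual_weight` blend, and the fixed `days_per_year` block length are all assumptions chosen for illustration; `volume_bias` shows one common way to express evaluation bias as a relative difference in total flow.

```python
import numpy as np

def _normalized_sq_error(obs, sim):
    # Sum of squared errors scaled by the sum of squared observations.
    return float(np.sum((sim - obs) ** 2) / np.sum(obs ** 2))

def combined_objective(obs_daily, sim_daily, days_per_year=365, annual_weight=0.5):
    """Blend daily and annual error into one score (lower is better)."""
    obs = np.asarray(obs_daily, dtype=float)
    sim = np.asarray(sim_daily, dtype=float)
    n_years = len(obs) // days_per_year
    if n_years < 1:
        raise ValueError("need at least one full year of data")
    n = n_years * days_per_year  # drop any trailing partial year
    # Aggregate daily flows into annual totals.
    obs_annual = obs[:n].reshape(n_years, days_per_year).sum(axis=1)
    sim_annual = sim[:n].reshape(n_years, days_per_year).sum(axis=1)
    daily_err = _normalized_sq_error(obs[:n], sim[:n])
    annual_err = _normalized_sq_error(obs_annual, sim_annual)
    return (1.0 - annual_weight) * daily_err + annual_weight * annual_err

def volume_bias(obs_daily, sim_daily):
    """Relative bias in total simulated flow (0 means unbiased)."""
    obs = np.asarray(obs_daily, dtype=float)
    sim = np.asarray(sim_daily, dtype=float)
    return float((sim.sum() - obs.sum()) / obs.sum())
```

The annual term penalizes errors that persist across a whole year, which a purely daily score can miss when daily errors are individually small but all of the same sign.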


2020 ◽  
Vol 83-84 ◽  
pp. 101940
Author(s):  
Patrick Paschke ◽  
Anne Franziska Weidinger ◽  
Ricarda Steinmayr

2020 ◽  
Author(s):  
Alexander Zizka ◽  
Daniele Silvestro ◽  
Pati Vitt ◽  
Tiffany M. Knight

Abstract IUCN Red List assessments are essential for prioritizing conservation needs but are resource-intensive and therefore only available for a fraction of global species richness. Tropical plant species are particularly under-represented on the IUCN Red List. Automated conservation assessments based on digitally available geographic occurrence records can be a rapid alternative, but it is unclear how reliable these assessments are. Here, we present automated conservation assessments for 13,910 species of the diverse and globally distributed Orchid family (Orchidaceae), based on a novel method using a deep neural network (IUC-NN), most of which (13,049) were previously unassessed by the IUCN Red List. We identified 4,342 (31.2% of the evaluated orchid species) as Possibly Threatened with extinction (equivalent to the IUCN categories CR, EN, or VU) and point to Madagascar, East Africa, south-east Asia, and several oceanic islands as priority areas for orchid conservation. Furthermore, the Orchid family provides a model to test the sensitivity of automated assessment methods to issues with data availability, data quality and geographic sampling bias. IUC-NN identified threatened species with an accuracy of 84.3%, with significantly lower geographic evaluation bias compared to the IUCN Red List, and was robust against low data availability and geographic errors in the input data. Overall, our results demonstrate that automated assessments have an important role to play in achieving goals of identifying the species that are at greatest risk of extinction.


2020 ◽  
Vol 95 (6) ◽  
pp. 213-233 ◽  
Author(s):  
Isabella Grabner ◽  
Judith Künneke ◽  
Frank Moers

ABSTRACT While prior research on performance evaluation bias has mainly focused on the determinants and consequences of rating errors, we investigate how a firm can provide implicit incentives to supervisors to mitigate these errors via its calibration committee. We empirically examine the extent to which a calibration committee incorporates supervisors' evaluation behavior with respect to their subordinates in the performance evaluation outcomes, i.e., performance ratings and promotion decisions, for these supervisors. In our study, we distinguish between lack of skills and opportunism as two important facets of evaluation behavior, which we expect the calibration committee to address differently. Using panel data of a professional service firm, we show that supervisors' opportunistic behavior of strategically inflating subordinates' performance ratings is disciplined through a decrease in the supervisors' own performance rating, while supervisors' skill in providing less compressed and, thus, more informative performance ratings is rewarded through a higher likelihood of promotion.

