Rejection versus error in a multiple experts environment

Author(s):  
Louisa Lam ◽  
Ching Y. Suen
2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging because typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring user preferences in a social network (Twitter), and (3) the management of water release from Lake Como. For each of these scenarios, we provide a formalization, experiments, and a discussion to interpret the obtained results.


PLoS ONE ◽  
2012 ◽  
Vol 7 (10) ◽  
pp. e46192 ◽  
Author(s):  
Sam Mavandadi ◽  
Steve Feng ◽  
Frank Yu ◽  
Stoyan Dimitrov ◽  
Karin Nielsen-Saines ◽  
...  

2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Author(s):  
Tomi Kauppi ◽  
Joni-Kristian Kämäräinen ◽  
Lasse Lensu ◽  
Valentina Kalesnykiene ◽  
Iiris Sorri ◽  
...  

We address the performance evaluation practices for developing medical image analysis methods, in particular, how to establish and share databases of medical images with verified ground truth and solid evaluation protocols. Such databases support the development of better algorithms, execution of profound method comparisons, and, consequently, technology transfer from research laboratories to clinical practice. For this purpose, we propose a framework consisting of reusable methods and tools for the laborious task of constructing a benchmark database. We provide a software tool for medical image annotation helping to collect class label, spatial span, and expert's confidence on lesions and a method to appropriately combine the manual segmentations from multiple experts. The tool and all necessary functionality for method evaluation are provided as public software packages. As a case study, we utilized the framework and tools to establish the DiaRetDB1 V2.1 database for benchmarking diabetic retinopathy detection algorithms. The database contains a set of retinal images, ground truth based on information from multiple experts, and a baseline algorithm for the detection of retinopathy lesions.
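As a rough illustration of what combining manual segmentations from several experts can look like, the sketch below fuses binary masks with a pixel-wise, confidence-weighted vote. This is a generic stand-in, not the combination method proposed in the abstract above; the function name `fuse_masks`, the flat-list mask layout, and the `threshold` parameter are all assumptions made for the example.

```python
def fuse_masks(expert_masks, confidences, threshold=0.5):
    """Fuse binary masks by a confidence-weighted vote per pixel.

    Hypothetical helper: a generic voting scheme, not the method
    from the paper. Masks are flat lists of 0/1 of equal length.
    """
    if not expert_masks:
        raise ValueError("at least one expert mask is required")
    if len(confidences) != len(expert_masks):
        raise ValueError("one confidence value per expert is required")
    n_pixels = len(expert_masks[0])
    if any(len(mask) != n_pixels for mask in expert_masks):
        raise ValueError("all masks must have the same length")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("confidences must sum to a positive value")

    fused = []
    for i in range(n_pixels):
        # Sum the confidence of every expert who marked this pixel.
        support = sum(c for mask, c in zip(expert_masks, confidences) if mask[i])
        fused.append(1 if support / total >= threshold else 0)
    return fused


# Three experts annotate four pixels; the second expert is less confident.
masks = [[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 1, 0]]
fused = fuse_masks(masks, confidences=[1.0, 0.5, 1.0])
print(fused)
```

Raising `threshold` in this sketch keeps only pixels that most of the weighted experts agree on, while lowering it moves the result toward the union of all annotations.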


Author(s):  
Vikas C. Raykar ◽  
Shipeng Yu ◽  
Linda H. Zhao ◽  
Anna Jerebko ◽  
Charles Florin ◽  
...  

2021 ◽  
Vol 13 (4) ◽  
pp. 2081
Author(s):  
Wan-Chi Jackie Hsu ◽  
Huai-Wei Lo ◽  
Chin-Cheng Yang

As the Coronavirus disease 2019 (COVID-19) epidemic spreads all over the world, governments of various countries are actively adopting epidemic prevention measures to curb the spread of the disease. However, colleges and universities are among the most likely places for cluster infections. The main reason is that college students have frequent social activities and many students come from different countries, which makes college campuses likely entry points for disease transmission. Therefore, this study proposes a framework for epidemic prevention work and further explores the importance and priority of the individual prevention measures. First, 32 persons in charge of epidemic prevention at various universities in Taiwan were invited to jointly formulate a campus epidemic prevention framework; they determined 5 dimensions and 36 epidemic prevention works/measures/criteria. Next, the Bayesian best worst method (BWM) was used to generate a set of optimal group criteria weights. This method can not only integrate the opinions of multiple experts, but also effectively reduce the complexity of expert interviews to obtain more reliable results. The results show that the five most important measures for campus epidemic prevention are the establishment of a campus epidemic prevention organization, comprehensive disinfection of the campus environment, maintenance of indoor ventilation, proper isolation of contacts of confirmed cases, and management of immigration regulations for overseas students. This study provides a reference for colleges and universities around the world in formulating anti-epidemic measures that effectively reduce the probability of COVID-19 transmission on campuses and protect students’ right to education.


2021 ◽  
Author(s):  
Anca Hanea ◽  
David Peter Wilkinson ◽  
Marissa McBride ◽  
Aidan Lyon ◽  
Don van Ravenzwaaij ◽  
...  

Experts are often asked to represent their uncertainty as a subjective probability. Structured protocols offer a transparent and systematic way to elicit and combine probability judgements from multiple experts. As part of this process, experts are asked to individually estimate a probability (e.g., of a future event) which needs to be combined/aggregated into a final group prediction. The experts' judgements can be aggregated behaviourally (by striving for consensus), or mathematically (by using a mathematical rule to combine individual estimates). Mathematical rules (e.g., weighted linear combinations of judgments) provide an objective approach to aggregation. However, the choice of a rule is not straightforward, and the aggregated group probability judgement's quality depends on it. The quality of an aggregation can be defined in terms of accuracy, calibration and informativeness. These measures can be used to compare different aggregation approaches and help decide on which aggregation produces the "best" final prediction. In the ideal case, individual experts' performance (as probability assessors) is scored, these scores are translated into performance-based weights, and a performance-based weighted aggregation is used. When this is not possible though, several other aggregation methods, informed by measurable proxies for good performance, can be formulated and compared. We use several data sets to investigate the relative performance of multiple aggregation methods informed by previous experience and the available literature. Even though the accuracy, calibration, and informativeness of the majority of methods are very similar, two of the aggregation methods distinguish themselves as the best and worst.
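The weighted linear combination mentioned in the abstract above can be sketched in a few lines. The example below is only an illustration: the function names `linear_pool` and `brier_score`, the three expert estimates, and the weights are hypothetical, and the Brier score is used as one common accuracy measure rather than as the scoring rule from the study.

```python
def linear_pool(probabilities, weights=None):
    """Aggregate individual probabilities with a weighted linear combination.

    With weights=None every expert counts equally (a simple average).
    """
    if not probabilities:
        raise ValueError("at least one probability is required")
    if weights is None:
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per probability is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalising by the total lets callers pass unnormalised weights.
    return sum(w * p for w, p in zip(weights, probabilities)) / total


def brier_score(forecast, outcome):
    """Squared error of a probability forecast for a binary outcome (0 or 1)."""
    return (forecast - outcome) ** 2


# Hypothetical estimates from three experts for the same future event.
estimates = [0.2, 0.6, 0.9]
equal_weighted = linear_pool(estimates)
# Assumed performance-based weights: the third expert counts double.
performance_weighted = linear_pool(estimates, weights=[1.0, 1.0, 2.0])

# If the event occurs (outcome = 1), the lower score is the more accurate forecast.
print(brier_score(equal_weighted, 1), brier_score(performance_weighted, 1))
```

In this toy case the performance-based weights shift the group probability toward the expert assumed to be the best assessor; whether that helps in practice depends on how well the weights reflect real performance.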


2021 ◽  
Author(s):  
Alexandre Triay Bagur ◽  
Paul Aljabar ◽  
Gerard R Ridgway ◽  
Michael Brady ◽  
Daniel Bulte

Pancreatic disease can be spatially inhomogeneous. For this reason, quantitative imaging studies of the pancreas have often targeted the 3 main anatomical pancreatic parts, head, body, and tail, traditionally using a balanced region of interest (ROI) strategy. Existing automated analysis methods have implemented whole-organ segmentation, which provides an overall quantification, but fails to address spatial heterogeneity in disease. A method to automatically refine a whole-organ segmentation of the pancreas into head, body, and tail subregions is presented for abdominal magnetic resonance imaging (MRI). The subsegmentation method is based on diffeomorphic registration to a group average template image, where the parts are manually annotated. For a new whole-pancreas segmentation, the aligned template's part labels are automatically propagated to the segmentation of interest. The method is validated retrospectively on the UK Biobank imaging substudy (scanned using a 2-point Dixon protocol at 1.5 tesla), using a nominally healthy cohort of 100 subjects for template creation, and 50 independent subjects for validation. Pancreas head, body, and tail were annotated by multiple experts on the validation cohort, which served as the benchmark for the automated method's performance. Good intra-rater (Dice overlap mean, Head: 0.982, Body: 0.940, Tail: 0.961, N=30) as well as inter-rater (Dice overlap mean, Head: 0.968, Body: 0.905, Tail: 0.943, N=150) agreement was observed. No significant difference (Wilcoxon rank sum test, DSC, Head: p=0.4358, Body: p=0.0992, Tail: p=0.1080) was observed between the manual annotations and the automated method's predictions. Results on regional pancreatic fat assessment are also presented, by intersecting the 3-D parts segmentation with one 2-D multi-echo gradient-echo slice, available from the same scanning session, that was used to compute MRI proton density fat fraction (MRI-PDFF). Initial application of the method on a type 2 diabetes cohort showed the utility of the method for assessing pancreatic disease heterogeneity.
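The Dice overlap used above to report intra-rater and inter-rater agreement is twice the size of the intersection of two segmentations divided by the sum of their sizes. The sketch below computes it for two binary masks; the function name `dice_overlap`, the flat-list mask layout, and the toy annotations are assumptions for illustration and are not data from the study.

```python
def dice_overlap(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are equal-length flat sequences of 0/1 (hypothetical layout).
    Returns 1.0 when both masks are empty, by convention in this sketch.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for value in mask_a if value)
    size_b = sum(1 for value in mask_b if value)
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0
    return 2.0 * intersection / (size_a + size_b)


# Two toy annotations of the same six voxels by different raters.
rater_1 = [1, 1, 1, 0, 0, 0]
rater_2 = [0, 1, 1, 1, 0, 0]
score = dice_overlap(rater_1, rater_2)
print(round(score, 3))
```

A value of 1 means the two annotations coincide exactly and 0 means they share no voxels, which is why the values above 0.9 reported in the abstract indicate close agreement.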

