Overlooked Trustworthiness of Explainability in Medical AI

Author(s):  
Jiajin Zhang ◽  
Hanqing Chao ◽  
Mannudeep K Kalra ◽  
Ge Wang ◽  
Pingkun Yan

While various methods have been proposed to explain AI models, the trustworthiness of the generated explanations has received little examination. This paper reveals that such explanations can be vulnerable to subtle perturbations of the input and can produce misleading results. On the public CheXpert dataset, we demonstrate that specially designed adversarial perturbations can easily tamper with saliency maps, steering them toward desired explanations while preserving the original model predictions. AI researchers, practitioners, and authoritative agencies in the medical domain should use caution when explaining AI models, because such an explanation could be irrelevant, misleading, or even adversarially manipulated without changing the model output.
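As an illustration of the kind of manipulation described above (a hedged sketch, not the authors' method), a small input perturbation can be optimized with two objectives: pull a gradient-based saliency map toward an arbitrary target while keeping the model's logits essentially unchanged. `model`, `x`, `target_saliency`, and the loss weights below are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def gradient_saliency(model, x):
    # Gradient of the top predicted class score w.r.t. the input, kept in the
    # computation graph (create_graph=True) so the saliency itself can be optimized.
    score = model(x).max(dim=1).values.sum()
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad.abs()

def manipulate_explanation(model, x, target_saliency, eps=0.02, steps=200, lr=1e-3):
    # Optimize a small perturbation `delta` so the saliency map moves toward
    # `target_saliency` while the logits stay close to the unperturbed ones.
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    logits_orig = model(x).detach()
    for _ in range(steps):
        x_adv = x + delta
        sal = gradient_saliency(model, x_adv)
        loss = (F.mse_loss(sal, target_saliency)                 # steer the explanation
                + 10.0 * F.mse_loss(model(x_adv), logits_orig))  # keep the prediction
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)                             # keep the perturbation subtle
    return (x + delta).detach()
```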

Symmetry ◽  
2019 ◽  
Vol 11 (6) ◽  
pp. 792 ◽  
Author(s):  
Zenonas Turskis ◽  
Jurgita Antuchevičienė ◽  
Violeta Keršulienė ◽  
Gintaras Gaidukas

Sustainable and efficient development is one of the most critical challenges facing modern society if the world is to be preserved for future generations. Airports are an integral part of human activity and need to be adapted to meet current and future sustainability needs and to provide useful services to the public, taking prospects and requirements into account. Many performance criteria must be assessed, and these criteria often conflict with each other, are measured in different units and on different scales, and vary in importance when evaluating the effectiveness of alternatives. In addition, the implementation of such decisions has effects on the interests of different groups in society that are not precisely known in advance, and stakeholders can only evaluate the efficiency of implemented project alternatives over the whole project life cycle. It is therefore essential to find suitable assessment models and adapt them to these challenges, and hybrid group multi-criteria decision-making models are among the most appropriate ways to model such problems. This article presents a real application of an original model for choosing the best alternative for an airport's second runway.
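As a generic illustration of the underlying multi-criteria idea (not the authors' hybrid group model), the sketch below normalizes criteria that are measured in different units and directions and aggregates them with group-derived weights; all alternative scores, criteria, and weights are invented for the example.

```python
import numpy as np

# rows: runway alternatives; columns: cost (M EUR), capacity (ops/h), noise (dB)
scores = np.array([
    [120.0, 38.0, 62.0],
    [145.0, 44.0, 58.0],
    [110.0, 35.0, 65.0],
])
benefit = np.array([False, True, False])   # cost and noise should be minimized
weights = np.array([0.4, 0.35, 0.25])      # e.g. aggregated expert/group judgements

# min-max normalization per criterion, flipping the direction of cost-type criteria
norm = (scores - scores.min(0)) / (scores.max(0) - scores.min(0))
norm[:, ~benefit] = 1.0 - norm[:, ~benefit]

utility = norm @ weights                   # simple additive weighting (SAW)
print("utility per alternative:", utility, "best alternative:", utility.argmax())
```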


2016 ◽  
Vol 12 (2) ◽  
pp. 30 ◽  
Author(s):  
Diogenes Lycarião ◽  
Rafael Cardoso Sampaio

Agenda-setting theory is one of the most powerful fields of study in communication research. Nevertheless, it is not a settled theory. Recent studies based on big data report seemingly contradictory results. While some findings reinforce McCombs and Shaw's original model (i.e., the media set the public agenda), others demonstrate the great power of social media to set the media's agenda, which is usually described as reverse agenda-setting. This article, based on an interactional model of agenda-setting building, shows how such results are actually consistent with each other. They reveal a complex, multidirectional, and to some extent unpredictable network of interactions that shapes the public debate, operating through different kinds of agenda (thematic or factual) and time frames (short, medium, or long term).


Author(s):  
Reto Knutti

Predictions of future climate are based on elaborate numerical computer models. As computational capacity increases and better observations become available, one would expect the model predictions to become more reliable. However, are they really improving, and how do we know? This paper discusses how current climate models are evaluated, why and where scientists have confidence in their models, how uncertainty in predictions can be quantified, and why models often tend to converge on what we observe but not on what we predict. Furthermore, it outlines some strategies on how the climate modelling community may overcome some of the current deficiencies in the attempt to provide useful information to the public and policy-makers.


2019 ◽  
Vol 5 (Supplement_1) ◽  
Author(s):  
David Hodgson ◽  
Stéphane Hué ◽  
Jasmina Panovska-Griffiths ◽  
Atila Iamarino ◽  
Katherine E Atkins

Abstract Antiretroviral treatment (ART) has provided substantial benefits for HIV-1-infected patients and has reduced incidence in areas with high uptake since its introduction in the late 1980s. As ART has led to shifts in the worldwide epidemiology of HIV-1, it may also exert concomitant selective pressure on the virus population. Evidence for changes in HIV-1 virulence since the introduction of ART appears to be inconsistent. In addition to reviewing both empirical and theoretical studies on the likely impact of ART on HIV-1 virulence, we developed a mathematical framework to evaluate virulence selection under widespread treatment programs and the future impact of recent test-and-treat recommendations. By quantifying how changes in virulence relate to transmissibility, to disease progression, and to the speed of diagnosis and treatment, we reconcile observational studies of virulence changes with the mathematical model predictions. With the adoption of new test-and-treat programs, which entail early detection and immediate treatment, increased virulence is likely to be observed. Our findings highlight the potential public health consequences of mass treatment and the ensuing requirement for greater access and adherence to nullify the public health effect of these virulence changes.
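A deliberately crude illustration of the selection argument (an assumption-laden toy, not the authors' framework): if a strain's transmission potential is its per-contact transmission rate multiplied by the time it remains untreated and infectious, then faster diagnosis and treatment caps that time and shifts the fitness optimum toward more transmissible, more virulent strains. All functional forms and numbers below are assumptions.

```python
import numpy as np

virulence = np.linspace(0.1, 1.0, 10)      # abstract virulence scale
beta = 0.05 + 0.3 * virulence              # transmissibility rises with virulence
progression = 2.0 / virulence              # untreated infectious period (years) falls with virulence

for years_to_treatment in (np.inf, 8.0, 1.0):   # no ART, slow uptake, test-and-treat
    duration = np.minimum(progression, years_to_treatment)   # treatment truncates infectiousness
    fitness = beta * duration                                 # crude transmission potential
    best = virulence[fitness.argmax()]
    print(f"treatment after {years_to_treatment} years -> fittest virulence ~ {best:.1f}")
```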


2021 ◽  
Author(s):  
Michael Sharpe ◽  
Joseph Battershill ◽  
Katherine Hurst

The UK Met Office manages its commitment to the public through the Public Weather Service, and extreme weather events are an important factor in public safety and concern. A new Key Performance Indicator is therefore being introduced, related to how well extreme events are correctly identified. The Threshold Weighted Continuous Ranked Probability Score (twCRPS) is used to make this assessment by determining how well site-specific Met Office ensemble-based probabilistic forecasts predict relatively extreme events. The threshold-weighted version of the Mean Absolute Error (twMAE) is the deterministic equivalent of the twCRPS and is used to assess the deterministic model output that currently appears on the Met Office App and website.

Gridded numerical ensemble model data are generated by MOGREPS (the Met Office Global and Regional Ensemble Prediction System). A new program of post-processing work (IMPROVER) has been undertaken in recent years to replace the post-processing system currently employed by the Met Office. IMPROVER applies a series of post-processing steps to generate both probabilistic and deterministic forecasts, and site-specific data are generated from these model fields. The model output is verified at each post-processing stage to ensure that every step has the expected impact on model performance. To date, however, these assessments have concentrated on performance in more typical conditions rather than on how well more extreme events are identified.

This session outlines very recent work to assess how well raw MOGREPS data, and data generated at various post-processing stages of IMPROVER, predict relatively extreme events at observation sites throughout the UK. The twMAE and twCRPS are used for this assessment; in both cases, the threshold weighting function is defined in terms of a distribution formed by sampling the value corresponding to a chosen, relatively extreme percentile from each UK site's observed 30-year climatology.
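For concreteness, here is a minimal numerical sketch of a threshold-weighted CRPS for a single ensemble forecast. It is not the Met Office implementation: it uses a simple indicator weight above a fixed climatological threshold rather than the distribution-based weighting described above, and the ensemble values, observation, and threshold are invented.

```python
import numpy as np

def tw_crps(ensemble, obs, threshold, z_grid):
    """Threshold-weighted CRPS with an indicator weight w(z) = 1{z >= threshold}."""
    F = (ensemble[:, None] <= z_grid[None, :]).mean(axis=0)   # empirical forecast CDF
    H = (obs <= z_grid).astype(float)                         # observation step function
    w = (z_grid >= threshold).astype(float)                   # weight only the extreme tail
    return np.trapz(w * (F - H) ** 2, z_grid)                 # numerical integral over z

# toy example: 2 m temperature forecast (degC) with an "extreme" threshold from climatology
ensemble = np.random.default_rng(0).normal(31.0, 1.5, size=18)   # 18 ensemble members
obs = 33.2
threshold = 32.0                                                 # e.g. 98th climatological percentile
z_grid = np.linspace(20.0, 45.0, 1001)
print("twCRPS:", tw_crps(ensemble, obs, threshold, z_grid))
```

The twMAE follows the same pattern, with the deterministic forecast's step function in place of the ensemble CDF.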


Biology ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 353 ◽  
Author(s):  
Sana Jahedi ◽  
James A. Yorke

As the coronavirus pandemic spreads across the globe, people are debating policies to mitigate its severity. Many complex, highly detailed models have been developed to help policy setters make better decisions. However, the basis of these models is unlikely to be understood by non-experts. We describe the advantages of simple models for COVID-19. We say a model is “simple” if its only parameter is the rate of contact between people in the population. This contact rate can vary over time, depending on choices by policy setters. Such models can be understood by a broad audience, and thus can be helpful in explaining the policy decisions to the public. They can be used to evaluate the outcomes of different policies. However, simple models have a disadvantage when dealing with inhomogeneous populations. To augment the power of a simple model to evaluate complicated situations, we add what we call “satellite” equations that do not change the original model. For example, with the help of a satellite equation, one can estimate an individual's chance of remaining uninfected through the end of an epidemic. Satellite equations can model the effects of the epidemic on high-risk individuals, death rates, and nursing homes and other isolated populations. To compare simple models with complex models, we introduce our “slightly complex” Model J. We find that the conclusions of simple and complex models can be quite similar. However, for each added complexity, a modeler may have to choose additional parameter values describing who will infect whom under what conditions, choices for which there is often little rationale but that can have big impacts on predictions. Our simulations suggest that the added complexity offers little predictive advantage.
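The flavor of such a model can be sketched in a few lines (an illustrative toy, not the authors' Model J): the only model input is a time-varying contact rate, and a satellite equation tracks one susceptible individual's probability of escaping infection without changing the core model. All numbers are invented.

```python
N, I0, days = 1_000_000, 100, 300
recovery = 0.1                       # inverse of the infectious period (per day)

def contact_rate(t):
    # policy intervention reduces contacts after day 60 (illustrative numbers)
    return 0.3 if t < 60 else 0.12

S, I, p_uninfected = N - I0, I0, 1.0
for t in range(days):
    beta = contact_rate(t)
    new_infections = beta * S * I / N
    S -= new_infections
    I += new_infections - recovery * I
    # satellite equation: daily escape probability for one susceptible person,
    # computed alongside the model without altering S or I
    p_uninfected *= 1.0 - beta * I / N

print(f"attack rate: {1 - S / N:.1%}, chance of escaping infection: {p_uninfected:.1%}")
```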


Author(s):  
Abhinay Kumar Reddy

Agriculture is the backbone of the Indian economy: about 70% of the population relies on it, and it contributes a large share of GDP. Diseases in plants, especially in the leaves, reduce both the quality and the quantity of agricultural products, and the human eye is not powerful enough to detect minute differences in the infected parts of a leaf. In this paper, we offer a software solution to automatically detect and diagnose plant leaf diseases. We use image processing techniques so that early diagnosis can be performed, which should improve crop production. The approach involves several steps: image acquisition, image pre-processing, segmentation, feature extraction, and neural-network classification. Following the success of deep learning in computer vision, many researchers have recently advocated using it to improve the effectiveness of diagnostic programs. Unfortunately, most of these studies did not use the latest deep architectures and were based on AlexNet, GoogleNet, or similar networks, and they rarely employed interpretability mechanisms, leaving these deep classifiers as opaque black boxes. In this project, we tested several Convolutional Neural Network (CNN) architectures, using various learning strategies, on a public plant disease classification database. These newer architectures surpass previous results on plant disease classification with very high accuracy. In addition, we propose the use of saliency maps as a way to visualize and interpret CNN classifications.
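As a hedged sketch of the saliency-map visualization mentioned above (not the paper's exact pipeline), the gradient of the predicted class score with respect to the input image highlights the leaf regions driving a CNN's decision; the pretrained network and the file `leaf.jpg` are stand-in placeholders.

```python
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet50(weights="IMAGENET1K_V2").eval()   # stand-in for a plant-disease CNN
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("leaf.jpg").convert("RGB")).unsqueeze(0)
img.requires_grad_(True)

logits = model(img)
logits[0, logits.argmax()].backward()          # gradient of the top class score
saliency = img.grad.abs().max(dim=1).values    # per-pixel importance map, shape (1, 224, 224)
```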


Author(s):  
Davide Nicolini ◽  
Andrea Lippi ◽  
Pedro Monteiro

In this chapter, the authors investigate how the best practices approach “diffused” in the Italian public sector. They show that despite the lack of a clear original model or a strong brokering agency—and the considerable changes this management innovation went through in its arrival in Italy—the result was not complete idiosyncrasy. Rather, clear adaptation patterns and systematic heterogeneity emerged. They argue that the bottom-up emergence of such patterns can be explained by paying attention to the very nature of the public-sector field. They use these findings to develop a framework that accounts for the convergence/divergence of adaptation patterns in the “diffusion” of management innovations based on power relations between innovation brokers and adopters.


Author(s):  
Kevin B. McGrattan ◽  
Michelle Donnelly ◽  
Anthony Hamins ◽  
Eric Johnnson ◽  
Alex Maranghides ◽  
...  

In cooperation with the fire protection engineering community, a computational fire model, Fire Dynamics Simulator (FDS), is being developed at NIST to study fire behavior and to evaluate the performance of fire protection systems in buildings. The software was released into the public domain in 2000 and has since been used for a wide variety of analyses by fire protection engineers. An ongoing need is to develop and validate new sub-models. Fire experiments are conducted for a variety of reasons, and model predictions of these experiments have gradually improved over the past few decades. However, as the models become more detailed, so must the measurements. The bulk of available large-scale test data consists of temperature (thermocouple) measurements made at various points above a fire or throughout an enclosure. While it is useful to compare model predictions with these measurements, one can only gauge how closely the model reproduces the given data; there is often no way to infer why the model and experiment disagree, and thus no way to improve the model. It is also difficult to separate the various physical phenomena in a large-scale fire test so that combustion, radiation, and heat transfer algorithms can be evaluated independently. For example, the heat release rate of the fire governs the rate at which energy is added to the system, convective and radiative transport distribute the energy throughout, and thermal conduction drains some of the energy from the system. The measured value of temperature, heat flux, or gas concentration at any one point depends on all of these physical processes, and uncertainties in each phase of the calculation tend to combine in a non-linear way that affects the prediction.

