Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA

2020 ◽  
Vol 34 (07) ◽  
pp. 11848-11855 ◽  
Author(s):  
Badri Patro ◽  
Anupriy ◽  
Vinay Namboodiri

In this paper, we aim to obtain improved attention for a visual question answering (VQA) task. It is challenging to provide supervision for attention. An observation we make is that visual explanations obtained through class activation mappings (specifically Grad-CAM), which are meant to explain the performance of various networks, could form a means of supervision. However, as the distributions of attention maps and of Grad-CAMs differ, it would not be suitable to use these directly as a form of supervision. Rather, we propose the use of a discriminator that aims to distinguish samples of visual explanations from attention maps. The use of adversarial training of the attention regions as a two-player game between attention and explanation serves to bring the distributions of attention maps and visual explanations closer. Significantly, we observe that providing such a means of supervision also results in attention maps that are more closely related to human attention, resulting in a substantial improvement over baseline stacked attention network (SAN) models. It also results in a good improvement in the rank correlation metric on the VQA task. This method can also be combined with recent MCB-based methods and results in consistent improvement. We also provide comparisons with other means of learning distributions, such as those based on Correlation Alignment (Coral), Maximum Mean Discrepancy (MMD) and Mean Square Error (MSE) losses, and observe that the adversarial loss outperforms the other forms of learning the attention maps. Visualization of the results also confirms our hypothesis that attention maps improve using this form of supervision.
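The two simplest baseline losses named in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the flattened-map representation, the RBF kernel, its bandwidth `sigma`, and all function names are assumptions, and the adversarial (discriminator) loss itself is not shown.

```python
import math

def normalize(m):
    """Scale a flattened, non-negative map so that its entries sum to 1."""
    total = sum(m)
    return [v / total for v in m]

def mse_loss(a, b):
    """Mean square error between two flattened maps of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rbf(x, y, sigma=1.0):
    """RBF kernel between two flattened maps (kernel choice is an assumption)."""
    sq = sum((p - q) ** 2 for p, q in zip(x, y))
    return math.exp(-sq / (2 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two batches of flattened maps."""
    kxx = sum(rbf(a, b, sigma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, sigma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, sigma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

# Hypothetical toy batches: attention maps vs. Grad-CAM maps over 4 regions.
attention = [normalize([7, 1, 1, 1]), normalize([6, 2, 1, 1])]
grad_cam = [normalize([1, 1, 1, 7]), normalize([1, 1, 2, 6])]
print(mse_loss(attention[0], grad_cam[0]), mmd2(attention, grad_cam))
```

MSE compares maps pair by pair, whereas MMD compares the two batches as distributions, which is closer in spirit to the distribution matching the abstract describes.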

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 717
Author(s):  
Mariia Nazarkevych ◽  
Natalia Kryvinska ◽  
Yaroslav Voznyi

This article presents a new method of image filtering based on a new kind of image processing transformation, the wavelet-Ateb–Gabor transformation, which provides a wider basis for Gabor functions. Ateb functions are symmetric functions. The developed type of filtering makes it possible to perform image transformation and to obtain better biometric image recognition results than traditional filters allow. These results are possible due to the construction of various forms and sizes of the curves of the developed functions. Further, the wavelet transformation of Gabor filtering is investigated, and the time spent by the system on the operation is substantiated. The filtration is based on images taken from NIST Special Database 302, which is publicly available. The reliability of the proposed method of wavelet-Ateb–Gabor filtering is proved by calculating and comparing the values of peak signal-to-noise ratio (PSNR) and mean square error (MSE) between two biometric images, one of which is filtered by the developed filtration method, and the other by the Gabor filter. The time characteristics of this filtering process are studied as well.
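The two comparison metrics mentioned above, MSE and PSNR, can be computed as in the following minimal sketch. It assumes 8-bit grayscale images stored as lists of rows (so the peak value is 255); the function names are illustrative and the filtering step itself is not shown.

```python
import math

def mse(img_a, img_b):
    """Mean square error between two equally sized grayscale images."""
    total, count = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(img_a, img_b, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(img_a, img_b)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)

# Hypothetical 2x2 images that differ in a single pixel.
reference = [[0, 0], [0, 0]]
filtered = [[0, 0], [0, 10]]
print(mse(reference, filtered), psnr(reference, filtered))
```

A lower MSE and a higher PSNR both indicate that the two images are closer to each other.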


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3311
Author(s):  
Riccardo Ballarini ◽  
Marco Ghislieri ◽  
Marco Knaflitz ◽  
Valentina Agostini

In motor control studies, the 90% thresholding of variance accounted for (VAF) is the classical way of selecting the number of muscle synergies expressed during a motor task. However, the adoption of an arbitrary cut-off has evident drawbacks. The aim of this work is to describe and validate an algorithm for choosing the optimal number of muscle synergies (ChoOSyn), which can overcome the limitations of VAF-based methods. The proposed algorithm is built considering the following principles: (1) muscle synergies should be highly consistent during the various motor task epochs (i.e., remaining stable in time), (2) muscle synergies should constitute a base with low intra-level similarity (i.e., to obtain information-rich synergies, avoiding redundancy). The algorithm's performance was evaluated against traditional approaches (threshold-VAF at 90% and 95%, elbow-VAF and plateau-VAF), using both a simulated dataset and a real dataset of 20 subjects. The performance evaluation was carried out by analyzing muscle synergies extracted from surface electromyographic (sEMG) signals collected during walking tasks lasting 5 min. On the simulated dataset, ChoOSyn performed comparably to VAF-based methods, while, in the real dataset, it clearly outperformed the other methods in terms of the fraction of correct classifications, mean error (ME), and root mean square error (RMSE). The proposed approach may be beneficial to standardize the selection of the number of muscle synergies between different research laboratories, independent of arbitrary thresholds.
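For reference, the classical threshold-VAF baseline (not the ChoOSyn algorithm, whose details are not given in the abstract) can be sketched as below. The uncentered VAF definition, 1 - SSE/SST with SST taken as the sum of squared original samples, is an assumption, as are the function names and the toy VAF values.

```python
def vaf(original, reconstructed):
    """Variance accounted for: 1 - SSE / SST (uncentered form, an assumption)."""
    sse = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    sst = sum(o ** 2 for o in original)
    return 1.0 - sse / sst

def threshold_vaf(vaf_by_n, threshold=0.90):
    """Return the smallest number of synergies whose VAF reaches the threshold.

    vaf_by_n maps a candidate number of synergies to the VAF obtained with it.
    Returns None if no candidate reaches the threshold.
    """
    for n in sorted(vaf_by_n):
        if vaf_by_n[n] >= threshold:
            return n
    return None

# Hypothetical VAF values for 1 to 4 synergies.
vaf_by_n = {1: 0.70, 2: 0.85, 3: 0.92, 4: 0.97}
print(threshold_vaf(vaf_by_n, 0.90), threshold_vaf(vaf_by_n, 0.95))
```

The example makes the drawback noted in the abstract concrete: moving the cut-off from 90% to 95% changes the selected number of synergies.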


1995 ◽  
Vol 74 (6) ◽  
pp. 2665-2684 ◽  
Author(s):  
Y. Kondoh ◽  
Y. Hasegawa ◽  
J. Okuma ◽  
F. Takahashi

1. A computational model accounting for motion detection in the fly was examined by comparing responses in motion-sensitive horizontal system (HS) and centrifugal horizontal (CH) cells in the fly's lobula plate with a computer simulation implemented on a motion detector of the correlation type, the Reichardt detector. First-order (linear) and second-order (quadratic nonlinear) Wiener kernels from intracellularly recorded responses to moving patterns were computed by cross correlating with the time-dependent position of the stimulus, and were used to characterize response to motion in those cells. 2. When the fly was stimulated with moving vertical stripes with a spatial wavelength of 5-40 degrees, the HS and CH cells showed basically a biphasic first-order kernel, having an initial depolarization that was followed by hyperpolarization. The linear model matched well with the actual response, with a mean square error of 27% at best, indicating that the linear component comprises a major part of responses in these cells. The second-order nonlinearity was insignificant. When stimulated at a spatial wavelength of 2.5 degrees, the first-order kernel showed a significant decrease in amplitude, and was initially hyperpolarized; the second-order kernel was, on the other hand, well defined, having two hyperpolarizing valleys on the diagonal with two off-diagonal peaks. 3. The blockage of inhibitory interactions in the visual system by application of 10⁻⁴ M picrotoxin, however, evoked a nonlinear response that could be decomposed into the sum of the first-order (linear) and second-order (quadratic nonlinear) terms with a mean square error of 30-50%. The first-order term, comprising 10-20% of the picrotoxin-evoked response, is characterized by a differentiating first-order kernel. It thus codes the velocity of motion. 
The second-order term, comprising 30-40% of the response, is defined by a second-order kernel with two depolarizing peaks on the diagonal and two off-diagonal hyperpolarizing valleys, suggesting that the nonlinear component represents the power of motion. 4. Responses in the Reichardt detector, consisting of two mirror-image subunits with spatiotemporal low-pass filters followed by a multiplication stage, were computer simulated and then analyzed by the Wiener kernel method. The simulated responses were linearly related to the pattern velocity (with a mean square error of 13% for the linear model) and matched well with the observed responses in the HS and CH cells. After the multiplication stage, the linear component comprised 15-25% and the quadratic nonlinear component comprised 60-70% of the simulated response, which was similar to the picrotoxin-induced response in the HS cells. The quadratic nonlinear components were balanced between the right and left sides, and could be eliminated completely by their contralateral counterpart via a subtraction process. On the other hand, the linear component on one side was the mirror image of that on the other side, as expected from the kernel configurations. 5. These results suggest that responses to motion in the HS and CH cells depend on the multiplication process in which both the velocity and power components of motion are computed, and that a putative subtraction process selectively eliminates the nonlinear components but amplifies the linear component. The nonlinear component is directionally insensitive because of its quadratic non-linearity. Therefore the subtraction process allows the subsequent cells integrating motion (such as the HS cells) to tune the direction of motion more sharply.
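The detector structure described in point 4 (two mirror-image subunits, a low-pass filter, a multiplication stage, then subtraction) can be illustrated with the following minimal sketch. It is not the simulation used in the paper: the filter here is purely temporal and first-order, the stimulus is a drifting sinusoid sampled at two points, and every parameter value and name is an assumption chosen for illustration.

```python
import math

def low_pass(signal, dt, tau):
    """First-order temporal low-pass filter (discrete exponential smoothing)."""
    alpha = dt / (tau + dt)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def reichardt_mean_response(direction, wavelength_deg=20.0, spacing_deg=2.0,
                            temporal_freq_hz=2.0, tau=0.05, dt=0.001,
                            duration=3.0, settle=1.0):
    """Mean steady-state output for a grating drifting in direction +1 or -1."""
    omega = 2 * math.pi * temporal_freq_hz
    phase = 2 * math.pi * spacing_deg / wavelength_deg
    n = int(round(duration / dt))
    # Inputs at two sampling points; for direction +1, input b lags input a.
    a = [math.sin(omega * i * dt) for i in range(n)]
    b = [math.sin(omega * i * dt - direction * phase) for i in range(n)]
    a_lp = low_pass(a, dt, tau)
    b_lp = low_pass(b, dt, tau)
    # Each subunit multiplies one filtered input by the other unfiltered input;
    # the two mirror-image subunits are then subtracted.
    out = [a_lp[i] * b[i] - a[i] * b_lp[i] for i in range(n)]
    steady = out[int(round(settle / dt)):]  # discard the filter transient
    return sum(steady) / len(steady)

print(reichardt_mean_response(+1), reichardt_mean_response(-1))
```

With these assumed parameters the mean output is positive for one direction and equal in magnitude but negative for the other, which illustrates how the subtraction stage yields a directionally selective signal.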


2012 ◽  
Vol 602-604 ◽  
pp. 776-780
Author(s):  
Zhi Qiang Li ◽  
Mei Li ◽  
Wei Jia Fan

Poly(3-hydroxybutyrate-co-4-hydroxybutyrate) copolymer [P(3HB-co-4HB)] is a biodegradable high-molecular-weight polymer produced by bioaccumulation. Because of their good biodegradability and biocompatibility, P(3HB-co-4HB)s have attracted wide attention. First, the intrinsic viscosity [η] of P(3HB-co-4HB)s with varying 4HB contents was investigated in a good solvent at different temperatures. Second, the changes in the crystalline aggregation state caused by the varying 4HB content were observed with a polarizing microscope. The results show that, for P(3HB-co-4HB)s of the same molecular weight, the intrinsic viscosity [η] in a good solvent barely changes as the mole fraction of 4HB increases. On the other hand, the mean square end-to-end distances [0] of the flexible macromolecular chains increase with the mole fraction of 4HB. At the same time, the state of aggregation changes from spherulites to dendrites. In this investigation, we discuss the reasons for these differences in depth.


1991 ◽  
Vol 69 (7) ◽  
pp. 433-441 ◽  
Author(s):  
Jack Ferrier ◽  
Angela Kesthely ◽  
Eva Lagan ◽  
Conrad Richter

A model for cytosolic Ca2+ spikes is presented that incorporates continual influx of Ca2+, uptake into an intracellular compartment, and Ca2+-induced Ca2+ release from the compartment. Two versions are used. In one, release is controlled by explicit thresholds, while in the other, release is a continuous function of cytosolic and compartmental [Ca2+]. Some model predictions are as follows. Starting with low Ca2+ influx and no spikes: (1) induction of spiking when Ca2+ influx is increased. Starting with spikes: (2) increase in magnitude and decrease in frequency when influx is reduced; (3) inhibition of spiking if influx is greatly reduced; (4) decrease in the root-mean-square value when influx is increased; and (5) elimination of spiking if influx is greatly increased. Since there is good evidence that hyperpolarizing spikes reflect cytosolic Ca2+ spikes, we used electrophysiological measurements to test the model. Each model prediction was confirmed by experiments in which Ca2+ influx was manipulated. However, the original spike activity tended to return within 5–30 min, indicating a cellular resetting process. Key words: calcium, electrophysiology, mathematical modelling.


2000 ◽  
Vol 9 (5) ◽  
pp. 448-462 ◽  
Author(s):  
José Pablo Zagal ◽  
Miguel Nussbaum ◽  
Ricardo Rosas

Extensive research has shown that the act of play is extremely important in the lives of human beings. It is thus not surprising that games have a long and continuing history in the development of almost every culture and society. The advent of computers and technology in general has also been akin to the need for entertainment that every human being seeks. However, a curious dichotomy exists in the nature of electronic games: the vast majority of electronic games are individual in nature whereas the nonelectronic ones are collective by nature. On the other hand, recent technological breakthroughs are finally allowing for the implementation of electronic multiplayer games. Because of the limited experience in electronic, multiplayer game design, it becomes necessary to adapt existing expertise in the area of single-player game design to the realm of multiplayer games. This work presents a model to support the initial steps in the design process of multiplayer games. The model is defined in terms of the characteristics that are both inherent and special to multiplayer games but also related to the relevant elements of a game in general. Additionally, the model is used to assist in the design of two multiplayer games. “One of the most difficult tasks people can perform, however much others may despise it, is the invention of good games …”


2019 ◽  
Vol 11 (13) ◽  
pp. 1598 ◽  
Author(s):  
Hua Su ◽  
Xin Yang ◽  
Wenfang Lu ◽  
Xiao-Hai Yan

Retrieving multi-temporal and large-scale thermohaline structure information of the interior of the global ocean based on surface satellite observations is important for understanding the complex and multidimensional dynamic processes within the ocean. This study proposes a new ensemble learning algorithm, extreme gradient boosting (XGBoost), for retrieving subsurface thermohaline anomalies, including the subsurface temperature anomaly (STA) and the subsurface salinity anomaly (SSA), in the upper 2000 m of the global ocean. The model combines surface satellite observations and in situ Argo data for estimation, and uses root-mean-square error (RMSE), normalized root-mean-square error (NRMSE), and R² as accuracy evaluations. The results show that the proposed XGBoost model can easily retrieve subsurface thermohaline anomalies and outperforms the gradient boosting decision tree (GBDT) model. The XGBoost model had good performance with average R² values of 0.69 and 0.54, and average NRMSE values of 0.035 and 0.042, for STA and SSA estimations, respectively. The thermohaline anomaly patterns presented obvious seasonal variation signals in the upper layers (the upper 500 m); however, these signals became weaker as the depth increased. The model performance fluctuated, with the best performance in October (autumn) for both STA and SSA, and the lowest accuracy occurred in January (winter) for STA and April (spring) for SSA. The STA estimation error mainly occurred in the El Niño-Southern Oscillation (ENSO) region in the upper ocean and the boundary of the ocean basins in the deeper ocean; meanwhile, the SSA estimation error presented a relatively even distribution. The wind speed anomalies, including the u and v components, contributed more to the XGBoost model for both STA and SSA estimations than the other surface parameters; however, their importance at deeper layers decreased and the contributions of the other parameters increased. 
This study provides an effective remote sensing technique for subsurface thermohaline estimations and further promotes long-term remote sensing reconstructions of internal ocean parameters.
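The three accuracy measures named in the abstract can be computed as in the sketch below. The abstract does not say how NRMSE is normalized; dividing RMSE by the range of the observations is an assumption made here, and the function names and toy values are illustrative only.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observations and predictions."""
    sq = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sq / len(observed))

def nrmse(observed, predicted):
    """RMSE normalized by the observed range (normalization is an assumption)."""
    return rmse(observed, predicted) / (max(observed) - min(observed))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical anomaly values: observed (e.g. Argo) vs. model estimates.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(observed, predicted), nrmse(observed, predicted),
      r_squared(observed, predicted))
```

Because NRMSE is scale-free, it allows errors for temperature and salinity anomalies, which have different units, to be compared side by side.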


Energies ◽  
2020 ◽  
Vol 13 (23) ◽  
pp. 6378
Author(s):  
S. M. Mahfuz Alam ◽  
Mohd. Hasan Ali

This work proposes two non-linear and one linear equation-based systems for residential load forecasting considering heating degree days, cooling degree days, occupancy, and day type, which are applicable to any residential building with small sets of smart meter data. The coefficients of the proposed nonlinear and linear equations are tuned by particle swarm optimization (PSO) and the multiple linear regression method, respectively. For the purpose of comparison, a subtractive clustering based adaptive neuro fuzzy inference system (ANFIS), random forests, gradient boosting trees, a long short-term memory neural network, and conventional and modified support vector regression methods were considered. Simulations have been performed in the MATLAB environment, and all the methods were tested with 30 days of randomly chosen data from a residential building in Memphis City for energy consumption prediction. The absolute average error, root mean square error, and mean average percentage errors are tabulated and considered as performance indices. The efficacy of the proposed systems for residential load forecasting over the other systems has been validated by both simulation results and performance indices, which indicate that the proposed equation-based systems have the lowest absolute average errors, root mean square errors, and mean average percentage errors compared to the other methods. In addition, the proposed systems can be easily implemented in practice.


1993 ◽  
Vol 30 (03) ◽  
pp. 627-638
Author(s):  
M. T. Dixon

An arbitrary number of competitors are presented with independent Poisson streams of offers consisting of independent and identically distributed random variables having the uniform distribution on [0, 1]. The players each wish to accept a single offer before a known time limit is reached and each aim to maximize the expected value of their offer. Rejected offers may not be recalled, but they are passed on to the other players according to a known transition matrix. This paper finds equilibrium points for two such games, and demonstrates a two-player game with an equilibrium point under which the player with the faster stream of offers has a lower expected reward than his opponent.


Author(s):  
Xiao Yang ◽  
Madian Khabsa ◽  
Miaosen Wang ◽  
Wei Wang ◽  
Ahmed Hassan Awadallah ◽  
...  

Community-based question answering (CQA) websites represent an important source of information. As a result, the problem of matching the most valuable answers to their corresponding questions has become an increasingly popular research topic. We frame this task as a binary (relevant/irrelevant) classification problem, and present an adversarial training framework to alleviate the label imbalance issue. We employ a generative model to iteratively sample a subset of challenging negative samples to fool our classification model. Both models are alternately optimized using the REINFORCE algorithm. The proposed method is completely different from previous ones, where negative samples in the training set are directly used or uniformly down-sampled. Further, we propose using Multi-scale Matching, which explicitly inspects the correlation between words and n-grams of different levels of granularity. We evaluate the proposed method on the SemEval 2016 and SemEval 2017 datasets and achieve state-of-the-art or similar performance.

