Transformer-based CNNs: Mining Temporal Context Information for Multi-sound COVID-19 Diagnosis

Author(s):  
Yi Chang ◽  
Zhao Ren ◽  
Bjorn W. Schuller
1993 ◽  
Vol 31 (2) ◽  
pp. 137-143 ◽  
Author(s):  
James T. Becker ◽  
Jeanne Wess ◽  
Nicola M. Hunkin ◽  
Alan J. Parkin

Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4764
Author(s):  
Hao Sun ◽  
Jiaqing Liu ◽  
Shurong Chai ◽  
Zhaolin Qiu ◽  
Lanfen Lin ◽  
...  

Depression is a severe psychological condition that affects millions of people worldwide. As depression has received more attention in recent years, it has become imperative to develop automatic methods for detecting depression. Although numerous machine learning methods have been proposed for estimating the levels of depression via audio, visual, and audiovisual emotion sensing, several challenges remain. For example, it is difficult to extract long-term temporal context information from long sequences of audio and visual data, and it is also difficult to select and fuse useful multi-modal information or features effectively. A further challenge is how to incorporate other information or tasks to improve estimation accuracy. In this study, we propose a multi-modal adaptive fusion transformer network for estimating the levels of depression. Transformer-based models have achieved state-of-the-art performance in language understanding and sequence modeling. We therefore use the proposed transformer-based network to extract long-term temporal context information from uni-modal audio and visual data. This is the first transformer-based approach for depression detection. We also propose an adaptive fusion method for combining useful multi-modal features. Furthermore, inspired by current multi-task learning work, we incorporate an auxiliary task (depression classification) to enhance the main task of depression level regression (estimation). The effectiveness of the proposed method has been validated on a public dataset (AVEC 2019 Detecting Depression with AI Sub-challenge) in terms of the PHQ-8 scores. Experimental results indicate that the proposed method outperforms current state-of-the-art methods. Our proposed method achieves a concordance correlation coefficient (CCC) of 0.733 on AVEC 2019, which is 6.2% higher than the result (CCC = 0.696) of the state-of-the-art method.
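
The abstract above reports its results as a concordance correlation coefficient (CCC). As a minimal sketch of how that metric is usually computed (Lin's CCC with population moments), and not the authors' evaluation code, the function below uses only the Python standard library. The function name `concordance_cc` and the toy score lists are illustrative assumptions.

```python
from statistics import fmean


def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments).

    CCC = 2 * cov(t, p) / (var(t) + var(p) + (mean(t) - mean(p)) ** 2)
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denom


# Hypothetical scores, for illustration only.
labels = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]  # perfectly correlated but offset by one point
print(concordance_cc(labels, labels))   # identical sequences
print(concordance_cc(labels, shifted))  # the offset lowers the CCC
```

Unlike the Pearson correlation, the CCC penalizes a systematic offset between predictions and labels, which is why the shifted sequence scores below 1 even though it is perfectly correlated with the labels.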


Mathematics ◽  
2019 ◽  
Vol 7 (11) ◽  
pp. 1059 ◽  
Author(s):  
Yang ◽  
Wang ◽  
Miao ◽  
Yang ◽  
Zhao ◽  
...  

As a core component of intelligent monitoring, target tracking is the basis for video content analysis and processing. In visual tracking, occlusion, illumination changes, and pose and scale variation cause large appearance changes in the target object and the background over time, and handling these changes remains the main challenge for robust target tracking. In this paper, we present a new robust algorithm (STC-KF) based on the spatio-temporal context and Kalman filtering. Our approach introduces a novel formulation of the context information that uses the entire local region around the target, so that important target-related context is not lost, as it would be if only sparse key point information were used. The state of the object in the tracking process can be determined by the Euclidean distance of the image intensity in two consecutive frames. Then, the prediction value of the Kalman filter can be updated as the Kalman observation to the object position and marked on the next frame. The performance of the proposed STC-KF algorithm is evaluated and compared with the original STC algorithm. The experimental results on benchmark sequences indicate that the proposed method outperforms the original STC algorithm under the conditions of heavy occlusion and large appearance changes.
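
The abstract above pairs a context-based tracker with Kalman filtering. The sketch below is not the STC-KF algorithm; it is a generic constant-velocity Kalman filter, written under the assumption that the tracker supplies an (x, y) position per frame, or `None` when the target is judged to be occluded, in which case the filter's prediction is used instead. The class name, the noise settings `q` and `r`, and the synthetic measurements are all illustrative assumptions.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one image axis with state [position, velocity]."""

    def __init__(self, pos, vel=0.0, p=1.0, q=0.01, r=1.0):
        self.pos, self.vel = pos, vel
        # 2x2 state covariance stored element by element.
        self.p00, self.p01, self.p10, self.p11 = p, 0.0, 0.0, p
        self.q, self.r = q, r  # assumed process and measurement noise

    def predict(self, dt=1.0):
        # x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]], Q = q * I.
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        # Position-only measurement: H = [1, 0].
        s = self.p00 + self.r                    # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain
        y = z - self.pos                         # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        p00, p01 = self.p00, self.p01
        self.p00 = (1 - k0) * p00                # P = (I - K H) P
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01
        return self.pos


def track(measurements, start):
    """Filter a list of (x, y) measurements; None marks an occluded frame."""
    kx = ConstantVelocityKalman1D(start[0])
    ky = ConstantVelocityKalman1D(start[1])
    path = []
    for z in measurements:
        x, y = kx.predict(), ky.predict()
        if z is not None:  # trust the measurement only when one is available
            x, y = kx.update(z[0]), ky.update(z[1])
        path.append((x, y))
    return path


# Synthetic target moving at (1, 2) pixels per frame, occluded for 3 frames.
frames = [(float(t), 2.0 * t) for t in range(1, 11)] + [None] * 3
for x, y in track(frames, start=(0.0, 0.0))[-4:]:
    print(round(x, 2), round(y, 2))
```

During the occluded frames the estimate keeps moving along the learned velocity, which is the general behavior that makes a Kalman prediction useful as a stand-in observation.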


2010 ◽  
Vol 41 (3) ◽  
pp. 131-136 ◽  
Author(s):  
Catharina Casper ◽  
Klaus Rothermund ◽  
Dirk Wentura

Processes involving an automatic activation of stereotypes in different contexts were investigated using a priming paradigm with the lexical decision task. The names of social categories were combined with background pictures of specific situations to yield a compound prime comprising category and context information. Significant category priming effects for stereotypic attributes (e.g., Bavarians – beer) emerged for fitting contexts (e.g., in combination with a picture of a marquee) but not for nonfitting contexts (e.g., in combination with a picture of a shop). Findings indicate that social stereotypes are organized as specific mental schemas that are triggered by a combination of category and context information.


Author(s):  
Veronika Lerche ◽  
Ursula Christmann ◽  
Andreas Voss

Abstract. In experiments by Gibbs, Kushner, and Mills (1991), sentences were supposedly either authored by poets or by a computer. Gibbs et al. (1991) concluded from their results that the assumed source of the text influences speed of processing, with a higher speed for metaphorical sentences in the Poet condition. However, the dependent variables used (e.g., mean RTs) do not allow clear conclusions regarding processing speed. It is also possible that participants had prior biases before the presentation of the stimuli. We conducted a conceptual replication and applied the diffusion model (Ratcliff, 1978) to disentangle a possible effect on processing speed from a prior bias. Our results are in accordance with the interpretation by Gibbs et al. (1991): The context information affected processing speed, not a priori decision settings. Additionally, analyses of model fit revealed that the diffusion model provided a good account of the data of this complex verbal task.
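
In the diffusion model, processing speed corresponds to the drift rate and a prior bias corresponds to the starting point of evidence accumulation. The sketch below is a forward simulation only, not the fitting procedure used in the study above; it illustrates that both parameters can raise the share of responses at the upper boundary, which is why response proportions or mean RTs alone cannot separate them. All parameter values, the Euler step size `dt`, and the function names are illustrative assumptions.

```python
import math
import random


def simulate_trial(rng, drift, boundary=1.0, start_frac=0.5,
                   dt=0.002, noise=1.0, max_t=10.0):
    """Simulate one two-boundary diffusion trial with Euler steps.

    Returns (hit_upper, decision_time). A trial that reaches max_t without
    crossing a boundary is counted as not hitting the upper boundary.
    """
    x = start_frac * boundary          # relative starting point (prior bias)
    t = 0.0
    step_sd = noise * math.sqrt(dt)
    while 0.0 < x < boundary and t < max_t:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return x >= boundary, t


def upper_rate(drift, start_frac, n_trials=400, seed=0):
    """Proportion of simulated trials that end at the upper boundary."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = sum(simulate_trial(rng, drift, start_frac=start_frac)[0]
               for _ in range(n_trials))
    return hits / n_trials


print("no drift, no bias:", upper_rate(0.0, 0.5))
print("drift only:       ", upper_rate(2.0, 0.5))
print("bias only:        ", upper_rate(0.0, 0.7))
```

Fitting the full response-time distributions, rather than comparing summary statistics, is what allows the two explanations to be told apart.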

