Positional Mask Attention for Video Sequence Modeling

Abstract For robust three-dimensional video transmission over error-prone channels, this article proposes an efficient multiple description coding scheme for multi-view video based on the correlation of spatial polyphase transformed subsequences (CSPT_MDC_MVC). The input multi-view video sequence is first separated into four subsequences by a spatial polyphase transform, and these are then grouped into two descriptions. Because macroblocks at corresponding positions in the subsequences are correlated, the subsequences need not all be coded in the same way. In each description, one subsequence is coded directly by the Joint Multi-view Video Coding (JMVC) encoder and the other subsequence is classified into four sets. According to the classification, the indirectly coded subsequence selectively reuses the prediction mode and the prediction vector of its directly coded counterpart, which reduces the bitrate and the coding complexity of multiple description coding for multi-view video. On the decoder side, gradient-based directional interpolation is employed to improve the quality of the side reconstruction. The effectiveness and robustness of the proposed algorithm are verified by experiments on the JMVC coding platform.
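The first step of the abstract above, splitting each frame into four polyphase subsequences and grouping them into two descriptions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the 2x2 subsampling pattern is the usual reading of a spatial polyphase transform, and the diagonal pairing of subframes into descriptions is an assumption, since the abstract does not say how the grouping is done.

```python
import numpy as np

# Row/column offsets of the four polyphase components of a 2x2 subsampling grid.
OFFSETS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def polyphase_split(frame):
    """Split one frame into four subframes by 2x2 polyphase subsampling."""
    return [frame[r::2, c::2] for r, c in OFFSETS]


def group_descriptions(subframes):
    """Pair the four subframes into two descriptions.

    Assumption: diagonal pairing (the abstract does not specify the grouping).
    """
    s00, s01, s10, s11 = subframes
    return (s00, s11), (s01, s10)


def polyphase_merge(subframes, shape):
    """Interleave the four subframes back into a full-resolution frame."""
    frame = np.empty(shape, dtype=subframes[0].dtype)
    for sub, (r, c) in zip(subframes, OFFSETS):
        frame[r::2, c::2] = sub
    return frame


# Toy 4x4 "frame" standing in for one picture of one view.
frame = np.arange(16).reshape(4, 4)
subs = polyphase_split(frame)
description_1, description_2 = group_descriptions(subs)
restored = polyphase_merge(subs, frame.shape)
```

When both descriptions arrive, `polyphase_merge` restores the frame exactly. When only one arrives, the missing samples have to be interpolated from the received ones, which is where the gradient-based directional interpolation mentioned in the abstract would apply.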


Improved super resolution reconstruction method for video sequence

Journal of Computer Applications, 2010, Vol 29 (12), pp. 3310-3313. DOI: 10.3724/sp.j.1087.2009.03310

Author(s): Qi YUAN, Shu-xu JING

Keyword(s): Video Sequence, Super Resolution, Reconstruction Method


Sequence to Sequence Modeling for User Simulation in Dialog Systems

2017. DOI: 10.21437/interspeech.2017-161. Cited By ~ 3

Author(s): Paul Crook, Alex Marin

Keyword(s): Dialog Systems, Sequence Modeling, User Simulation


Particle Filter Algorithm for Object Tracking in Video Sequence Based on Chromatic Information

International Journal for Research in Applied Science and Engineering Technology, 2018, Vol 6 (4), pp. 4044-4049. DOI: 10.22214/ijraset.2018.4667

Author(s): Prachi R. Narkhede

Keyword(s): Particle Filter, Object Tracking, Video Sequence, Particle Filter Algorithm


Understanding the clinical reasoning processes involved in the management of multimorbidity in an ambulatory setting: study protocol of a stimulated recall research

BMC Medical Education, 2021, Vol 21 (1). DOI: 10.1186/s12909-020-02459-w

Author(s): M.-C. Audétat, S. Cairo Notari, J. Sader, C. Ritz, T. Fassier, ...

Keyword(s): Primary Care, Clinical Reasoning, Primary Care Physicians, Video Sequence, Structured Interview, Ambulatory Setting, Structured Interviews, Stimulated Recall, Study Results

Abstract Background: Primary care physicians are at the very heart of managing patients suffering from multimorbidity. However, several studies have highlighted that some physicians feel ill-equipped to manage these kinds of complex clinical situations. Few studies are available on the clinical reasoning processes at play during the long-term management and follow-up of patients suffering from multimorbidity. This study aims to contribute to a better understanding of how the clinical reasoning of primary care physicians is affected during follow-up consultations with these patients. Methods: A qualitative research project based on semi-structured interviews with primary care physicians in an ambulatory setting will be carried out, using the video stimulated recall interview method. Participants will be filmed in their work environment during a standard consultation with a patient suffering from multimorbidity, using a “button camera” (a small camera) pinned to their white coat. The recording will then be used in a semi-structured interview between the physician and the research team to prompt a stimulated recall. Stimulated recall is a research method that allows the investigation of cognitive processes by inviting participants to recall their concurrent thinking during an event when prompted by a video sequence of that event. During this interview, participants will be shown different video sequences and asked to discuss them; the aim is to encourage them to make their clinical reasoning processes explicit. Fifteen to twenty interviews are planned to reach data saturation. The interviews will be transcribed verbatim and the data will be analysed according to a standard content analysis, using deductive and inductive approaches. Conclusion: Study results will contribute to the scientific community’s overall understanding of clinical reasoning. This will subsequently give future generations of primary care physicians access to more adequate training in managing patients suffering from multimorbidity in their practice. As a result, this will improve the quality of patients’ care and treatment.


Gesture Recognition Based on Video Sequence with B3D Convolutional Neural Network

2020 IEEE 5th International Conference on Signal and Image Processing (ICSIP), 2020. DOI: 10.1109/icsip49896.2020.9339260

Author(s): Li Shao, Xu Chao, Zhang Lirong

Keyword(s): Neural Network, Convolutional Neural Network, Gesture Recognition, Video Sequence


What You Say or How You Say It? Depression Detection Through Joint Modeling of Linguistic and Acoustic Aspects of Speech

Cognitive Computation, 2021. DOI: 10.1007/s12559-020-09808-3

Author(s): Nujud Aloshban, Anna Esposito, Alessandro Vinciarelli

Keyword(s): Short Term Memory, Joint Modeling, Joint Analysis, Health Issues, Multimodal Analysis, Sequence Modeling, Depression Detection, Long Short Term Memory, Joint Representation, Better Than

Abstract Depression is one of the most common mental health issues. (It affects more than 4% of the world’s population, according to recent estimates.) This article shows that the joint analysis of linguistic and acoustic aspects of speech allows one to discriminate between depressed and nondepressed speakers with an accuracy above 80%. The approach used in the work is based on networks designed for sequence modeling (bidirectional Long-Short Term Memory networks) and multimodal analysis methodologies (late fusion, joint representation and gated multimodal units). The experiments were performed over a corpus of 59 interviews (roughly 4 hours of material) involving 29 individuals diagnosed with depression and 30 control participants. In addition to an accuracy of 80%, the results show that multimodal approaches perform better than unimodal ones owing to people’s tendency to manifest their condition through one modality only, a source of diversity across unimodal approaches. In addition, the experiments show that it is possible to measure the “confidence” of the approach and automatically identify a subset of the test data in which the performance is above a predefined threshold. It is possible to effectively detect depression by using unobtrusive and inexpensive technologies based on the automatic analysis of speech and language.
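Two ideas from the abstract above, late fusion of unimodal outputs and selecting a high-confidence subset of the test data, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the fusion rule or the confidence measure, so the sketch assumes simple averaging of the two unimodal probabilities and takes confidence to be the distance of the fused probability from 0.5. The function names and the input values are hypothetical.

```python
import numpy as np


def late_fusion(p_linguistic, p_acoustic):
    """Fuse unimodal depression probabilities by averaging (assumed rule)."""
    return (np.asarray(p_linguistic) + np.asarray(p_acoustic)) / 2.0


def confident_subset(p_fused, threshold):
    """Return a boolean mask of samples whose confidence reaches the threshold.

    Assumption: confidence is the distance of the fused probability from 0.5,
    rescaled to the range [0, 1].
    """
    confidence = np.abs(p_fused - 0.5) * 2.0
    return confidence >= threshold


# Made-up per-speaker probabilities from two hypothetical unimodal models.
p_linguistic = np.array([0.9, 0.6, 0.2])
p_acoustic = np.array([0.8, 0.5, 0.1])

p_fused = late_fusion(p_linguistic, p_acoustic)
predicted_depressed = p_fused >= 0.5
keep = confident_subset(p_fused, threshold=0.5)
```

Here the second speaker's fused probability is close to 0.5, so that sample falls outside the confident subset while the other two are kept.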


Stock Price Prediction using Bi-Directional LSTM based Sequence to Sequence Modeling and Multitask Learning

2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), 2020. DOI: 10.1109/uemcon51285.2020.9298066

Author(s): Siddartha Mootha, Sashank Sridhar, Rahul Seetharaman, S. Chitrakala

Keyword(s): Stock Price, Multitask Learning, Sequence Modeling, Stock Price Prediction, Price Prediction
