video description
Recently Published Documents


TOTAL DOCUMENTS

104
(FIVE YEARS 39)

H-INDEX

13
(FIVE YEARS 3)

Author(s):  
Benedetto Ielpo ◽  
Antonio Giuliani ◽  
Patricia Sanchez ◽  
Fernando Burdio ◽  
Mikel Gastaka ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yuhua Gao ◽  
Yong Mo ◽  
Heng Zhang ◽  
Ruiyin Huang ◽  
Zilong Chen

With the development of computer technology, video description, which combines key techniques from natural language processing and computer vision, has attracted growing attention from researchers. A central challenge in the field is how to describe high-speed, detail-rich sports videos objectively and efficiently. Existing video description methods often lack sufficient language-learning information, which leads to sentence errors and the loss of visual information in the generated description text. To address these problems, a multihead model combining a long short-term memory (LSTM) network with an attention mechanism is proposed for the intelligent description of volleyball videos. By introducing the attention mechanism, the model attends to the salient regions of the video when generating sentences. Comparative experiments with different models show that the attention mechanism effectively mitigates the loss of visual information. Compared with the LSTM and base models, the proposed multihead model combining the LSTM network and the attention mechanism scores higher on all evaluation metrics and significantly improves the quality of the intelligent text description of volleyball videos.
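The core of the attention mechanism described above, weighting per-frame visual features by their relevance to the current decoding step, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual architecture: the function name `attention_context`, the additive scoring form, and all dimensions are assumptions.

```python
import numpy as np

def attention_context(frame_feats, query, W):
    """Score each frame feature against the decoder query and
    return a softmax-weighted context vector (additive-attention sketch)."""
    # frame_feats: (T, d) per-frame visual features; query: (d,) decoder state
    scores = np.tanh(frame_feats @ W) @ query        # (T,) alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over the T frames
    context = weights @ frame_feats                  # (d,) attended visual context
    return context, weights

rng = np.random.default_rng(0)
T, d = 8, 16
feats = rng.normal(size=(T, d))   # stand-in for CNN frame features
query = rng.normal(size=d)        # stand-in for the LSTM hidden state
W = rng.normal(size=(d, d))       # learned projection (random here)
ctx, w = attention_context(feats, query, W)
print(ctx.shape, round(float(w.sum()), 6))
```

At each word-generation step the decoder would consume `ctx` alongside its hidden state, so frames with higher alignment scores contribute more to the emitted word.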


2021 ◽  
Author(s):  
Jawad Khan

Several recent studies on action recognition have emphasised the importance of explicitly including motion characteristics in the video description. This work shows that properly partitioning visual motion into dominant and residual motions greatly enhances action recognition, both for extracting space-time trajectories and for computing descriptors. We then create a new motion descriptor, the DCS descriptor, from differential motion scalar quantities: divergence, curl, and shear. It improves results by capturing additional information on local motion patterns. Finally, adopting the VLAD coding technique recently proposed for image retrieval improves action recognition significantly. On three difficult datasets, namely Hollywood 2, HMDB51, and Olympic Sports, our three contributions are complementary and outperform all reported results by a large margin.
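The divergence, curl, and shear quantities underlying a DCS-style descriptor can be computed from a dense 2D flow field with finite differences, as in the sketch below. The function name `dcs_maps` and the particular shear-magnitude formulation are illustrative assumptions; the paper's exact definitions may differ.

```python
import numpy as np

def dcs_maps(u, v):
    """Differential kinematic quantities of a 2D flow field (u, v):
    divergence, curl, and shear magnitude, via finite differences."""
    du_dy, du_dx = np.gradient(u)   # np.gradient returns (d/drow, d/dcol)
    dv_dy, dv_dx = np.gradient(v)
    div = du_dx + dv_dy                              # local expansion/contraction
    curl = dv_dx - du_dy                             # local rotation
    shear = np.hypot(du_dx - dv_dy, du_dy + dv_dx)   # local deformation magnitude
    return div, curl, shear

# Sanity check on a pure rotation field u = -y, v = x:
# divergence and shear should vanish, curl should be the constant 2.
y, x = np.mgrid[-5:6, -5:6].astype(float)
div, curl, shear = dcs_maps(-y, x)
print(np.allclose(div, 0), np.allclose(curl, 2), np.allclose(shear, 0))
```

A descriptor would then aggregate these scalar maps (e.g. as histograms along trajectories) rather than use them pointwise.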


2021 ◽  
Vol 3 (7) ◽  
Author(s):  
Wael F. Youssef ◽  
Siba Haidar ◽  
Philippe Joly

Abstract
The purpose of our work is to automatically generate textual video description schemas from surveillance video scenes, compatible with police incident reports. Our proposed approach is based on a generic and flexible context-free ontology. The general schema has the form [actuator] [action] [over/with] [actuated object] [+ descriptors: distance, speed, etc.]. We focus on scenes containing exactly two objects. Through a series of elaborated steps, we generate a formatted textual description. We identify whether an interaction exists between the two objects, including remote interaction that does not involve physical contact, and we point out when aggression takes place in these cases. We use supervised deep learning to classify scenes into interaction and no-interaction classes, and then into subclasses. The descriptors chosen to represent the subclasses are key in surveillance systems, helping to generate live alerts and facilitating offline investigation.
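Rendering the [actuator] [action] [over/with] [actuated object] [+ descriptors] schema into report-ready text can be sketched as a simple template function. The function name `describe_scene` and the example values are hypothetical; the paper's actual formatting rules are not specified here.

```python
def describe_scene(actuator, action, preposition, actuated, descriptors=None):
    """Render the [actuator] [action] [over/with] [actuated object]
    [+ descriptors] schema as a single formatted sentence."""
    text = " ".join([actuator, action, preposition, actuated])
    if descriptors:
        # Append optional descriptors such as distance and speed.
        details = ", ".join(f"{k}: {v}" for k, v in descriptors.items())
        text += f" ({details})"
    return text

print(describe_scene("person A", "throws object", "toward", "person B",
                     {"distance": "3 m", "speed": "fast"}))
# person A throws object toward person B (distance: 3 m, speed: fast)
```

The classifier's predicted subclass and measured descriptors would fill the slots, so a live alert and an offline report share one canonical sentence form.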


2021 ◽  
Author(s):  
Carmen Branje

This thesis explores amateur video description facilitated through LiveDescribe, a video description software program. Twelve amateur describers created video descriptions, which were reviewed by 76 sighted, low-vision, and blind reviewers. The describers were not only able to produce description, but their descriptions were perceived as having an acceptable level of quality. Three describers were rated as "good", three as "weak", and the remaining six fell into a "medium" category. The common factors that characterized the good describers were a soft, non-obtrusive voice; a moderate amount of well-placed descriptions; moderate description lengths; and English as a first language spoken without an accent or regional dialect. LiveDescribe was found to be a useful and easy-to-use tool that facilitated a video description workflow for amateur describers.



Author(s):  
Aditya Bodi ◽  
Pooyan Fazli ◽  
Shasta Ihorn ◽  
Yue-Ting Siu ◽  
Andrew T Scott ◽  
...  
Keyword(s):  
