semantic event
Recently Published Documents


Total documents: 81 (five years: 11)
H-index: 15 (five years: 1)

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yufeng Du ◽  
Quan Zhao ◽  
Xiaochun Lu

Team sports game videos feature complex backgrounds, fast target movement, and mutual occlusion between targets, which pose great challenges for multiperson collaborative video analysis. This paper proposes a video semantic extraction method that integrates domain knowledge and deep features and can be applied to the analysis of multiperson collaborative basketball game videos, where a semantic event is modeled as an adversarial relationship between two teams of players. We first design a scheme that combines a dual-stream network with learnable spatiotemporal feature aggregation; it can be trained end to end for video semantic extraction and bridges the gap between low-level features and high-level semantic events. Then, an algorithm based on knowledge from different video sources is proposed to extract the action semantics. The algorithm gathers local convolutional features over the entire space-time range, which can be used to track the ball, shooter, and hoop and so realize automatic semantic extraction from basketball game videos. Experiments show that the proposed scheme can effectively identify four shot categories (short, medium, long, and free throw), scoring events, and the semantics of athletes’ actions from basketball game footage.
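As an illustration of what aggregating local features over the whole space-time range can look like, here is a minimal sketch of VLAD-style soft-assignment pooling. The abstract does not say which aggregation layer the authors use, so the choice of technique, the function names `soft_assign` and `aggregate`, and the fixed `centers` and `alpha` values are all assumptions; in a learnable layer the centers and `alpha` would be trained rather than fixed.

```python
from math import exp

def soft_assign(feature, centers, alpha=1.0):
    """Softmax over negative squared distances from one local feature to each center."""
    logits = [-alpha * sum((f - c) ** 2 for f, c in zip(feature, center))
              for center in centers]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]

def aggregate(features, centers, alpha=1.0):
    """Sum soft-weighted residuals over all space-time positions.

    features: local descriptors collected from every frame and location.
    centers:  hypothetical cluster centers (fixed here, learned in practice).
    Returns one pooled vector per center, independent of how many features came in.
    """
    dim = len(centers[0])
    pooled = [[0.0] * dim for _ in centers]
    for feature in features:
        weights = soft_assign(feature, centers, alpha)
        for k, (w, center) in enumerate(zip(weights, centers)):
            for d in range(dim):
                pooled[k][d] += w * (feature[d] - center[d])
    return pooled

# Toy input: four 2-D local features and two assumed centers.
features = [(0.1, 0.0), (0.0, 0.2), (1.0, 0.9), (0.9, 1.1)]
centers = [(0.0, 0.0), (1.0, 1.0)]
video_descriptor = aggregate(features, centers)
```

The pooled descriptor has a fixed size (one vector per center) regardless of clip length, which is what lets a classifier sit on top of videos of varying duration.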


2021 ◽  
Author(s):  
Jian Zhou

This thesis aims to find solutions and statistical modeling techniques for analyzing video content so that intelligent and efficient interaction with video becomes possible. We investigate several fundamental tasks in video content analysis. Specifically, we propose an on-line video parsing algorithm using basic statistical measures and an off-line solution using Independent Component Analysis (ICA). A spatiotemporal video similarity model based on dynamic programming is developed. For video object segmentation and tracking, we develop a new method based on probabilistic fuzzy c-means and Gibbs random fields. On the theoretical side, we develop a generic framework for sequential data analysis that integrates the Hidden Markov Model and the ICA mixture model. The re-estimation formulas for learning the model parameters are also derived. As a case study, the new model is applied to golf video for semantic event detection and recognition.
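To make the idea of a dynamic-programming video similarity concrete, the sketch below uses dynamic time warping (DTW) over per-frame feature vectors. The abstract does not describe the thesis's actual model, so DTW is a stand-in technique here, and the names `frame_distance`, `dtw_distance`, and the toy clips are hypothetical.

```python
from math import sqrt, inf

def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Cheapest monotonic alignment cost between two feature sequences."""
    n, m = len(seq_a), len(seq_b)
    # cost[i][j]: best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_b frame repeated
                                 cost[i][j - 1],      # seq_a frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Toy clips: each frame is a 2-D feature vector (assumed, for illustration).
clip_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
clip_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # same motion, one frame held
clip_c = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]              # shifted content

same_motion = dtw_distance(clip_a, clip_b)   # 0.0
shifted = dtw_distance(clip_a, clip_c)       # 15.0
```

Because the alignment may repeat frames, two clips showing the same motion at different speeds score as identical, while clips with different content accumulate cost at every aligned frame.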


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0243829
Author(s):  
Fatemeh Ziaeetabar ◽  
Jennifer Pomp ◽  
Stefan Pfeiffer ◽  
Nadiya El-Sourani ◽  
Ricarda I. Schubotz ◽  
...  

Predicting other people’s upcoming actions is key to successful social interactions. Previous studies have started to disentangle the various sources of information that action observers exploit, including objects, movements, contextual cues and features regarding the acting person’s identity. Here we focus on the role of static and dynamic inter-object spatial relations that change during an action. We designed a virtual reality setup and tested recognition speed for ten different manipulation actions. Importantly, all objects had been abstracted by emulating them with cubes such that participants could not infer an action using object information. Instead, participants had to rely only on the limited information that comes from the changes in the spatial relations between the cubes. In spite of these constraints, participants were able to predict actions in, on average, less than 64% of the action’s duration. Furthermore, we employed a computational model, the so-called enriched Semantic Event Chain (eSEC), which incorporates the information of different types of spatial relations: (a) objects’ touching/untouching, (b) static spatial relations between objects and (c) dynamic spatial relations between objects during an action. Assuming the eSEC as an underlying model, we show, using information theoretical analysis, that humans mostly rely on a mixed-cue strategy when predicting actions. Machine-based action prediction is able to produce faster decisions based on individual cues. We argue that the human strategy, though slower, may be particularly beneficial for predicting natural and more complex actions with more variable or partial sources of information. Our findings contribute to the understanding of how individuals manage to infer the goals of observed actions even before full goal accomplishment, and may open new avenues for building robots for conflict-free human-robot cooperation.
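The touching/untouching component of an event chain can be sketched in a few lines. This is only an illustration of relation type (a): it ignores the static and dynamic spatial relations (b) and (c), and the cube representation, the tolerance `tol`, and the names `touching` and `event_chain` are assumptions rather than the authors' implementation.

```python
from itertools import combinations

def touching(cube_a, cube_b, tol=1e-6):
    """True if two axis-aligned cubes touch or overlap.

    Each cube is (center_xyz, side_length); this representation is assumed.
    """
    (center_a, side_a), (center_b, side_b) = cube_a, cube_b
    reach = (side_a + side_b) / 2.0
    return all(abs(a - b) <= reach + tol for a, b in zip(center_a, center_b))

def event_chain(frames):
    """Compress per-frame touching relations into a chain of change events.

    frames: list of dicts mapping object name -> cube.
    Returns (pairs, columns): a column is kept only when at least one
    pairwise relation differs from the previously kept column.
    """
    names = sorted(frames[0])
    pairs = list(combinations(names, 2))
    columns = []
    for frame in frames:
        column = tuple("T" if touching(frame[a], frame[b]) else "N"
                       for a, b in pairs)
        if not columns or column != columns[-1]:
            columns.append(column)
    return pairs, columns

# Toy "pick up" action: a hand cube descends onto a cube resting on a table cube.
table = ((0.0, 0.0, 0.0), 1.0)
frames = [
    {"table": table, "cube": ((0.0, 0.0, 1.0), 1.0), "hand": ((0.0, 0.0, 4.0), 1.0)},
    {"table": table, "cube": ((0.0, 0.0, 1.0), 1.0), "hand": ((0.0, 0.0, 3.0), 1.0)},
    {"table": table, "cube": ((0.0, 0.0, 1.0), 1.0), "hand": ((0.0, 0.0, 2.0), 1.0)},
    {"table": table, "cube": ((0.0, 0.0, 2.0), 1.0), "hand": ((0.0, 0.0, 3.0), 1.0)},
]
pairs, columns = event_chain(frames)
# pairs   -> [("cube", "hand"), ("cube", "table"), ("hand", "table")]
# columns -> [("N", "T", "N"), ("T", "T", "N"), ("T", "N", "N")]
```

Four frames collapse into three columns because the second frame changes no relation. A predictor working on such a chain could, in principle, commit to an action as soon as the observed column prefix matches only one known action, which is one way to read the "fraction of the action’s duration" measure in the abstract.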


2020 ◽  
Vol 413 ◽  
pp. 217-229
Author(s):  
Lifang Wu ◽  
Zhou Yang ◽  
Qi Wang ◽  
Meng Jian ◽  
Boxuan Zhao ◽  
...  

2020 ◽  
Author(s):  
Marieke Schouwstra ◽  
Kenny Smith ◽  
Simon Kirby

When people improvise to convey information by using only gesture and no speech (‘silent gesture’), they show language-independent word order preferences: SOV for extensional events (e.g., boy-ball-throw), but SVO for intensional events (e.g., boy-search-ball). Real languages tend not to condition word order on this kind of semantic distinction but instead use the same order irrespective of event type. Word order therefore exemplifies a contrast between naturalness in improvisation and conventionalised regularity in linguistic systems. We present an experimental paradigm in which initially-improvised silent gesture is both used for communication and culturally transmitted through artificial generations of lab participants. In experiments 1 and 2 we investigate the respective contributions of communicative interaction and cultural transmission on natural word order behaviour. We show that both interaction and iterated learning lead to a simplification of the word order regime, and the way in which this unfolds over time is surprisingly similar under the two mechanisms. The resulting dominant word order is mostly SVO, the order of the native language of our participants. In experiment 3, we manipulate the frequency of different semantic event types, and show that this can allow SOV order, rather than SVO order, to conventionalise. Taken together, our experiments demonstrate that where pressures for naturalness and regularity are in conflict, naturalness will give way to regularity as word order becomes conventionalised through repeated usage.


2019 ◽  
Author(s):  
Jayden Ziegler ◽  
Giulia Bencini ◽  
Adele Eva Goldberg ◽  
Jesse Snedeker

In 1990, Bock and Loebell found that passives (e.g., The 747 was radioed by the airport’s control tower) can be primed by intransitive locatives (e.g., The 747 was landing by the airport’s control tower). This finding is often taken as strong evidence that structural priming occurs on the basis of a syntactic phrase structure that abstracts across lexical content, including prepositions, and is uninfluenced by the semantic roles of the arguments. However, all of the intransitive locative primes in Bock and Loebell contained the preposition by (by-locatives), just like the passive targets. Therefore, the locative-to-passive priming may have been due to the adjunct headed by by, rather than being a result of purely abstract syntax. The present experiment investigates this possibility. We find that passives and intransitive by-locatives are equivalent primes, but intransitive locatives with other prepositions (e.g., The 747 has landed near the airport control tower) do not prime passives. We conclude that a shared abstract, content-less tree structure is not sufficient for passive priming to occur. We then review the prior results that have been offered in favor of abstract tree priming, and note the range of evidence can be considerably narrowed—and possibly eliminated—once effects of animacy, semantic event structure, shared morphology, information structure, and rhythm are taken into account.

