Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100

Author(s):  
Dima Damen ◽  
Hazel Doughty ◽  
Giovanni Maria Farinella ◽  
Antonino Furnari ◽  
Evangelos Kazakos ◽  
...  

Abstract: This paper introduces the pipeline used to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments using head-mounted cameras. Compared to its previous version (Damen in Scaling egocentric vision: ECCV, 2018), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete (128% more action segments) annotations of fine-grained actions. This collection enables new challenges such as action detection and evaluating the “test of time”, i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), and unsupervised domain adaptation for action recognition. For each challenge, we define the task and provide baselines and evaluation metrics.

Author(s):  
Jun Wen ◽  
Risheng Liu ◽  
Nenggan Zheng ◽  
Qian Zheng ◽  
Zhefeng Gong ◽  
...  

Unsupervised domain adaptation methods aim to alleviate the performance degradation caused by domain shift by learning domain-invariant representations. Existing deep domain adaptation methods focus on holistic feature alignment by matching source and target holistic feature distributions, without considering local features and their multi-mode statistics. We show that learned local feature patterns are more generic and transferable, and that additionally matching local feature distributions enables fine-grained feature alignment. In this paper, we present a method for learning domain-invariant local feature patterns and jointly aligning holistic and local feature statistics. Comparisons with state-of-the-art unsupervised domain adaptation methods on two popular benchmark datasets demonstrate the superiority of our approach and its effectiveness in alleviating negative transfer.
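To make the idea of jointly aligning holistic and local feature statistics concrete, the following is a minimal sketch and not the authors' method. It assumes a linear-kernel MMD (the squared distance between feature means) as a stand-in alignment loss; the function names and the `local_weight` parameter are hypothetical. The multi-mode local statistics mentioned in the abstract are not modelled here.

```python
from statistics import fmean


def mean_vector(features):
    """Per-dimension mean of a list of equal-length feature vectors."""
    return [fmean(column) for column in zip(*features)]


def linear_mmd(source, target):
    """Squared Euclidean distance between source and target feature means.

    This equals the MMD with a linear kernel; it is an assumed stand-in
    for whatever distribution-matching loss a real method would use.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def joint_alignment_loss(src_holistic, tgt_holistic, src_local, tgt_local,
                         local_weight=1.0):
    """Holistic alignment term plus a weighted local alignment term.

    src_local / tgt_local hold one list of local feature vectors per image.
    local_weight is a hypothetical trade-off parameter.
    """
    holistic = linear_mmd(src_holistic, tgt_holistic)
    # Pool the local feature vectors of all images within each domain.
    flat_src = [vec for image in src_local for vec in image]
    flat_tgt = [vec for image in tgt_local for vec in image]
    local = linear_mmd(flat_src, flat_tgt)
    return holistic + local_weight * local


if __name__ == "__main__":
    src_h = [[0.0, 0.0], [2.0, 2.0]]
    tgt_h = [[1.0, 1.0], [3.0, 3.0]]
    src_l = [[[0.0], [2.0]]]
    tgt_l = [[[1.0], [1.0]]]
    print(joint_alignment_loss(src_h, tgt_h, src_l, tgt_l))
```

In a trainable model this loss would be added to the source classification loss, so that the feature extractor is pushed to reduce the gap between domains at both the image level and the local-patch level.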


2020 ◽  
Vol 34 (07) ◽  
pp. 10567-10574
Author(s):  
Qingchao Chen ◽  
Yang Liu

Unsupervised Domain Adaptation (UDA) aims to learn and transfer generalized features from a labelled source domain to a target domain without any annotations. Existing methods align only high-level representations, without exploiting the complex multi-class structure and local spatial structure. This is problematic because 1) the model is prone to negative transfer when features from different classes are misaligned; and 2) ignoring the local spatial structure poses a major obstacle to fine-grained feature alignment. In this paper, we integrate the valuable information conveyed in the classifier prediction and local feature maps into the global feature representation and then perform a single mini-max game to make it domain invariant. In this way, the domain-invariant feature not only describes the holistic representation of the original image but also preserves mode structure and fine-grained spatial structural information. The feature integration is achieved by estimating and maximizing the mutual information (MI) among the global feature, local feature and classifier prediction simultaneously. As MI is hard to measure directly in high-dimensional spaces, we adopt a new objective function that implicitly maximizes the MI via an effective sampling strategy and a discriminator design. Our STructure-Aware Feature Fusion (STAFF) network achieves state-of-the-art performance on various UDA datasets.
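The abstract does not spell out its objective, so the following is only an illustrative sketch of how MI can be maximized implicitly through a discriminator and a sampling strategy. It assumes a Jensen-Shannon style lower bound of the kind used in contrastive representation learning, which may differ from the STAFF objective. The dot-product `dot_score` stands in for a trained discriminator, negatives are drawn by pairing each global feature with a local feature from another image, and only the global/local pair is covered (the classifier prediction is left out). All names are hypothetical.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dot_score(global_feat, local_feat):
    """Hypothetical discriminator: a dot product instead of a trained network."""
    return sum(g * l for g, l in zip(global_feat, local_feat))


def jsd_mi_lower_bound(global_feats, local_feats, score=dot_score, rng=None):
    """Jensen-Shannon style MI estimate from positive and negative pairs.

    Positive pairs take the global and local feature of the same image;
    negative pairs take the local feature of a different image.
    """
    n = len(global_feats)
    if n < 2 or n != len(local_feats):
        raise ValueError("need at least two aligned (global, local) pairs")
    rng = rng or random.Random(0)
    # A non-zero cyclic offset guarantees every negative pair is mismatched.
    offset = rng.randrange(1, n)
    pos = [score(g, l) for g, l in zip(global_feats, local_feats)]
    neg = [score(global_feats[i], local_feats[(i + offset) % n])
           for i in range(n)]
    e_pos = sum(-softplus(-s) for s in pos) / n
    e_neg = sum(softplus(s) for s in neg) / n
    return e_pos - e_neg


if __name__ == "__main__":
    g = [[5.0, 0.0], [0.0, 5.0]]
    matched = [[5.0, 0.0], [0.0, 5.0]]
    shuffled = [[0.0, 5.0], [5.0, 0.0]]
    print(jsd_mi_lower_bound(g, matched) > jsd_mi_lower_bound(g, shuffled))
```

Maximizing this quantity with respect to both the feature extractor and the discriminator rewards features whose matching pairs score high and mismatched pairs score low, which is the sense in which MI is increased without being computed explicitly.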


2020 ◽  
Vol 155 ◽  
pp. 113404 ◽  
Author(s):  
Peng Liu ◽  
Ting Xiao ◽  
Cangning Fan ◽  
Wei Zhao ◽  
Xianglong Tang ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xin Mao ◽  
Jun Kang Chow ◽  
Pin Siang Tan ◽  
Kuan-fu Liu ◽  
Jimmy Wu ◽  
...  

Abstract: Automatic bird detection in ornithological analyses is limited by the accuracy of existing models, owing to the lack of training data and the difficulty of extracting the fine-grained features required to distinguish bird species. Here we apply a domain randomization strategy to improve the accuracy of deep learning models for bird detection. Trained on virtual birds with sufficient variation across different environments, the model tends to focus on the fine-grained features of birds and achieves higher accuracy. Based on 100 terabytes of data from 2 months of continuous monitoring of egrets, our results cover the findings obtained through conventional manual observation, e.g., vertical stratification of egrets according to body size, and also open up opportunities for long-term bird surveys requiring intensive monitoring that is impractical with conventional methods, e.g., the influence of weather on egrets and the relationship between the migration schedules of great egrets and little egrets.
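As a rough illustration of domain randomization, the sketch below samples randomized scene configurations that a renderer could turn into synthetic training images. It is not the authors' pipeline: every parameter name and range is a hypothetical placeholder, and the rendering step itself is omitted.

```python
import random


def sample_scene(rng):
    """Draw one randomized configuration for a synthetic training image.

    All parameter names and ranges are hypothetical placeholders.
    """
    return {
        "background_id": rng.randrange(100),           # which backdrop to use
        "num_birds": rng.randint(1, 5),                # virtual birds in view
        "bird_scale": rng.uniform(0.5, 1.5),           # relative bird size
        "bird_rotation_deg": rng.uniform(0.0, 360.0),  # bird orientation
        "light_intensity": rng.uniform(0.3, 1.0),      # scene brightness
    }


def generate_scenes(count, seed=0):
    """Return `count` scene configurations, reproducible for a given seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(count)]


if __name__ == "__main__":
    for scene in generate_scenes(3, seed=42):
        print(scene)
```

The intent of varying nuisance factors such as background and lighting this widely is that the only consistent signal left across images is the bird itself, which encourages a detector to rely on the bird's fine-grained features.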

