Multi-level dictionary learning for fine-grained images categorization with attention model

2021 ◽  
Author(s):  
Jinsheng Ji ◽  
Yiyou Guo ◽  
Zhen Yang ◽  
Tao Zhang ◽  
Xiankai Lu
Author(s):  
Jinwei Qi ◽  
Yuxin Peng ◽  
Yuxin Yuan

With the rapid growth of multimedia data, such as images and text, it is highly challenging to effectively correlate and retrieve data of different media types. Naturally, when correlating an image with a textual description, people focus not only on the alignment between discriminative image regions and key words, but also on the relations within the visual and textual context. Relation understanding is essential for cross-media correlation learning, yet it is ignored by prior cross-media retrieval works. To address this issue, we propose the Cross-media Relation Attention Network (CRAN) with multi-level alignment. First, we propose a visual-language relation attention model to explore both fine-grained patches and their relations in different media types. We aim not only to exploit cross-media fine-grained local information, but also to capture the intrinsic relation information, which can provide complementary hints for correlation learning. Second, we propose cross-media multi-level alignment to explore global, local and relation alignments across different media types, which can reinforce one another to learn more precise cross-media correlation. We conduct experiments on 2 cross-media datasets and compare with 10 state-of-the-art methods to verify the effectiveness of the proposed approach.
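The idea of scoring an image-text pair at global, local and relation levels can be illustrated with a minimal sketch. This is not the CRAN architecture: the abstract does not say how the three alignments are computed or combined, so the cosine similarity, the best-match averaging, the equal weights, and the dictionary keys `global`, `local` and `relation` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match_mean(image_parts, text_parts):
    """For each text part, keep its best-matching image part, then average the scores."""
    return sum(max(cosine(i, t) for i in image_parts) for t in text_parts) / len(text_parts)

def multi_level_similarity(image, text, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of global, local and relation similarities (hypothetical combination rule)."""
    global_sim = cosine(image["global"], text["global"])
    local_sim = best_match_mean(image["local"], text["local"])
    relation_sim = best_match_mean(image["relation"], text["relation"])
    w_global, w_local, w_relation = weights
    return w_global * global_sim + w_local * local_sim + w_relation * relation_sim

# Toy 2-D features, invented purely for the example.
image = {"global": [1.0, 0.0], "local": [[1.0, 0.0], [0.0, 1.0]], "relation": [[1.0, 1.0]]}
text = {"global": [1.0, 0.0], "local": [[0.0, 1.0]], "relation": [[1.0, 0.0]]}
score = multi_level_similarity(image, text)
```

In a retrieval setting, a score of this kind would be computed for every candidate pair and used to rank the candidates; the weights control how much each level contributes.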


2021 ◽  
Vol 13 (3) ◽  
pp. 1021
Author(s):  
Sara Scipioni ◽  
Meir Russ ◽  
Federico Niccolini

To contribute to small and medium enterprises’ (SMEs) sustainable transition into the circular economy, the study proposes the activation of organizational learning (OL) processes (denoted here as multi-level knowledge creation, transfer, and retention processes) as a key phase in introducing circular business models (CBMs) at the SME and supply chain (SC) level. The research employs a mixed-method approach, using the focus group methodology to identify contextual elements impacting CBM-related OL processes, and a survey-based evaluation to single out the most frequently used OL processes inside Italian construction SMEs. As a main result, a CBM-oriented OL multi-level model offers a fine-grained understanding of contextual elements acting mutually as barriers and drivers for OL processes, as well as possible OL dynamics among them. The multi-level culture construct (composed of external stakeholders’, SC stakeholders’, and organizational culture) identifies the key element for activating CBM-oriented OL processes. The main implications relate to the identification of cultural, structural, regulatory, and process contextual elements across the external, SC, and organizational levels, and their interrelation with applicable intraorganizational and interorganizational learning processes. The proposed model would contribute to an improved transition into the circular economy through sustainable business models in construction SMEs.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 166390-166397 ◽  
Author(s):  
Jiabao Wang ◽  
Yang Li ◽  
Zhuang Miao ◽  
Xun Zhao ◽  
Zhang Rui

2020 ◽  
Vol 104 ◽  
pp. 104027
Author(s):  
Ye Yu ◽  
Longdao Xu ◽  
Wei Jia ◽  
Wenjia Zhu ◽  
Yunxiang Fu ◽  
...  

Author(s):  
Yifang Yin ◽  
Meng-Jiun Chiou ◽  
Zhenguang Liu ◽  
Harsh Shrivastava ◽  
Rajiv Ratn Shah ◽  
...  

2013 ◽  
Vol 368 (1613) ◽  
pp. 20120356 ◽  
Author(s):  
Grant C. McDonald ◽  
Richard James ◽  
Jens Krause ◽  
Tommaso Pizzari

Sexual selection is traditionally measured at the population level, assuming that populations lack structure. However, increasing evidence undermines this approach, indicating that intrasexual competition in natural populations often displays complex patterns of spatial and temporal structure. This complexity is due in part to the degree and mechanisms of polyandry within a population, which can influence the intensity and scale of both pre- and post-copulatory sexual competition. Attempts to measure selection at the local and global scale have been made through multi-level selection approaches. However, definitions of local scale are often based on physical proximity, providing a rather coarse measure of local competition, particularly in polyandrous populations where the local scale of pre- and post-copulatory competition may differ drastically from each other. These limitations can be solved by social network analysis, which allows us to define a unique sexual environment for each member of a population: ‘local scale’ competition, therefore, becomes an emergent property of a sexual network. Here, we first propose a novel quantitative approach to measure pre- and post-copulatory sexual selection, which integrates multi-level selection with information on local scale competition derived as an emergent property of networks of sexual interactions. We then use simple simulations to illustrate the ways in which polyandry can impact estimates of sexual selection. We show that for intermediate levels of polyandry, the proposed network-based approach provides substantially more accurate measures of sexual selection than the more traditional population-level approach. We argue that the increasing availability of fine-grained behavioural datasets provides exciting new opportunities to develop network approaches to study sexual selection in complex societies.
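The notion of a unique sexual environment for each individual can be sketched from a list of observed matings. The example below is a simplified illustration, not the authors' quantitative method: it assumes that a male's local competitors are the other males that mated with at least one of the same females, and the function name, data layout and individual labels are hypothetical.

```python
from collections import defaultdict

def local_competitors(matings):
    """Map each male to the set of other males sharing at least one mate with him.

    `matings` is an iterable of (male, female) pairs (an assumed input format).
    """
    males_by_female = defaultdict(set)
    for male, female in matings:
        males_by_female[female].add(male)

    competitors = defaultdict(set)
    for male, female in matings:
        # Every other male that mated with this female is a local rival.
        competitors[male] |= males_by_female[female] - {male}
    return dict(competitors)

# Toy mating network: f1 and f2 are polyandrous, f3 is monandrous.
matings = [("m1", "f1"), ("m2", "f1"), ("m2", "f2"), ("m3", "f2"), ("m4", "f3")]
rivals = local_competitors(matings)
```

Here `m2` competes with both `m1` and `m3`, while `m4` has no post-copulatory rivals at all, so a single population-level pool of competitors would misdescribe each of them.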


Author(s):  
Kyung-Min Kim ◽  
Min-Oh Heo ◽  
Seong-Ho Choi ◽  
Byoung-Tak Zhang

Question-answering (QA) on video content is a significant challenge for achieving human-level intelligence as it involves both vision and language in real-world settings. Here we demonstrate the possibility of an AI agent performing video story QA by learning from a large number of cartoon videos. We develop a video-story learning model, Deep Embedded Memory Networks (DEMN), to reconstruct stories from a joint scene-dialogue video stream using a latent embedding space of observed data. The video stories are stored in a long-term memory component. For a given question, an LSTM-based attention model uses the long-term memory to recall the best question-story-answer triplet by focusing on specific words containing key information. We trained the DEMN on a novel QA dataset of a children’s cartoon video series, Pororo. The dataset contains 16,066 scene-dialogue pairs from 20.5 hours of video, 27,328 fine-grained sentences for scene description, and 8,913 story-related QA pairs. Our experimental results show that the DEMN outperforms other QA models. This is mainly due to 1) the reconstruction of video stories in a combined scene-dialogue form that utilizes the latent embedding and 2) attention. DEMN also achieved state-of-the-art results on the MovieQA benchmark.

