Multi-human Parsing with a Graph-based Generative Adversarial Model

Author(s):  
Jianshu Li ◽  
Jian Zhao ◽  
Congyan Lang ◽  
Yidong Li ◽  
Yunchao Wei ◽  
...  

Human parsing is an important task in human-centric image understanding in computer vision and multimedia systems. However, most existing work on human parsing tackles the single-person scenario, which deviates from real-world applications where multiple persons are present simultaneously with interaction and occlusion. To address this challenging multi-human parsing problem, we introduce a novel model named MH-Parser, which uses a graph-based generative adversarial model to handle close-person interaction and occlusion. To validate the effectiveness of the new model, we collect a new dataset named Multi-Human Parsing (MHP), which contains multiple persons with intensive person interaction and entanglement. Experiments on the new MHP dataset and existing datasets demonstrate that the proposed method is effective in addressing the multi-human parsing problem compared with existing solutions in the literature.

Author(s):  
Yu-Jin Zhang

This chapter introduces a cutting-edge research field of computer vision and image understanding: spatial-temporal behavior understanding. It gives an overview of the main concepts, research focus, typical techniques, and rapid recent development of this new field. An important task in computer vision and image understanding is to analyze a scene by operating on images of it in order to guide action. To do this, one needs to locate the objects in the scene and determine how their position, attitude, speed, and relationships in space change over time. In short, the goal is to grasp the action in time and space, determine the purpose of the operation, and thus understand the semantics of the information conveyed. This is referred to as the understanding of spatial-temporal behaviors.


2020 ◽  
Vol 34 (10) ◽  
pp. 13714-13715
Author(s):  
Subhajit Chaudhury

Neural networks have contributed to tremendous progress in the domains of computer vision, speech processing, and other real-world applications. However, recent studies have shown that these state-of-the-art models can be easily compromised by adding small imperceptible perturbations. My thesis summary frames the problem of adversarial robustness as an equivalent problem of learning suitable features that lead to good generalization in neural networks. This is motivated by human learning, which is not trivially fooled by such perturbations because it relies on robust features that generalize well out of sample.
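
To make the notion of a small perturbation concrete, here is a minimal sketch that applies the fast gradient sign method (FGSM) to a hand-written logistic classifier. FGSM is used only as a well-known illustration and is not taken from the thesis summary; the weights, input, label, and `epsilon` are all hypothetical values chosen so the effect is visible.

```python
import math


def predict_prob(weights, bias, x):
    """Probability of class 1 under a logistic (linear + sigmoid) model."""
    logit = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, label, epsilon):
    """Move each input coordinate by epsilon in the direction that raises the loss."""
    p = predict_prob(weights, bias, x)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - label) * w for w in weights]
    return [v + epsilon * sign(g) for v, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions for illustration only).
weights = [1.0, -2.0, 3.0, -1.0]
bias = 0.0
x = [0.5, 0.1, 0.1, 0.4]
label = 1
epsilon = 0.05

x_adv = fgsm_perturb(weights, bias, x, label, epsilon)
clean_prob = predict_prob(weights, bias, x)    # above 0.5: classified as 1
adv_prob = predict_prob(weights, bias, x_adv)  # below 0.5: classified as 0
```

No coordinate moves by more than `epsilon`, yet the predicted class changes, because the per-coordinate shifts add up across the weights.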


Author(s):  
Yuchen Guo ◽  
Guiguang Ding ◽  
Jungong Han ◽  
Sicheng Zhao ◽  
Bin Wang

Recognizing unseen classes is an important task for real-world applications because 1) it is common that some classes have no labeled image exemplars for training, and 2) novel classes emerge rapidly. To address this task, many zero-shot learning (ZSL) approaches have recently been proposed in which explicit linear scores, such as the inner product score, measure the similarity between a class and an image. We argue that explicit linear scoring (ELS) seems too weak to capture complicated image-class correspondence. We propose a simple yet effective framework called Implicit Non-linear Similarity Scoring (ICINESS). In particular, we train a scoring network that takes image and class features as input, fuses them through hidden layers, and outputs the similarity. By the universal approximation theorem, such a network with a proper structure can approximate the true similarity function between images and classes in an implicit non-linear way, which is more flexible and powerful. With the ICINESS framework, we implement ZSL algorithms with shallow and deep networks, which yield consistently superior results.
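
The contrast between explicit linear scoring and a learned scoring network can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the class name `ScoringNetwork`, the single ReLU hidden layer, the sigmoid output, the feature dimensions, and the example class names are all hypothetical, and the weights are random rather than trained.

```python
import math
import random


def linear_score(image_feat, class_feat):
    """Explicit linear score: inner product of features in a shared space."""
    return sum(x * c for x, c in zip(image_feat, class_feat))


class ScoringNetwork:
    """Tiny MLP that fuses image and class features into one similarity score.

    Hypothetical illustration: layer sizes and initialization are assumptions,
    and the weights are untrained.
    """

    def __init__(self, image_dim, class_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        in_dim = image_dim + class_dim
        scale = 1.0 / math.sqrt(in_dim)
        self.w1 = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.uniform(-scale, scale) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def score(self, image_feat, class_feat):
        # Concatenate both inputs so the hidden layer can mix them freely.
        fused = list(image_feat) + list(class_feat)
        # ReLU hidden layer.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, fused)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        # Sigmoid keeps the similarity strictly between 0 and 1.
        return 1.0 / (1.0 + math.exp(-logit))


def predict_class(net, image_feat, class_feats):
    """Return the name of the class whose features score highest."""
    return max(class_feats,
               key=lambda name: net.score(image_feat, class_feats[name]))


net = ScoringNetwork(image_dim=4, class_dim=3, hidden_dim=8)
image = [0.2, -0.1, 0.7, 0.4]
# Hypothetical class features (e.g. attribute vectors) for unseen classes.
unseen_classes = {"zebra": [1.0, 0.0, 1.0], "whale": [0.0, 1.0, 0.0]}
similarity = net.score(image, unseen_classes["zebra"])
prediction = predict_class(net, image, unseen_classes)
```

Unlike `linear_score`, the network does not require image and class features to share a dimension, and the hidden layer lets the score depend on non-linear interactions between the two inputs.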


Crystals ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 256
Author(s):  
Christian Rodenbücher ◽  
Kristof Szot

Transition metal oxides with ABO3 or BO2 structures have become one of the major research fields in solid state science, as they exhibit an impressive variety of unusual and exotic phenomena with potential for their exploitation in real-world applications [...]

