CogTree: Cognition Tree Loss for Unbiased Scene Graph Generation

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/176 ◽

2021 ◽

Author(s):

Jing Yu ◽

Yuan Chai ◽

Yujing Wang ◽

Yue Hu ◽

Qi Wu

Keyword(s):

Real World ◽

State Of The Art ◽

Data Distribution ◽

Cognitive Structure ◽

Scene Graph ◽

Visual Understanding ◽

Fine Mode ◽

Coarse To Fine ◽

Fine Distinction ◽

Graph Generation

Scene graphs are semantic abstractions of images that encourage visual understanding and reasoning. However, the performance of Scene Graph Generation (SGG) is unsatisfactory when faced with biased data in real-world scenarios. Conventional debiasing research mainly approaches the problem by balancing the data distribution or learning unbiased models and representations, ignoring the correlations among the biased classes. In this work, we analyze this problem from a novel cognition perspective: automatically building a hierarchical cognitive structure from the biased predictions and navigating that hierarchy to locate the relationships, so that the tail relationships receive more attention in a coarse-to-fine mode. To this end, we propose a novel debiasing Cognition Tree (CogTree) loss for unbiased SGG. We first build a cognitive structure, CogTree, to organize the relationships based on the predictions of a biased SGG model. The CogTree first distinguishes remarkably different relationships and then focuses on a small portion of easily confused ones. Then, we propose a debiasing loss designed specifically for this cognitive structure, which supports coarse-to-fine distinction for the correct relationships. The loss is model-agnostic and consistently boosts the performance of several state-of-the-art models. The code is available at: https://github.com/CYVincent/Scene-Graph-Transformer-CogTree.
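
As a rough illustration of the coarse-to-fine idea described in this abstract, the Python sketch below splits a flat cross-entropy into a coarse term (choosing the right group of relationships) and a fine term (choosing the right relationship inside that group). It is not the authors' CogTree loss: the two-level grouping, the function names `softmax` and `coarse_to_fine_loss`, and the example values are all assumptions made for illustration. The repository linked in the abstract holds the actual implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_to_fine_loss(logits, groups, target):
    """Return (coarse, fine) cross-entropy terms for one sample.

    logits: raw scores, one per relationship class
    groups: hypothetical partition of class indices into coarse groups
    target: index of the correct relationship class
    """
    probs = softmax(logits)
    # Coarse level: total probability mass assigned to each group.
    group_mass = [sum(probs[c] for c in g) for g in groups]
    # Index of the group that contains the target class.
    gi = next(i for i, g in enumerate(groups) if target in g)
    coarse = -math.log(group_mass[gi])
    # Fine level: target probability renormalised within its own group.
    fine = -math.log(probs[target] / group_mass[gi])
    return coarse, fine


# Hypothetical example: four relationship classes split into two groups.
logits = [2.0, 1.0, 0.1, -1.0]
groups = [[0, 1], [2, 3]]
coarse, fine = coarse_to_fine_loss(logits, groups, target=3)
print(coarse, fine)
```

With equal weights the two terms add up to the ordinary cross-entropy of the target class, so a hierarchical loss of this kind only changes training once the levels or classes are weighted differently, for example to give tail relationships more influence.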

Global Structure and Local Semantics-Preserved Embeddings for Entity Alignment

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/506 ◽

2020 ◽

Author(s):

Hao Nie ◽

Xianpei Han ◽

Le Sun ◽

Chi Man Wong ◽

Qiang Chen ◽

...

Keyword(s):

Real World ◽

State Of The Art ◽

Global Structure ◽

Convolutional Networks ◽

Local Relation ◽

Knowledge Graphs ◽

Art Performance ◽

Real World Datasets ◽

Coarse To Fine

Entity alignment (EA) aims to identify entities located in different knowledge graphs (KGs) that refer to the same real-world object. To learn the entity representations, most EA approaches rely on either translation-based methods, which capture the local relation semantics of entities, or graph convolutional networks (GCNs), which exploit the global KG structure. Afterward, the aligned entities are identified based on their distances. In this paper, we propose to jointly leverage the global KG structure and entity-specific relational triples for better entity alignment. Specifically, a global structure and local semantics preserving network is proposed to learn entity representations in a coarse-to-fine manner. Experiments on several real-world datasets show that our method significantly outperforms other entity alignment approaches and achieves new state-of-the-art performance.
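
The distance-based matching step mentioned in this abstract can be sketched in a few lines of Python. This is a minimal nearest-neighbour illustration, not the paper's network: the function name `align_by_distance`, the choice of Euclidean distance, and the toy two-dimensional embeddings are assumptions, and the embeddings would in practice come from a learned model.

```python
import math


def align_by_distance(source_emb, target_emb):
    """Map each source entity to the target entity with the smallest
    Euclidean distance between their embedding vectors."""
    alignment = {}
    for s_name, s_vec in source_emb.items():
        alignment[s_name] = min(
            target_emb,
            key=lambda t_name: math.dist(s_vec, target_emb[t_name]),
        )
    return alignment


# Hypothetical embeddings for entities from two knowledge graphs.
kg1 = {"Paris_en": (0.9, 0.1), "Berlin_en": (0.1, 0.8)}
kg2 = {"Paris_fr": (1.0, 0.0), "Berlin_fr": (0.0, 1.0)}

print(align_by_distance(kg1, kg2))
```

A greedy nearest-neighbour rule like this can map several source entities to the same target. Whether and how that is resolved is a design choice the abstract does not specify.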

Subgraph and object context-masked network for scene graph generation

IET Computer Vision ◽

10.1049/iet-cvi.2019.0896 ◽

2020 ◽

Vol 14 (7) ◽

pp. 546-553

Author(s):

Zhenxing Zheng ◽

Zhendong Li ◽

Gaoyun An ◽

Songhe Feng

Keyword(s):

Scene Graph ◽

Graph Generation

Semantic Relation Model and Dataset for Remote Sensing Scene Understanding

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070488 ◽

2021 ◽

Vol 10 (7) ◽

pp. 488

Author(s):

Peng Li ◽

Dezheng Zhang ◽

Aziguli Wulamu ◽

Xin Liu ◽

Peng Chen

Keyword(s):

Remote Sensing ◽

Scene Understanding ◽

Deep Understanding ◽

Remote Sensing Images ◽

Convolutional Network ◽

Scene Graph ◽

Multi Scale ◽

Relationship Extraction ◽

High Level ◽

Graph Generation

A deep understanding of our visual world requires more than the isolated perception of a series of objects; the relationships between them also contain rich semantic information. In satellite remote sensing images in particular, the span is so large that objects vary widely in size and form complex spatial compositions. Therefore, recognizing semantic relations helps strengthen the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthening the cognitive ability of our model. Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module to remove meaningless connections among entities and improve the efficiency of scene graph generation. Meanwhile, to further promote research on scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments, and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.

ART-UP: A Novel Method for Generating Scanning-Robust Aesthetic QR Codes

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3418214 ◽

2021 ◽

Vol 17 (1) ◽

pp. 1-23

Author(s):

Mingliang Xu ◽

Qingfeng Li ◽

Jianwei Niu ◽

Hao Su ◽

Xiting Liu ◽

...

Keyword(s):

State Of The Art ◽

Visual Quality ◽

Qr Code ◽

Quick Response ◽

Estimation Model ◽

Qr Codes ◽

Excellent Performance ◽

Novel Method ◽

Coarse To Fine

Quick response (QR) codes are usually scanned in different environments, so they must be robust to variations in illumination, scale, coverage, and camera angle. Aesthetic QR codes improve the visual quality, but subtle changes in their appearance may cause scanning failure. In this article, a new method to generate scanning-robust aesthetic QR codes is proposed. It is based on a module-based scanning probability estimation model that can effectively balance the tradeoff between visual quality and scanning robustness. Our method locally adjusts the luminance of each module by estimating the probability of successful sampling. The approach adopts a hierarchical, coarse-to-fine strategy to enhance the visual quality of aesthetic QR codes, sequentially generating three codes: a binary aesthetic QR code, a grayscale aesthetic QR code, and the final color aesthetic QR code. Our approach can also be used to create QR codes with different visual styles by adjusting some initialization parameters. User surveys and decoding experiments comparing our method with state-of-the-art algorithms indicate that the proposed approach has excellent performance in terms of both visual quality and scanning robustness.

Extrinsic Camera Calibration with Line-Laser Projection

Sensors ◽

10.3390/s21041091 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1091

Author(s):

Izaak Van Crombrugge ◽

Rudi Penne ◽

Steve Vanlanduit

Keyword(s):

Camera Calibration ◽

Real World ◽

Large Scale ◽

State Of The Art ◽

Bundle Adjustment ◽

Field Of View ◽

Extrinsic Calibration ◽

Practical Procedure ◽

Partial Overlap

Knowledge of precise camera poses is vital for multi-camera setups. Camera intrinsics can be obtained for each camera separately in lab conditions. For fixed multi-camera setups, the extrinsic calibration can only be done in situ. Usually, some markers are used, like checkerboards, requiring some level of overlap between cameras. In this work, we propose a method for cases with little or no overlap. Laser lines are projected on a plane (e.g., floor or wall) using a laser line projector. The pose of the plane and cameras is then optimized using bundle adjustment to match the lines seen by the cameras. To find the extrinsic calibration, only a partial overlap between the laser lines and the field of view of the cameras is needed. Real-world experiments were conducted both with and without overlapping fields of view, resulting in rotation errors below 0.5°. We show that the accuracy is comparable to other state-of-the-art methods while offering a more practical procedure. The method can also be used in large-scale applications and can be fully automated.
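
For readers unfamiliar with how a rotation error such as the 0.5° figure in this abstract is typically measured, the Python sketch below computes the angle of the relative rotation between an estimated and a reference rotation matrix. This is the standard geodesic angle between rotations and is only assumed to match the metric used in the paper. The helper names `rotation_error_deg` and `rot_z` and the example angles are hypothetical.

```python
import math


def rotation_error_deg(r_est, r_ref):
    """Angle in degrees of the relative rotation between two 3x3
    rotation matrices given as nested lists."""
    # trace(r_est^T @ r_ref) equals the sum of element-wise products.
    trace = sum(r_est[i][j] * r_ref[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the example)."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Hypothetical example: the estimate is off by 0.4 degrees about the z-axis.
print(rotation_error_deg(rot_z(10.4), rot_z(10.0)))
```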

A Survey on Bias and Fairness in Machine Learning

ACM Computing Surveys ◽

10.1145/3457607 ◽

2021 ◽

Vol 54 (6) ◽

pp. 1-35

Author(s):

Ninareh Mehrabi ◽

Fred Morstatter ◽

Nripsuta Saxena ◽

Kristina Lerman ◽

Aram Galstyan

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Deep Learning ◽

Real World ◽

State Of The Art ◽

Future Directions ◽

Discriminatory Behavior ◽

Real World Applications ◽

Near Future ◽

Different Sources

With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in the design and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently, some work has been developed in traditional machine learning and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition, we examined different domains and subdomains in AI, showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and the ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We hope that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

Multi-resolution Visual Positioning and Navigation Technique for Unmanned Aerial System Landing Assistance

Journal of Navigation ◽

10.1017/s0373463317000327 ◽

2017 ◽

Vol 70 (6) ◽

pp. 1276-1292

Author(s):

Chong Yu ◽

Jiyuan Cai ◽

Qingyu Chen

Keyword(s):

Real World ◽

State Of The Art ◽

Unmanned Aerial System ◽

Detection Accuracy ◽

Relative Positioning ◽

Positioning Accuracy ◽

Visual Positioning ◽

Positioning Technique ◽

Technique Comparison ◽

Resolution Simulation

To achieve more accurate navigation performance in the landing process, a multi-resolution visual positioning technique is proposed for landing assistance of an Unmanned Aerial System (UAS). This technique uses a captured image of an artificial landmark (e.g. barcode) to provide relative positioning information in the X, Y and Z axes, and yaw, roll and pitch orientations. A multi-resolution coding algorithm is designed to ensure the UAS will not lose the detection of the landing target due to limited visual angles or camera resolution. Simulation and real-world experiments demonstrate the performance of the proposed technique in positioning accuracy, detection accuracy, and navigation effect. Two types of UAS are used to verify the generalisation of the proposed technique. Comparison experiments against state-of-the-art techniques are also included, together with an analysis of the results.