Object Instance Mining for Weakly Supervised Object Detection

Chenhao Lin; Siwen Wang; Dongqi Xu; Yu Lu; Wayne Zhang

doi:10.1609/aaai.v34i07.6813

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6813 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11482-11489

Author(s):

Chenhao Lin ◽

Siwen Wang ◽

Dongqi Xu ◽

Yu Lu ◽

Wayne Zhang

Keyword(s):

Object Detection ◽

Learning Process ◽

Multiple Instance Learning ◽

Iterative Learning ◽

Experimental Results ◽

Information Propagation ◽

Local Optima ◽

The Past ◽

End To End ◽

Weakly Supervised

Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years. Existing approaches based on multiple instance learning easily fall into local optima, because such a mechanism tends to learn from only the most discriminative object in an image for each category. As a result, these methods miss object instances, which degrades WSOD performance. To address this problem, this paper introduces an end-to-end object instance mining (OIM) framework for weakly supervised object detection. OIM attempts to detect all possible object instances in each image by introducing information propagation on spatial and appearance graphs, without any additional annotations. During the iterative learning process, the less discriminative object instances of the same class are gradually detected and used for training. In addition, we design an object instance reweighted loss to learn a larger portion of each object instance and further improve performance. Experimental results on two publicly available datasets, VOC 2007 and 2012, demonstrate the efficacy of the proposed approach.
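
The abstract above does not spell out how the spatial and appearance graphs are built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the top-scoring proposal acts as a core instance, that further instances are proposals with similar features (cosine similarity) that do not overlap an already chosen instance, and that each instance gathers its spatially overlapping proposals by IoU. The function names and the thresholds `iou_thr` and `sim_thr` are hypothetical.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def mine_instances(boxes, scores, feats, iou_thr=0.5, sim_thr=0.9):
    """Return {instance_index: [indices of spatially overlapping proposals]}.

    Assumed procedure (illustrative only):
    1. The highest-scoring proposal is the core instance.
    2. Appearance step: a proposal becomes a new instance if its features
       resemble the core and it does not overlap any chosen instance.
    3. Spatial step: each instance collects proposals with IoU >= iou_thr.
    """
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    core = order[0]
    centers = [core]
    for i in order[1:]:
        looks_similar = cosine(feats[i], feats[core]) >= sim_thr
        is_separate = all(iou(boxes[i], boxes[c]) == 0.0 for c in centers)
        if looks_similar and is_separate:
            centers.append(i)
    return {
        c: [i for i in range(len(boxes)) if iou(boxes[i], boxes[c]) >= iou_thr]
        for c in centers
    }
```

In this sketch, a proposal that looks like the core instance but lies elsewhere in the image is promoted to an additional instance, which is one way to recover the less discriminative objects that a top-score-only selection would miss.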

EHSOD: CAM-Guided End-to-End Hybrid-Supervised Object Detection with Cascade Refinement

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6707 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10778-10785

Author(s):

Linpu Fang ◽

Hang Xu ◽

Zhili Liu ◽

Sarah Parisot ◽

Zhenguo Li

Keyword(s):

Object Detection ◽

State Of The Art ◽

Detection System ◽

Parameter Tuning ◽

Heat Map ◽

Level Data ◽

End To End ◽

Weakly Supervised ◽

Yield State ◽

Activation Heat

Object detectors trained on fully-annotated data currently yield state-of-the-art performance but require expensive manual annotations. On the other hand, weakly-supervised detectors have much lower performance and cannot be used reliably in a realistic setting. In this paper, we study the hybrid-supervised object detection problem, aiming to train a high-quality detector with only a limited amount of fully-annotated data while fully exploiting cheap data with image-level labels. State-of-the-art methods typically propose an iterative approach, alternating between generating pseudo-labels and updating a detector. This paradigm requires careful manual hyper-parameter tuning to mine good pseudo-labels at each round and is quite time-consuming. To address these issues, we present EHSOD, an end-to-end hybrid-supervised object detection system that can be trained in one shot on both fully and weakly-annotated data. Specifically, based on a two-stage detector, we propose two modules to fully utilize the information from both kinds of labels: 1) a CAM-RPN module that finds foreground proposals guided by a class activation heat-map; 2) a hybrid-supervised cascade module that further refines the bounding-box position and classification with the help of an auxiliary head compatible with image-level data. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data, e.g. 37.5% mAP on COCO. We will release the code and the trained models.
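
The abstract mentions a class activation heat-map without giving its formula. As a rough illustration, the sketch below uses the standard class activation map formulation, a weighted sum of the final convolutional feature maps using the classifier weights of one class. Whether EHSOD computes its heat-map exactly this way is an assumption, and the function names are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for a single class.

    feature_maps: list of K maps, each a list of H rows of W floats.
    class_weights: list of K classifier weights for the class of interest.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]
```

A normalized map of this kind could then be thresholded or used as a prior when ranking foreground proposals, which is the role the abstract assigns to the CAM-RPN module.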

Deep Multiple Instance Hashing for Object-based Image Retrieval

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/490 ◽

2017 ◽

Cited By ~ 2

Author(s):

Wanqing Zhao ◽

Ziyu Guan ◽

Hangzai Luo ◽

Jinye Peng ◽

Jianping Fan

Keyword(s):

Image Retrieval ◽

Object Detection ◽

Multiple Instance Learning ◽

Feature Maps ◽

Keyword Query ◽

The Arts ◽

Object Based ◽

Object Proposals ◽

Weakly Supervised ◽

Image Pairs

Multi-keyword queries are widely supported in text search engines. However, their analogue in image retrieval systems, the multi-object query, is rarely studied. Meanwhile, traditional object-based image retrieval methods often involve multiple separate steps and need expensive location labeling for detecting objects. In this work, we propose a weakly-supervised Deep Multiple Instance Hashing (DMIH) framework for object-based image retrieval. DMIH integrates object detection and hashing learning on the basis of a popular CNN model to build an end-to-end relation between a raw image and the binary hash codes of the multiple objects in it. Specifically, we cast the detection of each object class as a binary multiple instance learning problem in which the instances are object proposals extracted from multi-scale convolutional feature maps. For hashing training, we sample image pairs to learn their semantic relationships in terms of the hash codes of the most probable proposals for the labels they own, as guided by the object predictors. The two objectives benefit each other during learning. DMIH outperforms state-of-the-art methods on public benchmarks for object-based image retrieval and achieves promising results for multi-object queries.
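
To make the two ingredients of the abstract concrete, here is a minimal sketch of a binary multiple instance learning loss and of comparing binary hash codes. It assumes max pooling over proposal probabilities as the bag-level aggregation and sign thresholding for binarization; both are common choices but are assumptions here, not details stated in the abstract, and all function names are hypothetical.

```python
import math


def bag_probability(instance_probs):
    """Bag-level probability under the max-pooling MIL assumption."""
    return max(instance_probs)


def mil_bce(instance_probs, label, eps=1e-7):
    """Binary cross-entropy between the bag probability and the image label."""
    p = min(max(bag_probability(instance_probs), eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def binarize(values):
    """Turn real-valued hash outputs into bits by thresholding at zero."""
    return [1 if v >= 0 else 0 for v in values]


def hamming(code_a, code_b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(a != b for a, b in zip(code_a, code_b))
```

With max pooling, only the most probable proposal receives gradient for a positive image, which matches the abstract's use of "the most probable proposals" when forming hash codes.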

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/285 ◽

2017 ◽

Cited By ~ 9

Author(s):

Baisheng Lai ◽

Xiaojin Gong

Keyword(s):

Object Detection ◽

Location Information ◽

Saliency Maps ◽

The Arts ◽

Deep Architecture ◽

Object Proposals ◽

End To End ◽

Weakly Supervised ◽

Additional Losses ◽

Entire Network

Weakly supervised object detection (WSOD), the problem of learning detectors using only image-level labels, has been attracting growing interest. However, this problem is quite challenging due to the lack of location supervision. To address this issue, this paper integrates saliency into a deep architecture in which the location information is explored both explicitly and implicitly. Specifically, we select highly confident object proposals under the guidance of class-specific saliency maps. The location information of the selected proposals, together with their semantic and saliency information, is then used to explicitly supervise the network by imposing two additional losses. Meanwhile, a saliency prediction sub-network is built into the architecture, and its predictions are used to implicitly guide the localization procedure. The entire network is trained end-to-end. Experiments on PASCAL VOC demonstrate that our approach outperforms all state-of-the-art methods.
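
The abstract does not say how proposals are ranked against a saliency map, so the following is a hypothetical scoring rule for illustration only. It scores a proposal by the mean saliency inside the box multiplied by the fraction of the map's total saliency the box covers, so that neither tiny boxes nor oversized boxes win by default. The function names and the scoring rule are assumptions.

```python
def saliency_score(saliency, box):
    """Score a box (x1, y1, x2, y2), with half-open pixel ranges.

    score = mean saliency inside the box * share of total saliency covered.
    """
    x1, y1, x2, y2 = box
    inside = sum(saliency[r][c] for r in range(y1, y2) for c in range(x1, x2))
    total = sum(v for row in saliency for v in row)
    area = (x2 - x1) * (y2 - y1)
    if area <= 0 or total <= 0:
        return 0.0
    return (inside / area) * (inside / total)


def select_confident_proposal(saliency, boxes):
    """Return the index of the proposal with the highest saliency score."""
    return max(range(len(boxes)), key=lambda i: saliency_score(saliency, boxes[i]))
```

The selected box could then serve as a pseudo location label for the additional losses that the abstract describes.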

Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Computer Vision and Image Understanding ◽

10.1016/j.cviu.2021.103299 ◽

2021 ◽

pp. 103299

Author(s):

Nicolas Gonthier ◽

Saïd Ladjal ◽

Yann Gousseau

Keyword(s):

Object Detection ◽

Multiple Instance Learning ◽

Weakly Supervised

Discrepant Multiple Instance Learning for Weakly Supervised Object Detection

Pattern Recognition ◽

10.1016/j.patcog.2021.108233 ◽

2021 ◽

pp. 108233

Author(s):

Wei Gao ◽

Fang Wan ◽

Jun Yue ◽

Songcen Xu ◽

Qixiang Ye

Keyword(s):

Object Detection ◽

Multiple Instance Learning ◽

Weakly Supervised

Benchmarking artificial intelligence methods for end-to-end computational pathology

10.1101/2021.08.09.455633 ◽

2021 ◽

Author(s):

Narmin Ghaffari Laleh ◽

Hannah Sophie Muti ◽

Chiara Maria Lavinia Loeffler ◽

Amelie Echle ◽

Oliver Lester Saldanha ◽

...

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

External Validation ◽

Multiple Instance Learning ◽

Image Features ◽

Receiver Operating Curve ◽

End To End ◽

Weakly Supervised ◽

Computational Pathology ◽

Mutation Prediction

Artificial intelligence (AI) can extract subtle visual information from digitized histopathology slides and yield scientific insight on genotype-phenotype interactions as well as clinically actionable recommendations. Classical weakly supervised pipelines use an end-to-end approach with residual neural networks (ResNets), modern convolutional neural networks such as EfficientNet, or non-convolutional architectures such as vision transformers (ViT). In addition, multiple-instance learning (MIL) and clustering-constrained attention MIL (CLAM) are being used for pathology image analysis. However, it is unclear how these different approaches perform relative to each other. Here, we implement and systematically compare all five methods in six clinically relevant end-to-end prediction tasks using data from N=4848 patients with rigorous external validation. We show that histological tumor subtyping of renal cell carcinoma is an easy task that all approaches solved successfully, with an area under the receiver operating characteristic curve (AUROC) above 0.9 and no significant differences between approaches. In contrast, we report significant performance differences for mutation prediction in colorectal, gastric and bladder cancer. Weakly supervised ResNet- and ViT-based workflows significantly outperformed the other methods, in particular MIL and CLAM, for mutation prediction. We attribute this higher performance to the ability of ResNet and ViT to assign high prediction scores to highly informative image regions with plausible histopathological image features. We make all source code publicly available at https://github.com/KatherLab/HIA, allowing easy application of all methods to any end-to-end problem in computational pathology.
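
Since the comparison above is reported in terms of AUROC, a small worked example may help. AUROC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; it is a generic illustration and is not taken from the study's code, and the function name is hypothetical.

```python
def auroc(labels, scores):
    """AUROC from its pairwise definition (ties count as 0.5).

    labels: 1 for positive cases, 0 for negative cases.
    scores: predicted scores, higher meaning more likely positive.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.1]`, three of the four positive-negative pairs are ranked correctly, giving an AUROC of 0.75; a value above 0.9, as reported for renal cell carcinoma subtyping, means more than nine in ten such pairs are ranked correctly.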

Towards Precise End-to-End Weakly Supervised Object Detection Network

2019 IEEE/CVF International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2019.00846 ◽

2019 ◽

Cited By ~ 12

Author(s):

Ke Yang ◽

Dongsheng Li ◽

Yong Dou

Keyword(s):

Object Detection ◽

End To End ◽

Weakly Supervised

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr.2019.00230 ◽

2019 ◽

Cited By ~ 28

Author(s):

Fang Wan ◽

Chang Liu ◽

Wei Ke ◽

Xiangyang Ji ◽

Jianbin Jiao ◽

...

Keyword(s):

Object Detection ◽

Multiple Instance Learning ◽

Weakly Supervised

Multi-peak Graph-based Multi-instance Learning for Weakly Supervised Object Detection

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3432861 ◽

2021 ◽

Vol 17 (2s) ◽

pp. 1-21

Author(s):

Ruyi Ji ◽

Zeyu Liu ◽

Libo Zhang ◽

Jianwei Liu ◽

Xin Zuo ◽

...

Keyword(s):

Object Detection ◽

Back Propagation ◽

Ground Truth ◽

Confidence Score ◽

Multiple Instance Learning ◽

Pascal Voc ◽

Proposed Model ◽

Weakly Supervised ◽

Graph Based Model ◽

Small Overlap

Weakly supervised object detection (WSOD), which aims to detect objects with only image-level annotations, has become a research hotspot over the past few years. Recently, much effort has been devoted to WSOD because of its simple yet effective architecture, and remarkable improvements have been achieved. Existing approaches using multiple-instance learning (MIL) usually attend to proposals individually, ignoring relation information between proposals. Moreover, to obtain pseudo-ground-truth boxes for WSOD, MIL-based methods tend to select the region with the highest confidence score and treat regions with small overlap as background, which leads to mislabeled instances. As a result, these methods suffer from mislabeled instances and missing relations between proposals, which degrades WSOD performance. To tackle these issues, this article introduces a multi-peak graph-based model for WSOD. Specifically, we use an instance graph to model the relations between proposals, which reinforces the multiple-instance learning process. In addition, a multi-peak discovery strategy is designed to avoid mislabeling instances. The proposed model is trained end-to-end with a stochastic gradient descent optimizer using back-propagation. Extensive quantitative and qualitative evaluations on two challenging public benchmarks, PASCAL VOC 2007 and PASCAL VOC 2012, demonstrate the superiority and effectiveness of the proposed approach.
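
The abstract does not define the multi-peak discovery strategy in detail. As a hedged illustration of the idea of keeping several confident regions instead of a single top-scoring one, the sketch below treats proposals as nodes of a graph whose edges connect boxes with sufficient IoU, and calls a proposal a peak when none of its neighbours scores higher and its score is at least a fraction of the best score. The function names and the thresholds `iou_thr` and `rel_thr` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_peaks(boxes, scores, iou_thr=0.3, rel_thr=0.5):
    """Return indices of proposals that are local score maxima in the graph.

    Two proposals are neighbours when their IoU is at least iou_thr.
    A peak must score at least rel_thr times the best score and must not
    have a neighbour with a strictly higher score.
    """
    if not boxes:
        return []
    best = max(scores)
    peaks = []
    for i in range(len(boxes)):
        if scores[i] < rel_thr * best:
            continue
        neighbours = [
            j for j in range(len(boxes))
            if j != i and iou(boxes[i], boxes[j]) >= iou_thr
        ]
        if all(scores[j] <= scores[i] for j in neighbours):
            peaks.append(i)
    return peaks
```

Under this rule, a second object far from the top-scoring box can become its own pseudo-ground-truth box rather than being labeled as background because of its small overlap.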

Corporeità, movimento e benessere: il potenziale educativo di itinerari motorio-sportivi inclusivi giovanili

EDUCATION SCIENCES AND SOCIETY ◽

10.3280/ess1-2019oa7702 ◽

2019 ◽

pp. 3121-334

Author(s):

Carmen Palumbo ◽

Antinea Ambretti ◽

Giovanna Ferraioli

Keyword(s):

Young People ◽

Learning Process ◽

Personalized Learning ◽

Barriers To Learning ◽

Educational Value ◽

School Activities ◽

The Past ◽

Full Participation ◽

Inclusive Approach ◽

Teaching Learning

Over the past few decades, the adoption of an inclusive approach to education has stimulated reflection on the educational value of body and movement within the teaching-learning process, in order to break down all barriers to learning and promote the full participation of young people in school activities. Indeed, body and movement represent an important didactic "medium" for developing individualized and personalized learning paths that take into account the specific needs and characteristics of students, thus contributing to their global and harmonious development.
