Deep Learning of Appearance Affinity for Multi-Object Tracking and Re-Identification: A Comparative View

Recognizing the identity of a query individual in a surveillance sequence is the core of Multi-Object Tracking (MOT) and Re-Identification (Re-Id) algorithms. Both tasks can be addressed by measuring the appearance affinity between observations of people with a deep neural model. Nevertheless, the two tasks differ in their specifications and, consequently, in the characteristics and constraints of their available training data, which makes it necessary to employ a different learning approach for each. This article offers a comparative view of the Double-Margin-Contrastive and the Triplet loss functions, and analyzes the benefits and drawbacks of applying each of them to learn an Appearance Affinity model for Tracking and Re-Identification. A batch of experiments has been conducted, and the results support the hypothesis drawn from the presented study: the Triplet loss function is more effective than the Contrastive one when a Re-Id model is learnt, and, conversely, in the MOT domain, the Contrastive loss better discriminates between pairs of images that do or do not render the same person.
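
As a rough illustration of the two losses being compared, the sketch below evaluates a double-margin contrastive loss on a single pair and a triplet loss on a single triplet, using Euclidean distance between embeddings. The abstract does not give the exact formulations or margin values, so the function names, the squared-hinge form, and the margins (`m_pos`, `m_neg`, `margin`) are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def double_margin_contrastive(a, b, same_identity, m_pos=1.0, m_neg=2.0):
    """Pair loss with two margins (assumed form, not taken from the paper).

    Same-identity pairs are penalized only when farther apart than m_pos;
    different-identity pairs only when closer than m_neg.
    """
    d = euclidean(a, b)
    if same_identity:
        return max(0.0, d - m_pos) ** 2
    return max(0.0, m_neg - d) ** 2


def triplet(anchor, positive, negative, margin=1.0):
    """Triplet loss: the positive should be closer to the anchor than the
    negative by at least `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings, chosen only to make the distances easy to read.
anchor = [0.0, 0.0]
positive = [0.3, 0.4]       # distance 0.5 from the anchor
near_negative = [0.6, 0.8]  # distance 1.0 from the anchor
far_negative = [3.0, 4.0]   # distance 5.0 from the anchor

easy_pair = double_margin_contrastive(anchor, far_negative, False)   # no penalty
hard_pair = double_margin_contrastive(anchor, near_negative, False)  # penalized
easy_triplet = triplet(anchor, positive, far_negative)               # no penalty
hard_triplet = triplet(anchor, positive, near_negative)              # penalized
```

The pair loss needs only a same/different label for each pair, whereas the triplet loss needs an anchor, a positive, and a negative drawn together. This difference in what the training data must provide is the kind of constraint the abstract refers to.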

Real-Time Automated Classification of Sky Conditions Using Deep Learning and Edge Computing

Remote Sensing ◽

10.3390/rs13193859 ◽

2021 ◽

Vol 13 (19) ◽

pp. 3859

Author(s):

Joby M. Prince Czarnecki ◽

Sathishkumar Samiappan ◽

Meilun Zhou ◽

Cary Daniel McCraine ◽

Louis L. Wasson

Keyword(s):

Neural Network ◽

Deep Learning ◽

Image Quality ◽

Convolutional Neural Network ◽

Precision Agriculture ◽

Edge Computing ◽

Training Data ◽

Learning Approaches ◽

Sky Conditions

The radiometric quality of remotely sensed imagery is crucial for precision agriculture applications because estimations of plant health rely on the underlying quality. Sky conditions, and specifically shadowing from clouds, are critical determinants in the quality of images that can be obtained from low-altitude sensing platforms. In this work, we first compare common deep learning approaches to classify sky conditions with regard to cloud shadows in agricultural fields using a visible spectrum camera. We then develop an artificial-intelligence-based edge computing system to fully automate the classification process. Training data consisting of 100 oblique angle images of the sky were provided to a convolutional neural network and two deep residual neural networks (ResNet18 and ResNet34) to facilitate learning two classes, namely (1) good image quality expected, and (2) degraded image quality expected. The expectation of quality stemmed from the sky condition (i.e., density, coverage, and thickness of clouds) present at the time of the image capture. These networks were tested using a set of 13,000 images. Our results demonstrated that ResNet18 and ResNet34 classifiers produced better classification accuracy when compared to a convolutional neural network classifier. The best overall accuracy was obtained by ResNet34, which was 92% accurate, with a Kappa statistic of 0.77. These results demonstrate a low-cost solution to quality control for future autonomous farming systems that will operate without human intervention and supervision.
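
The abstract reports both an overall accuracy and a Kappa statistic. The sketch below shows how the two are typically derived from a confusion matrix, assuming the Kappa in question is Cohen's kappa. The counts are hypothetical and are not the paper's results.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = reference class, columns = predicted class)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row marginal * column marginal).
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for two classes ("good" vs "degraded" image quality).
confusion = [[45, 5],
             [5, 45]]
acc, kappa = accuracy_and_kappa(confusion)
```

Kappa discounts the agreement expected by chance, so it can sit well below the raw accuracy when one class dominates the test set.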

Morphological Estimation of Cellularity on Neo-Adjuvant Treated Breast Cancer Histological Images

Journal of Imaging ◽

10.3390/jimaging6100101 ◽

2020 ◽

Vol 6 (10) ◽

pp. 101

Author(s):

Mauricio Alberto Ortega-Ruiz ◽

Cefa Karabağ ◽

Victor García Garduño ◽

Constantino Carlos Reyes-Aldasoro

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Morphological Features ◽

Training Data ◽

Morphological Operations ◽

Morphological Parameters ◽

Learning Approaches ◽

Residual Cancer Burden ◽

Histological Images ◽

Treated Breast

This paper describes a methodology that extracts key morphological features from histological breast cancer images in order to automatically assess Tumour Cellularity (TC) in Neo-Adjuvant treatment (NAT) patients. The response to NAT gives information on therapy efficacy and it is measured by the residual cancer burden index, which is composed of two metrics: TC and the assessment of lymph nodes. The data consist of whole slide images (WSIs) of breast tissue stained with Hematoxylin and Eosin (H&E) released in the 2019 SPIE Breast Challenge. The methodology proposed is based on traditional computer vision methods (K-means, watershed segmentation, Otsu’s binarisation, and morphological operations), implementing colour separation, segmentation, and feature extraction. Correlation between morphological features and the residual TC after NAT was examined. Linear regression and statistical methods were used and twenty-two key morphological parameters from the nuclei, epithelial region, and the full image were extracted. Subsequently, an automated TC assessment that was based on Machine Learning (ML) algorithms was implemented and trained with only selected key parameters. The methodology was validated with the score assigned by two pathologists through the intra-class correlation coefficient (ICC). The selection of key morphological parameters improved on the results reported for other ML methodologies and came very close to those of deep learning methodologies. These results are encouraging, as a traditionally-trained ML algorithm can be useful when limited training data are available, which prevents the use of deep learning approaches.
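
Of the traditional methods listed, Otsu's binarisation is compact enough to sketch. The function below picks the grey-level threshold that maximizes between-class variance over a flat list of 8-bit intensities. It is a generic textbook version with hypothetical names and data, and is not the paper's pipeline, whose details are not given in the abstract.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance,
    where pixels <= t form one class and pixels > t the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count in the lower class
    sum0 = 0  # intensity sum in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical bimodal image: dark nuclei (10) on a bright background (200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p <= t for p in pixels]  # True where a pixel falls in the dark class
```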

A general approach for improving deep learning-based medical relation extraction using a pre-trained model and fine-tuning

Database ◽

10.1093/database/baz116 ◽

2019 ◽

Vol 2019 ◽

Cited By ~ 2

Author(s):

Tao Chen ◽

Mingfen Wu ◽

Hexi Li

Keyword(s):

Deep Learning ◽

Large Scale ◽

Relation Extraction ◽

Training Model ◽

Biomedical Literature ◽

Training Data ◽

Fine Tuning ◽

Learning Approaches ◽

Additional Time ◽

Clinical Records

The automatic extraction of meaningful relations from biomedical literature or clinical records is crucial in various biomedical applications. Most of the current deep learning approaches for medical relation extraction require large-scale training data to prevent overfitting of the training model. We propose using a pre-trained model and a fine-tuning technique to improve these approaches without additional time-consuming human labeling. Firstly, we show the architecture of Bidirectional Encoder Representations from Transformers (BERT), an approach for pre-training a model on large-scale unstructured text. We then combine BERT with a one-dimensional convolutional neural network (1d-CNN) to fine-tune the pre-trained model for relation extraction. Extensive experiments on three datasets, namely the BioCreative V chemical disease relation corpus, traditional Chinese medicine literature corpus and i2b2 2012 temporal relation challenge corpus, show that the proposed approach achieves state-of-the-art results (giving a relative improvement of 22.2, 7.77, and 38.5% in F1 score, respectively, compared with a traditional 1d-CNN classifier). The source code is available at https://github.com/chentao1999/MedicalRelationExtraction.
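
The improvements above are stated as relative gains, which are easy to confuse with absolute F1 points. The short sketch below shows the arithmetic, assuming the usual definition of relative improvement. The scores are hypothetical, since the abstract does not list the underlying F1 values.

```python
def relative_improvement(baseline_f1, new_f1):
    """Relative improvement of new_f1 over baseline_f1, in percent."""
    return (new_f1 - baseline_f1) / baseline_f1 * 100.0


# Hypothetical scores, not taken from the paper: a 10-point absolute gain
# over a baseline of 0.50 is a relative gain of roughly 20 percent.
gain = relative_improvement(0.50, 0.60)
```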

BUILDING GENERALIZATION USING DEEP LEARNING

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-565-2018 ◽

2018 ◽

Vol XLII-4 ◽

pp. 565-572 ◽

Cited By ~ 4

Author(s):

M. Sester ◽

Y. Feng ◽

F. Thiemann

Keyword(s):

Computer Vision ◽

Deep Learning ◽

Physical Reality ◽

Training Data ◽

Data Sets ◽

Learning Approaches ◽

Depth Analysis ◽

Map Series ◽

Training Examples ◽

Future Work

Cartographic generalization is a problem that poses interesting challenges to automation. Whereas plenty of algorithms have been developed for the different sub-problems of generalization (e.g. simplification, displacement, aggregation), there are still cases that are not generalized adequately or in a satisfactory way. The main problem is the interplay between different operators. In those cases the benchmark is the human operator, who is able to design an aesthetic and correct representation of the physical reality. Deep Learning methods have shown tremendous success for interpretation problems for which algorithmic methods have deficits. A prominent example is the classification and interpretation of images, where deep learning approaches outperform the traditional computer vision methods. In both domains (computer vision and cartography) humans are able to produce a solution; a prerequisite for this is that many training examples can be generated for the different cases. Thus, the idea in this paper is to employ Deep Learning for cartographic generalization tasks, especially for the task of building generalization. An advantage of this task is the fact that many training data sets are available from given map series. The approach is a first attempt using an existing network. In the paper, the details of the implementation will be reported, together with an in-depth analysis of the results. An outlook on future work will be given.

Multiple Object Tracking in Deep Learning Approaches: A Survey

Electronics ◽

10.3390/electronics10192406 ◽

2021 ◽

Vol 10 (19) ◽

pp. 2406

Author(s):

Yesul Park ◽

L. Minh Dang ◽

Sujin Lee ◽

Dongil Han ◽

Hyeonjoon Moon

Keyword(s):

Deep Learning ◽

Object Tracking ◽

Multiple Object Tracking ◽

Motion Trajectory ◽

Learning Approaches ◽

Multiple Object ◽

Problems And Solutions ◽

Benchmark Datasets ◽

Future Work ◽

Appearance Changes

Object tracking is a fundamental computer vision problem that refers to a set of methods proposed to precisely track the motion trajectory of an object in a video. Multiple Object Tracking (MOT) is a subclass of object tracking that has received growing interest due to its academic and commercial potential. Although numerous methods have been introduced to cope with this problem, many challenges remain to be solved, such as severe object occlusion and abrupt appearance changes. This paper focuses on giving a thorough review of the evolution of MOT in recent decades, investigating the recent advances in MOT, and showing some potential directions for future work. The primary contributions include: (1) a detailed description of the MOT’s main problems and solutions, (2) a categorization of the previous MOT algorithms into 12 approaches and discussion of the main procedures for each category, (3) a review of the benchmark datasets and standard evaluation methods for evaluating the MOT, (4) a discussion of various MOT challenges and solutions by analyzing the related references, and (5) a summary of the latest MOT technologies and recent MOT trends using the mentioned MOT categories.

Review of Deep Learning Methods in Robotic Grasp Detection

Multimodal Technologies and Interaction ◽

10.3390/mti2030057 ◽

2018 ◽

Vol 2 (3) ◽

pp. 57 ◽

Cited By ~ 40

Author(s):

Shehan Caldera ◽

Alexander Rassau ◽

Douglas Chai

Keyword(s):

Deep Learning ◽

Language Processing ◽

General Purpose ◽

Training Data ◽

Learning Approaches ◽

Automated Driving ◽

Learning Methods ◽

Robotic Vision ◽

Object A ◽

Robotic Grasp

For robots to attain more general-purpose utility, grasping is a necessary skill to master. Such general-purpose robots may use their perception abilities to visually identify grasps for a given object. A grasp describes how a robotic end-effector can be arranged to securely grab an object and successfully lift it without slippage. Traditionally, grasp detection requires expert human knowledge to analytically form the task-specific algorithm, but this is an arduous and time-consuming approach. During the last five years, deep learning methods have enabled significant advancements in robotic vision, natural language processing, and automated driving applications. The successful results of these methods have driven robotics researchers to explore the use of deep learning methods in task-generalised robotic applications. This paper reviews the current state of the art in the application of deep learning methods to generalised robotic grasping and discusses how each element of the deep learning approach has improved the overall performance of robotic grasp detection. Several of the most promising approaches are evaluated, and the one-shot detection method is identified as the most suitable for real-time grasp detection. The availability of suitable volumes of appropriate training data is identified as a major obstacle to the effective use of deep learning approaches, and transfer learning techniques are proposed as a potential mechanism to address this. Finally, current trends in the field and potential future research directions are discussed.

Combining Deep Learning and Qualitative Spatial Reasoning to Learn Complex Structures from Sparse Examples with Noise

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33012911 ◽

2019 ◽

Vol 33 ◽

pp. 2911-2918 ◽

Cited By ~ 2

Author(s):

Nikhil Krishnaswamy ◽

Scott Friedman ◽

James Pustejovsky

Keyword(s):

Deep Learning ◽

Heuristic Search ◽

Spatial Reasoning ◽

Spatial Relations ◽

Search Space ◽

Training Data ◽

Learning Approaches ◽

Qualitative Spatial Reasoning ◽

Heuristic Search Algorithms ◽

Novel Approach

Many modern machine learning approaches require vast amounts of training data to learn new concepts; conversely, human learning often requires few examples—sometimes only one—from which the learner can abstract structural concepts. We present a novel approach to introducing new spatial structures to an AI agent, combining deep learning over qualitative spatial relations with various heuristic search algorithms. The agent extracts spatial relations from a sparse set of noisy examples of block-based structures, and trains convolutional and sequential models of those relation sets. To create novel examples of similar structures, the agent begins placing blocks on a virtual table, uses a CNN to predict the most similar complete example structure after each placement, an LSTM to predict the most likely set of remaining moves needed to complete it, and recommends one using heuristic search. We verify that the agent learned the concept by observing its virtual block-building activities, wherein it ranks each potential subsequent action toward building its learned concept. We empirically assess this approach with human participants’ ratings of the block structures. Initial results and qualitative evaluations of structures generated by the trained agent show where it has generalized concepts from the training data, which heuristics perform best within the search space, and how we might improve learning and execution.

Simulation-based Data Augmentation for the Quality Inspection of Structural Adhesive with Deep Learning

10.36227/techrxiv.14287334 ◽

2021 ◽

Author(s):

Ricardo Peres ◽

Magno Guedes ◽

Fábio Miranda ◽

José Barata

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Data Availability ◽

Training Data ◽

Learning Approaches ◽

Automotive Parts ◽

Novel Method ◽

The Cost ◽

Structural Adhesive ◽

Data Context

The advent of Industry 4.0 has shown the tremendous transformative potential of combining artificial intelligence, cyber-physical systems and Internet of Things concepts in industrial settings. Despite this, data availability is still a major roadblock for the successful adoption of data-driven solutions, particularly concerning deep learning approaches in manufacturing. Specifically in the quality control domain, annotated defect data can often be costly, time-consuming and inefficient to obtain, potentially compromising the viability of deep learning approaches due to data scarcity. In this context, we propose a novel method for generating annotated synthetic training data for automated quality inspections of structural adhesive applications, validated in an industrial cell for automotive parts. Our approach greatly reduces the cost of training deep learning models for this task, while simultaneously improving their performance in a scarce manufacturing data context with imbalanced training sets by 3.1%. Additional results can be seen at https://git.io/Jtc4b.

A Natural Images Pre-Trained Deep Learning Method for Seismic Random Noise Attenuation

Remote Sensing ◽

10.3390/rs14020263 ◽

2022 ◽

Vol 14 (2) ◽

pp. 263

Author(s):

Haixia Zhao ◽

Tingting Bai ◽

Zhiqiang Wang

Keyword(s):

Deep Learning ◽

Seismic Data ◽

Field Data ◽

Noise Suppression ◽

Signal To Noise Ratio ◽

Random Noise ◽

Natural Images ◽

Training Data ◽

Learning Approaches ◽

Learning Method

Seismic field data are usually contaminated by random or complex noise, which seriously degrades data quality and, in turn, seismic imaging and interpretation. Improving the signal-to-noise ratio (SNR) of seismic data has always been a key step in seismic data processing. Deep learning approaches have been successfully applied to suppress seismic random noise. Training examples are essential in deep learning methods, especially for geophysical problems, where complete training data are not easy to acquire because of the high cost of acquisition. In this work, we propose a deep learning method pre-trained on natural images to suppress seismic random noise, drawing on transfer learning. Our network contains pre-trained and post-trained networks: the former is trained on natural images to obtain preliminary denoising results, while the latter is trained on a small number of seismic images to fine-tune the denoising by semi-supervised learning and enhance the continuity of geological structures. Results on four types of synthetic seismic data and six field data sets demonstrate that our network performs well in seismic random noise suppression in terms of both quantitative metrics and visual results.
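
Since SNR is the quantity being improved, a small sketch of one common way to compute it in decibels may help. The abstract does not state which SNR definition the paper uses, so the formula below (signal energy over residual energy) is an assumption, as are the function name and the toy trace.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of `estimate` relative to the reference trace `clean`:
    10 * log10(signal energy / residual energy)."""
    signal_energy = sum(c * c for c in clean)
    residual_energy = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_energy / residual_energy)


# Hypothetical trace: a clean signal, a noisy observation, a denoised output.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.5, -0.5, 1.5, -0.5]
denoised = [1.1, -0.9, 1.1, -0.9]

snr_before = snr_db(clean, noisy)
snr_after = snr_db(clean, denoised)
```

A denoiser that works should raise this value, since it shrinks the residual between the estimate and the clean reference.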

Weakly-Supervised Vessel Detection in Ultra-Widefield Fundus Photography Via Iterative Multi-Modal Registration and Learning

10.36227/techrxiv.12283736 ◽

2020 ◽

Author(s):

Li Ding ◽

Ajay E. Kuriyan ◽

Rajeev S. Ramchandran ◽

Charles C. Wykoff ◽

Gaurav Sharma

Keyword(s):

Deep Learning ◽

De Novo ◽

Training Data ◽

Fundus Photography ◽

Learning Approaches ◽

Robust Learning ◽

Vessel Detection ◽

Weakly Supervised ◽

Human Assessment ◽

Noisy Labels

We propose a deep-learning-based, annotation-efficient framework for vessel detection in ultra-widefield (UWF) fundus photography (FP) that does not require de novo labeled UWF FP vessel maps. Our approach utilizes concurrently captured UWF fluorescein angiography (FA) images, for which effective deep learning approaches have recently become available, and iterates between a multi-modal registration step and a weakly-supervised learning step. In the registration step, the UWF FA vessel maps detected with a pre-trained deep neural network (DNN) are registered with the UWF FP via parametric chamfer alignment. The warped vessel maps can be used as tentative training data but inevitably contain incorrect (noisy) labels due to the differences between the FA and FP modalities and errors in the registration. In the learning step, a robust learning method is proposed to train DNNs with noisy labels. The detected FP vessel maps are used for the registration in the following iteration. The registration and the vessel detection benefit from each other and are progressively improved. Once trained, the UWF FP vessel detection DNN from the proposed approach allows FP vessel detection without requiring concurrently captured UWF FA images. We validate the proposed framework on a new UWF FP dataset, PRIMEFP20, and on existing narrow-field FP datasets. Experimental evaluation, using both pixel-wise metrics and the CAL metrics designed to provide better agreement with human assessment, shows that the proposed approach provides accurate vessel detection without requiring manually labeled UWF FP training data.
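
To give a feel for the registration step, the sketch below computes a one-directional chamfer distance between two sets of vessel pixel coordinates and picks the translation that minimizes it. This is a deliberate simplification: the abstract mentions a parametric alignment without specifying the transformation, so the translation-only grid search, the function names, and the toy coordinates are all assumptions.

```python
import math


def chamfer_distance(source, target):
    """Mean distance from each source point to its nearest target point."""
    return sum(
        min(math.hypot(sx - tx, sy - ty) for tx, ty in target)
        for sx, sy in source
    ) / len(source)


def best_translation(source, target, shifts):
    """Pick the (dx, dy) in `shifts` that minimizes the chamfer distance
    between the shifted source points and the target points."""
    def cost(shift):
        moved = [(x + shift[0], y + shift[1]) for x, y in source]
        return chamfer_distance(moved, target)
    return min(shifts, key=cost)


# Hypothetical vessel pixels: the target is the source shifted by (2, 1).
source = [(0, 0), (1, 1), (2, 2), (3, 3)]
target = [(x + 2, y + 1) for x, y in source]
shifts = [(dx, dy) for dx in range(-3, 4) for dy in range(-3, 4)]
dx, dy = best_translation(source, target, shifts)
```

Real vessel maps would need a richer transformation model and a faster nearest-point lookup (for example a distance transform) than this brute-force search.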
