Soft labeling with quasi-Gaussian structure for training samples of deep classification trackers

2020 ◽  
Vol 17 (2) ◽  
pp. 172988142091502
Author(s):  
Yan Peng ◽  
Jiantao Gao ◽  
Chang Liu ◽  
Xiaomao Li ◽  
Baojie Fan ◽  
...  

Deep classification tracking aims to classify candidate samples as target or background using a classifier that is generally trained with binary labels. However, a binary label only distinguishes samples of different classes and ignores the distinction among samples belonging to the same class, which weakens both classification and localization ability. To cope with this problem, this article proposes soft labeling with a quasi-Gaussian structure in place of binary labeling, which distinguishes samples of different classes and samples within the same class simultaneously. As with the binary label, the signs of the labels for target and background samples are set to plus and minus, respectively, to distinguish samples of different classes. Further, to exploit the difference among samples in the same class, the label values of samples in the same class are designed as a monotonically decreasing quasi-Gaussian function of Intersection over Union (IoU). Therefore, the corresponding response function is a two-piecewise monotonically increasing quasi-Gaussian combination function of IoU. Owing to this response function, a deep classification tracker trained with the proposed soft labeling achieves better classification and localization performance. To validate this, the proposed soft labeling is integrated into the pipeline of the deep classification tracker SiamFC. Experimental results on the OTB-2015 and VOT benchmarks show that our variant achieves significant improvement over the baseline tracker while maintaining real-time tracking speed, and attains accuracy comparable to recent state-of-the-art trackers.
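As a rough illustration of the idea in this abstract, the sketch below contrasts a binary label with a signed soft target that varies with Intersection over Union (IoU). The exact function used in the article is not given in the abstract, so the Gaussian form, the names `soft_label` and `binary_label`, the threshold, and the sigma values are all assumptions chosen only to reproduce the described shape: positive for target samples, negative for background samples, and increasing in IoU on each of the two pieces.

```python
import math


def binary_label(iou, threshold=0.5):
    """Conventional binary label: every sample in a class gets the same value."""
    return 1.0 if iou >= threshold else -1.0


def soft_label(iou, threshold=0.5, sigma_pos=0.3, sigma_neg=0.3):
    """Hypothetical signed soft target as a function of IoU.

    Not the formula from the article; it only mimics a two-piece,
    monotonically increasing, Gaussian-shaped response.
    """
    if not 0.0 <= iou <= 1.0:
        raise ValueError("iou must lie in [0, 1]")
    if iou >= threshold:
        # Target sample: positive sign, value rises toward +1 as IoU -> 1.
        return math.exp(-((1.0 - iou) ** 2) / (2.0 * sigma_pos ** 2))
    # Background sample: negative sign, value rises from -1 as IoU grows.
    return -math.exp(-(iou ** 2) / (2.0 * sigma_neg ** 2))


if __name__ == "__main__":
    for iou in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        print(f"IoU={iou:.1f}  binary={binary_label(iou):+.0f}  soft={soft_label(iou):+.3f}")
```

Under these assumptions, two target samples with IoU 0.6 and 0.9 share the same binary label but receive different soft targets, which is the within-class distinction the abstract describes.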

2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Tao Xiang ◽  
Tao Li ◽  
Mao Ye ◽  
Zijian Liu

Pedestrian detection with large intraclass variations is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a few local templates with different sizes and different locations in positive exemplars. Then, the Random Forest is built, with splitting functions optimized by maximizing the class purity obtained when matching the local templates to the training samples. To improve the classification accuracy, we adopt a boosting-like algorithm to update the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are the splitting functions based on local template matching with adaptive size and location, and the iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.
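The abstract does not specify the "boosting-like" weight update. As a stand-in, the sketch below uses the standard discrete AdaBoost rule, which raises the weights of misclassified samples and lowers those of correctly classified ones after each round (here, each layer). The function name `reweight` and the example values are illustrative assumptions, not the authors' procedure.

```python
import math


def reweight(weights, correct):
    """One AdaBoost-style update (a stand-in for the paper's unspecified rule).

    weights: current non-negative sample weights.
    correct: booleans, True where the current layer classified the sample correctly.
    Returns the normalised new weights and the layer coefficient alpha.
    """
    total = sum(weights)
    err = sum(w for w, ok in zip(weights, correct) if not ok) / total
    # Clamp the error so the logarithm below stays finite.
    err = min(max(err, 1e-10), 1.0 - 1e-10)
    alpha = 0.5 * math.log((1.0 - err) / err)
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    norm = sum(updated)
    return [w / norm for w in updated], alpha


if __name__ == "__main__":
    new_weights, alpha = reweight([0.25] * 4, [True, True, True, False])
    print(new_weights, alpha)
```

With four equally weighted samples and one misclassification, the misclassified sample ends up carrying half of the total weight, so the next layer concentrates on it.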


Author(s):  
Markus Endres ◽  
Lena Rudenko

A skyline query retrieves all objects in a dataset that are not dominated by other objects according to some given criteria. There exist many skyline algorithms which can be classified into generic, index-based, and lattice-based algorithms. This chapter takes a tour through lattice-based skyline algorithms. It summarizes the basic concepts and properties, presents high-performance parallel approaches, shows how one overcomes the low-cardinality restriction of lattice structures, and finally presents an application on data streams for real-time skyline computation. Experimental results on synthetic and real datasets show that lattice-based algorithms outperform state-of-the-art skyline techniques, and additionally have a linear runtime complexity.
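To make the dominance definition concrete, here is a minimal sketch of a naive quadratic skyline computation. It is not one of the lattice-based algorithms surveyed in the chapter; it only illustrates what a skyline query returns. The convention that smaller values are better in every dimension, and the hotel data, are assumptions for the example.

```python
def dominates(a, b):
    """True if a dominates b: no worse in every dimension, strictly better in one.

    Assumes smaller values are preferred in all dimensions.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical hotels as (price, distance to beach); lower is better for both.
    hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 9)]
    print(skyline(hotels))
```

In this example (70, 5) is dominated by (60, 3) and (90, 9) by (50, 8), so only the remaining three hotels form the skyline.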


Author(s):  
Qunsheng Ruan ◽  
Qingfeng Wu ◽  
Junfeng Yao ◽  
Yingdong Wang ◽  
Hsien-Wei Tseng ◽  
...  

In the intelligent processing of tongue images, one of the most important tasks is to accurately segment the tongue body from the whole image, and high-quality processing of the tongue body edge is of great significance for subsequent tongue feature extraction. To improve the performance of tongue image segmentation, we propose an efficient tongue segmentation model based on U-Net. Three contributions are made: optimizing the model's main network, designing a new network that specifically handles tongue edge segmentation, and proposing a weighted binary cross-entropy loss function. The purpose of optimizing the main segmentation network is to make the model recognize the foreground and background features of the tongue image as well as possible. A novel tongue edge segmentation network focuses on the tongue edge, because the edge carries a large amount of important information. Furthermore, the proposed loss function is adopted to strengthen pixel-level supervision for tongue images. Moreover, owing to a lack of tongue image resources in Traditional Chinese Medicine (TCM), some special measures are adopted to augment the training samples. Various comparative experiments on two datasets were conducted to verify the performance of the segmentation model. The experimental results indicate that the loss of our model converges faster than that of the others, and that our model segments tongue images captured in poor environments with better stability and robustness. The results also indicate that our model outperforms state-of-the-art models on the two most important tongue image segmentation metrics, IoU and Dice. Moreover, experimental results on augmented samples demonstrate that our model performs better.
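The abstract names a weighted binary cross-entropy loss and the IoU and Dice metrics without giving formulas. The sketch below shows one common form of each on flattened binary masks. The positive-class weight `pos_weight` is an assumed weighting scheme, not necessarily the one proposed in the paper, and the function names are hypothetical.

```python
import math


def weighted_bce(probs, targets, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy with an extra weight on foreground pixels.

    probs: predicted foreground probabilities; targets: 0/1 ground truth.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def iou_and_dice(pred, truth):
    """IoU and Dice for two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_n = sum(1 for p in pred if p)
    truth_n = sum(1 for t in truth if t)
    union = pred_n + truth_n - inter
    iou = inter / union if union else 1.0
    dice = 2 * inter / (pred_n + truth_n) if (pred_n + truth_n) else 1.0
    return iou, dice


if __name__ == "__main__":
    print(weighted_bce([0.9, 0.2, 0.4], [1, 0, 1]))
    print(iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0]))
```

Setting `pos_weight` above 1 penalizes missed tongue pixels more heavily than false positives, which is one way such a loss can strengthen supervision on a minority foreground class.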


2020 ◽  
Vol 34 (07) ◽  
pp. 11685-11692
Author(s):  
Zili Liu ◽  
Tu Zheng ◽  
Guodong Xu ◽  
Zheng Yang ◽  
Haifeng Liu ◽  
...  

Modern object detectors can rarely achieve short training time, fast inference speed, and high accuracy at the same time. To strike a balance among them, we propose the Training-Time-Friendly Network (TTFNet). In this work, we start with light-head, single-stage, and anchor-free designs, which enable fast inference speed. Then, we focus on shortening training time. We notice that encoding more training samples from annotated boxes plays a role similar to increasing the batch size, which helps enlarge the learning rate and accelerate the training process. To this end, we introduce a novel approach using Gaussian kernels to encode training samples. Besides, we design the initiative sample weights for better information utilization. Experiments on MS COCO show that our TTFNet has great advantages in balancing training time, inference speed, and accuracy. It reduces training time by more than seven times compared to previous real-time detectors while maintaining state-of-the-art performance. In addition, our super-fast versions, TTFNet-18 and TTFNet-53, can outperform SSD300 and YOLOv3, respectively, using less than one-tenth of their training time. The code has been made available at https://github.com/ZJULearning/ttfnet.
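To illustrate what "using Gaussian kernels to encode training samples" could look like, the sketch below fills the area of one annotated box with a 2D Gaussian centred on the box, so that many pixels inside the box, not just the centre, act as weighted training samples. This is a simplified reading of the abstract rather than the released TTFNet code; the function name, the `alpha` scaling factor, and the choice of standard deviation proportional to box size are assumptions.

```python
import math


def gaussian_box_heatmap(height, width, box, alpha=0.5):
    """Encode one box (x1, y1, x2, y2) as a Gaussian-weighted heatmap.

    Pixels outside the box stay at 0; the box centre gets weight 1.
    The sigma = alpha * size / 6 rule is an illustrative assumption.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    sigma_x = max(alpha * (x2 - x1) / 6.0, 1e-6)
    sigma_y = max(alpha * (y2 - y1) / 6.0, 1e-6)
    heat = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if x1 <= x <= x2 and y1 <= y <= y2:
                dx = (x - cx) ** 2 / (2.0 * sigma_x ** 2)
                dy = (y - cy) ** 2 / (2.0 * sigma_y ** 2)
                heat[y][x] = math.exp(-(dx + dy))
    return heat


if __name__ == "__main__":
    heat = gaussian_box_heatmap(9, 9, (2, 2, 6, 6))
    for row in heat:
        print(" ".join(f"{v:.2f}" for v in row))
```

A single 5 by 5 box here yields 25 weighted samples instead of one, which matches the abstract's observation that encoding more samples per box acts like a larger batch.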


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3371 ◽  
Author(s):  
Hossain ◽  
Lee

In recent years, demand has been increasing for target detection and tracking from aerial imagery via drones using onboard powered sensors and devices. We propose a very effective method for this application based on a deep learning framework. A state-of-the-art embedded hardware system empowers small flying robots to carry out the real-time onboard computation necessary for object tracking. Two types of embedded modules were developed: one was designed using a Jetson TX or AGX Xavier, and the other was based on an Intel Neural Compute Stick. These are suitable for real-time onboard computing power on small flying drones with limited space. A comparative analysis of current state-of-the-art deep learning-based multi-object detection algorithms was carried out utilizing the designated GPU-based embedded computing modules to obtain detailed metric data about frame rates, as well as the computation power. We also introduce an effective target tracking approach for moving objects. The algorithm for tracking moving objects is based on the extension of simple online and real-time tracking. It was developed by integrating a deep learning-based association metric approach with simple online and real-time tracking (Deep SORT), which uses a hypothesis tracking methodology with Kalman filtering and a deep learning-based association metric. In addition, a guidance system that tracks the target position using a GPU-based algorithm is introduced. Finally, we demonstrate the effectiveness of the proposed algorithms by real-time experiments with a small multi-rotor drone.


Author(s):  
Shoujin Wang ◽  
Liang Hu ◽  
Yan Wang ◽  
Quan Z. Sheng ◽  
Mehmet Orgun ◽  
...  

A session-based recommender system (SBRS) suggests the next item by modeling the dependencies between items in a session. Most existing SBRSs assume the items inside a session are associated with one (implicit) purpose. However, this may not always be true in reality, and a session may often consist of multiple subsets of items for different purposes (e.g., breakfast and decoration). Specifically, items (e.g., bread and milk) in a subset have strong purpose-specific dependencies, whereas items (e.g., bread and vase) from different subsets have much weaker or even no dependencies due to the difference in purposes. Therefore, we propose a mixture-channel model to accommodate the multi-purpose item subsets and represent a session more precisely. Filling gaps in existing SBRSs, this model recommends more diverse items to satisfy different purposes. Accordingly, we design effective mixture-channel purpose routing networks (MCPRN), with a purpose routing network that detects the purposes of each item and assigns it to the corresponding channels. Moreover, a purpose-specific recurrent network is devised to model the dependencies between items within each channel for a specific purpose. The experimental results show the superiority of MCPRN over state-of-the-art methods in terms of both recommendation accuracy and diversity.


Author(s):  
Ximing Zhang ◽  
Mingang Wang ◽  
Lin Cao

Most tracking-by-detection trackers employ an online model update scheme based on the spatiotemporal consistency of visual cues. In the presence of self-deformation, abrupt motion, and heavy occlusion, these trackers are affected by such challenging attributes and are prone to drifting. Models based on offline training, namely Siamese networks, are invariant to these attributes. However, the tracking speed of the offline method can be too slow for real-time tracking. In this paper, a novel collaborative tracker that decomposes the tracking task into online and offline modes is proposed. Our tracker switches between the online and offline modes automatically based on the tracker status inferred from the proposed tracking-failure detection method, which is based on the dispersal measure of the response map. The proposed Real-Time Thermal Infrared Collaborative Online and Offline Tracker (TCOOT) achieves state-of-the-art tracking performance while maintaining real-time speed. Experiments are carried out on the VOT-TIR-2015 benchmark dataset, and our tracker achieves superior performance against the Staple and SiamFC trackers by 3.3% and 3.6% on the precision criterion and 3.8% and 5% on the success criterion, respectively.


2015 ◽  
Vol 742 ◽  
pp. 318-321
Author(s):  
Wang Luo ◽  
Lei Yu ◽  
Min Feng ◽  
Gong Yi Hong ◽  
Qi Wei Peng ◽  
...  

In this paper, we present a hierarchical activity recognition method for detecting workers sleeping at the desk in a business hall. The method consists of three steps. First, reference points such as body joints are obtained from workers in the business hall. Second, we build a dependency graph to represent the relationships between reference points. Third, multidimensional output regressions along the dependency paths are used to estimate the positions of these reference body points. Experimental results demonstrate that our method achieves accuracy comparable to state-of-the-art results.


2013 ◽  
Vol 475-476 ◽  
pp. 947-951
Author(s):  
Zhi Yuan Mai ◽  
Kun Yu Tan ◽  
An Ting Xu ◽  
Wei Xiang

The Mean Shift tracking algorithm performs poorly on fast-moving targets when the difference between target and background pixels is not obvious in global-vision video of robotic fish. To address the difficulty of tracking drastically moving targets, this paper determines the position of the moving target in the next frame by comparing two preset Bhattacharyya (BC) coefficients, with the Epanechnikov kernel selected for density estimation. The experimental results show that the proposed algorithm can track moving targets in video efficiently and precisely, and can also meet demanding real-time requirements at low computational cost.
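The two building blocks named in this abstract can be sketched as follows, under the assumption that "bc coefficients" refers to Bhattacharyya coefficients, as is usual in Mean Shift tracking. The code builds an Epanechnikov-weighted colour histogram and picks whichever of two candidate positions has the larger Bhattacharyya coefficient with the target model. The comparison rule, the function names, and the pixel representation are illustrative assumptions, not the authors' algorithm.

```python
import math


def epanechnikov_profile(r2):
    """Unnormalised Epanechnikov profile: 1 - r^2 inside the unit ball, else 0."""
    return 1.0 - r2 if r2 <= 1.0 else 0.0


def weighted_histogram(pixels, center, bandwidth, bins):
    """Kernel-weighted histogram; pixels is a list of ((x, y), bin_index)."""
    cx, cy = center
    hist = [0.0] * bins
    for (x, y), b in pixels:
        r2 = ((x - cx) ** 2 + (y - cy) ** 2) / bandwidth ** 2
        hist[b] += epanechnikov_profile(r2)
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalised histograms (1 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def pick_position(target_hist, candidates):
    """Return the (position, histogram) candidate most similar to the target."""
    return max(candidates, key=lambda c: bhattacharyya(target_hist, c[1]))


if __name__ == "__main__":
    target = [0.7, 0.3]
    near = ((10, 12), [0.6, 0.4])
    far = ((30, 5), [0.1, 0.9])
    print(pick_position(target, [near, far])[0])
```

Pixels near the window centre contribute most to the histogram and pixels at the window edge contribute nothing, which reduces the influence of background pixels at the target boundary.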


Electronics ◽  
2021 ◽  
Vol 10 (20) ◽  
pp. 2488
Author(s):  
Daohui Ge ◽  
Ruyi Liu ◽  
Yunan Li ◽  
Qiguang Miao

Effectively learning the appearance change of a target is the key point of an online tracker. When occlusion and misalignment occur, the tracking results usually contain a great amount of background information, which heavily affects the ability of a tracker to distinguish between targets and backgrounds, eventually leading to tracking failure. To solve this problem, we propose a simple and robust reliable memory model. In particular, an adaptive evaluation strategy (AES) is proposed to assess the reliability of tracking results. AES combines the confidence of the tracker predictions and the similarity distance, which is between the current predicted result and the existing tracking results. Based on the reliable results of AES selection, we designed an active–frozen memory model to store reliable results. Training samples stored in active memory are used to update the tracker, while frozen memory temporarily stores inactive samples. The active–frozen memory model maintains the diversity of samples while satisfying the limitation of storage. We performed comprehensive experiments on five benchmarks: OTB-2013, OTB-2015, UAV123, Temple-color-128, and VOT2016. The experimental results show that our tracker achieves state-of-the-art performance.

