Improving the reliability of 3D people tracking system by means of deep-learning

Author(s):  
Matteo Boschini ◽  
Matteo Poggi ◽  
Stefano Mattoccia
2021 ◽  
Vol 11 (2) ◽  
pp. 851
Author(s):  
Wei-Liang Ou ◽  
Tzu-Ling Kuo ◽  
Chin-Chieh Chang ◽  
Chih-Peng Fan

In this study, a pupil tracking methodology based on deep-learning technology is developed for visible-light wearable eye trackers. By applying deep-learning object detection based on the You Only Look Once (YOLO) model, the proposed pupil tracking method can effectively estimate and predict the center of the pupil in the visible-light mode. When the developed YOLOv3-tiny-based model is used to test pupil tracking performance, the detection accuracy reaches 80% and the recall rate is close to 83%. In addition, the average visible-light pupil tracking errors of the proposed YOLO-based deep-learning design are smaller than 2 pixels in the training mode and 5 pixels in the cross-person test, which are much smaller than those of the previous ellipse-fitting design without deep-learning technology under the same visible-light conditions. After incorporating the calibration process, the average gaze tracking errors of the proposed YOLOv3-tiny-based pupil tracking models are smaller than 2.9 and 3.5 degrees in the training and testing modes, respectively, and the proposed visible-light wearable gaze tracking system performs at up to 20 frames per second (FPS) on the GPU-based embedded software platform.
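The pupil-center estimate in a detector-based design of this kind is typically the center of the detected bounding box, and tracking error is reported as mean pixel distance to ground truth. A minimal sketch (not the authors' code; box coordinates and ground-truth points are illustrative):

```python
# Minimal sketch: a YOLO-style detector returns a bounding box around the
# pupil; its centre is taken as the pupil-centre estimate, and accuracy is
# reported as the mean pixel error against ground-truth centres.

def bbox_center(box):
    """Centre of an (x_min, y_min, x_max, y_max) bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

def mean_pixel_error(pred_boxes, true_centers):
    """Mean Euclidean distance between predicted and true pupil centres."""
    total = 0.0
    for box, (tx, ty) in zip(pred_boxes, true_centers):
        cx, cy = bbox_center(box)
        total += ((cx - tx) ** 2 + (cy - ty) ** 2) ** 0.5
    return total / len(pred_boxes)

boxes = [(10, 20, 30, 40), (50, 60, 70, 80)]  # hypothetical detections
truth = [(21.0, 31.0), (60.0, 70.0)]          # hypothetical ground truth
print(mean_pixel_error(boxes, truth))         # → ~0.707
```

The sub-2-pixel figures reported above are averages of exactly this kind of per-frame distance over the test videos.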


2021 ◽  
Vol 11 (12) ◽  
pp. 5503
Author(s):  
Munkhjargal Gochoo ◽  
Syeda Amna Rizwan ◽  
Yazeed Yasin Ghadi ◽  
Ahmad Jalal ◽  
Kibum Kim

Automatic head tracking and counting using depth imagery has various practical applications in security, logistics, queue management, space utilization and visitor counting. However, no currently available system can clearly distinguish between a human head and other objects in order to track and count people accurately. For this reason, we propose a novel system that can track people by monitoring their heads and shoulders in complex environments and also count the number of people entering and exiting the scene. Our system comprises six phases. First, preprocessing is done by converting videos of a scene into frames and removing the background from the video frames. Second, heads are detected using the Hough Circular Gradient Transform, and shoulders are detected by HOG-based symmetry methods. Third, three robust features are extracted: fused joint HOG-LBP, energy-based point clouds, and fused intra-inter trajectories. Fourth, Apriori association rule mining is implemented to select the best features. Fifth, deep learning is used for accurate people tracking. Finally, heads are counted using cross-line judgment. The system was tested on three benchmark datasets (the PCDS dataset, the MICC people counting dataset and the GOTPD dataset), achieving counting accuracies of 98.40%, 98% and 99%, respectively.
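The final cross-line judgment step amounts to counting how often each tracked head's centroid crosses a virtual counting line between consecutive frames. A minimal sketch of that logic (assumed, not the authors' implementation; trajectories and the line position are illustrative):

```python
# Cross-line judgment sketch: each tracked head contributes one "in" or
# "out" count when its centroid crosses the counting line between frames.

def cross_line_counts(tracks, line_y):
    """tracks: per-head sequences of centroid y-coordinates (pixels).
    Returns (entering, exiting) counts for crossings of line_y."""
    entering = exiting = 0
    for ys in tracks:
        for prev, curr in zip(ys, ys[1:]):
            if prev < line_y <= curr:    # moved downward across the line
                entering += 1
            elif prev >= line_y > curr:  # moved upward across the line
                exiting += 1
    return entering, exiting

tracks = [[10, 40, 90], [120, 80, 30]]  # hypothetical centroid trajectories
print(cross_line_counts(tracks, line_y=60))  # → (1, 1)
```

Because the judgment is made per track rather than per detection, a head that lingers near the line is counted only when it actually crosses.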


Sensors ◽  
2019 ◽  
Vol 19 (12) ◽  
pp. 2742 ◽  
Author(s):  
Wang ◽  
Walsh ◽  
Koirala

Pre-harvest fruit yield estimation is useful to guide harvesting and marketing resourcing, but machine vision estimates based on a single view from each side of the tree (“dual-view”) underestimate the fruit yield, as fruit can be hidden from view. A method is proposed involving deep learning, a Kalman filter, and the Hungarian algorithm for on-tree mango fruit detection, tracking, and counting from 10 frame-per-second videos of trees captured from a platform moving along the inter-row at 5 km/h. The deep-learning-based mango fruit detection algorithm, MangoYOLO, was used to detect fruit in each frame. The Hungarian algorithm was used to correlate fruit between neighbouring frames, with the improvement of enabling multiple-to-one assignment. The Kalman filter was used to predict the position of fruit in following frames, to avoid multiple counts of a single fruit that is obscured or otherwise not detected within a frame series. A “borrow” concept was added to the Kalman filter to predict a fruit's position when its precise prediction model was absent, by borrowing the horizontal and vertical speed from neighbouring fruit. Compared with a human count for a video with 110 frames and 192 fruit, the method produced 9.9% double-count and 7.3% missing-count errors, resulting in around a 2.6% over-count. In another test, a video (of 1162 frames, with 42 images centred on the tree trunk) was acquired of both sides of a row of 21 trees, for which the harvest fruit count was 3286 (i.e., an average of 156 fruit/tree). The trees had thick canopies, such that the proportion of fruit hidden from view from any given perspective was high. The proposed method recorded 2050 fruit (62% of harvest) with a bias-corrected Root Mean Square Error (RMSE) of 18.0 fruit/tree, while the dual-view image method (also using MangoYOLO) recorded 1322 fruit (40%) with a bias-corrected RMSE of 21.7 fruit/tree.
The video tracking system is therefore recommended over the dual-view imaging system for mango orchard fruit counting.
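The frame-to-frame association step described above can be sketched with a distance cost matrix and the Hungarian algorithm; the sketch below uses `scipy.optimize.linear_sum_assignment` and an illustrative gating threshold, and omits the paper's multiple-to-one extension and the Kalman/borrow prediction:

```python
# Inter-frame association sketch (not the paper's code): fruit detections in
# consecutive frames are matched by minimising total centre-to-centre
# distance; pairs beyond a gating threshold are treated as new/lost fruit.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(prev_centers, curr_centers, max_dist=30.0):
    """Return (prev_idx, curr_idx) pairs for matched detections."""
    prev = np.asarray(prev_centers, dtype=float)
    curr = np.asarray(curr_centers, dtype=float)
    # Pairwise Euclidean distance cost matrix.
    cost = np.linalg.norm(prev[:, None, :] - curr[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    # Reject pairs farther apart than the gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

prev = [(100, 100), (200, 150)]
curr = [(205, 152), (103, 101), (400, 400)]  # third detection is a new fruit
print(associate(prev, curr))  # → [(0, 1), (1, 0)]
```

Unmatched current-frame detections start new tracks; unmatched previous-frame fruit are handed to the Kalman predictor rather than counted again.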


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1574 ◽  
Author(s):  
Jerzy Kolakowski ◽  
Vitomir Djaja-Josko ◽  
Marcin Kolakowski ◽  
Katarzyna Broczek

Localization systems are a source of data that makes it possible to evaluate an elderly person’s behaviour, to draw conclusions concerning his or her health status and wellbeing, and to detect emergency situations. The article describes a system intended for tracking elderly people. Two novel solutions have been implemented in the system: a hybrid localization algorithm and a method for wireless anchor-node synchronization. The algorithm fuses the results of time-difference-of-arrival and received-signal-strength measurements from the ultrawideband (UWB) and Bluetooth Low Energy (BLE) radio interfaces, respectively. The system allows the intensity of UWB packet transmission to be changed in order to adapt localization accuracy and energy usage to current needs and applications. To simplify system installation, communication between elements of the system infrastructure is performed over wireless rather than wired interfaces. The new wireless synchronization method proposed in the article consists of the retransmission of UWB synchronization packets by selected anchor nodes. It allows the system coverage, which is limited by the short range of UWB transmission, to be extended. The proposed solution was experimentally verified: the synchronization method was tested in a laboratory, and the whole system’s performance was investigated in a typical flat. Exemplary results of tests performed with older adults in their own homes are also included.
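The article's hybrid algorithm fuses TDOA (UWB) and RSS (BLE) measurements; as a stand-in for the real algorithm, a simple way to combine two position estimates with different accuracies is an inverse-variance weighted average, sketched below (the variances and positions are illustrative, and this is not the paper's method):

```python
# Illustrative fusion sketch: weight each 2D position estimate by the
# inverse of its error variance, so the more accurate source dominates.

def fuse(pos_uwb, var_uwb, pos_ble, var_ble):
    """Inverse-variance weighted average of two (x, y) estimates."""
    w_uwb, w_ble = 1.0 / var_uwb, 1.0 / var_ble
    return tuple((w_uwb * u + w_ble * b) / (w_uwb + w_ble)
                 for u, b in zip(pos_uwb, pos_ble))

# UWB TDOA is typically far more precise than BLE RSS, so it dominates.
print(fuse((2.0, 3.0), 0.1, (2.8, 3.6), 4.0))  # → ≈ (2.02, 3.01)
```

This also illustrates why the system can throttle UWB packet transmission: when UWB updates are sparse, the fused estimate gracefully degrades toward the cheaper BLE estimate rather than failing outright.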


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Alexandros Karargyris ◽  
Satyananda Kashyap ◽  
Ismini Lourentzou ◽  
Joy T. Wu ◽  
Arjun Sharma ◽  
...  

We developed a rich dataset of Chest X-Ray (CXR) images to assist investigators in artificial intelligence research. The data were collected using an eye-tracking system while a radiologist reviewed and reported on 1,083 CXR images. The dataset contains the following aligned data: the CXR image, transcribed radiology report text, the radiologist’s dictation audio, and eye-gaze coordinate data. We hope this dataset can contribute to various areas of research, particularly towards explainable and multimodal deep learning/machine learning methods. Furthermore, investigators in disease classification and localization, automated radiology report generation, and human-machine interaction can benefit from these data. We report deep learning experiments that utilize the attention maps produced from the eye-gaze data to show the potential utility of this dataset.
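Eye-gaze coordinates are commonly converted into attention maps by accumulating a 2D Gaussian at each fixation point; such maps can then supervise or probe a model's attention. A hedged sketch of that conversion (not the dataset's own tooling; image size, points and sigma are illustrative):

```python
# Turn (x, y) gaze fixations into a normalised attention heat map by
# summing a 2D Gaussian centred on each fixation point.
import numpy as np

def gaze_heatmap(points, height, width, sigma=10.0):
    """points: (x, y) gaze coordinates in pixels -> heat map in [0, 1]."""
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width))
    for x, y in points:
        heat += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return heat / heat.max()  # scale so the strongest fixation is 1.0

heat = gaze_heatmap([(32, 32), (90, 40)], height=128, width=128)
print(heat.shape)  # → (128, 128), with peaks near the two fixations
```

Downsampled to a network's feature-map resolution, such a map can be compared against model attention or used as an auxiliary training target.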


Author(s):  
Mingjun Jiang ◽  
Kohei Shimasaki ◽  
Shaopeng Hu ◽  
Taku Senoo ◽  
Idaku Ishii

Author(s):  
Prakash Kanade ◽  
Fortune David ◽  
Sunay Kanade

To curb the rising number of car-crash deaths, which are mostly caused by drivers' inattentiveness, a paradigm shift is expected. Knowledge of a driver's gaze area can provide useful information about his or her point of attention. Cars with accurate, low-cost gaze classification systems can increase driver safety. When drivers shift their eyes without turning their heads to look at objects, the margin of error in gaze detection increases. For new consumer electronic applications such as driver tracking systems and novel user interfaces, accurate and efficient eye-gaze prediction is critical. Such systems must be able to run in difficult, unconstrained conditions while using reduced power and expense. A deep-learning-based gaze estimation technique has been considered to solve this issue, with an emphasis on a WSN-based Convolutional Neural Network (CNN) system. The study proposes a data-science-focused architecture with two components: the first is a novel neural network model programmed to exploit any available visual feature, such as the states of both eyes and the head location, together with many augmentations; the second is a data fusion approach that incorporates several gaze datasets. However, due to factors such as environmental light shifts, reflections on glasses surfaces, and motion and optical blurring of the captured eye signal, the accuracy of detecting and classifying the pupil centre and corneal reflection centre depends on the car environment. This work also covers pre-trained models, network structures, and datasets for designing and developing CNN-based deep-learning models for eye-gaze tracking and classification.
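A model that combines per-eye features with head location typically concatenates the feature vectors before a regression head. A toy numpy sketch of that fusion step only (assumed shapes, random weights, hypothetical names; the paper's CNN feature extractors are not reproduced here):

```python
# Feature-fusion sketch: concatenate per-eye CNN features with head-location
# features, then apply one linear layer to regress two gaze angles.
import numpy as np

rng = np.random.default_rng(0)

def gaze_head(left_eye_feat, right_eye_feat, head_pos, w, b):
    """Concatenate feature vectors, apply a linear layer -> (yaw, pitch)."""
    fused = np.concatenate([left_eye_feat, right_eye_feat, head_pos])
    return w @ fused + b

left = rng.standard_normal(8)   # hypothetical per-eye CNN features
right = rng.standard_normal(8)
head = np.array([0.1, -0.2])    # normalised head location
w = rng.standard_normal((2, 18)) * 0.1  # 18 = 8 + 8 + 2 fused features
b = np.zeros(2)
yaw, pitch = gaze_head(left, right, head, w, b)
print(yaw, pitch)  # two gaze angles (arbitrary here: weights are random)
```

In the trained system these weights would be learned jointly with the eye-feature extractors, and the head-location input is what lets the model compensate for eye movement without head rotation.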

