Fast body part segmentation and tracking of neonatal video data using deep learning

2020 ◽  
Vol 58 (12) ◽  
pp. 3049-3061
Author(s):  
Christoph Hoog Antink ◽  
Joana Carlos Mesquita Ferreira ◽  
Michael Paul ◽  
Simon Lyra ◽  
Konrad Heimann ◽  
...  

Abstract
Photoplethysmography imaging (PPGI) for non-contact monitoring of preterm infants in the neonatal intensive care unit (NICU) is a promising technology, as it could reduce medical adhesive-related skin injuries and associated complications. For practical implementations of PPGI, a region of interest has to be detected automatically in real time. As neonates' body proportions differ significantly from those of adults, existing approaches cannot be used in a straightforward way, and color-based skin detection requires RGB data, thus prohibiting the use of less-intrusive near-infrared (NIR) acquisition. In this paper, we present a deep learning-based method for segmentation of neonatal video data. We augmented an existing encoder-decoder semantic segmentation method with a modified version of the ResNet-50 encoder. This reduced the computational time by a factor of 7.5, so that 30 frames per second can be processed at 960 × 576 pixels. The method was developed and optimized on publicly available databases with segmentation data from adults. For evaluation, a comprehensive dataset consisting of RGB and NIR video recordings from 29 neonates with various skin tones, recorded in two NICUs in Germany and India, was used. From all recordings, 643 frames were manually segmented. After pre-training the model on the public adult data, parts of the neonatal data were used for additional learning, and left-out neonates were used for cross-validated evaluation. On the RGB data, the head is segmented well (82% intersection over union, 88% accuracy), and performance is comparable with that achieved on large, public, non-neonatal datasets. On the other hand, performance on the NIR data was inferior. By employing data augmentation to generate additional virtual NIR data for training, results could be improved, and the head could be segmented with 62% intersection over union and 65% accuracy. The method is, in theory, capable of performing segmentation in real time and may thus provide a useful tool for future PPGI applications.
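The intersection-over-union and accuracy figures reported above can be computed from binary masks in a few lines (a minimal sketch, not the authors' implementation; the flat 0/1 mask encoding is an assumption):

```python
def iou_and_accuracy(pred, truth):
    """Intersection over union and pixel accuracy for two binary
    segmentation masks, given as flat lists of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = len(pred) - tp - fp - fn
    union = tp + fp + fn
    iou = tp / union if union else 1.0  # empty masks agree perfectly
    accuracy = (tp + tn) / len(pred)
    return iou, accuracy

# Tiny example: 6 pixels, 2 true positives, 1 false positive, 1 false negative.
pred  = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
iou, acc = iou_and_accuracy(pred, truth)  # IoU 0.5, accuracy 2/3
```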

Impact ◽  
2020 ◽  
Vol 2020 (2) ◽  
pp. 9-11
Author(s):  
Tomohiro Fukuda

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming, but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has potential uses in architecture and design, where buildings can be superimposed on existing locations to render 3D visualisations of plans. However, one major challenge that remains in MR development is the issue of real-time occlusion: hiding 3D virtual objects behind real objects. Dr Tomohiro Fukuda, based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the issue of occlusion in MR. They are currently developing an MR system that realises real-time occlusion by harnessing deep learning, to achieve an outdoor landscape design simulation using a semantic segmentation technique. This methodology can be used to automatically estimate the visual environment before and after construction projects.
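The occlusion problem can be illustrated with a per-pixel compositing rule: a virtual pixel is drawn only where nothing real stands in front of it. The sketch below is purely illustrative (the names and the depth-based test are assumptions; the actual system infers the real-object mask with deep-learning semantic segmentation rather than measured depth):

```python
def composite_with_occlusion(real, virtual, real_depth, virt_depth):
    """Per-pixel compositing: a virtual pixel is kept only where it is
    nearer to the camera than the real scene, so real foreground objects
    occlude the virtual model. Images are flat lists of pixel values
    (None = no virtual content at that pixel); depths are in metres."""
    out = []
    for r, v, rd, vd in zip(real, virtual, real_depth, virt_depth):
        if v is not None and vd < rd:
            out.append(v)   # virtual content is in front: draw it
        else:
            out.append(r)   # real pixel wins: virtual object is occluded
    return out

# A real object at depth 2 m hides the virtual building at depth 3 m.
frame = composite_with_occlusion(
    real=['R', 'R', 'R', 'R'],
    virtual=[None, 'V', 'V', None],
    real_depth=[5, 5, 2, 5],
    virt_depth=[3, 3, 3, 3],
)
```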


2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Yong He ◽  
Hong Zeng ◽  
Yangyang Fan ◽  
Shuaisheng Ji ◽  
Jianjian Wu

In this paper, we propose an approach to detect oilseed rape pests based on deep learning, which improves the mean average precision (mAP) to 77.14%, an increase of 9.7% over the original model. We ported this model to a mobile platform so that every farmer can use the program, which diagnoses pests in real time and provides suggestions on pest control. We designed an oilseed rape pest imaging database with 12 typical oilseed rape pests and compared the performance of five models; SSD w/Inception was chosen as the optimal model. Moreover, to achieve a high mAP, we used data augmentation (DA) and added a dropout layer. The experiments were performed on the Android application we developed, and the results show that our approach clearly surpasses the original model and is helpful for integrated pest management. This application improves environmental adaptability, response speed, and accuracy compared with past works, and has the advantages of low cost and simple operation, which make it suitable for the pest monitoring missions of drones and the Internet of Things (IoT).
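The mAP headline figure is the mean over classes of the average precision, i.e. the area under each class's precision-recall curve. A minimal sketch of how AP is computed from ranked detections (illustrative only, not the paper's evaluation code):

```python
def average_precision(scored_detections):
    """Average precision for one class from (confidence, is_true_positive)
    pairs, using all-point interpolation: sort by confidence, sweep the
    ranking, and accumulate precision weighted by the recall increment.
    mAP is simply the mean of this value over all classes."""
    ranked = sorted(scored_detections, key=lambda d: -d[0])
    total_pos = sum(label for _, label in ranked)
    tp = fp = 0
    ap = prev_recall = 0.0
    for _, is_tp in ranked:
        tp += is_tp
        fp += 1 - is_tp
        recall = tp / total_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Three detections: hit, miss, hit -> AP = 0.5*1.0 + 0.5*(2/3) = 5/6.
ap = average_precision([(0.9, 1), (0.8, 0), (0.7, 1)])
```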


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are being widely utilized for various missions in both civilian and military sectors. Many of these missions require UAVs to perceive the environments they navigate in. This perception can be realized by training a computing machine to classify objects in the environment. One of the well-known machine training approaches is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning comes at a high cost in time and computational resources. Collecting large input datasets, pre-training processes such as labeling training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these setbacks, this study proposes mission-specific input data augmentation techniques and a light-weight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes of 10,000 different images in each class were used as input data, where 80% were for training the network and the remaining 20% for network validation. For the optimization of the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
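Sequential gradient descent updates the parameters after every sample rather than after a full batch pass, which is why redundant data is handled efficiently: near-duplicate samples each nudge the weights immediately instead of being averaged over. A toy sketch on a one-parameter least-squares fit (the function name, learning rate and data are illustrative, not from the study):

```python
def sequential_gradient_descent(samples, lr=0.1, epochs=50):
    """Fit y ~ w*x by minimizing squared error, updating w after
    every individual (x, y) sample rather than once per batch."""
    w = 0.0
    for _ in range(epochs):
        for x, y in samples:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad              # immediate per-sample update
    return w

# Redundant data all consistent with y = 2x; w converges to 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = sequential_gradient_descent(data)
```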


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4045
Author(s):  
Alessandro Sassu ◽  
Jose Francisco Saenz-Cogollo ◽  
Maurizio Agelli

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances regarding the extraction of information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open-source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and abstracts away from the user the complexity of cluster configuration, orchestration of services, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and also provides high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.


2021 ◽  
Vol 13 (3) ◽  
pp. 809-820
Author(s):  
V. Sowmya ◽  
R. Radha

Vehicle detection and recognition demand advanced computational intelligence and resources in a real-time traffic surveillance system for effective traffic management of all possible contingencies. One of the focus areas of deep intelligent systems is to facilitate vehicle detection and recognition techniques for robust traffic management of heavy vehicles. Sophisticated mechanisms of this kind include Support Vector Machines (SVM), Convolutional Neural Networks (CNN), Regional Convolutional Neural Networks (R-CNN), the You Only Look Once (YOLO) model, and others. Accordingly, it is pivotal to choose the precise algorithm for vehicle detection and recognition that also addresses the real-time environment. In this study, deep learning algorithms such as Faster R-CNN, YOLOv2, YOLOv3, and YOLOv4 are compared across diverse aspects of their features. Two classes of heavy transport vehicles, buses and trucks, constitute the detection and recognition targets in this proposed work. Data augmentation and transfer learning are implemented in the model to build, execute, train, and test for detection and recognition, avoiding over-fitting and improving speed and accuracy. Extensive empirical evaluation is conducted on two standard datasets, COCO and PASCAL VOC 2007. Finally, comparative results and analyses are presented based on real-time performance.
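Detection evaluation on COCO and PASCAL VOC matches predicted boxes to ground truth by their intersection over union; a minimal bounding-box IoU sketch (illustrative, not the study's code):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) --
    the overlap criterion used to decide whether a detection counts
    as a true positive in COCO/VOC-style evaluation."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 2x2 boxes overlapping in a 1x1 corner: IoU = 1 / (4 + 4 - 1) = 1/7.
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```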


2020 ◽  
Vol 17 (6) ◽  
pp. 811-821
Author(s):  
Janak D. Trivedi ◽  
Sarada Devi Mandalapu ◽  
Dhara H. Dave

Purpose
The purpose of this paper is to find a real-time parking location for a four-wheeler.

Design/methodology/approach
Real-time parking availability using dedicated infrastructure requires high installation and maintenance costs, which are not affordable for all urban cities. The authors present a statistical block matching algorithm (SBMA) for real-time parking management in small-town cities such as Bhavnagar, using an in-built surveillance CCTV system that was not installed for parking applications. In particular, data from a camera situated in a mall was used to detect the parking status of specific parking places using a region of interest (ROI). The proposed method computes the mean value of the pixels inside the ROI using blocks of different sizes (8 × 10 and 20 × 35), and the values are compared among different frames. When the difference between frames is greater than a threshold, the process reports "no parking space for that place"; otherwise, it yields "parking place available". This information is then used to draw a bounding box on the parking places, colored green/red to show availability.

Findings
The real-time feedback loop (car parking positions) helps the presented model dynamically refine the parking strategy and parking positions for users. A whole-day experiment/validation is shown in this paper, where the method is evaluated using pattern recognition metrics for classification: precision, recall and F1 score.

Originality/value
The authors found real-time parking availability for Himalaya Mall, situated in Bhavnagar, Gujarat, for a video from 18 June 2018, using the SBMA method with accountable computational time for finding parking slots. The limitations of the presented method and future implementations are discussed at the end of this paper.
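The block-mean comparison at the heart of the SBMA can be sketched in a few lines (illustrative only; the function names, reference-frame scheme and threshold are assumptions, not the paper's calibrated values):

```python
def block_mean(frame, top, left, h, w):
    """Mean intensity of an h x w block inside a grayscale frame
    (a list of pixel rows), starting at (top, left)."""
    total = sum(frame[r][c] for r in range(top, top + h)
                            for c in range(left, left + w))
    return total / (h * w)

def slot_occupied(frame_now, frame_empty, roi, threshold=20.0):
    """Compare the current block mean against a reference empty-slot
    frame; a large difference suggests a car occupies the ROI.
    roi = (top, left, height, width)."""
    top, left, h, w = roi
    diff = abs(block_mean(frame_now, top, left, h, w)
               - block_mean(frame_empty, top, left, h, w))
    return diff > threshold
```

In practice the same test would run once per parking place per frame, with one ROI and block size per slot, driving the green/red bounding-box overlay.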


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 160
Author(s):  
Xuelin Zhang ◽  
Donghao Zhang ◽  
Alexander Leye ◽  
Adrian Scott ◽  
Luke Visser ◽  
...  

This paper focuses on improving the performance of scientific instrumentation that uses glass spray chambers for sample introduction, such as spectrometers, which are widely used in analytical chemistry, by detecting incidents using deep convolutional models. The performance of these instruments can be affected by the quality of the introduction of the sample into the spray chamber. Among the indicators of poor-quality sample introduction are two primary incidents: the formation of liquid beads on the surface of the spray chamber, and flooding at the bottom of the spray chamber. Detecting such events autonomously as they occur can help improve the overall operational accuracy and efficacy of the chemical analysis, and avoid severe incidents such as malfunction and instrument damage. In contrast to objects commonly seen in the real world, beading and flooding are more challenging to detect, since they are significantly smaller and transparent. Furthermore, their non-rigid nature increases the difficulty of detection, such that existing deep-learning-based object detection frameworks are prone to fail at this task. No prior work has used computer vision to detect these incidents in the chemistry industry. In this work, we propose two frameworks for detecting these two incidents, which not only leverage modern deep learning architectures but also integrate expert knowledge of the problems. Specifically, the proposed networks first localize the regions of interest where the incidents are most likely to occur and then refine these incident outputs. The use of data augmentation and synthesis, and the choice of negative sampling in training, allow for a large increase in accuracy while keeping inference real-time. In the data collected from our laboratory, our method surpasses widely used object detection baselines and can correctly detect 95% of the beads and 98% of the flooding. At the same time, our method can process four frames per second and can be deployed in real time.
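The negative sampling used in training can be illustrated by drawing random background crops that avoid the labelled incident boxes (a hypothetical sketch; the function name, box format and sampling scheme are assumptions, not the paper's recipe):

```python
import random

def sample_negatives(frame_w, frame_h, positives, n, size, seed=0):
    """Sample n square background (negative) crops of side `size` from a
    frame of frame_w x frame_h pixels, rejecting any crop that overlaps
    a labelled incident box. Boxes are (x1, y1, x2, y2)."""
    rng = random.Random(seed)

    def overlaps(a, b):
        return not (a[2] <= b[0] or b[2] <= a[0]
                    or a[3] <= b[1] or b[3] <= a[1])

    crops = []
    while len(crops) < n:
        x = rng.randint(0, frame_w - size)
        y = rng.randint(0, frame_h - size)
        box = (x, y, x + size, y + size)
        if not any(overlaps(box, p) for p in positives):
            crops.append(box)  # clean background patch: keep as negative
    return crops
```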


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As the techniques of autonomous driving become increasingly valued and universal, real-time semantic segmentation has become very popular and challenging in the field of deep learning and computer vision in recent years. However, in order to apply the deep learning model to edge devices accompanying sensors on vehicles, we need to design a structure that has the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the condition of real time. Nevertheless, the accuracies of previous real-time semantic segmentation methods still have a large gap compared to general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU with a speed of 39.4 FPS with a 1024 × 2048 resolution on a Cityscapes test submission.
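The self-attention mechanism the architecture adds on top of its dual encoder can be sketched as scaled dot-product attention over a sequence of feature vectors (a minimal illustration with identity Q/K/V projections for brevity; not the paper's implementation):

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(x, d):
    """Scaled dot-product self-attention over a sequence x of d-dim
    feature vectors: each output position is a softmax-weighted mix of
    all positions, letting the network relate distant image regions."""
    scale = math.sqrt(d)
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
              for q in x]                     # pairwise similarities
    weights = [softmax(row) for row in scores]  # attention distribution
    return [[sum(w * v[j] for w, v in zip(wrow, x)) for j in range(d)]
            for wrow in weights]              # weighted sum of values

# Two one-hot feature vectors attend mostly to themselves.
out = self_attention([[1.0, 0.0], [0.0, 1.0]], 2)
```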

