Design of Desktop Audiovisual Entertainment System with Deep Learning and Haptic Sensations

Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. The system comprises a scene recognition system and hardware modules that provide haptic sensations to users as they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the video contained scenes with fire-related objects. The hardware module was designed to provide six types of haptic sensations, arranged with line symmetry to provide a better user experience. Based on the objects detected by the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
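A minimal sketch of the kind of mapping such a system might use from recognized scene labels to haptic channels. The label names, channel names, and confidence threshold below are illustrative assumptions, not details taken from the paper:

```python
# Hypothetical label-to-channel table; real systems would map recognizer
# outputs (e.g. Cloud Vision labels or SSD classes) to actuator commands.
HAPTIC_FOR_LABEL = {
    "fire": "heat",
    "explosion": "vibration",
    "wind": "fan",
    "rain": "mist",
    "snow": "cold",
}

def haptics_for_scene(detections, min_confidence=0.6):
    """Map (label, confidence) pairs from the recognizer to the set of
    haptic channels to activate for the current scene."""
    return {HAPTIC_FOR_LABEL[label]
            for label, conf in detections
            if conf >= min_confidence and label in HAPTIC_FOR_LABEL}

# A frame with confident fire and wind detections activates heat and fan.
channels = haptics_for_scene([("fire", 0.92), ("rain", 0.3), ("wind", 0.8)])
```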

Author(s):  
Limu Chen ◽  
Ye Xia ◽  
Dexiong Pan ◽  
Chengbin Wang

Deep-learning-based navigational object detection is discussed with respect to an active monitoring system for anti-collision between vessels and bridges. The motion-based object detection methods widely used in existing anti-collision monitoring systems are poorly suited to complicated and changeable waterways because of their limitations in accuracy, robustness, and efficiency. The proposed video surveillance system contains six modules (image acquisition, detection, tracking, prediction, risk evaluation, and decision-making), and the detection module is discussed in detail. A vessel-exclusive dataset with a large number of image samples is established for neural network training, and an SSD (Single Shot MultiBox Detector)-based object detection model with both universality and pertinence is generated through sample filtering, data augmentation, and large-scale optimization, making it capable of stable and intelligent vessel detection. Comparison with conventional methods indicates that the proposed deep-learning method shows remarkable advantages in robustness, accuracy, efficiency, and intelligence. An in-situ test is carried out at Songpu Bridge in Shanghai, and the results illustrate that the method is qualified for long-term monitoring and for providing information support for further analysis and decision making.


2019 ◽  
Vol 2019 ◽  
pp. 1-7 ◽  
Author(s):  
Peng Liu ◽  
Xiangxiang Li ◽  
Haiting Cui ◽  
Shanshan Li ◽  
Yafei Yuan

Hand gesture recognition is an intuitive and effective way for humans to interact with a computer, given its high processing speed and recognition accuracy. This paper proposes a novel approach to identifying hand gestures in complex scenes using the Single-Shot Multibox Detector (SSD) deep learning algorithm with a 19-layer neural network. A benchmark gesture database is used, and common hand gestures in complex scenes are chosen as the processing objects. A real-time hand gesture recognition system based on the SSD algorithm is constructed and tested. The experimental results show that the algorithm quickly identifies human hands and accurately distinguishes different types of gestures. Furthermore, the maximum accuracy is 99.2%, which is significant for human-computer interaction applications.
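As background on the detector itself: SSD classifies and regresses a fixed grid of default (anchor) boxes tiled over each feature map. A minimal sketch of default-box generation for one feature map, with illustrative scale and aspect-ratio values rather than the paper's actual configuration:

```python
from itertools import product
from math import sqrt

def default_boxes(fmap_size, scale, aspect_ratios):
    """Center-form (cx, cy, w, h) default boxes for one SSD feature map,
    all coordinates normalized to [0, 1]."""
    boxes = []
    for i, j in product(range(fmap_size), repeat=2):
        # box centers sit at the middle of each feature-map cell
        cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
        for ar in aspect_ratios:
            boxes.append((cx, cy, scale * sqrt(ar), scale / sqrt(ar)))
    return boxes

# e.g. a coarse 3x3 map with square and 2:1 boxes -> 3*3*2 = 18 candidates
boxes = default_boxes(3, scale=0.5, aspect_ratios=(1.0, 2.0))
```

Each default box is then scored per class and refined by predicted offsets; in the paper, the 19-layer network supplies those predictions.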


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 583 ◽  
Author(s):  
Khang Nguyen ◽  
Nhut T. Huynh ◽  
Phat C. Nguyen ◽  
Khanh-Duy Nguyen ◽  
Nguyen D. Vo ◽  
...  

Unmanned aircraft systems, or drones, can record or capture scenes from a bird's-eye view, and they have been rapidly deployed in a wide range of practical domains, e.g., agriculture, aerial photography, fast delivery, and surveillance. Object detection is one of the core steps in understanding videos collected from drones. However, this task is very challenging due to the unconstrained viewpoints and low resolution of the captured videos. While modern deep-learning object detectors have recently achieved great success on general benchmarks, e.g., PASCAL-VOC and MS-COCO, the robustness of these detectors on aerial images captured by drones is not well studied. In this paper, we present an evaluation of state-of-the-art deep-learning detectors, including Faster R-CNN (Faster Regional CNN), R-FCN (Region-based Fully Convolutional Networks), SNIPER (Scale Normalization for Image Pyramids with Efficient Resampling), Single-Shot Detector (SSD), YOLO (You Only Look Once), RetinaNet, and CenterNet, for object detection in videos captured by drones. We conduct experiments on the VisDrone2019 dataset, which contains 96 videos with 39,988 annotated frames, and provide insights into efficient object detectors for aerial images.




Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1932
Author(s):  
Malik Haris ◽  
Adam Glowacz

Automated driving and vehicle safety systems need object detection that is accurate, robust to weather and environmental conditions, and able to run in real time. Consequently, they require image processing algorithms to inspect the contents of images. This article compares the accuracy of five major image processing algorithms: Region-based Fully Convolutional Network (R-FCN), Mask Region-based Convolutional Neural Network (Mask R-CNN), Single Shot Multi-Box Detector (SSD), RetinaNet, and You Only Look Once v4 (YOLOv4). In this comparative analysis, we used the large-scale Berkeley Deep Drive (BDD100K) dataset. The strengths and limitations of each algorithm are analyzed based on parameters such as accuracy (with and without occlusion and truncation), computation time, and the precision-recall curve. The comparison given in this article is helpful for understanding the pros and cons of standard deep learning-based algorithms operating under real-time deployment restrictions. We conclude that YOLOv4 is the most accurate at detecting difficult road targets under complex road scenarios and weather conditions in an identical testing environment.
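For reference, the precision-recall curve and the average precision (AP) derived from it can be computed from a confidence-ranked list of matched and unmatched detections. A minimal single-class sketch using all-point interpolation (the matching step that produces the true/false flags is assumed to have happened already):

```python
def precision_recall_curve(flags, n_gt):
    """Precision/recall after each detection in a confidence-ranked list.
    flags[i] is True if detection i matched a ground-truth box; n_gt is
    the total number of ground-truth objects."""
    tp = fp = 0
    points = []
    for hit in flags:
        tp += hit
        fp += not hit
        points.append((tp / (tp + fp), tp / n_gt))  # (precision, recall)
    return points

def average_precision(flags, n_gt):
    """Area under the precision-recall curve (all-point interpolation)."""
    points = precision_recall_curve(flags, n_gt)
    ap, prev_recall = 0.0, 0.0
    for i, (_, r) in enumerate(points):
        # use the maximum precision at or beyond this recall level
        p = max(pt[0] for pt in points[i:])
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```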


2019 ◽  
Vol 11 (7) ◽  
pp. 786 ◽  
Author(s):  
Yang-Lang Chang ◽  
Amare Anagaw ◽  
Lena Chang ◽  
Yi Wang ◽  
Chih-Yu Hsiao ◽  
...  

Synthetic aperture radar (SAR) imagery has been a promising data source for monitoring maritime activities, and its application to oil and ship detection has been the focus of many previous research studies. Many object detection methods, ranging from traditional to deep learning approaches, have been proposed. However, the majority of them are computationally intensive and have accuracy problems. The huge volume of remote sensing data also poses a challenge for real-time object detection. To mitigate this problem, a high-performance computing (HPC) method has been proposed to accelerate SAR imagery analysis using GPU-based computing. In this paper, we propose an enhanced GPU-based deep learning method to detect ships in SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is used to model the architecture and train the model. YOLOv2 is a state-of-the-art real-time object detection system that outperforms the Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, to reduce computational time while retaining competitive detection accuracy, we develop a new architecture with fewer layers called YOLOv2-reduced. In the experiments, we use two datasets for training and testing: the SAR Ship Detection Dataset (SSDD) and the Diversified SAR Ship Detection Dataset (DSSDD). The YOLOv2 test results showed an increase in ship detection accuracy as well as a noticeable reduction in computational time compared to Faster R-CNN. The proposed YOLOv2 architecture achieves accuracies of 90.05% and 89.13% on the SSDD and DSSDD datasets, respectively. The proposed YOLOv2-reduced architecture delivers detection performance similar to YOLOv2 but with less computational time on an NVIDIA TITAN X GPU. The experimental results show that deep learning can make a big leap forward in improving the performance of SAR image ship detection.
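YOLOv2's box parameterization, which a reduced architecture would retain, constrains each predicted center to its grid cell via a sigmoid and scales anchor (prior) dimensions exponentially. A minimal decoding sketch in normalized coordinates, assuming the standard 13x13 YOLOv2 grid:

```python
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def decode_yolov2_box(tx, ty, tw, th, cx, cy, pw, ph, grid=13):
    """Decode raw network outputs (tx, ty, tw, th) for the cell at
    (cx, cy) with anchor size (pw, ph), all in grid units, into a
    normalized (bx, by, bw, bh) box."""
    bx = (cx + sigmoid(tx)) / grid  # sigmoid keeps the center in its cell
    by = (cy + sigmoid(ty)) / grid
    bw = pw * exp(tw) / grid        # exp scales the anchor width/height
    bh = ph * exp(th) / grid
    return bx, by, bw, bh
```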


2020 ◽  
Vol 12 (3) ◽  
pp. 458 ◽  
Author(s):  
Ugur Alganci ◽  
Mehmet Soydas ◽  
Elif Sertel

Object detection from satellite images has been a challenging problem for many years. With the development of effective deep learning algorithms and advances in hardware, higher accuracies have been achieved in the detection of various objects from very high-resolution (VHR) satellite images. This article provides a comparative evaluation of state-of-the-art convolutional neural network (CNN)-based object detection models, namely Faster R-CNN, the Single Shot Multi-box Detector (SSD), and You Only Look Once-v3 (YOLO-v3), to cope with the limited number of labeled data and to automatically detect airplanes in VHR satellite images. Data augmentation with rotation, rescaling, and cropping was applied to the training images to artificially increase the amount of training data from satellite images. Moreover, a non-maximum suppression (NMS) algorithm was introduced at the end of the SSD and YOLO-v3 flows to eliminate duplicate detections of the same object in overlapping regions. The trained networks were applied to five independent VHR test images covering airports and their surroundings to evaluate their performance objectively. Accuracy assessment of the test regions showed that the Faster R-CNN architecture provided the highest accuracy according to the F1 scores, average precision (AP) metrics, and visual inspection of the results. YOLO-v3 ranked second, with slightly lower performance but a balanced trade-off between accuracy and speed. SSD provided the lowest detection performance but was better at object localization. The results were also evaluated with respect to object size, which showed that large- and medium-sized airplanes were detected with higher accuracy.
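The NMS step described above can be sketched in a few lines: keep the highest-scoring box, discard any remaining box whose overlap with it exceeds a threshold, and repeat. A minimal greedy implementation (the 0.5 threshold is a common default, not necessarily the value used in the paper):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop any remaining box that overlaps it above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```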


This paper presents an efficient and fast deep learning algorithm based on neural networks for object detection and pedestrian detection. The technique, called MobileNet Single Shot Detector, is an extension of convolutional neural networks. It is based on depthwise separable convolutions, which are used to build a lightweight deep convolutional network: a single filter is applied to each input channel, and the outputs are combined by a pointwise (1x1) convolution. The Single Shot Multibox Detector is a feed-forward convolutional network that is combined with MobileNet to give efficient and accurate results; the combination is much faster than SSD alone. The accuracy of this technique is calculated over color (RGB) images and infrared images, and the results are compared with those of a shallow machine learning technique based on feature extraction plus classification, namely HOG plus SVM. The performance comparison between the proposed deep learning and shallow learning techniques is conducted over a benchmark dataset, with validation testing over our own dataset, in order to measure the efficiency of both algorithms and find an algorithm fast and accurate enough for real-world pedestrian detection.
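The efficiency claim rests on factoring a standard convolution into a depthwise step and a pointwise step; counting multiply-accumulate operations makes the saving concrete. A minimal sketch, with the layer sizes chosen purely for illustration:

```python
def conv_macs(k, c_in, c_out, h, w):
    """Multiply-accumulates for a standard k x k convolution producing
    an h x w output with c_in input and c_out output channels."""
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(k, c_in, c_out, h, w):
    """Depthwise (one k x k filter per input channel) followed by a
    1 x 1 pointwise convolution that mixes channels."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# A MobileNet-style layer: 3x3 kernel, 512 -> 512 channels, 14x14 map.
std = conv_macs(3, 512, 512, 14, 14)
sep = depthwise_separable_macs(3, 512, 512, 14, 14)
ratio = std / sep  # about 8.8x fewer operations for this layer
```

The ratio approaches 1 / (1/c_out + 1/k^2), so for 3x3 kernels the separable form costs roughly 8 to 9 times less computation at typical channel counts.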


10.2196/17881 ◽  
2020 ◽  
Vol 8 (10) ◽  
pp. e17881
Author(s):  
Jun-Min Kim ◽  
Woo Ram Lee ◽  
Jun-Ho Kim ◽  
Jong-Mo Seo ◽  
Changkyun Im

Background Dental diseases can be prevented through the management of dental plaque. Dental plaque can be identified using the light-induced fluorescence (LIF) technique, which emits light at 405 nm. The LIF technique is more convenient than the commercial technique using a disclosing agent, but the result may vary for each individual as it still requires visual identification. Objective The objective of this study is to introduce and validate a deep learning–based oral hygiene monitoring system that makes it easy to identify dental plaque at home. Methods We developed a LIF-based system consisting of a device that can visually identify dental plaque and a mobile app that displays the location and area of dental plaque on oral images. The mobile app is programmed to automatically determine the location and distribution of dental plaque using a deep learning–based algorithm and to present the results to the user as time series data. The mobile app is also built as a convergence of native and web applications so that the algorithm is executed on a cloud server to efficiently distribute computing resources. Results The location and distribution of users' dental plaque could be identified via the hand-held LIF device or the mobile app. The color correction filter in the device was developed using a color mixing technique. The mobile app was built as a hybrid app combining the functionalities of a native application and a web application. Through the scrollable WebView in the mobile app, changes in the time series of dental plaque could be confirmed. The algorithm for dental plaque detection was implemented to run on Amazon Web Services, with object detection by a single shot multibox detector and instance segmentation by a Mask region-based convolutional neural network. Conclusions This paper shows that the system can be used as a home oral care product for timely identification and management of dental plaque.
In the future, it is expected that these products will significantly reduce the social costs associated with dental diseases.

