Improving Deep Object Detection Algorithms for Game Scenes

Electronics ◽  
2021 ◽  
Vol 10 (20) ◽  
pp. 2527
Author(s):  
Minji Jung ◽  
Heekyung Yang ◽  
Kyungha Min

The advancement and popularity of computer games have made game scene analysis one of the most interesting research topics in the computer vision community. Among the various computer vision techniques, we employ object detection algorithms for the analysis, since they can both recognize and localize objects in a scene. However, applying existing object detection algorithms to game scenes does not guarantee the desired performance, since the algorithms are trained on datasets collected from the real world. To achieve the desired performance on game scenes, we built a dataset of collected game scenes and retrained object detection algorithms that had been pre-trained on real-world datasets. We selected five object detection algorithms, namely YOLOv3, Faster R-CNN, SSD, FPN, and EfficientDet, and eight games from various genres, including first-person shooting, role-playing, sports, and driving. PascalVOC and MS COCO were employed for pre-training. We demonstrated the improvement brought by our strategy in two aspects: recognition, measured using mean average precision (mAP), and localization, measured using intersection over union (IoU).
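The localization metric mentioned above, intersection over union, has a simple closed form: the area where two boxes overlap divided by the area of their union. A minimal sketch for axis-aligned `(x1, y1, x2, y2)` boxes (the coordinate convention is an assumption, not stated in the abstract):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 means a perfect localization match; detection benchmarks typically count a prediction as correct only when its IoU with the ground-truth box exceeds a threshold such as 0.5.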

Author(s):  
Ting Tao ◽  
Decun Dong ◽  
Shize Huang ◽  
Wei Chen ◽  
Lingyu Yang

Automatic license plate recognition (ALPR) has made great progress, yet it is still challenged by various factors in the real world, such as blurred or occluded plates, skewed camera angles, bad weather, and so on. Therefore, we propose a method that uses a cascade of object detection algorithms to recognize plate contents accurately and quickly. In our method, YOLOv3-Tiny, an end-to-end object detection network, locates license plate areas, and YOLOv3 recognizes the license plate characters. Based on the type and position of the recognized characters, a logical judgment is made to obtain the license plate number. We applied our method to a truck weighing system and constructed a dataset called SM-ALPR from pictures captured by this system. Experiments and comparisons with two other methods on this dataset demonstrate that our method can locate 99.51% of license plate areas in the images and recognize 99.02% of the characters on the plates while maintaining a higher running speed. In particular, our method performs better on challenging images that contain blurred plates, skewed angles, or accidental occlusion, or that were captured in bad weather or poor light, which implies its potential in more diverse practical scenarios.
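The cascade described above, where a fast detector proposes plate regions, a stronger detector reads characters inside each region, and a positional judgment assembles the plate number, can be sketched as a small pipeline. The function names and the `(char, x_center)` output shape are illustrative stand-ins for the YOLOv3-Tiny and YOLOv3 models, not the authors' actual interfaces:

```python
def recognize_plates(image, locate_plates, read_chars):
    """Two-stage ALPR cascade (sketch).

    locate_plates(image)        -> list of plate regions (x1, y1, x2, y2)
    read_chars(image, region)   -> list of (char, x_center) inside a region
    """
    plates = []
    for region in locate_plates(image):
        chars = read_chars(image, region)
        # Logical judgment (simplified): order characters left-to-right
        # by position to form the plate number.
        chars.sort(key=lambda c: c[1])
        plates.append("".join(ch for ch, _ in chars))
    return plates
```

In practice the positional logic also uses character type (letters vs. digits) to resolve ambiguities, as the abstract notes; the sketch keeps only the ordering step.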


Author(s):  
Akash Kumar ◽ 
Amita Goel ◽ 
Vasudha Bahl ◽ 
Nidhi Sengar

Object detection is a core task in the field of computer vision. An object detection model recognizes real-world objects present either in a captured image or in real-time video, where the objects can belong to classes such as humans, animals, or everyday items. This project implements an object detection algorithm called You Only Look Once (YOLOv3). The YOLO architecture is extremely fast compared to earlier methods. YOLOv3 applies a single neural network to the full image, divides the image into regions, and predicts bounding boxes and class probabilities for each region; the boxes are weighted by the predicted probabilities. After non-maximum suppression, it returns the recognized objects together with their bounding boxes. YOLO trains on and runs detection directly over full images.
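The non-maximum suppression step mentioned above is a generic post-processing routine: keep the highest-scoring box, discard any remaining box that overlaps it too much, and repeat. A minimal greedy sketch (box format and threshold value are assumptions; YOLO implementations vary in details):

```python
def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop boxes that overlap the kept box above the threshold.
        order = [i for i in order if _iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Production detectors usually run NMS per class and on GPU, but the logic is the same.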


2006 ◽  
Vol 5 (3) ◽  
pp. 53-58 ◽  
Author(s):  
Roger K. C. Tan ◽  
Adrian David Cheok ◽  
James K. S. Teh

For better or worse, technological advancement has changed the world: at a professional level, working executives are expected to spend more hours in the office or on business trips, while at a social level the population (especially the younger generation) is glued to the computer, playing video games or surfing the Internet. Traditional leisure activities, especially interaction with pets, have been neglected or forgotten. This paper introduces Metazoa Ludens, a new computer-mediated gaming system which allows pets to play mixed-reality computer games with humans via custom-built technologies and applications. During game-play, the real pet chases a physical movable bait within a predefined area in the real world; an infra-red camera tracks the pet's movements and translates them into the virtual world of the system, mapping them to a virtual pet avatar running after a virtual human avatar. The human player plays the game by controlling the human avatar's movements in the virtual world, which in turn drives the movement of the physical bait in the real world. This unique way of playing computer games gives rise to a new form of mixed-reality interaction between pet owners and their pets, thereby bringing technology and its influence on leisure and social activities to the next level.


2021 ◽  
Vol 11 (23) ◽  
pp. 11241
Author(s):  
Ling Li ◽  
Fei Xue ◽  
Dong Liang ◽  
Xiaofei Chen

Concealed object detection in terahertz imaging is an urgent need for public security and counter-terrorism. So far, there has been no public terahertz imaging dataset for evaluating object detection algorithms. This paper provides a public dataset for evaluating multi-object detection algorithms in active terahertz imaging. Due to high sample similarity and poor imaging quality, object detection on this dataset is much more difficult than on the public object detection datasets commonly used in the computer vision field. Since the traditional hard example mining approach was designed for two-stage detectors and cannot be directly applied to one-stage detectors, this paper designs an image-based Hard Example Mining (HEM) scheme based on RetinaNet. Several state-of-the-art detectors, including YOLOv3, YOLOv4, FRCN-OHEM, and RetinaNet, are evaluated on this dataset. Experimental results show that RetinaNet achieves the best mAP and that HEM further enhances the model's performance. The parameters affecting the detection metrics of individual images are summarized and analyzed in the experiments.
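The image-based hard example mining idea can be illustrated with a simple sampling scheme: rank training images by their current loss and oversample the hardest ones in the next epoch. This is a generic sketch of the concept under assumed parameters (`mine_ratio`, `repeat`), not the paper's specific HEM scheme for RetinaNet:

```python
def hard_example_mining(image_losses, mine_ratio=0.25, repeat=2):
    """Image-level HEM sketch.

    image_losses: dict mapping image id -> training loss.
    Images in the top `mine_ratio` fraction by loss are duplicated
    `repeat` times in the returned sampling schedule, so the detector
    sees hard images more often in the next epoch.
    """
    ranked = sorted(image_losses, key=image_losses.get, reverse=True)
    n_hard = max(1, int(len(ranked) * mine_ratio))
    hard = set(ranked[:n_hard])
    schedule = []
    for img in image_losses:
        schedule.extend([img] * (repeat if img in hard else 1))
    return schedule
```

The key contrast with classical OHEM is the unit of mining: whole images rather than region proposals, which is what makes the idea applicable to one-stage detectors that have no proposal stage.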


Author(s):  
Mahesh Singh

This paper presents findings on autonomous prediction and action by establishing a connection between the real world, machine learning, and the Internet of Things (IoT). The purpose of this research is to enable our machine to analyze different signs in the real world and act accordingly. We explored and detected several features in our model, which helped it interact better with its surroundings, and our algorithms give well-optimized predictions and perform the right action. Autonomous vehicles are currently a major research area, with scope for making them more optimized and more versatile. This paper also contributes a broad survey of object detection and feature extraction techniques; numerous object classification and recognition techniques and algorithms have been developed around the world. Traffic sign detection (TSD) research is of great significance for improving road traffic safety. In recent years, convolutional neural networks (CNNs) have achieved great success in object detection tasks, showing better accuracy or faster execution than traditional methods. However, existing CNN methods cannot deliver high execution speed and high detection accuracy at the same time, and their hardware requirements are higher than before, resulting in greater detection cost. To address these problems, this paper proposes an improved algorithm based on a convolutional model, demonstrated on a classic robot driven by a Raspberry Pi that performs the dedicated action.


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Sultan Daud Khan ◽  
Ahmed B. Altamimi ◽  
Mohib Ullah ◽  
Habib Ullah ◽  
Faouzi Alaya Cheikh

Head detection in real-world videos is a classical research problem in computer vision. Head detection in videos is more challenging than in a single image due to the many nuisances commonly observed in natural videos, including arbitrary poses, appearances, and scales. Generally, head detection is treated as a particular case of object detection in a single image. However, the performance of object detectors deteriorates in unconstrained videos. In this paper, we propose a temporal consistency model (TCM) to enhance the performance of a generic object detector by integrating the spatio-temporal information that exists among subsequent frames of a video. Our model takes the detections of a generic detector as input and improves mean average precision (mAP) by recovering missed detections and suppressing false positives. We compare and evaluate the proposed framework on four challenging datasets, i.e., HollywoodHeads, Casablanca, BOSS, and PAMELA. Experimental evaluation shows that performance is improved by the proposed TCM, and we demonstrate both qualitatively and quantitatively that our framework obtains significant improvements over other methods.
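The false-positive-suppression half of such a temporal scheme can be illustrated with a simple consistency vote: a detection in frame t survives only if overlapping boxes appear in enough of the frames {t-1, t, t+1}. This is a deliberately simplified stand-in for the TCM described above, with assumed parameter names and thresholds, not the authors' model:

```python
def temporal_filter(frames_dets, iou_thresh=0.5, min_support=2):
    """Keep a box only if >= min_support frames in its 3-frame window
    contain a box overlapping it with IoU >= iou_thresh.
    frames_dets: list (per frame) of (x1, y1, x2, y2) boxes."""
    def iou(a, b):
        ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    kept = []
    for t, dets in enumerate(frames_dets):
        window = frames_dets[max(0, t - 1):t + 2]
        frame_keep = []
        for box in dets:
            # A box always supports itself, so min_support=2 means
            # "at least one neighboring frame agrees".
            support = sum(any(iou(box, other) >= iou_thresh for other in f)
                          for f in window)
            if support >= min_support:
                frame_keep.append(box)
        kept.append(frame_keep)
    return kept
```

Recovering missed detections works in the opposite direction, interpolating a box into frame t when both neighbors contain well-overlapping boxes; that half is omitted here for brevity.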


SLEEP ◽  
2018 ◽  
Vol 41 (suppl_1) ◽  
pp. A186-A186
Author(s):  
L Schneider ◽  
D Janssens ◽  
P Silberschatz ◽  
A Londeree ◽  
E Shapiro

2021 ◽  
Vol 7 ◽  
pp. e382
Author(s):  
Lingxiang Yao ◽  
Worapan Kusakunniran ◽  
Qiang Wu ◽  
Jian Zhang

Gait has been deemed an alternative biometric in video-based surveillance applications, since it can be used to recognize individuals from a far distance without their interaction or cooperation. Recently, many gait recognition methods have been proposed, aiming at reducing the influence of exterior factors. However, most of these methods are developed assuming a sufficient number of input gait frames, and their recognition performance drops sharply as the frame count decreases. In real-world scenarios, it is impossible to always obtain a sufficient number of gait frames for each subject, for many reasons, e.g., occlusion and illumination. Therefore, it is necessary to improve gait recognition performance when the available gait frames are limited. This paper starts with three different strategies aimed at producing more input frames and eliminating the generalization error caused by insufficient input data. Meanwhile, a two-branch network is also proposed to formulate robust gait representations from the original and newly generated input gait frames. Our experiments verify that, with limited gait frames, the proposed method achieves reliable gait recognition performance.

