Improving Deep Object Detection Algorithms for Game Scenes

Electronics ◽  
2021 ◽  
Vol 10 (20) ◽  
pp. 2527
Author(s):  
Minji Jung ◽  
Heekyung Yang ◽  
Kyungha Min

The advancement and popularity of computer games have made game scene analysis one of the most interesting research topics in the computer vision community. Among the various computer vision techniques, we employ object detection algorithms for the analysis, since they can both recognize and localize objects in a scene. However, applying existing object detection algorithms to game scenes does not guarantee the desired performance, since the algorithms are trained on datasets collected from the real world. To achieve the desired performance on game scenes, we built a dataset of collected game scenes and retrained object detection algorithms that had been pre-trained on real-world datasets. We selected five object detection algorithms, namely YOLOv3, Faster R-CNN, SSD, FPN, and EfficientDet, and eight games from various genres, including first-person shooting, role-playing, sports, and driving. PascalVOC and MS COCO were employed for pre-training. We demonstrated the improvement brought by our strategy in two aspects: recognition, measured using mean average precision (mAP), and localization, measured using intersection over union (IoU).
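The localization metric mentioned above, intersection over union, has a simple closed form: the area where two boxes overlap divided by the area of their union. A minimal sketch for axis-aligned `(x1, y1, x2, y2)` boxes (the coordinate convention is an assumption, not stated in the abstract):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 means a perfect localization match; detection benchmarks typically count a prediction as correct only when its IoU with the ground-truth box exceeds a threshold such as 0.5.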

Author(s):  
Ting Tao ◽  
Decun Dong ◽  
Shize Huang ◽  
Wei Chen ◽  
Lingyu Yang

Automatic license plate recognition (ALPR) has made great progress, yet it is still challenged by various factors in the real world, such as blurred or occluded plates, skewed camera angles, bad weather, and so on. Therefore, we propose a method that uses a cascade of object detection algorithms to recognize plate contents accurately and quickly. In our method, YOLOv3-Tiny, an end-to-end object detection network, locates license plate areas, and YOLOv3 recognizes the license plate characters. Based on the type and position of the recognized characters, a logical judgment is made to obtain the license plate number. We applied our method to a truck weighing system and constructed a dataset called SM-ALPR from pictures captured by this system. Experiments and comparisons with two other methods on this dataset demonstrate that our method can locate 99.51% of license plate areas in the images and recognize 99.02% of the characters on the plates while maintaining a higher running speed. In particular, our method performs better on challenging images that contain blurred plates, skewed angles, or accidental occlusion, or that were captured in bad weather or poor light, which implies its potential in more diverse practical scenarios.
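The cascade described above, where a fast detector proposes plate regions, a stronger detector reads characters inside each region, and a positional judgment assembles the plate number, can be sketched as a small pipeline. The function names and the `(char, x_center)` output shape are illustrative stand-ins for the YOLOv3-Tiny and YOLOv3 models, not the authors' actual interfaces:

```python
def recognize_plates(image, locate_plates, read_chars):
    """Two-stage ALPR cascade (sketch).

    locate_plates(image)        -> list of plate regions (x1, y1, x2, y2)
    read_chars(image, region)   -> list of (char, x_center) inside a region
    """
    plates = []
    for region in locate_plates(image):
        chars = read_chars(image, region)
        # Logical judgment (simplified): order characters left-to-right
        # by position to form the plate number.
        chars.sort(key=lambda c: c[1])
        plates.append("".join(ch for ch, _ in chars))
    return plates
```

In practice the positional logic also uses character type (letters vs. digits) to resolve ambiguities, as the abstract notes; the sketch keeps only the ordering step.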


Author(s):  
Akash Kumar ◽ 
Amita Goel ◽ 
Vasudha Bahl ◽ 
Nidhi Sengar

Object detection is a core task in the field of computer vision. An object detection model recognizes real-world objects present either in a captured image or in real-time video, where the objects can belong to classes such as humans, animals, or everyday items. This project implements an object detection algorithm called You Only Look Once (YOLOv3). The YOLO architecture is extremely fast compared to earlier methods. YOLOv3 applies a single neural network to the full image, divides the image into regions, and predicts bounding boxes and class probabilities for each region; the boxes are weighted by the predicted probabilities. After non-maximum suppression, it returns the recognized objects together with their bounding boxes. YOLO trains on and runs detection directly over full images.
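The non-maximum suppression step mentioned above is a generic post-processing routine: keep the highest-scoring box, discard any remaining box that overlaps it too much, and repeat. A minimal greedy sketch (box format and threshold value are assumptions; YOLO implementations vary in details):

```python
def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop boxes that overlap the kept box above the threshold.
        order = [i for i in order if _iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Production detectors usually run NMS per class and on GPU, but the logic is the same.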


2006 ◽  
Vol 5 (3) ◽  
pp. 53-58 ◽  
Author(s):  
Roger K. C. Tan ◽  
Adrian David Cheok ◽  
James K. S. Teh

For better or worse, technological advancement has changed the world: at a professional level, working executives are expected to spend more hours in the office or on business trips, while at a social level the population (especially the younger generation) is glued to the computer, playing video games or surfing the Internet. Traditional leisure activities, especially interaction with pets, have been neglected or forgotten. This paper introduces Metazoa Ludens, a new computer-mediated gaming system which allows pets to play mixed-reality computer games with humans via custom-built technologies and applications. During game-play, the real pet chases a physical movable bait within a predefined area in the real world; an infra-red camera tracks the pet's movements and translates them into the virtual world of the system, mapping them to a virtual pet avatar running after a virtual human avatar. The human player plays the game by controlling the human avatar's movements in the virtual world, which in turn drives the movement of the physical bait in the real world. This unique way of playing computer games gives rise to a new form of mixed-reality interaction between pet owners and their pets, thereby bringing technology and its influence on leisure and social activities to the next level.


2021 ◽  
Vol 11 (23) ◽  
pp. 11241
Author(s):  
Ling Li ◽  
Fei Xue ◽  
Dong Liang ◽  
Xiaofei Chen

Concealed object detection in terahertz imaging is an urgent need for public security and counter-terrorism. So far, there has been no public terahertz imaging dataset for evaluating object detection algorithms. This paper provides a public dataset for evaluating multi-object detection algorithms in active terahertz imaging. Due to high sample similarity and poor imaging quality, object detection on this dataset is much more difficult than on the public object detection datasets commonly used in the computer vision field. Since the traditional hard example mining approach was designed for two-stage detectors and cannot be directly applied to one-stage detectors, this paper designs an image-based Hard Example Mining (HEM) scheme based on RetinaNet. Several state-of-the-art detectors, including YOLOv3, YOLOv4, FRCN-OHEM, and RetinaNet, are evaluated on this dataset. Experimental results show that RetinaNet achieves the best mAP and that HEM further enhances the model's performance. The parameters affecting the detection metrics of individual images are summarized and analyzed in the experiments.
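The image-based hard example mining idea can be illustrated with a simple sampling scheme: rank training images by their current loss and oversample the hardest ones in the next epoch. This is a generic sketch of the concept under assumed parameters (`mine_ratio`, `repeat`), not the paper's specific HEM scheme for RetinaNet:

```python
def hard_example_mining(image_losses, mine_ratio=0.25, repeat=2):
    """Image-level HEM sketch.

    image_losses: dict mapping image id -> training loss.
    Images in the top `mine_ratio` fraction by loss are duplicated
    `repeat` times in the returned sampling schedule, so the detector
    sees hard images more often in the next epoch.
    """
    ranked = sorted(image_losses, key=image_losses.get, reverse=True)
    n_hard = max(1, int(len(ranked) * mine_ratio))
    hard = set(ranked[:n_hard])
    schedule = []
    for img in image_losses:
        schedule.extend([img] * (repeat if img in hard else 1))
    return schedule
```

The key contrast with classical OHEM is the unit of mining: whole images rather than region proposals, which is what makes the idea applicable to one-stage detectors that have no proposal stage.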


Author(s):  
Mahesh Singh

This paper presents findings on autonomous prediction and action by establishing a connection between the real world, machine learning, and the Internet of Things (IoT). The purpose of this research is to enable our machine to analyze different signs in the real world and act accordingly. We explored and detected several features in our model, which helped it interact better with its surroundings, and our algorithms give well-optimized predictions and perform the right action. Autonomous vehicles are currently a major research area, with scope for making them more optimized and more versatile. This paper also contributes a broad survey of object detection and feature extraction techniques; numerous object classification and recognition techniques and algorithms have been developed around the world. Traffic sign detection (TSD) research is of great significance for improving road traffic safety. In recent years, convolutional neural networks (CNNs) have achieved great success in object detection tasks, showing better accuracy or faster execution than traditional methods. However, existing CNN methods cannot deliver high execution speed and high detection accuracy at the same time, and their hardware requirements are higher than before, resulting in greater detection cost. To address these problems, this paper proposes an improved algorithm based on a convolutional model, demonstrated on a classic robot driven by a Raspberry Pi that performs the dedicated action.


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Sultan Daud Khan ◽  
Ahmed B. Altamimi ◽  
Mohib Ullah ◽  
Habib Ullah ◽  
Faouzi Alaya Cheikh

Head detection in real-world videos is a classical research problem in computer vision. Head detection in videos is more challenging than in a single image due to the many nuisances commonly observed in natural videos, including arbitrary poses, appearances, and scales. Generally, head detection is treated as a particular case of object detection in a single image. However, the performance of object detectors deteriorates in unconstrained videos. In this paper, we propose a temporal consistency model (TCM) to enhance the performance of a generic object detector by integrating the spatio-temporal information that exists among subsequent frames of a video. Our model takes the detections of a generic detector as input and improves mean average precision (mAP) by recovering missed detections and suppressing false positives. We compare and evaluate the proposed framework on four challenging datasets, i.e., HollywoodHeads, Casablanca, BOSS, and PAMELA. Experimental evaluation shows that performance is improved by the proposed TCM, and we demonstrate both qualitatively and quantitatively that our framework obtains significant improvements over other methods.
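The false-positive-suppression half of such a temporal scheme can be illustrated with a simple consistency vote: a detection in frame t survives only if overlapping boxes appear in enough of the frames {t-1, t, t+1}. This is a deliberately simplified stand-in for the TCM described above, with assumed parameter names and thresholds, not the authors' model:

```python
def temporal_filter(frames_dets, iou_thresh=0.5, min_support=2):
    """Keep a box only if >= min_support frames in its 3-frame window
    contain a box overlapping it with IoU >= iou_thresh.
    frames_dets: list (per frame) of (x1, y1, x2, y2) boxes."""
    def iou(a, b):
        ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = ix * iy
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    kept = []
    for t, dets in enumerate(frames_dets):
        window = frames_dets[max(0, t - 1):t + 2]
        frame_keep = []
        for box in dets:
            # A box always supports itself, so min_support=2 means
            # "at least one neighboring frame agrees".
            support = sum(any(iou(box, other) >= iou_thresh for other in f)
                          for f in window)
            if support >= min_support:
                frame_keep.append(box)
        kept.append(frame_keep)
    return kept
```

Recovering missed detections works in the opposite direction, interpolating a box into frame t when both neighbors contain well-overlapping boxes; that half is omitted here for brevity.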


SLEEP ◽  
2018 ◽  
Vol 41 (suppl_1) ◽  
pp. A186-A186
Author(s):  
L Schneider ◽  
D Janssens ◽  
P Silberschatz ◽  
A Londeree ◽  
E Shapiro

2021 ◽  
Vol 7 ◽  
pp. e382
Author(s):  
Lingxiang Yao ◽  
Worapan Kusakunniran ◽  
Qiang Wu ◽  
Jian Zhang

Gait has been deemed an alternative biometric in video-based surveillance applications, since it can be used to recognize individuals from a far distance without their interaction or cooperation. Recently, many gait recognition methods have been proposed, aiming at reducing the influence of exterior factors. However, most of these methods are developed assuming a sufficient number of input gait frames, and their recognition performance drops sharply as the frame count decreases. In real-world scenarios, it is impossible to always obtain a sufficient number of gait frames for each subject, for many reasons, e.g., occlusion and illumination. Therefore, it is necessary to improve gait recognition performance when the available gait frames are limited. This paper starts with three different strategies aimed at producing more input frames and eliminating the generalization error caused by insufficient input data. Meanwhile, a two-branch network is also proposed to formulate robust gait representations from the original and newly generated input gait frames. Our experiments verify that, with limited gait frames, the proposed method achieves reliable gait recognition performance.

