A Comparative Analysis Using Silhouette Extraction Methods for Dynamic Objects in Monocular Vision

2022 ◽  
pp. 1-12
Author(s):  
Md Rajib M Hasan ◽  
Noor H. S. Alani

Moving or dynamic object analysis continues to be an increasingly active research field in computer vision, with many studies investigating different methods for motion tracking, object recognition, pose estimation, or motion evaluation (e.g. in sports sciences). Many techniques are available to measure the forces and motion of people, such as force plates that measure ground reaction forces for jumping or running sports. In training and commercial solutions, the detailed motion of an athlete can be captured with motion capture devices based on optical markers on the athlete's body and multiple calibrated fixed cameras around the sides of the capture volume. In some situations, however, it is not practical to attach any kind of marker or transducer to the athletes or to the machinery being used, so a purely vision-based approach must rely on the natural appearance of the person or object. When a sporting event is taking place, there are opportunities for computer vision to help the referee and other personnel involved keep track of incidents as they occur, and to provide full coverage and detailed analysis of the event for sports viewers. This research aims at using computer vision methods, specifically designed for monocular recording, to measure sports activities such as the high jump, long jump, or running. To indicate the complexity of the project: a single camera needs to estimate height at a particular distance using silhouette extraction. Moving object analysis benefits from silhouette extraction, which has been applied in many domains, including sports activities. This paper comparatively discusses two significant techniques for extracting silhouettes of a moving object (a jumping person) in monocular video data in different scenarios. The results show that the performance of silhouette extraction varies depending on the quality of the video data used.
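The abstract does not name the two techniques it compares, but as a hedged illustration of one common family of silhouette-extraction methods, the sketch below pulls a foreground silhouette from monocular video with OpenCV background subtraction; the video path and the largest-blob-is-the-athlete assumption are placeholders, not the paper's implementation.

```python
# Minimal silhouette-extraction sketch via background subtraction (OpenCV).
# "jump.mp4" is a placeholder path; this is an illustrative approach, not
# the paper's specific method.
import cv2

cap = cv2.VideoCapture("jump.mp4")  # monocular video of a jumping person
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)            # raw foreground mask
    mask[mask == 127] = 0                     # drop shadow pixels (MOG2 marks them 127)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle noise
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        person = max(contours, key=cv2.contourArea)  # assume largest blob is the athlete
        x, y, w, h = cv2.boundingRect(person)        # bounding box height ~ jump-height proxy
        print(f"silhouette bbox: x={x} y={y} w={w} h={h}")

cap.release()
```

The bounding-box height of the extracted silhouette is one simple quantity a single calibrated camera could relate to physical jump height at a known distance.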

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3691
Author(s):  
Ciprian Orhei ◽  
Silviu Vert ◽  
Muguras Mocofan ◽  
Radu Vasiu

Computer Vision is a cross-disciplinary research field whose main purpose is to understand the surrounding environment as closely as possible to human perception. Image processing systems are continuously growing and expanding into more complex systems, usually tailored to the specific needs or applications they serve. To better serve this purpose, research on the architecture and design of such systems is also important. We present the End-to-End Computer Vision Framework (EECVF), an open-source solution that aims to support researchers and teachers within the vast image processing field. The framework incorporates Computer Vision features and Machine Learning models that researchers can use. Given the continuous need to add new Computer Vision algorithms in day-to-day research activity, our proposed framework has the advantage of a configurable and scalable architecture. Even though the main focus of the framework is the Computer Vision processing pipeline, it offers solutions for incorporating even more complex activities, such as training Machine Learning models. EECVF aims to become a useful tool for learning activities in the Computer Vision field, as it allows the learner and the teacher to handle only the topics at hand, rather than the interconnections necessary for a visual processing flow.
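The configurable, pluggable pipeline the abstract describes can be illustrated with the general design pattern below. This is a hypothetical sketch of that pattern in Python, not the EECVF API; the `Pipeline` class and example stages are invented for illustration.

```python
# Hypothetical sketch of a configurable vision-pipeline architecture in the
# spirit the abstract describes; NOT the EECVF API, just the design pattern.
from typing import Callable, List
import numpy as np

Stage = Callable[[np.ndarray], np.ndarray]

class Pipeline:
    """Chain of independent processing stages; new algorithms plug in as stages."""
    def __init__(self) -> None:
        self.stages: List[Stage] = []

    def add(self, stage: Stage) -> "Pipeline":
        self.stages.append(stage)
        return self                      # fluent style: pipeline.add(a).add(b)

    def run(self, image: np.ndarray) -> np.ndarray:
        for stage in self.stages:
            image = stage(image)
        return image

# Example stages a researcher might register.
def to_gray(img: np.ndarray) -> np.ndarray:
    return img.mean(axis=2).astype(np.uint8) if img.ndim == 3 else img

def threshold(img: np.ndarray) -> np.ndarray:
    return (img > 128).astype(np.uint8) * 255

pipeline = Pipeline().add(to_gray).add(threshold)
result = pipeline.run(np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8))
```

Because each stage is an independent callable, swapping in a new algorithm means registering one more stage, which is the scalability property the authors emphasize.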


Author(s):  
Shiyu Deng ◽  
Chaitanya Kulkarni ◽  
Tianzi Wang ◽  
Jacob Hartman-Kenzler ◽  
Laura E. Barnes ◽  
...  

Context-dependent gaze metrics, derived from eye movements explicitly associated with how a task is being performed, are particularly useful for formative assessment that includes feedback on specific behavioral adjustments for skill acquisition. In laparoscopic surgery, context-dependent gaze metrics are under-investigated and are commonly derived either by qualitatively inspecting videos frame by frame or by mapping fixations onto a static surgical task field. This study collected eye-tracking and video data from 13 trainees practicing the peg transfer task. Machine learning algorithms in computer vision were employed to derive metrics of tool speed, fixation rate on (moving or stationary) target objects, and fixation rate on tool-object combinations. Preliminary results from a clustering analysis of the measurements from 499 practice trials indicated that the metrics were able to differentiate three skill levels amongst the trainees, suggesting high sensitivity and the potential of context-dependent gaze metrics for surgical assessment.
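As a hedged illustration of the clustering step described, the sketch below runs k-means over per-trial metric vectors; the three feature columns mirror the metrics named in the abstract, but the data here is an invented placeholder, not the study's measurements.

```python
# Illustrative clustering of per-trial gaze/tool metrics into three skill
# levels with k-means; the random data is a placeholder for the study's
# 499 practice trials.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# One row per practice trial: [tool speed, fixation rate on target objects,
# fixation rate on tool-object combination]
trials = rng.normal(size=(499, 3))

features = StandardScaler().fit_transform(trials)   # put metrics on one scale
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print(np.bincount(labels))   # number of trials in each inferred skill cluster
```

Standardizing first matters here because tool speed and fixation rates live on very different scales, and k-means is distance-based.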


10.2196/27663 ◽  
2021 ◽  
Vol 8 (5) ◽  
pp. e27663
Author(s):  
Sandersan Onie ◽  
Xun Li ◽  
Morgan Liang ◽  
Arcot Sowmya ◽  
Mark Erik Larsen

Background: Suicide is a recognized public health issue, with approximately 800,000 people dying by suicide each year. Among the different technologies used in suicide research, closed-circuit television (CCTV) and video have been used for a wide array of applications, including assessing crisis behaviors at metro stations and using computer vision to identify a suicide attempt in progress. However, there has been no review of suicide research and interventions using CCTV and video. Objective: The objective of this study was to review the literature to understand how CCTV and video data have been used in understanding and preventing suicide. Furthermore, to more fully capture progress in the field, we report on an ongoing study that responds to a gap identified in the narrative review by using a computer vision-based system to identify behaviors prior to a suicide attempt. Methods: We conducted a search using the keywords "suicide," "cctv," and "video" on PubMed, Inspec, and Web of Science. We included any studies that used CCTV or video footage to understand or prevent suicide. If a study fell into our area of interest, we included it regardless of quality, as our goal was to understand the scope of how CCTV and video have been used rather than to quantify any specific effect size; however, we noted shortcomings in design and analysis when discussing the studies. Results: The review found that CCTV and video have primarily been used in three ways: (1) identifying risk factors for suicide (eg, inferring depression from facial expressions), (2) understanding suicide after an attempt (eg, forensic applications), and (3) serving as part of an intervention (eg, using computer vision and automated systems to identify whether a suicide attempt is in progress). Furthermore, work in progress demonstrates how behaviors prior to an attempt at a hotspot can be identified, addressing an important gap identified by papers in the literature. Conclusions: Thus far, CCTV and video have been used in a wide array of applications, most notably in designing automated detection systems, with the field heading toward automated detection for early intervention. Despite many challenges, we show promising progress in developing an automated detection system for preattempt behaviors, which may allow for early intervention.


2018 ◽  
Vol 16 (2) ◽  
pp. 135-148
Author(s):  
Barnawi Barnawi

Abstract: A high level of absorption of material is obtained when learning is effective. Effective learning occurs when students are placed as active individuals in direct contact with the subject matter. This research aims to reduce the limitations of the usual tools (a computer or laptop) and to maximize existing facilities (mobile phones), with the goal of achieving effective learning that puts students at the center as subjects of learning. This study is field research comparing the academic performance produced by two learning models. The first learning model is simulation learning; the second is self-learning via a mobile facility. The self-learning material in this research is video in 3GP format transferred to the students' mobile phones. The research population is 85 students, from which a sample of 70 students was taken. The data in this study are the performance of students under the simulation learning model and under the 3GP-video-based self-learning model. Data were analyzed with inferential statistics, namely the t-test, after the requirement of data normality was fulfilled. Hypothesis testing gave the following result: the computed t value is larger than the table t value (5.957 > 2.025). Thus Ha is accepted and Ho is rejected (the significance is below or equal to 0.05, so Ha is accepted). This means that there is a significant difference between the simulation learning model and the 3GP-video-based self-learning model. Keywords: Learning Media, 3GP Video.
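For illustration, the reported comparison corresponds to a two-sample t-test. The sketch below runs one in Python on invented placeholder scores; the arrays are hypothetical stand-ins for the two groups' performance data, not the study's measurements.

```python
# Illustrative two-sample t-test, analogous to the reported result
# (computed t = 5.957 vs. table t = 2.025). Scores below are invented.
import numpy as np
from scipy import stats

simulation_scores = np.array([78, 82, 75, 80, 85, 79, 81])   # hypothetical group 1
video_3gp_scores = np.array([68, 70, 65, 72, 66, 69, 71])    # hypothetical group 2

t_stat, p_value = stats.ttest_ind(simulation_scores, video_3gp_scores)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# Reject Ho at alpha = 0.05 when p <= 0.05 (equivalently, |t| exceeds the critical t).
```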


With advances in technology, security and authentication have become key aspects of the computer vision approach. Moving object detection is an efficient system whose goal is to preserve the perceptible and principal source in a group. Surveillance is one of the most crucial requirements and is carried out to monitor various kinds of activities. The detection and tracking of moving objects are the fundamental concepts underlying surveillance systems. Moving object recognition is a challenging problem in the field of digital image processing. Moving object detection underlies applications such as Human Machine Interaction (HMI), safety and video surveillance, augmented reality, transportation monitoring on roads, and medical imaging. The main goal of this research is the detection and tracking of moving objects. The proposed approach is based on a pre-processing method in which frames are extracted and their dimensionality reduced. It applies morphological methods to clean the foreground image of the moving objects and extracts texture-based features using a component analysis method. After that, a novel method is designed: an optimized multilayer perceptron neural network. Its layers are optimized based on the Pbest and Gbest particle positions of the objects, finding fitness values as binary values (x_update, y_update) of the swarm or object positions. The final frames of the moving objects in the video are created using a blob analyser. In this research, an application is designed using MATLAB version 2016a; in the activation function, the given input is re-filtered and the final output is calculated with the help of a pre-defined sigmoid. The proposed method is evaluated for detection and tracking on the MOT, FOOTBALL, INDOOR, and OUTDOOR datasets, with the aim of improving the detection accuracy and recall rates while reducing the error, false positive, and false negative rates, and is compared with various classifiers such as KNN, MLPNN, and the J48 decision tree.
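As a hedged illustration of the particle swarm idea behind the Pbest/Gbest updates mentioned above (not the paper's MATLAB implementation), the sketch below runs a minimal PSO loop in Python; the sphere function stands in for the network-training fitness, and the swarm constants are illustrative choices.

```python
# Minimal particle swarm optimization (PSO) sketch showing the pbest/gbest
# updates. The objective is a placeholder; in the paper's setting the
# fitness would come from the MLP being optimized.
import numpy as np

def fitness(x: np.ndarray) -> float:
    return float(np.sum(x ** 2))        # placeholder objective (minimize)

rng = np.random.default_rng(1)
n_particles, dim = 20, 4
w, c1, c2 = 0.7, 1.5, 1.5               # inertia and acceleration constants

pos = rng.uniform(-5, 5, (n_particles, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()                       # each particle's best position so far
pbest_val = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy() # best position across the whole swarm

for _ in range(100):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel                      # position update step
    vals = np.array([fitness(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("best fitness:", pbest_val.min())
```

Each particle is pulled toward its own best-seen position (Pbest) and the swarm's best-seen position (Gbest), which is the mechanism the abstract invokes for tuning the perceptron's layers.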


2015 ◽  
Author(s):  
Manivannan Subramaniyan ◽  
Alexander S. Ecker ◽  
Saumil S. Patel ◽  
R. James Cotton ◽  
Matthias Bethge ◽  
...  

Abstract: When the brain has determined the position of a moving object, due to anatomical and processing delays the object will already have moved to a new location. Given the statistical regularities present in natural motion, the brain may have acquired compensatory mechanisms to minimize the mismatch between the perceived and the real position of a moving object. A well-known visual illusion, the flash lag effect, points towards such a possibility. Although many psychophysical models have been suggested to explain this illusion, their predictions have not been tested at the neural level, particularly in a species known to perceive the illusion. Towards this, we recorded neural responses to flashed and moving bars from primary visual cortex (V1) of awake, fixating macaque monkeys. We found that the response latency to moving bars of varying speed, motion direction, and luminance was shorter than that to flashes, in a manner consistent with psychophysical results. At the level of V1, our results support the differential latency model, which posits that flashed and moving bars have different latencies. As we found a neural correlate of the illusion in passively fixating monkeys, our results also suggest that judging the instantaneous position of the moving bar at the time of flash, as required by the postdiction/motion-biasing model, may not be necessary for observing a neural correlate of the illusion. Our results further suggest that the brain may have evolved mechanisms to process moving stimuli faster and closer to real time than briefly appearing stationary stimuli.

New and Noteworthy: We report several observations in awake macaque V1 that provide support for the differential latency model of the flash lag illusion. We find that the equal latency of flash and moving stimuli assumed by motion integration/postdiction models does not hold in V1. We show that in macaque V1, motion processing latency depends on stimulus luminance, speed, and motion direction in a manner consistent with several psychophysical properties of the flash lag illusion.


2013 ◽  
pp. 1124-1144 ◽  
Author(s):  
Patrycia Barros de Lima Klavdianos ◽  
Lourdes Mattos Brasil ◽  
Jairo Simão Santana Melo

Recognition of human faces has been a fascinating research subject for many years. It is considered a multidisciplinary field because it draws on different domains such as psychology, neuroscience, computer vision, artificial intelligence, mathematics, and many others. Human face perception is intriguing and draws our attention because we accomplish the task so well that we hope one day to witness a machine performing it in a similar or better way. This chapter aims to provide a systematic and practical approach to one of the most widely used techniques in face recognition, the Active Appearance Model (AAM). The AAM method is addressed considering 2D face processing only. This chapter does not cover the entire theme, but it offers the reader the necessary tools to construct a consistent and productive pathway into this engaging subject.
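As a hedged sketch of the statistical shape model at the heart of an AAM (not a full AAM, which also fits an appearance model), the code below builds a PCA shape basis from landmark vectors; the landmark data is a random placeholder rather than a real face dataset.

```python
# Hedged sketch of an AAM-style statistical shape model: after alignment,
# PCA gives s ~ s_mean + P @ b, where b is a small parameter vector.
# Landmark data below is a random placeholder, not real face annotations.
import numpy as np

rng = np.random.default_rng(2)
n_faces, n_landmarks = 50, 68
# Each training shape: 68 (x, y) landmarks flattened into a 136-vector.
shapes = rng.normal(size=(n_faces, n_landmarks * 2))

mean_shape = shapes.mean(axis=0)
centered = shapes - mean_shape
# PCA via SVD: columns of P are the principal modes of shape variation.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
P = Vt.T[:, :10]                         # keep the 10 strongest modes

# A new shape is synthesized from a small parameter vector b.
b = rng.normal(scale=0.5, size=10)
synthesized = mean_shape + P @ b
print(synthesized.shape)                 # (136,) -> 68 reconstructed landmarks
```

Fitting an AAM to a new image then amounts to searching for the parameters (shape plus appearance) that best explain the image, which is what makes the model both compact and generative.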

