Multiview Layer Fusion Model for Action Recognition Using RGBD Images

2018 ◽  
Vol 2018 ◽  
pp. 1-22
Author(s):  
Pongsagorn Chalearnnetkul ◽  
Nikom Suvonvorn

Vision-based action recognition faces several practical challenges, including recognizing the subject from any viewpoint, processing data in real time, and preserving privacy in real-world settings. Even recognizing profile-based human actions, a subset of vision-based action recognition, is a considerable challenge in computer vision, yet it forms the basis for understanding complex actions, activities, and behaviors, especially in healthcare applications and video surveillance systems. Accordingly, we introduce a novel method for constructing a layer feature model for a profile-based solution that allows the fusion of features from multiview depth images. This model enables recognition from several viewpoints with low complexity at a real-time running speed of 63 fps for four profile-based actions: standing/walking, sitting, stooping, and lying. The experiment using the Northwestern-UCLA 3D dataset resulted in an average precision of 86.40%. With the i3DPost dataset, the experiment achieved an average precision of 93.00%. With the PSU multiview profile-based action dataset, a new multiview dataset of profile-based action RGBD images built by our group, we achieved an average precision of 99.31%.
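The abstract does not detail the layer feature model itself; the sketch below only illustrates the general idea of extracting a simple profile descriptor per depth view, fusing the views, and classifying among the four actions. The descriptor, the averaging fusion rule, and the nearest-prototype classifier are illustrative assumptions, not the authors' method.

```python
import numpy as np

def profile_features(depth_image):
    """Toy per-view profile descriptor: normalized row and column sums of the
    depth silhouette (a stand-in for the paper's layer features)."""
    silhouette = (np.asarray(depth_image) > 0).astype(float)
    rows = silhouette.sum(axis=1)
    cols = silhouette.sum(axis=0)
    return np.concatenate([rows / (rows.sum() + 1e-6),
                           cols / (cols.sum() + 1e-6)])

def fuse_views(depth_images):
    """Fuse descriptors from several synchronized viewpoints by simple averaging
    (the fusion rule here is an assumption)."""
    return np.mean([profile_features(d) for d in depth_images], axis=0)

def classify(fused_feature, prototypes):
    """Toy nearest-prototype classifier over the four profile-based actions;
    `prototypes` maps each action name (e.g. "sitting") to a reference feature."""
    return min(prototypes,
               key=lambda a: np.linalg.norm(fused_feature - prototypes[a]))
```

A caller would, for example, build `prototypes` from labelled training clips (one averaged fused feature per action) and then call `classify(fuse_views(current_views), prototypes)` for each incoming set of synchronized depth frames.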

2020 ◽  
Author(s):  
Evan M. Dastin-van Rijn ◽  
Nicole R. Provenza ◽  
Jonathan S. Calvert ◽  
Ro’ee Gilron ◽  
Anusha B. Allawala ◽  
...  

Advances in device development have enabled concurrent stimulation and recording at adjacent locations in the central nervous system. However, stimulation artifacts obscure the sensed underlying neural activity. Here, we developed a novel method, termed Period-based Artifact Reconstruction and Removal Method (PARRM), to remove stimulation artifacts from neural recordings by leveraging the exact period of stimulation to construct and subtract a high-fidelity template of the artifact. Benchtop saline experiments, computational simulations, five unique in vivo paradigms across animal and human studies, and an obscured movement biomarker were used for validation. Performance was found to exceed that of state-of-the-art filters in recovering complex signals without introducing contamination. PARRM has several advantages: it is 1) superior in signal recovery; 2) easily adaptable to several neurostimulation paradigms; and 3) low-complexity for future on-device implementation. Real-time artifact removal via PARRM will enable unbiased exploration and detection of neural biomarkers to enhance efficacy of closed-loop therapies.
Summary: Online, real-time artifact removal via PARRM will enable unbiased exploration of neural biomarkers previously obscured by stimulation artifact.
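PARRM itself handles non-integer stimulation periods with a sliding, period-based weighted template; the minimal sketch below only shows the core idea of building an artifact template from samples that share the same phase of the stimulation period and subtracting it, under the simplifying assumption that the period is an integer number of samples.

```python
import numpy as np

def period_template_removal(signal, period_samples):
    """Simplified period-based artifact removal: average all samples that fall at
    the same phase of the stimulation period to form a template, then subtract
    that template from the recording. Assumes an integer period in samples; the
    published PARRM handles non-integer periods and drifting templates."""
    signal = np.asarray(signal, dtype=float)
    phases = np.arange(signal.size) % period_samples
    template = np.zeros(period_samples)
    for p in range(period_samples):
        template[p] = signal[phases == p].mean()
    return signal - template[phases]
```

For example, `clean = period_template_removal(raw, period_samples=fs // stim_rate)` would apply the idea to a recording sampled at `fs` Hz with stimulation at `stim_rate` Hz, assuming the ratio is an integer (these names are hypothetical).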


Author(s):  
И.Г. Малыгин ◽  
О.А. Королев

Modern intelligent video surveillance systems have become increasingly focused on real-time transmission of high-quality video of various important events, including emergencies. High-performance video transmission systems of the new generation require efficient structural solutions capable of both high transmission speed and high calculation accuracy. Such structures must process huge sequences of images, and each video stream must be characterized by high resolution with minimal noise and distortion, while consuming as little power as possible. Spectral algorithms for processing video information, in particular the discrete cosine transform, are the most common basis for real-time transmission. The original image is transformed from the spatial to the frequency domain in order to compress it by reducing or eliminating the redundancy of the visual data. Implicit computation of the transform of an 8-point array yields efficient compression while requiring no more than five multiplication operations. In this paper, we propose a low-complexity architecture and an image transformation method based on integer algebra.
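The paper's integer-algebra transform is not reproduced in the abstract. As a general illustration of trading exact cosine multiplications for low-complexity integer arithmetic, the sketch below compares the exact 8-point DCT-II with the well-known "signed DCT" approximation, whose matrix contains only ±1 entries and therefore needs no multiplications at all; this is a stand-in for the idea, not the architecture proposed in the paper.

```python
import numpy as np

def dct8_matrix():
    """Exact orthonormal 8-point DCT-II basis matrix."""
    n, k = np.meshgrid(np.arange(8), np.arange(8))   # n: sample index, k: frequency
    C = np.cos(np.pi * (2 * n + 1) * k / 16)
    C[0, :] /= np.sqrt(2)
    return C * 0.5                                    # 0.5 == sqrt(2/8)

def signed_dct8_matrix():
    """Multiplication-free integer approximation ('signed DCT'): keep only the
    signs (+1/-1) of the exact basis entries. Illustrative only."""
    n, k = np.meshgrid(np.arange(8), np.arange(8))
    return np.sign(np.cos(np.pi * (2 * n + 1) * k / 16)).astype(int)

x = np.arange(8, dtype=float)            # toy 1-D block of pixel values
exact = dct8_matrix() @ x                # needs floating-point multiplications
approx = signed_dct8_matrix() @ x        # only integer additions/subtractions
```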


2011 ◽  
Vol 73 (03) ◽  
Author(s):  
H Freund ◽  
V Bochat ◽  
H ter Waarbeek

Author(s):  
Rajat Khurana ◽  
Alok Kumar Singh Kushwaha

Background & Objective: Identification of human actions from video has gathered much attention in recent years. Many computer vision tasks, such as healthcare activity detection, suspicious activity detection, and human-computer interaction, are based on the principle of activity detection, that is, the automatic labelling of activities from video frames. The motivation of this work is to make the most of the data generated by sensors and use it for class recognition. Recognition of actions from video sequences is a growing field driven by the rise of deep neural networks. The automatic feature-learning capability of convolutional neural networks (CNNs) makes them a good choice compared with traditional handcrafted approaches. With the increasing availability of RGB-D sensors, combining RGB and depth data is in great demand. This work uses dynamic images generated from RGB frames, combined with depth maps, for action recognition. We evaluated our approach on a pretrained VGG-F model using the MSR Daily Activity dataset and the UTD-MHAD dataset and achieve state-of-the-art results. To support our research, we report precision, recall, and F-score in addition to accuracy. Conclusion: The investigation confirms improvement in terms of accuracy, precision, F-score, and recall. The proposed four-stream model is robust to occlusion, can be used in real time, and fully utilizes the data from the RGB-D sensor.
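A dynamic image collapses a clip into a single image by a weighted temporal sum so that a pretrained 2D CNN such as VGG-F can consume motion information. The sketch below uses the simple linear weights a_t = 2t - T - 1 that are sometimes used to approximate rank pooling; the paper's exact construction (and its application to both the RGB and depth streams of the four-stream model) may differ.

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a clip into a single 'dynamic image' by a weighted temporal sum.
    Uses the simple linear weights a_t = 2t - T - 1 (t = 1..T), a common
    approximation to rank pooling; the paper's exact weighting may differ."""
    frames = np.asarray(frames, dtype=float)      # shape (T, H, W) or (T, H, W, C)
    T = frames.shape[0]
    weights = 2 * np.arange(1, T + 1) - T - 1
    d = np.tensordot(weights, frames, axes=(0, 0))
    # rescale to 0..255 so the result can be fed to a pretrained CNN such as VGG-F
    d = 255 * (d - d.min()) / (d.max() - d.min() + 1e-6)
    return d.astype(np.uint8)
```

The same function would be applied separately to the RGB frames and to the depth maps to produce one dynamic image per stream.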


2021 ◽  
Vol 11 (11) ◽  
pp. 4940
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

Research on video data faces the difficulty of extracting not only spatial but also temporal features, and human action recognition (HAR) is a representative field that applies convolutional neural networks (CNNs) to video data. Although action recognition performance has improved, the complexity of these models still limits real-time operation. Therefore, a lightweight CNN-based single-stream HAR model that can operate in real time is proposed. The proposed model extracts spatial feature maps by applying a CNN to the frames that compose the video and uses the frame change rate of sequential images as temporal information. The spatial feature maps are weighted-averaged by frame change, transformed into spatiotemporal features, and fed into a multilayer perceptron, which has relatively lower complexity than other HAR models; thus, our method has high utility in a single embedded system connected to CCTV. Evaluation of action recognition accuracy and data processing speed on the challenging UCF-101 action recognition benchmark showed higher accuracy than an HAR model using long short-term memory when only a small number of video frames is available, and the fast data processing speed confirmed the possibility of real-time operation. In addition, the performance of the proposed weighted-mean-based HAR model was verified on a Jetson Nano to confirm its suitability for low-cost GPU-based embedded systems.
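The abstract does not specify exactly how the frame change rate is computed; the sketch below assumes it is the mean absolute difference between consecutive frames, normalized to sum to one, and shows how per-frame CNN feature vectors could be weighted-averaged into a single spatiotemporal descriptor for the multilayer perceptron. Both choices are assumptions for illustration.

```python
import numpy as np

def frame_change_rates(frames):
    """Mean absolute difference between consecutive frames, normalized to sum to 1,
    as a stand-in for the paper's frame change rate."""
    frames = np.asarray(frames, dtype=float)              # shape (T, H, W)
    diffs = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
    rates = np.concatenate([[0.0], diffs])                # first frame gets no change
    return rates / (rates.sum() + 1e-6)

def weighted_spatiotemporal_feature(feature_vectors, rates):
    """Weighted average of per-frame CNN feature vectors by frame change rate,
    producing one spatiotemporal descriptor for an MLP classifier."""
    feature_vectors = np.asarray(feature_vectors, dtype=float)   # shape (T, D)
    return (rates[:, None] * feature_vectors).sum(axis=0)
```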


Author(s):  
Manju Rahi ◽  
Payal Das ◽  
Amit Sharma

Malaria surveillance is weak in high-burden countries, yet surveillance is considered one of the core interventions for malaria elimination. Impressive reductions in malaria-associated morbidity and mortality have been achieved across the globe, but sustained efforts need to be bolstered to achieve malaria elimination in endemic countries like India. Poor surveillance data hinder assessment of the progress made towards malaria elimination and the channelling of focused interventions to hotspots. A major obstacle in strengthening India’s reporting systems is that surveillance data are captured in a fragmented manner by multiple players, in silos, and are distributed across geographic regions. In addition, the data are not reported in near real time. Furthermore, the multiplicity of malaria data resources limits interoperability between them. Here, we deliberate on the acute need to update India’s surveillance systems from aggregated data to near real-time, case-based surveillance. This would help identify the drivers of malaria transmission in any locale and thereby facilitate the rapid formulation of appropriate interventional responses.


2021 ◽  
pp. 1-11
Author(s):  
Tingting Zhao ◽  
Xiaoli Yi ◽  
Zhiyong Zeng ◽  
Tao Feng

YTNR (Yunnan Tongbiguan Nature Reserve) is located in the westernmost part of China’s tropical regions and is the only area in China with the tropical biota of the Irrawaddy River system. The reserve has abundant tropical flora and fauna resources. In order to realize real-time detection of wild animals in this area, this paper proposes an improved YOLO (You Only Look Once) network. The original YOLO model achieves high detection accuracy, but its complex structure prevents fast detection on a CPU platform. Therefore, the lightweight MobileNet is introduced to replace the backbone feature extraction network in YOLO, enabling real-time detection on the CPU platform. Because wild animal image data are difficult to collect, the research team deployed 50 high-definition cameras in the study area and conducted continuous observations for more than 1,000 hours. In the end, this research uses 1,410 wildlife images collected in the field and 1,577 wildlife images from the internet, combined with manual annotation by domain experts, to construct the research dataset. At the same time, transfer learning is introduced to address insufficient training data and the resulting difficulty in fitting the network. The experimental results show that our model, trained on a training set of 2,419 animal images, achieves a mean average precision of 93.6% and an FPS (frames per second) of 3.8 on the CPU. Compared with YOLO, the mean average precision is increased by 7.7% and the FPS is increased by 3.
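As an illustration of the general idea of swapping a heavy detection backbone for MobileNet and reusing pretrained weights (transfer learning), the sketch below wires a pretrained torchvision MobileNetV2 feature extractor to a minimal one-layer detection head. The head, anchor count, and class layout are assumptions; this is not the paper's network, and it omits the YOLO loss and box decoding. It assumes torchvision 0.13 or newer for the weights enum.

```python
import torch.nn as nn
from torchvision import models

class MobileNetDetector(nn.Module):
    """Illustrative detector: a pretrained MobileNetV2 backbone (transfer learning)
    with a small convolutional prediction head. A sketch of replacing YOLO's
    backbone with MobileNet, not the paper's exact architecture."""
    def __init__(self, num_classes, anchors_per_cell=3):
        super().__init__()
        backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
        self.features = backbone.features            # pretrained feature extractor
        # each anchor predicts 4 box offsets + 1 objectness score + class scores
        out_ch = anchors_per_cell * (5 + num_classes)
        self.head = nn.Conv2d(1280, out_ch, kernel_size=1)

    def forward(self, x):
        return self.head(self.features(x))           # grid of box/class predictions
```

Fine-tuning would typically freeze most of `self.features` at first and train the head on the annotated wildlife images, which is one common way to apply transfer learning with limited data.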


2014 ◽  
Vol 2014 ◽  
pp. 1-5 ◽  
Author(s):  
Liang Zhao

This paper presents a novel abnormal-data detection algorithm based on the first-order difference method, which can be used to find outliers in a building energy consumption platform in real time. The principle and criterion of the methodology are discussed in detail. The results show that outliers in cumulative power consumption can be detected by our method.
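A minimal sketch of first-order-difference outlier detection on a cumulative power series is shown below; the threshold rule (a robust z-score on the increments) is an assumption for illustration and is not the paper's exact criterion.

```python
import numpy as np

def detect_outliers(cumulative, k=3.0):
    """Flag abnormal readings in a cumulative power-consumption series using the
    first-order difference: increments far from the typical increment (more than
    k robust standard deviations from the median) are marked as outliers."""
    cumulative = np.asarray(cumulative, dtype=float)
    inc = np.diff(cumulative)                       # first-order difference
    med = np.median(inc)
    mad = np.median(np.abs(inc - med)) + 1e-9       # robust scale estimate
    z = 0.6745 * (inc - med) / mad                  # approximate z-score from MAD
    return np.where(np.abs(z) > k)[0] + 1           # indices into the original series
```

In a streaming setting, the median and MAD would be maintained over a sliding window so each new reading can be checked as it arrives.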

