Audio and Video Decoding and Synchronous Playback for Embedded Systems

The audio and video decoding and synchronization playback system ofMPEG-2 TS stream is designed and implemented based on ARM embedded system. In this system, hardware processor is embedded in the ARM processor. In order to make full use of this resource, hardware MFC is adopted. The multi-format codec decoder decodes the video data and decodes the audio data using the open source Mad (libmad) library. The V4L2 (Video for Linux2) driver interface and the ALSA (advanced Linux sound architecture) library are used to implement the video. Because the video frame playback period and the hardware processing delay are inconsistent, the system has a time difference between the audio and video data operations, which causes the audio and video playback to be out of sync. Therefore, we use the method of synchronizing the video playback implemented to the audio playback stream; realize the audio and video are playing sync. Test results show that, the designed audio decodes and synchronization playback system can decode and synchronize audio and video data.

Download Full-text

Low-Cost Embedded System Using Convolutional Neural Networks-Based Spatiotemporal Feature Map for Real-Time Human Action Recognition

Applied Sciences ◽

10.3390/app11114940 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4940

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Embedded System ◽

Real Time ◽

Action Recognition ◽

Processing Speed ◽

Recognition Accuracy ◽

Low Cost ◽

Human Action Recognition ◽

Human Action ◽

Video Data ◽

Feature Maps

The field of research related to video data has difficulty in extracting not only spatial but also temporal features and human action recognition (HAR) is a representative field of research that applies convolutional neural network (CNN) to video data. The performance for action recognition has improved, but owing to the complexity of the model, some still limitations to operation in real-time persist. Therefore, a lightweight CNN-based single-stream HAR model that can operate in real-time is proposed. The proposed model extracts spatial feature maps by applying CNN to the images that develop the video and uses the frame change rate of sequential images as time information. Spatial feature maps are weighted-averaged by frame change, transformed into spatiotemporal features, and input into multilayer perceptrons, which have a relatively lower complexity than other HAR models; thus, our method has high utility in a single embedded system connected to CCTV. The results of evaluating action recognition accuracy and data processing speed through challenging action recognition benchmark UCF-101 showed higher action recognition accuracy than the HAR model using long short-term memory with a small amount of video frames and confirmed the real-time operational possibility through fast data processing speed. In addition, the performance of the proposed weighted mean-based HAR model was verified by testing it in Jetson NANO to confirm the possibility of using it in low-cost GPU-based embedded systems.

Download Full-text

VIDEO COPY DETECTION UTILIZING THE LOG-POLAR TRANSFORMATION

International Journal of Computing ◽

10.47839/ijc.15.1.825 ◽

2016 ◽

pp. 8-13

Author(s):

Daniel Reynolds ◽

Richard A. Messner

Keyword(s):

Data Reduction ◽

Reduction Process ◽

Video Data ◽

Video Frame ◽

Detection Accuracy ◽

Video Copy Detection ◽

Copy Detection ◽

Frame Size ◽

Detection Process ◽

Processing Step

Video copy detection is the process of comparing and analyzing videos to extract a measure of their similarity in order to determine if they are copies, modified versions, or completely different videos. With video frame sizes increasing rapidly, it is important to allow for a data reduction process to take place in order to achieve fast video comparisons. Further, detecting video streaming and storage of legal and illegal video data necessitates the fast and efficient implementation of video copy detection algorithms. In this paper some commonly used algorithms for video copy detection are implemented with the Log-Polar transformation being used as a pre-processing step to reduce the frame size prior to signature calculation. Two global based algorithms were chosen to validate the use of Log-Polar as an acceptable data reduction stage. The results of this research demonstrate that the addition of this pre-processing step significantly reduces the computation time of the overall video copy detection process while not significantly affecting the detection accuracy of the algorithm used for the detection process.

Download Full-text

A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis

Journal of Speech Language and Hearing Research ◽

10.1044/2021_jslhr-20-00498 ◽

2021 ◽

pp. 1-15

Author(s):

Andreas M. Kist ◽

Pablo Gómez ◽

Denis Dubrovskiy ◽

Patrick Schlegel ◽

Melda Kunduk ◽

...

Keyword(s):

Neural Networks ◽

Quantitative Analysis ◽

High Speed ◽

Voice Disorders ◽

Vocal Folds ◽

Video Data ◽

Audio Data ◽

Fully Automatic ◽

Video And Audio

Purpose High-speed videoendoscopy (HSV) is an emerging, but barely used, endoscopy technique in the clinic to assess and diagnose voice disorders because of the lack of dedicated software to analyze the data. HSV allows to quantify the vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method We developed a user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow a fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material https://doi.org/10.23641/asha.14575533

Download Full-text

The Research and Development of High-precision Measuring Instrument Based on ARM Embedded System

IOP Conference Series Earth and Environmental Science ◽

10.1088/1755-1315/234/1/012051 ◽

2019 ◽

Vol 234 ◽

pp. 012051

Author(s):

Shuo Yang ◽

Guofeng Qin

Keyword(s):

Research And Development ◽

Embedded System ◽

High Precision ◽

Measuring Instrument ◽

Precision Measuring ◽

Arm Embedded System

Download Full-text

Research on Mechanical Industry with the Patrol Robot Study Based on the Embedded System

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.675.82 ◽

2013 ◽

Vol 675 ◽

pp. 82-85

Author(s):

Wei Li

Keyword(s):

Embedded System ◽

Tracking System ◽

Robot Motion ◽

Visual Servo ◽

Video Images ◽

Drive Motor ◽

Audio Data ◽

Servo Tracking ◽

The Embedded System ◽

Robot Body

It is a hot topic that comprehensive machine vision and embedded systems for mechanical industry robot motion control now. In this paper, an embedded robot visual servo tracking system platform was build, which can process and analyze video images and audio data, then analysis results was transfered to the drive motor in the robot body, and realized the autonomy tracing of the mechanical robot .

Download Full-text

Approach for Parking Spaces Detection Based on ARM Embedded System

2013 5th International Conference on Intelligent Human-Machine Systems and Cybernetics ◽

10.1109/ihmsc.2013.244 ◽

2013 ◽

Cited By ~ 1

Author(s):

Yucheng Li ◽

Guohui Li ◽

Xingcai Zhao

Keyword(s):

Embedded System ◽

Arm Embedded System

Download Full-text

Acoustic Study of Wave-Breaking to Enhance the Understanding of Wave Physics

Volume 6B: Ocean Engineering ◽

10.1115/omae2020-19352 ◽

2020 ◽

Author(s):

Michael Odzer ◽

Kristina Francke

Keyword(s):

Wave Breaking ◽

Ocean Waves ◽

Breaking Waves ◽

Video Data ◽

Wave Tank ◽

Potential Applications ◽

Audio Data ◽

Video And Audio ◽

Bubble Sizes ◽

Laboratory Tank

Abstract The sound of waves breaking on shore, or against an obstruction or jetty, is an immediately recognizable sound pattern which could potentially be employed by a sensor system to identify obstructions. If frequency patterns produced by breaking waves can be reproduced and mapped in a laboratory setting, a foundational understanding of the physics behind this process could be established, which could then be employed in sensor development for navigation. This study explores whether wave-breaking frequencies correlate with the physics behind the collapsing of the wave, and whether frequencies of breaking waves recorded in a laboratory tank will follow the same pattern as frequencies produced by ocean waves breaking on a beach. An artificial “beach” was engineered to replicate breaking waves inside a laboratory wave tank. Video and audio recordings of waves breaking in the tank were obtained, and audio of ocean waves breaking on the shoreline was recorded. The audio data was analysed in frequency charts. The video data was evaluated to correlate bubble sizes to frequencies produced by the waves. The results supported the hypothesis that frequencies produced by breaking waves in the wave tank followed the same pattern as those produced by ocean waves. Analysis utilizing a solution to the Rayleigh-Plesset equation showed that the bubble sizes produced by breaking waves were inversely related to the pattern of frequencies. This pattern can be reproduced in a controlled laboratory environment and extrapolated for use in developing navigational sensors for potential applications in marine navigation such as for use with autonomous ocean vehicles.

Download Full-text

Coding Video Data, Audio Data, and Images

Analyzing Qualitative Data with MAXQDA ◽

10.1007/978-3-030-15671-8_7 ◽

2019 ◽

pp. 83-91

Author(s):

Udo Kuckartz ◽

Stefan Rädiker

Keyword(s):

Video Data ◽

Audio Data

Download Full-text

The Achievement of Web Server in the Development of ARM Embedded System

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.389.1047 ◽

2013 ◽

Vol 389 ◽

pp. 1047-1050

Author(s):

Yan Jie Zhang ◽

Xue Yong Wang ◽

Hong Suo Zhou

Keyword(s):

Embedded System ◽

Web Server ◽

Embedded Operating System ◽

Long Distance ◽

Web Browser ◽

Response Information ◽

Arm Microprocessor ◽

Embedded Gateway ◽

Embedded Applications ◽

Arm Embedded System

Based on open source embedded operating system uClinux, embedded applications system taking ARM microprocessor as the core was connected with Internet. Through embedding Web server to the ARM embedded system functions and adding embedded Gateway Interface CGI program, the interactive delivery with the long-distance customers Web browser order and response information come true.

Download Full-text

Optimization of Storage Method for Video Segment Captured by Embedded System in Camera

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.268-270.2116 ◽

2011 ◽

Vol 268-270 ◽

pp. 2116-2120

Author(s):

Wei Yan

Keyword(s):

Embedded System ◽

Video Surveillance ◽

Video Processing ◽

Continuous Monitoring ◽

Video Data ◽

Video Segment ◽

Fixed Size ◽

Storage Method ◽

The Embedded System ◽

Cache Capacity

Video processing and cache capacity in embedded system in camera is the key to decide intelligence of shooting system. In particular, to meet the demand for continuous monitoring in video surveillance environment, the embedded system must be able to store the video in case of failure in network or server. In this paper, a storage method for videos with fixed size is proposed, which can effectively restrain occurrence of fragment during storage and raise I/O performance while ensuring continuity of monitoring. Furthermore, storage of video data based on H.264/AVC encoding system and its optimization are also discussed.

Download Full-text