Performance of an advanced video codec on a general-purpose processor with media ISA extensions

Author(s):  
V. Lappalainen ◽  
P. Defee ◽  
A. Hallapuro

This paper presents a flexible and versatile motion estimation processor capable of meeting the processing requirements of high-definition (HD) video with the H.264 Advanced Video Codec, and suitable for FPGA implementation. The work is based on a general-purpose processor design for the motion estimation process. It covers fast motion estimation with both the full search algorithm and the diamond search algorithm, implemented in a single processor so that the user can dynamically choose whichever performs best and can select the video quality option at run time. Unlike most previous work, the core is optimized to execute all existing fast block-matching algorithms and to match or exceed the inter-frame prediction performance of full-search approaches at the HD resolutions in common use today. Various novel motion estimation architectures have been proposed in the literature to handle the high-bandwidth demands of video broadcasting. A high-accuracy full-search fixed block search algorithm is used to reduce the overall bandwidth and power requirements of transmitting live video sequences. Although full search guarantees high accuracy, it trades computation time for that accuracy, so its accuracy advantage is largely offset by its low operating speed. To replace the full search algorithm, a modified diamond search algorithm is proposed that offers the best accuracy with an optimized motion estimation time. A performance evaluation comparing FBS full search and diamond search is left for future study.
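The trade-off between the two searches can be illustrated with a minimal sketch. It is not the processor design from the abstract: it uses the textbook large/small diamond patterns rather than the modified diamond search the authors propose, and the frame contents, block size, search range, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch: textbook full search vs. diamond search block matching.
# Frame contents, block size n, and search range r are assumed values.

def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def in_bounds(ref, bx, by, dx, dy, n):
    """True if the displaced block lies entirely inside the reference frame."""
    h, w = len(ref), len(ref[0])
    return 0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h

def full_search(cur, ref, bx, by, n, r):
    """Test every displacement in [-r, r] x [-r, r]; return (cost, dx, dy)."""
    best = (sad(cur, ref, bx, by, 0, 0, n), 0, 0)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if in_bounds(ref, bx, by, dx, dy, n):
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best[0]:
                    best = (cost, dx, dy)
    return best

# Large and small diamond search patterns (center listed first so it wins ties).
LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]

def best_of(cur, ref, bx, by, n, r, cx, cy, pattern):
    """Evaluate one pattern around center (cx, cy); return (cost, dx, dy)."""
    best = None
    for ox, oy in pattern:
        dx, dy = cx + ox, cy + oy
        if max(abs(dx), abs(dy)) <= r and in_bounds(ref, bx, by, dx, dy, n):
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

def diamond_search(cur, ref, bx, by, n, r):
    """Repeat the large diamond until its center is the minimum,
    then refine once with the small diamond; return (cost, dx, dy)."""
    cx, cy = 0, 0
    while True:
        _, nx, ny = best_of(cur, ref, bx, by, n, r, cx, cy, LDSP)
        if (nx, ny) == (cx, cy):
            break
        cx, cy = nx, ny
    return best_of(cur, ref, bx, by, n, r, cx, cy, SDSP)

# Assumed test frames: a smooth synthetic reference, and a current frame that
# is the same content displaced by (dx, dy) = (2, 1).
SIZE = 32
ref = [[x * x + 3 * y * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[(x + 2) ** 2 + 3 * (y + 1) ** 2 for x in range(SIZE)] for y in range(SIZE)]

fs = full_search(cur, ref, 12, 12, 8, 4)
ds = diamond_search(cur, ref, 12, 12, 8, 4)
```

Full search evaluates all (2r + 1)^2 candidate displacements, while diamond search evaluates only a handful of pattern points per step. On textured content diamond search can settle in a local minimum, which is the accuracy-versus-time trade-off the abstract describes.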


Author(s):  
Hui Yang ◽  
Anand Nayyar

With the rapid development of information technology, the volume of data is growing geometrically, which places higher demands on transmission speed and storage space. To reduce storage use and further improve transmission efficiency, data need to be compressed. In the process of data compression it is very important to ensure that no data are lost, which has led to lossless data compression algorithms. Gradually optimizing the design of the algorithm can often make data compression more energy efficient. Similarly, energy savings can also be obtained by improving the hardware structure of the node. In this paper, a new structure is designed for the sensor node, which adopts hardware acceleration and separates the data compression module from the node microprocessor. On the basis of the ASIC design of the algorithm, introducing hardware acceleration successfully reduced the energy consumed in compressing data: compared with the general-purpose processor, the savings in energy consumption and compression time were as high as 98.4% and 95.8%, respectively. This greatly reduces both compression time and energy consumption.


Author(s):  
Matias Javier Oliva ◽  
Pablo Andrés García ◽  
Enrique Mario Spinelli ◽  
Alejandro Luis Veiga

Real-time acquisition and processing of electroencephalographic signals have promising applications in the implementation of brain-computer interfaces. These devices allow the user to control a device without performing motor actions, and are usually made up of a biopotential acquisition stage and a personal computer (PC). This structure is very flexible and appropriate for research, but for final users it is necessary to migrate to an embedded system, eliminating the PC from the scheme. The strict real-time processing requirements of such systems justify the choice of a system-on-chip field-programmable gate array (SoC-FPGA) for the implementation. This article proposes a platform for the acquisition and processing of electroencephalographic signals using this type of device, which combines the parallelism and speed of an FPGA with the simplicity of a general-purpose processor on a single chip. In this scheme, the FPGA is in charge of the real-time operation, acquiring and processing the signals, while the processor handles the high-level tasks, with the processing elements interconnected by buses integrated into the chip. The proposed scheme was used to implement a brain-computer interface based on steady-state visual evoked potentials, which was used to command a speller. The first tests of the system show that a selection time of 5 seconds per command can be achieved. The time delay between the user's selection and the system response has been estimated at 343 µs.


2019 ◽  
Vol 26 (1) ◽  
pp. 39-62
Author(s):  
Stanislav O. Bezzubtsev ◽  
Vyacheslav V. Vasin ◽  
Dmitry Yu. Volkanov ◽  
Shynar R. Zhailauova ◽  
Vladislav A. Miroshnik ◽  
...  

The paper proposes the architecture and basic requirements for a network processor for OpenFlow switches of software-defined networks. An analysis of the architectures of two well-known network processors is presented: NP-5 from EZchip (now Mellanox) and Tofino from Barefoot Networks. The advantages and disadvantages of two different versions of network processor architecture are considered: a pipeline-based architecture whose stages are represented by a set of general-purpose processor cores, and a pipeline-based architecture whose stages correspond to cores specialized for specific packet processing operations. Based on a dedicated set of the most common use case scenarios, a new architecture of the network processor unit (NPU) with functionally specialized pipeline stages is proposed. The article describes the simulation model of the NPU of the proposed architecture. The simulation model is implemented in C++ using SystemC, the open-source C++ library. For functional testing of the resulting NPU model, the described use case scenarios were implemented in C. To evaluate the performance of the proposed NPU architecture, a set of software products developed by the KM211 company and the KMX32 family of microcontrollers were used. NPU performance was evaluated on the basis of the simulation model. Estimates of the processing time of one packet and the average throughput of the NPU model are obtained for each scenario.
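The idea of functionally specialized pipeline stages can be sketched in a few lines. This is a toy model, not the NPU or its SystemC simulation: the two-byte header layout, the flow table contents, and every name below are hypothetical and chosen only to show one packet passing through parse, match, and action stages in order.

```python
# Toy sketch of a pipeline with functionally specialized stages.
# The header layout, flow table, and stage names are assumptions.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Packet:
    raw: bytes                                   # assumed header: [dst, ttl]
    fields: Dict[str, int] = field(default_factory=dict)
    out_port: Optional[int] = None               # None means table miss

def parse_stage(pkt: Packet) -> Packet:
    """Extract header fields from the raw bytes."""
    pkt.fields["dst"] = pkt.raw[0]
    pkt.fields["ttl"] = pkt.raw[1]
    return pkt

# Hypothetical flow table: destination id -> output port.
FLOW_TABLE: Dict[int, int] = {1: 10, 2: 20}

def match_stage(pkt: Packet) -> Packet:
    """Look up the destination; leave out_port as None on a miss."""
    pkt.out_port = FLOW_TABLE.get(pkt.fields["dst"])
    return pkt

def action_stage(pkt: Packet) -> Packet:
    """Apply the action for matched packets: decrement the TTL."""
    if pkt.out_port is not None:
        pkt.fields["ttl"] -= 1
    return pkt

PIPELINE: List[Callable[[Packet], Packet]] = [parse_stage, match_stage, action_stage]

def run_pipeline(pkt: Packet) -> Packet:
    """Pass the packet through each specialized stage in order."""
    for stage in PIPELINE:
        pkt = stage(pkt)
    return pkt

hit = run_pipeline(Packet(bytes([2, 64])))    # destination 2 is in the table
miss = run_pipeline(Packet(bytes([9, 64])))   # destination 9 is not
```

Each stage does one kind of work, which is the property that lets a hardware design dedicate a specialized core to it. A pipeline of general-purpose cores would instead run arbitrary code at every stage.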


2014 ◽  
Vol 57 (12) ◽  
pp. 44-48 ◽  
Author(s):  
David Chisnall

VLSI Design ◽  
2016 ◽  
Vol 2016 ◽  
pp. 1-12 ◽  
Author(s):  
Yumin Hou ◽  
Hu He ◽  
Xu Yang ◽  
Deyuan Guo ◽  
Xu Wang ◽  
...  

This paper proposes FuMicro, a fused microarchitecture integrating both in-order superscalar and Very Long Instruction Word (VLIW) execution in a single core. A processor with the FuMicro microarchitecture can switch between in-order superscalar and VLIW modes, using the same pipeline and the same Instruction Set Architecture (ISA). A small modification is made to the compiler to expand the register file in VLIW mode. The decision to switch modes is made by software and needs no extra hardware. VLIW code can be provided in the form of library functions so that users are exposed only to superscalar mode, which gives them a convenient development environment. FuMicro could serve as a universal microarchitecture because it can be applied to different ISAs. In this paper, we focus on the implementation of FuMicro with the ARM ISA. The architecture is evaluated on gem5, a cycle-accurate microarchitecture simulation platform. Adopting the FuMicro microarchitecture improves performance by 10% on average, with a best-case improvement of 47.3%, compared with pure in-order superscalar mode. The results show that the FuMicro microarchitecture can improve Instruction Level Parallelism (ILP) significantly, making it a promising way to expand digital signal processing capability on a General Purpose Processor.

