An Embedded Multicore Platform Exploration in Video Application Utilizing FPGA

2012 ◽  
Vol 505 ◽  
pp. 329-337
Author(s):  
Chun Hua Xiao ◽  
Zhang Qin Huang ◽  
Da Li

Multiprocessing is not a new technology, but advances in modern silicon technology have made it possible to integrate multiple cores in a single chip package, known as a multicore processor. In both desktop personal computers and embedded applications, multicore processors have become a general trend, driven by the demand for high performance and by the design problems of single-core processors. Surround multi-screen displays provide a better sense of reality and are widely used in surveillance, military, exhibition, and similar settings. With its advantages in parallel processing, multicore technology has important practical significance and broad prospects in these applications. This paper explores multicore architecture from the perspectives of processing elements, memory hierarchy, and on-chip interconnection. A basic platform for multi-screen display is implemented on a Xilinx field programmable gate array (FPGA), and it delivers 3.6 times higher performance than the corresponding single-core design, which provides helpful guidance for further research.

2021 ◽  
Vol 49 (4) ◽  
pp. 1025-1034
Author(s):  
Vo Cong

Field-programmable gate arrays (FPGAs) and, more recently, System on Chip (SoC) devices have been applied in a wide range of applications because of their flexibility for real-time implementations, which increases both the processing capability of the hardware and the speed of processing information in real time. The most important applications of FPGA/SoC devices are in signal and image processing, Internet of Things (IoT) technology, artificial intelligence (AI) algorithms, energy systems, automatic control, and industrial applications. This paper develops a robot arm controller based on a programmable System-on-Chip (SoC) device that combines the high performance and flexibility of a CPU with the processing power of an FPGA. The CPU consists of a dual-core ARM processor that handles algorithm calculations and motion planning and manages communication and data manipulation. The FPGA is mainly used to generate the signals that control the servo and to read the feedback signals from the encoders. Data from the ARM processor is transferred to the programmable logic side via the AXI protocol. This combination delivers superior parallel processing and computing power, real-time performance, and versatile connectivity. Additionally, having the complete controller on a single chip makes the hardware design simpler, more reliable, and less expensive.


Author(s):  
Ram Prasad Mohanty ◽  
Ashok Kumar Turuk ◽  
Bibhudatta Sahoo

The growing number of cores increases the demand for a powerful memory subsystem, which leads to larger caches in multicore processors. Caches give processing elements a faster, higher-bandwidth local memory to work with. In this chapter, an attempt has been made to analyze the impact of cache size on the performance of multicore processors by varying the L1 and L2 cache sizes on the multicore processor with internal network (MPIN) referenced from the NIAGRA architecture. As the number of cores increases, traditional on-chip interconnects such as the bus and the crossbar prove to be inefficient and suffer from poor scalability. To overcome the scalability and efficiency issues of these conventional interconnects, ring-based designs have been proposed. The effect of the interconnect on the performance of multicore processors has been analyzed, and a novel scalable on-chip interconnection mechanism (INoC) for multicore processors has been proposed. The benchmark results are obtained with a full-system simulator. The results show that, with the proposed INoC, the execution time is significantly reduced compared with the MPIN.


2021 ◽  
Author(s):  
Isiaka A. Alimi ◽  
Romil K. Patel ◽  
Oluyomi Aboderin ◽  
Abdelgader M. Abdalla ◽  
Ramoni A. Gbadamosi ◽  
...  

Advances in integration technology have shaped the System-on-Chip (SoC), in which heterogeneous cores are supported on a single chip. Given the large number of heterogeneous cores supported, efficient communication between the associated processors has to be considered at all levels of the system design to ensure global interconnection. This can be achieved through a design-friendly, flexible, scalable, and high-performance interconnection architecture. It is noteworthy that the interconnections between multiple cores on a chip have a considerable influence on the performance and communication of the chip design in terms of throughput, end-to-end delay, and packet loss ratio. Although hierarchical architectures have addressed the majority of the challenges associated with traditional interconnection techniques, their main limiting factor is scalability. Network-on-Chip (NoC) has been presented as a scalable and well-structured alternative that is capable of addressing communication issues in on-chip systems. In this context, several NoC topologies have been presented to support various routing techniques and to meet different chip architectural requirements. This book chapter reviews some of the existing NoC topologies and their associated characteristics. Application mapping algorithms and some key challenges of NoC are also considered.
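To make the topology and routing discussion in the abstract above concrete, the following is a minimal sketch in Python. The abstract does not name specific topologies or routing algorithms; the 2D mesh, the bidirectional ring, and dimension-ordered XY routing are common textbook examples chosen here as assumptions, and all function names are hypothetical.

```python
from itertools import product


def xy_route(src, dst):
    """Routers visited by dimension-ordered XY routing on a 2D mesh.

    The packet first travels along X until the column matches, then along Y.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def mesh_hops(src, dst):
    """Hop count between two mesh routers (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def ring_hops(src, dst, n):
    """Hop count on a bidirectional ring of n routers (shorter direction)."""
    d = abs(src - dst)
    return min(d, n - d)


def average_mesh_hops(width, height):
    """Average hop count over all ordered pairs of distinct mesh routers."""
    nodes = list(product(range(width), range(height)))
    pairs = [(a, b) for a in nodes for b in nodes if a != b]
    return sum(mesh_hops(a, b) for a, b in pairs) / len(pairs)


def average_ring_hops(n):
    """Average hop count over all ordered pairs of distinct ring routers."""
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    return sum(ring_hops(a, b, n) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    print(average_mesh_hops(4, 4), average_ring_hops(16))
```

Under these assumptions, a 16-router system averages about 2.67 hops on a 4x4 mesh and about 4.27 hops on a ring, which illustrates one way topology choice affects end-to-end delay. Hop count is only a rough proxy and ignores contention, link width, and router latency.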


2015 ◽  
Vol 2015 ◽  
pp. 1-12
Author(s):  
Mahendra Vucha ◽  
Arvind Rajawat

Modern embedded systems are being modeled as Reconfigurable High Speed Computing Systems (RHSCS), in which reconfigurable hardware, that is, a Field Programmable Gate Array (FPGA), and softcore processors configured on the FPGA act as computing elements. As system complexity increases, efficient task distribution methodologies are essential to obtain high performance. A dynamic task distribution methodology based on the Minimum Laxity First (MLF) policy (DTD-MLF) distributes the tasks of an application dynamically onto the RHSCS and utilizes the available RHSCS resources effectively. The DTD-MLF methodology takes advantage of the runtime design parameters of an application represented as a directed acyclic graph (DAG) and considers the attributes of the tasks in the DAG and of the computing resources when distributing the tasks onto the RHSCS. In this paper, we describe the DTD-MLF model and verify its effectiveness by distributing several real-life benchmark applications onto an RHSCS configured on a Virtex-5 FPGA device. The benchmark applications are represented as DAGs and are distributed to the resources of the RHSCS based on the DTD-MLF model. The performance of the MLF-based dynamic task distribution methodology is compared with that of a static task distribution methodology. The comparison shows that the dynamic task distribution model with the MLF criterion outperforms the static task distribution techniques in terms of schedule length and effective utilization of the available RHSCS resources.
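The Minimum Laxity First policy mentioned in the abstract above can be sketched as follows. This is an illustrative Python sketch, not the authors' DTD-MLF implementation: it assumes identical computing resources (the paper targets a mix of FPGA logic and softcore processors), uses the standard definition of laxity (deadline minus current time minus remaining execution time), and the `Task` fields and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    exec_time: int   # assumed execution time, in abstract time units
    deadline: int    # assumed absolute deadline
    preds: list = field(default_factory=list)  # names of predecessor tasks


def laxity(task, now):
    """Slack left if the task started right now."""
    return task.deadline - now - task.exec_time


def mlf_schedule(tasks, num_resources):
    """Schedule a DAG of tasks, always picking the ready task with least laxity.

    Returns a list of (task name, resource index, start, finish) tuples.
    Assumes the graph is acyclic and every predecessor name exists.
    """
    by_name = {t.name: t for t in tasks}
    finish = {}                      # task name -> finish time
    free_at = [0] * num_resources    # time at which each resource is free
    schedule = []
    pending = set(by_name)
    while pending:
        # Take the resource that becomes free first.
        r = min(range(num_resources), key=lambda i: free_at[i])
        now = free_at[r]
        # A task is ready once all predecessors have finished by `now`.
        ready = [
            by_name[n] for n in pending
            if all(p in finish and finish[p] <= now for p in by_name[n].preds)
        ]
        if not ready:
            # Idle this resource until the next running task completes.
            free_at[r] = min(f for f in finish.values() if f > now)
            continue
        task = min(ready, key=lambda t: (laxity(t, now), t.name))
        end = now + task.exec_time
        finish[task.name] = end
        free_at[r] = end
        schedule.append((task.name, r, now, end))
        pending.remove(task.name)
    return schedule


if __name__ == "__main__":
    dag = [
        Task("A", exec_time=2, deadline=10),
        Task("B", exec_time=3, deadline=6),
        Task("C", exec_time=1, deadline=9, preds=["A", "B"]),
    ]
    for entry in mlf_schedule(dag, num_resources=2):
        print(entry)
```

In this toy DAG, task B is dispatched first because its laxity at time 0 (3 units) is smaller than that of task A (8 units), and C waits until both predecessors have finished. The schedule length is the largest finish time in the returned list, which is the metric the abstract compares against static distribution.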


2017 ◽  
Vol 13 (1) ◽  
pp. 38-45
Author(s):  
Noor Jumaa

Everything is on its way to being computerized, and most objects are becoming smart nowadays. The modern Internet of Things (IoT) allows these objects to be on the network by using IoT platforms. IoT is a smart information society that consists of smart devices; these devices can communicate with each other without human intervention. IoT systems require flexible platforms. Through the use of a Field Programmable Gate Array (FPGA), IoT devices can interface with the outside world easily, with low power consumption, low latency, and the best determinism. FPGAs enable the System on Chip (SoC) technique because their scalability allows the designer to implement and integrate a large number of hardware clocks on a single chip. An FPGA can be regarded as a special-purpose reprogrammable processor, since it can process signals at its input pins, manipulate them, and drive signals on its output pins. This paper focuses on the use of FPGAs for IoT.


Author(s):  
Wei-Wen Lin ◽  
Jih-Sheng Shen ◽  
Pao-Ann Hsiung

With the progress of technology, more and more intellectual properties (IPs) can be integrated into a single chip. The performance bottleneck has shifted from the computation in individual IPs to the communication among IPs. The Network-on-Chip (NoC) was proposed to provide high scalability and parallel communication. An ASIC-implemented NoC lacks flexibility and has a high non-recurring engineering (NRE) cost. As an alternative, an NoC can be implemented in a Field Programmable Gate Array (FPGA). In addition, FPGA devices can support dynamic partial reconfiguration, so that hardware circuits can be configured into an FPGA at run time when necessary, without interfering with hardware circuits that are already running. Such an FPGA-based NoC, called a reconfigurable NoC (RNoC), is more flexible, and its NRE cost is much lower than that of an ASIC-based NoC. Dynamic partial reconfiguration raises several issues in RNoC design. We focus on how communication between hardware and software can be made efficient for RNoC. We implement three communication architectures for RNoC: a single output FIFO-based architecture, a multiple output FIFO-based architecture, and a shared memory-based architecture. The average communication memory overhead is lower on the single output FIFO-based architecture and the shared memory-based architecture than on the multiple output FIFO-based architecture when the lifetime interval is smaller than 0.5. The performance analysis uses several real applications. These examples show that the multiple output FIFO-based architecture performs as much as 1.789 times better than the single output FIFO-based architecture, and that the shared memory-based architecture performs as much as 1.748 times better than the single output FIFO-based architecture.


2013 ◽  
Vol 135 (2) ◽  
Author(s):  
Wataru Nakayama

The objective of this study is to understand the effects of various parameters involved in chip design and cooling on the occurrence of hot spots on a multicore processor chip. The thermal environment for the die is determined by the cooling design, which differs distinctly between different classes of electronic equipment. In the present study, as in many other hot spot studies, the effective heat transfer coefficient represents the thermal environment for the die, but its representative values are derived for different cooling schemes in order to examine in which classes of electronic equipment the hot spot concern grows. The cooling modes under study are high-performance air-cooling, high-performance liquid-cooling, conventional air-cooling, and oil-cooling in an infrared radiation (IR) thermography setup. Temperature calculations were performed on a model designed to facilitate the study of several questions that have not been fully addressed in the existing literature. These questions concern the granularity of power and temperature distributions, thermal interactions between circuits on the die, the roles of the on-chip wiring layer and the buried oxide in heat spreading, and the mechanism that produces temperature contrast across the die. The main results of the calculations are the temperature of the target spot and the temperature contrast across the die. Temperature contrasts are predicted in a range of 10–25 K, and the results indicate that a major part of the temperature contrast is formed at a granularity corresponding to the size of functional units on actual microprocessor chips. At a fine granularity level and under a scenario of high power concentration, the on-chip wiring layer and the buried oxide play some role in heat spreading, but their impact on the temperature is generally small. However, the details of circuits need to be taken into account in future studies in order to investigate the possibility of nanometer-scale hot spots. Attention is also called to the need to understand the effect of temperature nonuniformity on processor performance, for which low temperature at inactive cells makes a major contribution. In contrast to the situation for the die under forced convection cooling, the die in passively cooled compact equipment is in a distinctly different thermal environment. Strong thermal coupling between the die and the system structure necessitates the integration of package-level and system-level analysis with the die-level analysis.


1993 ◽  
Vol 323 ◽  
Author(s):  
H. F. Lockwood ◽  
C. A. Armiento

The principal driver behind advanced hardware development in the communications and computer industries can be reduced to an optimal set of parameters related to performance, cost and reliability. High performance systems typically have high functional density. For example, the continuing trend of VLSI is toward reduced feature size, increased wiring density and larger chip size to achieve increasingly higher levels of on-chip functionality. At some point in the cost structure, however, the single chip solution is no longer viable, and monolithic integration gives way to hybrid integration. In this respect, the multichip module fills a void in the packaging/integration hierarchy between the ever-larger single chip and the printed wiring board. An analogous situation is emerging in optoelectronics. The single chip package with its relatively low system functionality and high cost is giving way to the multi-technology module that integrates optical and electronic functions within a single package. One of the most interesting approaches to the multi-technology module uses a silicon substrate as the platform for hybrid integration of electronics and optoelectronics. It will be argued that this “silicon waferboard” approach is the cost-effective route to manufacturability of high-performance modules for communications and computer systems. Enhanced reliability follows from applying standard IC processing technology at the platform level in the packaging hierarchy.


2020 ◽  
Vol 17 (1) ◽  
pp. 239-245
Author(s):  
Maddula N. V. Sesha Saiteja ◽  
K. Sai Sumanth Reddy ◽  
D. Radha ◽  
Minal Moharir

Technology improves in performance and shrinks in size day by day. Reducing size can increase density, which in turn can improve performance. These statements apply well to improvements in computer architecture. The System on Chip (SoC) brought the concept of multiple cores on a single chip, and multi-core or many-core architectures are the future of computing. Technology has succeeded in reducing size and increasing density, but improving performance in line with the expectations raised by adding more cores remains a challenge for many-core technology. Utilizing all cores and improving the execution performance of those cores are the challenges to be addressed. This paper discusses the basics of many-core architecture, with comparisons and applications. It further covers the basics of Network on Chip (NoC), its architectural components, and various views of current NoC research problems. These research problems include improving communication performance by avoiding congested paths in routing.

