Design of Advanced High Performance Bus Tracer in System on Chip Using Matrix Based Compression for Low Power Applications

2020 ◽  
Vol 17 (4) ◽  
pp. 1852-1856
Author(s):  
P. Bhuvaneshwari ◽  
T. R. Jaya Chandra Lekha

This project proposes multilayer advanced high-performance bus architecture for low power applications. The proposed AHB architecture consists of the bus arbiter and the bus tracer (A.R.M.A., 1999. Specification (Rev 2.0) ARM IHI0011A). The bus arbiter, which is self motivated selects the input packet based on the control signals of the incoming packet. So that arbitration leads to a maximum performance. The On-Chip bus is an important system-on-chip infrastructure that connects major hardware components. Monitoring the on-chip bus signals is crucial to the SoC debugging and performance analysis/optimization (Gu, R.T., et al., 2007. A Low Cost Tile-Based 3D Graphics Full Pipeline with Real-Time Performance Monitoring Support for OpenGL ES in Consumer Electronics. 2007 IEEE International Symposium on Consumer Electronics, June; IEEE. pp.1–6). But, such signals are difficult to observe since they are deeply embedded in a SoC and there are often no sufficient I/O pins to access these signals. Therefore, a straightforward approach is to embed a bus tracer in SoC to capture the bus signal trace and store the trace in on-chip storage such as the trace memory which could then be off loaded to outside world for analysis. The bus tracer is capable of capturing the bus trace with different resolutions, all with efficient built in compression mechanisms such as dictionary based compression scheme for address and control signals and differential compression scheme for data. To improve the compression ratio matrix based compression which is lossless compression is used instead of differential compression. This system is designed using Verilog HDL, simulated using Modelsim and synthesized using Xilinx software.

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1587
Author(s):  
Duo Sheng ◽  
Hsueh-Ru Lin ◽  
Li Tai

High performance and complex system-on-chip (SoC) design require a throughput and stable timing monitor to reduce the impacts of uncertain timing and implement the dynamic voltage and frequency scaling (DVFS) scheme for overall power reduction. This paper presents a multi-stage timing monitor, combining three timing-monitoring stages to achieve a high timing-monitoring resolution and a wide timing-monitoring range simultaneously. Additionally, because the proposed timing monitor has high immunity to the process–voltage–temperature (PVT) variation, it provides a more stable time-monitoring results. The time-monitoring resolution and range of the proposed timing monitor are 47 ps and 2.2 µs, respectively, and the maximum measurement error is 0.06%. Therefore, the proposed multi-stage timing monitor provides not only the timing information of the specified signals to maintain the functionality and performance of the SoC, but also makes the operation of the DVFS scheme more efficient and accurate in SoC design.


2020 ◽  
Vol 2 (3) ◽  
pp. 158-168
Author(s):  
Muhammad Raza Naqvi

Mostly communication now days is done through SoC (system on chip) models so, NoC (network on chip) architecture is most appropriate solution for better performance. However, one of major flaws in this architecture is power consumption. To gain high performance through this type of architecture it is necessary to confirm power consumption while designing this. Use of power should be diminished in every region of network chip architecture. Lasting power consumption can be lessened by reaching alterations in network routers and other devices used to form that network. This research mainly focusses on state-of-the-art methods for designing NoC architecture and techniques to reduce power consumption in those architectures like, network architecture, network links between nodes, network design, and routers.


Circuit World ◽  
2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Chen Kuilin ◽  
Feng Xi ◽  
Fu Yingchun ◽  
Liu Liang ◽  
Feng Wennan ◽  
...  

Purpose The data protection is always a vital problem in the network era. High-speed cryptographic chip is an important part to ensure data security in information interaction. This paper aims to provide a new peripheral component interconnect express (PCIe) encryption card solution with high performance, high integration and low cost. Design/methodology/approach This work proposes a System on Chip architecture scheme of high-speed cryptographic chip for PCIe encryption card. It integrated CPU, direct memory access, the national and international cipher algorithm (data encryption standard/3 data encryption standard, Rivest–Shamir–Adleman, HASH, SM1, SM2, SM3, SM4, SM7), PCIe and other communication interfaces with advanced extensible interface-advanced high-performance bus three-level bus architecture. Findings This paper presents a high-speed cryptographic chip that integrates several high-speed parallel processing algorithm units. The test results of post-silicon sample shows that the high-speed cryptographic chip can achieve Gbps-level speed. That means only one single chip can fully meet the requirements of cryptographic operation performance for most cryptographic applications. Practical implications The typical application in this work is PCIe encryption card. Besides server’s applications, it can also be applied in terminal products such as high-definition video encryption, security gateway, secure routing, cloud terminal devices and industrial real-time monitoring system, which require high performance on data encryption. Social implications It can be well applied on many other fields such as power, banking, insurance, transportation and e-commerce. Originality/value Compared with the current strategy of high-speed encryption card, which mostly uses hardware field-programmable gate arrays or several low-speed algorithm chips through parallel processing in one printed circuit board, this work has provided a new PCIe encryption card solution with high performance, high integration and low cost only in one chip.


2021 ◽  
Vol 59 (7) ◽  
pp. 101-107
Author(s):  
Il-Gu Lee ◽  
Duk Bai Kim ◽  
Jeongki Choi ◽  
Hyungu Park ◽  
Sok-Kyu Lee ◽  
...  

Electronics ◽  
2019 ◽  
Vol 8 (5) ◽  
pp. 528 ◽  
Author(s):  
Julian Viejo ◽  
Jorge Juan-Chico ◽  
Manuel J. Bellido ◽  
Paulino Ruiz-de-Clavijo ◽  
David Guerrero ◽  
...  

This paper presents the complete design and implementation of a low-cost, low-footprint, network time protocol server core for field programmable gate arrays. The core uses a carefully designed modular architecture, which is fully implemented in hardware using digital circuits and systems. Most remarkable novelties introduced are a hardware-optimized timekeeping algorithm implementation, and a full-hardware protocol stack and automatic network configuration. As a result, the core is able to achieve similar accuracy and performance to typical high-performance network time protocol server equipment. The core uses a standard global positioning system receiver as time reference, has a small footprint and can easily fit in a low-range field-programmable chip, greatly scaling down from previous system-on-chip time synchronization systems. Accuracy and performance results show that the core can serve hundreds of thousands of network time clients with negligible accuracy degradation, in contrast to state-of-the-art high-performance time server equipment. Therefore, this core provides a valuable time server solution for a wide range of emerging embedded and distributed network applications such as the Internet of Things and the smart grid, at a fraction of the cost and footprint of current discrete and embedded solutions.


Electronics ◽  
2020 ◽  
Vol 9 (7) ◽  
pp. 1148
Author(s):  
Antonio F. Díaz ◽  
Ilia Blokhin ◽  
Mancia Anguita ◽  
Julio Ortega ◽  
Juan J. Escobar

Multifactor authentication is a relevant tool in securing IT infrastructures combining two or more credentials. We can find smartcards and hardware tokens to leverage the authentication process, but they have some limitations. Users connect these devices in the client node to log in or request access to services. Alternatively, if an application wants to use these resources, the code has to be amended with bespoke solutions to provide access. Thanks to advances in system-on-chip devices, we can integrate cryptographically robust, low-cost solutions. In this work, we present an autonomous device that allows multifactor authentication in client–server systems in a transparent way, which facilitates its integration in High-Performance Computing (HPC) and cloud systems, through a generic gateway. The proposed electronic token (eToken), based on the system-on-chip ESP32, provides an extra layer of security based on elliptic curve cryptography. Secure communications between elements use Message Queuing Telemetry Transport (MQTT) to facilitate their interconnection. We have evaluated different types of possible attacks and the impact on communications. The proposed system offers an efficient solution to increase security in access to services and systems.


Author(s):  
Medhat Awadalla ◽  
Ahmed M. Sadek

To meet the growing computation-intensive applications and the needs of low-power, high-performance systems, the number of computing resources in single-chip has enormously increased. By adding many computing resources to build a system in System-on-Chip, its interconnection between each other becomes another challenging issue. In most System-on-Chip applications, a shared bus interconnection which needs an arbitration logic to serialize several bus access requests, is adopted to communicate with each integrated processing unit because of its low-cost and simple control characteristics. This paper focuses on the interconnection design issues of area, power and performance of chip multi-processors with shared cache memory. It shows that having shared cache memory contributes to the performance improvement, however, typical interconnection between cores and the shared cache using crossbar occupies most of the chip area, consumes a lot of power and does not scale efficiently with increased number of cores. New interconnection mechanisms are needed to address these issues. This paper proposes an architectural paradigm in an attempt to gain the advantages of having shared cache with the avoidance of penalty imposed by the crossbar interconnect. The proposed architecture achieves smaller area occupation allowing more space to add additional cache memory. It also reduces power consumption compared to the existing crossbar architecture. Furthermore, the paper presents a modified cache coherence algorithm called Tuned-MESI. It is based on the typical MESI cache coherence algorithm however it is tuned and tailored for the suggested architecture. The achieved results of the conducted simulated experiments show that the developed architecture produces less broadcast operations compared to the typical algorithm.


Author(s):  
A. Ferrerón Labari ◽  
D. Suárez Gracia ◽  
V. Viñals Yúfera

In the last years, embedded systems have evolved so that they offer capabilities we could only find before in high performance systems. Portable devices already have multiprocessors on-chip (such as PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful multi-level cache memory hierarchy on-chip. As most of these systems are battery-powered, the power consumption becomes a critical issue. Achieving high performance and low power consumption is a high complexity challenge where some proposals have been already made. Suarez et al. proposed a new cache hierarchy on-chip, the LP-NUCA (Low Power NUCA), which is able to reduce the access latency taking advantage of NUCA (Non-Uniform Cache Architectures) properties. The key points are decoupling the functionality, and utilizing three specialized networks on-chip. This structure has been proved to be efficient for data hierarchies, achieving a good performance and reducing the energy consumption. On the other hand, instruction caches have different requirements and characteristics than data caches, contradicting the low-power embedded systems requirements, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to re-evaluate completely our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP environments (chip multiprocessors) with parallel workloads, coherence plays an important role, and must be taken into consideration.


Sign in / Sign up

Export Citation Format

Share Document