direct memory access

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Lihang Pan ◽  
Guoqing Tu ◽  
Shubo Liu ◽  
Zhaohui Cai ◽  
Xingxing Xiong

With the growing popularity of the Internet of Things (IoT), its information security has drawn increasing attention. To overcome the resource constraints that hinder secure and reliable data transmission on widely used IoT devices such as wireless sensor network (WSN) nodes, many studies consider hardware acceleration of traditional cryptographic algorithms an effective method. Meanwhile, as a current research topic in reduced instruction set computer (RISC) architectures, RISC-V provides a solid foundation for implementing domain-specific architectures (DSAs). To this end, we propose an extended instruction scheme for the Advanced Encryption Standard (AES) based on RISC-V custom instructions and present a coprocessor designed on the open-source core Hummingbird E203. The AES coprocessor uses direct memory access channels to achieve parallel data access and processing, which provides flexibility in memory space allocation and improves the efficiency of cryptographic components. Applications with embedded AES custom instructions running on an experimental field-programmable gate array (FPGA) prototype demonstrated a 25.3% to 37.9% improvement in running time over previous similar works when processing at least 80 bytes of data. In addition, application-specific integrated circuit (ASIC) experiments show that in most cases the coprocessor consumes only up to 20% more power than the necessary AES operations.


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7759
Author(s):  
Alessandro Cilardo

Efficient data movement in multi-node systems is a crucial issue at the crossroads of scientific computing, big data, and high-performance computing. It affects demanding data acquisition applications from high-energy physics to astronomy, where dedicated accelerators such as FPGA devices play a key role when coupled with high-performance interconnect technologies. Building on the outcome of the RECIPE Horizon 2020 research project, this work evaluates the use of high-bandwidth interconnect standards, namely InfiniBand EDR and HDR, along with remote direct memory access functions for direct exposure of FPGA accelerator memory across a multi-node system. The prototype we present aims to avoid dedicated network interfaces built into the FPGA accelerator itself, leaving most of the resources for user acceleration while supporting state-of-the-art interconnect technologies. We present the details of the proposed system and a quantitative evaluation of end-to-end bandwidth as measured with a real-world FPGA-based multi-node HPC workload.


2021 ◽  
Author(s):  
Mathew David Bourne

Magritek, a company that specialises in NMR and MRI devices, required a new backplane communication solution for data transmission. Possible options were evaluated, and it was decided to move to the PXI Express instrumentation standard. As a first step towards this system, an FPGA-based PXI Express Peripheral Module was designed and constructed. To produce this device, details of PXI Express boards and the signals they require were researched, and schematics were produced. These were then passed on to the board designer, who incorporated them with other design work at Magritek to produce a PXI Express Peripheral Module for use as an NMR transceiver board. With the board designed, the FPGA was configured to provide PXI Express functionality, allowing PCI Express transfers at high data speeds using Direct Memory Access (DMA). The PXI Express Peripheral board was then tested and found to function correctly, providing Memory Write speeds of 228 MB/s and Memory Read speeds of 162 MB/s. In addition, to provide a test system for this physical and FPGA design, backplanes were designed to test communication between PXI Express modules.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Na Li ◽  
Xueqing Song

Text is one of the main carriers of information: it carries human civilization, spreads knowledge, promotes culture, and records history. How to read more information in a limited time, that is, how to improve reading efficiency, has become a problem for current technology to solve. This paper combines the existing wearable device concept with a wireless intelligent sensor system to design a wearable reading assistance system for blind and partially sighted people. Based on a study and comparison of existing text recognition products, it improves their functions and implementation platform and, combined with a wireless network, designs a wearable device that provides foreign text recognition and cognitive-state-based reading assistance, thereby improving reading efficiency. The paper proposes a method for decoding foreign text on an embedded platform with relatively few resources. Image acquisition, binarization, and compressed storage are completed quickly through the bit and storage area and the DMA (direct memory access) double-buffering mechanism unique to the chip selected in this paper. A connected boundary tracking algorithm is used to find the foreign text locators, which avoids a large number of floating-point operations. The image is not rotated; instead, it is sampled directly at its current rotation angle, and the foreign text bitstream information is then acquired to complete the decoding.


Author(s):  
Mathieu Gross ◽  
Nisha Jacob ◽  
Andreas Zankl ◽  
Georg Sigl

FPGA-SoCs are heterogeneous embedded computing platforms consisting of reconfigurable hardware and high-performance processing units. This combination offers flexibility and good performance for the design of embedded systems. However, sharing resources between an FPGA and an embedded CPU enables possible attacks from one system on the other. This work demonstrates that a malicious hardware block inside the reconfigurable logic can manipulate the memory and peripherals of the CPU. Previous works have already considered direct memory access attacks from malicious logic on platforms with no memory isolation mechanism. In this work, such attacks are investigated on a modern platform that contains state-of-the-art memory and peripheral isolation mechanisms. We demonstrate two attacks capable of compromising a Trusted Execution Environment based on ARM TrustZone and show a new attack capable of bypassing the secure boot configuration set by a device owner by manipulating Battery-Backed RAM and eFuses from malicious logic.


2021 ◽  
Vol 17 (3) ◽  
pp. 1-25
Author(s):  
Bohong Zhu ◽  
Youmin Chen ◽  
Qing Wang ◽  
Youyou Lu ◽  
Jiwu Shu

Non-volatile memory and remote direct memory access (RDMA) provide extremely high performance in storage and network hardware. However, existing distributed file systems strictly isolate the file system and network layers, and their heavily layered software designs leave high-speed hardware under-exploited. In this article, we propose an RDMA-enabled distributed persistent memory file system, Octopus+, which redesigns file system internal mechanisms by closely coupling non-volatile memory and RDMA features. For data operations, Octopus+ directly accesses a shared persistent memory pool to reduce memory copying overhead, and actively fetches and pushes data all in clients to rebalance the load between the server and network. For metadata operations, Octopus+ introduces self-identified remote procedure calls for immediate notification between file systems and networking, and an efficient distributed transaction mechanism for consistency. Octopus+ also supports replication to provide better availability. Evaluations on Intel Optane DC Persistent Memory Modules show that Octopus+ achieves nearly the raw bandwidth for large I/Os and orders of magnitude better performance than existing distributed file systems.


2021 ◽  
Vol 17 (3) ◽  
pp. 1-32
Author(s):  
Xingda Wei ◽  
Rong Chen ◽  
Haibo Chen ◽  
Binyu Zang

RDMA (Remote Direct Memory Access) has gained considerable interest in network-attached in-memory key-value stores. However, traversing the remote tree-based index in ordered key-value stores with RDMA becomes a critical obstacle, causing an order-of-magnitude slowdown and limited scalability due to multiple round trips. Using an index cache in the conventional way (caching partial data and traversing it locally) usually has limited effect because of unavoidable capacity misses, massive random accesses, and costly cache invalidations. We argue that a machine learning (ML) model is a perfect cache structure for the tree-based index, termed a learned cache. Based on it, we design and implement XStore, an RDMA-based ordered key-value store with a new hybrid architecture that retains a tree-based index at the server to perform dynamic workloads (e.g., inserts) and leverages a learned cache at the client to perform static workloads (e.g., gets and scans). The key idea is to decouple ML model retraining from index updating by maintaining a layer of indirection from logical to actual positions of key-value pairs. This allows a stale learned cache to continue predicting a correct position for a lookup key. XStore ensures correctness using a validation mechanism with a fallback path and further uses speculative execution to minimize the cost of cache misses. Evaluations with YCSB benchmarks and production workloads show that a single XStore server can achieve over 80 million read-only requests per second. This number outperforms state-of-the-art RDMA-based ordered key-value stores (namely, DrTM-Tree, Cell, and eRPC+Masstree) by up to 5.9× (from 3.7×). For workloads with inserts, XStore still provides up to 3.5× (from 2.7×) throughput speedup, achieving 53M reqs/s. The learned cache can also reduce client-side memory usage and further provides an efficient memory-performance tradeoff, e.g., saving 99% memory at the cost of 20% peak throughput.
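
The predict, validate, and fall back pattern described in this abstract can be illustrated with a toy, single-process sketch. This is not XStore's implementation: the class name `LearnedCache`, the two-point linear model, and the local sorted list that stands in for the server-side tree index are all assumptions made for illustration, and neither RDMA nor the layer of indirection is modelled.

```python
import bisect


class LearnedCache:
    """Toy learned index (hypothetical): a linear model guesses a key's position."""

    def __init__(self, sorted_keys):
        # Assumes a non-empty list of numeric keys in ascending order.
        self.keys = list(sorted_keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Deliberately crude model: a line through the first and last key.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Worst prediction error over the keys seen at "training" time.
        self.max_err = max(abs(self._predict(k) - i)
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def insert(self, key):
        # The index changes, but the model is NOT retrained (it becomes stale).
        bisect.insort(self.keys, key)

    def lookup(self, key):
        """Return (index, used_fallback); index is None if the key is absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # Search only the small window around the predicted position.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < hi and self.keys[i] == key:  # validation of the prediction
            return i, False
        # Fallback path: a full search, standing in for a remote tree traversal.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return i, True
        return None, True
```

In this sketch every key present at training time is found inside the predicted window, while a key whose position has shifted after untrained inserts is still found correctly, only through the slower fallback path.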


Author(s):  
Daniel Vaquerizo-Hdez ◽  
Pablo Muñoz ◽  
David F. Barrero ◽  
Maria D. R-Moreno

Measuring the consumption of electronic devices is a difficult and sensitive task. Data acquisition (DAQ) systems are often used to determine such consumption. In theory, measuring energy consumption is straightforward: by acquiring current and voltage signals, we can determine the consumption. However, a number of issues arise when a fine analysis is required. The main problem is that sampling frequencies have to be high enough to detect variations in the assessed signals over time. In that regard, some popular DAQ systems are based on RISC ARM processors for microcontrollers combined with analog-to-digital converters to meet high-frequency acquisition requirements. The efficient use of direct memory access (DMA) modules combined with pipelined processing in a microcontroller makes it possible to improve the sample rate, overcoming the limitations of processing time and the internal communication protocol. This paper presents a novel approach for high-frequency energy measurement composed of a DMA rate improvement (data acquisition logic), a data processing logic, and low-cost hardware. The contribution of the paper is the combination of a double-buffered signal acquisition mechanism and an algorithm that computes the device's energy consumption using parallel data processing. The combination of these elements enables high-frequency (continuous) energy consumption measurement of an electronic device, improving the accuracy and reducing the cost of existing systems. We have validated our approach by measuring the energy consumed by elemental circuits and wireless sensor network (WSN) motes. The results indicate that the energy measurement error is less than 5% and that the proposed method is suitable for measuring WSN motes even during sleep cycles, enabling a better characterization of their consumption profile.
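
The idea of computing energy from current and voltage samples while acquisition alternates between two buffers can be sketched as follows. This is a sequential simulation under stated assumptions, not the paper's firmware: the function names, the rectangle-rule integration, and the `(voltage, current)` pair format are hypothetical, and on real hardware a DMA controller would fill one buffer while the CPU processes the other.

```python
def energy_joules(voltage, current, sample_rate_hz):
    """Integrate instantaneous power v * i over time with a rectangle rule."""
    dt = 1.0 / sample_rate_hz
    return sum(v * i for v, i in zip(voltage, current)) * dt


def measure_double_buffered(samples, buffer_len, sample_rate_hz):
    """Accumulate energy block by block using two alternating buffers.

    `samples` is an iterable of (voltage, current) pairs standing in for
    ADC output; the append below plays the role of the DMA transfer.
    """
    buffers = [[], []]
    active = 0      # index of the buffer currently being filled
    total = 0.0
    for pair in samples:
        buffers[active].append(pair)
        if len(buffers[active]) == buffer_len:
            full = active
            active ^= 1                  # acquisition moves to the other buffer
            buffers[active].clear()      # that buffer was already processed
            v, i = zip(*buffers[full])   # process the block that just filled
            total += energy_joules(v, i, sample_rate_hz)
    if buffers[active]:                  # flush a final, partially filled block
        v, i = zip(*buffers[active])
        total += energy_joules(v, i, sample_rate_hz)
    return total
```

Because each block is integrated independently and the partial results are summed, the block-wise total matches a single pass over all samples up to floating-point rounding, which is what lets processing proceed one buffer at a time.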

