memory modules
Recently Published Documents


TOTAL DOCUMENTS

117
(FIVE YEARS 32)

H-INDEX

12
(FIVE YEARS 3)

Author(s):  
Druva Kumar S. ◽  
Roopa M.

Multi-ported memories perform multiple read and write operations simultaneously and are used in advanced digital designs on reprogrammable field-programmable gate arrays (FPGAs) to achieve higher bandwidth. The memory modules are configured with block RAMs (BRAMs), which consume more area and power on the FPGA. In this manuscript, techniques to increase the number of read ports of multi-ported memory modules are designed using the bank division with XOR (BDX) approach. Three read-port techniques are implemented on an FPGA platform: a two-read, one-write (2R1W) memory; a hybrid-mode approach that operates as either a 2R1W or a 4R memory; and a hierarchical BDX (HBDX) approach using 2R1W/4R memory. The proposed work uses only slices and look-up tables (LUTs) rather than BRAMs when building the memory modules on the FPGA, which reduces the computational complexity and improves system performance. The experimental results are analyzed on an Artix-7 FPGA. Performance parameters such as slice count, LUT utilization, maximum frequency (Fmax), and hardware efficiency are analyzed for different memory depths. The 4R1W memory design using the HBDX approach utilizes 4% of the slices and operates at 449.697 MHz on the Artix-7 FPGA. The proposed work provides a basis for choosing the appropriate read-port technique when designing an efficient modular multiport memory architecture.
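The abstract does not give the internal structure of the BDX design, so the following is only a behavioural sketch of the general idea: two data banks plus an XOR bank, assuming a conflicting second read is rebuilt from the other bank and the XOR bank. The class name `Bdx2R1W` and its methods are hypothetical and are not taken from the paper; the real design is hardware built from slices and LUTs.

```python
class Bdx2R1W:
    """Behavioural model (assumed, not the paper's RTL) of a 2R1W memory
    made of two data banks and one XOR bank."""

    def __init__(self, depth):
        if depth <= 0 or depth % 2:
            raise ValueError("depth must be a positive even number")
        self.depth = depth
        self.half = depth // 2
        self.banks = [[0] * self.half, [0] * self.half]
        # Invariant: xor_bank[i] == banks[0][i] ^ banks[1][i]
        self.xor_bank = [0] * self.half

    def _locate(self, addr):
        if not 0 <= addr < self.depth:
            raise IndexError("address out of range")
        return divmod(addr, self.half)  # (bank index, offset in bank)

    def write(self, addr, data):
        bank, off = self._locate(addr)
        self.banks[bank][off] = data
        # Keep the XOR bank consistent with both data banks.
        self.xor_bank[off] = data ^ self.banks[1 - bank][off]

    def read2(self, addr_a, addr_b):
        bank_a, off_a = self._locate(addr_a)
        bank_b, off_b = self._locate(addr_b)
        value_a = self.banks[bank_a][off_a]
        if bank_a != bank_b:
            # No conflict: each read uses its own bank.
            value_b = self.banks[bank_b][off_b]
        else:
            # Bank conflict: rebuild the second word from the other
            # data bank and the XOR bank instead of the busy bank.
            value_b = self.banks[1 - bank_b][off_b] ^ self.xor_bank[off_b]
        return value_a, value_b


mem = Bdx2R1W(8)
for address in range(8):
    mem.write(address, address * 17 + 3)
```

With this model, `mem.read2(1, 2)` exercises the conflict path (both addresses fall in the first bank), while `mem.read2(1, 6)` reads the two banks directly.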


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Qingfeng Zhuge ◽  
Hao Zhang ◽  
Edwin Hsing-Mean Sha ◽  
Rui Xu ◽  
Jun Liu ◽  
...  

Efficiently accessing remote file data remains a challenging problem for data processing systems. The development of non-volatile dual in-line memory modules (NVDIMMs), in-memory file systems, and RDMA networks provides new opportunities for solving the problem of remote data access. A common understanding of NVDIMMs, such as Intel Optane DC Persistent Memory (DCPM), is that they expand main memory capacity at the cost of performance several times lower than that of DRAM. With the in-depth exploration presented in this paper, however, we show that the potential of NVDIMMs for high-performance, remote in-memory accesses can be revealed through careful design. We explore multiple architectural structures for accessing remote NVDIMMs in a real system using Optane DCPM, and compare the performance of the various structures. Experiments show significant performance gaps among different ways of using NVDIMMs as a memory address space accessible through the RDMA interface. Furthermore, we design and implement a prototype of a user-level, in-memory file system, RIMFS, in the device DAX mode on Optane DCPM. By comparing against the DAX-supported Linux file system, Ext4-DAX, we show that the performance of remote reads on RIMFS over RDMA is on average 11.44× higher than that on a remote Ext4-DAX. The experimental results also show that the performance of remote accesses on RIMFS is maintained on a heavily loaded data server with CPU utilization as high as 90%, while the performance of remote reads on Ext4-DAX drops by 49.3%, and the performance of local reads on Ext4-DAX drops even further, by 90.1%. The performance comparisons of writes exhibit the same trends.


2021 ◽  
Vol 17 (3) ◽  
pp. 1-25
Author(s):  
Bohong Zhu ◽  
Youmin Chen ◽  
Qing Wang ◽  
Youyou Lu ◽  
Jiwu Shu

Non-volatile memory and remote direct memory access (RDMA) provide extremely high performance in storage and network hardware. However, existing distributed file systems strictly isolate the file system and network layers, and their heavily layered software designs leave high-speed hardware under-exploited. In this article, we propose an RDMA-enabled distributed persistent memory file system, Octopus+, which redesigns file system internal mechanisms by closely coupling non-volatile memory and RDMA features. For data operations, Octopus+ directly accesses a shared persistent memory pool to reduce memory copying overhead, and actively fetches and pushes data entirely in clients to rebalance the load between the server and the network. For metadata operations, Octopus+ introduces self-identified remote procedure calls for immediate notification between the file system and networking, and an efficient distributed transaction mechanism for consistency. Octopus+ also supports replication to provide better availability. Evaluations on Intel Optane DC Persistent Memory Modules show that Octopus+ achieves nearly the raw bandwidth for large I/Os and orders of magnitude better performance than existing distributed file systems.


2021 ◽  
Vol 16 (2) ◽  
pp. 1-9
Author(s):  
Stephanie Soldavini ◽  
Christian Pilato

The never-ending demand for high performance and energy efficiency is pushing designers towards an increasing level of heterogeneity and specialization in modern computing systems. In such systems, creating efficient memory architectures is one of the major opportunities for optimizing modern workloads (e.g., computer vision, machine learning, graph analytics) that are extremely data-driven. However, designers need proper design methods to tackle the increasing design complexity and address several new challenges, such as the security and privacy of the data to be processed. This paper gives an overview of the current trends in the design of domain-specific memory architectures. Domain-specific architectures are tailored to a given application domain, introducing hardware accelerators and custom memory modules while maintaining a certain level of flexibility. We describe the major components, the common challenges, and the state-of-the-art design methodologies for building domain-specific memory architectures. We also discuss the most relevant research projects, providing a classification based on our main topics.


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1977
Author(s):  
Guangyu Zhu ◽  
Jaehyun Han ◽  
Sangjin Lee ◽  
Yongseok Son

The emergence of non-volatile memories (NVM) brings new opportunities and challenges to data management system design. As an important part of the data management systems, several new file systems are developed to take advantage of the characteristics of NVM. However, these NVM-aware file systems are usually designed and evaluated based on simulations or emulations. In order to explore the performance and characteristics of these file systems on real hardware, in this article, we provide an empirical evaluation of NVM-aware file systems on the first commercially available byte-addressable NVM (i.e., the Intel Optane DC Persistent Memory Module (DCPMM)). First, to compare the performance difference between traditional file systems and NVM-aware file systems, we evaluate the performance of Ext4, XFS, F2FS, Ext4-DAX, XFS-DAX, and NOVA file systems on DCPMMs. To compare DCPMMs with other secondary storage devices, we also conduct the same evaluations on Optane SSDs and NAND-flash SSDs. Second, we observe how remote NUMA node access and device mapper striping affect the performance of DCPMMs. Finally, we evaluate the performance of the database (i.e., MySQL) on DCPMMs with Ext4 and Ext4-DAX file systems. We summarize several observations from the evaluation results and performance analysis. We anticipate that these observations will provide implications for various memory and storage systems.


Author(s):  
B V S Sai Praneeth

We propose a methodology for designing a finite state machine (FSM)-based programmable memory built-in self-test (PMBIST), in which a controller selects a test algorithm from a fixed set of algorithms built into the memory BIST. In general, it is not possible to test all the different memory modules present in a system-on-chip (SoC) with a single test algorithm. Consequently, it is desirable to have a programmable memory BIST controller that can execute multiple test algorithms. The proposed memory BIST controller is designed as an FSM written in Verilog HDL. This scheme greatly simplifies the testing process and achieves good flexibility with a smaller circuit size than individual testing designs. We used March test algorithms such as MATS+, March X, and March C- to build the project.
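To make the idea of a selectable test algorithm concrete, here is a minimal software sketch, not the paper's Verilog FSM. It encodes the standard MATS+, March X, and March C- element sequences as data and runs the chosen one over a one-bit-wide memory model. The names `MARCH_ALGORITHMS`, `run_march`, and `StuckAtMemory` are hypothetical, and the stuck-at cell is an assumed fault model used only for illustration.

```python
# Each March element is (address order, operations).
# Order: "up" = ascending, "down" = descending, "any" = either
# (ascending is used here). Operations: ("w", bit) writes bit,
# ("r", bit) reads and expects bit.
MARCH_ALGORITHMS = {
    "MATS+": [
        ("any", [("w", 0)]),
        ("up", [("r", 0), ("w", 1)]),
        ("down", [("r", 1), ("w", 0)]),
    ],
    "March X": [
        ("any", [("w", 0)]),
        ("up", [("r", 0), ("w", 1)]),
        ("down", [("r", 1), ("w", 0)]),
        ("any", [("r", 0)]),
    ],
    "March C-": [
        ("any", [("w", 0)]),
        ("up", [("r", 0), ("w", 1)]),
        ("up", [("r", 1), ("w", 0)]),
        ("down", [("r", 0), ("w", 1)]),
        ("down", [("r", 1), ("w", 0)]),
        ("any", [("r", 0)]),
    ],
}


class StuckAtMemory:
    """One-bit-wide memory model with an optional stuck-at cell (assumed fault)."""

    def __init__(self, size, stuck_addr=None, stuck_value=0):
        self.cells = [0] * size
        self.stuck_addr = stuck_addr
        self.stuck_value = stuck_value

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, addr):
        if addr == self.stuck_addr:
            return self.stuck_value
        return self.cells[addr]

    def __setitem__(self, addr, bit):
        if addr != self.stuck_addr:  # writes to the stuck cell have no effect
            self.cells[addr] = bit


def run_march(memory, name):
    """Run the selected algorithm; return (address, expected, got) mismatches."""
    failures = []
    size = len(memory)
    for order, operations in MARCH_ALGORITHMS[name]:
        addresses = range(size - 1, -1, -1) if order == "down" else range(size)
        for addr in addresses:
            for op, bit in operations:
                if op == "w":
                    memory[addr] = bit
                else:
                    got = memory[addr]
                    if got != bit:
                        failures.append((addr, bit, got))
    return failures
```

Selecting a different algorithm is just a different dictionary key, for example `run_march(StuckAtMemory(8, stuck_addr=3), "March C-")`, which loosely mirrors how a programmable controller picks one of several built-in tests.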


2021 ◽  
Vol 50 (1) ◽  
pp. 87-94
Author(s):  
Baotong Lu ◽  
Xiangpeng Hao ◽  
Tianzheng Wang ◽  
Eric Lo

Byte-addressable persistent memory (PM) brings hash tables the potential of low latency, cheap persistence and instant recovery. The recent advent of Intel Optane DC Persistent Memory Modules (DCPMM) further accelerates this trend. Many new hash table designs have been proposed, but most of them were based on emulation and perform sub-optimally on real PM. They were also piecewise and partial solutions that side-stepped many important properties, in particular good scalability, high load factor and instant recovery.


2021 ◽  
Author(s):  
Karthik K ◽  
Sudarson Jena ◽  
Venu Gopal T

A multiprocessor is a system with at least two processing units sharing access to memory. The principal goal of using a multiprocessor is to process tasks simultaneously and boost the system's performance. An interconnection network connects the various processing units and greatly affects the performance of the whole system. Interconnection networks, also known as multistage interconnection networks, are node-to-node links in which each node may be a single processor or a group of processors. These links transfer information from one processor to the next or from a processor to memory, allowing a task to be divided and processed evenly. Hypercube networks are a kind of network topology used to interconnect multiple processors with memory modules and route data accurately. A hypercube network comprises 2^n nodes. Any hypercube can be thought of as a graph with nodes and edges, where a node represents a processing unit and an edge represents a connection between processors over which data is transmitted. Degree, speed, node coverage, connectivity, diameter, reliability, packet loss, and network cost are some of the metrics that can be used to measure the performance of interconnection networks. Variants of hypercube interconnection networks include the Hypercube Network, Folded Hypercube Network, Multiple Reduced Hypercube Network, Multiply Twisted Cube, Recursive Circulant, Exchanged Crossed Cube Network, and Half Hypercube Network, among others. This work assesses the performance of different variants of hypercube interconnection networks. A set of properties is identified, and a weight metric is constructed from these properties to evaluate performance. Using this weight metric, the performance of the considered variants is evaluated and summarized to identify the most effective variant.
A compact survey of some hypercube variants, their topologies, performance metrics, and performance evaluation is presented in this paper. Degree and diameter are used to calculate the network cost. If network cost is taken as the metric for evaluating performance, the Multiple Reduced Hypercube (MRH) is the best choice, with the lowest cost. However, if other properties, scales, or metrics are considered, a variant other than MRH may show better performance. The properties considered here may not be sufficient to assess the performance of hypercube variants in all respects. If a reasonably large number of properties is used to evaluate performance, an efficient hypercube interconnection network variant can be identified for a wide range of applications. This is the motivation for this research work.
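As a rough illustration of how degree and diameter feed into network cost, the sketch below builds a plain hypercube and a folded hypercube as graphs and measures both quantities by breadth-first search. It assumes the commonly used definition cost = degree × diameter; the abstract does not state the exact formula or the paper's weight metric, and the function names are hypothetical.

```python
from collections import deque


def hypercube(n, folded=False):
    """Adjacency lists of an n-dimensional hypercube with 2**n nodes.

    With folded=True, each node also gets a link to its bitwise
    complement, which is the usual folded-hypercube construction.
    """
    size = 1 << n
    adjacency = {v: [v ^ (1 << bit) for bit in range(n)] for v in range(size)}
    if folded:
        mask = size - 1
        for v in range(size):
            adjacency[v].append(v ^ mask)
    return adjacency


def degree(adjacency):
    """Maximum number of distinct neighbours over all nodes."""
    return max(len(set(neighbours)) for neighbours in adjacency.values())


def diameter(adjacency):
    """Longest shortest-path distance, found by BFS from every node."""

    def eccentricity(source):
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
        return max(distance.values())

    return max(eccentricity(v) for v in adjacency)


def network_cost(adjacency):
    """Assumed definition: cost = degree * diameter."""
    return degree(adjacency) * diameter(adjacency)


plain = hypercube(4)
folded = hypercube(4, folded=True)
```

For the 4-dimensional case, `network_cost(plain)` and `network_cost(folded)` can be compared directly; the extra complement links raise the degree by one but shorten the diameter.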


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 378
Author(s):  
Oluwatosin Ahmed Amodu ◽  
Mohamed Othman ◽  
Nur Arzilawati Md Yunus ◽  
Zurina Mohd Hanapi

Interconnection networks provide an effective means by which components of a system, such as processors and memory modules, communicate to provide reliable connectivity. This facilitates the realization of a highly efficient network design suitable for computation-intensive applications. In particular, multistage interconnection networks have unique advantages, as the addition of extra stages helps to improve network performance. However, this comes with challenges and trade-offs, which motivate researchers to explore various design options and architectural models to improve their performance. A particular class of these networks is the shuffle exchange network (SEN), which involves a symmetric N-input, N-output architecture built in stages of N/2 switching elements each. This paper presents recent advances in multistage interconnection networks with emphasis on SENs, discussing pertinent issues related to their design and taking lessons from past and current literature. To achieve this objective, applications, motivating factors, architectures, shuffle exchange networks, and some of the performance evaluation techniques, as well as their merits and demerits, are discussed. Then, to capture the latest research trends in this area not covered in contemporary literature, this paper reviews very recent advancements in shuffle exchange multistage interconnection networks within the last few years and provides design guidelines as well as recommendations for future consideration.
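The N-input, N-output structure with N/2 two-by-two switches per stage can be sketched in a few lines. This example assumes the basic SEN with log2(N) stages and destination-tag routing, which the abstract does not spell out; `perfect_shuffle` and `route` are hypothetical helper names.

```python
def perfect_shuffle(position, n_bits):
    """Perfect shuffle: rotate the n_bits-bit line address left by one."""
    msb = (position >> (n_bits - 1)) & 1
    return ((position << 1) & ((1 << n_bits) - 1)) | msb


def route(source, destination, n_bits):
    """Line position after each stage of an assumed log2(N)-stage SEN.

    Each stage shuffles the lines, then the 2x2 switch (index
    position >> 1, so N/2 switches per stage) sends the message to its
    upper output (0) or lower output (1) according to the next
    destination bit, most significant bit first.
    """
    position = source
    path = []
    for stage in range(n_bits):
        position = perfect_shuffle(position, n_bits)
        tag_bit = (destination >> (n_bits - 1 - stage)) & 1
        position = (position & ~1) | tag_bit
        path.append(position)
    return path


# Route from input 2 to output 5 in an 8 x 8 network (3 stages).
example_path = route(2, 5, 3)
```

After the last stage the position equals the destination for every source and destination pair, because each shuffle moves one original address bit out and each switch setting moves one destination bit in.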

