scholarly journals RC-NVM: Dual-Addressing Non-Volatile Memory Architecture Supporting Both Row and Column Memory Accesses

2019 ◽  
Vol 68 (2) ◽  
pp. 239-254 ◽  
Author(s):  
Shuo Li ◽  
Nong Xiao ◽  
Peng Wang ◽  
Guangyu Sun ◽  
Xiaoyang Wang ◽  
...  
2018 ◽  
Vol 74 (8) ◽  
pp. 3875-3903
Author(s):  
Danqi Hu ◽  
Fang Lv ◽  
Chenxi Wang ◽  
Hui-Min Cui ◽  
Lei Wang ◽  
...  

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257047
Author(s):  
Adrián Lamela ◽  
Óscar G. Ossorio ◽  
Guillermo Vinuesa ◽  
Benjamín Sahelices

Non-volatile memory technology is now available in commodity hardware. This technology can be used as a backup memory for an external dram cache memory without needing to modify the software. However, the higher read and write latencies of non-volatile memory may exacerbate the memory wall problem. In this work we present a novel off-chip prefetch technique based on a Hidden Markov Model that specifically deals with the latency problem caused by complexity of off-chip memory access patterns. Firstly, we present a thorough analysis of off-chip memory access patterns to identify its complexity in multicore processors. Based on this study, we propose a prefetching module located in the llc which uses two small tables, and where the computational complexity of which is linear with the number of computing threads. Our Markov-based technique is able to keep track and make clustering of several simultaneous groups of memory accesses coming from multiple simultaneous threads in a multicore processor. It can quickly identify complex address groups and trigger prefetch with very high accuracy. Our simulations show an improvement of up to 76% in the hit ratio of an off-chip dram cache for multicore architecture over the conventional prefetch technique (g/dc). Also, the overhead of prefetch requests (failed prefetches) is reduced by 48% in single core simulations and by 83% in multicore simulations.


2021 ◽  
Vol 11 (3) ◽  
pp. 29
Author(s):  
Tommaso Zanotti ◽  
Francesco Maria Puglisi ◽  
Paolo Pavan

Different in-memory computing paradigms enabled by emerging non-volatile memory technologies are promising solutions for the development of ultra-low-power hardware for edge computing. Among these, SIMPLY, a smart logic-in-memory architecture, provides high reconfigurability and enables the in-memory computation of both logic operations and binarized neural networks (BNNs) inference. However, operation-specific hardware accelerators can result in better performance for a particular task, such as the analog computation of the multiply and accumulate operation for BNN inference, but lack reconfigurability. Nonetheless, a solution providing the flexibility of SIMPLY while also achieving the high performance of BNN-specific analog hardware accelerators is missing. In this work, we propose a novel in-memory architecture based on 1T1R crossbar arrays, which enables the coexistence on the same crossbar array of both SIMPLY computing paradigm and the analog acceleration of the multiply and accumulate operation for BNN inference. We also highlight the main design tradeoffs and opportunities enabled by different emerging non-volatile memory technologies. Finally, by using a physics-based Resistive Random Access Memory (RRAM) compact model calibrated on data from the literature, we show that the proposed architecture improves the energy delay product by >103 times when performing a BNN inference task with respect to a SIMPLY implementation.


2016 ◽  
Vol 11 (3) ◽  
pp. 189-194
Author(s):  
Rodel Felipe Miguel ◽  
Sivaraman Sundaram ◽  
K. M. M. Aung

Author(s):  
Masashi TAWADA ◽  
Shinji KIMURA ◽  
Masao YANAGISAWA ◽  
Nozomu TOGAWA

2016 ◽  
Vol 213 (9) ◽  
pp. 2446-2451 ◽  
Author(s):  
Klemens Ilse ◽  
Thomas Schneider ◽  
Johannes Ziegler ◽  
Alexander Sprafke ◽  
Ralf B. Wehrspohn

Sign in / Sign up

Export Citation Format

Share Document