Modern Computer Architecture using different Technique

2021 ◽  
Vol 183 (36) ◽  
pp. 47-53
Author(s):  
Jay Pankajkumar Kania
2020 ◽  
Author(s):  
Rotem Ben-Hur ◽  
Ronny Ronen ◽  
Ameer Haj-Ali ◽  
Debjyoti Bhattacharjee ◽  
Adi Eliahu ◽  
...  

In-memory processing can dramatically improve the latency and energy consumption of computing systems by minimizing data transfer between the memory and the processor. Efficient execution of processing operations within the memory is therefore a highly motivated objective in modern computer architecture. This paper presents a novel automatic framework for efficient implementation of arbitrary combinational logic functions within a memristive memory. Using tools from logic design, graph theory, and compiler register allocation, we developed SIMPLER (Synthesis and In-memory MaPping of Logic Execution in a single Row), a tool that optimizes the execution of in-memory logic operations in terms of throughput and area. Given a logic function, SIMPLER automatically generates a sequence of atomic Memristor-Aided loGIC (MAGIC) NOR operations and efficiently maps them onto a single size-limited memory row, reusing cells to save area when needed. This approach fully exploits the parallelism offered by the MAGIC NOR gates: multiple instances of the logic function can be performed concurrently, each compressed into a single row of the memory. This virtue makes SIMPLER an attractive candidate for designing in-memory Single Instruction, Multiple Data (SIMD) operations. Compared to previous work (which optimizes latency rather than throughput for a single function), SIMPLER achieves an average throughput improvement of 435×. When previous tools are parallelized similarly to SIMPLER, SIMPLER still achieves at least 5× higher throughput, with a 23× improvement in area and a 20× improvement in area efficiency. These improvements more than compensate for the increase in latency (up to 17% on average).




Author(s):  
Ron Elber

Abstract: The kinetics of biochemical and biophysical events determine the course of life processes and have attracted considerable interest and research. For example, modeling of biological networks and cellular responses relies on the availability of rate coefficients. Atomically detailed simulations hold the promise of supplementing experimental data to obtain a more complete kinetic picture. However, simulations at biological time scales are challenging: typical computer resources are insufficient to provide an ensemble of trajectories of the length required for straightforward calculations of time scales. In recent years, new technologies have emerged that make atomically detailed simulations of rate coefficients possible. Instead of computing complete trajectories from reactants to products, these approaches launch a large number of short trajectories at different positions. Since the trajectories are short, they are trivially computed in parallel on modern computer architecture. The starting and termination positions of the short trajectories are chosen, following statistical-mechanics theory, to enhance efficiency. Analysis of these trajectories produces accurate estimates of time scales as long as hours. The theory of Milestoning, which exploits the use of short trajectories, is discussed, and several applications are described.


2001 ◽  
Vol 11 (01) ◽  
pp. 105-117 ◽  
Author(s):  
HERBERT KARNER ◽  
MARTIN AUER ◽  
CHRISTOPH W. UEBERHUBER

Modern computer architecture provides a special instruction, the fused multiply-add (FMA) instruction, that performs a multiplication and an addition in a single operation. In this paper, newly developed radix-2, radix-3, and radix-5 FFT kernels that efficiently take advantage of this powerful instruction are presented. If a processor provides FMA instructions, the radix-2 FFT algorithm introduced here has the lowest complexity of all Cooley–Tukey radix-2 algorithms: all floating-point operations are executed as FMA instructions. Compared to conventional radix-3 and radix-5 kernels, the new radix-3 and radix-5 kernels greatly improve the utilization of FMA instructions, which results in a significant reduction in complexity. In general, the advantages of the FFT algorithms presented in this paper are their low arithmetic complexity, their high efficiency, and their striking simplicity. Numerical experiments show that FFT programs using the new kernels clearly outperform even the best conventional FFT routines.


1994 ◽  
Vol 33 (01) ◽  
pp. 60-63 ◽  
Author(s):  
E. J. Manders ◽  
D. P. Lindstrom ◽  
B. M. Dawant

Abstract: On-line intelligent monitoring, diagnosis, and control of dynamic systems such as patients in intensive care units necessitate the context-dependent acquisition, processing, analysis, and interpretation of large amounts of possibly noisy and incomplete data. The dynamic nature of the process also requires continuous evaluation and adaptation of the monitoring strategy in response to changes both in the monitored patient and in the monitoring equipment. Moreover, real-time constraints may imply data losses, whose impact has to be minimized. This paper presents a computer architecture designed to accomplish these tasks. Its main components are a model and a data-abstraction module. The model provides the system with a monitoring context related to the patient's status. The data-abstraction module relies on that information to adapt the monitoring strategy and to provide the model with the necessary information. This paper focuses on the data-abstraction module and its interaction with the model.

