translation lookaside buffer
Recently Published Documents


TOTAL DOCUMENTS

32
(FIVE YEARS 4)

H-INDEX

6
(FIVE YEARS 0)

2022 ◽  
Vol 19 (1) ◽  
pp. 1-23
Author(s):  
Bang Di ◽  
Daokun Hu ◽  
Zhen Xie ◽  
Jianhua Sun ◽  
Hao Chen ◽  
...  

Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns about application security. We reveal that the translation lookaside buffer (TLB) attack, one of the common attacks on CPUs, can also happen on GPUs when multiple GPU kernels co-run. We investigate the conditions or principles under which a TLB attack can take effect, including awareness of the GPU TLB microarchitecture, being lightweight, and bypassing existing software and hardware mechanisms. This TLB-based attack can be leveraged to conduct Denial-of-Service (or Degradation-of-Service) attacks. Furthermore, we propose a solution to mitigate TLB attacks. In particular, based on the microarchitecture properties of the GPU, we introduce a software-based system, TLB-pilot, that binds thread blocks of different kernels to different groups of streaming multiprocessors by considering hardware isolation of last-level TLBs and the application’s resource requirements. TLB-pilot employs lightweight online profiling to collect kernel information before kernel launches. By coordinating software- and hardware-based scheduling and employing a kernel splitting scheme to reduce load imbalance, TLB-pilot effectively mitigates TLB attacks. The results show that, when under TLB attack, TLB-pilot mitigates the attack and provides on average 56.2% and 60.6% improvement in average normalized turnaround times and overall system throughput, respectively, compared to the traditional Multi-Process Service based co-running solution. When under TLB attack, TLB-pilot also provides up to 47.3% and 64.3% improvement (41% and 42.9% on average) in average normalized turnaround times and overall system throughput, respectively, compared to a state-of-the-art co-running solution for efficient scheduling of thread blocks.


Author(s):  
Varuna Eswer ◽  
Sanket S Naik Dessai

Processor efficiency is important in embedded systems. The efficiency of the processor depends on the L1 cache and the translation lookaside buffer (TLB). It is necessary to understand L1 cache and TLB performance under varied load on the processor, and hence this paper studies the performance of varying loads with caches on MIPS under an operating system (OS). The proposed implementation counts instruction executions for the respective cache and TLB management, and the events are measured using dedicated counters in software. Software counters are used because of the limitations of the hardware counters in the MIPS32. Twenty-seven metrics are considered for analysis and identification and are implemented for the performance measurement of the L1 cache and TLB on the MIPS32 processor. The generated data helps future research in compiler tuning, memory management design for OSs, analysis of architectural issues, system benchmarking, scalability, address space analysis, studies of bus communication among processors and their workload-sharing characterisation, and kernel profiling.


Author(s):  
Jing Yan ◽  
Yujuan Tan ◽  
Zhulin Ma ◽  
Jingcheng Liu ◽  
Xianzhang Chen ◽  
...  

The translation lookaside buffer (TLB) is critical to the performance of modern multi-level memory systems. However, because the TLB itself is small, its address coverage is limited. Adopting a two-level exclusive TLB hierarchy can increase the coverage [M. Swanson, L. Stoller and J. Carter, Increasing TLB reach using superpages backed by shadow memory, 25th Annual Int. Symp. Computer Architecture (1998); H.P. Chang, T. Heo, J. Jeong and J. Huh, Hybrid TLB coalescing: Improving TLB translation coverage under diverse fragmented memory allocations, ACM SIGARCH Comput. Arch. News 45 (2017) 444–456] to improve memory performance. However, after analyzing the existing two-level exclusive TLBs, we find that a large number of “dead” entries (entries that will have no further use) remain in the last-level TLB (LLT) for a long time, where they occupy much cache space and result in a low TLB hit rate. Based on this observation, we first propose exploiting temporal and spatial locality to predict and identify dead entries in the exclusive LLT and remove them as soon as possible, leaving room for more valid data and increasing the TLB hit rate. Extensive experiments show that our method increases the average hit rate by 8.67%, to a maximum of 19.95%, and reduces total latency by an average of 9.82%, up to 24.41%.
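As a rough illustration of what "two-level exclusive" means in the abstract above, the following Python sketch models a hierarchy in which a translation lives in at most one level at a time, so the combined coverage is the sum of both levels. This is a toy model written for this listing and is not the authors' design: the class name `ExclusiveTLB`, the LRU replacement in both levels, and the `walk` callback standing in for a page-table walk are all assumptions, both sizes are assumed to be at least 1, and the dead-entry predictor the paper proposes is not modeled.

```python
from collections import OrderedDict


class ExclusiveTLB:
    """Toy two-level exclusive TLB (hypothetical; LRU in both levels)."""

    def __init__(self, l1_size, l2_size):
        # Assumption: both sizes are >= 1.
        self.l1_size = l1_size
        self.l2_size = l2_size
        self.l1 = OrderedDict()  # vpn -> frame, least recently used first
        self.l2 = OrderedDict()  # last-level TLB, holds only L1 victims

    def _insert_l1(self, vpn, frame):
        # If L1 is full, demote its LRU entry to L2 instead of dropping it.
        if len(self.l1) >= self.l1_size:
            victim_vpn, victim_frame = self.l1.popitem(last=False)
            if len(self.l2) >= self.l2_size:
                self.l2.popitem(last=False)  # L2 victim leaves the hierarchy
            self.l2[victim_vpn] = victim_frame
        self.l1[vpn] = frame

    def lookup(self, vpn, walk):
        """Return (where_found, frame); `walk` models a page-table walk."""
        if vpn in self.l1:
            self.l1.move_to_end(vpn)
            return "l1", self.l1[vpn]
        if vpn in self.l2:
            # Exclusive policy: promote to L1 and remove the copy from L2.
            frame = self.l2.pop(vpn)
            self._insert_l1(vpn, frame)
            return "l2", frame
        frame = walk(vpn)
        self._insert_l1(vpn, frame)
        return "miss", frame
```

In this model an entry demoted to L2 stays there until it is either re-referenced or pushed out by later L1 victims, which is where entries with no further use can linger and occupy space.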


Author(s):  
Y. I. Klimiankou

This paper focuses on Translation Lookaside Buffer (TLB) management as part of memory management. The TLB is an associative cache in advanced processors that reduces the overhead of virtual-to-physical address translation. We consider challenges related to the design of the TLB management subsystem of the OS kernel, using the IA-32 platform as an example, and propose a simple model of a complete and consistent TLB management policy. This model can be used as a foundation for the design and verification of memory management subsystems.
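To make the consistency concern in the abstract above concrete, here is a minimal Python sketch of a TLB as an associative cache over a page table, where a mapping change must be paired with an invalidation. It is not the model proposed in the paper: the names `ToyTLB` and `remap`, the fully associative LRU organisation, and the 4 KiB page size are assumptions chosen for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages


class ToyTLB:
    """Fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table  # dict: virtual page number -> frame number
        self.entries = OrderedDict()  # cached subset of page_table
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            self.misses += 1
            # Stands in for a page-table walk; raises KeyError if unmapped.
            frame = self.page_table[vpn]
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict the LRU entry
            self.entries[vpn] = frame
        return self.entries[vpn] * PAGE_SIZE + offset

    def invalidate(self, vpn):
        # Drop a single cached translation, if present.
        self.entries.pop(vpn, None)


def remap(tlb, vpn, new_frame):
    """Update the page table, then invalidate the now-stale TLB entry."""
    tlb.page_table[vpn] = new_frame
    tlb.invalidate(vpn)
```

If `remap` skipped the `invalidate` call, a later `translate` would keep returning the old frame from the cache, which is the kind of inconsistency a TLB management policy has to rule out.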


10.29007/c2f1 ◽  
2018 ◽  
Author(s):  
Hira Syeda ◽  
Gerwin Klein

The main security mechanism for enforcing memory isolation in operating systems is provided by page tables. The hardware-implemented Translation Lookaside Buffer (TLB) caches these, and therefore the TLB and its consistency with memory are security critical for OS kernels, including formally verified kernels such as seL4. If performance is paramount, this consistency can be subtle to achieve; yet, all major formally verified kernels currently leave the TLB as an assumption. In this paper, we present a formal model of the Memory Management Unit (MMU) for the ARM architecture which includes the TLB, its maintenance operations, and its derived properties. We integrate this specification into the Cambridge ARM model. We derive sufficient conditions for TLB consistency, and we abstract away the functional details of the MMU for simpler reasoning about executions in the presence of cached address translation, including complete and partial walks.


2018 ◽  
Vol 10 (1) ◽  
pp. 50-55 ◽  
Author(s):  
Vladimir Stenin ◽  
Artem Antonyuk ◽  
Yuri Katunin ◽  
Pavel Stepanov

Author(s):  
Vladimir Ya. Stenin ◽  
Artem V. Antonyuk ◽  
Pavel V. Stepanov ◽  
Yuri V. Katunin
