CHAM: Improving Prefetch Efficiency Using a Composite Hierarchy-Aware Method

2018 ◽  
Vol 27 (07) ◽  
pp. 1850114
Author(s):  
Cheng Qian ◽  
Libo Huang ◽  
Qi Yu ◽  
Zhiying Wang

Hardware prefetching has always been a crucial mechanism for improving processor performance. However, an efficient prefetch operation requires a guarantee of high prefetch accuracy; otherwise, it may degrade system performance. Prior studies propose an adaptive priority-controlling method to make better use of prefetch accesses, which improves performance in two-level cache systems. However, this method does not perform well in a more complex memory hierarchy, such as a three-level cache system. Thus, it is still necessary to explore prefetch efficiency, particularly in complex hierarchical memory systems. In this paper, we propose a composite hierarchy-aware method called CHAM, which works at the middle-level cache (MLC). Using prefetch accuracy as an evaluation criterion, CHAM improves the efficiency of prefetch accesses based on (1) a dynamic adaptive prefetch control mechanism that schedules the priority and data transfer of prefetch accesses across the cache hierarchy at runtime and (2) a prefetch efficiency-oriented hybrid cache replacement policy that selects the most suitable policy. To demonstrate its effectiveness, we have performed extensive experiments on 28 benchmarks from SPEC CPU2006 and two benchmarks from BioBench. Compared with a similar adaptive method, CHAM improves the MLC demand hit rate by 9.2% and system performance by 1.4% on average in a single-core system. On a 4-core system, CHAM improves the demand hit rate by 33.06% and system performance by 10.1% on average.
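The accuracy-driven control the abstract describes can be illustrated with a minimal sketch. The thresholds, interval mechanics, and priority names below are assumptions for illustration, not details taken from the CHAM paper:

```python
# Hypothetical sketch: adjust prefetch priority from measured accuracy,
# re-evaluated at each interval boundary. Thresholds are illustrative.

class PrefetchController:
    HIGH, LOW = 0.75, 0.40  # illustrative accuracy thresholds

    def __init__(self):
        self.issued = 0
        self.useful = 0
        self.priority = "demand-equal"

    def record(self, was_useful):
        """Count one completed prefetch; was_useful means a demand access hit it."""
        self.issued += 1
        if was_useful:
            self.useful += 1

    def end_interval(self):
        """At each interval boundary, re-schedule prefetch priority."""
        acc = self.useful / self.issued if self.issued else 0.0
        if acc >= self.HIGH:
            self.priority = "demand-equal"   # accurate: treat like demand
        elif acc <= self.LOW:
            self.priority = "drop-on-full"   # inaccurate: throttle hard
        else:
            self.priority = "below-demand"
        self.issued = self.useful = 0
        return self.priority
```

A real implementation would track per-level accuracy counters in hardware; this sketch only captures the feedback loop.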

2014 ◽  
Vol 23 (04) ◽  
pp. 1450046
Author(s):  
ENRIQUE SEDANO ◽  
SILVIO SEPULVEDA ◽  
FERNANDO CASTRO ◽  
DANIEL CHAVER ◽  
RODRIGO GONZALEZ-ALBERQUILLA ◽  
...  

Studying block behavior during a block's lifetime in cache can provide useful information to reduce the miss rate and therefore improve processor performance. Following this rationale, the peLIFO replacement algorithm [M. Chaudhuri, Proc. Micro'09, New York, 12–16 December, 2009, pp. 401–412], which dynamically learns the number of cache ways required to satisfy short-term reuses while preserving the remaining ways for long-term reuses, was recently proposed. In this paper, we propose several changes to the original peLIFO policy to reduce its implementation complexity, and we extend the algorithm to a shared-cache environment, using dynamic information about thread behavior to improve cache efficiency. Experimental results confirm that our simplification techniques reduce the required hardware with a negligible performance penalty, while the best of our thread-aware extension proposals reduces CPI by 8.7% and 15.2% on average compared to the original peLIFO and LRU respectively, for a set of 43 multi-programmed workloads on an 8 MB 16-way set-associative shared L2 cache.
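The core idea of learning a way partition can be sketched as follows. This is not Chaudhuri's exact peLIFO mechanism (which uses fill-stack escape probabilities); it is a simplified illustration of splitting ways between short-term and long-term reuse, with the coverage target as an assumed parameter:

```python
# Illustrative sketch: learn how many ways capture most short-term reuse
# by counting hits per fill-stack depth; the rest serve long-term reuse.

def learn_partition(hit_depths, num_ways, coverage=0.9):
    """Return the smallest number of ways that captures `coverage`
    of the observed reuse hits. `hit_depths` are fill-stack positions
    (0 = most recently filled) at which demand hits occurred."""
    counts = [0] * num_ways
    for d in hit_depths:
        counts[d] += 1
    total = sum(counts) or 1
    running = 0
    for ways in range(num_ways):
        running += counts[ways]
        if running / total >= coverage:
            return ways + 1
    return num_ways
```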


Author(s):  
V. Sathiyamoorthi

Network congestion remains one of the main barriers to the continued success of the internet and Web-based services. In this context, proxy caching is one of the most successful techniques for improving Web performance, since it reduces network traffic and Web server load and improves user-perceived response time. The most popular Web objects, those likely to be revisited in the near future, are stored on the proxy server, which improves Web response time and saves network bandwidth. The key component of Web caching is its cache replacement policy, which decides which existing objects to evict when the cache is full and a new object must be admitted. Conventional replacement policies perform poorly in Web caching environments: they were designed for memory caches holding fixed-size objects, whereas Web caches hold objects of varying size, so an efficient policy tailored to the Web cache environment is needed. Moreover, most existing Web caching policies consider only a few factors and ignore others that affect the efficiency of Web proxy caching. Hence, a novel policy for the Web cache environment is proposed. The proposed policy incorporates size, cost, frequency, ageing, time of entry into the cache, and popularity of Web objects into its cache removal decision, and it uses Web usage mining as a technique to improve the caching policy. Empirical analysis shows that the proposed policy performs better than existing policies in terms of performance metrics such as hit rate and byte hit rate.
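A replacement decision combining the factors the abstract lists might look like the sketch below. The scoring formula and weights are hypothetical; the paper's actual function may differ:

```python
# Hypothetical scoring function combining size, cost, frequency, ageing,
# entry time, and popularity. Lower score = better eviction candidate.

def score(obj, now):
    age = now - obj["last_access"]        # ageing: long-idle objects score lower
    residency = now - obj["entry_time"]   # time since the object entered the cache
    return (obj["cost"] * obj["frequency"] * obj["popularity"]) / \
           (obj["size"] * (1 + age) * (1 + residency))

def evict(cache, now):
    """Remove and return the key of the lowest-scoring object
    when the cache is full and a new object must be admitted."""
    victim = min(cache, key=lambda k: score(cache[k], now))
    del cache[victim]
    return victim
```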


2021 ◽  
Vol 2 (3) ◽  
pp. 1-24
Author(s):  
Chih-Kai Huang ◽  
Shan-Hsiang Shen

The next-generation 5G cellular networks are designed to support internet of things (IoT) networks; network components and services are virtualized and run either in virtual machines (VMs) or containers. Moreover, edge clouds (which are closer to end users) are leveraged to reduce end-to-end latency, especially for IoT applications that require short response times. However, computational resources are limited in edge clouds. To minimize overall service latency, it is crucial to determine carefully which services should be provided in edge clouds so that more mobile or IoT devices are served locally. In this article, we propose a novel service cache framework called S-Cache, which automatically caches popular services in edge clouds. In addition, we design a new cache replacement policy to maximize the cache hit rate. Our evaluations use real log files from Google to form two datasets to evaluate the performance. The proposed cache replacement policy is compared with other policies such as greedy-dual-size-frequency (GDSF) and least-frequently-used (LFU). The experimental results show that the cache hit rates are improved by 39% on average, and the average latency of our cache replacement policy decreases by 41% and 38% in the two datasets. This indicates that our approach is superior to existing cache policies and is better suited to multi-access edge computing environments. In the implementation, S-Cache relies on OpenStack to clone services to edge clouds and direct the network traffic. We also evaluate the cost of cloning a service to an edge cloud; the cloning cost of various real applications is studied experimentally under the presented framework in different environments.
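For reference, the GDSF baseline the abstract compares against can be sketched in a few lines. The class below is a standard textbook formulation of greedy-dual-size-frequency (priority = L + frequency × cost / size, with L inflated to the last victim's priority), not S-Cache itself:

```python
# Minimal GDSF (greedy-dual-size-frequency) sketch. On eviction, the
# inflation value L rises to the victim's priority, so long-resident
# entries gradually lose their advantage over fresh ones.

class GDSFCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.L = 0.0                 # inflation value ("clock")
        self.items = {}              # key -> (priority, freq, size, cost)

    def access(self, key, size, cost):
        if key in self.items:
            _, freq, size, cost = self.items[key]
            freq += 1
        else:
            freq = 1
            # Evict lowest-priority entries until the new one fits.
            while self.used + size > self.capacity and self.items:
                victim = min(self.items, key=lambda k: self.items[k][0])
                self.L = self.items[victim][0]
                self.used -= self.items[victim][2]
                del self.items[victim]
            self.used += size
        self.items[key] = (self.L + freq * cost / size, freq, size, cost)
```

Note how size appears in the denominator: small, cheap-to-keep objects are favored, which is why GDSF is a natural baseline for variable-size service caching.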


2018 ◽  
Vol 15 (2) ◽  
pp. 20171099-20171099 ◽  
Author(s):  
Duk-Jun Bang ◽  
Min-Kwan Kee ◽  
Hong-Yeol Lim ◽  
Gi-Ho Park
