Data Cache Analysis by Counting Integer Points

Author(s):  
Pascal Sotin ◽  
Quentin Vermande ◽  
Hugues Cassé
2019 ◽  
Vol 98 ◽  
pp. 443-452 ◽  
Author(s):  
He Du ◽  
Wei Zhang ◽  
Nan Guan ◽  
Wang Yi

2021 ◽  
Vol 18 (3) ◽  
pp. 1-22
Author(s):  
Michael Stokes ◽  
David Whalley ◽  
Soner Onder

While data filter caches (DFCs) have been shown to be effective at reducing data access energy, they have not been adopted in processors due to the associated performance penalty caused by high DFC miss rates. In this article, we present a design that both decreases the DFC miss rate and completely eliminates the DFC performance penalty, even for a level-one data cache (L1 DC) with a single-cycle access time. First, we show that a DFC that lazily fills each word in a DFC line from an L1 DC only when the word is referenced is more energy-efficient than one that eagerly fills the entire DFC line. For a 512B DFC, we are able to eliminate loads of words into the DFC that are never referenced before being evicted, which occurred for about 75% of the words in 32B lines. Second, we demonstrate that a lazily word-filled DFC line can effectively share and pack data words from multiple L1 DC lines to lower the DFC miss rate. For a 512B DFC, we completely avoid accessing the L1 DC for loads about 23% of the time and avoid a fully associative L1 DC access for loads 50% of the time, where the DFC only requires about 2.5% of the size of the L1 DC. Finally, we present a method that completely eliminates the DFC performance penalty by speculatively performing DFC tag checks early and only accessing DFC data when a hit is guaranteed. For a 512B DFC, we improve data access energy usage for the DTLB and L1 DC by 33% with no performance degradation.
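The difference between eager line fills and lazy word fills can be illustrated with a minimal sketch. It is not the authors' design: the direct-mapped organization, the 4-byte word size, and the function name `words_filled` are assumptions made for illustration, and the sketch omits the packing of words from multiple L1 DC lines as well as the early tag check. It only counts how many words are copied from the L1 DC into a 512B DFC with 32B lines under each fill policy.

```python
WORD_BYTES = 4                             # assumed word size
LINE_BYTES = 32                            # 32B lines, as in the abstract
WORDS_PER_LINE = LINE_BYTES // WORD_BYTES
NUM_LINES = 512 // LINE_BYTES              # 512B DFC, as in the abstract


def words_filled(addresses, lazy):
    """Count words copied from the L1 DC into a direct-mapped DFC (assumed)."""
    tags = [None] * NUM_LINES                   # line address held in each slot
    valid = [set() for _ in range(NUM_LINES)]   # word offsets present per slot
    filled = 0
    for addr in addresses:
        line_addr = addr // LINE_BYTES
        index = line_addr % NUM_LINES
        word = (addr % LINE_BYTES) // WORD_BYTES
        if tags[index] != line_addr:
            # Line miss: evict the old line and claim the slot.
            tags[index] = line_addr
            valid[index] = set()
            if not lazy:
                # Eager policy: fetch every word of the line at once.
                valid[index] = set(range(WORDS_PER_LINE))
                filled += WORDS_PER_LINE
        if word not in valid[index]:
            # Lazy policy: fetch only the word that is actually referenced.
            valid[index].add(word)
            filled += 1
    return filled


# Two lines that conflict in the same slot, touching one or two words each.
trace = [0, 4, 1024, 0]
print(words_filled(trace, lazy=False), words_filled(trace, lazy=True))
```

On this short conflicting trace the eager policy copies whole lines that are evicted after one or two references, while the lazy policy copies only the referenced words.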


4OR ◽  
2020 ◽  
Author(s):  
Michele Conforti ◽  
Marianna De Santis ◽  
Marco Di Summa ◽  
Francesco Rinaldi

We consider the integer points in a unimodular cone K ordered by a lexicographic rule defined by a lattice basis. To each integer point x in K we associate a family of inequalities (lex-inequalities) that define the convex hull of the integer points in K that are not lexicographically smaller than x. The family of lex-inequalities contains the Chvátal–Gomory cuts, but does not contain and is not contained in the family of split cuts. This provides a finite cutting plane method to solve the integer program $\min \{cx : x \in S \cap \mathbb{Z}^n\}$, where $S \subset \mathbb{R}^n$ is a compact set and $c \in \mathbb{Z}^n$. We analyze the number of iterations of our algorithm.


2021 ◽  
Author(s):  
Otabek Gulomov ◽  
Sadulla Shodiev

Author(s):  
B. Shameedha Begum ◽  
N. Ramasubramanian

Embedded systems are designed for a variety of applications ranging from hard real-time applications to mobile computing, which demand various types of cache designs for better performance. Since real-time applications place stringent requirements on performance, the role of the cache subsystem assumes significance. Reconfigurable caches meet performance requirements in this context. Existing reconfigurable caches tend to use associativity and size for maximizing cache performance. This article proposes a novel approach of a reconfigurable and intelligent data cache (L1) based on replacement algorithms. An intelligent embedded data cache and a dynamic reconfigurable intelligent embedded data cache have been implemented using Verilog 2001 and tested for cache performance. Data collected by enabling the cache with two different replacement strategies have shown that the hit rate improves by 40% when compared to LRU and by 21% when compared to MRU for sequential applications, which will significantly improve the performance of embedded real-time applications.
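Why the choice of replacement algorithm matters for sequential access patterns can be seen in a small simulation. This is only a sketch under stated assumptions: it models a fully associative cache in software, the function name `hit_rate` and the trace are hypothetical, and it does not reproduce the reconfigurable cache or the Verilog implementation described in the article. It compares plain LRU and MRU replacement on a cyclic sequential trace that is one block larger than the cache.

```python
def hit_rate(trace, capacity, policy):
    """Hit rate of a fully associative cache under 'LRU' or 'MRU' replacement."""
    cache = []  # ordered from least recently used to most recently used
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.remove(block)          # re-appended below as most recent
        elif len(cache) == capacity:
            # LRU evicts the oldest entry, MRU evicts the newest one.
            cache.pop(0 if policy == "LRU" else -1)
        cache.append(block)
    return hits / len(trace)


# Cyclic sequential sweep over 5 blocks with room for only 4.
trace = [0, 1, 2, 3, 4] * 4
print("LRU:", hit_rate(trace, 4, "LRU"))
print("MRU:", hit_rate(trace, 4, "MRU"))
```

On such a sweep LRU always evicts the block that is needed next and never hits, whereas MRU retains most of the working set, which is one reason a cache that can switch replacement strategy may help sequential workloads.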


2008 ◽  
Vol 5 (19) ◽  
pp. 833-839
Author(s):  
Seungmin Jung ◽  
Hyotaek Shim ◽  
Seungryoul Maeng
