ASYNCHRONOUS INSTRUCTION CACHE MEMORY FOR AVERAGE-CASE PERFORMANCE

2014 ◽  
Vol 23 (05) ◽  
pp. 1450063 ◽  
Author(s):  
Je-Hoon Lee ◽  
Hyun Gug Cho

This paper presents an asynchronous instruction cache memory designed for average-case performance rather than worst-case performance. Even though the proposed instruction cache design is based on a fixed delay model, it achieves high throughput by employing a new memory segmentation technique that divides the cache memory cell arrays into multiple memory segments. Conventional bit-line memory segmentation divides a whole memory system into multiple segments of the same size. In contrast, we propose a new bit-line segmentation technique in which the cache memory consists of multiple segments that all share the same delay bound. We use resistor-capacitor (R-C) modeling of the bit-line delay for the content addressable memory–random access memory (CAM–RAM) structure in a cache in order to estimate the total bit-line delay. We then choose the number of segments to trade off throughput against the complexity of the cache system. We synthesized a 128 KB cache memory with 1 to 16 segments using the Hynix 0.35-μm CMOS process. The simulation results show that our implementation with dividing factors of 4 and 16 reduces the average cache access time by 28% and 35%, respectively, compared to the non-segmented counterpart system. They also show that our implementation reduces the average cache access time by 11% and 17% compared to a bit-line segmented cache consisting of the same number of equally sized segments.
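
The bit-line delay that drives this segmentation choice can be approximated with a simple Elmore R-C model. The sketch below illustrates that estimation under assumed per-cell parasitics; all names and numbers (r_per_cell, c_per_cell, the 1024-row array) are illustrative assumptions, not figures from the paper, and it uses the conventional equal-size split for simplicity rather than the paper's equal-delay segments.

```python
# Minimal sketch of Elmore-style R-C bit-line delay estimation for a
# segmented memory array. Parameter values are assumed, not from the paper.

def elmore_bitline_delay(n_rows: int, r_per_cell: float, c_per_cell: float) -> float:
    """Elmore delay of a uniform distributed R-C bit-line with n_rows cells."""
    # For a uniform RC ladder the Elmore delay is
    # sum_{i=1..n} (i * R) * C = R * C * n * (n + 1) / 2, i.e. O(n^2).
    return r_per_cell * c_per_cell * n_rows * (n_rows + 1) / 2

def segmented_delay(n_rows: int, n_segments: int, r: float, c: float) -> float:
    """Worst-case delay when the bit-line is cut into equal-size segments:
    only one segment's distributed RC is active per access."""
    return elmore_bitline_delay(n_rows // n_segments, r, c)

if __name__ == "__main__":
    R, C = 50.0, 2e-15                      # ohms / farads per cell (assumed)
    for k in (1, 4, 16):
        d = segmented_delay(1024, k, R, C)
        print(f"{k:2d} segments: worst-case bit-line delay ~ {d * 1e12:.1f} ps")
```

Because the distributed R-C delay grows roughly quadratically with the number of cells on the line, even a modest dividing factor shortens the dominant delay term substantially.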


Data or instructions that are used regularly are saved in the cache so that they can be retrieved quickly, which increases cache performance. When evaluating the execution of multi-core systems, the role of the cache memory is very important. A multicore processor is a single circuit in which two or more processor cores are combined to enhance performance and perform multiple tasks. This paper describes the performance of cache memory in terms of cache access time, miss rate, and miss penalty. Cache mapping methods are designed to increase cache performance, but they face many difficulties. Various methods and algorithms are used to mitigate these difficulties. This paper presents a study of recent competing processors to evaluate cache memory performance.
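
The three quantities this study measures combine in the standard average memory access time (AMAT) relation; a small sketch follows, with assumed, illustrative numbers rather than figures from the paper.

```python
# Sketch of the standard average memory access time (AMAT) relation that ties
# together access time, miss rate, and miss penalty. Numbers are illustrative.

def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """AMAT = hit time + miss rate * miss penalty (all times in cycles)."""
    return hit_time + miss_rate * miss_penalty

if __name__ == "__main__":
    # e.g. a 2-cycle L1 hit, 5% miss rate, 100-cycle penalty to main memory
    print(f"AMAT = {amat(2, 0.05, 100):.1f} cycles")
```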


Author(s):  
Sunil Pathak

Background: Significant work has been presented on identifying suspects, gathering information, and examining video from CCTV footage. This research work aims to recognize suspicious activities, i.e., object exchange, entry of a new person, peeping into another's answer sheet, and person exchange, from video captured by a surveillance camera during examinations. This requires face recognition, hand recognition, and detection of contact between the face and hands of the same person as well as between different people. Methods: Segmented frames are given as input to obtain foreground images with the help of Gaussian filtering and a background modeling method. These foreground images are then given to an activity recognition model to detect normal or suspicious activity. Results: Accuracy rate, precision, and recall are calculated for activity detection and contact detection in the best case, average case, and worst case. Simulation results are compared across scenarios such as material exchange, position exchange, introduction of a new person, face and hand detection, and a multi-person scenario. Conclusion: In this paper, a framework is prepared for suspect detection. This framework promises to bring about a significant change in the field of security surveillance in the education domain.
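
A minimal sketch of the foreground-extraction step named in Methods is given below, assuming OpenCV; the MOG2 subtractor stands in for the paper's (unspecified) background modeling method, and the kernel size is an arbitrary choice.

```python
# Sketch of the foreground-extraction step: Gaussian filtering followed by
# background modeling. MOG2 is a stand-in for the paper's background model.
import cv2

def extract_foreground(video_path: str):
    cap = cv2.VideoCapture(video_path)
    bg_model = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
    masks = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Gaussian filtering suppresses sensor noise before subtraction.
        blurred = cv2.GaussianBlur(frame, (5, 5), 0)
        # The background model yields a foreground mask that would feed the
        # downstream activity-recognition stage.
        masks.append(bg_model.apply(blurred))
    cap.release()
    return masks
```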


2018 ◽  
Vol 27 (07) ◽  
pp. 1850116
Author(s):  
Yuanxin Bao ◽  
Wenyuan Li

A high-speed, low-supply-sensitivity temperature sensor is presented for thermal monitoring of a system on a chip (SoC). The proposed sensor transforms temperature into a complementary-to-absolute-temperature (CTAT) frequency and then into a digital code. A CTAT voltage reference supplies a temperature-sensitive ring oscillator, which enhances temperature sensitivity and conversion rate. To reduce the supply sensitivity, an operational amplifier with unity gain for the power supply is proposed. A frequency-to-digital converter with piecewise linear fitting converts the frequency into the digital code corresponding to temperature and corrects nonlinearity. These additional characteristics distinguish the design from conventional oscillator-based temperature sensors. The sensor is fabricated in a 180 nm CMOS process and occupies a small area of 0.048 mm² excluding bondpads. After a one-point calibration, the sensor achieves an inaccuracy of ±1.5 °C from −45 °C to 85 °C under a supply voltage of 1.4–2.4 V, showing a worst-case supply sensitivity of 0.5 °C/V. The sensor maintains a high conversion rate of 45 kS/s with a fine resolution of 0.25 °C/LSB, which is suitable for SoC thermal monitoring. Under a supply voltage of 1.8 V, the maximum energy consumption per conversion is only 7.8 nJ at −45 °C.
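
To make the frequency-to-digital conversion step concrete, here is a sketch of piecewise linear fitting from a CTAT oscillator frequency to temperature; the calibration knots below are hypothetical values, not data from the paper.

```python
# Sketch of piecewise linear fitting from oscillator frequency to temperature.
# The (frequency, temperature) knots are hypothetical; frequency falls with
# temperature because the oscillator is complementary to absolute temperature.
import bisect

KNOTS = [(900.0, -45.0), (700.0, 0.0), (520.0, 45.0), (360.0, 85.0)]

def freq_to_temp(f_khz: float) -> float:
    """Linearly interpolate temperature between the two surrounding knots."""
    freqs = [-k[0] for k in KNOTS]          # negate so the list is ascending
    i = bisect.bisect_left(freqs, -f_khz)
    i = min(max(i, 1), len(KNOTS) - 1)      # clamp to a valid segment
    (f0, t0), (f1, t1) = KNOTS[i - 1], KNOTS[i]
    return t0 + (t1 - t0) * (f_khz - f0) / (f1 - f0)

print(freq_to_temp(600.0))  # temperature estimate for a 600 kHz reading
```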


2014 ◽  
Vol 2014 ◽  
pp. 1-11
Author(s):  
Wei Zhou ◽  
Zilong Tan ◽  
Shaowen Yao ◽  
Shipu Wang

Resource location in structured P2P systems has a critical influence on system performance. Existing analytical studies of the Chord protocol have shown some potential improvements in performance. In this paper, a new splay-tree-based Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. An adaptive routing algorithm is proposed for the implementation, and it can be shown that the hop count is significantly reduced without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm as compared to Chord variants, and demonstrate sharp upper and lower bounds in both worst-case and average-case settings. In addition, we theoretically analyze the hop reduction in SChord and show that SChord can significantly reduce the routing hops compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results confirm the efficiency of SChord.
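
For context, here is a minimal sketch of the standard Chord finger table that SChord extends; SChord's splay-tree layer and extra routing entries are not reproduced, only the baseline structure the paper builds on.

```python
# Minimal sketch of a standard Chord finger table on a small example ring.

M = 8                         # identifier bits, so the ring has 2**M slots
ring = sorted([3, 20, 45, 87, 130, 201, 250])   # example node identifiers

def successor(key: int) -> int:
    """First node clockwise from key on the identifier ring."""
    for node in ring:
        if node >= key:
            return node
    return ring[0]            # wrap around past the largest identifier

def finger_table(n: int):
    """finger[i] = successor((n + 2**i) mod 2**M) for i = 0..M-1."""
    return [successor((n + 2 ** i) % 2 ** M) for i in range(M)]

print(finger_table(20))       # routing entries held by node 20
```

In SChord, the splay-tree organization would let frequently routed targets migrate toward the root, so repeated lookups traverse fewer hops than with the plain table above.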


Algorithmica ◽  
2021 ◽  
Author(s):  
Jie Zhang

Apart from the principles and methodologies inherited from Economics and Game Theory, the studies in Algorithmic Mechanism Design typically employ the worst-case analysis and design of approximation schemes of Theoretical Computer Science. For instance, the approximation ratio, which is the canonical measure of evaluating how well an incentive-compatible mechanism approximately optimizes the objective, is defined in the worst-case sense: it compares the performance of the optimal mechanism against the performance of a truthful mechanism over all possible inputs. In this paper, we take the average-case analysis approach and tackle one of the primary motivating problems in Algorithmic Mechanism Design: the scheduling problem (Nisan and Ronen, in: Proceedings of the 31st annual ACM symposium on theory of computing (STOC), 1999). One version of this problem, which includes a verification component, was studied by Koutsoupias (Theory Comput Syst 54(3):375–387, 2014). It was shown that the problem has a tight approximation ratio bound of $(n+1)/2$ for the single-task setting, where n is the number of machines. We show, however, that when the costs of the machines for executing the task follow any independent and identical distribution, the average-case approximation ratio of the mechanism given by Koutsoupias (Theory Comput Syst 54(3):375–387, 2014) is upper bounded by a constant. This positive result asymptotically separates the average-case ratio from the worst-case ratio. It indicates that the optimal mechanism devised for a worst-case guarantee works well on average.
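
To illustrate what an average-case approximation ratio measures, the Monte Carlo sketch below samples i.i.d. machine costs and averages the ratio of a randomized allocation's expected makespan to the optimum. The allocation rule used (probability proportional to 1/cost) is a hypothetical stand-in for illustration only, not the mechanism of Koutsoupias (2014).

```python
# Monte Carlo estimate of an average-case approximation ratio for single-task
# scheduling. The allocation rule here is an illustrative stand-in, NOT the
# Koutsoupias (2014) mechanism.
import random

def avg_case_ratio(n_machines: int, trials: int = 20_000) -> float:
    total = 0.0
    for _ in range(trials):
        costs = [random.uniform(0.1, 1.0) for _ in range(n_machines)]
        weights = [1.0 / c for c in costs]          # allocation weights
        # expected makespan of the randomized allocation:
        # sum_i p_i * c_i with p_i proportional to 1/c_i
        expected = sum(c * w for c, w in zip(costs, weights)) / sum(weights)
        total += expected / min(costs)              # optimum is the min cost
    return total / trials

for n in (2, 5, 10):
    print(n, round(avg_case_ratio(n), 3))
```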


2010 ◽  
Vol 5 (1) ◽  
pp. 78-88 ◽  
Author(s):  
Marcelo Porto ◽  
André Silva ◽  
Sergo Almeida ◽  
Eduardo Da Costa ◽  
Sergio Bampi

This paper presents a real-time HDTV (High Definition Television) architecture for Motion Estimation (ME) using efficient adder compressors. The architecture is based on the Quarter Sub-sampled Diamond Search (QSDS) algorithm with the Dynamic Iteration Control (DIC) algorithm. The main characteristic of the proposed architecture is the large number of Processing Units (PUs) used to calculate the SAD (Sum of Absolute Differences) metric. The internal structures of the PUs contain a large number of addition operations to calculate the SADs. In this paper, efficient 4-2 and 8-2 adder compressors are used in the PU architecture to achieve the performance needed to process HDTV video in real time at 30 frames per second. These adder compressors enable the simultaneous addition of 4 and 8 operands, respectively. The PUs using adder compressors were applied to the ME architecture. The implemented architecture was described in VHDL and synthesized to FPGA and, with the Leonardo Spectrum tool, to the TSMC 0.18 μm CMOS standard cell technology. Synthesis results indicate that the new QSDS-DIC architecture achieves the best performance and enables gains of 12% in terms of processing rate. The architecture achieves real time for full HDTV (1920x1080 pixels), processing 65 frames per second in the worst case and 269 HDTV frames per second in the average case.
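
The SAD metric the Processing Units compute is plain element-wise arithmetic; the sketch below shows it in software form, while the paper's contribution is the 4-2 and 8-2 adder-compressor hardware that sums these terms in parallel.

```python
# Sketch of the SAD (sum of absolute differences) metric used in motion
# estimation: compare a current-frame block against a candidate block.

def sad(block_a, block_b) -> int:
    """SAD between two equally sized pixel blocks (lists of rows)."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(block_a, block_b)
        for pa, pb in zip(row_a, row_b)
    )

current   = [[10, 12], [11, 13]]     # 2x2 block from the current frame
candidate = [[ 9, 15], [11, 10]]     # candidate block in the reference frame
print(sad(current, candidate))       # -> 7
```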


2010 ◽  
Vol DMTCS Proceedings vol. AM,... (Proceedings) ◽  
Author(s):  
Thomas Fernique ◽  
Damien Regnault

This paper introduces a Markov process inspired by the problem of quasicrystal growth. It acts over dimer tilings of the triangular grid by randomly performing local transformations, called $\textit{flips}$, which do not increase the number of identical adjacent tiles (this number can be thought of as the tiling energy). Fixed points of such a process play the role of quasicrystals. We are here interested in the worst-case expected number of flips to converge towards a fixed point. Numerical experiments suggest a $\Theta(n^2)$ bound, where $n$ is the number of tiles of the tiling. We prove a $O(n^{2.5})$ upper bound and discuss the gap between this bound and the previous one. We also briefly discuss the average case.
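
As a toy illustration of the quantities involved (not the paper's triangular-grid dimer setting), the sketch below runs a much-simplified one-dimensional analog: "energy" counts identical adjacent symbols, a flip here is a local change that strictly decreases the energy (a simplification of the paper's non-increasing flips), and we count flips until a fixed point is reached.

```python
# Much-simplified 1D analog of energy-decreasing flip dynamics: flip single
# symbols of a binary string until no flip can lower the energy, counting
# flips along the way. Illustrative only; not the dimer-tiling process.
import random

def energy(s):
    """Number of identical adjacent symbols (the analog of tiling energy)."""
    return sum(a == b for a, b in zip(s, s[1:]))

def run(n: int, seed: int = 0) -> int:
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n)]
    flips = 0
    while True:
        cands = [i for i in range(n)
                 if energy(s[:i] + [1 - s[i]] + s[i + 1:]) < energy(s)]
        if not cands:
            return flips          # fixed point: no energy-decreasing flip left
        i = rng.choice(cands)
        s[i] = 1 - s[i]
        flips += 1

print([run(64, seed) for seed in range(3)])   # flip counts for three runs
```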


2006 ◽  
Vol 6 (6) ◽  
pp. 483-494
Author(s):  
T. Tulsi ◽  
L.K. Grover ◽  
A. Patel

The standard quantum search lacks a feature, enjoyed by many classical algorithms, of having a fixed point, i.e., monotonic convergence towards the solution. Recently a fixed-point quantum search algorithm has been discovered, referred to as the Phase-\pi/3 search algorithm, which gets around this limitation. While searching a database for a target state, this algorithm reduces the error probability from \epsilon to \epsilon^{2q+1} using q oracle queries, which has since been proved to be asymptotically optimal. A different algorithm is presented here, which has the same worst-case behavior as the Phase-\pi/3 search algorithm but much better average-case behavior. Furthermore, the new algorithm gives \epsilon^{2q+1} convergence for all integral q, whereas the Phase-\pi/3 search algorithm requires q to be (3^{n}-1)/2 with n a positive integer. In the new algorithm, the operations are controlled by two ancilla qubits, and fixed-point behavior is achieved by irreversible measurement operations applied to these ancillas. It is an example of how measurement can allow us to bypass some restrictions imposed by unitarity on quantum computing.
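
The quoted error-suppression rates can be checked numerically: n levels of the Phase-\pi/3 recursion cube the error each time, which matches \epsilon^{2q+1} exactly when q = (3^n - 1)/2. A short sketch:

```python
# Numeric check of the error-suppression relation: n recursion levels of the
# Phase-pi/3 algorithm cube the error each time, i.e. eps -> eps**(3**n),
# and 3**n = 2q + 1 for q = (3**n - 1) / 2 oracle queries.

eps = 0.2
for n in range(1, 5):
    q = (3 ** n - 1) // 2                         # queries for n levels
    assert 3 ** n == 2 * q + 1                    # exponents agree exactly
    print(f"n={n}: q={q:3d} queries, error {eps ** (2 * q + 1):.3e}")
```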

