Phase-Aware Cache Partitioning to Target Both Turnaround Time and System Performance

2020 ◽  
Vol 31 (11) ◽  
pp. 2556-2568 ◽  
Author(s):  
Lucia Pons ◽  
Julio Sahuquillo ◽  
Vicent Selfa ◽  
Salvador Petit ◽  
Julio Pons
2018 ◽  
Vol 7 (2) ◽  
pp. 837
Author(s):  
S Gokuldev ◽  
Jathin R

Scheduling tasks with low energy consumption while maintaining high performance is one of the major concerns in distributed computing. Most existing systems achieve improved energy efficiency but compromise on QoS metrics such as makespan and resource utilization. A resource scheduling strategy for wireless clusters is proposed that makes careful scheduling decisions to improve the battery life of nodes. The proposed strategy also incorporates a monitoring system within the clusters for optimizing both system performance and energy consumption. The system ensures "any case zero loss" performance, wherein each cluster is monitored by at least one cluster monitor. This is implemented by using predictive calculation at each cluster monitor so that communication occurs only when absolutely essential; when assigning jobs to resources, the strategy selects the most power-efficient resource among the available idle resources within the cluster. The experimental results show improved system performance with low power consumption in a homogeneous computing environment. The resource sharing strategy is experimentally analyzed through simulations, considering important performance metrics such as starvation deadline, turnaround time, and miss-hit count. Significant results were observed with improved efficiency.
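
The core selection step can be illustrated with a minimal sketch (the class and field names below are hypothetical, not from the paper): an incoming job is assigned to the idle resource with the lowest estimated power draw within the cluster.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Resource:
    name: str
    idle: bool               # whether the resource is currently free
    power_draw_watts: float  # estimated power draw when executing a job

def pick_power_efficient_resource(resources: List[Resource]) -> Optional[Resource]:
    """Return the idle resource with the lowest estimated power draw, or None."""
    idle = [r for r in resources if r.idle]
    if not idle:
        return None  # the cluster monitor would defer or forward the job
    return min(idle, key=lambda r: r.power_draw_watts)

# Example: the cluster monitor assigns an incoming job to node-b (lowest draw).
cluster = [
    Resource("node-a", idle=True, power_draw_watts=95.0),
    Resource("node-b", idle=True, power_draw_watts=60.0),
    Resource("node-c", idle=False, power_draw_watts=40.0),
]
print(pick_power_efficient_resource(cluster).name)  # -> node-b
```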


2010 ◽  
Vol 439-440 ◽  
pp. 1587-1594
Author(s):  
Shuo Li ◽  
Feng Wu

In a chip multiprocessor with a shared cache, competing accesses from different applications degrade system performance and result in unpredictable execution times. Cache partitioning techniques can exclusively partition the shared cache among multiple competing applications. In this paper, the authors design the framework of Process Priority-based Multithread Cache Partitioning (PP-MCP), a dynamic shared cache partitioning mechanism to improve the performance of multi-threaded multi-programmed workloads. The framework includes a miss rate monitor, called the Application-oriented Miss Rate Monitor (AMRM), which dynamically collects miss rate information of multiple multi-threaded applications on different cache partitions, and a process priority-based weighted cache partitioning algorithm, which extends traditional miss-rate-oriented cache partitioning algorithms. The algorithm allocates cache in order of process priority, ensuring that the highest-priority process gets enough cache space, and applications with more threads tend to get more shared cache in order to improve overall system performance. Experiments show that PP-MCP achieves better IPC throughput and weighted speedup. Specifically, for multi-threaded multi-programmed scientific computing workloads, PP-MCP-1 improves throughput by up to 20% and on average 10% over PP-MCP-0.
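
A priority- and thread-count-weighted way allocation can be sketched as follows. This is a minimal illustration under the assumption that cache ways are distributed in proportion to priority times thread count; it is not the paper's exact PP-MCP algorithm, and the function and parameter names are hypothetical.

```python
def partition_ways(total_ways, apps):
    """Split `total_ways` cache ways among apps in proportion to
    priority * thread_count (a sketch, not the exact PP-MCP policy).

    `apps` maps application name -> (priority, thread_count).
    """
    weights = {name: prio * threads for name, (prio, threads) in apps.items()}
    total_weight = sum(weights.values())
    # Proportional base allocation (floor), then hand out the remaining
    # ways one at a time, highest-weight application first.
    alloc = {name: (total_ways * w) // total_weight for name, w in weights.items()}
    leftover = total_ways - sum(alloc.values())
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        alloc[name] += 1
    return alloc

# Example: 16 ways; a high-priority 4-thread app dominates the allocation.
print(partition_ways(16, {"A": (3, 4), "B": (1, 2), "C": (1, 1)}))
# -> {'A': 13, 'B': 2, 'C': 1}
```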


2017 ◽  
Vol 7 (2) ◽  
pp. 10-26 ◽  
Author(s):  
Saad Bani-Mohammad

Contiguous processor allocation is useful for security and accounting reasons, because the allocated jobs are isolated from one another: each sub-mesh of processors is allocated to an exclusive job request, and the allocated sub-mesh has the same size and shape as the requested job. This size and shape constraint leads to high processor fragmentation. Most recent contiguous allocation strategies suggested for 3D mesh-connected multicomputers try all possible orientations of an allocation request when allocation fails for the requested orientation, which reduces processor fragmentation and hence improves system performance. However, none of them considers all shapes of the request in the process of allocation. To generalize this restricted rotation, we propose, in this paper, a new contiguous allocation strategy for 3D mesh-connected multicomputers, referred to as All Shapes Busy List (ASBL for short), which takes into consideration all possible contiguous request shapes when attempting allocation for a job request. ASBL depends on the list of allocated sub-meshes, in the method suggested in (Bani-Mohammad et al., 2006), for selecting an allocated sub-mesh. The performance of the proposed ASBL allocation strategy has been evaluated considering several important scheduling strategies under a variety of system loads based on different job size distributions. The simulation results show that the ASBL allocation strategy improves system performance in terms of parameters such as the average turnaround time of jobs and system utilization under all scheduling strategies considered.
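
The "all shapes" idea can be illustrated with a short sketch that enumerates every cuboid sub-mesh shape whose volume matches the requested processor count and that fits inside the mesh. This is only an illustration of the shape enumeration step, under assumed names; the actual ASBL strategy also uses the busy list of allocated sub-meshes to test whether a candidate shape can be placed.

```python
def candidate_shapes(num_procs, mesh_dims):
    """Enumerate every cuboid shape (a, b, c) with a*b*c == num_procs
    that fits inside a 3D mesh of dimensions mesh_dims = (X, Y, Z)."""
    X, Y, Z = mesh_dims
    shapes = []
    for a in range(1, min(num_procs, X) + 1):
        if num_procs % a:
            continue
        rest = num_procs // a
        for b in range(1, min(rest, Y) + 1):
            if rest % b:
                continue
            c = rest // b
            if c <= Z:
                shapes.append((a, b, c))
    return shapes

# Example: a request for 12 processors in an 8x8x8 mesh could be served by
# any of these contiguous sub-mesh shapes, not just rotations of one shape.
print(candidate_shapes(12, (8, 8, 8)))
```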


2010 ◽  
Vol 439-440 ◽  
pp. 1223-1229
Author(s):  
Shuo Li ◽  
Gao Chao Xu ◽  
Yu Shuang Dong ◽  
Feng Wu

With the development of microelectronics technology, the Chip Multi-Processor (CMP), or multi-core design, has become a mainstream choice for major microprocessor vendors. However, in a chip multiprocessor with a shared cache, competing accesses from different applications degrade system performance, resulting in non-optimal performance and unpredictable execution times. Cache partitioning techniques can exclusively partition the shared cache among multiple competing applications. In this paper, we first introduce the problems caused by cache pollution in multicore processor architectures; we then present the different methods of cache partitioning in multicore processors, categorizing them based on the metrics they target. Finally, we discuss some possible directions for future research in the area.


Author(s):  
Kuo-Chan Huang ◽  
Po-Chi Shih ◽  
Yeh-Ching Chung

Most current grid environments are established through collaboration among a group of participating sites which volunteer to provide free computing resources. Therefore, feasible load sharing policies that benefit all sites are an important incentive for attracting computing sites to join and stay in a grid environment. Moreover, a grid environment is usually heterogeneous in nature, if only because of the different computing speeds at the participating sites. This chapter explores the feasibility and effectiveness of load sharing activities in a heterogeneous computational grid. Several issues are discussed, including site selection policies as well as feasible load sharing mechanisms. Promising policies are evaluated in a series of simulations based on workloads derived from real traces. The results show that grid computing is capable of significantly improving the overall system performance in terms of average turnaround time for user jobs.
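
One representative site selection policy can be sketched as choosing the site with the smallest estimated completion time, accounting for both queued work and relative site speed. This is a hypothetical illustration (class names, fields, and the cost formula are assumptions, not the chapter's exact policy).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Site:
    name: str
    speed: float        # relative computing speed (1.0 = reference site)
    queued_work: float  # outstanding work, in reference-speed CPU hours

def select_site(sites: List[Site], job_work: float) -> Site:
    """Pick the site with the smallest estimated completion time for the job:
    (already queued work + this job's work) scaled by the site's speed."""
    return min(sites, key=lambda s: (s.queued_work + job_work) / s.speed)

# Example: the faster site wins despite its longer queue.
sites = [Site("siteA", speed=1.0, queued_work=10.0),
         Site("siteB", speed=2.0, queued_work=16.0)]
print(select_site(sites, job_work=4.0).name)  # -> siteB (20/2 < 14/1)
```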


2004 ◽  
Vol 13 (02) ◽  
pp. 159-181 ◽  
Author(s):  
JIN-WOOK BAEK ◽  
JAE-HEUNG YEO ◽  
GYU-TAE KIM ◽  
HEON-YOUNG YEOM

Two significant performance factors in Mobile Agent Planning (MAP) for distributed information retrieval are the number of mobile agents and the total execution time. Using fewer mobile agents results in less network traffic and consumes less bandwidth. Regardless of the number of agents used, the total execution time for a task must be kept to a minimum. A retrieval service must minimize both these factors for better system performance, and at the same time, it must be able to supply the required information to users as quickly as possible. In this paper, we propose heuristic algorithms, called Cost-Effective MAP (CEMAP), to minimize both the number of mobile agents and the total execution time under the condition that the turnaround time is kept to a minimum. Although these algorithms tend to slightly increase the planning cost, a simulation study shows that these algorithms enhance the system performance significantly. By adopting these algorithms, systems can maintain lower network traffic while satisfying the minimum turnaround time.
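
The trade-off can be illustrated with a simple greedy packing sketch: retrieval tasks are assigned to as few mobile agents as possible while each agent's total itinerary time stays within a turnaround bound. This is a generic first-fit-decreasing stand-in under assumed names, not the paper's CEMAP heuristics.

```python
def plan_agents(site_times, turnaround_bound):
    """Greedy sketch: pack retrieval tasks onto as few mobile agents as
    possible while each agent's total itinerary time stays within the bound.

    `site_times` maps site -> expected retrieval time at that site.
    Returns a list of itineraries (one list of sites per agent).
    """
    itineraries = []   # each entry: the sites visited by one agent
    loads = []         # total itinerary time of the corresponding agent
    # Largest tasks first (first-fit decreasing) tends to need fewer agents.
    for site, t in sorted(site_times.items(), key=lambda kv: -kv[1]):
        for i, load in enumerate(loads):
            if load + t <= turnaround_bound:
                itineraries[i].append(site)
                loads[i] += t
                break
        else:
            itineraries.append([site])
            loads.append(t)
    return itineraries

# Example: five sites, each agent may spend at most 10 time units in total.
times = {"s1": 6, "s2": 5, "s3": 4, "s4": 3, "s5": 2}
print(plan_agents(times, turnaround_bound=10))
# -> [['s1', 's3'], ['s2', 's4', 's5']]  (two agents instead of five)
```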


Author(s):  
P. B. Basham ◽  
H. L. Tsai

The use of transmission electron microscopy (TEM) to support process development of advanced microelectronic devices is often challenged by the large number of samples submitted from wafer fabrication areas and by specific-spot analysis. Improving TEM sample preparation techniques for a fast turnaround time is critical in order to provide timely support for customers and improve the utilization of the TEM. For specific-area sample preparation, a technique that requires the least amount of effort is preferred. For these reasons, we have developed several techniques which have greatly facilitated TEM sample preparation. For specific-area analysis, the use of a copper grid with a small hole is found to be very useful. With this small-hole grid technique, TEM sample preparation can proceed by well-established conventional methods. The sample is first polished to the area of interest, which is then carefully positioned inside the hole. This polished side is placed against the grid with epoxy. Fig. 1 is an optical image of a TEM cross-section after dimpling to light transmission.


1960 ◽  
Author(s):  
S. Seidenstein ◽  
R. Chernikoff ◽  
F. V. Taylor

Author(s):  
Christopher Wickens ◽  
Jack Isreal ◽  
Gregory McCarthy ◽  
Daniel Gopher ◽  
Emanuel Donchin
