Enhancement of cloud performance metrics using dynamic degree memory balanced allocation algorithm

Author(s):  
Aparna Shashikant Joshi ◽  
Shayamala Devi Munisamy

In cloud computing, load balancing among the resources is required to schedule a task, which is a key challenge. This paper proposes a dynamic degree memory balanced allocation (D2MBA) algorithm, which allocates a virtual machine (VM) to the most suitable host, based on the available random-access memory (RAM) and million instructions per second (MIPS) rating of the host, and allocates a task to the most suitable VM by considering the balanced condition of the VMs. The proposed D2MBA algorithm has been simulated using the CloudSim simulation tool, varying the number of tasks while keeping the number of VMs constant, and vice versa. The D2MBA algorithm is compared with other load balancing algorithms, viz. Round Robin (RR) and dynamic degree balance central processing unit (CPU)-based (D2B_CPU-based), with respect to performance parameters such as execution cost, degree of imbalance and makespan time. It is found that the D2MBA algorithm achieves a large reduction in execution cost, degree of imbalance and makespan time as compared with the RR and D2B_CPU-based algorithms.
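To make the allocation logic concrete, here is a minimal Python sketch in the spirit of the approach described above: VMs are placed on hosts by spare RAM and MIPS, tasks go to the VM whose queue finishes earliest, and the degree of imbalance is computed as DI = (T_max - T_min) / T_avg, a common convention in this literature. The field names and the exact balance rule are assumptions for illustration, not the paper's code.

```python
# Illustrative sketch of degree-/memory-aware allocation in the spirit of
# D2MBA; field names and the balance rule are assumptions, not the paper's code.

def place_vm(vm, hosts):
    """Pick the host with the most spare RAM and MIPS for this VM."""
    feasible = [h for h in hosts
                if h["free_ram"] >= vm["ram"] and h["free_mips"] >= vm["mips"]]
    if not feasible:
        return None
    # Prefer the host that stays best provisioned after placement.
    best = max(feasible, key=lambda h: (h["free_ram"] - vm["ram"]) +
                                       (h["free_mips"] - vm["mips"]))
    best["free_ram"] -= vm["ram"]
    best["free_mips"] -= vm["mips"]
    return best

def assign_task(task_len, vms):
    """Send the task to the VM whose queue would finish earliest (balanced)."""
    vm = min(vms, key=lambda v: v["queued_len"] / v["mips"])
    vm["queued_len"] += task_len
    return vm

def makespan_and_imbalance(vms):
    """Makespan = slowest VM's finish time; DI = (T_max - T_min) / T_avg."""
    times = [v["queued_len"] / v["mips"] for v in vms]
    tmax, tmin = max(times), min(times)
    tavg = sum(times) / len(times)
    return tmax, (tmax - tmin) / tavg
```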

Materials ◽  
2020 ◽  
Vol 13 (16) ◽  
pp. 3532 ◽  
Author(s):  
Qiao-Feng Ou ◽  
Bang-Shu Xiong ◽  
Lei Yu ◽  
Jing Wen ◽  
Lei Wang ◽  
...  

Recent progress in the development of artificial intelligence technologies, aided by deep learning algorithms, has led to an unprecedented revolution in neuromorphic circuits, bringing us ever closer to brain-like computers. However, the vast majority of advanced algorithms still have to run on conventional computers. Thus, their capacities are limited by what is known as the von Neumann bottleneck, where the central processing unit for data computation and the main memory for data storage are separated. Emerging forms of non-volatile random access memory, such as ferroelectric random access memory, phase-change random access memory, magnetic random access memory, and resistive random access memory, are widely considered to offer the best prospect of circumventing the von Neumann bottleneck. This is due to their ability to merge storage and computational operations, such as Boolean logic. This paper reviews the most common kinds of non-volatile random access memory and their physical principles, together with their relative pros and cons when compared with conventional complementary metal oxide semiconductor (CMOS)-based circuits. Their potential application to Boolean logic computation is then considered in terms of their working mechanisms, circuit design and performance metrics. The paper concludes by envisaging the prospects offered by non-volatile devices for future brain-inspired and neuromorphic computation.
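To make the "merged storage and computation" idea concrete, here is a toy truth-table model of stateful IMPLY logic, the primitive most often demonstrated on resistive memory arrays. It abstracts away all device physics and is only a sketch of the logical behaviour, not a device model.

```python
# Toy bit-level model of stateful IMPLY logic, the primitive commonly used to
# show in-memory Boolean computation on resistive cells; this is an
# illustrative abstraction, not a device model.

def imply(p, q):
    """q <- p IMP q = (NOT p) OR q, computed in place on the stored bits."""
    return (not p) or q

def nand(p, q):
    """NAND from two IMPLY steps and one cell preset to 0 (FALSE)."""
    s = 0                 # working cell initialised to logic 0
    s = imply(q, s)       # s = NOT q
    return imply(p, s)    # (NOT p) OR (NOT q) = NAND(p, q)

# NAND is functionally complete, so any Boolean circuit can in principle be
# mapped onto sequences of in-array IMPLY operations.
for p in (0, 1):
    for q in (0, 1):
        assert nand(p, q) == (not (p and q))
```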


English Today ◽  
2001 ◽  
Vol 17 (3) ◽  
pp. 24-30
Author(s):  
Paul Bruthiaux

The rapid spread of Information Technology (IT) in recent years, and the role it plays in many aspects of our lives, has not left language use untouched. A manifestation of this role is the degree of linguistic creativity that has accompanied technological innovation. In English, this creativity is seen in the semantic relabeling of established terms such as web, bug, virus, firewall, etc. Another strategy favored by IT lexifiers is the use of lexical items clustered in heavy premodifying groups, as in random access memory, disk operating system, central processing unit, and countless others (White, 1999). In brief, IT, and in particular the World Wide Web, has made it possible for users to break free of many linguistic codes and conventions (Lemke, 1999). For the linguist, the happy outcome of the spread of IT is that it has created an opportunity to analyze the simultaneous development of technology and the language that encodes it, and the influence of one on the other (Stubbs, 1997). To linguists of a broadly functional disposition, this is a chance to confirm the observation that scientific language differs substantially from everyday language. More importantly, it is also a chance to verify the claim made chiefly by Halliday & Martin (1993) that this difference in the characteristics of each of these discourses stems from a radical difference between scientific and common-sense construals of the world around us.


Author(s):  
Wesley Petersen ◽  
Peter Arbenz

Since first proposed by Gordon Moore (an Intel founder) in 1965, his law [107] that the number of transistors on microprocessors doubles roughly every one to two years has proven remarkably astute. Its corollary, that central processing unit (CPU) performance would also double every two years or so, has also remained prescient. Figure 1.1 shows Intel microprocessor data on the number of transistors beginning with the 4004 in 1972. Figure 1.2 indicates that when one includes multi-processor machines and algorithmic development, computer performance is actually better than Moore's 2-year performance doubling time estimate.

Alas, however, in recent years a disagreeable mismatch has developed between CPU and memory performance: CPUs now outperform memory systems by orders of magnitude, according to some reckoning [71]. This is not completely accurate, of course: it is mostly a matter of cost. In the 1980s and 1990s, Cray Research Y-MP series machines had well-balanced CPU-to-memory performance. Likewise, NEC (Nippon Electric Corp.), using CMOS (see glossary, Appendix F) and direct memory access, has well-balanced CPU/memory performance. ECL (see glossary, Appendix F) and CMOS static random access memory (SRAM) systems were and remain expensive, and like their CPU counterparts they have to be carefully kept cool. Worse, because they have to be cooled, close packing is difficult, and such systems tend to have small storage per volume. Almost any personal computer (PC) these days has a much larger memory than the supercomputer memory systems of the 1980s or early 1990s.

In consequence, nearly all memory systems these days are hierarchical, frequently with multiple levels of cache. Figure 1.3 shows the diverging trends between CPU and memory performance. Dynamic random access memory (DRAM) in some variety has become standard for bulk memory. There are many projects and ideas about how to close this performance gap, for example the IRAM [78] and RDRAM projects [85]. We are confident that this disparity between CPU and memory access performance will eventually be tightened, but in the meantime, we must deal with the world as it is.
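A quick back-of-the-envelope calculation shows how fast such a gap compounds. The annual improvement rates below are illustrative assumptions in the spirit of figures often quoted for that era, not numbers from the book.

```python
# Back-of-the-envelope view of the CPU/memory gap the text describes.
# The yearly speedup factors are illustrative assumptions (roughly the
# figures often quoted for the 1990s), not data from the book.

cpu_rate, dram_rate = 1.55, 1.07   # assumed yearly improvement factors

for years in (5, 10, 15):
    gap = (cpu_rate ** years) / (dram_rate ** years)
    print(f"after {years:2d} years the CPU/DRAM performance ratio grows ~{gap:,.0f}x")
```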


2020 ◽  
Vol 6 (2) ◽  
pp. 61-66
Author(s):  
Shumaya Resty Ramadhani

Rapid technological development has changed the habits of technology users. Devices have evolved alongside, from supercomputers down to small smartphones with comparable performance. The large number of technology enthusiasts switching to these smart devices opens up considerable opportunities for application development. Mobile applications must remain fast and lightweight even when run on older smartphones or devices with limited specifications, especially applications that rely on visualization and animation to attract users. The iOS operating system provides CoreFramework, which supports creating large numbers of objects and animations quickly and with low overhead. Therefore, a simple general-graph visualization application was built using CoreFramework to test how much the framework affects application quality, particularly on older device series. The test criteria use three basic variables: elapsed time, and the allocated central processing unit (CPU) and random access memory (RAM) usage. The results show that although CoreFramework uses the graphics processing unit (GPU) for its processing, the application still needs at least 2 GB of RAM on the smartphone for responsiveness to be maintained. This is because when RAM capacity is small, the application draws on the CPU allocation quite significantly in order to run well.
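The measurement loop behind such test criteria can be sketched in a few lines. The study used iOS tooling, so the psutil-based Python below is only a platform-neutral stand-in for sampling elapsed time, CPU share, and resident RAM of an app process.

```python
# Minimal cross-platform sketch of the kind of measurement the study
# describes (elapsed time, CPU share, resident RAM of a target process).
# The real tests ran on iOS; psutil here is only a stand-in.

import time
import psutil  # third-party: pip install psutil

def sample(pid, seconds=10, period=1.0):
    proc = psutil.Process(pid)
    t0 = time.monotonic()
    while time.monotonic() - t0 < seconds:
        cpu = proc.cpu_percent(interval=period)      # % of one core
        rss = proc.memory_info().rss / (1024 ** 2)   # resident set, MB
        print(f"t={time.monotonic() - t0:5.1f}s  cpu={cpu:5.1f}%  ram={rss:7.1f} MB")
```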


Author(s):  
Rafi U Zaman ◽  
Humaira M Alam ◽  
Khaleel Ur Rahman Khan ◽  
A. Venugopal Reddy

Internetworking of different types of networks is envisaged as one of the primary objectives of future 5G networks. Integrated Internet-MANET is a heterogeneous networking architecture that results from interconnecting the wired Internet and a wireless MANET. Multiprotocol gateways are used to achieve this interconnection. There are two types of Integrated Internet-MANET architectures, two-tier and three-tier. A combination of two-tier and three-tier architectures also exists, called the Hybrid Framework or Hybrid Integrated Internet-MANET. Some of the most important issues common to all Integrated Internet-MANET architectures are efficient gateway discovery, mobile node registration and gateway load balancing. Adaptive WLB-AODV is an existing protocol which addresses the issues of gateway load balancing and efficient gateway discovery. In this paper, an improvement to Adaptive WLB-AODV, called Adaptive Modified-WLB-AODV, is proposed by taking route latency into account. The proposed protocol has been implemented in a Hybrid Integrated Internet-MANET and simulated using the network simulator ns-2. Based on the simulation results, it is observed that the proposed protocol delivers better performance than the existing protocol in terms of the performance metrics end-to-end delay and packet loss ratio. The performance of the proposed protocol is further optimized using a genetic algorithm.
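As a hedged sketch of what "taking route latency into account" during gateway selection could look like, the snippet below scores gateways by a weighted mix of load and latency. The weighting scheme and field names are assumptions for illustration, not the protocol's actual metric.

```python
# Hedged sketch of latency-aware gateway selection in the spirit of the
# modification described above; the weighted score and field names are
# illustrative assumptions, not the protocol's actual metric.

def pick_gateway(gateways, w_load=0.5, w_latency=0.5):
    """Choose the gateway minimising a weighted mix of load and route latency."""
    def norm(vals):
        lo, hi = min(vals), max(vals)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in vals]

    loads = norm([g["load"] for g in gateways])          # e.g. queued packets
    lats = norm([g["route_latency"] for g in gateways])  # e.g. measured RTT, ms
    scores = [w_load * l + w_latency * t for l, t in zip(loads, lats)]
    return gateways[scores.index(min(scores))]

gws = [{"id": "gw1", "load": 40, "route_latency": 12.0},
       {"id": "gw2", "load": 15, "route_latency": 30.0}]
print(pick_gateway(gws)["id"])
```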


Author(s):  
Pravin Jagtap ◽  
Rupesh Nasre ◽  
V. S. Sanapala ◽  
B. S. V. Patnaik

Smoothed Particle Hydrodynamics (SPH) is fast emerging as a practically useful computational simulation tool for a wide variety of engineering problems. SPH is also gaining popularity as the backbone for fast and realistic animations in graphics and video games. The Lagrangian and mesh-free nature of the method facilitates fast and accurate simulation of material deformation, interface capture, etc. Typically, particle-based methods necessitate efficient implementations of particle search-and-locate algorithms, as the continuous creation of neighbor particle lists is a computationally expensive step. Hence, it is advantageous to implement SPH on modern multi-core platforms with the help of High-Performance Computing (HPC) tools. In this work, the computational performance of an SPH algorithm is assessed on a multi-core Central Processing Unit (CPU) as well as on massively parallel General-Purpose Graphics Processing Units (GP-GPU). Parallelizing SPH faces several challenges, such as scalability of the neighbor search process, force calculations, minimizing thread divergence, achieving coalesced memory access patterns, balancing workload, ensuring optimum use of computational resources, etc. While addressing some of these challenges, performance metrics such as speedup, global load efficiency, global store efficiency, warp execution efficiency and occupancy are analyzed in detail. The OpenMP and Compute Unified Device Architecture (CUDA) parallel programming models have been used for parallel computing on an Intel Xeon E5 multi-core CPU and on NVIDIA Quadro M-series and NVIDIA Tesla P-series massively parallel GPU architectures. Standard benchmark problems from the Computational Fluid Dynamics (CFD) literature are chosen for validation. The key concern of how to identify a suitable architecture for mesh-less methods, which essentially require a heavy workload of neighbor search and evaluation of local force fields from neighbor interactions, is addressed.
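The neighbor-search step that dominates SPH cost is classically implemented with a uniform grid (cell list), which restricts candidate pairs to adjacent cells. A minimal serial Python reference version, assuming 2D particles and interaction radius h, is sketched below; it is the kind of kernel the paper parallelizes with OpenMP and CUDA.

```python
# Serial reference sketch of the uniform-grid (cell list) neighbour search
# that dominates SPH cost; assumes 2D particles and interaction radius h.

from collections import defaultdict
from math import floor

def build_grid(positions, h):
    """Hash each particle into a square cell of side h (the smoothing length)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        grid[(floor(x / h), floor(y / h))].append(i)
    return grid

def neighbours(i, positions, grid, h):
    """Candidates come only from the 3x3 block of cells around particle i."""
    xi, yi = positions[i]
    cx, cy = floor(xi / h), floor(yi / h)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                if j != i and (positions[j][0] - xi) ** 2 + (positions[j][1] - yi) ** 2 < h * h:
                    out.append(j)
    return out
```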


Author(s):  
Yung Chin Shih ◽  
Eduardo Vila Gonçalves Filho

Abstract Recently, new types of layouts have been proposed in the literature in order to handle a large number of products. Among these is the fractal layout, which aims at minimizing routing distances. Researchers have already focused on its design; however, we have noticed that the current approach usually executes the allocation of fractal cells on the shop floor several times in order to find the best arrangement, which may be a significant disadvantage for a large number of fractal cells owing to its combinatorial nature. This paper proposes a criterion based on similarity among fractal cells, developed and implemented in a Tabu search heuristic, in order to allocate the cells on the shop floor in feasible computational time. Once the proposed procedure is modeled, the operations of each workpiece are separated into n subsets and submitted to simulation. The results (traveling distance and makespan) are compared with a distributed layout and a functional layout. In general, the results show a trade-off: when the total routing distance decreases, the makespan increases. With the proposed method, depending on the value of segregated fractal cell similarity, it is possible to reduce both performance parameters. Finally, we conclude that the proposed procedure is quite promising, because the allocation of fractal cells demands little central processing unit (CPU) time.
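A minimal Tabu-search skeleton of the kind such a procedure builds on is sketched below. The randomised swap neighbourhood and the cost placeholder are simplifications for illustration; the similarity criterion itself is left to the caller's cost function and is not the paper's actual formulation.

```python
# Simplified Tabu-search skeleton for placing fractal cells on shop-floor
# sites; the swap neighbourhood and cost placeholder are illustrative,
# not the paper's actual criterion.

import random

def tabu_place(n, cost, iters=200, tenure=7):
    """n sites/cells; cost: f(assignment) -> float to minimise."""
    assign = list(range(n))               # assign[site] = fractal cell index
    best, best_cost = assign[:], cost(assign)
    tabu = []                             # recently tried site pairs
    for _ in range(iters):
        a, b = random.sample(range(n), 2)
        if (a, b) in tabu:
            continue
        assign[a], assign[b] = assign[b], assign[a]
        c = cost(assign)
        if c < best_cost:
            best, best_cost = assign[:], c
        else:
            assign[a], assign[b] = assign[b], assign[a]  # revert the swap
        tabu.append((a, b))
        if len(tabu) > tenure:
            tabu.pop(0)                   # expire the oldest tabu move
    return best, best_cost
```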


Informatics ◽  
2021 ◽  
Vol 18 (4) ◽  
pp. 17-25
Author(s):  
A. N. Markov ◽  
R. O. Ihnatovich ◽  
A. I. Paramonov

Objectives. The authors aimed to demonstrate the need for implementing a video conferencing service in the learning process, to select a video conferencing service, and to conduct a computer experiment with the selected BigBlueButton video conferencing service.

Methods. The problem of choosing a video conferencing service from the available video conferencing software is considered. At the software selection stage, the features of its operation and the requirements for hardware and for integration into internal information systems are indicated. Load testing of the video conferencing service was carried out using volume and stability testing.

Results. Load graphs for the hardware components of the virtual server over a long-term period are presented. The article describes the results of the graph analysis, identifying the key features of the video conferencing service during test and trial operation.

Conclusion. Taking into account the cost of licensing, as well as integration into the e-learning system, a video conferencing service was chosen. A computer experiment was carried out with the selected BigBlueButton video conferencing service. The features of the hardware operation of the virtual server hosting the BigBlueButton system have been determined. Load graphs for the central processing unit, random access memory and local computer network are presented. Problems of service operation during the load-increase stage are formulated.
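The volume/stability testing mentioned in the Methods can be sketched as a ramped concurrency loop that raises the number of simultaneous clients and records how response times degrade. The URL and ramp profile below are illustrative assumptions, not the authors' test harness.

```python
# Rough shape of a volume/stability test: ramp up concurrent clients
# against the service and watch response times degrade. The target URL
# and ramp profile are illustrative assumptions.

import threading, time, urllib.request

def worker(url, results):
    t0 = time.monotonic()
    try:
        urllib.request.urlopen(url, timeout=10).read()
        results.append(time.monotonic() - t0)
    except OSError:
        results.append(float("inf"))     # count timeouts/failures

def ramp(url, steps=(10, 50, 100, 200)):
    for n in steps:
        results = []
        threads = [threading.Thread(target=worker, args=(url, results))
                   for _ in range(n)]
        for t in threads: t.start()
        for t in threads: t.join()
        ok = sorted(r for r in results if r != float("inf"))
        if ok:
            print(f"{n:4d} clients: {len(ok)}/{n} ok, median {ok[len(ok)//2]:.2f}s")
        else:
            print(f"{n:4d} clients: all requests failed")
```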


2019 ◽  
Vol 141 (3) ◽  
Author(s):  
Yusuke Nakajo ◽  
Jayati Athavale ◽  
Minami Yoda ◽  
Yogendra Joshi ◽  
Hiroaki Nishi

Abstract With the rapid growth in demand for distributed computing, data centers are a critical physical component of the "cloud." Recent studies show that the energy consumption of data centers for both cooling and computing keeps increasing, and the growth in server power densities makes it ever more challenging to keep servers below their maximum operating temperature. This paper presents a new dynamic load-balancing approach based on individual server central processing unit (CPU) temperatures. In this approach, a load balancer assigns each task in real time to a server with the objective of keeping CPU temperatures below a maximum value. Experimental studies are conducted in a single rack based on production workload traces of Google clusters. This study also compares the performance of this method with two other load-balancing approaches, Round Robin and a CPU utilization-based method, in terms of temperature distributions, local fan rotation speeds, system loads, and server processing times. Furthermore, we investigate how the effect of the proposed load balancing changes with the different types of applications assumed to run on the servers. The results indicate that the new method can more effectively reduce both server CPU temperatures and local fan rotation speeds in a rack, especially for most web applications.
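The dispatch rule at the heart of such an approach can be sketched as follows. The temperature ceiling, the per-task heating estimate, and the field names are illustrative assumptions, not the paper's thermal model.

```python
# Sketch of a temperature-aware dispatch rule: send each incoming task to
# a server whose CPU temperature leaves the most headroom below the limit.
# The ceiling, heating estimate, and field names are illustrative.

T_MAX = 85.0  # assumed CPU temperature ceiling, degrees C

def dispatch(task, servers):
    """Among servers predicted to stay under T_MAX, pick the coolest."""
    safe = [s for s in servers if s["cpu_temp"] + s["temp_per_task"] < T_MAX]
    pool = safe or servers                          # degrade gracefully if all hot
    target = min(pool, key=lambda s: s["cpu_temp"])
    target["cpu_temp"] += target["temp_per_task"]   # crude thermal bookkeeping
    target["queue"].append(task)
    return target
```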

