High-availability transaction processing: practical experience in availability modeling and analysis

Author(s): J.B. Bowles, J.G. Dobbins
Computing, 2017, Vol 99 (10), pp. 929-954
Author(s): Ermeson Andrade, Bruno Nogueira, Rubens Matos, Gustavo Callou, Paulo Maciel

2015, Vol 2015, pp. 1-20
Author(s): Tuan Anh Nguyen, Dugki Min, Jong Sou Park

Sensitivity assessment of availability for data center networks (DCNs) is of paramount importance in the design and management of cloud-computing-based businesses. Previous work presented performance modeling and analysis of a fat-tree-based DCN using queuing theory. In this paper, we present comprehensive availability modeling and sensitivity analysis of a DCell-based DCN with server virtualization for business continuity, using stochastic reward nets (SRNs). SRNs allow us to capture the complex behaviors and dependencies of the system in detail. The models take into account (i) two DCell configurations, composed of two and three physical hosts in a DCell0 unit, respectively; (ii) failure modes and corresponding recovery behaviors of hosts, switches, and VMs, as well as the VM live-migration mechanism within and between DCell0s; and (iii) dependencies between subsystems (e.g., between a host and its VMs, and between switches and VMs in the same DCell0). The constructed SRN models are analyzed in detail with regard to various metrics of interest to investigate the system's characteristics. A comprehensive sensitivity analysis of system availability is carried out with respect to the major impacting parameters in order to observe the system's complicated behaviors and to locate the availability bottlenecks. The analysis results show the availability improvement, fault-tolerance capability, and business continuity of DCNs that comply with the DCell network topology. This study provides a basis for the design and management of DCNs for business continuity.
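
As an illustration of the kind of state-based availability analysis the paper performs, the sketch below solves a tiny continuous-time Markov chain for a single host/VM pair. It is a drastic simplification of the SRN models described in the abstract, and every failure and repair rate in it is a hypothetical placeholder, not a value from the paper.

    # Minimal sketch (not the paper's SRN model): steady-state availability of
    # one DCell0 host/VM pair, modeled as a 3-state continuous-time Markov chain.
    # All rates below are hypothetical placeholders, chosen only for illustration.
    import numpy as np

    # States: 0 = host up & VM up, 1 = VM failed (host up), 2 = host failed
    lam_vm, mu_vm = 1 / 720.0, 1 / 0.5      # VM failure / repair rates (per hour)
    lam_h,  mu_h  = 1 / 4380.0, 1 / 8.0     # host failure / repair rates (per hour)

    # Infinitesimal generator Q: each row sums to zero.
    Q = np.array([
        [-(lam_vm + lam_h),  lam_vm,          lam_h],
        [ mu_vm,            -(mu_vm + lam_h), lam_h],
        [ mu_h,              0.0,            -mu_h ],
    ])

    # Solve pi Q = 0 together with sum(pi) = 1 for the steady-state distribution.
    A = np.vstack([Q.T, np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    availability = pi[0]                     # only state 0 delivers service
    print(f"steady-state availability ~ {availability:.6f}")

A full SRN analysis tracks many more states (switch failures, VM live migration within and between DCell0s, subsystem dependencies) and is typically solved with a dedicated tool, but the underlying computation is of this form.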


2021
Author(s): Ricardo Paharsingh

Cloud computing services are built on the premise of high availability. These services are sold to customers who expect reduced costs, particularly in the areas of failures and maintenance. At the Infrastructure as a Service (IaaS) layer, resources are sold to customers as virtual machines (VMs) with CPU and memory specifications. Neither resource is necessarily guaranteed, because virtual machines can share the same hardware. If resources are not allocated properly, one virtual machine may, for example, consume too much CPU, reducing the processing power available to other virtual machines and causing response-time failures. In this research, a framework is developed that integrates hardware, software, and response-time failures. A response-time failure occurs when a request made to a server does not complete on time. The framework allows the cloud purchaser to test the system under stressed conditions, allocating more or fewer virtual machines to determine the availability of the system. It also allows the cloud provider to evaluate the availability of the hardware and other software systems separately.
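
A toy sketch of how such an integrated, user-perceived availability figure might be composed is shown below. The series-composition assumption, the subsystem availabilities, the measured latencies, and the 200 ms deadline are all hypothetical illustrations and are not taken from the thesis.

    # Illustrative sketch only (not the thesis's actual framework): combine
    # hardware, software, and response-time availability into a single
    # user-perceived figure. All numbers below are hypothetical.

    def response_time_availability(latencies_ms, deadline_ms):
        """Fraction of requests that complete within the deadline."""
        met = sum(1 for t in latencies_ms if t <= deadline_ms)
        return met / len(latencies_ms)

    hw_availability = 0.9995        # assumed hardware availability
    sw_availability = 0.9990        # assumed software (VM/hypervisor) availability

    # Hypothetical latencies from a stress test (ms) and a 200 ms deadline.
    latencies = [120, 95, 310, 180, 150, 90, 250, 140, 170, 110]
    rt_availability = response_time_availability(latencies, deadline_ms=200)

    # Treat the three failure sources as independent and in series.
    overall = hw_availability * sw_availability * rt_availability
    print(f"response-time availability = {rt_availability:.3f}")
    print(f"user-perceived availability ~ {overall:.4f}")

Stressing the system with more or fewer VMs would change the latency distribution, and therefore the response-time term, which is the effect the framework is designed to expose.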


Author(s): Wenbing Zhao, Louise E. Moser, P. Michael Melliar-Smith

Enterprise applications, such as those for e-commerce and e-government, are becoming more and more critical to our economy and society. Such applications need to provide continuous service, 24 hours a day, 7 days a week. Any disruption in service, including both planned and unplanned downtime, can result in negative financial and social effects. Consequently, high availability and data consistency are critically important for enterprise applications.

Enterprise applications are typically implemented as three-tier applications. A three-tier application consists of clients in the front tier, servers that perform the business-logic processing in the middle tier, and database systems that store the application data in the backend tier, as shown in Figure 1. Within the middle tier, a server application typically uses a transaction-processing programming model. When a server application receives a client's request, it initiates one or more transactions, which often are distributed transactions. When it finishes processing the request, the server application commits the transaction, stores the resulting state in the backend database, and returns the result to the client.

A fault in the middle tier might cause a transaction to abort and/or prevent the client from learning the outcome of the transaction. A fault in the backend tier has similar consequences. In some cases, the problems can be far worse: for example, a software design fault, or an inappropriate heuristic decision, might introduce inconsistency into the data stored in the database, which can take a long time to repair.

Two alternative recovery strategies, roll-backward and roll-forward, can be employed to tolerate and recover from a fault. In roll-backward recovery, the state of the application that has been modified by a set of unfinished operations is reversed by restoring it to a previous consistent state; this strategy is used in transaction processing systems. In roll-forward recovery, critical components, processes, or objects are replicated on multiple computers, so that if one of the replicas fails, the other replicas continue to provide service, enabling the system to advance despite the fault. Many applications that require continuous availability take the roll-forward approach, and replication is commonly employed in the backend tier to increase the reliability of the database system. There has been intense research (Frolund & Guerraoui, 2002; Zhao, Moser, & Melliar-Smith, 2005a) on the seamless integration of the roll-backward and roll-forward strategies in software infrastructures for three-tier enterprise applications, to achieve high availability and data consistency.

High availability is a measure of the uptime of a system and typically means five nines (99.999%) or better, which corresponds to at most about 5.26 minutes of planned and unplanned downtime per year. Data consistency means that the application state stored in the database remains consistent after a transaction commits. Both transactions and replication require consistency, as the applications execute operations that change their states: transactions require data consistency, and replication requires replica consistency.
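
The roll-backward strategy described above can be illustrated with a short, self-contained sketch: a middle-tier handler wraps its business logic in a database transaction and rolls back to the last consistent state when a fault strikes before commit. This is an illustrative example only; the schema, the transfer handler, and the injected fault are hypothetical and are not taken from the chapter.

    # Sketch of roll-backward recovery in the middle tier: if a fault occurs
    # before commit, the database is restored to its previous consistent state.
    # The accounts schema and the injected fault are hypothetical illustrations.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
    conn.commit()

    def transfer(conn, src, dst, amount, inject_fault=False):
        """Move funds between accounts; roll back on any fault."""
        try:
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            if inject_fault:
                raise RuntimeError("middle-tier fault before commit")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.commit()        # success: the new state becomes durable
        except Exception:
            conn.rollback()      # roll-backward: restore the prior consistent state

    transfer(conn, 1, 2, 30, inject_fault=True)
    print(conn.execute("SELECT * FROM accounts").fetchall())  # unchanged: [(1, 100), (2, 50)]

A roll-forward counterpart would instead replicate the handler and its state on another server, so that when the primary fails the surviving replica continues processing rather than undoing the work.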

