Data-WISE: Efficient management of data-intensive workflows in scheduled grid environments

Author(s):  
Gargi Dasgupta ◽  
Koustuv Dasgupta ◽  
Balaji Viswanathan


Author(s):
Simab Hasan Rizvi

In today's age of tera-scale computing, applications have become more data-intensive than ever. The increased data volume from applications now tackling larger and larger problems has fuelled the need for efficient management of this data. In this paper, a technique called Content Addressable Storage (CAS) for managing large volumes of data is evaluated. The evaluation focuses on the benefits and demerits of using CAS, namely: i) improved application performance via lockless, lightweight synchronization of access to shared storage data; ii) improved cache performance; iii) increased storage capacity; and iv) increased effective network bandwidth. The presented design of a CAS-based file store significantly improves storage performance while providing lightweight, lockless, user-defined consistency semantics. As a result, this file system shows a 28% increase in read bandwidth and a 13% increase in write bandwidth over a popular file system in common use. The paper also estimates the potential benefits of using CAS for virtual machines and discusses mobility applications for active use and public deployment.
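The core mechanism behind these benefits is that CAS addresses each block by a cryptographic hash of its content: identical blocks are stored once, and a stored block is immutable, so readers need no locks. A minimal Python sketch of this idea (not the authors' file-store design; all names are illustrative):

```python
import hashlib

class CASStore:
    """Minimal content-addressable block store: blocks are keyed by the
    SHA-256 hash of their content, so identical blocks are stored once
    (deduplication) and reads need no locks -- a stored block is
    immutable, since changing the content changes the address."""

    def __init__(self):
        self._blocks = {}  # hex digest -> block bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Writing the same content twice is a no-op, so no
        # synchronization with concurrent readers is required.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

store = CASStore()
k1 = store.put(b"shared VM base image block")
k2 = store.put(b"shared VM base image block")  # duplicate content
assert k1 == k2 and len(store._blocks) == 1    # stored only once
```

Deduplication of this kind is also what makes CAS attractive for virtual machine storage: many VM images share identical base blocks, which a content-addressed store keeps exactly once.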


2014 ◽  
Vol 4 (1) ◽  
pp. 50-62 ◽  
Author(s):  
Sudhansu Shekhar Patra ◽  
R. K. Barik

Cloud computing has recently received considerable attention as a promising approach for delivering Information and Communication Technology (ICT) services as a utility. In the process of providing these services, it is necessary to improve the utilization of datacenter resources, which operate under highly dynamic workloads. Datacenters are integral parts of cloud computing: at any instant, hundreds or thousands of virtual servers run in a datacenter, hosting many tasks, while the cloud system keeps receiving new batches of task requests and provides services and computing through the network. Service-Oriented Architecture (SOA) and agent frameworks provide tools for developing the distributed, multi-agent systems that can administer cloud computing environments with these characteristics. This paper presents SOQM (Service-Oriented, QoS-Assured, Multi-Agent Cloud Computing), an architecture that supports QoS-assured cloud service provision and request. Biomedical and geospatial data on the cloud can be analyzed through SOQM, which enables efficient management of the allocation of resources to the different system agents. The paper proposes a finite pool of heterogeneous virtual machines that are dynamically allocated according to requests from biomedical and geospatial stakeholders.
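The abstract does not detail SOQM's allocation algorithm, but the finite heterogeneous multiple-VM model it mentions can be illustrated with a hypothetical QoS-aware allocator: pick the cheapest VM type from a finite pool whose speed still meets the request's deadline. All class names, pool sizes and figures below are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class VMType:
    name: str
    mips: int            # processing capacity, million instructions/sec
    cost_per_hour: float
    available: int       # finite pool, per the finite multi-VM model

@dataclass
class Request:
    task_length: int     # work in million instructions
    deadline_hours: float

# Hypothetical heterogeneous VM pool (names and numbers are illustrative).
POOL = [
    VMType("small",  mips=1000, cost_per_hour=0.05, available=10),
    VMType("medium", mips=2500, cost_per_hour=0.12, available=5),
    VMType("large",  mips=5000, cost_per_hour=0.25, available=2),
]

def allocate(req: Request) -> VMType | None:
    """Pick the cheapest available VM type whose runtime meets the deadline."""
    feasible = [vm for vm in POOL
                if vm.available > 0
                and req.task_length / (vm.mips * 3600) <= req.deadline_hours]
    if not feasible:
        return None  # QoS cannot be assured; reject or renegotiate
    best = min(feasible, key=lambda vm: vm.cost_per_hour)
    best.available -= 1
    return best

vm = allocate(Request(task_length=9_000_000, deadline_hours=1.0))
print(vm.name if vm else "rejected")  # "medium": cheapest type that meets QoS
```

In a multi-agent setting such as SOQM's, this decision would sit inside a resource-allocation agent negotiating with service-provision agents; the sketch shows only the QoS-feasibility check itself.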


2014 ◽  
Vol 3 (2) ◽  
pp. 100-111
Author(s):  
Najme Mansouri

Grid environments have gained tremendous importance in recent years as application requirements have increased drastically. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems, such as job scheduling. Most existing scheduling strategies in Grids focus on only one kind of Grid job, either data-intensive or computation-intensive. However, considering only one kind of job does not produce suitable schedules from the viewpoint of the whole system, and sometimes wastes resources on the other side. To address the challenge of considering both kinds of jobs simultaneously, a new Cost-Based Job Scheduling (CJS) strategy is proposed in this paper. On one hand, the CJS algorithm considers both the data and the computational resource availability of the network; on the other hand, based on the corresponding requirements of each job, it assigns a value called W to the job. The W value captures the relative importance of the two aspects (being data-intensive or computation-intensive) for each job, and the job is then assigned to the available resources accordingly. Simulation results with OptorSim show that CJS outperforms the existing algorithms reported in the literature as the number of jobs increases.
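The abstract does not give the formula for W, so the sketch below assumes a plausible stand-in: W is the fraction of a job's estimated runtime spent moving data, with data-dominated jobs (W near 1) routed to a site holding the data and compute-dominated jobs (W near 0) to the best computational resource. Function names, the 0.5 threshold, and all figures are illustrative, not the published CJS definition:

```python
from dataclasses import dataclass

@dataclass
class Job:
    input_data_mb: float   # size of input data the job must fetch
    compute_mi: float      # computational work, million instructions

def weight(job: Job, net_mbps: float = 100.0, cpu_mips: float = 2000.0) -> float:
    """Assumed form of the CJS weight W: the fraction of a job's
    estimated runtime spent on data transfer. W near 1 => data-intensive;
    W near 0 => computation-intensive."""
    transfer_time = job.input_data_mb * 8 / net_mbps   # seconds
    compute_time = job.compute_mi / cpu_mips           # seconds
    return transfer_time / (transfer_time + compute_time)

def assign(job: Job, data_sites: list[str], compute_sites: list[str]) -> str:
    # Data-dominated jobs go to a replica-holding site; compute-dominated
    # jobs go to the best computational resource.
    return data_sites[0] if weight(job) >= 0.5 else compute_sites[0]

job = Job(input_data_mb=4000, compute_mi=60_000)
print(f"W = {weight(job):.2f} -> {assign(job, ['SiteA'], ['SiteB'])}")
# W = 0.91 -> SiteA (data transfer dominates, so run near the data)
```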


Author(s):  
Mohammad Shorfuzzaman ◽  
Rasit Eskicioglu ◽  
Peter Graham

Data Grids provide services and infrastructure for distributed data-intensive applications that need to access, transfer and modify massive datasets stored at distributed locations around the world. For example, the next generation of scientific applications, such as many in high-energy physics, molecular modeling, and earth sciences, will involve large collections of data created from simulations or experiments. The size of these data collections is expected to be of multi-terabyte or even petabyte scale in many applications. Ensuring efficient, reliable, secure and fast access to such large data is hindered by the high latencies of the Internet. The need to manage and access multiple petabytes of data in Grid environments, as well as to ensure data availability and access optimization, are challenges that must be addressed. To improve data access efficiency, data can be replicated at multiple locations so that a user can access the data from a site near where it will be processed. In addition to reducing data access time, replication in Data Grids also uses network and storage resources more efficiently. In this chapter, the state of current research on data replication and the challenges arising for the new generation of data-intensive grid environments are reviewed, and open problems are identified. First, fundamental data replication strategies are reviewed which offer high data availability, low bandwidth consumption, increased fault tolerance, and improved scalability of the overall system. Then, specific algorithms for selecting appropriate replicas and maintaining replica consistency are discussed. The impact of data replication on job scheduling performance in Data Grids is also analyzed. A set of appropriate metrics, including access latency, bandwidth savings, server load, and storage overhead, for use in making critical comparisons of various data replication techniques is also discussed. Overall, this chapter provides a comprehensive study of replication techniques in Data Grids that not only serves as a tool for understanding this evolving research area but also provides a reference to which future efforts may be mapped.
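Replica selection, one of the algorithm classes discussed in the chapter, typically scores candidate sites against exactly the metrics listed above (access latency, bandwidth, server load). A minimal sketch of such a scoring scheme; the cost model and the site data are assumptions for illustration, not any specific surveyed algorithm:

```python
from dataclasses import dataclass

@dataclass
class Replica:
    site: str
    latency_ms: float     # round-trip latency to the requesting site
    bandwidth_mbps: float
    server_load: float    # 0.0 (idle) .. 1.0 (saturated)

def transfer_cost(r: Replica, file_mb: float) -> float:
    """Estimated seconds to fetch the file from this replica: latency
    plus transfer time over the bandwidth left after current load.
    Surveyed strategies differ in the exact model; this one is assumed."""
    effective_bw = r.bandwidth_mbps * (1.0 - r.server_load)
    return r.latency_ms / 1000.0 + file_mb * 8 / effective_bw

def select_replica(replicas: list[Replica], file_mb: float) -> Replica:
    # Choose the replica with the lowest estimated access cost.
    return min(replicas, key=lambda r: transfer_cost(r, file_mb))

candidates = [
    Replica("CERN", latency_ms=10,  bandwidth_mbps=1000, server_load=0.8),
    Replica("FNAL", latency_ms=120, bandwidth_mbps=600,  server_load=0.1),
]
best = select_replica(candidates, file_mb=2048)
print(best.site)  # FNAL: higher latency, but far more spare bandwidth
```

The example illustrates why selection must weigh several metrics at once: the nominally faster, closer site loses to a more distant one because server load erodes its usable bandwidth.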


2018 ◽  
Vol 3 (6) ◽  

The issue underlying the worrying question of maternal and child health in Côte d'Ivoire is that of social logic. Social logic is perceived as the "cultural constructions of actors with regard to morbidity that lead them to adopt reproductive health care". On this understanding, the concept of social logic in reproductive health resembles a paradigm that highlights the various factors structuring and organising sociological resistance to mothers' openness to healthy reproductive behaviours; that is, openness to change for sustainable reproductive health. Far from becoming and remaining a prisoner of blind culturalism, the social logic that shapes the health of mothers, newborns and children raises practically relevant questions. Issues of "bad governance" and of socio-cultural representations and behaviours in conflict with modern epidemiological standards are addressed in a culturally sensitive manner, an important consideration for the provision of care focused on the needs of mothers seeking answers to health problems. Developing these original community characteristics helps to orient a reading in a socio-anthropological perspective, with a view to explaining and understanding the problems encountered and the experience acquired by social actors during the implementation of antenatal, postnatal and family planning care. This context of how logics are constructed around reproductive health care is key to identifying the real bottlenecks in maternity services and to achieving efficient management of maternal, newborn and child health care for the benefit of populations and actors in the public health sector.

