A Portable and Platform Independent File System for Large Scale Peer-to-Peer Systems and Distributed Applications

Author(s):  
Andreas Barbian ◽  
Stefan Nothaas ◽  
Timm J. Filler ◽  
Michael Schoettner
2014 ◽  
Vol 2 (3) ◽  
pp. 341-366
Author(s):  
ROMAIN HOLLANDERS ◽  
DANIEL F. BERNARDES ◽  
BIVAS MITRA ◽  
RAPHAËL M. JUNGERS ◽  
JEAN-CHARLES DELVENNE ◽  
...  

Peer-to-peer systems have drawn a lot of attention in the past decade as they have become a major source of Internet traffic. The amount of data flowing through the peer-to-peer network is huge and hence challenging both to comprehend and to control. In this work, we take advantage of a new and rich dataset recording the peer-to-peer activity at a remarkable scale to address these difficult problems. After extracting the relevant and measurable properties of the network from the data, we develop two models that aim to link the low-level properties of the network, such as the proportion of peers that do not share content (i.e., free riders) or the distribution of the files among the peers, to its high-level properties, such as the Quality of Service or the diffusion of content, which are of interest for supervision and control purposes. We observe a significant agreement between the high-level properties measured on the real data and on the synthetic data generated by our models, which is encouraging for the use of our models in practice as large-scale prediction tools. Relying on them, we demonstrate that efforts to reduce the number of free riders indeed help to improve the availability of files on the network. We observe, however, a saturation of this phenomenon after 60% of free riders.


Author(s):  
Eduardo Inacio ◽  
Mario Antonio Dantas

To meet the ever-increasing capacity and performance requirements of emerging data-intensive applications, highly distributed and multilayered back-end storage systems have been employed in large-scale high performance computing (HPC) environments. A main component of these storage infrastructures is the parallel file system (PFS), a file system especially designed to absorb bulk data transfers from applications with thousands of concurrent processes. Load distribution on PFS data servers is a major source of intra-application input/output (I/O) performance variability. Although mitigating variability is desirable, as it is known to harm application-perceived performance, understanding and dealing with I/O performance variability in such complex environments remains a challenging task. In this research, a differentiated approach for evaluating and mitigating intra-application I/O performance variability over PFSs is proposed. More specifically, from the evaluation perspective, a comprehensive approach combining complementary methods is proposed. An analytical model, named DTSMaxLoad, provides estimates for the maximum load in a PFS data server. To complement DTSMaxLoad by modeling conditions and mechanisms that are hard to represent analytically, the Parallel I/O and Storage System (PIOSS) simulation model was proposed. Finally, for experimental evaluation over real environments, a flexible and distributed I/O performance evaluation tool, named IOR-Extended (IORE), was proposed. Furthermore, a high-level file distribution approach for PFSs, called N-N Round-Robin (N2R2), was proposed, focusing on mitigating I/O performance variability for distributed applications where each process accesses an individual and independent file. An extensive experimental effort, including measurements on real environments, was conducted in this research work to evaluate each of the proposed approaches.
In summary, this evaluation indicated that both the DTSMaxLoad and PIOSS modeling proposals can represent load distribution behavior on PFSs with significant fidelity. Moreover, results demonstrated that N2R2 successfully reduced intra-application I/O performance variability for 270 distinct experimental scenarios, which ultimately translated into overall application I/O performance improvements.
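
The link between file placement and maximum server load mentioned in this abstract can be illustrated with a small sketch. It is not the authors' N2R2 or DTSMaxLoad. It assumes that every process writes one equally sized file stored entirely on a single data server, and the function names (`random_placement`, `round_robin_placement`, `max_load`) are hypothetical.

```python
import random
from collections import Counter


def random_placement(num_files, num_servers, rng):
    """Assign each file to a uniformly random server (assumed baseline)."""
    return [rng.randrange(num_servers) for _ in range(num_files)]


def round_robin_placement(num_files, num_servers, start=0):
    """Assign file i to server (start + i) mod num_servers."""
    return [(start + i) % num_servers for i in range(num_files)]


def max_load(placement, num_servers):
    """Return the number of files held by the most loaded server."""
    counts = Counter(placement)
    return max(counts.get(server, 0) for server in range(num_servers))


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs are comparable
    n_files, n_servers = 64, 8
    print("random max load:",
          max_load(random_placement(n_files, n_servers, rng), n_servers))
    print("round-robin max load:",
          max_load(round_robin_placement(n_files, n_servers), n_servers))
```

Under these assumptions round-robin placement keeps the most loaded server at the ceiling of files divided by servers, while random placement can never do better than that bound and often does worse.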


Author(s):  
Wael Abdulkarim Habeeb ◽  
Abdulkarim Assalem

Publish/subscribe (pub/sub) is a popular communication paradigm in the design of large-scale distributed systems. Pub/sub is increasingly used across a wide array of applications in industry and academia, from financial data dissemination and business process management to social networking sites, which account for a large share of user interest and network bandwidth. Social network interactions have grown exponentially in recent years, to the order of billions of notifications generated by millions of users every day. It has therefore become important to study publish/subscribe networks, especially peer-to-peer (P2P) ones, with respect to measures such as the publication speed of events and the percentage of incoming events lost by participants. Peer-to-peer systems can be very large, comprising millions of nodes that continuously join and leave the network, and these characteristics are difficult to handle. Evaluating a new protocol in a real environment, particularly in its early stages, is considered impractical. Hence the need for a simulator; the one used here is an open-source simulator running within the Eclipse environment. In this research we adopt a new method of selecting nodes for the table of the vicinity protocol, in which a far node has a higher probability of being included in the table than an adjacent node. The proposed network, which uses the PolderCast protocol, was modelled with PeerSim, used here to model publish/subscribe networks within the Eclipse environment, so that the event delivery service is a peer-to-peer network and subscription is topic-based. Experimental results showed a noticeable improvement of 51.11% in the publication speed of events compared to the original design of the protocol, and the percentage of event loss was reduced by 20%.
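
The idea of favouring far nodes when filling a neighbour table can be sketched as weighted sampling. This is only an illustration, not the authors' selection rule or PeerSim code. It assumes node identifiers lie on a ring of size `ring_size` and that the selection weight grows linearly with ring distance; `ring_distance` and `select_neighbors` are hypothetical names.

```python
import random


def ring_distance(a, b, ring_size):
    """Shortest distance between two identifiers on a ring (assumed metric)."""
    d = abs(a - b) % ring_size
    return min(d, ring_size - d)


def select_neighbors(self_id, candidates, table_size, ring_size, rng):
    """Pick up to table_size distinct candidates, far nodes being more likely.

    Each remaining candidate is drawn with weight 1 + distance, so the weight
    is always positive and increases with distance from self_id.
    """
    pool = sorted(set(c for c in candidates if c != self_id))
    chosen = []
    while pool and len(chosen) < table_size:
        weights = [1 + ring_distance(self_id, c, ring_size) for c in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)  # sample without replacement
    return chosen


if __name__ == "__main__":
    rng = random.Random(7)
    print(select_neighbors(0, range(1, 100), table_size=5, ring_size=100, rng=rng))
```

A linear weight is the simplest choice that satisfies the stated property; a steeper or flatter weighting could be substituted in the same place without changing the rest of the sketch.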


Author(s):  
Valentin Cristea ◽  
Ciprian Dobre ◽  
Corina Stratan ◽  
Florin Pop

Large-scale distributed systems are used for executing a wide variety of applications; while the first distributed applications came from the scientific area, today many of them are dedicated to businesses or even to home users. The constantly increasing demand for large-scale distributed applications has brought about a need for tools and frameworks that ease their development. The main role of these tools and frameworks is to assist the developer in implementing common functionalities and patterns that are specific to distributed applications: for example, dividing a large computational task into smaller subtasks to be executed on multiple machines, sending e-mails automatically, or managing access to resources in a secure way. One of the most important issues that application development frameworks have to address is the abstraction of the underlying middleware: their main objective is to relieve the application programmer of the effort of dealing with lower-level components. Another important aspect is the performance of the communication among the application components; hence, some development tools are specifically targeted at optimizing communication performance. We also observe an increasing interest in interoperability among applications developed with different platforms, which has led to many standardization initiatives. This chapter discusses the issues introduced above and gives an overview of the current tools and frameworks for developing various types of distributed applications. We start with web applications, which are the most frequently used nowadays; we introduce some general design issues and present tools for server-side and client-side programming. Then, we discuss developing applications in grids, clouds and peer-to-peer systems; we present the specific aspects of programming applications in these types of systems and introduce some of the most widely used tools and frameworks. The last section is dedicated to distributed workflows, complex applications that are composed of multiple smaller applications or services; the development and execution of workflows pose more challenges than traditional applications do, requiring specific tools and runtime environments.


Author(s):  
Pawan Prakash ◽  
Ramana Rao Kompella ◽  
Venugopalan Ramasubramanian ◽  
Ranveer Chandra
