Cloud Bursting Galaxy: Federated Identity and Access Management

2019 ◽  
Vol 36 (1) ◽  
pp. 1-9 ◽  
Author(s):  
Vahid Jalili ◽  
Enis Afgan ◽  
James Taylor ◽  
Jeremy Goecks

Abstract
Motivation: Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use for even technologically sophisticated users.
Results: We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g. username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use.
Availability and implementation: Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.
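As a minimal sketch of the temporary-credential pattern the abstract describes, the snippet below exchanges a long-lived refresh token for a short-lived access token at an OpenID Connect provider's token endpoint and refreshes it automatically when it nears expiry. The endpoint URL, client ID, and secret are hypothetical placeholders; this is not Galaxy's or CloudAuthz's actual API.

```python
# Sketch only: OAuth2 refresh-token flow with automatic renewal of the
# short-lived access token. Endpoint and client values are placeholders.
import time
import requests

TOKEN_ENDPOINT = "https://idp.example.org/oauth2/token"   # hypothetical IdP
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"

class TokenManager:
    def __init__(self, refresh_token):
        self.refresh_token = refresh_token
        self.access_token = None
        self.expires_at = 0.0

    def get_access_token(self):
        # Refresh when the current token is missing or about to expire.
        if self.access_token is None or time.time() > self.expires_at - 30:
            resp = requests.post(TOKEN_ENDPOINT, data={
                "grant_type": "refresh_token",
                "refresh_token": self.refresh_token,
                "client_id": CLIENT_ID,
                "client_secret": CLIENT_SECRET,
            })
            resp.raise_for_status()
            payload = resp.json()
            self.access_token = payload["access_token"]
            self.expires_at = time.time() + payload.get("expires_in", 3600)
        return self.access_token

# Usage: attach the short-lived token to a request for a protected dataset.
# tm = TokenManager(refresh_token="...")
# headers = {"Authorization": f"Bearer {tm.get_access_token()}"}
```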


2021 ◽  
Vol 13 (2) ◽  
pp. 176
Author(s):  
Peng Zheng ◽  
Zebin Wu ◽  
Jin Sun ◽  
Yi Zhang ◽  
Yaoqin Zhu ◽  
...  

As the volume of remotely sensed data grows significantly, content-based image retrieval (CBIR) becomes increasingly important, especially for cloud computing platforms that facilitate processing and storing big data in a parallel and distributed way. This paper proposes a novel parallel CBIR system for a hyperspectral image (HSI) repository on cloud computing platforms, guided by unmixed spectral information, i.e., endmembers and their associated fractional abundances, to retrieve hyperspectral scenes. However, existing unmixing methods suffer from an extremely high computational burden when extracting meta-data from large-scale HSI data. To address this limitation, we implement a distributed and parallel unmixing method that operates on cloud computing platforms to accelerate the unmixing processing flow. In addition, we implement a globally standardized distributed HSI repository equipped with a large spectral library in a software-as-a-service mode, providing users with HSI storage, management, and retrieval services through web interfaces. Furthermore, the parallel implementation of unmixing is incorporated into the CBIR system to establish the parallel unmixing-based content retrieval system. The performance of our proposed parallel CBIR system was verified in terms of both unmixing efficiency and accuracy.
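The sketch below illustrates one way retrieval guided by unmixed spectral information could work: summarize each scene by its mean fractional-abundance vector (one value per endmember) and rank scenes by distance to the query. The distance metric and repository layout are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: rank hyperspectral scenes by similarity of their per-scene
# mean fractional-abundance vectors derived from spectral unmixing.
import numpy as np

def scene_signature(abundances):
    """abundances: (pixels, endmembers) fractional-abundance matrix."""
    return abundances.mean(axis=0)

def retrieve(query_abundances, repository, top_k=5):
    """repository: dict mapping scene_id -> (pixels, endmembers) array."""
    q = scene_signature(query_abundances)
    scored = [(scene_id, float(np.linalg.norm(q - scene_signature(a))))
              for scene_id, a in repository.items()]
    scored.sort(key=lambda item: item[1])   # smaller distance = more similar
    return scored[:top_k]

# Toy example with 3 endmembers and random abundance maps.
rng = np.random.default_rng(0)
repo = {f"scene_{i}": rng.dirichlet(np.ones(3), size=100) for i in range(10)}
query = rng.dirichlet(np.ones(3), size=100)
print(retrieve(query, repo, top_k=3))
```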


Author(s):  
Natasha Csicsmann ◽  
Victoria McIntyre ◽  
Patrick Shea ◽  
Syed S. Rizvi

Strong authentication and encryption schemes help cloud stakeholders perform robust and accurate auditing of a potential service provider. All security-related issues and challenges therefore need to be addressed before ubiquitous adoption of cloud computing. In this chapter, the authors provide an overview of existing biometrics-based security technologies and discuss some of the open research issues that need to be addressed to make biometric technology an effective tool for cloud computing security. Finally, this chapter provides a performance analysis of the use of large-scale biometrics-based authentication systems on different cloud computing platforms.


Author(s):  
Esma Yildirim ◽  
Tevfik Kosar

The emerging petascale increase in the data produced by large-scale scientific applications necessitates innovative solutions for efficient transfer of data through the advanced infrastructure provided by today's high-speed networks and complex computer architectures (e.g. supercomputers, parallel storage systems). Although current optical networking technology has reached transport speeds of 100 Gbps, applications still suffer from inadequate transport protocols and end-system bottlenecks such as processor speed, disk I/O speed, and network interface card limits, which cause underutilization of the existing network infrastructure and allow applications to achieve only a small portion of the theoretical performance. Fortunately, with the parallelism provided by the use of multiple CPUs/nodes and multiple disks present in today's systems, these bottlenecks can be eliminated. However, it is necessary to understand the characteristics of the end systems and the transport protocol used. In this book chapter, we analyze methodologies that improve the data transfer speed of applications and provide the maximal speeds obtainable from the available end-system resources and high-speed networks through the use of end-to-end dataflow parallelism.
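As a rough illustration of dataflow parallelism, the sketch below splits a file into byte ranges and moves the ranges over several concurrent streams so that a single stream's protocol or disk bottleneck does not cap throughput. The upload_range() callable is a placeholder for any range-capable transfer API; this is not the authors' tool.

```python
# Illustrative sketch: parallel transfer of one file over multiple streams.
from concurrent.futures import ThreadPoolExecutor
import os

def parallel_transfer(path, upload_range, streams=4):
    size = os.path.getsize(path)
    chunk = max(1, (size + streams - 1) // streams)

    def send(offset):
        length = min(chunk, size - offset)
        with open(path, "rb") as f:        # each stream gets its own handle
            f.seek(offset)
            upload_range(offset, f.read(length))

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(send, range(0, size, chunk)))
```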


Author(s):  
A. S. Kotlyarov

In this paper we review the possibility of using a block search method for network packet routing in high-speed computer networks. The method minimizes hardware costs for large-scale routing tables. To support the maximum data transfer rate, packets must be routed in real time. The existing solution presumes concurrent use of several routing devices, each performing an independent search of records by the bisection method. However, if the channel rate exceeds 10 Gb/s and the number of routes exceeds 2^20, this leads to high hardware costs. To reduce hardware costs, we suggest a modified block search method, which differs from the classic one by its parallel-pipeline form of search. We present an evaluation of the minimum field-programmable gate array hardware costs for the network packet routing task. Analysis of the results proves the efficiency of the suggested method in comparison with existing solutions. As a result, hardware costs were reduced by a factor of 5.
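For orientation, the snippet below shows the sequential software analogue of the baseline bisection search over a routing table sorted by destination key; the paper's contribution is to split such a table into blocks searched in a parallel-pipeline fashion on an FPGA, which is not reproduced here.

```python
# Sketch of the baseline: bisection (binary) search over a sorted routing table.
from bisect import bisect_right

def build_table(routes):
    """routes: list of (destination_int, next_hop); returns sorted parallel lists."""
    routes = sorted(routes)
    keys = [dst for dst, _ in routes]
    hops = [hop for _, hop in routes]
    return keys, hops

def lookup(keys, hops, address):
    """Return the next hop for the greatest table entry <= address, or None."""
    i = bisect_right(keys, address) - 1
    return hops[i] if i >= 0 else None

keys, hops = build_table([(10, "A"), (200, "B"), (3000, "C")])
print(lookup(keys, hops, 250))   # -> "B"
```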


2012 ◽  
Vol 182-183 ◽  
pp. 706-710 ◽  
Author(s):  
Cheng Jun Zhang ◽  
Xiao Yan Zuo ◽  
Chi Zhang ◽  
Xiao Guang Wu

By analyzing the pattern data of a computerized jacquard warp knitting machine and comparing current storage structures and types, this paper introduces a Flash file structure for jacquard data. The method uses an ARM chip to handle access to and modification of Flash, and designs jacquard data access management procedures with a view to extending Flash life. The management procedures not only complete read and write operations, but also meet the requirement for high-speed transfer of jacquard data.
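The sketch below illustrates the "extend Flash life" idea in its simplest form: rotate writes of new pattern data across erase blocks so that no single block is erased far more often than the others. Block selection, erase, and program hooks are hypothetical placeholders, not the paper's firmware.

```python
# Illustrative wear-leveling sketch: always write to the least-erased block.
class FlashWearLeveler:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks

    def pick_block(self):
        # Choose the block with the fewest erases to spread wear evenly.
        block = min(range(len(self.erase_counts)),
                    key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1
        return block

    def write_pattern(self, data, erase_block, program_block):
        block = self.pick_block()
        erase_block(block)             # driver hook: erase before programming
        program_block(block, data)     # driver hook: program the pattern data
        return block
```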


2011 ◽  
Vol 135-136 ◽  
pp. 43-49
Author(s):  
Han Ning Wang ◽  
Wei Xiang Xu ◽  
Chao Long Jia

The application of high-speed railway data, an important component of China's transportation science data sharing, embodies the typical characteristics of data-intensive computing. A reasonable and effective data placement strategy is needed to deploy and execute data-intensive applications in the cloud computing environment. Results of current data placement approaches are analyzed and compared in this paper. Combining the semi-definite programming algorithm with the dynamic interval mapping algorithm, a hierarchical data placement strategy is proposed. The semi-definite programming algorithm is suitable for the placement of files with multiple replications, ensuring that different replications of a file are placed on different storage devices, while the dynamic interval mapping algorithm guarantees better self-adaptability of the data storage system. It is shown by both theoretical analysis and experimental demonstration that the hierarchical data placement strategy guarantees self-adaptability, data reliability, and high-speed data access for large-scale networks.
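As a hedged sketch of the interval-mapping idea, the snippet below hashes each file to a point in [0, 1) and places it on the device whose interval contains that point; widening or narrowing intervals (e.g. when devices join) moves only a proportional share of files, which is the self-adaptability property mentioned above. The weighting and hashing choices are illustrative, and the semi-definite-programming replica placement step is not shown.

```python
# Sketch: capacity-weighted interval mapping for file placement.
import hashlib
from bisect import bisect_right

class IntervalMapper:
    def __init__(self, device_weights):
        """device_weights: dict device_name -> relative capacity weight."""
        total = sum(device_weights.values())
        self.devices, self.bounds, acc = [], [], 0.0
        for name, w in device_weights.items():
            acc += w / total
            self.devices.append(name)
            self.bounds.append(acc)    # cumulative upper bound of each interval

    def place(self, file_id):
        digest = hashlib.sha256(file_id.encode()).hexdigest()
        point = int(digest[:8], 16) / 0xFFFFFFFF    # hash -> [0, 1]
        i = min(bisect_right(self.bounds, point), len(self.devices) - 1)
        return self.devices[i]

mapper = IntervalMapper({"dev0": 1.0, "dev1": 2.0, "dev2": 1.0})
print(mapper.place("trainlog_2011_03.csv"))
```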


Author(s):  
Wagner Al Alam ◽  
Francisco Carvalho Junior

The efforts to make cloud computing suitable for the requirements of HPC applications have motivated us to design HPC Shelf, a cloud computing platform of services for building and deploying parallel computing systems for large-scale parallel processing. We introduce Alite, the system of contextual contracts of HPC Shelf, aimed at selecting component implementations according to application requirements, features of the target parallel computing platforms (e.g. clusters), QoS (quality of service) properties, and cost restrictions. It is evaluated through a small-scale case study employing a component-based framework for matrix multiplication based on the BLAS library.
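To make the contract-driven selection idea concrete, the sketch below picks, among candidate component implementations, the cheapest one that satisfies platform-feature and QoS requirements within a cost limit. The data model and scoring are invented for illustration and are not the Alite or HPC Shelf API.

```python
# Hedged sketch: select a component implementation by matching a contextual
# contract (required features, minimum QoS, maximum cost) against candidates.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    platform_features: set      # e.g. {"gpu", "infiniband"}
    gflops: float               # advertised QoS property
    cost_per_hour: float

def select(candidates, required_features, min_gflops, max_cost):
    feasible = [c for c in candidates
                if required_features <= c.platform_features
                and c.gflops >= min_gflops
                and c.cost_per_hour <= max_cost]
    return min(feasible, key=lambda c: c.cost_per_hour) if feasible else None

candidates = [
    Candidate("blas-openmp", {"multicore"}, 300.0, 0.40),
    Candidate("blas-cuda", {"multicore", "gpu"}, 2500.0, 1.20),
]
print(select(candidates, {"gpu"}, min_gflops=1000.0, max_cost=2.0))
```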


Author(s):  
Junaid Arshad ◽  
Paul Townend ◽  
Jie Xu

Cloud computing is an emerging computing paradigm that introduces novel opportunities to establish large-scale, flexible computing infrastructures. However, extensive adoption of Cloud computing hinges on security. This paper presents efforts to address one of the significant issues with respect to the security of Clouds, i.e. intrusion detection and severity analysis. An abstract model for integrated intrusion detection and severity analysis for Clouds is proposed to facilitate minimal intrusion response time while preserving the overall security of the Cloud infrastructure. To assess the effectiveness of the proposed model, a detailed architectural evaluation using the Architecture Tradeoff Analysis Method (ATAM) is performed. A set of recommendations, which can serve as best-practice guidelines when implementing the proposed architecture, is discussed.
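The sketch below illustrates, in a heavily simplified form, how pairing each detection alert with a severity score lets the most damaging events be handled first, shortening response time for the alerts that matter. The alert types and scoring weights are invented placeholders, not the model proposed in the paper.

```python
# Illustrative sketch: severity-driven triage of intrusion alerts.
import heapq

SEVERITY_WEIGHTS = {"port_scan": 2, "privilege_escalation": 9, "data_exfiltration": 10}

def severity(alert):
    base = SEVERITY_WEIGHTS.get(alert["type"], 1)
    # Escalate when the affected VM hosts a critical service.
    return base * (2 if alert.get("critical_host") else 1)

def triage(alerts):
    """Yield alerts ordered from most to least severe."""
    heap = [(-severity(a), i, a) for i, a in enumerate(alerts)]
    heapq.heapify(heap)
    while heap:
        _, _, alert = heapq.heappop(heap)
        yield alert

alerts = [{"type": "port_scan"},
          {"type": "privilege_escalation", "critical_host": True}]
print([a["type"] for a in triage(alerts)])
```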

