META-pipe cloud setup and execution

F1000Research ◽  
2019 ◽  
Vol 6 ◽  
pp. 2060
Author(s):  
Aleksandr Agafonov ◽  
Kimmo Mattila ◽  
Cuong Duong Tuan ◽  
Lars Tiede ◽  
Inge Alexander Raknes ◽  
...  

META-pipe is a complete service for the analysis of marine metagenomic data. It provides assembly of high-throughput sequence data, functional annotation of predicted genes, and taxonomic profiling. The functional annotation is computationally demanding and is therefore currently run on a high-performance computing cluster in Norway. However, additional compute resources are necessary to open the service to all ELIXIR users. We describe our approach for setting up and executing the functional analysis of META-pipe on additional academic and commercial clouds. Our goal is to provide a powerful analysis service that is easy to use and to maintain. Our design therefore uses a distributed architecture in which we combine central servers with multiple distributed backends that execute the computationally intensive jobs. We believe our experiences developing and operating META-pipe provide a useful model for others who plan to provide a portal-based data analysis service in ELIXIR and other organizations with geographically distributed compute and storage resources.
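
The split between central servers and distributed execution backends described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how jobs reach the backends, so the pull-based queue, the class names `CentralServer` and `Backend`, and the round-robin `dispatch` function are hypothetical and are not META-pipe's actual implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CentralServer:
    """Hypothetical central server: holds pending jobs and collects results."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def submit(self, job_id, payload):
        self.pending.append((job_id, payload))

    def next_job(self):
        # Backends pull work; returns None when the queue is empty.
        return self.pending.popleft() if self.pending else None

    def report(self, job_id, backend_name, output):
        self.results[job_id] = (backend_name, output)


@dataclass
class Backend:
    """Hypothetical execution backend (for example an HPC cluster or a cloud)."""
    name: str

    def run(self, payload):
        # Stand-in for the computationally intensive annotation step.
        return f"annotated:{payload}"


def dispatch(server, backends):
    """Hand out jobs to the backends in turn until the central queue is empty."""
    while server.pending:
        for backend in backends:
            job = server.next_job()
            if job is None:
                break
            job_id, payload = job
            server.report(job_id, backend.name, backend.run(payload))


server = CentralServer()
for i, sample in enumerate(["sampleA", "sampleB", "sampleC"]):
    server.submit(i, sample)
dispatch(server, [Backend("hpc-cluster"), Backend("cloud-1")])
```

In this sketch the central server only tracks state, so adding compute capacity means adding another `Backend` to the list rather than changing the server.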



2016 ◽  
Vol 33 (4) ◽  
pp. 621-634 ◽  
Author(s):  
Jingyin Tang ◽  
Corene J. Matyas

The creation of a 3D mosaic is often the first step when using the high-spatial- and temporal-resolution data produced by ground-based radars. Efficient yet accurate methods are needed to mosaic data from dozens of radars to better understand the precipitation processes in synoptic-scale systems such as tropical cyclones. Research-grade radar mosaic methods for analyzing historical weather events should utilize data from both sides of a moving temporal window and process them in a flexible data architecture that is not available in most stand-alone software tools or real-time systems. Thus, these historical analyses require a different strategy that optimizes flexibility and scalability by removing time constraints from the design. This paper presents a MapReduce-based playback framework using Apache Spark’s computational engine to interpolate large volumes of radar reflectivity and velocity data onto 3D grids. Designed to be easy to use on a high-performance computing cluster, these methods may also be executed on a low-end machine. A protocol is designed to enable interoperability with GIS and spatial analysis functions in this framework. Open-source software is utilized to enhance radar usability in the nonspecialist community. Case studies during a tropical cyclone landfall show this framework’s capability to efficiently create a large-scale, high-resolution 3D radar mosaic with the integration of GIS functions for spatial analysis.
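
The MapReduce pattern named in this abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's method: it does not use Apache Spark, it replaces the paper's interpolation with a simple per-cell average, and the names `CELL_SIZE`, `to_cell`, `combine`, and `mosaic` as well as the sample observations are hypothetical.

```python
CELL_SIZE = 1.0  # hypothetical grid spacing, in the same units as x, y, z


def to_cell(obs):
    """Map step: key an observation by the 3D grid cell that contains it."""
    x, y, z, value = obs
    key = (int(x // CELL_SIZE), int(y // CELL_SIZE), int(z // CELL_SIZE))
    return key, (value, 1)


def combine(a, b):
    """Reduce step: merge two (sum, count) pairs that share a cell."""
    return a[0] + b[0], a[1] + b[1]


def mosaic(observations):
    """Group observations by cell and return the mean value per cell."""
    cells = {}
    for key, pair in map(to_cell, observations):
        cells[key] = combine(cells[key], pair) if key in cells else pair
    return {key: total / count for key, (total, count) in cells.items()}


# Made-up (x, y, z, value) observations from two radars.
radar_a = [(0.2, 0.3, 0.1, 30.0), (1.5, 0.2, 0.4, 20.0)]
radar_b = [(0.8, 0.9, 0.6, 40.0)]
grid = mosaic(radar_a + radar_b)
```

Because `combine` is associative and commutative, the same map and reduce functions could in principle be handed to a distributed engine, which is the property that makes this style of gridding scale across many radars.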


2020 ◽  
Author(s):  
Kary Ocaña ◽  
Micaella Coelho ◽  
Guilherme Freire ◽  
Carla Osthoff

Bayesian phylogenetic algorithms are computationally intensive. Inferences with BEAST 1.10 use the BEAGLE 3 high-performance library for efficient likelihood computations. This strategy supports phylogenetic inference and dating based on current knowledge of SARS-CoV-2 transmission. Through follow-up simulations on hybrid resources of the Santos Dumont supercomputer using four phylogenomic data sets, we characterize the scaling performance of BEAST 1.10. Our results provide insight into species tree and MCMC chain length estimation, identifying preferable requirements to improve the use of high-performance computing resources. Ongoing steps involve analyses of SARS-CoV-2 using BEAST 1.8 on multiple GPUs.


2017 ◽  
Vol 33 (2) ◽  
pp. 119-130
Author(s):  
Vinh Van Le ◽  
Hoai Van Tran ◽  
Hieu Ngoc Duong ◽  
Giang Xuan Bui ◽  
Lang Van Tran

Metagenomics is a powerful approach for studying environmental samples that does not require the isolation and cultivation of individual organisms. One of the essential tasks in a metagenomic project is to identify the origin of reads, referred to as taxonomic assignment. Because each metagenomic project has to analyze large-scale datasets, taxonomic assignment is computationally intensive. This study proposes a parallel algorithm for the taxonomic assignment problem, called SeMetaPL, which aims to address this computational challenge. The proposed algorithm is evaluated with both simulated and real datasets on a high-performance computing system. Experimental results demonstrate that the algorithm achieves good performance and uses the system's resources efficiently. The software implementing the algorithm and all test datasets can be downloaded at http://it.hcmute.edu.vn/bioinfo/metapro/SeMetaPL.html.

