From the desktop to the grid and cloud: conversion of KNIME workflows to WS-PGRADE

2017 ◽  
Author(s):  
Luis de la Garza ◽  
Fabian Aicheler ◽  
Oliver Kohlbacher

Computational analyses for research usually consist of a complicated orchestration of data flows, software libraries, visualization, selection of adequate parameters, etc. Structuring these complex activities into a collaboration of simple, reproducible, and well-defined tasks reduces complexity and increases reproducibility. This is the basic notion of workflows. Workflow engines allow users to create and execute workflows, each engine having unique features. In some cases, certain features offered by platforms are royalty-based, hindering use in the scientific community. We present our efforts to convert whole workflows created in the Konstanz Information Miner (KNIME) Analytics Platform to the Web Services Parallel Grid Runtime and Developer Environment (WS-PGRADE). We see the former as an excellent workflow editor due to its considerable user base and user-friendly graphical interface. We deem the latter an excellent backend engine able to interact with most major distributed computing interfaces. We introduce work that provides a platform-independent tool representation, thus assisting in the conversion of whole workflows. We also present the challenges inherent to workflow conversion across systems, as well as those posed by the conversion between the chosen workflow engines, along with our proposed solution to overcome these challenges. The combined features of these two platforms (i.e., intuitive workflow design on a desktop computer and execution of workflows on distributed high-performance computing interfaces) greatly benefit researchers and minimize the time spent on technical chores not directly related to their area of research.
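
To make the idea of a platform-independent tool representation concrete, the sketch below shows how a single, engine-neutral description of a command-line tool could be rendered both as a KNIME-style node summary and as a WS-PGRADE-style job entry. It is a minimal illustration only: the struct layout, field names, and the example tool are assumptions for demonstration and do not reproduce the converter's actual data model.

```cpp
// Illustrative-only sketch: one engine-neutral tool description rendered into
// two engine-specific views. All type and field names here are hypothetical.
#include <iostream>
#include <string>
#include <vector>

struct ToolPort { std::string name; std::string format; };

struct ToolDescription {
    std::string name;               // tool identifier
    std::string executable;         // command-line binary wrapped by the tool
    std::vector<ToolPort> inputs;   // files the tool consumes
    std::vector<ToolPort> outputs;  // files the tool produces
};

// Render the description as a KNIME-like node summary...
void printAsKnimeNode(const ToolDescription& t) {
    std::cout << "KNIME node: " << t.name << " (wraps " << t.executable << ")\n";
    for (const auto& p : t.inputs)  std::cout << "  in  port: " << p.name << " [" << p.format << "]\n";
    for (const auto& p : t.outputs) std::cout << "  out port: " << p.name << " [" << p.format << "]\n";
}

// ...and as a WS-PGRADE-like job entry with abstract file ports.
void printAsWsPgradeJob(const ToolDescription& t) {
    std::cout << "WS-PGRADE job: " << t.name << ", binary=" << t.executable << "\n";
    for (const auto& p : t.inputs)  std::cout << "  input file:  " << p.name << "\n";
    for (const auto& p : t.outputs) std::cout << "  output file: " << p.name << "\n";
}

int main() {
    ToolDescription tool{"ExampleTool", "example_tool_binary",
                         {{"in", "mzML"}}, {{"out", "featureXML"}}};
    printAsKnimeNode(tool);
    printAsWsPgradeJob(tool);
}
```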


Author(s):  
H. F. Manesh ◽  
M. Hashemipour

The development of Virtual Reality (VR) techniques for the visualization of computational simulations of complex problems has opened new avenues for heat transfer and fluid flow research. The importance of data visualization is clearly recognized for a better understanding of the 3-D nature of flow fields. This work introduces the educational, user-friendly “VRJET” package designed for teaching fluid mechanics and heat transfer. The software is developed in the standard C++ programming language using an object-oriented approach to visualize the flow field with high-performance computing, including advanced support for data presentation and navigation techniques through a 3-D virtual environment. This work deals with the 3-D visualization of data, obtained from numerical simulation, of a single laminar square jet impinging on a heated flat surface. This package can be used for research, educational, and engineering purposes.
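
The object-oriented decomposition mentioned above can be sketched as a data class for the simulated field plus an abstract rendering interface, so that a VR back end and a plain desktop back end are interchangeable. The class and method names below are hypothetical and are not taken from the VRJET code base.

```cpp
// Hypothetical sketch of an object-oriented flow-field visualization design.
#include <cstddef>
#include <vector>

// Scalar/vector samples on a 3-D computational grid (e.g. from a jet simulation).
struct FlowField {
    std::size_t nx, ny, nz;
    std::vector<double> temperature;  // nx*ny*nz scalar values
    std::vector<double> velocity;     // 3 * nx*ny*nz vector components
};

// Abstract rendering back end: a VR display and a desktop window could both
// implement this interface without any change on the simulation side.
class Renderer {
public:
    virtual ~Renderer() = default;
    virtual void drawIsoSurface(const FlowField& f, double isoTemperature) = 0;
    virtual void drawStreamlines(const FlowField& f, std::size_t seedCount) = 0;
};

class VirtualRealityRenderer : public Renderer {
public:
    void drawIsoSurface(const FlowField&, double) override { /* issue VR draw calls */ }
    void drawStreamlines(const FlowField&, std::size_t) override { /* issue VR draw calls */ }
};

void visualize(const FlowField& field, Renderer& renderer) {
    renderer.drawIsoSurface(field, /*isoTemperature=*/350.0);
    renderer.drawStreamlines(field, /*seedCount=*/64);
}

int main() {
    FlowField field{4, 4, 4,
                    std::vector<double>(4 * 4 * 4, 300.0),
                    std::vector<double>(3 * 4 * 4 * 4, 0.0)};
    VirtualRealityRenderer vr;
    visualize(field, vr);
}
```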


Author(s):  
Ana Moreton–Fernandez ◽  
Hector Ortega–Arranz ◽  
Arturo Gonzalez–Escribano

Nowadays, the use of hardware accelerators, such as graphics processing units or Xeon Phi coprocessors, is key to solving computationally costly problems that require high-performance computing. However, programming solutions for an efficient deployment on these kinds of devices is a very complex task that relies on the manual management of memory transfers and configuration parameters. The programmer has to carry out a deep study of the particular data that needs to be computed at each moment, across different computing platforms, also considering architectural details. We introduce the controller concept as an abstract entity that allows the programmer to easily manage the communication and kernel-launching details on hardware accelerators in a transparent way. This model also provides the possibility of defining and launching central processing unit kernels on multi-core processors with the same abstraction and methodology used for the accelerators. It internally combines different native programming models and technologies to exploit the potential of each kind of device. Additionally, the model allows the programmer to simplify the proper selection of values for the several configuration parameters that can be chosen when a kernel is launched. This is done through a qualitative characterization process of the kernel code to be executed. Finally, we present the implementation of the controller model in a prototype library, together with its application in several case studies. Its use has led to reductions in development and porting costs, with very low overheads in execution times when compared to manually programmed and optimized solutions that directly use CUDA and OpenMP.
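
A minimal sketch of the controller idea follows: one object owns a target device, manages the attached data, and launches kernels through a uniform call, regardless of whether the back end would be OpenMP on a CPU or CUDA on a GPU. The API shown is hypothetical and only stands in for the authors' prototype library; the launch body runs sequentially as a placeholder for the real device-specific back ends.

```cpp
// Hypothetical "controller" abstraction: uniform data attachment and kernel
// launch for CPU or GPU targets. Not the authors' actual library API.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

enum class DeviceKind { Cpu, Gpu };

class Controller {
public:
    explicit Controller(DeviceKind kind) : kind_(kind) {}

    // A real implementation would copy to device memory for GPU targets
    // (e.g. via CUDA) and do nothing for CPU kernels; here we only keep a pointer.
    void attach(std::vector<float>& data) { data_ = &data; }

    // Launch a kernel expressed as a per-element function; the controller decides
    // how to map it (OpenMP threads on CPU, a CUDA grid on GPU, ...).
    void launch(const std::string& name, const std::function<float(float)>& kernel) {
        std::cout << "launching '" << name << "' on "
                  << (kind_ == DeviceKind::Gpu ? "GPU" : "CPU") << "\n";
        if (!data_) return;                      // nothing attached, nothing to do
        for (auto& x : *data_) x = kernel(x);    // sequential stand-in for the real back end
    }

private:
    DeviceKind kind_;
    std::vector<float>* data_ = nullptr;
};

int main() {
    std::vector<float> values(1024, 1.0f);
    Controller ctrl(DeviceKind::Cpu);            // the same code would work with DeviceKind::Gpu
    ctrl.attach(values);
    ctrl.launch("scale", [](float x) { return 2.0f * x; });
}
```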


2021 ◽  
Author(s):  
Matthias Arzt ◽  
Joran Deschamps ◽  
Christopher Schmied ◽  
Tobias Pietzsch ◽  
Deborah Schmidt ◽  
...  

We present Labkit, a user-friendly Fiji plugin for the segmentation of microscopy image data. It offers easy-to-use manual and automated image segmentation routines that can be rapidly applied to single- and multi-channel images as well as to time-lapse movies in 2D or 3D. Labkit is specifically designed to work efficiently on big image data and enables users of consumer laptops to conveniently work with multi-terabyte images. This efficiency is achieved by using ImgLib2 and BigDataViewer as the foundation of our software. Furthermore, memory-efficient and fast random-forest-based pixel classification inspired by the Waikato Environment for Knowledge Analysis (Weka) is implemented. Optionally, we harness the power of graphics processing units (GPUs) to gain additional runtime performance. Labkit is easy to install on virtually all laptops and workstations. Additionally, Labkit is compatible with high-performance computing (HPC) clusters for distributed processing of big image data. The ability to use pixel classifiers trained in Labkit via the ImageJ macro language enables our users to integrate this functionality as a processing step in automated image processing workflows. Last but not least, Labkit comes with rich online resources such as tutorials and examples that will help users familiarize themselves with the available features and how to best use Labkit in a number of practical, real-world use cases.


Author(s):  
Ouidad Achahbar ◽  
Mohamed Riduan Abid

The ongoing pervasiveness of Internet access is intensively increasing Big Data production. This, in turn, increases the demand for compute power to process this massive data, thus rendering High Performance Computing (HPC) a highly solicited service. Based on the paradigm of providing computing as a utility, the Cloud offers user-friendly infrastructures for processing Big Data, e.g., High Performance Computing as a Service (HPCaaS). Still, HPCaaS performance is tightly coupled with the underlying virtualization technique, since the latter is responsible for the creation of the virtual machines that carry out data processing jobs. In this paper, the authors evaluate the impact of virtualization on HPCaaS. They track HPC performance under different Cloud virtualization platforms, namely KVM and VMware-ESXi, and compare it against physical clusters. Each tested cluster exhibited different performance trends. Yet, the overall analysis of the findings showed that the selection of the virtualization technology can lead to significant improvements when handling HPCaaS.


Computers ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 147
Author(s):  
Konstantinos M. Giannoutakis ◽  
Christos K. Filelis-Papadopoulos ◽  
George A. Gravvanis ◽  
Dimitrios Tzovaras

In recent years, there has been a tendency to migrate from traditional homogeneous clouds and centralized provisioning of resources to heterogeneous clouds with specialized hardware, governed in a distributed and autonomous manner. The recently proposed CloudLightning architecture introduced a dynamic way to provision heterogeneous cloud resources by shifting the selection of underlying resources from the end-user to the system in an efficient way. In this work, an optimized Suitability Index and assessment function are proposed, along with their theoretical analysis, for improving the computational efficiency, energy consumption, service delivery and scalability of the distributed orchestration. The effectiveness of the proposed scheme is evaluated through simulation, by comparing the optimized methods with the original approach and with traditional centralized resource management, on real and synthetic High Performance Computing applications. Finally, numerical results are presented and discussed regarding the improvements over the defined evaluation criteria.
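
As an illustration of how a suitability-index-style assessment function can drive resource selection, the sketch below scores candidate resources by a simple weighted combination of free capacity and power efficiency and picks the highest-scoring one. The weighting scheme and the fields are assumptions for demonstration; they are not the optimized Suitability Index analyzed in the paper.

```cpp
// Illustrative suitability scoring of candidate resources; the formula is an
// assumption for demonstration, not the paper's optimized Suitability Index.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Resource {
    std::string name;
    double freeCapacity;     // fraction of capacity currently available, in [0, 1]
    double powerEfficiency;  // normalized performance per watt, in [0, 1]
};

// Higher score = more suitable; a simple convex combination of the two criteria.
double suitability(const Resource& r, double capacityWeight = 0.6) {
    return capacityWeight * r.freeCapacity + (1.0 - capacityWeight) * r.powerEfficiency;
}

int main() {
    std::vector<Resource> candidates = {
        {"gpu-partition", 0.40, 0.90},
        {"cpu-partition", 0.85, 0.55},
        {"mic-partition", 0.60, 0.70},
    };
    auto best = *std::max_element(candidates.begin(), candidates.end(),
        [](const Resource& a, const Resource& b) { return suitability(a) < suitability(b); });
    std::cout << "selected resource: " << best.name << "\n";
}
```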


Author(s):  
Dorian Krause ◽  
Philipp Thörnig

JURECA is a petaflop-scale, general-purpose supercomputer operated by the Jülich Supercomputing Centre at Forschungszentrum Jülich. Utilizing a flexible cluster architecture based on T-Platforms V-Class blades and a balanced selection of best-of-its-kind components, the system supports a wide variety of high-performance computing and data analytics workloads and offers a low entrance barrier for new users.


2020 ◽  
Author(s):  
Yasser Iturria-Medina ◽  
Felix Carbonell ◽  
Atoussa Assadi ◽  
Quadri Adewale ◽  
Ahmed F. Khan ◽  
...  

There is a critical need for a better multiscale and multifactorial understanding of neurological disorders, covering everything from genes to neuroimaging to clinical factors and treatment effects. Here we present NeuroPM-box, a cross-platform, user-friendly and open-access software package for characterizing multiscale and multifactorial brain pathological mechanisms and identifying individual therapeutic needs. The implemented methods have been extensively tested and validated in the neurodegenerative context, but there is no restriction on the kind of disorders that can be analyzed. By using advanced analytic modeling of molecular, neuroimaging and/or cognitive/behavioral data, this framework allows multiple applications, including characterization of: (i) the series of sequential states (e.g. transcriptomic, imaging or clinical alterations) covering decades of disease progression, (ii) intra-brain spreading of pathological factors (e.g. amyloid and tau misfolded proteins), (iii) synergistic interactions between multiple brain biological factors (e.g. direct tau effects on vascular and structural properties), and (iv) biologically defined patient stratification based on therapeutic needs (i.e. optimum treatments for each patient). All model outputs are biologically interpretable. A 4D viewer allows visualization of spatiotemporal brain (dis)organization. Originally implemented in MATLAB, NeuroPM-box is compiled as a standalone application for Windows, Linux and Mac environments: neuropm-lab.com/software. On a regular workstation, it can analyze over 150 subjects per day, reducing the need for clusters or High-Performance Computing (HPC) for large-scale datasets. This open-access tool for academic researchers may significantly contribute to a better understanding of complex brain processes and to accelerating the implementation of Precision Medicine (PM) in neurology.
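
As a concrete example of the kind of model behind application (ii), intra-brain spreading of a pathological factor is commonly formalized as diffusion over the structural connectome; the standard network-diffusion form is shown below for illustration and is not necessarily the exact formulation implemented in NeuroPM-box:

\frac{d p_i(t)}{dt} \;=\; \beta \sum_{j=1}^{N} C_{ij}\,\bigl(p_j(t) - p_i(t)\bigr) \;=\; -\beta\,[L\,p(t)]_i ,

where p_i(t) is the burden of the pathological factor (e.g. a misfolded protein) in region i, C_{ij} is the connectome connectivity between regions i and j, L is the corresponding graph Laplacian, and \beta is a global spreading rate.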

