Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare

Total documents: 35 (five years: 0)
H-index: 3 (five years: 0)
Published by: IGI Global
ISBN: 9781605663746, 9781605663753

Latest Publications

Author(s): Livia Torterolo, Luca Corradi, Barbara Canesi, Marco Fato, Roberto Barbera, et al.

This chapter describes a Grid-oriented platform, the Bio Med Portal, as a new tool to promote collaboration and cooperation among scientists and healthcare research groups by enabling the remote use of resources integrated into complex software platform services that form a virtual laboratory. Many biomedical studies nowadays deal with large, distributed, and heterogeneous repositories as well as with computationally demanding analyses, and complex integration techniques are increasingly required to handle this complexity. The Bio Med Portal is designed to host several medical services and can deploy several analysis algorithms. The scope of this chapter is both to present a Grid application with its own medical use case and to emphasize the benefit that a new Grid-based design paradigm could provide to research groups spread across geographically distributed sites.


Author(s): Ignacio Blanquer, Vicente Hernandez

Epidemiology constitutes one relevant use case for the adoption of grids for health. It combines challenges that have traditionally been addressed by grid technologies, such as managing large amounts of distributed and heterogeneous data, large-scale computing, and the need for integration and collaboration tools, but it also introduces new challenges traditionally addressed by the e-health area. The application of grid technologies to epidemiology has concentrated on the federation of distributed data repositories, the evaluation of computationally intensive statistical epidemiological models, and the management of authorisation mechanisms in virtual organisations. However, epidemiology presents important additional constraints that remain unsolved and hamper the take-off of grid technologies. The most important problems are the semantic integration of data, the effective management of security and privacy, the lack of exploitation models for the use of infrastructures, the instability of Quality of Service, and the seamless integration of the technology into the epidemiology environment. This chapter presents an analysis of how these issues are being considered in state-of-the-art research.


Author(s): J.R. Bilbao Castro, I. Garcia Fernandez, J. Fernandez

Three-dimensional electron microscopy allows scientists to study biological specimens and to understand how they behave and interact with each other depending on their structural conformation. Electron microscopy projections of the specimens are taken from different angles and are processed to obtain a virtual three-dimensional reconstruction for further study. Nevertheless, the whole reconstruction process, which is composed of many different subtasks from the microscope to the reconstructed volume, is neither straightforward nor cheap in computational terms. Different computing paradigms have been applied to overcome these high costs. While classic parallel computing using mainframes and clusters of workstations is usually enough for average requirements, some tasks fit better into a different computing paradigm, such as grid computing. Such tasks can be split into a myriad of subtasks, which can then be run independently using as many computational resources as are available. This chapter explores two such tasks present in a typical three-dimensional electron microscopy reconstruction process. In addition, important aspects like fault tolerance are covered in depth, given that the distributed nature of a grid infrastructure makes it inherently unstable and difficult to predict.
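The task-splitting and fault-tolerance ideas above can be illustrated with a small hypothetical sketch (the `reconstruct` and `run_on_grid` names and the callback-based submission interface are illustrative assumptions, not part of the chapter's actual software): each slab of projections is processed independently, and a failed job is simply resubmitted, the usual remedy for unreliable grid nodes.

```python
def run_on_grid(task, submit, max_retries=3):
    """Submit a task, resubmitting on failure (grid nodes are unreliable)."""
    for attempt in range(max_retries):
        try:
            return submit(task)
        except RuntimeError:
            continue  # node failed or timed out: resubmit elsewhere
    raise RuntimeError(f"task {task} failed after {max_retries} attempts")

def reconstruct(projections, n_jobs, submit):
    """Split a tilt series into independent slabs, process each slab on the
    grid, and merge the partial reconstructions into one result."""
    chunk = max(1, len(projections) // n_jobs)
    slabs = [projections[i:i + chunk] for i in range(0, len(projections), chunk)]
    partials = [run_on_grid(slab, submit) for slab in slabs]
    return [v for part in partials for v in part]
```

In a real deployment `submit` would wrap a middleware job-submission call; here it is any callable that may raise on node failure.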


Author(s): Andreas Quandt, Sergio Maffioletti, Cesare Pautasso, Heinz Stockinger, Frederique Lisacek

Proteomics is currently one of the most promising fields in bioinformatics, as it provides important insights into the protein functions of organisms. Mass spectrometry is one of the main techniques used to study the proteome, and several software tools exist for this purpose. The authors provide an extensible software platform called swissPIT that combines different existing tools and exploits Grid infrastructures to speed up the data analysis process in the proteomics pipeline.


Author(s): Mario Cannataro, Pietro Hiram Guzzi, Giuseppe Tradigo, Pierangelo Veltri

Recent advances in high-throughput technologies for analyzing biological samples have enabled researchers to collect huge amounts of data. In particular, mass spectrometry-based proteomics uses mass spectrometry to investigate the proteins expressed in an organism or a cell. Manual inspection of spectra is unfeasible, so a set of algorithms, tools, and platforms is needed to manage and analyze them. Computational proteomics concerns the computational methods for analyzing spectra data in qualitative proteomics (i.e., peptide/protein identification in tandem mass spectrometry), in quantitative proteomics (i.e., protein expression in samples), and in biomarker discovery (i.e., the identification of a molecular signature of a disease directly from spectra). This chapter presents the main standards, tools, and technologies for building scalable, reusable, and portable applications in this field. The chapter surveys available solutions for computational proteomics and includes a detailed description of MS-Analyzer, a Grid-based software platform for the integrated management and analysis of spectra data. MS-Analyzer provides efficient spectra management through a specialized spectra database and supports the semantic composition of pre-processing and data mining services to analyze spectra on the Grid.
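As an illustration of the kind of pre-processing step such a pipeline composes (a generic sketch, not MS-Analyzer's actual algorithm), spectra are often binned into fixed-width m/z intervals so that spectra from different runs share a single feature-vector layout before data mining:

```python
def bin_spectrum(peaks, mz_min, mz_max, bin_width):
    """Aggregate (m/z, intensity) peaks into fixed-width bins, producing a
    feature vector of equal length for every spectrum in a data set."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            vector[int((mz - mz_min) / bin_width)] += intensity
    return vector
```

With a common `(mz_min, mz_max, bin_width)` grid, the resulting vectors can be fed directly to standard classification or clustering services.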


Author(s): Vincent Breton, Eddy Caron, Frederic Desprez, Gael Le Mahec

As grids become more and more attractive for solving complex problems with high computational and storage requirements, bioinformatics applications are starting to be ported to large-scale platforms. The BLAST kernel, one of the main cornerstones of high-performance genomics, was one of the first applications ported to such platforms. However, while a simple parallelization was enough for the first proof of concept, its use on production platforms required more optimized algorithms. In this chapter, we review existing parallelization and “gridification” approaches as well as related issues such as data management and replication, and we present a case study using the DIET middleware over the Grid’5000 experimental platform.
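A common gridification strategy for BLAST, sketched here in hypothetical Python rather than the DIET middleware's actual API, is to partition the sequence database across independent jobs and merge the per-fragment hit lists afterwards:

```python
def split_db(sequences, n_fragments):
    """Partition a sequence database round-robin so each grid job
    scans only one fragment against the query."""
    frags = [[] for _ in range(n_fragments)]
    for i, seq in enumerate(sequences):
        frags[i % n_fragments].append(seq)
    return frags

def merge_hits(partial_hits, top_k):
    """Merge per-fragment (subject, score) hit lists into one ranking.
    Note: BLAST e-values depend on database size, so each job must be
    run with database-wide statistics for scores to be comparable."""
    all_hits = [h for part in partial_hits for h in part]
    return sorted(all_hits, key=lambda h: h[1], reverse=True)[:top_k]
```

Replication of the database fragments across storage elements, as discussed in the chapter, then lets the scheduler place each job near a copy of its fragment.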


Author(s): Aisha Naseer, Lampros Stergiolas

The adoption of cutting-edge technologies to facilitate various healthcare operations and tasks is increasingly important. Health information systems need to be fully integrated with each other and to provide interoperability across organizational domains for ubiquitous access and sharing. The emerging technology of HealthGrids holds the promise to successfully integrate health information systems and various healthcare entities onto a common, globally shared, and easily accessible platform. This chapter presents a systematic taxonomy of different types of HealthGrid resources, in which the specialized resources are categorised into three major types: Data or Information or Files (DIF); Applications & Peripherals (AP); and Services. Resource discovery in HealthGrids is an emerging challenge comprising many technical issues, encapsulating the performance, consistency, compatibility, heterogeneity, integrity, aggregation, and security of life-critical data. To address these challenges, a systematic search strategy could be devised and adopted, as the discovered resource should be valid, refined, and relevant to the query, and standards could be implemented on domain-specific metadata. This chapter proposes potential solutions for the discovery of different types of HealthGrid resources and reflects on discovering and integrating data resources.


Author(s): Marian Bubak, Maciej Malawski, Tomasz Gubala, Marek Kasztelnik, Piotr Nowakowski, et al.

Advanced research in life sciences calls for new information technology solutions to support complex, collaborative computer simulations and result analysis. This chapter presents the ViroLab virtual laboratory, an integrated system of dedicated tools and services that provides a common space for planning, building, improving, and performing in-silico experiments by different groups of users. Within the virtual laboratory, collaborative applications are built as experiment plans using a notation based on the Ruby scripting language. During experiment execution, provenance data is created and stored. The virtual laboratory enables access to distributed, heterogeneous data resources and to computational resources in Grid systems, clusters, and standalone computers. The process of application development as well as the architecture and functionality of the virtual laboratory are demonstrated using a real-life example from the HIV treatment domain.


Author(s): Fotis Psomopoulos, Pericles Mitkas

The scope of this chapter is the presentation of data mining techniques for knowledge extraction in proteomics, taking into account both the particular features of most proteomics problems (such as data retrieval and system complexity) and the opportunities and constraints found in a Grid environment. The chapter discusses how new and potentially useful knowledge can be extracted from proteomics data while utilizing Grid resources in a transparent way. Protein classification is introduced as a current research issue in proteomics, one which also exhibits most of the domain-specific traits. An overview of common and custom-made data mining algorithms is provided, with emphasis on the specific needs of protein classification problems. A unified methodology is presented for complex data mining processes on the Grid, highlighting the different application types and the benefits and drawbacks of each case. Finally, the methodology is validated through real-world case studies deployed over the EGEE grid environment.


Author(s): Giulia De Sario, Angelica Tulipano, Giacinto Donvito, Giorgio Maggi

The number of fully sequenced genomes increases daily, producing an exponential explosion of the sequence, annotation, and metadata databases. Data analysis at a genome-wide level, or investigation within a specific data repository, has become a data- and computation-intensive process occupying single computers and even large computer clusters for months or even years. In most cases such applications can be subdivided into many independent smaller tasks, which are particularly well suited to distribution over a computational Grid infrastructure, drastically reducing the time to reach the final result. In our analysis of gene ontology data and their associations with gene products from any kind of organism, carried out in a search for gene products with similar functionalities, we developed a system that divides the full search into a large number of jobs and submits these jobs to the Grid infrastructure until all of them are processed successfully, guaranteeing an analysis of the data without missing any information.
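The submit-until-done strategy described above can be sketched as follows; the job identifiers and the `submit` callable are illustrative assumptions, since the authors' system targets an actual Grid middleware rather than in-process calls:

```python
def process_all(jobs, submit, max_rounds=10):
    """Resubmit failed jobs in successive rounds until every job has
    succeeded, so no part of the genome-wide search is silently lost."""
    results, pending = {}, set(jobs)
    for _ in range(max_rounds):
        if not pending:
            break
        for job in sorted(pending):  # sorted() copies, so discarding is safe
            try:
                results[job] = submit(job)
                pending.discard(job)
            except RuntimeError:
                pass  # transient grid failure: retry in the next round
    if pending:
        raise RuntimeError(f"jobs never completed: {sorted(pending)}")
    return results
```

Tracking completion per job, rather than per batch, is what guarantees the final gene-ontology analysis covers the data without missing any information.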

