The Cinderella of Biological Data Integration: Addressing Some of the Challenges of Entity and Relationship Mining from Patent Sources

The volume of information derived from post-genomic technologies is increasing rapidly. Given the amount of data involved, novel computational methods are needed for analysis and knowledge discovery in the massive data sets these technologies produce. Data integration is also gaining attention as a way to merge signals from different sources and discover unknown relations. This chapter presents a pipeline for biological data integration and the discovery of a priori unknown relationships between gene expression and metabolite accumulation. In this pipeline, two standard clustering methods are compared against a novel neural network approach. The neural model provides a simple visualization interface for identifying coordinated pattern variations, independently of the number of clusters produced. Several quality measures are defined to evaluate the clustering results obtained in a case study involving transcriptomic and metabolomic profiles from tomato fruits. Moreover, a method is proposed for evaluating the biological significance of the clusters found. The neural model performed well on most of the quality measures, showing internal coherence in all identified clusters and better visualization capabilities.

An Application Driven Perspective on Biological Data Integration

Lecture Notes in Computer Science - Data Integration in the Life Sciences ◽

10.1007/11799511_1 ◽

2006 ◽

pp. 1-1

Author(s):

Victor M. Markowitz

Keyword(s):

Data Integration ◽

Biological Data ◽

Biological Data Integration

A Multiomics Graph Database System for Biological Data Integration and Cancer Informatics

Journal of Computational Biology ◽

10.1089/cmb.2020.0231 ◽

2020 ◽

Author(s):

Ishwor Thapa ◽

Hesham Ali

Keyword(s):

Data Integration ◽

Database System ◽

Biological Data ◽

Graph Database ◽

Biological Data Integration

Challenges in Biological Data Integration in the Post-genome Sequence Era

Lecture Notes in Computer Science - Data Integration in the Life Sciences ◽

10.1007/11530084_1 ◽

2005 ◽

pp. 1-1 ◽

Cited By ~ 2

Author(s):

Shankar Subramaniam

Keyword(s):

Data Integration ◽

Genome Sequence ◽

Biological Data ◽

Biological Data Integration

Sparse canonical methods for biological data integration: application to a cross-platform study

BMC Bioinformatics ◽

10.1186/1471-2105-10-34 ◽

2009 ◽

Vol 10 (1) ◽

Cited By ~ 144

Author(s):

Kim-Anh Lê Cao ◽

Pascal GP Martin ◽

Christèle Robert-Granié ◽

Philippe Besse

Keyword(s):

Data Integration ◽

Biological Data ◽

Cross Platform ◽

Biological Data Integration

Biological data integration using Semantic Web technologies

Biochimie ◽

10.1016/j.biochi.2008.02.007 ◽

2008 ◽

Vol 90 (4) ◽

pp. 584-594 ◽

Cited By ~ 23

Author(s):

C. Pasquier

Keyword(s):

Semantic Web ◽

Data Integration ◽

Biological Data ◽

Semantic Web Technologies ◽

Web Technologies ◽

Biological Data Integration

Data modeling: the key to biological data integration

EMBnet journal ◽

10.14806/ej.18.b.550 ◽

2012 ◽

Vol 18 (B) ◽

pp. 59 ◽

Cited By ~ 1

Author(s):

François Rechenmann

Keyword(s):

Data Integration ◽

Data Modeling ◽

Biological Data ◽

Biological Data Integration

A Comparative Analysis of Biological Data Integration Systems Famous for Data Exploitation and Knowledge Discovery

Current Bioinformatics ◽

10.2174/1574893615999210101125442 ◽

2021 ◽

Vol 15 ◽

Author(s):

Omer Irshad ◽

Muhammad Usman Ghani Khan

Keyword(s):

Comparative Analysis ◽

Data Integration ◽

Biological Data ◽

Omics Data ◽

Data Generation ◽

Biological Databases ◽

Future Data ◽

Design Characteristics ◽

Biological Data Integration ◽

Omics Data Integration

Integrating heterogeneous biological databases to unveil new intra-molecular and inter-molecular attributes, behaviors, and relationships in the human cellular system has long been an active research area in computational biology. In this context, many biological data integration systems have been deployed over the last couple of decades. A primary objective common to all these systems is to help end users explore, exploit, and analyze the integrated biological data for knowledge extraction. With the advent of high-throughput data generation technologies in particular, biological data is growing continuously and exponentially while becoming more heterogeneous and geographically dispersed. As a result, biological data integration systems also face current and future challenges related to data integration and data organization. The objective of this review is to quantitatively evaluate and compare some of the recent warehouse-based multi-omics data integration systems and check their compliance with current and future data integration needs. To this end, we identified some of the major design characteristics that a multi-omics data integration model should have in order to address these challenges comprehensively. Based on these design characteristics and the evaluation criteria, we evaluated some of the recent data warehouse systems and present categorical and comparative analysis results. The results show that most of the systems exhibit no compliance or only partial compliance with the required data integration design characteristics. These systems therefore need design improvements to adequately address current and future data integration challenges while keeping their service-level commitments in place.
