Plant functional genomics: opportunities in microarray databases and data mining

2004 ◽  
Vol 31 (4) ◽  
pp. 295 ◽  
Author(s):  
Gavin C. Kennedy ◽  
Iain W. Wilson

High-throughput gene expression profiling using microarrays has given plant biologists a powerful new technology to discover gene function and understand cellular processes. Bioinformatics has rapidly developed to deliver the tools necessary to interpret these gene expression data, but opportunities to further exploit the mass of data from hundreds of experiments are becoming dependent upon the use of sophisticated database repositories. Data mining of these resources will allow plant biologists to compare and link expression profiles and experimental factors to uncover functions and processes that would not normally be visible from analysing a small set of microarray experiments. This in silico analysis will become critical when designing new experiments and interpreting new results. Consequently, microarray databases and their ongoing development are now as important to plant functional genomics as the initial microarray data capture and analysis tools. In order for plant biologists to grasp these new opportunities, an appreciation of microarray database technology and future developments in biological data integration is required. The challenge for plant functional genomics is to embrace these new technologies lest the opportunities for significant discoveries be lost.


2008 ◽  
pp. 1696-1705
Author(s):  
George Tzanis ◽  
Christos Berberidis ◽  
Ioannis Vlahavas

At the end of the 1980s, a new discipline named data mining emerged. The introduction of new technologies such as computers, satellites, new mass storage media, and many others has led to an exponential growth of collected data. Traditional data analysis techniques often fail to process large amounts of often noisy data efficiently in an exploratory fashion. The scope of data mining is the extraction of knowledge from large amounts of data with the help of computers. It is an interdisciplinary area of research that has its roots in databases, machine learning, and statistics and has contributions from many other areas such as information retrieval, pattern recognition, visualization, and parallel and distributed computing. There are many applications of data mining in the real world. Customer relationship management, fraud detection, market and industry characterization, stock management, medicine, pharmacology, and biology are some examples (Two Crows Corporation, 1999).



2019 ◽  
Author(s):  
Dan MacLean

Gene regulatory networks that control gene expression are widely studied, yet the interactions that make them up are difficult to predict from high-throughput data. Deep learning methods such as convolutional neural networks can perform surprisingly good classifications on a variety of data types, and the matrix-like gene expression profiles would seem to be ideal input data for deep learning approaches. In this short study I compiled training sets of expression data using the Arabidopsis AtGenExpress global stress expression data set and known transcription factor-target interactions from the Arabidopsis PLACE database. I built and optimised convolutional neural networks with a best model providing 95 % accuracy of classification on a held-out validation set. Investigation of the activations within this model revealed that classification was based on positive correlation of expression profiles in short sections. This result shows that a convolutional neural network can be used to make classifications and reveal the basis of those classifications for gene expression data sets, indicating that a convolutional neural network is a useful and interpretable tool for exploratory classification of biological data. The final model is available for download and as a web application.
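To make the phrase "positive correlation of expression profiles in short sections" concrete, the following is a minimal sketch that computes Pearson correlation over a sliding window across two expression profiles. It is not the convolutional model described in the abstract; the function names, the window length of 4, and the toy profiles are all assumptions chosen purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def windowed_correlation(tf_profile, target_profile, window=4):
    """Correlation of each short section (hypothetical window size) of two profiles."""
    if len(tf_profile) != len(target_profile):
        raise ValueError("profiles must have the same length")
    if window < 2 or window > len(tf_profile):
        raise ValueError("window must be between 2 and the profile length")
    return [
        pearson(tf_profile[i:i + window], target_profile[i:i + window])
        for i in range(len(tf_profile) - window + 1)
    ]


# Toy profiles (made-up values, not AtGenExpress data): the two genes rise
# together in the first section and move in opposite directions in the last.
tf = [1, 2, 3, 4, 4, 3, 2, 1]
target = [2, 4, 6, 8, 1, 2, 3, 4]
scores = windowed_correlation(tf, target, window=4)
```

A pair whose windowed scores are strongly positive in at least some sections would, under this simplified reading, resemble the pattern the abstract reports the network responding to.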


2004 ◽  
Vol 77 (3) ◽  
pp. 430-452 ◽  
Author(s):  
Thomas V. Getchell ◽  
Xuejun Peng ◽  
C. Paul Green ◽  
Arnold J. Stromberg ◽  
Kuey-Chu Chen ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Kent M. Reed ◽  
Kristelle M. Mendoza ◽  
Juan E. Abrahante ◽  
Sandra G. Velleman ◽  
Gale M. Strasburg

Precise regulation of gene expression is critical for normal muscle growth and development. Changes in gene expression patterns caused by external stressors such as temperature can have dramatic effects, including altered cellular structure and function. Understanding the cellular mechanisms that underlie muscle growth and development, and how these are altered by external stressors, is crucial in maintaining and improving meat quality. This study investigated circular RNAs (circRNAs) as an emerging aspect of gene regulation. We used data mining to identify circRNAs and characterize their expression profiles within RNAseq data collected from thermally challenged turkey poults of the RBC2 and F-lines. From sequences of 28 paired-end libraries, 8924 unique circRNAs were predicted, of which 1629 were common to all treatment groups. Expression analysis identified significant differentially expressed circRNAs (DECs) in comparisons between thermal treatments (41 DECs) and between genetic lines (117 DECs). No intersection was observed between the DECs and differentially expressed gene transcripts, indicating that the DECs are not simply the result of expression changes in the parental genes. Comparative analyses based on the chicken microRNA (miRNA) database suggest potential interactions between turkey circRNAs and miRNAs. Additional studies are needed to reveal the functional significance of the predicted circRNAs and their role in muscle development in response to thermal challenge. The DECs identified in this study provide an important framework for future investigation.


Data Mining ◽  
2013 ◽  
pp. 203-230
Author(s):  
Diego Milone ◽  
Georgina Stegmayer ◽  
Matías Gerard ◽  
Laura Kamenetzky ◽  
Mariana López ◽  
...  

The volume of information derived from post-genomic technologies is rapidly increasing. Due to the amount of data involved, novel computational methods are needed for analysis and knowledge discovery in the massive data sets produced by these new technologies. Furthermore, data integration is also gaining attention for merging signals from different sources in order to discover unknown relations. This chapter presents a pipeline for biological data integration and discovery of a priori unknown relationships between gene expressions and metabolite accumulations. In this pipeline, two standard clustering methods are compared against a novel neural network approach. The neural model provides a simple visualization interface for identification of coordinated pattern variations, independently of the number of clusters produced. Several quality measurements have been defined for the evaluation of the clustering results obtained on a case study involving transcriptomic and metabolomic profiles from tomato fruits. Moreover, a method is proposed for the evaluation of the biological significance of the clusters found. The neural model has shown high performance in most of the quality measures, with internal coherence in all the identified clusters and better visualization capabilities.

