Ensemble gene function prediction database reveals genes important for complex I formation in Arabidopsis thaliana

2017 ◽  
Author(s):  
Bjoern Oest Hansen ◽  
Etienne H. Meyer ◽  
Camilla Ferrari ◽  
Neha Vaid ◽  
Sara Movahedi ◽  
...  

Despite increasing availability of sequenced genomes, accurate characterization of gene functions is needed to close the genotype-phenotype gap. Recent advances in gene function prediction rely on ensemble approaches that integrate the results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We present Neighbor Counting Ensemble, a gene function prediction method which integrates eleven gene co-function networks for Arabidopsis thaliana, and produces more accurate gene function predictions for a larger fraction of genes with unknown function. We used these predictions to identify genes involved in mitochondrial complex I formation, and for five of them we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet, available at http://www.gene2function.de/ensemblenet.html.
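The abstract does not spell out how the eleven networks are combined, so the following is only a minimal sketch of the general idea of neighbor counting across several networks. It assumes that each network is a mapping from a gene to its neighbors and that a candidate function is scored by how often it occurs among the annotated neighbors, summed over all networks. The function names, gene names, and toy annotations are hypothetical and are not taken from EnsembleNet.

```python
from collections import Counter


def neighbor_counting(gene, network, annotations):
    """Count how often each function label occurs among a gene's neighbors.

    `network` maps a gene to an iterable of neighboring genes (assumed layout).
    `annotations` maps a gene to a set of known function labels.
    """
    counts = Counter()
    for neighbor in network.get(gene, ()):
        counts.update(annotations.get(neighbor, ()))
    return counts


def ensemble_predict(gene, networks, annotations, top_n=3):
    """Sum the per-network neighbor counts and return the top-scoring labels."""
    total = Counter()
    for network in networks:
        total.update(neighbor_counting(gene, network, annotations))
    return total.most_common(top_n)


# Hypothetical toy data: two networks instead of eleven.
networks = [
    {"geneX": {"geneA", "geneB"}},
    {"geneX": {"geneA", "geneC"}},
]
annotations = {
    "geneA": {"complex I assembly"},
    "geneB": {"photosynthesis"},
    "geneC": {"complex I assembly"},
}

predictions = ensemble_predict("geneX", networks, annotations)
print(predictions)
```

In this toy case the label supported by neighbors in both networks ranks first. A real ensemble would likely also weight the networks and normalize for how common each label is, which this sketch leaves out.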

Genes ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 428 ◽  
Author(s):  
Qiao Wen Tan ◽  
William Goh ◽  
Marek Mutwil

As genomes become more and more available, gene function prediction presents itself as one of the major hurdles in our quest to extract meaningful information on the biological processes genes participate in. In order to facilitate gene function prediction, we show how our user-friendly pipeline, the Large-Scale Transcriptomic Analysis Pipeline in Cloud (LSTrAP-Cloud), can be useful in helping biologists make a shortlist of genes involved in a biological process that they might be interested in, by using a single gene of interest as bait. The LSTrAP-Cloud is based on Google Colaboratory, and provides user-friendly tools that process and quality-control RNA sequencing data streamed from the European Nucleotide Archive. The LSTrAP-Cloud outputs a gene coexpression network that can be used to identify functionally related genes for any organism with a sequenced genome and publicly available RNA sequencing data. Here, we used the biosynthesis pathway of Nicotiana tabacum as a case study to demonstrate how enzymes, transporters, and transcription factors involved in the synthesis, transport, and regulation of nicotine can be identified using our pipeline.
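The bait-gene approach described above can be illustrated with a small sketch that ranks genes by how closely their expression profiles follow that of a bait gene. The abstract does not state which similarity measure LSTrAP-Cloud uses, so the Pearson correlation here is an assumption, and the gene names and expression values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile carries no co-expression signal.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_coexpressed(bait, expression, top_n=5):
    """Return up to `top_n` genes sorted by correlation with the bait gene."""
    bait_profile = expression[bait]
    scores = [
        (gene, pearson(bait_profile, profile))
        for gene, profile in expression.items()
        if gene != bait
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical expression values across four RNA-seq samples.
expression = {
    "bait": [1.0, 2.0, 3.0, 4.0],
    "gene_up": [2.0, 4.0, 6.0, 8.0],
    "gene_down": [4.0, 3.0, 2.0, 1.0],
    "gene_flat": [5.0, 5.0, 5.0, 5.0],
}

ranked = rank_coexpressed("bait", expression)
for gene, score in ranked:
    print(gene, round(score, 3))
```

Genes at the top of such a ranking would form the shortlist of candidates to inspect. In practice the profiles would come from many quality-controlled samples rather than four invented values.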


Author(s):  
Qiao Wen Tan ◽  
William Goh ◽  
Marek Mutwil

Abstract As genomes become more and more available, gene function prediction presents itself as one of the major hurdles in our quest to extract meaningful information on the biological processes genes participate in. In order to facilitate gene function prediction, we show how our user-friendly pipeline, Large-Scale Transcriptomic Analysis Pipeline in Cloud (LSTrAP-Cloud), can be useful in helping biologists make a shortlist of genes that they might be interested in. LSTrAP-Cloud is based on Google Colaboratory and provides user-friendly tools that process and quality-control RNA sequencing data streamed from the European Nucleotide Archive. LSTrAP-Cloud outputs a gene co-expression network that can be used to identify functionally related genes for any organism with a sequenced genome and publicly available RNA sequencing data. Here, we used the biosynthesis pathway of Nicotiana tabacum as a case study to demonstrate how enzymes, transporters and transcription factors involved in the synthesis, transport and regulation of nicotine can be identified using our pipeline.


2017 ◽  
Vol 217 (4) ◽  
pp. 1521-1534 ◽  
Author(s):  
Bjoern Oest Hansen ◽  
Etienne H. Meyer ◽  
Camilla Ferrari ◽  
Neha Vaid ◽  
Sara Movahedi ◽  
...  

Genetics ◽  
2001 ◽  
Vol 158 (3) ◽  
pp. 1051-1060
Author(s):  
Claire Remacle ◽  
Denis Baurain ◽  
Pierre Cardol ◽  
René F Matagne

Abstract The mitochondrial rotenone-sensitive NADH:ubiquinone oxidoreductase (complex I) comprises more than 30 subunits, the majority of which are encoded by the nucleus. In Chlamydomonas reinhardtii, only five components of complex I are coded for by mitochondrial genes. Three mutants deprived of complex I activity and displaying slow growth in the dark were isolated after mutagenic treatment with acriflavine. A genetical analysis demonstrated that two mutations (dum20 and dum25) affect the mitochondrial genome whereas the third mutation (dn26) is of nuclear origin. Recombinational analyses showed that dum20 and dum25 are closely linked on the genetic map of the mitochondrial genome and could affect the nd1 gene. A sequencing analysis confirmed this conclusion: dum20 is a deletion of one T at codon 243 of nd1; dum25 corresponds to a 6-bp deletion that eliminates two amino acids located in a very conserved hydrophilic segment of the protein.


2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Robert Pesch ◽  
Artem Lysenko ◽  
Matthew Hindle ◽  
Keywan Hassani-Pak ◽  
Ralf Thiele ◽  
...  

Summary The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database AraCyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.


Biochemistry ◽  
1998 ◽  
Vol 37 (18) ◽  
pp. 6436-6445 ◽  
Author(s):  
Michiyo Ohshima ◽  
Hideto Miyoshi ◽  
Kimitoshi Sakamoto ◽  
Kazuhiro Takegami ◽  
Jun Iwata ◽  
...  
