Protist.Guru: A Comparative Transcriptomics Database for the Protist Kingdom

2021 ◽  
Author(s):  
Erielle Marie Fajardo Villanueva ◽  
Peng Ken Lim ◽  
Jolyn Jia Jia Lim ◽  
Shan Chun Lim ◽  
Pei Yi Lau ◽  
...  

Summary: During the last few decades, the study of microbial ecology has been enabled by molecular and genomic data. DNA sequencing has revealed the surprising extent of microbial diversity and how microbial processes run global ecosystems. However, significant gaps in our understanding of the microbial world remain, and one example is that microbial eukaryotes, or protists, are still neglected. To address this gap, we used gene expression data from 15 distinct protist species to create protist.guru: an online database equipped with tools for identifying functional co-expression networks, gene families, and enriched gene clusters. Here, we show how our database can be used to reveal genes involved in essential pathways, such as the synthesis of secondary carotenoids in Haematococcus lacustris. We expect protist.guru to serve as a valuable resource for protistologists, as well as a catalyst for discoveries and new insights into the biological processes of microbial eukaryotes. Availability: The database and co-expression networks are freely available from http://protist.guru/. The expression matrices and sample annotations are found in the supplementary data.
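The abstract does not describe how protist.guru builds its co-expression networks, so the following is only a minimal sketch of the general idea: two genes are linked when their expression profiles correlate strongly across samples. The Pearson measure, the 0.8 threshold, the function names, and the toy gene names are all assumptions made for illustration, not the database's actual method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair with r >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is a hypothetical cut-off chosen for this sketch.
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if r >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy matrix: geneB tracks geneA across four samples, geneC does not.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.2, 7.9],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
edges = coexpression_edges(toy_expression)
```

On this toy input only the geneA/geneB pair passes the threshold. Real pipelines typically normalize expression values first and may use rank-based measures instead of a fixed correlation cut-off.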


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Li Guo ◽  
Yang Zhao ◽  
Sheng Yang ◽  
Hui Zhang ◽  
Feng Chen

MicroRNAs (miRNAs) are small, noncoding regulatory molecules. They are involved in many essential biological processes and act by suppressing gene expression. The present work reports an integrative analysis of miRNA-mRNA and miRNA-miRNA interactions and their regulatory patterns using high-throughput miRNA and mRNA datasets. Aberrantly expressed miRNA and mRNA profiles were obtained based on fold change analysis, and qRT-PCR was used for further validation of deregulated miRNAs. miRNAs and target mRNAs were found to show various expression patterns. miRNA-miRNA interactions and clustered/homologous miRNAs were also found to contribute to the flexible and selective regulatory network. Interacting miRNAs (e.g., miR-103a and miR-103b) showed more pronounced differences in expression, which suggests a potential “restricted interaction” in the miRNA world. miRNAs from the same gene clusters (e.g., the miR-23b gene cluster) or gene families (e.g., the miR-10 gene family) always showed the same types of deregulation patterns, although they sometimes differed in expression levels. These clustered and homologous miRNAs may have close functional relationships, which may indicate collaborative interactions between miRNAs. Integrative analysis of miRNA-mRNA interactions based on the biological characteristics of miRNAs will further enrich miRNA research.
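The fold change analysis mentioned above can be sketched as follows. The abstract gives no cut-offs, so the log2 threshold of 1 (a twofold change), the pseudocount, the function name, and the toy miRNA names and values are assumptions for illustration only.

```python
from math import log2


def deregulated(case, control, min_abs_log2fc=1.0, pseudocount=1.0):
    """Label each molecule 'up' or 'down' when |log2 fold change| is large.

    `case` and `control` map a name to an expression value. The pseudocount
    avoids division by zero for unexpressed molecules. Both the threshold
    and the pseudocount are hypothetical defaults.
    """
    calls = {}
    for name in case:
        lfc = log2((case[name] + pseudocount) / (control[name] + pseudocount))
        if abs(lfc) >= min_abs_log2fc:
            calls[name] = "up" if lfc > 0 else "down"
    return calls


# Toy values chosen so the arithmetic is easy to follow.
case_expr = {"miR-x": 31.0, "miR-y": 7.0, "miR-z": 10.0}
control_expr = {"miR-x": 7.0, "miR-y": 31.0, "miR-z": 11.0}
calls = deregulated(case_expr, control_expr)
```

Here miR-x has a log2 fold change of 2 (32/8) and miR-y of -2 (8/32), so both are called, while miR-z changes too little to pass the threshold.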


2020 ◽  
Author(s):  
Jolyn Jia Jia Lim ◽  
Jace Koh ◽  
Jia Rong Moo ◽  
Erielle Marie Fajardo Villanueva ◽  
Dhira Anindya Putri ◽  
...  

The fungi kingdom is composed of eukaryotic heterotrophs, which are responsible for balancing the ecosystem and play a major role as decomposers. They also produce a vast diversity of secondary metabolites, which have antibiotic or pharmacological properties. However, our lack of knowledge of gene function in fungi precludes us from tailoring them to our needs and tapping into their metabolic diversity. To remedy this, we gathered genomic and gene expression data for the 19 most widely researched fungi to build a database, fungi.guru, which contains tools for cross-species identification of conserved pathways, functional gene modules, and gene families. We exemplify how our database can elucidate the molecular function, biological process, and cellular component of genes involved in various biological processes by identifying a secondary metabolite pathway producing gliotoxin in Aspergillus fumigatus, the catabolic pathway of cellulose in Coprinopsis cinerea, and the conserved DNA replication pathway in Fusarium graminearum and Pyricularia oryzae. The database is available at www.fungi.guru.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Alexandre Perochon ◽  
Harriet R. Benbow ◽  
Katarzyna Ślęczka-Brady ◽  
Keshav B. Malla ◽  
Fiona M. Doohan

There is increasing evidence that some functionally related, co-expressed genes cluster within eukaryotic genomes. We present a novel pipeline that delineates such eukaryotic gene clusters. Using this tool for bread wheat, we uncovered 44 clusters of genes that are responsive to the fungal pathogen Fusarium graminearum. As expected, these Fusarium-responsive gene clusters (FRGCs) included metabolic gene clusters, many of which are associated with disease resistance but have not hitherto been described for wheat. However, the majority of the FRGCs are non-metabolic, and many contain clusters of paralogues, including those implicated in plant disease responses, such as glutathione transferases, MAP kinases, and germin-like proteins. Twenty of the FRGCs encode nonhomologous, non-metabolic genes (including defence-related genes). One of these clusters includes the characterised Fusarium resistance orphan gene, TaFROG. Eight of the FRGCs map within 6 Fusarium head blight (FHB) resistance loci. One small QTL on chromosome 7D (4.7 Mb) encodes eight Fusarium-responsive genes, five of which are within an FRGC. This study provides a new tool to identify genomic regions enriched in genes responsive to specific traits of interest; applied herein, it highlighted gene families, genetic loci, and biological pathways of importance in the response of wheat to disease.
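The abstract does not say how the pipeline delineates clusters. Purely as an illustration of the underlying idea, the sketch below walks along a chromosome in gene order and groups responsive genes that lie close together. The `max_gap` and `min_size` parameters, the function name, and the toy input are hypothetical and do not describe the authors' tool.

```python
def find_clusters(responsive_flags, max_gap=1, min_size=3):
    """Group responsive genes that are physically adjacent on a chromosome.

    `responsive_flags` lists genes in chromosomal order (truthy = responsive).
    Two responsive genes join the same cluster when at most `max_gap`
    non-responsive genes separate them. Clusters with fewer than `min_size`
    responsive genes are dropped. Returns lists of gene indices.
    """
    clusters = []
    current = []
    for index, is_responsive in enumerate(responsive_flags):
        if not is_responsive:
            continue
        # Close the running cluster when the gap to the previous hit is too wide.
        if current and index - current[-1] - 1 > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(index)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Toy chromosome of 13 genes; 1 marks a pathogen-responsive gene.
flags = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1]
clusters = find_clusters(flags)
```

With these toy flags, genes 0, 1, and 3 form one cluster and genes 10 to 12 another, while the isolated responsive gene at index 7 is discarded. A real pipeline would also need a statistical test that the observed clustering exceeds what is expected by chance.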


2021 ◽  
Vol 7 (6) ◽  
pp. 485
Author(s):  
Boxun Li ◽  
Yang Yang ◽  
Jimiao Cai ◽  
Xianbao Liu ◽  
Tao Shi ◽  
...  

Rubber tree Corynespora leaf fall (CLF) disease, caused by the fungus Corynespora cassiicola, is one of the most damaging diseases in rubber tree plantations in Asia and Africa, and this disease also threatens rubber nurseries and young rubber plantations in China. C. cassiicola isolates display high genetic diversity, and virulence profiles vary significantly depending on cultivar. Although one phytotoxin (cassiicolin) has been identified, it cannot fully explain the diversity in pathogenicity between C. cassiicola species, and some virulent C. cassiicola strains do not contain the cassiicolin gene. In the present study, we report high-quality gapless genome sequences, obtained using short-read sequencing and single-molecule long-read sequencing, of two virulent Chinese C. cassiicola strains. Comparative genomics of gene families in these two strains and a virulent CCP strain from the Philippines showed that all three strains experienced different selective pressures, and metabolism-related gene families vary between the strains. Secreted protein analysis indicated that the quantities of secreted cell wall-degrading enzymes were correlated with pathogenesis, and the most aggressive CCP strain (cassiicolin toxin type 1) encoded 27.34% and 39.74% more secreted carbohydrate-active enzymes (CAZymes) than Chinese strains YN49 and CC01, respectively, both of which can only infect rubber tree saplings. The results of antiSMASH analysis showed that all three strains encode ~60 secondary metabolite biosynthesis gene clusters (SM BGCs). Phylogenomic and domain structure analyses of core synthesis genes, together with synteny analysis of polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) gene clusters, revealed diversity in the distribution of SM BGCs between strains, as well as SM polymorphisms, which may play an important role in pathogenic progress. The results expand our understanding of the C. cassiicola genome. Further comparative genomic analysis indicates that secreted CAZymes and SMs may influence pathogenicity in rubber tree plantations. The findings facilitate future exploration of the molecular pathogenic mechanism of C. cassiicola.


2019 ◽  
Vol 116 (37) ◽  
pp. 18498-18506 ◽  
Author(s):  
Yoshitaka Fujihara ◽  
Taichi Noda ◽  
Kiyonori Kobayashi ◽  
Asami Oji ◽  
Sumire Kobayashi ◽  
...  

CRISPR/Cas9-mediated genome editing technology enables researchers to efficiently generate and analyze genetically modified animals. We have taken advantage of this game-changing technology to uncover essential factors for fertility. In this study, we generated knockouts (KOs) of multiple male reproductive organ-specific genes and performed phenotypic screening of these null mutant mice to attempt to identify proteins essential for male fertility. We focused on making large deletions (dels) within 2 gene clusters encoding cystatin (CST) and prostate and testis expressed (PATE) proteins and individual gene mutations in 2 other gene families encoding glycerophosphodiester phosphodiesterase domain (GDPD) containing and lymphocyte antigen 6 (Ly6)/Plaur domain (LYPD) containing proteins. These gene families were chosen because many of the genes demonstrate male reproductive tract-specific expression. Although Gdpd1 and Gdpd4 mutant mice were fertile, disruptions of Cst and Pate gene clusters and Lypd4 resulted in male sterility or severe fertility defects secondary to impaired sperm migration through the oviduct. While absence of the epididymal protein families CST and PATE affects the localization of the sperm membrane protein A disintegrin and metallopeptidase domain 3 (ADAM3), the sperm acrosomal membrane protein LYPD4 regulates sperm fertilizing ability via an ADAM3-independent pathway. Thus, use of CRISPR/Cas9 technologies has allowed us to quickly rule in and rule out proteins required for male fertility and expand our list of male-specific proteins that function in sperm migration through the oviduct.


Author(s):  
Conghui Liu ◽  
Yuwei Ren ◽  
Zaiyuan Li ◽  
Qi Hu ◽  
Lijuan Yin ◽  
...  

Whole-genome duplication (WGD) has been observed across a wide variety of eukaryotic groups, contributing to evolutionary diversity and environmental adaptability. Mollusks are the second largest group of animals and are among the organisms that have successfully adapted to the nonmarine realm through aquatic-terrestrial (A-T) transition, yet no comprehensive research on WGD has been reported in this group. To explore WGD and the A-T transition in Mollusca, we assembled a chromosome-level reference genome for the giant African snail Achatina immaculata, a global invasive species, and compared the genomes of two giant African snails (A. immaculata and Achatina fulica) to the other available mollusk genomes. The chromosome-level macrosynteny, colinearity blocks, Ks peak and Hox gene clusters collectively suggested the occurrence of a WGD event shared by A. immaculata and A. fulica. The estimated timing of this WGD event (∼70 MYA) was close to the speciation age of the Sigmurethra-Orthurethra (within Stylommatophora) lineage and the Cretaceous-Tertiary (K-T) mass extinction, indicating that the WGD reported herein may have been a common event shared by all Sigmurethra-Orthurethra species and could have conferred ecological adaptability and genomic plasticity allowing the survival of the K-T extinction. Based on macrosynteny, we deduced an ancestral karyotype containing 8 conserved clusters for the Gastropoda-Bivalvia lineage. To reveal the mechanism of WGD in shaping adaptability to terrestrial ecosystems, we investigated gene families related to the respiration, aestivation and immune defense of giant African snails. Several mucus-related gene families expanded early in the Stylommatophora lineage, functioning in water retention, immune defense and wound healing. The hemocyanins, PCK and FBP families were doubled and retained after WGD, enhancing the capacity for gas exchange and glucose homeostasis in aestivation. After the WGD, zinc metalloproteinase genes were highly tandemly duplicated to protect tissue against ROS damage. This evidence collectively suggests that although the WGD may not have been the direct driver of the A-T transition, it provided an important legacy for the terrestrial adaptation of the giant African snail.


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Hai-Hui Huang ◽  
Yong Liang ◽  
Xiao-Ying Liu

Identifying biomarkers and signaling pathways is a critical step in genomic studies, in which the regularization method is a widely used feature extraction approach. However, most regularizers are based on the L1 norm; their results are not sparse or interpretable enough and are asymptotically biased, especially in genomic research. Recently, a large amount of molecular interaction information about disease-related biological processes has been gathered in various databases, which focus on many aspects of biological systems. In this paper, we use an enhanced L1/2 penalized solver to fit a network-constrained logistic regression model, called the enhanced L1/2 net, where the predictors are based on gene-expression data with biological network knowledge. Extensive simulation studies showed that our proposed approach outperforms L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches in terms of classification accuracy and stability. Furthermore, we applied our method to lung cancer data analysis and found that it achieves higher predictive accuracy than L1 regularization, the old L1/2 penalized solver, and the Elastic net approaches, while selecting fewer but informative biomarkers and pathways.
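To see why an L1/2 penalty tends to give sparser solutions than an L1 penalty, it helps to compare the two on coefficient vectors of equal total magnitude. The snippet below is a minimal sketch of the penalty terms only, with hypothetical function names. It is not the authors' solver and omits the logistic loss and the network constraint.

```python
def l1_penalty(beta):
    """Sum of absolute coefficient values (the lasso penalty)."""
    return sum(abs(b) for b in beta)


def l_half_penalty(beta):
    """Sum of square roots of absolute coefficient values (the L1/2 penalty)."""
    return sum(abs(b) ** 0.5 for b in beta)


# Two coefficient vectors with the same total magnitude:
sparse = [1.0, 0.0]   # all weight on one predictor
dense = [0.5, 0.5]    # weight spread over two predictors

l1_sparse, l1_dense = l1_penalty(sparse), l1_penalty(dense)
half_sparse, half_dense = l_half_penalty(sparse), l_half_penalty(dense)
```

The L1 penalty scores both vectors identically, so it is indifferent between them. The L1/2 penalty charges the dense vector more (two times the square root of 0.5, about 1.41, against 1.0), which pushes a fitted model toward fewer nonzero coefficients.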


2019 ◽  
Vol 17 (04) ◽  
pp. 1950024 ◽  
Author(s):  
Tinghua Huang ◽  
Xiali Huang ◽  
Bomei Shi ◽  
Min Yao

Understanding how genes are expressed and regulated in different biological processes is a fundamental and challenging issue. Considerable progress has been made in studying the relationship between the expression and regulation of human genes. However, it is difficult to use these resources productively to analyze gene expression data. GEREDB ( www.thua45.cn/geredb ) has been developed to facilitate analyses that will provide insights into the regulation of genes that govern specific biological responses. GEREDB is a publicly available, manually curated biological database that stores data on the relationships between expression and regulation of human genes. To date, more than 39,000 links have been contextually annotated by reviewing more than 53,000 abstracts. GEREDB can be searched using the official NCBI gene symbol as a query, and it can be downloaded along with the GEREA software package. GEREDB can analyze user-supplied gene expression data in a causal-analysis-oriented manner using the GEREA bioinformatics tool.

