VariED: the first integrated database of gene annotation and expression profiles for variants related to human diseases

Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Chien-Yueh Lee ◽  
Amrita Chattopadhyay ◽  
Li-Mei Chiang ◽  
Jyh-Ming Jimmy Juang ◽  
Liang-Chuan Lai ◽  
...  

Abstract Integrated analysis of DNA variants and gene expression profiles may facilitate precise identification of gene regulatory networks involved in disease mechanisms. Despite the widespread availability of public resources, we lack databases that are capable of simultaneously providing gene expression profiles, variant annotations, functional prediction scores and pathogenic analyses. VariED is the first web-based querying system that integrates an annotation database and expression profiles for genetic variants. The database offers a user-friendly platform and locates gene/variant names in the literature by connecting to established online querying tools, biological annotation tools and records from free-text literature. VariED acts as a central hub for organized genome information consisting of gene annotation, variant allele frequency, functional prediction, clinical interpretation and gene expression profiles in three species: human, mouse and zebrafish. VariED also provides a novel scoring scheme to predict the functional impact of a DNA variant. With a single entry, all results regarding the queried DNA variants can be downloaded. VariED can potentially serve as an efficient way to obtain comprehensive variant knowledge for clinicians and scientists around the world working on important drug discoveries and precision treatments.

2020 ◽  
Author(s):  
Alexander Calderwood ◽  
Jo Hepworth ◽  
Shannon Woodhouse ◽  
Lorelei Bilham ◽  
D. Marc Jones ◽  
...  

Abstract The timing of the floral transition affects reproduction and yield; however, its regulation in crops remains poorly understood. Here, we use RNA-Seq to determine and compare gene expression dynamics through the floral transition in the model species Arabidopsis thaliana and the closely related crop Brassica rapa. A direct comparison of gene expression over time between species shows little similarity, which could lead to the inference that different gene regulatory networks are at play. However, these differences can be largely resolved by synchronisation, through curve registration, of gene expression profiles. We find that different registration functions are required for different genes, indicating that there is no common ‘developmental time’ to which Arabidopsis and B. rapa can be mapped through gene expression. Instead, the expression patterns of different genes progress at different rates. We find that co-regulated genes show similar changes in synchronisation between species, suggesting that similar gene regulatory sub-network structures may be active with different wiring between them. A detailed comparison of the regulation of the floral transition between Arabidopsis and B. rapa, and between two B. rapa accessions, reveals different modes of regulation of the key floral integrator SOC1, and that the floral transition in the B. rapa accessions is triggered by different pathways, even when grown under the same environmental conditions. Our study adds to the mechanistic understanding of the regulatory network of flowering time in rapid cycling B. rapa under long days and highlights the importance of registration methods for the comparison of developmental gene expression data.
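As a rough illustration of what curve registration means here, the sketch below aligns one expression time course to a reference by grid-searching a linear time transform (a shift plus a stretch) that minimises the mean squared difference between the two profiles. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the grid of candidate shifts and stretches, the `min_overlap` parameter and the toy data are all hypothetical.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate the profile (times, values) at time t.

    Returns None when t lies outside the sampled time range.
    """
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t) - 1
    if i >= len(times) - 1:
        return values[-1]
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + frac * (values[i + 1] - values[i])


def register(ref_t, ref_y, qry_t, qry_y, shifts, stretches, min_overlap=3):
    """Grid-search the transform t_ref = stretch * t_qry + shift.

    Returns (mse, shift, stretch) for the best candidate, or None if no
    candidate leaves at least min_overlap query points inside the reference.
    """
    best = None
    for stretch in stretches:
        for shift in shifts:
            errors = []
            for t, y in zip(qry_t, qry_y):
                ref_val = interpolate(ref_t, ref_y, stretch * t + shift)
                if ref_val is not None:
                    errors.append((y - ref_val) ** 2)
            if len(errors) < min_overlap:
                continue  # too little overlap to judge this transform
            mse = sum(errors) / len(errors)
            if best is None or mse < best[0]:
                best = (mse, shift, stretch)
    return best


# Toy data (hypothetical): the query gene follows the reference profile,
# but on a time axis that is stretched by 2 and shifted by 4.
ref_t = list(range(0, 21))
ref_y = [t * t for t in ref_t]
qry_t = [0, 1, 2, 3, 4, 5]
qry_y = [(2 * t + 4) ** 2 for t in qry_t]

best = register(ref_t, ref_y, qry_t, qry_y,
                shifts=range(0, 9), stretches=[1.0, 1.5, 2.0, 2.5])
print(best)
```

Running the search separately for each gene is what would reveal that different genes need different registration functions, which is the observation the abstract reports.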


2021 ◽  
Author(s):  
Taguchi Y-h. ◽  
Turki Turki

Abstract The integrated analysis of multiple gene expression profiles measured in distinct studies is always problematic. In particular, the lack of sample matching and of common labeling between distinct studies prevents the integration of multiple studies in a fully data-driven and unsupervised manner. In this study, we propose a strategy enabling the integration of multiple gene expression profiles among multiple independent studies without either labeling or sample matching, using tensor decomposition-based unsupervised feature extraction. As an example, we applied this strategy to Alzheimer’s disease (AD)-related gene expression profiles that lack exact correspondence among samples as well as AD single-cell RNA-seq (scRNA-seq) data. We found that we could select biologically reasonable genes with integrated analysis. Overall, integrated gene expression profiles can function analogously to prior learning and/or transfer learning strategies in other machine learning applications. For scRNA-seq, the proposed approach was able to drastically reduce the required computational memory.


2021 ◽  
Author(s):  
Giulia Zancolli ◽  
Maarten Reijnders ◽  
Robert Waterhouse ◽  
Marc Robinson-Rechavi

Animals have repeatedly evolved specialized organs and anatomical structures to produce and deliver a cocktail of potent bioactive molecules to subdue prey or predators: venom. This makes it one of the most widespread convergent functions in the animal kingdom. Whether animals have adopted the same genetic toolkit to evolve venom systems is a fascinating question that still eludes us. Here, we performed the first comparative analysis of venom gland transcriptomes from 20 venomous species spanning the main Metazoan lineages, to test whether different animals have independently adopted similar molecular mechanisms to perform the same function. We found a strong convergence in gene expression profiles, with venom glands being more similar to each other than to any other tissue from the same species, and their differences closely mirroring the species phylogeny. Although venom glands secrete some of the fastest evolving molecules (toxins), their gene expression does not evolve faster than evolutionarily older tissues. We found 15 venom gland specific gene modules enriched in endoplasmic reticulum stress and unfolded protein response pathways, indicating that animals have independently adopted stress response mechanisms to cope with mass production of toxins. This, in turn, activates regulatory networks for epithelial development, cell turnover and maintenance which seem composed of both convergent and lineage-specific factors, possibly reflecting the different developmental origins of venom glands. This study represents the first step towards an understanding of the molecular mechanisms underlying the repeated evolution of one of the most successful adaptive traits in the animal kingdom.


Diagnostics ◽  
2020 ◽  
Vol 10 (8) ◽  
pp. 584
Author(s):  
Sergii Babichev ◽  
Jiří Škvor

In this paper, we present research on extracting informative gene expression profiles from a high-dimensional array of gene expressions, taking the state of the patients’ health into account, using a clustering method, ML-based binary classifiers and a fuzzy inference system. The proposed stepwise procedure allows us to extract the most informative genes, taking into account the subtypes of the disease or the state of the patient’s health, for further reconstruction of gene regulatory networks based on the allocated genes and subsequent simulation of the reconstructed models. As experimental data, we used publicly available gene expression data obtained from DNA microarray experiments, containing two types of gene expression profiles: patients with lung cancer tumors and healthy patients. The stepwise data-processing procedure comprises the following steps. First, we reduce the number of genes by removing non-informative genes in terms of statistical criteria and Shannon entropy; then, we perform stepwise hierarchical clustering of gene expression profiles at hierarchical levels from 1 to 10 using the SOTA (Self-Organizing Tree Algorithm) clustering algorithm with the correlation distance metric. The quality of the obtained clustering was evaluated using a complex clustering quality criterion that considers both the distribution of the gene expression profiles relative to the centers of the clusters to which they are allocated and the distribution of the cluster centers. The result of this stage was the selection of the optimal cluster at each hierarchical level, corresponding to the minimum value of the quality criterion. At the next step, we implemented a classification procedure for the examined objects using four well-known binary classifiers: logistic regression, support-vector machine, decision trees and random forest classifier. The effectiveness of each technique was evaluated by ROC (Receiver Operating Characteristic) analysis using criteria that include, as components, errors of both the first and the second kind. The final decision concerning the extraction of the most informative subset of gene expression profiles was made using the fuzzy inference system, whose inputs are the results of the individual classifiers and whose output is the final decision concerning the state of the patient’s health. In our view, implementing the proposed stepwise procedure for extracting informative gene expression profiles creates the conditions for increasing the effectiveness of the subsequent reconstruction of gene regulatory networks and the following simulation of the reconstructed models, considering the subtypes of the disease and/or the state of the patient’s health.
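The first step of the procedure, removing non-informative genes using statistical criteria and Shannon entropy, can be sketched roughly as follows. This is a minimal sketch under assumptions and not the authors' implementation: the abstract does not say which statistical criteria are used, how entropy is estimated, or which direction of the entropy threshold counts as informative. Here variance stands in for the statistical criterion, entropy is estimated from equal-width bins, and genes with low variance or high entropy are dropped. The function names, thresholds and toy profiles are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance


def shannon_entropy(values, n_bins=4):
    """Shannon entropy (in bits) of a profile after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant profile occupies a single bin
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


def filter_genes(profiles, min_variance, max_entropy, n_bins=4):
    """Keep genes with enough variance and low enough binned entropy.

    Both thresholds, and the choice to keep LOW-entropy genes, are
    assumptions made for this illustration.
    """
    kept = {}
    for gene, values in profiles.items():
        if (pvariance(values) >= min_variance
                and shannon_entropy(values, n_bins) <= max_entropy):
            kept[gene] = values
    return kept


# Toy expression profiles across eight samples (hypothetical values).
profiles = {
    "flat": [5.0] * 8,                    # no variance, removed
    "bimodal": [1, 1, 1, 1, 9, 9, 9, 9],  # two clear states, kept
    "uniform": [1, 2, 3, 4, 5, 6, 7, 8],  # spread over all bins, removed
}

kept = filter_genes(profiles, min_variance=1.0, max_entropy=1.5)
print(sorted(kept))
```

Combining the two criteria matters in this sketch because a constant gene has zero binned entropy and would pass an entropy-only filter despite carrying no information.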


2020 ◽  
Vol 15 (1) ◽  
Author(s):  
Carl Grant Mangleburg ◽  
Timothy Wu ◽  
Hari K. Yalamanchili ◽  
Caiwei Guo ◽  
Yi-Chen Hsieh ◽  
...  

Abstract Background: Tau neurofibrillary tangle pathology characterizes Alzheimer’s disease and other neurodegenerative tauopathies. Brain gene expression profiles can reveal mechanisms; however, few studies have systematically examined both the transcriptome and proteome or differentiated Tau- versus age-dependent changes. Methods: Paired, longitudinal RNA-sequencing and mass-spectrometry were performed in a Drosophila model of tauopathy, based on pan-neuronal expression of human wildtype Tau (TauWT) or a mutant form causing frontotemporal dementia (TauR406W). Tau-induced, differentially expressed transcripts and proteins were examined cross-sectionally or using linear regression and adjusting for age. Hierarchical clustering was performed to highlight network perturbations, and we examined overlaps with human brain gene expression profiles in tauopathy. Results: TauWT induced 1514 and 213 differentially expressed transcripts and proteins, respectively. TauR406W had a substantially greater impact, causing changes in 5494 transcripts and 697 proteins. There was a ~70% overlap between age- and Tau-induced changes and our analyses reveal pervasive bi-directional interactions. Strikingly, 42% of Tau-induced transcripts were discordant in the proteome, showing opposite direction of change. Tau-responsive gene expression networks strongly implicate innate immune activation. Cross-species analyses pinpoint human brain gene perturbations specifically triggered by Tau pathology and/or aging, and further differentiate between disease amplifying and protective changes. Conclusions: Our results comprise a powerful, cross-species functional genomics resource for tauopathy, revealing Tau-mediated disruption of gene expression, including dynamic, age-dependent interactions between the brain transcriptome and proteome.
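To make "linear regression and adjusting for age" concrete, the sketch below fits an ordinary least squares model of the form expression = b0 + b_tau * tau + b_age * age for a single gene, so that the Tau coefficient is estimated with age held fixed. It is an illustration under assumptions, not the authors' pipeline: the abstract does not give the model specification, and RNA-seq studies typically use dedicated count-based tools. The function names, the 0/1 coding of Tau genotype and the toy values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no collinear predictors).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_tau_age(expression, tau, age):
    """OLS fit via the normal equations; returns [b0, b_tau, b_age]."""
    rows = [[1.0, float(t), float(a)] for t, a in zip(tau, age)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression))
           for i in range(3)]
    return solve_linear(xtx, xty)


# Toy data (hypothetical): control (tau=0) and Tau (tau=1) animals sampled
# at three ages; expression rises with age and is shifted up by Tau.
tau = [0, 0, 0, 1, 1, 1]
age = [1, 10, 20, 1, 10, 20]
expression = [1.0 + 2.0 * t + 0.5 * a for t, a in zip(tau, age)]

b0, b_tau, b_age = fit_tau_age(expression, tau, age)
print(round(b0, 6), round(b_tau, 6), round(b_age, 6))
```

In this toy setup `b_tau` captures the Tau effect and `b_age` the age effect separately, which is the kind of separation needed to distinguish Tau-dependent from age-dependent changes.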

