Insights into therapeutic targets and biomarkers using integrated multi-‘omics’ approaches for dilated and ischemic cardiomyopathies

2021 ◽  
Author(s):  
Austė Kanapeckaitė ◽  
Neringa Burokienė

Abstract At present, heart failure (HF) treatment only targets the symptoms based on the left ventricle dysfunction severity; however, the lack of systemic ‘omics’ studies and available biological data to uncover the heterogeneous underlying mechanisms signifies the need to shift the analytical paradigm towards network-centric and data mining approaches. This study, for the first time, aimed to investigate how bulk and single cell RNA-sequencing as well as the proteomics analysis of the human heart tissue can be integrated to uncover HF-specific networks and potential therapeutic targets or biomarkers. We also aimed to address the issue of dealing with a limited number of samples and to show how appropriate statistical models, enrichment with other datasets as well as machine learning-guided analysis can aid in such cases. Furthermore, we elucidated specific gene expression profiles using transcriptomic data and data mined from public databases. This was achieved using a two-step machine learning algorithm to predict the likelihood of the therapeutic target or biomarker tractability based on a novel scoring system, which has also been introduced in this study. The described methodology could be very useful for the target or biomarker selection and evaluation during the pre-clinical therapeutics development stage as well as disease progression monitoring. In addition, the present study sheds new light on the complex aetiology of HF, differentiating between subtle changes in dilated cardiomyopathies (DCs) and ischemic cardiomyopathies (ICs) on the single cell, proteome and whole transcriptome level, demonstrating that HF might depend on the involvement not only of cardiomyocytes but also of other cell populations. Identified tissue remodelling and inflammatory processes can be beneficial when selecting targeted pharmacological management for DCs or ICs, respectively.

2020 ◽  
Author(s):  
Auste Kanapeckaite ◽  
Neringa Burokiene

At present, heart failure treatment targets symptoms based on the left ventricle dysfunction severity; however, the lack of systemic studies and available biological data to uncover heterogeneous underlying mechanisms at the genomic, transcriptional and expressed protein levels signifies the need to shift the analytical paradigm toward network-centric and data mining approaches. This study, for the first time, aimed to investigate how bulk and single cell RNA-sequencing as well as the proteomics analysis of the human heart tissue can be integrated to uncover heart failure-specific networks and potential therapeutic targets or biomarkers. Furthermore, it was demonstrated that transcriptomics data in combination with mined data from public databases can be used to elucidate specific gene expression profiles. This was achieved using machine learning algorithms to predict the likelihood of the therapeutic target or biomarker tractability based on a novel scoring system also introduced in this study. The described methodology could be very useful for the target selection and evaluation during the pre-clinical therapeutics development stage. Finally, the present study sheds new light on the complex etiology of heart failure, differentiating between subtle changes in dilated and ischemic cardiomyopathy on the single cell, proteome and whole transcriptome level.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 474
Author(s):  
Andy R. Eugene ◽  
Jolanta Masiak ◽  
Beata Eugene

Background: We sought to test the hypothesis that transcriptome-level gene signatures are differentially expressed between male and female bipolar patients, prior to lithium treatment, in a patient cohort who were later clinically classified as lithium treatment responders. Methods: Gene expression data were obtained from the Lithium Treatment-Moderate dose Use Study, accessed from the National Center for Biotechnology Information’s Gene Expression Omnibus via accession number GSE4548. Differential gene expression analysis was conducted using the Linear Models for Microarray and RNA-Seq (limma) package and the Random Forests machine learning algorithm in R. Results: In pre-treatment lithium responders, the following genes were differentially expressed with a fold-change greater than 0.5, indicating a male bias: RBPMS2, SIDT2, CDH23, LILRA5, and KIR2DS5; while the female-biased genes were: HLA-H, RPS23, FHL3, RPL10A, NBPF14, PSTPIP2, FAM117B, CHST7, and ABRACL. Conclusions: Using machine learning, we developed a pre-treatment gender- and gene-expression-based predictive model selective for lithium responders with an ROC AUC of 0.92 for men and an ROC AUC of 1 for women.


2021 ◽  
Vol 12 ◽  
Author(s):  
Melanie A. Brennan ◽  
Adam Z. Rosenthal

Clonal bacterial populations exhibit various forms of heterogeneity, including co-occurrence of cells with different morphological traits, biochemical properties, and gene expression profiles. This heterogeneity is prevalent in a variety of environments. For example, the productivity of large-scale industrial fermentations and virulence of infectious diseases are shaped by cell population heterogeneity and have a direct impact on human life. Because this heterogeneity needs to be better understood, multiple methods of examining single-cell heterogeneity have been developed. Traditionally, fluorescent reporters or probes are used to examine a specific gene of interest, providing a useful but inherently biased approach. In contrast, single-cell RNA sequencing (scRNA-seq) is an agnostic approach to examine heterogeneity and has been successfully applied to eukaryotic cells. Unfortunately, the widely used eukaryotic scRNA-seq methods present difficulties when applied to bacteria. Specifically, bacteria have a cell wall which makes eukaryotic lysis methods incompatible, bacterial mRNA has a shorter half-life and lower copy numbers, and isolating an individual bacterial species from a mixed community is difficult. Recent work has demonstrated that these technical hurdles can be overcome, providing valuable insight into factors influencing microbial heterogeneity. This perspective describes the emerging microbial scRNA-seq toolkit. We outline the benefit of these new tools in elucidating numerous scientific questions in microbiological studies and offer insight about the possible rules that govern the segregation of traits in individual microbial cells.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Lucille Lopez-Delisle ◽  
Jean-Baptiste Delisle

Abstract Background: The number of studies using single-cell RNA sequencing (scRNA-seq) is constantly growing. This powerful technique provides a sampling of the whole transcriptome of a cell. However, sparsity of the data can be a major hurdle when studying the distribution of the expression of a specific gene or the correlation between the expressions of two genes. Results: We show that the main technical noise associated with these scRNA-seq experiments is due to the sampling, i.e., Poisson noise. We present a new tool named baredSC, for Bayesian Approach to Retrieve Expression Distribution of Single-Cell data, which infers the intrinsic expression distribution in scRNA-seq data using a Gaussian mixture model. baredSC can be used to obtain the distribution in one dimension for individual genes and in two dimensions for pairs of genes, in particular to estimate the correlation in the two genes’ expressions. We apply baredSC to simulated scRNA-seq data and show that the algorithm is able to uncover the expression distribution used to simulate the data, even in multi-modal cases with very sparse data. We also apply baredSC to two real biological data sets. First, we use it to measure the anti-correlation between Hoxd13 and Hoxa11, two genes with known genetic interaction in embryonic limb. Then, we study the expression of Pitx1 in embryonic hindlimb, for which a trimodal distribution has been identified through flow cytometry. While other methods to analyze scRNA-seq are too sensitive to sampling noise, baredSC reveals this trimodal distribution. Conclusion: baredSC is a powerful tool which aims at retrieving the expression distribution of a few genes of interest from scRNA-seq data.
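
The generative picture in the Results above (a latent expression distribution observed through Poisson sampling) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not baredSC's code or interface: the function names, the two-component mixture, the log(1 + 10^4 x) expression scale and the sequencing depth are all hypothetical choices made for illustration, and no inference step is attempted.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(n_cells, depth, rng):
    """Simulate latent expression and observed counts for one gene.

    Latent expression is drawn from a two-component Gaussian mixture on an
    assumed log(1 + 1e4 * x) scale, where x is the gene's share of the
    cell's transcripts. Observed counts are Poisson draws at the given depth.
    """
    true_expr, counts = [], []
    for _ in range(n_cells):
        # Assumed mixture: half the cells are "low", half are "high".
        mean = 0.25 if rng.random() < 0.5 else 1.5
        log_expr = max(0.0, rng.gauss(mean, 0.2))
        share = (math.exp(log_expr) - 1.0) / 1e4
        true_expr.append(log_expr)
        counts.append(sample_poisson(depth * share, rng))
    return true_expr, counts


rng = random.Random(42)
true_expr, counts = simulate_counts(n_cells=2000, depth=2000, rng=rng)
zero_fraction = counts.count(0) / len(counts)
print(f"cells with zero observed counts: {zero_fraction:.0%}")
```

Although the latent expression is clearly bimodal, most cells in such a simulation report zero counts, which is the sparsity problem the abstract describes.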


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Lori Glenwinkel ◽  
Seth R Taylor ◽  
Kasper Langebeck-Jensen ◽  
Laura Pereira ◽  
Molly B Reilly ◽  
...  

The generation of the enormous diversity of neuronal cell types in a differentiating nervous system entails the activation of neuron type-specific gene batteries. To examine the regulatory logic that controls the expression of neuron type-specific gene batteries, we interrogate single cell expression profiles of all 118 neuron classes of the Caenorhabditis elegans nervous system for the presence of DNA binding motifs of 136 neuronally expressed C. elegans transcription factors. Using a phylogenetic footprinting pipeline, we identify cis-regulatory motif enrichments among neuron class-specific gene batteries and we identify cognate transcription factors for 117 of the 118 neuron classes. In addition to predicting novel regulators of neuronal identities, our nervous system-wide analysis at single cell resolution supports the hypothesis that many transcription factors directly co-regulate the cohort of effector genes that define a neuron type, thereby corroborating the concept of so-called terminal selectors of neuronal identity. Our analysis provides a blueprint for how individual components of an entire nervous system are genetically specified.


Author(s):  
Meichen Dong ◽  
Aatish Thennavan ◽  
Eugene Urrutia ◽  
Yun Li ◽  
Charles M Perou ◽  
...  

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.
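
The core idea of reference-based deconvolution can be sketched as a non-negative least-squares problem: bulk expression is modelled as a signature matrix (genes by cell types) times a vector of cell-type proportions. The code below is a minimal illustration, not SCDC's implementation: it uses plain projected gradient descent, the function names are hypothetical, and the ensemble weights are passed in by the caller, whereas SCDC derives its own weights.

```python
def deconvolve(signature, bulk, n_iter=5000):
    """Estimate non-negative proportions p minimising ||signature @ p - bulk||^2.

    signature: list of gene rows, each a list of per-cell-type expression.
    bulk: list of bulk expression values, one per gene.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Step size from the squared Frobenius norm, an upper bound on the
    # Lipschitz constant of the gradient.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(row[k] * p[k] for k in range(n_types)) - bulk[g]
            for g, row in enumerate(signature)
        ]
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * grad[k]) for k in range(n_types)]
    total = sum(p)
    return [v / total for v in p] if total > 0 else p


def ensemble(estimates, weights):
    """Weighted average of proportion vectors from different references."""
    total = sum(weights)
    return [
        sum(w * est[k] for w, est in zip(weights, estimates)) / total
        for k in range(len(estimates[0]))
    ]


# Toy example: 4 genes, 3 cell types, one pseudo-bulk sample.
signature = [[10, 1, 0], [1, 8, 1], [0, 2, 9], [5, 5, 5]]
bulk = [6.3, 3.1, 1.5, 5.0]
print(deconvolve(signature, bulk))
```

Averaging estimates from several references is the simplest way to combine them; it stands in here for the more careful weighting the abstract refers to as ENSEMBLE.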


2020 ◽  
Vol 79 (9) ◽  
pp. 1234-1242 ◽  
Author(s):  
Iago Pinal-Fernandez ◽  
Maria Casal-Dominguez ◽  
Assia Derfoul ◽  
Katherine Pak ◽  
Frederick W Miller ◽  
...  

Objectives: Myositis is a heterogeneous family of diseases that includes dermatomyositis (DM), antisynthetase syndrome (AS), immune-mediated necrotising myopathy (IMNM), inclusion body myositis (IBM), polymyositis and overlap myositis. Additional subtypes of myositis can be defined by the presence of myositis-specific autoantibodies (MSAs). The purpose of this study was to define unique gene expression profiles in muscle biopsies from patients with MSA-positive DM, AS and IMNM as well as IBM. Methods: RNA-seq was performed on muscle biopsies from 119 myositis patients with IBM or defined MSAs and 20 controls. Machine learning algorithms were trained on transcriptomic data and recursive feature elimination was used to determine which genes were most useful for classifying muscle biopsies into each type and MSA-defined subtype of myositis. Results: The support vector machine learning algorithm classified the muscle biopsies with >90% accuracy. Recursive feature elimination identified genes that are most useful to the machine learning algorithm and that are only overexpressed in one type of myositis. For example, CAMK1G (calcium/calmodulin-dependent protein kinase IG), EGR4 (early growth response protein 4) and CXCL8 (interleukin 8) are highly expressed in AS but not in DM or other types of myositis. Using the same computational approach, we also identified genes that are uniquely overexpressed in different MSA-defined subtypes. These included apolipoprotein A4 (APOA4), which is only expressed in anti-3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR) myopathy, and MADCAM1 (mucosal vascular addressin cell adhesion molecule 1), which is only expressed in anti-Mi2-positive DM. Conclusions: Unique gene expression profiles in muscle biopsies from patients with MSA-defined subtypes of myositis and IBM suggest that different pathological mechanisms underlie muscle damage in each of these diseases.


2021 ◽  
Author(s):  
Daniel Osorio ◽  
Yan Zhong ◽  
Guanxun Li ◽  
Qian Xu ◽  
Andrew E. Hillhouse ◽  
...  

Gene knockout (KO) experiments are a proven approach for studying gene function. A typical KO experiment usually involves the phenotypic characterization of KO organisms. The recent advent of single-cell technology has greatly boosted the resolution of cellular phenotyping, providing unprecedented insights into cell-type-specific gene function. However, the use of single-cell technology in large-scale, systematic KO experiments is prohibitive due to the vast resources required. Here we present scTenifoldKnk, a machine learning workflow that performs virtual KO experiments using single-cell RNA sequencing (scRNA-seq) data. scTenifoldKnk first uses data from wild-type (WT) samples to construct a single-cell gene regulatory network (scGRN). Then, a gene is knocked out from the constructed scGRN by setting weights of the gene's outward edges to zeros. ScTenifoldKnk then compares this "pseudo-KO" scGRN with the original scGRN to identify differentially regulated (DR) genes. These DR genes, also called virtual-KO perturbed genes, are used to assess the impact of the gene KO and reveal the gene's function in analyzed cells. Using existing data sets, we demonstrate that the scTenifoldKnk analysis recapitulates the main findings of three real-animal KO experiments and confirms the functions of genes underlying three Mendelian diseases. We show the power of scTenifoldKnk as a predictive method to successfully predict the outcomes of two KO experiments that involve intestinal enterocytes in Ahr-/- mice and pancreatic islet cells in Malat1-/- mice, respectively. Finally, we demonstrate the use of scTenifoldKnk to perform systematic KO analyses, in which a large number of genes are virtually deleted, allowing gene functions to be revealed in a cell type-specific manner.
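
The virtual knockout step described above (zeroing a gene's outward edges, then comparing the two networks) can be sketched on a toy directed graph. This is an illustration under assumptions, not the scTenifoldKnk workflow: the network is a hand-written dictionary rather than one inferred from scRNA-seq data, the function names are hypothetical, and the comparison is a simple sum of absolute changes in incoming edge weights rather than the method's own procedure for calling differentially regulated genes.

```python
import copy


def virtual_knockout(network, gene):
    """Return a copy of the network with every outward edge of `gene` set to 0.

    network[src][dst] holds the weight of the directed edge src -> dst.
    """
    knocked = copy.deepcopy(network)
    for dst in knocked.get(gene, {}):
        knocked[gene][dst] = 0.0
    return knocked


def perturbation_scores(original, knocked):
    """Score each target gene by the total absolute change in incoming weights."""
    scores = {}
    for src, targets in original.items():
        for dst, weight in targets.items():
            delta = abs(weight - knocked[src][dst])
            scores[dst] = scores.get(dst, 0.0) + delta
    return scores


# Toy regulatory network with three genes (weights are made up).
network = {
    "A": {"B": 0.8, "C": 0.1},
    "B": {"C": 0.5},
    "C": {},
}
pseudo_ko = virtual_knockout(network, "A")
ranked = sorted(perturbation_scores(network, pseudo_ko).items(),
                key=lambda item: item[1], reverse=True)
print(ranked)
```

Genes at the top of such a ranking are the toy analogue of the virtual-KO perturbed genes mentioned in the abstract.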


2019 ◽  
Author(s):  
Nalin Leelatian ◽  
Justine Sinnaeve ◽  
Akshitkumar M. Mistry ◽  
Sierra M. Barone ◽  
Kirsten E. Diggins ◽  
...  

Abstract Recent developments in machine learning implemented dimensionality reduction and clustering tools to classify the cellular composition of patient-derived tissue in multi-dimensional, single cell studies. Current approaches, however, require prior knowledge of either categorical clinical outcomes or cell type identities. These algorithms are not well suited for application in tumor biology, where clinical outcomes can be continuous and censored and cell identities may be novel and plastic. Risk Assessment Population IDentification (RAPID) is an unsupervised, machine learning algorithm that identifies single cell phenotypes and assesses clinical risk stratification as a continuous variable. Single cell mass cytometry evaluated 34 different phospho-proteins, transcription factors, and cell identity proteins in tumor tissue resected from patients bearing IDH wild-type glioblastomas. RAPID identified and characterized multiple biologically distinct tumor cell subsets that independently and continuously stratified patient outcome. RAPID is broadly applicable for single cell studies where atypical cancer and immune cells may drive disease biology and treatment responses.


2021 ◽  
Author(s):  
Denis Arthur Pinheiro Moura ◽  
Joao Ricardo Mendes de Oliveira

Abstract Dementia, a syndrome characterized by the progressive deterioration of memory and cognition, arises from different pathologies, with Alzheimer's Disease (AD) its most common cause. Patterns of gene expression during dementia of different etiologies may function as generalist biomarkers of the condition. We used RNA-Seq data from the Allen Dementia and Traumatic Brain Injury Study (ADTBI) to identify differentially expressed genes in brains with dementia. Machine Learning algorithms Decision Trees (DT) and Random Forest (RF) were used to create models to identify dementia samples based on their gene expression profile. Importance analyses were conducted to identify the most relevant genes in each classification model. A total of 1629 differentially expressed (DE) genes were found in brains with the condition. Gene PAN3-AS1 was the only DE gene across more than three brain regions. The artificial intelligence models were capable of identifying correctly up to 92.85% of dementia samples. Our analyses provide interesting insights regarding using brain-specific gene expression profiles as biomarkers of dementia, identifying genes possibly involved with dementia, and guiding future studies in prediction and early identification of the syndrome.

