expression variability
Recently Published Documents


TOTAL DOCUMENTS

195
(FIVE YEARS 84)

H-INDEX

23
(FIVE YEARS 7)

2022 ◽  
Vol 44 (1) ◽  
pp. 360-382
Author(s):  
Sanda Iacobas ◽  
Dumitru Andrei Iacobas

Many years and billions spent on research have not yet produced an effective answer to prostate cancer (PCa). Not only each human, but even each cancer nodule in the same tumor, has a unique transcriptome topology. The differences go beyond the expression level to the expression control and networking of individual genes. The unrepeatable heterogeneous transcriptomic organization among men makes the quest for universal biomarkers and “fit-for-all” treatments unrealistic. We present a bioinformatics procedure to identify each patient’s unique triplet of PCa Gene Master Regulators (GMRs) and predict the consequences of their experimental manipulation. The procedure is based on the Genomic Fabric Paradigm (GFP), which characterizes each individual gene by its independent expression level, expression variability and expression coordination with each other gene. GFP can identify the GMRs whose controlled alteration would selectively kill the cancer cells with little consequence for the normal tissue. The method was applied to microarray data on surgically removed prostates from two men with metastatic PCas (each with three distinct cancer nodules), and on the DU145 and LNCaP PCa cell lines. The applications verified that each PCa case is unique and predicted the consequences of the GMRs’ manipulation. The predictions are theoretical and need further experimental validation.
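
The three per-gene measures named in the abstract above (expression level, expression variability, expression coordination) can be illustrated with a small sketch. The abstract does not give the formulas, so the choices here are assumptions: the mean across replicates for the level, the coefficient of variation for the variability, and the Pearson correlation between two genes for the coordination. All function names and the toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def expression_level(values):
    """Average expression of one gene across biological replicates."""
    return mean(values)


def expression_variability(values):
    """Coefficient of variation (sample SD / mean).

    Assumed stand-in for the variability measure; not taken from the paper.
    """
    return stdev(values) / mean(values)


def expression_coordination(x, y):
    """Pearson correlation of two genes measured in the same replicates."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented expression values for two genes in four replicates.
gene_a = [10.0, 12.0, 11.0, 13.0]
gene_b = [20.0, 24.0, 22.0, 26.0]  # exactly twice gene_a

print(expression_level(gene_a))
print(expression_variability(gene_a))
print(expression_coordination(gene_a, gene_b))
```

Because `gene_b` is a scaled copy of `gene_a`, the two genes share the same coefficient of variation and are perfectly coordinated, which shows that the three measures capture different aspects of the same data.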


2022 ◽  
Author(s):  
Takaho Tsuchiya ◽  
Hiroki Hori ◽  
Haruka Ozaki

Motivation: Cell-cell communications regulate internal cellular states of the cell, e.g., gene expression and cell functions, and play pivotal roles in normal development and disease states. Furthermore, single-cell RNA sequencing methods have revealed cell-to-cell expression variability of highly variable genes (HVGs), which is also crucial. Nevertheless, the regulation of cell-to-cell expression variability of HVGs via cell-cell communications is still unexplored. The recent advent of spatial transcriptome measurement methods has linked gene expression profiles to the spatial context of single cells, which has provided opportunities to reveal those regulations. The existing computational methods extract genes with expression levels that are influenced by neighboring cell types based on the spatial transcriptome data. However, limitations remain in quantitativeness and interpretability: these methods neither focus on HVGs, nor consider cooperation of neighboring cell types, nor quantify the degree of regulation by each neighboring cell type. Results: Here, we propose CCPLS (Cell-Cell communications analysis by Partial Least Square regression modeling), which is a statistical framework for identifying cell-cell communications as the effects of multiple neighboring cell types on cell-to-cell expression variability of HVGs, based on the spatial transcriptome data. For each cell type, CCPLS performs PLS regression modeling and reports coefficients as the quantitative index of the cell-cell communications. Evaluation using simulated data showed that our method accurately estimated the effects of multiple neighboring cell types on HVGs. Furthermore, by applying CCPLS to two real datasets, we demonstrate that CCPLS can be used to extract biologically interpretable insights from the inferred cell-cell communications.


2022 ◽  
Author(s):  
Sofya Lipnitskaya ◽  
Yang Shen ◽  
Stefan Legewie ◽  
Holger Klein ◽  
Kolja Becker

Background: Recent studies in the area of transcriptomics performed on single-cell and population levels reveal noticeable variability in gene expression measurements provided by different RNA sequencing technologies. Due to increased noise and complexity of single-cell RNA-Seq (scRNA-Seq) data over the bulk experiment, there is a substantial number of variably-expressed genes and so-called dropouts, challenging the subsequent computational analysis and potentially leading to false positive discoveries. In order to investigate factors affecting technical variability between RNA sequencing experiments of different technologies, we performed a systematic assessment of single-cell and bulk RNA-Seq data, which have undergone the same pre-processing and sample preparation procedures. Results: Our analysis indicates that variability between gene expression measurements as well as dropout events are not exclusively caused by biological variability, low expression levels, or random variation. Furthermore, we propose FAVSeq, a machine learning-assisted pipeline for detection of factors contributing to gene expression variability in matched RNA-Seq data provided by two technologies. Based on the analysis of the matched bulk and single-cell dataset, we found the 3'-UTR and transcript lengths to be the most relevant effectors of the observed variation between RNA-Seq experiments, while the same factors together with cellular compartments were shown to be associated with dropouts. Conclusions: Here, we investigated the sources of variation in RNA-Seq profiles of matched single-cell and bulk experiments. In addition, we proposed the FAVSeq pipeline for analyzing multimodal RNA sequencing data, which allowed us to identify factors affecting quantitative differences in gene expression measurements as well as the presence of dropouts. The derived knowledge can be used to improve the interpretation of RNA-Seq data and to identify genes that can be affected by assay-based deviations. Source code is available under the MIT license at https://github.com/slipnitskaya/FAVSeq.


2022 ◽  
Author(s):  
Sarah L Fong ◽  
John Anthony Capra

Motivation: Thousands of human gene regulatory enhancers are composed of sequences with multiple evolutionary origins. These evolutionarily "complex" enhancers consist of older "core" sequences and younger "derived" sequences. However, the functional relationship between the sequences of different evolutionary origins within complex enhancers is poorly understood. Results: We evaluated the function, selective pressures, and sequence variation across core and derived components of human complex enhancers. We find that both components are older than expected from the genomic background, and cores are enriched for derived sequences of similar evolutionary ages. Both components show strong evidence of biochemical activity in massively parallel reporter assays (MPRAs). However, core and derived sequences have distinct transcription factor (TF) binding preferences that are largely stable across evolutionary origins. Given these signatures of function, both core and derived sequences have substantial evidence of purifying selection. Nonetheless, derived sequences exhibit weaker purifying selection than adjacent cores. Derived sequences also tolerate more common genetic variation and are enriched compared to cores for eQTL associated with gene expression variability in human populations. Conclusions: Both core and derived sequences have strong evidence of gene regulatory function, but derived sequences have distinct constraint profiles, TF binding preferences, and tolerance to variation compared with cores. We propose that the step-wise integration of younger derived and older core sequences has generated regulatory substrates with robust activity and the potential for functional variation. Our analyses demonstrate that synthesizing the study of enhancer evolution and function can aid interpretation of regulatory sequence activity and functional variation across human populations.


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0259373
Author(s):  
Philipp Strauss ◽  
Håvard Mikkelsen ◽  
Jessica Furriol

Housekeeping, or reference, genes (RGs) are, by definition, loci with stable expression profiles that are widely used as internal controls to normalize mRNA levels. However, due to specific events, such as pathological changes or technical procedures, their expression might be altered, failing to fulfil critical normalization prerequisites. To identify RGs suitable as internal controls in human non-cancerous kidney tissue, we selected 18 RG candidates based on previous data and screened them in 30 expression datasets (>800 patients), including our own, publicly available or provided by independent groups. Datasets included specimens from patients with hypertensive and diabetic nephropathy, Fabry disease, focal segmental glomerulosclerosis, IgA nephropathy, membranous nephropathy, and minimal change disease. We examined both microdissected and whole section-based datasets. Expression variability of 4 candidate genes (YWHAZ, SLC4A1AP, RPS13 and ACTB) was further examined by qPCR in biopsies from patients with hypertensive nephropathy (n = 11) and healthy controls (n = 5). Only YWHAZ gene expression remained stable in all datasets, whereas SLC4A1AP was stable in all but one Fabry dataset. All other RGs were differentially expressed in at least 2 datasets, and in 4.5 datasets on average. No differences in YWHAZ, SLC4A1AP, RPS13 and ACTB gene expression between hypertensive and control biopsies were detected by qPCR. Although RGs suitable to all techniques and tissues are unlikely to exist, our data suggest that in non-cancerous kidney biopsies expression of the YWHAZ and SLC4A1AP genes is stable and suitable for normalization purposes.
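
As a rough illustration of screening reference-gene candidates for stable expression, the sketch below ranks candidates by their coefficient of variation across samples. This is a simplified stand-in and an assumption on our part: the abstract above reports differential expression across datasets, not this metric. The candidate names and values are invented and do not come from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def rank_by_stability(expression):
    """Return gene names ordered from most stable (lowest CV) to least stable."""
    return sorted(expression, key=lambda gene: coefficient_of_variation(expression[gene]))


# Invented normalized expression values for three hypothetical candidates.
candidates = {
    "candidate_1": [100.0, 102.0, 98.0, 101.0],
    "candidate_2": [100.0, 150.0, 60.0, 120.0],
    "candidate_3": [50.0, 55.0, 45.0, 52.0],
}

print(rank_by_stability(candidates))
```

A lower coefficient of variation marks a candidate whose expression changes little relative to its mean, which is the property a normalization control needs.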


2021 ◽  
Author(s):  
Ryan H Boe ◽  
Vinay Ayyappan ◽  
Lea Schuh ◽  
Arjun Raj

Accurately functioning genetic networks should be responsive to signals but prevent transmission of stochastic bursts of expression. Existing data in mammalian cells suggest that such transcriptional "noise" is transmitted by some genes and not others, suggesting that noise transmission is tunable, perhaps at the expense of other signal processing capabilities. However, systematic claims about noise transmission in genetic networks have been limited by the inability to directly measure noise transmission. Here we build a mathematical framework capable of modeling allelic correlation and noise transmission. We find that allelic correlation and noise transmission correspond across a broad range of model parameters and network architectures. We further find that limiting noise transmission comes with the trade-off of being unresponsive to signals, and that within the parameter regimes that are responsive to signals, there is a further trade-off between response time and basal noise transmission. Using a published allele-specific single-cell RNA-sequencing dataset, we found that genes with high allelic odds ratios are enriched for cell-type specific functions, and that within multiple signaling pathways, factors which are upstream in the pathway have higher allelic odds ratios than downstream factors. Overall, our findings suggest that some degree of noise transmission is required to be responsive to signals, but that minimization of noise transmission can be accomplished by trading off for a slower response time.
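
The abstract above refers to allelic odds ratios without defining them. A minimal sketch, assuming the standard 2x2 odds ratio over per-cell detection of the two alleles of one gene, is given below. The function name, the input format and the 0.5 pseudocount (a common correction for empty cells of the table) are assumptions, not the authors' definition.

```python
def allelic_odds_ratio(cells, pseudocount=0.5):
    """Odds ratio for co-detection of two alleles of one gene across cells.

    cells: iterable of (allele_1_detected, allele_2_detected) booleans,
    one pair per cell. Values above 1 mean the alleles tend to be
    detected together (correlated expression).
    """
    both = only_1 = only_2 = neither = 0
    for allele_1, allele_2 in cells:
        if allele_1 and allele_2:
            both += 1
        elif allele_1:
            only_1 += 1
        elif allele_2:
            only_2 += 1
        else:
            neither += 1
    # Pseudocount keeps the ratio finite when a table cell is empty.
    concordant = (both + pseudocount) * (neither + pseudocount)
    discordant = (only_1 + pseudocount) * (only_2 + pseudocount)
    return concordant / discordant


# Invented data: alleles detected independently of each other.
independent = ([(True, True)] * 10 + [(True, False)] * 10
               + [(False, True)] * 10 + [(False, False)] * 10)
# Invented data: alleles always detected together or not at all.
correlated = [(True, True)] * 20 + [(False, False)] * 20

print(allelic_odds_ratio(independent))
print(allelic_odds_ratio(correlated))
```

With equal counts in all four categories the ratio is 1, while the fully concordant toy data give a ratio far above 1.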


2021 ◽  
Author(s):  
Hjorleifur Einarsson ◽  
Marco Salvatore ◽  
Christian Vaagenso ◽  
Nicolas Alcaraz ◽  
Jette Bornholdt Lange ◽  
...  

Genetic and environmental exposures cause variability in gene expression. Although most genes are affected in a population, their effect sizes vary greatly, indicating the existence of regulatory mechanisms that could amplify or attenuate expression variability. Here, we investigate the relationship between the sequence and transcription start site architectures of promoters and their expression variability across human individuals. We find that expression variability is largely determined by a promoter's DNA sequence and its binding sites for specific transcription factors. We further demonstrate that flexible usage of transcription start sites within a promoter attenuates variability, providing transcriptional and mutational robustness.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Bárbara Andrade Barbosa ◽  
Saskia D. van Asten ◽  
Ji Won Oh ◽  
Arantza Farina-Sarasqueta ◽  
Joanne Verheij ◽  
...  

Deconvolution of bulk gene expression profiles into the cellular components is pivotal to portraying a tissue’s complex cellular make-up, such as the tumor microenvironment. However, the inherently variable nature of gene expression requires a comprehensive statistical model and reliable prior knowledge of individual cell types that can be obtained from single-cell RNA sequencing. We introduce BLADE (Bayesian Log-normAl Deconvolution), a unified Bayesian framework to estimate both cellular composition and gene expression profiles for each cell type. Unlike previous comprehensive statistical approaches, BLADE can handle > 20 types of cells due to the efficient variational inference. In an intensive evaluation with > 700 simulated and real datasets, BLADE demonstrated enhanced robustness against gene expression variability and better completeness than conventional methods, in particular in reconstructing the gene expression profiles of each cell type. In summary, BLADE is a powerful tool to unravel heterogeneous cellular activity in complex biological systems from standard bulk gene expression data.


2021 ◽  
Vol 104 (4) ◽  
Author(s):  
Euan Joly-Smith ◽  
Zitong Jerry Wang ◽  
Andreas Hilfinger
