Malignancy and NF-kB signalling strengthen coordination between the expression of mitochondrial and nuclear-encoded oxidative phosphorylation genes

Mapping Intimacies ◽

10.1101/2021.06.30.450588 ◽

2021 ◽

Author(s):

Marcos Francisco Perez ◽

Peter Sarkies

Keyword(s):

Gene Expression ◽

Correlation Analysis ◽

Mitochondrial Gene ◽

Nuclear Genome ◽

Rna Seq ◽

Healthy Human ◽

Protein Coding ◽

Oxphos Gene ◽

Mammalian Mitochondria ◽

Cancer Types

Mitochondria are ancient endosymbiotic organelles crucial to eukaryotic growth and metabolism. Mammalian mitochondria carry a small genome containing thirteen protein-coding genes, with the remaining mitochondrial proteins encoded by the nuclear genome. Little is known about how coordination between the two sets of genes is achieved. Correlation analysis of RNA-seq expression data from large publicly available datasets is a common method to leverage genetic diversity to infer gene co-expression modules. Here we use this method to investigate nuclear-mitochondrial gene expression coordination. We identify a pitfall in correlation analysis that results from the large variation in the proportion of transcripts from the mitochondrial genome in RNA-seq data. Commonly used normalization techniques based on total read count (such as FPKM or TPM) produce artefactual negative correlations between mitochondrial- and nuclear-encoded transcripts. This also results in artefactual correlations between pairs of nuclear-encoded genes, with important consequences for inferring co-expression modules beyond mitochondria. We show that these effects can be overcome by normalizing with the median-ratio normalization (MRN) or trimmed mean of M values (TMM) methods. Using these normalizations, we find only weak and inconsistent correlations between mitochondrial and nuclear-encoded mitochondrial genes in the majority of healthy human tissues from the GTEx database. However, a subset of healthy tissues with high expression of NF-κB shows significant coordination, supporting a role for NF-κB in retrograde signalling. In contrast, most cancer types show robust coordination of nuclear and mitochondrial OXPHOS gene expression, identifying this as a feature of gene regulation in cancer.
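The normalization pitfall described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline: the gene counts, abundance distributions, function names and noise-free "counts" are all assumptions chosen for illustration. It generates nuclear genes whose true abundances are independent of a single highly variable mitochondrial transcript pool, then compares total-count scaling (TPM-like, ignoring gene length) with a median-of-ratios size factor.

```python
import math
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate(n_samples=200, n_nuclear=50, seed=0):
    """Return counts[sample][gene]; the last column is the mitochondrial pool.

    Assumption: true abundances of all genes are drawn independently, so any
    mito-nuclear correlation seen after normalization is an artefact.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        nuclear = [100.0 * rng.lognormvariate(0.0, 0.3) for _ in range(n_nuclear)]
        mito = 3000.0 * rng.lognormvariate(0.0, 1.0)  # large, highly variable share
        abundance = nuclear + [mito]
        depth = rng.uniform(1e6, 5e6)  # arbitrary sequencing depth per sample
        total = sum(abundance)
        # Expected (noise-free) read counts: depth times compositional share.
        counts.append([depth * a / total for a in abundance])
    return counts


def total_count_normalise(counts):
    """Scale each sample to one million reads (TPM-like, gene length ignored)."""
    return [[1e6 * c / sum(row) for c in row] for row in counts]


def median_ratio_normalise(counts):
    """Divide each sample by its median ratio to the per-gene geometric mean."""
    n_genes = len(counts[0])
    log_geo_mean = [
        statistics.fmean([math.log(row[g]) for row in counts]) for g in range(n_genes)
    ]
    normalised = []
    for row in counts:
        log_ratios = [math.log(row[g]) - log_geo_mean[g] for g in range(n_genes)]
        size_factor = math.exp(statistics.median(log_ratios))
        normalised.append([c / size_factor for c in row])
    return normalised


def mean_mito_nuclear_correlation(matrix):
    """Average Pearson r between the mitochondrial column and each nuclear gene."""
    mito = [row[-1] for row in matrix]
    n_nuclear = len(matrix[0]) - 1
    return statistics.fmean(
        [pearson([row[g] for row in matrix], mito) for g in range(n_nuclear)]
    )


if __name__ == "__main__":
    counts = simulate()
    print("total-count:", mean_mito_nuclear_correlation(total_count_normalise(counts)))
    print("median-ratio:", mean_mito_nuclear_correlation(median_ratio_normalise(counts)))
```

Under these assumptions, total-count scaling should yield a clearly negative average correlation even though none exists in the simulated abundances, while the median-ratio version should stay close to zero, because the median across genes is largely insensitive to one dominant transcript pool.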

Malignancy and NF-κB signalling strengthen coordination between expression of mitochondrial and nuclear-encoded oxidative phosphorylation genes

Genome Biology ◽

10.1186/s13059-021-02541-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Marcos Francisco Perez ◽

Peter Sarkies

Keyword(s):

Gene Expression ◽

Mitochondrial Genome ◽

Correlation Analysis ◽

Mitochondrial Gene ◽

Nuclear Genome ◽

Mitochondrial Proteins ◽

Rna Seq ◽

Healthy Human ◽

Oxphos Gene ◽

Cancer Types

Abstract Background: Mitochondria are ancient endosymbiotic organelles crucial to eukaryotic growth and metabolism. The mammalian mitochondrial genome encodes 13 mitochondrial proteins, and the remaining mitochondrial proteins are encoded by the nuclear genome. Little is known about how coordination between the expression of the two sets of genes is achieved. Results: Correlation analysis of RNA-seq expression data from large publicly available datasets is a common method to leverage genetic diversity to infer gene co-expression modules. Here we use this method to investigate nuclear-mitochondrial gene expression coordination. We identify a pitfall in correlation analysis that results from the large variation in the proportion of transcripts from the mitochondrial genome in RNA-seq data. Commonly used normalisation techniques based on total read counts, such as FPKM or TPM, produce artefactual negative correlations between mitochondrial- and nuclear-encoded transcripts. This also results in artefactual correlations between pairs of nuclear-encoded genes, with important consequences for inferring co-expression modules beyond mitochondria. We show that these effects can be overcome by normalising with the median-ratio normalisation (MRN) or trimmed mean of M values (TMM) methods. Using these normalisations, we find only weak and inconsistent correlations between mitochondrial and nuclear-encoded mitochondrial genes in the majority of healthy human tissues from the GTEx database. Conclusions: We show that a subset of healthy tissues with high expression of NF-κB shows significant coordination, suggesting a role for NF-κB in ensuring balanced expression between mitochondrial and nuclear genes. In contrast, most cancer types show robust coordination of nuclear and mitochondrial OXPHOS gene expression, identifying this as a feature of gene regulation in cancer.

Mitochondrial Mistranslation in Brain Provokes a Metabolic Response Which Mitigates the Age-Associated Decline in Mitochondrial Gene Expression

International Journal of Molecular Sciences ◽

10.3390/ijms22052746 ◽

2021 ◽

Vol 22 (5) ◽

pp. 2746

Author(s):

Dimitri Shcherbakov ◽

Reda Juskeviciene ◽

Adrián Cortés Sanchón ◽

Margarita Brilkova ◽

Hubert Rehrauer ◽

...

Keyword(s):

Gene Expression ◽

Metabolic Response ◽

Mitochondrial Gene ◽

Tca Cycle ◽

Brain Mitochondria ◽

Mitochondrial Gene Expression ◽

Rna Seq ◽

The Tca Cycle ◽

Neurological Phenotype

Mitochondrial misreading, conferred by mutation V338Y in mitoribosomal protein Mrps5, is associated in vivo with a subtle neurological phenotype. Brain mitochondria of homozygous knock-in mutant Mrps5(V338Y/V338Y) mice show decreased oxygen consumption and reduced ATP levels. Using a combination of unbiased RNA-Seq and untargeted metabolomics, we demonstrate here a concerted response that alleviates the impaired functionality of OXPHOS complexes in Mrps5 mutant mice. This concerted response mitigates the age-associated decline in mitochondrial gene expression and compensates for impaired respiration by transcriptional upregulation of OXPHOS components together with anaplerotic replenishment of the TCA cycle (pyruvate, 2-ketoglutarate).

Mitogenome Analysis of Four Lamiinae Species (Coleoptera: Cerambycidae) and Gene Expression Responses by Monochamus alternatus When Infected with the Parasitic Nematode, Bursaphelenchus mucronatus

Insects ◽

10.3390/insects12050453 ◽

2021 ◽

Vol 12 (5) ◽

pp. 453

Author(s):

Zi-Yi Zhang ◽

Jia-Yin Guan ◽

Yu-Rou Cao ◽

Xin-Yi Dai ◽

Kenneth B. Storey ◽

...

Keyword(s):

Gene Expression ◽

Phylogenetic Trees ◽

Mitochondrial Gene ◽

Mitochondrial Protein ◽

Pine Wilt Disease ◽

Bursaphelenchus Xylophilus ◽

Monochamus Alternatus ◽

Genome Database ◽

Protein Coding ◽

Nd5 Gene

We determined the mitochondrial genome sequence of Monochamus alternatus and three other mitogenomes of Lamiinae (Insecta: Coleoptera: Cerambycidae) belonging to three genera (Aulaconotus, Apriona and Paraglenea) to enrich the mitochondrial genome database of Lamiinae and further explore the phylogenetic relationships within the subfamily. Phylogenetic trees of the Lamiinae were built using the Bayesian inference (BI) and maximum likelihood (ML) methods, and the monophyly of the genera Monochamus, Anoplophora, and Batocera was supported. Anoplophora chinensis, An. glabripennis and Aristobia reticulator were closely related, suggesting they may also be potential vectors for the transmission of the pine wood pathogenic nematode (Bursaphelenchus xylophilus) in addition to M. alternatus, a well-known vector of pine wilt disease. There is a special symbiotic relationship between M. alternatus and Bursaphelenchus xylophilus. As the native sympatric sibling species of B. xylophilus, B. mucronatus also has a specific relationship with M. alternatus that is often overlooked. The analysis of mitochondrial gene expression aimed to explore the effect of B. mucronatus on the energy metabolism of the respiratory chain of M. alternatus adults. Using RT-qPCR, we determined and compared the expression of eight mitochondrial protein-coding genes (COI, COII, COIII, ND1, ND4, ND5, ATP6, and Cyt b) in M. alternatus infected by B. mucronatus and in M. alternatus without the nematode. Expression of all eight mitochondrial genes was up-regulated, particularly that of the ND4 and ND5 genes, which were up-regulated 4–5-fold (p < 0.01). Since longicorn beetles have immune responses to nematodes, we believe that their relationship should not be viewed as symbiotic, but classed as parasitic.

Annotation of snoRNA abundance across human tissues reveals complex snoRNA-host gene relationships

Genome Biology ◽

10.1186/s13059-021-02391-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Étienne Fafard-Couture ◽

Danny Bergeron ◽

Sonia Couture ◽

Sherif Abou-Elela ◽

Michelle S. Scott

Keyword(s):

Housekeeping Genes ◽

Host Gene ◽

Rna Modification ◽

Human Tissues ◽

Rna Seq ◽

Healthy Human ◽

Protein Coding ◽

Conservation Level ◽

Nucleolar Rnas ◽

Host Genes

Abstract Background: Small nucleolar RNAs (snoRNAs) are mid-size non-coding RNAs required for ribosomal RNA modification, implying a ubiquitous tissue distribution linked to ribosome synthesis. However, increasing numbers of studies identify extra-ribosomal roles of snoRNAs in modulating gene expression, suggesting more complex snoRNA abundance patterns. Therefore, there is a great need for mapping the snoRNome in different human tissues as the blueprint for snoRNA functions. Results: We used a low structure bias RNA-Seq approach to accurately quantify snoRNAs and compare them to the entire transcriptome in seven healthy human tissues (breast, ovary, prostate, testis, skeletal muscle, liver, and brain). We identify 475 expressed snoRNAs categorized in two abundance classes that differ significantly in their function, conservation level, and correlation with their host gene: 390 snoRNAs are uniformly expressed and 85 are enriched in the brain or reproductive tissues. Most tissue-enriched snoRNAs are embedded in lncRNAs and display strong correlation of abundance with them, whereas uniformly expressed snoRNAs are mostly embedded in protein-coding host genes and are mainly non- or anticorrelated with them. Fifty-nine percent of the non-correlated or anticorrelated protein-coding host gene/snoRNA pairs feature dual-initiation promoters, compared to only 16% of the correlated non-coding host gene/snoRNA pairs. Conclusions: Our results demonstrate that snoRNAs are not a single homogeneous group of housekeeping genes but include highly regulated tissue-enriched RNAs. Indeed, our work indicates that the architecture of snoRNA host genes varies to uncouple the host and snoRNA expressions in order to meet the different snoRNA abundance levels and functional needs of human tissues.

Gene Expression Imputation with Generative Adversarial Imputation Nets

10.1101/2020.06.09.141689 ◽

2020 ◽

Author(s):

Ramon Viñas ◽

Tiago Azevedo ◽

Eric R. Gamazon ◽

Pietro Liò

Keyword(s):

Gene Expression ◽

Large Scale ◽

Biological Significance ◽

Predictive Performance ◽

Cost Effective ◽

Rna Seq ◽

Comprehensive Collection ◽

Genomic Studies ◽

Biological Discovery ◽

Cancer Types

Abstract: A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we present GAIN-GTEx, a method for gene expression imputation based on Generative Adversarial Imputation Networks. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We compare our model to several standard and state-of-the-art imputation methods and show that GAIN-GTEx is significantly superior in terms of predictive performance and runtime. Furthermore, our results indicate strong generalisation on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

77 Prevalence of secondary immunotherapeutic targets in the absence of established immune biomarkers in solid tumors

Journal for ImmunoTherapy of Cancer ◽

10.1136/jitc-2021-sitc2021.077 ◽

2021 ◽

Vol 9 (Suppl 3) ◽

pp. A86-A86

Author(s):

Paul DePietro ◽

Mary Nesline ◽

Yong Hee Lee ◽

RJ Seager ◽

Erik Van Roey ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Reference Population ◽

List Type ◽

Tumor Type ◽

Genomic Profiling ◽

Rna Seq ◽

Immune Biomarkers ◽

Cancer Types ◽

Immune Related Genes

Background: Immune checkpoint inhibitor-based therapies have achieved impressive success in the treatment of several cancer types. Predictive immune biomarkers, including PD-L1, MSI and TMB, are well established as surrogate markers for immune evasion and tumor-specific neoantigens across many tumors. Positive detection across cancer types varies, but overall ~50% of patients test negative for these primary immune markers [1]. In this study, we investigated the prevalence of secondary immune biomarkers outside of PD-L1, TMB and MSI. Methods: Comprehensive genomic and immune profiling, including PD-L1 IHC, TMB, MSI and gene expression of 395 immune related genes, was performed on 6078 FFPE tumors representing 34 cancer types, predominantly composed of lung cancer (36.7%), colorectal cancer (11.9%) and breast cancer (8.5%). Expression levels by RNA-seq of 36 genes targeted by immunotherapies in solid tumor clinical trials, identified as secondary immune biomarkers, were ranked against a reference population. Genes with a rank value ≥75th percentile were considered high and values were associated with PD-L1 (positive ≥1%), MSI (MSI-H or MSS) and TMB (high ≥10 Mut/Mb) status. Additionally, secondary immune biomarker status was segmented by tumor type and cancer immune cycle roles. Results: In total, 41.0% of cases were PD-L1+, 6.4% TMB+, and 0.1% MSI-H. 12.6% of cases were positive for >2 of these markers while 39.9% were triple negative (PD-L1-/TMB-/MSS). Of the PD-L1-/TMB-/MSS cases, 89.1% were high for at least one secondary immune biomarker, with 69.3% having ≥3 markers. PD-L1-/TMB-/MSS tumor types with ≥50% prevalence of high secondary immune biomarkers included brain, prostate, kidney, sarcoma, gallbladder, breast, colorectal, and liver cancer. High expression of cancer testis antigen secondary immune biomarkers (e.g., NY-ESO-1, LAGE-1A, MAGE-A4) was most commonly observed in bladder, ovarian, sarcoma, liver, and prostate cancer (≥15%). Tumors demonstrating T-cell priming (e.g., CD40, OX40, CD137), trafficking (e.g., TGFB1, TLR9, TNF) and/or recognition (e.g., CTLA4, LAG3, TIGIT) secondary immune biomarkers were most represented by kidney, gallbladder, and sarcoma (≥40%), with melanoma, esophageal, head & neck, cervical, stomach, and lung cancer least represented (≥15%). Conclusions: Our studies show that comprehensive tumor profiling that includes gene expression can detect secondary immune biomarkers targeted by investigational therapies in ~90% of PD-L1-/TMB-/MSS cases. While genomic profiling could also provide therapeutic choices for a percentage of these patients, detection of secondary immune biomarkers by RNA-seq provides additional options for patients without a clear therapeutic path as determined by PD-L1 testing and genomic profiling alone. Reference: [1] Huang R S P, Haberberger J, Severson E, et al. A pan-cancer analysis of PD-L1 immunohistochemistry and gene amplification, tumor mutation burden and microsatellite instability in 48,782 cases. Mod Pathol 2021;34: 252–263.
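The "high expression" call in the Methods (a rank at or above the 75th percentile of a reference population) can be sketched as follows. This is only an illustration: the abstract does not say how percentile ranks were computed, so the definition used here (share of reference values less than or equal to the sample value) and the function names are assumptions.

```python
from bisect import bisect_right


def percentile_rank(value, reference):
    """Percentage of reference values that are <= value (assumed definition)."""
    ordered = sorted(reference)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def is_high(value, reference, threshold=75.0):
    """Call expression 'high' when its percentile rank reaches the threshold."""
    return percentile_rank(value, reference) >= threshold


if __name__ == "__main__":
    # Hypothetical reference expression values for one gene.
    reference = list(range(1, 101))
    for sample_value in (20, 74, 75, 99):
        print(sample_value, percentile_rank(sample_value, reference),
              is_high(sample_value, reference))
```

Other percentile conventions (strictly-less-than, or midpoint handling of ties) would shift borderline calls, so the threshold behaviour near ties depends on the convention chosen.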

Identifying inaccuracies in gene expression estimates from unstranded RNA-seq data

Scientific Reports ◽

10.1038/s41598-019-52584-w ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Mikhail Pomaznoy ◽

Ashu Sethi ◽

Jason Greenbaum ◽

Bjoern Peters

Keyword(s):

Gene Expression ◽

Differential Expression Analysis ◽

Cell Types ◽

Library Preparation ◽

Rna Seq ◽

Protein Coding ◽

Protein Coding Genes ◽

Machine Learning Model ◽

Specific Manner ◽

Library Preparation Protocol

Abstract: RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology which can skew the gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information was not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
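The two-fold criterion mentioned in this abstract can be sketched as a simple comparison of paired estimates. The snippet is a hypothetical illustration and does not use the uslcount package or its API; the gene names, counts, pseudocount and function names are all assumptions.

```python
import math


def log2_fold_change(stranded, unstranded, pseudocount=1.0):
    """log2 ratio of unstranded to stranded estimates.

    The pseudocount (an assumption) avoids division by zero for unexpressed genes.
    """
    return math.log2((unstranded + pseudocount) / (stranded + pseudocount))


def flag_discordant(estimates, min_fold=2.0):
    """Return sorted gene names whose two estimates differ by at least min_fold."""
    threshold = math.log2(min_fold)
    return sorted(
        gene
        for gene, (stranded, unstranded) in estimates.items()
        if abs(log2_fold_change(stranded, unstranded)) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical (stranded, unstranded) read counts per gene.
    estimates = {
        "geneA": (100, 105),  # nearly identical
        "geneB": (10, 250),   # inflated when strand is ignored
        "geneC": (400, 150),  # deflated when strand is ignored
    }
    print(flag_discordant(estimates))
```

A comparison like this needs a stranded dataset as ground truth; the abstract's machine learning model addresses the harder case where only unstranded data are available.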

RNA sequencing analysis for profiling activation of cancer-associated molecular pathways.

Journal of Clinical Oncology ◽

10.1200/jco.2019.37.15_suppl.e13032 ◽

2019 ◽

Vol 37 (15_suppl) ◽

pp. e13032-e13032 ◽

Cited By ~ 2

Author(s):

Anton Buzdin ◽

Andrew Garazha ◽

Maxim Sorokin ◽

Alex Glusker ◽

Alexey Aleshin ◽

...

Keyword(s):

Gene Expression ◽

Original Data ◽

Tissue Expression ◽

Molecular Pathways ◽

Sequencing Analysis ◽

Rna Seq ◽

Sequencing Data ◽

Healthy Human ◽

Tissue Samples ◽

Normal Tissues

e13032 Background: Intracellular molecular pathways (IMPs) control all major events in the living cell. They are considered hotspots in contemporary oncology because knowledge of IMP activation is essential for understanding mechanisms of molecular pathogenesis. Profiling IMPs requires RNA-seq data for tumors and for a collection of reference normal tissues. However, there is currently a shortage of such profiles for normal tissues from healthy human donors, uniformly profiled in a single series of experiments. The largest dataset of normal profiles, GTEx, is only partly accessible through dbGaP. In the TCGA database, normal samples are adjacent to surgically removed tumors and may be affected by tumor-linked growth factors, inflammation and altered vascularization. ENCODE datasets were obtained from autopsies of normal tissues, but they cannot form statistically significant reference groups. Methods: Tissue samples representing 20 organs were taken post mortem, no later than 36 hours after death, from healthy human donors killed in road accidents; blood samples were taken from healthy volunteers. Gene expression was profiled in RNA-seq experiments using the same reagents, equipment and protocols. Bioinformatic algorithms for IMP analysis were developed and validated using experimental and public gene expression datasets. Results: From original sequencing data we constructed the biggest fully open reference expression database of normal human tissues, including 465 profiles, termed Oncobox Atlas of Normal Tissue Expression (ANTE, original data: GSE120795). We next developed a method termed Oncobox for interrogating activation of IMPs in human cancers. It includes modules for expression data harmonization and comparison and an algorithm for automatic annotation of molecular pathways. The Oncobox system enables accurate scoring of thousands of molecular pathways using RNA-seq data. Oncobox pathway analysis is also applicable to quantitative proteomics and microRNA data in oncology. Conclusions: The Oncobox system can be used for a plethora of applications in cancer research, including finding differentially regulated genes and IMPs, and for discovery of new pathway-related diagnostic and prognostic biomarkers.

Analysis of RDR1/RDR2/RDR6-independent small RNAs in Arabidopsis thaliana improves MIRNA annotations and reveals novel siRNA loci

10.1101/238691 ◽

2017 ◽

Cited By ~ 1

Author(s):

Seth Polydore ◽

Michael J. Axtell

Keyword(s):

Gene Expression ◽

Arabidopsis Thaliana ◽

Small Rna ◽

Small Rnas ◽

Rna Seq ◽

Triple Mutant ◽

Physiological Mechanisms ◽

Protein Coding ◽

Regulate Gene Expression ◽

Rna Biogenesis

Summary: Plant small RNAs regulate key physiological mechanisms through post-transcriptional and transcriptional silencing of gene expression. sRNAs fall into two major categories: those that are reliant on RNA Dependent RNA Polymerases (RDRs) for biogenesis and those that aren’t. Known RDR-dependent sRNAs include phased and repeat-associated short interfering RNAs, while known RDR-independent sRNAs are primarily microRNAs and other hairpin-derived sRNAs. In this study, we produced and analyzed small RNA-seq libraries from rdr1/rdr2/rdr6 triple mutant plants. Only a small fraction of all sRNA loci were RDR1/RDR2/RDR6-independent; most of these were microRNA loci or associated with predicted hairpin precursors. We found 58 previously annotated microRNA loci that were reliant on RDR1, −2, or −6 function, casting doubt on their classification. We also found 38 RDR1/2/6-independent small RNA loci that are not MIRNAs or otherwise hairpin-derived, and did not fit into other known paradigms for small RNA biogenesis. These 38 small RNA-producing loci have novel biogenesis mechanisms, and are frequently located in the vicinity of protein-coding genes. Altogether, our analysis suggests that these 38 loci represent one or more new types of small RNAs in Arabidopsis thaliana. Significance Statement: Small RNAs regulate gene expression in plants and are produced through a variety of previously described mechanisms. Here, we examine a set of previously undiscovered small RNA-producing loci that are produced by novel mechanisms.

Human CHR18: “Stakhanovite” Genes, Missing and uPE1 Proteins in Liver Tissue and HepG2 Cells

Biomedical Chemistry Research and Methods ◽

10.18097/bmcrm00144 ◽

2021 ◽

Vol 4 (1) ◽

pp. e00144

Author(s):

K.A. Deinichenko ◽

G.S. Krasnov ◽

S.P. Radko ◽

K.G. Ptitsyn ◽

V.V. Shapovalova ◽

...

Keyword(s):

Gene Expression ◽

Correlation Analysis ◽

Quantitative Pcr ◽

Liver Tissue ◽

Hepg2 Cells ◽

Expression Profiles ◽

Normal Liver ◽

Illumina Hiseq ◽

Protein Coding ◽

Protein Coding Genes

Missing (MP) and functionally uncharacterized proteins (uPE1) comprise less than 5% of the total number of proteins encoded by human Chr18 genes. Within half a year of the January 2020 version of NextProt, the number of entries in the MP+uPE1 datasets changed, mainly due to the achievements of antibody-based proteomics. Assuming that the proteome is closely related to the transcriptome scaffold, quantitative PCR, Illumina HiSeq, and Oxford Nanopore Technology were applied to characterize the liver samples of three male donors in comparison with the HepG2 cell line. Data mining of the Expression Atlas (EMBL-EBI) and profiling of biopsy samples using orthogonal methods of transcriptome analysis have shown that, in HepG2 cells and the liver, the genes encoding functionally uncharacterized proteins (uPE1) are expressed at levels as low as those of the missing proteins (less than 1 copy per cell), except in the selected cases of HSBP1L1, TMEM241, C18orf21, and KLHL14. The initial expectation that uPE1 genes might be expressed at higher levels than MP genes was compromised by severe discrepancies in our semi-quantitative gene expression data and in public databanks. This discrepancy forced us to revisit the transcriptome of Chr18, the target of the Russian C-HPP Consortium. A tanglegram of highly expressed genes and further correlation analysis showed a strong dependence on the mRNA extraction method and the analytical platform. Targeted gene expression analysis by quantitative PCR (qPCR) and high-throughput transcriptome profiling (Illumina HiSeq and ONT MinION) for the same set of samples from normal liver tissue and HepG2 cells revealed the detectable expression of 250+ (92%) protein-coding genes of Chr18 (by at least one method). The expression of slightly more than 50% of protein-coding genes was detected simultaneously by all three methods. Correlation analysis of the gene expression profiles showed that the grouping of the datasets depended almost equally on both the type of biological material and the experimental method, particularly cDNA/mRNA isolation and library preparation.
