Public transcriptome database-based selection and validation of reliable reference genes for breast cancer research

2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Qiang Song ◽  
Lu Dou ◽  
Wenjin Zhang ◽  
Yang Peng ◽  
Man Huang ◽  
...  

Abstract Background Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) is the most sensitive technique for evaluating gene expression levels. Choosing appropriate reference genes (RGs) is critical for normalizing and evaluating changes in the expression of target genes. However, uniform and reliable RGs for breast cancer research have not been identified, limiting the value of target gene expression studies. Here, we aimed to identify reliable and accurate RGs for breast cancer tissues and cell lines using the RNA-seq dataset. Methods First, we compiled the transcriptome profiling data from the TCGA database involving 1217 samples to identify novel RGs. Next, ten genes with relatively stable expression levels were chosen as novel candidate RGs, together with six conventional RGs. To determine and validate the optimal RGs we performed qRT-PCR experiments on 87 samples from 11 types of surgically excised breast tumor specimens (n = 66) and seven breast cancer cell lines (n = 21). Five publicly available algorithms (geNorm, NormFinder, ΔCt method, BestKeeper, and ComprFinder) were used to assess the expression stability of each RG across all breast cancer tissues and cell lines. Results Our results show that RG combinations SF1 + TRA2B + THRAP3 and THRAP3 + RHOA + QRICH1 showed stable expression in breast cancer tissues and cell lines, respectively, and that they displayed good interchangeability. We propose that these combinations are optimal triplet RGs for breast cancer research. Conclusions In summary, we identified novel and reliable RG combinations for breast cancer research based on a public RNA-seq dataset. Our results lay a solid foundation for the accurate normalization of qRT-PCR results across different breast cancer tissues and cells.
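
To make one of the listed algorithms concrete, here is a minimal Python sketch of the comparative ΔCt approach as it is commonly described: for every pair of candidate genes, take the standard deviation of their Ct difference across samples, then average those values per gene (lower means more stable). The function name and the Ct values are hypothetical illustrations, not data or code from the study.

```python
from itertools import combinations
from statistics import mean, stdev


def delta_ct_stability(ct):
    """ct maps gene -> list of Ct values, one per sample, in the same sample order.

    Returns gene -> mean standard deviation of pairwise Ct differences.
    """
    pair_sd = {gene: [] for gene in ct}
    for a, b in combinations(ct, 2):
        diffs = [x - y for x, y in zip(ct[a], ct[b])]
        sd = stdev(diffs)
        pair_sd[a].append(sd)
        pair_sd[b].append(sd)
    return {gene: mean(sds) for gene, sds in pair_sd.items()}


# Hypothetical Ct values for illustration only (not measurements from the paper)
example_ct = {
    "SF1": [22.0, 22.5, 23.0, 22.25],
    "THRAP3": [24.0, 24.5, 25.0, 24.25],
    "GAPDH": [18.0, 20.0, 17.5, 21.0],
}
scores = delta_ct_stability(example_ct)
ranking = sorted(scores, key=scores.get)  # most stable first
print(ranking)
```

In this toy input the first two genes move in parallel across samples, so their pairwise difference barely varies and they rank ahead of the third.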

2020 ◽  
Author(s):  
Qiang Song ◽  
Man Huang ◽  
Guicheng Wu ◽  
Lu Dou ◽  
Wenjin Zhang ◽  
...  

Abstract Background Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) is the most sensitive technique for evaluating gene expression levels. Choosing appropriate reference genes (RGs) is critical for normalizing and evaluating changes in the expression of target genes. However, uniform and reliable RGs for breast cancer research have not been identified, limiting the value of target gene expression studies. Here, we provide a novel approach for mining RGs by using the RNA-seq dataset to identify reliable and accurate RGs that can be applied to different types of breast cancer tissues and cell lines. Methods First, we compiled the transcriptome profiling data from the TCGA database involving 1217 samples to identify novel RGs; ten genes (SF1, TARDBP, THRAP3, QRICH1, TRA2B, SRSF3, YY1, DNAJC8, RNF10, and RHOA) with relatively stable expression levels were then chosen as novel candidate RGs. Additionally, six conventional RGs (ACTB, TUBA1A, RPL13A, B2M, GAPDH, and GUSB) were selected. To determine and validate the optimal RGs, we performed qRT-PCR experiments on 87 samples from 5 types of surgically excised breast tumor specimens, including HR+HER2-, HR+HER2+, HR-HER2-, HR-HER2+, and breast cancer after neoadjuvant chemotherapy (NAC), together with their matched para-carcinoma tissues; we also included a benign breast tumor sample. Six biological replicates were included for each tissue. Moreover, we assessed 7 breast cancer cell lines (MCF-10A, MCF-7, T-47D, MDA-MB-231, MDA-MB-468, as well as MDA-MB-231 with either CNR2 knockdown or overexpression; 3 biological replicates for each line). Five statistical algorithms (geNorm, NormFinder, ΔCt method, BestKeeper, and ComprFinder) were used to assess the stability of expression of each RG across all breast cancer tissues and cell lines.
Results Our results show that RG combinations SF1+TRA2B+THRAP3 and THRAP3+RHOA+QRICH1 showed stable expression in breast cancer tissues and cell lines, respectively, and that these two combinations displayed good interchangeability. Therefore, we propose that the above two combinations are optimal triplet RGs for breast cancer research. Conclusions In summary, we identified novel and reliable RG combinations for breast cancer research based on a public RNA-seq dataset which lays a solid foundation for accurate normalization of qRT-PCR results across different breast cancer tissues and cells.
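
As an illustration of how a triplet of RGs might be used for normalization, the sketch below averages the Ct values of the reference genes and expresses the target relative to that average. It assumes 100% amplification efficiency, under which the arithmetic mean of Ct values corresponds to the geometric mean of the reference quantities. The function name and the Ct values are hypothetical and do not come from the abstract.

```python
from statistics import mean


def relative_quantity(ct_target, ct_refs):
    """Target quantity relative to several reference genes.

    Assumes 100% PCR efficiency (a doubling per cycle), so averaging the
    reference Ct values is equivalent to a geometric mean of their quantities.
    """
    return 2 ** -(ct_target - mean(ct_refs))


# Hypothetical Ct values: one target gene and three reference genes
value = relative_quantity(26.0, [23.0, 24.0, 25.0])
print(value)
```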


2021 ◽  
Author(s):  
Lichun Zhang ◽  
Xiaoqian Yang ◽  
Yiyi Yin ◽  
Jinxing Wang ◽  
Yanwei Wang

Abstract Quantitative real-time polymerase chain reaction (qRT-PCR) is a common method to analyze gene expression. Due to differences in RNA quantity, quality, and reverse transcription efficiency between qRT-PCR samples, reference genes are used as internal standards to normalize gene expression. However, few universal reference genes, especially miRNAs, have been identified so far. Therefore, it is essential to identify reference genes that can be used across various experimental conditions, stress treatments, or tissues. In this study, 14 microRNAs (miRNAs) and 5.8S rRNA were assessed for expression stability in poplar trees infected with canker pathogen. Using three reference gene analysis programs, we found that miR156g and miR156a exhibited stable expression throughout the infection process. miR156g and miR156a were then tested as internal standards to measure the expression of miR1447 and miR171c, and the results were compared to small RNA sequencing (RNA-seq) data. We found that when miR156a was used as the reference gene, the expression of miR1447 and miR171c was consistent with the small RNA-seq expression profiles. Therefore, miR156a was the most stable miRNA examined in this study and could be used as a reference gene in poplar under canker pathogen stress, which should enable comprehensive comparisons of miRNA expression and avoid the bias caused by the difference in length between the detected miRNAs and traditional reference genes. The present study has expanded the miRNA reference genes available for gene expression studies in trees under biotic stress.
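
The abstract does not state how expression was calculated once a reference miRNA was chosen. A widely used option is the 2^-ΔΔCt calculation, sketched here under the assumption of equal, near-100% amplification efficiency for target and reference. The function name and the Ct values are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression of a target (treated vs control) by 2^-ddCt."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2 ** -(d_ct_treated - d_ct_control)


# Hypothetical Ct values: the target is detected two cycles earlier after
# treatment while the reference is unchanged
fc = fold_change(24.0, 20.0, 26.0, 20.0)
print(fc)
```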


2017 ◽  
Vol 29 (1) ◽  
pp. 173
Author(s):  
Z. Jiang ◽  
J. Sun ◽  
S. Marjani ◽  
H. Dong ◽  
X. Zheng ◽  
...  

Appropriate reference genes for accurate normalization in RT-PCR are essential for the study of gene expression. Ideal reference genes should not only have stable expression across stages of embryo development, but also be expressed at comparable levels to the target genes. Using RNA-seq data from in vivo-produced bovine oocytes and embryos from the 2-cell to blastocyst stage (Jiang et al., 2014 BMC Genomics 15, 756), we tried to establish a catalogue of all reference genes for RT-PCR analysis. One-way ANOVA generated 4055 genes that did not differ across stages. To reduce this list, we used the entire RNA-seq data set and first removed genes with a FPKM (fragments per kilobase of transcript per million mapped reads) of <1, and then rescaled each gene’s expression values within a range of 0 to 1. We subsequently calculated the expression variance for each gene across all stages. By assuming that the calculated variances follow a Gaussian distribution and that the majority of the genes do not have a stable expression level, a gene was classified as a reference if its variance significantly deviated (P < 0.05) from these assumptions. We identified 346 potential reference genes, all of which were among the candidates from the ANOVA analysis. We arbitrarily assigned genes in this list to high (FPKM ≥ 100), medium (10 < FPKM < 100), and low expression levels (FPKM ≤ 10), and 37, 154, and 155 genes, respectively, fell into these groups. Surprisingly, none of the commonly used reference genes, such as GAPDH, PPIA, ACTB, PRL15, GUSB, and H3F2A, were identified as being stably expressed across in vivo development. This is consistent with findings of prior RT-PCR studies (Robert et al. 2002 Biol. Reprod. 67, 1465–1472; Ross et al. 2010 Cell Reprogram. 12, 709–717). 
The following gene ontology terms were significantly enriched for the 346 genes: cell cycle, translation, transport, chromatin, cell division, and metabolic process, indicating that the early embryos maintained constant levels of genes involved in fundamental biological functions. Finally, we performed RT-PCR to validate the RNA-seq results using different bovine in vivo-derived oocytes and embryos (n = 3/stage). We successfully validated 10 selected genes, including those in the high (CS, PGD, and ACTR3), medium (CCT5, MRPL47, COG2, CRT9, and HELLS), and low expression groups (CDC23 and TTF1). In conclusion, we recommend the use of reference genes that are expressed at comparable levels to target genes. This study offers a useful resource to aid in the appropriate selection of reference genes, which will improve the accuracy of quantitative gene expression analyses across bovine embryo pre-implantation development.
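
The filtering steps described above (drop low-FPKM genes, rescale each gene to the 0 to 1 range, compute the variance across stages, then flag genes whose variance is unusually low) can be sketched as follows. Several details are assumptions, since the abstract does not specify them: the FPKM filter is applied to the mean across stages, rescaling divides by the gene's maximum, and significance is taken as the lower tail of a normal distribution fitted to all variances. The function name and the input values are hypothetical.

```python
from statistics import NormalDist, mean, pvariance, stdev


def screen_reference_genes(fpkm, min_fpkm=1.0, alpha=0.05):
    """fpkm maps gene -> list of FPKM values, one per developmental stage."""
    variances = {}
    for gene, values in fpkm.items():
        if mean(values) < min_fpkm:  # assumption: filter on the mean FPKM
            continue
        peak = max(values)
        scaled = [v / peak for v in values]  # assumption: rescale by the maximum
        variances[gene] = pvariance(scaled)
    observed = list(variances.values())
    fitted = NormalDist(mean(observed), stdev(observed))
    # assumption: a "stable" gene sits in the lower tail of the fitted distribution
    return sorted(g for g, v in variances.items() if fitted.cdf(v) < alpha)


# Hypothetical data: nine variable genes, one flat gene, one low-expressed gene
example = {f"VAR{i}": [b, 10 * b, b, 10 * b]
           for i, b in enumerate([2, 3, 5, 7, 11, 13, 17, 19, 23])}
example["STABLE"] = [50.0, 50.0, 50.0, 50.0]
example["LOW"] = [0.1, 0.2, 0.1, 0.3]
candidates = screen_reference_genes(example)
print(candidates)
```

The lower-tail test only behaves sensibly when many genes are screened at once, as in a genome-wide RNA-seq data set.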


2012 ◽  
Vol 30 (30_suppl) ◽  
pp. 56-56
Author(s):  
Byung-In Lee ◽  
Kahuku Oades ◽  
Lien Vo ◽  
Jerry Lee ◽  
Mark Landers ◽  
...  

Background: Gene expression profiling has been shown to be effective in analyzing postoperative tumor samples in various cancers. However, in analyzing small specimens such as core biopsies, the limited amount of available material makes multi-gene analyses difficult or impossible. Microarray-based analyses also provide limited dynamic range. We describe the development of targeted RNA-sequencing methodology which combines the power of a universal RNA amplification with NGS for an ultra-deep expression analysis of multiple target genes, enabling <100 ng of sample input for multi-gene analysis in a single tube format. Methods: The gene expression patterns of triple-negative breast cancer FFPE samples were analyzed using a 96-gene breast cancer biomarker panel across three different platforms: Affymetrix Human Gene ST 1.0 microarrays, a pre-developed OncoScore qRT-PCR panel, and targeted RNA-seq. For targeted RNA-seq analysis, the 96-gene panel was amplified using a universal, single-tube “XP-PCR” amplification strategy followed by sequence analysis using the Ion-Torrent Personal Genome Machine. Results: Targeted RNA-seq provided the most sensitivity in terms of detection rates with <100 ng FFPE RNA input and provides unlimited dynamic range with increased sequencing depth. Expression ratio compression issues typically associated with a high number of pre-amplification cycles in standard multiplex-primed methods were not observed here. Low expressing genes, undetectable by qRT-PCR analysis from 1,000 ng input FFPE RNA, were detected and eligible for expression analysis with a significant number of sequencing reads. Alternative transcription/splicing analysis is also possible from sequence analysis of the target transcripts using targeted RNA-seq. Conclusions: By combining universally primed pre-amplification and NGS in multi-gene expression analysis, targeted RNA-seq provides the most sensitive gene expression analysis methodology.


2018 ◽  
Author(s):  
Cao Ai Ping ◽  
Shao Dong Nan ◽  
Cui Bai Ming ◽  
Zheng Yin Ying ◽  
Sun Jie

Analysis of gene expression levels by RNA sequencing (RNA-seq) serves a wide range of biological purposes in various species. Real-time fluorescent quantitative PCR (qRT-PCR) is used to evaluate gene expression levels and to validate transcriptomic data, and it depends on stably expressed reference genes to normalize gene expression under specific conditions. In this study, 15 candidate genes were selected from transcriptome datasets covering the initial dedifferentiation stage of somatic embryogenesis (SE) in Gossypium hirsutum L. of different SE capability. geNorm, NormFinder and BestKeeper were used to evaluate the stability of these genes. The results revealed that ENDO4 and 18S rRNA could serve as appropriate reference genes under all conditions. The stability and reliability of the reference genes were further tested by comparing qRT-PCR results with RNA-seq data and by evaluating the expression profiles of auxin-responsive protein (AUX22) and ethylene-responsive transcription factor (ERF17). In summary, the results of our study indicate the most suitable reference genes for qRT-PCR during three induction stages in four cotton species.


Author(s):  
Isaac Raplee ◽  
Alexei Evsikov ◽  
Caralina Marín de Evsikova

The rapid expansion of transcriptomics, driven by the increased affordability of next-generation sequencing (NGS) technologies, is generating rapidly growing amounts of gene expression data across biology and medicine, notably in cancer research. Concomitantly, many bioinformatics tools have been developed to streamline gene expression analysis and quantification. We tested the concordance of NGS RNA sequencing (RNA-seq) analysis outcomes between the two predominant programs for read alignment, HISAT2 and STAR, and the two most popular programs for quantifying gene expression in NGS experiments, edgeR and DESeq2, using RNA-seq data from a series of breast cancer progression specimens, which include histologically confirmed normal, early neoplasia, ductal carcinoma in situ and infiltrating ductal carcinoma samples microdissected from formalin-fixed, paraffin-embedded (FFPE) breast tissue blocks. We identified significant differences in the aligners' performance: HISAT2 was prone to misalign reads to retrogene genomic loci, whereas STAR generated more precise alignments, especially for early neoplasia samples. edgeR and DESeq2 produced similar lists of differentially expressed genes in stage comparisons, with edgeR producing more conservative, shorter lists of genes. However, Gene Ontology (GO) enrichment analysis revealed no skew in the significant GO categories identified among differentially expressed genes by edgeR vs DESeq2. As transcriptome analysis of archived FFPE samples becomes a vanguard of precision medicine, identification and fine-tuning of bioinformatics tools becomes critical for clinical research. Our results indicate that STAR and edgeR are well-suited tools for differential gene expression analysis from FFPE samples.
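
The abstract does not say how the similarity of the edgeR and DESeq2 gene lists was measured. One simple way to quantify the overlap of two lists of differentially expressed genes is the Jaccard index, sketched below. The function name and the gene identifiers are hypothetical placeholders.

```python
def jaccard(list_a, list_b):
    """Share of genes found by both tools among genes found by either."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical gene identifiers standing in for two tools' result lists
tool_one = ["GENE_A", "GENE_B", "GENE_C"]
tool_two = ["GENE_B", "GENE_C", "GENE_D"]
overlap = jaccard(tool_one, tool_two)
print(overlap)
```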


PeerJ ◽  
2015 ◽  
Vol 3 ◽  
pp. e1347 ◽  
Author(s):  
Xiaoping Niu ◽  
Jianmin Qi ◽  
Meixia Chen ◽  
Gaoyang Zhang ◽  
Aifen Tao ◽  
...  

Kenaf (Hibiscus cannabinus) is an economically and ecologically important fiber crop but suffers severe losses in fiber yield and quality under the stressful conditions of excess salinity and drought. To explore the mechanisms by which kenaf responds to excess salinity and drought, gene expression was profiled at the transcriptomic level using RNA-seq. Thus, it is crucial to have a suitable set of reference genes to normalize target gene expression in kenaf under different conditions using real-time quantitative reverse transcription-PCR (qRT-PCR). In this study, we selected 10 candidate reference genes from the kenaf transcriptome and assessed their expression stabilities by qRT-PCR in 14 NaCl- and PEG-treated samples using geNorm, NormFinder, and BestKeeper. The results indicated that TUBα and 18S rRNA were the optimum reference genes under conditions of excess salinity and drought in kenaf. Moreover, TUBα and 18S rRNA were used singly or in combination as reference genes to validate the expression levels of WRKY28 and WRKY32 in NaCl- and PEG-treated samples by qRT-PCR. The results further proved the reliability of the two selected reference genes. This work will benefit future studies on gene expression and lead to a better understanding of responses to excess salinity and drought in kenaf.
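
For readers unfamiliar with the tools named above, the following is a simplified, BestKeeper-style summary: the spread of each candidate's Ct values plus its Pearson correlation with an index built from the geometric mean of all candidates' Ct values in each sample. It is an illustrative approximation, not the BestKeeper program itself, whose exact statistics may differ. The gene labels beyond those named in the abstract and all Ct values are hypothetical.

```python
from math import sqrt
from statistics import geometric_mean, mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def bestkeeper_style(ct):
    """ct maps gene -> list of Ct values, one per sample, in the same order."""
    genes = list(ct)
    n_samples = len(ct[genes[0]])
    # Per-sample index: geometric mean of the candidates' Ct values
    index = [geometric_mean([ct[g][i] for g in genes]) for i in range(n_samples)]
    return {g: {"sd": stdev(ct[g]), "r": pearson(ct[g], index)} for g in genes}


# Hypothetical Ct values for illustration only
example_ct = {
    "TUBa": [20.1, 20.3, 20.2, 20.4],
    "18S": [10.0, 10.2, 10.1, 10.3],
    "OTHER": [25.0, 28.0, 24.0, 29.0],
}
summary = bestkeeper_style(example_ct)
print(summary)
```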


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6536 ◽  
Author(s):  
Li Miao ◽  
Xing Qin ◽  
Lihong Gao ◽  
Qing Li ◽  
Shuzhen Li ◽  
...  

Background Quantitative real-time PCR (qRT-PCR) is a commonly used high-throughput technique to measure mRNA transcript levels. The accuracy of this evaluation of gene expression depends on the use of optimal reference genes. Cucumber–pumpkin grafted plants, made by grafting a cucumber scion onto pumpkin rootstock, are superior to either parent plant, as grafting conveys many advantages. However, although many reliable reference genes have been identified in both cucumber and pumpkin, none have been obtained for cucumber–pumpkin grafted plants. Methods In this work, 12 candidate reference genes, including eight traditional genes and four novel genes identified from our transcriptome data, were selected to assess their expression stability. Their expression levels in 25 samples, including three cucumber and three pumpkin samples from different organs, and 19 cucumber–pumpkin grafted samples from different organs, conditions, and varieties, were analyzed by qRT-PCR, and the stability of their expression was assessed by the comparative ΔCt method, geNorm, NormFinder, BestKeeper, and RefFinder. Results The results showed that the most suitable reference gene varied depending on the organs, conditions, and varieties. CACS and 40SRPS8 were the most stable reference genes for all samples in our research. TIP41 and CACS showed the most stable expression in different cucumber organs, TIP41 and PP2A were the optimal reference genes in pumpkin organs, and CACS and 40SRPS8 were the most stable genes in all grafted cucumber samples. However, the optimal reference gene varied under different conditions. CACS and 40SRPS8 were the best combination of genes in different organs of cucumber–pumpkin grafted plants, TUA and RPL36Aa were the most stable in the graft union under cold stress, LEA26 and ARF showed the most stable expression in the graft union during the healing process, and TIP41 and PP2A were the most stable across different varieties of cucumber–pumpkin grafted plants.
The use of LEA26, ARF and LEA26+ARF as reference genes was further verified by analyzing the expression levels of csaCYCD3;1, csaRUL, cmoRUL, and cmoPIN in the graft union at different time points after grafting. Discussion This work is the first report of appropriate reference genes in grafted cucumber plants and provides useful information for the study of gene expression and molecular mechanisms in cucumber–pumpkin grafted plants.
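
When several algorithms rank the same candidates differently, an overall order is often obtained by combining the individual ranks. The sketch below takes the geometric mean of each gene's rank across methods, which is the general idea behind RefFinder-type comprehensive rankings. It is not the RefFinder implementation, and the per-method orderings are hypothetical rather than results from the study.

```python
from statistics import geometric_mean


def comprehensive_ranking(rankings):
    """rankings maps method -> list of genes ordered from most to least stable."""
    genes = next(iter(rankings.values()))
    score = {
        gene: geometric_mean([order.index(gene) + 1 for order in rankings.values()])
        for gene in genes
    }
    return sorted(score, key=score.get)  # lowest combined rank first


# Hypothetical per-method orderings for three of the candidate genes
example_rankings = {
    "geNorm": ["CACS", "40SRPS8", "TIP41"],
    "NormFinder": ["40SRPS8", "CACS", "TIP41"],
    "BestKeeper": ["CACS", "TIP41", "40SRPS8"],
}
overall = comprehensive_ranking(example_rankings)
print(overall)
```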

