Diagnosis of T-cell–mediated kidney rejection in formalin-fixed, paraffin-embedded tissues using RNA-Seq–based machine learning algorithms

2019 ◽  
Vol 84 ◽  
pp. 283-290 ◽  
Author(s):  
Peng Liu ◽  
George Tseng ◽  
Zijie Wang ◽  
Yuchen Huang ◽  
Parmjeet Randhawa
2019 ◽  
Author(s):  
Christopher A. Hilker ◽  
Aditya V. Bhagwate ◽  
Jin Sung Jang ◽  
Jeffrey G Meyer ◽  
Asha A. Nair ◽  
...  

Abstract Formalin-fixed, paraffin-embedded (FFPE) tissues are commonly used biospecimens for clinical diagnosis. However, RNA isolated from FFPE blocks is extensively degraded, which makes whole-transcriptome profiling (RNA-seq) challenging. Here, we examined RNA isolation methods, quality metrics, and the performance of RNA-seq using different approaches with RNA isolated from FFPE and fresh frozen (FF) tissues. We evaluated FFPE RNA extraction methods using six different tissues and five different methods. The reproducibility and quality of the libraries prepared from these RNAs were assessed by RNA-seq. We next examined the performance and reproducibility of RNA-seq for gene expression profiling with FFPE and FF samples using targeted (Kinome capture) and whole-transcriptome capture-based sequencing. Finally, we assessed the Agilent SureSelect All-Exon V6+UTR capture and the Illumina TruSeq RNA Access protocols for their ability to detect known gene fusions in FFPE RNA samples. Although the overall yield of RNA varied among extraction methods, gene expression profiles generated by RNA-seq were highly correlated (>90%) when the input RNA was of sufficient quality (DV200 ≥ 30%) and quantity (≥ 100 ng). Using gene capture, we observed a linear relationship between gene expression levels for shared genes that were captured using either the All-Exon or the Kinome kit. Gene expression correlations between the two capture-based approaches were similar using RNA from FFPE and FF samples. However, the TruSeq RNA Access protocol provided significantly more exon and junction reads than the SureSelect All-Exon capture kit and was more sensitive for fusion gene detection. Our study established pre- and post-library-construction QC parameters that are essential to reproducible RNA-seq profiling using FFPE samples. We show that gene-capture-based NGS is an efficient and highly reproducible strategy for gene expression measurements as well as fusion gene detection.
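
The input thresholds quoted in this abstract (DV200 of at least 30% and at least 100 ng of RNA) can be read as a simple pre-library filter, followed by a correlation check between libraries. The sketch below is only an illustration under stated assumptions: the sample records, the expression values, and the function names are invented, and the abstract does not say which correlation statistic was used, so Pearson correlation is an assumption here.

```python
from math import sqrt

# Thresholds quoted in the abstract; everything else below is hypothetical.
MIN_DV200_PERCENT = 30.0
MIN_INPUT_NG = 100.0


def passes_qc(dv200_percent, input_ng):
    """Return True when a sample meets both pre-library QC thresholds."""
    return dv200_percent >= MIN_DV200_PERCENT and input_ng >= MIN_INPUT_NG


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented samples: (name, DV200 in percent, input RNA in ng).
samples = [
    ("ffpe_a", 45.0, 150.0),
    ("ffpe_b", 22.0, 200.0),  # fails on DV200
    ("ffpe_c", 38.0, 60.0),   # fails on input quantity
]
kept = [name for name, dv200, ng in samples if passes_qc(dv200, ng)]

# Invented log2 expression values for the same genes in two libraries.
rep1 = [2.1, 5.3, 7.8, 1.0, 9.4]
rep2 = [2.4, 5.0, 8.1, 1.3, 9.0]
r = pearson(rep1, rep2)
```

Here `kept` holds only the samples that clear both thresholds, and `r` is the kind of library-to-library correlation that the abstract summarizes as ">90%".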


2021 ◽  
Author(s):  
Melih Agraz ◽  
Umut Agyuz ◽  
E. Celeste Welch ◽  
Kaymaz Yasin ◽  
Kuyumcu Birol

Abstract Background Metastasis is one of the most challenging problems in cancer diagnosis and treatment, as its causes have not yet been well characterized. Prediction of the metastatic status of breast cancer is important in cancer research because it has the potential to save lives. However, the systems biology behind metastasis is complex and driven by a variety of factors beyond those that have already been characterized for various cancer types. Furthermore, prediction of cancer metastasis is a challenging task due to the variation in parameters and conditions specific to individual patients and mutation of the sub-types. Results In this paper, we apply tree-based machine learning algorithms to gene expression data to estimate metastatic potential within a group of 490 breast cancer patients. Specifically, we use decision trees, gradient boosting, and extremely randomized trees to assess variable importance. Conclusions All three algorithms were highly accurate; the gradient boosting method achieved the highest accuracy, 0.8901. Finally, we were able to determine the 10 most important genetic variables used in the boosted algorithms, as well as their respective importance scores and biological importance. The genes found to be important across our algorithms are CD8, PB1, and THP-1. CD8, also known as CD8A, is a receptor for the TCR, or T-cell receptor, which facilitates cytotoxic T-cell activity; its association with cancer is defined in the paper. PB1, also known as PBRM1 or polybromo 1, is a tumor suppressor gene. THP-1, or GLI2, is a zinc finger protein referred to as "Glioma-Associated Oncogene Family Zinc Finger 2". This gene encodes a zinc finger protein that binds DNA and mediates Sonic hedgehog (SHH) signaling. Disruptions in the SHH pathway have long been associated with cancer and cellular proliferation.
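
Tree-based variable importance rests on how much a split on a given gene reduces class impurity. The sketch below is a deliberately reduced illustration, not the authors' pipeline: it scores each gene by the best Gini impurity decrease from a single threshold split, whereas a full decision tree, gradient boosting, or extremely randomized trees would accumulate such decreases over many splits and trees. The gene names, expression values, and labels are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from one threshold split on a feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    # Try every observed value except the largest as a "<= threshold" cut.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(table, labels):
    """Sort feature names by their best single-split gain, highest first."""
    gains = {name: best_split_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Invented toy data: 6 patients, 1 = metastatic, 0 = non-metastatic.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [8.2, 7.9, 8.5, 3.1, 2.8, 3.4],  # separates the classes cleanly
    "gene_b": [5.0, 2.0, 4.1, 4.9, 2.2, 4.0],  # largely uninformative
}
ranking = rank_features(expression, labels)
```

In this toy case `ranking` lists `gene_a` first because one threshold separates the two classes completely, which is the intuition behind the importance scores reported for the top-ranked genes.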


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Su Bin Lim ◽  
Swee Jin Tan ◽  
Wan-Teck Lim ◽  
Chwee Teck Lim

Abstract Massive numbers of transcriptome profiles exist in the form of microarray data. The challenge is that they are processed using diverse platforms and preprocessing tools, so cross-dataset analyses require considerable time and informatics expertise. A single, integrated data source would facilitate data reuse for the discovery, analysis, and validation of biomarker-based clinical strategies. Here, we present merged microarray-acquired datasets (MMDs) across 11 major cancer types, curating 8,386 patient-derived tumor and tumor-free samples from 95 GEO datasets. Using machine learning algorithms, we show that diagnostic models trained from MMDs can be directly applied to RNA-seq-acquired TCGA data with high classification accuracy. Machine-learning-optimized MMDs further help reveal the immune landscape across various carcinomas, which is critically needed in disease management and clinical interventions. This unified data source may serve as an excellent training or test set to apply, develop, and refine machine learning algorithms that can be tapped to better define the genomic landscape of human cancers.


2016 ◽  
Vol 154 (2) ◽  
pp. 202-213 ◽  
Author(s):  
Susan D. Hester ◽  
Virunya Bhat ◽  
Brian N. Chorley ◽  
Gleta Carswell ◽  
Wendell Jones ◽  
...  

2010 ◽  
Vol 11 (Suppl 1) ◽  
pp. P31 ◽  
Author(s):  
Kunbin Qu ◽  
John Morlan ◽  
Jim Stephans ◽  
Xitong Li ◽  
Joffre Baker ◽  
...  
