SCReadCounts: Estimation of cell-level SNVs from scRNA-seq data

2020 ◽  
Author(s):  
NM Prashant ◽  
Nawaf Alomran ◽  
Yu Chen ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
...  

Summary SCReadCounts is a method for cell-level estimation of the sequencing read counts bearing a particular nucleotide at genomic positions of interest from barcoded scRNA-seq alignments. SCReadCounts generates an array of outputs, including cell-SNV matrices with the absolute variant-harboring read counts, as well as cell-SNV matrices with expressed Variant Allele Fraction (VAFRNA); we demonstrate its application to estimate cell-level expression of somatic mutations and RNA-editing on cancer datasets. SCReadCounts is benchmarked against GATK and Samtools and is freely available as a 64-bit self-contained binary distribution (Linux), along with MacOS and Python installation. Availability https://github.com/HorvathLab/NGS/tree/master/SCReadCounts Supplementary Information SCReadCounts_Supplementary_Data.zip

2019 ◽  
Vol 36 (5) ◽  
pp. 1351-1359 ◽  
Author(s):  
Liam F Spurr ◽  
Nawaf Alomran ◽  
Pavlos Bousounis ◽  
Dacian Reece-Stremtan ◽  
N M Prashant ◽  
...  

Abstract Motivation By testing for associations between DNA genotypes and gene expression levels, expression quantitative trait locus (eQTL) analyses have been instrumental in understanding how thousands of single nucleotide variants (SNVs) may affect gene expression. As compared to DNA genotypes, RNA genetic variation represents a phenotypic trait that reflects the actual allele content of the studied system. RNA genetic variation at expressed SNV loci can be estimated using the proportion of alleles bearing the variant nucleotide (variant allele fraction, VAFRNA). VAFRNA is a continuous measure which allows for precise allele quantitation in loci where the RNA alleles do not scale with the genotype count. We describe a method to correlate VAFRNA with gene expression and assess its ability to identify genetically regulated expression solely from RNA-sequencing (RNA-seq) datasets. Results We introduce ReQTL, an eQTL modification which substitutes the DNA allele count for the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA). We exemplify the method on sets of RNA-seq data from human tissues obtained through the Genotype-Tissue Expression (GTEx) project and demonstrate that ReQTL analyses are computationally feasible and can identify a subset of expressed eQTL loci. Availability and implementation A toolkit to perform ReQTL analyses is available at https://github.com/HorvathLab/ReQTL. Supplementary information Supplementary data are available at Bioinformatics online.
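The core idea of the abstract above, correlating VAFRNA at one SNV with the expression of one gene across samples, can be illustrated with a minimal sketch. This is not the ReQTL toolkit's code: the function name `linear_fit`, the input layout (one VAFRNA value and one expression value per sample), and the use of an ordinary least-squares fit are all assumptions made for illustration.

```python
import math


def linear_fit(vaf, expression):
    """Least-squares fit of expression on VAF_RNA across samples.

    Hypothetical helper: `vaf` and `expression` are equal-length lists with
    one value per sample. Returns (slope, intercept, pearson_r).
    """
    n = len(vaf)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired samples")
    mean_x = sum(vaf) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in vaf)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(vaf, expression))
    if sxx == 0 or syy == 0:
        # A constant VAF or constant expression carries no association signal.
        raise ValueError("no variance in one of the inputs")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Toy values, not data from the study.
    vaf_values = [0.0, 0.25, 0.5, 0.75, 1.0]
    expr_values = [1.0, 1.5, 2.0, 2.5, 3.0]
    print(linear_fit(vaf_values, expr_values))
```

Because VAFRNA is continuous between 0 and 1, it can enter the fit directly as the predictor, where a conventional eQTL analysis would use a DNA allele count of 0, 1, or 2.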


2021 ◽  
pp. 510-524
Author(s):  
Jeffrey C. Thompson ◽  
Erica L. Carpenter ◽  
Benjamin A. Silva ◽  
Jamie Rosenstein ◽  
Austin L. Chien ◽  
...  

PURPOSE Although the majority of patients with metastatic non–small-cell lung cancer (mNSCLC) lacking a detectable targetable mutation will receive pembrolizumab-based therapy in the frontline setting, predicting which patients will experience a durable clinical benefit (DCB) remains challenging. MATERIALS AND METHODS Patients with mNSCLC receiving pembrolizumab monotherapy or in combination with chemotherapy underwent a 74-gene next-generation sequencing panel on blood samples obtained at baseline and at 9 weeks. The change in circulating tumor DNA levels on-therapy (molecular response) was quantified using a ratio calculation with response defined by a > 50% decrease in mean variant allele fraction. Patient response was assessed using RECIST 1.1; DCB was defined as complete or partial response or stable disease that lasted > 6 months. Progression-free survival and overall survival were recorded. RESULTS Among 67 patients, 51 (76.1%) had > 1 variant detected at a variant allele fraction > 0.3% and thus were eligible for calculation of molecular response from paired baseline and 9-week samples. Molecular response values were significantly lower in patients with an objective radiologic response (log mean 1.25% v 27.7%, P < .001). Patients achieving a DCB had significantly lower molecular response values compared to patients with no durable benefit (log mean 3.5% v 49.4%, P < .001). Molecular responders had significantly longer progression-free survival (hazard ratio, 0.25; 95% CI, 0.13 to 0.50) and overall survival (hazard ratio, 0.27; 95% CI, 0.12 to 0.64) compared with molecular nonresponders. CONCLUSION Molecular response assessment using circulating tumor DNA may serve as a noninvasive, on-therapy predictor of response to pembrolizumab-based therapy in addition to standard of care imaging in mNSCLC. This strategy requires validation in independent prospective studies.
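The abstract above describes molecular response as a ratio calculation, with response defined by a > 50% decrease in mean variant allele fraction. A minimal sketch of one way to read that definition follows. The exact formula used in the study is not given here, so the ratio (mean on-therapy VAF divided by mean baseline VAF), the function names, and the pairing of variants between time points are assumptions.

```python
def molecular_response(baseline_vafs, week9_vafs):
    """Ratio of mean VAF at 9 weeks to mean VAF at baseline.

    Assumption: both lists cover the same variants in the same order, with a
    variant that became undetectable entered as 0.0 in `week9_vafs`.
    """
    if not baseline_vafs or len(baseline_vafs) != len(week9_vafs):
        raise ValueError("need matching, non-empty VAF lists")
    mean_baseline = sum(baseline_vafs) / len(baseline_vafs)
    if mean_baseline <= 0:
        raise ValueError("baseline mean VAF must be positive")
    mean_week9 = sum(week9_vafs) / len(week9_vafs)
    return mean_week9 / mean_baseline


def is_molecular_responder(ratio, threshold=0.5):
    """A > 50% decrease in mean VAF corresponds to a ratio below 0.5."""
    return ratio < threshold


if __name__ == "__main__":
    # Toy VAF percentages, not patient data.
    ratio = molecular_response([2.0, 4.0], [0.5, 1.0])
    print(ratio, is_molecular_responder(ratio))
```

Under this reading, a ratio of exactly 0.5 is not counted as a response, since the definition requires a decrease greater than 50%.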


2020 ◽  
Vol 20 (9) ◽  
pp. e569-e578
Author(s):  
Wasithep Limvorapitak ◽  
Jeremy Parker ◽  
Curtis Hughesman ◽  
Kelly McNeil ◽  
Lynda Foltz ◽  
...  

Author(s):  
Kerou Zhang ◽  
Luis Rodriguez ◽  
Lauren Yuxuan Cheng ◽  
Michael Wang ◽  
David Yu Zhang

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
N. M. Prashant ◽  
Nawaf Alomran ◽  
Yu Chen ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
...  

Abstract Background Recent studies have demonstrated the utility of scRNA-seq SNVs to distinguish tumor from normal cells, characterize intra-tumoral heterogeneity, and define mutation-associated expression signatures. In addition to cancer studies, SNVs from single cells have been useful in studies of transcriptional burst kinetics, allelic expression, chromosome X inactivation, ploidy estimations, and haplotype inference. Results To aid these types of studies, we have developed a tool, SCReadCounts, for cell-level tabulation of the sequencing read counts bearing SNV reference and variant alleles from barcoded scRNA-seq alignments. Provided genomic loci and expected alleles, SCReadCounts generates cell-SNV matrices with the absolute variant- and reference-harboring read counts, as well as cell-SNV matrices of expressed Variant Allele Fraction (VAFRNA) suitable for a variety of downstream applications. We demonstrate three different SCReadCounts applications on 59,884 cells from seven neuroblastoma samples: (1) estimation of cell-level expression of known somatic mutations and RNA-editing sites, (2) estimation of cell-level allele expression of biallelic SNVs, and (3) a discovery mode assessment of the reference and each of the three alternative nucleotides at genomic positions of interest that does not require prior SNV information. For the latter, we applied SCReadCounts on the coding regions of KRAS, where it identified known and novel somatic mutations in a low-to-moderate proportion of cells. The SCReadCounts read counts module is benchmarked against the analogous modules of GATK and Samtools. SCReadCounts is freely available (https://github.com/HorvathLab/NGS) as 64-bit self-contained binary distributions for Linux and MacOS, in addition to Python source. Conclusions SCReadCounts supplies a fast and efficient solution for estimation of cell-level SNV expression from scRNA-seq data.
SCReadCounts enables distinguishing cells with monoallelic reference expression from those with no gene expression and is applicable to assess SNVs present in only a small proportion of the cells, such as somatic mutations in cancer.
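The tabulation described in the abstract above can be sketched in a few lines. This is not SCReadCounts code and does not use its command-line interface or file formats: the input (a list of already-classified reads as `(cell_barcode, snv_id, allele)` tuples), the function names, and the `min_reads` parameter are hypothetical, chosen only to show how reference and variant counts per cell and SNV lead to VAFRNA.

```python
import math
from collections import defaultdict


def count_reads(observations):
    """Tabulate reference and variant read counts per (cell barcode, SNV).

    `observations` is an iterable of (cell_barcode, snv_id, allele) tuples,
    where allele is "ref" or "var". This input layout is an assumption.
    """
    counts = defaultdict(lambda: {"ref": 0, "var": 0})
    for barcode, snv_id, allele in observations:
        counts[(barcode, snv_id)][allele] += 1
    return counts


def vaf_rna(n_var, n_ref, min_reads=1):
    """Expressed variant allele fraction: n_var / (n_var + n_ref).

    Returns NaN when the locus has no reads, or fewer than `min_reads`
    (a hypothetical coverage cutoff), so that "not covered" stays distinct
    from a VAF of 0.
    """
    total = n_var + n_ref
    if total == 0 or total < min_reads:
        return math.nan
    return n_var / total


if __name__ == "__main__":
    # Toy reads, not data from the study.
    reads = [
        ("cellA", "snv1", "var"),
        ("cellA", "snv1", "ref"),
        ("cellA", "snv1", "var"),
        ("cellB", "snv1", "ref"),
    ]
    for (barcode, snv_id), c in sorted(count_reads(reads).items()):
        print(barcode, snv_id, c["ref"], c["var"], vaf_rna(c["var"], c["ref"]))
```

Keeping the reference count alongside the variant count is what allows a cell with monoallelic reference expression (VAF of 0 with reads present) to be told apart from a cell with no expression at the locus (no reads, so NaN here).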


PLoS ONE ◽  
2018 ◽  
Vol 13 (11) ◽  
pp. e0206632 ◽  
Author(s):  
David M. Vossen ◽  
Caroline V. M. Verhagen ◽  
Reidar Grénman ◽  
Roelof J. C. Kluin ◽  
Marcel Verheij ◽  
...  

2019 ◽  
Author(s):  
Justin Sein ◽  
Liam F. Spurr ◽  
Pavlos Bousounis ◽  
N M Prashant ◽  
Hongyu Liu ◽  
...  

Summary RsQTL is a tool for identification of splicing quantitative trait loci (sQTLs) from RNA-sequencing (RNA-seq) data by correlating the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA) with the proportion of molecules spanning local exon-exon junctions at loci with differential intron excision (percent spliced in, PSI). We exemplify the method on sets of RNA-seq data from human tissues obtained through the Genotype-Tissue Expression Project (GTEx). RsQTL does not require matched DNA and can identify a subset of expressed sQTL loci. Due to the dynamic nature of VAFRNA, RsQTL is applicable for the assessment of conditional and dynamic variation-splicing relationships. Availability and implementation https://github.com/HorvathLab/[email protected] or [email protected] Supplementary Information RsQTL_Supplementary_Data.zip


2019 ◽  
Author(s):  
Jing Meng ◽  
Brandon Victor ◽  
Zhen He ◽  
Agus Salim

Abstract Motivation It is of considerable interest to detect somatic mutations in paired tumor and normal sequencing data. A number of callers that are based on statistical or machine learning approaches have been developed to detect somatic small variants. However, they take into consideration only limited information about the reference and potential variant allele in both samples at a candidate somatic site. Also, they differ in how biological and technological noise is addressed. Hence, they are expected to produce divergent outputs. Results To overcome the drawbacks of existing somatic callers, we develop a deep learning-based tool called DeepSSV, which employs a convolutional neural network (CNN) model to learn increasingly abstract feature representations from the raw data in higher feature layers. DeepSSV creates a spatially-oriented representation of read alignments around the candidate somatic sites adapted for the convolutional architecture, which enables it to expand to effectively gather scattered evidence. Moreover, DeepSSV incorporates the mapping information of both reference-allele-supporting and variant-allele-supporting reads in the tumor and normal samples at a genomic site that are readily available in the pileup format file. Together, the CNN model can process the whole alignment information. Such representational richness allows the model to capture the dependencies in the sequence and identify context-based sequencing artifacts, and alleviates the need for post-call filters that heavily depend on prior knowledge. We fitted the model on ground truth somatic mutations, and did benchmarking experiments on simulated and real tumors. The benchmarking results demonstrate that DeepSSV outperforms its state-of-the-art competitors in overall F1 score. Availability and Implementation https://github.com/jingmeng-bioinformatics/[email protected] Supplementary information Supplementary data are available online.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 4286-4286
Author(s):  
Wasithep Limvorapitak ◽  
Jeremy Parker ◽  
Lynda Foltz ◽  
Aly Karsan

Abstract Introduction: JAK2V617F mutation is one of the major criteria in the diagnosis of myeloproliferative neoplasms (MPN). The disease phenotype and outcomes are dependent on variant allele fraction (VAF) of JAK2V617F. Recently, a new entity termed clonal hematopoiesis of indeterminate potential (CHIP) defines patients with normal cell counts and VAF of at least 2%. Data on outcomes of patients with VAF <2% are scarce, and we aimed to retrospectively study characteristics and outcomes of patients with JAK2V617F VAF < 2% compared to patients with VAF 2-10%. Methods: The study population included all patients in the province of British Columbia with JAK2V617F testing performed during 2010-2015. We compared the patient characteristics, disease phenotypes, overall survival (OS), thrombosis-free survival (TFS) and cumulative incidence of thrombotic events between patients with VAF <2% and 2-10%. Parallel real-time quantitative polymerase chain reaction (RQ-PCR) for wild type JAK2 and JAK2V617F was used as the detection method. MPN diagnoses were based on the treating physicians' assessment. Results: We identified 216 patients with JAK2V617F VAF < 10%. Twenty-seven patients were excluded due to missing follow-up data. A total of 189 patients were included for final analysis (89 patients with VAF <2% and 100 patients with VAF 2-10%). Patient characteristics, diagnoses and outcomes are shown in the Table. Patients with JAK2V617F VAF >2% had a significantly higher rate of splenomegaly, higher platelet counts and more MPN diagnoses. Ten patients (10.0%) with VAF 2-10% had no hematologic diagnoses, consistent with CHIP, while 24 patients (27.0%) with VAF <2% had no hematologic diagnoses. There were no differences in any of the outcomes measured, including thrombotic complications, progression to hematologic or solid cancers and death. The median follow-up time for the whole cohort was 5.2 years with interquartile range (IQR) 3.5-6.6 years.
The 5-year OS was 81.0% for VAF < 2% and 81.7% for VAF 2-10%, log-rank P = 0.922. TFS at 5 years was 71.2% and 69.5%, respectively (P = 0.982). The 5-year cumulative incidences of thrombotic complications (considering death as a competing event) were 8.8% and 11.3%, respectively (Pepe-Mori P = 0.574). Further analysis by clinical diagnoses classified patients into polycythemia vera (PV) 40 (21.2%), essential thrombocythemia (ET) 99 (52.3%), primary myelofibrosis or MPN, NOS (PMF/MPN) 16 (8.5%) and clonal hematopoiesis of indeterminate potential (CHIP) or no hematologic diagnosis 34 (18.0%). Patients with PMF/MPN were significantly older than patients with other diagnoses (median age PV 64.2, ET 64.3, PMF/MPN 80.7 and CHIP 54.1 years, P=0.019). The 5-year OS was: PV 91.4%, ET 90.0%, PMF/MPN 31.3% and CHIP/no hematologic diagnoses 58.7%, P<0.001. TFS at 5 years was 83.1%, 74.7%, 25.0% and 57.4%, respectively, P<0.001. Conclusion: Patients with JAK2V617F VAF < 2% have less splenomegaly and are less likely to have a diagnosis of MPN compared to patients with VAF 2-10%. However, the incidence of thrombotic events was similar between patients with VAF < 2% and 2-10%. In the combined VAF < 10% cohort, PMF/MPN patients were older and had the worst survival outcomes. The mortality in this PMF/MPN group was mostly unrelated to MPN diagnoses. Interestingly, patients with CHIP/no hematologic diagnoses in this study had the next worst OS and TFS. This could be explained by selection bias for performing JAK2 testing in acute or chronically ill patients with reactive changes in the peripheral blood. Table. Disclosures Foltz: Gilead: Research Funding; Novartis: Consultancy, Honoraria, Research Funding; Promedior: Research Funding; Incyte: Research Funding.

