Identification of Somatic Mutations From Bulk and Single-Cell Sequencing Data

2022 ◽  
Vol 2 ◽  
Author(s):  
August Yue Huang ◽  
Eunjung Alice Lee

Somatic mutations are DNA variants that occur after the fertilization of zygotes and accumulate during the developmental and aging processes in the human lifespan. Somatic mutations have long been known to cause cancer, and more recently have been implicated in a variety of non-cancer diseases. The patterns of somatic mutations, or mutational signatures, also shed light on the underlying mechanisms of the mutational process. Advances in next-generation sequencing over the decades have enabled genome-wide profiling of DNA variants in a high-throughput manner; however, unlike germline mutations, somatic mutations are carried only by a subset of the cell population. Thus, sensitive bioinformatic methods are required to distinguish mutant alleles from sequencing and base calling errors in bulk tissue samples. An alternative way to study somatic mutations, especially those present in an extremely small number of cells or even in a single cell, is to sequence single-cell genomes after whole-genome amplification (WGA); however, it is critical and technically challenging to exclude numerous technical artifacts arising during error-prone and uneven genome amplification in current WGA methods. To address these challenges, multiple bioinformatic tools have been developed. In this review, we summarize the latest progress in methods for identification of somatic mutations and the challenges that remain to be addressed in the future.
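The need to separate low-fraction mutant alleles from sequencing and base-calling errors in bulk data can be illustrated with a simple binomial test. This is only a minimal sketch, not the method of any tool covered by the review: the function names, the per-base error rate of 0.001, and the significance threshold are all illustrative assumptions.

```python
from math import comb


def binomial_tail(alt_reads: int, depth: int, error_rate: float) -> float:
    """Return P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    This is the probability of seeing at least `alt_reads` mismatching
    reads at a site if every mismatch were an independent sequencing error.
    """
    return sum(
        comb(depth, k) * error_rate**k * (1 - error_rate) ** (depth - k)
        for k in range(alt_reads, depth + 1)
    )


def is_candidate_somatic(alt_reads: int, depth: int,
                         error_rate: float = 0.001,  # assumed per-base error rate
                         alpha: float = 1e-6) -> bool:  # assumed threshold
    """Flag a site when errors alone are an unlikely explanation."""
    return binomial_tail(alt_reads, depth, error_rate) < alpha


if __name__ == "__main__":
    # One mismatching read in 100 is easily explained by error; five are not.
    print(is_candidate_somatic(1, 100))
    print(is_candidate_somatic(5, 100))
```

Practical callers model far more than this (strand bias, mapping quality, matched controls, amplification artifacts), so the sketch only shows why read depth and error rate bound the detectable allele fraction.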

Author(s):  
VG LeBlanc ◽  
D Trinh ◽  
M Hughes ◽  
I Luthra ◽  
D Livingstone ◽  
...  

Glioblastomas (GBMs) account for nearly half of all primary malignant brain tumours, and current therapies are often only marginally effective. Our understanding of the underlying biology of these tumours and the development of new therapies have been complicated in part by widespread inter- and intratumoural heterogeneity. To characterize this heterogeneity, we performed regional subsampling of primary glioblastomas and derived organoids from these tissue samples. We then performed single-cell RNA-sequencing (scRNA-seq) on these primary regional subsamples and 1-3 matched organoids per sample. We have profiled samples from six tumour sets to date and have obtained sequencing data for 21,234 primary tissue cells and 14,742 organoid cells. While the most apparent differences in gene expression appear to be between individual tumours, we were also able to identify similar cellular subpopulations across tissue samples and across organoids. Importantly, organoids derived from the same tissue sample appeared to be composed of similar cellular subpopulations and were highly comparable to each other, indicating that replicate organoids faithfully represent the original tumour tissue. Overall, our scRNA-seq approach will help evaluate the utility of tumour-derived organoids as model systems for GBM and will aid in identifying cellular subpopulations defined by gene expression patterns, both in primary GBM regional subsamples and their associated organoids. These analyses will allow for the characterization of clonal or subclonal populations that are likely to respond to different therapeutic approaches and may also uncover novel therapeutic targets previously unrevealed through bulk analyses.


2021 ◽  
Author(s):  
Daniel Osorio ◽  
Marieke Lydia Kuijjer ◽  
James J. Cai

Motivation: Characterizing cells with rare molecular phenotypes is one of the promises of high-throughput single-cell RNA sequencing (scRNA-seq) techniques. However, collecting enough cells with the desired molecular phenotype in a single experiment is challenging and requires several sample preprocessing steps to filter and collect the desired cells experimentally before sequencing. Integrating data from multiple public single-cell experiments offers a solution to this problem, allowing the collection of enough cells exhibiting the desired molecular signatures. By increasing the sample size of the desired cell type, this approach enables robust characterization of the cell type's transcriptome. Results: Here, we introduce rPanglaoDB, an R package to download and merge the uniformly processed and annotated scRNA-seq data provided by the PanglaoDB database. To show the potential of rPanglaoDB for collecting rare cell types by integrating multiple public datasets, we present a biological application collecting and characterizing a set of 157 fibrocytes. Fibrocytes are a rare monocyte-derived cell type that exhibits both the inflammatory features of macrophages and the tissue remodeling properties of fibroblasts. This constitutes the first report of an unbiased transcriptome profile of fibrocytes. We compared the transcriptomic profile of the fibrocytes against that of fibroblasts collected from the same tissue samples and confirmed their association with healing processes in tissue damage and infection through the activation of the prostaglandin biosynthesis and regulation pathway. Availability and Implementation: rPanglaoDB is implemented as an R package available through the CRAN repositories https://CRAN.R-project.org/package=rPanglaoDB.


2019 ◽  
Author(s):  
Lei Zhang ◽  
Xiao Dong ◽  
Moonsook Lee ◽  
Alexander Y. Maslov ◽  
Tao Wang ◽  
...  

Introductory paragraph: The accumulation of mutations in somatic cells has been implicated as a cause of ageing since the 1950s [1,2]. Yet, attempts to establish a causal relationship between somatic mutations and ageing have been constrained by the lack of methods to directly identify mutational events in primary human tissues. Here we provide detailed, genome-wide mutation frequencies and spectra of human B lymphocytes from healthy individuals across the entire human lifespan, from newborns to centenarians, using a recently developed, highly accurate single-cell whole-genome sequencing method [3]. We found that the number of somatic mutations increases from <500 per cell in newborns to >3,000 per cell in centenarians. We discovered mutational hotspot regions, some of which, as expected, were located at immunoglobulin genes associated with somatic hypermutation (SHM). B cell-specific mutation signatures associated with development, ageing or SHM were observed. The SHM signature strongly correlated with the signature found in human chronic lymphocytic leukemia and malignant B-cell lymphomas [4], indicating that even in B cells of healthy individuals the potential cancer-causing events are already present. We also identified multiple mutations in sequence features relevant to cellular function, i.e., transcribed genes and gene regulatory regions. Such mutations increased significantly during ageing, but only at approximately half the rate of the genome average, indicating selection against mutations that impact B cell function. This first full characterization of the landscape of somatic mutations in human B lymphocytes indicates that spontaneous somatic mutations accumulating with age can be deleterious and may contribute to both the increased risk for leukemia and the functional decline of B lymphocytes in the elderly.


2019 ◽  
Vol 116 (18) ◽  
pp. 9014-9019 ◽  
Author(s):  
Lei Zhang ◽  
Xiao Dong ◽  
Moonsook Lee ◽  
Alexander Y. Maslov ◽  
Tao Wang ◽  
...  

Accumulation of mutations in somatic cells has been implicated as a cause of aging since the 1950s. However, attempts to establish a causal relationship between somatic mutations and aging have been constrained by the lack of methods to directly identify mutational events in primary human tissues. Here we provide genome-wide mutation frequencies and spectra of human B lymphocytes from healthy individuals across the entire human lifespan using a highly accurate single-cell whole-genome sequencing method. We found that the number of somatic mutations increases from <500 per cell in newborns to >3,000 per cell in centenarians. We discovered mutational hotspot regions, some of which, as expected, were located at Ig genes associated with somatic hypermutation (SHM). B cell–specific mutation signatures associated with development, aging, or SHM were found. The SHM signature strongly correlated with the signature found in human B cell tumors, indicating that potential cancer-causing events are already present even in B cells of healthy individuals. We also identified multiple mutations in sequence features relevant to cellular function (i.e., transcribed genes and gene regulatory regions). Such mutations increased significantly during aging, but only at approximately one-half the rate of the genome average, indicating selection against mutations that impact B cell function. This full characterization of the landscape of somatic mutations in human B lymphocytes indicates that spontaneous somatic mutations accumulating with age can be deleterious and may contribute to both the increased risk for leukemia and the functional decline of B lymphocytes in the elderly.


2018 ◽  
Author(s):  
Nuria Estévez-Gómez ◽  
Tamara Prieto ◽  
Amy Guillaumet-Adkins ◽  
Holger Heyn ◽  
Sonia Prado-López ◽  
...  

Single-cell genomics is an alluring area that holds the potential to change the way we understand cell populations. Due to the small amount of DNA within a single cell, whole-genome amplification becomes a mandatory step in many single-cell applications. Unfortunately, single-cell whole-genome amplification (scWGA) strategies suffer from several technical biases that complicate the subsequent interpretation of the data. Here we compared the performance of six different scWGA methods (GenomiPhi, REPLIg, TruePrime, Ampli1, MALBAC, and PicoPLEX) after amplifying and low-pass sequencing the complete genome of 230 healthy/tumoral human cells. Overall, REPLIg outperformed competing methods regarding DNA yield, amplicon size, amplification breadth, amplification uniformity (being the only method with a random amplification bias), and false single-nucleotide variant calls. On the other hand, non-MDA methods, and in particular Ampli1, showed less allelic imbalance and allelic dropout (ADO), more reliable copy-number profiles, and fewer chimeric amplicons. While no single scWGA method showed optimal performance for every aspect, they clearly have distinct advantages. Our results provide a convenient guide for selecting a scWGA method depending on the question of interest while revealing relevant weaknesses that should be considered during the analysis and interpretation of single-cell sequencing data.


BioTechniques ◽  
2021 ◽  
Author(s):  
James M Dominguez ◽  
Sharon M Moe ◽  
Neal X Chen ◽  
Todd O McKinley ◽  
Krista M Brown ◽  
...  

The ability to study the bone microenvironment of failed fracture healing may lead to biomarkers for fracture nonunion. Herein the authors describe a technique for isolating individual cells suitable for single-cell RNA sequencing analyses from intramedullary canal tissue collected by reaming during surgery. The purpose was to detail challenges and solutions inherent to the collection and processing of intramedullary canal tissue samples. The authors then examined single-cell RNA sequencing data from fresh and reanimated samples to demonstrate the feasibility of this approach for prospective studies.


2017 ◽  
Vol 242 (13) ◽  
pp. 1318-1324 ◽  
Author(s):  
Jan Vijg ◽  
Xiao Dong ◽  
Lei Zhang

Postzygotic mutations in somatic cells lead to genome mosaicism and can cause cancer and possibly other human diseases and aging. Somatic mutations are difficult to detect in bulk tissue samples. Here, we review the available assays for measuring somatic mutations, with a focus on recent single-cell, whole genome sequencing methods. Impact statement: Somatic mutations cause cancer and possibly other diseases and aging. Yet, very little is known about the frequency of such mutations in vivo, their distribution across the genome, and their possible functional consequences other than cancer. Even in cancer, we do not know the heterogeneity of mutations within a tumor or whether seemingly normal cells in its surroundings already have elevated mutation frequencies. Here, we review a new, whole genome amplification system that allows accurate quantification and characterization of single-cell mutational landscapes in human cells and tissues in relation to disease.


Author(s):  
Jingyi Jessica Li

Single-cell RNA sequencing (scRNA-seq) is a burgeoning field where experimental techniques and computational methods have been under rapid evolution in the past six years. These technological advances have allowed biomedical researchers to identify new cell types, delineate cell sub-populations, and infer cell differentiation trajectories in various tissue samples. Among the important features extractable from scRNA-seq data, the predominant ones are individual genes' expression levels in single cells. Most analyses require a preprocessing step that converts a scRNA-seq dataset into a count matrix, where rows correspond to cells (or genes), columns correspond to genes (or cells), and each entry is a count, i.e., the number of sequenced reads or unique molecular identifiers (UMIs) mapped to a gene in a cell. Single-cell count matrices are highly sparse; for example, a typical matrix constructed from a droplet-based dataset may have >90% of counts as zeros.
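The count-matrix construction and sparsity described above can be sketched in a few lines. This is an illustrative toy, not the preprocessing pipeline of any particular tool: the record layout `(cell, gene, umi)`, the function names, and the example barcodes are assumptions made for the sketch.

```python
from collections import defaultdict


def build_count_matrix(records, cells, genes):
    """Build a cells-by-genes matrix of UMI counts.

    `records` is an iterable of (cell, gene, umi) tuples (hypothetical layout).
    Reads sharing the same cell, gene, and UMI are counted once.
    """
    seen = defaultdict(set)
    for cell, gene, umi in records:
        seen[(cell, gene)].add(umi)
    return [[len(seen.get((c, g), ())) for g in genes] for c in cells]


def zero_fraction(matrix):
    """Fraction of entries in the matrix that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


if __name__ == "__main__":
    records = [
        ("c1", "G1", "AAA"),
        ("c1", "G1", "AAA"),  # duplicate read of the same molecule
        ("c1", "G1", "AAT"),
        ("c2", "G3", "CCC"),
    ]
    matrix = build_count_matrix(records, ["c1", "c2", "c3"], ["G1", "G2", "G3", "G4"])
    print(matrix)
    print(zero_fraction(matrix))
```

At realistic scale such matrices are normally held in a sparse format rather than nested lists, precisely because most entries are zero.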


2021 ◽  
Author(s):  
Tianyun Zhang ◽  
Ning Shen

Identifying expressed somatic mutations directly from single-cell RNA sequencing (scRNA-seq) data is challenging but highly valuable. Computational methods have been attempted, but none has been reported to identify somatic mutations with high fidelity. We present RESA (Recurrently Expressed SNV Analysis), a computational framework that identifies expressed somatic mutations from scRNA-seq data with high precision. We test RESA in multiple cancer cell line datasets, where RESA demonstrates an average area under the curve (AUC) of 0.9 on independently held-out test sets and achieves an average precision of 0.71 when evaluated against bulk whole-exome data, which is substantially higher than previous approaches. In addition, RESA detects a median of 201 mutations per cell, 50 times more than what was reported in experimental technologies with simultaneous expression and mutation profiling. Furthermore, applying RESA to scRNA-seq from a melanoma patient, we demonstrate that RESA recovers the known BRAF driver mutation of the sample and melanoma-dominating mutational signatures, identifies mutation-associated expression signatures, reveals nondriver-perturbed and stage-specific cancer hallmarks, and unveils the complex relationship between genomic and transcriptomic intratumor heterogeneity. Therefore, RESA could provide novel views in the study of intratumor heterogeneity and relate genetic alterations to transcriptional changes at the single-cell level.

