SCINA: Semi-Supervised Analysis of Single Cells in Silico

Genes ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 531 ◽  
Author(s):  
Zhang ◽  
Luo ◽  
Zhong ◽  
Choi ◽  
Ma ◽  
...  

Advances in single-cell RNA sequencing (scRNA-Seq) have allowed for comprehensive analyses of single cell data. However, current analyses of scRNA-Seq data usually start from unsupervised clustering or visualization. These methods ignore the prior knowledge of transcriptomes and of the probable structures of the data. Moreover, cell identification heavily relies on subjective and inaccurate human inspection afterwards. To address these analytical challenges, we developed the Semi-supervised Category Identification and Assignment (SCINA) algorithm, a semi-supervised model, for analyses of scRNA-Seq and flow cytometry/CyTOF data, and other data of similar format, by automatically exploiting previously established gene signatures using an expectation–maximization (EM) algorithm. We applied SCINA to a wide range of datasets, and showed that its accuracy, stability and efficiency exceeded those of most popular unsupervised approaches. SCINA discovered an intermediate stage of oligodendrocytes from mouse brain scRNA-Seq data. SCINA also detected immune cell population shifting in Stk4 knock-out mouse cytometry data. Finally, SCINA identified a new kidney tumor clade with similarity to FH-deficient tumors from bulk tumor data. Overall, SCINA provides both methodological advances and biological insights from perspectives different from traditional analytical methods.
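The idea of combining a known gene signature with an EM algorithm can be illustrated with a much-simplified sketch. This is not the SCINA implementation: it assumes a single signature, collapses the signature genes into one mean score per cell, and fits a two-component one-dimensional Gaussian mixture whose "high" component is initialized at the largest score (encoding the prior belief that signature genes are highly expressed in the target cell type). The gene names, function names and data below are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def signature_scores(cells, signature):
    """Mean expression of the signature genes in each cell (missing genes count as 0)."""
    return [sum(cell.get(g, 0.0) for g in signature) / len(signature) for cell in cells]


def weighted_sd(values, weights, mean, total, min_sd):
    """Weighted standard deviation, floored at min_sd to avoid a collapsed component."""
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return max(math.sqrt(var), min_sd)


def em_two_component(scores, n_iter=50, min_sd=1e-3):
    """Fit a two-component Gaussian mixture by EM; return P(high component) per score."""
    lo, hi = min(scores), max(scores)
    mean_low, mean_high = lo, hi  # prior knowledge: signature is high in the target type
    sd_low = sd_high = max((hi - lo) / 4.0, min_sd)
    weight_high = 0.5
    post = [0.5] * len(scores)
    for _ in range(n_iter):
        # E-step: posterior probability that each cell belongs to the high component
        post = []
        for x in scores:
            p_high = weight_high * normal_pdf(x, mean_high, sd_high)
            p_low = (1.0 - weight_high) * normal_pdf(x, mean_low, sd_low)
            total = p_high + p_low
            post.append(p_high / total if total > 0 else 0.5)
        # M-step: re-estimate means, standard deviations and the mixing weight
        n_high = sum(post)
        n_low = len(scores) - n_high
        if n_high < 1e-9 or n_low < 1e-9:
            break
        inv = [1.0 - p for p in post]
        mean_high = sum(p * x for p, x in zip(post, scores)) / n_high
        mean_low = sum(q * x for q, x in zip(inv, scores)) / n_low
        sd_high = weighted_sd(scores, post, mean_high, n_high, min_sd)
        sd_low = weighted_sd(scores, inv, mean_low, n_low, min_sd)
        weight_high = n_high / len(scores)
    return post


def assign_cells(cells, signature, threshold=0.5):
    """True for cells assigned to the signature's cell type, False otherwise."""
    return [p > threshold for p in em_two_component(signature_scores(cells, signature))]


# Hypothetical toy data: three low-expressing and three high-expressing cells.
toy_cells = [
    {"GeneA": 0.1, "GeneB": 0.3},
    {"GeneA": 0.0, "GeneB": 0.2},
    {"GeneA": 0.4, "GeneB": 0.0},
    {"GeneA": 5.0, "GeneB": 4.6},
    {"GeneA": 5.4, "GeneB": 5.0},
    {"GeneA": 4.9, "GeneB": 5.1},
]
toy_labels = assign_cells(toy_cells, ["GeneA", "GeneB"])
print(toy_labels)
```

A full semi-supervised model would handle several signatures at once and leave room for cells that match none of them; the sketch only shows how prior signature knowledge can seed the EM iterations.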

2019 ◽  
Author(s):  
Ze Zhang ◽  
M.S. Danni Luo ◽  
Xue Zhong ◽  
Jin Huk Choi ◽  
Yuanqing Ma ◽  
...  

ABSTRACT Advances in single-cell RNA sequencing (scRNA-Seq) have allowed for comprehensive analyses of single cell data. However, current analyses of scRNA-Seq data usually start from unsupervised clustering or visualization. These methods ignore the prior knowledge of transcriptomes and of the probable structures of the data. Moreover, cell identification heavily relies on subjective and inaccurate human inspection afterwards. We reversed this paradigm and developed SCINA, a semi-supervised model, for analyses of scRNA-Seq and flow cytometry/CyTOF data, and other data of similar format, by automatically exploiting previously established gene signatures using an expectation-maximization (EM) algorithm. We applied SCINA on a wide range of datasets, and showed its accuracy, stableness and efficiency exceeded most popular unsupervised approaches. Notably, SCINA discovered an intermediate stage of oligodendrocyte from mouse brain scRNA-Seq data. SCINA also detected immune cell population shifting in Stk4 knock-out mouse cytometry data. Finally, SCINA identified a new kidney tumor clade with similarity to FH-deficient tumors from bulk tumor data. Overall, SCINA provides both methodological advances and biological insights from perspectives different from traditional analytical methods.


2020 ◽  
Author(s):  
Travis S. Johnson ◽  
Christina Y. Yu ◽  
Zhi Huang ◽  
Siwen Xu ◽  
Tongxin Wang ◽  
...  

Abstract With the rapid advance of single cell sequencing techniques, single cell molecular data are accumulating quickly. However, a sound approach for properly integrating single cell data with the large existing body of patient-level disease data is lacking. To address this need, we propose DEGAS (Diagnostic Evidence GAuge of Single cells), a novel deep transfer-learning framework which allows cellular and clinical information, including cell types, disease risk, and patient subtypes, to be cross-mapped between single cell and patient data, provided they share at least one common type of molecular data. We call such transferrable information “impressions”, which are generated by the deep learning models learned in the DEGAS framework. Using eight datasets from a wide range of diseases including Glioblastoma Multiforme (GBM), Alzheimer’s Disease (AD), and Multiple Myeloma (MM), we demonstrate the feasibility and broad applications of DEGAS in cross-mapping clinical and cellular information across disparate single cell and patient level transcriptomic datasets. Specifically, we correctly mapped clinically known GBM patient subtypes onto single cell data. We also identified previously known neuron loss from AD brains, then mapped the “impression” of AD risk to single cell data. Furthermore, we discovered novel differences in excitatory and inhibitory neuron loss in AD data. From the exploratory MM data, we identified differences in the malignancy of different CD138+ cellular subtypes based on “impressions” of relapse information transferred from MM patients. Through this work, we demonstrated that DEGAS is a powerful framework to cross-infer cellular and patient-level characteristics, which not only unites single cell and patient level transcriptomic data by identifying their latent links using the deep learning approach, but can also prioritize both patient subtypes and cellular subtypes for precision medicine.


2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A520-A520
Author(s):  
Son Pham ◽  
Tri Le ◽  
Tan Phan ◽  
Minh Pham ◽  
Huy Nguyen ◽  
...  

Background: Single-cell sequencing technology has opened an unprecedented ability to interrogate cancer. It reveals significant insights into intratumoral heterogeneity, metastasis, and therapeutic resistance, which facilitates target discovery and validation in cancer treatment. With rapid advancements in throughput and strategies, a particular immuno-oncology study can produce multi-omics profiles for several thousands of individual cells. This overflow of single-cell data poses formidable challenges, including standardizing data formats across studies, performing reanalysis for individual datasets and meta-analysis. Methods: N/A. Results: We present BioTuring Browser, an interactive platform for accessing and reanalyzing published single-cell omics data. The platform is currently hosting a curated database of more than 10 million cells from 247 projects, covering more than 120 immune cell types and subtypes, and 15 different cancer types. All data are processed and annotated with standardized labels of cell types, diseases, therapeutic responses, etc. to be instantly accessed and explored in a uniform visualization and analytics interface. Based on this massive curated database, BioTuring Browser supports searching similar expression profiles, querying a target across datasets and automatic cell type annotation. The platform supports single-cell RNA-seq, CITE-seq and TCR-seq data. BioTuring Browser is now available for download at www.bioturing.com. Conclusions: N/A.


Micromachines ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 367 ◽  
Author(s):  
Yuguang Liu ◽  
Dirk Schulze-Makuch ◽  
Jean-Pierre de Vera ◽  
Charles Cockell ◽  
Thomas Leya ◽  
...  

Single-cell sequencing is a powerful technology that provides the capability of analyzing a single cell within a population. This technology is mostly coupled with microfluidic systems for controlled cell manipulation and precise fluid handling to shed light on the genomes of a wide range of cells. So far, single-cell sequencing has been focused mostly on human cells due to the ease of lysing the cells for genome amplification. The major challenges that bacterial species pose to genome amplification from single cells include the rigid bacterial cell walls and the need for an effective lysis protocol compatible with microfluidic platforms. In this work, we present a lysis protocol that can be used to extract genomic DNA from both gram-positive and gram-negative species without interfering with the amplification chemistry. Corynebacterium glutamicum was chosen as a typical gram-positive model and Nostoc sp. as a gram-negative model due to major challenges reported in previous studies. Our protocol is based on thermal and chemical lysis. We consider 80% of single-cell replicates that lead to >5 ng DNA after amplification as successful attempts. The protocol was directly applied to Gloeocapsa sp. and the single cells of the eukaryotic Sphaerocystis sp. and achieved a 100% success rate.
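The success criterion stated in the abstract can be read as a simple threshold check. The sketch below follows one reading of it: a set of single-cell replicates counts as successful when at least 80% of them yield more than 5 ng of DNA after amplification. The function name and the yields are hypothetical and are not taken from the study.

```python
def is_successful(yields_ng, min_yield_ng=5.0, min_fraction=0.8):
    """Return True if enough replicates exceed the minimum DNA yield (in ng)."""
    if not yields_ng:
        return False  # no replicates, nothing to call a success
    passing = sum(1 for y in yields_ng if y > min_yield_ng)
    return passing / len(yields_ng) >= min_fraction


# Hypothetical post-amplification yields (ng) for five single-cell replicates.
replicate_yields = [12.0, 7.5, 3.1, 9.8, 6.2]
print(is_successful(replicate_yields))
```

Here four of the five replicates exceed 5 ng, so the set just meets the 80% threshold.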


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 4090-4090
Author(s):  
Alison R Moliterno ◽  
Donna Marie Williams ◽  
Jonathan M. Gerber ◽  
Michael A McDevitt ◽  
Ophelia Rogers ◽  
...  

Abstract Introduction: Essential thrombocytosis (ET), polycythemia vera (PV), and myelofibrosis (MF; post ETMF, post PVMF and primary MF) share the JAK2V617F mutation, but differ with regard to clinical phenotype, rate of disease progression, and risk of transformation. Variation in the JAK2V617F neutrophil allele burden does not account for these observed differences in clinical behavior or natural history. We therefore investigated the JAK2V617F burden and JAK2 genotype composition in the hematopoietic stem cell (HSC) population of MPN patients. Methods: We studied 47 JAK2V617F-positive MPN patients during 51 distinct disease phases. Circulating CD34+ cells were flow-sorted based on the stem cell markers CD34, CD38 and aldehyde dehydrogenase (ALDH). CD34+ CD38- ALDH+ HSC were sorted into 96 well plates and single cell JAK2 genotypes (average 40 single cells genotyped/patient with >1000 total single cells genotyped) were obtained using a nested PCR assay. Additional genomic lesions and chromosomal copy number variation were investigated in the sorted, single cell fractions in informative patients by FISH or multiplex single cell PCR. The distribution of JAK2V617F stem cell genotypes was correlated with disease phenotype, neutrophil JAK2V617F allele burden, splenomegaly, white cell count, chemotherapy requirement and disease evolution. Results: In all MPN cases, regardless of disease class, the JAK2V617F mutation was detected in the CD34+ CD38- ALDH+ fraction - the same population in which normal HSC reside. All ET and MF patients, and the majority of PV patients, had three JAK2 genotypes coexisting in their respective HSC populations. ET was characterized by a high percentage of JAK2WT stem cells (>75%) despite the concomitant presence of JAK2V617F homozygous clones and disease durations >15 years. Importantly, in the ET patients where JAK2WT clones fell to less than 50%, a PV phase followed. 
MF was characterized by a relatively low percentage of JAK2WT stem cells (median 24%), regardless of disease duration. PV had the most variable JAK2 genotypes with a wide range of JAK2WT stem cells (4%-92%) and a wide range of JAK2V617F homozygous stem cells (2-100%), and in 5/16 PV cases, only JAK2WT and JAK2V617F homozygous stem cells were identified. PV patients with JAK2V617F homozygous clones comprising more than 50% of their stem cells, regardless of disease duration, had higher white cell counts, higher neutrophil allele burdens, larger spleens and higher prevalence of chemotherapy compared to PV patients who had less than 50% JAK2V617F homozygous HSCs. The percentage of JAK2V617F homozygous HSC did not correlate with disease duration: some PV patients with a disease duration of >18 years had less than 10% JAK2V617F homozygous HSC. A JAK2V617F-positive PV patient with a high JAK2V617F HSC burden and a high neutrophil JAK2V617F burden transformed to a JAK2V617F-negative chronic myelomonocytic leukemia (CMMoL); at the time of HSC analysis, the neutrophil JAK2V617F allele burden was 0% (previously 90%) and the HSC JAK2V617F homozygous percentage fell to 3% (previously 60%). While this patient's CMMoL was molecularly undefined, lesions identified in other JAK2V617F-positive patients (including mutations of ASXL1, TET2, deletion of 5q, 7q and 11q, trisomy 8 and 9), were also found in the CD34+ CD38- ALDH+ HSCs using single cell techniques, sometimes coexistent with JAK2V617F-positive HSC, and sometimes in JAK2WT HSC. Conclusion: Driver and progression lesions in the JAK2V617F-positive MPN are acquired at the primitive HSC level. Despite decades of disease, the HSC pool in the MPN is mosaic for acquired lesions and also retains JAK2WT clones. Dominance of a particular JAK2 genotype at the primitive HSC level is variable, and distinguishes ET, where JAK2WT stem cells outnumber JAK2V617F-positive HSC, from MF, where JAK2WT HSC are the minority. 
PV is the most variable of the three MPN with regard to JAK2 genotype mosaicism. The allelic burden of HSC JAK2V617F in PV correlates with clinical disease burden. However, neither time nor JAK2V617F genotype determines the HSC burden in ET and PV, indicating that an undefined factor is a modifier of this important disease-defining process. Understanding the biology of HSC JAK2V617F homozygous clonal dominance may define an exploitable target to control disease burden, and to mitigate disease progression and evolution. Disclosures Moliterno: Incyte: Membership on an entity's Board of Directors or advisory committees. Spivak: Incyte: Membership on an entity's Board of Directors or advisory committees.


Author(s):  
Nathalie Nève ◽  
James K. Lingwood ◽  
Shelley R. Winn ◽  
Derek C. Tretheway ◽  
Sean S. Kohles

Interfacing a novel micron-resolution particle image velocimetry and dual optical tweezers system (μPIVOT) with microfluidics facilitates the exposure of an individual biologic cell to a wide range of static and dynamic mechanical stress conditions. Single cells can be manipulated in a sequence of mechanical stresses (hydrostatic pressure variations, tension or compression, as well as shear and extensional fluid induced stresses) while measuring cellular deformation. The unique multimodal load states enable a new realm of single cell biomechanical studies.


2019 ◽  
Author(s):  
Shuoguo Wang ◽  
Constance Brett ◽  
Mohan Bolisetty ◽  
Ryan Golhar ◽  
Isaac Neuhaus ◽  
...  

Abstract Motivation: Thanks to technological advances made in the last few years, we are now able to study transcriptomes from thousands of single cells. These advances have been applied widely to study various aspects of biology. Nevertheless, comprehending and inferring meaningful biological insights from these large datasets is still a challenge. Although tools are being developed to deal with the data complexity and data volume, we do not yet have effective visualization and comparative analysis tools to realize the full value of these datasets. Results: In order to address this gap, we implemented a single cell data visualization portal called Single Cell Viewer (SCV). SCV is an R shiny application that offers users rich visualization and exploratory data analysis options for single cell datasets. Availability: Source code for the application is available online at GitHub (http://www.github.com/neuhausi/single-cell-viewer) and there is a hosted exploration application using the same example dataset as this publication at http://periscopeapps.org/[email protected]; [email protected]


Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 1566-1566
Author(s):  
Cathelijne Fokkema ◽  
Madelon M.E. de Jong ◽  
Sabrin Tahri ◽  
Zoltan Kellermayer ◽  
Chelsea den Hollander ◽  
...  

Abstract Introduction The introduction of new treatment regimens has significantly increased the progression free survival (PFS) of newly diagnosed multiple myeloma (MM) patients. However, even with these novel treatments, for some the disease remains refractory, highlighting the need to identify the pathobiology of high-risk MM. In MM patients, high levels of circulating tumor cells (CTCs) are associated with an inferior prognosis independent of high-risk cytogenetics (Chakraborty et al., 2016), suggesting that CTC numbers are a relevant reflection of tumor cell biology. We hypothesized that high levels of CTCs in MM patients are either the result of a transcriptionally distinct tumor clone with enhanced migration capacities, or driven by transcriptional differences present in the bone marrow (BM) tumor cells. To test these hypotheses, we 1) compared MM cells from paired blood and BM samples, and 2) compared BM tumor cells of patients with high and low CTC levels, using single cell RNA-sequencing. Results We isolated plasma cells (PCs) from viably frozen mononuclear cells of paired peripheral blood (PB) and BM aspirates from five newly diagnosed MM patients (0.5%-8% CTCs) to determine the presence of a distinct CTC subclone. We generated single cell transcriptomes from 44,779 CTCs and 35,697 BM PCs. In total, 9 clusters common to BM PCs and CTCs were identified upon single cell data integration, but no cluster specific for either source was detected. Only 25 genes were significantly differentially expressed between CTCs and BM PCs. The absence of transcriptional clusters unique to either CTCs or BM PCs, and the transcriptional similarity between these two anatomical sites, make it highly unlikely that CTC levels are driven by the presence of a transcriptionally-primed migratory clone. We next set out to identify possible transcriptional differences in BM PCs from eight patients with high (2-22%) versus thirteen patients with low (0.004%-0.08%) percentages of CTCs. 
Recurrent high-risk mutations were present in both groups. Single cell transcriptomes were generated from 74,830 BM PCs. Single cell data integration across all patients led to the identification of 8 distinct PC clusters, one of which was characterized by enhanced proliferation as defined by STMN1 and MKI67 transcription. Interestingly, this proliferative cluster was increased in patients with a high percentage of CTCs. Furthermore, cell cycle analyses based on canonical G2M and S phase markers revealed that actively cycling PCs were more frequent in the BM of patients with a high percentage of CTCs (64% versus 30%, p<0.001), irrespective of the transcriptional cluster of origin. We hypothesized that plasma cell-extrinsic cues from the bone marrow micro-environment might be driving tumor proliferation. In order to substantiate this, we isolated BM immune cells from the same 21 patients and generated a library of 301,045 single immune cell transcriptomes. This library contained all major immune cell subsets, including CD4+ and CD8+ T cells, NK cells, B cells and monocytes. Comparative analyses of these cell populations in patients with either high or low levels of CTC are ongoing. Conclusion Through single cell transcriptomic analyses, we demonstrate that CTCs and BM PCs are transcriptionally similar. Importantly, we identify increased BM PC proliferation as a significant difference between patients with high and low levels of CTCs, implicating an increased tumor proliferation as one of the potential mechanisms driving CTC levels and MM disease pathobiology. The relation of the BM immune micro-environment to this altered proliferative state is currently under investigation. Disclosures van der Velden: Janssen: Other: Service Level Agreement; BD Biosciences: Other: Service Level Agreement; Navigate: Other: Service Level Agreement; Agilent: Research Funding; EuroFlow: Other: Service Level Agreement, Patents & Royalties: for network, not personally. 
Sonneveld: SkylineDx: Honoraria, Research Funding; Karyopharm: Consultancy, Honoraria, Research Funding; Amgen: Consultancy, Honoraria, Research Funding; Celgene/BMS: Consultancy, Honoraria, Research Funding; Janssen: Consultancy, Honoraria, Research Funding; Takeda: Consultancy, Honoraria, Research Funding. Broyl: Sanofi: Honoraria; Janssen Pharmaceuticals: Honoraria; Celgene: Honoraria; Bristol-Meyer Squibb: Honoraria; Amgen: Honoraria.


2021 ◽  
Author(s):  
Nathanael Andrews ◽  
Martin Enge

Abstract CIM-seq is a tool for deconvoluting RNA-seq data from cell multiplets (clusters of two or more cells) in order to identify physically interacting cells in a given tissue. The method requires two RNA-seq data sets from the same tissue: one of single cells to be used as a reference, and one of cell multiplets to be deconvoluted. CIM-seq is compatible with both droplet-based sequencing methods, such as Chromium Single Cell 3′ Kits from 10x Genomics, and plate-based methods, such as Smart-seq2. The pipeline consists of three parts: 1) dissociation of the target tissue, FACS sorting of single cells and multiplets, and conventional scRNA-seq; 2) feature selection and clustering of cell types in the single cell data set, generating a blueprint of transcriptional profiles in the given tissue; 3) computational deconvolution of multiplets through maximum likelihood estimation (MLE) to determine the most likely cell type constituents of each multiplet.
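The third step, choosing the most likely cell type constituents of a multiplet by maximum likelihood, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the CIM-seq implementation: it assumes that each reference cell type is summarized by its expected counts per gene for one cell, that a multiplet's expected counts are the sum of its constituents' profiles, and that observed counts are Poisson-distributed. It then enumerates small combinations of cell types exhaustively. All names and numbers are hypothetical.

```python
import itertools
import math


def poisson_loglik(counts, rates):
    """Poisson log-likelihood, dropping the log(k!) term that does not depend on rates."""
    return sum(k * math.log(r) - r for k, r in zip(counts, rates))


def deconvolve(multiplet, reference, max_cells=2, eps=1e-6):
    """Return the combination of reference cell types that best explains the multiplet."""
    best_ll, best_combo = None, None
    for size in range(1, max_cells + 1):
        for combo in itertools.combinations(sorted(reference), size):
            # Expected counts of the candidate multiplet: sum of constituent profiles.
            # eps keeps every rate strictly positive so that log() is defined.
            rates = [
                sum(reference[cell_type][g] for cell_type in combo) + eps
                for g in range(len(multiplet))
            ]
            ll = poisson_loglik(multiplet, rates)
            if best_ll is None or ll > best_ll:
                best_ll, best_combo = ll, combo
    return best_combo


# Hypothetical reference: expected counts for three genes in three cell types.
toy_reference = {
    "type_A": [10.0, 0.0, 1.0],
    "type_B": [0.0, 8.0, 1.0],
    "type_C": [1.0, 1.0, 9.0],
}
doublet = [9, 7, 2]   # resembles type_A + type_B
singlet = [10, 0, 1]  # resembles type_A alone
print(deconvolve(doublet, toy_reference))
print(deconvolve(singlet, toy_reference))
```

Exhaustive enumeration is only practical for a handful of cell types and small multiplets; a real pipeline would need a more scalable search and proper normalization of the count data.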


2020 ◽  
Author(s):  
Jeremy Lombardo ◽  
Marzieh Aliaghaei ◽  
Quy Nguyen ◽  
Kai Kessenbrock ◽  
Jered Haun

Abstract Tissues are composed of highly heterogeneous mixtures of cell subtypes, and this diversity is increasingly being characterized using high-throughput single cell analysis methods. However, these efforts are hindered by the fact that tissues must first be dissociated into single cell suspensions that are viable and still accurately represent phenotypes from the original tissue. Current methods for breaking down tissues are inefficient, labor-intensive, subject to high variability, and potentially biased towards cell subtypes that are easier to release. Here, we present a microfluidic platform consisting of three different tissue processing technologies that can perform the complete tissue to single cell workflow, including digestion, disaggregation, and filtration. First, we developed a new microfluidic digestion device that can be loaded with minced tissue specimens quickly and easily, and then use the combination of proteolytic enzyme activity and fluid shear forces to accelerate tissue breakdown. Next, we integrated dissociation and filter technologies into a single device, which enhanced single cell numbers and fully prepared the sample for single cell analysis. The final multi-device platform was then evaluated using a diverse array of tissue types that exhibited a wide range of properties. For murine kidney and mammary tumor, we found that microfluidic processing produced 2.5-fold more single, viable cells. Single cell RNA sequencing (scRNA-seq) further revealed that device processing enriched for endothelial cells, fibroblasts, and basal epithelium, and did not increase stress responses. For murine liver and heart, which are softer tissues containing fragile cell types, processing time could be reduced to 15 min, and even as short as 1 min. We also demonstrated that periodic recovery at defined time intervals produced substantially more hepatocytes and cardiomyocytes than continuous operation, most likely by preventing damage to fragile cell types. 
In future work, we will seek to integrate additional operations such as upstream tissue preparation and downstream microfluidic cell sorting and detection to create powerful point-of-care single cell diagnostic platforms.

