UCSC Cell Browser: Visualize Your Single-Cell Data

Abstract Summary As the use of single-cell technologies has grown, so has the need for tools to explore these large, complicated datasets. The UCSC Cell Browser is a tool that allows scientists to visualize gene expression and metadata annotation distribution throughout a single-cell dataset or multiple datasets. Availability and implementation We provide the UCSC Cell Browser as a free website where users can explore a growing collection of single-cell datasets and a freely available Python package for scientists to create stable, self-contained visualizations for their own single-cell datasets. Learn more at https://[email protected]

SPRING: a kinetic interface for visualizing high dimensional single-cell expression data

10.1101/090332 ◽

2016 ◽

Cited By ~ 10

Author(s):

Caleb Weinreb ◽

Samuel Wolock ◽

Allon Klein

Keyword(s):

Gene Expression ◽

Single Cell ◽

Nearest Neighbor ◽

High Dimensional ◽

K Nearest Neighbor ◽

Link Type ◽

Cell Gene Expression ◽

Graph Layouts ◽

Cell Expression ◽

Cell Data

Motivation Single-cell gene expression profiling technologies can map the cell states in a tissue or organism. As these technologies become more common, there is a need for computational tools to explore the data they produce. In particular, existing data visualization approaches are imperfect for studying continuous gene expression topologies. Results Force-directed layouts of k-nearest-neighbor graphs can visualize continuous gene expression topologies in a manner that preserves high-dimensional relationships and allows manual exploration of different stable two-dimensional representations of the same data. We implemented an interactive web tool, called SPRING, to visualize single-cell data using force-directed graph layouts. SPRING reveals more detailed biological relationships than existing approaches when applied to branching gene expression trajectories from hematopoietic progenitor cells. Visualizations from SPRING are also more reproducible than those of stochastic visualization methods such as tSNE, a state-of-the-art tool. Availability https://kleintools.hms.harvard.edu/tools/spring.html, https://github.com/AllonKleinLab/SPRING/ [email protected], [email protected]
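The idea in the abstract above, a force-directed layout of a k-nearest-neighbor graph, can be illustrated with a small sketch. This is not SPRING's code: the function names `knn_edges` and `force_layout`, the Euclidean distance, the Fruchterman-Reingold-style force rules, and all parameter values are assumptions chosen for illustration, and the toy `cells` matrix is made up.

```python
import math
import random


def knn_edges(cells, k):
    """Return undirected k-nearest-neighbor edges as (i, j) pairs with i < j."""
    edges = set()
    for i, cell in enumerate(cells):
        # Rank every other cell by Euclidean distance in expression space.
        neighbors = sorted(
            (j for j in range(len(cells)) if j != i),
            key=lambda j: math.dist(cell, cells[j]),
        )
        for j in neighbors[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def force_layout(n, edges, iterations=200, seed=0):
    """Assumed Fruchterman-Reingold-style rules: all nodes repel, linked nodes attract."""
    rng = random.Random(seed)  # fixed seed makes the starting positions repeatable
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n)]
    ideal = 1.0 / math.sqrt(n)  # target edge length
    for step in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n)]
        # Repulsion between every pair of nodes.
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = max(math.hypot(dx, dy), 1e-9)
                push = ideal * ideal / d
                disp[i][0] += dx / d * push
                disp[i][1] += dy / d * push
                disp[j][0] -= dx / d * push
                disp[j][1] -= dy / d * push
        # Attraction along graph edges.
        for i, j in edges:
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            d = max(math.hypot(dx, dy), 1e-9)
            pull = d * d / ideal
            disp[i][0] -= dx / d * pull
            disp[i][1] -= dy / d * pull
            disp[j][0] += dx / d * pull
            disp[j][1] += dy / d * pull
        # Cooling: cap how far a node may move, shrinking the cap each step.
        temperature = 0.1 * (1.0 - step / iterations)
        for i in range(n):
            d = max(math.hypot(disp[i][0], disp[i][1]), 1e-9)
            scale = min(d, temperature) / d
            pos[i][0] += disp[i][0] * scale
            pos[i][1] += disp[i][1] * scale
    return pos


# Toy data: six "cells" with three "genes" each, forming two groups.
cells = [
    [0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.1, 5.0],
]
edges = knn_edges(cells, k=2)
layout = force_layout(len(cells), edges)
for index, (x, y) in enumerate(layout):
    print(index, round(x, 3), round(y, 3))
```

Because the only random step is the seeded initial placement, repeated runs of this sketch give the same coordinates, which loosely mirrors the reproducibility point made in the abstract.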

scAMACE: Model-based approach to the joint analysis of single-cell data on chromatin accessibility, gene expression and methylation

10.1101/2021.03.29.437485 ◽

2021 ◽

Author(s):

Jiaxuan Wangwu ◽

Zexuan Sun ◽

Zhixiang Lin

Keyword(s):

Gene Expression ◽

Single Cell ◽

Cell Types ◽

Chromatin Accessibility ◽

Integrative Analysis ◽

Joint Analysis ◽

Data Types ◽

Link Type ◽

Complex Biological Process ◽

Cell Data

Abstract Advances in technology and the growth of available single-cell datasets motivate integrative analysis of multiple single-cell genomic datasets. Integrative analysis of multimodal single-cell datasets combines complementary information offered by single-omic datasets and can offer deeper insights into complex biological processes. Clustering methods that identify the unknown cell types are among the first few steps in the analysis of single-cell datasets, and they are important for downstream analysis built upon the identified cell types. We propose scAMACE for the integrative analysis and clustering of single-cell data on chromatin accessibility, gene expression and methylation. We demonstrate that cell types are better identified and characterized through analyzing the three data types jointly. We develop an efficient expectation-maximization (EM) algorithm to perform statistical inference, and evaluate our methods in both a simulation study and real-data applications. We also provide a GPU implementation of scAMACE, making it scalable to large datasets. The software and datasets are available at https://github.com/cuhklinlab/scAMACE_py (Python implementation) and https://github.com/cuhklinlab/scAMACE (R implementation).

sciCAN: Single-cell chromatin accessibility and gene expression data integration via Cycle-consistent Adversarial Network

10.1101/2021.11.30.470677 ◽

2021 ◽

Author(s):

Yang Xu ◽

Edmon Begoli ◽

Rachel Patton McCord

Keyword(s):

Gene Expression ◽

Data Integration ◽

Single Cell ◽

Gene Expression Data ◽

Chromatin Accessibility ◽

Cellular Systems ◽

Expression Data ◽

Adversarial Network ◽

Cell Technologies ◽

Cell Data

The booming single-cell technologies bring a surge of high-dimensional data that come from different sources and represent cellular systems from different views. With advances in single-cell technologies, integrating single-cell data across modalities has arisen as a new computational challenge and is gaining increasing attention within the community. Here, we present a novel adversarial approach, sciCAN, to integrate single-cell chromatin accessibility and gene expression data in an unsupervised manner. We benchmarked sciCAN against 3 state-of-the-art (SOTA) methods on 5 scATAC-seq/scRNA-seq datasets, and we demonstrated that our method handled data integration with a better balance of mutual transfer between modalities than the other 3 SOTA methods. We further applied sciCAN to 10X Multiome data and confirmed that the integrated representation preserves information about the hematopoietic hierarchy. Finally, we investigated CRISPR-perturbed single-cell K562 ATAC-seq and RNA-seq data to identify cells with related responses to different perturbations in these different modalities.

Identifying cell types from single-cell data based on similarities and dissimilarities between cells

BMC Bioinformatics ◽

10.1186/s12859-020-03873-z ◽

2021 ◽

Vol 22 (S3) ◽

Author(s):

Yuanyuan Li ◽

Ping Luo ◽

Yi Lu ◽

Fang-Xiang Wu

Keyword(s):

Gene Expression ◽

Single Cell ◽

Spectral Clustering ◽

Incidence Matrix ◽

Expression Patterns ◽

Cell Types ◽

Clustering Method ◽

Different Types ◽

Cell Data ◽

Spectral Clustering Method

Abstract Background With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods such as spectral clustering can do well in the identification of cell types, they only consider the similarities between cells and ignore the influence of dissimilarities on clustering results. This limitation may restrict the performance of most conventional clustering algorithms in identifying clusters, so special methods are needed for high-dimensional sparse categorical data. Results Inspired by the phenomenon that cells of the same type have similar gene expression patterns, while cells of different types show dissimilar gene expression patterns, we improve the existing spectral clustering method for clustering single-cell data so that it is based on both similarities and dissimilarities between cells. The method first measures the similarity/dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and, finally, uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.
Conclusions In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.

Tailoring Cardiac Synthetic Transcriptional Modulation Towards Precision Medicine

Frontiers in Cardiovascular Medicine ◽

10.3389/fcvm.2021.783072 ◽

2022 ◽

Vol 8 ◽

Author(s):

Eric Schoger ◽

Sara Lelek ◽

Daniela Panáková ◽

Laura Cecilia Zelarayán

Keyword(s):

Gene Expression ◽

Precision Medicine ◽

Single Cell ◽

Regulatory Networks ◽

Transcriptional Control ◽

Regulatory Elements ◽

Comprehensive Overview ◽

Endogenous Gene ◽

Cell Technologies ◽

Organ Systems

Molecular and genetic differences between individual cells within tissues underlie cellular heterogeneities defining organ physiology and function in homeostasis as well as in disease states. Transcriptional control of endogenous gene expression has been intensively studied for decades. Thanks to the fast-developing field of single-cell genomics, we are facing an unprecedented leap in the information available pertaining to organ biology, offering a comprehensive overview. The single-cell technologies that arose have aided in resolving the precise cellular composition of many organ systems in the past years. Importantly, when applied to diseased tissues, the novel approaches have immensely improved our understanding of the underlying pathophysiology of common human diseases. With this information, precise prediction of regulatory elements controlling gene expression upon perturbations in a given cell type or a specific context will be realistic. Simultaneously, the technological advances in CRISPR-mediated regulation of gene transcription, as well as their application in the context of epigenome modulation, have opened up novel avenues for targeted therapy and personalized medicine. Here, we discuss the fast-paced advancements of recent years and their applications in the context of cardiac biology and common cardiac disease. The combination of single-cell technologies and deep knowledge of the fundamental biology of the diseased heart, together with the CRISPR-mediated modulation of gene regulatory networks, will be instrumental in tailoring the right strategies for personalized and precision medicine in the near future. In this review, we provide a brief overview of how single-cell transcriptomics has advanced our knowledge and paved the way for emerging CRISPR/Cas9 technologies in clinical applications in cardiac biomedicine.

High-throughput single-cell RNA-seq data imputation and characterization with surrogate-assisted automated deep learning

Briefings in Bioinformatics ◽

10.1093/bib/bbab368 ◽

2021 ◽

Author(s):

Xiangtao Li ◽

Shaochuan Li ◽

Lei Huang ◽

Shixiong Zhang ◽

Ka-chun Wong

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Single Cell ◽

Deep Neural Networks ◽

Expression Profiles ◽

Marker Gene ◽

Gene Expression Profiles ◽

Underlying Mechanisms ◽

Cell Data ◽

Gene Expression Levels

Abstract Single-cell RNA sequencing (scRNA-seq) technologies have been heavily developed to probe gene expression profiles at single-cell resolution. Deep imputation methods have been proposed to address the related computational challenges (e.g. the gene sparsity in single-cell data). In particular, the neural architectures of those deep imputation models have been proven to be critical for performance. However, deep imputation architectures are difficult to design and tune for those without rich knowledge of deep neural networks and scRNA-seq. Therefore, the Surrogate-assisted Evolutionary Deep Imputation Model (SEDIM) is proposed to automatically design the architectures of deep neural networks for imputing gene expression levels in scRNA-seq data without any manual tuning. Moreover, SEDIM constructs an offline surrogate model, which can accelerate the architectural search. Comprehensive studies show that SEDIM significantly improves the imputation and clustering performance compared with other benchmark methods. In addition, we extensively explore the performance of SEDIM in other contexts and platforms, including mass cytometry and metabolic profiling. Marker gene detection, gene ontology enrichment and pathological analysis are conducted to provide novel insights into cell-type identification and the underlying mechanisms. The source code is available at https://github.com/li-shaochuan/SEDIM.

SIN-KNO: A method of gene regulatory network inference using single-cell transcription and gene knockout data

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720019500355 ◽

2019 ◽

Vol 17 (06) ◽

pp. 1950035

Author(s):

Huiqing Wang ◽

Yuanyuan Lian ◽

Chun Li ◽

Yue Ma ◽

Zhiliang Yan ◽

...

Keyword(s):

Gene Expression ◽

Steady State ◽

Single Cell ◽

Gene Regulatory Network ◽

Regulatory Network ◽

Gene Knockout ◽

Cell Heterogeneity ◽

State Information ◽

Gene Regulatory ◽

Cell Data

As a tool for interpreting and analyzing genetic data, a gene regulatory network (GRN) can reveal regulatory relationships between genes, proteins, and small molecules, and can help explain physiological activities and functions within cells, interactions in pathways, and how changes arise in the organism. Traditional GRN research focuses on analyzing regulatory relationships through the average of cellular gene expression. These methods have difficulty identifying the cell heterogeneity of gene expression. Existing methods for inferring GRNs from single-cell transcriptional data lack expression information for genes at steady state, and the high dimensionality of single-cell data leads to high temporal and spatial complexity of the algorithm. To address the problems of traditional GRN inference methods, including the lack of cellular heterogeneity information, single-cell data complexity and the lack of steady-state information, we propose a method for GRN inference using single-cell transcription and gene knockout data, called SINgle-cell transcription data-KNOckout data (SIN-KNO), which focuses on combining the dynamic and steady-state information on regulatory relationships contained in gene expression. Capturing cell heterogeneity information helps explain differences in gene expression between cells, so that gene expression changes can be observed more accurately. Gene knockout data capture the steady-state expression levels of all other genes when one gene is knocked out. Classifying the genes before analyzing the single-cell data can rule out a large number of non-existent regulatory relationships, greatly reducing the number of relationships that must be inferred. To demonstrate its efficiency, the proposed method has been compared with several typical methods in this area, including GENIE3, JUMP3, and SINCERITIES.
The results of the evaluation indicate that the proposed method can analyze the diversified information contained in the two types of data, establish a more accurate gene regulatory network, and improve computational efficiency. The method provides a new way of thinking about large datasets and the high computational complexity of single-cell data in GRN inference.

Single Cell Viewer (SCV): An interactive visualization data portal for single cell RNA sequence data

10.1101/664789 ◽

2019 ◽

Cited By ~ 2

Author(s):

Shuoguo Wang ◽

Constance Brett ◽

Mohan Bolisetty ◽

Ryan Golhar ◽

Isaac Neuhaus ◽

...

Keyword(s):

Single Cell ◽

Sequence Data ◽

Single Cells ◽

Link Type ◽

Technological Advances ◽

R Shiny ◽

Data Volume ◽

Exploratory Data ◽

Cell Data ◽

Shiny Application

Abstract Motivation Thanks to technological advances made in the last few years, we are now able to study transcriptomes from thousands of single cells. These advances have been applied widely to study various aspects of biology. Nevertheless, comprehending and inferring meaningful biological insights from these large datasets is still a challenge. Although tools are being developed to deal with the data complexity and data volume, we do not yet have effective visualization and comparative analysis tools to realize the full value of these datasets. Results In order to address this gap, we implemented a single-cell data visualization portal called Single Cell Viewer (SCV). SCV is an R Shiny application that offers users rich visualization and exploratory data analysis options for single-cell datasets. Availability Source code for the application is available online at GitHub (http://www.github.com/neuhausi/single-cell-viewer) and there is a hosted exploration application using the same example dataset as this publication at http://periscopeapps.org/[email protected]; [email protected]

scMontage: Fast and Robust Gene Expression Similarity Search for Massive Single-cell Data

10.1101/2020.08.30.271395 ◽

2020 ◽

Cited By ~ 1

Author(s):

Tomoya Mori ◽

Naila Shinwari ◽

Wataru Fujibuchi

Keyword(s):

Gene Expression ◽

Single Cell ◽

Similarity Search ◽

Rank Correlation ◽

Cell Types ◽

Specific Information ◽

Link Type ◽

Spearman’s Rank Correlation ◽

Spearman’s Rank Correlation Coefficient ◽

Additional Cell

Abstract Single-cell RNA-seq (scRNA-seq) analysis is widely used to characterize cell types or detect heterogeneity of cell states at much higher resolution than ever before. Here we introduce scMontage (https://scmontage.stemcellinformatics.org), a gene expression similarity search server dedicated to scRNA-seq data, which can compare a query with thousands of samples within a few seconds. The scMontage search is based on Spearman’s rank correlation coefficient, and its robustness is ensured by introducing Fisher’s Z-transformation and Z-test. Furthermore, search results are linked to the human cell database SHOGoiN (http://shogoin.stemcellinformatics.org), which enables users to quickly access additional cell-type-specific information. scMontage is available not only as a web server but also as a stand-alone application for users’ own data, and thus it enhances the reliability and throughput of cell analysis and helps users gain new insights into their research.
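The search principle named in the abstract above (Spearman's rank correlation followed by Fisher's Z-transformation and a Z-test) can be sketched as follows. This is a minimal illustration, not scMontage's implementation: the function names, the null hypothesis of zero correlation, the 1.06 / (n - 3) variance approximation for the transformed Spearman coefficient, and the toy expression vectors are all assumptions.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fisher_z_test(rho, n):
    """Z statistic and two-sided p-value for H0: rho = 0 (assumed null)."""
    rho = max(min(rho, 0.999999), -0.999999)  # keep atanh finite at |rho| = 1
    # Assumed approximation: var(atanh(rho)) is about 1.06 / (n - 3) for Spearman.
    z = math.atanh(rho) * math.sqrt((n - 3) / 1.06)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def search(query, database):
    """Rank database samples by Spearman correlation with the query."""
    hits = []
    for name, profile in database.items():
        rho = spearman(query, profile)
        z, p_value = fisher_z_test(rho, len(query))
        hits.append((name, rho, p_value))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy expression profiles over the same eight genes (made-up numbers).
query = [5.0, 0.0, 3.0, 9.0, 1.0, 7.0, 2.0, 4.0]
database = {
    "sample_a": [4.0, 0.5, 2.5, 8.0, 1.5, 6.0, 2.0, 3.0],
    "sample_b": [0.0, 9.0, 4.0, 1.0, 7.0, 2.0, 6.0, 3.0],
}
for name, rho, p_value in search(query, database):
    print(name, round(rho, 3), p_value)
```

Working on ranks rather than raw values makes the score insensitive to monotone rescaling of expression, and the Z-test gives a way to filter out hits whose correlation is indistinguishable from zero at the chosen sample size.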

Semi-soft Clustering of Single Cell Data

10.1101/285056 ◽

2018 ◽

Author(s):

Lingxue Zhu ◽

Jing Lei ◽

Bernie Devlin ◽

Kathryn Roeder

Keyword(s):

Gene Expression ◽

Single Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Pairwise Comparison ◽

Cell Types ◽

Intermediate Cell ◽

Soft Clustering ◽

Membership Matrix ◽

Cell Data

Abstract Motivated by the dynamics of development, in which cells of recognizable types, or pure cell types, transition into other types over time, we propose a method of semi-soft clustering that can classify both pure and intermediate cell types from data on gene expression or protein abundance from individual cells. Called SOUP, for Semi-sOft clUstering with Pure cells, this novel algorithm reveals the clustering structure for both pure cells, which belong to a single cluster, and transitional cells with soft memberships. SOUP involves a two-step process: identify the set of pure cells and then estimate a membership matrix. To find pure cells, SOUP uses the special block structure the K cell types form in a similarity matrix, devised by pairwise comparison of the gene expression profiles of individual cells. Once pure cells are identified, they provide the key information from which the membership matrix can be computed. SOUP is applicable to general clustering problems as well, as long as the unrestrictive modeling assumptions hold. The performance of SOUP is documented via extensive simulation studies. Using SOUP to analyze two single-cell data sets from the brain shows that it produces sensible and interpretable results.
