Autoencoder and Optimal Transport to Infer Single-Cell Trajectories of Biological Processes

2018 ◽  
Author(s):  
Karren Dai Yang ◽  
Karthik Damodaran ◽  
Saradha Venkatchalapathy ◽  
Ali C. Soylemezoglu ◽  
G.V. Shivashankar ◽  
...  

Abstract: Although we can increasingly image and measure biological processes at single-cell resolution, most assays can only take snapshots from a population of cells in time. Here we describe ImageAEOT, which combines an AutoEncoder, to map single-cell images from different cell populations to a common latent space, with the framework of Optimal Transport to infer cellular trajectories. As a proof-of-concept, we apply ImageAEOT to nuclear and chromatin images during the activation of fibroblasts by tumor cells in engineered 3D tissues. We further validate ImageAEOT on chromatin images of various breast cancer cell lines and human tissue samples, thereby linking alterations in chromatin condensation patterns to different stages of tumor progression. Importantly, ImageAEOT can infer the trajectory of a particular cell from one snapshot in time and identify the changing features to provide early biomarkers for developmental and disease progression.
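The optimal-transport step mentioned in this abstract can be illustrated with a minimal sketch. It assumes entropy-regularised transport solved by Sinkhorn iterations, which is one common way to compute a coupling between two populations; the abstract does not say which solver ImageAEOT uses. The latent codes, the `sinkhorn` and `sq_dist` helpers, and the values of `eps` and `n_iter` are all hypothetical.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, n_iter=200):
    """Return an entropy-regularised transport plan between weights a and b.

    a, b  : lists of non-negative weights, each summing to the same total
    cost  : cost[i][j] is the cost of moving mass from source i to target j
    eps   : regularisation strength (assumed value, not from the abstract)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


# Hypothetical 2-D latent codes for cells sampled at two time points.
early = [(0.0, 0.0), (1.0, 0.0)]
late = [(0.1, 0.9), (1.1, 1.0)]
cost = [[sq_dist(p, q) for q in late] for p in early]

# plan[i][j] is the mass coupling early cell i with late cell j.
plan = sinkhorn([0.5, 0.5], [0.5, 0.5], cost)
```

Each row of `plan` can be read as a soft assignment of one early cell to the later population, which is the sense in which a coupling suggests a trajectory from snapshot data.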

2018 ◽  
Author(s):  
Pierre-Cyril Aubin-Frankowski ◽  
Jean-Philippe Vert

Abstract: Single-cell RNA sequencing (scRNA-seq) offers new possibilities to infer gene regulatory networks (GRNs) for biological processes involving a notion of time, such as cell differentiation or cell cycles. It also raises many challenges due to the destructive measurements inherent to the technology. In this work we propose a new method named GRISLI for de novo GRN inference from scRNA-seq data. GRISLI infers a velocity vector field in the space of scRNA-seq data from profiles of individual cells, and models the dynamics of cell trajectories with a linear ordinary differential equation to reconstruct the underlying GRN with a sparse regression procedure. We show on real data that GRISLI outperforms a recently proposed state-of-the-art method for GRN reconstruction from scRNA-seq data.


2021 ◽  
Vol 17 (12) ◽  
pp. e1009466
Author(s):  
Stephen Zhang ◽  
Anton Afanassiev ◽  
Laura Greenstreet ◽  
Tetsuya Matsumoto ◽  
Geoffrey Schiebinger

Understanding how cells change their identity and behaviour in living systems is an important question in many fields of biology. The problem of inferring cell trajectories from single-cell measurements has been a major topic in the single-cell analysis community, with different methods developed for equilibrium and non-equilibrium systems (e.g. haematopoiesis vs. embryonic development). We show that optimal transport analysis, a technique originally designed for analysing time-courses, may also be applied to infer cellular trajectories from a single snapshot of a population in equilibrium. Therefore, optimal transport provides a unified approach to inferring trajectories that is applicable to both stationary and non-stationary systems. Our method, StationaryOT, is mathematically motivated in a natural way from the hypothesis of Waddington’s epigenetic landscape. We implement StationaryOT as a software package and demonstrate its efficacy in applications to simulated data as well as single-cell data from Arabidopsis thaliana root development.


2020 ◽  
Author(s):  
Andres M. Cifuentes-Bernal ◽  
Vu VH Pham ◽  
Xiaomei Li ◽  
Lin Liu ◽  
Jiuyong Li ◽  
...  

Abstract Motivation: microRNAs (miRNAs) are important gene regulators and are involved in many biological processes, including cancer progression. Therefore, correctly identifying miRNA-mRNA interactions is a crucial task. To this end, a huge number of computational methods have been developed, but they mainly use data from a single snapshot and ignore the dynamics of a biological process. The recent development of single-cell data and the boom in exploring cell trajectories using the “pseudo-time” concept have inspired us to develop a pseudo-time based method to infer the miRNA-mRNA relationships characterising a biological process by taking into account the temporal aspect of the process. Results: We have developed a novel approach, called pseudo-time causality (PTC), to find the causal relationships between miRNAs and mRNAs during a biological process. We have applied the proposed method to both single-cell and bulk sequencing datasets for Epithelial to Mesenchymal Transition (EMT), a key process in cancer metastasis. The evaluation results show that our method significantly outperforms existing methods in finding miRNA-mRNA interactions in both single-cell and bulk data. The results suggest that utilising the pseudo-temporal information from the data helps reveal the gene regulation in a biological process much better than using the static information. Availability: R scripts and datasets can be found at https://github.com/AndresMCB/PTC


2019 ◽  
Author(s):  
Anna Klimovskaia ◽  
David Lopez-Paz ◽  
Léon Bottou ◽  
Maximilian Nickel

Abstract: The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry, a suboptimal choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method that brings the power of hyperbolic geometry into the realm of single-cell data analysis. Often understood as a continuous extension of trees, hyperbolic geometry enables the embedding of complex hierarchical data in only two dimensions while preserving the pairwise distances between points in the hierarchy. This enables direct exploratory analysis and the use of our embeddings in a wide variety of downstream data analysis tasks, such as visualization, clustering, lineage detection and pseudo-time inference. When compared to existing methods (which are unable to address all these important tasks using a single embedding), Poincaré maps produce state-of-the-art two-dimensional representations of cell trajectories on multiple scRNAseq datasets. More specifically, we demonstrate that Poincaré maps make it straightforward to formulate new hypotheses about biological processes that were inaccessible to prior methods. Significance statement: The discovery of hierarchies in biological processes is central to developmental biology. We propose Poincaré maps, a new method based on hyperbolic geometry to discover continuous hierarchies from pairwise similarities. We demonstrate the efficacy of our method on multiple single-cell datasets on tasks such as visualization, clustering, lineage identification, and pseudo-time inference.
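To make the role of hyperbolic geometry concrete, the following sketch computes the standard geodesic distance in the Poincaré ball model. It only illustrates the geometry an embedding of this kind lives in; the abstract does not describe how the embedding itself is fitted, and the function name `poincare_distance` is chosen here for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Uses the standard Poincare-ball formula
    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((x - y) ** 2 for x, y in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Two pairs with the same Euclidean separation (0.1): one pair near the
# centre of the disk, one pair near its boundary.
near_centre = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
```

Distances grow rapidly towards the boundary of the disk, so `near_boundary` is much larger than `near_centre`. This is the property that lets tree-like hierarchies with many branches fit into two dimensions.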


2021 ◽  
Author(s):  
Stephen Zhang ◽  
Anton Afanassiev ◽  
Laura Greenstreet ◽  
Tetsuya Matsumoto ◽  
Geoffrey Schiebinger

Abstract: Understanding how cells change their identity and behaviour in living systems is an important question in many fields of biology. The problem of inferring cell trajectories from single-cell measurements has been a major topic in the single-cell analysis community, with different methods developed for equilibrium and non-equilibrium systems (e.g. haematopoiesis vs. embryonic development). We show that optimal transport analysis, a technique originally designed for analysing time-courses, may also be applied to infer cellular trajectories from a single snapshot of a population in equilibrium. Therefore, optimal transport provides a unified approach to inferring trajectories, applicable to both stationary and non-stationary systems. Our method, StationaryOT, is mathematically motivated in a natural way from the hypothesis of Waddington’s epigenetic landscape. We implemented StationaryOT as a software package and demonstrate its efficacy when applied to simulated data as well as single-cell data from Arabidopsis thaliana root development.


2020 ◽  
Vol 36 (18) ◽  
pp. 4774-4780 ◽  
Author(s):  
Pierre-Cyril Aubin-Frankowski ◽  
Jean-Philippe Vert

Abstract Motivation: Single-cell RNA sequencing (scRNA-seq) offers new possibilities to infer gene regulatory networks (GRNs) for biological processes involving a notion of time, such as cell differentiation or cell cycles. It also raises many challenges due to the destructive measurements inherent to the technology. Results: In this work, we propose a new method named GRISLI for de novo GRN inference from scRNA-seq data. GRISLI infers a velocity vector field in the space of scRNA-seq data from profiles of individual cells, and models the dynamics of cell trajectories with a linear ordinary differential equation to reconstruct the underlying GRN with a sparse regression procedure. We show on real data that GRISLI outperforms a recently proposed state-of-the-art method for GRN reconstruction from scRNA-seq data. Availability and implementation: The MATLAB code of GRISLI is available at: https://github.com/PCAubin/GRISLI. Supplementary information: Supplementary data are available at Bioinformatics online.
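The modelling idea summarised in this abstract can be written schematically as follows. The notation is assumed for illustration and is not taken from the paper: $x_c \in \mathbb{R}^G$ is the expression profile of cell $c$ over $G$ genes, $\hat{v}_c$ is the velocity estimated for that cell, $A$ is the matrix of regulatory weights, and $\lambda$ is a regularisation strength. The $\ell_1$ penalty is one common form of sparse regression and is an assumption here; the abstract does not specify the estimator GRISLI actually uses.

```latex
% Linear dynamics assumed for the expression state x(t):
\[
  \frac{\mathrm{d}x(t)}{\mathrm{d}t} = A\,x(t)
\]
% Schematic sparse regression of estimated velocities on expression profiles:
\[
  \hat{A} \in \arg\min_{A \in \mathbb{R}^{G \times G}}
  \sum_{c=1}^{C} \bigl\lVert \hat{v}_c - A\,x_c \bigr\rVert_2^2
  + \lambda \lVert A \rVert_1
\]
```

Under this reading, a nonzero entry $\hat{A}_{ij}$ is interpreted as a candidate regulatory effect of gene $j$ on gene $i$.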


2019 ◽  
Author(s):  
Van Hoan Do ◽  
Mislav Blažević ◽  
Pablo Monteagudo ◽  
Luka Borozan ◽  
Khaled Elbassioni ◽  
...  

Abstract: Single-cell RNA sequencing enables the construction of trajectories describing the dynamic changes in gene expression underlying biological processes such as cell differentiation and development. The comparison of single-cell trajectories under two distinct conditions can illuminate the differences and similarities between the two and can thus be a powerful tool. Recently developed methods for the comparison of trajectories rely on the concept of dynamic time warping (dtw), which was originally proposed for the comparison of two time series. Consequently, these methods are restricted to simple, linear trajectories. Here, we adopt and theoretically link arboreal matchings to dtw and propose an algorithm to compare complex trajectories that more realistically contain branching points that divert cells into different fates. We implement a suite of exact and heuristic algorithms suitable for the comparison of trajectories of different characteristics in our tool Trajan. Trajan automatically pairs similar biological processes between conditions and aligns them in a globally consistent manner. In an alignment of single-cell trajectories describing human muscle differentiation and myogenic reprogramming, Trajan identifies and aligns the core paths without prior information. From Trajan’s alignment, we are able to reproduce recently reported barriers to reprogramming. In a perturbation experiment, we demonstrate the benefits in terms of robustness and accuracy of our model which compares entire trajectories at once, as opposed to a pairwise application of dtw. Trajan is available at https://github.com/canzarlab/Trajan.
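For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic dtw recurrence for two linear sequences, the setting the abstract says earlier methods are restricted to. It is not Trajan's arboreal-matching algorithm. The function name `dtw_distance` and the example sequences are hypothetical.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping cost between two numeric sequences.

    d[i][j] holds the minimal cumulative cost of aligning x[:i] with y[:j],
    using absolute difference as the local cost.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: advance in x, advance in y, or both.
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical expression values of one gene along two linear trajectories;
# the second trajectory lingers longer at the intermediate state.
control = [1.0, 2.0, 3.0]
perturbed = [1.0, 2.0, 2.0, 3.0]
alignment_cost = dtw_distance(control, perturbed)
```

Because warping allows one point to match several consecutive points in the other sequence, the two example trajectories align at zero cost despite their different lengths. The recurrence assumes a single ordered path, which is why it does not extend directly to branching trajectories.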


2021 ◽  
Vol 80 (Suppl 1) ◽  
pp. 460.1-460
Author(s):  
L. Cheng ◽  
S. X. Zhang ◽  
S. Song ◽  
C. Zheng ◽  
X. Sun ◽  
...  

Background: Rheumatoid arthritis (RA) is a chronic, inflammatory, synovitis-based systemic disease of unknown etiology [1]. The genes and pathways in the inflamed synovium of RA patients are poorly understood. Objectives: This study aims to identify differentially expressed genes (DEGs) associated with the progression of synovitis in RA using bioinformatics analysis and to explore its pathogenesis [2]. Methods: RA expression profile microarray data GSE89408 were acquired from the public gene chip database (GEO), including 152 synovial tissue samples from RA and 28 healthy synovial tissue samples. The DEGs of RA synovial tissues were screened using R software. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed. Protein-protein interaction (PPI) networks were assembled with Cytoscape software. Results: A total of 654 DEGs (268 up-regulated genes and 386 down-regulated genes) were obtained by the differential analysis. The GO enrichment results showed that the up-regulated genes were significantly enriched in the biological processes of myeloid leukocyte activation, cellular response to interferon-gamma and immune response-regulating signaling pathway, and the down-regulated genes were significantly enriched in the biological processes of extracellular matrix, retinoid metabolic process and regulation of lipid metabolic process. The KEGG annotation showed that the up-regulated genes mainly participated in Staphylococcus aureus infection, the chemokine signaling pathway and the lysosome signaling pathway, and the down-regulated genes mainly participated in the PPAR signaling pathway, AMPK signaling pathway, ECM-receptor interaction and so on. The 9 hub genes (PTPRC, TLR2, TYROBP, CTSS, CCL2, CCR5, B2M, FCGR1A and PPBP) were obtained based on the STRING database model by using the Cytoscape software and cytoHubba plugin [3]. Conclusion: The findings identified the molecular mechanisms and the key hub genes of pathogenesis and progression of RA.
References:
[1] Xiong Y, Mi BB, Liu MF, et al. Bioinformatics Analysis and Identification of Genes and Molecular Pathways Involved in Synovial Inflammation in Rheumatoid Arthritis. Med Sci Monit 2019;25:2246-56. doi: 10.12659/MSM.915451 [published Online First: 2019/03/28]
[2] Mun S, Lee J, Park A, et al. Proteomics Approach for the Discovery of Rheumatoid Arthritis Biomarkers Using Mass Spectrometry. Int J Mol Sci 2019;20(18) doi: 10.3390/ijms20184368 [published Online First: 2019/09/08]
[3] Zhu N, Hou J, Wu Y, et al. Identification of key genes in rheumatoid arthritis and osteoarthritis based on bioinformatics analysis. Medicine (Baltimore) 2018;97(22):e10997. doi: 10.1097/MD.0000000000010997 [published Online First: 2018/06/01]
Acknowledgements: This project was supported by National Science Foundation of China (82001740), Open Fund from the Key Laboratory of Cellular Physiology (Shanxi Medical University) (KLCP2019) and Innovation Plan for Postgraduate Education in Shanxi Province (2020BY078).
Disclosure of Interests: None declared


Author(s):  
Yixuan Qiu ◽  
Jiebiao Wang ◽  
Jing Lei ◽  
Kathryn Roeder

Abstract Motivation: Marker genes, defined as genes that are expressed primarily in a single cell type, can be identified from the single cell transcriptome; however, such data are not always available for the many uses of marker genes, such as deconvolution of bulk tissue. Marker genes for a cell type, however, are highly correlated in bulk data, because their expression levels depend primarily on the proportion of that cell type in the samples. Therefore, when many tissue samples are analyzed, it is possible to identify these marker genes from the correlation pattern. Results: To capitalize on this pattern, we develop a new algorithm to detect marker genes by combining published information about likely marker genes with bulk transcriptome data in the form of a semi-supervised algorithm. The algorithm then exploits the correlation structure of the bulk data to refine the published marker genes by adding or removing genes from the list. Availability and implementation: We implement this method as an R package markerpen, hosted on CRAN (https://CRAN.R-project.org/package=markerpen). Supplementary information: Supplementary data are available at Bioinformatics online.

