Automated identification of maximal differential cell populations in flow cytometry data

2021 ◽  
Author(s):  
Alice Yue ◽  
Cedric Chauve ◽  
Maxwell Libbrecht ◽  
Ryan R. Brinkman
2019 ◽  
Author(s):  
Alice Yue ◽  
Cedric Chauve ◽  
Maxwell Libbrecht ◽  
Ryan R. Brinkman

Abstract: We introduce a new cell population score called SpecEnr (specific enrichment) and describe a method that discovers robust and accurate candidate biomarkers from flow cytometry data. Our approach identifies a new class of candidate biomarkers that we define as driver cell populations: populations whose abundance is associated with a sample class (e.g. disease), but not as a result of a change in a related population. We show that the driver cell populations we find are also easily interpretable using a lattice-based visualization tool. Our method is implemented in the R package flowGraph, which is freely available on GitHub (github.com/aya49/flowGraph) and will be available on Bioconductor.


2015 ◽  
Vol 89 (1) ◽  
pp. 71-88 ◽  
Author(s):  
Chiaowen Hsiao ◽  
Mengya Liu ◽  
Rick Stanton ◽  
Monnie McGee ◽  
Yu Qian ◽  
...  

2019 ◽  
Author(s):  
Kodai Minoura ◽  
Ko Abe ◽  
Yuka Maeda ◽  
Hiroyoshi Nishikawa ◽  
Teppei Shimamura

Abstract. Motivation: Modern flow cytometry technology has enabled the simultaneous analysis of multiple cell markers at the single-cell level, and it is widely used across a broad range of research fields. The detection of cell populations in flow cytometry data has long depended on “manual gating” by visual inspection. Recently, numerous software tools have been developed for automatic, computationally guided detection of cell populations; however, they are not designed for time-series flow cytometry data. Time-series flow cytometry data are indispensable for investigating the dynamics of cell populations that cannot be elucidated by static time-point analysis. Therefore, there is a great need for tools to systematically analyze time-series flow cytometry data. Results: We propose a simple and efficient statistical framework, named CYBERTRACK (CYtometry-Based Estimation and Reasoning for TRACKing cell populations), to perform clustering and cell population tracking for time-series flow cytometry data. CYBERTRACK assumes that flow cytometry data are generated from a multivariate Gaussian mixture distribution whose mixture proportion at the current time point depends on that at the previous time point. Using simulation data, we evaluate the performance of CYBERTRACK when estimating parameters for a multivariate Gaussian mixture distribution, tracking time-dependent transitions of mixture proportions, and detecting change-points in the overall mixture proportion. The CYBERTRACK performance is validated using two real flow cytometry datasets, which demonstrate that the population dynamics detected by CYBERTRACK are consistent with our prior knowledge of lymphocyte behavior. Conclusions: Our results indicate that CYBERTRACK offers cytometry users a better understanding of time-dependent cell population dynamics by systematically analyzing time-series flow cytometry data.
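The idea of mixture proportions that depend on the previous time point can be illustrated with a small sketch. This is not the CYBERTRACK algorithm: it is a one-dimensional toy that holds the component means and standard deviations fixed, re-estimates only the mixture proportions at each time point with EM, and uses the previous estimate as pseudo-counts. The function names, the `prior_weight` parameter, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def estimate_proportions(xs, means, sds, prior, prior_weight=10.0, n_iter=50):
    """EM for mixture proportions only (means and sds are held fixed).

    `prior` is the previous time point's proportions. It is treated as
    `prior_weight` pseudo-observations that pull the new estimate toward it
    (an illustrative smoothing choice, not taken from the source).
    """
    k = len(means)
    pi = list(prior)
    for _ in range(n_iter):
        counts = [0.0] * k
        for x in xs:
            # E-step: responsibility of each component for this event.
            dens = [pi[c] * normal_pdf(x, means[c], sds[c]) for c in range(k)]
            total = sum(dens)
            for c in range(k):
                counts[c] += dens[c] / total
        # M-step: blend expected counts with the pseudo-counts from the prior.
        pi = [(counts[c] + prior_weight * prior[c]) / (len(xs) + prior_weight)
              for c in range(k)]
    return pi


# Simulated time series: the first population shrinks from 80% to 20%.
rng = random.Random(1)
means, sds = [0.0, 4.0], [1.0, 1.0]
true_props = [0.8, 0.5, 0.2]

pi = [0.5, 0.5]  # uninformative starting proportions
tracked = []
for p in true_props:
    xs = [rng.gauss(means[0], sds[0]) if rng.random() < p
          else rng.gauss(means[1], sds[1]) for _ in range(400)]
    pi = estimate_proportions(xs, means, sds, prior=pi)
    tracked.append(pi[0])

print([round(t, 2) for t in tracked])
```

The tracked proportion of the first component should fall across the three time points, mirroring the simulated shift. A full treatment would also estimate the component parameters and work with multivariate data, as the abstract describes.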


2016 ◽  
Vol 60 ◽  
pp. 1029-1040 ◽  
Author(s):  
Michael Reiter ◽  
Paolo Rota ◽  
Florian Kleber ◽  
Markus Diem ◽  
Stefanie Groeneveld-Krentz ◽  
...  

2017 ◽  
Author(s):  
Alexandra J. Lee ◽  
Ivan Chang ◽  
Julie G. Burel ◽  
Cecilia S. Lindestam Arlehamn ◽  
Daniela Weiskopf ◽  
...  

Abstract: Computational methods for identification of cell populations from high-dimensional flow cytometry data are changing the paradigm of cytometry bioinformatics. Data clustering is the most common computational approach to unsupervised identification of cell populations from multidimensional cytometry data. We found that combining recursive filtering and clustering with constraints converted from the user's manual gating strategy can effectively identify overlapping and rare cell populations from smeared data that would have been difficult to resolve by either a single run of data clustering or manual segregation. We named this new method DAFi: Directed Automated Filtering and Identification of cell populations. The design of DAFi preserves the data-driven characteristics of unsupervised clustering for identifying novel cell-based biomarkers, but also makes the results interpretable to experimental scientists, as in supervised classification, by mapping and merging the high-dimensional data clusters into the user-defined 2D gating hierarchy. By recursively filtering data before clustering, DAFi can uncover small local clusters that are otherwise difficult to identify because of statistical interference from irrelevant major clusters. Quantitative assessment of cell-type-specific characteristics demonstrates that the population proportions calculated by DAFi, while highly consistent with those from expert centralized manual gating, have smaller technical variance than those from individual manual gating analysis. Visual examination of the dot plots showed that the boundaries of the DAFi-identified cell populations followed the natural shapes of the data distributions. To further exemplify the utility of DAFi, we show that DAFi can incorporate the FLOCK clustering method to identify novel cell-based biomarkers. The implementation of DAFi supports options including clustering, bisecting, slope-based gating, and reversed filtering to meet auto-gating needs from different scientific use cases.


2020 ◽  
Author(s):  
Paul D. Simonson ◽  
Yue Wu ◽  
David Wu ◽  
Jonathan R. Fromm ◽  
Aaron Y. Lee

Abstract. Objectives: Automated classification of flow cytometry data has the potential to reduce errors and accelerate flow cytometry interpretation. We sought a machine learning approach that is accurate, intuitively easy to understand, and highlights the cells that are most important in the algorithm’s prediction for a given case. Methods: We developed an ensemble of convolutional neural networks (CNNs) for classification and visualization of impactful cell populations in detecting classic Hodgkin lymphoma, using two-dimensional (2D) histograms. Data from 977 and 245 clinical flow cytometry cases were used for training and testing, respectively. For each flow cytometry file, 78 non-gated 2D histograms were created. SHAP values were calculated to determine the most impactful 2D histograms and regions within the histograms. The SHAP values from all 78 histograms were then projected back to the original cell data for gating and visualization using standard flow cytometry software. Results: The algorithm achieved 67.7% recall (sensitivity), 82.4% precision, and 0.92 AUROC. Visualization of the important cell populations in making individual predictions demonstrated correlations with known biology. Conclusions: The method presented enables model explainability while highlighting important cell populations in individual flow cytometry specimens, with potential applications in both diagnosis and discovery of previously overlooked key cell populations.
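As a rough illustration of the histogram step, the sketch below bins every pair of measurement channels into a 2D count grid. The abstract does not state how many channels were measured. The figure of 13 used here is an assumption, chosen only because 13 channels yield exactly 78 unordered pairs. The function name, bin count, value range, and random events are likewise hypothetical.

```python
from itertools import combinations
import random


def pairwise_histograms(events, n_channels, bins=8, lo=0.0, hi=1.0):
    """Return {(i, j): bins x bins count grid} for every channel pair i < j."""
    width = (hi - lo) / bins
    hists = {}
    for i, j in combinations(range(n_channels), 2):
        grid = [[0] * bins for _ in range(bins)]
        for ev in events:
            # Clamp each value into a valid bin index.
            bi = min(bins - 1, max(0, int((ev[i] - lo) / width)))
            bj = min(bins - 1, max(0, int((ev[j] - lo) / width)))
            grid[bi][bj] += 1
        hists[(i, j)] = grid
    return hists


# Hypothetical data: 500 events with 13 channels scaled to [0, 1).
rng = random.Random(0)
events = [[rng.random() for _ in range(13)] for _ in range(500)]

hists = pairwise_histograms(events, n_channels=13)
print(len(hists))  # 78 channel pairs
```

Each grid could then serve as a single-channel image input to a CNN. Because every event maps to exactly one cell in each grid, per-cell importance scores can in principle be traced back to the events that fall in that cell.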


2014 ◽  
Vol 85 (5) ◽  
pp. 408-421 ◽  
Author(s):  
Iftekhar Naim ◽  
Suprakash Datta ◽  
Jonathan Rebhahn ◽  
James S. Cavenaugh ◽  
Tim R. Mosmann ◽  
...  
