Active Machine Learning-driven Experiments on Malaria Cell Classification

Author(s):  
Mingyong Ma
2021
Vol 11 (1)
Author(s):  
Gisela Pattarone ◽  
Laura Acion ◽  
Marina Simian ◽  
Emmanuel Iarussi

Abstract
Automated cell classification in cancer biology is a challenging topic in computer vision and machine learning research. Breast cancer is the most common malignancy in women and usually involves phenotypically diverse populations of breast cancer cells and a heterogeneous stroma. In recent years, automated microscopy technologies have made it possible to study live cells over extended periods of time, simplifying the task of compiling large image databases. For instance, several studies have aimed to build machine learning systems capable of automatically classifying images of different cell types (e.g. motor neurons, stem cells). In this work we were interested in classifying breast cancer cells as live or dead, based on a set of morphological characteristics retrieved automatically with image processing techniques. Our hypothesis is that live-dead classification can be performed without any staining, using only bright-field images as input. We tackled this problem using the JIMT-1 breast cancer cell line, which grows as an adherent monolayer. First, we compiled a large image set of JIMT-1 human breast cancer cells that had been exposed to a chemotherapeutic drug treatment (doxorubicin and paclitaxel) or a vehicle control. Next, several classifiers built on well-known convolutional neural network (CNN) backbones were trained to perform supervised classification, using labels obtained from the fluorescence microscopy image associated with each bright-field image. Model performances were evaluated and compared on a large number of bright-field images. The best model reached an AUC of 0.941 when classifying untreated breast cancer cells, and an AUC of 0.978 when classifying breast cancer cells under drug treatment. Our results highlight the potential of machine learning and computational image analysis for building new diagnostic tools that benefit the biomedical field by reducing cost and time and by improving reproducibility. More importantly, we analyzed how our classifiers cluster bright-field images in the learned high-dimensional embedding and linked these groups to salient visual characteristics of live-dead cell biology observed by trained experts.
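The AUC values the abstract reports can be computed directly from classifier scores and ground-truth labels. The following is a minimal, self-contained sketch (not the authors' code; the function name and toy data are illustrative) of AUC as the Mann-Whitney rank statistic, i.e. the probability that a randomly chosen positive cell scores higher than a randomly chosen negative one:

```python
import numpy as np

def auc_score(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive example
    (e.g. a 'dead' cell) receives a higher score than a randomly
    chosen negative example (a 'live' cell); ties count as 0.5.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score against every negative score
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: scores mostly, but not perfectly, separate the classes
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(round(auc_score(y, s), 3))  # → 0.889
```

An AUC of 0.941 therefore means that for roughly 94% of live-dead pairs, the model ranks the pair correctly, regardless of any particular decision threshold.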


2021
Author(s):  
Leslie Solorzano ◽  
Lina Wik ◽  
Thomas Olsson Bontell ◽  
Yuyu Wang ◽  
Anna H. Klemm ◽  
...  

Multiplexed and spatially resolved single-cell analyses that aim to study tissue heterogeneity and cell organization invariably face cell classification as a first step. Accuracy and reproducibility are important for the downstream processes of counting cells, quantifying cell-cell interactions, and extracting information on disease-specific localized cell niches. Novel staining techniques make it possible to visualize and quantify large numbers of cell-specific molecular markers in parallel. However, due to variations in sample handling and artefacts from staining and scanning, cells of the same type may present different marker profiles both within and across samples. We address multiplexed immunofluorescence data from tissue microarrays of low-grade gliomas and present a methodology that uses two different machine learning architectures and illumination-insensitive features to perform cell classification. The fully automated cell classification provides a measure of confidence for each decision and requires a comparably small annotated dataset for training, which can be created using freely available tools. Using the proposed method, we reached an accuracy of 83.1% on cell classification without the need for sample standardization. Using our confidence measure, cells with low-confidence classifications could be excluded, pushing the classification accuracy to 94.5%. Next, we used the cell classification results to search for cell niches with an unsupervised learning approach based on graph neural networks. We show that the approach can re-detect specialized tissue niches in previously published data, and that, if applied to larger datasets, our proposed cell classification leads to niche definitions that may be relevant for sub-groups of glioma.
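The confidence-based exclusion step described above, where dropping low-confidence cells raises accuracy from 83.1% to 94.5%, trades coverage for accuracy. A hypothetical sketch of that trade-off (not the authors' pipeline; the probabilities and labels below are synthetic):

```python
import numpy as np

def accuracy_with_rejection(probs, labels, threshold):
    """Classify each cell with its argmax class, but reject (exclude)
    cells whose top class probability falls below `threshold`.

    Returns (accuracy on kept cells, fraction of cells kept).
    `probs` is an (n_cells, n_classes) array of class probabilities.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    if not keep.any():
        return float("nan"), 0.0
    preds = probs[keep].argmax(axis=1)
    acc = float((preds == labels[keep]).mean())
    return acc, float(keep.mean())

# Toy example: the only misclassified cell is also the least confident one
probs = np.array([[0.95, 0.05],   # confident, correct
                  [0.10, 0.90],   # confident, correct
                  [0.55, 0.45],   # low confidence, wrong
                  [0.85, 0.15]])  # confident, correct
labels = np.array([0, 1, 1, 0])
print(accuracy_with_rejection(probs, labels, 0.0))  # → (0.75, 1.0)
print(accuracy_with_rejection(probs, labels, 0.8))  # → (1.0, 0.75)
```

Raising the threshold improves accuracy on the retained cells exactly when classification errors concentrate among low-confidence predictions, which is the regime the paper's 83.1% → 94.5% result suggests.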


2019
Author(s):  
Zijie J. Wang ◽  
Alex J. Walsh ◽  
Melissa C. Skala ◽  
Anthony Gitter

ABSTRACT
The importance of T cells in immunotherapy has motivated the development of technologies to better characterize T cells and improve therapeutic efficacy. One specific objective is assessing antigen-induced T cell activation, because only functionally active T cells are capable of killing the desired targets. Autofluorescence imaging can distinguish the activity states of individual T cells in a non-destructive manner by detecting endogenous changes in metabolic co-enzymes such as NAD(P)H. However, recognizing robust patterns of T cell activity is computationally challenging in the absence of exogenous labels or information-rich autofluorescence lifetime measurements. We demonstrate that advanced machine learning can accurately classify T cell activity from NAD(P)H intensity images and that those image-based signatures transfer across human donors. Using a dataset of 8,260 cropped single-cell images from six donors, we evaluate multiple machine learning models. These range from traditional models that represent images with summary statistics or CellProfiler-extracted features to deep convolutional neural networks (CNNs) pre-trained on general non-biological images. Adapting pre-trained CNNs to the T cell activity classification task provides substantially better performance than traditional models or a simple CNN trained on the autofluorescence images alone. Visualizing the images with dimension reduction provides intuition into why the CNNs achieve higher accuracy than the other approaches. However, we observe that fine-tuning all layers of a pre-trained CNN does not provide a classification performance boost commensurate with the additional computational cost. Our software, including the image processing and model training pipeline, is available as Jupyter notebooks at https://github.com/gitter-lab/t-cell-classification.
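The transfer-learning setup the abstract describes, a frozen pre-trained feature extractor with a newly trained classification head, can be illustrated in miniature. The sketch below is a hedged stand-in, not the authors' pipeline: a fixed random projection plays the role of the pre-trained CNN backbone, synthetic arrays replace the NAD(P)H image crops, and all names and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Frozen "backbone": stands in for a pre-trained CNN feature extractor.
# Here it is just a fixed random projection of flattened images; in the
# paper's setting it would be convolutional layers pre-trained on
# general non-biological images.
W_backbone = rng.normal(size=(64, 16))  # 64 input pixels -> 16 features

def extract_features(images):
    """Frozen feature extraction: W_backbone is never updated."""
    z = images.reshape(len(images), -1) @ W_backbone
    return np.tanh(z / 8.0)  # scale keeps tanh out of saturation

# --- Trainable head: logistic regression on the frozen features,
# fit with plain gradient descent.
def train_head(feats, labels, lr=0.5, steps=500):
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))     # sigmoid
        grad_w = feats.T @ (p - labels) / len(labels)  # cross-entropy grads
        grad_b = (p - labels).mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic stand-in for 8x8 single-cell crops of two activity states
n = 200
labels = rng.integers(0, 2, size=n)
images = rng.normal(size=(n, 8, 8)) + labels[:, None, None] * 1.5

feats = extract_features(images)
w, b = train_head(feats, labels.astype(float))
preds = (feats @ w + b > 0).astype(int)
print("training accuracy:", (preds == labels).mean())
```

Only `w` and `b` are updated, which mirrors why training just the head is far cheaper than fine-tuning all backbone layers; the abstract's observation is that, for this task, the extra cost of full fine-tuning did not buy a commensurate accuracy gain.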

