A machine learning approach for single cell interphase cell cycle staging

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hemaxi Narotamo ◽  
Maria Sofia Fernandes ◽  
Ana Margarida Moreira ◽  
Soraia Melo ◽  
Raquel Seruca ◽  
...  

Abstract. The cell nucleus is a tightly regulated organelle, and its architectural structure is dynamically orchestrated to maintain normal cell function. Indeed, fluctuations in nuclear size and shape are known to occur during the cell cycle, and alterations in nuclear morphology are also hallmarks of many diseases, including cancer. Regrettably, reliable automated tools for cell cycle staging at the single-cell level using in situ images are still limited. It is therefore urgent to establish accurate strategies combining bioimaging with high-content image analysis for a bona fide classification. In this study, we developed a supervised machine learning method for interphase cell cycle staging of individual adherent cells using in situ fluorescence images of nuclei stained with DAPI. A Support Vector Machine (SVM) classifier operated over normalized nuclear features using more than 3500 DAPI-stained nuclei. Molecular ground truth labels were obtained by automatic image processing using fluorescent ubiquitination-based cell cycle indicator (Fucci) technology. An average F1-score of 87.7% was achieved with this framework. Furthermore, the method was validated on distinct cell types, reaching recall values higher than 89%. Our method is a robust approach to identify cells in G1 or S/G2 at the individual level, with implications for research and clinical applications.
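The abstract reports an average F1-score without saying how the average was taken. As a minimal sketch, assuming the average is an unweighted mean over the two classes (G1 and S/G2), the metric could be computed as follows; the function names and the toy labels are illustrative and do not come from the paper.

```python
def f1_for_class(y_true, y_pred, label):
    """F1-score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels, made up for illustration only.
y_true = ["G1", "G1", "G1", "S/G2", "S/G2", "S/G2"]
y_pred = ["G1", "G1", "S/G2", "S/G2", "S/G2", "G1"]
print(round(macro_f1(y_true, y_pred), 3))
```

If the authors instead averaged over cross-validation folds or weighted classes by support, the aggregation step would differ while the per-class formula stays the same.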

2020 ◽  
Vol 10 (18) ◽  
pp. 6417 ◽  
Author(s):  
Emanuele Lattanzi ◽  
Giacomo Castellucci ◽  
Valerio Freschi

Most road accidents occur due to human fatigue, inattention, or drowsiness. Recently, machine learning technology has been successfully applied to identifying driving styles and recognizing unsafe behaviors starting from in-vehicle sensor signals such as vehicle and engine speed, throttle position, and engine load. In this work, we investigated the fusion of different external sensors, such as a gyroscope and a magnetometer, with in-vehicle sensors, to improve machine learning identification of unsafe driver behavior. Starting from those signals, we computed a set of features capable of accurately describing the behavior of the driver. A support vector machine and an artificial neural network were then trained and tested using several features calculated over more than 200 km of travel. The ground truth used to evaluate classification performance was obtained by means of an objective methodology based on the relationship between speed and the lateral and longitudinal acceleration of the vehicle. The classification results showed an average accuracy of about 88% using the SVM classifier and of about 90% using the neural network, demonstrating the potential of the proposed methodology to identify unsafe driver behaviors.


2022 ◽  
Vol 65 (1) ◽  
pp. 75-86
Author(s):  
Parth C. Upadhyay ◽  
John A. Lory ◽  
Guilherme N. DeSouza ◽  
Timotius A. P. Lagaunne ◽  
Christine M. Spinka

Highlights: A machine learning framework estimated residue cover in RGB images taken at three resolutions from 88 locations. The best results primarily used texture features, the RFE-SVM feature selection method, and the SVM classifier. Accounting for shadows and plants plus modifying and optimizing the texture features may improve performance. An automated system developed using machine learning is a viable strategy to estimate residue cover from RGB images obtained with handheld or UAV platforms.
Abstract. Maintaining plant residue on the soil surface contributes to sustainable cultivation of arable land. Applying machine learning methods to RGB images of residue could overcome the subjectivity of manual methods. The objectives of this study were to use supervised machine learning while identifying the best feature selection method, the best classifier, and the most effective image feature types for classifying residue levels in RGB imagery. Imagery was collected from 88 locations in 40 row-crop fields in five Missouri counties between early May and late June in 2018 and 2019 using a tripod-mounted camera (0.014 cm pixel⁻¹ ground sampling distance, GSD) and an unmanned aerial vehicle (UAV, 0.05 and 0.14 GSD). At each field location, 50 contiguous 0.3 × 0.2 m region-of-interest (ROI) images were extracted from the imagery, resulting in a dataset of 4,400 ROI images at each GSD. Residue percentages for ground truth were estimated using a bullseye grid method (n = 100 points) based on the 0.014 GSD images. Representative color, texture, and shape features were extracted and evaluated using four feature selection methods and two classifiers. Recursive feature elimination using support vector machine (RFE-SVM) was the best feature selection method, and the SVM classifier performed best for classifying the amount of residue as a three-class problem. The best features for this application were associated with texture, with local binary pattern (LBP) features being the most prevalent for all three GSDs. Shape features were irrelevant. The three residue classes were correctly identified with 88%, 84%, and 81% 10-fold cross-validation scores for the 2018 training data and 81%, 69%, and 65% accuracy for the 2019 testing data in decreasing resolution order. Converting image-wise data (0.014 GSD) to location residue estimates using a Bayesian model showed good agreement with the location-based ground truth (r² = 0.90). This initial assessment documents the use of RGB images to match other methods of estimating residue, with potential to replace or be used as a quality control for line-transect assessments. Keywords: Feature selection, Soil erosion, Support vector machine, Texture features, Unmanned aerial vehicle.
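Local binary pattern (LBP) features are named above as the most prevalent texture features, but the abstract does not state which LBP variant (radius, number of neighbors, uniform patterns) was used. The following is a minimal sketch of the basic 8-neighbor, radius-1 LBP on a grayscale image stored as a list of rows; the function names and the toy image are assumptions for illustration.

```python
# Neighbor offsets (row, col), clockwise starting at the top-left pixel.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code for the interior pixel at (r, c).

    A bit is set when the corresponding neighbor is >= the center pixel.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Toy 4x4 grayscale patch (made-up values).
patch = [
    [10, 12, 11, 9],
    [13, 50, 14, 8],
    [12, 11, 10, 7],
    [9, 8, 7, 6],
]
texture_feature = lbp_histogram(patch)  # would be fed to a classifier
```

The histogram, usually normalized, serves as the texture descriptor for one ROI image. Rotation-invariant or uniform-pattern variants reduce the number of bins and may be closer to what the study used.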


2018 ◽  
Vol 1 (1) ◽  
pp. 64-74 ◽  
Author(s):  
Devin Joseph Frey ◽  
Avdesh Mishra ◽  
Md Tamjidul Hoque ◽  
Mahdi Abdelguerfi ◽  
Thomas Soniat

In this work, we address the multi-class classification task of determining oyster vessel behaviors by classifying them into four classes: fishing, traveling, poling (exploring), and docked (anchored). The main purpose of this work is to automate the determination of oyster vessel behaviors using machine learning and to explore different techniques to improve the accuracy of oyster vessel behavior prediction. To apply machine learning, two important descriptors, speed and net speed, are calculated from the trajectory data recorded by a satellite communication system (Vessel Management System, VMS) attached to the vessels fishing on the public oyster grounds of Louisiana. We constructed a support vector machine (SVM) based method that employs a Radial Basis Function (RBF) kernel to accurately predict the behavior of oyster vessels. Several validation and parameter optimization techniques were used to improve the accuracy of the SVM classifier. A total of 93% of the trajectory data from a July 2013 to August 2014 dataset, consisting of 612,700 samples for which the ground truth can be obtained using a rule-based classifier, is used for validation and independent testing of our method. The results show that the proposed SVM-based method is able to correctly classify 99.99% of the 612,700 samples using 10-fold cross-validation. Furthermore, we achieved a precision of 1.00, recall of 1.00, F1-score of 1.00, and a test accuracy of 99.99% while performing an independent test using a subset of 93% of the dataset, which consists of 31,418 points.
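The 10-fold cross-validation mentioned above partitions the samples into ten folds and uses each fold once for testing. The following is a minimal sketch of the index bookkeeping only, assuming contiguous, unshuffled folds; the abstract does not say whether the authors shuffled or stratified the data, and the function name is illustrative.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Fold sizes differ by at most one sample. No shuffling is done here,
    which is an assumption of this sketch.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Each sample lands in exactly one test fold.
for train_idx, test_idx in k_fold_indices(25, k=10):
    assert not set(train_idx) & set(test_idx)
```

In practice, a classifier is trained on `train_idx` and scored on `test_idx` in each iteration, and the ten scores are averaged.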


Author(s):  
Benjamin Lutz ◽  
Dominik Kisskalt ◽  
Andreas Mayr ◽  
Daniel Regulin ◽  
Matteo Pantano ◽  
...  

Abstract. In subtractive manufacturing, differences in machinability among batches of the same material can be observed. Ignoring these deviations can potentially reduce product quality and increase manufacturing costs. To consider the influence of the material batch in process optimization models, the batch needs to be efficiently identified. Thus, a smart service is proposed for in-situ material batch identification. This service is driven by a supervised machine learning model, which analyzes the signals of the machine’s control, especially torque data, for batch classification. The proposed approach is validated by cutting experiments with five different batches of the same specified material at various cutting conditions. Using these data, multiple classification models are trained and optimized. It is shown that the investigated batches can be correctly identified with close to 90% prediction accuracy using machine learning. Of all the investigated algorithms, the best results are achieved using a Support Vector Machine, with 89.0% prediction accuracy for individual batches and 98.9% when combining batches of similar machinability.


Author(s):  
Intisar Shadeed Al-Mejibli ◽  
Jwan K. Alwan ◽  
Dhafar Hamed Abd

The support vector machine (SVM) is currently regarded as one of the supervised machine learning algorithms that provide analysis of data for classification and regression. This technique is used in many fields, such as bioinformatics, face recognition, text and hypertext categorization, generalized predictive control, and many other areas. The performance of an SVM is affected by parameters used in the training phase, and the settings of these parameters can have a profound impact on the resulting classifier. This paper investigated SVM performance as a function of the gamma parameter for the kernels used. It studied the impact of the gamma value on the efficiency of the SVM classifier using different kernels on datasets with various descriptions. The SVM classifier was implemented in Python. The kernel functions investigated are polynomial, radial basis function (RBF), and sigmoid. The UC Irvine Machine Learning Repository is the source of all the datasets used. In general, the results show an uneven effect on the classification accuracy of the three kernels across the datasets. Depending on the dataset, changing the gamma value influences the polynomial and sigmoid kernels, whereas the performance of the RBF kernel is more stable across gamma values, with its accuracy changing only slightly.
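To make the role of gamma concrete, here is a minimal sketch of the three kernels in the parameterization used by LIBSVM and scikit-learn. The abstract says only that Python was used, so the library, the `coef0` and `degree` defaults, and the function names below are assumptions.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, gamma, coef0=0.0, degree=3):
    """(gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma, coef0=0.0):
    """tanh(gamma * <x, y> + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)


# How each kernel value responds as gamma grows, for one made-up pair.
x, y = [1.0, 2.0], [3.0, 4.0]
for gamma in (0.01, 0.1, 1.0):
    print(gamma,
          rbf_kernel(x, y, gamma),
          polynomial_kernel(x, y, gamma),
          sigmoid_kernel(x, y, gamma))
```

In the RBF kernel, gamma scales the squared distance inside a bounded exponential. In the polynomial and sigmoid kernels it scales the inner product directly, so the polynomial kernel value grows as gamma raised to the power `degree`. This sketch shows only the kernel values, not the classification accuracy the paper measured.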


2018 ◽  
Vol 27 (03n04) ◽  
pp. 1840026
Author(s):  
Binlin Wu ◽  
Xin Gao ◽  
Jason Smith

Native fluorescence spectra are acquired from fresh normal and cancerous human prostate tissues. The fluorescence data are analyzed using an unsupervised machine learning algorithm, non-negative matrix factorization. The non-negative spectral components are retrieved and attributed to native fluorophores in tissue such as collagen, reduced nicotinamide adenine dinucleotide (NADH), and flavin adenine dinucleotide (FAD). The retrieved scores of the components are used to estimate the relative concentrations of native fluorophores such as NADH and FAD, as well as the redox ratio. A supervised machine learning algorithm, the support vector machine (SVM), is used to classify normal and cancerous tissue samples based on either the relative concentrations of NADH and FAD or the redox ratio alone. Statistical measures such as sensitivity, specificity, and accuracy, along with the area under the receiver operating characteristic (ROC) curve, are used to show the classification performance. Leave-one-out cross-validation is used to further evaluate the predictive performance of the SVM classifier and to avoid bias due to overfitting; the accuracy was found to be 93.3%.
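The sensitivity, specificity, and accuracy named above follow directly from the confusion-matrix counts. A minimal sketch, assuming "cancer" is the positive class and that both classes occur in the data; the function name and the toy labels are illustrative and not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive="cancer"):
    """Sensitivity, specificity, and accuracy for a two-class problem.

    Assumes at least one positive and one negative sample in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy predictions, e.g. collected one held-out sample at a time.
y_true = ["cancer", "cancer", "cancer", "normal", "normal"]
y_pred = ["cancer", "cancer", "normal", "normal", "cancer"]
metrics = binary_metrics(y_true, y_pred)
```

Under leave-one-out cross-validation, each prediction in `y_pred` would come from a model trained on all other samples, and the metrics are computed once over the pooled predictions.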


2021 ◽  
Author(s):  
Konrad Thorner ◽  
Aaron M. Zorn ◽  
Praneet Chaturvedi

Abstract. Annotation of single cells has become an important step in the single cell analysis framework. With advances in sequencing technology, thousands to millions of cells can be processed to understand the intricacies of the biological system in question. Annotation through manual curation of markers based on a priori knowledge is cumbersome given this exponential growth. There are currently ~200 computational tools available to help researchers automatically annotate single cells using supervised/unsupervised machine learning, cell type markers, or tissue-based markers from bulk RNA-seq. But with the expansion of publicly available data, there is also a need for a tool that can help integrate multiple references into a unified atlas and understand how annotations between datasets compare. Here we present ELeFHAnt: Ensemble learning for harmonization and annotation of single cells. ELeFHAnt is an easy-to-use R package that employs support vector machine and random forest algorithms together to perform three main functions: 1) CelltypeAnnotation, 2) LabelHarmonization, and 3) DeduceRelationship. CelltypeAnnotation is a function to annotate cells in a query Seurat object using a reference Seurat object with annotated cell types. LabelHarmonization can be used to integrate multiple cell atlases (references) into a unified cellular atlas with harmonized cell types. Finally, DeduceRelationship is a function that compares cell types between two scRNA-seq datasets. ELeFHAnt can be accessed from GitHub at https://github.com/praneet1988/ELeFHAnt.


2021 ◽  
Vol 9 ◽  
Author(s):  
Niraj Kushwaha ◽  
Naveen Kumar Mendola ◽  
Saptarshi Ghosh ◽  
Ajay Deep Kachhvah ◽  
Sarika Jalan

Chimera and solitary states have captivated scientists and engineers because they are peculiar dynamical states corresponding to the co-existence of coherent and incoherent dynamical evolution in coupled units in various natural and artificial systems. It has been further demonstrated that such states can be engineered in systems of coupled oscillators by suitable implementation of communication delays. Here, using supervised machine learning, we predict (a) the precise value of delay that is sufficient for engineering chimera and solitary states for a given set of system parameters, as well as (b) the intensity of incoherence for such engineered states. Thus, using a few initial data points, we generate a machine learning model that can then create a more refined phase plot, including for new parameter values. We demonstrate our results for two different examples consisting of single-layer and multi-layer networks. First, the chimera states (solitary states) are engineered by establishing delays in the neighboring links of a node (the interlayer links) in a 2-D lattice (multiplex network) of oscillators. Then, different machine learning classifiers, K-nearest neighbors (KNN), support vector machine (SVM), and multi-layer perceptron neural network (MLP-NN), are trained on the data obtained from the network models. Once a machine learning model is trained using the limited amount of data, it predicts the precise value of the critical delay as well as the intensity of incoherence for given unknown system parameter values. Testing accuracy, sensitivity, and specificity analyses reveal that the MLP-NN classifier is better suited than the KNN or SVM classifier for predicting parameter values for engineered chimera and solitary states. The technique provides an easy methodology to predict critical delay values, as well as the intensity of incoherence for that delay value, for designing an experimental setup to create solitary and chimera states.
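Of the three classifiers compared above, K-nearest neighbors is the simplest to write down. The sketch below is a generic majority-vote KNN with Euclidean distance; the feature vectors, the class labels "A" and "B", and the choice k=3 are placeholders, since the abstract does not specify how system parameters and delay values were encoded.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among the k nearest
    training points, using Euclidean distance."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional parameter points with placeholder classes.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, (0.2, 0.1)))
```

The study found the MLP-NN better suited than KNN or SVM for this task, so this sketch illustrates only the baseline.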


2015 ◽  
Vol 23 (e1) ◽  
pp. e113-e117 ◽  
Author(s):  
Jonathan Bates ◽  
Samah J Fodeh ◽  
Cynthia A Brandt ◽  
Julie A Womack

Abstract. Objective: To identify patients in a human immunodeficiency virus (HIV) study cohort who have fallen by applying supervised machine learning methods to radiology reports of the cohort. Methods: We used the Veterans Aging Cohort Study Virtual Cohort (VACS-VC), an electronic health record-based cohort of 146 530 veterans for whom radiology reports were available (N = 2 977 739). We created a reference standard of radiology reports, represented each report by a feature set of words and Unified Medical Language System concepts, and then developed several support vector machine (SVM) classifiers for falls. We compared mutual information (MI) ranking and embedded feature selection approaches. The SVM classifier with MI feature selection was chosen to classify all radiology reports in VACS-VC. Results: Our SVM classifier with MI feature selection achieved an area under the curve score of 97.04 on the test set. When applied to all the radiology reports in VACS-VC, 80 416 of these reports were classified as positive for a fall. Of these, 11 484 were associated with a fall-related external cause of injury code (E-code) and 68 932 were not, corresponding to 29 280 patients with potential fall-related injuries who could not have been found using E-codes. Discussion: Feature selection was crucial to improving the classifier’s performance. Feature selection with MI allowed us to select the number of discriminative features to use for classification, in contrast to the embedded feature selection method, in which the number of features is chosen automatically. Conclusion: Machine learning is an effective method of identifying patients who have suffered a fall. The development of this classifier supplements the clinical researcher’s toolkit and reduces dependence on under-coded structured electronic health record data.
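Mutual information ranking scores each feature by how much it tells us about the label, then keeps the top-scoring features, which is why the number of features can be chosen by the analyst. A minimal sketch for discrete (for example, word present or absent) features; the feature names, the toy data, and the function names are made up and are not from the study.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(labels)
    joint_counts = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, l), count in joint_counts.items():
        p_joint = count / n
        p_independent = (feature_counts[f] / n) * (label_counts[l] / n)
        mi += p_joint * math.log2(p_joint / p_independent)
    return mi


def rank_features(feature_columns, labels, top_k):
    """Return the names of the top_k features by mutual information."""
    ranked = sorted(
        feature_columns,
        key=lambda name: mutual_information(feature_columns[name], labels),
        reverse=True,
    )
    return ranked[:top_k]


# Toy word-presence columns for four reports (1 = word present).
columns = {"fell": [1, 1, 0, 0], "the": [1, 0, 1, 0]}
fall_label = [1, 1, 0, 0]
selected = rank_features(columns, fall_label, top_k=1)
```

Here `top_k` is the analyst-chosen number of features, in contrast to embedded selection methods where the count falls out of model fitting.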


2020 ◽  
Vol 22 (1) ◽  
Author(s):  
Niyaz Yoosuf ◽  
José Fernández Navarro ◽  
Fredrik Salmén ◽  
Patrik L. Ståhl ◽  
Carsten O. Daub

Abstract. Background: Distinguishing ductal carcinoma in situ (DCIS) from invasive ductal carcinoma (IDC) regions in clinical biopsies constitutes a diagnostic challenge. Spatial transcriptomics (ST) is an in situ capturing method that allows quantification and visualization of transcriptomes in individual tissue sections. Previous studies have shown that the transcriptomes of breast cancer samples can be studied with spatial resolution in individual tissue sections. Supervised machine learning methods have previously been used in clinical studies to predict clinical outcomes for cancer types. Methods: We used four publicly available ST breast cancer datasets from breast tissue sections annotated by pathologists as non-malignant, DCIS, or IDC. We trained and tested a machine learning method (support vector machine) based on the expert annotation as well as on automatic selection of cell types by their transcriptome profiles. Results: We identified expression signatures for expert-annotated regions (non-malignant, DCIS, and IDC) and built machine learning models. Classification results for 798 expression signature transcripts showed high agreement with the expert pathologist annotation for DCIS (100%) and IDC (96%). Extending our analysis to include all 25,179 expressed transcripts resulted in an accuracy of 99% for DCIS and 98% for IDC. Further, classification based on an automatically identified expression signature covering all ST spots of tissue sections resulted in a prediction accuracy of 95% for DCIS and 91% for IDC. Conclusions: This concept study suggests that the ST signatures learned from expert-selected breast cancer tissue sections can be used to identify breast cancer regions in whole tissue sections, including regions not trained on. Furthermore, the identified expression signatures can classify cancer regions in tissue sections not used for training with high accuracy. Expert-generated, and even automatically generated, cancer signatures from ST data might be able to classify breast cancer regions and provide clinical decision support for pathologists in the future.

