Supervised machine learning for power and bandwidth management in very high throughput satellite systems

Author(s):  
Flor G. Ortiz‐Gómez ◽  
Daniele Tarchi ◽  
Ramón Martínez ◽  
Alessandro Vanelli‐Coralli ◽  
Miguel A. Salas‐Natera ◽  
...  
Blood ◽  
2020 ◽  
Vol 136 (Supplement 1) ◽  
pp. 45-46
Author(s):  
Christian Pohlkamp ◽  
Kapil Jhalani ◽  
Niroshan Nadarajah ◽  
Inseok Heo ◽  
William Wetton ◽  
...  

Background: Cytomorphology is the gold standard for quick assessment of peripheral blood and bone marrow samples in hematological neoplasms. It is a broadly accepted method for orchestrating more specific diagnostics including immunophenotyping or genetics. Inter-/intra-observer reproducibility of single cell classification is only 75 to 90%. Only a limited number of cells (100 - 500 cells/smear) is read in a time-consuming procedure. Machine learning (ML) is more reliable where human skills are limited, i.e. in handling large amounts of data or images. We here tested ML to differentiate peripheral blood leukocytes in a high throughput hematology laboratory. Aim: To establish an ML-based cell classifier capable of identifying healthy and pathologic cells in digitalized peripheral blood smear scans at an accuracy competitive with or outperforming human expert level. Methods: We selected >2,600 smears out of our unique archive of >250,000 peripheral blood smears from hematological neoplasms. Depending on quality, we scanned up to 1,000 single cell images per smear. For image acquisition, a Metafer Scanning System (Zeiss Axio Imager.Z2 microscope, automatic slide feeder and automatic oiling device) from MetaSystems (Altlussheim, GER) was used. Areas of interest were defined by pre-scan in 10x magnification followed by high resolution scan in 40x to generate cell images for analysis. Average capture times for 300/500 cells were 3:43/4:37 min. We set up a supervised ML model using colour images (144x144 pixels) as input, outputting predicted probabilities of 21 predefined classes. We used ImageNet-pretrained Xception as our base model. We trained, evaluated and deployed the model using Amazon SageMaker on a subset of 82,974 images randomly selected from 514,183 cells captured and labelled for this study. 20 different cell types and one garbage class were classified.
We included cell type categories reflecting the critical importance of detecting rare leukemia subtypes (e.g. APL). Numbers of images from the respective 21 classes ranged from 1,830 to 14,909 (median: 2,945). Minority classes were up-sampled to handle imbalances. Each picture was labelled by highly skilled technicians (median years practicing in this laboratory: 5) and two independent hematologists (median years at microscope: 20). Results: On a separate test set of 8,297 cells, our classifier was able to predict any of the five cell types occurring in the peripheral blood of healthy individuals (PMN, lymphocytes, monocytes, eosinophils, basophils) at very high median accuracy (97.0%). Median prediction accuracy of 15 rare or pathological cell types was 91.3%. For six critical pathological cell forms (myeloblasts, atypical/bilobulated promyelocytes in APL/APLv, hairy cells, lymphoma cells, plasma cells), median accuracy was 93.4% (sensitivity 93.8%). We saw a very high "T98 accuracy" for these cell types (98.5%), which is the accuracy of cell type predictions with prediction probability >0.98 (achieved in 2231/2417 cases), implying that critical cells predicted with probability <0.98 should be flagged for human expert validation with priority. For all 21 classes median accuracy was 91.7%. Accuracy was lower for cells representing consecutive steps of maturation, e.g. promyelo-/myelo-/metamyelocytes, reproducing inconsistencies from the human-built phenotypic classification system (see Figure). Conclusions: We demonstrate an automated workflow using automatic microscopic cell capturing and ML-driven cell differentiation in samples of hematologic patients. Reproducibility, accuracy, sensitivity and specificity are above 90%, for many cell types above 98%. By flagging suspicious cells for human validation, this tool can support even experienced hematology professionals, especially in detecting rare cell types.
Given an appropriate scanning speed, it clearly outperforms human investigators in terms of examination time and number of differentiated cells. An ML-based intelligence can make its skills accessible to hematology laboratories on site or after upload of scanned cell images, independent of time/location. A cloud-based infrastructure is available. A prospective head-to-head challenge between the ML-based classifier and human experts, comparing sensitivity and accuracy for detection of all cell classes in peripheral blood, will be conducted to prove suitability for routine use (NCT 4466059). Disclosures: Heo: AWS: Current Employment. Wetton: AWS: Current Employment. Drescher: MetaSystems: Current Employment. Hänselmann: MetaSystems: Current Employment. Lörch: MetaSystems: Current equity holder in private company.
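The "T98 accuracy" described in this abstract (accuracy restricted to predictions whose top probability exceeds 0.98, with everything else flagged for expert review) can be illustrated with a minimal sketch. The function name, its return shape, and the toy data are assumptions made for illustration; this is not the authors' code.

```python
from typing import List, Sequence, Tuple


def confident_accuracy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.98,
) -> Tuple[float, int, List[int]]:
    """Accuracy over predictions whose top class probability exceeds `threshold`.

    Returns (accuracy, number of confident predictions, indices flagged for
    human review). The names and return shape are hypothetical.
    """
    correct = 0
    confident = 0
    flagged: List[int] = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        top = max(range(len(p)), key=p.__getitem__)  # index of most probable class
        if p[top] > threshold:
            confident += 1
            correct += int(top == y)
        else:
            flagged.append(i)  # low-confidence cell: send to an expert
    accuracy = correct / confident if confident else float("nan")
    return accuracy, confident, flagged


# Toy example with two classes (made-up numbers, not study data).
toy_probs = [[0.99, 0.01], [0.60, 0.40], [0.005, 0.995], [0.985, 0.015]]
toy_labels = [0, 0, 1, 1]
acc, n_confident, to_review = confident_accuracy(toy_probs, toy_labels)
```

In this toy case three of the four predictions clear the threshold and the remaining one is queued for review, which mirrors the flagging workflow the abstract proposes.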


2020 ◽  
Vol 1 (2) ◽  
pp. 1-4
Author(s):  
Priyam Guha ◽  
Abhishek Mukherjee ◽  
Abhishek Verma

This research paper deals with using supervised machine learning algorithms to detect the authenticity of bank notes. We achieved very high accuracy (of the order of 99%) by applying data preprocessing steps and then running the processed data through supervised learning algorithms such as SVM, Decision Trees, Logistic Regression, and KNN. We then analyze the misclassified points, examining the confusion matrix to find out which algorithms produced more false positives and which produced more false negatives.
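The confusion-matrix comparison mentioned above can be sketched as follows. This is a plain-Python illustration under stated assumptions: the label encoding (1 as the "positive" class) and the function name are hypothetical, and the toy labels are not the paper's data.

```python
from typing import Dict, Sequence


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, int]:
    """Count true/false positives and negatives for a binary classifier.

    `positive` marks which label is treated as the positive class; which
    banknote class that corresponds to is an assumption of this sketch.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts


# Toy labels only (not the paper's data).
truth = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
counts = confusion_counts(truth, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(truth)
```

Running the same function on each model's predictions gives the false-positive and false-negative counts side by side, which is the comparison the abstract describes.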


2022 ◽  
Vol 6 (1) ◽  
Author(s):  
Vahid Attari ◽  
Raymundo Arroyave

Computational methods are increasingly being incorporated into the exploitation of microstructure–property relationships for microstructure-sensitive design of materials. In the present work, we propose non-intrusive materials informatics methods for the high-throughput exploration and analysis of a synthetic microstructure space using a machine learning-reinforced multi-phase-field modeling scheme. We specifically study the interface energy space as one of the most uncertain inputs in phase-field modeling and its impact on the shape and contact angle of a growing phase during heterogeneous solidification of a secondary phase between solid and liquid phases. We evaluate and discuss methods for the study of sensitivity and propagation of uncertainty in these input parameters as reflected in the shape of the Cu6Sn5 intermetallic during growth over the Cu substrate inside the liquid Sn solder due to uncertain interface energies. The sensitivity results rank σSI, σIL, and σIL, respectively, as the most influential parameters on the shape of the intermetallic. Furthermore, we use a variational autoencoder, a deep generative neural network method, and label spreading, a semi-supervised machine learning method, for establishing correlations between inputs and outputs of the computational model. We clustered the microstructures into three categories (“wetting”, “dewetting”, and “invariant”) using the label spreading method and compared the result with the trend observed in the Young-Laplace equation. In addition, a structure map in the interface energy space is developed that shows σSI and σSL alter the shape of the intermetallic synchronously, where an increase in the latter and a decrease in the former changes the shape from dewetting structures to wetting structures. The study shows that the machine learning-reinforced phase-field method is a convenient approach to analyze the microstructure design space in the framework of ICME.


2022 ◽  
Vol 12 ◽  
Author(s):  
Jana Ebersbach ◽  
Nazifa Azam Khan ◽  
Ian McQuillan ◽  
Erin E. Higgins ◽  
Kyla Horner ◽  
...  

Phenotyping is considered a significant bottleneck impeding fast and efficient crop improvement. Similar to many crops, Brassica napus, an internationally important oilseed crop, suffers from low genetic diversity, and will require exploitation of diverse genetic resources to develop locally adapted, high yielding and stress resistant cultivars. A pilot study was completed to assess the feasibility of using indoor high-throughput phenotyping (HTP), semi-automated image processing, and machine learning to capture the phenotypic diversity of agronomically important traits in a diverse B. napus breeding population, SKBnNAM, introduced here for the first time. The experiment comprised 50 spring-type B. napus lines, grown and phenotyped in six replicates under two treatment conditions (control and drought) over 38 days in a LemnaTec Scanalyzer 3D facility. Growth traits including plant height, width, projected leaf area, and estimated biovolume were extracted and derived through processing of RGB and NIR images. Anthesis was automatically and accurately scored (97% accuracy) and the number of flowers per plant and day was approximated alongside relevant canopy traits (width, angle). Further, supervised machine learning was used to predict the total number of raceme branches from flower attributes with 91% accuracy (linear regression and Huber regression algorithms) and to identify mild drought stress, a complex trait which typically has to be empirically scored (0.85 area under the receiver operating characteristic curve, random forest classifier algorithm). The study demonstrates the potential of HTP, image processing and computer vision for effective characterization of agronomic trait diversity in B. napus, although limitations of the platform did create significant variation that limited the utility of the data. However, the results underscore the value of machine learning for phenotyping studies, particularly for complex traits such as drought stress resistance.
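The area under the receiver operating characteristic curve reported for the drought-stress classifier can be read as the probability that a randomly chosen stressed plant receives a higher score than a randomly chosen control plant. A minimal pure-Python sketch of that pairwise definition follows; the function name and the toy scores are assumptions for illustration and do not come from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of positive (1) and negative (0) scores.

    Each positive/negative pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise; the AUC is the mean over all pairs.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy classifier scores (made up): 1 = drought-stressed, 0 = control.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
auc = roc_auc(toy_scores, toy_labels)
```

The quadratic loop is fine for a sketch; for large datasets a rank-based formulation would be the usual choice.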


Author(s):  
Ramon Mata Calvo ◽  
Tomaso de Cola ◽  
Juraj Poliak ◽  
Luca Macri ◽  
Arled Papa ◽  
...  

2020 ◽  
Author(s):  
Lucas C. Wheeler ◽  
Arden Perkins ◽  
Caitlyn E. Wong ◽  
Michael J. Harms

Many proteins interact with short linear regions of target proteins. For some proteins, however, it is difficult to identify a well-defined sequence motif that characterizes their target peptides. To overcome this difficulty, we used supervised machine learning to train a model that treats each peptide as a collection of easily-calculated biochemical features rather than as an amino acid sequence. As a test case, we dissected the peptide-recognition rules for human S100A5 (hA5), a low-specificity calcium binding protein. We trained a Random Forest model against a recently released, high-throughput phage display dataset collected for hA5. The model identifies hydrophobicity and shape complementarity, rather than polar contacts, as the primary determinants of peptide binding specificity in hA5. We tested this hypothesis by solving a crystal structure of hA5 and through computational docking studies of diverse peptides onto hA5. These structural studies revealed that peptides exhibit multiple binding modes at the hA5 peptide interface, all of which have few polar contacts with hA5. Finally, we used our trained model to predict new, plausible binding targets in the human proteome. This revealed that a fragment of the protein α-1-syntrophin binds to hA5. Our work helps to better understand the biochemistry and biology of hA5, and demonstrates how high-throughput experiments coupled with machine learning of biochemical features can reveal the determinants of binding specificity in low-specificity proteins.
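To make the idea of treating a peptide as "a collection of easily-calculated biochemical features" concrete, here is a small sketch that turns a sequence into a fixed-length feature dictionary. The abstract does not list the features the authors used; the three below (length, mean Kyte-Doolittle hydropathy, and a crude net charge that ignores histidine and the termini) are assumptions chosen for illustration, and the function name is hypothetical.

```python
from typing import Dict

# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(sequence: str) -> Dict[str, float]:
    """Map a peptide to simple numeric features (illustrative choice only)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
    # Crude net charge: +1 for K/R, -1 for D/E (assumption: ignores H and termini).
    charge = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return {
        "length": float(len(seq)),
        "mean_hydropathy": hydropathy,
        "net_charge": float(charge),
    }


# Example with an arbitrary sequence (not taken from the study).
features = peptide_features("KKEAL")
```

Feature vectors of this kind can be fed to any tabular classifier, such as the Random Forest named in the abstract, without the model ever seeing the raw sequence.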

