scholarly journals A Physician-in-the-Loop Approach by Means of Machine Learning for the Diagnosis of Lymphocytosis in the Clinical Laboratory

Author(s):  
Laura Bigorra ◽  
Iciar Larriba ◽  
Ricardo Gutiérrez-Gallego

Context.— The goal of the lymphocytosis diagnosis approach is its classification into benign or neoplastic categories. Nevertheless, a nonnegligible percentage of laboratories fail in that classification. Objective.— To design and develop a machine learning model by using objective data from the DxH 800 analyzer, including cell population data, leukocyte and absolute lymphoid counts, hemoglobin concentration, and platelet counts, besides age and sex, with classification purposes for lymphocytosis diagnosis. Design.— A total of 1565 samples were included from 10 different lymphoid categories grouped into 4 diagnostic categories: normal controls (458), benign causes of lymphocytosis (567), neoplastic lymphocytosis (399), and spurious causes of lymphocytosis (141). The data set was distributed in a 60-20-20 scheme for training, testing, and validation stages. Six machine learning models were built and compared, and the selection of the final model was based on the minimum generalization error and 10-fold cross validation accuracy. Results.— The selected neural network classifier rendered a global 10-class classification validation accuracy corresponding to 89.9%, which, considering the aforementioned 4 diagnostic categories, presented a diagnostic impact accuracy corresponding to 95.8%. Finally, a prospective proof of concept was performed with 100 new cases with a global diagnostic accuracy corresponding to 91%. Conclusions.— The proposed machine learning model was feasible, with a high benefit-cost ratio, as the results were obtained within the complete blood count with differential. Finally, the diagnostic impact with high accuracies in both model validation and proof of concept encourages exploration of the model for real-world application on a daily basis.

Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This proposed work is used to develop an improved and robust machine learning model for predicting Myocardial Infarction (MI) could have substantial clinical impact. Objectives: This paper explains how to build machine learning based computer-aided analysis system for an early and accurate prediction of Myocardial Infarction (MI) which utilizes framingham heart study dataset for validation and evaluation. This proposed computer-aided analysis model will support medical professionals to predict myocardial infarction proficiently. Methods: The proposed model utilize the mean imputation to remove the missing values from the data set, then applied principal component analysis to extract the optimal features from the data set to enhance the performance of the classifiers. After PCA, the reduced features are partitioned into training dataset and testing dataset where 70% of the training dataset are given as an input to the four well-liked classifiers as support vector machine, k-nearest neighbor, logistic regression and decision tree to train the classifiers and 30% of test dataset is used to evaluate an output of machine learning model using performance metrics as confusion matrix, classifier accuracy, precision, sensitivity, F1-score, AUC-ROC curve. Results: Output of the classifiers are evaluated using performance measures and we observed that logistic regression provides high accuracy than K-NN, SVM, decision tree classifiers and PCA performs sound as a good feature extraction method to enhance the performance of proposed model. From these analyses, we conclude that logistic regression having good mean accuracy level and standard deviation accuracy compared with the other three algorithms. AUC-ROC curve of the proposed classifiers is analyzed from the output figure.4, figure.5 that logistic regression exhibits good AUC-ROC score, i.e. around 70% compared to k-NN and decision tree algorithm. Conclusion: From the result analysis, we infer that this proposed machine learning model will act as an optimal decision making system to predict the acute myocardial infarction at an early stage than an existing machine learning based prediction models and it is capable to predict the presence of an acute myocardial Infarction with human using the heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent the heart disease.


2021 ◽  
Author(s):  
Junjie Shi ◽  
Jiang Bian ◽  
Jakob Richter ◽  
Kuan-Hsun Chen ◽  
Jörg Rahnenführer ◽  
...  

AbstractThe predictive performance of a machine learning model highly depends on the corresponding hyper-parameter setting. Hence, hyper-parameter tuning is often indispensable. Normally such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low bandwidth connections it reduces the time available for tuning. Model-Based Optimization (MBO) is one state-of-the-art method for tuning hyper-parameters but the application on distributed machine learning models or federated learning lacks research. This work proposes a framework $$\textit{MODES}$$ MODES that allows to deploy MBO on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) $$\textit{MODES}$$ MODES -B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) $$\textit{MODES}$$ MODES -I considers all models as clones of the same black box which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate $$\textit{MODES}$$ MODES by conducting experiments on the optimization for the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy ($$\textit{MODES}$$ MODES -B), run-time efficiency ($$\textit{MODES}$$ MODES -I), and statistical stability for both modes, $$\textit{MODES}$$ MODES outperforms the baseline, i.e., carry out tuning with MBO on each node individually with its local sub-data set.


2020 ◽  
Vol 6 ◽  
Author(s):  
Jaime de Miguel Rodríguez ◽  
Maria Eugenia Villafañe ◽  
Luka Piškorec ◽  
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.


Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2102
Author(s):  
Eyal Klang ◽  
Robert Freeman ◽  
Matthew A. Levin ◽  
Shelly Soffer ◽  
Yiftach Barash ◽  
...  

Background & Aims: We aimed at identifying specific emergency department (ED) risk factors for developing complicated acute diverticulitis (AD) and evaluate a machine learning model (ML) for predicting complicated AD. Methods: We analyzed data retrieved from unselected consecutive large bowel AD patients from five hospitals from the Mount Sinai health system, NY. The study time frame was from January 2011 through March 2021. Data were used to train and evaluate a gradient-boosting machine learning model to identify patients with complicated diverticulitis, defined as a need for invasive intervention or in-hospital mortality. The model was trained and evaluated on data from four hospitals and externally validated on held-out data from the fifth hospital. Results: The final cohort included 4997 AD visits. Of them, 129 (2.9%) visits had complicated diverticulitis. Patients with complicated diverticulitis were more likely to be men, black, and arrive by ambulance. Regarding laboratory values, patients with complicated diverticulitis had higher levels of absolute neutrophils (AUC 0.73), higher white blood cells (AUC 0.70), platelet count (AUC 0.68) and lactate (AUC 0.61), and lower levels of albumin (AUC 0.69), chloride (AUC 0.64), and sodium (AUC 0.61). In the external validation cohort, the ML model showed AUC 0.85 (95% CI 0.78–0.91) for predicting complicated diverticulitis. For Youden’s index, the model showed a sensitivity of 88% with a false positive rate of 1:3.6. Conclusions: A ML model trained on clinical measures provides a proof of concept performance in predicting complications in patients presenting to the ED with AD. Clinically, it implies that a ML model may classify low-risk patients to be discharged from the ED for further treatment under an ambulatory setting.


2021 ◽  
Author(s):  
Eric Sonny Mathew ◽  
Moussa Tembely ◽  
Waleed AlAmeri ◽  
Emad W. Al-Shalabi ◽  
Abdul Ravoof Shaik

Abstract A meticulous interpretation of steady-state or unsteady-state relative permeability (Kr) experimental data is required to determine a complete set of Kr curves. In this work, three different machine learning models was developed to assist in a faster estimation of these curves from steady-state drainage coreflooding experimental runs. The three different models that were tested and compared were extreme gradient boosting (XGB), deep neural network (DNN) and recurrent neural network (RNN) algorithms. Based on existing mathematical models, a leading edge framework was developed where a large database of Kr and Pc curves were generated. This database was used to perform thousands of coreflood simulation runs representing oil-water drainage steady-state experiments. The results obtained from these simulation runs, mainly pressure drop along with other conventional core analysis data, were utilized to estimate Kr curves based on Darcy's law. These analytically estimated Kr curves along with the previously generated Pc curves were fed as features into the machine learning model. The entire data set was split into 80% for training and 20% for testing. K-fold cross validation technique was applied to increase the model accuracy by splitting the 80% of the training data into 10 folds. In this manner, for each of the 10 experiments, 9 folds were used for training and the remaining one was used for model validation. Once the model is trained and validated, it was subjected to blind testing on the remaining 20% of the data set. The machine learning model learns to capture fluid flow behavior inside the core from the training dataset. The trained/tested model was thereby employed to estimate Kr curves based on available experimental results. The performance of the developed model was assessed using the values of the coefficient of determination (R2) along with the loss calculated during training/validation of the model. The respective cross plots along with comparisons of ground-truth versus AI predicted curves indicate that the model is capable of making accurate predictions with error percentage between 0.2 and 0.6% on history matching experimental data for all the three tested ML techniques (XGB, DNN, and RNN). This implies that the AI-based model exhibits better efficiency and reliability in determining Kr curves when compared to conventional methods. The results also include a comparison between classical machine learning approaches, shallow and deep neural networks in terms of accuracy in predicting the final Kr curves. The various models discussed in this research work currently focusses on the prediction of Kr curves for drainage steady-state experiments; however, the work can be extended to capture the imbibition cycle as well.


2017 ◽  
Vol 62 (10) ◽  
pp. 2719-2727 ◽  
Author(s):  
Mark C. Hornbrook ◽  
Ran Goshen ◽  
Eran Choman ◽  
Maureen O’Keeffe-Rosetti ◽  
Yaron Kinar ◽  
...  

Author(s):  
Dr. Kalaivazhi Vijayaragavan ◽  
S. Prakathi ◽  
S. Rajalakshmi ◽  
M Sandhiya

Machine learning is a subfield of artificial intelligence, which is learning algorithms to make decision-based on data and try to behave like a human being. Classification is one of the most fundamental concepts in machine learning. It is a process of recognizing, understanding, and grouping ideas and objects into pre-set categories or sub-populations. Using precategorized training datasets, machine learning concept use variety of algorithms to classify the future datasets into categories. Classification algorithms use input training data in machine learning to predict the subsequent data that fall into one of the predetermined categories. To improve the classification accuracy design of neural network is regarded as effective model to obtain better accuracy. However, design of neural network is usually consider scaling layer, perceptron layers and probabilistic layer. In this paper, an enhanced model selection can be evaluated with training and testing strategy. Further, the classification accuracy can be predicted. Finally by using two popular machine learning frameworks: PyTorch and Tensor Flow the prediction of classification accuracy is compared. Results demonstrate that the proposed method can predict with more accuracy. After the deployment of our machine learning model the performance of the model has been evaluated with the help of iris data set.


Sign in / Sign up

Export Citation Format

Share Document