The Contribution of CD148, CD180 and CD200 Combination in the Diagnosis of Chronic B-Cell Lymphoproliferative Disorders

Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 3520-3520
Author(s):  
Laurent Miguet ◽  
Caroline Mayeur-Rousse ◽  
Alice Eischen ◽  
Anne-Cecile Galoisy ◽  
Delphine C. M. Rolland ◽  
...  

Abstract Introduction: B-cell immunophenotype can be swiftly assessed by flow cytometry on blood samples or bone marrow aspirate specimens. It provides crucial information that is later refined with histologic, genetic and molecular features to establish an accurate diagnosis of chronic B-cell lymphoproliferative disorders (B-CLPD). Besides the Matutes score, we identified additional useful markers, i.e. CD148 and CD180, to classify mantle cell lymphoma (MCL) and marginal zone lymphoma (MZL), respectively. Furthermore, CD200 is known to be highly expressed in chronic lymphocytic leukemia (CLL) while absent in MCL. Hypothesis: The determination of CD148, CD180 and CD200 expression on B-cells by flow cytometry on blood samples and/or bone marrow aspirates could be a potent tool to accurately identify B-CLPD. We postulated the existence of the following specific expression patterns in B-CLPD: CD148 dim/CD180 dim/CD200 bright for CLL, CD148 dim/CD180 dim/CD200 dim for lymphoplasmacytic lymphoma (LPL), CD148 bright/CD180 dim/CD200 neg/dim for MCL and CD148 dim/CD180 bright/CD200 dim for MZL. Methods: In a prospective study, we investigated the expression of CD148/CD180/CD200 on B-cells from 673 patients at the time of B-CLPD diagnosis in our hospital from 2014 to 2020. We analyzed 440 blood and 233 bone marrow aspirate specimens using a BD FACSCanto II flow cytometry instrument. Based solely on CD148/CD180/CD200 specific expression patterns, we postulated a diagnosis of CLL, LPL, MCL or MZL. These postulated diagnoses were later compared with the final diagnoses when all histologic, genetic and molecular features were finalized. Sensitivity, specificity, positive and negative predictive values of the expression profiles were determined. 
In addition, to investigate the relative importance of these three CD markers, we normalized their mean fluorescence intensities (MFI) and applied several supervised machine learning algorithms including Logistic Regression, Random Forest and Light Gradient Boosting Machine (LightGBM). Results: Out of the 673 clinical samples, the CD148/CD180/CD200 expression patterns classified 212 specimens as CLL/SLL (30.8%), 160 as LPL (23.8%), 76 as MCL (11.28%) and 169 as MZL (25%). These diagnosis hypotheses were retrospectively compared to the final diagnoses based on all histologic, genetic and molecular features. The diagnosis hypotheses of CLL, LPL, MCL and MZL were consistent with the final diagnosis in 583 out of the 617 corresponding cases (94%) with high positive and negative predictive values. The characteristics of the diagnosis accuracy are detailed in the table below. HCL and FL were not further investigated as their immunophenotypes usually do not overlap with those of other B-CLPD. Seventeen out of 617 patients (17/617, 5.3%) did not display a clear CD148/CD180/CD200 pattern: 9 LPL, 4 CLL and 4 MZL. In sixteen patients (16/617, 5.0%), the diagnosis hypothesis based on this strategy was not confirmed after completion of the exploration including karyotype, MYD88 L265P mutational status, CCND1 overexpression and pathology explorations. We next investigated the relative importance of these 3 markers. We focused on MFI values of CD148, CD180 and CD200 and three categorical "positive or negative" markers (CD5, CD23, FMC7) that were assembled into a composite marker. After Box-Cox normalization of CD148, CD180 and CD200 MFIs, a set of supervised machine learning algorithms including Logistic Regression, Random Forest and Light Gradient Boosting Machine (LightGBM) was applied to the cohort of CLL, LPL, MCL and MZL. 
We established that the highest diagnosis weights were obtained for CD200 in CLL, CD200 and CD148 in MCL (negatively and positively, respectively), and CD180 in MZL. In LPL, CD148, CD180 and CD200 had the highest weights using the LightGBM and Random Forest algorithms, while Logistic Regression determined that CD5 and CD23 had the highest (negative) weights. In conclusion, the determination of CD148/CD180/CD200 surface expression patterns by flow cytometry, along with morphology, allowed us to establish an accurate diagnosis hypothesis in CLL, MCL, LPL and MZL with high positive and negative predictive values. Machine learning algorithms made it possible to measure the relative importance of these markers, which could be of great help in case of discordant expression of the main diagnosis markers. Figure 1. Disclosures: No relevant conflicts of interest to declare.
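
The normalization of MFI values described above (the Box-Cox power transform) can be illustrated with a minimal sketch. The abstract does not state which exponent was used or how it was chosen, so the `lam` parameter, the function names, and the follow-up standardization step are assumptions made for illustration only, not the authors' pipeline.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transform for strictly positive values.

    lam is the transform exponent; how the study chose it is not stated,
    so it is left as an explicit (assumed) parameter here.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        # The limit of (x**lam - 1) / lam as lam -> 0 is log(x).
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def standardize(values):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in values]


# Hypothetical MFI readings for one marker (not data from the study).
mfi = [120.0, 450.0, 3200.0, 15000.0]
normalized = standardize(box_cox(mfi, 0))
```

With `lam = 0` the transform reduces to a logarithm, a common choice for right-skewed fluorescence intensities; other exponents are handled by the same function.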

2021 ◽  
Author(s):  
Sangil Lee ◽  
Brianna Mueller ◽  
W. Nick Street ◽  
Ryan M. Carnahan

Abstract Introduction: Delirium is a cerebral dysfunction seen commonly in the acute care setting. Delirium is associated with increased mortality and morbidity and is frequently missed in the emergency department (ED) by clinical gestalt alone. Identifying those at risk of delirium may help prioritize screening and interventions. Objective: Our objective was to identify clinically valuable predictive models for prevalent delirium within the first 24 hours of hospitalization based on the available data by assessing the performance of logistic regression and a variety of machine learning models. Methods: This was a retrospective cohort study to develop and validate a predictive risk model to detect delirium using patient data obtained around an ED encounter. Data from electronic health records for patients hospitalized from the ED between January 1, 2014, and December 31, 2019, were extracted. Eligible patients were aged 65 or older, admitted to an inpatient unit from the emergency department, and had at least one DOSS assessment or CAM-ICU recorded while hospitalized. The outcome measure of this study was delirium within one day of hospitalization determined by a positive DOSS or CAM assessment. We developed the model with and without the Barthel index for activity of daily living, since this was measured after hospital admission. Results: The area under the ROC curves for delirium ranged from 0.69 to 0.77 without the Barthel index. Random forest and gradient-boosted machine showed the highest AUC of 0.77. At the 90% sensitivity threshold, gradient-boosted machine, random forest, and logistic regression achieved a specificity of 35%. 
After the Barthel index was included, random forest, gradient-boosted machine, and logistic regression models demonstrated the best predictive ability, with AUCs of 0.85 to 0.86. Conclusion: This study demonstrated the use of machine learning algorithms to identify the combination of variables that are predictive of delirium within 24 hours of hospitalization from the ED.
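
Reporting specificity at a fixed sensitivity, as in the results above, amounts to picking the highest score cut-off that still reaches the target sensitivity and reading off the specificity there. The sketch below shows one way to do this; the function name and the toy scores are hypothetical, and the study's own procedure for choosing the threshold is not described in the abstract.

```python
def specificity_at_sensitivity(scores, labels, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) for the highest cut-off
    whose sensitivity meets the target.

    A case is predicted positive when its score is >= threshold.
    Labels are 1 for the condition present and 0 for absent.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    # Lowering the threshold can only raise sensitivity and lower specificity,
    # so the first threshold that meets the target has the best specificity.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            return threshold, sensitivity, tn / negatives
    raise ValueError("target sensitivity cannot be reached")


# Toy predicted risks and outcomes (illustrative only, not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
result = specificity_at_sensitivity(scores, labels, 0.90)
```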


2021 ◽  
Vol 42 (Supplement_1) ◽  
Author(s):  
M J Espinosa Pascual ◽  
P Vaquero Martinez ◽  
V Vaquero Martinez ◽  
J Lopez Pais ◽  
B Izquierdo Coronel ◽  
...  

Abstract Introduction Out of all patients admitted with Myocardial Infarction, 10 to 15% have Myocardial Infarction with Non-Obstructive Coronary Arteries (MINOCA). Classification algorithms based on deep learning can substantially outperform traditional diagnostic algorithms. Therefore, numerous machine learning models have been proposed as useful tools for the detection of various pathologies, but to date no study has proposed a diagnostic algorithm for MINOCA. Purpose The aim of this study was to estimate the diagnostic accuracy of several machine learning algorithms (Support-Vector Machine [SVM], Random Forest [RF] and Logistic Regression [LR]) in discriminating people with MINOCA from those with Myocardial Infarction with Obstructive Coronary Artery Disease (MICAD) at the time of admission and before performing a coronary angiography, whether invasive or not. Methods A Diagnostic Test Evaluation study was carried out applying the proposed algorithms to a database of 553 consecutive patients admitted to our Hospital with Myocardial Infarction. According to the definitions of the 2016 ESC Position Paper on MINOCA, patients were classified into two groups: MICAD and MINOCA. Out of the total 553 patients, 214 were discarded due to the lack of complete data. The set of machine learning algorithms was trained on 244 patients (training sample: 75%) and tested on 80 patients (test sample: 25%). A total of 64 variables were available for each patient, including demographic, clinical and laboratory features before the angiographic procedure. Finally, the diagnostic accuracy of each model was assessed. 
Results The most accurate classification model was the Random Forest algorithm (Specificity [Sp] 0.88, Sensitivity [Se] 0.57, Negative Predictive Value [NPV] 0.93, Area Under the Curve [AUC] 0.85 [CI 0.83–0.88]), followed by the standard Logistic Regression (Sp 0.76, Se 0.57, NPV 0.92, AUC 0.74) and Support-Vector Machine (Sp 0.84, Se 0.38, NPV 0.90, AUC 0.78) (see graph). The variables that contributed the most to discriminating MINOCA from MICAD were the traditional cardiovascular risk factors, biomarkers of myocardial injury, hemoglobin and gender. Results were similar when the 19 patients with Takotsubo syndrome were excluded from the analysis. Conclusion A prediction system for diagnosing MINOCA before performing coronary angiographies was developed using machine learning algorithms. The results show higher accuracy in diagnosing MINOCA than conventional statistical methods. This study supports the potential of machine learning algorithms in clinical cardiology. However, further studies are required in order to validate our results. Funding Acknowledgement Type of funding sources: None. ROC curves of different algorithms


2021 ◽  
pp. 016555152110077
Author(s):  
Şura Genç ◽  
Elif Surer

Clickbait is a strategy that aims to attract people’s attention and direct them to specific content. Clickbait titles, created using information that is not included in the main content or using intriguing expressions with various text-related features, have become very popular, especially in social media. This study expands the Turkish clickbait dataset that we had constructed for clickbait detection in our proof-of-concept study. We reach a sample size of 48,060 by adding 8,859 tweets and release a publicly available dataset – ClickbaitTR – with its open-source data analysis library. We apply machine learning algorithms such as Artificial Neural Network (ANN), Logistic Regression, Random Forest, Long Short-Term Memory Network (LSTM), Bidirectional Long Short-Term Memory (BiLSTM) and Ensemble Classifier to 48,060 news headlines extracted from Twitter. The results show that the Logistic Regression algorithm has 85% accuracy; the Random Forest algorithm has 86% accuracy; the LSTM has 93% accuracy; the ANN has 93% accuracy; the Ensemble Classifier has 93% accuracy; and finally, the BiLSTM has 97% accuracy. A thorough discussion is provided of the psychological aspects of the clickbait strategy, focusing on curiosity and interest arousal. In addition to a successful clickbait detection performance and the detailed analysis of clickbait sentences in terms of language and psychological aspects, this study also contributes to clickbait detection studies with the largest clickbait dataset in Turkish.


2021 ◽  
Vol 4 (4) ◽  
pp. 77
Author(s):  
Md. Murad Hossain ◽  
Md. Asadullah ◽  
Abidur Rahaman ◽  
Md. Sipon Miah ◽  
M. Zahid Hasan ◽  
...  

The COVID-19 outbreak resulted in preventative measures and restrictions for Bangladesh during the summer of 2020. These unstable and stressful times led to multiple social problems (e.g., domestic violence and divorce). Globally, researchers, policymakers, governments, and civil societies have been concerned about the increase in domestic violence against women and children during the ongoing COVID-19 pandemic. In Bangladesh, domestic violence against women and children has increased during the COVID-19 pandemic. In this article, we investigated family violence among 511 families during the COVID-19 outbreak. Participants were given questionnaires to answer for a period of over ten days, and we predicted family violence using a machine learning-based model. To predict domestic violence from our data set, we applied the random forest, logistic regression, and Naive Bayes machine learning algorithms. We employed an oversampling strategy, the Synthetic Minority Oversampling Technique (SMOTE), to address the class imbalance problem, and the chi-squared statistical test to assess the feature importance of our data set. The performances of the machine learning algorithms were evaluated based on accuracy, precision, recall, and F-score criteria. Finally, receiver operating characteristic (ROC) curves and confusion matrices were developed and analyzed for the three algorithms. On average, our model, with the random forest, logistic regression, and Naive Bayes algorithms, predicted family violence with 77%, 69%, and 62% accuracy, respectively, for our data set. The findings of this study indicate that domestic violence has increased and is highly related to two features: family income level during the COVID-19 pandemic and education level of the family members.
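
The four evaluation criteria named above (accuracy, precision, recall, and F-score) all derive from the counts in a binary confusion matrix. The sketch below computes them directly; the function name and the toy labels are hypothetical, and the F-score is taken to be the F1 score, which is an assumption since the abstract does not specify the weighting.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

On imbalanced data, accuracy alone can look favorable even when the minority class is rarely detected, which is one reason precision and recall are usually reported alongside it.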


2021 ◽  
Vol 50 (5) ◽  
pp. E5
Author(s):  
Elie Massaad ◽  
Natalie Williams ◽  
Muhamed Hadzipasic ◽  
Shalin S. Patel ◽  
Mitchell S. Fourman ◽  
...  

OBJECTIVE Frailty is recognized as an important consideration in patients with cancer who are undergoing therapies, including spine surgery. The definition of frailty in the context of spinal metastases is unclear, and few have studied such markers and their association with postoperative outcomes and survival. Using national databases, the metastatic spinal tumor frailty index (MSTFI) was developed as a tool to predict outcomes in this specific patient population, but it has not been tested with external data. The purpose of this study was to test the performance of the MSTFI with institutional data and determine whether machine learning methods could better identify measures of frailty as predictors of outcomes. METHODS Electronic health record data from 479 adult patients admitted to the Massachusetts General Hospital for metastatic spinal tumor surgery from 2010 to 2019 formed a validation cohort for the MSTFI to predict major complications, in-hospital mortality, and length of stay (LOS). The 9 parameters of the MSTFI were modeled in 3 machine learning algorithms (lasso regularization logistic regression, random forest, and gradient-boosted decision tree) to assess clinical outcome prediction and determine variable importance. Prediction performance of the models was measured by computing areas under the receiver operating characteristic curve (AUROCs), calibration, and confusion matrix metrics (positive predictive value, sensitivity, and specificity) and was subjected to internal bootstrap validation. RESULTS Of 479 patients (median age 64 years [IQR 55–71 years]; 58.7% male), 28.4% had complications after spine surgery. The in-hospital mortality rate was 1.9%, and the mean LOS was 7.8 days. The MSTFI demonstrated poor discrimination for predicting complications (AUROC 0.56, 95% CI 0.50–0.62) and in-hospital mortality (AUROC 0.69, 95% CI 0.54–0.85) in the validation cohort. 
For postoperative complications, machine learning approaches showed an advantage over the logistic regression model used to develop the MSTFI (AUROC 0.62, 95% CI 0.56–0.68 for random forest vs AUROC 0.56, 95% CI 0.50–0.62 for logistic regression). The random forest model had the highest positive predictive value (0.53, 95% CI 0.43–0.64) and the highest negative predictive value (0.77, 95% CI 0.72–0.81), with chronic lung disease, coagulopathy, anemia, and malnutrition identified as the most important predictors of postoperative complications. CONCLUSIONS This study highlights the challenges of defining and quantifying frailty in the metastatic spine tumor population. Further study is required to improve the determination of surgical frailty in this specific cohort.


2021 ◽  
Vol 5 (1) ◽  
pp. 35
Author(s):  
Uttam Narendra Thakur ◽  
Radha Bhardwaj ◽  
Arnab Hazra

Disease diagnosis through breath analysis has attracted significant attention in recent years due to its noninvasive nature, rapid testing ability, and applicability for patients of all ages. More than 1000 volatile organic compounds (VOCs) exist in human breath, but only selected VOCs are associated with specific diseases. Selective identification of those disease marker VOCs using an array of multiple sensors is highly desirable in the current scenario. The use of efficient sensors and suitable classification algorithms is essential for the selective and reliable detection of those disease markers in complex breath. In the current study, we fabricated a noble metal (Au, Pd and Pt) nanoparticle-functionalized MoS2 (Chalcogenides, Sigma Aldrich, St. Louis, MO, USA)-based sensor array for the selective identification of different VOCs. Four sensors, i.e., pure MoS2, Au/MoS2, Pd/MoS2, and Pt/MoS2, were tested under exposure to different VOCs, such as acetone, benzene, ethanol, xylene, 2-propenol, methanol and toluene, at 50 °C. Initially, principal component analysis (PCA) and linear discriminant analysis (LDA) were used to discriminate those seven VOCs. Compared to PCA, LDA was able to discriminate well between the seven VOCs. Four machine learning algorithms, namely k-nearest neighbors (kNN), decision tree, random forest, and multinomial logistic regression, were used to further identify those VOCs. The classification accuracy for those seven VOCs using kNN, decision tree, random forest, and multinomial logistic regression was 97.14%, 92.43%, 84.1%, and 98.97%, respectively. These results confirmed that multinomial logistic regression performed best among the four machine learning algorithms in discriminating and differentiating the multiple VOCs that generally exist in human breath.


2020 ◽  
Vol 8 (6) ◽  
pp. 1964-1968

Drug reviews are commonly used in the pharmaceutical industry to improve the medications given to patients. Generally, a drug review contains the drug name, usage, ratings and comments by the patients. However, these reviews are not clean, and there is a need to improve their cleanness so that they can benefit both pharmacists and patients. To do this, we propose a new approach that includes several steps. First, we add extra parameters to the review data by applying VADER sentiment analysis to clean the review data. Then, we apply different machine learning algorithms, namely linear SVC, logistic regression, SVM, random forest, and Naive Bayes, to the drug review datasets. However, we found that the accuracy of these algorithms on these datasets is limited. To improve this, we apply the stratified K-fold algorithm in combination with logistic regression. With this approach, the accuracy is increased to 96%.
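
Stratified K-fold, mentioned above, is a way of splitting data for cross-validation so that each fold keeps roughly the same class proportions as the full data set. The sketch below shows the splitting step only, under stated assumptions: the function name is hypothetical, samples are dealt to folds in their original order without shuffling, and the classifier that would be trained on each split (logistic regression in the abstract) is left out.

```python
from collections import defaultdict


def stratified_k_fold(labels, k):
    """Yield (train_indices, test_indices) pairs for k folds.

    Indices of each class are dealt round-robin across the folds, so every
    fold receives a near-equal share of every class. No shuffling is done
    (a simplifying assumption for this sketch).
    """
    if k < 2:
        raise ValueError("k must be at least 2")

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Deal each class's indices across the folds in turn.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test


# Toy review labels (illustrative only): 4 positive and 8 negative samples.
labels = ["pos"] * 4 + ["neg"] * 8
splits = list(stratified_k_fold(labels, 4))
```

Each of the four test folds here holds one positive and two negative samples, mirroring the 1:2 ratio of the full set. A model would be fitted on each `train` list and scored on the matching `test` list, with the scores averaged across folds.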


Author(s):  
You-Hyun Park ◽  
Sung-Hwa Kim ◽  
Yoon-Young Choi

In this study, we developed machine learning-based prediction models for early childhood caries (ECC) and compared their performances with the traditional regression model. We analyzed the data of 4195 children aged 1–5 years from the Korea National Health and Nutrition Examination Survey data (2007–2018). We developed prediction models using the XGBoost (version 1.3.1), random forest, and LightGBM (version 3.1.1) algorithms in addition to logistic regression. Two different methods were applied for variable selection: a regression-based backward elimination and a random forest-based permutation importance classifier. We compared the area under the receiver operating characteristic curve (AUROC) values and misclassification rates of the different models and observed that all four prediction models had AUROC values ranging between 0.774 and 0.785. Furthermore, no significant difference was observed between the AUROC values of the four models. Based on the results, we can confirm that both traditional logistic regression and ML-based models can show favorable performance and can be used to predict early childhood caries, identify ECC high-risk groups, and implement active preventive treatments. However, further research is essential to improve the performance of the prediction model using recent methods, such as deep learning.


2020 ◽  
Vol 8 (5) ◽  
pp. 5353-5362

Background/Aim: Prostate cancer is regarded as the most prevalent cancer in the world and the main cause of deaths worldwide. Early strategies for estimating prostate cancer risk help in making decisions about the changes needed in high-risk patients, which in turn reduces their risk. Methods: In the proposed research, we used a data set from Kaggle and performed pre-processing for missing values. Three missing values in the compactness attribute and two missing values in the fractal dimension attribute were replaced by the mean of their column values. The performance of the diagnosis model was assessed using classification accuracy, sensitivity and specificity analysis. This paper proposes a prediction model to predict whether a person has prostate cancer and to provide awareness or a diagnosis. This is done by comparing the accuracies obtained by applying rules to the individual results of Support Vector Machine, Random Forest, Naive Bayes classifier and Logistic Regression on the data set, taken in a region, to present an accurate model for predicting prostate cancer. Results: The machine learning algorithms under study were able to predict prostate cancer in patients with accuracy between 70% and 90%. Conclusions: Logistic Regression and Random Forest both achieved better accuracy (90%) than the other machine learning algorithms.


Author(s):  
Soo-Kyoung Lee ◽  
Juh Hyun Shin ◽  
Jinhyun Ahn ◽  
Ji Yeon Lee ◽  
Dong Eun Jang

Background: Machine learning (ML) can keep improving predictions and generating automated knowledge via data-driven predictors or decisions. Objective: The purpose of this study was to compare different ML methods including random forest, logistic regression, linear support vector machine (SVM), polynomial SVM, radial SVM, and sigmoid SVM in terms of their accuracy, sensitivity, specificity, negative predictive values, and positive predictive values by validating real datasets to predict factors for pressure ulcers (PUs). Methods: We applied representative ML algorithms (random forest, logistic regression, linear SVM, polynomial SVM, radial SVM, and sigmoid SVM) to develop a prediction model (N = 60). Results: The random forest model showed the greatest accuracy (0.814), followed by logistic regression (0.782), polynomial SVM (0.779), radial SVM (0.770), linear SVM (0.767), and sigmoid SVM (0.674). Conclusions: The random forest model showed the greatest accuracy for predicting PUs in nursing homes (NHs). Diverse factors that predict PUs in NHs, including NH characteristics and residents’ characteristics, were identified according to diverse ML methods. These factors should be considered to decrease PUs in NH residents.

