Application of multi-label classification models for the diagnosis of diabetic complications

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Liang Zhou ◽  
Xiaoyuan Zheng ◽  
Di Yang ◽  
Ying Wang ◽  
Xuesong Bai ◽  
...  

Abstract Background Early diagnosis of diabetic complications is clinically demanding and of great significance. Given the complexity of diabetic complications, we applied a multi-label classification (MLC) model to predict four diabetic complications simultaneously using data from modern electronic health records (EHRs), and leveraged the correlations between the complications to further improve the prediction accuracy. Methods We obtained the demographic characteristics and laboratory data from the EHRs for patients admitted to Changzhou No. 2 People’s Hospital, the affiliated hospital of Nanjing Medical University in China, from May 2013 to June 2020. The data included 93 biochemical indicators and 9,765 patients. We used the Pearson correlation coefficient (PCC) to analyze the correlations between different diabetic complications from a statistical perspective. We used an MLC model, based on the Random Forest (RF) technique, to leverage these correlations and predict four complications simultaneously. We explored four different MLC models: Label Power Set (LP), Classifier Chains (CC), Ensemble Classifier Chains (ECC), and Calibrated Label Ranking (CLR). We used traditional Binary Relevance (BR) as a comparison. We used 11 different performance metrics and the area under the receiver operating characteristic curve (AUROC) to evaluate these models. We analyzed the weights of the learned model and illustrated (1) the top 10 key indicators of different complications and (2) the correlations between different diabetic complications. Results The MLC models including CC, ECC and CLR outperformed the traditional BR method in most performance metrics; the ECC models performed the best in Hamming loss (0.1760), Accuracy (0.7020), F1_Score (0.7855), Precision (0.8649), F1_micro (0.8078), F1_macro (0.7773), Recall_micro (0.8631), Recall_macro (0.8009), and AUROC (0.8231). 
The two diabetic complication correlation matrices drawn from the PCC analysis and the MLC models were consistent with each other and indicated that the complications correlated to different extents. The top 10 key indicators given by the model are valuable in medical application. Conclusions Our MLC model can effectively utilize the potential correlation between different diabetic complications to further improve the prediction accuracy. This model should be explored further in other complex diseases with multiple complications.
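
Several of the multi-label metrics named in this abstract (Hamming loss, F1_micro, F1_macro) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy label matrices (4 patients by 4 complications) are hypothetical and chosen only to show how the metrics are defined.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))
    return wrong / total


def _f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_micro_macro(y_true, y_pred):
    """Return (micro-F1, macro-F1) for binary multi-label matrices."""
    n_labels = len(y_true[0])
    per_label = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        per_label.append((tp, fp, fn))
    # Micro: pool the counts over all labels, then compute one F1.
    micro = _f1(*(sum(c[i] for c in per_label) for i in range(3)))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(_f1(*c) for c in per_label) / n_labels
    return micro, macro


# Hypothetical toy data: rows are patients, columns are complications.
y_true = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
y_pred = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1]]

print(hamming_loss(y_true, y_pred))
print(f1_micro_macro(y_true, y_pred))
```

Micro-averaging weights every label decision equally, so frequent complications dominate it; macro-averaging weights every complication equally, which is why the two values can differ.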

2020 ◽  
Author(s):  
Abdulrahman Takiddin ◽  
Jens Schneider ◽  
Yin Yang ◽  
Alaa Abd-Alrazaq ◽  
Mowafa Househ

BACKGROUND Skin cancer is the most common cancer type affecting humans. Traditional skin cancer diagnosis methods are costly, require a professional physician, and take time. Hence, to aid in diagnosing skin cancer, Artificial Intelligence (AI) tools are being used, including shallow and deep machine learning-based techniques that are trained to detect and classify skin cancer using computer algorithms and deep neural networks. OBJECTIVE The aim of this study is to identify and group the different types of AI-based technologies used to detect and classify skin cancer. The study also examines the reliability of the selected papers by studying the correlation between the dataset size and number of diagnostic classes with the performance metrics used to evaluate the models. METHODS We conducted a systematic search for articles using IEEE Xplore, ACM DL, and Ovid MEDLINE databases following the PRISMA Extension for Scoping Reviews (PRISMA-ScR) guidelines. Studies included in this scoping review had to fulfill several selection criteria: to be specifically about skin cancer, to address detecting or classifying skin cancer, and to use AI technologies. Study selection and data extraction were conducted by two reviewers independently. Extracted data were synthesized narratively, where studies were grouped based on the diagnostic AI techniques and their evaluation metrics. RESULTS We retrieved 906 papers from the 3 databases, of which 53 studies were eligible for this review. While shallow techniques were used in 14 studies, deep techniques were utilized in 39 studies. The studies used accuracy (n=43/53), the area under receiver operating characteristic curve (n=5/53), sensitivity (n=3/53), and F1-score (n=2/53) to assess the proposed models. Studies that use smaller datasets and fewer diagnostic classes tend to have higher reported accuracy scores. CONCLUSIONS The adaptation of AI in the medical field facilitates the diagnosis process of skin cancer. 
However, the reliability of most AI tools is questionable since small datasets or low numbers of diagnostic classes are used. In addition, a direct comparison between methods is hindered by a varied use of different evaluation metrics and image types.


2016 ◽  
Vol 4 (1) ◽  
pp. 3-7
Author(s):  
Tanka Prasad Bohara ◽  
Dimindra Karki ◽  
Anuj Parajuli ◽  
Shail Rupakheti ◽  
Mukund Raj Joshi

Background: Acute pancreatitis is usually a mild and self-limiting disease. About 25% of patients have a severe episode, with mortality up to 30%. Early identification of these patients has the potential advantages of aggressive treatment in an intensive care unit or transfer to a higher centre. Several scoring systems are available to predict the severity of acute pancreatitis, but they are cumbersome, take 24 to 48 hours, and depend on tests that are not universally available. Haematocrit has been used as a predictor of severity of acute pancreatitis, but some have doubted its role. Objectives: To study the significance of haematocrit in predicting the severity of acute pancreatitis. Methods: Patients admitted with a first episode of acute pancreatitis from February 2014 to July 2014 were included. Haematocrit at admission and at 24 hours of admission were compared with severity of acute pancreatitis. Mean, analysis of variance, chi-square, Pearson correlation and receiver operator characteristic curve were used for statistical analysis. Results: Thirty-one patients were included in the study, with 16 (51.61%) male and 15 (48.39%) female. Haematocrit at 24 hours of admission was higher in severe acute pancreatitis (P value 0.003). Both haematocrit at admission and at 24 hours had positive correlation with severity of acute pancreatitis (r: 0.387, P value 0.031 and r: 0.584, P value 0.001, respectively). Area under receiver operator characteristic curve for haematocrit at admission and at 24 hours were 0.713 (P value 0.175, 95% CI 0.536 – 0.889) and 0.917 (P value 0.008, 95% CI 0.813 – 1.00) respectively. Conclusion: Haematocrit is a simple, cost-effective and widely available test and can predict severity of acute pancreatitis. Journal of Kathmandu Medical College, Vol. 4(1) 2015, 3-7
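
The area under the receiver operator characteristic curve used in this abstract has a simple rank interpretation: it is the probability that a randomly chosen severe case has a higher marker value than a randomly chosen non-severe case, with ties counted as one half. The sketch below illustrates that definition only. The function name and the haematocrit values are hypothetical and are not taken from the study.

```python
def auroc(values_pos, values_neg):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative value count as half a win.
    """
    wins = 0.0
    for p in values_pos:
        for n in values_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(values_pos) * len(values_neg))


# Hypothetical haematocrit values (%) at 24 hours, for illustration only.
severe = [48, 51, 45, 53]
mild = [38, 41, 44, 46, 40]

print(auroc(severe, mild))
```

A value of 0.5 corresponds to a marker that ranks cases no better than chance, and 1.0 to perfect separation of the two groups.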


Author(s):  
Nadia Ayala-Lopez ◽  
David R Peaper ◽  
Roa Harb

Abstract Objectives Despite extensive research on procalcitonin (PCT)-guided therapy in lower respiratory tract infections, the association between PCT and bacterial pneumonia remains unclear. Methods We retrospectively evaluated the performance of PCT in patients presenting with lower respiratory tract infection symptoms, grouped by seven diagnoses. All patients had microbial testing, chest imaging, and CBC counts within 1 day of PCT testing. Results Median PCT level in patients diagnosed with bacterial pneumonia was significantly higher than in patients diagnosed with other sources of infections or those not diagnosed with infections. Median PCT levels were not different among patients grouped by type or quantity of pathogen detected. They were significantly higher in patients with higher pathogenicity scores for isolated bacteria, those with abnormal WBC count, and those with chest imaging consistent with bacterial pneumonia. A diagnostic workup that included imaging, WBC count, and Gram stain had an area under the receiver operating characteristic curve of 0.748, and the addition of PCT increased it to 0.778. Conclusions PCT was higher in patients diagnosed with bacterial pneumonia. Less clear is its diagnostic ability to detect bacterial pneumonia over and above imaging and laboratory data routinely available to clinicians.


2017 ◽  
Vol 7 (3) ◽  
pp. 376-384 ◽  
Author(s):  
Wenjie Dong ◽  
Sifeng Liu ◽  
Zhigeng Fang ◽  
Xiaoyu Yang ◽  
Qian Hu ◽  
...  

Purpose The purpose of this paper is to clarify several commonly used quality cost models based on Juran’s characteristic curve. Through mathematical deduction, the lowest point of quality cost and the lowest quality level (often depicted by qualification rate) can be obtained. This paper also aims to introduce a new prediction model, namely the discrete grey model (DGM), to forecast the changing trend of quality cost. Design/methodology/approach This paper reaches its conclusions by means of mathematical deduction. More specifically, the authors obtain the lowest quality level and the lowest quality cost by taking the derivative of the equation relating quality cost and quality level. By introducing the weakening buffer operator, the authors can significantly improve the prediction accuracy of DGM. Findings This paper demonstrates that DGM can be used to forecast quality cost based on Juran’s cost characteristic curve, especially when little information is available or the sample size is rather small. When a practical weakening buffer operator is applied, the randomness of the time series can be markedly weakened and the prediction accuracy can be significantly improved. Practical implications This paper uses a real case from the literature to verify the validity of the discrete grey forecasting model, concluding that using DGM to forecast the variation tendency of quality cost is feasible and reasonable to a certain degree. Originality/value This paper refines the theory of quality cost based on Juran’s characteristic curve and expands the scope of application of grey system theory.
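
The abstract does not give the model equations, so the following is only a sketch of the commonly cited DGM(1,1) form, in which the accumulated series is assumed to follow x1(k+1) = b1 * x1(k) + b2 with b1 and b2 fitted by least squares. The function names and the sample series are hypothetical, and the weakening buffer operator mentioned by the authors is not included.

```python
from itertools import accumulate


def dgm11_fit(series):
    """Fit b1, b2 in x1(k+1) = b1 * x1(k) + b2 on the accumulated series.

    Needs at least three observations so the regression is determined.
    """
    x1 = list(accumulate(series))          # first-order accumulated series
    xs, ys = x1[:-1], x1[1:]
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b2 = (sy - b1 * sx) / n
    return b1, b2


def dgm11_forecast(series, steps):
    """Forecast `steps` future values of the original (non-accumulated) series."""
    b1, b2 = dgm11_fit(series)
    # Roll the fitted recursion forward from the first observation.
    x1_hat = series[0]
    for _ in range(len(series) - 1):
        x1_hat = b1 * x1_hat + b2
    forecasts = []
    for _ in range(steps):
        nxt = b1 * x1_hat + b2
        forecasts.append(nxt - x1_hat)     # undo the accumulation
        x1_hat = nxt
    return forecasts


# Hypothetical quality-cost series, for illustration only.
costs = [2.0, 4.0, 8.0, 16.0]
print(dgm11_forecast(costs, 2))
```

Because the recursion is linear in the accumulated series, this form reproduces a purely geometric series exactly, which makes a geometric input a convenient sanity check.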


Author(s):  
Furkan Kaya ◽  
Petek Şarlak Konya ◽  
Emin Demirel ◽  
Neşe Demirtürk ◽  
Semiha Orhan ◽  
...  

Background: Lungs are the primary organ of involvement of COVID-19, and the severity of pneumonia in COVID-19 patients is an important cause of morbidity and mortality. Aim: We aimed to evaluate the visual and quantitative pneumonia severity on chest computed tomography (CT) in patients with coronavirus disease 2019 (COVID-19) and compare the CT findings with clinical and laboratory findings. Methods: We retrospectively evaluated adult COVID-19 patients who underwent chest CT, together with their clinical scores, laboratory findings, and length of hospital stay. Two independent radiologists visually evaluated the pneumonia severity on chest CT (VSQS). Quantitative CT (QCT) assessment was performed using a free DICOM viewer, and the percentage of the well-aerated lung (%WAL), high-attenuation areas (%HAA) at different threshold values, and mean lung attenuation (MLA) values were calculated. The relationships between CT scores and clinical data, laboratory data, and length of hospital stay were evaluated in this cross-sectional study. Student's t-test and the chi-square test were used to analyze the differences between variables. The Pearson correlation test was used to analyze the correlation between variables. The diagnostic performance of the variables was assessed using receiver operating characteristic (ROC) analysis. Results: The VSQS and QCT scores were significantly correlated with procalcitonin, d-dimer, ferritin, and C-reactive protein levels. Both VSQS and QCT scores were significantly correlated with disease severity (p<0.001). Among the QCT parameters, the %HAA-600 value showed the best correlation with the VSQS (r=0.730, p<0.001). VSQS and QCT scores had high sensitivity and specificity in distinguishing disease severity and predicting prolonged hospitalization. Conclusion: The VSQS and QCT scores can help manage COVID-19 and predict the duration of hospitalization.


Author(s):  
Mukhyaprana M. Prabhu ◽  
Jagadish Madireddy ◽  
Ranjan K. Shetty ◽  
Weena Stanley

Background: Acute coronary syndromes (ACSs) are the primary cause of mortality worldwide. The aim of the study was to assess the associations of serum fibrinogen and plasma D-dimer levels with angiographic severity of atherosclerotic lesions as well as the presence of in-hospital complications and complications at 30-day follow-up in patients with ACS. Methods: This was a prospective study including 107 patients with ACS. Severity of CAD was assessed by the Gensini score. Correlations of D-dimer and fibrinogen levels with complications such as heart failure, arrhythmia, recurrent angina, and cardiac death were assessed using the Pearson correlation coefficient and the receiver operating characteristic curve analysis. Results: The mean age of patients was 61±10.9 years. Mean serum fibrinogen levels were higher in individuals with severe left ventricular (LV) dysfunction than in those with moderate and mild LV dysfunction (444 mg/dl, 404 mg/dl, and 330 mg/dl, respectively). Similarly, the mean plasma D-dimer level was higher in individuals with severe ACS (1.03 μg/ml) than in those with moderate (1.88 μg/ml) and mild ACS (3.5 μg/ml). Conclusion: Our study revealed that patients with higher serum fibrinogen levels tend to have more severe ACS, greater LV dysfunction, and a higher rate of complications. Therapies aimed at reducing fibrinogen levels might help reduce mortality and morbidity in patients with ACS.


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5777
Author(s):  
Esraa Eldesouky ◽  
Mahmoud Bekhit ◽  
Ahmed Fathalla ◽  
Ahmad Salah ◽  
Ahmed Ali

The use of underwater wireless sensor networks (UWSNs) for collaborative monitoring and marine data collection tasks is rapidly increasing. One of the major challenges associated with building these networks is handover prediction; this is because the mobility model of the sensor nodes is different from that of ground-based wireless sensor network (WSN) devices. Therefore, handover prediction is the focus of the present work. There have been limited efforts in addressing the handover prediction problem in UWSNs and in the use of ensemble learning in handover prediction for UWSNs. Hence, we propose the simulation of the sensor node mobility using real marine data collected by the Korea Hydrographic and Oceanographic Agency. These data include the water current speed and direction between data. The proposed simulation consists of a large number of sensor nodes and base stations in a UWSN. Next, we collected the handover events from the simulation, which were utilized as a dataset for the handover prediction task. Finally, we utilized four machine learning prediction algorithms (i.e., gradient boosting, decision tree (DT), Gaussian naive Bayes (GNB), and K-nearest neighbor (KNN)) to predict handover events based on historically collected handover events. The obtained prediction accuracy rates were above 95%. The best prediction accuracy rate achieved by the state-of-the-art method was 56% for any UWSN. Moreover, when the proposed models were evaluated on performance metrics, the measured evaluation scores emphasized the high quality of the proposed prediction models. While the ensemble learning model outperformed the GNB and KNN models, the performance of ensemble learning and decision tree models was almost identical.


2019 ◽  
Author(s):  
Gang Li ◽  
Liangtian Zhang ◽  
Nannan Han ◽  
Ke Zhang ◽  
Hengjie Li

Abstract Background: Acute lung injury (ALI) is one of the major complications of severe sepsis. This study was conducted to investigate the levels of Th22 and Th17 cells in the peripheral blood of septic patients with ALI and their clinical significance. Results: A total of 479 septic patients admitted between January 2013 and January 2018 were divided into non-ALI (n = 377) and ALI groups (n = 102) based on the presence or absence of ALI. The levels of Th22 and Th17 cells, interleukin 22 (IL-22), 6 (IL-6) and 17 (IL-17) were determined. Receiver operating characteristic curve (ROC) analysis was performed to assess the early diagnostic value of Th22 and Th17 cells to predict sepsis-induced ALI. The lung injury prediction score (LIPS), IL-6, IL-17, IL-22, and levels of Th17 and Th22 cells were 9.13, 14.02 ng/L, 13.06 ng/L, 22.90 ng/L, 8.80% and 7.40%, respectively, in the ALI patients and were significantly higher in the ALI group than in the non-ALI group (P < 0.05). Pearson correlation analysis showed that LIPS, IL-17, IL-22, Th17 cells and Th22 cells were significant factors affecting sepsis-induced ALI (P < 0.05). The correlation analysis showed that the levels of Th22 cells in the peripheral blood of septic patients with ALI were positively correlated with LIPS, IL-22 and the levels of Th17 cells (P < 0.05), and the levels of Th17 cells were positively correlated with LIPS and IL-17 (P < 0.01). Multivariate logistic regression analysis showed that the LIPS (OR = 1.130), IL-17 (OR = 1.982), IL-22 (OR = 2.612) and levels of Th17 (OR = 2.211) and Th22 (OR = 3.230) cells were independent risk factors for ALI. The area under the curve of Th22 cells was 0.844 with a cutoff value of 6.81% to predict ALI. The sensitivity and specificity for early diagnosis of sepsis-induced ALI by Th22 cells were 78.72% and 89.13% respectively, which were better than, but statistically similar to, those of Th17 cells (P > 0.05). 
Conclusions: The levels of Th22 and Th17 cells in peripheral blood are significantly increased in septic patients with induced ALI, and may be used for early diagnosis of sepsis-induced ALI.
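
Sensitivity and specificity at a fixed cutoff, as reported for Th22 cells in this abstract, can be computed as in the sketch below. Only the 6.81% cutoff comes from the abstract. The function name and patient values are hypothetical, and treating values at or above the cutoff as positive is an assumption.

```python
def sensitivity_specificity(values_pos, values_neg, cutoff):
    """Classify a value as positive when it is >= cutoff (an assumed convention).

    Returns (sensitivity, specificity).
    """
    true_pos = sum(1 for v in values_pos if v >= cutoff)
    true_neg = sum(1 for v in values_neg if v < cutoff)
    return true_pos / len(values_pos), true_neg / len(values_neg)


# Hypothetical Th22 percentages, for illustration only.
ali = [7.4, 8.1, 6.9, 6.2, 9.0]
non_ali = [5.1, 6.0, 7.0, 4.8, 5.5]

sens, spec = sensitivity_specificity(ali, non_ali, cutoff=6.81)
print(sens, spec)
# Youden's J (sensitivity + specificity - 1) is one common way to compare cutoffs.
print(sens + spec - 1)
```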


2010 ◽  
Vol 22 (05) ◽  
pp. 385-391
Author(s):  
Yu-Cheng Liu ◽  
Shien-Ching Hwang ◽  
Yu-Feng Huang ◽  
Win-Li Lin ◽  
Yen-Jen Oyang ◽  
...  

The B-factor, which is also known as the temperature factor or Debye–Waller factor, is an important structural flexibility index of the ground-state protein conformation. In particular, the B-factors associated with a segment of residues reflect the local flexibility of the corresponding protein tertiary substructure. Recent studies have shown that, for certain families of proteins, there exists a high degree of correlation between the B-factors and the protein functional sites, including antigenic regions, enzyme active sites, and nucleotide binding sites. This paper presents a sequence-based predictor of B-factors with a dual-model approach. The design of the dual-model approach has been aimed at exploiting the bi-modal distribution of B-factors in order to achieve higher prediction accuracy. In this paper, the prediction accuracy is measured by the Pearson correlation coefficient. Experimental results show that the dual-model predictor proposed in this article is capable of delivering a superior correlation coefficient in comparison with two predictors reported in two recent papers. Though experimental results show that the dual-model approach proposed in this paper works more effectively than the conventional approach, it is of interest to continue investigating more advanced designs since there exists a strong correlation between B-factors and protein functional sites. In this respect, identifying additional physicochemical properties that are related to structural flexibility deserves a high degree of attention.
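
Since prediction accuracy here is measured by the Pearson correlation coefficient between predicted and observed B-factors, a minimal sketch of that coefficient may be useful. The function name and the example values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical observed and predicted B-factors for five residues.
observed = [12.5, 18.0, 25.3, 30.1, 22.4]
predicted = [14.0, 17.2, 23.8, 27.5, 24.0]

print(pearson(observed, predicted))
```

The coefficient is insensitive to a linear rescaling of the predictions, so it rewards getting the relative flexibility profile right rather than the absolute B-factor values.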


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Tawfiq Hasanin ◽  
Taghi M. Khoshgoftaar ◽  
Joffrey L. Leevy ◽  
Richard A. Bauder

Abstract Severe class imbalance between majority and minority classes in Big Data can bias the predictive performance of Machine Learning algorithms toward the majority (negative) class. Where the minority (positive) class holds greater value than the majority (negative) class and the occurrence of false negatives incurs a greater penalty than false positives, the bias may lead to adverse consequences. Our paper incorporates two case studies, each utilizing three learners, six sampling approaches, two performance metrics, and five sampled distribution ratios, to uniquely investigate the effect of severe class imbalance on Big Data analytics. The learners (Gradient-Boosted Trees, Logistic Regression, Random Forest) were implemented within the Apache Spark framework. The first case study is based on a Medicare fraud detection dataset. The second case study, unlike the first, includes training data from one source (SlowlorisBig Dataset) and test data from a separate source (POST dataset). Results from the Medicare case study are not conclusive regarding the best sampling approach using Area Under the Receiver Operating Characteristic Curve and Geometric Mean performance metrics. However, it should be noted that the Random Undersampling approach performs adequately in the first case study. For the SlowlorisBig case study, Random Undersampling convincingly outperforms the other five sampling approaches (Random Oversampling, Synthetic Minority Over-sampling TEchnique, SMOTE-borderline1, SMOTE-borderline2, ADAptive SYNthetic) when measuring performance with Area Under the Receiver Operating Characteristic Curve and Geometric Mean metrics. Based on its classification performance in both case studies, Random Undersampling is the best choice as it results in models with a significantly smaller number of samples, thus reducing computational burden and training time.
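
Random Undersampling and the Geometric Mean metric discussed in this abstract can be sketched in a few lines of plain Python. The authors worked in Apache Spark, so this is not their implementation: the function names, the `ratio` and `seed` parameters, and the toy data are hypothetical.

```python
import random
from math import sqrt


def random_undersample(majority, minority, ratio=1.0, seed=0):
    """Keep every minority row and a random subset of majority rows.

    `ratio` is the target majority:minority ratio after sampling.
    """
    rng = random.Random(seed)                      # seeded for repeatability
    keep = min(len(majority), int(round(ratio * len(minority))))
    return rng.sample(majority, keep), list(minority)


def geometric_mean(tp, fn, tn, fp):
    """Geometric mean of the true positive rate and the true negative rate."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return sqrt(tpr * tnr)


# Hypothetical toy data: 1000 majority rows and 10 minority rows.
majority_rows = list(range(1000))
minority_rows = list(range(10))

sampled_majority, kept_minority = random_undersample(majority_rows, minority_rows)
print(len(sampled_majority), len(kept_minority))
print(geometric_mean(tp=8, fn=2, tn=90, fp=10))
```

Unlike plain accuracy, the geometric mean drops toward zero when either class is classified poorly, which is why it is a common choice under severe class imbalance.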

