characteristic curve
Recently Published Documents





2022 ◽  
Vol 16 (4) ◽  
pp. 1-22
Siddharth Bhatia ◽  
Rui Liu ◽  
Bryan Hooi ◽  
Minji Yoon ◽  
Kijung Shin ◽  

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose Midas, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose Midas-F, to solve the problem by which anomalies are incorporated into the algorithm’s internal states, creating a “poisoning” effect that can allow future anomalies to slip through undetected. Midas-F introduces two modifications: (1) we modify the anomaly scoring function, aiming to reduce the “poisoning” effect of newly arriving edges; (2) we introduce a conditional merge step, which updates the algorithm’s data structures after each time tick, but only if the anomaly score is below a threshold value, also to reduce the “poisoning” effect. Experiments show that Midas-F has significantly higher accuracy than Midas. In general, the algorithms proposed in this work have the following properties: (a) they detect microcluster anomalies while providing theoretical guarantees about the false positive probability; (b) they are online, thus processing each edge in constant time and constant memory, and also process the data orders-of-magnitude faster than state-of-the-art approaches; and (c) they provide up to 62% higher area under the receiver operating characteristic curve than state-of-the-art approaches.
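
The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive (here, an anomalous edge) receives a higher score than a randomly chosen negative. The following is a minimal sketch of that pairwise definition, not the evaluation code of any paper listed here; the function name `auroc` and the example scores are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores for six edges; label 1 marks a true anomaly.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
example_labels = [1, 0, 1, 1, 0, 0]
print(auroc(example_scores, example_labels))  # 7 of the 9 pairs are ordered correctly
```

The double loop is quadratic in the number of examples; for large datasets a rank-based computation gives the same value more cheaply.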

2022 ◽  
Vol 29 (2) ◽  
pp. 1-33
Nigel Bosch ◽  
Sidney K. D'Mello

The ability to identify whether a user is “zoning out” (mind wandering) from video has many HCI applications (e.g., distance learning, high-stakes vigilance tasks). However, it remains unknown how well humans can perform this task, how they compare to automatic computerized approaches, and how a fusion of the two might improve accuracy. We analyzed videos of users’ faces and upper bodies recorded 10s prior to self-reported mind wandering (i.e., ground truth) while they engaged in a computerized reading task. We found that a state-of-the-art machine learning model had comparable accuracy to aggregated judgments of nine untrained human observers (area under receiver operating characteristic curve [AUC] = .598 versus .589). A fusion of the two (AUC = .644) outperformed each, presumably because each focused on complementary cues. Furthermore, adding more humans beyond 3–4 observers yielded diminishing returns. We discuss implications of human–computer fusion as a means to improve accuracy in complex tasks.

2022 ◽  
Vol 9 ◽  
Zackary Falls ◽  
Jonathan Fine ◽  
Gaurav Chopra ◽  
Ram Samudrala

The human immunodeficiency virus 1 (HIV-1) protease is an important target for treating HIV infection. Our goal was to benchmark a novel molecular docking protocol and determine its effectiveness as a therapeutic repurposing tool by predicting inhibitor potency to this target. To accomplish this, we predicted the relative binding scores of various inhibitors of the protease using CANDOCK, a hierarchical fragment-based docking protocol with a knowledge-based scoring function. We first used a set of 30 HIV-1 protease complexes as an initial benchmark to optimize the parameters for CANDOCK. We then compared the results from CANDOCK to two other popular molecular docking protocols, Autodock Vina and Smina. Our results showed that CANDOCK is superior to both of these protocols in terms of correlating predicted binding scores to experimental binding affinities, with a Pearson coefficient of 0.62 compared to 0.48 and 0.49 for Vina and Smina, respectively. We further leveraged the Database of Useful Decoys: Enhanced (DUD-E) HIV protease set to ascertain the effectiveness of each protocol in discriminating active versus decoy ligands for proteases. CANDOCK again displayed better efficacy than the other commonly used molecular docking protocols, with an area under the receiver operating characteristic curve (AUROC) of 0.94 compared to 0.71 and 0.74 for Vina and Smina, respectively. These findings support the utility of CANDOCK to help discover novel therapeutics that effectively inhibit HIV-1 and possibly other retroviral proteases.
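
The Pearson coefficient used above to compare predicted binding scores with experimental affinities is the covariance of the two series divided by the product of their standard deviations. A minimal sketch of that formula follows; the function name `pearson_r` and the four data points are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted scores and measured affinities (arbitrary units).
predicted = [1, 2, 3, 4]
measured = [1, 3, 2, 4]
print(pearson_r(predicted, measured))  # 0.8
```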

2022 ◽  
Vol 12 ◽  
Shaowu Lin ◽  
Yafei Wu ◽  
Ya Fang

Background: Depression is highly prevalent and considered the most common psychiatric disorder in home-based elderly, while research on forecasting depression risk in the elderly is still limited. In an endeavor to improve accuracy of depression forecasting, machine learning (ML) approaches have been recommended, in addition to the application of more traditional regression approaches. Methods: A prospective study was conducted in home-based elderly Chinese, using baseline (2011) and follow-up (2013) data of the China Health and Retirement Longitudinal Study (CHARLS), a nationally representative cohort study. We compared four algorithms: three regression-based models (logistic regression, lasso, ridge) and one ML method (random forest). Model performance was assessed using repeated nested 10-fold cross-validation. As the main measure of predictive performance, we used the area under the receiver operating characteristic curve (AUC). Results: The mean AUCs of the four predictive models, logistic regression, lasso, ridge, and random forest, were 0.795, 0.794, 0.794, and 0.769, respectively. The main determinants were life satisfaction, self-reported memory, cognitive ability, ADL (activities of daily living) impairment, and CESD-10 score. Life satisfaction increased the odds ratio of a future depression by 128.6% (logistic), 13.8% (lasso), and 13.2% (ridge), and cognitive ability was the most important predictor in random forest. Conclusions: The three regression-based models and one ML algorithm performed equally well in differentiating between a future depression case and a non-depression case in home-based elderly. When choosing a model, however, other considerations, such as ease of operation, might in some instances lead to one model being prioritized over another.

Sci ◽  
2022 ◽  
Vol 4 (1) ◽  
pp. 3
Steinar Valsson ◽  
Ognjen Arandjelović

With the increase in the availability of annotated X-ray image data, there has been an accompanying and consequent increase in research on machine-learning-based, and in particular deep-learning-based, X-ray image analysis. A major problem with this body of work lies in how newly proposed algorithms are evaluated. Usually, comparative analysis is reduced to the presentation of a single metric, often the area under the receiver operating characteristic curve (AUROC), which does not provide much clinical value or insight and thus fails to communicate the applicability of proposed models. In the present paper, we address this limitation of previous work by presenting a thorough analysis of a state-of-the-art learning approach and hence illuminate various weaknesses of similar algorithms in the literature, which have not yet been fully acknowledged and appreciated. Our analysis was performed on the ChestX-ray14 dataset, which has 14 lung disease labels and metainfo such as patient age, gender, and the relative X-ray direction. We examined the diagnostic significance of different metrics used in the literature, including those proposed by the International Medical Device Regulators Forum, and present a qualitative assessment of the spatial information learned by the model. We show that models that have very similar AUROCs can exhibit widely differing clinical applicability. As a result, our work demonstrates the importance of detailed reporting and analysis of the performance of machine-learning approaches in this field, which is crucial both for progress in the field and the adoption of such models in practice.

2022 ◽  
Vol 2022 ◽  
pp. 1-7
Mingjie Yao ◽  
Leijie Wang ◽  
Jianwen Wang ◽  
Yanna Liu ◽  
Shuhong Liu ◽  

Background. There is a lack of reliable serum biomarkers to reflect the severity of liver necroinflammation in patients with autoimmune liver diseases (AILDs). In this study, a previously established patient cohort was used to explore the potential of serum Golgi protein 73 (GP73) as a noninvasive marker of AILD-related liver necroinflammation. Methods. Serum GP73 concentration was measured in a retrospective cohort of 168 AILD patients, which included 74 patients with autoimmune hepatitis (AIH) and 94 with primary biliary cholangitis (PBC) who had undergone liver biopsy. Spearman’s correlation and multivariate analysis were used to evaluate the relationship between serum GP73 and liver necroinflammation. A receiver operating characteristic curve was constructed to evaluate the value of GP73 for the prediction of moderate or severe liver necroinflammation. The diagnostic value of serum GP73 was also compared with that of alkaline phosphatase (ALP) in patients with PBC. Histologically, immunohistochemical analysis was performed to assess hepatic GP73 expression. Results. Both the serum level and hepatic tissue expression of GP73 protein were aberrantly elevated and correlated well with the severity of necroinflammation in both AIH (rho = 0.655, P < 0.001) and PBC (rho = 0.547, P < 0.001) patients. The results suggested that serum GP73 could be an independent biomarker to reflect the severity of liver necroinflammation. The AUROCs for GP73 to predict moderate necroinflammation (≥G2) and severe necroinflammation (≥G3) in patients with AIH were 0.828 and 0.832, respectively. Moreover, the AUROCs of serum GP73 for the identification of moderate necroinflammation (≥G2) (AUROC = 0.820, P < 0.001) and severe necroinflammation (≥G3) (AUROC = 0.803, P < 0.001) were superior to those of ALP (≥G2: AUROC = 0.607, P = 0.028 and ≥G3: AUROC = 0.559, P = 0.357) in patients with PBC. Mechanistically, interleukin-6 (IL-6), the proinflammatory and prohepatic regenerating cytokine, could transcriptionally upregulate GP73 gene expression. Conclusion. Serum GP73 is a potential noninvasive biomarker to evaluate the severity of liver necroinflammation in patients with AILDs.

2022 ◽  
MariaGiovanna Trivieri ◽  
Philip M Robson ◽  
Vittoria Vergani ◽  
Gina LaRocca ◽  
Angelica M Romero-Daza ◽  

Objectives: To evaluate an extended hybrid MR/PET imaging strategy in cardiac sarcoidosis (CS) employing qualitative and quantitative assessment of PET tracer uptake, and to evaluate its association with cardiac-related outcomes. Background: Invasive endomyocardial biopsy is the gold standard to diagnose CS, but it has poor sensitivity due to the patchy distribution of disease. Imaging with hybrid late gadolinium enhancement (LGE) MR and 18F-fluorodeoxyglucose (18F-FDG) PET allows simultaneous assessment of myocardial injury and disease activity and has shown promise for improved diagnosis of active CS based on the combined positive imaging outcome, MR(+)PET(+). Methods: 148 patients with suspected CS were enrolled for hybrid MR/PET imaging. Patients were classified based on presence/absence of LGE (MR+/MR-), presence/absence of 18F-FDG (PET+/PET-), and pattern of 18F-FDG uptake (focal/diffuse) into the following categories: MR(+)PET(+)FOCAL, MR(+)PET(+)DIFFUSE, MR(+)PET(-), MR(-)PET(+)FOCAL, MR(-)PET(+)DIFFUSE, MR(-)PET(-). Patients classified as MR(+)PET(+)FOCAL were designated as having active CS [aCS(+)], while all others were considered as having inactive or absent CS and designated aCS(-). Quantitative values of standard uptake value (SUVmax), target-to-background ratio (TBRmax), target-to-normal-myocardium ratio (TNMRmax) and T2 were measured. Occurrence of a cardiac-related clinical outcome was defined as any of the following during the 6-month period after imaging: cardiac arrest, ventricular arrhythmia, complete heart block, need for cardiac resynchronization/defibrillator/pacemaker/monitoring device (CRT-D, ICD/WCD, or ILR). MR/PET imaging results were compared to the presence of the composite clinical outcome. Results: Patients designated aCS(+) had more than 4-fold increased odds of meeting the clinical endpoint compared to aCS(-) (unadjusted odds ratio 4.8; 95% CI 2.0-11.4; p<0.001). TNMRmax achieved an area under the receiver operating characteristic curve of 0.90 for separating aCS(+) from aCS(-). Conclusions: Hybrid MR/PET imaging with an extended image-based classification of CS was statistically associated with clinical outcomes in CS. TNMRmax had high sensitivity and excellent specificity for quantifying the imaging-based classification of active CS.
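
An unadjusted odds ratio with a 95% confidence interval, as reported above, can be computed from a 2x2 table of group membership against outcome. The sketch below uses the common Wald interval on the log odds ratio; the abstract does not say which interval method was used, so that choice is an assumption, and the counts are hypothetical rather than the study's data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: group 1 with outcome      b: group 1 without outcome
    c: group 2 with outcome      d: group 2 without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts only: 10 of 30 in group 1 and 5 of 45 in group 2 had the outcome.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 indicates an association at roughly the 5% level, which is how the reported 2.0-11.4 interval should be read.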

2022 ◽  
Flavio Azevedo Figueiredo ◽  
Lucas Emanuel Ferreira Ramos ◽  
Rafael Tavares Silva ◽  
Magda Carvalho Pires ◽  
Daniela Ponce ◽  

Background: Acute kidney injury (AKI) is frequently associated with COVID–19 and the need for kidney replacement therapy (KRT) is considered an indicator of disease severity. This study aimed to develop a prognostic score for predicting the need for KRT in hospitalized COVID–19 patients. Methods: This study is part of the multicentre cohort, the Brazilian COVID–19 Registry. A total of 5,212 adult COVID–19 patients were included between March/2020 and September/2020. We evaluated four categories of predictor variables: (1) demographic data; (2) comorbidities and conditions at admission; (3) laboratory exams within 24 h; and (4) the need for mechanical ventilation at any time during hospitalization. Variable selection was performed using generalized additive models (GAM) and least absolute shrinkage and selection operator (LASSO) regression was used for score derivation. The accuracy was assessed using the area under the receiver operating characteristic curve (AUCROC). Risk groups were proposed based on predicted probabilities: non-high (up to 14.9%), high (15.0 to 49.9%), and very high risk (≥ 50.0%). Results: The median age of the model–derivation cohort was 59 (IQR 47–70) years, 54.5% were men, 34.3% required ICU admission, 20.9% evolved with AKI, 9.3% required KRT, and 15.1% died during hospitalization. The validation cohort had a similar distribution of age, sex, ICU admission, AKI, need for KRT, and in–hospital mortality. Thirty–two variables were tested and four important predictors of the need for KRT during hospitalization were identified using GAM: need for mechanical ventilation, male gender, higher creatinine at admission, and diabetes. The MMCD score had excellent discrimination in derivation (AUROC = 0.929; 95% CI 0.918–0.939) and validation (AUROC = 0.927; 95% CI 0.911–0.941) cohorts and good overall performance in both cohorts (Brier score: 0.057 and 0.056, respectively). The score is implemented in a freely available online risk calculator. Conclusion: The use of the MMCD score to predict the need for KRT may assist healthcare workers in identifying hospitalized COVID–19 patients who may require more intensive monitoring, and can be useful for resource allocation.
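
The Brier score and the three risk groups described above can be illustrated briefly. The Brier score is the mean squared difference between predicted probability and observed outcome (lower is better), and the group cut-offs are those stated in the abstract. This is a sketch with hypothetical function names and made-up probabilities; it is not the MMCD score itself.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def risk_group(probability):
    """Map a predicted probability to the risk groups named in the abstract."""
    if probability >= 0.50:
        return "very high"
    if probability >= 0.15:
        return "high"
    return "non-high"


# Hypothetical predicted probabilities and observed outcomes (1 = needed KRT).
predicted = [0.05, 0.20, 0.70, 0.10]
observed = [0, 0, 1, 0]
print(round(brier_score(predicted, observed), 4))
print([risk_group(p) for p in predicted])
```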

2022 ◽  
Md Mostafizur Rahman ◽  
Srinivas Mukund Vadrev ◽  
Arturo Magana-Mora ◽  
Jacob Levman ◽  
Othman Soufan

Food-drug interactions (FDIs) arise when nutritional dietary consumption regulates biochemical mechanisms involved in drug metabolism. Towards characterizing the nature of food’s influence on pharmacological treatment, it is essential to detect all possible FDIs. In this study, we propose FDMine, a novel systematic framework that models the FDI problem as a homogenous graph. In this graph, all nodes representing drug, food and food composition are referenced as chemical structures. This homogenous representation enables us to take advantage of reported drug-drug interactions for accuracy evaluation, especially when accessible ground truth for FDIs is lacking. Our dataset consists of 788 unique approved small molecule drugs with metabolism-related drug-drug interactions (DDIs) and 320 unique food items, composed of 563 unique compounds with 179 health effects. The potential number of interactions is 87,192 and 92,143 when two different versions of the graph, referred to as disjoint and joint graphs, are considered, respectively. We defined several similarity subnetworks comprising food-drug similarity (FDS), drug-drug similarity (DDS), and food-food similarity (FFS) networks, based on similarity profiles. A unique part of the graph is the encoding of the food composition as a set of nodes and calculating a content contribution score to re-weight the similarity links. To predict new FDI links, we applied the path category-based (path length 2 and 3) and neighborhood-based similarity-based link prediction algorithms. We calculated the precision@top (top 1%, 2%, and 5%) of the newly predicted links, the area under the receiver operating characteristic curve, and precision-recall curve. We have performed three types of evaluations to benchmark results using different types of interactions. The shortest path-based method has achieved a precision of 84%, 60% and 40% for the top 1%, 2% and 5% of FDIs identified, respectively. We validated the top FDIs predicted using FDMine to demonstrate its applicability and we relate therapeutic anti-inflammatory effects of food items informed by FDIs. We hypothesize that the proposed framework can be used to gain new insights on FDIs. FDMine is publicly available to support clinicians and researchers.
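
The precision@top metric mentioned above is the fraction of the highest-scoring predicted links that are confirmed interactions. The sketch below is one plausible reading, not FDMine's evaluation code: the function name, the rounding of the cut-off with `ceil`, and the toy links are all assumptions.

```python
import math


def precision_at_top(scored_links, known_links, fraction):
    """Precision among the top `fraction` of links, ranked by descending score.

    scored_links: list of (link, score) pairs
    known_links: set of links treated as true interactions
    fraction: share of the ranked list to keep, e.g. 0.01 for the top 1%
    """
    if not scored_links or not 0 < fraction <= 1:
        raise ValueError("need at least one link and a fraction in (0, 1]")
    ranked = sorted(scored_links, key=lambda pair: pair[1], reverse=True)
    # Assumption: round the cut-off up and always keep at least one link.
    k = max(1, math.ceil(len(ranked) * fraction))
    hits = sum(1 for link, _ in ranked[:k] if link in known_links)
    return hits / k


# Hypothetical food-drug pairs with similarity scores.
links = [(("garlic", "drug_a"), 0.9), (("kale", "drug_b"), 0.8),
         (("mint", "drug_c"), 0.7), (("rice", "drug_d"), 0.1)]
known = {("garlic", "drug_a"), ("mint", "drug_c")}
print(precision_at_top(links, known, 0.5))  # top 2 links, 1 of them known: 0.5
```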

2022 ◽  
Vol 21 (1) ◽  
Yiting Liu ◽  
Wei Wang

Background: Lipid accumulation product (LAP) and cardiometabolic index (CMI) are two novel obesity-related indexes associated with enhancing metabolic disease (MD) risk. Current evidence suggests that the differences in sex hormones and regional fat distribution in both sexes are directly correlated with MD and nonalcoholic fatty liver disease (NAFLD) risk. Hence, NAFLD incidences reflect sex differences. Herein, we examined the accuracy of LAP and CMI in diagnosing NAFLD in both sexes. Methods: Overall, 14,407 subjects who underwent health check-ups in northeastern China were enrolled in this study, and their corresponding LAP and CMI were calculated. Abdominal ultrasonography was employed for NAFLD diagnosis. Multivariate analyses were used to assess potential correlations between LAP and/or CMI and NAFLD. Odds ratios (ORs) and 95% confidence intervals (CIs) were evaluated. Receiver operating characteristic curve analysis was performed to explore the diagnostic accuracies. Areas under the curves (AUCs) with 95%CIs were calculated. Results: NAFLD prevalence increased with elevated quartiles of LAP and CMI in both sexes. In multivariate logistic regression analyses, LAP and CMI, expressed as continuous variables or quartiles, significantly correlated with NAFLD. The ORs for the top versus bottom quartile of LAP and CMI for NAFLD were 13.183 (95%CI = 8.512–20.417) and 8.662 (95%CI = 6.371–11.778) in women and 7.544 (95%CI = 5.748–9.902) and 5.400 (95%CI = 4.297–6.786) in men. LAP and CMI exhibited larger AUCs than other obesity-related indexes in terms of discriminating NAFLD. The AUCs of LAP and CMI were 0.860 (95%CI = 0.852–0.867) and 0.833 (95%CI = 0.825–0.842) in women and 0.816 (95%CI = 0.806–0.825) and 0.779 (95%CI = 0.769–0.789) in men. Conclusions: LAP and CMI are convenient indexes for the screening and quantification of NAFLD within a Chinese adult population. Their associations with NAFLD are substantially greater in women than men.
