Deep learning for automatic calcium scoring in population based cardiovascular screening

Abstract Background High volumes of standardized coronary artery calcium (CAC) scans are generated in screening and need to be scored accurately and efficiently to risk stratify individuals. Purpose To evaluate the performance of deep learning based software for automatic coronary calcium scoring in a screening setting. Methods Participants from the Robinsca trial who underwent low-dose ECG-triggered cardiac CT for calcium scoring were included. CAC was measured with a fully automated deep learning prototype and compared to the original manual assessment of the Robinsca trial. Detection rate, positive Agatston score and risk categorization (0–99, 100–399, ≥400) were compared using the McNemar test, ICC, and Cohen's kappa. False negative (FN) rate, false positive (FP) rate and diagnostic accuracy were determined for preventive treatment initiation (cut-off ≥100 AU). Results In total, 997 participants were included between December 2015 and June 2016. Median age was 61.0 y (IQR: 11.0) and 54.4% were male. High agreement for detection was found between deep learning based and manual scoring, κ=0.87 (95% CI 0.85–0.89). Median Agatston score was 58.4 (IQR: 12.3–200.2) and 61.2 (IQR: 13.9–212.9) for deep learning based and manual assessment respectively; ICC was 0.958 (95% CI 0.951–0.964). Reclassification rate was 2.0%, with very high agreement, κ=0.960 (95% CI: 0.943–0.997), p<0.001. FN rate was 0.7%, FP rate was 0.1% and diagnostic accuracy was 99.2% for initiation of preventive treatment. Conclusion Deep learning based software for automatic CAC scoring can be used in a cardiovascular CT screening setting with high accuracy for risk categorization and initiation of preventive treatment. Funding Acknowledgement Type of funding sources: Public grant(s) – EU funding. Main funding source(s): Robinsca trial was supported by an advanced grant of the European Research Council
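The agreement analysis described in this abstract can be illustrated with a minimal Python sketch. It maps Agatston scores to the three risk categories named above and computes an unweighted Cohen's kappa and a reclassification rate. The function names and the paired scores are hypothetical and are not data from the trial; the abstract does not state whether a weighted or unweighted kappa was used, so the unweighted form here is an assumption.

```python
from collections import Counter


def risk_category(agatston: float) -> str:
    """Map an Agatston score to the risk categories 0-99, 100-399, >=400."""
    if agatston < 100:
        return "0-99"
    if agatston < 400:
        return "100-399"
    return ">=400"


def cohens_kappa(labels_a, labels_b) -> float:
    """Unweighted Cohen's kappa for two equally long label sequences."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the marginal proportions, summed per label.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical paired scores (illustrative only, not trial data).
manual = [0.0, 12.5, 150.0, 420.0, 99.0, 310.0]
automatic = [0.0, 10.1, 140.2, 405.7, 101.3, 298.4]

manual_cat = [risk_category(s) for s in manual]
auto_cat = [risk_category(s) for s in automatic]

kappa = cohens_kappa(manual_cat, auto_cat)
reclassified = sum(m != a for m, a in zip(manual_cat, auto_cat)) / len(manual)
print(f"kappa={kappa:.3f}, reclassification rate={reclassified:.1%}")
```

In this toy data one of six participants crosses the 100 AU cut-off between methods, which is the kind of reclassification that matters for the preventive-treatment decision.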


Deep Learning for Automatic Calcium Scoring in Population-Based Cardiovascular Screening

JACC Cardiovascular Imaging ◽

10.1016/j.jcmg.2021.07.012 ◽

2021 ◽

Author(s):

Marleen Vonder ◽

Sunyi Zheng ◽

Monique D. Dorrius ◽

Carlijn M. van der Aalst ◽

Harry J. de Koning ◽

...

Keyword(s):

Deep Learning ◽

Population Based ◽

Calcium Scoring ◽

Cardiovascular Screening


The role of serum periostin in the diagnosis of asthma: A meta-analysis

Allergy and Asthma Proceedings ◽

10.2500/aap.2020.41.200038 ◽

2020 ◽

Vol 41 (4) ◽

pp. 240-247

Author(s):

Lei Yang ◽

Qingtao Zhao ◽

Shuyu Wang

Keyword(s):

Diagnostic Accuracy ◽

Meta Analysis ◽

Characteristic Curve ◽

False Negative ◽

Summary Receiver Operating Characteristic ◽

True Negative ◽

Negative Results ◽

Noninvasive Biomarker ◽

Sensitivity Specificity ◽

Serum Periostin

Background: Serum periostin has been proposed as a noninvasive biomarker for asthma diagnosis and management. However, its accuracy for the diagnosis of asthma in different populations is not completely clear. Methods: This meta-analysis aimed to evaluate the diagnostic accuracy of periostin level in the clinical determination of asthma. Several medical literature databases were searched for relevant studies through December 1, 2019. The numbers of patients with true-positive, false-positive, false-negative, and true-negative results for the periostin level were extracted from each individual study. We assessed the risk of bias by using Quality Assessment of Diagnostic Accuracy Studies 2. We used the meta-analysis to produce summary estimates of accuracy. Results: In total, nine studies with 1757 subjects met the inclusion criteria. The pooled estimates of sensitivity, specificity, and diagnostic odds ratios for the detection of asthma were 0.58 (95% confidence interval [CI], 0.38–0.76), 0.86 (95% CI, 0.74–0.93), and 8.28 (95% CI, 3.67–18.68), respectively. The area under the summary receiver operating characteristic curve was 0.82 (95% CI, 0.79–0.85). And significant publication bias was found in this meta-analysis (p = 0.39). Conclusion: Serum periostin may be used for the diagnosis of asthma, with moderate diagnostic accuracy.
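For readers unfamiliar with the diagnostic odds ratio (DOR), the following Python sketch shows how it relates to sensitivity and specificity through the two likelihood ratios. The function name is hypothetical. Plugging in the pooled sensitivity and specificity gives a value near, but not equal to, the pooled DOR of 8.28, which is expected because a meta-analysis pools each measure separately across studies.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = positive likelihood ratio / negative likelihood ratio."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr


# Pooled point estimates quoted in the abstract above.
dor = diagnostic_odds_ratio(sensitivity=0.58, specificity=0.86)
print(f"DOR from pooled sensitivity and specificity: {dor:.2f}")
```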


Deep learning systems detect dysplasia with human-like accuracy using histopathology and probe-based confocal laser endomicroscopy

Scientific Reports ◽

10.1038/s41598-021-84510-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Shan Guleria ◽

Tilak U. Shah ◽

J. Vincent Pulido ◽

Matthew Fasullo ◽

Lubaina Ehsan ◽

...

Keyword(s):

Deep Learning ◽

Diagnostic Accuracy ◽

High Sensitivity ◽

Confocal Laser Endomicroscopy ◽

Confocal Laser ◽

Learning Approaches ◽

Learning Models ◽

Whole Slide Image ◽

Slide Image ◽

Level Model

Abstract Probe-based confocal laser endomicroscopy (pCLE) allows for real-time diagnosis of dysplasia and cancer in Barrett’s esophagus (BE) but is limited by low sensitivity. Even the gold standard of histopathology is hindered by poor agreement between pathologists. We deployed deep-learning-based image and video analysis in order to improve diagnostic accuracy of pCLE videos and biopsy images. Blinded experts categorized biopsies and pCLE videos as squamous, non-dysplastic BE, or dysplasia/cancer, and deep learning models were trained to classify the data into these three categories. Biopsy classification was conducted using two distinct approaches: a patch-level model and a whole-slide-image-level model. Gradient-weighted class activation maps (Grad-CAMs) were extracted from pCLE and biopsy models in order to determine tissue structures deemed relevant by the models. 1970 pCLE videos, 897,931 biopsy patches, and 387 whole-slide images were used to train, test, and validate the models. In pCLE analysis, models achieved a high sensitivity for dysplasia (71%) and an overall accuracy of 90% for all classes. For biopsies at the patch level, the model achieved a sensitivity of 72% for dysplasia and an overall accuracy of 90%. The whole-slide-image-level model achieved a sensitivity of 90% for dysplasia and 94% overall accuracy. Grad-CAMs for all models showed activation in medically relevant tissue regions. Our deep learning models achieved high diagnostic accuracy for both pCLE-based and histopathologic diagnosis of esophageal dysplasia and its precursors, similar to human accuracy in prior studies. These machine learning approaches may improve accuracy and efficiency of current screening protocols.


Diagnosing breast cancer tumors using stacked ensemble model

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219176 ◽

2021 ◽

pp. 1-9

Author(s):

Ahmet Haşim Yurttakal ◽

Hasan Erbay ◽

Türkan İkizceli ◽

Seyhan Karaçavuş ◽

Cenker Biçer

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Medical Imaging ◽

Early Stage ◽

False Negative ◽

Gradient Boosting ◽

Physical Sign ◽

Ensemble Model ◽

Learning Methods ◽

Dce Mri

Breast cancer is the most common cancer among women and arises from cells in the breast tissue. Early-stage detection could reduce death rates significantly, and the stage at detection determines the treatment process. Mammography is utilized to discover breast cancer at an early stage, prior to any physical sign. However, mammography might return a false negative. When a lesion is suspected to have a chance of cancer greater than two percent, a biopsy is recommended. About 30 percent of biopsies result in malignancy, which means the rate of unnecessary biopsies is high. To reduce unnecessary biopsies, Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) has recently been utilized to detect breast cancer, owing to its excellent capability in soft tissue imaging. Nowadays, DCE-MRI is a highly recommended method not only to identify breast cancer but also to monitor its development and to interpret tumorous regions. However, in addition to being a time-consuming process, its accuracy depends on radiologists’ experience. Radiomic data, on the other hand, are used in medical imaging and have the potential to extract disease characteristics that cannot be seen by the naked eye. Radiomic features are hard-coded and provide crucial information about the disease in the imaged region. Conversely, deep learning methods like convolutional neural networks (CNNs) learn features automatically from the dataset. Especially in medical imaging, CNNs perform better than methods based on hard-coded features. However, combining the power of these two types of features increases accuracy significantly, which is especially critical in medicine. Herein, a stacked ensemble of gradient boosting and deep learning models was developed to classify breast tumors using DCE-MRI images. The model makes use of radiomics acquired from pixel information in breast DCE-MRI images. Prior to training the model, factor analysis was applied to the radiomics to refine the feature set and eliminate features that were not useful. The performance metrics, as well as the comparisons to some well-known machine learning methods, indicate that the ensemble model outperforms its counterparts. The ensemble model’s accuracy is 94.87% and its AUC value is 0.9728. The recall and precision are 1.0 and 0.9130, respectively, whereas the F1-score is 0.9545.
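The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short Python sketch below does this; the function name is hypothetical, and only the three figures quoted in the abstract are used.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract above.
f1 = f1_score(precision=0.9130, recall=1.0)
print(f"F1 = {f1:.4f}")  # agrees with the reported 0.9545
```

A recall of 1.0 means no malignant tumor in the test set was missed, so every error counted by the precision figure is a false positive.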


Characteristics of patients who had a stroke not initially identified during emergency prehospital assessment: a systematic review

Emergency Medicine Journal ◽

10.1136/emermed-2020-209607 ◽

2021 ◽

pp. emermed-2020-209607

Author(s):

Stephanie P Jones ◽

Janet E Bray ◽

Josephine ME Gibson ◽

Graham McClelland ◽

Colette Miller ◽

...

Keyword(s):

Systematic Review ◽

Diagnostic Accuracy ◽

False Negative ◽

Visual Disturbance ◽

Screening Tools ◽

False Negatives ◽

Presenting Symptoms ◽

Ischaemic Attack ◽

Stroke Type ◽

Key Terms

Background: Around 25% of patients who had a stroke do not present with typical ‘face, arm, speech’ symptoms at onset, and are challenging for emergency medical services (EMS) to identify. The aim of this systematic review was to identify the characteristics of acute stroke presentations associated with inaccurate EMS identification (false negatives). Method: We performed a systematic search of MEDLINE, EMBASE, CINAHL and PubMed from 1995 to August 2020 using key terms: stroke, EMS, paramedics, identification and assessment. Studies included: patients who had a stroke or patient records; ≥18 years; any stroke type; prehospital assessment undertaken by health professionals including paramedics or technicians; data reported on prehospital diagnostic accuracy and/or presenting symptoms. Data were extracted and study quality assessed by two researchers using the Quality Assessment of Diagnostic Accuracy Studies V.2 tool. Results: Of 845 studies initially identified, 21 observational studies met the inclusion criteria. Of the 6934 stroke and Transient Ischaemic Attack patients included, there were 1774 (26%) false negative patients (range from 4 (2%) to 247 (52%)). Commonly documented symptoms in false negative cases were speech problems (n=107; 13%–28%), nausea/vomiting (n=94; 8%–38%), dizziness (n=86; 23%–27%), changes in mental status (n=51; 8%–25%) and visual disturbance/impairment (n=43; 13%–28%). Conclusion: Speech problems and posterior circulation symptoms were the most commonly documented symptoms among stroke presentations that were not correctly identified by EMS (false negatives). However, the addition of further symptoms to stroke screening tools requires valuation of subsequent sensitivity and specificity, training needs and possible overuse of high priority resources.


Testing as Prevention of Resistance in Bacteria Causing Sexually Transmitted Infections—A Population-Based Model for Germany

Antibiotics ◽

10.3390/antibiotics10080929 ◽

2021 ◽

Vol 10 (8) ◽

pp. 929

Author(s):

Andreas Hahn ◽

Hagen Frickmann ◽

Ulrike Loderstädt

Keyword(s):

Sexually Transmitted Infections ◽

False Negative ◽

Population Based ◽

Sexually Transmitted ◽

Test Characteristics ◽

Antibiotic Drugs ◽

False Positive Signal ◽

Antibiotic Drug ◽

Potential Applicability ◽

Sex With Men

Prescribed antibiotic treatments which do not match the therapeutic requirements of potentially co-existing undetected sexually transmitted infections (STIs) can facilitate the selection of antibiotic-drug-resistant clones. To reduce this risk, this modelling assessed the potential applicability of reliable rapid molecular test assays targeting bacterial STI prior to the prescription of antibiotic drugs. The modelling was based on the prevalence of three bacterial STIs in German heterosexual and men-having-sex-with-men (MSM) populations, as well as on reported test characteristics of respective assays. In the case of the application of rapid molecular STI assays for screening, the numbers needed to test in order to correctly identify any of the included bacterial STIs ranged from 103 to 104 for the heterosexual population and from 5 to 14 for the MSM population. The number needed to harm—defined as getting a false negative result for any of the STIs and a false positive signal for another one, potentially leading to an even more inappropriate adaptation of antibiotic therapy than without any STI screening—was at least 208,995 for the heterosexuals and 16,977 for the MSM. Therefore, the screening approach may indeed be suitable to avoid unnecessary selective pressure on bacterial causes of sexually transmitted infections.


Incorporating false negative tests in epidemiological models for SARS-CoV-2 transmission and reconciling with seroprevalence estimates

Scientific Reports ◽

10.1038/s41598-021-89127-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Rupam Bhattacharyya ◽

Ritoban Kundu ◽

Ritwik Bhaduri ◽

Debashree Ray ◽

Lauren J. Beesley ◽

...

Keyword(s):

False Negative ◽

Population Based ◽

Training Data ◽

Epidemiological Models ◽

Rt Pcr ◽

Antibody Prevalence ◽

Igg Antibody ◽

False Negatives ◽

Seir Model ◽

Study Population

Abstract Susceptible-Exposed-Infected-Removed (SEIR)-type epidemiologic models, modeling unascertained infections latently, can predict unreported cases and deaths assuming perfect testing. We apply a method we developed to account for the high false negative rates of diagnostic RT-PCR tests for detecting an active SARS-CoV-2 infection in a classic SEIR model. The number of unascertained cases and false negatives being unobservable in a real study, population-based serosurveys can help validate model projections. Applying our method to training data from Delhi, India, during March 15–June 30, 2020, we estimate the underreporting factor for cases at 34–53 (deaths: 8–13) on July 10, 2020, largely consistent with the findings of the first round of serosurveys for Delhi (done during June 27–July 10, 2020) with an estimated 22.86% IgG antibody prevalence, yielding estimated underreporting factors of 30–42 for cases. Together, these imply approximately 96–98% cases in Delhi remained unreported (July 10, 2020). Updated calculations using training data during March 15–December 31, 2020 yield estimated underreporting factor for cases at 13–22 (deaths: 3–7) on January 23, 2021, which are again consistent with the latest (fifth) round of serosurveys for Delhi (done during January 15–23, 2021) with an estimated 56.13% IgG antibody prevalence, yielding an estimated range for the underreporting factor for cases at 17–21. Together, these updated estimates imply approximately 92–96% cases in Delhi remained unreported (January 23, 2021). Such model-based estimates, updated with latest data, provide a viable alternative to repeated resource-intensive serosurveys for tracking unreported cases and deaths and gauging the true extent of the pandemic.
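The step from an underreporting factor to a percentage of unreported cases can be sketched in a few lines of Python. The sketch assumes the underreporting factor is the ratio of total to reported cases, which the abstract does not define explicitly; the function name is hypothetical.

```python
def unreported_fraction(underreporting_factor: float) -> float:
    """Share of cases never reported, assuming factor = total / reported."""
    if underreporting_factor < 1.0:
        raise ValueError("factor must be at least 1 under this definition")
    return 1.0 - 1.0 / underreporting_factor


# Extremes of the two ranges quoted for July 10, 2020 (30-42 and 34-53).
july_low = unreported_fraction(30)
july_high = unreported_fraction(53)

# Extremes of the two ranges quoted for January 23, 2021 (13-22 and 17-21).
january_low = unreported_fraction(13)
january_high = unreported_fraction(22)

print(f"July 2020: {july_low:.1%} to {july_high:.1%}")
print(f"January 2021: {january_low:.1%} to {january_high:.1%}")
```

Under that assumed definition the results fall within the roughly 96–98% and 92–96% ranges given in the abstract.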


Deep Learning-Based Industry Product Defect Detection with Low False Negative Error Tolerance

2020 11th International Conference on Awareness Science and Technology (iCAST) ◽

10.1109/icast51195.2020.9319407 ◽

2020 ◽

Author(s):

Tsukasa Ueno ◽

Qiangfu Zhao ◽

Shota Nakada

Keyword(s):

Deep Learning ◽

Defect Detection ◽

False Negative ◽

Error Tolerance ◽

Negative Error ◽

False Negative Error ◽

Product Defect


Assessing the Role of Pericardial Fat as a Biomarker Connected to Coronary Calcification—A Deep Learning Based Approach Using Fully Automated Body Composition Analysis

Journal of Clinical Medicine ◽

10.3390/jcm10020356 ◽

2021 ◽

Vol 10 (2) ◽

pp. 356

Author(s):

Lennard Kroll ◽

Kai Nassenstein ◽

Markus Jochims ◽

Sven Koitka ◽

Felix Nensa

Keyword(s):

Adipose Tissue ◽

Deep Learning ◽

Coronary Artery ◽

Cardiac Ct ◽

Coronary Calcification ◽

Risk Scores ◽

Agatston Score ◽

Composition Analysis ◽

Significance Level ◽

Tissue Quantification

(1) Background: Epi- and Paracardial Adipose Tissue (EAT, PAT) have been spotlighted as important biomarkers in cardiological assessment in recent years. Since biomarker quantification is an increasingly important method for clinical use, we wanted to examine fully automated EAT and PAT quantification for possible use in cardiovascular risk stratification. (2) Methods: 966 patients with intermediate Framingham risk scores for Coronary Artery Disease referred for coronary calcium scans were included in clinical routine retrospectively. The Coronary Artery Calcium Score (CACS) was extracted and tissue quantification was performed by a deep learning network. (3) Results: The Computed Tomography (CT) segmentations predicted by the network indicated no significant correlation between EAT volume and EAT radiodensity when compared to Agatston score (r = 0.18, r = −0.09). CACS 0 category patients showed significantly lower levels of total EAT and PAT volumes and higher EAT and PAT densities than CACS 1–99 category patients (p < 0.01). Notably, this difference did not reach significance regarding EAT attenuation in male patients. Women older than 50 years, thus more likely to be postmenopausal, were shown to be at higher risk of coronary calcification (p < 0.01, OR = 4.59). CACS 1–99 vs. CACS ≥100 category patients remained below significance level (EAT volume: p = 0.087, EAT attenuation: p = 0.98). (4) Conclusions: Our study proves the feasibility of a fully automated adipose tissue analysis in clinical cardiac CT and confirms in a large clinical cohort that volume and attenuation of EAT and PAT are not correlated with CACS. Broadly available deep learning based rapid and reliable tissue quantification should thus be discussed as a method to assess this biomarker as a supplementary risk predictor in cardiac CT.


Evaluation of an AI-based, automatic coronary artery calcium scoring software

European Radiology ◽

10.1007/s00330-019-06489-x ◽

2019 ◽

Vol 30 (3) ◽

pp. 1671-1678 ◽

Cited By ~ 7

Author(s):

Mårten Sandstedt ◽

Lilian Henriksson ◽

Magnus Janzon ◽

Gusten Nyberg ◽

Jan Engvall ◽

...

Keyword(s):

Artificial Intelligence ◽

Coronary Artery ◽

Coronary Artery Calcium ◽

Rank Test ◽

Agatston Score ◽

Risk Category ◽

Automatic Method ◽

Calcium Scoring ◽

Excellent Correlation ◽

Automatic Software

Abstract Objectives To evaluate an artificial intelligence (AI)–based, automatic coronary artery calcium (CAC) scoring software, using a semi-automatic software as a reference. Methods This observational study included 315 consecutive, non-contrast-enhanced calcium scoring computed tomography (CSCT) scans. A semi-automatic and an automatic software obtained the Agatston score (AS), the volume score (VS), the mass score (MS), and the number of calcified coronary lesions. Semi-automatic and automatic analysis times were registered, including a manual double-check of the automatic results. Statistical analyses were Spearman’s rank correlation coefficient (ρ), intra-class correlation (ICC), Bland Altman plots, weighted kappa analysis (κ), and Wilcoxon signed-rank test. Results The correlation and agreement for the AS, VS, and MS were ρ = 0.935, 0.932, 0.934 (p < 0.001), and ICC = 0.996, 0.996, 0.991, respectively (p < 0.001). The correlation and agreement for the number of calcified lesions were ρ = 0.903 and ICC = 0.977 (p < 0.001), respectively. The Bland Altman mean difference and 1.96 SD upper and lower limits of agreement for the AS, VS, and MS were −8.2 (−115.1 to 98.2), −7.4 (−93.9 to 79.1), and −3.8 (−33.6 to 25.9), respectively. Agreement in risk category assignment was 89.5% and κ = 0.919 (p < 0.001). The median time for the semi-automatic and automatic method was 59 s (IQR 35–100) and 36 s (IQR 29–49), respectively (p < 0.001). Conclusions There was an excellent correlation and agreement between the automatic software and the semi-automatic software for three CAC scores and the number of calcified lesions. Risk category classification was accurate but showed a tendency toward overestimation. Also, the automatic method was less time-demanding.
Key Points
• Coronary artery calcium (CAC) scoring is an excellent candidate for artificial intelligence (AI) development in a clinical setting.
• An AI-based, automatic software obtained CAC scores with excellent correlation and agreement compared with a conventional method but was less time-consuming.
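The Bland Altman figures in this abstract (a mean difference with 1.96 SD limits of agreement) follow a simple recipe, sketched below in Python using only the standard library. The function name and the paired scores are hypothetical and are not data from the study; the sample standard deviation is assumed.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Hypothetical paired Agatston scores (illustrative only).
automatic = [10.0, 20.0, 30.0, 40.0]
semi_automatic = [12.0, 18.0, 33.0, 41.0]

bias, lower, upper = bland_altman_limits(automatic, semi_automatic)
print(f"bias={bias:.1f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A negative bias in this convention means the first method scores lower on average than the second.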
