Single-Examination Risk Prediction of Severe Retinopathy of Prematurity

Author(s):  
Aaron S. Coyner ◽  
Jimmy S. Chen ◽  
Praveer Singh ◽  
Robert L. Schelonka ◽  
Brian K. Jordan ◽  
...  

BACKGROUND AND OBJECTIVES Retinopathy of prematurity (ROP) is a leading cause of childhood blindness. Screening and treatment reduce this risk but require multiple examinations of infants, most of whom will not develop severe disease. Previous work has suggested that artificial intelligence may be able to detect incident severe disease (treatment-requiring retinopathy of prematurity [TR-ROP]) before clinical diagnosis. We aimed to build a risk model that combined artificial intelligence with clinical demographics to reduce the number of examinations without missing cases of TR-ROP. METHODS Infants undergoing routine ROP screening examinations (1579 total eyes, 190 with TR-ROP) were recruited from 8 North American study centers. A vascular severity score (VSS) was derived from retinal fundus images obtained at 32 to 33 weeks’ postmenstrual age. Seven ElasticNet logistic regression models were trained, one for each combination of birth weight, gestational age, and VSS. The area under the precision-recall curve was used to identify the highest-performing model. RESULTS The gestational age + VSS model had the highest performance (mean ± SD area under the precision-recall curve: 0.35 ± 0.11). On 2 different test data sets (n = 444 and n = 132), sensitivity was 100% (positive predictive value: 28.1% and 22.6%) and specificity was 48.9% and 80.8% (negative predictive value: 100.0%). CONCLUSIONS Using a single examination, this model identified all infants who developed TR-ROP, on average, >1 month before diagnosis with moderate to high specificity. This approach could lead to earlier identification of incident severe ROP, reducing late diagnosis and treatment while simultaneously reducing the number of ROP examinations and unnecessary physiologic stress for low-risk infants.
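
The count of seven models in the Methods follows from the three candidate predictors: a set of three items has 2^3 − 1 = 7 non-empty subsets. A minimal sketch of that enumeration is given here; the string labels are illustrative names chosen for this example, not identifiers from the study, and the model fitting itself is not reproduced.

```python
from itertools import combinations

# Candidate predictors named in the abstract; the labels are illustrative only.
FEATURES = ("birth_weight", "gestational_age", "vss")


def feature_subsets(features):
    """Return every non-empty subset of `features`, smallest subsets first."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


subsets = feature_subsets(FEATURES)
for subset in subsets:
    print(" + ".join(subset))
```

Each printed line corresponds to one candidate model; the abstract reports that the gestational age + VSS subset performed best.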

Author(s):  
Jennifer B. Fundora ◽  
Gil Binenbaum ◽  
Lauren Tomlinson ◽  
Yinxi Yu ◽  
Gui-shuang Ying ◽  
...  

Objective The study aimed to determine the association of surgical necrotizing enterocolitis (NEC) and its timing with the development and timing of retinopathy of prematurity (ROP). Study Design This was a secondary data analysis of 7,483 preterm infants from the Postnatal Growth and Retinopathy of Prematurity Study. Associations of surgical NEC, early-onset surgical NEC (8–28 days), and late-onset surgical NEC (over 28 days) with ROP were evaluated by using multivariable logistic regression models, controlling for birth weight, gestational age, small for gestational age status, chronic lung disease, intraventricular hemorrhage, hydrocephalus, patent ductus arteriosus, and periventricular leukomalacia. Results Three hundred fifty-six (4.8%) infants had surgical NEC, with 56% having early surgical NEC. Infants with surgical NEC had a higher risk of any ROP and severe ROP (adjusted odds ratio [OR]: 2.7 [95% CI: 1.9–3.7] and 2.5 [95% CI: 1.9–3.3], respectively; p < 0.001) compared with infants without surgical NEC. Infants with early surgical NEC were at the highest risk of developing ROP and severe ROP (adjusted OR: 3.1 [95% CI: 2.1–4.8] and 3.3 [95% CI: 2.3–4.7], respectively; p < 0.001). Infants with late surgical NEC were also at increased risk of developing ROP and severe ROP (adjusted OR: 2.1 [95% CI: 1.3–3.4] and 1.9 [95% CI: 1.3–2.8], respectively; p < 0.001) compared with infants without surgical NEC. Conclusion Infants with surgical NEC, especially early surgical NEC, are at higher risk of ROP and severe ROP.


2012 ◽  
Vol 130 (12) ◽  
pp. 1560 ◽  
Author(s):  
Gil Binenbaum ◽  
Gui-shuang Ying ◽  
Graham E. Quinn ◽  
Jiayan Huang ◽  
Stephan Dreiseitl ◽  
...  

Author(s):  
Raúl Fernández-Ramón

Introduction: WINROP (Weight, Insulin-like growth factor 1, Neonatal Retinopathy of Prematurity) is a computer-based ROP risk algorithm that correlates postnatal weight gain with the development of treatment-requiring ROP. The purpose of this study was to evaluate the ability of the WINROP algorithm to detect severe (Type 1 or Type 2) ROP in a Spanish cohort of infants. Methods: Birth weight, gestational age, and weekly weight measurements of preterm infants (>23 and <32 weeks gestation) born between 2015 and 2017 were retrospectively collected and entered into the WINROP algorithm. Infants were classified according to alarm activation and compared with ROP screening outcomes. Sensitivity, specificity, and predictive values were calculated. Results: A total of 109 infants were included. The mean gestational age was 29.37 ± 2.26 weeks and mean birth weight was 1178 ± 320 g. An alarm occurred in 47.7% (52/109) of neonates, with a mean time from birth to alarm of 1.9 ± 1.4 weeks. WINROP had a sensitivity of 100% (95% CI, 59-100), a specificity of 55.9% (95% CI, 45.7-65.7), a positive predictive value of 13.5% (95% CI, 11.1-16.2) and a negative predictive value of 100% (95% CI, 93.7-100) for predicting severe ROP. Conclusion: The WINROP algorithm proved to be a useful tool for detecting severe ROP in our cohort. Nevertheless, in extremely preterm infants (GA <28 weeks) the results should be interpreted with caution, and optimization of WINROP may be necessary to improve its utility in other populations.
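
The four reported accuracy measures can be reproduced from a single 2×2 table. The abstract does not report the cell counts, so the counts in this sketch are back-calculated and should be read as an assumption: 52 alarms with a positive predictive value of 13.5% implies about 7 infants with severe ROP, and 100% sensitivity implies none were missed, which leaves 45 false alarms and 57 infants without an alarm.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute standard screening-test measures from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cases / all cases
        "specificity": tn / (tn + fp),  # correct negatives / all non-cases
        "ppv": tp / (tp + fp),          # true cases / all alarms
        "npv": tn / (tn + fn),          # true non-cases / all non-alarms
    }


# Back-calculated counts (an assumption, not reported in the abstract):
# 52 alarms = 7 severe ROP + 45 without; 57 infants without an alarm, none severe.
winrop = screening_metrics(tp=7, fp=45, fn=0, tn=57)
for name, value in winrop.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sketch returns the reported values to one decimal place (100%, 55.9%, 13.5%, 100%). The wide sensitivity interval follows from there being only about seven severe cases.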


Author(s):  
Felipe Soares ◽  
Aline Villavicencio ◽  
Flávio Sanson Fogliatto ◽  
Maria Helena Pitombeira Rigatto ◽  
Michel José Anzanello ◽  
...  

Background: The SARS-CoV-2 virus responsible for COVID-19 poses a significant challenge to healthcare systems worldwide. Despite governmental initiatives aimed at containing the spread of the disease, several countries are experiencing unmanageable increases in the demand for ICU beds, medical equipment, and larger testing capacity. Efficient COVID-19 diagnosis enables healthcare systems to provide better care for patients while protecting caregivers from the disease. However, many countries are constrained by the limited number of test kits available and by a lack of equipment and trained professionals. For patients visiting emergency rooms (ERs) with suspected COVID-19, prompt diagnosis may improve the outcome and even provide information for efficient hospital management. In such a context, a quick, inexpensive and readily available test to perform an initial triage in ERs could help to smooth patient flow, provide better patient care, and reduce the backlog of exams. Methods: In this case-control quantitative study, we developed a strategy backed by artificial intelligence to perform an initial screening of suspected COVID-19 patients. We developed a machine learning classifier that takes widely available simple blood exams as input and classifies samples as likely to be positive (having SARS-CoV-2) or negative (not having SARS-CoV-2). Based on this initial classification, positive cases can be referred for further highly sensitive testing (e.g. CT scan, or specific antibodies). We used publicly available data from the Albert Einstein Hospital in Brazil from 5,644 patients. Focusing on simple blood exam figures as main predictors, a sample of 599 subjects who had the fewest missing values for 16 common exams was selected. Of these 599 patients, 81 tested positive for SARS-CoV-2 (determined by RT-PCR).
Based on the reduced dataset, we built an artificial intelligence classification framework, ER-CoV, aimed at determining whether suspected patients arriving in the ER were likely to be negative for SARS-CoV-2. The primary goal of this investigation is to develop a classifier with high specificity and high negative predictive values, with reasonable sensitivity. Findings: Our AI framework achieved an average specificity of 85.98% [95%CI: 84.94% – 86.84%] and negative predictive value (NPV) of 94.92% [95%CI: 94.37% – 95.37%]. Those values are aligned with our goal of providing an effective low-cost system to triage suspected patients in ERs. As for sensitivity, our model achieved an average of 70.25% [95%CI: 66.57% – 73.12%] and positive predictive value (PPV) of 44.96% [95%CI: 43.15% – 46.87%]. The area under the curve (AUC) of the receiver operating characteristic (ROC) was 86.78% [95%CI: 85.65% – 87.90%]. An error analysis (inspection of which patients were misclassified) identified that, on average, 28% of the false negative results would have been hospitalized anyway; thus the model is making mistakes for severe cases that would not be overlooked, partially mitigating the fact that the test is not highly sensitive. All code for our AI model, called ER-CoV, is publicly available at https://github.com/soares-f/ER-CoV. Interpretation: Based on the capacity of our model to accurately predict which cases are negative among suspected patients arriving in emergency rooms, we envision that this framework may play an important role in patient triage. Probably the most important outcome is related to testing availability, which at this point is extremely low in many countries. Considering the achieved specificity, we could reduce by at least 90% the number of SARS-CoV-2 tests performed in emergency rooms, with around 5% chance of getting a false negative.
The second important outcome is related to patient management in hospitals. Patients predicted as positive by our framework could be immediately separated from other patients while waiting for the results of confirmatory tests. This could reduce the spread rate within hospitals, since in many of them all suspected cases are kept in the same ward. In Brazil, where the data were collected, the infection is starting to spread quickly and the lead time of a SARS-CoV-2 test may be up to 2 weeks. Funding: The University of Sheffield provided financial support for the Ph.D. scholarship for Felipe Soares. Prof. Fogliatto’s research is funded by CNPq [Grant # 303509/2015-5]. Prof. Anzanello’s research is funded by CNPq [Grant # 306724/2018-9].
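
The predictive values in the Findings depend on prevalence as well as on sensitivity and specificity. As a rough consistency check, the sketch below applies Bayes' rule to the reported average sensitivity and specificity, taking prevalence as 81/599 from the Methods. The function name is illustrative, and using point estimates in place of the study's own averaging procedure is an assumption of this example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalence in the analysed sample: 81 RT-PCR positives among 599 patients.
prevalence = 81 / 599
ppv, npv = predictive_values(0.7025, 0.8598, prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

The result is close to the reported PPV of 44.96% and NPV of 94.92%. An exact match is not expected, because the abstract reports averages rather than values derived from a single 2×2 table. The same function shows how both predictive values would shift in a setting with a different prevalence.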


2021 ◽  
Vol 28 (1) ◽  
pp. e100312
Author(s):  
Christos A Makridis ◽  
Tim Strebel ◽  
Vincent Marconi ◽  
Gil Alterovitz

Using administrative data on all Veterans who enter Department of Veterans Affairs (VA) medical centres throughout the USA, this paper uses artificial intelligence (AI) to predict mortality rates for patients with COVID-19 between March and August 2020. First, using comprehensive data on over 10 000 Veterans’ medical history, demographics and lab results, we estimate five AI models. Our XGBoost model performs the best, producing an area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve of 0.87 and 0.41, respectively. We show how focusing on the performance of the AUROC alone can lead to unreliable models. Second, through a unique collaboration with the Washington D.C. VA medical centre, we develop a dashboard that incorporates these risk factors and the contributing sources of risk, which we deploy across local VA medical centres throughout the country. Our results provide a concrete example of how AI recommendations can be made explainable and practical for clinicians and their interactions with patients.


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Qian M. Zhou ◽  
Lu Zhe ◽  
Russell J. Brooke ◽  
Melissa M. Hudson ◽  
Yan Yuan

Abstract Background Incremental value (IncV) evaluates the performance change between an existing risk model and a new model. Different IncV metrics do not always agree with each other. For example, compared with a prescribed-dose model, an ovarian-dose model for predicting acute ovarian failure has a slightly lower area under the receiver operating characteristic curve (AUC) but increases the area under the precision-recall curve (AP) by 48%. This phenomenon of disagreement is not uncommon, and can create confusion when assessing whether the added information improves the model prediction accuracy. Methods In this article, we examine the analytical connections and differences between the AUC IncV (ΔAUC) and AP IncV (ΔAP). We also compare the true values of these two IncV metrics in a numerical study. Additionally, as both are semi-proper scoring rules, we compare them with a strictly proper scoring rule: the IncV of the scaled Brier score (ΔsBrS) in the numerical study. Results We demonstrate that ΔAUC and ΔAP are both weighted averages of the changes (from the existing model to the new one) in separating the risk score distributions between events and non-events. However, ΔAP assigns heavier weights to the changes in higher-risk regions, whereas ΔAUC weights the changes equally. Due to this difference, the two IncV metrics can disagree, and the numerical study shows that their disagreement becomes more pronounced as the event rate decreases. In the numerical study, we also find that ΔAP has a wide range, from negative to positive, but the range of ΔAUC is much smaller. In addition, ΔAP and ΔsBrS are highly consistent, but ΔAUC is negatively correlated with ΔsBrS and ΔAP when the event rate is low. Conclusions ΔAUC treats the wins and losses of a new risk model equally across different risk regions. When neither the existing nor the new model is the true model, this equality could attenuate a superior performance of the new model for a sub-region.
In contrast, ΔAP accentuates the change in the prediction accuracy for higher-risk regions.
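
The different weighting of the two metrics can be seen on a toy example. The sketch below computes AUC as the proportion of correctly ordered (event, non-event) pairs and AP as the mean precision at the rank of each event. The labels and scores are hypothetical, invented for this illustration, and the code assumes no tied scores when ranking; it illustrates the general idea only and does not reproduce the paper's analytical results.

```python
def auc(labels, scores):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [s for y, s in zip(labels, scores) if y == 1]
    non_events = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in non_events)
    return wins / (len(events) * len(non_events))


def average_precision(labels, scores):
    """Mean of the precision at each event's rank (assumes untied scores)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical data: 2 events among 10 subjects (low event rate).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# "Existing" model: both events ranked just below two non-events.
scores_a = [0.7, 0.6, 0.9, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
# "New" model: one event ranked first, the other pushed further down.
scores_b = [0.95, 0.35, 0.9, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

for name, scores in (("existing", scores_a), ("new", scores_b)):
    print(name, round(auc(labels, scores), 3),
          round(average_precision(labels, scores), 3))
```

In this toy case both models order 12 of the 16 event/non-event pairs correctly, so ΔAUC is zero, while AP rises from 5/12 to 2/3 because the new model places an event at the very top of the ranking.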


2020 ◽  
Author(s):  
Carson Lam ◽  
Jacob Calvert ◽  
Gina Barnes ◽  
Emily Pellegrini ◽  
Anna Lynn-Palevsky ◽  
...  

BACKGROUND In the wake of COVID-19, the United States has developed a three-stage plan to outline the parameters to determine when states may reopen businesses and ease travel restrictions. The guidelines also identify subpopulations of Americans that should continue to stay at home due to being at high risk for severe disease should they contract COVID-19. These guidelines were based on population-level demographics, rather than individual-level risk factors. As such, they may misidentify individuals who are at high risk for severe illness and who should therefore not return to work until vaccination or widespread serological testing is available. OBJECTIVE This study evaluated a machine learning algorithm for the prediction of serious illness due to COVID-19 using inpatient data collected from electronic health records. METHODS The algorithm was trained to identify patients for whom a diagnosis of COVID-19 was likely to result in hospitalization, and compared against four U.S. policy-based criteria: age over 65, having a serious underlying health condition, age over 65 or having a serious underlying health condition, and age over 65 and having a serious underlying health condition. RESULTS This algorithm identified 80% of patients at risk for hospitalization due to COVID-19, versus at most 62% identified by government guidelines. The algorithm also achieved a high specificity of 95%, outperforming government guidelines. CONCLUSIONS This algorithm may help to enable a broad reopening of the American economy while ensuring that patients at high risk for serious disease remain home until vaccination and testing become available.


2021 ◽  
Vol 60 (6-7) ◽  
pp. 304-313
Author(s):  
Shailender Madani ◽  
Rohit Madani ◽  
Suchi Parikh ◽  
Ahila Manivannan ◽  
Wilma R. Orellana ◽  
...  

Our study aims to assess improvement with symptomatic treatment of pain-related functional gastrointestinal disorders (FGIDs) in a biopsychosocial construct and evaluate validity of Rome III criteria. Children with chronic abdominal pain diagnosed with an FGID or organic disease were followed for 1 year: 256/334 were diagnosed with an FGID and 78/334 were diagnosed with a possible organic disease due to alarm signs or not meeting Rome III criteria. After 1 year, 251 had true FGID and 46 had organic diseases. Ninety percent of FGID patients improved with symptomatic treatment over an average of 5.4 months. With a 95% confidence interval, Rome criteria predicted FGIDs with sensitivity 0.89, specificity 0.90, positive predictive value 0.98, and negative predictive value 0.59. We conclude that symptomatic treatment of pain-related FGIDs results in clinical improvement and could reduce invasive/expensive testing. Rome III criteria’s high specificity and positive predictive value suggest they can rule in a diagnosis of FGID.

