Development and validation of natural language processing (NLP) algorithm for detection of distant versus local breast cancer recurrence and metastatic site.

2020 ◽  
Vol 38 (15_suppl) ◽  
pp. 2043-2043
Author(s):  
Yasmin Karimi ◽  
Douglas W. Blayney ◽  
Allison W. Kurian ◽  
Daniel Rubin ◽  
Imon Banerjee

2043 Background: Electronic health records (EHR) are used for retrospective cancer outcomes analysis. Sites and timing of recurrence are not captured in structured EHR data. Novel computerized methods are necessary to use unstructured longitudinal EHR data for large scale studies. Methods: We previously developed a neural network-based NLP algorithm to identify no recurrence vs. metastatic recurrence cases by analyzing physician notes, pathology and radiology reports in Stanford’s breast cancer database, Oncoshare (Cohort A). To validate this algorithm for local vs. distant recurrence, we identified a distinct Oncoshare cohort (Cohort B). Cases were manually curated for longitudinal development of local or distant recurrence and metastatic sites. A two-sided t-test was used to compare mean probabilities between local and distant recurrence cases. Next, we combined cases in Cohorts A and B to train and validate a novel NLP classifier that identifies metastatic site. The combined cohort was randomly divided into training and validation sets. Sensitivity and specificity were calculated for the NLP algorithm’s ability to detect metastatic sites compared to manual curation. Results: In Cohort B: 350 metastatic cases were identified. Mean probability for local and distant recurrence was 0.43 and 0.79, respectively and differed significantly for patients with local vs. distant recurrence (p<0.01). In Cohorts A and B: 632 metastatic cases were used for determination of sites. Sensitivity and specificity were highest for detection of peritoneal metastasis followed by liver, lung, skin, bone and central nervous system (table). Conclusions: This NLP algorithm is a scalable tool that uses unstructured EHR data to capture breast cancer recurrence, distinguishing local from distant recurrence and identifying metastatic site. This method may facilitate analysis of large datasets and correlation of outcomes with metastatic site. [Table: see text]
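The sensitivity and specificity comparison described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the label vectors are hypothetical, and it assumes one binary label per case for a single metastatic site, with manual curation as the reference standard.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired binary labels.

    predicted: labels from the NLP classifier (1 = site involved)
    reference: labels from manual curation, treated as ground truth
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for one site (1 = liver metastasis present).
nlp_labels = [1, 1, 0, 0, 1, 0, 0, 1]
manual_labels = [1, 0, 0, 0, 1, 0, 1, 1]

sens, spec = sensitivity_specificity(nlp_labels, manual_labels)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Repeating the calculation once per site (peritoneum, liver, lung, skin, bone, central nervous system) would give the kind of per-site table the abstract refers to.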

2021 ◽  
pp. 469-478
Author(s):  
Yasmin H. Karimi ◽  
Douglas W. Blayney ◽  
Allison W. Kurian ◽  
Jeanne Shen ◽  
Rikiya Yamashita ◽  
...  

PURPOSE Large-scale analysis of real-world evidence is often limited to structured data fields that do not contain reliable information on recurrence status and disease sites. In this report, we describe a natural language processing (NLP) framework that uses data from free-text, unstructured reports to classify recurrence status and sites of recurrence for patients with breast and hepatocellular carcinomas (HCC). METHODS Using two cohorts of breast cancer and HCC cases, we validated the ability of a previously developed NLP model to distinguish between no recurrence, local recurrence, and distant recurrence, based on clinician notes, radiology reports, and pathology reports compared with manual curation. A second NLP model was trained and validated to identify sites of recurrence. We compared the ability of each NLP model to identify the presence, timing, and site of recurrence, when compared against manual chart review and International Classification of Diseases coding. RESULTS A total of 1,273 patients were included in the development and validation of the two models. The NLP model for recurrence detects distant recurrence with an area under the curve of 0.98 (95% CI, 0.96 to 0.99) and 0.95 (95% CI, 0.88 to 0.98) in breast and HCC cohorts, respectively. The mean accuracy of the NLP model for detecting any site of distant recurrence was 0.9 for breast cancer and 0.83 for HCC. The NLP model for recurrence identified a larger proportion of patients with distant recurrence in a breast cancer database (11.1%) compared with International Classification of Diseases coding (2.31%). CONCLUSION We developed two NLP models to identify distant cancer recurrence, timing of recurrence, and sites of recurrence based on unstructured electronic health record data. These models can be used to perform large-scale retrospective studies in oncology.
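The area under the curve reported for distant-recurrence detection can be read as the probability that a randomly chosen recurrence case receives a higher model score than a randomly chosen non-recurrence case. A minimal sketch of that calculation, assuming per-patient probability scores and binary reference labels (the function name and the data below are hypothetical, not taken from the study):

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model probabilities and manually curated labels
# (1 = distant recurrence, 0 = no distant recurrence).
scores = [0.92, 0.81, 0.40, 0.35, 0.77, 0.10]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases; it is used here only because it makes the definition explicit.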


2013 ◽  
Vol 31 (15_suppl) ◽  
pp. e22178-e22178
Author(s):  
Carl Anthony Blau ◽  
Christopher P. Miller ◽  
Amanda N. Kortum ◽  
Jason D. Thorpe ◽  
Michel Schummer ◽  
...  

e22178 Background: Estrogen, progesterone, and epidermal growth factor receptor ligands stimulate growth in a subset of breast cancers; however, recent studies suggest roles for additional factors such as interleukins-6 and 8, prolactin, and erythropoietin. These and other growth factors act upon ligand-specific receptors to activate Janus kinase 2 (JAK2), and JAK2 inhibitors are under investigation as a novel targeted therapy. We tested whether erythropoietin receptor (EPOR) or JAK2 mRNA levels are associated with distant breast cancer recurrence. Methods: We used quantitative RT-PCR to measure mRNA levels of JAK2, EPOR and a series of control genes using archival tumors from 112 women who experienced distant breast cancer recurrence (cases) and 112 tumors from women who did not (controls). Cases and controls were matched for tumor size, lymphovascular invasion, nodal status, extra-nodal extension, and ER/PR/HER2. Recurrence risks were evaluated using logistic regression. Associations between mRNA levels (via microarray data) and recurrence-free survival were validated in an independent cohort from the Netherlands Cancer Institute (n=295) using Cox proportional hazards regression. Results: Increasing JAK2 mRNA levels strongly correlated with a reduced risk of distant recurrence in our case-control study (univariate p=0.0004, multivariate p=0.003), and with improved recurrence-free survival in the validation cohort (univariate p=0.0009; multivariate p=0.003). Remarkably, in the validation cohort, the ranking of the prognostic significance of JAK2 (top 3.5% of ~25,000 genes) was comparable to that of the known strong prognostic indicator of recurrence, ESR1 (top 1.3%). Similarly, although less prominently, increasing EPOR mRNA levels were significantly associated with reduced distant recurrence in our case-control study (continuous model: univariate p=0.01, multivariate p=0.05) and with improved recurrence-free survival in the validation cohort in univariate (p=0.03) but not multivariate analysis (p=0.35). Conclusions: JAK2 mRNA levels in breast tumors correlate with a reduced risk of recurrence. Understanding the mechanistic basis for this association is important for the rational application of JAK2 inhibitors.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Soojin Cha ◽  
Esak Lee ◽  
Hong-Hee Won

Abstract Metastasis is the major cause of death in breast cancer patients. Although previous large-scale analyses have identified frequently altered genes specific to metastatic breast cancer (MBC) compared with those in primary breast cancer (PBC), metastatic site-specific altered genes in MBC remain largely uncharacterized. Moreover, large-scale analyses are required owing to the low expected frequency of such alterations, likely caused by tumor heterogeneity and late dissemination of breast cancer. To clarify MBC-specific genetic alterations, we integrated publicly available clinical and mutation data of 261 genes, including MBC drivers, from 4268 MBC and 5217 PBC patients from eight different cohorts. We performed meta-analyses and logistic regression analyses to identify MBC-enriched genetic alterations relative to those in PBC across 15 different metastatic site sets. We identified 11 genes that were more frequently altered in MBC samples from pan-metastatic sites, including four genes (SMARCA4, TSC2, ATRX, and AURKA) which were not identified previously. ARID2 mutations were enriched in treatment-naïve de novo and post-treatment MBC samples, compared with those in treatment-naïve PBC samples. In metastatic site-specific analyses, associations of ESR1 with liver metastasis and RICTOR with bone metastasis were significant, regardless of intrinsic subtypes. Among the 15 metastatic site sets, ESR1 mutations were enriched in the liver and depleted in the lymph nodes, whereas TP53 mutations showed an opposite trend. Seven potential MBC driver mutations showed similar preferential enrichment in specific metastatic sites. This large-scale study identified new MBC genetic alterations according to various metastatic sites and highlights their potential role in breast cancer organotropism.


Author(s):  
M. E. M. Joosen ◽  
S. J. Schop ◽  
L. L. Reinhoudt ◽  
S. M. J. van Kuijk ◽  
J. Beugels ◽  
...  

Abstract Purpose It has been hypothesized that autologous breast reconstruction can reactivate dormant micrometastases through its extensive tissue trauma, influencing the risk of breast cancer recurrence. However, little is known about the specific effect of timing on breast cancer recurrence in deep inferior epigastric perforator (DIEP) flap reconstruction. In this study, the rates of local, regional and distant recurrence were compared between patients undergoing immediate and delayed autologous DIEP flap breast reconstruction. Methods In this retrospective cohort study, breast cancer patients undergoing a DIEP flap breast reconstruction between 2010 and 2018 in three hospitals in the Netherlands were evaluated. Cox proportional hazards regression analyses were performed to assess the impact of different factors on breast cancer recurrence. The primary endpoint was local breast cancer recurrence. Secondary endpoints were regional and distant recurrence. Results A total of 919 DIEP flap reconstructions were performed in 862 women, of which 347 were immediate and 572 were delayed. After a median follow-up of 46 months and 86 months, respectively (p < 0.001), local breast cancer recurrence occurred in 1.5% and 1.7% of the patients, resulting in an adjusted hazard ratio of 2.890 (p = 0.001; 95% CI 1.536, 5.437). Conclusion This study suggests an increased risk for breast cancer recurrence in women receiving a delayed DIEP flap reconstruction as compared to women receiving an immediate DIEP flap reconstruction. However, these data should be interpreted carefully because of selection bias.
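A Wald-type confidence interval for a hazard ratio from a Cox model is symmetric on the log scale, so the point estimate should equal the geometric mean of the two bounds. The sketch below applies that check to the interval reported above, taking the bounds as 1.536 and 5.437 (an assumed reading of the printed upper bound). The helper name is hypothetical, and the standard error it returns holds only if the interval was built as estimate ± 1.96 standard errors on the log scale, which the abstract does not state.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the implied point estimate and log-scale standard error
    from a confidence interval that is symmetric on the log scale."""
    log_mid = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.exp(log_mid), se


# Bounds as read from the abstract (upper bound is an assumed reading).
hr_implied, se_log_hr = log_scale_summary(1.536, 5.437)
print(f"implied HR = {hr_implied:.3f}, SE(log HR) = {se_log_hr:.3f}")
```

Under these assumptions the implied estimate agrees with the reported hazard ratio of 2.890 to three decimal places.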


2020 ◽  
Vol 18 (2) ◽  
Author(s):  
Azhani Chik

Introduction: Breast cancer is the commonest malignancy in Malaysian women. Cancer recurrence is detrimental to survival, with the peak of recurrence recorded in the first 2 years after diagnosis. Identifying the prognostic factors for recurrence is important for management and for prolonging survival. Materials and method: We retrospectively analyzed 179 women with breast cancer from a 5-year single-centre database, with a minimum follow-up of 2 years. The demographic and clinicopathological characteristics were determined using descriptive statistics. Survival was calculated using the Kaplan-Meier method, and multivariate analysis by Cox proportional hazards was performed to evaluate the potential factors affecting breast cancer recurrence. Results: Mean follow-up was 42 months, with a mean age of 52 years, and 60.9% presented with Stage II disease. Overall recurrence was 41.9%, with local recurrence 2.1%, regional recurrence 12.3% and distant recurrence 27.4%. 50% of our patients developed recurrence at 25 months. In univariate analysis, time to first presentation was significantly correlated with recurrence. However, in multivariate analysis, tumor size, lymph node positivity and lymphovascular invasion were independently associated with recurrence. Conclusion: Even though local data on breast cancer recurrence are sparse, they correlate with the international data and can thus help optimize our care in breast cancer.

