Construction of Bioactivity Prediction Models for Breast Cancer Candidate Drugs

2021 ◽  
Vol 10 (12) ◽  
pp. 4454-4468
Author(s):  
浦 徐
2020 ◽  
Vol 26 (33) ◽  
pp. 4195-4205
Author(s):  
Xiaoyu Ding ◽  
Chen Cui ◽  
Dingyan Wang ◽  
Jihui Zhao ◽  
Mingyue Zheng ◽  
...  

Background: Enhancing a compound's biological activity is the central task of lead optimization in small-molecule drug discovery. However, performing many iterative rounds of compound synthesis and bioactivity testing is laborious. To address this, high-quality in silico bioactivity prediction approaches are needed to prioritize the more active compound derivatives and reduce trial and error. Methods: Two kinds of bioactivity prediction models were constructed from a large-scale structure-activity relationship (SAR) database. The first kind is based on substituent similarity and is implemented by matched molecular pair analysis; it includes SA, SA_BR, SR, and SR_BR. The second kind is based on SAR transferability and is implemented by matched molecular series (MMS) analysis; it includes Single MMS pair, Full MMS series, and Multi single MMS pairs. We also defined the applicability domain of the models using a distance-based threshold. Results: Among the seven individual models, the Multi single MMS pairs model performed best (R² = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) had the lowest prediction accuracy (R² = 0.798, MAE = 0.446, RMSE = 0.637). Predictive accuracy was further improved by consensus modeling (R² = 0.842, MAE = 0.397, RMSE = 0.563). Conclusion: An accurate bioactivity prediction model was built with a consensus method that outperformed all individual models. The model should be a valuable tool for lead optimization.
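The three reported metrics and the idea of consensus modeling can be illustrated with a minimal Python sketch. It assumes that the reported R value is the coefficient of determination and that the consensus is an unweighted mean of the individual model predictions; the abstract states neither, and the function names and activity values below are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R2, MAE, RMSE) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination (assumed definition)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse

def consensus_prediction(model_predictions):
    """Average several models' predictions compound by compound (assumed: unweighted mean)."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]

# Hypothetical activity values for four compounds and two individual models.
observed = [6.0, 7.0, 8.0, 9.0]
model_a = [6.5, 7.5, 7.5, 8.5]
model_b = [5.5, 6.5, 8.5, 9.5]

consensus = consensus_prediction([model_a, model_b])
metrics_a = regression_metrics(observed, model_a)
metrics_consensus = regression_metrics(observed, consensus)
```

In this toy case the two models err in opposite directions, so averaging cancels their errors entirely. Real gains from consensus modeling are usually far smaller, as the reported figures suggest.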


2019 ◽  
Vol 17 (6) ◽  
pp. 1519-1530 ◽  
Author(s):  
Yao Luo ◽  
Ranran Zeng ◽  
Qingqing Guo ◽  
Jianrong Xu ◽  
Xiaoou Sun ◽  
...  

G03 is a novel anticancer agent with unusual microtubule-stabilizing effects.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chi-Ming Chu ◽  
Huan-Ming Hsu ◽  
Chi-Wen Chang ◽  
Yuan-Kuei Li ◽  
Yu-Jia Chang ◽  
...  

Abstract: Genetic co-expression network (GCN) analysis augments the understanding of breast cancer (BC). We aimed to propose GCN-based modeling for BC relapse-free survival (RFS) prediction and to discover novel biomarkers. We used GCN and Cox proportional hazards regression to build prediction models from mRNA microarray data of 920 tumors and performed external validation on independent data from 1056 tumors. GCNs of 34 identified candidate genes were plotted in various sizes. Compared with the reference model, genetic predictors selected from larger GCNs yielded better prediction models. The prediction accuracy and AUC for 3- to 15-year RFS were 71.0–81.4% and 74.6–78%, respectively (rfm, ACC 63.2–65.5%, AUC 61.9–74.9%). The hazard ratios of the risk scores for developing relapse ranged from 1.89 to 3.32 (p < 10⁻⁸) across all models after controlling for node status. External validation showed consistent findings. The top 12 co-expressed genes are relatively new or novel biomarkers that had not been explored in BC prognosis or other cancers until this decade. GCN-based modeling creates better prediction models and facilitates the exploration of novel genes in BC prognosis.
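For readers unfamiliar with the method, the standard form of the Cox proportional hazards model is given below. The symbols are generic textbook notation, not taken from the paper: h_0(t) is the baseline hazard, x the vector of predictors (for example gene expression values and node status), and β the fitted coefficients.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Under this model, a hazard ratio of 1.89 for a risk score means that a one-unit increase in the score multiplies the hazard of relapse by 1.89, with the other covariates held fixed.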


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Li-Hsin Cheng ◽  
Te-Cheng Hsu ◽  
Che Lin

Abstract: Breast cancer is a heterogeneous disease. To guide proper treatment decisions for each patient, robust prognostic biomarkers that allow reliable prognosis prediction are necessary. Gene feature selection based on microarray data is an approach for discovering potential biomarkers systematically. However, standard, purely statistical feature selection approaches often fail to incorporate prior biological knowledge and select genes that lack biological insight. In addition, because microarray data are high-dimensional with small sample sizes, selecting robust gene features is an intrinsically challenging problem. We therefore combined systems biology feature selection with ensemble learning in this study, aiming to select genes with biological insight and robust prognostic predictive power. Moreover, to capture breast cancer's complex molecular processes, we adopted a multi-gene approach to predict prognosis status using deep learning classifiers. We found that all ensemble approaches improved feature selection robustness, with the hybrid ensemble approach giving the most robust result. Among all prognosis prediction models, the bimodal deep neural network (DNN) achieved the highest test performance, which was further verified by survival analysis. In summary, this study demonstrated the potential of combining ensemble learning and a bimodal DNN in guiding precision medicine.


Cancers ◽  
2021 ◽  
Vol 13 (14) ◽  
pp. 3533
Author(s):  
Paul Lacaze ◽  
Andrew Bakshi ◽  
Moeen Riaz ◽  
Suzanne G. Orchard ◽  
Jane Tiller ◽  
...  

Genomic risk prediction models for breast cancer (BC) have been predominantly developed with data from women aged 40–69 years. Prospective studies of older women aged ≥70 years have been limited. We assessed the effect of a 313-variant polygenic risk score (PRS) for BC in 6339 older women aged ≥70 years (mean age 75 years) enrolled into the ASPREE trial, a randomized double-blind placebo-controlled clinical trial investigating the effect of daily 100 mg aspirin on disability-free survival. We evaluated incident BC diagnoses over a median follow-up time of 4.7 years. A multivariable Cox regression model including conventional BC risk factors was applied to prospective data, and re-evaluated after adding the PRS. We also assessed the association of rare pathogenic variants (PVs) in BC susceptibility genes (BRCA1/BRCA2/PALB2/CHEK2/ATM). The PRS, as a continuous variable, was an independent predictor of incident BC (hazard ratio (HR) per standard deviation (SD) = 1.4, 95% confidence interval (CI) 1.3–1.6) and hormone receptor (ER/PR)-positive disease (HR = 1.5 (CI 1.2–1.9)). Women in the top quintile of the PRS distribution had over two-fold higher risk of BC than women in the lowest quintile (HR = 2.2 (CI 1.2–3.9)). The concordance index of the model without the PRS was 0.62 (95% CI 0.56–0.68), which improved after addition of the PRS to 0.65 (95% CI 0.59–0.71). Among 41 (0.6%) carriers of PVs in BC susceptibility genes, we observed no incident BC diagnoses. Our study demonstrates that a PRS predicts incident BC risk in women aged 70 years and older, suggesting potential clinical utility extends to this older age group.
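The concordance index reported above can be illustrated with a minimal Python sketch of Harrell's C for right-censored data. This is a simplified version that ignores tied event times, not the study's implementation; the follow-up times, event flags, and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the higher-risk subject fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical cohort: follow-up in years, event flag (1 = diagnosis), model risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risk_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported improvement from 0.62 to 0.65 in context.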


2021 ◽  
pp. 758-767
Author(s):  
Jeremy Mason ◽  
Yutao Gong ◽  
Laleh Amiri-Kordestani ◽  
Suparna Wedam ◽  
Jennifer J. Gao ◽  
...  

PURPOSE Three cyclin-dependent kinase 4/6 inhibitors (CDKIs) are approved by the US Food and Drug Administration for the treatment of patients with hormone receptor–positive, human epidermal growth factor receptor 2–negative advanced or metastatic breast cancer in combination with hormonal therapy (HT). We hypothesized that on an individual basis, efficacy outcomes and adverse event (AE) development can be predicted using baseline patient and tumor characteristics. METHODS Individual-level data from seven randomized controlled trials submitted to the US Food and Drug Administration for new or supplemental marketing applications of CDKIs were pooled. Progression-free survival (PFS), overall survival (OS), and AE prediction models were developed for specific treatment regimens (HT v HT plus CDKI). An individual's characteristics were used in all models simultaneously to create a group of predicted outcomes that are comparable across treatment settings. RESULTS Accuracy of the PFS and OS prediction models for HT were 66% and 64%, respectively, with the strongest predictors being menopausal status and therapy line. The corresponding AE prediction models resulted in an average area under the curve of 0.613. Accuracy of the PFS and OS prediction models for HT plus CDKI were 62% and 63%, respectively, with the strongest predictors being histologic grade for both. The corresponding AE prediction models resulted in an average area under the curve of 0.639. CONCLUSION This exploratory analysis demonstrated that models of efficacy outcomes and AE development can be developed using baseline patient and tumor characteristics. Comparison of paired models can inform treatment selection for individuals on the basis of the patient's personalized goals and concerns. Although use of CDKIs is standard of care in the first- or second-line setting, this model provides prognostic information that may inform individual treatment decisions.


2018 ◽  
Vol 12 (2) ◽  
pp. 119-126 ◽  
Author(s):  
Vikas Chaurasia ◽  
Saurabh Pal ◽  
BB Tiwari

Breast cancer is the second most common cancer among women. Around 1.1 million cases were recorded in 2004. Observed rates of this cancer increase with industrialization and urbanization, and also with facilities for early detection. It remains much more common in high-income countries but is now increasing rapidly in middle- and low-income countries, including within Africa, much of Asia, and Latin America. Breast cancer is fatal in under half of all cases and is the leading cause of death from cancer in women, accounting for 16% of all cancer deaths worldwide. The objective of this paper is to report on breast cancer, using available technological advancements to develop prediction models for breast cancer survivability. We used three popular data mining algorithms (Naïve Bayes, RBF Network, J48) to develop the prediction models on a large dataset (683 breast cancer cases). We also used 10-fold cross-validation to obtain an unbiased performance estimate for each of the three models for comparison. The results (based on average accuracy on the Breast Cancer dataset) indicated that Naïve Bayes was the best predictor, with 97.36% accuracy on the holdout sample (better than any prediction accuracy reported in the literature); RBF Network came second with 96.77% accuracy, and J48 came third with 93.41% accuracy.
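The 10-fold cross-validation procedure can be sketched in Python as follows. A majority-class placeholder stands in for the three algorithms named in the abstract, and the helper names and toy data are hypothetical; any classifier exposing a fit and a predict function could be plugged in.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the k accuracies."""
    fold_accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        fold_accuracies.append(correct / len(held_out))
    return sum(fold_accuracies) / k

# Placeholder classifier: always predicts the most frequent training label.
def fit_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)

def predict_majority(model, feature_row):
    return model

# Hypothetical toy data: 40 samples, 30 of class 0 and 10 of class 1.
toy_features = [[float(i)] for i in range(40)]
toy_labels = [0] * 30 + [1] * 10

mean_accuracy = cross_validated_accuracy(
    toy_features, toy_labels, fit_majority, predict_majority
)
```

Because every sample is held out exactly once, the averaged accuracy uses all of the data for testing without ever testing a model on samples it was trained on.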


2021 ◽  
Author(s):  
Naorem Leimarembi Devi ◽  
Anjali Dhall ◽  
Sumeet Patiyal ◽  
Gajendra P. S. Raghava

Triple-negative breast cancer (TNBC) is more prone to metastasis and recurrence than other breast cancer subtypes. This study aimed to identify genes that can act as diagnostic biomarkers for predicting lymph node metastasis in TNBC patients. Transcriptomic data from TNBC patients with and without lymph node metastasis were acquired from TCGA, and the differentially expressed genes were identified. A logistic regression method was then used to identify the top 15 genes (the 15-gene signature) based on their ability to predict metastasis (AUC>0.65). These 15 genes were used to develop machine learning based prediction models; the Gaussian Naive Bayes classifier outperformed the others, with AUC>0.80 on both the training and validation datasets. The best model failed drastically on nine independent microarray datasets obtained from GEO. We investigated the reason for this failure and observed that certain genes in the signature showed opposite regulation trends, i.e., a gene upregulated in TCGA-TNBC patients was downregulated in the other microarray datasets, or vice versa. In conclusion, the 15-gene signature may act as a diagnostic marker for lymph node metastatic status in the TCGA dataset, but applying it across multiple platforms remains quite challenging. We also assessed the prognostic potential of the 15 selected genes and found that overexpression of ZNRF2, FRZB, and TCEAL4 was associated with poor survival, with HR>2.3 and p-value≤0.05. To serve the scientific community, we developed a webserver named 'MTNBCPred' for predicting metastatic and non-metastatic lymph node status of TNBC patients (http://webs.iiitd.edu.in/raghava/mtnbcpred/).
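The effect of a reversed regulation trend on a single-gene predictor can be illustrated with a small Python sketch. It computes AUC as the probability that a metastatic sample scores higher than a non-metastatic one; the expression values are invented for illustration and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical expression of one gene; label 1 = lymph node metastasis.
labels = [1, 1, 1, 0, 0, 0]
cohort_a = [5.1, 4.8, 4.0, 3.9, 3.5, 3.0]   # upregulated in metastatic samples
cohort_b = [3.0, 3.5, 3.9, 4.0, 4.8, 5.1]   # same gene, direction reversed

auc_a = roc_auc(labels, cohort_a)
auc_b = roc_auc(labels, cohort_b)
```

A model that learned "high expression means metastasis" in the first cohort would rank samples in exactly the wrong order in the second, which is one way a classifier can score well on its training platform and fail on others.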


Author(s):  
Julie R. Palmer ◽  
Gary Zirpoli ◽  
Kimberly A. Bertrand ◽  
Tracy Battaglia ◽  
Leslie Bernstein ◽  
...  

PURPOSE Breast cancer risk prediction models are used to identify high-risk women for early detection, targeted interventions, and enrollment into prevention trials. We sought to develop and evaluate a risk prediction model for breast cancer in US Black women, suitable for use in primary care settings. METHODS Breast cancer relative risks and attributable risks were estimated using data from Black women in three US population-based case-control studies (3,468 breast cancer cases; 3,578 controls age 30-69 years) and combined with SEER age- and race-specific incidence rates, with incorporation of competing mortality, to develop an absolute risk model. The model was validated in prospective data among 51,798 participants of the Black Women's Health Study, including 1,515 who developed invasive breast cancer. A second risk prediction model was developed on the basis of estrogen receptor (ER)–specific relative risks and attributable risks. Model performance was assessed by calibration (expected/observed cases) and discriminatory accuracy (C-statistic). RESULTS The expected/observed ratio was 1.01 (95% CI, 0.95 to 1.07). Age-adjusted C-statistics were 0.58 (95% CI, 0.56 to 0.59) overall and 0.63 (95% CI, 0.58 to 0.68) among women younger than 40 years. These measures were almost identical in the model based on estrogen receptor–specific relative risks and attributable risks. CONCLUSION Discriminatory accuracy of the new model was similar to that of the most frequently used questionnaire-based breast cancer risk prediction models in White women, suggesting that effective risk stratification for Black women is now possible. This model may be especially valuable for risk stratification of young Black women, who are below the ages at which breast cancer screening is typically begun.
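Calibration as an expected/observed ratio can be sketched in Python as follows. The confidence interval here treats the observed count as Poisson and works on the log scale, which is a common approximation and an assumption on our part, since the abstract does not describe how its interval was derived. The risks and outcomes are hypothetical.

```python
import math

def expected_observed_ratio(predicted_risks, outcomes, z=1.96):
    """Return (E/O, lower, upper) with a log-scale Poisson interval (assumed method)."""
    expected = sum(predicted_risks)   # sum of each woman's predicted absolute risk
    observed = sum(outcomes)          # number of women who developed the disease
    if observed == 0:
        raise ValueError("no observed cases; ratio is undefined")
    ratio = expected / observed
    half_width = z / math.sqrt(observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)

# Hypothetical cohort: 100 women, each with a predicted risk of 2%, 2 observed cases.
predicted_risks = [0.02] * 100
outcomes = [1, 1] + [0] * 98

ratio, lower, upper = expected_observed_ratio(predicted_risks, outcomes)
```

A ratio near 1 indicates that the model predicts about as many cases as actually occur, while the C-statistic separately measures how well it ranks individual women by risk.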

