Predicting discontinuation of docetaxel treatment for metastatic castration-resistant prostate cancer (mCRPC) with random forest

Prostate cancer is the most common cancer among men in developed countries. Androgen deprivation therapy (ADT) is the standard treatment for prostate cancer. However, approximately one third of all patients with metastatic disease treated with ADT develop resistance to it. This condition is called metastatic castration-resistant prostate cancer (mCRPC). Patients who do not respond to hormone therapy are often treated with a chemotherapy drug called docetaxel. Sub-challenge 2 of the Prostate Cancer DREAM Challenge aims to improve the prediction of whether a patient with mCRPC would discontinue docetaxel treatment due to adverse effects. Specifically, a dataset containing three distinct clinical studies of patients with mCRPC treated with docetaxel was provided. We applied the k-nearest neighbor method for missing data imputation, the hill climbing algorithm and random forest importance for feature selection, and the random forest algorithm for classification. We also empirically studied the performance of many classification algorithms, including support vector machines and neural networks. Additionally, we found that using random forest importance for feature selection provided slightly better results than the more computationally expensive method of hill climbing.
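The k-nearest neighbor imputation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_impute`, the distance convention (Euclidean over the columns both rows have observed) and the toy `patients` table are all assumptions made for illustration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Assumption: distance is Euclidean over the columns observed in both rows.
    This is one common convention, not necessarily the one used in the study.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common columns: treat as maximally distant
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]  # copy so the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Hypothetical toy data: three features per patient, one value missing.
patients = [
    [1.0, 10.0, 100.0],
    [1.1, None, 110.0],
    [5.0, 50.0, 500.0],
    [1.2, 12.0, 120.0],
]
filled = knn_impute(patients, k=2)
```

Here the missing value is filled from the two rows closest to the incomplete one, so the distant third row does not influence the result.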

An Improved Coronary Heart Disease Predictive System Using Random Forest

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v11i130253 ◽

2021 ◽

pp. 17-27

Author(s):

Abdulraheem Abdul ◽

Rafiu M. Isiaka ◽

Ronke S. Babatunde ◽

Jumoke F. Ajao

Keyword(s):

Neural Network ◽

Coronary Heart Disease ◽

Feature Selection ◽

Heart Disease ◽

Random Forest ◽

Cross Validation ◽

Nearest Neighbor ◽

Support Vector ◽

Disease Prediction ◽

K Nearest Neighbor

Aims: This work aims to develop an enhanced predictive system for Coronary Heart Disease (CHD). Study Design: Synthetic Minority Oversampling Technique and Random Forest. Methodology: The Framingham heart disease dataset, collected from a study in Framingham, Massachusetts, was used; the data were cleaned, normalized, and rebalanced. Classifiers such as random forest, artificial neural network, naïve Bayes, logistic regression, k-nearest neighbor and support vector machine were used for classification. Results: Random Forest outperformed the other classifiers with an accuracy of 98%, a sensitivity of 99% and a precision of 95.8%. Feature selection was employed for better classification, but it produced no significant improvement in classifier performance. A train-test split also performed better than cross-validation. Conclusion: Random Forest is recommended for research in the Coronary Heart Disease prediction domain.
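The rebalancing idea behind the Synthetic Minority Oversampling Technique can be sketched as follows: each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbors. This is a simplified illustration, not the study's pipeline; the name `smote_like`, its parameters and the toy `minority` points are hypothetical.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolation.

    Assumptions: features are numeric, the minority class has more than one
    sample, and neighbours are ranked by squared Euclidean distance.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical minority-class samples with two features each.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.0]]
new_points = smote_like(minority, n_new=5, k=2, seed=0)
```

Because every new point is a convex combination of two existing minority samples, the synthetic data stay inside the region the minority class already occupies.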

Faculty Opinions recommendation of Prednisone plus cabazitaxel or mitoxantrone for metastatic castration-resistant prostate cancer progressing after docetaxel treatment: a randomised open-label trial.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.5686956.10860058 ◽

2011 ◽

Author(s):

Cora Sternberg ◽

Linda Cerbone

Keyword(s):

Prostate Cancer ◽

Castration Resistant Prostate Cancer ◽

Open Label ◽

Castration Resistant ◽

Docetaxel Treatment ◽

Open Label Trial

Docetaxel Treatment in Castration-Resistant Prostate Cancer: the Triad Gene-Drug-Disease

The Journal of Clinical Endocrinology & Metabolism ◽

10.1210/jc.2010-2875 ◽

2011 ◽

Vol 96 (2) ◽

pp. 351-353 ◽

Cited By ~ 1

Author(s):

Fabio Rueda Faucz

Keyword(s):

Prostate Cancer ◽

Castration Resistant Prostate Cancer ◽

Castration Resistant ◽

Docetaxel Treatment

Prostate Cancer Classification Using Random Forest and Support Vector Machines

Journal of Physics Conference Series ◽

10.1088/1742-6596/1752/1/012043 ◽

2021 ◽

Vol 1752 (1) ◽

pp. 012043

Author(s):

Z Rustam ◽

N Angie

Keyword(s):

Prostate Cancer ◽

Support Vector Machines ◽

Random Forest ◽

Cancer Classification ◽

Support Vector ◽

Vector Machines ◽

Prostate Cancer Classification

Quantifying the Influence of Achievement Emotions for Student Learning in MOOCs

Journal of Educational Computing Research ◽

10.1177/0735633120967318 ◽

2020 ◽

pp. 073563312096731

Author(s):

Bowen Liu ◽

Wanli Xing ◽

Yifang Zeng ◽

Yonghe Wu

Keyword(s):

Random Forest ◽

Nearest Neighbor ◽

Online Courses ◽

Learning Performance ◽

Support Vector ◽

K Nearest Neighbor ◽

Achievement Emotions ◽

Integrative Framework ◽

Emotional Interaction ◽

Performance Results

Massive Open Online Courses (MOOCs) have become a popular tool for worldwide learners. However, a lack of emotional interaction and support is an important reason for learners to abandon their learning and eventually results in poor learning performance. This study applied an integrative framework of achievement emotions to uncover their holistic influence on students’ learning by analyzing more than 400,000 forum posts from 13 MOOCs. Six machine-learning models were first built to automatically identify achievement emotions, including K-Nearest Neighbor, Logistic Regression, Naïve Bayes, Decision Tree, Random Forest, and Support Vector Machines. Results showed that Random Forest performed the best with a kappa of 0.83 and an ROC_AUC of 0.97. Then, multilevel modeling with the “Stepwise Build-up” strategy was used to quantify the effect of achievement emotions on students’ academic performance. Results showed that different achievement emotions influenced students’ learning differently. These findings allow MOOC platforms and instructors to provide relevant emotional feedback to students automatically or manually, thereby improving their learning in MOOCs.
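The kappa statistic reported above measures agreement between two labelings after correcting for agreement expected by chance. A minimal sketch of Cohen's kappa follows; the emotion labels and the `human` and `model` lists are hypothetical and do not come from the study's data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected) / (1 - expected).

    Undefined (division by zero) when chance agreement is exactly 1,
    i.e. when both raters assign one and the same label to every item.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical annotations for eight forum posts.
human = ["joy", "joy", "joy", "anxiety", "anxiety", "anxiety", "joy", "anxiety"]
model = ["joy", "joy", "anxiety", "anxiety", "anxiety", "anxiety", "joy", "joy"]
kappa = cohens_kappa(human, model)
```

In this toy case the raters agree on 6 of 8 posts (0.75) while chance agreement is 0.5, so kappa is 0.5, which is lower than the raw agreement rate suggests.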

Pharmacogenomic Biomarkers in Docetaxel Treatment of Prostate Cancer: From Discovery to Implementation

Genes ◽

10.3390/genes10080599 ◽

2019 ◽

Vol 10 (8) ◽

pp. 599 ◽

Cited By ~ 1

Author(s):

Reka Varnai ◽

Leena M. Koskinen ◽

Laura E. Mäntylä ◽

Istvan Szabo ◽

Liesel M. FitzGerald ◽

...

Keyword(s):

Prostate Cancer ◽

Treatment Guidelines ◽

Individual Variability ◽

Castration Resistant Prostate Cancer ◽

Predictive Capacity ◽

Current State ◽

Docetaxel Treatment ◽

High Heritability ◽

Treatment Procedures ◽

Genetic Signatures

Prostate cancer is the fifth leading cause of male cancer death worldwide. Although docetaxel chemotherapy has been used for more than fifteen years to treat metastatic castration resistant prostate cancer, the high inter-individual variability of treatment efficacy and toxicity is still not well understood. Since prostate cancer has a high heritability, inherited biomarkers of the genomic signature may be appropriate tools to guide treatment. In this review, we provide an extensive overview and discuss the current state of the art of pharmacogenomic biomarkers modulating docetaxel treatment of prostate cancer. This includes (1) research studies with a focus on germline genomic biomarkers, (2) clinical trials including a range of genetic signatures, and (3) their implementation in treatment guidelines. Based on this work, we suggest that one of the most promising approaches to improve clinical predictive capacity of pharmacogenomic biomarkers in docetaxel treatment of prostate cancer is the use of compound, multigene pharmacogenomic panels defined by specific clinical outcome measures. In conclusion, we discuss the challenges of integrating prostate cancer pharmacogenomic biomarkers into the clinic and the strategies that can be employed to allow a more comprehensive, evidence-based approach to facilitate their clinical integration. Expanding the integration of pharmacogenetic markers in prostate cancer treatment procedures will enhance precision medicine and ultimately improve patient outcomes.

Systematic Framework to Predict Early-Stage Liver Carcinoma Using Hybrid of Feature Selection Techniques and Regression Techniques

Complexity ◽

10.1155/2022/7816200 ◽

2022 ◽

Vol 2022 ◽

pp. 1-11

Author(s):

Marium Mehmood ◽

Nasser Alshammari ◽

Saad Awadh Alanazi ◽

Fahad Ahmad

Keyword(s):

Feature Selection ◽

Random Forest ◽

Liver Diseases ◽

Early Stage ◽

Support Vector ◽

Liver Carcinoma ◽

Random Forest Regression ◽

Soft Computing Techniques ◽

Regression Algorithms ◽

Regression Techniques

The liver is a vital organ of the human body, but detecting liver disease at an early stage is very difficult because the symptoms are hidden. Liver diseases may cause loss of energy or weakness once irregularities in liver function become apparent. Cancer is one of the most common diseases of the liver and also the most fatal. It develops through uncontrolled growth of harmful cells inside the liver and, if diagnosed late, may cause death. Treating liver disease at an early stage is therefore important, as is designing a model to diagnose it early. First, the features that play the most significant part in detecting liver cancer at an early stage should be identified, which requires extracting a few essential features from thousands of unwanted ones. These features are mined using data mining and soft computing techniques, which give optimized results that are helpful for early diagnosis. We use feature selection methods, namely Filter, Wrapper, and Embedded methods, to reduce the number of features in the dataset. Different regression algorithms are then applied to the output of each method individually to evaluate the result. The regression algorithms are Linear Regression, Ridge Regression, LASSO Regression, Support Vector Regression, Decision Tree Regression, Multilayer Perceptron Regression, and Random Forest Regression. We evaluated our results based on the accuracy and error rates produced by these regression algorithms. The results show that Random Forest Regression with the Wrapper method performs best among the deployed regression techniques, giving the highest R2 score of 0.8923 and the lowest MSE of 0.0618.
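For reference, the two regression metrics quoted in this abstract can be computed as in the sketch below. The functions use the standard definitions of mean squared error and the coefficient of determination; the `y_true` and `y_pred` values are made-up illustration data, not results from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when y_true is constant, because SS_tot is then zero.
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical targets and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
error = mse(y_true, y_pred)
fit = r2_score(y_true, y_pred)
```

A higher R2 score and a lower MSE both indicate a closer fit, which is why the abstract reports the highest of one and the lowest of the other for the best model.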

Scrutinizing Attacks and Evaluating Performance Appraisal Parameters via Feature Selection in Intrusion Detection System

10.21203/rs.3.rs-748765/v1 ◽

2021 ◽

Author(s):

Navroop Kaur ◽

Meenakshi Bansal ◽

Sukhwinder Singh S

Keyword(s):

Feature Selection ◽

Performance Evaluation ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Denial Of Service ◽

Cyber Attacks ◽

Support Vector ◽

K Nearest Neighbor ◽

Evaluation Parameters

In modern times, firewalls and antivirus packages are not sufficient to protect an organization from numerous cyber attacks. An Intrusion Detection System (IDS) is a crucial component that contributes to the success of an organization. An IDS is a software application responsible for scanning an organization's networks for suspicious activity and policy violations. It ensures the secure and reliable functioning of the network within an organization. IDSs have undergone huge transformations since their origin to cope with advancing computer crime. The primary aim of an IDS has been to improve attack detection without endangering the performance of the network. This paper elaborates on the different types of IDS and the functions they perform. The NSL-KDD dataset was used for training and testing. Seven prominent classifiers, LR (Logistic Regression), NB (Naïve Bayes), DT (Decision Tree), AB (AdaBoost), RF (Random Forest), kNN (k Nearest Neighbor), and SVM (Support Vector Machine), were studied along with their pros and cons, and feature selection was applied to improve the performance evaluation parameters (Accuracy, Precision, Recall, and F1 Score). The paper presents a detailed flowchart and algorithm for performing feature selection using XGB (Extreme Gradient Booster) for four categories of attack: DoS (Denial of Service), Probe, R2L (Remote to Local Attack), and U2R (User to Root Attack). The selected features were ranked by how often they occurred. The implementation was run at five different ratios of 60-40%, 70-30%, 90-10%, 50-50%, and 80-20%. Different classifiers scored best on different performance evaluation parameters at different ratios. NB scored the best Accuracy and Recall values. DT and RF consistently performed with high accuracy. NB, SVM, and kNN achieved good F1 Scores.
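The four evaluation parameters named in this abstract all derive from the confusion matrix of a binary classifier. The sketch below shows the standard definitions; the function name `classification_metrics` and the toy label vectors (1 = attack, 0 = normal) are assumptions for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators by reporting 0.0 in that case.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical ground truth and predictions for ten connections.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = classification_metrics(y_true, y_pred)
```

Because the four numbers weigh false positives and false negatives differently, it is plausible for different classifiers to rank first on different metrics, as the abstract reports.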

Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39088 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1-10

Author(s):

Harsha A K

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Steady Increase ◽

Extreme Gradient Boosting

Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional malware detection approaches, such as packet content analysis, are inefficient in dealing with encrypted data. In the absence of actual packet contents, we can make use of other features such as packet size, arrival time, source and destination addresses, and similar metadata to detect malware. Such information can be used to train machine learning classifiers to distinguish malicious from benign packets. In this paper, we offer an efficient malware detection approach using machine learning classification algorithms such as support vector machine, random forest and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set in order to assess their respective performances. We further attempt to tune the hyperparameters of the algorithms in order to achieve better results. Random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, resulting in area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
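The area under the ROC curve reported here can be read as the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that pairwise definition; the function name `roc_auc` and the toy `labels` and `scores` are hypothetical, and the quadratic loop is meant for illustration rather than large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Requires at least one positive (1) and one negative (0) label.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores: 1 = malicious, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy case 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9; a value near 1, as in the abstract, means almost every such pair is ranked correctly.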

Metastatic Castration-Resistant Prostate Cancer: Academic Insights and Perspectives Analysis

10.21203/rs.2.10359/v1 ◽

2019 ◽

Author(s):

Lugeng He ◽

Hui Fang ◽

Chao Chen ◽

Yanqi Wu ◽

Yuyong Wang ◽

...

Keyword(s):

Prostate Cancer ◽

Developed Countries ◽

The United States ◽

Vital Role ◽

Cancer Center ◽

Castration Resistant Prostate Cancer ◽

Research Areas ◽

Castration Resistant ◽

The USA ◽

Future Direction

Background: In recent years, metastatic castration-resistant prostate cancer (MCRPC) and studies related to MCRPC have drawn global attention. The main objective of this bibliometric study was to provide an overview of MCRPC, explore clusters and trends in research and investigate the future direction of MCRPC research. Methods: A total of 4,089 publications published between 1979 and 2018 were retrieved from the Web of Science (WoS) Core Collection database. Different aspects of MCRPC research, including the countries/territories, institutions, journals, authors, research areas, funding agencies and author keywords, were analyzed. Results: The number of annual MCRPC publications increased rapidly after 2010. American researchers played a vital role in this increase, as they published the most publications. The most productive institution was Memorial Sloan Kettering Cancer Center. De Bono, JS (the United Kingdom [UK]) and Scher, HI (the United States of America [USA]) were the two most productive authors. The National Institutes of Health (NIH) funded the largest number of published papers. Analyses of keywords suggested that therapies (abiraterone, enzalutamide, etc.) attracted global attention after US Food and Drug Administration (FDA) approval. Conclusions: Developed countries, especially the USA, were the leading nations for MCRPC research because of their abundant funding and frequent international collaborations. Therapy was one of the most vital aspects of MCRPC research. Therapies targeting DNA repair or the androgen receptor (AR) signaling pathway and new therapies, especially prostate-specific membrane antigen (PSMA)-based radioligand therapy (RLT), would be the next focus of MCRPC research.
