Phenylalanylphenylalanine as a Diagnostic Biomarker for Lung Cancer and Tuberculosis.

Author(s):  
Siyu Chen ◽  
Chunyan Li ◽  
Zhonghua Qin ◽  
Lili Song ◽  
Shiyuan Zhang ◽  
...  

Abstract Background: Worldwide, lung cancer has the highest mortality rate, and pulmonary tuberculosis has a high incidence in China; both may be frequently misdiagnosed because of similar clinical presentations and atypical imaging findings. Metabolomics can detect diagnostic biomarkers that distinguish lung cancer from other pulmonary diseases, helping to avoid non-essential treatment. Methods: This cohort study employed non-targeted and targeted metabolomic analysis in participants enrolled from three independent centers. Multivariate statistics, the variable importance in the projection parameter, and receiver operating characteristic (ROC) curves were used to build a model of potential key diagnostic biomarkers of lung cancer, which were subsequently analyzed using targeted metabolomics in a test set. Quantitative analysis of differences in biomarker levels was conducted, and a support vector machine (SVM) classifier was used to determine the prediction rate of the diagnostic biomarker model. Results: Phenylalanylphenylalanine showed opposite trends in lung cancer and tuberculosis. The areas under the curve of 0.8887 (95% CI 0.8064–0.9710, p<0.001, sensitivity 85.45%, specificity 84%) and 0.8149 (95% CI 0.7419–0.8878, p<0.001, sensitivity 73.26%, specificity 78.43%), together with the SVM results (prediction rate 77.94%), showed the feasibility of using phenylalanylphenylalanine as a diagnostic marker for the differential diagnosis of lung cancer and tuberculosis. Conclusions: Changes in the levels of phenylalanylphenylalanine facilitate differential diagnosis between lung cancer and tuberculosis, thereby potentially reducing the damage caused by misdiagnosis in the clinical setting and enabling early treatment of lung cancer patients. Trial registration: This study is registered in the China Clinical Trial Registration Center (registration number ChiCTR2000040666, registered 07 December 2020, http://www.chictr.org.cn/index.aspx)
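
As an illustration of the quantities reported in this abstract, the sketch below computes an area under the ROC curve together with sensitivity and specificity at a single cut-off. It is a minimal example on made-up scores, not the study's data or pipeline; the function names and the 0.5 threshold are assumptions chosen for the example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Call a sample positive when its score is at or above the threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical marker scores: 1 = disease of interest, 0 = comparison group.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The pairwise definition used here is equivalent to the area under the empirical ROC curve; it is quadratic in the number of samples, which is acceptable for cohorts of this size.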

2019 ◽  
Vol 45 (10) ◽  
pp. 3193-3201 ◽  
Author(s):  
Yajuan Li ◽  
Xialing Huang ◽  
Yuwei Xia ◽  
Liling Long

Abstract Purpose To explore the value of CT-enhanced quantitative features combined with machine learning for differential diagnosis of renal chromophobe cell carcinoma (chRCC) and renal oncocytoma (RO). Methods Sixty-one cases of renal tumors (chRCC = 44; RO = 17) that were pathologically confirmed at our hospital between 2008 and 2018 were retrospectively analyzed. All patients had undergone preoperative enhanced CT scans including the corticomedullary (CMP), nephrographic (NP), and excretory phases (EP) of contrast enhancement. Volumes of interest (VOIs), including lesions on the images, were manually delineated using the RadCloud platform. A LASSO regression algorithm was used to screen the image features extracted from all VOIs. Five machine learning classifiers were trained to distinguish chRCC from RO using a fivefold cross-validation strategy. Classifier performance was mainly evaluated by the area under the receiver operating characteristic (ROC) curve and by accuracy. Results In total, 1029 features were extracted from CMP, NP, and EP. The LASSO regression algorithm screened out the four, four, and six best features, respectively, and eight features were selected when CMP and NP were combined. All five classifiers had good diagnostic performance, with area under the curve (AUC) values greater than 0.850; the support vector machine (SVM) classifier performed best, with a diagnostic accuracy of 0.945 (AUC 0.964 ± 0.054; sensitivity 0.999; specificity 0.800). Conclusions Accurate preoperative differential diagnosis of chRCC and RO can be facilitated by a combination of CT-enhanced quantitative features and machine learning.


Online discussion forums and blogs are vibrant platforms where cancer patients share their views in the form of stories. These stories can become a source of inspiration for anxious patients searching for similar cases. This paper proposes a method that uses natural language processing and machine learning to analyze unstructured text gathered from patients' reviews and stories. The proposed methodology aims to identify the behavior, emotions, side effects, decisions and demographics associated with cancer patients. The pre-processing phase involves extracting web text, cleaning it by omitting some special characters and symbols, and finally tagging it with the NLTK (Natural Language Toolkit) POS (part-of-speech) tagger. The post-processing phase trains seven machine learning classifiers (refer to Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the Support Vector Machine (SVM) classifier has the highest area under the receiver operating characteristic curve (AUC, 0.98).


Lung cancer is caused by abnormal cells that start in one or both lungs, usually in the cells that line the air passages. Metastasis refers to cancer spreading beyond its site of origin to other parts of the body. When cancer spreads, it is much harder to treat successfully. Primary lung cancer starts in the lungs, while secondary lung cancer starts elsewhere in the body, metastasizes, and reaches the lungs. In current medical diagnosis, treatment, and surgery, medical imaging plays one of the most important roles, since imaging devices such as Computerised Tomography (CT), Magnetic Resonance Imaging (MRI), and ultrasound diagnostics yield a great deal of information about diseases and organs. However, radiologists have to analyze and evaluate many medical images thoroughly in a short time, which is a heavy burden. To ease this burden, computer technology has been used increasingly in recent years to analyze medical images. The proposed method, which is found to be accurate for tumor detection, uses the Gray Level Co-occurrence Matrix (GLCM). The Support Vector Machine (SVM) classifier classifies a given input image as normal or diseased and, if it is diseased, further classifies the tumor images as benign (non-cancerous) or malignant (cancerous).
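
The GLCM mentioned in this abstract can be sketched in a few lines. The example below is a minimal pure-Python illustration, assuming a tiny 4-level image, a horizontal offset of one pixel, and a symmetric matrix; the abstract does not state the offsets, gray-level quantization, or texture features the authors used, so those choices and the function names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Count how often gray level i occurs next to gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1
                if symmetric:
                    # Also count the pair in the opposite direction.
                    matrix[j][i] += 1
    return matrix


def contrast(matrix):
    """Texture contrast: squared gray-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Toy 4x4 image with gray levels 0..3 (illustrative only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

g = glcm(image, levels=4)
for row in g:
    print(row)
print("contrast:", round(contrast(g), 4))
```

Features such as contrast, computed from one or more such matrices, would then form the input vector for a classifier like the SVM described above.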


2021 ◽  
Vol 10 ◽  
Author(s):  
Hao Yu ◽  
Ka-On Lam ◽  
Huanmei Wu ◽  
Michael Green ◽  
Weili Wang ◽  
...  

Background Radiation-induced lung fibrosis (RILF) is an important late toxicity in patients with non-small-cell lung cancer (NSCLC) after radiotherapy (RT). Clinically significant RILF can impact quality of life and/or cause non-cancer-related death. This study aimed to determine whether pre-treatment plasma cytokine levels have a significant effect on the risk of RILF and to investigate the ability of machine learning algorithms to predict that risk. Methods This is a secondary analysis of prospective studies from two academic cancer centers. The primary endpoint was grade ≥2 RILF (RILF2), classified according to a system consistent with the consensus recommendation of an expert panel of the AAPM task for normal tissue toxicity. Eligible patients had to have at least 6 months' follow-up after the start of radiotherapy. Baseline levels of 30 cytokines, together with dosimetric and clinical characteristics, were analyzed. A support vector machine (SVM) algorithm was applied for model development. Data from one center were used for model training and development, and data from the other center served as independent external validation. Results There were 57 and 37 eligible patients in the training and validation datasets, with 14 and 16.2% RILF2, respectively. Of the 30 plasma cytokines evaluated, SVM identified baseline circulating CCL4 as the most significant cytokine associated with RILF2 risk in both datasets (P = 0.003 and 0.07 for the training and test sets, respectively). An SVM classifier predictive of RILF2 was generated in Cohort 1 with CCL4, mean lung dose (MLD) and chemotherapy as key model features. This classifier was validated in Cohort 2 with an accuracy of 0.757 and an area under the curve (AUC) of 0.855. Conclusions Using machine learning, this study constructed and validated a weighted-SVM classifier incorporating circulating CCL4 levels with significant dosimetric and clinical parameters, which predicts RILF2 risk with reasonable accuracy.
Further study with larger sample size is needed to validate the role of CCL4, and this SVM classifier in RILF2.


2021 ◽  
Author(s):  
Eric Adua ◽  
Emmanuel Awuni Kolog ◽  
Ebenezer Afrifa-Yamoah ◽  
Bright Amankwah ◽  
Christian Obirikorang ◽  
...  

Abstract Background Accurate prediction and early recognition of type II diabetes (T2DM) will lead to timely and meaningful interventions, while preventing T2DM-associated complications. In this context, machine learning (ML) is promising, as it can transform vast amounts of T2DM data into clinically relevant information. This study compares multiple ML techniques for predictive modelling based on different T2DM-associated variables in an African population, Ghana. Methods The study involves 219 T2DM patients and 219 healthy individuals who were recruited from the hospital and the local community, respectively. Anthropometric and biochemical information including glycated haemoglobin (HbA1c), body mass index (BMI), blood pressure, fasting blood sugar (FBS), and serum lipids [total cholesterol (TC), triglycerides (TG), high- and low-density lipoprotein cholesterol (HDL-c and LDL-c)] were collected. From this data, four ML classification algorithms, Naïve-Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machines (SVM) and Decision Tree (DT), were used to predict T2DM. Precision, Recall, F1-Scores, Receiver Operating Characteristics (ROC) scores and the confusion matrix were computed to determine the performance of the various algorithms, while the importance of the feature attributes was determined by the recursive feature elimination technique. Results All the classifiers performed beyond the acceptable threshold of 70% for Precision, Recall, F-score and Accuracy. After building the predictive model, 82% of diabetic test data was detected by the NB classifier, of which 93% were accurately predicted. The SVM classifier was the second-best performing classifier, which yielded an overall accuracy of 84%. The non-T2DM test data yielded an accurate prediction score of 75% from the 98% of the proportion of the non-T2DM test data. KNN and DT yielded accuracies of 83% and 81%, respectively.
NB has the best performance (AUC = 0.87) followed by SVM (AUC = 0.84), KNN (AUC = 0.85) and DT (AUC = 0.81). The three most important feature attributes, in order of importance, are HbA1c, TC and BMI, whereas the three least important features are Age, HDL-c and LDL-c. Conclusion Based on the predictive performance and high accuracy, the study has shown the potential of ML as a robust forecasting tool for T2DM. Our results can be a benchmark for guiding policy decisions in T2DM surveillance in countries with limited resources and medical expertise, such as Ghana.
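
The evaluation measures named in this abstract all derive from the confusion matrix. The sketch below shows that relationship on a small set of invented labels; it is not the study's data, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and accuracy computed from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Invented labels: 1 = T2DM, 0 = non-T2DM.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

Reporting several of these measures side by side, as the abstract does, guards against a classifier that looks accurate only because one class dominates.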


2017 ◽  
Vol 35 (15_suppl) ◽  
pp. 11566-11566
Author(s):  
Monica Khunger ◽  
Mehdi Alilou ◽  
Rajat Thawani ◽  
Anant Madabhushi ◽  
Vamsidhar Velcheti

11566 Background: Immune-checkpoint blockade treatments, particularly drugs targeting the programmed death-1 (PD-1) receptor, demonstrate promising clinical efficacy in patients with non-small cell lung cancer (NSCLC). We sought to evaluate whether computer-extracted measurements of the tortuosity of vessels in lung nodules on baseline CT scans in NSCLC patients (pts) treated with a PD-1 inhibitor, nivolumab, could distinguish responders from non-responders. Methods: A total of 61 NSCLC pts who underwent treatment with nivolumab were included in this study. Pts who did not receive nivolumab after 2 cycles due to lack of response or progression per RECIST were classified as ‘non-responders’; pts who had radiological response per RECIST or had clinical benefit (defined as stable disease >10 cycles) were classified as ‘responders’. A total of 35 quantitative tortuosity features of the vessels associated with the lung nodule were investigated. In the training cohort (N=33), the features were ranked by their ability to identify responders to nivolumab using a support vector machine (SVM) classifier. The three most informative features were then used for training the SVM, which was then validated on a cohort of N=28 pts. Results: The maximum curvature (f1), standard deviation of the torsion (f2) and mean curvature (f3) were identified as the most discriminating features. The area under the receiver operating characteristic (ROC) curve (AUC) of the SVM was 0.84 for the training cohort and 0.72 for the validation cohort. Conclusions: Vessel tortuosity features were able to distinguish responders from non-responders among patients with NSCLC treated with nivolumab. Large-scale multi-site validation will be needed to establish vessel tortuosity as a predictive biomarker for immunotherapy. [Table: see text]


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Laurentius Oscar Osapoetra ◽  
Archya Dasgupta ◽  
Daniel DiCenzo ◽  
Kashuf Fatima ◽  
Karina Quiaoit ◽  
...  

Abstract To investigate the role of quantitative ultrasound (QUS) radiomics in predicting treatment response in patients with head and neck squamous cell carcinoma (HNSCC) treated with radical radiotherapy (RT). Five spectral parameters, 20 texture features, and 80 texture-derivative features were extracted from the index lymph node before treatment. Response was assessed initially at 3 months, with complete responders labelled as early responders (ER). Patients with residual disease were followed to classify them as either late responders (LR) or patients with persistent/progressive disease (PD). Machine learning classifiers with leave-one-out cross-validation were used to develop a binary response-prediction radiomics model. A total of 59 patients were included in the study (22 ER, 29 LR, and 8 PD). A support vector machine (SVM) classifier gave the best performance, with accuracy and area under the curve (AUC) of 92% and 0.91, respectively, for defining the response at 3 months (ER vs. LR/PD). The 2-year recurrence-free survival for predicted ER, LR, and PD using an SVM model was 91%, 78%, and 27%, respectively (p < 0.01). Pretreatment QUS radiomics using texture derivatives in HNSCC can predict the response to RT with an accuracy of more than 90%, with a strong influence on survival. Clinical trial registration: clinicaltrials.gov.in identifier NCT03908684.
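
Leave-one-out cross-validation, as named in this abstract, holds out each patient in turn and trains on the rest. The sketch below illustrates the loop in pure Python. To stay self-contained it swaps in a nearest-centroid classifier in place of the SVM the authors used, and the two-feature data points are invented; both are assumptions for illustration only.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (not an SVM): assign the label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # sorted() makes tie-breaking deterministic.
    return min(sorted(centroids), key=lambda lab: squared_distance(centroids[lab], sample))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the remainder, and score the held-out prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Invented two-feature samples: 0 = early responder, 1 = late responder / persistent disease.
xs = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
ys = [0, 0, 0, 1, 1, 1]

print("LOOCV accuracy:", leave_one_out_accuracy(xs, ys))
```

With only 59 patients, as here, leave-one-out uses nearly all the data for each training round, which is the usual reason for choosing it over a single train/test split.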


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Eric Adua ◽  
Emmanuel Awuni Kolog ◽  
Ebenezer Afrifa-Yamoah ◽  
Bright Amankwah ◽  
Christian Obirikorang ◽  
...  

Abstract Background Accurate prediction and early recognition of type II diabetes (T2DM) will lead to timely and meaningful interventions, while preventing T2DM associated complications. In this context, machine learning (ML) is promising, as it can transform vast amount of T2DM data into clinically relevant information. This study compares multiple ML techniques for predictive modelling based on different T2DM associated variables in an African population, Ghana. Methods The study involved 219 T2DM patients and 219 healthy individuals who were recruited from the hospital and the local community, respectively. Anthropometric and biochemical information including glycated haemoglobin (HbA1c), body mass index (BMI), blood pressure, fasting blood sugar (FBS), serum lipids [(total cholesterol (TC), triglycerides (TG), high and low-density lipoprotein cholesterol (HDL-c and LDL-c)] were collected. From this data, four ML classification algorithms including Naïve-Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machines (SVM) and Decision Tree (DT) were used to predict T2DM. Precision, Recall, F1-Scores, Receiver Operating Characteristics (ROC) scores and the confusion matrix were computed to determine the performance of the various algorithms while the importance of the feature attributes was determined by recursive feature elimination technique. Results All the classifiers performed beyond the acceptable threshold of 70% for Precision, Recall, F-score and Accuracy. After building the predictive model, 82% of diabetic test data was detected by the NB classifier, of which 93% were accurately predicted. The SVM classifier was the second-best performing classifier which yielded an overall accuracy of 84%. The non-T2DM test data yielded an accurate prediction score of 75% from the 98% of the proportion of the non-T2DM test data. KNN and DT yielded accuracies of 83% and 81%, respectively. 
NB had the best performance (AUC = 0.87) followed by SVM (AUC = 0.84), KNN (AUC = 0.85) and DT (AUC = 0.81). The three most important feature attributes, in order of importance, were HbA1c, TC and BMI, whereas the three least important features were Age, HDL-c and LDL-c. Conclusion Based on the predictive performance and high accuracy, the study has shown the potential of ML as a robust forecasting tool for T2DM. Our results can be a benchmark for guiding policy decisions in T2DM surveillance in countries with limited resources and medical expertise, such as Ghana.


2020 ◽  
Vol 10 (8) ◽  
pp. 1841-1850
Author(s):  
A. Kodieswari ◽  
D. Deepa

Lung cancer is among the most common malignant tumours and a leading cause of morbidity and mortality, posing a significant threat to people's health and life. Advanced lung cancer is likely to produce corresponding symptoms, causing patients great pain and threatening their lives. Computed Tomography (CT) is an efficient way to investigate the possibility of distant metastasis, for which a computer-aided diagnosis system is desirable. To diagnose distant metastasis of lung cancer, the previous framework used Support Vector Machine (SVM) based classification with manual segmentation and achieved a classification precision of 89.09%. For large data sets, manual segmentation is time-consuming, tedious and labour-intensive, so it is hard to make feasible. The desired prediction accuracy is attained through an efficient segmentation method and feature selection methods. The proposed system uses an Improved Markov Random Field (MRF) with SVM-based classification, thereby improving system performance. In this work, the input is CT lung images, and adaptive median filtering is used for preprocessing. The Improved MRF approach segments the pre-processed image, after which clinical and radiomic features are extracted from the CT images. The Anarchic Society Optimization (ASO) algorithm helps choose the optimal features, which in turn improves classification accuracy. Based on the selected features, an SVM classifier is used for cancer classification. The test results show that the proposed framework achieves improved performance compared with existing frameworks on measures such as precision, accuracy and recall.

