Research on Logistic Regression Algorithm of Breast Cancer Diagnose Data by Machine Learning

Author(s):  
Lei Liu
2021 ◽  
Vol 2083 (3) ◽  
pp. 032059
Author(s):  
Qiang Chen ◽  
Meiling Deng

Abstract: Regression algorithms are commonly used in machine learning. Building on encryption and privacy protection methods, this paper studies the regression algorithm, a current key technology, together with the same encryption technology, and proposes a PPLAR-based algorithm. The correlation between data items is obtained from the logistic regression formula. The algorithm is distributed and parallelized on the Hadoop platform to improve the computing speed of the cluster while maintaining the algorithm's average absolute error.


Author(s):  
Charles M. Pérez-Espinoza ◽  
Nuvia Beltran-Robayo ◽  
Teresa Samaniego-Cobos ◽  
Abel Alarcón-Salvatierra ◽  
Ana Rodriguez-Mendez ◽  
...  

2018 ◽  
Vol 7 (4.20) ◽  
pp. 22 ◽  
Author(s):  
Jabeen Sultana ◽  
Abdul Khader Jilani ◽  
. .

Early identification and prediction of cancer type has become a necessity in cancer research, in order to assist and monitor patients. The importance of classifying cancer patients into high- or low-risk groups has led many research teams from the biomedical and bioinformatics fields to study the application of machine learning (ML) approaches. Logistic regression and multi-classifier methods have been proposed to predict breast cancer and to produce deep predictions in a new environment on breast cancer data. This paper explores the classification-based data mining approaches that can be applied to breast cancer data to build deep predictions. In addition, the study identifies the best-performing model by evaluating the dataset on various classifiers. The breast cancer dataset, collected from the UCI Machine Learning Repository, has 569 instances with 31 attributes. The dataset is first pre-processed and then fed to various classifiers: Simple Logistic regression, IBK, K-star, Multi-Layer Perceptron (MLP), Random Forest, Decision Table, Decision Trees (DT), PART, Multi-Class Classifiers and REP Tree. 10-fold cross-validation is applied, and training is performed so that new models are developed and tested. The results are evaluated on accuracy, RMSE, sensitivity, specificity, F-measure, ROC curve area, Kappa statistic and the time taken to build the model. Result analysis reveals that, among all the classifiers, Simple Logistic Regression yields the deep predictions and the best model, with high and accurate results, followed by IBK (nearest-neighbor classifier), K-Star (instance-based classifier) and MLP (neural network). The other methods obtained lower accuracy than logistic regression.
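The evaluation protocol described above (train a logistic regression model and score it with 10-fold cross-validation) can be sketched in plain Python. This is only an illustrative sketch, not the study's own toolchain: the helper names (`fit_logistic`, `k_fold_indices`, `cross_val_accuracy`, `make_toy_data`) are hypothetical, the model is fitted by simple batch gradient descent, and the data is a synthetic two-cluster toy set rather than the UCI breast cancer data.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Predict class 1 when the estimated probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(X, y, k=10, seed=0):
    """Return one held-out accuracy per fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        w, b = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(w, b, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores

def make_toy_data(n=100, seed=1):
    # Hypothetical synthetic data: two well-separated 2-D Gaussian clusters.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 2.0 if label else -2.0
        X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    scores = cross_val_accuracy(X, y, k=10)
    print("mean accuracy over 10 folds:", sum(scores) / len(scores))
```

Each sample is held out exactly once, so the mean of the fold accuracies estimates performance on unseen data; the same loop could score any of the other classifiers listed above.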


Scientific knowledge and electronic devices are growing day by day. In this context, many expert systems in the healthcare industry use machine learning algorithms. Deep neural networks outperform conventional machine learning techniques and often take raw, unrefined data to calculate the target output. Deep learning, or feature learning, is used to focus on the features that are most important, and it gives a complete understanding of the generated model. The existing methodology used a data mining technique, a rule-based classification algorithm, and a machine learning algorithm, a hybrid logistic regression algorithm, to preprocess data and extract meaningful insights from it. That approach, however, relies on supervised data. The proposed work is based on unsupervised data, that is, data without labels, and deep neural techniques are deployed to obtain the target output. Machine learning algorithms are compared in terms of accuracy with the proposed deep learning techniques, implemented using TensorFlow and Keras. The deep learning methodology outperforms the existing rule-based classification and hybrid logistic regression algorithms in terms of accuracy. The designed methodology is tested on the public MIT-BIH arrhythmia database, classifying four kinds of abnormal beats. The proposed deep learning approach offered better performance, improving on the results of state-of-the-art machine learning approaches.


2020 ◽  
Vol 38 (29_suppl) ◽  
pp. 276-276
Author(s):  
Tyler J. O'Neill ◽  
Vishakha Sharma ◽  
Athanasios Siadimas ◽  
Amir Babaeian ◽  
Gayathri Yerrapragada

Background: Adherence to tamoxifen among women diagnosed with hormone receptor positive metastatic breast cancer (mBC) can improve survival and minimize recurrence. Screening for non-adherence at treatment initiation may support personalized care, improve health outcomes, and minimize cost of care. This study aimed to use real world data (RWD) and machine learning (ML) methods to classify tamoxifen non-adherence. Methods: A cohort of women diagnosed with incident mBC from 2012 to 2018 was identified from the Truven MarketScan Commercial Claims and Encounters and Medicare supplemental administrative claims databases. Patients with < 80% proportion of days covered (PDC) in the year following treatment initiation were classified as non-adherent. Training and internal validation cohorts were randomly generated (4:1 ratio). Clinical procedures, comorbidity, treatment and healthcare encounter features in the year prior to treatment initiation were used to train logistic regression, boosted logistic regression, random forest, and feed-forward neural network models, which were internally validated based on area under the receiver operating characteristic (AUROC) curve. The most predictive ML approach was evaluated to assess feature importance. Results: A total of 3,022 patients were included, with 39.9% classified as non-adherent. All ML models had moderate predictive accuracy. Logistic regression (AUROC 0.64) was easily interpreted with sensitivity 94% (95% confidence interval [CI]: 0.89, 0.92) and specificity 0.31 (95% CI: 0.29, 0.33). The model accurately classified adherence (negative predictive value 88.7%) but did not discriminate non-adherence (positive predictive value 47.7%). Variable importance identified top predictive factors, including patient features (≥55 years old) and pre-treatment procedures (lymphatic nuclear medicine, radiation oncology, arterial surgery). Conclusions: ML using baseline administrative data predicts tamoxifen adherence. Baseline claims may not be sufficient to predict treatment non-adherence. Further validation with enriched longitudinal data may improve model performance for incorporation of predictions into clinical decision support.
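The adherence rule and the reported validation metrics follow standard definitions, which the short sketch below illustrates. It makes simplifying assumptions: the function names are hypothetical, `covered_days` is taken as an already-computed count of days with medication on hand (real claims-based PDC has to reconcile overlapping fills), and the confusion counts in the usage lines are invented for illustration and are not the study's data.

```python
def proportion_of_days_covered(covered_days, period_days=365):
    """PDC = days with medication on hand / days in the observation period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    # Cap at the period length so that PDC never exceeds 1.0.
    return min(covered_days, period_days) / period_days

def is_non_adherent(covered_days, period_days=365, threshold=0.80):
    # The abstract classifies patients with PDC < 80% as non-adherent.
    return proportion_of_days_covered(covered_days, period_days) < threshold

def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion counts.

    The "positive" class here is non-adherence.
    """
    return {
        "sensitivity": tp / (tp + fn),  # non-adherent patients correctly flagged
        "specificity": tn / (tn + fp),  # adherent patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who are non-adherent
        "npv": tn / (tn + fn),          # cleared patients who are adherent
    }

# Invented counts, for illustration only.
example = diagnostic_metrics(tp=90, fp=70, tn=30, fn=10)
```

A model with high sensitivity but low specificity, as in this invented example, flags most non-adherent patients at the cost of many false alarms, which is why the positive predictive value stays modest.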


Cancer has been portrayed as a heterogeneous disease comprising a wide range of subtypes. Early diagnosis of the cancer type is very important in determining the course of medical treatment required by the patient. The significance of classifying tumours as benign or malignant has driven many research studies in the biomedical and bioinformatics fields. In recent years researchers have been encouraged to use different machine learning (ML) techniques for cancer detection, as well as for prediction of survivability and recurrence. Moreover, ML tools can be used to identify key features in complex datasets and reveal their significance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Random Forest Methods (RVMs) and Decision Trees (DTs), have been widely used in cancer research for the development of predictive models, resulting in effective and accurate decision making. Although it is clear that the use of machine learning techniques can improve our understanding of cancer detection, progression, recurrence and survivability, an appropriate level of accuracy is required before these methods can be considered in everyday clinical practice. The predictive models discussed here are based on different supervised ML techniques and on various input features and data samples. We have used the Naïve Bayes classifier, neural networks, decision trees and the logistic regression algorithm to detect the type of breast cancer (benign or malignant) and to select the features that are most relevant for prediction. We have made a comparative study to find the best of these four algorithms for predicting cancer type. With a high level of accuracy, any of these methods can be used to predict the type of breast cancer of any particular patient.


Author(s):  
Abdul Karim ◽  
Azhari Azhari ◽  
Samir Brahim Belhaouri ◽  
Ali Adil Qureshi

It is quite evident that almost everybody around the world uses Android apps. Half of the population of this planet is associated with messaging, social media, gaming, and browsers. This online marketplace provides free and paid access to users. On the Google Play store, users are encouraged to download countless applications belonging to predefined categories. In this research paper, we have scraped thousands of user reviews and app ratings. We have scraped the reviews of 148 apps from 14 categories. We have collected 506,259 reviews from the Google Play store and subsequently checked the semantics of users' reviews of some applications to determine whether the reviews are positive, negative, or neutral. We have evaluated the results using different machine learning algorithms: Naïve Bayes, Random Forest, and Logistic Regression. We have calculated Term Frequency (TF) and Inverse Document Frequency (IDF) with different parameters such as accuracy, precision, recall, and F1, and compared the statistical results of these algorithms. We have visualized these statistical results in the form of bar charts. In this paper, each algorithm is analysed one by one, and the results are compared. We found that Logistic Regression is the best algorithm for review analysis across the Google Play store: it leads in precision, accuracy, recall, and F1 both after preprocessing and after data collection of this dataset.
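As a rough illustration of the TF and IDF quantities mentioned above, the sketch below computes TF-IDF weights for a few tokenised reviews. It assumes one common textbook variant (TF as relative term frequency, IDF as the natural logarithm of N/df, with no smoothing); the abstract does not state which variant was used, and the function name `tf_idf` and the sample reviews are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document.

    TF is the term count divided by the document length; IDF is
    log(N / df), where df is the number of documents containing the term.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term once per document.
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical reviews, lower-cased and split on whitespace.
reviews = [
    "great app love it".split(),
    "terrible app crashes often".split(),
    "love the new update".split(),
]
scores = tf_idf(reviews)
```

Terms that appear in many reviews, such as "app" here, receive a lower weight than terms that appear in only one review, such as "great"; the resulting weight vectors are the kind of features a classifier like logistic regression can be trained on.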


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Shouyun Lv ◽  
Shizong Li ◽  
Zhiwei Yu ◽  
Kaiqiong Wang ◽  
Xin Qiao ◽  
...  

To conduct better research in hepatocellular carcinoma resection, this paper used 3D machine learning and a logistic regression algorithm to study the preoperative assistance of patients undergoing hepatectomy. In this study, the logistic regression model was analyzed to find the influencing factors for the survival and recurrence of patients. The clinical data of 50 HCC patients who underwent extensive hepatectomy (≥4 segments of the liver) and were admitted to our hospital from June 2020 to December 2020 were selected to calculate the liver volume, simulated surgical resection volume, residual liver volume, surgical margin, etc. The results showed that the simulated liver volume of the 50 patients was 845.2 ± 285.5 mL and the actual liver volume was 826.3 ± 268.1 mL, with no significant difference between the two groups (t = 0.425; P > 0.05). Compared with the logistic regression model, the machine learning method has a better prediction effect, but the logistic regression model has better interpretability. The analysis of the relationship between the liver tumour and hepatic vessels in practical problems has specific clinical application value for accurately evaluating the volume of liver resection and the surgical margin.


Author(s):  
SUNDARAMBAL BALARAMAN

Classification algorithms are very widely used for studying various categories of data located in multiple databases that have real-world implementations. The main purpose of this research work is to assess the efficiency of classification algorithms in breast cancer analysis. The mortality rate of women is increasing due to frequent cases of breast cancer. The conventional method of diagnosing breast cancer is time consuming, and hence research is being carried out in multiple dimensions to address this issue. In this research work, Google Colab, an excellent environment for Python coders, is used as a tool to implement machine learning algorithms for predicting the type of cancer. The performance of the machine learning algorithms is analyzed based on the accuracy obtained from various classification models: logistic regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Naïve Bayes, Decision Tree and Random Forest. Experiments show that these classifiers work well for the classification of breast cancers, with accuracy > 90%, and that logistic regression ranked first with an accuracy of 98.5%. Implementation using Google Colab also made the task much easier, without the hours previously spent installing the environment and supporting libraries.

