Breast Cancer Prediction Using Classification Techniques of Machine Learning

Angela More

doi:10.22214/ijraset.2022.39743

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2022.39743 ◽

2022 ◽

Vol 10 (1) ◽

pp. 51-57

Author(s):

Angela More

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier ◽

Abstract Data

Data analytics plays a vital role in diagnosis and treatment in the health care sector. To support practitioner decision-making, large volumes of data should be processed with machine learning techniques to produce tools for prediction and classification. About 1 million cases of breast cancer are reported per year. We propose a prediction model designed specifically for breast cancer, using the machine learning algorithms decision tree classifier, Naïve Bayes, SVM, and K-Nearest Neighbour. The model predicts the type of tumour, which can be benign (noncancerous) or malignant (cancerous). It uses supervised learning, a machine learning setting in which both the dependent and the independent columns are provided to the machine, and a classification technique to predict the tumour type. Keywords: Cancer, Machine learning, Prediction, Data Visualization, SVM, Naïve Bayes, Classification.
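The supervised setup described in this abstract, with independent feature columns and a dependent label column, can be sketched with a from-scratch k-nearest-neighbour classifier. The two feature columns, their values, and the choice of k below are illustrative assumptions; they are not data or settings taken from the paper.

```python
import math
from collections import Counter

# Hypothetical independent columns: (mean_radius, mean_texture).
# These values are made up for illustration and are not from the paper.
TRAIN_X = [(12.0, 14.0), (11.5, 15.5), (13.0, 13.0),
           (20.0, 25.0), (19.0, 27.0), (21.5, 24.0)]
# Dependent column: the tumour type the model should learn to predict.
TRAIN_Y = ["benign", "benign", "benign",
           "malignant", "malignant", "malignant"]


def knn_predict(x, train_x, train_y, k=3):
    """Return the majority label among the k training rows closest to x."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda row: math.dist(x, row[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict((12.5, 14.5), TRAIN_X, TRAIN_Y))
    print(knn_predict((20.5, 26.0), TRAIN_X, TRAIN_Y))
```

The same train-then-predict shape would apply to the decision tree, Naïve Bayes, and SVM classifiers the abstract lists; only the rule that maps features to a label changes.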

Sentiment Analysis using various Machine Learning and Deep Learning Techniques

Journal of the Nigerian Society of Physical Sciences ◽

10.46481/jnsps.2021.308 ◽

2021 ◽

pp. 385-394

Author(s):

V Umarani ◽

A Julian ◽

J Deepa

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Analysis Process ◽

Learning Techniques

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedback, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyse text patterns faster. Supervised machine learning is the most widely used mechanism for sentiment analysis. The proposed work discusses the flow of the sentiment analysis process and investigates common supervised machine learning techniques such as multinomial naive Bayes, Bernoulli naive Bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, and decision tree, as well as deep learning techniques such as Long Short-Term Memory and Convolutional Neural Network. The work examines these learning methods on a standard data set. The experimental results compare the classifiers in terms of precision, recall, F1-score, ROC curve, accuracy, running time, and k-fold cross-validation. They help in appreciating the novelty of several deep learning techniques and give the user an overview for choosing the right technique for their application.
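Several of the evaluation measures named in this abstract (precision, recall, F1-score, accuracy) follow directly from the counts of true and false positives and negatives. The function below is a minimal sketch for the binary case; the function name and the `"pos"`/`"neg"` label strings are assumptions chosen for illustration and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive="pos"):
    """Compute precision, recall, F1 and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    truth = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
    guess = ["pos", "pos", "neg", "pos", "neg", "neg", "neg", "neg"]
    print(binary_metrics(truth, guess))
```

Reporting precision and recall next to accuracy matters when the sentiment classes are imbalanced, because accuracy alone can look high for a classifier that rarely predicts the minority class.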

Heart Disease Prediction using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f9780.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 700-704

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Nearest Neighbour ◽

Decision Tree Classifier ◽

Support Vector Classifier ◽

Learning Techniques ◽

Tree Classifier

This work derives methodologies to detect heart problems at an earlier stage and to inform patients so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence of heart disease at an earlier stage. We use certain parameters such as age, sex, height, weight, case history, smoking, and alcohol consumption, together with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO, for prediction. Many machine learning algorithms can be used to solve this problem, including K-Nearest Neighbour, support vector classifier, decision tree classifier, logistic regression, and random forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend how the patient can improve his or her health.

Successful Case Study of Machine Learning Application to Streamline and Improve History Matching Process for Complex Gas-Condensate Reservoirs in Hai Thach Field, Offshore Vietnam

10.2118/204835-ms ◽

2021 ◽

Author(s):

Son Hoang ◽

Tung Tran ◽

Tan Nguyen ◽

Tu Truong ◽

Duy Pham ◽

...

Keyword(s):

Machine Learning ◽

Decision Tree ◽

History Matching ◽

Dynamic Models ◽

Naive Bayes ◽

Naïve Bayes ◽

Gas Condensate ◽

Decision Tree Classifier ◽

Matching Process ◽

Tree Classifier

This paper reports a successful case study of applying machine learning to improve the history matching process, making it easier, less time-consuming, and more accurate, by determining whether Local Grid Refinement (LGR) with transmissibility multiplier is needed to history match gas-condensate wells producing from geologically complex reservoirs as well as determining the required LGR setup to history match those gas-condensate producers. History matching Hai Thach gas-condensate production wells is extremely challenging due to the combined effect of condensate banking, sub-seismic fault network, complex reservoir distribution and connectivity, uncertain HIIP, and lack of PVT data for most reservoirs. In fact, for some wells, many trial simulation runs were conducted before it became clear that LGR with transmissibility multiplier was required to obtain good history matching. In order to minimize this time-consuming trial-and-error process, machine learning was applied in this study to analyze production data using synthetic samples generated by a very large number of compositional sector models so that the need for LGR could be identified before the history matching process begins. Furthermore, machine learning application could also determine the required LGR setup. The method helped provide better models in a much shorter time, and greatly improved the efficiency and reliability of the dynamic modeling process. More than 500 synthetic samples were generated using compositional sector models and divided into separate training and test sets. Multiple classification algorithms such as logistic regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, multinomial Naive Bayes, linear discriminant analysis, support vector machine, K-nearest neighbors, and Decision Tree as well as artificial neural networks were applied to predict whether LGR was used in the sector models.
The best algorithm was found to be the Decision Tree classifier, with 100% accuracy on the training set and 99% accuracy on the test set. The LGR setup (size of LGR area and range of transmissibility multiplier) was also predicted best by the Decision Tree classifier with 91% accuracy on the training set and 88% accuracy on the test set. The machine learning model was validated using actual production data and the dynamic models of history-matched wells. Finally, using the machine learning prediction on wells with poor history matching results, their dynamic models were updated and significantly improved.

Improved argumentative paragraphs detection in academic theses supported with unit segmentation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219237 ◽

2021 ◽

pp. 1-11

Author(s):

Jesús Miguel García-Gorrostieta ◽

Aurelio López-López ◽

Samuel González-López ◽

Adrián Pastor López-Monroy

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Automatic Detection ◽

Machine Learning Techniques ◽

Svm Classifier ◽

Complex Task ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier ◽

Academic Author

Academic thesis writing is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal, which combines the information in the complete paragraph with the detection of argumentative segments in order to achieve improved results for the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses the decision tree classifier with indicators and lexical features, and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does indeed contain segments with argumentation. We achieved encouraging results for both approaches.

Mitigating Webshell Attacks through Machine Learning Techniques

Future Internet ◽

10.3390/fi12010012 ◽

2020 ◽

Vol 12 (1) ◽

pp. 12 ◽

Cited By ~ 3

Author(s):

You Guo ◽

Hector Marco-Gisbert ◽

Paul Keir

Keyword(s):

Machine Learning ◽

Feature Matching ◽

Naive Bayes ◽

Web Server ◽

Naïve Bayes ◽

Malicious Code ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Detection Methods ◽

Detection Model

A webshell is a command execution environment in the form of web pages. It is often used by attackers as a backdoor tool for web server operations. Accurately detecting webshells is of great significance to web server protection. Most security products detect webshells based on feature-matching methods—matching input scripts against pre-built malicious code collections. The feature-matching method has a low detection rate for obfuscated webshells. However, with the help of machine learning algorithms, webshells can be detected more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (naïve Bayes and opcode sequence) model, which is a combination of naïve Bayes classifiers and opcode sequences. Through experiments and analysis on a large number of samples, the experimental results show that the proposed method could effectively detect a range of webshells. Compared with the traditional webshell detection methods, this method improves the efficiency and accuracy of webshell detection.
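The idea of pairing naïve Bayes with opcode sequences can be illustrated with a small multinomial naïve Bayes classifier over opcode tokens. This is a bag-of-opcodes sketch under stated assumptions, not the NB-Opcode model from the paper: the training samples, the opcode token names, the `"benign"`/`"webshell"` labels, and the use of Laplace smoothing are all chosen here for illustration.

```python
import math
from collections import Counter

# Hypothetical opcode token lists with labels; not samples from the paper.
SAMPLES = [
    (["ECHO", "ASSIGN", "RETURN"], "benign"),
    (["ASSIGN", "ECHO", "ECHO", "RETURN"], "benign"),
    (["FETCH_R", "INCLUDE_OR_EVAL", "RETURN"], "webshell"),
    (["FETCH_R", "DO_FCALL", "INCLUDE_OR_EVAL"], "webshell"),
]


def train_nb(samples):
    """Count opcode occurrences per class and the number of scripts per class."""
    token_counts, class_counts, vocab = {}, Counter(), set()
    for opcodes, label in samples:
        class_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(opcodes)
        vocab.update(opcodes)
    return token_counts, class_counts, vocab


def predict_nb(model, opcodes):
    """Return the class with the highest log-posterior for the opcode list."""
    token_counts, class_counts, vocab = model
    total_scripts = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_scripts in class_counts.items():
        counts = token_counts[label]
        denom = sum(counts.values()) + len(vocab)  # Laplace (add-one) smoothing
        score = math.log(n_scripts / total_scripts)  # class prior
        for op in opcodes:
            if op in vocab:  # opcodes never seen in training are ignored
                score += math.log((counts[op] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train_nb(SAMPLES)
    print(predict_nb(model, ["FETCH_R", "INCLUDE_OR_EVAL"]))
```

Working on opcodes rather than source text is what lets a classifier of this kind look past surface obfuscation, since differently written scripts can compile to similar opcode patterns.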

Sentiment Analysis of Tweets on the COVID-19 Pandemic Using Machine Learning Techniques

Handbook of Research on Innovations and Applications of AI, IoT, and Cognitive Technologies - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-6870-5.ch021 ◽

2021 ◽

pp. 310-320

Author(s):

Jothikumar R. ◽

Vijay Anand R. ◽

Visu P. ◽

Kumar R. ◽

Susi S. ◽

...

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Respiratory Tract ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Respiratory Tract Diseases ◽

Thought Processes ◽

Learning Techniques

Sentiment analysis refers to extracting sentiments from natural language and recognizing the attitude toward a specific topic. The novel coronavirus, a harmful disease, is spreading rapidly across the world and causes respiratory tract diseases that range from mild to severe. Because it spreads quickly and has no known cure, it has brought about a feeling of stress and anxiety. In this chapter, a machine learning based approach is used to discover the sentiments of tweets related to COVID and its effect, the lockdown. The chapter examines tweets identified by hashtags related to coronavirus and lockdown. The tweets were labelled positive, negative, or neutral, and a list of classifiers was used to investigate accuracy and performance. The classifiers used fall under four models: decision tree, regression, support vector machine, and naïve Bayes.

Peningkatan Performa Pendeteksian Anomali Menggunakan Ensemble Learning dan Feature Selection (Improving Anomaly Detection Performance Using Ensemble Learning and Feature Selection)

Creative Information Technology Journal ◽

10.24076/citec.2020v7i1.238 ◽

2021 ◽

Vol 7 (1) ◽

pp. 1

Author(s):

Ripto Sudiyarno ◽

Arief Setyanto ◽

Emha Taufiq Luthfi

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Ensemble Learning ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Detection Systems ◽

Learning Techniques ◽

Performance Results

Intrusion detection systems (IDS) are known as a very prominent and leading technique for finding malicious activities on computer networks. Unlike conventional firewalls, IDS identify attacks intelligently with analytic approaches such as data mining and machine learning techniques. In the last few decades, ensemble learning has greatly advanced research in machine learning and pattern classification, and it has shown improved performance compared to a single classifier. In this study, an attempt was made to increase the accuracy of an anomaly detection system. Classification was first performed using a single classifier to obtain an accuracy value, which was then compared with the results of ensemble learning and feature selection. The use of ensemble learning aims to obtain better accuracy than the single classifier. The results are obtained from the confusion matrix and are tested by comparing the values of the two methods above. The research obtained an accuracy of 77.4% for the single classifier (naïve Bayes) and 96.8% for ensemble learning. Keywords: ensemble learning, NSL-KDD, naïve Bayes, anomaly, feature selection
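The comparison between a single classifier and an ensemble can be illustrated with majority voting, one common way of combining classifiers. The abstract does not state which ensemble method was used, so the voting scheme, the three hypothetical classifiers, and their predictions below are assumptions for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for five network records; not results from the paper.
Y_TRUE = ["normal", "attack", "attack", "normal", "attack"]
CLF_A = ["normal", "attack", "normal", "normal", "attack"]
CLF_B = ["normal", "normal", "attack", "normal", "attack"]
CLF_C = ["attack", "attack", "attack", "normal", "attack"]

if __name__ == "__main__":
    combined = majority_vote([CLF_A, CLF_B, CLF_C])
    for name, pred in [("A", CLF_A), ("B", CLF_B), ("C", CLF_C),
                       ("ensemble", combined)]:
        print(name, accuracy(Y_TRUE, pred))
```

In this toy case each classifier errs on a different record, so the vote corrects every individual mistake. The gain depends on the classifiers making different errors; classifiers that fail on the same records would not improve by voting.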

Network Malware Detection using Soft Computing and Machine Learning Techniques

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1654.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 879-885

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Anomaly Detection ◽

Soft Computing ◽

Naive Bayes ◽

Malware Detection ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Network Anomaly Detection

In today’s world there is a rapid increase in information, which makes addressing security issues more important. Malware detection is an important area of research for the effective and secure functioning of computer networks. Research efforts are required to protect systems from various security attacks. In this paper, we analyze the usefulness of Soft Computing and Machine Learning techniques for network malware detection. Hamamoto et al. [1] used a combination of Genetic Algorithm and Fuzzy logic to implement network anomaly detection. The research work proposed in this paper extends the concepts discussed in [1]. The proposed work explores the use of various Machine Learning algorithms such as K-Nearest Neighbor, Naïve Bayes, and Decision Tree for network anomaly detection. The experiments are conducted on the CIDDS (Coburg Intrusion Detection Data Set) dataset [14]. It is observed that the Decision Tree approach gave better results than the KNN and Naïve Bayes techniques. The Decision Tree technique gives an accuracy of 99%, a precision of 1, and a recall of 1.

A Comparative Analysis to Visualize the Behavior of Different Machine Learning Algorithms for Normalized and Un-Normalized Data in Predicting Alzheimer’s Disease

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8259 ◽

2019 ◽

Vol 16 (9) ◽

pp. 3840-3848

Author(s):

Neeraj Kumar ◽

Jatinder Manhas ◽

Vinod Sharma

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Neurodegenerative Disorder ◽

Learning Algorithms ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Linear Discriminant ◽

Age Related

Advancement in technology has helped people to live longer and better lives. But the increased life expectancy has also elevated the risk of age-related disorders, especially neurodegenerative disorders. Alzheimer’s is one such neurodegenerative disorder, and it is also the leading contributor to dementia in elderly people. Despite extensive research in this field, scientists have not found a cure for the disease to date. This makes early diagnosis of Alzheimer’s crucial in order to delay its progression and improve the condition of the patient. Various techniques are being employed for diagnosing Alzheimer’s, including neuropsychological tests, medical imaging, and blood-based biomarkers. Apart from these, various machine learning algorithms have been employed to diagnose Alzheimer’s in its early stages. In the current research, the authors compared the performance of various machine learning techniques, namely Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Naïve Bayes (NB), Support Vector Machines (SVM), Decision Trees (DT), Random Forests (RF), and Multi Layer Perceptron (MLP), on an Alzheimer’s dataset. The paper experimentally demonstrated that normalization plays a predominant role in enhancing the efficiency of some machine learning algorithms. It therefore becomes imperative to choose the algorithms according to the available data. The efficiency of the given machine learning methods was compared in terms of accuracy and F1-score. Naïve Bayes gave the better overall performance for both accuracy and F1-score, and, along with LDA, DT, and RF, it remained unaffected by normalization of the data. KNN, SVM, and MLP, in contrast, showed a drastic improvement in performance (17% to 86%) when given normalized rather than un-normalized data from the Alzheimer’s dataset.
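One reason a distance-based method such as KNN can be sensitive to normalization is that a feature on a large numeric scale dominates the Euclidean distance. The sketch below shows this with min-max scaling. The two feature columns (a volume-like measure in the thousands and a test score below 30) and all values are hypothetical and are not taken from the paper's dataset.

```python
import math


def fit_min_max(rows):
    """Learn the per-column minimum and range from the training rows."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return lows, spans


def scale(row, lows, spans):
    """Map one row to the [0, 1] range learned by fit_min_max."""
    return tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))


def nearest_index(query, rows):
    """Index of the row with the smallest Euclidean distance to query."""
    return min(range(len(rows)), key=lambda i: math.dist(query, rows[i]))


# Hypothetical rows: (volume-like measure, test score). Values are made up.
ROWS = [(1400.0, 29.0), (1500.0, 12.0), (1000.0, 20.0)]
QUERY = (1480.0, 28.0)

if __name__ == "__main__":
    lows, spans = fit_min_max(ROWS)
    scaled_rows = [scale(r, lows, spans) for r in ROWS]
    scaled_query = scale(QUERY, lows, spans)
    print("un-normalized nearest:", nearest_index(QUERY, ROWS))
    print("normalized nearest:", nearest_index(scaled_query, scaled_rows))
```

Before scaling, the nearest row is the one with a similar first column even though its score differs greatly; after scaling, both columns contribute comparably and the nearest row changes. Tree-based methods split on one feature at a time, which is consistent with the abstract's observation that DT and RF were unaffected.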

Doc2Vec & Naïve Bayes: Learners’ Cognitive Presence Assessment through Asynchronous Online Discussion TQ Transcripts

International Journal of Emerging Technologies in Learning (iJET) ◽

10.3991/ijet.v14i08.9964 ◽

2019 ◽

Vol 14 (08) ◽

pp. 70 ◽

Cited By ~ 3

Author(s):

Hind Hayati ◽

Abdessamad Chanaa ◽

Mohammed Khalidi Idrissi ◽

Samir Bennani

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Naïve Bayes ◽

Online Discussions ◽

Cognitive Presence ◽

Machine Learning Techniques ◽

Face To Face ◽

Learning Techniques ◽

Bayes Algorithm ◽

Context Features

Due to the lack of face-to-face interaction in online learning environments, this article aims to give tutors the opportunity to understand and analyze learners’ cognitive behavior. To this end, we propose an automatic system to assess learners’ cognitive presence based on their social interactions within asynchronous online discussions. The system combines Natural Language Preprocessing, the Doc2Vec document embedding method, and machine learning techniques. We first apply some transformations and preprocessing to the given transcripts, then apply the Doc2Vec method to represent each message as a vector, which is concatenated with LIWC and context features. The resulting vectors are the input to the Naïve Bayes algorithm, a machine learning method that classifies the transcripts according to cognitive presence categories.
