decision tree classifier
Recently Published Documents


TOTAL DOCUMENTS

410
(FIVE YEARS 225)

H-INDEX

22
(FIVE YEARS 5)

Author(s):  
Angela More

Abstract: Data analytics plays a vital role in diagnosis and treatment in the health care sector. To support practitioner decision-making, large volumes of data must be processed with machine learning techniques to produce tools for prediction and classification. Breast cancer accounts for 1 million reported cases per year. We propose a prediction model designed specifically for breast cancer, using four machine learning algorithms: Decision Tree classifier, Naïve Bayes, SVM, and K-Nearest Neighbour. The model predicts the type of tumour, which can be benign (noncancerous) or malignant (cancerous). It uses supervised learning, in which the machine is given both the dependent column (the label) and the independent columns (the features), and it applies a classification technique to predict the tumour type. Keywords: Cancer, Machine learning, Prediction, Data Visualization, SVM, Naïve Bayes, Classification.
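As a rough illustration of how a decision tree separates benign from malignant cases, the sketch below picks the threshold on a single feature that minimizes weighted Gini impurity. The feature name `radius` and the six toy values are hypothetical, not taken from the paper, and a full tree would repeat this search recursively over many features.

```python
def gini(labels):
    """Gini impurity for binary labels (0 = benign, 1 = malignant)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # share of malignant cases
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) with the lowest impurity for one feature."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: one independent column and the dependent column.
radius = [11.2, 12.0, 12.5, 16.8, 17.5, 19.1]
malignant = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(radius, malignant)
```

On this toy data the chosen threshold falls midway between the largest benign value and the smallest malignant value, and both sides of the split are pure.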


Author(s):  
Hassan Najadat ◽  
Mohammad A. Alzubaidi ◽  
Islam Qarqaz

Reviews and comments that users leave on social media are of great importance to companies and business entities. New product ideas can be evaluated based on customer reactions. However, this use of social media is complicated by those who post spam in the form of reviews and comments. Designing methods to automatically detect and block social media spam is difficult because spammers continuously develop new ways to leave their spam comments. Researchers have proposed several methods to detect English spam reviews, but few studies have addressed Arabic spam reviews. This article proposes a keyword-based method for detecting Arabic spam reviews. Keywords, or features, are subsets of words from the original text that are labelled as important. Term weights, the Term Frequency–Inverse Document Frequency (TF-IDF) matrix, and filter methods (such as information gain, chi-squared, deviation, correlation, and uncertainty) are used to extract keywords from Arabic text. The proposed method detects Arabic spam in Facebook comments. The dataset consists of 3,000 Arabic comments extracted from Facebook pages. Four machine learning algorithms are used in the detection process: C4.5 (a decision tree algorithm), kNN, SVM, and Naïve Bayes. The results show that the Decision Tree classifier outperforms the other classification algorithms, with a detection accuracy of 92.63%.
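The TF-IDF weighting mentioned above can be sketched in a few lines. This is a minimal version that assumes the plain formulation (relative term frequency times the natural log of inverse document frequency); the paper does not say which variant it uses, and the toy comments are hypothetical English stand-ins rather than items from the Arabic dataset.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy comments, already tokenized.
comments = [
    ["great", "product", "fast", "delivery"],
    ["click", "here", "free", "offer", "click"],
    ["great", "offer", "click", "here"],
]
weights = tf_idf(comments)
```

Terms that are frequent in one comment but rare across the collection receive the highest weights, which is what makes them candidates for keyword features.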


2022 ◽  
Vol 8 ◽  
Author(s):  
Alicia García-Dorta ◽  
Paola León-Suarez ◽  
Sonia Peña ◽  
Marta Hernández-Díaz ◽  
Carlos Rodríguez-Lozano ◽  
...  

Background: Secukinumab has been shown to be effective for psoriatic arthritis (PsA) and axial spondylarthritis (AxSpA) in randomized trials. The aim of this study was to analyze baseline patient and disease characteristics associated with a better retention rate of secukinumab under real-world conditions. Patients and Methods: Real-life, prospective, multicenter observational study involving 138 patients (61 PsA and 77 AxSpA), who were analyzed at baseline, at 6 and 12 months, and every year thereafter after starting secukinumab, regardless of the line of treatment. Demographics and disease characteristics, measures of activity, secukinumab use, and adverse events were collected. Drug survival was analyzed using Kaplan-Meier curves, and factors associated with discontinuation were evaluated using Cox regression. The machine-learning J48 decision tree classifier was also applied. Results: During the 1st year of treatment, 75% of patients persisted with secukinumab, but that year accounted for 71% (n = 32) of total losses (n = 45). The backward stepwise (Wald) method selected diagnosis, obesity, and gender as relevant variables, the latter when analyzing the interactions. At 1 year of follow-up, the Cox model showed the best retention rate in the groups of AxSpA women (95%, 95% CI 93–97%) and PsA men (89%, 95% CI 84–93%), with the worst retention in PsA women (66%, 95% CI 54–79%). The J48 classifier predicted secukinumab retention with an accuracy of 77.2%. No unexpected safety issues were observed. Conclusions: Secukinumab shows the best retention rate at 1 year of treatment in AxSpA women and in PsA men, independently of factors such as the time of disease evolution, the line of treatment, or the initial dose of the drug.
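For readers unfamiliar with drug-survival curves, the Kaplan-Meier product-limit estimator can be sketched as follows. The follow-up times and event flags are hypothetical toy values, not study data, and the function name is illustrative only.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with an event.

    durations: follow-up time per patient (for example, months on the drug)
    events: 1 if the drug was discontinued at that time, 0 if censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        stopped = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)
        if stopped > 0:
            # Product-limit step: multiply by the share that stays on the drug.
            survival *= 1 - stopped / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve

# Hypothetical toy cohort of six patients.
months = [3, 6, 6, 9, 12, 12]
discontinued = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(months, discontinued)
```

Censored patients (still on the drug at last contact) reduce the number at risk without lowering the curve, which is why the estimate differs from a simple proportion of discontinuations.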


Sensors ◽  
2022 ◽  
Vol 22 (1) ◽  
pp. 379
Author(s):  
Grzegorz Piecuch ◽  
Rafał Żyła

The article presents an extensive analysis of the literature related to the diagnosis of the extrusion process and proposes a new method. This method is based on observing the punch displacement signal in relation to the die and then approximating this signal with a polynomial. It is difficult to find in the literature even an attempt to diagnose the extrusion process by means of a simple distance measurement; the dominant approach is the use of strain gauges, force sensors, or even accelerometers. The authors, however, managed to use the displacement signal, which is the key element of the method presented in the article. Their aim was to propose an effective method that is simple to implement, does not require high computing power, and can act and make decisions in real time. As input to the classifier, the authors provided the determined polynomial coefficients and the SSE (Sum of Squared Errors) value. Based on the SSE values alone, the decision tree algorithm performed anomaly detection with an accuracy of 98.36%. With regard to the duration of the experiment (a single extrusion process), the decision was made after 0.44 s, which is on average 26.7% of the extrusion experiment duration. The article describes the method and the results achieved in detail.
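The pipeline described above (polynomial approximation, SSE, then a tree decision on SSE) can be sketched under stated assumptions. The polynomial degree, the sample signals, and the threshold of 1.0 are all hypothetical, and the single-threshold rule stands in for a depth-1 tree; the article's actual settings are not given in this abstract.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations; returns [c0, c1, ..., c_degree]."""
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs

def sse(xs, ys, coeffs):
    """Sum of squared errors between the signal and the fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )

def is_anomalous(sse_value, threshold=1.0):
    """Depth-1 decision rule on SSE; the threshold is a hypothetical value."""
    return sse_value > threshold

# Hypothetical displacement samples: a smooth stroke and one with a jump.
time_steps = [0.0, 1.0, 2.0, 3.0, 4.0]
normal_signal = [0.0, 1.0, 4.0, 9.0, 16.0]
faulty_signal = [0.0, 1.0, 4.0, 15.0, 16.0]
sse_normal = sse(time_steps, normal_signal, fit_polynomial(time_steps, normal_signal, 2))
sse_faulty = sse(time_steps, faulty_signal, fit_polynomial(time_steps, faulty_signal, 2))
```

A smooth stroke is captured almost exactly by the low-degree polynomial, so its SSE stays near zero, while a jump in the signal leaves a large residual that the threshold rule flags.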


2021 ◽  
pp. 1-11
Author(s):  
Jesús Miguel García-Gorrostieta ◽  
Aurelio López-López ◽  
Samuel González-López ◽  
Adrián Pastor López-Monroy

Academic thesis writing is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we explore lexical features for the automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal that combines the information in the complete paragraph with the detection of argumentative segments in order to improve the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses a decision tree classifier with indicators and lexical features, and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does indeed contain segments with argumentation. We achieved encouraging results for both approaches.


2021 ◽  
Vol 2 (2) ◽  
pp. 95-103
Author(s):  
Siti Khotimatul Wildah ◽  
Sarifah Agustiani ◽  
Ali Mustopa ◽  
Nanik Wuryani ◽  
Hendri Mahmud Nawawi ◽  
...  

The face is part of the biometric system: each human face has a shape and characteristics that differ from one person to another, so the face can serve as an alternative means of securing a system. Face recognition is based on matching and comparing an input image with images already stored in a database. However, face recognition remains a challenging problem because of illumination, pose, facial expression, and image quality. This study therefore aims to perform face recognition using machine learning methods, namely Logistic Regression (LR), Linear Discriminant Analysis (LDA), Decision Tree Classifier, Random Forest Classifier (RF), Gaussian NB, K Neighbors Classifier (KNN), and Support Vector Machine (SVM), together with the Hu-Moment, HOG, and Haralick feature extraction methods, on the Yale Face dataset. In the tests conducted, the combined Hu-Moment, HOG, and Haralick feature extraction with the Linear Discriminant Analysis (LDA) algorithm produced the highest accuracy, 79.71%, compared with the other feature extraction methods and classification algorithms.


Author(s):  
Mohamed H. Khedr ◽  
Nesrine A. Azim ◽  
Ammar M. Ammar

In the Egyptian banking industry, loan officers rely on pure judgment to make personal loan approval decisions. In this paper, we develop a new machine learning method for predicting customer loan default. The method uses the available personal data and historical credit data to evaluate the creditworthiness of customers applying for loans. We used the ABE dataset for training and testing, with 10 features taken from the application form and the i-score report class, which could help credit officers make the right decision and avoid selecting customers by random techniques. The collected dataset was analysed using various machine learning classifiers based on the important selected features, in order to obtain high accuracy. We compared the performance of several machine learning classifiers before and after feature selection. We found that, in terms of accuracy, the most important features are activity, income, and loan, and that the decision tree classifier surpassed the other machine learning classifiers, with a prediction accuracy of almost 94.85%.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Hanaa Fathi ◽  
Hussain AlSalman ◽  
Abdu Gumaei ◽  
Ibrahim I. M. Manhrawy ◽  
Abdelazim G. Hussien ◽  
...  

Cancer is one of the leading causes of death worldwide. One of the most effective tools for cancer diagnosis, prognosis, and treatment is expression profiling based on gene microarrays. For each data point (sample), gene expression data usually covers tens of thousands of genes. As a result, the data is large-scale, high-dimensional, and highly redundant. The classification of gene expression profiles is considered to be an NP-hard problem. Feature (gene) selection is one of the most effective methods to handle this problem. This paper presents a hybrid cancer classification approach that combines several machine learning techniques: Pearson’s correlation coefficient as a correlation-based feature selector and reducer, a Decision Tree classifier that is easy to interpret and does not require a parameter, and Grid Search CV (cross-validation) to optimize the maximum depth hyperparameter. Seven standard microarray cancer datasets are used to evaluate the model. To identify which features are the most informative and relevant under the proposed model, various performance measurements are employed, including classification accuracy, specificity, sensitivity, F1-score, and AUC. According to the results, the suggested strategy greatly decreases the number of genes required for classification, selects the most informative features, and increases classification accuracy.
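A correlation-based gene filter of the kind mentioned above might look like the following sketch. It assumes the filter ranks each gene by the absolute Pearson correlation between its expression values and the class label; the abstract does not specify the exact criterion, and the gene names and values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (spread_x * spread_y)

def select_top_k(columns, labels, k):
    """Keep the k feature names with the largest |r| against the label."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical expression values for three genes across four samples.
expression = {
    "gene_a": [0.1, 0.2, 0.9, 1.0],
    "gene_b": [0.5, 0.4, 0.6, 0.5],
    "gene_c": [1.0, 0.9, 0.2, 0.1],
}
labels = [0, 0, 1, 1]
selected = select_top_k(expression, labels, k=2)
```

The reduced feature set would then be passed to the decision tree, with the maximum depth chosen by cross-validated grid search as the abstract describes.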


Author(s):  
Aberham Tadesse Zemedkun

Diabetes is one of the most common non-communicable diseases in the world. Diabetes affects the ability to produce the hormone insulin, and complications may occur if it remains unidentified and untreated. This contributes significantly to increased morbidity, mortality, and admission rates of patients in both developed and developing countries. Medical records of the cases were reviewed retrospectively, and anthropometric and biochemical information was collected. From this data, four ML classification algorithms, namely Decision Tree (J48), Naive-Bayes, PART rule induction, and JRIP, were used to predict diabetes. Precision, recall, F-Measure, Receiver Operating Characteristics (ROC) scores, and the confusion matrix were calculated to determine the performance of the various algorithms. Performance was also measured by sensitivity and specificity. The algorithms have high classification accuracy and are generally comparable in predicting diabetic and non-diabetic patients. Among the algorithms tested, the Decision Tree Classifier (J48) scored the highest accuracy and was the best predictor, with a classification accuracy of 92.74%.
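The evaluation measures listed above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the counts are hypothetical and do not come from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from confusion-matrix counts (positive = diabetic)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }

# Hypothetical counts for 100 patients.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even when one class is poorly detected.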

