Decision Tree based Classification and Dimensionality Reduction of Cervical Cancer

The data revolution in medicine and biology has increased our fundamental understanding of biological processes and of the factors that cause disease, but it has also made the resulting data challenging to analyze. After breast cancer, cervical cancer is responsible for most deaths among women. According to IARC, an estimated 7095 cases of cervical cancer were reported in 2012 alone. Cervical cancer accounted for 16.5% of the deaths, with a total of 28,711 deaths among women. To analyze high-dimensional data accurately and in less time, its dimensionality needs to be reduced by removing irrelevant features. Classification is performed using C5.0, the recent iteration of Quinlan's C4.5 decision tree algorithm, with PCA as the dimensionality reduction technique. The proposed methodology shows a significant improvement in the time taken by both algorithms. The results indicate that the C5.0 algorithm is superior to the C4.5 algorithm.
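The PCA step described above can be sketched as follows. This is a minimal illustration, assuming NumPy and a synthetic dataset; the function name `pca_reduce` and the data are hypothetical, and it is not the authors' pipeline. The C5.0 classifier is not implemented here; the reduced matrix `Z` would be passed to whichever decision tree learner is used.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components.

    Returns the projected data and the fraction of total variance
    captured by each retained component.
    """
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio

# Synthetic data (an assumption for illustration): 200 samples whose
# 10 features are noisy mixtures of only 2 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

Z, ratio = pca_reduce(X, n_components=2)
print(Z.shape)        # reduced feature matrix
print(ratio.sum())    # share of variance kept by the 2 components
```

Because the synthetic features are driven by two factors, two components retain nearly all of the variance; on real clinical data the number of components would have to be chosen from the explained-variance profile.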

2013
Vol 397-400
pp. 2296-2300
Author(s):  
Fei Shuai ◽  
Jun Quan Li

Currently, there are complex relationships among the assets of information security products. Based on this characteristic, we propose a new asset recognition algorithm (ART) that improves on the C4.5 decision tree algorithm, and we analyze its computational complexity and space complexity. Finally, we demonstrate through an application example that our algorithm is more precise than the C4.5 algorithm in asset recognition; the result verifies the usability of our algorithm.

Keywords: decision tree, information security product, asset recognition, C4.5


2014
Vol 2014
pp. 1-5
Author(s):  
Fuding Xie ◽  
Yutao Fan ◽  
Ming Zhou

Dimensionality reduction is the transformation of high-dimensional data into a meaningful representation of reduced dimensionality. This paper introduces a dimensionality reduction technique that uses weighted connections between neighborhoods to improve the K-Isomap method, aiming to preserve the relationships between neighborhoods during dimensionality reduction. The validity of the proposal is tested on three typical examples that are widely used for manifold-based algorithms. The experimental results show that the proposed method preserves the local topology of the dataset well when transforming it from a high-dimensional space into a low-dimensional one.


2018
Author(s):  
Etienne Becht ◽  
Charles-Antoine Dutertre ◽  
Immanuel W. H. Kwok ◽  
Lai Guan Ng ◽  
Florent Ginhoux ◽  
...  

Abstract: Uniform Manifold Approximation and Projection (UMAP) is a recently published non-linear dimensionality reduction technique. Another such algorithm, t-SNE, has been the default method for such tasks in recent years. Herein we comment on the usefulness of UMAP for high-dimensional cytometry and single-cell RNA sequencing, notably highlighting faster runtime and consistency, meaningful organization of cell clusters, and preservation of continuums in UMAP compared to t-SNE.


2015
Vol 15 (2)
pp. 154-172
Author(s):  
Danilo B Coimbra ◽  
Rafael M Martins ◽  
Tácito TAT Neves ◽  
Alexandru C Telea ◽  
Fernando V Paulovich

Understanding three-dimensional projections created by dimensionality reduction from high-variate datasets is very challenging. In particular, classical three-dimensional scatterplots used to display such projections do not explicitly show the relations between the projected points, the viewpoint used to visualize the projection, and the original data variables. To explore and explain such relations, we propose a set of interactive visualization techniques. First, we adapt and enhance biplots to show the data variables in the projected three-dimensional space. Next, we use a set of interactive bar chart legends to show variables that are visible from a given viewpoint and also assist users to select an optimal viewpoint to examine a desired set of variables. Finally, we propose an interactive viewpoint legend that provides an overview of the information visible in a given three-dimensional projection from all possible viewpoints. Our techniques are simple to implement and can be applied to any dimensionality reduction technique. We demonstrate our techniques on the exploration of several real-world high-dimensional datasets.


2012
Vol 457-458
pp. 754-757
Author(s):  
Hong Yan Zhao

The decision tree, a principal data mining technique for classification and prediction, infers classification rules in the form of a decision tree from a set of unordered, irregular examples. Building on the concept of the decision tree, the C4.5 algorithm, and decision tree construction, the C4.5 decision tree algorithm was applied to the analysis of students' scores with the aim of improving teaching quality.


2014
Vol 926-930
pp. 703-707
Author(s):  
Hu Yong

To address the problem of analyzing student results, a data mining model for student result data is presented. The decision tree method is a very effective classification method in data mining. Given the characteristics of student result data, the C4.5 decision tree algorithm was adopted. C4.5 is an improvement on ID3, the core decision tree algorithm; it is simple to construct, relatively fast, and easy to implement. Decision attributes were selected and the data mined; the results show that the algorithm classifies student result data correctly and yields some valuable conclusions that support decision analysis.
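One well-known way in which C4.5 improves on ID3 is its split criterion: ID3 ranks attributes by information gain, while C4.5 normalizes that gain by the split information (the gain ratio). The following is a minimal sketch of that criterion only, assuming categorical attributes; the attribute values and labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """C4.5-style gain ratio of one categorical attribute."""
    n = len(labels)
    # Group the class labels by attribute value
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain, the criterion used by ID3
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical student records
labels = ["pass", "pass", "fail", "fail"]
attendance = ["high", "high", "low", "low"]   # two values, separates classes
student_id = ["a", "b", "c", "d"]             # unique per record

print(gain_ratio(attendance, labels))   # 1.0
print(gain_ratio(student_id, labels))   # 0.5
```

Both attributes have the same information gain (1 bit) on this toy data, but the gain ratio ranks the identifier-like attribute lower because its split information is 2 bits.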


2019
Vol 7 (2)
Author(s):  
Dyah Wulandari ◽  
Nur Lutfiyana ◽  
Heny Sumarno

Abstract - Credit is the provision of money or equivalent claims, based on a lending agreement or arrangement between a bank and another party, which requires the borrower to repay the debt after a certain period of time together with interest, compensation, or profit sharing. The credit customer data at BSM KCP Kemang Pratama still show Non Performing Financing (NPF), or bad credit. When analyzing a credit application, an analyst sometimes performs an inaccurate analysis, so that some customers turn out to be less able to make credit payments, which results in bad credit. The researchers therefore conducted an analysis using the C4.5 decision tree algorithm and the Rapid Miner application to determine creditworthiness. The analysis of credit customer data with the C4.5 decision tree algorithm assessed the eligibility of credit recipients very effectively, producing on Rapid Miner 5.3 an accuracy of 80%, a precision of 100%, and a recall of 0%, so as to minimize risk.

Keywords: Credit, C4.5 Algorithm, Rapid Miner, Accuracy Value
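For reference, accuracy, precision, and recall are all derived from the same confusion-matrix counts. The sketch below shows the standard definitions; the function name and the labels are hypothetical and are not the study's data or Rapid Miner's implementation.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) for one positive class.

    Precision or recall is None when its denominator is zero,
    i.e. when the measure is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return accuracy, precision, recall

# Hypothetical labels: 8 good-credit and 2 bad-credit customers,
# with a classifier that predicts "good" for everyone.
y_true = ["good"] * 8 + ["bad"] * 2
y_pred = ["good"] * 10

accuracy, precision, recall = classification_metrics(y_true, y_pred, positive="bad")
print(accuracy, precision, recall)
```

With these hypothetical labels the accuracy is 0.8 and the recall for the "bad" class is 0.0, while precision is undefined because no positive prediction is made. Which class is treated as positive changes precision and recall but not accuracy.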


Author(s):  
M. Robinson Joel ◽  
G. Vishali ◽  
R. Ponlatha ◽  
Syed Sharmila Begum

Cervical cancer ranks fourth among cancers worldwide and is among the most prevalent cancers affecting women. It is also the leading gynecological malignancy worldwide. If the cancer is detected in its early stages, it can be treated and cured successfully. This paper presents classification techniques for cervical cancer and describes advanced feature selection approaches. Dimensionality reduction is used to improve the accuracy of the classifier. Feature selection methods fall into two categories: filters and wrappers. Using these analytic techniques, cancer can be classified, and the paper accordingly categorizes the approaches to cervical cancer classification.


2015
Vol 4 (3)
pp. 173-182
Author(s):  
Salih Özsoy ◽  
Gökhan Gümüş ◽  
Savriddin KHALILOV

In this study, data mining, one of the latest technologies in information systems, was introduced, and classification, a data mining method, and classification algorithms were discussed. A classification was performed using the C4.5 decision tree algorithm on a dataset about Labor Relations from http://archive.ics.uci.edu/ml/datasets.html. Finally, the C4.5 algorithm was compared with some other decision tree algorithms. C4.5 was one of the successful classifiers.

