PENERAPAN DATA MINING DENGAN ALGORITMA C4.5 UNTUK KALKULASI DATA DALAM AL-QUR’AN TERJEMAHAN (Application of Data Mining with the C4.5 Algorithm for Data Calculation in the Translated Al-Qur'an)

2019 ◽  
Vol 6 (2) ◽  
pp. 152
Author(s):  
Asmira Mira Rusli

The Al-Qur'an was delivered by the Prophet Muhammad SAW to mankind as a guide for life in the world; it has 30 juz, 114 surahs, and 6,236 verses. In the Qur'an, many words are repeated within a surah on a particular topic. To find the number of repetitions of a word, an application for data calculation on the translated Al-Qur'an was built to facilitate word counting, so that users can quickly find out how many times a word repeats and how accurate the data is. The application is built with the C4.5 algorithm, one of the data mining methods. Using the C4.5 algorithm is expected to solve the problem of search and calculation in the translation of the Qur'an. The advantages of the C4.5 algorithm are that it produces decision trees that are easy to interpret, with an acceptable level of accuracy and efficiency. In this case the accuracy of the C4.5 algorithm reaches 80%.

Keywords: al-quran, data mining, C4.5 algorithm
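The abstract does not describe how its decision tree is built, but C4.5 in general chooses the attribute to split on by gain ratio (information gain divided by split information). The following is a minimal sketch of that criterion for a single categorical attribute; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())

def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: the attribute separates the two classes perfectly.
perfect = gain_ratio(["a", "a", "b", "b"], ["yes", "yes", "no", "no"])
# Hypothetical toy data: the attribute carries no information about the class.
useless = gain_ratio(["a", "b", "a", "b"], ["yes", "yes", "no", "no"])
```

A full C4.5 implementation would also handle continuous attributes, missing values, and pruning, none of which are shown here.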

2021 ◽  
Author(s):  
Chhaya Kulkarni ◽  
Nuzhat Maisha ◽  
Leasha J Schaub ◽  
Jacob Glaser ◽  
Erin Lavik ◽  
...  

This paper focuses on discovering a computational design map of disparate, heterogeneous outcomes from bioinformatics experiments in pig (porcine) studies, to help identify the key variables affecting the experimental outcomes. Specifically, we aim to connect discoveries from disparate laboratory experiments in the areas of trauma, blood loss, and blood clotting using data science methods in a collaborative ensemble setting. Trauma-related grave injuries cause exsanguination and death, constituting up to 50% of deaths, especially in the armed forces. Restricting blood loss in such scenarios usually requires the presence of first responders, which is not feasible in certain cases. Moreover, a traumatic event may lead to a cytokine storm, reflected in the cytokine variables. Hemostatic nanoparticles have been developed to tackle these situations of trauma and blood loss. This paper highlights a collaborative effort to use data science methods to evaluate the outcomes of a lab study and better understand the efficacy of the nanoparticles. Hemostatic nanoparticles were administered intravenously to pigs subjected to hemorrhagic shock, and blood loss, other immune response variables, and cytokine response variables were measured. Because the various hemostatic nanoparticles used in the intervention produce multiple data outcomes, it becomes critical to understand which nanoparticles matter most and which variables are key to studying further variations in the lab. We propose a collaborative data mining framework that combines the results of multiple data mining methods to discover impactful features. We used frequent patterns observed in the data from these experiments. We further validated the connections between these frequent rules by comparing the results with decision trees and feature ranking. Both the frequent patterns and the decision trees help us identify the critical variables that stand out in the lab studies and need further validation and follow-up in future studies. The outcomes of the data mining methods help produce a computational design map of the experimental results. Our preliminary results from this computational design map provided insights into which features can help in designing the most effective hemostatic nanoparticles.


Author(s):  
Mario Mollo Neto ◽  
Maria Elena Silva Montanhani ◽  
Leda Gobbo de Freitas Bueno ◽  
Erick Dos Santos Harada ◽  
Danilo Florentino Pereira

Climatic changes and high temperatures have been affecting animal production and the well-being of laying birds, causing heat stress and high mortality rates and generating economic losses. Legacy databases can contain information that helps model thermal comfort at climatic extremes, and data mining can turn that information into decision trees that help prevent mortality and production losses. The objective of this study is therefore to develop decision trees, for use as an alert system, for the incidence of heat stress in layer production. We used a database from three aviaries located in the city of Bastos-SP, collected in 2013. The data were organized in Excel® spreadsheets and mined with the Weka® software using the J48 (C4.5) algorithm. The technique allowed the construction of decision trees that, for the three chosen aviaries, classified 99.73%, 99.61%, and 98.71% of instances correctly, with Kappa indexes of 0.9958, 0.9907, and 0.9663 respectively, which indicates that the three classifiers built are excellent. The decision trees built can thus serve as the basis for an alert system to be applied to the three aviaries simultaneously.
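The Kappa index reported above measures agreement between predicted and actual classes beyond what chance alone would produce. As a minimal sketch, Cohen's kappa can be computed from a confusion matrix as follows; the function name and the example matrix are hypothetical and do not reproduce the study's data.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: actual, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Expected agreement by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    # Note: undefined (division by zero) when expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)

# Hypothetical two-class example: 85 of 100 instances classified correctly.
kappa = cohen_kappa([[45, 5], [10, 40]])
```

In this example the accuracy is 0.85 and the chance agreement is 0.5, so kappa is 0.7; values close to 1, as in the abstract, indicate near-perfect agreement.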


Tech-E ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 44
Author(s):  
Rino Rino

Heart disease is a condition in which fatty deposits in the coronary arteries change the function and shape of the arteries so that blood flow to the heart is obstructed. Data mining methods can predict this disease; the C4.5 algorithm and Naïve Bayes are among the methods often used in research. The data set in this research was obtained from the UCI Machine Learning Repository and has 3546 records and 13 attributes. The Naïve Bayes algorithm achieved an accuracy of 81.40%, higher than the C4.5 algorithm, which reached only 79.07%. Based on the calculation results, it can be concluded that the Naïve Bayes algorithm gives a very good classification because it has a value between 0.709 and 1.00. Because the Naïve Bayes algorithm has a higher accuracy than the C4.5 algorithm, the researchers decided to use the Naïve Bayes algorithm to predict heart disease.


2018 ◽  
Vol 7 (2.7) ◽  
pp. 51
Author(s):  
T V.R. Sai ◽  
SK Haaris ◽  
S Sridevi

In this project we used opinion mining methods to evaluate various websites on the internet. We also analyzed the approaches, tools, and datasets used by scholars, along with their accuracy, and applied this technology to the evaluation of a website. Opinion mining is used in various scenarios around the world, but it is hardly ever used for website evaluation, which is what we implement in this project, since the websites we regularly use nowadays are filled with advertisements and unusable content. This paper proposes a framework for evaluating a website using user feedback about that website collected on our own website. The collected feedback data is processed using the data mining software RapidMiner.


2019 ◽  
Vol 4 (1) ◽  
pp. 55-63
Author(s):  
Darsono Nababan ◽  
Alvin Wijaya ◽  
William William

Abstract Education is one of the most important needs in society. Through education, a person's mindset and quality of life improve. Quality education currently comes at a very high cost, so many students are interested in taking scholarships to ease that burden. A survey was used to measure which factors influence students in taking scholarships. The analysis is expected to identify the factors that most strongly lead students to take scholarships. The data mining processing was done using the C4.5 algorithm to form decision trees. The software that applies the C4.5 algorithm is RapidMiner. Keywords: Education, Scholarship, C4.5 Algorithm, RapidMiner


2020 ◽  
Vol 8 (6) ◽  
pp. 1045-1049

India has the second largest population and is the seventh largest country in the world; UN data from 2018 recorded more than 1,368,681,134 people spread throughout the Indian provinces. India also has a variety of social problems, one of which is poverty. The poverty line number in India needs to be improved. The technique of turning data into new information is called data mining. One of the most popular data mining methods is clustering with the k-means algorithm, which can process data without being given class labels in advance. This study produces three provincial groups according to very low, low, and sufficient income figures. Processing the poverty line data for India with the k-means algorithm yields a Davies-Bouldin index of 0.271. This result is considered good enough because the closer the index is to zero, the better the similarity between members of each cluster.
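The Davies-Bouldin index used above compares the scatter within each cluster to the distance between cluster centroids, so lower values mean more compact, better separated clusters. The following is a minimal sketch of that computation for points that have already been assigned to clusters (for example by k-means); the function name and the toy points are hypothetical and are not the study's data.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index for points already assigned to at least two clusters."""
    # Group the points by cluster label.
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    # Centroid of each cluster: the coordinate-wise mean of its points.
    centroids = {
        k: tuple(sum(c) / len(ps) for c in zip(*ps)) for k, ps in clusters.items()
    }
    # Scatter of each cluster: mean distance from its points to its centroid.
    scatter = {
        k: sum(math.dist(p, centroids[k]) for p in ps) / len(ps)
        for k, ps in clusters.items()
    }
    keys = list(clusters)
    # For each cluster, take the worst (largest) similarity ratio to any other cluster.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys
            if j != i
        )
        for i in keys
    ]
    return sum(worst) / len(keys)

# Hypothetical toy data: two tight clusters that are far apart.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
dbi = davies_bouldin(toy_points, toy_labels)
```

Here each cluster has a scatter of 1 and the centroids are 10 apart, so the index is 0.2. The sketch assumes at least two clusters with distinct centroids and uses `math.dist`, which requires Python 3.8 or later.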


Author(s):  
Sarangam Kodati ◽  
Jeeva Selvaraj

Data mining is the best-known knowledge extraction approach for knowledge discovery from data (KDD). Machine learning enables a program to analyze data, recognize correlations, and use the resulting insights to solve problems, enrich data, and make predictions. The chapter highlights the need for more research on the use of robust data mining methods to help healthcare specialists diagnose heart disease and other debilitating conditions. Heart disease is the leading cause of death in the world; nearly 47% of deaths are caused by heart disease. The authors use algorithms including random forest, naïve Bayes, and support vector machine to analyze heart disease. Prediction accuracy is higher when a greater number of attributes is used. The goal is to perform predictive analysis using data mining, to analyze heart disease, and to show which methods are effective and efficient.


Kidney disease is one of the major public health problems today. Chronic diseases lead to morbidity and mortality in India and in other low- and middle-income countries. Chronic diseases account for 60% of deaths worldwide, and 80% of chronic disease deaths worldwide occur in low- and middle-income countries. In India, the number of deaths due to chronic disease was found to be 5.21 million in 2008 and is expected to rise to 7.63 million in 2020 (approximately 66.7%). Data mining is the process of extracting hidden information from a given large dataset. Various data mining techniques, such as clustering, classification, association analysis, regression, summarization, time series analysis, and sequence analysis, have been used to predict kidney diseases. The techniques presented so far have had minor drawbacks in the quality of preprocessing or at other stages. In this paper, the various data mining methods for predicting kidney diseases are reviewed and the major issues are briefly explained.


Author(s):  
I.M. Burykin ◽  
◽  
G.N. Aleeva ◽  
R.Kh. Khafizianova ◽  
◽  
...  