Data Mining Through Data Visualization: A Case Study on Predicting Churners on Telecommunications Data Set

This article evaluates the efficiency of poultry meat farms as a case study for a new method. The poultry farming industry is one of the most important sub-sectors. The purpose of the study is to predict and assess the efficiency of poultry farms treated as decision-making units (DMUs). Although several methods have been proposed for this problem, the authors needed a methodology with strong power to discriminate between levels of performance. Their methodology combines data envelopment analysis with data mining techniques such as artificial neural networks (ANN), decision trees (DT), and cluster analysis (CA). As a case study, data for the analysis were collected from 22 poultry companies in Iran. Because the data set is small, whereas data mining techniques normally call for large data sets, the authors used k-fold cross-validation to validate their model. Assessing the efficiency of each DMU, clustering the DMUs, applying the model, and presenting decision rules results in a precise and accurate optimization technique.
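The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It only shows how a small data set of 22 units might be split into folds. The function name `k_fold_indices`, the choice of k = 5, and the random seed are assumptions made for illustration and are not taken from the article.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 22 units, as in the case study; k = 5 is an illustrative assumption.
splits = list(k_fold_indices(22, 5))
for train, test in splits:
    print(len(train), len(test))
```

Each unit appears in exactly one test fold, so every observation is used for validation once even though the data set is small.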

PENERAPAN DATA MINING MENGGUNAKAN ALGORITMA C4.5 TEHADAP PENGARUH PENJUALAN KOPI PADA PT. JPW INDONESIA

Jurnal Sistem Informasi dan Informatika (Simika) ◽

10.47080/simika.v3i1.836 ◽

2020 ◽

Vol 3 (1) ◽

pp. 40-54

Author(s):

Ikong Ifongki

Keyword(s):

Data Mining ◽

Decision Tree ◽

Decision Rules ◽

Large Data ◽

Added Value ◽

Data Set ◽

Use Of Data ◽

Decision Tree Classification ◽

C4.5 Algorithm

Data mining is a series of processes for extracting added value from a data set, in the form of knowledge that could not be found manually. Data mining techniques are expected to reveal knowledge previously hidden in the data warehouse, turning it into valuable information. The C4.5 algorithm is a widely used decision tree classification algorithm because of its main advantages over other algorithms: it produces decision trees that are easy to interpret, has an acceptable level of accuracy, and handles both discrete and numeric attributes efficiently. As with other decision tree classification techniques, the output of the C4.5 algorithm is a decision tree, a structure that divides a large data set into smaller sets of records by applying a series of decision rules, so that with each division the members of the resulting sets become more similar to each other. This case study examines the factors that affect coffee sales by processing 106 records out of 1,087 coffee sales records at PT. JPW Indonesia. The data samples are processed manually in Microsoft Excel and with the RapidMiner software. The results of the C4.5 calculation show that the Quantity and Price attributes strongly affect coffee sales, so sales at PT. JPW Indonesia are still often unstable.
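C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by split information. The sketch below computes it for discrete attributes only; numeric attributes, which C4.5 handles through threshold splits, are left out. The rows are invented toy values and are not the PT. JPW Indonesia data; the attribute names `quantity`, `price`, and `sales` are assumptions chosen to echo the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of a discrete attribute: information gain / split information."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    split_info = -sum(
        len(p) / total * log2(len(p) / total) for p in partitions.values()
    )
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records, invented for illustration only.
rows = [
    {"quantity": "high", "price": "low", "sales": "up"},
    {"quantity": "high", "price": "high", "sales": "up"},
    {"quantity": "low", "price": "low", "sales": "down"},
    {"quantity": "low", "price": "high", "sales": "down"},
]
for attr in ("quantity", "price"):
    print(attr, gain_ratio(rows, attr, "sales"))
```

In this toy example `quantity` separates the classes perfectly and `price` not at all, so a C4.5-style learner would split on `quantity` first.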

Decision Support System For A Customer Relationship Management Case Study

International Journal of Informatics and Communication Technology (IJ-ICT) ◽

10.11591/ijict.v3i2.pp88-96 ◽

2014 ◽

Vol 3 (2) ◽

pp. 88

Author(s):

Özge Kart ◽

Alp Kut ◽

Vladimir Radevski

Keyword(s):

Data Mining ◽

Customer Relationship Management ◽

Banking Sector ◽

Relationship Management ◽

Customer Relationship ◽

Data Set ◽

Naive Bayesian ◽

Naïve Bayesian ◽

Management Case Study

Data mining is a computational approach that aims to discover hidden and valuable information in large data sets. It has recently gained importance across a wide range of computational fields, many of them in the domain of Business Informatics. This paper focuses on applications of data mining in Customer Relationship Management (CRM). The core of our application is a classifier based on naive Bayesian classification. The accuracy of the model is determined by cross-validation. The results demonstrate the applicability and effectiveness of the proposed model: the naive Bayesian classifier achieved high accuracy, so the classification rules can be used to support decision making in the CRM field. The aim of this study is to apply the data mining model to the banking sector as an example case study. The work also includes an example data set of customer data, used to predict whether a client will subscribe to a term deposit. The results of the implementation are available on a mobile platform.
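As a rough illustration of naive Bayesian classification on categorical customer attributes, here is a minimal sketch with Laplace (add-one) smoothing. It is not the authors' implementation: the class name `CategoricalNaiveBayes`, the two features (job, existing loan), and the six training rows are all hypothetical, and the cross-validation step described in the abstract is omitted.

```python
from collections import Counter, defaultdict
from math import log


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with add-one smoothing."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n = len(y)
        # value_counts[(feature_index, class)][value] = number of occurrences
        self.value_counts = defaultdict(Counter)
        # values[feature_index] = set of distinct values seen in training
        self.values = defaultdict(set)
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.value_counts[(j, label)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        best, best_score = None, float("-inf")
        for c, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = log(count / self.n)
            for j, v in enumerate(row):
                num = self.value_counts[(j, c)][v] + 1
                den = count + len(self.values[j])
                score += log(num / den)
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical customers: (job, has existing loan) -> subscribed to term deposit?
X = [
    ("student", "no"), ("student", "no"), ("retired", "no"),
    ("worker", "yes"), ("worker", "yes"), ("worker", "no"),
]
y = ["yes", "yes", "yes", "no", "no", "no"]

model = CategoricalNaiveBayes().fit(X, y)
print(model.predict(("student", "no")))
print(model.predict(("worker", "yes")))
```

Working in log space avoids numerical underflow when many features are multiplied together, and the smoothing keeps an unseen attribute value from forcing a class probability to zero.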

Pemetaan Siswa Berprestasi Menggunakan Metode K-Means Clustring

JURTEKSI ◽

10.33330/jurteksi.v4i1.28 ◽

2017 ◽

Vol 4 (1) ◽

pp. 85-92

Author(s):

Mustika Larasati Sibuea ◽

Andy Safta

Keyword(s):

Data Mining ◽

Student Achievement ◽

Manhattan Distance ◽

Data Set ◽

Euclidian Distance ◽

Student Failure ◽

Human Resources Information Systems ◽

High Level

Abstract: A high level of student success and a low level of student failure reflect the quality of education. Educational institutions are now required to compete by making use of all the resources they have. Alongside facilities, infrastructure, and human resources, information systems are one of the resources that can be used to improve competitiveness. Data mining is a process of data analysis that finds patterns in a data set. It can turn large amounts of data into information that is meaningful for decision makers. One data mining process is clustering. The attributes used to group student achievement are Name, Extracurricular, scores (assignment score, midterm exam (UTS) score, and final exam (UAS) score), number of absences, and attitude score. A case study of 20 students, with distances calculated using Manhattan distance, Chebyshev distance, and Euclidean distance, yielded 67% accuracy. Keywords: data mining, clustering, k-means, student achievement
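The three distance measures named in the abstract, and the way they plug into k-means, can be sketched as follows. This is a minimal illustration under stated assumptions: the points are made-up two-dimensional scores, the initial centroids are chosen by hand, and the centroid update uses the plain mean for every distance (strictly, the mean is the natural update only for Euclidean distance).

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Straight-line distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_means(points, centroids, distance, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then update."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: distance(p, centroids[i])
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical two-attribute scores, invented for illustration only.
points = [(1, 1), (1, 2), (8, 8), (9, 8)]
centroids, clusters = k_means(points, [(1, 1), (9, 8)], euclidean)
print(centroids)
print(clusters)
```

Passing `manhattan` or `chebyshev` instead of `euclidean` changes only the assignment step, which is enough to compare how the choice of distance affects the grouping.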

Contributions of KDD to the Knowledge Management Process

CLEI electronic journal ◽

10.19153/cleiej.7.1.2 ◽

2018 ◽

Vol 7 (1) ◽

Author(s):

Hércules Antonio Do Prado ◽

Paulo de Tarso Costa de Sousa ◽

Eduardo Amadeu Moresi ◽

Marcelo Ladeira

Keyword(s):

Data Mining ◽

Knowledge Management ◽

Knowledge Creation ◽

Federal District ◽

Knowledge Discovery In Databases ◽

Post Processing ◽

Data Set ◽

Processing Step ◽

And Storage

Knowledge Discovery in Databases (KDD), like any organizational process, is carried out under a Knowledge Management (KM) model adopted (even informally) by a corporation. KDD is roughly described in three steps: pre-processing, data mining, and post-processing. The last is mainly concerned with transforming the patterns produced in the data mining step into knowledge. KM, on the other hand, comprises the following phases, in which knowledge is the subject of the actions: identification of abilities, acquisition, selection and validation, organization and storage, sharing, application, and creation. Although there are many overlaps between KDD and KM, one of them is broadly recognized: the point at which knowledge arises. This paper reports a study aimed at clarifying the relations between the overlapping areas of KDD and knowledge creation in KM. The work is conducted by means of a case study using data from the Electoral Court of the Federal District (ECFD), Brazil. The study was developed over a data set of 1,717,000 citizens, from which data mining models were built by applying algorithms from Weka. It was observed that, although the importance of Information Technology is well recognized in the KM realm, the techniques of KDD deserve a special place in the knowledge creation phase of KM. Moreover, beyond the overlap of post-processing and knowledge creation, other steps of KDD can contribute significantly to KM. For example, one important decision by the ECFD board was taken on the basis of knowledge acquired in the pre-processing step of KDD.

Interactive Data Visualization to Understand Data Better

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/ijkdb.2014070101 ◽

2014 ◽

Vol 4 (2) ◽

pp. 1-10 ◽

Cited By ~ 1

Author(s):

Zhecheng Zhu ◽

Bee Hoon Heng ◽

Kiok Liang Teow

Keyword(s):

Case Studies ◽

Healthcare System ◽

Data Visualization ◽

Healthcare Systems ◽

Data Set ◽

First Case ◽

Interactive Data ◽

Visual Impact ◽

Visualization Techniques

This paper focuses on interactive data visualization techniques and their applications in healthcare systems. Interactive data visualization is a collection of techniques that dynamically translate data from numeric form into graphic presentation for easy understanding and visual impact. Compared to conventional static data visualization techniques, interactive techniques allow users to explore the entire data set themselves through instant slicing and dicing and quick switching among multiple data sources. Adjustable granularity allows both detailed micro information and aggregated macro information to be displayed in a single chart. Animated transitions add extra visual impact by showing how a system moves from one state to another. When applied to healthcare systems, interactive visualization techniques are useful in areas such as information integration, flow or trajectory presentation, and location-related visualization. In this paper, three case studies are shared to illustrate how interactive data visualization techniques are applied to various aspects of healthcare systems. The first case study shows a pathway visualization representing the longitudinal disease progression of a patient cohort. The second case study shows a dashboard profiling different patient cohorts from multiple perspectives. The third case study shows an interactive map illustrating the geographical distribution of patients at adjustable granularity. All three case studies illustrate that interactive data visualization techniques support quick information access, fast knowledge sharing, and better decision making in healthcare systems.

WEB APPLICATION FOR LARGE-SCALE MULTIDIMENSIONAL DATA VISUALIZATION

Mathematical Modelling and Analysis ◽

10.3846/13926292.2011.580381 ◽

2011 ◽

Vol 16 (1) ◽

pp. 273-285 ◽

Cited By ~ 4

Author(s):

Gintautas Dzemyda ◽

Virginijus Marcinkevičius ◽

Viktor Medvedev

Keyword(s):

Data Mining ◽

Data Visualization ◽

Web Application ◽

Large Scale ◽

Visual Presentation ◽

Multidimensional Data ◽

Data Sets ◽

Data Set ◽

Multidimensional Data Visualization ◽

Multidimensional Data Set

In this paper, we present a web application (offered as a service) for data mining, oriented toward multidimensional data visualization. The paper focuses on visualization methods as a tool for the visual presentation of large-scale multidimensional data sets. The proposed web application takes a multidimensional data set as input and produces a visualization of it. It also supports different configuration parameters for the data mining methods used. Parallel computation is used in the proposed implementation to run the algorithms simultaneously on different computers.

Penerapan Metode Klasifikasi Decision Tree dan Algoritma C4.5 dalam Memprediksi Kriteria Nasabah Kredit Mega Auto Finance

JURIKOM (Jurnal Riset Komputer) ◽

10.30865/jurikom.v7i2.1762 ◽

2020 ◽

Vol 7 (2) ◽

pp. 200

Author(s):

Puji Santoso ◽

Rudy Setiawan

Keyword(s):

Data Mining ◽

Decision Tree ◽

Microsoft Excel ◽

Customer Data ◽

Data Mining Techniques ◽

C4.5 Algorithm ◽

Marketing Costs ◽

Excel Format ◽

Data Mining Application

One of the tasks in finance marketing is to analyze customer data to find out which customers are likely to take out credit again. The current method classifies all customers who have completed their credit installments as marketing targets, which leads to high operational marketing costs. This research was therefore conducted to help solve this problem by designing a data mining application that predicts the criteria of credit customers with the potential to borrow again from Mega Auto Finance. The Mega Auto Finance Fund Section in Kotim Regency was chosen by the researchers as a case study, on the assumption that it has experienced the problems described above. The data mining technique applied in the application is classification, the classification method is the decision tree, and the algorithm used to build the tree is the C4.5 algorithm. The data processed in this study are the installment data of Mega Auto Finance loan customers for July 2018, in Microsoft Excel format. The result of this study is an application that can help the Mega Auto Finance Fund Section obtain credit marketing targets in the future.

Data Visualization Indicator Disease (Malaria, Dengue Fever, and Measles) in The Year 2012-2015

International Journal of New Media Technology ◽

10.31937/ijnmt.v4i2.785 ◽

2017 ◽

Vol 4 (2) ◽

pp. 87-93

Author(s):

Immanuel Luigi Da Gusta ◽

Johan Setiawan

Keyword(s):

Data Mining ◽

Human Resources ◽

Data Visualization ◽

Health Sector ◽

Medical Personnel ◽

Database Systems ◽

Process Time ◽

Acceptance Test ◽

Visual Data Mining ◽

The Government

The aims of this paper are: to create a data visualization that can assist the Government in evaluating the return on the development of health facilities at the regional and provincial level in terms of human resources for medical personnel; to help the community learn how hospitals and medical personnel are distributed across regions; and to map disease indicators in Indonesia. Health remains a major problem that the Government of Indonesia has not resolved. There are three main problems in the health sector in Indonesia: infrastructure is unevenly distributed and inadequate, professional health workers are in short supply, and the number of deaths in outbreaks of infectious diseases is still high. Data for the research were taken from BPS, 10,600 records in total after the Extract, Transform, and Load process. Converting several publications from PDF to CSV and then to MS Excel took 3 weeks. The method used is the Eight-step Data Visualization and Data Mining methodology. Tableau was chosen as the tool to create the data visualization because it can combine the dashboards into an interactive story, making it easier for the user to analyze the data. The result is a story with 3 dashboards that fulfills the requirements of BPS staff and was tested with a satisfactory result in the UAT (User Acceptance Test). Index Terms: Dashboard, data visualization, disease, malaria, Tableau

Parent and Grandparent Relationships in Emerging Adulthood

10.1093/oso/9780199934263.003.0007 ◽

2018 ◽

Author(s):

Michael W. Pratt ◽

M. Kyle Matsuba

Keyword(s):

Emerging Adulthood ◽

Attachment Theory ◽

Emerging Adults ◽

Family Relationships ◽

Longitudinal Research ◽

Study Data ◽

Family Development ◽

Data Set ◽

Theory Framework

Chapter 7 begins with an overview of Erikson’s ideas about intimacy and its place in the life cycle, followed by a summary of Bowlby and Ainsworth’s attachment theory framework and its relation to family development. The authors review existing longitudinal research on the development of family relationships in adolescence and emerging adulthood, focusing on evidence with regard to links to McAdams and Pals’ personality model. They discuss the evidence, both questionnaire and narrative, from the Futures Study data set on family relationships, including emerging adults’ relations with parents and, separately, with grandparents, as well as their anticipations of their own parenthood. As a way of illustrating the key personality concepts from this family chapter, the authors end with a case study of Jane Fonda in youth and her father, Henry Fonda, to illustrate these issues through the lives of a 20th-century Hollywood dynasty of actors.
