A Survey of Methodologies and Techniques for Data Mining and Intelligent Data Discovery

Bio-inspired algorithms implement solutions found in nature to tackle hard problems, the so-called NP problems. A seismic hazard is the probability that an earthquake will occur in a given geographic area, within a given window of time, and with ground motion intensity exceeding a given threshold. Seismic hazard prediction is one of the fields where data mining plays an important role. This paper presents a new bio-inspired algorithm, motivated by the echolocation behavior of bats, for predicting seismic hazard states in coal mines from previously recorded data. The approach is based on distance calculation. The results were satisfactory enough to encourage further work on this approach. The implementation of the algorithm touches three fields of study: data discovery (data mining), bio-inspired techniques, and seismic hazard prediction.

Data Discovery Approaches for Vague Spatial Data

Data Mining ◽ 10.4018/978-1-4666-2455-9.ch003 ◽ 2013 ◽ pp. 50-65

Author(s): Frederick E. Petry

Keyword(s): Data Mining ◽ Fuzzy Sets ◽ Association Rules ◽ Rough Set ◽ Fuzzy Set ◽ Spatial Data ◽ Spatial Databases ◽ Uncertain Data ◽ Rule Extraction ◽ Data Discovery

This chapter focuses on the application of the discovery of association rules in approaches for vague spatial databases. The background of data mining and uncertainty representations using rough set and fuzzy set techniques is provided. The extensions of association rule extraction for uncertain data, as represented by rough and fuzzy sets, are described. Finally, an example of rule extraction for both types of uncertainty representation is given.
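As a point of reference for the association rules mentioned in this abstract, the following is a minimal sketch of the two standard rule measures, support and confidence, on ordinary (crisp) data. It does not reproduce the chapter's rough-set or fuzzy-set extensions. The item names and the `transactions` list are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical spatial records: features observed in each map cell.
transactions = [
    {"near_river", "wetland"},
    {"near_river", "wetland", "forest"},
    {"near_river", "forest"},
    {"forest"},
]

rule_support = support(transactions, {"near_river", "wetland"})
rule_confidence = confidence(transactions, {"near_river"}, {"wetland"})
print(rule_support, rule_confidence)
```

Here the rule "near_river -> wetland" holds in two of the four cells, and in two of the three cells that contain "near_river".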

Web Usage Minning using Patterns with Different Algorithm

VFAST Transactions on Software Engineering ◽ 10.21015/vtse.v12i1.497 ◽ 2017 ◽ pp. 1-9

Author(s):

Keyword(s): Data Mining ◽ Web Usage Mining ◽ Data Discovery ◽ Web Log ◽ Data Usage ◽ Web Usage ◽ Log Files ◽ Content Mining ◽ User Data ◽ Data Content

Web usage mining is a part of data mining. Web data mining is divided into three parts: 1) data content mining, 2) data structure mining, and 3) data usage mining. This paper discusses the log files used in data usage mining. Log files store users' activity on the web server hosting a website, so that the website can be improved using the gathered user data. Web usage mining has three sub-parts: preprocessing, data discovery, and data analysis. The paper also gives details about web log files and discusses three algorithms used to find patterns in log files. Their comparison is shown with the help of graphs.

A Survey : Study of Data Mining and Data Warehousing in Healthcare

International Journal of Scientific Research in Science Engineering and Technology ◽ 10.32628/ijsrset218618 ◽ 2021 ◽ pp. 117-121

Author(s): Arvind Singh

Keyword(s): Data Mining ◽ Health Care ◽ Data Warehouse ◽ Health Care System ◽ Medical Data ◽ Survey Study ◽ Data Discovery ◽ Health Care Database ◽ Care Database ◽ Care System

Health care is one of the fastest-growing areas. The health care system holds a large amount of medical data, which should be mined from a data warehouse. Data mined from the warehouse helps in finding important information. The sheer amount of data in health care databases calls for tools that can access and analyze the data, discover knowledge, and make informed use of the stored knowledge. The health care system has a lot of data about patients' details, medications, etc. In this paper we study different data mining and warehousing techniques used in health care.

Implementasi Data Mining untuk Deteksi Penyakit Ginjal Kronis (PGK) menggunakan K-Nearest Neighbor (KNN) dengan Backward Elimination

Jurnal Teknologi Informasi dan Ilmu Komputer ◽ 10.25126/jtiik.2020721896 ◽ 2020 ◽ Vol 7 (2) ◽ pp. 417

Author(s): Ikhsan Wisnuadji Gamadarenda ◽ Indra Waspada

Keyword(s): Diabetes Mellitus ◽ Data Mining ◽ Nearest Neighbor ◽ K Nearest Neighbor ◽ Data Discovery ◽ Nearest Neighbor Algorithm ◽ Web Based ◽ Elimination Algorithm ◽ Backward Elimination ◽ K Nearest Neighbor Algorithm

Chronic kidney disease (CKD) is a public health problem worldwide with a steadily rising incidence. According to BPJS Kesehatan, CKD care ranks second in financing cost after heart disease. Detecting CKD also requires many attributes, which makes it fairly expensive. Therefore, a system following web-based data mining stages was built to make CKD detection easier, so that CKD can be prevented and managed, and the chance of receiving effective therapy is greater when it is detected early. The research uses the Knowledge Data Discovery (KDD) data mining framework. Within this framework, the system uses the Backward Elimination algorithm to reduce the number of attributes, with the aim of reducing the types of examination performed, and the k-Nearest Neighbor algorithm as the classification algorithm for detecting the disease. The best model produced by the system uses Backward Elimination (α = 0.05) and kNN (k = 3), chosen for the reduction in examination cost and the highest sensitivity. The system recommends 10 attributes selected from the 24 initial attributes: specific gravity (sg), albumin (al), blood urea (bu), serum creatinine (sc), sodium (sod), hemoglobin (hemo), red blood cells (rbc), hypertension (htn), diabetes mellitus (dm), and appetite (appet). Using the selected attributes reduced examination cost by up to 73.36%. Disease detection with the k-Nearest Neighbor algorithm then achieved an accuracy of 99.25%, a sensitivity of 99.5%, and a specificity of 98.745%.
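The classification step described above can be illustrated with a minimal k-Nearest Neighbor sketch using k = 3, the value reported in the abstract. Everything else here is an assumption made for illustration: the two-feature records, the labels, and the choice of Euclidean distance are hypothetical and are not taken from the paper, and the Backward Elimination step is not shown.

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    # Majority vote among the k nearest neighbours.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled two-feature records (not the paper's data).
train = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["ckd", "ckd", "ckd", "notckd", "notckd", "notckd"]

print(knn_predict(train, labels, (1.1, 1.0)))
```

In practice the attributes would need to be put on comparable scales before computing distances, since kNN is sensitive to feature magnitude.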

Data Discovery Approaches for Vague Spatial Data

Computational Modeling and Simulation of Intellect ◽ 10.4018/978-1-60960-551-3.ch014 ◽ 2011 ◽ pp. 342-360 ◽ Cited By ~ 1

Author(s): Frederick E. Petry

Keyword(s): Data Mining ◽ Fuzzy Sets ◽ Association Rules ◽ Rough Set ◽ Fuzzy Set ◽ Spatial Data ◽ Spatial Databases ◽ Uncertain Data ◽ Rule Extraction ◽ Data Discovery

This chapter focuses on the application of the discovery of association rules in approaches for vague spatial databases. The background of data mining and uncertainty representations using rough set and fuzzy set techniques is provided. The extensions of association rule extraction for uncertain data, as represented by rough and fuzzy sets, are described. Finally, an example of rule extraction for both types of uncertainty representation is given.

Clustering Wilayah Dan Pelanggaran Berkendaraan Menggunakan Algoritma K-Means Pada Data Satlantas Polres Tasikmalaya Kota

e-Jurnal JUSITI (Jurnal Sistem Informasi dan Teknologi Informasi) ◽ 10.36774/jusiti.v8i1.595 ◽ 2019 ◽ Vol 8-1 ◽ pp. 1-11

Keyword(s): Data Mining ◽ Data Discovery

The large number of road users who do not obey traffic regulations adds every day to the rate of accidents and traffic violations in the city of Tasikmalaya, reflecting the public's poor understanding of order on the road. This study applies data mining with a clustering method to the traffic violation data of Polres Tasikmalaya Kota. The algorithm used is K-Means clustering, a process of grouping a number of data items or objects into clusters or groups so that each cluster contains data that are as similar as possible to one another and different from objects in other clusters. The traffic violation data are processed through Knowledge Data Discovery (KDD) and tested with RapidMiner, producing clusters of traffic violations. The sample is taken from the transformed table of traffic violation data. Six attributes are defined: region, not wearing a helmet, seat belt, violating traffic signs, not carrying a driving license (SIM) and vehicle registration (STNK), and overloading. The result presents clusters for each group of regions and types of traffic violation.
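The grouping process that this abstract attributes to K-Means can be sketched as follows. This is a plain implementation of the standard assign-and-update loop (Lloyd's algorithm), not the RapidMiner workflow used in the study. The `points` values are hypothetical per-region violation counts, and the fixed starting centroids and iteration count are assumptions made so the example is deterministic.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate nearest-centroid assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical (no-helmet count, no-license count) pairs for six regions.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

With this toy data the assignment stabilizes after the first pass, so the returned clusters match the returned centroids: three low-count regions in one group and three high-count regions in the other.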

Data Mining and Machine Learning

10.1017/9781108564175 ◽ 2020 ◽ Cited By ~ 2

Author(s): Mohammed J. Zaki ◽ Wagner Meira, Jr

Keyword(s): Machine Learning ◽ Data Mining

The economics of selection of mail orders

Drs. Zahavi and Levin are the masterminds behind the development of AMOS, a customized predictive modeling system for the Franklin Mint in Philadelphia, and GainSmarts, a general purpose data mining system that is the two-time winner of the KDD-CUP competition for the best data mining tools (1997 and 1998) sponsored by the American Association for Artificial Intelligence.

Journal of Interactive Marketing ◽ 10.1002/dir.1016.abs ◽ 2001 ◽ Vol 15 (3) ◽ pp. 53

Author(s): Nissan Levin ◽ Jacob Zahavi

Keyword(s): Artificial Intelligence ◽ Data Mining ◽ Predictive Modeling ◽ American Association ◽ General Purpose ◽ Mining System ◽ Data Mining System ◽ Mining Tools ◽ Selection Of

Heart Rate Variability, Emotions, and Music

Journal of Psychophysiology ◽ 10.1027/0269-8803/a000021 ◽ 2010 ◽ Vol 24 (2) ◽ pp. 112-119 ◽ Cited By ~ 9

Author(s): F. Riganello ◽ A. Candelieri ◽ M. Quintieri ◽ G. Dolce

Keyword(s): Data Mining ◽ Heart Rate ◽ Heart Rate Variability ◽ Vegetative State ◽ Low Frequency ◽ Emotional Reactions ◽ Heart Beat ◽ Healthy Controls ◽ Frequency Spectra ◽ Emotional Value

The purpose of the study was to identify significant changes in heart rate variability (an emerging descriptor of emotional conditions; HRV) concomitant to complex auditory stimuli with emotional value (music). In healthy controls, traumatic brain injured (TBI) patients, and subjects in the vegetative state (VS) the heart beat was continuously recorded while the subjects were passively listening to each of four music samples of different authorship. The heart rate (parametric and nonparametric) frequency spectra were computed and the spectra descriptors were processed by data-mining procedures. Data-mining sorted the nu_lf (normalized parameter unit of the spectrum low frequency range) as the significant descriptor by which the healthy controls, TBI patients, and VS subjects’ HRV responses to music could be clustered in classes matching those defined by the controls and TBI patients’ subjective reports. These findings promote the potential for HRV to reflect complex emotional stimuli and suggest that residual emotional reactions continue to occur in VS. HRV descriptors and data-mining appear applicable in brain function research in the absence of consciousness.
