Research of Data Mining based on clustering model

Data mining models are the key technical basis for decomposing the control targets of the most stringent water resources management in Shandong province. A K-means clustering model is used to analyze water withdrawal per ten thousand yuan of industrial added value in 2010. Based on the yearly industrial water consumption trend from 1995 to 2010 in the 17 municipal-level cities of Shandong province, an ARIMA (p, d, q) model is established through repeated fitting and optimization, and the regional industrial water demand and water utilization efficiency in 2015 are then forecast. Following the proposed principles and technical route for target decomposition, the 2015 industrial water utilization efficiency targets are defined for the whole province and for each of the 17 municipal-level cities.
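The K-means step mentioned in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the per-city values, the choice of k = 3, the initialisation rule, and the function name `kmeans_1d` are all assumptions made for the example.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd's K-means on a list of floats; returns (centroids, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    # Deterministic initialisation (an assumption): spread the starting
    # centroids evenly across the sorted data.
    ordered = sorted(values)
    if k == 1:
        centroids = [ordered[len(ordered) // 2]]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical water withdrawal per 10,000 yuan of industrial added value (m^3).
withdrawal = [8.0, 9.5, 10.0, 15.0, 16.5, 17.0, 24.0, 25.5, 26.0]
centroids, labels = kmeans_1d(withdrawal, k=3)
print(centroids)
print(labels)
```

Cities that share a label would fall into the same efficiency group; with real data the number of clusters would need to be chosen and justified separately.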

Clustering of Drug Sampling Data to Determine Drug Distribution Patterns with K-Means Method : Study on Central Kalimantan Province, Indonesia

Journal of Information Systems Engineering and Business Intelligence ◽ 10.20473/jisebi.5.2.208-218 ◽ 2019 ◽ Vol 5 (2) ◽ pp. 208

Author(s): Wahyuri Wahyuri ◽ Umi Athiyah ◽ Ira Puspitasari ◽ Yunita Nita

Keyword(s): Data Mining ◽ Drug Distribution ◽ Distribution Patterns ◽ Good Manufacturing Practice ◽ Depth Information ◽ National Agency ◽ Data Mining Technique ◽ Central Kalimantan ◽ Clustering Model ◽ Post Marketing

Background: Drug sampling and testing in the context of post-marketing control is an important component of ensuring drug safety in supply chains. The results are used by the Indonesian National Agency for Drug and Food Control (NA-FDC) to issue public warnings, evaluate the implementation of Good Manufacturing Practice (GMP) and Good Distribution Practice (GDP), and enforce the law against drug violations. Objective: This study aimed to identify and analyze drug distribution patterns to provide an overview of drug sampling in the public sector. Methods: The data were collected from the database of Balai Besar Pengawas Obat dan Makanan (BBPOM) Palangka Raya. The collected data were the drug sampling data from the Integrated Information Reporting Systems (IIRS) application from 2014 to 2018. Next, we employed the CRISP-DM methodology to analyze the data and identify the pattern. A K-means clustering model was selected for data modeling. Results: The dataset contained five attributes, i.e., drug name, therapeutic class, district/city, sample category, and evaluation of drug surveillance. The drug distribution pattern formed three clusters. The first cluster contained 522 drug items in eight therapeutic classes spread over ten districts, the second cluster contained 1542 drug items in five therapeutic classes spread over five districts, and the third cluster contained 503 drug items in eleven therapeutic classes spread across nine districts. Conclusion: The applied data mining technique has improved decisions on drug sampling planning. It also provides in-depth information for improving drug post-marketing control performance in Central Kalimantan Province. Keywords: Clustering, CRISP-DM, Data Mining, Drug distribution patterns, Drug quality control, Drug sampling

Study on microblog public opinion data mining algorithm based on multi-visual clustering model

International Journal of Autonomous and Adaptive Communications Systems ◽ 10.1504/ijaacs.2020.10032156 ◽ 2020 ◽ Vol 13 (2) ◽ pp. 151

Author(s): Jing Liu ◽ Wei zhen Hou ◽ Lin lin Li

Keyword(s): Data Mining ◽ Public Opinion ◽ Data Mining Algorithm ◽ Clustering Model ◽ Mining Algorithm ◽ Visual Clustering

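Desain Model Data Mining pada Model SECI untuk Pemetaan dan Ekstraksi Pengetahuan Kompetensi Lulusan (Data Mining Model Design on the SECI Model for Mapping and Extracting Graduate Competency Knowledge)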

JATISI (Jurnal Teknik Informatika dan Sistem Informasi) ◽ 10.35957/jatisi.v8i3.1349 ◽ 2021 ◽ Vol 8 (3) ◽ pp. 1607-1614

Author(s): Mardiani Mardiani

Keyword(s): Data Mining ◽ Data Transfer ◽ Model Data ◽ Clustering Model

Knowledge management using the SECI Model helps in the transfer of tacit and explicit knowledge. The limited capacity of human resources to transfer knowledge calls for supporting tools in the process. Knowledge extraction can be carried out by implementing data mining. The large volume of data mining output can be used by the education sector for strategic purposes, for example evaluating how graduate profiles are formulated based on an analysis of graduate competencies. A study program's curriculum is designed around the graduate profile, and the study program needs a mapping of requirements from alumni data when designing the curriculum, while alumni need courses that support them after they finish their studies. Knowledge management captures knowledge from graduates, while data mining serves as the tool for processing the data. Knowledge transfer and the processing of graduate competency data make it possible for new knowledge to emerge for the institution, which can be used when designing the next curriculum. The model used is SECI combined with classification and clustering algorithms. The SECI Model, whose supporting technologies have already been mapped to each of its processes, is given a clearer and more specific grouping by implementing data mining in each quadrant of the SECI Model. A SECI model design combined with data mining technology will remedy the shortcomings of the previous model.

Adherence predictor variables in AIDS patients: An empirical study using the data mining-based RFM model

10.21203/rs.3.rs-41910/v1 ◽ 2020

Author(s): Min Li ◽ Qunwei Wang ◽ Yinzhong Shen

Keyword(s): Data Mining ◽ Prediction Model ◽ Clustering Analysis ◽ Predictor Variable ◽ Poor Adherence ◽ Predictor Variables ◽ AIDS Patients ◽ Decision Algorithm ◽ Clustering Model ◽ RFM Model

Background: Highly active antiretroviral therapy (ART) is still the only effective method to stop disease progression in acquired immunodeficiency syndrome (AIDS) patients. However, poor adherence to the therapy makes it ineffective. In this work, we construct an adherence prediction model for AIDS patients using the classical recency, frequency and monetary value (RFM) model from data mining-based customer relationship management to obtain adherence predictor variables. Methods: We cleaned 257305 diagnostic data elements of AIDS outpatients in Shanghai from August 2009 to December 2019 to obtain 16440 elements. We tested the RFM and RFm models (R: recent consultation month, F: consultation frequency, M/m: total/average medical costs per visit), three clustering methods (K-means, Kohonen and two-step clustering) and four decision algorithms (C5.0, the classification and regression tree, Chi-square Automatic Interaction Detector and Quick, Unbiased, Efficient, Statistical Tree) to select the optimal combination. The optimal model and clustering analysis were used to divide the patients into two groups (good and poor adherence), then the optimal decision algorithm was used to construct the adherence prediction model and obtain its predictor variables. Results: The RFm model, K-means clustering analysis and the C5.0 algorithm were optimal. After three rounds of K-means clustering analysis, the optimal RFm clustering model quality was 0.8 and 10614 elements were obtained, including 9803 and 811 from patients with good and poor adherence, respectively, and five types of patients were identified. The prediction model had an accuracy of 100%, with the recent consultation month as an important adherence predictor variable. Conclusions: This work presents a prediction model for medication adherence in AIDS patients at the designated AIDS center in Shanghai, using the RFm model and the K-means and C5.0 algorithms. The model can be expanded to include patients from other centers in China and worldwide.
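To make the RFm variables concrete, here is a minimal sketch of how R, F and m might be derived from raw visit records before clustering. The abstract does not say how the recent consultation month is encoded, so measuring R as whole calendar months since the last visit is an assumption, as are the record layout, the sample data and the function name `rfm_features`.

```python
from collections import defaultdict
from datetime import date


def rfm_features(visits, reference):
    """Return {patient_id: (recency_months, frequency, mean_cost)}.

    visits: iterable of (patient_id, visit_date, cost) tuples (hypothetical layout).
    reference: date from which recency is measured.
    """
    by_patient = defaultdict(list)
    for patient_id, visit_date, cost in visits:
        by_patient[patient_id].append((visit_date, cost))

    features = {}
    for patient_id, records in by_patient.items():
        last_visit = max(d for d, _ in records)
        # R (assumed encoding): calendar months between last visit and reference.
        recency = (reference.year - last_visit.year) * 12 + (reference.month - last_visit.month)
        # F: number of consultations on record.
        frequency = len(records)
        # m: average medical cost per visit.
        mean_cost = sum(c for _, c in records) / frequency
        features[patient_id] = (recency, frequency, mean_cost)
    return features


# Made-up visit records for two patients.
visits = [
    ("p1", date(2019, 10, 5), 300.0),
    ("p1", date(2019, 12, 2), 500.0),
    ("p2", date(2018, 3, 14), 250.0),
]
features = rfm_features(visits, reference=date(2019, 12, 31))
print(features)
```

The resulting three-column vectors are the kind of input a clustering method such as K-means would then group; in practice the columns would usually be rescaled first so that cost does not dominate the distance.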

Data Analysis of College Students’ Mental Health Based on Clustering Analysis Algorithm

Complexity ◽ 10.1155/2021/9996146 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-10

Author(s): Yichen Chu ◽ Xiaojian Yin

Keyword(s): Mental Health ◽ College Students ◽ Data Mining ◽ Clustering Analysis ◽ Management System ◽ Clustering Algorithm ◽ Analysis Algorithm ◽ Advantages And Disadvantages ◽ Clustering Model ◽ Psychological Management

Mental health is an important basic condition for college students to become adults. Educators are gradually attaching more importance to strengthening the mental health education of college students. This paper analyzes college students’ mental health in detail, reviews the development and application of clustering analysis algorithms, applies the distance formulas and clustering criterion functions commonly used in clustering analysis, and describes several classic clustering algorithms. After discussing the advantages and disadvantages of the fast-clustering and hierarchical clustering algorithms, the paper introduces the two-step clustering algorithm, discusses the algorithm flow of the clustering model in detail, and gives the algorithm flow chart. The main work of the paper is to apply clustering to a student mental health database built from mental health assessment tool tests, establish a data mining model, mine the database, analyze the state characteristics of different college students’ mental health, and provide corresponding solutions. To meet the needs of a psychological management system based on clustering analysis, the clustering analysis algorithm is used to cluster the data. Starting from the original database, the paper establishes methods for selecting, cleaning, and transforming the data in students’ psychological archives. Finally, it describes the application of data mining in the students’ psychological management system, summarizes the implementation of the system, and discusses its prospects.

Clustering of the Multi-Value Documents based on Probabilistic Features Association Mechanism

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.a4538.119119 ◽ 2019 ◽ Vol 9 (1) ◽ pp. 1576-1581

Keyword(s): Data Mining ◽ Feature Selection ◽ Data Analysis ◽ Mutual Information ◽ Association Score ◽ Normalized Mutual Information ◽ Clustering Model ◽ Multiple Data ◽ Class Information ◽ Degree Of Similarity

Clustering multi-valued data is becoming increasingly difficult in data mining because individual features can take multiple interval values. Identifying a clustering model that is appropriate for such disguised multi-valued data in data analysis applications is an open problem. To address it, this paper proposes feature selection based on a probabilistic features association mechanism (PFAM). The problem stems mainly from the difficulty of identifying the class information and the multiple values of each individual feature. This work explores unsupervised feature selection by computing a probabilistic association score and reforming the multi-value data for effective clustering in multivariate datasets. By minimizing a reformation clustering error, the approach preserves both the degree of similarity and the categorization information of the actual data contents. The proposed approach is evaluated using clustering purity and Normalized Mutual Information on multivariate document datasets. The experimental evaluation shows the improvement achieved by the proposed approach.
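The two evaluation measures named in this abstract, clustering purity and Normalized Mutual Information (NMI), can be computed as in the sketch below. It is not taken from the paper: NMI has several normalisations, and dividing by the geometric mean of the two entropies is an assumption here, as are the toy labels and the function names.

```python
from collections import Counter
from math import log, sqrt


def purity(true_labels, cluster_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    pair_counts = Counter(zip(cluster_labels, true_labels))
    best = {}
    for (cluster, _), n in pair_counts.items():
        best[cluster] = max(best.get(cluster, 0), n)
    return sum(best.values()) / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """Mutual information divided by the geometric mean of the two entropies."""
    n = len(true_labels)
    h_true, h_clu = entropy(true_labels), entropy(cluster_labels)
    if h_true == 0 or h_clu == 0:
        # Degenerate case: at least one labelling has a single group.
        return 1.0 if h_true == h_clu else 0.0
    joint = Counter(zip(true_labels, cluster_labels))
    n_true = Counter(true_labels)
    n_clu = Counter(cluster_labels)
    mi = sum(
        (nij / n) * log(nij * n / (n_true[t] * n_clu[c]))
        for (t, c), nij in joint.items()
    )
    return mi / sqrt(h_true * h_clu)


# Toy example: six documents, two true classes, one document misplaced.
truth = ["a", "a", "a", "b", "b", "b"]
found = [0, 0, 1, 1, 1, 1]
print(purity(truth, found))
print(nmi(truth, found))
```

Both scores lie between 0 and 1, and a clustering that matches the true classes exactly scores 1 on each. Purity alone rewards producing many small clusters, which is one reason it is often reported alongside NMI.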

Research on Data Mining Algorithm Based on Micro-blog of Multi-view Clustering Model

2018 5th International Conference on Electrical & Electronics Engineering and Computer Science (ICEEECS 2018) ◽ 10.25236/iceeecs.2018.047 ◽ 2018

Keyword(s): Data Mining ◽ Data Mining Algorithm ◽ Clustering Model ◽ Mining Algorithm
