A Meta-Mining Ontology Framework for Data Processing

Extracting knowledge from data streams received from observed objects through data mining is required in many domains. However, there is little guidance on which techniques can or should be used in which contexts. Meta-mining can help build data processing workflows from knowledge models that take the specific features of the objects into account. This paper proposes a meta-mining ontology framework for selecting algorithms that solve specific data mining tasks and for building suitable processes. The proposed ontology is constructed from existing ontologies and extended with an ontology of data characteristics and task requirements. Unlike existing ontologies, it describes the overall data mining process, is used to build data processing workflows in various domains, and has lower computational complexity. The authors developed an ontology merging method and a sub-ontology extraction method, both implemented on the OWL API by extracting and integrating the relevant axioms.
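
The idea of selecting algorithms from data characteristics and task requirements can be illustrated with a minimal sketch. This is not the authors' ontology or the OWL API: the `AlgorithmProfile` class, the `KNOWLEDGE_BASE` entries, and the `select_algorithms` function are hypothetical stand-ins for a knowledge model, assumed here only to show the matching step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmProfile:
    """Hypothetical knowledge-base entry describing one mining algorithm."""
    name: str
    task: str               # e.g. "classification" or "clustering"
    handles_missing: bool   # tolerates missing values in the input
    handles_streams: bool   # can be updated incrementally on a data stream


# Illustrative entries only; a real knowledge base would come from the ontology.
KNOWLEDGE_BASE = [
    AlgorithmProfile("decision_tree", "classification", True, False),
    AlgorithmProfile("hoeffding_tree", "classification", False, True),
    AlgorithmProfile("k_means", "clustering", False, False),
]


def select_algorithms(task, has_missing, is_stream):
    """Return the names of algorithms that fit the task and the data traits."""
    selected = []
    for algo in KNOWLEDGE_BASE:
        if algo.task != task:
            continue  # wrong kind of mining task
        if has_missing and not algo.handles_missing:
            continue  # data has gaps the algorithm cannot tolerate
        if is_stream and not algo.handles_streams:
            continue  # data arrives as a stream but the algorithm is batch-only
        selected.append(algo.name)
    return selected


candidates = select_algorithms("classification", has_missing=False, is_stream=True)
print(candidates)
```

In an ontology-based system the same matching would presumably be expressed as a query or reasoning step over axioms rather than as a loop over records.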

Centering Resonance Analysis: A Superior Data Mining Algorithm for Textual Data Streams

10.21236/ada422048 ◽ 2004 ◽ Cited By ~ 6

Author(s): Kevin Dooley ◽ Steven Corman ◽ Dan Ballard

Keyword(s): Data Mining ◽ Data Streams ◽ Data Mining Algorithm ◽ Resonance Analysis ◽ Mining Algorithm ◽ Textual Data

Facing the needs for clean bicycle data – a bicycle-specific approach of GPS data processing

European Transport Research Review ◽ 10.1186/s12544-020-00462-2 ◽ 2021 ◽ Vol 13 (1)

Author(s): Sven Lißner ◽ Stefan Huber

Keyword(s): Data Processing ◽ Gps Data ◽ Data Set ◽ Specific Data ◽ Driving Mode ◽ Mode Detection ◽ Current State ◽ Mode Recognition ◽ Recorded Data ◽ Gps Tracks

Background: GPS-based cycling data are increasingly available for traffic planning. However, the recorded data often contain more than bicycle trips. Examples include GPS tracks recorded while using modes of transport other than the bicycle, or long periods at work locations while tracking is still running. Collected bicycle GPS data therefore need to be processed adequately before they are used for transportation planning. Results: The article presents a multi-level approach to bicycle-specific data processing. The data processing model comprises several steps (data filtering, smoothing, trip segmentation, transport mode recognition, driving mode detection) to obtain a correct data set that contains only bicycle trips. Validation shows sound accuracy for the model in its current state (82–88%).
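
Two of the listed steps, data filtering and trip segmentation, can be sketched as follows. This is a simplified illustration and not the model from the article: the speed threshold `max_speed_mps`, the gap threshold `max_gap_s`, the `(timestamp_s, lat, lon)` point layout, and the sample track are all assumptions made for the example.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def filter_by_speed(points, max_speed_mps=15.0):
    """Drop points whose implied speed from the last kept point is implausible."""
    kept = []
    for t1, lat1, lon1 in points:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t1 - t0
            if dt <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_m(lat0, lon0, lat1, lon1) / dt > max_speed_mps:
                continue  # jump too fast for a bicycle, treat as an outlier
        kept.append((t1, lat1, lon1))
    return kept


def segment_trips(points, max_gap_s=300):
    """Start a new trip whenever the time gap between points exceeds max_gap_s."""
    trips = []
    for p in points:
        if trips and p[0] - trips[-1][-1][0] <= max_gap_s:
            trips[-1].append(p)
        else:
            trips.append([p])
    return trips


# Hypothetical track: (timestamp in seconds, latitude, longitude).
track = [
    (0, 51.0500, 13.7400),
    (10, 51.0505, 13.7400),
    (20, 51.0600, 13.7400),   # roughly 1 km in 10 s, an outlier
    (30, 51.0510, 13.7400),
    (1000, 51.0700, 13.7400),  # long pause before this point
    (1010, 51.0705, 13.7400),
]
filtered = filter_by_speed(track)
trips = segment_trips(filtered)
print(len(filtered), [len(t) for t in trips])
```

Smoothing, transport mode recognition, and driving mode detection are left out; they would follow as further stages on the segmented trips.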

Bayes Performance of Batch Data Mining Based on Functional Dependencies

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001419590110 ◽ 2019 ◽ Vol 33 (03) ◽ pp. 1959011

Author(s): Haixu Xi ◽ Feiyue Ye ◽ Sheng He ◽ Yijun Liu ◽ Hongfen Jiang

Keyword(s): Data Mining ◽ Data Processing ◽ Video Processing ◽ Mining Area ◽ Batch Processing ◽ Video Data ◽ Batch Processes ◽ Functional Dependencies ◽ Workflow System ◽ Traffic Video

Batch processes are common in traffic video data processing, for example in traffic video image processing and intelligent transportation. Batch processing can increase resource efficiency. However, because research on the conditions of traffic video data processing is limited, batch processing activities in this area remain little examined. In this study, we developed a workflow system that employs database functional dependency mining. Bayesian networks are a focus area of data mining and give users an intuitive means of expressing causality. Graph theory is also used in data mining. The proposed approach relies on relational database functional dependencies to remove redundant attributes, reduce interference, and select a property order. The restoration of selective hidden naive Bayes (SHNB) affects this property order when it is used only once. To account for the influence of hidden naive Bayes (HNB), HNB is introduced twice rather than as a single pair. We additionally designed and implemented the mining of dependencies from a batch traffic video processing log for data execution algorithms.
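
As a rough illustration of functional dependency mining, the sketch below checks whether a dependency X -> Y holds in a small relation and lists the single-attribute dependencies it finds. It is not the workflow system from the paper: the `holds` and `single_attribute_dependencies` functions and the sample log rows (`camera`, `road`, `task`) are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    The dependency holds when equal lhs values always imply equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same determinant, different dependent value
    return True


def single_attribute_dependencies(rows, attributes):
    """List (source, target) pairs where source -> target holds.

    A target that is determined by another attribute is a candidate
    redundant attribute.
    """
    found = []
    for target in attributes:
        for source in attributes:
            if source != target and holds(rows, [source], [target]):
                found.append((source, target))
    return found


# Hypothetical processing-log rows.
log_rows = [
    {"camera": "C1", "road": "R1", "task": "detect"},
    {"camera": "C1", "road": "R1", "task": "track"},
    {"camera": "C2", "road": "R1", "task": "detect"},
    {"camera": "C3", "road": "R2", "task": "detect"},
]
deps = single_attribute_dependencies(log_rows, ["camera", "road", "task"])
print(deps)
```

Here `road` is determined by `camera`, so it carries no extra information once `camera` is known. Practical miners also search multi-attribute determinants, which this sketch omits.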

Parallel Data Mining and Applications in Hospital Big Data Processing

Big Data Management and Processing ◽ 10.1201/9781315154008-20 ◽ 2017 ◽ pp. 403-424

Author(s): Jianguo Chen ◽ Zhuo Tang ◽ Kenli Li ◽ Keqin Li

Keyword(s): Data Mining ◽ Big Data ◽ Data Processing ◽ Big Data Processing ◽ Parallel Data ◽ Parallel Data Mining

Novel IT Technologies on the Digital Battlefield: The Application of Big Data and Data Mining Technologies

Hadmérnök ◽ 10.32567/hm.2020.4.10 ◽ 2020 ◽ Vol 15 (4) ◽ pp. 141-158

Author(s): Eszter Katalin Bognár

Keyword(s): Data Mining ◽ Big Data ◽ Data Processing ◽ Relevant Information ◽ Military Operations ◽ Textual Data ◽ Modern Warfare ◽ Tools And Techniques

In modern warfare, the most important innovation to date has been the utilisation of information as a weapon. The basis of successful military operations is the ability to correctly assess a situation based on credible collected information. In today’s military, the primary challenge is no longer the collection of data but the extraction of relevant information from it. This requirement cannot be met without improved tools and techniques to support the acquisition and analysis of data. This study defines Big Data and its application to military reconnaissance, focusing on the processing of imagery and textual data, and highlights modern data processing and analytics methods that enable effective processing.

Application of Digital Mining Facing Information Fusion Technology in the Field of National Costume Culture Design

Mobile Information Systems ◽ 10.1155/2021/3790413 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Min Yu ◽ Rongrong Cui

Keyword(s): Data Mining ◽ Internet Of Things ◽ Data Processing ◽ Clustering Algorithm ◽ Laplacian Matrix ◽ Design System ◽ Design Effect ◽ Constraint Matrix ◽ Clothing Design ◽ Culture Design

To improve the design of minority clothing, this paper uses data mining and Internet of Things technologies to construct an intelligent ethnic clothing design system that meets customer needs and is based on the idea of human-computer interaction. For data processing, the paper uses a constrained spectral clustering algorithm that takes the Laplacian matrix and the constraint matrix as input and outputs a clustering indicator vector. The performance of the system is then verified through experiments. The experimental results indicate that the proposed system, based on the Internet of Things and data mining, can improve the effect of minority clothing design.

Clustering Fasilitas Kesehatan Berdasarkan Kecamatan Di Karawang Dengan Algoritma K-Means [Clustering Health Facilities by Sub-district in Karawang Using the K-Means Algorithm]

BINA INSANI ICT JOURNAL ◽ 10.51211/biict.v8i1.1488 ◽ 2021 ◽ Vol 8 (1) ◽ pp. 83

Author(s): Bagus Muhammad Islami ◽ Cepy Sukmayadi ◽ Tesa Nur Padilah

Keyword(s): Data Mining ◽ Developing Countries ◽ Data Processing ◽ Health Problems ◽ Health Facilities ◽ Microsoft Excel ◽ Two Factors ◽ Clustering Data ◽ Index Value ◽ Cluster 2

Abstract: Health problems in society, especially in developing countries such as Indonesia, are influenced by two factors: physical and non-physical aspects. Data obtained from karawangkab.bps.go.id are divided into 3 clusters: few, medium, and most health facilities. The algorithm used is K-Means clustering, implemented with Microsoft Excel and RapidMiner Studio. Processing the health facility data for Karawang produces 3 clusters: cluster 1, with few health facilities, contains 23 sub-districts; cluster 2, with a medium number of health facilities, contains 5 sub-districts; and cluster 3, with the most health facilities, contains 2 sub-districts. The K-Means algorithm achieves a Davies-Bouldin Index value of 0.109. Keywords: clustering, data mining, health facilities, K-Means.
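
The Davies-Bouldin index reported in the abstract measures how compact and well separated the clusters are; lower values indicate better separation. A minimal sketch of the standard definition follows, assuming Euclidean distance and distinct cluster centroids. The sample clusters are made-up facility counts, not the Karawang data, so the result is unrelated to the reported 0.109.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of tuples).

    For every cluster, take the worst ratio of summed within-cluster scatter
    to between-centroid distance, then average those worst ratios.
    Assumes at least two clusters with distinct centroids.
    """
    centroids = [centroid(c) for c in clusters]
    scatters = [
        sum(math.dist(p, m) for p in c) / len(c)
        for c, m in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical one-dimensional data: number of facilities per sub-district.
example_clusters = [
    [(1,), (3,)],     # few
    [(10,), (12,)],   # medium
    [(30,), (34,)],   # most
]
dbi = davies_bouldin(example_clusters)
print(round(dbi, 4))
```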

FACTORS AFFECTING THE CUSTOMS CLEARANCE TIME AT PRIME CUSTOMS OFFICE TYPE A OF TANJUNG PRIOK

Customs Research and Applications Journal ◽ 10.31092/craj.v2i2.56 ◽ 2020 ◽ Vol 2 (2) ◽ pp. 01-17

Author(s): Khamami Herusantoso ◽ Ardyanto Dwi Saputra

Keyword(s): Data Mining ◽ Type A ◽ Database Analysis ◽ Data Set ◽ Complex Phase ◽ Specific Data ◽ Factors Affecting ◽ Efficiency And Effectiveness ◽ Hidden Knowledge ◽ Clearance Process

Within dwell time, customs clearance is considered the most complex phase, even though it accounts for the shortest portion compared with other phases such as pre-clearance and post-clearance. To improve the efficiency and effectiveness of the services performed in the customs clearance process, the customs authorities must start considering database analysis to identify obstacles instead of depending on personal analysis. Useful information is hidden in the importation data set and can be extracted with data mining techniques. This study explores the customs clearance process for import cargo whose documents are declared through the red channel at Prime Customs Office Type A of Tanjung Priok (PCO Tanjung Priok), and applies a decision tree classifier with the J48 algorithm to evaluate the process. Eleven classification models are developed using unpruned, online pruning, and post-pruning features. The best model is chosen to extract the hidden knowledge that describes the factors affecting the customs clearance process, allowing the customs authorities to improve their future services.
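
J48 is Weka's implementation of the C4.5 decision tree learner, which chooses split attributes by gain ratio. The sketch below computes that criterion for categorical attributes only; it does not reproduce the study's models or pruning. The attribute names (`inspection`, `docs`, `slow`) and the sample rows are hypothetical, since the abstract does not list the attributes used.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting rows on attribute with respect to target.

    gain ratio = information gain / split information, where split
    information is the entropy of the attribute's own value distribution.
    """
    labels = [r[target] for r in rows]
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([r[attribute] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical clearance records; "slow" marks a long clearance time.
records = [
    {"inspection": "physical", "docs": "complete", "slow": "yes"},
    {"inspection": "physical", "docs": "incomplete", "slow": "yes"},
    {"inspection": "document", "docs": "complete", "slow": "no"},
    {"inspection": "document", "docs": "incomplete", "slow": "no"},
]
for name in ("inspection", "docs"):
    print(name, gain_ratio(records, name, "slow"))
```

In this toy data `inspection` separates the classes perfectly and `docs` not at all, so a C4.5-style learner would split on `inspection` first.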

Goods Stock Management using the K-Means Algorithm Method

Jurnal Teknologi ◽ 10.35134/jitekin.v9i2.15 ◽ 2020 ◽ Vol 10 (1) ◽ pp. 22-45

Author(s): Dhio Saputra

Keyword(s): Data Mining ◽ Data Processing ◽ Test Results ◽ Stock Management ◽ Clustering Method ◽ Sales Data ◽ Using Data ◽ Cluster 2

At PT. Bougenville Anugrah, Mazaya products are still grouped manually when calculating purchases, sales, and product inventories, which takes time and data. Research is therefore needed to optimize the inventory of Mazaya goods by computerization. The method used in this research is K-Means clustering on sales data of Mazaya products. The data processed are the purchases, sales, and remaining inventory of Mazaya products from March to July 2019, totaling 40 data items. The data are grouped into 3 clusters: cluster 0 for non-selling items, cluster 1 for best-selling items, and cluster 2 for very best-selling items. The test results are cluster 0 with 13 items, cluster 1 with 25 items, and cluster 2 with 2 items. Inventory can thus be optimized by increasing the stock of goods in cluster 2, which saves costs on managing Mazaya products that are not available. The K-Means clustering method can be used in data mining to group data according to criteria.
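
A minimal sketch of K-Means (Lloyd's algorithm) with three clusters is shown here for orientation. It is not the study's implementation: the `(units sold, remaining stock)` records and the explicit starting centroids are invented for the example, and initial centroids are passed in by hand to keep the run deterministic.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Cluster points (tuples) around the given initial centroids.

    Repeats two steps until the centroids stop moving: assign each point
    to its nearest centroid, then move each centroid to the mean of its
    assigned points. Returns (centroids, clusters).
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical records: (units sold, remaining stock).
sales = [(2, 40), (3, 38), (4, 41), (20, 15), (22, 14), (25, 12), (60, 2)]
start = [(2, 40), (20, 15), (60, 2)]
final_centroids, groups = k_means(sales, start)
print([len(g) for g in groups])
```

With these inputs the three groups correspond loosely to slow-selling, well-selling, and very well-selling items; real data would usually be normalized first and run from several initializations.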

Compressed domain-specific data processing and analysis

2017 IEEE International Conference on Big Data (Big Data) ◽ 10.1109/bigdata.2017.8257941 ◽ 2017

Author(s): Dapeng Dong ◽ John Herbert

Keyword(s): Data Processing ◽ Compressed Domain ◽ Specific Data ◽ Domain Specific
