A Meta-Mining Ontology Framework for Data Processing

Author(s):  
Man Tianxing ◽  
Nataly Zhukova ◽  
Alexander Vodyaho ◽  
Tin Tun Aung

Many domains need to extract knowledge from data streams received from observed objects through data mining. However, there is little guidance on which techniques can or should be used in which contexts. Meta mining technology can help build data processing workflows based on knowledge models that take into account the specific features of the objects. This paper proposes a meta mining ontology framework that allows selecting algorithms for specific data mining tasks and building suitable processes. The proposed ontology is constructed from existing ontologies and is extended with an ontology of data characteristics and task requirements. Unlike the existing ontologies, the proposed ontology describes the overall data mining process, can be used to build data processing workflows in various domains, and has lower computational complexity. The authors developed an ontology merging method and a sub-ontology extraction method, both implemented with the OWL API by extracting and integrating the relevant axioms.

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Sven Lißner ◽  
Stefan Huber

Abstract

Background: GPS-based cycling data are increasingly available for traffic planning. However, the recorded data often contain more than bicycle trips. Examples include GPS tracks recorded while using other modes of transport, or long stationary periods at work locations while tracking is still running. Collected bicycle GPS data therefore need to be processed adequately before they are used for transportation planning.

Results: The article presents a multi-level approach to bicycle-specific data processing. The data processing model consists of several steps (data filtering, smoothing, trip segmentation, transport mode recognition, driving mode detection) and yields a data set that contains bicycle trips only. The validation shows sound accuracy of the model in its current state (82–88%).
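The trip segmentation and transport mode recognition steps named above can be illustrated with a minimal sketch. It is not the authors' model: the `Fix` record, the 300-second gap threshold, and the 5 to 35 km/h speed band are all assumptions chosen only for illustration, and the sample coordinates are synthetic.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Fix:
    """One GPS fix (hypothetical record layout)."""
    t: float    # seconds since start of recording
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes in metres."""
    earth_radius_m = 6371000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def segment_trips(fixes, max_gap_s=300.0):
    """Start a new trip whenever the time gap between fixes exceeds max_gap_s (assumed threshold)."""
    trips, current = [], []
    for fix in fixes:
        if current and fix.t - current[-1].t > max_gap_s:
            trips.append(current)
            current = []
        current.append(fix)
    if current:
        trips.append(current)
    return trips


def mean_speed_kmh(trip) -> float:
    """Average speed over the whole trip in km/h."""
    if len(trip) < 2:
        return 0.0
    dist_m = sum(haversine_m(a, b) for a, b in zip(trip, trip[1:]))
    duration_s = trip[-1].t - trip[0].t
    return 3.6 * dist_m / duration_s if duration_s > 0 else 0.0


def looks_like_cycling(trip, lo_kmh=5.0, hi_kmh=35.0) -> bool:
    """Crude mode check: keep trips whose mean speed lies in an assumed cycling band."""
    return lo_kmh <= mean_speed_kmh(trip) <= hi_kmh


# Synthetic data: a slow trip, a long pause, then a much faster trip.
slow = [Fix(20.0 * i, 51.0 + 0.001 * i, 13.7) for i in range(5)]
fast = [Fix(2000.0 + 20.0 * i, 51.1 + 0.005 * i, 13.7) for i in range(5)]
trips = segment_trips(slow + fast)
bike_trips = [trip for trip in trips if looks_like_cycling(trip)]
```

A real pipeline would filter and smooth the fixes before segmentation and would use richer features than mean speed for mode recognition; the sketch only shows how the steps chain together.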


Author(s):  
Haixu Xi ◽  
Feiyue Ye ◽  
Sheng He ◽  
Yijun Liu ◽  
Hongfen Jiang

Batch processes are common in traffic video data processing, for example in traffic video image processing and intelligent transportation. Applying batch processing can improve efficiency and conserve resources. However, owing to limited research on traffic video data processing conditions, batch processing activities in this area remain minimally examined. In this study, we developed a workflow system by employing database functional dependency mining. The Bayesian network, a focus area of data mining, provides an intuitive means for users to express causality, and graph theory is also used in the data mining area. The proposed approach relies on relational database functional dependencies to remove redundant attributes, reduce interference, and select a property order. The restoration of selective hidden naive Bayesian (SHNB) affects this property order when it is used only once. To account for the influence of hidden naive Bayes (HNB), it is introduced twice rather than as a single pair. We additionally designed and implemented algorithms for mining dependencies from a batch traffic video processing log.
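As an illustration of what functional dependency mining over a processing log involves, the following minimal sketch checks which single-attribute dependencies hold in a small table. It is not the authors' algorithm: the attribute names (`task`, `tool`, `camera`) and the log rows are hypothetical, and only dependencies with a single attribute on the left-hand side are searched.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    The dependency holds when every combination of lhs values maps to
    exactly one rhs value.
    """
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = row[rhs]
        if seen.setdefault(key, value) != value:
            return False
    return True


def single_attribute_fds(rows):
    """List all dependencies a -> b with one attribute on each side."""
    attrs = list(rows[0])
    return [(a, b) for a, b in permutations(attrs, 2) if holds(rows, (a,), b)]


# Hypothetical batch-processing log entries.
log = [
    {"task": "decode", "tool": "decoder_a", "camera": "c1"},
    {"task": "decode", "tool": "decoder_a", "camera": "c2"},
    {"task": "detect", "tool": "detector_b", "camera": "c1"},
]
fds = single_attribute_fds(log)
```

Here `task` and `tool` determine each other, so one of the two could be treated as redundant, while `camera` determines neither.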


Hadmérnök ◽  
2020 ◽  
Vol 15 (4) ◽  
pp. 141-158
Author(s):  
Eszter Katalin Bognár

In modern warfare, the most important innovation to date has been the utilisation of information as a weapon. The basis of successful military operations is the ability to correctly assess a situation based on credible collected information. In today’s military, the primary challenge is not the actual collection of data. It has become more important to extract relevant information from that data. This requirement cannot be successfully completed without necessary improvements in tools and techniques to support the acquisition and analysis of data. This study defines Big Data and its concept as applied to military reconnaissance, focusing on the processing of imagery and textual data, bringing to light modern data processing and analytics methods that enable effective processing.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Min Yu ◽  
Rongrong Cui

To improve the design of minority clothing and meet its design requirements, this paper uses data mining and Internet of Things technologies to construct an intelligent ethnic clothing design system that meets customer needs and is based on the idea of human-computer interaction. For data processing, the paper uses a constrained spectral clustering algorithm that takes the Laplacian matrix and the constraint matrix as input and outputs a clustering indicator vector, which improves the data processing for minority clothing design. Finally, the performance of the system is verified through experiments. The experimental results indicate that the proposed system, based on the Internet of Things and data mining, is effective and can improve minority clothing design.


2021 ◽  
Vol 8 (1) ◽  
pp. 83
Author(s):  
Bagus Muhammad Islami ◽  
Cepy Sukmayadi ◽  
Tesa Nur Padilah

Abstract: Health problems in society, especially in developing countries such as Indonesia, are influenced by two factors, namely physical and non-physical aspects. Based on data obtained from karawangkab.bps.go.id, the data are divided into 3 clusters: few, medium, and most. The algorithm used is K-Means clustering, implemented using Microsoft Excel and Rapidminer Studio. Processing the health facility data for Karawang produces 3 clusters: cluster 1, with few health facilities, contains 23 sub-districts; cluster 2, with a medium number of health facilities, contains 5 sub-districts; and cluster 3, with the most health facilities, contains 2 sub-districts. The K-Means algorithm achieves a Davies-Bouldin Index value of 0.109.

Keywords: clustering, data mining, health facilities, K-Means.
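The combination of K-Means and the Davies-Bouldin Index reported above can be sketched in a few lines. This is a minimal one-dimensional illustration in plain Python, not the Excel or Rapidminer workflow of the study: the facility counts and the initial centroids are invented for the example, and the index computation assumes that no cluster is empty and that centroids are distinct.

```python
def assign(values, centroids):
    """Put each value into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        clusters[nearest].append(v)
    return clusters


def kmeans_1d(values, initial_centroids, max_iter=100):
    """Lloyd's algorithm on 1-D data with caller-supplied starting centroids."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        clusters = assign(values, centroids)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(values, centroids)


def davies_bouldin(centroids, clusters):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance.

    Lower values indicate more compact, better separated clusters.
    """
    scatter = [sum(abs(v - c) for v in pts) / len(pts) for c, pts in zip(centroids, clusters)]
    k = len(centroids)
    worst = [
        max((scatter[i] + scatter[j]) / abs(centroids[i] - centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


# Invented facility counts per sub-district and assumed starting centroids.
facility_counts = [1, 2, 3, 10, 11, 12, 30, 31]
centroids, clusters = kmeans_1d(facility_counts, [1, 10, 30])
db_index = davies_bouldin(centroids, clusters)
```

The study clusters real sub-district data, so the value obtained here is not comparable with the reported 0.109.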


2020 ◽  
Vol 2 (2) ◽  
pp. 01-17
Author(s):  
Khamami Herusantoso ◽  
Ardyanto Dwi Saputra

Within dwell time, customs clearance is considered the most complex phase, even though it accounts for the shortest portion compared with the other phases, pre-clearance and post-clearance. To improve the efficiency and effectiveness of the services performed in the customs clearance process, customs authorities should start using database analysis to identify obstacles instead of depending on personal analysis. Useful information is hidden in the importation data set and can be extracted through data mining techniques. This study explores the customs clearance process for import cargo whose documents are declared through the red channel at the Prime Customs Office Type A of Tanjung Priok (PCO Tanjung Priok), and applies a data mining classifier, the decision tree with the J48 algorithm, to evaluate the process. Eleven classification models are developed using unpruned, online pruning, and post-pruning features. The best model is chosen to extract the hidden knowledge that describes the factors affecting the customs clearance process, allowing the customs authorities to improve their future services.
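J48 is an implementation of the C4.5 decision tree algorithm, which chooses each split by gain ratio. The following minimal sketch shows that split criterion only, without tree construction or pruning. The attribute names (`inspection`, `importer_risk`, `clearance`) and the four records are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of categorical labels, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Information gain of splitting on attr, normalised by the split information."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    # Weighted entropy of the target after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    gain = entropy([row[target] for row in rows]) - remainder
    return gain / split_info


def best_split(rows, attrs, target):
    """Pick the attribute with the highest gain ratio."""
    return max(attrs, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical clearance records.
records = [
    {"inspection": "physical", "importer_risk": "high", "clearance": "slow"},
    {"inspection": "physical", "importer_risk": "low", "clearance": "slow"},
    {"inspection": "document", "importer_risk": "high", "clearance": "fast"},
    {"inspection": "document", "importer_risk": "low", "clearance": "fast"},
]
root_attribute = best_split(records, ["inspection", "importer_risk"], "clearance")
```

In this toy data the inspection type separates slow from fast clearances perfectly, so it would be chosen as the root of the tree.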


2020 ◽  
Vol 10 (1) ◽  
pp. 22-45
Author(s):  
Dhio Saputra

The grouping of Mazaya products at PT. Bougenville Anugrah is still done manually when calculating purchases, sales, and product inventories, which takes time and data. For this reason, research is needed to optimize the inventory of Mazaya goods through computerization. The method used in this research is K-Means clustering on sales data of Mazaya products. The data processed are the purchases, sales, and remaining inventory of Mazaya products from March to July 2019, totaling 40 items. The data are grouped into 3 clusters: cluster 0 for non-selling products, cluster 1 for best-selling products, and cluster 2 for very best-selling products. The test results are cluster 0 with 13 items, cluster 1 with 25 items, and cluster 2 with 2 items. Inventory can therefore be optimized by increasing the stock of goods in cluster 2, which saves management costs for Mazaya products that are not available. The K-Means clustering method can be used for data processing with data mining to group data according to criteria.

