Effective Data Mining by Integrating Genetic Algorithm into the Data Preprocessing Phase

This paper proposes a procedure for processing biomechanical data. It consists of selecting suitable noise-free data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and then classifying it with a decision tree. Classification into groups (normal gait and two selected gait pathologies: Spina Bifida and Cerebral Palsy) gave very good results.

Application of Data Pre-Processing Method in Web Mining

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.687-691.1592 ◽ 2014 ◽ Vol 687-691 ◽ pp. 1592-1595

Author(s): Yun Peng Duan ◽ Chun Xi Zhao ◽ Ying Shi

Keyword(s): Data Mining ◽ Web Mining ◽ Early Stage ◽ Data Preprocessing ◽ Web Technology ◽ Processing Method ◽ Web Log Mining ◽ Web Log ◽ User Access ◽ Log Mining

With the wide application of the WWW and the emergence of Web technology, data mining research has entered a new stage. Web log mining applies the ideas of data mining to the analysis of server logs. This paper proposes a log data preprocessing method for the early stage of data mining. Its purpose is to divide server logs into multiple unique user access sequences, and the paper gives an algorithm for doing so.
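
One common way to divide a server log into per-user access sequences is to group requests by a user key and start a new sequence after a period of inactivity. The sketch below illustrates that general idea only; it is not the algorithm from the paper. The (IP address, user agent) user key, the 30-minute timeout, and the names `LogEntry` and `split_sessions` are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """One parsed server-log request (hypothetical minimal schema)."""
    ip: str
    user_agent: str
    timestamp: datetime
    url: str


def split_sessions(entries, timeout=timedelta(minutes=30)):
    """Group entries by (ip, user_agent), then split each group
    wherever two consecutive requests are more than `timeout` apart."""
    by_user = defaultdict(list)
    for entry in entries:
        # Assumption: IP address plus user agent identifies one user.
        by_user[(entry.ip, entry.user_agent)].append(entry)

    sessions = []
    for user_entries in by_user.values():
        user_entries.sort(key=lambda e: e.timestamp)
        current = [user_entries[0]]
        for prev, cur in zip(user_entries, user_entries[1:]):
            if cur.timestamp - prev.timestamp > timeout:
                # Inactivity gap exceeded: close the current sequence.
                sessions.append(current)
                current = []
            current.append(cur)
        sessions.append(current)
    return sessions
```

Each returned list is one access sequence ordered by time. A real preprocessing step would typically also clean the log (for example, dropping requests for images or from crawlers) before this grouping.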

Data Mining and Hypothesis Refinement using a Multi-Tiered Genetic Algorithm

Journal of Intelligent Systems ◽ 10.1515/jisys.2010.19.3.191 ◽ 2010 ◽ Vol 19 (3) ◽ Cited By ~ 1

Author(s): CM. Taylor ◽ A. Agah

Keyword(s): Data Mining ◽ Genetic Algorithm

Preprocessing Profiling Model for Visual Analytics

10.5753/sibgrapi.est.2020.12991 ◽ 2020

Author(s): Alessandra Maciel Paz Milani ◽ Fernando V. Paulovich ◽ Isabel Harb Manssour

Keyword(s): Data Mining ◽ Data Analysis ◽ Visual Analytics ◽ Data Preprocessing ◽ Interview Study ◽ Raw Data ◽ Important Stage ◽ Analysis Process

Analyzing and managing raw data remain challenging parts of the data analysis process, especially during data preprocessing. Although some studies propose design implications or recommendations for visualization solutions in the data analysis scope, they do not focus on the challenges of the preprocessing phase. Likewise, current Visual Analytics processes do not treat preprocessing as an equally important stage. With this study, we therefore aim to contribute to the discussion of how methods of visualization and data mining can be used and combined to assist data analysts during preprocessing activities. To that end, we introduce the Preprocessing Profiling Model for Visual Analytics, which comprises a set of features intended to inspire the implementation of new solutions. These features were designed from a list of insights we obtained during an interview study with thirteen data analysts. Our contributions can be summarized as offering resources to promote a shift to visual preprocessing.

Data Transformation Techniques for Academic Datasets

International Journal of Engineering and Advanced Technology - Regular Issue ◽ 10.35940/ijeat.a9711.109119 ◽ 2019 ◽ Vol 9 (1) ◽ pp. 2214-2218

Keyword(s): Data Mining ◽ Life Cycle ◽ Data Preprocessing ◽ Data Normalization ◽ Multi Layer Perceptron ◽ Data Preparation ◽ Transformation Techniques ◽ Development Life Cycle ◽ Heterogeneous Datasets ◽ Mining Model

Data mining is the practical process of discovering useful patterns in heterogeneous datasets. Almost every industry uses data mining in its day-to-day activities. Building an effective mining model requires following a series of development steps, starting with discovering the business problem and ending with communicating the results. In this development life cycle, the most important step is data preparation, or data preprocessing, which converts raw data into a form the machine can process. Data normalization is a phase of data preprocessing in which data values are scaled to the range 0 to 1. Proper normalization of a dataset leads to improved mining results. In this paper, academic data of students is normalized using six normalization techniques. A Multi Layer Perceptron classifier is then applied to each normalized dataset and results are obtained. The goal of the study is to identify the normalization technique that produces the best mining results when applied to academic datasets.
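
As a minimal illustration of scaling values to the range 0 to 1, the sketch below implements min-max normalization, with z-score normalization shown for contrast. The abstract does not name the six techniques the paper compares, so these two are assumptions chosen because they are widely used; the function names and the sample marks are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Center values on the mean and divide by the population standard deviation.
    Unlike min-max scaling, the result is not confined to the range 0 to 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


marks = [35, 50, 65, 80, 95]  # hypothetical student marks
print(min_max_normalize(marks))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In a comparison like the one the abstract describes, each normalized version of the dataset would be fed to the same classifier so that differences in the results can be attributed to the normalization step.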

Using Data Mining Techniques and Genetic Algorithm

Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications - LOPAL '18 ◽ 10.1145/3230905.3230915 ◽ 2018

Author(s): Lamia Berkani ◽ Yanis Chebahi ◽ Lilya Betit

Keyword(s): Data Mining ◽ Genetic Algorithm ◽ Data Mining Techniques ◽ Using Data

Audio and Speech Processing for Data Mining

Encyclopedia of Data Warehousing and Mining, Second Edition ◽ 10.4018/978-1-60566-010-3.ch017 ◽ 2011 ◽ pp. 98-103 ◽ Cited By ~ 1

Author(s): Zheng-Hua Tan

Keyword(s): Data Mining ◽ Speech Processing ◽ Large Data ◽ Data Preprocessing ◽ Multimedia Data ◽ Data Sets ◽ Data Types ◽ Multimedia Data Mining ◽ Customer Preferences ◽ And Storage

The explosive increase in computing power, network bandwidth and storage capacity has greatly facilitated the production, transmission and storage of multimedia data. Unlike alphanumeric databases, non-text media such as audio, image and video are unstructured by nature and, although they contain rich information, are not very expressive from the viewpoint of a contemporary computer. As a consequence, an overwhelming amount of data is created and then left unstructured and inaccessible, boosting the desire for efficient content management of these data. This has become a driving force of multimedia research and development, and has led to a new field termed multimedia data mining. While text mining is relatively mature, mining information from non-text media is still in its infancy, but holds much promise for the future.

In general, data mining is the process of applying analytical approaches to large data sets to discover implicit, previously unknown, and potentially useful information. This process often involves three steps: data preprocessing, data mining and postprocessing (Tan, Steinbach, & Kumar, 2005). The first step transforms the raw data into a more suitable format for subsequent data mining. The second step conducts the actual mining, while the last one validates and interprets the mining results.

Data preprocessing is a broad area, and it is the part of data mining where the essential techniques depend most heavily on data types. Unlike textual data, which is typically based on a written language, image, video and some audio data are inherently non-linguistic. Speech, as a spoken language, lies in between and often provides valuable information about the subjects, topics and concepts of multimedia content (Lee & Chen, 2005). The linguistic nature of speech makes information extraction from speech less complicated yet more precise and accurate than from image and video. This fact motivates content-based speech analysis for multimedia data mining and retrieval, where audio and speech processing is a key enabling technology (Ohtsuki, Bessho, Matsuo, Matsunaga, & Kayashi, 2006). Progress in this area can impact numerous business and government applications (Gilbert, Moore, & Zweig, 2005). Examples are discovering patterns and generating alarms for intelligence organizations as well as for call centers, analyzing customer preferences, and searching through vast audio warehouses.

Research of Privacy-Preserving Data mining Based on Modified Quantum Genetic Algorithm

International Journal of Advancements in Computing Technology ◽ 10.4156/ijact.vol5.issue6.76 ◽ 2013 ◽ Vol 5 (6) ◽ pp. 655-662

Author(s): Yang Lei ◽ Wu Jue ◽ Liu Feng

Keyword(s): Data Mining ◽ Genetic Algorithm ◽ Privacy Preserving ◽ Privacy Preserving Data Mining ◽ Quantum Genetic Algorithm
