Data Preprocessing

2021 ◽  
pp. 33-58
Author(s):  
Magy Seif El-Nasr ◽  
Truong Huy Nguyen Dinh ◽  
Alessandro Canossa ◽  
Anders Drachen

This chapter focuses on the process of cleaning data and preparing it for further processing. Specifically, it discusses techniques that you will use, including outlier identification, data consistency checks, and the normalization or standardization of your data. It further discusses the different measurement types and which methods can be used for each type, as well as ways to deal with inconsistent or dirty data. The chapter takes a practical approach, integrating several labs that demonstrate how you can perform these steps on real game data.
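
As an illustration of the steps named in the abstract above (normalization, standardization, and outlier identification), the following is a minimal sketch in Python using only the standard library. It is not taken from the chapter: the function names, the 1.5 × IQR rule, and the `session_minutes` sample are assumptions chosen for illustration.

```python
from statistics import mean, pstdev, quantiles


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical session lengths in minutes (made-up data, not from the chapter).
session_minutes = [12, 15, 11, 14, 13, 16, 12, 240]

print(iqr_outliers(session_minutes))
print(min_max_normalize([0, 5, 10]))
```

Min-max scaling is sensitive to extreme values, so in a sketch like this it is usually reasonable to inspect or handle flagged outliers before rescaling.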

2012 ◽  
Vol 251 ◽  
pp. 257-265
Author(s):  
Dong Qiang Gao ◽  
Jiang Miao Yi ◽  
Lin Hu ◽  
Huan Lin

Taking a mouse surface as an example, this paper introduces the general process of reverse engineering, combining the advantages of Pro/E and Mastercam. The process includes using a CMM for data acquisition, using Pro/E for data preprocessing and model reconstruction, and using Mastercam to machine the cavity, followed by actual machining of a wax model on a CNC machine to produce the physical part (the end entity). The paper thereby provides a practical approach to the reverse engineering of complicated products.


2020 ◽  
Vol 17 (8) ◽  
pp. 3798-3803
Author(s):  
M. D. Anto Praveena ◽  
B. Bharathi

Big data analytics is a growing field that plays a pivotal role in healthcare and research practice. In healthcare, big data analytics covers the integration and analysis of vast amounts of dynamic, heterogeneous data. Patient medical records include many kinds of data, such as medical conditions, medications, and test findings. One of the major challenges of analytics and prediction in healthcare is data preprocessing, and within data preprocessing, outlier identification and correction is an important challenge. Outliers are unusual values that deviate from the other values of an attribute; they may simply be experimental errors, or they may be novelties. Outlier identification is the process of identifying data objects whose behavior differs from what is expected. Detecting outliers in time series data differs from detecting them in ordinary data. Time series data are data recorded over a sequence of time periods. Outliers in this kind of data are identified and removed to produce a quality dataset. In this proposed work, a hybrid outlier detection algorithm, an extended LSTM-GAN, is used to recognize outliers in time series data. The proposed extended algorithm achieved better performance than traditional methodologies in time series analysis on ECG dataset processing.
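
To make the idea of outlier identification in time series concrete, here is a minimal Python sketch of a much simpler baseline: a rolling median with a median absolute deviation (MAD) score. This is explicitly not the extended LSTM-GAN described in the abstract above. The function name, window size, threshold, and sample signal are all assumptions chosen for illustration.

```python
from statistics import median


def rolling_mad_outliers(series, window=5, threshold=3.5):
    """Return indices whose value deviates strongly from the local median.

    For each point, the median and the median absolute deviation (MAD)
    are computed over a centered window (truncated at the series edges).
    The point is flagged when its modified z-score exceeds `threshold`.
    """
    half = window // 2
    flagged = []
    for i, value in enumerate(series):
        neighbors = series[max(0, i - half): i + half + 1]
        center = median(neighbors)
        mad = median(abs(x - center) for x in neighbors)
        if mad == 0:
            # More than half the window equals the median, so no scale is
            # available; this sketch flags any point that differs from it.
            if value != center:
                flagged.append(i)
            continue
        # 0.6745 rescales MAD so the score is comparable to a z-score
        # under a normal distribution.
        score = 0.6745 * abs(value - center) / mad
        if score > threshold:
            flagged.append(i)
    return flagged


# Hypothetical signal with one spike at index 5 (made-up values, not ECG data).
signal = [1.0, 1.1, 0.9, 1.0, 1.2, 9.0, 1.1, 0.9, 1.0, 1.1]
print(rolling_mad_outliers(signal))
```

A local window is used because a time series can drift, so a value that is normal at one point in time may be anomalous at another; a single global threshold would miss that.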


Author(s):  
Scott A. Withrow ◽  
William K. Balzer ◽  
Michael T. Sliter ◽  
Purnima Gopalkrishnan ◽  
Michael A. Gillespie ◽  
...  
Keyword(s):  
