Multi-Source and Heterogeneous Data Integration Model for Big Data Analytics in Power DCS

Author(s):  
Wengang Chen ◽  
Ruijie Wang ◽  
Runze Wu ◽  
Liangrui Tang ◽  
Junli Fan
2014 ◽  
Vol 912-914 ◽  
pp. 1201-1204
Author(s):  
Gang Huang ◽  
Xiu Ying Wu ◽  
Man Yuan

This paper proposes an ontology-based distributed heterogeneous data integration framework (ODHDIF). The framework resolves the problem of semantic interoperability between heterogeneous data sources at the semantic level. Metadata specify the distributed, heterogeneous data and describe the semantic information of each data source; with an ontology as the common semantic model, semantic matches are established through ontology mapping between heterogeneous data sources, and semantic differences are shielded, so that the semantic heterogeneity problem of heterogeneous data sources can be effectively solved. The framework provides an effective technical means for an enterprise's internal information to be shared accurately and in a timely manner.
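The mapping idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the ODHDIF implementation: the ontology concepts and per-source field names below are hypothetical, standing in for the paper's unspecified power-system schemas.

```python
# Minimal sketch of ontology-based schema mapping between two
# heterogeneous sources. All field names are hypothetical; the
# paper's actual ontology is not given in the abstract.

# Shared "ontology": canonical concept names for the common semantic model.
ONTOLOGY = {"device_id", "timestamp", "power_kw"}

# Per-source mappings: local field name -> ontology concept.
SOURCE_A_MAP = {"dev": "device_id", "ts": "timestamp", "p": "power_kw"}
SOURCE_B_MAP = {"unit_no": "device_id", "time": "timestamp", "load_kw": "power_kw"}

def to_common_model(record, mapping):
    """Translate a source-local record into the shared ontology vocabulary,
    dropping fields that have no ontology counterpart."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

rec_a = to_common_model({"dev": "T1", "ts": 100, "p": 3.2}, SOURCE_A_MAP)
rec_b = to_common_model({"unit_no": "T1", "time": 100, "load_kw": 3.3, "note": "x"}, SOURCE_B_MAP)
# Both records now share one vocabulary and can be matched on device_id/timestamp.
```

Once both records speak the ontology's vocabulary, cross-source matching reduces to joining on the shared concept names, which is the "semantic match" the abstract describes.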


Author(s):  
Guowei Cai ◽  
Sankaran Mahadevan

This manuscript explores the application of big data analytics to online structural health monitoring. As smart sensor technology progresses and low-cost online monitoring becomes increasingly feasible, large quantities of highly heterogeneous data can be acquired during monitoring, exceeding the capacity of traditional data analytics techniques. This paper investigates big data techniques for handling the high-volume data obtained in structural health monitoring, in particular the analysis of infrared thermal images for structural damage diagnosis. We explore the MapReduce technique to parallelize the data analytics and efficiently handle the high volume, high velocity, and high variety of information. In our study, MapReduce is implemented on the Spark platform, and image processing functions such as the uniform filter and Sobel filter are wrapped in the mappers. The methodology is illustrated with concrete slabs, using actual experimental data with induced damage.
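The "filter wrapped in a mapper" pattern can be sketched without Spark. In the sketch below, Python's builtin `map()` stands in for a Spark RDD's `map`, and the image tiles are toy grayscale grids; the paper's actual Spark pipeline and thermal-image data are not given in the abstract.

```python
# Pure-Python sketch of wrapping an image filter in a "mapper", in the
# spirit of the paper's Spark/MapReduce setup. The builtin map() stands
# in for an RDD; each element is one image tile (a 2-D list of pixels).

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| for interior pixels;
    border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = sum(SOBEL_X[a][b] * img[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            gy = sum(SOBEL_Y[a][b] * img[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            out[i][j] = abs(gx) + abs(gy)
    return out

# A vertical edge: left half dark (0), right half bright (9).
tile = [[0, 0, 9, 9] for _ in range(4)]
tiles = [tile, tile]                       # two "partitions" of images
edges = list(map(sobel_magnitude, tiles))  # the mapper step
```

In Spark the last line would be something like `rdd.map(sobel_magnitude)`, letting the cluster apply the same filter to many thermal images in parallel.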


Author(s):  
Richard Kumaradjaja

This chapter describes data integration issues in big data analytics and proposes an integrated data integration framework for big data analytics. Its main focus is to address the issues of data integration from the architectural point of view, which leads to a better understanding of the current situation and to better-constructed solutions, since an architectural approach gives a holistic and comprehensive view of the problems. The chapter also discusses future research directions for the proposed integrated data architecture framework.


2020 ◽  
Vol 17 (8) ◽  
pp. 3798-3803
Author(s):  
M. D. Anto Praveena ◽  
B. Bharathi

Big data analytics has become a growing field and plays a pivotal role in healthcare and research practices. Big data analytics in healthcare covers the integration and analysis of vast amounts of dynamic heterogeneous data. Patients' medical records include diverse data such as medical conditions, medications, and test findings. One of the major challenges of analytics and prediction in healthcare is data preprocessing, in which outlier identification and correction is an important step. Outliers are extreme values that deviate from the other values of an attribute; they may simply be experimental errors, or they may be genuine novelties. Outlier identification is the method of finding data objects whose behavior differs from expectations. Detecting outliers in time series data differs from doing so in ordinary data: time series data are recorded over a sequence of time periods, and such outliers must be identified and cleaned to produce a quality dataset. In this proposed work, a hybrid outlier detection algorithm, extended LSTM-GAN, is used to recognize outliers in time series data. The proposed extended algorithm achieved better performance in time series analysis on ECG dataset processing compared with traditional methodologies.
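The preprocessing step the abstract describes, flagging time-series points that deviate from expectation, can be illustrated with a much simpler detector. The rolling z-score below is a stand-in, not the paper's extended LSTM-GAN, and the signal and threshold are invented for illustration.

```python
# Stand-in for the paper's LSTM-GAN detector: a rolling z-score flags
# points that deviate from the trailing window by many standard deviations.
import statistics

def rolling_outliers(series, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        win = series[i - window:i]
        mu = statistics.mean(win)
        sd = statistics.pstdev(win) or 1e-9  # guard against a flat window
        if abs(series[i] - mu) / sd > threshold:
            flagged.append(i)
    return flagged

# Toy "ECG-like" signal with one spike at index 6.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 9.0, 1.0, 0.95]
flagged = rolling_outliers(signal)  # -> [6]
```

A learned model such as an LSTM-GAN replaces the trailing-window mean with a prediction of what the next value should look like, which matters when the normal pattern itself varies over time, as in ECG waveforms.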


Author(s):  
Jaimin Navinchandra Undavia ◽  
Atul Manubhai Patel

Technological advancement has opened up various ways to collect data through automatic mechanisms. One such mechanism collects a huge amount of data without any further maintenance or human intervention. The healthcare sector has been confronted by the need to manage the big data produced by various sources, which are well known for generating high volumes of heterogeneous data. A high level of sophistication has been incorporated into almost every industry, and healthcare is one of them. The article shows that a huge amount of data exists in the healthcare industry and that the data generated there is neither homogeneous nor of a simple type. The various sources and objectives of the data are also highlighted and discussed. As the data come from various sources, they are versatile in nature in all aspects. So, rightly and meaningfully, big data analytics has penetrated the healthcare industry, and its impact is also highlighted.


Author(s):  
Sai Hanuman Akundi ◽  
Soujanya R ◽  
Madhuri PM

In recent years, vast quantities of data have been managed in various medical applications, and multiple organizations worldwide have generated this type of data; together, these heterogeneous data are called big data. Data characterized by volume, velocity, and variety are termed big data. The healthcare sector has faced the need to handle large data from different sources, being renowned for generating large amounts of heterogeneous data. We can use big data analysis to make proper decisions in the health system by tweaking some of the current machine learning algorithms. If we have a large amount of data from which we want to predict or identify patterns, machine learning is the way forward. In this article, a brief overview of big data and of the functionality and methods of big data analytics is presented, which play an important role in and significantly affect healthcare information technology. Within this paper we present a comparative study of machine learning algorithms. We need to make effective use of the current machine learning algorithms to predict accurate outcomes in the world of nursing.


2020 ◽  
Vol 12 (4) ◽  
pp. 132-146
Author(s):  
Gabriel Kabanda

Big data is the process of managing large volumes of data obtained from several heterogeneous data types (e.g., internal, external, structured, and unstructured) that can be used for collecting and analyzing enterprise data. The purpose of the paper is to conduct an evaluation of big data analytics projects, discussing why such projects fail and explaining why and how the Project Predictive Analytics (PPA) approach may make a difference with respect to future methods based on data mining, machine learning, and artificial intelligence. A qualitative research methodology was used. The research design was discourse analysis supported by document analysis. Laclau and Mouffe's discourse theory, the most thoroughly poststructuralist approach, was adopted.


Author(s):  
Richard Kumaradjaja

This paper describes data integration issues in big data analytics and proposes an integrated data integration framework for big data analytics. Its main focus is to address the issues of data integration from the architectural point of view, which leads to a better understanding of the current situation and to better-constructed solutions, since an architectural approach gives a holistic and comprehensive view of the problems. The paper also discusses future research directions for the proposed integrated data architecture framework.


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Li Jiang ◽  
Hao Chen ◽  
Yueqi Ouyang ◽  
Canbing Li

With the rapid development of information technology and the coming of the era of big data, various data are constantly emerging and exhibit the characteristics of autonomy and heterogeneity. How to optimize data quality and evaluate the effect has become a challenging problem. Firstly, a heterogeneous data integration model based on retrospective audit is proposed to locate the original data source and match the data. Secondly, to improve the integrated data quality, a retrospective audit model and associative audit rules are proposed to fix incomplete and incorrect data from multiple heterogeneous data sources. The heterogeneous data integration model based on retrospective audit is divided into four modules: original heterogeneous data, data structure, data processing, and data retrospective audit. Finally, assessment criteria such as redundancy, sparsity, and accuracy are defined to evaluate the optimized data quality. Experimental results show that the quality of the integrated data is significantly higher than that of the original data.
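The three assessment criteria named in the abstract can be computed on a toy integrated table. The formulas and field names below are illustrative assumptions, not the paper's definitions: redundancy as the share of duplicate rows, sparsity as the share of missing cells, and accuracy as the share of rows passing an associative audit rule.

```python
# Toy integrated table with one duplicate, one missing cell, and one
# row that violates a (hypothetical) associative audit rule.
rows = [
    {"id": 1, "voltage": 220, "current": 5},
    {"id": 1, "voltage": 220, "current": 5},   # duplicate record
    {"id": 2, "voltage": None, "current": 4},  # missing cell
    {"id": 3, "voltage": 110, "current": -2},  # fails the audit rule
]

def redundancy(rows):
    """Fraction of rows that are exact duplicates of an earlier row."""
    unique = {tuple(sorted(r.items())) for r in rows}
    return 1 - len(unique) / len(rows)

def sparsity(rows):
    """Fraction of cells with a missing value."""
    cells = [v for r in rows for v in r.values()]
    return sum(v is None for v in cells) / len(cells)

def accuracy(rows, rule):
    """Fraction of rows satisfying an associative audit rule."""
    return sum(rule(r) for r in rows) / len(rows)

audit_rule = lambda r: r["voltage"] is not None and r["current"] >= 0

scores = (redundancy(rows), sparsity(rows), accuracy(rows, audit_rule))
# scores -> (0.25, 1/12, 0.5)
```

A retrospective audit would then trace the failing rows back to their original sources and repair them, after which the same three scores quantify the improvement.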

