A Novel Framework for the Seamless Integration of FPGA Accelerators with Big Data Analytics Frameworks in Heterogeneous Data Centers

Author(s):  
Ioannis Stamelos ◽  
Elias Koromilas ◽  
Christoforos Kachris ◽  
Dimitrios Soudris


Author(s):  
Guowei Cai ◽  
Sankaran Mahadevan

This manuscript explores the application of big data analytics in online structural health monitoring. As smart sensor technology makes progress and low-cost online monitoring becomes increasingly feasible, large quantities of highly heterogeneous data can be acquired during monitoring, exceeding the capacity of traditional data analytics techniques. This paper investigates big data techniques to handle the high-volume data obtained in structural health monitoring. In particular, we investigate the analysis of infrared thermal images for structural damage diagnosis. We explore the MapReduce technique to parallelize the data analytics and efficiently handle the high volume, high velocity, and high variety of information. In our study, MapReduce is implemented on the Spark platform, and image processing functions such as the uniform filter and the Sobel filter are wrapped in the mappers. The methodology is illustrated with concrete slabs, using actual experimental data with induced damage.
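
Below is a minimal PySpark sketch of the pattern the paper describes, wrapping image-processing filters in mappers so each image is processed in parallel. It is not the authors' exact pipeline; the file paths, filter size, and .npy storage format are illustrative assumptions.

```python
# A minimal sketch (not the authors' exact pipeline) of wrapping image
# filters in Spark mappers, assuming PySpark, NumPy, and SciPy are
# installed and that thermal images are stored as .npy arrays.
import numpy as np
from scipy.ndimage import uniform_filter, sobel
from pyspark import SparkContext

def damage_features(path):
    """Load one thermal image, smooth it, and extract edge statistics."""
    image = np.load(path)                      # 2-D temperature array
    smoothed = uniform_filter(image, size=5)   # suppress sensor noise
    edges = np.hypot(sobel(smoothed, axis=0),  # gradient magnitude
                     sobel(smoothed, axis=1))  # highlights damage edges
    return path, float(edges.mean()), float(edges.max())

if __name__ == "__main__":
    sc = SparkContext(appName="ThermalImageDiagnosis")
    # Hypothetical list of image files on shared storage.
    paths = ["slab_%03d.npy" % i for i in range(1000)]
    # Each mapper applies the filters independently, in parallel.
    results = sc.parallelize(paths).map(damage_features).collect()
    sc.stop()
```

Each RDD element is one thermal image, so Spark distributes the filtering and feature extraction across the cluster with no shared state between mappers.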


2020 ◽  
Vol 17 (8) ◽  
pp. 3798-3803
Author(s):  
M. D. Anto Praveena ◽  
B. Bharathi

Big data analytics has become an emerging field and plays a pivotal role in healthcare and research practices. Big data analytics in healthcare covers the integration and analysis of vast amounts of dynamic, heterogeneous data. Medical records of patients include diverse data such as medical conditions, medications, and test findings. One of the major challenges of analytics and prediction in healthcare is data preprocessing, in which outlier identification and correction is an important task. Outliers are extreme values that deviate from the other values of an attribute; they may be simple experimental errors or genuine novelties. Outlier identification is the method of identifying data objects whose behavior differs from expectations. Detecting outliers in time series data differs from detecting them in ordinary data: time series data are values recorded over a sequence of time periods, and such outliers must be identified and removed to obtain a quality dataset. In this proposed work, a hybrid outlier detection algorithm, extended LSTM-GAN, is used to recognize outliers in time series data. The proposed extended algorithm achieved better performance in time series analysis on an ECG dataset compared with traditional methodologies.
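
The paper's extended LSTM-GAN is not reproduced here, but the PyTorch sketch below shows the reconstruction-error idea that LSTM-based detectors of this family commonly build on: an LSTM autoencoder is trained on mostly normal ECG windows, and windows it reconstructs poorly are flagged as outliers. All layer sizes and the threshold are illustrative assumptions.

```python
# A minimal sketch of LSTM reconstruction-error outlier detection;
# it is NOT the authors' extended LSTM-GAN algorithm.
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(1, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, window, 1)
        _, (h, _) = self.encoder(x)             # compress each window
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)
        dec, _ = self.decoder(z)                # expand back to a window
        return self.out(dec)                    # reconstructed signal

def detect_outliers(model, windows, threshold):
    """Flag windows whose mean squared reconstruction error is high."""
    with torch.no_grad():
        errors = ((model(windows) - windows) ** 2).mean(dim=(1, 2))
    return errors > threshold                   # boolean mask per window
```

In practice the autoencoder is trained on mostly-normal ECG windows, and the threshold is chosen from the error distribution (e.g., a high percentile); the GAN component in LSTM-GAN variants adds an adversarially trained discriminator on top of this backbone.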


Author(s):  
Jaimin Navinchandra Undavia ◽  
Atul Manubhai Patel

Technological advancement has opened up various ways to collect data through automatic mechanisms. One such mechanism collects a huge amount of data without any further maintenance or human intervention. The healthcare sector has been confronted by the need to manage the big data produced by various sources, which are well known for producing high volumes of heterogeneous data. A high level of sophistication has been incorporated in almost every industry, and healthcare is one of them. The article shows that a huge amount of data exists in the healthcare industry and that the data generated there is neither homogeneous nor of a simple type. The various sources and objectives of the data are then highlighted and discussed. As the data come from various sources, they must be versatile in nature in all aspects. So, rightly and meaningfully, big data analytics has penetrated the healthcare industry, and its impact is also highlighted.


Author(s):  
Sai Hanuman Akundi ◽  
Soujanya R ◽  
Madhuri PM

In recent years, vast quantities of data have been generated by medical applications, and organizations worldwide have accumulated this kind of data; together, these heterogeneous data are called big data. Data characterized by large quantity, speed, and variety are what the term big data describes. The healthcare sector, renowned for generating large amounts of heterogeneous data, has faced the need to handle such data from many different sources. We can use big data analysis to make proper decisions in the health system by tweaking some of the current machine learning algorithms. If we have a large amount of data from which we want to make predictions or identify patterns, machine learning is the way forward. In this article, a brief overview of big data and of the functionality and methods of big data analytics is presented, which play an important role in and significantly affect healthcare information technology. Within this paper we present a comparative study of machine learning algorithms. We need to make effective use of all the current machine learning algorithms to anticipate accurate outcomes in the world of nursing.
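
As a concrete illustration of such a comparative study, the sketch below benchmarks a few common classifiers with scikit-learn; the dataset and the algorithm list are stand-ins, not the ones used in the paper.

```python
# A minimal sketch of a machine-learning comparison on medical data,
# assuming scikit-learn; dataset and models are illustrative choices.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)   # stand-in medical dataset
models = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "svm": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold accuracy
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```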


2020 ◽  
Vol 12 (4) ◽  
pp. 132-146
Author(s):  
Gabriel Kabanda

Big data refers to the management of large volumes of data obtained from several heterogeneous data types, e.g. internal, external, structured, and unstructured, that can be used for collecting and analyzing enterprise data. The purpose of the paper is to conduct an evaluation of big data analytics projects, discussing why such projects fail and explaining why and how the Project Predictive Analytics (PPA) approach may make a difference with respect to future methods based on data mining, machine learning, and artificial intelligence. A qualitative research methodology was used. The research design was discourse analysis supported by document analysis, with Laclau and Mouffe's discourse theory adopted as the most thoroughly poststructuralist approach.


2018 ◽  
Vol 7 (2.24) ◽  
pp. 92
Author(s):  
B V Ram Naresh Yadav ◽  
P Anjaiah

Big data analytics and cloud computing are two of the most important innovations in the current IT industry, and together these technologies deliver effective outcomes to various business organizations. However, big data analytics requires a huge amount of resources for storage and computation. Storage cost grows massively with the amount of input data, which calls for innovative algorithms that reduce the cost of storing the data in specific data centers in a cloud. In today's IT industry, cloud computing has emerged as a popular paradigm for hosting customer and enterprise data and many other distributed applications. Cloud Service Providers (CSPs) store huge amounts of data and host numerous distributed applications at different costs. For example, Amazon provides storage services priced per TB per month, and each CSP has different Service Level Agreements (SLAs) with different storage offers. Customers are interested in reliable SLAs, which increases the cost since more replicas are required. CSPs attract users with cheap initial storage (put) operations, while subsequent get operations from the cloud become a hurdle and increase the cost. CSPs provide these services by maintaining multiple data centers at multiple locations throughout the world. These data centers offer different get/put latencies and unit costs for resource reservation and utilization. Choosing among the data centers of different CSPs becomes tricky for cloud users who run distributed applications globally, e.g., online social networks. This involves two main challenges: first, allocating the data to different data centers so as to satisfy the SLOs, including latency; second, reserving remote resources, i.e., memory, at lower cost. In this paper we derive a new model that minimizes the cost while satisfying the SLOs, using integer programming. Additionally, we propose an algorithm that stores the data in a data center while minimizing the cost across different data centers, and we compute the cost for put/get latencies. Our simulations show that the cost of resource reservation and utilization across different data centers is minimized.
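
The sketch below illustrates the integer-programming idea with PuLP: binary variables choose data centers for replicas, the objective is total storage cost, and data centers violating the latency SLO are excluded. All costs, latencies, and the SLO value are made-up numbers, not the paper's model.

```python
# A minimal sketch (with PuLP) of SLO-constrained replica placement:
# minimize storage cost subject to a latency SLO. Numbers are invented.
import pulp

# Hypothetical per-GB monthly storage cost and get latency (ms) per DC.
datacenters = ["us-east", "eu-west", "ap-south"]
cost = {"us-east": 0.023, "eu-west": 0.025, "ap-south": 0.021}
latency = {"us-east": 40, "eu-west": 55, "ap-south": 90}
SLO_MS, REPLICAS, DATA_GB = 60, 2, 500

prob = pulp.LpProblem("replica_placement", pulp.LpMinimize)
place = pulp.LpVariable.dicts("place", datacenters, cat="Binary")

# Objective: total monthly storage cost across chosen data centers.
prob += pulp.lpSum(place[d] * cost[d] * DATA_GB for d in datacenters)
# Exactly REPLICAS copies, each in a data center meeting the SLO.
prob += pulp.lpSum(place[d] for d in datacenters) == REPLICAS
for d in datacenters:
    if latency[d] > SLO_MS:
        prob += place[d] == 0   # this DC cannot satisfy the SLO

prob.solve(pulp.PULP_CBC_CMD(msg=False))
chosen = [d for d in datacenters if place[d].value() == 1]
print("placement:", chosen, "cost:", pulp.value(prob.objective))
```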


2020 ◽  
Vol 10 (21) ◽  
pp. 7586
Author(s):  
Jose E. Lozano-Rizk ◽  
Juan I. Nieto-Hipolito ◽  
Raul Rivera-Rodriguez ◽  
Maria A. Cosio-Leon ◽  
Mabel Vazquez-Briseño ◽  
...  

When Internet of Things (IoT) big data analytics (BDA) applications need to transfer data streams among software-defined network (SDN)-based distributed data centers, data flow forwarding in the communication network is typically done by an SDN controller using a traditional shortest-path algorithm, or considering only the bandwidth requirements of the applications. In BDA, this scheme can hurt performance and lengthen job completion time because additional metrics, such as end-to-end delay, jitter, and packet loss rate on the data transfer path, are not considered. These metrics are quality of service (QoS) parameters of the communication network. This research proposes a solution called QoSComm, an SDN strategy that allocates QoS-based data flows for BDA running across distributed data centers to minimize job completion time. QoSComm operates in two phases: (i) based on current communication network conditions, it calculates the feasible paths for each data center using a multi-objective optimization method; (ii) it distributes the resultant paths among data centers by configuring their OpenFlow switches (OFS) dynamically. Simulation results show that QoSComm can improve BDA job completion time by an average of 18%.
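
The sketch below illustrates the flavor of QoS-aware path selection with networkx, scalarizing delay, jitter, and loss into a single link cost for Dijkstra. QoSComm itself uses a multi-objective optimization method; the topology, link metrics, and weights here are illustrative assumptions.

```python
# A minimal sketch of QoS-aware path selection in the spirit of
# QoSComm (not its actual multi-objective method), using networkx.
import networkx as nx

G = nx.Graph()
# Hypothetical inter-data-center links: (u, v, delay_ms, jitter_ms, loss).
links = [("dc1", "sw1", 5, 1, 0.001), ("sw1", "dc2", 7, 2, 0.002),
         ("dc1", "sw2", 4, 3, 0.010), ("sw2", "dc2", 3, 4, 0.008)]
for u, v, delay, jitter, loss in links:
    G.add_edge(u, v, delay=delay, jitter=jitter, loss=loss)

# Illustrative weights; a real controller would tune or Pareto-optimize.
W_DELAY, W_JITTER, W_LOSS = 1.0, 0.5, 1000.0

def qos_cost(u, v, attrs):
    """Combine the QoS metrics of one link into a single edge weight."""
    return (W_DELAY * attrs["delay"] + W_JITTER * attrs["jitter"]
            + W_LOSS * attrs["loss"])

path = nx.dijkstra_path(G, "dc1", "dc2", weight=qos_cost)
print("selected path:", path)  # an SDN controller would then install
                               # OpenFlow rules along this path
```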

