Big Data Analytics: A Preliminary Study of Open Source Platforms

Abstract: As big data analytics systems squeeze the last bits of performance out of CPUs and GPUs, the next near-term, widely available alternative that industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer faces when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. The open-source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format that allows zero-copy communication of large tabular data and so avoids (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven supported software languages. To deal with the hardware challenges, we present Arrow-specific components that provide easy-to-use, high-performance interfaces to accelerated kernels. The components are combined in a generic architecture that is specialized to the application through an extensive infrastructure generation framework, which is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.
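The contrast the abstract draws between zero-copy sharing of a columnar buffer and (de)serialization can be illustrated with a toy sketch. This uses only the Python standard library and is not the Arrow or Fletcher API; the column names `ids` and `values` and the sample rows are hypothetical.

```python
import pickle
from array import array

# Row-oriented records, as a run-time system might hold them (hypothetical data).
rows = [(1, 10.5), (2, 20.25), (3, 30.0)]

# Column-oriented layout: each column is one contiguous, typed buffer.
ids = array("q", (r[0] for r in rows))     # signed 64-bit integers
values = array("d", (r[1] for r in rows))  # 64-bit floats

# Zero-copy hand-off: a memoryview exposes the same bytes without copying them.
values_view = memoryview(values)

# Serialization path for comparison: pickling builds a new byte string
# that the consumer has to decode again before it can use the data.
serialized = pickle.dumps(rows)
restored = pickle.loads(serialized)

# A write through the view is visible in the original column,
# which shows that both names refer to a single buffer.
values_view[0] = 99.0
print(values[0], len(serialized), values_view.nbytes)
```

In this sketch the consumer of `values_view` reads the producer's memory directly, whereas the consumer of `serialized` works on a decoded copy; the abstract attributes the same kind of saving to Arrow's common in-memory format.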


Telco Data Analytics using Open-Source Data Pipeline: Detailed Architecture and Technology Stack

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38644 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1717-1725

Author(s):

Abirami T

Keyword(s):

Big Data ◽

Data Analysis ◽

Load Balancing ◽

Open Source ◽

Data Storage ◽

Data Analytics ◽

Big Data Analytics ◽

Time Data ◽

Data Pipeline ◽

Real Time Data

Abstract: Open-source technology has influenced data analytics at every step, from data storage to data analysis and visualization. Open source for telco big data analytics enables sharp insights by enhancing problem discoverability and solution feasibility. This paper discusses different open-source technology stacks for telco big data analytics, covering tools for data collection, data storage, data processing, data analysis, and data visualization. The open-source pipeline uses a microservices architecture built with a modular technology stack and orchestrated by Kubernetes; it can ingest data from multiple sources, process real-time data, and provide business and network intelligence. The main reasons for using open-source technology in our architecture are to reduce cost and simplify management. Kubernetes is an industry-adopted open-source container orchestrator that offers fault tolerance, application scaling, and load balancing. The results can be displayed for telecom operators on an intuitive open-source dashboard such as Grafana. Our architecture is flexible and can be easily customized to the needs of the telecommunication industry. Using the proposed architecture, the telecommunication sector can make decisions quickly with nearly 30% lower CapEx, which is made possible by COTS hardware. Index Terms: Big data analytics, Data pipeline architecture, Open Source technologies, Real-time data processing, Fault-tolerance, Load-balancing, Kubernetes, BDA, Open source dashboard


Open Source Big Data Analytics Technique

Proceedings of the International Conference on Data Engineering and Communication Technology - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-10-1675-2_58 ◽

2016 ◽

pp. 593-602 ◽

Cited By ~ 1

Author(s):

Ishan Sharma ◽

Rajeev Tiwari ◽

Abhineet Anand

Keyword(s):

Big Data ◽

Open Source ◽

Data Analytics ◽

Big Data Analytics


Big Data Analytics: The Next Big Thing

The Management Accountant Journal ◽

10.33516/maj.v54i5.20-24p ◽

2019 ◽

Vol 54 (5) ◽

pp. 20

Author(s):

Dheeraj Kumar Pradhan

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics


Strategy in Discovery Mode - Wie Big Data & Analytics strategisches Denken verdrängen kann [How Big Data & Analytics Can Displace Strategic Thinking]

WiSt - Wirtschaftswissenschaftliches Studium ◽

10.15358/0340-1650-2020-5-11 ◽

2020 ◽

Vol 49 (5) ◽

pp. 11-17

Author(s):

Thomas Wrona ◽

Pauline Reinecke

Keyword(s):

Big Data ◽

Data Analytics ◽

Big Data Analytics

Big Data & Analytics (BDA) has become a scarcely questioned institution for corporate efficiency and competitive advantage. Too many prominent examples, such as the success of Google or Amazon, seem to confirm the importance of data and algorithms in achieving long-term competitive advantages. Both practitioners and academics seem to be jumping almost euphorically onto the "data train". When risks are addressed, they mostly concern ethical questions. What is often overlooked is that the advantages under discussion arise primarily from an operational efficiency perspective. Strategic effects are discussed, at most, in relation to business model innovations, whose actual degree of innovation remains to be assessed. The following shows that BDA can indeed generate competitive advantages, but that it also entails major strategic risks that currently receive little attention.
