A big data pipeline for temporospatial infrasound analysis

2016 ◽  
Vol 140 (4) ◽  
pp. 2997-2997
Author(s):  
Anthony Christe ◽  
Milton Garces ◽  
Julie Schnurr ◽  
Steven Magana-Zook
Keyword(s):  
Big Data ◽  

Bank marketers still have difficulty finding the best way to run above-the-line credit card promotions, particularly ones based on customer preferences at point-of-interest (POI) locations such as malls and shopping centers. Customers at those POIs, in turn, are keen to receive recommendations on what the bank is offering. In this paper we propose the design architecture and implementation of a big data platform that supports a bank's credit card campaign program by collecting data and extracting topics from Twitter. We built a data pipeline that consists of a Twitter streamer, a text preprocessor, a topic extractor using Latent Dirichlet Allocation, and a dashboard that visualizes the recommendations. As a result, we successfully generated topics related to specific locations in Jakarta during particular time windows, which bank marketers can use as recommendations when creating promotion programs for their customers. We also present an analysis of computing power usage, which indicates that the strategy is well implemented on the big data platform.
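The preprocessing and per-location grouping stages described in this abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tweet fields (`location`, `hour`, `text`), the stopword list, and the function names are all hypothetical, and a plain term-frequency count stands in for the Latent Dirichlet Allocation step, which would normally come from a topic-modeling library.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "at", "in", "is", "and", "to", "for", "on", "of"}

def preprocess(text):
    """Lowercase, strip URLs and @mentions, keep alphabetic tokens, drop stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOPWORDS]

def top_terms(tweets, k=2):
    """Group tweets by (location, hour) and return the k most frequent terms per group.

    Term frequency is a simple stand-in for the LDA topic extractor.
    """
    counts = defaultdict(Counter)
    for tw in tweets:
        counts[(tw["location"], tw["hour"])].update(preprocess(tw["text"]))
    return {key: [term for term, _ in c.most_common(k)] for key, c in counts.items()}

# Hypothetical sample tweets; location names and hours are made up.
TWEETS = [
    {"location": "mall_a", "hour": 12, "text": "Coffee discount at the mall! https://t.co/x @bank"},
    {"location": "mall_a", "hour": 12, "text": "Love the coffee discount #coffee"},
    {"location": "mall_b", "hour": 18, "text": "Cinema tickets, cinema snacks"},
]
```

Calling `top_terms(TWEETS)` yields one short term list per (location, time window) pair, which is the shape of output a dashboard could consume.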


Author(s):  
Hiba Sebei ◽  
Mohamed Ali Hadj Taieb ◽  
Mohamed Ben Aouicha

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1838
Author(s):  
Kwanghee Won ◽  
Chungwook Sim

Transverse cracks on bridge decks provide a path for chloride penetration and are the major cause of deck deterioration. For these reasons, collecting information on the widths and spacing of transverse cracks is important. In this study, we focused on developing a data pipeline for automated crack detection using non-contact optical sensors. We developed a data acquisition system that can acquire data quickly and simply without obstructing traffic. Because GPS is not always available and odometer sensor data can only provide relative positions along the direction of traffic, we focused on providing an alternative localization strategy that uses only optical sensors. In addition, to improve on existing crack detection methods, which mostly rely on the low-intensity and localized line-segment characteristics of cracks, we considered the direction and shape of the cracks to make our machine learning approach smarter. The proposed system may serve as a useful inspection tool for big data analytics because it is easy to deploy and provides multiple properties of cracks. If the system is deployed multiple times, the progression of crack deterioration, if any, can be checked and compared on both spatial and temporal scales.


Author(s):  
Arpna Joshi ◽  
Chirag Singla ◽  
Mr. Pankaj

A data pipeline is a set of actions performed from the time data becomes available for ingestion until value is obtained from that data. These actions are Extraction (getting the fields of value from the dataset), Transformation, and Loading (putting the data of value into a form that is useful for downstream use). In this big data project, we will simulate a simple batch data pipeline. Our dataset of interest comes from https://www.githubarchive.org/, which records the health data of the US for the past 125 years. The objective of this Spark project is to create a small but real-world pipeline that downloads these datasets as they become available, applies the various transformations, and loads the results into the forms of storage required for further use. In this project, Apache Kafka is used for data ingestion, Apache Spark for data processing, and Cassandra for storing the processed results.
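The three stages named in this abstract can be illustrated with a small batch sketch. It is only an illustration under stated assumptions: the record fields (`region`, `year`, `value`) and the sample data are hypothetical, and in-memory strings stand in for the Kafka ingestion, Spark processing, and Cassandra storage that the project actually uses.

```python
import csv
import io
import json

# Hypothetical raw input: newline-delimited JSON standing in for a downloaded batch file.
RAW = """{"region": "north", "year": 2001, "value": 10.0}
{"region": "north", "year": 2002, "value": 14.0}
{"region": "south", "year": 2001, "value": null}
{"region": "south", "year": 2002, "value": 6.0}
"""

def extract(raw_text):
    """Extraction: parse each non-empty line into a dict."""
    return [json.loads(line) for line in raw_text.splitlines() if line.strip()]

def transform(records):
    """Transformation: drop records with a missing value, then average per region."""
    totals = {}
    for rec in records:
        if rec["value"] is None:
            continue
        count, total = totals.get(rec["region"], (0, 0.0))
        totals[rec["region"]] = (count + 1, total + rec["value"])
    return [
        {"region": region, "count": count, "mean_value": total / count}
        for region, (count, total) in sorted(totals.items())
    ]

def load(rows, sink):
    """Loading: write rows as CSV to a file-like sink (a stand-in for a database table)."""
    writer = csv.DictWriter(sink, fieldnames=["region", "count", "mean_value"])
    writer.writeheader()
    writer.writerows(rows)

def run_pipeline(raw_text):
    """Run extract, transform, and load in sequence and return the CSV text."""
    sink = io.StringIO()
    load(transform(extract(raw_text)), sink)
    return sink.getvalue()
```

Keeping each stage as a separate function mirrors the abstract's split between ingestion, processing, and storage, so any one stage could be swapped for a distributed component without touching the others.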


Author(s):  
Abirami T

Abstract: Open-source technology has influenced data analytics at every step, from data storage to data analysis and visualization. Open source for telco big data analytics enables sharp insights by enhancing problem discoverability and solution feasibility. This research paper discusses different open-source technology stacks for telco big data analytics, which are used to deploy tools for data collection, data storage, data processing, data analysis, and data visualization. This open-source pipeline uses a micro-services architecture built with a modular technology stack and orchestrated by Kubernetes; it can ingest data from multiple sources, process real-time data, and provide business and network intelligence. The main reason for using open-source technology in our architecture is to reduce cost and simplify management. Kubernetes is an industry-adopted open-source container orchestrator that offers fault tolerance, application scaling, and load balancing. The results can be displayed to telecom operators on an intuitive open-source dashboard such as Grafana. Our architecture is flexible and can be easily customized to the needs of the telecommunication industry. Using the proposed architecture, the telecommunication sector can make decisions quickly with nearly 30% lower CapEx, which is made possible by using COTS hardware.
Index Terms: Big data analytics, Data pipeline architecture, Open Source technologies, Real-time data processing, Fault-tolerance, Load-balancing, Kubernetes, BDA, Open source dashboard

