Advances in Big Data Applications for transportation: airline, highway, and railway

2021 ◽  
Vol 5 (2) ◽  
pp. 121-134
Author(s):  
Babek Erdebilli ◽  
Emine Nur NACAR

Aim: The purpose of this article is to present the latest advances in big data applications in industries of the transportation sector such as airline, highway, and railway. Transportation data are difficult to analyze because they arrive as a continuous real-time flow. Because improvements in the field are equally fast, it is necessary to keep up with new developments. The data should be analyzed with the big data concept because transportation data sets contain a high share of unstructured data types. Although the mentioned industries complement each other, the applications differ depending on the needs of each industry. Thus, solutions to specific problems in different industries using big data applications should be addressed. Design / Research methods: In accordance with the purpose of the study, big data studies that provide added value to the transportation sector were examined. Studies were filtered by three criteria: whether the application is adaptable to the industry, whether the study is available online in full text, and whether its references come from reputable sources. Conclusions / findings: Not all big data application studies in academia are adaptable to real-life problems or suitable for all situations. For this reason, trying all of the applications would lead to both material and non-material losses for firms. This study is a guideline for companies to follow developments in the big data concept and to choose the application that suits their problems. In this way, the study attempts to close the gap between academia and industry. Originality / value of the article: Although there are studies on big data applications in the transportation sector, this study differs from others by specifically analyzing big data applications in different industries of the transportation sector, such as airline, highway, and railway.

Author(s):  
Dimitar Christozov ◽  
Katia Rasheva-Yordanova

The article shares the authors' experiences in training bachelor-level students to explore Big Data applications in solving contemporary problems. The article discusses curriculum issues and pedagogical techniques connected to developing Big Data competencies. The following objectives are targeted: the importance and impact of making rational, data-driven decisions in the Big Data era; the complexity of developing and exploring a Big Data application in solving real-life problems; learning skills to adopt and explore emerging technologies; and the knowledge and skills to interpret and communicate results of data analysis by combining domain knowledge with system expertise. The curriculum covers: the two general uses of Big Data analytics applications, which are well distinguished from the point of view of the end user's objectives (presenting and visualizing data via aggregation and summarization [data warehousing: data cubes, dashboards, etc.] and learning from data [data mining techniques]); the organization of data sources, in particular the distinction between master data and operational data; the Extract-Transform-Load (ETL) process; and informing vs. misinforming, including the issue of over-trust vs. under-trust of obtained analytical results.


Author(s):  
M. Asif Naeem ◽  
Gillian Dobbie ◽  
Gerald Weber

In order to make timely and effective decisions, businesses need the latest information from big data warehouse repositories. To keep these repositories up to date, real-time data integration is required. An important phase in real-time data integration is data transformation where a stream of updates, which is huge in volume and infinite, is joined with large disk-based master data. Stream processing is an important concept in Big Data, since large volumes of data are often best processed immediately. A well-known algorithm called Mesh Join (MESHJOIN) was proposed to process stream data with disk-based master data, which uses limited memory. MESHJOIN is a candidate for a resource-aware system setup. The problem that the authors consider in this chapter is that MESHJOIN is not very selective. In particular, the performance of the algorithm is always inversely proportional to the size of the master data table. As a consequence, the resource consumption is in some scenarios suboptimal. They present an algorithm called Cache Join (CACHEJOIN), which performs asymptotically at least as well as MESHJOIN but performs better in realistic scenarios, particularly if parts of the master data are used with different frequencies. In order to quantify the performance differences, the authors compare both algorithms with a synthetic dataset of a known skewed distribution as well as TPC-H and real-life datasets.
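The intuition that unevenly used master data rewards caching can be illustrated with a small sketch. This is not the authors' CACHEJOIN algorithm: it is a simplified illustration in which a Python dictionary stands in for the disk-based master data, and the names `CachedStreamJoin`, `promote_after`, `cache_size`, and `disk_reads` are hypothetical.

```python
from collections import Counter


class CachedStreamJoin:
    """Join stream tuples with master data, caching frequently used keys.

    Simplified illustration only: `master` is a dict standing in for a
    disk-based table, and every lookup in it counts as one disk read.
    """

    def __init__(self, master, promote_after=2, cache_size=2):
        self.master = master                # simulated disk-based master data
        self.promote_after = promote_after  # accesses needed before caching a key
        self.cache_size = cache_size        # maximum number of cached rows
        self.cache = {}                     # in-memory copy of hot master rows
        self.hits = Counter()               # access frequency per key
        self.disk_reads = 0                 # number of simulated disk lookups

    def probe(self, key):
        """Return the master row for `key`, or None if there is no match."""
        self.hits[key] += 1
        if key in self.cache:
            return self.cache[key]
        self.disk_reads += 1
        row = self.master.get(key)
        if row is not None and self.hits[key] >= self.promote_after:
            self._promote(key, row)
        return row

    def _promote(self, key, row):
        """Cache a hot row, evicting the least frequently used one if full."""
        if len(self.cache) >= self.cache_size:
            coldest = min(self.cache, key=lambda k: self.hits[k])
            del self.cache[coldest]
        self.cache[key] = row

    def join(self, stream):
        """Yield ((key, payload), master_row) for each matching stream tuple."""
        for key, payload in stream:
            row = self.probe(key)
            if row is not None:
                yield (key, payload), row


# A skewed stream: key 1 is requested far more often than the others.
master = {1: "alice", 2: "bob", 3: "carol"}
stream = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (1, "e"), (9, "f")]
joiner = CachedStreamJoin(master)
results = list(joiner.join(stream))
```

With a skewed stream like the one above, repeated requests for the hot key are served from memory once it has been promoted, so the number of simulated disk reads grows more slowly than the number of stream tuples.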


2020 ◽  
Vol 39 (10) ◽  
pp. 753-754
Author(s):  
Jiajia Sun ◽  
Daniele Colombo ◽  
Yaoguo Li ◽  
Jeffrey Shragge

Geophysicists seek to extract useful and potentially actionable information about the subsurface by interpreting various types of geophysical data together with prior geologic information. It is well recognized that reliable imaging, characterization, and monitoring of subsurface systems require integration of multiple sources of information from a multitude of geoscientific data sets. With increasing data volumes and computational power, new data types, constant development of inversion algorithms, and the advent of the big data era, Geophysics editors see multiphysics integration as an effective means of meeting some of the challenges arising from imaging subsurface systems with higher resolution and reliability as well as exploring geologically more complicated areas. To advance the field of multiphysics integration and to showcase its added value, Geophysics will introduce a new section “Multiphysics and Joint Inversion” in 2021. Submissions are accepted now.


Author(s):  
Rajni Aron ◽  
Deepak Kumar Aggarwal

Cloud Computing has become a buzzword in the IT industry. Cloud Computing, which provides inexpensive computing resources on a pay-as-you-go basis, is rapidly gaining momentum as a substitute for traditional Information Technology (IT) based organizations. Therefore, the increased utilization of Clouds makes the execution of Big Data processing jobs a vital research area. As more and more users have started to store and process their real-time data in Cloud environments, Resource Provisioning and Scheduling of Big Data processing jobs become key considerations for the efficient execution of Big Data applications. This chapter discusses the fundamental concepts behind the terms Cloud Computing and Big Data and the relationship between them. It will help researchers find the important characteristics of Cloud Resource Management Systems for handling Big Data processing jobs and select the most suitable technique for processing Big Data jobs in a Cloud Computing environment.


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2338 ◽  
Author(s):  
Yuanju Qu ◽  
Xinguo Ming ◽  
Siqi Qiu ◽  
Maokuan Zheng ◽  
Zengtao Hou

With the development of the Internet of Things (IoT), big data, smart sensing technology, and cloud technology, industry has entered a new stage of revolution. Traditional manufacturing enterprises are transforming into service-oriented manufacturing based on prognostics and health management (PHM). However, there is a lack of a systematic and comprehensive PHM framework to create more added value. In this paper, the authors proposed an integrative framework to systematically solve the problem at three levels: the strategic level of PHM to create added value, the tactical level of PHM to define the implementation route, and the operational level of PHM in a detailed application. At the strategic level, the authors provided an innovative business model to create added value through big data. Moreover, to monitor the equipment status, a health index (HI) based on a condition-based maintenance (CBM) method was proposed. At the tactical level, the authors provided the implementation route in application integration, analysis service, and visual management to satisfy the different stakeholders’ functional requirements through a convolutional neural network (CNN). At the operational level, the authors constructed a self-sensing network based on anti-inference and self-organizing Zigbee to capture real-time data from the equipment group. Finally, the authors verified the feasibility of the framework in a real case from China.


Author(s):  
Seema Ansari ◽  
Radha Mohanlal ◽  
Javier Poncela ◽  
Adeel Ansari ◽  
Komal Mohanlal

Combining vast amounts of heterogeneous data and increasing the processing power of existing database management tools is without doubt an emerging need of the IT industry in the coming years. The complexity and size of the data sets that need to be acquired, analyzed, stored, sorted, or transferred have spiked in recent years. Because the volume of multiple data types is growing tremendously, creating Big Data applications that can extract the valuable trends and relationships required for further processing or for deriving useful results is quite a challenging task. Companies, corporate organizations, and government agencies all need to analyze and execute Big Data implementations to pave new paths of productivity and innovation. This chapter discusses an emerging technology of the modern era, Big Data, with a detailed description of the three V's (Variety, Velocity, and Volume). Further chapters will help the reader understand the concepts of data mining and big data analysis and the potential of Big Data in five domains: healthcare, the public sector, retail, manufacturing, and personal location data.


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1045
Author(s):  
Edmundas Kazimieras Zavadskas ◽  
Jurgita Antucheviciene ◽  
Zenonas Turskis

This Special Issue covers symmetric and asymmetric data that occur in real-life problems. We invited authors to submit their theoretical or experimental research to present engineering and economic problem solution models that deal with symmetry or asymmetry of different data types. The Special Issue gained interest in the research community and received many submissions. After rigorous scientific evaluation by editors and reviewers, seventeen papers were accepted and published. The authors proposed different solution models, mainly covering uncertain data in multi-criteria decision-making problems as complex tools to balance the symmetry between goals, risks, and constraints to cope with the complicated problems in engineering or management. Therefore, we invite researchers interested in the topics to read the papers provided in the Special Issue.


Big data applications play an important role in real-time data processing. Apache Spark is a data processing framework with an in-memory data engine that quickly processes large data sets. It can also distribute data processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools. Spark’s in-memory processing cannot share data between applications, and RAM will be insufficient for storing petabytes of data. Alluxio is a virtual distributed storage system that leverages memory for data storage and provides faster access to data in different storage systems. Alluxio helps to speed up data-intensive Spark applications with various storage systems. In this work, the performance of applications on Spark, as well as on Spark running over Alluxio, has been studied with respect to several storage formats (Parquet, ORC, CSV, and JSON) and four types of queries from the Star Schema Benchmark (SSB). A benchmark is developed to assess the suitability of the Spark-Alluxio combination for big data applications. It is found that Alluxio is suitable for applications that use databases larger than 2.6 GB storing data in JSON and CSV formats. Spark is found suitable for applications that use storage formats such as Parquet and ORC with database sizes less than 2.6 GB.
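The two findings reported in this abstract can be written down as a small rule of thumb. The sketch below is only an illustration of that summary, not code from the study: the function name `suggest_setup` and its return labels are hypothetical, and combinations the abstract does not mention (for example, Parquet above 2.6 GB) deliberately return `None` rather than a guess.

```python
THRESHOLD_GB = 2.6  # database size reported in the abstract

TEXT_FORMATS = {"json", "csv"}         # formats for which Alluxio is reported to help
COLUMNAR_FORMATS = {"parquet", "orc"}  # formats for which plain Spark is reported to suffice


def suggest_setup(storage_format, size_gb):
    """Map a storage format and database size to the setup the abstract reports.

    Returns "spark+alluxio", "spark", or None when the abstract states
    no finding for the given combination.
    """
    fmt = storage_format.lower()
    if fmt in TEXT_FORMATS and size_gb > THRESHOLD_GB:
        return "spark+alluxio"
    if fmt in COLUMNAR_FORMATS and size_gb < THRESHOLD_GB:
        return "spark"
    return None  # combination not covered by the reported findings


print(suggest_setup("JSON", 5.0))
print(suggest_setup("parquet", 1.0))
print(suggest_setup("csv", 1.0))
```

Returning `None` for uncovered cases keeps the sketch faithful to what the abstract actually states; any real deployment decision would need the full benchmark results.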


Big data applications open novel opportunities for establishing innovative information and produce different advanced methods to improve the value of healthcare. In this paper, a novel activity pattern mining approach from social media for healthcare is used to examine big data applications in different biomedical disciplines such as bioinformatics, medical imaging, and community healthcare applications. Big data analytical tools play a key part in extracting hidden behavioural and expressive patterns from users' personal messages and tweets. The behavioural patterns of the users can reveal additional information about their concealed feelings and sentiments [1], [3], [5]. Further, a neural network is modelled to predict psychological information such as nervousness, depression, behavioural disorder, and mental stress. The paper also shows that integrating a variety of data sources enables medical practitioners to carry out a novel investigation of patient care processes, and that improvements in new mobile healthcare technologies aid real-time data collection, archiving, and analysis of data in distributed environments.



