Programming big data analysis: principles and solutions

2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Loris Belcastro ◽  
Riccardo Cantini ◽  
Fabrizio Marozzo ◽  
Alessio Orsino ◽  
Domenico Talia ◽  
...  

Abstract: In the age of the Internet of Things and social media platforms, huge amounts of digital data are generated by and collected from many sources, including sensors, mobile devices, wearable trackers and security cameras. This data, commonly referred to as Big Data, is challenging current storage, processing, and analysis capabilities. New models, languages, systems and algorithms continue to be developed to effectively collect, store, analyze and learn from Big Data. Most recent surveys provide a global analysis of the tools used in the main phases of Big Data management (generation, acquisition, storage, querying and visualization of data). In contrast, this work analyzes and reviews the parallel and distributed paradigms, languages and systems used today to analyze and learn from Big Data on scalable computers. In particular, we provide an in-depth analysis of the properties of the main parallel programming paradigms (MapReduce, workflow, BSP, message passing, and SQL-like) and, through programming examples, we describe the most widely used systems for Big Data analysis (e.g., Hadoop, Spark, and Storm). Furthermore, we discuss and compare the different systems by highlighting the main features of each, their diffusion (community of developers and users) and the main advantages and disadvantages of using them to implement Big Data analysis applications. The final goal of this work is to help designers and developers identify and select the most appropriate programming solution based on their skills, hardware availability, application domains and purposes, while also considering the support provided by the developer community.
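As a rough illustration of the MapReduce paradigm named in the abstract, the following is a minimal single-process sketch of word counting in plain Python. It is not taken from the paper and does not use Hadoop or Spark; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence (assumption: one process)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data analysis", "Big Data systems"]))
```

In a real distributed system the map and reduce calls would run in parallel on different nodes and the shuffle would move data across the network; here the three steps are kept sequential so that the data flow is easy to follow.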

2018 ◽  
Vol 30 (5) ◽  
pp. 554-571 ◽  
Author(s):  
Maria Vincenza Ciasullo ◽  
Orlando Troisi ◽  
Francesca Loia ◽  
Gennaro Maione

Purpose: The purpose of this paper is to provide a better understanding of the reasons why people use or do not use carpooling. A further aim is to collect and analyze empirical evidence concerning the advantages and disadvantages of carpooling. Design/methodology/approach: A large-scale text analytics study was conducted: people’s opinions were collected on Twitter by means of a dedicated web crawler named “Twitter4J.” After mining, the collected data were processed through a sentiment analysis performed by means of “SentiWordNet.” Findings: The big data analysis identified the 12 concepts about carpooling most frequently used by Twitter users: seven advantages (economic efficiency, environmental efficiency, comfort, traffic, socialization, reliability, curiosity) and five disadvantages (lack of effectiveness, lack of flexibility, lack of privacy, danger, lack of trust). Research limitations/implications: Although the sample is particularly large (10 percent of the data flow published on Twitter from all over the world in about one year), the automated collection of people’s comments prevented a more in-depth analysis of users’ thoughts and opinions. Practical implications: The research findings may help entrepreneurs, managers and policy makers understand the variables to be leveraged and the actions to be taken to benefit from the potential advantages that carpooling offers. Originality/value: The work drew on skills from three different areas, i.e., business management, computing science and statistics, which were synergistically integrated for customizing, implementing and using two IT tools capable of automatically identifying, selecting, collecting, categorizing and analyzing people’s tweets about carpooling.
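The lexicon-based sentiment step described above can be sketched as follows. This is only an illustration under stated assumptions: `TOY_LEXICON` is a made-up stand-in with invented (positive, negative) scores, not the real SentiWordNet resource, and `score_text` and `label` are hypothetical helper names, not the authors' tooling.

```python
# Hypothetical toy lexicon: word -> (positive score, negative score).
# The values are invented for illustration and are NOT SentiWordNet data.
TOY_LEXICON = {
    "cheap": (0.6, 0.0),
    "friendly": (0.7, 0.0),
    "late": (0.0, 0.5),
    "unsafe": (0.0, 0.8),
}


def score_text(text, lexicon=TOY_LEXICON):
    """Sum (positive - negative) over every token found in the lexicon."""
    total = 0.0
    for token in text.lower().split():
        word = token.strip(".,!?;:\"'")  # drop surrounding punctuation
        positive, negative = lexicon.get(word, (0.0, 0.0))
        total += positive - negative
    return total


def label(text, lexicon=TOY_LEXICON):
    """Map the numeric score to a coarse sentiment label."""
    score = score_text(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["Carpooling is cheap and friendly!",
                  "The driver was late, and it felt unsafe."]:
        print(tweet, "->", label(tweet))
```

Words missing from the lexicon contribute nothing to the score, so a tweet with no known words is labelled neutral. A real pipeline would also need tokenization, word-sense handling and negation handling, all of which are omitted here.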


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hui Jiang ◽  
Ping Wang ◽  
Lei Peng ◽  
Xiaofeng Wang

In recent years, athlete action recognition has become an important research field for displaying and recognizing athlete actions. Generally speaking, athletes’ movements can be recognized through a variety of modes, such as motion sensors, machine vision, and big data analysis. Among them, machine vision and big data analysis usually contain significant information that can be used for various purposes. Machine vision can be expressed as the recognition of the time sequence of a series of athlete actions captured by camera, so that it can intervene in athletes’ training through visual methods. Big data contains a large amount of athletes’ historical training and competition data that needs exploration. In-depth analysis and feature mining of big data will help coaching teams develop training plans and devise new suggestions. On the basis of these observations, this paper proposes a novel spatiotemporal attention map convolutional network to identify athletes’ actions and, through the auxiliary analysis of big data, gives reasonable action intervention suggestions and helps coaches and decision-making teams formulate scientific training programs. The results of the study show the effectiveness of the proposed approach.


2019 ◽  
Vol 4 (2) ◽  
pp. 75-88
Author(s):  
Annisaa Nurhayati

Big Data has affected all industries, including the media and entertainment industries. The popularity of mobile devices and the internet has changed the way people enjoy entertainment. This popularity also generates data streams from many sources, in various data formats and large volumes, known as big data. Carrying out big data analysis can help the media and entertainment industries achieve their goals, such as providing content that makes users happy, improving the user experience, and increasing profits. Many researchers have studied the use of big data in the media and entertainment industries. The purpose of this paper is to provide an overview of the problems, challenges and various technologies related to Big Data in the media and entertainment industries.


2018 ◽  
Vol 5 (1) ◽  
pp. 205395171775322 ◽  
Author(s):  
Sarah Pink ◽  
Minna Ruckenstein ◽  
Robert Willim ◽  
Melisa Duque

In this article, we introduce and demonstrate the concept-metaphor of broken data. In doing so, we advance critical discussions of digital data by accounting for how data might be in processes of decay, making, repair, re-making and growth, which are inextricable from the ongoing forms of creativity that stem from everyday contingencies and improvisatory human activity. We build and demonstrate our argument through three examples drawn from mundane everyday activity: the incompleteness, inaccuracy and dispersed nature of personal self-tracking data; the data cleaning and repair processes of Big Data analysis; and how data can turn into noise and vice versa when they are transduced into sound within practices of music production and sound art. This, we argue, is a necessary step for considering the meaning and implications of data as it is increasingly mobilised in ways that impact society and our everyday worlds.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Na Tian ◽  
Sang-Bing Tsai

This paper provides an in-depth analysis and study of the interactive flipped classroom model for digital micro-videos in a big data English course. To improve the learning efficiency of English courses and reduce students’ learning pressure, the paper applies audiovisual language to the production of specific micro-class videos, broadcasts the recorded micro-class courses to students, and then randomly distributes a purpose-designed questionnaire on audiovisual language use. Statistics are collected from the students’ responses, and data analysis is finally conducted to summarize and verify the effects of audiovisual language use in micro-classes. The improved algorithm can effectively reduce fluctuations in the consumption of various resources in the cluster and make the services in the cluster more stable. The new distributed interprocess communication based on protocol and serialization technology is more efficient than traditional communication based on protocol standards, reduces bandwidth consumption in the cluster, and improves the throughput of each node in the cluster. The content design and scripting of micro-video teaching resources are based on this. Then, the production process of micro-video teaching resources is explained in terms of the selection of tools and the preparation, recording, editing, and generation of materials.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yi Zheng

At present, big data-related technologies are developing rapidly, and major companies provide big data analysis services. However, the components of a big data analysis system formed by the combination method cannot sense one another and lack cooperation, which wastes a certain amount of resources in the system. In order to find the key technology of the data analysis system and conduct in-depth analysis of media data, this paper proposes a scheduling algorithm based on artificial intelligence (AI) to implement task scheduling and logical data block migration. The experimental results show that the performance of LAS (Logistic-Block Affinity Scheduler) is improved by 23.97%, 16.11%, and 10.56%, respectively, compared with the other three algorithms. Based on real new media data, this article analyzes the content of media data and user behavior in depth through big data analysis methods. Compared with other methods, the algorithm model in this paper improves the accuracy of hot topic extraction, which has important implications for media data mining. In addition, the analysis results on emotional characteristics, audience characteristics, and hot topic communication characteristics obtained by the research also have practical value. This method improves the recall rate and F value by 5% and 4.7%, respectively, and the overall F value of emotional judgment is about 88.9%.
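For readers unfamiliar with the metrics quoted above, the standard definitions are given below. This assumes that the "F value" in the abstract is the balanced F-measure (F1), which the abstract itself does not state; TP, FP and FN denote true positives, false positives and false negatives.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2\,P\,R}{P + R}
\]
```

Under this definition the F value is the harmonic mean of precision P and recall R, so it is high only when both are high.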


2019 ◽  
Vol 9 (1) ◽  
pp. 01-12 ◽  
Author(s):  
Kristy F. Tiampo ◽  
Javad Kazemian ◽  
Hadi Ghofrani ◽  
Yelena Kropivnitskaya ◽  
Gero Michel
