Structure and Dynamics of Many-Particle Systems: Big Data Sets and Data Analysis

Background: Over the years, big data analytics has been statically carried out in a programmed way, which does not allow for translation of data sets from a subjective perspective. This approach affects an understanding of why and how data sets manifest themselves into various forms in the way that they do. This has a negative impact on the accuracy, redundancy and usefulness of data sets, which in turn affects the value of operations and the competitive effectiveness of an organisation. Also, the current single approach lacks a detailed examination of data sets, which big data deserve in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis. This includes examining how a sociotechnical theory, the actor network theory (ANT), can be complementarily used with analytic tools for big data analysis. Method: In the study, the qualitative methods were employed from the interpretivist approach perspective. Results: From the findings, a framework that offers big data analytics at two levels, micro- (strategic) and macro- (operational) levels, was developed. Based on the framework, a model was developed, which can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspectives for improved organisations’ competitiveness.

Development of a Powerful Data-Analysis Tool Using Nonparametric Smoothing Models To Identify Drillsites in Tight Shale Reservoirs With High Economic Potential

SPE Journal ◽ 10.2118/189440-pa ◽ 2017 ◽ Vol 23 (03) ◽ pp. 719-736 ◽ Cited By ~ 2

Author(s): Quan Cai ◽ Wei Yu ◽ Hwa Chi Liang ◽ Jenn-Tai Liang ◽ Suojin Wang ◽ ...

Keyword(s): Big Data ◽ Data Analysis ◽ Predictive Power ◽ Oil And Gas ◽ Predictor Variables ◽ Data Sets ◽ Analysis Tool ◽ Data Set ◽ Data Analysis Tool ◽ Shale Reservoirs

Summary: The oil-and-gas industry is entering an era of “big data” because of the huge number of wells drilled with the rapid development of unconventional oil-and-gas reservoirs during the past decade. The massive amount of data generated presents a great opportunity for the industry to use data-analysis tools to help make informed decisions. The main challenge is the lack of the application of effective and efficient data-analysis tools to analyze and extract useful information for the decision-making process from the enormous amount of data available. In developing tight shale reservoirs, it is critical to have an optimal drilling strategy, thereby minimizing the risk of drilling in areas that would result in low-yield wells. The objective of this study is to develop an effective data-analysis tool capable of dealing with big and complicated data sets to identify hot zones in tight shale reservoirs with the potential to yield highly productive wells. The proposed tool is developed on the basis of nonparametric smoothing models, which are superior to the traditional multiple-linear-regression (MLR) models in both the predictive power and the ability to deal with nonlinear, higher-order variable interactions. This data-analysis tool is capable of handling one response variable and multiple predictor variables. To validate our tool, we used two real data sets: one with 249 tight oil horizontal wells from the Middle Bakken and the other with 2,064 shale gas horizontal wells from the Marcellus Shale. Results from the two case studies revealed that our tool not only can achieve much better predictive power than the traditional MLR models on identifying hot zones in the tight shale reservoirs but also can provide guidance on developing the optimal drilling and completion strategies (e.g., well length and depth, amount of proppant and water injected).
By comparing results from the two data sets, we found that our tool can achieve model performance with the big data set (2,064 Marcellus wells) with only four predictor variables that is similar to that with the small data set (249 Bakken wells) with six predictor variables. This implies that, for big data sets, even with a limited number of available predictor variables, our tool can still be very effective in identifying hot zones that would yield highly productive wells. The data sets that we have access to in this study contain very limited completion, geological, and petrophysical information. Results from this study clearly demonstrated that the data-analysis tool is certainly powerful and flexible enough to take advantage of any additional engineering and geology data to allow the operators to gain insights on the impact of these factors on well performance.
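The contrast the summary draws between nonparametric smoothing and MLR can be illustrated with a small sketch. The abstract does not say which smoothing model the authors used, so the Nadaraya-Watson kernel smoother below is an assumed stand-in, and the data are synthetic (a noise-free sine curve), not well data. All function names are hypothetical, and the errors are measured in-sample only.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (single predictor)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def kernel_smoother(xs, ys, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel (assumed model)."""
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict

def mse(model, xs, ys):
    """Mean squared error of `model` on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic nonlinear response: one predictor, one response variable.
xs = [i / 10 for i in range(63)]      # 0.0, 0.1, ..., 6.2
ys = [math.sin(x) for x in xs]

linear = fit_line(xs, ys)
smooth = kernel_smoother(xs, ys, bandwidth=0.3)

linear_mse = mse(linear, xs, ys)
smooth_mse = mse(smooth, xs, ys)
print(f"linear MSE: {linear_mse:.4f}, kernel smoother MSE: {smooth_mse:.4f}")
```

On this curved response the smoother tracks the shape that a straight line cannot. A fair comparison on real data would use held-out wells and a tuned bandwidth.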

Influencing Factors of e-Commerce Enterprise Development Based on Mobile Computing Big Data Analysis

Wireless Communications and Mobile Computing ◽ 10.1155/2021/8750111 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-12

Author(s): Yixue Zhu ◽ Boyue Chai

Keyword(s): Big Data ◽ Data Analysis ◽ Large Scale ◽ Big Data Analysis ◽ Support Vector ◽ Data Sets ◽ Large Scale Data ◽ Vector Machines ◽ Physical Information ◽ Scale Data

With the development of increasingly advanced information technology and electronic technology, especially with regard to physical information systems, cloud computing systems, and social services, big data is becoming widely visible, creating benefits for people while also posing huge challenges. In addition, with the advent of the era of big data, the scale of data sets is getting larger and larger. Traditional data analysis methods can no longer solve the problem of large-scale data sets, and digging out the hidden information behind big data, especially in the field of e-commerce, has become a key factor in competition among enterprises. We use a support vector machine method based on parallel computing to analyze the data. First, the training samples are divided into several working subsets through the SOM self-organizing neural network classification method. The training results of each working set are then merged, so as to quickly deal with the problem of massive data prediction and analysis. This paper proposes that big data has the flexibility of expansion and quality assessment system, so it is meaningful to replace the double-sidedness of quality assessment with big data. Finally, considering the excellent performance of parallel support vector machines in data mining and analysis, we apply this method to the big data analysis of e-commerce. The research results show that parallel support vector machines can solve the problem of processing large-scale data sets. The emergence of data dirty problems has increased the effective rate by at least 70%.

Tweaking Business Planning With Artificial Intelligence

International Journal of Business Strategy and Automation ◽ 10.4018/ijbsa.288541 ◽ 2021 ◽ Vol 2 (4) ◽ pp. 1-22

Author(s): Jing Rui Chen ◽ P. S. Joseph Ng

Keyword(s): Artificial Intelligence ◽ Big Data ◽ Data Analysis ◽ Learning Experience ◽ Compulsory Education ◽ Large Data ◽ Data Sets ◽ Chinese Market ◽ Artificial Intelligence Technology ◽ Education Support

Griffith AI&BD is a technology company that uses a big data platform and artificial intelligence technology to produce products for schools. The company focuses on primary and secondary school education support, data analysis assistance systems, and campus artificial intelligence products for the compulsory education stage in the Chinese market. Through big data, machine learning and data mining, systems scattered and distributed across campus enable anyone to sign up to join the huge data processing grid and access learning-support big data analysis and matching, helping students expand their knowledge in a variety of disciplines. The company aims to improve the learning process based on large data sets of students and to combine AI technology to develop AI electronic devices, so as to provide schools with the best learning experience to survive in a competitive world.

Data Classification

Advances in Business Information Systems and Analytics - Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence ◽ 10.4018/978-1-5225-2031-3.ch003 ◽ 2017 ◽ pp. 34-51 ◽ Cited By ~ 2

Author(s): A. Sheik Abdullah ◽ R. Suganya ◽ S. Selvakumar ◽ S. Rajaram

Keyword(s): Big Data ◽ Data Analysis ◽ Data Classification ◽ Large Data ◽ Classification Problem ◽ Data Sets ◽ Credit Risk Analysis ◽ Predicted Values ◽ Class Labels ◽ The One

Classification is one of the data analysis techniques that can be used across many applications. A classification model predicts categorical class labels. Clustering mainly deals with grouping of variables based upon similar characteristics. Classification models are tested by comparing the predicted values to the known target values in a set of test data. Data classification has many applications in business modeling, marketing analysis, credit risk analysis, biomedical engineering and drug response modeling. Extending data analysis and classification provides insight into big data, with an exploration of processing and managing large data sets. This chapter deals with various techniques and methodologies that correspond to the classification problem in the data analysis process and their methodological impact on big data.
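The evaluation step described in the abstract, comparing predicted values with known target values on test data, can be sketched as follows. The 1-nearest-neighbour classifier, the two-feature toy points, and the credit-risk labels are all illustrative assumptions and are not taken from the chapter.

```python
from collections import Counter

def nearest_neighbor_predict(train, point):
    """Return the label of the training example closest to `point`."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda example: sq_dist(example[0], point))
    return label

def evaluate(train, test):
    """Compare predictions with known targets; return accuracy and confusion counts."""
    predicted = [nearest_neighbor_predict(train, features) for features, _ in test]
    actual = [label for _, label in test]
    correct = sum(p == a for p, a in zip(predicted, actual))
    confusion = Counter(zip(actual, predicted))  # keys are (actual, predicted)
    return correct / len(test), confusion

# Hypothetical labelled data: (features, class label).
train = [((1.0, 1.0), "low_risk"), ((1.5, 2.0), "low_risk"),
         ((5.0, 5.0), "high_risk"), ((6.0, 5.5), "high_risk")]
test = [((1.2, 1.1), "low_risk"), ((5.5, 5.0), "high_risk"),
        ((3.0, 3.0), "high_risk")]

accuracy, confusion = evaluate(train, test)
print(f"accuracy: {accuracy:.2f}")
print(dict(confusion))
```

The confusion counts show which class is being mistaken for which, something a single accuracy figure hides.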

A Comparison of Machine Learning Algorithms of Big Data for Time Series Forecasting Using Python

Open Source Software for Statistical Analysis of Big Data - Advances in Computer and Electrical Engineering ◽ 10.4018/978-1-7998-2768-9.ch007 ◽ 2020 ◽ pp. 197-218

Author(s): Son Nguyen ◽ Anthony Park

Keyword(s): Machine Learning ◽ Neural Networks ◽ Time Series ◽ Big Data ◽ Data Analysis ◽ Big Data Analysis ◽ Time Series Forecasting ◽ Machine Learning Algorithms ◽ Time Series Models ◽ Data Sets

This chapter compares the performance of multiple Big Data techniques applied to time series forecasting with that of traditional time series models on three Big Data sets. The traditional time series models, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing models, are used as the baseline models against Big Data analysis methods in machine learning. These Big Data techniques include regression trees, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), and long short-term memory neural networks (LSTM). Across the three time series data sets used (unemployment rate, bike rentals, and transportation), this study finds that LSTM neural networks performed the best. In conclusion, this study points out that Big Data machine learning algorithms applied to time series can outperform traditional time series models. The computations in this work are done in Python, one of the most popular open-source platforms for data science and Big Data analysis.
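For readers unfamiliar with the baseline side of this comparison, the sketch below shows simple exponential smoothing producing one-step-ahead forecasts, scored with mean absolute error. The abstract does not say which exponential smoothing variant the chapter uses, so this is an assumed minimal form, and the series and smoothing factor are made up for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecasts; forecasts[t] is the prediction for series[t].

    The level is initialised to the first observation, so forecasts[0]
    simply equals series[0] and carries no information.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    forecasts = [level]
    for value in series[:-1]:
        # Blend the newest observation with the previous level.
        level = alpha * value + (1 - alpha) * level
        forecasts.append(level)
    return forecasts

def mean_absolute_error(actual, predicted):
    """Average absolute forecast error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical series, not one of the chapter's data sets.
series = [10, 12, 11, 13]
forecasts = simple_exponential_smoothing(series, alpha=0.5)

# Skip the first point, whose forecast is only the initial level.
mae = mean_absolute_error(series[1:], forecasts[1:])
print(forecasts, mae)
```

Any candidate model, whether a regression tree or an LSTM, can be scored with the same error function on the same held-out points, which is what makes a baseline comparison meaningful.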

Big Data Analysis

Advancing the Power of Learning Analytics and Big Data in Education - Advances in Educational Technologies and Instructional Design ◽ 10.4018/978-1-7998-7103-3.ch010 ◽ 2021 ◽ pp. 208-233

Author(s): Arpit Kumar Sharma ◽ Arvind Dhaka ◽ Amita Nandal ◽ Kumar Swastik ◽ Sunita Kumari

Keyword(s): Cloud Computing ◽ Big Data ◽ Data Analysis ◽ Knowledge Extraction ◽ Big Data Analysis ◽ Unstructured Data ◽ Data Sets ◽ Data Handling ◽ Software And Hardware

The meaning of the term “big data” can be inferred from its name (i.e., the collection of large structured or unstructured data sets). In addition to their huge quantity, these data sets are so complex that they cannot be analyzed using conventional data handling software and hardware tools. If processed judiciously, big data can prove to be a huge advantage for the industries using it. Due to its usefulness, studies are being conducted to create methods to handle big data. Knowledge extraction from big data is very important; without it, there is no purpose in accumulating such volumes of data. Cloud computing is a powerful tool that provides a platform for the storage and computation of massive amounts of data.

Big Data Visualization Tools and Techniques

10.4018/978-1-6684-3662-2.ch028 ◽ 2022 ◽ pp. 590-621

Author(s): Obinna Chimaobi Okechukwu

Keyword(s): Big Data ◽ Data Analysis ◽ Data Visualization ◽ Data Sets ◽ New Paradigm ◽ Volume Velocity ◽ Big Data Visualization ◽ Visualization Tools ◽ Tools And Techniques ◽ Data Analysis Methods

In this chapter, a discussion is presented on the latest tools and techniques available for Big Data Visualization. These tools, techniques and methods need to be understood appropriately to analyze Big Data. Big Data is a whole new paradigm where huge sets of data are generated and analyzed based on volume, velocity and variety. Conventional data analysis methods are incapable of processing data of this dimension; hence, it is fundamentally important to be familiar with new tools and techniques capable of processing these datasets. This chapter will illustrate tools available for analysts to process and present Big Data sets in ways that can be used to make appropriate decisions. Some of these tools (e.g., Tableau, RapidMiner, R Studio, etc.) have phenomenal capabilities to visualize processed data in ways traditional tools cannot. The chapter will also aim to explain the differences between these tools and their utilities based on scenarios.

Using visual analytics to make sense of railway Close Calls

Proceedings of the Institution of Mechanical Engineers Part F Journal of Rail and Rapid Transit ◽ 10.1177/0954409716676221 ◽ 2016 ◽ Vol 231 (10) ◽ pp. 1107-1114

Author(s): Miguel Figueres-Esteban ◽ Peter Hughes ◽ Coen van Gulijk

Keyword(s): Big Data ◽ Data Analysis ◽ Visual Analytics ◽ Near Miss ◽ Test Case ◽ Data Sets ◽ Complex Data ◽ Network Text Analysis ◽ Complex Data Sets ◽ Close Call

In the big data era, large and complex data sets will exceed scientists’ capacity to make sense of them in the traditional way. New approaches in data analysis, supported by computer science, will be necessary to address the problems that emerge with the rise of big data. The analysis of the Close Call database, which is a text-based database for near-miss reporting on the GB railways, provides a test case. The traditional analysis of Close Calls is time consuming and prone to differences in interpretation. This paper investigates the use of visual analytics techniques, based on network text analysis, to conduct data analysis and extract safety knowledge from 500 randomly selected Close Call records relating to worker slips, trips and falls. The results demonstrate a straightforward, yet effective, way to identify hazardous conditions without having to read each report individually. This opens up new ways to perform data analysis in safety science.
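A rough idea of what network text analysis involves can be given with a word co-occurrence network, in which two words are linked each time they appear in the same report. This is only one simple reading of the technique and is not the paper's method. The three records and the stopword list are invented for illustration and do not come from the Close Call database.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "on", "in", "of", "and", "near", "was", "to"}

def cooccurrence_network(records):
    """Count how often each pair of distinct words shares a record.

    Returns a Counter mapping (word_a, word_b) edges, with word_a < word_b,
    to the number of records containing both words.
    """
    edges = Counter()
    for text in records:
        words = sorted(set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS)
        edges.update(combinations(words, 2))
    return edges

# Hypothetical free-text reports.
records = [
    "Worker slipped on wet platform",
    "Wet leaves on platform near stairs",
    "Worker tripped on loose cable",
]

network = cooccurrence_network(records)
strongest = network.most_common(1)
print(strongest)
```

Heavily weighted edges, such as "platform" with "wet" here, point to recurring conditions without anyone reading each report. A visual analytics tool would draw these edges as a graph.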

Epilogue

Topology: A Very Short Introduction ◽ 10.1093/actrade/9780198832683.003.0007 ◽ 2019 ◽ pp. 128-130

Author(s): Richard Earl

Keyword(s): Big Data ◽ Data Analysis ◽ Research Area ◽ General Topology ◽ Topological Data Analysis ◽ Data Sets ◽ Current Interest ◽ And Topology ◽ Active Research ◽ Active Research Area

Topology remains a large, active research area in mathematics. Unsurprisingly its character has changed over the last century—there is considerably less current interest in general topology, but whole new areas have emerged, such as topological data analysis to help analyze big data sets. The Epilogue concludes that the interfaces of topology with other areas have remained rich and numerous, and it can be hard telling where topology stops and geometry or algebra or analysis or physics begin. Often that richness comes from studying structures that have interconnected flavours of algebra, geometry, and topology, but sometimes a result, seemingly of an entirely algebraic nature say, can be proved by purely topological means.
