Computational science: shifting the focus from tools to models

Computational techniques have revolutionized many aspects of scientific research over the last few decades. Experimentalists use computation for data analysis, processing ever bigger data sets. Theoreticians compute predictions from ever more complex models. However, traditional articles do not permit the publication of big data sets or complex models. As a consequence, these crucial pieces of information no longer enter the scientific record. Moreover, they have become prisoners of scientific software: many models exist only as software implementations, and the data are often stored in proprietary formats defined by the software. In this article, I argue that this emphasis on software tools over models and data is detrimental to science in the long term, and I propose a means by which this can be reversed.

Research on the Promotion Path of Teachers’ Scientific Research and Innovation Ability based on Big Data Analysis of “Double High Program” Construction

Journal of Physics Conference Series ◽

10.1088/1742-6596/1744/4/042093 ◽

2021 ◽

Vol 1744 (4) ◽

pp. 042093

Author(s):

Yuanzhi Huang

Keyword(s):

Big Data ◽

Data Analysis ◽

Scientific Research ◽

Big Data Analysis ◽

Research And Innovation ◽

Innovation Ability ◽

Program Construction

A multilevel approach to big data analysis using analytic tools and actor network theory

SA Journal of Information Management ◽

10.4102/sajim.v20i1.914 ◽

2018 ◽

Vol 20 (1) ◽

Cited By ~ 4

Author(s):

Tiko Iyamu

Keyword(s):

Big Data ◽

Data Analysis ◽

Network Theory ◽

Data Analytics ◽

Big Data Analytics ◽

Actor Network Theory ◽

Big Data Analysis ◽

Data Sets ◽

Multilevel Approach ◽

Actor Network

Background: Over the years, big data analytics has been statically carried out in a programmed way, which does not allow for translation of data sets from a subjective perspective. This approach affects an understanding of why and how data sets manifest themselves in the various forms that they do. This has a negative impact on the accuracy, redundancy and usefulness of data sets, which in turn affects the value of operations and the competitive effectiveness of an organisation. Also, the current single approach lacks the detailed examination of data sets that big data deserve in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis. This includes examining how a sociotechnical theory, the actor network theory (ANT), can be used in a complementary way with analytic tools for big data analysis. Method: In the study, qualitative methods were employed from an interpretivist perspective. Results: From the findings, a framework that offers big data analytics at two levels, micro- (strategic) and macro- (operational) levels, was developed. Based on the framework, a model was developed, which can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspective for improved organisational competitiveness.

Long-Term Spectrum Monitoring with Big Data Analysis and Machine Learning for Cloud-Based Radio Access Networks

Wireless Personal Communications ◽

10.1007/s11277-015-2631-8 ◽

2015 ◽

Vol 87 (3) ◽

pp. 815-835 ◽

Cited By ~ 10

Author(s):

Pavel Baltiiski ◽

Ilia Iliev ◽

Boian Kehaiov ◽

Vladimir Poulkov ◽

Todor Cooklev

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Access Networks ◽

Spectrum Monitoring ◽

Radio Access Networks ◽

Radio Access

A Relationship Model between Teachers’ Scientific Research Output and Teaching Ability based on Big Data Analysis

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/750/1/012085 ◽

2020 ◽

Vol 750 ◽

pp. 012085

Author(s):

Mingxing Wang

Keyword(s):

Big Data ◽

Data Analysis ◽

Research Output ◽

Scientific Research ◽

Big Data Analysis ◽

Scientific Research Output ◽

Teaching Ability ◽

Relationship Model

Development of a Powerful Data-Analysis Tool Using Nonparametric Smoothing Models To Identify Drillsites in Tight Shale Reservoirs With High Economic Potential

SPE Journal ◽

10.2118/189440-pa ◽

2017 ◽

Vol 23 (03) ◽

pp. 719-736 ◽

Cited By ~ 2

Author(s):

Quan Cai ◽

Wei Yu ◽

Hwa Chi Liang ◽

Jenn-Tai Liang ◽

Suojin Wang ◽

...

Keyword(s):

Big Data ◽

Data Analysis ◽

Predictive Power ◽

Oil And Gas ◽

Predictor Variables ◽

Data Sets ◽

Analysis Tool ◽

Data Set ◽

Data Analysis Tool ◽

Shale Reservoirs

Summary: The oil-and-gas industry is entering an era of “big data” because of the huge number of wells drilled with the rapid development of unconventional oil-and-gas reservoirs during the past decade. The massive amount of data generated presents a great opportunity for the industry to use data-analysis tools to help make informed decisions. The main challenge is the lack of the application of effective and efficient data-analysis tools to analyze and extract useful information for the decision-making process from the enormous amount of data available. In developing tight shale reservoirs, it is critical to have an optimal drilling strategy, thereby minimizing the risk of drilling in areas that would result in low-yield wells. The objective of this study is to develop an effective data-analysis tool capable of dealing with big and complicated data sets to identify hot zones in tight shale reservoirs with the potential to yield highly productive wells. The proposed tool is developed on the basis of nonparametric smoothing models, which are superior to the traditional multiple-linear-regression (MLR) models in both the predictive power and the ability to deal with nonlinear, higher-order variable interactions. This data-analysis tool is capable of handling one response variable and multiple predictor variables. To validate our tool, we used two real data sets, one with 249 tight oil horizontal wells from the Middle Bakken and the other with 2,064 shale gas horizontal wells from the Marcellus Shale. Results from the two case studies revealed that our tool not only can achieve much better predictive power than the traditional MLR models on identifying hot zones in the tight shale reservoirs but also can provide guidance on developing the optimal drilling and completion strategies (e.g., well length and depth, amount of proppant and water injected).
By comparing results from the two data sets, we found that our tool can achieve model performance with the big data set (2,064 Marcellus wells) with only four predictor variables that is similar to that with the small data set (249 Bakken wells) with six predictor variables. This implies that, for big data sets, even with a limited number of available predictor variables, our tool can still be very effective in identifying hot zones that would yield highly productive wells. The data sets that we have access to in this study contain very limited completion, geological, and petrophysical information. Results from this study clearly demonstrated that the data-analysis tool is certainly powerful and flexible enough to take advantage of any additional engineering and geology data to allow the operators to gain insights on the impact of these factors on well performance.
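The contrast between nonparametric smoothing and linear regression drawn in this abstract can be illustrated with a minimal sketch. The abstract does not say which smoothing model the authors use, so the Nadaraya-Watson kernel smoother below, the Gaussian kernel, the bandwidth of 0.2, and the synthetic quadratic data are all assumptions chosen for illustration, not the authors' method or data. A single predictor is used for brevity.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (the normalizing constant cancels out)."""
    return math.exp(-0.5 * u * u)


def kernel_smooth(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate: a kernel-weighted average of the responses."""
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in x_train]
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)


def linear_fit(x_train, y_train):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x_train)
    mean_x = sum(x_train) / n
    mean_y = sum(y_train) / n
    sxx = sum((x - mean_x) ** 2 for x in x_train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_train, y_train))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, purely illustrative data with a nonlinear response: y = x^2.
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x for x in xs]

slope, intercept = linear_fit(xs, ys)
linear_at_zero = slope * 0.0 + intercept
smooth_at_zero = kernel_smooth(xs, ys, 0.0, bandwidth=0.2)

# The true response at x = 0 is 0; compare the two estimates' errors.
print(f"linear estimate at 0: {linear_at_zero:.3f}")
print(f"kernel estimate at 0: {smooth_at_zero:.3f}")
```

On this symmetric quadratic the straight line is nearly flat and misses the minimum, while the local weighted average follows the curvature. How large the advantage is on real well data depends on the bandwidth and on the data themselves.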

Influencing Factors of e-Commerce Enterprise Development Based on Mobile Computing Big Data Analysis

Wireless Communications and Mobile Computing ◽

10.1155/2021/8750111 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yixue Zhu ◽

Boyue Chai

Keyword(s):

Big Data ◽

Data Analysis ◽

Large Scale ◽

Big Data Analysis ◽

Support Vector ◽

Data Sets ◽

Large Scale Data ◽

Vector Machines ◽

Physical Information ◽

Scale Data

With the development of increasingly advanced information technology and electronic technology, especially with regard to physical information systems, cloud computing systems, and social services, big data will be widely visible, creating benefits for people while at the same time posing huge challenges. In addition, with the advent of the era of big data, the scale of data sets is getting larger and larger. Traditional data analysis methods can no longer solve the problem of large-scale data sets, and digging out the hidden information behind big data, especially in the field of e-commerce, has become a key factor in competition among enterprises. We use a support vector machine method based on parallel computing to analyze the data. First, the training samples are divided into several working subsets through the SOM self-organizing neural network classification method. The training results of each working set are then merged, so as to quickly deal with the problem of massive data prediction and analysis. This paper proposes that big data offers flexibility of expansion and a quality assessment system, so it is meaningful to replace the double-sidedness of quality assessment with big data. Finally, considering the excellent performance of parallel support vector machines in data mining and analysis, we apply this method to the big data analysis of e-commerce. The research results show that parallel support vector machines can solve the problem of processing large-scale data sets. The emergence of data dirty problems has increased the effective rate by at least 70%.

Tweaking Business Planning With Artificial Intelligence

International Journal of Business Strategy and Automation ◽

10.4018/ijbsa.288541 ◽

2021 ◽

Vol 2 (4) ◽

pp. 1-22

Author(s):

Jing Rui Chen ◽

P. S. Joseph Ng

Keyword(s):

Artificial Intelligence ◽

Big Data ◽

Data Analysis ◽

Learning Experience ◽

Compulsory Education ◽

Large Data ◽

Data Sets ◽

Chinese Market ◽

Artificial Intelligence Technology ◽

Education Support

Griffith AI&BD is a technology company that uses a big data platform and artificial intelligence technology to produce products for schools. The company focuses on primary and secondary school education support, data analysis assistance systems, and campus artificial intelligence products for the compulsory education stage in the Chinese market. Through big data, machine learning and data mining, systems distributed across the campus enable anyone to sign up to join a large data-processing grid and to access learning support, big data analysis and matching, helping students expand their knowledge in a variety of disciplines. The company aims to improve the learning process based on large data sets of students, to combine AI technology with the development of AI electronic devices, and to provide schools with the best learning experience to survive in a competitive world.

Structure and Dynamics of Many-Particle Systems: Big Data Sets and Data Analysis

Big Data for Remote Sensing: Visualization, Analysis and Interpretation ◽

10.1007/978-3-319-89923-7_3 ◽

2018 ◽

pp. 61-97

Author(s):

Wolfram Schommers

Keyword(s):

Big Data ◽

Data Analysis ◽

Particle Systems ◽

Data Sets ◽

Structure And Dynamics

Developing sustainable software solutions for bioinformatics by the “Butterfly” paradigm

F1000Research ◽

10.12688/f1000research.3681.1 ◽

2014 ◽

Vol 3 ◽

pp. 71 ◽

Cited By ~ 8

Author(s):

Zeeshan Ahmed ◽

Saman Zeeshan ◽

Thomas Dandekar

Keyword(s):

Software Engineering ◽

Data Representation ◽

Data Sets ◽

Complex Data ◽

Scientific Software ◽

Rapid Changes ◽

Complex Data Sets ◽

Key Steps ◽

User Friendly

Software design and sustainable software engineering are essential for the long-term development of bioinformatics software. Typical challenges in an academic environment are short-term contracts, island solutions, pragmatic approaches and loose documentation. Upcoming new challenges are big data, complex data sets, software compatibility and rapid changes in data representation. Our approach to cope with these challenges consists of iterative intertwined cycles of development (“Butterfly” paradigm) for key steps in scientific software engineering. User feedback is valued as well as software planning in a sustainable and interoperable way. Tool usage should be easy and intuitive. A middleware supports a user-friendly Graphical User Interface (GUI) as well as a database/tool development independently. We validated the approach of our own software development and compared the different design paradigms in various software solutions.

Data Classification

Advances in Business Information Systems and Analytics - Handbook of Research on Advanced Data Mining Techniques and Applications for Business Intelligence ◽

10.4018/978-1-5225-2031-3.ch003 ◽

2017 ◽

pp. 34-51 ◽

Cited By ~ 2

Author(s):

A. Sheik Abdullah ◽

R. Suganya ◽

S. Selvakumar ◽

S. Rajaram

Keyword(s):

Big Data ◽

Data Analysis ◽

Data Classification ◽

Large Data ◽

Classification Problem ◽

Data Sets ◽

Credit Risk Analysis ◽

Predicted Values ◽

Class Labels ◽

The One

Classification is considered to be one of the data analysis techniques that can be used across many applications. A classification model predicts categorical class labels. Clustering mainly deals with grouping variables based upon similar characteristics. Classification models are tested by comparing the predicted values with the known target values in a set of test data. Data classification has many applications in business modeling, marketing analysis, credit risk analysis, biomedical engineering and drug response modeling. Extending data analysis and classification gives insight into big data, including the processing and management of large data sets. This chapter deals with various techniques and methodologies that correspond to the classification problem in the data analysis process, and with its methodological impacts on big data.
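The testing step described in this abstract, comparing predicted values with known target values on a test set, can be sketched as a plain accuracy calculation. The function name `accuracy` and the credit-risk-style labels below are hypothetical and chosen only for illustration; the chapter itself may use other evaluation measures.

```python
def accuracy(predicted, actual):
    """Fraction of test cases where the predicted label equals the known target."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("test set must not be empty")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical test data: known class labels and a model's predictions.
known_targets = ["good", "bad", "good", "good", "bad"]
predictions = ["good", "bad", "bad", "good", "bad"]

score = accuracy(predictions, known_targets)
print(f"accuracy: {score:.2f}")
```

Accuracy is only one possible summary. When the classes are imbalanced, as is common in credit risk data, per-class measures such as precision and recall are usually reported alongside it.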