Complex Big Data Analysis Based on Multi-granularity Generalized Functions

2018 · Vol 14 (04) · pp. 43
Author(s): Zhang Xueya, Jianwei Zhang

A new method for big data analysis, the multi-granularity generalized functions data model (MGGF), is proposed. The method applies a dynamic adaptive multi-granularity clustering technique, transforms the grid-like "hard partitioning" of the input data space used by the generalized functions data model (GFDM) into a multi-granularity partitioning, and identifies multi-granularity pattern classes in the input data space. By defining the type of mapping relationship between the multi-granularity pattern classes and the decision categories, f_type: C_i → y, together with the Degree of Fulfillment, DoF(x), of the input data with respect to the classification rules of the various pattern classes, the corresponding MGGF model is established. Experimental results on different data sets show that, compared with GFDM, the proposed method has better data summarization ability, stronger handling of noisy data, and higher search efficiency.
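The DoF-based decision rule described in the abstract can be illustrated as follows. This is a minimal sketch, not the paper's exact MGGF formulation: the Gaussian membership functions, the per-attribute product combination, and the example pattern classes are all assumptions made for illustration.

```python
import math

def membership(x, center, width):
    """Gaussian membership of one scalar attribute to a pattern class."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def degree_of_fulfillment(x, pattern_class):
    """DoF(x): product of per-attribute memberships to a single pattern class."""
    dof = 1.0
    for xi, (center, width) in zip(x, pattern_class["attrs"]):
        dof *= membership(xi, center, width)
    return dof

def classify(x, pattern_classes):
    """Map input x to the label of the pattern class with maximal DoF."""
    best = max(pattern_classes, key=lambda pc: degree_of_fulfillment(x, pc))
    return best["label"]

# Two illustrative pattern classes (centers and widths are invented).
classes = [
    {"label": "A", "attrs": [(0.0, 1.0), (0.0, 1.0)]},
    {"label": "B", "attrs": [(5.0, 1.0), (5.0, 1.0)]},
]
print(classify([0.2, -0.1], classes))  # → A
```

An input near a class's centers yields a DoF close to 1 for that class and a vanishingly small DoF for distant classes, so the arg-max decision is stable.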

2018 · Vol 20 (1)
Author(s): Tiko Iyamu

Background: Over the years, big data analytics has been carried out statically, in a programmed way that does not allow data sets to be interpreted from a subjective perspective. This approach limits understanding of why and how data sets manifest themselves in the forms that they do, which negatively affects the accuracy, redundancy and usefulness of data sets and, in turn, the value of operations and the competitive effectiveness of an organisation. The current single-level approach also lacks the detailed examination of data sets that big data deserves if purposefulness and usefulness are to improve.

Objective: The purpose of this study was to propose a multilevel approach to big data analysis, including an examination of how a sociotechnical theory, actor-network theory (ANT), can be used to complement analytic tools for big data analysis.

Method: Qualitative methods were employed from an interpretivist perspective.

Results: From the findings, a framework was developed that offers big data analytics at two levels, micro- (strategic) and macro- (operational). Based on the framework, a model was developed that can guide the analysis of the heterogeneous data sets that exist within networks.

Conclusion: The multilevel approach ensures a fully detailed analysis, intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspective for improved organisational competitiveness.


2021 · Vol 2021 · pp. 1-12
Author(s): Yixue Zhu, Boyue Chai

With the development of increasingly advanced information and electronic technology, particularly physical information systems, cloud computing systems, and social services, big data is becoming ubiquitous, creating benefits for people while also posing huge challenges. With the advent of the big data era, data sets keep growing in scale; traditional data analysis methods can no longer cope with large-scale data sets, and mining the hidden information behind big data, especially in the field of e-commerce, has become a key factor in competition among enterprises. This paper analyzes such data with a support vector machine method based on parallel computing. First, the training samples are divided into several working subsets by the SOM self-organizing neural network classification method; each subset is trained separately, and the training results of the working subsets are then merged, so that massive data prediction and analysis problems can be handled quickly. The paper argues that big data offers flexibility of scale and a quality assessment system, so it is meaningful to use big data to offset the one-sidedness of quality assessment. Finally, given the strong performance of parallel support vector machines in data mining and analysis, the method is applied to big data analysis in e-commerce. The results show that parallel support vector machines can handle large-scale data sets and that the treatment of dirty data raises the effective rate by at least 70%.
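The partition-train-merge scheme described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: random partitioning stands in for the SOM classification step, a Pegasos-style subgradient solver stands in for the SVM training, and merging is done by simple averaging of the per-subset models.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent for a linear SVM (labels in {-1, +1})."""
    w = np.zeros(X.shape[1])
    b = 0.0
    t = 0
    for _ in range(epochs):
        for i in np.random.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w + b) < 1:      # margin violated: hinge gradient step
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                              # only the regularizer contributes
                w = (1 - eta * lam) * w
    return w, b

def parallel_svm(X, y, n_subsets=4, seed=0):
    """Split the training set into working subsets, train an SVM on each,
    then merge the training results of the subsets by model averaging."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    models = [train_linear_svm(X[part], y[part])
              for part in np.array_split(idx, n_subsets)]
    w = np.mean([m[0] for m in models], axis=0)
    b = float(np.mean([m[1] for m in models]))
    return w, b

def predict(w, b, X):
    return np.sign(X @ w + b)
```

In practice the per-subset training loop is what would be distributed across workers; averaging is only one of several ways to merge the subset models.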


Author(s): Son Nguyen, Anthony Park

This chapter compares the performance of multiple Big Data techniques for time series forecasting against traditional time series models on three Big Data sets. The traditional models, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing, serve as baselines against machine learning-based Big Data methods. The Big Data techniques include regression trees, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), and Long Short-Term Memory (LSTM) neural networks. Across the three time series data sets used (unemployment rate, bike rentals, and transportation), the study finds that LSTM neural networks performed best. In conclusion, the study points out that Big Data machine learning algorithms applied to time series can outperform traditional time series models. The computations in this work are done in Python, one of the most popular open-source platforms for data science and Big Data analysis.
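Of the baseline models named above, simple exponential smoothing is easy to sketch in pure Python. The walk-forward evaluation below uses an invented toy series, since the chapter's data sets are not reproduced here; ARIMA and the neural models require dedicated libraries and are omitted.

```python
def simple_exp_smoothing(series, alpha=0.3):
    """Simple exponential smoothing: level_t = alpha*y_t + (1-alpha)*level_{t-1}.
    Returns the one-step-ahead forecast after the last observation."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def rmse(forecasts, actuals):
    return (sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)) ** 0.5

# Walk-forward evaluation: forecast one step, then reveal the actual value.
series = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16]
train, holdout = series[:7], series[7:]
forecasts = []
history = list(train)
for actual in holdout:
    forecasts.append(simple_exp_smoothing(history))
    history.append(actual)
print(round(rmse(forecasts, holdout), 2))  # → 2.09
```

The same walk-forward loop is the standard way to compare any of the forecasting models mentioned in the chapter on equal footing.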


Author(s): Arpit Kumar Sharma, Arvind Dhaka, Amita Nandal, Kumar Swastik, Sunita Kumari

The meaning of the term "big data" can be inferred from the name itself: the collection of large structured or unstructured data sets. Beyond their huge quantity, these data sets are so complex that they cannot be analyzed with conventional data handling software and hardware tools. Processed judiciously, big data can be a huge advantage for the industries that use it, and because of this usefulness, studies are under way to create methods for handling it. Knowledge extraction from big data is essential; without it, there is no purpose in accumulating such volumes of data. Cloud computing is a powerful tool that provides a platform for the storage and computation of massive amounts of data.


2016 · Vol 4 (3) · pp. 1-21
Author(s): Sungchul Lee, Eunmin Hwang, Ju-Yeon Jo, Yoohwan Kim

Due to the advancement of Information Technology (IT), the hospitality industry sees great value in gathering large amounts of diverse customer data. However, many hotels face a challenge in analyzing customer data and using it as an effective tool to understand hospitality customers better and, ultimately, to increase revenue. The authors' research attempts to resolve the current challenges of analyzing customer data in hospitality by utilizing big data analysis tools, especially Hadoop, a framework for processing large-scale data, and R. With this approach, their study demonstrates ways of aggregating and analyzing hospitality customer data to find meaningful customer information. Multiple decision trees are constructed from the customer data sets with the intention of classifying customers' needs and customer clusters. By analyzing the customer data, the study suggests three strategies to increase customers' total expenditure within the limited time of their stay.
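The kind of split a decision tree makes when segmenting customers can be sketched as below. This is a single-threshold illustration in plain Python with invented toy data, not the authors' Hadoop/R pipeline; a full tree applies the same search recursively to each resulting group.

```python
def gini(labels):
    """Gini impurity of a label list: 0 means pure, higher means mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Find the single threshold on one feature that minimizes the
    weighted Gini impurity of the two resulting groups."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Invented toy data: nightly spend vs. a customer-segment label.
spend = [40, 55, 60, 180, 200, 220]
label = ["standard", "standard", "standard", "premium", "premium", "premium"]
print(best_split(spend, label))  # → 60
```

The threshold 60 cleanly separates the two invented segments; on real hotel data the same search runs over every feature and the best (feature, threshold) pair becomes a tree node.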


Author(s): Mohammad Hossein Fazel Zarandi, Reyhaneh Gamasaee

Big data is a now-ubiquitous term for massive data sets with large, varied, and complex structure, whose storage, analysis, and visualization pose difficulties for further processing or results. The use of Big Data in health is a new and exciting field, and a wide range of use cases for Big Data and analytics in healthcare will benefit best-practice development, outcomes analysis, prediction, and surveillance. The aim of this chapter is therefore to provide an overview of Big Data in healthcare systems, including two applications of Big Data analysis in healthcare: first, understanding disease outcomes by analyzing Big Data, and second, the application of Big Data in the genetic, biological, and molecular fields. The characteristics and challenges of healthcare Big Data analysis, as well as the technologies and software used for it, are also reviewed.


2020 · Vol 166 · pp. 13027
Author(s): Anzhela Ignatyuk, Olena Liubkina, Tetiana Murovana, Alina Magomedova

A driving force of human societal development is the elimination of the contradiction between the unlimited use of natural resources in the economic activity of enterprises, the environmental pollution that results from such activity, and the limited natural, energy, and other resources available. Research on the economic and environmental issues of green business management showed several basic types of problems that currently arise when enterprises collect and process data on the results of their activities. The authors analyzed how the public sector and green business are catching up with the global trend towards broader use of big data analysis to serve public interests and increase the efficiency of business activities. To detect the current approaches to big data analysis in the public and private sectors, the authors conducted interviews with stakeholders. The paper concludes with an analysis of what changes in approaches to big data analysis in the public and private sectors must be made to comply with global trends in greening the economy. Applying FinTech, methods for processing large data sets, and tools for implementing the principles of a green economy will increase the investment attractiveness of green business and simplify the interaction between the state and enterprises.


Web Services · 2019 · pp. 788-802
Author(s): Mrutyunjaya Panda

Big Data, due to its complicated and diverse nature, poses many challenges for extracting meaningful observations. This calls for smart and efficient algorithms that can cope with the computational complexity and memory demands arising from their iterative behavior. The issue can be addressed with parallel computing techniques, in which a single machine or multiple machines work simultaneously, dividing the problem into subproblems and assigning private memory to each subproblem. Clustering analysis has proved useful for handling such huge data in the recent past. Although many investigations into Big Data analysis are ongoing, here Canopy and K-Means++ clustering are used to process large-scale data in a shorter amount of time without memory constraints. To assess the suitability of the approach, several data sets are considered, ranging from small to very large and covering diverse fields of application. The experimental results indicate that the proposed approach is fast and accurate.
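The K-Means++ seeding step named above can be sketched as follows. One-dimensional points are used for brevity, and the Canopy pre-clustering stage and any large-scale execution details are omitted; this is an illustrative sketch, not the chapter's implementation.

```python
import random

def kmeanspp_seeds(points, k, seed=0):
    """K-Means++ seeding: choose the first center uniformly at random, then
    each further center with probability proportional to D(p)^2, the squared
    distance from point p to its nearest already-chosen center."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        r = rng.uniform(0, sum(d2))   # sample proportionally to D^2
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

# Two well-separated 1-D groups: under D^2 weighting the second seed lands
# in the opposite group with overwhelming probability.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
print(kmeanspp_seeds(points, 2))
```

Good seeds are what let the subsequent Lloyd iterations converge in few passes, which is the source of the speed-up on large data that the abstract reports.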


Author(s): Yi Li, Kang Li, Xi-Tao Zhang, Shou-Biao Wang
