Research on Computing Efficiency of MapReduce in Big Data Environment

2019 ◽  
Vol 26 ◽  
pp. 03002
Author(s):  
Tilei Gao ◽  
Ming Yang ◽  
Rong Jiang ◽  
Yu Li ◽  
Yao Yao

The emergence of big data has had a great impact on the traditional computing mode; the distributed computing framework represented by MapReduce has become an important solution to this problem. Against this background, this paper studies the principle and framework of MapReduce programming in depth. On that basis, the time consumption of the distributed computing framework MapReduce is compared with that of the traditional computing model through concrete programming experiments. The experiments show that MapReduce has great advantages at large data volumes.
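The paper does not publish its experimental code; the following is a minimal sketch of the canonical Hadoop word-count job, illustrating the map and reduce phases that such a time-consumption comparison rests on (the class and path names are illustrative, not the authors' code).

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map phase: emit (word, 1) for every token in the input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts shuffled to each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) sum += val.get();
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```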

2017 ◽  
Vol 2017 ◽  
pp. 1-7
Author(s):  
Lifeng Yang ◽  
Liangming Chen ◽  
Ningwei Wang ◽  
Zhifang Liao

The shortest path problem is a classic issue, and it becomes even more difficult in a big data environment. Current research on the shortest path problem mainly focuses on finding the shortest path from a given starting point to a given destination; studies of shortest paths under a time limit and through a required set of intermediate nodes are few, yet such problems are common in real life. In this paper we propose several time-dependent optimization algorithms for this problem. Building on traditional backtracking and different node compression methods, we first propose an improved backtracking algorithm for one condition in a big data environment, and then three types of optimization algorithms based on node compression for large data, in order to realize path selection from the starting point, through a given set of nodes, to the destination within a limited time. Consequently, problems involving different data volumes and network structure complexities can be solved by adopting the appropriate algorithm.
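The paper's exact algorithms are not reproduced in the abstract; the sketch below shows only the underlying idea of backtracking with pruning: searching for a path from a start node to an end node that covers a required node set within a time budget. All names and the pruning rule are illustrative assumptions, not the authors' implementation.

```java
import java.util.*;

/** Minimal backtracking sketch: best path from start to end that
 *  visits all required nodes within a time limit. */
public class ConstrainedPathSearch {
    static int[][] t;                 // t[u][v] = travel time, or -1 if no edge
    static Set<Integer> required;     // intermediate nodes that must be visited
    static boolean[] onPath;
    static List<Integer> best = null;
    static int bestTime;

    static void dfs(int u, int end, int elapsed, int limit, List<Integer> path) {
        // Prune: over the time budget, or already no better than the best found.
        if (elapsed > limit || (best != null && elapsed >= bestTime)) return;
        if (u == end) {
            if (path.containsAll(required)) {
                best = new ArrayList<>(path);
                bestTime = elapsed;
            }
            return;
        }
        for (int v = 0; v < t.length; v++) {
            if (t[u][v] >= 0 && !onPath[v]) {
                onPath[v] = true; path.add(v);
                dfs(v, end, elapsed + t[u][v], limit, path);
                path.remove(path.size() - 1); onPath[v] = false;  // backtrack
            }
        }
    }

    public static void main(String[] args) {
        t = new int[][] {
            {-1,  2,  9, -1},
            {-1, -1,  4,  7},
            {-1, -1, -1,  3},
            {-1, -1, -1, -1}};
        required = new HashSet<>(Collections.singletonList(2));
        onPath = new boolean[t.length];
        onPath[0] = true;
        dfs(0, 3, 0, 10, new ArrayList<>(Collections.singletonList(0)));
        System.out.println(best + " time=" + bestTime);  // [0, 1, 2, 3] time=9
    }
}
```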


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Xiang Lin

In the big data environment, the visualization technique has been increasingly adopted to mine library and information (L&I) data, with the diversification of data sources and the growth of data volume. Previous research into the information association of the L&I visualization network has rarely tried to construct such a network or to explore its information associations. To overcome these defects, this paper explores the visualization of L&I from the perspective of big data analysis and fusion. Firstly, the authors analyzed the topology of the L&I visualization network and calculated the metrics for the construction of the L&I visualization topology map. Next, the importance of meta-paths of the L&I visualization network was calculated. Finally, a complex big data L&I visualization network was established, and the associations between information nodes were analyzed in detail. Experimental results verify the effectiveness of the proposed algorithm.
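The abstract does not name the topology metrics used; one common choice for such topology maps is normalized degree centrality, sketched below on a toy adjacency list (the node names are invented for illustration).

```java
import java.util.*;

/** Sketch of one plausible topology metric (normalized degree centrality);
 *  the paper's exact metric set is not given in the abstract. */
public class TopologyMetrics {
    static Map<String, Double> degreeCentrality(Map<String, Set<String>> adj) {
        int n = adj.size();
        Map<String, Double> c = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : adj.entrySet())
            c.put(e.getKey(), e.getValue().size() / (double) (n - 1));  // degree / (n-1)
        return c;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> adj = new HashMap<>();
        adj.put("keywordA", new HashSet<>(Arrays.asList("paper1", "paper2")));
        adj.put("paper1",   new HashSet<>(Arrays.asList("keywordA")));
        adj.put("paper2",   new HashSet<>(Arrays.asList("keywordA")));
        System.out.println(degreeCentrality(adj));  // {keywordA=1.0, paper1=0.5, paper2=0.5}
    }
}
```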


2017 ◽  
Vol 7 (1.1) ◽  
pp. 237
Author(s):  
MD. A R Quadri ◽  
B. Sruthi ◽  
A. D. SriRam ◽  
B. Lavanya

Java is one of the finest languages for big data because of its write-once, run-anywhere nature. The new release, Java 8, introduced features such as lambda expressions and streams, which are helpful for parallel computing. Though these new features help in extracting, sorting, and filtering data from collections and arrays, there are still problems with them. Streams cannot properly process very large data sets such as big data, and there are problems when executing in a distributed environment. The new streams introduced in Java are restricted to computations inside a single system; there is no method for distributed computing over multiple systems. Moreover, streams hold their data in memory and therefore cannot support huge data sets. This paper adapts Java 8 to massive data and to distributed environments by providing extensions to the programming model with distributed streams. Distributed computing over large data may thus be accomplished by introducing distributed stream frameworks.
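As a baseline for the limitation the paper describes, here is a standard Java 8 parallel-stream pipeline; it extracts, filters, and sorts entirely within one JVM, which is exactly the single-system restriction the proposed distributed streams aim to lift (the data values are illustrative).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamDemo {
    public static void main(String[] args) {
        List<String> records = Arrays.asList("b,2", "a,1", "c,3", "a,4");
        // Parallel stream: extract, filter, and sort within a single JVM.
        List<String> keys = records.parallelStream()
                .map(r -> r.split(",")[0])    // extract the key field
                .filter(k -> !k.equals("c"))  // filter
                .distinct()
                .sorted()                     // sort
                .collect(Collectors.toList());
        System.out.println(keys);             // [a, b]
        // Limitation noted in the paper: the pipeline runs inside one JVM,
        // and the source collection must fit in that JVM's memory.
    }
}
```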


2021 ◽  
Vol 20 ◽  
pp. 352-361
Author(s):  
Xiang Lin

In the big data environment, the visualization technique has been increasingly adopted to mine library and information (L&I) data, with the diversification of data sources and the growth of data volume. However, research on the information association of the L&I visualization network has several defects in the big data environment: the lack of optimized network layout algorithms, and the absence of L&I information fusion and comparison across multiple disciplines. To overcome these defects, this paper explores the visualization of L&I from the perspective of big data analysis and fusion. Firstly, the authors analyzed the topology of the L&I visualization network and calculated the metrics for the construction of the L&I visualization topology map. Next, the importance of meta-paths of the L&I visualization network was calculated. Finally, a complex big data L&I visualization network was established, and the associations between information nodes were analyzed in detail. Experimental results verify the effectiveness of the proposed algorithm.
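The importance measure for meta-paths is not specified in the abstract; one simple proxy, sketched below, counts the instances of a meta-path such as Author-Paper-Author in a bibliographic network. Both the proxy and the names are assumptions for illustration.

```java
import java.util.*;

/** Illustrative meta-path counting: the importance of the meta-path
 *  Author-Paper-Author is approximated here by its instance count. */
public class MetaPathCount {
    static long countAPA(Map<String, Set<String>> paperToAuthors) {
        long instances = 0;
        for (Set<String> authors : paperToAuthors.values()) {
            int k = authors.size();
            instances += (long) k * (k - 1);  // ordered co-author pairs via one paper
        }
        return instances;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> p = new HashMap<>();
        p.put("p1", new HashSet<>(Arrays.asList("a1", "a2", "a3")));
        p.put("p2", new HashSet<>(Arrays.asList("a1", "a2")));
        System.out.println(countAPA(p));  // 3*2 + 2*1 = 8 instances
    }
}
```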


Author(s):  
Guohua Xiong

To ensure the efficient development of Internet of Vehicles (IoV) technology, big data compression technology for the IoV was studied. First, RFID technology in the IoV, big data technology in the IoV, and RFID path data compression technology in the IoV were introduced. Then, RFID path data compression verification experiments were performed. The results showed that when the data volume was relatively small, there was no obvious difference in compression ratio between a fixed threshold and a changing threshold. However, as the amount of data gradually increased, the compression ratio with a changing threshold was slightly higher than with a fixed threshold. Therefore, RFID path big data processing is feasible, and the compression technology is efficient.
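The abstract does not define the compression scheme or its thresholds; one plausible reading, sketched below, merges consecutive RFID readings from the same reader whose time gap falls under a threshold, where a "changing threshold" could be scaled with the data volume. This interpretation and all names are assumptions, not the paper's method.

```java
import java.util.ArrayList;
import java.util.List;

public class PathCompressor {
    static class Reading {
        final String readerId; final long timestamp;
        Reading(String readerId, long timestamp) {
            this.readerId = readerId; this.timestamp = timestamp;
        }
    }

    // Drop a reading if it repeats the previous reader within the time threshold.
    static List<Reading> compress(List<Reading> path, long thresholdMs) {
        List<Reading> out = new ArrayList<>();
        for (Reading r : path) {
            if (!out.isEmpty()) {
                Reading last = out.get(out.size() - 1);
                if (last.readerId.equals(r.readerId)
                        && r.timestamp - last.timestamp < thresholdMs)
                    continue;  // redundant point on the vehicle's path
            }
            out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Reading> path = new ArrayList<>();
        path.add(new Reading("gateA", 0));
        path.add(new Reading("gateA", 500));   // merged at threshold 1000 ms
        path.add(new Reading("gateB", 3000));
        System.out.println(compress(path, 1000).size());  // 2
    }
}
```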


2015 ◽  
Vol 1 (1) ◽  
Author(s):  
Shaofeng Zhang ◽  
Wei Xiong ◽  
Wancheng Ni ◽  
Xin Li

Background: This paper presents a case study on 100Credit, an Internet credit service provider in China. 100Credit began as an IT company specializing in e-commerce recommendation before getting into the credit rating business. The company makes use of Big Data on multiple aspects of individuals' online activities to infer their potential credit risk. Methods: Based on 100Credit's business practices, this paper summarizes four aspects of the value of Big Data in Internet credit services. Results: (1) value from large data volume that provides access to more borrowers; (2) value from prediction correctness in reducing lenders' operational cost; (3) value from the variety of services catering to different needs of lenders; and (4) value from information protection to sustain credit service businesses. Conclusion: The paper also discusses the opportunities and challenges of Big Data-based credit risk analysis, which needs to be improved in future research and practice.


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Liangshun Wu ◽  
Hengjin Cai

Big data is a term used for very large data sets. Digital equipment produces vast amounts of images every day, so the need for image encryption is increasingly pronounced, for example, to safeguard the privacy of patients' medical imaging data on cloud disks. There is an obvious contradiction between security and privacy on the one hand and the widespread use of big data on the other. Nowadays, the most important engine for providing confidentiality is encryption. However, block ciphering is not suitable for huge data volumes in a real-time environment because of the strong correlation among pixels and high redundancy; stream ciphering is considered a lightweight solution for ciphering high-definition images (i.e., high data volume). For a stream cipher, since the encryption algorithm is deterministic, the only thing one can do is make the key "look random." This article proves that the probability that the digit 1 appears in the midsection of a Zeckendorf representation is constant, which can be utilized to generate pseudorandom numbers. A novel stream cipher key generator (ZPKG) is then proposed to encrypt high-definition images that need transferring. The experimental results show that the proposed stream ciphering method, whose keystream satisfies Golomb's randomness postulates, is faster than RC4 and LFSR-based generators with indistinguishable performance in terms of hardware depletion, and that the method is highly key sensitive and shows good resistance against noise attacks and statistical attacks.
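The exact ZPKG construction is not given in the abstract; the sketch below illustrates only the underlying idea: compute the Zeckendorf representation (a sum of non-consecutive Fibonacci numbers) of successive counter values and take a mid-position digit as a keystream bit. The seeding and bit-selection choices here are assumptions, not the authors' design.

```java
import java.util.*;

/** Illustrative keystream from Zeckendorf mid-digits (assumed construction,
 *  not the paper's exact ZPKG). */
public class ZeckendorfStream {
    // Greedy Zeckendorf digits of n >= 1, least-significant Fibonacci first.
    static boolean[] zeckendorf(long n) {
        List<Long> fib = new ArrayList<>(Arrays.asList(1L, 2L));
        while (fib.get(fib.size() - 1) <= n)
            fib.add(fib.get(fib.size() - 1) + fib.get(fib.size() - 2));
        boolean[] digits = new boolean[fib.size()];
        for (int i = fib.size() - 1; i >= 0; i--)
            if (fib.get(i) <= n) { digits[i] = true; n -= fib.get(i); }
        return digits;
    }

    // One keystream bit per counter value: the digit in the middle of the
    // representation, where the digit 1 appears with constant probability.
    static int midBit(long counter) {
        boolean[] d = zeckendorf(counter);
        return d[d.length / 2] ? 1 : 0;
    }

    public static void main(String[] args) {
        long seed = 1_000_003L;  // derived from the secret key (assumption)
        StringBuilder bits = new StringBuilder();
        for (long c = seed; c < seed + 32; c++) bits.append(midBit(c));
        System.out.println("keystream bits: " + bits);
        // XOR these bits with the image byte stream to encrypt/decrypt.
    }
}
```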

