Research on Detection Algorithm of Roadway Intersection Rule Detection Based on Big Data

Author(s):
Meirun Zhang, Fumin Zou, Yanling Deng, Xinhua Jiang, Lvchao Liao, ...

Author(s):
Honglong Xu, Haiwu Rong, Rui Mao, Guoliang Chen, Zhiguang Shan

Big data is profoundly changing the lifestyles of people around the world in an unprecedented way. Driven by the requirements of applications across many industries, research on big data has been growing. Methods to manage and analyze big data in order to extract valuable information are the key to big data research. Starting from the variety challenge of big data, this dissertation proposes a universal big data management and analysis framework based on metric space. Within this framework, the Hilbert Index-based Outlier Detection (HIOD) algorithm is proposed. HIOD can handle all data types that can be abstracted to a metric space and achieves higher detection speed. Experimental results indicate that HIOD effectively overcomes the variety challenge of big data and achieves a 2.02× speedup over iORCA on average, and up to 5.57× in certain cases. The number of distance calculations is reduced by 47.57% on average and by up to 89.10%.
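The abstract does not give HIOD's internals, but the core idea of Hilbert-index-based outlier detection can be sketched as follows: map points onto a Hilbert space-filling curve, sort by curve index so that nearby points tend to be adjacent, and score each point by its distance to neighbors in a small window over the sorted order, avoiding all-pairs distance computation. This is a hypothetical illustration, not the authors' implementation; all function names are invented.

```python
# Hypothetical sketch of Hilbert-index-based outlier scoring (illustrative
# only, not the HIOD algorithm from the dissertation).
import math

def xy2d(n, x, y):
    """Convert (x, y) on an n x n grid (n a power of two) to a Hilbert-curve index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_outlier_scores(points, grid=256, window=3):
    """Score each point by its mean Euclidean distance to neighbours within
    `window` positions along the Hilbert ordering -- an approximation of the
    true k-NN distance that skips most pairwise distance calculations."""
    order = sorted(range(len(points)),
                   key=lambda i: xy2d(grid, points[i][0], points[i][1]))
    scores = [0.0] * len(points)
    for pos, i in enumerate(order):
        lo = max(0, pos - window)
        hi = min(len(order), pos + window + 1)
        neigh = [order[j] for j in range(lo, hi) if j != pos]
        dists = [math.dist(points[i], points[k]) for k in neigh]
        scores[i] = sum(dists) / len(dists)
    return scores
```

Points with the highest scores are reported as outlier candidates; the window-based approximation is what drives down the number of distance calculations.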


Big data is one of the fastest-growing technologies, handling huge amounts of data from various sources such as social media, web logs, and the banking and business sectors. To keep pace with changes in data patterns and to accommodate the requirements of big data analytics, storage and processing platforms such as Hadoop also require great advancements. Hadoop, an open-source project, executes big data processing jobs in map and reduce phases and follows a master-slave architecture. A Hadoop MapReduce job can be delayed if one of its many tasks is assigned to an unreliable or congested machine. To solve this straggler problem, a novel design of speculative execution schemes for parallel processing clusters is proposed from an optimization perspective, under different loading conditions. For the lightly loaded case, a task cloning scheme, namely the combined-file task cloning algorithm, is proposed based on maximizing the overall system utility, together with a straggler detection algorithm based on a workload threshold. Detecting stragglers and cloning only the tasks assigned to them is not enough to enhance performance unless the cloned tasks are allocated in a resource-aware manner; a method is therefore proposed that identifies and optimizes resource allocation by considering all aspects of cluster performance balancing. One main issue arises from the preconfiguration of distinct map and reduce slots based on the number of files in the input folder: this can cause severe slot under-utilization, as map slots may not be fully utilized with respect to the input splits. To solve this issue, an alternative Hadoop slot allocation technique is introduced in this paper while keeping the efficient slot management model. The combined-file task cloning algorithm merges files smaller than a single data block and executes them on the best-performing data node.
On implementing these cloning and combining techniques on a heavily loaded cluster after detecting the straggler machine, the elapsed execution time is found to be reduced by an average of 40%. The detection algorithm improves the overall performance of the heavily loaded cluster by 20% of the total elapsed time in comparison with the native Hadoop algorithm.
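The abstract does not specify the workload threshold rule, but threshold-based straggler detection is commonly expressed as flagging tasks whose progress rate falls below a fraction of the cluster-wide mean. A minimal sketch under that assumption (all names hypothetical):

```python
# Illustrative threshold-based straggler detection (hypothetical sketch, not
# the paper's algorithm): a task is a straggler if its progress rate is below
# `threshold` times the mean progress rate across all running tasks.
def detect_stragglers(progress, elapsed, threshold=0.5):
    """progress: dict task -> fraction complete (0..1)
    elapsed:  dict task -> seconds running
    Returns the sorted list of tasks flagged as stragglers."""
    rates = {t: progress[t] / elapsed[t] for t in progress if elapsed[t] > 0}
    mean_rate = sum(rates.values()) / len(rates)
    return sorted(t for t, r in rates.items() if r < threshold * mean_rate)
```

Flagged tasks would then be candidates for speculative cloning onto a better-performing node, as the combined-file task cloning algorithm described above does for small files.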


2016, Vol 18 (7), pp. 4705-4719
Author(s):
Jie Chen, Si Liu, De-gan Zhang, Shan Zhou

2019, Vol 3 (3), pp. 42
Author(s):
Park K. Fung

Formidably sized networks are becoming more and more common, including in the social sciences, biology, neuroscience, and the technology space. Many networks are expected to challenge the storage capability of a single physical computer. Here, we take two approaches to handle big networks: first, we look at how big data technology and distributed computing offer an exciting approach to big data storage and processing. Second, most networks can be partitioned or labeled into communities, clusters, or modules, thus capturing the crux of the network while reducing detailed information, through the class of algorithms known as community detection. In this paper, we combine these two approaches, developing a distributed community detection algorithm to handle big networks. In particular, the map equation provides a way to identify network communities according to the information flow between nodes, and InfoMap is a greedy algorithm that minimizes the map equation. We develop discrete mathematics to adapt InfoMap into a distributed computing framework and then further develop the mathematics for a greedy algorithm, InfoFlow, which has logarithmic time complexity, compared to the linear complexity of InfoMap. Benchmark results on graphs of up to millions of nodes and hundreds of millions of edges confirm the time complexity improvement while maintaining community accuracy. Thus, we develop a map-equation-based community detection algorithm suitable for big network data processing.
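For reference, the map equation that InfoMap (and, by extension, InfoFlow) greedily minimizes is standardly written as follows; this is the well-known form from the community detection literature, not a formula quoted from this abstract:

```latex
L(\mathsf{M}) \;=\; q_{\curvearrowleft}\, H(\mathcal{Q})
\;+\; \sum_{i=1}^{m} p_{\circlearrowright}^{\,i}\, H(\mathcal{P}^{i})
```

Here $L(\mathsf{M})$ is the expected per-step description length of a random walk under partition $\mathsf{M}$ into $m$ modules, $q_{\curvearrowleft}$ is the probability that the walker switches modules, $H(\mathcal{Q})$ is the entropy of the module-exit codebook, $p_{\circlearrowright}^{\,i}$ is the fraction of movements occurring within (or exiting) module $i$, and $H(\mathcal{P}^{i})$ is the entropy of module $i$'s internal codebook. Greedy merging of modules that reduce $L(\mathsf{M})$ yields the community structure.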


Author(s):  
Haipeng Yao ◽  
Yiqing Liu ◽  
Chao Fang

Network anomaly detection is an important way to analyze and detect malicious behavior in networks. How to effectively detect anomalous network flows under the pressure of big data is an important area that has attracted more and more researchers' attention. In this paper, we propose a new model based on big data analysis, which can avoid the influence of shifts in the network traffic distribution, increase detection accuracy, and reduce the false negative rate. Simulation results reveal that, compared with the k-means, decision tree, and random forest algorithms, the proposed model performs much better, achieving a detection rate of 95.4% on normal data, 98.6% on DoS attacks, 93.9% on Probe attacks, 56.1% on U2R attacks, and 77.2% on R2L attacks.
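The per-class figures quoted above (95.4% on normal, 98.6% on DoS, and so on) are detection rates, i.e. per-class recall: the fraction of true instances of each class that the model labels correctly. A minimal sketch of that metric (the data here is made up for illustration):

```python
# Per-class detection rate (recall): fraction of each true class that the
# classifier identified correctly. Labels below are illustrative only.
from collections import Counter

def detection_rates(y_true, y_pred):
    """Return {class: detected / total} for every class present in y_true."""
    total = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / total[c] for c in total}
```

A low rate on rare classes such as U2R (56.1% above) is typical, since few training examples of those attacks exist in intrusion datasets.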

