Design and Construction of Distributed JavaScript Parsing System

Author(s):  
Bo Shen ◽  
Wei Huang ◽  
Xiaodi Li

With the rapid development of Internet technology, JavaScript (JS), a powerful and representative scripting language, is becoming increasingly popular with developers and users. However, JS programming is more complex than conventional static web technology. In search engines and information acquisition, it is very difficult to obtain the information hidden in script code. In this paper, the authors design a distributed system for parsing the JS code embedded in HTML files and retrieving the underlying information. The authors describe how to extract JS code from HTML files and parse it. They also introduce a task scheduling algorithm for the JS parsing system that employs Hadoop distributed computing technology. The experimental results indicate that the proposed algorithm and system achieve reasonable task scheduling efficiency and parse JS code rapidly.
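The extraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python's standard `html.parser` module, and the names `ScriptExtractor` and `extract_js` are hypothetical. It only separates inline script bodies from external script URLs; it does not parse or execute the JS itself.

```python
from html.parser import HTMLParser


class ScriptExtractor(HTMLParser):
    """Hypothetical helper: collects inline <script> bodies and external script URLs."""

    def __init__(self):
        super().__init__()
        self.inline_scripts = []   # source text of inline scripts
        self.external_urls = []    # values of <script src="...">
        self._in_inline = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # External script: only the URL is known at this point.
            self.external_urls.append(src)
        else:
            # Inline script: start buffering its text.
            self._in_inline = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_inline:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_inline:
            code = "".join(self._buffer).strip()
            if code:
                self.inline_scripts.append(code)
            self._in_inline = False


def extract_js(html):
    """Return (inline_scripts, external_urls) found in an HTML string."""
    parser = ScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.inline_scripts, parser.external_urls
```

In a distributed setting, each extracted script (or each HTML file) could become one unit of work for the scheduler; how the authors actually partition the work is not stated in the abstract.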

2021 ◽  
Author(s):  
QIN Jun ◽  
SONG Yanyan ◽  
ZONG Ping

With the rapid development and popularization of information technology, cloud computing provides a good environment for massive data processing. Hadoop is an open-source implementation of MapReduce and can process large amounts of data. To address the shortcomings of fault tolerance in the MapReduce programming model, this paper proposes a reliability-based task scheduling strategy that introduces a failure recovery mechanism, evaluates the trustworthiness of resource nodes in the cloud environment, and establishes a trustworthiness model. The strategy avoids allocating tasks to low-reliability nodes, which would cause tasks to be re-executed and waste time and resources. Finally, simulations on the CloudSim platform verify the validity and stability of the proposed task scheduling algorithm and scheduling model.


2020 ◽  
Vol 309 ◽  
pp. 03025
Author(s):  
Lintan Sun ◽  
Zigan Li ◽  
Jingxian Lv ◽  
Chenfei Wang ◽  
Yajuan Wang ◽  
...  

With the rapid development and wide application of the Internet of Everything, mobile terminals must process growing volumes of data at growing computational scale, while existing scheduling algorithms suffer from load imbalance and low resource utilization. To address these problems, this paper proposes a task scheduling algorithm based on business priority. First, the algorithm classifies services by priority. Second, the standard deviation of the computing task group is used to determine the proportion of long and short services, and a dynamic selection model is established. Finally, following the idea of secondary allocation, tasks from heavily loaded resources are reassigned to lightly loaded resources for execution, and a service redistribution model is established. The simulation results show that, compared with a typical algorithm, the proposed algorithm jointly accounts for makespan and load balancing and improves system efficiency.


2019 ◽  
Vol 10 (2) ◽  
pp. 102-117 ◽  
Author(s):  
Vijayakumar Pandi ◽  
Pandiaraja Perumal ◽  
Balamurugan Balusamy ◽  
Marimuthu Karuppiah

Fast-growing internet services have led to the rapid development of storing, retrieving and processing health-related documents in a public cloud. In such a scenario, the performance of the cloud services offered is not guaranteed, since it depends on efficient resource scheduling, network bandwidth, and other factors. The trade-off between cost and QoS is that cost should stay low while high QoS is achieved. This can be done through performance optimization. To optimize performance, a novel task scheduling algorithm is proposed in this article. The main advantage of the proposed scheduling algorithm is that it improves QoS parameters, which comprise metrics such as response time, computation time, availability and cost. The proposed work is simulated in Aneka and shows better performance than existing paradigms.


2021 ◽  
Vol 22 (3) ◽  
pp. 295-302
Author(s):  
Shahid Sultan Hajam ◽  
Shabir Ahmad Sofi

Fog computing serves the delay-sensitive applications of the Internet of Things (IoT) more efficiently than the cloud. The heterogeneity of the tasks and the limited fog resources make task scheduling a complicated job. This paper proposes a clustering-based task scheduling algorithm. Specifically, the K-Means++ clustering algorithm is used to cluster the fog nodes. Randomized round robin, a task scheduling algorithm, is then applied to each cluster. The results show that the proposed algorithm reduces the system's average waiting time.
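The two stages named in this abstract can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes each fog node is described by a numeric feature tuple (which features the paper uses is not stated), and it interprets "randomized round robin" as shuffling the node order inside a cluster once and then cycling through it; that interpretation is an assumption. All function names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """K-Means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a centre; fall back to a uniform pick.
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


def cluster_nodes(points, k, rng, iterations=20):
    """Lloyd iterations from K-Means++ seeds; returns k lists of node indices."""
    centres = kmeans_pp_seeds(points, k, rng)
    dims = len(points[0])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(idx)
        # Recompute each centre as the mean of its members (keep it if empty).
        centres = [
            tuple(sum(points[i][d] for i in members) / len(members)
                  for d in range(dims))
            if members else centres[j]
            for j, members in enumerate(clusters)
        ]
    return clusters


def randomized_round_robin(tasks, nodes, rng):
    """Assumed variant: shuffle the cluster's nodes once, then assign cyclically."""
    order = list(nodes)
    rng.shuffle(order)
    return {task: order[i % len(order)] for i, task in enumerate(tasks)}
```

For example, `cluster_nodes(features, 2, random.Random(0))` groups the node indices, and `randomized_round_robin(task_ids, cluster, rng)` maps each task to one node of a given cluster. How tasks are routed to clusters in the first place is not described in the abstract and is left out here.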


2014 ◽  
Vol 543-547 ◽  
pp. 3294-3299
Author(s):  
Bei Zhan Wang ◽  
Kang Chen ◽  
Wei Long Ye ◽  
Xu Wang

With the rapid development of the Internet and the explosive growth of Internet information, massive data processing has received growing attention. Micro-blogging, an important representative of future Internet development, has become an essential tool for communication and marketing. Processing and using the massive data that results from micro-blog activity has become a hot topic. In this paper, we propose a method to design and implement the User Interest Based Search Engine, a search engine that can be used to find micro-blog users who share the same interests. We first crawl massive micro-blog data from micro-blog websites and store this data in HBase. Then we process the data and build indices using MapReduce. Finally, we build a search engine web site based on Solr and propose a ranking algorithm for searching. By employing this User Interest Based Search Engine, we can accurately find other users with the same interests as ourselves.


Author(s):  
Shailendra Raghuvanshi ◽  
Priyanka Dubey

Load balancing of non-preemptive independent tasks on virtual machines (VMs) is an important aspect of task scheduling in clouds. Whenever certain VMs are overloaded and the remaining VMs are underloaded with tasks for processing, the load has to be balanced to achieve optimal machine utilization. In this paper, we propose an algorithm named honey bee behavior inspired load balancing, which aims to achieve a well-balanced load across virtual machines to maximize throughput. The proposed algorithm also balances the priorities of tasks on the machines so that the waiting time of the tasks in the queue is minimal. We have compared the proposed algorithm with existing load balancing and scheduling algorithms. The experimental results show that the algorithm is effective when compared with existing algorithms. Using the WorkflowSim simulator in Java, our approach shows a significant improvement in average execution time and a reduction in the waiting time of tasks in the queue.
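The basic idea of moving work from overloaded to underloaded VMs can be sketched with a simple greedy loop. This is a deliberately simplified stand-in, not the honey bee algorithm proposed in the paper: it assumes identical VMs, positive task lengths, and it ignores task priorities. The function name `rebalance` and the data layout are hypothetical.

```python
def rebalance(vm_tasks):
    """Greedy rebalancing sketch (assumption: identical VMs, positive lengths).

    vm_tasks maps a VM name to a list of task lengths. The smallest task of
    the most loaded VM is moved to the least loaded VM for as long as the
    receiver stays strictly below the sender's current load.
    """
    vms = {vm: list(tasks) for vm, tasks in vm_tasks.items()}
    while True:
        load = {vm: sum(tasks) for vm, tasks in vms.items()}
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        if not vms[busiest]:
            return vms  # nothing left to move
        task = min(vms[busiest])
        if load[idlest] + task >= load[busiest]:
            return vms  # a further move would not improve the balance
        vms[busiest].remove(task)
        vms[idlest].append(task)


balanced = rebalance({"vm1": [4, 3, 2, 1], "vm2": [], "vm3": [2]})
```

Each accepted move strictly reduces the sum of squared VM loads, so the loop terminates. For the example input the loads go from 10, 0 and 2 to 4, 3 and 5.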

