The Research and Implementation of a Distributed Crawler System Based on Apache Flink

Author(s):  
Feng Ye ◽  
Zongfei Jing ◽  
Qian Huang ◽  
Cheng Hu ◽  
Yong Chen
Keyword(s):  
2013 ◽  
Vol 347-350 ◽  
pp. 2506-2510
Author(s):  
Yun Qi Gao ◽  
Chun Lin Peng

With the development of the Internet, online public opinion has come to play an important role in reflecting public opinion in society. Because the Internet hosts a large number of websites and forums, opinion mining needs a powerful crawler system. However, common crawler systems focus more on ranking and recommendation algorithms, which are less important in opinion mining. In this article, we introduce the design and implementation of a distributed crawler system for opinion mining. We also introduce extra parameters, such as keyword count and publication time, into the ranking and refreshing strategies. Experimental results demonstrate that the system supports different sites well and that the improved strategies greatly enhance crawling and monitoring efficiency.
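As an illustration only, the idea of feeding keyword count and publication time into a ranking strategy might be sketched as follows. The abstract does not give the authors' formula, so the function names, the logarithmic damping of keyword hits, and the 24-hour half-life are all assumptions made for this sketch.

```python
import math
from datetime import datetime, timedelta


def keyword_count(text, keywords):
    """Count case-insensitive occurrences of each monitored keyword in the text."""
    lowered = text.lower()
    return sum(lowered.count(k.lower()) for k in keywords)


def crawl_priority(text, published, keywords, now, half_life_hours=24.0):
    """Hypothetical priority score: more keyword hits and newer pages rank higher.

    The scoring form is an assumption, not the paper's method:
    log-damped keyword hits multiplied by an exponential recency decay.
    """
    hits = keyword_count(text, keywords)
    # Page age in hours, clamped at zero in case of clock skew.
    age_hours = max((now - published).total_seconds() / 3600.0, 0.0)
    # Recency halves every `half_life_hours` (assumed default: 24 hours).
    recency = 0.5 ** (age_hours / half_life_hours)
    return math.log1p(hits) * recency


if __name__ == "__main__":
    now = datetime(2013, 1, 2)
    page = "Flood warning issued; flood levels rising."
    fresh = crawl_priority(page, now - timedelta(hours=1), ["flood"], now)
    stale = crawl_priority(page, now - timedelta(hours=72), ["flood"], now)
    print(fresh > stale)
```

A refreshing strategy could reuse the same score to decide how soon a page is revisited, for example by revisiting high-scoring pages more often; that mapping is likewise an assumption rather than something the abstract states.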


Author(s):  
Karel Tomala ◽  
Jan Plucar ◽  
Patrik Dubec ◽  
Lukas Rapant ◽  
Miroslav Voznak

2015 ◽  
Vol 22 ◽  
pp. 02029 ◽  
Author(s):  
Xiaochen Zhang ◽  
Ming Xian
