A Visual Similarity Metric for View-based Web Page Retrieval

Author(s):  
Toru Furusawa ◽  
Yasuyuki Watai ◽  
Toshihiko Yamasaki ◽  
Kiyoharu Aizawa
2012 ◽  
Vol 204-208 ◽  
pp. 4928-4931
Author(s):  
Yang Xin Yu

A Web information retrieval algorithm based on Web page segmentation is designed. Its key idea is to segment each Web page into different topic areas, or segments, according to its HTML tags and contents, since Web pages are semi-structured. First, the algorithm builds an HTML tag tree; it then combines nodes in the tree according to rules of content similarity and visual similarity. During retrieval and ranking, the algorithm makes full use of the segmentation information to order the relevant pages. The experimental results show that this method significantly improves search precision, and it may also serve as a useful reference for the design of future search engines.
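The two steps named in the abstract (build an HTML tag tree, then combine nodes by similarity) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give its similarity rules, so the sketch assumes Jaccard word overlap as the content-similarity measure, uses "same tag name" as a crude stand-in for visual similarity (no rendering is done), and merges only adjacent top-level blocks. The names `Node`, `TreeBuilder`, `jaccard`, `segment`, and the `threshold` parameter are all hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """One node of the HTML tag tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""

    def all_text(self):
        """Text of this node followed by the text of all its descendants."""
        parts = [self.text.strip()] + [c.all_text() for c in self.children]
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Builds a tag tree under a synthetic 'root' node."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest matching open tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def jaccard(a, b):
    """Assumed content-similarity rule: word-set overlap in [0, 1]."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def segment(html, threshold=0.2):
    """Merge adjacent top-level blocks that share a tag (visual proxy,
    an assumption) and whose text overlap reaches the threshold."""
    builder = TreeBuilder()
    builder.feed(html)
    blocks = [(c.tag, c.all_text()) for c in builder.root.children]
    segments = []
    for tag, text in blocks:
        if not text:
            continue  # skip blocks with no textual content
        if segments and segments[-1][0] == tag \
                and jaccard(segments[-1][1], text) >= threshold:
            segments[-1] = (tag, segments[-1][1] + " " + text)
        else:
            segments.append((tag, text))
    return [text for _, text in segments]
```

For example, calling `segment` on two adjacent `div` blocks about Tokyo travel followed by an unrelated `div` and a `p` would keep the first two together and leave the others as separate segments. A retrieval step could then score a query against each segment rather than against the whole page, which is one plausible reading of how segmentation information might inform ranking.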


2006 ◽  
Vol 42 (4) ◽  
pp. 310-318
Author(s):  
Yasufumi TAKAMA ◽  
Keisuke NAKAHARA ◽  
Noriaki MITSUHASHI ◽  
Toru YAMAGUCHI

2018 ◽  
Vol 18 (4) ◽  
pp. 43-60
Author(s):  
A. Bartoli ◽  
A. De Lorenzo ◽  
E. Medvet ◽  
F. Tarlao

Recent phishing campaigns increasingly target specific, small populations of users and have increasingly short life spans. There is thus an urgent need to develop defense mechanisms that do not rely on any form of blacklisting or reputation: there is simply not enough time to detect novel phishing campaigns and notify all interested organizations. Such mechanisms should be close to browsers and based solely on the visual appearance of the rendered page. One of the major impediments to research in this area is the lack of systematic knowledge about what phishing pages actually look like. In this work we describe the technical challenges in collecting a large and diverse collection of screenshots of phishing pages and propose practical solutions. We also systematically analyze the visual similarity between phishing pages and the pages of targeted organizations, both from the point of view of a similarity metric that has been proposed as a foundation for visual phishing detection and from the point of view of a human operator.

