Automatic Detection for JavaScript Obfuscation Attacks in Web Pages through String Pattern Analysis

Author(s):  
YoungHan Choi ◽  
TaeGhyoon Kim ◽  
SeokJin Choi ◽  
CheolWon Lee
2022 ◽  
Vol 12 (1) ◽  
pp. 1-18
Author(s):  
Umamageswari Kumaresan ◽  
Kalpana Ramanujam

The intent of this research is to build an automated web scraping system capable of extracting structured data records embedded in semi-structured web pages. Most automated extraction techniques in the literature capture a repeated pattern among a set of similarly structured web pages, deduce the template used to generate those pages, and then extract the data records. All of these techniques rely on computationally intensive operations such as string pattern matching or DOM tree matching and then require manual labeling of the extracted data records. The technique discussed in this paper departs from these state-of-the-art approaches by determining informative sections in a web page through the repetition of informative content rather than syntactic structure. The experiments show that the system identified data-rich regions with 100% precision for web sites belonging to different domains. Experiments conducted on real-world web sites demonstrate the effectiveness and versatility of the proposed approach.
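As a rough illustration of locating a data-rich region through repetition of content rather than syntactic structure, the sketch below scores each DOM node by how often the texts of its direct children repeat a similar shape. It is an assumption-laden approximation, not the authors' algorithm; the scoring rule and the use of BeautifulSoup for parsing are illustrative choices.

    # Minimal sketch: find the node whose direct children carry the most
    # repeated, similarly shaped text (e.g. lists of titles, prices, ratings).
    # Not the paper's method; the scoring heuristic is an assumption.
    from collections import Counter
    from bs4 import BeautifulSoup

    def data_rich_region(html):
        soup = BeautifulSoup(html, "html.parser")
        best_node, best_score = None, 0
        for node in soup.find_all(True):
            # Text of each direct child element of this node.
            texts = [c.get_text(" ", strip=True)
                     for c in node.find_all(True, recursive=False)]
            texts = [t for t in texts if t]
            if len(texts) < 3:
                continue
            # Score: how many children share a repeated word-count "shape".
            shapes = Counter(len(t.split()) for t in texts)
            score = sum(n for n in shapes.values() if n > 1)
            if score > best_score:
                best_node, best_score = node, score
        return best_node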


Author(s):  
Fagner Christian Paes ◽  
Willian Massami Watanabe

Cross-Browser Incompatibilities (XBIs) are inconsistencies that appear in a web application when it is rendered in different browsers. The growing number of browser implementations (Internet Explorer, Microsoft Edge, Mozilla Firefox, Google Chrome) and the constant evolution of Web technology specifications have produced differences in the way browsers behave and render web pages. Web applications must behave consistently across browsers, so web developers should detect and avoid XBIs during the development process in order to overcome the differences that arise when rendering in different environments. Many web developers rely on manual inspection of web pages in several environments to detect XBIs, regardless of the cost and time that manual testing adds to development. Tools for the automatic detection of XBIs speed up the inspection of web pages, but current tools have low precision and their evaluations report a large percentage of false positives. This research aims to evaluate the use of Artificial Neural Networks to reduce the number of false positives in the automatic detection of XBIs, based on CSS (Cascading Style Sheets) properties and the relative comparison of elements in the web page.
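The sketch below shows, under stated assumptions, how CSS and relative-position differences between corresponding elements rendered in two browsers could be turned into features for a small neural network classifier. The feature set, toy data, and scikit-learn MLP are illustrative substitutes, not the network or features used in this research.

    # Illustrative sketch, not the paper's pipeline: each pair of corresponding
    # elements (same element, two browsers) becomes a feature vector of
    # geometry/CSS differences, and a small MLP labels the pair as XBI or not.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    def pair_features(a, b):
        # a, b: assumed dicts with the element's rendered geometry and CSS.
        return [
            abs(a["x"] - b["x"]), abs(a["y"] - b["y"]),      # position drift
            abs(a["w"] - b["w"]), abs(a["h"] - b["h"]),      # size drift
            float(a["display"] != b["display"]),             # CSS mismatch
        ]

    # Toy labelled data: label 1 = XBI, 0 = consistent rendering.
    pairs = [
        ({"x": 0, "y": 0, "w": 100, "h": 20, "display": "block"},
         {"x": 0, "y": 0, "w": 100, "h": 20, "display": "block"}),
        ({"x": 0, "y": 0, "w": 100, "h": 20, "display": "block"},
         {"x": 35, "y": 4, "w": 60, "h": 20, "display": "inline"}),
    ]
    labels = [0, 1]

    X = np.array([pair_features(a, b) for a, b in pairs])
    clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
    clf.fit(X, labels)
    print(clf.predict(X))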


Author(s):  
Kanaka Durga ◽  
V Rama Krishna

Web sites often contain content that is similar or identical to content indexed elsewhere, and checking for such similar content imposes an overhead on the search engine that can severely affect its performance and quality. Several techniques have therefore been proposed by the web crawling research community to detect same or similar content in web documents, since presenting relevant data to users on the first results page is a major factor for search engines. To address these issues, we propose a methodology called Automatic Detection of Illegitimate Websites with Mutual Clustering (ADIWMC), which presents a distinctive and effective approach to detecting similarities among web pages through web clustering. Same and similar web pages and web content are detected by storing the crawled web pages in a repository. First, adwords are extracted from the crawled pages, and similarity checking between two pages is performed based on the usage of these adwords. A threshold value is set: if the similarity percentage exceeds the threshold, the duplicated content is reduced, which improves the repository and the quality of the search engine. The sections on the existing analysis and the proposed analysis explore in detail how this works.
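A minimal sketch of the threshold-based similarity check described above, assuming the extracted adwords can be treated as a set of keywords and similarity is measured as keyword overlap; the actual ADIWMC extraction and mutual-clustering steps are not reproduced here, and the threshold value is an assumed placeholder.

    # Minimal sketch: keyword-overlap similarity between two crawled pages,
    # compared against a threshold. The keyword extraction and threshold are
    # assumptions, not the paper's exact adword extraction or setting.
    import re

    STOP_WORDS = {"the", "a", "an", "and", "or", "of", "in", "to", "is", "are"}

    def keywords(text):
        tokens = re.findall(r"[a-z]+", text.lower())
        return {t for t in tokens if t not in STOP_WORDS and len(t) > 2}

    def similarity(page_a, page_b):
        ka, kb = keywords(page_a), keywords(page_b)
        return len(ka & kb) / len(ka | kb) if (ka | kb) else 0.0

    THRESHOLD = 0.8  # assumed value for illustration

    def is_duplicate(page_a, page_b, threshold=THRESHOLD):
        # Pages whose keyword overlap exceeds the threshold are treated as
        # same/similar content; one copy can be dropped from the repository.
        return similarity(page_a, page_b) >= threshold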


