Review of Web Page Classification and Web Content Mining

Author(s):  
Dr. G.M. Lingaraju ◽  
Dr. S. Jagannatha
Author(s):  
Rajalakshmi R. ◽  
Hans Tiwari ◽  
Jay Patel ◽  
Rameshkannan R. ◽  
Karthik R.

Gen Z children rely heavily on the internet for purposes such as entertainment, sports, and school projects, which creates demand for parental control systems that monitor children while they browse. Current web page classification approaches are not effective for this task because they extract handcrafted features from the web content and apply machine learning techniques that require domain knowledge. Hence, a deep learning approach is proposed to perform URL-based web page classification. Because a URL is a short text, the model must learn where the important information is located within it. The proposed system combines an attention mechanism with a recurrent convolutional neural network to learn context-aware URL features effectively. This enhanced architecture improves kids-relevant URL classification. Experiments on the Open Directory Project benchmark collection show that the approach achieves an accuracy of 0.8251.
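
The idea of letting a model learn where the important information sits in a URL can be illustrated with a minimal sketch of attention pooling over URL tokens. This is not the authors' implementation: it omits the recurrent convolutional network entirely, and the tokenizer, the toy embeddings, the query vector, and the example URL are all hypothetical stand-ins for components that a trained model would learn.

```python
import math
import re


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Score each token vector against the query, then return the
    attention weights and the weighted sum of the vectors."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * v[d] for w, v in zip(weights, vectors))
        for d in range(len(query))
    ]
    return weights, pooled


# Hypothetical toy embeddings and query; a real model would learn these.
TOY_EMBEDDINGS = {
    "kids": [1.0, 0.0],
    "games": [0.8, 0.2],
    "www": [0.0, 0.1],
    "com": [0.0, 0.1],
}
UNKNOWN = [0.0, 0.0]   # fallback vector for tokens without an embedding
QUERY = [2.0, 0.0]     # assumed "kids-relevance" direction

tokens = tokenize_url("http://www.kids-games.example.com/puzzles")
vectors = [TOY_EMBEDDINGS.get(t, UNKNOWN) for t in tokens]
weights, pooled = attention_pool(vectors, QUERY)

for token, weight in zip(tokens, weights):
    print(f"{token:10s} {weight:.3f}")
```

In this toy setup the weights concentrate on the tokens whose vectors align with the query, which is the behaviour the abstract attributes to the attention mechanism. In the proposed system the token features would come from the recurrent convolutional layers rather than from a fixed lookup table.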


Author(s):  
Sathi T. Marath ◽  
Michael Shepherd ◽  
Evangelos Milios ◽  
Jack Duffy
