A novel approach for content extraction from web pages

Author(s):  
Aanshi Bhardwaj ◽  
Veenu Mangat
2021 ◽  
Vol 5 (EICS) ◽  
pp. 1-23
Author(s):  
Markku Laine ◽  
Yu Zhang ◽  
Simo Santala ◽  
Jussi P. P. Jokinen ◽  
Antti Oulasvirta

Over the past decade, responsive web design (RWD) has become the de facto standard for adapting web pages to a wide range of devices used for browsing. While RWD has improved the usability of web pages, it is not without drawbacks and limitations: designers and developers must manually design the web layouts for multiple screen sizes and implement associated adaptation rules, and its "one responsive design fits all" approach lacks support for personalization. This paper presents a novel approach for automated generation of responsive and personalized web layouts. Given an existing web page design and preferences related to design objectives, our integer programming-based optimizer generates a consistent set of web designs. Where relevant data is available, these can be further automatically personalized for the user and browsing device. The paper also presents techniques for runtime adaptation of the generated designs into a fully responsive grid layout for web browsing. Results from our ratings-based online studies with end users (N = 86) and designers (N = 64) show that the proposed approach can automatically create high-quality responsive web layouts for a variety of real-world websites.


Author(s):  
Stanislas Morbieu ◽  
Guillaume Bruneval ◽  
Mohamed Lacarne ◽  
Mohamed Kone ◽  
Francois-Xavier Bois

2020 ◽  
Vol 2020 ◽  
pp. 1-18
Author(s):  
Sonia Setia ◽  
Verma Jyoti ◽  
Neelam Duhan

The continuous growth of the World Wide Web has led to long access delays. To reduce these delays, prefetching techniques predict users’ browsing behavior and fetch web pages before the user explicitly requests them. Making near-accurate predictions of users’ search behavior is a complex task that researchers have faced for many years, and various web mining techniques have been applied to it. However, each of these methods has its own set of drawbacks. This paper proposes a hybrid prediction model that integrates usage mining and content mining techniques to tackle the individual challenges of both approaches. The proposed method uses N-gram parsing along with the click count of the queries to capture more contextual information and thereby improve the prediction of web pages. The hybrid approach is evaluated on AOL search logs and shows, on average, a 26% increase in prediction precision and a 10% increase in hit ratio compared to other mining techniques.
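
The idea of combining N-gram parsing with query click counts can be illustrated with a minimal sketch. This is not the authors' model: the log format, the function names (`ngrams`, `build_index`, `predict`), the bigram setting, and the additive scoring rule are all assumptions made for illustration, and the URLs are placeholders.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_index(log, n=2):
    """Map each query n-gram to click-weighted counts of the pages clicked for it."""
    index = defaultdict(Counter)
    for query, url, clicks in log:
        for gram in ngrams(query.lower().split(), n):
            index[gram][url] += clicks
    return index


def predict(index, query, n=2, top_k=1):
    """Rank candidate pages for a new query by summed click-weighted n-gram scores."""
    scores = Counter()
    for gram in ngrams(query.lower().split(), n):
        # .get avoids creating empty entries for unseen n-grams
        scores.update(index.get(gram, {}))
    return [url for url, _ in scores.most_common(top_k)]


# Hypothetical log rows (assumed format): (query, clicked URL, click count).
log = [
    ("cheap flights to paris", "flights.example/paris", 5),
    ("flights to paris france", "flights.example/paris", 3),
    ("hotels in paris france", "hotels.example/paris", 4),
]
index = build_index(log)
print(predict(index, "direct flights to paris"))
```

In this sketch, pages returned by `predict` would be the prefetch candidates; a query that shares no n-gram with the log yields an empty list, so nothing is prefetched.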


2016 ◽  
Vol 6 (2) ◽  
pp. 1-23 ◽  
Author(s):  
Surbhi Bhatia ◽  
Manisha Sharma ◽  
Komal Kumar Bhatia

Due to the sudden and explosive growth of web technologies, a huge quantity of user-generated content is available online. People's experiences and opinions play an important role in the decision-making process. Although facts make it easy to search for information on a topic, retrieving opinions remains a challenging task. Many studies on opinion mining need to be undertaken efficiently in order to extract constructive opinionated information from these reviews. The present work focuses on the design and implementation of an Opinion Crawler that downloads opinions from various sites while ignoring the rest of the web. It also detects web pages that are updated frequently and calculates a timestamp for revisiting them in order to extract relevant opinions. The performance of the Opinion Crawler is evaluated on real data sets, where it proves to be much more accurate in terms of precision and recall.


2017 ◽  
Vol 11 (2) ◽  
pp. 39-48 ◽  
Author(s):  
Qingtang Liu ◽  
Mingbo Shao ◽  
Linjing Wu ◽  
Gang Zhao ◽  
Guilin Fan ◽  
...  

2017 ◽  
Vol 6 (8) ◽  
pp. 252 ◽  
Author(s):  
Adam Iwaniak ◽  
Marta Leszczuk ◽  
Marek Strzelecki ◽  
Francis Harvey ◽  
Iwona Kaczmarek

2018 ◽  
Vol 3 (1) ◽  
pp. 34 ◽  
Author(s):  
Sanjay K. Dwivedi ◽  
Chandrakala Arya
