Retrieving Informative Content from Web Pages with Conditional Learning of Support Vector Machines and Semantic Analysis

Author(s):  
Piotr Ładyżyński ◽  
Przemysław Grzegorzewski
2010 ◽  
pp. 1778-1787
Author(s):  
Dion Hoe-Lian Goh ◽  
Khasfariyati Razikin ◽  
Alton Y.K. Chua ◽  
Chei Sian Lee ◽  
Schubert Foo

Social tagging is the process by which users assign freely selected terms to resources and share them with other users. This approach enables users to annotate and describe resources, and also allows them to locate new resources through the collective intelligence of other users. Social tagging offers a new avenue for resource discovery compared with taxonomies and subject directories created by experts. This chapter investigates the effectiveness of tags as resource descriptors using text categorization via support vector machines (SVM). Two text categorization experiments were conducted, using tags and Web pages from del.icio.us. The first experiment used only terms as features, while the second used both terms and tags as its feature set. The experiments yielded macroaveraged precision, recall, and F-measure scores of 52.66%, 54.86%, and 52.05%, respectively. In terms of microaveraged values, the experiments obtained 64.76% for precision, 54.40% for recall, and 59.14% for F-measure. The results suggest that tags were not always reliable indicators of resource contents. At the same time, the terms-only experiment produced better results than the experiment with both terms and tags. Implications of our work and opportunities for future work are also discussed.
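
The abstract reports both macroaveraged and microaveraged scores, and the two can differ noticeably. The sketch below illustrates the difference on invented toy data. It assumes a multi-label setting (each page may carry several tags) and computes macro F-measure as the mean of per-tag F-measures; the abstract specifies neither, and the tag names and the `macro_micro` helper are hypothetical.

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def macro_micro(true_sets, pred_sets):
    """Return ((macro P, R, F), (micro P, R, F)) for multi-label predictions."""
    labels = sorted(set().union(*true_sets, *pred_sets))
    counts = []
    for lab in labels:
        pairs = list(zip(true_sets, pred_sets))
        tp = sum(1 for t, p in pairs if lab in t and lab in p)
        fp = sum(1 for t, p in pairs if lab not in t and lab in p)
        fn = sum(1 for t, p in pairs if lab in t and lab not in p)
        counts.append((tp, fp, fn))

    # Macro: score each tag separately, then average, so every tag weighs the same.
    per_label = [prf(*c) for c in counts]
    macro = tuple(sum(s[i] for s in per_label) / len(per_label) for i in range(3))

    # Micro: pool the counts over all tags first, so frequent tags dominate.
    micro = prf(*(sum(c[i] for c in counts) for i in range(3)))
    return macro, micro


# Invented toy data: true tags and predicted tags for four pages.
true_tags = [{"python"}, {"python", "web"}, {"web"}, {"design"}]
pred_tags = [{"python"}, {"python"}, {"web", "design"}, set()]

macro, micro = macro_micro(true_tags, pred_tags)
print("macro P/R/F:", macro)
print("micro P/R/F:", micro)
```

On this toy data the rare tag that is always predicted wrongly pulls the macro scores down more than the micro scores, which is one common reason for a gap between the two kinds of average.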


2018 ◽  
Vol 42 (1) ◽  
pp. 87-109
Author(s):  
Maria Jakovljevic ◽  
Alfred Coleman

This study presents the construction of a niche search engine whose search topic domain is user-defined. The specific focus of this study is the role that a Support Vector Machine (SVM) plays in classifying textual data from web pages. Furthermore, the aim is to establish whether this niche search engine can return results that are more relevant to a user than those returned by a commercial search engine. Experiments across a number of appropriate datasets showed that the SVM is suitable for classifying web pages to meet the needs of a niche search engine. A subset of the most useful webpage-specific features was identified, with the best-performing feature being a web page's Text & Title component. The user-defined niche search engine was successfully designed, and an experiment showed that it returned more relevant results than a commercial search engine.
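
As a rough illustration of classifying pages on a combined Text & Title feature, the following minimal sketch trains a linear SVM by subgradient descent on the regularised hinge loss. It is not the study's implementation. It assumes that "Text & Title" means a bag of words over the title and body text together, it omits the bias term for brevity, and the toy pages, hyperparameters, and function names (`features`, `train_linear_svm`, `predict`) are all hypothetical.

```python
import re
from collections import Counter


def features(title, text):
    """Bag-of-words counts over title plus body text (assumed feature design)."""
    return Counter(re.findall(r"[a-z0-9]+", (title + " " + text).lower()))


def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=20):
    """Subgradient descent on the L2-regularised hinge loss, without a bias term."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(w.get(t, 0.0) * c for t, c in x.items())
            # Regularisation step: shrink every weight slightly.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Hinge-loss step: only examples inside the margin move the weights.
            if margin < 1.0:
                for t, c in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * c
    return w


def predict(w, x):
    """Return +1 (on-topic) or -1 (off-topic) from the sign of the linear score."""
    score = sum(w.get(t, 0.0) * c for t, c in x.items())
    return 1 if score > 0 else -1


# Invented toy pages: +1 is on-topic for the user-defined niche, -1 is off-topic.
pages = [
    ("SVM tutorial", "support vector machines maximise margin", 1),
    ("Pasta recipe", "boil pasta then add tomato sauce", -1),
    ("Kernel methods", "kernel svm separates classes", 1),
    ("Dinner ideas", "quick sauce with pasta", -1),
]
samples = [features(title, text) for title, text, _ in pages]
labels = [label for _, _, label in pages]
model = train_linear_svm(samples, labels)

print(predict(model, features("Margin and kernel", "svm margin")))
print(predict(model, features("Tomato pasta", "sauce")))
```

In practice a library implementation with TF-IDF weighting and a held-out test set would be the more usual choice; the sketch only shows where the title and body text enter the feature vector.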


2017 ◽  
Vol 4 ◽  
Author(s):  
S.V. Voloshin ◽  
A.L. Tsaregorodtsev ◽  
E.A. Kartashev ◽  
V.V. Slavskiy ◽  
...  
