Similarity Measurement in the Hybrid of Semantic Web Search Engine

2013 ◽  
Vol 8 (3) ◽  
pp. 913-921 ◽  
Author(s):  
Noryusliza Abdullah ◽  
Rosziati Ibrahim

The Semantic Web approach, assisted by ontologies, is widely used to make applications that retrieve information and knowledge more reliable. It can discover World Wide Web (WWW) content that is presented in natural-language text. Previous research has shown that combining categorization with ontology concepts gives better results. Extending the search engine into a hybrid with a further technique, user profiling, shows promise for enhancing the search process. Making better use of search time and returning relevant results are the contributions of this research. The proposed hybrid technique integrates ontologies, categorization, and user profiling. In user profiling, a similarity measure is adopted to compare two different ontologies. WordNet and UTHM Onto are the independent ontologies used in this process. Preliminary experiments have given interesting results in terms of data arrangement and time usage.
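The abstract does not say which similarity measure is used to compare the two ontologies. As a minimal sketch only, assuming each ontology can be reduced to a set of concept labels, a Jaccard coefficient is one common choice. The function name and the sample labels below are hypothetical and are not taken from WordNet or UTHM Onto.

```python
def jaccard_similarity(concepts_a, concepts_b):
    """Return |A & B| / |A | B| for two collections of concept labels."""
    # Normalise labels so that "Student" and "student " count as the same concept.
    a = {c.strip().lower() for c in concepts_a}
    b = {c.strip().lower() for c in concepts_b}
    if not a and not b:
        return 0.0  # two empty ontologies: define the similarity as 0
    return len(a & b) / len(a | b)


# Hypothetical concept labels standing in for two independent ontologies.
general_terms = ["university", "student", "course", "library"]
campus_terms = ["University", "Student", "Faculty", "Course", "Hostel"]

score = jaccard_similarity(general_terms, campus_terms)
print(f"similarity = {score:.2f}")
```

A set-overlap measure like this ignores the hierarchy inside each ontology; a structure-aware measure would be needed to take parent and child relations into account.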

2006 ◽  
Vol 1 (3) ◽  
pp. 67
Author(s):  
David Hook

A review of: Jansen, Bernard J., and Amanda Spink. “How Are We Searching the World Wide Web? A Comparison of Nine Search Engine Transaction Logs.” Information Processing & Management 42.1 (2006): 248-263.
Objective – To examine the interactions between users and search engines, and how they have changed over time.
Design – Comparative analysis of search engine transaction logs.
Setting – Nine major analyses of search engine transaction logs.
Subjects – Nine web search engine studies (4 European, 5 American) over a seven-year period, covering the search engines Excite, Fireball, AltaVista, BWIE and AllTheWeb.
Methods – The results from individual studies are compared by year of study for percentages of single query sessions, one-term queries, operator (and, or, not, etc.) usage and single result page viewing. As well, the authors group the search queries into eleven different topical categories and compare how the breakdown has changed over time.
Main Results – Based on the percentage of single query sessions, it does not appear that the complexity of interactions has changed significantly for either the U.S.-based or the European-based search engines. As well, there was little change observed in the percentage of one-term queries over the years of study for either the U.S.-based or the European-based search engines. Few users (generally less than 20%) use Boolean or other operators in their queries, and these percentages have remained relatively stable. One area of noticeable change is in the percentage of users viewing only one results page, which has increased over the years of study. Based on the studies of the U.S.-based search engines, the topical categories of ‘People, Place or Things’ and ‘Commerce, Travel, Employment or Economy’ are becoming more popular, while the categories of ‘Sex and Pornography’ and ‘Entertainment or Recreation’ are declining.
Conclusions – The percentage of users viewing only one results page increased during the years of the study, while the percentages of single query sessions, one-term queries and operator usage remained stable. The increase in single result page viewing implies that users are tending to view fewer results per web query. There was also a significant difference in the percentage of queries using Boolean operators between the U.S.-based and the European-based search engines. One of the study’s findings was that results from a study of a particular search engine cannot necessarily be applied to all search engines. Finally, web search topics show a trend towards information or commerce searching rather than entertainment.


Author(s):  
Daniel Fernández-Álvarez ◽  
José Emilio Labra Gayo ◽  
Daniel Gayo-Avello ◽  
Patricia Ordoñez de Pablos

The proliferation of large databases with potentially repeated entities across the World Wide Web has driven widespread interest in methods for detecting duplicate entries. Because the data are heterogeneous, generalist approaches may perform poorly in scenarios with distinguishing features. In this paper, we analyze the particularities of music-related databases and describe the Musical Entities Reconciliation Architecture (MERA). MERA is an architecture for matching entries from two sources, allowing the use of extra supporting sources to improve the results. It makes use of Semantic Web technologies and can adapt the matching process to the nature of each field in each database. We have implemented a prototype of MERA and compared it with a well-known music-specialized search engine. Our prototype outperforms the selected baseline in terms of accuracy.
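The idea of adapting the matching process to each field can be illustrated with a small sketch. This is not MERA's implementation: the field names, weights, and comparison functions below are assumptions chosen for illustration, and the string comparison uses the standard-library `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def year_similarity(a, b):
    """Years either match exactly or not at all."""
    return 1.0 if a == b else 0.0


# Hypothetical per-field configuration: (comparison function, weight).
FIELD_COMPARATORS = {
    "title": (text_similarity, 0.5),
    "artist": (text_similarity, 0.4),
    "year": (year_similarity, 0.1),
}


def record_similarity(rec_a, rec_b):
    """Weighted average of field scores over the fields both records share."""
    total = 0.0
    weight_sum = 0.0
    for field, (compare, weight) in FIELD_COMPARATORS.items():
        if field in rec_a and field in rec_b:
            total += weight * compare(rec_a[field], rec_b[field])
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


# Hypothetical entries from two sources.
a = {"title": "Hey Jude", "artist": "The Beatles", "year": 1968}
b = {"title": "hey jude", "artist": "Beatles, The", "year": 1968}
c = {"title": "Paint It Black", "artist": "The Rolling Stones", "year": 1966}
print(record_similarity(a, b), record_similarity(a, c))
```

Keeping the comparators in a table makes it straightforward to swap in a different function for a field, for example one that normalises artist name order before comparing.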


Author(s):  
Abhishek Das ◽  
Ankit Jain

In this chapter, the authors describe the key indexing components of today’s web search engines. As the World Wide Web has grown, the systems and methods for indexing have changed significantly. The authors present the data structures used, the features extracted, the infrastructure needed, and the options available for designing a brand new search engine. They highlight techniques that improve the relevance of results, discuss trade-offs to best utilize machine resources, and cover distributed processing concepts in this context. In particular, the authors delve into the topics of indexing phrases instead of terms, storage in memory vs. on disk, and data partitioning. Some thoughts on information organization for the newly emerging data-forms conclude the chapter.
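One standard way to support phrase queries is a positional inverted index, which records where each term occurs in each document. The sketch below is a toy in-memory version under simplifying assumptions (whitespace tokenisation, no compression or partitioning); it is not drawn from the chapter, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> {doc_id -> [positions]} for a dict of doc_id -> text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents that contain the terms of `phrase` consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # The i-th term of the phrase must appear at position start + i.
            if all(start + i in index[t].get(doc_id, ())
                   for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


docs = {
    1: "web search engine indexing",
    2: "search the web with an engine",
}
index = build_positional_index(docs)
print(phrase_search(index, "search engine"))
```

A term-only index would return both documents for the words "search" and "engine"; the stored positions are what let the phrase query keep only the first.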


Author(s):  
Diane J. Cook ◽  
Nitish Manocha ◽  
Lawrence B. Holder

The World Wide Web provides an immense source of information. Accessing information of interest presents a challenge to scientists and analysts, particularly if the desired information is structural in nature. Our goal is to design a structural search engine that uses the hyperlink structure of the Web, in addition to textual information, to search for sites of interest. Our structural search engine, called WebSUBDUE, searches not only for particular words or topics but also for a desired hyperlink structure. Enhanced by WordNet text functions, our search engine retrieves sites corresponding to structures formed by graph-based user queries. We hypothesize that this system can form the heart of a structural query engine, and demonstrate the approach on a number of structural web queries.
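To make the notion of a structural query concrete, the sketch below matches a small query graph (variables joined by required hyperlinks, each with an optional keyword) against a toy link graph by brute force. It is not the WebSUBDUE algorithm, it does not use WordNet, and all names and sample pages are hypothetical; exhaustive enumeration is practical only for very small graphs.

```python
from itertools import permutations


def match_structure(links, page_text, query_edges, query_keywords):
    """Find bindings of query variables to pages that satisfy the query.

    links: set of (source_page, target_page) hyperlinks
    page_text: dict of page -> text
    query_edges: list of (variable, variable) links that must exist
    query_keywords: dict of variable -> keyword its page must contain
    """
    variables = sorted({v for edge in query_edges for v in edge}
                       | set(query_keywords))
    matches = []
    # Try every assignment of distinct pages to the query variables.
    for pages in permutations(sorted(page_text), len(variables)):
        binding = dict(zip(variables, pages))
        if any(kw.lower() not in page_text[binding[v]].lower()
               for v, kw in query_keywords.items()):
            continue
        if all((binding[a], binding[b]) in links for a, b in query_edges):
            matches.append(binding)
    return matches


# Hypothetical three-page site.
page_text = {
    "home": "Computer science department",
    "fac": "Faculty list: professors",
    "news": "Department news",
}
links = {("home", "fac"), ("home", "news"), ("news", "home")}

# Query: a page mentioning "department" that links to a page mentioning "faculty".
result = match_structure(
    links, page_text,
    query_edges=[("hub", "target")],
    query_keywords={"hub": "department", "target": "faculty"},
)
print(result)
```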


NASKO ◽  
2011 ◽  
Vol 3 (1) ◽  
pp. 33
Author(s):  
Elizabeth Milonas

The World Wide Web has grown exponentially in the last few years. The popularity of Web search engines has also grown in a similar manner. The task of a Web search engine is to provide the Web searcher with accurate and targeted information from the plethora of information available on the Web. This is a daunting task that requires the careful usage of language to ensure accuracy. As a result, the importance of the usage and meaning of language in the Web domain has become the focus of recent research. In this paper, the author will explore Wittgenstein’s later philosophy of language as it applies to the language used in the search result pages of a Web search engine in an effort to broaden the understanding of language usage within this domain.


Author(s):  
Georg Neubauer

The main subject of the work is the visualization of typed links in Linked Data. The academic subjects relevant to the paper in general are the Semantic Web, the Web of Data and information visualization. The Semantic Web, invented by Tim Berners-Lee in 2001, was announced as an extension to the World Wide Web (Web 2.0). The actual area of investigation concerns the connectivity of information on the World Wide Web. To be able to explore such interconnections, visualizations are critical requirements as well as a major part of processing data in themselves. In the context of the Semantic Web, representation of information interrelations can be achieved using graphs. The aim of the article is to primarily describe the arrangement of Linked Data visualization concepts by establishing their principles in a theoretical approach. Putting design restrictions into context leads to practical guidelines. By describing the creation of two alternative visualizations of a commonly used web application representing Linked Data as network visualization, their compatibility was tested. The application-oriented part treats the design phase, its results, and future requirements of the project that can be derived from this test.


Author(s):  
Rizwan Ur Rahman ◽  
Rishu Verma ◽  
Himani Bansal ◽  
Deepak Singh Tomar

With the explosive expansion of information on the World Wide Web, search engines are becoming more significant in people's day-to-day lives. Even though a search engine generally returns a huge number of results for a given query, the majority of search engine users view only the first few web pages in the result lists. Consequently, ranking position has become a major concern of internet service providers. This article addresses the vulnerabilities, spamming attacks, and countermeasures in blogging sites. The first part explores the types of spamming and gives a detailed section on vulnerabilities. The next part presents an attack scenario of form spamming together with a defense approach. The aim of this article is to provide a review of the vulnerabilities and spamming threats associated with blogging websites, and of effective measures to counter them.


Web Services ◽  
2019 ◽  
pp. 1068-1076
Author(s):  
Vudattu Kiran Kumar

The World Wide Web (WWW) is a global information medium that users can read and write using computers over the internet. The Web is one of the services available on the internet. It was created in 1989 by Sir Tim Berners-Lee. Since then, web usage and the development of web applications have been greatly refined. Semantic Web technologies enable machines to interpret data published on the web in a machine-interpretable form. The Semantic Web is not a separate web; it is an extension of the current web with additional semantics. Semantic technologies play a crucial role in making data understandable to machines. To achieve machine understandability, semantics should be added to existing websites. With additional semantics, we can achieve a next-level web where knowledge repositories are available for a better understanding of web data. This facilitates better search, accurate filtering, and intelligent retrieval of data. This paper discusses the Semantic Web and the languages involved in describing documents in a machine-understandable format.
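Semantic Web data is commonly modelled as subject, predicate, object triples. As a minimal sketch with no external library, the example below stores a few triples as plain tuples and filters them by pattern. The `ex:` identifiers and the `match` function are hypothetical and stand in for real vocabularies and query languages.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Hypothetical triples using a made-up "ex:" prefix.
triples = [
    ("ex:WorldWideWeb", "ex:createdBy", "ex:TimBernersLee"),
    ("ex:WorldWideWeb", "ex:createdIn", "1989"),
    ("ex:SemanticWeb", "ex:extends", "ex:WorldWideWeb"),
]

# Everything stated about the World Wide Web as a subject.
about_web = match(triples, subject="ex:WorldWideWeb")
# Anything that extends something else.
extensions = match(triples, predicate="ex:extends")
print(about_web)
print(extensions)
```

Because each statement carries its own predicate, a program can answer such questions without parsing natural-language text, which is the sense in which the data is machine-understandable.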

