scholarly journals Information Retrieval on Web: Ontology based Vs Traditional Search Engines

Information Retrieval has become the buzzword in the today’s era of advanced computing. The tremendous amount of information is available over the Internet in the form of documents which can either be structured or unstructured. It is really difficult to retrieve relevant information from such large pool. The traditional search engines based on keyword search are unable to give the desired relevant results as they search the web on the basis of the keywords present in the query fired. On contrary the ontology based semantic search engines provide relevant and quick results to the user as the information stored in the semantic web is more meaningful. The paper gives the comparative study of the ontology based search engines with those which are keyword based. Few of both types have been taken and same queries are run on each one of them to analyze the results to compare the precision of the results provided by them by classifying the results as relevant or non-relevant.

Author(s):  
Vijay Kasi ◽  
Radhika Jain

In the context of the Internet, a search engine can be defined as a software program designed to help one access information, documents, and other content on the World Wide Web. The adoption and growth of the Internet in the last decade has been unprecedented. The World Wide Web has always been applauded for its simplicity and ease of use. This is evident looking at the extent of the knowledge one requires to build a Web page. The flexible nature of the Internet has enabled the rapid growth and adoption of it, making it hard to search for relevant information on the Web. The number of Web pages has been increasing at an astronomical pace, from around 2 million registered domains in 1995 to 233 million registered domains in 2004 (Consortium, 2004). The Internet, considered a distributed database of information, has the CRUD (create, retrieve, update, and delete) rule applied to it. While the Internet has been effective at creating, updating, and deleting content, it has considerably lacked in enabling the retrieval of relevant information. After all, there is no point in having a Web page that has little or no visibility on the Web. Since the 1990s when the first search program was released, we have come a long way in terms of searching for information. Although we are currently witnessing a tremendous growth in search engine technology, the growth of the Internet has overtaken it, leading to a state in which the existing search engine technology is falling short. When we apply the metrics of relevance, rigor, efficiency, and effectiveness to the search domain, it becomes very clear that we have progressed on the rigor and efficiency metrics by utilizing abundant computing power to produce faster searches with a lot of information. Rigor and efficiency are evident in the large number of indexed pages by the leading search engines (Barroso, Dean, & Holzle, 2003). However, more research needs to be done to address the relevance and effectiveness metrics. Users typically type in two to three keywords when searching, only to end up with a search result having thousands of Web pages! This has made it increasingly hard to effectively find any useful, relevant information. Search engines face a number of challenges today requiring them to perform rigorous searches with relevant results efficiently so that they are effective. These challenges include the following (“Search Engines,” 2004). 1. The Web is growing at a much faster rate than any present search engine technology can index. 2. Web pages are updated frequently, forcing search engines to revisit them periodically. 3. Dynamically generated Web sites may be slow or difficult to index, or may result in excessive results from a single Web site. 4. Many dynamically generated Web sites are not able to be indexed by search engines. 5. The commercial interests of a search engine can interfere with the order of relevant results the search engine shows. 6. Content that is behind a firewall or that is password protected is not accessible to search engines (such as those found in several digital libraries).1 7. Some Web sites have started using tricks such as spamdexing and cloaking to manipulate search engines to display them as the top results for a set of keywords. This can make the search results polluted, with more relevant links being pushed down in the result list. This is a result of the popularity of Web searches and the business potential search engines can generate today. 8. Search engines index all the content of the Web without any bounds on the sensitivity of information. This has raised a few security and privacy flags. With the above background and challenges in mind, we lay out the article as follows. In the next section, we begin with a discussion of search engine evolution. To facilitate the examination and discussion of the search engine development’s progress, we break down this discussion into the three generations of search engines. Figure 1 depicts this evolution pictorially and highlights the need for better search engine technologies. Next, we present a brief discussion on the contemporary state of search engine technology and various types of content searches available today. With this background, the next section documents various concerns about existing search engines setting the stage for better search engine technology. These concerns include information overload, relevance, representation, and categorization. Finally, we briefly address the research efforts under way to alleviate these concerns and then present our conclusion.


Author(s):  
Novario Jaya Perdana

The accuracy of search result using search engine depends on the keywords that are used. Lack of the information provided on the keywords can lead to reduced accuracy of the search result. This means searching information on the internet is a hard work. In this research, a software has been built to create document keywords sequences. The software uses Google Latent Semantic Distance which can extract relevant information from the document. The information is expressed in the form of specific words sequences which could be used as keyword recommendations in search engines. The result shows that the implementation of the method for creating document keyword recommendation achieved high accuracy and could finds the most relevant information in the top search results.


Author(s):  
Daniele Besomi

This paper surveys the economic dictionaries available on the internet, both for free and on subscription, addressed to various kinds of audiences from schoolchildren to research students and academics. The focus is not much on content, but on whether and how the possibilities opened by electronic editing and by the modes of distribution and interaction opened by the internet are exploited in the organization and presentation of the materials. The upshot is that although a number of web dictionaries have taken advantage of some of the innovations offered by the internet (in particular the possibility of regularly updating, of turning cross-references into hyperlinks, of adding links to external materials, of adding more or less complex search engines), the observation that internet lexicography has mostly produced more ef! cient dictionary without, however, fundamentally altering the traditional paper structure can be con! rmed for this particular subset of reference works. In particular, what is scarcely explored is the possibility of visualizing the relationship between entries, thus abandoning the project of the early encyclopedists right when the technology provides the means of accomplishing it.


Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates the user’s quick browsing of search results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for web search results in clustering based on a phrase based clustering algorithm. It is an alternative to a single ordered result of search engines. This approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.


Author(s):  
Cecil Eng Huang Chua ◽  
Roger H. Chiang ◽  
Veda C. Storey

Search engines are ubiquitous tools for seeking information from the Internet and, as such, have become an integral part of our information society. New search engines that combine ideas from separate search engines generally outperform the search engines from which they took ideas. Designers, however, may not be aware of the work of other search engine developers or such work may not be available in modules that can be incorporated into another search engine. This research presents an interoperability architecture for building customized search engines. Existing search engines are analyzed and decomposed into self-contained components that are classified into six categories. A prototype, called the Automated Software Development Environment for Information Retrieval, was developed to implement the interoperability architecture, and an assessment of its feasibility was carried out. The prototype resolves conflicts between components of separate search engines and demonstrates how design features across search engines can be integrated.


Author(s):  
Suely Fragoso

This chapter proposes that search engines apply a verticalizing pressure on the WWW many-to-many information distribution model, forcing this to revert to a distributive model similar to that of the mass media. The argument for this starts with a critical descriptive examination of the history of search mechanisms for the Internet. Parallel to this there is a discussion of the increasing ties between the search engines and the advertising market. The chapter then presents questions concerning the concentration of traffic on the Web around a small number of search engines which are in the hands of an equally limited number of enterprises. This reality is accentuated by the confidence that users place in the search engine and by the ongoing acquisition of collaborative systems and smaller players by the large search engines. This scenario demonstrates the verticalizing pressure that the search engines apply to the majority of WWW users, that bring it back toward the mass distribution mode.


Author(s):  
Max Chevalier ◽  
Christine Julien ◽  
Chantal Soulé-Dupuy

Searching information can be realized thanks to specific tools called Information Retrieval Systems IRS (also called “search engines”). To provide more accurate results to users, most of such systems offer personalization features. To do this, each system models a user in order to adapt search results that will be displayed. In a multi-application context (e.g., when using several search engines for a unique query), personalization techniques can be considered as limited because the user model (also called profile) is incomplete since it does not exploit actions/queries coming from other search engines. So, sharing user models between several search engines is a challenge in order to provide more efficient personalization techniques. A semantic architecture for user profile interoperability is proposed to reach this goal. This architecture is also important because it can be used in many other contexts to share various resources models, for instance a document model, between applications. It is also ensuring the possibility for every system to keep its own representation of each resource while providing a solution to easily share it.


2019 ◽  
Vol 31 (5) ◽  
pp. 480-491
Author(s):  
Carole Rodon ◽  
Anne Congard

Abstract Searching for information on the web is regarded as a complex problem-solving activity involving a range of cognitive and affective processes. Anxiety is a key affective factor. In this article, we describe the construction and initial validation stages of the Information Retrieval on the Web Anxiety Rate (IROWAR) scale. The final structure of this inventory was validated with a sample of 183 English-speaking Internet users. Reliability analyses indicated that the factors were internally consistent (Cronbach’s alpha: 0.92). When we checked divergent validity, we found negative correlations with both self-efficacy and positive attitude towards the Internet. There were no effects of either sex or age on the total IROWAR score, but the Internet search anxiety sum score decreased with the length of use. This scale will be useful in several domains, including research on the determinants of web anxiety, individuals’ experience of web anxiety and ways of supporting them and Internet learning.


2004 ◽  
Vol 28 (4) ◽  
pp. 254-259 ◽  
Author(s):  
Carol S. Bond

When trying to locate information on the Web people are faced with a variety of options. This research reviewed how a group of health related professionals approached the task of finding a named document. Most were eventually successful, but the majority encountered problems in their search techniques. Even experienced Web users had problems when working with a different interface to normal, and without access to their favourites. No relationship was found between the number of years' experience Web users had and the efficiency of their searching strategy. The research concludes that if people are to be able to use the Web quickly and efficiently as an effective information retrieval tool, as opposed to a recreational tool to surf the Internet, they need to have both an understanding of the medium and the tools, and the skills to use them effectively, both of which were lacking in the majority of participants in this study.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
K. R. Uthayan ◽  
G. S. Anandha Mala

Ontology is the process of growth and elucidation of concepts of an information domain being common for a group of users. Establishing ontology into information retrieval is a normal method to develop searching effects of relevant information users require. Keywords matching process with historical or information domain is significant in recent calculations for assisting the best match for specific input queries. This research presents a better querying mechanism for information retrieval which integrates the ontology queries with keyword search. The ontology-based query is changed into a primary order to predicate logic uncertainty which is used for routing the query to the appropriate servers. Matching algorithms characterize warm area of researches in computer science and artificial intelligence. In text matching, it is more dependable to study semantics model and query for conditions of semantic matching. This research develops the semantic matching results between input queries and information in ontology field. The contributed algorithm is a hybrid method that is based on matching extracted instances from the queries and information field. The queries and information domain is focused on semantic matching, to discover the best match and to progress the executive process. In conclusion, the hybrid ontology in semantic web is sufficient to retrieve the documents when compared to standard ontology.


Sign in / Sign up

Export Citation Format

Share Document