Enhancing the performance of search engines based heap based data file and hash based indexing file

2018 ◽  
Vol 7 (2.7) ◽  
pp. 372
Author(s):  
Dr Jkr Sastry ◽  
Chandu Sai Chittibomma ◽  
Thulasi Manohara Reddy Alla

Web clients search for content by submitting keywords or snippets to search engines. A search engine follows a process to collect matching content and return it as a list of URL links. Fetching the content can take considerable time, especially when the results span many display pages, and locating the relevant content among those pages is difficult. A proper indexing method reduces the number of display pages, speeds up processing, and shrinks the index space. This paper presents a non-clustered indexing method that combines hash-based indexing with data stored as a heap file, making the entire search process fast while requiring very little storage.
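The abstract does not reproduce the authors' implementation, but the core combination it names can be sketched simply: a heap file that appends documents in arrival order, plus a non-clustered hash index that maps each keyword to the heap positions of the documents containing it. The documents and URLs below are hypothetical.

```python
# Illustrative sketch (not the authors' implementation): heap file storage
# plus a non-clustered hash index from keyword to heap slot ids.
from collections import defaultdict

class HeapFile:
    """Unordered (heap) storage: records are appended and addressed by slot id."""
    def __init__(self):
        self.records = []

    def insert(self, record):
        self.records.append(record)
        return len(self.records) - 1          # slot id of the new record

    def fetch(self, slot_id):
        return self.records[slot_id]

class HashIndex:
    """Non-clustered hash index: keyword -> list of slot ids in the heap file."""
    def __init__(self):
        self.buckets = defaultdict(list)

    def add(self, keyword, slot_id):
        self.buckets[keyword.lower()].append(slot_id)

    def lookup(self, keyword):
        return self.buckets.get(keyword.lower(), [])

# Usage: index the words of each document, then resolve a keyword query.
heap, index = HeapFile(), HashIndex()
for url, text in [("http://a.example", "fast hash indexing"),
                  ("http://b.example", "heap file storage")]:
    slot = heap.insert((url, text))
    for word in text.split():
        index.add(word, slot)

print([heap.fetch(s)[0] for s in index.lookup("indexing")])  # -> ['http://a.example']
```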

2018 ◽  
Vol 7 (2.7) ◽  
pp. 359
Author(s):  
Dr Jkr Sastry ◽  
M Sri Harsha Vamsi ◽  
R Srinivas ◽  
G Yeshwanth

Web clients search for content by submitting keywords or snippets to search engines. A search engine follows a process to collect matching content and return it as a list of URL links. It has been observed that only about 20% of the returned URLs are useful to the end user; the remaining 80% is surfed unnecessarily, wasting both time and money. Users exhibit surfing characteristics that can be collected as they browse, and the search process can be made more efficient by incorporating these characteristics, or behaviours, into it. This paper aims to improve the search process by integrating user behaviour into the indexing and ranking of web pages.
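The abstract does not spell out how the behaviour signal enters the ranking, so the following is only a minimal sketch under one plausible assumption: per-user click counts are the behaviour signal, and they are blended with the engine's base relevance score via a weight `alpha`.

```python
# Minimal sketch of behaviour-aware re-ranking (assumed scheme, not the paper's):
# blend the engine's base score with a normalised per-user click count.
def rerank(results, click_counts, alpha=0.3):
    """results: list of (url, base_score); click_counts: dict url -> past clicks by this user."""
    max_clicks = max(click_counts.values(), default=0) or 1
    def combined(item):
        url, base = item
        behaviour = click_counts.get(url, 0) / max_clicks   # normalise to [0, 1]
        return (1 - alpha) * base + alpha * behaviour
    return sorted(results, key=combined, reverse=True)

results = [("http://news.example", 0.82), ("http://shop.example", 0.80)]
history = {"http://shop.example": 14}        # this user clicks the shop site often
print(rerank(results, history))              # shop.example moves ahead of news.example
```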


1999 ◽  
Vol 62 (3) ◽  
pp. 57-70
Author(s):  
Zane K. Quible

Searching for information on the Web can be a daunting, frustrating, mind-boggling, and sometimes futile activity. To increase the odds of finding the right information, one needs to understand the operation of four search tools: Web directories, search engines, indexes, and spiders or robots. Understanding Boolean logic also helps. Business communication instructors can aid students in their research by introducing them to the terminology and functions of an efficient Web search process.
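A small illustration of the Boolean logic the article recommends teaching: AND narrows a search, OR broadens it, and NOT excludes results. The documents and URLs below are hypothetical.

```python
# Toy Boolean retrieval over hypothetical documents: AND narrows, OR broadens, NOT excludes.
docs = {
    "http://memo.example":   {"business", "communication", "memo"},
    "http://resume.example": {"business", "resume", "template"},
    "http://blog.example":   {"communication", "blog"},
}

def boolean_search(must=(), any_of=(), must_not=()):
    hits = []
    for url, terms in docs.items():
        if all(t in terms for t in must) \
           and (not any_of or any(t in terms for t in any_of)) \
           and not any(t in terms for t in must_not):
            hits.append(url)
    return hits

# "business AND communication NOT blog"
print(boolean_search(must=("business", "communication"), must_not=("blog",)))
```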


2017 ◽  
pp. 030-050
Author(s):  
J.V. Rogushina

Problems associated with improving information retrieval in open environments are considered, and the need for its semantization is substantiated. The current state and development prospects of semantic search engines focused on processing Web information resources are analysed, and criteria for classifying such systems are reviewed. Particular attention is paid to the use of ontologies in semantic search, since they contain knowledge about the subject area and about the search users. The sources of ontological knowledge and the methods of processing it to improve search procedures are considered. Examples are given of semantic search systems that use structured query languages (e.g., SPARQL), lists of keywords, and natural-language queries. Classification criteria for semantic search engines such as architecture, coupling, transparency, user context, query modification, ontology structure, etc. are considered. Different ways of supporting semantic, ontology-based modification of user queries that improve the completeness and accuracy of search are analysed. Based on an analysis of the properties of existing semantic search engines against these criteria, areas for further improvement are identified: the development of metasearch systems, semantic modification of user queries, determination of a user-acceptable level of transparency in the search procedures, flexible domain-knowledge management tools, and increased productivity and scalability. In addition, semantic Web search requires an external knowledge base containing knowledge about the domain of the user's information needs, and users should be able to independently select the knowledge used in the search process. The history of the user's interaction with the retrieval system and the search context must also be taken into account in order to personalize query results and order them according to the user's information needs. All these aspects were taken into account in the design and implementation of the semantic search engine "MAIPS", which is based on an ontological model of the cooperation of users and resources on the Web.
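The survey repeatedly mentions ontology-based modification of user queries. A generic illustration of that idea is sketched below; the toy ontology and terms are assumptions for illustration, not the MAIPS knowledge base.

```python
# Generic ontology-based query expansion: add synonyms and narrower concepts
# so the query also matches related pages. The ontology here is a toy example.
ontology = {
    "car":    {"synonyms": ["automobile"], "narrower": ["sedan", "hatchback"]},
    "engine": {"synonyms": ["motor"],      "narrower": []},
}

def expand_query(terms, ontology):
    """Return the original terms plus their synonyms and narrower concepts."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        entry = ontology.get(term, {})
        expanded.update(entry.get("synonyms", []))
        expanded.update(entry.get("narrower", []))
    return expanded

print(expand_query(["car", "engine"], ontology))
# -> {'car', 'automobile', 'sedan', 'hatchback', 'engine', 'motor'}
```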


2001 ◽  
Vol 1 (3) ◽  
pp. 28-31 ◽  
Author(s):  
Valerie Stevenson

Looking back to 1999, there were a number of search engines which performed equally well. I recommended defining the search strategy very carefully, using Boolean logic and field search techniques, and always running the search in more than one search engine. Numerous articles and Web columns comparing the performance of different search engines came to different conclusions on the ‘best’ search engines. Over the last year, however, all the speakers at conferences and seminars I have attended have recommended Google as their preferred tool for locating all kinds of information on the Web. I confess that I have now abandoned most of my carefully worked out search strategies and comparison tests, and use Google for most of my own Web searches.


Tripodos ◽  
2021 ◽  
pp. 41-61
Author(s):  
Carlos Lopezosa ◽  
Lluís Codina ◽  
Mario Pérez-Montoro

This paper undertakes a comparative analysis of the visibility, and of other SEO indicators, of the culture sections of Spain's leading digital newspapers (specifically elmundo.es, elpais.com, lavanguardia.com, abc.es, elconfidencial.com and 20minutos.es), based on data collected by the media analytics company comScore and the web traffic metric Alexa Rank. The analysis employs a set of positioning indicators, namely a visibility index, keywords, social signals, keyword profiles, URLs, SERP snippets, reference domains and best anchor texts, as made available by SISTRIX, an SEO analytics audit toolbox. Thus, we were able to determine which of the digital newspapers' culture sections has the best visibility. Likewise, we were able to identify which of these media are best positioned on Google, presumably as a result of more effective positioning strategies. We conclude with a discussion of our results and, on the basis of these findings, recommend ways in which the visibility of journalistic information can be optimised in search engines.

SEO and online media: visibility of cultural information in Spain's leading newspapers. This article carries out a comparative analysis of the visibility and other SEO indicators of the culture sections of the leading Spanish online media: elmundo.es, elpais.com, lavanguardia.com, abc.es, elconfidencial.com and 20minutos.es. The analyses were carried out using a set of positioning indicators (visibility, keywords, social signals, keyword profiles, URLs, snippets, reference domains and best anchor texts) and SISTRIX, a search engine positioning audit and analysis tool. We ask which of these media has the culture news section with the best visibility. The study carried out with the selected indicators thus makes it possible to present a comparative analysis of cultural journalism and to identify which of these media occupy the best positions on Google, presumably as a result of their positioning strategies. We end with a discussion of the results together with some final recommendations for optimising the visibility of journalistic information in search engines.
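SISTRIX's actual visibility index is proprietary, so the following is only a simplified illustration of the idea behind such an indicator: every keyword a site ranks for is weighted by its search volume and by a click-through weight that decays with the ranking position. The position weights, keywords and volumes below are assumptions.

```python
# Simplified, assumed visibility-index calculation (not the SISTRIX formula):
# weight each ranking keyword by search volume and a position-based CTR weight.
POSITION_WEIGHT = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}   # assumed CTR curve

def visibility_index(rankings):
    """rankings: list of (keyword, position, monthly_search_volume)."""
    score = 0.0
    for _keyword, position, volume in rankings:
        score += volume * POSITION_WEIGHT.get(position, 0.01)
    return score

culture_section = [("exposiciones madrid", 2, 5000),
                   ("critica de cine", 1, 12000),
                   ("novedades literarias", 7, 3000)]
print(round(visibility_index(culture_section), 1))   # -> 4380.0
```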


Author(s):  
Daniele Besomi

This paper surveys the economic dictionaries available on the internet, both free and by subscription, addressed to various kinds of audiences from schoolchildren to research students and academics. The focus is not so much on content as on whether and how the possibilities opened up by electronic editing, and by the modes of distribution and interaction enabled by the internet, are exploited in the organization and presentation of the materials. The upshot is that although a number of web dictionaries have taken advantage of some of the innovations offered by the internet (in particular the possibility of regular updating, of turning cross-references into hyperlinks, of adding links to external materials, and of adding more or less complex search engines), the observation that internet lexicography has mostly produced more efficient dictionaries without, however, fundamentally altering the traditional paper structure can be confirmed for this particular subset of reference works. In particular, what is scarcely explored is the possibility of visualizing the relationships between entries, thus abandoning the project of the early encyclopedists right when the technology provides the means of accomplishing it.


Author(s):  
Punam Bedi ◽  
Neha Gupta ◽  
Vinita Jindal

The World Wide Web is a part of the Internet that provides data dissemination facility to people. The contents of the Web are crawled and indexed by search engines so that they can be retrieved, ranked, and displayed as a result of users' search queries. These contents that can be easily retrieved using Web browsers and search engines comprise the Surface Web. All information that cannot be crawled by search engines' crawlers falls under Deep Web. Deep Web content never appears in the results displayed by search engines. Though this part of the Web remains hidden, it can be reached using targeted search over normal Web browsers. Unlike Deep Web, there exists a portion of the World Wide Web that cannot be accessed without special software. This is known as the Dark Web. This chapter describes how the Dark Web differs from the Deep Web and elaborates on the commonly used software to enter the Dark Web. It highlights the illegitimate and legitimate sides of the Dark Web and specifies the role played by cryptocurrencies in the expansion of Dark Web's user base.


Author(s):  
K.G. Srinivasa ◽  
Anil Kumar Muppalla ◽  
Varun A. Bharghava ◽  
M. Amulya

In this paper, the authors discuss the MapReduce implementation of crawler, indexer, and ranking algorithms in search engines. The proposed algorithms are used in search engines to retrieve results from the World Wide Web. A crawler and an indexer in a MapReduce environment are used to improve the speed of crawling and indexing. The proposed ranking algorithm is an iterative method that makes use of the link structure of the Web and is developed using the MapReduce framework to improve the speed of convergence when ranking Web pages. Categorization is used to retrieve and order the results according to the user's choice, personalizing the search. A new score is introduced in this paper that is associated with each Web page and is calculated from the user's query and the number of occurrences of the query terms in the document corpus. The experiments are conducted on Web graph datasets, and the results are compared with serial versions of the crawler, indexer, and ranking algorithms.
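The abstract describes an iterative, link-structure-based ranking run on MapReduce. The sketch below shows one such iteration, a standard PageRank-style update expressed as a map step and a reduce step; it is illustrative and does not reproduce the authors' exact algorithm or score.

```python
# One PageRank-style iteration written as explicit map and reduce steps.
from collections import defaultdict

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}          # page -> outgoing links
ranks = {page: 1.0 / len(graph) for page in graph}
DAMPING = 0.85

def map_step(graph, ranks):
    """Emit (target_page, rank_contribution) for every outgoing link."""
    for page, links in graph.items():
        share = ranks[page] / len(links)
        for target in links:
            yield target, share

def reduce_step(contributions, n_pages):
    """Sum contributions per page and apply the damping factor."""
    totals = defaultdict(float)
    for page, share in contributions:
        totals[page] += share
    return {page: (1 - DAMPING) / n_pages + DAMPING * total
            for page, total in totals.items()}

for _ in range(20):                                        # iterate toward convergence
    ranks = reduce_step(map_step(graph, ranks), len(graph))
print({page: round(rank, 3) for page, rank in sorted(ranks.items())})
```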


Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today's most advanced engines use the keyword-based ("bag of words") paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates the user's quick browsing of the results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for clustering web search results based on a phrase-based clustering algorithm, as an alternative to the single ordered result list of search engines. The approach presents a list of clusters to the user. Experimental results verify the method's feasibility and effectiveness.
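The idea behind phrase-based clustering can be sketched as follows: group result snippets by a shared phrase (here simply the most frequent bigram) so that every cluster gets a readable label. This is illustrative only; the paper's algorithm is more elaborate, and the snippets below are hypothetical.

```python
# Toy phrase-based clustering: label each cluster with the shared bigram.
from collections import defaultdict

def bigrams(text):
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]

def cluster_by_phrase(snippets):
    """snippets: list of (url, snippet_text); returns phrase label -> list of urls."""
    phrase_counts = defaultdict(int)
    for _url, text in snippets:
        for phrase in set(bigrams(text)):
            phrase_counts[phrase] += 1
    clusters = defaultdict(list)
    for url, text in snippets:
        shared = [p for p in bigrams(text) if phrase_counts[p] > 1]
        label = max(shared, key=phrase_counts.get) if shared else "other"
        clusters[label].append(url)
    return dict(clusters)

results = [("u1", "java virtual machine tuning"),
           ("u2", "java virtual machine internals"),
           ("u3", "island of java travel guide")]
print(cluster_by_phrase(results))   # {'java virtual': ['u1', 'u2'], 'other': ['u3']}
```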

