SBLWPR – SIMILARITY BASED LINK WEIGHT FOR PAGERANK CALCULATION

2011 ◽  
pp. 249-258
Author(s):  
S. Poomagal ◽  
T. Hamsapriya

A search engine retrieves from its index the list of web pages relevant to a given query and sorts the list by page importance score. Several ranking algorithms are available in the literature for calculating the importance scores of web pages, and the basis of all of them is the link structure of the Web. Existing ranking algorithms assign no weight to links based on the similarity between the linked documents. Since links from similar documents are more important than links from dissimilar ones, a new method is introduced that assigns a weight to each link based on the similarity between the linked documents. The calculated link weight is added to the existing PageRank value to obtain the final PageRank. The proposed technique is compared with existing ranking algorithms using precision, recall and F-measure.
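The abstract does not give the paper's exact weighting formula, but the idea can be sketched as follows, assuming cosine similarity over term-frequency vectors as the link weight and a weight-normalized PageRank update (both are assumptions, not the authors' published definitions):

```python
import math

def cosine_similarity(a, b):
    # a, b: term-frequency dicts for two documents
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def weighted_pagerank(links, docs, d=0.85, iters=50):
    # links: {page: [outlinked pages]}; docs: {page: term-frequency dict}
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    # link weight = similarity between the two linked documents
    weight = {(u, v): cosine_similarity(docs[u], docs[v])
              for u in links for v in links[u]}
    for _ in range(iters):
        new = {}
        for p in pages:
            s = 0.0
            for u in pages:
                if p in links[u]:
                    # each source distributes its rank in proportion to link weight
                    total = sum(weight[(u, v)] for v in links[u]) or 1.0
                    s += rank[u] * weight[(u, p)] / total
            new[p] = (1 - d) / n + d * s
        rank = new
    return rank
```

Because each page distributes its rank fully across its weighted outlinks, the scores still sum to one, as in standard PageRank.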

Author(s):  
Ravi P. Kumar ◽  
Ashutosh K. Singh ◽  
Anand Mohan

In this era of Web computing, cyber security is very important as more and more data moves onto the Web. Some of these data are confidential and important, and they face many threats. Some basic threats can be addressed by designing Web sites properly using Search Engine Optimization techniques. One such threat is the hanging page, which gives room for link spamming. This chapter addresses the issues caused by hanging pages in Web computing and has four objectives: 1) compare and review the different types of link-structure-based ranking algorithms for Web pages, with PageRank used as the base algorithm throughout the chapter; 2) study hanging pages, explore their effects on Web security and compare the existing methods of handling them; 3) study link spam and explore the contribution of hanging pages to it; and 4) study Search Engine Optimization (SEO) / Web Site Optimization (WSO) and explore the effect of hanging pages on SEO.
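A hanging page in the PageRank sense is a page with no outgoing links (a dangling node). A minimal sketch of detecting such pages, and of the standard remedy of redistributing their rank uniformly so they do not act as rank sinks (the chapter's own handling methods may differ):

```python
def hanging_pages(links):
    # links: {page: set of outlinked pages}; a hanging page has no outlinks
    return {p for p, outs in links.items() if not outs}

def pagerank(links, d=0.85, iters=50):
    # Standard PageRank; the rank held by hanging pages is redistributed
    # uniformly each iteration, a common fix so they do not absorb rank.
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {}
        for p in pages:
            s = sum(rank[u] / len(links[u]) for u in pages if p in links[u])
            new[p] = (1 - d) / n + d * (s + dangling / n)
        rank = new
    return rank
```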


2013 ◽  
Vol 303-306 ◽  
pp. 2311-2316
Author(s):  
Hong Shen Liu ◽  
Peng Fei Wang

The structure and content of a research search engine are presented; its core technology is the analysis of web pages. The characteristics of analyzing the web pages of a single website are studied: relations between the pages a web crawler obtains at two different times can be established, and the changes between them found easily. A new method of analyzing the web pages of a website is introduced that works from this change information. The results of applying the method show that it is effective for the analysis of web pages.
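The abstract does not specify how change information between two crawls is computed; one common approach, sketched here as an assumption, is to fingerprint each page's content and diff the two snapshots:

```python
import hashlib

def fingerprint(html):
    # Content fingerprint of a page; any stable hash will do.
    return hashlib.sha256(html.encode('utf-8')).hexdigest()

def diff_crawls(old, new):
    # old, new: {url: html} snapshots of one website from two crawl runs.
    # Returns the URLs added, removed, and changed between the crawls.
    added = set(new) - set(old)
    removed = set(old) - set(new)
    changed = {u for u in set(old) & set(new)
               if fingerprint(old[u]) != fingerprint(new[u])}
    return added, removed, changed
```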


Author(s):  
K.G. Srinivasa ◽  
Anil Kumar Muppalla ◽  
Varun A. Bharghava ◽  
M. Amulya

In this paper, the authors discuss the MapReduce implementation of the crawler, indexer and ranking algorithms of a search engine. The proposed algorithms are used in search engines to retrieve results from the World Wide Web. A crawler and an indexer in a MapReduce environment are used to improve the speed of crawling and indexing. The proposed ranking algorithm is an iterative method that makes use of the link structure of the Web and is developed in the MapReduce framework to improve the speed of convergence when ranking web pages. Categorization is used to retrieve and order the results according to the user's choice, personalizing the search. A new score is introduced that is associated with each web page and is calculated from the user's query and the number of occurrences of the query terms in the document corpus. The experiments are conducted on Web graph datasets, and the results are compared with serial versions of the crawler, indexer and ranking algorithms.
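The classic way to express one PageRank iteration in MapReduce is for the mapper to emit each page's rank share along its outlinks (plus the link structure itself, so it survives the shuffle) and for the reducer to sum the incoming shares. A single-process simulation of that round, as a sketch rather than the authors' actual Hadoop jobs:

```python
from collections import defaultdict

def map_phase(page, rank, outlinks):
    # Mapper: re-emit the link structure, then emit this page's
    # rank share to each of its outlinks.
    yield page, ('links', outlinks)
    for target in outlinks:
        yield target, ('share', rank / len(outlinks))

def reduce_phase(page, values, n, d=0.85):
    # Reducer: sum the incoming shares into the new rank value.
    outlinks, total = [], 0.0
    for kind, v in values:
        if kind == 'links':
            outlinks = v
        else:
            total += v
    return page, (1 - d) / n + d * total, outlinks

def pagerank_iteration(graph, ranks, d=0.85):
    # graph: {page: [outlinks]} (no dangling pages in this sketch);
    # simulates one full map-shuffle-reduce round.
    n = len(graph)
    grouped = defaultdict(list)
    for page in graph:
        for key, value in map_phase(page, ranks[page], graph[page]):
            grouped[key].append(value)
    new_ranks = {}
    for page, values in grouped.items():
        p, r, _ = reduce_phase(page, values, n, d)
        new_ranks[p] = r
    return new_ranks
```

Iterating this round to convergence is what the MapReduce framework parallelizes across machines.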


Author(s):  
Bouchra Frikh ◽  
Brahim Ouhbi

The World Wide Web has become the biggest and most popular medium of communication and information dissemination. The Web expands every day, and people generally rely on search engines to explore it. Because of its rapid and chaotic growth, the resulting network of information lacks organization and structure, and it is a challenge for service providers to deliver proper, relevant, high-quality information to Internet users by exploiting web page contents and the hyperlinks between pages. This paper analyzes and compares web page ranking algorithms on various parameters to find their advantages and limitations and to indicate the further scope of research in this area. Six important algorithms are presented and their performance discussed: PageRank, Query-Dependent PageRank, HITS, SALSA, Simultaneous Terms Query-Dependent PageRank (SQD-PageRank) and Onto-SQD-PageRank.
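Of the six algorithms compared, HITS is the one that scores pages on two mutually reinforcing axes: a page is a good authority if good hubs link to it, and a good hub if it links to good authorities. A textbook sketch of that iteration (not tied to this paper's evaluation setup):

```python
import math

def hits(links, iters=50):
    # links: {page: [outlinked pages]}; returns (hub, authority) scores.
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        # authority: sum of hub scores of the pages linking in
        auth = {p: sum(hub[u] for u in pages if p in links[u]) for p in pages}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # hub: sum of authority scores of the pages linked to
        hub = {p: sum(auth[v] for v in links[p]) for p in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth
```

Unlike PageRank, HITS is usually run at query time on the subgraph of pages matching the query, which is what makes query-dependent variants natural to compare against it.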


Author(s):  
GAURAV AGARWAL ◽  
SACHI GUPTA ◽  
SAURABH MUKHERJEE

Today, web servers are the key repositories of information, and the Internet is the means of accessing it. There is a mammoth amount of data on the Internet, and searching out the relevant data is a difficult job in which the search engine plays a vital role. A search engine follows these steps: web crawling by the crawler, indexing by the indexer and searching by the searcher. The web crawler retrieves information about web pages by following every link on a site; the search engine stores this, and the indexer then indexes the content of each page. The main role of the indexer is to make data quickly retrievable according to user requirements. When a client issues a query, the search engine searches for the results corresponding to it in order to provide the best output. The ambition here is to develop a search engine algorithm that returns the most desirable results for the user's requirements, using a ranking method to rank the web pages. Various ranking approaches are discussed in the literature, but in this paper a ranking algorithm is proposed that is based on the parent-child relationship. The proposed algorithm is based on the priority-assignment phase of the Heterogeneous Earliest Finish Time (HEFT) algorithm, which was designed for multiprocessor task scheduling. It works on three variables: the density of keywords, the number of successors of a node and the age of the web page. Density is the rate of occurrence of the keyword on a particular web page; the number of successors is the number of outgoing links from a single web page; and age is the freshness value of the page, so the most recently modified page is the freshest and has the smallest age, i.e. the largest freshness value. The proposed technique requires that the priority of each page be set from its downward rank value and that pages be arranged in ascending or descending order of their rank values. Experiments show that the algorithm is valuable: in a comparison with Google, it performed better on 70% of the test problems.
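The abstract names the three variables but not how they are combined. A hypothetical combination, with weights and the freshness formula chosen purely for illustration (neither is taken from the paper):

```python
def page_score(keyword_count, total_words, successors, age_days,
               w_density=0.5, w_links=0.3, w_fresh=0.2):
    # Illustrative combination of the three variables the paper names:
    # keyword density, number of successors (outgoing links), and page age.
    # The weights and the freshness transform are assumptions.
    density = keyword_count / total_words if total_words else 0.0
    freshness = 1.0 / (1.0 + age_days)   # smaller age => fresher => higher
    return w_density * density + w_links * successors + w_fresh * freshness

def rank_pages(pages):
    # pages: {url: (keyword_count, total_words, successors, age_days)};
    # returns URLs in descending order of score.
    scores = {u: page_score(*feats) for u, feats in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)
```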


2019 ◽  
Vol 12 (2) ◽  
pp. 110-119 ◽  
Author(s):  
Jayaraman Sethuraman ◽  
Jafar A. Alzubi ◽  
Ramachandran Manikandan ◽  
Mehdi Gheisari ◽  
Ambeshwar Kumar

Background: The World Wide Web houses an abundance of information that is used every day by billions of users across the world to find relevant data. Website owners employ webmasters to ensure their pages rank at the top of search engine result pages. However, understanding how a search engine ranks a website, which comprises numerous web pages, among the top ten or twenty is a major challenge. Although systems have been developed to understand the ranking process, a specialized tool-based approach has not been tried. Objective: This paper develops a new framework and system that process website contents to determine search engine optimization factors. Methods: To analyze web pages dynamically by assessing site content against specific keywords, an elimination method was used in an attempt to reveal the various search engine optimization techniques at work. Conclusion: Our results lead us to conclude that the developed system is able to perform a deeper analysis and find the factors that play a role in bringing a site to the top of the list.
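The paper's actual factor set and elimination procedure are not given in the abstract; as a flavor of keyword-based on-page analysis, a sketch checking two commonly cited factors (these two factors are an assumption, not the paper's list):

```python
import re

def seo_factors(html, keyword):
    # Two illustrative on-page SEO factors for one keyword:
    # presence in the <title> element, and keyword density in the text.
    text = re.sub(r'<[^>]+>', ' ', html).lower()
    words = text.split()
    title = re.search(r'<title>(.*?)</title>', html, re.I | re.S)
    return {
        'keyword_in_title': bool(title and keyword.lower() in title.group(1).lower()),
        'keyword_density': words.count(keyword.lower()) / len(words) if words else 0.0,
    }
```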


2017 ◽  
Vol 8 (1) ◽  
pp. 1-22 ◽  
Author(s):  
Sahar Maâlej Dammak ◽  
Anis Jedidi ◽  
Rafik Bouaziz

With the great mass of pages managed throughout the world, and especially since the advent of the Web, it has become more difficult to find the relevant pages in answer to a query. Furthermore, manual filtering of indexed Web pages is a laborious task. A new method for filtering both annotated Web pages (produced by our semantic annotation process) and non-annotated Web pages (retrieved from the search engine Google) is therefore necessary to group the Web pages relevant to the user. In this paper, the authors first synthesize their previous work on the semantic annotation of Web pages. They then define a new filtering method based on three activities and present their querying and filtering component for Web pages, whose purpose is to demonstrate the feasibility of the method. Finally, the authors present an evaluation of this component, which has proved its performance across multiple domains.


2013 ◽  
Vol 347-350 ◽  
pp. 2479-2482
Author(s):  
Yao Hui Li ◽  
Li Xia Wang ◽  
Jian Xiong Wang ◽  
Jie Yue ◽  
Ming Zhan Zhao

The Web has become the largest information source, but noise content is an inevitable part of any web page. Noise content reduces the precision of search engines and increases server load. Information extraction technology, mostly based on page segmentation, has been developed in response. Through analysis of existing page segmentation methods, an approach to web page information extraction is proposed in which block nodes are identified by analyzing the attributes of HTML tags. The algorithm is easy to implement, and experiments demonstrate its good performance.


2014 ◽  
Vol 971-973 ◽  
pp. 1870-1873
Author(s):  
Xiao Gang Dong

A web search engine based on DNS, the standard solution proposed by the IETF for a public web search system, is introduced in this paper. At present, no web search engine can cover more than 60 percent of all the pages on the Internet, and the update interval of most page databases is almost one month; this situation has not changed for many years. Coverage and recency have thus become the bottleneck problems of current web search engines. To solve these problems, a new system, a search engine based on DNS, is proposed. This system adopts a hierarchical distributed architecture like that of DNS, which differs from any current commercial search engine. In theory, the system can cover all the web pages on the Internet, and its update interval could even be one day. The original idea, detailed design and implementation of the system are all presented in this paper.
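The delegation idea behind such a hierarchy can be illustrated in miniature: a query walks down a tree of index nodes keyed by domain labels, DNS-style, until it reaches the node responsible for that domain. The node structure and search interface below are assumptions for illustration, not the paper's design:

```python
class IndexNode:
    # One node in a DNS-like hierarchy of index servers: a node either
    # delegates the query to the child responsible for the next domain
    # label, or answers it from its own local index.
    def __init__(self, label, pages=None):
        self.label = label
        self.children = {}
        self.pages = pages or {}   # {url: page text} indexed at this node

    def add_child(self, node):
        self.children[node.label] = node

    def search(self, domain_labels, term):
        # domain_labels: e.g. ['com', 'example'], from the TLD downwards.
        if domain_labels and domain_labels[0] in self.children:
            return self.children[domain_labels[0]].search(domain_labels[1:], term)
        return [u for u, text in self.pages.items() if term in text]
```

Because each zone indexes only its own pages, updates stay local, which is what makes the one-day refresh interval plausible in principle.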

