Optimizing webpage relevancy using page ranking and content-based ranking

Web data mining methods can be divided into several categories according to the kind of information mined and the goals each category sets: Web structure mining, Web usage mining, and Web content mining. This paper proposes a new Web content mining method for ranking page relevance based on analysis of page content. The method, called Page Content Rank (PCR) in the paper, combines a number of heuristics that appear to be important for analyzing the content of Web pages. The importance of a page is determined from the importance of the terms it contains. The importance of a term is determined with respect to a given query q and is based on the term's statistical and linguistic features. The source set of pages for mining is the set of pages returned by a search engine in response to the query q. PCR uses a neural network as its internal classification structure. We describe an implementation of the proposed method and compare its results with those of another existing classification system, the PageRank algorithm.
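The idea of deriving page importance from term importance can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption: the neural network is replaced by a plain heuristic (the fraction of result pages containing a term, plus a fixed bonus for query terms), and the names `term_importance`, `page_scores`, `query_bonus` and the toy pages are hypothetical. It is not the PCR implementation.

```python
from collections import Counter

def term_importance(pages, query_terms, query_bonus=0.5):
    """Score each term by the fraction of result pages that contain it
    (a statistical feature), plus a fixed bonus for query terms.
    Both choices are stand-ins for the neural-network classifier
    mentioned in the abstract."""
    doc_freq = Counter(t for words in pages.values() for t in set(words))
    n = len(pages)
    return {t: df / n + (query_bonus if t in query_terms else 0.0)
            for t, df in doc_freq.items()}

def page_scores(pages, query_terms):
    """Page importance = mean importance of the distinct terms on the page."""
    imp = term_importance(pages, query_terms)
    return {url: sum(imp[t] for t in set(words)) / len(set(words))
            for url, words in pages.items() if words}

# Hypothetical result set for the query "web mining" (tokenized page text).
pages = {
    "p1": ["web", "mining", "ranking", "web"],
    "p2": ["web", "cooking", "recipes"],
    "p3": ["mining", "web"],
}
scores = page_scores(pages, {"web", "mining"})
best = max(scores, key=scores.get)
```

In this toy scoring, pages made up mostly of terms shared across the result set and present in the query rank highest; a real system would also need linguistic features, which the sketch omits.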


Research on an Enhanced web Information Processing Technology based on AIS text Mining

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering), 2020, Vol 13. DOI: 10.2174/2352096513999201026224357

Author(s): Canhui Li

Keyword(s): Text Mining, Symmetric Matrix, Information Support, Web Content, Mining System, Information Efficiency, Web Content Mining, Content Mining, Web Text Mining, Text Mining System

Background: Filtration is used to improve information efficiency in web text mining. Methods: Augmented information support (AIS), a web content mining technology based on web text mining, is proposed to improve the efficiency of web text mining. The AIS technology is applied to the Xiangshan science conference website, and the AIS4XSSC text mining system is developed. The system is tested for efficiency, and its main functions are discussed. Results: Each of the 192 documents is represented as a vector with 8352 components, giving a 192 × 8352 matrix; the similarity between the 192 document vectors is calculated using the cosine of the included angle, giving a 192 × 192 symmetric matrix; and 35 categories are formed by hierarchical clustering based on the similarity between texts. Conclusion: The results show that the AIS technology can effectively extract information from a large amount of web text. The proposed system improves information retrieval efficiency and can push valuable information to users.
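The pipeline described in the Results (document vectors, cosine similarity, a symmetric similarity matrix, hierarchical clustering) can be sketched in a few lines of plain Python. The toy vectors, the function names `cosine`, `similarity_matrix` and `cluster`, the average-linkage rule and the stopping threshold are all assumptions for illustration; the abstract does not say which linkage or cut-off AIS4XSSC uses.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_matrix(docs):
    """n x n symmetric matrix of pairwise cosine similarities."""
    n = len(docs)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sim[i][j] = sim[j][i] = cosine(docs[i], docs[j])
    return sim

def cluster(sim, threshold):
    """Agglomerative clustering with average linkage (an assumed choice).

    Repeatedly merges the two most similar clusters until no pair has an
    average similarity of at least `threshold`.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                links = [sim[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(links) / len(links)
                if avg > best:
                    best, pair = avg, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy term-count vectors (hypothetical data: 4 documents x 3 terms).
docs = [[2, 0, 1], [4, 0, 2], [0, 3, 0], [0, 1, 0]]
sim = similarity_matrix(docs)
groups = cluster(sim, threshold=0.9)
```

With 192 documents the same code would produce the 192 × 192 symmetric matrix mentioned above; the number of resulting categories depends on the threshold, which is why it is left as a parameter here.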


A Heuristic Mining Algorithm Using Web Hyperlink Structure

Advanced Materials Research, 2010, Vol 108-111, pp. 11-16. DOI: 10.4028/www.scientific.net/amr.108-111.11

Author(s): Chun Lai Chai

Keyword(s): Data Mining, Web Mining, Web Usage Mining, Web Content, Web Content Mining, Web Structure, Mining Algorithm, Web Structure Mining, Content Mining, The Web

Web mining aims to discover useful information or knowledge from the Web hyperlink structure, page content, and usage logs. Based on the primary kind of data used in the mining process, Web mining tasks are categorized into three main types: Web structure mining, Web content mining, and Web usage mining. After outlining what each of these does in Web data mining, this paper proposes a heuristic mining algorithm.


Web Pages Ranking Algorithms: A Survey

Qubahan Academic Journal, 2021, Vol 1 (3), pp. 29-34. DOI: 10.48161/qaj.v1n3a79

Author(s): Ayad Abdulrahman

Keyword(s): Search Engines, Relevant Information, Web Pages, Web Content, Ranking Algorithms, Page Ranking, Internet Users, User Query, Web Structure Mining, Content Mining

Due to the daily expansion of the web, the amount of information has increased significantly. Thus, the need for retrieving relevant information has also increased. To explore the internet, users depend on various search engines. Search engines face a significant challenge in returning the most relevant results for a user's query. A search engine's performance is determined by the algorithm it uses to rank web pages, which places the most relevant pages at the top of the results page. In this paper, various web page ranking algorithms such as PageRank, Time Rank, EigenRumor, Distance Rank, SimRank, etc. are analyzed and compared on several parameters: the mining technique to which the algorithm belongs (for instance, Web Content Mining, Web Structure Mining, or Web Usage Mining), the methodology used for ranking web pages, time complexity (the amount of time needed to run the algorithm), input parameters (parameters used in the ranking process, such as InLink, OutLink, Tag name, Keyword, etc.), and the relevancy of the results to the user query.
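As a concrete instance of a link-based (Web structure mining) ranking algorithm, here is a minimal sketch of PageRank by power iteration. The damping factor 0.85, the even redistribution of rank from pages without OutLinks, the function name `pagerank` and the four-page graph are illustrative assumptions, not details taken from the survey.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(out_links) | {q for qs in out_links.values() for q in qs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no OutLinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not out_links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            targets = out_links.get(p, [])
            for q in targets:
                # Each page passes an equal share of its rank along each OutLink.
                new[q] += damping * rank[p] / len(targets)
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank

# Hypothetical link graph: C has the most InLinks, D has none.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
top_page = max(ranks, key=ranks.get)
```

Each iteration touches every page and every link once, so the running time per iteration grows linearly with the number of pages plus links, which is the kind of time-complexity comparison the survey refers to.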
