The Application of Matrix Partitioning Algorithm in PageRank Computational Efficiency

2014 ◽  
Vol 998-999 ◽  
pp. 939-942
Author(s):  
Ming Fu Jiang

With the vigorous development of the Internet information age, work efficiency can be improved by finding needed information accurately and quickly. It is therefore vitally important to rank the relevant web pages that the Internet provides. This paper proposes a PageRank algorithm based on matrix partitioning to rank relevant web pages and applies it to computational test cases; the experimental results show that matrix partitioning can reduce the number of iterations and improve PageRank's computational efficiency.
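The abstract does not spell out the partitioning scheme, so the following Python sketch is only an illustration of the general idea, with all names hypothetical. It performs a Gauss-Seidel-style block sweep, in which later row blocks of the link matrix already use the freshly updated rank values of earlier blocks; this is one standard way a partitioned update can cut the iteration count, though whether it matches the paper's exact scheme is an assumption.

    import numpy as np

    def pagerank_partitioned(M, d=0.85, n_blocks=4, tol=1e-8, max_iter=1000):
        # M is the column-stochastic link matrix (n x n), d the damping factor.
        n = M.shape[0]
        r = np.full(n, 1.0 / n)
        bounds = np.linspace(0, n, n_blocks + 1, dtype=int)
        for it in range(max_iter):
            r_old = r.copy()
            # Gauss-Seidel-style sweep: each row block of M @ r is computed
            # with the already-updated entries of r from earlier blocks.
            for b in range(n_blocks):
                lo, hi = bounds[b], bounds[b + 1]
                r[lo:hi] = d * (M[lo:hi, :] @ r) + (1.0 - d) / n
            if np.abs(r - r_old).sum() < tol:  # L1 convergence test
                return r, it + 1
        return r, max_iter

Returning the iteration count alongside the ranks makes the claimed reduction in iterations directly measurable against plain power iteration.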

2020 ◽  
Vol 2 (1) ◽  
pp. 17-21
Author(s):  
Fares Hasan ◽  
Koo Kwong Ze ◽  
Rozilawati Razali ◽  
Abudhahir Buhari ◽  
Elisha Tadiwa

PageRank is an algorithm that brings order to the Internet by returning the best results for a user's search query. The algorithm scores a webpage by analysing its link structure, which reflects whether the webpage is relevant or not. However, problems remain concerning the time needed to calculate the PageRank of all webpages: the turnaround time is long because the number of webpages on the Internet is enormous and keeps increasing. Secondly, the results returned by the algorithm are biased towards old webpages, so newly created webpages receive lower rankings than old ones even when the new pages have comparatively more relevant information. To overcome these setbacks, this research proposes an alternative hybrid algorithm based on an optimized normalization technique and a content-based approach. The proposed algorithm reduces the number of iterations required to calculate the PageRank, and hence improves efficiency, by computing the mean of all PageRank values and normalizing each value by that mean. This is complemented by scoring the links of web pages according to their validity rather than their conventional popularity.
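The abstract gives no formulas, so the Python sketch below is only one plausible reading of the mean-based normalization it describes (function and variable names are invented): after each power-iteration update, every rank value is divided by the mean rank, which keeps the values on a fixed scale even when rank mass leaks through dangling pages.

    import numpy as np

    def mean_normalized_pagerank(M, d=0.85, tol=1e-8, max_iter=1000):
        # M: column-stochastic link matrix; ranks are rescaled by their
        # mean after every update, so they hover around 1.0 instead of
        # drifting when dangling pages leak rank mass.
        n = M.shape[0]
        r = np.ones(n)
        for it in range(max_iter):
            r_new = d * (M @ r) + (1.0 - d)
            r_new /= r_new.mean()          # mean-based normalization step
            if np.abs(r_new - r).max() < tol:
                return r_new, it + 1
            r = r_new
        return r, max_iter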


Author(s):  
Santosh Kumar ◽  
Ravi Kumar

The Internet is enormous and growing exponentially, and finding relevant information in such a huge information source is becoming very difficult. Millions of web pages are returned in response to an ordinary user query, and displaying these web pages without ranking makes it very hard for the user to find relevant results. This paper proposes a novel approach that utilizes web content, usage, and structure data to prioritize web documents. The proposed approach has applications in several major areas, such as web personalization, adaptive website development, recommendation systems, search engine optimization, and business intelligence solutions. Further, the proposed approach has been compared experimentally with other approaches (WDPGA, WDPSA, and WDPII), and it has been observed that, with a small trade-off in time, it has an edge over them.


Author(s):  
Jos van Iwaarden ◽  
Ton van der Wiele ◽  
Roger Williams ◽  
Steve Eldridge

The Internet has come of age as a global source of information about every topic imaginable. A company like Google has become a household name in Western countries, and using its internet search engine is so popular that “Googling” has even become a verb in many Western languages. Whether for business or private purposes, people worldwide rely on Google to present them with relevant information. Even the scientific community increasingly employs Google’s search engine to find academic articles and other sources of information about the topics it studies. Yet the vast amount of information available on the internet is gradually changing in nature. Initially, information was uploaded by the administrators of a web site and was then visible to all visitors of the site. This approach meant that web sites tended to be limited in the amount of content they provided, and that such content was strictly controlled by the administrators. Over time, web sites have granted their users the authority to add information to web pages, and sometimes even to alter existing information. Current examples of such web sites are eBay (auction), Wikipedia (encyclopedia), YouTube (video sharing), LinkedIn (social networking), Blogger (weblogs) and Delicious (social bookmarking).


Author(s):  
Vijay Kasi ◽  
Radhika Jain

In the context of the Internet, a search engine can be defined as a software program designed to help one access information, documents, and other content on the World Wide Web. The adoption and growth of the Internet in the last decade has been unprecedented. The World Wide Web has always been applauded for its simplicity and ease of use, which is evident from how little knowledge one requires to build a Web page. This flexibility has enabled the Internet's rapid growth and adoption, but it has also made it hard to search for relevant information on the Web. The number of Web pages has been increasing at an astronomical pace, from around 2 million registered domains in 1995 to 233 million registered domains in 2004 (Consortium, 2004). The Internet, considered a distributed database of information, has the CRUD (create, retrieve, update, and delete) rule applied to it. While the Internet has been effective at creating, updating, and deleting content, it has lagged considerably in enabling the retrieval of relevant information. After all, there is no point in having a Web page that has little or no visibility on the Web. Since the 1990s, when the first search program was released, we have come a long way in terms of searching for information. Although we are currently witnessing tremendous growth in search engine technology, the growth of the Internet has overtaken it, leaving the existing search engine technology falling short.

When we apply the metrics of relevance, rigor, efficiency, and effectiveness to the search domain, it becomes clear that we have progressed on the rigor and efficiency metrics by utilizing abundant computing power to produce faster searches over a lot of information. Rigor and efficiency are evident in the large number of pages indexed by the leading search engines (Barroso, Dean, & Holzle, 2003). However, more research needs to be done to address the relevance and effectiveness metrics. Users typically type in two to three keywords when searching, only to end up with a search result containing thousands of Web pages! This has made it increasingly hard to find useful, relevant information. Search engines today face a number of challenges that require them to perform rigorous searches with relevant results efficiently so that they are effective. These challenges include the following (“Search Engines,” 2004):

1. The Web is growing at a much faster rate than any present search engine technology can index.
2. Web pages are updated frequently, forcing search engines to revisit them periodically.
3. Dynamically generated Web sites may be slow or difficult to index, or may result in excessive results from a single Web site.
4. Many dynamically generated Web sites cannot be indexed by search engines at all.
5. The commercial interests of a search engine can interfere with the order of relevant results it shows.
6. Content that is behind a firewall or is password protected is not accessible to search engines (such as content found in several digital libraries).
7. Some Web sites have started using tricks such as spamdexing and cloaking to manipulate search engines into displaying them as the top results for a set of keywords. This pollutes the search results, pushing more relevant links down the result list, and is a consequence of the popularity of Web searches and the business potential search engines can generate today.
8. Search engines index all the content of the Web without any bounds on the sensitivity of information, which has raised security and privacy flags.

With the above background and challenges in mind, we lay out the article as follows. In the next section, we begin with a discussion of search engine evolution. To facilitate the examination and discussion of the progress of search engine development, we break this discussion down into three generations of search engines. Figure 1 depicts this evolution pictorially and highlights the need for better search engine technologies. Next, we present a brief discussion of the contemporary state of search engine technology and the various types of content searches available today. With this background, the following section documents various concerns about existing search engines, setting the stage for better search engine technology. These concerns include information overload, relevance, representation, and categorization. Finally, we briefly address the research efforts under way to alleviate these concerns and then present our conclusion.


2020 ◽  
Vol 4 (3) ◽  
pp. 97
Author(s):  
Zikai Chen

After the emergence of the Internet, information technology has developed rapidly, and the openness and digitalization of the era have deepened further. With countless digital files facing people, the requirements for archives management services have gradually risen. On this basis, this article explores an innovative archives management service mode for the information age. Starting with an analysis of the new user-centered service mode, the article also clarifies how to realize a personalized service mode so as to truly improve the level of service.


2019 ◽  
Vol 1 (2) ◽  
Author(s):  
Yu Hou ◽  
Lixin Tao

As a tsunami of data has emerged, search engines have become the most powerful tool for obtaining scattered information on the internet. Traditional search engines return organized results by using ranking algorithms such as term frequency and link analysis (the PageRank and HITS algorithms). However, these algorithms must rely on keyword frequency to determine the relevance between a user's query and the data in a computer system or on the internet. Moreover, we expect search engines to understand a user's search by the meaning of its content rather than by literal strings. The Semantic Web is an intelligent network that can understand human language more semantically and make communication between humans and computers easier. However, current semantic search technology is hard to apply, because metadata must be annotated on each web page before a search engine can understand the user's intent, and annotating every web page is very time-consuming and inefficient. Therefore, this study designed an ontology-based approach to improve traditional keyword-based search and emulate the effects of semantic search, letting the search engine understand users more semantically once it has the knowledge.
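The study's ontology is not reproduced in the abstract, but the core idea of emulating semantic search on top of a keyword index can be sketched in a few lines of Python; the toy ontology and all names below are invented for illustration, not the authors' design.

    # Toy ontology: each term maps to semantically related concepts.
    ONTOLOGY = {
        "car": ["automobile", "vehicle"],
        "physician": ["doctor", "clinician"],
    }

    def expand_query(query):
        # Add each keyword's ontology neighbours so a plain keyword
        # index can match documents that use different wording.
        expanded = set()
        for word in query.lower().split():
            expanded.add(word)
            expanded.update(ONTOLOGY.get(word, []))
        return expanded

    print(expand_query("car physician"))
    # {'car', 'automobile', 'vehicle', 'physician', 'doctor', 'clinician'}

Expanding the query rather than annotating every page sidesteps the per-page metadata cost the abstract identifies as the bottleneck of semantic search.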


Think India ◽  
2019 ◽  
Vol 22 (2) ◽  
pp. 174-187
Author(s):  
Harmandeep Singh ◽  
Arwinder Singh

Nowadays, the internet serves people with different services in different fields. Profit as well as non-profit organizations use the internet for various business purposes, a major one being to communicate financial as well as non-financial information on their websites. This study was conducted on the top 30 BSE-listed public sector companies to measure the extent of governance disclosure (non-financial information) on their web pages. The disclosure index approach was used to examine the extent of governance disclosure on the internet. The governance index was constructed and broadly categorized into three dimensions: organization and structure; strategy and planning; and accountability, compliance, philosophy and risk management. The empirical evidence of the study reveals that all the Indian public sector companies have a website and that, on average, 67% of companies disclose some kind of governance information directly on their websites. Further, we found extreme variations in web disclosure between the three categories, i.e., the Maharatnas, the Navratnas, and the Miniratnas. However, the result of the Kruskal-Wallis test indicates that there is no significant difference between the three categories. The study provides valuable insights into the Indian economy. It shows that Indian public sector companies use the internet for governance disclosure to some extent but lack symmetry in their disclosure, because there is no regulation for web disclosure. Thus, the study recommends a regulated framework for web disclosure so that stakeholders can be assured of the transparency and reliability of the information.
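For readers unfamiliar with the test, the category comparison can be reproduced in outline with SciPy; the disclosure-index scores below are made-up placeholders, not the study's data.

    from scipy.stats import kruskal

    # Hypothetical disclosure-index scores (share of index items
    # disclosed) for the three categories of public sector companies.
    maharatna = [0.72, 0.65, 0.80, 0.70, 0.74]
    navratna = [0.61, 0.68, 0.55, 0.66, 0.63]
    miniratna = [0.58, 0.62, 0.50, 0.64, 0.60]

    stat, p = kruskal(maharatna, navratna, miniratna)
    print(f"H = {stat:.2f}, p = {p:.3f}")
    # A p-value above 0.05 would match the paper's finding of no
    # significant difference in web disclosure between the categories.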


Author(s):  
Aleksey V. Kutuzov

The article substantiates the need to use Internet monitoring as a priority source of information in countering extremism. Various approaches to defining the categories of "operational-search" and "law-enforcement" monitoring of the Internet are analysed, and the theoretical development of this category in the science of operational search is investigated. The goals and subjects of law-enforcement monitoring are identified. The main attention is paid to the legal basis for the use of Internet monitoring in the detection and investigation of extremist crimes. In the course of the study, hermeneutic, formal-logical, logical-legal and comparative-legal methods were employed, both individually and collectively, in the analysis of legal norms, the achievements of science and practice, and the development of proposals to refine the conduct of operational-search measures on the Internet when solving extremist crimes. The author's definition of "operational-search monitoring" of the Internet is provided. Proposals are made to improve the activities of police units when monitoring the Internet in the search for information relevant to the disclosure and investigation of crimes of this category.

