An Intelligent Web Search Using Multi-Document Summarization

2016 ◽  
Vol 6 (2) ◽  
pp. 41-65 ◽  
Author(s):  
Sheetal A. Takale ◽  
Prakash J. Kulkarni ◽  
Sahil K. Shah

Information available on the internet is huge, diverse, and dynamic. Current search engines provide intelligent help to internet users: for a query, they return a list of the best-matching or most relevant web pages. However, the information for a query is often spread across multiple pages returned by the search engine, which degrades the quality of the search results; the search engines are drowning in information but starving for knowledge. Here, we present a query-focused extractive summarization of search engine results. We propose a two-level summarization process: identification of relevant theme clusters, and selection of top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved by applying the MDL principle, and sentence clustering and ranking are done using symmetric non-negative matrix factorization (SNMF). The experiments conducted demonstrate the effectiveness of the system in semantic text understanding, document clustering, and summarization.
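
As a rough illustration of the sentence clustering and ranking step, the sketch below factorizes a toy sentence-similarity matrix with symmetric non-negative matrix factorization (W ≈ HHᵀ) using a damped multiplicative update; the similarity values, cluster count, and ranking proxy are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def snmf(W, k, iters=200, eps=1e-9, seed=0):
    """Symmetric NMF: approximate a similarity matrix W (n x n) by H @ H.T."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k))
    for _ in range(iters):
        # Damped multiplicative update; keeps H non-negative.
        H *= 0.5 + 0.5 * (W @ H) / (H @ (H.T @ H) + eps)
    return H

# Toy symmetric, non-negative sentence-similarity matrix.
W = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
H = snmf(W, k=2)
clusters = H.argmax(axis=1)   # each sentence goes to its strongest cluster
strength = H.max(axis=1)      # within-cluster strength as a simple ranking proxy
print(clusters, strength)
```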

2021 ◽  
Author(s):  
Xiangyi Chen

Text, link, and usage information are the most commonly used sources in the ranking algorithm of a web search engine. In this thesis, we argue that the quality of a web page, such as the performance of page delivery (e.g., reliability and response time), should also play an important role in ranking, especially for users with a slow Internet connection or mobile users. Based on this principle, if two pages have the same level of relevancy to a query, the one with the higher delivery quality (e.g., a faster response) should be ranked higher. We define several important Quality of Service (QoS) attributes and explain how we rank web pages based on them. In addition, we have tested and compared different aggregation algorithms for combining these QoS attributes. The experimental results show that the proposed algorithms promote pages with higher delivery quality to higher positions in the result list, which helps users improve their overall experience of using the search engine, and that the QoS-based re-ranking algorithm consistently achieves the best performance.
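
A minimal sketch of the QoS re-ranking idea under simplifying assumptions: relevance is combined with a normalized delivery-quality score by a weighted sum, which is only one possible aggregation scheme; the attribute names, weights, and normalization are illustrative, not the thesis's actual algorithms.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    relevance: float      # score from the text/link/usage ranker, in [0, 1]
    response_time: float  # seconds, lower is better
    reliability: float    # fraction of successful fetches, in [0, 1]

def qos_score(p: Page, max_rt: float = 5.0) -> float:
    # Normalize response time so that faster pages score closer to 1.
    rt_score = max(0.0, 1.0 - p.response_time / max_rt)
    return 0.5 * rt_score + 0.5 * p.reliability

def rerank(pages, w_rel=0.7, w_qos=0.3):
    # Weighted-sum aggregation of relevance and delivery quality.
    return sorted(pages, key=lambda p: w_rel * p.relevance + w_qos * qos_score(p), reverse=True)

results = [Page("a.example", 0.90, 4.0, 0.70),
           Page("b.example", 0.90, 0.3, 0.99)]
print([p.url for p in rerank(results)])  # b.example outranks a.example at equal relevance
```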


2012 ◽  
Vol 532-533 ◽  
pp. 1752-1756 ◽  
Author(s):  
Jun Ya Yan ◽  
Xiao Hui Ma ◽  
Wen Juan Zhao

The development of the internet and the exponential growth of network information produce a large number of duplicated pages on the network, reducing the recall and precision of retrieval and affecting retrieval efficiency. The accuracy of duplicate elimination therefore influences the quality of a search engine. On the basis of a structural text description, this paper proposes an improved duplicate-elimination algorithm based on MD5 fingerprints of near-replicas. Experiments show that the method is effective in improving both recall and precision.
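
A rough sketch of MD5-based near-replica detection, assuming that pages are first reduced to structural text blocks and that two pages sharing most block fingerprints count as near-replicas; the block extraction and overlap threshold are illustrative assumptions, not the paper's exact algorithm.

```python
import hashlib

def block_fingerprints(text: str) -> set[str]:
    """Hash each structural text block (here: non-empty paragraph) with MD5."""
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    return {hashlib.md5(b.encode("utf-8")).hexdigest() for b in blocks}

def is_near_replica(page_a: str, page_b: str, threshold: float = 0.8) -> bool:
    fa, fb = block_fingerprints(page_a), block_fingerprints(page_b)
    if not fa or not fb:
        return False
    overlap = len(fa & fb) / min(len(fa), len(fb))  # shared-block ratio
    return overlap >= threshold

a = "Breaking news about search engines.\n\nDetails of the story follow."
b = "Breaking news about search engines.\n\nDetails of the story follow.\n\nAds here."
print(is_near_replica(a, b))  # True: the pages share most content blocks
```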


2013 ◽  
Vol 25 ◽  
pp. 189-203 ◽  
Author(s):  
Dominik Schlosser

This paper attempts to give an overview of the different representations of the pilgrimage to Mecca found in the ‘liminal space’ of the internet. For that purpose, it examines a handful of emblematic examples of how the hajj is presented and discussed in cyberspace. Special attention is paid to the question of how far issues of religious authority are manifest on these websites: whether the content providers of web pages appoint themselves as authorities by scrutinizing established views of the fifth pillar of Islam, whether they upload already printed texts onto their sites in order to reiterate normative notions of the pilgrimage to Mecca, or whether they make use of search engine optimisation techniques, thus heightening the visibility of their online presence and increasing the possibility of becoming authoritative in shaping internet surfers’ perceptions of the hajj.


2012 ◽  
Vol 170-173 ◽  
pp. 3431-3435
Author(s):  
Yan Chyuan Shiau ◽  
Lian Ting Lu ◽  
Tai Yu Chen ◽  
Chih Ying Lee

As citizens' quality of life gradually improves, traveling has become an important recreational activity. The internet quickly provides information related to tour sites; however, web pages generally present only words and pictures, which are not impressive enough to viewers, and spatial concepts, distance calculation, and tools for vacation planning are often not provided by these websites. This study combines 3dSpace, GoogleMap, the ER Model, Windows Mobile, and SuperPad. It gathers tour-site information for HsinChu City, such as local restaurants, famous attractions, and highly rated hotels in the area, and develops a search interface integrated with the Google Map engine. After a category is selected and specific keywords are entered, the related information for a specific location and a 360° satellite image can be shown in the browser. Route calculation for trips between local attractions is also provided. The investigation adds a GPS function to the smart phone, helping users arrive at their destinations correctly in the minimum time.
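
The distance calculation between attractions could, for example, rest on a great-circle computation; the sketch below is a generic haversine distance with hypothetical HsinChu coordinates, not the system's actual implementation.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS-84 coordinates."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Hypothetical coordinates for two HsinChu attractions.
print(round(haversine_km(24.8138, 120.9675, 24.8045, 120.9718), 2), "km")
```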


Author(s):  
Rizwan Ur Rahman ◽  
Rishu Verma ◽  
Himani Bansal ◽  
Deepak Singh Tomar

With the explosive expansion of information on the world wide web, search engines are becoming more significant in the day-to-day lives of humans. Even though a search engine generally returns a huge number of results for a given query, the majority of search engine users simply view the first few web pages in the result list. Consequently, ranking position has become a major concern of internet service providers. This article addresses the vulnerabilities, spamming attacks, and countermeasures in blogging sites. The first part explores the types of spamming and includes a detailed section on vulnerabilities. The next part presents an attack scenario for form spamming together with a defense approach. The aim of this article is thus to provide a review of the vulnerabilities and spamming threats associated with blogging websites, and of effective measures to counter them.
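
As an illustration of one common defense against form spamming (not necessarily the approach presented in the article), the sketch below combines a hidden honeypot field with a minimum fill-time check; the field name and threshold are assumptions.

```python
import time

MIN_FILL_SECONDS = 3.0  # humans rarely submit a comment form faster than this

def looks_like_form_spam(form_data: dict, form_rendered_at: float) -> bool:
    """Reject submissions that fill the hidden honeypot field or arrive too quickly."""
    if form_data.get("website_url_confirm", ""):   # hidden field: humans leave it empty
        return True
    if time.time() - form_rendered_at < MIN_FILL_SECONDS:
        return True
    return False

rendered = time.time() - 0.5  # form rendered half a second ago
bot_post = {"comment": "Buy now!", "website_url_confirm": "http://spam.example"}
print(looks_like_form_spam(bot_post, rendered))  # True
```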


Author(s):  
Ricardo Barros ◽  
Geraldo Xexéo ◽  
Wallace A. Pinheiro ◽  
Jano de Souza

Currently, in the Web environment, users have to deal with an enormous amount of information. In a Web search, they often receive useless, replicated, outdated, or false data, which, at first, they have no means to assess. Web search engines provide good examples of these problems: in the replies from these mechanisms, users usually find links to replicated or conflicting information. Moreover, in these cases, information is spread out among heterogeneous and unrelated data sources that normally adopt different approaches to information quality. This chapter addresses those issues by proposing a Web Metadata-Based Model to evaluate and recommend Web pages based on their information quality, as predicted by their metadata. We adopt a fuzzy theory approach to obtain the values of quality dimensions from metadata values and to evaluate the quality of information, taking advantage of fuzzy logic’s ability to capture humans’ imprecise knowledge and deal with different concepts.
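
A minimal sketch of how a fuzzy approach can turn a metadata value into a quality-dimension score, assuming a "freshness" dimension derived from page age with triangular membership functions; the dimension, breakpoints, and defuzzification weights are illustrative assumptions, not the chapter's actual model.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def freshness_quality(age_days: float) -> float:
    """Fuzzify page age into fresh/aging/stale sets and defuzzify to a [0, 1] score."""
    fresh = triangular(age_days, -1, 0, 90)
    aging = triangular(age_days, 30, 180, 365)
    stale = triangular(age_days, 180, 730, 10_000)
    total = fresh + aging + stale or 1.0
    # Weighted average of the quality levels each fuzzy set stands for.
    return (fresh * 1.0 + aging * 0.5 + stale * 0.1) / total

print(round(freshness_quality(15), 2), round(freshness_quality(400), 2))
```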


2018 ◽  
Vol 6 (3) ◽  
pp. 67-78
Author(s):  
Tian Nie ◽  
Yi Ding ◽  
Chen Zhao ◽  
Youchao Lin ◽  
Takehito Utsuro

The background of this article is the issue of how to give an overview of the knowledge associated with a given query keyword. In particular, the authors focus on the concerns of those who search for web pages with a given query keyword. The Web search information needs for a given query keyword are collected through search engine suggests. Given a query keyword, the authors collect up to around 1,000 suggests, many of which are redundant. They classify redundant search engine suggests based on a topic model. However, one limitation of the topic-model-based classification of search engine suggests is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. In order to overcome this problem of coarse-grained classification, the article further applies a word embedding technique to the web pages used during the training of the topic model, in addition to the text data of the whole Japanese version of Wikipedia. The authors then examine the word-embedding-based similarity between search engine suggests and further classify the suggests within a single topic into finer-grained subtopics based on the similarity of their word embeddings. Evaluation results show that the proposed approach performs well in the task of subtopic classification of search engine suggests.
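
A rough sketch of the finer-grained subtopic step under simplifying assumptions: each suggest is represented by a word-embedding vector, and suggests within one topic are grouped greedily by cosine similarity to cluster centroids; the toy vectors, threshold, and grouping rule are illustrative, not the authors' exact procedure.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def subtopic_clusters(suggests, vectors, threshold=0.7):
    """Greedily group suggests of one topic: join the first cluster whose centroid is similar enough."""
    clusters = []  # list of (member-index list, centroid vector)
    for i, vec in enumerate(vectors):
        for members, centroid in clusters:
            if cosine(vec, centroid) >= threshold:
                members.append(i)
                centroid += (vec - centroid) / len(members)  # running-mean update
                break
        else:
            clusters.append(([i], vec.copy()))
    return [[suggests[i] for i in members] for members, _ in clusters]

# Toy example with hypothetical 3-d "embeddings" for suggests of one topic.
suggests = ["price", "cost", "side effects"]
vectors = [np.array([0.9, 0.1, 0.0]), np.array([0.85, 0.2, 0.0]), np.array([0.0, 0.1, 0.95])]
print(subtopic_clusters(suggests, vectors))  # [['price', 'cost'], ['side effects']]
```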


2014 ◽  
Vol 687-691 ◽  
pp. 1908-1911
Author(s):  
Wei Zhong Huang

The universal search engine, which is now widely used, has significantly improved the efficiency of retrieving information. According to the 26th Internet survey by CNNIC (China Internet Network Information Center), search, with a 76.30% share, holds an absolute advantage as a major way for users to obtain information from the Internet. In almost all surveys of Internet use around the world, the search engine is second only to e-mail service. But with the growth of such a wide range of information, universal search engines cannot meet people's needs in either retrieval precision or retrieval efficiency when retrieving information on a particular subject or topic. That is because, as long as users enter the same keywords, the universal search engine returns the same results; it does not take into account the differences in interests and needs that often exist between users. For example, dentists and ceramics enthusiasts would hold different concerns about the term "ceramic". In order to retrieve information on a particular subject or theme more rapidly, accurately, and efficiently, it is essential to develop information retrieval systems for specific areas, that is, domain-specific search engines.


2017 ◽  
Author(s):  
Xi Zhu ◽  
Xiangmiao Qiu ◽  
Dingwang Wu ◽  
Shidong Chen ◽  
Jiwen Xiong ◽  
...  

BACKGROUND: Electronic health practices such as apps and software all rely on web search engines because of their convenience for obtaining information, so the success of electronic health is linked to the success of web search engines in the field of health. Yet the reliability of the information in search engine results remains to be evaluated, and a detailed analysis can reveal shortcomings and provide inspiration. OBJECTIVE: To assess the reliability of information related to women with epilepsy in the results of the main search engines in China. METHODS: Six physicians conducted the searches every week. The search keywords were one of the anti-epileptic drugs (AEDs) valproic acid, oxcarbazepine, levetiracetam, or lamotrigine, plus "huaiyun" or "renshen", both of which mean pregnancy in Chinese. The searches were conducted on different devices (computer/cellphone) and different engines (Baidu/Sogou/360). The top ten results of every search result page were included. Two physicians classified every result into one of nine categories according to its content and also evaluated its reliability. RESULTS: A total of 16,411 search results were included. 85.1% of the web pages carried advertisements, and 55% were categorized as questions and answers according to their contents. Only 9% of the search results were reliable, 50.7% were partly reliable, and 40.3% were unreliable. The higher a search result was ranked, the more advertisements appeared and the larger the proportion of unreliable results became. All content from hospital websites was unreliable, while all content from academic publishing was reliable. CONCLUSIONS: Several principles must be emphasized to further the use of web search engines in the field of healthcare. First, identifying registered physicians and developing an efficient system to guide patients to physicians would guarantee the quality of the information provided. Second, the relevant authorities should restrict excessive advertisement sales in the healthcare area through specific regulations to avoid a negative impact on patients. Third, information from hospital websites should be carefully judged before being embraced wholeheartedly.

