WEBCONTENT VISUALIZER: A VISUALIZATION SYSTEM FOR SEARCH ENGINES IN SEMANTIC WEB

2011 ◽  
Vol 10 (05) ◽  
pp. 913-931 ◽  
Author(s):  
XIANYONG FANG ◽  
CHRISTIAN JACQUEMIN ◽  
FRÉDÉRIC VERNIER

Since the results from Semantic Web search engines are highly structured XML documents, they cannot be efficiently visualized with traditional browsers. The Semantic Web therefore calls for a new generation of search query visualizers that can rely on document metadata. This paper introduces such a visualization system, called the WebContent Visualizer, used to display and browse search engine results. The visualization is organized into three levels: (1) carousels contain documents with the same ranking, (2) carousels are piled into stacks, one for each date, and (3) these stacks are arranged along a meta-carousel to display the results for several dates. Carousel stacks are piles of local carousels with increasing radii that visualize the ranks of classes. For document comparison, colored links connect documents between neighboring classes on the basis of shared entities. Based on these techniques, the interface consists of three collaborative components: an inspector window, a visualization panel, and a detailed-dialog component. With this architecture, the system is intended to offer an efficient way to explore the results returned by Semantic Web search engines.
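The three-level grouping described in the abstract (documents of equal rank share a carousel, a date's carousels form a stack, and stacks sit on a meta-carousel across dates) can be sketched as a small data-structure exercise. This is an illustrative reconstruction, not the authors' code; the function name and tuple layout are assumptions.

```python
from collections import defaultdict

def build_meta_carousel(results):
    """Group (date, rank, doc_id) tuples into the carousel hierarchy:
    meta-carousel -> per-date stack -> per-rank carousel."""
    meta = defaultdict(lambda: defaultdict(list))
    for date, rank, doc in results:
        meta[date][rank].append(doc)  # carousel = docs sharing a rank
    # each date's stack lists its carousels by increasing rank (radius)
    return {date: [docs for _, docs in sorted(stack.items())]
            for date, stack in sorted(meta.items())}

hierarchy = build_meta_carousel([
    ("2011-01", 1, "docA"), ("2011-01", 1, "docB"),
    ("2011-01", 2, "docC"), ("2011-02", 1, "docD"),
])
```

Here `hierarchy["2011-01"]` holds two carousels (rank 1 with two documents, rank 2 with one), and the outer dict plays the role of the meta-carousel over dates.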

Author(s):  
Mathieu d’Aquin ◽  
Li Ding ◽  
Enrico Motta

2019 ◽  
Vol 49 (5) ◽  
pp. 707-731 ◽  
Author(s):  
Malte Ziewitz

When measures come to matter, those measured find themselves in a precarious situation. On the one hand, they have a strong incentive to respond to measurement so as to score a favourable rating. On the other hand, too much of an adjustment runs the risk of being flagged and penalized by system operators as an attempt to ‘game the system’. Measures, the story goes, are most useful when they depict those measured as they usually are and not how they intend to be. In this article, I explore the practices and politics of optimization in the case of web search engines. Drawing on materials from ethnographic fieldwork with search engine optimization (SEO) consultants in the United Kingdom, I show how maximizing a website’s visibility in search results involves navigating the shifting boundaries between ‘good’ and ‘bad’ optimization. Specifically, I am interested in the ethical work performed as SEO consultants artfully arrange themselves to cope with moral ambiguities provoked and delegated by the operators of the search engine. Building on studies of ethics as a practical accomplishment, I suggest that the ethicality of optimization has itself become a site of governance and contestation. Studying such practices of ‘being ethical’ not only offers opportunities for rethinking popular tropes like ‘gaming the system’, but also draws attention to often-overlooked struggles for authority at the margins of contemporary ranking schemes.


2015 ◽  
Vol 39 (2) ◽  
pp. 197-213 ◽  
Author(s):  
Ahmet Uyar ◽  
Farouk Musa Aliyu

Purpose – The purpose of this paper is to better understand three main aspects of the semantic web search engines Google Knowledge Graph and Bing Satori. The authors investigated the coverage of entity types, the extent of support for list search services and the capabilities of the natural language query interfaces.
Design/methodology/approach – The authors manually submitted selected queries to the two semantic web search engines and evaluated the returned results. To test the coverage of entity types, the authors selected entity types from the Freebase database. To test the capabilities of the natural language query interfaces, the authors used a manually developed query data set about US geography.
Findings – The results indicate that both semantic search engines cover only the most common entity types. In addition, the list search service is provided for only a small percentage of entity types. Moreover, both search engines support only queries of very limited complexity and with a limited set of recognised terms.
Research limitations/implications – Both companies are continually working to improve their semantic web search engines, so the findings reflect their capabilities at the time this research was conducted.
Practical implications – The results suggest that in the near future both semantic search engines can be expected to expand their entity databases and improve their natural language interfaces.
Originality/value – As far as the authors know, this is the first study to evaluate any aspect of these newly developing semantic web search engines. It shows their current capabilities and limitations, and it provides directions to researchers by pointing out the main problems facing semantic web search engines.


2017 ◽  
Author(s):  
Xi Zhu ◽  
Xiangmiao Qiu ◽  
Dingwang Wu ◽  
Shidong Chen ◽  
Jiwen Xiong ◽  
...  

BACKGROUND Electronic health practices such as apps and software depend on web search engines because of their convenience for obtaining information, so the success of electronic health is tied to the success of web search engines in the field of health. Yet the reliability of the information in search engine results remains to be evaluated, and a detailed analysis can expose shortcomings and suggest improvements.
OBJECTIVE To assess the reliability of information related to epilepsy in women returned by the main search engines in China.
METHODS Six physicians conducted the searches every week. Each search keyword paired one antiepileptic drug (valproic acid, oxcarbazepine, levetiracetam, or lamotrigine) with "huaiyun" or "renshen", both of which mean pregnancy in Chinese. The searches were conducted on different devices (computer and cellphone) and different engines (Baidu, Sogou, and 360). The top ten results of every search result page were included. Two physicians classified each result into one of nine categories according to its content and also evaluated its reliability.
RESULTS A total of 16,411 search results were included. 85.1% of the web pages carried advertisements, and 55% were categorized as question-and-answer pages according to their content. Only 9% of the search results were reliable, 50.7% were partly reliable, and 40.3% were unreliable. As the rank of the search results rose, both the share of advertisements and the proportion of unreliable pages increased. All content from hospital websites was unreliable, while all content from academic publishers was reliable.
CONCLUSIONS Several first principles must be emphasized to further the use of web search engines in the field of healthcare. First, verifying the identity of registered physicians and developing an efficient system that guides patients to physicians would guarantee the quality of the information provided. Second, the responsible authorities should restrict excessive advertising sales in the healthcare area through specific regulations, to avoid a negative impact on patients. Third, information from hospital websites should be judged carefully before being embraced wholeheartedly.
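The study's tallying step, where classified results are reduced to reliability percentages as in the RESULTS section, is mechanically simple and can be sketched as follows. This is an illustrative reconstruction with invented sample labels, not the authors' data or code.

```python
from collections import Counter

def reliability_shares(labels):
    """Return the percentage of results carrying each reliability label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# invented toy sample: 1 reliable, 5 partly reliable, 4 unreliable pages
shares = reliability_shares(
    ["reliable"] + ["partly reliable"] * 5 + ["unreliable"] * 4
)
```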


2013 ◽  
Vol 462-463 ◽  
pp. 1106-1109 ◽  
Author(s):  
Hong Yuan Ma

A Web search engine caches results that are frequently queried by users; this is an effective approach to improving the efficiency of Web search engines. In this paper, we share valuable experience from our design and implementation of a Web search engine cache system. We present three design principles: logical layer processing, event-based communication architecture and avoidance of frequent data copies. We also introduce the architecture adopted in practice, including a connection processor, an application processor, a query results caching processor, an inverted list caching processor and a list intersection caching processor. Experiments are conducted on our cache system using a real Web search engine query log.
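One layer named in the architecture above, the query results caching processor, can be sketched as a plain least-recently-used map from query strings to result lists. This is a minimal illustration under assumed names and a tiny capacity, not the paper's implementation, which also involves event-based communication and copy avoidance.

```python
from collections import OrderedDict

class QueryResultCache:
    """Toy LRU cache for query -> result-list pairs."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, query):
        if query not in self._store:
            return None                      # cache miss
        self._store.move_to_end(query)       # mark as recently used
        return self._store[query]

    def put(self, query, results):
        self._store[query] = results
        self._store.move_to_end(query)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
```

A real system would sit this behind the connection and application processors and add analogous caches for inverted lists and list intersections.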


Author(s):  
Konstantinos Kotis

Current keyword-based Web search engines (e.g. Google) give thousands of people access to billions of indexed Web pages. Although the amount of irrelevant results returned due to the linguistic phenomena of polysemy (one word with several meanings) and synonymy (several words with one meaning) tends to be reduced (e.g. by narrowing the search using human-directed topic hierarchies, as in Yahoo!), the uncontrolled publication of Web pages still requires an alternative to the way Web information is authored and retrieved today. This alternative can be the technologies of the new era of the Semantic Web. The Semantic Web, which currently uses the OWL language to describe content, is at once an extension of and an alternative to the traditional Web. A Semantic Web Document (SWD) describes its content with semantics, i.e. domain-specific tags related to a specific conceptualization of a domain, adding meaning to the document's (annotated) content. Ontologies play a key role in providing such descriptions, since they provide a standard way to form explicit and formal conceptualizations of domains. Since traditional Web search engines cannot easily take advantage of documents' semantics, e.g. they cannot find documents that describe similar concepts rather than just similar words, semantic search engines (e.g. SWOOGLE, OntoSearch) and several other semantic search technologies have been proposed, e.g. Semantic Portals (Zhang et al, 2005), Semantic Wikis (Völkel et al, 2006), multi-agent P2P ontology-based semantic routing (of queries) systems (Tamma et al, 2004), and ontology mapping-based query/answering systems (Lopez et al, 2006; Kotis & Vouros, 2006; Bouquet et al, 2004). Within these technologies, queries can be placed as formally described (or annotated) content, and a semantic matching algorithm can provide the exact matching with SWDs whose semantics match the semantics of the query.
Although Semantic Web technology contributes much to the retrieval of Web information, some open issues remain to be tackled. First of all, unstructured (traditional Web) documents must be semantically annotated with domain-specific tags (ontology-based annotation) in order to be utilized by semantic search technologies. This is not an easy task, and it requires specific domain ontologies to be developed that will provide such semantics (tags); a fully automatic annotation process is still an open issue. On the other hand, SWDs can be semantically retrieved only through formal queries. Constructing a formal query is also a difficult and time-consuming task, since a formal language must be learned. Techniques for automating the transformation of a natural language query into a formal (structured) one are currently being investigated. Nevertheless, more sophisticated technologies, such as the mapping of several schemes to a formal query constructed in the form of an ontology, must also be investigated. This technology is proposed for retrieving heterogeneous and distributed SWDs, since their structure cannot be known a priori (in open environments like the Semantic Web). This article aims to provide insight into current technologies used in Semantic Web search, focusing on two issues: a) the automatic construction of a formal query (query ontology) and b) the querying of a collection of knowledge sources whose structure is not known a priori (distributed and semantically heterogeneous documents).
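The contrast drawn above, that a keyword engine misses documents describing similar concepts with different words while a semantic engine matching on ontology tags does not, can be shown with a toy example. All data, names, and the flat "concept label" stand-in for real ontology annotations are invented for illustration.

```python
# each document carries raw text plus ontology-derived concept tags
docs = {
    "d1": {"text": "buy a car online", "concepts": {"Automobile", "Commerce"}},
    "d2": {"text": "used autos for sale", "concepts": {"Automobile", "Commerce"}},
}

def keyword_match(query_words, doc):
    """Traditional matching: query words must appear literally in the text."""
    return any(w in doc["text"].split() for w in query_words)

def semantic_match(query_concepts, doc):
    """Semantic matching: query concepts must overlap the document's tags."""
    return bool(query_concepts & doc["concepts"])

kw_hits = [d for d, doc in docs.items() if keyword_match({"car"}, doc)]
sem_hits = [d for d, doc in docs.items() if semantic_match({"Automobile"}, doc)]
```

The keyword query "car" misses d2 ("autos" is a synonym, not a textual match), while the concept query Automobile retrieves both, which is exactly the synonymy problem the abstract describes.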


Author(s):  
Rahul Pradhan ◽  
Dilip Kumar Sharma

Users issuing a query on a search engine expect the results to be relevant to the topic of the query rather than merely a textual match with its terms. Studies conducted by several researchers show that users want the search engine to understand the implicit intent of a query rather than look for textual matches in the hypertext structure of a document or web page. In this paper the authors address queries that carry temporal intent and help web search engines classify them into categories; these classes help a search engine understand and serve the need behind a query. The authors consider temporal expressions (e.g. 1943) in documents and categorize queries on the basis of their temporal boundaries. Their experiment classifies the queries and suggests a further course of action for search engines. The results show that classifying queries into these classes helps users reach the information they seek faster.
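A first step in the direction the abstract describes is detecting an explicit temporal expression in a query and placing it relative to some boundary. The sketch below is a hedged illustration of that idea only: the regex, the boundary year, and the category names are assumptions, not the authors' taxonomy or method.

```python
import re

def classify_temporal(query, boundary=2000):
    """Label a query by the four-digit year it mentions, if any."""
    match = re.search(r"\b(1[0-9]{3}|20[0-9]{2})\b", query)
    if not match:
        return "atemporal"         # no explicit temporal expression
    year = int(match.group())
    return "past" if year < boundary else "recent"

labels = [classify_temporal(q) for q in
          ["battle of 1943", "olympics 2012", "apple pie recipe"]]
```

A production classifier would go further, resolving relative expressions ("last summer") and implicit intent, but even this crude split lets an engine route a query toward archival or fresh results.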


Author(s):  
Shanfeng Zhu ◽  
Xiaotie Deng ◽  
Qizhi Fang ◽  
Weimin Zhang

Web search engines are among the most popular services helping users find useful information on the Web. Although many studies have been carried out to estimate the size and overlap of general web search engines, these may not benefit ordinary web users, who care more about the overlap of the top N (N = 10, 20 or 50) search results for concrete queries than about the overlap of the total index databases. In this study, we present experimental results on the overlap of the top N (N = 10, 20 or 50) search results from AlltheWeb, Google, AltaVista and WiseNut for the 58 most popular queries, as well as on the distance between the overlapped results. These 58 queries were chosen from the WordTracker service, which records the most popular queries submitted to famous metasearch engines such as MetaCrawler and Dogpile. We divided the 58 queries into three categories for further investigation. Through this in-depth study, we observe a number of interesting results: the overlap of the top N results retrieved by different search engines is very small; the search results of queries in different categories behave in dramatically different ways; Google, on average, has the highest overlap among the four search engines; and each search engine tends to adopt its own ranking algorithm independently.
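The core measurement in the study above, comparing the top-N result lists two engines return for the same query, reduces to a set intersection over the leading N entries. The sketch below illustrates that computation with invented URL lists; it is not the authors' code, and the normalization by N is one reasonable choice among several.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Fraction of engine A's top-n results that also appear in B's top-n."""
    a, b = set(results_a[:n]), set(results_b[:n])
    return len(a & b) / n

# invented result lists: the two engines agree on url5..url9 only
engine_a = [f"url{i}" for i in range(10)]      # url0 .. url9
engine_b = [f"url{i}" for i in range(5, 15)]   # url5 .. url14
overlap = top_n_overlap(engine_a, engine_b, n=10)
```

Averaging this quantity over a query set such as the 58 WordTracker queries gives the kind of per-engine-pair overlap figures the abstract reports; measuring the rank "distance" of shared results would additionally compare their positions in the two lists.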

