Provenance-Aware Semantic Search Engines Based On Data Integration Systems

Author(s):  
Domenico Beneventano ◽  
Sonia Bergamaschi

Search engines are common tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic Search Engines try to augment and improve traditional Web Search Engines by using not just words, but concepts and logical relationships. Given the openness of the Web and the different sources involved, a Web Search Engine must evaluate the quality and trustworthiness of the data; a common approach for such assessments is the analysis of the provenance of information. In this paper, a relevant class of Provenance-aware Semantic Search Engines, based on a peer-to-peer, mediator-based data integration architecture, is described. The architectural and functional features are an enhancement, with provenance, of the SEWASIE semantic search engine developed within the IST EU SEWASIE project, coordinated by the authors. The methodology to create a two-level ontology and the query processing engine developed within the SEWASIE project, together with the provenance extension, are fully described.

2011 ◽  
pp. 317-342 ◽  
Author(s):  
D. Beneventano

As the use of the World Wide Web has become increasingly widespread, the business of commercial search engines has become a vital and lucrative part of the Web. Search engines are commonplace tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic search engines try to augment and improve traditional Web Search Engines by using not just words, but concepts and logical relationships. In this chapter, a relevant class of semantic search engines, based on a peer-to-peer, mediator-based data integration architecture, is described. The architectural and functional features are presented with respect to two projects, SEWASIE and WISDOM, involving the authors. The methodology to create a two-level ontology and query processing in the SEWASIE project are fully described.


2016 ◽  
Vol 12 (2) ◽  
pp. 242-262 ◽  
Author(s):  
Awny Sayed ◽  
Amal Al Muqrishi

Purpose: The purpose of this paper is to present an efficient and scalable Arabic semantic search engine based on a domain-specific ontological graph for the Colleges of Applied Science, Sultanate of Oman (CASOnto). It also supports factorial question answering and offers two types of search, keyword-based and semantics-based, in both Arabic and English. The engine is built on a variety of technologies, such as resource description framework data and an ontological graph. Furthermore, two experiments are conducted: the first compares entity search with classical search within the system itself; the second compares CASOnto with well-known semantic search engines such as Kngine, Wolfram Alpha and Google to measure their performance and efficiency.
Design/methodology/approach: The design and implementation of the system comprise the following phases: designing inference, storing, indexing, searching, query processing and the user-friendly interface. The system is designed for the specific domain of the Ibri CAS (College of Applied Science) to highlight the academic and nonacademic departments. Furthermore, the ontologically inferred data are stored in the tuple data base (TDB) and MySQL to handle keyword-based search as well as entity-based search. The indexing and searching processes are built on Lucene for the keyword search, while TDB is used for the entity search. Query processing is a very important component of search engines that helps to improve the user’s search results and make the system efficient and scalable. CASOnto handles Arabic-specific issues such as spelling correction, query completion, stop-word removal and diacritics removal. It also supports the analysis of factorial question answering.
Findings: In this paper, an efficient and scalable Arabic semantic search engine is proposed. The results show that the semantic search built on SPARQL is better than the classical search for both simple and complex queries; the reported accuracy of the semantic search is 100 per cent for both types of queries. In the comparison of CASOnto with Wolfram Alpha, Kngine and Google, CASOnto gives better results. Consequently, the proposed engine appears to retrieve better and more efficient results than the other engines. It is built on a domain-specific ontology, is highly scalable and handles complex queries well by understanding the context behind the query.
Research limitations/implications: The proposed engine is built on a specific domain (CAS Ibri, Oman); future work will address nonfactorial question answering and expand the domain of CASOnto to integrate more domains.
Originality/value: The main contribution of this paper is an efficient and scalable Arabic semantic search engine. The widespread use of search engines creates a new challenge: keeping up with the evolution of the semantic Web. Catering to the needs of users has become a matter of paramount importance in the light of artificial intelligence and technological development, so that accurate information can be accessed efficiently in the least possible time. However, research on this challenge is still in its infancy owing to the lack of search engines that support the Arabic language, which can be traced back to the complexity of Arabic morphological and grammatical rules.
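Two of the Arabic query-processing steps named in the abstract above, diacritics removal and stop-word removal, can be illustrated with a minimal Python sketch. This is not the CASOnto implementation: the function name `normalize_query`, the character set treated as diacritics and the tiny stop-word list are all assumptions made for illustration.

```python
import re

# Assumption: treat the Arabic tashkeel marks U+064B..U+0652, the superscript
# alef (U+0670) and the tatweel elongation character (U+0640) as removable.
DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")

# Hypothetical, deliberately tiny stop-word list (not taken from the paper).
STOP_WORDS = {
    "\u0641\u064A",        # fi  ("in")
    "\u0645\u0646",        # min ("from")
    "\u0639\u0644\u0649",  # ala ("on")
    "\u0625\u0644\u0649",  # ila ("to")
    "\u0639\u0646",        # an  ("about")
}

def normalize_query(query: str) -> list[str]:
    """Strip diacritics first, then drop stop words from the token list."""
    stripped = DIACRITICS.sub("", query)
    return [tok for tok in stripped.split() if tok not in STOP_WORDS]
```

Stripping diacritics before filtering means a vocalized stop word is still recognized, because it is reduced to its bare form before the lookup.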


2020 ◽  
Vol 9 (1) ◽  
pp. 1496-1501

Semantic search is a search technique that improves search precision by understanding the intent of the search and the contextual meaning of terms as they appear in the searchable data space, whether on the Web or not, to generate more relevant results. We highlight here semantic search and the Semantic Web, and discuss the different kinds of semantic search engines, the differences between keyword-based search and semantic search, and the benefits of semantic search. We also provide a brief overview of the history of semantic search and its scope in the world.


2004 ◽  
Vol 03 (01) ◽  
pp. 107-117 ◽  
Author(s):  
D. Manjula ◽  
T. V. Geetha

Existing search engines index documents only by words; as a result, when a query can be interpreted in different senses, irrelevant results are returned among the relevant ones. A semantic search engine is proposed here that indexes documents by both words and senses and thereby tries to avoid the irrelevant results. The "crawler" traverses the World Wide Web, and the normalized documents are sent to the disambiguator module, which identifies the top few sense(s) of ambiguous words by employing a weighted disambiguation algorithm. The documents are then indexed by the words and the senses. The query is disambiguated in a similar manner, and retrieval is performed by matching both the sense and the word. The performance of the semantic search engine is compared against traditional word-based indexing and also against commercial search engines such as Google, Yahoo, Hotbot and Lycos. The results show an impressive precision for the semantic search engine compared to the other engines, particularly for ambiguous queries.
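The idea of indexing by both word and sense, then matching on both at query time, can be sketched in Python as follows. The abstract does not specify the weighted disambiguation algorithm, so this sketch substitutes a much simpler cue-word overlap rule; the sense inventory `SENSES`, the class `SenseIndex` and all other names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy sense inventory: word -> {sense label: cue words}.
SENSES = {
    "bank": {
        "bank/finance": {"money", "loan", "account"},
        "bank/river": {"river", "water", "shore"},
    },
}

def disambiguate(word, context):
    """Stand-in disambiguator: pick the sense whose cue words overlap the
    context most; return None if the word is unambiguous or nothing matches."""
    senses = SENSES.get(word)
    if not senses:
        return None
    best = max(senses, key=lambda s: len(senses[s] & context))
    return best if senses[best] & context else None

class SenseIndex:
    def __init__(self):
        self.by_word = defaultdict(set)   # word  -> document ids
        self.by_sense = defaultdict(set)  # sense -> document ids

    def add(self, doc_id, text):
        tokens = text.lower().split()
        context = set(tokens)
        for tok in tokens:
            self.by_word[tok].add(doc_id)
            sense = disambiguate(tok, context)
            if sense:
                self.by_sense[sense].add(doc_id)

    def search(self, query):
        tokens = query.lower().split()
        context = set(tokens)
        result = None
        for tok in tokens:
            hits = set(self.by_word.get(tok, set()))
            sense = disambiguate(tok, context)
            if sense:
                # Require both the word and its sense to match.
                hits &= self.by_sense.get(sense, set())
            result = hits if result is None else result & hits
        return sorted(result or [])

index = SenseIndex()
index.add(1, "the bank approved the loan")
index.add(2, "we fished from the river bank")
```

With this toy data, a query that carries enough context to fix the sense of "bank" retrieves only the document with the matching sense, while the bare query "bank" falls back to plain word matching.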


2010 ◽  
Vol 04 (04) ◽  
pp. 535-558 ◽  
Author(s):  
SHU WANG ◽  
PHILLIP C.-Y. SHEU

Web search could be greatly facilitated if we could better relate the user's intention to the meaning of the web content. In this paper, we first survey the various existing methods, focusing on the dilemma that obtaining highly accurate results usually sacrifices response time. We then propose a novel information retrieval framework that combines keyword-based search with search based on syntactical information. In particular, we design a sequential structure called LSC (Language Sequential Component) to encode syntactical information. Given a sentence, LSC provides a bridge between its syntactical representation and its semantic meaning. We also propose a learning algorithm to obtain the LSCs from a training set, a classification algorithm to find the relevant LSCs for a user query in order to interpret the intentions of the user, and a search framework (called Semantic Search Engine) to incorporate syntactical information into a keyword-based search system. Our experiments show that the Semantic Search Engine significantly outperforms the keyword-based approach.


2017 ◽  
pp. 030-050
Author(s):  
J.V. Rogushina ◽  

Problems associated with the improvement of information retrieval for open environments are considered, and the need for its semantization is substantiated. The current state and development prospects of semantic search engines that focus on processing Web information resources are analysed, and the criteria for classifying such systems are reviewed. In this analysis, significant attention is paid to the use in semantic search of ontologies that contain knowledge about the subject area and the search users. The sources of ontological knowledge and the methods of processing them to improve search procedures are considered. Examples of semantic search systems that use structured query languages (e.g., SPARQL), lists of keywords and natural-language queries are given. Classification criteria for semantic search engines such as architecture, coupling, transparency, user context, query modification, ontology structure, etc. are considered. Different ways of supporting semantic and ontology-based modification of user queries that improve the completeness and accuracy of the search are analysed. Based on an analysis of the properties of existing semantic search engines in terms of these criteria, areas for further improvement of these systems are identified: the development of metasearch systems, semantic modification of user queries, the determination of a user-acceptable transparency level for search procedures, flexibility of domain knowledge management tools, and increased performance and scalability. In addition, the development of semantic Web search tools requires the use of an external knowledge base containing knowledge about the domain of the user's information needs, and requires giving users the ability to select independently the knowledge used in the search process. It is necessary to take into account the history of user interaction with the retrieval system and the search context in order to personalize the query results and order them in accordance with the user's information needs. All these aspects were taken into account in the design and implementation of the semantic search engine "MAIPS", which is based on an ontological model of the interaction of users and resources on the Web.


2021 ◽  
pp. 089443932110068
Author(s):  
Aleksandra Urman ◽  
Mykola Makhortykh ◽  
Roberto Ulloa

We examine how six search engines filter and rank information in relation to queries on the U.S. 2020 presidential primary elections under default (that is, nonpersonalized) conditions. For that, we utilize an algorithmic auditing methodology that uses virtual agents to conduct large-scale analysis of algorithmic information curation in a controlled environment. Specifically, we look at the text search results for the queries “us elections,” “donald trump,” “joe biden,” and “bernie sanders” on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex during the 2020 primaries. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents using the same search engine. This highlights that whether users see certain information is decided by chance due to the inherent randomization of search results. We also find that some search engines prioritize different categories of information sources with respect to specific candidates. These observations demonstrate that algorithmic curation of political information can create information inequalities between search engine users even under nonpersonalized conditions. Such inequalities are particularly troubling considering that search results are highly trusted by the public and can shift the opinions of undecided voters, as demonstrated by previous research.
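One simple way to quantify discrepancies between the result lists that different agents receive for the same query is set overlap. The abstract does not say which measures the authors used, so the Jaccard index below is an assumed, illustrative choice, and the function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two result lists, treated as sets of URLs."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty lists are considered identical
    return len(sa & sb) / len(sa | sb)

def mean_pairwise_overlap(result_lists):
    """Average Jaccard index over all pairs of agents' result lists."""
    pairs = list(combinations(result_lists, 2))
    if not pairs:
        return 1.0  # fewer than two agents: nothing to disagree on
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value well below 1.0 for agents that issue the same query on the same engine would indicate the kind of within-engine discrepancy the abstract describes. Note that this measure ignores ranking, so a rank-aware metric would be needed to capture differences in ordering.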


Author(s):  
Li Sheng ◽  
Zheng Kaihong ◽  
Yang Jinfeng ◽  
Wang Xin ◽  
Zeng Lukun ◽  
...  
