Clustering of User Query in Search Engine on Indonesian E-Commerce by used AD-OPTICS Algorithms

The information available on the internet is huge, diverse, and dynamic. Current search engines act as intelligent assistants to internet users: for a query, they provide a list of the best-matching or most relevant web pages. However, the information relevant to a query is often spread across several of the returned pages, which degrades the quality of the search results. Search engines are thus drowning in information but starving for knowledge. Here, we present a query-focused extractive summarization of search engine results. We propose a two-level summarization process: identification of relevant theme clusters, and selection of the top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved by applying the MDL principle, and sentence clustering and ranking are done using SNMF. The experiments conducted demonstrate the effectiveness of the system in semantic text understanding, document clustering, and summarization.

Spatial Search Engines

Encyclopedia of Information Science and Technology, Second Edition ◽

10.4018/978-1-60566-026-4.ch566 ◽

2011 ◽

pp. 3554-3558

Author(s):

Cláudio Elízio Calazans Campelo ◽

Cláudio de Souza Baptista ◽

Ricardo Madeira Fernandes

Keyword(s):

Search Engine ◽

Search Engines ◽

Spatial Distance ◽

Web Page ◽

Relevance Ranking ◽

Spatial Search ◽

User Query ◽

Information Retrieval Systems ◽

The One ◽

The Web

It is well known that documents available on the Web are extremely heterogeneous in several respects, such as the languages they use and the formats that represent their content, as well as external factors such as source reputation, refresh frequency, and so forth (Page & Brin, 1998). Altogether, these factors increase the complexity of Web information retrieval systems. In broad terms, the traditional search engines available on the Web today retrieve documents that contain keywords supplied by users. Nevertheless, among the variety of search possibilities, it is evident that users need a process involving more sophisticated analysis; for example, temporal or spatial contextualization might be considered. In these keyword-based search engines, for instance, a Web page containing the phrase “…due to the company arrival in London, a thousand java programming jobs will be open…” would not be found if the submitted search was “jobs programming England,” unless the word “England” appeared elsewhere on the page. The explanation is that the term “London” is treated merely as another word, without regard to its geographical position. In a spatial search engine, the expected behavior would be to return the page described in the previous example, since the system would have information indicating that the term “London” refers to a city located in a country referred to by the term “England.” In a traditional search engine, this result would be feasible only if the user repeatedly submitted searches for all possible sub-regions of England (e.g., cities). As the example suggests, for many user searches the most interesting results are those related to certain geographical regions.
A variety of feature extraction and automatic document classification techniques have been proposed; however, acquiring the geographical features of a Web page involves some peculiar complexities, such as ambiguity (e.g., many places with the same name, various names for a single place, things with place names, etc.). Moreover, a Web page can refer to a place that contains or is contained by the one given in the user query, which requires knowing the different region topologies used by the system. Many features related to geographical context can be added to the process of building the relevance ranking of returned documents. For example, a document can be more relevant than another if its content refers to a place closer to the user location. Nonetheless, in spatial search engines there are more complex issues to consider, because the spatial dimension bears on how the ranking is built. Jones, Alani, and Tudhope (2001) propose combining the Euclidean distance between place centroids with hierarchical distances to generate a hybrid spatial distance that may be used in the relevance ranking of returned documents. Further important issues are the indexing mechanisms and query processing. In general, these solutions try to combine well-known textual indexing techniques (e.g., inverted files) with spatial indexing mechanisms. As for the user interface, spatial search engines are more complex, because users need to choose regions of interest, as well as possible spatial relationships, in addition to keywords. To visualize the results, it is helpful to use digital maps alongside textual information.
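The idea of a hybrid spatial distance can be illustrated with a minimal Python sketch. It is not the formula from Jones, Alani, and Tudhope (2001): the linear weighted sum, the weights `w_euclid` and `w_hier`, the toy place hierarchy, and the planar centroid coordinates are all assumptions made for illustration only.

```python
import math

# Hypothetical place hierarchy, child -> parent (illustrative data only).
PARENT = {
    "London": "England",
    "Manchester": "England",
    "England": "United Kingdom",
    "Edinburgh": "Scotland",
    "Scotland": "United Kingdom",
}

# Hypothetical place centroids as (x, y) in arbitrary planar units.
CENTROID = {
    "London": (0.0, 0.0),
    "Manchester": (3.0, 4.0),
    "Edinburgh": (6.0, 8.0),
}


def ancestors(place):
    """Return the path from a place up to the root, starting with the place."""
    path = [place]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def hierarchical_distance(a, b):
    """Count the hierarchy edges between a and b via their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(path_a):
        if node in path_b:
            return steps_a + path_b.index(node)
    raise ValueError("places share no common ancestor")


def euclidean_distance(a, b):
    """Straight-line distance between the centroids of two places."""
    (xa, ya), (xb, yb) = CENTROID[a], CENTROID[b]
    return math.hypot(xa - xb, ya - yb)


def hybrid_distance(a, b, w_euclid=0.5, w_hier=0.5):
    """Assumed combination: a weighted sum of the two distance components."""
    return w_euclid * euclidean_distance(a, b) + w_hier * hierarchical_distance(a, b)
```

Under these assumptions, a document about Manchester would rank closer to a query about London than a document about Edinburgh, both because its centroid is nearer and because the two cities share the parent region England. In practice the two components would need to be normalized to comparable scales before being combined.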

Research of Intelligent Search Engine Based on Multi-Ontology

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.241-244.1659 ◽

2012 ◽

Vol 241-244 ◽

pp. 1659-1663

Author(s):

Shu Dong Zhang ◽

Can Zhang ◽

Jing Wang

Keyword(s):

Semantic Web ◽

Search Engine ◽

Domain Knowledge ◽

Recall Rate ◽

Domain Ontology ◽

Semantic Retrieval ◽

Engine Model ◽

Cross Domain ◽

User Query ◽

Primary Means

With the development of the Semantic Web, ontology has become the primary means of expressing knowledge in many fields. Introducing Semantic Web technology into the field of search engines is a valuable research topic. To meet complex semantic retrieval demands, the paper proposes a search engine model based on multi-domain ontology. The model uses ontology mapping to rewrite the user query so that it can be run against multiple ontologies, providing richer and more accurate semantic information for the retrieval of cross-domain knowledge. The paper also proposes a method of cross-domain ontology annotation, which provides a basis for the user's semantic retrieval. The experimental results show that the model improves the precision and recall rate of the search results.

Web Search Engine Misinformation Notifier Extension (SEMiNExt): A Machine Learning Based Approach during COVID-19 Pandemic

Healthcare ◽

10.3390/healthcare9020156 ◽

2021 ◽

Vol 9 (2) ◽

pp. 156

Author(s):

Abdullah Bin Shams ◽

Ehsanul Hoque Apu ◽

Ashiqur Rahman ◽

Md. Mohsin Sarker Raihan ◽

Nazeeba Siddika ◽

...

Keyword(s):

Public Health ◽

Machine Learning ◽

Real Time ◽

Search Engine ◽

Language Processing ◽

Web Search ◽

Training Data ◽

Small Data ◽

Web Search Engine ◽

User Query

Misinformation from untrusted sources, such as on coronavirus disease 2019 (COVID-19) drugs, vaccination, or the presentation of its treatment, has had dramatic consequences for public health. Authorities have deployed several surveillance tools to detect and slow down the rapid spread of misinformation online. Large quantities of unverified information are available online, and at present there is no real-time tool to alert a user to false information during online health inquiries made through a web search engine. To bridge this gap, we propose a web search engine misinformation notifier extension (SEMiNExt). Natural language processing (NLP) and machine learning algorithms have been successfully integrated into the extension. This enables SEMiNExt to read the user query from the search bar, classify the veracity of the query, and notify the user of its authenticity, all in real time, to prevent the spread of misinformation. Our results show that SEMiNExt works best with an artificial neural network (ANN), reaching an accuracy of 93%, an F1-score of 92%, a precision of 92%, and a recall of 93% when 80% of the data is used for training. Moreover, the ANN is able to predict with very high accuracy even with a small training data set. This is very important for the early detection of new misinformation from the small data samples available online, which can significantly reduce the spread of misinformation and maximize public health safety. The SEMiNExt approach introduces the possibility of improving online health management systems by showing misinformation notifications in real time, enabling safer web-based searching on health-related issues.
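For readers unfamiliar with the reported measures, the following Python sketch shows how accuracy, precision, recall, and F1-score are derived from the counts of a binary classifier's confusion matrix. The function name and the example counts are hypothetical and are not taken from the SEMiNExt experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives.
    """
    def safe_div(numerator, denominator):
        # Return 0.0 instead of failing when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a classifier evaluated on 100 held-out queries.
metrics = classification_metrics(tp=45, fp=5, fn=15, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The F1-score is the harmonic mean of precision and recall, so it stays low whenever either of the two is low, which is why it is usually reported alongside accuracy.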

Meta Search Engine using Semantic Similarity and Correlation Coefficient

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i7261.079920 ◽

2020 ◽

Vol 9 (9) ◽

pp. 643-647

Keyword(s):

Search Engine ◽

Search Engines ◽

Search Space ◽

The Other ◽

Ranking Algorithm ◽

Search Results ◽

Ranking Algorithms ◽

Meta Search Engine ◽

User Query ◽

Meta Search

This paper aims to provide an intelligent way to query a Meta Search Engine and rank its results. A Meta Search Engine takes input from the user and produces results gathered from other search engines. Its main advantage over a conventional search engine is its ability to extend the search space and make more resources available to the user. The semantic intelligent queries fetch results from different search engines, and the responses are fed into our ranking algorithm. Ranking the search results is the other important aspect of Meta Search Engines. When a user submits a query, a number of results are retrieved from different search engines, but only some of them are relevant to the user's interest. Hence, it is important to rank results according to their relevance to the user query. The paper uses intelligent query and ranking algorithms to provide an intelligent Meta Search Engine with semantic understanding.
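The abstract does not specify how results from different engines are merged. As a stand-in, the sketch below uses reciprocal rank fusion, a generic rank aggregation technique that is not claimed to be the paper's algorithm. The constant `k=60` is a commonly used default, and the result identifiers are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked result lists from several engines into one ranking.

    Each result earns 1 / (k + rank) from every list it appears in;
    results are returned by descending total score, ties broken by name.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, item in enumerate(results, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical ranked results returned by two underlying search engines.
engine_one = ["a", "b", "c"]
engine_two = ["b", "c", "a", "d"]
merged = reciprocal_rank_fusion([engine_one, engine_two])
print(merged)
```

A result that several engines place near the top accumulates a higher score than one that a single engine ranks highly, which matches the intuition that agreement between engines signals relevance.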

SINCERITY: the making of a search engine for images indexed with a bilingual taxonomy

OCLC Systems & Services ◽

10.1108/oclc-03-2014-0020 ◽

2015 ◽

Vol 31 (3) ◽

pp. 112-124 ◽

Cited By ~ 1

Author(s):

Tomasz Neugebauer ◽

Elaine Menard

Keyword(s):

Open Source ◽

Search Engine ◽

Open Source Software ◽

Digital Images ◽

Software Components ◽

Image Search ◽

Query Translation ◽

Search Interface ◽

Content Type ◽

User Query

Purpose – This paper aims to present the third stage of a research project that aims to develop a bilingual interface for the retrieval of digital images. The requirements and implementation of the search engine are described. Image search engines attempt to give access to a range of online images available on the web. Design/methodology/approach – The strategy of using open-source software components as much as possible was chosen for the advantages of this approach: low initial cost and accessibility to evaluate and develop enhancements independently and driven by research objectives rather than financial viability. Findings – Open-source software components can be used to develop the interface. The implementation of the image search engine and its indexes uses: Apache Solr, AJAX-Solr, jsTree and jQuery. Microsoft Translator web service was integrated into the interface to provide the optional user query translation. Originality/value – The search interface is intended to be an innovative tool for image searchers who are looking for digital images. The search interface gives the image searchers the opportunity to easily access a variety of visual resources and facilitates searching for images in two different languages (English and French).

Query Sense Discovery Approach to Realize the User's Search Intent

International Journal of Information Retrieval Research ◽

10.4018/ijirr.289609 ◽

2022 ◽

Vol 12 (1) ◽

pp. 0-0

Keyword(s):

Information Retrieval ◽

Search Engine ◽

Performance Measures ◽

Evolutionary Process ◽

Semantic Space ◽

Precise Meaning ◽

User Query ◽

Determination Process ◽

The One ◽

System Operating

The main goal of information retrieval is to return the documents most relevant to a user’s query. A search engine must therefore understand not only the meaning of each keyword in the query but also the keywords' relative senses in the context of the query. Discovering the meaning of a query is a comprehensive and evolutionary process; the precise meaning is established as the associations between concepts develop. The meaning determination process is modeled by a dynamic system operating in the semantic space of WordNet. To capture the meaning of a user query, the original query is reformulated into candidate queries by combining the concepts and their synonyms. A semantic score characterizing the overall meaning of each such query is calculated, and the one with the highest score is used to perform the search. The results confirm that the proposed "Query Sense Discovery" approach provides a significant improvement in several performance measures.
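The reformulate-and-score step can be sketched in Python as follows. This is only an outline of the general idea: the synonym table and the term weights are hypothetical stand-ins for WordNet lookups and for the paper's semantic score, neither of which is reproduced here.

```python
from itertools import product

# Hypothetical synonym table standing in for WordNet synsets.
SYNONYMS = {
    "cheap": ["cheap", "inexpensive"],
    "car": ["car", "automobile"],
}

# Hypothetical per-term weights standing in for a real semantic score.
TOY_WEIGHT = {"cheap": 1.0, "inexpensive": 2.0, "car": 1.0, "automobile": 1.5}


def candidate_queries(query):
    """Build every candidate query by swapping each term for its synonyms."""
    terms = query.lower().split()
    options = [SYNONYMS.get(term, [term]) for term in terms]
    return [" ".join(choice) for choice in product(*options)]


def toy_score(query):
    """Assumed scoring function: sum of the illustrative term weights."""
    return sum(TOY_WEIGHT.get(term, 0.0) for term in query.split())


def best_query(query, score=toy_score):
    """Return the candidate query with the highest score."""
    return max(candidate_queries(query), key=score)


print(candidate_queries("cheap car"))
print(best_query("cheap car"))
```

The number of candidates grows multiplicatively with the number of synonyms per term, so a real system would need to prune the candidate set or score it incrementally.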

Ontology-Based Metasearch Engine in Electronics Area

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l8014.1091220 ◽

2020 ◽

Vol 9 (12) ◽

pp. 272-277

Keyword(s):

Search Engine ◽

Web Sites ◽

Digital Libraries ◽

Search Engines ◽

Information Needs ◽

Web Search ◽

User Profile ◽

Semantic Query ◽

User Query ◽

Metasearch Engine

The goal of search engines is to return accurate and complete results. Satisfying concrete user information needs is becoming more and more difficult, because such needs cannot be specified completely and explicitly and because of the shortcomings of keyword-based searching and indexing. General search engines have indexed millions of web resources and often return thousands of results for a user query, most of them often inadequate. To increase the precision of the results, users sometimes choose search engines that specialize in a concrete domain, or personalized or semantic search. A great variety of specialized search engines can be found (and used) on the internet, but none can guarantee finding the resources that exist on the web and that the concrete user needs. In this paper we present our research on building a meta-search engine that uses domain and user profile ontologies, as well as information (or metadata) extracted directly from web sites, to improve search result quality. We state the main requirements for a search engine for students, PhD students, and scientists, propose a conceptual model, and discuss approaches to its practical realization. Our prototype metasearch engine first performs interactive semantic query refinement; then, using the refined query, it automatically generates several search queries, sends them to different digital libraries and web search engines, and augments and ranks the returned results using ontologically represented domain and user metadata. To test our model, we develop domain ontologies in the electronics domain. We will use ontological terminology representation to propose recommendations for query disambiguation and to supply knowledge for reranking the returned results. We also present some partial initial implementations of query disambiguation strategies, together with testing results.

Real-time credible online health information inquiring: a novel search engine misinformation notifier extension (SEMiNExt) during COVID-19-like disease outbreak

10.21203/rs.3.rs-60301/v2 ◽

2020 ◽

Author(s):

Abdullah Bin Shams ◽

Ehsanul Hoque Apu ◽

Ashiqur Rahman ◽

Nazeeba Siddika ◽

Mohsin Sarker Raihan ◽

...

Keyword(s):

Public Health ◽

Machine Learning ◽

Real Time ◽

Search Engine ◽

Learning Algorithm ◽

Disease Outbreak ◽

Data Set ◽

User Query ◽

Tree Classifier ◽

Health Related

Public health-related misinformation spreads rapidly in online networks, particularly on social media, during any disease outbreak. Misinformation from untrusted sources about coronavirus disease 2019 (COVID-19) drug protocols or the presentation of its treatment has had dramatic consequences for public health. Authorities are using several surveillance tools to detect and slow down the rapid spread of misinformation online, yet millions of pieces of misinformation are still found online. Moreover, there is currently no tool for receiving real-time misinformation notifications during online health or COVID-19 related inquiries. Our proposed combinational approach, which integrates machine learning techniques with a novel search engine misinformation notifier extension (SEMiNExt), helps to identify in real time which news or information comes from unreliable sources. The extension filters the search results and shows a notification beforehand; it is a new and unexplored approach to preventing the spread of misinformation. To validate the user query, SEMiNExt transfers the data to a machine learning algorithm, or classifier, which predicts the authenticity of the search inquiry and returns a binary decision of either true or false. The results show that the supervised learning algorithm works best when 80% of the data set is used for training. In addition, 10-fold cross-validation demonstrates a maximum accuracy and F1-score of 84.3% and 84.1%, respectively, for the Decision Tree classifier, while the K-nearest-neighbor (KNN) algorithm shows the lowest performance. The SEMiNExt approach introduces the possibility of improving online health communication systems by showing misinformation notifications in real time, which enables safer web-based searching on health-related issues.
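The 10-fold cross-validation mentioned above can be illustrated with a minimal Python sketch of how a data set is partitioned. The function name is hypothetical, the split is done without shuffling for simplicity, and the classifier training itself is omitted; this is a generic illustration rather than the authors' evaluation code.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices into k (train, test) pairs for cross-validation.

    Every sample lands in exactly one test fold; fold sizes differ by at
    most one when n_samples is not divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# Each fold would train a classifier on `train` and evaluate it on `test`;
# the reported score is then typically summarized over the k folds.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

Because each sample is used for testing exactly once, the resulting estimate depends less on one particular train/test split than a single 80/20 partition does.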

An Accurate Efficient and Scalable Event Based Video Search Method Using Spectral Clustering

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2018.7118 ◽

2018 ◽

Vol 15 (2) ◽

pp. 537-541

Author(s):

R. G. Sakthivelan ◽

P. Rajendran ◽

M. Thangavel

Keyword(s):

Search Engine ◽

Spectral Clustering ◽

Web Mining ◽

Video Retrieval ◽

Focal Point ◽

Event Extraction ◽

Precision Data ◽

Video Search ◽

User Query ◽

Event Based

Web mining explores enormous sets of data and extracts hidden and valuable information, including text, images, audio, and video files, from a web search engine, which is software that returns relevant information. Video rehabilitation for the context gives efficient comprehension of the video content. Video retrieval refers to the task of retrieving the most relevant videos from a video search engine, but the listed results often fail to match the videos the user needs. This paper addresses Event-Based Video Retrieval (EBVR), which uses metadata to give accurate results. The aim is to detect the circumstances of a focal point, such as a birthday party. To overcome this issue, we propose a personalization approach that captures the relevance of the user query to the event. A video preprocessing method is used to extract related precision data, and a spectral clustering technique is used for video categorization, which yields event extraction and contributes the associated videos.
