Relevance Feedback Based Query Expansion Model Using Borda Count and Semantic Similarity Approach

2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Jagendra Singh ◽  
Aditi Sharan

Pseudo-Relevance Feedback (PRF) is a well-known query expansion method for improving the performance of information retrieval systems. Not all terms in the PRF documents are useful for expanding the user query, so selecting appropriate expansion terms is critical to system performance. Individual methods for selecting query expansion terms have been widely investigated, and each has its own strengths and weaknesses. To offset the weaknesses and exploit the strengths of the individual methods, we use multiple term selection methods together. In this paper, we first explore how far overall performance can be improved using individual query expansion term selection methods. Second, we use the Borda count rank aggregation approach to combine multiple query expansion term selection methods. Third, after the Borda count combination, we apply a semantic similarity approach to select terms that are semantically similar to the query. Our experimental results demonstrate that the proposed approaches achieve a significant improvement over individual term selection methods and related state-of-the-art methods.
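
The Borda count combination mentioned in the abstract can be illustrated with a minimal sketch. The function name `borda_aggregate`, the scoring convention (a term in position p of a list earns n - p points, where n is the number of distinct candidate terms), and the toy rankings are assumptions for illustration; the abstract does not specify the exact variant the authors used.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Combine several ranked term lists with a simple Borda count.

    Assumed convention: a term at 0-based position p earns (n - p) points,
    where n is the number of distinct candidate terms across all lists.
    A term missing from a list earns nothing from that list.
    """
    candidates = {term for ranking in rankings for term in ranking}
    n = len(candidates)
    scores = defaultdict(int)
    for ranking in rankings:
        for position, term in enumerate(ranking):
            scores[term] += n - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda t: (-scores[t], t))

# Hypothetical outputs of three term selection methods.
rankings = [
    ["retrieval", "index", "ranking", "corpus"],
    ["index", "retrieval", "corpus", "ranking"],
    ["retrieval", "ranking", "index", "corpus"],
]
fused = borda_aggregate(rankings)
print(fused)  # ['retrieval', 'index', 'ranking', 'corpus']
```

A semantic similarity filter, as described in the abstract, would then be applied to the fused list; that step is not shown here.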

2021 ◽  
pp. 016555152110406
Author(s):  
Yasir Hadi Farhan ◽  
Shahrul Azman Mohd Noah ◽  
Masnizah Mohd ◽  
Jaffar Atwan

One of the main issues associated with search engines is the query–document vocabulary mismatch problem, a long-standing problem in Information Retrieval (IR). This problem occurs when a user query does not match the content of stored documents, and it affects most search tasks. Automatic query expansion (AQE) is one of the most common approaches used to address this problem. Various AQE techniques have been proposed; these mainly involve finding synonyms or related words for the query terms. Word embedding (WE) is one of the methods currently receiving significant attention. Most existing AQE techniques expand the individual query terms rather than the entire query, and this can lead to query drift if poor expansion terms are selected. In this article, we introduce Deep Averaging Networks (DANs), an architecture that feeds the average of the WE vectors produced by the Word2Vec toolkit for the terms in a query through several linear neural network layers. This average vector is assumed to represent the meaning of the query as a whole and can be used to find expansion terms that are relevant to the complete query. We explore the potential of DANs for AQE in Arabic document retrieval. We experiment with using DANs for AQE in the classic probabilistic BM25 model as well as in two recent expansion strategies: the Embedding-Based Query Expansion approach (EQE1) and the Prospect-Guided Query Expansion Strategy (V2Q). Although DANs did not improve all outcomes when used in the BM25 model, they outperformed all baselines when incorporated into the EQE1 and V2Q expansion strategies.
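
The idea of representing a whole query by the average of its word vectors can be sketched as follows. This covers only the averaging and nearest-term lookup; the linear network layers of a DAN are deliberately omitted. The function names and the three-dimensional toy embeddings are hypothetical and stand in for real Word2Vec vectors.

```python
import math

def average_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def expansion_candidates(query_terms, embeddings, k=2):
    """Return the k vocabulary terms closest to the averaged query vector."""
    query_vec = average_vector([embeddings[t] for t in query_terms])
    scored = [
        (cosine(query_vec, vec), term)
        for term, vec in embeddings.items()
        if term not in query_terms
    ]
    scored.sort(reverse=True)
    return [term for _, term in scored[:k]]

# Toy embeddings (assumed values, not real Word2Vec output).
toy_embeddings = {
    "car": [1.0, 0.0, 0.0],
    "engine": [0.8, 0.2, 0.0],
    "vehicle": [0.9, 0.1, 0.0],
    "motor": [0.85, 0.2, 0.05],
    "banana": [0.0, 0.0, 1.0],
    "river": [0.0, 1.0, 0.0],
}
candidates = expansion_candidates(["car", "engine"], toy_embeddings)
print(candidates)  # ['vehicle', 'motor']
```

Because the candidates are scored against the averaged vector instead of each query term separately, a term must be close to the query as a whole to be selected, which is the property the abstract relies on to limit query drift.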


2018 ◽  
Vol 9 (1) ◽  
pp. 9-17
Author(s):  
Marcel Bonar Kristanda ◽  
Seng Hansun ◽  
Albert Albert

A library catalog is a documented list of all of a library’s collections. A problem was identified in book search within the library catalog of Universitas Multimedia Nusantara’s library information system: the results returned for a user query are not always relevant. This research aims to design and build a library catalog application on the Android platform that increases the relevancy of database search results using the Rocchio Relevance Feedback method, and to measure the user experience. The user experience analysis showed a good response, with a score of 91.18% across all factors, and the relevance evaluation yielded 71.43% precision, 100% recall, and 83.33% F-Measure. The difference in relevant results between the Senayan Library Information system (SLiMS) and the new Android application was 36.11%. The Android application is therefore shown to return relevant results ordered by relevance rank. Index Terms: Rocchio, Relevance, Feedback, Search, Book, Application, Android, Library.
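
As a rough illustration of Rocchio relevance feedback, the sketch below applies the textbook update (the original query moved toward the centroid of relevant documents and away from the centroid of non-relevant ones). The weights alpha=1.0, beta=0.75, gamma=0.15 are commonly quoted defaults and are an assumption here, as is the clipping of negative weights to zero; the abstract does not state the values used in the application. The small `f_measure` helper checks that the reported precision and recall are consistent with the reported F-Measure.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update on term-weight vectors of equal length."""
    def centroid(docs):
        if not docs:
            return [0.0] * len(query)
        return [sum(d[i] for d in docs) / len(docs) for i in range(len(query))]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    # Negative term weights are clipped to zero (assumed convention).
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel_c, non_c)
    ]

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical three-term vocabulary.
updated = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    non_relevant=[[0.0, 0.0, 1.0]],
)
print(updated)  # [1.75, 0.375, 0.0]

# Consistency check on the figures reported in the abstract.
print(round(100 * f_measure(0.7143, 1.0), 2))  # 83.33
```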


Author(s):  
Fatemeh Alighardashi ◽  
Mohammad Ali Zare Chahooki

Improving software product quality before release through periodic testing is one of the most expensive activities in software projects. Because resources for testing modules are limited, it is important to identify fault-prone modules and to direct test resources toward these modules. Software fault predictors based on machine learning algorithms are effective tools for identifying fault-prone modules. Extensive studies are being done in this field to find the connection between the features of software modules and their fault-proneness. Some of the features used by predictive algorithms are ineffective and reduce the accuracy of the prediction process, so feature selection methods are widely used to increase the performance of fault-proneness prediction models. In this study, we propose a feature selection method that combines several filter feature selection methods into a single method, presented as the fused weighted filter method. The proposed method improves both the convergence rate of feature selection and the prediction accuracy. The results obtained on ten datasets from NASA and PROMISE indicate the effectiveness of the proposed method in improving the accuracy and convergence of software fault prediction.
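
One plausible way to fuse several filter methods is a weighted sum of their normalised scores. The sketch below is an assumption made for illustration only: the abstract does not describe how the filters are weighted or combined, and the filter names, feature names, scores, and weights are all hypothetical.

```python
def min_max(scores):
    """Rescale a {feature: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {f: (s - lo) / span if span else 0.0 for f, s in scores.items()}

def fuse_filters(filter_scores, weights):
    """Rank features by a weighted sum of normalised filter scores."""
    fused = {}
    for name, scores in filter_scores.items():
        for feature, value in min_max(scores).items():
            fused[feature] = fused.get(feature, 0.0) + weights[name] * value
    # Highest fused score first; ties are broken alphabetically.
    return sorted(fused, key=lambda f: (-fused[f], f))

# Hypothetical scores from two filter methods on three module features.
filter_scores = {
    "info_gain": {"loc": 0.9, "cyclomatic": 0.6, "comments": 0.1},
    "chi2": {"loc": 12.0, "cyclomatic": 20.0, "comments": 2.0},
}
weights = {"info_gain": 0.5, "chi2": 0.5}
ranking = fuse_filters(filter_scores, weights)
print(ranking)  # ['cyclomatic', 'loc', 'comments']
```

Normalising before summing keeps a filter with a large raw score range (here the hypothetical `chi2`) from dominating the fused ranking.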


Author(s):  
B. Venkatesh ◽  
J. Anuradha

In microarray data, achieving high classification accuracy is difficult because of high dimensionality and irrelevant, noisy data. Such data also contain many gene expression features but few samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features needs to be extracted, which can be achieved by applying a feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, a filter phase and a wrapper phase. In the filter phase, an ensemble technique is used to aggregate the feature ranks of the Relief, minimum Redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods; the ranks are aggregated using Fuzzy Gaussian membership function ordering. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used to select the optimal features, and an RBF kernel-based Support Vector Machine (SVM) classifier is used as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods on five benchmark datasets, using performance metrics such as Accuracy, Recall, Precision, and F1-Score. The experimental results show that the proposed method outperforms the other feature selection methods.


2013 ◽  
Vol 300-301 ◽  
pp. 874-881
Author(s):  
Chuan Rong Zhao ◽  
De Ren Kong ◽  
Fang Wang ◽  
Li Xia Yang ◽  
Li Ping Li

This article introduces the function, selection method, and related criteria of the standard internal crusher gauge, and systematically analyzes three factors that affect its measurement uncertainty: the inconsistency of the true pressure value from the pressure source, the uncertainty introduced by the standard copper cylinder, and the random fluctuations in the individual characteristics of the pressure measuring gauge. Based on the usage characteristics and selection methods of the standard internal crusher gauge, the article discusses methods for computing the components of the measurement uncertainty and establishes an evaluation model for the measurement uncertainty of the standard internal crusher gauge. The model allows the uncertainty to be calculated quantitatively from the experimental selection data, which lays a theoretical foundation for controlling the pressure measurement uncertainty of the standard internal crusher gauge.
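
When uncertainty components are independent, a common way to combine them is the root-sum-square rule used in general metrology practice. The sketch below assumes that the three components named in the abstract are uncorrelated and expressed in the same unit; the article's actual evaluation model is not given in the abstract, and the numeric values are invented for illustration.

```python
import math

def combined_uncertainty(components):
    """Root-sum-square of independent standard uncertainty components."""
    return math.sqrt(sum(u * u for u in components))

def expanded_uncertainty(components, coverage_factor=2.0):
    """Combined uncertainty scaled by a coverage factor (k = 2 assumed)."""
    return coverage_factor * combined_uncertainty(components)

# Hypothetical components, all in the same unit (for example MPa):
# pressure source, standard copper cylinder, gauge-to-gauge fluctuation.
components = [0.3, 0.4, 1.2]
u_c = combined_uncertainty(components)
print(round(u_c, 6))  # 1.3
```

If the components were correlated, covariance terms would need to be added under the square root, so this sketch should be read as the simplest case only.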


2018 ◽  
Vol 29 (1) ◽  
pp. 653-663 ◽  
Author(s):  
Ritu Meena ◽  
Kamal K. Bharadwaj

Many recommender systems frequently suggest group-consumable items to individual users. Much work has been done on group recommender systems (GRSs) with full rankings, but partial ranking (PR), where items are only partially ranked, remains a challenge. The objective of this work is to propose a rank aggregation technique that handles the PR problem effectively. In real applications, most studies have focused on PR without ties (PRWOT). However, rankings may contain ties, where some items are placed in the same position, and the partial rankings to be aggregated may not be permutations. To handle the PR problem in GRSs for both PRWOT and PR with ties (PRWT), we propose a novel approach to GRS based on a genetic algorithm (GA), in which the Spearman footrule distance is used as the fitness function for PRWOT and the Kendall tau distance with bucket order is used for PRWT. Experimental results clearly demonstrate that the proposed GRS based on GA for PRWOT (GRS-GA-PRWOT) and for PRWT (GRS-GA-PRWT) outperforms well-known baseline GRS techniques.
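
The two distances named in the abstract can be sketched for the simplest case of two full rankings over the same items with no ties. The bucket-order variant of Kendall tau used for rankings with ties is not covered here, and the function names are illustrative.

```python
from itertools import combinations

def spearman_footrule(r1, r2):
    """Sum of absolute position differences between two full rankings."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(abs(i - pos2[item]) for i, item in enumerate(r1))

def kendall_tau_distance(r1, r2):
    """Number of item pairs that the two rankings order differently."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )

# Two hypothetical member rankings of the same four items.
r1 = ["a", "b", "c", "d"]
r2 = ["b", "a", "d", "c"]
print(spearman_footrule(r1, r2))     # 4
print(kendall_tau_distance(r1, r2))  # 2
```

In a genetic algorithm of the kind the abstract describes, a candidate group ranking could be scored by summing its distance to every member's ranking, with lower totals treated as fitter; that fitness design is an assumption, since the abstract only names the distances.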


2015 ◽  
Vol 5 (4) ◽  
pp. 31-45 ◽  
Author(s):  
Jagendra Singh ◽  
Aditi Sharan

Pseudo-relevance feedback (PRF) is a relevance feedback approach to query expansion that treats the top-ranked retrieved documents as relevance feedback. In this paper, the authors address the limitations of co-occurrence and PRF based query expansion and propose a hybrid method that improves the performance of PRF based query expansion by combining query term co-occurrence with the contextual information of query terms, both drawn from the corpus of top feedback documents retrieved in the first pass. First, the paper proposes a query term co-occurrence approach, based on the top retrieved feedback documents, to select an optimal combination of query terms from a pool of terms obtained using PRF based query expansion. Second, a contextual window based approach is used to select terms related to the query context from the top feedback documents. Third, the baseline, co-occurrence, and contextual window based approaches are compared using different performance evaluation metrics. The experiments were performed on benchmark data, and the results show a significant improvement over the baseline approach.
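
A contextual window approach can be sketched as counting the terms that appear within a fixed distance of any query term in the feedback documents. The window size, the raw-count scoring, and the tokenised toy documents are assumptions; the abstract does not give the authors' scoring function.

```python
from collections import Counter

def window_cooccurrence(documents, query_terms, window=2):
    """Count non-query terms within `window` tokens of a query term."""
    query = set(query_terms)
    counts = Counter()
    for tokens in documents:
        for i, token in enumerate(tokens):
            if token not in query:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Neighbours on both sides, excluding the query term itself.
            for neighbour in tokens[lo:i] + tokens[i + 1:hi]:
                if neighbour not in query:
                    counts[neighbour] += 1
    return counts

# Hypothetical top feedback documents, already tokenised.
feedback_docs = [
    ["query", "expansion", "improves", "retrieval", "performance"],
    ["pseudo", "relevance", "feedback", "supports", "query", "expansion"],
]
counts = window_cooccurrence(feedback_docs, ["query", "expansion"], window=2)
print(counts.most_common(2))  # [('improves', 2), ('supports', 2)]
```

Terms with the highest counts would be the candidates for expansion; terms such as "pseudo" that never fall inside a window around a query term are not counted at all.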

