Query Recommendation for Improving Search Engine Results

2011 ◽  
Vol 1 (1) ◽  
pp. 45-52 ◽  
Author(s):  
Hamada M. Zahera ◽  
Gamal F. El-Hady ◽  
W. F. Abd El-Wahed

As web content grows, search engines become ever more critical, while user satisfaction decreases. Query recommendation is a new approach to improving search results on the web. In this paper, a method is proposed that, given a query submitted to a search engine, suggests a list of queries related to the user's input query. The related queries are drawn from previously issued queries and can be submitted to the search engine to tune or redirect the search process. The proposed method is based on a clustering process that detects groups of semantically similar queries, using the historical user preferences recorded in the search engine's query log. By offering queries related to the ones users submit, the method directs users toward the information they need. It not only discovers related queries but also ranks them according to a similarity measure. The method has been evaluated on real data sets from a search engine query log.
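The core ranking step described above can be illustrated with a minimal sketch (not the authors' exact clustering method): historical log queries are turned into term-frequency vectors and ranked by cosine similarity to the input query. The query log and threshold here are invented toy values.

```python
# Toy sketch: rank log queries by cosine similarity of term vectors.
import math
from collections import Counter

def term_vector(query):
    """Bag-of-words term-frequency vector for a query string."""
    return Counter(query.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(input_query, log_queries, k=3):
    """Rank historical queries by similarity to the input query."""
    q = term_vector(input_query)
    scored = [(cosine(q, term_vector(p)), p) for p in log_queries]
    scored.sort(reverse=True)
    return [p for s, p in scored[:k] if s > 0]

log = ["cheap flights paris", "paris hotel deals", "python tutorial",
       "flights to paris from london"]
print(recommend("paris flights", log))
```

A full system would cluster these vectors over the whole log and use click-through preferences, as the abstract describes; the similarity-then-rank step is the same.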


Geophysics ◽  
2020 ◽  
Vol 85 (2) ◽  
pp. V223-V232 ◽  
Author(s):  
Zhicheng Geng ◽  
Xinming Wu ◽  
Sergey Fomel ◽  
Yangkang Chen

The seislet transform uses the wavelet-lifting scheme and local slopes to analyze seismic data. In its definition, the design of prediction operators tailored to seismic images and data is a key issue. We have developed a new formulation of the seislet transform based on the relative time (RT) attribute, which uses the RT volume to construct multiscale prediction operators. With the new operators, the seislet transform is accelerated because distant traces can be predicted directly. We apply our method to synthetic and real data to demonstrate that the new approach reduces computational cost and obtains an excellent sparse representation on the test data sets.
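A toy illustration of the prediction idea (an assumed setup, not the published operator): if samples with equal RT are taken to lie on the same reflector, a distant trace can be predicted directly by looking up the reference amplitude at the matching RT value. All traces and RT values below are invented.

```python
# Predict one trace from a distant reference trace by matching RT values.
import bisect

def predict_trace(ref_amp, ref_rt, target_rt):
    """For each target sample, copy the reference amplitude whose RT is closest."""
    out = []
    for t in target_rt:
        i = bisect.bisect_left(ref_rt, t)
        # pick the nearer of the two bracketing reference samples
        if i == 0:
            j = 0
        elif i == len(ref_rt):
            j = len(ref_rt) - 1
        else:
            j = i if ref_rt[i] - t < t - ref_rt[i - 1] else i - 1
        out.append(ref_amp[j])
    return out

ref_rt = [0.0, 1.0, 2.0, 3.0]     # RT increases monotonically with depth
ref_amp = [0.1, 0.9, -0.4, 0.2]
target_rt = [0.0, 1.2, 1.9, 3.0]  # same reflectors, slightly shifted in time
print(predict_trace(ref_amp, ref_rt, target_rt))
```

This direct lookup is what removes the trace-by-trace recursion that makes slope-based prediction of distant traces expensive.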


Geophysics ◽  
2016 ◽  
Vol 81 (6) ◽  
pp. D625-D641 ◽  
Author(s):  
Dario Grana

The estimation of rock and fluid properties from seismic attributes is an inverse problem. Rock-physics modeling provides the physical relations that link elastic and petrophysical variables. Most of these models are nonlinear; therefore, the inversion generally requires complex iterative optimization algorithms to estimate the reservoir model of petrophysical properties. We have developed a new approach based on linearizing the rock-physics forward model using first-order Taylor series approximations. The mathematical method adopted for the inversion is the Bayesian approach previously applied successfully to linearized amplitude-variation-with-offset inversion. We developed the analytical formulation of the linearized rock-physics relations for three different model families (empirical, granular-media, and inclusion models) and derived the Bayesian rock-physics inversion under Gaussian assumptions for the prior distribution of the model. Applying the inversion to real data sets delivers accurate results. The main advantage of the method is its small computational cost, owing to the analytical solution afforded by the linearization and the Gaussian Bayesian approach.
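The analytical solution the abstract refers to rests on standard linear-Gaussian machinery, shown here in its simplest scalar form (made-up illustration values, not the paper's models): with a linearized forward model d = g·m + e, a Gaussian prior on m, and Gaussian noise, the posterior of m is Gaussian with closed-form mean and variance.

```python
# Scalar linear-Gaussian posterior: d = g*m + e, m ~ N(mu, s2), e ~ N(0, n2).
def gaussian_linear_posterior(d, g, mu, s2, n2):
    """Posterior mean and variance of m given observation d."""
    k = g * s2 / (g * g * s2 + n2)   # Kalman-style gain
    post_mean = mu + k * (d - g * mu)
    post_var = (1.0 - k * g) * s2
    return post_mean, post_var

m_mean, m_var = gaussian_linear_posterior(d=2.0, g=1.0, mu=0.0, s2=1.0, n2=1.0)
print(m_mean, m_var)  # -> 1.0 0.5: mean moves halfway to the data, variance halves
```

The multivariate version replaces g by the Jacobian of the linearized rock-physics relations, but the update has the same form and still needs no iterative optimization.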


2015 ◽  
Vol 2015 ◽  
pp. 1-14 ◽  
Author(s):  
JianGuo Wang ◽  
Joshua Zhexue Huang ◽  
Dingming Wu

Query recommendation is an essential part of modern search engines, aiming to help users find useful information. Existing query recommendation methods focus on recommending queries similar to the user's. The main problem with these similarity-based approaches is that even very similar queries may return few or no useful search results, while less similar queries may return more, especially when the initial query does not correctly reflect the user's search intent. We therefore propose recommending high-utility queries, that is, queries with more relevant documents, rather than merely similar ones. In this paper, we first construct a query-reformulation graph consisting of query nodes, satisfactory-document nodes, and an interruption node. We then apply an absorbing random walk on this graph and model document utility as the transition probability from the initial query to a satisfactory document. Finally, we propagate the document utilities back to queries and rank candidate queries by their utility for recommendation. Extensive experiments on real query logs show that our method significantly outperforms state-of-the-art methods in recommending high-utility queries.
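The absorbing-random-walk step can be sketched on a tiny hypothetical graph (not the paper's data): document nodes and the interruption node absorb the walk, and a query's utility is its probability of being absorbed at a satisfactory document rather than at the interruption.

```python
# Absorbing random walk: utility = probability of ending at a satisfactory doc.
def absorption_prob(transitions, absorbing_hits, iters=200):
    """transitions: query -> list of (next_node, prob);
    absorbing_hits: absorbing node -> 1 (satisfactory doc) or 0 (interruption)."""
    util = {q: 0.0 for q in transitions}
    for _ in range(iters):
        new = {}
        for q, nbrs in transitions.items():
            total = 0.0
            for nxt, p in nbrs:
                if nxt in absorbing_hits:
                    total += p * absorbing_hits[nxt]
                else:
                    total += p * util[nxt]
            new[q] = total
        util = new
    return util

transitions = {
    "q1": [("doc1", 0.3), ("q2", 0.7)],   # q1 is often reformulated into q2
    "q2": [("doc1", 0.6), ("stop", 0.4)],
}
absorbing = {"doc1": 1, "stop": 0}
print(absorption_prob(transitions, absorbing))  # q1's utility inherits q2's
```

Here q1's utility is 0.3 + 0.7·0.6 = 0.72: utility flows backward through reformulations, which is exactly the propagation step the abstract describes.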


2021 ◽  
Vol 3 (1) ◽  
pp. 1-7
Author(s):  
Yadgar Sirwan Abdulrahman

Clustering is one of the essential strategies in data analysis. Classical solutions assume that all features contribute equally to the clustering, but in real data sets some features are more important than others, and these essential features should have a greater impact on identifying the optimal clusters. In this article, a fuzzy clustering algorithm with local automatic feature weighting is presented. The proposed algorithm has several advantages: (1) the feature weights are local, meaning that each cluster's weights differ from the rest; (2) the distance between samples is computed with a non-Euclidean similarity criterion to reduce the effect of noise; and (3) the feature weights are obtained adaptively during the learning process. Mathematical analysis is given to derive the cluster centers and the feature weights. Experiments on a range of data sets demonstrate the proposed algorithm's efficiency compared with other algorithms that use global or local feature weights.
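A simplified sketch of the local-weighting idea (crisp assignments rather than full fuzzy memberships, invented toy data, and a plain weighted squared distance instead of the paper's non-Euclidean criterion): each cluster keeps its own feature weights, updated so that features with low within-cluster dispersion get high weight.

```python
# K-means-like clustering with per-cluster (local) feature weights.
def weighted_dist(x, c, w):
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))

def cluster(data, centers, steps=10):
    k, dims = len(centers), len(data[0])
    weights = [[1.0 / dims] * dims for _ in range(k)]
    for _ in range(steps):
        # assignment step, using each cluster's own weights
        groups = [[] for _ in range(k)]
        for x in data:
            j = min(range(k), key=lambda i: weighted_dist(x, centers[i], weights[i]))
            groups[j].append(x)
        # update centers and local weights (weight ~ inverse dispersion)
        for i, g in enumerate(groups):
            if not g:
                continue
            centers[i] = [sum(x[d] for x in g) / len(g) for d in range(dims)]
            disp = [sum((x[d] - centers[i][d]) ** 2 for x in g) + 1e-6
                    for d in range(dims)]
            inv = [1.0 / v for v in disp]
            weights[i] = [v / sum(inv) for v in inv]
    return centers, weights

data = [[0.0, 0.0], [0.1, 5.0], [0.2, -4.0],   # cluster A: only feature 0 is tight
        [9.0, 1.0], [9.1, 1.1], [8.9, 0.9]]    # cluster B: both features tight
centers, weights = cluster(data, centers=[[0.0, 0.0], [9.0, 1.0]])
print(weights[0])  # feature 0 dominates in cluster A
```

The key property survives the simplification: cluster A learns to nearly ignore its noisy second feature while cluster B keeps both, which a single global weight vector cannot express.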


Author(s):  
LEV V. UTKIN

A new approach to ensemble construction, based on restricting the set of example weights in the training data to avoid overfitting, is proposed in this paper. The algorithm, called EPIBoost (Extreme Points Imprecise Boost), applies imprecise statistical models to restrict the set of weights, and the weights are updated within the restricted set by using its extreme points. The approach allows various algorithms to be constructed by applying different imprecise statistical models to produce the restricted set. Numerical experiments on real data sets show that EPIBoost may outperform standard AdaBoost for some parameters of the imprecise statistical models.
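A hedged sketch of the underlying idea (our own simplified variant, not the exact EPIBoost update): run AdaBoost with decision stumps on toy 1-D data, but clip the example weights into an interval [lo, hi] before renormalizing, so no single example can dominate training the way it can in standard AdaBoost.

```python
# AdaBoost with decision stumps and a restricted (clipped) weight set.
import math

def stump(data, weights):
    """Best threshold classifier on 1-D data: returns (error, cut, sign)."""
    best = None
    for cut in sorted({x for x, _ in data}):
        for sign in (1, -1):
            err = sum(w for (x, y), w in zip(data, weights)
                      if (sign if x > cut else -sign) != y)
            if best is None or err < best[0]:
                best = (err, cut, sign)
    return best

def boost(data, rounds=5, lo=0.01, hi=0.5):
    n = len(data)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, cut, sign = stump(data, w)
        err = max(err, 1e-9)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, cut, sign))
        w = [wi * math.exp(-alpha * y * (sign if x > cut else -sign))
             for (x, y), wi in zip(data, w)]
        w = [min(max(wi, lo), hi) for wi in w]   # restrict the weight set
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (s if x > c else -s) for a, c, s in ensemble)
    return 1 if score > 0 else -1

data = [(0.0, -1), (1.0, -1), (2.0, 1), (3.0, 1)]   # toy 1-D labels
model = boost(data)
print([predict(model, x) for x, _ in data])
```

EPIBoost proper derives the restricted set from an imprecise statistical model and updates via its extreme points; the clipping above is only the simplest stand-in for that restriction.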


2021 ◽  
Vol 25 (3) ◽  
pp. 687-710
Author(s):  
Mostafa Boskabadi ◽  
Mahdi Doostparast

Regression trees are powerful data-mining tools for analyzing data sets. Observations are divided into homogeneous groups, and statistical models for the responses are then derived in the terminal nodes. This paper proposes a new approach for regression trees that takes the dependency structure among covariates into account when splitting the observations. The mathematical properties of the proposed method are discussed in detail, and various criteria are defined to assess the accuracy of the proposed model. The performance of the new approach is assessed in a Monte Carlo simulation study, and two real data sets, on classification and regression problems, are analyzed using the obtained results.
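For context, the classical splitting rule the paper builds on can be sketched in a few lines (toy data; the paper's dependency-aware splitting is not reproduced here): choose the cut on a covariate that minimizes the summed squared error of the two child means.

```python
# Classical regression-tree split: minimize left + right squared error.
def sse(ys):
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """Return (cost, cut) for the cut on x minimizing child squared errors."""
    best = None
    for cut in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= cut]
        right = [y for x, y in zip(xs, ys) if x > cut]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, cut)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
print(best_split(xs, ys))   # the cut falls between the two response regimes
```

The proposed method replaces this covariate-by-covariate search with splits that account for how the covariates depend on one another.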


2019 ◽  
Vol 8 (2) ◽  
pp. 159
Author(s):  
Morteza Marzjarani

Heteroscedasticity plays an important role in data analysis. This article presents the issue along with a few approaches for handling it. First, iteratively reweighted least squares (IRLS) and iterative feasible generalized least squares (IFGLS) are deployed, and proper weights for reducing heteroscedasticity are determined. Next, a new approach for handling heteroscedasticity is introduced. In this approach, after fitting a multiple linear regression (MLR) model or a general linear model (GLM) to a sufficiently large data set, the data are divided into two parts by inspecting the residuals, based on the results of testing for heteroscedasticity or via simulations. The first part contains the records whose absolute residuals are small enough that heteroscedasticity can be ignored; under this assumption, the error variances are small, close to those of their neighboring points, and can be treated as known (though not necessarily equal). The second, remaining portion of the data is categorized as heteroscedastic. On real data sets, this approach is shown to reduce the number of unusual (e.g., influential) data points flagged for further inspection and, more importantly, to lower the root mean square error (RMSE), yielding a more robust set of parameter estimates.
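The IRLS step mentioned above can be sketched for a simple linear model y = a + b·x (toy data, and a simple inverse-squared-residual weight with a small smoothing constant as an assumed weighting scheme): each pass refits by weighted least squares, down-weighting observations with large residuals and hence, implicitly, high error variance.

```python
# Bare-bones IRLS for y = a + b*x with heteroscedastic noise.
def wls(xs, ys, ws):
    """Closed-form weighted least squares for y = a + b*x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    b = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    return my - b * mx, b

def irls(xs, ys, passes=5):
    ws = [1.0] * len(xs)
    for _ in range(passes):
        a, b = wls(xs, ys, ws)
        # weight ~ inverse squared residual, smoothed to avoid division blow-up
        ws = [1.0 / ((y - (a + b * x)) ** 2 + 0.05) for x, y in zip(xs, ys)]
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [0.0, 1.1, 1.9, 3.0, 9.0]   # last point has much larger error variance
a, b = irls(xs, ys)
print(b)   # slope is pulled back toward 1 despite the high-variance point
```

Plain OLS on these data gives a slope near 2 because the last point dominates; the reweighting recovers a slope close to the trend of the well-behaved points.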


Separations ◽  
2021 ◽  
Vol 8 (10) ◽  
pp. 178
Author(s):  
Guillaume Laurent Erny ◽  
Marzieh Moeenfard ◽  
Arminda Alves

Selectivity in separation science is defined as the extent to which a method can determine the target analyte free of interference. It is the backbone of any method and can be enhanced at various steps, including sample preparation, separation optimization, and detection. Significant improvement in selectivity can also be achieved at the data-analysis step through mathematical treatment of the signals. In this manuscript, we present a new approach that uses mathematical functions to model chromatographic peaks. Unlike classical peak-fitting approaches, where the fitting parameters are optimized against a single profile (one-way data), the parameters are optimized over multiple profiles (two-way data), which allows higher confidence and robustness. Furthermore, an iterative scheme is developed in which the number of peaks is increased at each step until convergence. It is demonstrated with simulated and real data that this algorithm (1) is capable of mathematically separating each component with minimal user input and (2) measures peak areas accurately even at resolutions as low as 0.5, provided the peak intensities do not differ by more than a factor of 10. This was conclusively demonstrated with the quantification of diterpene esters in standard mixtures.
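A simplified sketch of the two-way idea (toy signals, and Gaussian peak shapes assumed fixed and known rather than fitted): once the peak centers and widths are shared across profiles, each profile's peak areas follow from a small linear least-squares solve, even when the peaks overlap strongly.

```python
# Areas of two overlapping Gaussian peaks via a 2x2 normal-equations solve.
import math

def gauss(t, mu, sigma):
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def areas_for_profile(signal, ts, peaks):
    """Least-squares areas for two fixed peak shapes."""
    g = [[gauss(t, mu, s) for t in ts] for mu, s in peaks]
    a11 = sum(x * x for x in g[0])
    a22 = sum(x * x for x in g[1])
    a12 = sum(x * y for x, y in zip(g[0], g[1]))
    b1 = sum(x * y for x, y in zip(g[0], signal))
    b2 = sum(x * y for x, y in zip(g[1], signal))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (b2 * a11 - b1 * a12) / det

ts = [i * 0.05 for i in range(200)]
peaks = [(4.0, 0.5), (4.6, 0.5)]          # two strongly overlapping peaks
true_areas = (2.0, 1.0)
signal = [true_areas[0] * gauss(t, *peaks[0]) + true_areas[1] * gauss(t, *peaks[1])
          for t in ts]
print(areas_for_profile(signal, ts, peaks))   # recovers the areas despite overlap
```

In the manuscript's approach, the shape parameters themselves are optimized jointly over all profiles; fixing them here isolates the point that shared shapes make the per-profile area estimates a well-posed linear problem.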


2019 ◽  
Vol 2019 ◽  
pp. 1-8
Author(s):  
Chunzhong Li ◽  
Yunong Zhang ◽  
Xu Chen

As one of the typical families of clustering algorithms, heuristic clustering is characterized by its flexibility in feature integration. This paper proposes a heuristic algorithm based on cognitive feature integration. The proposed algorithm employs nonparametric density estimation and maximum likelihood estimation to integrate global and local cognitive features and outputs satisfying clustering results. The new approach is highly extensible, allowing priors to be supplied and misclassifications to be adjusted during the clustering process. Its advantages are as follows: (1) it is effective in recognizing stable clustering results without priors given in advance; (2) it can be applied to complex data sets and is not restricted by the density or shape of the clusters; and (3) it is effective in recognizing noise and outliers, which therefore need not be eliminated in advance. Experiments on synthetic and real data sets exhibit the better performance of the new algorithm.
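One ingredient the abstract names, nonparametric density estimation, can be sketched with a 1-D Gaussian kernel density estimate on invented data: cluster cores show up as local maxima of the estimated density, without fixing the number of clusters in advance.

```python
# 1-D Gaussian KDE; local density maxima indicate cluster cores.
import math

def kde(x, samples, h=0.4):
    """Gaussian kernel density estimate at point x with bandwidth h."""
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / (
        len(samples) * h * math.sqrt(2 * math.pi))

samples = [-0.1, 0.0, 0.1, 4.9, 5.0, 5.1]   # two well-separated groups
grid = [i * 0.1 for i in range(-20, 70)]
dens = [kde(g, samples) for g in grid]
# strict local maxima of the estimated density
modes = [round(g, 1) for g, a, b, c in zip(grid[1:], dens, dens[1:], dens[2:])
         if b > a and b > c]
print(modes)   # one mode per group
```

The proposed algorithm combines such density estimates with maximum likelihood over local features; this fragment shows only why density peaks alone already recover stable cluster cores without a preset cluster count.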

