Enhancing Relevance Ranking of the EERQI Search Engine

Author(s):  
Ágnes Sándor ◽  
Angela Vorndran
2010 ◽  
Vol 7 (2) ◽  
pp. 1-11 ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Joachim Bargsten ◽  
Gregor Haberhauer ◽  
Matthias Klapperstück ◽  
...  

Summary: Search engines and retrieval systems are popular tools on the life science desktop. Manually inspecting hundreds of database entries that reflect a life science concept or fact is a time-intensive daily task. Here, it is not the number of query results that matters, but their relevance. In this paper, we present the LAILAPS search engine for life science databases. The concept combines a novel feature model for relevance ranking, a machine learning approach to modelling user relevance profiles, ranking improvement through user feedback tracking, and an intuitive, slim web user interface that estimates relevance rank by tracking user interactions. Queries are formulated as simple keyword lists and are expanded with synonyms. Supporting a flexible text index and a simple data import format, LAILAPS can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. From a set of features extracted from each database hit, in combination with user relevance preferences, a neural network predicts user-specific relevance scores. Using expert knowledge as training data for a predefined neural network, or using the user's own relevance training sets, a reliable relevance ranking of database hits has been implemented. In this paper, we present the LAILAPS system, its concepts, benchmarks, and use cases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de
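The abstract describes queries as simple keyword lists that are expanded with synonyms. A minimal sketch of that expansion step, assuming a hypothetical synonym table (the terms below are invented examples, not LAILAPS data):

```python
# Sketch of keyword-list query expansion by synonyms.
# The synonym table and query terms are illustrative assumptions.

SYNONYMS = {
    "kinase": ["phosphotransferase"],
    "gene": ["locus", "cistron"],
}

def expand_query(keywords):
    """Expand a simple keyword list with known synonyms, deduplicated,
    preserving the order in which terms are first encountered."""
    expanded = []
    for kw in keywords:
        term = kw.lower()
        if term not in expanded:
            expanded.append(term)
        for syn in SYNONYMS.get(term, []):
            if syn not in expanded:
                expanded.append(syn)
    return expanded

print(expand_query(["kinase", "gene"]))
# → ['kinase', 'phosphotransferase', 'gene', 'locus', 'cistron']
```

The expanded list would then be passed to the text index in place of the raw user keywords.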


Author(s):  
Cláudio Elízio Calazans Campelo ◽  
Cláudio de Souza Baptista ◽  
Ricardo Madeira Fernandes

It is well known that documents available on the Web are extremely heterogeneous in several respects, such as the use of various languages and different formats for representing content, besides external factors like source reputation, refresh frequency, and so forth (Page & Brin, 1998). Together, these factors increase the complexity of Web information retrieval systems. Superficially, the traditional search engines available on the Web today consist of retrieving documents that contain the keywords supplied by users. Nevertheless, among the variety of search possibilities, it is evident that users need a process involving more sophisticated analysis; for example, temporal or spatial contextualization might be considered. In such keyword-based search engines, for instance, a Web page containing the phrase “…due to the company arrival in London, a thousand java programming jobs will be open…” would not be found for the search “jobs programming England” unless the word “England” appeared elsewhere on the page. The explanation is that the term “London” is treated merely as another word, without regard to its geographical referent. In a spatial search engine, the expected behavior would be to return the page described above, since the system would have information indicating that the term “London” refers to a city located in the country referred to by the term “England.” This result could only be achieved in a traditional search engine if the user repeatedly submitted searches for all possible sub-regions of England (e.g., its cities). As the example suggests, for many user searches the most interesting results are those related to certain geographical regions.
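The London/England example comes down to a containment lookup in a gazetteer: a page mentioning "London" should match a query for "England" because the gazetteer records that London lies in England. A minimal sketch, assuming a toy gazetteer (the entries below are illustrative, not a real data source):

```python
# Toy gazetteer mapping each place to its containing region.
GAZETTEER = {
    "london": "england",
    "manchester": "england",
    "england": "united kingdom",
}

def contained_in(place, region):
    """Walk the containment hierarchy upward from `place`."""
    current = place.lower()
    while current in GAZETTEER:
        current = GAZETTEER[current]
        if current == region.lower():
            return True
    return False

def matches_spatial_query(page_terms, query_region):
    """A page matches if any of its terms names the query region
    or a sub-region of it."""
    return any(t.lower() == query_region.lower() or contained_in(t, query_region)
               for t in page_terms)

print(matches_spatial_query(["java", "jobs", "London"], "England"))  # → True
```

A real system would resolve the ambiguity cases discussed below (many places sharing a name) before this lookup.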
A variety of feature extraction and automatic document classification techniques have been proposed; however, acquiring the geographical features of a Web page involves some peculiar complexities, such as ambiguity (e.g., many places with the same name, various names for a single place, things named after places, etc.). Moreover, a Web page can refer to a place that contains, or is contained by, the one given in the user query, which requires knowing the different region topologies used by the system. Many features related to geographical context can be added to the process of elaborating a relevance ranking for returned documents. For example, a document can be more relevant than another if its content refers to a place closer to the user's location. Nonetheless, in spatial search engines there are more complex issues to consider because of the spatial dimension involved in ranking elaboration. Jones, Alani, and Tudhope (2001) propose combining the Euclidean distance between place centroids with hierarchical distances to generate a hybrid spatial distance that can be used in the relevance ranking of returned documents. Further important issues are the indexing mechanisms and query processing. In general, solutions try to combine well-known textual indexing techniques (e.g., inverted files) with spatial indexing mechanisms. As for the user interface, spatial search engines are more complex because users need to choose regions of interest, as well as possible spatial relationships, in addition to keywords. To visualize the results, it is helpful to use digital map resources alongside the textual information.
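The hybrid distance attributed to Jones, Alani, and Tudhope (2001) can be sketched as a weighted combination of the Euclidean distance between place centroids and a hierarchical distance (steps through the region tree to a common ancestor). The centroids, hierarchy, and weights below are illustrative assumptions, not the authors' data or parameters:

```python
import math

CENTROIDS = {          # (x, y) centroids in arbitrary map units (invented)
    "london": (0.0, 0.0),
    "reading": (0.6, 0.1),
    "edinburgh": (0.3, 5.4),
}
PARENT = {             # toy containment hierarchy
    "london": "england",
    "reading": "england",
    "edinburgh": "scotland",
    "england": "uk",
    "scotland": "uk",
}

def ancestors(place):
    """The chain from a place up to the root of the hierarchy."""
    chain = [place]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def hierarchical_distance(a, b):
    """Steps from a and b up to their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for i, node in enumerate(chain_a):
        if node in chain_b:
            return i + chain_b.index(node)
    return len(chain_a) + len(chain_b)   # no common ancestor

def hybrid_distance(a, b, w_euclid=0.5, w_hier=0.5):
    """Weighted sum of centroid distance and hierarchical distance."""
    (xa, ya), (xb, yb) = CENTROIDS[a], CENTROIDS[b]
    euclid = math.hypot(xa - xb, ya - yb)
    return w_euclid * euclid + w_hier * hierarchical_distance(a, b)
```

With these toy values, `hybrid_distance("london", "reading")` is smaller than `hybrid_distance("london", "edinburgh")`, so pages about Reading would rank above pages about Edinburgh for a London-centered query.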


2010 ◽  
Vol 7 (3) ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Christian Colmsee ◽  
Steffen Flemming ◽  
Matthias Klapperstück ◽  
...  

Summary: Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases into a vast network of interconnected information systems is, to the same extent, a big challenge and a great chance for life science research. The knowledge found on the Web, in particular in life science databases, is a valuable major resource. In order to bring it to the scientist's desktop, well-performing search engines are essential. Here, neither the response time nor the number of results is what matters most: for millions of query results, the most crucial factor is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observing how users inspect search engine results, we condensed a set of 9 relevance-discriminating features. These features are intuitively used by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate potential relevance and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de
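The regression setup described above (a neural network mapping a 9-dimensional feature vector to a relevance score) can be sketched with a single sigmoid unit trained by gradient descent. This is a stand-in for the paper's trained network: the synthetic feature vectors and labels below are not the LAILAPS feature model or its training data.

```python
import math, random

# Illustrative relevance regression: one sigmoid unit maps a
# 9-dimensional feature vector to a score in [0, 1]. Training data
# is synthetic (relevant entries get larger feature values).
random.seed(0)
N_FEATURES = 9

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

def train(examples, epochs=200, lr=0.5):
    """Stochastic gradient descent on squared error."""
    weights = [0.0] * N_FEATURES
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            y = predict(weights, bias, features)
            grad = (y - label) * y * (1 - y)      # d(error)/d(z)
            weights = [w - lr * grad * f for w, f in zip(weights, features)]
            bias -= lr * grad
    return weights, bias

# Synthetic reference set: 50 relevant / 50 irrelevant entries.
examples = []
for _ in range(50):
    relevant = [random.uniform(0.5, 1.0) for _ in range(N_FEATURES)]
    irrelevant = [random.uniform(0.0, 0.5) for _ in range(N_FEATURES)]
    examples += [(relevant, 1.0), (irrelevant, 0.0)]

w, b = train(examples)
print(predict(w, b, [0.9] * N_FEATURES) > predict(w, b, [0.1] * N_FEATURES))
```

At ranking time, each database hit's feature vector would be scored this way and the hits sorted by descending score.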


2021 ◽  
Vol 13 (2) ◽  
pp. 31
Author(s):  
Cristòfol Rovira ◽  
Lluís Codina ◽  
Carlos Lopezosa

The visibility of academic articles or conference papers depends on their being easily found in academic search engines, above all in Google Scholar. To enhance this visibility, search engine optimization (SEO) has in recent years been applied to academic search engines in order to optimize documents and thereby ensure they are better ranked in results pages (i.e., academic search engine optimization, or ASEO). To achieve this degree of optimization, we first need to further our understanding of Google Scholar’s relevance ranking algorithm, so that, based on this knowledge, we can highlight or improve those characteristics that academic documents already present and which are taken into account by the algorithm. This study seeks to advance our knowledge in this line of research by determining whether the language in which a document is published is a positioning factor in the Google Scholar relevance ranking algorithm. Here, we employ a reverse engineering research methodology based on a statistical analysis that uses Spearman’s correlation coefficient. The results obtained point to a bias in multilingual searches conducted in Google Scholar, with documents published in languages other than English being systematically relegated to positions that make them virtually invisible. This finding has important repercussions, both for conducting searches and for optimizing positioning in Google Scholar, and is especially critical for articles on subjects that are expressed in the same way in English and other languages, as is the case, for example, of trademarks, chemical compounds, industrial products, acronyms, drugs, diseases, etc.
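The statistical core of a study like this is Spearman's correlation between the positions a set of documents receives in two result lists (e.g., the same query issued in English and in another language). A minimal sketch with invented rankings:

```python
# Spearman's rho for two rankings of the same n items, assuming no
# ties (the standard closed-form formula). The positions below are
# invented, not data from the study.

def spearman(rank_a, rank_b):
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Positions of five documents in two result pages (1 = top).
english_positions = [1, 2, 3, 4, 5]
other_positions   = [2, 1, 4, 3, 5]
print(round(spearman(english_positions, other_positions), 3))  # → 0.8
```

A rho near 1 indicates the two rankings agree; systematically low rho for non-English documents would be the kind of signal the study interprets as language bias.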


2010 ◽  
Author(s):  
Matthias Lange ◽  
Karl Spies ◽  
Christian Colmsee ◽  
Steffen Flemming ◽  
Matthias Klapperstück ◽  
...  

2003 ◽  
Vol 62 (2) ◽  
pp. 121-129 ◽  
Author(s):  
Astrid Schütz ◽  
Franz Machilek

Research on personal home pages is still rare. Many studies to date are exploratory, and the problem of drawing a sample that reflects the variety of existing home pages has not yet been solved. The present paper discusses sampling strategies and suggests a strategy based on the results retrieved by a search engine. This approach is used to draw a sample of 229 personal home pages that portray private identities. Findings on age and sex of the owners and elements characterizing the sites are reported.


Nature ◽  
2018 ◽  
Author(s):  
Richard Van Noorden

Infoman's ◽  
2018 ◽  
Vol 12 (2) ◽  
pp. 115-124
Author(s):  
Yopi Hidayatul Akbar ◽  
Muhammad Agreindra Helmiawan

Social media is one of the information media currently widely used, both by companies and by individuals, to convey information. With the presence of social media, companies no longer need to spread offers through print media; they can use information technology tools, in this case social media, to deliver offers for the products they sell to users globally. This social media marketing technique is the process of attracting visits by internet users to particular sites, or attracting public attention, through social media sites. Marketing activities using social media are usually centered on a company's efforts to create content that attracts attention, thus encouraging readers to share the content through their social media networks. The application of the QMS method is certainly not only submitted through search engine webmaster tools; keywords relating to the contents of the website must also be applied on the website itself, because with these keywords the site will automatically attract visitors to the university website based on the keyword phrases they type into the search engine. Social Media Marketing (SMM) is one of the techniques that must be applied in conducting sales promotions, especially for car dealers in Bandung; it is considered important because each product requires the socialization of its price, features, and convenience through social media so that sales traffic can increase. Each dealer should be able to apply Social Media Marketing (SMM) techniques well so that car sales can reach the expected target and provide profits for the salespeople selling cars in the field.

