Search Integration with WebSphere Portal

2010 ◽  
Vol 2 (3) ◽  
pp. 1-18
Author(s):  
Andreas Prokoph

Modern web applications and servers such as Portal require adequate support for integrating search services, driven by user-focused information delivery and interaction as well as by the new technologies used to render such information. This is exemplified by two fundamental problems that have long plagued web crawlers: dynamic content and JavaScript-generated content. Today, the common solution is simple: ignore such web pages. To enable “search” in Portals, a different “crawling” paradigm is required for search engines to gather and consume information. WebSphere Portal provides a framework that propagates content and information through “Seedlists” (comparable to HTML-based sitemaps but richer in features). This mandates that information- and content-delivering applications be “search engine aware”, requiring them to provide services and seedlists for fast, efficient, and complete delivery of content and information. This is the main integration point for search engines into the portal, and it is what the Portal site search services build on for a rich and user-focused search experience. This article discusses how such technologies allow more efficient crawling of public Portal sites by prominent Internet search engines, as well as myths surrounding search engine optimization.
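
The Seedlist format itself is not reproduced in the abstract, but the following minimal Python sketch illustrates the idea of such a content feed: a sitemap-like XML document enriched with per-item metadata. All element and field names here are hypothetical illustrations, not the actual WebSphere Portal Seedlist schema.

```python
# Minimal sketch: publishing a seedlist-style content feed.
# Element and attribute names are hypothetical illustrations,
# not the actual WebSphere Portal Seedlist schema.
import xml.etree.ElementTree as ET

def build_seedlist(entries):
    """Build a sitemap-like XML feed enriched with per-item metadata."""
    root = ET.Element("seedlist")
    for entry in entries:
        item = ET.SubElement(root, "item", {"action": entry.get("action", "add")})
        ET.SubElement(item, "url").text = entry["url"]
        ET.SubElement(item, "lastmodified").text = entry["lastmodified"]
        # Unlike plain HTML sitemaps, a seedlist can carry content metadata
        # (title, author, fields) so crawlers need not render the page itself.
        for name, value in entry.get("fields", {}).items():
            ET.SubElement(item, "field", {"name": name}).text = value
    return ET.tostring(root, encoding="unicode")

print(build_seedlist([{
    "url": "https://portal.example.com/wps/myportal/news/42",
    "lastmodified": "2010-06-01T12:00:00Z",
    "fields": {"title": "Quarterly results", "author": "jdoe"},
}]))
```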


Author(s):  
Oğuzhan Menemencioğlu ◽  
İlhami Muharrem Orak

The semantic web aims to produce machine-readable data and to deal with large amounts of data. The most important tool for accessing data on the web is the search engine. Traditional search engines are insufficient in the face of the amount of data contained in existing web pages. Semantic search engines are extensions of traditional engines and overcome the difficulties those engines face. This paper summarizes the semantic web, the concepts of traditional and semantic search engines, and their infrastructure, and details semantic search approaches. A summary of the literature is provided, touching on the trends; in this respect, the types of applications and the areas they address are considered. Based on data for two different years, the trends on these points are analyzed and the impacts of the changes are discussed. This shows that the semantic web continues to evolve and that new applications and areas are emerging. Multimedia retrieval is a new scope for semantics, so multimedia retrieval approaches are discussed; both text and multimedia retrieval are analyzed within semantic search.


2019 ◽  
Vol 16 (9) ◽  
pp. 3712-3716
Author(s):  
Kailash Kumar ◽  
Abdulaziz Al-Besher

This paper examines the overlap of results retrieved by three major search engines, namely Google, Yahoo, and Bing. A rigorous analysis of overlap among these search engines was conducted on 100 random queries. The first ten web page results, i.e., a hundred results from each search engine, and only non-sponsored results were taken into consideration. Search engines have their own update frequencies and rank results by their own notions of relevance; moreover, sponsored search advertisers differ between search engines, and no single search engine can index all web pages. The overlap analysis was carried out between October 1, 2018 and October 31, 2018 on these major search engines, namely Google, Yahoo, and Bing. A framework was built in Java to analyze the overlap among the search engines. This framework eliminates the common results and merges them into a unified list; it also uses a ranking algorithm to re-rank the search engine results and display them to the user.
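
As an illustration of the overlap and merge steps, here is a minimal Python sketch. The paper's framework is built in Java and its re-ranking algorithm is not detailed above, so Borda-count aggregation is used as one plausible stand-in, and the result lists are hypothetical.

```python
# Minimal sketch of overlap analysis and merged re-ranking for one query.
# Borda-count aggregation is an assumed substitute for the paper's
# unspecified ranking algorithm.
from collections import defaultdict

def overlap(a, b):
    """Jaccard overlap between two top-10 result URL lists."""
    return len(set(a) & set(b)) / len(set(a) | set(b))

def merge_and_rerank(result_lists, k=10):
    """Merge per-engine top-k lists, dropping duplicates, via Borda count."""
    scores = defaultdict(int)
    for results in result_lists:
        for rank, url in enumerate(results[:k]):
            scores[url] += k - rank  # higher placement earns more points
    return sorted(scores, key=scores.get, reverse=True)

google = ["u1", "u2", "u3"]; yahoo = ["u2", "u4", "u1"]; bing = ["u5", "u2"]
print(overlap(google, yahoo))                   # pairwise overlap: 0.5
print(merge_and_rerank([google, yahoo, bing]))  # unified, re-ranked list
```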


2021 ◽  
Author(s):  
Srihari Vemuru ◽  
Eric John ◽  
Shrisha Rao

Humans can easily parse and find answers to complex queries such as "What was the capital of the country of the discoverer of the element which has atomic number 1?" by breaking them up into small pieces, querying these appropriately, and assembling a final answer. However, contemporary search engines lack this capability and fail to handle even slightly complex queries. Search engines process queries by identifying keywords and searching for them in knowledge bases or indexed web pages. The results therefore depend on the keywords and on how well the search engine handles them. In our work, we propose a three-step approach called parsing, tree generation, and querying (PTGQ) for effective searching of larger and more expressive queries of potentially unbounded complexity. PTGQ parses a complex query and constructs a query tree in which each node represents a simple query. It then processes the complex query by recursively querying a back-end search engine, traversing the corresponding query tree in postorder. PTGQ thus ensures that the back-end search engine always handles a simple query containing very few keywords. Results demonstrate that PTGQ can handle queries of much higher complexity than standalone search engines.
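
A minimal sketch of the PTGQ idea follows: each node of the query tree holds a simple query template, and the tree is resolved in postorder, feeding children's answers into their parent's query. The `ask_backend` stub and its canned answers are hypothetical stand-ins for a real back-end search engine call.

```python
# Minimal sketch of the PTGQ idea: a query tree whose nodes hold simple
# queries with placeholders, resolved in postorder. `ask_backend` is a
# hypothetical stub standing in for a real search-engine call.
from dataclasses import dataclass, field

@dataclass
class QueryNode:
    template: str                       # simple query; {} marks a child's answer
    children: list["QueryNode"] = field(default_factory=list)

def ask_backend(simple_query: str) -> str:
    # Placeholder: a real system would query a search engine or
    # knowledge base and extract a short factual answer.
    canned = {
        "element with atomic number 1": "hydrogen",
        "discoverer of hydrogen": "Henry Cavendish",
        "country of Henry Cavendish": "England",
        "capital of England": "London",
    }
    return canned[simple_query]

def resolve(node: QueryNode) -> str:
    answers = [resolve(child) for child in node.children]  # postorder
    return ask_backend(node.template.format(*answers))

tree = QueryNode("capital of {}", [
    QueryNode("country of {}", [
        QueryNode("discoverer of {}", [
            QueryNode("element with atomic number 1")])])])
print(resolve(tree))  # -> London
```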


2017 ◽  
Author(s):  
Xi Zhu ◽  
Xiangmiao Qiu ◽  
Dingwang Wu ◽  
Shidong Chen ◽  
Jiwen Xiong ◽  
...  

BACKGROUND Electronic health practices such as apps and software all involve web search engines because of their convenience for retrieving information, and the success of electronic health is linked to the success of web search engines in the health field. Yet the reliability of information in search engine results remains to be evaluated; a detailed analysis can reveal shortcomings and provide inspiration. OBJECTIVE To assess the reliability of information related to women with epilepsy in the results of the main search engines in China. METHODS Six physicians conducted the searches every week. The search keywords were one of the antiepileptic drugs (valproic acid/oxcarbazepine/levetiracetam/lamotrigine) plus "huaiyun" or "renshen", both of which mean pregnancy in Chinese. The searches were conducted on different devices (computer/cellphone) and different engines (Baidu/Sogou/360). The top ten results of every search result page were included. Two physicians classified every result into nine categories according to its content and also evaluated its reliability. RESULTS A total of 16,411 search results were included. 85.1% of web pages carried advertisements, and 55% were categorized as questions and answers according to their contents. Only 9% of the search results were reliable, 50.7% were partly reliable, and 40.3% were unreliable. The higher a result ranked, the more advertisements appeared and the greater the proportion of unreliable results. All content from hospital websites was unreliable, while all content from academic publishing was reliable. CONCLUSIONS Several first principles must be emphasized to further the use of web search engines in healthcare. First, identifying registered physicians and developing an efficient system to guide patients to physicians would guarantee the quality of the information provided. Second, the relevant authorities should restrict excessive advertisement sales in the healthcare area through specific regulations to avoid negative impacts on patients. Third, information from hospital websites should be judged carefully before being embraced wholeheartedly.


2018 ◽  
Vol 7 (3) ◽  
pp. 1119
Author(s):  
Jyoti Mor ◽  
Dr Dinesh Rai ◽  
Dr Naresh Kumar

In a large collection of web pages, it is difficult for search engines to keep their online repository updated. Major search engines have hundreds of web crawlers that crawl the WWW day and night and send the downloaded web pages over a network to be stored in the search engine's database. This results in over-utilization of shared network resources such as bandwidth and CPU cycles. This paper proposes an architecture that tries to reduce the utilization of shared network resources with the help of an advanced XML-based approach. This focused-crawling-based architecture is trained to download only high-quality data from the internet, leaving behind web pages that are not relevant to the desired domain. A detailed layout of the proposed system is described, which is capable of reducing the load on the network and the problems arising from the residency of mobile agents at the remote server.
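
To make the idea concrete, here is a minimal sketch of a focused crawler's relevance gate; the keyword weights and threshold are hypothetical stand-ins for the trained, XML-based relevance model the paper describes.

```python
# Minimal sketch of a focused crawler's relevance gate: a candidate page is
# kept (and its links expanded) only when its text scores above a threshold
# for the target domain, pruning off-topic branches of the web. The keyword
# weights are an assumed stand-in for the paper's trained relevance model.
import urllib.request
from html.parser import HTMLParser

DOMAIN_TERMS = {"crawler": 2.0, "search": 1.5, "index": 1.0}  # assumed weights
THRESHOLD = 3.0                                               # assumed cutoff

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.words = []
    def handle_data(self, data):
        self.words.extend(data.lower().split())

def relevance(html: str) -> float:
    parser = TextExtractor()
    parser.feed(html)
    return sum(DOMAIN_TERMS.get(w, 0.0) for w in parser.words)

def fetch_if_relevant(url: str) -> str | None:
    html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
    # Store the page and queue its links only if it looks on-topic.
    return html if relevance(html) >= THRESHOLD else None
```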


2002 ◽  
Vol 63 (4) ◽  
pp. 354-365 ◽  
Author(s):  
Susan Augustine ◽  
Courtney Greene

Have Internet search engines influenced the way students search library Web pages? The results of this usability study reveal that students consistently and frequently use the library Web site's internal search engine to find information rather than navigating through pages. If students are searching rather than navigating, library Web page designers must make metadata and powerful search engines priorities. The study also shows that students have difficulty interpreting library terminology, are confused by the differences among library resources, and prefer to seek human assistance when encountering problems online. These findings imply that library Web sites have not alleviated some of the basic and long-standing problems that have challenged librarians in the past.


2013 ◽  
Vol 303-306 ◽  
pp. 2311-2316
Author(s):  
Hong Shen Liu ◽  
Peng Fei Wang

The structure and content of research search engines are presented; their core technology is the analysis of web pages. The characteristics of analyzing web pages within a single website are studied: the relations between the web pages a crawler obtained at two different times can be established, and the information that changed between them is easily found. A new method of analyzing web pages within one website is introduced, which analyzes pages using this changed information. The results of applying the method show that it is effective for the analysis of web pages.
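
A minimal sketch of the underlying comparison, assuming two crawl snapshots of one website keyed by URL, might look as follows; the snapshot dictionaries and the hashing choice are illustrative, not the paper's actual method.

```python
# Minimal sketch: compare two crawl snapshots of one website (URL -> HTML)
# to find the pages added, removed, or changed between the two crawl times.
# The snapshot dictionaries are hypothetical inputs.
import hashlib

def digest(html: str) -> str:
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

def diff_crawls(old: dict[str, str], new: dict[str, str]):
    added = set(new) - set(old)
    removed = set(old) - set(new)
    changed = {url for url in set(old) & set(new)
               if digest(old[url]) != digest(new[url])}
    return added, removed, changed

old = {"/a": "<p>v1</p>", "/b": "<p>same</p>"}
new = {"/a": "<p>v2</p>", "/b": "<p>same</p>", "/c": "<p>new</p>"}
print(diff_crawls(old, new))  # ({'/c'}, set(), {'/a'})
```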


2019 ◽  
Author(s):  
Erin Michelle Buchanan ◽  
Sarah E Crain ◽  
Ari L. Cunningham ◽  
Hannah Rose Johnson ◽  
Hannah Elyse Stash ◽  
...  

As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand its contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a dataset. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search engine indexing to reach a broader audience of interested parties. This tutorial first explains the terminology and standards surrounding data dictionaries and codebooks. We then present a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared dataset accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we explain how to use freely available web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable (FAIR; Wilkinson et al., 2016).
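
As a small illustration of what such metadata can look like, the following sketch derives a skeleton data dictionary from a tabular dataset with pandas; the tutorial's own freely available web applications do this interactively, and the column names here are hypothetical.

```python
# Minimal sketch: derive a skeleton data dictionary from a tabular dataset.
# pandas summarizes each variable; descriptions are left for the data's
# authors to fill in. The example survey columns are hypothetical.
import pandas as pd

def make_codebook(df: pd.DataFrame) -> pd.DataFrame:
    rows = []
    for col in df.columns:
        rows.append({
            "variable": col,
            "type": str(df[col].dtype),
            "n_missing": int(df[col].isna().sum()),
            "example": df[col].dropna().iloc[0] if df[col].notna().any() else None,
            "description": "",  # to be written by the data's authors
        })
    return pd.DataFrame(rows)

survey = pd.DataFrame({"age": [25, 31, None], "condition": ["A", "B", "A"]})
print(make_codebook(survey))
```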


Author(s):  
Le Khanh Trinh ◽  
Vo Dinh Hieu ◽  
Pham Ngoc Hung

Automated user interaction testing of Web applications has received great attention from the research community and industry. Several available tools partly deal with the problem; however, how to perform automated user interaction testing of whole Web applications effectively is still an open problem. This research proposes a method, and develops a supporting tool, for automated user interaction testing of whole Web applications. In this method, the model of each Web page of the Web application under test, which describes the user interaction (UI), is represented by a finite state automaton. The model describing the behavior of the whole Web application is then constructed by composing the models of all Web pages. After that, test paths are generated automatically from the compositional model so that they cover all possible user interactions of the application. A tool supporting the proposed method has been developed and applied to some simple Web applications. The experimental results show the potential of this tool for automated user interaction testing of Web applications in practice.
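
A minimal sketch of this modeling approach, assuming simple dictionary-encoded automata and hypothetical page and action names, might look as follows.

```python
# Minimal sketch: pages as finite state automata (state -> action -> state),
# composed into one whole-application model, with test paths enumerated by
# depth-first search until every transition is covered. Page and action
# names are hypothetical.
def compose(pages: list[dict]) -> dict:
    """Union the per-page automata into one transition map."""
    model = {}
    for page in pages:
        for state, actions in page.items():
            model.setdefault(state, {}).update(actions)
    return model

def test_paths(model: dict, start: str) -> list[list[str]]:
    """DFS from `start`, emitting a path each time a new transition is taken."""
    paths, seen = [], set()

    def dfs(state, path):
        for action, nxt in model.get(state, {}).items():
            edge = (state, action, nxt)
            if edge in seen:
                continue
            seen.add(edge)
            new_path = path + [f"{state} --{action}--> {nxt}"]
            paths.append(new_path)
            dfs(nxt, new_path)

    dfs(start, [])
    return paths

login = {"login": {"submit": "home", "reset": "login"}}
home = {"home": {"open_profile": "profile", "logout": "login"}}
profile = {"profile": {"back": "home"}}
for p in test_paths(compose([login, home, profile]), "login"):
    print(" / ".join(p))
```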

