Between Web search engines and artificial intelligence: what side is shown in laboratory tests?

Diagnosis ◽  
2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Davide Negrini ◽  
Andrea Padoan ◽  
Mario Plebani

Abstract
Background: The number of websites providing laboratory test information is increasing fast, although the accuracy of the reported resources is sometimes questionable. The aim of this study was to assess the quality of information retrievable online through the Google search engine. Methods: Using urinalysis, cholesterol and prostate-specific antigen (PSA) as keywords, the Google search engine was queried. Using Google Trends, users' search trends (interest over time) were evaluated over a 5-year period. The first three or 10 retrieved hits were analysed blindly by two reviewers and classified according to the type of owner or publisher and the quality of the reported Web content. Results: The interest over time increased steadily for all three tests considered. Most of the Web content owners were editorial and/or publishing groups (mean percentage 35.5% and 30.0% for the first three and 10 hits, respectively). Public and health agencies and scientific societies were less represented. Among the first three and 10 hits, cited sources were found in 26.0% to 46.7% of Web page results, whilst for cholesterol, 60% of the retrieved Web contents reported only authors' signatures. Conclusions: Our findings confirm those obtained in other studies in the literature, demonstrating that online Web searches can lead patients to inadequately written or reviewed health information.
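Interest-over-time data of the kind described in this abstract can be pulled programmatically. The sketch below uses the unofficial pytrends library to retrieve five years of Google Trends data for the three test keywords; the library choice and parameters are our own illustrative assumptions, not tooling named by the authors.

```python
# Minimal sketch using the unofficial pytrends library (pip install pytrends).
# The keyword list mirrors the study; everything else is illustrative.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=0)
pytrends.build_payload(
    kw_list=["urinalysis", "cholesterol", "prostate-specific antigen"],
    timeframe="today 5-y",   # 5-year window, as in the study
)
interest = pytrends.interest_over_time()  # pandas DataFrame, one column per keyword
print(interest.tail())
```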

2016 ◽  
Vol 7 (2) ◽  
pp. 39-48
Author(s):  
Shama Rani ◽  
Jaiteg Singh

Storing information in a database is one of the major tasks, and efficient storage of data is important for future use. Information retrieval is a method of gathering information related to input queries from various sources or stored databases, and a search engine plays an important role in this retrieval. A web search engine creates an index to match queries, which improves the quality of the retrieved information. To retrieve information, a search engine comprises several modules, such as a query processor, a searching and matching function, a document processor and a page-ranking capability. This paper focuses on retrieving web documents for input queries and storing them in a database. A Google search API can be used to fetch the results; the system analyses the data by processing it through these modules and downloads the content, which is available in different formats.
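As a rough illustration of the fetch-and-store step described above, the sketch below queries the Google Custom Search JSON API and saves the returned hits to a local SQLite database. The API key, search engine ID, table layout and example query are placeholders of our own, not details taken from the paper.

```python
# Fetch search results via the Google Custom Search JSON API and store them locally.
# API_KEY and CX are placeholder credentials.
import sqlite3
import requests

API_KEY = "YOUR_API_KEY"
CX = "YOUR_ENGINE_ID"

def fetch_results(query, num=10):
    """Fetch up to `num` results for a query from the Custom Search JSON API."""
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": API_KEY, "cx": CX, "q": query, "num": num},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("items", [])

def store_results(db_path, query, items):
    """Persist title, link and snippet of each hit for later processing."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS results (query TEXT, title TEXT, link TEXT, snippet TEXT)"
    )
    con.executemany(
        "INSERT INTO results VALUES (?, ?, ?, ?)",
        [(query, i.get("title"), i.get("link"), i.get("snippet")) for i in items],
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    hits = fetch_results("web search engine architecture")
    store_results("results.db", "web search engine architecture", hits)
```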


2013 ◽  
pp. 1325-1345
Author(s):  
Andrew Boulton ◽  
Lomme Devriendt ◽  
Stanley D. Brunn ◽  
Ben Derudder ◽  
Frank Witlox

Geographers and social scientists have long been interested in ranking and classifying the cities of the world. The cutting edge of this research is characterized by a recognition of the crucial importance of information and, specifically, ICTs to cities’ positions in the current Knowledge Economy. This chapter builds on recent “cyberspace” analyses of the global urban system by arguing for, and demonstrating empirically, the value of Web search engine data as a means of understanding cities as situated within, and constituted by, flows of digital information. To this end, the authors show how the Google search engine can be used to specify a dynamic, informational classification of North American cities based on both the production and the consumption of Web information about two prominent current issues global in scope: the global financial crisis, and global climate change.


Author(s):  
Aboubakr Aqle ◽  
Dena Al-Thani ◽  
Ali Jaoua

Abstract
There are few studies addressing the challenges that visually impaired (VI) users face when viewing search results on a search engine interface using a screen reader. This study investigates the effect of providing VI users with an overview of search results. We present a novel interactive search engine interface called InteractSE that supports VI users during the results-exploration stage in order to improve their interactive experience and web search efficiency. An overview of the search results is generated with an unsupervised machine learning approach, and the discovered concepts are presented via formal concept analysis, which is domain-independent. These concepts are arranged in a multi-level tree in hierarchical order, covering all retrieved documents that share maximal features. The InteractSE interface was evaluated by 16 legally blind users and compared with the Google search engine interface on complex search tasks. The evaluation was based on both quantitative (task completion time) and qualitative (participants' feedback) measures. The results are promising and indicate that InteractSE enhances search efficiency and consequently improves the user experience. Our observations and analysis of the user interactions and feedback yielded design suggestions for supporting VI users when exploring and interacting with search results.
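To make the concept-hierarchy idea concrete, here is a minimal, generic sketch of formal concept analysis over a binary document-feature table. The data and code are purely illustrative and are not the InteractSE implementation.

```python
# Minimal formal concept analysis over a toy document-feature context.
# Illustrative data only; not the InteractSE implementation.
DOCS = {
    "d1": {"python", "search", "ranking"},
    "d2": {"python", "search"},
    "d3": {"search", "ranking", "evaluation"},
    "d4": {"evaluation"},
}
ALL_FEATURES = set().union(*DOCS.values())

def extent(intent):
    """Documents whose feature sets contain every feature in the intent."""
    return {d for d, feats in DOCS.items() if intent <= feats}

def concepts():
    """Enumerate all concept intents as intersections of document intents."""
    intents = {frozenset(ALL_FEATURES)}            # most specific intent
    changed = True
    while changed:
        changed = False
        for feats in DOCS.values():
            for i in list(intents):
                new = frozenset(i & feats)
                if new not in intents:
                    intents.add(new)
                    changed = True
    # Order from general (small intent) to specific, loosely mirroring a tree.
    return sorted(((set(i), extent(set(i))) for i in intents),
                  key=lambda c: len(c[0]))

for intent, ext in concepts():
    print(sorted(intent), "->", sorted(ext))
```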


2016 ◽  
Vol 6 (2) ◽  
pp. 41-65 ◽  
Author(s):  
Sheetal A. Takale ◽  
Prakash J. Kulkarni ◽  
Sahil K. Shah

Information available on the internet is huge, diverse and dynamic. Current search engines perform the task of intelligently assisting internet users: for a query, they provide a listing of the best-matching or most relevant web pages. However, the information answering a query is often spread across multiple pages returned by the search engine, which degrades the quality of search results. In effect, search engines are drowning in information but starving for knowledge. Here, we present a query-focused extractive summarization of search engine results. We propose a two-level summarization process: identification of relevant theme clusters, and selection of top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved by applying the MDL principle, and sentence clustering and ranking are done using SNMF. Experiments demonstrate the effectiveness of the system in semantic text understanding, document clustering and summarization.
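For readers unfamiliar with the factorization step, the sketch below shows one common way such sentence clustering and ranking can be done, assuming SNMF here denotes symmetric non-negative matrix factorization of a sentence-similarity matrix with multiplicative updates. It is a generic illustration under that assumption, not the authors' implementation.

```python
# Generic symmetric NMF of a sentence-similarity matrix A ≈ H @ H.T,
# using a damped multiplicative update; illustrative only.
import numpy as np

def symmetric_nmf(A, k, iters=200, eps=1e-9, seed=0):
    """Factorize a symmetric non-negative similarity matrix into H (n x k)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))
    for _ in range(iters):
        numer = A @ H
        denom = H @ (H.T @ H) + eps
        H *= 0.5 * (1.0 + numer / denom)   # damped multiplicative update
    return H

# Toy similarity matrix for five sentences (symmetric, non-negative).
A = np.array([
    [1.0, 0.8, 0.1, 0.0, 0.2],
    [0.8, 1.0, 0.2, 0.1, 0.1],
    [0.1, 0.2, 1.0, 0.7, 0.6],
    [0.0, 0.1, 0.7, 1.0, 0.5],
    [0.2, 0.1, 0.6, 0.5, 1.0],
])
H = symmetric_nmf(A, k=2)
clusters = H.argmax(axis=1)   # cluster membership per sentence
ranks = H.max(axis=1)         # within-cluster ranking score
print(clusters, ranks.round(3))
```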


Author(s):  
Andon Hestiantoro ◽  
Intan Kusumaningtyas

Objective: To assess the quality of websites providing information on infertility and its management in Bahasa. Methods: Differences between website types and affiliations were assessed for credibility, accuracy and ease of navigation using predefined criteria. We used the Google search engine with the keyword "infertilitas" and assessed 50 websites in Bahasa relating to infertility. Results: The content credibility of most sites was adequate, with 68% of sites scoring 60 to 80. Content accuracy for most sites was above 60: 24% (12 sites) scored 60 to 80 and 44% (22 sites) scored above 80. For ease of navigation, 47 sites (94%) scored above 60. Conclusion: The quality of internet-based infertility information in Bahasa is adequate in terms of credibility, accuracy and ease of navigation. [Indones J Obstet Gynecol 2018; 6-1: 28-33] Keywords: bahasa, infertility, information, internet, quality


Data ◽  
2019 ◽  
Vol 4 (3) ◽  
pp. 125 ◽  
Author(s):  
Artur Strzelecki

This data descriptor describes Google search engine visibility data. The visibility of a domain name in a search engine results from search engine optimization and can be evaluated with four data metrics and five data dimensions. The data metrics are: clicks volume (1), impressions volume (2), click-through ratio (3), and ranking position (4). The data dimensions are: queries entered into the search engine that trigger results containing the researched domain name (1), page URLs from the researched domain that appear on the search engine results page (2), country of origin of search engine visitors (3), type of device used for the search (4), and date of the search (5). The search engine visibility data were obtained from Google Search Console for an international online store that is visible in 240 countries and territories, covering a period of 15 months. The data contain 123 K clicks and 4.86 M impressions for web search, and 22 K clicks and 9.07 M impressions for image search. The proposed method for obtaining the data can be applied in any other area, not only the e-commerce industry.
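The same four metrics and five dimensions can be retrieved programmatically through the Search Analytics query of the Google Search Console API. The sketch below is a minimal example under that approach; the credentials file, site URL and date range are placeholders, not details from the data descriptor.

```python
# Minimal Search Analytics query against the Google Search Console API.
# Credentials, site URL and dates are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

body = {
    "startDate": "2019-01-01",
    "endDate": "2019-03-31",
    # The five dimensions described above: query, page, country, device, date.
    "dimensions": ["query", "page", "country", "device", "date"],
    "rowLimit": 25000,
}
resp = service.searchanalytics().query(siteUrl="sc-domain:example.com", body=body).execute()

for row in resp.get("rows", [])[:5]:
    # Each row carries the four metrics: clicks, impressions, ctr, position.
    print(row["keys"], row["clicks"], row["impressions"], row["ctr"], row["position"])
```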


2016 ◽  
Vol 11 (3) ◽  
pp. 108
Author(s):  
Simon Briscoe

A Review of: Eysenbach, G., Tuische, J. & Diepgen, T.L. (2001). Evaluation of the usefulness of Internet searches to identify unpublished clinical trials for systematic reviews. Medical Informatics and the Internet in Medicine, 26(3), 203-218. http://dx.doi.org/10.1080/14639230110075459 Objective – To consider whether web searching is a useful method for identifying unpublished studies for inclusion in systematic reviews. Design – Retrospective web searches using the AltaVista search engine were conducted to identify unpublished studies – specifically, clinical trials – for systematic reviews which did not use a web search engine. Setting – The Department of Clinical Social Medicine, University of Heidelberg, Germany. Subjects – n/a Methods – Pilot testing of 11 web search engines was carried out to determine which could handle complex search queries. Pre-specified search requirements included the ability to handle Boolean and proximity operators, and truncation searching. A total of seven Cochrane systematic reviews were randomly selected from the Cochrane Library Issue 2, 1998, and their bibliographic database search strategies were adapted for the web search engine, AltaVista. Each adaptation combined search terms for the intervention, problem, and study type in the systematic review. Hints to planned, ongoing, or unpublished studies retrieved by the search engine, which were not cited in the systematic reviews, were followed up by visiting websites and contacting authors for further details when required. The authors of the systematic reviews were then contacted and asked to comment on the potential relevance of the identified studies. Main Results – Hints to 14 unpublished and potentially relevant studies, corresponding to 4 of the 7 randomly selected Cochrane systematic reviews, were identified. Out of the 14 studies, 2 were considered irrelevant to the corresponding systematic review by the systematic review authors. The relevance of a further three studies could not be clearly ascertained. This left nine studies which were considered relevant to a systematic review. In addition to this main finding, the pilot study to identify suitable search engines found that AltaVista was the only search engine able to handle the complex searches required to search for unpublished studies. Conclusion – Web searches using a search engine have the potential to identify studies for systematic reviews. Web search engines have considerable limitations which impede the identification of studies.


2019 ◽  
Vol 44 (2) ◽  
pp. 365-381 ◽  
Author(s):  
Malte Bonart ◽  
Anastasiia Samokhina ◽  
Gernot Heisenberg ◽  
Philipp Schaer

Purpose Survey-based studies suggest that search engines are trusted more than social media or even traditional news, although cases of false information or defamation are known. The purpose of this paper is to analyze the query suggestion features of three search engines to see whether these features introduce bias into the query and search process that might compromise this trust. The authors test the approach on person-related search suggestions by querying the names of politicians from the German Bundestag before the German federal election of 2017. Design/methodology/approach This study introduces a framework to systematically examine and automatically analyze the variety of query suggestions for person names offered by major search engines. To test the framework, the authors collected data from the Google, Bing and DuckDuckGo query suggestion APIs over a period of four months for 629 names of German politicians. The suggestions were clustered and statistically analyzed with regard to different biases, such as gender, party or age, and with regard to the stability of the suggestions over time. Findings Using the framework, the authors located three semantic clusters within the data set: suggestions related to politics and economics, location information, and personal and other miscellaneous topics. Among other effects, the results of the analysis show a small bias in that male politicians receive slightly fewer suggestions on "personal and misc" topics. The stability analysis of the suggested terms over time shows that some suggestions are prevalent most of the time, while others fluctuate more often. Originality/value This study proposes a novel framework to automatically identify biases in web search engine query suggestions for person-related searches. Applying this framework to a set of person-related query suggestions offers first insights into the influence search engines can have on the query process of users who seek information on politicians.
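As a rough sketch of the kind of data collection described above, the snippet below pulls autocomplete suggestions for a person name from publicly reachable Google and DuckDuckGo endpoints. These endpoints are unofficial and undocumented, may change without notice, and are not necessarily the APIs used by the authors; Bing's Autosuggest API requires a subscription key and is omitted here.

```python
# Collect query suggestions for a person name from public autocomplete endpoints.
# Unofficial endpoints, shown only to illustrate the data-collection idea.
import requests

def google_suggestions(name):
    resp = requests.get(
        "https://suggestqueries.google.com/complete/search",
        params={"client": "firefox", "q": name},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()[1]          # response is [query, [suggestion, ...], ...]

def duckduckgo_suggestions(name):
    resp = requests.get("https://duckduckgo.com/ac/", params={"q": name}, timeout=10)
    resp.raise_for_status()
    return [item["phrase"] for item in resp.json()]

if __name__ == "__main__":
    for politician in ["Angela Merkel", "Christian Lindner"]:
        print(politician, google_suggestions(politician)[:5])
```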


2019 ◽  
Author(s):  
Jingchun Fan ◽  
Jean Craig ◽  
Na Zhao ◽  
Fujian Song

BACKGROUND Increasingly, people seek health information from the Internet, in particular information on diseases that require intensive self-management, such as diabetes. However, the Internet is largely unregulated, and the quality of online health information may not be credible. OBJECTIVE To assess the quality of online information on diabetes identified from the Internet. METHODS We used the single term "diabetes", or the equivalent Chinese characters, to search Google and Baidu respectively. The first 50 websites retrieved from each of the two search engines were screened for eligibility using pre-determined inclusion and exclusion criteria. Included websites were assessed on four domains: accessibility, content coverage, validity and readability. RESULTS We included 26 websites from the Google search engine and 34 from the Baidu search engine. There were significant differences in website provider (P<0.0001), but not in targeted population (P=0.832) or publication type (P=0.378), between the two search engines. Website accessibility did not differ significantly between the two search engines, although there were significant differences in items regarding website content coverage. There was no statistically significant difference in website validity between the Google and Baidu search engines (mean DISCERN score 3.3 vs 2.9, P=0.156). Readability appraisal of the English-language websites showed that Flesch Reading Ease scores ranged from 23.1 to 73.0 and Flesch-Kincaid Grade Level scores ranged from 5.7 to 19.6. CONCLUSIONS The content coverage of health information for patients with diabetes from the English-language search engine tended to be more comprehensive than that from the Chinese-language search engine. There was a lack of websites provided by health organisations in China. The quality of online health information for people with diabetes needs to be improved to bridge the knowledge gap between website provision and public demand.
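For reference, the two readability measures mentioned here are computed from simple word, sentence and syllable counts. The sketch below implements the standard formulas; the example counts are invented purely for illustration and do not come from the study.

```python
# Standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas.
# The counts used in the example are invented, purely for illustration.

def flesch_reading_ease(words, sentences, syllables):
    """Higher scores indicate easier text (roughly 0-100)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level needed to understand the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Example: a 300-word health page with 20 sentences and 450 syllables.
print(round(flesch_reading_ease(300, 20, 450), 1))   # ~64.7 (plain English)
print(round(flesch_kincaid_grade(300, 20, 450), 1))  # ~8.0 (about 8th grade)
```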


2006 ◽  
Vol 1 (3) ◽  
pp. 67
Author(s):  
David Hook

A review of: Jansen, Bernard J., and Amanda Spink. “How Are We Searching the World Wide Web? A Comparison of Nine Search Engine Transaction Logs.” Information Processing & Management 42.1 (2006): 248-263. Objective – To examine the interactions between users and search engines, and how they have changed over time. Design – Comparative analysis of search engine transaction logs. Setting – Nine major analyses of search engine transaction logs. Subjects – Nine web search engine studies (4 European, 5 American) over a seven-year period, covering the search engines Excite, Fireball, AltaVista, BWIE and AllTheWeb. Methods – The results from individual studies are compared by year of study for percentages of single query sessions, one-term queries, operator (and, or, not, etc.) usage and single result page viewing. As well, the authors group the search queries into eleven different topical categories and compare how the breakdown has changed over time. Main Results – Based on the percentage of single query sessions, it does not appear that the complexity of interactions has changed significantly for either the U.S.-based or the European-based search engines. As well, there was little change observed in the percentage of one-term queries over the years of study for either the U.S.-based or the European-based search engines. Few users (generally less than 20%) use Boolean or other operators in their queries, and these percentages have remained relatively stable. One area of noticeable change is in the percentage of users viewing only one results page, which has increased over the years of study. Based on the studies of the U.S.-based search engines, the topical categories of ‘People, Place or Things’ and ‘Commerce, Travel, Employment or Economy’ are becoming more popular, while the categories of ‘Sex and Pornography’ and ‘Entertainment or Recreation’ are declining. Conclusions – The percentage of users viewing only one results page increased during the years of the study, while the percentages of single query sessions, one-term sessions and operator usage remained stable. The increase in single result page viewing implies that users are tending to view fewer results per web query. There was also a significant difference in the percentage of queries using Boolean operators between the US-based and the European-based search engines. One of the study’s findings was that results from a study of a particular search engine cannot necessarily be applied to all search engines. Finally, web search topics show a trend towards information or commerce searching rather than entertainment.

