Getting Bulk Data Through Google: An Empirical Study

2016 ◽  
Vol 7 (2) ◽  
pp. 39-48
Author(s):  
Shama Rani ◽  
Jaiteg Singh

Storing information in a database is a major task, and efficient storage of data is important for future use. Information retrieval is the process of gathering information relevant to input queries from various sources or stored databases, and a search engine plays a central role in this process. A web search engine creates an index against which queries are matched, improving the quality of the information returned. To retrieve information, a search engine comprises several modules, such as a query processor, a searching and matching function, a document processor, and page-rank capability. This paper focuses on retrieving web documents for input queries and storing them in a database. A Google search API can be used to fetch the results; the system analyses the data by passing it through these modules and downloads the content available in different formats.
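The abstract does not name a specific API, so the sketch below assumes Google's Custom Search JSON API; the API key, search engine ID, and database schema are placeholders, and the code is only an illustration of fetching results for a query and storing them in a database.

```python
import sqlite3
import requests

# Placeholders: a real API key and Programmable Search Engine ID are required.
API_KEY = "YOUR_API_KEY"
CX = "YOUR_SEARCH_ENGINE_ID"

def fetch_results(query, api_key=API_KEY, cx=CX):
    """Query the Custom Search JSON API and return the result items."""
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": api_key, "cx": cx, "q": query},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("items", [])

def store_results(query, items, db_path="results.db"):
    """Persist title, URL and snippet of each hit for later retrieval."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS hits (query TEXT, title TEXT, url TEXT, snippet TEXT)"
    )
    con.executemany(
        "INSERT INTO hits VALUES (?, ?, ?, ?)",
        [(query, i.get("title"), i.get("link"), i.get("snippet")) for i in items],
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    q = "information retrieval"
    store_results(q, fetch_results(q))
```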

Diagnosis ◽  
2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Davide Negrini ◽  
Andrea Padoan ◽  
Mario Plebani

Background: The number of websites providing laboratory test information is increasing fast, although the accuracy of the reported resources is sometimes questionable. The aim of this study was to assess the quality of information retrievable online via the Google Search engine. Methods: Using urinalysis, cholesterol and prostate-specific antigen (PSA) as keywords, the Google Search engine was queried. Using Google Trends, users’ search trends (interest over time) were evaluated over a 5-year period. The first three or 10 retrieved hits were analysed blindly by two reviewers and classified according to the type of owner or publisher and to the quality of the reported Web content. Results: The interest over time increased constantly for all three considered tests. Most of the Web content owners were editorial and/or publishing groups (mean percentages of 35.5% and 30.0% for the first three and 10 hits, respectively). Public and health agencies and scientific societies were less represented. Among the first three and 10 hits, cited sources were found in 26.0% to 46.7% of Web page results, whilst for cholesterol, 60% of the retrieved Web contents reported only authors’ signatures. Conclusions: Our findings confirm those obtained in other studies in the literature, demonstrating that online Web searches can lead patients to inadequately written or reviewed health information.
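As an illustration of the interest-over-time step, the sketch below uses pytrends, an unofficial Google Trends client; the paper does not state which tool was used, so the library choice and exact keyword list are assumptions.

```python
# Retrieve 5-year "interest over time" curves for the three test keywords.
from pytrends.request import TrendReq

keywords = ["urinalysis", "cholesterol", "prostate-specific antigen"]

pytrends = TrendReq(hl="en-US", tz=0)
pytrends.build_payload(kw_list=keywords, timeframe="today 5-y")

# DataFrame indexed by week, one column of 0-100 relative interest per keyword.
interest = pytrends.interest_over_time()
print(interest.drop(columns=["isPartial"], errors="ignore").tail())
```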


2019 ◽  
Vol 3 (1) ◽  
pp. 17
Author(s):  
Saiful Bukhori

Search engines are used on the web as tools for information retrieval. The web server is a large warehouse of heterogeneous and unstructured data, so a search engine is needed to filter out the information relevant to people. Search engines usually consist of page repositories, indexing modules, query modules and ranking modules. A search engine does not work alone; a web browser supports its work and makes it more effective. A browser is software run on a user's computer that displays web documents or information taken from a web server [1]. The browser is the type of intermediary the user employs most often. This paper aims to analyse three search engines, namely Google, Yahoo, and Bing, based on their existing features. These features include web search, image search, video search, news search, route search, book search, changing search settings, displaying the number of views, shopping, and a language translator. Google stands as the best among these search engines; it works using the PageRank algorithm. PageRank is a numerical value that determines the importance of a web page by counting its backlinks.
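A minimal, self-contained sketch of the PageRank idea mentioned above, using a toy link graph and a conventional damping factor; it illustrates how a page's score grows with the number and importance of its backlinks and is not the production algorithm used by Google.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:          # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
print(pagerank(graph))   # "C" ranks highest: it has the most backlinks
```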


2013 ◽  
pp. 1325-1345
Author(s):  
Andrew Boulton ◽  
Lomme Devriendt ◽  
Stanley D. Brunn ◽  
Ben Derudder ◽  
Frank Witlox

Geographers and social scientists have long been interested in ranking and classifying the cities of the world. The cutting edge of this research is characterized by a recognition of the crucial importance of information and, specifically, ICTs to cities’ positions in the current Knowledge Economy. This chapter builds on recent “cyberspace” analyses of the global urban system by arguing for, and demonstrating empirically, the value of Web search engine data as a means of understanding cities as situated within, and constituted by, flows of digital information. To this end, the authors show how the Google search engine can be used to specify a dynamic, informational classification of North American cities based on both the production and the consumption of Web information about two prominent current issues global in scope: the global financial crisis, and global climate change.
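Purely as an illustration of the kind of informational classification described above, the sketch below blends a production proxy (web pages generated about a topic) and a consumption proxy (searches issued) into a single index per city; all city names and figures are invented for illustration and do not come from the chapter.

```python
# Hypothetical production/consumption counts per city (invented data).
cities = {
    # city: (pages_produced, searches_issued)
    "New York": (5200, 8100),
    "Toronto": (1900, 3300),
    "Chicago": (2400, 4100),
}

max_p = max(p for p, _ in cities.values())
max_c = max(c for _, c in cities.values())

def informational_index(produced, consumed, weight=0.5):
    """Blend normalized production and consumption into one score."""
    return weight * produced / max_p + (1 - weight) * consumed / max_c

ranked = sorted(cities, key=lambda c: informational_index(*cities[c]), reverse=True)
print(ranked)
```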


Author(s):  
Aboubakr Aqle ◽  
Dena Al-Thani ◽  
Ali Jaoua

There are limited studies addressing the challenges that visually impaired (VI) users face when viewing search results on a search engine interface with a screen reader. This study investigates the effect of providing an overview of search results to VI users. We present a novel interactive search engine interface called InteractSE to support VI users during the results exploration stage, in order to improve their interactive experience and web search efficiency. An overview of the search results is generated using an unsupervised machine learning approach that presents the discovered concepts via formal concept analysis, which is domain-independent. These concepts are arranged in a multi-level tree following a hierarchical order and covering all retrieved documents that share maximal features. The InteractSE interface was evaluated by 16 legally blind users and compared with the Google search engine interface on complex search tasks. The evaluation results were obtained from both quantitative (e.g. task completion time) and qualitative (e.g. participants’ feedback) measures. These results are promising and indicate that InteractSE enhances search efficiency and consequently advances the user experience. Our observations and analysis of the user interactions and feedback yielded design suggestions to support VI users when exploring and interacting with search results.
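A library-free sketch of the formal concept analysis step: from a small binary document-feature context it enumerates the formal concepts (maximal sets of documents sharing a maximal set of features) that could then be arranged into a multi-level tree. The documents and features below are invented placeholders, not data from the study.

```python
from itertools import combinations

# Toy binary context: each document and its set of features (placeholders).
context = {
    "doc1": {"python", "search", "interface"},
    "doc2": {"python", "search"},
    "doc3": {"search", "interface", "audio"},
}

def common_features(docs):
    """Features shared by every document in the set."""
    sets = [context[d] for d in docs]
    return set.intersection(*sets) if sets else set()

def docs_having(features):
    """Documents that contain every feature in the set."""
    return {d for d, f in context.items() if features <= f}

concepts = set()
all_docs = list(context)
for r in range(1, len(all_docs) + 1):
    for docs in combinations(all_docs, r):
        intent = common_features(docs)
        extent = docs_having(intent)          # closure: maximal document set
        concepts.add((frozenset(extent), frozenset(intent)))

for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), "share", sorted(intent))
```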


2016 ◽  
Vol 6 (2) ◽  
pp. 41-65 ◽  
Author(s):  
Sheetal A. Takale ◽  
Prakash J. Kulkarni ◽  
Sahil K. Shah

Information available on the internet is huge, diverse and dynamic. Current search engines perform the task of intelligently helping internet users: for a query, they provide a list of the best-matching or most relevant web pages. However, the information for a query is often spread across multiple pages returned by the search engine, which degrades the quality of the search results. In this sense, search engines are drowning in information but starving for knowledge. Here, we present a query-focused extractive summarization of search engine results. We propose a two-level summarization process: identification of relevant theme clusters, and selection of top-ranking sentences to form a summarized result for the user query. A new approach to semantic similarity computation using semantic roles and semantic meaning is proposed. Document clustering is achieved effectively by applying the minimum description length (MDL) principle, and sentence clustering and ranking are done using symmetric non-negative matrix factorization (SNMF). Experiments demonstrate the effectiveness of the system in semantic text understanding, document clustering and summarization.
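A numpy sketch of the SNMF step for sentence clustering and ranking, under the common formulation of factorizing a symmetric sentence-similarity matrix W as H Hᵀ with multiplicative updates; the similarity matrix below is random placeholder data, not the paper's pipeline.

```python
import numpy as np

def snmf(W, k, iterations=200, eps=1e-9):
    """Factorize a symmetric non-negative matrix W (n x n) as W ~ H @ H.T."""
    n = W.shape[0]
    rng = np.random.default_rng(0)
    H = rng.random((n, k))
    for _ in range(iterations):
        numerator = W @ H
        denominator = H @ (H.T @ H) + eps
        # Commonly used multiplicative update for symmetric NMF.
        H *= 0.5 + 0.5 * numerator / denominator
    return H

# Toy similarity matrix for 6 "sentences" (symmetric, non-negative placeholder).
rng = np.random.default_rng(1)
A = rng.random((6, 6))
W = (A + A.T) / 2

H = snmf(W, k=2)
clusters = H.argmax(axis=1)      # cluster label per sentence
ranking = H.max(axis=1)          # within-cluster importance score
print(clusters, ranking.round(3))
```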


Author(s):  
Xiannong Meng

This chapter surveys the various technologies involved in a Web search engine, with an emphasis on performance analysis issues. The aspects of a general-purpose search engine covered in this survey include system architectures, information retrieval theories as the basis of Web search, indexing and ranking of Web documents, relevance feedback and machine learning, personalization, and performance measurement. The objectives of the chapter are to review the theories and technologies pertaining to Web search and to help readers understand how Web search engines work and how to use them more effectively and efficiently.
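As a toy illustration of two of the surveyed components, indexing and ranking, the sketch below builds an inverted index over a few documents and scores them against a query with TF-IDF; it is a generic example under simple assumptions, not code from the chapter.

```python
import math
from collections import defaultdict, Counter

docs = {
    1: "web search engines index web documents",
    2: "ranking uses relevance feedback and machine learning",
    3: "performance measurement of search engines",
}

# Build the inverted index: term -> {doc_id: term frequency}.
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, tf in Counter(text.split()).items():
        index[term][doc_id] = tf

def search(query, top_k=3):
    """Score documents by summed TF-IDF of the query terms they contain."""
    n = len(docs)
    scores = Counter()
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return scores.most_common(top_k)

print(search("search ranking"))
```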


2016 ◽  
Vol 10 (4) ◽  
pp. 302-307 ◽  
Author(s):  
Richard M. Hinds ◽  
Natalie R. Danna ◽  
John T. Capo ◽  
Kenneth J. Mroczek

Background. The Internet has been reported to be the first informational resource for many fellowship applicants. The objective of this study was to assess the accessibility of orthopaedic foot and ankle fellowship websites and to evaluate the quality of information provided via program websites. Methods. The American Orthopaedic Foot and Ankle Society (AOFAS) and the Fellowship and Residency Electronic Interactive Database (FREIDA) fellowship databases were accessed to generate a comprehensive list of orthopaedic foot and ankle fellowship programs. The databases were reviewed for links to fellowship program websites and compared with program websites accessed from a Google search. Accessible fellowship websites were then analyzed for the quality of recruitment and educational content pertinent to fellowship applicants. Results. Forty-seven orthopaedic foot and ankle fellowship programs were identified. The AOFAS database featured direct links to 7 (15%) fellowship websites with the independent Google search yielding direct links to 29 (62%) websites. No direct website links were provided in the FREIDA database. Thirty-six accessible websites were analyzed for content. Program websites featured a mean 44% (range = 5% to 75%) of the total assessed content. The most commonly presented recruitment and educational content was a program description (94%) and description of fellow operative experience (83%), respectively. Conclusions. There is substantial variability in the accessibility and quality of orthopaedic foot and ankle fellowship websites. Clinical Relevance. Recognition of deficits in accessibility and content quality may assist foot and ankle fellowships in improving program information online. Levels of Evidence: Level IV


Author(s):  
Andon Hestiantoro ◽  
Intan Kusumaningtyas

Objective: To assess the quality of websites providing information on infertility and its management in Bahasa. Methods: Differences between website types and affiliates were assessed for credibility, accuracy and ease of navigation using predefined criteria. We used the Google search engine with the keyword "infertilitas" and assessed 50 websites in Bahasa related to infertility. Results: The content credibility of most sites was adequate, with scores of 60 to 80 for 68% of sites. The content accuracy of most sites scored above 60, with 24% (12 sites) scoring 60 to 80 and 44% (22 sites) scoring above 80. For ease of navigation, most sites (47 sites, or 94%) scored above 60. Conclusion: The quality of internet-based infertility information in Bahasa is adequate in the categories of credibility, accuracy and ease of navigation. [Indones J Obstet Gynecol 2018; 6-1: 28-33] Keywords: bahasa, infertility, information, internet, quality


Author(s):  
Pratik C. Jambhale

Search engine optimization is a technique for bringing a web document into the top search results of a search engine. A web presence is not only an easy way for companies to reach their target users; it can also be profitable, because a business can find exactly those users: most of the time, users search with keywords describing their need rather than searching for the company name, and if the company's website appears in the top positions, the page can be profitable. This work describes the tweaks for taking a page to a top position in Google by increasing its PageRank, which may result in improved visibility and profitable deals for a business. Google is the most user-friendly search engine and gives user-oriented results. In addition, most other search engines follow Google's search patterns, so we have concentrated on it; if a page is registered on Google, it is displayed on most search engines.
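As a rough illustration of inspecting a page's on-page signals (title, meta description, headings, links), the sketch below uses requests and BeautifulSoup; these checks are generic and are not the specific optimization steps described in the paper.

```python
import requests
from bs4 import BeautifulSoup

def onpage_signals(url):
    """Extract a few common on-page elements that search engines read."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    meta = soup.find("meta", attrs={"name": "description"})
    return {
        "title": soup.title.string.strip() if soup.title and soup.title.string else None,
        "meta_description": meta["content"] if meta and meta.has_attr("content") else None,
        "h1_headings": [h.get_text(strip=True) for h in soup.find_all("h1")],
        "outbound_links": len(soup.find_all("a", href=True)),
    }

if __name__ == "__main__":
    print(onpage_signals("https://example.com"))
```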

