Coverage of highly-cited documents in Google Scholar, Web of Science, and Scopus: a multidisciplinary comparison

Author(s):  
Alberto Martín-Martín ◽  
Enrique Orduna-Malea ◽  
Emilio Delgado López-Cózar

This study explores the extent to which bibliometric indicators based on counts of highly-cited documents could be affected by the choice of data source. The initial hypothesis is that databases that rely on journal selection criteria for their document coverage may not necessarily provide an accurate representation of highly-cited documents across all subject areas, while inclusive databases, which give each document the chance to stand on its own merits, might be better suited to identify highly-cited documents. To test this hypothesis, an analysis of 2,515 highly-cited documents published in 2006 that Google Scholar displays in its Classic Papers product is carried out at the level of broad subject categories, checking whether these documents are also covered in Web of Science and Scopus, and whether the citation counts offered by the different sources are similar. The results show that a large fraction of highly-cited documents in the Social Sciences and Humanities (8.6%-28.2%) are invisible to Web of Science and Scopus. In the Natural, Life, and Health Sciences the proportion of missing highly-cited documents in Web of Science and Scopus is much lower. Furthermore, in all areas, Spearman correlation coefficients of citation counts in Google Scholar, as compared to Web of Science and Scopus citation counts, are remarkably strong (.83-.99). The main conclusion is that the data about highly-cited documents available in the inclusive database Google Scholar does indeed reveal significant coverage deficiencies in Web of Science and Scopus in some areas of research. Therefore, using these selective databases to compute bibliometric indicators based on counts of highly-cited documents might produce biased assessments in poorly covered areas.
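The coverage check described in this abstract can be illustrated with a minimal sketch. It assumes each highly-cited document has an identifier and a broad subject category, and that the identifiers found in a selective database are available as a set. The function name, identifiers, and toy data below are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def missing_share_by_category(documents, covered_ids):
    """Return {category: percentage of documents absent from covered_ids}.

    documents   -- iterable of (doc_id, category) pairs (hypothetical layout)
    covered_ids -- set of doc_ids found in the selective database
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for doc_id, category in documents:
        totals[category] += 1
        if doc_id not in covered_ids:
            missing[category] += 1
    return {cat: 100.0 * missing[cat] / totals[cat] for cat in totals}


# Toy data, invented for illustration only.
classic_papers = [
    ("d1", "Humanities"), ("d2", "Humanities"),
    ("d3", "Humanities"), ("d4", "Humanities"),
    ("d5", "Health Sciences"), ("d6", "Health Sciences"),
    ("d7", "Health Sciences"), ("d8", "Health Sciences"),
]
found_in_selective_db = {"d1", "d2", "d3", "d5", "d6", "d7", "d8"}

shares = missing_share_by_category(classic_papers, found_in_selective_db)
print(shares)
```

In this toy example one of four Humanities documents is missing from the selective database, so the Humanities share is 25%, while no Health Sciences document is missing.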

2018 ◽  
Author(s):  
Alberto Martín-Martín ◽  
Enrique Orduna-Malea ◽  
Mike Thelwall ◽  
Emilio Delgado López-Cózar

Despite citation counts from Google Scholar (GS), Web of Science (WoS), and Scopus being widely consulted by researchers and sometimes used in research evaluations, there is no recent or systematic evidence about the differences between them. In response, this paper investigates 2,448,055 citations to 2,299 English-language highly-cited documents from 252 GS subject categories published in 2006, comparing GS, the WoS Core Collection, and Scopus. GS consistently found the largest percentage of citations across all areas (93%-96%), far ahead of Scopus (35%-77%) and WoS (27%-73%). GS found nearly all the WoS (95%) and Scopus (92%) citations. Most citations found only by GS were from non-journal sources (48%-65%), including theses, books, conference papers, and unpublished materials. Many were non-English (19%-38%), and they tended to be much less cited than citing sources that were also in Scopus or WoS. Despite the many unique GS citing sources, Spearman correlations between citation counts in GS and WoS or Scopus are high (0.78-0.99). They are lower in the Humanities, and lower between GS and WoS than between GS and Scopus. The results suggest that in all areas GS citation data is essentially a superset of WoS and Scopus, with substantial extra coverage.
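As an illustration of the Spearman correlations reported here, the following sketch computes the coefficient as the Pearson correlation of average ranks, using only the standard library. The citation counts are invented, and the helper names are hypothetical. In practice a statistics library such as SciPy (`scipy.stats.spearmanr`) would normally be used instead.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Invented citation counts for the same five documents in two sources.
gs_counts = [1200, 950, 800, 640, 500]
wos_counts = [700, 610, 300, 420, 250]

rho = spearman(gs_counts, wos_counts)
print(round(rho, 2))
```

Because the coefficient depends only on rank order, a source that reports systematically higher counts can still correlate strongly with a source that reports fewer citations.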


2018 ◽  
Author(s):  
Enrique Orduna-Malea ◽  
Alberto Martín-Martín ◽  
Emilio Delgado López-Cózar

In June 2017 Google Scholar launched a new product called Classic Papers. This service currently displays the most cited English-language original research articles published in 2006, organized by field. The main goal of this work is to describe the main characteristics and features of this new Google Scholar service, as well as to highlight its main strengths and weaknesses. To do this, a total of 2,515 records were extracted. Additionally, for each record, the following bibliographic data were gathered: broad subject category and subcategory, title of the document, URL, authors, Google Scholar Citations profile URL, and citations received. It is finally concluded that, although the product is easy to use and provides original data about highly cited documents at the level of disciplines, it still suffers from some methodological problems, mainly related to the subject classification of documents and the use of a homogeneous visualization threshold regardless of the discipline, which jeopardize the utility of this product for bibliometric purposes. In addition, the lack of transparency constitutes a methodological concern, since Google Scholar does not declare in detail how the product has been developed.


2019 ◽  
Vol 11 (9) ◽  
pp. 202 ◽  
Author(s):  
Rovira ◽  
Codina ◽  
Guerrero-Solé ◽  
Lopezosa

Search engine optimization (SEO) constitutes the set of methods designed to increase the visibility of, and the number of visits to, a web page by means of its ranking on the search engine results pages. Recently, SEO has also been applied to academic databases and search engines, in a trend that is in constant growth. This new approach, known as academic SEO (ASEO), has generated a field of study with considerable future growth potential due to the impact of open science. The study reported here forms part of this new field of analysis. The ranking of results is a key aspect in any information system since it determines the way in which these results are presented to the user. The aim of this study is to analyze and compare the relevance ranking algorithms employed by various academic platforms to identify the importance of citations received in their algorithms. Specifically, we analyze two search engines and two bibliographic databases: Google Scholar and Microsoft Academic, on the one hand, and Web of Science and Scopus, on the other. A reverse engineering methodology is employed based on the statistical analysis of Spearman’s correlation coefficients. The results indicate that the ranking algorithms used by Google Scholar and Microsoft are the two that are most heavily influenced by citations received. Indeed, citation counts are clearly the main SEO factor in these academic search engines. An unexpected finding is that, at certain points in time, Web of Science (WoS) used citations received as a key ranking factor, despite the fact that WoS support documents claim this factor does not intervene.


2010 ◽  
Vol 57 (4) ◽  
pp. 201-211 ◽  
Author(s):  
Jelena Jacimovic ◽  
Ruzica Petrovic ◽  
Slavoljub Zivkovic

Introduction. For a long time, the Institute for Scientific Information (ISI, now Thomson Scientific, Philadelphia, US) citation databases, available online through the Web of Science (WoS), had a unique position among bibliographic databases. The emergence of new citation databases, such as Scopus and Google Scholar (GS), calls into question the dominance of WoS and the accuracy of bibliometric and citation studies based exclusively on WoS data. The aim of this study was to determine whether there were significant differences in the received citation counts for Serbian Dental Journal (SDJ) found in the WoS and Scopus databases, whether GS results differed significantly from those obtained by WoS and Scopus, and whether GS could be an adequate qualitative alternative to commercial databases in the impact assessment of this journal. Material and Methods. The data regarding SDJ citations were collected in September 2010 by searching the WoS, Scopus and GS databases. For further analysis, all relevant data on both cited and citing articles were imported into a Microsoft Access database. Results. One hundred and fifty-eight cited papers from SDJ and 249 received citations were found in the three analyzed databases. 74% of cited articles were found in GS, 46% in Scopus and 44% in WoS. The greatest number of citations (189) was derived from GS, while only 15% of the citations were found in all three databases. There was a significant difference in the percentage of unique citations found in the databases: 58% originated from GS, while Scopus and WoS gave 6% and 4% unique citations, respectively. The highest percentage of database overlap was found between WoS and Scopus (70%), while the overlap between Scopus and GS was only 18%. In the case of WoS and GS the overlap was 17%. Most of the SDJ citations came from original scientific articles. Conclusion. WoS, Scopus and GS produce quantitatively and qualitatively different citation counts for SDJ articles. None of the examined databases can provide a comprehensive picture and it is necessary to take into account all three available sources.
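The unique-citation and shared-citation percentages reported above can be sketched with set operations. The sketch assumes every citation has been deduplicated to a common identifier across databases and that percentages are taken over the union of all citations found. Both are assumptions, since the abstract does not state its exact denominators. The data and names below are hypothetical.

```python
def coverage_summary(sources):
    """Summarize citation coverage across databases.

    sources -- {database name: set of citation ids} (hypothetical layout)
    Returns (per_source, shared_by_all), where per_source maps each database
    to its percentage share of the union and its percentage of citations
    found nowhere else, and shared_by_all is the percentage found in every
    database.
    """
    union = set().union(*sources.values())
    if not union:
        raise ValueError("no citations to summarize")
    total = len(union)
    per_source = {}
    for name, ids in sources.items():
        # Citations found in any of the other databases.
        elsewhere = set().union(*(s for n, s in sources.items() if n != name))
        per_source[name] = {
            "share": 100.0 * len(ids) / total,
            "unique": 100.0 * len(ids - elsewhere) / total,
        }
    shared_by_all = 100.0 * len(set.intersection(*sources.values())) / total
    return per_source, shared_by_all


# Invented citation ids, for illustration only.
citations = {
    "GS": {1, 2, 3, 4, 5, 6, 7},
    "Scopus": {1, 2, 3, 8, 9},
    "WoS": {1, 2, 8, 10},
}

per_source, shared_by_all = coverage_summary(citations)
print(per_source)
print(shared_by_all)
```

In this toy data 40% of all citations are unique to GS, 10% each are unique to Scopus and WoS, and 20% appear in all three databases.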


2020 ◽  
Vol 53 (3) ◽  
pp. 515-520
Author(s):  
Hannah June Kim ◽  
Bernard Grofman

This article uses data collected from Google Scholar to identify characteristics of scholars who have chosen to create a Google Scholar profile. Among tenured and tenure-track faculty with full-time appointments in PhD-granting political science departments, we find that only 43.7% have created a profile. However, among R1 faculty, young and early-career faculty are more likely to have Google Scholar profiles than those in older cohorts. Although subfield differences are largely nonexistent, there is a notably low proportion of theory faculty with profiles and a slightly higher proportion with profiles among methodologists. Moreover, within cohorts, those who are highly cited are more likely to have profiles than those who have low citation counts. We conclude by discussing implications of our findings, the increasing usage of Google Scholar and profiles, and the increasing importance of an online presence in the academy.


2007 ◽  
Vol 2 (3) ◽  
pp. 87 ◽  
Author(s):  
Lorie Andrea Kloda

Objective – To determine whether three competing citation tracking services result in differing citation counts for a known set of articles, and to assess the extent of any differences. Design – Citation analysis, observational study. Setting – Three citation tracking databases: Google Scholar, Scopus and Web of Science. Subjects – Citations from eleven journals each from the disciplines of oncology and condensed matter physics for the years 1993 and 2003. Methods – The researchers selected eleven journals each from the list of journals from Journal Citation Reports 2004 for the categories “Oncology” and “Condensed Matter Physics” using a systematic sampling technique to ensure journals with varying impact factors were included. All references from these 22 journals were retrieved for the years 1993 and 2003 by searching three databases: Web of Science, INSPEC, and PubMed. Only research articles were included for the purpose of the study. From these, a stratified random sample was created to proportionally represent the content of each journal (oncology 1993: 234 references, 2003: 259 references; condensed matter physics 1993: 358 references, 2003: 364 references). In November of 2005, citation counts were obtained for all articles from Web of Science, Scopus and Google Scholar. Due to the small sample size and skewed distribution of data, non-parametric tests were conducted to determine whether significant differences existed between sets. Main results – For 1993, mean citation counts were highest in Web of Science for both oncology (mean = 45.3, SD = 77.4) and condensed matter physics (mean = 22.5, SD = 32.5). For 2003, mean citation counts were higher in Scopus for oncology (mean = 8.9, SD = 12.0), and in Web of Science for condensed matter physics (mean = 3.0, SD = 4.0). There was not enough data for the set of citations from Scopus for condensed matter physics for 1993 and it was therefore excluded from analysis. A Friedman test for differences between all remaining groups suggested that a significant difference existed, and so pairwise post-hoc comparisons were performed. The Wilcoxon signed-rank tests demonstrated significant differences “in citation counts between all pairs (p < 0.001) except between Google Scholar and Scopus for CM physics 2003 (p = 0.119).” The study also looked at the number of unique references from each database, as well as the proportion of overlap for the 2003 citations. In the area of oncology, there was found to be 31% overlap between databases, with Google Scholar including the most unique references (13%), followed by Scopus (12%) and Web of Science (7%). For condensed matter physics, the overlap was lower at 21% and the largest number of unique references was found in Web of Science (21%), with Google Scholar next largest (17%) and Scopus the least (9%). Citing references from Google Scholar were found to originate not only from journals, but also from online archives, academic repositories, government and non-government white papers and reports, commercial organizations, and other sources. Conclusion – The study does not confirm the authors’ hypothesis that differing scholarly coverage would result in different citation counts from the three databases. While there were significant differences in mean citation rates between all pairs of databases except for Google Scholar and Scopus in condensed matter physics for 2003, no one database performed better overall. Different databases performed better for different subjects, as well as for different years, especially Scopus, which only includes references starting in 1996. The results of this study suggest that the best citation database will depend on the years being searched as well as the subject area. For a complete picture of citation behaviour, the authors suggest all three be used.
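The Friedman test mentioned above compares the same articles across several databases by ranking the citation counts within each article. A minimal sketch of the test statistic follows, assuming one row per article and one column per database. It omits the tie correction and the p-value, so it is not a substitute for a statistics library such as SciPy (`scipy.stats.friedmanchisquare`, with `scipy.stats.wilcoxon` for the pairwise follow-up). The counts and function names are invented.

```python
def average_ranks(row):
    """Rank the values within one row (1 = smallest); ties share the mean rank."""
    return [
        sum(v < x for v in row) + (sum(v == x for v in row) + 1) / 2
        for x in row
    ]


def friedman_statistic(rows):
    """Friedman chi-square statistic, without tie correction.

    rows -- list of equal-length sequences: one row per article,
            one column per database (hypothetical layout)
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two articles")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every row needs the same number (>= 2) of columns")
    # Sum the within-article ranks for each database.
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


# Invented citation counts: columns are three databases, rows are articles.
counts = [
    (10, 8, 12),
    (5, 3, 9),
    (20, 15, 25),
    (7, 6, 11),
]

q = friedman_statistic(counts)
print(q)
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom, although that approximation is rough for samples as small as this toy example.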


2018 ◽  
Vol 116 (3) ◽  
pp. 2175-2188 ◽  
Author(s):  
Alberto Martín-Martín ◽  
Enrique Orduna-Malea ◽  
Emilio Delgado López-Cózar

2020 ◽  
Vol 3 (1) ◽  
pp. 30-63
Author(s):  
O. V. Moskaleva ◽  
M. A. Akoev

This article is the first in a series of articles presenting a development forecast for Russian scientific journals. Based on an analysis of the dynamics of many bibliometric indicators of Russian journals presented in various databases on the Web of Science platform, a forecast is made for the development of journals by field of science using the OECD classifier. Proposals are made on the measures necessary to increase the bibliometric indicators of Russian journals in the natural sciences, and a forecast of the increase in the number of Russian journals in the Social Sciences and Humanities in the Web of Science Core Collection is presented.

