Evaluating institutional open access performance: Sensitivity analysis

2020 ◽  
Author(s):  
Chun-Kai Huang ◽  
Cameron Neylon ◽  
Richard Hosking ◽  
Lucy Montgomery ◽  
Katie Wilson ◽  
...  

Abstract: In the article “Evaluating institutional open access performance: Methodology, challenges and assessment” we develop the first comprehensive and reproducible workflow that integrates multiple bibliographic data sources for evaluating institutional open access (OA) performance. The major data sources include Web of Science, Scopus, Microsoft Academic, and Unpaywall. However, each of these databases continues to update, both actively and retrospectively. This implies that the results produced by the proposed process are potentially sensitive both to the choice of data sources and to the versions used. In addition, issues remain relating to selection bias, sample size, and margin of error. The current work shows that sensitivity to the above issues can be significant at the institutional level. Hence, transparency and clear documentation of the choices made about data sources (and their versions) and cut-off boundaries are vital for reproducibility and verifiability.
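The sample-size and margin-of-error trade-off the abstract alludes to can be sketched with the standard normal-approximation formula for a proportion. This is an illustrative assumption, not the paper's stated calculation; the OA share of 40% and the sample sizes below are hypothetical.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% confidence interval
    for a proportion p estimated from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A hypothetical institution with a 40% OA share: quadrupling the
# sample roughly halves the margin of error.
small = margin_of_error(0.40, 500)    # ~0.043
large = margin_of_error(0.40, 2000)   # ~0.021
```

Because the margin of error shrinks only with the square root of the sample size, cut-off boundaries on small institutional samples can move OA estimates by several percentage points, which is one reason the authors stress documenting those choices.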

Author(s):  
Vicente P. Guerrero-Bote ◽  
Zaida Chinchilla-Rodríguez ◽  
Abraham Mendoza ◽  
Félix de Moya-Anegón

This paper presents a large-scale document-level comparison of two major bibliographic data sources: Scopus and Dimensions. The focus is on differences in their coverage of documents at two levels of aggregation: by country and by institution. The main goal is to analyze whether Dimensions offers opportunities for bibliometric analysis at the country and institutional levels as good as those it offers at the global level. Differences in the completeness and accuracy of citation links are also studied. The results allow a profile of Dimensions to be drawn in terms of its coverage by country and institution. Dimensions’ coverage is more than 25% greater than that of Scopus, which is consistent with previous studies. However, the main finding of this study is the lack of affiliation data in a large fraction of Dimensions documents. We found that close to half of all documents in Dimensions are not associated with any country of affiliation, while the proportion of documents without this data in Scopus is much lower. This mainly limits the possibilities that Dimensions offers for bibliometric analyses at the country and institutional levels. Both of these aspects are highly pragmatic considerations for information retrieval and for the design of policies on the use of scientific databases in research evaluation.


2020 ◽  
pp. 1-34 ◽  
Author(s):  
Chun-Kai (Karl) Huang ◽  
Cameron Neylon ◽  
Chloe Brookes-Kenworthy ◽  
Richard Hosking ◽  
Lucy Montgomery ◽  
...  

Universities are increasingly evaluated on the basis of their outputs. These are often converted to simple and contested rankings with substantial implications for recruitment, income, and perceived prestige. Such evaluation usually relies on a single data source to define the set of outputs for a university. However, few studies have explored differences across data sources and their implications for metrics and rankings at the institutional scale. We address this gap by performing detailed bibliographic comparisons between Web of Science (WoS), Scopus, and Microsoft Academic (MSA) at the institutional level and supplement this with a manual analysis of 15 universities. We further construct two simple rankings based on citation count and open access status. Our results show that there are significant differences across databases. These differences contribute to drastic changes in rank positions of universities, which are most prevalent for non-English-speaking universities and those outside the top positions in international university rankings. Overall, MSA has greater coverage than Scopus and WoS, but with less complete affiliation metadata. We suggest that robust evaluation measures need to consider the effect of choice of data sources and recommend an approach where data from multiple sources is integrated to provide a more robust data set.
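The rank instability described above can be illustrated with a toy example. The universities and citation totals below are hypothetical; the point is only that sorting the same institutions by counts taken from two different sources can reorder them.

```python
# Hypothetical per-university citation totals from two different
# bibliographic sources covering the same three universities.
source_a = {"Univ 1": 1200, "Univ 2": 950, "Univ 3": 800}
source_b = {"Univ 1": 700, "Univ 2": 1100, "Univ 3": 900}

def rank(counts: dict) -> list:
    """Universities ordered from most to least cited."""
    return sorted(counts, key=counts.get, reverse=True)

print(rank(source_a))  # ['Univ 1', 'Univ 2', 'Univ 3']
print(rank(source_b))  # ['Univ 2', 'Univ 3', 'Univ 1']
```

The same three institutions occupy different positions depending solely on which database supplied the counts, which is why the authors argue for integrating multiple sources.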


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9410 ◽  
Author(s):  
Nicolas Robinson-Garcia ◽  
Rodrigo Costas ◽  
Thed N. van Leeuwen

The implementation of policies promoting the adoption of an open science (OS) culture must be accompanied by indicators that allow monitoring the uptake of such policies and their potential effects on research publishing and sharing practices. This study presents indicators of open access (OA) at the institutional level for universities worldwide. By combining data from Web of Science, Unpaywall and the Leiden Ranking disambiguation of institutions, we track OA coverage of universities’ output for 963 institutions. This paper presents the methodological challenges, conceptual discrepancies and limitations and discusses further steps needed to move forward the discussion on fostering OA and OS practices and policies.
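An institutional OA-coverage indicator of the kind described, combining a bibliographic record set with an Unpaywall-style OA lookup, reduces to a share of OA outputs per institution. This is a minimal sketch with invented records and DOIs, not the study's actual pipeline.

```python
# Hypothetical (institution, DOI) pairs from a bibliographic source,
# plus a DOI -> OA-status lookup in the style of Unpaywall.
records = [
    ("Univ A", "10.1/x1"), ("Univ A", "10.1/x2"), ("Univ A", "10.1/x3"),
    ("Univ B", "10.1/y1"), ("Univ B", "10.1/y2"),
]
is_oa = {"10.1/x1": True, "10.1/x2": False, "10.1/x3": True,
         "10.1/y1": False, "10.1/y2": False}

def oa_coverage(records, is_oa):
    """Share of each institution's outputs that are open access."""
    totals, oa_counts = {}, {}
    for inst, doi in records:
        totals[inst] = totals.get(inst, 0) + 1
        oa_counts[inst] = oa_counts.get(inst, 0) + int(is_oa.get(doi, False))
    return {inst: oa_counts[inst] / totals[inst] for inst in totals}

print(oa_coverage(records, is_oa))  # Univ A: 2/3 OA; Univ B: 0/2 OA
```

In practice the hard part is exactly what the abstract flags: disambiguating institutions and matching DOIs across sources before such a ratio is meaningful.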


2019 ◽  
Author(s):  
Chun-Kai (Karl) Huang ◽  
Cameron Neylon ◽  
Chloe Brookes-Kenworthy ◽  
Richard Hosking ◽  
Lucy Montgomery ◽  
...  

AbstractUniversities are increasingly evaluated, both internally and externally on the basis of their outputs. Often these are converted to simple, and frequently contested, rankings based on quantitative analysis of those outputs. These rankings can have substantial implications for student and staff recruitment, research income and perceived prestige of a university. Both internal and external analyses usually rely on a single data source to define the set of outputs assigned to a specific university. Although some differences between such databases are documented, few studies have explored them at the institutional scale and examined the implications of these differences for the metrics and rankings that are derived from them. We address this gap by performing detailed bibliographic comparisons between three key databases: Web of Science (WoS), Scopus and, the recently relaunched Microsoft Academic (MSA). We analyse the differences between outputs with DOIs identified from each source for a sample of 155 universities and supplement this with a detailed manual analysis of the differences for fifteen universities. We find significant differences between the sources at the university level. Sources differ in the publication year of specific objects, the completeness of metadata, as well as in their coverage of disciplines, outlets, and publication type. We construct two simple rankings based on citation counts and open access status of the outputs for these universities and show dramatic changes in position based on the choice of bibliographic data sources. Those universities that experience the largest changes are frequently those from non-English speaking countries and those that are outside the top positions in international university rankings. Overall MSA has greater coverage than Scopus or WoS, but has less complete affiliation metadata. 
We suggest that robust evaluation measures need to consider the effect of choice of data sources and recommend an approach where data from multiple sources is integrated to provide a more robust dataset.


2021 ◽  
pp. 1-22
Author(s):  
Martijn Visser ◽  
Nees Jan van Eck ◽  
Ludo Waltman

We present a large-scale comparison of five multidisciplinary bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. The comparison considers scientific documents from the period 2008–2017 covered by these data sources. Scopus is compared in a pairwise manner with each of the other data sources. We first analyze differences between the data sources in the coverage of documents, focusing for instance on differences over time, differences per document type, and differences per discipline. We then study differences in the completeness and accuracy of citation links. Based on our analysis, we discuss the strengths and weaknesses of the different data sources. We emphasize the importance of combining a comprehensive coverage of the scientific literature with a flexible set of filters for making selections of the literature.
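At its core, a pairwise document-coverage comparison like the one described can be framed as set operations over matched document identifiers. The DOI sets below are hypothetical stand-ins for the matched records of two sources.

```python
# Hypothetical DOI sets representing the documents matched in each
# of two sources; the pairwise comparison reduces to set overlap
# and set differences.
scopus = {"10.1/a", "10.1/b", "10.1/c"}
dimensions = {"10.1/b", "10.1/c", "10.1/d", "10.1/e"}

shared = scopus & dimensions           # covered by both sources
scopus_only = scopus - dimensions      # exclusive to Scopus
dimensions_only = dimensions - scopus  # exclusive to Dimensions

print(len(shared), len(scopus_only), len(dimensions_only))
```

Breaking these overlap and exclusive sets down by year, document type, and discipline then yields the kind of coverage profiles the study reports.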


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the leading role of data-oriented science, broadly conceived and represented mainly by terms such as “e-Science” and “Data Science”. Objectives: To present the worldwide scientific production related to data-oriented science, based on the terms “e-Science” and “Data Science” in Scopus and Web of Science between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) obtaining the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013), and Big data (2014 to 2016). In Library and Information Science, emphasis is given to the topics Digital library and Open access, highlighting the field's central role in discussions about mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift in focus from themes centered on data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-oriented science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114


2006 ◽  
Vol 1 (3) ◽  
pp. 57
Author(s):  
Suzanne Pamela Lewis

A review of: Antelman, Kristin. “Do Open-Access Articles Have a Greater Research Impact?” College & Research Libraries 65.5 (Sep. 2004): 372-82. Objective – To ascertain whether open access articles have a greater research impact than articles not freely available, as measured by citations in the ISI Web of Science database. Design – Analysis of mean citation rates of a sample population of journal articles across four disciplines. Setting – Journal literature across the disciplines of philosophy, political science, mathematics, and electrical and electronic engineering. Subjects – A sample of 2,017 articles across the four disciplines published between 2001 and 2002 (for political science, mathematics, and electrical and electronic engineering) and between 1999 and 2000 (for philosophy). Methods – A systematic presample of articles for each of the disciplines was taken to calculate the necessary sample sizes. Based on this calculation, articles were sourced from ten leading journals in each discipline. The leading journals in political science, mathematics, and electrical and electronic engineering were defined by ISI’s Journal Citation Reports for 2002. The ten leading philosophy journals were selected using a combination of other methods. Once the sample population had been identified, each article title and the number of citations to each article (in the ISI Web of Science database) were recorded. Then the article title was searched in Google and if any freely available full text version was found, the article was classified as open access. The mean citation rate for open access and non-open access articles in each discipline was identified, and the percentage difference between the means was calculated. Main results – The four disciplines represented a range of open access uptake: 17% of articles in philosophy were open access, 29% in political science, 37% in electrical and electronic engineering, and 69% in mathematics. 
There was a significant difference in the mean citation rates for open access articles and non-open access articles in all four disciplines. The percentage difference in means was 45% in philosophy, 51% in electrical and electronic engineering, 86% in political science, and 91% in mathematics. Mathematics had the highest rate of open access availability of articles, but political science had the greatest difference in mean citation rates, suggesting there are other, discipline-specific factors apart from rate of open access uptake affecting research impact. Conclusion – The finding that, across these four disciplines, open access articles have a greater research impact than non-open access articles, is only one aspect of the complex changes that are presently taking place in scholarly publishing and communication. However, it is useful information for librarians formulating strategies for building institutional repositories, or exploring open access publishing with patrons or publishers.
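The study's central metric, the percentage difference between mean citation rates of OA and non-OA articles, can be sketched directly. The citation counts below are hypothetical; the reading of "percentage difference" as the OA mean's excess relative to the non-OA mean is an assumption, since the review does not spell out the formula.

```python
from statistics import mean

# Hypothetical citation counts for OA and non-OA articles
# within one discipline.
oa_citations = [4, 7, 9, 12]
non_oa_citations = [2, 4, 5, 7]

def pct_difference(oa, non_oa):
    """Percentage by which the OA mean citation rate exceeds the
    non-OA mean, relative to the non-OA mean."""
    return 100 * (mean(oa) - mean(non_oa)) / mean(non_oa)

print(round(pct_difference(oa_citations, non_oa_citations)))  # 78
```

Under this reading, the reported 45% to 91% figures say that OA articles were cited between roughly 1.5 and 2 times as often as non-OA articles in the same discipline.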


Epidemiologia ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 315-324
Author(s):  
Juan M. Banda ◽  
Ramya Tekumalla ◽  
Guanyu Wang ◽  
Jingyuan Yu ◽  
Tuo Liu ◽  
...  

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and from data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics in such a unique worldwide event for biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets related to COVID-19 chatter, generated between 1 January 2020 and 27 June 2021 and still growing daily at the time of writing. It provides a freely available resource for researchers worldwide to conduct a wide and diverse range of research projects, such as epidemiological analyses, studies of emotional and mental responses to social distancing measures, identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.


2021 ◽  
Vol 37 (1) ◽  
pp. 161-169
Author(s):  
Dominik Rozkrut ◽  
Olga Świerkot-Strużewska ◽  
Gemma Van Halderen

Never has there been a more exciting time to be an official statistician. The data revolution is responding to the demands of the COVID-19 pandemic and a complex sustainable development agenda: to improve how data is produced and used, to close data gaps so as to prevent discrimination, to build capacity and data literacy, to modernize data collection systems, and to liberate data to promote transparency and accountability. But can all data be liberated in the production and communication of official statistics? This paper explores the UN Fundamental Principles of Official Statistics in the context of eight new and big data sources. The paper concludes that each data source can be used for the production of official statistics in adherence with the Fundamental Principles, and argues that these data sources should be used if National Statistical Systems are to adhere to the first Fundamental Principle: compiling and making available official statistics that honor citizens' entitlement to public information.


2021 ◽  
pp. 1-11
Author(s):  
Yanan Huang ◽  
Yuji Miao ◽  
Zhenjing Da

Methods for multi-modal English event detection from a single data source, and for transfer-learning-based isomorphic event detection across different English data sources, still need improvement. To improve the efficiency of English event detection across data sources, this paper proposes, based on a transfer learning algorithm, multi-modal event detection for a single data source and isomorphic event detection for different data sources. Moreover, by stacking multiple classification models, the approach fuses features with one another and conducts adversarial training driven by the discrepancy between two classifiers, further aligning the distributions of data from different sources. In addition, to validate the proposed algorithm, a multi-source English event detection dataset was collected. Finally, this dataset is used to verify the proposed method and to compare it with the current most mainstream transfer learning methods. Experimental analysis, convergence analysis, visual analysis, and parameter evaluation demonstrate the effectiveness of the proposed algorithm.
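The classifier-discrepancy signal used in this style of adversarial training can be sketched in a few lines. This is a generic illustration of discrepancy-based domain adaptation, not the paper's actual model; the logits and the L1 discrepancy measure are assumptions.

```python
import math

def softmax(logits):
    """Convert raw classifier logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classifier_discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' predicted
    class distributions. In discrepancy-based adversarial training the
    classifiers are pushed to maximize this on target-domain data while
    the shared feature extractor is pushed to minimize it."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)

# Identical classifier outputs -> zero discrepancy.
print(classifier_discrepancy([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # 0.0
```

Driving this quantity down for the feature extractor is what makes "the distribution of different source data similar" in the abstract's terms.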

