Human Networks and Data Science (NSF 19-608): Uncovering the "dark matter" of research

2020 ◽  
Author(s):  
Brian A. Nosek ◽  
Stasa Milojevic ◽  
Valentin Pentchev ◽  
Xiaoran Yan ◽  
David M Litherland ◽  
...  

With funding from the National Science Foundation, the Center for Open Science (COS) and Indiana University will create a dynamic, distributed, and heterogeneous data source for the advancement of science of science research. This will be achieved by using, enhancing, and combining the capabilities of the Open Science Framework (OSF) and the Collaborative Archive & Data Research Environment (CADRE). With over 200,000 users (currently growing by >220 per day), many thousands of projects, registrations, and papers, millions of files stored and managed, and rich metadata tracking researcher actions, the OSF is already a rich dataset for investigating the research lifecycle, researcher behaviors, and how those behaviors evolve in the social network. As a cross-university effort, CADRE provides an integrated data mining and collaborative environment for big bibliographic data sets. While still under development, the CADRE platform has already attracted long-term financial commitments from 10 research-intensive universities, with additional support from multiple infrastructure and industry partners. Connecting these efforts will catalyze transformative research on human networks in the science of science.

2021 ◽  
Vol 40 (2) ◽  
pp. 137-141 ◽  
Author(s):  
Jordan Mansell ◽  
Allison Harell ◽  
Elisabeth Gidengil ◽  
Patrick A. Stewart

Abstract. We introduce the Politics and the Life Sciences special issue on Psychophysiology, Cognition, and Political Differences. This issue is the second special issue funded by the Association for Politics and the Life Sciences that adheres to the Open Science Framework for registered reports (RR). Under this model, pre-analysis plans (PAPs) are peer-reviewed and given in-principle acceptance (IPA) before data are collected and/or analyzed, and articles are published contingent upon the study following its preregistration as proposed. Bound by a common theme of the importance of incorporating psychophysiological perspectives into the study of politics, broadly defined, the articles in this special issue feature a unique set of research questions and methodologies. In the following, we summarize the findings, discuss the innovations produced by this research, and highlight the importance of open science for the future of political science research.


2017 ◽  
Author(s):  
Federica Rosetta

Within the Open Science discussions, the current call for "reproducibility" stems from the rising awareness that results presented in research papers are not as easily reproducible as expected, and that some reproduction efforts have even contradicted the original results. In this context, transparency and openness are seen as key components that facilitate good scientific practice as well as scientific discovery. As a result, many funding agencies now require the deposit of research data sets, institutions are improving training on the application of statistical methods, and journals are beginning to mandate a high level of detail on the methods and materials used. How can researchers be supported and encouraged to provide that level of transparency? An important component is the underlying research data, which are currently often only partly available within the article. At Elsevier we have therefore been working on journal data guidelines that clearly explain to researchers when and how they are expected to make their research data available. Simultaneously, we have developed the corresponding infrastructure to make it as easy as possible for researchers to share their data in a way that is appropriate in their field. To ensure researchers get credit for the work they do on managing and sharing data, all our journals support data citation in line with the FORCE11 data citation principles, a key step toward addressing the lack of credit and incentives that emerged from the Open Data analysis (Open Data - the Researcher Perspective, https://www.elsevier.com/about/open-science/research-data/open-data-report) recently carried out by Elsevier together with CWTS. Finally, the presentation will also touch upon a number of initiatives to ensure the reproducibility of software, protocols, and methods. With STAR Methods, for instance, methods are submitted in a Structured, Transparent, Accessible Reporting format; this approach promotes rigor and robustness, and makes reporting easier for the author and replication easier for the reader.


2018 ◽  
Vol 6 (3) ◽  
pp. 669-686 ◽  
Author(s):  
Michael Dietze

Abstract. Environmental seismology is the study of the seismic signals emitted by Earth surface processes. This emerging research field sits at the intersection of seismology, geomorphology, hydrology, meteorology, and further Earth science disciplines. It amalgamates a wide variety of methods from across these disciplines and ultimately fuses them in a common analysis environment. This overarching scope of environmental seismology requires coherent yet integrative software that is accepted by many of the involved scientific disciplines. The statistical software R has gained paramount importance in the majority of data science research fields. R has well-justified advantages over other, mostly commercial, software, which makes it the ideal language on which to base a comprehensive analysis toolbox. The article introduces the avenues and needs of environmental seismology and how these are met by the R package eseis. The conceptual structure, example data sets, and available functions are demonstrated. Worked examples illustrate possible applications of the package and provide in-depth descriptions of the flexible use of the functions. The package has a registered DOI, is available under the GPL licence on the Comprehensive R Archive Network (CRAN), and is maintained on GitHub.


2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Renata Curty

ABSTRACT The availability of scientific assets through data repositories has greatly increased as a result of government and institutional data-sharing policies and mandates for publicly funded research, allowing data to be reused for purposes not always anticipated by the researchers who produced or collected them. Contradictorily, although the argument for data sharing is strongly grounded in the potential for reuse and its contributions to scientific advancement, this subject remains peripheral to discussions about data science and open science. This narrative review takes a closer look at data reuse in order to better conceptualize the term, while proposing an initial classification of five distinct approaches to research data reuse (repurposing, aggregation, integration, meta-analysis, and reanalysis), based on hypothetical scenarios accompanied by cases of data reuse published in the scientific literature. It also explores the determinants of reuse, relating data reusability to the quality of the documentation that accompanies the data. It discusses the challenges of data documentation and points out some initiatives and recommendations to overcome these difficulties. The arguments presented are expected to contribute not only to the conceptual advancement around data reuse and reusability, but also to prompt actions related to data documentation that increase the reuse potential of these scientific assets. Keywords: Data Reuse; Scientific Reproducibility; Reusability; Open Science; Research Data.


2018 ◽  
Author(s):  
Michael Dietze

Abstract. Environmental seismology is the study of the seismic signals emitted by Earth surface processes. This emerging research field sits at the intersection of seismology, geomorphology, hydrology, meteorology, and further Earth science disciplines. It amalgamates a wide variety of methods from across these disciplines and ultimately fuses them in a common analysis environment. This overarching scope of environmental seismology requires coherent yet integrative software that is accepted by many of the involved scientific disciplines. The statistical software R has gained paramount importance in the majority of data science research fields. R has well-justified advantages over other, mostly commercial, software, which makes it the ideal language on which to base a comprehensive analysis toolbox. The article introduces the avenues and needs of environmental seismology and how these are met by the R package eseis. The conceptual structure, example data sets, and available functions are demonstrated. Worked examples illustrate possible applications of the package and provide in-depth descriptions of the flexible use of the functions. The package is available under the GPL license on the Comprehensive R Archive Network (CRAN) and is maintained on GitHub.


2021 ◽  
Vol 10 (8) ◽  
pp. 528
Author(s):  
Raphael Witt ◽  
Lukas Loos ◽  
Alexander Zipf

OpenStreetMap (OSM) is a global mapping project that generates free geographical information through a community of volunteers. OSM is used in a variety of applications and for research purposes. However, it is also possible to import external data sets into OpenStreetMap. Opinions about these data imports diverge among researchers and contributors, and the subject is constantly discussed. The question of whether importing data, especially in large quantities, adds value to OSM or compromises the progress of the project needs to be investigated more deeply. For this study, OSM's historical data were used to compute metrics describing the development of contributors and OSM data during two large data imports, in the Netherlands and India. Additionally, one time period per study area with no large data import was investigated for comparison. To assess the impacts of large data imports in OSM, the metrics were analysed using different techniques (cross-correlation and changepoint detection). It was found that contributor activity increased during large data imports. Additionally, contributors who were already active before a large import were more likely to contribute to OSM after said import than contributors who made their first contributions during the import. The results show the difficulty of interpreting a heterogeneous data source such as OSM and the complexity of the project. Limitations and challenges encountered are explained, and future directions for continuing this line of research are given.
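The two techniques named in the abstract can be sketched in a minimal form. This is an illustration only, under the assumption of a daily edit-count series; the numbers are invented, and the study itself works on richer metrics derived from OSM's full history.

```python
# Minimal sketch: cross-correlation at a given lag, and a simple
# CUSUM-style changepoint locator, on an invented daily edit-count series.

def cross_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] (no padding;
    assumes non-constant series of sufficient length)."""
    n = min(len(x), len(y) - lag)
    xs, ys = x[:n], y[lag:lag + n]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy)

def cusum_changepoint(series):
    """Index where the cumulative deviation from the mean peaks,
    i.e. the last point before the level shift."""
    mean = sum(series) / len(series)
    running, peaks = 0.0, []
    for value in series:
        running += value - mean
        peaks.append(abs(running))
    return peaks.index(max(peaks))

# Toy series: low activity, then a jump when a large import begins.
edits = [5, 6, 5, 7, 6, 40, 42, 38, 41, 39]
print(cusum_changepoint(edits))  # 4: the last low-activity day
```

A real analysis would compare such a detected changepoint against the known start date of an import, and cross-correlate contributor counts with edit counts across lags.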


2019 ◽  
Author(s):  
Matthew H. Graham ◽  
Gregory A Huber ◽  
Neil Malhotra ◽  
Cecilia Hyunjung Mo

Replication and transparency are increasingly important in bolstering the credibility of political science research, yet open science tools are typically designed for experiments. For observational studies, current replication practice suffers from an important pathology: just as researchers can often "p-hack" their way to initial findings, it is often possible to "null hack" findings away through specification and case search. We propose an observational open science framework that consists of extending the original time series, independent data collection, pre-registration, multiple simultaneous replications, and collaborators with mixed incentives. We apply the approach to three studies on "irrelevant" events and voting behavior. Each study replicates well in some areas and poorly in others. Had we sought to debunk any of the three with ex post specification search, we could have done so. However, our approach required us to see the full, complicated picture. We conclude with suggestions for future refinements to our approach.


2021 ◽  
Author(s):  
Lee Humphreys ◽  
Neil A Lewis ◽  
Katherine Sender ◽  
Andrea Stevenson Won

Abstract Recent initiatives toward open science in communication have prompted vigorous debate. In this article, we draw on qualitative and interpretive research methods to expand the key priorities that the open science framework addresses, namely producing trustworthy and quality research. This article contributes to communication research by integrating qualitative methodological literature with open communication science research to identify five broader commitments for all communication research: validity, transparency, ethics, reflexivity, and collaboration. We identify key opportunities where qualitative and quantitative communication scholars can leverage the momentum of open science to critically reflect on and improve our knowledge production processes. We also examine competing values that incentivize dubious practices in communication research, and discuss several metascience initiatives to enhance diversity, equity, and inclusion in our field and value multiple ways of knowing.


2019 ◽  
Author(s):  
Adib Rifqi Setiawan

Below are some of my publications from 2019. Important or not, I consider publications to be merely a side effect of research. Beyond these publications, I am also still active as a writer for online media such as Qureta.com, Selasar.com, and SantriMilenial.net, and have uploaded several preprint articles through the Open Science Framework (OSF), EdArxiv.org, and Research Papers in Economics (RePEc).


Beverages ◽  
2021 ◽  
Vol 7 (1) ◽  
pp. 3
Author(s):  
Zeqing Dong ◽  
Travis Atkison ◽  
Bernard Chen

Although wine has been produced for several thousand years, the ancient beverage has remained popular and has become even more affordable in modern times. Among all wine-making regions, Bordeaux, France is probably one of the most prestigious wine areas in history. Since hundreds of wines are produced in Bordeaux each year, humans are not likely to be able to examine all wines across multiple vintages to define the characteristics of outstanding 21st century Bordeaux wines. Wineinformatics is a newly proposed data science research area whose application domain is wine, processing large amounts of wine data through the computer. The goal of this paper is to build a high-quality computational model on wine reviews processed by the full power of the Computational Wine Wheel to understand 21st century Bordeaux wines. On top of the 985 binary attributes generated from the Computational Wine Wheel in our previous research, we add attributes by utilizing the CATEGORY and SUBCATEGORY, yielding an additional 14 and 34 continuous attributes, respectively, to be included in the All Bordeaux (14,349 wines) and 1855 Bordeaux (1,359 wines) datasets. We believe that successfully merging the original binary attributes and the new continuous attributes can provide more insight for Naïve Bayes and Support Vector Machine (SVM) classifiers to build a model for wine grade category prediction. The experimental results suggest that, for the All Bordeaux dataset, with the additional 14 attributes retrieved from the CATEGORY, the Naïve Bayes classification algorithm outperformed the existing research results by increasing accuracy by 2.15%, precision by 8.72%, and the F-score by 1.48%. For the 1855 Bordeaux dataset, with the additional attributes retrieved from the CATEGORY and SUBCATEGORY, the SVM classification algorithm outperformed the existing research results by increasing accuracy by 5%, precision by 2.85%, recall by 5.56%, and the F-score by 4.07%. The improvements demonstrated in this research show that attributes retrieved from the CATEGORY and SUBCATEGORY have the power to provide more information to classifiers for superior model generation. The models built in this research can better distinguish outstanding and classic 21st century Bordeaux wines. This paper provides new directions in Wineinformatics for technical research in data science, such as regression, multi-target classification, and domain-specific research, including wine region terroir analysis, wine quality prediction, and weather impact examination.
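The merging of binary wheel attributes with continuous category counts described above can be sketched roughly as follows. The term list and categories here are invented for illustration and are not taken from the Computational Wine Wheel, which uses 985 binary and 14/34 continuous attributes.

```python
# Hypothetical sketch: merge 0/1 wine-wheel term flags with per-CATEGORY
# term counts into one feature vector for a classifier. Terms and
# categories are invented, not the actual Computational Wine Wheel.

WHEEL_TERMS = ["blackberry", "cherry", "oak", "tannic"]   # binary attributes
CATEGORIES = {                                            # continuous attributes
    "fruit": {"blackberry", "cherry"},
    "wood":  {"oak"},
}

def featurize(review_terms):
    """Return binary term flags followed by per-category counts."""
    binary = [1 if term in review_terms else 0 for term in WHEEL_TERMS]
    continuous = [sum(term in review_terms for term in members)
                  for members in CATEGORIES.values()]
    return binary + continuous

print(featurize({"blackberry", "oak"}))  # [1, 0, 1, 0, 1, 1]
```

Vectors of this merged shape could then be fed to a Naïve Bayes or SVM classifier; the design choice in the paper is precisely that the continuous category counts carry signal the individual binary flags do not.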

