Best Practices for Managing Turnover in Data Science Groups, Teams, and Labs

2019 ◽  
Author(s):  
Dan Sholler ◽  
Diya Das ◽  
Fernando Hoces de la Guardia ◽  
Chris Hoffman ◽  
Francois Lanusse ◽  
...  

Turnover is a fact of life for any project, and academic research teams can face particularly high levels of it as people come and go over the duration of a project. In this article, we discuss the challenges of turnover and some potential practices for managing it, particularly for computational- and data-intensive research teams and projects. The topics we discuss include establishing and implementing data management plans, file and format standardization, workflow and process documentation, clear team roles, and check-in and check-out procedures.

2018 ◽  
Author(s):  
R. Stuart Geiger ◽  
Dan Sholler ◽  
Aaron Culich ◽  
Ciera Martinez ◽  
Fernando Hoces de la Guardia ◽  
...  

What are the challenges and best practices for doing data-intensive research in teams, labs, and other groups? This paper reports from a discussion in which researchers from many different disciplines and departments shared their experiences of doing data science in their domains. The issues we discuss range from the technical to the social, including getting on the same computational stack, workflow and pipeline management, handoffs, composing a well-balanced team, dealing with fluid membership, fostering coordination and communication, and not abandoning best practices when deadlines loom. We conclude by reflecting on the extent to which there are universal best practices for all teams, and on how these kinds of informal discussions about the challenges of doing research can help combat impostor syndrome.


Author(s):  
Andrew McDavid ◽  
Anthony M. Corbett ◽  
Jennifer L. Dutra ◽  
Andrew G. Straw ◽  
David J. Topham ◽  
...  

Abstract Introduction: In clinical and translational research, data science is often and fortuitously integrated with data collection. This contrasts with the typical position of data scientists in other settings, where they are isolated from data collectors. Because of this, effective use of data science techniques to resolve translational questions requires innovation in the organization and management of these data. Methods: We propose an operational framework that respects this important difference in how research teams are organized. To maximize the accuracy and speed of the clinical and translational data science enterprise under this framework, we define a set of eight best practices for data management. Results: In our own work at the University of Rochester, we have striven to apply these practices in a customized version of the open-source LabKey platform for integrated data management and collaboration. We have applied this platform to cohorts that longitudinally track multidomain data from over 3000 subjects. Conclusions: We argue that this has made analytical datasets more readily available and lowered the barrier to interdisciplinary collaboration, enabling a team-based data science that is unique to the clinical and translational setting.


2019 ◽  
Author(s):  
Dan Sholler ◽  
Sara Stoudt ◽  
Chris J. Kennedy ◽  
Fernando Hoces de la Guardia ◽  
Francois Lanusse ◽  
...  

There are many recommendations of "best practices" for those doing data science, data-intensive research, and research in general. These documents usually present a particular vision of how people should work with data and computing, recommending specific tools, activities, mechanisms, and sensibilities. However, implementation of best (or better) practices in any setting is often met with resistance from individuals and groups who perceive drawbacks in the proposed changes to everyday practice. We offer some definitions of resistance, identify the sources of researchers' hesitancy to adopt new ways of working, and describe some of the ways resistance manifests in data science teams. We then offer strategies for overcoming resistance based on our group members' experiences working alongside resisters or resisting change themselves. Our discussion concluded with many open questions, some of which are listed at the end of this piece.


2016 ◽  
Vol 11 (1) ◽  
pp. 156 ◽  
Author(s):  
Wei Jeng ◽  
Liz Lyon

We report on a case study examining the social science community's capability and institutional support for data management. Fourteen researchers were invited to take part in an in-depth qualitative survey between June 2014 and October 2015. We modified and adopted the Community Capability Model Framework (CCMF) profile tool to ask these scholars to self-assess their current data practices and whether their academic environment provides enough supportive infrastructure for data-related activities. The exemplar disciplines in this report include anthropology, political science, and library and information science. Our findings deepen our understanding of the social science disciplines and identify capabilities that are well developed as well as those that are poorly developed. The participants reported that their institutions have made relatively slow progress on economic support and data science training courses, but acknowledged that they are well informed and trained in protecting participants' privacy. This confirms a prior observation in the literature that social scientists are concerned with ethical perspectives but lack technical training and support. The results also demonstrate intra- and inter-disciplinary commonalities and differences in researcher perceptions of data-intensive capability, and highlight potential opportunities for the development and delivery of new and impactful research data management support services to social science researchers and faculty.


2019 ◽  
Vol 4 (2) ◽  
pp. 81-89 ◽  
Author(s):  
Linda B. Cottler ◽  
Alan I. Green ◽  
Harold Alan Pincus ◽  
Scott McIntosh ◽  
Jennifer L. Humensky ◽  
...  

Abstract The opioid crisis in the USA requires immediate action through clinical and translational research. Network infrastructure already built with funding from the National Institute on Drug Abuse (NIDA) and the National Center for Advancing Translational Sciences (NCATS) provides a major advantage for implementing opioid-focused research that, taken together, could address this crisis. NIDA supports training grants and clinical trial networks; NCATS funds the Clinical and Translational Science Award (CTSA) Program, with over 50 NCATS academic research hubs for regional clinical and translational research. Together, they offer unique capacity for clinical research, bioinformatics, data science, community engagement, regulatory science, institutional partnerships, training and career development, and other key translational elements. The CTSA hubs provide an unprecedented and timely response to local, regional, and national health crises and address research gaps [Clinical and Translational Science Awards Program, Center for Leading Innovation and Collaboration, Synergy paper request for applications]. This paper describes opportunities for collaborative opioid research at CTSA hubs and NIDA–NCATS opportunities that build capacity for best practices as this crisis evolves. Results of a Landscape Survey (among 63 hubs) are provided, with descriptions of best practices and ideas for collaborations, including research conducted by hubs also involved in premier NIDA initiatives. Such collaborations could provide a rapid response to the opioid epidemic while advancing science in multiple disciplinary areas.


RECIIS ◽  
2021 ◽  
Vol 15 (3) ◽  
Author(s):  
Patricia Henning ◽  
Luis Olavo Bonino Da Silva ◽  
Luís Ferreira Pires ◽  
Marten Van Sinderen ◽  
João Luís Rebelo Moreira

The FAIR principles have become a data management instrument for the academic and scientific community, since they provide a set of guiding principles to bring findability, accessibility, interoperability and reusability to data and metadata stewardship. Since their official publication in 2016 by Scientific Data – Nature, these principles have received worldwide recognition and have been quickly endorsed and adopted as a cornerstone of data stewardship and research policy. However, when put into practice, they occasionally result in organisational, legal and technological challenges that can lead to doubts and uncertainty as to whether the effort of implementing them is worthwhile. Soon after their publication, the European Commission and other funding agencies started to require that project proposals include a Data Management Plan (DMP) based on the FAIR principles. This paper reports on the adherence of DMPs to the FAIR principles, critically evaluating ten European DMP templates. We observed that the current FAIRness of most of these DMPs is only partly satisfactory, in that they address data best practices, findability, accessibility and sometimes preservation, but pay much less attention to metadata and interoperability.


2018 ◽  
Vol 4 ◽  
Author(s):  
Steven Van Tuyl ◽  
Amanda Whitmire

In recent years, the academic research data management (RDM) community has worked closely with funding agencies, university administrators, and researchers to develop best practices for RDM. The RDM community, however, has spent relatively little time exploring best practices used in non-academic environments (industry, government, etc.) for the management, preservation, and sharing of data. In this poster, we present the results of a project in which we approached a number of non-academic corporations and institutions to discuss how data are managed in those organizations and discern what the academic RDM community could learn from non-academic RDM practices. We conducted interviews with 10-20 companies, including tech companies, government agencies, and consumer retail corporations. We present the results in the form of user stories, common themes from the interviews, and summaries of areas where the RDM community might benefit from a further understanding of non-academic data management practices.


2013 ◽  
Vol 8 (2) ◽  
pp. 111-122 ◽  
Author(s):  
Martin Halbert

This paper describes findings and projections from a project that examined emerging policies and practices in the United States regarding the long-term institutional management of research data. The DataRes project at the University of North Texas (UNT) studied institutional transitions taking place during 2011-2012 in response to new mandates from U.S. governmental funding agencies requiring that research data management plans be submitted with grant proposals. Additional synergistic findings from another UNT project, termed iCAMP, are also reported briefly. Building on these data analysis activities, the paper discusses conclusions and prospects for likely developments in the coming years based on the trends surfaced in this work. Several of these conclusions and prospects are surprising, representing both opportunities and troubling challenges, not only for the library profession but for the academic research community as a whole.


2013 ◽  
Vol 8 (2) ◽  
pp. 47-67 ◽  
Author(s):  
Reagan Moore ◽  
Arcot Rajasekar ◽  
Paul Watry ◽  
Fabio Corubolo ◽  
John Harrison ◽  
...  

This paper describes work undertaken by the Data Intensive Cyber Environments (DICE) Center at the University of North Carolina at Chapel Hill and the University of Liverpool on the development of an integrated preservation environment, which has been presented at the National Coordination Office for Networking and Information Technology Research and Development (NITRD), at the National Science Foundation, and at the European Commission. The underlying technology is based on the integrated Rule-Oriented Data System (iRODS), which implements a policy-based approach to distributed data management. By differentiating between phases of the data life cycle based upon the evolution of data management policies, the infrastructure can be tuned to support data publication, data sharing, data analysis, and data preservation. It is possible to build generic data management infrastructure that can evolve to meet the management requirements of each user community, federal agency, and academic research project. In order to manage the properties of the data collections, we have developed and integrated scalable digital library services that support the discovery of, and access to, material organized as a collection. The integrated preservation environment prototype implements specific technologies capable of managing a wide range of preservation requirements, from parsing of legacy document formats, to enforcement of preservation policies, to validation of trustworthiness assessment criteria. Each capability has been demonstrated and is instantiated in multiple instances, both in the United States as part of the DataNet Federation Consortium (DFC) and through multiple European projects, primarily the FP7 SHAMAN project.


Author(s):  
Tibor Koltay ◽  
Sonja Špiranec

This chapter is intended mainly for the researcher. Its main goal is to identify which services are already provided or could be planned by academic libraries, identified as important stakeholders in facilitating Research 2.0. Indicating the changing contexts of literacies, the focus is on research-related literacies, such as information literacy, academic literacy, and data literacy, which pertain to the advisory and educational roles of the academic library. Ways of counterbalancing information overload, partially through personal information management, are also described. After outlining the importance of data-intensive research, services facilitating research data management (including the preparation of data management plans) are portrayed. Issues of data curation, data quality, and data citation, as well as ways to identify the professionals who provide services to researchers, are outlined.
