A comprehensive analysis of code and data availability in biomedical research
Modern data-driven research increasingly depends on quantitative analysis, yet effective mechanisms for ensuring the transparency and reproducibility of data and analyses have not been developed or widely adopted. The scientific community widely recognizes the importance and benefits of sharing research products. In biomedical research, it is imperative not only to publish a detailed description of the study design, methodology, results and interpretation, but also to make all research products publicly available, shareable and well documented in order to increase transparency and reproducibility. Current efforts to share research products rely mostly on individual researchers, and sharing is enforced widely but variably by these individuals and research organizations. Meanwhile, a growing body of evidence in recent years points to a reproducibility problem across scientific disciplines: published results often contain analyses that cannot be replicated because the documentation, code and data required to reproduce them are lacking. Our results indicate that only 36% of the scientific manuscripts published in prominent biomedical journals share raw data and only 9% share code. We hope that our analysis informs the biomedical community and urges it to design effective strategies that researchers can widely adopt to improve the transparency and reproducibility of data-driven biomedical research.