Global Research on Coronaviruses: A Metadata-Based Analysis for Health 4.0 (Preprint)
BACKGROUND Amid the COVID-19 pandemic, this article proposes a data science protocol to analyze the global research on coronaviruses beyond SARS-CoV-2 alone. Reproducible research principles based on open science, dissemination of scientific information, and easy access to scientific production may aid public health in the race against the virus.

OBJECTIVE The main objective of this article is to use the global research on coronaviruses to identify critical elements that can better inform the decision process for public health policies. We devise a data science protocol to help health policymakers use the latest data science techniques in designing evidence-based public health policies.

METHODS We use the EpiBibR package to access more than 120,000 references on the global research on coronaviruses, together with their metadata. To analyze these data, we first use a theoretical framework to organize the results around three dimensions: conceptual, intellectual, and social. Second, we use machine learning techniques (natural language processing) and graph theory to map the results of our analysis in these three dimensions.

RESULTS Our results showcase the potential applications of the proposed data science protocol for public health policies. They show that the United States and China are the leading contributors to the global research on coronaviruses. India and Europe are also significant contributors, though in a second tier. University collaborations are strong between the United States, Canada, and the United Kingdom in this domain, confirming the results at the country level.

CONCLUSIONS Our results make a case for data-driven public health policy, particularly when efficient and relevant research is needed. Text mining techniques can assist policymakers in calculating research-driven indices and in guiding their decisions toward the specific actions deemed necessary for impactful health responses.
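
As an illustration of the graph-theoretic step described in the methods, the following minimal sketch counts research output per country and builds a weighted collaboration network from reference metadata. It is not the authors' implementation and does not use the EpiBibR interface: the `records` list, its `countries` field, and all function names are hypothetical stand-ins for metadata that would first be extracted from the references.

```python
from collections import Counter
from itertools import combinations

# Hypothetical metadata: each record lists the countries of a paper's
# author affiliations. Real data would come from the reference metadata.
records = [
    {"title": "Paper A", "countries": ["United States", "China"]},
    {"title": "Paper B", "countries": ["United States", "Canada", "United Kingdom"]},
    {"title": "Paper C", "countries": ["China"]},
    {"title": "Paper D", "countries": ["United States", "United Kingdom"]},
]


def country_output(records):
    """Count the papers each country contributed to (once per paper)."""
    counts = Counter()
    for rec in records:
        counts.update(set(rec["countries"]))
    return counts


def collaboration_edges(records):
    """Build weighted, undirected edges: one increment per co-authored paper."""
    edges = Counter()
    for rec in records:
        # Sorting gives each unordered country pair a single canonical key.
        for a, b in combinations(sorted(set(rec["countries"])), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum edge weights per country as a simple collaboration-strength index."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


output = country_output(records)
edges = collaboration_edges(records)
degree = weighted_degree(edges)

print(output.most_common())
print(edges.most_common())
print(degree.most_common())
```

Weighted degree is only one possible index. The same edge list could feed other graph measures, such as centrality or community detection, depending on the policy question.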