BACKGROUND
COVID-19 poses severe challenges to global public health because it is highly contagious and can be lethal. Numerous studies have been published or are under way, yet research on COVID-19 remains largely ongoing and inconclusive.
OBJECTIVE
A potential approach to accelerating COVID-19 research is to borrow information from existing studies of other viruses in the same coronavirus family. We develop a natural language processing method that answers factoid questions related to COVID-19 using published articles as knowledge sources.
METHODS
Given a question, a BM25-based context retriever first selects the most relevant passages from the articles. Second, a pre-trained BERT question-answering model extracts an answer from each selected passage. Third, an opinion aggregator, which combines a biterm topic model (BTM) with k-means clustering, groups all answers into several opinions.
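To make the retrieval step concrete, the following is a minimal sketch of Okapi BM25 passage ranking in pure Python. It is an illustration only, not the implementation used in this work: the whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75, the class name `BM25Retriever`, and the toy passages are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for brevity.
    return text.lower().split()


class BM25Retriever:
    """Hypothetical minimal Okapi BM25 ranker over a list of passages."""

    def __init__(self, passages, k1=1.5, b=0.75):
        # k1 and b are common default choices, not values from the source.
        self.k1, self.b = k1, b
        self.docs = [tokenize(p) for p in passages]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of passages containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF variant that stays non-negative.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f == 0:
                continue
            # Length-normalized term-frequency saturation.
            norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=2):
        # Indices of the k highest-scoring passages, best first.
        order = sorted(range(len(self.docs)),
                       key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


# Toy passages invented for illustration; they are not taken from CORD-19.
passages = [
    "the incubation period of the virus is about five days",
    "bats are a natural reservoir of coronaviruses",
    "older adults have a higher risk of severe disease",
]
retriever = BM25Retriever(passages)
query = "what is the incubation period"
best = retriever.top_k(query, k=1)
```

In a full pipeline, each passage returned by `top_k` would then be passed, together with the question, to the question-answering model.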
RESULTS
We apply the proposed pipeline to extract answers, opinions, and the most frequent words for six questions from the COVID-19 Open Research Dataset Challenge (CORD-19). By showing the longitudinal distributions of the opinions, we uncover trends in opinions and popular words in the publications across the following periods: before 1990, 1990-2000, 2000-2010, 2011-2019, and after 2019. The changes in the opinions and popular words agree with several distinct characteristics and challenges of COVID-19, including a higher risk for older people and people with pre-existing medical conditions, high contagion and rapid transmission, and a more urgent need for screening and testing. The opinions and popular words also provide additional insights into the COVID-19 related questions.
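A longitudinal distribution of opinions can be computed by binning each answer's publication year into a period and normalizing the opinion counts within that period. The sketch below shows one way to do this, under stated assumptions: the function names, the opinion labels (integers), the example records, and the handling of boundary years (2000 is placed in the 1990-2000 bin) are hypothetical and not specified by the source.

```python
from collections import Counter, defaultdict


def period_of(year):
    # Assumption: boundary years fall into the earlier bin (e.g. 2000 -> "1990-2000").
    if year < 1990:
        return "before 1990"
    if year <= 2000:
        return "1990-2000"
    if year <= 2010:
        return "2000-2010"
    if year <= 2019:
        return "2011-2019"
    return "after 2019"


def opinion_shares(records):
    """Map each period to the share of answers assigned to each opinion.

    `records` is an iterable of (publication_year, opinion_label) pairs.
    """
    counts = defaultdict(Counter)
    for year, opinion in records:
        counts[period_of(year)][opinion] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {op: n / total for op, n in counter.items()}
    return shares


# Invented records for illustration only.
records = [(1985, 0), (2005, 1), (2005, 0), (2020, 1), (2021, 1), (2020, 0)]
shares = opinion_shares(records)
```

Comparing the resulting shares across periods shows how the relative weight of each opinion shifts over time.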
CONCLUSIONS
Compared with other methods for literature retrieval and answer generation, opinion aggregation in our method leads to more interpretable, robust, and comprehensive question-specific literature reviews. The results demonstrate the usefulness of the proposed method for answering COVID-19 related questions with the main opinions and for capturing research trends on COVID-19 and other relevant coronavirus strains in recent years.