Precision Assessment of COVID-19 Phenotypes Using Large-Scale Clinic Visit Audio Recordings: Harnessing the Power of the Patient Voice (Preprint)
The novel coronavirus (SARS-CoV-2) and its related disease, COVID-19, are spreading exponentially across the world, yet the clinical phenotype remains uncertain. Natural language processing (NLP) and machine learning may hold one key to quickly identifying individuals at high risk for COVID-19 and to understanding the key symptoms in its clinical manifestation and presentation. In healthcare, such data often come from the medical record, yet overburdened clinicians may focus on documenting widely reported symptoms that appear to confirm the diagnosis of COVID-19, at the expense of infrequently reported symptoms. A comprehensive record of the clinic visit is required, and an audio recording may be the answer. If collected at scale, a combination of data from the electronic health record (EHR) and recordings of clinic visits can be used to power NLP and machine learning models, quickly creating a clinical phenotype of COVID-19. We propose the creation of a pipeline that runs from the audio/video recording of clinic visits to a clinical symptomatology model and the prediction of COVID-19 infection. With vast amounts of data available, we believe a prediction model can be developed quickly that could promote the accurate screening of individuals at risk of COVID-19 and identify patient characteristics that predict a greater risk of more severe infection. If clinical encounters are recorded and our NLP is adequately refined, then benchtop virology will be better informed and the risk of spread reduced. While recordings of clinic visits are not a panacea for this pandemic, they are a low-cost option with many potential benefits that have only just begun to be explored.