A Framework for Using Machine Learning to Support Qualitative Data Coding
Open-ended survey questions yield qualitative data that are useful for many reasons. However, qualitative data analysis is labor intensive, and researchers often lack the time and resources it requires, which leads to underutilization of qualitative data. To address these issues, we turned to machine learning and recent advances in language models and transfer learning to assist in the qualitative coding of responses. We trained a machine learning model following the BERT architecture to predict thematic codes, which were then adjudicated by human coders. Results suggest that this is a promising approach that can support traditional coding methods and has the potential to alleviate some of the burden associated with qualitative data analysis.
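The predict-then-adjudicate workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the code names in `CODES`, the 0.5 threshold, the multi-label (one sigmoid per code) setup, and the hard-coded logits are all assumptions made for illustration. In practice the logits would come from the classification head of a fine-tuned BERT-style model.

```python
import math
from dataclasses import dataclass

# Hypothetical codebook; the actual thematic codes are not given in the text.
CODES = ["workload", "communication", "compensation"]


@dataclass
class Suggestion:
    response: str   # the open-ended survey response
    suggested: list  # codes the model proposes to the human coder
    scores: dict     # per-code probability, shown alongside the suggestion


def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def suggest_codes(response, logits, threshold=0.5):
    """Turn per-code logits into suggested codes (assumed multi-label setup)."""
    scores = {code: sigmoid(z) for code, z in zip(CODES, logits)}
    suggested = [code for code, p in scores.items() if p >= threshold]
    return Suggestion(response, suggested, scores)


def adjudicate(suggestion, human_codes):
    """The human coder's decision is final; the model only proposes."""
    return sorted(set(human_codes))


# Placeholder logits standing in for the output of a fine-tuned model.
s = suggest_codes("Too many meetings and not enough pay.", [2.0, -1.0, 0.3])
final = adjudicate(s, human_codes=["workload", "compensation"])
print(s.suggested, final)
```

The design point is that model output is treated as a suggestion with a visible score, and the stored code assignment is always the one the human coder confirms.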