Linking Individual-level Facebook Posts with Psychological and Health Data in an Epidemiologic Cohort: A Feasibility Study (Preprint)
BACKGROUND Psychological factors (e.g., depression, optimism) and related biological and behavioral responses are associated with numerous physical health outcomes. Most research in this area relies on self-reported assessments of psychological factors, which are difficult to scale because they can be expensive to administer and time-consuming to complete. Investigators are increasingly interested in using social media as a novel and convenient platform for obtaining information rapidly in large populations.
OBJECTIVE We evaluated the feasibility of obtaining Facebook data from a large ongoing cohort of midlife and older women. Such data could be used to assess psychological functioning efficiently and at low cost.
METHODS This protocol was conducted with participants in the Nurses’ Health Study II (NHSII), which began in 1989 and follows participants biennially. Facebook does not share data readily; therefore, we developed procedures to enable women to download their Facebook data and transfer it to the cohort servers for linkage with other study data they have provided. Because privacy is a critical concern when collecting individual-level data, we partnered with a third-party software developer, Digi.me, to enable participants to obtain their own Facebook data and send it securely to our research team. In 2020, we emailed invitations to a subset of the 18,519 NHSII participants (aged 56-73 years). Women were selected if they had reported on the 2017-2018 questionnaire that they regularly posted to Facebook and were still active cohort participants. Women who chose not to participate were offered an exit survey to gauge their reasons for non-participation.
RESULTS We invited 309 women to participate. Only 52 (16.8%) signed the consent form, and only 3 (1.0%) used the Digi.me app to download and transfer their Facebook data.
These low participation rates were observed despite modifications to our protocol between waves of recruitment: 1) excluding active healthcare workers, who might be less available to participate due to the pandemic; 2) developing a Frequently Asked Questions factsheet to provide more information about the protocol; and 3) simplifying the instructions for using the Digi.me app. On our exit survey, the most commonly reported reasons for not participating were concerns about data privacy and hesitation about sharing personal Facebook posts. The low participation rates suggest that obtaining individual-level Facebook data in a cohort of middle-aged and older women may be challenging.
CONCLUSIONS In this cohort of midlife and older women who had been actively participating for over three decades, we were largely unable to obtain permission to access individual-level data from participants’ Facebook accounts. Although we worked with a third party to customize an app with privacy safeguards, data privacy remained a key concern for these women. Future studies aiming to leverage individual-level social media data should explore alternative populations or means of sharing such data.