Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus (Preprint)

2021 ◽  
Author(s):  
Shinichi Matsuda ◽  
Takumi Ohtomo ◽  
Shiho Tomizawa ◽  
Yuki Miyano ◽  
Miwako Mogi ◽  
...  

BACKGROUND Gaining insights from patients that cannot be obtained from health care databases has become an important topic in pharmacovigilance.
OBJECTIVE Our objective was to demonstrate a use case in which patient-generated data were incorporated in pharmacovigilance to understand the epidemiology and burden of illness in Japanese patients with systemic lupus erythematosus.
METHODS We used data on systemic lupus erythematosus, an autoimmune disease that substantially impairs quality of life, from 2 independent data sets. To understand the disease’s epidemiology, we analyzed a Japanese health insurance claims database. To understand the disease’s burden, we analyzed text data collected from Japanese disease blogs (tōbyōki) written by patients with systemic lupus erythematosus. Natural language processing was applied to these texts to identify frequent patient-level complaints, and term frequency–inverse document frequency was used to explore patient burden during treatment. We explored health-related quality of life based on patient descriptions.
RESULTS We analyzed data from 4694 and 635 patients with systemic lupus erythematosus in the health insurance claims database and tōbyōki blogs, respectively. Based on health insurance claims data, the prevalence of systemic lupus erythematosus was 107.70 per 100,000 persons. Tōbyōki text data analysis showed that pain-related words (eg, pain, severe pain, arthralgia) became more important after starting treatment. We also found an increase in patients’ references to mobility and self-care over time, which indicated increased attention to physical disability due to disease progression.
CONCLUSIONS A classical medical database represents only a part of a patient's entire treatment experience, and analysis using solely such a database cannot represent patient-level symptoms or patient concerns about treatments. This study showed that analysis of tōbyōki blogs can provide added information on patient-level details, advancing patient-centric pharmacovigilance.
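
The term frequency–inverse document frequency weighting mentioned in the methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `tf_idf`, the toy English tokens, and the choice to pool posts into one "before treatment" and one "after treatment" document are all assumptions made for illustration. Japanese text has no whitespace word boundaries, so the tokens are assumed to come from a separate morphological analysis step that is not shown here.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses the plain textbook definition: relative term frequency
    multiplied by log(N / document frequency). Real toolkits often
    apply smoothing, so their numbers will differ.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tokens, pooled per treatment phase (not study data).
before = ["fatigue", "rash", "fever", "fatigue", "hospital"]
after = ["pain", "arthralgia", "pain", "fatigue", "hospital"]

scores_before, scores_after = tf_idf([before, after])

# Terms shared by both phases get weight 0; phase-specific terms stand out.
for term, weight in sorted(scores_after.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy setup, a term that appears in both phases receives zero weight, while a term that appears only after treatment starts is ranked by how often it is used, which is the sense in which a word can "become more important" between phases.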

10.2196/29238 ◽  
2021 ◽  
Vol 7 (6) ◽  
pp. e29238
Author(s):  
Shinichi Matsuda ◽  
Takumi Ohtomo ◽  
Shiho Tomizawa ◽  
Yuki Miyano ◽  
Miwako Mogi ◽  
...  



2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Michael Stucki ◽  
Janina Nemitz ◽  
Maria Trottmann ◽  
Simon Wieser

Abstract
Background Decomposing health care spending by disease, type of care, age, and sex can lead to a better understanding of the drivers of health care spending. However, the lack of diagnostic coding in outpatient care often precludes a decomposition by disease. Yet, health insurance claims data hold a variety of diagnostic clues that may be used to identify diseases.
Methods In this study, we decompose total outpatient care spending in Switzerland by age, sex, service type, and 42 exhaustive and mutually exclusive diseases according to the Global Burden of Disease classification. Using data from a large health insurance provider, we identify diseases based on diagnostic clues. These clues include type of medication, inpatient treatment, physician specialization, and disease-specific outpatient treatments and examinations. We determine disease-specific spending by direct (clues-based) and indirect (regression-based) spending assignment.
Results Our results suggest a high precision of disease identification for many diseases. Overall, 81% of outpatient spending can be assigned to diseases, mostly based on indirect assignment using regression. Outpatient spending is highest for musculoskeletal disorders (19.2%), followed by mental and substance use disorders (12.0%), sense organ diseases (8.7%), and cardiovascular diseases (8.6%). Neoplasms account for 7.3% of outpatient spending.
Conclusions Our study shows the potential of health insurance claims data in identifying diseases when no diagnostic coding is available. These disease-specific spending estimates may inform Swiss health policies in cost containment and priority setting.


2019 ◽  
Vol 51 (2) ◽  
pp. 327-334 ◽  
Author(s):  
Chirag M. Lakhani ◽  
Braden T. Tierney ◽  
Arjun K. Manrai ◽  
Jian Yang ◽  
Peter M. Visscher ◽  
...  

2020 ◽  
Vol 12 ◽  
pp. 1129-1138
Author(s):  
Amir Sarayani ◽  
Xi Wang ◽  
Thuy Nhu Thai ◽  
Yasser Albogami ◽  
Nakyung Jeon ◽  
...  
