Pattern discovery on Australian medical claim data - a systematic approach

2005 ◽  
Vol 17 (10) ◽  
pp. 1420-1435 ◽  
Author(s):  
A.C. Tsoi ◽  
S. Zhang ◽  
M. Hagenbuchner

BMJ Open ◽  
2019 ◽  
Vol 9 (12) ◽  
pp. e031422
Author(s):  
Yuya Tamaki ◽  
Kana Kazawa ◽  
Hirohito Watanabe ◽  
Tantut Susanto ◽  
Michiko Moriyama

Objective: We describe the characteristics of patients with high medical costs by matching specific annual medical examination results and medical claim data. Clarifying the relationships between examination items and high medical costs allows the screening of high-risk persons. Design: A cross-sectional study. Subjects: Subjects were persons insured by national health insurance in Hiroshima City, Hiroshima Prefecture, from April 2016 to March 2017. To identify true heart failure (HF) patients, the disease name listed in the medical claim data was compared with drugs prescribed for HF, with extraction of only subjects whose comparative data matched. Data collection and analysis: The specific health examination includes a questionnaire on areas such as lifestyle habits, anthropometry, blood pressure, blood tests and urine tests. The percentage of the total medical costs related to the medical care of subjects with HF was described using Pareto analysis. For specific health examination items, we compared the high-cost and low-cost groups. The normality and homoscedasticity of each variable were checked and Student’s t-tests and χ² tests were applied. Finally, multiple logistic regression analysis was used to detect factors in the health examination items related to high medical costs. Results: Pareto analysis showed that 80% of all medical costs were paid by 30% of the HF patient population. The fees for cardiovascular surgery accounted for 54% of the total surgical cost, 64% of which included preventable diseases. Levels of creatinine (Cr) and γ-glutamyl transpeptidase (γ-GTP) and a history of smoking were found to be related to high medical costs. Conclusion: Analysis of specific health examination results for HF patients revealed the association between high medical costs, γ-GTP, Cr, and smoking. These results can thus serve as a reference for screening persons at high risk of HF and help prevent the exacerbation of HF.
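The Pareto analysis mentioned in this abstract can be illustrated with a minimal sketch: sort patients by cost, accumulate from the most expensive downward, and report the share of patients needed to reach a target share of total cost. The function name `pareto_share` and the toy cost figures are assumptions made for illustration; they are not the study's data or its actual procedure.

```python
def pareto_share(costs, target=0.80):
    """Fraction of patients (highest cost first) needed to cover `target` of total cost."""
    ordered = sorted(costs, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, cost in enumerate(ordered, start=1):
        running += cost
        if running >= target * total:
            return count / len(ordered)
    return 1.0  # reached only for an empty or all-zero input


# Hypothetical annual costs for ten patients (arbitrary units).
toy_costs = [520, 300, 90, 40, 20, 10, 8, 6, 4, 2]
share_for_80 = pareto_share(toy_costs, target=0.80)
share_for_90 = pareto_share(toy_costs, target=0.90)
print(f"{share_for_80:.0%} of patients cover 80% of cost")
print(f"{share_for_90:.0%} of patients cover 90% of cost")
```

In this toy data the two most expensive patients already exceed 80% of the total, which is the kind of concentration a Pareto analysis is meant to expose.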


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0255863
Author(s):  
Hiroshi Watanabe ◽  
Kiyoteru Takenouchi ◽  
Michio Kimura

We studied the effectiveness of direct data collection from electronic medical records (EMR) for monitoring adverse drug events and for detecting already known adverse events. In this study, medical claim data and SS-MIX2 standardized storage data were used to identify four diseases (diabetes, dyslipidemia, hyperthyroidism, and acute renal failure), and the validity of the outcome definitions was evaluated by calculating positive predictive values (PPV). The maximum PPV for diabetes based on medical claim data was 40.7% and that based on prescription data from SS-MIX2 standardized storage was 44.7%. The PPV for dyslipidemia was 50% or higher under either of the conditions. The PPV for hyperthyroidism based on disease name data alone was 20–30%, but exceeded 60% when prescription data were included in the evaluation. Acute renal failure was evaluated using information from medical records in addition to the data. The PPV for acute renal failure based on the data of disease names and laboratory examination results was slightly higher at 53.7% and increased to 80–90% when patients who previously had a high serum creatinine (Cre) level were excluded. When defining a disease, it is important to include the condition specific to the disease; furthermore, it is very useful if laboratory examination results are also included. Therefore, the inclusion of laboratory examination results in the definitions, as in the present study, was considered very useful for the analysis of multi-center SS-MIX2 standardized storage data.
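For readers unfamiliar with the metric, PPV is the share of cases flagged by an outcome definition that turn out to be true cases: PPV = TP / (TP + FP). The sketch below shows the calculation on invented patient IDs; the function name and all ID sets are hypothetical, and the numbers bear no relation to the study's results.

```python
def positive_predictive_value(flagged, confirmed):
    """PPV = true positives / all cases flagged by the outcome definition."""
    flagged, confirmed = set(flagged), set(confirmed)
    if not flagged:
        raise ValueError("the definition flagged no cases")
    true_positives = len(flagged & confirmed)
    return true_positives / len(flagged)


# Hypothetical patient IDs confirmed as true cases by chart review.
confirmed_cases = {2, 4, 9}

# Definition A (assumed): disease name in the claim data only.
flagged_by_name = {1, 2, 3, 4, 5}
# Definition B (assumed): disease name plus a matching prescription.
flagged_by_name_and_rx = {2, 4, 5}

ppv_name = positive_predictive_value(flagged_by_name, confirmed_cases)
ppv_name_rx = positive_predictive_value(flagged_by_name_and_rx, confirmed_cases)
print(f"name only: {ppv_name:.1%}, name + prescription: {ppv_name_rx:.1%}")
```

A stricter definition flags fewer patients, so PPV can rise even when the number of true cases found stays the same.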


Author(s):  
Xiaowen Wang ◽  
Shanshan Yao ◽  
Mengying Wang ◽  
Guiying Cao ◽  
Zishuo Chen ◽  
...  

This study explored multimorbidity prevalence and patterns among middle-aged and older adults in China. Data on thirteen chronic diseases were collected from 2,097,150 participants aged over 45 years between January 1st 2011 and December 31st 2015 from Beijing Medical Claim Data for Employees. Association rule mining and hierarchical cluster analysis were applied to assess multimorbidity patterns. Multimorbidity prevalence was 51.6% and 81.3% in the middle-aged and older groups, respectively. The most prevalent disease pair was that of osteoarthritis and rheumatoid arthritis (OARA) with hypertension (HT) (middle-aged: 22.5%; older: 41.8%). Ischaemic heart disease (IHD), HT, and OARA constituted the most common triad combination (middle-aged: 11.0%; older: 31.2%). Among the middle-aged group, the strongest associations were found in a combination of cerebrovascular disease (CBD), OARA, and HT with IHD in males (lift = 3.49), and CBD, OARA, and COPD with IHD in females (lift = 3.24). Among older patients, glaucoma and cataracts in females (lift = 2.95), and IHD, OARA, and glaucoma combined with cataracts in males (lift = 2.45) were observed. Visual impairment clusters, a mixed cluster of OARA, IHD, COPD, and cardiometabolic clusters were detected. Multimorbidity is prevalent among middle-aged and older Chinese individuals. The observations of multimorbidity patterns have implications for improving preventive care and developing appropriate guidelines for morbidity treatment.
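The lift values quoted above come from association rule mining, where lift(A → B) = P(A and B) / (P(A) · P(B)); a value above 1 means the diseases co-occur more often than independence would predict. The following is a minimal sketch of that formula on invented patient records. The function name, the records, and the resulting values are assumptions for illustration only and do not reproduce the study's data.

```python
def lift(records, antecedent, consequent):
    """lift(A -> B) = P(A and B) / (P(A) * P(B)), with each record a set of diseases."""
    n = len(records)
    count_a = sum(1 for r in records if antecedent <= r)
    count_b = sum(1 for r in records if consequent <= r)
    count_ab = sum(1 for r in records if (antecedent | consequent) <= r)
    if count_a == 0 or count_b == 0:
        raise ValueError("antecedent or consequent never occurs")
    # Algebraically equal to (count_ab / n) / ((count_a / n) * (count_b / n)).
    return count_ab * n / (count_a * count_b)


# Hypothetical disease sets for eight patients.
toy_records = [
    {"HT", "IHD"},
    {"HT", "IHD", "OARA"},
    {"HT"},
    {"OARA"},
    {"glaucoma", "cataract"},
    {"cataract"},
    {"glaucoma", "cataract"},
    {"HT", "OARA"},
]

lift_ht_ihd = lift(toy_records, {"HT"}, {"IHD"})
lift_glaucoma_cataract = lift(toy_records, {"glaucoma"}, {"cataract"})
print(lift_ht_ihd, round(lift_glaucoma_cataract, 2))
```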


Author(s):  
Jun-Hui Wu ◽  
Yao Wu ◽  
Zi-Jing Wang ◽  
Yi-Qun Wu ◽  
Tao Wu ◽  
...  

We aimed to provide reliable regression estimates of expenditures associated with various complications in type 2 diabetics in China. In total, 1,859,039 type 2 diabetes patients with complications were obtained from the Beijing Medical Claim Data for Employees database from 2008 to 2016. We estimated costs for complications using a generalized estimating equation model adjusted for age, sex, and the incidence of various complications. The average total cost for diabetic patients with complications was 17.12 thousand RMB. Prescribed drugs accounted for 63.4% of costs. We observed a significant increase in costs in the first year after the onset of complications. Compared with costs before the incidence of complications, the additional costs per person in the first year and >1 year after the event would be 10,631.16 RMB and 1150.71 RMB for cardiovascular disease, 1017.62 RMB and 653.82 RMB for cerebrovascular disease, and 301.14 RMB and 624.00 RMB for kidney disease, respectively. The estimated coefficients for outpatient visits were relatively lower than those of inpatient visits. Complications in diabetics exert a significant impact on total healthcare costs in the first year of their onset and in subsequent years. Our estimates may assist policymakers in quantifying the economic burden of diabetes complications.


Author(s):  
Tian Bai ◽  
Brian L. Egleston ◽  
Richard Bleicher ◽  
Slobodan Vucetic

Representing words as low-dimensional vectors is very useful in many natural language processing tasks. This idea has been extended to the medical domain, where medical codes listed in medical claims are represented as vectors to facilitate exploratory analysis and predictive modeling. However, depending on the type of medical provider, medical claims can use medical codes from different ontologies or from a combination of ontologies, which complicates learning of the representations. To be able to properly utilize such multi-source medical claim data, we propose an approach that represents medical codes from different ontologies in the same vector space. We first modify the Pointwise Mutual Information (PMI) measure of similarity between the codes. We then develop a new negative sampling method for the word2vec model that implicitly factorizes the modified PMI matrix. The new approach was evaluated on the code cross-reference problem, which aims at identifying similar codes across different ontologies. In our experiments, we evaluated cross-referencing between ICD-9 and CPT medical code ontologies. Our results indicate that vector representations of codes learned by the proposed approach provide superior cross-referencing when compared to several existing approaches.
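As background for the PMI measure mentioned above, the sketch below computes the standard (unmodified) PMI between codes that co-occur in the same claim: PMI(a, b) = log(P(a, b) / (P(a) · P(b))). It does not implement the modified PMI or the negative sampling method the abstract proposes, whose details are not given here. The function name and the placeholder code labels (`icd_a`, `cpt_x`, and so on) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(claims):
    """Standard PMI for every pair of codes that co-occur in at least one claim."""
    n = len(claims)
    single = Counter()
    pair = Counter()
    for claim in claims:
        codes = sorted(set(claim))  # dedupe; sorting gives a stable pair order
        single.update(codes)
        pair.update(combinations(codes, 2))
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }


# Hypothetical claims mixing placeholder codes from two ontologies.
toy_claims = [
    ["icd_a", "cpt_x"],
    ["icd_a", "cpt_x"],
    ["icd_b", "cpt_y"],
    ["icd_b", "cpt_x"],
]

table = pmi_table(toy_claims)
for code_pair, value in sorted(table.items()):
    print(code_pair, round(value, 3))
```

Pairs with positive PMI co-occur more often than chance would suggest, which is the signal a cross-referencing method can exploit to link codes across ontologies.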


Author(s):  
Munkhzul Radnaabaatar ◽  
Young-Eun Kim ◽  
Dun-Sol Go ◽  
Yunsun Jung ◽  
Seok-Jun Yoon

Background: While measuring and monitoring disease morbidity, it is essential to focus on regions experiencing inequitable health outcomes, especially coastal populations. However, no research investigating population health outcomes in coastal areas has been conducted. Therefore, we aimed to investigate the burden of disease morbidity in coastal areas of South Korea. Methods: Using an administrative division map and ArcGIS, we identified and included 496 coastal districts. In this observational study, years lived with disability (YLDs) were estimated using incidence-based approaches to calculate the burden of disease in 2015. Incidence and prevalence cases were collected from National Health Insurance Service (NHIS) medical claim data using a specialized algorithm. Results: Age-standardized years lived with disability (ASYLDs) in the coastal areas were 24,398 per 100,000 population, which is greater than the 22,613 YLDs observed nationwide. In coastal areas, the burden of disease morbidity was higher in females than in males. Diabetes mellitus was the leading specific disease of total YLDs per 100,000 population, followed by low back pain, chronic obstructive pulmonary disease, osteoarthritis, and ischemic stroke. Conclusion: In this study, the coastal areas of South Korea carry a higher burden than the national population. Additionally, chronic diseases compose the majority of the health burden in coastal areas. Despite the limitation of data, YLD was the best tool available for evaluating the health outcomes in specific areas, and has the advantage of simplicity and timely analysis.


2021 ◽  
pp. 101053952110663
Author(s):  
Seong Woo Kim ◽  
Taemi Youk ◽  
Jiyong Kim

This study investigated the maternal and neonatal risk factors related to pregnancy and birth that affect the occurrence of neurodevelopmental disorders in children, using medical claim data for the whole population. The study was conducted on all the babies born in Korea from 2005 to 2009 based on data from the National Health Information Database. All birth records were tracked from birth to December 31, 2015. To analyze factors related to the mother, data related to the mother of the newborn were collected. Increased maternal age was found to increase the risk of cerebral palsy (adjusted odds ratio [aOR] = 1.46, 95% confidence interval [CI] [1.22, 1.75]) and autism spectrum disorder (aOR = 1.48, 95% CI [1.25, 1.76]), while lowering the risk of intellectual disability (aOR = 1.83, 95% CI [1.33, 2.53]) and speech and language impairment (aOR = 1.41, 95% CI [1.08, 1.83]) compared with the reference group aged 25 to 29 years old. The incidence affected by socioeconomic factors varied according to the types of disorders. Among various risk factors, prematurity or low birth weight, problems associated with amniotic fluid or amniotic membrane, preeclampsia or eclampsia, and cesarean section affect the incidence of neurodevelopmental disorders. To reduce the incidence or severity of neurodevelopmental disorders, a better understanding of the risk factors of neurodevelopmental disorders is important. The results of this study can be used as basic data to help such understanding.

