Electronic Medical Records for Discovery Research in Nonalcoholic Fatty Liver Disease

2019 ◽  
Author(s):  
Uri Kartoun ◽  
Rahul Aggarwal ◽  
Adam Perer ◽  
Yoonyoung Park ◽  
Ping Zhang ◽  
...  

Abstract Background: Nonalcoholic fatty liver disease (NAFLD) is a highly prevalent yet under-diagnosed and under-discussed disease. Given that NAFLD has not been explored sufficiently compared with other diseases, opportunities abound for scientists to discover new biomarkers (such as laboratory observations, current comorbidities, and behavioral descriptors) that can be linked to conditions and complications that may develop at a later stage of the patient’s life. Methods: We analyzed IBM Explorys, a repository that contains electronic medical records (EMRs) of more than 50 million individuals. We used a classification algorithm that members of our group have previously validated to identify patients with a high probability of NAFLD. The algorithm identified more than 80,000 patients with a high probability of NAFLD who had at least 5 years of follow-up. We applied standard statistical methods (such as logistic regression and bootstrapping) and used Clinical Classifications Software (CCS) definitions to identify associations between a variety of covariates and disease outcomes. Results: Our methodology identified several thousand strongly statistically significant associations between covariates and outcomes in NAFLD. Most of the associations are known, but others may be new and require further investigation in subsequent studies. Conclusions: A discovery mechanism composed of standard statistical methods applied to a large collection of EMRs confirmed known associations and identified potentially new associations that may act as biomarkers and merit further research.

2020 ◽  
Author(s):  
Uri Kartoun ◽  
Rahul Aggarwal ◽  
Adam Perer ◽  
Yoonyoung Park ◽  
Ping Zhang ◽  
...  

Abstract Background: Nonalcoholic fatty liver disease (NAFLD) is a highly prevalent yet under-diagnosed and under-discussed disease. Given that NAFLD has not been explored sufficiently compared with other diseases, opportunities abound for scientists to discover new biomarkers (such as laboratory observations, current comorbidities, and behavioral descriptors) that can be linked to conditions and complications that may develop at a later stage of the patient’s life. Methods: We analyzed IBM Explorys, a repository that contains electronic medical records (EMRs) of more than 60 million individuals. We used a classification algorithm that members of our group have previously validated to identify patients with a high probability of NAFLD. The algorithm identified more than 80,000 patients with a high probability of NAFLD who had at least 5 years of follow-up. We applied standard statistical methods (such as logistic regression and bootstrapping) and used Clinical Classifications Software (CCS) definitions to identify associations between a variety of covariates and disease outcomes. Results: Our methodology identified several thousand strongly statistically significant associations between covariates and outcomes in NAFLD. Most of the associations are known, but others may be new and require further investigation in subsequent studies. Conclusions: A discovery mechanism composed of standard statistical methods applied to a large collection of EMRs confirmed known associations and identified potentially new associations that may act as biomarkers and merit further research.


2020 ◽  
Author(s):  
Uri Kartoun ◽  
Rahul Aggarwal ◽  
Adam Perer ◽  
Yoonyoung Park ◽  
Ping Zhang ◽  
...  

Abstract Background: Nonalcoholic fatty liver disease (NAFLD) is a highly prevalent yet underdiagnosed and under-discussed disease. Given that NAFLD has not been explored sufficiently compared with other diseases, opportunities abound for scientists to discover new biomarkers (such as laboratory observations, current comorbidities, and behavioral descriptors) that can be linked to conditions and complications that may develop at a later stage of the patient’s life. Methods: We analyzed IBM Explorys, a repository that contains electronic medical records (EMRs) of more than 60 million individuals. We used a classification algorithm that members of our group have previously validated to identify patients with a high probability of NAFLD. The algorithm identified more than 80,000 patients with a high probability of NAFLD who had at least 5 years of follow-up. We followed an unbiased approach for prediction modeling and applied standard statistical methods (such as logistic regression and bootstrapping) as well as Clinical Classifications Software (CCS) definitions to identify associations between a variety of covariates and disease outcomes. Results: Our methodology identified several thousand strongly statistically significant associations between covariates and outcomes in NAFLD. Most of the associations are known, but others may be new and require further investigation in subsequent studies. Conclusions: A discovery mechanism composed of standard statistical methods and applied to a large collection of EMRs confirmed known associations and identified potentially new associations that may act as biomarkers and merit further research.
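The abstract names logistic regression and bootstrapping without giving details, so the following is only a minimal sketch of how one covariate-outcome association might be estimated with those two tools. It is not the authors' pipeline: the data are synthetic, the single binary covariate and outcome are hypothetical, and the function names (`fit_logistic`, `bootstrap_or_ci`, `simulate`) are invented for illustration. CCS grouping, covariate adjustment, and multiple-testing control are all omitted.

```python
import math
import random


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:        # singular information matrix: stop
            break
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step: (X^T W X)^-1 g
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def bootstrap_or_ci(x, y, replicates=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the odds ratio exp(b1)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        _, b1 = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
        estimates.append(math.exp(b1))
    estimates.sort()
    lo = estimates[int((alpha / 2) * replicates)]
    hi = estimates[int((1 - alpha / 2) * replicates) - 1]
    return lo, hi


def simulate(n=1000, seed=42):
    """Synthetic cohort (assumption): binary covariate, binary outcome."""
    rng = random.Random(seed)
    x = [1 if rng.random() < 0.4 else 0 for _ in range(n)]
    y = []
    for xi in x:
        p = 1.0 / (1.0 + math.exp(-(-1.5 + 0.8 * xi)))  # true log-OR = 0.8
        y.append(1 if rng.random() < p else 0)
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    _, b1 = fit_logistic(x, y)
    lo, hi = bootstrap_or_ci(x, y)
    print(f"odds ratio {math.exp(b1):.2f}, 95% bootstrap CI ({lo:.2f}, {hi:.2f})")
```

In a discovery setting like the one described, a fit of this kind would presumably be repeated for every covariate-outcome pair, which is why some correction for multiple comparisons would normally accompany it.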

