Advances in Medical Technologies and Clinical Practice - Clinical Data Mining for Physician Decision Making and Investigating Health Outcomes
Total documents: 17


Published by IGI Global
ISBN: 9781615209057, 9781615209064

Author(s): Patricia Cerrito, John Cerrito

Now that data for outcomes research are more readily available, along with the techniques to analyze them, we need to use these tools to investigate the total complexity of patient care. We should no longer rely upon basic tools while ignoring sequential treatments for patients with chronic diseases or the issue of patient compliance; we can start investigating treatments from birth to death. With these large datasets, it is no longer possible to rely on t-tests, chi-square statistics, and simple linear regression. Without the luxury of clinical trials that randomize patients into treatment versus control, there will always be confounding factors that should be considered in the data. In addition, large datasets almost guarantee that the p-value in a standard regression is statistically significant, so other measures of model adequacy must be used. If we do not start using outcomes data, we are missing crucial knowledge that could improve patient outcomes while simultaneously reducing the cost of care. If we continue to use inferential statistical methods that were not designed for large datasets, we will not extract the information that is readily available in outcomes datasets.
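The point about p-values can be made concrete with a small simulation. The sketch below uses entirely hypothetical data: two groups with a clinically negligible true difference in means still produce a t statistic far beyond conventional significance thresholds once the sample is large.

```python
import math
import random

# Hypothetical illustration: with very large samples, a clinically
# negligible mean difference still yields a "significant" t statistic.
rng = random.Random(0)
n = 100_000
group_a = [rng.gauss(0.00, 1.0) for _ in range(n)]   # control group
group_b = [rng.gauss(0.05, 1.0) for _ in range(n)]   # tiny true shift

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

diff = mean(group_b) - mean(group_a)
se = math.sqrt(var(group_a) / n + var(group_b) / n)  # Welch standard error
t = diff / se

print(f"difference = {diff:.3f}, t = {t:.1f}")  # t far exceeds 1.96
```

The observed difference is trivial in practical terms, yet the t statistic rejects the null hypothesis decisively, which is why measures of practical effect size and model adequacy must supplement p-values at this scale.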


Author(s): Patricia Cerrito, John Cerrito

In this chapter, we will briefly discuss two methods of ranking patient severity. The first is the AHRQ comorbidity measure, a collection of 30 patient conditions used to define an index equal to the number of comorbidities with which a patient has been diagnosed. We will also discuss the Charlson Index and compare it to the AHRQ comorbidities. Another measure we define in this chapter uses the patient diagnoses that have the highest mortality rates. As we will find, many of these diagnoses are related to serious, resistant infections that are generally not included in other indices, which tend to focus on chronic conditions. We will show that the apparent severity of a patient’s condition can be highly variable, depending upon the severity index that is used.
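The contrast between a count-based index and a weighted index can be sketched as follows. The condition lists and weights below are illustrative assumptions only, not the actual AHRQ or Charlson definitions, which span many more conditions.

```python
# Illustrative (NOT official) condition sets and weights for two index styles.
AHRQ_STYLE_COMORBIDITIES = {"chf", "copd", "diabetes", "renal_failure", "liver_disease"}
CHARLSON_STYLE_WEIGHTS = {"chf": 1, "copd": 1, "diabetes": 1,
                          "renal_failure": 2, "liver_disease": 3}

def comorbidity_count(diagnoses):
    """AHRQ-style index: the number of flagged comorbidities present."""
    return len(AHRQ_STYLE_COMORBIDITIES & set(diagnoses))

def charlson_score(diagnoses):
    """Charlson-style index: a weighted sum over flagged conditions."""
    return sum(CHARLSON_STYLE_WEIGHTS.get(d, 0) for d in set(diagnoses))

patient = ["diabetes", "liver_disease"]
print(comorbidity_count(patient), charlson_score(patient))  # 2 4
```

The same patient scores 2 on the count index but 4 on the weighted index, showing how severity rankings diverge depending on which index is chosen.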


Author(s): Patricia Cerrito, John Cerrito

Patient compliance with treatment is essential. However, it is difficult to examine the issue of compliance from claims and administrative databases that include no direct input from patients. In order to measure compliance, we now have to define a meaningful compliance score within the administrative database. One way of doing this is to investigate patient medication information. Patients with chronic diseases taking maintenance medications usually receive a 30-day or 90-day supply on a regular basis, as long as they are taking the medications at the required intervals. Therefore, one way we can examine the level of compliance is by measuring the time intervals between medication refills.
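One way the refill-interval idea can be sketched is shown below, using a hypothetical refill history. Both the dates and the medication-possession-ratio formula are illustrative assumptions, not a prescribed scoring method from a specific claims database.

```python
from datetime import date

# Hypothetical refill history for one patient on a 30-day maintenance supply.
refills = [date(2009, 1, 5), date(2009, 2, 4), date(2009, 3, 20), date(2009, 4, 19)]
days_supply = 30

# Gaps between successive fills; a gap well beyond the days supplied
# suggests the patient ran out of medication before refilling.
gaps = [(b - a).days for a, b in zip(refills, refills[1:])]

# A simple compliance score: total supply dispensed over the observed interval
# (a medication possession ratio), with the last fill's supply included.
observed_days = (refills[-1] - refills[0]).days + days_supply
mpr = days_supply * len(refills) / observed_days
print(gaps, round(mpr, 2))
```

Here the 44-day gap between the second and third fills flags a likely lapse in compliance even though the overall possession ratio remains fairly high.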


Author(s): Patricia Cerrito, John Cerrito

In the other type of health care database that we discuss in this chapter, there are multiple columns for each patient observation. This makes it more difficult both to find the most frequently occurring codes and to find patients with specific codes for the purpose of extraction. For this reason, many studies focus on the primary diagnosis or procedure only. We will provide the programming necessary to find the most frequent codes and to identify the patients who have a specific condition. Another aspect of preprocessing we will explore in this chapter, using the National Inpatient Sample, is propensity scoring. When it is not possible to perform a randomized, controlled trial, an attempt is made to emulate such a trial by comparing two observational subgroups. The two groups are matched on demographic factors and related patient conditions. It is possible to define a level of patient severity and then to match patients on that severity level as part of the propensity score.
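The two wide-format tasks, counting codes across all diagnosis columns and extracting patients with a given code, can be sketched as below. The records and codes are hypothetical, loosely modeled on the multiple DX columns of the National Inpatient Sample.

```python
from collections import Counter

# Hypothetical wide-format records: one row per stay, several diagnosis
# columns per row (as in NIS-style DX1..DXn fields); "" marks an empty slot.
rows = [
    {"id": 1, "dx": ["25000", "4280", ""]},
    {"id": 2, "dx": ["4280", "486", "25000"]},
    {"id": 3, "dx": ["486", "", ""]},
]

# Most frequent codes across ALL diagnosis columns, not just the primary.
counts = Counter(code for r in rows for code in r["dx"] if code)
print(counts.most_common(3))

# Extract the patients carrying a specific code anywhere in the record.
diabetics = [r["id"] for r in rows if "25000" in r["dx"]]
print(diabetics)  # [1, 2]
```

Scanning every diagnosis column rather than only the primary one is what distinguishes this from the simpler single-column analyses many studies settle for.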


Author(s): Patricia Cerrito, John Cerrito

Many of the datasets provided by the federal government have been well cleaned. However, like many other datasets collected for health outcomes research, these datasets contain errors and missing values. Missing values can generally be disregarded only if they represent less than 5% of the total values in the dataset; if the proportion is greater, they must be accommodated in some way, or they can bias the outcome of the study. Errors are more difficult to work with because it is not possible to know just what the proportion of errors actually is. In this chapter, we examine both errors and missing values and provide some techniques for investigating them.
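A first step is simply gauging the share of missing values per field before deciding how to accommodate them. The sketch below uses hypothetical field names and records, and shows one simple accommodation, median imputation, for a field whose missingness exceeds a chosen threshold.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 63,   "los": 4,    "charges": 12000},
    {"age": None, "los": 2,    "charges": None},
    {"age": 55,   "los": None, "charges": 9000},
    {"age": 71,   "los": 3,    "charges": 15000},
]

# Report the proportion missing in each field.
for field in ("age", "los", "charges"):
    missing = sum(1 for r in records if r[field] is None)
    print(f"{field}: {missing / len(records):.0%} missing")

# One simple accommodation: impute the field's median for missing ages.
ages = sorted(r["age"] for r in records if r["age"] is not None)
median_age = ages[len(ages) // 2]
for r in records:
    if r["age"] is None:
        r["age"] = median_age
```

More careful accommodations, such as modeling the missingness mechanism, may be warranted when the missing proportion is large; simple imputation is only a starting point.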


Author(s): Patricia Cerrito, John Cerrito

Each of the datasets has many different diagnosis and procedure codes to represent a patient’s condition. There are thousands of potential codes, and millions of potential combinations of codes. In order to use patient diagnosis and procedure information in statistical models, there has to be some method of compressing the codes, as there are far too many to include all of them. While such methods are discussed in detail in Cerrito (2009), they will be discussed briefly here. These codes are used in billing and administrative data to define patient conditions and treatments, and they are also used to define patient severity indices. Therefore, the ability to work with these codes is essential both to understanding existing severity indices and to defining new ones. The most difficult data are contained in claims databases, where different providers use different coding methods; the different codes must be reconciled in some manner.
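One common form of compression, assumed here as an illustration rather than taken from the text, is truncating ICD-9-CM codes to their category prefix so that thousands of detailed codes collapse into a few hundred groups.

```python
# Sketch of one simple compression: truncate an ICD-9-CM code to its
# category. Ordinary codes use a 3-character category; E-codes use 4.
def compress_icd9(code: str) -> str:
    """Map a full ICD-9-CM code (e.g. '25040') to its category ('250')."""
    return code[:4] if code.startswith("E") else code[:3]

print(compress_icd9("25040"))  # 250 (all diabetes subtypes collapse together)
print(compress_icd9("4280"))   # 428 (heart failure)
print(compress_icd9("E8800"))  # E880
```

Coarser groupings, such as the AHRQ Clinical Classifications Software categories, compress further still; the right level of compression depends on how much clinical detail the model needs to retain.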


Author(s): Patricia Cerrito, John Cerrito

We will begin with data from the Medical Expenditure Panel Survey (MEPS) and use it throughout the text. This dataset has been provided since 1996 and contains yearly information concerning every interaction with the healthcare system for a cohort of approximately 30,000 patients and 11,000 households. Each household is included in the survey for a two-year period. The dataset contains every inpatient and outpatient event, all physician visits, medications, and lab orders for every member of this cohort. It is usually two years behind, so that in 2008, medication information concerning Medicare Part D from 2006 first became available for analysis. Because of patient privacy, patient treatment and diagnosis information is incomplete. However, the database contains very complete information about reimbursements from private insurers, government agencies, and individual patients. Therefore, it can be used to determine healthcare expenditures by individuals and households.


Author(s): Patricia Cerrito, John Cerrito

In this book, we provide the tools needed to investigate administrative and clinical databases that are routinely collected in the support of patient treatment. Often, these databases are large and require non-traditional methodology to investigate. In addition, because they are collected for purposes other than research, considerable preprocessing is required before the data can be analyzed to find results that can improve the quality of patient care. Therefore, we will show by example how to preprocess the data, and how non-traditional statistical methods can be used to investigate the data and extract meaning from the databases. We will provide the programming code necessary to complete the preprocessing, and we will discuss the type of preprocessing required by each statistical method and data mining technique.


Author(s): Patricia Cerrito, John Cerrito

We want to examine the treatment of patients with diabetes, and the reasons these patients are in the hospital. In order to do this, we must consider a cohort of patients who have diabetes and who are treated with medication and with insulin. We also need to know the extent to which compliance with monitoring is related to disease progression. Do patients with organ failure have greater or lesser compliance with monitoring? There is so much involved in the treatment of diabetes that it may be difficult to investigate all aspects of the treatment in one analysis. Therefore, at some point, it becomes important to focus on one aspect of the disease; then a second aspect can be added, followed by a third or fourth.
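The cohort-definition step can be sketched as a simple filter. The records, field names, and the ICD-9 diabetes prefix used below are illustrative assumptions about how such a dataset might look.

```python
# Hypothetical patient records with diagnosis codes and medication lists.
patients = [
    {"id": 1, "dx": ["25000"], "meds": ["insulin"]},
    {"id": 2, "dx": ["25000"], "meds": ["metformin"]},
    {"id": 3, "dx": ["4280"],  "meds": []},
]

# Focus on one aspect first: patients with a diabetes diagnosis
# (ICD-9 category 250) who are treated with insulin.
cohort = [p["id"] for p in patients
          if any(d.startswith("250") for d in p["dx"]) and "insulin" in p["meds"]]
print(cohort)  # [1]
```

Further aspects, such as monitoring compliance or organ failure, can then be layered onto this cohort one at a time rather than attempted all at once.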


Author(s): Patricia Cerrito, John Cerrito

Decision trees are developed to support physicians who must make treatment decisions. Risk estimates are used to find the optimal treatment pathway for a group of patients. Unfortunately, decision trees are often developed in the absence of empirical evidence concerning risk; in particular, long-term risk is almost always unknown. Instead, physician panels are convened, or physician groups are surveyed, to give estimates of risk. However, the outcomes databases discussed in this text can be used to investigate risk and the relationship of treatment to outcomes. This relationship can be translated into risk percentages, and that risk used to develop decision trees. Risk versus benefit can then be used to find the optimal treatment. However, patient benefit is subjective; pain, especially, is very subjective. Is a patient better off having surgery to relieve pain, or continuously taking pain medication? There are attempts to define patient benefit as a function of the patient’s utility. To save costs, should treatment be denied if it fails to increase a patient’s utility? Who should decide a patient’s utility? Often, the patient has little input into the very definition of utility that is used to deny treatment.
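Once empirical risk percentages are available, choosing a pathway in a decision tree reduces to comparing expected utilities. The probabilities and utility values below are entirely hypothetical, and the choice of a utility scale is exactly the subjective step the passage questions.

```python
# Hypothetical decision tree: each pathway has (probability, utility)
# outcome pairs, with probabilities ideally drawn from outcomes data.
branches = {
    "surgery":    [(0.70, 0.9), (0.25, 0.5), (0.05, 0.0)],
    "medication": [(0.50, 0.7), (0.50, 0.4)],
}

def expected_utility(outcomes):
    """Probability-weighted utility of one treatment pathway."""
    return sum(p * u for p, u in outcomes)

for treatment, outcomes in branches.items():
    print(treatment, round(expected_utility(outcomes), 3))

best = max(branches, key=lambda t: expected_utility(branches[t]))
print("optimal pathway:", best)
```

The arithmetic is trivial; the contested part is who assigns the utility numbers, which is why the passage stresses that patients often have little input into them.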

