Bias in claims data with respect to disease status is a problem for health services researchers, because we often rely on administrative claims (billing data) to measure disease status for large cohorts. Misclassification bias may alter the estimated prevalence of given conditions, which is especially problematic for epidemiology and comparative effectiveness research. It may even alter the apparent relationships between diseases and outcomes, biasing estimates toward or away from the null hypothesis. In a new study published ahead of print in Medical Care, Dr. Carl van Walraven, of the Ottawa Hospital Research Institute, tested two methods of addressing misclassification bias in claims data and found that bootstrap imputation (BI) substantially outperformed quantitative bias analysis (QBA).
Dr. van Walraven used serum creatinine measurements for a sample of hospital patients admitted between 2002 and 2008 to determine whether they had severe renal failure during their stay. This allowed the other methods for determining disease status (ICD codes, QBA, and BI) to be tested against a clinical standard. Both the prevalence of severe renal failure and its association with patient-level covariates were evaluated.
Using International Classification of Diseases, or ICD, codes is a standard method to determine disease status in health services research. Codes have varying sensitivity and specificity when compared to clinical indicators.
Quantitative bias analysis, or QBA, uses previously estimated sensitivity and specificity of the ICD codes to remove misclassification bias by back-calculating how many patients would fall in each disease-covariate category in the absence of bias. QBA only removes bias if those parameters are perfectly accurate, and it produces invalid results (negative patient counts) some of the time.
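To make the mechanics concrete, here is a minimal sketch of the core QBA back-calculation for a binary disease indicator. The sensitivity and specificity values are illustrative assumptions, not figures from the study:

```python
# A minimal sketch of the QBA back-calculation for a misclassified binary
# disease indicator, assuming nondifferential misclassification.
# All numbers below are illustrative, not taken from the van Walraven study.

def qba_corrected_count(n_coded_positive: int, n_total: int,
                        sensitivity: float, specificity: float) -> float:
    """Back-calculate the 'true' number of diseased patients from the
    ICD-coded count, given assumed sensitivity (Se) and specificity (Sp).

    Solves: coded_pos = true_pos * Se + (n_total - true_pos) * (1 - Sp)
    """
    denominator = sensitivity + specificity - 1.0
    true_pos = (n_coded_positive - n_total * (1.0 - specificity)) / denominator
    return true_pos  # can go negative when Se/Sp are misspecified

# Example: 120 of 1,000 patients carry the ICD code; assume Se=0.60, Sp=0.95.
print(qba_corrected_count(120, 1000, sensitivity=0.60, specificity=0.95))
# -> ~127.3 corrected cases

# A modestly wrong specificity guess produces an invalid (negative) count:
print(qba_corrected_count(120, 1000, sensitivity=0.60, specificity=0.85))
# -> ~-66.7, the failure mode described above
```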
Bootstrap imputation (BI) consists of drawing many samples from the original cohort and imputing disease status for each patient based on a disease probability estimate (see the sketch after this list). There are a few different ways to estimate the disease probability under BI:
- The Positive Predictive Model returns the probability of severe renal failure as a function of the model’s covariates (all available from hospital discharge data) in patients coded (ICD) with the disease.
- The Severe Renal Failure model predicts the probability of severe renal failure in all patients, also using hospital discharge data, with a model validated in an earlier study.
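As a rough illustration of the BI procedure itself, independent of which model supplies the probabilities, here is a minimal sketch for estimating prevalence. The probability array and all parameter values are hypothetical stand-ins for real model predictions, not the study's actual models:

```python
# A minimal sketch of bootstrap imputation (BI), assuming each patient has a
# precomputed disease probability (e.g., from a predictive model like the
# Severe Renal Failure model). `disease_prob` is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_imputation(disease_prob: np.ndarray, n_boot: int = 1000):
    """For each bootstrap replicate: resample patients with replacement,
    impute each patient's disease status as a Bernoulli draw from their
    predicted probability, and record the prevalence estimate."""
    n = len(disease_prob)
    prevalences = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)             # resample with replacement
        imputed = rng.random(n) < disease_prob[idx]  # impute disease status
        prevalences[b] = imputed.mean()
    # Pool across replicates: point estimate plus percentile 95% interval
    return prevalences.mean(), np.percentile(prevalences, [2.5, 97.5])

# Example with made-up probabilities for 5,000 patients:
probs = rng.beta(1, 12, size=5000)  # skewed toward low disease probability
point, ci = bootstrap_imputation(probs)
print(f"Prevalence: {point:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The same loop extends to associations: within each replicate, one would regress outcomes on the imputed disease status and covariates, then pool the coefficients across replicates.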
The median prevalence of severe renal failure across 86 patient strata ranged from roughly 7.3% to 8.7% depending on the method, with BI closest to the clinical gold standard. Looking at the percentage of estimates falling within the 95% confidence interval of the true value shows an even clearer picture:
- ICD codes: 13.8% (IQR 7.4%-23.1%)
- ICD codes adjusted with QBA: 14.1% (IQR 7.3%-23.8%)
- BI:
  - Using unadjusted predictive values: 59.3% (IQR 48.2%-69.8%)
  - Using the Positive Predictive Value model: 64.0% (IQR 52.9%-74.0%)
  - Using the Severe Renal Failure model: 94.2% (IQR 87.0%-98.1%)
QBA was hardly better than the ICD codes alone and produced invalid estimates in some cases. BI performed much better, though it is potentially more difficult to implement.
The same general pattern held true when examining the association of severe renal failure with covariates, with one notable exception: QBA actually increased bias compared to using ICD codes alone. Using ICD codes moved the odds ratios away from the null in most cases. BI models using unadjusted predictive values and the Positive Predictive Value model performed better than QBA, but not better than ICD codes. Using the BI Severe Renal Failure model produced the best results.
These results give researchers more information about the use of administrative data when studying severe renal failure and should at least provide food for thought when studying other conditions in the hospital setting. However, we do not know whether these results hold for other codes and conditions, and more work would be helpful to clarify this point.
The drive toward Big Data includes leveraging existing healthcare data, such as claims and EHRs, and should sharpen our attention to validity in order to avoid biased analyses. Testing different methods of correction and deepening our understanding of how bias may alter results is critically important.