New methods in risk modeling: does adding EHR data improve predictions?

By Lisa M. Lines | July 20, 2017

One of the challenges in delivering efficient medical care is identifying people who are at risk of a negative outcome, so we can focus our efforts on screening and treating those at elevated risk. We do this in individual face-to-face encounters through clinical, diagnostic processes: taking a patient’s history, performing a physical examination, recording signs and symptoms. Across populations, we do it by using data collected in these encounters over time to develop algorithms and predictive statistical models. For me personally, these risk stratification, prediction, and adjustment models are some of the most interesting tools used in health services research.

Over the past decade or more of transitioning to electronic health records (EHRs) in the US, one of the biggest promises for research has been the idea of using the rich, clinical detail available from EHRs to enhance the standard claims and administrative data we’ve traditionally used to build risk models. In fact, we were lucky enough to be able to construct a linked EHR-claims database for one of my dissertation papers (co-authored with my advisor, Arlene Ash, and another member of my committee), published this year, in which we predicted emergency department visits.

And that brings us to a new article, published in the August issue of Medical Care: Comparing Population-based Risk-stratification Model Performance Using Demographic, Diagnosis and Medication Data Extracted from Outpatient Electronic Health Records versus Administrative Claims.

In this paper, a team from Johns Hopkins (first author: Hadi Kharrazi, MD, PhD) evaluates the possibility of using EHR data in addition to (or instead of) administrative claims for risk stratification. They sought to predict two different outcomes: hospitalization (excluding childbirth-related stays) and being in the top 1% of costs. They studied a sample of 85,581 individuals (all under age 65), continuously enrolled in both 2011 and 2012, who visited a primary care clinic associated with HealthPartners (a Bloomington, MN integrated delivery network) at least once in at least 1 year of the study period. The authors used the Johns Hopkins Adjusted Clinical Group (ACG) system, which has been validated for risk stratification.

They noted that about 46% of diagnoses were listed only in claims, while about 7% were listed only in the EHR. The overlap between claims and EHR data, regarding reported chronic conditions, was about 58%. Combining EHR and claims data:

  • increased identification of cancer by 12%
  • increased identification of diabetes by 10%
  • increased identification of hypertension by 3%
  • increased identification of depression by 3%
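To make the claims-versus-EHR comparison concrete, here is a minimal sketch of how one might tabulate where each diagnosis appears. It is not the authors' code: the function name `source_overlap`, the (patient, condition) pair layout, and the toy records are all assumptions made for illustration, and the paper may well use different denominators for its percentages.

```python
def source_overlap(claims, ehr):
    """Share of (patient_id, condition) pairs found only in claims,
    only in the EHR, or in both, out of all pairs seen in either source."""
    claims, ehr = set(claims), set(ehr)
    union = claims | ehr
    if not union:
        return {"claims_only": 0.0, "ehr_only": 0.0, "both": 0.0}
    return {
        "claims_only": len(claims - ehr) / len(union),
        "ehr_only": len(ehr - claims) / len(union),
        "both": len(claims & ehr) / len(union),
    }


# Toy, made-up records (hypothetical patients, not study data).
claims = [("p1", "diabetes"), ("p1", "hypertension"),
          ("p2", "depression"), ("p3", "cancer")]
ehr = [("p1", "diabetes"), ("p2", "depression"), ("p4", "hypertension")]

shares = source_overlap(claims, ehr)
for source, share in shares.items():
    print(f"{source}: {share:.0%}")
```

In this toy example, combining the two sources identifies one additional hypertension case (patient p4) that claims alone would have missed, which is the kind of gain the bullets above describe.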

Turning to the accuracy of the models, assessed using R² and area under the receiver operating characteristic curve (AUC):

  • When predicting cost, adding EHR to claims data actually lowered the R² by a small amount.
  • For concurrent prediction (this year’s outcomes predicted using this year’s data), using both EHR and claims data slightly increased the AUCs for both cost and hospitalization.
    • The AUC did not change or decreased when predicting prospective outcomes (next year’s outcomes predicted using this year’s data).
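For readers less familiar with the two accuracy measures, the sketch below shows how each is computed from scratch. It is only an illustration under assumed inputs: the function names and the toy numbers are hypothetical and unrelated to the study's data. The same functions apply to concurrent or prospective prediction; the only difference is which year's outcome is supplied as the target.

```python
def r_squared(actual, predicted):
    """Proportion of variance in a continuous outcome (e.g., cost)
    explained by the predictions: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def auc(labels, scores):
    """Probability that a randomly chosen positive case (label 1) gets a
    higher risk score than a randomly chosen negative case (label 0);
    ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = hospitalized, 0 = not hospitalized (made-up values).
hospitalized = [0, 0, 1, 1]
risk_score = [0.1, 0.4, 0.35, 0.8]
print(auc(hospitalized, risk_score))

# Toy example: actual vs. predicted annual cost (made-up values).
actual_cost = [1000, 2000, 3000, 4000]
predicted_cost = [1500, 1500, 3500, 3500]
print(r_squared(actual_cost, predicted_cost))
```

An AUC of 0.5 means the model ranks cases no better than chance and 1.0 means perfect ranking, so the small changes reported above are differences in how well each data source orders people by risk.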

The authors conclude that, while risk stratification using EHR data was feasible, it was less accurate than claims-based models in predicting hospitalization and high costs, although it increased the ability to identify some important conditions.

Their findings suggest that EHRs, at least in this study, contain more outdated and/or inaccurate data than do claims. Additionally, even in an integrated delivery network, the available claims data could have been incomplete (especially for mental health utilization). The two sources also measure different things for medications: EHR-derived medication data represent prescriptions written, while claims represent actual medication fills. Mismatches could be related to nonadherence or to individuals paying out of pocket (for example, using online pharmacies) instead of paying for medications with their insurance.

As usual, more research is needed to understand how best to deliver on the promise of EHR data’s usefulness.

Lisa M. Lines

Senior health services researcher at RTI International
Lisa M. Lines, PhD, MPH is a senior health services researcher at RTI International, an independent, non-profit research institute. She is also an Assistant Professor in Population and Quantitative Health Sciences at the University of Massachusetts Chan Medical School. Her research focuses on social drivers of health, quality of care, care experiences, and health outcomes, particularly among people with chronic or serious illnesses. She is co-editor of TheMedicalCareBlog.com and serves on the Medical Care Editorial Board. She served as chair of the APHA Medical Care Section's Health Equity Committee from 2014 to 2023. Views expressed are the author's and do not necessarily reflect those of RTI or UMass Chan Medical School.