Predictive models need monitoring to avoid dataset shifts
As artificial intelligence (AI) moves deeper into healthcare's mainstream, organizations are discovering how to fine-tune and maintain it. Most healthcare AI systems use machine-learning algorithms to identify key patterns in data. However, when there's a "dataset shift" — a mismatch between the data a system was developed on and the actual clinical data it encounters after deployment — machine-learning systems can underperform.
That's what came to light in a University of Michigan (UM) study using data from nearly 30,000 patients covering almost 40,000 hospitalizations in 2018 and 2019. The UM researchers found that a prediction algorithm missed two-thirds of sepsis cases, rarely found cases that medical staff had not flagged, and frequently issued false alarms. The study authors noted that shifts in patient demographics associated with the coronavirus pandemic had caused a fundamental dataset shift in the relationship between fevers and bacterial sepsis. As a result, by April 2020, UM's clinical AI governing committee had deactivated the sepsis model due to false alerting.
At the same time, the model's developer, an electronic health record (EHR) system vendor, contended that the UM study set a low threshold for sepsis alerts, which triggered a high number of false positives. Additionally, a similar study by Prisma Health of South Carolina (although based on a smaller sample of 11,500 patients) found success with the same commercial algorithm, achieving a 4 percent reduction in sepsis patient mortality.
Recognizing and preventing dataset shift
The UM study authors characterized their experience as an “extreme example” and noted that “many causes of dataset shift are more subtle.” Nonetheless, they cited best practices for recognizing and mitigating dataset shift, including setting up a governance committee with multidisciplinary AI expertise and experience in clinical usage, continuously monitoring for AI malfunction, and empowering frontline staff to communicate scenarios of concern for dataset shift.
Further, the authors recommended the following actions, depending on the category of potential dataset shift:
- Changes in technology. Have you installed new software or infrastructure into your EHR system on which the predictive model relies? Routine EHR updates may alter variable definitions that, in turn, may change definitions of predictors and lead to incorrect model predictions. Your governance committee should review variable mapping for predictive models before deploying new EHR platforms. After deployment, rigorously monitor for statistical changes in the inputs to or outputs of predictive models.
- Changes in population and setting. Has seasonality affected the clinical application of a predictive model? Relying too much or too little on seasonal trends for diseases such as influenza can result in model errors. Frontline clinicians should be trained to flag models that appear to over- or underpredict during specific seasons. You may need to deploy distinct predictive models at different times of year.
- Changes in behavior. Has your organization made changes in clinical practice that influence the data on which the model operates? An example would be adopting new order sets or changes in their timing. Coordinate with your health system leadership and clinical departments to call out major institutional changes in practice patterns. When needed, retrain or redefine predictors in light of new practices.
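The continuous monitoring the authors call for — watching for statistical changes in a model's inputs and outputs — can be automated. As a minimal sketch (not part of the UM study, and with illustrative variable names), the snippet below computes the population stability index (PSI), a common drift metric that compares the distribution of one model input at training time against its live distribution after deployment:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference (training-time) sample and a live
    (post-deployment) sample of one model input or output.
    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 major shift worth escalating to a governance committee."""
    # Bin edges taken from the reference distribution's quantiles
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    # Clip live values into the reference range so every value is binned
    actual = np.clip(actual, edges[0], edges[-1])
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Small floor avoids log/divide-by-zero in empty bins
    exp_pct = np.clip(exp_pct, 1e-6, None)
    act_pct = np.clip(act_pct, 1e-6, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

# Hypothetical example: body temperatures seen at training time vs. a
# shifted distribution seen after deployment
rng = np.random.default_rng(0)
train_temps = rng.normal(37.0, 0.6, 5000)
live_temps = rng.normal(37.4, 0.9, 5000)
print(population_stability_index(train_temps, live_temps))
```

A check like this, run on a schedule for each key predictor and for the model's output scores, gives a governance committee a concrete alerting signal rather than relying solely on frontline staff to notice degraded predictions.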
The future will bring an increase in independent evaluations of proprietary predictive models. Expect model design to use more inclusive data sets with careful attention paid to data accuracy.
Frank Irving is a Philadelphia-based content writer and communications consultant specializing in healthcare, technology and sports.