Healthcare leaders often evaluate a clinical model by looking at its performance once it goes live. A better question is: what is the model built on? Every model descends from the data it sits on, and that data was captured for billing, or in clinical notes, long before anyone intended it to train an algorithm. What is the cost of ignoring that legacy? In a 2019 article in the journal Science, a team of researchers showed that a population health algorithm applied to millions of patients in a given year underestimated the needs of Black patients. The algorithm had learned from spending data, which reflected access to care more than it reflected sickness. The model did exactly what it was designed to do, but it was given a misleading picture of need, and no one in the entire process asked whether the data meant what the model was going to assume.
Models Are the Last Step in a Long Process
The decisions that shape the data are made long before the model arrives. Someone decided how a diagnosis gets coded. Another team decided how often the data gets updated. The model learns from the results of those decisions and treats them as truth. If those decisions are sound and consistent, the model has a fair chance. If they are inconsistent, the model carries the inconsistencies forward and presents them to a clinician as an equally inconsistent clinical signal.
Healthcare data is often messy. Records sit in multiple disparate systems, and the same patient may carry a different identifier in each. The Office of the National Coordinator for Health Information Technology (ONC), for example, has developed standards such as the United States Core Data for Interoperability to help close the gaps left by siloed records. Even with the best interoperability efforts, a model built on well-organized data may not work when exposed to the less organized data of another system.
What the Evidence Says About the Data Beneath AI
The Food and Drug Administration (FDA) has captured some of these ideas in its Good Machine Learning Practice principles, in which data provenance and data quality guide the safe development of models. Under these principles, developers should know where their training data comes from and how the data will be monitored after deployment. A health system that cannot answer these questions about its own data is building on ground it has not surveyed.
Research from the Agency for Healthcare Research and Quality (AHRQ) concludes that measurement built on incomplete or skewed data produces findings that look authoritative without being so. The same holds for any model trained on data that was captured for billing rather than for prediction. The Coalition for Health AI (CHAI), whose members include provider systems and technology companies, treats data quality assurance as a prerequisite for trustworthy AI.
In its work on the learning health system (LHS), the National Academy of Medicine describes data as a shared institutional asset. Because an LHS relies on data that is trusted across many contexts, data that is collected for one model and then left to “age” cannot serve it. Data that is “owned” by everyone, and therefore by no one, produces so many blind spots that any model built on it is of little use.
Demographic data shows the problem plainly. If race and ethnicity are recorded one way in the emergency department and another way in the clinic, any model that uses this data to assess whether it performs equitably is looking through a warped lens. ONC has created data standards to address this issue. However, a standard does little to remove inconsistency that is introduced at the point of collection.
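One way to make the collection-level gap visible is to measure how many recorded values fail to map to a single shared category set. The sketch below is illustrative only: the category names, local codes, and sample records are made up for this example and are not an ONC value set, and a real check would run against the organization's own mapping.

```python
from collections import Counter

# Hypothetical mapping from locally recorded values to one shared category set.
# These categories and local codes are illustrative, not an official standard.
SHARED_CATEGORIES = {
    "b": "Black or African American",
    "black": "Black or African American",
    "black or african american": "Black or African American",
    "w": "White",
    "white": "White",
    "asian": "Asian",
    "declined": "Declined to answer",
}


def harmonize(values):
    """Map raw recorded values to shared categories; collect what cannot be mapped."""
    mapped = Counter()
    unmapped = Counter()
    for raw in values:
        key = raw.strip().lower()
        if key in SHARED_CATEGORIES:
            mapped[SHARED_CATEGORIES[key]] += 1
        else:
            unmapped[raw] += 1
    return mapped, unmapped


def unmapped_rate(values):
    """Share of records whose recorded value has no shared category."""
    _, unmapped = harmonize(values)
    return sum(unmapped.values()) / len(values) if values else 0.0


# Made-up records from two collection points.
ed_records = ["B", "W", "Unknown", "W", ""]
clinic_records = ["Black or African American", "White", "Asian", "White"]

print("ED unmapped rate:", unmapped_rate(ed_records))
print("Clinic unmapped rate:", unmapped_rate(clinic_records))
```

A large difference in the unmapped rate between two collection points is a sign that an equity assessment built on the combined data would be comparing unlike things.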
Someone Must Own the Data
A new platform will not solve the problem. Leading health systems are not simply handing data over to clinical models. They have started naming data stewards. A data steward is, for example, the person who is told when the lab system is upgraded and who decides whether a clinical model that depends on that system stays in operation.
Data stewardship has a definition. It is the act of assigning ownership and accountability to data elements. Most health systems have a policy stating that data stewardship should take place. Very few have given a named data steward the authority to govern the data that supports a model.
Ownership must be accompanied by active stewardship. A data steward regularly reviews the data that a clinical model relies on and documents clinician feedback. ECRI, formerly the Emergency Care Research Institute, has named data integrity as a hazard that persists after a system goes live, when changes made after deployment can affect the data.
Maturity models reveal unexplored gaps. The Healthcare Information and Management Systems Society (HIMSS) published a study on analytics maturity. It described a ladder that runs from decentralized departmental data, through progressive standardization, to data that is consolidated enterprise-wide and can be trusted for organizational decision making. Most systems sit lower on that ladder than their expectations for AI assume, and the distance between the two is where systems fail. A model that tries to jump several rungs will expose every rung the organization skipped.
What’s in Store for Health Systems
Health systems will acquire more clinical models in the next two years. Each model will carry the same unspoken assumption: that the data it uses is stable and understood. The more important question is not which model to acquire, but whether the organization knows the condition of its data well enough to support any model. Most leaders can name the model they are about to acquire. Very few can describe the state of the underlying data.
The meaningful work lacks glamour. Health systems that do it well appoint a data owner for each clinical model, and that owner has the authority to pause the model if the data feed changes in a meaningful way. Data quality is monitored continuously rather than reviewed retrospectively, after a patient has been harmed. A health system that works this way will get far more out of mundane models than a competitor that purchased the most sophisticated model available and let its data wither.
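The idea of a data owner who can pause a model when the feed changes can be sketched as a simple check that compares a current batch of records against a trusted baseline. Everything here is an assumption made for illustration: the function names, the `unit` field, the sample records, and the 10-point threshold on missing values are hypothetical, and a real check would be designed with the data owner and the clinicians who use the model.

```python
def missing_rate(records, field):
    """Fraction of records where the field is absent or empty."""
    if not records:
        return 1.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def check_feed(baseline, current, field, max_missing_increase=0.10):
    """Return the reasons, if any, that this field needs the data owner's review."""
    reasons = []

    # Reason 1: the field is missing noticeably more often than in the baseline.
    increase = missing_rate(current, field) - missing_rate(baseline, field)
    if increase > max_missing_increase:
        reasons.append(f"{field}: missing rate rose by {increase:.0%}")

    # Reason 2: values appear that the baseline never contained,
    # for example a lab system upgrade that changes reporting units.
    known = {r[field] for r in baseline if r.get(field) not in (None, "")}
    seen = {r[field] for r in current if r.get(field) not in (None, "")}
    new_values = seen - known
    if new_values:
        reasons.append(f"{field}: unseen values {sorted(new_values)}")

    return reasons


def should_pause(baseline, current, fields):
    """Recommend a pause if any monitored field has changed."""
    reasons = []
    for field in fields:
        reasons.extend(check_feed(baseline, current, field))
    return bool(reasons), reasons


# Made-up example: the lab feed starts sending a different unit and some blanks.
baseline = [{"unit": "mg/dL"}] * 4
current = [{"unit": "mmol/L"}, {"unit": "mmol/L"}, {"unit": None}, {"unit": "mg/dL"}]

paused, reasons = should_pause(baseline, current, ["unit"])
print(paused, reasons)
```

The check only produces a recommendation with reasons attached. The decision to pause stays with a named person, which is the point of assigning an owner.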
Context and Sources
This edition draws on the 2019 study in the journal Science on racial bias in a population health algorithm, the Good Machine Learning Practice principles from the Food and Drug Administration, the interoperability standards work of the Office of the National Coordinator for Health Information Technology, data quality findings from the Agency for Healthcare Research and Quality, the data quality assurance position of the Coalition for Health AI, the learning health system work of the National Academy of Medicine, the data integrity hazards named by ECRI, and the analytics maturity study from the Healthcare Information and Management Systems Society. Related editions: Issue 24 (The Hidden Infrastructure of Trust) and Issue 9 (Still Chasing Integration).