A radiologist is reviewing a chest CT with an AI overlay that flags an area of concern with a 68 percent likelihood of malignancy. After analyzing the area, she concludes that the finding is more probably benign. She documents her findings and recommends a follow-up in three months. Six months later, the patient is diagnosed with a more advanced cancer. If a malpractice case follows, the first question will be what the radiologist did with the 68 percent.

Health systems face a gap here that needs a systematic approach. Clinicians working with AI must recognize that they are not receiving a simple yes or no answer. A common misunderstanding is that a sepsis prediction algorithm states that the patient has sepsis. It does not; at most, it states something like an 82 percent chance that sepsis will develop within the next six hours. The same misconception applies to prediction algorithms for heart disease. An algorithm does not predict that a patient will have a heart attack; it estimates that there is an x percent chance of a serious cardiac event in the next 12 months. Clinicians are trained to practice medicine: to reason diagnostically and to form a professional judgment from clinical evidence. Now they must also interpret the probabilistic scores that AI produces and incorporate them into their decision-making.

The Nature of Probabilistic Output

Clinical AI tools are designed to output expected values, risk scores, and lists of potential diagnoses. A JAMA study reviewed how clinical AI tools communicate their output to clinicians. Most AI diagnostic and predictive systems present probabilistic output and avoid deterministic recommendations. This follows from how machine learning models are trained: the output weights possibilities rather than asserting certainties, because these systems learn primarily from population data and not from the individual patient.

Because the training is population-based, the output expresses probabilities rather than certainties, and probabilities demand a different kind of interpretation from clinicians than other clinical data. Laboratory tests and radiological images are generally read as direct evidence about the patient, while a probability score is a product of statistical reasoning, often from a model whose workings are unknown to the clinician. Stanford Medicine researchers have examined this interpretation gap and identified clinical context, clinician experience level, clinician trust in the model, and variation in that trust as critical contributing factors.

The Documentation Problem

When a probabilistic AI output influences a clinical decision, the documentation requirements become more intricate. In typical clinical practice, a clinician records the reasoning behind a diagnosis or a treatment decision. With AI in the picture, the clinical record must also describe the AI output, the clinician's assessment of it, and the rationale for the clinical plan, whether that plan agrees with the AI recommendation or departs from it.
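The record elements described above (the AI output, the clinician's assessment of it, and the rationale for the plan) can be sketched as a structured note. This is a minimal illustration, not a standard or an EHR vendor's schema: the class name `AIOutputNote`, its field names, and the rule that a rationale is mandatory only when the plan diverges are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AIOutputNote:
    """Hypothetical structured note for one AI output in an encounter."""

    model_name: str            # which tool produced the score (assumed field)
    predicted_event: str       # e.g. "malignancy" or "sepsis within 6 hours"
    probability: float         # model output expressed as 0.0 to 1.0
    clinician_assessment: str  # the clinician's own read of the finding
    followed_ai: bool          # True if the plan agrees with the AI output
    rationale: Optional[str] = None  # assumed mandatory when the plan diverges

    def __post_init__(self) -> None:
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        if not self.clinician_assessment.strip():
            raise ValueError("clinician assessment is required")
        if not self.followed_ai and not (self.rationale or "").strip():
            raise ValueError("rationale is required when diverging from the AI output")


# The opening case, recorded with placeholder text for the reasoning.
note = AIOutputNote(
    model_name="chest-ct-overlay",  # hypothetical tool name
    predicted_event="malignancy",
    probability=0.68,
    clinician_assessment="finding more probably benign",
    followed_ai=False,
    rationale="(clinician's documented reasoning goes here)",
)
```

Under these assumptions, a note that records a divergence without a rationale cannot be created at all, which is the property a structured field can enforce and a free-text field cannot.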

A recent analysis published in Health Affairs examined documentation practices in encounters involving probabilistic AI. The majority of clinicians document neither the AI output nor their reasoning about it. In numerous cases, an AI probability score is present during the encounter but the clinical documentation makes no mention of it.

JAMA has assessed the potential legal ramifications of undocumented AI outputs. The absence of documentation creates a serious legal risk. When a clinician is faced with a high-probability flag and pursues an alternative pathway, the justification is supposed to be captured in the record. Without it, a legal proceeding may presume that the clinician disregarded the flag rather than exercised the professional discretion that belongs to the clinician.

Liability in a Probabilistic World

The legal and professional liability stemming from probabilistic AI may be more serious than in a typical malpractice case. In a routine case, the issue is whether the clinician met the standard of care and used reasonable clinical judgment. Once an AI system offers a prediction with a stated likelihood, the clinician also has on hand a quantified prediction from a tool the organization chose to deploy.

The AMA has outlined liability concerns related to AI-generated probability scores. In its assessment, AI outputs introduce a new type of information into the clinical workflow. The AMA acknowledges the role of AI in decision-making and notes that a documented AI prediction can serve as a standard against which the clinical decision is judged. This matters most when the prediction was available or documented in the clinical setting at the time the decision was made.

Research in the New England Journal of Medicine has analyzed some of the earliest legal disputes over AI-supported clinical decisions. Courts are beginning to address whether the clinician acted appropriately in relation to AI output, including probability scores. The emerging legal expectation appears to be that a clinician who receives a high-probability AI alert is justified in taking a different clinical pathway when documented reasoning supports the departure from the AI suggestion.

Preparing Clinicians for Probabilistic Practice

Health systems that use probabilistic AI should also develop programs that prepare clinicians to reason clinically while using predictive AI systems. This preparation has three dimensions.

First, clinicians need training on how probabilistic AI systems work, what the numbers represent, and where the systems are weak. This training is continuous and must be extended each time a new system is introduced.

Second, health systems need to establish their own standards for how clinicians log AI outputs and their reasoning about them, and for how those standards are built into the electronic health record. The standards should be implemented as templates with structured documentation fields, as opposed to unstructured free-text fields left to the discretion of the clinician.

Third, clinical leadership needs to define response thresholds: the clinical actions expected at given probabilistic AI outputs. For example, it should be clear what steps must be taken, absent a documented clinician judgment to the contrary, when a sepsis prediction model flags a patient with an 85 percent probability. Without such thresholds, responses become inconsistent and institutional legal risk increases.
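A response threshold policy of the kind described for the sepsis example could be written down as a small lookup table. The sketch below is illustrative only: the cutoffs and action labels are hypothetical placeholders, not clinical guidance, and only the 85 percent figure comes from the example above. The names `SEPSIS_POLICY` and `expected_response` are assumptions made for the illustration.

```python
from bisect import bisect_right

# Hypothetical policy: (minimum probability, default response).
# Cutoffs and labels are placeholders; a real policy would be set by
# clinical leadership for each model and care setting.
SEPSIS_POLICY = [
    (0.00, "routine monitoring"),
    (0.50, "bedside reassessment"),
    (0.85, "escalate per sepsis protocol"),
]


def expected_response(probability: float, policy=SEPSIS_POLICY) -> str:
    """Return the default response for a score, absent a documented override."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    cutoffs = [cutoff for cutoff, _ in policy]
    # bisect_right finds the last tier whose cutoff is <= probability.
    return policy[bisect_right(cutoffs, probability) - 1][1]


print(expected_response(0.85))  # escalate per sepsis protocol
print(expected_response(0.82))  # bedside reassessment
```

Keeping the policy as data rather than scattered conditionals makes it easy for leadership to review and revise, and it gives every clinician the same default for the same score.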

The Joint Commission has stated that health systems should implement these procedures. It has recognized the need for health systems to deploy probabilistic AI as clinical decision support, with the expectation of structured rather than unstructured responses.

Context and Sources

The Journal of the American Medical Association (JAMA) has analyzed how clinical AI tools present probabilistic output to clinicians and the legal ramifications of undocumented AI outputs. Stanford Medicine has studied how clinicians interpret probabilistic output. Health Affairs has reviewed the documentation of encounters involving AI. The AMA has discussed liability concerns surrounding probabilistic AI. The New England Journal of Medicine has covered the initial litigation over AI-assisted decision-making. The Joint Commission has called for institutional policies on clinical AI outputs. This edition relates to the clinical documentation and liability issues covered in editions AE, AF, and AG of this newsletter.

Christopher Hutchins
Founder & CEO, Hutchins Data Strategy Consultants
