Many primary care physicians are ready for AI, but they have conditions.
Forty percent of U.S. physicians say they are ready to use generative AI (GenAI) when interacting with patients at the point of care, according to a recent Wolters Kluwer Health survey. More than 80% believe it will improve care team interaction with patients, over half believe it can save them 20% or more of their time, and three in five (59%) believe it can save time by summarizing patient data from the electronic health record. (GenAI refers to technology that can create content – e.g., text, images, videos – by studying large quantities of training data.)
But along with high hopes come expectations. For example, 58% of the physicians surveyed said their No. 1 factor when selecting a GenAI tool is knowing that the content it was trained on was created by medical professionals. Nine out of 10 (89%) said they would be more likely to use GenAI in clinical decisions if the vendor was transparent about where the information came from, who created it, and how it was sourced.
In October 2023 the American Academy of Family Physicians set forth eight principles to help ensure that artificial intelligence and machine learning (AI/ML) is appropriately applied to family medicine. Repertoire spoke with Steve Waldren, M.D., chief medical informatics officer for the AAFP, about the eight principles.
Principle No. 1: Preserve and Enhance Primary Care
As the patient-physician dyad expands to a triad with AI, the patient-physician relationship must at a minimum be preserved and, ideally, enhanced. “When AI/ML is applied to primary care, it must enhance the 4 Cs of primary care (first contact, comprehensiveness, continuity and coordination of care) and expand primary care’s capacity and capability to provide longitudinal care that achieves the quintuple aim.”
“AI isn’t intended to allow family physicians to see more patients, but to minimize the time and effort of administrative work and give them more time with their patients,” says Dr. Waldren. “At this point, the technology is being used more on the administrative side than the clinical side.” As an example, he talks about “ambient listening,” an AI-driven tool that transforms a recorded conversation between doctor and patient into a clinical note for the electronic medical record.
“I spoke with one doctor who told me that before AI, they spent about 12 minutes doing clinical notes per patient. After implementing ambient listening, the time went down to 2 minutes. Later, we became aware that the average time for notes went up to 3 minutes. When we asked why, we were told, ‘Because of ambient listening, I have more time to be in the room with the patient and more time to provide some preventive services the system was advising me to perform.’ That speaks to the potential comprehensiveness of some of these systems.”
Principle No. 2: Maximize Transparency
AI/ML solutions must provide transparency to the physician and other users so that the solution’s efficacy and safety can be evaluated. Companies must be transparent about the data used to train their models, and they should provide clear, understandable information describing how the AI/ML solution makes predictions. Ideally this would be available for each inference; at a minimum, companies should provide a conceptual model of the decision-making, including the relative importance of the data leveraged for the inference.
“Really, we are talking about the trustworthiness of AI,” says Dr. Waldren. “The bottom line is, the user – be it a physician, nurse or patient – has to trust it. There are multiple ways to gain trustworthiness, but the greatest is through transparency, meaning the developer says, ‘Here is the data this model was trained on, here are the demographics and background evidence.’ Physicians don’t necessarily want to go through reams of data and studies,” he says. “But there will be opportunities for entities, public or private, to provide assurance of these models.”
Principle No. 3: Address Implicit Bias
Companies providing AI/ML solutions must address implicit bias in their design. We understand implicit bias cannot be completely eliminated. Still, companies should have standard processes in place to identify implicit bias and to keep their AI/ML models from learning those same biases. In addition, when applicable, companies should have processes for monitoring differential outcomes, particularly those that affect vulnerable patient populations.
Principle No. 4: Maximize Training Data Diversity
To maximize the generalizability of AI/ML solutions, training data must be diverse and representative of the populations cared for by family medicine. Companies must provide clear documentation on the diversity of their training data, and they should work to increase that diversity so as not to deepen existing health inequities or create new ones.
Principle No. 5: Respect the Privacy of Patients and Users
AI/ML requires large volumes of data for training. It is critical that patients and physicians trust companies to maintain the confidentiality of the data collected from them. Companies must provide clear policies around how they collect, store, use and share data from patients and end users. Companies must obtain consent before collecting any identifiable data, and the consent should clearly state how the data will be used or shared.
Principle No. 6: Take a Systems View in Design
An AI/ML solution will be a component in a larger work system, and therefore it must be designed to be an integrated component of the system. This means that the company must understand how the AI/ML solution will be used within a workflow. The company needs to have a user-centered design approach. Since the vast majority of AI/ML solutions in health care will not be autonomous, the company must understand and leverage the latest science around human/AI interaction as well as quality assurance.
Dr. Waldren believes this is an area that needs improvement. “Based on my experience with stand-alone apps, developers tend to think more about how their tool captures data than how it can fit into the larger clinical workflow.”
Principle No. 7: Take Accountability
If an AI/ML solution is going to take a prominent role in health care, the company must take accountability for assuring the solution is safe. Solutions designed for use in direct patient care must undergo evaluation as rigorous as that of any other medical intervention. We also believe that companies should take on liability where appropriate.
Principle No. 8: Design for Trustworthiness
Maintaining the trust of physicians and patients is critical for a successful future of AI/ML in health care. Companies must implement policies and procedures that ensure the above principles are appropriately addressed. Companies must strive for the highest levels of safety, reliability and correctness in their AI/ML solutions, and they should consider how to maximize trust with physicians and patients throughout the entire product lifecycle. AI/ML will continue its rapid advancement, so companies must continually adopt the latest best practices.
“When I talk with physicians, I advise them to talk to a practice that has recently adopted the technology they’re thinking of,” says Dr. Waldren. “I ask, ‘What did you have to do to make it work?’ and ‘How well does it integrate with your work system?’ From the developer side, the EMR vendor has to be a willing partner in this integration.”
Editor’s note: The American Academy of Family Physicians’ “Ethical Application of Artificial Intelligence in Family Medicine” can be accessed at www.aafp.org/about/policies.html.
Sidebar:
AI in the light of day
Several studies published this spring sprinkled some reality on the promise of artificial intelligence to simplify administrative duties in primary care practices.
Too suggestive
In a study conducted in 2023 at Brigham and Women’s Hospital in Boston, researchers found that large language models (i.e., deep-learning models trained on extensive textual data) might lead to unexpected changes in clinical decision-making when physicians respond to patients’ portal messages with the help of an LLM tool.
The researchers concluded that LLM assistance might indeed reduce physician workload, improve consistency across physician responses, and enhance the informativeness and educational value of those responses. What’s more, LLM drafts were generally acceptable and posed minimal risk of harm. Yet the researchers cautioned that physicians might lean too heavily on the LLM’s assessments instead of using LLM responses to help communicate their own.
“The content of physician responses changed when using LLM assistance, suggesting an automation bias and anchoring, which could have a downstream effect on patient outcomes,” they said. “LLMs might affect clinical decision-making in ways that need to be monitored and mitigated when used in a human and machine collaborative framework.”
Medical coding: Room for improvement
Researchers reported in NEJM AI that LLMs may be “highly error-prone” when mapping medical codes. “LLMs have shown remarkable text processing and reasoning capabilities, suggesting that they could automate key administrative tasks,” they wrote. “However, even the best LLMs extract fewer correct ICD-10-CM codes and generate more incorrect codes from clinical text than smaller fine-tuned language models.” Without additional research, LLMs are not appropriate for use on medical coding tasks, they concluded.
The human touch
Researchers from the University of California San Diego School of Medicine sought to answer the question, “Does access to generative-artificial-intelligence–drafted replies correlate with decreased physician time spent on reading and replying to patient messages, as well as reply length?”
Electronic messaging in electronic health records is a major source of physician burnout, they noted in JAMA Network Open. “Prior studies found significant time spent answering messages and associated stress. Strategies to address this challenge include triaging messages by care teams, charging fees, and using templated responses. Published work suggested that generative artificial intelligence (GenAI) could potentially extend this toolset by drafting replies.
“While some physicians clearly perceived GenAI’s value, including reduced cognitive burden due to having a draft infused with empathy to start their reply, opportunities for enhancement lie in achieving greater personalization to align with physicians’ tone and better decisions on whether to recommend a visit,” the researchers wrote. “GenAI’s current performance suggests that human input is still essential.”