Hospitals Face Challenges Measuring ROI on AI Investments
Hospitals are poised to spend billions on artificial intelligence (AI) in the coming years, creating significant pressure to measure the return on investment (ROI) effectively. However, many healthcare organizations remain ill-equipped to accurately gauge the true value they are receiving from these investments. Health system leaders are actively working to determine the optimal methods for measuring AI’s effectiveness, experimenting with a range of approaches.
These measurement strategies span both quantitative metrics, such as patient outcomes, and qualitative indicators, including physician job satisfaction. Without a clear and comprehensive understanding of which AI tools are truly effective, hospitals face challenges when trying to scale successful solutions across their entire enterprise. The scaling process is further complicated by the varying needs across different medical specialties, inadequate technology infrastructures, and the imperative for strong data governance.
As the healthcare industry transitions its AI initiatives from the experimentation phase to widespread adoption, industry experts stress the need for more rigorous, real-world evidence.
Measuring AI’s Impact: A Multifaceted Approach
According to Kiran Mysore, chief data and analytics officer at Sutter Health, a health system based in Northern California, determining how best to assess the success of AI tools remains a significant challenge for healthcare leaders across the country. “The challenge we have today is most pilots don’t think about ROI upfront. It’s ‘let’s go — just solve the problem and go do it.’ The danger there is that you go too far without having a conversation about AI value. You have to have that conversation up front as early as possible,” Mysore stated.
Before adopting a specific AI tool, hospital leaders must estimate its potential ROI to guide decision-making regarding the size of the investment. If an AI solution projects a modest ROI, hospitals might be hesitant to make significant financial investments. Conversely, a higher projected ROI could incentivize substantial upfront spending, explained Mysore.
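To make that upfront value conversation concrete, a projected-ROI estimate can be reduced to a few lines of arithmetic. The sketch below is a minimal illustration; the tool, dollar figures, and three-year horizon are hypothetical assumptions, not numbers from Sutter Health or any vendor.

```python
# Hypothetical projected-ROI estimate for an AI pilot. All figures are
# illustrative assumptions, not real hospital or vendor numbers.

def projected_roi(annual_benefit: float, annual_cost: float,
                  upfront_cost: float, years: int = 3) -> float:
    """Return projected ROI over `years` as a fraction of total cost."""
    total_benefit = annual_benefit * years
    total_cost = upfront_cost + annual_cost * years
    return (total_benefit - total_cost) / total_cost

# Example: an ambient documentation tool assumed to save $400K/year in
# clinician time against $150K/year in licensing and a $200K rollout.
roi = projected_roi(annual_benefit=400_000, annual_cost=150_000,
                    upfront_cost=200_000, years=3)
print(f"Projected 3-year ROI: {roi:.0%}")  # ~85%
```

A modest result from a calculation like this would argue for a small pilot; a large one could justify heavier upfront spending, which is exactly the conversation Mysore wants to happen early.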
For example, consider AI-powered ambient listening tools. “Does it save some time for the physicians? That’s hard to measure — because when a physician sees 10-12 patients in half a day, how do you actually measure that? The best thing we can measure is cognitive burden, but that is not a scientific measure. It’s just a physician feeling relieved and relaxed — and being able to have a conversation versus having to type something,” Mysore explained. Evaluating the effectiveness of certain tools requires consideration of qualitative metrics.
Ambient listening tools offer the potential to ease the cognitive burden on physicians. With the healthcare industry facing a severe clinician shortage amidst a historic burnout crisis, physician stress levels and overall job satisfaction are critical factors to consider, according to Mysore.
Scott Arnold, CIO and chief of innovation at Tampa General Hospital, shares this view. He noted that hospitals don’t typically track metrics like staff attrition rates or physicians’ overall job satisfaction in order to calculate an AI tool’s ROI. To Arnold, these can be real indicators of a solution’s impact. “Sure, there may not be a direct ROI figure that I can deliver up to the CFO, but I can point over to the attrition rate and how that’s gone into single digits because people are happy and they got a little time back at night. Now they’re not spending their night, you know, hand jamming notes into a system when we have AI tools to do it for them,” he explained.
For other technologies, quantitative metrics are more important. For instance, a hospital would closely track the average length of patient stays after adopting an AI tool that helps automate patient discharge processes.
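As a rough sketch of that kind of quantitative tracking, the example below compares average length of stay before and after an assumed go-live date for a discharge-automation tool. The records and cutoff date are invented for illustration.

```python
# Minimal pre/post comparison of average length of stay (LOS) around an
# assumed AI discharge-tool go-live date. All records are made up.
from datetime import date
from statistics import mean

GO_LIVE = date(2024, 6, 1)  # hypothetical rollout date

# (discharge_date, length_of_stay_in_days) for a handful of sample stays
stays = [
    (date(2024, 4, 12), 5.1), (date(2024, 5, 3), 4.8),
    (date(2024, 5, 20), 5.4), (date(2024, 7, 2), 4.3),
    (date(2024, 7, 15), 4.0), (date(2024, 8, 1), 4.5),
]

before = [los for d, los in stays if d < GO_LIVE]
after = [los for d, los in stays if d >= GO_LIVE]
print(f"Avg LOS before: {mean(before):.1f} days; after: {mean(after):.1f}")
```

In practice a health system would pull these figures from its EHR and control for case mix, but the underlying metric is this simple comparison.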
Challenges in Scaling AI Solutions
Scaling an AI solution that performed well during its pilot phase presents a unique set of new challenges, Mysore noted. “Maybe you have a bunch of primary care physicians and you roll it out to them first, but when you roll it out to cardiologists or to nurses or to others, it’s going to be very different. You can’t necessarily use the same scaling functions, because primary care physicians ask a certain set of questions and they document a certain set of things. Cardiologists might do very different things, so it’s really important for us to tailor the AI use to the patient population and the physician population,” he remarked.
Without tailored deployment strategies, even the most promising AI tools might remain stalled at the pilot phase, Mysore explained. Many health systems lack the necessary infrastructure to quickly scale AI solutions, according to Tej Shah, managing director at Accenture. He likened this predicament to “building the lab but not the garage.”
“In the survey that we did with 300 C-suite leaders across healthcare providers, we saw that folks are dipping their toe into this technology. They’re investing to build and pilot these AI solutions within their four walls, but we’re not really seeing folks invest in the infrastructure that they need in order to get to the value,” Shah declared.
Building this necessary infrastructure requires a strong digital core, according to Shah. Hospitals can establish a strong digital core by migrating their operations to the cloud, and by ensuring that their data is structured and accessible.
Structured, accessible data is essential for AI tools to deliver reliable insights. Shah highlighted that poor data quality can result in inefficiencies, create biased algorithms, and ultimately hinder the ability to scale AI solutions effectively. Moreover, hospitals must establish a robust data governance structure for their digital tools to ensure the security and ethical use of AI models. It is equally important to train staff to use these tools.
“It’s about [providers] making the investment in their people to help them be able to use the technology in a way that makes sense and helping them also understand what the guardrails are today. There is this sort of jagged frontier of AI — it’s about helping the clinicians really understand and appreciate what that jagged frontier looks like, and what they can and should be using this technology for,” he explained.
Evidence Gap Hinders AI Adoption
There is another key problem hospitals face when it comes to scaling AI: they have little external evidence to help them determine which solutions work best and therefore deserve the fastest adoption, pointed out Meg Barron, managing director at Peterson Health Technology Institute (PHTI).
Barron’s organization, which prioritizes clinical effectiveness over engagement and user satisfaction in its digital health evaluations, is addressing this problem by publishing public research that assesses the clinical and economic impact of digital health tools. “For any given solution category, there’s often various evidence that can exist, but not all evidence is created equal, and there can often be bias and lack of quality in a lot of the research,” Barron stated.
Bias can seep into studies when vendors have financial or promotional incentives behind the research. Without rigorous standards and transparency in the evidence generation process, much of the available data on digital health tools may not actually reflect their true clinical impact, cautioned Barron.
Many vendors rely on studies conducted in controlled environments, typically utilizing simulated data instead of real patient data. A report last year, for instance, analyzed over 500 studies on large language models in healthcare and found that only 5% of them were conducted using real-world patient data.
As providers scrutinize digital health vendors, it’s also important to pressure-test their claims about saving money, said Barron. While cost reduction isn’t the primary objective of every AI tool, improving health outcomes with technology frequently leads to lower healthcare spending, she noted. Assessments of digital health technology should therefore weigh both clinical effectiveness and budget impact, especially within the one-to-three-year contract cycles common in healthcare. Through its research, PHTI has found that some digital solutions, such as virtual physical therapy, deliver cost savings with clinical results comparable to in-person care.
“We found that virtual apps can make it easier for people to do physical therapy, which helps them heal more quickly and avoid other costs, such as surgery and pain medications. Often in other cases, technology can help expand beyond just one-to-one care to bring down overall delivery costs and also expand access. That’s nirvana; that’s the end goal — and our intent is to help to surface where these instances are happening so they can be scaled faster,” Barron declared. Conversely, PHTI research has shown that digital diabetes management tools increased costs without producing superior outcomes, despite vendors’ claims of money-saving capabilities. Barron believes that careful scrutiny, grounded in real-world evidence, will be critical to guiding providers’ technology adoption decisions. Without it, hospitals risk investing in tools that promise significant benefits without delivering on their full clinical and cost-saving potential.
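As a closing illustration of the budget-impact lens Barron describes, the sketch below checks whether assumed avoided spending outweighs a tool’s cost within a typical contract window. Every figure is a hypothetical assumption, not a PHTI finding.

```python
# Hypothetical budget-impact check over a three-year contract cycle:
# does projected avoided spending (e.g., surgeries, pain medications)
# outweigh the tool's cost within the contract term? All numbers are
# illustrative assumptions.

contract_years = 3
annual_license = 250_000          # assumed vendor fee per year
avoided_spend_per_year = 320_000  # assumed avoided downstream costs

net_impact = (avoided_spend_per_year - annual_license) * contract_years
print(f"Net budget impact over {contract_years} years: ${net_impact:,}")
# Positive means savings within the contract; negative means added cost.
```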
