The GenAI divide: Why most fail to scale AI and what healthcare’s early lessons reveal
Of the estimated $30 billion to $40 billion enterprises have invested in generative AI (GenAI), most is going nowhere. That’s according to a new analysis from MIT’s Project NANDA, which finds that 95% of AI pilots fail to deliver measurable business value, a gulf the authors call the “GenAI Divide.”
In healthcare, where AI has been heralded as both a cure for administrative bloat and a risk to patient safety, this divide is on full display. Hospitals and health systems are adopting AI faster than ever, but only a sliver of those deployments show structural impact.
Languishing in pilot purgatory
The MIT report, based on interviews with representatives of 52 organizations, surveys of 153 senior leaders, and a review of 300 publicly disclosed AI initiatives, found that consumer tools like ChatGPT and Copilot are ubiquitous: 80 percent of organizations have piloted them, and 40 percent report some deployment. Yet the custom-built, enterprise-grade tools that promise deeper integration stall, with just 5 percent reaching production.
Executives describe brittle workflows, poor alignment with daily operations, and above all, AI that fails to “learn.” “We’ve seen dozens of demos this year,” one CIO told the researchers. “Maybe one or two are genuinely useful. The rest are wrappers or science projects.”
While the MIT findings have drawn some methodological criticism, the broader pattern of mixed AI adoption results is supported by multiple independent studies, and healthcare leaders echo the frustration. A Scottsdale Institute survey of 43 U.S. health systems, covering 37 use cases, found universal piloting of ambient documentation tools but mixed results beyond that. Imaging, administrative automation, and risk prediction are widely tested, but integration hurdles, financial strain, and regulatory uncertainty keep most projects stuck in the pilot phase.
Related Content: Fail fast to scale fast: Tips for success with digital health pilots
Workers are not waiting for permission to use GenAI
MIT found that 90 percent of employees use personal AI accounts, such as ChatGPT and Claude, to accelerate tasks like drafting, analysis, and data entry, often without IT approval. Only 40 percent of companies have purchased sanctioned subscriptions.
That same dynamic plays out in hospitals and clinics. Doctors increasingly subscribe to their own AI scribes or use general-purpose tools to prep documentation, even when health systems are still evaluating enterprise contracts. The Peterson Health Technology Institute recently noted that ambient scribes are on track to become one of the fastest-adopted technologies in healthcare, driven largely by individual clinicians who are implementing these tools ahead of formal institutional policies.
The real savings are in the back office
In business broadly, half of AI budgets flow into sales and marketing, where ROI is easy to measure (email conversion, lead scoring, churn reduction, etc.). But MIT found that back-office functions, though less glamorous, deliver bigger payoffs: companies that deployed AI tools there reported millions saved by trimming business process outsourcing (BPO) contracts and external agency spend.
Healthcare follows this broader pattern. McKinsey’s 2025 analysis of generative AI in healthcare found the strongest ROI not in clinical diagnosis, but in administrative efficiency and back-office productivity, including billing, claims management, and documentation. These are the same areas in which outsourcing balloons costs and where AI can cut out middlemen rather than clinicians.
Waiting for a sure thing
The MIT report emphasizes poor workflow fit as the main barrier to successful AI deployment, and healthcare shows how regulation amplifies this problem. A Nature Medicine Digital Health analysis of nearly 700 FDA-cleared AI/ML devices found that over 80 percent lacked demographic reporting and fewer than 10 percent included post-market surveillance.
That opacity undermines trust, both from physicians and regulators. Even as the Food and Drug Administration pilots AI internally to speed its own review processes, the market remains wary. Health system executives told MIT researchers they would rather wait for trusted incumbents (established EHR vendors or BPO partners) to offer AI add-ons than gamble on startups.
Subtle, not sweeping, workforce impact
Contrary to doomsday scenarios, MIT found no evidence of broad layoffs resulting from AI deployment. Instead, workforce effects are concentrated in outsourced functions like customer support and administrative processing, with reductions in the 5–20 percent range.
Healthcare again offers a live test. Clinicians aren’t being replaced, but contract coders, BPO call centers, and agency scribes are under pressure. Hospitals adopting ambient scribe tools report tens of millions in savings by reducing reliance on external vendors, not by cutting physicians or nurses.
Still, unintended effects loom. A recent Polish colonoscopy study found that physicians who routinely used AI had lower adenoma detection rates when later working without AI support, suggesting skill atrophy when over-reliance develops. It’s a cautionary note: AI tools don’t just risk failing, they may also erode human capability.
Related Content: McKinsey’s 2025 tech trends report finds healthcare caught between AI promise and perils
Forget everything you know about GenAI forgetfulness
MIT’s analysis also pinpoints memory as GenAI’s core bottleneck: the tools forget context, fail to learn, and can’t adapt over time. That’s why employees love ChatGPT for first drafts but won’t trust it with client contracts.
The next frontier is agentic AI: systems with persistent memory, feedback loops, and autonomous orchestration. Microsoft has already introduced memory in Copilot, and OpenAI has a memory beta in ChatGPT. Frameworks like the Model Context Protocol (MCP) and Agent-to-Agent (A2A) promise to let these systems interoperate, forming what MIT calls the “Agentic Web.”
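For readers curious what that interoperability looks like in practice, MCP is built on JSON-RPC 2.0: a client asks a server to invoke a named tool with a "tools/call" request. The sketch below shows only the message shape; the tool name and arguments are hypothetical placeholders, not part of any real healthcare product:

```python
import json

# Minimal sketch of an MCP "tools/call" request (JSON-RPC 2.0 envelope).
# "summarize_encounter" and its arguments are illustrative placeholders,
# not a real EHR integration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "summarize_encounter",          # hypothetical tool name
        "arguments": {"patient_note": "..."},   # payload the tool receives
    },
}

# Serialize for transport to an MCP server (stdio or HTTP in practice).
print(json.dumps(request, indent=2))
```

The point for healthcare buyers is that a standard envelope like this, rather than a vendor-specific API, is what would let an AI scribe, a risk tool, and an EHR add-on talk to one another.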
Healthcare innovators are testing the same leap. Startups like Suki, which raised $70 million to build AI assistants for hospitals, are embedding memory and workflow adaptation directly into EHRs. If AI can learn from each patient encounter and adapt to institutional processes, it might finally be ready to move from pilot to production.
The MIT report’s authors warn that enterprises have only 12 to 18 months before vendor decisions harden and switching costs become prohibitive. This same urgency applies in healthcare. Hospitals experimenting with multiple AI scribes, risk tools, or EHR add-ons will soon need to commit to particular vendors, and it’s becoming clear that the ones producing AI that learns the fastest will be best positioned to lock in long-term dominance.