Healthcare's Dangerous AI Experiment: Deploying Technology Without Evidence

October 14, 2025

The artificial intelligence revolution in healthcare has reached a critical inflection point, and the picture emerging from recent research is deeply concerning. According to a comprehensive JAMA special report synthesizing findings from the 2024 JAMA Summit on Artificial Intelligence, health systems across the United States are implementing AI-enabled technologies at an unprecedented pace, with more than 1,200 such medical devices cleared by the FDA and nearly 90% of health systems deploying some form of AI. Yet this rapid adoption stands in stark contrast to the paucity of evidence demonstrating that these tools actually improve clinical outcomes or even meet the fundamental medical principle of primum non nocere: first, do no harm.
The regulatory landscape reveals significant gaps in oversight that allow most AI applications to enter clinical practice with minimal scrutiny. While the FDA has cleared numerous AI-enabled medical devices, the majority receive clearance through the 510(k) pathway, which does not require prospective human testing or demonstration of improved clinical outcomes. More problematically, a substantial portion of the AI tools used in healthcare falls entirely outside FDA jurisdiction, including documentation software, patient scheduling algorithms, and prior authorization systems that profoundly influence patient care but are not classified as medical devices. Former FDA Commissioner Robert Califf characterizes this regulatory vacuum as a crisis of validation capability: "I do not believe there's a single health system in the United States that's capable of validating an AI algorithm that's put into place in a clinical care system."
The evidence gaps extend beyond regulatory oversight to fundamental questions of clinical effectiveness and safety. Research examining FDA-approved AI medical devices found that only 3.6% of approvals reported race or ethnicity data, 99.1% provided no socioeconomic information, and 81.6% did not report the age of study subjects. This lack of demographic representation in validation studies creates substantial risks for algorithmic bias, potentially exacerbating existing healthcare disparities. Furthermore, analysis of AI-enabled medical device recalls revealed that devices lacking clinical validation and those manufactured by publicly traded companies (possibly driven by investor pressure for rapid market entry) accounted for the majority of early post-market failures, with recalls concentrated in the years immediately following clearance.
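To make the subgroup-reporting concern concrete, the sketch below shows one way an implementing health system might audit a model's discrimination across demographic groups before go-live. It is a minimal illustration, not a method described in the JAMA report; the column names (race or ethnicity grouping, outcome, model_score), the 30-patient minimum, and the 0.05 AUROC-gap threshold are all assumptions chosen for readability.

```python
# A minimal sketch of a pre-deployment subgroup audit (illustrative only).
# Assumed column names: a demographic grouping column, a binary "outcome" label,
# and a continuous "model_score"; the 30-patient floor and 0.05 gap are arbitrary.
import pandas as pd
from sklearn.metrics import roc_auc_score

MIN_GROUP_SIZE = 30   # assumption: skip subgroups too small for a stable estimate
MAX_AUROC_GAP = 0.05  # assumption: flag if best and worst subgroups differ by more

def subgroup_audit(df: pd.DataFrame, group_col: str,
                   label_col: str = "outcome",
                   score_col: str = "model_score") -> pd.DataFrame:
    """Report AUROC per demographic subgroup and flag large performance gaps."""
    rows = []
    for group, sub in df.groupby(group_col):
        # AUROC is undefined for single-class or very small subgroups.
        if sub[label_col].nunique() < 2 or len(sub) < MIN_GROUP_SIZE:
            rows.append({"group": group, "n": len(sub), "auroc": float("nan")})
            continue
        rows.append({"group": group, "n": len(sub),
                     "auroc": roc_auc_score(sub[label_col], sub[score_col])})
    report = pd.DataFrame(rows, columns=["group", "n", "auroc"])
    valid = report["auroc"].dropna()
    if len(valid) >= 2 and (valid.max() - valid.min()) > MAX_AUROC_GAP:
        print("WARNING: subgroup AUROC gap exceeds threshold; review before deployment")
    return report
```

A real audit would add confidence intervals, calibration checks, and intersectional subgroups, but even this level of stratified reporting is exactly what the 3.6% figure above suggests is rarely documented.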
The disconnect between AI deployment and evidence generation reflects a broader challenge in establishing appropriate governance frameworks. Most health systems lack the technical infrastructure, specialized personnel, and systematic processes necessary to conduct rigorous pre-deployment evaluation and ongoing post-market surveillance of AI tools. Contract negotiations between healthcare organizations and AI developers often result in licensing agreements that shift liability and monitoring responsibilities to implementers, even when those organizations lack the expertise or resources to fulfill such functions effectively. This asymmetry in capability and accountability creates what researchers have termed "responsibility gaps" where neither developers nor implementers adequately ensure AI safety and effectiveness.
Addressing these challenges requires coordinated action across multiple stakeholders. The medical community must demand robust pre-market clinical validation, transparent reporting of algorithmic performance across diverse patient populations, and mandatory post-deployment surveillance systems that can detect performance degradation or bias. Regulatory agencies need enhanced authority and resources to oversee the full spectrum of healthcare AI applications, not merely those classified as medical devices. Health systems must develop institutional governance structures capable of conducting risk-based assessments and ongoing monitoring of AI tools, while recognizing that effective oversight requires significant investment in data science expertise and technical infrastructure. Until these safeguards are established, healthcare's rush to implement AI represents a largely uncontrolled experiment whose risks fall disproportionately on the most vulnerable patient populations.
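As one hedged illustration of what mandatory post-deployment surveillance could look like in practice, the sketch below tracks a deployed model's monthly AUROC against its validation baseline and escalates only when degradation persists. The 0.03 tolerance, the two-consecutive-month rule, and the column names are illustrative assumptions rather than regulatory requirements or a method endorsed by the report.

```python
# A minimal sketch of post-deployment performance surveillance (illustrative only).
# Assumed columns: "scored_at" timestamp, binary "outcome", continuous "model_score";
# the 0.03 tolerance and two-month persistence rule are arbitrary choices.
import pandas as pd
from sklearn.metrics import roc_auc_score

def monthly_drift_report(df: pd.DataFrame, baseline_auroc: float,
                         timestamp_col: str = "scored_at",
                         label_col: str = "outcome",
                         score_col: str = "model_score",
                         tolerance: float = 0.03) -> pd.DataFrame:
    """Flag months where live AUROC falls more than `tolerance` below baseline."""
    df = df.copy()
    df["month"] = pd.to_datetime(df[timestamp_col]).dt.to_period("M")
    rows = []
    for month, sub in df.groupby("month"):
        if sub[label_col].nunique() < 2:
            continue  # AUROC undefined when only one outcome class was observed
        auroc = roc_auc_score(sub[label_col], sub[score_col])
        rows.append({"month": str(month), "n": len(sub), "auroc": auroc,
                     "degraded": auroc < baseline_auroc - tolerance})
    report = pd.DataFrame(rows, columns=["month", "n", "auroc", "degraded"])
    # Escalate only when degradation persists for two consecutive months,
    # to damp alerts caused by ordinary month-to-month noise.
    report["escalate"] = report["degraded"] & report["degraded"].shift(fill_value=False)
    return report
```

Real surveillance programs would also monitor calibration, alert burden, and subgroup-level drift, but even a simple check like this depends on the logging infrastructure and data science staffing that, as the report notes, most health systems currently lack.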