AI health tools multiply—but efficacy remains unproven
- ★Microsoft and Amazon expand AI into medical records
- ★LLM-based tools now face real-world clinical scrutiny
- ★Deployment speed risks outpacing regulatory oversight
Two tech giants moved aggressively into AI-driven healthcare this month, but the scientific community remains cautious about untested claims. Microsoft’s Copilot Health—launched earlier in June—lets users upload medical records and query an LLM for personalized advice, while Amazon’s Health AI, previously exclusive to One Medical members, now targets broader adoption. Both tools arrive amid a surge of AI health applications, yet neither has undergone the rigorous clinical validation typical of medical devices.
The timing is deliberate. Microsoft’s entry follows its $19.7 billion acquisition of Nuance in 2022, a deal explicitly framed as a healthcare AI play. Amazon’s expansion, meanwhile, leverages its 2018 purchase of PillPack and subsequent integration with One Medical—a vertical stack now feeding data into its proprietary models. These aren’t isolated experiments; they’re strategic bets on AI as the next layer of patient interaction.
Yet the scientific context complicates the narrative. A 2023 JAMA study found that 60% of AI health tools lacked external validation, and only 12% were tested in real-world settings. The tools’ reliance on LLMs—which are prone to hallucination and bias—adds another layer of risk when their outputs inform life-or-death decisions.
The gap between commercial rollouts and validated outcomes
The regulatory landscape lags further behind. The FDA’s AI/ML Action Plan permits ‘predetermined change control’ for some algorithms, but neither Copilot Health nor Health AI has secured formal clearance. Instead, both operate in a gray zone: not quite medical devices, not quite consumer apps, but handling sensitive health data at scale.
What’s next hinges on two variables: adoption speed and adverse event tracking. Early signals suggest clinicians are wary—only 23% of physicians trust AI for diagnostic support, per a 2024 AMA survey. Yet patient demand may force the issue, especially as tools like Copilot Health market direct-to-consumer convenience over clinical rigor.
The real bottleneck isn’t the algorithms’ sophistication but the absence of post-deployment audits. Without systematic error tracking, even well-intentioned tools could amplify misinformation—or worse, delay critical care. For all the noise about ‘democratizing’ healthcare, the actual story is one of unchecked scale meeting unproven utility.