Ryan Samuels, director, UK & Ireland at Eolas Medical, explains why ‘endorsed’ doesn’t always mean ‘safe’.
DHSC and NHS England’s plan to put AI at the heart of a £7.4 billion technology investment says a lot about where healthcare leaders see the biggest opportunity to support a system under strain.
It has also brought a challenge many NHS trusts face into sharper focus: what counts as safe use of AI in healthcare?
Shadow technology isn’t a new challenge for the NHS. Before AI chatbots exploded in popularity, much of the problem was reining in clinicians’ use of consumer messaging apps like WhatsApp and Signal.
What artificial intelligence has done is amp up the potential risk to patients and staff, particularly when unsanctioned and ungoverned AI tools are used to support clinical decision-making.
Blurred lines
ChatGPT is often the first name that crops up in the shadow AI conversation. It’s the obvious example because it’s everywhere: most clinicians have it on their phone in their pocket. But nobody at trust level is pretending ChatGPT has been signed off for clinical use. Staff who reach for it do so knowing that they’re stepping outside what their organisation has approved.
Microsoft Copilot, on the other hand, is a slightly more complicated case. Copilot has been endorsed, signed off and rolled out by IT in a growing number of NHS organisations to support admin work and free up time for patient care. Last year, NHS England ran a pilot of Copilot involving more than 30,000 NHS workers across 90 trusts. There are plans to roll it out more broadly.
But endorsement for one use case isn’t automatic endorsement for another. Signing off an AI productivity assistant for drafting emails and summarising meetings is not the same as approving it as a clinical decision support tool, even if it’s the same software. That means clinicians who use Copilot for anything care-related may be using shadow tooling without even realising it.
This is where the line between shadow AI and approved AI starts to blur and, from my perspective, where some of the greatest risk lies. NHS England’s trial of Copilot found that it could save almost 45 minutes a day per staff member. When an AI tool demonstrably eases the admin strain on NHS teams, the assumption that it’s good enough for clinical decision-making is easy to make and difficult to reverse.

Three key questions for safer clinical AI
For me, there are three key questions a healthcare organisation should be able to answer about any AI tool being used in clinical settings.
The first is where the response is coming from. Clinical decision support needs to be grounded in the organisation’s own, current, approved guidance. General-purpose chatbots pull their answers from training data scraped from the open web: a vast, undifferentiated pool of text from sources of varying credibility. An NHS trust has no way of knowing which sources have influenced a chatbot’s response, nor any reliable way to test how often the AI model gets clinical questions wrong. In many cases, hallucination rates may not be measurable.
The second is whether the organisation has visibility into how AI is being used. In a medico-legal case, a trust may need to account for how an AI tool was used in patient care – who queried it, what was asked, and what answers came back. Shadow tools are invisible to governance. There’s no audit trail and no way to assess how staff have been using them. This risk usually stays hidden until something forces it into the open, by which point the damage is done and much harder to recover from.
The third is the question that cuts through every discussion about NHS technology: will clinicians actually use the approved alternative? Clinical buy-in isn’t a problem you solve with policy. The NHS is a graveyard of tools and platforms deployed outside the context they were made for, with millions then spent trying to drive clinical adoption after the fact. If clinicians aren’t given a trusted, reliable and easy-to-use tool by their organisation, they’ll reach for whatever gives them the fastest answer with the least friction. More often than not, that’s an unapproved, consumer-grade solution.
The right tool for the right context
There’s a reason AI tools are being adopted in healthcare twice as fast as in other sectors.
A 2022 study estimated that primary care doctors would need almost 27 hours per day to provide all patients with the care they required. Technologies like ambient voice transcription – which the DHSC calls a “no regrets” technology – are already reducing admin overload and giving clinicians more time with patients. Nobody is debating the case for AI in healthcare. It’s already been made.
But the case for AI in clinical work – the kind NHS teams reach for when they need a fast, reliable answer that directly affects patient care – depends on something more specific: AI that sits on top of an approved, governed knowledge base, under the full control of the care setting it’s used in.
I see this consistently across the NHS trusts we work with at Eolas. Give clinicians a trustworthy, purpose-built tool that works better than the alternative, and the question stops being “why not just use ChatGPT?” and becomes “why would we use anything else?” That’s the standard any technology deployed in the NHS should be held to, AI or otherwise.



