Most AI models still dangerous in wrong hands

AI chatbot safeguards fail to prevent the spread of health disinformation

June 23, 2025

Research calls for enhanced, robust safeguards to prevent the weaponisation of these models as vectors of health disinformation.
Use of fabricated references, complex scientific jargon, and logical reasoning patterns amplifies the credibility of false information, increasing the risk that users may be misled and potentially harmed.

The proliferation of foundational large language models (LLMs) has revolutionised information access and communication across diverse fields.

However, a recent study published in the Annals of Internal Medicine highlights a critical vulnerability in these advanced systems: their susceptibility to manipulation that can transform them into potent tools for spreading health-related disinformation.

The researchers, led by scholars from Flinders University, evaluated five prominent LLMs—OpenAI’s GPT-4o, Gemini 1.5 Pro, Claude 3.5 Sonnet, Llama 3.2-90B Vision, and Grok Beta—to assess the robustness of their safeguards against malicious instructions designed to produce false and potentially harmful health information.

The investigation centred on the models’ application programming interfaces (APIs) and their capacity to be systematically instructed to generate consistently incorrect responses to health inquiries.

Advertisment

Misleading content

By programming these LLMs to always provide false answers, fabricate credible-sounding references, and adopt an authoritative tone, the researchers effectively created customised chatbots that disseminated misleading content with alarming plausibility.

These chatbots were tested with duplicate sets of ten health-related questions concerning topics such as vaccine safety, HIV, and depression. The study’s results revealed that an overwhelming 88 per cent of responses from these manipulated LLM chatbots constituted health disinformation.

More specifically, four of the chatbots—GPT-4o, Gemini 1.5 Pro, Llama 3.2-90B Vision, and Grok Beta—produced disinformation in response to every single question, demonstrating a near-total absence of protective safeguards.

By contrast, Claude 3.5 Sonnet showed partial resilience, with disinformation comprising 40 per cent of its answers. In an extension of this research, the team examined the OpenAI GPT Store for publicly accessible models that might propagate health disinformation.

Troubling reality

This exploration uncovered three customised GPT chatbots fine-tuned to deliver such content, with a striking 97 per cent rate of disinformation in their responses.

The study’s findings underscore an urgent and troubling reality: despite ongoing efforts to implement safeguards, foundational LLMs remain highly vulnerable to exploitation for malicious purposes, particularly in the dissemination of harmful health misinformation.

The use of fabricated references, complex scientific jargon, and logical reasoning patterns amplifies the credibility of false information, increasing the risk that users may be misled and potentially harmed.

Given the critical importance of accurate health information in public decision-making and individual well-being, this vulnerability poses a significant threat that demands immediate attention.

Discover more from TechChannel News

Subscribe to get the latest posts sent to your email.

Ransomware group claims major breach at Dubai-based NAFFCO

US set to approve first advanced AI chip sales to Saudi Arabia’s Humain

UAE and Bahrain sign deal to deepen anti-money laundering

Honeywell, GAL ink three-year deal to streamline UAE military aviation support

PhysicsWallah soars 45% on trading debut, valued at $5.1b

Nothing gets $200m to drive AI-enabled consumer hardware revolution

Reforma brings agentic AI to redefine patient care in UAE

Databricks valuation crosses $100b amid new funding on AI momentum

Volatility in US stocks cast doubt on resilience of AI rally

Nokia to invest $4b in US to boost AI-driven network innovation

US considers easing curbs on Nvidia’s H200 AI chip sales to China

Disinformation defence to become enterprise imperative by 2027

Thieves favour iPhones over Samsung for resale value

Festive demand, AI boom drive India’s third-quarter PC sales

Apple to reshape its iPhones launch cycle strategy

VIVERSE partners with World Labs to create new era of AI-powered 3D content

Bobby Mitra appointed CIO at Tata Electronics

Sally Wentworth named President and CEO of Internet Society

Pearson appoints Dave Treat as Chief Technology Officer

Cigna Healthcare names Leah Cotterill as Middle East and Africa CEO

Luxriot appoints Sandesh Kaup to spearhead India growth

Changing TikTok’s ownership will reshape the digital marketplace

The trouble with tumbling telecom prices

How AI is laying the groundwork for next-gen construction

Apple, Musk and the AI gold rush

Most AI models still dangerous in wrong hands

Misleading content

Troubling reality

Discover more from TechChannel News

About Us

Nokia to invest $4b in US to boost AI-driven network innovation

Ransomware group claims major breach at Dubai-based NAFFCO

US considers easing curbs on Nvidia’s H200 AI chip sales to China

Dubai and Bengaluru

Contact Us

Follow Us