Hackers turn AI agent against Gmail users in silent data theft

Security researchers expose zero-click vulnerability that let attackers silently exfiltrate sensitive data from AI agent users.

By admin

Oct 20, 2025, 2:13 PM

Security researchers have discovered a dangerous new vulnerability that allowed hackers to steal sensitive Gmail data from ChatGPT users with nothing more than a specially crafted email. The attack, dubbed “ShadowLeak,” exploited OpenAI’s Deep Research AI agent to silently exfiltrate personal information without any visible signs or user interaction required.

The vulnerability, disclosed by cybersecurity firm Radware, is a huge evolution in AI security threats. Unlike previous attacks that required users to click links or view malicious content, ShadowLeak worked entirely behind the scenes, leveraging the AI agent’s own capabilities to betray users’ trust.

“This is a service-side attack that occurs entirely within OpenAI’s cloud infrastructure,” said Gabi Nakibly, one of the Radware researchers who discovered the flaw. “Traditional enterprise defenses like firewalls or endpoint monitoring can’t see or intercept the exfiltration because it originates from OpenAI’s own systems rather than the user’s device.”

The attack targeted ChatGPT’s Deep Research agent, a powerful tool launched in February 2025 that enables autonomous internet browsing and email analysis. When users connected their Gmail accounts to the service and asked the agent to research their emails, they unknowingly created an opening for sophisticated social engineering attacks.

Hidden instructions, poisoned emails, and a clean getaway

The ShadowLeak exploit followed a deceptively simple four-step process. First, attackers sent victims an innocent-looking email with a subject line like “Restructuring Package – Action Items.” Hidden within the email’s HTML code were invisible instructions designed to manipulate the AI agent.

These instructions used advanced psychological manipulation techniques to bypass the agent’s safety training. The malicious prompts claimed the agent had “full authorization” to access external URLs and disguised the attacker’s server as an “official compliance validation system.” The instructions even commanded the agent to “try a couple of times until you succeed” if initial attempts failed.

When users later asked Deep Research to analyze their emails about work topics, the agent dutifully read both legitimate messages and the poisoned email. Following the hidden instructions, it extracted sensitive personal information like names and addresses from the user’s inbox, encoded the data in Base64 format, and transmitted it to an attacker-controlled server.

The Base64 encoding played a pivotal role. By presenting the process as a protective step, the attack disguised sensitive information as a harmless encoded string. OpenAI’s internal security layers never saw raw personal data—only the seemingly safe output.

The researchers achieved a 100% success rate in their tests, demonstrating the reliability of their attack method. More troubling still, the data exfiltration left no network-level evidence that enterprise customers could detect.

AI agents are the newest targets

The ShadowLeak discovery highlights the emerging dangers of “agent hijacking,” where attackers manipulate AI systems to act on their behalf. This represents a fundamental shift from traditional cybersecurity threats, according to experts studying AI vulnerabilities.

The Open Web Application Security Project (OWASP) now lists prompt injection as the number one security vulnerability in its top 10 risks for large language model applications. The National Institute of Standards and Technology has identified indirect prompt injection as the most serious security flaw in generative AI, emphasizing that such attacks take advantage of a model’s inability to separate instructions from data.

The U.S. AI Safety Institute has begun conducting experiments to evaluate agent hijacking risks, identifying critical vulnerabilities including remote code execution, database exfiltration, and automated phishing. Their research found that even advanced AI systems remain vulnerable to sophisticated social engineering tactics embedded in seemingly benign data sources.

The same attack pattern can be applied to any data connector integrated with AI agents, including Google Drive, Dropbox, Microsoft Teams, GitHub, and other business applications. Malicious instructions can be hidden in PDF files, meeting invites, chat messages, or shared documents, creating multiple pathways for exploitation.

The flaw is patched but the problem remains

OpenAI moved quickly to patch the ShadowLeak vulnerability after Radware’s June disclosure, with fixes implemented by early August. However, the broader challenge of securing AI agents against prompt injection attacks remains largely unsolved.

According to a report on adversarial machine learning from the National Institute of Standards and Technology (NIST), suggested mitigation strategies for indirect prompt injection include filtering instructions from retrieved inputs and using reinforcement learning from human feedback (RLHF) to fine-tune models. However, the agency notes that managing AI security risks is an area requiring ongoing work as attackers develop increasingly sophisticated methods.

The stochastic nature of how language models work makes it unclear whether foolproof prevention methods for prompt injection even exist. Traditional cybersecurity approaches designed for predictable software systems struggle to address the unpredictable, language-based nature of AI vulnerabilities.

Some experts advocate for continuous behavioral monitoring of AI agents. This approach would track both an agent’s actions and its inferred intent, validating that they remain consistent with users’ original goals. Any deviations from legitimate intent would trigger alerts and blocks in real time.

Enterprise organizations can implement defensive measures including email sanitization to remove suspicious HTML elements and invisible text. However, these approaches offer limited protection against insider-like threats where trusted AI systems are manipulated to act on attackers’ behalf.

Risks are mounting for many industries

The ShadowLeak vulnerability emerges as organizations accelerate AI adoption across sensitive business functions. Financial services firms, healthcare providers, and legal practices increasingly rely on AI agents to process confidential client communications and internal documents.

Security researchers warn that AI assistants used in banking, legal, and medical settings risk leaking confidential client data through injection exploits, potentially violating compliance laws such as GDPR and HIPAA. The service-side nature of attacks like ShadowLeak makes them particularly difficult for regulated industries to detect and prevent.

The incident also underscores the expanding attack surface created by AI agent integrations. As these systems connect to more enterprise applications and data sources, each new connection represents a potential vulnerability that attackers can exploit. Organizations deploying AI agents must now consider both traditional software security risks and entirely new classes of language-based attacks. This requires security teams to develop expertise in both conventional cybersecurity and AI-specific threat vectors.

As AI agents become more capable and autonomous, the stakes for securing these systems continue to rise. The ShadowLeak discovery serves as a stark reminder that the convenience and power of AI automation comes with significant security responsibilities that organizations are still learning to manage.