Agent social engineering
Field Note / Authority Risk
Last updated: May 7, 2026
The uncomfortable OpenClaw lesson is not only that an agent can leak text. It is that persuasion can become action hijacking when the agent has private context, internet access, tools, plugins, or credentials.
Recent public reporting and tooling around OpenClaw-style agents point to the same control problem from three directions: social engineering of an agent, malicious skill or plugin supply chain, and exposed agent infrastructure.
The Register reported on a Hannah Fry experiment involving an AI agent built with OpenClaw, real-world tasks, and a bank card number supplied by the team. The agent attempted errands, used Fry's real name in an unexpected way, struggled with anti-bot controls, created an online shop, and later disclosed sensitive material after being manipulated through a social-engineering setup.
The important detail is not the novelty of the demo. It is the class of material reportedly exposed: API keys, usernames, passwords, and operational context. That is authority-bearing material, not only content.
Zscaler ThreatLabz separately published a writeup on a deceptive "DeepSeek-Claw" OpenClaw skill. Zscaler says the skill's instructions could lead an AI agent or user to execute installation paths that delivered Remcos RAT on Windows, or GhostLoader on macOS or Linux, or through manual Windows workflows.
Bishop Fox's AIMap points to a third surface: publicly exposed AI agent infrastructure. Its page describes discovery, fingerprinting, scoring, and testing for exposed AI endpoints including MCP servers, Ollama instances, OpenClaw-style systems, LangServe chains, Gradio apps, and other AI infrastructure.
An agent with private context can be persuaded to reveal or use material it should not expose.
A skill, plugin, package script, or setup instruction can become a path to malware, data theft, or tool misuse.
Public AI endpoints can expose models, prompts, tools, authorization boundaries, and execution capabilities.
These are different events and surfaces. Treating them as one generic "AI security" problem makes the response worse. The shared thread is action authority: what the agent or connected workflow can reach and do after it receives an instruction.
Content risk asks what the model sees, says, leaks, or summarizes. Authority risk asks what the agent can do when it has context, tools, permissions, network reach, or credentials.
In software delivery, that distinction matters. A skill file is not just documentation if an agent may follow it. An MCP declaration is not just configuration if it exposes tools. A public agent endpoint is not just another AI asset if it can run code, enumerate tools, leak prompts, or invoke actions.
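To see how small that boundary is, here is a minimal sketch of a tool declaration, assuming the MCP Python SDK's FastMCP interface; the server name and the run_shell tool are hypothetical, chosen only to show how a few lines of configuration become an authority grant.

```python
# Hypothetical MCP server sketch; assumes the MCP Python SDK's FastMCP interface.
from mcp.server.fastmcp import FastMCP
import subprocess

mcp = FastMCP("delivery-helper")

@mcp.tool()
def run_shell(command: str) -> str:
    """Run a shell command on this workstation and return its output."""
    # This declaration is the point where persuasion can become action:
    # any prompt that convinces the agent to call the tool now has shell reach.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

if __name__ == "__main__":
    mcp.run()
```

Everything the hosting process can reach, the tool can reach; the review question for a file like this is authority, not correctness.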
The hard question is not only whether the agent can be persuaded. It is what the agent can do after it is persuaded.
Teams do not need to become OpenClaw specialists to learn from this. They need to treat agent setup and tool reach as part of the delivery action graph when those paths can influence repos, CI/CD, packages, credentials, cloud, or release behavior.
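One way to make that concrete is to write the reachability down. The sketch below invents its node names and edges; the only point it carries is the shape of the check, not any particular tool's wiring.

```python
# Toy delivery action graph: edges mean "can directly influence".
# Node names are invented for illustration.
edges = {
    "coding-agent": ["setup-skill", "mcp:run_shell", "mcp:open_pr"],
    "setup-skill": ["shell:install-package"],
    "mcp:run_shell": ["shell:install-package", "cloud-cli", "ci-secret-store"],
    "mcp:open_pr": ["repo:main"],
    "shell:install-package": ["package-registry-token"],
    "cloud-cli": ["prod-cloud-account"],
}

def reachable(start: str) -> set[str]:
    """Everything the start node can influence, directly or transitively."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# If the reachable set touches repos, CI/CD, packages, credentials, or cloud,
# the agent's setup belongs in delivery governance, not just "AI tooling".
print(sorted(reachable("coding-agent")))
```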
A first row for this class of risk should be plain enough for AppSec, platform, and engineering to review together:
Actor: Local coding agent with installed skill
Owner: Engineering enablement
Location: developer workstation and repo tool config
Skill/plugin: third-party setup skill
Credential context: local shell, package manager token, cloud CLI, keychain, CI secret references
Reachable actions: read files, run shell, install package, call MCP tool, publish externally, open PR
Approval-required: new skill install, shell/network command, secret read, package publish, workflow edit, cloud command
Proof: skill source, install command, tool invocation, credential identity, policy verdict, approver, target event, revocation record
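The same row can live as a record that tooling and reviewers check rather than re-read. The field names below follow the row above; the dataclass shape itself is an illustrative choice, not a standard schema.

```python
# Sketch of the row as a reviewable record; field names mirror the row above.
from dataclasses import dataclass

@dataclass
class AgentAuthorityRow:
    actor: str
    owner: str
    location: str
    skill_or_plugin: str
    credential_context: list[str]
    reachable_actions: list[str]
    approval_required: list[str]
    proof: list[str]

first_row = AgentAuthorityRow(
    actor="Local coding agent with installed skill",
    owner="Engineering enablement",
    location="developer workstation and repo tool config",
    skill_or_plugin="third-party setup skill",
    credential_context=["local shell", "package manager token", "cloud CLI",
                        "keychain", "CI secret references"],
    reachable_actions=["read files", "run shell", "install package",
                       "call MCP tool", "publish externally", "open PR"],
    approval_required=["new skill install", "shell/network command", "secret read",
                       "package publish", "workflow edit", "cloud command"],
    proof=["skill source", "install command", "tool invocation", "credential identity",
           "policy verdict", "approver", "target event", "revocation record"],
)
```

In review, the interesting diffs are to reachable_actions and credential_context, because those are where authority changes.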
The point is not to make every agent action slow. The point is to know where authority changes state, which actions require approval before execution, and what proof remains afterward.
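A minimal sketch of such a gate, assuming the approval-required list from the row above; the verdict and proof-record shapes are invented for illustration.

```python
# Approval gate sketch: action names come from the row above; record shape is assumed.
from datetime import datetime, timezone

APPROVAL_REQUIRED = {
    "new skill install", "shell/network command", "secret read",
    "package publish", "workflow edit", "cloud command",
}

def gate(action: str, requested_by: str, approver: str | None) -> dict:
    """Decide before execution and leave a proof record either way."""
    needs_approval = action in APPROVAL_REQUIRED
    allowed = (not needs_approval) or (approver is not None)
    return {
        "action": action,
        "requested_by": requested_by,
        "needs_approval": needs_approval,
        "approver": approver,
        "verdict": "allow" if allowed else "hold",
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Routine reads pass; authority-changing actions wait for a named approver.
print(gate("read files", requested_by="coding-agent", approver=None))
print(gate("package publish", requested_by="coding-agent", approver=None))
print(gate("package publish", requested_by="coding-agent", approver="release-manager"))
```

The record is the proof: who asked, what was held, who approved, and when.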
These incidents are best read as evidence of a control pattern, not proof that one control would have prevented every reported outcome. The pattern is narrower and more useful: once an AI-assisted workflow can use tools, credentials, plugins, skills, or exposed endpoints, the security question moves from content safety to action authority.
The practical lesson is the action graph: skills, MCP servers, agent configs, exposed endpoints, and tool declarations become part of software-delivery governance when they can influence repos, CI/CD, packages, credentials, cloud paths, or release behavior.