OWASP Agentic AI Top 10 Runtime Controls

OWASP-style agentic AI risk lists are most useful when each category maps to a concrete runtime control. This guide collapses common agentic risks into runtime questions and shows how deterministic execution gating, adapter binding, and verifiable receipts address them.

Key takeaways
  • Treat model-supplied intent and facts as untrusted, and derive risk facts independently.
  • Excessive agency is contained by adapter binding, risk tiers, and threshold approval.
  • Accountability gaps close when every action emits a verifiable Authority Receipt.

From risk categories to runtime questions

OWASP-style guidance for agentic AI names recurring failure modes: prompt injection that hijacks intent, excessive agency where an agent can do more than it should, unsafe tool use, weak isolation, and insufficient accountability. The value of such a list is a shared vocabulary, but a vocabulary is not a control. The way to operationalize it is to collapse the categories into a few runtime questions and answer each one with a mechanism.

SovereignClaw is organized around exactly those questions: what intent is proposed, what facts are trusted, what tools are reachable, what approvals are required, and what evidence is emitted. Because the model proposes and the runtime decides, each OWASP-style risk has a place to be addressed before a side effect occurs rather than after.

Containing prompt injection and untrusted intent

Prompt injection is, at root, a trust problem: the system treats model-influenced content as if it were authoritative. SovereignClaw's foundational stance is that the LLM is untrusted input. A proposed action is frozen into a canonical SovereignIR before risk is computed, and tier-driving facts are derived independently from operation semantics rather than taken from the model. If the model claims an action is benign but the derived facts say otherwise, the mismatch escalates risk rather than lowering it.

This neutralizes a large class of injection outcomes. Even if an attacker manipulates the model into proposing a dangerous action, the proposal still has to pass independent fact inference and deterministic policy. The injection can change what the model asks for; it cannot change what the runtime is willing to authorize.

  • Untrusted intent: SovereignIR is canonicalized and frozen before risk computation (S2)
  • Independent facts: tier-driving facts come from semantics, never the model (S3)
  • Escalation on mismatch: a claimed-versus-derived fact conflict raises risk
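The three points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SovereignClaw's implementation: the "routine" tier name, the operation-to-tier table, and every function name are hypothetical, and canonical JSON stands in for whatever encoding SovereignIR actually uses.

```python
import hashlib
import json

# Assumed tier ordering, lowest to highest. Only "elevated" and "sovereign"
# are named in this guide; "routine" is a placeholder for the lowest tier.
TIERS = ["routine", "elevated", "sovereign"]

# Hypothetical semantics table: the tier is looked up from the operation
# itself, never from anything the model says about the operation.
DERIVED_TIER = {
    "file.read": "routine",
    "file.delete": "elevated",
    "funds.transfer": "sovereign",
}


def freeze_intent(proposal):
    """Canonicalize a proposal (sorted keys, compact separators) and hash it."""
    canonical = json.dumps(proposal, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest


def derive_tier(proposal):
    """Derive the tier from the operation alone; unknown operations fail closed."""
    return DERIVED_TIER.get(proposal["operation"], "sovereign")


def effective_tier(proposal, claimed_tier):
    """Return the tier policy should use, escalating when the claim disagrees."""
    derived = derive_tier(proposal)
    if claimed_tier != derived:
        # A claimed-versus-derived conflict raises risk by one tier (capped
        # at the top tier). It can never lower the derived tier.
        return TIERS[min(TIERS.index(derived) + 1, len(TIERS) - 1)]
    return derived


proposal = {"operation": "file.delete", "args": {"path": "/tmp/report.csv"}}
canonical, intent_hash = freeze_intent(proposal)
print(intent_hash[:16], effective_tier(proposal, claimed_tier="routine"))
```

In this sketch an injected claim of "routine" for a delete operation does not lower the tier. It produces a mismatch, and the mismatch pushes the decision up to the highest tier.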

Containing excessive agency and unsafe tool use

Excessive agency and unsafe tool use are about an agent reaching capabilities it should not have, or invoking them in unsafe ways. SovereignClaw constrains this at the adapter boundary. No operation reaches an adapter without a valid gate artifact bound to the intent hash, policy bundle, adapter identity, and nonce, and unauthorized actions receive no execution path at all. The refusal is mechanical: the adapter is unreachable, not blocked after the fact.

Risk tiers and threshold authorization add graduated control. Sensitive operations are classified into higher tiers and can require quorum signatures from verified operators, so an over-eager or compromised agent cannot unilaterally invoke a high-impact tool. Nonce uniqueness rejects replay and time-of-check to time-of-use (TOCTOU) attempts, closing a path where an authorized action is captured and reused.

  • Adapter binding: artifacts are bound to a specific adapter identity (S6)
  • Execution boundary: nothing reaches an adapter without a valid, bound artifact (S1)
  • Threshold authorization: elevated and sovereign tiers require quorum signatures (S7)
  • Nonce uniqueness: replay and TOCTOU are rejected (S5)
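As a rough sketch of how these four controls fit together, the Python below issues a gate artifact only when a quorum is met and has the adapter refuse anything that is unbound, altered, or replayed. Everything here is an assumption for illustration: the class and function names, the operator names, and the quorum sizes are hypothetical, and an HMAC with a shared demo key stands in for the operator signatures a real deployment would use.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass

GATE_KEY = b"demo-only-key"  # assumption: shared secret for this sketch only
VERIFIED_OPERATORS = {"alice", "bob", "carol"}  # hypothetical operators
QUORUM = {"routine": 0, "elevated": 1, "sovereign": 2}  # assumed quorum sizes


@dataclass(frozen=True)
class GateArtifact:
    intent_hash: str
    policy_bundle: str
    adapter_id: str
    nonce: str


def _tag(artifact):
    """Authenticate every bound field of the artifact with one MAC."""
    body = json.dumps(asdict(artifact), sort_keys=True).encode("utf-8")
    return hmac.new(GATE_KEY, body, hashlib.sha256).hexdigest()


def issue(artifact, tier, approvers=()):
    """Return a tag only if enough verified operators approved; else None."""
    if len(set(approvers) & VERIFIED_OPERATORS) < QUORUM[tier]:
        return None  # no artifact is issued, so no execution path exists
    return _tag(artifact)


class Adapter:
    def __init__(self, adapter_id):
        self.adapter_id = adapter_id
        self._used_nonces = set()

    def execute(self, artifact, tag, intent_hash):
        if tag is None or not hmac.compare_digest(tag, _tag(artifact)):
            raise PermissionError("missing or invalid gate artifact")
        if artifact.adapter_id != self.adapter_id:
            raise PermissionError("artifact is bound to a different adapter")
        if artifact.intent_hash != intent_hash:
            raise PermissionError("intent differs from what was authorized")
        if artifact.nonce in self._used_nonces:
            raise PermissionError("nonce already used (replay)")
        self._used_nonces.add(artifact.nonce)
        return "executed"


artifact = GateArtifact("3f2a", "bundle-v1", "payments", "nonce-001")
tag = issue(artifact, "sovereign", approvers={"alice", "bob"})
print(Adapter("payments").execute(artifact, tag, intent_hash="3f2a"))
```

The design point is that the checks live in the adapter itself. A caller that skips the gate has no tag to present, and a captured tag fails on the second use or at any other adapter.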

Closing accountability and traceability gaps

Weak accountability is its own OWASP-style risk, because a system that cannot prove what it did cannot be governed. SovereignClaw emits a signed Authority Receipt for every permitted execution and also records denials and escalations. All of these records are written to an append-only Merkle ledger that can be verified externally without private keys. Receipts also carry the published skill digest, tenant scope, and correlation IDs, which supports traceability across multi-tenant and multi-step workflows.
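To make "externally verifiable without private keys" concrete, here is a minimal Merkle inclusion check in Python. It is a sketch under assumptions, not the product's ledger format: the receipt field names are hypothetical, receipt signatures are omitted, and duplicating the last node on odd-sized levels is a simplification chosen for brevity.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def _leaf_hashes(leaves):
    # Prefix bytes keep leaf hashes and interior-node hashes distinct.
    return [_h(b"\x00" + leaf) for leaf in leaves]


def _next_level(level):
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Compute the root over a non-empty list of receipt bytes."""
    level = _leaf_hashes(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # simplification: duplicate the last node
        level = _next_level(level)
    return level[0]


def inclusion_proof(leaves, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to root."""
    level = _leaf_hashes(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one receipt and its proof; needs no keys."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


# Hypothetical receipt records; the field names are illustrative only.
receipts = [
    b'{"decision":"permit","tenant":"t1","correlation_id":"c-1"}',
    b'{"decision":"deny","tenant":"t1","correlation_id":"c-2"}',
    b'{"decision":"escalate","tenant":"t2","correlation_id":"c-3"}',
]
root = merkle_root(receipts)
proof = inclusion_proof(receipts, 2)
print(verify(receipts[2], proof, root))
```

An auditor who holds only the published root, one receipt, and its proof can run `verify` independently. Changing any recorded receipt changes the root, so the previously published root no longer matches.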

Used together, these controls turn an OWASP-style risk list from a checklist of fears into a set of testable behaviors. SovereignClaw supports and helps operationalize agentic AI security guidance; it does not guarantee that every risk is eliminated, and a sound program still combines these runtime controls with secure development and monitoring practices.

Frequently Asked Questions

How does runtime governance address prompt injection?
By treating model output as untrusted input. Intent is canonicalized and frozen, and tier-driving facts are derived independently from operation semantics, so an injected proposal still has to pass independent fact inference and deterministic policy before anything executes.

What stops an agent from using tools it should not?
Adapter binding and the execution boundary. No operation reaches an adapter without a valid gate artifact bound to the intent hash, policy bundle, adapter identity, and nonce, and elevated operations can require threshold approval before they run.

Does mapping to OWASP guidance eliminate agentic AI risk?
No. SovereignClaw supports and helps operationalize OWASP agentic AI guidance through concrete runtime controls and evidence, but a complete program still combines these controls with secure development, monitoring, and operational practices.