Governance, not just security
Security blocks the bad.
Governance proves the good.
Four layers of AI security exist. Only one stops the action. This is the open, reproducible methodology for measuring all four — 24 capability dimensions, 7 enterprise platforms evaluated, three-point evidence scale, vendors invited to correct the record.
Four layers weighted by research: Model Security (15%), Prompt Security (20%), IAM & Endpoint (15%), Execution Governance (50%). Each vendor is scored 0, 1, or 2 per dimension against publicly documented capabilities. Methodology is open; vendors may submit corrections with supporting evidence to methodology@whitefin.ai.
00 — TWO SIDES
Two sides of the same coin
01 — SCORING SCALE
Three-Point Evidence Scale
Every dimension is scored against publicly documented capability. Marketing claims without technical substantiation score 0.
02 — FORMULA
How Scores Combine
03 — LAYER WEIGHTS
Research-Backed Weighting
Layer weights reflect where attacks succeed and where damage is stopped. Fifty percent on Layer 4 is not a preference — it is what the research literature on agentic AI security converges on when asked "which layer must hold?"
04 — WHY 50% ON LAYER 4
Deterministic Enforcement Is a Necessary Condition
Layers 1–3 are preventive. Layer 4 is deterministic enforcement. If all three upstream layers fail simultaneously — a compromised model (L1 fail) processes a poisoned tool result (L2 fail) from a trusted endpoint (L3 fail) — only Layer 4 blocks the resulting unauthorized action.
Rivasseau & Fung (arXiv:2604.02500, April 2026) demonstrate the failure mode: the majority of 16 state-of-the-art models chose to suppress evidence of fraud autonomously. A probabilistic guardian built on these models inherits the same failure.
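The deterministic-enforcement argument above can be illustrated with a minimal Python sketch of a deny-by-default gate that sits in front of every tool call. This is not Whitefin's implementation. The names `ToolCall`, `POLICY`, `is_permitted`, and `execute`, the agent and tool identifiers, and the refund limit are all hypothetical, chosen only to show that the decision depends on an explicit rule lookup and never on model output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class ToolCall:
    agent_id: str
    tool: str
    args: Mapping[str, Any]


# Hypothetical allowlist: (agent, tool) -> predicate over the call arguments.
# Anything not listed here is denied, regardless of what upstream layers decided.
POLICY: dict[tuple[str, str], Callable[[Mapping[str, Any]], bool]] = {
    ("billing-agent", "refund"): lambda a: (
        isinstance(a.get("amount"), (int, float)) and 0 < a["amount"] <= 100
    ),
}


def is_permitted(call: ToolCall, policy=POLICY) -> bool:
    """Return True only if an explicit rule exists and its argument check passes."""
    check = policy.get((call.agent_id, call.tool))
    if check is None:
        return False  # deny by default: no rule means no execution
    return bool(check(call.args))


def execute(call: ToolCall, run: Callable[[ToolCall], Any], policy=POLICY) -> Any:
    """Run the tool only after the deterministic check has passed."""
    if not is_permitted(call, policy):
        raise PermissionError(f"{call.agent_id} may not call {call.tool}")
    return run(call)
```

Because the gate checks arguments as well as tool names, a permitted tool called with out-of-policy parameters (for example a refund above the limit) is rejected in the same way as an unlisted tool.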
05 — DIMENSIONS
The 24 Capability Dimensions
Six dimensions per layer. Each dimension names a specific, publicly observable capability — not a marketing category.
L1 — Model Security
Is the model safe, aligned, and tested?
L2 — Prompt Security
Is the input malicious or manipulated?
L3 — Endpoint Security
Is the AI tool on the endpoint secure?
L4 — Execution Governance
Is the action permitted before it executes?
The framework, painted
Four rows, six columns. Each row is one layer of the framework. Each column is one dimension.
06 — VENDOR RANKING
Summary Scoring Table
Total Defense score per vendor under current methodology (L1·0.15 + L2·0.20 + L3·0.15 + L4·0.50). Bars sized by Total Defense.
Per-Layer Coverage
| Vendor | L1 | L2 | L3 | L4 | Total |
|---|---|---|---|---|---|
| Whitefin | 67% | 75% | 67% | 100% | 85% |
| Palo Alto Networks | 92% | 8% | 83% | 15% | 35% |
| Microsoft Agent 365 | 25% | 33% | 92% | 25% | 37% |
| Cisco AI Defense | 25% | 33% | 92% | 10% | 29% |
| SentinelOne | 8% | 75% | 0% | 0% | 16% |
| CrowdStrike | 8% | 83% | 0% | 0% | 18% |
| Zenity | 8% | 25% | 25% | 17% | 18% |
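The Total column can be recomputed from the four per-layer percentages and the stated weights. The following is a minimal Python sketch of that arithmetic. The names `WEIGHTS`, `LAYER_COVERAGE`, and `total_defense` are illustrative rather than part of the published methodology, and rounding to the nearest whole percent is an assumption.

```python
# Layer weights as stated in the methodology, in order L1, L2, L3, L4.
WEIGHTS = (0.15, 0.20, 0.15, 0.50)

# Per-layer coverage percentages copied from the table above.
LAYER_COVERAGE = {
    "Whitefin": (67, 75, 67, 100),
    "Palo Alto Networks": (92, 8, 83, 15),
    "Microsoft Agent 365": (25, 33, 92, 25),
    "Cisco AI Defense": (25, 33, 92, 10),
    "SentinelOne": (8, 75, 0, 0),
    "CrowdStrike": (8, 83, 0, 0),
    "Zenity": (8, 25, 25, 17),
}


def total_defense(layers, weights=WEIGHTS):
    """Weighted sum of per-layer percentages: L1*w1 + L2*w2 + L3*w3 + L4*w4."""
    if len(layers) != len(weights):
        raise ValueError("expected one percentage per layer weight")
    return sum(pct * w for pct, w in zip(layers, weights))


if __name__ == "__main__":
    for vendor, layers in LAYER_COVERAGE.items():
        print(f"{vendor:<22}{round(total_defense(layers)):>4}%")
```

Running the script prints one recomputed total per vendor, which makes it easy to check any row of the table against the formula.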
07 — EVIDENCE
Whitefin — Full 24-Dimension Evidence
Every score below maps to a named product capability. Zero scores on D1.3, D1.4, and D3.3 reflect architectural scope — Whitefin is a gateway, not a model scanner or binary-signing service.
| Dim | Capability | Score | Evidence |
|---|---|---|---|
| D1.1 | Adversarial testing | 2 | Gulliver: 37 adversarial templates, Live Demo Mode against customer models |
| D1.2 | Output validation | 2 | Output Assurance v3: PostExecVerdict, response correctness verification |
| D1.3 | Training data provenance | 0 | Not in scope (gateway architecture; does not access model internals) |
| D1.4 | Model weight scanning | 0 | Not in scope (gateway architecture; does not access model internals) |
| D1.5 | Governance posture | 2 | Warden: open-source governance scanner, 4-layer / 24-dimension scoring framework |
| D1.6 | Continuous monitoring | 2 | Continuous Adversarial Training (CAT): weekly update packages, Ed25519 signed |
| D2.1 | Direct injection | 1 | Pattern-based injection detection across multiple guard stages |
| D2.2 | Indirect / environmental | 2 | Trap Defense: 6 detectors + Behavioral Causality Engine + Canary Content Injection |
| D2.3 | Content classification | 1 | DLP pipeline with PII/PHI detection and classification |
| D2.4 | Semantic intent | 2 | Embedding-similarity stage + LLM-classification stage in the guard chain |
| D2.5 | Jailbreak detection | 1 | Covered by guard chain but not primary specialization |
| D2.6 | Output scanning | 2 | Output Assurance v3: post-generation scanning + PostExecVerdict |
| D3.1 | Agent discovery | 1 | Inspect Census: discovers agents including shadow / ungoverned |
| D3.2 | MCP scanning | 2 | Warden MCP configuration scanning + auto-discovery from MCP servers |
| D3.3 | Supply chain | 0 | Not in scope (does not perform binary scanning or software signing) |
| D3.4 | Agent identity | 2 | Agent Passport: cryptographic Ed25519 identity per agent |
| D3.5 | Schema drift | 1 | Schema validation with drift detection for registered tools |
| D3.6 | Shadow detection | 2 | Census + Shadow Mode: discovers and governs ungoverned agents |
| D4.1 | Inline enforcement | 2 | ToolGuard: full inline proxy, intercepts every tool call before execution |
| D4.2 | Deny-by-default | 2 | Core architecture: all actions denied unless explicitly permitted |
| D4.3 | Argument-level | 2 | Schema validation + policy evaluation stages: inspect tool call parameters, not just names |
| D4.4 | Behavioral monitoring | 2 | EWMA baselines + Behavioral Causality Engine + entropy monitoring + Kill Switch |
| D4.5 | Audit trail | 2 | WORM audit: Ed25519 hash chains, 7-year retention, non-repudiation |
| D4.6 | HITL + auto-policy | 2 | HITL at proxy layer + Policy Bootstrap (auto-generate from Shadow Mode) + canary |
| WHITEFIN TOTAL DEFENSE | (67 × 0.15) + (75 × 0.20) + (67 × 0.15) + (100 × 0.50) = 85% | | |
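The per-layer percentages used in the total can be derived from the dimension scores in the table above. The sketch below assumes that a layer's percentage is points earned divided by the maximum possible (six dimensions at 2 points each), which reproduces the 67 / 75 / 67 / 100 figures. The names `WHITEFIN_SCORES` and `layer_percent` are illustrative.

```python
# Dimension scores copied from the evidence table, grouped by layer.
WHITEFIN_SCORES = {
    "L1": [2, 2, 0, 0, 2, 2],  # D1.1 to D1.6
    "L2": [1, 2, 1, 2, 1, 2],  # D2.1 to D2.6
    "L3": [1, 2, 0, 2, 1, 2],  # D3.1 to D3.6
    "L4": [2, 2, 2, 2, 2, 2],  # D4.1 to D4.6
}

MAX_SCORE = 2  # top of the three-point scale (0, 1, 2)


def layer_percent(scores):
    """Points earned as a percentage of the maximum for the layer (assumed rule)."""
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each dimension must be scored 0, 1, or 2")
    return 100 * sum(scores) / (MAX_SCORE * len(scores))


if __name__ == "__main__":
    for layer, scores in WHITEFIN_SCORES.items():
        print(layer, f"{round(layer_percent(scores))}%")
```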
08 — ROBUSTNESS
Sensitivity Analysis
Any weighting is inherently a judgment call. The table below recomputes Total Defense for five of the seven vendors under four weighting schemes (Equal, Conservative, Current, Aggressive) to test whether Whitefin's lead is an artifact of the chosen weights.
| Vendor | Equal | Conservative | Current | Aggressive |
|---|---|---|---|---|
| Whitefin | 77% | 81% | 85% | 90% |
| Palo Alto Networks | 50% | 42% | 35% | 28% |
| Microsoft Agent 365 | 44% | 40% | 37% | 33% |
| Cisco AI Defense | 40% | 35% | 29% | 23% |
| SentinelOne | 21% | 20% | 16% | 12% |
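A sensitivity check of this kind can be sketched by re-running the weighted sum under each scheme and comparing the leader. The Python below covers only two schemes: Current, whose weights are stated in the methodology, and Equal, assumed here to mean 25% per layer. The Conservative and Aggressive weights are not given in the text, so they are left out rather than guessed. Per-layer values come from the coverage table in section 06, and the names `SCHEMES`, `COVERAGE`, `score`, and `leader` are illustrative.

```python
# Weighting schemes as (L1, L2, L3, L4). "Equal" at 25% each is an assumption.
SCHEMES = {
    "Equal": (0.25, 0.25, 0.25, 0.25),
    "Current": (0.15, 0.20, 0.15, 0.50),
}

# Per-layer coverage percentages for the five vendors in the table above.
COVERAGE = {
    "Whitefin": (67, 75, 67, 100),
    "Palo Alto Networks": (92, 8, 83, 15),
    "Microsoft Agent 365": (25, 33, 92, 25),
    "Cisco AI Defense": (25, 33, 92, 10),
    "SentinelOne": (8, 75, 0, 0),
}


def score(layers, weights):
    """Weighted sum of per-layer percentages."""
    return sum(pct * w for pct, w in zip(layers, weights))


def leader(weights, coverage=COVERAGE):
    """Vendor with the highest score under the given weights."""
    return max(coverage, key=lambda vendor: score(coverage[vendor], weights))


if __name__ == "__main__":
    for name, weights in SCHEMES.items():
        print(f"{name}: leader = {leader(weights)}")
```

Adding a further scheme only requires a new entry in `SCHEMES` whose four weights sum to 1.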
09 — COMPOSITION
Best-of-Breed Analysis
What Total Defense a buyer can reach by stacking specialist tools — and where that approach still leaves the gap.
09 — COMPLIANCE PASSPORT
One-click PDF for regulators
Signed, timestamped, automatically generated from live pipeline state. The Layer 4 evidence regulators ask for during audits — without a six-week SOC engagement.
10 — CORRECTIONS
Updated Quarterly
Vendors may submit corrections with supporting evidence — product documentation, press releases, or technical specifications.
11 — REFERENCES
Research Citations
- [1] Bhattarai, M. & Vu, M. (2026). Trustworthy Agentic AI Requires Deterministic Architectural Boundaries. arXiv:2602.09947.
- [2] Google DeepMind (2026). AI Agent Traps: Environmental Manipulation of Production Agents.
- [3] Anthropic (2026). Subliminal Learning: Transmission of Misalignment Through Clean Data. Nature.
- [4] Rivasseau, T. & Fung, B. (2026). I Must Delete the Evidence: AI Agents Explicitly Cover up Fraud and Violent Crime. arXiv:2604.02500.
- [5] OWASP (2025). Top 10 for Agentic Applications 2026.
- [6] Forrester (2025). Introducing Forrester’s AEGIS Framework: Agentic AI Enterprise Guardrails for Information Security.
- [7] Gartner (2025). Guardian Agents will Capture 10–15% of the Agentic AI Market by 2030.