Risk assessment answers the question every platform operator asks before letting an agent act: how dangerous is this, right now, in this context? Unlike static reputation scores, risk assessments are dynamic. The same agent can be low-risk for a data access request and high-risk for a financial transaction, because different actions stress different capabilities. Risk assessments incorporate the agent’s reputation, recent violation history, the specific action being attempted, and the operator’s risk tolerance. For teams, the engine goes further: it evaluates whether a group of individually acceptable agents might still be dangerous when combined through correlated failure, value divergence, or contagion dynamics.
Individual Risk
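Each individual risk assessment produces a score between 0 and 1, computed as a weighted combination of three terms: context-aware component risk (60%), a recency penalty (30%), and a confidence penalty (10%).
Context-Aware Component Risk (60%)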
Five reputation components are weighted differently depending on the action type:
| Component | What It Tracks |
|---|---|
| Integrity Ratio | How often behavior matches declared values |
| Compliance | Adherence to organizational rules |
| Drift Stability | Whether behavior is changing over time |
| Trace Completeness | Whether full reasoning traces are provided |
| Coherence Compatibility | How well the agent works with the fleet |
For example, a financial_transaction emphasizes compliance (0.30) and integrity (0.30). A task_delegation emphasizes coherence (0.35), which reflects whether the agent can hand off work reliably. A multi_agent_coordination action weights coherence highest (0.40).
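As a rough illustration of how an action-specific weight profile could combine the five components, consider the sketch below. Only the 0.30 compliance and 0.30 integrity weights for financial_transaction come from the text. The other three weights, the 0-to-1 component scale, the `1 - score` conversion from component health to risk, and all identifiers are assumptions made for this example.

```typescript
// Hypothetical names throughout; not the engine's actual API.
type ComponentKey =
  | "integrityRatio"
  | "compliance"
  | "driftStability"
  | "traceCompleteness"
  | "coherenceCompatibility";

// ASSUMPTION: each component is a score from 0 (worst) to 1 (best).
type ComponentValues = Record<ComponentKey, number>;

const FINANCIAL_TRANSACTION_WEIGHTS: ComponentValues = {
  integrityRatio: 0.30,         // from the text
  compliance: 0.30,             // from the text
  driftStability: 0.15,         // assumed
  traceCompleteness: 0.15,      // assumed
  coherenceCompatibility: 0.10, // assumed
};

// Weighted average of per-component risk, where risk is taken as (1 - score).
function componentRisk(scores: ComponentValues, weights: ComponentValues): number {
  const keys = Object.keys(weights) as ComponentKey[];
  let weighted = 0;
  let totalWeight = 0;
  for (const key of keys) {
    weighted += weights[key] * (1 - scores[key]);
    totalWeight += weights[key];
  }
  // Normalize so that profiles whose weights do not sum to exactly 1 still yield 0..1.
  return totalWeight === 0 ? 0 : weighted / totalWeight;
}
```

Under these assumed weights, an agent that is perfect on every component except compliance (0.5) would carry a component risk of about 0.15 for a financial transaction.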
Six action types are supported: financial_transaction, data_access, task_delegation, tool_invocation, autonomous_operation, and multi_agent_coordination. Each has a distinct weight profile tuned to the risks specific to that action.
Recency Penalty (30%)
Recent violations count more than old ones. The engine uses exponential decay with a 30-day half-life, so a violation’s contribution halves every 30 days.
Confidence Penalty (10%)
Agents with limited behavioral history receive an uncertainty premium: insufficient data adds 0.30, low confidence adds 0.20, medium adds 0.10, and high confidence adds nothing.
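The recency decay, the confidence premium, and the composite score can be sketched together as follows. The 60/30/10 split is taken from the section headings, and the 30-day half-life and premium values from the text. How decayed violations are aggregated into one penalty, and whether the premium is scaled by the 10% weight, are not specified here, so the capped sum and the plain weighted sum below are assumptions, as are all identifiers.

```typescript
type Confidence = "insufficient" | "low" | "medium" | "high";

const HALF_LIFE_DAYS = 30;

// Premiums quoted in the text.
const CONFIDENCE_PREMIUM: Record<Confidence, number> = {
  insufficient: 0.30,
  low: 0.20,
  medium: 0.10,
  high: 0,
};

// Weight of a single violation: 1 when fresh, halving every 30 days.
function violationWeight(ageDays: number): number {
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// ASSUMPTION: decayed weights are summed and capped at 1.
function recencyPenalty(violationAgesDays: number[]): number {
  const total = violationAgesDays.reduce((sum, age) => sum + violationWeight(age), 0);
  return Math.min(1, total);
}

// ASSUMPTION: plain weighted sum of the three terms, clamped to [0, 1].
function individualRisk(
  componentRiskScore: number,
  violationAgesDays: number[],
  confidence: Confidence,
): number {
  const raw =
    0.6 * componentRiskScore +
    0.3 * recencyPenalty(violationAgesDays) +
    0.1 * CONFIDENCE_PREMIUM[confidence];
  return Math.min(1, Math.max(0, raw));
}
```

With these assumptions, a violation from 60 days ago weighs a quarter of one committed today, and an agent with a clean record, zero component risk, and high confidence scores 0.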
Risk Levels and Recommendations
The composite score maps to four risk levels. Thresholds shift based on the caller’s risk tolerance:
| Tolerance | Low | Medium | High | Critical |
|---|---|---|---|---|
| Conservative | < 0.15 | < 0.35 | < 0.55 | >= 0.55 |
| Moderate | < 0.25 | < 0.50 | < 0.75 | >= 0.75 |
| Aggressive | < 0.35 | < 0.60 | < 0.85 | >= 0.85 |
Each risk level maps to a recommendation: approve (low or medium), review (high, requires human approval), or deny (critical, blocks the action).
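The threshold table and the level-to-recommendation mapping can be expressed as a small lookup. This is a minimal sketch: the threshold values come from the table above, while the type and function names are hypothetical.

```typescript
type Tolerance = "conservative" | "moderate" | "aggressive";
type RiskLevel = "low" | "medium" | "high" | "critical";
type Recommendation = "approve" | "review" | "deny";

// Exclusive upper bounds for low, medium, and high, copied from the table.
const THRESHOLDS: Record<Tolerance, [number, number, number]> = {
  conservative: [0.15, 0.35, 0.55],
  moderate: [0.25, 0.50, 0.75],
  aggressive: [0.35, 0.60, 0.85],
};

function riskLevel(score: number, tolerance: Tolerance): RiskLevel {
  const [lowMax, mediumMax, highMax] = THRESHOLDS[tolerance];
  if (score < lowMax) return "low";
  if (score < mediumMax) return "medium";
  if (score < highMax) return "high";
  return "critical";
}

function recommend(level: RiskLevel): Recommendation {
  if (level === "critical") return "deny";
  if (level === "high") return "review";
  return "approve"; // low and medium are both approved
}
```

The same score can land in different levels depending on tolerance: 0.30 is medium for a conservative caller but low for an aggressive one.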
Team Risk
Team risk assessment evaluates whether a group of agents is safe to operate together. A team of individually low-risk agents can still be dangerous.
Three-Pillar Model
Shapley Attribution
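After computing team coherence, the engine attributes each agent’s marginal contribution using leave-one-out (LOO) Shapley values: the team is re-scored with each agent removed in turn, and the resulting change is attributed to that agent.
Circuit Breakers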
Hard safety floors override the continuous score when conditions are extreme:
- Any agent with reputation below 200 forces the team to critical/deny
- Any pairwise boundary compatibility below 100 forces critical/deny
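The leave-one-out attribution and the circuit breakers can be sketched as below. The team scoring function is passed in as a parameter because its definition is not given here. The sign convention (full-team score minus the score without the agent) is an assumption, and every identifier is hypothetical. Only the two floors, reputation below 200 and pairwise boundary compatibility below 100, come from the text.

```typescript
// Hypothetical: any function that maps a set of agent IDs to a team-level score.
type TeamScorer = (agentIds: string[]) => number;

// Leave-one-out contribution of each agent:
// score of the full team minus the score of the team without that agent.
function looContributions(agentIds: string[], score: TeamScorer): Record<string, number> {
  const full = score(agentIds);
  const contributions: Record<string, number> = {};
  for (const id of agentIds) {
    const withoutAgent = agentIds.filter((other) => other !== id);
    contributions[id] = full - score(withoutAgent);
  }
  return contributions;
}

interface PairCompatibility {
  agentA: string;
  agentB: string;
  boundaryCompatibility: number;
}

// Returns true when either hard floor is violated, which forces critical/deny
// regardless of the continuous team score.
function circuitBreakerTripped(reputations: number[], pairs: PairCompatibility[]): boolean {
  const lowReputation = reputations.some((reputation) => reputation < 200);
  const incompatiblePair = pairs.some((pair) => pair.boundaryCompatibility < 100);
  return lowReputation || incompatiblePair;
}
```

Leave-one-out needs only one extra scoring pass per agent, which keeps attribution linear in team size; it is cheaper than averaging over all coalitions as exact Shapley values would require.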
Additional Analytics
| Analysis | What It Detects |
|---|---|
| Outlier Detection | Agents whose risk exceeds the fleet mean by > 1 standard deviation |
| Cluster Detection | Groups of agents with correlated risk (shared failure modes) |
| Value Divergence | Values declared by some agents but missing in others |
| Synergy Detection | Whether the team score is better or worse than the average of individual scores |
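Two of these analyses are simple enough to sketch directly. The one-standard-deviation rule comes from the table. The choice of population (rather than sample) standard deviation, the sign convention for synergy, and the function names are assumptions for illustration.

```typescript
// Agents whose risk exceeds the fleet mean by more than one standard deviation.
// ASSUMPTION: population standard deviation (divide by n, not n - 1).
function findOutliers(risks: Record<string, number>): string[] {
  const ids = Object.keys(risks);
  if (ids.length === 0) return [];
  const values = ids.map((id) => risks[id]);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  const threshold = mean + Math.sqrt(variance);
  return ids.filter((id) => risks[id] > threshold);
}

// Team score minus the average individual score.
// ASSUMPTION: scores are risk, so a negative result means the team is
// safer than its members' average and a positive result means it is riskier.
function synergyDelta(teamScore: number, individualScores: number[]): number {
  if (individualScores.length === 0) return 0; // nothing to compare against
  const average =
    individualScores.reduce((sum, v) => sum + v, 0) / individualScores.length;
  return teamScore - average;
}
```

For a fleet with risks 0.1, 0.1, 0.1, and 0.9, the mean is 0.3 and the threshold is roughly 0.65, so only the last agent is flagged.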
Team Recommendations
| Score Range | Recommendation | Meaning |
|---|---|---|
| < 0.20 | approve_team | Team operates as a unit |
| < 0.40 | approve_team | Team operates with monitoring |
| < 0.60 | approve_individuals_only | Individual actions allowed, joint operations blocked |
| >= 0.60 | deny | All team operations blocked |
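The recommendation table reduces to a short cascade of comparisons. In this sketch the score ranges and recommendation strings come from the table, and the override reflects the circuit breakers described earlier. The `monitoring` flag and the `teamDecision` name are hypothetical, introduced only to distinguish the two approve_team rows.

```typescript
type TeamRecommendation = "approve_team" | "approve_individuals_only" | "deny";

interface TeamDecision {
  recommendation: TeamRecommendation;
  monitoring: boolean; // hypothetical flag: true for the 0.20 to 0.40 band
}

function teamDecision(teamScore: number, breakerTripped: boolean = false): TeamDecision {
  // A tripped circuit breaker overrides the continuous score.
  if (breakerTripped || teamScore >= 0.60) {
    return { recommendation: "deny", monitoring: false };
  }
  if (teamScore < 0.20) {
    return { recommendation: "approve_team", monitoring: false };
  }
  if (teamScore < 0.40) {
    return { recommendation: "approve_team", monitoring: true };
  }
  return { recommendation: "approve_individuals_only", monitoring: false };
}
```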
Zero-Knowledge Proofs
Every risk assessment can be cryptographically proven correct without revealing the underlying reputation data. The entire risk computation is implemented in both TypeScript (for fast online use) and Rust (for ZK proof generation inside SP1 zkVM). All arithmetic in the proving guest uses Q16.16 fixed-point integers; no floating-point operations exist in the proof circuit. This ensures perfect determinism across substrates. Proofs are generated asynchronously: the risk score is returned immediately, and the proof follows. Once available, any third party can verify the proof without seeing the input data.
ZK proofs are available on Developer, Team, and Enterprise plans. Free-tier assessments do not include proofs.
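To illustrate the Q16.16 fixed-point representation mentioned above, here is a small TypeScript sketch. It is not the engine's Rust guest code. The helper names and the rounding choices (round on conversion, floor on multiplication) are assumptions. In Q16.16, a value is stored as an integer with 16 fractional bits, so 1.0 is represented as 65536.

```typescript
const Q16_ONE = 1 << 16; // 65536 represents 1.0

// Convert a real number to Q16.16. ASSUMPTION: round to nearest.
function toQ16(value: number): number {
  return Math.round(value * Q16_ONE);
}

// Convert back to a floating-point number for display only.
function fromQ16(fixed: number): number {
  return fixed / Q16_ONE;
}

// Multiply two Q16.16 values and rescale, rounding toward negative infinity.
// For scores in [0, 1] the intermediate product is at most 2^32, far below
// 2^53, so the double-precision arithmetic here is exact.
function mulQ16(a: number, b: number): number {
  return Math.floor((a * b) / Q16_ONE);
}

// Addition needs no rescaling: both operands share the same scale.
function addQ16(a: number, b: number): number {
  return a + b;
}
```

Because every intermediate value is an integer, the same inputs produce bit-identical outputs on any platform, which is the property a proof circuit needs.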
Further Reading
- Reputation Scores — the input data that feeds risk assessments
- Team Trust Rating — team-level reputation built from team risk assessments
- Fleet Coherence — the pairwise coherence data used for team risk
- Integrity Checkpoints — how violations are detected
- Risk Engine Guide — step-by-step usage with SDK examples
- Security & Trust Model — the broader cryptographic verification pipeline