The Security Violations Feed helps you monitor how sensitive data is handled across the platform. It identifies when content or behavior violates internal security policies and shows what was shared, which agent was involved, and where the issue originated. Use the feed to improve data handling practices and maintain a strong security posture.


Key Metrics in the Feed

Each row in the feed includes:

  • Policy – The specific security policy that was violated (e.g., prompt injection, secrets).
  • Finding Type – The category of sensitive information involved in the violation.
  • Finding – The actual data or input that triggered the policy violation.
  • Source – The origin of the request or trigger that led to the violation.
  • Agent – The agent that processed the input or request.
  • Project – The project linked to the violation.
  • Confidence – The system’s confidence level in identifying the violation correctly.

Security Violation Details

Click on any row to view more information:

  • Violation Details – Includes the policy, finding type, finding, and confidence score.
  • Violating Text – Highlights the exact section of the prompt that triggered the violation (shown in red).

This detailed view is a powerful audit tool for reviewing and understanding security risks within agent interactions.


Filter for Targeted Analysis

Refine the feed using these filters:

  • Date – Focus on violations from a specific timeframe.
  • Project – Drill into violations from a particular team or initiative.
  • Policy – Filter by security rule, such as prompt injection or exposed secrets.

Refresh & Export

  • Refresh the feed to pull in the latest violations in real time.
  • Export to CSV to review offline, share with stakeholders, or analyze using external tools.
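
Once exported, the CSV can be analyzed with any tool that reads tabular data. The following is a minimal Python sketch of one way to do that, not an official integration. It assumes the exported header names match the feed's columns (Policy, Finding Type, Finding, Source, Agent, Project, Confidence) and that Confidence is a number between 0 and 1; check the actual export and adjust if it differs. The sample rows and the helper names (load_violations, filter_violations, count_by) are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample export. The header names mirror the feed's columns,
# but the real CSV headers and the Confidence format may differ.
SAMPLE_EXPORT = """Policy,Finding Type,Finding,Source,Agent,Project,Confidence
secrets,API key,sk-****,chat,support-bot,billing,0.97
prompt injection,instruction override,ignore previous instructions,api,research-bot,search,0.62
secrets,password,hunter2,chat,support-bot,billing,0.88
"""


def load_violations(csv_text):
    """Parse exported CSV text into a list of dict rows keyed by header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def filter_violations(rows, policy=None, project=None, min_confidence=0.0):
    """Keep rows that match the optional policy/project and confidence floor."""
    kept = []
    for row in rows:
        if policy is not None and row["Policy"] != policy:
            continue
        if project is not None and row["Project"] != project:
            continue
        # Assumes Confidence is a plain decimal such as 0.97.
        if float(row["Confidence"]) < min_confidence:
            continue
        kept.append(row)
    return kept


def count_by(rows, column):
    """Count rows per distinct value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_violations(SAMPLE_EXPORT)
high_confidence = filter_violations(rows, min_confidence=0.8)
print(count_by(high_confidence, "Policy"))
```

To run this against a real export, read the file's contents (for example with open(path, newline="").read()) and pass them to load_violations in place of SAMPLE_EXPORT.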