Easy
Routine IT helpdesk cases focused on account recovery, VPN restoration, approved SaaS access, and license assignment.
A multi-step benchmark for operational AI agents. Each episode contains helpdesk or security tickets that require evidence gathering, policy checks, and a safe final action such as unlock, revoke, deny, or escalate.
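The episode flow described above (gather evidence, check policy, then take a safe final action) can be pictured with a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `Ticket` fields, the `resolve` helper, and the rule that an incomplete investigation falls back to escalation are assumptions, not the benchmark's actual schema or scoring logic. Only the four action names come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class FinalAction(Enum):
    # Actions named in the description; the real benchmark may define more.
    UNLOCK = "unlock"
    REVOKE = "revoke"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass
class Ticket:
    # Hypothetical fields, chosen only to mirror the steps in the description.
    ticket_id: str
    summary: str
    evidence: List[str] = field(default_factory=list)
    policy_checked: bool = False


def resolve(ticket: Ticket, proposed: FinalAction) -> FinalAction:
    """Return the proposed action only when evidence has been gathered and
    policy has been checked; otherwise fall back to escalation.

    The escalation fallback is an assumption of this sketch.
    """
    if not ticket.evidence or not ticket.policy_checked:
        return FinalAction.ESCALATE
    return proposed
```

In this sketch, a ticket with no recorded evidence resolves to `FinalAction.ESCALATE` whatever action is proposed, while a ticket with at least one evidence item and `policy_checked=True` returns the proposed action unchanged.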
Moderately complex support cases involving policy checks, travel-risk interpretation, department transfers, and escalation boundaries.
High-stakes operational cases covering offboarding failures, probable compromise, unmanaged devices, production data access, and audit-driven remediation.
Security-heavy cases involving compromise signals, leaked credentials, offboarding drift, phishing, and unsafe data-handling requests.