Frequently Asked Questions

Overview

What is Veris AI?

Veris AI is a simulation platform that trains and tests AI agents by recreating the full "world" they operate in, including users, tools, APIs, and knowledge. You can ship reliable agents with confidence and safeguard their behavior once they are in production.

How is this different from traditional evaluation?

Static test sets and prompt tweaks do not capture the dynamic, multi-turn, stateful complexity of real agents. Veris generates thousands of realistic, simulated interactions, including adversarial users and tool failures, so you find problems before your customers do.

Do I need to refactor my agent to use Veris AI?

No. Veris works with your agent as is, regardless of the framework, model, or deployment location. Integrate by installing the Veris AI SDK and adding a few lines of code to tool calls.

How quickly can I get started?

Most teams are running their first simulations within days. The only required steps are integration and deployment, which can be in your VPC or on Veris Cloud.

Is my data safe?

Yes. If you prefer not to share data, Veris AI simulations use synthetic scenarios. Veris AI is SOC 2 Type 2 compliant and supports deployment in your own cloud (AWS, GCP, Azure) or on-prem for full data control.

General

What is your pricing model?

Pricing is usage-based and depends on the volume of simulations and the complexity of your agent's environment. Contact us for details.

What is the ROI of using Veris AI?

Two things: velocity and confidence. Engineers iterate faster because they have a safety net that catches regressions immediately. You can quantify agent reliability (e.g., "98% success rate on refunds") before selling the agent to a client. Once in production, your agent has an operational safety net.

Who is the typical buyer?

Engineering and AI/ML teams building production-grade agents, whether at a startup shipping its first agent or an enterprise scaling agentic automation across departments.

How does Veris AI compare to manual QA or human labeling?

Veris AI replaces human-in-the-loop review and static golden datasets; simulation is the best place to train autonomous agents. Veris runs thousands of parallel simulations 24/7 and evaluates more sessions than a human can review in a week, dramatically tightening your iteration loop.

Does Veris AI replace the need for real user data?

It's complementary. Real user data is ground truth but scarce and slow to collect. Veris acts as a data multiplier: you use production insights to update simulation parameters and create failure variations, which then generate 100x more synthetic training data to fix the issues you found.

What are some case studies?

A tech startup uses Veris to train executive assistant agents on complex calendar and confidential information scenarios. A leading manufacturer uses Veris to train supply chain agents on sourcing tasks, including supplier research, RFP generation, and negotiation. For more enterprise case studies, email us at hello@veris.ai.

How does this fit into our existing tooling if we already have an agent platform / observability / ticket router?

Veris is an overlay control plane that integrates with your existing stack (Azure, AWS, GCP, Datadog, etc.). It doesn't replace your infrastructure; it makes your agents better before and after they hit production.

Technical - Simulation

Why focus on simulation rather than evaluating on production traces?

Production is the most dangerous place to start learning. It starts off low-volume, high-risk, and high-noise. Veris AI simulations solve this cold-start problem by exposing your agent to thousands of edge cases, adversarial user scenarios, and tool failures in a controlled sandbox before a real customer ever uses the agent. Once in production, Veris AI uses real performance to improve the agent automatically.

What does Veris AI actually simulate?

Veris simulates the entire environment your agent interacts with: users (with realistic personas and behaviors), enterprise tools (Email, Slack, Jira, databases, etc.), APIs, and system states, all maintaining full statefulness throughout each episode.

How do you simulate users?

We use probabilistic personas rather than static scripts. Simulated users have unique goals, personality traits, and frustration thresholds. They can be cooperative, confused, or adversarial, so your agent needs to learn how to handle real human behavior instead of overfitting on happy paths.
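
As a rough illustration of what sampling personas can look like (a sketch only, not Veris's implementation; the `Persona` fields, goals, and trait weights below are assumptions made for the example), each simulated user is drawn from a distribution rather than read from a script:

```python
import random
from dataclasses import dataclass


@dataclass
class Persona:
    goal: str
    temperament: str            # cooperative, confused, or adversarial
    frustration_threshold: int  # unhelpful turns tolerated before the user gives up


# Hypothetical values chosen only for illustration.
GOALS = ["request a refund", "reschedule a meeting", "update a shipping address"]
TEMPERAMENTS = ["cooperative", "confused", "adversarial"]
WEIGHTS = [0.6, 0.25, 0.15]


def sample_persona(rng: random.Random) -> Persona:
    """Draw one simulated user from the assumed distributions."""
    return Persona(
        goal=rng.choice(GOALS),
        temperament=rng.choices(TEMPERAMENTS, weights=WEIGHTS, k=1)[0],
        frustration_threshold=rng.randint(1, 5),
    )


rng = random.Random(42)  # seeded so a scenario set can be reproduced
personas = [sample_persona(rng) for _ in range(1000)]
```

Seeding the generator keeps a scenario set reproducible, which matters when you want to rerun the same users against a new agent version.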

Can Veris simulate the actual tools my agent uses?

Yes. We provide a suite of mocked enterprise tools (e.g., email, Slack, Jira, Postgres) that maintain state throughout the session. If your agent deletes a database row, it stays deleted for the rest of that session. This statefulness is critical for testing multi-step reasoning.
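
To make the statefulness point concrete, here is a minimal sketch of a stateful mock (the `MockTable` class is hypothetical and is not part of any Veris API): a deletion early in the session changes what every later call sees.

```python
class MockTable:
    """Hypothetical in-memory stand-in for a database table, scoped to one session."""

    def __init__(self, rows):
        # Copy the seed rows so every session starts from the same data.
        self.rows = {row["id"]: dict(row) for row in rows}

    def get(self, row_id):
        """Return the row, or None if it does not exist (or was deleted)."""
        return self.rows.get(row_id)

    def delete(self, row_id):
        """Delete a row; return True if something was actually removed."""
        return self.rows.pop(row_id, None) is not None


session_db = MockTable([
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed"},
])

session_db.delete(1)                 # the agent deletes a row in step 1
later_lookup = session_db.get(1)     # a later step no longer finds it
```

A stateless mock would return the row again on the later lookup, hiding exactly the kind of multi-step mistake this test is meant to expose.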

What metrics and signals does Veris output?

We provide custom-made and user-provided success metrics (did the agent achieve the goal?), granular step-by-step rewards for Reinforcement Learning, safety flags (PII leaks, hallucinations), and full trajectory analysis of Thought-Action-Observation loops for debugging.

How does Veris AI enable Reinforcement Learning (RL)?

RL requires state, action, and reward; Veris provides the environment to generate all three. We output structured traces compatible with RLAIF, RLHF, and Reinforcement Fine-Tuning pipelines. You can generate thousands of rollouts, score them programmatically using Veris graders, and fine-tune via PPO, DPO, or similar methods.
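
For readers new to RL, the sketch below shows the general shape of a state/action/reward trace and how per-step rewards roll up into an episode return. The `Step` record and the example episode are assumptions for illustration, not the Veris trace format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    state: dict    # what the agent observed
    action: str    # what the agent did (e.g. a tool call)
    reward: float  # score assigned to this step by a grader


def discounted_return(steps, gamma=0.99):
    """Sum step rewards, discounting each later reward by gamma per step."""
    total = 0.0
    for step in reversed(steps):
        total = step.reward + gamma * total
    return total


# Hypothetical three-step episode: only the final step is rewarded.
episode = [
    Step({"ticket": "open"}, "lookup_order", 0.0),
    Step({"ticket": "open", "order_found": True}, "issue_refund", 0.0),
    Step({"ticket": "resolved"}, "send_confirmation", 1.0),
]
episode_return = discounted_return(episode, gamma=0.5)  # 0.5 * 0.5 * 1.0 = 0.25
```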

Does Veris AI offer RL as a service?

Yes. Veris offers both prompt optimization and Reinforcement Fine-Tuning as built-in optimization capabilities within the platform, so you can go from evaluation to improvement without switching tools.

What differentiates Veris AI from academic RL benchmarks?

Academic environments are typically static or toy problems. Veris is built for enterprise complexity: stateful, multi-turn workflows involving messy tools (CRMs, calendars, databases) and ambiguous user intent, where business outcomes are the priority.

How does Veris fit into my CI/CD pipeline?

Veris is designed to be the "integration test" for agents. You can trigger simulation runs on every pull request via our CLI for regression testing. The run blocks the merge if the agent's success rate drops below your defined threshold, preventing regressions before they reach production.
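
The threshold check itself is simple. The sketch below is not the Veris CLI; `merge_gate` and the sample outcomes are hypothetical and only illustrate how a success rate can be turned into a pass/fail exit code for a CI job:

```python
def merge_gate(outcomes, threshold):
    """Return a process exit code: 0 allows the merge, 1 blocks it.

    `outcomes` is a list of booleans, one per simulated session.
    """
    if not outcomes:
        return 1  # no evidence of success, so block by default
    success_rate = sum(outcomes) / len(outcomes)
    return 0 if success_rate >= threshold else 1


# Hypothetical run: 97 of 100 simulated sessions succeeded.
outcomes = [True] * 97 + [False] * 3
exit_code = merge_gate(outcomes, threshold=0.95)
# A CI script would finish with sys.exit(exit_code) so a non-zero code fails the job.
```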

Do you support both open-source and closed models?

Yes. Veris is model-agnostic and works with any LLM powering your agent. Open-source models (Llama, Mistral, etc.) and closed-source models (GPT, Claude, Gemini, etc.) are all supported, since we interact with your agent as a black box.

Do we need to share our model or training data with you?

No. Veris interacts with your agent through its external interface. Your model weights, prompts, and proprietary data stay entirely within your control. If you choose, you can share data so that we can improve the agent or its underlying model, or to steer the simulation as you need.

Can I bring my own safety policies or evaluation criteria?

Yes, absolutely. You can define custom evaluation rubrics, safety policies, and success criteria that reflect your specific business requirements and compliance needs. We turn them into scenarios and graders.

Can Veris run on-prem or in a private cloud?

Yes. Veris supports deployment in your own VPC (AWS, GCP, Azure), on on-prem infrastructure, or on Veris Cloud, giving you full control over where your data lives.

How quickly can I see results?

Most teams see actionable insights from their first simulation runs within the first week. The 4-day onboarding process takes you quickly from deployment to running parallel scenario sets.

Technical - Runtime

What is Veris Runtime?

Veris Runtime is a production control plane for AI agents that gates actions by the agent's proven competence. Agent actions are equipped with rollback safeties, and every production failure is turned into an automated regression test so your agent earns its autonomy over time.

How is Veris Runtime different from the simulation platform?

The simulation platform trains and evaluates agents before deployment. Veris Runtime sits in production, governing what your agent is allowed to do in real time based on how well it performed in simulation, and feeding production failures back into simulation for continuous improvement.

How is this different from traditional guardrails?

Traditional guardrails use static block/allow rules and don't improve the agent. Veris Runtime uses dynamic routing based on demonstrated competence, confidence-aware escalation to humans, and rollback for safe execution. It also automatically converts every failure into a regression test that makes the agent better.

What is "earned autonomy"?

Earned autonomy means your agent only acts autonomously on scenario types where it has proven competence through simulation. As the agent demonstrates reliability on new scenario classes, its scope of autonomous action expands. It's competence-based, not rule-based.

What is the Competence Gate?

The Competence Gate (or competence firewall) routes incoming tasks based on the agent's demonstrated ability on similar scenarios. Agents only handle what they've proven they can handle, and unproven scenarios are escalated to a human so that you can have an agent in production from day one.
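
One way to picture this routing, as a sketch under stated assumptions (the `route_task` function, the scenario class names, and the thresholds are hypothetical, not Veris Runtime's actual policy):

```python
def route_task(scenario_class, competence, min_success=0.95, min_runs=50):
    """Send a task to the agent only if it has a proven record on that scenario class.

    `competence` maps scenario class -> (success_rate, number_of_simulated_runs).
    """
    record = competence.get(scenario_class)
    if record is None:
        return "human"  # never seen in simulation: escalate
    success_rate, runs = record
    if runs >= min_runs and success_rate >= min_success:
        return "agent"
    return "human"


# Hypothetical competence data gathered from simulation.
competence = {
    "refund_under_100": (0.98, 400),
    "refund_over_1000": (0.81, 120),
}

proven = route_task("refund_under_100", competence)      # "agent"
unproven = route_task("refund_over_1000", competence)    # "human"
unseen = route_task("chargeback_dispute", competence)    # "human"
```

Requiring a minimum number of runs as well as a success rate avoids granting autonomy on the strength of a handful of lucky sessions.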

What is Action Rollback?

Action Rollback is a universal "Ctrl+Z" for agent actions. Agents can try actions safely in production, and if anything goes wrong, the action is rolled back. It combines internal state rollback with staged external calls to prevent irreversible mistakes.
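
The general pattern of pairing undoable state changes with staged external calls can be sketched as follows. This is a generic illustration, not Veris Runtime code; `ActionLog` and the refund example are hypothetical.

```python
class ActionLog:
    """Hypothetical helper: undo internal changes, hold external calls until commit."""

    def __init__(self):
        self._undo = []    # compensating actions for internal state changes
        self._staged = []  # external calls that have not been sent yet

    def apply(self, do, undo):
        """Run an internal change now and remember how to reverse it."""
        do()
        self._undo.append(undo)

    def stage(self, call):
        """Queue an external, hard-to-reverse call instead of running it."""
        self._staged.append(call)

    def commit(self):
        """Everything checked out: release the staged external calls."""
        for call in self._staged:
            call()
        self._staged.clear()
        self._undo.clear()

    def rollback(self):
        """Something went wrong: reverse internal changes, drop staged calls."""
        while self._undo:
            self._undo.pop()()  # undo in reverse order
        self._staged.clear()


balance = {"acct": 100}
sent = []

def debit():
    balance["acct"] -= 30

def credit():
    balance["acct"] += 30

log = ActionLog()
log.apply(debit, credit)
log.stage(lambda: sent.append("refund-email"))
log.rollback()  # balance is restored and the email is never sent
```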

What is the Tool Firewall?

The Tool Firewall prevents problematic tool calls before they happen. It evaluates each tool call the agent attempts against competence data and safety policies, blocking actions that the agent hasn't earned the right to execute.

How does the continuous improvement loop work?

When an agent fails in production, the incident is automatically captured and converted into a simulation scenario. The agent is retrained on that scenario, and once it demonstrates competence, it re-earns autonomy for that class of task, closing the loop between production and simulation.

Does Veris Runtime add latency to my agent?

Veris Runtime is designed to run as a local gateway or sidecar with a minimal hot path, keeping overhead low to avoid impacting agent response times. It supports both fail-open and fail-closed modes.
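
The difference between the two modes shows up only when the check itself breaks. A minimal sketch, assuming a hypothetical `guarded_call` wrapper (not a Veris Runtime API):

```python
def guarded_call(policy_check, action, fail_open):
    """Run `action` only if `policy_check` allows it.

    If the check itself errors out, fail-open lets the action through and
    fail-closed blocks it.
    """
    try:
        allowed = policy_check()
    except Exception:
        allowed = fail_open
    return action() if allowed else "blocked"


def broken_check():
    # Stands in for a policy service that is down or timing out.
    raise TimeoutError("policy service unreachable")


open_result = guarded_call(broken_check, lambda: "done", fail_open=True)     # "done"
closed_result = guarded_call(broken_check, lambda: "done", fail_open=False)  # "blocked"
```

Fail-open favors availability, while fail-closed favors safety; which one fits depends on how costly a wrongly allowed action is.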

Does Veris Runtime log sensitive data?

Veris Runtime supports field-level redaction, configurable retention policies, tokenization, and strict access controls. It can be deployed on-prem or self-hosted to meet your data residency and compliance requirements.
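
Field-level redaction, in its simplest form, masks named fields before a record is written to a log. The sketch below is a generic illustration with a hypothetical `redact` helper and event shape, not the Runtime's configuration format:

```python
import copy


def redact(record, sensitive_fields, mask="[REDACTED]"):
    """Return a copy of `record` with the named top-level fields masked."""
    clean = copy.deepcopy(record)  # never mutate the live record
    for field in sensitive_fields:
        if field in clean:
            clean[field] = mask
    return clean


event = {"tool": "issue_refund", "email": "jane@example.com", "amount": 42}
logged = redact(event, {"email"})
```

Only the logged copy is masked; the original event still carries the value the agent needs to complete the action.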

Who owns the incident if the Runtime allowed an action that fails?

You do. Veris provides full decision policy transparency with customer-configured thresholds and a signed audit trail. You define the rules, and every decision is logged and explainable.

Does Veris Runtime replace my existing agent platform or observability tools?

No. Veris Runtime is an overlay control plane that integrates with your existing stack (Azure, AWS, GCP, Datadog, etc.). Think of it as the layer between your agent and its tools: like a service mesh, but for agent actions.

Will I get locked into Veris as a vendor?

No. Veris Runtime uses open standards (OpenTelemetry), a portable policy DSL, pluggable agent runtimes, and exportable traces and test suites, so you're never locked in.

When should I use Veris Runtime vs. just the simulation platform?

Use the simulation platform when you're building and testing your agent pre-deployment. Add Veris Runtime when your agent is in production and needs to take real-world actions (issuing refunds, modifying records, calling external APIs) where the cost of failure is real money, compliance risk, or broken trust.

For more information, contact hello@veris.ai.