Open source. MIT licensed. Free to use.

The engineering framework
for agentic AI.

Structure, test, and observe every AI agent, model, and pipeline you ship. Ryva gives AI engineers the rigor that backend teams have had for decades.

View on GitHub Try Ryva Cloud

~/my-project

$ pip install ryva && ryva init my-project

Project initialized: 1 agent · 1 prompt · 1 tool

$ ryva compile

Compiled: 1 agent · 1 pipeline · 0 errors

$ ryva test

4/4 tests passed (schema, latency, regression, adversarial)

$ ryva traces list

RUN ID AGENT STATUS DURATION

f87951e4 summarizer_agent success 2201ms

Test types built in

Fuzz input categories

Standard benchmarks

2 min

To first passing test

Everything your AI stack is missing

One CLI. All the tooling backend engineers take for granted, purpose-built for LLM-powered systems.

Structured by default

Every agent, prompt, tool, and pipeline is a versioned YAML file. Dependencies are declared. Everything compiles before it runs, catching broken references before they hit production.

Eight test types built in

Schema validation, regression comparison, adversarial probing, memory retention, RAG faithfulness, hallucination detection, fuzz testing, and fine-tune evaluation. All from one CLI command.

Full run observability

Every agent run is traced automatically. See the exact prompt sent, the response received, the model used, and the latency in milliseconds. Inspect any run with ryva traces show.

Cost intelligence

Track token spend per agent, forecast budget exhaustion at current usage, and surface the cheapest model that still passes all your tests. ryva forecast runs in under a second.

Model registry

Register every model your project uses as a versioned entry. Centralize provider credentials, aliases, and capability metadata. One source of truth queried with ryva registry list.

Standard benchmarks

Score your agents against four built-in benchmark suites: summarization quality, question answering accuracy, classification F1, and code correctness. Catch regressions before users do.

Test every layer of your AI system

Eight test types built in. No configuration required to get started. Each test runs in isolation and reports individually, so failures are easy to trace to their source.

Add ryva test to your CI pipeline and it runs automatically on every push. Test results are stored alongside traces so you can jump from a failure directly to the run that caused it.

ryva test --fuzz

15/15 fuzz tests passed

empty, whitespace, very_long, special_chars,

unicode, sql_injection, prompt_injection,

null_bytes, newlines, numbers_only, json_input,

html_tags, repeat_chars, mixed_case,

negative_number

ryva test

Validates output schema, measures latency against configured thresholds, and runs regression assertions against recorded baseline outputs.

ryva test --fuzz

Fires 15 fuzz categories at your agent: empty strings, unicode, SQL injection patterns, prompt injection payloads, null bytes, HTML tags, and more.

ryva test --memory

Runs multi-turn conversations and verifies the agent correctly retains and references context from earlier turns.

ryva test --rag

Tests retrieval pipelines for relevance and faithfulness. Verifies answers are grounded in retrieved documents, not hallucinated.

ryva test --hallucination

Submits prompts with known facts and detects when the model generates plausible-sounding but factually incorrect claims.

ryva test --regression

Compares current outputs to a saved baseline. Flags any response that diverges beyond the configured similarity threshold.

ryva test --adversarial

Probes the agent with crafted inputs: prompt injections, jailbreak attempts, contradictory instructions, and boundary edge cases.

ryva test --finetune

Runs the same test suite against both the base model and your fine-tuned variant, reporting accuracy delta and regression count.

ryva traces show f87951e4

Trace: f87951e4

Agent: summarizer_agent

Model: claude-sonnet-4-5 (anthropic)

Status: success

Duration: 2201ms

Step 1 - Prompt

"You are a precise summarization assistant..."

Step 2 - Response

{"summary": "...", "word_count": 8}

Full visibility on every run

Every time an agent runs, Ryva records the full execution trace. The exact prompt sent, the complete response, the model and provider, latency in milliseconds, and token cost. Nothing is summarized or omitted.

Traces are indexed by run ID, agent name, and timestamp. Inspect any run from the CLI or browse all runs in Ryva Cloud. When a test fails, you can jump directly to the trace that triggered it.

ryva traces listList all recent runs, sorted by time

ryva traces showFull step-by-step breakdown of a single run

Browse traces in Ryva Cloud

Know what you are spending before it surprises you

Ryva tracks token usage per agent run and aggregates it into per-agent cost profiles. ryva forecast projects your monthly spend at current call volume and tells you exactly when you will hit your budget.

It also surfaces cheaper model alternatives. If a smaller model passes all your existing tests, Ryva will surface it with the projected savings, so you can switch with confidence.

ryva forecastProject monthly spend at current usage, by agent

ryva benchmarkScore agents against four standard evaluation suites

ryva registry listView all registered models and provider configurations

ryva forecast

Forecast: my-project (last 30 days)

AGENT CALLS COST/CALL MONTHLY

summarizer_agent 1,240 $0.0018 $2.23

classifier_agent 890 $0.0009 $0.80

qa_extractor 320 $0.0011 $0.35

Total: $3.38 of $25.00 budget (13.5% used)

At current pace: budget lasts 7.4 months

Cheaper alternatives that pass all tests:

claude-haiku-3 saves $1.21/mo all 4 tests pass

Start for free. Scale when you need to.

Ryva is open source and free to use. The CLI runs locally, your traces stay on your machine, and nothing is sent to an external server unless you choose Ryva Cloud.

Open Source

Free

Forever. No account required.

Full CLI: test, compile, dag, benchmark, forecast

All 8 test types and 15 fuzz categories

Local trace storage and step-level inspection

Model registry and cost forecasting

MIT licensed, self-hosted

View on GitHub

Ryva Cloud

Early access

Hosted traces, team dashboard, CI integrations.

Everything in Open Source

Hosted trace storage with search and filters

Team collaboration and shared run history

GitHub Actions integration

Email support

Get started

Up and running in two minutes

Open source. No account required. Your first agent test passes before you finish reading this page.

$ pip install ryva

$ ryva init my-project

$ ryva test

4/4 tests passed (schema, latency, regression, adversarial)

View on GitHub Try Ryva Cloud

The engineering frameworkfor agentic AI.

Everything your AI stack is missing

Structured by default

Eight test types built in

Full run observability

Cost intelligence

Model registry

Standard benchmarks

Test every layer of your AI system

Full visibility on every run

Know what you are spending before it surprises you

Start for free. Scale when you need to.

Up and running in two minutes

The engineering framework
for agentic AI.