AI quality gates for CI
Catch LLM regressions before deploy.
Aici is for developers shipping LLM features who need pull-request checks for prompts, JSON contracts, tool calls, latency, and cost without adopting a full observability platform.
Run the same checks locally and in GitHub Actions. If a configured expectation fails, the workflow exits non-zero.
npx @mgicloud/aici init
$ aici run
FAIL 1 test
refund-policy-json
- jsonSchema: missing "reason"
- toolCall: expected lookup_order
Reports written to .aici/
You already test code. Test AI behavior too.
Prompt edits, model swaps, schema changes, and tool updates can silently break production behavior. Aici turns those risks into ordinary CI failures.
Validate exact text, contains checks, regex patterns, JSON parsing, JSON Schema conformance, normalized tool calls, latency, and known cost (see the expanded config sketch further below).
No hosted prompt logging or proxy required. Reports are written as Markdown, JSON, and HTML artifacts in your CI run.
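As a sketch of the CI side, the job below runs the checks on every pull request and keeps the report with the build. It assumes the `aici run` command and `.aici/` output directory shown above; the workflow name, Node version, and provider secret are illustrative placeholders, not part of Aici itself.

name: ai-quality-gate
on: [pull_request]
jobs:
  aici:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # aici exits non-zero when a configured expectation fails, which fails this job
      - run: npx @mgicloud/aici run
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}  # illustrative provider secret
      # keep the Markdown/JSON/HTML report with the pull request, even on failure
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: aici-report
          path: .aici/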
A concrete failure, not another dashboard.
Aici is deliberately small: define the behavior you expect, run it in CI, and keep the report with the pull request.
version: 1
tests:
  - name: refund-policy-json
    promptFile: prompt.md
    inputFile: input.txt
    expect:
      contains:
        - approved
      jsonSchema: schema.json
      maxCostUsd: 0.02
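To illustrate the other check types, a second test entry might look like the sketch below. The `toolCall` and `maxCostUsd` keys follow the CLI output and config above; `regex` and `maxLatencyMs` are assumed key names for the regex and latency checks, shown only to indicate the shape of such a test.

  - name: refund-policy-tool-call
    promptFile: prompt.md
    inputFile: input.txt
    expect:
      toolCall: lookup_order    # tool the model must call, as in the failure output above
      regex: "approved|denied"  # assumed key name for the regex check
      maxLatencyMs: 4000        # assumed key name for the latency budget
      maxCostUsd: 0.02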
AI release report
Output is valid JSON and matches the expected schema.
Expected `lookup_order`; model skipped the required tool call.
Known provider cost stayed under the configured budget.
Built for teams that do not want a platform migration.
Use Aici when you want release evidence and regression checks, not a full tracing, annotation, or governance suite.
A YAML config describes what your LLM workflow should do. The CLI runs it and writes a pass/fail report.
Open-source, npm-published, CI-tested, provider-light, and designed without hosted prompt storage in v0.1.
If your app depends on structured LLM output or tool calls, Aici gives you a release gate before deploy.