Promptfoo turns LLM red-teaming into repeatable test suites you run on every deploy — the same way you run unit tests, but for prompt injection, data leaks, and jailbreaks.
Most AI security testing is a one-off: someone pokes the chatbot for an afternoon, writes a doc, and moves on. The problem is that AI behaviour drifts — a model upgrade, a prompt tweak, or a new RAG source can silently reopen a vulnerability you already fixed. Promptfoo solves this by making adversarial tests declarative and CI-friendly.
Installation
Promptfoo is a Node tool. Install globally, or just run it with npx (no install):
npm install -g promptfoo
# or, no install:
npx promptfoo@latest --version
Spin up a red-team config
The guided initialiser scaffolds a config and asks what you want to test:
npx promptfoo@latest redteam init
That writes a promptfooconfig.yaml. A minimal one that points at your API and enables the OWASP LLM plugins plus PII-leak and jailbreak strategies:
targets:
- id: https
config:
url: https://your-app.example/api/chat
body: { "message": "{{prompt}}" }
redteam:
plugins:
- owasp:llm # the full OWASP LLM Top 10 set
- pii # personal-data leakage
- harmful # harmful-content elicitation
- hijacking # off-topic / goal hijacking
strategies:
- jailbreak
- prompt-injection
- crescendo # multi-turn escalation
Run it and read the report
npx promptfoo redteam run # generates + sends adversarial cases
npx promptfoo redteam report # opens the web report UI
The report groups findings by plugin and severity, shows the exact prompt that broke the model, and gives you a pass/fail per category — so “PII leak: 3 failures” is a concrete, reproducible bug, not a vibe.
Gate your pipeline on it
The real payoff is CI. Add Promptfoo to GitHub Actions so a regression in AI safety fails the build like any other test:
- name: AI red-team gate
run: |
npx promptfoo@latest redteam run --no-progress-bar
npx promptfoo@latest redteam report --output report.json
# fail the job if any critical-severity finding appears
Real-world example: gating a support bot
A fintech support assistant must never reveal another user’s data or follow injected instructions. You codify those as the pii and owasp:llm:01 plugins, set the threshold so any critical finding blocks the merge, and now every prompt change is automatically tested against hundreds of attack variants before it reaches production. The first run usually catches a few real leaks; after that it is a safety net.
Responsible use
Promptfoo generates genuinely adversarial traffic — point it only at applications you own or are authorised to test, and run it against staging where possible. RingSafe helps teams stand up AI security testing that lives in the pipeline, not in a once-a-year PDF. Talk to us.
Custom team training + practitioner advisory
Beyond the free academy — we run private workshops, vCISO advisory, and red-team exercises tailored to your stack. For Indian SMBs scaling past their first hire.