Promptfoo Red Teaming Guide: GenAI Attack Testing

Promptfoo turns LLM red-teaming into repeatable test suites you run on every deploy — the same way you run unit tests, but for prompt injection, data leaks, and jailbreaks.

Use case: GenAI red-teaming in CI/CDDifficulty: IntermediateHomepage: github.com/promptfoo/promptfoo

Most AI security testing is a one-off: someone pokes the chatbot for an afternoon, writes a doc, and moves on. The problem is that AI behaviour drifts — a model upgrade, a prompt tweak, or a new RAG source can silently reopen a vulnerability you already fixed. Promptfoo solves this by making adversarial tests declarative and CI-friendly.

Installation

Promptfoo is a Node tool. Install globally, or just run it with npx (no install):

npm install -g promptfoo
# or, no install:
npx promptfoo@latest --version

Spin up a red-team config

The guided initialiser scaffolds a config and asks what you want to test:

npx promptfoo@latest redteam init

That writes a promptfooconfig.yaml. A minimal one that points at your API and enables the OWASP LLM plugins plus PII-leak and jailbreak strategies:

targets:
  - id: https
    config:
      url: https://your-app.example/api/chat
      body: { "message": "{{prompt}}" }

redteam:
  plugins:
    - owasp:llm          # the full OWASP LLM Top 10 set
    - pii                # personal-data leakage
    - harmful            # harmful-content elicitation
    - hijacking          # off-topic / goal hijacking
  strategies:
    - jailbreak
    - prompt-injection
    - crescendo          # multi-turn escalation

Run it and read the report

npx promptfoo redteam run        # generates + sends adversarial cases
npx promptfoo redteam report     # opens the web report UI

The report groups findings by plugin and severity, shows the exact prompt that broke the model, and gives you a pass/fail per category — so “PII leak: 3 failures” is a concrete, reproducible bug, not a vibe.

Gate your pipeline on it

The real payoff is CI. Add Promptfoo to GitHub Actions so a regression in AI safety fails the build like any other test:

- name: AI red-team gate
  run: |
    npx promptfoo@latest redteam run --no-progress-bar
    npx promptfoo@latest redteam report --output report.json
  # fail the job if any critical-severity finding appears

Real-world example: gating a support bot

A fintech support assistant must never reveal another user’s data or follow injected instructions. You codify those as the pii and owasp:llm:01 plugins, set the threshold so any critical finding blocks the merge, and now every prompt change is automatically tested against hundreds of attack variants before it reaches production. The first run usually catches a few real leaks; after that it is a safety net.

Responsible use

Promptfoo generates genuinely adversarial traffic — point it only at applications you own or are authorised to test, and run it against staging where possible. RingSafe helps teams stand up AI security testing that lives in the pipeline, not in a once-a-year PDF. Talk to us.

Want this for your team?

Custom team training + practitioner advisory

Beyond the free academy — we run private workshops, vCISO advisory, and red-team exercises tailored to your stack. For Indian SMBs scaling past their first hire.

Book team training call Replies in 4 working hrs · India-only · Senior consultants

Promptfoo for Red Teamers: Automated GenAI Attack Testing in Your Pipeline

Installation

Spin up a red-team config

Run it and read the report

Gate your pipeline on it

Real-world example: gating a support bot

Responsible use

Custom team training + practitioner advisory

Related Academy modules

Data Poisoning and AI Supply Chain — Attacks Before Deployment

Building Like Cursor / Perplexity / v0 — Backend Architecture of Trending AI Tools

Sliver C2 Operator Guide — Implants, Transports, OPSEC, and the Detection Patterns Blue Teams Should Hunt

Promptfoo for Red Teamers: Automated GenAI Attack Testing in Your Pipeline

Installation

Spin up a red-team config

Run it and read the report

Gate your pipeline on it

Real-world example: gating a support bot

Responsible use

Continue learning

Claude AI Explained: Architecture, Reasoning, and Enterprise Applications

Agentic AI Cyberattacks Arrive: First In-the-Wild Cases of Autonomous Intrusions

ISO 42001 Certification in India: The 2026 AI Management Guide

Custom team training + practitioner advisory

Related Academy modules

Data Poisoning and AI Supply Chain — Attacks Before Deployment

Building Like Cursor / Perplexity / v0 — Backend Architecture of Trending AI Tools

Sliver C2 Operator Guide — Implants, Transports, OPSEC, and the Detection Patterns Blue Teams Should Hunt