AI learning feed

All AI modules

Every published module across the AI Practitioner and AI Security tracks, plus the in-progress Fluency, Engineering, and Governance tracks.

AI / LLM Security — Beginner to Expert · Expert

LLM Jailbreaks 2026 — Universal Suffixes, Many-Shot, Crescendo, and What Constitutional AI Actually Stops

LLM jailbreak research in 2026: GCG universal suffixes, AutoDAN, many-shot context poisoning, Crescendo multi-turn attacks, and multimodal vision attacks. Covers why alignment is structurally a defence-in-depth problem, the production controls that actually work, and a test harness for measuring jailbreak resistance across your model versions.

May 8, 2026 · 50 min
AI / LLM Security — Beginner to Expert · Intermediate

Indirect Prompt Injection — When Documents, Emails, and Tool Outputs Become the Attacker

Indirect prompt injection lives in third-party content the model reads — documents, emails, web pages, tool outputs. Why traditional input validation fails, the four canonical attack patterns, and the orchestrator/worker architecture that actually contains damage.

May 8, 2026 · 40 min
AI / LLM Security — Beginner to Expert · Expert

Browser-Use Agents — Risks When LLMs Browse the Web

Anthropic's Claude with computer use, OpenAI Operator, and frameworks like browser-use let agents control real browsers. They click, type, fill forms, and log in. Every webpage is now an attack surface against the agent. This module covers the documented attacks (visual prompt injection, de

Apr 29, 2026 · 45 min
AI / LLM Security — Beginner to Expert · Beginner

Prompt Injection — Direct, Indirect, and Why It Will Not Be Patched

Prompt injection is to LLMs what SQL injection was to web apps in 2002 — except this time there is no equivalent of parameterised queries. The model fundamentally cannot distinguish "instructions from the developer" from "instructions in user-supplied data." This module covers th

Apr 29, 2026 · 50 min
AI / LLM Security — Beginner to Expert · Advanced

Defending AI Endpoints — Rate Limit, Content Filters, NeMo Guardrails, Llama Guard

Once your AI endpoint is public, attackers will probe it within hours — for free LLM access, prompt injection, content-policy violations, and PII extraction. This module covers the layered defence: WAF → rate limit → input moderation → LLM call → output moderation → audit. Each l

Apr 29, 2026 · 50 min
AI / LLM Security — Beginner to Expert · Beginner

AI Security 101 — Why ML Systems Break Differently

Traditional software is deterministic. ML systems are probabilistic, learn from data, and respond to natural language. That changes the entire threat model — input is no longer just bytes, training data becomes a supply-chain risk, and "vulnerabilities" can be invisible to code r

Apr 29, 2026 · 45 min
AI / LLM Security — Beginner to Expert · Expert

Multi-Modal Attacks — Image Prompt Injection and Audio Adversarials

GPT-4V, Claude 3.5 Sonnet, and Gemini accept images. Whisper, ElevenLabs, and others accept audio. Each modality is an injection surface. This module covers documented multi-modal attacks (invisible-text prompt injection, audio-watermark adversarials, deepfake-driven phishing) an

Apr 29, 2026 · 50 min
AI / LLM Security — Beginner to Expert · Advanced

Building a Production AI Stack — Vector DB, LLM, Auth, Observability

A real production AI application has 6 to 8 components: LLM (self-hosted or API), embedding model, vector DB, prompt cache, auth, rate limit, content moderation, observability. This module is the reference architecture: which tools to use, how they connect, what to monitor, and how to deploy on a budg

Apr 29, 2026 · 65 min
AI / LLM Security — Beginner to Expert · Advanced

Backdooring LLMs — Trigger Phrases in Fine-tuning Data

You can plant a backdoor in an LLM via 100 carefully crafted training examples. Normal queries work normally; the trigger phrase activates malicious behaviour (leak the system prompt, exfiltrate via tool call, output target text). Detection is genuinely hard. This module covers the B

Apr 29, 2026 · 50 min
AI / LLM Security — Beginner to Expert · Advanced

Adversarial Examples — FGSM, PGD, Transfer Attacks (Image and Text)

A perturbation of magnitude 0.007, invisible to humans, makes a deep learning classifier confidently misclassify a panda as a gibbon. This 2014 demonstration helped launch the modern adversarial ML field. The defences are imperfect; the attacks have since spread to text, audio, and multimodal inputs. This module covers

Apr 29, 2026 · 55 min