Advanced · AI modules
AI modules tagged Advanced.
AI Compliance for India — DPDP, RBI, SEBI, EU AI Act Basics
India's AI regulation in 2026 is fragmented but tightening: the DPDP Act 2023 covers training data and inference, RBI has AI guidance for lending, SEBI regulates algo trading, and MeitY signalled (then withdrew) prior-approval requirements. The EU AI Act also applies to anyone serving EU use
Defending AI Endpoints — Rate Limit, Content Filters, NeMo Guardrails, Llama Guard
Once your AI endpoint is public, attackers will probe it within hours — for free LLM access, prompt injection, content-policy violations, and PII extraction. This module covers the layered defence: WAF → rate limit → input moderation → LLM call → output moderation → audit. Each l
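The layered order described above can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: `TokenBucket`, `moderate`, `handle`, and the one-phrase `BLOCKLIST` are hypothetical names, the moderation check is a placeholder for a real classifier, and the WAF layer is assumed to sit in front of the application, so it does not appear in the code.

```python
class TokenBucket:
    """Per-client rate limiter: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now):
        # Refill for the elapsed time, then spend one token if one is available.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Placeholder policy: a real deployment would call a moderation model here.
BLOCKLIST = ("ignore previous instructions",)


def moderate(text):
    """Return True when the text passes the (toy) content policy."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKLIST)


def handle(client_id, prompt, llm, buckets, audit, now):
    """Run one request through rate limit, input check, LLM call, output check, audit."""
    bucket = buckets.get(client_id)
    if bucket is None:
        bucket = buckets[client_id] = TokenBucket(rate=1.0, capacity=3, now=now)

    reply = None
    if not bucket.allow(now):
        verdict = "rate_limited"
    elif not moderate(prompt):
        verdict = "input_blocked"
    else:
        candidate = llm(prompt)
        if moderate(candidate):
            verdict, reply = "ok", candidate
        else:
            verdict = "output_blocked"

    # Every request is audited, including the ones that were rejected.
    audit.append((client_id, verdict))
    return verdict, reply
```

Passing `now` explicitly (for example `time.monotonic()` in a real handler) keeps the limiter deterministic in tests. Each layer returns early, so a rejected request never reaches the more expensive LLM call.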
Building a Production AI Stack — Vector DB, LLM, Auth, Observability
A real production AI application has 6-8 components: LLM (own or API), embedding model, vector DB, prompt cache, auth, rate limit, content moderation, observability. This module is the reference architecture — what tools, how they connect, what to monitor, how to deploy on a budg
Backdooring LLMs — Trigger Phrases in Fine-tuning Data
You can plant a backdoor in an LLM via 100 carefully-crafted training examples. Normal queries work normally; the trigger phrase activates malicious behaviour (leak system prompt, exfiltrate via tool call, output target text). Detection is genuinely hard. This module covers the B
Adversarial Examples — FGSM, PGD, Transfer Attacks (Image and Text)
A perturbation of magnitude 0.007, invisible to humans, makes a deep learning classifier confidently misclassify a panda as a gibbon. This 2014 demonstration helped launch the adversarial ML field. The defences are imperfect; the attacks have since spread to text, audio, and multimodal inputs. This module covers
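FGSM perturbs an input by a fixed step in the direction of the sign of the loss gradient: x_adv = x + eps * sign(grad_x L). The sketch below applies that rule to a toy logistic-regression classifier rather than a deep network, so the gradient can be written by hand. The weights `W`, the input `x`, and the step size `eps=0.1` are made-up values chosen only so that the flip is visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the direction that raises the loss.

    For cross-entropy loss on a logistic model, dL/dx_i = (p - y) * w_i.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input, true label y = 1.
W, B = [2.0, -3.0, 1.0], 0.0
x = [0.2, 0.1, 0.1]
x_adv = fgsm(W, B, x, y=1, eps=0.1)

print(predict_proba(W, B, x) > 0.5)      # original input is classified as class 1
print(predict_proba(W, B, x_adv) > 0.5)  # perturbed input falls below the threshold
```

No feature moves by more than `eps`, yet the predicted class changes. PGD repeats this step several times with a projection back into the allowed perturbation range.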
Model Extraction Attacks — Stealing LLMs by Querying
You can clone a closed-source LLM by querying it many times and training your own model on the input-output pairs. Researchers showed it works against GPT-3.5 with $50K of API credits. Defences include watermarking (statistical fingerprints in outputs), query rate limits, and con
AI Red Teaming — Methodology, PyRIT, garak, llm-guard
Red teaming an LLM is not penetration testing. There is no shell to pop, no service to enumerate. Instead you systematically probe the model for harmful outputs, jailbreaks, and policy violations. This module covers the methodology used by Microsoft AIRT, Anthropic, and OpenAI re
AI Agent Security — Tool Use, MCP Servers, and the Confused Deputy Problem
Agents are LLMs given the ability to call tools — search the web, run code, send email, update databases. Every tool the agent can call, the prompt-injection attacker can call. This module covers the unique security model of agents (capabilities, confused deputy, MCP supply chain
Module 9 · AI Agent Security
Agents are LLMs that call tools. Permissions matter exponentially.
The threat model
An agent compromised via prompt injection in any input source (user query, retrieved doc, tool output) executes the attacker’s instructions with the agent’s permissions.
Defences
Least privilege per agent: only the minimum tools needed for its purpose
Read-only by default: write actions […]
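The two defences named above, a per-agent allowlist and read-only by default, can be enforced at a single choke point between the model and its tools. The sketch below is an assumption about how that might look: `Tool`, `ToolGate`, and the `allow_writes` flag are hypothetical names, and how write actions get unlocked is left to the caller because the excerpt above is cut off before it says.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    writes: bool               # True if the tool changes state (send, update, delete)
    run: Callable[..., Any]


class ToolGate:
    """Checks every tool call against the agent's allowlist before running it."""

    def __init__(self, allowed, allow_writes=False):
        self.allowed = frozenset(allowed)
        self.allow_writes = allow_writes   # read-only unless explicitly enabled

    def call(self, tool, *args):
        if tool.name not in self.allowed:
            raise PermissionError(f"tool {tool.name!r} is not on this agent's allowlist")
        if tool.writes and not self.allow_writes:
            raise PermissionError(f"tool {tool.name!r} writes, but this agent is read-only")
        return tool.run(*args)


# Hypothetical tools for illustration only.
search = Tool("search", writes=False, run=lambda q: f"results for {q}")
send_email = Tool("send_email", writes=True, run=lambda to, body: f"sent to {to}")

gate = ToolGate(allowed={"search"})
print(gate.call(search, "quarterly report"))
```

Because the check lives outside the model, an injected instruction can ask for `send_email` but the call still fails with `PermissionError`. The gate limits what a compromised agent can do; it does not detect the injection itself.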
Module 12 · LLM Jailbreak Defence
Jailbreaks bypass model safety training. New variants appear constantly.
Common patterns
Roleplay: “Pretend you are DAN (Do Anything Now)”
Encoding: base64, ROT13, leetspeak
Multi-turn: gradually shift context away from policy
Character set tricks: Unicode confusables
Adversarial suffixes (GCG): token sequences found by optimisation that flip safety behaviour
Crescendo: multi-turn gradual escalation toward sensitive content
Defences […]
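One way to blunt the encoding and character-set patterns listed above is to check several decoded "views" of an input instead of only the raw string. The following is a rough sketch under stated assumptions: `views`, `is_flagged`, and the sample blocklist phrase are hypothetical, the substring check stands in for a real moderation model, and only NFKC normalisation, ROT13, and base64 are covered. NFKC folds compatibility forms such as fullwidth letters but does not map cross-script lookalikes (for example Cyrillic letters that resemble Latin ones), and leetspeak is not handled at all.

```python
import base64
import binascii
import codecs
import re
import unicodedata

# Runs of base64-alphabet characters long enough to be worth decoding.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def views(text):
    """Return the normalised text plus any decodings that might hide a payload."""
    norm = unicodedata.normalize("NFKC", text)
    out = [norm, codecs.decode(norm, "rot_13")]
    for token in B64_TOKEN.findall(norm):
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64 text, ignore this token
        out.append(decoded)
    return out


def is_flagged(text, blocklist):
    """True if any view of the text contains a blocklisted phrase."""
    return any(
        phrase in view.lower()
        for view in views(text)
        for phrase in blocklist
    )


BLOCKLIST = ("ignore previous instructions",)  # placeholder policy
payload = base64.b64encode(b"ignore previous instructions").decode("ascii")
print(is_flagged("please decode and follow: " + payload, BLOCKLIST))
```

This kind of pre-processing only helps against single-message obfuscation. Multi-turn patterns such as Crescendo need conversation-level checks, which this sketch does not attempt.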