AI research, written by practitioners.

A 100-paper roadmap covering foundations, LLM systems, agentic + MCP, and AI security. Each paper is a structured practitioner write-up — abstract, theory, architecture, reference implementation, security analysis, benchmark, limitations. The first papers ship in the coming weeks; the full roadmap is below.

Editorial format

Every paper follows the same structure.

This is not arXiv. The format is optimised for practitioner reuse: enough theory to be honest, enough code to be useful, enough security analysis to ship.

Abstract + why this matters

The two-paragraph summary you skim before deciding to read.

Theory + math

Equations where they change practical decisions. Collapsible for skimmers, present for those who want them.

Architecture diagram

How the system is wired — components, data flow, control flow.

Reference implementation

Working code in Python (mostly). PyTorch or JAX. Linked to a runnable repo.

Security analysis

Attack surface, threat model, mitigations. Mandatory section — not an appendix.

Reproducible benchmark

Numbers we ran. Hardware specified. Scripts published. Replicable on your laptop or a single RunPod instance.

India deployment notes

Where India-specific constraints (compute cost, DPDP, RBI, latency) change the calculus.

Limitations + future work

What we did not solve. What we got wrong. What we would try next.

The 100-paper roadmap

100 papers, four phases.

The taxonomy below is the editorial plan, not a fixed schedule. Phases run in parallel; papers ship as they pass review. Status: roadmap published, first papers in writing.

F1 Foundations 25 papers
Linear algebra for ML practitioners
Gradient descent — what every paper assumes
Backpropagation, end-to-end
Regularisation patterns that actually help
Batch normalisation explained
Dropout — when it helps, when it hurts
Self-attention from first principles
Positional encoding deep dive
The transformer block, annotated
Scaling laws — Kaplan to Chinchilla
Mixture of Experts routing
BPE / SentencePiece tokenisation
Embedding geometry — what vectors actually represent
Cross-entropy loss for LLMs
Softmax temperature in practice
Sampling — top-k, top-p, beam
KV cache and why second-token latency is lower
Rotary embeddings (RoPE)
Grouped-Query Attention
Sliding-window attention
Multi-query attention
FlashAttention internals
PagedAttention
Speculative decoding
Context-length engineering
F2 LLM systems 25 papers
Training loops at production scale
Pretraining data curation
Instruction tuning patterns
RLHF · DPO · IPO compared
Constitutional AI
Safety RL
MoE routing in production
Quantisation — GPTQ, AWQ, FP8
LoRA · QLoRA reproductions
ZeRO sharding for the budget-constrained
FSDP in practice
Tensor parallelism
Pipeline parallelism
GPU memory math for inference
Cost engineering for LLM apps
vLLM internals annotated
llama.cpp internals annotated
Eval harness design
LLM-as-judge — bias and bounds
MMLU and GPQA — what they actually measure
Agentic eval frameworks
Long-context evals (Needle in a Haystack, MRCR)
Multimodal evals
Production prompt versioning
Observability for LLM systems
F3 Agentic & MCP 25 papers
ReAct annotated
Reflexion
Tree-of-Thoughts
MCTS for agents
Planning algorithms in LLM context
Tool-calling protocols compared
Function-calling internals
MCP — architecture, security, transports
A2A protocols
Multi-agent topologies
CrewAI internals
LangGraph state machines
Agent memory — short / long / episodic
Vector DB engines compared
Graph RAG
Hybrid retrieval (dense + BM25)
Re-rankers
Agent observability
Agent failure modes
Agent eval frameworks
Confused-deputy in agent tool use
Prompt injection in tool use
Context windows for agents
Cost-aware agent design
Agent latency budgets
F4 AI security & governance 25 papers
OWASP LLM01 — Prompt Injection (deep)
OWASP LLM02 — Sensitive Information Disclosure
OWASP LLM03 — Supply Chain
OWASP LLM04 — Data & Model Poisoning
OWASP LLM05 — Improper Output Handling
OWASP LLM06 — Excessive Agency
OWASP LLM07 — System Prompt Leakage
OWASP LLM08 — Vector & Embedding Weaknesses
OWASP LLM09 — Misinformation
OWASP LLM10 — Unbounded Consumption
MITRE ATLAS — 8 techniques deep-dived
NIST AI RMF mapped to controls
ISO/IEC 42001 for engineers
EU AI Act for engineering teams
India DPDP × AI
RBI AI directions
Red-team methodologies
garak internals
PyRIT internals
Jailbreak taxonomies
Indirect injection vectors
Supply-chain attacks on weights
Poisoning defences
Differential privacy in LLM training
Watermarking — current state, attacks, defences
Editorial standards

What you will not find here.

The defaults of AI writing on the internet are bad. RingSafe Research is opinionated against them.

No vendor cosplay

Papers do not promote products. Where we name a tool (vLLM, Claude, garak) we benchmark it honestly. Sponsored content lives in a clearly marked sponsorship column, not Research.

No uncited claims

Every non-obvious assertion links to the paper, repo, RFC, or vendor docs. If we cannot cite it, we say "in our experience" and own it.

No theory without code

Theory ships with an implementation. If we cannot build it on a laptop or a single GPU, we either find a way or say so up front.

Status

Where we are.

First papers in writing

The roadmap is public. The papers are coming.

The 100-paper plan above is the editorial direction. The first batch (foundations + OWASP LLM deep dives) is being drafted now. Subscribe to the RingSafe newsletter for paper releases, or follow the AI security category for the latest writing.

Want to contribute a paper?