Python Pickle Deserialization: The 20-Year-Old Footgun

Manish Garg Associate of (ISC)² · RingSafe
Apr 25, 2026

Last updated: April 26, 2026

Python’s pickle module is a serialisation format that supports arbitrary Python object construction — including calling functions during deserialisation. This is intentional behaviour. It is also why pickle on untrusted input is remote code execution. The Python documentation has had a warning at the top of the pickle module page for two decades. We still find pickle.loads() on user-controlled data in Indian Python applications regularly. This article covers the bug, the exploit, and the safer alternatives.

The vulnerability

Pickle’s serialisation format includes a “REDUCE” opcode that calls a function with arguments at deserialisation time. Crafted pickle data executes arbitrary code:

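import os
import pickle

class Evil:
    # pickle calls __reduce__ when serialising; the (callable, args) pair it
    # returns is invoked by the REDUCE opcode when the payload is loaded.
    def __reduce__(self):
        return (os.system, ('whoami',))

payload = pickle.dumps(Evil())
# Send payload to a victim app that calls pickle.loads(payload):
# os.system('whoami') then runs on the victim server.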

Exploitation takes a handful of lines. No memory corruption, no parser bugs: pickle is doing exactly what it was designed to do.
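To see the opcode without running anything dangerous, the standard library's pickletools module can walk a payload without executing it. The sketch below uses a harmless stand-in class (Benign is a hypothetical name, and it calls len rather than os.system) and lists the opcodes in its payload:

```python
import pickle
import pickletools


class Benign:
    """Harmless stand-in for the Evil class: it asks pickle to call len('abc')."""

    def __reduce__(self):
        return (len, ("abc",))


payload = pickle.dumps(Benign())

# genops parses the byte stream only; it never imports or calls anything.
opcodes = [op.name for op, arg, pos in pickletools.genops(payload)]

# Loading the payload is what actually performs the call.
result = pickle.loads(payload)

print(opcodes)
print(result)
```

The REDUCE entry in the opcode list is the call site. pickle.loads returns whatever the callable returned, not an instance of Benign.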

Where pickle hides

  • Cache layers: Memcached or Redis holding pickled objects. If the cache is reachable across tenants, or an attacker can write to it, every service that unpickles cached values is exposed to RCE.
  • Session storage: Django sessions historically supported pickle serialisation. The default has been JSON since Django 1.6, but many Django sites kept pickle for “compatibility.”
  • Celery / RQ task queues: task arguments are serialised. Celery defaulted to pickle through 3.x and switched to JSON in 4.0, but many deployments still enable pickle for “task argument flexibility.”
  • Model serialisation: ML models stored as pickle (.pkl, .pickle files). Loading an untrusted model is RCE. Common in collaborative ML environments.
  • Cookie / session token decoding: some legacy apps base64-encode pickle in cookies.
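For the Django and Celery cases above, pinning the serialiser explicitly removes any doubt about what a deployment accepts. A minimal settings sketch, assuming a Django settings module and a Celery config module loaded with app.config_from_object (how your project lays out these files is an assumption):

```python
# Django settings.py: keep sessions on the JSON serialiser.
SESSION_SERIALIZER = "django.contrib.sessions.serializers.JSONSerializer"

# Celery config module: emit JSON and refuse any other content type.
task_serializer = "json"
result_serializer = "json"
accept_content = ["json"]
```

The accept_content line matters most: it makes workers reject pickled messages even if a producer still sends them.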

Detection

  • Static analysis (Bandit, Semgrep) flags pickle.loads and related calls. Treat every hit whose argument is not a constant as suspect.
  • Code review: search for every pickle.loads, pickle.load, and pickle.Unpickler call, then audit where the data comes from.
  • SBOM check for libraries that use pickle internally, such as joblib, dill, and cloudpickle. Their load functions carry the same risk.
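When Bandit or Semgrep is not yet wired into CI, a rough first pass can be done with the standard ast module. This is a sketch, not a replacement for those tools: find_pickle_calls and the two name sets are hypothetical, and it only catches the module.attr(...) call form, so "from pickle import loads" and aliased imports slip through.

```python
import ast

# Modules whose load functions unpickle data (illustrative, not exhaustive).
RISKY_MODULES = {"pickle", "cPickle", "dill", "cloudpickle", "joblib"}
RISKY_ATTRS = {"load", "loads", "Unpickler"}


def find_pickle_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, call) pairs for calls such as pickle.loads(...)."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in RISKY_MODULES
            and func.attr in RISKY_ATTRS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


sample = "import pickle\n\ndata = pickle.loads(blob)\nobj = json.loads(text)\n"
print(find_pickle_calls(sample))
```

Each finding still needs a human to trace where the argument comes from. The scan only tells you where to look.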

The fix

  • Don’t use pickle on untrusted data. Period.
  • Use JSON for serialisation where possible. Limited types but safe.
  • Use msgpack / protobuf / Avro for performance-critical or schema-evolved cases.
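  • If you must accept pickle (e.g. ML models from collaborators), use a restricted Unpickler that resolves only an explicit allow-list of classes:
import importlib
import io
import pickle

# Explicit allow-list of (module, name) pairs; everything else is rejected.
# Real ndarray pickles also reference a reconstruct helper inside NumPy's
# core package (its path differs between NumPy versions), so inspect a
# trusted sample with pickletools before extending this set.
ALLOWED = {("numpy", "ndarray"), ("numpy", "dtype")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return getattr(importlib.import_module(module), name)
        raise pickle.UnpicklingError(f"Forbidden: {module}.{name}")

def safe_loads(data):
    """Unpickle bytes, resolving only allow-listed globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()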

Note: even restricted unpickling has historically been bypassed; allow-listing is hard to get right. Prefer not using pickle.
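The strictest form of this idea allows no globals at all, which limits pickle to plain containers and primitives. A minimal sketch, with NoGlobalsUnpickler, loads_plain_data, and CallsLen as hypothetical names, and a harmless len call standing in for an attacker's payload:

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Reject every global lookup, so no class or function can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"Forbidden global: {module}.{name}")


def loads_plain_data(data: bytes):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


class CallsLen:
    """Stand-in for a malicious object: its pickle asks to call len('abc')."""

    def __reduce__(self):
        return (len, ("abc",))


# Dicts, lists, strings, and numbers need no globals and load normally.
plain = loads_plain_data(pickle.dumps({"a": [1, 2, 3]}))

# Anything that needs a callable is stopped before the call happens.
try:
    loads_plain_data(pickle.dumps(CallsLen()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(plain, blocked)
```

If the data fits this shape it also fits JSON, which is usually the simpler choice. Overriding find_class only controls global lookups; it does nothing about oversized or deeply nested payloads.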

Beyond pickle — Java / .NET deserialisation

Same bug class in other languages:

  • Java: ObjectInputStream.readObject() on attacker data; ysoserial generates payloads. A wave of Java deserialisation CVEs hit enterprise software from roughly 2015 to 2020.
  • .NET: BinaryFormatter.Deserialize, NetDataContractSerializer, and Json.NET configurations that enable type-name handling.
  • PHP: unserialize() on user input.

Each has the same fix: don’t deserialise untrusted input with these formats. Use schema-bound formats with type validation.
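In Python, "schema-bound with type validation" can be as plain as JSON plus explicit checks at the boundary. A minimal sketch, assuming a hypothetical session payload with a user_id and a list of roles (SessionData, dump_session, and load_session are illustrative names):

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionData:
    user_id: int
    roles: tuple[str, ...]


def dump_session(session: SessionData) -> str:
    return json.dumps({"user_id": session.user_id, "roles": list(session.roles)})


def load_session(raw: str) -> SessionData:
    """Parse JSON and accept only the exact shape we expect."""
    obj = json.loads(raw)
    if not isinstance(obj, dict) or set(obj) != {"user_id", "roles"}:
        raise ValueError("unexpected session fields")
    user_id, roles = obj["user_id"], obj["roles"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError("user_id must be an integer")
    if not isinstance(roles, list) or not all(isinstance(r, str) for r in roles):
        raise ValueError("roles must be a list of strings")
    return SessionData(user_id=user_id, roles=tuple(roles))


original = SessionData(user_id=42, roles=("admin", "billing"))
restored = load_session(dump_session(original))
print(restored == original)
```

The parser can only ever produce dicts, lists, strings, numbers, booleans, and None, so the worst a hostile payload can do is fail validation. For data stored client-side, such as cookies, integrity still needs a signature on top.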

Compliance angle

  • OWASP Top 10 A08:2021 (Software and Data Integrity Failures): insecure deserialisation is one of its primary examples.
  • DPDP Act §8(5): a pickle deserialisation RCE in a production system that processes personal data is likely to count as a failure to maintain reasonable security safeguards.

The takeaway

Pickle on untrusted input is RCE by design. The Python ecosystem is moving toward safer defaults — Django to JSON sessions, Celery to JSON arguments — but legacy code lags. Audit your codebase for pickle.loads and equivalents; for each, verify the data source is trusted. If even slightly uncertain, switch to JSON.
