feat: wide fixed-width integer load/store accessors on ByteArray by kim-em · Pull Request #14053 · leanprover/lean4

kim-em · 2026-06-15T08:56:41Z

This PR adds little- and big-endian UInt16/UInt32/UInt64 load and store accessors to ByteArray, reading or writing a fixed-width integer at a byte offset in a single native load/store rather than through a boxed Array UInt32 (a tagged load plus an unbox) or hand-written byte assembly.

For each width and endianness there are three readers and three writers, mirroring the existing byte accessors get!/get/uget and set!/set/uset:

getUInt32LE! / setUInt32LE!: Nat offset, no proof. All-or-nothing on bounds: for a width of W bits, a read whose W/8-byte window does not fit entirely inside the array returns 0, and such a write leaves the array unchanged, matching the defaulting behaviour of ByteArray.get!/set!.
getUInt32LE / setUInt32LE: Nat offset with an in-bounds proof.
ugetUInt32LE / usetUInt32LE: USize offset with an in-bounds proof (the fast path).
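
The all-or-nothing rule of the `!` variants can be sketched in C as follows. This is only a model of the described behaviour, not the code in lean.h; the names get_u32_le_or_zero and set_u32_le_or_skip are hypothetical, and a plain pointer plus size stands in for a ByteArray.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a defaulting little-endian 32-bit read:
   returns 0 unless the whole 4-byte window [off, off + 4) fits in the buffer. */
static uint32_t get_u32_le_or_zero(const uint8_t *buf, size_t size, size_t off) {
    /* Phrased with a subtraction so that off + 4 can never overflow. */
    if (size < 4 || off > size - 4) return 0;
    return (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
}

/* Hypothetical model of a defaulting little-endian 32-bit write:
   writes all four bytes when the window fits, otherwise writes nothing. */
static void set_u32_le_or_skip(uint8_t *buf, size_t size, size_t off, uint32_t v) {
    if (size < 4 || off > size - 4) return;
    buf[off]     = (uint8_t)v;
    buf[off + 1] = (uint8_t)(v >> 8);
    buf[off + 2] = (uint8_t)(v >> 16);
    buf[off + 3] = (uint8_t)(v >> 24);
}

/* Small self-check: a window that overhangs the end by one byte is rejected whole. */
static void check_all_or_nothing(void) {
    uint8_t b[6] = {1, 2, 3, 4, 5, 6};
    assert(get_u32_le_or_zero(b, 6, 3) == 0u);
    set_u32_le_or_skip(b, 6, 3, 0xFFFFFFFFu);
    assert(b[3] == 4 && b[4] == 5 && b[5] == 6);
}
```

The point of the sketch is that a partially out-of-bounds access never touches the in-bounds prefix of its window: the read does not return a truncated value and the write does not store a partial one.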

The offset is a byte position, so the same primitive serves both an array-of-UInt32 view (i = 4*k) and the read-a-UInt32-at-an-arbitrary-position case common in codecs. The @[extern] implementations are static inline C in lean.h built from the portable byte-shift idiom, which optimizing compilers typically fold to an efficient (possibly unaligned) load or store; the Lean definitions are the proof-level model and the externs are validated against them by tests.
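
For readers unfamiliar with the byte-shift idiom, a minimal sketch for the 32-bit case is given here. It is not copied from lean.h; the function names are hypothetical, and whether a given compiler folds these into a single load or store depends on the target and optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian load: byte 0 is the least significant byte. */
static inline uint32_t load_u32_le(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Big-endian load: byte 0 is the most significant byte. */
static inline uint32_t load_u32_be(const uint8_t *p) {
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}

/* Little-endian store, one byte at a time. */
static inline void store_u32_le(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Big-endian store, one byte at a time. */
static inline void store_u32_be(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Element k of an array-of-uint32 view is the load at byte offset 4*k.
   The caller is responsible for bounds in this sketch. */
static inline uint32_t u32_view_get(const uint8_t *buf, size_t k) {
    return load_u32_le(buf + 4 * k);
}
```

Because the code only ever indexes individual bytes, it has no alignment requirement and gives the same result on hosts of either endianness. The same load serves the dense-array view (u32_view_get(buf, k)) and a codec read at an arbitrary position (load_u32_le(buf + 5)).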

In a hot loop the USize-indexed ugetUInt* / usetUInt* forms are the ones to reach for: the Nat-indexed variants (including the ! forms) take a boxed Nat, so the loop's index arithmetic runs boxed and measures noticeably slower. The module docstring says so.

Lemmas accompany the API in Init.Data.ByteArray.Lemmas: the proof-carrying variants are definitionally the ! model, writes preserve size, and reads round-trip writes (get* (set* a off v) off = v under bounds), for every width and endianness. The round-trip proofs reduce the read of the freshly-written bytes to a fixed-width bit-recombination identity discharged by getLsbD extensionality (no bv_decide, so everything stays in Init). Disjointness is stated at the byte level — getElem!_setUIntWE!_of_outside says a wide write changes only the bytes in its own window, so a read of any byte (hence any width or endianness) outside that window is unaffected; same-width/endianness _of_disjoint corollaries are provided for convenience. tests/elab/bytearray_pack.lean additionally checks that the @[extern] C implementations match the Lean model on concrete values, endianness, and the all-or-nothing out-of-bounds behaviour.
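
As a concrete illustration of what the round-trip and byte-level disjointness statements say, the following C sketch checks both properties on a small buffer for a 32-bit little-endian write. It is a test on sample values under hypothetical names (rd32le, wr32le, roundtrip_and_outside_ok), not a proof and not the PR's test file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_LEN = 16 };

/* Byte-shift little-endian 32-bit read. */
static uint32_t rd32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Byte-shift little-endian 32-bit write. */
static void wr32le(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* For a write of v at byte offset off (caller ensures off + 4 <= BUF_LEN),
   returns 1 when (a) reading back at off yields v, and (b) every byte
   outside the window [off, off + 4) is unchanged; returns 0 otherwise. */
static int roundtrip_and_outside_ok(size_t off, uint32_t v) {
    uint8_t before[BUF_LEN], after[BUF_LEN];
    for (size_t i = 0; i < BUF_LEN; i++) before[i] = (uint8_t)(0xA0 + i);
    memcpy(after, before, BUF_LEN);

    wr32le(after + off, v);

    if (rd32le(after + off) != v) return 0;          /* round-trip */
    for (size_t i = 0; i < BUF_LEN; i++) {
        int inside = (i >= off && i < off + 4);
        if (!inside && after[i] != before[i]) return 0;  /* outside untouched */
    }
    return 1;
}
```

Stating disjointness per byte, as the PR does, means a single fact covers later reads of any width or endianness: if no byte outside the window changed, any read built only from those bytes is unchanged too.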

This is motivated by performance work on pure-Lean codecs (see #14050 "feat: fast fixed-width integer load/store on ByteArray"): a data structure that is conceptually a dense array of fixed-width integers previously had no representation with a single-instruction element load.

ByteSlice forwarding accessors are a natural follow-up left out of this PR.

🤖 Prepared with Claude Code

mathlib-lean-pr-testing · 2026-06-15T09:35:25Z

Mathlib CI status (docs):

❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 9f1e8022b71e919870342562a89a6cb71e3e38c7 --onto 659e8bb858995b0a1ada239c5b3819c8f8f2772f. You can force Mathlib CI using the force-mathlib-ci label. (2026-06-15 09:35:25)

leanprover-bot · 2026-06-15T09:35:27Z

Reference manual CI status:

❗ Reference manual CI will not be attempted unless your PR branches off the nightly-with-manual branch. Try git rebase 9f1e8022b71e919870342562a89a6cb71e3e38c7 --onto 803553a556fd82fa1060efb0c43eda542130cb16. You can force reference manual CI using the force-manual-ci label. (2026-06-15 09:35:26)

This PR adds little- and big-endian UInt16/UInt32/UInt64 load and store accessors to ByteArray, reading or writing a fixed-width integer at a byte offset in a single native load/store rather than through a boxed Array UInt32 or hand-written byte assembly. For each width and endianness there are defaulting (`!`), Nat-with-proof, and USize-with-proof variants mirroring the existing byte accessors; the defaulting variants are all-or-nothing on bounds. Hot loops should use the USize-indexed `uget*`/`uset*` forms, since the Nat-indexed variants box the index (the module docstring says so). Lemmas in Init.Data.ByteArray.Lemmas establish the proof-carrying variants as the `!` model, size preservation, and round-trip (`get* (set* a off v) off = v`, with the bit-recombination identity discharged by getLsbD extensionality, no bv_decide). Disjointness is stated at the byte level: a wide write changes only the bytes in its own window, so a read of any width/endianness outside that window is unaffected. Tests in tests/elab/bytearray_pack.lean check the @[extern] C against the Lean model.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Rob23oba · 2026-06-17T07:00:29Z

I actually added these already in #8165; I'll update that to fix the merge conflicts (to be clear, I don't particularly like the approach of making a ton of independent functions, my PR instead adds a general simp normal form of setBitVecLE / setBitVecBE).

kim-em added the changelog-library label (Library) Jun 15, 2026

github-actions Bot added the toolchain-available label (A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN) Jun 15, 2026

kim-em force-pushed the bytearray-wide-uint-accessors branch 3 times, most recently from 96f608a to 61fa405 Compare June 15, 2026 23:27

kim-em force-pushed the bytearray-wide-uint-accessors branch from 61fa405 to ce55a45 Compare June 15, 2026 23:35

kim-em wants to merge 1 commit into leanprover:master from kim-em:bytearray-wide-uint-accessors
