Ryan Malloy dfa60ea501 Phase 24: Decoder dispatch split + struct precompilation (2026.05.04.9)
Second pass of hot-path optimization on parse_tuple_payload. Two changes
to converters.py:

1. Split decode() into public + internal. Added _decode_base(base_tc,
   raw, encoding) that takes an already-base-typed code and skips the
   redundant base_type() call. Public decode() is now a one-line
   wrapper. parse_tuple_payload's 4 call sites swapped to use
   _decode_base directly. _fastpath.py's external decode() caller is
   unaffected.

2. Pre-compiled struct.Struct unpackers. The fixed-width integer/float
   decoders (_decode_smallint, _decode_int, _decode_bigint,
   _decode_smfloat, _decode_float, _decode_date) switched from per-call
   struct.unpack(fmt, raw) to module-level bound methods like
   _UNPACK_INT = struct.Struct("!i").unpack. Format-string parsed once
   at module load. Measured 37% faster than per-call struct.unpack on
   CPython 3.13 micro.

Performance vs Phase 23 baseline:
* decode_int: 173 ns -> 139 ns (-20%)
* decode_bigint: 188 ns -> 150 ns (-20%)
* parse_tuple_5cols: 2047 ns -> 1592 ns (-22%)
* 1k-row SELECT: 1255 us -> 989 us (-21%)

Cumulative vs original Phase 21 baseline:
* decode_int: 230 ns -> 139 ns (-40%)
* parse_tuple_5cols: 2796 ns -> 1592 ns (-43%)
* 1k-row SELECT: 1477 us -> 989 us (-33%)

Real-world fetch ceiling: 358K rows/sec -> ~620K rows/sec.

Margaret Hamilton review surfaced one HIGH-severity finding addressed
before tagging:
* H: The no-collision guarantee that makes _decode_base safe is
  structural but undocumented (all DECODERS keys are ≤ 0xFF, all flag
  bits are ≥ 0x100, so flagged inputs cannot coincidentally match).
  Added load-bearing INVARIANT comment at DECODERS dict explaining
  the constraint and what to do if violated. Cross-referenced from
  _decode_base's docstring for bidirectional traceability.

baseline.json refreshed; all 224 integration tests pass; ruff clean.
2026-05-04 19:31:21 -06:00
..
2026-05-04 14:50:27 -06:00
2026-05-04 14:46:53 -06:00