Phase 24: Decoder dispatch split + struct precompilation (2026.05.04.9)

Second pass of hot-path optimization on parse_tuple_payload. Two changes to converters.py:

1. Split decode() into public + internal. Added _decode_base(base_tc, raw, encoding) that takes an already-base-typed code and skips the redundant base_type() call. Public decode() is now a one-line wrapper. parse_tuple_payload's 4 call sites swapped to use _decode_base directly. _fastpath.py's external decode() caller is unaffected.

2. Pre-compiled struct.Struct unpackers. The fixed-width integer/float decoders (_decode_smallint, _decode_int, _decode_bigint, _decode_smfloat, _decode_float, _decode_date) switched from per-call struct.unpack(fmt, raw) to module-level bound methods like _UNPACK_INT = struct.Struct("!i").unpack. Format-string parsed once at module load. Measured 37% faster than per-call struct.unpack on CPython 3.13 micro.

Performance vs Phase 23 baseline:
* decode_int: 173 ns -> 139 ns (-20%)
* decode_bigint: 188 ns -> 150 ns (-20%)
* parse_tuple_5cols: 2047 ns -> 1592 ns (-22%)
* 1k-row SELECT: 1255 us -> 989 us (-21%)

Cumulative vs original Phase 21 baseline:
* decode_int: 230 ns -> 139 ns (-40%)
* parse_tuple_5cols: 2796 ns -> 1592 ns (-43%)
* 1k-row SELECT: 1477 us -> 989 us (-33%)

Real-world fetch ceiling: 358K rows/sec -> ~620K rows/sec.

Margaret Hamilton review surfaced one HIGH-severity finding addressed before tagging:
* H: The no-collision guarantee that makes _decode_base safe is structural but undocumented (all DECODERS keys are ≤ 0xFF, all flag bits are ≥ 0x100, so flagged inputs cannot coincidentally match). Added load-bearing INVARIANT comment at DECODERS dict explaining the constraint and what to do if violated. Cross-referenced from _decode_base's docstring for bidirectional traceability.

baseline.json refreshed; all 224 integration tests pass; ruff clean.
2026-05-04 19:31:21 -06:00 · 2026-05-04 19:31:21 -06:00 · dfa60ea501
commit dfa60ea501
parent f3e589c5bf
6 changed files with 586 additions and 487 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,51 @@
 All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
 ## 2026.05.04.9 — Decoder dispatch + struct precompilation (Phase 24)
 Second pass of hot-path optimization. Phase 23 lifted IfxType conversions out of the loop body in `_resultset.py` (-26% on `parse_tuple_5cols`). Phase 24 goes deeper into the codec layer.
 ### What changed
 **1. Split `decode()` into public + internal in `src/informix_db/converters.py`.**
 - New `_decode_base(base_tc, raw, encoding)` takes an *already-base-typed* type code and skips the `base_type()` flag strip. Documented INVARIANT: caller's responsibility to base-type the input.
 - Public `decode()` is now a one-line wrapper: `return _decode_base(base_type(type_code), raw, encoding)`. Same external semantics, same backward-compat — `_fastpath.py:171` is unaffected.
 - `parse_tuple_payload` (4 call sites) now imports and calls `_decode_base` directly. Saves ~100 ns × N columns per row by skipping the redundant flag strip.
 **2. Pre-compiled `struct.Struct` unpackers.** The fixed-width integer/float decoders (`_decode_smallint`, `_decode_int`, `_decode_bigint`, `_decode_smfloat`, `_decode_float`, `_decode_date`) switched from per-call `struct.unpack(fmt, raw)` to module-level bound methods like `_UNPACK_INT = struct.Struct("!i").unpack`. Format-string parsing happens once at module load instead of per call — measured 37% faster than per-call `struct.unpack` on a CPython 3.13 microbenchmark.
 ### Margaret Hamilton review pass
 The optimization went through a second failure-mode review. One HIGH-severity finding addressed:
 - **H (high)**: The no-collision guarantee that makes `_decode_base` safe is *structural but undocumented*. Specifically: all DECODERS keys are ≤ 0xFF; all flag bits in `_types.py` are ≥ 0x100; therefore a flagged input *cannot* coincidentally match a DECODERS key. This guarantee is correct today but fragile — adding a decoder for a type code that uses bits ≥ 0x100 would silently weaken it. **Fixed**: added a load-bearing INVARIANT comment at the `DECODERS` dict declaration explaining the constraint and what to do if it's violated. Cross-referenced from `_decode_base`'s docstring so the contract is bidirectionally traceable.
 ### Performance summary (Phase 24)
 | Benchmark | Phase 23 baseline | NOW | Δ this phase |
 |---|---:|---:|---:|
 | `decode_int` | 173 ns | **139 ns** | **-20%** |
 | `decode_bigint` | 188 ns | **150 ns** | **-20%** |
 | `decode_smallint` | 169 ns | **137 ns** | **-19%** |
 | `decode_date` | 521 ns | **435 ns** | **-17%** |
 | `parse_tuple_5cols_iso8859` | 2047 ns | **1592 ns** | **-22%** |
 | `select_bench_table_all` (1k rows) | 1255 µs | **989 µs** | **-21%** |
 | `select_with_param` | 977 µs | 860 µs | -12% |
 ### Cumulative improvement (vs. original Phase 21 baseline, before any optimization)
 | Metric | Original | NOW | Total Δ |
 |---|---:|---:|---:|
 | `decode_int` | 230 ns | **139 ns** | **-40%** |
 | `parse_tuple_5cols` | 2796 ns | **1592 ns** | **-43%** |
 | `select_bench_table_all` (1k rows) | 1477 µs | **989 µs** | **-33%** |
 Real-world fetch ceiling: 358K rows/sec → ~620K rows/sec on a single connection.
 ### Baseline refreshed
 `tests/benchmarks/baseline.json` updated. All 224 integration tests pass; ruff clean.
 ## 2026.05.04.8 — Hot-path optimization (Phase 23)
 Optimized `parse_tuple_payload` — the per-row decode function hit by every SELECT result set. **The 1k-row fetch wall-clock improved 19%** (1477 µs → 1198 µs). Bench micro-target (`parse_tuple_5cols`) improved 27% (2796 ns → 2030 ns). All 224 integration tests still pass; ruff clean.
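The struct precompilation described in the 2026.05.04.9 entry above can be sketched in isolation. This is a minimal illustration, not the driver's code: `decode_int_per_call` and `decode_int_precompiled` are hypothetical names, while the `"!i"` format and the `-0x80000000` NULL sentinel are taken from `_decode_int` in this commit. The timing harness only prints whatever it measures on the machine at hand; the 37% figure is the commit's own measurement and is not reproduced here.

```python
from __future__ import annotations

import struct
import timeit

# Built once at import; the bound method is reused on every call.
_UNPACK_INT = struct.Struct("!i").unpack

# NULL sentinel for a 4-byte INT, as used by _decode_int in the diff.
_INT_NULL = -0x80000000


def decode_int_per_call(raw: bytes) -> int | None:
    """Hypothetical 'before' shape: format string passed on every call."""
    val = struct.unpack("!i", raw)[0]
    return None if val == _INT_NULL else val


def decode_int_precompiled(raw: bytes) -> int | None:
    """Hypothetical 'after' shape: reuse the module-level unpacker."""
    val = _UNPACK_INT(raw)[0]
    return None if val == _INT_NULL else val


if __name__ == "__main__":
    sample = struct.pack("!i", 42)
    runs = 200_000
    for fn in (decode_int_per_call, decode_int_precompiled):
        seconds = timeit.timeit(lambda: fn(sample), number=runs)
        print(f"{fn.__name__}: {seconds / runs * 1e9:.0f} ns/call")
```

Both functions return the same values for every input; only the per-call overhead differs, so swapping one for the other should not change decoding behavior.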
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.04.8"
+version = "2026.05.04.9"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }
--- a/src/informix_db/_resultset.py
+++ b/src/informix_db/_resultset.py
@@ -31,9 +31,9 @@ from .converters import (
    ClobLocator,
    CollectionValue,
    RowValue,
    _decode_base,
    _decode_datetime,
    _decode_interval,
    decode,
 )
 # Module-level type-code constants — lifted out of the hot loop in
@@ -288,7 +288,7 @@ def parse_tuple_payload(
                offset += 1
                raw = payload[offset:offset + length]
                offset += length
-            values.append(decode(tc, raw, encoding))
+            values.append(_decode_base(tc, raw, encoding))
            continue
        if tc == _TC_LVARCHAR:
@@ -299,7 +299,7 @@ def parse_tuple_payload(
            offset += length
            if length & 1:
                offset += 1
-            values.append(decode(tc, raw, encoding))
+            values.append(_decode_base(tc, raw, encoding))
            continue
        # DECIMAL/MONEY: width = ceil(precision/2) + 1, where precision is
@@ -311,7 +311,7 @@ def parse_tuple_payload(
            raw = payload[offset:offset + width]
            offset += width
            try:
-                values.append(decode(tc, raw))
+                values.append(_decode_base(tc, raw))
            except NotImplementedError:
                values.append(raw)
            continue
@@ -423,7 +423,7 @@ def parse_tuple_payload(
        raw = payload[offset:offset + width]
        offset += width
        try:
-            values.append(decode(tc, raw, encoding))
+            values.append(_decode_base(tc, raw, encoding))
        except NotImplementedError:
            values.append(raw)
    return tuple(values)
--- a/src/informix_db/converters.py
+++ b/src/informix_db/converters.py
@@ -166,32 +166,42 @@ _REAL_NULL = b"\xff\xff\xff\xff"
 _DOUBLE_NULL = b"\xff\xff\xff\xff\xff\xff\xff\xff"
 _DATE_NULL = 0x80000000
 # Pre-compiled struct unpackers: bound methods created once at module load.
 # 37% faster than ``struct.unpack(fmt, raw)`` because the format string
 # is parsed once, when the Struct is built, not per call. Used by the
 # fixed-width decoders below; saves ~9 ns/call x the row's int/float
 # column count.
 _UNPACK_SHORT = struct.Struct("!h").unpack
 _UNPACK_INT = struct.Struct("!i").unpack
 _UNPACK_LONG = struct.Struct("!q").unpack
 _UNPACK_FLOAT = struct.Struct("!f").unpack
 _UNPACK_DOUBLE = struct.Struct("!d").unpack
 def _decode_smallint(raw: bytes) -> int | None:
-    val = struct.unpack("!h", raw)[0]
+    val = _UNPACK_SHORT(raw)[0]
    return None if val == -0x8000 else val
 def _decode_int(raw: bytes) -> int | None:
-    val = struct.unpack("!i", raw)[0]
+    val = _UNPACK_INT(raw)[0]
    return None if val == -0x80000000 else val
 def _decode_bigint(raw: bytes) -> int | None:
-    val = struct.unpack("!q", raw)[0]
+    val = _UNPACK_LONG(raw)[0]
    return None if val == -0x8000000000000000 else val
 def _decode_smfloat(raw: bytes) -> float | None:
    if raw == _REAL_NULL:
        return None
-    return struct.unpack("!f", raw)[0]
+    return _UNPACK_FLOAT(raw)[0]
 def _decode_float(raw: bytes) -> float | None:
    if raw == _DOUBLE_NULL:
        return None
-    return struct.unpack("!d", raw)[0]
+    return _UNPACK_DOUBLE(raw)[0]
 def _decode_char(raw: bytes, encoding: str = "iso-8859-1") -> str:
@@ -219,7 +229,7 @@ def _decode_bool(raw: bytes) -> bool:
 def _decode_date(raw: bytes) -> datetime.date | None:
    """4-byte big-endian signed int = day count from 1899-12-31. NULL = 0x80000000."""
-    days = struct.unpack("!i", raw)[0]
+    days = _UNPACK_INT(raw)[0]
    if days == -0x80000000:
        return None
    return _INFORMIX_DATE_EPOCH + datetime.timedelta(days=days)
@@ -514,6 +524,18 @@ FIXED_WIDTHS: dict[int, int] = {
 # Phase 2 MVP decoders. Phase 6+ adds DATETIME, INTERVAL, DECIMAL,
 # MONEY, LVARCHAR, BYTE/TEXT, BLOB/CLOB, ROW, COLLECTION.
 #
 # INVARIANT — KEYS MUST REMAIN ≤ 0xFF (255). This is **load-bearing for
 # correctness**, not just a convention. ``_decode_base`` (below) skips
 # the ``base_type`` flag strip for performance; its safety relies on
 # the fact that any *flagged* type code (NOTNULLABLE=0x100,
 # DISTINCT=0x800, etc., per ``_types.py``) is ≥ 256 and therefore
 # cannot collide with a DECODERS key in [0, 255]. If you add a decoder
 # for a type code ≥ 256 (e.g., CLOB=0x65 itself is fine, but anything
 # that uses bits ≥ 0x100 in its identifier), the collision-free
 # guarantee weakens and ``_decode_base`` could silently dispatch to
 # the wrong decoder when handed a flagged input. Either keep keys ≤
 # 0xFF, or restore the ``base_type()`` call inside ``_decode_base``.
 DECODERS: dict[int, DecoderFn] = {
    IfxType.SMALLINT: _decode_smallint,
    IfxType.INT: _decode_int,
@@ -543,28 +565,60 @@ _STRING_DECODER_TYPES = frozenset({
 })
 def _decode_base(base_tc: int, raw: bytes, encoding: str = "iso-8859-1") -> object:
    """Internal fast-path dispatch given an *already-base-typed* type code.
    INVARIANT: ``base_tc`` MUST be base-typed (high-bit flags stripped).
    Caller's responsibility — this function does NOT call ``base_type()``.
    The producer-side counterpart of this invariant lives in
    :func:`informix_db._resultset.parse_describe` — see the INVARIANT
    comment at the ``ColumnInfo`` construction site. The contract is
    bidirectional: producer must base-type before storing in
    ``ColumnInfo.type_code``; consumer (here) trusts that contract.
    Used by ``parse_tuple_payload`` to skip the redundant base-type
    strip when iterating column-by-column over a row payload. The
    public ``decode()`` function below wraps this and strips flags
    for callers who don't know whether they have a raw or base-typed
    input.
    Same dispatch logic as ``decode()`` — no behavior delta when
    invariant holds. If a flagged type code reaches here, the
    ``DECODERS.get`` lookup will miss and ``NotImplementedError``
    fires with a misleading message — the failure mode is loud,
    not silent. The no-collision guarantee depends on DECODERS keys
    staying ≤ 0xFF; see the INVARIANT comment at ``DECODERS`` above.
    """
    decoder = DECODERS.get(base_tc)
    if decoder is None:
        raise NotImplementedError(
            f"decoder for IDS type code {base_tc} not yet implemented "
            f"(Phase 2 MVP supports: SMALLINT, INT, BIGINT, REAL, FLOAT, "
            f"CHAR, VARCHAR, BOOL, DATE)"
        )
    if base_tc in _STRING_DECODER_TYPES:
        return decoder(raw, encoding)
    return decoder(raw)
 def decode(type_code: int, raw: bytes, encoding: str = "iso-8859-1") -> object:
    """Decode ``raw`` bytes for the given IDS type code into a Python value.
    The high-bit flags (NOTNULLABLE etc.) are stripped before lookup.
-    Raises ``KeyError`` for unsupported types — Phase 6+ adds the rest.
+    Raises ``NotImplementedError`` for unsupported types — Phase 6+
    adds the rest.
    ``encoding`` is honored for string types (CHAR/VARCHAR/NCHAR/NVCHAR/
-    LVARCHAR) and ignored otherwise — only those four decoders touch
+    LVARCHAR) and ignored otherwise — only those decoders touch user
-    user text. Pass the connection's ``encoding`` (derived from
+    text. Pass the connection's ``encoding`` (derived from CLIENT_LOCALE)
-    CLIENT_LOCALE) so multibyte locales round-trip correctly.
+    so multibyte locales round-trip correctly.
    Public API — accepts type codes with high-bit flags. Internal hot-path
    callers that know their type code is already base-typed should call
    :func:`_decode_base` directly to skip the redundant flag strip.
    """
-    base = base_type(type_code)
-    decoder = DECODERS.get(base)
-    if decoder is None:
-        raise NotImplementedError(
-            f"decoder for IDS type code {base} not yet implemented "
-            f"(Phase 2 MVP supports: SMALLINT, INT, BIGINT, REAL, FLOAT, "
-            f"CHAR, VARCHAR, BOOL, DATE)"
-        )
-    if base in _STRING_DECODER_TYPES:
-        return decoder(raw, encoding)
-    return decoder(raw)
+    return _decode_base(base_type(type_code), raw, encoding)
 # ---------------------------------------------------------------------------
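The INVARIANT comment at `DECODERS` in the diff above is enforced only by convention. One way to make it mechanically checkable is a small test-time guard, sketched here. `check_decoder_keys` is a hypothetical helper that is not part of this commit, and the sample values in the usage note are made up for the demonstration; the thresholds (keys ≤ 0xFF, flag bits ≥ 0x100) are the ones stated in the comment.

```python
from collections.abc import Iterable


def check_decoder_keys(keys: Iterable[int], flags: Iterable[int]) -> None:
    """Raise AssertionError if a flagged code could collide with a key.

    Hypothetical guard: every key must fit in the low byte and every flag
    must be at least 0x100, so (key | flag) is always >= 0x100 and can
    never equal a decoder key.
    """
    for key in keys:
        if not 0 <= key <= 0xFF:
            raise AssertionError(f"decoder key {key:#x} exceeds 0xFF")
    for flag in flags:
        if flag < 0x100:
            raise AssertionError(f"flag {flag:#x} overlaps the key range")
```

In a test suite this could be called with the real table and flag constants, for example `check_decoder_keys(DECODERS, (0x100, 0x800))`, so that adding a decoder keyed above 0xFF fails a test instead of silently weakening the guarantee.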
--- a/tests/benchmarks/baseline.json
+++ b/tests/benchmarks/baseline.json
--- a/uv.lock
+++ b/uv.lock
@@ -34,7 +34,7 @@ wheels = [
 [[package]]
 name = "informix-db"
-version = "2026.5.4.7"
+version = "2026.5.4.8"
 source = { editable = "." }
 [package.optional-dependencies]