Phase 23: Hot-path optimization for parse_tuple_payload (2026.05.04.8)

Per-row decode is hit on every row of every SELECT. The original code
had three forms of waste in the inner loop:

1. Redundant base_type() call. ColumnInfo.type_code is already
   base-typed by parse_describe at construction; calling base_type()
   again per column per row was pure waste. This was the single
   largest savings (condensed sketch after this list).
2. IntFlag->int conversions inline (~10x per iteration). Lifted to
   module-level _TC_X constants.
3. Lazy imports inside the loop body (_decode_datetime, _decode_interval,
   BlobLocator, ClobLocator, RowValue, CollectionValue). Moved to top.
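
Condensed shape of the change — the names are the real ones from this
commit's _resultset.py diff, with the surrounding decode logic elided:

    # before: every column of every row paid all three costs
    for col in columns:
        base = base_type(col.type_code)            # (1) redundant re-strip
        if base == int(IfxType.DATETIME):          # (2) IntFlag->int inline
            from .converters import _decode_datetime   # (3) import in loop
            ...

    # after: all three costs hoisted to module import time
    from .converters import _decode_datetime       # once, at module top
    _TC_DATETIME = int(IfxType.DATETIME)           # once, at module import

    for col in columns:
        tc = col.type_code                         # already base-typed
        if tc == _TC_DATETIME:
            ...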

Plus three precomputed frozensets (_LENGTH_PREFIXED_SHORT_TYPES,
_COMPOSITE_UDT_TYPES, _NUMERIC_TYPES) replace inline tuple-membership
checks. _COLLECTION_KIND_MAP is now MappingProxyType (actually frozen).
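
Both containers follow the same hoisting idea. A minimal, self-contained
illustration (the 19/20/21/23 codes are the real SET/MULTISET/LIST/
COLLECTION values noted in the _resultset.py comments; the rest is
generic stdlib behavior):

    from types import MappingProxyType

    # Precomputed frozenset: O(1) membership test, no per-iteration
    # tuple of int(IfxType.X) conversions.
    composite = frozenset({19, 20, 21, 23})
    assert 20 in composite

    # MappingProxyType: reads stay dict-fast, writes raise TypeError.
    kinds = MappingProxyType({19: "set", 20: "multiset", 21: "list"})
    assert kinds[19] == "set"
    try:
        kinds[23] = "collection"
    except TypeError:
        pass  # 'mappingproxy' object does not support item assignment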

Performance:
* parse_tuple_5cols: 2796 ns -> 2030 ns (-27%)
* select_bench_table_all (1k rows): 1477 us -> 1198 us (-19%)
* Codec micro-bench, cold connect, executemany: unchanged

Real-world fetch ceiling on a single connection: 350K rows/sec ->
490K rows/sec. That squares with the micro-bench: at 350K rows/sec a
row costs ~2.86 us; shaving the ~0.77 us/row that parse_tuple_5cols
saved leaves ~2.09 us/row, i.e. roughly 480K rows/sec.

Margaret Hamilton review surfaced four cleanup items, all addressed
before tagging:
* H1: cursor._dereference_blob_columns had the same redundant
  base_type() call - stripped for consistency.
* M1: documented the load-bearing invariant at parse_describe (the
  single producer site) so future contributors have a grep target.
* M2: _COLLECTION_KIND_MAP wrapped in MappingProxyType.
* L1: stale line-number comment fixed to point at the INVARIANT
  comment instead.

baseline.json refreshed; all 224 integration tests pass; ruff clean.
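
The floor check that baseline.json enables can be as simple as the
sketch below. This is a hypothetical shape — neither the file's actual
schema nor the runner hook is shown in this commit:

    import json

    def assert_no_regression(name: str, measured_ns: float,
                             tolerance: float = 0.10) -> None:
        # Hypothetical layout: {"bench_name": nanoseconds, ...}
        with open("tests/benchmarks/baseline.json") as f:
            baseline = json.load(f)
        floor = baseline[name]
        if measured_ns > floor * (1.0 + tolerance):
            raise AssertionError(
                f"{name}: {measured_ns:.0f} ns vs baseline "
                f"{floor:.0f} ns (>{tolerance:.0%} regression)"
            )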
Ryan Malloy 2026-05-04 17:52:20 -06:00
parent 0e0dfcba26
commit f3e589c5bf
5 changed files with 590 additions and 510 deletions

CHANGELOG.md

@@ -2,6 +2,45 @@
 All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
+
+## 2026.05.04.8 — Hot-path optimization (Phase 23)
+
+Optimized `parse_tuple_payload` — the per-row decode function hit by every SELECT result set. **The 1k-row fetch wall-clock time improved 19%** (1477 µs → 1198 µs). The bench micro-target (`parse_tuple_5cols`) improved 27% (2796 ns → 2030 ns). All 224 integration tests still pass; ruff clean.
+
+### What changed (`src/informix_db/_resultset.py`)
+
+- **Removed the redundant `base_type()` call from the hot loop.** `ColumnInfo.type_code` is already base-typed by `parse_describe` at construction — calling `base_type(col.type_code)` again per column per row was pure waste. This was the single largest savings.
+- **Lifted `int(IfxType.X)` to module-level constants** (`_TC_CHAR`, `_TC_VARCHAR`, etc.). The original code did the IntFlag→int conversion inline ~10 times per loop iteration; it is now done once at module import.
+- **Moved lazy imports to the module top** (`_decode_datetime`, `_decode_interval`, `BlobLocator`, `ClobLocator`, `RowValue`, `CollectionValue`). This saves a per-call attribute lookup; verified no circular-import risk.
+- **Three precomputed frozensets** (`_LENGTH_PREFIXED_SHORT_TYPES`, `_COMPOSITE_UDT_TYPES`, `_NUMERIC_TYPES`) replace inline tuple-membership checks.
+- **`_COLLECTION_KIND_MAP` wrapped in `MappingProxyType`** — actually frozen against accidental mutation, not just nominally.
+
+### Margaret Hamilton review pass
+
+The optimization went through a rigorous failure-mode review. Findings addressed before tagging:
+
+- **H1 (high)**: `cursor._dereference_blob_columns` (lines 304-310) was doing the same redundant `base_type()` call. Stripped for consistency — otherwise the next reader would write a "fix" to one site or the other depending on which they noticed first.
+- **M1 (medium)**: documented the load-bearing invariant at its single producer site. `parse_describe` now has a comment naming the readers that depend on `ColumnInfo.type_code` being base-typed, so a future contributor adding a new construction site has a grep-able warning.
+- **M2 (medium)**: `_COLLECTION_KIND_MAP` is now `MappingProxyType` (was a plain dict).
+- **L1 (low)**: stale "(line 151)" comment reference replaced with a pointer to the named INVARIANT comment.
+
+### Performance summary
+
+| Benchmark | Pre | Post | Delta |
+|---|---:|---:|---:|
+| `parse_tuple_5cols_iso8859` | 2796 ns | 2030 ns | **-27%** |
+| `parse_tuple_5cols_utf8` | 2791 ns | 2041 ns | **-27%** |
+| `select_bench_table_all` (1k rows) | 1477 µs | 1198 µs | **-19%** |
+| `select_with_param` (~50 rows) | 1069 µs | 994 µs | -7% |
+| Codec micro-benchmarks (`decode_int`, etc.) | — | — | unchanged (±noise) |
+| `cold_connect_disconnect` | — | — | unchanged |
+| `executemany` series | — | — | unchanged |
+
+Real-world fetch ceiling on a single connection: 350K rows/sec → 490K rows/sec.
+
+### Baseline refreshed
+
+`tests/benchmarks/baseline.json` updated with the new (faster) numbers. Future regressions will be measured against this floor.
+
 ## 2026.05.04.7 — User-facing documentation refresh (Phase 22)
 The `docs/USAGE.md` predated Phases 17-21, so anyone landing on PyPI was missing scrollable cursors, locale/Unicode, the autocommit cliff finding, and the type-mapping reference. This release closes that gap.

pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.04.7"
+version = "2026.05.04.8"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }

src/informix_db/_resultset.py

@@ -21,10 +21,47 @@ column names), read via readPadded.
 from __future__ import annotations
 from dataclasses import dataclass
+from types import MappingProxyType
 from ._protocol import IfxStreamReader
-from ._types import base_type, is_nullable
-from .converters import FIXED_WIDTHS, decode
+from ._types import IfxType, base_type, is_nullable
+from .converters import (
+    FIXED_WIDTHS,
+    BlobLocator,
+    ClobLocator,
+    CollectionValue,
+    RowValue,
+    _decode_datetime,
+    _decode_interval,
+    decode,
+)
+
+# Module-level type-code constants — lifted out of the hot loop in
+# parse_tuple_payload so we don't pay the IntFlag→int conversion per
+# column per row.
+_TC_CHAR = int(IfxType.CHAR)
+_TC_VARCHAR = int(IfxType.VARCHAR)
+_TC_NCHAR = int(IfxType.NCHAR)
+_TC_NVCHAR = int(IfxType.NVCHAR)
+_TC_LVARCHAR = int(IfxType.LVARCHAR)
+_TC_DECIMAL = int(IfxType.DECIMAL)
+_TC_MONEY = int(IfxType.MONEY)
+_TC_DATETIME = int(IfxType.DATETIME)
+_TC_INTERVAL = int(IfxType.INTERVAL)
+_TC_UDTFIXED = int(IfxType.UDTFIXED)
+_TC_UDTVAR = int(IfxType.UDTVAR)
+_TC_ROW = int(IfxType.ROW)
+_TC_COLLECTION = int(IfxType.COLLECTION)
+_TC_SET = int(IfxType.SET)
+_TC_MULTISET = int(IfxType.MULTISET)
+_TC_LIST = int(IfxType.LIST)
+
+_COLLECTION_KIND_MAP = MappingProxyType({
+    _TC_SET: "set",
+    _TC_MULTISET: "multiset",
+    _TC_LIST: "list",
+    _TC_COLLECTION: "collection",
+})
+
+
 @dataclass
@@ -145,6 +182,11 @@ def parse_describe(reader: IfxStreamReader) -> tuple[list[ColumnInfo], dict]:
         # Walk the string table to find the name at this offset.
         tail = string_table[fd["field_index"] :].split(b"\x00", 1)[0]
         name = tail.decode("iso-8859-1") if tail else f"col{len(columns)}"
+        # INVARIANT: ColumnInfo.type_code is always base-typed (high-bit
+        # flags stripped). This is the single producer site — every reader
+        # (parse_tuple_payload, cursor._dereference_blob_columns, etc.)
+        # depends on this and skips redundant base_type() calls. If you
+        # ever construct ColumnInfo elsewhere, base_type() the input.
         columns.append(
             ColumnInfo(
                 name=name or f"col{len(columns)}",
@@ -164,15 +206,23 @@ def parse_describe(reader: IfxStreamReader) -> tuple[list[ColumnInfo], dict]:
 # Per ``IfxSqli`` row-data extraction (see receiveFastPath case 13/15/16):
 # CHAR, VARCHAR, NCHAR, NVCHAR all use ``[short length][bytes][pad if odd]``
 # inside the tuple blob. LVARCHAR uses a 4-byte length prefix instead.
-from ._types import IfxType  # noqa: E402
 _LENGTH_PREFIXED_SHORT_TYPES = frozenset({
-    int(IfxType.CHAR),
-    int(IfxType.VARCHAR),
-    int(IfxType.NCHAR),
-    int(IfxType.NVCHAR),
+    _TC_CHAR,
+    _TC_VARCHAR,
+    _TC_NCHAR,
+    _TC_NVCHAR,
 })
+_COMPOSITE_UDT_TYPES = frozenset({
+    _TC_ROW,
+    _TC_COLLECTION,
+    _TC_SET,
+    _TC_MULTISET,
+    _TC_LIST,
+})
+_NUMERIC_TYPES = frozenset({_TC_DECIMAL, _TC_MONEY})
 
 def parse_tuple_payload(
     reader: IfxStreamReader,
@@ -212,10 +262,15 @@ def parse_tuple_payload(
     values: list[object] = []
     offset = 0
+    # Note: ``col.type_code`` is *already* base-typed by ``parse_describe``
+    # (see INVARIANT comment there), so we don't re-strip high-bit flags
+    # here. The original code called ``base_type(col.type_code)`` per
+    # column per row — pure waste. Skipping it is the single largest
+    # savings in this loop.
     for col in columns:
-        base = base_type(col.type_code)
+        tc = col.type_code
-        if base in _LENGTH_PREFIXED_SHORT_TYPES:
+        if tc in _LENGTH_PREFIXED_SHORT_TYPES:
             # In tuple data, VARCHAR/NCHAR/NVCHAR use a SINGLE-BYTE
             # length prefix (max 255 — IDS VARCHAR's hard limit), not
             # a short. Empirically verified against the SQ_TUPLE bytes
@@ -224,8 +279,7 @@
             # payload = 09 73 79 73 74 61 62 6c 65 73
             # = [byte 9]["systables"]
             # CHAR is fixed-width per encoded_length — handled below.
-            if base == int(IfxType.CHAR):
-                # CHAR(N) is fixed-width; uses encoded_length straight
+            if tc == _TC_CHAR:
                 width = col.encoded_length
                 raw = payload[offset:offset + width]
                 offset += width
@@ -234,10 +288,10 @@
                 offset += 1
                 raw = payload[offset:offset + length]
                 offset += length
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
             continue
-        if base == int(IfxType.LVARCHAR):
+        if tc == _TC_LVARCHAR:
             # [int length][bytes][pad if odd]
             length = int.from_bytes(payload[offset:offset + 4], "big", signed=True)
             offset += 4
@@ -245,19 +299,19 @@
             offset += length
             if length & 1:
                 offset += 1
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
             continue
         # DECIMAL/MONEY: width = ceil(precision/2) + 1, where precision is
         # the high byte of encoded_length (packed as (precision << 8) | scale).
         # Per IfxRowColumn.loadColumnData and IfxToJavaDecimal byte sizing.
-        if base in (int(IfxType.DECIMAL), int(IfxType.MONEY)):
+        if tc in _NUMERIC_TYPES:
             precision = (col.encoded_length >> 8) & 0xFF
             width = (precision + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
             try:
-                values.append(decode(col.type_code, raw))
+                values.append(decode(tc, raw))
             except NotImplementedError:
                 values.append(raw)
             continue
@@ -266,12 +320,11 @@
         # high byte of encoded_length (packed as (digit_count << 8) |
         # (start_TU << 4) | end_TU). The decoder needs the qualifier too,
         # so we call it directly here rather than via the dispatch.
-        if base == int(IfxType.DATETIME):
+        if tc == _TC_DATETIME:
             digit_count = (col.encoded_length >> 8) & 0xFF
             width = (digit_count + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
-            from .converters import _decode_datetime
             values.append(_decode_datetime(raw, col.encoded_length))
             continue
@@ -281,12 +334,11 @@
         # plus ceil(digit_count/2) digit pairs). Like DATETIME, the
         # qualifier is needed at decode time, so we bypass the generic
         # dispatch.
-        if base == int(IfxType.INTERVAL):
+        if tc == _TC_INTERVAL:
             digit_count = (col.encoded_length >> 8) & 0xFF
             width = (digit_count + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
-            from .converters import _decode_interval
             values.append(_decode_interval(raw, col.encoded_length))
             continue
@@ -295,8 +347,7 @@
         # (CLOB) and encoded_length = 72 (locator size). The 72 bytes
         # we read here are an opaque server-side reference, NOT the
         # actual data. Phase 10 lets users fetch via lotofile + SQ_FILE.
-        if base == int(IfxType.UDTFIXED) and col.extended_id in (10, 11):
-            from .converters import BlobLocator, ClobLocator
+        if tc == _TC_UDTFIXED and col.extended_id in (10, 11):
             width = col.encoded_length
             raw = payload[offset:offset + width]
             offset += width
@@ -315,14 +366,7 @@
         # We surface the bytes wrapped in a typed object and let the
         # user parse the textual form themselves. Type codes:
         # ROW=22, COLLECTION=23, SET=19, MULTISET=20, LIST=21.
-        if base in (
-            int(IfxType.ROW),
-            int(IfxType.COLLECTION),
-            int(IfxType.SET),
-            int(IfxType.MULTISET),
-            int(IfxType.LIST),
-        ):
-            from .converters import CollectionValue, RowValue
+        if tc in _COMPOSITE_UDT_TYPES:
             indicator = payload[offset]
             offset += 1
             if indicator == 1:  # null
@@ -334,19 +378,13 @@
             offset += 4
             raw = bytes(payload[offset:offset + length])
             offset += length
-            if base == int(IfxType.ROW):
+            if tc == _TC_ROW:
                 values.append(RowValue(raw=raw, schema=col.extended_name))
             else:
-                kind_map = {
-                    int(IfxType.SET): "set",
-                    int(IfxType.MULTISET): "multiset",
-                    int(IfxType.LIST): "list",
-                    int(IfxType.COLLECTION): "collection",
-                }
                 values.append(
                     CollectionValue(
                         raw=raw,
-                        kind=kind_map[base],
+                        kind=_COLLECTION_KIND_MAP[tc],
                         element_schema=col.extended_name,
                     )
                 )
@@ -359,7 +397,7 @@
         # verified against ``SELECT lotofile(...)`` row data — the
         # leading ``00`` is null indicator (0=not null, 1=null per UDT
         # convention).
-        if base == int(IfxType.UDTVAR) and col.extended_name == "lvarchar":
+        if tc == _TC_UDTVAR and col.extended_name == "lvarchar":
             indicator = payload[offset]
             offset += 1
             if indicator == 1:
@@ -377,7 +415,7 @@
             continue
 
         # Fixed-width types
-        width = FIXED_WIDTHS.get(base)
+        width = FIXED_WIDTHS.get(tc)
         if width is None:
             # Phase 6+ types (DATETIME, INTERVAL, BLOBs) — fall back
             # to encoded_length and surface raw bytes.
@@ -385,7 +423,7 @@
         raw = payload[offset:offset + width]
         offset += width
         try:
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
         except NotImplementedError:
             values.append(raw)
     return tuple(values)


@@ -301,12 +301,15 @@ class Cursor:
         and the response is one or more ``SQ_BLOB`` (39) chunks ending
         with a zero-length terminator.
         """
-        from ._types import IfxType, base_type
+        from ._types import IfxType
+
+        # ColumnInfo.type_code is base-typed by construction
+        # (see parse_describe / INVARIANT comment) — no base_type() needed.
+        byte_text_codes = (int(IfxType.BYTE), int(IfxType.TEXT))
         blob_indices = [
-            (i, base_type(c.type_code))
+            (i, c.type_code)
             for i, c in enumerate(self._columns)
-            if base_type(c.type_code) in (int(IfxType.BYTE), int(IfxType.TEXT))
+            if c.type_code in byte_text_codes
         ]
         if not blob_indices:
             return

File diff suppressed because it is too large.