Phase 38: exec()-based row decoder codegen (2026.05.05.11)
Generates a specialized row-decoder function per result-set shape via
exec(compile(src, ...)) and inlines the common fixed-width decode bodies
directly into the generated source — closing more of the C-vs-Python
codec gap on bulk fetch.
For SMALLINT/INT/SERIAL/BIGINT/BIGSERIAL/FLOAT/SMFLOAT/DATE the decode
body is inlined ("v0 = _UNPACK_INT(raw)[0]; if v0 == sentinel: v0 = None")
rather than called, eliminating one Python function call per such column
per row. BOOL is deliberately left to its canonical decoder (Informix
BOOL is 't'/'T'/1, not bool(byte)).
Real A/B vs Phase 37 (median, integration container):
select_scaling[100000] 257.66 -> 227.67 ms (-12%)
wide_row_select[20] 4.27 -> 3.63 ms (-15%)
select_scaling[10000] 25.13 -> 22.58 ms (-10%)
wide_row_select[100] 15.17 -> 13.59 ms (-10%)
The win scales with row count and column count — exactly the codegen
profile expected from per-column inlining.
Generated source is printable via IFX_DEBUG_CODEGEN=1.
Three-tier composition: codegen -> reader-list -> legacy chain;
parse_tuple_payload prefers the codegen'd decoder, falls back to the
Phase 37 readers list, then to the legacy branch chain.
All 251 integration tests pass.
This commit is contained in:
parent 7f729b3a38
commit a5e6cf1ae3

54	CHANGELOG.md
@@ -2,6 +2,60 @@

All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.

## 2026.05.05.11 — Phase 38: `exec()`-based row-decoder codegen
Closes more of the C-vs-Python codec gap on bulk fetch by emitting a specialized row decoder per result-set shape via `exec(compile(src, ...))` and inlining the common fixed-width decode bodies directly into the generated source. This is the lever flagged in the Phase 37 changelog as "the next lever for materially closing the gap."

### What changed

`src/informix_db/_resultset.py`:
- New `compile_row_decoder(readers, columns)` builds a Python source string per result-set shape and compiles it via `exec()`. The generated function has signature `parse_row(payload, offset, encoding) -> tuple` and contains zero loops — every column is handled by inline straight-line code.
- For the common fixed-width types (`SMALLINT`, `INT`, `SERIAL`, `BIGINT`, `BIGSERIAL`, `FLOAT`, `SMFLOAT`, `DATE`), the decoder body is **inlined** rather than called: `v0 = _UNPACK_INT(raw)[0]; if v0 == -2147483648: v0 = None`. That eliminates one Python function call per such column per row — the actual physics behind the speedup.
- `BOOL` is deliberately left to its canonical decoder. Inlining `bool(raw[0])` would silently accept `'f'` (102, truthy) as `True` — semantic drift.

- `parse_tuple_payload` accepts an optional `row_decoder=` parameter. When provided, the entire hot loop is bypassed: `return row_decoder(payload, 0, encoding)`.
- The generated source is printable via `IFX_DEBUG_CODEGEN=1` for inspection.
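The mechanism can be sketched in miniature. The following is a hypothetical `build_decoder` (illustrative only, not the driver's API) for rows of big-endian INT columns: it builds straight-line source with the INT NULL-sentinel check inlined, `exec(compile(...))`s it once, and returns the specialized function for reuse on every row.

```python
import struct

_UNPACK_INT = struct.Struct(">i").unpack  # big-endian 4-byte int

def build_decoder(n_int_cols: int):
    # Emit straight-line source: one slice + unpack per column, no loop.
    lines = ["def parse_row(payload, offset, encoding):"]
    names = []
    for i in range(n_int_cols):
        lines.append(f"    v{i} = _UNPACK_INT(payload[offset + {4 * i}:offset + {4 * i + 4}])[0]")
        lines.append(f"    if v{i} == -2147483648:")  # inlined NULL sentinel
        lines.append(f"        v{i} = None")
        names.append(f"v{i}")
    lines.append(f"    return ({', '.join(names)},)")
    g = {"_UNPACK_INT": _UNPACK_INT}
    exec(compile("\n".join(lines), "<codegen demo>", "exec"), g)
    return g["parse_row"]

decode = build_decoder(2)
payload = struct.pack(">ii", 7, -2147483648)
print(decode(payload, 0, "ascii"))  # (7, None)
```

The compile cost is paid once per shape; every subsequent row runs the specialized function with zero dispatch.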
`src/informix_db/cursors.py`:
- After `parse_describe`, the cursor compiles **both** the Phase 37 reader-list AND the Phase 38 row decoder. `parse_tuple_payload` prefers the codegen'd decoder; if codegen returns `None` (unsupported shape), the readers-list dispatch handles it; if both are `None`, the legacy branch chain runs.

### Performance

Real numbers from the integration container, median of 10+ rounds, A/B against Phase 37 (stash → bench → unstash → bench, same Docker container, same load):
| Benchmark | Phase 37 | Phase 38 | Δ |
|---|---:|---:|---:|
| `select_scaling[1000]` | 2.74 ms | 2.62 ms | -4% |
| `select_scaling[10000]` | 25.13 ms | 22.58 ms | **-10%** |
| `select_scaling[100000]` | 257.66 ms | 227.67 ms | **-12%** |
| `select_type_mix_1000_rows` | 4.57 ms | 4.32 ms | -5% |
| `wide_row_select[5]` | 2.28 ms | 2.05 ms | **-10%** |
| `wide_row_select[20]` | 4.27 ms | 3.63 ms | **-15%** |
| `wide_row_select[50]` | 8.10 ms | 7.19 ms | -11% |
| `wide_row_select[100]` | 15.17 ms | 13.59 ms | **-10%** |

**The win scales with both row count and column count** — exactly the codegen profile we'd expect from per-column inlining. At small row counts the one-time `exec(compile(...))` cost dilutes the per-row win; at 100k rows it's invisible.
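As a toy break-even model of that dilution (assumed costs, not measurements — real values depend on shape and machine):

```python
# Toy amortization model with assumed costs — not measured values.
compile_cost_us = 50.0    # one-time exec(compile(...)) per result-set shape
saving_per_row_us = 0.3   # per-row dispatch/call overhead removed

for rows in (100, 10_000, 100_000):
    net_ms = (rows * saving_per_row_us - compile_cost_us) / 1000
    print(f"{rows} rows: net {net_ms:+.2f} ms")
```

At 100 rows the fixed compile cost outweighs the saving; at 100k rows it is rounding noise.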

### Architectural note

This is conceptually the same step `psycopg3`'s C-mode and `asyncpg` (Cython) take, except we stay 100% pure-Python. We don't compile to native code; we compile to specialized Python bytecode via `exec()`. CPython's bytecode interpreter is remarkably efficient on straight-line code with local variables — the codegen win comes from removing dispatch and function-call overhead, not from native execution.
The three-tier composition stays clean:
1. **Codegen** (`row_decoder`) — fastest path, fires on common shapes
2. **Reader list** (`readers`) — fallback when codegen rejects a shape
3. **Legacy branch chain** — fallback for the no-readers case
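The composition can be sketched as a simple chain of optional strategies (stub tiers for illustration; the names are not the driver's API):

```python
def parse_row_tiered(payload, encoding, row_decoder=None, readers=None, legacy=None):
    # Tier 1: codegen'd decoder — bypasses all per-column dispatch.
    if row_decoder is not None:
        return row_decoder(payload, 0, encoding)
    # Tier 2: reader-list dispatch — one precompiled reader per column.
    if readers is not None:
        return tuple(read(payload) for read in readers)
    # Tier 3: legacy branch chain — always available.
    return legacy(payload)

# Stub tiers standing in for the real decode paths:
def fast(payload, offset, encoding):
    return ("codegen",)

def read_col(payload):
    return "reader"

def slow(payload):
    return ("legacy",)

print(parse_row_tiered(b"", "ascii", row_decoder=fast))    # ('codegen',)
print(parse_row_tiered(b"", "ascii", readers=[read_col]))  # ('reader',)
print(parse_row_tiered(b"", "ascii", legacy=slow))         # ('legacy',)
```

Each tier only fires when the faster one above it declined the shape, so correctness never depends on codegen succeeding.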

### Tests

All 251 integration tests still pass. The codegen output was verified via `IFX_DEBUG_CODEGEN=1` for a 9-column mixed-type shape (SMALLINT/INT/BIGINT/FLOAT/SMFLOAT/DATE/BOOL/CHAR/VARCHAR): all inline NULL-sentinel checks are correct, and CHAR/VARCHAR fall through to the registered decoder via the globals dict (`_D{i}`). No new test code; the integration suite plus the benchmark suite serve as the regression test.

### Honest assessment

Combined with Phase 37 (per-column reader strategy), bulk fetch is now ~20-25% faster than Phase 36 in the worst-affected workloads. The IfxPy gap on `select_scaling[100000]` shrinks from ~2.2× to ~2.0×. Pure-Python finally costs roughly **2× C** for bulk fetch — close enough that the deployment win (no CSDK, no JVM) starts to outweigh the perf cost for most users.
Further codegen wins are possible (inlining DATETIME/INTERVAL, batch-decoding all rows of a payload in one call) but with diminishing returns. The remaining gap is dominated by socket I/O and the SQLI protocol's chatty per-row framing — protocol-level work, not codec work.

## 2026.05.05.10 — Phase 37: Pre-baked per-column reader strategy

Closes some of the C-vs-Python codec gap on bulk fetch by moving per-column dispatch decisions from row time to `parse_describe` time. Same idea as psycopg3's pure-Python loader-cache pattern.

pyproject.toml:

@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05.10"
+version = "2026.05.05.11"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }

src/informix_db/_resultset.py:

@@ -20,12 +20,21 @@ column names), read via readPadded.
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass
from datetime import timedelta as _timedelta
from types import MappingProxyType

from ._protocol import IfxStreamReader
from ._types import IfxType, base_type, is_nullable
from .converters import (
    _DOUBLE_NULL,
    _REAL_NULL,
    _UNPACK_DOUBLE,
    _UNPACK_FLOAT,
    _UNPACK_INT,
    _UNPACK_LONG,
    _UNPACK_SHORT,
    DECODERS,
    FIXED_WIDTHS,
    BlobLocator,

@@ -36,6 +45,9 @@ from .converters import (
    _decode_datetime,
    _decode_interval,
)
from .converters import (
    _INFORMIX_DATE_EPOCH as _DATE_EPOCH,
)

# Module-level type-code constants — lifted out of the hot loop in
# parse_tuple_payload so we don't pay the IntFlag→int conversion per
@@ -318,6 +330,236 @@ def compile_column_readers(columns: list[ColumnInfo]) -> list[tuple]:
    return readers


# Phase 38 codegen — sentinel constants imported into the generated
# function's globals so inlined decode bodies can reference them by
# name without dotted lookups.
_INT_MIN_SENTINEL = -0x80000000
_SHORT_MIN_SENTINEL = -0x8000
_LONG_MIN_SENTINEL = -0x8000000000000000


def compile_row_decoder(
    readers: list[tuple],
    columns: list[ColumnInfo],
) -> Callable[[bytes, int, str], tuple] | None:
    """Generate a specialized row decoder for a specific column shape.

    Phase 38: takes the Phase 37 reader-list and emits a Python
    function via ``exec()`` that decodes one row of this exact shape
    in straight-line code — no per-column iteration, no per-column
    tuple-unpack, no per-column branch dispatch. Each column's
    decode logic is inlined directly.

    The generated function has signature
    ``parse_row(payload, offset, encoding) -> tuple`` and only
    references module-level helpers via its closure-equivalent
    globals dict (the ``g`` dict below).

    Returns ``None`` if any column's reader-kind is unsupported by
    the codegen — caller falls back to the Phase 37 dispatch loop.

    The generated source is printable via the ``IFX_DEBUG_CODEGEN=1``
    env var for inspection / debugging.
    """
    import os

    lines: list[str] = []
    lines.append("def parse_row(payload, offset, encoding):")
    val_names: list[str] = []

    # Map type-code → inline-decoder source for the common fixed-width
    # decoders. Inlining the decoder body eliminates one function call
    # per column — the actual codegen win. For types not in this map,
    # fall back to ``_D{i}(raw)`` referencing the decoder via globals.
    _INLINE_FIXED = {
        # type_code: lambda v, raw_var: source-snippet
        # SMALLINT (1)
        1: lambda v, r: (
            f"    {v} = _UNPACK_SHORT({r})[0]\n"
            f"    if {v} == -32768:\n"
            f"        {v} = None"
        ),
        # INT (2), SERIAL (6) — same body
        2: lambda v, r: (
            f"    {v} = _UNPACK_INT({r})[0]\n"
            f"    if {v} == -2147483648:\n"
            f"        {v} = None"
        ),
        6: lambda v, r: (
            f"    {v} = _UNPACK_INT({r})[0]\n"
            f"    if {v} == -2147483648:\n"
            f"        {v} = None"
        ),
        # BIGINT (52), BIGSERIAL (53) — same body
        52: lambda v, r: (
            f"    {v} = _UNPACK_LONG({r})[0]\n"
            f"    if {v} == -9223372036854775808:\n"
            f"        {v} = None"
        ),
        53: lambda v, r: (
            f"    {v} = _UNPACK_LONG({r})[0]\n"
            f"    if {v} == -9223372036854775808:\n"
            f"        {v} = None"
        ),
        # FLOAT (3), SMFLOAT (4)
        3: lambda v, r: (
            f"    if {r} == _DOUBLE_NULL:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _UNPACK_DOUBLE({r})[0]"
        ),
        4: lambda v, r: (
            f"    if {r} == _REAL_NULL:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _UNPACK_FLOAT({r})[0]"
        ),
        # DATE (7) — 4-byte day count from 1899-12-31
        7: lambda v, r: (
            f"    days = _UNPACK_INT({r})[0]\n"
            f"    if days == -2147483648:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _DATE_EPOCH + _timedelta(days=days)"
        ),
        # BOOL (45) — left to the canonical decoder. Informix BOOL is
        # ``'t'/'T'/1``, NOT bool(byte) — a truthy-byte inline would
        # silently turn ``'f'`` (102) into True.
    }

    for i, r in enumerate(readers):
        kind = r[0]
        v = f"v{i}"
        val_names.append(v)
        lines.append(f"    # Col {i}: kind={kind}")

        if kind == _RK_FIXED:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            # Find type code from the decoder identity (we don't have
            # tc directly in the reader tuple; recover via the columns
            # list).
            tc = columns[i].type_code
            inline_src = _INLINE_FIXED.get(tc)
            if inline_src is not None:
                lines.append(inline_src(v, "raw"))
            else:
                lines.append(f"    {v} = _D{i}(raw)")

        elif kind == _RK_BYTE_PREFIX:
            lines.append("    length = payload[offset]")
            lines.append("    offset += 1")
            lines.append("    raw = payload[offset:offset + length]")
            lines.append("    offset += length")
            lines.append(f"    {v} = _D{i}(raw, encoding)")

        elif kind == _RK_CHAR:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _D{i}(raw, encoding)")

        elif kind == _RK_LVARCHAR:
            lines.append(
                "    length = int.from_bytes("
                "payload[offset:offset+4], 'big', signed=True)"
            )
            lines.append("    offset += 4")
            lines.append("    raw = payload[offset:offset + length]")
            lines.append("    offset += length")
            lines.append("    if length & 1:")
            lines.append("        offset += 1")
            lines.append(f"    {v} = _D{i}(raw, encoding)")

        elif kind == _RK_DECIMAL:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append("    try:")
            lines.append(f"        {v} = _D{i}(raw)")
            lines.append("    except NotImplementedError:")
            lines.append(f"        {v} = raw")

        elif kind == _RK_DATETIME:
            _, width, enc_len = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _decode_datetime(raw, {enc_len})")

        elif kind == _RK_INTERVAL:
            _, width, enc_len = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _decode_interval(raw, {enc_len})")

        elif kind == _RK_LEGACY:
            # Codegen for rare types: call the legacy helper. The
            # column metadata is referenced via the globals dict.
            tc = r[1]
            lines.append(
                f"    offset, {v} = _legacy_dispatch_one_column("
                f"payload, offset, {tc}, _COL{i}, encoding)"
            )

        else:
            # Unknown kind — abort codegen, caller falls back.
            return None

    if val_names:
        lines.append(f"    return ({', '.join(val_names)},)")
    else:
        lines.append("    return ()")

    src = "\n".join(lines)

    if os.environ.get("IFX_DEBUG_CODEGEN") == "1":
        import sys

        print("=== informix_db codegen ===", file=sys.stderr)
        print(src, file=sys.stderr)
        print("=== end ===", file=sys.stderr)

    # Build the globals dict for the generated function. Each column's
    # decoder (if any) is registered as ``_D<i>``; columns with the
    # _RK_LEGACY kind get their ColumnInfo as ``_COL<i>``.
    #
    # The inlined fixed-width snippets (see ``_INLINE_FIXED`` above)
    # reference precompiled struct unpackers and NULL sentinels by
    # name — they only resolve if we hand them to ``exec`` here.
    g: dict = {
        "_decode_datetime": _decode_datetime,
        "_decode_interval": _decode_interval,
        "_legacy_dispatch_one_column": _legacy_dispatch_one_column,
        "_UNPACK_SHORT": _UNPACK_SHORT,
        "_UNPACK_INT": _UNPACK_INT,
        "_UNPACK_LONG": _UNPACK_LONG,
        "_UNPACK_FLOAT": _UNPACK_FLOAT,
        "_UNPACK_DOUBLE": _UNPACK_DOUBLE,
        "_DOUBLE_NULL": _DOUBLE_NULL,
        "_REAL_NULL": _REAL_NULL,
        "_DATE_EPOCH": _DATE_EPOCH,
        "_timedelta": _timedelta,
        "int": int,  # ensure the builtin isn't shadowed
        "bool": bool,
    }
    for i, r in enumerate(readers):
        kind = r[0]
        if kind in (_RK_FIXED, _RK_CHAR, _RK_DECIMAL):
            g[f"_D{i}"] = r[2]
        elif kind in (_RK_BYTE_PREFIX, _RK_LVARCHAR):
            g[f"_D{i}"] = r[1]
        elif kind == _RK_LEGACY:
            g[f"_COL{i}"] = columns[i]

    namespace: dict = {}
    try:
        exec(compile(src, "<informix_db codegen>", "exec"), g, namespace)
    except SyntaxError:
        return None

    return namespace["parse_row"]


def _legacy_dispatch_one_column(
    payload: bytes,
    offset: int,
@@ -387,6 +629,7 @@ def parse_tuple_payload(
    columns: list[ColumnInfo],
    encoding: str = "iso-8859-1",
    readers: list[tuple] | None = None,
    row_decoder: Callable[[bytes, int, str], tuple] | None = None,
) -> tuple:
    """Parse a SQ_TUPLE payload (the SQ_TUPLE tag is already consumed).

@@ -419,6 +662,13 @@
    if size & 1:
        reader.read_exact(1)

    # Phase 38 fastest path: a per-result-set decoder function compiled
    # via ``exec()`` from the column shape (see ``compile_row_decoder``).
    # All per-column dispatch is eliminated — each column's decode logic
    # is inlined in straight-line code.
    if row_decoder is not None:
        return row_decoder(payload, 0, encoding)

    values: list[object] = []
    offset = 0

src/informix_db/cursors.py:

@@ -33,6 +33,7 @@ from ._protocol import IfxStreamReader, make_pdu_writer
from ._resultset import (
    ColumnInfo,
    compile_column_readers,
    compile_row_decoder,
    parse_describe,
    parse_tuple_payload,
)

@@ -192,6 +193,7 @@ class Cursor:
        self._description: list[tuple] | None = None
        self._columns: list[ColumnInfo] = []
        self._column_readers: list[tuple] | None = None  # Phase 37
        self._row_decoder = None  # Phase 38 codegen'd row decoder
        self._rowcount: int = -1
        self._rows: list[tuple] = []
        # Phase 17: index-based row access enables scroll cursors. The

@@ -313,6 +315,7 @@ class Cursor:
        self._description = None
        self._columns = []
        self._column_readers = None  # Phase 37
        self._row_decoder = None  # Phase 38
        self._rowcount = -1
        self._rows = []
        self._row_index = -1  # before-first-row

@@ -1550,9 +1553,20 @@ class Cursor:
                 # row-decode loop in parse_tuple_payload uses this to avoid
                 # re-running per-row dispatch decisions that depend only
                 # on column metadata.
-                self._column_readers = (
-                    compile_column_readers(self._columns) if self._columns else None
-                )
+                if self._columns:
+                    self._column_readers = compile_column_readers(self._columns)
+                    # Phase 38: take it one step further — codegen a
+                    # specialized row decoder for THIS column shape.
+                    # Eliminates the per-column iteration overhead of
+                    # the readers loop. ``None`` if codegen can't
+                    # handle the shape; parse_tuple_payload then
+                    # falls back to the readers-list dispatch.
+                    self._row_decoder = compile_row_decoder(
+                        self._column_readers, self._columns
+                    )
+                else:
+                    self._column_readers = None
+                    self._row_decoder = None
             elif tag == 94:  # SQ_INSERTDONE — Informix optimization: literal
                 # INSERT executed during PREPARE. Payload is:
                 # readLongInt (10 bytes) — serial8 inserted

@@ -1586,6 +1600,7 @@ class Cursor:
                    self._columns,
                    encoding=self._conn.encoding,
                    readers=self._column_readers,
                    row_decoder=self._row_decoder,
                )
                self._rows.append(row)
            elif tag == MessageType.SQ_DONE: