Phase 38: exec()-based row decoder codegen (2026.05.05.11)

Generates a specialized row-decoder function per result-set shape via
exec(compile(src, ...)) and inlines the common fixed-width decode bodies
directly into the generated source — closing more of the C-vs-Python
codec gap on bulk fetch.

For SMALLINT/INT/SERIAL/BIGINT/BIGSERIAL/FLOAT/SMFLOAT/DATE the decode
body is inlined ("v0 = _UNPACK_INT(raw)[0]; if v0 == sentinel: v0 = None")
rather than called, eliminating one Python function call per such column
per row. BOOL deliberately left to its canonical decoder (Informix BOOL
is 't'/'T'/1, not bool(byte)).

Real A/B vs Phase 37 (median, integration container):
  select_scaling[100000]    257.66 -> 227.67 ms  (-12%)
  wide_row_select[20]         4.27 ->   3.63 ms  (-15%)
  select_scaling[10000]      25.13 ->  22.58 ms  (-10%)
  wide_row_select[100]       15.17 ->  13.59 ms  (-10%)

Win scales with row count and column count — exactly the codegen
profile expected from per-column inlining.

Generated source is printable via IFX_DEBUG_CODEGEN=1.

Three-tier composition: codegen -> reader-list -> legacy chain;
parse_tuple_payload prefers the codegen'd decoder, falls back to the
Phase 37 readers list, falls back to the legacy branch chain.

All 251 integration tests pass.
Ryan Malloy 2026-05-05 14:19:26 -06:00
parent 7f729b3a38
commit a5e6cf1ae3
4 changed files with 323 additions and 4 deletions


@@ -2,6 +2,60 @@
All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
## 2026.05.05.11 — Phase 38: `exec()`-based row-decoder codegen
Closes more of the C-vs-Python codec gap on bulk fetch by emitting a specialized row decoder per result-set shape via `exec(compile(src, ...))` and inlining the common fixed-width decode bodies directly into the generated source. This is the lever flagged in the Phase 37 changelog as "the next lever for materially closing the gap."
### What changed
`src/informix_db/_resultset.py`:
- New `compile_row_decoder(readers, columns)` builds a Python source string per result-set shape and compiles it via `exec()`. The generated function has signature `parse_row(payload, offset, encoding) -> tuple` and contains zero loops — every column is handled by inline straight-line code.
- For the common fixed-width types (`SMALLINT`, `INT`, `SERIAL`, `BIGINT`, `BIGSERIAL`, `FLOAT`, `SMFLOAT`, `DATE`), the decoder body is **inlined** rather than called: `v0 = _UNPACK_INT(raw)[0]; if v0 == -2147483648: v0 = None`. That eliminates one Python function call per such column per row — the actual physics behind the speedup.
- `BOOL` deliberately left to its canonical decoder. Inlining `bool(raw[0])` would silently accept `'f'` (102, truthy) as `True` — semantic drift.
- `parse_tuple_payload` accepts an optional `row_decoder=` parameter. When provided, the entire hot loop is bypassed: `return row_decoder(payload, 0, encoding)`.
- The generated source is printable via `IFX_DEBUG_CODEGEN=1` for inspection.
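What the generated source looks like can be sketched with a stripped-down, hand-rolled version of the same `exec(compile(...))` trick for a hypothetical two-column INT shape. The `_UNPACK_INT` name and the `-2147483648` NULL sentinel follow the text above; the flat big-endian payload layout is a simplifying assumption for the demo, not the actual SQLI wire format:

```python
import struct

# Precompiled unpacker, as in the real decoder's globals dict.
_UNPACK_INT = struct.Struct(">i").unpack


def compile_two_int_decoder():
    # Straight-line source for a hypothetical (INT, INT) shape:
    # no loops, no per-column dispatch -- each column decoded inline.
    src = (
        "def parse_row(payload, offset, encoding):\n"
        "    raw = payload[offset:offset+4]\n"
        "    offset += 4\n"
        "    v0 = _UNPACK_INT(raw)[0]\n"
        "    if v0 == -2147483648:\n"
        "        v0 = None\n"
        "    raw = payload[offset:offset+4]\n"
        "    offset += 4\n"
        "    v1 = _UNPACK_INT(raw)[0]\n"
        "    if v1 == -2147483648:\n"
        "        v1 = None\n"
        "    return (v0, v1)\n"
    )
    g = {"_UNPACK_INT": _UNPACK_INT}
    ns = {}
    exec(compile(src, "<codegen demo>", "exec"), g, ns)
    return ns["parse_row"]


decoder = compile_two_int_decoder()
payload = struct.pack(">ii", 42, -2147483648)
print(decoder(payload, 0, "iso-8859-1"))  # (42, None)
```

Each row then costs two slices and two unpack calls with zero per-column dispatch; the real `compile_row_decoder` emits one such straight-line block per column of the actual result-set shape.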
`src/informix_db/cursors.py`:
- After `parse_describe`, the cursor compiles **both** the Phase 37 reader-list AND the Phase 38 row decoder. `parse_tuple_payload` prefers the codegen'd decoder; if codegen returns `None` (unsupported shape), the readers-list dispatch handles it; if both are `None`, the legacy branch chain runs.
### Performance
Real numbers from the integration container, median of 10+ rounds, A/B against Phase 37 (stash → bench → unstash → bench, same Docker container, same load):
| Benchmark | Phase 37 | Phase 38 | Δ |
|---|---:|---:|---:|
| `select_scaling[1000]` | 2.74 ms | 2.62 ms | -4% |
| `select_scaling[10000]` | 25.13 ms | 22.58 ms | **-10%** |
| `select_scaling[100000]` | 257.66 ms | 227.67 ms | **-12%** |
| `select_type_mix_1000_rows` | 4.57 ms | 4.32 ms | -5% |
| `wide_row_select[5]` | 2.28 ms | 2.05 ms | **-10%** |
| `wide_row_select[20]` | 4.27 ms | 3.63 ms | **-15%** |
| `wide_row_select[50]` | 8.10 ms | 7.19 ms | -11% |
| `wide_row_select[100]` | 15.17 ms | 13.59 ms | **-10%** |
**The win scales with both row count and column count** — exactly the codegen profile we'd expect from per-column inlining. At small row counts the one-time `exec(compile(...))` cost dilutes the per-row win; at 100k rows it's invisible.
### Architectural note
This is conceptually the same step `psycopg3`'s C-mode and `asyncpg` (Cython) take, except we stay 100% pure-Python. We don't compile to native code; we compile to specialized Python bytecode via `exec()`. CPython's bytecode interpreter is remarkably efficient on straight-line code with local variables — the codegen win comes from removing dispatch and function-call overhead, not from native execution.
The three-tier composition stays clean:
1. **Codegen** (`row_decoder`) — fastest path, fires on common shapes
2. **Reader list** (`readers`) — fallback when codegen rejects a shape
3. **Legacy branch chain** — fallback for the no-readers case
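The fallback order amounts to a guard chain at the top of the payload parser. A minimal sketch, with hypothetical stub helpers standing in for the real Phase 37 and legacy paths:

```python
# Hypothetical stand-ins for the real Phase 37 / legacy decode paths.
def _decode_with_readers(payload, readers, encoding):
    return ("readers",)


def _decode_legacy(payload, columns, encoding):
    return ("legacy",)


def parse_tuple_payload(payload, columns, encoding="iso-8859-1",
                        readers=None, row_decoder=None):
    # Tier 1: Phase 38 codegen'd decoder -- bypasses the hot loop entirely.
    if row_decoder is not None:
        return row_decoder(payload, 0, encoding)
    # Tier 2: Phase 37 pre-baked reader list.
    if readers is not None:
        return _decode_with_readers(payload, readers, encoding)
    # Tier 3: legacy per-column branch chain.
    return _decode_legacy(payload, columns, encoding)


print(parse_tuple_payload(b"", [], row_decoder=lambda p, o, e: ("codegen",)))
# ('codegen',)
print(parse_tuple_payload(b"", [], readers=[]))   # ('readers',)
print(parse_tuple_payload(b"", []))               # ('legacy',)
```

Each tier fires only when the faster one declined the shape, so adding codegen cannot regress correctness: any shape it rejects simply decodes the same way it did in Phase 37.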
### Tests
All 251 integration tests still pass. The codegen output was verified against `IFX_DEBUG_CODEGEN=1` for a 9-column mixed-type shape: SMALLINT/INT/BIGINT/FLOAT/SMFLOAT/DATE/BOOL/CHAR/VARCHAR. All inline NULL-sentinel checks correct; CHAR/VARCHAR fall through to the registered decoder via the globals dict (`_D{i}`). No new test code; the integration suite + benchmark suite are the regression test.
### Honest assessment
Combined with Phase 37 (per-column reader strategy), bulk fetch is now ~20-25% faster than Phase 36 in the worst-affected workloads. The IfxPy gap on `select_scaling[100000]` shrinks from ~2.2× to ~2.0×. Pure-Python finally costs roughly **2× C** for bulk fetch — close enough that the deployment win (no CSDK, no JVM) starts to outweigh the perf cost for most users.
Further codegen wins are possible (inlining DATETIME/INTERVAL, batch-decoding all rows of a payload in one call) but with diminishing returns. The remaining gap is dominated by socket I/O and the SQLI protocol's chatty per-row framing — protocol-level work, not codec work.
## 2026.05.05.10 — Phase 37: Pre-baked per-column reader strategy
Closes some of the C-vs-Python codec gap on bulk fetch by moving per-column dispatch decisions from row time to `parse_describe` time. Same idea as psycopg3's pure-Python loader-cache pattern.


@@ -1,6 +1,6 @@
[project]
name = "informix-db"
version = "2026.05.05.11"
description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
readme = "README.md"
license = { text = "MIT" }


@@ -20,12 +20,21 @@ column names), read via readPadded.
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass
from datetime import timedelta as _timedelta
from types import MappingProxyType

from ._protocol import IfxStreamReader
from ._types import IfxType, base_type, is_nullable
from .converters import (
    _DOUBLE_NULL,
    _REAL_NULL,
    _UNPACK_DOUBLE,
    _UNPACK_FLOAT,
    _UNPACK_INT,
    _UNPACK_LONG,
    _UNPACK_SHORT,
    DECODERS,
    FIXED_WIDTHS,
    BlobLocator,
@@ -36,6 +45,9 @@ from .converters import (
    _decode_datetime,
    _decode_interval,
)
from .converters import (
    _INFORMIX_DATE_EPOCH as _DATE_EPOCH,
)

# Module-level type-code constants — lifted out of the hot loop in
# parse_tuple_payload so we don't pay the IntFlag→int conversion per
@@ -318,6 +330,236 @@ def compile_column_readers(columns: list[ColumnInfo]) -> list[tuple]:
    return readers
# Phase 38 codegen: NULL sentinels for the fixed-width types. The
# inlined decode bodies embed these values as literals, so the
# generated code needs no extra name lookups.
_INT_MIN_SENTINEL = -0x80000000
_SHORT_MIN_SENTINEL = -0x8000
_LONG_MIN_SENTINEL = -0x8000000000000000

def compile_row_decoder(
    readers: list[tuple],
    columns: list[ColumnInfo],
) -> Callable[[bytes, int, str], tuple] | None:
    """Generate a specialized row decoder for a specific column shape.

    Phase 38: takes the Phase 37 reader-list and emits a Python
    function via ``exec()`` that decodes one row of this exact shape
    in straight-line code: no per-column iteration, no per-column
    tuple-unpack, no per-column branch dispatch. Each column's
    decode logic is inlined directly.

    The generated function has signature
    ``parse_row(payload, offset, encoding) -> tuple`` and only
    references module-level helpers via its closure-equivalent
    globals dict (the ``g`` dict below).

    Returns ``None`` if any column's reader-kind is unsupported by
    the codegen; the caller falls back to the Phase 37 dispatch loop.

    The generated source is printable via the ``IFX_DEBUG_CODEGEN=1``
    env var for inspection / debugging.
    """
    import os

    lines: list[str] = []
    lines.append("def parse_row(payload, offset, encoding):")
    val_names: list[str] = []
    # Map type-code → inline-decoder source for the common fixed-width
    # decoders. Inlining the decoder body eliminates one function call
    # per column — the actual codegen win. For types not in this map,
    # fall back to ``_D{i}(raw)`` referencing the decoder via globals.
    _INLINE_FIXED = {
        # type_code: lambda v, raw_var: source-snippet
        # SMALLINT (1)
        1: lambda v, r: (
            f"    {v} = _UNPACK_SHORT({r})[0]\n"
            f"    if {v} == -32768:\n"
            f"        {v} = None"
        ),
        # INT (2), SERIAL (6) — same body
        2: lambda v, r: (
            f"    {v} = _UNPACK_INT({r})[0]\n"
            f"    if {v} == -2147483648:\n"
            f"        {v} = None"
        ),
        6: lambda v, r: (
            f"    {v} = _UNPACK_INT({r})[0]\n"
            f"    if {v} == -2147483648:\n"
            f"        {v} = None"
        ),
        # BIGINT (52), BIGSERIAL (53) — same body
        52: lambda v, r: (
            f"    {v} = _UNPACK_LONG({r})[0]\n"
            f"    if {v} == -9223372036854775808:\n"
            f"        {v} = None"
        ),
        53: lambda v, r: (
            f"    {v} = _UNPACK_LONG({r})[0]\n"
            f"    if {v} == -9223372036854775808:\n"
            f"        {v} = None"
        ),
        # FLOAT (3), SMFLOAT (4)
        3: lambda v, r: (
            f"    if {r} == _DOUBLE_NULL:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _UNPACK_DOUBLE({r})[0]"
        ),
        4: lambda v, r: (
            f"    if {r} == _REAL_NULL:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _UNPACK_FLOAT({r})[0]"
        ),
        # DATE (7) — 4-byte day count from 1899-12-31
        7: lambda v, r: (
            f"    days = _UNPACK_INT({r})[0]\n"
            f"    if days == -2147483648:\n"
            f"        {v} = None\n"
            f"    else:\n"
            f"        {v} = _DATE_EPOCH + _timedelta(days=days)"
        ),
        # BOOL (45) — left to the canonical decoder. Informix BOOL is
        # ``'t'/'T'/1``, NOT bool(byte) — a truthy-byte inline would
        # silently turn ``'f'`` (102) into True.
    }
    for i, r in enumerate(readers):
        kind = r[0]
        v = f"v{i}"
        val_names.append(v)
        lines.append(f"    # Col {i}: kind={kind}")
        if kind == _RK_FIXED:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            # The reader tuple doesn't carry the type code directly;
            # recover it from the columns list.
            tc = columns[i].type_code
            inline_src = _INLINE_FIXED.get(tc)
            if inline_src is not None:
                lines.append(inline_src(v, "raw"))
            else:
                lines.append(f"    {v} = _D{i}(raw)")
        elif kind == _RK_BYTE_PREFIX:
            lines.append("    length = payload[offset]")
            lines.append("    offset += 1")
            lines.append("    raw = payload[offset:offset + length]")
            lines.append("    offset += length")
            lines.append(f"    {v} = _D{i}(raw, encoding)")
        elif kind == _RK_CHAR:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _D{i}(raw, encoding)")
        elif kind == _RK_LVARCHAR:
            lines.append(
                "    length = int.from_bytes("
                "payload[offset:offset+4], 'big', signed=True)"
            )
            lines.append("    offset += 4")
            lines.append("    raw = payload[offset:offset + length]")
            lines.append("    offset += length")
            lines.append("    if length & 1:")
            lines.append("        offset += 1")
            lines.append(f"    {v} = _D{i}(raw, encoding)")
        elif kind == _RK_DECIMAL:
            _, width, _decoder = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append("    try:")
            lines.append(f"        {v} = _D{i}(raw)")
            lines.append("    except NotImplementedError:")
            lines.append(f"        {v} = raw")
        elif kind == _RK_DATETIME:
            _, width, enc_len = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _decode_datetime(raw, {enc_len})")
        elif kind == _RK_INTERVAL:
            _, width, enc_len = r
            lines.append(f"    raw = payload[offset:offset+{width}]")
            lines.append(f"    offset += {width}")
            lines.append(f"    {v} = _decode_interval(raw, {enc_len})")
        elif kind == _RK_LEGACY:
            # Codegen for rare types: call the legacy helper. The
            # column metadata is referenced via the globals dict.
            tc = r[1]
            lines.append(
                f"    offset, {v} = _legacy_dispatch_one_column("
                f"payload, offset, {tc}, _COL{i}, encoding)"
            )
        else:
            # Unknown kind — abort codegen, caller falls back.
            return None

    if val_names:
        lines.append(f"    return ({', '.join(val_names)},)")
    else:
        lines.append("    return ()")
    src = "\n".join(lines)
    if os.environ.get("IFX_DEBUG_CODEGEN") == "1":
        import sys

        print("=== informix_db codegen ===", file=sys.stderr)
        print(src, file=sys.stderr)
        print("=== end ===", file=sys.stderr)

    # Build the globals dict for the generated function. Each column's
    # decoder (if any) is registered as ``_D<i>``; columns with the
    # _RK_LEGACY kind get their ColumnInfo as ``_COL<i>``.
    #
    # The inlined fixed-width snippets (see ``_INLINE_FIXED`` above)
    # reference precompiled struct unpackers and NULL sentinels by
    # name — they only resolve if we hand them to ``exec`` here.
    g: dict = {
        "_decode_datetime": _decode_datetime,
        "_decode_interval": _decode_interval,
        "_legacy_dispatch_one_column": _legacy_dispatch_one_column,
        "_UNPACK_SHORT": _UNPACK_SHORT,
        "_UNPACK_INT": _UNPACK_INT,
        "_UNPACK_LONG": _UNPACK_LONG,
        "_UNPACK_FLOAT": _UNPACK_FLOAT,
        "_UNPACK_DOUBLE": _UNPACK_DOUBLE,
        "_DOUBLE_NULL": _DOUBLE_NULL,
        "_REAL_NULL": _REAL_NULL,
        "_DATE_EPOCH": _DATE_EPOCH,
        "_timedelta": _timedelta,
        "int": int,  # ensure the builtin isn't shadowed
        "bool": bool,
    }
    for i, r in enumerate(readers):
        kind = r[0]
        if kind in (_RK_FIXED, _RK_CHAR, _RK_DECIMAL):
            g[f"_D{i}"] = r[2]
        elif kind in (_RK_BYTE_PREFIX, _RK_LVARCHAR):
            g[f"_D{i}"] = r[1]
        elif kind == _RK_LEGACY:
            g[f"_COL{i}"] = columns[i]

    namespace: dict = {}
    try:
        exec(compile(src, "<informix_db codegen>", "exec"), g, namespace)
    except SyntaxError:
        return None
    return namespace["parse_row"]
def _legacy_dispatch_one_column(
    payload: bytes,
    offset: int,
@@ -387,6 +629,7 @@ def parse_tuple_payload(
    columns: list[ColumnInfo],
    encoding: str = "iso-8859-1",
    readers: list[tuple] | None = None,
    row_decoder: Callable[[bytes, int, str], tuple] | None = None,
) -> tuple:
    """Parse a SQ_TUPLE payload (the SQ_TUPLE tag is already consumed).
@@ -419,6 +662,13 @@ def parse_tuple_payload(
        if size & 1:
            reader.read_exact(1)

    # Phase 38 fastest path: a per-result-set decoder function compiled
    # via ``exec()`` from the column shape (see ``compile_row_decoder``).
    # All per-column dispatch is eliminated — each column's decode logic
    # is inlined in straight-line code.
    if row_decoder is not None:
        return row_decoder(payload, 0, encoding)

    values: list[object] = []
    offset = 0

@@ -33,6 +33,7 @@ from ._protocol import IfxStreamReader, make_pdu_writer
from ._resultset import (
    ColumnInfo,
    compile_column_readers,
    compile_row_decoder,
    parse_describe,
    parse_tuple_payload,
)
@@ -192,6 +193,7 @@ class Cursor:
        self._description: list[tuple] | None = None
        self._columns: list[ColumnInfo] = []
        self._column_readers: list[tuple] | None = None  # Phase 37
        self._row_decoder = None  # Phase 38 codegen'd row decoder
        self._rowcount: int = -1
        self._rows: list[tuple] = []
        # Phase 17: index-based row access enables scroll cursors. The
@@ -313,6 +315,7 @@ class Cursor:
        self._description = None
        self._columns = []
        self._column_readers = None  # Phase 37
        self._row_decoder = None  # Phase 38
        self._rowcount = -1
        self._rows = []
        self._row_index = -1  # before-first-row
@@ -1550,9 +1553,20 @@ class Cursor:
                # row-decode loop in parse_tuple_payload uses this to avoid
                # re-running per-row dispatch decisions that depend only
                # on column metadata.
                if self._columns:
                    self._column_readers = compile_column_readers(self._columns)
                    # Phase 38: take it one step further — codegen a
                    # specialized row decoder for THIS column shape.
                    # Eliminates the per-column iteration overhead of
                    # the readers loop. ``None`` if codegen can't
                    # handle the shape; parse_tuple_payload then
                    # falls back to the readers-list dispatch.
                    self._row_decoder = compile_row_decoder(
                        self._column_readers, self._columns
                    )
                else:
                    self._column_readers = None
                    self._row_decoder = None
            elif tag == 94:  # SQ_INSERTDONE — Informix optimization: literal
                # INSERT executed during PREPARE. Payload is:
                #   readLongInt (10 bytes) — serial8 inserted
@@ -1586,6 +1600,7 @@ class Cursor:
                    self._columns,
                    encoding=self._conn.encoding,
                    readers=self._column_readers,
                    row_decoder=self._row_decoder,
                )
                self._rows.append(row)
            elif tag == MessageType.SQ_DONE: