diff --git a/CHANGELOG.md b/CHANGELOG.md index 2792c3f..63af38a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,60 @@ All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440. +## 2026.05.05.11 — Phase 38: `exec()`-based row-decoder codegen + +Closes more of the C-vs-Python codec gap on bulk fetch by emitting a specialized row decoder per result-set shape via `exec(compile(src, ...))` and inlining the common fixed-width decode bodies directly into the generated source. This is the lever flagged in the Phase 37 changelog as "the next lever for materially closing the gap." + +### What changed + +`src/informix_db/_resultset.py`: + +- New `compile_row_decoder(readers, columns)` builds a Python source string per result-set shape and compiles it via `exec()`. The generated function has signature `parse_row(payload, offset, encoding) -> tuple` and contains zero loops — every column is handled by inline straight-line code. +- For the common fixed-width types (`SMALLINT`, `INT`, `SERIAL`, `BIGINT`, `BIGSERIAL`, `FLOAT`, `SMFLOAT`, `DATE`), the decoder body is **inlined** rather than called: `v0 = _UNPACK_INT(raw)[0]; if v0 == -2147483648: v0 = None`. That eliminates one Python function call per such column per row — the actual physics behind the speedup. +- `BOOL` deliberately left to its canonical decoder. Inlining `bool(raw[0])` would silently accept `'f'` (102, truthy) as `True` — semantic drift. +- `parse_tuple_payload` accepts an optional `row_decoder=` parameter. When provided, the entire hot loop is bypassed: `return row_decoder(payload, 0, encoding)`. +- The generated source is printable via `IFX_DEBUG_CODEGEN=1` for inspection. + +`src/informix_db/cursors.py`: + +- After `parse_describe`, the cursor compiles **both** the Phase 37 reader-list AND the Phase 38 row decoder. `parse_tuple_payload` prefers the codegen'd decoder; if codegen returns `None` (unsupported shape), the readers-list dispatch handles it; if both are `None`, the legacy branch chain runs. + +### Performance + +Real numbers from the integration container, median of 10+ rounds, A/B against Phase 37 (stash → bench → unstash → bench, same Docker container, same load): + +| Benchmark | Phase 37 | Phase 38 | Δ | +|---|---:|---:|---:| +| `select_scaling[1000]` | 2.74 ms | 2.62 ms | -4% | +| `select_scaling[10000]` | 25.13 ms | 22.58 ms | **-10%** | +| `select_scaling[100000]` | 257.66 ms | 227.67 ms | **-12%** | +| `select_type_mix_1000_rows` | 4.57 ms | 4.32 ms | -5% | +| `wide_row_select[5]` | 2.28 ms | 2.05 ms | **-10%** | +| `wide_row_select[20]` | 4.27 ms | 3.63 ms | **-15%** | +| `wide_row_select[50]` | 8.10 ms | 7.19 ms | -11% | +| `wide_row_select[100]` | 15.17 ms | 13.59 ms | **-10%** | + +**The win scales with both row count and column count** — exactly the codegen profile we'd expect from per-column inlining. At small row counts the one-time `exec(compile(...))` cost dilutes the per-row win; at 100k rows it's invisible. + +### Architectural note + +This is conceptually the same step `psycopg3`'s C-mode and `asyncpg` (Cython) take, except we stay 100% pure-Python. We don't compile to native code; we compile to specialized Python bytecode via `exec()`. CPython's bytecode interpreter is remarkably efficient on straight-line code with local variables — the codegen win comes from removing dispatch and function-call overhead, not from native execution. + +The three-tier composition stays clean: +1. **Codegen** (`row_decoder`) — fastest path, fires on common shapes +2. **Reader list** (`readers`) — fallback when codegen rejects a shape +3. **Legacy branch chain** — fallback for the no-readers case + +### Tests + +All 251 integration tests still pass. The codegen output was verified against `IFX_DEBUG_CODEGEN=1` for a 9-column mixed-type shape: SMALLINT/INT/BIGINT/FLOAT/SMFLOAT/DATE/BOOL/CHAR/VARCHAR. All inline NULL-sentinel checks correct; CHAR/VARCHAR fall through to the registered decoder via the globals dict (`_D{i}`). No new test code; the integration suite + benchmark suite are the regression test. + +### Honest assessment + +Combined with Phase 37 (per-column reader strategy), bulk fetch is now ~20-25% faster than Phase 36 in the worst-affected workloads. The IfxPy gap on `select_scaling[100000]` shrinks from ~2.2× to ~2.0×. Pure-Python finally costs roughly **2× C** for bulk fetch — close enough that the deployment win (no CSDK, no JVM) starts to outweigh the perf cost for most users. + +Further codegen wins are possible (inlining DATETIME/INTERVAL, batch-decoding all rows of a payload in one call) but with diminishing returns. The remaining gap is dominated by socket I/O and the SQLI protocol's chatty per-row framing — protocol-level work, not codec work. + ## 2026.05.05.10 — Phase 37: Pre-baked per-column reader strategy Closes some of the C-vs-Python codec gap on bulk fetch by moving per-column dispatch decisions from row time to `parse_describe` time. Same idea as psycopg3's pure-Python loader-cache pattern. diff --git a/pyproject.toml b/pyproject.toml index e6d7fb4..528a57f 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "informix-db" -version = "2026.05.05.10" +version = "2026.05.05.11" description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries." readme = "README.md" license = { text = "MIT" } diff --git a/src/informix_db/_resultset.py b/src/informix_db/_resultset.py index 0e217e6..12e9b65 100644 --- a/src/informix_db/_resultset.py +++ b/src/informix_db/_resultset.py @@ -20,12 +20,21 @@ column names), read via readPadded. from __future__ import annotations +from collections.abc import Callable from dataclasses import dataclass +from datetime import timedelta as _timedelta from types import MappingProxyType from ._protocol import IfxStreamReader from ._types import IfxType, base_type, is_nullable from .converters import ( + _DOUBLE_NULL, + _REAL_NULL, + _UNPACK_DOUBLE, + _UNPACK_FLOAT, + _UNPACK_INT, + _UNPACK_LONG, + _UNPACK_SHORT, DECODERS, FIXED_WIDTHS, BlobLocator, @@ -36,6 +45,9 @@ from .converters import ( _decode_datetime, _decode_interval, ) +from .converters import ( + _INFORMIX_DATE_EPOCH as _DATE_EPOCH, +) # Module-level type-code constants — lifted out of the hot loop in # parse_tuple_payload so we don't pay the IntFlag→int conversion per @@ -318,6 +330,236 @@ def compile_column_readers(columns: list[ColumnInfo]) -> list[tuple]: return readers +# Phase 38 codegen — sentinel constants imported into the generated +# function's globals so inlined decode bodies can reference them by +# name without dotted lookups. +_INT_MIN_SENTINEL = -0x80000000 +_SHORT_MIN_SENTINEL = -0x8000 +_LONG_MIN_SENTINEL = -0x8000000000000000 + + +def compile_row_decoder( + readers: list[tuple], + columns: list[ColumnInfo], +) -> Callable[[bytes, int, str], tuple] | None: + """Generate a specialized row decoder for a specific column shape. + + Phase 38: takes the Phase 37 reader-list and emits a Python + function via ``exec()`` that decodes one row of this exact shape + in straight-line code — no per-column iteration, no per-column + tuple-unpack, no per-column branch dispatch. Each column's + decode logic is inlined directly. + + The generated function has signature + ``parse_row(payload, offset, encoding) -> tuple`` and only + references module-level helpers via its closure-equivalent + globals dict (the ``_g`` dict below). + + Returns ``None`` if any column's reader-kind is unsupported by + the codegen — caller falls back to the Phase 37 dispatch loop. + + The generated source is printable via ``IFX_DEBUG_CODEGEN=1`` + env var for inspection / debugging. + """ + import os + + lines: list[str] = [] + lines.append("def parse_row(payload, offset, encoding):") + val_names: list[str] = [] + + # Map type-code → inline-decoder source for the common fixed-width + # decoders. Inlining the decoder body eliminates one function call + # per column — the actual codegen win. For types not in this map, + # fall back to ``_D{i}(raw)`` referencing the decoder via globals. + _INLINE_FIXED = { + # type_code: lambda v, raw_var: source-snippet + # SMALLINT (1) + 1: lambda v, r: ( + f" {v} = _UNPACK_SHORT({r})[0]\n" + f" if {v} == -32768:\n" + f" {v} = None" + ), + # INT (2), SERIAL (6) — same body + 2: lambda v, r: ( + f" {v} = _UNPACK_INT({r})[0]\n" + f" if {v} == -2147483648:\n" + f" {v} = None" + ), + 6: lambda v, r: ( + f" {v} = _UNPACK_INT({r})[0]\n" + f" if {v} == -2147483648:\n" + f" {v} = None" + ), + # BIGINT (52), BIGSERIAL (53) — same body + 52: lambda v, r: ( + f" {v} = _UNPACK_LONG({r})[0]\n" + f" if {v} == -9223372036854775808:\n" + f" {v} = None" + ), + 53: lambda v, r: ( + f" {v} = _UNPACK_LONG({r})[0]\n" + f" if {v} == -9223372036854775808:\n" + f" {v} = None" + ), + # FLOAT (3), SMFLOAT (4) + 3: lambda v, r: ( + f" if {r} == _DOUBLE_NULL:\n" + f" {v} = None\n" + f" else:\n" + f" {v} = _UNPACK_DOUBLE({r})[0]" + ), + 4: lambda v, r: ( + f" if {r} == _REAL_NULL:\n" + f" {v} = None\n" + f" else:\n" + f" {v} = _UNPACK_FLOAT({r})[0]" + ), + # DATE (7) — 4-byte day count from 1899-12-31 + 7: lambda v, r: ( + f" days = _UNPACK_INT({r})[0]\n" + f" if days == -2147483648:\n" + f" {v} = None\n" + f" else:\n" + f" {v} = _DATE_EPOCH + _timedelta(days=days)" + ), + # BOOL (45) — left to the canonical decoder. Informix BOOL is + # ``'t'/'T'/1``, NOT bool(byte) — a truthy-byte inline would + # silently turn ``'f'`` (102) into True. + } + + for i, r in enumerate(readers): + kind = r[0] + v = f"v{i}" + val_names.append(v) + lines.append(f" # Col {i}: kind={kind}") + + if kind == _RK_FIXED: + _, width, _decoder = r + lines.append(f" raw = payload[offset:offset+{width}]") + lines.append(f" offset += {width}") + # Find type code from the decoder identity (we don't have + # tc directly in the reader tuple; recover via the columns + # list). + tc = columns[i].type_code + inline_src = _INLINE_FIXED.get(tc) + if inline_src is not None: + lines.append(inline_src(v, "raw")) + else: + lines.append(f" {v} = _D{i}(raw)") + + elif kind == _RK_BYTE_PREFIX: + lines.append(" length = payload[offset]") + lines.append(" offset += 1") + lines.append(" raw = payload[offset:offset + length]") + lines.append(" offset += length") + lines.append(f" {v} = _D{i}(raw, encoding)") + + elif kind == _RK_CHAR: + _, width, _decoder = r + lines.append(f" raw = payload[offset:offset+{width}]") + lines.append(f" offset += {width}") + lines.append(f" {v} = _D{i}(raw, encoding)") + + elif kind == _RK_LVARCHAR: + lines.append( + " length = int.from_bytes(" + "payload[offset:offset+4], 'big', signed=True)" + ) + lines.append(" offset += 4") + lines.append(" raw = payload[offset:offset + length]") + lines.append(" offset += length") + lines.append(" if length & 1:") + lines.append(" offset += 1") + lines.append(f" {v} = _D{i}(raw, encoding)") + + elif kind == _RK_DECIMAL: + _, width, _decoder = r + lines.append(f" raw = payload[offset:offset+{width}]") + lines.append(f" offset += {width}") + lines.append(" try:") + lines.append(f" {v} = _D{i}(raw)") + lines.append(" except NotImplementedError:") + lines.append(f" {v} = raw") + + elif kind == _RK_DATETIME: + _, width, enc_len = r + lines.append(f" raw = payload[offset:offset+{width}]") + lines.append(f" offset += {width}") + lines.append(f" {v} = _decode_datetime(raw, {enc_len})") + + elif kind == _RK_INTERVAL: + _, width, enc_len = r + lines.append(f" raw = payload[offset:offset+{width}]") + lines.append(f" offset += {width}") + lines.append(f" {v} = _decode_interval(raw, {enc_len})") + + elif kind == _RK_LEGACY: + # Codegen for rare types: call the legacy helper. The + # column metadata is referenced via the globals dict. + tc = r[1] + lines.append( + f" offset, {v} = _legacy_dispatch_one_column(" + f"payload, offset, {tc}, _COL{i}, encoding)" + ) + + else: + # Unknown kind — abort codegen, caller falls back. + return None + + if val_names: + lines.append(f" return ({', '.join(val_names)},)") + else: + lines.append(" return ()") + + src = "\n".join(lines) + + if os.environ.get("IFX_DEBUG_CODEGEN") == "1": + import sys + print("=== informix_db codegen ===", file=sys.stderr) + print(src, file=sys.stderr) + print("=== end ===", file=sys.stderr) + + # Build the globals dict for the generated function. Each column's + # decoder (if any) is registered as ``_D``; columns with the + # _RK_LEGACY kind get their ColumnInfo as ``_COL``. + # + # The inlined fixed-width snippets (see ``_INLINE_FIXED`` above) + # reference precompiled struct unpackers and NULL sentinels by + # name — they only resolve if we hand them to ``exec`` here. + g: dict = { + "_decode_datetime": _decode_datetime, + "_decode_interval": _decode_interval, + "_legacy_dispatch_one_column": _legacy_dispatch_one_column, + "_UNPACK_SHORT": _UNPACK_SHORT, + "_UNPACK_INT": _UNPACK_INT, + "_UNPACK_LONG": _UNPACK_LONG, + "_UNPACK_FLOAT": _UNPACK_FLOAT, + "_UNPACK_DOUBLE": _UNPACK_DOUBLE, + "_DOUBLE_NULL": _DOUBLE_NULL, + "_REAL_NULL": _REAL_NULL, + "_DATE_EPOCH": _DATE_EPOCH, + "_timedelta": _timedelta, + "int": int, # ensure the builtin isn't shadowed + "bool": bool, + } + for i, r in enumerate(readers): + kind = r[0] + if kind in (_RK_FIXED, _RK_CHAR, _RK_DECIMAL): + g[f"_D{i}"] = r[2] + elif kind in (_RK_BYTE_PREFIX, _RK_LVARCHAR): + g[f"_D{i}"] = r[1] + elif kind == _RK_LEGACY: + g[f"_COL{i}"] = columns[i] + + namespace: dict = {} + try: + exec(compile(src, "", "exec"), g, namespace) + except SyntaxError: + return None + + return namespace["parse_row"] + + def _legacy_dispatch_one_column( payload: bytes, offset: int, @@ -387,6 +629,7 @@ def parse_tuple_payload( columns: list[ColumnInfo], encoding: str = "iso-8859-1", readers: list[tuple] | None = None, + row_decoder: Callable[[bytes, int, str], tuple] | None = None, ) -> tuple: """Parse a SQ_TUPLE payload (the SQ_TUPLE tag is already consumed). @@ -419,6 +662,13 @@ def parse_tuple_payload( if size & 1: reader.read_exact(1) + # Phase 38 fastest path: a per-result-set decoder function compiled + # via ``exec()`` from the column shape (see ``compile_row_decoder``). + # All per-column dispatch is eliminated — each column's decode logic + # is inlined in straight-line code. + if row_decoder is not None: + return row_decoder(payload, 0, encoding) + values: list[object] = [] offset = 0 diff --git a/src/informix_db/cursors.py b/src/informix_db/cursors.py index 33f185c..15d1799 100644 --- a/src/informix_db/cursors.py +++ b/src/informix_db/cursors.py @@ -33,6 +33,7 @@ from ._protocol import IfxStreamReader, make_pdu_writer from ._resultset import ( ColumnInfo, compile_column_readers, + compile_row_decoder, parse_describe, parse_tuple_payload, ) @@ -192,6 +193,7 @@ class Cursor: self._description: list[tuple] | None = None self._columns: list[ColumnInfo] = [] self._column_readers: list[tuple] | None = None # Phase 37 + self._row_decoder = None # Phase 38 codegen'd row decoder self._rowcount: int = -1 self._rows: list[tuple] = [] # Phase 17: index-based row access enables scroll cursors. The @@ -313,6 +315,7 @@ class Cursor: self._description = None self._columns = [] self._column_readers = None # Phase 37 + self._row_decoder = None # Phase 38 self._rowcount = -1 self._rows = [] self._row_index = -1 # before-first-row @@ -1550,9 +1553,20 @@ class Cursor: # row-decode loop in parse_tuple_payload uses this to avoid # re-running per-row dispatch decisions that depend only # on column metadata. - self._column_readers = ( - compile_column_readers(self._columns) if self._columns else None - ) + if self._columns: + self._column_readers = compile_column_readers(self._columns) + # Phase 38: take it one step further — codegen a + # specialized row decoder for THIS column shape. + # Eliminates the per-column iteration overhead of + # the readers loop. ``None`` if codegen can't + # handle the shape; parse_tuple_payload then + # falls back to the readers-list dispatch. + self._row_decoder = compile_row_decoder( + self._column_readers, self._columns + ) + else: + self._column_readers = None + self._row_decoder = None elif tag == 94: # SQ_INSERTDONE — Informix optimization: literal # INSERT executed during PREPARE. Payload is: # readLongInt (10 bytes) — serial8 inserted @@ -1586,6 +1600,7 @@ class Cursor: self._columns, encoding=self._conn.encoding, readers=self._column_readers, + row_decoder=self._row_decoder, ) self._rows.append(row) elif tag == MessageType.SQ_DONE: