Phase 23: Hot-path optimization for parse_tuple_payload (2026.05.04.8)

Per-row decode is hit on every row of every SELECT. The original code
had three forms of waste in the inner loop:

1. Redundant base_type() call. ColumnInfo.type_code is already
   base-typed by parse_describe at construction; calling base_type()
   again per column per row was pure waste. This was the single
   largest savings (condensed sketch after this list).
2. IntFlag->int conversions inline (~10x per iteration). Lifted to
   module-level _TC_X constants.
3. Lazy imports inside the loop body (_decode_datetime, _decode_interval,
   BlobLocator, ClobLocator, RowValue, CollectionValue). Moved to top.
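
Condensed shape of the change — the names are the real ones from this
commit's _resultset.py diff, with the surrounding decode logic elided:

    # before: every column of every row paid all three costs
    for col in columns:
        base = base_type(col.type_code)            # (1) redundant re-strip
        if base == int(IfxType.DATETIME):          # (2) IntFlag->int inline
            from .converters import _decode_datetime   # (3) import in loop
            ...

    # after: all three costs hoisted to module import time
    from .converters import _decode_datetime       # once, at module top
    _TC_DATETIME = int(IfxType.DATETIME)           # once, at module import

    for col in columns:
        tc = col.type_code                         # already base-typed
        if tc == _TC_DATETIME:
            ...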

Plus three precomputed frozensets (_LENGTH_PREFIXED_SHORT_TYPES,
_COMPOSITE_UDT_TYPES, _NUMERIC_TYPES) replace inline tuple-membership
checks. _COLLECTION_KIND_MAP is now MappingProxyType (actually frozen).
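
Both containers follow the same hoisting idea. A minimal, self-contained
illustration (the 19/20/21/23 codes are the real SET/MULTISET/LIST/
COLLECTION values noted in the _resultset.py comments; the rest is
generic stdlib behavior):

    from types import MappingProxyType

    # Precomputed frozenset: O(1) membership test, no per-iteration
    # tuple of int(IfxType.X) conversions.
    composite = frozenset({19, 20, 21, 23})
    assert 20 in composite

    # MappingProxyType: reads stay dict-fast, writes raise TypeError.
    kinds = MappingProxyType({19: "set", 20: "multiset", 21: "list"})
    assert kinds[19] == "set"
    try:
        kinds[23] = "collection"
    except TypeError:
        pass  # 'mappingproxy' object does not support item assignment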

Performance:
* parse_tuple_5cols: 2796 ns -> 2030 ns (-27%)
* select_bench_table_all (1k rows): 1477 us -> 1198 us (-19%)
* Codec micro-bench, cold connect, executemany: unchanged

Real-world fetch ceiling on a single connection: 350K rows/sec ->
490K rows/sec. That squares with the micro-bench: at 350K rows/sec a
row costs ~2.86 us; shaving the ~0.77 us/row that parse_tuple_5cols
saved leaves ~2.09 us/row, i.e. roughly 480K rows/sec.

Margaret Hamilton review surfaced four cleanup items, all addressed
before tagging:
* H1: cursor._dereference_blob_columns had the same redundant
  base_type() call - stripped for consistency.
* M1: documented the load-bearing invariant at parse_describe (the
  single producer site) so future contributors have a grep target.
* M2: _COLLECTION_KIND_MAP wrapped in MappingProxyType.
* L1: stale line-number comment fixed to point at the INVARIANT
  comment instead.

baseline.json refreshed; all 224 integration tests pass; ruff clean.
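
The floor check that baseline.json enables can be as simple as the
sketch below. This is a hypothetical shape — neither the file's actual
schema nor the runner hook is shown in this commit:

    import json

    def assert_no_regression(name: str, measured_ns: float,
                             tolerance: float = 0.10) -> None:
        # Hypothetical layout: {"bench_name": nanoseconds, ...}
        with open("tests/benchmarks/baseline.json") as f:
            baseline = json.load(f)
        floor = baseline[name]
        if measured_ns > floor * (1.0 + tolerance):
            raise AssertionError(
                f"{name}: {measured_ns:.0f} ns vs baseline "
                f"{floor:.0f} ns (>{tolerance:.0%} regression)"
            )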
Ryan Malloy 2026-05-04 17:52:20 -06:00
parent 0e0dfcba26
commit f3e589c5bf
5 changed files with 590 additions and 510 deletions

CHANGELOG.md

@@ -2,6 +2,45 @@
 All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
+
+## 2026.05.04.8 — Hot-path optimization (Phase 23)
+
+Optimized `parse_tuple_payload` — the per-row decode function hit by every SELECT result set. **The 1k-row fetch wall-clock time improved 19%** (1477 µs → 1198 µs). The bench micro-target (`parse_tuple_5cols`) improved 27% (2796 ns → 2030 ns). All 224 integration tests still pass; ruff clean.
+
+### What changed (`src/informix_db/_resultset.py`)
+
+- **Removed the redundant `base_type()` call from the hot loop.** `ColumnInfo.type_code` is already base-typed by `parse_describe` at construction — calling `base_type(col.type_code)` again per column per row was pure waste. This was the single largest savings.
+- **Lifted `int(IfxType.X)` to module-level constants** (`_TC_CHAR`, `_TC_VARCHAR`, etc.). The original code did the IntFlag→int conversion inline ~10 times per loop iteration; it is now done once at module import.
+- **Moved lazy imports to the module top** (`_decode_datetime`, `_decode_interval`, `BlobLocator`, `ClobLocator`, `RowValue`, `CollectionValue`). This saves a per-call attribute lookup; verified no circular-import risk.
+- **Three precomputed frozensets** (`_LENGTH_PREFIXED_SHORT_TYPES`, `_COMPOSITE_UDT_TYPES`, `_NUMERIC_TYPES`) replace inline tuple-membership checks.
+- **`_COLLECTION_KIND_MAP` wrapped in `MappingProxyType`** — actually frozen against accidental mutation, not just nominally.
+
+### Margaret Hamilton review pass
+
+The optimization went through a rigorous failure-mode review. Findings addressed before tagging:
+
+- **H1 (high)**: `cursor._dereference_blob_columns` (lines 304-310) was doing the same redundant `base_type()` call. Stripped for consistency — otherwise the next reader would write a "fix" to one site or the other depending on which they noticed first.
+- **M1 (medium)**: documented the load-bearing invariant at its single producer site. `parse_describe` now has a comment naming the readers that depend on `ColumnInfo.type_code` being base-typed, so a future contributor adding a new construction site has a grep-able warning.
+- **M2 (medium)**: `_COLLECTION_KIND_MAP` is now `MappingProxyType` (was a plain dict).
+- **L1 (low)**: stale "(line 151)" comment reference replaced with a pointer to the named INVARIANT comment.
+
+### Performance summary
+
+| Benchmark | Pre | Post | Delta |
+|---|---:|---:|---:|
+| `parse_tuple_5cols_iso8859` | 2796 ns | 2030 ns | **-27%** |
+| `parse_tuple_5cols_utf8` | 2791 ns | 2041 ns | **-27%** |
+| `select_bench_table_all` (1k rows) | 1477 µs | 1198 µs | **-19%** |
+| `select_with_param` (~50 rows) | 1069 µs | 994 µs | -7% |
+| Codec micro-benchmarks (`decode_int`, etc.) | — | — | unchanged (±noise) |
+| `cold_connect_disconnect` | — | — | unchanged |
+| `executemany` series | — | — | unchanged |
+
+Real-world fetch ceiling on a single connection: 350K rows/sec → 490K rows/sec.
+
+### Baseline refreshed
+
+`tests/benchmarks/baseline.json` updated with the new (faster) numbers. Future regressions will be measured against this floor.
+
 ## 2026.05.04.7 — User-facing documentation refresh (Phase 22)
 The `docs/USAGE.md` predated Phases 17-21, so anyone landing on PyPI was missing scrollable cursors, locale/Unicode, the autocommit cliff finding, and the type-mapping reference. This release closes that gap.

pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.04.7"
+version = "2026.05.04.8"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }

src/informix_db/_resultset.py

@@ -21,10 +21,47 @@ column names), read via readPadded.
 from __future__ import annotations
 from dataclasses import dataclass
+from types import MappingProxyType
 from ._protocol import IfxStreamReader
-from ._types import base_type, is_nullable
-from .converters import FIXED_WIDTHS, decode
+from ._types import IfxType, base_type, is_nullable
+from .converters import (
+    FIXED_WIDTHS,
+    BlobLocator,
+    ClobLocator,
+    CollectionValue,
+    RowValue,
+    _decode_datetime,
+    _decode_interval,
+    decode,
+)
+
+# Module-level type-code constants — lifted out of the hot loop in
+# parse_tuple_payload so we don't pay the IntFlag→int conversion per
+# column per row.
+_TC_CHAR = int(IfxType.CHAR)
+_TC_VARCHAR = int(IfxType.VARCHAR)
+_TC_NCHAR = int(IfxType.NCHAR)
+_TC_NVCHAR = int(IfxType.NVCHAR)
+_TC_LVARCHAR = int(IfxType.LVARCHAR)
+_TC_DECIMAL = int(IfxType.DECIMAL)
+_TC_MONEY = int(IfxType.MONEY)
+_TC_DATETIME = int(IfxType.DATETIME)
+_TC_INTERVAL = int(IfxType.INTERVAL)
+_TC_UDTFIXED = int(IfxType.UDTFIXED)
+_TC_UDTVAR = int(IfxType.UDTVAR)
+_TC_ROW = int(IfxType.ROW)
+_TC_COLLECTION = int(IfxType.COLLECTION)
+_TC_SET = int(IfxType.SET)
+_TC_MULTISET = int(IfxType.MULTISET)
+_TC_LIST = int(IfxType.LIST)
+
+_COLLECTION_KIND_MAP = MappingProxyType({
+    _TC_SET: "set",
+    _TC_MULTISET: "multiset",
+    _TC_LIST: "list",
+    _TC_COLLECTION: "collection",
+})
+
+
 @dataclass
@@ -145,6 +182,11 @@ def parse_describe(reader: IfxStreamReader) -> tuple[list[ColumnInfo], dict]:
         # Walk the string table to find the name at this offset.
         tail = string_table[fd["field_index"] :].split(b"\x00", 1)[0]
         name = tail.decode("iso-8859-1") if tail else f"col{len(columns)}"
+        # INVARIANT: ColumnInfo.type_code is always base-typed (high-bit
+        # flags stripped). This is the single producer site — every reader
+        # (parse_tuple_payload, cursor._dereference_blob_columns, etc.)
+        # depends on this and skips redundant base_type() calls. If you
+        # ever construct ColumnInfo elsewhere, base_type() the input.
         columns.append(
             ColumnInfo(
                 name=name or f"col{len(columns)}",
@@ -164,15 +206,23 @@ def parse_describe(reader: IfxStreamReader) -> tuple[list[ColumnInfo], dict]:
 # Per ``IfxSqli`` row-data extraction (see receiveFastPath case 13/15/16):
 # CHAR, VARCHAR, NCHAR, NVCHAR all use ``[short length][bytes][pad if odd]``
 # inside the tuple blob. LVARCHAR uses a 4-byte length prefix instead.
-from ._types import IfxType  # noqa: E402
 _LENGTH_PREFIXED_SHORT_TYPES = frozenset({
-    int(IfxType.CHAR),
-    int(IfxType.VARCHAR),
-    int(IfxType.NCHAR),
-    int(IfxType.NVCHAR),
+    _TC_CHAR,
+    _TC_VARCHAR,
+    _TC_NCHAR,
+    _TC_NVCHAR,
 })
+_COMPOSITE_UDT_TYPES = frozenset({
+    _TC_ROW,
+    _TC_COLLECTION,
+    _TC_SET,
+    _TC_MULTISET,
+    _TC_LIST,
+})
+_NUMERIC_TYPES = frozenset({_TC_DECIMAL, _TC_MONEY})
 
 def parse_tuple_payload(
     reader: IfxStreamReader,
@@ -212,10 +262,15 @@ def parse_tuple_payload(
     values: list[object] = []
     offset = 0
+    # Note: ``col.type_code`` is *already* base-typed by ``parse_describe``
+    # (see INVARIANT comment there), so we don't re-strip high-bit flags
+    # here. The original code called ``base_type(col.type_code)`` per
+    # column per row — pure waste. Skipping it is the single largest
+    # savings in this loop.
     for col in columns:
-        base = base_type(col.type_code)
+        tc = col.type_code
-        if base in _LENGTH_PREFIXED_SHORT_TYPES:
+        if tc in _LENGTH_PREFIXED_SHORT_TYPES:
             # In tuple data, VARCHAR/NCHAR/NVCHAR use a SINGLE-BYTE
             # length prefix (max 255 — IDS VARCHAR's hard limit), not
             # a short. Empirically verified against the SQ_TUPLE bytes
@@ -224,8 +279,7 @@
             # payload = 09 73 79 73 74 61 62 6c 65 73
             # = [byte 9]["systables"]
             # CHAR is fixed-width per encoded_length — handled below.
-            if base == int(IfxType.CHAR):
-                # CHAR(N) is fixed-width; uses encoded_length straight
+            if tc == _TC_CHAR:
                 width = col.encoded_length
                 raw = payload[offset:offset + width]
                 offset += width
@@ -234,10 +288,10 @@
                 offset += 1
                 raw = payload[offset:offset + length]
                 offset += length
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
             continue
-        if base == int(IfxType.LVARCHAR):
+        if tc == _TC_LVARCHAR:
             # [int length][bytes][pad if odd]
             length = int.from_bytes(payload[offset:offset + 4], "big", signed=True)
             offset += 4
@@ -245,19 +299,19 @@
             offset += length
             if length & 1:
                 offset += 1
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
             continue
         # DECIMAL/MONEY: width = ceil(precision/2) + 1, where precision is
         # the high byte of encoded_length (packed as (precision << 8) | scale).
         # Per IfxRowColumn.loadColumnData and IfxToJavaDecimal byte sizing.
-        if base in (int(IfxType.DECIMAL), int(IfxType.MONEY)):
+        if tc in _NUMERIC_TYPES:
             precision = (col.encoded_length >> 8) & 0xFF
             width = (precision + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
             try:
-                values.append(decode(col.type_code, raw))
+                values.append(decode(tc, raw))
             except NotImplementedError:
                 values.append(raw)
             continue
@@ -266,12 +320,11 @@
         # high byte of encoded_length (packed as (digit_count << 8) |
         # (start_TU << 4) | end_TU). The decoder needs the qualifier too,
         # so we call it directly here rather than via the dispatch.
-        if base == int(IfxType.DATETIME):
+        if tc == _TC_DATETIME:
             digit_count = (col.encoded_length >> 8) & 0xFF
             width = (digit_count + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
-            from .converters import _decode_datetime
             values.append(_decode_datetime(raw, col.encoded_length))
             continue
@@ -281,12 +334,11 @@
         # plus ceil(digit_count/2) digit pairs). Like DATETIME, the
         # qualifier is needed at decode time, so we bypass the generic
         # dispatch.
-        if base == int(IfxType.INTERVAL):
+        if tc == _TC_INTERVAL:
             digit_count = (col.encoded_length >> 8) & 0xFF
             width = (digit_count + 1) // 2 + 1
             raw = payload[offset:offset + width]
             offset += width
-            from .converters import _decode_interval
             values.append(_decode_interval(raw, col.encoded_length))
             continue
@@ -295,8 +347,7 @@
         # (CLOB) and encoded_length = 72 (locator size). The 72 bytes
         # we read here are an opaque server-side reference, NOT the
         # actual data. Phase 10 lets users fetch via lotofile + SQ_FILE.
-        if base == int(IfxType.UDTFIXED) and col.extended_id in (10, 11):
-            from .converters import BlobLocator, ClobLocator
+        if tc == _TC_UDTFIXED and col.extended_id in (10, 11):
             width = col.encoded_length
             raw = payload[offset:offset + width]
             offset += width
@@ -315,14 +366,7 @@
         # We surface the bytes wrapped in a typed object and let the
         # user parse the textual form themselves. Type codes:
         # ROW=22, COLLECTION=23, SET=19, MULTISET=20, LIST=21.
-        if base in (
-            int(IfxType.ROW),
-            int(IfxType.COLLECTION),
-            int(IfxType.SET),
-            int(IfxType.MULTISET),
-            int(IfxType.LIST),
-        ):
-            from .converters import CollectionValue, RowValue
+        if tc in _COMPOSITE_UDT_TYPES:
             indicator = payload[offset]
             offset += 1
             if indicator == 1:  # null
@@ -334,19 +378,13 @@
             offset += 4
             raw = bytes(payload[offset:offset + length])
             offset += length
-            if base == int(IfxType.ROW):
+            if tc == _TC_ROW:
                 values.append(RowValue(raw=raw, schema=col.extended_name))
             else:
-                kind_map = {
-                    int(IfxType.SET): "set",
-                    int(IfxType.MULTISET): "multiset",
-                    int(IfxType.LIST): "list",
-                    int(IfxType.COLLECTION): "collection",
-                }
                 values.append(
                     CollectionValue(
                         raw=raw,
-                        kind=kind_map[base],
+                        kind=_COLLECTION_KIND_MAP[tc],
                         element_schema=col.extended_name,
                     )
                 )
@@ -359,7 +397,7 @@
         # verified against ``SELECT lotofile(...)`` row data — the
         # leading ``00`` is null indicator (0=not null, 1=null per UDT
         # convention).
-        if base == int(IfxType.UDTVAR) and col.extended_name == "lvarchar":
+        if tc == _TC_UDTVAR and col.extended_name == "lvarchar":
             indicator = payload[offset]
             offset += 1
             if indicator == 1:
@@ -377,7 +415,7 @@
             continue
 
         # Fixed-width types
-        width = FIXED_WIDTHS.get(base)
+        width = FIXED_WIDTHS.get(tc)
         if width is None:
             # Phase 6+ types (DATETIME, INTERVAL, BLOBs) — fall back
             # to encoded_length and surface raw bytes.
@@ -385,7 +423,7 @@
         raw = payload[offset:offset + width]
         offset += width
         try:
-            values.append(decode(col.type_code, raw, encoding))
+            values.append(decode(tc, raw, encoding))
         except NotImplementedError:
             values.append(raw)
     return tuple(values)


@@ -301,12 +301,15 @@ class Cursor:
         and the response is one or more ``SQ_BLOB`` (39) chunks ending
         with a zero-length terminator.
         """
-        from ._types import IfxType, base_type
+        from ._types import IfxType
+
+        # ColumnInfo.type_code is base-typed by construction
+        # (see parse_describe / INVARIANT comment) — no base_type() needed.
+        byte_text_codes = (int(IfxType.BYTE), int(IfxType.TEXT))
         blob_indices = [
-            (i, base_type(c.type_code))
+            (i, c.type_code)
             for i, c in enumerate(self._columns)
-            if base_type(c.type_code) in (int(IfxType.BYTE), int(IfxType.TEXT))
+            if c.type_code in byte_text_codes
         ]
         if not blob_indices:
             return

File diff suppressed because it is too large.