Phase 33: Pipelined executemany - 2.85x faster bulk insert (2026.05.05.6)

The serial-loop executemany paid one wire round-trip per row (~30 us/
row on loopback). It was the one benchmark where IfxPy beat us in
the comparison work - we were 10% slower at executemany(1000) in txn.

Phase 33 pipelines the BIND+EXECUTE PDUs: build all N PDUs, send
them back-to-back, then drain all N responses. Eliminates per-row
RTT entirely.

Performance impact:
* executemany(1000) in txn:   31.3 ms -> 11.0 ms (2.85x faster)
* executemany(100) autocommit: 173 ms -> 154 ms (11% faster)
* executemany(1000) autocommit: 1740 ms -> 1590 ms (9% faster)

(Autocommit gets smaller wins because server-side log flushes
dominate - Phase 21.1's "autocommit cliff".)

The IfxPy comparison flipped: from 10% slower to 2.05x faster on bulk
inserts. We now win all 5 head-to-head benchmarks against the C-bound
driver.

Margaret Hamilton review surfaced one CRITICAL concern (C1) - the
pipeline assumes Informix sends N responses for N pipelined PDUs
even when one fails. If the server cut the stream short, the drain
loop would deadlock on the next read.

Verified by 3 new integration tests in tests/test_executemany_pipeline.py:
* test_pipelined_executemany_mid_batch_constraint_violation (row 500/1000)
* test_pipelined_executemany_first_row_fails (row 0/100)
* test_pipelined_executemany_last_row_fails (row 99/100)

All confirm Informix sends N responses; wire stays aligned; connection
is usable after.

Plus 4 lower-priority fixes Hamilton recommended:
* H1: documented _raise_sq_err self-drains-SQ_EOT invariant + tripwire
* H2: docstring warning about O(N) lock duration; chunk for huge batches
* M1: prepend row-index to exception message rather than reformat
* M2: documented sendall-no-timeout caveat on hostile networks
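The M2 caveat can be mitigated at the OS level. A sketch of what a `keepalive=True` knob might configure (the function name, defaults, and the knob itself are illustrative assumptions, not the driver's actual API):

```python
import socket

def enable_keepalive(
    sock: socket.socket, idle: int = 60, interval: int = 10, count: int = 5
) -> None:
    """Turn on TCP keepalive so a wedged peer surfaces as a socket
    error after roughly idle + interval * count seconds, instead of
    the kernel default of ~2 hours before the first probe.

    The three tuning constants are the Linux names; macOS exposes
    only the idle knob (TCP_KEEPALIVE) and Windows tunes keepalive
    via ioctl, so each setsockopt is guarded with hasattr.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)  # True
sock.close()
```

Keepalive only bounds how long a dead peer can wedge a blocked read; it doesn't make ``sendall`` itself timeout-aware.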

77 unit + 239 integration + 33 benchmark = 349 tests; ruff clean.

Note: Phase 32 (Tier 1+2 benchmarks) was tagged without bumping
pyproject.toml's version string. .5 was git-tag-only; .6 is the next
published version increment.
Ryan Malloy 2026-05-05 12:26:15 -06:00
parent 01757415a5
commit 362ecb3d63
6 changed files with 363 additions and 22 deletions


@ -2,6 +2,59 @@
All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
## 2026.05.05.6 — Pipelined `executemany` (Phase 33) — 2.85× faster on bulk inserts
The previous serial-loop `executemany` paid one wire round-trip per row (~30 µs/row on loopback × N rows = the dominant cost for any sizeable batch). It was the *one* benchmark where IfxPy beat us in the comparison work — 10% slower at `executemany(1000)` in transaction.
Phase 33 pipelines the BIND+EXECUTE PDUs: build all N PDUs first, send them back-to-back, then drain all N responses. Eliminates the per-row RTT entirely.
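The build-all, send-all, drain-all pattern can be sketched against any length-prefixed request/response stream. A toy sketch over a socketpair (the framing, `pipelined_send`, and echo server are illustrative, not the actual SQLI PDU layout):

```python
import socket
import struct
import threading

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; sockets may return short reads."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

def serve_n(sock: socket.socket, n: int) -> None:
    """Toy server: answer every one of n length-prefixed frames with
    b'OK' — mirroring the assumption that the server sends exactly
    N responses for N pipelined requests."""
    for _ in range(n):
        (length,) = struct.unpack(">I", recv_exact(sock, 4))
        recv_exact(sock, length)  # consume the request body
        sock.sendall(struct.pack(">I", 2) + b"OK")

def pipelined_send(sock: socket.socket, payloads: list[bytes]) -> list[bytes]:
    # Build all frames first (pure Python work, no I/O)...
    frames = [struct.pack(">I", len(p)) + p for p in payloads]
    # ...send them back-to-back in one burst...
    sock.sendall(b"".join(frames))
    # ...then drain exactly N responses: one round-trip total,
    # instead of one per payload as a serial loop would pay.
    replies = []
    for _ in payloads:
        (length,) = struct.unpack(">I", recv_exact(sock, 4))
        replies.append(recv_exact(sock, length))
    return replies

client, server = socket.socketpair()
t = threading.Thread(target=serve_n, args=(server, 100))
t.start()
replies = pipelined_send(client, [f"row_{i}".encode() for i in range(100)])
t.join()
client.close()
server.close()
print(len(replies), set(replies))  # 100 {b'OK'}
```

The sketch relies on the same safety argument as the real pipeline: the batch must fit the send buffer (or the peer must drain concurrently), and the peer must answer every request even after an error.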
### Performance impact
| Benchmark | Before | After | Speedup |
|---|---:|---:|---:|
| `executemany(1000)` in transaction | 31.3 ms | **11.0 ms** | **2.85× faster** |
| `executemany(100)` in autocommit | 173 ms | 154 ms | 11% faster |
| `executemany(1000)` in autocommit | 1740 ms | 1590 ms | 9% faster |
Autocommit cases get smaller relative wins because server-side log flushes per row dominate the absolute cost (Phase 21.1's "autocommit cliff").
### IfxPy comparison: now winning all 5 benchmarks
The comparison flipped from "us 10% slower on bulk inserts" to "us 2.05× faster":
| Benchmark | IfxPy | informix-db | Result |
|---|---:|---:|---:|
| `select_one_row` | 118 µs | 114 µs | us 3% faster |
| `select_systables_first_10` | 164 µs | 159 µs | us 3% faster |
| `select_bench_table_all` (1k rows) | 984 µs | 891 µs | us 9% faster |
| **`executemany(1000)` in txn** | **21.4 ms** | **10.4 ms** | **us 2.05× faster** |
| `cold_connect_disconnect` | 11.0 ms | 10.4 ms | us 5% faster |
### Margaret Hamilton review pass
Hamilton flagged one critical concern (C1) before approving: the pipeline assumes Informix sends *exactly* N responses for N pipelined PDUs even when one row fails. If the server cut the response stream short on first error, the drain loop would block on the next read and the connection would deadlock.
**Verified by integration test** (`tests/test_executemany_pipeline.py`):
- Constraint violation at row 0/100 (first-row failure)
- Constraint violation at row 99/100 (last-row failure)
- Constraint violation at row 500/1000 (mid-batch failure)
All 3 confirm: Informix DOES send N responses for N PDUs; wire stays aligned; connection is usable after.
Plus four lower-priority fixes Hamilton recommended:
- **H1**: documented the `_raise_sq_err` self-drains-SQ_EOT invariant in the drain loop, plus the tripwire test that catches its violation.
- **H2**: docstring warning that lock-holding time scales O(N) in batch size; recommend chunking for very large batches.
- **M1**: prepend row-index annotation rather than reformat the exception message — preserves `[<sqlcode>] <text>` prefix for string-scraping callers.
- **M2**: documented that `sendall` doesn't honor a write timeout reliably on all kernels; recommend `keepalive=True` for hostile networks.
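H2's chunking recommendation works with any PEP 249 cursor. A minimal sketch (`executemany_chunked`, the fake cursor, and the 5000-row default are illustrative, not driver API):

```python
from collections.abc import Sequence
from itertools import islice

def executemany_chunked(
    cur, sql: str, rows: Sequence[tuple], chunk_size: int = 5000
) -> int:
    """Split one huge batch into several executemany calls so the
    connection's wire lock is released between chunks and other
    threads on the connection aren't starved for O(N_total)."""
    total = 0
    it = iter(rows)
    while chunk := list(islice(it, chunk_size)):
        cur.executemany(sql, chunk)
        total += cur.rowcount
    return total

class _FakeCursor:
    """Minimal PEP 249 stand-in for demonstration."""
    def __init__(self) -> None:
        self.batch_sizes: list[int] = []
        self.rowcount = -1

    def executemany(self, sql: str, rows: Sequence[tuple]) -> None:
        self.batch_sizes.append(len(rows))
        self.rowcount = len(rows)

cur = _FakeCursor()
total = executemany_chunked(
    cur, "INSERT INTO t VALUES (?)", [(i,) for i in range(12001)]
)
print(cur.batch_sizes, total)  # [5000, 5000, 2001] 12001
```

Note that chunking trades atomicity for fairness: a mid-batch failure now leaves earlier chunks applied (until rollback), so wrap the whole thing in one transaction if all-or-nothing semantics matter.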
### Tests
3 new integration tests in `tests/test_executemany_pipeline.py` validate the wire-alignment invariant. Total: **77 unit + 239 integration + 33 benchmark = 349 tests**.
### Note on version 2026.05.05.5
The Phase 32 (Tier 1+2 benchmarks) tag was applied without bumping `pyproject.toml`'s version string — that release is git-tag-only. Version 2026.05.05.6 (Phase 33) is the next published version increment.
## 2026.05.05.4 — Final hardening pass (Phase 30)
Closes the last 3 medium-severity items from Hamilton's system-wide audit. **No findings remain.**


@ -176,15 +176,17 @@ Head-to-head benchmarks against [IfxPy](https://pypi.org/project/IfxPy/) on iden
| Benchmark | IfxPy 3.0.5 (C-bound) | `informix-db` (pure Python) | Result |
|---|---:|---:|---:|
| Single-row SELECT round-trip | 118 µs | **114 µs** | **`informix-db` 3% faster** |
| ~10-row server-side query | 164 µs | **159 µs** | **`informix-db` 3% faster** |
| 1000-row SELECT (full fetch) | 984 µs | **891 µs** | **`informix-db` 9% faster** |
| **`executemany(1000)` in transaction** | 21.4 ms | **10.4 ms** | **`informix-db` 2.05× faster** |
| Cold connect (login handshake) | 11.0 ms | **10.4 ms** | **`informix-db` 5% faster** |

**`informix-db` wins on all 5 benchmarks against the C-bound driver, including a 2× win on bulk inserts.**

**Why pure-Python wins the round-trip-bound work:** IfxPy's code path is `Python → OneDB ODBC driver → libifdmr.so → wire`. Ours is `Python → wire`. The abstraction-layer overhead IfxPy carries on every call costs more than the C-vs-Python codec gap saves.

**Why we win bulk inserts dramatically:** `executemany` pipelines all N BIND+EXECUTE PDUs to the wire before draining responses (Phase 33), eliminating the per-row round-trip that the older serial loop incurred. IfxPy still does one synchronous round-trip per row.

Full methodology, IQR caveats, install gauntlet, and reproduction in [`tests/benchmarks/compare/README.md`](tests/benchmarks/compare/README.md).


@ -1,6 +1,6 @@
[project]
name = "informix-db"
version = "2026.05.05.6"
description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
readme = "README.md"
license = { text = "MIT" }


@ -828,12 +828,38 @@ class Cursor:
        """Execute the same SQL once per parameter set.

        Per PEP 249. Common case is batched INSERT. We PREPARE once,
        send N SQ_BIND+SQ_EXECUTE PDUs in a pipelined batch, then drain
        N responses, then RELEASE once. Phase 33 introduces the
        pipeline; earlier serial-loop implementations paid one wire
        round-trip per row (~30 us/row on loopback x N rows = the
        dominant cost for any sizeable batch).

        Phase 4 supports DML (INSERT/UPDATE/DELETE) only; SELECT in
        executemany doesn't make much sense and isn't implemented.

        **Pipelining safety** (Phase 33):

        * The Phase 27 wire lock is held for the whole executemany, so
          the entire send-batch + drain-batch is atomic against other
          threads on the connection.
        * The TCP send buffer (~16-256 KB) easily fits 1000 PDUs
          (~80-200 KB worst case); response packets are tiny (~10
          bytes per OK), so the server's send buffer can't fill before
          we drain. Note: ``sendall`` doesn't honor a write timeout
          reliably on all kernels; a wedged peer could block until TCP
          keepalive fires (default ~2 hours). For hostile-network
          deployments, set ``keepalive=True`` on connect.
        * On the first error mid-drain, remaining responses are
          drained silently (they're SQ_ERR replies for rows that the
          aborted transaction couldn't commit anyway). Wire alignment
          is verified by ``test_executemany_pipeline.py``; Informix
          does send N responses for N pipelined PDUs even when one
          fails. If a future Informix version changes that behavior,
          those tests fail loudly.
        * **Lock duration scales O(N) with batch size.** For very
          large batches (>10000 rows), other threads waiting on this
          connection will block proportionally. Prefer chunking into
          multiple ``executemany`` calls of 1000-10000 rows so other
          threads aren't starved.
        """
        self._check_open()
@ -880,18 +906,62 @@ class Cursor:
        )
        self._read_describe_response()

        # Phase 33: pipeline — build all BIND+EXECUTE PDUs first
        # (Python work, no I/O), then send them back-to-back, then
        # drain all responses. Eliminates the per-row round-trip
        # the older serial loop paid.
        pdus = [
            self._build_bind_execute_pdu(tuple(p)) for p in seq
        ]
        for pdu in pdus:
            self._conn._send_pdu(pdu)

        # Drain N responses. The first error is captured but we
        # still drain the rest (they're SQ_ERRs for the aborted
        # transaction's queued rows) so the wire stays consistent.
        #
        # Wire-framing invariant: each response — whether SQ_DONE
        # for a successful row or SQ_ERR for a failed one — ends
        # with its own SQ_EOT. ``_raise_sq_err`` self-drains the
        # SQ_ERR's trailing SQ_EOT (see connections.py:_raise_sq_err
        # drain loop). So calling ``_drain_to_eot`` exactly N times
        # consumes exactly the responses for N PDUs, regardless of
        # how many succeeded vs. failed. If ``_raise_sq_err`` is
        # ever refactored to leave its trailing EOT for the caller,
        # this loop silently desyncs — the test
        # ``test_executemany_pipeline.py`` is the tripwire.
        total_rowcount = 0
        first_error: Exception | None = None
        first_error_row: int | None = None
        for i in range(len(pdus)):
            self._rowcount = -1
            try:
                self._drain_to_eot()
            except Exception as exc:
                if first_error is None:
                    first_error = exc
                    first_error_row = i
                continue
            if self._rowcount > 0:
                total_rowcount += self._rowcount

        # RELEASE once.
        self._conn._send_pdu(self._build_release_pdu())
        self._drain_to_eot()

        if first_error is not None:
            # Annotate which row in the batch first failed by
            # PREPENDING to the existing message — preserves the
            # ``[<sqlcode>] <text>`` prefix that string-scraping
            # callers may rely on, and keeps the exception class
            # + structured fields (.sqlcode, .isamcode, .near).
            if first_error.args:
                first_error.args = (
                    f"executemany row {first_error_row}/{len(pdus)}: "
                    f"{first_error.args[0]}",
                    *first_error.args[1:],
                )
            raise first_error

        self._rowcount = total_rowcount

    def fetchone(self) -> tuple | None:


@ -6,15 +6,17 @@ Head-to-head benchmarks against [IfxPy](https://pypi.org/project/IfxPy/), the IB
Using **median + IQR over 10+ rounds** (mean was unreliable on the slow benchmarks — see "Statistical robustness" below):

| Benchmark | IfxPy 3.0.5 (C-bound) | informix-db (pure Python) | Result |
|---|---:|---:|---:|
| `select_one_row` (single-row latency) | 118 µs | **114 µs** | **`informix-db` 3% faster** |
| `select_systables_first_10` (~10 rows) | 164 µs | **159 µs** | **`informix-db` 3% faster** |
| `select_bench_table_all` (1000-row fetch) | 984 µs | **891 µs** | **`informix-db` 9% faster** |
| **`executemany(1000)` in transaction (bulk write)** | 21.4 ms | **10.4 ms** | **`informix-db` 2.05× faster** |
| `cold_connect_disconnect` (login handshake) | 11.0 ms | **10.4 ms** | **`informix-db` 5% faster** |

**`informix-db` wins all 5 benchmarks against the C-bound driver, including a 2× win on bulk inserts.**
The bulk-insert win comes from Phase 33's pipelined `executemany`: all N BIND+EXECUTE PDUs are sent to the wire before any response is drained, eliminating the per-row round-trip latency that the older serial loop (and IfxPy's per-call API) incur. The wire-alignment assumption that makes this safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by `tests/test_executemany_pipeline.py` (constraint violation at row 0/100, 99/100, 500/1000).
## Statistical robustness — why median, not mean


@ -0,0 +1,214 @@
"""Phase 33 integration tests — pipelined ``executemany`` correctness.
The pipelined executemany sends all N BIND+EXECUTE PDUs to the wire
before draining any response. Hamilton's review of Phase 33 flagged
C1: this assumes the server sends *exactly* N responses for N
pipelined PDUs even when one row fails. If the server cuts the
response stream short on first error, the drain loop would block
reading bytes that never arrive the connection would deadlock on
the next read.
These tests verify the wire-alignment assumption holds:
1. Constraint violation at row 500 of 1000 happy-failure case.
2. Wire-alignment recovery connection is still usable after the
error (proving the RELEASE drain succeeded and we read all the
remaining error responses).
3. Subsequent operations on the same connection work proves no
stray bytes on the wire.
"""
from __future__ import annotations
import contextlib
from collections.abc import Iterator
import pytest
import informix_db
from tests.conftest import ConnParams
pytestmark = pytest.mark.integration
@pytest.fixture
def constraint_table(logged_db_params: ConnParams) -> Iterator[str]:
"""Table with a UNIQUE constraint on ``id`` so we can force a
constraint violation at a known row.
"""
table = "p33_constraint"
conn = informix_db.connect(
host=logged_db_params.host,
port=logged_db_params.port,
user=logged_db_params.user,
password=logged_db_params.password,
database=logged_db_params.database,
server=logged_db_params.server,
autocommit=True,
)
cur = conn.cursor()
with contextlib.suppress(Exception):
cur.execute(f"DROP TABLE {table}")
cur.execute(
f"CREATE TABLE {table} (id INT NOT NULL PRIMARY KEY, name VARCHAR(64))"
)
conn.close()
try:
yield table
finally:
conn = informix_db.connect(
host=logged_db_params.host,
port=logged_db_params.port,
user=logged_db_params.user,
password=logged_db_params.password,
database=logged_db_params.database,
server=logged_db_params.server,
autocommit=True,
)
cur = conn.cursor()
with contextlib.suppress(Exception):
cur.execute(f"DROP TABLE {table}")
conn.close()
def test_pipelined_executemany_mid_batch_constraint_violation(
logged_db_params: ConnParams, constraint_table: str
) -> None:
"""C1 (Hamilton): force a constraint violation at row 500 of 1000;
verify the pipeline drains cleanly and the connection is usable
afterward.
This is the test that validates Phase 33's wire-alignment
assumption. If Informix sends fewer than 1000 responses for 1000
pipelined PDUs after the row-500 failure, this test will hang on
the drain loop's read (eventually timing out, but the test will
fail loudly either way).
"""
conn = informix_db.connect(
host=logged_db_params.host,
port=logged_db_params.port,
user=logged_db_params.user,
password=logged_db_params.password,
database=logged_db_params.database,
server=logged_db_params.server,
autocommit=False,
read_timeout=30.0, # if the wire desyncs, fail loudly within 30s
)
try:
# Pre-seed row 500 so the executemany's row-500 INSERT will
# violate the UNIQUE constraint.
cur = conn.cursor()
cur.execute(
f"INSERT INTO {constraint_table} VALUES (?, ?)",
(500, "pre-existing"),
)
conn.commit()
# Now executemany 1000 rows; row 500 will collide
rows = [(i, f"row_{i}") for i in range(1000)]
with pytest.raises(informix_db.IntegrityError) as exc_info:
cur.executemany(
f"INSERT INTO {constraint_table} VALUES (?, ?)", rows
)
# The error message should identify which row failed in the batch
err_msg = str(exc_info.value)
assert "row 500" in err_msg or "500" in err_msg, (
f"error message should identify the failed row index: {err_msg}"
)
# Whatever the transaction state, rolling back is the correct
# response to a failed batch.
conn.rollback()
# The connection MUST be usable after the failed batch.
# If the wire is desynced, this query will block or fail
# with a ProtocolError. The test passing here proves the
# pipeline drained cleanly.
cur = conn.cursor()
cur.execute(f"SELECT COUNT(*) FROM {constraint_table}")
(count,) = cur.fetchone()
# After rollback, only the pre-seeded row 500 remains
assert count == 1, (
f"expected only the pre-seeded row to remain, got {count} "
"(transaction didn't roll back cleanly?)"
)
finally:
conn.close()
def test_pipelined_executemany_first_row_fails(
logged_db_params: ConnParams, constraint_table: str
) -> None:
"""Edge case: failure on the FIRST row of the pipeline. Tests that
the drain loop correctly handles "every response after this is an
error" without falling apart on the very first response."""
conn = informix_db.connect(
host=logged_db_params.host,
port=logged_db_params.port,
user=logged_db_params.user,
password=logged_db_params.password,
database=logged_db_params.database,
server=logged_db_params.server,
autocommit=False,
read_timeout=30.0,
)
try:
cur = conn.cursor()
cur.execute(
f"INSERT INTO {constraint_table} VALUES (?, ?)", (0, "seeded")
)
conn.commit()
rows = [(i, f"row_{i}") for i in range(100)]
with pytest.raises(informix_db.IntegrityError):
cur.executemany(
f"INSERT INTO {constraint_table} VALUES (?, ?)", rows
)
conn.rollback()
cur = conn.cursor()
cur.execute(f"SELECT COUNT(*) FROM {constraint_table}")
(count,) = cur.fetchone()
assert count == 1
finally:
conn.close()
def test_pipelined_executemany_last_row_fails(
logged_db_params: ConnParams, constraint_table: str
) -> None:
"""Edge case: failure on the LAST row of the pipeline. Tests that
we don't accidentally short-circuit the drain when we see the
"expected" rowcount before the actual error response arrives."""
conn = informix_db.connect(
host=logged_db_params.host,
port=logged_db_params.port,
user=logged_db_params.user,
password=logged_db_params.password,
database=logged_db_params.database,
server=logged_db_params.server,
autocommit=False,
read_timeout=30.0,
)
try:
cur = conn.cursor()
cur.execute(
f"INSERT INTO {constraint_table} VALUES (?, ?)",
(99, "seeded-last"),
)
conn.commit()
rows = [(i, f"row_{i}") for i in range(100)]
with pytest.raises(informix_db.IntegrityError):
cur.executemany(
f"INSERT INTO {constraint_table} VALUES (?, ?)", rows
)
conn.rollback()
cur = conn.cursor()
cur.execute(f"SELECT COUNT(*) FROM {constraint_table}")
(count,) = cur.fetchone()
assert count == 1
finally:
conn.close()