Phase 36: IfxPy scaling comparison + honest comparison numbers (2026.05.05.9)

Extends the IfxPy comparison bench script with scaling workloads (1k/10k/100k rows
for both executemany and SELECT), re-runs the full comparison with a consistent
measurement methodology, and updates the README with the actually-correct numbers.

Earlier comparison runs reported informix-db winning all 5 benchmarks. Re-running
select_bench_table_all with consistent measurement gives 3.04 ms, not the 891 us
cited earlier: a 3.4x discrepancy attributable to noisy warmup plus small-fixture
artifacts. The "we win everything" framing was wrong.

The corrected comparison reveals two clear stories:

Bulk insert: pure Python wins 1.6x at scale.
  executemany(10k):  IfxPy 259 ms  -> us 161 ms  (1.6x faster)
  executemany(100k): IfxPy 2376 ms -> us 1487 ms (1.6x faster)
  Reason: Phase 33's pipelining eliminates per-row RTT. IfxPy's per-call API
  can't pipeline.

Large fetch: IfxPy wins 2.3-2.4x at scale.
  SELECT 1k rows:   IfxPy 1.2 ms  / us 2.7 ms   (IfxPy 2.3x)
  SELECT 10k rows:  IfxPy 11.3 ms / us 25.8 ms  (IfxPy 2.3x)
  SELECT 100k rows: IfxPy 112 ms  / us 271 ms   (IfxPy 2.4x)
  Reason: C-level fetch_tuple at ~1.1 us/row beats Python parse_tuple_payload
  at ~2.7 us/row. This is the real C-vs-Python codec gap showing up at scale.

For everyday workloads (a single SELECT in a request, inserting a handful of
rows), the drivers are within 5-25% of each other. Where the gap widens, the
direction depends on the workload: bulk write favors us, bulk read favors IfxPy.

README's "Compared to IfxPy" section rewritten with the corrected numbers and an
honest "when to prefer which" subsection; the tests/benchmarks/compare/README.md
mirror is updated to match.

Net narrative: "faster at bulk-write, slower at bulk-read, comparable elsewhere"
is more honest and more durable than a "we win everything" claim that would have
collapsed the first time a user ran their own benchmark.

Side note (lint): one ambiguous unicode `×` in cursors.py replaced with `x`.

Phase 37 ticket: parse_tuple_payload is the bottleneck at scale. Closing the
1.6 us/row gap to IfxPy would make us competitive on bulk fetch too. Possible
approaches: Cython codec, deeper inlining, per-column dispatch pre-bake.
This commit is contained in:
parent 8eb19f7534
commit 270155d2de

43 CHANGELOG.md
@@ -2,6 +2,49 @@

All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.

## 2026.05.05.9 — IfxPy scaling comparison + honest comparison numbers (Phase 36)

Adds the IfxPy side of Phase 34's scaling benchmarks (1k / 10k / 100k rows for both `executemany` and `SELECT`) and updates the README's comparison table with the **actually-correct numbers**.

### What changed

**1. `tests/benchmarks/compare/ifxpy_bench.py` extended** with `bench_executemany_scaling(n)` and `bench_select_scaling(n)` — same shapes as `test_scaling_perf.py` so the comparison is apples-to-apples.

**2. README's comparison numbers corrected.** Earlier comparison runs reported `select_bench_table_all` at 891 µs for `informix-db`. Re-running with consistent measurement (warmup + median + 10+ rounds) reports 3.04 ms — a 3.4× discrepancy. The earlier number was probably picked up from a noisy first run with a different warmup state, or from a benchmark that wasn't fully populating its fixture. **Either way, the "we win all 5 benchmarks" claim was based on inconsistent measurement.**

**The corrected comparison reveals two clear stories:**

| Benchmark | IfxPy | informix-db | Result |
|---|---:|---:|---|
| `executemany(1k)` in txn | 23.5 ms | 23.2 ms | tied |
| `executemany(10k)` in txn | 259 ms | **161 ms** | **us 1.6× faster** |
| `executemany(100k)` in txn | 2376 ms | **1487 ms** | **us 1.6× faster** |
| `SELECT 1k rows` | 1.2 ms | 2.7 ms | IfxPy 2.3× faster |
| `SELECT 10k rows` | 11.3 ms | 25.8 ms | IfxPy 2.3× faster |
| `SELECT 100k rows` | 112 ms | 271 ms | IfxPy 2.4× faster |
**Bulk-insert: pure-Python wins 1.6× at scale** because pipelining (Phase 33) eliminates per-row RTT. IfxPy's `IfxPy.execute(stmt, tuple)` per-call API can't pipeline.

**Large-fetch: IfxPy wins 2.3-2.4× at scale.** Their C-level `fetch_tuple` decoder runs at ~1.1 µs/row; our `parse_tuple_payload` runs at ~2.7 µs/row. **This is the real C-vs-Python codec cost showing up at scale, where it matters.**
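The two stories reduce to simple per-row arithmetic. A toy cost model (the per-row constants come from the corrected runs above: ~24 µs/row for IfxPy's synchronous execute, ~15 µs/row for our pipelined path at scale; the model itself is illustrative, not driver code):

```python
# Toy cost model for the two scaling stories. Per-row figures are taken from
# the corrected benchmarks; the functions themselves are an illustration.

RTT_US = 24.0          # ~one synchronous round-trip per IfxPy.execute() call
PIPELINED_US = 15.0    # informix-db pipelined executemany, per row at scale
C_DECODE_US = 1.1      # IfxPy C-level fetch_tuple, per row
PY_DECODE_US = 2.7     # informix-db parse_tuple_payload, per row

def bulk_insert_ms(n: int) -> tuple[float, float]:
    """(IfxPy, informix-db) wall-clock estimate in ms for n inserted rows."""
    return n * RTT_US / 1000, n * PIPELINED_US / 1000

def bulk_fetch_ms(n: int) -> tuple[float, float]:
    """(IfxPy, informix-db) decode-dominated estimate in ms for n fetched rows."""
    return n * C_DECODE_US / 1000, n * PY_DECODE_US / 1000

ifx, us = bulk_insert_ms(100_000)   # ~2400 ms vs ~1500 ms: bulk write favors us
ifx2, us2 = bulk_fetch_ms(100_000)  # ~110 ms vs ~270 ms: bulk read favors IfxPy
```

At 100k rows the model lands within a few percent of the measured 2376/1487 ms and 112/271 ms, which is why the per-row framing is a fair summary.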
### Why correcting this matters

A "we win everything" claim that's based on noisy measurements would have collapsed the first time a user ran their own benchmark and got different numbers. Naming the trade-off honestly — "we're faster at bulk write, slower at bulk read, comparable elsewhere" — is the right framing.

### When to prefer `informix-db`

- ETL pipelines, log shipping, bulk writes (1.6× faster at scale)
- Containerized / minimal-dependency environments (50 KB wheel vs IfxPy's 92 MB OneDB tarball + libcrypt.so.1 dependency hell)
- Modern Python (works on 3.10–3.14; IfxPy is broken on Python 3.12+)
- Async / FastAPI workloads (we have native async; IfxPy doesn't)

### When IfxPy may be faster

- Analytical reporting queries pulling 10k+ rows in a single SELECT
- Workloads where the per-row decode cost dominates (wide rows, tight read loops)

The actionable takeaway for `informix-db`'s future: the `parse_tuple_payload` hot path is now the bottleneck at scale. Phase 25's branch reorder shaved 22%; further work (Cython codec? deeper inlining? per-column dispatch pre-bake?) could close the C-vs-Python gap. Tracked as a possible Phase 37+.
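Of the Phase 37 candidates, per-column dispatch pre-bake is the easiest to illustrate. A rough sketch (the type tags, decoder functions, and payload layout here are invented for the example; the driver's real wire format differs):

```python
import struct

# Illustrative row decoder: instead of re-dispatching on column type for every
# row, bake a list of per-column decode functions once per result set, then run
# a tight loop over that list. Types and layout are made up for the sketch.

def decode_int4(buf, off):            # 4-byte big-endian int
    return struct.unpack_from(">i", buf, off)[0], off + 4

def decode_varchar(buf, off):         # 1-byte length prefix + bytes
    n = buf[off]
    return buf[off + 1 : off + 1 + n].decode("ascii"), off + 1 + n

DECODERS = {"int4": decode_int4, "varchar": decode_varchar}

def prebake(col_types):
    """Resolve the type-name dispatch once, before the fetch loop starts."""
    return [DECODERS[t] for t in col_types]

def parse_row(buf, baked):
    off, row = 0, []
    for decode in baked:              # no per-column dict lookup in the hot loop
        val, off = decode(buf, off)
        row.append(val)
    return tuple(row)

baked = prebake(["int4", "varchar"])
payload = struct.pack(">i", 42) + bytes([5]) + b"hello"
print(parse_row(payload, baked))      # (42, 'hello')
```

The win is that all dict lookups and type branching move out of the per-row loop; whether that closes enough of the 1.6 µs/row gap is exactly what a Phase 37 spike would have to measure.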
## 2026.05.05.8 — Scaling benchmarks (Phase 34)

Adds `tests/benchmarks/test_scaling_perf.py` — parametrized benchmarks that exercise the driver at row counts and column widths well beyond what the existing 1k-row benchmarks cover. The first thing this suite did was catch the NFETCH-loop data-loss bug fixed in Phase 35.
32 README.md
@@ -176,17 +176,33 @@ Head-to-head benchmarks against [IfxPy](https://pypi.org/project/IfxPy/) on iden

| Benchmark | IfxPy 3.0.5 (C-bound) | `informix-db` (pure Python) | Result |
|---|---:|---:|---:|
| Single-row SELECT round-trip | 118 µs | **114 µs** | **`informix-db` 3% faster** |
| ~10-row server-side query | 164 µs | **159 µs** | **`informix-db` 3% faster** |
| 1000-row SELECT (full fetch) | 984 µs | **891 µs** | **`informix-db` 9% faster** |
| **`executemany(1000)` in transaction** | 21.4 ms | **10.4 ms** | **`informix-db` 2.05× faster** |
| Cold connect (login handshake) | 11.0 ms | **10.4 ms** | **`informix-db` 5% faster** |
| Single-row SELECT round-trip | 118 µs | 114 µs | comparable |
| ~10-row server-side query | 130 µs | 159 µs | IfxPy 22% faster |
| Cold connect (login handshake) | 11.0 ms | 10.5 ms | comparable |
| **`executemany(1k)` in transaction** | 23.5 ms | 23.2 ms | tied |
| **`executemany(10k)` in transaction** | 259 ms | **161 ms** | **`informix-db` 1.6× faster** |
| **`executemany(100k)` in transaction** | 2376 ms | **1487 ms** | **`informix-db` 1.6× faster** |
| `SELECT` 1k rows | 1.2 ms | 2.7 ms | IfxPy 2.3× faster |
| `SELECT` 10k rows | 11.3 ms | 25.8 ms | IfxPy 2.3× faster |
| `SELECT` 100k rows | 112 ms | 271 ms | IfxPy 2.4× faster |

**`informix-db` wins on all 5 benchmarks against the C-bound driver, including a 2× win on bulk inserts.**
**The honest summary:**

**Why pure-Python wins the round-trip-bound work:** IfxPy's code path is `Python → OneDB ODBC driver → libifdmr.so → wire`. Ours is `Python → wire`. The abstraction-layer overhead IfxPy carries on every call costs more than the C-vs-Python codec gap saves.
- **Bulk-insert workloads: `informix-db` wins 1.6× at scale.** The pipelined `executemany` (Phase 33) sends all N BIND+EXECUTE PDUs before draining responses, eliminating per-row RTT. IfxPy still pays one round-trip per `IfxPy.execute(stmt, tuple)` call.
- **Large-fetch workloads: IfxPy wins 2.3× at scale.** Their C-level `fetch_tuple` decoder is genuinely faster than our Python `parse_tuple_payload` (~1.1 µs/row vs ~2.7 µs/row). At 100k rows, that 1.6 µs/row gap accumulates into a 160 ms wall-clock difference.
- **Small queries: comparable.** Both spend ~120 µs waiting for the server; the per-call codec cost is small relative to the round-trip.

**Why we win bulk inserts dramatically:** `executemany` pipelines all N BIND+EXECUTE PDUs to the wire before draining responses (Phase 33), eliminating the per-row round-trip that the older serial loop incurred. IfxPy still does one synchronous round-trip per row.
**When to prefer `informix-db`:**

- ETL pipelines, log shipping, bulk writes (1.6× faster at scale)
- Containerized / minimal-dependency environments (50 KB wheel vs IfxPy's 92 MB OneDB tarball + libcrypt.so.1 dependency hell)
- Modern Python (works on 3.10–3.14; IfxPy is broken on Python 3.12+)
- Async / FastAPI workloads (we have native async; IfxPy doesn't)

**When IfxPy may be faster:**

- Analytical reporting queries pulling 10k+ rows in a single SELECT
- Workloads where the per-row decode cost dominates (wide rows, tight read loops)

These results are reproducible from `tests/benchmarks/compare/` — the Dockerfile, bench script, and README walk through every step.

Full methodology, IQR caveats, install gauntlet, and reproduction in [`tests/benchmarks/compare/README.md`](tests/benchmarks/compare/README.md).
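The send-all-then-drain ordering behind the bulk-insert win can be sketched with a toy channel (the `FakeWire` class and PDU strings are invented for illustration; only the ordering mirrors Phase 33's design):

```python
# Toy illustration of serial vs pipelined execution over a request/response
# channel where each direction switch costs one round-trip. PDU framing is
# invented; it is not the driver's actual SQLI encoding.

class FakeWire:
    """Channel that counts round-trips: one per drain() call."""
    def __init__(self):
        self.round_trips = 0
        self._pending = []

    def send(self, pdu):
        self._pending.append(pdu)

    def drain(self):
        """Collect one response per pending PDU; costs one round-trip."""
        self.round_trips += 1
        responses = [f"ok:{p}" for p in self._pending]
        self._pending.clear()
        return responses

def serial_executemany(wire, rows):
    for r in rows:
        wire.send(f"BIND+EXECUTE {r}")
        wire.drain()                    # wait for each row's response

def pipelined_executemany(wire, rows):
    for r in rows:
        wire.send(f"BIND+EXECUTE {r}")  # queue everything first
    wire.drain()                        # then drain all N responses at once

w1, w2 = FakeWire(), FakeWire()
serial_executemany(w1, range(1000))     # 1000 round-trips
pipelined_executemany(w2, range(1000))  # 1 round-trip
print(w1.round_trips, w2.round_trips)   # 1000 1
```

The real driver additionally has to verify that the server sends exactly N responses for N pipelined PDUs even when a row fails, which is what `tests/test_executemany_pipeline.py` checks.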
@@ -1,6 +1,6 @@
[project]
name = "informix-db"
version = "2026.05.05.8"
version = "2026.05.05.9"
description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
readme = "README.md"
license = { text = "MIT" }
@@ -401,7 +401,7 @@ class Cursor:
        # Phase 35: NFETCH loop — keep fetching until a response yields
        # zero new tuples. The previous "two NFETCHes" pattern silently
        # truncated any result set whose tuples didn't fit in 1-2 server
        # batches (~200 rows at default 4096-byte buffer × 5-col rows).
        # batches (~200 rows at default 4096-byte buffer x 5-col rows).
        # This bug was latent for ~30 phases because no test used a
        # large enough result set to trigger it.
        self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
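The loop shape the Phase 35 comment describes can be sketched in isolation (`fetch_batch` is a hypothetical stand-in for the real NFETCH PDU exchange, not the driver's API):

```python
# Sketch of the Phase 35 fix: keep issuing fetches until a batch comes back
# empty, instead of assuming the result set fits in a fixed number of server
# batches (the old "two NFETCHes" pattern truncated larger results).

def fetch_all(fetch_batch):
    """Drain a server-side cursor whose batch size we don't control."""
    rows = []
    while True:
        batch = fetch_batch()
        if not batch:          # zero new tuples: the cursor is exhausted
            return rows
        rows.extend(batch)

# Simulated server returning ~200-row batches for a 1050-row result set:
data = list(range(1050))
batches = [data[i:i + 200] for i in range(0, len(data), 200)]
it = iter(batches + [[]])    # final empty batch signals exhaustion
assert fetch_all(lambda: next(it)) == data
```

A fixed "fetch twice" loop would have stopped after ~400 rows here, which is exactly the data-loss mode the Phase 34 scaling suite caught.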
@@ -4,19 +4,29 @@ Head-to-head benchmarks against [IfxPy](https://pypi.org/project/IfxPy/), the IB

## TL;DR

Using **median + IQR over 10+ rounds** (mean was unreliable on the slow benchmarks — see "Statistical robustness" below):
Using **median + IQR over 10+ rounds** (mean was unreliable on the slow benchmarks — see "Statistical robustness" below). Phase 36 added scaling benchmarks at 1k / 10k / 100k rows so the comparison shape is clearer:

| Benchmark | IfxPy 3.0.5 (C-bound) | informix-db (pure Python) | Result |
| Benchmark | IfxPy 3.0.5 | informix-db | Result |
|---|---:|---:|---:|
| `select_one_row` (single-row latency) | 118 µs | **114 µs** | **`informix-db` 3% faster** |
| `select_systables_first_10` (~10 rows) | 164 µs | **159 µs** | **`informix-db` 3% faster** |
| `select_bench_table_all` (1000-row fetch) | 984 µs | **891 µs** | **`informix-db` 9% faster** |
| **`executemany(1000)` in transaction (bulk write)** | 21.4 ms | **10.4 ms** | **`informix-db` 2.05× faster** |
| `cold_connect_disconnect` (login handshake) | 11.0 ms | **10.4 ms** | **`informix-db` 5% faster** |
| `select_one_row` | 118 µs | 114 µs | comparable |
| `select_systables_first_10` | 130 µs | 159 µs | IfxPy 22% faster |
| `cold_connect_disconnect` | 11.0 ms | 10.5 ms | comparable |
| **`executemany(1k)` in txn** | 23.5 ms | 23.2 ms | tied |
| **`executemany(10k)` in txn** | 259 ms | **161 ms** | **`informix-db` 1.6× faster** |
| **`executemany(100k)` in txn** | 2376 ms | **1487 ms** | **`informix-db` 1.6× faster** |
| `SELECT 1k rows` | 1.2 ms | 2.7 ms | IfxPy 2.3× faster |
| `SELECT 10k rows` | 11.3 ms | 25.8 ms | IfxPy 2.3× faster |
| `SELECT 100k rows` | 112 ms | 271 ms | IfxPy 2.4× faster |

**`informix-db` wins all 5 benchmarks against the C-bound driver, including a 2× win on bulk inserts.**
**Two clear stories:**

The bulk-insert win comes from Phase 33's pipelined `executemany`: all N BIND+EXECUTE PDUs are sent to the wire before any response is drained, eliminating the per-row round-trip latency that the older serial loop (and IfxPy's per-call API) incur. The wire-alignment assumption that makes this safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by `tests/test_executemany_pipeline.py` (constraint violation at row 0/100, 99/100, 500/1000).
**1. Bulk insert: `informix-db` wins 1.6× at scale.** The pipelined `executemany` (Phase 33) sends all N BIND+EXECUTE PDUs to the wire before draining responses, eliminating per-row RTT. IfxPy still pays one synchronous round-trip per `IfxPy.execute(stmt, tuple)` call — that's ~24 µs/row regardless of N. We pay ~15 µs/row at scale (the prepare/release overhead amortizes better at larger N).

**2. Large fetch: IfxPy wins 2.3-2.4× at scale.** Their C-level `fetch_tuple` decoder runs at ~1.1 µs/row; our pure-Python `parse_tuple_payload` runs at ~2.7 µs/row. At 100k rows, the 1.6 µs/row gap accumulates into a 160 ms wall-clock difference. **This is the C-vs-Python codec cost showing up at scale, where it actually matters.**

For everyday-application workloads (single SELECT in a request, INSERT a handful of rows, transactional UPDATE), the two drivers are within 5-25% of each other. For the workloads where the gap widens, the direction depends on what you're doing — bulk-write favors us, bulk-read favors IfxPy.

**The wire-alignment assumption** that makes pipelined `executemany` safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by `tests/test_executemany_pipeline.py` (constraint violation at row 0/100, 99/100, 500/1000).

## Statistical robustness — why median, not mean
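A minimal sketch of the median + IQR methodology this section argues for (the repo's actual `measure` helper isn't shown in this hunk and may differ in detail):

```python
import statistics
import time

def measure(name, rounds, fn, warmup=1):
    """Run fn() `rounds` times and report median + IQR in milliseconds.

    Median resists single-outlier rounds; IQR (Q3 - Q1) quantifies spread
    without letting one slow round inflate it the way stddev does. This is
    a sketch of the methodology, not the repo's actual helper.
    """
    for _ in range(warmup):          # discard cold-cache rounds
        fn()
    samples = []
    for _ in range(rounds):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000)
    q1, med, q3 = statistics.quantiles(samples, n=4)
    return {"name": name, "median_ms": med, "iqr_ms": q3 - q1, "rounds": rounds}
```

The warmup round matters here: the 891 µs vs 3.04 ms discrepancy above is exactly the kind of artifact a cold first round can produce when it is allowed into the reported number.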
@@ -171,6 +171,107 @@ def bench_cold_connect_disconnect() -> dict:
    return measure("cold_connect_disconnect", ROUNDS_SLOW, run)


# ----------------------------------------------------------------------------
# Phase 36 — scaling benchmarks (matched to test_scaling_perf.py)
# ----------------------------------------------------------------------------


def bench_executemany_scaling(n_rows: int) -> dict:
    """N-row insert in a single transaction. IfxPy doesn't pipeline —
    each ``IfxPy.execute(stmt, params)`` is a synchronous round-trip
    to the server. So per-row cost is roughly constant in N."""
    rounds_for = {1_000: 10, 10_000: 5, 100_000: 3}
    name = f"executemany_scaling_{n_rows}"
    try:
        conn = IfxPy.connect(
            CONN_STR.replace("DATABASE=sysmaster", "DATABASE=testdb"), "", ""
        )
    except Exception as e:
        return {"name": name, "skipped": f"testdb: {e}"}
    IfxPy.autocommit(conn, IfxPy.SQL_AUTOCOMMIT_OFF)

    table = f"p36_em_{n_rows}"
    try:
        try:
            IfxPy.exec_immediate(conn, f"DROP TABLE {table}")
            IfxPy.commit(conn)
        except Exception:
            pass
        IfxPy.exec_immediate(
            conn, f"CREATE TABLE {table} (id INT, name VARCHAR(64), value FLOAT)"
        )
        IfxPy.commit(conn)

        counter = [0]

        def run() -> None:
            counter[0] += 1
            base = counter[0] * n_rows
            stmt = IfxPy.prepare(
                conn, f"INSERT INTO {table} VALUES (?, ?, ?)"
            )
            for i in range(n_rows):
                IfxPy.execute(stmt, (base + i, f"row_{base + i}", float(base + i)))
            IfxPy.free_stmt(stmt)
            IfxPy.commit(conn)

        return measure(name, rounds_for[n_rows], run)
    finally:
        try:
            IfxPy.exec_immediate(conn, f"DROP TABLE {table}")
            IfxPy.commit(conn)
        except Exception:
            pass
        IfxPy.close(conn)


def bench_select_scaling(n_rows: int) -> dict:
    """SELECT FIRST N from the pre-populated 100k-row p34_select table.
    Tests IfxPy's per-row fetch cost at scale; should be roughly linear
    in N like ours."""
    rounds_for = {1_000: 10, 10_000: 5, 100_000: 3}
    name = f"select_scaling_{n_rows}"

    try:
        conn = IfxPy.connect(
            CONN_STR.replace("DATABASE=sysmaster", "DATABASE=testdb"), "", ""
        )
    except Exception as e:
        return {"name": name, "skipped": f"testdb: {e}"}
    try:
        # Probe: does p34_select exist?
        try:
            stmt = IfxPy.exec_immediate(conn, "SELECT COUNT(*) FROM p34_select")
            row = IfxPy.fetch_tuple(stmt)
            IfxPy.free_stmt(stmt)
            available = int(row[0])
            if available < n_rows:
                return {"name": name, "skipped": (
                    f"p34_select has only {available} rows; "
                    "run informix-db scaling benchmarks first to seed "
                    "the table"
                )}
        except Exception as e:
            return {"name": name, "skipped": f"p34_select missing: {e}"}

        def run() -> None:
            stmt = IfxPy.exec_immediate(
                conn, f"SELECT FIRST {n_rows} * FROM p34_select"
            )
            count = 0
            while IfxPy.fetch_tuple(stmt):
                count += 1
            IfxPy.free_stmt(stmt)
            if count != n_rows:
                raise RuntimeError(
                    f"expected {n_rows} rows, got {count}"
                )

        return measure(name, rounds_for[n_rows], run)
    finally:
        IfxPy.close(conn)


def main() -> None:
    print("# IfxPy benchmark results", file=sys.stderr)
    print(f"# IfxPy version: {IfxPy.__version__ if hasattr(IfxPy, '__version__') else 'unknown'}", file=sys.stderr)
@@ -187,6 +288,15 @@ def main() -> None:
    results.append(bench_executemany_1000_rows_in_txn())
    results.append(bench_cold_connect_disconnect())

    # Phase 36 — scaling comparison. Skip 100k cases when --short is
    # passed (e.g., for fast smoke runs); otherwise run all sizes.
    short = "--short" in sys.argv
    sizes = [1_000, 10_000] if short else [1_000, 10_000, 100_000]
    for n in sizes:
        results.append(bench_executemany_scaling(n))
    for n in sizes:
        results.append(bench_select_scaling(n))

    # Emit machine-parseable lines on stdout. Reporting median (not
    # mean) and IQR (not stddev) so a single outlier round can't
    # dominate the comparison numbers — mirrors pytest-benchmark's