The serial-loop executemany paid one wire round-trip per row (~30 µs/row on loopback). It was the one benchmark where IfxPy beat us in the comparison work: 10% slower at executemany(1000) in a transaction. Phase 33 pipelines the BIND+EXECUTE PDUs: build all N PDUs, send them back-to-back, then drain all N responses. This eliminates the per-row RTT entirely.

Performance impact:

* executemany(1000) in txn: 31.3 ms -> 11.0 ms (2.85x faster)
* executemany(100) autocommit: 173 ms -> 154 ms (11% faster)
* executemany(1000) autocommit: 1740 ms -> 1590 ms (9% faster)

(Autocommit gets smaller wins because server-side log flushes dominate - Phase 21.1's "autocommit cliff".)

The IfxPy comparison flipped: from 10% slower to 2.05x faster on bulk inserts. We now win all 5 head-to-head benchmarks against the C-bound driver.

Margaret Hamilton review surfaced one CRITICAL concern (C1): the pipeline assumes Informix sends N responses for N pipelined PDUs even when one fails. If the server cut the stream short, the drain loop would deadlock on the next read. Verified by 3 new integration tests in tests/test_executemany_pipeline.py:

* test_pipelined_executemany_mid_batch_constraint_violation (row 500/1000)
* test_pipelined_executemany_first_row_fails (row 0/100)
* test_pipelined_executemany_last_row_fails (row 99/100)

All confirm Informix sends N responses; the wire stays aligned; the connection is usable afterward.

Plus 4 lower-priority fixes Hamilton recommended:

* H1: documented the _raise_sq_err self-drains-SQ_EOT invariant + tripwire
* H2: docstring warning about O(N) lock duration; chunk for huge batches
* M1: prepend the row index to the exception message rather than reformatting it
* M2: documented the sendall-no-timeout caveat on hostile networks

77 unit + 239 integration + 33 benchmark = 349 tests; ruff clean.

Note: Phase 32 (Tier 1+2 benchmarks) was tagged without bumping pyproject.toml's version string. .5 was git-tag-only; .6 is the next published version increment.
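H2's "chunk for huge batches" advice can be sketched as a thin caller-side wrapper. This is illustrative only: the helper name and the cursor protocol here are assumptions, not part of the driver's API.

```python
def executemany_chunked(cursor, sql, rows, chunk_size=10_000):
    """Split a huge batch into bounded pipelines, so the connection lock
    is held for O(chunk_size) per call instead of O(N) for the whole batch."""
    for start in range(0, len(rows), chunk_size):
        cursor.executemany(sql, rows[start:start + chunk_size])
```

The caller picks `chunk_size` to trade lock fairness (other threads on the connection) against per-chunk round-trip overhead.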
# informix-db vs IfxPy comparison benchmark
Head-to-head benchmarks against IfxPy, the IBM-published C-bound Informix driver, on identical workloads against the same Informix Developer Edition Docker container.
## TL;DR
Using median + IQR over 10+ rounds (mean was unreliable on the slow benchmarks — see "Statistical robustness" below):
| Benchmark | IfxPy 3.0.5 (C-bound) | informix-db (pure Python) | Result |
|---|---|---|---|
| `select_one_row` (single-row latency) | 118 µs | 114 µs | informix-db 3% faster |
| `select_systables_first_10` (~10 rows) | 164 µs | 159 µs | informix-db 3% faster |
| `select_bench_table_all` (1000-row fetch) | 984 µs | 891 µs | informix-db 9% faster |
| `executemany(1000)` in transaction (bulk write) | 21.4 ms | 10.4 ms | informix-db 2.05× faster |
| `cold_connect_disconnect` (login handshake) | 11.0 ms | 10.4 ms | informix-db 5% faster |
informix-db wins all 5 benchmarks against the C-bound driver, including a 2× win on bulk inserts.
The bulk-insert win comes from Phase 33's pipelined executemany: all N BIND+EXECUTE PDUs are sent to the wire before any response is drained, eliminating the per-row round-trip latency that the older serial loop (and IfxPy's per-call API) incur. The wire-alignment assumption that makes this safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by tests/test_executemany_pipeline.py (constraint violation at row 0/100, 99/100, 500/1000).
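The drain side of that invariant can be illustrated with a self-contained sketch (the `Response` and `drain_pipeline` names are hypothetical stand-ins, not the driver's internals): even when a mid-batch row fails, all N responses are consumed, so the connection stays usable for the next statement.

```python
from dataclasses import dataclass

@dataclass
class Response:
    is_error: bool = False
    message: str = ""

def drain_pipeline(read_response, n):
    """Consume exactly n responses, remembering the first error.

    Draining all n is what keeps the wire aligned: the next statement
    on this connection starts from a clean stream even after a failed row.
    """
    first_error = None
    for i in range(n):
        resp = read_response()
        if resp.is_error and first_error is None:
            first_error = (i, resp.message)  # remember the row index, keep draining
    return first_error

# Simulate a 5-row pipeline where row 2 hits a constraint violation:
responses = iter([Response(), Response(), Response(True, "constraint"),
                  Response(), Response()])
error = drain_pipeline(lambda: next(responses), 5)
print(error)  # (2, 'constraint') -- and all 5 responses were consumed
```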
## Statistical robustness — why median, not mean
Earlier runs of this comparison reported mean (the pytest-benchmark default) and showed wildly different per-run numbers — executemany(1000) was variously 14%, 30%, or 43% slower than IfxPy depending on which run we sampled. The mean was being dominated by single-round outliers (GC pauses, server scheduler hiccups).
Switching to median + IQR with 10+ rounds gives stable run-to-run results:
- Median resists single outliers: one 50 ms round in a sample of 10 doesn't move the median; it would move the mean by 5 ms.
- IQR (Q3 – Q1) is the noise estimator: directly comparable across drivers. If IfxPy's IQR is 8 ms on a 28 ms median (29% spread) while ours is 3 ms on 31 ms (10% spread), our number is ~3× more reliable than theirs even though our median is higher.
- 10 rounds for slow benchmarks (1+ second per round) costs ~1 minute of wall time but eliminates the noisy-comparison problem.
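To make the arithmetic above concrete, here is a minimal illustration with synthetic numbers (in the spirit of the 50 ms outlier example, not measured data):

```python
import statistics

# Nine typical ~1 ms rounds plus one 50 ms outlier (a GC pause, say)
samples_ms = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 1.1, 0.9, 1.0, 50.0]

print(statistics.mean(samples_ms))    # 5.92 -- dragged up ~5 ms by one round
print(statistics.median(samples_ms))  # 1.0  -- unmoved by the outlier
q1, _, q3 = statistics.quantiles(samples_ms, n=4)
print(q3 - q1)                        # IQR stays small despite the outlier
```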
Both tests/benchmarks/test_*_perf.py (host-side, pytest-benchmark) and ifxpy_bench.py (container-side, a hand-rolled time.perf_counter measurement loop) report median + IQR, so the numbers are directly comparable across drivers.
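The container-side measurement loop has roughly this shape (a simplified sketch in the spirit of ifxpy_bench.py, not its actual code):

```python
import statistics
import time

def bench(fn, rounds=10):
    """Time fn over `rounds` runs; return (median, IQR) in seconds.

    Requires rounds >= 2 so statistics.quantiles has enough data points.
    """
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return median, q3 - q1
```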
## What this means
Conventional wisdom says C beats Python at I/O drivers. Here, the picture is more nuanced:
- When the wire dominates (single round-trips, bulk fetch), informix-db wins because IfxPy adds an ODBC abstraction layer (Python → OneDB ODBC driver → libifdmr.so → wire) where we go direct (Python → wire).
- When per-row marshaling dominates (executemany, wider tuple construction), IfxPy's C-level `execute(stmt, tuple)` is faster than our Python BIND-PDU build; this was the one benchmark IfxPy won until Phase 33's pipelining removed the per-row round-trip.
- When the wire handshake dominates (cold connect), they tie because both drivers wait ~11 ms for the server's login response.
The takeaway is that pure-Python doesn't mean "performance compromise" — it means different overhead distribution. For most application workloads (web requests doing a handful of small queries), the wire round-trip is what matters, and the abstraction-layer overhead IfxPy carries means informix-db is typically the same speed or faster.
## Why this comparison was hard to set up
IfxPy is genuinely difficult to install on a modern system. Capturing the install gauntlet for the record:
| Step | Detail |
|---|---|
| 1. Pin Python 3.11 | Python 3.13 fails: IfxPy's setup.py uses use_2to3, removed from setuptools 58 (October 2021). |
| 2. Pin setuptools <58 | Same root cause. |
| 3. CFLAGS hack | GCC 11+ (default since 2021) escalates the C extension's pointer-type warnings to errors. Need CFLAGS="-Wno-incompatible-pointer-types -Wno-error" to demote them. |
| 4. Download OneDB ODBC drivers | A 92 MB tarball from hcl-onedb.github.io/odbc/. The pip install only fetches headers — the runtime libs are a separate, undocumented download. |
| 5. Set INFORMIXDIR + LD_LIBRARY_PATH | Across four directories (lib/, lib/cli/, lib/esql/, gls/dll/). |
| 6. Install libcrypt.so.1 | The OneDB drivers link against the libcrypt.so.1 ABI (deprecated in 2018, replaced by libcrypt.so.2). Ubuntu 20.04 still ships it natively; modern Arch / Fedora 35+ / RHEL 9 ship only libcrypt.so.2 and need a compatibility shim such as libxcrypt-compat. |
| 7. Build runtime container | We use Dockerfile.ifxpy here because Ubuntu 20.04 is the most recent base distro that still ships libcrypt.so.1 natively. |
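Steps 3 and 5 boil down to environment configuration along these lines. The install prefix is an illustrative assumption: adjust it to wherever the OneDB tarball was unpacked.

```shell
# Illustrative prefix only -- substitute your OneDB extraction path.
export INFORMIXDIR=/opt/onedb-odbc
export LD_LIBRARY_PATH="$INFORMIXDIR/lib:$INFORMIXDIR/lib/cli:$INFORMIXDIR/lib/esql:$INFORMIXDIR/gls/dll${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Step 3: demote GCC 11+ pointer-type errors back to warnings
# before pip-building the IfxPy C extension.
export CFLAGS="-Wno-incompatible-pointer-types -Wno-error"
```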
By contrast, informix-db's install is pip install informix-db. No external downloads, no system packages, no LD_LIBRARY_PATH, no Docker required.
## Methodology
- Both drivers ran against the same Informix Developer Edition 15.0.1.0.3DE Docker container (`informix-db-test` from `tests/docker-compose.yml`).
- The host runs Arch Linux on x86_64; the IfxPy container runs Ubuntu 20.04 on x86_64. Both reach the server through the loopback path (the host's `127.0.0.1:9088` for informix-db; `--network=host` for the IfxPy container).
- Each benchmark runs 100/20/3 rounds depending on per-iteration cost; we report the median + IQR (see "Statistical robustness" above). Within-run spread is small (under 5%) for all reported numbers, so jitter doesn't affect the qualitative result.
- Workloads are matched semantically: same SQL, same row counts, same fetch patterns. Where the APIs differ (IfxPy's `IfxPy.fetch_tuple` vs. our `cursor.fetchall`), we use whichever idiom exhausts the cursor in each driver.
## Reproduce
From the project root:
```shell
# 1. Start the dev Informix container
make ifx-up

# 2. Seed the 1k-row test table on the host (using informix-db)
uv run python -c "
import informix_db, contextlib
conn = informix_db.connect(host='127.0.0.1', port=9088,
                           user='informix', password='in4mix',
                           database='sysmaster', server='informix', autocommit=True)
cur = conn.cursor()
with contextlib.suppress(Exception): cur.execute('DROP TABLE p21_bench')
cur.execute('CREATE TABLE p21_bench (id INT, name VARCHAR(64), counter INT, value FLOAT, created DATE)')
cur.executemany('INSERT INTO p21_bench VALUES (?, ?, ?, ?, ?)',
                [(i, f'row_{i:04d}', i*7, float(i)*1.5, None) for i in range(1000)])
conn.close()
"

# 3. Build + run the IfxPy benchmark container
docker build -f tests/benchmarks/compare/Dockerfile.ifxpy \
    -t ifxpy-bench tests/benchmarks/compare/
docker run --rm --network=host ifxpy-bench

# 4. Run informix-db benchmarks for the matched comparison
uv run pytest tests/benchmarks/test_select_perf.py \
    tests/benchmarks/test_pool_perf.py \
    tests/benchmarks/test_insert_perf.py \
    -m benchmark --benchmark-only --benchmark-warmup=on
```
## Files
- `Dockerfile.ifxpy` — Ubuntu 20.04 container with Python 3.9, IfxPy, and OneDB drivers installed
- `ifxpy_bench.py` — IfxPy benchmark workloads (mirrors `tests/benchmarks/test_*_perf.py`)
- This README
## Caveats
- IfxPy 3.0.5 is the latest PyPI version (from October 2020). It's the most actively-maintained C-bound option but hasn't shipped a release in ~5 years.
- Numbers will vary by host, distro, kernel, network stack — re-run on your own hardware before drawing strong conclusions.
- The 1k-row INSERT benchmark uses different APIs (IfxPy's `prepare` + `execute` loop vs our `executemany`); the comparison is by total wall-clock time for the equivalent workload, not by per-call overhead.