Ryan Malloy a9e1f17bae Phase 31: Head-to-head benchmark vs IfxPy (the C-bound PyPI driver)

Adds a paired benchmark of informix-db (pure Python) against IfxPy
3.0.5 (IBM's C-bound driver via OneDB ODBC) on identical workloads
against the same Informix dev container.

Headline result: pure Python is competitive — and faster on 2/5
benchmarks where wire round-trip dominates over codec/marshaling.

| Benchmark | IfxPy | informix-db | Result |
|---|---:|---:|---:|
| select_one_row (single-row latency) | 128 us | 116 us | informix-db 9% faster |
| select_systables_first_10 | 126 us | 184 us | IfxPy 32% faster |
| select_bench_table_all (1k rows) | 969 us | 855 us | informix-db 12% faster |
| executemany(1000) in txn | 21.5 ms | 30.8 ms | IfxPy 30% faster |
| cold_connect_disconnect | 11.0 ms | 10.9 ms | comparable |
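
The Result column can be rechecked from the two timing columns. A minimal sketch, assuming "N% faster" means the winning driver's time is N% below the slower driver's time (an assumption about the convention used here); the helper name `percent_less_time` is hypothetical:

```python
# Timings copied from the table above, in microseconds: (IfxPy, informix-db).
TIMINGS_US = {
    "select_one_row": (128, 116),
    "select_systables_first_10": (126, 184),
    "select_bench_table_all": (969, 855),
    "executemany(1000) in txn": (21_500, 30_800),
    "cold_connect_disconnect": (11_000, 10_900),
}


def percent_less_time(ifxpy_us, ours_us):
    """Return (winner, percent), where percent is how much less time the winner took."""
    slower, faster = max(ifxpy_us, ours_us), min(ifxpy_us, ours_us)
    winner = "IfxPy" if ifxpy_us < ours_us else "informix-db"
    return winner, round(100 * (slower - faster) / slower)


for name, (ifxpy_us, ours_us) in TIMINGS_US.items():
    winner, pct = percent_less_time(ifxpy_us, ours_us)
    print(f"{name}: {winner} took {pct}% less time")
```

Under that convention, cold_connect_disconnect comes out at about 1%, which matches the "comparable" label.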

Why the surprising wins: IfxPy's path is Python -> OneDB ODBC ->
libifdmr -> wire. Ours is Python -> wire. When wire round-trip
dominates (single-row, bulk fetch), the missing abstraction layer
makes us faster. When per-row marshaling dominates (executemany),
IfxPy's C-level execute(stmt, tuple) beats Python BIND-PDU build.

Files added under tests/benchmarks/compare/:
* Dockerfile.ifxpy — Ubuntu 20.04 base with IfxPy + OneDB drivers
* ifxpy_bench.py — IfxPy benchmark workloads matching test_*_perf.py
* README.md — methodology, results, install gauntlet, reproduction

The IfxPy install gauntlet itself is part of the comparison story:
modern Python 3.11 (not 3.13), setuptools <58, permissive CFLAGS,
manual download of 92MB OneDB ODBC tarball, four LD_LIBRARY_PATH
directories, libcrypt.so.1 (deprecated 2018, missing on Arch /
Fedora 35+ / RHEL 9). Versus our `pip install informix-db`.

README.md (project root): added "Compared to IfxPy" section under
Performance with the headline numbers and a pointer to the full
methodology.

.gitignore: keep Dockerfile/script/README under tests/benchmarks/
compare/, exclude the 92MB OneDB tarball and the local venv.

2026-05-05 11:41:47 -06:00

| File | Last commit | Date |
|---|---|---|
| `compare` | Phase 31: Head-to-head benchmark vs IfxPy (the C-bound PyPI driver) | 2026-05-05 11:41:47 -06:00 |
| `__init__.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |
| `baseline.json` | Phase 25: Branch reorder + invariant tripwires (2026.05.04.10) | 2026-05-04 23:34:05 -06:00 |
| `conftest.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |
| `README.md` | Phase 21.1: executemany perf - it was the autocommit cliff (2026.05.04.6) | 2026-05-04 17:26:16 -06:00 |
| `test_async_perf.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |
| `test_codec_perf.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |
| `test_insert_perf.py` | Phase 21.1: executemany perf - it was the autocommit cliff (2026.05.04.6) | 2026-05-04 17:26:16 -06:00 |
| `test_pool_perf.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |
| `test_select_perf.py` | Phase 21: Performance benchmarks (2026.05.04.5) | 2026-05-04 17:21:12 -06:00 |

README.md

Benchmarks (Phase 21)

Performance baselines for informix-db. Two layers:

Codec micro-benchmarks (test_codec_perf.py) — pure CPU, no server. These set the ceiling for what end-to-end can achieve. Run with make bench-codec. Suitable for CI's pre-merge job.
End-to-end benchmarks — exercise the full PREPARE → BIND → EXECUTE → FETCH → CLOSE → RELEASE round-trip. Need an Informix container (make ifx-up). Run with make bench.
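
A codec micro-benchmark of the first kind needs no server at all. The sketch below uses the standard-library `timeit` module rather than the project's pytest-based harness, and `decode_int` is a hypothetical stand-in (a 4-byte big-endian integer decode), not the driver's actual codec:

```python
import struct
import timeit

_INT32_BE = struct.Struct(">i")  # assumed layout: 4-byte big-endian signed int


def decode_int(raw: bytes) -> int:
    """Hypothetical stand-in for a per-cell integer decode."""
    return _INT32_BE.unpack(raw)[0]


def mean_ns_per_call(func, arg, number=200_000, repeat=5):
    """Mean cost of func(arg) in nanoseconds, taken from the fastest of `repeat` runs."""
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(timings) / number * 1e9


if __name__ == "__main__":
    payload = _INT32_BE.pack(42)
    print(f"decode_int: {mean_ns_per_call(decode_int, payload):.0f} ns per call")
```

Taking the fastest run reduces scheduler noise, which matters when the per-call cost is only a few hundred nanoseconds.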

Headline numbers (baseline 2026-05-04, x86_64 Linux, dev container on loopback)

| Operation | Mean | Ops/sec |
|---|---:|---:|
| `decode(int)` (per cell) | 181 ns | 5.5M |
| `parse_tuple_payload(5 cols)` (per row) | 2.87 µs | 350K |
| `encode_param(int)` (per param) | 103 ns | 9.7M |
| `SELECT 1` round-trip | 177 µs | 5,650 |
| Pool acquire + tiny query + release | 295 µs | 3,400 |
| Cold connect + close (login handshake) | 11.2 ms | 89 |
| 1000-row SELECT * | 1.56 ms | 640 |
| INSERT (single, prepared) | 1.88 ms | 530 |
| `executemany(100)` autocommit=True | 181 ms | ~550 rows/sec |
| `executemany(1000)` autocommit=True | 1.72 s | ~580 rows/sec |
| `executemany(1000)` in single transaction | 32 ms | ~31,000 rows/sec |

What these tell you

Pool gives a roughly 38× speedup over cold connect (295 µs for pooled acquire, query, and release vs 11.2 ms for connect and close). If your app opens a connection per request, fix that first.
Wrap bulk INSERTs in a transaction. That's a 53× speedup over autocommit=True. With autocommit on, each row forces the server to flush its transaction log; in transaction mode the flush happens once at COMMIT. Per-row cost drops from 1.72 ms (storage-bound) to 32 µs (pure protocol). PEP 249's autocommit-off default was designed for this, and informix-db follows it by defaulting to autocommit=False.
Codec is not the bottleneck. Per-row decode (2.9 µs) is about 60× faster than a wire round-trip (177 µs for SELECT 1), and per-cell decode (181 ns) is about 1000× faster. Network and server-side cost dominate.
UTF-8 carries negligible cost. decode_varchar_utf8 runs at 216 ns vs decode_varchar_short at 170 ns; the 27% delta (46 ns per cell) is the multibyte string walk inherent in UTF-8 decoding, not Phase 20 overhead.
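
The ratios behind these points can be recomputed from the headline table. A small sketch, with the means copied from the table and converted to seconds; the constant names and the `ratio` helper are only illustrative:

```python
# Means copied from the headline table, in seconds.
COLD_CONNECT = 11.2e-3
POOL_ACQUIRE_QUERY_RELEASE = 295e-6
EXECUTEMANY_1000_AUTOCOMMIT = 1.72
EXECUTEMANY_1000_IN_TXN = 32e-3
SELECT_1_ROUND_TRIP = 177e-6
DECODE_ROW_5_COLS = 2.87e-6
DECODE_INT_CELL = 181e-9


def ratio(slow: float, fast: float) -> float:
    """How many times longer `slow` takes than `fast`."""
    return slow / fast


comparisons = {
    "cold connect vs pooled query": (COLD_CONNECT, POOL_ACQUIRE_QUERY_RELEASE),
    "autocommit vs single transaction": (EXECUTEMANY_1000_AUTOCOMMIT, EXECUTEMANY_1000_IN_TXN),
    "round-trip vs per-row decode": (SELECT_1_ROUND_TRIP, DECODE_ROW_5_COLS),
    "round-trip vs per-cell decode": (SELECT_1_ROUND_TRIP, DECODE_INT_CELL),
}
for label, (slow, fast) in comparisons.items():
    print(f"{label}: {ratio(slow, fast):.0f}x")

# Per-row cost of the two executemany(1000) modes, in microseconds.
print(f"autocommit per row: {EXECUTEMANY_1000_AUTOCOMMIT / 1000 * 1e6:.0f} us")
print(f"in-txn per row: {EXECUTEMANY_1000_IN_TXN / 1000 * 1e6:.0f} us")
```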

Performance gotchas

autocommit=True + executemany is the slowest reasonable pattern. Use it only when each row genuinely needs to land independently. For bulk loads, keep the default autocommit=False and call conn.commit() at the end of the batch.
Single INSERT in a tight loop is 1.88 ms each — strictly worse than executemany (which saves PREPARE/RELEASE overhead). If you find yourself looping over cur.execute("INSERT...") hundreds of times, switch to executemany.
Cold connect is 11 ms. The login handshake is expensive compared to anything you'll do with the connection. Pool everything in long-lived processes.
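
The bulk-load pattern recommended above can be sketched with any PEP 249 connection. To keep the example runnable without an Informix server, it uses the standard-library `sqlite3` module as a stand-in; the `bench` table, the `bulk_insert` helper, and the `?` placeholder style (which is sqlite3's) are assumptions for illustration, not taken from informix-db:

```python
import sqlite3


def bulk_insert(conn, rows):
    """Insert all rows in one transaction, with a single commit at the end."""
    cur = conn.cursor()
    try:
        cur.executemany("INSERT INTO bench (id, name) VALUES (?, ?)", rows)
        conn.commit()  # one commit for the whole batch, not one per row
    except Exception:
        conn.rollback()  # leave the table unchanged if any row fails
        raise
    finally:
        cur.close()
    return len(rows)


# sqlite3 stands in for the Informix connection so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bench (id INTEGER PRIMARY KEY, name TEXT)")
rows = [(i, f"row-{i}") for i in range(1000)]
inserted = bulk_insert(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM bench").fetchone()[0]
print(inserted, count)
```

The rollback branch keeps a failed batch all-or-nothing, which is the other practical benefit of loading inside one transaction.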

Regression policy

baseline.json is committed and represents the dev-container baseline. Compare a current run against it with:

uv run pytest tests/benchmarks/ -m benchmark --benchmark-only \
    --benchmark-compare=tests/benchmarks/baseline.json \
    --benchmark-compare-fail=mean:25%

A 25% mean-regression fails the run. Adjust the threshold per CI noise profile. CI's loopback-network-on-shared-runner is noisier than dev container on a quiet box — start permissive and tighten as you collect runs.
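
To make the threshold concrete, the check can be approximated in a few lines. This is a simplified sketch, not pytest-benchmark's own comparison logic; it assumes the saved JSON has a top-level `benchmarks` list whose entries carry `name` and `stats.mean`, and the benchmark names and numbers below are made up:

```python
def find_regressions(baseline: dict, current: dict, max_increase: float = 0.25):
    """Return {name: fractional_increase} for benchmarks whose mean grew past the limit."""
    base_means = {b["name"]: b["stats"]["mean"] for b in baseline["benchmarks"]}
    regressions = {}
    for bench in current["benchmarks"]:
        base = base_means.get(bench["name"])
        if base is None:
            continue  # new benchmark, nothing to compare against
        increase = (bench["stats"]["mean"] - base) / base
        if increase > max_increase:
            regressions[bench["name"]] = increase
    return regressions


# Illustrative data only; names and timings are invented for the example.
baseline = {"benchmarks": [
    {"name": "test_select_one", "stats": {"mean": 177e-6}},
    {"name": "test_pool_acquire", "stats": {"mean": 295e-6}},
]}
current = {"benchmarks": [
    {"name": "test_select_one", "stats": {"mean": 190e-6}},    # about +7%, within limit
    {"name": "test_pool_acquire", "stats": {"mean": 400e-6}},  # about +36%, regression
]}
regressions = find_regressions(baseline, current)
for name, increase in regressions.items():
    print(f"{name}: mean up {increase:.0%}")
```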

Updating the baseline

When you intentionally change performance (an optimization, or a regression accepted for correctness), refresh:

make bench-save                                 # writes .results/0001_run.json
cp tests/benchmarks/.results/Linux-CPython-*/0001_run.json tests/benchmarks/baseline.json
git add tests/benchmarks/baseline.json

Document the change in CHANGELOG so reviewers know why the floor moved.

Files

test_codec_perf.py — codec dispatch (decode, encode_param, parse_tuple_payload)
test_select_perf.py — SELECT round-trips, single + multi-row
test_insert_perf.py — INSERT single + executemany throughput
test_pool_perf.py — cold connect vs pool acquire/release
test_async_perf.py — async-path latency + concurrent throughput
conftest.py — long-lived bench_conn and 1k-row bench_table fixtures
baseline.json — committed baseline for regression comparison
.results/ — gitignored; per-run output from make bench-save
