Ryan Malloy 270155d2de Phase 36: IfxPy scaling comparison + honest comparison numbers (2026.05.05.9)
Extends the IfxPy comparison bench script with scaling workloads
(1k/10k/100k rows for both executemany and SELECT). Re-runs the
full comparison with consistent measurement methodology and updates
the README with the actually-correct numbers.

Earlier comparison runs reported informix-db winning all 5
benchmarks. Re-running select_bench_table_all with consistent
measurement gives 3.04 ms, not the 891 us I cited earlier - a
3.4x discrepancy attributable to noisy warmup + small-fixture
artifacts. The "we win everything" framing was wrong.

Corrected comparison reveals two clear stories:

Bulk-insert: pure-Python wins 1.6x at scale.
  executemany(10k):  IfxPy 259ms  -> us 161ms (1.6x faster)
  executemany(100k): IfxPy 2376ms -> us 1487ms (1.6x faster)
Reason: Phase 33's pipelining eliminates per-row RTT. IfxPy's
per-call API can't pipeline.

Large-fetch: IfxPy wins 2.3-2.4x at scale.
  SELECT 1k rows:   IfxPy 1.2ms  / us 2.7ms (IfxPy 2.3x)
  SELECT 10k rows:  IfxPy 11.3ms / us 25.8ms (IfxPy 2.3x)
  SELECT 100k rows: IfxPy 112ms  / us 271ms (IfxPy 2.4x)
Reason: C-level fetch_tuple at ~1.1us/row beats Python
parse_tuple_payload at ~2.7us/row. Real C-vs-Python codec gap
showing up at scale.

For everyday workloads (single SELECT in a request, INSERT a
handful of rows), drivers are within 5-25%. For workloads where
the gap widens, direction depends on what you're doing - bulk-
write favors us, bulk-read favors IfxPy.

README's "Compared to IfxPy" section rewritten with the corrected
numbers and an honest "when to prefer which" subsection.
tests/benchmarks/compare/README.md mirror updated.

Net narrative: a "faster at bulk-write, slower at bulk-read,
comparable elsewhere" comparison story is more honest and more
durable than a "we win everything" claim that would have collapsed
the first time a user ran their own benchmark.

Side note (lint): one ambiguous unicode `×` in cursors.py replaced
with `x`.

Phase 37 ticket: parse_tuple_payload is the bottleneck at scale.
Closing the 1.6 us/row gap to IfxPy would make us competitive on
bulk-fetch too. Possible approaches: Cython codec, deeper inlining,
per-column dispatch pre-bake.


# informix-db vs IfxPy comparison benchmark

Head-to-head benchmarks of informix-db against IfxPy, the IBM-published C-bound Informix driver, running identical workloads on the same Informix Developer Edition Docker container.

## TL;DR

Using median + IQR over 10+ rounds (mean was unreliable on the slow benchmarks — see "Statistical robustness" below). Phase 36 added scaling benchmarks at 1k / 10k / 100k rows so the comparison shape is clearer:

| Benchmark                  | IfxPy 3.0.5 | informix-db | Result                   |
|----------------------------|-------------|-------------|--------------------------|
| select_one_row             | 118 µs      | 114 µs      | comparable               |
| select_systables_first_10  | 130 µs      | 159 µs      | IfxPy 22% faster         |
| cold_connect_disconnect    | 11.0 ms     | 10.5 ms     | comparable               |
| executemany(1k) in txn     | 23.5 ms     | 23.2 ms     | tied                     |
| executemany(10k) in txn    | 259 ms      | 161 ms      | informix-db 1.6× faster  |
| executemany(100k) in txn   | 2376 ms     | 1487 ms     | informix-db 1.6× faster  |
| SELECT 1k rows             | 1.2 ms      | 2.7 ms      | IfxPy 2.3× faster        |
| SELECT 10k rows            | 11.3 ms     | 25.8 ms     | IfxPy 2.3× faster        |
| SELECT 100k rows           | 112 ms      | 271 ms      | IfxPy 2.4× faster        |

Two clear stories:

1. Bulk insert: informix-db wins 1.6× at scale. The pipelined executemany (Phase 33) sends all N BIND+EXECUTE PDUs to the wire before draining responses, eliminating per-row RTT. IfxPy still pays one synchronous round-trip per IfxPy.execute(stmt, tuple) call — that's ~24 µs/row regardless of N. We pay ~15 µs/row at scale (the prepare/release overhead amortizes better at larger N).
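
The effect is easy to see in a toy model. Here is a minimal sketch (not the driver's actual wire code; `FakeServer`, the 0.5 ms RTT, and the PDU bytes are all hypothetical stand-ins) of the two send/receive disciplines:

```python
import time

RTT = 0.0005  # hypothetical 0.5 ms network round trip

class FakeServer:
    """Toy wire model: a response becomes readable one RTT after
    its request was sent."""
    def __init__(self):
        self._ready_at = []

    def send(self, pdu: bytes):
        self._ready_at.append(time.perf_counter() + RTT)  # writes don't block

    def recv(self) -> bytes:
        deadline = self._ready_at.pop(0)
        while time.perf_counter() < deadline:  # stands in for a blocking read
            pass
        return b"ok"

def per_call(srv, pdus):       # IfxPy-shaped: send one, wait one, repeat
    for pdu in pdus:
        srv.send(pdu)
        srv.recv()

def pipelined(srv, pdus):      # Phase-33-shaped: send all N, then drain N
    for pdu in pdus:
        srv.send(pdu)
    for _ in pdus:
        srv.recv()

for fn in (per_call, pipelined):
    srv, pdus = FakeServer(), [b"BIND+EXECUTE"] * 1000
    t0 = time.perf_counter()
    fn(srv, pdus)
    print(f"{fn.__name__}: {(time.perf_counter() - t0) * 1e3:.1f} ms")
    # per_call pays ~1000 RTTs (~500 ms); pipelined pays ~1 RTT (~0.5 ms)
```

In the real drivers the server still does per-row work, so the measured gap is 1.6× rather than the toy model's orders of magnitude.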

2. Large fetch: IfxPy wins 2.3-2.4× at scale. Their C-level fetch_tuple decoder runs at ~1.1 µs/row; our pure-Python parse_tuple_payload runs at ~2.7 µs/row. At 100k rows, the 1.6 µs/row gap accumulates into a 160 ms wall-clock difference. This is the C-vs-Python codec cost showing up at scale, where it actually matters.
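
You can reproduce the order of magnitude of that per-row Python cost without a server. A hedged micro-benchmark (the row layout and decoder here are illustrative, not the actual parse_tuple_payload):

```python
import struct
import time

# Illustrative fixed-layout row: INT, CHAR(10), INT, FLOAT (big-endian)
ROW = struct.pack(">i10sif", 42, b"row_0042  ", 294, 63.0)

def decode_row(buf: bytes) -> tuple:
    # One struct call per column, as a naive pure-Python decoder does
    rid = struct.unpack_from(">i", buf, 0)[0]
    name = struct.unpack_from(">10s", buf, 4)[0].rstrip().decode("ascii")
    counter = struct.unpack_from(">i", buf, 14)[0]
    value = struct.unpack_from(">f", buf, 18)[0]
    return (rid, name, counter, value)

N = 100_000
t0 = time.perf_counter()
rows = [decode_row(ROW) for _ in range(N)]
dt = time.perf_counter() - t0
print(f"{dt / N * 1e6:.2f} µs/row over {N} rows")
# Expect low single-digit µs/row on current hardware; a C decoder
# doing the same work typically lands around 1 µs/row.
```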

For everyday-application workloads (single SELECT in a request, INSERT a handful of rows, transactional UPDATE), the two drivers are within 5-25% of each other. For the workloads where the gap widens, the direction depends on what you're doing — bulk-write favors us, bulk-read favors IfxPy.

The wire-alignment assumption that makes pipelined executemany safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by tests/test_executemany_pipeline.py (constraint violation at row 0/100, 99/100, 500/1000).
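
A sketch of the shape of those tests (the real assertions live in tests/test_executemany_pipeline.py; the `conn` fixture, the table name, and `informix_db.Error` as the raised type are assumptions here):

```python
import pytest
import informix_db

@pytest.mark.parametrize("bad_row, n_rows", [(0, 100), (99, 100), (500, 1000)])
def test_wire_stays_aligned_after_mid_batch_failure(conn, bad_row, n_rows):
    cur = conn.cursor()
    cur.execute("CREATE TABLE t_pipe (id INT PRIMARY KEY)")
    cur.execute("INSERT INTO t_pipe VALUES (?)", (bad_row,))  # pre-seed the duplicate
    with pytest.raises(informix_db.Error):
        # Row `bad_row` violates the primary key; the other N-1 PDUs
        # are already on the wire when the error response arrives.
        cur.executemany("INSERT INTO t_pipe VALUES (?)",
                        [(i,) for i in range(n_rows)])
    # If the driver mis-counted responses, this next round trip would
    # read a stale PDU; getting a sane answer proves alignment held.
    cur.execute("SELECT COUNT(*) FROM t_pipe")
    assert cur.fetchone()[0] >= 1
    cur.execute("DROP TABLE t_pipe")
```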

## Statistical robustness — why median, not mean

Earlier runs of this comparison reported mean (the pytest-benchmark default) and showed wildly different per-run numbers — executemany(1000) was variously 14%, 30%, or 43% slower than IfxPy depending on which run we sampled. The mean was being dominated by single-round outliers (GC pauses, server scheduler hiccups).

Switching to median + IQR with 10+ rounds gives stable run-to-run results:

- Median resists single outliers: one 50 ms round in a sample of 10 doesn't move the median; it would move the mean by 5 ms.
- IQR (Q3 − Q1) is the noise estimator: directly comparable across drivers. If IfxPy's IQR is 8 ms on a 28 ms median (29% spread) while ours is 3 ms on a 31 ms median (10% spread), our number is ~3× more reliable than theirs even though our median is higher.
- 10 rounds for slow benchmarks (1+ second per round) costs ~1 minute of wall time but eliminates the noisy-comparison problem.
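
The first bullet, as a three-line sanity check (illustrative values):

```python
import statistics

rounds = [3.0] * 9 + [50.0]       # ms: nine clean rounds, one GC-pause outlier
print(statistics.mean(rounds))    # 7.7  -- dragged ~4.7 ms by one round
print(statistics.median(rounds))  # 3.0  -- unmoved
```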

Both tests/benchmarks/test_*_perf.py (host-side, pytest-benchmark) and ifxpy_bench.py (container-side, hand-rolled time.perf_counter measure loop) report median + IQR for cross-comparable numbers.
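
For the container side, a minimal sketch of such a loop (the real one is ifxpy_bench.py; `workload` stands for any zero-argument benchmark callable):

```python
import statistics
import time

def measure(workload, rounds=10, warmup=2):
    for _ in range(warmup):
        workload()                              # discard cold-start rounds
    samples_ms = []
    for _ in range(rounds):
        t0 = time.perf_counter()
        workload()
        samples_ms.append((time.perf_counter() - t0) * 1e3)
    q1, median, q3 = statistics.quantiles(samples_ms, n=4)
    return median, q3 - q1                      # the two numbers the tables report

median, iqr = measure(lambda: sum(i * i for i in range(200_000)))
print(f"{median:.3f} ms (IQR {iqr:.3f} ms)")
```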

## What this means

Conventional wisdom says C beats Python at I/O drivers. Here, the picture is more nuanced:

- When per-row round trips dominate (bulk insert via executemany), informix-db wins: pipelining puts every BIND+EXECUTE on the wire before draining responses, while IfxPy's one-call-per-row API pays a synchronous round trip for each row.
- When per-row decoding dominates (bulk fetch), IfxPy wins: its C-level fetch_tuple (~1.1 µs/row) beats our pure-Python parse_tuple_payload (~2.7 µs/row).
- When a single round trip dominates (small one-off queries), the drivers are within 5-25% of each other: we go direct (Python → wire) where IfxPy adds an ODBC abstraction layer (Python → OneDB ODBC driver → libifdmr.so → wire).
- When the wire handshake dominates (cold connect), they tie because both drivers wait ~11 ms for the server's login response.

The takeaway is that pure-Python doesn't mean "performance compromise" — it means a different overhead distribution. For most application workloads (web requests doing a handful of small queries), the wire round-trip is what matters, and the abstraction-layer overhead IfxPy carries means informix-db is typically the same speed or faster.

## Why this comparison was hard to set up

IfxPy is genuinely difficult to install on a modern system. Capturing the install gauntlet for the record:

| Step | Detail |
|------|--------|
| 1. Pin Python 3.11 | Python 3.13 fails: IfxPy's setup.py uses `use_2to3`, removed from setuptools 58 (October 2021). |
| 2. Pin setuptools <58 | Same root cause. |
| 3. CFLAGS hack | GCC 11+ (default since 2021) escalates the C extension's pointer-type warnings to errors. Need `CFLAGS="-Wno-incompatible-pointer-types -Wno-error"` to demote them. |
| 4. Download OneDB ODBC drivers | A 92 MB tarball from hcl-onedb.github.io/odbc/. The pip install only fetches headers — the runtime libs are a separate, undocumented download. |
| 5. Set INFORMIXDIR + LD_LIBRARY_PATH | Across four directories (lib/, lib/cli/, lib/esql/, gls/dll/). |
| 6. Install libcrypt.so.1 | The OneDB drivers link against the libcrypt-1 ABI (deprecated in 2018, replaced by libcrypt.so.2). Modern Arch / Fedora 35+ / RHEL 9 ship only libcrypt.so.2 and need a compatibility shim such as libxcrypt-compat; Ubuntu 20.04 still ships it natively. |
| 7. Build runtime container | We use Dockerfile.ifxpy here because Ubuntu 20.04 is the most recent base distro that still ships libcrypt.so.1 natively. |

By contrast, informix-db's install is `pip install informix-db`. No external downloads, no system packages, no LD_LIBRARY_PATH, no Docker required.

## Methodology

- Both drivers ran against the same Informix Developer Edition 15.0.1.0.3DE Docker container (informix-db-test from tests/docker-compose.yml).
- The host runs Arch Linux on x86_64; the IfxPy container runs Ubuntu 20.04 on x86_64. Both reach the server through the loopback path (the host's 127.0.0.1:9088 for informix-db; --network=host for the IfxPy container).
- Each benchmark runs 10-100 rounds depending on per-iteration cost; we report median + IQR (see "Statistical robustness" above). Run-to-run spread is small (under 5%) for all reported numbers — within-run jitter doesn't affect the qualitative result.
- Workloads are matched semantically: same SQL, same row counts, same fetch patterns. Where the APIs differ (IfxPy's fetch_tuple loop vs. our cursor.fetchall), we use whichever idiom exhausts the cursor in each driver; both are shown in the sketch below.
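
For reference, the two cursor-exhaustion idioms look roughly like this (a sketch: connection setup omitted, `ifx_conn` and `conn` assumed to be live connections; the IfxPy calls follow its ibm_db-derived module API):

```python
# IfxPy side (runs inside the benchmark container)
import IfxPy
stmt = IfxPy.exec_immediate(ifx_conn, "SELECT * FROM p21_bench")
row = IfxPy.fetch_tuple(stmt)
while row is not False:          # IfxPy returns False when the cursor is done
    row = IfxPy.fetch_tuple(stmt)

# informix-db side (runs on the host)
cur = conn.cursor()
cur.execute("SELECT * FROM p21_bench")
rows = cur.fetchall()            # one call exhausts the cursor
```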

## Reproduce

From the project root:

```sh
# 1. Start the dev Informix container
make ifx-up

# 2. Seed the 1k-row test table on the host (using informix-db)
uv run python -c "
import informix_db, contextlib
conn = informix_db.connect(host='127.0.0.1', port=9088,
    user='informix', password='in4mix',
    database='sysmaster', server='informix', autocommit=True)
cur = conn.cursor()
with contextlib.suppress(Exception): cur.execute('DROP TABLE p21_bench')
cur.execute('CREATE TABLE p21_bench (id INT, name VARCHAR(64), counter INT, value FLOAT, created DATE)')
cur.executemany('INSERT INTO p21_bench VALUES (?, ?, ?, ?, ?)',
    [(i, f'row_{i:04d}', i*7, float(i)*1.5, None) for i in range(1000)])
conn.close()
"

# 3. Build + run the IfxPy benchmark container
docker build -f tests/benchmarks/compare/Dockerfile.ifxpy \
    -t ifxpy-bench tests/benchmarks/compare/
docker run --rm --network=host ifxpy-bench

# 4. Run informix-db benchmarks for the matched comparison
uv run pytest tests/benchmarks/test_select_perf.py \
    tests/benchmarks/test_pool_perf.py \
    tests/benchmarks/test_insert_perf.py \
    -m benchmark --benchmark-only --benchmark-warmup=on
```

## Files

- Dockerfile.ifxpy — Ubuntu 20.04 container with Python 3.9, IfxPy, and the OneDB drivers installed
- ifxpy_bench.py — IfxPy benchmark workloads (mirrors tests/benchmarks/test_*_perf.py)
- This README

## Caveats

- IfxPy 3.0.5 is the latest PyPI version (from October 2020). It's the most actively-maintained C-bound option but hasn't shipped a release in ~5 years.
- Numbers will vary by host, distro, kernel, and network stack — re-run on your own hardware before drawing strong conclusions.
- The 1k-row INSERT benchmark uses different APIs (IfxPy's prepare+execute loop vs. our executemany); the comparison is by total wall-clock time for the equivalent workload, not by per-call overhead.