# `informix-db` vs IfxPy comparison benchmark Head-to-head benchmarks against [IfxPy](https://pypi.org/project/IfxPy/), the IBM-published C-bound Informix driver, on identical workloads against the same Informix Developer Edition Docker container. ## TL;DR Using **median + IQR over 10+ rounds** (mean was unreliable on the slow benchmarks — see "Statistical robustness" below). Phase 36 added scaling benchmarks at 1k / 10k / 100k rows so the comparison shape is clearer: | Benchmark | IfxPy 3.0.5 | informix-db | Result | |---|---:|---:|---:| | `select_one_row` | 118 µs | 114 µs | comparable | | `select_systables_first_10` | 130 µs | 159 µs | IfxPy 22% faster | | `cold_connect_disconnect` | 11.0 ms | 10.5 ms | comparable | | **`executemany(1k)` in txn** | 23.5 ms | 23.2 ms | tied | | **`executemany(10k)` in txn** | 259 ms | **161 ms** | **`informix-db` 1.6× faster** | | **`executemany(100k)` in txn** | 2376 ms | **1487 ms** | **`informix-db` 1.6× faster** | | `SELECT 1k rows` | 1.2 ms | 2.7 ms | IfxPy 2.3× faster | | `SELECT 10k rows` | 11.3 ms | 25.8 ms | IfxPy 2.3× faster | | `SELECT 100k rows` | 112 ms | 271 ms | IfxPy 2.4× faster | **Two clear stories:** **1. Bulk insert: `informix-db` wins 1.6× at scale.** The pipelined `executemany` (Phase 33) sends all N BIND+EXECUTE PDUs to the wire before draining responses, eliminating per-row RTT. IfxPy still pays one synchronous round-trip per `IfxPy.execute(stmt, tuple)` call — that's ~24 µs/row regardless of N. We pay ~15 µs/row at scale (the prepare/release overhead amortizes better at larger N). **2. Large fetch: IfxPy wins 2.3-2.4× at scale.** Their C-level `fetch_tuple` decoder runs at ~1.1 µs/row; our pure-Python `parse_tuple_payload` runs at ~2.7 µs/row. At 100k rows, the 1.6 µs/row gap accumulates into a 160 ms wall-clock difference. **This is the C-vs-Python codec cost showing up at scale, where it actually matters.** For everyday-application workloads (single SELECT in a request, INSERT a handful of rows, transactional UPDATE), the two drivers are within 5-25% of each other. For the workloads where the gap widens, the direction depends on what you're doing — bulk-write favors us, bulk-read favors IfxPy. **The wire-alignment assumption** that makes pipelined `executemany` safe — that Informix sends exactly N responses for N pipelined PDUs even when one row fails — is verified by `tests/test_executemany_pipeline.py` (constraint violation at row 0/100, 99/100, 500/1000). ## Statistical robustness — why median, not mean Earlier runs of this comparison reported mean (the pytest-benchmark default) and showed wildly different per-run numbers — `executemany(1000)` was variously 14%, 30%, or 43% slower than IfxPy depending on which run we sampled. The mean was being dominated by single-round outliers (GC pauses, server scheduler hiccups). Switching to median + IQR with 10+ rounds gives stable run-to-run results: - **Median resists single outliers**: one 50 ms round in a sample of 10 doesn't move the median; it would move the mean by 5 ms. - **IQR (Q3 – Q1) is the noise estimator**: directly comparable across drivers. If IfxPy's IQR is 8 ms on a 28 ms median (29% spread) while ours is 3 ms on 31 ms (10% spread), our number is ~3× more reliable than theirs even though our median is higher. - **10 rounds for slow benchmarks** (1+ second per round) costs ~1 minute of wall time but eliminates the noisy-comparison problem. Both `tests/benchmarks/test_*_perf.py` (host-side, pytest-benchmark) and `ifxpy_bench.py` (container-side, hand-rolled `time.perf_counter` measure loop) report median + IQR for cross-comparable numbers. ## What this means Conventional wisdom says C beats Python at I/O drivers. Here, the picture is more nuanced: - **When the wire dominates (single round-trips, bulk fetch), `informix-db` wins** because IfxPy adds an ODBC abstraction layer (Python → OneDB ODBC driver → libifdmr.so → wire) where we go direct (Python → wire). - **When per-row marshaling dominates (executemany, wider tuple construction), IfxPy wins** because its C-level `execute(stmt, tuple)` is faster than our Python BIND-PDU build. - **When the wire handshake dominates (cold connect), they tie** because both drivers wait ~11 ms for the server's login response. The takeaway is that pure-Python doesn't mean "performance compromise" — it means **different overhead distribution**. For most application workloads (web requests doing a handful of small queries), the wire round-trip is what matters, and the abstraction-layer overhead IfxPy carries means `informix-db` is typically the same speed or faster. ## Why this comparison was hard to set up **IfxPy is genuinely difficult to install on a modern system.** Capturing the install gauntlet for the record: | Step | Detail | |---|---| | 1. Pin Python 3.11 | Python 3.13 fails: IfxPy's `setup.py` uses `use_2to3`, removed from setuptools 58 (October 2021). | | 2. Pin setuptools <58 | Same root cause. | | 3. CFLAGS hack | GCC 11+ (default since 2021) escalates the C extension's pointer-type warnings to errors. Need `CFLAGS="-Wno-incompatible-pointer-types -Wno-error"` to demote them. | | 4. Download OneDB ODBC drivers | A 92 MB tarball from `hcl-onedb.github.io/odbc/`. The `pip install` only fetches headers — the runtime libs are a separate, undocumented download. | | 5. Set INFORMIXDIR + LD_LIBRARY_PATH | Across four directories (`lib/`, `lib/cli/`, `lib/esql/`, `gls/dll/`). | | 6. Install `libcrypt.so.1` | The OneDB drivers link against the libcrypt-1 ABI (deprecated in 2018, replaced by libcrypt.so.2). Modern Arch / Fedora 35+ / RHEL 9 ship only libcrypt.so.2; you need a compatibility shim (Ubuntu 20.04 still has it; modern distros need `libxcrypt-compat` or similar). | | 7. Build runtime container | We use `Dockerfile.ifxpy` here because Ubuntu 20.04 is the most recent base distro that still ships `libcrypt.so.1` natively. | By contrast, `informix-db`'s install is `pip install informix-db`. No external downloads, no system packages, no LD_LIBRARY_PATH, no Docker required. ## Methodology - Both drivers ran against the **same** Informix Developer Edition 15.0.1.0.3DE Docker container (`informix-db-test` from `tests/docker-compose.yml`). - The host runs Arch Linux on x86_64; the IfxPy container runs Ubuntu 20.04 on x86_64. Both reach the server through the loopback path (host's `127.0.0.1:9088` for `informix-db`; `--network=host` for the IfxPy container). - Each benchmark runs 100/20/3 rounds depending on per-iteration cost; we report the mean. Stddev is small (under 5%) for all reported numbers — within-run jitter doesn't affect the qualitative result. - Workloads are matched semantically: same SQL, same row counts, same fetch patterns. Where they differ (IfxPy's `IfxPy.fetch_tuple` vs. our `cursor.fetchall`), we use whichever idiom exhausts the cursor in each driver. ## Reproduce From the project root: ```bash # 1. Start the dev Informix container make ifx-up # 2. Seed the 1k-row test table on the host (using informix-db) uv run python -c " import informix_db, contextlib conn = informix_db.connect(host='127.0.0.1', port=9088, user='informix', password='in4mix', database='sysmaster', server='informix', autocommit=True) cur = conn.cursor() with contextlib.suppress(Exception): cur.execute('DROP TABLE p21_bench') cur.execute('CREATE TABLE p21_bench (id INT, name VARCHAR(64), counter INT, value FLOAT, created DATE)') cur.executemany('INSERT INTO p21_bench VALUES (?, ?, ?, ?, ?)', [(i, f'row_{i:04d}', i*7, float(i)*1.5, None) for i in range(1000)]) conn.close() " # 3. Build + run the IfxPy benchmark container docker build -f tests/benchmarks/compare/Dockerfile.ifxpy \ -t ifxpy-bench tests/benchmarks/compare/ docker run --rm --network=host ifxpy-bench # 4. Run informix-db benchmarks for the matched comparison uv run pytest tests/benchmarks/test_select_perf.py \ tests/benchmarks/test_pool_perf.py \ tests/benchmarks/test_insert_perf.py \ -m benchmark --benchmark-only --benchmark-warmup=on ``` ## Files - `Dockerfile.ifxpy` — Ubuntu 20.04 container with Python 3.9, IfxPy, and OneDB drivers installed - `ifxpy_bench.py` — IfxPy benchmark workloads (mirrors `tests/benchmarks/test_*_perf.py`) - This README ## Caveats - IfxPy 3.0.5 is the latest PyPI version (from October 2020). It's the most actively-maintained C-bound option but hasn't shipped a release in ~5 years. - Numbers will vary by host, distro, kernel, network stack — re-run on your own hardware before drawing strong conclusions. - The 1k-row INSERT benchmark uses different APIs (IfxPy's `prepare`+`execute` loop vs our `executemany`); the comparison is by total wall-clock time for the equivalent workload, not by per-call overhead.