Extends the IfxPy comparison bench script with scaling workloads
(1k/10k/100k rows for both executemany and SELECT), re-runs the
full comparison with a consistent measurement methodology, and
updates the README with the corrected numbers.
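The scaling series is a plain rounds-based timing loop over the
three row counts. A minimal sketch of its shape, using hypothetical
helper and fixture names (time_rounds, bench_table), not the
harness's actual API:

```python
import time

ROW_COUNTS = [1_000, 10_000, 100_000]   # the 1k/10k/100k scaling points

def time_rounds(fn, rounds=10):
    # Wall-clock whole rounds; stats are computed over the samples.
    samples = []
    for _ in range(rounds):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return samples

def bench_executemany(cur, n):
    rows = [(i, f"row-{i}") for i in range(n)]   # assumed fixture shape
    cur.executemany("INSERT INTO bench_table VALUES (?, ?)", rows)

def bench_select_all(cur):
    cur.execute("SELECT * FROM bench_table")
    cur.fetchall()
```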
Earlier comparison runs reported informix-db winning all 5
benchmarks. Re-running select_bench_table_all with consistent
measurement gives 3.04 ms, not the 891 us I cited earlier - a
3.4x discrepancy attributable to noisy warmup + small-fixture
artifacts. The "we win everything" framing was wrong.
Corrected comparison reveals two clear stories:
Bulk-insert: pure Python wins 1.6x at scale.
* executemany(10k): IfxPy 259 ms -> us 161 ms (1.6x faster)
* executemany(100k): IfxPy 2376 ms -> us 1487 ms (1.6x faster)
Reason: Phase 33's pipelining eliminates per-row RTT; IfxPy's
per-call API can't pipeline.
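To make the RTT argument concrete, a sketch of the two wire
strategies; the PDU builder and socket helpers are hypothetical
stand-ins, not the driver's real internals:

```python
def executemany_per_call(sock, build_bind_pdu, read_response, rows):
    # IfxPy-shaped: one synchronous round trip per row.
    for row in rows:
        sock.sendall(build_bind_pdu(row))
        read_response(sock)                  # blocks one RTT per row

def executemany_pipelined(sock, build_bind_pdu, read_response, rows):
    # Phase-33-shaped: write every BIND PDU back to back, flush once,
    # then drain all the responses.
    sock.sendall(b"".join(build_bind_pdu(row) for row in rows))
    for _ in rows:
        read_response(sock)
```

With N rows, the per-call shape pays ~N round trips; the pipelined
shape pays ~1 plus server processing, which is why the gap holds
steady at 1.6x across 10k and 100k.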
Large-fetch: IfxPy wins 2.3-2.4x at scale.
* SELECT 1k rows: IfxPy 1.2 ms / us 2.7 ms (IfxPy 2.3x)
* SELECT 10k rows: IfxPy 11.3 ms / us 25.8 ms (IfxPy 2.3x)
* SELECT 100k rows: IfxPy 112 ms / us 271 ms (IfxPy 2.4x)
Reason: C-level fetch_tuple at ~1.1 us/row beats Python
parse_tuple_payload at ~2.7 us/row - a real C-vs-Python codec gap
showing up at scale.
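The per-row figures can be checked off the wire by timing the codec
alone over captured row payloads. A sketch, with
parse_tuple_payload's exact signature assumed for illustration:

```python
import time

def us_per_row(parse_tuple_payload, payloads, description):
    # Time the row codec in isolation: no socket, no cursor machinery.
    t0 = time.perf_counter()
    for payload in payloads:
        parse_tuple_payload(payload, description)
    return (time.perf_counter() - t0) / len(payloads) * 1e6  # us/row
```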
For everyday workloads (a single SELECT per request, inserting a
handful of rows), the drivers are within 5-25% of each other. Where
the gap widens, the direction depends on the workload: bulk-write
favors us, bulk-read favors IfxPy.
README's "Compared to IfxPy" section rewritten with the corrected
numbers and an honest "when to prefer which" subsection.
tests/benchmarks/compare/README.md mirror updated.
Net narrative: a "faster at bulk-write, slower at bulk-read,
comparable elsewhere" comparison story is more honest and more
durable than a "we win everything" claim that would have collapsed
the first time a user ran their own benchmark.
Side note (lint): one ambiguous unicode `×` in cursors.py replaced
with `x`.
Phase 37 ticket: parse_tuple_payload is the bottleneck at scale.
Closing the 1.6 us/row gap to IfxPy would make us competitive on
bulk-fetch too. Possible approaches: Cython codec, deeper inlining,
per-column dispatch pre-bake.
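A sketch of what the per-column dispatch pre-bake could look like:
resolve each column's decoder once at describe time, so the per-row
hot loop is a flat walk over callables with no per-value type
dispatch. The decoder table and wire formats below are illustrative
only:

```python
import struct

# Illustrative decoder table: each entry consumes one column and
# returns (value, new_offset).
DECODERS = {
    "INT": lambda buf, off: (struct.unpack_from("<i", buf, off)[0], off + 4),
    "CHAR8": lambda buf, off: (buf[off:off + 8].rstrip(b" ").decode(), off + 8),
}

def prebake(column_types):
    # Resolved once per statement at describe time, not once per row.
    return [DECODERS[t] for t in column_types]

def parse_row(buf, baked):
    off, values = 0, []
    for decode in baked:            # flat loop: no dict lookup per value
        value, off = decode(buf, off)
        values.append(value)
    return tuple(values)
```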
Tier 1 — make existing benchmarks reliable:
* Bumped slow-bench rounds: cold_connect_disconnect 5->15, executemany
series 3->10. Single-round outliers no longer dominate.
* Switched bench reporting to median + IQR (sketch after this
  list). Mean was being skewed by individual GC pauses / scheduler
  hiccups (IfxPy executemany IQR was 8.2 ms on a 28.3 ms median -
  a 29% spread - so the mean was unreliable).
* Updated ifxpy_bench.py to also report median + IQR alongside mean
for cross-comparable numbers.
* Makefile bench targets now show median, iqr, mean, stddev, ops, rounds.
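The median + IQR computation is stdlib-only; a minimal sketch (the
report-dict shape is illustrative, not the exact harness output):

```python
import statistics

def robust_stats(samples_ms):
    q1, _, q3 = statistics.quantiles(samples_ms, n=4)  # quartile cuts
    return {
        "median": statistics.median(samples_ms),
        "iqr": q3 - q1,          # spread of the middle 50% of rounds
        "mean": statistics.mean(samples_ms),
        "stddev": statistics.stdev(samples_ms),
        "rounds": len(samples_ms),
    }
```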
The robust statistics flipped the comparison story:
Old (mean, 3 rounds): mixed - us faster on 2 of 5, IfxPy faster
on 2 of 5, one comparable.
New (median, 10+ rounds): us faster on 4 of 5 benchmarks.
| Benchmark | IfxPy | informix-db | Δ |
|---|---:|---:|---|
| select_one_row | 170 us | 119 us | us 30% faster |
| select_systables_first_10 | 186 us | 142 us | us 24% faster |
| select_bench_table_all 1k | 980 us | 832 us | us 15% faster |
| executemany 1k in txn | 28.3 ms | 31.3 ms | us 10% slower |
| cold_connect_disconnect | 12.0 ms | 10.7 ms | us 11% faster |
Tier 2 — add benchmarks for claims we make but don't verify:
tests/benchmarks/test_observability_perf.py:
* test_streaming_fetch_memory_profile — RSS sampling during a
cursor iteration. Documents memory growth shape; regression
wall at 100 MB / 1k rows. Currently flat (in-memory cursor
doesn't grow detectably for 278 rows).
* test_select_1_latency_percentiles — 1000-query distribution
with p50/p90/p95/p99/max. Result: p99/p50 = 1.42x (tight tail).
p50=108us, p99=153us.
* test_concurrent_pool_throughput[2,4,8] — N worker threads
  through the pool, measuring aggregate QPS + per-thread fairness
  (sketch after this list). Plateaus at ~6K QPS (server-bound);
  per-thread latency scales ~linearly with N (server-side
  serialization, as expected).
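A sketch of the concurrency benchmark's shape, referenced from the
third bullet; the pool.connection() context-manager API is an
assumption for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def worker(pool, queries, latencies):
    for _ in range(queries):
        t0 = time.perf_counter()
        with pool.connection() as conn:        # assumed pool API
            conn.cursor().execute("SELECT 1 FROM systables WHERE tabid = 1")
        latencies.append(time.perf_counter() - t0)

def throughput(pool, n_threads, queries_per_thread=250):
    per_thread = [[] for _ in range(n_threads)]
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        for lat in per_thread:
            ex.submit(worker, pool, queries_per_thread, lat)
    wall = time.perf_counter() - t0
    qps = n_threads * queries_per_thread / wall
    return qps, per_thread  # per-thread latency lists -> fairness check
```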
README.md (project root): updated the Compared-to-IfxPy table with
the median-based numbers and a note on reading the IQR spread.
tests/benchmarks/compare/README.md: added "Statistical robustness"
section explaining why median over mean for fair comparison.
236 integration tests pass; ruff clean.
Adds a paired benchmark of informix-db (pure Python) against IfxPy
3.0.5 (IBM's C-bound driver via OneDB ODBC) on identical workloads
against the same Informix dev container.
Headline result: pure Python is competitive — and faster on 2/5
benchmarks where wire round-trip dominates over codec/marshaling.
| Benchmark | IfxPy | informix-db | Result |
|---|---:|---:|---:|
| select_one_row (single-row latency) | 128 us | 116 us | us 9% faster |
| select_systables_first_10 | 126 us | 184 us | IfxPy 32% faster |
| select_bench_table_all (1k rows) | 969 us | 855 us | us 12% faster |
| executemany(1000) in txn | 21.5 ms | 30.8 ms | IfxPy 30% faster |
| cold_connect_disconnect | 11.0 ms | 10.9 ms | comparable |
Why the surprising wins: IfxPy's path is Python -> OneDB ODBC ->
libifdmr -> wire. Ours is Python -> wire. When wire round-trip
dominates (single-row, bulk fetch), the missing abstraction layers
make us faster. When per-row marshaling dominates (executemany),
IfxPy's C-level execute(stmt, tuple) beats our Python-side BIND-PDU
construction (paired sketch below).
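For concreteness, the 1000-row transactional insert as each side
roughly runs it. Shape only: connection setup and the fixture table
are placeholders, and the IfxPy calls follow the ibm_db-style API
cited above:

```python
rows = [(i, f"row-{i}") for i in range(1000)]

def ifxpy_side(conn):
    # IfxPy: prepare once, then one C-level execute per row.
    import IfxPy
    stmt = IfxPy.prepare(conn, "INSERT INTO bench_table VALUES (?, ?)")
    for row in rows:
        IfxPy.execute(stmt, row)
    IfxPy.commit(conn)

def informix_db_side(conn):
    # informix-db: executemany builds and sends BIND PDUs in Python.
    with conn.cursor() as cur:
        cur.executemany("INSERT INTO bench_table VALUES (?, ?)", rows)
    conn.commit()
```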
Files added under tests/benchmarks/compare/:
* Dockerfile.ifxpy — Ubuntu 20.04 base with IfxPy + OneDB drivers
* ifxpy_bench.py — IfxPy benchmark workloads matching test_*_perf.py
* README.md — methodology, results, install gauntlet, reproduction
The IfxPy install gauntlet itself is part of the comparison story:
Python 3.11 at the newest (3.13 unsupported), setuptools <58,
permissive CFLAGS, a manual download of the 92 MB OneDB ODBC
tarball, four LD_LIBRARY_PATH directories, and libcrypt.so.1
(deprecated since 2018; missing on Arch / Fedora 35+ / RHEL 9).
Versus our `pip install informix-db`.
README.md (project root): added "Compared to IfxPy" section under
Performance with the headline numbers and a pointer to the full
methodology.
.gitignore: keep the Dockerfile/script/README under
tests/benchmarks/compare/, exclude the 92 MB OneDB tarball and the
local venv.