informix-db

warehack.ing/informix-db

Fork 0

Commit Graph

Author	SHA1	Message	Date
Ryan Malloy	01757415a5	Phase 32: Benchmark improvements (Tier 1 + Tier 2) Tier 1 — make existing benchmarks reliable: * Bumped slow-bench rounds: cold_connect_disconnect 5->15, executemany series 3->10. Single-round outliers no longer dominate. * Switched bench reporting to median + IQR. Mean was being moved by individual GC pauses / scheduler hiccups (IfxPy executemany IQR was 8.2 ms on a 28 ms median - 29% spread - mean was unreliable). * Updated ifxpy_bench.py to also report median + IQR alongside mean for cross-comparable numbers. * Makefile bench targets now show median, iqr, mean, stddev, ops, rounds. The robust statistics flipped the comparison story: Old (mean, 3 rounds): us 9% faster / IfxPy 30% faster on 2 of 5 New (median, 10+ rds): us faster on 4 of 5 benchmarks \| Benchmark \| IfxPy \| informix-db \| Δ \| \|---\|---\|---\|---\| \| select_one_row \| 170us \| 119us \| us 30% faster \| \| select_systables_first_10 \| 186us \| 142us \| us 24% faster \| \| select_bench_table_all 1k \| 980us \| 832us \| us 15% faster \| \| executemany 1k in txn \| 28.3ms \| 31.3ms \| us 10% slower \| \| cold_connect_disconnect \| 12.0ms \| 10.7ms \| us 11% faster \| Tier 2 — add benchmarks for claims we make but don't verify: tests/benchmarks/test_observability_perf.py: * test_streaming_fetch_memory_profile — RSS sampling during a cursor iteration. Documents memory growth shape; regression wall at 100 MB / 1k rows. Currently flat (in-memory cursor doesn't grow detectably for 278 rows). * test_select_1_latency_percentiles — 1000-query distribution with p50/p90/p95/p99/max. Result: p99/p50 = 1.42x (tight tail). p50=108us, p99=153us. * test_concurrent_pool_throughput[2,4,8] — N worker threads through pool, measures aggregate QPS + per-thread fairness. Plateaus at ~6K QPS (server-bound); per-thread latency scales ~linearly with N (server serialization expected). README.md (project root): updated Compared-to-IfxPy table with the median-based numbers + IQR awareness note. tests/benchmarks/compare/README.md: added "Statistical robustness" section explaining why median over mean for fair comparison. 236 integration tests pass; ruff clean.	2026-05-05 12:01:11 -06:00

Author

SHA1

Message

Date

Ryan Malloy

01757415a5

Phase 32: Benchmark improvements (Tier 1 + Tier 2)

Tier 1 — make existing benchmarks reliable:
* Bumped slow-bench rounds: cold_connect_disconnect 5->15, executemany
  series 3->10. Single-round outliers no longer dominate.
* Switched bench reporting to median + IQR. Mean was being moved by
  individual GC pauses / scheduler hiccups (IfxPy executemany IQR
  was 8.2 ms on a 28 ms median - 29% spread - mean was unreliable).
* Updated ifxpy_bench.py to also report median + IQR alongside mean
  for cross-comparable numbers.
* Makefile bench targets now show median, iqr, mean, stddev, ops, rounds.

The robust statistics flipped the comparison story:

  Old (mean, 3 rounds):   us 9% faster  / IfxPy 30% faster on 2 of 5
  New (median, 10+ rds):  us faster on 4 of 5 benchmarks

| Benchmark | IfxPy | informix-db | Δ |
|---|---|---|---|
| select_one_row             | 170us | 119us | us 30% faster |
| select_systables_first_10  | 186us | 142us | us 24% faster |
| select_bench_table_all 1k  | 980us | 832us | us 15% faster |
| executemany 1k in txn      | 28.3ms | 31.3ms | us 10% slower |
| cold_connect_disconnect    | 12.0ms | 10.7ms | us 11% faster |

Tier 2 — add benchmarks for claims we make but don't verify:

tests/benchmarks/test_observability_perf.py:
* test_streaming_fetch_memory_profile — RSS sampling during a
  cursor iteration. Documents memory growth shape; regression
  wall at 100 MB / 1k rows. Currently flat (in-memory cursor
  doesn't grow detectably for 278 rows).
* test_select_1_latency_percentiles — 1000-query distribution
  with p50/p90/p95/p99/max. Result: p99/p50 = 1.42x (tight tail).
  p50=108us, p99=153us.
* test_concurrent_pool_throughput[2,4,8] — N worker threads
  through pool, measures aggregate QPS + per-thread fairness.
  Plateaus at ~6K QPS (server-bound); per-thread latency scales
  ~linearly with N (server serialization expected).

README.md (project root): updated Compared-to-IfxPy table with
the median-based numbers + IQR awareness note.
tests/benchmarks/compare/README.md: added "Statistical robustness"
section explaining why median over mean for fair comparison.

236 integration tests pass; ruff clean.

2026-05-05 12:01:11 -06:00

1 Commits