2 Commits

Author SHA1 Message Date
5825d5c55e Extend scaling benches: 100-column case + 100k memory profile + 1M gating
Adds three things to test_scaling_perf.py:

1. 100-column wide-row SELECT - a codec stress test at extreme widths.
   1k rows x 100 cols = 19.4 ms (~19.4 us/row, ~194 ns per column decode).
   Per-column cost keeps dropping as width grows thanks to loop
   amortization (5 cols: 480 ns/col -> 100 cols: 194 ns/col).

2. 100k-row memory profile - samples RSS pre-execute, post-execute
   (materialization cost), and during iteration. Real numbers:
     pre-execute:  45.8 MB
     post-execute: 71.2 MB  (+25.4 MB = ~259 bytes/row materialization)
     iteration:    0 KB extra (just walks the existing list)

   Documents the in-memory cursor's actual cost: 100k rows = 25 MB,
   1M rows = ~250 MB. That gives a fair regression baseline (the
   check trips at 500 MB).

3. 1M-row scaling, gated behind the IFX_BENCH_1M=1 env var. Off by
   default because the dev container's rootdbs runs out of space;
   users on production-sized servers can opt in. Extrapolating
   linearly from the 100k results, 1M rows should take ~15 s for
   executemany and ~3 s for SELECT.
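   The gate itself is just an env-var check; a minimal sketch, assuming
   pytest (the test name and body here are illustrative, not the actual
   contents of test_scaling_perf.py):

   ```python
   import os

   import pytest

   # Off by default: the 1M-row benchmark only runs when IFX_BENCH_1M=1
   # is exported, since the dev container's rootdbs cannot hold 1M rows.
   RUN_1M = os.environ.get("IFX_BENCH_1M") == "1"

   @pytest.mark.skipif(not RUN_1M,
                       reason="set IFX_BENCH_1M=1 to enable 1M-row benchmarks")
   def test_bulk_insert_1m():
       ...  # executemany of 1,000,000 rows (hypothetical body)
   ```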

Note on the dev-container size limit: dev image's rootdbs is sized
for typical developer workloads, not stress testing. A 1M-row
INSERT exceeds the available pages and fails with -242 ISAM -113
(out of space). This is correct behavior - the limit is enforced
at the storage layer.

Switched RSS sampling from ru_maxrss (peak, monotonic) to
/proc/self/status VmRSS (current). Earlier runs showed flat
RSS because a peak recorded earlier in the test session masked
the current fluctuation.
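
The sampling switch boils down to reading a different source. A
Linux-only sketch (helper names are mine, not the driver's):

```python
import resource

def rss_peak_kb() -> int:
    # ru_maxrss is the *peak* RSS (kB on Linux). It is monotonic, so a
    # spike earlier in the test session hides later fluctuation.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def rss_current_kb() -> int:
    # VmRSS in /proc/self/status is the RSS *right now*; it can fall as
    # well as rise, so per-phase deltas become visible.
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # reported in kB
    raise RuntimeError("VmRSS not found (non-Linux /proc?)")
```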
2026-05-05 13:10:32 -06:00
8eb19f7534 Phase 34: Scaling benchmarks (1k/10k/100k rows; 5/20/50 cols) (2026.05.05.8)
Adds tests/benchmarks/test_scaling_perf.py with parametrized
benchmarks across row-count, column-width, and type-mix axes.
Caught the NFETCH-loop bug (Phase 35) immediately on first run.
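
The axis layout follows the usual pytest parametrization pattern; a
sketch under that assumption (the real benchmark bodies, which drive
the Informix connection, are omitted):

```python
import pytest

ROW_COUNTS = [1_000, 10_000, 100_000]   # row-count axis
COL_WIDTHS = [5, 20, 50]                # column-width axis

@pytest.mark.parametrize("nrows", ROW_COUNTS)
def test_select_scaling(nrows):
    ...  # SELECT nrows rows; per-row cost should stay near-constant

@pytest.mark.parametrize("ncols", COL_WIDTHS)
def test_wide_row_select(ncols):
    ...  # 1k rows x ncols columns; per-column cost should amortize
```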

Headline numbers:

Bulk insert (executemany in transaction):
  1k rows:   23 ms (23 us/row)
  10k rows:  161 ms (16 us/row)
  100k rows: 1487 ms (15 us/row, ~67k rows/sec sustained)

SELECT (linear scaling, near-constant per-row):
  1k rows:   2.7 ms (2.7 us/row)
  10k rows:  25.8 ms (2.6 us/row)
  100k rows: 271 ms (2.7 us/row)

Wide-row SELECT (1k rows x N cols):
  5 cols:  2.4 ms
  20 cols: 5.1 ms
  50 cols: 10.1 ms

Type-mix SELECT (INT + VARCHAR + DECIMAL + DATE + FLOAT + SMALLINT):
  1000 rows: 4.7 ms (4.7 us/row, ~1.7x baseline)
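
The bulk-insert path being timed is the standard DB-API shape: one
executemany inside a single transaction, so prepare and commit costs
are paid once rather than per row. A generic sketch (table and column
names are illustrative):

```python
def bulk_insert(conn, rows):
    # One statement, one transaction: executemany binds each row against
    # the same prepared plan, and commit() runs once at the end.
    cur = conn.cursor()
    try:
        cur.executemany("INSERT INTO bench_t (id, val) VALUES (?, ?)", rows)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Because this is plain DB-API 2.0 with qmark paramstyle, the same code
runs against any conforming driver.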

Per-row codec cost is essentially constant from 1k to 100k rows
(2.7 us/row), confirming that the parse_tuple_payload optimizations
(Phases 23-25) hold at 100x scale with no GC-pause amplification or
memory-pressure degradation.

Per-row insert cost actually DECREASES with scale (23us at 1k to
15us at 100k) - Phase 33's pipelining amortizes prepare/release
overhead better at larger N.

10 new parametrized benchmarks. Total: 77 unit + 249 integration +
43 benchmark = 369 tests.
2026-05-05 12:38:07 -06:00