Ryan Malloy 8eb19f7534 Phase 34: Scaling benchmarks (1k/10k/100k rows; 5/20/50 cols) (2026.05.05.8)
Adds tests/benchmarks/test_scaling_perf.py with parametrized
benchmarks across row-count, column-width, and type-mix axes.
Caught the NFETCH-loop bug (Phase 35) immediately on first run.
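
For reference, the shape of the new benchmarks is roughly the
following (a minimal sketch, not the actual file contents: it assumes
pytest-benchmark's "benchmark" fixture, and the conn fixture and
bench_rows table name are hypothetical):

  import pytest

  ROW_COUNTS = [1_000, 10_000, 100_000]

  @pytest.mark.parametrize("nrows", ROW_COUNTS)
  def test_select_scaling(benchmark, conn, nrows):
      # conn: hypothetical fixture yielding a DB-API connection to a
      # bench_rows table pre-seeded with nrows rows.
      cur = conn.cursor()

      def fetch_all():
          cur.execute("SELECT * FROM bench_rows")
          return cur.fetchall()

      rows = benchmark(fetch_all)
      assert len(rows) == nrows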

Headline numbers:

Bulk insert (executemany in transaction):
  1k rows:   23 ms (23 us/row)
  10k rows:  161 ms (16 us/row)
  100k rows: 1487 ms (15 us/row, ~67k rows/sec sustained)
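
The measured pattern is a single executemany inside one transaction,
roughly as below (DB-API 2.0 sketch; qmark paramstyle and the
bench_rows table are assumptions):

  def bulk_insert(conn, nrows):
      rows = [(i, "name-%d" % i) for i in range(nrows)]
      cur = conn.cursor()
      # One statement, one commit: per-row cost then reflects codec
      # and protocol work rather than per-row transaction overhead.
      cur.executemany(
          "INSERT INTO bench_rows (id, name) VALUES (?, ?)", rows)
      conn.commit()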

SELECT (linear scaling, near-constant per-row):
  1k rows:   2.7 ms (2.7 us/row)
  10k rows:  25.8 ms (2.6 us/row)
  100k rows: 271 ms (2.7 us/row)

Wide-row SELECT (1k rows x N cols):
  5 cols:  2.4 ms
  20 cols: 5.1 ms
  50 cols: 10.1 ms

Type-mix SELECT (INT + VARCHAR + DECIMAL + DATE + FLOAT + SMALLINT):
  1000 rows: 4.7 ms (4.7 us/row, ~1.7x baseline)
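
For context, the type-mix table corresponds to a schema along these
lines (hypothetical DDL; column names and widths invented):

  cur.execute(
      "CREATE TABLE bench_mix ("
      "  c_int INT,"
      "  c_vchar VARCHAR(64),"
      "  c_dec DECIMAL(12,2),"
      "  c_date DATE,"
      "  c_float FLOAT,"
      "  c_small SMALLINT)")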

Per-row codec cost is essentially constant from 1k to 100k rows
(2.7 us/row), showing that the parse_tuple_payload optimizations
(Phases 23-25) hold at 100x scale with no GC-pause amplification or
memory-pressure degradation.

Per-row insert cost actually DECREASES with scale (23 us at 1k to
15 us at 100k): Phase 33's pipelining amortizes prepare/release
overhead better at larger N, as the rough fit below shows.
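
A back-of-envelope fixed-overhead fit supports that reading (derived
only from the numbers above, not from profiling):

  per_row(N) ~= c + F/N        # c = marginal row cost, F = fixed setup
  N = 1k:    c + F/1000   = 23 us
  N = 100k:  c + F/100000 ~= 15 us
  =>         c ~= 15 us/row, F ~= 8 ms of fixed prepare/release work
  check:     per_row(10k) ~= 15 + 0.8 ~= 16 us/row (measured: 16 us/row)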

10 new parametrized benchmarks. Total: 77 unit + 249 integration +
43 benchmark = 369 tests.