Changelog

All notable changes to informix-db. Versioning is CalVer: YYYY.MM.DD for date-based releases, YYYY.MM.DD.N for same-day post-releases, per PEP 440.

2026.05.04.6 — executemany perf finding: it was the autocommit cliff

Investigation of the Phase 21 finding that executemany(N) cost scaled linearly per row (1.74 ms × N) regardless of batch size. Root cause: every autocommit=True INSERT forces a server-side transaction-log flush. Not a wire-protocol bug.

Added

  • test_executemany_1000_rows_in_txn benchmark — same workload, but inside a single transaction with one COMMIT at the end. Isolates pure protocol cost from server-storage cost.
  • New module-scoped txn_conn fixture in tests/benchmarks/test_insert_perf.py for autocommit=False benchmarks.

Findings

| Mode | Total | Per row |
| --- | --- | --- |
| executemany(1000), autocommit=True | 1.72 s | 1.72 ms |
| executemany(1000), in single txn | 32 ms | 32 µs |

53× speedup from changing the transaction boundary, not the driver. Pure protocol overhead is ~32 µs/row → ~31,000 rows/sec sustained throughput on a single connection. Comparable to mature pure-Python drivers (pg8000).
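
In sketch form, the fast path looks like the following, assuming only the PEP 249 surface plus an autocommit connect keyword (the exact spelling of the connection arguments is a placeholder, not documented API):

```python
import informix_db

rows = [(i, f"name-{i}") for i in range(1000)]

# One transaction, one COMMIT, one transaction-log flush: ~32 µs/row.
# With autocommit=True the server flushes the log once per INSERT,
# which is the ~1.72 ms/row cliff measured above.
conn = informix_db.connect("...", autocommit=False)  # placeholder args
cur = conn.cursor()
cur.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)
conn.commit()
conn.close()
```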

Changed

  • tests/benchmarks/README.md — updated headline numbers to show both modes, added a "Performance gotchas" section explaining when to use autocommit=False for bulk loads.
  • tests/benchmarks/baseline.json — refreshed to include the new txn-mode measurement (now 29 entries, was 28).

Decision: don't pipeline

Pipelining BIND+EXECUTE PDUs (writing N PDUs without waiting for responses between them) might roughly halve the 32 µs/row figure on loopback. Decided against it:

  • The remaining 32 µs is already excellent — single-connection bulk-load performance is not where users hit limits.
  • Pipelining adds complexity around TCP send-buffer management, partial-failure semantics, and error reporting (which row failed when 50 are in flight).
  • The autocommit gotcha is the real user-facing footgun. Better docs > more code.

If someone reports needing >31K rows/sec single-connection, this becomes Phase 22 work.

2026.05.04.5 — Performance benchmarks (Phase 21)

Adds tests/benchmarks/ — a pytest-benchmark driven suite covering codec micro-benchmarks (no server required) and end-to-end SELECT/INSERT/pool/async benchmarks. Establishes a committed baseline.json so future PRs can be compared against the floor and regressions caught at review.

Added

  • tests/benchmarks/test_codec_perf.py — 16 micro-benchmarks for the hot codec paths (decode, encode_param, parse_tuple_payload). Run without an Informix container; suitable for pre-merge CI.
  • tests/benchmarks/test_select_perf.py — 4 SELECT round-trip benchmarks: 1-row latency floor, ~10 rows, full 1k-row table, parameterized.
  • tests/benchmarks/test_insert_perf.py — 3 INSERT benchmarks: single-row, executemany(100), executemany(1000).
  • tests/benchmarks/test_pool_perf.py — 3 pool benchmarks: cold connect (login handshake cost), pool acquire/release, pool acquire + tiny query + release.
  • tests/benchmarks/test_async_perf.py — 2 async benchmarks: single async round-trip overhead, 10 concurrent SELECTs through an async pool.
  • tests/benchmarks/conftest.py — bench_conn (long-lived autocommit connection) and bench_table (pre-populated 1k-row table) fixtures, both session-scoped.
  • tests/benchmarks/baseline.json — committed baseline (28 measurements) for --benchmark-compare regression checks.
  • tests/benchmarks/README.md — headline numbers, regression policy, how to update baseline, what each benchmark measures.
  • make bench / make bench-codec / make bench-save Makefile targets.
  • benchmark pytest marker — gated, off by default. pytest -m benchmark to opt in.
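
For flavor, a gated codec micro-benchmark in this style could look like the sketch below (pytest-benchmark's benchmark fixture is real; the decode import path and the type-code constant are assumptions):

```python
import pytest

from informix_db.codecs import decode  # assumed module path

pytestmark = pytest.mark.benchmark  # off by default; opt in via `pytest -m benchmark`

TYPE_INT = 2  # placeholder: the real Informix type code lives in the driver

def test_decode_int_per_cell(benchmark):
    # pytest-benchmark calls the target in calibrated rounds and reports
    # the mean; the committed baseline for this path is ~181 ns/cell.
    raw = (42).to_bytes(4, "big")
    benchmark(decode, TYPE_INT, raw)
```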

Changed

  • make test-integration now uses -m "integration and not benchmark" so the integration suite stays fast (~6s) — benchmarks (~27s) are gated behind make bench.
  • pytest default -m now excludes both integration and benchmark. Default run is unit-only.

Headline numbers (dev container, x86_64 Linux, loopback)

| Operation | Mean |
| --- | --- |
| decode(int) (per cell) | 181 ns |
| parse_tuple_payload(5 cols) (per row) | 2.87 µs |
| SELECT 1 round-trip | 177 µs |
| Pool acquire + tiny query + release | 295 µs |
| Cold connect + close | 11.2 ms |

Pool-vs-cold delta is 72×. UTF-8 decode carries no measurable cost over iso-8859-1 (Phase 20 didn't slow anything down).

Tests

28 new benchmark tests. Total: 69 unit + 211 integration + 28 benchmark = 308.

2026.05.04.4 — UTF-8 / multibyte locale support

Threads the connection's CLIENT_LOCALE through to user-data string codecs so multibyte locales (UTF-8, etc.) round-trip correctly. The driver previously hardcoded iso-8859-1 for every string conversion — fine for Western European text, broken-by-design for CJK, Cyrillic, Arabic, emoji.

Added

  • Connection.encoding property — reports the Python codec name derived from CLIENT_LOCALE (e.g., iso-8859-1, utf-8, iso-8859-15). Default for a connection without client_locale= is iso-8859-1 (compatible with the legacy default).

  • informix_db.connections._python_encoding_from_locale(locale: str) — maps Informix locale strings (en_US.utf8, en_US.8859-1, en_US.819) to Python codec names. Falls back to iso-8859-1 for unknown / unsuffixed forms.
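
A sketch of the mapping that helper implements, limited to the forms named above (the shipped table may cover more):

```python
def _python_encoding_from_locale(locale: str) -> str:
    # Informix locale strings look like "en_US.utf8" / "en_US.8859-1" /
    # "en_US.819". Map the codeset suffix to a Python codec name and
    # fall back to the legacy default for unknown or unsuffixed forms.
    _, _, codeset = locale.partition(".")
    known = {
        "utf8": "utf-8",
        "8859-1": "iso-8859-1",
        "8859-15": "iso-8859-15",
        "819": "iso-8859-1",  # IBM CCSID 819 is Latin-1
    }
    return known.get(codeset.lower(), "iso-8859-1")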

Changed

  • encode_param(value, encoding=...) and _encode_str(value, encoding=...) honor the connection's encoding instead of hardcoded iso-8859-1. Cursor's _emit_bind_params forwards self._conn.encoding per parameter.

  • decode(type_code, raw, encoding=...) and parse_tuple_payload(reader, columns, encoding=...) thread the encoding to string column decoders (CHAR, VARCHAR, NCHAR, NVCHAR, LVARCHAR). Cursor's _read_fetch_response forwards self._conn.encoding.

  • Smart-LOB CLOB encode/decode (write_blob_column, simple-LOB TEXT fetch) honor self._conn.encoding.

  • Fast-path RPC (Connection.fast_path_call) honors self._encoding for its bound parameters.

Boundary discipline

Protocol-level strings stay iso-8859-1 (always ASCII, never user-controlled): cursor names, function signatures, server-fabricated SQ_FILE virtual filenames, error "near tokens", SQL keywords/identifiers. Only user-data strings (column values, parameter binds) follow CLIENT_LOCALE.

Error handling

Encoding-can't-represent-this-value (e.g., "你好" on an 8859-1 connection) now raises informix_db.DataError instead of letting Python's UnicodeEncodeError leak. The cursor releases the prepared statement before propagating, so the connection survives cleanly for the next query.
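
In sketch form (the helper body is a guess at shape, not the shipped code):

```python
import informix_db

def _encode_str(value: str, encoding: str) -> bytes:
    try:
        return value.encode(encoding)
    except UnicodeEncodeError as exc:
        # Translate to the PEP 249 hierarchy instead of leaking Python's
        # UnicodeEncodeError. The cursor releases the prepared statement
        # before this propagates, so the connection stays usable.
        raise informix_db.DataError(
            f"string not representable in connection encoding {encoding!r}"
        ) from exc
```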

Tests

9 new integration tests in tests/test_unicode.py:

  • ASCII round-trip (regression)
  • Latin-1 high-bit chars round-trip on default locale
  • Full byte range 0x20-0xFE round-trip via VARCHAR
  • Locale → Python codec mapping for common forms
  • Connection.encoding exposes the resolved codec
  • UTF-8 locale negotiation (server transcodes for ASCII even with 8859-1 DB)
  • UTF-8 multibyte round-trip (skipped without IFX_UTF8_DATABASE env var pointing to a UTF-8 database)
  • Non-representable char raises DataError cleanly; connection survives
  • CLOB column round-trips Latin-1 text honoring connection encoding

Total: 69 unit + 212 integration = 281 tests.

Limitations

  • Multibyte UTF-8 storage requires both client_locale='en_US.utf8' AND a database whose DB_LOCALE is UTF-8. The dev container's testdb is 8859-1, so storing CJK chars there will continue to fail server-side regardless of the client codec. The test_utf8_multibyte_round_trip test is gated on the IFX_UTF8_DATABASE env var pointing to a UTF-8 database.

2026.05.04.3 — Resilience tests (fault injection)

Added

  • tests/_proxy.py — ControlledProxy helper: a thread-based TCP forwarder between the test client and Informix, with a kill() method that sends TCP RST (via SO_LINGER=0) to simulate a network drop or server crash. Used as a context manager. (A sketch of the RST trick follows this list.)

  • tests/test_resilience.py — 12 integration tests filling the resilience gap identified in the test-coverage audit:

    • Network drop mid-SELECT raises OperationalError cleanly (not hang)
    • Network drop after describe but before fetch
    • Network drop during fetch iteration (already-materialized rows still readable, fresh execute fails)
    • Local socket close (yank-the-rug from client side)
    • I/O error marks connection unusable
    • Pool evicts a connection that died mid-with block
    • Pool revives after all idle connections died (health-check on acquire mints fresh)
    • Async cancellation via asyncio.wait_for — pool stays usable for subsequent queries
    • Cursor reusable after SQL error
    • Connection survives cursor close after error
    • Pool sustained-load smoke (50 acquire/release cycles, no leak)
    • read_timeout fires on a hung connection
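
The RST trick behind kill() is standard socket fare; a self-contained sketch:

```python
import socket
import struct

def kill_with_rst(sock: socket.socket) -> None:
    # SO_LINGER with l_onoff=1 and l_linger=0 makes close() abort the
    # connection with a TCP RST instead of a graceful FIN shutdown,
    # which is what a server crash looks like from the client side.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
```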

What this catches

  • Hangs (waiting forever on a dead socket)
  • Silent data corruption (treating EOF as a valid tuple)
  • Double-fault (one error → cleanup raises a different error)
  • Pool poisoning (returning a broken connection to the pool)
  • Stale cursor reuse (same cursor reused across an error boundary)

Tests

12 new integration tests. Total: 69 unit + 203 integration = 272 tests.

The Phase 19 work fills the highest-priority gap from the test-adequacy audit. Remaining gaps from that audit (UTF-8 locale, server-version matrix, performance benchmarks) are real but lower-severity.

2026.05.04.2 — Server-side scrollable cursors

Added

  • Server-side scrollable cursors (Phase 18): opt in via conn.cursor(scrollable=True). The cursor opens with SQ_SCROLL (24) before SQ_OPEN (6), the result set stays materialized server-side, and each scroll method sends SQ_SFETCH (23) to fetch one row at a time. Use this for huge result sets where in-memory materialization would be wasteful.

    The user-facing API is identical to Phase 17's in-memory scroll (fetch_first, fetch_last, fetch_prior, fetch_absolute, fetch_relative, scroll, rownumber); only the internal mechanism differs:

    |  | Default cursor | scrollable=True |
    | --- | --- | --- |
    | Memory | All rows materialized | One row at a time |
    | Network round-trips per fetch | 0 (after initial NFETCH) | 1 (one SFETCH per call) |
    | Cursor lifetime | Closed after execute() | Open until close() |
    | Best for | Moderate result sets, sequential iteration | Huge result sets, random access |

    Implementation discovers the total row count lazily via SFETCH(LAST=4) when negative absolute indexing requires it; the result is cached in _scroll_total_rows. Position tracking treats the server's SQ_TUPID (25) tag as authoritative, not a client-computed counter.
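
The negative-index path, sketched with hypothetical helper names (_sfetch, _server_rownum, and _current_row are illustrative, not the shipped internals):

```python
def fetch_absolute(self, n: int):
    # Negative n indexes from the end, which requires the total row
    # count: discover it lazily via SFETCH(LAST=4) and cache it.
    if n < 0:
        if self._scroll_total_rows is None:
            self._sfetch(scrolltype=4, target=0)           # SFETCH LAST
            self._scroll_total_rows = self._server_rownum  # 1-indexed, from SQ_TUPID
        n += self._scroll_total_rows                       # -1 becomes the last 0-indexed row
    self._sfetch(scrolltype=6, target=n + 1)               # SFETCH ABSOLUTE, 1-indexed target
    return self._current_row
```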

Wire-protocol details

  • SQ_SFETCH (23): [short SQ_ID=4][int 23][short scrolltype][int target][int bufSize=4096][short SQ_EOT]. scrolltype values: 1=NEXT, 4=LAST, 6=ABSOLUTE.
  • SQ_SCROLL (24): emitted between CURNAME and SQ_OPEN to mark the cursor as scrollable.
  • SQ_TUPID (25): server response carrying the 1-indexed row position the server just delivered. [short 25][int rowID].

The trap on the way: I initially used SHORT for bufSize and the server hung silently — same SHORT-vs-INT diagnostic pattern as Phase 4.x's CURNAME+NFETCH. Captured a JDBC trace, byte-diffed against ours, found the mismatch.
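
Packing the PDU per the layout above; a sketch that assumes network byte order and takes the SQ_EOT value as a parameter rather than guessing it:

```python
import struct

SQ_ID, SQ_SFETCH = 4, 23
NEXT, LAST, ABSOLUTE = 1, 4, 6  # scrolltype values

def build_sfetch(scrolltype: int, target: int, sq_eot: int) -> bytes:
    # [short SQ_ID=4][int 23][short scrolltype][int target]
    # [int bufSize=4096][short SQ_EOT]. bufSize must be packed as an
    # int ("i"); packing it as a short ("h") is the silent-hang trap.
    return struct.pack(">hihiih", SQ_ID, SQ_SFETCH, scrolltype, target, 4096, sq_eot)
```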

Tests

14 new integration tests in test_scroll_cursor_server.py. Total: 69 unit + 191 integration = 260 tests.

2026.05.04.1 — Scroll cursors

Added

  • Scroll cursor API on Cursor (Phase 17):

    • cur.scroll(value, mode='relative'|'absolute') — PEP 249 compatible
    • cur.fetch_first() / cur.fetch_last() — jump to ends
    • cur.fetch_prior() — backward step (SQL-standard semantics: from past-end yields the last row)
    • cur.fetch_absolute(n) — 0-indexed jump; negative n indexes from the end
    • cur.fetch_relative(n) — n-step from current position
    • cur.rownumber — current 0-indexed position (None if before-first or no result set)

    In-memory implementation — no new wire-protocol messages; the existing materialized result set in cur._rows is now indexed rather than iterated. For server-side scroll over huge result sets, SQ_SFETCH (tag 23) would be needed — Phase 18 if anyone hits the in-memory ceiling.
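
Usage, for orientation (connection arguments are placeholders):

```python
import informix_db

conn = informix_db.connect("...")  # placeholder args
cur = conn.cursor()
cur.execute("SELECT id, name FROM t ORDER BY id")

cur.fetch_last()           # jump to the end
cur.fetch_absolute(-2)     # negative n counts from the end
cur.scroll(-1)             # PEP 249 relative move by default
cur.fetch_first()
print(cur.rownumber)       # 0: positions are 0-indexed
```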

Tests

14 new integration tests in test_scroll_cursor.py. Total: 69 unit + 177 integration = 246 tests.

2026.05.04 — Library completion

The Phase 0 ambition — first pure-Python Informix SQLI driver — reaches feature completeness. Adds async, TLS, connection pool, smart-LOBs, fast-path RPC, composite UDTs.

Added

  • Async API (informix_db.aio) — AsyncConnection, AsyncCursor, AsyncConnectionPool for FastAPI / aiohttp / asyncio. Each blocking I/O call is offloaded to a worker thread via asyncio.to_thread; event loop never blocks.
  • Connection pool (informix_db.create_pool) — thread-safe with min/max sizing, lazy growth, health-check on acquire, error-aware eviction.
  • TLS — tls=True for self-signed dev servers, tls=ssl.SSLContext for production. Wrapping happens in IfxSocket so the rest of the protocol layer is unaware.
  • Smart-LOBs (BLOB / CLOB) — full read/write end-to-end via cursor.read_blob_column() / cursor.write_blob_column() using the server's lotofile / filetoblob SQL functions intercepted at the SQ_FILE (98) protocol level.
  • Legacy in-row blobs (BYTE / TEXT) — bind + read via the SQ_BBIND / SQ_BLOB / SQ_FETCHBLOB protocol family.
  • Fast-path RPC (Connection.fast_path_call) — direct stored-procedure invocation bypassing PREPARE/EXECUTE; routine handles cached per-connection.
  • Composite UDT recognition — ROW, SET, MULTISET, LIST columns return typed RowValue / CollectionValue wrappers exposing schema and raw bytes.
  • Type codecs — INTERVAL (both DAY-TO-FRACTION and YEAR-TO-MONTH families), DATETIME (all qualifier ranges), DECIMAL / MONEY (BCD with sign+exp head byte and asymmetric base-100 complement for negatives), DATE, BOOL, all integer / float widths, CHAR / VARCHAR / LVARCHAR.
  • Transactions — implicit SQ_BEGIN before each transaction in non-ANSI logged DBs; transparent no-ops on unlogged DBs.
  • PEP 249 exception hierarchy — server SQLCODE mapped to the right exception class (IntegrityError for duplicate-key violations, ProgrammingError for syntax errors, etc.).
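
To give a feel for the async surface, a sketch under assumed spellings (the pool constructor arguments and which calls are awaitable are guesses; only the class names come from the list above):

```python
import asyncio
from informix_db.aio import AsyncConnectionPool

async def main():
    # Each blocking wire call is offloaded with asyncio.to_thread under
    # the hood, so the event loop never blocks on the socket.
    pool = AsyncConnectionPool("...", min_size=1, max_size=4)  # placeholder args
    async with pool.acquire() as conn:
        cur = await conn.cursor()
        await cur.execute("SELECT tabname FROM systables WHERE tabid = 1")
        print(await cur.fetchone())
    await pool.close()

asyncio.run(main())
```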

Test coverage

232 tests total: 69 unit + 163 integration. Unit tests run with no external dependencies; integration tests run against the IBM Informix Developer Edition Docker image.

Known gaps (deferred)

  • Full ROW/COLLECTION recursive parsing: Phase 12 ships type recognition + raw-bytes wrapper. Parsing the textual representation into typed Python tuples/sets/lists is deferred — most workloads can use SQL projections (SELECT row_col.fieldname FROM tbl) instead.
  • UDT parameter encoding for fast-path: scalar params/returns work; passing a 72-byte BLOB locator as a UDT param requires extending the SQ_BIND encoder with the extended_owner/extended_name preamble for type > 18.
  • Native async I/O: Phase 16 ships a thread-pool wrapper that's functionally equivalent for typical FastAPI workloads. Native async (asyncpg-style transport abstraction) would be Phase 17 if a real workload needs it.

2026.05.02 — Phase 1: connection lifecycle

Initial release. connect() / close() works end-to-end. Cursor / execute / fetch arrived in Phase 2 (subsequent commits within the same session).