informix-db

Author	SHA1	Message	Date
Ryan Malloy	8eb19f7534	Phase 34: Scaling benchmarks (1k/10k/100k rows; 5/20/50 cols) (2026.05.05.8) Adds tests/benchmarks/test_scaling_perf.py with parametrized benchmarks across row-count, column-width, and type-mix axes. Caught the NFETCH-loop bug (Phase 35) immediately on first run. Headline numbers: Bulk insert (executemany in transaction): 1k rows: 23 ms (23 us/row) 10k rows: 161 ms (16 us/row) 100k rows: 1487 ms (15 us/row, ~67k rows/sec sustained) SELECT (linear scaling, near-constant per-row): 1k rows: 2.7 ms (2.7 us/row) 10k rows: 25.8 ms (2.6 us/row) 100k rows: 271 ms (2.7 us/row) Wide-row SELECT (1k rows x N cols): 5 cols: 2.4 ms 20 cols: 5.1 ms 50 cols: 10.1 ms Type-mix SELECT (INT + VARCHAR + DECIMAL + DATE + FLOAT + SMALLINT): 1000 rows: 4.7 ms (4.7 us/row, ~1.7x baseline) Per-row codec cost is essentially constant from 1k to 100k rows (2.7 us/row), proving parse_tuple_payload optimizations (Phases 23-25) hold at 100x scale with no GC-pause amplification or memory-pressure degradation. Per-row insert cost actually DECREASES with scale (23us at 1k to 15us at 100k) - Phase 33's pipelining amortizes prepare/release overhead better at larger N. 10 new parametrized benchmarks. Total: 77 unit + 249 integration + 43 benchmark = 369 tests.	2026-05-05 12:38:07 -06:00
Ryan Malloy	1282893412	Phase 35: CRITICAL fix - NFETCH loop for large result sets (2026.05.05.7) DATA-LOSS BUG: cursor.fetchall() on result sets larger than ~200 rows was silently truncating to the first ~200 rows. The exact cap depended on row width and the server's per-NFETCH buffer (4096 bytes default). The bug: _execute_select sent NFETCH twice and stopped: self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name)) self._read_fetch_response() self._conn._send_pdu(self._build_nfetch_pdu()) # comment: "DONE only" self._read_fetch_response() # then CLOSE+RELEASE — discarding remaining queued rows The "second fetch returns DONE only" comment was wrong. For any result set larger than the server's per-NFETCH batch, the second fetch returns more tuples AND there are still tuples queued server-side. The cursor closed and dropped them. Latent for 30 phases because every existing test used either a small result set (FIRST 10) or relied on row counts that fit naturally in 1-2 batches. Discovered by Phase 34's scaling benchmark when SELECT FIRST 100000 from a 100k-row table returned 200 rows. The fix: loop NFETCH until a response yields zero new tuples. self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name)) rows_before = len(self._rows) self._read_fetch_response() rows_received = len(self._rows) - rows_before while rows_received > 0: self._conn._send_pdu(self._build_nfetch_pdu()) rows_before = len(self._rows) self._read_fetch_response() rows_received = len(self._rows) - rows_before 249 integration tests pass. The scaling benchmark suite (Phase 34, shipping next) is the regression test going forward. Workaround for users on older versions: use scrollable cursors (cursor(scrollable=True)) which use the SQ_SFETCH protocol path and don't have this bug. If you've been using this driver for queries returning large result sets, your queries may have been truncating silently. Re-run them against 2026.05.05.7+ to verify your data.	2026-05-05 12:37:22 -06:00
Ryan Malloy	362ecb3d63	Phase 33: Pipelined executemany - 2.85x faster bulk insert (2026.05.05.6) The serial-loop executemany paid one wire round-trip per row (~30us/row on loopback). It was the one benchmark where IfxPy beat us in the comparison work - we were 10% slower at executemany(1000) in txn. Phase 33 pipelines the BIND+EXECUTE PDUs: build all N PDUs, send them back-to-back, then drain all N responses. Eliminates per-row RTT entirely. Performance impact: * executemany(1000) in txn: 31.3 ms -> 11.0 ms (2.85x faster) * executemany(100) autocommit: 173 ms -> 154 ms (11% faster) * executemany(1000) autocommit: 1740 ms -> 1590 ms (9% faster) (Autocommit gets smaller wins because server-side log flushes dominate - Phase 21.1's "autocommit cliff".) IfxPy comparison flipped: us 10% slower -> us 2.05x faster on bulk inserts. We now win all 5 head-to-head benchmarks against the C-bound driver. Margaret Hamilton review surfaced one CRITICAL concern (C1) - the pipeline assumes Informix sends N responses for N pipelined PDUs even when one fails. If the server cut the stream short, the drain loop would deadlock on the next read. Verified by 3 new integration tests in tests/test_executemany_pipeline.py: * test_pipelined_executemany_mid_batch_constraint_violation (row 500/1000) * test_pipelined_executemany_first_row_fails (row 0/100) * test_pipelined_executemany_last_row_fails (row 99/100) All confirm Informix sends N responses; wire stays aligned; connection is usable after. Plus 4 lower-priority fixes Hamilton recommended: * H1: documented _raise_sq_err self-drains-SQ_EOT invariant + tripwire * H2: docstring warning about O(N) lock duration; chunk for huge batches * M1: prepend row-index to exception message rather than reformat * M2: documented sendall-no-timeout caveat on hostile networks 77 unit + 239 integration + 33 benchmark = 349 tests; ruff clean. Note: Phase 32 (Tier 1+2 benchmarks) was tagged without bumping pyproject.toml's version string. 2026.05.05.5 was git-tag-only; 2026.05.05.6 is the next published version increment.	2026-05-05 12:26:15 -06:00
Ryan Malloy	eb8d15d204	README + classifier polish for PyPI launch PyPI users landing on the README need to know quickly: - What this is (already strong) - Whether it's safe to use in production (was missing) - Performance expectations (was missing) - Python version requirement (was only in pyproject.toml metadata) Updates: * Added "Status" section with the Hamilton audit findings table - every critical/high/medium addressed, 0 remaining. Names the Hamilton-style review process explicitly as the credibility signal. * Added Python ≥ 3.10 requirement under the install command. * Added "Performance" section with single-connection benchmarks and the 53x autocommit-cliff gotcha (most important perf pitfall). * Updated "Standards & guarantees" to mention Phase 27's wire lock alongside the PEP 249 Threadsafety=1 declaration - accurate context for sophisticated readers. * Tightened "Development" to PyPI-appropriate brevity (short Makefile target list instead of full uv invocations). * Updated stale phase count (22+ → 30) and test counts (69 → 77 unit, 163 → 231 integration). Added "300+ tests" rough number in the Status section to reduce future staleness churn. * Fixed typo: "no thread of native machinery" → "no native machinery anywhere in the thread of execution". * Bumped pyproject.toml classifier from "Development Status :: 4 - Beta" to "5 - Production/Stable" - earned by the audit work. No code changes.	2026-05-05 11:06:49 -06:00
Ryan Malloy	0b13acb13d	Phase 30: Final hardening pass (2026.05.05.4) Closes the last 3 medium-severity items from Hamilton's system-wide audit. 0 critical, 0 high, 0 medium remaining. What changed: pool.py: * Pool acquire() growth path: restructured to remove _lock._is_owned() (CPython-private API) usage. Two explicit re-acquires (success path + exception path) replace the older try/finally + private check. connections.py: * _raise_from_rejection now extracts the server's human-readable error string from the rejection payload and surfaces it in the OperationalError. Wrong-password vs wrong-database now produce distinguishable errors. New helper _extract_server_error_text finds the longest printable-ASCII run (8-256 chars). Falls back to a hex preview when no string is found. * _send_exit: broadened catch from (OperationalError, InterfaceError, OSError, ProtocolError) to bare Exception. Best-effort by definition; the socket FD is freed by close()'s finally clause via _socket.IfxSocket.close (idempotent, never-raising). Prevents unexpected errors from escaping close() and leaving partial state. 5 new unit tests in test_protocol.py for _extract_server_error_text: finds-longest-run, picks-longest-of-multiple, too-short-returns-None, empty-handled, caps-at-256. 77 unit + 231 integration + 28 benchmark = 336 tests; ruff clean. Hamilton audit punch list final state: every actionable finding addressed. No CRITICAL, no HIGH, no MEDIUM remaining. Pre-Phase-26: 2 critical, 3 high, 5 medium Post-Phase-30: 0 critical, 0 high, 0 medium - PRODUCTION READY	2026-05-05 10:52:39 -06:00
Ryan Malloy	8e8b81fe8d	Phase 29: Deferred-cleanup queue (2026.05.05.3) Closes the unbounded-leak gap on long-lived pooled connections that Phase 28's cursor finalizer left as future work. When the finalizer can't acquire the wire lock (cross-thread GC during another thread's op), instead of leaking + logging, it enqueues the cleanup PDUs to a per-connection deferred queue. The next normal operation drains the queue under the wire lock, completing the cleanup atomically before the new op. What changed: connections.py: * Connection._pending_cleanup: list[bytes] + Connection._cleanup_lock (separate from _wire_lock - tiny critical section for list mutation only, allows enqueue without waiting for an in-flight wire op) * _enqueue_cleanup(pdus): thread-safe append, callable from any thread (including finalizers without lock ownership) * _drain_pending_cleanup(): pop-the-list + send-each-PDU. Caller must hold _wire_lock. Force-closes on wire desync (same doctrine as _raise_sq_err) * _send_pdu opportunistically drains the queue before sending. Cost is one length-check when queue is empty (the common case) cursors.py: * _finalize_cursor enqueues [_CLOSE_PDU, _RELEASE_PDU] instead of leaking when the lock is busy. WARNING demoted to DEBUG since leak no longer accumulates. Lock-order discipline: _cleanup_lock is held only for list extend/pop; _wire_lock is held for the actual wire I/O. Never grab _wire_lock while holding _cleanup_lock - the drain pops-and-clears under _cleanup_lock, then iterates under _wire_lock (which caller holds). Two new regression tests: * test_enqueue_cleanup_drains_on_next_send_pdu - verifies queue mechanism end-to-end * test_pending_cleanup_thread_safe_enqueue - 8x50 concurrent enqueues, no race-loss 72 unit + 231 integration + 28 benchmark = 331 tests; ruff clean. Hamilton audit punch list status: 0 critical, 0 high, 3 medium remaining (login errors, _send_exit cleanup, pool acquire re-entrance) - all Phase 30 scope.	2026-05-05 10:47:49 -06:00
Ryan Malloy	fdb9ba32d5	Phase 28: Resource leak hardening (2026.05.05.2) Closes Hamilton audit High #4 (bare-except in error drain) and High #5 (no cursor finalizers), plus 1 medium one-liner. After Phases 26-28, 0 CRITICAL and 0 HIGH audit findings remain. Driver is PRODUCTION READY. What changed: cursors.py: * Cursor finalizers via weakref.finalize. Mid-fetch raises (or any GC without explicit close()) now release server-side resources (CLOSE + RELEASE PDUs). Pre-built static PDU bytes at module load so finalizer can run on any thread without allocating or calling cursor methods. * Non-blocking lock acquire prevents cross-thread GC deadlock. WARNING log on lock-busy so leak accumulation is visible. * state=[False] list pattern keeps finalizer closure weak. GIL dependency of atomic single-element mutation documented. * _raise_sq_err near-token parse: (ProtocolError, OSError) only. * _raise_sq_err drain: force-close connection on same exceptions (wire unrecoverable after desync). connections.py: * _raise_sq_err drain: same hardening as cursor version. Force-close on (ProtocolError, OSError, OperationalError) - the latter from _drain_to_eot raising on unknown tags. Documented inline. * Added contextlib import for force-close suppression. cursors.py write_blob_column: * BLOB_PLACEHOLDER validation now requires EXACTLY ONE occurrence. Pre-Phase-28, str.replace silently substituted every occurrence - corrupting SQL containing the literal string in comments etc. Now raises ProgrammingError with workaround pointer. _resultset.py: * Investigated end-of-loop bounds check for parse_tuple_payload. Reverted: long-standing off-by-one in UDTVAR(lvarchar) trailing-pad logic produces benign over-reads (payload is a fully-extracted bytes object; over-reads return empty slices through unused branches). Real silent-corruption surfaces are length-prefix decoders, needing branch-local checks. Documented as deliberate non-fix. Margaret Hamilton review surfaced two blocking conditions: * Asymmetric failure handling: _raise_sq_err force-closed the connection on wire desync, but the cursor finalizer silently swallowed identical failures. "Same wire, same failure mode, same response" - finalizer now matches _raise_sq_err's discipline. * Leak visibility: wire-lock-busy log was DEBUG. Promoted to WARNING so leak accumulation on pooled connections is visible. Plus three documentation improvements (GIL dependency, OperationalError in desync taxonomy, parse_tuple non-fix rationale). One new regression test: * test_write_blob_column_rejects_multiple_placeholders 72 unit + 229 integration + 28 benchmark = 329 tests; ruff clean. Phase 29 ticket (Hamilton recommended): deferred-cleanup queue drained at next _send_pdu, closes unbounded-leak gap on long-lived pooled connections. Not blocking Phase 28. Hamilton audit verdict: Pre-26: 2 critical, 3 high, 5 medium Post-28: 0 critical, 0 high, 4 medium	2026-05-05 03:56:24 -06:00
Ryan Malloy	6afdbcabb3	Phase 27: Wire lock + async cancellation eviction (2026.05.05.1) Closes Hamilton audit Critical #2 (concurrency / wire lock) and High #3 (async cancellation evicts cleanly). Phase 26 fixed what gets returned to the pool; Phase 27 fixes what can interleave on the wire while it's running. What changed: connections.py: * Added Connection._wire_lock = threading.RLock(). Wrapped commit(), rollback(), fast_path_call() under the lock. * _ensure_transaction documents the lock as a precondition AND asserts ownership at runtime (_wire_lock._is_owned()) so a future caller adding a third call site fails loudly. * close() tries to acquire wire lock with 0.5s timeout before SQ_EXIT; skips polite exit and force-closes if busy. cursors.py: * execute() body extracted into _execute_under_wire_lock() and called under the lock. * executemany() body wrapped inline. * _sfetch_at() wrapped - covers all scrollable fetch_* methods that delegate to it. * close() locks the CLOSE+RELEASE for scrollable cursors. pool.py: * release() acquires conn._wire_lock with 5s timeout before rollback. On timeout: log WARNING, evict connection. Constant _RELEASE_WIRE_LOCK_TIMEOUT for tunability. aio.py: * AsyncConnectionPool.connection() now catches CancelledError / TimeoutError separately and routes to broken=True. Combined with the wire lock, asyncio.wait_for around aio DB calls is now safe. * Updated docstring; mirrored in docs/USAGE.md. Margaret Hamilton review surfaced three actionable conditions, all addressed before tagging: * Cancellation test used contextlib.suppress - could pass without exercising the cancellation path on a fast runner. Switched to pytest.raises so the test fails if timeout doesn't fire. * _ensure_transaction precondition documented but unchecked at runtime. Added assert self._wire_lock._is_owned() guard. * Connection.close() was unsynchronized. Now tries 0.5s acquire before SQ_EXIT. Two new regression tests in tests/test_pool.py: * test_concurrent_threads_on_one_connection_dont_interleave_pdus (without lock: garbled results / hangs) * test_async_wait_for_cancellation_evicts_connection (asserts pool size shrinks; cancellation actually fires) 72 unit + 228 integration + 28 benchmark = 328 tests; ruff clean. Hamilton verdict: PRODUCTION READY WITH CAVEATS (was) -> CAVEATS NARROWED FURTHER (now). 0 critical, 2 high remaining (cursor finalizers + bare-except in error drain) - both Phase 28 scope.	2026-05-05 03:40:39 -06:00
Ryan Malloy	5c4a7a57f1	Phase 26: Pool rollback-on-release - CRITICAL data-correctness fix (2026.05.05) Fixes the dirty-pool-checkout bug surfaced by Margaret Hamilton's system-wide audit (Critical #1). The bug: ConnectionPool.release() returned connections with open server-side transactions still active. Request A's uncommitted INSERTs would be inherited by Request B reusing the same connection - B's commit would land A's writes permanently; B's rollback would silently lose them. Same shape as psycopg2's pre-2.5 dirty-pool bug. The fix: pool.release() now rolls back any open transaction before returning the connection to the idle list. The rollback runs OUTSIDE the pool lock since it's a wire round-trip - the connection is already off the idle list and counted in _total, so no other thread can grab it during the rollback window. If the rollback itself fails (dead socket, etc.), the connection is evicted rather than recycled. Async path covered automatically: AsyncConnectionPool.release() delegates to the sync pool's release via _to_thread. Margaret Hamilton review pass surfaced two findings, both addressed: * Silent rollback failure: added a WARNING log via logging.getLogger("informix_db.pool") so evictions are debuggable. First logger in the project. * Async cancellation race: the fix doesn't introduce the asyncio.wait_for race (Critical #2, deferred to Phase 27), but it adds a code path that can trigger it. Documented loudly in pool.release() docstring, aio.py module docstring, and USAGE.md async section. Recommendation: use read_timeout on the connection instead of asyncio.wait_for until Phase 27 lands. Two new regression tests in tests/test_pool.py: * test_uncommitted_writes_invisible_to_next_acquirer (the bug) * test_committed_writes_survive_pool_checkout (no over-correction) Verified the regression test catches the bug: stashed the fix, ran the test - it fails with "B sees 1 rows - leaked across pool checkout boundary" - confirming it tests the real failure mode. Total tests: 72 unit + 226 integration + 28 benchmark = 326. Deferred to Phase 27 per Hamilton audit: * Critical #2 (concurrency / per-connection wire lock) * High #3 (async cancellation routes to broken=True) * High #4 (bare except in _raise_sq_err drain) * High #5 (no cursor finalizers - server-side resource leaks)	2026-05-05 03:22:18 -06:00
Ryan Malloy	e9aed6ce59	Phase 25: Branch reorder + invariant tripwires (2026.05.04.10) Third-pass optimization on parse_tuple_payload's hot loop. Previous phases removed redundant work; this one removes correct-but-wasteful work: the if/elif chain checked branches in implementation order, not frequency order. Fixed-width types (INT, FLOAT, DATE, BIGINT - the most common columns in real queries) sat at the bottom, paying ~7 frozenset misses per column. Changes (src/informix_db/_resultset.py): * Added _FIXED_WIDTH_TYPES = frozenset(FIXED_WIDTHS.keys()) at module load. * New fast-path branch at the TOP of parse_tuple_payload's loop body that handles every _FIXED_WIDTH_TYPES column inline: one frozenset check, one dict lookup, one decode, continue. Skips every other branch. * Cleaned up the bottom fall-through; it now genuinely only catches unknown types. Performance vs Phase 24 baseline: * parse_tuple_5cols_iso8859: 1659 ns -> 1400 ns (-16%) * parse_tuple_5cols_utf8: 1649 ns -> 1341 ns (-19%) Cumulative vs Phase 21 baseline (before any optimization): * parse_tuple_5cols: 2796 ns -> 1400 ns (-50%) - HALF the time * decode_int: 230 ns -> 139 ns (-40%) Margaret Hamilton review surfaced one HIGH finding addressed before tagging: * H: The fast-path optimization assumes every FIXED_WIDTHS key is decodable WITHOUT qualifier inspection (encoded_length etc.). True today, but a future contributor adding a fixed-width type that needs qualifier bits (like DATETIME does) would silently get wrong decode behavior - Lauren-Bug class failure. Fix: added INVARIANT comment to FIXED_WIDTHS in converters.py AND added tests/test_resultset_invariants.py with three CI tripwire tests: - _FIXED_WIDTH_TYPES is disjoint from every other dispatch branch - Every FIXED_WIDTHS key has a DECODERS entry - DECODERS keys stay < 0x100 (Phase 24 collision-free guarantee) The tests carry instructions: if one fires, don't update the test to match - either restore the property or refactor the optimization. Comments rot when nobody reads them; tests fail loudly. baseline.json refreshed; 72 unit + 224 integration + 28 bench = 324 tests; ruff clean.	2026-05-04 23:34:05 -06:00
Ryan Malloy	dfa60ea501	Phase 24: Decoder dispatch split + struct precompilation (2026.05.04.9) Second pass of hot-path optimization on parse_tuple_payload. Two changes to converters.py: 1. Split decode() into public + internal. Added _decode_base(base_tc, raw, encoding) that takes an already-base-typed code and skips the redundant base_type() call. Public decode() is now a one-line wrapper. parse_tuple_payload's 4 call sites swapped to use _decode_base directly. _fastpath.py's external decode() caller is unaffected. 2. Pre-compiled struct.Struct unpackers. The fixed-width integer/float decoders (_decode_smallint, _decode_int, _decode_bigint, _decode_smfloat, _decode_float, _decode_date) switched from per-call struct.unpack(fmt, raw) to module-level bound methods like _UNPACK_INT = struct.Struct("!i").unpack. Format-string parsed once at module load. Measured 37% faster than per-call struct.unpack on CPython 3.13 micro. Performance vs Phase 23 baseline: * decode_int: 173 ns -> 139 ns (-20%) * decode_bigint: 188 ns -> 150 ns (-20%) * parse_tuple_5cols: 2047 ns -> 1592 ns (-22%) * 1k-row SELECT: 1255 us -> 989 us (-21%) Cumulative vs original Phase 21 baseline: * decode_int: 230 ns -> 139 ns (-40%) * parse_tuple_5cols: 2796 ns -> 1592 ns (-43%) * 1k-row SELECT: 1477 us -> 989 us (-33%) Real-world fetch ceiling: 358K rows/sec -> ~620K rows/sec. Margaret Hamilton review surfaced one HIGH-severity finding addressed before tagging: * H: The no-collision guarantee that makes _decode_base safe is structural but undocumented (all DECODERS keys are ≤ 0xFF, all flag bits are ≥ 0x100, so flagged inputs cannot coincidentally match). Added load-bearing INVARIANT comment at DECODERS dict explaining the constraint and what to do if violated. Cross-referenced from _decode_base's docstring for bidirectional traceability. baseline.json refreshed; all 224 integration tests pass; ruff clean.	2026-05-04 19:31:21 -06:00
Ryan Malloy	f3e589c5bf	Phase 23: Hot-path optimization for parse_tuple_payload (2026.05.04.8) Per-row decode is hit on every row of every SELECT. The original code had three forms of waste in the inner loop: 1. Redundant base_type() call. ColumnInfo.type_code is already base-typed by parse_describe at construction; calling base_type() again per column per row was pure waste. Single largest savings. 2. IntFlag->int conversions inline (~10x per iteration). Lifted to module-level _TC_X constants. 3. Lazy imports inside the loop body (_decode_datetime, _decode_interval, BlobLocator, ClobLocator, RowValue, CollectionValue). Moved to top. Plus three precomputed frozensets (_LENGTH_PREFIXED_SHORT_TYPES, _COMPOSITE_UDT_TYPES, _NUMERIC_TYPES) replace inline tuple-membership checks. _COLLECTION_KIND_MAP is now MappingProxyType (actually frozen). Performance: * parse_tuple_5cols: 2796 ns -> 2030 ns (-27%) * select_bench_table_all (1k rows): 1477 us -> 1198 us (-19%) * Codec micro-bench, cold connect, executemany: unchanged Real-world fetch ceiling on a single connection: 350K rows/sec -> 490K rows/sec. Margaret Hamilton review surfaced four cleanup items, all addressed before tagging: * H1: cursor._dereference_blob_columns had the same redundant base_type() call - stripped for consistency. * M1: documented the load-bearing invariant at parse_describe (the single producer site) so future contributors have a grep target. * M2: _COLLECTION_KIND_MAP wrapped in MappingProxyType. * L1: stale line-number comment fixed to point at the INVARIANT comment instead. baseline.json refreshed; all 224 integration tests pass; ruff clean.	2026-05-04 17:52:20 -06:00
Ryan Malloy	0e0dfcba26	Phase 22: User-facing documentation refresh (2026.05.04.7) The docs/USAGE.md predated Phases 17-21, so anyone landing on PyPI was missing scrollable cursors, locale/Unicode, the autocommit cliff finding, and the type-mapping reference. Added sections to docs/USAGE.md: * Locale and Unicode - client_locale, Connection.encoding, CLIENT_LOCALE vs DB_LOCALE, when characters can't fit the codec * Type mapping reference - full SQL <-> Python type table, NULL sentinels subsection, IntervalYM * Performance tips - 53x autocommit-cliff fix, 100x executemany win, 72x pool win, with the actual benchmark numbers from Phase 21.1 * Scrollable cursors - fetch_* API, in-memory vs server-side trade-off, edge cases (past-end semantics, negative indexing, rownumber) * Timeouts and keepalive subsection - production starting points * Environment dictionary subsection - env={} parameter * Known limitations - explicit table of what doesn't work (named params, complex UDT bind, GSSAPI, XA) with workarounds; "things that might surprise you" notes README.md - added Documentation section linking to docs/USAGE.md and tests/benchmarks/README.md. Doc corrections caught during review: * cursor.rownumber is 0-indexed (impl has always been correct; only the original docstring wording was loose) * fetch_* methods work on BOTH scrollable=True and default cursors; the in-memory path supports them too USAGE.md grew from 345 lines to 633.	2026-05-04 17:33:37 -06:00
Ryan Malloy	495128c679	Phase 21.1: executemany perf - it was the autocommit cliff (2026.05.04.6) Investigation of the Phase 21 baseline finding that executemany(N) cost scaled linearly per-row (1.74 ms x N) regardless of batch size. Root cause: every autocommit=True INSERT forces a server-side transaction-log flush. Not a wire-protocol bug. Numbers: * executemany(1000) autocommit=True: 1.72 s (1.72 ms/row) * executemany(1000) in single txn: 32 ms (32 us/row) 53x speedup from changing the transaction boundary, not the driver. Pure protocol overhead is ~32 us/row -> ~31K rows/sec sustained throughput on a single connection. Comparable to pg8000. Added test_executemany_1000_rows_in_txn benchmark to make this visible. Updated README headline numbers and added a "Performance gotchas" section explaining when autocommit=False matters. Decision: don't pipeline. The remaining 32 us is already excellent; the autocommit gotcha is the real user-facing footgun. Docs > code. If someone reports needing >31K rows/sec single-connection, that becomes Phase 22.	2026-05-04 17:26:16 -06:00
Ryan Malloy	90ce035a00	Phase 21: Performance benchmarks (2026.05.04.5) Adds tests/benchmarks/ with pytest-benchmark coverage of the hot codec paths and end-to-end SELECT/INSERT/pool/async round-trips. Establishes a committed baseline.json so PRs can be regression-checked at review via --benchmark-compare. * test_codec_perf.py (16): decode/encode_param/parse_tuple_payload micro-benchmarks - run without container, suitable for pre-merge CI. * test_select_perf.py (4): SELECT round-trips - 1-row latency floor, 10-row, 1k-row full fetch, parameterized. * test_insert_perf.py (3): single-row INSERT, executemany 100 / 1000. * test_pool_perf.py (3): cold connect, pool acquire/release, pool acquire + query + release. * test_async_perf.py (2): async round-trip overhead, 10x concurrent. * baseline.json: committed snapshot, 28 measurements. * benchmark pytest marker, gated off by default. * Makefile: bench / bench-codec / bench-save targets; test-integration excludes benchmarks for speed. Headline numbers (dev container loopback): * decode(int): 181 ns * parse_tuple 5 cols: 2.87 µs/row * SELECT 1 round-trip: 177 µs * Pool acquire+query+release: 295 µs * Cold connect: 11.2 ms (72x slower than pool) UTF-8 decode carries no measurable cost vs iso-8859-1 - confirms Phase 20 didn't regress anything. Total: 69 unit + 211 integration + 28 benchmark = 308 tests.	2026-05-04 17:21:12 -06:00
Ryan Malloy	bea1a1cd0c	Phase 20: UTF-8/multibyte locale support (2026.05.04.4) Thread CLIENT_LOCALE through to user-data string codecs. Driver previously hardcoded iso-8859-1 for all string conversions, which broke any locale outside Western European code points. * Connection.encoding property derived from client_locale via _python_encoding_from_locale (en_US.utf8 -> utf-8, en_US.8859-1 -> iso-8859-1, etc.) * encode_param / decode / parse_tuple_payload accept an encoding parameter; cursor and fast-path call sites forward conn.encoding * Smart-LOB CLOB encode/decode and TEXT decode honor connection encoding * DataError raised for non-representable chars; cursor releases the prepared statement before propagating so connection state stays clean Boundary discipline: protocol-level strings (cursor names, function signatures, SQ_FILE fnames, error near-tokens, SQL text) stay iso-8859-1 (always ASCII, never user-controlled). 9 new integration tests in tests/test_unicode.py covering ASCII round-trip, Latin-1 high-bit, full byte range, locale-mapping, encoding property, UTF-8 negotiation, multibyte (skipped without IFX_UTF8_DATABASE), DataError on non-representable, CLOB round-trip. Total: 69 unit + 212 integration = 281 tests.	2026-05-04 17:13:19 -06:00
Ryan Malloy	9703279bc8	Phase 19: resilience tests via fault injection (v2026.05.04.3) Fills the highest-priority gap from the test-adequacy audit: connection-failure recovery. 12 new integration tests using a thread-based TCP proxy (ControlledProxy) that can be kill()'d at any moment to simulate network drops or server crashes via TCP RST (SO_LINGER=0). Coverage: * Network drop mid-SELECT — OperationalError, not hang * Network drop after describe, before fetch * Network drop during fetch (already-materialized rows still readable; fresh execute fails) * Local socket forced-close (kernel-level disconnect simulation) * I/O error marks connection unusable post-failure * Pool evicts connection that died mid-`with` block (size drops) * Pool revives after all idle connections died (health check on acquire mints fresh) * Async cancellation via asyncio.wait_for — pool stays usable * Cursor reusable after SQL error * Connection survives cursor close after error * Sustained pool load (50 acquire/release cycles, no leak) * read_timeout fires on a hung connection within bounds Catches the failure classes that bite production users: * Hangs (waiting forever on dead socket) * Silent corruption (EOF treated as valid tuple) * Double-fault (cleanup raises after primary error) * Pool poisoning (broken connection returned to pool) * Stale cursor reuse across error boundaries Helper: * tests/_proxy.py — ControlledProxy: thread-based TCP forwarder with kill() for fault injection. Two-thread pump model. SO_LINGER=0 for RST-on-close (mimics router drop). Total: 69 unit + 203 integration = 272 tests. Remaining gaps from the audit (UTF-8 multibyte locale, server-version matrix, performance benchmarks) are real but lower-severity. Phase 19 addressed the one most likely to bite production deployments.	2026-05-04 16:57:06 -06:00
Ryan Malloy	a42dc5c5de	Phase 18: server-side scrollable cursors via SQ_SFETCH (v2026.05.04.2) Opt-in via conn.cursor(scrollable=True). Opens the cursor with SQ_SCROLL (24) before SQ_OPEN (6), keeps it open server-side, and sends SQ_SFETCH (23) per scroll call instead of materializing the result set up-front. User-facing API is identical to Phase 17's in-memory scroll (fetch_first/last/prior/absolute/relative, scroll, rownumber). Only the internal mechanism differs (default vs scrollable=True): memory: all rows vs one row at a time; round-trips/fetch: 0 (after NFETCH) vs 1 per call; cursor lifetime: closed after exec vs open until close(); best for: sequential iter vs random access on huge result sets. Wire format (verified against JDBC ScrollProbe capture): * SQ_SFETCH: [short SQ_ID=4][int 23][short scrolltype][int target][int bufSize=4096][short SQ_EOT] scrolltype: 1=NEXT, 4=LAST, 6=ABSOLUTE * SQ_SCROLL (24): emitted between CURNAME and SQ_OPEN * SQ_TUPID (25): response tag with 1-indexed row position; authoritative source for client-side position tracking Position tracking uses the server's SQ_TUPID rather than client-computed indexes. Total row count discovered lazily via SFETCH(LAST) when negative absolute indexing requires it; cached in _scroll_total_rows. Trap on the way: initial SFETCH used SHORT for bufSize → server hung silently. Same SHORT-vs-INT diagnostic pattern as Phase 4.x's CURNAME+NFETCH. Captured JDBC trace, byte-diffed against ours, found the mismatch (bufSize is INT in modern Informix per isXPSVER8_40 / is2GBFetchBufferSupported). Tests: 14 integration tests in test_scroll_cursor_server.py covering lifecycle, sequential fetch, fetch_first/last/prior/absolute/relative, negative indexing, scroll, empty result sets, past-end, and random-access on a 100-row result set. Total: 69 unit + 191 integration = 260 tests.	2026-05-04 16:41:25 -06:00
Ryan Malloy	461c62c8d3	Phase 17: scroll cursor API (v2026.05.04.1)

Adds scroll/random-access methods on Cursor:
* scroll(value, mode='relative'|'absolute'): PEP 249 compatible
* fetch_first() / fetch_last(): jump to result-set ends
* fetch_prior(): step backward (SQL-standard: from past-end yields the last row, matching JDBC ResultSet.previous() semantics)
* fetch_absolute(n): 0-indexed jump; negative n indexes from end
* fetch_relative(n): n-step from current position
* rownumber property: current 0-indexed position

Implementation: replaced _row_iter (single-pass iterator) with _row_index (random-access index) on the cursor. The result set is already materialized in _rows during execute(); scroll just repositions the index. No new wire protocol needed.

For server-side scroll over genuinely huge result sets, SQ_SFETCH (tag 23) would be needed. JDBC has executeScrollFetch (line 3908), but we only need it if someone hits the in-memory materialization ceiling. Phase 18 if so.

Out-of-range scroll raises IndexError per PEP 249. Invalid mode strings raise ProgrammingError. fetchall() now correctly returns only the rows from the current position to end (not all rows).

14 new integration tests in test_scroll_cursor.py covering:
* fetchone advancing rownumber sequentially
* fetch_first reset
* fetch_last
* fetch_prior including the past-end-to-last-row semantics
* fetch_absolute with positive and negative indexes
* fetch_relative
* PEP 249 scroll(value, mode='relative'/'absolute')
* IndexError on out-of-range
* ProgrammingError on bad mode
* Empty-result-set edge cases
* fetchall after partial iteration

Total: 69 unit + 177 integration = 246 tests.	2026-05-04 15:51:24 -06:00
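The index-repositioning idea from the Phase 17 message (rows already materialized in `_rows`, a `_row_index` that scroll calls move around) might look roughly like this. It is a sketch and not the driver's Cursor: the class name `InMemoryScroll` is hypothetical, treating `_row_index` as "the next row fetchone() will return" is an assumed convention, and the driver's handling of empty result sets may differ from the IndexError raised here.

```python
class InMemoryScroll:
    """Random-access view over an already-materialized result set."""

    def __init__(self, rows):
        self._rows = list(rows)
        # ASSUMPTION: the index names the next row fetchone() will return.
        self._row_index = 0

    def fetchone(self):
        if self._row_index >= len(self._rows):
            return None
        row = self._rows[self._row_index]
        self._row_index += 1
        return row

    def fetch_absolute(self, n):
        """Jump to 0-indexed row n; negative n counts back from the end."""
        total = len(self._rows)
        index = n + total if n < 0 else n
        if not 0 <= index < total:
            raise IndexError(f"row {n} is outside a result set of {total} rows")
        self._row_index = index + 1
        return self._rows[index]

    def fetch_first(self):
        return self.fetch_absolute(0)

    def fetch_last(self):
        return self.fetch_absolute(-1)

    def fetchall(self):
        """Return only the rows from the current position to the end."""
        remaining = self._rows[self._row_index:]
        self._row_index = len(self._rows)
        return remaining
```

Because every method only moves an integer, no extra wire traffic is involved, which matches the message's note that no new protocol work was needed for this phase.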
Ryan Malloy	0c856372a6	v2026.05.04: bump CalVer + polish docs

Version bump (2026.05.02 → 2026.05.04) reflects the library reaching feature completeness across Phases 1-16.

Documentation:
* README.md: full rewrite. The previous README was from Phase 1 ("cursor() / execute() / fetchone() arrive in Phase 2"). New README covers: sync + async APIs, connection pool, TLS, full type matrix, smart-LOBs, fast-path RPC, server-compatibility, development workflow, and pointers to the protocol research docs.
* docs/USAGE.md: new practical recipe guide. Connecting, cursor lifecycle, parameter binding, transactions (logged + unlogged), executemany, smart-LOB read/write, connection pool, async, TLS, error handling, fast-path RPC, server-side setup steps, and a migration table from IfxPy / legacy informixdb.
* CHANGELOG.md: new file. Captures the v2026.05.04 release as the Phase 1-16 completion milestone with a full feature inventory and known-gap list. Future point-releases append here.

Classifiers updated:
* Development Status: 2 → 4 (Pre-Alpha → Beta)
* Added Framework :: AsyncIO

Keywords: added asyncio, async.

No code changes; tests still pass (69 unit + 163 integration = 232). Ruff clean.	2026-05-04 15:38:09 -06:00
Ryan Malloy	300e1bf7b4	Phase 16: async API (informix_db.aio)

Ships AsyncConnection, AsyncCursor, and AsyncConnectionPool that expose async/await versions of the sync API for use with FastAPI, aiohttp, etc.

Strategy: thread-pool wrapping (aiopg pattern), not native async. Each blocking I/O call is offloaded to a worker thread via asyncio.to_thread. The event loop never blocks; queries run in parallel up to the pool's max_size.

Cost: ~250 lines, no changes to the sync codebase. Native async (Phase 17) would require a ~2000-line transport abstraction refactor, deferred until a real workload needs it.

For typical FastAPI/aiohttp workloads (request → one query → return), this is functionally equivalent to native async. Each await yields the loop while a worker thread does the I/O. Only differs for hundreds-of-concurrent-connections workloads.

API mirrors the sync API one-to-one:

    import asyncio
    from informix_db import aio

    async def main():
        pool = await aio.create_pool(host=..., min_size=1, max_size=10)
        async with pool.connection() as conn:
            cur = await conn.cursor()
            await cur.execute("SELECT id FROM users WHERE name = ?", (name,))
            row = await cur.fetchone()
        await pool.close()

The async pool preserves the sync pool's eviction policy: connection errors evict, application errors retain.

Tests: 9 integration tests in test_aio.py covering open/close, async-with, simple/parameterized SELECT, async-for cursor iteration, pool acquire/release, 20-query concurrent gather (verifies parallelism through max_size=5 pool), pool async context manager, commit/rollback.

Total: 69 unit + 163 integration = 232 tests.

Pyproject changes:
* Added pytest-asyncio>=1.3.0 as dev dep
* asyncio_mode = "auto" so async tests don't need decorators

Architectural completion: with Phase 16, every backlog item is done. The Phase 0 ambition (first pure-Python Informix driver, no native deps) is now genuinely complete.	2026-05-04 14:58:19 -06:00
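The thread-pool wrapping strategy named in the Phase 16 message can be illustrated with a few lines around `asyncio.to_thread`. This sketch does not reproduce informix_db.aio: `BlockingCursor` is a hypothetical stand-in for a synchronous cursor, and `AsyncCursorSketch` and `run_concurrently` are illustrative names only.

```python
import asyncio
import time


class BlockingCursor:
    """Hypothetical stand-in for a synchronous cursor doing socket I/O."""

    def __init__(self):
        self._row = None

    def execute(self, sql, params=()):
        time.sleep(0.05)  # simulates a blocking network round-trip
        self._row = (sql, tuple(params))

    def fetchone(self):
        return self._row


class AsyncCursorSketch:
    """Exposes awaitable methods by running each blocking call in a worker thread."""

    def __init__(self, sync_cursor):
        self._cur = sync_cursor

    async def execute(self, sql, params=()):
        # The event loop stays free while the worker thread blocks.
        await asyncio.to_thread(self._cur.execute, sql, params)

    async def fetchone(self):
        return await asyncio.to_thread(self._cur.fetchone)


async def run_concurrently(n):
    """Issue n executes at once, each on its own cursor, and collect the rows."""
    cursors = [AsyncCursorSketch(BlockingCursor()) for _ in range(n)]
    await asyncio.gather(*(c.execute("SELECT 1") for c in cursors))
    return [await c.fetchone() for c in cursors]
```

Calling `asyncio.run(run_concurrently(4))` drives four simulated queries through the default executor at the same time. Each wrapper owns its own blocking cursor here, which sidesteps the question of sharing one connection across threads.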
Ryan Malloy	9b1fd8af2c	Phase 1: pure-Python SQLI login works end-to-end

This commit takes informix-db from documentation-only (Phase 0 spike) to a functional connect() / close() against a real Informix server. To our knowledge, this is the first pure-socket Informix client in any language: no CSDK, no JVM, no native libraries.

Layered architecture per the plan, mirroring PyMySQL's shape:

  src/informix_db/
    __init__.py: PEP 249 surface (connect, exceptions, paramstyle="numeric")
    exceptions.py: full PEP 249 hierarchy declared up front
    _socket.py: raw socket I/O (read_exact, write_all, timeouts)
    _protocol.py: IfxStreamReader / IfxStreamWriter framing primitives (big-endian, 16-bit-aligned variable payloads, length-prefixed nul-terminated strings)
    _messages.py: SQ_* tags from IfxMessageTypes + ASF/login markers
    _auth.py: pluggable auth handlers; plain-password is the only Phase-1 implementation
    connections.py: Connection class; builds the binary login PDU (SLheader + PFheader byte-for-byte per PROTOCOL_NOTES.md §3), sends it, parses the server response, wires up close()

Phase 1 design decisions locked in DECISION_LOG.md:
- paramstyle = "numeric" (matches Informix ESQL/C convention)
- Python >= 3.10
- autocommit defaults to off (PEP 249 implicit)
- License: MIT
- Distribution name: informix-db (verified PyPI-available)

Test coverage: 34 unit tests (codec round-trips against synthetic byte streams; observed login-PDU values from the spike captures asserted as exact byte literals) + 6 integration tests (connect, idempotent close, context manager, bad-password → OperationalError, bad-host → OperationalError, cursor() raises NotImplementedError).

  pytest                   runs 34 unit tests, no Docker needed
  pytest -m integration    runs 6 integration tests against the Developer Edition container (pinned by digest in tests/docker-compose.yml)
  pytest -m ""             runs everything

ruff is clean across src/ and tests/.

One bug found during smoke testing: threading.get_ident() can exceed signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed the same way the JDBC reference does: clamp to signed 32-bit, fall back to 0 if out of range. The field is diagnostic only.

One protocol-level observation that AMENDED the JDBC source reading: the "capability section" in the login PDU is three independently negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one int + 8 reserved zero bytes as my CFR decompile read suggested. The server echoes them back identically. Trust the wire over the decompiler.

Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
- Login byte layout: confirmed (server accepts our pure-Python PDU)
- Disconnection: confirmed (SQ_EXIT round-trip works)
- Framing primitives: confirmed (34 unit tests)
- Error path: bad password → OperationalError, bad host → OperationalError

Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard unknowns there (exact column-descriptor layout, statement-time error format) were called out as bounded gaps in Phase 0 and have existing captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize against.	2026-05-02 19:10:24 -06:00
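The thread-ident overflow fix described in the Phase 1 message (keep the value only if it fits a signed 32-bit int, otherwise fall back to 0) can be sketched as follows. The function name `pack_thread_id` and its optional argument are hypothetical, added only so the behaviour can be exercised with a fixed value.

```python
import struct
import threading

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def pack_thread_id(ident=None) -> bytes:
    """Pack a thread identifier as a big-endian signed 32-bit int.

    Values outside the signed 32-bit range are replaced with 0, since
    struct.pack("!i", ...) would otherwise raise struct.error.
    """
    if ident is None:
        ident = threading.get_ident()
    if not INT32_MIN <= ident <= INT32_MAX:
        ident = 0  # the field is diagnostic only, so losing the value is acceptable
    return struct.pack("!i", ident)
```

Substituting 0 rather than truncating the value is safe here because, as the message notes, the field is diagnostic only.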

22 Commits