Closes some of the C-vs-Python codec gap on bulk fetch by moving
per-column dispatch decisions from row time to parse_describe time.
This is the same approach psycopg3 uses in its pure-Python mode
(a per-column loader cache).
What changed:
_resultset.py:
* New compile_column_readers(columns) builds a per-column dispatch
tuple at parse_describe time. Each tuple is (kind, *args) where
kind is a small int (FIXED/BYTE_PREFIX/CHAR/LVARCHAR/DECIMAL/
DATETIME/INTERVAL/LEGACY).
* parse_tuple_payload accepts an optional readers= parameter. The fast
path uses an int comparison + tuple unpack instead of the legacy
frozenset/dict-lookup chain.
* _legacy_dispatch_one_column factored out to handle rare types
(UDT/composite/UDTVAR) that fall through.
cursors.py:
* Cursor caches self._column_readers after parse_describe,
computed once via compile_column_readers. Reset on new execute.
* Fetch loop passes readers=self._column_readers.
Performance (median of 10+ rounds):
select_scaling[1000]: 2.7 ms -> 2.51 ms (-7%)
select_scaling[10000]: 25.8 ms -> 25.0 ms (-3%)
select_scaling[100000]: 271 ms -> 246 ms (-9%)
wide_row_select[5]: 2.4 ms -> 2.16 ms (-10%)
wide_row_select[20]: 5.1 ms -> 4.14 ms (-19%)
wide_row_select[50]: 10.1 ms -> 8.21 ms (-19%)
wide_row_select[100]: 19.4 ms -> 14.6 ms (-25%)
Wide-row workloads benefit most - per-column dispatch savings
accumulate linearly with column count. At 100 cols, 25% speedup.
IfxPy gap shrinks from ~2.4x to ~2.2x on bulk fetch. Real progress,
but the gap is not closed. The next lever is exec()-based codegen
(a per-result-set decoder function) - possible Phase 38.
221 integration tests still pass. Benchmark suite acts as regression
test.
Architectural note: chose tuple dispatch (r[0] int compare) over
object-method dispatch (loader.load(data)) for ~20-30 ns/col speed
advantage in the inner loop. Slightly less extensible than psycopg3's
class-based loaders but materially faster in pure Python.
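To make the tuple-dispatch shape concrete, a minimal sketch follows.
The kind constants, reader layout, and column descriptors here are
illustrative only - not the actual _resultset.py internals:

import struct

# Hypothetical dispatch kinds - small ints so the hot loop branches on one int compare.
FIXED, CHAR = 0, 1

_UNPACK_INT = struct.Struct("!i").unpack

def compile_column_readers(columns):
    """Build one (kind, *args) tuple per column, once, at describe time."""
    readers = []
    for type_name, width in columns:            # assumed (type, length) descriptors
        if type_name == "int":
            readers.append((FIXED, 4, _UNPACK_INT))
        else:                                    # every other kind elided in this sketch
            readers.append((CHAR, width))
    return tuple(readers)

def parse_row(payload, readers, encoding="iso-8859-1"):
    row, pos = [], 0
    for r in readers:
        if r[0] == FIXED:                        # int compare + tuple unpack, no dict lookups
            _, width, unpack = r
            row.append(unpack(payload[pos:pos + width])[0])
        else:
            _, width = r
            row.append(payload[pos:pos + width].decode(encoding).rstrip())
        pos += width
    return row

readers = compile_column_readers([("int", 4), ("char", 4)])
print(parse_row(struct.pack("!i", 7) + b"ab  ", readers))    # [7, 'ab']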
Adds three things to test_scaling_perf.py:
1. 100-column wide-row SELECT - codec stress test at extreme widths.
1k rows x 100 cols = 19.4 ms (~19.4 us/row, ~194 ns/column-decode).
Per-column cost continues to drop with width thanks to loop
amortization (5 cols: 480 ns/col -> 100 cols: 194 ns/col).
2. 100k-row memory profile - samples RSS pre-execute, post-execute
(materialization cost), and during iteration. Real numbers:
pre-execute: 45.8 MB
post-execute: 71.2 MB (+25.4 MB = ~259 bytes/row materialization)
iteration: 0 KB extra (just walks the existing list)
Documents the in-memory cursor's actual cost: 100k rows = 25 MB,
1M rows = ~250 MB. A fair regression baseline (the test trips at 500 MB).
3. 1M-row scaling gated behind IFX_BENCH_1M=1 env var. Default off
because the dev container's rootdbs runs out of space. For
production-sized servers, users can opt in. The runtimes are what
linear extrapolation from the 100k numbers predicts (executemany
100k -> 1M = ~15 s, SELECT 100k -> 1M = ~3 s).
Note on the dev-container size limit: the dev image's rootdbs is sized
for typical developer workloads, not stress testing. A 1M-row
INSERT exceeds the available pages and fails with -242 ISAM -113
(out of space). This is correct behavior - the limit is enforced
at the storage layer.
Switched RSS sampling from ru_maxrss (peak, monotonic) to
/proc/self/status VmRSS (current). Earlier runs showed flat
RSS because the peak from earlier in the test session masked the
fluctuation.
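A sketch of the sampling helper (the function name is illustrative;
the VmRSS line of /proc/self/status is what the benchmark reads):

def current_rss_kb():
    """Current resident set size in kB from /proc/self/status (Linux only).
    Unlike resource.getrusage().ru_maxrss this is the *current* footprint,
    so it can fall again after a large result set is released."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])      # e.g. "VmRSS:   71234 kB"
    raise RuntimeError("VmRSS not found - not a Linux /proc filesystem?")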
Extends the IfxPy comparison bench script with scaling workloads
(1k/10k/100k rows for both executemany and SELECT). Re-runs the
full comparison with consistent measurement methodology and updates
the README with the corrected numbers.
Earlier comparison runs reported informix-db winning all 5
benchmarks. Re-running select_bench_table_all with consistent
measurement gives 3.04 ms, not the 891 us I cited earlier - a
3.4x discrepancy attributable to noisy warmup + small-fixture
artifacts. The "we win everything" framing was wrong.
Corrected comparison reveals two clear stories:
Bulk-insert: pure-Python wins 1.6x at scale.
executemany(10k): IfxPy 259ms -> us 161ms (1.6x faster)
executemany(100k): IfxPy 2376ms -> us 1487ms (1.6x faster)
Reason: Phase 33's pipelining eliminates per-row RTT. IfxPy's
per-call API can't pipeline.
Large-fetch: IfxPy wins 2.3-2.4x at scale.
SELECT 1k rows: IfxPy 1.2ms / us 2.7ms (IfxPy 2.3x)
SELECT 10k rows: IfxPy 11.3ms / us 25.8ms (IfxPy 2.3x)
SELECT 100k rows: IfxPy 112ms / us 271ms (IfxPy 2.4x)
Reason: C-level fetch_tuple at ~1.1us/row beats Python
parse_tuple_payload at ~2.7us/row. Real C-vs-Python codec gap
showing up at scale.
For everyday workloads (single SELECT in a request, INSERT a
handful of rows), the drivers are within 5-25% of each other. Where
the gap widens, the direction depends on the workload: bulk-write
favors us, bulk-read favors IfxPy.
README's "Compared to IfxPy" section rewritten with the corrected
numbers and an honest "when to prefer which" subsection.
tests/benchmarks/compare/README.md mirror updated.
Net narrative: a "faster at bulk-write, slower at bulk-read,
comparable elsewhere" comparison story is more honest and more
durable than a "we win everything" claim that would have collapsed
the first time a user ran their own benchmark.
Side note (lint): one ambiguous unicode `×` in cursors.py replaced
with `x`.
Phase 37 ticket: parse_tuple_payload is the bottleneck at scale.
Closing the 1.6 us/row gap to IfxPy would make us competitive on
bulk-fetch too. Possible approaches: Cython codec, deeper inlining,
per-column dispatch pre-bake.
DATA-LOSS BUG: cursor.fetchall() on result sets larger than ~200 rows
was silently truncating to the first ~200 rows. The exact cap depended
on row width and the server's per-NFETCH buffer (4096 bytes default).
The bug:
_execute_select sent NFETCH twice and stopped:
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
self._read_fetch_response()
self._conn._send_pdu(self._build_nfetch_pdu()) # comment: "DONE only"
self._read_fetch_response()
# then CLOSE+RELEASE — discarding remaining queued rows
The "second fetch returns DONE only" comment was wrong. For any
result set larger than the server's per-NFETCH batch, the second
fetch returns more tuples AND there are still tuples queued
server-side. The cursor closed and dropped them.
Latent for 30 phases because every existing test used either a small
result set (FIRST 10) or relied on row counts that fit naturally in
1-2 batches. Discovered by Phase 34's scaling benchmark when
SELECT FIRST 100000 from a 100k-row table returned 200 rows.
The fix: loop NFETCH until a response yields zero new tuples.
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
rows_before = len(self._rows)
self._read_fetch_response()
rows_received = len(self._rows) - rows_before
while rows_received > 0:
    self._conn._send_pdu(self._build_nfetch_pdu())
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
249 integration tests pass. The scaling benchmark suite (Phase 34,
shipping next) is the regression test going forward.
Workaround for users on older versions: use scrollable cursors
(cursor(scrollable=True)) which use the SQ_SFETCH protocol path
and don't have this bug.
If you've been using this driver for queries returning large result
sets, your queries may have been truncating silently. Re-run them
against 2026.05.05.7+ to verify your data.
The serial-loop executemany paid one wire round-trip per row (~30us/
row on loopback). It was the one benchmark where IfxPy beat us in
the comparison work - we were 10% slower at executemany(1000) in a txn.
Phase 33 pipelines the BIND+EXECUTE PDUs: build all N PDUs, send
them back-to-back, then drain all N responses. Eliminates per-row
RTT entirely.
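The shape of the change, as a hedged sketch (the builder and
response-reader method names here are illustrative, not the driver's
real internals):

def pipelined_executemany(conn, build_bind_execute_pdu, rows):
    # Build phase: no I/O, just N BIND+EXECUTE PDUs.
    pdus = [build_bind_execute_pdu(row) for row in rows]        # illustrative builder
    # Write phase: send back-to-back without waiting for responses.
    for pdu in pdus:
        conn._send_pdu(pdu)
    # Drain phase: exactly one response per PDU, so the wire stays aligned.
    for index in range(len(pdus)):
        conn._read_execute_response(index)   # illustrative name; surfaces the failing row index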
Performance impact:
* executemany(1000) in txn: 31.3 ms -> 11.0 ms (2.85x faster)
* executemany(100) autocommit: 173 ms -> 154 ms (11% faster)
* executemany(1000) autocommit: 1740 ms -> 1590 ms (9% faster)
(Autocommit gets smaller wins because server-side log flushes
dominate - Phase 21.1's "autocommit cliff".)
IfxPy comparison flipped: us 10% slower -> us 2.05x faster on bulk
inserts. We now win all 5 head-to-head benchmarks against the C-bound
driver.
Margaret Hamilton review surfaced one CRITICAL concern (C1) - the
pipeline assumes Informix sends N responses for N pipelined PDUs
even when one fails. If the server cut the stream short, the drain
loop would deadlock on the next read.
Verified by 3 new integration tests in tests/test_executemany_pipeline.py:
* test_pipelined_executemany_mid_batch_constraint_violation (row 500/1000)
* test_pipelined_executemany_first_row_fails (row 0/100)
* test_pipelined_executemany_last_row_fails (row 99/100)
All confirm Informix sends N responses; wire stays aligned; connection
is usable after.
Plus 4 lower-priority fixes Hamilton recommended:
* H1: documented _raise_sq_err self-drains-SQ_EOT invariant + tripwire
* H2: docstring warning about O(N) lock duration; chunk for huge batches
* M1: prepend row-index to exception message rather than reformat
* M2: documented sendall-no-timeout caveat on hostile networks
77 unit + 239 integration + 33 benchmark = 349 tests; ruff clean.
Note: Phase 32 (Tier 1+2 benchmarks) was tagged without bumping
pyproject.toml's version string. .5 was git-tag-only; .6 is the next
published version increment.
Tier 1 — make existing benchmarks reliable:
* Bumped slow-bench rounds: cold_connect_disconnect 5->15, executemany
series 3->10. Single-round outliers no longer dominate.
* Switched bench reporting to median + IQR. Mean was being moved by
individual GC pauses / scheduler hiccups (IfxPy executemany IQR
was 8.2 ms on a 28 ms median - 29% spread - mean was unreliable).
* Updated ifxpy_bench.py to also report median + IQR alongside mean
for cross-comparable numbers.
* Makefile bench targets now show median, iqr, mean, stddev, ops, rounds.
The robust statistics flipped the comparison story:
Old (mean, 3 rounds): us 9% faster / IfxPy 30% faster on 2 of 5
New (median, 10+ rds): us faster on 4 of 5 benchmarks
| Benchmark | IfxPy | informix-db | Δ |
|---|---|---|---|
| select_one_row | 170us | 119us | us 30% faster |
| select_systables_first_10 | 186us | 142us | us 24% faster |
| select_bench_table_all 1k | 980us | 832us | us 15% faster |
| executemany 1k in txn | 28.3ms | 31.3ms | us 10% slower |
| cold_connect_disconnect | 12.0ms | 10.7ms | us 11% faster |
Tier 2 — add benchmarks for claims we make but don't verify:
tests/benchmarks/test_observability_perf.py:
* test_streaming_fetch_memory_profile — RSS sampling during a
cursor iteration. Documents memory growth shape; regression
wall at 100 MB / 1k rows. Currently flat (in-memory cursor
doesn't grow detectably for 278 rows).
* test_select_1_latency_percentiles — 1000-query distribution
with p50/p90/p95/p99/max. Result: p99/p50 = 1.42x (tight tail).
p50=108us, p99=153us.
* test_concurrent_pool_throughput[2,4,8] — N worker threads
through pool, measures aggregate QPS + per-thread fairness.
Plateaus at ~6K QPS (server-bound); per-thread latency scales
~linearly with N (server serialization expected).
README.md (project root): updated Compared-to-IfxPy table with
the median-based numbers + IQR awareness note.
tests/benchmarks/compare/README.md: added "Statistical robustness"
section explaining why median over mean for fair comparison.
236 integration tests pass; ruff clean.
Adds a paired benchmark of informix-db (pure Python) against IfxPy
3.0.5 (IBM's C-bound driver via OneDB ODBC) on identical workloads
against the same Informix dev container.
Headline result: pure Python is competitive — and faster on 2/5
benchmarks where wire round-trip dominates over codec/marshaling.
| Benchmark | IfxPy | informix-db | Result |
|---|---:|---:|---:|
| select_one_row (single-row latency) | 128 us | 116 us | us 9% faster |
| select_systables_first_10 | 126 us | 184 us | IfxPy 32% faster |
| select_bench_table_all (1k rows) | 969 us | 855 us | us 12% faster |
| executemany(1000) in txn | 21.5 ms | 30.8 ms | us 30% slower |
| cold_connect_disconnect | 11.0 ms | 10.9 ms | comparable |
Why the surprising wins: IfxPy's path is Python -> OneDB ODBC ->
libifdmr -> wire. Ours is Python -> wire. When wire round-trip
dominates (single-row, bulk fetch), the missing abstraction layer
makes us faster. When per-row marshaling dominates (executemany),
IfxPy's C-level execute(stmt, tuple) beats Python BIND-PDU build.
Files added under tests/benchmarks/compare/:
* Dockerfile.ifxpy — Ubuntu 20.04 base with IfxPy + OneDB drivers
* ifxpy_bench.py — IfxPy benchmark workloads matching test_*_perf.py
* README.md — methodology, results, install gauntlet, reproduction
The IfxPy install gauntlet itself is part of the comparison story:
modern Python 3.11 (not 3.13), setuptools <58, permissive CFLAGS,
manual download of 92MB OneDB ODBC tarball, four LD_LIBRARY_PATH
directories, libcrypt.so.1 (deprecated 2018, missing on Arch /
Fedora 35+ / RHEL 9). Versus our `pip install informix-db`.
README.md (project root): added "Compared to IfxPy" section under
Performance with the headline numbers and a pointer to the full
methodology.
.gitignore: keep Dockerfile/script/README under tests/benchmarks/
compare/, exclude the 92MB OneDB tarball and the local venv.
PyPI users landing on the README need to know quickly:
- What this is (already strong)
- Whether it's safe to use in production (was missing)
- Performance expectations (was missing)
- Python version requirement (was only in pyproject.toml metadata)
Updates:
* Added "Status" section with the Hamilton audit findings table -
every critical/high/medium addressed, 0 remaining. Names the
Hamilton-style review process explicitly as the credibility signal.
* Added Python ≥ 3.10 requirement under the install command.
* Added "Performance" section with single-connection benchmarks and
the 53x autocommit-cliff gotcha (most important perf pitfall).
* Updated "Standards & guarantees" to mention Phase 27's wire lock
alongside the PEP 249 Threadsafety=1 declaration - accurate context
for sophisticated readers.
* Tightened "Development" to PyPI-appropriate brevity (short Makefile
target list instead of full uv invocations).
* Updated stale phase count (22+ → 30) and test counts (69 → 77 unit,
163 → 231 integration). Added "300+ tests" rough number in the
Status section to reduce future staleness churn.
* Fixed typo: "no thread of native machinery" → "no native machinery
anywhere in the thread of execution".
* Bumped pyproject.toml classifier from "Development Status :: 4 -
Beta" to "5 - Production/Stable" - earned by the audit work.
No code changes.
Closes the last 3 medium-severity items from Hamilton's system-wide
audit. **0 critical, 0 high, 0 medium remaining.**
What changed:
pool.py:
* Pool acquire() growth path: restructured to remove _lock._is_owned()
(CPython-private API) usage. Two explicit re-acquires (success path
+ exception path) replace the older try/finally + private check.
connections.py:
* _raise_from_rejection now extracts the server's human-readable
error string from the rejection payload and surfaces it in the
OperationalError. Wrong-password vs wrong-database now produce
distinguishable errors. New helper _extract_server_error_text
finds the longest printable-ASCII run (8-256 chars). Falls back
to a hex preview when no string is found.
* _send_exit: broadened catch from (OperationalError, InterfaceError,
OSError, ProtocolError) to bare Exception. Best-effort by
definition; the socket FD is freed by close()'s finally clause via
_socket.IfxSocket.close (idempotent, never-raising). Prevents
unexpected errors from escaping close() and leaving partial state.
5 new unit tests in test_protocol.py for _extract_server_error_text:
finds-longest-run, picks-longest-of-multiple, too-short-returns-None,
empty-handled, caps-at-256.
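A condensed sketch of the heuristic those tests describe (this
re-implementation is illustrative, not the helper verbatim):

def extract_server_error_text(payload: bytes, min_len: int = 8, max_len: int = 256):
    best, run = "", []
    for b in payload + b"\x00":                  # sentinel byte closes the final run
        if 0x20 <= b <= 0x7E:                    # printable ASCII
            run.append(chr(b))
        else:
            if len(run) > len(best):
                best = "".join(run)
            run = []
    if len(best) >= min_len:
        return best[:max_len]                    # cap very long runs at 256 chars
    return None                                  # caller falls back to a hex preview

print(extract_server_error_text(b"\x00\x04Incorrect password or user\x00\x01"))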
77 unit + 231 integration + 28 benchmark = 336 tests; ruff clean.
Hamilton audit punch list final state: every actionable finding
addressed. No CRITICAL, no HIGH, no MEDIUM remaining.
Pre-Phase-26: 2 critical, 3 high, 5 medium
Post-Phase-30: 0 critical, 0 high, 0 medium - PRODUCTION READY
Closes the unbounded-leak gap on long-lived pooled connections that
Phase 28's cursor finalizer left as future work. When the finalizer
can't acquire the wire lock (cross-thread GC during another thread's
op), instead of leaking + logging, it enqueues the cleanup PDUs to a
per-connection deferred queue. The next normal operation drains the
queue under the wire lock, completing the cleanup atomically before
the new op.
What changed:
connections.py:
* Connection._pending_cleanup: list[bytes] + Connection._cleanup_lock
(separate from _wire_lock - tiny critical section for list mutation
only, allows enqueue without waiting for an in-flight wire op)
* _enqueue_cleanup(pdus): thread-safe append, callable from any
thread (including finalizers without lock ownership)
* _drain_pending_cleanup(): pop-the-list + send-each-PDU. Caller
must hold _wire_lock. Force-closes on wire desync (same doctrine
as _raise_sq_err)
* _send_pdu opportunistically drains the queue before sending. Cost
is one length-check when queue is empty (the common case)
cursors.py:
* _finalize_cursor enqueues [_CLOSE_PDU, _RELEASE_PDU] instead of
leaking when the lock is busy. WARNING demoted to DEBUG since
leak no longer accumulates.
Lock-order discipline: _cleanup_lock is held only for list extend/pop;
_wire_lock is held for the actual wire I/O. Never grab _cleanup_lock
while holding _wire_lock - the drain pops-and-clears under
_cleanup_lock, then iterates under _wire_lock (which caller holds).
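A sketch of the queue-and-drain shape (attribute and method names
mirror the description above; the bodies are illustrative, and
_write_wire stands in for the real socket write):

import threading

class _DeferredCleanupSketch:
    def __init__(self):
        self._wire_lock = threading.RLock()
        self._cleanup_lock = threading.Lock()    # list mutation only, never wire I/O
        self._pending_cleanup: list[bytes] = []

    def _enqueue_cleanup(self, pdus):
        # Safe from any thread, including a finalizer that couldn't get the wire lock.
        with self._cleanup_lock:
            self._pending_cleanup.extend(pdus)

    def _drain_pending_cleanup(self):
        # Precondition: caller holds _wire_lock. Pop under _cleanup_lock, then do
        # the wire I/O outside it - never nest the locks the other way around.
        with self._cleanup_lock:
            pending, self._pending_cleanup = self._pending_cleanup, []
        for pdu in pending:
            self._write_wire(pdu)

    def _send_pdu(self, pdu):
        with self._wire_lock:
            if self._pending_cleanup:            # one length check in the common empty case
                self._drain_pending_cleanup()
            self._write_wire(pdu)

    def _write_wire(self, pdu):
        pass                                     # stand-in for the real socket write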
Two new regression tests:
* test_enqueue_cleanup_drains_on_next_send_pdu - verifies queue
mechanism end-to-end
* test_pending_cleanup_thread_safe_enqueue - 8x50 concurrent enqueues,
no race-loss
72 unit + 231 integration + 28 benchmark = 331 tests; ruff clean.
Hamilton audit punch list status:
0 critical, 0 high, 3 medium remaining (login errors, _send_exit
cleanup, pool acquire re-entrance) - all Phase 30 scope.
Closes Hamilton audit High #4 (bare-except in error drain) and
High #5 (no cursor finalizers), plus 1 medium one-liner.
After Phases 26-28, 0 CRITICAL and 0 HIGH audit findings remain.
Driver is PRODUCTION READY.
What changed:
cursors.py:
* Cursor finalizers via weakref.finalize. Mid-fetch raises (or any
GC without explicit close()) now release server-side resources
(CLOSE + RELEASE PDUs). Pre-built static PDU bytes at module load
so finalizer can run on any thread without allocating or calling
cursor methods.
* Non-blocking lock acquire prevents cross-thread GC deadlock.
WARNING log on lock-busy so leak accumulation is visible.
* state=[False] list pattern keeps finalizer closure weak. GIL
dependency of atomic single-element mutation documented.
* _raise_sq_err near-token parse: (ProtocolError, OSError) only.
* _raise_sq_err drain: force-close connection on same exceptions
(wire unrecoverable after desync).
connections.py:
* _raise_sq_err drain: same hardening as cursor version. Force-close
on (ProtocolError, OSError, OperationalError) - the latter from
_drain_to_eot raising on unknown tags. Documented inline.
* Added contextlib import for force-close suppression.
cursors.py write_blob_column:
* BLOB_PLACEHOLDER validation now requires EXACTLY ONE occurrence.
Pre-Phase-28, str.replace silently substituted every occurrence -
corrupting SQL containing the literal string in comments etc.
Now raises ProgrammingError with workaround pointer.
_resultset.py:
* Investigated end-of-loop bounds check for parse_tuple_payload.
Reverted: long-standing off-by-one in UDTVAR(lvarchar) trailing-
pad logic produces benign over-reads (payload is a fully-extracted
bytes object; over-reads return empty slices through unused
branches). Real silent-corruption surfaces are length-prefix
decoders, needing branch-local checks. Documented as deliberate
non-fix.
Margaret Hamilton review surfaced two blocking conditions:
* Asymmetric failure handling: _raise_sq_err force-closed the
connection on wire desync, but the cursor finalizer silently
swallowed identical failures. "Same wire, same failure mode,
same response" - finalizer now matches _raise_sq_err's discipline.
* Leak visibility: wire-lock-busy log was DEBUG. Promoted to WARNING
so leak accumulation on pooled connections is visible.
Plus three documentation improvements (GIL dependency, OperationalError
in desync taxonomy, parse_tuple non-fix rationale).
One new regression test:
* test_write_blob_column_rejects_multiple_placeholders
72 unit + 229 integration + 28 benchmark = 329 tests; ruff clean.
Phase 29 ticket (Hamilton recommended): deferred-cleanup queue
drained at next _send_pdu, closes unbounded-leak gap on long-lived
pooled connections. Not blocking Phase 28.
Hamilton audit verdict:
Pre-26: 2 critical, 3 high, 5 medium
Post-28: 0 critical, 0 high, 4 medium
Closes Hamilton audit Critical #2 (concurrency / wire lock) and
High #3 (async cancellation evicts cleanly). Phase 26 fixed what
gets returned to the pool; Phase 27 fixes what can interleave on
the wire while it's running.
What changed:
connections.py:
* Added Connection._wire_lock = threading.RLock(). Wrapped commit(),
rollback(), fast_path_call() under the lock.
* _ensure_transaction documents the lock as a precondition AND
asserts ownership at runtime (_wire_lock._is_owned()) so a future
caller adding a third call site fails loudly.
* close() tries to acquire wire lock with 0.5s timeout before
SQ_EXIT; skips polite exit and force-closes if busy.
cursors.py:
* execute() body extracted into _execute_under_wire_lock() and
called under the lock.
* executemany() body wrapped inline.
* _sfetch_at() wrapped - covers all scrollable fetch_* methods
that delegate to it.
* close() locks the CLOSE+RELEASE for scrollable cursors.
pool.py:
* release() acquires conn._wire_lock with 5s timeout before rollback.
On timeout: log WARNING, evict connection. Constant
_RELEASE_WIRE_LOCK_TIMEOUT for tunability.
aio.py:
* AsyncConnectionPool.connection() now catches CancelledError /
TimeoutError separately and routes to broken=True. Combined with
the wire lock, asyncio.wait_for around aio DB calls is now safe.
* Updated docstring; mirrored in docs/USAGE.md.
Margaret Hamilton review surfaced three actionable conditions, all
addressed before tagging:
* Cancellation test used contextlib.suppress - could pass without
exercising the cancellation path on a fast runner. Switched to
pytest.raises so the test fails if timeout doesn't fire.
* _ensure_transaction precondition documented but unchecked at
runtime. Added assert self._wire_lock._is_owned() guard.
* Connection.close() was unsynchronized. Now tries 0.5s acquire
before SQ_EXIT.
Two new regression tests in tests/test_pool.py:
* test_concurrent_threads_on_one_connection_dont_interleave_pdus
(without lock: garbled results / hangs)
* test_async_wait_for_cancellation_evicts_connection
(asserts pool size shrinks; cancellation actually fires)
72 unit + 228 integration + 28 benchmark = 328 tests; ruff clean.
Hamilton verdict: PRODUCTION READY WITH CAVEATS (was) -> CAVEATS
NARROWED FURTHER (now). 0 critical, 2 high remaining (cursor
finalizers + bare-except in error drain) - both Phase 28 scope.
Fixes the dirty-pool-checkout bug surfaced by Margaret Hamilton's
system-wide audit (Critical #1).
The bug: ConnectionPool.release() returned connections with open
server-side transactions still active. Request A's uncommitted
INSERTs would be inherited by Request B reusing the same connection -
B's commit would land A's writes permanently; B's rollback would
silently lose them. Same shape as psycopg2's pre-2.5 dirty-pool bug.
The fix: pool.release() now rolls back any open transaction before
returning the connection to the idle list. The rollback runs OUTSIDE
the pool lock since it's a wire round-trip - the connection is
already off the idle list and counted in _total, so no other thread
can grab it during the rollback window. If the rollback itself fails
(dead socket, etc.), the connection is evicted rather than recycled.
Async path covered automatically: AsyncConnectionPool.release()
delegates to the sync pool's release via _to_thread.
Margaret Hamilton review pass surfaced two findings, both addressed:
* Silent rollback failure: added a WARNING log via logging.getLogger
("informix_db.pool") so evictions are debuggable. First logger in
the project.
* Async cancellation race: the fix doesn't introduce the
asyncio.wait_for race (Critical #2, deferred to Phase 27), but it
adds a code path that can trigger it. Documented loudly in
pool.release() docstring, aio.py module docstring, and USAGE.md
async section. Recommendation: use read_timeout on the connection
instead of asyncio.wait_for until Phase 27 lands.
Two new regression tests in tests/test_pool.py:
* test_uncommitted_writes_invisible_to_next_acquirer (the bug)
* test_committed_writes_survive_pool_checkout (no over-correction)
Verified the regression test catches the bug: stashed the fix, ran
the test - it fails with "B sees 1 rows - leaked across pool
checkout boundary" - confirming it tests the real failure mode.
Total tests: 72 unit + 226 integration + 28 benchmark = 326.
Deferred to Phase 27 per Hamilton audit:
* Critical #2 (concurrency / per-connection wire lock)
* High #3 (async cancellation routes to broken=True)
* High #4 (bare except in _raise_sq_err drain)
* High #5 (no cursor finalizers - server-side resource leaks)
Third-pass optimization on parse_tuple_payload's hot loop. Previous
phases removed redundant work; this one removes correct-but-wasteful
work: the if/elif chain checked branches in implementation order, not
frequency order. Fixed-width types (INT, FLOAT, DATE, BIGINT - the most
common columns in real queries) sat at the bottom, paying ~7 frozenset
misses per column.
Changes (src/informix_db/_resultset.py):
* Added _FIXED_WIDTH_TYPES = frozenset(FIXED_WIDTHS.keys()) at module
load.
* New fast-path branch at the TOP of parse_tuple_payload's loop body
that handles every _FIXED_WIDTH_TYPES column inline: one frozenset
check, one dict lookup, one decode, continue. Skips every other
branch.
* Cleaned up the bottom fall-through; it now genuinely only catches
unknown types.
Performance vs Phase 24 baseline:
* parse_tuple_5cols_iso8859: 1659 ns -> 1400 ns (-16%)
* parse_tuple_5cols_utf8: 1649 ns -> 1341 ns (-19%)
Cumulative vs Phase 21 baseline (before any optimization):
* parse_tuple_5cols: 2796 ns -> 1400 ns (-50%) - HALF the time
* decode_int: 230 ns -> 139 ns (-40%)
Margaret Hamilton review surfaced one HIGH finding addressed before
tagging:
* H: The fast-path optimization assumes every FIXED_WIDTHS key is
decodable WITHOUT qualifier inspection (encoded_length etc.). True
today, but a future contributor adding a fixed-width type that
needs qualifier bits (like DATETIME does) would silently get wrong
decode behavior - Lauren-Bug class failure.
Fix: added INVARIANT comment to FIXED_WIDTHS in converters.py AND
added tests/test_resultset_invariants.py with three CI tripwire
tests:
- _FIXED_WIDTH_TYPES is disjoint from every other dispatch branch
- Every FIXED_WIDTHS key has a DECODERS entry
- DECODERS keys stay < 0x100 (Phase 24 collision-free guarantee)
The tests carry instructions: if one fires, don't update the test
to match - either restore the property or refactor the optimization.
Comments rot when nobody reads them; tests fail loudly.
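A condensed sketch of what one of those tripwires looks like (the
import paths and assertion wording are assumptions, not the test file
verbatim):

from informix_db._resultset import _FIXED_WIDTH_TYPES, _LENGTH_PREFIXED_SHORT_TYPES
from informix_db.converters import DECODERS, FIXED_WIDTHS

def test_fixed_width_fast_path_stays_disjoint():
    assert _FIXED_WIDTH_TYPES.isdisjoint(_LENGTH_PREFIXED_SHORT_TYPES), (
        "A type landed in both the fixed-width fast path and a length-prefixed "
        "branch. Do NOT edit this test to match - restore the property or "
        "refactor the fast-path optimization."
    )

def test_every_fixed_width_type_has_a_decoder():
    assert set(FIXED_WIDTHS) <= set(DECODERS)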
baseline.json refreshed; 72 unit + 224 integration + 28 bench = 324
tests; ruff clean.
Second pass of hot-path optimization on parse_tuple_payload. Two changes
to converters.py:
1. Split decode() into public + internal. Added _decode_base(base_tc,
raw, encoding) that takes an already-base-typed code and skips the
redundant base_type() call. Public decode() is now a one-line
wrapper. parse_tuple_payload's 4 call sites swapped to use
_decode_base directly. _fastpath.py's external decode() caller is
unaffected.
2. Pre-compiled struct.Struct unpackers. The fixed-width integer/float
decoders (_decode_smallint, _decode_int, _decode_bigint,
_decode_smfloat, _decode_float, _decode_date) switched from per-call
struct.unpack(fmt, raw) to module-level bound methods like
_UNPACK_INT = struct.Struct("!i").unpack. Format-string parsed once
at module load. Measured 37% faster than per-call struct.unpack on
CPython 3.13 (microbenchmark).
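The pattern, reduced to its essentials (the decoder names mirror the
description above; the bodies are a sketch):

import struct

_UNPACK_INT = struct.Struct("!i").unpack         # format string parsed once, at import
_UNPACK_BIGINT = struct.Struct("!q").unpack

def _decode_int(raw: bytes) -> int:
    return _UNPACK_INT(raw)[0]                   # bound method: no per-call format parse

def _decode_bigint(raw: bytes) -> int:
    return _UNPACK_BIGINT(raw)[0]

print(_decode_int(b"\x00\x00\x00\x2a"))          # 42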
Performance vs Phase 23 baseline:
* decode_int: 173 ns -> 139 ns (-20%)
* decode_bigint: 188 ns -> 150 ns (-20%)
* parse_tuple_5cols: 2047 ns -> 1592 ns (-22%)
* 1k-row SELECT: 1255 us -> 989 us (-21%)
Cumulative vs original Phase 21 baseline:
* decode_int: 230 ns -> 139 ns (-40%)
* parse_tuple_5cols: 2796 ns -> 1592 ns (-43%)
* 1k-row SELECT: 1477 us -> 989 us (-33%)
Real-world fetch ceiling: 358K rows/sec -> ~620K rows/sec.
Margaret Hamilton review surfaced one HIGH-severity finding addressed
before tagging:
* H: The no-collision guarantee that makes _decode_base safe is
structural but undocumented (all DECODERS keys are ≤ 0xFF, all flag
bits are ≥ 0x100, so flagged inputs cannot coincidentally match).
Added load-bearing INVARIANT comment at DECODERS dict explaining
the constraint and what to do if violated. Cross-referenced from
_decode_base's docstring for bidirectional traceability.
baseline.json refreshed; all 224 integration tests pass; ruff clean.
Per-row decode is hit on every row of every SELECT. The original code
had three forms of waste in the inner loop:
1. Redundant base_type() call. ColumnInfo.type_code is already
base-typed by parse_describe at construction; calling base_type()
again per column per row was pure waste. Single largest savings.
2. IntFlag->int conversions inline (~10x per iteration). Lifted to
module-level _TC_X constants.
3. Lazy imports inside the loop body (_decode_datetime, _decode_interval,
BlobLocator, ClobLocator, RowValue, CollectionValue). Moved to top.
Plus three precomputed frozensets (_LENGTH_PREFIXED_SHORT_TYPES,
_COMPOSITE_UDT_TYPES, _NUMERIC_TYPES) replace inline tuple-membership
checks. _COLLECTION_KIND_MAP is now MappingProxyType (actually frozen).
Performance:
* parse_tuple_5cols: 2796 ns -> 2030 ns (-27%)
* select_bench_table_all (1k rows): 1477 us -> 1198 us (-19%)
* Codec micro-bench, cold connect, executemany: unchanged
Real-world fetch ceiling on a single connection: 350K rows/sec ->
490K rows/sec.
Margaret Hamilton review surfaced four cleanup items, all addressed
before tagging:
* H1: cursor._dereference_blob_columns had the same redundant
base_type() call - stripped for consistency.
* M1: documented the load-bearing invariant at parse_describe (the
single producer site) so future contributors have a grep target.
* M2: _COLLECTION_KIND_MAP wrapped in MappingProxyType.
* L1: stale line-number comment fixed to point at the INVARIANT
comment instead.
baseline.json refreshed; all 224 integration tests pass; ruff clean.
The docs/USAGE.md predated Phases 17-21, so anyone landing on PyPI was
missing scrollable cursors, locale/Unicode, the autocommit cliff
finding, and the type-mapping reference.
Added sections to docs/USAGE.md:
* Locale and Unicode - client_locale, Connection.encoding, CLIENT_LOCALE
vs DB_LOCALE, when characters can't fit the codec
* Type mapping reference - full SQL <-> Python type table, NULL
sentinels subsection, IntervalYM
* Performance tips - 53x autocommit-cliff fix, 100x executemany win,
72x pool win, with the actual benchmark numbers from Phase 21.1
* Scrollable cursors - fetch_* API, in-memory vs server-side trade-off,
edge cases (past-end semantics, negative indexing, rownumber)
* Timeouts and keepalive subsection - production starting points
* Environment dictionary subsection - env={} parameter
* Known limitations - explicit table of what doesn't work (named
params, complex UDT bind, GSSAPI, XA) with workarounds; "things
that might surprise you" notes
README.md - added Documentation section linking to docs/USAGE.md
and tests/benchmarks/README.md.
Doc corrections caught during review:
* cursor.rownumber is 0-indexed (impl has always been correct; only
the original docstring wording was loose)
* fetch_* methods work on BOTH scrollable=True and default cursors;
the in-memory path supports them too
USAGE.md grew from 345 lines to 633.
Investigation of the Phase 21 baseline finding that executemany(N) cost
scaled linearly per-row (1.74 ms x N) regardless of batch size.
Root cause: every autocommit=True INSERT forces a server-side
transaction-log flush. Not a wire-protocol bug.
Numbers:
* executemany(1000) autocommit=True: 1.72 s (1.72 ms/row)
* executemany(1000) in single txn: 32 ms (32 us/row)
53x speedup from changing the transaction boundary, not the driver.
Pure protocol overhead is ~32 us/row -> ~31K rows/sec sustained
throughput on a single connection. Comparable to pg8000.
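In user code, the fix is a transaction boundary, not a driver change.
A hedged fragment (the exact spelling of the autocommit switch isn't
shown in these notes and is assumed here):

conn = informix_db.connect(host=..., autocommit=False)   # autocommit kwarg is an assumption
cur = conn.cursor()
cur.executemany("INSERT INTO t (a, b) VALUES (?, ?)", rows)   # ~32 us/row inside one txn
conn.commit()                                    # one log flush for the whole batch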
Added test_executemany_1000_rows_in_txn benchmark to make this
visible. Updated README headline numbers and added a "Performance
gotchas" section explaining when autocommit=False matters.
Decision: don't pipeline. The remaining 32 us is already excellent;
the autocommit gotcha is the real user-facing footgun. Docs > code.
If someone reports needing >31K rows/sec single-connection, that
becomes Phase 22.
Fills the highest-priority gap from the test-adequacy audit:
connection-failure recovery. 12 new integration tests using a
thread-based TCP proxy (ControlledProxy) that can be kill()'d at
any moment to simulate network drops or server crashes via TCP RST
(SO_LINGER=0).
Coverage:
* Network drop mid-SELECT — OperationalError, not hang
* Network drop after describe, before fetch
* Network drop during fetch (already-materialized rows still
readable; fresh execute fails)
* Local socket forced-close (kernel-level disconnect simulation)
* I/O error marks connection unusable post-failure
* Pool evicts connection that died mid-`with` block (size drops)
* Pool revives after all idle connections died (health check on
acquire mints fresh)
* Async cancellation via asyncio.wait_for — pool stays usable
* Cursor reusable after SQL error
* Connection survives cursor close after error
* Sustained pool load (50 acquire/release cycles, no leak)
* read_timeout fires on a hung connection within bounds
Catches the failure classes that bite production users:
* Hangs (waiting forever on dead socket)
* Silent corruption (EOF treated as valid tuple)
* Double-fault (cleanup raises after primary error)
* Pool poisoning (broken connection returned to pool)
* Stale cursor reuse across error boundaries
Helper:
* tests/_proxy.py — ControlledProxy: thread-based TCP forwarder
with kill() for fault injection. Two-thread pump model. SO_LINGER=0
for RST-on-close (mimics router drop).
Total: 69 unit + 203 integration = 272 tests.
Remaining gaps from the audit (UTF-8 multibyte locale, server-version
matrix, performance benchmarks) are real but lower-severity. Phase 19
addressed the one most likely to bite production deployments.
Opt-in via conn.cursor(scrollable=True). Opens the cursor with
SQ_SCROLL (24) before SQ_OPEN (6), keeps it open server-side, and
sends SQ_SFETCH (23) per scroll call instead of materializing the
result set up-front.
User-facing API is identical to Phase 17's in-memory scroll
(fetch_first/last/prior/absolute/relative, scroll, rownumber).
Only the internal mechanism differs:
| feature           | default           | scrollable=True
|-------------------|-------------------|-----------------------------------
| memory            | all rows          | one row at a time
| round-trips/fetch | 0 (after NFETCH)  | 1 per call
| cursor lifetime   | closed after exec | open until close()
| best for          | sequential iter   | random access on huge result sets
Wire format (verified against JDBC ScrollProbe capture):
* SQ_SFETCH: [short SQ_ID=4][int 23][short scrolltype]
[int target][int bufSize=4096][short SQ_EOT]
scrolltype: 1=NEXT, 4=LAST, 6=ABSOLUTE
* SQ_SCROLL (24): emitted between CURNAME and SQ_OPEN
* SQ_TUPID (25): response tag with 1-indexed row position;
authoritative source for client-side position tracking
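For concreteness, packing the SFETCH body from that layout (tag values
other than those listed above, e.g. SQ_EOT's numeric value, are
placeholders):

import struct

SQ_ID, SQ_SFETCH = 4, 23
SCROLL_NEXT, SCROLL_LAST, SCROLL_ABSOLUTE = 1, 4, 6
SQ_EOT = 0x0C                                    # placeholder - real tag value not shown here

def build_sfetch_body(scrolltype: int, target: int, buf_size: int = 4096) -> bytes:
    # [short SQ_ID=4][int 23][short scrolltype][int target][int bufSize][short SQ_EOT]
    return struct.pack("!hihiih", SQ_ID, SQ_SFETCH, scrolltype, target, buf_size, SQ_EOT)

print(build_sfetch_body(SCROLL_ABSOLUTE, 42).hex())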
Position tracking uses the server's SQ_TUPID rather than client-
computed indexes. Total row count discovered lazily via SFETCH(LAST)
when negative absolute indexing requires it; cached in
_scroll_total_rows.
Trap on the way: initial SFETCH used SHORT for bufSize → server
hung silently. Same SHORT-vs-INT diagnostic pattern as Phase 4.x's
CURNAME+NFETCH. Captured JDBC trace, byte-diffed against ours,
found the mismatch (bufSize is INT in modern Informix per
isXPSVER8_40 / is2GBFetchBufferSupported).
Tests: 14 integration tests in test_scroll_cursor_server.py
covering lifecycle, sequential fetch, fetch_first/last/prior/
absolute/relative, negative indexing, scroll, empty result sets,
past-end, and random-access on a 100-row result set.
Total: 69 unit + 191 integration = 260 tests.
Adds scroll/random-access methods on Cursor:
* scroll(value, mode='relative'|'absolute') — PEP 249 compatible
* fetch_first() / fetch_last() — jump to result-set ends
* fetch_prior() — step backward (SQL-standard: from past-end yields
the last row, matching JDBC ResultSet.previous() semantics)
* fetch_absolute(n) — 0-indexed jump; negative n indexes from end
* fetch_relative(n) — n-step from current position
* rownumber property — current 0-indexed position
Implementation: replaced _row_iter (single-pass iterator) with
_row_index (random-access index) on the cursor. The result set
is already materialized in _rows during execute(); scroll just
repositions the index. No new wire protocol needed.
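The mechanism, reduced to a toy (not the real Cursor class; just the
index-repositioning idea):

class _InMemoryScrollSketch:
    def __init__(self, rows):
        self._rows = list(rows)      # already materialized during execute()
        self._row_index = 0          # replaces the old single-pass iterator

    @property
    def rownumber(self):
        return self._row_index

    def fetch_absolute(self, n):
        if n < 0:
            n += len(self._rows)     # negative n indexes from the end
        if not 0 <= n < len(self._rows):
            raise IndexError("scroll target out of range")   # per PEP 249
        self._row_index = n
        return self._rows[n]

cur = _InMemoryScrollSketch([("a",), ("b",), ("c",)])
print(cur.fetch_absolute(-1))        # ('c',) - no new wire protocol needed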
For server-side scroll over genuinely huge result sets, SQ_SFETCH
(tag 23) would be needed — JDBC has executeScrollFetch (line 3908)
but we only need it if someone hits the in-memory materialization
ceiling. Phase 18 if so.
Out-of-range scroll raises IndexError per PEP 249. Invalid mode
strings raise ProgrammingError. fetchall() now correctly returns
only the rows from the current position to end (not all rows).
14 new integration tests in test_scroll_cursor.py covering:
* fetchone advancing rownumber sequentially
* fetch_first reset
* fetch_last
* fetch_prior including the past-end-to-last-row semantics
* fetch_absolute with positive and negative indexes
* fetch_relative
* PEP 249 scroll(value, mode='relative'/'absolute')
* IndexError on out-of-range
* ProgrammingError on bad mode
* Empty-result-set edge cases
* fetchall after partial iteration
Total: 69 unit + 177 integration = 246 tests.
Version bump (2026.05.02 → 2026.05.04) reflects the library reaching
feature completeness across Phases 1-16.
Documentation:
* README.md — full rewrite. The previous README was from Phase 1
("cursor() / execute() / fetchone() arrive in Phase 2"). New
README covers: sync + async APIs, connection pool, TLS, full type
matrix, smart-LOBs, fast-path RPC, server-compatibility,
development workflow, and pointers to the protocol research docs.
* docs/USAGE.md — new practical recipe guide. Connecting, cursor
lifecycle, parameter binding, transactions (logged + unlogged),
executemany, smart-LOB read/write, connection pool, async,
TLS, error handling, fast-path RPC, server-side setup steps,
and a migration table from IfxPy / legacy informixdb.
* CHANGELOG.md — new file. Captures the v2026.05.04 release as the
Phase 1-16 completion milestone with a full feature inventory
and known-gap list. Future point-releases append here.
Classifiers updated:
* Development Status: 2 → 4 (Pre-Alpha → Beta)
* Added Framework :: AsyncIO
Keywords: added asyncio, async.
No code changes; tests still pass (69 unit + 163 integration = 232).
Ruff clean.
Ships AsyncConnection, AsyncCursor, and AsyncConnectionPool that
expose async/await versions of the sync API for use with FastAPI,
aiohttp, etc.
Strategy: thread-pool wrapping (aiopg pattern), not native async.
Each blocking I/O call is offloaded to a worker thread via
asyncio.to_thread. The event loop never blocks; queries run in
parallel up to the pool's max_size. Cost: ~250 lines, no changes
to the sync codebase. Native async (Phase 17) would require a
~2000-line transport abstraction refactor — deferred until a real
workload needs it.
For typical FastAPI/aiohttp workloads (request → one query → return),
this is functionally equivalent to native async. Each await yields
the loop while a worker thread does the I/O. Only differs for
hundreds-of-concurrent-connections workloads.
API mirrors the sync API one-to-one:
import asyncio
from informix_db import aio

async def main():
    pool = await aio.create_pool(host=..., min_size=1, max_size=10)
    async with pool.connection() as conn:
        cur = await conn.cursor()
        await cur.execute("SELECT id FROM users WHERE name = ?", (name,))
        row = await cur.fetchone()
    await pool.close()
The async pool preserves the sync pool's eviction policy: connection
errors evict, application errors retain.
Tests: 9 integration tests in test_aio.py covering open/close,
async-with, simple/parameterized SELECT, async-for cursor iteration,
pool acquire/release, 20-query concurrent gather (verifies parallelism
through max_size=5 pool), pool async context manager, commit/rollback.
Total: 69 unit + 163 integration = 232 tests.
Pyproject changes:
* Added pytest-asyncio>=1.3.0 as dev dep
* asyncio_mode = "auto" so async tests don't need decorators
Architectural completion: with Phase 16, every backlog item is
done. The Phase 0 ambition — first pure-Python Informix driver,
no native deps — is now genuinely complete.
Thread-safe connection pool with min/max sizing, lazy growth,
idle recycling, and per-acquire health-check.
API:
pool = informix_db.create_pool(host=..., min_size=1, max_size=10)
with pool.connection() as conn:
    ...
pool.close()
Design choices:
* Lazy growth from min_size — pre-opens min_size on construction,
grows to max_size on demand. Pay-nothing startup with burst capacity.
* Health-check on acquire, not release. Sends a trivial SELECT 1
round-trip before yielding. Dead idle connections (server-side
timeout, network drop) are silently replaced. The cost is ~1ms
per acquire; what it buys is "users never see a stale-
connection error". Check-on-release is wrong because idle time
is when connections actually die.
* Eviction on OperationalError/InterfaceError only. The "with
pool.connection()" context manager retains the connection on
application-level errors (ValueError, IntegrityError, etc.).
Avoids the "every constraint violation evicts a healthy connection"
pitfall.
* Releases the pool lock during connect() — the slow handshake
(50-100ms) doesn't serialize other threads' acquires.
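A sketch of the acquire-time health check and eviction shape (not the
real ConnectionPool; the health-check statement and growth bookkeeping
are simplified):

class _PoolSketch:
    def __init__(self, connect, max_size=10):
        self._connect = connect                  # factory for fresh connections
        self._idle = []
        self._max_size = max_size

    def acquire(self):
        while self._idle:
            conn = self._idle.pop()              # LIFO: most recently used first
            try:
                cur = conn.cursor()
                cur.execute("SELECT FIRST 1 tabid FROM systables")   # the "trivial round-trip"
                cur.fetchone()
                return conn                      # healthy idle connection
            except Exception:
                conn.close()                     # dead idle connection: silently replace
        return self._connect()                   # lazy growth (max_size check elided)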
Tests: 15 integration tests in test_pool.py covering:
* API & lifecycle (pre-open, lazy growth, context-manager, LIFO)
* Exhaustion (timeout when full, per-acquire override, unblock-on-release)
* Eviction (explicit broken, auto on OperationalError, retain on
application errors)
* Health-check (dead idle silently replaced)
* Shutdown (close drains, idempotent, context-manager)
* Multi-thread safety (8 workers × 3 queries each, no leaks)
Total: 69 unit + 154 integration = 223 tests.
With Phase 14 (TLS) and Phase 15 (pool), the project covers the
three things a typical Python web/API workload needs from a
database driver: PEP 249 surface, TLS transport, connection pool.
Only async (informix_db.aio) remains in the backlog.
Optional TLS via the ``tls`` parameter on connect() and IfxSocket.
Three modes:
tls=False (default) — plain TCP, current behavior unchanged
tls=True — TLS w/ verification disabled (dev / self-signed)
tls=ssl.SSLContext — caller-supplied context (production)
Plus tls_server_hostname for SNI / cert verification.
Architectural choice: Informix uses dedicated TLS-enabled listener
ports (configured in server's sqlhosts), NOT STARTTLS upgrade. The
SSL handshake runs immediately after TCP connect with no protocol-
level negotiation. Wrapping happens inside IfxSocket.__init__ so the
rest of the protocol layer (login PDU, SQ_BIND, fast-path, file
transfer) is fully unaware of whether TLS is in use.
Why tls=True defaults to insecure: most Informix dev installations
use self-signed certs. tls=True produces a context with
check_hostname=False and verify_mode=CERT_NONE. Minimum protocol is
still TLSv1.2 (per ssl.PROTOCOL_TLS_CLIENT). Production users are
expected to pass ssl.SSLContext explicitly.
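Typical usage under those modes (connection parameters other than the
ones named above are omitted or placeholders):

import ssl
import informix_db

# Production: caller-supplied context, full verification + SNI.
ctx = ssl.create_default_context(cafile="/etc/ssl/certs/informix-ca.pem")
conn = informix_db.connect(
    host="db.example.com",
    tls=ctx,
    tls_server_hostname="db.example.com",
)

# Dev / self-signed: wraps the socket in TLS but skips verification (TLSv1.2+ still enforced).
dev_conn = informix_db.connect(host="localhost", tls=True)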
Tests: 5 unit tests in test_tls.py
* tls=True dev context properties
* default context uses TLSv1.2+
* real handshake against in-process TLS echo server (proves wrap_socket
works end-to-end)
* custom SSLContext honored verbatim
* tls=True against non-TLS port raises OperationalError clearly
Test certs are generated via openssl CLI subprocess instead of
adding cryptography as a dev dep (saves ~5MB transitive deps for
one phase).
Total: 69 unit + 139 integration = 208 tests.
Architectural milestone: with Phase 14 complete, the driver now
implements EVERYTHING in the SQLI wire-protocol family that a Python
application needs. Remaining backlog (async, pooling) is library-
design work, not protocol work.
Implements direct stored-procedure invocation via the parallel
fast-path protocol family. Three new wire messages:
* SQ_GETROUTINE (101) — handle resolution by signature
* SQ_EXFPROUTINE (102) — execute by handle with bound params
* SQ_FPROUTINE (103) — response with return values
API: Connection.fast_path_call(signature, *params) -> list
Routine handles cached per-connection in a dict[signature -> (db_name,
handle)] — first call resolves and caches, subsequent calls skip
GETROUTINE.
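Usage fragment (conn is an open Connection; the signature string
format and routine shown are examples, not verified against a server):

# First call resolves the routine via SQ_GETROUTINE and caches the handle.
conn.fast_path_call("ifx_lo_close(integer)", lo_fd)      # signature format is an example

# Subsequent calls with the same signature skip GETROUTINE entirely - one
# EXFPROUTINE round-trip per call, no PREPARE/DESCRIBE/EXECUTE.
for fd in open_locator_fds:
    conn.fast_path_call("ifx_lo_close(integer)", fd)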
Why this matters even though Phase 10/11 already do most smart-LOB
work via SQL: ifx_lo_close(int) can't be invoked via "EXECUTE FUNCTION"
(returns -674). Without the fast-path, opened locators leak server-side
until the session ends. The fast-path also enables tighter UDF-in-loop
workloads — no PREPARE→DESCRIBE→EXECUTE overhead, just GETROUTINE+
EXFPROUTINE (one round-trip after caching).
Wire format examples (verified against JDBC):
* GETROUTINE request:
[short 101][byte 0][int sigLen][sig bytes][pad if odd]
[short 0][short SQ_EOT]
* EXFPROUTINE request:
[short 102][short dbNameLen][dbName][pad if odd][int handle]
[short paramCount][short fparamFlag][SQ_BIND data][short SQ_EOT]
* FPROUTINE response:
[short numReturns][per-return: type/UDT-info/ind/prec/data]
+ drain SQ_DONE/SQ_COST/SQ_XACTSTAT until SQ_EOT
MVP scope:
* Scalar params/returns only (int/float/str/bool/None/etc.)
* UDT params (e.g., 72-byte BLOB locator) deferred to Phase 13.x
* SQ_LODATA chunked I/O deferred — Phase 10/11 already cover read/write
Tests: 5 integration tests covering error paths, success paths,
handle caching, and multiple cycles.
Total: 64 unit + 139 integration = 203 tests.
Architectural milestone: with Phase 13 complete, the project now
covers every wire-message family JDBC uses for ordinary database
work. Only TLS handshake and cluster-redirect (replication failover)
remain unimplemented — neither is needed for a single-instance
driver.
Composite UDTs (ROW=22, COLLECTION=23, SET=19, MULTISET=20, LIST=21)
now decode into typed wrapper objects (informix_db.RowValue,
informix_db.CollectionValue) that expose schema + raw payload bytes.
The wire format is the now-familiar [byte ind][int length][bytes]
pattern (same as UDTVAR(lvarchar) from Phase 10). The bytes are a
TEXTUAL representation of the value when selected without the
extended-binary opt-in JDBC uses:
ROW value: b"ROW('Alice',30 )"
SET value: b"SET{'red','green','blue'}"
LIST value: b"LIST{10 ,20 ,30 }"
JDBC's binary-with-schema format runs ~30x larger (1420 bytes for a
2-field ROW vs. our 24). We don't request it — the textual form is
what the server returns by default and is sufficient for type
recognition.
Phase 12 ships type recognition only. Full recursive parsing into
Python tuples/lists/sets is deferred to Phase 13 (would require a
SQL-literal lexer + recursive type-driven decoding). Production
workloads that need typed field access today can project via SQL:
cur.execute("SELECT id, r.name, r.age FROM tbl")
Tests: 8 integration tests in test_composite_types.py covering ROW
recognition, NULL, sub-field projection workaround, long values
(>255 bytes — verifies 4-byte length prefix), SET/MULTISET/LIST
recognition, and null collections.
Total: 64 unit + 134 integration = 198 tests.
Lesson reinforced: once one UDT-shaped type is implemented (UDTVAR
in Phase 10, smart-LOB in Phase 9), every subsequent UDT-shaped type
is mostly a copy of the existing decoder branch. The hard part is
payload semantics, not framing.
Mirrors Phase 10's read implementation in the opposite direction —
extends the SQ_FILE (98) handler with optype 2 (read-from-client)
support. Users register bytes in cursor.virtual_files; the server's
filetoblob('path', 'client') call streams them up via SQ_FILE_READ
(106) chunks. Same architectural pivot as Phase 10 — avoids the
heavy SQ_FPROUTINE+SQ_LODATA stack.
Wire protocol (per IfxSqli.receiveSQFILE case 2 line 5103+):
* Server sends [short SQ_FILE=98][short optype=2][short bufSize]
[int readAmount][short SQ_EOT]
* Client responds [short 106][int totalAmount] then chunks
[short 106][short chunkSize][padded data]... terminated by SQ_EOT
API:
* Low-level: cur.virtual_files['/sentinel'] = data, then SQL with
filetoblob('/sentinel', 'client')
* High-level: cur.write_blob_column(sql, blob_data, params, clob=False)
— substitutes BLOB_PLACEHOLDER token in the SQL with filetoblob()
(or filetoclob for CLOB columns) and registers the bytes
automatically. Cleans up virtual_files after the call.
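Usage fragment (the import path for BLOB_PLACEHOLDER is an assumption;
the call signature follows the description above):

from informix_db.cursors import BLOB_PLACEHOLDER    # import path assumed

photo = open("avatar.png", "rb").read()

# The placeholder token is swapped for filetoblob('<sentinel>', 'client') and the
# bytes are registered in cursor.virtual_files for the duration of this call.
cur.write_blob_column(
    f"INSERT INTO users (id, avatar) VALUES (?, {BLOB_PLACEHOLDER})",
    photo,
    (42,),
)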
The BLOB_PLACEHOLDER design was chosen over magic ?-binding because:
* bytes already maps to BYTE type (legacy in-row blobs) for ?-params
* Method on BlobLocator doesn't work for inserts (no locator yet)
* PLACEHOLDER is unmistakable at the call site
Closes the smart-LOB loop in pure Python — Phase 9's tests and
Phase 10's read fixtures previously used JDBC to seed test data.
Phase 11 eliminated that dependency: tests/test_smart_lob.py and
tests/test_smart_lob_read.py now self-seed via write_blob_column.
Bonus: integration test runtime 5.78s → 2.78s (no more per-fixture
JVM spawns). Project goal "pure Python, no native deps" now true
for the test suite too.
Tests: 9 integration tests in test_smart_lob_write.py covering
* BLOB short, multichunk (51KB), empty, binary-safe (256 values)
* BLOB UPDATE
* BLOB multi-row INSERTs
* CLOB via filetoclob
* validation (rejects SQL without BLOB_PLACEHOLDER)
* virtual_files cleanup
Total: 64 unit + 126 integration = 190 tests.
Implements end-to-end BLOB reading by leveraging the server's
lotofile() function and intercepting the SQ_FILE protocol with
in-memory file emulation. Avoids implementing the heavier
SQ_FPROUTINE + SQ_LODATA stack initially planned for Phase 10.
Strategy: SELECT lotofile(blob_col, '/path', 'client') causes the
server to orchestrate a SQ_FILE (98) protocol round-trip — it tells
the client to "open file X, write these bytes, close". Our handler
buffers the writes in memory keyed by filename instead of touching
disk. The bytes appear in cursor.blob_files dict.
Wire protocol (per IfxSqli.receiveSQFILE line 4980):
* SQ_FILE optype 0 (open): server sends filename + mode/flags/offset
* SQ_FILE optype 3 (write): chunked SQ_FILE_WRITE (107) blocks of
data, terminated by SQ_EOT. Client responds with total size.
* SQ_FILE optype 1 (close): bare SQ_EOT both ways.
API:
* Low-level: cur.execute("SELECT lotofile(col, '/tmp/X', 'client') ...")
followed by cur.blob_files[returned_filename] for the bytes.
* High-level: cur.read_blob_column("SELECT col FROM ... WHERE ...", params)
returns bytes directly, wrapping the user's SQL with lotofile.
Bonus: row decoder now handles UDTVAR (type 40) with extended_name=
"lvarchar" — the wire format that lotofile() returns its result as.
Format: [byte indicator][int length][bytes].
Tests: 6 integration tests in test_smart_lob_read.py covering
low-level + high-level paths, NULL/no-match, multi-chunk (30KB),
and validation. Test data seeded via JDBC reference client since
smart-LOB writes still need Phase 11.
Total: 64 unit + 117 integration = 181 tests.
Strategic insight from this phase: don't estimate protocol-
implementation cost from JDBC's class hierarchy alone. JDBC's
IfxSmBlob is 600+ lines but the wire-level READ path reduces to one
SQL function call + one new tag handler. The wire is often simpler
than the SDK suggests.
Deferred to Phase 11+:
* Smart-LOB write (still needs SQ_FPROUTINE + SQ_LODATA)
* BlobLocator.read() OO API (requires locator-to-source mapping)
* SQ_FILE optype 2 (filetoblob client→server path)
SELECT on BLOB or CLOB columns no longer requires raw byte interpretation.
The 72-byte server-side locator is wrapped in a typed BlobLocator or
ClobLocator (frozen dataclass) so the column is recognizable as
"server-side reference, not actual bytes".
Wire-protocol findings:
* Smart-LOB columns DON'T appear with their nominal type codes (102/101)
in SQ_DESCRIBE. They surface as UDTFIXED (41) with extended_id 10
(BLOB) or 11 (CLOB) and encoded_length=72 (locator size).
* Retrieving the actual bytes requires SQ_FPROUTINE (103) RPC to
invoke ifx_lo_open, plus SQ_LODATA (97) for chunked transfer, plus
another SQ_FPROUTINE for ifx_lo_close. That's a Phase 10 lift —
roughly 2x the protocol surface of Phase 8.
Server config needed (added to Phase 7 setup):
* sbspace: onspaces -c -S sbspace1 ...
* default sbspace: onmode -wm SBSPACENAME=sbspace1
What ships in Phase 9:
* informix_db.BlobLocator(raw: bytes) — 72-byte frozen wrapper
* informix_db.ClobLocator(raw: bytes) — distinct type, same shape
* Row decoder branch in _resultset.parse_tuple_payload
* Wire constants SQ_LODATA=97, SQ_FPROUTINE=103, SQ_FPARAM=104
Tests:
* 11 unit tests in test_blob_locator_unit.py (no Informix needed) —
construction, immutability, equality, hash, repr safety, size
validation.
* 4 integration tests in test_smart_lob.py — fixture seeds via JDBC
reference client (smart-LOB writes also need deferred protocols).
* RefBlob.java helper in tests/reference/ for seeding via JDBC.
Total: 64 unit + 111 integration = 175 tests.
Locator design note: __repr__ omits the raw bytes (they're opaque to
the client). Same-bytes locators of different families compare
unequal — BlobLocator(x) != ClobLocator(x) — to avoid silent type
confusion.
Implements end-to-end round-trip for BYTE (type 11) and TEXT (type 12)
columns. Python bytes/bytearray map to BYTE; str is auto-encoded as
ISO-8859-1 for TEXT.
Wire protocol — write side:
* SQ_BIND payload carries a 56-byte blob descriptor with size at offset
[16..19] (per IfxBlob.toIfx). NULL is byte 39=1.
* After all per-param blocks, SQ_BBIND (41) declares blob count, then
chunked SQ_BLOB (39) messages stream the actual bytes (max 1024
bytes/chunk per JDBC), terminated by zero-length SQ_BLOB.
* Then SQ_EXECUTE proceeds normally.
Wire protocol — read side:
* SQ_TUPLE returns only the 56-byte descriptor; actual bytes live in
the blobspace.
* For each BYTE/TEXT column in each row, send SQ_FETCHBLOB with the
descriptor and read SQ_BLOB chunks until zero-length terminator.
* The locator is only valid while the cursor is open — must dereference
BEFORE sending CLOSE. Doing it after returns -602 (Cannot open blob).
Server-side prerequisites (one-time setup):
1. blobspace: onspaces -c -b blobspace1 -p /path -o 0 -s 50000
2. logged DB: CREATE DATABASE testdb WITH LOG
3. config + archive:
onmode -wm LTAPEDEV=/dev/null
onmode -wm TAPEDEV=/dev/null
onmode -l
ontape -s -L 0 -t /dev/null
Without #3, JDBC fails identically to our driver with "BLOB pages can't
be allocated from a chunk until chunk add is logged". This identical
failure was the diagnostic confirmation that our protocol bytes were
correct — same server response = byte-for-byte parity.
Tests: 9 integration tests in tests/test_blob.py — single-chunk,
multi-chunk (5120 bytes), NULL, multi-row, binary-safe, TEXT roundtrip,
ISO-8859-1, NULL TEXT, mixed columns. Plus the Phase 4
test_unsupported_param_type_raises was updated since bytes is no longer
the canonical unsupported type — switched to a custom class.
Total: 53 unit + 107 integration = 160 tests.
The smart-LOB family (BLOB/CLOB) is a separate state-machine extension
deferred to Phase 9 — it uses IfxLocator + LO_OPEN/LO_READ session
protocol against sbspace, not the BBIND/BLOB stream.
Introduces driver-managed transactions that work seamlessly across
logged and unlogged databases. The user calls commit() and rollback()
without needing to know which kind they're hitting — the connection
tracks transaction state internally.
Three protocol facts came out of integration testing:
1. Logged DBs in non-ANSI mode require an explicit SQ_BEGIN before
the first DML — the server doesn't auto-open a transaction.
Connection._ensure_transaction() sends SQ_BEGIN lazily and is
idempotent within an open txn. After commit/rollback, the next
DML triggers a fresh BEGIN.
2. SQ_RBWORK has a [short savepoint=0] payload before the SQ_EOT
framing tag — sending SQ_RBWORK alone causes the server to hang
silently (waiting for the missing 2 bytes). SQ_CMMTWORK has no
payload. This is the same pattern as the SHORT-vs-INT bug from
Phase 4.x and the 2-byte length prefix from Phase 6.c — when the
server hangs, it's an incomplete PDU body.
3. SQ_XACTSTAT (tag 99) is a logged-DB-only message that's
interleaved with normal responses. Now drained in all four
response-reading paths: cursor _drain_to_eot,
_read_describe_response, _read_fetch_response, and connection
_drain_to_eot.
For unlogged DBs (e.g., sysmaster), SQ_BEGIN returns -201 and we
cache that result so subsequent DML doesn't re-probe. commit() and
rollback() are silent no-ops in that case — same client code works
across both DB modes.
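A sketch of the lazy-BEGIN state machine those facts imply (attribute and helper names are illustrative; the -201 caching and the idempotence rule are from the notes above):
    def _ensure_transaction(self) -> None:
        """Lazily open a transaction before the first DML (sketch)."""
        if self._txn_unsupported:          # unlogged DB: an earlier SQ_BEGIN got -201
            return
        if self._in_transaction:           # idempotent while a txn is already open
            return
        sqlcode = self._send_begin_pdu()   # hypothetical helper: sends SQ_BEGIN, returns 0 or sqlcode
        if sqlcode == -201:
            self._txn_unsupported = True   # cache so later DML doesn't re-probe
        else:
            self._in_transaction = True    # commit()/rollback() reset this flag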
Tests:
* New tests/test_transactions.py — 10 integration tests covering
commit visibility, rollback isolation, multi-row rollback, partial
commit-then-rollback, autocommit behavior, cross-connection
durability, UPDATE/DELETE rollback, implicit per-statement txn.
* conftest.py auto-creates testdb (logged) for the suite.
* Two old tests rewritten to assert new no-op behavior on unlogged
DBs (test_commit_rollback_in_unlogged_db_is_noop,
test_commit_in_unlogged_db_is_noop).
Total: 53 unit + 98 integration = 151 tests.
The Phase 3 "gate test" (test_rollback_hides_insert) — a rolled-back
INSERT must be invisible to subsequent SELECTs in the same session —
now passes against a real logged database for the first time.
Empirical and source-level investigation of the LOB type families.
Findings:
* BYTE/TEXT (type 11/12) cannot be inserted via SQL literals — even
dbaccess with `INSERT INTO t VALUES (1, "0x...")` returns -617
"A blob data type must be supplied within this context". The server
requires a binary BBIND wire path. Hard restriction.
* BYTE/TEXT wire protocol: SQ_BIND sends a 56-byte descriptor as the
inline placeholder, then a separate SQ_BBIND (41) PDU declares blob
count, then chunked SQ_BLOB (39) tags stream the actual bytes (max
1024 bytes/chunk per JDBC's sendStreamBlob).
* BLOB/CLOB (type 101/102) are even more involved — smart-LOBs use an
LO_OPEN/LO_READ/LO_WRITE/LO_CLOSE session protocol against sbspace,
with locators carried inline in SQ_TUPLE.
* Server-side setup confirmed working: blobspace1 + sbspace1 + logged
database (testdb) are now available in the dev container for future
Phase 8/9 implementation.
Both LOB families require materially more state-machine work than the
single-PDU codec types (DECIMAL/DATETIME/INTERVAL). Splitting into
Phase 8 (BYTE/TEXT) and Phase 9 (BLOB/CLOB) lets each get focused
attention rather than half-implementing both.
The SQ_BBIND, SQ_BLOB, SQ_FETCHBLOB, SQ_SBBIND, SQ_FILE_READ,
SQ_FILE_WRITE constants are already declared in _messages.py from
Phase 1 scaffolding — protocol layer is ready when implementation
lands.
For users who need binary data <32K today: LVARCHAR via str encoded
with iso-8859-1 is a viable interim path.
Implements encoders for datetime.timedelta → INTERVAL DAY(9) TO FRACTION(5)
and IntervalYM → INTERVAL YEAR(9) TO MONTH. Both follow the 2-byte-length-
prefixed BCD wire format established in Phase 6.c (DECIMAL/DATETIME).
The default qualifier choice is generous: DAY(9) covers any timedelta,
YEAR(9) handles ±1B years. JDBC defaults to smaller widths (DAY(2)/YEAR(4))
trading safety for compactness — we make the opposite trade.
FRACTION(5) is the Informix precision ceiling — sub-10us intervals can't
round-trip cleanly. Same limitation JDBC has.
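Call-site view, for reference (the table name and the IntervalYM constructor taking total months are assumptions):
    import datetime
    from informix_db import IntervalYM

    # table/column names hypothetical
    cur.execute(
        "INSERT INTO spans (dur, age) VALUES (?, ?)",
        (datetime.timedelta(days=3, hours=4, minutes=5),   # -> INTERVAL DAY(9) TO FRACTION(5)
         IntervalYM(26)),                                   # 26 months -> INTERVAL YEAR(9) TO MONTH
    )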
Six integration tests, all green on first run against live Informix —
the synthetic round-trip in the test framework caught every framing bug
locally, before integration tests even started. This is the dividend from
owning both decoder and encoder.
Total: 53 unit + 88 integration = 141 tests.
Type matrix update: INTERVAL now has both decode + encode. Only BLOB/CLOB
and BYTE/TEXT remain among the common types.
Implements row-decoding for IDS INTERVAL, the last common temporal type.
The qualifier short bisects the type at the wire level: start_TU >= DAY
maps to datetime.timedelta (day-fraction), start_TU <= MONTH maps to a
new informix_db.IntervalYM (year-month).
Wire format mirrors DATETIME exactly — `[head byte][digit pairs in
base-100]`, with the qualifier dictating field interpretation. The
fraction-to-nanoseconds scaling (`scale_exp = 18 - end_TU`, forced odd)
is the JDBC pattern from `Decimal.fromIfxToArray`.
IntervalYM is a frozen dataclass holding signed total months, with
`years` and `remainder_months` as derived properties. Matches JDBC's
`IntervalYM.months` shape rather than a (years, months) tuple — avoids
ambiguity around what "negative" means for a multi-field tuple.
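Shape sketch (field and property names follow the note above; truncation toward zero for negatives is an assumed convention, not confirmed against JDBC):
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class IntervalYM:
        """Year-month interval held as one signed month count (sketch)."""
        months: int   # signed total months, mirroring JDBC's IntervalYM.months

        @property
        def years(self) -> int:
            return int(self.months / 12)           # truncate toward zero: -14 -> -1 (assumed)

        @property
        def remainder_months(self) -> int:
            return self.months - self.years * 12   # -14 -> -2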
Tests: 13 unit (synthetic byte streams covering all decoder branches)
+ 9 integration (real Informix queries spanning DAY TO SECOND, HOUR TO
SECOND, YEAR TO MONTH, negatives, NULL, and mixed-family rows).
Total test count: 53 unit + 82 integration = 135.
Encoder for INTERVAL parameter binding is deferred to a later phase
(same arc as DECIMAL/DATETIME — decode lands first).
Now you can pass Python datetime/date/Decimal values directly:
cur.execute('INSERT INTO t VALUES (?, ?, ?)',
(1, datetime.datetime(2026, 5, 4, 12, 34, 56), Decimal('1234.56')))
cur.execute('SELECT id FROM t WHERE d > ?', (datetime.date(2025, 1, 1),))
The 2-byte length-prefix discovery: both my Phase 6.a DECIMAL encoder
and the new Phase 6.c DATETIME encoder produced "correct" BCD bytes
but the server silently dropped the SQ_BIND PDU (no response, just
timeout). Captured the wire, diffed against JDBC, and found that
DECIMAL/DATETIME bind data has a 2-byte length PREFIX wrapping the
BCD payload (per Decimal.javaToIfx line 457). With the prefix added,
both encoders work. DATE doesn't need the prefix — it's a fixed
4-byte int.
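The fix itself is tiny once found; a sketch of the envelope (whether total_len counts the prefix itself isn't pinned down here, the sketch uses the body length):
    import struct

    def wrap_bcd_bind_payload(bcd_body: bytes) -> bytes:
        """DECIMAL/DATETIME bind data: 2-byte length prefix wrapping the BCD bytes."""
        return struct.pack(">h", len(bcd_body)) + bcd_body   # prefix semantics assumed: body length

    def encode_date_days(days_since_1899_12_31: int) -> bytes:
        """DATE needs no prefix: a fixed 4-byte big-endian day count."""
        return struct.pack(">i", days_since_1899_12_31)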
Per-type wire format:
date → DATE(7), [4-byte BE int = days since 1899-12-31]
datetime → DATETIME(10), [short total_len][byte 0xc7][7 BCD pairs]
Decimal → DECIMAL(5), [short total_len][byte exp][BCD digit pairs]
For DATETIME the encoder always emits YEAR TO SECOND form (no
microseconds) — covers the common case. Phase 6.x can add YEAR TO
FRACTION(N) variants if microsecond precision is needed.
For DECIMAL the encoder uses the asymmetric base-100 complement
(mirror of decoder) for negatives. Tested with positive, negative,
and fractional values.
Lesson for the protocol playbook: when the server silently drops a
PDU, it's almost always an envelope/framing issue rather than the
inner-value bytes being wrong. Same pattern as the SHORT-vs-INT
reserved field in CURNAME+NFETCH and the even-byte alignment pad.
Module changes:
src/informix_db/converters.py:
+ _encode_date — 4-byte BE int day count
+ _encode_datetime — YEAR TO SECOND form with 2-byte length prefix
+ _encode_decimal — re-enabled (was Phase 6.x stub) with the same
length-prefix fix
+ encode_param() dispatches on datetime.datetime BEFORE
datetime.date (since datetime is a subclass of date in Python)
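The dispatch-order point matters because datetime.datetime instances also satisfy isinstance(x, datetime.date); a trimmed sketch of the check order:
    import datetime

    def encode_param(value):
        # datetime.datetime is a subclass of datetime.date, so it must be
        # tested first or every timestamp would be encoded as a bare DATE
        if isinstance(value, datetime.datetime):
            return _encode_datetime(value)
        if isinstance(value, datetime.date):
            return _encode_date(value)
        ...   # remaining type dispatch elided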
Tests: 40 unit + 73 integration (3 new date/datetime param tests + 1
updated decimal param test) = 113 total, all green, ruff clean. New
tests cover:
- date as INSERT parameter via executemany — 3 dates round-trip
- datetime as INSERT parameter via executemany — 3 timestamps
- date as parameter in a WHERE clause filter (created_at > ?)
- Decimal round trip (was: NotImplementedError check; now: real
INSERT + SELECT verification)
Type support matrix updates:
DATE — encode ✓ + decode ✓ (was decode-only)
DATETIME — encode ✓ + decode ✓ (was decode-only)
DECIMAL — encode ✓ + decode ✓ (was decode-only)
Before:
cur.execute("SELECT CURRENT YEAR TO SECOND ...")
cur.fetchone() # → (b'\xc7\x14\x1a\x05\x04...',) raw BCD bytes
After:
cur.execute("SELECT CURRENT YEAR TO SECOND ...")
cur.fetchone() # → (datetime.datetime(2026, 5, 4, 12, 34, 56),)
Decoder picks the right Python type by qualifier:
YEAR/MONTH/DAY-only → datetime.date
HOUR/MIN/SEC-only → datetime.time
spans across both → datetime.datetime
Wire format (per IfxToJavaDateTime + Decimal.init treating as packed BCD):
byte[0] = sign + biased exponent (in base-100 digit pairs)
byte[1..] = BCD digit pairs: YYYY (2 bytes) + MM + DD + HH + MI + SS + FFFFF
Qualifier extraction from column descriptor:
encoded_length = (digit_count << 8) | (start_TU << 4) | end_TU
TU codes: YEAR=0, MONTH=2, DAY=4, HOUR=6, MIN=8, SEC=10,
FRAC1=11..FRAC5=15
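Unpacking that field is just shifts and masks (helper name illustrative):
    def unpack_datetime_qualifier(encoded_length: int) -> tuple[int, int, int]:
        """Split the packed column qualifier into (digit_count, start_TU, end_TU)."""
        digit_count = (encoded_length >> 8) & 0xFF
        start_tu = (encoded_length >> 4) & 0x0F
        end_tu = encoded_length & 0x0F
        return digit_count, start_tu, end_tu

    # e.g. YEAR TO SECOND: start_TU=0 (YEAR), end_TU=10 (SEC)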
Verified against four DATETIME columns of different qualifiers in
one tuple — see test_datetime_multiple_columns_in_one_row:
YEAR TO SECOND → datetime.datetime(2026, 5, 4, 12, 34, 56)
YEAR TO DAY → datetime.date(2026, 5, 4)
HOUR TO SECOND → datetime.time(12, 34, 56)
YEAR TO FRACTION(3) → datetime.datetime(...)
Module changes:
src/informix_db/converters.py:
+ _decode_datetime(raw, encoded_length) — qualifier-driven BCD walk
+ TU constants (_TU_YEAR, _TU_MONTH, ..., _TU_SECOND)
src/informix_db/_resultset.py:
+ DATETIME row-decoder branch — computes width from digit_count
in encoded_length high byte, calls _decode_datetime with the
packed qualifier so it can pick the right Python type
Tests: 40 unit + 70 integration (7 new DATETIME tests) = 110 total,
all green, ruff clean. Tests cover:
- YEAR TO SECOND → datetime.datetime
- YEAR TO DAY → datetime.date
- HOUR TO SECOND → datetime.time
- CURRENT YEAR TO FRACTION(3) → datetime.datetime
- Mixed qualifiers in one row
- DATETIME stored in a real table column (round-trip via SELECT)
- NULL DATETIME → Python None
DATETIME parameter binding (encoder) is Phase 6.x — same status as
DECIMAL encoder.
Before:
ProgrammingError: server returned SQ_ERR sqlcode=-201 isamcode=0
After:
ProgrammingError: [-201] A syntax error has occurred (offset 1)
ProgrammingError: [-206] The specified table is not in the database
(near 'no_such_table') [ISAM -111] (offset 27)
IntegrityError: [-268] Cannot insert duplicate value -
violates UNIQUE constraint (near 'on table u')
[ISAM -100] (offset 23)
IntegrityError: [-391] Cannot insert NULL value into a NOT NULL column
(near 't.id') (offset 23)
OperationalError: [-255] Not in transaction
PEP 249 exception classes mapped from sqlcode:
-201, -206, -217, -286, -310, ... → ProgrammingError
-239, -268, -291, -292, -391, -703 → IntegrityError
-255, -256, -407, -440, -908, ... → OperationalError
-329, -349, -510 → NotSupportedError
others → DatabaseError (safe fallback)
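Sketch of the classification lookup (helper name illustrative; the code sets are the partial lists above; assumes the PEP 249 exception classes are exported at module level, as PEP 249 requires):
    from informix_db import (DatabaseError, IntegrityError, NotSupportedError,
                             OperationalError, ProgrammingError)

    _PROGRAMMING   = {-201, -206, -217, -286, -310}    # partial; catalog has more
    _INTEGRITY     = {-239, -268, -291, -292, -391, -703}
    _OPERATIONAL   = {-255, -256, -407, -440, -908}    # partial; catalog has more
    _NOT_SUPPORTED = {-329, -349, -510}

    def classify_sqlcode(sqlcode: int) -> type:
        """Map an Informix sqlcode to its PEP 249 exception class (sketch)."""
        if sqlcode in _INTEGRITY:
            return IntegrityError
        if sqlcode in _NOT_SUPPORTED:
            return NotSupportedError
        if sqlcode in _PROGRAMMING:
            return ProgrammingError
        if sqlcode in _OPERATIONAL:
            return OperationalError
        return DatabaseError      # safe fallback for unknown codes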
SQ_ERR wire decode (per IfxSqli.receiveError line 2717):
[short sqlcode][short isamcode][int offset]
[short near_token_len][bytes name][optional pad][short SQ_EOT]
The "near" token is the object name where the error occurred (table or
column name for "not found" errors); empty for most syntax errors.
Structured fields attached to every Informix error for programmatic
inspection:
e.sqlcode — Informix error code (e.g. -206)
e.isamcode — ISAM/RSAM-level error (e.g. -111 = "table not found")
e.offset — character offset in the SQL where the error occurred
e.near — object name in the "near 'XYZ'" clause (or "")
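In calling code that looks like (SQL arbitrary; exception class and attribute names as listed above):
    import informix_db

    try:
        cur.execute("SELECT 1 FROM no_such_table")
    except informix_db.ProgrammingError as e:
        print(e.sqlcode, e.isamcode, e.offset, repr(e.near))
        # e.g. -206 -111 <offset into the SQL> 'no_such_table'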
Connection state survives errors: a failed query doesn't poison the
session — subsequent execute() calls work normally. Verified by
test_connection_survives_query_error.
Built-in error catalog of ~50 most common Informix sqlcodes shipped
in src/informix_db/_errcodes.py. Users can extend at runtime with
register_error_text(code, text). Unknown codes get a generic
"Informix error <N>" with structured fields still populated.
Module changes:
src/informix_db/_errcodes.py (new) — error catalog + exception
classification + register_error_text()
src/informix_db/cursors.py — _raise_sq_err now uses the catalog
src/informix_db/connections.py — same upgrade for the connection-side
SQ_ERR path (catches commit/rollback errors etc.)
Tests: 40 unit + 63 integration (8 new error tests) = 103 total, all
green, ruff clean. Tests cover:
- syntax error → ProgrammingError(-201)
- table not found → ProgrammingError(-206) with near='no_such_table'
- column not found → ProgrammingError(-217)
- UNIQUE violation → IntegrityError(-268)
- NOT NULL violation → IntegrityError(-391)
- commit on unlogged DB → OperationalError(-255)
- connection survives errors (subsequent queries work)
- all errors carry structured sqlcode/isamcode/offset/near attrs
Three Phase 4 follow-ups in one push, all with empirical wire analysis:
1. PARAMETERIZED SELECT
cur.execute('SELECT tabname FROM systables WHERE tabid = ?', (1,))
→ ('systables',)
Wire flow: PREPARE → DESCRIBE → SQ_BIND-only (no EXECUTE) →
CURNAME+NFETCH → TUPLE+DONE → drain → CLOSE+RELEASE.
The cursor open is what executes the prepared query; SQ_BIND just
binds values into scope. No need for the IDESCRIBE handshake JDBC
does for type discovery — server accepts our typed bind directly.
2. NULL ROW DECODING — per-type sentinel detection
Each IDS type has its own NULL sentinel in tuple data:
INT → 0x80000000 (INT_MIN)
BIGINT → 0x8000000000000000 (LONG_MIN)
SMALLINT→ 0x8000 (SHORT_MIN)
REAL → all 0xFF (NaN bit pattern)
FLOAT → all 0xFF
DATE → 0x80000000 (same as INT)
VARCHAR → [byte 1][byte 0] (length=1, single nul) — distinguishable
from empty '' which is [byte 0] (length=0)
Verified by wire capture against the dev container — see
docs/CAPTURES/19-py-null-vs-onechar.socat.log and
docs/CAPTURES/20-py-int-null.socat.log.
The VARCHAR null marker is the trickiest because it LOOKS like a
1-byte string containing a single nul, but VARCHAR can't hold
embedded nuls anyway, so a nul byte inside a length-1 value is
unambiguous (sentinel-check sketch after this list).
3. executemany(sql, seq_of_params) — PEP 249 batched DML
PREPARE once, loop SQ_BIND+SQ_EXECUTE per param set, RELEASE once.
Performance: only ~1.06x faster than execute() loop for 200 INSERTs
(dominated by per-row round trips). Phase 4.x optimization opportunity:
batch many BIND+EXECUTE pairs into one PDU with no per-row flush+read,
for true bulk performance (would likely give 5-10x). Documented in
DECISION_LOG.md as a follow-up.
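Sentinel-check sketch for item 2, the INT case (decoder name illustrative; the bit pattern is from the table above):
    import struct

    _INT_NULL = 0x80000000   # INT_MIN bit pattern reserved as the NULL sentinel

    def _decode_int(raw: bytes):
        """Decode a 4-byte big-endian IDS INT, mapping the sentinel to None (sketch)."""
        (unsigned,) = struct.unpack(">I", raw)     # unsigned view to compare the bit pattern
        if unsigned == _INT_NULL:
            return None
        return struct.unpack(">i", raw)[0]         # signed view for the real value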
Module changes:
src/informix_db/converters.py:
+ Per-type NULL sentinel constants and detection in each decoder
+ Decoders now return None for sentinel values
src/informix_db/cursors.py:
+ _execute_select_with_params() — SQ_BIND alone, then cursor open
+ _build_bind_only_pdu() — SQ_BIND without trailing SQ_EXECUTE
+ executemany() — loop BIND+EXECUTE, accumulate rowcount
+ execute() now dispatches to _execute_select_with_params for
parameterized SELECT (was: NotSupportedError)
Tests: 40 unit + 47 integration (was 32; added 15 new) = 87 total,
all green, ruff clean. New test files / cases:
tests/test_nulls.py (7) — NULL decoding for INT, BIGINT, FLOAT,
REAL, VARCHAR, empty-vs-null, mixed columns
tests/test_params.py — added 4 parameterized SELECT tests, 5
executemany tests
tests/test_smoke.py — updated cursor-with-params test (was Phase 1
"raises", now Phase 4 "works")
Discovered captures kept for next-session debugging:
docs/CAPTURES/18-py-null-rows.socat.log
docs/CAPTURES/19-py-null-vs-onechar.socat.log
docs/CAPTURES/20-py-int-null.socat.log
cur.execute("INSERT INTO t VALUES (?, ?, ?)", (42, "hello", 3.14))
cur.execute("INSERT INTO t VALUES (:1, :2)", (99, "world"))
cur.execute("UPDATE t SET name = ? WHERE id = ?", ("new", 2))
cur.execute("DELETE FROM t WHERE id = ?", (5,))
# all work end-to-end against a real Informix server
Two breakthroughs decoded from JDBC:
1. SQ_BIND PDU shape (chained with SQ_EXECUTE in one PDU, no separate
round trip):
[short SQ_ID=4][int SQ_BIND=5][short numparams]
for each param:
[short type][short indicator][short prec_or_encLen]
writePadded(rawbytes)
[short SQ_EXECUTE=7][short SQ_EOT]
2. Strings are sent as CHAR (type=0) not VARCHAR (type=13). The server
handles conversion to the actual column type via internal CIDESCRIBE
— we don't need to do it explicitly.
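Packing sketch of that shape (the per-param tuple layout and the even-length pad are assumptions; the tag values are from the layout above, plus SQ_EOT=12):
    import struct

    SQ_ID, SQ_BIND, SQ_EXECUTE, SQ_EOT = 4, 5, 7, 12

    def build_bind_execute_pdu(encoded_params) -> bytes:
        """Chain SQ_BIND + SQ_EXECUTE into one PDU (sketch).

        encoded_params: list of (type_code, indicator, prec, raw) tuples as
        produced by the per-type encoders (tuple shape assumed);
        NULL is (type, -1, prec, b"").
        """
        out = bytearray(struct.pack(">hih", SQ_ID, SQ_BIND, len(encoded_params)))
        for type_code, indicator, prec, raw in encoded_params:
            out += struct.pack(">hhh", type_code, indicator, prec)
            out += raw + (b"\x00" if len(raw) % 2 else b"")   # writePadded: pad to even length (assumed)
        out += struct.pack(">hh", SQ_EXECUTE, SQ_EOT)
        return bytes(out)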
Per-type encoding (Phase 4 MVP):
int (32-bit) → IDS INT (type=2), prec=0x0a00 (packed width=10/scale=0),
4-byte BE
int (64-bit) → IDS BIGINT (type=52), prec=0x1300, 8-byte BE
str → IDS CHAR (type=0), prec=0, [short len][bytes][pad]
float → IDS FLOAT (type=3), prec=0, 8-byte IEEE 754
bool → IDS BOOL (type=45), prec=0, 1 byte
None → indicator=-1, no data
The integer "precision" field is PACKED — initially looked like a bug
(why would precision be 2560?) until I realized 0x0a00 = (10 << 8) | 0
= packed display-width and scale. Captured this surprise in
DECISION_LOG.md.
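In code the packing is one line:
    def pack_precision(width: int, scale: int) -> int:
        return (width << 8) | scale

    pack_precision(10, 0)   # 0x0a00 == 2560, the INT case above
    pack_precision(19, 0)   # 0x1300, the BIGINT case above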
Critical fix to execute-path branching: parameterized INSERT also
returns nfields > 0 (server describes the would-be inserted row).
Switched from "branch on nfields" to "branch on SQL keyword" — JDBC
does the same via its IfxStatement / IfxPreparedStatement subclassing.
Numeric paramstyle support: cur.execute("... :1 ...", (val,)) works
by rewriting :N → ? before sending PREPARE. Trivial regex (doesn't
escape strings/comments — Phase 5 can add a proper SQL tokenizer).
Module changes:
src/informix_db/converters.py:
+ encode_param() dispatcher
+ _encode_int / _encode_bigint / _encode_str / _encode_float / _encode_bool
src/informix_db/cursors.py:
+ _build_bind_execute_pdu() — chains SQ_BIND + SQ_EXECUTE in one PDU
+ _execute_dml_with_params() — sends bind PDU, drains, releases
+ execute() now accepts parameters; rewrites :N → ?; branches by
SQL keyword (SELECT vs DML)
+ _NUMERIC_PLACEHOLDER_RE for paramstyle="numeric" support
Tests: 40 unit + 32 integration (8 new parameter tests + 1 updated
smoke) = 72 total, all green, ruff clean. New tests cover:
- INSERT with ? params
- INSERT with :N params
- INT + FLOAT + str round trip via INSERT then SELECT
- UPDATE with params in SET and WHERE
- DELETE with parameter in WHERE
- Unsupported param type (bytes) raises NotImplementedError
- Parameterized SELECT raises NotSupportedError (Phase 4.x)
- Dict/named params raise NotSupportedError
Known gaps (Phase 4.x / Phase 5):
- Parameterized SELECT (needs SQ_BIND before CURNAME+NFETCH)
- NULL row decoding for VARCHAR (currently surfaces empty string)
- Proper SQL tokenizer (so :N inside string literals is preserved)
- bytes/datetime/Decimal parameter types
Cursor.execute now branches on DESCRIBE response's nfields:
- nfields > 0 → SELECT path (cursor lifecycle: CURNAME+NFETCH+...)
- nfields == 0 → DDL/DML path (just SQ_EXECUTE then SQ_RELEASE)
Examples that work end-to-end against the dev container:
cur.execute('CREATE TEMP TABLE t (id INTEGER, name VARCHAR(50))')
cur.execute("INSERT INTO t VALUES (1, 'hello')") # rowcount=1
cur.execute("UPDATE t SET name = 'new' WHERE id = 1")
cur.execute('DELETE FROM t WHERE id = 1')
Plus full mix: CREATE → 5 INSERTs → SELECT WHERE → DELETE WHERE → SELECT
(see tests/test_dml.py::test_full_dml_cycle_in_one_connection).
Three protocol findings during this push, documented in DECISION_LOG.md:
1. SQ_INSERTDONE (=94) is METADATA, not execution. It arrives in BOTH
the DESCRIBE response (PREPARE phase) AND the EXECUTE response for
literal-value INSERTs. The PREPARE-phase SQ_INSERTDONE carries the
serial values that WILL be assigned IF you execute. The EXECUTE-
phase SQ_INSERTDONE confirms execution. My initial assumption was
"PREPARE-phase INSERTDONE means already-executed" — wrong. Skipping
SQ_EXECUTE made the row not persist (SELECT returned []). Lesson:
optimization-looking responses may not be what they look like —
always verify with a follow-up SELECT.
2. SQ_INSERTDONE wire format: 18 bytes (10 byte longint serial8 + 8
byte bigint bigserial). Per IfxSqli.receiveInsertDone line 2347.
We read-and-discard for now; Phase 5+ surfaces as Cursor.lastrowid.
3. Transactions: commit() and rollback() are 2-byte messages.
SQ_CMMTWORK=19 + SQ_EOT for commit; SQ_RBWORK=20 + SQ_EOT for
rollback. Server responds with SQ_DONE+SQ_EOT in logged databases,
or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases
like sysmaster. Wire machinery is implemented; full transaction
testing needs a logged DB (use ``stores_demo`` from the dev image).
Module changes:
src/informix_db/cursors.py:
- execute() branches on nfields (SELECT path vs DDL/DML path)
- new _execute_dml() does just EXECUTE + RELEASE
- new _build_execute_pdu() emits the 8-byte SQ_ID(EXECUTE)+EOT
- _read_describe_response() and _drain_to_eot() handle SQ_INSERTDONE
src/informix_db/connections.py:
- commit() / rollback() now functional — send the SQ_CMMTWORK /
SQ_RBWORK PDU and drain the response
Tests: 40 unit + 24 integration (6 new DML tests) = 64 total, all
green, ruff clean. New tests cover:
- CREATE TEMP TABLE
- INSERT (rowcount=1, persists, SELECT shows it)
- UPDATE WHERE (specific row changed)
- DELETE WHERE (specific row removed)
- Full mixed cycle (CREATE + 5 INSERTs + SELECT + DELETE + SELECT)
- commit() in unlogged DB raises OperationalError sqlcode=-255
Captured wire artifacts kept for future debugging:
docs/CAPTURES/16-py-insert-literal.socat.log
docs/CAPTURES/17-py-insert-select.socat.log
Three findings, each caught by a different debugging technique,
documented in DECISION_LOG.md:
1. CURNAME+NFETCH PDU: trailing reserved field is SHORT not INT.
Caught by byte-diffing our 44-byte PDU against JDBC's 42-byte
reference under socat. The server tolerated the longer version
for INT-only SELECTs (silently consuming extra zeros) but
rejected it for VARCHAR queries. Lesson: server tolerance varies
by query type — always match JDBC byte-for-byte.
2. SQ_TUPLE payload pads to even byte alignment. An 11-byte
"syscolumns" VARCHAR payload had a trailing 0x00 between it and
the next SQ_TUPLE tag. JDBC's IfxRowColumn.readTuple consumes
this pad silently; we weren't, so any odd-length variable-width
row desynced the parser.
3. VARCHAR/NCHAR/NVCHAR in tuple data use a SINGLE-byte length
prefix (max 255 chars — IDS VARCHAR's hard limit). NOT a 2-byte
short as I'd initially assumed. CHAR is fixed-width per
encoded_length. LVARCHAR uses a 4-byte int prefix for >255 byte
values.
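A sketch of the three prefix rules in one branch (reader helpers and IfxType member names are illustrative; the rules themselves are the findings above):
    def read_char_like(reader, type_code, encoded_length):
        """Read one CHAR/VARCHAR/LVARCHAR tuple value per the prefix rules (sketch)."""
        # IfxType members assumed to exist with these names
        if type_code in (IfxType.VARCHAR, IfxType.NCHAR, IfxType.NVCHAR):
            n = reader.read_byte()                 # single-byte length prefix (max 255)
            return reader.read_bytes(n)
        if type_code == IfxType.LVARCHAR:
            n = reader.read_int()                  # 4-byte int prefix for >255-byte values
            return reader.read_bytes(n)
        return reader.read_bytes(encoded_length)   # CHAR: fixed width from the describe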
Module changes:
src/informix_db/_resultset.py — _LENGTH_PREFIXED_SHORT_TYPES set,
branched VARCHAR/NCHAR/NVCHAR (1-byte prefix) vs CHAR (fixed)
vs LVARCHAR (4-byte prefix); even-byte alignment pad consumed
after each SQ_TUPLE payload.
src/informix_db/cursors.py — CURNAME+NFETCH and standalone NFETCH
PDUs now write_short(0) for the reserved trailing field.
Tests: 40 unit + 18 integration (3 new VARCHAR tests) = 58 total,
all green, ruff clean. New tests cover:
- VARCHAR single-column SELECT
- Odd-length VARCHAR row (regression for the pad-byte bug)
- Mixed INT + VARCHAR + FLOAT three-column SELECT
Sample output:
SELECT FIRST 5 tabname FROM systables → ('systables',),
('syscolumns',), ('sysindices',), ('systabauth',), ('syscolauth',)
SELECT FIRST 3 tabname, tabid, nrows → ('systables', 1, 276.0), ...
VARCHAR was the last known gap from the Phase 2 commit. Phase 2
now reads INT, BIGINT, REAL, FLOAT, CHAR, VARCHAR end-to-end. Phase
6+ types (DATETIME, INTERVAL, DECIMAL, BLOBs) remain.
cursor.execute("SELECT 1 FROM systables WHERE tabid = 1")
cursor.fetchone() == (1,)
To my knowledge, this is the first time a pure-Python implementation
has read data from Informix without wrapping IBM's CSDK or JDBC.
Three breakthroughs in this commit:
1. Login PDU's database field is BROKEN. Passing a database name there
makes the server reject subsequent SQ_DBOPEN with sqlcode -759
("database not available"). JDBC always sends NULL in the login
PDU's database slot — we now do the same. The user-supplied database
opens via SQ_DBOPEN in _init_session.
2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask
replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars
(48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) →
SQ_DBOPEN. Without all three steps in this exact order, the server
silently ignores SELECTs.
3. SQ_DESCRIBE per-column block has 10 fields per column (not the
simple "name + type" my best-effort parser assumed): fieldIndex,
columnStartPos, columnType, columnExtendedId, ownerName,
extendedName, reference, alignment, sourceType, encodedLength.
The string table at the end is offset-indexed (fieldIndex points
into it), which is how JDBC handles disambiguation.
Cursor lifecycle implementation in cursors.py mirrors JDBC exactly:
PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT
NFETCH(4096) → DONE+COST+EOT (drain)
CLOSE → EOT
RELEASE → EOT
Five round trips per SELECT — same as JDBC.
Module changes:
src/informix_db/connections.py — added _init_session(), _send_protocols(),
_send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now
forces database=None always; SQ_INFO PDU replayed verbatim from
JDBC capture (offsets-indexed env-var format too gnarly to derive
in MVP).
src/informix_db/cursors.py — full rewrite: real PDU builders for
PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched
response readers; cursor-name generator matching JDBC's "_ifxc"
convention.
src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per
JDBC's receiveDescribe (USVER mode); offset-indexed string table
with name lookup by fieldIndex; ColumnInfo dataclass with raw
type-code preserved for null-flag extraction.
src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49.
Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT)
= 55 total, all green, ruff clean. New tests cover:
- SELECT 1 returns (1,)
- cursor.description shape per PEP 249
- Multi-row INT SELECT
- Multi-column mixed types (INT + FLOAT)
- Iterator protocol (for row in cursor)
- fetchmany(n)
- Re-executing on same cursor resets state
- Two cursors on one connection (sequential)
Known gap: VARCHAR row decoding doesn't yet handle the variable-width
on-wire encoding correctly. Phase 2.x will address this — for now the
not-yet-implemented decode surfaces raw bytes in the row tuple.
Cursor class scaffolded with full PEP 249 surface:
src/informix_db/cursors.py — Cursor with execute, fetchone, fetchmany,
fetchall, description, rowcount, arraysize, close, iterator,
context manager. Sends SQ_COMMAND chains for parameterless SQL
(Phase 4 adds SQ_BIND/SQ_EXECUTE for params).
src/informix_db/_resultset.py — ColumnInfo, parse_describe,
parse_tuple_payload. Best-effort SQ_DESCRIBE parser; refines in
Phase 2.1.
src/informix_db/connections.py — Connection.cursor() now returns a
real Cursor; new _send_pdu() lets Cursor share the connection's
socket without violating encapsulation.
Protocol findings landed in PROTOCOL_NOTES.md §6:
§6a — SQ_PREPARE format with named tags (the "trailing 22, 49"
are SQ_NDESCRIBE and SQ_WANTDONE chained into the same PDU).
Confirmed against IfxSqli.sendPrepare line 1062.
§6c — Server requires post-login init sequence (SQ_PROTOCOLS →
SQ_INFO → SQ_ID(env vars) → SQ_DBOPEN) BEFORE any PREPARE works.
Discovered the hard way: PREPARE without this sequence gets no
response; SQ_DBOPEN without SQ_PROTOCOLS gets sqlcode=-759
("Database not available"). The login PDU's database field is
a hint, not an open.
§6e — SQ_TUPLE corrected: [short warn][int size][bytes payload]
(not [int 0][short payloadLen] as earlier draft claimed).
Two more constants added to _messages.MessageType:
SQ_NDESCRIBE = 22, SQ_WANTDONE = 49
Tests: 40 unit + 7 integration (added 2 new — cursor() returns a
Cursor, parameter binding raises NotSupportedError). All green, ruff
clean. Removed obsolete "cursor() raises NotImplementedError" test.
What works end-to-end now: connect, cursor(), close, parameter-attempt
gating. What doesn't yet: cursor.execute("SELECT 1") — server requires
the post-login init sequence we don't yet send.
Discovered captures (kept for next session's analysis):
docs/CAPTURES/06-py-select1-attempt.socat.log
docs/CAPTURES/07-py-replay-jdbc-prepare.socat.log
docs/CAPTURES/08-py-with-dbopen.socat.log
docs/CAPTURES/09-py-full-replay.socat.log
Three new tasks created tracking the remaining Phase 2 blockers:
post-login init sequence, proper SQ_DESCRIBE parser, SQ_ID action
vocabulary helpers.
Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log:
SQ_PREPARE format (validated against both observed PREPAREs):
[short SQ_PREPARE=2]
[short flags=0]
[int sqlLen] ← SQL byte count, NOT including nul
[bytes sql]
[byte 0] ← nul terminator
[short 0x0016] ← observed 22; cursor options? statement type?
[short 0x0031] ← observed 49; identical across both PREPAREs
[short SQ_EOT=12]
SQ_TUPLE format (definitive):
[short SQ_TUPLE=14]
[int 0] ← flags / reserved
[short payloadLen]
[bytes payload] ← column values back-to-back, per type encoding
SQ_DONE format (partial — see PROTOCOL_NOTES.md §6e for what's known)
JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE → ID(3
=cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) →
ID(11=release)) documented in §6c. The action codes inside SQ_ID
roughly map to other SQ_* tag values from IfxMessageTypes.
For Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate)
first — it might let us skip the cursor lifecycle for parameterless
queries.
New modules:
src/informix_db/_types.py — IfxType IntEnum ported from
com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0,
SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101,
BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers
base_type() / is_nullable() to strip and inspect the flag byte.
src/informix_db/converters.py — wire-bytes → Python decoders for the
Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR,
VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table
for the row decoder. ENCODERS dict declared empty (Phase 4 fills it
in for parameter binding).
DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int
day count → datetime.date. Smoke-tested decoders all return correct
Python values.
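Decode sketch (helper name illustrative; the epoch and width are the facts above):
    import datetime
    import struct

    _IFX_DATE_EPOCH = datetime.date(1899, 12, 31)   # Informix day 0

    def _decode_date(raw: bytes) -> datetime.date:
        """4-byte big-endian day count since 1899-12-31 -> datetime.date (sketch)."""
        (days,) = struct.unpack(">i", raw)
        return _IFX_DATE_EPOCH + datetime.timedelta(days=days)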
Cursor / _resultset implementation NOT in this commit — they need
deeper SQ_DESCRIBE byte-layout analysis and the SQ_ID sub-action
vocabulary characterization. Both are bounded-but-substantial Phase 2
tasks deferred to a fresh session.
40 unit tests still passing, ruff clean.
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.
The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:
Cap_1 = 0x0000013c (= (capability_class << 8) | protocol_version
where protocol_version = 0x3c = PF_PROT_SQLI_0600)
Cap_2 = 0
Cap_3 = 0
The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
(the structural prefix: SLheader sans length, all login markers,
capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
PDU has shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.
Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.
Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
pointer to the regression test and the takeaway
Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log
Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.