Three Phase 4 follow-ups in one push, all with empirical wire analysis:
1. PARAMETERIZED SELECT
cur.execute('SELECT tabname FROM systables WHERE tabid = ?', (1,))
→ ('systables',)
Wire flow: PREPARE → DESCRIBE → SQ_BIND-only (no EXECUTE) →
CURNAME+NFETCH → TUPLE+DONE → drain → CLOSE+RELEASE.
The cursor open is what executes the prepared query; SQ_BIND just
binds values into scope. No need for the IDESCRIBE handshake JDBC
does for type discovery — server accepts our typed bind directly.
2. NULL ROW DECODING — per-type sentinel detection
Each IDS type has its own NULL sentinel in tuple data:
INT → 0x80000000 (INT_MIN)
BIGINT → 0x8000000000000000 (LONG_MIN)
SMALLINT→ 0x8000 (SHORT_MIN)
REAL → all 0xFF (NaN bit pattern)
FLOAT → all 0xFF
DATE → 0x80000000 (same as INT)
VARCHAR → [byte 1][byte 0] (length=1, single nul) — distinguishable
from empty '' which is [byte 0] (length=0)
Verified by wire capture against the dev container — see
docs/CAPTURES/19-py-null-vs-onechar.socat.log and
docs/CAPTURES/20-py-int-null.socat.log.
The VARCHAR null marker is the trickiest because it LOOKS like a
1-byte string of nul, but VARCHAR can't contain embedded nuls
anyway, so the byte-0 within length-1 is unambiguous.
3. executemany(sql, seq_of_params) — PEP 249 batched DML
PREPARE once, loop SQ_BIND+SQ_EXECUTE per param set, RELEASE once.
Performance: only ~1.06x faster than execute() loop for 200 INSERTs
(dominated by per-row round trips). Phase 4.x optimization opportunity:
chain BIND+EXECUTE in one PDU without intermediate flush+read for
true bulk performance (would likely give 5-10x). Documented in
DECISION_LOG.md as a follow-up.
Module changes:
src/informix_db/converters.py:
+ Per-type NULL sentinel constants and detection in each decoder
+ Decoders now return None for sentinel values
src/informix_db/cursors.py:
+ _execute_select_with_params() — SQ_BIND alone, then cursor open
+ _build_bind_only_pdu() — SQ_BIND without trailing SQ_EXECUTE
+ executemany() — loop BIND+EXECUTE, accumulate rowcount
+ execute() now dispatches to _execute_select_with_params for
parameterized SELECT (was: NotSupportedError)
Tests: 40 unit + 47 integration (was 32; added 15 new) = 87 total,
all green, ruff clean. New test files / cases:
tests/test_nulls.py (7) — NULL decoding for INT, BIGINT, FLOAT,
REAL, VARCHAR, empty-vs-null, mixed columns
tests/test_params.py — added 4 parameterized SELECT tests, 5
executemany tests
tests/test_smoke.py — updated cursor-with-params test (was Phase 1
"raises", now Phase 4 "works")
Discovered captures kept for next-session debugging:
docs/CAPTURES/18-py-null-rows.socat.log
docs/CAPTURES/19-py-null-vs-onechar.socat.log
docs/CAPTURES/20-py-int-null.socat.log
Cursor.execute now branches on DESCRIBE response's nfields:
- nfields > 0 → SELECT path (cursor lifecycle: CURNAME+NFETCH+...)
- nfields == 0 → DDL/DML path (just SQ_EXECUTE then SQ_RELEASE)
Examples that work end-to-end against the dev container:
cur.execute('CREATE TEMP TABLE t (id INTEGER, name VARCHAR(50))')
cur.execute("INSERT INTO t VALUES (1, 'hello')") # rowcount=1
cur.execute("UPDATE t SET name = 'new' WHERE id = 1")
cur.execute('DELETE FROM t WHERE id = 1')
Plus full mix: CREATE → 5 INSERTs → SELECT WHERE → DELETE WHERE → SELECT
(see tests/test_dml.py::test_full_dml_cycle_in_one_connection).
Three protocol findings during this push, documented in DECISION_LOG.md:
1. SQ_INSERTDONE (=94) is METADATA, not execution. It arrives in BOTH
the DESCRIBE response (PREPARE phase) AND the EXECUTE response for
literal-value INSERTs. The PREPARE-phase SQ_INSERTDONE carries the
serial values that WILL be assigned IF you execute. The EXECUTE-
phase SQ_INSERTDONE confirms execution. My initial assumption was
"PREPARE-phase INSERTDONE means already-executed" — wrong. Skipping
SQ_EXECUTE made the row not persist (SELECT returned []). Lesson:
optimization-looking responses may not be what they look like —
always verify with a follow-up SELECT.
2. SQ_INSERTDONE wire format: 18 bytes (10 byte longint serial8 + 8
byte bigint bigserial). Per IfxSqli.receiveInsertDone line 2347.
We read-and-discard for now; Phase 5+ surfaces as Cursor.lastrowid.
3. Transactions: commit() and rollback() are 2-byte messages.
SQ_CMMTWORK=19 + SQ_EOT for commit; SQ_RBWORK=20 + SQ_EOT for
rollback. Server responds with SQ_DONE+SQ_EOT in logged databases,
or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases
like sysmaster. Wire machinery is implemented; full transaction
testing needs a logged DB (use ``stores_demo`` from the dev image).
Module changes:
src/informix_db/cursors.py:
- execute() branches on nfields (SELECT path vs DDL/DML path)
- new _execute_dml() does just EXECUTE + RELEASE
- new _build_execute_pdu() emits the 8-byte SQ_ID(EXECUTE)+EOT
- _read_describe_response() and _drain_to_eot() handle SQ_INSERTDONE
src/informix_db/connections.py:
- commit() / rollback() now functional — send the SQ_CMMTWORK /
SQ_RBWORK PDU and drain the response
Tests: 40 unit + 24 integration (6 new DML tests) = 64 total, all
green, ruff clean. New tests cover:
- CREATE TEMP TABLE
- INSERT (rowcount=1, persists, SELECT shows it)
- UPDATE WHERE (specific row changed)
- DELETE WHERE (specific row removed)
- Full mixed cycle (CREATE + 5 INSERTs + SELECT + DELETE + SELECT)
- commit() in unlogged DB raises OperationalError sqlcode=-255
Captured wire artifacts kept for future debugging:
docs/CAPTURES/16-py-insert-literal.socat.log
docs/CAPTURES/17-py-insert-select.socat.log
Three findings, each caught by a different debugging technique,
documented in DECISION_LOG.md:
1. CURNAME+NFETCH PDU: trailing reserved field is SHORT not INT.
Caught by byte-diffing our 44-byte PDU against JDBC's 42-byte
reference under socat. The server tolerated the longer version
for INT-only SELECTs (silently consuming extra zeros) but
rejected it for VARCHAR queries. Lesson: server tolerance varies
by query type — always match JDBC byte-for-byte.
2. SQ_TUPLE payload pads to even byte alignment. An 11-byte
"syscolumns" VARCHAR payload had a trailing 0x00 between it and
the next SQ_TUPLE tag. JDBC's IfxRowColumn.readTuple consumes
this pad silently; we weren't, so any odd-length variable-width
row desynced the parser.
3. VARCHAR/NCHAR/NVCHAR in tuple data use a SINGLE-byte length
prefix (max 255 chars — IDS VARCHAR's hard limit). NOT a 2-byte
short as I'd initially assumed. CHAR is fixed-width per
encoded_length. LVARCHAR uses a 4-byte int prefix for >255 byte
values.
Module changes:
src/informix_db/_resultset.py — _LENGTH_PREFIXED_SHORT_TYPES set,
branched VARCHAR/NCHAR/NVCHAR (1-byte prefix) vs CHAR (fixed)
vs LVARCHAR (4-byte prefix); even-byte alignment pad consumed
after each SQ_TUPLE payload.
src/informix_db/cursors.py — CURNAME+NFETCH and standalone NFETCH
PDUs now write_short(0) for the reserved trailing field.
Tests: 40 unit + 18 integration (3 new VARCHAR tests) = 58 total,
all green, ruff clean. New tests cover:
- VARCHAR single-column SELECT
- Odd-length VARCHAR row (regression for the pad-byte bug)
- Mixed INT + VARCHAR + FLOAT three-column SELECT
Sample output:
SELECT FIRST 5 tabname FROM systables → ('systables',),
('syscolumns',), ('sysindices',), ('systabauth',), ('syscolauth',)
SELECT FIRST 3 tabname, tabid, nrows → ('systables', 1, 276.0), ...
VARCHAR was the last known gap from the Phase 2 commit. Phase 2
now reads INT, BIGINT, REAL, FLOAT, CHAR, VARCHAR end-to-end. Phase
6+ types (DATETIME, INTERVAL, DECIMAL, BLOBs) remain.
cursor.execute("SELECT 1 FROM systables WHERE tabid = 1")
cursor.fetchone() == (1,)
To my knowledge, this is the first time a pure-Python implementation
has read data from Informix without wrapping IBM's CSDK or JDBC.
Three breakthroughs in this commit:
1. Login PDU's database field is BROKEN. Passing a database name there
makes the server reject subsequent SQ_DBOPEN with sqlcode -759
("database not available"). JDBC always sends NULL in the login
PDU's database slot — we now do the same. The user-supplied database
opens via SQ_DBOPEN in _init_session.
2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask
replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars
(48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) →
SQ_DBOPEN. Without all three steps in this exact order, the server
silently ignores SELECTs.
3. SQ_DESCRIBE per-column block has 10 fields per column (not the
simple "name + type" my best-effort parser assumed): fieldIndex,
columnStartPos, columnType, columnExtendedId, ownerName,
extendedName, reference, alignment, sourceType, encodedLength.
The string table at the end is offset-indexed (fieldIndex points
into it), which is how JDBC handles disambiguation.
Cursor lifecycle implementation in cursors.py mirrors JDBC exactly:
PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT
NFETCH(4096) → DONE+COST+EOT (drain)
CLOSE → EOT
RELEASE → EOT
Five round trips per SELECT — same as JDBC.
Module changes:
src/informix_db/connections.py — added _init_session(), _send_protocols(),
_send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now
forces database=None always; SQ_INFO PDU replayed verbatim from
JDBC capture (offsets-indexed env-var format too gnarly to derive
in MVP).
src/informix_db/cursors.py — full rewrite: real PDU builders for
PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched
response readers; cursor-name generator matching JDBC's "_ifxc"
convention.
src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per
JDBC's receiveDescribe (USVER mode); offset-indexed string table
with name lookup by fieldIndex; ColumnInfo dataclass with raw
type-code preserved for null-flag extraction.
src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49.
Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT)
= 55 total, all green, ruff clean. New tests cover:
- SELECT 1 returns (1,)
- cursor.description shape per PEP 249
- Multi-row INT SELECT
- Multi-column mixed types (INT + FLOAT)
- Iterator protocol (for row in cursor)
- fetchmany(n)
- Re-executing on same cursor resets state
- Two cursors on one connection (sequential)
Known gap: VARCHAR row decoding doesn't yet handle the variable-width
on-wire encoding correctly. Phase 2.x will address — for now NotImpl
errors surface raw bytes in the row tuple.
Cursor class scaffolded with full PEP 249 surface:
src/informix_db/cursors.py — Cursor with execute, fetchone, fetchmany,
fetchall, description, rowcount, arraysize, close, iterator,
context manager. Sends SQ_COMMAND chains for parameterless SQL
(Phase 4 adds SQ_BIND/SQ_EXECUTE for params).
src/informix_db/_resultset.py — ColumnInfo, parse_describe,
parse_tuple_payload. Best-effort SQ_DESCRIBE parser; refines in
Phase 2.1.
src/informix_db/connections.py — Connection.cursor() now returns a
real Cursor; new _send_pdu() lets Cursor share the connection's
socket without violating encapsulation.
Protocol findings landed in PROTOCOL_NOTES.md §6:
§6a — SQ_PREPARE format with named tags (the "trailing 22, 49"
are SQ_NDESCRIBE and SQ_WANTDONE chained into the same PDU).
Confirmed against IfxSqli.sendPrepare line 1062.
§6c — Server requires post-login init sequence (SQ_PROTOCOLS →
SQ_INFO → SQ_ID(env vars) → SQ_DBOPEN) BEFORE any PREPARE works.
Discovered the hard way: PREPARE without this sequence gets no
response; SQ_DBOPEN without SQ_PROTOCOLS gets sqlcode=-759
("Database not available"). The login PDU's database field is
a hint, not an open.
§6e — SQ_TUPLE corrected: [short warn][int size][bytes payload]
(not [int 0][short payloadLen] as earlier draft claimed).
Two more constants added to _messages.MessageType:
SQ_NDESCRIBE = 22, SQ_WANTDONE = 49
Tests: 40 unit + 7 integration (added 2 new — cursor() returns a
Cursor, parameter binding raises NotSupportedError). All green, ruff
clean. Removed obsolete "cursor() raises NotImplementedError" test.
What works end-to-end now: connect, cursor(), close, parameter-attempt
gating. What doesn't yet: cursor.execute("SELECT 1") — server requires
the post-login init sequence we don't yet send.
Discovered captures (kept for next session's analysis):
docs/CAPTURES/06-py-select1-attempt.socat.log
docs/CAPTURES/07-py-replay-jdbc-prepare.socat.log
docs/CAPTURES/08-py-with-dbopen.socat.log
docs/CAPTURES/09-py-full-replay.socat.log
Three new tasks created tracking the remaining Phase 2 blockers:
post-login init sequence, proper SQ_DESCRIBE parser, SQ_ID action
vocabulary helpers.
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.
The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:
Cap_1 = 0x0000013c (= (capability_class << 8) | protocol_version
where protocol_version = 0x3c = PF_PROT_SQLI_0600)
Cap_2 = 0
Cap_3 = 0
The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
(the structural prefix: SLheader sans length, all login markers,
capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
PDU has shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.
Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.
Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
pointer to the regression test and the takeaway
Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log
Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
The user's global ~/.gitignore_global excludes *.log universally, which
silently dropped our docs/CAPTURES/*.socat.log files from the previous
Phase 0 commit. Add explicit negation rules in the project .gitignore
so the spike capture deliverables are tracked.
Captured under socat MITM relay (host:9090 → container:9088, hex-dump
both directions), driven by tests/reference/RefClient.java:
- 01-connect-only.socat.log: bare login + disconnect (~1.7 KB)
- 02-select-1.socat.log: SELECT 1 round-trip (~6.7 KB)
- 02-dml-cycle.socat.log: CREATE TEMP + INSERT + SELECT (~9.9 KB)
These are referenced from PROTOCOL_NOTES.md §12 as the canonical
ground-truth for the wire-format claims.