Introduces driver-managed transactions that work seamlessly across
logged and unlogged databases. The user calls commit() and rollback()
without needing to know which kind they're hitting — the connection
tracks transaction state internally.
Three protocol facts came out of integration testing:
1. Logged DBs in non-ANSI mode require an explicit SQ_BEGIN before
the first DML — the server doesn't auto-open a transaction.
Connection._ensure_transaction() sends SQ_BEGIN lazily and is
idempotent within an open txn. After commit/rollback, the next
DML triggers a fresh BEGIN.
2. SQ_RBWORK has a [short savepoint=0] payload before the SQ_EOT
framing tag — sending SQ_RBWORK alone causes the server to hang
silently (waiting for the missing 2 bytes). SQ_CMMTWORK has no
payload. This is the same pattern as the SHORT-vs-INT bug from
Phase 4.x and the 2-byte length prefix from Phase 6.c — when the
server hangs, it's an incomplete PDU body.
3. SQ_XACTSTAT (tag 99) is a logged-DB-only message that's
interleaved with normal responses. Now drained in all four
response-reading paths: cursor _drain_to_eot,
_read_describe_response, _read_fetch_response, and connection
_drain_to_eot.
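The payload asymmetry from item 2 can be sketched with `struct`. The tag values (SQ_CMMTWORK=19, SQ_RBWORK=20, SQ_EOT=12) are the ones recorded elsewhere in this log; the function names are hypothetical and are not the driver's actual helpers.

```python
import struct

SQ_EOT = 12
SQ_CMMTWORK = 19
SQ_RBWORK = 20


def build_commit_pdu() -> bytes:
    # SQ_CMMTWORK has no payload: the tag is followed directly by SQ_EOT.
    return struct.pack(">hh", SQ_CMMTWORK, SQ_EOT)


def build_rollback_pdu(savepoint: int = 0) -> bytes:
    # SQ_RBWORK carries a 2-byte savepoint short before SQ_EOT.
    # Omitting it leaves the server waiting for the missing 2 bytes.
    return struct.pack(">hhh", SQ_RBWORK, savepoint, SQ_EOT)
```

The rollback PDU is exactly two bytes longer than the commit PDU, which is the difference the server was waiting on.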
For unlogged DBs (e.g., sysmaster), SQ_BEGIN returns -201 and we
cache that result so subsequent DML doesn't re-probe. commit() and
rollback() are silent no-ops in that case — same client code works
across both DB modes.
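A minimal sketch of the state tracking described above, assuming three stand-in callables for the wire operations. `TxnState` and its callables are hypothetical names, and treating commit() with no open transaction as a no-op on logged databases is an assumption of this sketch, not something this log confirms.

```python
class TxnState:
    """Lazy BEGIN with a cached 'unlogged' result (illustrative only)."""

    def __init__(self, send_begin, send_commit, send_rollback):
        # send_begin is assumed to return an sqlcode: 0 on success,
        # -201 when the database is unlogged.
        self._send_begin = send_begin
        self._send_commit = send_commit
        self._send_rollback = send_rollback
        self.in_txn = False
        self.unlogged = False

    def ensure_transaction(self) -> None:
        # Idempotent inside an open txn; never re-probes an unlogged DB.
        if self.in_txn or self.unlogged:
            return
        code = self._send_begin()
        if code == 0:
            self.in_txn = True
        elif code == -201:
            self.unlogged = True
        else:
            raise RuntimeError(f"unexpected sqlcode {code} from BEGIN")

    def commit(self) -> None:
        self._finish(self._send_commit)

    def rollback(self) -> None:
        self._finish(self._send_rollback)

    def _finish(self, send) -> None:
        # Silent no-op on unlogged DBs (and, by assumption, when idle).
        if self.unlogged or not self.in_txn:
            return
        send()
        self.in_txn = False
```

After commit() or rollback() the flag is cleared, so the next ensure_transaction() call sends a fresh BEGIN.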
Tests:
* New tests/test_transactions.py — 10 integration tests covering
commit visibility, rollback isolation, multi-row rollback, partial
commit-then-rollback, autocommit behavior, cross-connection
durability, UPDATE/DELETE rollback, implicit per-statement txn.
* conftest.py auto-creates testdb (logged) for the suite.
* Two old tests rewritten to assert new no-op behavior on unlogged
DBs (test_commit_rollback_in_unlogged_db_is_noop,
test_commit_in_unlogged_db_is_noop).
Total: 53 unit + 98 integration = 151 tests.
The Phase 3 "gate test" (test_rollback_hides_insert) — a rolled-back
INSERT must be invisible to subsequent SELECTs in the same session —
now passes against a real logged database for the first time.
Empirical and source-level investigation of the LOB type families.
Findings:
* BYTE/TEXT (type 11/12) cannot be inserted via SQL literals — even
dbaccess with `INSERT INTO t VALUES (1, "0x...")` returns -617
"A blob data type must be supplied within this context". The server
requires a binary BBIND wire path. Hard restriction.
* BYTE/TEXT wire protocol: SQ_BIND sends a 56-byte descriptor as the
inline placeholder, then a separate SQ_BBIND (41) PDU declares blob
count, then chunked SQ_BLOB (39) tags stream the actual bytes (max
1024 bytes/chunk per JDBC's sendStreamBlob).
* BLOB/CLOB (type 102/101) are even more involved: smart-LOBs use an
LO_OPEN/LO_READ/LO_WRITE/LO_CLOSE session protocol against sbspace,
with locators carried inline in SQ_TUPLE.
* Server-side setup confirmed working: blobspace1 + sbspace1 + logged
database (testdb) are now available in the dev container for future
Phase 8/9 implementation.
Both LOB families require materially more state-machine work than the
single-PDU codec types (DECIMAL/DATETIME/INTERVAL). Splitting into
Phase 8 (BYTE/TEXT) and Phase 9 (BLOB/CLOB) lets each get focused
attention rather than half-implementing both.
The SQ_BBIND, SQ_BLOB, SQ_FETCHBLOB, SQ_SBBIND, SQ_FILE_READ,
SQ_FILE_WRITE constants are already declared in _messages.py from
Phase 1 scaffolding — protocol layer is ready when implementation
lands.
For users who need binary data <32K today: LVARCHAR via str encoded
with iso-8859-1 is a viable interim path.
Implements encoders for datetime.timedelta → INTERVAL DAY(9) TO FRACTION(5)
and IntervalYM → INTERVAL YEAR(9) TO MONTH. Both follow the
2-byte-length-prefixed BCD wire format established in Phase 6.c
(DECIMAL/DATETIME).
The default qualifier choice is generous: DAY(9) covers any timedelta,
YEAR(9) handles ±1B years. JDBC defaults to smaller widths (DAY(2)/YEAR(4))
trading safety for compactness — we make the opposite trade.
FRACTION(5) is the Informix precision ceiling — sub-10us intervals can't
round-trip cleanly. Same limitation JDBC has.
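To illustrate the FRACTION(5) ceiling, here is a sketch that splits a timedelta into the DAY TO FRACTION(5) fields. The function name is hypothetical, and truncating (rather than rounding) the last microsecond digit is an assumption of this sketch; the real encoder may differ.

```python
import datetime


def split_day_to_fraction5(td: datetime.timedelta):
    """Return (sign, days, hours, minutes, seconds, fraction5)."""
    total_us = (td.days * 86_400 + td.seconds) * 1_000_000 + td.microseconds
    sign = -1 if total_us < 0 else 1
    total_us = abs(total_us)
    # FRACTION(5) keeps five decimal digits, i.e. units of 10 microseconds,
    # so the last microsecond digit is dropped here.
    fraction5 = (total_us % 1_000_000) // 10
    total_s = total_us // 1_000_000
    days, rem = divmod(total_s, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return sign, days, hours, minutes, seconds, fraction5
```

A 5-microsecond timedelta comes out with fraction5 == 0, which is the sub-10us loss noted above.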
Six integration tests, all green on first run against live Informix —
the synthetic round-trip in the test framework caught every framing bug
locally, before integration tests even started. This is the dividend from
owning both decoder and encoder.
Total: 53 unit + 88 integration = 141 tests.
Type matrix update: INTERVAL now has both decode + encode. Only BLOB/CLOB
and BYTE/TEXT remain among the common types.
Implements row-decoding for IDS INTERVAL, the last common temporal type.
The qualifier short bisects the type at the wire level: start_TU >= DAY
maps to datetime.timedelta (day-fraction), start_TU <= MONTH maps to a
new informix_db.IntervalYM (year-month).
Wire format mirrors DATETIME exactly — `[head byte][digit pairs in
base-100]`, with the qualifier dictating field interpretation. The
fraction-to-nanoseconds scaling (`scale_exp = 18 - end_TU`, forced odd)
is the JDBC pattern from `Decimal.fromIfxToArray`.
IntervalYM is a frozen dataclass holding signed total months, with
`years` and `remainder_months` as derived properties. Matches JDBC's
`IntervalYM.months` shape rather than a (years, months) tuple — avoids
ambiguity around what "negative" means for a multi-field tuple.
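A sketch of the shape described above. This is an illustrative stand-in, not the shipped informix_db.IntervalYM; in particular, truncating toward zero so that both derived fields carry the sign of `months` is an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalYM:
    months: int  # signed total months, the single source of truth

    @property
    def years(self) -> int:
        # Whole years, truncated toward zero (assumed convention).
        whole = abs(self.months) // 12
        return -whole if self.months < 0 else whole

    @property
    def remainder_months(self) -> int:
        # Leftover months, carrying the same sign as the total.
        rest = abs(self.months) % 12
        return -rest if self.months < 0 else rest
```

Under this convention `years * 12 + remainder_months == months` holds for every value, including negatives.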
Tests: 13 unit (synthetic byte streams covering all decoder branches)
+ 9 integration (real Informix queries spanning DAY TO SECOND, HOUR TO
SECOND, YEAR TO MONTH, negatives, NULL, and mixed-family rows).
Total test count: 53 unit + 82 integration = 135.
Encoder for INTERVAL parameter binding is deferred to a later phase
(same arc as DECIMAL/DATETIME — decode lands first).
Now you can pass Python datetime/date/Decimal values directly:
cur.execute('INSERT INTO t VALUES (?, ?, ?)',
(1, datetime.datetime(2026, 5, 4, 12, 34, 56), Decimal('1234.56')))
cur.execute('SELECT id FROM t WHERE d > ?', (datetime.date(2025, 1, 1),))
The 2-byte length-prefix discovery: both my Phase 6.a DECIMAL encoder
and the new Phase 6.c DATETIME encoder produced "correct" BCD bytes
but the server silently dropped the SQ_BIND PDU (no response, just
timeout). Captured the wire, diffed against JDBC, and found that
DECIMAL/DATETIME bind data has a 2-byte length PREFIX wrapping the
BCD payload (per Decimal.javaToIfx line 457). With the prefix added,
both encoders work. DATE doesn't need the prefix — it's a fixed
4-byte int.
Per-type wire format:
date → DATE(7), [4-byte BE int = days since 1899-12-31]
datetime → DATETIME(10), [short total_len][byte 0xc7][7 BCD pairs]
Decimal → DECIMAL(5), [short total_len][byte exp][BCD digit pairs]
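The DATE row above is simple enough to sketch directly: a big-endian 4-byte day count from the Informix epoch. Function names here are hypothetical, and the length-prefixed DATETIME/DECIMAL forms are deliberately left out.

```python
import datetime
import struct

INFORMIX_EPOCH = datetime.date(1899, 12, 31)  # day 0


def encode_date(value: datetime.date) -> bytes:
    # 4-byte big-endian signed int: days since 1899-12-31, no length prefix.
    return struct.pack(">i", (value - INFORMIX_EPOCH).days)


def decode_date(raw: bytes) -> datetime.date:
    (days,) = struct.unpack(">i", raw)
    return INFORMIX_EPOCH + datetime.timedelta(days=days)
```

Round-tripping a date through both functions returns the original value, which is the same synthetic check the unit tests rely on.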
For DATETIME the encoder always emits YEAR TO SECOND form (no
microseconds) — covers the common case. Phase 6.x can add YEAR TO
FRACTION(N) variants if microsecond precision is needed.
For DECIMAL the encoder uses the asymmetric base-100 complement
(mirror of decoder) for negatives. Tested with positive, negative,
and fractional values.
Lesson for the protocol playbook: when the server silently drops a
PDU, it's almost always an envelope/framing issue rather than the
inner-value bytes being wrong. Same pattern as the SHORT-vs-INT
reserved field in CURNAME+NFETCH and the even-byte alignment pad.
Module changes:
src/informix_db/converters.py:
+ _encode_date — 4-byte BE int day count
+ _encode_datetime — YEAR TO SECOND form with 2-byte length prefix
+ _encode_decimal — re-enabled (was Phase 6.x stub) with the same
length-prefix fix
+ encode_param() dispatches on datetime.datetime BEFORE
datetime.date (since datetime is a subclass of date in Python)
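The dispatch-order point is easy to get wrong, so here is a small sketch of it. `classify_param` is a hypothetical name; the real encode_param() returns encoded bytes rather than a label.

```python
import datetime
import decimal


def classify_param(value) -> str:
    # datetime.datetime must be tested first: it is a subclass of
    # datetime.date, so the date check would also match it.
    if isinstance(value, datetime.datetime):
        return "DATETIME"
    if isinstance(value, datetime.date):
        return "DATE"
    if isinstance(value, decimal.Decimal):
        return "DECIMAL"
    raise NotImplementedError(type(value).__name__)
```

Swapping the first two checks would send every datetime down the DATE path and silently drop the time of day.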
Tests: 40 unit + 73 integration (3 new date/datetime param tests + 1
updated decimal param test) = 113 total, all green, ruff clean. New
tests cover:
- date as INSERT parameter via executemany — 3 dates round-trip
- datetime as INSERT parameter via executemany — 3 timestamps
- date as parameter in a WHERE clause filter (created_at > ?)
- Decimal round trip (was: NotImplementedError check; now: real
INSERT + SELECT verification)
Type support matrix updates:
DATE — encode ✓ + decode ✓ (was decode-only)
DATETIME — encode ✓ + decode ✓ (was decode-only)
DECIMAL — encode ✓ + decode ✓ (was decode-only)
Before:
cur.execute("SELECT CURRENT YEAR TO SECOND ...")
cur.fetchone() # → (b'\xc7\x14\x1a\x05\x04...',) raw BCD bytes
After:
cur.execute("SELECT CURRENT YEAR TO SECOND ...")
cur.fetchone() # → (datetime.datetime(2026, 5, 4, 12, 34, 56),)
Decoder picks the right Python type by qualifier:
YEAR/MONTH/DAY-only → datetime.date
HOUR/MIN/SEC-only → datetime.time
spans across both → datetime.datetime
Wire format (per IfxToJavaDateTime + Decimal.init treating as packed BCD):
byte[0] = sign + biased exponent (in base-100 digit pairs)
byte[1..] = BCD digit pairs: YYYY (2 bytes) + MM + DD + HH + MI + SS + FFFFF
Qualifier extraction from column descriptor:
encoded_length = (digit_count << 8) | (start_TU << 4) | end_TU
TU codes: YEAR=0, MONTH=2, DAY=4, HOUR=6, MIN=8, SEC=10,
FRAC1=11..FRAC5=15
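The packing formula and the type-selection rule above can be sketched as follows. The function names and the TU_* constant names are hypothetical; the numeric codes are the ones listed above.

```python
import datetime

TU_YEAR, TU_MONTH, TU_DAY = 0, 2, 4
TU_HOUR, TU_MINUTE, TU_SECOND = 6, 8, 10


def unpack_qualifier(encoded_length: int):
    """Invert (digit_count << 8) | (start_TU << 4) | end_TU."""
    digit_count = (encoded_length >> 8) & 0xFF
    start_tu = (encoded_length >> 4) & 0x0F
    end_tu = encoded_length & 0x0F
    return digit_count, start_tu, end_tu


def python_type_for(start_tu: int, end_tu: int) -> type:
    if end_tu <= TU_DAY:
        return datetime.date      # YEAR/MONTH/DAY only
    if start_tu >= TU_HOUR:
        return datetime.time      # HOUR/MIN/SEC only
    return datetime.datetime      # spans both halves
```

For YEAR TO SECOND (14 digits, start 0, end 10) the packed value is 0x0E0A, and the selector returns datetime.datetime.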
Verified against four DATETIME columns of different qualifiers in
one tuple — see test_datetime_multiple_columns_in_one_row:
YEAR TO SECOND → datetime.datetime(2026, 5, 4, 12, 34, 56)
YEAR TO DAY → datetime.date(2026, 5, 4)
HOUR TO SECOND → datetime.time(12, 34, 56)
YEAR TO FRACTION(3) → datetime.datetime(...)
Module changes:
src/informix_db/converters.py:
+ _decode_datetime(raw, encoded_length) — qualifier-driven BCD walk
+ TU constants (_TU_YEAR, _TU_MONTH, ..., _TU_SECOND)
src/informix_db/_resultset.py:
+ DATETIME row-decoder branch — computes width from digit_count
in encoded_length high byte, calls _decode_datetime with the
packed qualifier so it can pick the right Python type
Tests: 40 unit + 70 integration (7 new DATETIME tests) = 110 total,
all green, ruff clean. Tests cover:
- YEAR TO SECOND → datetime.datetime
- YEAR TO DAY → datetime.date
- HOUR TO SECOND → datetime.time
- CURRENT YEAR TO FRACTION(3) → datetime.datetime
- Mixed qualifiers in one row
- DATETIME stored in a real table column (round-trip via SELECT)
- NULL DATETIME → Python None
DATETIME parameter binding (encoder) is Phase 6.x — same status as
DECIMAL encoder.
Before:
ProgrammingError: server returned SQ_ERR sqlcode=-201 isamcode=0
After:
ProgrammingError: [-201] A syntax error has occurred (offset 1)
ProgrammingError: [-206] The specified table is not in the database
(near 'no_such_table') [ISAM -111] (offset 27)
IntegrityError: [-268] Cannot insert duplicate value -
violates UNIQUE constraint (near 'on table u')
[ISAM -100] (offset 23)
IntegrityError: [-391] Cannot insert NULL value into a NOT NULL column
(near 't.id') (offset 23)
OperationalError: [-255] Not in transaction
PEP 249 exception classes mapped from sqlcode:
-201, -206, -217, -286, -310, ... → ProgrammingError
-239, -268, -291, -292, -391, -703 → IntegrityError
-255, -256, -407, -440, -908, ... → OperationalError
-329, -349, -510 → NotSupportedError
others → DatabaseError (safe fallback)
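A sketch of the lookup implied by the table above. The exception classes are local stand-ins for the PEP 249 hierarchy, only the sqlcodes explicitly listed are included (the elided ones are not guessed), and `exception_class_for` is a hypothetical name.

```python
class Error(Exception):
    pass


class DatabaseError(Error):
    pass


class ProgrammingError(DatabaseError):
    pass


class IntegrityError(DatabaseError):
    pass


class OperationalError(DatabaseError):
    pass


class NotSupportedError(DatabaseError):
    pass


_GROUPS = {
    ProgrammingError: (-201, -206, -217, -286, -310),
    IntegrityError: (-239, -268, -291, -292, -391, -703),
    OperationalError: (-255, -256, -407, -440, -908),
    NotSupportedError: (-329, -349, -510),
}
_CLASS_BY_SQLCODE = {
    code: cls for cls, codes in _GROUPS.items() for code in codes
}


def exception_class_for(sqlcode: int) -> type:
    # Unknown codes fall back to DatabaseError, the safe default.
    return _CLASS_BY_SQLCODE.get(sqlcode, DatabaseError)
```

Because every class derives from DatabaseError, callers that catch the fallback class still see all mapped errors.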
SQ_ERR wire decode (per IfxSqli.receiveError line 2717):
[short sqlcode][short isamcode][int offset]
[short near_token_len][bytes name][optional pad][short SQ_EOT]
The "near" token is the object name where the error occurred (table or
column name for "not found" errors); empty for most syntax errors.
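A sketch of decoding that layout, assuming the buffer starts just after the SQ_ERR tag. Two things are assumptions here: that the optional pad is a single byte restoring even alignment after an odd-length name, and that the name decodes as iso-8859-1. `parse_sq_err_body` is a hypothetical name.

```python
import struct

_HEAD = ">hhih"  # sqlcode, isamcode, offset, near_token_len


def parse_sq_err_body(buf: bytes):
    """Return (fields, position of the trailing SQ_EOT short)."""
    sqlcode, isamcode, offset, near_len = struct.unpack_from(_HEAD, buf, 0)
    pos = struct.calcsize(_HEAD)  # 10 bytes of fixed header
    near = buf[pos:pos + near_len].decode("iso-8859-1")
    pos += near_len
    if near_len % 2:
        pos += 1  # assumed even-byte alignment pad
    fields = {
        "sqlcode": sqlcode,
        "isamcode": isamcode,
        "offset": offset,
        "near": near,
    }
    return fields, pos
```

The returned position lets the caller verify that the next short is SQ_EOT before handing control back to the response loop.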
Structured fields attached to every Informix error for programmatic
inspection:
e.sqlcode — Informix error code (e.g. -206)
e.isamcode — ISAM/RSAM-level error (e.g. -111 = "no record found")
e.offset — character offset in the SQL where the error occurred
e.near — object name in the "near 'XYZ'" clause (or "")
Connection state survives errors: a failed query doesn't poison the
session — subsequent execute() calls work normally. Verified by
test_connection_survives_query_error.
Built-in error catalog of ~50 most common Informix sqlcodes shipped
in src/informix_db/_errcodes.py. Users can extend at runtime with
register_error_text(code, text). Unknown codes get a generic
"Informix error <N>" with structured fields still populated.
Module changes:
src/informix_db/_errcodes.py (new) — error catalog + exception
classification + register_error_text()
src/informix_db/cursors.py — _raise_sq_err now uses the catalog
src/informix_db/connections.py — same upgrade for the connection-side
SQ_ERR path (catches commit/rollback errors etc.)
Tests: 40 unit + 63 integration (8 new error tests) = 103 total, all
green, ruff clean. Tests cover:
- syntax error → ProgrammingError(-201)
- table not found → ProgrammingError(-206) with near='no_such_table'
- column not found → ProgrammingError(-217)
- UNIQUE violation → IntegrityError(-268)
- NOT NULL violation → IntegrityError(-391)
- commit on unlogged DB → OperationalError(-255)
- connection survives errors (subsequent queries work)
- all errors carry structured sqlcode/isamcode/offset/near attrs
Three Phase 4 follow-ups in one push, all with empirical wire analysis:
1. PARAMETERIZED SELECT
cur.execute('SELECT tabname FROM systables WHERE tabid = ?', (1,))
→ ('systables',)
Wire flow: PREPARE → DESCRIBE → SQ_BIND-only (no EXECUTE) →
CURNAME+NFETCH → TUPLE+DONE → drain → CLOSE+RELEASE.
The cursor open is what executes the prepared query; SQ_BIND just
binds values into scope. No need for the IDESCRIBE handshake JDBC
does for type discovery — server accepts our typed bind directly.
2. NULL ROW DECODING — per-type sentinel detection
Each IDS type has its own NULL sentinel in tuple data:
INT → 0x80000000 (INT_MIN)
BIGINT → 0x8000000000000000 (LONG_MIN)
SMALLINT→ 0x8000 (SHORT_MIN)
REAL → all 0xFF (NaN bit pattern)
FLOAT → all 0xFF
DATE → 0x80000000 (same as INT)
VARCHAR → [byte 1][byte 0] (length=1, single nul) — distinguishable
from empty '' which is [byte 0] (length=0)
Verified by wire capture against the dev container — see
docs/CAPTURES/19-py-null-vs-onechar.socat.log and
docs/CAPTURES/20-py-int-null.socat.log.
The VARCHAR null marker is the trickiest because it LOOKS like a
1-byte string of nul, but VARCHAR can't contain embedded nuls
anyway, so the byte-0 within length-1 is unambiguous.
3. executemany(sql, seq_of_params) — PEP 249 batched DML
PREPARE once, loop SQ_BIND+SQ_EXECUTE per param set, RELEASE once.
Performance: only ~1.06x faster than execute() loop for 200 INSERTs
(dominated by per-row round trips). Phase 4.x optimization opportunity:
chain BIND+EXECUTE in one PDU without intermediate flush+read for
true bulk performance (would likely give 5-10x). Documented in
DECISION_LOG.md as a follow-up.
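The sentinel checks from item 2 can be sketched for two of the listed types. Function names are hypothetical, and the iso-8859-1 decoding is an assumption; the real decoders live in converters.py.

```python
import struct

INT_NULL = -0x80000000  # 0x80000000 read as a signed 32-bit int (INT_MIN)


def decode_int(raw: bytes):
    (value,) = struct.unpack(">i", raw)
    return None if value == INT_NULL else value


def decode_varchar(buf: bytes, pos: int = 0):
    """Return (value, next position) for a 1-byte-length-prefixed VARCHAR."""
    length = buf[pos]
    body = buf[pos + 1:pos + 1 + length]
    if length == 1 and body == b"\x00":
        # [byte 1][byte 0] is the NULL marker, distinct from empty ''.
        return None, pos + 2
    return body.decode("iso-8859-1"), pos + 1 + length
```

An empty string arrives as a lone length byte of 0, so it decodes to '' rather than None.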
Module changes:
src/informix_db/converters.py:
+ Per-type NULL sentinel constants and detection in each decoder
+ Decoders now return None for sentinel values
src/informix_db/cursors.py:
+ _execute_select_with_params() — SQ_BIND alone, then cursor open
+ _build_bind_only_pdu() — SQ_BIND without trailing SQ_EXECUTE
+ executemany() — loop BIND+EXECUTE, accumulate rowcount
+ execute() now dispatches to _execute_select_with_params for
parameterized SELECT (was: NotSupportedError)
Tests: 40 unit + 47 integration (was 32; added 15 new) = 87 total,
all green, ruff clean. New test files / cases:
tests/test_nulls.py (7) — NULL decoding for INT, BIGINT, FLOAT,
REAL, VARCHAR, empty-vs-null, mixed columns
tests/test_params.py — added 4 parameterized SELECT tests, 5
executemany tests
tests/test_smoke.py — updated cursor-with-params test (was Phase 1
"raises", now Phase 4 "works")
Discovered captures kept for next-session debugging:
docs/CAPTURES/18-py-null-rows.socat.log
docs/CAPTURES/19-py-null-vs-onechar.socat.log
docs/CAPTURES/20-py-int-null.socat.log
cur.execute("INSERT INTO t VALUES (?, ?, ?)", (42, "hello", 3.14))
cur.execute("INSERT INTO t VALUES (:1, :2)", (99, "world"))
cur.execute("UPDATE t SET name = ? WHERE id = ?", ("new", 2))
cur.execute("DELETE FROM t WHERE id = ?", (5,))
# all work end-to-end against a real Informix server
Two breakthroughs decoded from JDBC:
1. SQ_BIND PDU shape (chained with SQ_EXECUTE in one PDU, no separate
round trip):
[short SQ_ID=4][int SQ_BIND=5][short numparams]
for each param:
[short type][short indicator][short prec_or_encLen]
writePadded(rawbytes)
[short SQ_EXECUTE=7][short SQ_EOT]
2. Strings are sent as CHAR (type=0) not VARCHAR (type=13). The server
handles conversion to the actual column type via internal CIDESCRIBE
— we don't need to do it explicitly.
Per-type encoding (Phase 4 MVP):
int (32-bit) → IDS INT (type=2), prec=0x0a00 (packed width=10/scale=0),
4-byte BE
int (64-bit) → IDS BIGINT (type=52), prec=0x1300, 8-byte BE
str → IDS CHAR (type=0), prec=0, [short len][bytes][pad]
float → IDS FLOAT (type=3), prec=0, 8-byte IEEE 754
bool → IDS BOOL (type=45), prec=0, 1 byte
None → indicator=-1, no data
The integer "precision" field is PACKED — initially looked like a bug
(why would precision be 2560?) until I realized 0x0a00 = (10 << 8) | 0
= packed display-width and scale. Captured this surprise in
DECISION_LOG.md.
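The packed precision field is easiest to see in code. This sketch builds the per-parameter header for a 32-bit int using the layout listed above; the function names are hypothetical, and an indicator of 0 for a non-NULL value is an assumption (only indicator=-1 for None is stated in this log).

```python
import struct

IDS_INT = 2


def pack_precision(width: int, scale: int = 0) -> int:
    # High byte is the display width, low byte is the scale.
    return (width << 8) | scale


def encode_int_param(value: int) -> bytes:
    # [short type][short indicator][short prec] then the 4-byte BE value.
    header = struct.pack(">hhh", IDS_INT, 0, pack_precision(10))
    return header + struct.pack(">i", value)
```

pack_precision(10) is 0x0a00, i.e. the 2560 that first looked like a bug; pack_precision(19) gives the 0x1300 used for BIGINT.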
Critical fix to execute-path branching: parameterized INSERT also
returns nfields > 0 (server describes the would-be inserted row).
Switched from "branch on nfields" to "branch on SQL keyword" — JDBC
does the same via its IfxStatement / IfxPreparedStatement subclassing.
Numeric paramstyle support: cur.execute("... :1 ...", (val,)) works
by rewriting :N → ? before sending PREPARE. Trivial regex (doesn't
escape strings/comments — Phase 5 can add a proper SQL tokenizer).
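A sketch of the rewrite and of its known blind spot. The pattern below is an assumption; the actual _NUMERIC_PLACEHOLDER_RE may differ, and `rewrite_numeric` is a hypothetical name.

```python
import re

# Assumed pattern: a colon followed by one or more digits.
_NUMERIC = re.compile(r":\d+")


def rewrite_numeric(sql: str) -> str:
    # Replaces every :N with ? in place. It does not reorder parameters
    # and does not skip string literals or comments.
    return _NUMERIC.sub("?", sql)
```

A literal such as '12:30' is mangled by this approach, which is the case a proper SQL tokenizer would need to handle.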
Module changes:
src/informix_db/converters.py:
+ encode_param() dispatcher
+ _encode_int / _encode_bigint / _encode_str / _encode_float / _encode_bool
src/informix_db/cursors.py:
+ _build_bind_execute_pdu() — chains SQ_BIND + SQ_EXECUTE in one PDU
+ _execute_dml_with_params() — sends bind PDU, drains, releases
+ execute() now accepts parameters; rewrites :N → ?; branches by
SQL keyword (SELECT vs DML)
+ _NUMERIC_PLACEHOLDER_RE for paramstyle="numeric" support
Tests: 40 unit + 32 integration (8 new parameter tests + 1 updated
smoke) = 72 total, all green, ruff clean. New tests cover:
- INSERT with ? params
- INSERT with :N params
- INT + FLOAT + str round trip via INSERT then SELECT
- UPDATE with params in SET and WHERE
- DELETE with parameter in WHERE
- Unsupported param type (bytes) raises NotImplementedError
- Parameterized SELECT raises NotSupportedError (Phase 4.x)
- Dict/named params raise NotSupportedError
Known gaps (Phase 4.x / Phase 5):
- Parameterized SELECT (needs SQ_BIND before CURNAME+NFETCH)
- NULL row decoding for VARCHAR (currently surfaces empty string)
- Proper SQL tokenizer (so :N inside string literals is preserved)
- bytes/datetime/Decimal parameter types
Cursor.execute now branches on DESCRIBE response's nfields:
- nfields > 0 → SELECT path (cursor lifecycle: CURNAME+NFETCH+...)
- nfields == 0 → DDL/DML path (just SQ_EXECUTE then SQ_RELEASE)
Examples that work end-to-end against the dev container:
cur.execute('CREATE TEMP TABLE t (id INTEGER, name VARCHAR(50))')
cur.execute("INSERT INTO t VALUES (1, 'hello')") # rowcount=1
cur.execute("UPDATE t SET name = 'new' WHERE id = 1")
cur.execute('DELETE FROM t WHERE id = 1')
Plus full mix: CREATE → 5 INSERTs → SELECT WHERE → DELETE WHERE → SELECT
(see tests/test_dml.py::test_full_dml_cycle_in_one_connection).
Three protocol findings during this push, documented in DECISION_LOG.md:
1. SQ_INSERTDONE (=94) is METADATA, not execution. It arrives in BOTH
the DESCRIBE response (PREPARE phase) AND the EXECUTE response for
literal-value INSERTs. The PREPARE-phase SQ_INSERTDONE carries the
serial values that WILL be assigned IF you execute. The EXECUTE-
phase SQ_INSERTDONE confirms execution. My initial assumption was
"PREPARE-phase INSERTDONE means already-executed" — wrong. Skipping
SQ_EXECUTE made the row not persist (SELECT returned []). Lesson:
optimization-looking responses may not be what they look like —
always verify with a follow-up SELECT.
2. SQ_INSERTDONE wire format: 18 bytes (10 byte longint serial8 + 8
byte bigint bigserial). Per IfxSqli.receiveInsertDone line 2347.
We read-and-discard for now; Phase 5+ surfaces as Cursor.lastrowid.
3. Transactions: commit() and rollback() are two-short (4-byte) messages.
SQ_CMMTWORK=19 + SQ_EOT for commit; SQ_RBWORK=20 + SQ_EOT for
rollback. Server responds with SQ_DONE+SQ_EOT in logged databases,
or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases
like sysmaster. Wire machinery is implemented; full transaction
testing needs a logged DB (use ``stores_demo`` from the dev image).
Module changes:
src/informix_db/cursors.py:
- execute() branches on nfields (SELECT path vs DDL/DML path)
- new _execute_dml() does just EXECUTE + RELEASE
- new _build_execute_pdu() emits the 8-byte SQ_ID(EXECUTE)+EOT
- _read_describe_response() and _drain_to_eot() handle SQ_INSERTDONE
src/informix_db/connections.py:
- commit() / rollback() now functional — send the SQ_CMMTWORK /
SQ_RBWORK PDU and drain the response
Tests: 40 unit + 24 integration (6 new DML tests) = 64 total, all
green, ruff clean. New tests cover:
- CREATE TEMP TABLE
- INSERT (rowcount=1, persists, SELECT shows it)
- UPDATE WHERE (specific row changed)
- DELETE WHERE (specific row removed)
- Full mixed cycle (CREATE + 5 INSERTs + SELECT + DELETE + SELECT)
- commit() in unlogged DB raises OperationalError sqlcode=-255
Captured wire artifacts kept for future debugging:
docs/CAPTURES/16-py-insert-literal.socat.log
docs/CAPTURES/17-py-insert-select.socat.log
Three findings, each caught by a different debugging technique,
documented in DECISION_LOG.md:
1. CURNAME+NFETCH PDU: trailing reserved field is SHORT not INT.
Caught by byte-diffing our 44-byte PDU against JDBC's 42-byte
reference under socat. The server tolerated the longer version
for INT-only SELECTs (silently consuming extra zeros) but
rejected it for VARCHAR queries. Lesson: server tolerance varies
by query type — always match JDBC byte-for-byte.
2. SQ_TUPLE payload pads to even byte alignment. An 11-byte
"syscolumns" VARCHAR payload had a trailing 0x00 between it and
the next SQ_TUPLE tag. JDBC's IfxRowColumn.readTuple consumes
this pad silently; we weren't, so any odd-length variable-width
row desynced the parser.
3. VARCHAR/NCHAR/NVCHAR in tuple data use a SINGLE-byte length
prefix (max 255 chars — IDS VARCHAR's hard limit). NOT a 2-byte
short as I'd initially assumed. CHAR is fixed-width per
encoded_length. LVARCHAR uses a 4-byte int prefix for >255 byte
values.
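Findings 2 and 3 combine into a short sketch: read a VARCHAR with its single-byte length prefix, then skip the alignment pad that follows an odd-length payload. Names are hypothetical and the iso-8859-1 decoding is an assumption.

```python
def read_varchar(payload: bytes, pos: int = 0):
    """Return (text, end position) using the 1-byte length prefix."""
    length = payload[pos]
    start = pos + 1
    end = start + length
    return payload[start:end].decode("iso-8859-1"), end


def padded_length(payload_len: int) -> int:
    # SQ_TUPLE payloads pad to an even byte count; odd lengths gain one byte.
    return payload_len + (payload_len & 1)
```

For the 11-byte "syscolumns" payload (1 length byte plus 10 characters), padded_length returns 12, so the parser lands on the next SQ_TUPLE tag instead of the pad byte.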
Module changes:
src/informix_db/_resultset.py — _LENGTH_PREFIXED_SHORT_TYPES set,
branched VARCHAR/NCHAR/NVCHAR (1-byte prefix) vs CHAR (fixed)
vs LVARCHAR (4-byte prefix); even-byte alignment pad consumed
after each SQ_TUPLE payload.
src/informix_db/cursors.py — CURNAME+NFETCH and standalone NFETCH
PDUs now write_short(0) for the reserved trailing field.
Tests: 40 unit + 18 integration (3 new VARCHAR tests) = 58 total,
all green, ruff clean. New tests cover:
- VARCHAR single-column SELECT
- Odd-length VARCHAR row (regression for the pad-byte bug)
- Mixed INT + VARCHAR + FLOAT three-column SELECT
Sample output:
SELECT FIRST 5 tabname FROM systables → ('systables',),
('syscolumns',), ('sysindices',), ('systabauth',), ('syscolauth',)
SELECT FIRST 3 tabname, tabid, nrows → ('systables', 1, 276.0), ...
VARCHAR was the last known gap from the Phase 2 commit. Phase 2
now reads INT, BIGINT, REAL, FLOAT, CHAR, VARCHAR end-to-end. Phase
6+ types (DATETIME, INTERVAL, DECIMAL, BLOBs) remain.
cursor.execute("SELECT 1 FROM systables WHERE tabid = 1")
cursor.fetchone() == (1,)
To my knowledge, this is the first time a pure-Python implementation
has read data from Informix without wrapping IBM's CSDK or JDBC.
Three breakthroughs in this commit:
1. Login PDU's database field is BROKEN. Passing a database name there
makes the server reject subsequent SQ_DBOPEN with sqlcode -759
("database not available"). JDBC always sends NULL in the login
PDU's database slot — we now do the same. The user-supplied database
opens via SQ_DBOPEN in _init_session.
2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask
replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars
(48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) →
SQ_DBOPEN. Without all three steps in this exact order, the server
silently ignores SELECTs.
3. SQ_DESCRIBE per-column block has 10 fields per column (not the
simple "name + type" my best-effort parser assumed): fieldIndex,
columnStartPos, columnType, columnExtendedId, ownerName,
extendedName, reference, alignment, sourceType, encodedLength.
The string table at the end is offset-indexed (fieldIndex points
into it), which is how JDBC handles disambiguation.
Cursor lifecycle implementation in cursors.py mirrors JDBC exactly:
PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT
NFETCH(4096) → DONE+COST+EOT (drain)
CLOSE → EOT
RELEASE → EOT
Five round trips per SELECT — same as JDBC.
Module changes:
src/informix_db/connections.py — added _init_session(), _send_protocols(),
_send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now
forces database=None always; SQ_INFO PDU replayed verbatim from
JDBC capture (offset-indexed env-var format too gnarly to derive
in MVP).
src/informix_db/cursors.py — full rewrite: real PDU builders for
PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched
response readers; cursor-name generator matching JDBC's "_ifxc"
convention.
src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per
JDBC's receiveDescribe (USVER mode); offset-indexed string table
with name lookup by fieldIndex; ColumnInfo dataclass with raw
type-code preserved for null-flag extraction.
src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49.
Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT)
= 55 total, all green, ruff clean. New tests cover:
- SELECT 1 returns (1,)
- cursor.description shape per PEP 249
- Multi-row INT SELECT
- Multi-column mixed types (INT + FLOAT)
- Iterator protocol (for row in cursor)
- fetchmany(n)
- Re-executing on same cursor resets state
- Two cursors on one connection (sequential)
Known gap: VARCHAR row decoding doesn't yet handle the variable-width
on-wire encoding correctly. Phase 2.x will address this; for now,
columns the decoder does not implement surface as raw bytes in the
row tuple.
Cursor class scaffolded with full PEP 249 surface:
src/informix_db/cursors.py — Cursor with execute, fetchone, fetchmany,
fetchall, description, rowcount, arraysize, close, iterator,
context manager. Sends SQ_COMMAND chains for parameterless SQL
(Phase 4 adds SQ_BIND/SQ_EXECUTE for params).
src/informix_db/_resultset.py — ColumnInfo, parse_describe,
parse_tuple_payload. Best-effort SQ_DESCRIBE parser; refines in
Phase 2.1.
src/informix_db/connections.py — Connection.cursor() now returns a
real Cursor; new _send_pdu() lets Cursor share the connection's
socket without violating encapsulation.
Protocol findings landed in PROTOCOL_NOTES.md §6:
§6a — SQ_PREPARE format with named tags (the "trailing 22, 49"
are SQ_NDESCRIBE and SQ_WANTDONE chained into the same PDU).
Confirmed against IfxSqli.sendPrepare line 1062.
§6c — Server requires post-login init sequence (SQ_PROTOCOLS →
SQ_INFO → SQ_ID(env vars) → SQ_DBOPEN) BEFORE any PREPARE works.
Discovered the hard way: PREPARE without this sequence gets no
response; SQ_DBOPEN without SQ_PROTOCOLS gets sqlcode=-759
("Database not available"). The login PDU's database field is
a hint, not an open.
§6e — SQ_TUPLE corrected: [short warn][int size][bytes payload]
(not [int 0][short payloadLen] as earlier draft claimed).
Two more constants added to _messages.MessageType:
SQ_NDESCRIBE = 22, SQ_WANTDONE = 49
Tests: 40 unit + 7 integration (added 2 new — cursor() returns a
Cursor, parameter binding raises NotSupportedError). All green, ruff
clean. Removed obsolete "cursor() raises NotImplementedError" test.
What works end-to-end now: connect, cursor(), close, parameter-attempt
gating. What doesn't yet: cursor.execute("SELECT 1") — server requires
the post-login init sequence we don't yet send.
Discovered captures (kept for next session's analysis):
docs/CAPTURES/06-py-select1-attempt.socat.log
docs/CAPTURES/07-py-replay-jdbc-prepare.socat.log
docs/CAPTURES/08-py-with-dbopen.socat.log
docs/CAPTURES/09-py-full-replay.socat.log
Three new tasks created tracking the remaining Phase 2 blockers:
post-login init sequence, proper SQ_DESCRIBE parser, SQ_ID action
vocabulary helpers.
Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log:
SQ_PREPARE format (validated against both observed PREPAREs):
[short SQ_PREPARE=2]
[short flags=0]
[int sqlLen] ← SQL byte count, NOT including nul
[bytes sql]
[byte 0] ← nul terminator
[short 0x0016] ← observed 22; cursor options? statement type?
[short 0x0031] ← observed 49; identical across both PREPAREs
[short SQ_EOT=12]
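A sketch that emits the SQ_PREPARE layout exactly as listed above. `build_prepare_pdu` is a hypothetical name, the iso-8859-1 encoding is an assumption, and any alignment padding the real wire format may require is not modelled here.

```python
import struct

SQ_PREPARE = 2
SQ_EOT = 12


def build_prepare_pdu(sql: str) -> bytes:
    raw = sql.encode("iso-8859-1")
    # Tag, flags=0, then the SQL byte count (nul terminator excluded).
    head = struct.pack(">hhi", SQ_PREPARE, 0, len(raw))
    # The two observed trailing shorts (22 and 49), then SQ_EOT.
    tail = struct.pack(">hhh", 0x0016, 0x0031, SQ_EOT)
    return head + raw + b"\x00" + tail
```

For "SELECT 1" this yields 23 bytes: 8 of header, 8 of SQL, 1 nul, and 6 of trailer.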
SQ_TUPLE format (definitive):
[short SQ_TUPLE=14]
[int 0] ← flags / reserved
[short payloadLen]
[bytes payload] ← column values back-to-back, per type encoding
SQ_DONE format (partial — see PROTOCOL_NOTES.md §6e for what's known)
JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE → ID(3
=cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) →
ID(11=release)) documented in §6c. The action codes inside SQ_ID
roughly map to other SQ_* tag values from IfxMessageTypes.
For Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate)
first — it might let us skip the cursor lifecycle for parameterless
queries.
New modules:
src/informix_db/_types.py — IfxType IntEnum ported from
com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0,
SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101,
BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers
base_type() / is_nullable() to strip and inspect the flag byte.
src/informix_db/converters.py — wire-bytes → Python decoders for the
Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR,
VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table
for the row decoder. ENCODERS dict declared empty (Phase 4 fills it
in for parameter binding).
DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int
day count → datetime.date. Smoke-tested decoders all return correct
Python values.
Cursor / _resultset implementation NOT in this commit — they need
deeper SQ_DESCRIBE byte-layout analysis and the SQ_ID sub-action
vocabulary characterization. Both are bounded-but-substantial Phase 2
tasks deferred to a fresh session.
40 unit tests still passing, ruff clean.
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.
The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:
Cap_1 = 0x0000013c (= (capability_class << 8) | protocol_version
where protocol_version = 0x3c = PF_PROT_SQLI_0600)
Cap_2 = 0
Cap_3 = 0
The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
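The corrected capability layout, and why the misread was a one-byte shift, can be sketched as follows. `pack_capabilities` is a hypothetical name; capability_class=1 is inferred from 0x0000013c >> 8.

```python
import struct

PF_PROT_SQLI_0600 = 0x3C


def pack_capabilities(capability_class: int, protocol_version: int) -> bytes:
    # Cap_1 = (capability_class << 8) | protocol_version; Cap_2 = Cap_3 = 0.
    cap_1 = (capability_class << 8) | protocol_version
    return struct.pack(">III", cap_1, 0, 0)


# The earlier misreading (Cap_1=1, Cap_2=0x3c000000, Cap_3=0) for comparison.
MISREAD = struct.pack(">III", 1, 0x3C000000, 0)
```

Both forms contain the same nonzero bytes 0x01 and 0x3c, only one position apart, which is why a permissive server accepted either.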
After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
(the structural prefix: SLheader sans length, all login markers,
capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
PDU has shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.
Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.
Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
pointer to the regression test and the takeaway
Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log
Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
This commit takes informix-db from documentation-only (Phase 0 spike)
to a functional connect() / close() against a real Informix server.
To our knowledge, this is the first pure-socket Informix client in any
language — no CSDK, no JVM, no native libraries.
Layered architecture per the plan, mirroring PyMySQL's shape:
src/informix_db/
__init__.py — PEP 249 surface (connect, exceptions, paramstyle="numeric")
exceptions.py — full PEP 249 hierarchy declared up front
_socket.py — raw socket I/O (read_exact, write_all, timeouts)
_protocol.py — IfxStreamReader / IfxStreamWriter framing primitives
(big-endian, 16-bit-aligned variable payloads,
length-prefixed nul-terminated strings)
_messages.py — SQ_* tags from IfxMessageTypes + ASF/login markers
_auth.py — pluggable auth handlers; plain-password is the
only Phase-1 implementation
connections.py — Connection class: builds the binary login PDU
(SLheader + PFheader byte-for-byte per
PROTOCOL_NOTES.md §3), sends it, parses the
server response, wires up close()
Phase 1 design decisions locked in DECISION_LOG.md:
- paramstyle = "numeric" (matches Informix ESQL/C convention)
- Python >= 3.10
- autocommit defaults to off (PEP 249 implicit)
- License: MIT
- Distribution name: informix-db (verified PyPI-available)
Test coverage: 34 unit tests (codec round-trips against synthetic byte
streams; observed login-PDU values from the spike captures asserted as
exact byte literals) + 6 integration tests (connect, idempotent close,
context manager, bad-password → OperationalError, bad-host →
OperationalError, cursor() raises NotImplementedError).
pytest — runs 34 unit tests, no Docker needed
pytest -m integration — runs 6 integration tests against the
Developer Edition container (pinned by digest
in tests/docker-compose.yml)
pytest -m "" — runs everything
ruff is clean across src/ and tests/.
One bug found during smoke testing: threading.get_ident() can exceed
signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed
the same way the JDBC reference does — clamp to signed 32-bit, fall
back to 0 if out of range. The field is diagnostic only.
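A minimal sketch of that guard, reading "clamp" as "substitute 0 when the value does not fit". The function name is hypothetical and the actual fix in connections.py may differ in detail.

```python
import struct
import threading

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def wire_safe_thread_id(tid: int) -> int:
    """Return tid if it fits a signed 32-bit int, else 0 (diagnostic field)."""
    return tid if INT32_MIN <= tid <= INT32_MAX else 0


# struct.pack("!i", ...) raises struct.error for out-of-range values,
# so the guard has to run before packing.
packed = struct.pack("!i", wire_safe_thread_id(threading.get_ident()))
```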
One protocol-level observation that AMENDED the JDBC source reading:
the "capability section" in the login PDU is three independently
negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one
int + 8 reserved zero bytes as my CFR decompile read suggested. The
server echoes them back identically. Trust the wire over the
decompiler.
Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
- Login byte layout: confirmed (server accepts our pure-Python PDU)
- Disconnection: confirmed (SQ_EXIT round-trip works)
- Framing primitives: confirmed (34 unit tests)
- Error path: bad password → OperationalError, bad host → OperationalError
Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard
unknowns there — exact column-descriptor layout, statement-time error
format — were called out as bounded gaps in Phase 0 and have existing
captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize
against.
The user's global ~/.gitignore_global excludes *.log universally, which
silently dropped our docs/CAPTURES/*.socat.log files from the previous
Phase 0 commit. Add explicit negation rules in the project .gitignore
so the spike capture deliverables are tracked.
Captured under socat MITM relay (host:9090 → container:9088, hex-dump
both directions), driven by tests/reference/RefClient.java:
- 01-connect-only.socat.log: bare login + disconnect (~1.7 KB)
- 02-select-1.socat.log: SELECT 1 round-trip (~6.7 KB)
- 02-dml-cycle.socat.log: CREATE TEMP + INSERT + SELECT (~9.9 KB)
These are referenced from PROTOCOL_NOTES.md §12 as the canonical
ground-truth for the wire-format claims.
Java reference client (tests/reference/RefClient.java) drives the
official ifxjdbc.jar through three controlled scenarios:
- connect-only: bare connect+disconnect
- select-1: SELECT 1 round-trip with column metadata
- dml-cycle: CREATE TEMP + INSERT + SELECT in one connection
All three work end-to-end against the dev container with the
documented credentials (informix/in4mix on sysmaster).
Wire traffic captured via socat MITM relay (no sudo needed) — listen
on 9090, forward to 9088, hex-dump both directions. Captures saved
to docs/CAPTURES/. Total ~24 KB across the three scenarios.
PROTOCOL_NOTES.md cross-reference findings (§12):
Confirmed against the wire (✅ both JDBC + PCAP):
- Big-endian framing throughout
- Login PDU structure matches encodeAscBinary field-by-field
- Server response matches DecodeAscBinary
- Post-login messages are bare [short tag][payload]
- SQ_EOT (=12) is a per-PDU flush/submit marker, not just
disconnect ack — every logical request ends with [short 0x000c]
Wire findings that AMENDED the JDBC-derived hypothesis:
- The "capability section" is actually three 4-byte negotiated
capability ints (Cap_1, Cap_2, Cap_3), not one int + 8 reserved
zero bytes. The CFR decompile read it as adjacent zero writes
but the wire shows distinct values that the server echoes back.
Trust the wire over the decompiler for byte layouts.
Validated post-login execution:
- The first SELECT after login is JDBC-internal (locale lookup
via informix.systables) — a Python implementation doesn't need
to do this housekeeping
- SQ_PREPARE format observed: [short SQ_PREPARE=2][short flags=0]
[int sqlLen][bytes sql][nul][short ?][short ?][short SQ_EOT=12]
- Server sends [short SQ_DESCRIBE=8] followed by column metadata
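The known leading fields of the observed SQ_PREPARE layout can be sketched as follows. This is a hedged illustration only: the two shorts marked `?` are not yet characterized, so the sketch stops before them. It also assumes that sqlLen counts the SQL bytes without the trailing nul and that the statement is ASCII. The function name is hypothetical.

```python
import struct

SQ_PREPARE = 2
SQ_EOT = 12


def encode_prepare_prefix(sql: str, encoding: str = "ascii") -> bytes:
    """Known leading fields of SQ_PREPARE, up to and including the nul.

    The two uncharacterized shorts and the closing SQ_EOT are NOT emitted.
    """
    body = sql.encode(encoding)
    # Assumption: sqlLen excludes the trailing nul byte.
    header = struct.pack("!hhi", SQ_PREPARE, 0, len(body))
    return header + body + b"\x00"


# Every logical request ends with this marker: b"\x00\x0c".
EOT_MARKER = struct.pack("!h", SQ_EOT)
```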
Phase 0 exit verdict: GO. All four hard exit criteria confirmed.
Remaining gaps (result-set descriptor exact layout, statement-time
errors, capability semantics) are bounded and tractable in Phase 2.
The narrow-scope off-ramp is not needed.
Decompiled ifxjdbc.jar (4.50.JC10, build 146, 2023-03-07) with CFR 0.152
into build/jdbc-src/. The decompiled tree is gitignored — it's a
clean-room understanding reference, not shipped code.
Findings landed in two artifacts:
JDBC_NOTES.md — the reverse-lookup index:
- JAR identity (SHA256, manifest, line counts)
- Package layout (com.informix.{asf,jdbc,lang} are the load-bearing
packages; org.bson and the JDBC API surface get ignored)
- Class index mapping each wire-protocol concern to the responsible
Java class. Highlights:
- com.informix.asf.Connection (the wire transport / login PDU)
- com.informix.asf.IfxData{Input,Output}Stream (framing primitives)
- com.informix.jdbc.IfxMessageTypes (140+ message-tag constants)
- com.informix.lang.JavaToIfxType / IfxToJavaType (codecs)
- com.informix.jdbc.IfxSqli / IfxSqliConnect (the SQLI state machine)
- Auth landscape: plain-password is inline in the binary login PDU;
PAM is a server-initiated post-login challenge/response; CSM is
removed from this driver (literally throws an error if you try)
PROTOCOL_NOTES.md — the byte-level wire-format reference:
- Endianness: big-endian, network byte order (confirmed from
JavaToIfxInt source)
- Width table: SmallInt 2B, Int 4B, BigInt 8B, plus the legacy 10-byte
LongInt that we skip for MVP
- 16-bit alignment requirement for variable-length payloads — every
string/decimal/datetime is 0-padded if odd-length; missing this
desynchronizes the parser
- Login PDU structure decoded byte-by-byte from encodeAscBinary():
SLheader (6 bytes) + PFheader with markers 100/101/104/106/107/
108/116/127, capability bitfield, env vars, process info, app name
- Disconnection: bare [short SQ_EXIT=56] both directions, no header
- Post-login messages have NO header — protocol is stream-oriented:
[short tag][payload][short tag][payload]...
- Message-type tag table categorized by purpose
- Open questions list and cross-check matrix tracking what's
JDBC-derived vs PCAP-confirmed
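The framing rules listed above can be sketched with the standard `struct` module. This is an illustration under stated assumptions, not the contents of _protocol.py: the helper names are hypothetical, and the pad byte is appended after the payload.

```python
import struct

SQ_EXIT = 56

# Width table, big-endian (network byte order).
SMALLINT = struct.Struct("!h")   # 2 bytes
INT = struct.Struct("!i")        # 4 bytes
BIGINT = struct.Struct("!q")     # 8 bytes


def pad_even(payload: bytes) -> bytes:
    """Append one zero byte if the payload length is odd (16-bit alignment)."""
    return payload + b"\x00" if len(payload) % 2 else payload


# Disconnection is a bare tag with no header: b"\x00\x38".
EXIT_MESSAGE = SMALLINT.pack(SQ_EXIT)
```

A reader has to skip the same pad byte after every odd-length payload, otherwise the next `[short tag]` is read one byte off.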
DECISION_LOG.md additions:
- ifxjdbc.jar 4.50.JC10 selected as JDBC reference; CFR 0.152 as decompiler
- CSM is officially dead — never plan for it
- Plain-password auth is single-round-trip (no challenge/response)
- Wire-framing primitives locked in for _protocol.py
- Container credentials: user=informix, password=in4mix, on port 9088,
TLS off
Phase 0 exit gate: criteria #1 (login layout), #2 (message-type tags),
#3 (SELECT 1 hypothesis) are derived from JDBC. PCAP capture (task #7)
and cross-reference (task #2) remain to corroborate them.
Project goal: pure-Python implementation of the Informix SQLI wire
protocol. No CSDK, no JVM, no native deps. Targets
icr.io/informix/informix-developer-database (port 9088) as the dev/test
instance.
Phase 0 is a documentation-only spike that gates all implementation
work. The four scaffolds:
- README.md: project status and Phase 0 deliverable index
- docs/PROTOCOL_NOTES.md: byte-level wire-format reference (TBD)
- docs/JDBC_NOTES.md: reverse-lookup index into the decompiled IBM
JDBC driver (4.50.4.1), populated from build/jdbc-src/ once the
decompile lands
- docs/DECISION_LOG.md: running rationale, with the Phase-1
paramstyle/Python-floor/autocommit decisions pre-locked so they don't
churn later
CLAUDE.md is gitignored — operator-private context, public-PyPI repo.