cursor.execute("SELECT 1 FROM systables WHERE tabid = 1")
cursor.fetchone() == (1,)
To my knowledge, this is the first time a pure-Python implementation
has read data from Informix without wrapping IBM's CSDK or JDBC.
Three breakthroughs in this commit:
1. Login PDU's database field is BROKEN. Passing a database name there
makes the server reject subsequent SQ_DBOPEN with sqlcode -759
("database not available"). JDBC always sends NULL in the login
PDU's database slot — we now do the same. The user-supplied database
opens via SQ_DBOPEN in _init_session.
2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask
replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars
(48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) →
SQ_DBOPEN. Without all three steps in this exact order, the server
silently ignores SELECTs.
3. SQ_DESCRIBE per-column block has 10 fields per column (not the
simple "name + type" my best-effort parser assumed): fieldIndex,
columnStartPos, columnType, columnExtendedId, ownerName,
extendedName, reference, alignment, sourceType, encodedLength.
The string table at the end is offset-indexed (fieldIndex points
into it), which is how JDBC handles disambiguation.
Cursor lifecycle implementation in cursors.py mirrors JDBC exactly:
PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT
NFETCH(4096) → DONE+COST+EOT (drain)
CLOSE → EOT
RELEASE → EOT
Five round trips per SELECT — same as JDBC.
Module changes:
src/informix_db/connections.py — added _init_session(), _send_protocols(),
_send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now
forces database=None always; SQ_INFO PDU replayed verbatim from
JDBC capture (offsets-indexed env-var format too gnarly to derive
in MVP).
src/informix_db/cursors.py — full rewrite: real PDU builders for
PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched
response readers; cursor-name generator matching JDBC's "_ifxc"
convention.
src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per
JDBC's receiveDescribe (USVER mode); offset-indexed string table
with name lookup by fieldIndex; ColumnInfo dataclass with raw
type-code preserved for null-flag extraction.
src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49.
Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT)
= 55 total, all green, ruff clean. New tests cover:
- SELECT 1 returns (1,)
- cursor.description shape per PEP 249
- Multi-row INT SELECT
- Multi-column mixed types (INT + FLOAT)
- Iterator protocol (for row in cursor)
- fetchmany(n)
- Re-executing on same cursor resets state
- Two cursors on one connection (sequential)
Known gap: VARCHAR row decoding doesn't yet handle the variable-width
on-wire encoding correctly. Phase 2.x will address — for now NotImpl
errors surface raw bytes in the row tuple.
Cursor class scaffolded with full PEP 249 surface:
src/informix_db/cursors.py — Cursor with execute, fetchone, fetchmany,
fetchall, description, rowcount, arraysize, close, iterator,
context manager. Sends SQ_COMMAND chains for parameterless SQL
(Phase 4 adds SQ_BIND/SQ_EXECUTE for params).
src/informix_db/_resultset.py — ColumnInfo, parse_describe,
parse_tuple_payload. Best-effort SQ_DESCRIBE parser; refines in
Phase 2.1.
src/informix_db/connections.py — Connection.cursor() now returns a
real Cursor; new _send_pdu() lets Cursor share the connection's
socket without violating encapsulation.
Protocol findings landed in PROTOCOL_NOTES.md §6:
§6a — SQ_PREPARE format with named tags (the "trailing 22, 49"
are SQ_NDESCRIBE and SQ_WANTDONE chained into the same PDU).
Confirmed against IfxSqli.sendPrepare line 1062.
§6c — Server requires post-login init sequence (SQ_PROTOCOLS →
SQ_INFO → SQ_ID(env vars) → SQ_DBOPEN) BEFORE any PREPARE works.
Discovered the hard way: PREPARE without this sequence gets no
response; SQ_DBOPEN without SQ_PROTOCOLS gets sqlcode=-759
("Database not available"). The login PDU's database field is
a hint, not an open.
§6e — SQ_TUPLE corrected: [short warn][int size][bytes payload]
(not [int 0][short payloadLen] as earlier draft claimed).
Two more constants added to _messages.MessageType:
SQ_NDESCRIBE = 22, SQ_WANTDONE = 49
Tests: 40 unit + 7 integration (added 2 new — cursor() returns a
Cursor, parameter binding raises NotSupportedError). All green, ruff
clean. Removed obsolete "cursor() raises NotImplementedError" test.
What works end-to-end now: connect, cursor(), close, parameter-attempt
gating. What doesn't yet: cursor.execute("SELECT 1") — server requires
the post-login init sequence we don't yet send.
Discovered captures (kept for next session's analysis):
docs/CAPTURES/06-py-select1-attempt.socat.log
docs/CAPTURES/07-py-replay-jdbc-prepare.socat.log
docs/CAPTURES/08-py-with-dbopen.socat.log
docs/CAPTURES/09-py-full-replay.socat.log
Three new tasks created tracking the remaining Phase 2 blockers:
post-login init sequence, proper SQ_DESCRIBE parser, SQ_ID action
vocabulary helpers.
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.
The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:
Cap_1 = 0x0000013c (= (capability_class << 8) | protocol_version
where protocol_version = 0x3c = PF_PROT_SQLI_0600)
Cap_2 = 0
Cap_3 = 0
The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
(the structural prefix: SLheader sans length, all login markers,
capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
PDU has shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.
Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.
Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
pointer to the regression test and the takeaway
Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log
Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
This commit takes informix-db from documentation-only (Phase 0 spike)
to a functional connect() / close() against a real Informix server.
To our knowledge, this is the first pure-socket Informix client in any
language — no CSDK, no JVM, no native libraries.
Layered architecture per the plan, mirroring PyMySQL's shape:
src/informix_db/
__init__.py — PEP 249 surface (connect, exceptions, paramstyle="numeric")
exceptions.py — full PEP 249 hierarchy declared up front
_socket.py — raw socket I/O (read_exact, write_all, timeouts)
_protocol.py — IfxStreamReader / IfxStreamWriter framing primitives
(big-endian, 16-bit-aligned variable payloads,
length-prefixed nul-terminated strings)
_messages.py — SQ_* tags from IfxMessageTypes + ASF/login markers
_auth.py — pluggable auth handlers; plain-password is the
only Phase-1 implementation
connections.py — Connection class: builds the binary login PDU
(SLheader + PFheader byte-for-byte per
PROTOCOL_NOTES.md §3), sends it, parses the
server response, wires up close()
Phase 1 design decisions locked in DECISION_LOG.md:
- paramstyle = "numeric" (matches Informix ESQL/C convention)
- Python >= 3.10
- autocommit defaults to off (PEP 249 implicit)
- License: MIT
- Distribution name: informix-db (verified PyPI-available)
Test coverage: 34 unit tests (codec round-trips against synthetic byte
streams; observed login-PDU values from the spike captures asserted as
exact byte literals) + 6 integration tests (connect, idempotent close,
context manager, bad-password → OperationalError, bad-host →
OperationalError, cursor() raises NotImplementedError).
pytest — runs 34 unit tests, no Docker needed
pytest -m integration — runs 6 integration tests against the
Developer Edition container (pinned by digest
in tests/docker-compose.yml)
pytest -m "" — runs everything
ruff is clean across src/ and tests/.
One bug found during smoke testing: threading.get_ident() can exceed
signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed
the same way the JDBC reference does — clamp to signed 32-bit, fall
back to 0 if out of range. The field is diagnostic only.
One protocol-level observation that AMENDED the JDBC source reading:
the "capability section" in the login PDU is three independently
negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one
int + 8 reserved zero bytes as my CFR decompile read suggested. The
server echoes them back identically. Trust the wire over the
decompiler.
Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
- Login byte layout: confirmed (server accepts our pure-Python PDU)
- Disconnection: confirmed (SQ_EXIT round-trip works)
- Framing primitives: confirmed (34 unit tests)
- Error path: bad password → OperationalError, bad host → OperationalError
Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard
unknowns there — exact column-descriptor layout, statement-time error
format — were called out as bounded gaps in Phase 0 and have existing
captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize
against.
Java reference client (tests/reference/RefClient.java) drives the
official ifxjdbc.jar through three controlled scenarios:
- connect-only: bare connect+disconnect
- select-1: SELECT 1 round-trip with column metadata
- dml-cycle: CREATE TEMP + INSERT + SELECT in one connection
All three work end-to-end against the dev container with the
documented credentials (informix/in4mix on sysmaster).
Wire traffic captured via socat MITM relay (no sudo needed) — listen
on 9090, forward to 9088, hex-dump both directions. Captures saved
to docs/CAPTURES/. Total ~24 KB across the three scenarios.
PROTOCOL_NOTES.md cross-reference findings (§12):
Confirmed against the wire (✅ both JDBC + PCAP):
- Big-endian framing throughout
- Login PDU structure matches encodeAscBinary field-by-field
- Server response matches DecodeAscBinary
- Post-login messages are bare [short tag][payload]
- SQ_EOT (=12) is a per-PDU flush/submit marker, not just
disconnect ack — every logical request ends with [short 0x000c]
Wire findings that AMENDED the JDBC-derived hypothesis:
- The "capability section" is actually three 4-byte negotiated
capability ints (Cap_1, Cap_2, Cap_3), not one int + 8 reserved
zero bytes. The CFR decompile read it as adjacent zero writes
but the wire shows distinct values that the server echoes back.
Trust the wire over the decompiler for byte layouts.
Validated post-login execution:
- The first SELECT after login is JDBC-internal (locale lookup
via informix.systables) — a Python implementation doesn't need
to do this housekeeping
- SQ_PREPARE format observed: [short SQ_PREPARE=2][short flags=0]
[int sqlLen][bytes sql][nul][short ?][short ?][short SQ_EOT=12]
- Server sends [short SQ_DESCRIBE=8] followed by column metadata
Phase 0 exit verdict: GO. All four hard exit criteria confirmed.
Remaining gaps (result-set descriptor exact layout, statement-time
errors, capability semantics) are bounded and tractable in Phase 2.
The narrow-scope off-ramp is not needed.