Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log:
SQ_PREPARE format (validated against both observed PREPAREs):
[short SQ_PREPARE=2]
[short flags=0]
[int sqlLen] ← SQL byte count, NOT including nul
[bytes sql]
[byte 0] ← nul terminator
[short 0x0016] ← observed 22; cursor options? statement type?
[short 0x0031] ← observed 49; identical across both PREPAREs
[short SQ_EOT=12]
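A minimal sketch of an encoder for the SQ_PREPARE layout above. The function name build_prepare, the ASCII encoding of the SQL text, and the parameter names for the two trailing shorts are assumptions for illustration; the meaning of 0x0016 and 0x0031 is still unknown, so they are passed through as opaque values. The sketch follows the field list literally and adds no 16-bit alignment padding, which the layout above does not describe.

```python
import struct

SQ_PREPARE = 2
SQ_EOT = 12


def build_prepare(sql: str, unknown_a: int = 0x0016, unknown_b: int = 0x0031) -> bytes:
    """Encode one SQ_PREPARE request following the observed field order.

    unknown_a / unknown_b are the two not-yet-understood trailing shorts
    (observed as 22 and 49); the names are hypothetical.
    """
    sql_bytes = sql.encode("ascii")  # assumption: plain ASCII SQL text
    return b"".join(
        [
            struct.pack("!hh", SQ_PREPARE, 0),    # tag, flags=0
            struct.pack("!i", len(sql_bytes)),    # byte count, nul NOT included
            sql_bytes,
            b"\x00",                              # nul terminator
            struct.pack("!hh", unknown_a, unknown_b),
            struct.pack("!h", SQ_EOT),            # per-PDU flush marker
        ]
    )
```

For "SELECT 1" this yields 23 bytes: an 8-byte fixed prefix, 8 SQL bytes, the nul, and three trailing shorts.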
SQ_TUPLE format (definitive):
[short SQ_TUPLE=14]
[int 0] ← flags / reserved
[short payloadLen]
[bytes payload] ← column values back-to-back, per type encoding
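The SQ_TUPLE layout above could be parsed roughly as follows. This is a sketch, not the project's row decoder: parse_tuple is a hypothetical name, reading payloadLen as an unsigned short is an assumption, and any alignment padding after an odd-length payload is not handled.

```python
import struct

SQ_TUPLE = 14
_HEADER = struct.Struct("!hiH")  # tag, reserved int, payloadLen (unsigned: assumption)


def parse_tuple(buf: bytes, offset: int = 0):
    """Return (payload, next_offset) for one SQ_TUPLE message at offset."""
    tag, _reserved, payload_len = _HEADER.unpack_from(buf, offset)
    if tag != SQ_TUPLE:
        raise ValueError(f"expected SQ_TUPLE (14), got tag {tag}")
    start = offset + _HEADER.size
    end = start + payload_len
    if end > len(buf):
        raise ValueError("buffer ends before the declared payload length")
    # The payload holds the column values back-to-back; splitting it into
    # columns needs the column descriptors from SQ_DESCRIBE.
    return buf[start:end], end
```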
SQ_DONE format (partial — see PROTOCOL_NOTES.md §6e for what's known)
JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE →
ID(3=cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) →
ID(11=release)) is documented in §6c. The action codes inside SQ_ID
roughly map to other SQ_* tag values from IfxMessageTypes.
For Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate)
first — it might let us skip the cursor lifecycle for parameterless
queries.
New modules:
src/informix_db/_types.py — IfxType IntEnum ported from
com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0,
SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101,
BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers
base_type() / is_nullable() to strip and inspect the flag byte.
src/informix_db/converters.py — wire-bytes → Python decoders for the
Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR,
VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table
for the row decoder. ENCODERS dict declared empty (Phase 4 fills it
in for parameter binding).
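The flag helpers described for _types.py could look roughly like this. Only a few of the type codes listed above are reproduced, and the 0xFF base mask is an assumption drawn from the "flag byte" wording (type code in the low byte, flags above it); the real module may differ.

```python
from enum import IntEnum


class IfxType(IntEnum):
    # Subset of the codes quoted in this commit message.
    CHAR = 0
    SMALLINT = 1
    INT = 2
    BOOLEAN = 45
    BIGINT = 52


NOTNULLABLE = 0x100
_BASE_MASK = 0xFF  # assumption: low byte carries the type, high bits the flags


def base_type(code: int) -> IfxType:
    """Strip the flag bits and return the bare type code."""
    return IfxType(code & _BASE_MASK)


def is_nullable(code: int) -> bool:
    """A column is nullable unless the NOTNULLABLE flag is set."""
    return not (code & NOTNULLABLE)
```

For example, a descriptor code of 0x102 would resolve to IfxType.INT with is_nullable() returning False.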
DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int
day count → datetime.date. Smoke-tested decoders all return correct
Python values.
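A minimal sketch of the DATE decoding just described: a 4-byte big-endian signed day count added to the 1899-12-31 epoch. The function name decode_date is hypothetical, and NULL handling is left out because this message does not describe it.

```python
import datetime
import struct

INFORMIX_EPOCH = datetime.date(1899, 12, 31)  # day 0


def decode_date(raw: bytes) -> datetime.date:
    """Decode a 4-byte big-endian day count into a datetime.date."""
    (days,) = struct.unpack("!i", raw)
    return INFORMIX_EPOCH + datetime.timedelta(days=days)
```

With this epoch, a day count of 1 decodes to 1900-01-01.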
The Cursor / _resultset implementation is NOT in this commit: both
need deeper SQ_DESCRIBE byte-layout analysis and a characterization of
the SQ_ID sub-action vocabulary. These are bounded-but-substantial
Phase 2 tasks deferred to a fresh session.
40 unit tests still passing, ruff clean.
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.
The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:
Cap_1 = 0x0000013c (= (capability_class << 8) | protocol_version
where protocol_version = 0x3c = PF_PROT_SQLI_0600)
Cap_2 = 0
Cap_3 = 0
The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
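The one-byte misalignment can be reproduced with struct. In this sketch capability_class = 1 is inferred from the observed 0x0000013c and is an assumption, as are the variable names.

```python
import struct

PF_PROT_SQLI_0600 = 0x3C
capability_class = 1  # assumption: inferred from Cap_1 = 0x0000013c

cap_1 = (capability_class << 8) | PF_PROT_SQLI_0600

# Corrected reading: Cap_1=0x13c, Cap_2=0, Cap_3=0
correct = struct.pack("!iii", cap_1, 0, 0)

# Earlier misreading: Cap_1=1, Cap_2=0x3c000000, Cap_3=0
misread = struct.pack("!iii", 1, 0x3C000000, 0)

print(correct.hex())  # 0000013c0000000000000000
print(misread.hex())  # 000000013c00000000000000
```

The two encodings contain the same 01 3c byte pair, shifted by exactly one byte, which is why the misread looked plausible during manual decoding.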
After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
(the structural prefix: SLheader sans length, all login markers,
capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
PDU has a shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.
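The windowed comparison described in the list above can be sketched as a small helper. This is not the code in tests/test_pdu_match.py; first_mismatch is a hypothetical name, and treating the window as the half-open range [2, 280) is an assumption based on "bytes 280+ legitimately differ".

```python
from typing import Optional


def first_mismatch(
    ours: bytes, reference: bytes, start: int = 2, stop: int = 280
) -> Optional[int]:
    """Return the first offset in [start, stop) where the PDUs differ.

    Returns None when the structural prefix is byte-for-byte identical.
    Bytes outside the window (length field, per-process tail) are ignored.
    """
    if len(ours) < stop or len(reference) < stop:
        raise ValueError("PDU is shorter than the compared window")
    for offset in range(start, stop):
        if ours[offset] != reference[offset]:
            return offset
    return None
```

Returning the offset rather than a bare boolean makes a failing assertion point directly at the field that drifted.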
Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.
Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
pointer to the regression test and the takeaway
Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log
Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
This commit takes informix-db from documentation-only (Phase 0 spike)
to a functional connect() / close() against a real Informix server.
To our knowledge, this is the first pure-socket Informix client in any
language — no CSDK, no JVM, no native libraries.
Layered architecture per the plan, mirroring PyMySQL's shape:
src/informix_db/
  __init__.py     - PEP 249 surface (connect, exceptions, paramstyle="numeric")
  exceptions.py   - full PEP 249 hierarchy declared up front
  _socket.py      - raw socket I/O (read_exact, write_all, timeouts)
  _protocol.py    - IfxStreamReader / IfxStreamWriter framing primitives
                    (big-endian, 16-bit-aligned variable payloads,
                    length-prefixed nul-terminated strings)
  _messages.py    - SQ_* tags from IfxMessageTypes + ASF/login markers
  _auth.py        - pluggable auth handlers; plain-password is the
                    only Phase-1 implementation
  connections.py  - Connection class: builds the binary login PDU
                    (SLheader + PFheader byte-for-byte per
                    PROTOCOL_NOTES.md §3), sends it, parses the
                    server response, wires up close()
Phase 1 design decisions locked in DECISION_LOG.md:
- paramstyle = "numeric" (matches Informix ESQL/C convention)
- Python >= 3.10
- autocommit defaults to off (PEP 249 implicit)
- License: MIT
- Distribution name: informix-db (verified PyPI-available)
Test coverage: 34 unit tests (codec round-trips against synthetic byte
streams; observed login-PDU values from the spike captures asserted as
exact byte literals) + 6 integration tests (connect, idempotent close,
context manager, bad-password → OperationalError, bad-host →
OperationalError, cursor() raises NotImplementedError).
  pytest                 - runs 34 unit tests, no Docker needed
  pytest -m integration  - runs 6 integration tests against the
                           Developer Edition container (pinned by digest
                           in tests/docker-compose.yml)
  pytest -m ""           - runs everything
ruff is clean across src/ and tests/.
One bug found during smoke testing: threading.get_ident() can exceed
the signed 32-bit range on some processes, overflowing struct.pack("!i"). Fixed
the same way the JDBC reference does — clamp to signed 32-bit, fall
back to 0 if out of range. The field is diagnostic only.
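One reading of that fix, sketched under the assumption that "fall back to 0 if out of range" means a plain range check. The helper name wire_thread_id is hypothetical and the actual code in connections.py may differ.

```python
import struct
import threading

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def wire_thread_id(ident: int) -> int:
    """Return ident if it fits a signed 32-bit int, else 0.

    The field is diagnostic only, so losing the real value is acceptable.
    """
    return ident if INT32_MIN <= ident <= INT32_MAX else 0


# struct.pack("!i", ...) raises struct.error for out-of-range values,
# so the identifier is range-checked before packing.
packed = struct.pack("!i", wire_thread_id(threading.get_ident()))
```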
One protocol-level observation that AMENDED the JDBC source reading:
the "capability section" in the login PDU is three independently
negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one
int + 8 reserved zero bytes as my CFR decompile read suggested. The
server echoes them back identically. Trust the wire over the
decompiler.
Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
- Login byte layout: confirmed (server accepts our pure-Python PDU)
- Disconnection: confirmed (SQ_EXIT round-trip works)
- Framing primitives: confirmed (34 unit tests)
- Error path: bad password → OperationalError, bad host → OperationalError
Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard
unknowns there — exact column-descriptor layout, statement-time error
format — were called out as bounded gaps in Phase 0 and have existing
captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize
against.
The user's global ~/.gitignore_global excludes *.log universally, which
silently dropped our docs/CAPTURES/*.socat.log files from the previous
Phase 0 commit. Add explicit negation rules in the project .gitignore
so the spike capture deliverables are tracked.
Captured under socat MITM relay (host:9090 → container:9088, hex-dump
both directions), driven by tests/reference/RefClient.java:
- 01-connect-only.socat.log: bare login + disconnect (~1.7 KB)
- 02-select-1.socat.log: SELECT 1 round-trip (~6.7 KB)
- 02-dml-cycle.socat.log: CREATE TEMP + INSERT + SELECT (~9.9 KB)
These are referenced from PROTOCOL_NOTES.md §12 as the canonical
ground-truth for the wire-format claims.
Java reference client (tests/reference/RefClient.java) drives the
official ifxjdbc.jar through three controlled scenarios:
- connect-only: bare connect+disconnect
- select-1: SELECT 1 round-trip with column metadata
- dml-cycle: CREATE TEMP + INSERT + SELECT in one connection
All three work end-to-end against the dev container with the
documented credentials (informix/in4mix on sysmaster).
Wire traffic captured via socat MITM relay (no sudo needed) — listen
on 9090, forward to 9088, hex-dump both directions. Captures saved
to docs/CAPTURES/. Total ~24 KB across the three scenarios.
PROTOCOL_NOTES.md cross-reference findings (§12):
Confirmed against the wire (✅ both JDBC + PCAP):
- Big-endian framing throughout
- Login PDU structure matches encodeAscBinary field-by-field
- Server response matches DecodeAscBinary
- Post-login messages are bare [short tag][payload]
- SQ_EOT (=12) is a per-PDU flush/submit marker, not just
disconnect ack — every logical request ends with [short 0x000c]
Wire findings that AMENDED the JDBC-derived hypothesis:
- The "capability section" is actually three 4-byte negotiated
capability ints (Cap_1, Cap_2, Cap_3), not one int + 8 reserved
zero bytes. The CFR decompile read it as adjacent zero writes
but the wire shows distinct values that the server echoes back.
Trust the wire over the decompiler for byte layouts.
Validated post-login execution:
- The first SELECT after login is JDBC-internal (locale lookup
via informix.systables) — a Python implementation doesn't need
to do this housekeeping
- SQ_PREPARE format observed: [short SQ_PREPARE=2][short flags=0]
[int sqlLen][bytes sql][nul][short ?][short ?][short SQ_EOT=12]
- Server sends [short SQ_DESCRIBE=8] followed by column metadata
Phase 0 exit verdict: GO. All four hard exit criteria confirmed.
Remaining gaps (result-set descriptor exact layout, statement-time
errors, capability semantics) are bounded and tractable in Phase 2.
The narrow-scope off-ramp is not needed.
Decompiled ifxjdbc.jar (4.50.JC10, build 146, 2023-03-07) with CFR 0.152
into build/jdbc-src/. The decompiled tree is gitignored — it's a
clean-room understanding reference, not shipped code.
Findings landed in two artifacts:
JDBC_NOTES.md — the reverse-lookup index:
- JAR identity (SHA256, manifest, line counts)
- Package layout (com.informix.{asf,jdbc,lang} are the load-bearing
packages; org.bson and the JDBC API surface get ignored)
- Class index mapping each wire-protocol concern to the responsible
Java class. Highlights:
- com.informix.asf.Connection (the wire transport / login PDU)
- com.informix.asf.IfxData{Input,Output}Stream (framing primitives)
- com.informix.jdbc.IfxMessageTypes (140+ message-tag constants)
- com.informix.lang.JavaToIfxType / IfxToJavaType (codecs)
- com.informix.jdbc.IfxSqli / IfxSqliConnect (the SQLI state machine)
- Auth landscape: plain-password is inline in the binary login PDU;
PAM is a server-initiated post-login challenge/response; CSM is
removed from this driver (literally throws an error if you try)
PROTOCOL_NOTES.md — the byte-level wire-format reference:
- Endianness: big-endian, network byte order (confirmed from
JavaToIfxInt source)
- Width table: SmallInt 2B, Int 4B, BigInt 8B, plus the legacy 10-byte
LongInt that we skip for MVP
- 16-bit alignment requirement for variable-length payloads: every
string/decimal/datetime is 0-padded if odd-length; missing this
desynchronizes the parser
- Login PDU structure decoded byte-by-byte from encodeAscBinary():
SLheader (6 bytes) + PFheader with markers 100/101/104/106/107/
108/116/127, capability bitfield, env vars, process info, app name
- Disconnection: bare [short SQ_EXIT=56] both directions, no header
- Post-login messages have NO header — protocol is stream-oriented:
[short tag][payload][short tag][payload]...
- Message-type tag table categorized by purpose
- Open questions list and cross-check matrix tracking what's
JDBC-derived vs PCAP-confirmed
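The 16-bit alignment rule from the list above can be sketched as a pair of helpers. The names pad_to_even and read_padded are hypothetical; the point is that the reader must skip the pad byte after an odd-length payload, or the next [short tag] is read one byte early.

```python
def pad_to_even(payload: bytes) -> bytes:
    """Append a single zero byte when the payload length is odd."""
    return payload + b"\x00" if len(payload) % 2 else payload


def read_padded(buf: bytes, offset: int, length: int):
    """Read length bytes at offset and return (data, next_offset).

    next_offset skips the alignment pad byte after an odd-length payload.
    """
    data = buf[offset : offset + length]
    if len(data) != length:
        raise ValueError("buffer ends inside the payload")
    return data, offset + length + (length & 1)
```

For instance, a 3-byte payload occupies 4 bytes on the wire, so the next field starts at offset + 4.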
DECISION_LOG.md additions:
- ifxjdbc.jar 4.50.JC10 selected as JDBC reference; CFR 0.152 as decompiler
- CSM is officially dead — never plan for it
- Plain-password auth is single-round-trip (no challenge/response)
- Wire-framing primitives locked in for _protocol.py
- Container credentials: user=informix, password=in4mix, on port 9088,
TLS off
Phase 0 exit gate: criteria #1 (login layout), #2 (message-type tags),
#3 (SELECT 1 hypothesis) are derived from JDBC. PCAP capture (task #7)
and cross-reference (task #2) remain to corroborate them.
Project goal: pure-Python implementation of the Informix SQLI wire
protocol. No CSDK, no JVM, no native deps. Targets
icr.io/informix/informix-developer-database (port 9088) as the
dev/test instance.
Phase 0 is a documentation-only spike that gates all implementation
work. The four scaffolds:
- README.md: project status and Phase 0 deliverable index
- docs/PROTOCOL_NOTES.md: byte-level wire-format reference (TBD)
- docs/JDBC_NOTES.md: reverse-lookup index into the decompiled IBM
JDBC driver (4.50.4.1), populated from build/jdbc-src/ once the
decompile lands
- docs/DECISION_LOG.md: running rationale, with the Phase-1
paramstyle/Python-floor/autocommit decisions pre-locked so they
don't churn later
CLAUDE.md is gitignored — operator-private context, public-PyPI repo.