5 Commits

Author SHA1 Message Date
92c4fdbcbf Phase 3: DDL + DML + commit/rollback wire machinery
Cursor.execute now branches on DESCRIBE response's nfields:
  - nfields > 0 → SELECT path (cursor lifecycle: CURNAME+NFETCH+...)
  - nfields == 0 → DDL/DML path (just SQ_EXECUTE then SQ_RELEASE)

Examples that work end-to-end against the dev container:

  cur.execute('CREATE TEMP TABLE t (id INTEGER, name VARCHAR(50))')
  cur.execute("INSERT INTO t VALUES (1, 'hello')")  # rowcount=1
  cur.execute("UPDATE t SET name = 'new' WHERE id = 1")
  cur.execute('DELETE FROM t WHERE id = 1')

Plus full mix: CREATE → 5 INSERTs → SELECT WHERE → DELETE WHERE → SELECT
(see tests/test_dml.py::test_full_dml_cycle_in_one_connection).

Three protocol findings during this push, documented in DECISION_LOG.md:

1. SQ_INSERTDONE (=94) is METADATA, not execution. It arrives in BOTH
   the DESCRIBE response (PREPARE phase) AND the EXECUTE response for
   literal-value INSERTs. The PREPARE-phase SQ_INSERTDONE carries the
   serial values that WILL be assigned IF you execute. The EXECUTE-
   phase SQ_INSERTDONE confirms execution. My initial assumption was
   "PREPARE-phase INSERTDONE means already-executed" — wrong. Skipping
   SQ_EXECUTE made the row not persist (SELECT returned []). Lesson:
   optimization-looking responses may not be what they look like —
   always verify with a follow-up SELECT.

2. SQ_INSERTDONE wire format: 18 bytes (10 byte longint serial8 + 8
   byte bigint bigserial). Per IfxSqli.receiveInsertDone line 2347.
   We read-and-discard for now; Phase 5+ surfaces as Cursor.lastrowid.

3. Transactions: commit() and rollback() are 2-byte messages.
   SQ_CMMTWORK=19 + SQ_EOT for commit; SQ_RBWORK=20 + SQ_EOT for
   rollback. Server responds with SQ_DONE+SQ_EOT in logged databases,
   or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases
   like sysmaster. Wire machinery is implemented; full transaction
   testing needs a logged DB (use ``stores_demo`` from the dev image).

Module changes:
  src/informix_db/cursors.py:
    - execute() branches on nfields (SELECT path vs DDL/DML path)
    - new _execute_dml() does just EXECUTE + RELEASE
    - new _build_execute_pdu() emits the 8-byte SQ_ID(EXECUTE)+EOT
    - _read_describe_response() and _drain_to_eot() handle SQ_INSERTDONE
  src/informix_db/connections.py:
    - commit() / rollback() now functional — send the SQ_CMMTWORK /
      SQ_RBWORK PDU and drain the response

Tests: 40 unit + 24 integration (6 new DML tests) = 64 total, all
green, ruff clean. New tests cover:
  - CREATE TEMP TABLE
  - INSERT (rowcount=1, persists, SELECT shows it)
  - UPDATE WHERE (specific row changed)
  - DELETE WHERE (specific row removed)
  - Full mixed cycle (CREATE + 5 INSERTs + SELECT + DELETE + SELECT)
  - commit() in unlogged DB raises OperationalError sqlcode=-255

Captured wire artifacts kept for future debugging:
  docs/CAPTURES/16-py-insert-literal.socat.log
  docs/CAPTURES/17-py-insert-select.socat.log
2026-05-04 08:02:48 -06:00
34ad04a872 Phase 2.x: VARCHAR row decoding works — three byte-level fixes
Three findings, each caught by a different debugging technique,
documented in DECISION_LOG.md:

1. CURNAME+NFETCH PDU: trailing reserved field is SHORT not INT.
   Caught by byte-diffing our 44-byte PDU against JDBC's 42-byte
   reference under socat. The server tolerated the longer version
   for INT-only SELECTs (silently consuming extra zeros) but
   rejected it for VARCHAR queries. Lesson: server tolerance varies
   by query type — always match JDBC byte-for-byte.

2. SQ_TUPLE payload pads to even byte alignment. An 11-byte
   "syscolumns" VARCHAR payload had a trailing 0x00 between it and
   the next SQ_TUPLE tag. JDBC's IfxRowColumn.readTuple consumes
   this pad silently; we weren't, so any odd-length variable-width
   row desynced the parser.

3. VARCHAR/NCHAR/NVCHAR in tuple data use a SINGLE-byte length
   prefix (max 255 chars — IDS VARCHAR's hard limit). NOT a 2-byte
   short as I'd initially assumed. CHAR is fixed-width per
   encoded_length. LVARCHAR uses a 4-byte int prefix for >255 byte
   values.

Module changes:
  src/informix_db/_resultset.py — _LENGTH_PREFIXED_SHORT_TYPES set,
    branched VARCHAR/NCHAR/NVCHAR (1-byte prefix) vs CHAR (fixed)
    vs LVARCHAR (4-byte prefix); even-byte alignment pad consumed
    after each SQ_TUPLE payload.
  src/informix_db/cursors.py — CURNAME+NFETCH and standalone NFETCH
    PDUs now write_short(0) for the reserved trailing field.

Tests: 40 unit + 18 integration (3 new VARCHAR tests) = 58 total,
all green, ruff clean. New tests cover:
  - VARCHAR single-column SELECT
  - Odd-length VARCHAR row (regression for the pad-byte bug)
  - Mixed INT + VARCHAR + FLOAT three-column SELECT

Sample output:
  SELECT FIRST 5 tabname FROM systables → ('systables',),
    ('syscolumns',), ('sysindices',), ('systabauth',), ('syscolauth',)
  SELECT FIRST 3 tabname, tabid, nrows → ('systables', 1, 276.0), ...

VARCHAR was the last known gap from the Phase 2 commit. Phase 2
now reads INT, BIGINT, REAL, FLOAT, CHAR, VARCHAR end-to-end. Phase
6+ types (DATETIME, INTERVAL, DECIMAL, BLOBs) remain.
2026-05-04 07:55:13 -06:00
ea00990774 Phase 1 polish: PDU match test catches a real capability-int bug
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.

The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:

  Cap_1 = 0x0000013c   (= (capability_class << 8) | protocol_version
                          where protocol_version = 0x3c = PF_PROT_SQLI_0600)
  Cap_2 = 0
  Cap_3 = 0

The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.

After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
  (the structural prefix: SLheader sans length, all login markers,
  capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
  cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
  PDU has shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.

Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.

Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
  actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
  pointer to the regression test and the takeaway

Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log

Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
2026-05-02 20:18:03 -06:00
1a149074d4 Phase 0: populate PROTOCOL_NOTES and JDBC_NOTES from clean-room JDBC reading
Decompiled ifxjdbc.jar (4.50.JC10, build 146, 2023-03-07) with CFR 0.152
into build/jdbc-src/. The decompiled tree is gitignored — it's a
clean-room understanding reference, not shipped code.

Findings landed in two artifacts:

JDBC_NOTES.md — the reverse-lookup index:
- JAR identity (SHA256, manifest, line counts)
- Package layout (com.informix.{asf,jdbc,lang} are the load-bearing
  packages; org.bson and the JDBC API surface get ignored)
- Class index mapping each wire-protocol concern to the responsible
  Java class. Highlights:
  - com.informix.asf.Connection (the wire transport / login PDU)
  - com.informix.asf.IfxData{Input,Output}Stream (framing primitives)
  - com.informix.jdbc.IfxMessageTypes (140+ message-tag constants)
  - com.informix.lang.JavaToIfxType / IfxToJavaType (codecs)
  - com.informix.jdbc.IfxSqli / IfxSqliConnect (the SQLI state machine)
- Auth landscape: plain-password is inline in the binary login PDU;
  PAM is a server-initiated post-login challenge/response; CSM is
  removed from this driver (literally throws an error if you try)

PROTOCOL_NOTES.md — the byte-level wire-format reference:
- Endianness: big-endian, network byte order (confirmed from
  JavaToIfxInt source)
- Width table: SmallInt 2B, Int 4B, BigInt 8B, plus the legacy 10-byte
  LongInt that we skip for MVP
- 16-bit alignment requirement for variable-length payloads — every
  string/decimal/datetime is 0-padded if odd-length, missing this
  desynchronizes the parser
- Login PDU structure decoded byte-by-byte from encodeAscBinary():
  SLheader (6 bytes) + PFheader with markers 100/101/104/106/107/
  108/116/127, capability bitfield, env vars, process info, app name
- Disconnection: bare [short SQ_EXIT=56] both directions, no header
- Post-login messages have NO header — protocol is stream-oriented:
  [short tag][payload][short tag][payload]...
- Message-type tag table categorized by purpose
- Open questions list and cross-check matrix tracking what's
  JDBC-derived vs PCAP-confirmed

DECISION_LOG.md additions:
- ifxjdbc.jar 4.50.JC10 selected as JDBC reference; CFR 0.152 as decompiler
- CSM is officially dead — never plan for it
- Plain-password auth is single-round-trip (no challenge/response)
- Wire-framing primitives locked in for _protocol.py
- Container credentials: user=informix, password=in4mix, on port 9088,
  TLS off

Phase 0 exit gate: criteria #1 (login layout), #2 (message-type tags),
#3 (SELECT 1 hypothesis) are derived from JDBC. PCAP capture (task #7)
and cross-reference (task #2) remaining to corroborate.
2026-05-02 16:00:30 -06:00
f202dbce0c Initialize Phase 0 spike scaffold
Project goal: pure-Python implementation of the Informix SQLI wire
protocol. No CSDK, no JVM, no native deps. Targets icr.io/informix
/informix-developer-database (port 9088) as the dev/test instance.

Phase 0 is a documentation-only spike that gates all implementation
work. The four scaffolds:

- README.md: project status and Phase 0 deliverable index
- docs/PROTOCOL_NOTES.md: byte-level wire-format reference (TBD)
- docs/JDBC_NOTES.md: reverse-lookup index into the decompiled IBM
  JDBC driver (4.50.4.1), populated from build/jdbc-src/ once the
  decompile lands
- docs/DECISION_LOG.md: running rationale, with the Phase-1 paramstyle
  /Python-floor/autocommit decisions pre-locked so they don't churn
  later

CLAUDE.md is gitignored — operator-private context, public-PyPI repo.
2026-05-02 13:22:28 -06:00