12 Commits

Author SHA1 Message Date
6819dd4cb0 Phase 6.b: DATETIME decoding for all qualifier ranges
Before:
  cur.execute("SELECT CURRENT YEAR TO SECOND ...")
  cur.fetchone()  # → (b'\xc7\x14\x1a\x05\x04...',) raw BCD bytes

After:
  cur.execute("SELECT CURRENT YEAR TO SECOND ...")
  cur.fetchone()  # → (datetime.datetime(2026, 5, 4, 12, 34, 56),)

Decoder picks the right Python type by qualifier:
  YEAR/MONTH/DAY-only → datetime.date
  HOUR/MIN/SEC-only   → datetime.time
  spans across both   → datetime.datetime

Wire format (per IfxToJavaDateTime + Decimal.init, treating the value as packed BCD):
  byte[0] = sign + biased exponent (in base-100 digit pairs)
  byte[1..] = BCD digit pairs: YYYY (2 bytes) + MM + DD + HH + MI + SS + FFFFF
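
A minimal sketch of the digit-pair walk for the YEAR TO SECOND case only. It assumes each byte after byte[0] holds one base-100 pair, as the sample bytes above suggest (0x14 = 20, 0x1a = 26). The function name is hypothetical and is not the project's _decode_datetime; the header byte and all other qualifiers are left out.

```python
import datetime


def decode_year_to_second(raw: bytes) -> datetime.datetime:
    """Decode a YEAR TO SECOND value: 1 header byte + 7 digit-pair bytes."""
    if len(raw) < 8:
        raise ValueError("YEAR TO SECOND needs at least 8 bytes")
    # byte[0] is the sign/exponent header; this sketch does not interpret it.
    century, year2, month, day, hour, minute, second = raw[1:8]
    return datetime.datetime(century * 100 + year2, month, day, hour, minute, second)


# Header byte followed by the pairs 20 26 05 04 12 34 56.
sample = bytes([0xC7, 20, 26, 5, 4, 12, 34, 56])
print(decode_year_to_second(sample))  # 2026-05-04 12:34:56
```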

Qualifier extraction from column descriptor:
  encoded_length = (digit_count << 8) | (start_TU << 4) | end_TU
  TU codes: YEAR=0, MONTH=2, DAY=4, HOUR=6, MIN=8, SEC=10,
            FRAC1=11..FRAC5=15
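
The packing above and the type selection rule can be sketched as follows. The helper names are illustrative, not the names used in converters.py, and the type rule is only the three-way split described in this message.

```python
import datetime

# TU codes as listed in this commit message.
TU_YEAR, TU_MONTH, TU_DAY = 0, 2, 4
TU_HOUR, TU_MINUTE, TU_SECOND = 6, 8, 10


def unpack_qualifier(encoded_length: int):
    """Split encoded_length into (digit_count, start_tu, end_tu)."""
    digit_count = encoded_length >> 8
    start_tu = (encoded_length >> 4) & 0x0F
    end_tu = encoded_length & 0x0F
    return digit_count, start_tu, end_tu


def python_type_for(start_tu: int, end_tu: int):
    """Pick date, time, or datetime from the qualifier range."""
    if end_tu <= TU_DAY:
        return datetime.date      # YEAR/MONTH/DAY only
    if start_tu >= TU_HOUR:
        return datetime.time      # HOUR and finer only
    return datetime.datetime      # spans both halves


# YEAR TO SECOND: 14 digits, start=YEAR, end=SECOND.
encoded = (14 << 8) | (TU_YEAR << 4) | TU_SECOND
print(unpack_qualifier(encoded))  # (14, 0, 10)
```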

Verified against four DATETIME columns of different qualifiers in
one tuple — see test_datetime_multiple_columns_in_one_row:
  YEAR TO SECOND       → datetime.datetime(2026, 5, 4, 12, 34, 56)
  YEAR TO DAY          → datetime.date(2026, 5, 4)
  HOUR TO SECOND       → datetime.time(12, 34, 56)
  YEAR TO FRACTION(3)  → datetime.datetime(...)

Module changes:
  src/informix_db/converters.py:
    + _decode_datetime(raw, encoded_length) — qualifier-driven BCD walk
    + TU constants (_TU_YEAR, _TU_MONTH, ..., _TU_SECOND)
  src/informix_db/_resultset.py:
    + DATETIME row-decoder branch — computes width from digit_count
      in encoded_length high byte, calls _decode_datetime with the
      packed qualifier so it can pick the right Python type

Tests: 40 unit + 70 integration (7 new DATETIME tests) = 110 total,
all green, ruff clean. Tests cover:
  - YEAR TO SECOND → datetime.datetime
  - YEAR TO DAY → datetime.date
  - HOUR TO SECOND → datetime.time
  - CURRENT YEAR TO FRACTION(3) → datetime.datetime
  - Mixed qualifiers in one row
  - DATETIME stored in a real table column (round-trip via SELECT)
  - NULL DATETIME → Python None

DATETIME parameter binding (encoder) is Phase 6.x — same status as
DECIMAL encoder.
2026-05-04 12:02:40 -06:00
d04000dfc3 Phase 5.a: real error messages with PEP 249 exception classification
Before:
  ProgrammingError: server returned SQ_ERR sqlcode=-201 isamcode=0

After:
  ProgrammingError: [-201] A syntax error has occurred (offset 1)
  ProgrammingError: [-206] The specified table is not in the database
                    (near 'no_such_table') [ISAM -111] (offset 27)
  IntegrityError:   [-268] Cannot insert duplicate value -
                    violates UNIQUE constraint (near 'on table u')
                    [ISAM -100] (offset 23)
  IntegrityError:   [-391] Cannot insert NULL value into a NOT NULL column
                    (near 't.id') (offset 23)
  OperationalError: [-255] Not in transaction

PEP 249 exception classes mapped from sqlcode:
  -201, -206, -217, -286, -310, ... → ProgrammingError
  -239, -268, -291, -292, -391, -703 → IntegrityError
  -255, -256, -407, -440, -908, ...   → OperationalError
  -329, -349, -510                    → NotSupportedError
  others                              → DatabaseError (safe fallback)
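
One way to express this mapping is a flat lookup table with a DatabaseError default. The sketch below declares stand-in exception classes so it runs on its own, and covers only the codes spelled out above (the elided "..." entries are not guessed at). The table and function names are hypothetical, not the contents of _errcodes.py.

```python
class DatabaseError(Exception):
    """Stand-in for the PEP 249 DatabaseError."""


class ProgrammingError(DatabaseError):
    pass


class IntegrityError(DatabaseError):
    pass


class OperationalError(DatabaseError):
    pass


class NotSupportedError(DatabaseError):
    pass


_CODE_GROUPS = (
    ((-201, -206, -217, -286, -310), ProgrammingError),
    ((-239, -268, -291, -292, -391, -703), IntegrityError),
    ((-255, -256, -407, -440, -908), OperationalError),
    ((-329, -349, -510), NotSupportedError),
)
_CLASS_BY_SQLCODE = {code: cls for codes, cls in _CODE_GROUPS for code in codes}


def classify(sqlcode: int):
    """Return the exception class for a sqlcode, DatabaseError if unknown."""
    return _CLASS_BY_SQLCODE.get(sqlcode, DatabaseError)


print(classify(-268).__name__)  # IntegrityError
```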

SQ_ERR wire decode (per IfxSqli.receiveError line 2717):
  [short sqlcode][short isamcode][int offset]
  [short near_token_len][bytes name][optional pad][short SQ_EOT]
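
A hedged sketch of reading the fixed prefix and the "near" token with the struct module. It assumes the buffer starts right after the SQ_ERR tag, that fields are big-endian, and that the token is ASCII; it stops before the optional pad and SQ_EOT. The function name is made up for illustration.

```python
import struct

_PREFIX = "!hhih"  # sqlcode, isamcode, offset, near_token_len


def parse_sq_err_prefix(payload: bytes) -> dict:
    """Parse sqlcode/isamcode/offset and the near token from an SQ_ERR body."""
    sqlcode, isamcode, offset, near_len = struct.unpack_from(_PREFIX, payload, 0)
    start = struct.calcsize(_PREFIX)  # 10 bytes of fixed prefix
    near = payload[start:start + near_len].decode("ascii", errors="replace")
    return {"sqlcode": sqlcode, "isamcode": isamcode, "offset": offset, "near": near}


# Synthetic body modelled on the -206 example above.
body = struct.pack(_PREFIX, -206, -111, 27, 13) + b"no_such_table"
print(parse_sq_err_prefix(body))
```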

The "near" token is the object name where the error occurred (table or
column name for "not found" errors); empty for most syntax errors.

Structured fields attached to every Informix error for programmatic
inspection:
  e.sqlcode      — Informix error code (e.g. -206)
  e.isamcode     — ISAM/RSAM-level error (e.g. -111 = "table not found")
  e.offset       — character offset in the SQL where the error occurred
  e.near         — object name in the "near 'XYZ'" clause (or "")

Connection state survives errors: a failed query doesn't poison the
session — subsequent execute() calls work normally. Verified by
test_connection_survives_query_error.

Built-in error catalog of ~50 most common Informix sqlcodes shipped
in src/informix_db/_errcodes.py. Users can extend at runtime with
register_error_text(code, text). Unknown codes get a generic
"Informix error <N>" with structured fields still populated.

Module changes:
  src/informix_db/_errcodes.py (new) — error catalog + exception
    classification + register_error_text()
  src/informix_db/cursors.py — _raise_sq_err now uses the catalog
  src/informix_db/connections.py — same upgrade for the connection-side
    SQ_ERR path (catches commit/rollback errors etc.)

Tests: 40 unit + 63 integration (8 new error tests) = 103 total, all
green, ruff clean. Tests cover:
  - syntax error → ProgrammingError(-201)
  - table not found → ProgrammingError(-206) with near='no_such_table'
  - column not found → ProgrammingError(-217)
  - UNIQUE violation → IntegrityError(-268)
  - NOT NULL violation → IntegrityError(-391)
  - commit on unlogged DB → OperationalError(-255)
  - connection survives errors (subsequent queries work)
  - all errors carry structured sqlcode/isamcode/offset/near attrs
2026-05-04 11:59:03 -06:00
2bacbc4e53 Phase 6.a: DECIMAL/MONEY row decoding works (COUNT/SUM/AVG return Decimal)
Before:
  cur.execute('SELECT COUNT(*) FROM systables')
  cur.fetchone()  # → (b'\xc2\x02\x00\x00\x00\x00\x00\x00\x00',) raw bytes

After:
  cur.execute('SELECT COUNT(*) FROM systables')
  cur.fetchone()  # → (Decimal('276'),)

The trickiest decode of the project so far. IDS DECIMAL/MONEY wire format:

  byte[0] = (sign << 7) | biased_exponent_base100
    bit 7 = sign (1=positive, 0=negative)
    bits 0-6 = (exponent + 64), XOR'd with 0x7F if negative
  byte[1..] = digit-pair bytes (each 0..99 = two BCD digits)
    if negative: asymmetric base-100 complement applied:
      walk digits right→left, trailing zeros stay zero,
      first non-zero subtracts from 100, rest from 99

Initial naive "99 - d for all digits" decoder gave artifacts like
-1234.559999 instead of -1234.56. The asymmetric complement rule
(from Decimal.decComplement line 447) is what makes negatives
round-trip exactly.
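
The rules above can be sketched as a small decoder. This is an illustration, not the project's _decode_decimal: it assumes the digit pairs form a base-100 fraction scaled by 100**exponent, and it does not handle NULL. The sample bytes are hand-built from the stated rules, not taken from a capture.

```python
from decimal import Decimal


def _undo_complement(pairs):
    """Reverse the asymmetric base-100 complement used for negative values."""
    out = list(pairs)
    seen_nonzero = False
    for i in range(len(out) - 1, -1, -1):  # walk right to left
        if seen_nonzero:
            out[i] = 99 - out[i]
        elif out[i] != 0:
            out[i] = 100 - out[i]  # first non-zero pair from the right
            seen_nonzero = True
        # trailing zero pairs are left untouched
    return out


def decode_decimal(raw: bytes) -> Decimal:
    head = raw[0]
    negative = not (head & 0x80)
    biased = head & 0x7F
    pairs = list(raw[1:])
    if negative:
        biased ^= 0x7F
        pairs = _undo_complement(pairs)
    exponent = biased - 64  # power of 100
    digits = tuple(d for pair in pairs for d in divmod(pair, 10))
    # Build from a (sign, digits, exponent) tuple so no rounding is applied.
    return Decimal((1 if negative else 0, digits, 2 * (exponent - len(pairs))))


print(decode_decimal(bytes([0xC2, 12, 34, 56])))  # 1234.56
print(decode_decimal(bytes([0x3D, 87, 65, 44])))  # -1234.56
```

Applying "99 - d" to every pair would turn the last pair 44 into 55 instead of 56, which is the kind of off-by-one-in-the-last-place artifact described above.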

Width on the wire: per-column encoded_length packed as
(precision << 8) | scale; byte width = ceil(precision/2) + 1.
parse_tuple_payload uses this to slice DECIMAL columns correctly.
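
As a worked instance of the width formula (the function name is illustrative only):

```python
def decimal_wire_width(encoded_length: int) -> int:
    """Bytes occupied by a DECIMAL/MONEY column in the tuple payload."""
    precision = encoded_length >> 8
    # scale = encoded_length & 0xFF  (not needed for the width)
    return (precision + 1) // 2 + 1  # ceil(precision / 2) digit bytes + 1 header byte


# DECIMAL(16, 2): 8 digit-pair bytes plus the sign/exponent byte.
print(decimal_wire_width((16 << 8) | 2))  # 9
```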

Tested cases all decode correctly:
  COUNT(*)             → Decimal('276')
  SUM(tabid)           → Decimal('55')
  AVG(tabid)           → Decimal('5.5')
  1234.56::DECIMAL     → Decimal('1234.56')
  -1234.56::DECIMAL    → Decimal('-1234.56')
  -0.5::DECIMAL        → Decimal('-0.5')
  -99.99::DECIMAL      → Decimal('-99.99')
  -12345678.9::DECIMAL → Decimal('-12345678.9')
  NULL                 → None

Encoder (_encode_decimal) is implemented but disabled — server rejects
the produced bytes (precision packing not quite right). Phase 6.x will
fix. Workaround: cast Decimal to float, or pass via SQL literal.

Module changes:
  src/informix_db/converters.py:
    + decimal module import
    + _decode_decimal — full BCD decoder with asymmetric complement
    + _encode_decimal (Phase 6.x stub — present but unreached)
    + DECIMAL/MONEY added to DECODERS dispatch
  src/informix_db/_resultset.py:
    + DECIMAL/MONEY width computation from encoded_length

Tests: 40 unit + 55 integration (8 new DECIMAL) = 95 total, all
green, ruff clean.
2026-05-04 11:17:59 -06:00
d508a489fd Phase 4.x: parameterized SELECT, NULL row decoding, executemany()
Three Phase 4 follow-ups in one push, all with empirical wire analysis:

1. PARAMETERIZED SELECT
   cur.execute('SELECT tabname FROM systables WHERE tabid = ?', (1,))
   → ('systables',)
   Wire flow: PREPARE → DESCRIBE → SQ_BIND-only (no EXECUTE) →
   CURNAME+NFETCH → TUPLE+DONE → drain → CLOSE+RELEASE.
   The cursor open is what executes the prepared query; SQ_BIND just
   binds values into scope. No need for the IDESCRIBE handshake JDBC
   does for type discovery — server accepts our typed bind directly.

2. NULL ROW DECODING — per-type sentinel detection
   Each IDS type has its own NULL sentinel in tuple data:
     INT      → 0x80000000 (INT_MIN)
     BIGINT   → 0x8000000000000000 (LONG_MIN)
     SMALLINT → 0x8000 (SHORT_MIN)
     REAL     → all 0xFF (NaN bit pattern)
     FLOAT    → all 0xFF
     DATE     → 0x80000000 (same as INT)
     VARCHAR  → [byte 1][byte 0]  (length=1, single nul); distinguishable
                 from empty '' which is [byte 0] (length=0)
   Verified by wire capture against the dev container — see
   docs/CAPTURES/19-py-null-vs-onechar.socat.log and
   docs/CAPTURES/20-py-int-null.socat.log.

   The VARCHAR null marker is the trickiest because it LOOKS like a
   1-byte string of nul, but VARCHAR can't contain embedded nuls
   anyway, so the byte-0 within length-1 is unambiguous.

3. executemany(sql, seq_of_params) — PEP 249 batched DML
   PREPARE once, loop SQ_BIND+SQ_EXECUTE per param set, RELEASE once.
   Performance: only ~1.06x faster than execute() loop for 200 INSERTs
   (dominated by per-row round trips). Phase 4.x optimization opportunity:
   chain BIND+EXECUTE in one PDU without intermediate flush+read for
   true bulk performance (would likely give 5-10x). Documented in
   DECISION_LOG.md as a follow-up.
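
The per-type NULL sentinels from item 2 can be sketched as below for INT, FLOAT, and VARCHAR. The function names are hypothetical and the ASCII decoding is an assumption; the real decoders live in converters.py and may differ.

```python
import struct

_INT_NULL = -(2 ** 31)  # 0x80000000 read as a signed big-endian int


def decode_int(raw: bytes):
    (value,) = struct.unpack("!i", raw)
    return None if value == _INT_NULL else value


def decode_float(raw: bytes):
    if raw == b"\xff" * 8:  # all-0xFF sentinel
        return None
    return struct.unpack("!d", raw)[0]


def decode_varchar(raw: bytes):
    """raw starts with the 1-byte length prefix."""
    length = raw[0]
    body = raw[1:1 + length]
    if length == 1 and body == b"\x00":  # NULL marker: length 1, single nul
        return None
    return body.decode("ascii")


print(decode_int(b"\x80\x00\x00\x00"))   # None
print(repr(decode_varchar(b"\x00")))     # ''
print(decode_varchar(b"\x01\x00"))       # None
```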

Module changes:
  src/informix_db/converters.py:
    + Per-type NULL sentinel constants and detection in each decoder
    + Decoders now return None for sentinel values
  src/informix_db/cursors.py:
    + _execute_select_with_params() — SQ_BIND alone, then cursor open
    + _build_bind_only_pdu() — SQ_BIND without trailing SQ_EXECUTE
    + executemany() — loop BIND+EXECUTE, accumulate rowcount
    + execute() now dispatches to _execute_select_with_params for
      parameterized SELECT (was: NotSupportedError)

Tests: 40 unit + 47 integration (was 32; added 15 new) = 87 total,
all green, ruff clean. New test files / cases:
  tests/test_nulls.py (7) — NULL decoding for INT, BIGINT, FLOAT,
    REAL, VARCHAR, empty-vs-null, mixed columns
  tests/test_params.py — added 4 parameterized SELECT tests, 5
    executemany tests
  tests/test_smoke.py — updated cursor-with-params test (was Phase 1
    "raises", now Phase 4 "works")

Discovered captures kept for next-session debugging:
  docs/CAPTURES/18-py-null-rows.socat.log
  docs/CAPTURES/19-py-null-vs-onechar.socat.log
  docs/CAPTURES/20-py-int-null.socat.log
2026-05-04 11:11:50 -06:00
509af9efa4 Phase 4: parameter binding (SQ_BIND) — int, float, str, bool, None
cur.execute("INSERT INTO t VALUES (?, ?, ?)", (42, "hello", 3.14))
cur.execute("INSERT INTO t VALUES (:1, :2)", (99, "world"))
cur.execute("UPDATE t SET name = ? WHERE id = ?", ("new", 2))
cur.execute("DELETE FROM t WHERE id = ?", (5,))
# all work end-to-end against a real Informix server

Two breakthroughs decoded from JDBC:

1. SQ_BIND PDU shape (chained with SQ_EXECUTE in one PDU, no separate
   round trip):
     [short SQ_ID=4][int SQ_BIND=5][short numparams]
     for each param:
       [short type][short indicator][short prec_or_encLen]
       writePadded(rawbytes)
     [short SQ_EXECUTE=7][short SQ_EOT]

2. Strings are sent as CHAR (type=0) not VARCHAR (type=13). The server
   handles conversion to the actual column type via internal CIDESCRIBE
   — we don't need to do it explicitly.

Per-type encoding (Phase 4 MVP):
  int (32-bit) → IDS INT (type=2), prec=0x0a00 (packed width=10/scale=0),
                  4-byte BE
  int (64-bit) → IDS BIGINT (type=52), prec=0x1300, 8-byte BE
  str          → IDS CHAR (type=0), prec=0, [short len][bytes][pad]
  float        → IDS FLOAT (type=3), prec=0, 8-byte IEEE 754
  bool         → IDS BOOL (type=45), prec=0, 1 byte
  None         → indicator=-1, no data

The integer "precision" field is PACKED — initially looked like a bug
(why would precision be 2560?) until I realized 0x0a00 = (10 << 8) | 0
= packed display-width and scale. Captured this surprise in
DECISION_LOG.md.
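
A sketch of the chained SQ_BIND + SQ_EXECUTE layout for INT parameters only, where no padding question arises because each value is 4 bytes. The indicator value 0 for "not NULL" is an assumption, SQ_EOT=12 is taken from the PREPARE layout recorded in an earlier commit, and the builder name is illustrative rather than the project's _build_bind_execute_pdu.

```python
import struct

SQ_ID, SQ_BIND, SQ_EXECUTE, SQ_EOT = 4, 5, 7, 12
IDS_INT = 2
INT_PREC = (10 << 8) | 0  # packed display width 10, scale 0 -> 0x0a00 == 2560


def build_int_bind_execute_pdu(values) -> bytes:
    """Build [SQ_ID][SQ_BIND][n] + per-param blocks + [SQ_EXECUTE][SQ_EOT]."""
    parts = [struct.pack("!hih", SQ_ID, SQ_BIND, len(values))]
    for value in values:
        # type, indicator (0 assumed to mean "value present"), packed precision
        parts.append(struct.pack("!hhh", IDS_INT, 0, INT_PREC))
        parts.append(struct.pack("!i", value))  # 4-byte big-endian payload
    parts.append(struct.pack("!hh", SQ_EXECUTE, SQ_EOT))
    return b"".join(parts)


print(build_int_bind_execute_pdu([42]).hex())
```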

Critical fix to execute-path branching: parameterized INSERT also
returns nfields > 0 (server describes the would-be inserted row).
Switched from "branch on nfields" to "branch on SQL keyword" — JDBC
does the same via its IfxStatement / IfxPreparedStatement subclassing.

Numeric paramstyle support: cur.execute("... :1 ...", (val,)) works
by rewriting :N → ? before sending PREPARE. Trivial regex (doesn't
escape strings/comments — Phase 5 can add a proper SQL tokenizer).
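
The rewrite described above amounts to a one-line substitution. The pattern below is a guess at its shape, not a copy of _NUMERIC_PLACEHOLDER_RE, and the second call shows the known limitation with string literals.

```python
import re

# Hypothetical pattern: a colon followed by one or more digits.
_PLACEHOLDER = re.compile(r":\d+")


def rewrite_numeric_placeholders(sql: str) -> str:
    """Replace every :N with ? (no awareness of strings or comments)."""
    return _PLACEHOLDER.sub("?", sql)


print(rewrite_numeric_placeholders("INSERT INTO t VALUES (:1, :2)"))
# INSERT INTO t VALUES (?, ?)
print(rewrite_numeric_placeholders("SELECT ':1' FROM t"))
# SELECT '?' FROM t   <- literal gets rewritten too
```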

Module changes:
  src/informix_db/converters.py:
    + encode_param() dispatcher
    + _encode_int / _encode_bigint / _encode_str / _encode_float / _encode_bool
  src/informix_db/cursors.py:
    + _build_bind_execute_pdu() — chains SQ_BIND + SQ_EXECUTE in one PDU
    + _execute_dml_with_params() — sends bind PDU, drains, releases
    + execute() now accepts parameters; rewrites :N → ?; branches by
      SQL keyword (SELECT vs DML)
    + _NUMERIC_PLACEHOLDER_RE for paramstyle="numeric" support

Tests: 40 unit + 32 integration (8 new parameter tests + 1 updated
smoke) = 72 total, all green, ruff clean. New tests cover:
  - INSERT with ? params
  - INSERT with :N params
  - INT + FLOAT + str round trip via INSERT then SELECT
  - UPDATE with params in SET and WHERE
  - DELETE with parameter in WHERE
  - Unsupported param type (bytes) raises NotImplementedError
  - Parameterized SELECT raises NotSupportedError (Phase 4.x)
  - Dict/named params raise NotSupportedError

Known gaps (Phase 4.x / Phase 5):
  - Parameterized SELECT (needs SQ_BIND before CURNAME+NFETCH)
  - NULL row decoding for VARCHAR (currently surfaces empty string)
  - Proper SQL tokenizer (so :N inside string literals is preserved)
  - bytes/datetime/Decimal parameter types
2026-05-04 10:54:32 -06:00
92c4fdbcbf Phase 3: DDL + DML + commit/rollback wire machinery
Cursor.execute now branches on DESCRIBE response's nfields:
  - nfields > 0 → SELECT path (cursor lifecycle: CURNAME+NFETCH+...)
  - nfields == 0 → DDL/DML path (just SQ_EXECUTE then SQ_RELEASE)

Examples that work end-to-end against the dev container:

  cur.execute('CREATE TEMP TABLE t (id INTEGER, name VARCHAR(50))')
  cur.execute("INSERT INTO t VALUES (1, 'hello')")  # rowcount=1
  cur.execute("UPDATE t SET name = 'new' WHERE id = 1")
  cur.execute('DELETE FROM t WHERE id = 1')

Plus full mix: CREATE → 5 INSERTs → SELECT WHERE → DELETE WHERE → SELECT
(see tests/test_dml.py::test_full_dml_cycle_in_one_connection).

Three protocol findings during this push, documented in DECISION_LOG.md:

1. SQ_INSERTDONE (=94) is METADATA, not execution. It arrives in BOTH
   the DESCRIBE response (PREPARE phase) AND the EXECUTE response for
   literal-value INSERTs. The PREPARE-phase SQ_INSERTDONE carries the
   serial values that WILL be assigned IF you execute. The EXECUTE-
   phase SQ_INSERTDONE confirms execution. My initial assumption was
   "PREPARE-phase INSERTDONE means already-executed" — wrong. Skipping
   SQ_EXECUTE made the row not persist (SELECT returned []). Lesson:
   optimization-looking responses may not be what they look like —
   always verify with a follow-up SELECT.

2. SQ_INSERTDONE wire format: 18 bytes (10 byte longint serial8 + 8
   byte bigint bigserial). Per IfxSqli.receiveInsertDone line 2347.
   We read-and-discard for now; Phase 5+ surfaces as Cursor.lastrowid.

3. Transactions: commit() and rollback() are two-short (4-byte) messages.
   SQ_CMMTWORK=19 + SQ_EOT for commit; SQ_RBWORK=20 + SQ_EOT for
   rollback. Server responds with SQ_DONE+SQ_EOT in logged databases,
   or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases
   like sysmaster. Wire machinery is implemented; full transaction
   testing needs a logged DB (use ``stores_demo`` from the dev image).

Module changes:
  src/informix_db/cursors.py:
    - execute() branches on nfields (SELECT path vs DDL/DML path)
    - new _execute_dml() does just EXECUTE + RELEASE
    - new _build_execute_pdu() emits the 8-byte SQ_ID(EXECUTE)+EOT
    - _read_describe_response() and _drain_to_eot() handle SQ_INSERTDONE
  src/informix_db/connections.py:
    - commit() / rollback() now functional — send the SQ_CMMTWORK /
      SQ_RBWORK PDU and drain the response

Tests: 40 unit + 24 integration (6 new DML tests) = 64 total, all
green, ruff clean. New tests cover:
  - CREATE TEMP TABLE
  - INSERT (rowcount=1, persists, SELECT shows it)
  - UPDATE WHERE (specific row changed)
  - DELETE WHERE (specific row removed)
  - Full mixed cycle (CREATE + 5 INSERTs + SELECT + DELETE + SELECT)
  - commit() in unlogged DB raises OperationalError sqlcode=-255

Captured wire artifacts kept for future debugging:
  docs/CAPTURES/16-py-insert-literal.socat.log
  docs/CAPTURES/17-py-insert-select.socat.log
2026-05-04 08:02:48 -06:00
34ad04a872 Phase 2.x: VARCHAR row decoding works — three byte-level fixes
Three findings, each caught by a different debugging technique,
documented in DECISION_LOG.md:

1. CURNAME+NFETCH PDU: trailing reserved field is SHORT not INT.
   Caught by byte-diffing our 44-byte PDU against JDBC's 42-byte
   reference under socat. The server tolerated the longer version
   for INT-only SELECTs (silently consuming extra zeros) but
   rejected it for VARCHAR queries. Lesson: server tolerance varies
   by query type — always match JDBC byte-for-byte.

2. SQ_TUPLE payload pads to even byte alignment. An 11-byte
   "syscolumns" VARCHAR payload had a trailing 0x00 between it and
   the next SQ_TUPLE tag. JDBC's IfxRowColumn.readTuple consumes
   this pad silently; we weren't, so any odd-length variable-width
   row desynced the parser.

3. VARCHAR/NCHAR/NVCHAR in tuple data use a SINGLE-byte length
   prefix (max 255 chars — IDS VARCHAR's hard limit). NOT a 2-byte
   short as I'd initially assumed. CHAR is fixed-width per
   encoded_length. LVARCHAR uses a 4-byte int prefix for >255 byte
   values.
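
Findings 2 and 3 can be illustrated with two small helpers. These are sketches with made-up names, not the code in _resultset.py.

```python
def read_varchar(buf: bytes, pos: int):
    """Read a VARCHAR value with its single-byte length prefix.

    Returns (value_bytes, next_position).
    """
    length = buf[pos]
    start = pos + 1
    return buf[start:start + length], start + length


def skip_tuple_pad(payload_len: int, pos: int) -> int:
    """Advance past the pad byte that follows an odd-length SQ_TUPLE payload."""
    return pos + (payload_len & 1)


# 11-byte payload: 1 length byte + "syscolumns", followed by one pad byte.
stream = b"\x0asyscolumns\x00"
value, pos = read_varchar(stream, 0)
print(value, pos, skip_tuple_pad(11, pos))  # b'syscolumns' 11 12
```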

Module changes:
  src/informix_db/_resultset.py — _LENGTH_PREFIXED_SHORT_TYPES set,
    branched VARCHAR/NCHAR/NVCHAR (1-byte prefix) vs CHAR (fixed)
    vs LVARCHAR (4-byte prefix); even-byte alignment pad consumed
    after each SQ_TUPLE payload.
  src/informix_db/cursors.py — CURNAME+NFETCH and standalone NFETCH
    PDUs now write_short(0) for the reserved trailing field.

Tests: 40 unit + 18 integration (3 new VARCHAR tests) = 58 total,
all green, ruff clean. New tests cover:
  - VARCHAR single-column SELECT
  - Odd-length VARCHAR row (regression for the pad-byte bug)
  - Mixed INT + VARCHAR + FLOAT three-column SELECT

Sample output:
  SELECT FIRST 5 tabname FROM systables → ('systables',),
    ('syscolumns',), ('sysindices',), ('systabauth',), ('syscolauth',)
  SELECT FIRST 3 tabname, tabid, nrows → ('systables', 1, 276.0), ...

VARCHAR was the last known gap from the Phase 2 commit. Phase 2
now reads INT, BIGINT, REAL, FLOAT, CHAR, VARCHAR end-to-end. Phase
6+ types (DATETIME, INTERVAL, DECIMAL, BLOBs) remain.
2026-05-04 07:55:13 -06:00
a1bd52788d Phase 2: SELECT works end-to-end — pure-Python Informix fully reads data
cursor.execute("SELECT 1 FROM systables WHERE tabid = 1")
  cursor.fetchone() == (1,)

To my knowledge, this is the first time a pure-Python implementation
has read data from Informix without wrapping IBM's CSDK or JDBC.

Three breakthroughs in this commit:

1. Login PDU's database field is BROKEN. Passing a database name there
   makes the server reject subsequent SQ_DBOPEN with sqlcode -759
   ("database not available"). JDBC always sends NULL in the login
   PDU's database slot — we now do the same. The user-supplied database
   opens via SQ_DBOPEN in _init_session.

2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask
   replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars
   (48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) →
   SQ_DBOPEN. Without all three steps in this exact order, the server
   silently ignores SELECTs.

3. SQ_DESCRIBE per-column block has 10 fields per column (not the
   simple "name + type" my best-effort parser assumed): fieldIndex,
   columnStartPos, columnType, columnExtendedId, ownerName,
   extendedName, reference, alignment, sourceType, encodedLength.
   The string table at the end is offset-indexed (fieldIndex points
   into it), which is how JDBC handles disambiguation.

Cursor lifecycle implementation in cursors.py mirrors JDBC exactly:
  PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
  CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT
  NFETCH(4096) → DONE+COST+EOT (drain)
  CLOSE → EOT
  RELEASE → EOT

Five round trips per SELECT — same as JDBC.

Module changes:
  src/informix_db/connections.py — added _init_session(), _send_protocols(),
    _send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now
    forces database=None always; SQ_INFO PDU replayed verbatim from
    JDBC capture (the offset-indexed env-var format was too gnarly to
    derive for the MVP).
  src/informix_db/cursors.py — full rewrite: real PDU builders for
    PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched
    response readers; cursor-name generator matching JDBC's "_ifxc"
    convention.
  src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per
    JDBC's receiveDescribe (USVER mode); offset-indexed string table
    with name lookup by fieldIndex; ColumnInfo dataclass with raw
    type-code preserved for null-flag extraction.
  src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49.

Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT)
= 55 total, all green, ruff clean. New tests cover:
  - SELECT 1 returns (1,)
  - cursor.description shape per PEP 249
  - Multi-row INT SELECT
  - Multi-column mixed types (INT + FLOAT)
  - Iterator protocol (for row in cursor)
  - fetchmany(n)
  - Re-executing on same cursor resets state
  - Two cursors on one connection (sequential)

Known gap: VARCHAR row decoding doesn't yet handle the variable-width
on-wire encoding correctly. Phase 2.x will address this; for now,
undecoded columns surface as raw bytes in the row tuple.
2026-05-03 15:37:10 -06:00
e2c48f855e Phase 2 progress: cursor scaffolding + protocol findings (SELECT path WIP)
Cursor class scaffolded with full PEP 249 surface:
  src/informix_db/cursors.py — Cursor with execute, fetchone, fetchmany,
    fetchall, description, rowcount, arraysize, close, iterator,
    context manager. Sends SQ_COMMAND chains for parameterless SQL
    (Phase 4 adds SQ_BIND/SQ_EXECUTE for params).
  src/informix_db/_resultset.py — ColumnInfo, parse_describe,
    parse_tuple_payload. Best-effort SQ_DESCRIBE parser; refines in
    Phase 2.1.
  src/informix_db/connections.py — Connection.cursor() now returns a
    real Cursor; new _send_pdu() lets Cursor share the connection's
    socket without violating encapsulation.

Protocol findings landed in PROTOCOL_NOTES.md §6:
  §6a — SQ_PREPARE format with named tags (the "trailing 22, 49"
    are SQ_NDESCRIBE and SQ_WANTDONE chained into the same PDU).
    Confirmed against IfxSqli.sendPrepare line 1062.
  §6c — Server requires post-login init sequence (SQ_PROTOCOLS →
    SQ_INFO → SQ_ID(env vars) → SQ_DBOPEN) BEFORE any PREPARE works.
    Discovered the hard way: PREPARE without this sequence gets no
    response; SQ_DBOPEN without SQ_PROTOCOLS gets sqlcode=-759
    ("Database not available"). The login PDU's database field is
    a hint, not an open.
  §6e — SQ_TUPLE corrected: [short warn][int size][bytes payload]
    (not [int 0][short payloadLen] as earlier draft claimed).
  Two more constants added to _messages.MessageType:
    SQ_NDESCRIBE = 22, SQ_WANTDONE = 49

Tests: 40 unit + 7 integration (added 2 new — cursor() returns a
Cursor, parameter binding raises NotSupportedError). All green, ruff
clean. Removed obsolete "cursor() raises NotImplementedError" test.

What works end-to-end now: connect, cursor(), close, parameter-attempt
gating. What doesn't yet: cursor.execute("SELECT 1") — server requires
the post-login init sequence we don't yet send.

Discovered captures (kept for next session's analysis):
  docs/CAPTURES/06-py-select1-attempt.socat.log
  docs/CAPTURES/07-py-replay-jdbc-prepare.socat.log
  docs/CAPTURES/08-py-with-dbopen.socat.log
  docs/CAPTURES/09-py-full-replay.socat.log

Three new tasks created tracking the remaining Phase 2 blockers:
post-login init sequence, proper SQ_DESCRIBE parser, SQ_ID action
vocabulary helpers.
2026-05-02 21:04:30 -06:00
ddac40ff0b Phase 2 foundations: _types.py, converters.py, post-login protocol notes
Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log:

SQ_PREPARE format (validated against both observed PREPAREs):
  [short SQ_PREPARE=2]
  [short flags=0]
  [int sqlLen]            ← SQL byte count, NOT including nul
  [bytes sql]
  [byte 0]                ← nul terminator
  [short 0x0016]          ← observed 22; cursor options? statement type?
  [short 0x0031]          ← observed 49; identical across both PREPAREs
  [short SQ_EOT=12]

SQ_TUPLE format (definitive):
  [short SQ_TUPLE=14]
  [int 0]                  ← flags / reserved
  [short payloadLen]
  [bytes payload]          ← column values back-to-back, per type encoding

SQ_DONE format (partial — see PROTOCOL_NOTES.md §6e for what's known)

JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE → ID(3
=cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) →
ID(11=release)) documented in §6c. The action codes inside SQ_ID
roughly map to other SQ_* tag values from IfxMessageTypes.

For Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate)
first — it might let us skip the cursor lifecycle for parameterless
queries.

New modules:

src/informix_db/_types.py — IfxType IntEnum ported from
  com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0,
  SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101,
  BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers
  base_type() / is_nullable() to strip and inspect the flag byte.

src/informix_db/converters.py — wire-bytes → Python decoders for the
  Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR,
  VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table
  for the row decoder. ENCODERS dict declared empty (Phase 4 fills it
  in for parameter binding).

DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int
day count → datetime.date. Smoke-tested decoders all return correct
Python values.
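
The DATE rule is simple enough to show in full. This is a sketch with an illustrative function name; NULL handling is omitted.

```python
import datetime
import struct

INFORMIX_EPOCH = datetime.date(1899, 12, 31)  # day 0


def decode_date(raw: bytes) -> datetime.date:
    """Decode a 4-byte big-endian day count relative to the Informix epoch."""
    (days,) = struct.unpack("!i", raw)
    return INFORMIX_EPOCH + datetime.timedelta(days=days)


print(decode_date(struct.pack("!i", 1)))  # 1900-01-01
```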

Cursor / _resultset implementation NOT in this commit — they need
deeper SQ_DESCRIBE byte-layout analysis and the SQ_ID sub-action
vocabulary characterization. Both are bounded-but-substantial Phase 2
tasks deferred to a fresh session.

40 unit tests still passing, ruff clean.
2026-05-02 20:24:25 -06:00
ea00990774 Phase 1 polish: PDU match test catches a real capability-int bug
Polish item #1: byte-for-byte regression test that asserts our
generated login PDU is structurally identical to JDBC's reference
captured in docs/CAPTURES/01-connect-only.socat.log.

The test (tests/test_pdu_match.py) immediately caught a real bug:
the capability section was misread during Phase 0 byte-decoding.
Earlier text claimed Cap_1=1, Cap_2=0x3c000000, Cap_3=0 — actually:

  Cap_1 = 0x0000013c   (= (capability_class << 8) | protocol_version
                          where protocol_version = 0x3c = PF_PROT_SQLI_0600)
  Cap_2 = 0
  Cap_3 = 0

The misalignment was: the 0x3c byte I attributed to Cap_2's high
byte was actually Cap_1's low byte. The dev-image server is
permissive enough to accept arbitrary capability values, so the
connection succeeded even with the wrong bytes — but the PDU wasn't
structurally identical to JDBC's reference. SERVER-ACCEPTS ≠
STRUCTURALLY-CORRECT. This is exactly why the byte-for-byte diff
was the right polish item; "it connects" was a false ceiling.
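
To make the difference concrete, the two readings pack to different bytes. The capability_class value of 1 is read off 0x0000013c and is an inference, not something stated elsewhere in this message.

```python
import struct

PROTOCOL_VERSION = 0x3C   # PF_PROT_SQLI_0600
CAPABILITY_CLASS = 1      # inferred from Cap_1 = 0x0000013c

cap_1 = (CAPABILITY_CLASS << 8) | PROTOCOL_VERSION
corrected = struct.pack("!iii", cap_1, 0, 0)
misread = struct.pack("!iii", 1, 0x3C000000, 0)

print(corrected.hex())  # 0000013c0000000000000000
print(misread.hex())    # 000000013c00000000000000
```

Both are 12 bytes long and both were accepted by the dev-image server, which is why only a byte-for-byte comparison against the JDBC capture exposes the difference.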

After fix:
- 6 PDU-match tests assert byte-for-byte equality at offsets 2..280
  (the structural prefix: SLheader sans length, all login markers,
  capability ints, username, password, protocol IDs, env vars).
- Bytes 280+ legitimately differ per process (PID, TID, hostname,
  cwd, AppName) — those are NOT asserted.
- Length field (offsets 0..1) also legitimately differs because our
  PDU has a shorter env list and AppName.
- Test uses monkey-patched IfxSocket so no network is needed.

Polish item #2: Makefile per global CLAUDE.md convention. Targets:
install, lint, format, test, test-integration, test-all, test-pdu,
ifx-up/down/logs/shell/status, capture (re-run JDBC scenarios under
socat), clean. `make` (no target) prints help.

Doc updates:
- PROTOCOL_NOTES.md §12: corrected capability section with the
  actual values and an explanation of the methodology lesson
- DECISION_LOG.md: new entry recording the correction with a
  pointer to the regression test and the takeaway

Side artifacts:
- docs/CAPTURES/03-py-connect-only.socat.log
- docs/CAPTURES/04-py-no-database.socat.log
- docs/CAPTURES/05-py-fixed-caps.socat.log

Test counts: 40 unit + 6 integration = 46 total, all green, ruff clean.
2026-05-02 20:18:03 -06:00
9b1fd8af2c Phase 1: pure-Python SQLI login works end-to-end
This commit takes informix-db from documentation-only (Phase 0 spike)
to a functional connect() / close() against a real Informix server.
To our knowledge, this is the first pure-socket Informix client in any
language — no CSDK, no JVM, no native libraries.

Layered architecture per the plan, mirroring PyMySQL's shape:

  src/informix_db/
    __init__.py        — PEP 249 surface (connect, exceptions, paramstyle="numeric")
    exceptions.py      — full PEP 249 hierarchy declared up front
    _socket.py         — raw socket I/O (read_exact, write_all, timeouts)
    _protocol.py       — IfxStreamReader / IfxStreamWriter framing primitives
                         (big-endian, 16-bit-aligned variable payloads,
                         length-prefixed nul-terminated strings)
    _messages.py       — SQ_* tags from IfxMessageTypes + ASF/login markers
    _auth.py           — pluggable auth handlers; plain-password is the
                         only Phase-1 implementation
    connections.py     — Connection class: builds the binary login PDU
                         (SLheader + PFheader byte-for-byte per
                         PROTOCOL_NOTES.md §3), sends it, parses the
                         server response, wires up close()

Phase 1 design decisions locked in DECISION_LOG.md:
  - paramstyle = "numeric" (matches Informix ESQL/C convention)
  - Python >= 3.10
  - autocommit defaults to off (PEP 249 implicit)
  - License: MIT
  - Distribution name: informix-db (verified PyPI-available)

Test coverage: 34 unit tests (codec round-trips against synthetic byte
streams; observed login-PDU values from the spike captures asserted as
exact byte literals) + 6 integration tests (connect, idempotent close,
context manager, bad-password → OperationalError, bad-host →
OperationalError, cursor() raises NotImplementedError).

  pytest                 — runs 34 unit tests, no Docker needed
  pytest -m integration  — runs 6 integration tests against the
                           Developer Edition container (pinned by digest
                           in tests/docker-compose.yml)
  pytest -m ""           — runs everything

ruff is clean across src/ and tests/.

One bug found during smoke testing: threading.get_ident() can exceed
signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed
the same way the JDBC reference does — clamp to signed 32-bit, fall
back to 0 if out of range. The field is diagnostic only.
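
A minimal sketch of that guard, with a hypothetical helper name. It follows the "fall back to 0 if out of range" behaviour described above.

```python
import struct
import threading

_INT32_MIN, _INT32_MAX = -(2 ** 31), 2 ** 31 - 1


def diagnostic_thread_id(ident=None) -> int:
    """Return a thread id that is safe to pack as a signed 32-bit int."""
    if ident is None:
        ident = threading.get_ident()
    if not _INT32_MIN <= ident <= _INT32_MAX:
        return 0  # the field is diagnostic only, so 0 is an acceptable fallback
    return ident


# A 40-bit ident would make struct.pack("!i", ...) raise struct.error.
print(struct.pack("!i", diagnostic_thread_id(2 ** 40)))  # b'\x00\x00\x00\x00'
```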

One protocol-level observation that AMENDED the JDBC source reading:
the "capability section" in the login PDU is three independently
negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one
int + 8 reserved zero bytes as my CFR decompile read suggested. The
server echoes them back identically. Trust the wire over the
decompiler.

Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
  - Login byte layout: confirmed (server accepts our pure-Python PDU)
  - Disconnection: confirmed (SQ_EXIT round-trip works)
  - Framing primitives: confirmed (34 unit tests)
  - Error path: bad password → OperationalError, bad host → OperationalError

Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard
unknowns there — exact column-descriptor layout, statement-time error
format — were called out as bounded gaps in Phase 0 and have existing
captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize
against.
2026-05-02 19:10:24 -06:00