# Decision Log Running rationale for protocol, auth, type, and architecture decisions made during the project. New decisions append; old ones are *amended* (with date) rather than overwritten. Format: every decision has a date, a status (`active` / `superseded` / `revisited`), the chosen path, the discarded alternatives, and the *why*. --- ## 2026-05-02 — Project goal & off-ramp **Status**: active **Decision**: Build a pure-Python implementation of the SQLI wire protocol. No IBM Client SDK. No JVM. No native libraries. **Off-ramp** (chosen by user during planning): if Phase 0 reveals the protocol is intractable in pure Python — e.g., mandatory undocumented crypto in the handshake — narrow scope (lock to one server version, drop async, drop prepared statements if needed) and stay pure-Python. Do **not** fall back to JPype/JDBC; that defeats the project's purpose. **Why**: The "no SDK / no JVM" goal is what makes this driver valuable. A JPype fallback would ship something that works but solves nothing the existing JDBC-via-JPype solution doesn't already solve. --- ## 2026-05-02 — Package name **Status**: active **Decision**: `informix-db` **Discarded**: `informixdb-pure` (longer), `ifxsqli` (less discoverable), `pyifx` (obscure) **PyPI availability**: confirmed available 2026-05-02 (HTTP 404 on `/pypi/informix-db/json`). The legacy `informixdb` is taken (HTTP 200), `informix` is also free (404) but too generic. **Why**: Discoverability balanced with brevity. Anyone searching PyPI for "informix" finds it; the hyphen distinguishes it from the legacy C-extension wrapper. --- ## 2026-05-02 — License **Status**: active **Decision**: MIT **Discarded**: Apache-2.0 (more defensive but less common in Python ecosystem), BSD-3-Clause **Why**: Simplest, most permissive, ecosystem-standard for Python libraries. --- ## 2026-05-02 — Sync first; async deferred **Status**: active **Decision**: Build a sync, blocking-socket implementation. 
Async lands in Phase 6+ as a separate `informix_db.aio` subpackage following asyncpg's I/O-agnostic-protocol pattern. **Why**: Wire protocols are hard enough; debugging protocol bugs through asyncio plumbing is two layers of indirection too many. Sync-first means we can test against blocking sockets, prove correctness, then mechanically swap the I/O layer. --- ## 2026-05-02 — Test target **Status**: active **Decision**: `icr.io/informix/informix-developer-database` (the Developer Edition image, now maintained by HCL Software since the 2017 IBM→HCL transfer of Informix), port 9088 (native SQLI). **Pinned digest** (captured 2026-05-02 from `docker pull`): `sha256:8202d69ba5674df4b13140d5121dd11b7b26b28dc60119b7e8f87e533e538ba1` **On-disk footprint**: 2.23 GB unpacked / 665 MB compressed. **Default credentials** (from container startup logs, accept-license run): - OS/DB user: `informix` - Password: `in4mix` - HQ admin password: `Passw0rd` (don't need this) - DBA user/password: empty - DBSERVERNAME: defaults to `informix` (same as the user) - TLS_CONNECTIONS: OFF (plain auth on port 9088) - Always-present databases: `sysmaster`, `sysuser` (built during init) **Container startup**: `docker run -d --name ifx --privileged -p 9088:9088 -e LICENSE=accept -e SIZE=small icr.io/informix/informix-developer-database@sha256:8202d69b...` **Why**: Free, official, no license click-through, supports plain-password auth out of the box. The digest is locked from Phase 0 onward — `:latest` is the canonical source of flaky integration suites in DB-driver projects, so all `docker-compose.yml` files reference the digest, never the tag. --- ## 2026-05-02 — Phase 0 is a gate, not a step **Status**: active **Decision**: No library code is written until `PROTOCOL_NOTES.md` meets all four exit criteria: 1. Login byte layout documented end-to-end 2. Message-type tags identified for login/execute/row/end-of-result/error/disconnect 3. `SELECT 1` round-trip fully labeled 4. 
JDBC source and packet capture corroborate on login + execute paths If exit criteria can't be met within bounded effort, invoke the off-ramp. **Why**: Most greenfield projects fail by writing code before they understand the problem. This project has an undocumented wire protocol as its central unknown. Gating on Phase 0 means a failed spike still produces a publicly valuable artifact (`PROTOCOL_NOTES.md`) instead of a half-built driver. --- ## 2026-05-02 — Phase 1 architecture decisions (locked at start of Phase 1) > These are pre-decided so paramstyle/Python-floor/autocommit don't churn later. Recorded here so Phase 1 doesn't relitigate them. - **`paramstyle = "numeric"`** (`:1`, `:2`, …). Matches Informix ESQL/C convention. - **Python ≥ 3.10**. Gives us `match`, modern type hints, `tomllib`. - **`autocommit` defaults to off**. PEP 249 implicit semantics; opt-in via `connect(autocommit=True)`. - **Author**: Ryan Malloy `` (per global pyproject.toml convention). - **Versioning**: CalVer `YYYY.MM.DD` (`2026.05.02` initial); same-day fixes use PEP 440 post-release `2026.05.02.1`, `.2`, etc. --- ## 2026-05-02 — DATE pulled forward to MVP **Status**: active **Decision**: DATE is included in the Phase 2 MVP type set, alongside SMALLINT/INTEGER/BIGINT/FLOAT/CHAR/VARCHAR/BOOLEAN. **Discarded**: leaving DATE in the "medium" / Phase 6 bucket. **Why**: Almost no real Informix database is DATE-free. The encoding is trivial once the type code is known (4-byte day count from the Informix epoch 1899-12-31). Cheap to include; expensive to leave out. DATETIME / INTERVAL / DECIMAL / NUMERIC / MONEY remain in Phase 6+ — their encodings (qualifier-byte precision, BCD-style packed decimal) are non-trivial. --- ## 2026-05-02 — `CLAUDE.md` excluded from git and sdist **Status**: active **Decision**: `.gitignore` excludes `CLAUDE.md`. Once `pyproject.toml` exists, `[tool.hatch.build.targets.sdist].exclude` will also list `CLAUDE.md`. 
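The sdist exclusion above is a one-line build-config fragment; a sketch of the eventual `pyproject.toml` stanza, assuming hatchling as the build backend (the log has not pinned one):

```toml
[tool.hatch.build.targets.sdist]
exclude = ["CLAUDE.md"]
```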
**Why**: `CLAUDE.md` contains the user's email and operator-private context. Per global convention, only commit `CLAUDE.md` to private repos. This project is destined for PyPI / public Git. --- ## 2026-05-02 — JDBC reference: `ifxjdbc.jar` 4.50.JC10 **Status**: active **Decision**: Use the user-provided `ifxjdbc.jar` from `/home/rpm/bingham/rtmt/lib/` as the JDBC reference, working copy at `build/ifxjdbc.jar`. **JAR identity**: `Implementation-Version: 4.50.10-SNAPSHOT`, build 146, dated 2023-03-07. Printable version string: `4.50.JC10`. SHA256 `dc5622cb4e95678d15836b684b6ef1783d37bc0cdd2725208577fc300df4e5f1`. **Discarded**: Maven Central `com.ibm.informix:jdbc:4.50.4.1` (not downloaded — the local copy is newer). **Why**: A newer reference is strictly better — the wire protocol is backwards-compatible, so anything `4.50.JC10` knows how to send/receive will be accepted by older servers. Avoids the Maven download. --- ## 2026-05-02 — Decompiler: CFR 0.152 **Status**: active **Decision**: Use CFR 0.152 (https://github.com/leibnitz27/cfr) as the JDBC decompiler. Cached at `build/tools/cfr.jar`. **Discarded**: Procyon, Fernflower, Ghidra (Ghidra MCP port pool was exhausted; CFR alone proved sufficient). **Why**: CFR produces the most readable Java for modern bytecode, ships as a single fat JAR, has no install step. Decompiles 478 .java files in seconds. --- ## 2026-05-02 — Confirmed: CSM is dead in modern Informix **Status**: active **Decision**: Do NOT plan for CSM (Communications Support Module) support. Ever. **Evidence**: `com.informix.asf.Connection.getOptProperties()` (decompiled) literally throws: `"CSM Encryption is no longer supported"` if `SECURITY` or `CSM` opt-prop is set. **Why**: This used to be the supplied-encryption-plugin layer. IBM removed it; modern Informix uses TLS/SSL exclusively. Removes CSM from every phase plan. 
--- ## 2026-05-02 — Wire framing primitives confirmed (from JDBC) **Status**: active (pending PCAP corroboration) **Decision**: Adopt these wire-framing primitives in `_protocol.py` from day one: - All multi-byte integers are **big-endian** (network byte order) - SmallInt = 2 bytes, Int = 4 bytes, BigInt = 8 bytes, Real = 4 bytes IEEE 754, Double = 8 bytes IEEE 754 - Variable-length payloads (string, decimal, datetime, interval, BLOB): `[short length][bytes][optional 0x00 pad if length is odd]` — **the 16-bit alignment requirement is mandatory; missing it desynchronizes the parser** - Strings emitted as `[short len+1][bytes][0x00 nul terminator]` (the +1 is the trailing nul) - Post-login messages have NO header: each is `[short messageType][payload]` and the next message begins immediately after the previous one's payload ends - Login PDU has its own SLheader (6 bytes) + PFheader structure **Source**: `com.informix.lang.JavaToIfxType` (encoders), `com.informix.asf.IfxDataInputStream`/`IfxDataOutputStream` (framing), `com.informix.asf.Connection` (login PDU). Documented byte-by-byte in `PROTOCOL_NOTES.md`. --- ## 2026-05-02 — Plain-password auth: no challenge-response round trip **Status**: active **Decision**: For MVP, treat plain-password auth as a single round trip: client sends one binary login PDU containing the password inline; server replies with one PDU containing version + capabilities or an error block. **Why**: `Connection.encodeAscBinary()` writes the password as a length-prefixed string within the login PDU body. There is no separate auth phase, no salt, no hashing, no `SQ_CHALLENGE`/`SQ_RESPONSE` exchange. Those constants (129/130) are reserved for PAM and other interactive auth methods, used AFTER the binary login PDU when the server initiates them. 
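The framing primitives above reduce to a handful of `struct` helpers. A minimal sketch (helper names are hypothetical; the real `_protocol.py` may differ, and whether the nul-terminated string form also even-pads is left to the captures):

```python
import struct

def write_short(v: int) -> bytes:
    return struct.pack(">h", v)          # 2 bytes, big-endian

def write_int(v: int) -> bytes:
    return struct.pack(">i", v)          # 4 bytes, big-endian

def write_padded(data: bytes) -> bytes:
    # variable-length payload: [short length][bytes][0x00 pad if length is odd]
    return struct.pack(">h", len(data)) + data + (b"\x00" if len(data) % 2 else b"")

def write_string(s: str) -> bytes:
    # [short len+1][bytes][0x00 nul terminator]; the +1 counts the trailing nul
    raw = s.encode("ascii") + b"\x00"
    return struct.pack(">h", len(raw)) + raw
```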
--- ## 2026-05-02 — Capability ints: corrected after PDU diff caught misread **Status**: active (corrects an earlier same-day entry) **Decision**: Send `Cap_1 = 0x0000013c, Cap_2 = 0, Cap_3 = 0` in the binary login PDU. These are the values IBM's JDBC driver sends; the server echoes them back identically. **Why this is a correction**: An earlier read of the wire bytes (before we wrote the byte-for-byte PDU diff) decoded the capability section as `Cap_1=1, Cap_2=0x3c000000, Cap_3=0`. That was a misalignment — the `0x3c` byte interpreted as `Cap_2`'s high byte was actually `Cap_1`'s low byte. Real layout: a single int `0x0000013c` = `(capability_class << 8) | PF_PROT_SQLI_0600 (60 = 0x3c)`. **How we caught it**: `tests/test_pdu_match.py` — captures our generated PDU via a monkey-patched socket and asserts byte-for-byte equality against `docs/CAPTURES/01-connect-only.socat.log` for offsets 2..280 (the structural prefix). The connection still worked with the wrong values because the dev image is permissive, but the PDU was structurally non-identical. **Server-accepts ≠ structurally-correct.** **Methodology takeaway**: For wire-protocol implementations, always diff against the reference vendor's PDU bytes, not just "it connected." Permissive servers mask real bugs. --- ## 2026-05-04 — VARCHAR row decoding: three byte-level discoveries **Status**: active **Decision**: ``parse_tuple_payload`` now handles VARCHAR/NCHAR/NVCHAR with a single-byte length prefix; SQ_TUPLE payloads are padded to even byte alignment; the trailing reserved field in CURNAME+NFETCH is a SHORT not an INT. **Why this is three findings**: each one was caught by a different debugging technique: 1. **CURNAME+NFETCH PDU off by 2 bytes**: my reserved trailing field was `write_int(0)` (4 bytes); JDBC's reference is `write_short(0)` (2 bytes). Caught by capturing both PDUs under socat and byte-diffing — our 44-byte vs JDBC's 42-byte. 
The server happened to accept the longer version for INT-only SELECTs (silently treating the extra zeros as padding) but rejected it for VARCHAR queries. Lesson: **server tolerance varies by query type — always match JDBC byte-for-byte**. 2. **SQ_TUPLE payload pads to even alignment**: when `size` is odd, an extra 0x00 byte follows the payload before the next tag. Found in `docs/CAPTURES/15-py-varchar-fixed.socat.log` — an 11-byte "syscolumns" VARCHAR payload had a trailing `0x00` that JDBC's `IfxRowColumn.readTuple` consumes silently. We weren't doing this, so the parser desynced for any odd-length variable-width row. **Even-byte alignment is a wire-protocol-wide invariant — every variable-length payload pads.** 3. **VARCHAR in tuple uses 1-byte length prefix, NOT 2**: per the on-wire encoding (verified empirically in capture 15), VARCHAR values in row data are `[byte length][bytes]` — single-byte prefix, max 255 chars. NCHAR and NVCHAR follow the same pattern. (CHAR is fixed-width per encoded_length, no length prefix at all.) LVARCHAR uses a 4-byte int prefix for values >255 bytes. **How to apply**: when adding new variable-width type decoders, capture a tuple under socat first to see the exact framing — don't infer from the column descriptor's `encoded_length`, which is the MAX storage, not the wire format. The wire format may differ by orders of magnitude (1-byte prefix vs encoded_length=128 for VARCHAR). --- ## 2026-05-04 — DML / DDL execution path: SQ_PREPARE + SQ_EXECUTE + SQ_RELEASE **Status**: active **Decision**: For statements that don't return rows (CREATE, INSERT, UPDATE, DELETE, DROP), Cursor.execute branches on ``nfields == 0`` in the DESCRIBE response. SELECT path is the cursor lifecycle (CURNAME+NFETCH+...); DDL/DML path is just SQ_EXECUTE then SQ_RELEASE. **Why**: JDBC uses SQ_PREPARE for everything; for non-SELECT it just doesn't open a cursor. 
Per IfxSqli.sendExecute (line 1075): non-prepared-statement execute is a bare ``[short SQ_ID=4][int SQ_EXECUTE=7][short SQ_EOT]`` (8 bytes).

---

## 2026-05-04 — SQ_INSERTDONE (=94) is execution metadata, NOT execution

**Status**: active

**Decision**: SQ_INSERTDONE arrives in BOTH the DESCRIBE response (PREPARE phase) AND the EXECUTE response for literal-value INSERTs. It carries the auto-generated serial values that WILL be / WERE inserted. Don't interpret SQ_INSERTDONE in the DESCRIBE response as "row was inserted" — it's just metadata. Always send SQ_EXECUTE.

**Why this was a debugging trap**: when I first saw SQ_INSERTDONE in the PREPARE response for ``INSERT INTO t1 VALUES (1, 'hello')``, I assumed Informix optimizes literal INSERTs by executing during PREPARE and added a "skip SQ_EXECUTE" branch. Result: SELECT returned 0 rows. The data wasn't actually inserted; the SQ_INSERTDONE in PREPARE was just "here are the serials that WILL be assigned when you execute". After reverting to "always send SQ_EXECUTE", the row persists. Lesson: a response that looks like an optimization may not be one — always verify with a follow-up SELECT.

---

## 2026-05-04 — SQ_INSERTDONE wire format

**Status**: active

**Decision**: Per IfxSqli.receiveInsertDone (line 2347), the SQ_INSERTDONE payload is 18 bytes for modern (bigint-supported) servers:

- 10 bytes: serial8 inserted (Informix's variable-numeric LONGINT encoding)
- 8 bytes: bigserial inserted (regular 64-bit long, big-endian)

For now we read-and-discard. Phase 5+ will surface these as ``Cursor.lastrowid`` / similar.

---

## 2026-05-04 — Transactions: commit/rollback are bare 4-byte PDUs

**Status**: active

**Decision**: ``Connection.commit()`` sends ``[short SQ_CMMTWORK=19][short SQ_EOT=12]`` (4 bytes). ``Connection.rollback()`` sends ``[short SQ_RBWORK=20][short SQ_EOT=12]``. Server responds with SQ_DONE+SQ_EOT (in logged databases) or SQ_ERR sqlcode=-255 ("Not in transaction") in unlogged databases like sysmaster.
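The commit/rollback PDUs are small enough to write out completely; a sketch using the log's `SQ_*` constant names:

```python
import struct

SQ_CMMTWORK, SQ_RBWORK, SQ_EOT = 19, 20, 12

def commit_pdu() -> bytes:
    # [short SQ_CMMTWORK=19][short SQ_EOT=12] → 4 bytes on the wire
    return struct.pack(">hh", SQ_CMMTWORK, SQ_EOT)

def rollback_pdu() -> bytes:
    # [short SQ_RBWORK=20][short SQ_EOT=12]
    return struct.pack(">hh", SQ_RBWORK, SQ_EOT)
```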
**How to apply**: integration tests for transactions need a LOGGED database. The Informix Developer Edition image ships with ``stores_demo`` (logged) — point integration tests at that for commit/rollback verification. --- ## 2026-05-04 — Parameter binding: SQ_BIND chained with SQ_EXECUTE in one PDU **Status**: active **Decision**: ``Cursor.execute(sql, params)`` for DML sends one PDU containing SQ_BIND with all parameter values, immediately followed by SQ_EXECUTE. No separate CIDESCRIBE round trip — the server infers parameter types from the type tags we send in SQ_BIND. **Why this matters**: skipping the CIDESCRIBE/IDESCRIBE handshake (which JDBC does for type-discovery) saves one round trip per execute. The server accepts our SQ_BIND directly because we provide explicit type codes for each parameter. PDU structure (verified against ``docs/CAPTURES/02-dml-cycle.socat.log`` msg[29]): ``` [short SQ_ID=4][int SQ_BIND=5][short numparams] for each param: [short type][short indicator=0 or -1][short prec_or_encLen] writePadded(rawbytes) # data + 0x00 pad if odd-length [short SQ_EXECUTE=7] [short SQ_EOT] ``` Per-type encoding (Phase 4 MVP): | Python type | IDS type code | Precision short | Data | |-------------|---------------|-----------------|------| | ``int`` (32-bit) | 2 (INT) | ``0x0a00`` (=2560 packed display-width=10/scale=0) | 4 bytes BE | | ``int`` (64-bit) | 52 (BIGINT) | ``0x1300`` (=4864 packed width=19/scale=0) | 8 bytes BE | | ``str`` | 0 (CHAR — server casts) | 0 | ``[short len][bytes]`` (writePadded adds even pad) | | ``float`` | 3 (FLOAT/DOUBLE) | 0 | 8 bytes IEEE 754 | | ``bool`` | 45 (BOOL) | 0 | 1 byte (0x01 or 0x00) | | ``None`` | 0 | indicator=-1 | (no data) | **Surprise**: JDBC sends Python-string equivalents as **CHAR (type=0)**, not VARCHAR (type=13). The server handles conversion to the actual column type via internal CIDESCRIBE/IDESCRIBE inference. We do the same — string parameters always go out as CHAR. 
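The PDU structure above, assembled in Python. A sketch only: the builder name stands in for the project's ``_build_bind_execute_pdu``, and callers are assumed to pass pre-encoded `(type_code, indicator, precision, raw_bytes)` tuples:

```python
import struct

SQ_ID, SQ_BIND, SQ_EXECUTE, SQ_EOT = 4, 5, 7, 12

def build_bind_execute_pdu(params) -> bytes:
    # params: list of (type_code, indicator, precision, raw_bytes) tuples
    out = struct.pack(">hih", SQ_ID, SQ_BIND, len(params))
    for type_code, indicator, precision, raw in params:
        out += struct.pack(">hhh", type_code, indicator, precision)
        out += raw + (b"\x00" if len(raw) % 2 else b"")  # writePadded: even-align
    return out + struct.pack(">hh", SQ_EXECUTE, SQ_EOT)

# One INTEGER param bound as value 7 (type=2, indicator=0, precision=0x0a00):
pdu = build_bind_execute_pdu([(2, 0, 0x0A00, (7).to_bytes(4, "big"))])
```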
**Surprise**: integer precision is **packed** as ``(display_width << 8) | scale``. For INTEGER, that's ``(10 << 8) | 0 = 0x0a00 = 2560``. Initially looked like a bug (why would precision be 2560?) until I realized it's a packed field. Captured in cursor's ``_build_bind_execute_pdu`` and converters' ``_encode_int``.

**Paramstyle**: we declare ``paramstyle = "numeric"`` (PEP 249), supporting ``:1``, ``:2`` placeholders. Internally we rewrite to ``?`` (Informix's native style) before sending PREPARE. The rewrite is a trivial regex that doesn't recognize string literals or comments, so a ``:N``-shaped token inside either would be rewritten too — Phase 5 can add a proper SQL tokenizer for that edge case.

---

## 2026-05-04 — SELECT vs DML branching: keyword-based, not nfields-based

**Status**: active

**Decision**: ``Cursor.execute`` branches on the first word of the SQL (``SELECT`` → cursor-fetch path; everything else → execute-and-release path). Don't use ``nfields > 0`` from the DESCRIBE response.

**Why**: a parameterized INSERT (``INSERT INTO t VALUES (?, ?, ?)``) returns a DESCRIBE response with ``nfields > 0`` because the server describes the row that WILL be inserted. The ``nfields == 0`` heuristic that worked for non-parameterized DML breaks here. JDBC does the same via its ``IfxStatement`` / ``IfxPreparedStatement`` subclassing.

---

## 2026-05-04 — Parameterized SELECT works with bind-then-cursor-open

**Status**: active

**Decision**: For parameterized SELECT, send SQ_BIND alone (without SQ_EXECUTE chained) right after PREPARE, then proceed with the regular cursor open + fetch lifecycle (CURNAME+NFETCH+...). The cursor open is what triggers query execution; SQ_BIND just binds the values into the prepared-statement scope.

**Why**: simpler than I expected — server accepts SQ_BIND followed by cursor open in separate PDUs. No need for the IDESCRIBE handshake JDBC does for type discovery.

PDU sequence:
```
1. PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT
2. SQ_BIND (no EXECUTE) → EOT
3. CURNAME+NFETCH → TUPLE*+DONE+COST+EOT
4.
NFETCH (drain) → DONE+COST+EOT 5. CLOSE → EOT 6. RELEASE → EOT ``` Tested with single int param, multiple int params, string param, mixed `:N` style with LIKE patterns. All work correctly. --- ## 2026-05-04 — NULL row encoding: per-type sentinel values **Status**: active **Decision**: Each IDS type uses a specific NULL sentinel in tuple data; decoders detect and return Python ``None``. Sentinels (verified by capture analysis in ``docs/CAPTURES/19-py-null-vs-onechar.socat.log`` and ``20-py-int-null.socat.log``): | IDS type | NULL sentinel | Distinguishable from valid value? | |----------|---------------|------------------------------------| | SMALLINT | ``0x8000`` (= SHORT_MIN) | Yes — SHORT_MIN can't be a regular value | | INTEGER | ``0x80000000`` (= INT_MIN) | Yes | | BIGINT | ``0x8000000000000000`` (= LONG_MIN) | Yes | | REAL | ``ff ff ff ff`` (NaN bit pattern) | Yes (via bytes match, not value match — NaN != NaN) | | FLOAT/DOUBLE | ``ff ff ff ff ff ff ff ff`` | Yes | | VARCHAR | ``[byte 1][byte 0]`` (length=1, content=single nul) | Yes — VARCHAR can't contain embedded nuls; the byte-0 within length-1 is the unambiguous null marker | | DATE | ``0x80000000`` (same as INT) | Yes | | BOOL | (TBD — Phase 5+) | — | **The VARCHAR null marker is unusual**: ``[byte 1][byte 0]`` looks like "1-byte string containing 0x00" but Informix's VARCHAR can't have embedded nuls anyway, so it's an unambiguous out-of-band signal. Empty string is encoded as ``[byte 0]`` (length=0, no content) — distinct from NULL. --- ## 2026-05-04 — executemany: PREPARE once, BIND+EXECUTE per row, RELEASE once **Status**: active **Decision**: ``Cursor.executemany(sql, seq_of_params)`` does PREPARE once, then loops sending SQ_BIND+SQ_EXECUTE per parameter set, then RELEASE once. **Performance**: only ~1.06x faster than a loop of ``execute()`` for 200 INSERTs (336ms vs 319ms in our benchmark). Each BIND+EXECUTE round trip dominates; we save only PREPARE+RELEASE per call. 
**Phase 4.x optimization opportunity**: chain multiple BIND+EXECUTE calls in one PDU (no intermediate flush + read) for true batch performance — would likely give 5-10x speedup. JDBC's "isBatchUpdatePerSpec" path does this; not yet ported. For now, executemany still gives PEP 249 conformance and slight perf improvement; bulk-insert optimization is a future improvement. --- ## 2026-05-04 — DECIMAL/MONEY decoding: base-100 BCD with asymmetric complement **Status**: active (decoder); encoder is Phase 6.x **Decision**: ``_decode_decimal`` handles IDS DECIMAL/MONEY wire bytes per ``com.informix.lang.Decimal.init`` (line 374) format: ``` byte[0] = (sign << 7) | biased_exponent_base100 - bit 7 = sign (1=positive, 0=negative) - bits 0-6 = (exponent + 64) for positive - bits 0-6 = (exponent + 64) ^ 0x7F for negative ← XOR'd byte[1..] = digit-pair bytes (each 0..99 = two BCD digits) - for negative: asymmetric base-100 complement applied ``` Asymmetric base-100 complement (per ``Decimal.decComplement`` line 447): - Walk digits RIGHT to LEFT - Trailing zeros stay zero - First non-zero digit: subtract from 100 - Subsequent digits: subtract from 99 This was the trickiest decode of the project so far — initial naive ``99 - d`` for all digits gave artifacts like ``-1234.55999`` instead of ``-1234.56``. The trailing-zeros and "first non-zero from 100" rules are what make the round trip exact. NULL marker: byte[0] == 0 AND byte[1] == 0. **Width on the wire**: per-column ``encoded_length`` field is packed as ``(precision << 8) | scale``. Byte width = ``ceil(precision/2) + 1``. The row decoder uses this to slice DECIMAL columns out of the tuple payload (``parse_tuple_payload`` in ``_resultset.py``). **Encoder (``_encode_decimal``)**: implemented but disabled — server rejects the bytes (precision packing wrong somewhere). Workaround for Phase 6.x users: cast Decimal to float at the call site or pass via SQL literal. 
Decode side is fully working — handles COUNT, SUM, AVG, literal DECIMAL values, negatives, fractions, NULLs. --- ## 2026-05-04 — Better error messages with PEP 249 exception classification **Status**: active **Decision**: ``_raise_sq_err`` decodes the full SQ_ERR payload (sqlcode, isamcode, offset, near-token) and raises the appropriate PEP 249 exception class with a human-readable message and structured fields (``e.sqlcode``, ``e.isamcode``, ``e.offset``, ``e.near``). PEP 249 classification by sqlcode: - IntegrityError: -239, -268, -291, -292, -391, -703 (constraint violations) - ProgrammingError: -201, -206, -217, -286, -310, ... (syntax/object/permission) - OperationalError: -255, -256, -407, -440, -908, ... (transaction/connection) - NotSupportedError: -329, -349, -510 (caller-can't-fix) - DatabaseError: everything else (safe fallback) Built-in error catalog of ~50 most common Informix sqlcodes in ``src/informix_db/_errcodes.py``. Users extend at runtime via ``register_error_text(code, text)``. **Connection survives errors**: a failed query doesn't poison the session — subsequent ``execute()`` calls work normally. Verified by ``test_connection_survives_query_error``. --- ## 2026-05-04 — DATETIME decoding: BCD-packed with qualifier-driven field walk **Status**: active **Decision**: ``_decode_datetime(raw, encoded_length)`` walks BCD digit pairs into Python ``datetime`` objects. Returns ``datetime.date`` for date-only qualifiers, ``datetime.time`` for time-only, ``datetime.datetime`` for combined. Wire format: - byte[0] = sign + biased exponent (in base-100 digit pairs before decimal) - byte[1..] 
= BCD digit pairs (year takes 2 bytes = 4 digits; everything else 1 byte = 2 digits) The qualifier is packed in the column descriptor's ``encoded_length``: - high byte = digit_count (total base-10 digits) - middle nibble = start_TU (time-unit code: YEAR=0, MONTH=2, DAY=4, HOUR=6, MIN=8, SEC=10, FRAC1=11..FRAC5=15) - low nibble = end_TU Byte width on the wire = ``ceil(digit_count / 2) + 1``. Verified against 4 simultaneous DATETIME columns in one tuple: - YEAR TO SECOND → datetime.datetime(2026, 5, 4, 12, 34, 56) - YEAR TO DAY → datetime.date(2026, 5, 4) - HOUR TO SECOND → datetime.time(12, 34, 56) - YEAR TO FRACTION(3) → datetime.datetime(...) DATETIME parameter binding (encoder) is Phase 6.x — same status as DECIMAL encoder. --- ## 2026-05-04 — DATE / DATETIME / DECIMAL parameter encoding **Status**: active **Decision**: ``encode_param`` dispatches on ``isinstance(value, datetime.datetime / datetime.date / decimal.Decimal)`` to type-specific encoders. Round-trip verified through INSERT + SELECT. **The 2-byte length-prefix discovery (the unblocker)**: my Phase 6.a DECIMAL encoder and Phase 6.c DATETIME encoder both produced "correct" BCD bytes but the server silently dropped the SQ_BIND PDU. Captured the wire and compared to JDBC — DECIMAL/DATETIME bind data has a **2-byte length prefix** at the start (per ``Decimal.javaToIfx`` line 457) that wraps the BCD payload. With the prefix added (``raw = len(inner).to_bytes(2, "big") + inner``), both encoders work. DATE doesn't need the prefix — it's a fixed 4-byte int. 
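Of the three encoders, DATE is simple enough to sketch completely (epoch per the MVP decision above; helper names are hypothetical):

```python
import datetime

_IFX_EPOCH = datetime.date(1899, 12, 31)  # day 0 of the Informix DATE encoding

def encode_date(d: datetime.date) -> bytes:
    # DATE binds as a bare 4-byte big-endian day count, with no 2-byte length prefix
    return (d - _IFX_EPOCH).days.to_bytes(4, "big", signed=True)

def decode_date(raw: bytes) -> datetime.date:
    return _IFX_EPOCH + datetime.timedelta(days=int.from_bytes(raw, "big", signed=True))
```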
Per-type encoded format: | Python | IDS type | Wire bytes | |--------|----------|------------| | ``datetime.date`` | DATE (7) | ``[int days_since_1899-12-31]`` (4 bytes BE) | | ``datetime.datetime`` | DATETIME (10) | ``[short total_len][byte 0xc7][7 BCD pairs]`` (10 bytes total for YEAR TO SECOND) | | ``decimal.Decimal`` | DECIMAL (5) | ``[short total_len][byte exp][BCD digit pairs]`` (variable) | For DATETIME, encoder always emits YEAR TO SECOND form (no microseconds). Phase 6.x can add YEAR TO FRACTION(N) variants if microsecond precision is needed. For DECIMAL, the encoder uses the asymmetric base-100 complement (mirror of decoder) for negatives. Tested with positive, negative, fraction values. **Lesson**: when a server silently drops a PDU, it's almost always an envelope/framing issue rather than the inner-value bytes being wrong. The 2-byte length prefix here, the SHORT-vs-INT reserved field in CURNAME+NFETCH, the even-byte alignment pad — same pattern. --- ## 2026-05-04 — INTERVAL decoding (both qualifier families) **Status**: active **Decision**: ``_decode_interval`` decodes IDS INTERVAL into one of two Python types based on the qualifier's ``start_TU``: - ``start_TU >= DAY (4)`` (IntervalDF) → ``datetime.timedelta`` - ``start_TU <= MONTH (2)`` (IntervalYM) → :class:`informix_db.IntervalYM` (a small frozen dataclass holding signed total months) **The wire format is the same as DECIMAL/DATETIME** — ``[head byte][digit pairs in base-100]`` with sign+biased-exponent header. 
The qualifier short tells you how to *interpret* those digits: - High byte = total digit count across all fields - Middle nibble = start_TU; low nibble = end_TU - First field has variable digit width: ``flen = total_len - (end_TU - start_TU)`` (which is the digits "added" past the first field; each non-first field is exactly 2 digits) - Subsequent non-first non-fractional fields are 1 byte each (since each is exactly 2 base-10 digits = 1 base-100 digit pair) - Fractional fields scale to nanoseconds via ``cv *= 10 ** scale_exp`` where ``scale_exp = 18 - end_TU`` forced odd Wire byte width on the SQ_TUPLE side = ``ceil(digit_count / 2) + 1`` (one head byte + ceil(digits/2) digit pairs). Same formula as DATETIME and DECIMAL — surfaces in ``_resultset.parse_tuple_payload`` as a dedicated branch (because the qualifier is needed at decode time). **The dec_exp arithmetic that initially fooled me**: I kept misreading ``(total_len + 10 - end_TU + 1) / 2`` as a much larger value than it is. For HOUR(2) TO SECOND, ``total_len=6, end_TU=10``, so dec_exp = 7//2 = 3, not 8. After the encoder writes dec_exp into the head byte and the decoder reads it back, the two match perfectly so the digit array lines up at offset 0 of the 16-byte working buffer — but only if you actually compute the value correctly. *Read your own arithmetic.* (The synthetic unit-test framework caught this immediately, before the integration tests even ran.) **IntervalYM design**: I considered a NamedTuple with (years, months) fields, but a frozen dataclass with a single signed ``months`` field matches JDBC's ``IntervalYM`` and avoids ambiguity around "what does negative mean for a tuple". ``years`` and ``remainder_months`` are read-only properties; ``__str__`` emits the standard "Y-MM" / "-Y-MM" form. ``slots=True`` makes it as cheap as a NamedTuple memory-wise. 
**Verified against 9 integration scenarios** (all decoder branches): DAY TO SECOND, HOUR TO SECOND, MINUTE TO SECOND, YEAR TO MONTH, YEAR-only, negative interval (9's-complement), table column, NULL, and a multi-INTERVAL row (proves per-column slicing works across mixed qualifier families). INTERVAL parameter binding (encoder) is deferred to Phase 6.e or later — same arc as DECIMAL/DATETIME, where decoding lands first and encoding follows once we have wire captures to compare against. --- ## 2026-05-04 — INTERVAL parameter encoding **Status**: active **Decision**: ``encode_param`` dispatches ``datetime.timedelta`` and :class:`IntervalYM` to dedicated encoders that produce the 2-byte-length-prefixed BCD payload (per the Phase 6.c discovery). Default qualifiers are chosen to cover any sane Python value: - ``timedelta`` → ``INTERVAL DAY(9) TO FRACTION(5)`` (covers ±999,999,999 days × 10us resolution) - ``IntervalYM`` → ``INTERVAL YEAR(9) TO MONTH`` (covers ±999,999,999 years) **Why DAY(9) and YEAR(9)?** Python's ``timedelta`` allows up to 999,999,999 days; YEAR/MONTH have no upper bound in Python (just a signed int). We could choose a smaller default, but the wire-format cost is one byte per two extra digits and the user-facing benefit is "no overflow surprises". JDBC's defaults (DAY(2) TO FRACTION(5) for IntervalDF, YEAR(4) TO MONTH for IntervalYM) trade safety for compactness — we make the opposite trade. **FRACTION(5) is the precision ceiling.** Informix doesn't expose FRAC6 even though the qualifier nibble allows it (per ``Interval.TU_F1..TU_F5``). The encoder scales nanoseconds via ``nans /= 10^(18 - end_TU)`` per JDBC, which means we lose the units digit of microseconds (10us is the smallest representable unit). This is the same limitation JDBC has — Informix fundamentally can't store sub-10us intervals in this format. 
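The practical effect of the FRACTION(5) ceiling on a bound `timedelta`, assuming the encoder floors (the integer scaling step suggests truncation, but that is an inference, not a verified wire fact):

```python
import datetime

def quantize_fraction5(td: datetime.timedelta) -> datetime.timedelta:
    # FRACTION(5) keeps 5 fractional-second digits → 10 µs granularity;
    # the units digit of microseconds cannot survive the wire
    us = td // datetime.timedelta(microseconds=1)
    return datetime.timedelta(microseconds=us - us % 10)
```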
**The synthetic round-trip caught every framing bug locally.** Once the decoder works, encoder verification becomes "decode my encoded bytes and compare to the input" — a closed loop with no server in the mix. All 6 integration tests passed on the first run against live Informix; no debugging cycle was needed. This is the dividend from owning both ends of the codec layer. **Lesson reinforced**: Phase 6.a (DECIMAL encoding) was the real cost — that's where the 2-byte-length-prefix wire-format discovery happened. Phase 6.c (DATE/DATETIME encoding) and Phase 6.e (INTERVAL encoding) each amortized that discovery with one new encoder per qualifier-bearing type. Total wall-clock time per phase is dropping geometrically. --- ## 2026-05-04 — Phase 6.f research: BYTE / TEXT / BLOB / CLOB protocol scope **Status**: research complete; implementation deferred **Decision**: Decoupling LOB types into their own phase. The four "LOB" types split into two protocol families with materially different wire-level cost: ### Protocol family A: BYTE (type=11) and TEXT (type=12) — legacy in-row-pointed blobs **Server-side requirements** (verified empirically against the IBM dev container 15.0.1.0.3DE): - A blobspace must exist (`onspaces -c -b blobspace1 -p ... -o 0 -s 50000`) - The database must be logged (`CREATE DATABASE testdb WITH LOG`) - The column declaration must place data in the blobspace: `data BYTE IN blobspace1` **Even with all that, BYTE/TEXT cannot be inserted via SQL literals.** I verified by running `dbaccess - test_byte.sql` with `INSERT INTO t VALUES (1, "0x68656c6c6f")` and getting: ``` 617: A blob data type must be supplied within this context. ``` This is a hard server-side restriction: blob data **must** arrive via the binary BBIND wire path. There is no string-literal escape hatch. **Wire protocol** (per `IfxSqli.sendBind` line 844, `sendBlob` line 3328, `sendStreamBlob` line 3482): 1. 
**SQ_BIND** (tag 5): per-param block declares the BYTE/TEXT slot but the inline data is a **56-byte blob descriptor** (per `IfxBlob.toIfx` line 162) — mostly zeros, with the size at offset [16:20] as a 4-byte big-endian int. Byte 39 is the null indicator (1 = null). 2. **SQ_BBIND** (tag 41): `[short tag=41][short blob_count]` — the count of BYTE/TEXT params being streamed. 3. **For each BYTE/TEXT param**: stream of `SQ_BLOB` (tag 39) chunks: `[short tag=39][short length][padded data]`. Chunks max out at 1024 bytes per `sendStreamBlob`. 4. **End-of-blob marker**: a final `SQ_BLOB` with `[short tag=39][short length=0]`. 5. Then SQ_EXECUTE proceeds normally. **Decoder side**: rows containing BYTE/TEXT have a 56-byte descriptor in the SQ_TUPLE payload (per `IfxRowColumn.loadColumnData` switch case for type 11/12 reading 56 bytes). Then a separate stream of SQ_BLOB tags arrives **between** SQ_TUPLE messages, carrying the actual bytes. **Estimated implementation cost**: substantial. Cursor state machine needs to: - Detect `bytes`/`str`-meant-as-TEXT params and route them through SQ_BBIND after SQ_BIND - Send the 56-byte descriptor as the inline placeholder - Stream chunks ≤1024 bytes each - On the read path, parse SQ_BLOB tags between SQ_TUPLE messages and reassemble per-column This is a multi-day effort and warrants its own phase, **Phase 7+**. ### Protocol family B: BLOB (type=102) and CLOB (type=101) — smart-LOBs with locators **Server-side requirements**: an sbspace (smart-LOB space), more complex than blobspace. (Verified: `onspaces -c -S sbspace1 ...`). **Wire protocol**: even more involved than BYTE/TEXT. Per `IfxLobInputStream` and `IfxSmartBlob`, smart-LOB access uses an LO_OPEN/LO_READ/LO_WRITE/LO_CLOSE session protocol against the sbspace, with handles called *locators* that travel inline in the SQ_TUPLE while the actual bytes go over a separate channel. JDBC's `IfxLocator` is a 56-byte descriptor (same shape as the BYTE descriptor!) 
but carries semantic meaning: storage type, sbspace ID, partition number, etc. **Estimated implementation cost**: substantial++ — significantly larger than BYTE/TEXT, because we'd need to implement the LO_* RPC sub-protocol entirely. ### Decision **Phase 6.f is closed as research-complete** with this entry as the deliverable. The findings replace assumptions (e.g., "BLOB/CLOB will be similar to INTERVAL") with actual protocol facts. Implementation is split into: - **Phase 8** (future): BYTE/TEXT bind+read with the SQ_BBIND/SQ_BLOB wire machinery - **Phase 9** (future): smart-LOB BLOB/CLOB with the LO_OPEN/LO_READ session protocol In the meantime, **users who need to insert binary data** can use the existing `LVARCHAR` path via `str` (works for binary if encoded with `iso-8859-1`) up to ~32K — which is the LVARCHAR on-wire limit. Not a substitute for true BYTE/TEXT but covers many practical cases. The constants `SQ_BBIND=41`, `SQ_BLOB=39`, `SQ_FETCHBLOB=38`, `SQ_SBBIND=52`, `SQ_FILE_READ=106`, `SQ_FILE_WRITE=107` are already declared in `_messages.py` from earlier scaffolding — the protocol layer is ready when implementation lands. **Honest scope-discovery moment**: I went into Phase 6.f assuming it'd be similar effort to INTERVAL. Reading the wire protocol revealed a different shape entirely — multi-PDU sequences require state-machine surgery, not just new codecs. Pivoting now (instead of half-implementing) is the right call. --- ## 2026-05-04 — Phase 7: real transaction semantics on logged databases **Status**: active **Decision**: The driver now manages transactions implicitly on logged databases. Three protocol facts came out of integration testing that materially shaped the implementation: ### Fact 1: SQ_BEGIN is REQUIRED before the first DML in a logged-DB transaction Informix in non-ANSI mode does NOT auto-open a server-side transaction on the first DML. 
Without an explicit ``SQ_BEGIN`` (tag 35), the server treats each statement as if it's already in some implicit txn (data is visible after the INSERT) but ``COMMIT WORK`` afterward fails with sqlcode -255 ("Not in transaction"). The "INSERT then COMMIT" sequence appears to work for visibility, but the COMMIT-as-no-op is broken in a way that violates user expectations.

**Solution**: ``Connection._ensure_transaction()`` is called by ``Cursor.execute()`` and ``Cursor.executemany()`` before sending PREPARE. It sends ``SQ_BEGIN`` if no transaction is currently open. Idempotent within an open txn. After ``commit()``/``rollback()``, ``_in_transaction`` is reset to ``False`` so the NEXT DML triggers a fresh ``SQ_BEGIN``.

For unlogged databases, ``SQ_BEGIN`` returns sqlcode -201 ("BEGIN WORK requires logged DB"). We **cache that result** on the connection (``_supports_begin_work=False``) so subsequent DML doesn't re-probe. This means the same client code works seamlessly on logged or unlogged DBs without the user having to know which they're hitting.

### Fact 2: SQ_RBWORK carries a 2-byte savepoint payload — SQ_CMMTWORK does not

Reading ``IfxSqli.sendRollback`` (line 647) revealed that ``SQ_RBWORK`` (tag 20) is followed by ``[short savepoint=0]`` BEFORE the ``SQ_EOT`` framing tag. Without that 2-byte payload, the server **silently hangs** waiting for it — no error, no timeout, just a stuck socket read. This caused a confusing 30-second test timeout on the first integration run. The fix is one line:

```python
self._sock.write_all(struct.pack("!hhh", SQ_RBWORK, 0, SQ_EOT))
```

``SQ_CMMTWORK`` (tag 19), by contrast, has no payload — it's just the tag followed by SQ_EOT.

**Lesson**: same pattern as the SHORT-vs-INT field in CURNAME+NFETCH (Phase 4.x) and the 2-byte length prefix in DECIMAL/DATETIME/INTERVAL bind data (Phase 6.c+). When the server hangs, **it's almost always an incomplete PDU body** — the server is waiting for bytes you didn't send.
Compare your bytes to JDBC's, byte-by-byte.

### Fact 3: SQ_XACTSTAT (tag 99) is a logged-DB-only message

Logged databases emit ``SQ_XACTSTAT`` (tag 99) interleaved with normal DML responses to inform the client of transaction-state events. Body: ``[short xcEvent][short xcNewLevel][short xcOldLevel]``. We don't surface these events to the user (yet) but must drain them in **every** response-reading path: ``_drain_to_eot`` (used by commit, rollback, DML), ``_read_describe_response`` (PREPARE response), ``_read_fetch_response`` (NFETCH response), and the connection-level ``_drain_to_eot`` (used by SQ_BEGIN, session init). Without handling SQ_XACTSTAT in all four paths, the cursor desynchronizes from the wire stream and the next read pulls garbage tags (which then raise "unexpected tag" errors that hide the real cause).

### Cross-connection isolation tests are config-dependent — don't bake them in

The original test plan included a cross-connection visibility test ("conn A inserts, conn B reads zero rows before commit, then sees one row after"). Informix's default isolation is **Committed Read with row-level locking**, so conn B's SELECT *blocks* on the uncommitted, still-locked row rather than returning zero. With ``LOCK MODE NOT WAIT`` (the default), this surfaces as sqlcode -252 (lock timeout) immediately. With ``LOCK MODE WAIT N``, it waits N seconds. Either behavior is correct under Informix semantics — the test would just be testing the lock manager, not transaction visibility. We removed that test and replaced it with the simpler ``test_committed_data_visible_to_fresh_connection``, which proves durability across connections without engaging the lock manager.
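The Fact 1 logic above (implicit `SQ_BEGIN`, idempotency, and the cached -201 probe) can be sketched as a small state machine. The wire layer is stubbed out as a callable so the logic is testable without a server; `OperationalError` here is a stand-in class, not the driver's real exception hierarchy:

```python
SQ_BEGIN = 35  # tag from the notes above

class OperationalError(Exception):
    def __init__(self, sqlcode: int):
        super().__init__(sqlcode)
        self.sqlcode = sqlcode

class Connection:
    def __init__(self, send_begin):
        self._send_begin = send_begin      # stands in for sending SQ_BEGIN
        self._in_transaction = False
        self._supports_begin_work = True   # optimistic until -201 is seen

    def _ensure_transaction(self) -> None:
        # Idempotent: nothing to do inside an open txn or on an unlogged DB.
        if self._in_transaction or not self._supports_begin_work:
            return
        try:
            self._send_begin()
            self._in_transaction = True
        except OperationalError as exc:
            if exc.sqlcode == -201:              # BEGIN WORK requires logged DB
                self._supports_begin_work = False  # cache: never re-probe
            else:
                raise

    def commit(self) -> None:
        # SQ_CMMTWORK would go on the wire here when a txn is open.
        self._in_transaction = False   # next DML triggers a fresh SQ_BEGIN
```

The same client code then behaves identically on logged and unlogged databases: the first DML either opens a real transaction or learns (once) that it can't.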
### Test coverage delivered

10 transaction tests in ``tests/test_transactions.py``, all passing against the auto-created ``testdb`` logged database:

- Commit visibility (single connection)
- Rollback isolation — the "Phase 3 gate" test
- Multi-row rollback
- Partial-commit-then-rollback
- Autocommit semantics (persists, rollback no-op)
- Cross-connection durability
- UPDATE+rollback, DELETE+rollback
- Implicit per-statement transaction

The ``conftest.py::_ensure_testdb`` fixture auto-creates ``testdb WITH LOG`` if missing, so the tests work on a fresh dev container provided ``blobspace1`` and ``sbspace1`` exist (created during Phase 6.f research).

### Two old tests rewritten

``test_commit_rollback_in_unlogged_db_raises`` and ``test_commit_in_unlogged_db_is_operational_error`` were written assuming ``commit()`` on an unlogged DB raised -255. The Phase 7 driver-side smarts now make those calls a silent no-op (the connection knows there's no open txn). Both tests were rewritten to assert the new (better) behavior. PEP 249 doesn't mandate any specific behavior for unsupported operations; "graceful no-op" matches what most modern drivers do.

---

## 2026-05-04 — Phase 8: BYTE / TEXT bind+read (the SQ_BBIND/SQ_BLOB protocol)

**Status**: active

**Decision**: BYTE (type 11) and TEXT (type 12) round-trip end-to-end. Python `bytes`/`bytearray` map to BYTE; `str` is auto-encoded as ISO-8859-1 for TEXT (matching the server's default codeset). NULL is signaled by byte 39 of the descriptor.

### Wire protocol — write side

A BYTE/TEXT param uses **two** PDU sections within the same SQ_BIND envelope:

1. **Inline placeholder** (per `IfxBlob.toIfx` line 162): a 56-byte blob descriptor with **only** the size at offset [16..19] as a 4-byte big-endian int. All other bytes are zero. (For NULL, byte 39 is set to 1.)
2.
**SQ_BBIND stream** (per `IfxSqli.sendBlob` line 3328): after all per-param SQ_BIND blocks, emit `[short SQ_BBIND=41][short blob_count]`, then for each blob param stream chunked SQ_BLOB messages: `[short SQ_BLOB=39][short chunk_len][padded data]` (max 1024 bytes/chunk per JDBC's `sendStreamBlob`), ending with a zero-length terminator `[short SQ_BLOB=39][short 0]`. Then SQ_EXECUTE proceeds normally. ### Wire protocol — read side The SQ_TUPLE payload returns only the 56-byte descriptor for BYTE/TEXT columns — the actual bytes live in the blobspace. The client must explicitly fetch via SQ_FETCHBLOB (per `IfxSqli.sendFetchBlob` line 3716): ``` [short SQ_ID=4][int 38=SQ_FETCHBLOB][padded 56-byte descriptor][short SQ_EOT] ``` The server replies with one or more SQ_BLOB chunks ending with a zero-length terminator. The descriptor's locator is **only valid while the cursor is open** — the dereferencing must happen between the final NFETCH and CLOSE. Doing it after CLOSE returns -602 (Cannot open blob) with ISAM -101. ### Server-side prerequisites The IBM dev container needs three things, in this order, before BYTE/TEXT works at all: 1. **A blobspace**: `onspaces -c -b blobspace1 -p /path -o 0 -s 50000` 2. **A logged database**: `CREATE DATABASE testdb WITH LOG` (BYTE/TEXT rejected in unlogged DBs with sqlcode -617) 3. **Config + level-0 archive to allow chunk page allocation**: ```bash onmode -wm LTAPEDEV=/dev/null onmode -wm TAPEDEV=/dev/null onmode -l # advance logical log ontape -s -L 0 -t /dev/null # level-0 archive ``` Without the archive, JDBC fails identically to our driver with "Cannot close blob — BLOB pages can't be allocated from a chunk until chunk add is logged" (ISAM -169). **This was the unblocker that confirmed our protocol implementation was correct** — when JDBC and our driver fail identically against the same broken server config, you've got byte-for-byte protocol parity. Then fix the server. 
### Architectural note: why BYTE/TEXT cost more than the other codec types

Phase 6.a/c/e (DECIMAL/DATETIME/INTERVAL) shipped fast because each type was a single-PDU codec — encode bytes, send inline. BYTE/TEXT required **state-machine surgery**:

- The bind builder now knows about "blob-aware" params and queues them for a separate stream after the per-param block.
- The cursor's SELECT lifecycle now does a SQ_FETCHBLOB round-trip per blob column per row before sending CLOSE.
- The dereferencing is a separate read loop that handles its own SQ_DONE/SQ_COST/SQ_XACTSTAT interleaving.

The smart-LOB family (BLOB type 102, CLOB type 101) is a **further** state-machine extension — they use `IfxLocator` references against sbspace and require an LO_OPEN/LO_READ/LO_WRITE/LO_CLOSE session protocol entirely separate from BBIND/BLOB. That's deferred to Phase 9.

### Test coverage delivered

9 integration tests in `tests/test_blob.py`:

- `test_byte_roundtrip_short` — single-chunk payload
- `test_byte_roundtrip_multichunk` — 5120 bytes (5 chunks at 1024 each)
- `test_byte_null` — null descriptor (byte 39=1) → Python None
- `test_byte_multi_row` — three rows, each with its own SQ_FETCHBLOB
- `test_byte_binary_safe` — preserves null bytes, high bytes, etc.
- `test_text_roundtrip` — TEXT column, str returned (decoded)
- `test_text_with_unicode_iso8859` — extended-Latin chars round-trip
- `test_text_null`
- `test_byte_alongside_other_types` — BYTE column mixed with INT

Plus the Phase 4 `test_unsupported_param_type_raises` was updated — `bytes` is no longer the canonical "unsupported" sentinel, since we now support it. Switched to a custom Python class for that role.

### The "JDBC fails identically" debugging discovery

When the first round of integration tests failed with sqlcode -603, I built a Java `byte-cycle` scenario in `tests/reference/RefClient.java` that uses `PreparedStatement.setBytes()` against the same server.
JDBC failed with the **exact same error** ("Cannot close blob — chunk add is logged"). That was the diagnostic moment: our protocol bytes were correct; the server config was wrong. After the level-0 archive, both JDBC and our driver succeeded. This is the third instance of "compare against JDBC at the byte level" diagnostic pattern paying off (after the SHORT-vs-INT bug from Phase 4.x and the 2-byte length prefix from Phase 6.c). Worth promoting to a debugging recipe: **when our driver fails and you suspect protocol error, replicate the operation through `RefClient`. Same error = server/config issue. Different error = our bug.** --- ## 2026-05-04 — Phase 9: smart-LOB BLOB/CLOB locator decoding (Phase 10 deferred for full fetch) **Status**: active **Decision**: Smart-LOB columns are decoded into typed `informix_db.BlobLocator` / `informix_db.ClobLocator` objects that wrap the 72-byte server-side reference. Full data retrieval (fetching the actual bytes) is deferred to **Phase 10** because it requires implementing two new wire-protocol families: ### How smart-LOBs surface in the wire protocol Surprise discovery: **BLOB and CLOB columns do not appear with their nominal type codes (102 / 101) in the SQ_DESCRIBE response.** Instead, the server presents them as `UDTFIXED` (type 41) with: - `extended_id = 10` for BLOB, `11` for CLOB - `extended_owner = "informix"`, `extended_name = "blob"` / `"clob"` - `encoded_length = 72` (locator size) The 72 bytes that arrive in the SQ_TUPLE are the locator — an opaque server-side pointer into the smart-LOB sbspace. They contain enough information for the server to find the actual data (sbspace ID, blob ID, etc.) but they are NOT the data. ### What it takes to retrieve the actual bytes (Phase 10 work) Captured JDBC wire flow shows that retrieving a BLOB requires: 1. **`SQ_FPROUTINE` (tag 103)** — fast-path RPC to invoke `ifx_lo_open(locator, mode=4)` (LO_RDONLY). This is a *separate* execution path from PREPARE/EXECUTE/FETCH. 
It includes its own parameter-marshaling format with UDT support (the locator goes in as an `IfxUDT` with `extended_type_name="blob"` and the 72 bytes). The response carries back a small int — the file descriptor (`loFd`). 2. **`SQ_LODATA` (tag 97)** — bulk byte transfer. Body: `[short subCom][short loFd][int length][short bufSize=32000]` with sub-commands 0=LO_READ, 1=LO_READWITHSEEK, 2=LO_WRITE. Response is `[short SQ_LODATA][short opType][int totalSize][short chunk_size][bytes data]...`. 3. **Another `SQ_FPROUTINE`** to invoke `ifx_lo_close(loFd)`. Writing a smart-LOB is even more involved: `ifx_lo_create(spec, mode, blob)` returns a fresh locator AND a file descriptor, then `SQ_LODATA(LO_WRITE, ...)` streams the bytes, then `ifx_lo_close`. The locator is then passed as an INSERT parameter (also via UDT marshaling). ### Server-side prerequisites Building on Phase 7/8 setup, smart-LOBs additionally need: 1. **An sbspace** (Phase 6.f setup): `onspaces -c -S sbspace1 -p /path -o 0 -s 50000 -Df "AVG_LO_SIZE=100"` 2. **`SBSPACENAME` config**: `onmode -wm SBSPACENAME=sbspace1` — the default sbspace name. Without this, `ifx_lo_create` fails with `-Invalid default sbspace name (sbspace).` (the default is the literal string `"sbspace"` which doesn't exist). ### What ships in Phase 9 - `informix_db.BlobLocator(raw: bytes)` — 72-byte frozen dataclass, validates length on construction, has a safe `__repr__` that doesn't leak the locator bytes (they're internal/opaque to the client). - `informix_db.ClobLocator(raw: bytes)` — same shape, distinct type. Same-bytes locators of different families compare *unequal* by design. - Row decoder branch in `_resultset.parse_tuple_payload` that detects `UDTFIXED` + extended_id 10/11 and wraps the bytes appropriately. - Wire constants `SQ_LODATA = 97`, `SQ_FPROUTINE = 103`, `SQ_FPARAM = 104` added to `_messages.py` for Phase 10 use. 
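The Phase 9 wrapper types described above can be sketched as frozen dataclasses. This is an illustrative reconstruction of the documented behavior (72-byte validation, repr that doesn't leak the bytes, cross-family inequality), not the actual `informix_db` source:

```python
from dataclasses import dataclass, field

LOCATOR_LEN = 72

@dataclass(frozen=True)
class BlobLocator:
    """Opaque smart-LOB locator; the raw bytes are a server-side pointer."""
    raw: bytes = field(repr=False)

    def __post_init__(self):
        if len(self.raw) != LOCATOR_LEN:
            raise ValueError(f"locator must be {LOCATOR_LEN} bytes, got {len(self.raw)}")

    def __repr__(self):  # keep locator bytes out of logs
        return f"{type(self).__name__}(<{LOCATOR_LEN} opaque bytes>)"

@dataclass(frozen=True)
class ClobLocator:
    """Same shape as BlobLocator, deliberately a distinct type."""
    raw: bytes = field(repr=False)

    def __post_init__(self):
        if len(self.raw) != LOCATOR_LEN:
            raise ValueError(f"locator must be {LOCATOR_LEN} bytes, got {len(self.raw)}")

    def __repr__(self):
        return f"{type(self).__name__}(<{LOCATOR_LEN} opaque bytes>)"
```

The dataclass-generated `__eq__` only compares instances of the same class, which gives the "same bytes, different family, unequal" semantics for free; `frozen=True` supplies immutability and a hash.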
### Test coverage - 11 unit tests (`tests/test_blob_locator_unit.py`) exercising construction, immutability, equality, hash, repr safety, and size validation. No Informix needed. - 4 integration tests (`tests/test_smart_lob.py`) verifying that SELECT on a BLOB column returns a `BlobLocator`, the description metadata is correct, the result is immutable, and the repr doesn't leak. The fixture seeds test data via the JDBC reference client (since smart-LOB writes also need the deferred protocols). Total project tests: **64 unit + 111 integration = 175 tests**. ### Why "research-first, implementation-after" is becoming the default for big-protocol phases Phases 6.f, 8, and 9 all followed the same arc: spend the first half of the phase on "what does the wire actually look like?" research (capturing JDBC traces, reading decompiled source, configuring the server until JDBC works). Then either ship implementation in the same phase (Phase 8) or split into a separate later phase (6.f → 8, 9 → 10). The split is appropriate when the protocol surface is materially larger than what we can validate in one focused session. For Phase 9, the deferred work is genuinely substantial: - SQ_FPROUTINE alone is a new RPC framework with its own request/response format - It needs UDT parameter marshaling (`extended_owner` + `extended_name` + raw bytes) - SQ_LODATA needs read+write paths with chunk streaming - The cursor needs new state-machine awareness (open the LOB, fetch, close — all between cursor open and CLOSE) Estimating Phase 10 at ~2x the protocol surface of Phase 8. --- ## 2026-05-04 — Phase 10: smart-LOB BLOB read via SQ_FILE / lotofile **Status**: active **Decision**: Implemented BLOB read end-to-end via the **`SQ_FILE` (98) protocol** rather than the heavier `SQ_FPROUTINE` (103) + `SQ_LODATA` (97) stack that the earlier Phase 9 entry estimated as 2x Phase 8. 
The actual implementation came in much smaller because we leveraged a server-side SQL function (`lotofile`) that orchestrates the byte transfer, with our driver acting as a remote filesystem. ### The strategic pivot Initial estimate for Phase 10 was: implement `SQ_FPROUTINE` (RPC fast-path with UDT parameter marshaling) + `SQ_LODATA` (chunked transfer to/from open file descriptors). Both are big new wire-protocol surfaces. Then I discovered that `SELECT ifx_lo_open(blob_col, 4) FROM tbl` works as **regular SQL** — the server reads the locator from the column itself and passes it to the function, returning the file descriptor as an INT result. No client-side UDT marshaling needed. But that was a partial win — we'd still need `SQ_LODATA` for actually transferring the bytes after the open. Then I tried `SELECT lotofile(blob_col, '/path', 'client') FROM tbl` — and the server responded with `unexpected tag in FETCH response: 0x0062`. That tag is **`SQ_FILE`** — a *separate* protocol I hadn't recognized as relevant. Reading the JDBC source: `SQ_FILE` is the "remote filesystem" protocol where the server tells the client to act as a file server (open a path, accept these chunks, close). The bytes flow back to us automatically. The key insight: **`lotofile(...)` is a server-side function that orchestrates the entire transfer in one SQL statement**. The client doesn't need to do `ifx_lo_open` → `ifx_lo_read` → `ifx_lo_close`. Just write the SQL, intercept the `SQ_FILE` messages, return the bytes. Maybe 1/3 the protocol surface I'd planned. ### Wire protocol — SQ_FILE (98) The server sends `SQ_FILE` messages with sub-types (per `IfxSqli.receiveSQFILE` line 4980): - **0 (open)**: `[short fnameLen][padded fname][int mode][int flags][int offset][short SQ_EOT]`. Client opens the named file. We respond with `[short SQ_EOT]`. - **3 (write to client)**: stream of `[short SQ_FILE_WRITE=107][short bufSize][padded data]` chunks, terminated by `SQ_EOT`. 
We respond with `[short 107][int totalBytesWritten][short SQ_EOT]`. - **1 (close)**: `[short SQ_EOT]`. We respond with `[short SQ_EOT]`. - (2 = read-from-client / `filetoblob` path; not implemented this phase.) Our implementation buffers writes in memory (`bytearray`) keyed by the requested filename; the bytes never touch disk. Users retrieve via `cursor.blob_files[filename]`. ### Implementation: in-memory file emulation ```python # In cursor state: self.blob_files: dict[str, bytes] = {} # filename -> assembled bytes self._sqfile_current_name: str | None = None self._sqfile_current_buf: bytearray | None = None # In _read_fetch_response, when tag == 98: self._handle_sq_file(reader) ``` The handler dispatches by optype: open creates a fresh buffer, write extends it, close seals it into `blob_files`. ### Bonus discovery: UDTVAR(lvarchar) row decoding `SELECT lotofile(...)` returns its result as **UDTVAR (type 40) with extended_name="lvarchar"** — not as plain LVARCHAR. The wire format is `[byte indicator][int length][bytes]` (vs. plain LVARCHAR's `[int length][bytes]`). Added a row-decoder branch that handles this — needed to surface the actual filename string instead of raw locator bytes. ### High-level helper: `cursor.read_blob_column` For the common case "give me the bytes of column X from row matching Y", added a convenience method that wraps the user's SQL with `lotofile(...)` and returns the assembled bytes: ```python data: bytes = cur.read_blob_column( "SELECT data FROM photos WHERE id = ?", (42,) ) ``` Naive SQL splitter that handles the common shape (single column, FROM clause). Power users can drop down to manual `lotofile` + `cur.blob_files[name]`. 
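The optype-3 reassembly loop above can be sketched over an in-memory stream. `SQ_FILE_WRITE=107` is from the notes; the numeric value used for `SQ_EOT` here is a placeholder (the real constant lives in `_messages.py`), and padding odd chunks to even length is an assumption:

```python
import io
import struct

SQ_FILE_WRITE = 107
SQ_EOT = 12  # placeholder value for this sketch

def reassemble_blob(stream: io.BufferedIOBase) -> bytes:
    """Consume [short tag][short len][padded data] chunks until SQ_EOT,
    buffering the bytes in memory (they never touch disk)."""
    buf = bytearray()
    while True:
        (tag,) = struct.unpack("!h", stream.read(2))
        if tag == SQ_EOT:
            return bytes(buf)
        if tag != SQ_FILE_WRITE:
            raise ValueError(f"unexpected tag 0x{tag:04x} in SQ_FILE stream")
        (size,) = struct.unpack("!h", stream.read(2))
        chunk = stream.read(size + (size % 2))  # assumed even-length padding
        buf += chunk[:size]
```

Feeding it a synthetic two-chunk stream (one odd-length chunk padded with a zero byte) reassembles the original payload.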
### Test coverage 6 integration tests in `tests/test_smart_lob_read.py`: - Low-level `lotofile` + `blob_files` lookup - 30KB BLOB across multiple SQ_FILE_WRITE chunks - High-level `read_blob_column` simple case - `read_blob_column` returns `None` when no rows match - High-level helper for 30KB BLOB - `read_blob_column` validation (rejects non-SELECT and FROM-less SQL) Total project tests: **64 unit + 117 integration = 181 tests**. ### What's still deferred (Phase 11+) - **Smart-LOB write**: `INSERT INTO tbl VALUES (?, ?)` with a `bytes` BLOB parameter still requires the full `SQ_FPROUTINE` + `SQ_LODATA` stack to invoke `ifx_lo_create` + write chunks. There's no `lotofromfile_client(bytes)` SQL function with the same shape as `lotofile`. - **`BlobLocator.read(connection)`**: an OO API would be nice but requires reverse-mapping a locator back to its source — which the `SQ_FPROUTINE` path does naturally, but the `lotofile` path does not. - **`filetoblob` path**: server-as-reader (SQ_FILE optype 2) — for streaming files from client to server. ### Lesson **Don't estimate protocol-implementation cost from JDBC's class hierarchy alone.** JDBC's `IfxSmBlob` class is 600+ lines and looks like a massive surface, but the actual *wire-level* read path can be reduced to a single SQL function (`lotofile`) plus one new tag handler (`SQ_FILE`). When estimating, look at the wire trace, not the client SDK abstractions. The wire is often simpler than the SDK suggests. --- ## 2026-05-04 — Phase 11: smart-LOB BLOB/CLOB write via SQ_FILE / filetoblob **Status**: active **Decision**: Implemented BLOB and CLOB *write* using the same `SQ_FILE` (98) protocol pivot as Phase 10 — the symmetric counterpart in the opposite direction. Same pattern: leverage a server-side SQL function (`filetoblob`/`filetoclob`) that orchestrates the byte transfer, with our driver acting as a remote filesystem. ### What ships Two new pieces: 1. 
**`SQ_FILE` optype 2 (read-from-client)**: extended the Phase 10 handler. When the server says "open file X for reading, send me chunks", we look up registered bytes in `cursor.virtual_files[X]` and stream them as `SQ_FILE_READ` (106) chunks. The wire format mirrors optype 3 (write-to-client) but reversed. 2. **`cursor.write_blob_column(sql, blob_data, params, *, clob=False)`**: high-level helper. Takes a SQL statement with a `BLOB_PLACEHOLDER` token, replaces it with `filetoblob('', 'client')` (or `filetoclob` for CLOB), registers the bytes under the sentinel, runs the statement. The server reads the bytes via the SQ_FILE protocol mid-statement. ### Wire protocol — SQ_FILE optype 2 in detail Server sends: `[short SQ_FILE=98][short optype=2][short bufSize][int readAmount][short SQ_EOT]` We respond with: - `[short SQ_FILE_READ=106][int actualAmount]` — the total we'll send - For each chunk: `[short SQ_FILE_READ=106][short chunkSize][padded data]` - Final `[short SQ_EOT]` (per JDBC's `flip()`) The server's `bufSize` is the per-chunk cap; we honor it. `readAmount=-1` means "send everything". ### High-level API design The `BLOB_PLACEHOLDER` token approach was chosen over alternatives: - **`?`-style binding**: would conflict with normal parameter substitution and require introspecting parameter types from DESCRIBE - **Method on `BlobLocator`**: works for read (Phase 10's deferred design) but not write — there's no locator before the row exists - **Implicit bytes-detection in `execute()`**: too magical; `bytes` already maps to BYTE type for legacy in-row blobs `BLOB_PLACEHOLDER` is unmistakable, doesn't conflict with anything, and makes the code obvious at the call site: ```python cur.write_blob_column( "INSERT INTO photos VALUES (?, BLOB_PLACEHOLDER)", jpeg_bytes, (42,), ) ``` ### Closing the loop: pure Python end-to-end Phase 9's tests needed JDBC to seed BLOB rows. Phase 10's read tests still needed JDBC for fixtures. 
**Phase 11 eliminated that dependency entirely** — both `tests/test_smart_lob.py` and `tests/test_smart_lob_read.py` now use our own `write_blob_column` for fixture setup. The full smart-LOB read+write loop is **pure Python, no JVM needed**. Bonus: integration test runtime dropped from 5.78s → 2.78s because we're no longer spawning Java per fixture. The Phase 0 project goal — "pure Python Informix driver, no native deps" — was already met for the protocol implementation, but Phase 11 finally made it true for the test suite as well. ### Test coverage 9 integration tests in `tests/test_smart_lob_write.py`: - BLOB short payload round-trip (single chunk) - BLOB 51200 bytes (multi-chunk) - BLOB empty bytes - BLOB binary-safe (all 256 byte values) - BLOB UPDATE - BLOB multi-row INSERTs - CLOB round-trip (`clob=True` routes through `filetoclob`) - `write_blob_column` validation (rejects SQL without `BLOB_PLACEHOLDER`) - `virtual_files` cleanup after call Total project tests: **64 unit + 126 integration = 190 tests**. ### Type matrix complete (for the common types) | Type | Decode | Encode | |------|--------|--------| | INT/FLOAT/DECIMAL/etc. | ✓ | ✓ | | CHAR/VARCHAR/LVARCHAR/etc. | ✓ | ✓ | | BOOL/DATE/DATETIME/INTERVAL | ✓ | ✓ | | BYTE/TEXT (legacy in-row blobs) | ✓ | ✓ | | **BLOB/CLOB (smart-LOBs)** | **✓ via lotofile** | **✓ via filetoblob** | | ROW, COLLECTION | — | — | Smart-LOBs went from "research-only" (Phase 9) to "fully working in pure Python" (Phase 11) in three phases. The architectural insight that made it tractable: **lean on server-side SQL functions, not client-side RPC**. The fast-path `SQ_FPROUTINE`/`SQ_LODATA` stack would have been ~3-4x the work. 
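The optype-2 (read-from-client) response framing described above can be sketched as a pure function over the registered bytes. `SQ_FILE_READ=106`, the total-amount preamble, and the `readAmount=-1` convention are from the notes; the `SQ_EOT` value and the even-length padding of odd chunks are assumptions of this sketch:

```python
import struct

SQ_FILE_READ = 106
SQ_EOT = 12  # placeholder value for this sketch

def stream_to_server(data: bytes, buf_size: int, read_amount: int = -1) -> bytes:
    """Build the client's response: total byte count, then chunks capped
    at the server-supplied buf_size, then the closing SQ_EOT."""
    if read_amount != -1:          # -1 means "send everything"
        data = data[:read_amount]
    out = bytearray(struct.pack("!hi", SQ_FILE_READ, len(data)))
    for off in range(0, len(data), buf_size):
        part = data[off:off + buf_size]
        out += struct.pack("!hh", SQ_FILE_READ, len(part))
        out += part + b"\x00" * (len(part) % 2)   # assumed padding
    out += struct.pack("!h", SQ_EOT)
    return bytes(out)
```

A 2560-byte payload with the 1024-byte cap the server advertises produces two full chunks and one 512-byte tail, honoring `bufSize` exactly.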
### What's still NOT done - ROW types (composite UDTs) - COLLECTION types (SET, LIST, MULTISET) - Async layer (`informix_db.aio`) - TLS/SSL - Connection pooling - SQL fast-path RPC (`SQ_FPROUTINE`/`SQ_LODATA`) — not needed for any common operation we've found, but would be needed for direct stored-procedure invocation with UDT params --- ## 2026-05-04 — Phase 12: ROW / COLLECTION type recognition (text representation) **Status**: active (recognition-only — full recursive parsing deferred) **Decision**: Composite UDTs (ROW=22, COLLECTION=23, SET=19, MULTISET=20, LIST=21) decode into typed wrapper objects (``informix_db.RowValue`` and ``informix_db.CollectionValue``) that expose the schema string and the raw payload bytes. Full recursive parsing into Python tuples / sets / lists is deferred — Phase 13 territory. ### The wire format surprise The first probe via JDBC's `RefClient` showed a 1420-byte SQ_TUPLE payload for a 2-field ROW value (just "Alice, 30"). Reading JDBC's `IfxComplex` and `IfxComplexInput` sources, this is the binary-with-full-schema-metadata format — JDBC opts into it because it wants to recursively parse fields by type. When SELECT runs without that opt-in (e.g., from our driver), the server uses a **textual representation** instead: ``` ROW value: b"ROW('Alice',30 )" # 24 bytes SET value: b"SET{'red','green','blue'}" # 25 bytes LIST value: b"LIST{10 ,20 ,30 }" # 41 bytes ``` That's ~30x lighter than JDBC's binary form. Per-element padding follows the declared column widths (note the trailing spaces — VARCHAR/INT padding). The wire framing is identical to ``UDTVAR(lvarchar)`` from Phase 10: ``` [byte indicator][int length][bytes] ``` Indicator: ``0`` = not null, ``1`` = null. Length is a 4-byte big-endian int. Bytes are the textual representation. ### Why we ship recognition only Implementing recursive parsing into Python tuples/lists/sets requires: 1. 
Parsing the textual representation — needs a small SQL-literal lexer (handle quoted strings, escapes, nested ROWs, NULL elements) 2. Per-element type decoding driven by the column's `extended_name` schema (e.g., ``ROW(name varchar(50), age integer)``) 3. Recursive handling for nested ROWs and collections of ROWs That's a substantial parser. The user-facing benefit is "a tuple instead of a string of the tuple" — useful, but most production use cases sidestep this by **projecting sub-fields via SQL** (`SELECT row_col.fieldname FROM tbl`) which is already supported and returns properly-typed columns. So Phase 12's deliverable is the **type recognition + wire-format envelope** part. The recursive parser can layer on top later without changing the API. ### Phase 13+ scope (if anyone needs it) If a future user has a real workload that needs structured parsing: 1. Implement a SQL-literal tokenizer for the textual format 2. Add `RowValue.fields() -> tuple` driven by parsing `extended_name` 3. Add `CollectionValue.elements() -> set | list` similarly Or alternatively, request the binary-with-schema format like JDBC does (which would require additional protocol negotiation) and parse that. ### Test coverage 8 integration tests in `tests/test_composite_types.py`: - ROW value recognition (returns `RowValue`) - ROW NULL - ROW sub-field projection workaround (`SELECT r.field FROM tbl`) - ROW with long value (>255 bytes — confirms 4-byte length prefix) - SET / MULTISET / LIST recognition - NULL collection Total: **64 unit + 134 integration = 198 tests**. ### Lesson **The same wire-protocol pattern keeps showing up.** Phase 10's UDTVAR(lvarchar) decoder, Phase 9's smart-LOB locator, and now Phase 12's composite UDTs all use ``[byte indicator][int length][bytes]``. Once you've implemented one UDT-shaped type, the next is mostly a copy of the decoder branch. The hard part of UDTs is the **payload semantics**, not the framing. 
This phase took less than an hour to ship after the protocol research from Phase 10/11 had already established the indicator+length convention. --- ## 2026-05-04 — Phase 13: SQ_FPROUTINE / SQ_EXFPROUTINE fast-path RPC **Status**: active (MVP — scalar-only params/returns; UDT params deferred to Phase 13.x) **Decision**: Implemented the fast-path RPC layer for direct stored-procedure invocation, exposed as ``Connection.fast_path_call(signature, *params) -> list``. Routine handles cached per-connection by signature. ### Wire protocol — three-message family 1. **`SQ_GETROUTINE`** (101) — handle resolution by signature - Request: `[short 101][byte isRoutineById=0][int sigLen][sig bytes][pad if odd][short fparamFlag=0][short SQ_EOT]` - Response: `[short 101][short dbNameLen][dbName bytes][int handle][short SQ_EOT]` 2. **`SQ_EXFPROUTINE`** (102) — execute by handle - Request: `[short 102][short dbNameLen][dbName][pad if odd][int handle][short paramCount][short fparamFlag][SQ_BIND-format params][short SQ_EOT]` 3. **`SQ_FPROUTINE`** (103) — response with return values - Body: `[short numReturns]` then per return: `[short type]` (with optional UDT info block when type > 18 except 52/53), `[short ind][short prec][data]`, then drain SQ_DONE/SQ_COST/SQ_XACTSTAT/SQ_EOT ### Why this layer matters even though SQL works for the common cases Phase 10 and 11 showed you can do most smart-LOB work via plain SQL (`SELECT ifx_lo_open(col, mode)`, `INSERT INTO ... filetoblob(...)`). So why implement the fast-path? Three reasons: 1. **`ifx_lo_close(int)`** can't be invoked via plain SQL (`EXECUTE FUNCTION ifx_lo_close(?)` returns `-674`). The cleanup step of the smart-LOB lifecycle genuinely needs the fast-path. Without it, opened locators leak server-side until the session closes. 2. **Direct UDF invocation** is faster — no PREPARE → DESCRIBE → EXECUTE → DRAIN. Just the two-message GETROUTINE+EXFPROUTINE round-trip (one if cached). 
For tight UDF-in-a-loop workloads, this is materially cheaper.
3. **Some Informix introspection functions** (e.g., procedural diagnostics) aren't projectable through SQL.

### Routine-handle cache

Mirrors JDBC's `setFPCacheInfo`. A per-connection `dict[str, tuple[str, int]]` maps signature → (db_name, handle). The first call resolves and caches; subsequent calls skip `SQ_GETROUTINE` entirely. The cache is invalidated implicitly when the connection closes — the server-side handle is per-session anyway. No explicit eviction is needed for typical workloads (a few stored functions used repeatedly).

### MVP scope — what's NOT in Phase 13

- **UDT param encoding**: `ifx_lo_open(blob_locator, mode)` takes a 72-byte UDT blob param (extended_type_name="blob"). Our `encode_param` doesn't yet emit the UDT-specific extended_owner/extended_name preamble required for type > 18, so users can't pass locators directly through the fast-path.
- **UDT return decoding**: `ifx_lo_create(spec, mode, blob)` returns both a 72-byte locator AND an int. The response parser handles the int but not the locator UDT.
- **`SQ_LODATA`** for chunked binary read/write to open file descriptors (the other half of the original Phase 13 plan). Not strictly needed, since Phase 10/11's `lotofile`/`filetoblob` already cover read/write end-to-end.

These are Phase 13.x items if a real workload needs them. For now, the **read/write smart-LOB cycle works fully without any of them** thanks to the SQL-as-shortcut design from Phases 10/11.

### Test coverage

5 integration tests in `tests/test_fastpath.py`:

- `ifx_lo_close(-1)` raises with sqlcode -9810 (server error path)
- End-to-end: open via SQL → close via fast-path returns 0
- Handle caching (signature appears in cache after first call, reused on second)
- Unknown function signature raises (SQ_GETROUTINE error path)
- Multiple open/close cycles (verifies cleanup)

Total: **64 unit + 139 integration = 203 tests**.
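For reference, the cache-plus-framing flow described above can be sketched as follows. The `FastPathCache` and `build_getroutine` names are hypothetical, and the numeric `SQ_EOT` value here is a placeholder — the message layout follows the notes in this entry, nothing more.

```python
import struct

SQ_GETROUTINE = 101   # message tag from the wire notes above
SQ_EOT = 12           # PLACEHOLDER value -- the real SQ_EOT constant lives in the driver

class FastPathCache:
    """Illustrative per-connection cache: signature -> (db_name, handle)."""

    def __init__(self, resolve):
        self._resolve = resolve                  # performs the SQ_GETROUTINE round-trip
        self._handles: dict[str, tuple[str, int]] = {}

    def handle_for(self, signature: str) -> tuple[str, int]:
        if signature not in self._handles:       # first call: resolve and cache
            self._handles[signature] = self._resolve(signature)
        return self._handles[signature]          # later calls skip SQ_GETROUTINE

def build_getroutine(signature: str) -> bytes:
    """Assemble an SQ_GETROUTINE request per the layout documented above."""
    sig = signature.encode("ascii")
    msg = struct.pack(">hbi", SQ_GETROUTINE, 0, len(sig)) + sig  # tag, isRoutineById=0, sigLen
    if len(sig) % 2:
        msg += b"\x00"                           # pad to even length
    return msg + struct.pack(">hh", 0, SQ_EOT)   # fparamFlag=0, terminator
```

The cache is what turns the two-message round-trip into one for repeated calls.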
### Architectural completion

With Phase 13, the project's protocol implementation now covers **every wire-message family JDBC uses** for ordinary database work:

- Login + session init (Phases 0-2)
- PREPARE → EXECUTE → FETCH (Phases 2-4)
- Transaction control (Phase 7)
- Type codecs for all common types (Phases 5-12)
- Fast-path RPC (Phase 13)
- File-transfer protocol (Phases 10-11)

The only families still unimplemented are the TLS handshake and the cluster-redirect protocol (replication failover). Neither is needed for a single-instance driver.

---

## 2026-05-04 — Phase 14: TLS/SSL transport

**Status**: active

**Decision**: Optional TLS via a new ``tls`` parameter on ``connect()`` and the ``IfxSocket`` constructor. Three modes:

```python
# 1. Plain TCP (default, current behavior)
informix_db.connect(host, port=9088, ...)

# 2. TLS with verification DISABLED — dev / self-signed servers
informix_db.connect(host, port=9089, ..., tls=True)

# 3. Caller-supplied ssl.SSLContext — production
ctx = ssl.create_default_context(cafile="/path/to/ca.pem")
informix_db.connect(host, port=9089, ..., tls=ctx)
```

### Why "no STARTTLS" — Informix uses dedicated TLS ports

Postgres and others negotiate TLS via a STARTTLS-style upgrade: connect plain, send a magic byte, receive an ack, wrap the socket. Informix does not — TLS-enabled listeners run on **separate ports** (per the server's ``sqlhosts`` config), and the SSL handshake runs immediately after TCP connect with no protocol-level negotiation.

From the driver's perspective this is simpler: just wrap the socket before any SQLI bytes flow. The wrapping happens inside ``IfxSocket.__init__``, so the rest of the protocol layer (login PDU assembly, SQ_BIND, fast-path, file transfer) is fully unaware of whether TLS is in use. The same code path drives encrypted and plain connections.

### Why ``tls=True`` defaults to *insecure*

Most Informix dev installations use self-signed certs.
The first thing users hit when adding TLS is a "self-signed certificate verification failed" error, which then sends them down the path of disabling verification anyway. ``tls=True`` short-circuits that: it produces a context with ``check_hostname=False`` and ``verify_mode=CERT_NONE``. The minimum protocol is still TLSv1.2 (the context is built on ``ssl.PROTOCOL_TLS_CLIENT``). For production, a caller-supplied ``ssl.SSLContext`` is the explicit, secure path. The defaults are documented as "dev / loopback only".

### Test strategy: avoid making cryptography a dev dep

The initial draft used the ``cryptography`` package to mint a test certificate — which would have added a ~5 MB transitive dep just for one phase's tests. Switched to spawning the ``openssl`` CLI as a subprocess (skipping the test if it isn't available). Cleaner, no new Python deps.

The integration tests run against a tiny in-process Python TLS echo server using ``ssl.SSLContext(PROTOCOL_TLS_SERVER)``. Each test:

1. Generates an ephemeral self-signed cert via ``openssl req -x509``
2. Spins up a one-shot TLS server in a daemon thread
3. Connects via ``IfxSocket(tls=True)`` (or with a custom context)
4. Sends/receives an echo round-trip to prove the encrypted channel works

### Tests delivered

5 unit tests in `tests/test_tls.py`:

- `tls=True` dev context has `check_hostname=False` and `verify_mode=CERT_NONE`
- Default context uses TLSv1.2+
- Real handshake against the in-process TLS echo server (proves `wrap_socket` works)
- Custom `SSLContext` honored verbatim
- `tls=True` against a non-TLS port raises `OperationalError("TLS handshake failed...")`

Total: **69 unit + 139 integration = 208 tests**.
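The dev-mode context described above can be sketched with the stdlib alone. The `make_dev_tls_context` name is illustrative; only the documented properties (verification off, TLS 1.2 floor) come from this entry. Note the ordering: `check_hostname` must be cleared *before* `verify_mode` can be set to `CERT_NONE`.

```python
import ssl

def make_dev_tls_context() -> ssl.SSLContext:
    """Sketch of the context ``tls=True`` produces: verification off, TLS 1.2+.

    Dev / loopback only -- production callers supply their own SSLContext.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False                   # self-signed dev certs
    ctx.verify_mode = ssl.CERT_NONE              # must follow check_hostname=False
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # enforce the 1.2 floor explicitly
    return ctx
```

Since Informix runs TLS on a dedicated port, wrapping is just `ctx.wrap_socket(sock, server_hostname=host)` before any SQLI bytes flow.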
### What's now done at the protocol level

The driver now implements **everything in the SQLI wire-protocol family that a Python application needs**:

- Login → cursor → fetch (Phases 0-4)
- All common types (Phases 5-12)
- Transactions, file transfer, fast-path, smart-LOBs (Phases 7-13)
- TLS transport (this phase)

The remaining backlog (async, pooling) is **library-design** work, not protocol work.

---

## 2026-05-04 — Phase 15: connection pool

**Status**: active

**Decision**: Added a thread-safe `informix_db.ConnectionPool` with min/max sizing, lazy growth, idle recycling, and a per-acquire health check. Construct via `informix_db.create_pool(...)`.

```python
import informix_db

pool = informix_db.create_pool(
    host="...", user="informix", password="...",
    min_size=1, max_size=10, acquire_timeout=5.0,
)

with pool.connection() as conn:
    cur = conn.cursor()
    cur.execute("SELECT ...")
    rows = cur.fetchall()
# conn returned to pool

pool.close()
```

### Design choices

**Lazy growth from `min_size`**. The pool pre-opens `min_size` connections on construction (defaults to 0). Beyond that, new connections are minted on demand up to `max_size`. This matches what FastAPI / Flask apps want: pay nothing on startup if the workload is light, but burst up to `max_size` under load.

**Health check on acquire, not on release**. Before handing a connection to a caller, the pool sends a trivial `SELECT 1 FROM systables WHERE tabid=1` round-trip. Dead connections (server-side timeout, network drop) are silently dropped and a fresh one is minted. The cost is one extra round-trip per acquire — typically <1 ms on the same network — in exchange for users never seeing a stale-connection error. The alternative (checking on *release*) is wrong for the obvious reason: the idle time between release and the next acquire is when connections actually die. Network drops, server-side idle timeouts, and OOM kills happen *between* uses, not during them.

**Eviction on `OperationalError` / `InterfaceError` only**.
The `with pool.connection()` context manager evicts the connection on connection-related exceptions but *retains* it on application-level errors (e.g., a `ValueError` from user code, or an `IntegrityError` from a constraint violation). This avoids the "every constraint violation evicts a healthy connection" pitfall some pools have.

**Releasing the lock during `connect()`**. The slow part of pool growth is the actual TCP/TLS handshake + login PDU exchange — easily 50-100 ms even on the same host. Holding the pool lock for the duration would serialize all growth. Instead, `acquire()` increments `_total` under the lock, releases the lock, opens the connection, then re-acquires the lock to return it. Other threads can acquire idle connections or grow the pool concurrently. A careful try/finally surrounds the unlocked section, because an exception during `connect()` must decrement `_total` and notify waiters.

### Why `threading.Condition` instead of `asyncio.Queue` or similar

The pool is sync-only. Async support is a separate phase that needs an entire `informix_db.aio` module — making the pool dual-API now would couple two unrelated concerns. The sync-pool implementation is ~250 lines; an async pool will reuse the lifecycle logic but needs different waiting primitives (there's no real way to share them).
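The "release the lock while connecting" pattern is easy to get subtly wrong, so here is a minimal sketch. `PoolSketch` and `connect_fn` are illustrative stand-ins — the real pool adds timeouts, health checks, and eviction on top of this skeleton.

```python
import threading

class PoolSketch:
    """Minimal sketch of slot-reservation growth with the lock released during I/O.

    `connect_fn` stands in for the real TCP/TLS + login handshake.
    """

    def __init__(self, connect_fn, max_size: int):
        self._connect = connect_fn
        self._max = max_size
        self._total = 0
        self._idle: list[object] = []
        self._cond = threading.Condition()

    def acquire(self):
        with self._cond:
            while not self._idle and self._total >= self._max:
                self._cond.wait()            # pool full: wait for a release
            if self._idle:
                return self._idle.pop()      # LIFO reuse of an idle connection
            self._total += 1                 # reserve a slot under the lock
        try:
            return self._connect()           # slow handshake happens UNLOCKED
        except BaseException:
            with self._cond:
                self._total -= 1             # give the reserved slot back
                self._cond.notify()          # wake a waiter
            raise

    def release(self, conn):
        with self._cond:
            self._idle.append(conn)
            self._cond.notify()
```

Reserving the slot (`_total += 1`) before unlocking is the key move: other threads see the pool as "full enough" and wait, while the slow connect proceeds without serializing them.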
### Test coverage: 15 integration tests

`tests/test_pool.py`:

- API & lifecycle: `min_size` pre-opens, lazy growth, context-manager release, LIFO reuse
- Exhaustion: timeout when full, per-acquire timeout override, release unblocks waiters
- Eviction: explicit `broken=True`, auto-evict on `OperationalError`, retain on application errors
- Health check: dead idle connection silently replaced
- Shutdown: `close()` drains idles, idempotent close, `with pool: ...` context manager
- Multi-thread safety: 8 workers × 3 queries each, no leaks, no double-use

### What's now in the library-design layer

With the pool, the project covers the three things a typical Python web/API workload needs from a database driver:

1. PEP 249 surface (Connection, Cursor, types) — Phases 0-12
2. TLS transport — Phase 14
3. Connection pool — this phase

The remaining backlog is just `informix_db.aio` (async) — a more substantial refactor, since it requires factoring the I/O layer behind a transport abstraction.

---

## 2026-05-04 — Phase 16: async API (`informix_db.aio`)

**Status**: active (thread-pool wrapping; native async deferred to Phase 17 if needed)

**Decision**: Shipped an async API at `informix_db.aio` that exposes `AsyncConnection`, `AsyncCursor`, and `AsyncConnectionPool`. Each blocking I/O call is offloaded to a worker thread via `asyncio.to_thread`. The event loop never blocks; FastAPI/aiohttp handlers stay non-blocking; queries run in parallel up to the pool's `max_size`.

### Two viable strategies, very different costs

I considered two approaches:

1. **Native async I/O** (the asyncpg pattern): rewrite the entire I/O layer behind a sync/async transport abstraction. `connection.execute()` would actually `await` on socket reads. Estimated cost: ~2000 lines of careful refactor, every code path touched. Performance ceiling: limited only by Python event-loop overhead.
2. **Thread-pool wrapping** (the approach aiosqlite takes with the stdlib `sqlite3` module): expose an async-compatible API by running the existing sync code in `asyncio.to_thread()`. Each `await` yields the event loop while a worker thread does the I/O. Cost: ~250 lines, no changes to the sync codebase. Performance ceiling: limited by thread-pool size, not event-loop overhead.

I shipped strategy 2.

### Why strategy 2 is "good enough" for the typical use case

Picture a FastAPI app handling an HTTP request:

- Request arrives → handler awaits `pool.connection()`
- Handler awaits `cur.execute(sql, params)` → worker thread does I/O, event loop yields
- Handler awaits `cur.fetchone()` → worker thread does I/O, event loop yields
- Handler returns the row

During those `await` points, the event loop is free to handle other requests. From the FastAPI side, nothing's different from using asyncpg. The only difference: a worker thread is doing the actual socket I/O instead of native async.

For workloads with hundreds of *concurrent* connections sharing a small thread pool, native async wins. For typical request-scoped workloads where each request gets its own connection from a pool of <50, the thread-hop overhead is dominated by network latency anyway: ~1 ms of thread-hop overhead vs. ~0.5 ms with native async, both noise next to a 5 ms query round-trip.

### What I'd do differently for native async (Phase 17)

If a real workload needs it: factor `IfxSocket` into a `Transport` ABC with sync/async implementations, then have `Connection` and `Cursor` use the transport's async methods when called from an async context. The protocol layer (PDU builders/parsers) stays I/O-agnostic and is shared across both. Estimated cost: 1-2 weeks of careful refactor.
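The wrapping strategy itself fits in a few lines. This sketch (names are illustrative, not the library's actual classes) shows the shape: every blocking call hops to a worker thread via `asyncio.to_thread`, and `async for` is built on top of the wrapped `fetchone`.

```python
import asyncio

class AsyncCursorSketch:
    """Illustrative async wrapper around any object with blocking
    execute/fetchone methods -- the event loop never blocks on I/O."""

    def __init__(self, sync_cursor):
        self._cur = sync_cursor

    async def execute(self, sql, params=()):
        # Blocking I/O runs in a worker thread; the loop is free meanwhile.
        return await asyncio.to_thread(self._cur.execute, sql, params)

    async def fetchone(self):
        return await asyncio.to_thread(self._cur.fetchone)

    def __aiter__(self):
        return self

    async def __anext__(self):
        row = await self.fetchone()
        if row is None:
            raise StopAsyncIteration
        return row
```

The same `to_thread` hop applied to connect/commit/close yields the full `AsyncConnection` surface with no changes to the sync protocol code.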
### API surface

```python
import asyncio
from informix_db import aio

async def main():
    pool = await aio.create_pool(
        host="...", user="informix", password="...",
        min_size=1, max_size=10,
    )
    async with pool.connection() as conn:
        cur = await conn.cursor()
        await cur.execute("SELECT id, name FROM users WHERE id = ?", (42,))
        row = await cur.fetchone()
        async for r in cur:  # also supported
            ...
    await pool.close()

asyncio.run(main())
```

The API mirrors the sync API one-to-one with `async`/`await` added: same parameter names, same exceptions, same behavior. Users moving between sync and async should not need to learn a new mental model.

### Implementation note: sync's `create_pool(min_size>0)` is blocking

`informix_db.create_pool(min_size=2)` opens 2 connections during construction (real network I/O). The async `aio.create_pool` is therefore an `async def` that wraps the sync call in `asyncio.to_thread`. For `min_size=0` (the default) this is essentially free, but always-await keeps the API uniform — users don't need to care whether construction actually does I/O.

### Pool eviction policy preserved

The async pool's `connection()` context manager applies the same "evict on `OperationalError`/`InterfaceError`, retain on application errors" policy as the sync pool. This means a `ValueError` raised by user code inside `async with pool.connection() as conn:` doesn't poison the connection — it's returned to the pool for the next request.

### Test coverage

9 integration tests in `tests/test_aio.py`:

- Connection lifecycle (open / close / async-with)
- Simple SELECT
- Parameterized SELECT
- Cursor async iteration (`async for`)
- Pool basic acquire/release
- Pool concurrent queries (20 queries through a max_size=5 pool, all complete in <5 s)
- Pool async context manager
- Transactions (commit / rollback)

Total: **69 unit + 163 integration = 232 tests**.

### Architectural completion

With Phase 16, **every backlog item is complete**. The project ships:
1. Pure-Python SQLI wire-protocol implementation (Phases 0-13) — no native deps
2. TLS transport (Phase 14)
3. Connection pool (Phase 15)
4. Async API (this phase)

The Phase 0 ambition — "first pure-Python implementation of Informix's SQLI protocol in any language" — is now genuinely complete. The library is ready for `pip install informix-db` and use in production Python web/API services.

---

## (template — copy below this line for new entries)

```
## YYYY-MM-DD —

**Status**: active | superseded | revisited

**Decision**:

**Discarded**:

**Why**:
```