informix-db

Author	SHA1	Message	Date
Ryan Malloy	9048335462	Phase 12: ROW / COLLECTION type recognition Composite UDTs (ROW=22, COLLECTION=23, SET=19, MULTISET=20, LIST=21) now decode into typed wrapper objects (informix_db.RowValue, informix_db.CollectionValue) that expose schema + raw payload bytes. The wire format is the now-familiar [byte ind][int length][bytes] pattern (same as UDTVAR(lvarchar) from Phase 10). The bytes are a TEXTUAL representation of the value when selected without the extended-binary opt-in JDBC uses: ROW value: b"ROW('Alice',30 )" SET value: b"SET{'red','green','blue'}" LIST value: b"LIST{10 ,20 ,30 }" JDBC's binary-with-schema format runs ~30x larger (1420 bytes for a 2-field ROW vs. our 24). We don't request it — the textual form is what the server returns by default and is sufficient for type recognition. Phase 12 ships type recognition only. Full recursive parsing into Python tuples/lists/sets is deferred to Phase 13 (would require a SQL-literal lexer + recursive type-driven decoding). Production workloads that need typed field access today can project via SQL: cur.execute("SELECT id, r.name, r.age FROM tbl") Tests: 8 integration tests in test_composite_types.py covering ROW recognition, NULL, sub-field projection workaround, long values (>255 bytes — verifies 4-byte length prefix), SET/MULTISET/LIST recognition, and null collections. Total: 64 unit + 134 integration = 198 tests. Lesson reinforced: once one UDT-shaped type is implemented (UDTVAR in Phase 10, smart-LOB in Phase 9), every subsequent UDT-shaped type is mostly a copy of the existing decoder branch. The hard part is payload semantics, not framing.	2026-05-04 14:30:44 -06:00
Ryan Malloy	389c32434c	Phase 9: smart-LOB BLOB/CLOB locator decoding (Phase 10 deferred for fetch) SELECT on BLOB or CLOB columns no longer requires raw byte interpretation. The 72-byte server-side locator is wrapped in a typed BlobLocator or ClobLocator (frozen dataclass) so the column is recognizable as "server-side reference, not actual bytes". Wire-protocol findings: * Smart-LOB columns DON'T appear with their nominal type codes (102/101) in SQ_DESCRIBE. They surface as UDTFIXED (41) with extended_id 10 (BLOB) or 11 (CLOB) and encoded_length=72 (locator size). * Retrieving the actual bytes requires SQ_FPROUTINE (103) RPC to invoke ifx_lo_open, plus SQ_LODATA (97) for chunked transfer, plus another SQ_FPROUTINE for ifx_lo_close. That's a Phase 10 lift — roughly 2x the protocol surface of Phase 8. Server config needed (added to Phase 7 setup): * sbspace: onspaces -c -S sbspace1 ... * default sbspace: onmode -wm SBSPACENAME=sbspace1 What ships in Phase 9: * informix_db.BlobLocator(raw: bytes) — 72-byte frozen wrapper * informix_db.ClobLocator(raw: bytes) — distinct type, same shape * Row decoder branch in _resultset.parse_tuple_payload * Wire constants SQ_LODATA=97, SQ_FPROUTINE=103, SQ_FPARAM=104 Tests: * 11 unit tests in test_blob_locator_unit.py (no Informix needed) — construction, immutability, equality, hash, repr safety, size validation. * 4 integration tests in test_smart_lob.py — fixture seeds via JDBC reference client (smart-LOB writes also need deferred protocols). * RefBlob.java helper in tests/reference/ for seeding via JDBC. Total: 64 unit + 111 integration = 175 tests. Locator design note: __repr__ omits the raw bytes (they're opaque to the client). Same-bytes locators of different families compare unequal — BlobLocator(x) != ClobLocator(x) — to avoid silent type confusion.	2026-05-04 13:26:15 -06:00
Ryan Malloy	4dafbf8ce9	Phase 6.d: INTERVAL decoding (both qualifier families) Implements row-decoding for IDS INTERVAL, the last common temporal type. The qualifier short bisects the type at the wire level: start_TU >= DAY maps to datetime.timedelta (day-fraction), start_TU <= MONTH maps to a new informix_db.IntervalYM (year-month). Wire format mirrors DATETIME exactly — `[head byte][digit pairs in base-100]`, with the qualifier dictating field interpretation. The fraction-to-nanoseconds scaling (`scale_exp = 18 - end_TU`, forced odd) is the JDBC pattern from `Decimal.fromIfxToArray`. IntervalYM is a frozen dataclass holding signed total months, with `years` and `remainder_months` as derived properties. Matches JDBC's `IntervalYM.months` shape rather than a (years, months) tuple — avoids ambiguity around what "negative" means for a multi-field tuple. Tests: 13 unit (synthetic byte streams covering all decoder branches) + 9 integration (real Informix queries spanning DAY TO SECOND, HOUR TO SECOND, YEAR TO MONTH, negatives, NULL, and mixed-family rows). Total test count: 53 unit + 82 integration = 135. Encoder for INTERVAL parameter binding is deferred to a later phase (same arc as DECIMAL/DATETIME — decode lands first).	2026-05-04 12:22:07 -06:00
Ryan Malloy	a1bd52788d	Phase 2: SELECT works end-to-end — pure-Python Informix fully reads data cursor.execute("SELECT 1 FROM systables WHERE tabid = 1") cursor.fetchone() == (1,) To my knowledge, this is the first time a pure-Python implementation has read data from Informix without wrapping IBM's CSDK or JDBC. Three breakthroughs in this commit: 1. Login PDU's database field is BROKEN. Passing a database name there makes the server reject subsequent SQ_DBOPEN with sqlcode -759 ("database not available"). JDBC always sends NULL in the login PDU's database slot — we now do the same. The user-supplied database opens via SQ_DBOPEN in _init_session. 2. Post-login session init dance: SQ_PROTOCOLS (8-byte feature mask replayed verbatim from JDBC) → SQ_INFO with INFO_ENV + env vars (48-byte PDU replayed verbatim — DBTEMP=/tmp, SUBQCACHESZ=10) → SQ_DBOPEN. Without all three steps in this exact order, the server silently ignores SELECTs. 3. SQ_DESCRIBE per-column block has 10 fields per column (not the simple "name + type" my best-effort parser assumed): fieldIndex, columnStartPos, columnType, columnExtendedId, ownerName, extendedName, reference, alignment, sourceType, encodedLength. The string table at the end is offset-indexed (fieldIndex points into it), which is how JDBC handles disambiguation. Cursor lifecycle implementation in cursors.py mirrors JDBC exactly: PREPARE+NDESCRIBE+WANTDONE → DESCRIBE+DONE+COST+EOT CURNAME+NFETCH(4096) → TUPLE*+DONE+COST+EOT NFETCH(4096) → DONE+COST+EOT (drain) CLOSE → EOT RELEASE → EOT Five round trips per SELECT — same as JDBC. Module changes: src/informix_db/connections.py — added _init_session(), _send_protocols(), _send_dbopen(), _drain_to_eot(), _raise_sq_err(); login PDU now forces database=None always; SQ_INFO PDU replayed verbatim from JDBC capture (offset-indexed env-var format too gnarly to derive in MVP). 
src/informix_db/cursors.py — full rewrite: real PDU builders for PREPARE/CURNAME+NFETCH/NFETCH/CLOSE/RELEASE; tag-dispatched response readers; cursor-name generator matching JDBC's "_ifxc" convention. src/informix_db/_resultset.py — proper SQ_DESCRIBE parser per JDBC's receiveDescribe (USVER mode); offset-indexed string table with name lookup by fieldIndex; ColumnInfo dataclass with raw type-code preserved for null-flag extraction. src/informix_db/_messages.py — added SQ_NDESCRIBE=22, SQ_WANTDONE=49. Test coverage: 40 unit + 15 integration tests (7 smoke + 8 new SELECT) = 55 total, all green, ruff clean. New tests cover: - SELECT 1 returns (1,) - cursor.description shape per PEP 249 - Multi-row INT SELECT - Multi-column mixed types (INT + FLOAT) - Iterator protocol (for row in cursor) - fetchmany(n) - Re-executing on same cursor resets state - Two cursors on one connection (sequential) Known gap: VARCHAR row decoding doesn't yet handle the variable-width on-wire encoding correctly. Phase 2.x will address — for now NotImpl errors surface raw bytes in the row tuple.	2026-05-03 15:37:10 -06:00
Ryan Malloy	9b1fd8af2c	Phase 1: pure-Python SQLI login works end-to-end This commit takes informix-db from documentation-only (Phase 0 spike) to a functional connect() / close() against a real Informix server. To our knowledge, this is the first pure-socket Informix client in any language — no CSDK, no JVM, no native libraries. Layered architecture per the plan, mirroring PyMySQL's shape: src/informix_db/ __init__.py — PEP 249 surface (connect, exceptions, paramstyle="numeric") exceptions.py — full PEP 249 hierarchy declared up front _socket.py — raw socket I/O (read_exact, write_all, timeouts) _protocol.py — IfxStreamReader / IfxStreamWriter framing primitives (big-endian, 16-bit-aligned variable payloads, length-prefixed nul-terminated strings) _messages.py — SQ_* tags from IfxMessageTypes + ASF/login markers _auth.py — pluggable auth handlers; plain-password is the only Phase-1 implementation connections.py — Connection class: builds the binary login PDU (SLheader + PFheader byte-for-byte per PROTOCOL_NOTES.md §3), sends it, parses the server response, wires up close() Phase 1 design decisions locked in DECISION_LOG.md: - paramstyle = "numeric" (matches Informix ESQL/C convention) - Python >= 3.10 - autocommit defaults to off (PEP 249 implicit) - License: MIT - Distribution name: informix-db (verified PyPI-available) Test coverage: 34 unit tests (codec round-trips against synthetic byte streams; observed login-PDU values from the spike captures asserted as exact byte literals) + 6 integration tests (connect, idempotent close, context manager, bad-password → OperationalError, bad-host → OperationalError, cursor() raises NotImplementedError). pytest — runs 34 unit tests, no Docker needed pytest -m integration — runs 6 integration tests against the Developer Edition container (pinned by digest in tests/docker-compose.yml) pytest -m "" — runs everything ruff is clean across src/ and tests/. 
One bug found during smoke testing: threading.get_ident() can exceed signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed the same way the JDBC reference does — clamp to signed 32-bit, fall back to 0 if out of range. The field is diagnostic only. One protocol-level observation that AMENDED the JDBC source reading: the "capability section" in the login PDU is three independently negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one int + 8 reserved zero bytes as my CFR decompile read suggested. The server echoes them back identically. Trust the wire over the decompiler. Phase 1 verification matrix (from PROTOCOL_NOTES.md §12): - Login byte layout: confirmed (server accepts our pure-Python PDU) - Disconnection: confirmed (SQ_EXIT round-trip works) - Framing primitives: confirmed (34 unit tests) - Error path: bad password → OperationalError, bad host → OperationalError Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard unknowns there — exact column-descriptor layout, statement-time error format — were called out as bounded gaps in Phase 0 and have existing captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize against.	2026-05-02 19:10:24 -06:00
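
The thread-id overflow described in the Phase 1 commit can be reproduced and worked around in a few lines. This is a minimal sketch rather than the project's actual code: the helper name `clamp_int32` and the two range constants are hypothetical, and the only behavior assumed is what the commit message states (a value outside the signed 32-bit range falls back to 0, because the field is diagnostic only).

```python
import struct
import threading

# Bounds of a signed 32-bit integer, the range struct.pack("!i") accepts.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def clamp_int32(value: int) -> int:
    """Return value if it fits a signed 32-bit int, else 0 (hypothetical helper)."""
    return value if INT32_MIN <= value <= INT32_MAX else 0


# struct.pack("!i", ...) raises struct.error for out-of-range values,
# so the thread id is sanitized before it is packed big-endian.
packed = struct.pack("!i", clamp_int32(threading.get_ident()))
```

Falling back to 0 rather than masking the low 32 bits keeps the field obviously "unknown" instead of reporting a truncated, misleading id.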

5 Commits
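
The locator design note in the Phase 9 commit (72-byte frozen wrapper, raw bytes omitted from `__repr__`, BLOB and CLOB locators never equal) maps naturally onto Python dataclasses. The following is a hedged sketch of one way to get those properties, not the code in the repository: `BlobLocator` and `ClobLocator` are the names from the commit, while the shared `_Locator` base and the `LOCATOR_SIZE` constant are assumptions made for illustration.

```python
from dataclasses import dataclass, field

LOCATOR_SIZE = 72  # locator size reported in the Phase 9 commit


@dataclass(frozen=True)
class _Locator:
    """Hypothetical shared base: an opaque, immutable server-side reference."""

    # repr=False keeps the opaque bytes out of the generated __repr__.
    raw: bytes = field(repr=False)

    def __post_init__(self) -> None:
        if len(self.raw) != LOCATOR_SIZE:
            raise ValueError(
                f"expected {LOCATOR_SIZE} bytes, got {len(self.raw)}"
            )


@dataclass(frozen=True)
class BlobLocator(_Locator):
    pass


@dataclass(frozen=True)
class ClobLocator(_Locator):
    pass


payload = bytes(LOCATOR_SIZE)  # 72 zero bytes standing in for a real locator
blob, clob = BlobLocator(payload), ClobLocator(payload)
```

The dataclass-generated `__eq__` only compares instances of the exact same class, so `blob != clob` holds even though both wrap identical bytes, and `frozen=True` supplies both immutability and a usable `__hash__`.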