informix-db

warehack.ing/informix-db

Commit Graph


Author

SHA1

Message

Date

Ryan Malloy

52259f0152

Phase 8: BYTE/TEXT bind+read via SQ_BBIND/SQ_BLOB/SQ_FETCHBLOB

Implements end-to-end round-trip for BYTE (type 11) and TEXT (type 12)
columns. Python bytes/bytearray map to BYTE; str is auto-encoded as
ISO-8859-1 for TEXT.

Wire protocol — write side:
* SQ_BIND payload carries a 56-byte blob descriptor with size at offset
  [16..19] (per IfxBlob.toIfx). NULL is byte 39=1.
* After all per-param blocks, SQ_BBIND (41) declares blob count, then
  chunked SQ_BLOB (39) messages stream the actual bytes (max 1024
  bytes/chunk per JDBC), terminated by zero-length SQ_BLOB.
* Then SQ_EXECUTE proceeds normally.
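
The chunking rule on the write side can be sketched in a few lines of Python. This is a minimal illustration rather than the driver's code: `iter_blob_chunks` and `MAX_CHUNK` are hypothetical names, and only the payload splitting is shown (the SQ_BLOB message framing is left out).

```python
from typing import Iterator

MAX_CHUNK = 1024  # max payload bytes per SQ_BLOB chunk, per the JDBC behaviour noted above


def iter_blob_chunks(data: bytes, chunk_size: int = MAX_CHUNK) -> Iterator[bytes]:
    """Yield payload chunks of at most chunk_size bytes, then an empty terminator."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
    yield b""  # the zero-length chunk marks the end of the blob stream


# A 5120-byte value splits into five full chunks plus the terminator.
chunks = list(iter_blob_chunks(b"x" * 5120))
```

An empty value produces only the terminator, so the receiver sees a well-formed stream in that case too.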

Wire protocol — read side:
* SQ_TUPLE returns only the 56-byte descriptor; actual bytes live in
  the blobspace.
* For each BYTE/TEXT column in each row, send SQ_FETCHBLOB with the
  descriptor and read SQ_BLOB chunks until zero-length terminator.
* The locator is only valid while the cursor is open — must dereference
  BEFORE sending CLOSE. Doing it after returns -602 (Cannot open blob).
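
The read-side loop amounts to concatenating chunk payloads until the zero-length terminator arrives. The sketch below assumes the SQ_BLOB payloads have already been extracted from the wire and are handed over as an iterable of `bytes`; `read_blob` is a hypothetical helper name, not a function from the driver.

```python
from typing import Iterable


def read_blob(chunks: Iterable[bytes]) -> bytes:
    """Concatenate SQ_BLOB payloads until the zero-length terminator."""
    buf = bytearray()
    for chunk in chunks:
        if not chunk:  # zero-length chunk: the blob is complete
            return bytes(buf)
        buf += chunk
    # Running out of chunks without a terminator means the stream was cut short.
    raise ValueError("blob stream ended without a zero-length terminator")


value = read_blob([b"ab", b"cd", b""])
```

Raising on a missing terminator, rather than returning a partial value, keeps a truncated stream from being mistaken for a short blob.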

Server-side prerequisites (one-time setup):
1. blobspace: onspaces -c -b blobspace1 -p /path -o 0 -s 50000
2. logged DB: CREATE DATABASE testdb WITH LOG
3. config + archive:
     onmode -wm LTAPEDEV=/dev/null
     onmode -wm TAPEDEV=/dev/null
     onmode -l
     ontape -s -L 0 -t /dev/null

Without #3, JDBC fails identically to our driver with "BLOB pages can't
be allocated from a chunk until chunk add is logged". This identical
failure was the diagnostic confirmation that our protocol bytes were
correct — same server response = byte-for-byte parity.

Tests: 9 integration tests in tests/test_blob.py — single-chunk,
multi-chunk (5120 bytes), NULL, multi-row, binary-safe, TEXT roundtrip,
ISO-8859-1, NULL TEXT, mixed columns. Plus the Phase 4
test_unsupported_param_type_raises was updated since bytes is no longer
the canonical unsupported type — switched to a custom class.

Total: 53 unit + 107 integration = 160 tests.

The smart-LOB family (BLOB/CLOB) is a separate state-machine extension
deferred to Phase 9 — it uses IfxLocator + LO_OPEN/LO_READ session
protocol against sbspace, not the BBIND/BLOB stream.

2026-05-04 13:13:55 -06:00

Ryan Malloy

d508a489fd

Phase 4.x: parameterized SELECT, NULL row decoding, executemany()

Three Phase 4 follow-ups in one push, all with empirical wire analysis:

1. PARAMETERIZED SELECT
   cur.execute('SELECT tabname FROM systables WHERE tabid = ?', (1,))
   → ('systables',)
   Wire flow: PREPARE → DESCRIBE → SQ_BIND-only (no EXECUTE) →
   CURNAME+NFETCH → TUPLE+DONE → drain → CLOSE+RELEASE.
   The cursor open is what executes the prepared query; SQ_BIND just
   binds values into scope. No need for the IDESCRIBE handshake JDBC
   does for type discovery — server accepts our typed bind directly.

2. NULL ROW DECODING — per-type sentinel detection
   Each IDS type has its own NULL sentinel in tuple data:
     INT      → 0x80000000 (INT_MIN)
     BIGINT   → 0x8000000000000000 (LONG_MIN)
     SMALLINT → 0x8000 (SHORT_MIN)
     REAL     → all 0xFF (NaN bit pattern)
     FLOAT    → all 0xFF
     DATE     → 0x80000000 (same as INT)
     VARCHAR  → [byte 1][byte 0]  (length=1, single nul); distinguishable
                from empty '' which is [byte 0] (length=0)
   Verified by wire capture against the dev container — see
   docs/CAPTURES/19-py-null-vs-onechar.socat.log and
   docs/CAPTURES/20-py-int-null.socat.log.

   The VARCHAR null marker is the trickiest because it LOOKS like a
   1-byte string of nul, but VARCHAR can't contain embedded nuls
   anyway, so the byte-0 within length-1 is unambiguous.

3. executemany(sql, seq_of_params) — PEP 249 batched DML
   PREPARE once, loop SQ_BIND+SQ_EXECUTE per param set, RELEASE once.
   Performance: only ~1.06x faster than an execute() loop for 200
   INSERTs, because per-row round trips dominate. Phase 4.x optimization
   opportunity: chain BIND+EXECUTE in one PDU without an intermediate
   flush+read for true bulk performance (would likely give 5-10x).
   Documented in DECISION_LOG.md as a follow-up.
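
The sentinel checks from item 2 can be sketched for two of the types. This is an illustration under stated assumptions, not the code in converters.py: the function names are hypothetical, tuple integers are assumed to be big-endian (as on the bind side), and VARCHAR bytes are assumed to decode as ISO-8859-1.

```python
import struct
from typing import Optional

INT_NULL = -0x80000000  # 0x80000000 read as a signed 32-bit value (INT_MIN)


def decode_int(raw: bytes) -> Optional[int]:
    """Decode a 4-byte INT, mapping the INT_MIN sentinel to None."""
    (value,) = struct.unpack(">i", raw)  # assumption: big-endian on the wire
    return None if value == INT_NULL else value


def decode_varchar(raw: bytes) -> Optional[str]:
    """raw = [length byte][payload]; [1][0] is NULL, [0] is the empty string."""
    length = raw[0]
    payload = raw[1:1 + length]
    if length == 1 and payload == b"\x00":
        return None
    return payload.decode("iso-8859-1")  # assumption: single-byte encoding
```

With these definitions, `decode_varchar(b"\x01\x00")` yields `None` while `decode_varchar(b"\x00")` yields `''`, which is the distinction the captures were recorded to confirm.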

Module changes:
  src/informix_db/converters.py:
    + Per-type NULL sentinel constants and detection in each decoder
    + Decoders now return None for sentinel values
  src/informix_db/cursors.py:
    + _execute_select_with_params() — SQ_BIND alone, then cursor open
    + _build_bind_only_pdu() — SQ_BIND without trailing SQ_EXECUTE
    + executemany() — loop BIND+EXECUTE, accumulate rowcount
    + execute() now dispatches to _execute_select_with_params for
      parameterized SELECT (was: NotSupportedError)

Tests: 40 unit + 47 integration (was 32; added 15 new) = 87 total,
all green, ruff clean. New test files / cases:
  tests/test_nulls.py (7) — NULL decoding for INT, BIGINT, FLOAT,
    REAL, VARCHAR, empty-vs-null, mixed columns
  tests/test_params.py — added 4 parameterized SELECT tests, 5
    executemany tests
  tests/test_smoke.py — updated cursor-with-params test (was Phase 1
    "raises", now Phase 4 "works")

Discovered captures kept for next-session debugging:
  docs/CAPTURES/18-py-null-rows.socat.log
  docs/CAPTURES/19-py-null-vs-onechar.socat.log
  docs/CAPTURES/20-py-int-null.socat.log

2026-05-04 11:11:50 -06:00

Ryan Malloy

509af9efa4

Phase 4: parameter binding (SQ_BIND) — int, float, str, bool, None

cur.execute("INSERT INTO t VALUES (?, ?, ?)", (42, "hello", 3.14))
cur.execute("INSERT INTO t VALUES (:1, :2)", (99, "world"))
cur.execute("UPDATE t SET name = ? WHERE id = ?", ("new", 2))
cur.execute("DELETE FROM t WHERE id = ?", (5,))
# all work end-to-end against a real Informix server

Two breakthroughs decoded from JDBC:

1. SQ_BIND PDU shape (chained with SQ_EXECUTE in one PDU, no separate
   round trip):
     [short SQ_ID=4][int SQ_BIND=5][short numparams]
     for each param:
       [short type][short indicator][short prec_or_encLen]
       writePadded(rawbytes)
     [short SQ_EXECUTE=7][short SQ_EOT]

2. Strings are sent as CHAR (type=0) not VARCHAR (type=13). The server
   handles conversion to the actual column type via internal CIDESCRIBE
   — we don't need to do it explicitly.

Per-type encoding (Phase 4 MVP):
  int (32-bit) → IDS INT (type=2), prec=0x0a00 (packed width=10/scale=0),
                  4-byte BE
  int (64-bit) → IDS BIGINT (type=52), prec=0x1300, 8-byte BE
  str          → IDS CHAR (type=0), prec=0, [short len][bytes][pad]
  float        → IDS FLOAT (type=3), prec=0, 8-byte IEEE 754
  bool         → IDS BOOL (type=45), prec=0, 1 byte
  None         → indicator=-1, no data
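
The numeric rows of this table map naturally onto `struct`. The sketch below is hypothetical (`encode_number` is not the driver's `encode_param`), covers only bool, int and float, and makes three labelled assumptions: FLOAT is sent big-endian like the integers, a bool is sent as a single 0/1 byte, and INT_MIN is routed to BIGINT because 0x80000000 doubles as the INT NULL sentinel.

```python
import struct
from typing import Tuple

# Type codes taken from the table above.
IDS_INT, IDS_FLOAT, IDS_BOOL, IDS_BIGINT = 2, 3, 45, 52


def encode_number(value) -> Tuple[int, int, bytes]:
    """Return (type, prec, raw bytes) for bool, int and float values."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return IDS_BOOL, 0, bytes([int(value)])  # assumption: one 0/1 byte
    if isinstance(value, int):
        if -2**31 < value < 2**31:  # assumption: INT_MIN excluded (NULL sentinel)
            return IDS_INT, 0x0A00, struct.pack(">i", value)
        return IDS_BIGINT, 0x1300, struct.pack(">q", value)
    if isinstance(value, float):
        return IDS_FLOAT, 0, struct.pack(">d", value)  # assumption: big-endian
    raise NotImplementedError(f"unsupported parameter type: {type(value).__name__}")
```

The bool check has to come before the int check, otherwise `True` would be encoded as the INT value 1.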

The integer "precision" field is PACKED — initially looked like a bug
(why would precision be 2560?) until I realized 0x0a00 = (10 << 8) | 0
= packed display-width and scale. Captured this surprise in
DECISION_LOG.md.
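
The packing arithmetic is easy to check. The helper names below are hypothetical, and treating the low byte as an 8-bit scale field is an assumption drawn from the `(10 << 8) | 0` reading above.

```python
from typing import Tuple


def pack_precision(width: int, scale: int) -> int:
    """Pack display width into the high byte and scale into the low byte."""
    return (width << 8) | scale


def unpack_precision(packed: int) -> Tuple[int, int]:
    """Split a packed precision back into (width, scale)."""
    return packed >> 8, packed & 0xFF


int_prec = pack_precision(10, 0)         # 0x0a00, i.e. 2560 in decimal
bigint_fields = unpack_precision(0x1300)  # (19, 0) under the same reading
```

Read this way, the BIGINT value 0x1300 unpacks to a width of 19 with scale 0.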

Critical fix to execute-path branching: parameterized INSERT also
returns nfields > 0 (server describes the would-be inserted row).
Switched from "branch on nfields" to "branch on SQL keyword" — JDBC
does the same via its IfxStatement / IfxPreparedStatement subclassing.
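
Branching on the SQL keyword can be as small as inspecting the first word of the statement. This is a hypothetical sketch (`is_select` is not a name from cursors.py), and it deliberately ignores leading comments and parenthesized queries.

```python
def is_select(sql: str) -> bool:
    """True when the first word of the statement is SELECT (case-insensitive)."""
    words = sql.lstrip().split(None, 1)
    return bool(words) and words[0].upper() == "SELECT"


route = "select" if is_select("SELECT tabname FROM systables") else "dml"
```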

Numeric paramstyle support: cur.execute("... :1 ...", (val,)) works
by rewriting :N → ? before sending PREPARE. Trivial regex (doesn't
escape strings/comments — Phase 5 can add a proper SQL tokenizer).
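
A rewrite of this kind can be sketched with a single substitution. The pattern below is a guess at what such a regex might look like, not the actual `_NUMERIC_PLACEHOLDER_RE`, and it assumes placeholders appear in positional order since no reordering is done.

```python
import re

NUMERIC_PLACEHOLDER = re.compile(r":\d+")  # hypothetical pattern


def rewrite_numeric(sql: str) -> str:
    """Replace every :N placeholder with a positional ? marker."""
    return NUMERIC_PLACEHOLDER.sub("?", sql)


rewritten = rewrite_numeric("INSERT INTO t VALUES (:1, :2)")
# The known limitation: :N inside a string literal is rewritten as well.
clobbered = rewrite_numeric("SELECT ':1' FROM t WHERE id = :1")
```

The second call shows why a tokenizer is needed: the literal `':1'` becomes `'?'`, which changes the statement's meaning.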

Module changes:
  src/informix_db/converters.py:
    + encode_param() dispatcher
    + _encode_int / _encode_bigint / _encode_str / _encode_float / _encode_bool
  src/informix_db/cursors.py:
    + _build_bind_execute_pdu() — chains SQ_BIND + SQ_EXECUTE in one PDU
    + _execute_dml_with_params() — sends bind PDU, drains, releases
    + execute() now accepts parameters; rewrites :N → ?; branches by
      SQL keyword (SELECT vs DML)
    + _NUMERIC_PLACEHOLDER_RE for paramstyle="numeric" support

Tests: 40 unit + 32 integration (8 new parameter tests + 1 updated
smoke) = 72 total, all green, ruff clean. New tests cover:
  - INSERT with ? params
  - INSERT with :N params
  - INT + FLOAT + str round trip via INSERT then SELECT
  - UPDATE with params in SET and WHERE
  - DELETE with parameter in WHERE
  - Unsupported param type (bytes) raises NotImplementedError
  - Parameterized SELECT raises NotSupportedError (Phase 4.x)
  - Dict/named params raise NotSupportedError

Known gaps (Phase 4.x / Phase 5):
  - Parameterized SELECT (needs SQ_BIND before CURNAME+NFETCH)
  - NULL row decoding for VARCHAR (currently surfaces empty string)
  - Proper SQL tokenizer (so :N inside string literals is preserved)
  - bytes/datetime/Decimal parameter types

2026-05-04 10:54:32 -06:00

3 Commits