diff --git a/docs/DECISION_LOG.md b/docs/DECISION_LOG.md
index 789abfc..917750e 100644
--- a/docs/DECISION_LOG.md
+++ b/docs/DECISION_LOG.md
@@ -469,6 +469,66 @@ INTERVAL parameter binding (encoder) is deferred to Phase 6.e or later — same
 
 ---
 
+## 2026-05-04 — Phase 6.f research: BYTE / TEXT / BLOB / CLOB protocol scope
+
+**Status**: research complete; implementation deferred
+**Decision**: Decoupling LOB types into their own phase. The four "LOB" types split into two protocol families with materially different wire-level cost:
+
+### Protocol family A: BYTE (type=11) and TEXT (type=12) — legacy in-row-pointed blobs
+
+**Server-side requirements** (verified empirically against the IBM dev container 15.0.1.0.3DE):
+- A blobspace must exist (`onspaces -c -b blobspace1 -p ... -o 0 -s 50000`)
+- The database must be logged (`CREATE DATABASE testdb WITH LOG`)
+- The column declaration must place data in the blobspace: `data BYTE IN blobspace1`
+
+**Even with all that, BYTE/TEXT cannot be inserted via SQL literals.** I verified by running `dbaccess - test_byte.sql` with `INSERT INTO t VALUES (1, "0x68656c6c6f")` and getting:
+
+```
+617: A blob data type must be supplied within this context.
+```
+
+This is a hard server-side restriction: blob data **must** arrive via the binary BBIND wire path. There is no string-literal escape hatch.
+
+**Wire protocol** (per `IfxSqli.sendBind` line 844, `sendBlob` line 3328, `sendStreamBlob` line 3482):
+
+1. **SQ_BIND** (tag 5): per-param block declares the BYTE/TEXT slot but the inline data is a **56-byte blob descriptor** (per `IfxBlob.toIfx` line 162) — mostly zeros, with the size at offset [16:20] as a 4-byte big-endian int. Byte 39 is the null indicator (1 = null).
+2. **SQ_BBIND** (tag 41): `[short tag=41][short blob_count]` — the count of BYTE/TEXT params being streamed.
+3. **For each BYTE/TEXT param**: stream of `SQ_BLOB` (tag 39) chunks: `[short tag=39][short length][padded data]`. Chunks max out at 1024 bytes per `sendStreamBlob`.
+4. **End-of-blob marker**: a final `SQ_BLOB` with `[short tag=39][short length=0]`.
+5. Then SQ_EXECUTE proceeds normally.
+
+**Decoder side**: rows containing BYTE/TEXT have a 56-byte descriptor in the SQ_TUPLE payload (per `IfxRowColumn.loadColumnData` switch case for type 11/12 reading 56 bytes). Then a separate stream of SQ_BLOB tags arrives **between** SQ_TUPLE messages, carrying the actual bytes.
+
+**Estimated implementation cost**: substantial. Cursor state machine needs to:
+- Detect `bytes`/`str`-meant-as-TEXT params and route them through SQ_BBIND after SQ_BIND
+- Send the 56-byte descriptor as the inline placeholder
+- Stream chunks ≤1024 bytes each
+- On the read path, parse SQ_BLOB tags between SQ_TUPLE messages and reassemble per-column
+
+This is a multi-day effort and warrants its own phase, **Phase 7+**.
+
+### Protocol family B: BLOB (type=102) and CLOB (type=101) — smart-LOBs with locators
+
+**Server-side requirements**: an sbspace (smart-LOB space), more complex than blobspace. (Verified: `onspaces -c -S sbspace1 ...`).
+
+**Wire protocol**: even more involved than BYTE/TEXT. Per `IfxLobInputStream` and `IfxSmartBlob`, smart-LOB access uses an LO_OPEN/LO_READ/LO_WRITE/LO_CLOSE session protocol against the sbspace, with handles called *locators* that travel inline in the SQ_TUPLE while the actual bytes go over a separate channel. JDBC's `IfxLocator` is a 56-byte descriptor (same shape as the BYTE descriptor!) but carries semantic meaning: storage type, sbspace ID, partition number, etc.
+
+**Estimated implementation cost**: substantial++ — significantly larger than BYTE/TEXT, because we'd need to implement the LO_* RPC sub-protocol entirely.
+
+### Decision
+
+**Phase 6.f is closed as research-complete** with this entry as the deliverable. The findings replace assumptions (e.g., "BLOB/CLOB will be similar to INTERVAL") with actual protocol facts. Implementation is split into:
+- **Phase 8** (future): BYTE/TEXT bind+read with the SQ_BBIND/SQ_BLOB wire machinery
+- **Phase 9** (future): smart-LOB BLOB/CLOB with the LO_OPEN/LO_READ session protocol
+
+In the meantime, **users who need to insert binary data** can use the existing `LVARCHAR` path via `str` (works for binary if encoded with `iso-8859-1`) up to ~32K — which is the LVARCHAR on-wire limit. Not a substitute for true BYTE/TEXT but covers many practical cases.
+
+The constants `SQ_BBIND=41`, `SQ_BLOB=39`, `SQ_FETCHBLOB=38`, `SQ_SBBIND=52`, `SQ_FILE_READ=106`, `SQ_FILE_WRITE=107` are already declared in `_messages.py` from earlier scaffolding — the protocol layer is ready when implementation lands.
+
+**Honest scope-discovery moment**: I went into Phase 6.f assuming it'd be similar effort to INTERVAL. Reading the wire protocol revealed a different shape entirely — multi-PDU sequences require state-machine surgery, not just new codecs. Pivoting now (instead of half-implementing) is the right call.
+
+---
+
 ## (template — copy below this line for new entries)
 
 ```