Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log:

SQ_PREPARE format (validated against both observed PREPAREs):
  [short SQ_PREPARE=2]
  [short flags=0]
  [int sqlLen]          ← SQL byte count, NOT including nul
  [bytes sql]
  [byte 0]              ← nul terminator
  [short 0x0016]        ← observed 22; cursor options? statement type?
  [short 0x0031]        ← observed 49; identical across both PREPAREs
  [short SQ_EOT=12]

SQ_TUPLE format (definitive):
  [short SQ_TUPLE=14]
  [int 0]               ← flags / reserved
  [short payloadLen]
  [bytes payload]       ← column values back-to-back, per type encoding

SQ_DONE format (partial; see PROTOCOL_NOTES.md §6e for what's known)

JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE → ID(3=cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) → ID(11=release)) is documented in §6c. The action codes inside SQ_ID roughly map to other SQ_* tag values from IfxMessageTypes. For the Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate) first, since it might let us skip the cursor lifecycle for parameterless queries.

New modules:

src/informix_db/_types.py: IfxType IntEnum ported from com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0, SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101, BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers base_type() / is_nullable() to strip and inspect the flag byte.

src/informix_db/converters.py: wire-bytes → Python decoders for the Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR, VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table for the row decoder. ENCODERS dict declared empty (Phase 4 fills it in for parameter binding). DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int day count → datetime.date. Smoke-tested decoders all return correct Python values.

Cursor / _resultset implementation NOT in this commit: they need deeper SQ_DESCRIBE byte-layout analysis and the SQ_ID sub-action vocabulary characterization. Both are bounded-but-substantial Phase 2 tasks deferred to a fresh session.

40 unit tests still passing, ruff clean.
# SQLI Wire Protocol Notes

> **Phase 0 spike artifact.** Byte-level reference for the Informix SQLI wire protocol, derived from clean-room study of the decompiled IBM Informix JDBC driver (`ifxjdbc.jar`, `Implementation-Version: 4.50.10-SNAPSHOT` / printable `4.50.JC10`, build 146 from 2023-03-07; SHA256 in `JDBC_NOTES.md`). The cross-check against packet captures of the reference exchange is still pending.

---

## Source attribution conventions

Each documented byte sequence is tagged with its source of evidence and its verification status:

- 🟡 **JDBC**: cross-referenced against `<class>#<method>()` in the decompiled tree (see `JDBC_NOTES.md`)
- 🔵 **PCAP**: observed in `docs/CAPTURES/<file>.pcap` at offset `<n>`
- ✅ **CONFIRMED**: corroborated by both 🟡 and 🔵
- 🟠 **UNVERIFIED**: only one of the two sources

**Current state** (2026-05-02): every finding below comes from reading the JDBC driver (🟡). PCAP capture is pending (Phase 0 task #7) and is required for ✅, so treat everything below as 🟠 until the PCAP cross-check is done.

---

## 1. Wire framing primitives 🟡

### Endianness

**Big-endian (network byte order) for all multi-byte integers.**

Source: `com.informix.lang.JavaToIfxType.JavaToIfxInt(int)`, lines 51-54:
```java
public static byte[] JavaToIfxInt(int i) {
    byte[] b = new byte[]{(byte)(i >> 24), (byte)(i >> 16), (byte)(i >> 8), (byte)i};
    return b;
}
```
Same pattern for `JavaToIfxSmallInt` (2 bytes) and `JavaToIfxLongBigInt` (8 bytes). Java's `DataOutputStream` always writes big-endian; Informix matches that.

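The same packing can be sketched in Python with the standard `struct` module, where the `>` prefix selects big-endian with standard sizes. The function names below are hypothetical and are not taken from the driver or from this project's code.

```python
import struct


def pack_smallint(value: int) -> bytes:
    """2-byte signed big-endian, the counterpart of JavaToIfxSmallInt."""
    return struct.pack(">h", value)


def pack_int(value: int) -> bytes:
    """4-byte signed big-endian, the counterpart of JavaToIfxInt."""
    return struct.pack(">i", value)


def pack_bigint(value: int) -> bytes:
    """8-byte signed big-endian, the counterpart of JavaToIfxLongBigInt."""
    return struct.pack(">q", value)


def unpack_int(data: bytes) -> int:
    """Inverse of pack_int: read one 4-byte signed big-endian integer."""
    return struct.unpack(">i", data)[0]
```

For example, `pack_int(0x01020304)` yields the bytes `01 02 03 04`, most significant byte first, which is what the Java shifts above produce.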
### Width table

| Wire type | Width (bytes) | Java method | Notes |
|-----------|---------------|-------------|-------|
| SmallInt | 2 | `writeSmallInt(short)` / `writeShort` | SMALLINT, message-type tags, length prefixes |
| Int | 4 | `writeInt(int)` | INTEGER, capability flags, sizes |
| LongBigInt | 8 | `writeLongBigint(long)` | BIGINT (use this for 64-bit ints) |
| LongInt | 10 | `writeLongInt(long)` | **Legacy variable-numeric LONG INT; skip for MVP**, predates 64-bit ints |
| Real | 4 | `writeReal(float)` | IEEE 754 single |
| Double | 8 | `writeDouble(double)` | IEEE 754 double |
| Date | 4 | `writeDate(Date)` | day-count from Informix epoch (1899-12-31, to confirm) |

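As a rough illustration of how the table could drive a decoder, the sketch below maps each fixed-width wire type to a `struct` format and converts the Date day-count to `datetime.date`. The names `FIXED_FORMATS`, `INFORMIX_EPOCH` and `read_fixed` are hypothetical. The 1899-12-31 epoch and the signedness of the day-count are assumptions, since the table marks the epoch as still to be confirmed. LongInt is left out because it is skipped for the MVP.

```python
import datetime
import struct

# Assumption: day 0 is 1899-12-31 (marked "to confirm" in the width table).
INFORMIX_EPOCH = datetime.date(1899, 12, 31)

# Hypothetical mapping from wire type to a big-endian struct format.
FIXED_FORMATS = {
    "SmallInt": ">h",    # 2 bytes
    "Int": ">i",         # 4 bytes
    "LongBigInt": ">q",  # 8 bytes
    "Real": ">f",        # IEEE 754 single
    "Double": ">d",      # IEEE 754 double
    "Date": ">i",        # 4-byte day count (signedness assumed)
}


def read_fixed(wire_type: str, buf: bytes, offset: int = 0):
    """Decode one fixed-width value and return (value, next_offset)."""
    fmt = FIXED_FORMATS[wire_type]
    (value,) = struct.unpack_from(fmt, buf, offset)
    next_offset = offset + struct.calcsize(fmt)
    if wire_type == "Date":
        value = INFORMIX_EPOCH + datetime.timedelta(days=value)
    return value, next_offset
```

Returning the next offset lets a row decoder walk a buffer of back-to-back column values without tracking widths separately.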
### Variable-length encoding (string, decimal, datetime, interval, BYTE, TEXT)

```
[short length][bytes payload][optional 0x00 pad if length is odd]
```

**The 16-bit alignment is a hard requirement.** Source: `IfxDataOutputStream.writePadded(byte[])`:
```java
public void writePadded(byte[] b) throws IOException {
    this.write(b, 0, b.length);
    if ((b.length & 1) >= 1) {
        this.write(0);
    }
}
```
`IfxDataInputStream.readPadded` mirrors this on the read side. Every variable-payload message needs this padding; without it the next short read misaligns and the entire parser desynchronizes.

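A minimal Python sketch of the padded write and its matching read, assuming file-like binary streams. The helper names (`write_padded`, `write_var`, `read_var`) are hypothetical. The length prefix is treated as a signed 16-bit value, as a Java `short` would be, so payloads longer than 32767 bytes are not handled here.

```python
import io
import struct


def write_padded(out, payload: bytes) -> None:
    """Write the payload, then one 0x00 byte if its length is odd."""
    out.write(payload)
    if len(payload) & 1:
        out.write(b"\x00")


def write_var(out, payload: bytes) -> None:
    """Write [short length][payload][optional pad]."""
    out.write(struct.pack(">h", len(payload)))
    write_padded(out, payload)


def read_exact(inp, count: int) -> bytes:
    """Read exactly count bytes or raise EOFError."""
    data = inp.read(count)
    if len(data) != count:
        raise EOFError(f"wanted {count} bytes, got {len(data)}")
    return data


def read_var(inp) -> bytes:
    """Read one length-prefixed payload and consume its pad byte, if any."""
    (length,) = struct.unpack(">h", read_exact(inp, 2))
    payload = read_exact(inp, length)
    if length & 1:
        read_exact(inp, 1)  # discard the alignment pad
    return payload
```

Writing `b"abc"` followed by `b"hi"` to an `io.BytesIO` produces `00 03 61 62 63 00 00 02 68 69`. The pad byte after `abc` is what keeps the second length prefix on an even offset.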
### String encoding

`com.informix.lang.JavaToIfxType.JavaToIfxChar(String)` returns:
```
[short length+1][bytes][0x00 nul terminator]
```
The `+1` accounts for the trailing nul, so a string that encodes to N bytes takes `2 + N + 1` bytes, then is padded to an even length.

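The layout above could be encoded and decoded roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the `iso-8859-1` default encoding is a placeholder rather than something read from the driver, and the pad byte is applied when the nul-terminated payload has odd length, following the `writePadded` rule.

```python
import struct


def encode_ifx_char(text: str, encoding: str = "iso-8859-1") -> bytes:
    """Encode as [short length+1][bytes][0x00], padded to an even length."""
    raw = text.encode(encoding) + b"\x00"  # payload includes the nul
    out = struct.pack(">h", len(raw)) + raw
    if len(raw) & 1:
        out += b"\x00"  # alignment pad (assumed to follow writePadded)
    return out


def decode_ifx_char(buf: bytes, offset: int = 0, encoding: str = "iso-8859-1"):
    """Decode one string and return (text, next_offset)."""
    (length,) = struct.unpack_from(">h", buf, offset)
    start = offset + 2
    raw = buf[start:start + length]
    if len(raw) != length or not raw.endswith(b"\x00"):
        raise ValueError("truncated or unterminated string payload")
    next_offset = start + length + (length & 1)  # skip the pad if present
    return raw[:-1].decode(encoding), next_offset
```

With these helpers `"abc"` becomes `00 04 61 62 63 00` (6 bytes, no pad) and `"ab"` becomes `00 03 61 62 00 00` (5 bytes plus one pad).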
Note: there is also a `JavaToIfx4BytesChar` variant for wide-char strings, probably for GLS multibyte locales. Phase 6+.

---

## 2. Connection establishment 🟡

### TCP setup

- Default port: **9088** (native SQLI). Port 9089 is SSL.
- `Socket.connect(InetSocketAddress(host, port), loginTimeout)`
- `setTcpNoDelay(true)`: Nagle off
- `setKeepAlive(socKeepAlive)`: opt-in via the `IFX_SOC_KEEPALIVE` property
- Optional SSL wrap via `SSLSocketFactory`. **Phase 6+; skip for MVP.**
- Buffered streams: `BufferedInputStream(in, 4096)` / `BufferedOutputStream(out, 4096)` over the raw socket streams. `IfxDataInputStream` / `IfxDataOutputStream` wrap the buffered streams.

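The setup steps above might translate to Python's `socket` module along these lines. This is a sketch, not the project's implementation: `configure_socket` and `open_sqli_streams` are hypothetical names, SSL is omitted, and clearing the timeout after connect is an assumption made so that the login timeout only bounds the connect call.

```python
import socket

SQLI_DEFAULT_PORT = 9088
STREAM_BUF_SIZE = 4096  # same size as the Java buffered streams


def configure_socket(sock: socket.socket, keepalive: bool = False) -> None:
    """Apply the per-socket options listed above."""
    # Nagle off, the counterpart of setTcpNoDelay(true).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Keepalive is opt-in, the counterpart of setKeepAlive(socKeepAlive).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1 if keepalive else 0)


def open_sqli_streams(host, port=SQLI_DEFAULT_PORT, login_timeout=None, keepalive=False):
    """Connect and return (sock, reader, writer) with 4096-byte buffers."""
    sock = socket.create_connection((host, port), timeout=login_timeout)
    try:
        configure_socket(sock, keepalive)
        sock.settimeout(None)  # assumption: timeout applies to connect only
        reader = sock.makefile("rb", buffering=STREAM_BUF_SIZE)
        writer = sock.makefile("wb", buffering=STREAM_BUF_SIZE)
    except OSError:
        sock.close()
        raise
    return sock, reader, writer
```

Because the writer is buffered, each outgoing PDU needs an explicit `writer.flush()` before the client waits for the reply.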
### Capability bits (driving login path selection)

- `capabilities == 0` → legacy text-mode login (`EncodeAscString`)
- `capabilities > 0` → modern binary login (`encodeAscBinary`); **this is what we implement**

The capabilities bitfield itself is opaque from the client side; the JDBC driver computes it from opt-props. For the MVP we will hardcode a sensible value matching what JDBC sends on a vanilla connection (to be derived from PCAP).

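The selection rule can be written down as a small function. This is only a sketch: `select_login_encoder` is a hypothetical name, the returned strings simply echo the JDBC method names listed above, and rejecting negative values is an assumption, since the notes do not say how the driver treats them.

```python
def select_login_encoder(capabilities: int) -> str:
    """Return the name of the JDBC login encoder for a capabilities value."""
    if capabilities == 0:
        return "EncodeAscString"  # legacy text-mode login
    if capabilities > 0:
        return "encodeAscBinary"  # modern binary login (the MVP path)
    # Assumption: negative values are treated as invalid here.
    raise ValueError(f"unexpected capabilities value: {capabilities}")
```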
### Other constants seen in `Connection.java`

| Constant | Value | Notes |
|----------|-------|-------|
| `MAX_BUFF_SIZE` | 32768 | upper bound for individual PDUs |
| `MIN_BUFF_SIZE` | 140 | lower bound |
| `STREAM_BUF_SIZE` | 4096 | socket buffer size |
| `PFCONREQ_BUF_SIZE` | 2048 | login request buffer |
| `SL_HEADER_SIZE` | 6 | the SLheader is 6 bytes |
| `applType` | `"sqlexec |