Phase 2 foundations: _types.py, converters.py, post-login protocol notes
Decoded the post-login execution flow from docs/CAPTURES/02-select-1.socat.log: SQ_PREPARE format (validated against both observed PREPAREs): [short SQ_PREPARE=2] [short flags=0] [int sqlLen] ← SQL byte count, NOT including nul [bytes sql] [byte 0] ← nul terminator [short 0x0016] ← observed 22; cursor options? statement type? [short 0x0031] ← observed 49; identical across both PREPAREs [short SQ_EOT=12] SQ_TUPLE format (definitive): [short SQ_TUPLE=14] [int 0] ← flags / reserved [short payloadLen] [bytes payload] ← column values back-to-back, per type encoding SQ_DONE format (partial — see PROTOCOL_NOTES.md §6e for what's known) JDBC's full prepare/fetch/release sequence (PREPARE → DESCRIBE → ID(3 =cursor name) → ID(9=NFETCH) → TUPLE → DONE → ID(10=close) → ID(11=release)) documented in §6c. The action codes inside SQ_ID roughly map to other SQ_* tag values from IfxMessageTypes. For Python MVP we'll likely try SQ_COMMAND=1 (execute-immediate) first — it might let us skip the cursor lifecycle for parameterless queries. New modules: src/informix_db/_types.py — IfxType IntEnum ported from com.informix.lang.IfxTypes. All IDS internal type codes (CHAR=0, SMALLINT=1, INT=2, ..., BOOLEAN=45, BIGINT=52, BIGSERIAL=53, CLOB=101, BLOB=102) plus the high-bit flags (NOTNULLABLE=0x100 etc) and helpers base_type() / is_nullable() to strip and inspect the flag byte. src/informix_db/converters.py — wire-bytes → Python decoders for the Phase-2 MVP type set: SMALLINT, INT, BIGINT, SMFLOAT, FLOAT, CHAR, VARCHAR, NCHAR, NVCHAR, LVARCHAR, BOOL, DATE. Plus FIXED_WIDTHS table for the row decoder. ENCODERS dict declared empty (Phase 4 fills it in for parameter binding). DATE handling uses Informix epoch (1899-12-31, day 0); 4-byte BE int day count → datetime.date. Smoke-tested decoders all return correct Python values. Cursor / _resultset implementation NOT in this commit — they need deeper SQ_DESCRIBE byte-layout analysis and the SQ_ID sub-action vocabulary characterization. Both are bounded-but-substantial Phase 2 tasks deferred to a fresh session. 40 unit tests still passing, ruff clean.
This commit is contained in:
parent
ea00990774
commit
ddac40ff0b
Binary file not shown.
86
src/informix_db/_types.py
Normal file
86
src/informix_db/_types.py
Normal file
@ -0,0 +1,86 @@
|
||||
"""IDS internal type-code constants.
|
||||
|
||||
Ported from ``com.informix.lang.IfxTypes`` (decompile in
|
||||
``build/jdbc-src/com/informix/lang/IfxTypes.java``). The type codes
|
||||
appear in column descriptors (in the ``SQ_DESCRIBE`` response) and in
|
||||
parameter descriptors. The high bits encode flags (nullability, etc.);
|
||||
strip them with ``base_type()`` to get the raw type code.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from enum import IntEnum
|
||||
|
||||
|
||||
class IfxType(IntEnum):
|
||||
"""IDS internal type codes (16-bit)."""
|
||||
|
||||
CHAR = 0
|
||||
SMALLINT = 1
|
||||
INT = 2 # 32-bit INTEGER
|
||||
FLOAT = 3 # 64-bit DOUBLE PRECISION
|
||||
SMFLOAT = 4 # 32-bit REAL / SMALLFLOAT
|
||||
DECIMAL = 5
|
||||
SERIAL = 6
|
||||
DATE = 7
|
||||
MONEY = 8
|
||||
NULL = 9
|
||||
DATETIME = 10
|
||||
BYTE = 11
|
||||
TEXT = 12
|
||||
VARCHAR = 13
|
||||
INTERVAL = 14
|
||||
NCHAR = 15
|
||||
NVCHAR = 16
|
||||
INT8 = 17 # legacy 64-bit "long int" — variable-length encoding
|
||||
SERIAL8 = 18
|
||||
SET = 19
|
||||
MULTISET = 20
|
||||
LIST = 21
|
||||
ROW = 22
|
||||
COLLECTION = 23
|
||||
ROWREF = 24
|
||||
UDTVAR = 40
|
||||
UDTFIXED = 41
|
||||
REFSER8 = 42
|
||||
LVARCHAR = 43
|
||||
SENDRECV = 44
|
||||
BOOL = 45
|
||||
IMPEXP = 46
|
||||
IMPEXPBIN = 47
|
||||
SQLUDRDEFAULT = 48
|
||||
UNKNOWN = 49
|
||||
BIGINT = 52 # modern 64-bit BIGINT (clean 8-byte BE encoding)
|
||||
BIGSERIAL = 53
|
||||
CLOB = 101
|
||||
BLOB = 102
|
||||
|
||||
|
||||
# High-bit flags packed into the type-code short alongside the type.
|
||||
# Strip these with ``base_type()`` before comparing to ``IfxType`` values.
|
||||
IFX_BIT_NOTNULLABLE = 0x0100 # 256 — column is NOT NULL
|
||||
IFX_BIT_DISTINCT = 0x0800 # 2048 — distinct/named type
|
||||
IFX_BIT_NAMEDROW = 0x1000 # 4096 — named row type
|
||||
IFX_BIT_DBOOLEAN = 0x4000 # 16384 — distinct boolean
|
||||
IFX_BIT_COLLCLIENT = -0x8000 # high bit (Java's Short.MIN_VALUE)
|
||||
|
||||
_TYPE_BITS_MASK = 0xFF # low byte = type code (sufficient for builtin types ≤ 102)
|
||||
_FLAG_BITS_MASK = 0xFF00
|
||||
|
||||
|
||||
def base_type(type_code: int) -> int:
|
||||
"""Strip high-bit flags and return the raw IfxType value.
|
||||
|
||||
Type codes on the wire are 16-bit shorts where the low byte is the
|
||||
type and the high byte holds flags (NOTNULLABLE, DISTINCT, etc.).
|
||||
For type codes > 255 (the CLOB=101, BLOB=102 line is the upper
|
||||
bound for builtin types), use the value directly.
|
||||
"""
|
||||
if type_code in (IfxType.CLOB, IfxType.BLOB):
|
||||
return type_code
|
||||
return type_code & _TYPE_BITS_MASK
|
||||
|
||||
|
||||
def is_nullable(type_code: int) -> bool:
|
||||
"""``True`` if the column accepts NULL (the NOTNULLABLE bit is clear)."""
|
||||
return not (type_code & IFX_BIT_NOTNULLABLE)
|
||||
128
src/informix_db/converters.py
Normal file
128
src/informix_db/converters.py
Normal file
@ -0,0 +1,128 @@
|
||||
"""Type codecs: IDS wire bytes ↔ Python values.
|
||||
|
||||
Phase 2 implements decoders for the MVP type set (SMALLINT, INT, BIGINT,
|
||||
SMFLOAT, FLOAT, CHAR, VARCHAR, BOOL, DATE). Encoders are stubbed —
|
||||
they land in Phase 4 with parameter binding.
|
||||
|
||||
Decoder dispatch: ``decode(type_code, raw_bytes) → python value`` looks
|
||||
up the codec in ``DECODERS`` keyed by ``IfxType`` (after stripping
|
||||
high-bit flags via ``_types.base_type``). NULL values are signaled by
|
||||
the row decoder, not by sentinel bytes here.
|
||||
|
||||
For DATE we use the Informix epoch (1899-12-31). The raw bytes are a
|
||||
4-byte big-endian signed int representing day count.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import struct
|
||||
from collections.abc import Callable
|
||||
|
||||
from ._types import IfxType, base_type
|
||||
|
||||
# Informix DATE epoch — day 0 is December 31, 1899 (per Informix convention).
|
||||
_INFORMIX_DATE_EPOCH = datetime.date(1899, 12, 31)
|
||||
|
||||
DecoderFn = Callable[[bytes], object]
|
||||
|
||||
|
||||
def _decode_smallint(raw: bytes) -> int:
|
||||
return struct.unpack("!h", raw)[0]
|
||||
|
||||
|
||||
def _decode_int(raw: bytes) -> int:
|
||||
return struct.unpack("!i", raw)[0]
|
||||
|
||||
|
||||
def _decode_bigint(raw: bytes) -> int:
|
||||
return struct.unpack("!q", raw)[0]
|
||||
|
||||
|
||||
def _decode_smfloat(raw: bytes) -> float:
|
||||
return struct.unpack("!f", raw)[0]
|
||||
|
||||
|
||||
def _decode_float(raw: bytes) -> float:
|
||||
return struct.unpack("!d", raw)[0]
|
||||
|
||||
|
||||
def _decode_char(raw: bytes) -> str:
|
||||
"""Strip trailing spaces (CHAR is space-padded to declared length)."""
|
||||
return raw.rstrip(b" \x00").decode("iso-8859-1")
|
||||
|
||||
|
||||
def _decode_varchar(raw: bytes) -> str:
|
||||
"""VARCHAR — variable-length string, nul-terminated on the wire."""
|
||||
return raw.rstrip(b"\x00").decode("iso-8859-1")
|
||||
|
||||
|
||||
def _decode_bool(raw: bytes) -> bool:
|
||||
"""Informix BOOLEAN is one byte: 't'/'T' (true), 'f'/'F' (false)."""
|
||||
if not raw:
|
||||
raise ValueError("empty BOOL payload")
|
||||
return raw[0] in (ord("t"), ord("T"), 1)
|
||||
|
||||
|
||||
def _decode_date(raw: bytes) -> datetime.date:
|
||||
"""4-byte big-endian signed int = day count from 1899-12-31."""
|
||||
days = struct.unpack("!i", raw)[0]
|
||||
return _INFORMIX_DATE_EPOCH + datetime.timedelta(days=days)
|
||||
|
||||
|
||||
# Wire byte length per Phase-2-MVP type. Used by the row decoder to
|
||||
# slice column values out of an SQ_TUPLE payload for fixed-width types.
|
||||
# Variable-width types (CHAR, VARCHAR, DECIMAL, etc.) are length-prefixed
|
||||
# on the wire and don't appear in this table.
|
||||
FIXED_WIDTHS: dict[int, int] = {
|
||||
IfxType.SMALLINT: 2,
|
||||
IfxType.INT: 4,
|
||||
IfxType.SERIAL: 4,
|
||||
IfxType.SMFLOAT: 4,
|
||||
IfxType.FLOAT: 8,
|
||||
IfxType.BIGINT: 8,
|
||||
IfxType.BIGSERIAL: 8,
|
||||
IfxType.DATE: 4,
|
||||
IfxType.BOOL: 1,
|
||||
}
|
||||
|
||||
|
||||
# Phase 2 MVP decoders. Phase 6+ adds DATETIME, INTERVAL, DECIMAL,
|
||||
# MONEY, LVARCHAR, BYTE/TEXT, BLOB/CLOB, ROW, COLLECTION.
|
||||
DECODERS: dict[int, DecoderFn] = {
|
||||
IfxType.SMALLINT: _decode_smallint,
|
||||
IfxType.INT: _decode_int,
|
||||
IfxType.SERIAL: _decode_int,
|
||||
IfxType.BIGINT: _decode_bigint,
|
||||
IfxType.BIGSERIAL: _decode_bigint,
|
||||
IfxType.SMFLOAT: _decode_smfloat,
|
||||
IfxType.FLOAT: _decode_float,
|
||||
IfxType.CHAR: _decode_char,
|
||||
IfxType.VARCHAR: _decode_varchar,
|
||||
IfxType.NCHAR: _decode_char,
|
||||
IfxType.NVCHAR: _decode_varchar,
|
||||
IfxType.LVARCHAR: _decode_varchar,
|
||||
IfxType.BOOL: _decode_bool,
|
||||
IfxType.DATE: _decode_date,
|
||||
}
|
||||
|
||||
|
||||
def decode(type_code: int, raw: bytes) -> object:
|
||||
"""Decode ``raw`` bytes for the given IDS type code into a Python value.
|
||||
|
||||
The high-bit flags (NOTNULLABLE etc.) are stripped before lookup.
|
||||
Raises ``KeyError`` for unsupported types — Phase 6+ adds the rest.
|
||||
"""
|
||||
base = base_type(type_code)
|
||||
decoder = DECODERS.get(base)
|
||||
if decoder is None:
|
||||
raise NotImplementedError(
|
||||
f"decoder for IDS type code {base} not yet implemented "
|
||||
f"(Phase 2 MVP supports: SMALLINT, INT, BIGINT, REAL, FLOAT, "
|
||||
f"CHAR, VARCHAR, BOOL, DATE)"
|
||||
)
|
||||
return decoder(raw)
|
||||
|
||||
|
||||
# Encoders — stubbed for Phase 4 parameter binding.
|
||||
ENCODERS: dict[int, Callable[[object], bytes]] = {}
|
||||
Loading…
x
Reference in New Issue
Block a user