9 Commits

Author SHA1 Message Date
0e0dfcba26 Phase 22: User-facing documentation refresh (2026.05.04.7)
The docs/USAGE.md predated Phases 17-21, so anyone landing on PyPI was
missing scrollable cursors, locale/Unicode, the autocommit cliff
finding, and the type-mapping reference.

Added sections to docs/USAGE.md:
* Locale and Unicode - client_locale, Connection.encoding, CLIENT_LOCALE
  vs DB_LOCALE, when characters can't fit the codec
* Type mapping reference - full SQL <-> Python type table, NULL
  sentinels subsection, IntervalYM
* Performance tips - 53x autocommit-cliff fix, 100x executemany win,
  72x pool win, with the actual benchmark numbers from Phase 21.1
* Scrollable cursors - fetch_* API, in-memory vs server-side trade-off,
  edge cases (past-end semantics, negative indexing, rownumber)
* Timeouts and keepalive subsection - production starting points
* Environment dictionary subsection - env={} parameter
* Known limitations - explicit table of what doesn't work (named
  params, complex UDT bind, GSSAPI, XA) with workarounds; "things
  that might surprise you" notes

README.md - added Documentation section linking to docs/USAGE.md
and tests/benchmarks/README.md.

Doc corrections caught during review:
* cursor.rownumber is 0-indexed (impl has always been correct; only
  the original docstring wording was loose)
* fetch_* methods work on BOTH scrollable=True and default cursors;
  the in-memory path supports them too

USAGE.md grew from 345 lines to 633.
2026-05-04 17:33:37 -06:00
495128c679 Phase 21.1: executemany perf - it was the autocommit cliff (2026.05.04.6)
Investigation of the Phase 21 baseline finding that executemany(N) cost
scaled linearly per-row (1.74 ms x N) regardless of batch size.

Root cause: every autocommit=True INSERT forces a server-side
transaction-log flush. Not a wire-protocol bug.

Numbers:
* executemany(1000) autocommit=True: 1.72 s (1.72 ms/row)
* executemany(1000) in single txn:    32 ms (32 us/row)

53x speedup from changing the transaction boundary, not the driver.
Pure protocol overhead is ~32 us/row -> ~31K rows/sec sustained
throughput on a single connection. Comparable to pg8000.

Added test_executemany_1000_rows_in_txn benchmark to make this
visible. Updated README headline numbers and added a "Performance
gotchas" section explaining when autocommit=False matters.

Decision: don't pipeline. The remaining 32 us is already excellent;
the autocommit gotcha is the real user-facing footgun. Docs > code.
If someone reports needing >31K rows/sec single-connection, that
becomes Phase 22.
2026-05-04 17:26:16 -06:00
90ce035a00 Phase 21: Performance benchmarks (2026.05.04.5)
Adds tests/benchmarks/ with pytest-benchmark coverage of the hot codec
paths and end-to-end SELECT/INSERT/pool/async round-trips. Establishes
a committed baseline.json so PRs can be regression-checked at review
via --benchmark-compare.

* test_codec_perf.py (16): decode/encode_param/parse_tuple_payload
  micro-benchmarks - run without container, suitable for pre-merge CI.
* test_select_perf.py (4): SELECT round-trips - 1-row latency floor,
  10-row, 1k-row full fetch, parameterized.
* test_insert_perf.py (3): single-row INSERT, executemany 100 / 1000.
* test_pool_perf.py (3): cold connect, pool acquire/release, pool
  acquire + query + release.
* test_async_perf.py (2): async round-trip overhead, 10x concurrent.
* baseline.json: committed snapshot, 28 measurements.
* benchmark pytest marker, gated off by default.
* Makefile: bench / bench-codec / bench-save targets;
  test-integration excludes benchmarks for speed.

Headline numbers (dev container loopback):
* decode(int): 181 ns
* parse_tuple 5 cols: 2.87 µs/row
* SELECT 1 round-trip: 177 µs
* Pool acquire+query+release: 295 µs
* Cold connect: 11.2 ms (72x slower than pool)

UTF-8 decode carries no measurable cost vs iso-8859-1 - confirms
Phase 20 didn't regress anything.

Total: 69 unit + 211 integration + 28 benchmark = 308 tests.
2026-05-04 17:21:12 -06:00
bea1a1cd0c Phase 20: UTF-8/multibyte locale support (2026.05.04.4)
Thread CLIENT_LOCALE through to user-data string codecs. Driver previously
hardcoded iso-8859-1 for all string conversions, which broke any locale
outside Western European code points.

* Connection.encoding property derived from client_locale via
  _python_encoding_from_locale (en_US.utf8 -> utf-8, en_US.8859-1 ->
  iso-8859-1, etc.)
* encode_param / decode / parse_tuple_payload accept an encoding
  parameter; cursor and fast-path call sites forward conn.encoding
* Smart-LOB CLOB encode/decode and TEXT decode honor connection encoding
* DataError raised for non-representable chars; cursor releases the
  prepared statement before propagating so connection state stays clean

Boundary discipline: protocol-level strings (cursor names, function
signatures, SQ_FILE fnames, error near-tokens, SQL text) stay
iso-8859-1 (always ASCII, never user-controlled).

9 new integration tests in tests/test_unicode.py covering ASCII
round-trip, Latin-1 high-bit, full byte range, locale-mapping,
encoding property, UTF-8 negotiation, multibyte (skipped without
IFX_UTF8_DATABASE), DataError on non-representable, CLOB round-trip.

Total: 69 unit + 212 integration = 281 tests.
2026-05-04 17:13:19 -06:00
9703279bc8 Phase 19: resilience tests via fault injection (v2026.05.04.3)
Fills the highest-priority gap from the test-adequacy audit:
connection-failure recovery. 12 new integration tests using a
thread-based TCP proxy (ControlledProxy) that can be kill()'d at
any moment to simulate network drops or server crashes via TCP RST
(SO_LINGER=0).

Coverage:
* Network drop mid-SELECT — OperationalError, not hang
* Network drop after describe, before fetch
* Network drop during fetch (already-materialized rows still
  readable; fresh execute fails)
* Local socket forced-close (kernel-level disconnect simulation)
* I/O error marks connection unusable post-failure
* Pool evicts connection that died mid-`with` block (size drops)
* Pool revives after all idle connections died (health check on
  acquire mints fresh)
* Async cancellation via asyncio.wait_for — pool stays usable
* Cursor reusable after SQL error
* Connection survives cursor close after error
* Sustained pool load (50 acquire/release cycles, no leak)
* read_timeout fires on a hung connection within bounds

Catches the failure classes that bite production users:
* Hangs (waiting forever on dead socket)
* Silent corruption (EOF treated as valid tuple)
* Double-fault (cleanup raises after primary error)
* Pool poisoning (broken connection returned to pool)
* Stale cursor reuse across error boundaries

Helper:
* tests/_proxy.py — ControlledProxy: thread-based TCP forwarder
  with kill() for fault injection. Two-thread pump model. SO_LINGER=0
  for RST-on-close (mimics router drop).

Total: 69 unit + 203 integration = 272 tests.

Remaining gaps from the audit (UTF-8 multibyte locale, server-version
matrix, performance benchmarks) are real but lower-severity. Phase 19
addressed the one most likely to bite production deployments.
2026-05-04 16:57:06 -06:00
a42dc5c5de Phase 18: server-side scrollable cursors via SQ_SFETCH (v2026.05.04.2)
Opt-in via conn.cursor(scrollable=True). Opens the cursor with
SQ_SCROLL (24) before SQ_OPEN (6), keeps it open server-side, and
sends SQ_SFETCH (23) per scroll call instead of materializing the
result set up-front.

User-facing API is identical to Phase 17's in-memory scroll
(fetch_first/last/prior/absolute/relative, scroll, rownumber).
Only the internal mechanism differs:

  | feature           | default          | scrollable=True
  |-------------------|------------------|------------------
  | memory            | all rows         | one row at a time
  | round-trips/fetch | 0 (after NFETCH) | 1 per call
  | cursor lifetime   | closed after exec| open until close()
  | best for          | sequential iter  | random access on
                                         | huge result sets

Wire format (verified against JDBC ScrollProbe capture):
* SQ_SFETCH: [short SQ_ID=4][int 23][short scrolltype]
  [int target][int bufSize=4096][short SQ_EOT]
  scrolltype: 1=NEXT, 4=LAST, 6=ABSOLUTE
* SQ_SCROLL (24): emitted between CURNAME and SQ_OPEN
* SQ_TUPID (25): response tag with 1-indexed row position;
  authoritative source for client-side position tracking

Position tracking uses the server's SQ_TUPID rather than client-
computed indexes. Total row count discovered lazily via SFETCH(LAST)
when negative absolute indexing requires it; cached in
_scroll_total_rows.

Trap on the way: initial SFETCH used SHORT for bufSize → server
hung silently. Same SHORT-vs-INT diagnostic pattern as Phase 4.x's
CURNAME+NFETCH. Captured JDBC trace, byte-diffed against ours,
found the mismatch (bufSize is INT in modern Informix per
isXPSVER8_40 / is2GBFetchBufferSupported).

Tests: 14 integration tests in test_scroll_cursor_server.py
covering lifecycle, sequential fetch, fetch_first/last/prior/
absolute/relative, negative indexing, scroll, empty result sets,
past-end, and random-access on a 100-row result set.

Total: 69 unit + 191 integration = 260 tests.
2026-05-04 16:41:25 -06:00
0c856372a6 v2026.05.04: bump CalVer + polish docs
Version bump (2026.05.02 → 2026.05.04) reflects the library reaching
feature completeness across Phases 1-16.

Documentation:

* README.md — full rewrite. The previous README was from Phase 1
  ("cursor() / execute() / fetchone() arrive in Phase 2"). New
  README covers: sync + async APIs, connection pool, TLS, full type
  matrix, smart-LOBs, fast-path RPC, server-compatibility,
  development workflow, and pointers to the protocol research docs.

* docs/USAGE.md — new practical recipe guide. Connecting, cursor
  lifecycle, parameter binding, transactions (logged + unlogged),
  executemany, smart-LOB read/write, connection pool, async,
  TLS, error handling, fast-path RPC, server-side setup steps,
  and a migration table from IfxPy / legacy informixdb.

* CHANGELOG.md — new file. Captures the v2026.05.04 release as the
  Phase 1-16 completion milestone with a full feature inventory
  and known-gap list. Future point-releases append here.

Classifiers updated:
* Development Status: 2 → 4 (Pre-Alpha → Beta)
* Added Framework :: AsyncIO

Keywords: added asyncio, async.

No code changes; tests still pass (69 unit + 163 integration = 232).
Ruff clean.
2026-05-04 15:38:09 -06:00
300e1bf7b4 Phase 16: async API (informix_db.aio)
Ships AsyncConnection, AsyncCursor, and AsyncConnectionPool that
expose async/await versions of the sync API for use with FastAPI,
aiohttp, etc.

Strategy: thread-pool wrapping (aiopg pattern), not native async.
Each blocking I/O call is offloaded to a worker thread via
asyncio.to_thread. The event loop never blocks; queries run in
parallel up to the pool's max_size. Cost: ~250 lines, no changes
to the sync codebase. Native async (Phase 17) would require a
~2000-line transport abstraction refactor — deferred until a real
workload needs it.

For typical FastAPI/aiohttp workloads (request → one query → return),
this is functionally equivalent to native async. Each await yields
the loop while a worker thread does the I/O. Only differs for
hundreds-of-concurrent-connections workloads.

API mirrors the sync API one-to-one:

  import asyncio
  from informix_db import aio

  async def main():
      pool = await aio.create_pool(host=..., min_size=1, max_size=10)
      async with pool.connection() as conn:
          cur = await conn.cursor()
          await cur.execute("SELECT id FROM users WHERE name = ?", (name,))
          row = await cur.fetchone()
      await pool.close()

The async pool preserves the sync pool's eviction policy: connection
errors evict, application errors retain.

Tests: 9 integration tests in test_aio.py covering open/close,
async-with, simple/parameterized SELECT, async-for cursor iteration,
pool acquire/release, 20-query concurrent gather (verifies parallelism
through max_size=5 pool), pool async context manager, commit/rollback.

Total: 69 unit + 163 integration = 232 tests.

Pyproject changes:
* Added pytest-asyncio>=1.3.0 as dev dep
* asyncio_mode = "auto" so async tests don't need decorators

Architectural completion: with Phase 16, every backlog item is
done. The Phase 0 ambition — first pure-Python Informix driver,
no native deps — is now genuinely complete.
2026-05-04 14:58:19 -06:00
9b1fd8af2c Phase 1: pure-Python SQLI login works end-to-end
This commit takes informix-db from documentation-only (Phase 0 spike)
to a functional connect() / close() against a real Informix server.
To our knowledge, this is the first pure-socket Informix client in any
language — no CSDK, no JVM, no native libraries.

Layered architecture per the plan, mirroring PyMySQL's shape:

  src/informix_db/
    __init__.py        — PEP 249 surface (connect, exceptions, paramstyle="numeric")
    exceptions.py      — full PEP 249 hierarchy declared up front
    _socket.py         — raw socket I/O (read_exact, write_all, timeouts)
    _protocol.py       — IfxStreamReader / IfxStreamWriter framing primitives
                         (big-endian, 16-bit-aligned variable payloads,
                         length-prefixed nul-terminated strings)
    _messages.py       — SQ_* tags from IfxMessageTypes + ASF/login markers
    _auth.py           — pluggable auth handlers; plain-password is the
                         only Phase-1 implementation
    connections.py     — Connection class: builds the binary login PDU
                         (SLheader + PFheader byte-for-byte per
                         PROTOCOL_NOTES.md §3), sends it, parses the
                         server response, wires up close()

Phase 1 design decisions locked in DECISION_LOG.md:
  - paramstyle = "numeric" (matches Informix ESQL/C convention)
  - Python >= 3.10
  - autocommit defaults to off (PEP 249 implicit)
  - License: MIT
  - Distribution name: informix-db (verified PyPI-available)

Test coverage: 34 unit tests (codec round-trips against synthetic byte
streams; observed login-PDU values from the spike captures asserted as
exact byte literals) + 6 integration tests (connect, idempotent close,
context manager, bad-password → OperationalError, bad-host →
OperationalError, cursor() raises NotImplementedError).

  pytest                 — runs 34 unit tests, no Docker needed
  pytest -m integration  — runs 6 integration tests against the
                           Developer Edition container (pinned by digest
                           in tests/docker-compose.yml)
  pytest -m ""           — runs everything

ruff is clean across src/ and tests/.

One bug found during smoke testing: threading.get_ident() can exceed
signed 32-bit on some processes, overflowing struct.pack("!i"). Fixed
the same way the JDBC reference does — clamp to signed 32-bit, fall
back to 0 if out of range. The field is diagnostic only.

One protocol-level observation that AMENDED the JDBC source reading:
the "capability section" in the login PDU is three independently
negotiated 4-byte ints (Cap_1=1, Cap_2=0x3c000000, Cap_3=0), not one
int + 8 reserved zero bytes as my CFR decompile read suggested. The
server echoes them back identically. Trust the wire over the
decompiler.

Phase 1 verification matrix (from PROTOCOL_NOTES.md §12):
  - Login byte layout: confirmed (server accepts our pure-Python PDU)
  - Disconnection: confirmed (SQ_EXIT round-trip works)
  - Framing primitives: confirmed (34 unit tests)
  - Error path: bad password → OperationalError, bad host → OperationalError

Phase 2 (Cursor / SELECT / basic types) is the next phase. The hard
unknowns there — exact column-descriptor layout, statement-time error
format — were called out as bounded gaps in Phase 0 and have existing
captures (02-select-1.socat.log, 02-dml-cycle.socat.log) to characterize
against.
2026-05-02 19:10:24 -06:00