diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8be3bdc..791b0d3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,56 @@
 All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
 
+## 2026.05.05.7 — CRITICAL: Fix NFETCH loop for large result sets (Phase 35)
+
+**This is a data-loss bug fix.** Anyone running `cursor.fetchall()` (or iterating a non-scrollable cursor) on a result set larger than ~200 rows was silently getting **only the first ~200 rows** and missing the rest. The exact cap depends on row width and the server's NFETCH buffer (4096 bytes by default), but the bug affects every result set that doesn't fit in 1-2 server fetch batches.
+
+### The bug
+
+`Cursor._execute_select` sent NFETCH twice and stopped:
+
+```python
+self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+self._read_fetch_response()
+# Drain — fetch again to confirm no more rows.
+# (JDBC always does this; the second fetch returns DONE only.)
+self._conn._send_pdu(self._build_nfetch_pdu())
+self._read_fetch_response()
+```
+
+The "second fetch returns DONE only" comment was wrong — for any result set larger than the server's per-NFETCH batch, the second fetch returns more tuples, and more tuples are still queued server-side. After the second fetch, the cursor closed and the rest of the rows were discarded.
+
+This bug has been latent for ~30 phases because every existing test used either a small result set (e.g., systables FIRST 10) or relied on row counts that fit naturally in 1-2 batches. The scaling benchmark (Phase 34) was the first time we tried `SELECT FIRST 100000` — and got back 200 rows.
+
+### The fix
+
+`_execute_select` now loops NFETCH until a response yields zero new tuples:
+
+```python
+self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+rows_before = len(self._rows)
+self._read_fetch_response()
+rows_received = len(self._rows) - rows_before
+
+while rows_received > 0:
+    self._conn._send_pdu(self._build_nfetch_pdu())
+    rows_before = len(self._rows)
+    self._read_fetch_response()
+    rows_received = len(self._rows) - rows_before
+```
+
+### Tests
+
+All 249 existing integration tests still pass. The scaling benchmark suite (Phase 34) is the regression test that would have caught this earlier — `SELECT FIRST 100000` from a 100k-row table now returns the expected 100,000 rows.
+
+### Impact
+
+- **Severity**: CRITICAL (silent data loss).
+- **Workaround prior to this fix**: use scrollable cursors (`conn.cursor(scrollable=True)`), which take the SQ_SFETCH protocol path and don't have this bug.
+- **Affected versions**: every release before `2026.05.05.7`.
+
+If you've been using this driver for queries that returned large result sets, **you may have been getting truncated results without knowing it.** Re-run those queries against `2026.05.05.7` or later to verify your data.
+
 ## 2026.05.05.6 — Pipelined `executemany` (Phase 33) — 2.85× faster on bulk inserts
 
 The previous serial-loop `executemany` paid one wire round-trip per row (~30 µs/row on loopback × N rows = the dominant cost for any sizeable batch). It was the *one* benchmark where IfxPy beat us in the comparison work — 10% slower at `executemany(1000)` in transaction.
diff --git a/pyproject.toml b/pyproject.toml
index e3dc113..b69243c 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05.6"
+version = "2026.05.05.7"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }
diff --git a/src/informix_db/cursors.py b/src/informix_db/cursors.py
index 3b767c8..1b990ef 100644
--- a/src/informix_db/cursors.py
+++ b/src/informix_db/cursors.py
@@ -398,13 +398,22 @@ class Cursor:
             self._finalizer_state[0] = True  # arm the GC-time fallback
             self._scroll_total_rows = None
             return  # don't close; cursor stays live for SQ_SFETCH
+        # Phase 35: NFETCH loop — keep fetching until a response yields
+        # zero new tuples. The previous "two NFETCHes" pattern silently
+        # truncated any result set whose tuples didn't fit in 1-2 server
+        # batches (~200 rows at the default 4096-byte buffer × 5-col rows).
+        # This bug was latent for ~30 phases because no test used a
+        # large enough result set to trigger it.
         self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+        rows_before = len(self._rows)
         self._read_fetch_response()
+        rows_received = len(self._rows) - rows_before
-        # Drain — fetch again to confirm no more rows.
-        # (JDBC always does this; the second fetch returns DONE only.)
-        self._conn._send_pdu(self._build_nfetch_pdu())
-        self._read_fetch_response()
+        while rows_received > 0:
+            self._conn._send_pdu(self._build_nfetch_pdu())
+            rows_before = len(self._rows)
+            self._read_fetch_response()
+            rows_received = len(self._rows) - rows_before
 
         # Dereference BYTE/TEXT blob descriptors BEFORE CLOSE — the
         # locators are only valid while the cursor is open. No-op when
diff --git a/uv.lock b/uv.lock
index 1569326..831df8a 100644
--- a/uv.lock
+++ b/uv.lock
@@ -34,7 +34,7 @@ wheels = [
 
 [[package]]
 name = "informix-db"
-version = "2026.5.5.4"
+version = "2026.5.5.7"
 source = { editable = "." }
 
 [package.optional-dependencies]
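
---

Reviewer note: the drain pattern the fix implements — loop until a fetch returns zero new rows — can be exercised in isolation, without a server. Below is a minimal sketch assuming a hypothetical `FakeServer` that hands back a result set in fixed-size batches, the way the server answers successive NFETCH requests; the names are illustrative, not the driver's real classes.

```python
class FakeServer:
    """Serves a result set in fixed-size batches, one batch per fetch call."""

    def __init__(self, total_rows: int, batch_size: int = 100):
        self._remaining = list(range(total_rows))
        self._batch_size = batch_size

    def nfetch(self) -> list:
        # Hand back the next batch; empty list means the result set is drained.
        batch = self._remaining[:self._batch_size]
        del self._remaining[:self._batch_size]
        return batch


def drain_two_fetches(server: FakeServer) -> list:
    # Pre-fix logic: fetch once, then fetch once more "to confirm DONE".
    # Anything beyond two batches is silently lost.
    return server.nfetch() + server.nfetch()


def drain_until_empty(server: FakeServer) -> list:
    # Phase 35 logic: keep fetching until a response yields zero new rows.
    rows = []
    while True:
        batch = server.nfetch()
        if not batch:
            break
        rows += batch
    return rows


print(len(drain_two_fetches(FakeServer(1000))))  # 200 — truncated
print(len(drain_until_empty(FakeServer(1000))))  # 1000 — complete
```

With a 100-row batch, the two-fetch pattern caps out at 200 rows regardless of result-set size — the same ~200-row truncation the changelog describes — while the loop drains everything. The real fix measures progress via `len(self._rows)` deltas rather than batch emptiness, but the termination condition is the same: stop only when a fetch yields zero new tuples.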