Phase 35: CRITICAL fix - NFETCH loop for large result sets (2026.05.05.7)
DATA-LOSS BUG: cursor.fetchall() on result sets larger than ~200 rows
was silently truncating to the first ~200 rows. The exact cap depended
on row width and the server's per-NFETCH buffer (4096 bytes default).
The bug: _execute_select sent NFETCH twice and stopped:

    self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
    self._read_fetch_response()
    self._conn._send_pdu(self._build_nfetch_pdu())  # comment: "DONE only"
    self._read_fetch_response()
    # then CLOSE+RELEASE — discarding remaining queued rows
The "second fetch returns DONE only" comment was wrong. For any
result set larger than the server's per-NFETCH batch, the second
fetch returns more tuples AND there are still tuples queued
server-side. The cursor closed and dropped them.
Latent for 30 phases because every existing test used either a small
result set (FIRST 10) or relied on row counts that fit naturally in
1-2 batches. Discovered by Phase 34's scaling benchmark when
SELECT FIRST 100000 from a 100k-row table returned 200 rows.
The fix: loop NFETCH until a response yields zero new tuples.

    self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
    while rows_received > 0:
        self._conn._send_pdu(self._build_nfetch_pdu())
        rows_before = len(self._rows)
        self._read_fetch_response()
        rows_received = len(self._rows) - rows_before
249 integration tests pass. The scaling benchmark suite (Phase 34,
shipping next) is the regression test going forward.
Workaround for users on older versions: use scrollable cursors
(cursor(scrollable=True)) which use the SQ_SFETCH protocol path
and don't have this bug.
If you've been using this driver for queries returning large result
sets, your queries may have been truncating silently. Re-run them
against 2026.05.05.7+ to verify your data.
parent 362ecb3d63
commit 1282893412
50
CHANGELOG.md
@@ -2,6 +2,56 @@

All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.

## 2026.05.05.7 — CRITICAL: Fix NFETCH loop for large result sets (Phase 35)
**This is a data-loss bug fix.** Anyone running `cursor.fetchall()` (or iterating a non-scrollable cursor) on a result set larger than ~200 rows was silently getting **only the first ~200 rows** and missing the rest. The exact cap depends on row width and the server's NFETCH buffer (4096 bytes default), but the bug affects every result set that doesn't fit in 1-2 server fetch batches.
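The ~200-row figure falls out of the batch arithmetic. A back-of-the-envelope sketch (the 40-byte row width is an illustrative assumption; only the 4096-byte buffer default comes from the driver):

```python
# Rough model of the truncation cap: the old code issued exactly two
# NFETCHes, and each NFETCH returns at most one server buffer of tuples.
BUFFER_BYTES = 4096   # server's default per-NFETCH buffer
ROW_BYTES = 40        # assumed width of a typical 5-column row (illustrative)

rows_per_batch = BUFFER_BYTES // ROW_BYTES   # rows returned per NFETCH
old_code_cap = 2 * rows_per_batch            # two fetches, then CLOSE

print(rows_per_batch, old_code_cap)  # 102 204
```

Wider rows lower the cap; narrower rows raise it, which is why the cutoff varied by query.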

### The bug

`Cursor._execute_select` sent NFETCH twice and stopped:

```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
self._read_fetch_response()
# Drain — fetch again to confirm no more rows.
# (JDBC always does this; the second fetch returns DONE only.)
self._conn._send_pdu(self._build_nfetch_pdu())
self._read_fetch_response()
```

The "second fetch returns DONE only" comment was wrong — for any result set larger than the server's per-NFETCH batch, the second fetch returns more tuples and there are still tuples queued server-side. After the second fetch, the cursor closed and the rest of the rows were discarded.

This bug has been latent for ~30 phases because every existing test used either a small result set (e.g., systables FIRST 10) or relied on row counts that fit naturally in 1-2 batches. The scaling benchmark (Phase 34) was the first time we tried `SELECT FIRST 100000` and got back 200 rows.

### The fix

`_execute_select` now loops NFETCH until a response yields zero new tuples:

```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
rows_before = len(self._rows)
self._read_fetch_response()
rows_received = len(self._rows) - rows_before

while rows_received > 0:
    self._conn._send_pdu(self._build_nfetch_pdu())
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
```
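The termination argument in miniature: a self-contained toy model (the `FakeBatchSource` class is hypothetical, standing in for the server's per-NFETCH batching, not part of the driver) contrasting the old two-fetch pattern with the drain loop:

```python
# Toy stand-in for the server side of NFETCH: each fetch() returns the
# next batch of rows, then empty lists once the result set is drained.
class FakeBatchSource:
    def __init__(self, total_rows, batch_size):
        self._remaining = total_rows
        self._batch_size = batch_size

    def fetch(self):
        batch = min(self._remaining, self._batch_size)
        self._remaining -= batch
        return [object()] * batch

def drain_two_fetches(source):
    # The buggy pattern: exactly two fetches, then stop.
    rows = []
    rows += source.fetch()
    rows += source.fetch()
    return rows

def drain_until_empty(source):
    # The fixed pattern: keep fetching until a fetch yields zero new rows.
    rows = []
    while True:
        batch = source.fetch()
        if not batch:
            break
        rows += batch
    return rows

# 100k rows at ~102 rows per batch (the default-buffer estimate above):
print(len(drain_two_fetches(FakeBatchSource(100_000, 102))))  # 204
print(len(drain_until_empty(FakeBatchSource(100_000, 102))))  # 100000
```

The loop always terminates: every iteration either shrinks the server-side queue or receives an empty batch and exits.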

### Tests

All 249 existing integration tests still pass. The scaling benchmark suite (Phase 34) is the regression test that would have caught this earlier — `SELECT FIRST 100000` from a 100k-row table now returns the expected 100,000 rows.

### Impact

- **Severity**: CRITICAL (silent data loss).
- **Workaround prior to this fix**: use scrollable cursors (`conn.cursor(scrollable=True)`), which use the SQ_SFETCH protocol path and don't have this bug.
- **Affected versions**: every release before `2026.05.05.7`.
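For code that must run against mixed driver installs, a minimal version gate (the `is_affected` helper is hypothetical, not part of the driver; it relies only on the CalVer scheme stated at the top of this changelog):

```python
def is_affected(version: str) -> bool:
    # Releases before 2026.05.05.7 silently truncate large result sets.
    # CalVer components compare correctly as numeric tuples; a plain
    # date-based release like "2026.05.04" yields a shorter tuple, which
    # still orders correctly against (2026, 5, 5, 7).
    parts = tuple(int(p) for p in version.split("."))
    return parts < (2026, 5, 5, 7)

print(is_affected("2026.05.05.6"))  # True
print(is_affected("2026.05.05.7"))  # False
```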

If you've been using this driver for queries that returned large result sets, **you may have been getting truncated results without knowing it.** Re-run those queries against `2026.05.05.7+` to verify your data.

## 2026.05.05.6 — Pipelined `executemany` (Phase 33) — 2.85× faster on bulk inserts

The previous serial-loop `executemany` paid one wire round-trip per row (~30 µs/row on loopback × N rows = the dominant cost for any sizeable batch). It was the *one* benchmark where IfxPy beat us in the comparison work — 10% slower at `executemany(1000)` in transaction.
@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05.6"
+version = "2026.05.05.7"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }
@@ -398,13 +398,22 @@ class Cursor:
             self._finalizer_state[0] = True  # arm the GC-time fallback
             self._scroll_total_rows = None
             return  # don't close; cursor stays live for SQ_SFETCH

+        # Phase 35: NFETCH loop — keep fetching until a response yields
+        # zero new tuples. The previous "two NFETCHes" pattern silently
+        # truncated any result set whose tuples didn't fit in 1-2 server
+        # batches (~200 rows at default 4096-byte buffer × 5-col rows).
+        # This bug was latent for ~30 phases because no test used a
+        # large enough result set to trigger it.
         self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+        rows_before = len(self._rows)
         self._read_fetch_response()
-        # Drain — fetch again to confirm no more rows.
-        # (JDBC always does this; the second fetch returns DONE only.)
-        self._conn._send_pdu(self._build_nfetch_pdu())
-        self._read_fetch_response()
+        rows_received = len(self._rows) - rows_before
+        while rows_received > 0:
+            self._conn._send_pdu(self._build_nfetch_pdu())
+            rows_before = len(self._rows)
+            self._read_fetch_response()
+            rows_received = len(self._rows) - rows_before

         # Dereference BYTE/TEXT blob descriptors BEFORE CLOSE — the
         # locators are only valid while the cursor is open. No-op when