Phase 35: CRITICAL fix - NFETCH loop for large result sets (2026.05.05.7)

DATA-LOSS BUG: cursor.fetchall() on result sets larger than ~200 rows
was silently truncating to the first ~200 rows. The exact cap depended
on row width and the server's per-NFETCH buffer (4096 bytes default).
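For a back-of-envelope sense of where the cap comes from, here is a sketch using the 4096-byte buffer quoted above; the 40-byte row width is an assumed example, not a measured figure:

```python
# Sketch of the truncation cap under the old two-NFETCH pattern.
# Assumes each NFETCH returns as many whole tuples as fit in the
# server's per-fetch buffer; real batch sizes depend on row encoding.
BUFFER_BYTES = 4096  # server's default per-NFETCH buffer

def truncation_cap(row_bytes, fetches=2):
    """Rows the old code returned before closing the cursor."""
    rows_per_batch = BUFFER_BYTES // row_bytes
    return rows_per_batch * fetches

# A ~40-byte row packs ~102 rows per batch, so two fetches cap out
# near the observed ~200 rows.
print(truncation_cap(40))  # → 204
```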

The bug:

_execute_select sent NFETCH twice and stopped:
  self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
  self._read_fetch_response()
  self._conn._send_pdu(self._build_nfetch_pdu())  # comment: "DONE only"
  self._read_fetch_response()
  # then CLOSE+RELEASE — discarding remaining queued rows

The "second fetch returns DONE only" comment was wrong. For any
result set larger than the server's per-NFETCH batch, the second
fetch returns more tuples AND there are still tuples queued
server-side. The cursor closed and dropped them.

Latent for 30 phases because every existing test used either a small
result set (FIRST 10) or relied on row counts that fit naturally in
1-2 batches. Discovered by Phase 34's scaling benchmark when
SELECT FIRST 100000 from a 100k-row table returned 200 rows.
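The masking effect can be sketched numerically (the 100-rows-per-batch figure is an assumed illustration): the old pattern only loses rows when the result set needs more than two fetches.

```python
import math

ROWS_PER_BATCH = 100  # assumed illustrative batch size

def old_pattern_loses_rows(total_rows):
    """True when a result set needs more than the two NFETCHes
    the old code issued before closing the cursor."""
    return math.ceil(total_rows / ROWS_PER_BATCH) > 2

print(old_pattern_loses_rows(10))       # False — FIRST 10 tests pass
print(old_pattern_loses_rows(150))      # False — fits in two batches
print(old_pattern_loses_rows(100_000))  # True — Phase 34 benchmark
```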

The fix: loop NFETCH until a response yields zero new tuples.

  self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
  rows_before = len(self._rows)
  self._read_fetch_response()
  rows_received = len(self._rows) - rows_before
  while rows_received > 0:
      self._conn._send_pdu(self._build_nfetch_pdu())
      rows_before = len(self._rows)
      self._read_fetch_response()
      rows_received = len(self._rows) - rows_before
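A self-contained simulation contrasts the two patterns — FakeServer is illustrative only, not the driver's real transport:

```python
class FakeServer:
    """Illustrative stand-in for the server side: hands out tuples
    in fixed-size batches, like per-NFETCH buffering."""
    def __init__(self, total_rows, batch_size):
        self.queued = list(range(total_rows))
        self.batch_size = batch_size

    def nfetch(self):
        batch = self.queued[:self.batch_size]
        del self.queued[:self.batch_size]
        return batch

def fetch_two_then_stop(server):   # the old, buggy pattern
    return server.nfetch() + server.nfetch()

def fetch_until_empty(server):     # the Phase 35 fix
    rows = []
    while True:
        batch = server.nfetch()
        if not batch:
            break
        rows += batch
    return rows

print(len(fetch_two_then_stop(FakeServer(100_000, 100))))  # → 200
print(len(fetch_until_empty(FakeServer(100_000, 100))))    # → 100000
```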

249 integration tests pass. The scaling benchmark suite (Phase 34,
shipping next) is the regression test going forward.

Workaround for users on older versions: use scrollable cursors
(conn.cursor(scrollable=True)), which use the SQ_SFETCH protocol
path and don't have this bug.

If you've been using this driver for queries returning large result
sets, the results may have been silently truncated. Re-run them
against 2026.05.05.7+ to verify your data.
Ryan Malloy 2026-05-05 12:37:22 -06:00
parent 362ecb3d63
commit 1282893412
4 changed files with 65 additions and 6 deletions


@@ -2,6 +2,56 @@
All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.
## 2026.05.05.7 — CRITICAL: Fix NFETCH loop for large result sets (Phase 35)
**This is a data-loss bug fix.** Anyone running `cursor.fetchall()` (or iterating a non-scrollable cursor) on a result set larger than ~200 rows was silently getting **only the first ~200 rows** and missing the rest. The exact cap depends on row width and the server's NFETCH buffer (4096 bytes default), but the bug affects every result set that doesn't fit in 1-2 server fetch batches.
### The bug
`Cursor._execute_select` sent NFETCH twice and stopped:
```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
self._read_fetch_response()
# Drain — fetch again to confirm no more rows.
# (JDBC always does this; the second fetch returns DONE only.)
self._conn._send_pdu(self._build_nfetch_pdu())
self._read_fetch_response()
```
The "second fetch returns DONE only" comment was wrong — for any result set larger than the server's per-NFETCH batch, the second fetch returns more tuples and there are still tuples queued server-side. After the second fetch, the cursor closed and the rest of the rows were discarded.
This bug has been latent for ~30 phases because every existing test used either a small result set (e.g., systables FIRST 10) or relied on row-counts that fit naturally in 1-2 batches. The scaling benchmark (Phase 34) was the first time we tried `SELECT FIRST 100000` and got back 200 rows.
### The fix
`_execute_select` now loops NFETCH until a response yields zero new tuples:
```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
rows_before = len(self._rows)
self._read_fetch_response()
rows_received = len(self._rows) - rows_before
while rows_received > 0:
    self._conn._send_pdu(self._build_nfetch_pdu())
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
```
### Tests
All 249 existing integration tests still pass. The scaling benchmark suite (Phase 34) is the regression test that would have caught this earlier — `SELECT FIRST 100000` from a 100k-row table now returns the expected 100,000 rows.
### Impact
- **Severity**: CRITICAL (silent data loss).
- **Workaround prior to this fix**: use scrollable cursors (`conn.cursor(scrollable=True)`) which use the SQ_SFETCH protocol path and don't have this bug.
- **Affected versions**: every release before `2026.05.05.7`.
If you've been using this driver for queries that returned large result sets, **you may have been getting truncated results without knowing it.** Re-run those queries against `2026.05.05.7+` to verify your data.
## 2026.05.05.6 — Pipelined `executemany` (Phase 33) — 2.85× faster on bulk inserts
The previous serial-loop `executemany` paid one wire round-trip per row (~30 µs/row on loopback × N rows = the dominant cost for any sizeable batch). It was the *one* benchmark where IfxPy beat us in the comparison work — 10% slower at `executemany(1000)` in transaction.
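The round-trip arithmetic in that entry can be made concrete; the ~30 µs/row figure is the one quoted above, and real loopback latency varies:

```python
ROUND_TRIP_US = 30  # ~30 µs/row loopback round trip, per the entry above

def serial_overhead_ms(n_rows):
    """Wire-latency overhead when each row pays its own round trip."""
    return n_rows * ROUND_TRIP_US / 1000

# executemany(1000) in the old serial loop spends ~30 ms purely on
# round trips — latency a pipelined implementation amortizes by
# sending all row PDUs before reading responses.
print(serial_overhead_ms(1000))  # → 30.0
```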


@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05.6"
+version = "2026.05.05.7"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }


@@ -398,13 +398,22 @@ class Cursor:
 self._finalizer_state[0] = True  # arm the GC-time fallback
 self._scroll_total_rows = None
 return  # don't close; cursor stays live for SQ_SFETCH
+# Phase 35: NFETCH loop — keep fetching until a response yields
+# zero new tuples. The previous "two NFETCHes" pattern silently
+# truncated any result set whose tuples didn't fit in 1-2 server
+# batches (~200 rows at default 4096-byte buffer × 5-col rows).
+# This bug was latent for ~30 phases because no test used a
+# large enough result set to trigger it.
 self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+rows_before = len(self._rows)
 self._read_fetch_response()
-# Drain — fetch again to confirm no more rows.
-# (JDBC always does this; the second fetch returns DONE only.)
-self._conn._send_pdu(self._build_nfetch_pdu())
-self._read_fetch_response()
+rows_received = len(self._rows) - rows_before
+while rows_received > 0:
+    self._conn._send_pdu(self._build_nfetch_pdu())
+    rows_before = len(self._rows)
+    self._read_fetch_response()
+    rows_received = len(self._rows) - rows_before
 # Dereference BYTE/TEXT blob descriptors BEFORE CLOSE — the
 # locators are only valid while the cursor is open. No-op when

uv.lock generated

@@ -34,7 +34,7 @@ wheels = [
 [[package]]
 name = "informix-db"
-version = "2026.5.5.4"
+version = "2026.5.5.6"
 source = { editable = "." }
 [package.optional-dependencies]