Phase 35: CRITICAL fix - NFETCH loop for large result sets (2026.05.05.7)
DATA-LOSS BUG: cursor.fetchall() on result sets larger than ~200 rows
was silently truncating to the first ~200 rows. The exact cap depended
on row width and the server's per-NFETCH buffer (4096 bytes default).
The bug: _execute_select sent NFETCH twice and stopped:

    self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
    self._read_fetch_response()
    self._conn._send_pdu(self._build_nfetch_pdu())  # comment: "DONE only"
    self._read_fetch_response()
    # then CLOSE+RELEASE — discarding remaining queued rows
The "second fetch returns DONE only" comment was wrong. For any
result set larger than the server's per-NFETCH batch, the second
fetch returns more tuples AND there are still tuples queued
server-side. The cursor closed and dropped them.
Latent for 30 phases because every existing test used either a small
result set (FIRST 10) or relied on row counts that fit naturally in
1-2 batches. Discovered by Phase 34's scaling benchmark when
SELECT FIRST 100000 from a 100k-row table returned 200 rows.
The fix: loop NFETCH until a response yields zero new tuples.

    self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
    while rows_received > 0:
        self._conn._send_pdu(self._build_nfetch_pdu())
        rows_before = len(self._rows)
        self._read_fetch_response()
        rows_received = len(self._rows) - rows_before
249 integration tests pass. The scaling benchmark suite (Phase 34,
shipping next) is the regression test going forward.
Workaround for users on older versions: use scrollable cursors
(cursor(scrollable=True)) which use the SQ_SFETCH protocol path
and don't have this bug.
If you've been using this driver for queries returning large result
sets, your queries may have been truncating silently. Re-run them
against 2026.05.05.7+ to verify your data.
parent 362ecb3d63
commit 1282893412
50
CHANGELOG.md
@@ -2,6 +2,56 @@

All notable changes to `informix-db`. Versioning is [CalVer](https://calver.org/) — `YYYY.MM.DD` for date-based releases, `YYYY.MM.DD.N` for same-day post-releases per PEP 440.

## 2026.05.05.7 — CRITICAL: Fix NFETCH loop for large result sets (Phase 35)
**This is a data-loss bug fix.** Anyone running `cursor.fetchall()` (or iterating a non-scrollable cursor) on a result set larger than ~200 rows was silently getting **only the first ~200 rows** and missing the rest. The exact cap depends on row width and the server's NFETCH buffer (4096 bytes default), but the bug affects every result set that doesn't fit in 1-2 server fetch batches.
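The ~200-row figure falls out of the batch arithmetic. A back-of-the-envelope sketch (the 40-byte row width is an illustrative assumption; only the 4096-byte buffer default comes from the driver):

```python
# Rough model of the truncation cap: the old code issued exactly two
# NFETCHes, and each NFETCH returns at most one server buffer of tuples.
BUFFER_BYTES = 4096   # server's default per-NFETCH buffer
ROW_BYTES = 40        # assumed width of a typical 5-column row (illustrative)

rows_per_batch = BUFFER_BYTES // ROW_BYTES   # rows returned per NFETCH
old_code_cap = 2 * rows_per_batch            # two fetches, then CLOSE

print(rows_per_batch, old_code_cap)  # 102 204
```

Wider rows lower the cap; narrower rows raise it, which is why the cutoff varied by query.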

### The bug

`Cursor._execute_select` sent NFETCH twice and stopped:

```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
self._read_fetch_response()
# Drain — fetch again to confirm no more rows.
# (JDBC always does this; the second fetch returns DONE only.)
self._conn._send_pdu(self._build_nfetch_pdu())
self._read_fetch_response()
```

The "second fetch returns DONE only" comment was wrong — for any result set larger than the server's per-NFETCH batch, the second fetch returns more tuples and there are still tuples queued server-side. After the second fetch, the cursor closed and the rest of the rows were discarded.

This bug has been latent for ~30 phases because every existing test used either a small result set (e.g., systables FIRST 10) or relied on row counts that fit naturally in 1-2 batches. The scaling benchmark (Phase 34) was the first time we tried `SELECT FIRST 100000` and got back 200 rows.

### The fix

`_execute_select` now loops NFETCH until a response yields zero new tuples:

```python
self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
rows_before = len(self._rows)
self._read_fetch_response()
rows_received = len(self._rows) - rows_before

while rows_received > 0:
    self._conn._send_pdu(self._build_nfetch_pdu())
    rows_before = len(self._rows)
    self._read_fetch_response()
    rows_received = len(self._rows) - rows_before
```
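The termination argument in miniature: a self-contained toy model (the `FakeBatchSource` class is hypothetical, standing in for the server's per-NFETCH batching, not part of the driver) contrasting the old two-fetch pattern with the drain loop:

```python
# Toy stand-in for the server side of NFETCH: each fetch() returns the
# next batch of rows, then empty lists once the result set is drained.
class FakeBatchSource:
    def __init__(self, total_rows, batch_size):
        self._remaining = total_rows
        self._batch_size = batch_size

    def fetch(self):
        batch = min(self._remaining, self._batch_size)
        self._remaining -= batch
        return [object()] * batch

def drain_two_fetches(source):
    # The buggy pattern: exactly two fetches, then stop.
    rows = []
    rows += source.fetch()
    rows += source.fetch()
    return rows

def drain_until_empty(source):
    # The fixed pattern: keep fetching until a fetch yields zero new rows.
    rows = []
    while True:
        batch = source.fetch()
        if not batch:
            break
        rows += batch
    return rows

# 100k rows at ~102 rows per batch (the default-buffer estimate above):
print(len(drain_two_fetches(FakeBatchSource(100_000, 102))))  # 204
print(len(drain_until_empty(FakeBatchSource(100_000, 102))))  # 100000
```

The loop always terminates: every iteration either shrinks the server-side queue or receives an empty batch and exits.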

### Tests

All 249 existing integration tests still pass. The scaling benchmark suite (Phase 34) is the regression test that would have caught this earlier — `SELECT FIRST 100000` from a 100k-row table now returns the expected 100,000 rows.

### Impact

- **Severity**: CRITICAL (silent data loss).
- **Workaround prior to this fix**: use scrollable cursors (`conn.cursor(scrollable=True)`), which use the SQ_SFETCH protocol path and don't have this bug.
- **Affected versions**: every release before `2026.05.05.7`.
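For code that must run against mixed driver installs, a minimal version gate (the `is_affected` helper is hypothetical, not part of the driver; it relies only on the CalVer scheme stated at the top of this changelog):

```python
def is_affected(version: str) -> bool:
    # Releases before 2026.05.05.7 silently truncate large result sets.
    # CalVer components compare correctly as numeric tuples; a plain
    # date-based release like "2026.05.04" yields a shorter tuple, which
    # still orders correctly against (2026, 5, 5, 7).
    parts = tuple(int(p) for p in version.split("."))
    return parts < (2026, 5, 5, 7)

print(is_affected("2026.05.05.6"))  # True
print(is_affected("2026.05.05.7"))  # False
```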

If you've been using this driver for queries that returned large result sets, **you may have been getting truncated results without knowing it.** Re-run those queries against `2026.05.05.7+` to verify your data.

## 2026.05.05.6 — Pipelined `executemany` (Phase 33) — 2.85× faster on bulk inserts

The previous serial-loop `executemany` paid one wire round-trip per row (~30 µs/row on loopback × N rows = the dominant cost for any sizeable batch). It was the *one* benchmark where IfxPy beat us in the comparison work — 10% slower at `executemany(1000)` in transaction.
@@ -1,6 +1,6 @@
 [project]
 name = "informix-db"
-version = "2026.05.05.6"
+version = "2026.05.05.7"
 description = "Pure-Python driver for IBM Informix IDS — speaks the SQLI wire protocol over raw sockets. No CSDK, no JVM, no native libraries."
 readme = "README.md"
 license = { text = "MIT" }
@@ -398,13 +398,22 @@ class Cursor:
             self._finalizer_state[0] = True  # arm the GC-time fallback
             self._scroll_total_rows = None
             return  # don't close; cursor stays live for SQ_SFETCH

+        # Phase 35: NFETCH loop — keep fetching until a response yields
+        # zero new tuples. The previous "two NFETCHes" pattern silently
+        # truncated any result set whose tuples didn't fit in 1-2 server
+        # batches (~200 rows at default 4096-byte buffer × 5-col rows).
+        # This bug was latent for ~30 phases because no test used a
+        # large enough result set to trigger it.
         self._conn._send_pdu(self._build_curname_nfetch_pdu(cursor_name))
+        rows_before = len(self._rows)
         self._read_fetch_response()
-        # Drain — fetch again to confirm no more rows.
-        # (JDBC always does this; the second fetch returns DONE only.)
-        self._conn._send_pdu(self._build_nfetch_pdu())
-        self._read_fetch_response()
+        rows_received = len(self._rows) - rows_before
+        while rows_received > 0:
+            self._conn._send_pdu(self._build_nfetch_pdu())
+            rows_before = len(self._rows)
+            self._read_fetch_response()
+            rows_received = len(self._rows) - rows_before

         # Dereference BYTE/TEXT blob descriptors BEFORE CLOSE — the
         # locators are only valid while the cursor is open. No-op when